This dataset was, before 2007, DSS' primary surface observation set. For times since April 2000, we now recommend that users obtain the BUFR formatted data in DS461.0.
There are thousands of reports at each six-hourly and three-hourly synoptic time. Since 1997 the counts have increased considerably, because in the late 1990s NCEP began collecting hourly reports and, in limited regions, 20-minute reports.
Describing the attributes of these data involves terminology which can become confusing, because its meaning depends on the source and context. DSS documentation, NCEP documentation, and scientists' references too often use terms which mean one thing in one context, but something else in another. This web page uses DSS terminology, which in some important ways differs from NCEP usage, as described in the following.
The CISL/DSS Research Data Archive (RDA) has several hundred "datasets", which should not be construed to be "databases". Within NCEP's documents for the ADP data, a "dataset" means a kind of data such as ADPSFC or SFCSHP, which are described later. DSS calls these "groups" or "subsets", which is more consistent with how we refer to such groupings in our other datasets.
The groups of files in this dataset are synoptic, which means they provide data over a wide area (from many stations or platforms), generally global, by simultaneous observations. A time series file of observations, by contrast, would be a collection from a single station. Comparing these files with the files in DS512.0 illustrates the difference.
What NCEP calls a "report type" is an umbrella for a mix of geography, platform, and station identification characterizations. Within the groups in this dataset documentation, we prefer to use "type" to indicate a sub-group, such as SYNOP, METARS or ASOS, which are found in the ADPSFC group. DSS has learned that some "types" are not always collected in one particular group, apparently at NCEP's processing convenience.
The data are provided only as complete reports in the "packed ASCII" ON29 format. This has many implications. Being ASCII rather than binary means a significantly higher volume. Being synoptically sorted, the expense for extracting a time series of reports from a handful of stations can be surprisingly high. A subset of just temperature would not reduce the volume because the format is not dynamic.
Software for this dataset, such as readsfc2.f, opens individual reports and prints human readable data values. To satisfy users who want to see just a few data values, or even just one, such as temperature, readsfc2.f must still open each entire individual report to access them. To make such special subsetting, users will need to modify the readsfc2.f output.
The data reports from land stations, ships, buoys, etc. are received by NCEP via the Global Telecommunications System (GTS).
NCEP uses Automated Data Processing (ADP) to retrieve, decode and reformat the GTS reports. The NCEP document Observational Data Processing at NCEP and the NCAR document TN404 An Introduction to Atmospheric and Oceanographic Datasets (chapter 3) discuss the collection procedures in greater detail. The evolution of the collection and processing, since 1870, is described here.
These data are what NCEP collected before their cutoff times, when they commence analysis of the upper air data. Reports from any station may be missed for any of a number of reasons, generally involving equipment failure or delays past the cutoff time. It is possible that a report was eventually received at NMC, but did not get into the files they sent us. About the only way to find out if a particular observation was actually made is to contact the station itself or NCDC.
NCEP collects the decoded reports in files which have a synoptic date/time stamp, which corresponds to the analysis time of the NCEP models. NCEP applies some interactive quality control (QC) and removes duplicate reports. When stations fail to report over the GTS or on time, or there were other data losses, neither NCEP nor DSS attempts any gap filling. Such gap filling is deemed to be in the purview of climatological data center collection practices.
NCEP does not apply much QC to the surface data, compared to the upper air data, because the surface data is not used in their models. DSS did not do any QC, but will investigate user complaints about data values.
NCEP encoded the surface data into ON29 format and ON124, a supplement to ON29, which provided information about surface data which was not in ON29. By the time final versions of these documents were prepared in 2001, the ON29 had become a document solely for upper air data files (as in our ds353.4) and ON124 was a complete stand alone document for surface data files. However, you must see the details about the DSS version of the header blocks in here.
Originally NMC decoded the GTS data and then encoded it to the ON29 format.
Beginning in March 1996, NCEP put it directly into BUFR format. Until October 1999, NCEP used their BUFRON29 conversion program to produce the ON29 formatted files from the BUFR files. This was done initially to verify content, and then for a while as a service to users while they adapted to BUFR. NCEP ran this conversion on their Crays, and when they abandoned their Crays for a new system, they stopped making this conversion.
In February 2000, DSS acquired NCEP's BUFRON29 in order to continue production of the ON29 format for our users. NCEP later discontinued support for this software. DSS continued using it, while knowing very little about it - the thing is a large black box. Eventually continuing changes in BUFR data content made modifications of the conversion impractical, so DSS stopped production after February 2007.
Perhaps more importantly, the ON29 format does not provide for as much information as does the BUFR format. During BUFR to ON29 conversions, the primary GTS report information was carried along, but not additions to the reports. ON29 does not provide for new groups or types appearing in BUFR. Also, BUFR carries the original raw reports which NCEP collected from the GTS - ON29 does not. The GTS content provides the means for users to make their own judgements and apply their own procedures for things like hydrostatic checks or wind corrections (click here to see a particularly important issue).
In 1997 we got an email from Dennis Keyser (NCEP) about matters regarding the transition to BUFR. Please look at this. It is a good example of what can happen in operational data processing.
The source of the data for the BUFR to ON29 conversion was NCEP's "dump" BUFR data files. This was not the PREPBUFR data.
For the curious, a history of U.S. weather data technology is here.
Beginning with the January 2004 data, in order to comply with WMO Resolution 40, DSS has been removing some restricted data. This involves proprietary data from certain commercial sources. Also see the NCEP Restricted Data Information Site.
NCEP's observation file organization changes occasionally, although from 1 January 1978 - 31 March 1997 things were consistent. NCEP/NMC sent us this data on two weekly tapes wherein there was a series of 6-hourly or 3-hourly files each containing all the data groups for a model analysis time. In later years those files can include hourly and even some 20-minute reports.
Prior to January 1, 1978, the data files appear in one stream of files. NCEP made several changes in the spring of 1997, mostly in April. First, NCEP began putting all groups of data in monthly files, inside huge tarfiles. In the mid-2000s, because the monthly volume overwhelmed both disk file and communication limits, they began putting them in daily tar files. For archiving, we split out the groups and then merged these into 5 day, 15 day and 30 day files.
In April 1997, NCEP changed the streams: they put the land surface in one stream, all the ship surface in another. They also stopped making separate data files for 03Z, 09Z, 15Z and 21Z. So for the remainder of this dataset's files, the 6-hourly files have all available 3-hourly, hourly and even some 20-minute reports, within 3 hours of the 6-hourly synoptic times. There can be some duplication of reports, especially when station METAR reports fall on the same time as SYNOP reports.
In April 1997 NCEP stopped shipping the ADP tapes directly to us. In February 1998 we obtained copies of the tapes which they continued to send to NCDC. Another batch was so obtained to bring us through March 1998, and then NCEP's shipments directly to us resumed. Then in June 2000, tape shipments were terminated in favor of using NCEP's FTP server.
DSS collected the BUFR formatted data (from tapes) for April 1998 - December 2003 in DS464.5. DSS is collecting the BUFR formatted data (from FTP) for April 2000 - present in DS461.0.
DSS has prepared several kinds of statistics for various groups, types and stations (both synoptic and time series counts), from 1975 into the early 2000s. The main web page for these is here.
DSS has created a series of annual sample printouts, found here. To obtain these, DSS made a pass through the data for 1975 - 2002 to extract all the data from the first day of each month, at 00Z and 12Z, for about 10% of all available stations. These "snapshots" of the data allow us and users to test the content of the reports among other things. In particular, it has enabled us to study the availability of the cloud and precipitation data.
DSS discovered a steady decline in the total number of reports, from about late 1991 until early 1994. Then there was a steady recovery, terminated by a large rebound in August 1994. This may be related to a similar behavior in the upper air data receipts, where NCEP had a problem with what they called a "flooded station table." Starting with the April 28, 1996 data, shipment cover letters from NCEP indicate data loss due to "too much data" which evidently overflowed their capture buffers. When NCEP moved their processing to larger and faster machines, this problem vanished.
There has been a huge increase in the number of land surface reports in the last few years. This involves the addition of hourly data and even 20 minute data.
This data group includes
SYNOP - SYNOPtic observation
METAR - METeorological Aviation Report
AWOS - Automated Weather Observing System
ASOS - Automated Surface Observing System
CMAN - Coastal-Marine Automated Network
Reports may include categories 51, 52, 08 and 09 (as described in NMC Office Note 124: NCEP ON124)
DSS has prepared station receipt summaries: daily and monthly. 06Z and 18Z data is lower volume, starts about April 1975, is sometimes a little out of sort, and is missing more frequently.
METAR reports (which use call sign identification) disappear on 1993Sep28.09, and do not reappear until 1998Mar13.12, at which time a larger number appear, apparently from Europe, and then during 1998 a great portion of the northern hemisphere and perhaps more appear.
North American Surface Airway reports have been decoded by NCEP and included in this data. But DSS discovered in November 1995 that they ceased doing this on September 24, 1993. DSS recommends that you use DS472.0, "TDL surface hourlies," to get the surface airways data for 1993-1998. DSS notes that this data returned March 13, 1998, and as of this writing now appears to include most of the northern hemisphere.
ASOS reports were implemented in late 1994, and these do not include a complete set of cloud data. A news story about issues involving ASOS observations is here.
AWOS reports began to appear in March 1998; these can occur as often as every 20 minutes.
This data group includes
OSV - Ocean Station: fixed ship
OSV - Ocean Station: moving ship (with and without name)
MARS - MArine Reporting Station: Fixed and moving
..... - Buoy: moored and drifting
Reports may include categories 51, 52, 08 and 09 (as described in NMC Office Note 124: NCEP ON124)
| LIST | GROUPS | PERIOD | FILE STRUCTURE AS | COMMENTS |
|---|---|---|---|---|
| C | ADPSFC & SHPSFC | 1975feb11-1976jun30 | received from NMC/NESDIS | no ship until 1976 Aug |
| C | ADPSFC & SHPSFC | 1976jul05-1977dec31 | received from NMC | no ship until 1976 Aug |
| A | ADPSFC 6-hourly | 1978jan01-1997apr05 | received from NMC, then NCEP | |
| B | ADPSFC & SHPSFC | 1978jan01-1997apr05 | received from NMC, then NCEP | (land at 03, 09, 15 & 21Z only) |
| E | ADPSFC only | 1997apr01-1998mar31 | received from NCDC | |
| E | ADPSFC only | 1998apr01-1999sep30 | received from NCEP | |
| E | ADPSFC only | 1999octll-2007feb28 | converted from BUFR by DSS | |
| F | SHPSFC only | 1997apr01-1998mar31 | received from NCDC | |
| F | SHPSFC only | 1998apr01-1999sep30 | received from NCEP | |
| F | SHPSFC only | 1999oct01-2007feb28 | converted from BUFR by DSS | |
BASIC - NCEP segregates the reports by group and synoptic date/time, and calls these data files. A data file begins with a header block and is usually terminated by a trailer block. The data blocks between these contain the data reports. The header block identifies the group (e.g. ADPSFC or SFCSHP) and synoptic date/time (which may also be called the header time). The actual observation times appear in the individual reports. The individual reports do not show the synoptic date/time stamp.
DSS converts NCEP's header blocks to the GATE format. This was originally intended just to support GARP projects in the mid-1970s, but persisted and became our paradigm, which at times is regrettable. Details about the DSS version of the header blocks are in here.
DATE/TIME STAMP - The collection of observations for a synoptic time begins before the synoptic time, as much as half of the time interval between synoptic times, but it should not be before the cutoff time of the previous data file. The header blocks will often, but not always, show a receipt time which indicates when the file was opened to begin the accumulation of reports. Likewise the cutoff time indicates when the file was closed, terminating the accumulation. DSS has found that these delineators have not been dependable, so users should just ignore them.
The synoptic time, header time and analysis time are all the same and are specified in Coordinated Universal Time (UTC or Z). The individual report times are also given in UTC and are the actual observation times, as reported by the stations. Stations may report observation times which look exactly like the synoptic times, but in practice these may not always be true.
For information about the time convention, see this.
WARNING: Many users have been accepting or mistakenly expecting the synoptic time to be the observation time. If their work is sensitive to the difference, then they should use the observation times. This is a very important consideration because often users will find more than one report from a station within a data file for a synoptic time. Moreover, some reports in a given data file may be duplicated in the previous or following file.
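Because the same report can appear more than once within a file, or in adjacent files, users who merge several data files may want to screen for repeats. The following is only a minimal sketch of one way to do that, not a DSS or NCEP procedure. It assumes the reports have already been read into Python dictionaries with the hypothetical keys `station_id` and `obs_time` (a full observation date/time taken from the report itself, not from the header block).

```python
from datetime import datetime


def drop_duplicate_reports(reports):
    """Keep the first occurrence of each (station_id, obs_time) pair.

    `reports` is an iterable of dicts. The keys 'station_id' and
    'obs_time' are hypothetical names chosen for this sketch; they are
    not field names from the ON29 format or from readsfc2.f.
    """
    seen = set()
    unique = []
    for report in reports:
        key = (report["station_id"], report["obs_time"])
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique


if __name__ == "__main__":
    sample = [
        {"station_id": "72469", "obs_time": datetime(1998, 3, 13, 12, 0)},
        {"station_id": "72469", "obs_time": datetime(1998, 3, 13, 12, 0)},
        {"station_id": "72469", "obs_time": datetime(1998, 3, 13, 13, 0)},
    ]
    print(len(drop_duplicate_reports(sample)))
```

Note that this only catches repeats which carry the same identifier. A SYNOP report labeled with a WMO number and a METAR report labeled with a call sign would not be matched by this key, even if they come from the same station at the same time.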
Users must be aware that the actual time of an observation should be obtained from within the individual report. One may also want to adjust the date shown in the header block, because for some 00Z (and even 06Z) data files the report times which show values between 18Z and 24Z actually belong to the previous date. A comparable thing may occur at 18Z, where the report may belong to the next day. Our latest access software, readsfc2.f, allows a user to make the adjustment.
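The date adjustment described above can be sketched as follows. This is an illustration only, and is not taken from readsfc2.f. It assumes the header date/time is available as a Python `datetime` and that the report's observation time has already been converted to an hour and minute; the function name and arguments are hypothetical. The idea is to try the previous, same and next calendar day, and keep whichever places the observation closest to the synoptic time.

```python
from datetime import datetime, timedelta


def observation_datetime(header_time, obs_hour, obs_minute=0):
    """Attach a calendar date to a report's observation time.

    header_time : datetime of the synoptic (header) date/time
    obs_hour, obs_minute : observation time from the individual report

    Returns the candidate date/time (previous, same, or next day
    relative to the header date) nearest to the header time.
    """
    same_day = datetime(header_time.year, header_time.month,
                        header_time.day, obs_hour, obs_minute)
    candidates = [same_day + timedelta(days=shift) for shift in (-1, 0, 1)]
    return min(candidates, key=lambda t: abs(t - header_time))


if __name__ == "__main__":
    # A 21Z report in a 00Z file is assigned to the previous date.
    print(observation_datetime(datetime(1998, 3, 13, 0), 21))
    # A 00Z report in an 18Z file is assigned to the next date.
    print(observation_datetime(datetime(1998, 3, 13, 18), 0))
```

This relies on reports being within a few hours of the synoptic time. A report exactly 12 hours from the header time is ambiguous, and this sketch would then pick the earlier date.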
STATION IDENTIFICATION - Generally the 6-hourly and 3-hourly SYNOP reports use WMO 5 digit numbers. Many other reports, such as METARS, or hourly and 20 minute reports, use call signs. So when attempting to extract all the reports made by a station, or stations nearby, one should use a latitude-longitude "window", which will get both the WMO and call sign labeled reports. Such a "window" will obtain neighboring station data that could be used when the desired station data is missing. See the code in readsfc2.f.
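A latitude-longitude "window" test of the kind described above might look like the sketch below. It is not the code in readsfc2.f. It assumes latitudes and longitudes have already been converted to decimal degrees, with longitude in degrees east (whatever convention the reports use must be converted first), and the 0.25 degree half-width is an arbitrary default chosen for illustration. The dictionary keys `lat` and `lon` are hypothetical.

```python
def in_window(lat, lon, center_lat, center_lon, half_width_deg=0.25):
    """Return True if (lat, lon) lies in a square window around the center.

    All values are decimal degrees, longitude in degrees east.
    The longitude difference is wrapped so that a window straddling
    the 180 degree meridian still works.
    """
    dlat = abs(lat - center_lat)
    dlon = abs((lon - center_lon + 180.0) % 360.0 - 180.0)
    return dlat <= half_width_deg and dlon <= half_width_deg


def reports_in_window(reports, center_lat, center_lon, half_width_deg=0.25):
    """Select reports (dicts with hypothetical 'lat'/'lon' keys) in the window."""
    return [r for r in reports
            if in_window(r["lat"], r["lon"],
                         center_lat, center_lon, half_width_deg)]


if __name__ == "__main__":
    sample = [
        {"id": "72469", "lat": 39.77, "lon": -104.87},  # WMO number label
        {"id": "KDEN", "lat": 39.75, "lon": -104.87},   # call sign label
        {"id": "72476", "lat": 39.12, "lon": -108.53},  # outside the window
    ]
    for r in reports_in_window(sample, 39.76, -104.87):
        print(r["id"])
```

A small window picks up differently labeled reports from one site, while a larger `half_width_deg` also brings in neighboring stations.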
A list of stations is in our somewhat out of date USAF Station Library of station names, locations, elevations, etc. Note that in this library, 6 digit WMO numbers are used, where [5 digits]0 applies. Caution: the content of these libraries must not be taken as a list of what you may find in this dataset, it is just a cross-reference.
DATA UNITS - The data units in this dataset involve both English and metric. See ON29 for details.
SEA LEVEL PRESSURE - This dataset has reports which include surface (station) pressure, sea level pressure, or both. From a scientific point of view, stations which report a sea level pressure, but not the observed station pressure, are being irresponsible. This is because it spoils the possibility of applying improved algorithms for the reduction of pressure from station elevation to sea level, among other things. For that matter, methods have changed over time.
The reduction of station surface pressure to sea level pressure is almost a black art. Basic meteorology texts describe theories which involve assumptions about the virtual temperature in an imaginary layer between the station and sea level. An example is in the classic text "An Introduction to Theoretical Meteorology" by Seymour Hess (1959), on pages 88 - 90. In practice, the reduction algorithms are empirical, where local stations apply local statistical knowledge of weather and seasonal patterns.
Here are a couple of more recent texts on the subject: Chuang, et al and NOAA/NWS.
It is reasonable to expect to know exactly what was the systematic reduction procedure, but we do not have the answer.
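For orientation only, the textbook form of the reduction (the hypsometric relation, with an assumed mean virtual temperature for the imaginary layer below the station) can be sketched as below. This is not the procedure used by any station, by NCEP, or by DSS, which as noted above is not known to us. The function name, argument names and constant values are choices made for this sketch.

```python
import math

G = 9.80665    # standard gravity, m s^-2
R_D = 287.05   # specific gas constant for dry air, J kg^-1 K^-1


def reduce_to_sea_level(station_pressure_hpa, elevation_m, mean_virtual_temp_k):
    """Textbook hypsometric reduction of station pressure to sea level.

    mean_virtual_temp_k is the ASSUMED mean virtual temperature of the
    fictitious air column between the station and sea level. Choosing
    it is the empirical part that differs between stations and eras.
    """
    exponent = G * elevation_m / (R_D * mean_virtual_temp_k)
    return station_pressure_hpa * math.exp(exponent)


if __name__ == "__main__":
    # Same station pressure and elevation, two assumed layer temperatures.
    for tv in (270.0, 290.0):
        print(tv, round(reduce_to_sea_level(850.0, 1500.0, tv), 1))
```

Running the example with two different assumed temperatures shows how strongly the result depends on that assumption for a high station, which is why a report of station pressure is more useful than a report of sea level pressure alone.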
WINDS -
From at least Apr 2000 to May 2002, NCEP's GTS decoder generated incorrect wind speeds for certain parts of the world that were based on false assumptions about the units. The nature of the corrections is such that the information needed to make them does not appear in the ON29 data. Doug Schuster analyzed the BUFR data and prepared some documents about the problems: global, regional and more regional. Contact Doug Schuster for more information. We urge you to be careful about using the wind data for this period.
DEWPOINT - We changed to the NASA GCMD standard for keywords. The GCMD does not provide a keyword for dew point depression. DSS is forced to show "Dew Point Temperature" on its web pages, even though the data values in this dataset are actually dew point depression.
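Since the stored values are dew point depression (temperature minus dew point), users who want the dew point temperature itself need to subtract. A minimal sketch, assuming both values are in the same temperature units and that missing values have already been mapped to `None` by the reading software (the function name is hypothetical):

```python
def dew_point_from_depression(temperature, depression):
    """Return dew point temperature = temperature - dew point depression.

    Both inputs must be in the same units. Returns None when either
    input is missing (assumed here to be represented as None).
    """
    if temperature is None or depression is None:
        return None
    return temperature - depression


if __name__ == "__main__":
    print(dew_point_from_depression(20.0, 5.5))
    print(dew_point_from_depression(20.0, None))
```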
CLOUDS - In recent years the cloud data fields have become unreliable or even unavailable. Initially this was limited to just the United States, beginning in late 1994, when the U.S. observing network switched to automated measurements (ASOS) which could not determine cloud types. On 2000 February 1, NCEP stopped decoding all cloud data except for the total amount. This also occurred in DS470.0. A news story about issues involving ASOS observations is here.
These are some general WMO Code Tables, and some special WMO tables: height of base of lowest cloud, and total cloud cover.
PRECIPITATION - DSS is frequently asked about the availability of precipitation data, in this data set, outside of the U.S. NCEP routinely decodes the GTS precipitation data only for North American stations, putting it in the report category 52 data. See this. NCEP does not decode the data for the remainder of the world, but does save it in the report category 08 data. Tables of numerous decoding schemes are required to decode these data, and DSS has just a few of these on paper. NCEP operations considers these data to be climatological data, of little use for their models and forecasts. DSS understands that NCEP's Climate Prediction Center (CPC) has been decoding these globally. Please refer to DS512.0, our global summary of day dataset, which is prepared by CPC, but note that it shows only daily totals.
WEATHER CODES - In late 1981, reports from automated stations began to appear. In 1987, the present and past weather codes for the automated stations were switched to a different code table. Users have reported a discontinuity in the frequency of these codes denoting precipitation beginning in late 1981. See the final NMC ON124 document for more information.
To see FAQs about FNL and ADP data click here.
Models
For support using NCAR/MMM's WRF click here.
For support using ATMET's RAMS click here.
Collecting, maintaining and distributing the NCEP operational datasets (ADP and FNL) has been a major part of my career, May 1976 - present.
Gregg Walters