This dataset was, before 2007, DSS' primary upper air observation set. We now recommend that users obtain the BUFR formatted data in DS351.0 for times since April 2000.
This dataset was also a major input for the Reanalysis Projects conducted by NCEP/NCAR, ECMWF and others.
Descriptions of the attributes of this data involve terminology which can become confusing, because meanings depend on the source and context. DSS documentation, NCEP documentation, and scientists' references too often use terms which mean one thing in one context but something else in another. This web page uses DSS terminology, which in some important ways differs from NCEP usage, as described in the following.
The CISL/DSS Research Data Archive (RDA) has several hundred "datasets", which should not be construed to be "databases". Within NCEP's documents for the ADP data, a "dataset" means a kind of data such as ADPUPA or AIRCFT, which are described later. DSS calls these "groups" or "subsets", which is more consistent with how we refer to such groupings in our other datasets.
The groups of files in this dataset are synoptic, which means they provide data over a wide area (from many stations or platforms), generally global, by simultaneous observations. By contrast, a time series file of observations would be a collection from a single station. Comparing these files with the files in DS390.0 illustrates the difference.
What NCEP calls a "report type" is an umbrella for a mix of geography, platform, and station identification characterizations. Within the groups in this dataset documentation, we prefer to use "type" to indicate a sub-group, such as raobs (radiosondes) or pibals, which are found in the ADPUPA group. DSS has learned that some "types" are not always collected in one particular group, apparently at NCEP's processing convenience.
The data are provided only as complete reports in the "packed ASCII" ON29 format. This has many implications. Being ASCII rather than binary means a significantly higher volume. Being synoptically sorted, the expense for extracting a time series of reports from a handful of stations can be surprisingly high. A subset of just temperature would not reduce the volume because the format is not dynamic.
Software for this dataset, such as readupa2.f, opens individual reports and prints human readable data values. To satisfy users who want to see just a few data values, or even just one, such as temperature, readupa2.f must still open each entire individual report to access them. To make such special subsetting, users will need to modify the readupa2.f output.
The data reports from radiosondes, pibals and aircraft are received by NCEP via the Global Telecommunications System (GTS) and the satellite data from the National Environmental Satellite, Data, and Information Service (NESDIS). This data stream is the primary input to the Global Data Assimilation System (GDAS), which is used to make forecasts and the Global Final Analyses (FNL).
NCEP uses Automated Data Processing (ADP) to retrieve, decode and reformat the GTS reports. The NCEP document Observational Data Processing at NCEP and the NCAR document TN404 An Introduction to Atmospheric and Oceanographic Datasets (chapter 3) discuss the collection procedures in greater detail. The evolution of the collection and processing, since 1870, is described here.
NCEP collects the decoded reports in files which have a synoptic date/time stamp, which corresponds to the analysis time of the NCEP models. NCEP applies some interactive quality control (QC), removes duplicate reports, and merges upper-air "parts". When stations fail to report over the GTS, or fail to report on time, or there are other data losses, neither NCEP nor DSS attempts any gap filling. Such gap filling is deemed to be in the purview of climatological data center collection practices.
When NCEP applies QC to the data, they set QC flags in the reports to qualify, but not quantify, their corrections. Radiosonde data corrections are described in Observational Data Processing at NCEP. DSS does not do any QC, but will investigate user complaints about data values.
Originally NMC decoded the GTS data and then encoded the upper air data into the ON29 format for use in the NCEP models.
Beginning in March 1996, NCEP put the decoded data directly into BUFR format. Until October 1999, NCEP used their BUFRON29 conversion program to produce the ON29 formatted files from the BUFR files. This was done initially to verify content, and then for a while as a service to users while they adapted to BUFR. NCEP ran this conversion on their Crays, and when they abandoned their Crays for a new system, they stopped making this conversion.
In February 2000, DSS acquired NCEP's BUFRON29 in order to continue production of the ON29 format for our users. NCEP later discontinued support for this software. DSS continued using it, while knowing very little about it - the thing is a large black box. Eventually continuing changes in BUFR data content made modifications of the conversion impractical, so DSS stopped production after February 2007.
Perhaps more importantly, the ON29 format does not provide for as much information as does the BUFR format. During BUFR to ON29 conversions, the primary GTS report information was carried along, but not additions to the reports. ON29 does not provide for new groups or types appearing in BUFR. Also, BUFR carries the original raw reports which NCEP collected from the GTS - ON29 does not. The GTS content provides the means for users to make their own judgements and apply their own procedures for things like hydrostatic checks or wind corrections (click here to see a particularly important issue).
In 1997 we got an email from Dennis Keyser (NCEP) about matters regarding the transition to BUFR. Please look at this. It is a good example of what can happen in operational data processing.
NCEP uses their "dump" BUFR files to prepare the PREPBUFR files which are directly fed to the models. This stage involves the addition of background ("first guess") data, observation error information and automated quality control.
The source of the data for the BUFR to ON29 conversion was NCEP's "dump" BUFR data files. This was not the PREPBUFR data. DSS does not archive PREPBUFR data in DS353.4. DSS does collect PREPBUFR data which happens to be included in model output datasets (e.g. DS609.2). As of March 2009, DSS will soon have PREPBUFR data going back about 10 years (in DS337.0).
The GFS forecasts (formerly AVN), which are (were) rushed for aviation needs, use the accumulation available at the analysis time (i.e. early cutoff "dump" BUFR files). The FNL final analyses (see DS082.0, DS083.0 and DS083.2) use the accumulation through a later cutoff "dump" time. The DS353.4 dataset includes much of what goes into the FNL, but this dataset is not in PREPBUFR format.
Please refer to the July 2002 A.M.S. B.A.M.S. article (p.1003) about U.S. N.W.S. instrument changes beginning in 1995 and their effects on the radiosonde data. For the abstract, click here.
NCEP has been using "superobbing" in some of its upper air data processing. See this CIMSS document.
For the curious, a history of U.S. weather data technology is here.
Beginning with the January 2004 data, in order to comply with WMO Resolution 40, DSS has been removing some restricted data. This involves proprietary data from certain commercial sources. Also see the NCEP Restricted Data Information Site.
NCEP's observation file organization changes occasionally, although from 1 January 1978 - 31 March 1997 things were consistent. NCEP/NMC sent us this data on weekly tapes wherein there was a series of 6-hourly files each containing all the data groups for a model analysis time. About every quarter we segregated the groups into separate time series. I.e., we reassembled all the raobs and pibals in one series of files, all the aircraft observations in another series, etc.
NCEP made several changes in the spring of 1997, mostly in April. First, NCEP began putting all groups of data in monthly files, inside huge tarfiles. In the mid-2000s, because the monthly volume overwhelmed both disk file and communication limits, they began putting them in daily tar files. For archiving, we split out the groups and then merged these into 5 day, 15 day and 30 day files.
In April 1997 NCEP stopped shipping the ADP tapes directly to us. In February 1998 we obtained copies of the tapes which they continued to send to NCDC. Another batch was so obtained to bring us through March 1998, and then NCEP's shipments directly to us resumed. Then in June 2000, tape shipments were terminated in favor of using NCEP's FTP server.
DSS collected the BUFR formatted data (from tapes) for April 1998 - December 2003 in DS353.5. DSS is collecting the dump BUFR formatted data (from FTP) in DS351.0 and the PREPBUFR data in DS337.0.
DSS has prepared several kinds of statistics for various groups, types and stations (both synoptic and time series counts), from 1972 into the early 2000s. The main web page for these is here.
Note that the volume of the DS353.4 aircraft and satellite data has increased significantly, by an order of magnitude, since the early 1990s. This, and limited disk space, is why we do not publish these groups on our web server.
This data group includes upper air station data from land and ship launched radiosondes, pibals, and a few other platforms. Definitions of upper air station observation platforms are provided here.
ADPUPA reports, at 00Z and 12Z, come from about 650 - 1000 stations, and at 06Z and 18Z (which are mostly pibals), about 150 - 400 stations. The counts declined in later years. Data may be available at up to 20 mandatory levels from 1000mb to 1mb, plus a few significant levels. A little more information about radiosonde top levels is here.
DSS has prepared station receipt summaries: daily and monthly. The 06Z and 18Z data are lower in volume, start about April 1975, are sometimes a little out of sort, and are missing more frequently.
See our USAF Station Libraries for station names, locations, elevations, etc. A map of the WMO regions is here. Caution: the content of these libraries must not be taken as a list of what you can expect to find in the ADPUPA files; they are just cross-references to names of the locations.
AIRCFT
The AIRCFT data group includes aircraft flight level reports from commercial, military and reconnaissance sources. There are about 400 reports every 12 hours in the oldest files, rising to about 20000 every 6 hours in the later files.
The SIRSOB data group includes satellite infrared sounding observation data.
The SATWND data group includes satellite winds derived from cloud drift analysis. Its volume increased enormously in the later files.
The AIRCAR data group includes data from aircraft takeoff and landings (beginning mid 1991), with between 5000 and 10000 reports every 6 hours. The reports may include pressure, geopotential height, temperature, dewpoint depression, wind direction and speed.
| GROUP | PERIOD | FILE STRUCTURE AS |
|---|---|---|
| ADPUPA | 1973jan01-1997mar31 | received from NMC, then NCEP |
| ADPUPA | 1997apr01-1998mar31 | received from NCDC |
| ADPUPA | 1998apr01-1999sep30 | received from NCEP |
| ADPUPA | 1999oct01-2007feb28 | converted from BUFR by DSS |
| AIRCFT | 1973jan01-1997apr05 | received from NMC, then NCEP |
| AIRCFT | 1997apr01-1998mar31 | received from NCDC |
| AIRCFT | 1998apr01-1999sep30 | received from NCEP |
| AIRCFT | 1999oct01-2006may16 | converted from BUFR by DSS |
| SIRSOB | 1973jan01-1979feb28 | received from NMC |
| SATWND | 1973jan24-1997apr05 | received from NMC, then NCEP |
| SATWND | 1997apr01-1998mar31 | received from NCDC |
| SATWND | 1998apr01-1999sep30 | received from NCEP |
| SATWND | 1999oct01-2006dec31 | converted from BUFR by DSS |
| UPABOG | 1973jan01-1986mar11 | received from NMC |
| AIRCAR | 1991may17-1997apr05 | received from NMC, then NCEP |
| AIRCAR | 1997apr01-1998mar31 | received from NCDC |
| AIRCAR | 1998apr01-1999sep30 | received from NCEP |
| AIRCAR | 1999oct01-2006dec31 | converted from BUFR by DSS |
BASIC - NCEP segregates the reports by group and synoptic date/time, and calls these data files. A data file begins with a header block and is usually terminated by a trailer block. The data blocks between these contain the data reports. The header block identifies the group (e.g. ADPUPA or AIRCFT) and synoptic date/time (which may also be called the header time). The actual observation times appear in the individual reports. The individual reports do not show the synoptic date/time stamp.
DSS converts NCEP's header blocks to the GATE format. This was originally intended just to support GARP projects in the mid-1970s, but persisted and became our paradigm, which at times is regrettable. Details about the DSS version of the header blocks are here.
DATE/TIME STAMP - The collection of observations for a synoptic time begins before the synoptic time, as much as half of the time interval between synoptic times, but it should not be before the cutoff time of the previous data file. The header blocks will often, but not always, show a receipt time which indicates when the file was opened to begin the accumulation of reports. Likewise the cutoff time indicates when the file was closed, terminating the accumulation. DSS has found that these delineators have not been dependable, so users should just ignore them.
The synoptic time, header time and analysis time are all the same thing and are specified in Coordinated Universal Time (UTC or Z). The individual report times are also given in UTC and are the actual observation times, as reported by the stations. Stations may report observation times which look exactly like the synoptic times, but in practice these may not always be true.
For information about the time convention, see this.
WARNING: Many users have been accepting or mistakenly expecting the synoptic time to be the observation time. If their work is sensitive to the difference, then they should use the observation times. This is a very important consideration because often users will find more than one report from a station within a data file for a synoptic time. Moreover, some reports in a given data file may be duplicated in the previous or following file.
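For users who merge several consecutive data files, the repeated reports mentioned above can be screened out after decoding. The following is only a minimal sketch in Python, not part of the DSS software: it assumes each report has already been decoded into a dictionary, and the field names `station_id` and `obs_time` are hypothetical. It treats two reports as duplicates only when both the station identifier and the full observation time match, so multiple genuine reports from one station at different times are kept.

```python
def drop_duplicate_reports(reports):
    """Keep the first report seen for each (station, observation time) pair.

    `reports` is an iterable of dicts with the hypothetical keys
    'station_id' and 'obs_time' (the actual observation time taken
    from the report itself, not the synoptic time of the file header).
    """
    seen = set()
    unique = []
    for report in reports:
        key = (report["station_id"], report["obs_time"])
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique
```

Whether an exact match on identifier and time is a strict enough test depends on the application; reports that were corrected and retransmitted may need a closer comparison of their contents.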
Users must be aware that the actual time of an observation should be obtained from within the individual report. One may also want to adjust the date shown in the header block, because for some 00Z (and even 06Z) data files the report times which show values between 18Z and 24Z actually belong to the previous date. A comparable thing may occur at 18Z, where the report may belong to the next day. Our latest access software, readupa2.f, allows a user to make the adjustment.
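The date adjustment described above can be sketched in a few lines of Python. This is an illustration under a stated assumption, not a description of what readupa2.f does: it assumes the report supplies only an hour of day, and it assigns the report to whichever of the previous, same, or next date puts the observation closest to the synoptic time in the header. The function name is hypothetical.

```python
from datetime import datetime, timedelta

def resolve_observation_time(synoptic, report_hour):
    """Attach a date to a report's hour of day.

    `synoptic` is the header (synoptic) datetime of the data file and
    `report_hour` is the observation time from the report, in hours UTC
    (e.g. 23.5 for 23:30Z). The date is chosen so that the observation
    falls as close as possible to the synoptic time.
    """
    midnight = synoptic.replace(hour=0, minute=0, second=0, microsecond=0)
    candidates = [
        midnight + timedelta(days=day_offset, hours=report_hour)
        for day_offset in (-1, 0, 1)
    ]
    return min(candidates, key=lambda t: abs(t - synoptic))
```

For example, a report time of 23.5 in a 00Z file for 2 January resolves to 23:30Z on 1 January, while a report time of 0.5 in an 18Z file resolves to 00:30Z on the following day.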
STATION IDENTIFICATION - Generally the SYNOP reports use WMO 5 digit numbers. Many other reports, such as ships, use call signs. So when attempting to extract all the reports made by a station, or stations nearby, one should use a latitude-longitude "window", which will get both the WMO and call sign labeled reports. Such a "window" will obtain neighboring station data that could be used when the desired station data is missing. See the code in readupa2.f.
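A latitude-longitude "window" test is simple to write. The sketch below is in Python rather than the Fortran of readupa2.f, and it is not taken from that program. It assumes positions have already been decoded to degrees, with latitude positive north and longitude positive east in the range -180 to 180; the report field names `lat` and `lon` are hypothetical. Check the ON29 documentation for the longitude convention actually stored in the reports before applying it.

```python
def in_window(lat, lon, south, north, west, east):
    """Return True if (lat, lon) lies inside the window.

    All values are in degrees, latitude positive north, longitude
    positive east in [-180, 180] (an assumption of this sketch).
    A window with west > east is taken to cross the date line.
    """
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east

def select_reports(reports, south, north, west, east):
    """Keep reports whose position falls in the window, whatever
    kind of identifier (WMO number or call sign) they carry."""
    return [
        r for r in reports
        if in_window(r["lat"], r["lon"], south, north, west, east)
    ]
```

Because the test uses position only, it picks up both WMO numbered and call sign labeled reports, as well as neighboring stations that happen to fall inside the window.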
A list of stations is in our somewhat out of date USAF Station Library of station names, locations, elevations, etc. Note that in this library, 6 digit WMO numbers are used, where [5 digits]0 applies. Caution: the content of these libraries must not be taken as a list of what you may find in this dataset, it is just a cross-reference.
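To look up a station from a report in the library, the 5 digit WMO number needs the trailing zero appended. A minimal Python sketch follows; the function name is hypothetical, and it assumes the identifier is a plain 5 digit WMO number (call signs are not handled).

```python
def to_library_id(wmo_number):
    """Convert a 5 digit WMO number to the 6 digit library form.

    Accepts an int or a string of digits; leading zeros are restored
    before the trailing zero is appended.
    """
    digits = str(wmo_number).strip()
    if not digits.isdigit() or len(digits) > 5:
        raise ValueError(f"not a 5 digit WMO number: {wmo_number!r}")
    return digits.zfill(5) + "0"
```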
RADIOSONDE TOP LEVELS -

| ONLY ABOUT | OF ALL RAOBS REACH |
|---|---|
| 50% | 20mb |
| 25% | 10mb |
| 5% | 7mb |
| 2% | 5mb |
| <1% | 3mb |
Note that satellite data, such as that in DS067.4, is used to provide the models with high level (such as 20mb and above) temperature and geopotential height.
The AMS Glossary (2000) says that the ozonosphere (ozone layer) roughly lies between 15 and 60km [or about 125mb to 0.2mb], with maximum ozone concentration at 20 - 30km [or about 55 - 10mb].
DATA UNITS - The data units in this dataset involve both English and metric. See ON29 for details.
SEA LEVEL PRESSURE - This dataset has reports which include surface (station) pressure, sea level pressure, or both. From a scientific point of view, stations which report a sea level pressure, but not the observed station pressure, are being irresponsible. This is because it spoils the possibility of applying improved algorithms for the reduction of pressure from station elevation to sea level, among other things. For that matter, methods have changed over time.
The reduction of station surface pressure to sea level pressure is almost a black art. Basic meteorology texts describe theories which involve assumptions about the virtual temperature in an imaginary layer between the station and sea level. An example is in the classic text "An Introduction to Theoretical Meteorology" by Seymour Hess (1959), on pages 88 - 90. In practice, the reduction algorithms are empirical, where local stations apply local statistical knowledge of weather and seasonal patterns.
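To make the textbook idea concrete, here is a minimal Python sketch of the hypsometric form of the reduction, which assumes a single mean virtual temperature for the imaginary layer between the station and sea level. It is the generic textbook relation, not the procedure used by any particular station or by NCEP, and the function and parameter names are hypothetical. The choice of the mean virtual temperature is exactly where the empirical, station-specific practices come in.

```python
import math

G = 9.80665        # standard gravity, m s-2
R_DRY = 287.05     # gas constant for dry air, J kg-1 K-1

def reduce_to_sea_level(station_pressure, elevation_m, mean_virtual_temp_k):
    """Hypsometric reduction of station pressure to sea level.

    station_pressure     : observed station pressure (any unit, e.g. mb)
    elevation_m          : station elevation above sea level, metres
    mean_virtual_temp_k  : assumed mean virtual temperature of the
                           fictitious layer below the station, kelvin
    Returns the sea level pressure in the same unit as the input.
    """
    exponent = G * elevation_m / (R_DRY * mean_virtual_temp_k)
    return station_pressure * math.exp(exponent)
```

With the station pressure retained in the report, a user can repeat this kind of calculation with a different temperature assumption; with only the reduced value, that is not possible.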
Here are a couple of more recent texts on the subject: Chuang, et al and NOAA/NWS.
It is reasonable to expect to know exactly what was the systematic reduction procedure, but we do not have the answer.
WINDS - In DS464.0 we have notes about a wind problem that might also affect the radiosondes. The following is from that DS464.0 documentation:
DEWPOINT - We changed to the NASA GCMD standard for keywords. The GCMD does not provide a keyword for dew point depression. DSS is forced to show "Dew Point Temperature" on its web pages, even though the data values in this dataset are actually dew point depression.
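Users who need dew point temperature can recover it from the reported values, since the dew point depression is the air temperature minus the dew point temperature. A minimal Python sketch follows; the function name is hypothetical, both inputs are assumed to be in the same temperature unit, and missing values are assumed to have already been decoded to `None` (see ON29 for the actual missing value convention).

```python
def dew_point_from_depression(temperature, depression):
    """Return dew point temperature = temperature - depression.

    Both arguments must be in the same unit. If either value is
    missing (None in this sketch), the result is also None.
    """
    if temperature is None or depression is None:
        return None
    return temperature - depression
```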
To see FAQs about FNL and ADP data click here.
Models
For support using NCAR/MMM's WRF click here.
For support using ATMET's RAMS click here.
Collecting, maintaining and distributing the NCEP operational datasets (ADP and FNL) has been a major part of my career, May 1976 - present.
Gregg Walters