Guide to WMO Table Driven Code Forms:

FM 94 BUFR

and

FM 95 CREX

Layer 3: Detailed Description of the Code Forms

(for programmers of encoder/decoder software)

Geneva, 1 January 2002

 


Preface

 

This guide has been prepared to assist experts who wish to use the WMO Table Driven Data Representation Forms BUFR and CREX.

This guide is designed in three layers to accommodate users who require different levels of understanding.

Layer 1 is a general description designed for those who need to become familiar with the table driven code forms but do not need a detailed understanding.  Layer 2 focuses on the functionality and application of BUFR and CREX, and is intended for those who must use software that encodes and/or decodes BUFR or CREX, but will not actually write the software.

Layer 3 is intended for those who must actually write BUFR or CREX encoding and/or decoding software, although those wishing to study table driven codes in depth will find it equally useful.

The WMO gratefully acknowledges the contributions of the experts who developed this guidance material.  The Guide was prepared by Dr. Clifford H. Dey of the U.S.A. National Centers for Environmental Prediction.  Contributions were also received in particular from Charles Sanders - Australia, Eva Cervena - Czech Republic, Chris Long - U.K., Jeff Ator - USA and Milan Dragosavac - ECMWF.


Layer 1:          Basic Aspects of BUFR and CREX

Layer 2:          Functionality and Application of BUFR and CREX

(see separate volume for Layers 1 and 2)

 

Layer 3:  Detailed Description of the Code Forms

(for programmers of encoder/decoder software)

 

Table of Contents

3.1       BUFR

    3.1.1    Sections of a BUFR Message
        3.1.1.1 Overview of a BUFR Message
        3.1.1.2 Section 0 – Indicator Section
        3.1.1.3 Section 1 – Identification Section
        3.1.1.4 Section 2 – Optional Section
        3.1.1.5 Section 3 – Data Description Section
        3.1.1.6 Section 4 – Data Section
        3.1.1.7 Section 5 – End Section
        3.1.1.8 Required Entries
        3.1.1.9 BUFR and Data Management

    3.1.2    BUFR Descriptors
        3.1.2.1 Fundamentals of BUFR Descriptors
        3.1.2.2 Coordinate Descriptors
        3.1.2.3 Increment Descriptors

    3.1.3    BUFR Tables
        3.1.3.1 Introduction
        3.1.3.2 Table A – Data Category
        3.1.3.3 Table B – Classification of Elements
        3.1.3.4 Table C – Data Description Operators
        3.1.3.5 Table D – Lists of Common Sequences
        3.1.3.6 Comparison of BUFR and Character Code Bit Counts
        3.1.3.7 Code Tables and Flag Tables
        3.1.3.8 Local Tables

    3.1.4    Data Replication
        3.1.4.1 Introduction
        3.1.4.2 Simple Replication
        3.1.4.3 Delayed Replication
        3.1.4.4 Delayed Replication Using a Sequence Descriptor
        3.1.4.5 Delayed Repetition

    3.1.5    Data Compression

    3.1.6    Data Description Operators
        3.1.6.1 Changing Data Width, Scale and Reference Value
        3.1.6.2 Changing Reference Value Only
        3.1.6.3 Add Associated Field
        3.1.6.4 Encoding Character Data
        3.1.6.5 Signifying Length of Local Descriptors
        3.1.6.6 Data Not Present
        3.1.6.7 Quality Assessment Information

3.2       CREX

    3.2.1    Sections of a CREX Message
        3.2.1.1 Overview of a CREX Message
        3.2.1.2 Section 0 – Indicator Section
        3.2.1.3 Section 1 – Data Description Section
        3.2.1.4 Section 2 – Data Section
        3.2.1.5 Section 3 – Optional Section
        3.2.1.6 Section 4 – End Section

    3.2.2    CREX Descriptors
        3.2.2.1 Fundamentals of CREX Descriptors
        3.2.2.2 Coordinate Descriptors
        3.2.2.3 Increment Descriptors

    3.2.3    CREX Tables
        3.2.3.1 Table A – Data Category
        3.2.3.2 Table B – Classification of Elements
        3.2.3.3 Table C – Data Description Operators
        3.2.3.4 Table D – Lists of Common Sequences
        3.2.3.5 Code Tables and Flag Tables
        3.2.3.6 Local Tables

    3.2.4    Decomposition of a Sample CREX Message
        3.2.4.1 Decomposition of the Descriptor Sequence in the Sample CREX Message
        3.2.4.2 Decomposition of the Data Section in the Sample CREX Message

APPENDIX to Chapter 3.1.6.7 Quality Assessment Information

    3.1.6.7.1  Introduction
    3.1.6.7.2  First Order Statistics
    3.1.6.7.3  Specification of the Type of Difference Statistics
    3.1.6.7.4  Quality Information
    3.1.6.7.5  Cancel Backward Data Reference
    3.1.6.7.6  Substituted Values
    3.1.6.7.7  Replaced/retained Values


3.1       BUFR

3.1.1    Sections of a BUFR Message

3.1.1.1 Overview of a BUFR Message

The term "message" refers to BUFR being used as a data transmission format.  However, BUFR can be, and is, used in a number of meteorological data processing centers as an on-line storage format as well as a data archiving format.  For transmission of data, each BUFR message consists of a continuous binary stream comprising 6 sections.

 


C O N T I N U O U S   B I N A R Y   S T R E A M
Section 0 | Section 1 | Section 2 | Section 3 | Section 4 | Section 5

Section Number | Name                     | Contents
0              | Indicator Section        | "BUFR" (coded according to the CCITT International Alphabet No. 5, which is functionally equivalent to ASCII), length of message, BUFR edition number
1              | Identification Section   | Length of section, identification of the message
2              | Optional Section         | Length of section and any additional items for local use by data processing centers
3              | Data Description Section | Length of section, number of data subsets, data category flag, data compression flag, and a collection of data descriptors which define the form and content of individual data elements
4              | Data Section             | Length of section and binary data
5              | End Section              | "7777" (coded in CCITT International Alphabet No. 5)

 

Each of the sections of a BUFR message is made up of a series of octets.  The term octet means 8 bits.  An individual section always consists of an even number of octets, with extra bits added on and set to zero when necessary.  Within each section, octets are numbered 1, 2, 3, etc., starting at the beginning of each section.  Bit positions within octets are referred to as bit 1 to bit 8, where bit 1 is the most significant, leftmost, or high order bit.  An octet with only bit 8 set would have the integer value 1.
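The bit-numbering and padding conventions described above can be sketched in a few lines of Python. The helper names `bit_is_set` and `padded_length` are illustrative assumptions, not part of any BUFR library.

```python
def bit_is_set(octet: int, bit: int) -> bool:
    """Test one bit of an octet using BUFR numbering (bit 1 = most significant)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in 0..255")
    if not 1 <= bit <= 8:
        raise ValueError("bit must be in 1..8")
    return ((octet >> (8 - bit)) & 1) == 1


def padded_length(n_octets: int) -> int:
    """Round a section length up to the next even number of octets."""
    return n_octets + (n_octets % 2)


only_bit_8 = 0b00000001  # only bit 8 set: integer value 1
only_bit_1 = 0b10000000  # only bit 1 set: integer value 128
print(bit_is_set(only_bit_8, 8), bit_is_set(only_bit_1, 1))
```

Note that BUFR bit 1 corresponds to what most programming languages call bit 7 (the 0x80 mask), so a conversion such as `8 - bit` is needed whenever a shift count is computed.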

Theoretically there is no upper limit to the size of a BUFR message but, by convention, BUFR messages are restricted to 15000 octets or 120000 bits.  This limit is set by the capabilities of the Global Telecommunications System (GTS) of the WMO.  The GTS BLOK feature can be used to break very long BUFR messages into parts.  The GTS specification for breaking up very large bulletins using the BBB parameter in the WMO Abbreviated Heading can also be employed.

Figure 3.1.1-1 is an example of a complete BUFR message containing 52 octets.  The end of each section and the number of the octet within each section is indicated above the binary string.  This particular message contains 1 temperature observation of 295.2 degrees K from WMO block/station 72491. Figures 3.1.1-2 through 3.1.1-8 illustrate decoding of the individual sections. The spaces between octets in Figures 3.1.1-2 through 3.1.1-8 were added to improve readability.


 
                                                                                                                     end of section 0  +
octet number       1       |       2      |        3     |      4       |       5      |       6      |       7       |       8      |       1     |       2      |
binary string  01000010010101010100011001010010000000000000000000110100000000110000000000000000
 
octet number       3       |       4      |        5     |      6       |       7      |       8      |       9       |      10     |      11    |     12      |
binary string  00010010000000000000000000111010000000000000000000000000000000000000100100000001
 
                                                                                     end of section 1  + 
octet number       13     |      14     |       15    |     16      |     17      |     18      |       1       |       2     |       3      |       4      |
binary string  00000001000001000001110100001100000000000000000000000000000000000000111000000000
 
                                                                                                                                                     end of section 3  +
octet number       5       |       6      |        7     |      8       |       9      |     10      |      11      |      12     |      13     |     14      |
binary string  00000000000000011000000000000001000000010000000100000010000011000000010000000000
 
                                                                                                                     end of section 4  +
octet number       1       |       2      |        3     |      4       |       5      |       6      |       7      |       8      |       1      |       2      |
binary string  00000000000000000000100000000000100100001111010111011100010000000011011100110111
 
                                                     +  end of section 5
octet number       3       |       4      |
binary string  0011011100110111

 

Figure 3.1.1-1.  Example of a complete BUFR message containing 52 octets


 

3.1.1.2        Section 0 - Indicator Section

Structure 

SECTION 0 | Section 1 | Section 2 | Section 3 | Section 4 | Section 5

Octet No. | Contents
1 – 4     | "BUFR" (coded according to the CCITT International Alphabet No. 5)
5 – 7     | Total length of BUFR message, in octets (including Section 0)
8         | BUFR edition number (currently 3)

Total message length (octets 5 – 7):  The earlier editions of BUFR did not include the total message length.  Thus, in decoding BUFR Edition 0 and 1 messages, there was no way of determining the entire length of the message without scanning ahead to find the individual lengths of each of the sections.  Edition 2 eliminated this problem by including the total message length in octets 5 – 7. 

Edition Number (octet 8):  By design, BUFR Edition 2 contained the BUFR Edition number in octet 8, the same octet position relative to the start of the message as it was in Editions 0 and 1.  By keeping the relative position fixed, a decoder program can determine at the outset which BUFR version was used for a particular message and then behave accordingly.  This meant that archives of records in BUFR Editions 0 or 1 did not need to be updated.

Edition number changes:  The Edition number will change only if there is a structural change to the data representation system such that an existing and functioning BUFR decoder would fail to work properly if given a "new" record to decode.  Edition changes can come about in three main ways.  First, if the basic bit or octet structure of the BUFR record were changed, for example by the addition of something new in one of the "fixed format" portions of the record, computer program changes would obviously be required for the programs to work properly.  The addition of total BUFR message length to octets 5 – 7 of the Indicator Section fell in this category – it caused the Edition number to change from 1 to 2.  The WMO community expects these changes to be kept to a bare minimum.

The second way is if the data description operators in Table C (Data description operators) are augmented.  These operator descriptors are qualitatively different from simple data descriptors:  where the data descriptors just passively describe the data in the record, the operator descriptors are, in effect, instructions to the decoding program to undertake some particular action.  Table C defines what actions are possible.  Descriptors of type 1 (F=1), the replication operators, are also in this category since they too tell the computer program to do something.  Unfortunately, not all of the "operator" type descriptors are collected in Table C.  Some of the nominal data descriptors take on the character of operators when used in conjunction with data replication:  in particular, the "increment" descriptors found in Table B, Classes 4, 5, 6, and 7, and the operator qualifiers in Table B, Class 31.  These topics are expanded on later in Chapter 3.1.

A third change that would require a new Edition would be a change to the Regulations and/or the many notes scattered through the documentation (The "notes", by the way, are as important as the "Regulations" in formally defining BUFR - they contain many of the details that flesh out the rather sparse regulations. Ignore them at your peril.).  This is not particularly likely to happen - more likely will be clarifications to the Regulations or notes that will serve to make the rules more precise in (currently) possibly ambiguous cases.  Whether these cases should be considered as requiring an Edition number change is a matter of some judgment.  The WMO will be the final arbiter.

Sample message decomposition (Indicator Section):  The Indicator Section of the sample BUFR Message shown in Figure 3.1.1-1 is decomposed in detail below.  The hexadecimal equivalent of the first four octets is shown to clarify the representation of the four characters "B", "U", "F", and "R".  Note also that the value of the bits in octet 7 is 52 and the value of the bits in octet 8 is 3.

 
octet number:   1        2        3        4        5        6        7        8
binary string:  01000010 01010101 01000110 01010010 00000000 00000000 00110100 00000011
hexadecimal:    42       55       46       52       00       00       34       03
decoded:        B        U        F        R        |---------- 52 ---------| 3

octets 5 – 7:   length of message in octets (52)
octet 8:        BUFR edition (3)

Figure 3.1.1-2.  Section 0
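The decomposition in Figure 3.1.1-2 can be reproduced with a short Python sketch. The function name `decode_section0` and the dictionary keys are illustrative assumptions. The handling of Editions 0 and 1 follows the description above: the edition number is read from octet 8 in every edition, but a total message length is available only from Edition 2 onwards.

```python
def decode_section0(message: bytes) -> dict:
    """Decode the Indicator Section from the first 8 octets of a BUFR message."""
    if len(message) < 8:
        raise ValueError("need at least 8 octets")
    if message[0:4] != b"BUFR":
        raise ValueError("message does not start with 'BUFR'")
    edition = message[7]  # octet 8, same absolute position in all editions
    # Octets 5-7 hold the total message length only from Edition 2 onwards.
    total_length = int.from_bytes(message[4:7], "big") if edition >= 2 else None
    return {"edition": edition, "total_length": total_length}


# Octets 1-8 of the sample message in Figure 3.1.1-1, written in hexadecimal.
SECTION0 = bytes.fromhex("4255465200003403")
info = decode_section0(SECTION0)
print(info)
```

For the sample message this yields edition 3 and a total length of 52 octets, matching the figure.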

 


3.1.1.3        Section 1 - Identification Section

Structure

C O N T I N U O U S   B I N A R Y   S T R E A M
Section 0 | SECTION 1 | Section 2 | Section 3 | Section 4 | Section 5

Octet No. | Contents
1 – 3     | Length of section, in octets
4         | BUFR master table number – this provides for BUFR to be used to represent data from other disciplines, with their own versions of master tables and local tables.  For example, this octet is zero for standard WMO FM 94 BUFR tables, but ten for standard IOC FM 94 BUFR tables, whose use is focused on oceanographic data.
5         | Originating/generating sub-centre (defined by Originating/generating centre)
6         | Originating/generating centre (Common Code table C-1)
7         | Update sequence number (zero for original BUFR messages; incremented for updates)
8         | Bit 1 = 0  No optional section
          |       = 1  Optional section included
          | Bits 2 – 8 set to zero (reserved)
9         | Data category (BUFR Table A)
10        | Data sub-category (defined by local ADP centres)
11        | Version number of master tables used (currently 9 for WMO FM 94 BUFR tables)
12        | Version number of local tables used to augment the master table in use
13        | Year of century
14        | Month
15        | Day
16        | Hour
17        | Minute
18 -      | Reserved for local use by ADP centres

Length of section (octets 1 – 3):  The length of Section 1 can vary between BUFR messages.  Beginning with Octet 18, a data processing center may add any type of information they choose.  A decoding program need not know what that information may be.  Knowing what the length of the Section is, as indicated in octets 1-3, a decoder program can skip over the information that begins at octet 18 and position itself at the next section, either Section 2, if included, or Section 3.  Bit 1 of octet 8 indicates if Section 2 is included.  If there is no information beginning at octet 18, one octet must still be included (and set to 0) in order to have an even number of octets within the section.

Originating/generating sub-centre (octet 5) and Originating/generating centre (octet 6):  Octet 6 is used to identify the national (or international) originating/generating centres, using the same Common Code table (C-1) as is in use for GRIB.  This table is coordinated and maintained by the WMO and published as part of the codes Manual.  Any national sub-centre numbers that may be required are generated by the national (or international) centre in question and that number is to be placed in octet 5.  Lists of sub-centre numbers should be passed to the WMO Secretariat for publication in the Manual.

Update sequence number (octet 7):  This feature is not widely used, but it is a powerful one.  Note that the rule does require one to re-send an entire message even if only one element in the message is a correction of a previous message element.  The "associated field" (see Section 3.1.6.3) is used to indicate which element(s) is (are) the corrected one(s) within the total message.

Optional Section 2 (octet 8):  This section is not usually sent in international messages but it is put to use in some computer centers that use BUFR frequently in a data base context. Some samples are given in Section 3.1.1.4.  If it is present, the flag in octet 8 must be set to 1.

Data category (octet 9):  The data category (taken from BUFR Table A) provides a quick check of the type of data in the BUFR message.  Processing centres can use this information in their observational data ingest processing suite.

Data sub-category (octet 10):  This is purely a local option, useful in processing the observational data after it has been decoded from BUFR.  By adding this information to the BUFR files in which the ingested data are placed, a processing centre knows in considerable detail just what sort of data is in a BUFR message.  This can make the choice of subsequent processors that much easier.  It also makes it possible to search through a collection of various data types, encoded in BUFR, and select only those of special interest.  This has obvious applications in a data base context.  As an example, here are the sub-types currently in use at the National Centers for Environmental Prediction, Washington, DC, USA:

 

 

BUFR Data Category 0: Surface data – land

Data Sub-type | Description
1             | Synoptic – manual and automatic
7             | Aviation – METAR
11            | SHEF
12            | Aviation – SCD
20            | MESONET – Denver, urban
21            | MESONET – RAWS (NIFC)
22            | MESONET – MesoWest
23            | MESONET – APRS Weather
24            | MESONET – Kansas DOT
25            | MESONET – Florida
30            | MESONET – Other

BUFR Data Category 1: Surface data – sea

Data Sub-type | Description
1             | Ship – manual and automatic
2             | Drifting buoy
3             | Moored buoy
4             | Land based C-MAN station
5             | Tide gage
6             | Sea level pressure bogus
7             | Coast guard
8             | Moisture bogus
9             | SSMI

BUFR Data Category 2: Vertical soundings (other than satellite)

Data Sub-type | Description
0             | Unassigned
1             | Rawinsonde – fixed land
2             | Rawinsonde – mobile land
3             | Rawinsonde – ship
4             | Dropwinsonde
5             | Pibal
7             | Wind profiler (from NOAA)
8             | NEXRAD winds
9             | Wind profiler (from PILOT)

BUFR Data Category 3: Vertical soundings (satellite)

Data Sub-type | Description
1             | Geostationary
2             | Polar orbiting
3             | Sun synchronous

BUFR Data Category 4: Single level upper-air (other than satellite)

Data Sub-type | Description
0             | Unassigned
1             | AIREP
2             | PIREP
3             | AMDAR
4             | ACARS (from ARINC)
5             | RECCO – flight level
6             | E-ADAS

BUFR Data Category 5: Single level upper-air (satellite)

Data Sub-type | Description
10            | NESDIS SATWIND:  GOES – High Density IR
11            | NESDIS SATWIND:  GOES – High Density WV Imagery
12            | NESDIS SATWIND:  GOES – Hi Density Visible
13            | NESDIS SATWIND:  GOES – Picture Triplet
14            | NESDIS SATWIND:  GOES – Hi Density WV Sounding
21            | INDIA SATWIND:  INSAT – IR
22            | INDIA SATWIND:  INSAT – Visible
23            | INDIA SATWIND:  INSAT – WV Imagery
41            | JMA SATWIND:  GMS – IR
42            | JMA SATWIND:  GMS – Visible
43            | JMA SATWIND:  GMS – WV Imagery
50            | NESDIS SATWIND:  GMS – IR
51            | NESDIS SATWIND:  GMS – WV Imagery
64            | EUMETSAT SATWIND:  METEOSAT – IR
65            | EUMETSAT SATWIND:  METEOSAT – VIS
66            | EUMETSAT SATWIND:  METEOSAT – WV Imagery

BUFR Data Category 12: Surface data (satellite)

Data Sub-type | Description
1             | SSM/I – Brightness temperatures
2             | SSM/I – Derived products
3             | GPS – Integrated precipitable water
5             | ERS – SAR
9             | ERS – Radar altimeter data
10            | Navy sea surface temperatures
11            | NESDIS sea surface temperatures
12            | Navy high resolution sea surface temperatures
103           | SSM/I – Neural net 3 products
137           | QUIKSCAT data

BUFR Data Category 31: Oceanographic data

Data Sub-type | Description
1             | BATHY
2             | TESAC
3             | TRACKOB
11            | NLSA ERS2:  Altimeter – high resolution
12            | NLSA TOPEX:  Altimeter – high resolution
13            | NLSA TOPEX:  Altimeter – low resolution
14            | NLSA GFO:  Altimeter – high resolution

 

Date/time (octets 13 – 17):  The Manual suggests placing the date/time "most typical for the BUFR message content" (whatever that may mean) in the appropriate octets.  For synoptic observations, the nominal synoptic time is obviously appropriate.  But the exact time of the observation can be placed in the body of the message if this is of interest or value to the users of the data.  Collections of satellite observations, which are inherently asynoptic, by convention (at least at NOAA) carry the time of the first observation of the collection in the date/time octets.  The exact times for each satellite observation will, of course, be in the body of the message.

As the Year 2000 rollover period approached, it was realized the Year of century was not being encoded uniformly because the regulations specifying the values to use for Year of century were not clearly stated.  To that end, a new note was added to the Identification Section.  The new note reads:  "To specify the year 2000, octet 13 (Year of century) must contain a value of 100.  To specify the year 2001, octet 13 must contain a value of 1 (by International Convention, the date of 1 January 2000 was the first day of the hundredth year of the twentieth century and the date of 1 January 2001 was the first day of the first year of the twenty-first century).  One should also note that year 2000 was a leap year, and February 29, 2000 exists."  Lack of specification of the Century in BUFR was also felt to be a deficiency, and some processing centres have begun the practice of using octet 18 (see below) of this section for that value.
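The Year of century rule in the note can be expressed as simple arithmetic. The following Python sketch uses hypothetical helper names (`split_year`, `join_year`). It assumes the century is expressed as an ordinal (20 for the twentieth century), which is consistent with the wording of the note; how individual centres actually encode the century in octet 18 is a local convention that is not specified here.

```python
def split_year(year: int) -> tuple:
    """Split a calendar year into (ordinal century, year of century).

    Follows the note: 2000 -> (20, 100) and 2001 -> (21, 1).
    """
    if year < 1:
        raise ValueError("year must be positive")
    century = (year - 1) // 100 + 1
    year_of_century = (year - 1) % 100 + 1
    return century, year_of_century


def join_year(century: int, year_of_century: int) -> int:
    """Rebuild the calendar year from an ordinal century and octet 13."""
    if not 1 <= year_of_century <= 100:
        raise ValueError("year of century must be in 1..100")
    return (century - 1) * 100 + year_of_century


for y in (1999, 2000, 2001):
    print(y, split_year(y))
```

Without a century value, octet 13 alone is ambiguous; a decoder then has to infer the century from context, for example from the time the message was received.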

"Reserved for use ..." (octets 18 - ):  It is not expected that international BUFR messages will contain anything past octet 18.  However, octet 18 itself, which is also reserved for local use, must be present in order to maintain an even number of octets in the Identification Section.  Traditionally, octet 18 was set to zero.  However, as noted above, some centres now use this octet for the Century.  Nevertheless, there is no real damage if Section 1 is "extended" past octet 18, because the "Length of section" in octets 1-3 indicates the full size of Section 1.  Any operational decoding program worthy of the name will check the number in octets 1-3 and respond accordingly, presumably by skipping the extra material.

Sample message decomposition (Identification Section):  The Identification Section of the sample BUFR Message shown in Figure 3.1.1-1 is decomposed in detail below:


 

octet number:   1        2        3        4        5        6        7        8
binary string:  00000000 00000000 00010010 00000000 00000000 00111010 00000000 00000000
decoded:        0        0        18       0        0        58       0        0

octets 1 – 3:   length of section (18)
octet 4:        standard BUFR tables
octets 5 – 6:   originating center (US Navy - FNMOC)
octet 8:        flag indicating Section 2 not included

octet number:   9        10       11       12       13       14       15       16
binary string:  00000000 00000000 00001001 00000001 00000001 00000100 00011101 00001100
decoded:        0        0        9        1        1        4        29       12

octet 9:        data category
octet 10:       data sub-category
octet 11:       version of master tables
octet 12:       version of local tables
octet 13:       year of century
octet 14:       month
octet 15:       day
octet 16:       hour

octet number:   17       18
binary string:  00000000 00000000
decoded:        0        0

octet 17:       minute
octet 18:       local use

Figure 3.1.1-3.  Section 1
 

 


3.1.1.4        Section 2 - Optional Section

Structure

C O N T I N U O U S   B I N A R Y   S T R E A M
Section 0 | Section 1 | SECTION 2 | Section 3 | Section 4 | Section 5

Octet No. | Contents
1 – 3     | Length of section, in octets
4         | Set to zero (reserved)
5 -       | Reserved for use by ADP centres

Use

Section 2 may or may not be included in any BUFR message.  When it is contained within a BUFR message, bit 1 of octet 8 in Section 1 is set to 1.  If Section 2 is not included in a message then bit 1 of octet 8 in Section 1 is set to 0.  Section 2 may be used for any purpose by an originating center.  The only restrictions on the use of Section 2 are that octets 1 - 3 are set to the length of the section, octet 4 is set to zero and the section contains an even number of octets.

A typical use of this optional section could be in a data base context.  The section might contain pointers into the data section of the message, pointers that indicate the relative location of the start of individual sets of observations (one station's worth, for example) in the data.  There could also be some sort of index term included, such as the WMO block and station number.  This would make it quite easy to find a particular observation quickly and avoid decoding the whole message just to find one or two specific data elements.

Note that the Optional Section is not present in the sample BUFR Message shown in Figure 3.1.1-1.
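Because every section except Sections 0 and 5 begins with its own length, a decoder can locate each section without interpreting its contents, and the flag in octet 8 of Section 1 tells it whether to expect Section 2. The following Python sketch shows one way to do this for Edition 2 and 3 messages. The names `split_sections` and `_take` are hypothetical, and the sample octets are transcribed from Figures 3.1.1-1 through 3.1.1-3.

```python
SAMPLE = bytes.fromhex(
    "4255465200003403"                      # Section 0
    "00001200003A00000000090101041D0C0000"  # Section 1
    "00000E00000180010101020C0400"          # Section 3
    "0000080090F5DC40"                      # Section 4
    "37373737"                              # Section 5
)


def _take(msg: bytes, pos: int) -> tuple:
    """Return (section bytes, next position) for a section starting at pos."""
    length = int.from_bytes(msg[pos:pos + 3], "big")
    if length < 4 or pos + length > len(msg):
        raise ValueError(f"bad section length {length} at offset {pos}")
    return msg[pos:pos + length], pos + length


def split_sections(msg: bytes) -> dict:
    """Split an Edition 2 or 3 BUFR message into its sections."""
    if msg[0:4] != b"BUFR":
        raise ValueError("not a BUFR message")
    if msg[7] < 2:
        raise ValueError("this sketch handles Editions 2 and 3 only")
    total = int.from_bytes(msg[4:7], "big")
    if len(msg) < total:
        raise ValueError("message is truncated")
    sections = {0: msg[0:8]}
    sections[1], pos = _take(msg, 8)
    if sections[1][7] & 0x80:  # bit 1 of octet 8: Section 2 included
        sections[2], pos = _take(msg, pos)
    sections[3], pos = _take(msg, pos)
    sections[4], pos = _take(msg, pos)
    sections[5] = msg[pos:pos + 4]
    if sections[5] != b"7777" or pos + 4 != total:
        raise ValueError("End Section not found where expected")
    return sections


parts = split_sections(SAMPLE)
print({number: len(octets) for number, octets in parts.items()})
```

The final check compares the position of "7777" with the total length from Section 0, which catches most cases of corrupted section lengths.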


3.1.1.5        Section 3 - Data Description Section

Structure

 C O N T I N U O U S   B I N A R Y   S T R E A M

Section 0 | Section 1 | Section 2 | SECTION 3 | Section 4 | Section 5

Octet No.    Contents
1 – 3        Length of section, in octets
4            Set to zero (reserved)
5 – 6        Number of data subsets
7            Bit 1 = 1 observed data
                   = 0 other data
             Bit 2 = 1 compressed data
                   = 0 non-compressed data
             Bits 3 - 8 set to zero (reserved)
8 -          A collection of descriptors which define the form and content of individual data elements comprising one data subset in the data section (Section 4)

Number of data subsets (octets 5 – 6):  BUFR Regulation 94.5.2 states "... Octet 8 and subsequent octets shall contain a collection of descriptors which define the form and content of individual data elements in the Data Section.  A "data subset" shall be defined as the subset of data described by one single application of this collection of descriptors."  In this context, the "collection of descriptors" means ALL the descriptors included in Section 3 of the BUFR message.  In other words, one pass through the complete collection of descriptors will allow one to decode one data subset from Section 4.  One then loops back in the descriptor list for as many times as indicated by the Number of data subsets in octets 5 – 6.  All the data in Section 4 are properly described by repeated use of the same set of descriptors from Section 3.

This does not imply that the data subsets are themselves identical in format.  The use of delayed replication, as in a collection of TEMPs with varying numbers of significant levels, could cause variations in format (octet count) among data subsets.  But they are still considered "subsets" in that the same set of descriptors will properly describe each individual set.  The use of the delayed replication descriptor is what makes this possible, and is what delayed replication was designed for.

As we will see in Chapter 3.1.6, certain operator descriptors from Table C can be used to redefine reference values, data lengths and scale factors, and to add associated fields.  There is also a group of descriptors that "remain in effect until superseded by redefinition".  However, Regulation 94.5.3.9 states, "If a BUFR message is made up of more than one subset, each subset shall be treated as though it was the first subset encountered."  This Regulation means that ALL of these redefinitions or "remain in effect" properties are canceled when one cycles back to reuse a set of descriptors for a new data subset.  You wipe the slate clean and start as though it was the first time.

Even though data subsets may be compressed and, as a result, the individual elements in each data subset are all reordered, the data subset concept still holds.  The data subset count must be included in the correct location, and must be correct.  It is impossible to decompress a message without that information; and even if the data are not compressed the count is necessary to retrieve all the data subsets in a given message.

If octets 5-6 indicate that there is more than one data subset in the message, with the total number of the subsets given in those octets, then multiple sets of observations, all with the same format (as described by the data descriptors) will be found in Section 4.  This is, for example, a means of building "collectives" of observations.  Doing so realizes a large portion of the potential efficiency of BUFR.

Flag Bit 1 (octet 7):  Conceptually, one subset is a collection of related meteorological data.  For observational data (Flag bit 1 = 1), each subset usually corresponds to one "observation", where "observation", in this context, could mean one surface synoptic report, one rawinsonde ascent, one profiler sounding, one satellite derived sounding with radiances, etc.  No examples of non-observational data subsets (Flag bit 1 = 0) are given in the BUFR specifications in the Manual on Codes, but a typical one would be a message consisting of a collection of numerical model forecasts of "soundings" at grid-points or other specific locations.  Each forecast sounding (pressure, temperature, wind, relative humidity, whatever, at the many levels of the model) would then be one data subset.

Flag Bit 2 (octet 7):  If the data in Section 4 is compressed, bit 2 of octet 7 is set to one.  If the data is not compressed, it is set to zero.  The nature of "data compression" will be described in Chapter 3.1.5.

Sample message decomposition (Data Description Section):  The Data Description Section of the sample BUFR Message shown in Figure 3.1.1-1 is decomposed in detail below.  The data descriptors are given in octets 8 – 13.  Note that octet 14 has been added and set to zero to ensure the Data Description Section contains an even number of octets.


 

Figure 3.1.1-4.  Section 3

 
octet number:
        1      |       2       |        3      |       4       |       5       |       6       |       7       |
binary string:
00000000 00000000 00001110 00000000 00000000 00000001 10000000
decoded:                           14           0                0              1         ||              
length of section -+----------------¦
                                    reserved -----+------¦
                               number of data subsets -----------+-------------¦
                                                        flag indicating observed data+
                                              flag indicating non-compressed data+
 
 
octet number:
        8      |       9       |      10      |     11       |     12       |      13      |      14      
binary string:
00000001 00000001 00000001 00000010 00001100 00000100 00000000
decoded:
  0        01           001   0        01           002   0        12           004        0
descriptors in F X Y format:
     0 01 001              |       0 01 002            |      0 12 004            ¦
      needed to complete section with an even number of octets ----+-------
 
Recall from Layer 2 that descriptors are composed of three parts - F (2 bits), X (6 bits), and Y (8 bits).  Figure 3.1.1-5 describes the decoding of the three descriptors contained in octets 8 – 13 in more detail.  Descriptors themselves are discussed at length in Section 3.1.2.
 
 
octet number            8                         9                      10                  11
 
binary string    0 0 0 0 0 0 0 1  0 0 0 0 0 0 0 1  0 0 0 0 0 0 0 1  0 0 0 0 0 0 1 0
                       |     ¦                ¦                         ¦      ¦                 ¦                        ¦
decoded         +- 0+---01-----+ +------001------++-0+-----01----+ +-----002-------+
meaning            F        X                    Y               F         X                   Y
                 +---------------descriptor 1------------++--------------descriptor 2---------+
 
octet number               12                    13
 
binary string    0 0 0 0 1 1 0 0  0 0 0 0 0 1 0 0
                       |     ¦                 ¦                        ¦
decoded         +-0+----12------++------004------+
meaning           F          X                   Y
                       +------------descriptor 3----------+
 

Figure 3.1.1-5.  Decoding of Octets 8 – 13 of Section 3
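The decomposition shown in Figures 3.1.1-4 and 3.1.1-5 can be reproduced with a few lines of code.  The following Python sketch is illustrative only: the function name parse_section3 and the layout of the returned dictionary are assumptions made for this example and are not part of the BUFR specification.  It assumes the Section 3 layout tabulated above and takes the octets of the sample Section 3 from Figure 3.1.1-4.

```python
def parse_section3(sec3: bytes) -> dict:
    """Unpack the fixed header and the descriptor list of a BUFR Section 3.

    Hypothetical helper written for this guide; it is not a library API.
    """
    length = int.from_bytes(sec3[0:3], "big")      # octets 1-3
    n_subsets = int.from_bytes(sec3[4:6], "big")   # octets 5-6
    flags = sec3[6]                                # octet 7
    observed = bool(flags & 0x80)                  # bit 1 (most significant)
    compressed = bool(flags & 0x40)                # bit 2

    descriptors = []
    pos = 7                                        # octet 8 is index 7
    # Each descriptor is 2 octets; a single left-over octet is padding.
    while pos + 2 <= length:
        word = int.from_bytes(sec3[pos:pos + 2], "big")
        f = word >> 14                             # first 2 bits
        x = (word >> 8) & 0x3F                     # next 6 bits
        y = word & 0xFF                            # last 8 bits
        descriptors.append((f, x, y))
        pos += 2

    return {
        "length": length,
        "n_subsets": n_subsets,
        "observed": observed,
        "compressed": compressed,
        "descriptors": descriptors,
    }


# Section 3 of the sample message (Figure 3.1.1-4), in hexadecimal.
SAMPLE_SEC3 = bytes.fromhex("00 00 0e 00 00 01 80 01 01 01 02 0c 04 00")

if __name__ == "__main__":
    print(parse_section3(SAMPLE_SEC3))
```

For the sample octets this yields a length of 14, one data subset, observed and non-compressed data, and the three descriptors 0 01 001, 0 01 002 and 0 12 004.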

 


3.1.1.6        Section 4 - Data Section

Structure

 C O N T I N U O U S   B I N A R Y   S T R E A M

Section 0 | Section 1 | Section 2 | Section 3 | SECTION 4 | Section 5

Octet No.    Contents
1 – 3        Length of section, in octets
4            Set to zero (reserved)
5 -          Binary data, as defined by the descriptors that begin at octet 8 of Section 3.

 

Sample message decomposition (Data Section)

The Data Section of the sample BUFR Message shown in Figure 3.1.1-1 is decomposed in detail below.  The length of the Data Section, given in octets 1 – 3, is 8, and octet 4 is reserved.  The remaining octets 5-8 contain the data itself:

 
octet number:
        1      |       2       |        3      |       4        |       5       |       6      |       7        |        8
binary string:
00000000 00000000 00001000 00000000 10010000 11110101 11011100 01000000
                                                  ¦                ¦
decoded                               8   |                |      
---------------length of section ---+ reserved+------Data as described by descriptors-------¦
                                                                                in Section 3 (Figure 1-6)
 ----+ 

Figure 3.1.1-6.  Section 4

 
Now, consider octets 5 – 8 in more detail.  First, recall from octets 8 – 13 of the Data Description Section that the data is described by the three descriptors 0 01 001, 0 01 002, and 0 12 004.  How to determine the characteristics of the data indicated by these three descriptors will be discussed later in Layer 3.  For now, suffice it to note the following:
 
 

DESCRIPTOR
F  X   Y     NAME                          UNIT      SCALE   REFERENCE VALUE   DATA WIDTH (Bits)
0  01  001   WMO block number              Numeric   0       0                 7
0  01  002   WMO station number            Numeric   0       0                 10
0  12  004   Dry-bulb temperature at 2 m   K         1       0                 12

 
 
Thus, the WMO block number occupies the first 7 bits of octets 5 – 8 in the Data Section, or 1001000.  This binary string of 7 bits has a value of 72, so the WMO block number is 72.  The WMO station number occupies the next 10 bits of octets 5 – 8 in the Data Section, or 0111101011.  This binary string of 10 bits has a value of 491, so the WMO station number is 491.  The temperature, in kelvin, occupies the next 12 bits of octets 5 – 8 in the Data Section, or 101110001000.  This binary string of 12 bits has a value of 2952.  Since the scale for this temperature descriptor is 1, we must divide by 10 to retrieve the original value of 295.2 K.  This accounts for 29 of the 32 bits in octets 5 – 8 of the Data Section.  Since all sections of a BUFR message must have an even number of octets and end on an octet boundary, the last three bits of octet 8 are set to zero.  This decomposition is depicted in Figure 3.1.1-7 below.
 
octet number:                5                     6                         7                     8
 
Binary string:  1 0 0 1 0 0 0 0  1 1 1 1 0 1 0 1  1 1 0 1 1 1 0 0  0 1 0 0 0 0 0 0
                       ¦                  ¦ ¦                              ¦ ¦                                 ¦ ¦       ¦
decoded:        +---- 72 ----+ +--------- 491 --------++---------- 2952 --------++-----+
                                                                         3 bits of zero to end octet+-- --+

Figure 3.1.1-7.  Decoding of Octets 5 – 8 of Section 4
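The bit-level decoding illustrated in Figure 3.1.1-7 can be sketched in Python as follows.  This is a minimal illustration, not a general decoder: the BitReader class and the decode_subset function are hypothetical names introduced here, and the scale, reference value and data width of each element are hard-coded from the table above rather than looked up in Table B.  The conversion used is value = (raw + reference value) / 10^scale; in this sample every reference value is 0, so only the scale has an effect.

```python
class BitReader:
    """Reads unsigned integers of arbitrary bit width from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit offset from the start of data

    def read(self, width: int) -> int:
        value = 0
        for _ in range(width):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1  # most significant bit first
            value = (value << 1) | bit
            self.pos += 1
        return value


# (name, scale, reference value, data width in bits), copied from the table
# above; a real decoder would obtain these from BUFR Table B.
ELEMENTS = [
    ("WMO block number", 0, 0, 7),
    ("WMO station number", 0, 0, 10),
    ("Dry-bulb temperature", 1, 0, 12),
]


def decode_subset(payload: bytes) -> dict:
    """Decode one non-compressed data subset described by ELEMENTS."""
    reader = BitReader(payload)
    decoded = {}
    for name, scale, reference, width in ELEMENTS:
        raw = reader.read(width)
        decoded[name] = (raw + reference) / 10 ** scale
    return decoded


# Octets 5 - 8 of Section 4 of the sample message (Figure 3.1.1-7).
SAMPLE_DATA = bytes.fromhex("90 f5 dc 40")

if __name__ == "__main__":
    for name, value in decode_subset(SAMPLE_DATA).items():
        print(name, value)
```

Applied to the sample octets, the sketch recovers block number 72, station number 491 and a temperature of 295.2 K, matching the hand decoding above.  The three padding bits at the end of octet 8 are simply never read.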

 

3.1.1.7        Section 5 - End Section

Structure

 C O N T I N U O U S   B I N A R Y   S T R E A M

Section 0 | Section 1 | Section 2 | Section 3 | Section 4 | SECTION 5

Octet No.    Contents
1 – 4        "7777" (coded according to the CCITT International Alphabet No. 5)

Sample message decomposition (End Section)

The End Section of the sample BUFR Message shown in Figure 3.1.1-1 (and the End Section of all BUFR messages) is decomposed in detail below.  The hexadecimal equivalent of the four octets is shown to clarify the representation of the four characters "7", "7", "7", "7":

octet number:         1              2              3                4
 
binary string:  00110111 00110111 00110111 00110111
 
hexadecimal:      3   7          3   7         3   7          3   7  
 
decoded:               7                7             7               7

Figure 3.1.1-8. Section 5

 


3.1.1.8        Required Entries

In any BUFR message there are required entries, and there will be a minimum number of bits to represent even the smallest amount of data.  Therefore, there will be a minimum length for any BUFR message.  The required octets and the minimum number of octets in each section are:

Section 0, octets 1 – 8 required

                   Section 0 will always contain 8 octets.

Section 1, octets 1 – 18 required

                   Section 1 will contain a minimum of 18 octets.

Section 3, octets 1 – 7 required

Section 3 will contain a minimum of 10 octets.  The data descriptors begin in octet 8.  A single data descriptor occupies 16 bits, or 2 octets.  Since the Section must contain at least one descriptor and have an even number of octets, there will be a minimum of 10 octets in Section 3.  Note that Section 3 will always conclude with 8 bits set to zero since all descriptors are 16 bits in length and the first descriptor begins in octet 8.

Section 4, octets 1 – 4 required

Section 4 will contain a minimum of 6 octets.  The data is in octets 5 and beyond.  However, since the Section must contain an even number of octets there must be at least 2 octets after octet 4, so there will be a minimum of 6 octets in Section 4.

Section 5, octets 1 – 4 required

                   Section 5 will always contain 4 octets.

There will thus be a minimum of 46 octets, or 368 bits, in any BUFR message.  For each section, the minimum number of bits is:

 

             C O N T I N U O U S   B I N A R Y   S T R E A M

Section 0    (8 octets)     64 bits
Section 1    (18 octets)    144 bits
Section 2    (optional)
Section 3    (10 octets)    80 bits
Section 4    (6 octets)     48 bits
Section 5    (4 octets)     32 bits

 

Figure 3.1.1-9 is the same BUFR message used in Figures 3.1.1-1 to 3.1.1-8.  However, in Figure 3.1.1-9, those octets that are required in any BUFR message, as described above, are shown in bold.  Not included in the bold areas are descriptors contained in octets 8 - 14 of Section 3 and the data in Octets 5 - 8 of section 4.


                                                                                                                       end of section 0  +
octet number       1       |       2      |        3     |      4       |       5      |       6      |       7       |       8      |       1     |       2      |
binary string  01000010010101010100011001010010000000000000000000110100000000110000000000000000
 
octet number       3       |       4      |        5     |      6       |       7      |       8      |       9       |      10     |      11    |     12      |
binary string  00010010000000000000000000111000000000000000000000000000000000000000100100000001
 
                                                                                     end of section 1  + 
octet number       13     |      14     |       15    |     16      |     17      |     18      |       1       |       2     |       3      |       4      |
binary string  00000001000001000001110100001100000000000000000000000000000000000000111000000000
 
                                                                                                                                                     end of section 3  +
octet number       5       |       6      |        7     |      8       |       9      |     10      |      11      |      12     |      13     |     14      |
binary string  00000000000000011000000000000001000000010000000100000010000011000000010000000000
 
                                                                                                                     end of section 4  +
octet number       1       |       2      |        3     |      4       |       5      |       6      |       7      |       8      |       1      |       2      |
binary string  00000000000000000000100000000000100100001111010111011100010000000011011100110111
 
                                                     +  end of section 5
octet number       3       |       4      |
binary string  0011011100110111

 

Figure 3.1.1-9.  Required entries in the sample BUFR message


3.1.1.9        BUFR and Data Management

Sections 3 and 4 of BUFR contain all of the information necessary for defining and representing data.  The remaining sections are defined and included purely as aids to data management.  Key information within these sections is available from fixed locations relative to the start of each section.  It is thus possible to categorize and classify the main attributes of BUFR data without decoding the data description in Section 3 or the data in Section 4.
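Because the length and flag fields sit at fixed offsets, a message can be split into its sections without interpreting any descriptors or data.  The Python sketch below illustrates this under stated assumptions: it assumes the edition 3 layout used throughout this chapter (total message length in octets 5 – 7 of Section 0, Section 2 flag in bit 1 of octet 8 of Section 1), the function name split_sections is hypothetical, and the sample octets were transcribed by hand from Figure 3.1.1-9.

```python
def split_sections(msg: bytes) -> dict:
    """Split a BUFR message into sections using only the length fields.

    Hypothetical helper for illustration; assumes the edition 3 layout.
    """
    if msg[0:4] != b"BUFR":
        raise ValueError("message does not start with 'BUFR'")
    total = int.from_bytes(msg[4:7], "big")       # Section 0, octets 5-7
    sections = {0: msg[0:8]}                      # Section 0 is always 8 octets
    pos = 8

    def take(start: int):
        # Sections 1 to 4 all carry their own length in octets 1-3.
        length = int.from_bytes(msg[start:start + 3], "big")
        return msg[start:start + length], start + length

    sections[1], pos = take(pos)
    if sections[1][7] & 0x80:                     # Section 1, octet 8, bit 1
        sections[2], pos = take(pos)              # optional section present
    sections[3], pos = take(pos)
    sections[4], pos = take(pos)
    sections[5] = msg[pos:pos + 4]
    if sections[5] != b"7777" or pos + 4 != total:
        raise ValueError("end section not found where expected")
    return sections


# The sample message, transcribed by hand from Figure 3.1.1-9.
SAMPLE_MESSAGE = bytes.fromhex(
    "42 55 46 52 00 00 34 03 "                                  # Section 0
    "00 00 12 00 00 38 00 00 00 00 09 01 01 04 1d 0c 00 00 "    # Section 1
    "00 00 0e 00 00 01 80 01 01 01 02 0c 04 00 "                # Section 3
    "00 00 08 00 90 f5 dc 40 "                                  # Section 4
    "37 37 37 37"                                               # Section 5
)

if __name__ == "__main__":
    for number, octets in sorted(split_sections(SAMPLE_MESSAGE).items()):
        print("Section", number, "-", len(octets), "octets")
```

A data management application could stop here and route or index the message using only the Section 1 octets, leaving Sections 3 and 4 undecoded.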

 

3.1.2           BUFR Descriptors

3.1.2.1        Fundamentals of BUFR Descriptors

Section 3 of a BUFR message contains pointers to the information needed to encode and decode the parameters contained in Section 4 of a BUFR message.  The needed information itself is contained in the BUFR Tables that the pointers refer to.  These pointers are called "descriptors".  Descriptors consist of two octets, or 16 bits.  However, the 16 bits are not to be treated as a 16 bit numeric value, but rather as 16 bits divided into 3 parts F, X, and Y, where the parts (F, X and Y) themselves are 2, 6 and 8 bits, respectively.

Schematically, a BUFR descriptor can be visualized as follows:

 

 
    F      |      X      |      Y
  2 BITS   |    6 BITS   |    8 BITS

 

F denotes the type of descriptor.  With 2 bits, there are 4 possible values for F: 0, 1, 2 and 3.  The four values have the following meanings:

F = 0 Element descriptor, and refers to Table B entries

F = 1 Replication operator

F = 2 Operator descriptor, and refers to Table C entries

F = 3 Sequence descriptor, and refers to Table D entries

The meanings of and uses for X and Y depend on the value of F.

Case 1:      F= 0 or 3

When F is 0 or 3, the descriptor refers to BUFR Tables B or D, and X (6 bits) indicates the class or category of descriptor within the Table.  With 6 bits, there are 64 possibilities, classes 00 to 63.  Classes 48 to 63 are reserved for local use.  Thus far, 29 of the 48 Table B classes and 19 of the 48 Table D classes allocated for international coordination have been defined.  Y (8 bits) indicates the entry within a class X.  Eight bits yields 256 possibilities, 000 to 255, within each of the 64 classes.  Entries 192 to 255 within all classes are reserved for local use.  A varying number of entries are currently defined within each of the internationally coordinated Table B and Table D classes.  Some of the Classes, Class 2 of Table B (instrumentation) in particular, have become alarmingly crowded.

Case 2:      F = 1

When F = 1, the descriptor is a "replication operator".  Replication is the repetition of a single parameter or a group of parameters some number of times, as in a TEMP or PILOT report.  In a replication operator, X gives the number of parameters to be repeated and Y gives the number of times the parameter or group of parameters is to be repeated.  If Y = 0, the number of times the parameter or group of parameters is to be repeated is found in the Data Section.  This is useful when the number of repetitions is not known ahead of time.  Examples of the use of replication operators will be discussed in Chapter 3.1.4.

 

Case 3:      F = 2

When F = 2, the descriptor is an "operator descriptor", and refers to BUFR Table C.  Operator descriptors from Table C are used when there is a need to redefine Table B attributes temporarily, such as the need to change data width, scale or reference value of a Table B entry.  Operator descriptors are also used to add associated fields such as quality control information, indicate characters as data items, and signify data width of local descriptors.  In an operator descriptor, X gives the number of the operator descriptor within Table C and Y is the operand for the operator descriptor.
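The three cases above amount to a small dispatch on the value of F.  The following Python sketch summarizes them; the function name describe_descriptor and the wording of the returned strings are assumptions made for this illustration.  The local-use test simply applies the ranges stated under Case 1 (classes 48 to 63, entries 192 to 255).

```python
def describe_descriptor(f: int, x: int, y: int) -> str:
    """Return a one-line description of a BUFR descriptor F X Y.

    Hypothetical helper for illustration; it does not consult any table.
    """
    if not (0 <= f <= 3 and 0 <= x <= 63 and 0 <= y <= 255):
        raise ValueError("F, X, Y must fit in 2, 6 and 8 bits respectively")

    if f in (0, 3):
        # Case 1: X is the class, Y the entry within that class.
        kind, table = ("element", "B") if f == 0 else ("sequence", "D")
        scope = "local use" if (x >= 48 or y >= 192) else "international"
        return f"{kind} descriptor, Table {table}, class {x:02d}, entry {y:03d} ({scope})"

    if f == 1:
        # Case 2: X parameters are repeated Y times; Y = 0 defers the count.
        if y == 0:
            return f"replication of {x} parameter(s), count given in the Data Section"
        return f"replication of {x} parameter(s), {y} time(s)"

    # Case 3: X selects the Table C operator, Y is its operand.
    return f"operator descriptor, Table C operator {x:02d}, operand {y}"


if __name__ == "__main__":
    for fxy in [(0, 12, 4), (0, 48, 1), (1, 2, 0), (1, 2, 5), (2, 1, 10), (3, 1, 1)]:
        print(fxy, "->", describe_descriptor(*fxy))
```

A real decoder would go on to look up the Table B, C or D entry that each descriptor points to; the sketch stops at the classification step.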

 

3.1.2.2             Coordinate Descriptors

The descriptors in Classes 00 through 09 (with 03 and 09 at present reserved for future use) have a special meaning added to them over and above the specific data elements that they describe.  They (or the data they represent) "remain in effect until superseded by redefinition" (see Regulation 94.5.3.3).  By this is meant that the data in these classes serve as coordinates (in a general sense) for all the following observations.  Once a 0 04 004 descriptor (which describes the "hour") is encountered, one must assume that the hour (a time coordinate) applies to all the following observed parameters, until either another 0 04 004 descriptor is encountered or the end of the data subset is reached.

Obviously the familiar coordinates (the two horizontal dimensions - Classes 05 and 06, the vertical dimension – Class 07, and time – Class 04) are in this sub-category of descriptors.  However, some features that one might not think of as "coordinates", other than in a general sense, are in this sub-category as well.  Forms of "identification" of the observing platform (block and station number, aircraft tail number, etc.) are "coordinates" in this sense, in that they most certainly apply to all the observed parameters taken from that platform and they "remain in effect until superseded by redefinition".  The instrumentation that is used to take the measurements (Class 02) also falls in the same category - it applies to all the actual observed values of a particular parameter because all those observed values were measured with that particular instrument.  However, the "coordinate" nature of this Class is more complex because some observations (like SYNOPs) involve several instruments, and therefore the instrumentation would need to be redefined a number of times in an individual SYNOP report.  Nevertheless, the "coordinate" philosophy still applies for an individual observed quantity.

A source of confusion can arise by noting that some parameters (height and pressure, for example) appear twice in the Tables: in Class 07 (for values used as coordinates, or the independent variable) and again in Class 10 (for reported values, or the dependent variable).  Which table descriptor is appropriate depends on the nature of the measurement that involves these parameters.  A Radiosonde, which measures wind, temperature, and humidity (and geopotential height by calculation) as a function of pressure, would report the pressure values using Class 07 (the vertical coordinate or independent variable) and the other parameters from the non-coordinate classes (10 for geopotential, 11 for wind, 12 for temperature, and 13 for humidity).  An aircraft radar altimeter, on the other hand, might calculate pressure (and use Class 10 to report the value) as a function of its height measurement (Class 07).

Yet another kind of "coordinate" is embedded in Class 08 - Significance Qualifiers.  These are a way of reporting various qualitative pieces of information about the (following) data elements, beyond their numeric values, that can be important to the user of the data.  There are cases where it makes no sense to have a particular kind of significance "remain in effect" for the rest of the message (or to the end of the data subset), but originally there was no explicit way to cancel it.  This issue was resolved with the addition of Note (2) to Class 08, which states:  "A previously defined significance may be cancelled by transmitting a "missing" from the appropriate code or flag table."

There is an exception to the "remain in effect until redefined" rule: when two identical descriptors from Classes 04 to 07 are placed back to back, that is to be interpreted as defining a range of coordinates.  In this way a layer, an area, a volume, or a span of time can be defined as needed.  If the same descriptor shows up later on in the message, then that appearance does indeed redefine that particular coordinate value even if the original coordinates defined a range.  The others still remain in effect.
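The "remain in effect" rule and the range exception described above can be sketched as a small piece of applications-side bookkeeping.  The Python example below is an assumption-laden illustration, not part of any decoder described in this guide: the function name attach_coordinates and the (descriptor, value) input format are hypothetical, the cancellation of Class 08 significance by a "missing" value is not handled, and Class 25 is not treated as a coordinate class.

```python
def attach_coordinates(items):
    """Pair each non-coordinate value with the coordinates in effect.

    items: list of ((f, x, y), value) pairs decoded from ONE data subset.
    Returns a list of (descriptor, value, coordinates) tuples.
    Hypothetical helper for illustration only.
    """
    coords = {}   # starts empty for every subset (Regulation 94.5.3.9)
    result = []
    i = 0
    while i < len(items):
        desc, value = items[i]
        f, x, _ = desc
        if f == 0 and x <= 9:
            # Classes 00-09: the value stays in effect until redefined.
            nxt = items[i + 1] if i + 1 < len(items) else None
            if 4 <= x <= 7 and nxt is not None and nxt[0] == desc:
                # Two identical Class 04-07 descriptors back to back
                # define a range of the coordinate.
                coords[desc] = (value, nxt[1])
                i += 2
                continue
            coords[desc] = value
        else:
            # Snapshot the coordinates that apply to this observed value.
            result.append((desc, value, dict(coords)))
        i += 1
    return result


if __name__ == "__main__":
    HOUR = (0, 4, 4)    # 0 04 004, the "hour" descriptor cited above
    OBS = (0, 12, 4)    # stands in for any non-coordinate element
    subset = [(HOUR, 12), (OBS, 295.2), (HOUR, 18), (OBS, 290.1)]
    for row in attach_coordinates(subset):
        print(row)
```

In the usage example the first observed value is paired with hour 12 and the second with hour 18, because the second occurrence of 0 04 004 redefines the coordinate.  Calling the function afresh for each data subset gives the "clean slate" behaviour required when the descriptor list is reused.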

Unfortunately some coordinate-like information has appeared in a Table outside the Class 00-09.  Class 25 - Processing information - largely dealing with radar information, contains information that by its nature "remains in effect until superseded".  It should be considered as a "coordinate" class and may get such an official designation in the future.  If it does, it would not involve any changes to the structure of BUFR or the tables, only a change in interpretation, or "meaning", of the data elements.

There is not much a general BUFR decoder program can do with this "coordinate" information, other than decode it and pass the information on to some follow-on applications program.  It is up to the applications program (or the human reading a decoded message) to supply the interpretation and the meaning of what is there, and then to act accordingly.  Some of the interpretation is straightforward, almost second nature.  "Obviously" the station identification applies to the following observations made at that station; "obviously" this pressure level is where the RAOB measured the wind and temperature; perhaps not so obvious is the fact that two consecutive azimuth values define a sector in which a hurricane is located.  Making the "obvious" explicit with rules, regulations, and footnotes is part of what BUFR is all about.  The developers of BUFR made every effort to EXCLUDE as much "self-evident" information as possible and instead require that "meaning" be specified by definite rules - that is, in part, what makes the system so powerful.

 

3.1.2.3 Increment Descriptors

Increment descriptors are those descriptors in Classes 04 – 07 with the word "increment" in the element name.  As an example, consider Class 04 of Table B:


 

Class 04 - Location (time)

TABLE REFERENCE                                                         BUFR                                    CREX
F X Y     ELEMENT NAME                                                  UNIT     SCALE  REFERENCE  DATA WIDTH   UNIT     SCALE  DATA WIDTH
                                                                                        VALUE      (Bits)                       (Characters)
0 04 001  Year                                                          Year     0      0          12           Year     0      4
0 04 002  Month                                                         Month    0      0          4            Month    0      2
0 04 003  Day                                                           Day      0      0          6            Day      0      2
0 04 004  Hour                                                          Hour     0      0          5            Hour     0      2
0 04 005  Minute                                                        Minute   0      0          6            Minute   0      2
0 04 006  Second                                                        Second   0      0          6            Second   0      2
0 04 011  Time increment                                                Year     0      -1024      11           Year     0      4
0 04 012  Time increment                                                Month    0      -1024      11           Month    0      4
0 04 013  Time increment                                                Day      0      -1024      11           Day      0      4
0 04 014  Time increment                                                Hour     0      -1024      11           Hour     0      4
0 04 015  Time increment                                                Minute   0      -2048      12           Minute   0      4
0 04 016  Time increment                                                Second   0      -4096      13           Second   0      4
0 04 017  Reference time period for accumulated or extreme data         Minute   0      -1440      12           Minute   0      4
0 04 021  Time period or displacement                                   Year     0      -1024      11           Year     0      4
0 04 022  Time period or displacement                                   Month    0      -1024      11           Month    0      4
0 04 023  Time period or displacement                                   Day      0      -1024      11           Day      0      4
0 04 024  Time period or displacement                                   Hour     0      -2048      12           Hour     0      4
0 04 025  Time period or displacement                                   Minute   0      -2048      12           Minute   0      4
0 04 026  Time period or displacement                                   Second   0      -4096      13           Second   0      4
0 04 031  Duration of time relating to following value                  Hour     0      0          8            Hour     0      3
0 04 032  Duration of time relating to following value                  Minute   0      0          6            Minute   0      2
0 04 041  Time difference, UTC - LMT (see Note 6)                       Minute   0      -1440      12           Minute   0      4
0 04 043  Day of the year                                               Day      0      0          9            Day      0      3
0 04 053  Number of days with precipitation equal to or more than 1 mm  Numeric  0      0          6            Numeric  0      2
0 04 065  Short time increment                                          Minute   0      -128       8            Minute   0      2
0 04 073  Short time period or displacement                             Day      0      -128       8            Day      0      2