Guide to WMO Table Driven Code Forms:

 

 

 

 

FM 94 BUFR

 

and

 

FM 95 CREX

 

 

 

Layer 1:   Basic Aspects of BUFR and CREX

and

Layer 2:  Layout, Functionality and Application of BUFR and CREX

 

 

 

 

 

 

 

 

 

 

Geneva, 1 January 2002

 


Preface

 

This guide has been prepared to assist experts who wish to use the WMO Table Driven Data Representation Forms BUFR and CREX.

This guide is designed in three layers to accommodate users who require different levels of understanding.

Layer 1 is a general description designed for those who need to become familiar with the table driven code forms but do not need a detailed understanding.  Layer 2 focuses on the functionality and application of BUFR and CREX, and is intended for those who must use software that encodes and/or decodes BUFR or CREX, but will not actually write the software.

Layer 3 is intended for those who must actually write BUFR or CREX encoding and/or decoding software, although those wishing to study table driven codes in depth will find it equally useful.

The WMO gratefully acknowledges the contributions of the experts who developed this guidance material.  The Guide was prepared by Dr. Clifford H. Dey of the U. S. A. National Centre for Environmental Prediction.  Contributions were also received in particular from Charles Sanders - Australia, Eva Cervena - Czech Republic, Chris Long - U.K., Jeff Ator - USA and Milan Dragosavac, ECMWF.


Contents

 

 

Layer 1:        Basic Aspects of BUFR and CREX

1.1       Overview

1.2       General Description
            1.2.1    Self-description
            1.2.2    Code Structures
            1.2.3    BUFR and CREX Tables
            1.2.4    Features common to BUFR and CREX
            1.2.5    Differences
            1.2.6    CREX Examples

1.3       Updating Procedures
            1.3.1    General Procedures
            1.3.2    Updating the Structures
            1.3.3    Updating the Tables
            1.3.4    Validation of Updates

1.4       Migration Guidance
            1.4.1    Training
            1.4.2    Technical Issues
            1.4.3    Encoding vs. interpretation

Layer 2:        Layout, Functionality and Application of BUFR and CREX

Layer 3:        Detailed Description of the Code Forms

(See separate Volume Layer 3 for programmers of encoder/decoder software)


Layer 1:   Basic Aspects of BUFR and CREX

 

 

1.1       Overview

 

The table driven code forms BUFR (Binary Universal Form for the Representation of meteorological data) and CREX (Character form for the Representation and EXchange of data) offer the great advantages of flexibility and expandability compared with the traditional alphanumeric code forms. These beneficial attributes arise because BUFR and CREX are self-descriptive.  The term "self-descriptive" means that the form and content of the data contained within a BUFR or CREX message are described within the BUFR or CREX message itself.  In addition, BUFR offers condensation, or packing, while the alphanumeric code CREX provides human readability.

 

BUFR was first approved for operational use in 1988.  Since that time, it has been used for satellite, aircraft, wind profiler, and tropical cyclone observations, as well as for archiving of all types of observational data.  In 1994, CREX was approved as an experimental code form by the WMO Commission for Basic Systems (CBS-Ext. 94).  In 1998, CBS (CBS-Ext. 98) recommended that CREX be approved as an operational data representation code form as from 3 May 2000.  In 1999, this recommendation was endorsed by the WMO Executive Council (EC-LI (1999)).  CREX is already used among centres for exchange of ozone, radiological, hydrological, tide gauge, tropical cyclone, and soil temperature data.  BUFR should always be the first choice for the international exchange of observational data.  CREX should be used only when BUFR cannot.  BUFR and CREX are the only code forms the WMO needs for the representation and exchange of observational data and are recommended for all present and future WMO applications.

 

This guide to Table Driven Code Forms is designed in three layers to accommodate users who require different levels of understanding.  Layer 1 is a general description designed for those who need to become familiar with the table driven code forms but do not need a detailed understanding.  Layer 2 focuses on the functionality and application of BUFR and CREX, and is intended for those who must use software that encodes and/or decodes BUFR or CREX, but will not actually write the software.  Layer 3 is intended for those who must actually write BUFR or CREX encoding and/or decoding software, although those wishing to study table driven codes in depth will find it equally useful.

 

 

1.2       General Description

 

1.2.1    Self-description

 

How do we know what the following character string means in an alphanumeric code?:

 

32325  11027  ?

 

First, we need to know the code form within which this character string falls.  We assume it comes from a bulletin of synoptic observation reports, thus the code form is FM 12 SYNOP.  Second, we need to know the position within the SYNOP code form of the two groups above (the second and third mandatory groups in Section 1).  Third, we need to refer to the WMO Manual on Codes, Volume I.1 (International Codes), Part A (Alphanumeric Codes) for the description of these two groups in the SYNOP code form (unless we have committed the SYNOP code form to memory).  Upon doing this, we find the two groups above have the following symbolic form:

 

Nddff  1snTTT  ,

 

where N = total cloud cover, dd = wind direction, ff = wind speed, 1 is a group indicator, and TTT = air temperature, where the sign of TTT is given by sn.  However, only after looking further at the code book to find the full meanings and coding conventions of this symbolic form can we determine that the sky is 3/8 covered with clouds, the wind is blowing from 230 degrees at 25 knots, and the air temperature is -2.7 °C.  Thus, the data contained within traditional alphanumeric code forms are defined by the position within the report and by the coding convention (in this example, the symbolic form Nddff 1snTTT) assigned to that position.  Furthermore, if a new group of information were to be inserted before the second and third mandatory groups in Section 1, the positions of these two groups would change.  Such a modification would require a corresponding update to all software programs that encode or decode such reports; otherwise the software would either give incorrect values or fail completely.  The reason is that the coding conventions used to describe the data are built into the processing software, not included with the data.  It is this fact that renders the traditional alphanumeric code forms incapable of accommodating new types of data.
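
The dependence on position can be illustrated with a minimal Python sketch.  The function name and dictionary keys are illustrative assumptions, not part of any WMO software; the sketch handles only the two groups discussed above and leaves the wind speed in whatever unit the report uses (knots in this example).

```python
def decode_synop_groups(nddff: str, group_1snttt: str) -> dict:
    """Decode the SYNOP groups Nddff and 1snTTT purely by character position."""
    if len(nddff) != 5 or len(group_1snttt) != 5 or group_1snttt[0] != "1":
        raise ValueError("unexpected group layout")
    # sn: 0 = positive or zero, 1 = negative
    sign = -1 if group_1snttt[1] == "1" else 1
    return {
        "cloud_cover_oktas": int(nddff[0]),                       # N
        "wind_direction_deg": int(nddff[1:3]) * 10,               # dd, tens of degrees
        "wind_speed": int(nddff[3:5]),                            # ff
        "air_temperature_c": sign * int(group_1snttt[2:5]) / 10,  # TTT, tenths of a degree
    }


print(decode_synop_groups("32325", "11027"))
```

Every slice index in this sketch is a coding convention fixed in the software.  Inserting a new group ahead of these two would silently shift the meaning of each position, which is exactly the weakness described above.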

 

In a table driven code form, there are also position rules, but they apply only to the shape of the "container" (or code structure) rather than to the content of the "container".  The presence and form of the data are described within the "container" itself.  This is the concept of self-description.  In order to accomplish it, there is a section (the Data Description Section) in BUFR and CREX messages in which the type and form of the data contained within the message are defined.  Here is an example of a simple self-described message:

 

Data Description:

     Position   Element Reference   Parameter Name    Unit        Data Width
                Number                                            (characters)
     1          B 01 001            Block number      Numeric     2
     2          B 01 002            Station number    Numeric     3
     3          B 04 004            Hour              Hour        2
     4          B 12 001            Temperature       Tenth °C    3
     5          B 11 002            Wind speed        m/s         3
     6          B 11 003            Wind direction    Degree      3

Data:

07 444 06 154 003 230

 

We can see here that the station is 07444, the hour is 06, the temperature is 15.4 °C, the wind speed is 3 m/s and the wind direction is 230 degrees.  The first section of the message contains the data description, which is itself very long relative to the data values.  To make this more efficient, standards (unit, data width, scale, etc.) for coding the values are defined for various physical parameters and kept in the WMO Code Tables.  Thus, instead of writing all the detailed definitions within the message, one just writes a number (called the Element Reference Number in this example) identifying the parameter and its description.  In that case the message would be:

 

Data Description: 001001 001002 004004 012001 011002 011003

Data: 07 444 06 154 003 230

 

In WMO table driven codes, the Data Description Section contains a sequence of data descriptors, which act like a set of "pointers" towards elements in predefined and internationally agreed tables (stored in the official WMO Manual on Codes).  By definition these descriptors are six-digit reference numbers (or six characters for CREX); they are defined in the code tables that are explained further in section 1.2.3 below.  Once the Data Description Section is read, the following section containing the data itself (the Data Section) can be understood.  Indeed, the characteristics of the parameters to be transmitted must already be defined in the tables of the WMO Manual before data containing those parameters can be exchanged in BUFR or CREX messages.
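
The "pointer" idea can be sketched in a few lines of Python using the six elements of the simple example above.  The dictionary below is a miniature stand-in for the real tables, and the names MINI_TABLE_B and decode are illustrative assumptions.

```python
# Miniature stand-in for the agreed tables, copied from the simple example:
# descriptor -> (parameter name, unit, data width in characters)
MINI_TABLE_B = {
    "001001": ("Block number", "Numeric", 2),
    "001002": ("Station number", "Numeric", 3),
    "004004": ("Hour", "Hour", 2),
    "012001": ("Temperature", "Tenth degC", 3),
    "011002": ("Wind speed", "m/s", 3),
    "011003": ("Wind direction", "Degree", 3),
}


def decode(descriptors, data):
    """Use the descriptor list to slice and label the character data."""
    digits = data.replace(" ", "")
    decoded, pos = [], 0
    for descriptor in descriptors:
        name, unit, width = MINI_TABLE_B[descriptor]
        decoded.append((name, int(digits[pos:pos + width]), unit))
        pos += width
    if pos != len(digits):
        raise ValueError("data length does not match the data description")
    return decoded


description = ["001001", "001002", "004004", "012001", "011002", "011003"]
for name, value, unit in decode(description, "07 444 06 154 003 230"):
    print(f"{name}: {value} ({unit})")
```

Nothing in the decode function is specific to these six parameters.  Supporting a new parameter means adding a table entry, not changing the program.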

 

 

1.2.2    Code Structures

 

The structures of the BUFR and CREX code forms are the following:

 

BUFR

SECTION 0  Indicator Section
SECTION 1  Identification Section
SECTION 2  (Optional Section)
SECTION 3  Data Description Section
SECTION 4  Data Section
SECTION 5  End Section

CREX

SECTION 0  Indicator Section
SECTION 1  Data Description Section
SECTION 2  Data Section
SECTION 3  (Optional Section)
SECTION 4  End Section

 

The Indicator Sections and the BUFR Identification Section are short sections, which identify the message.  The list of descriptors, pointing towards elements in predefined and internationally agreed tables that are stored in the official WMO Manual on Codes (described previously), are contained in the Data Description Section.  These descriptors describe the type of data contained in the Data Section and the order in which the data appear there.  The Optional Section can be used to transmit any information or parameters for national purpose.  The End Section contains the four alphanumeric characters "7777" to denote the end of the BUFR or CREX message.

 

Since the data in a CREX message are laid out one after the other, and since the data values of the parameters in a CREX message are transmitted as a set of characters, it is very simple to read a CREX message.  While the order of the data contained in a BUFR message is likewise described by the BUFR Data Description Section, the data values of the parameters in a BUFR message are encoded as a set of bits.  Consequently, a BUFR message is not human readable; it is extremely difficult to decipher without the help of a computer program.  CREX can be looked upon as the image in characters of BUFR bit fields.

 

When there is a requirement for transmission of new parameters or new data types, new elements are simply added to the WMO BUFR and CREX tables, after approval by the CBS.  Since table driven code forms can thus describe any new parameter by the simple addition of a new entry to the appropriate code table, table driven code forms possess the flexibility to transmit an infinite variety of information.  Therefore, definition of new "code forms" is no longer necessary.  Furthermore, procedures and regulations are fixed.  A new edition number is assigned every time the BUFR or CREX code structure is changed.  Although these edition changes require an update to BUFR or CREX encoding or decoding software, such changes are infrequent (the BUFR Edition Number has changed only twice since 1988 – see Section 1.3).  Likewise, a new version number is assigned every time additions are made to BUFR or CREX code tables.  Although version number changes are more frequent than edition number changes, they do not require modifications to the processing software.  The edition number of the format (structure of the message) and version number of the tables are transmitted in the message itself (in the Indicator and Identification sections for BUFR, in the Data Description section for CREX) and enable the treatment of old archived data.

 

1.2.3    BUFR and CREX Tables

 

Tables define how the parameters (or elements) shall be coded as data items in a BUFR or CREX message (i.e. units, size, scale).  They are recorded in the WMO Manual on Codes, Volume I.2 (International Codes), Parts B (Binary Codes) and C (Common Features to Binary and Alphanumeric Codes).  The Manual on Codes also comprises Volume I.1 (International Codes), Part A (Alphanumeric Codes) and Volume II: Regional Codes and National Coding Practices.  These three volumes are collectively referred to as WMO Publication No. 306.  The Tables defining BUFR and CREX coding are Tables A, B, C, and D.

 

Table A subdivides data into a number of discrete categories (e.g. Surface data – land, Surface data - sea, Vertical soundings (other than satellite), Vertical soundings (satellite), etc.).  While not technically essential for BUFR or CREX encoding/decoding systems, the data categories in Table A are useful for telecommunications purposes and for storage of data in and retrieval of data from a data base.

 

Table B describes how individual parameters, or elements, are to be encoded and decoded in BUFR and CREX.  For each element, the table lists the reference number (or element descriptor number, which is used in the description section of the code like a "pointer", as explained earlier), the element name, and the information needed to encode or decode the element.  For BUFR, this information consists of the units to be used, the scale and reference values to apply to the element, and the number of bits used to describe the value of the element (the BUFR data width).  For CREX, this information consists of the units to be used, the scale value to apply to the value of the element, and the number of characters used to describe the value of the element (the CREX data width).  Although the same elements are found in both BUFR and CREX Tables B, their units may differ (BUFR units are SI, while CREX units are more user oriented).  For example, the unit used for temperature is Kelvin in BUFR but Celsius in CREX.  The data items transmitted in a report will have their descriptor numbers listed in the Data Description Section.  As an example, extracts of BUFR and CREX Table B for Temperature are given below.

 

Table B is fundamental to encoding and decoding in both BUFR and CREX.

 


Class 12 - Temperature

TABLE REFERENCE   ELEMENT NAME                                                BUFR                                          CREX
F   X    Y                                                                    UNIT   SCALE   REFERENCE   DATA WIDTH     UNIT   SCALE   DATA WIDTH
                                                                                             VALUE       (Bits)                         (Characters)
0   12   001      Temperature/dry-bulb temperature                            K      1       0           12             °C     1       3
0   12   002      Wet-bulb temperature                                        K      1       0           12             °C     1       3
0   12   003      Dew-point temperature                                       K      1       0           12             °C     1       3
0   12   004      Dry-bulb temperature at 2 m                                 K      1       0           12             °C     1       3
0   12   005      Wet-bulb temperature at 2 m                                 K      1       0           12             °C     1       3
0   12   006      Dew-point temperature at 2 m                                K      1       0           12             °C     1       3
0   12   007      Virtual temperature                                         K      1       0           12             °C     1       3
0   12   011      Maximum temperature, at height and over period specified    K      1       0           12             °C     1       3
0   12   012      Minimum temperature, at height and over period specified    K      1       0           12             °C     1       3


 

Note:  To encode a value in BUFR, the data (in the units specified in the UNIT column) must be multiplied by 10 to the power of SCALE, and the REFERENCE VALUE must then be subtracted.  In the example above, temperatures are thus encoded in tenths of a kelvin in BUFR.

            To encode a value in CREX, the data (in the units specified in the UNIT column) must be multiplied by 10 to the power of SCALE.  In the example above, temperatures are thus encoded in tenths of a degree Celsius in CREX.
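
The two rules in the note can be written out as a short Python sketch.  The function names are illustrative assumptions.  The sketch only checks that a value fits in the data width, does not handle special values such as "missing", and takes the form of the CREX character field (leading zeros, minus sign in front) from the CREX examples in section 1.2.6.

```python
def encode_bufr(value, scale, reference, width_bits):
    """Return the non-negative integer stored in a BUFR bit field."""
    encoded = round(value * 10 ** scale) - reference
    if not 0 <= encoded < 2 ** width_bits:
        raise ValueError("value does not fit in the data width")
    return encoded


def decode_bufr(encoded, scale, reference):
    """Invert encode_bufr: add the reference value back, then undo the scaling."""
    return (encoded + reference) / 10 ** scale


def encode_crex(value, scale, width_chars):
    """Return the character field written into a CREX Data Section."""
    scaled = round(value * 10 ** scale)
    text = f"{abs(scaled):0{width_chars}d}"
    if len(text) > width_chars:
        raise ValueError("value does not fit in the data width")
    return ("-" if scaled < 0 else "") + text


# Temperature entries from the extract above: BUFR (K, scale 1, reference 0,
# 12 bits) and CREX (degrees Celsius, scale 1, 3 characters).
print(encode_bufr(265.9, scale=1, reference=0, width_bits=12))
print(decode_bufr(2659, scale=1, reference=0))
print(encode_crex(-7.3, scale=1, width_chars=3))
```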


Table C defines a number of operations that can be applied to the elements.  Each such operation is assigned an operator descriptor.  For example, BUFR Table C contains operator descriptors to change the scale value, the reference value, or the data width listed for a parameter in BUFR Table B.  Some of the operations defined in BUFR Table C are quite complex.  Operator descriptors are described in Layer 2 and at length in Layer 3.  Operator descriptors are also available in CREX, although their number and usage are rather limited.

 

Operator descriptors, although not essential for BUFR and CREX encoding and decoding, are useful in minimizing the number of new table entries and including quality assessment information.

 

 

TABLE D defines groups of elements that are always transmitted together (like a regular SYNOP or TEMP report) in what is called a common sequence.  By using a common sequence descriptor, the individual element descriptors will not need to be listed each time in the data description section.  This will reduce the amount of space required for a BUFR or CREX message.  Common sequences are defined in BUFR and CREX Tables D.  An example of BUFR Table D is shown below.

 

Sequence descriptors, although not essential for BUFR and CREX encoding and decoding, are useful in decreasing the space requirements for BUFR and CREX messages.

 

 

Meteorological sequences common to surface data

TABLE REFERENCE   TABLE REFERENCES   ELEMENT NAME
F   X    Y

3   02   001      0   10   004       Pressure (at station level)
                  0   10   051       Pressure reduced to mean sea level
                  0   10   061       3-hour pressure change
                  0   10   063       Characteristic of pressure tendency

                                     (High altitude station)
3   02   002      0   10   004       Pressure (at station level)
                  0   07   004       Pressure level
                  0   10   003       Geopotential of pressure level
                  0   10   061       3-hour pressure change
                  0   10   063       Characteristic of pressure tendency

3   02   003      0   11   011       Wind direction (10 m)
                  0   11   012       Wind speed (10 m)
                  0   12   004       Temperature (2 m)
                  0   12   006       Dew point (2 m)
                  0   13   003       Relative humidity
                  0   20   001       Horizontal visibility
                  0   20   003       Present weather
                  0   20   004       Past weather (1)
                  0   20   005       Past weather (2)

 

1.2.4    Features common to BUFR and CREX

 

Structure:  CREX was intentionally designed to be an alphanumeric version of BUFR.  It is therefore not surprising that the CREX and BUFR code forms have many structural similarities.  Both achieve self-description by including a section within each message describing the form and content of the data included within that message.  Both BUFR and CREX messages begin with an alphanumeric representation of the name of the code form, both have optional sections, and both have identical End Sections.

 

Tables:  Table A is identical for BUFR and CREX.  Furthermore, BUFR and CREX define the same set of elements using nearly identical descriptors: the first value in the descriptor, denoting the descriptor type, is binary in BUFR and alphanumeric in CREX, but the remainder of the descriptor is identical for identical elements.  This made it possible to design a single Table B to serve both code forms.  Finally, although BUFR and CREX Tables D are different, they are closely co-ordinated.  Common sequences that can be transformed easily between BUFR and CREX are not defined in both BUFR and CREX Table D.  If a CREX Table D sequence is not defined in BUFR Table D, it has a number that is not used by any other BUFR sequence.  Similarly, BUFR Table D sequences without CREX counterparts have numbers that are not used by any CREX Table D sequence.  In Tables A, B and D there are ranges of numbers for descriptors outside the internationally agreed range of numbers.  These can be used to define special descriptors for national or local purposes and thus enable the domestic exchange of special national data.

 

Code and Flag Tables:  An element based on a code (e.g., Cloud Type) or a set of conditions defined by flags (bits set to 0 or 1) will have an associated Code Table or Flag Table.  In this case, "Code Table" or "Flag Table" will appear in the Unit column of Table B.  BUFR and CREX Code and Flag Tables are identical (in CREX messages, however, flag values are coded in an octal representation).  An example of a Code Table (0 20 024) and of a Flag Table (0 20 025) is listed below:

 

                 0 20 024

       Intensity of phenomena

Code figure
0               No phenomena
1               Light
2               Moderate
3               Heavy
4               Violent
5-6             Reserved
7               Missing value


                 0 20 025

            Obscuration

Bit No.
1               Fog
2               Ice fog
3               Steam fog
4-6             Reserved
7               Mist
8               Haze
9               Smoke
10              Volcanic ash
11              Dust
12              Sand
13              Snow
14-20           Reserved
All 21          Missing value
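
The difference between a code figure and a flag field can be sketched in Python as follows.  The sketch assumes that bit No. 1 is the most significant bit of the 21-bit field and that the octal form is simply the same integer written in base 8; both conventions should be checked against the Manual on Codes.  All names are illustrative.

```python
INTENSITY = {0: "No phenomena", 1: "Light", 2: "Moderate", 3: "Heavy",
             4: "Violent", 7: "Missing value"}          # Code Table 0 20 024

OBSCURATION_BITS = {1: "Fog", 2: "Ice fog", 3: "Steam fog", 7: "Mist",
                    8: "Haze", 9: "Smoke", 10: "Volcanic ash", 11: "Dust",
                    12: "Sand", 13: "Snow"}             # Flag Table 0 20 025
WIDTH = 21                                              # "All 21" bits = missing
BIT_BY_NAME = {name: bit for bit, name in OBSCURATION_BITS.items()}


def encode_flags(names, width=WIDTH):
    """Set one bit per named condition (assumed: bit No. 1 is leftmost)."""
    value = 0
    for name in names:
        value |= 1 << (width - BIT_BY_NAME[name])
    return value


def decode_flags(value, width=WIDTH):
    """List every condition whose bit is set; all bits set means missing."""
    if value == 2 ** width - 1:
        return ["Missing value"]
    return [name for bit, name in sorted(OBSCURATION_BITS.items())
            if value & (1 << (width - bit))]


flags = encode_flags(["Fog", "Haze"])
print(INTENSITY[2])                       # a code figure names exactly one entry
print(format(flags, "07o"))               # the same flags written in octal
print(decode_flags(int("4020000", 8)))    # several conditions can be set at once
```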

 

 


Figure 1

 


Decoding Process:  BUFR and CREX decoding software needs to keep the Tables in memory.  The decoding process is depicted in Figure 1 above and summarised below:

 

•    The decoder identifies the successive descriptors in the Data Description Section.  If a descriptor is an element descriptor, the decoder looks up the characteristics of the element (units, scale, reference value, data width) in Table B.  If a descriptor is a sequence descriptor, the decoder looks up the sequence in Table D.  If the sequence in Table D contains only element descriptors, the decoder looks up the characteristics of the elements in Table B and proceeds to the next descriptor in the Data Description Section.  However, if the sequence in Table D contains other sequence descriptors, the decoder looks these up in Table D, repeating this process until only element descriptors remain.  The decoder then looks up the characteristics of these elements in Table B and proceeds to the next descriptor in the Data Description Section.  Once the decoder has found the characteristics of all the elements referred to in the Data Description Section, it can decode the values from the Data Section.

 

•    If, in Table B, the unit column of the element descriptor contains "Code table" or "Flag table", the interpreter of the decoded data will have to examine the corresponding code table or flag table to understand the meaning of the coded value.  The interpreter could be a human or, in some cases, an automatic process that acts depending on the value of the code figure or the flags.
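
The repeated Table D look-up described in the first step can be sketched as a small recursive function in Python.  The two real sequences are copied from the Table D extract in section 1.2.3; the nested sequence 399001 and the function name expand are hypothetical and exist only for this illustration.  Descriptors are written as six digits, with the first digit 0 for an element and 3 for a sequence.

```python
TABLE_D = {
    "302001": ["010004", "010051", "010061", "010063"],
    "302003": ["011011", "011012", "012004", "012006", "013003",
               "020001", "020003", "020004", "020005"],
    "399001": ["302001", "302003"],   # hypothetical sequence of sequences
}


def expand(descriptors, table_d=TABLE_D):
    """Replace sequence descriptors by their contents until only elements remain."""
    elements = []
    for descriptor in descriptors:
        if descriptor[0] == "3":        # sequence descriptor: look it up in Table D
            elements.extend(expand(table_d[descriptor], table_d))
        elif descriptor[0] == "0":      # element descriptor: keep it for Table B
            elements.append(descriptor)
        else:
            raise ValueError(f"descriptor type not handled in this sketch: {descriptor}")
    return elements


print(expand(["001001", "399001"]))
```

The resulting flat list of element descriptors is what the decoder then resolves in Table B, one entry per value in the Data Section.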

 

Functionality:  The self-descriptive feature of both BUFR and CREX leads to another advantage over traditional alphanumeric code forms: the relative ease of decoding a BUFR or CREX message.  Whereas a large number of specialised and complex programs are needed to decode the plethora of character codes in current use, a single "universal" BUFR or CREX decoder program is capable of decoding any BUFR or CREX message.  It is not a trivial task to write such a BUFR or CREX decoder, but once it is done, it does not need to be changed for any table version change, but rather only upon the next edition change.  Edition changes should be rare, much less frequent than has been the case for the traditional alphanumeric code forms.  The program therefore does not need to be modified with changes in observational requirements; only the tables need to be augmented, a relatively trivial task.  This self-descriptive feature also makes it possible for both BUFR and CREX to easily accommodate new data within existing report types as well as new report types themselves.

 

Another feature BUFR and CREX have is referred to as replication.  Replication is the repeating of a single parameter or a group of parameters some number of times, as in a TEMP or PILOT report with many levels.  The number of times the parameter or group of parameters is to be repeated can either be specified in the Data Description Section, if the number of repetitions is a fixed known number, or in the Data Section, if the number of repetitions is not a fixed known number (it is then called a "delayed replication").
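
The two kinds of replication can be contrasted in a short Python sketch.  The data structure used here (a group of element names paired with a count, or with None for delayed replication) is purely an illustrative assumption and is not the descriptor syntax of BUFR or CREX.

```python
def expand_replication(description, data):
    """Pair element names with data values, repeating replicated groups."""
    values = iter(data)
    decoded = []
    for item in description:
        if isinstance(item, str):          # ordinary, non-replicated element
            decoded.append((item, next(values)))
            continue
        group, count = item                # (element names, fixed count or None)
        if count is None:                  # delayed replication: the count
            count = next(values)           # is read from the data itself
        for _ in range(count):
            for name in group:
                decoded.append((name, next(values)))
    return decoded


levels = ["pressure", "temperature"]
fixed = expand_replication(["station", (levels, 2)], [7444, 1000, 15, 850, 7])
delayed = expand_replication(["station", (levels, None)], [7444, 2, 1000, 15, 850, 7])
print(fixed == delayed)
```

With a fixed count the number of levels is part of the description.  With delayed replication the same description serves soundings of any length, at the cost of one extra value in the data.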

 

 

1.2.5    Differences

 

BUFR offers packing.  Therefore, voluminous data (e.g., satellites, ACARS, wind profilers) will require fewer resources for transmission and storage than CREX.  BUFR also permits the transmission of quality information with the original observational data.  However, BUFR data is not human readable.  Because it is not human readable, BUFR processing assumes the availability of well designed computer programs to process (decode or encode) the messages.

 

CREX is simpler than BUFR and consequently easy to understand, to code and, because it is an alphanumeric code form, to read with only several hours of explanation.  It is therefore particularly useful where computer equipment is not available.  However, CREX does not offer packing, and has much less comprehensive capability for including quality information than BUFR.

 

 

1.2.6    CREX Examples

 

Presentation of an example of a BUFR message is beyond the scope of Layer 1 of this Guide.  It is presented in detail in Layer 3.  However, CREX is simpler than BUFR, and the alphanumeric nature of CREX makes it feasible to present examples of two reports in CREX here. 

 

Surface observation from a fixed land station:  The first example is a surface observation from a fixed land station.  These reports are currently exchanged in FM 12-XI Ext. SYNOP.  The example presents the report in both the SYNOP and the CREX code forms.

 

•    In code form FM 12-XI Ext. SYNOP:

 

AAXX 09091

03075 41480 62413 11073 21105 39962 40001 55019 71562 86800=

 

•    In code form FM 95-XII CREX:

 

CREX++                                                                                  Indicator Section

T000101 A000 D07999++                                                       Description Section

03 075 1 1989 01 09 09 00039 5845 -00308 0030 3000 075 240 0013 -073 -105 09962 10001 05 0019 015 07 02 075 38 20 10++                     Data Section

7777                                                                                        End Section


•    Interpretation of the example:

 

Encoded in SYNOP    Encoded in CREX  Name of the element                   Decoded value                     CREX Data Section

                    CREX             Indicator of a CREX message
                    T000101          CREX Master Table Number 00, Edition 01, Version 01
                    A000             Data type  (000 = Surface data–land)
                    D07999           See note below
II = 03             B 01 001         WMO block number                                                        03
iii = 075           B 01 002         WMO station number                                                      075
iR = 4                               no counterpart needed in CREX
ix = 1              B 02 001         Type of station                       Manned                            1
                    B 04 001         Year (of observation)                                                   1989
                    B 04 002         Month (of observation)                                                  01
                    B 04 003         Day (of observation)                                                    09
                    B 04 004         Hour (of observation)                                                   09
                    B 07 001         Height of station (barometer)         39 m                              00039
                    B 05 002         Latitude (coarse accuracy)            58.45 deg.                        5845
                    B 06 002         Longitude (coarse accuracy)           -3.08 deg.                        -00308
h = 4               B 20 013         Height of base of cloud               300 m                             0030
VV = 80             B 20 001         Horizontal visibility                 30 km                             3000
N = 6               B 20 010         Cloud cover (total)                   6/8 = 75 %                        075
dd = 24             B 11 011         Wind direction at 10 m                240 degrees                       240
ff = 13             B 11 012         Wind speed at 10 m                    13 m/s                            0013
snTTT = 1073        B 12 004         Dry-bulb temperature at 2 m           -7.3 °C                           -073
snTdTdTd = 1105     B 12 006         Dew-point temperature at 2 m          -10.5 °C                          -105
P0P0P0P0 = 9962     B 10 004         Pressure                              996.2 hPa                         09962
PPPP = 0001         B 10 051         Pressure reduced to mean sea level    1000.1 hPa                        10001
a = 5               B 10 063         Characteristic of pressure tendency                                     05
ppp = 019           B 10 061         3-hour pressure change                1.9 hPa                           0019
ww = 15             B 20 003         Present weather                       precipitation in sight            015
W1 = 7              B 20 004         Past weather (1)                      Snow                              07
W2 = 2              B 20 005         Past weather (2)                      more than 1/2 of the sky covered  02
Nh = 6              B 20 051         Amount of low clouds                  6/8 = 75 %                        075
CL = 8              B 20 012         Cloud type (Type of low clouds)       Cu and Sc                         38
CM = 0              B 20 012         Cloud type (Type of middle clouds)    no CM clouds                      20
CH = 0              B 20 012         Cloud type (Type of high clouds)      no CH clouds                      10
                                     End of Data Section                                                     ++
                                     End of the CREX message                                                 7777

 

 


Note:

 

The sequence descriptor D07999 represents the sequence of element descriptors B01001, B01002, B02001, ......, B20012 listed in the table above.  The sequence descriptor D07999 is hypothetical and has been created for the purpose of this example.  Apart from the time identification (Year, Month, Day, Hour) and co-ordinate locations (barometer height, latitude, longitude), the sequence of the elements in the CREX message corresponds to the sequence of the elements in the SYNOP report presented above.  The systematic transmission of geographical co-ordinates, easily performed with the table driven codes, would alleviate the notorious Volume A problems.  There are excessive delays in updating Volume A: the WMO Secretariat sometimes receives the updates that countries should send only after considerable delay, or never.  Additional delays are introduced when GDPS centres have to implement the changes in their own databases.  Transmitting the geographical co-ordinates with the data itself would resolve 98% of the cases of wrong station co-ordinates.  The remaining 2% of the errors are cases where the station itself has been incorrectly located, and these errors would of course remain.
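
The layout of the surface-observation CREX message above can be explored with a short Python sketch.  It assumes the simple form used in this example (sections separated by "++", no Optional Section) and is not a general CREX decoder; the function name and dictionary keys are illustrative.

```python
MESSAGE = (
    "CREX++ T000101 A000 D07999++ "
    "03 075 1 1989 01 09 09 00039 5845 -00308 0030 3000 075 240 0013 -073 -105 "
    "09962 10001 05 0019 015 07 02 075 38 20 10++ 7777"
)


def split_crex(message):
    """Split a simple CREX message into its sections and header groups."""
    sections = [part.strip() for part in message.split("++")]
    if len(sections) != 4 or sections[0] != "CREX" or sections[3] != "7777":
        raise ValueError("not a CREX message of the simple form assumed here")
    table_group, category_group, *descriptors = sections[1].split()
    return {
        "master_table": table_group[1:3],     # T 00 .. ..
        "edition": table_group[3:5],          # T .. 01 ..
        "table_version": table_group[5:7],    # T .. .. 01
        "data_category": category_group[1:],  # A 000 = Surface data, land
        "descriptors": descriptors,           # here the single sequence D07999
        "values": sections[2].split(),        # one character field per element
    }


parsed = split_crex(MESSAGE)
print(parsed["descriptors"], len(parsed["values"]))
# Dry-bulb temperature at 2 m (B 12 004) is the 16th value; CREX scale 1 means tenths.
print(int(parsed["values"][15]) / 10)
```

The 28 character fields line up one-to-one with the 28 element descriptors listed in the interpretation table, which is what allows the values to be labelled once D07999 has been expanded.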


 

Ozone sounding:  The second example is an ozone sounding.  There is no traditional alphanumeric code form of the WMO FM-system for representation of these data.  Therefore, the example contains only the CREX version.  These data were among the first to be exchanged in CREX operationally.

 

•          In code form FM 95-XII CREX:

 

 

KULA01 CWAO 051800

CREX++

T000101 A008 D09040++

71 917 EUREKA               7598 -08593 00010 18 1998 04 29 23 18

 061 019 //// //// 375 0082

 0000 400 10137 030 0000 200 10000 030 0001 002 09687 037

 0002 002 09366 033 0004 002 08831 037 0005 200 08500 036

 0007 002 08013 043 0007 002 07881 047 0008 002 07646 037

 0009 002 07442 042 0011 200 07000 031 0012 002 06849 027

 0013 002 06710 036 0015 002 06291 029 0022 200 05000 028

 0025 002 04557 027 0029 002 04065 024 0029 200 04000 020

 0032 002 03626 025 0038 002 03000 020 0040 002 02890 021

 0040 002 02829 065 0041 002 02726 105 0043 002 02576 118

 0044 200 02500 135 0048 002 02218 165 0049 002 02147 161

 0050 002 02104 171 0051 002 02031 153 0051 002 02010 159

 0051 200 02000 171 0052 002 01941 188 0054 002 01854 198

 0056 002 01744 187 0056 002 01717 194 0057 002 01683 191

 0058 002 01640 161 0058 002 01623 159 0059 002 01585 168

 0059 002 01576 185 0060 002 01545 197 0061 002 01500 202

 0063 002 01414 221 0064 002 01370 220 0065 002 01335 230

 0066 002 01269 219 0067 002 01232 227 0067 002 01226 235

 0068 002 01208 241 0072 002 01055 242 0074 200 01000 236

 0075 002 00960 228 0076 002 00936 192 0077 002 00912 180

 0078 002 00897 187 0078 002 00883 210 0079 002 00868 221

 0079 002 00850 202 0080 002 00841 199 0081 002 00815 208

 0081 002 00807 189 0081 002 00803 171 0082 002 00790 152

 0082 002 00777 157 0083 002 00764 172 0084 002 00741 156

 0084 002 00722 156 0085 002 00715 162 0085 200 00700 188

 0085 200 00700 193 0086 002 00682 203 0088 002 00639 212

 0090 002 00608 206 0091 002 00588 190 0091 002 00582 192

 0092 002 00570 209 0092 002 00557 215 0096 200 00500 197

 0099 002 00437 171 0108 002 00316 139 0110 200 00300 128

 0115 002 00242 108++

7777

 


•          Interpretation of the example:

 

Group          Meaning                                              Value

CREX++
T000101
A008
D09040         B01001 + B01002 + ... + B15003, where
B01001    :    WMO block number                                     71
B01002         WMO station number                                   917
B01075    :    Station or site name                                 Eureka
B05002    :    Latitude                                             7598
B06002         Longitude                                            -08593
B07001         Height of station                                    00010
B08021    :    18 = launch time follows                             18
B04001    :    Year                                                 1998
B04002         Month                                                04
B04003         Day                                                  29
B04004    :    Hours                                                23
B04005         Minutes                                              18
B02011    :    Radiosonde type                                      061
B02143    :    Ozone instrument type                                019
B02142    :    Ozone instrument serial number or identifier         ////
B15004    :    Ozone sounding correction factor                     ////
B15005    :    Ozone p                                              375
R04000    :    Delayed replication factor = number of levels        0082
               The next four descriptors are repeated 82 times
B04015    :    Time increment since launch time (minutes)           0000, 0000, 0001, etc.
B08006    :    Ozone vertical sounding significance                 400, 200, 002, etc.
B07004    :    Pressure                                             10137, 10000, 09687, etc.
B15003    :    Measured ozone partial pressure                      030, 030, 037, etc.
++
7777           End of message

 

 

Note:   The sequence descriptor D09040 represents the sequence of descriptors D01001, B01015, D01204, ..., B15003 listed in the first column.
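
The delayed replication in this example (R04000 followed by the count 0082) can be illustrated with a short Python sketch.  It is only an illustration under stated assumptions, not a CREX decoder: the names OzoneLevel, to_int and parse_levels are hypothetical, the input is limited to the first three levels of the message above, and the values are kept as the raw digits found in the message, with no scaling or unit conversion applied.

```python
from typing import List, NamedTuple, Optional


class OzoneLevel(NamedTuple):
    minutes_since_launch: Optional[int]  # B04015
    significance: str                    # B08006, kept as the raw 3-digit group
    pressure_raw: Optional[int]          # B07004, raw CREX digits, no unit conversion
    ozone_raw: Optional[int]             # B15003, raw CREX digits, no unit conversion


def to_int(group: str) -> Optional[int]:
    """Convert a CREX group to int; a group made only of '/' is missing data."""
    if group and set(group) == {"/"}:
        return None
    return int(group)


def parse_levels(groups: List[str], count: int) -> List[OzoneLevel]:
    """Read `count` repetitions of the four replicated descriptors."""
    needed = 4 * count
    if len(groups) < needed:
        raise ValueError(f"expected {needed} groups, found {len(groups)}")
    levels = []
    for i in range(0, needed, 4):
        minutes, significance, pressure, ozone = groups[i:i + 4]
        levels.append(
            OzoneLevel(to_int(minutes), significance, to_int(pressure), to_int(ozone))
        )
    return levels


# First three levels of the example above (the full message carries 82).
excerpt = "0000 400 10137 030 0000 200 10000 030 0001 002 09687 037"
levels = parse_levels(excerpt.split(), count=3)
for level in levels:
    print(level)
```

In a full decoder the count would be read from the group that follows the fixed part of the message (0082 here) rather than passed in by hand.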

 

1.3       Updating Procedures

 

In Section 1.2.2, it was noted that there are two general categories of changes to BUFR and CREX - changes to the code structures and additions to the supporting tables.   Changes to the code structures require a new Edition Number and corresponding modifications to processing software, while additions to the supporting tables require a new Table Version Number but no software changes.  Consequently, changes to the BUFR and CREX code structures are made very infrequently.  The original BUFR Edition was approved for operational use in 1988.  Changes to the code structure approved for operational use in November 1991 were designated as defining BUFR Edition 2.   Additional changes to permit quality information representation and establish Common Code Tables approved for operational use in November 1995 were designated as defining the current BUFR Edition 3.  Thus, the structure of BUFR has changed only twice since its inception in 1988.

 

Since table additions are not only far less disruptive but also required more often and with greater urgency, they are made with greater frequency (table additions have been made 9 times since 1988).  All changes to BUFR and CREX are documented in the form of supplements to the WMO Manual on Codes.  However, these supplements are issued no more than once a year.

 


1.3.1    General Procedures

 

 All amendments to BUFR and CREX must be proposed in writing to the WMO Secretariat.  The proposal must specify the needs, purposes and requirements and include information on a contact point for technical matters.  An Expert Team on Data Representation and Codes (ET/DRC) under the Commission for Basic Systems (CBS) Open Programme Area Group on Information Systems and Services (OPAG/ISS), supported by the Secretariat, then validates the stated requirements and develops a draft recommendation to respond to the requirements as appropriate.

 

What happens next depends on whether the draft recommendation involves changes to the code structure or additions to the supporting tables.

 

1.3.2    Updating the Structures

 

When the recommended solution developed by the ET/DRC requires changes to the BUFR and CREX code structures, the recommendation must be approved by both the full CBS and the full WMO Executive Council.  However, it must first be endorsed by the Chairperson of OPAG/ISS prior to its consideration by CBS.  This must be done early enough that the draft recommendation can be published as a CBS pre-session document at least three months prior to the CBS Session.   If the full CBS approves the draft recommendation, it is submitted to the full WMO Executive Council (EC) for approval.  If the EC approves the recommendation, the recommendation will be implemented on the first Wednesday following the first of November of the year following the CBS Session.

 

1.3.3   Updating the Tables

 

Table additions can follow the same approval process as changes to the code structures.  However, as noted previously, table additions are not only far less disruptive than code structure changes, they are also required more often and with greater urgency.  Therefore, a special approval process has been developed by the WMO Secretariat to ensure the necessary flexibility is available to respond to urgent requirements of users during intersessional periods (i.e., between Sessions of CBS).  This approval process is referred to as the "Fast Track".  Under this procedure, the recommendation does not need to be approved by the full CBS and the full EC.  Rather, after approval by the Chairperson of the OPAG/ISS, the recommendation need only be approved by the President of the CBS on behalf of CBS, and by the President of the WMO on behalf of the EC.

 

Implementation of amendments approved through the fast track is normally limited to once per year and takes place on the first Wednesday following the first of November.  However, if the Chairpersons of the ET/DRC and OPAG/ISS agree that an exceptional situation exists, a second fast track implementation can be initiated.  In either case, WMO Members must be notified of amendments approved through the fast track early enough to allow a period of at least three months between the receipt of the notification and the date of implementation.
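
The implementation date rule used in Sections 1.3.2 and 1.3.3 can be worked out mechanically.  The following Python sketch is an illustration only; the function names are hypothetical.  It assumes that "following the first of November" means strictly after 1 November (so that when 1 November is itself a Wednesday the date is 8 November), and that "three months" means calendar months.  Neither reading is confirmed by this Guide.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() counts Monday as 0


def implementation_date(year: int) -> date:
    """First Wednesday strictly after 1 November of `year` (assumed reading)."""
    first_nov = date(year, 11, 1)
    days_ahead = (WEDNESDAY - first_nov.weekday()) % 7
    if days_ahead == 0:  # 1 November is itself a Wednesday
        days_ahead = 7
    return first_nov + timedelta(days=days_ahead)


def latest_notification(year: int) -> date:
    """Same day of the month, three calendar months earlier (assumed reading)."""
    target = implementation_date(year)
    # The target always falls on 2..8 November, so the day exists in August.
    return date(target.year, target.month - 3, target.day)


print(implementation_date(2002))
print(latest_notification(2002))
```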

 

1.3.4   Validation of Updates

 

Whether changes to the BUFR or CREX code structures or additions to their supporting tables, all changes must be validated by a procedure required by the CBS.  Under this procedure, proposed changes should be tested by the use of two independently developed encoders and two independently developed decoders, which incorporate the proposed change.  However, where the data originated from a necessarily unique source (e.g., the data stream from an experimental satellite), the successful testing of a single encoder with at least two independent decoders is considered adequate.

 

For those recommendations that are considered by the full CBS for approval, CBS may either approve or not approve but not alter them.

 

 

1.4      Migration Guidance

 

Migration refers to the process of converting from the current use of the traditional alphanumeric code forms along with BUFR and CREX to exclusive use of BUFR and CREX.  This process will take some time and require much effort on the part of many.  However, it is essential if we are to move the WMO community into a position where requirements for new parameters and new types of data can be met easily and efficiently.  Additional benefits of improved observational data quality and reduced training costs are to be expected as well. This section reviews some of the issues that must be addressed for the migration process to succeed.

 

 

1.4.1   Training

 

Representing observational data to be ingested into the WMO Information System in BUFR or CREX is at the heart of the migration process.  Training will be critical for the Members to accomplish this.  The type of training needed will depend on the application.  As mentioned above, the first choice for representing observational data to be ingested into the WMO Information System should be BUFR.  BUFR requires computer equipment and software, and BUFR encoding and decoding software is already available from a number of Members.  Members intending to use BUFR to encode their observational data should begin training their personnel in BUFR immediately.  BUFR training seminars are expected to be organised by the WMO Secretariat and should also take place at the national level.  However, BUFR training can begin immediately by studying this Guide.  Personnel who will be expected to use existing software should at least study Layers 1 and 2 of this Guide.  Personnel who will write BUFR software should read all three Layers.

 

Members who find use of BUFR is not feasible at this time could begin planning to use CREX.   Personnel who will be expected to either encode their observations into CREX or interpret the observations encoded in CREX will need training.  As with BUFR, CREX training seminars are expected to be organised by the WMO Secretariat and should also take place at the national level.  Once again, however, such training can begin immediately by studying this guide.  It is recommended that whether planning to encode or interpret observations in CREX, those parts of all three layers of this Guide related to CREX should be studied.

 

 

1.4.2   Technical Issues

 

Members planning to incorporate BUFR into their operations should review their telecommunications system to ensure they can accommodate binary transmissions.  Furthermore, during the migration process, periods of dual transmission of observations in some combination of the BUFR and CREX or traditional alphanumeric code forms may be required.  This will increase both the volume of data (although probably not dramatically) and the number of messages.  Members should review the capacity of their telecommunications systems in this light.

 

Another key part of the migration will be the development of templates in BUFR and CREX for most of the data types currently being exchanged in the traditional alphanumeric code forms.  Each template will prescribe how the data in each of the traditional alphanumeric code forms to be replaced will be represented in BUFR and CREX.  A hypothetical CREX template was shown in section 1.2.6 of this Layer.  The Expert Team on Data Representation and Codes of the CBS OPAG/ISS is working diligently to develop all the required templates and expects to complete this effort soon.  Layer 3 will describe templates in more detail.  When the Expert Team completes its work, all templates will be made available to the WMO Members.  When they become available, the templates should be studied carefully by all those who will use either BUFR or CREX.

 

 

1.4.3   Encoding Vs. Interpretation

 

Encoding:  Those who will be encoding observational data into BUFR or CREX must learn and adhere to the regulations governing these code forms.  This Guide is not intended to describe or interpret the regulations.  The regulations are found in WMO Publication No. 306, Vol.I.2, Part B.  Since anyone encoding data into BUFR will be invoking a software program, they must also learn the form of the input data required by the software used.

 

Interpretation:  As with encoding, anyone interpreting information that was encoded in BUFR must use a computer.  Therefore, they must understand the form of the output produced by the computer program as well as the rules and regulations.  However, since CREX is human readable, it can be easily understood provided one knows the code form thoroughly. 


Layer 2:   Layout, Functionality and Application of BUFR and CREX

 

 

Table of Contents

 

 

2.1       Code Layouts and Tables

            2.1.1    Sections of a BUFR Message
            2.1.2    Sections of a CREX Message
            2.1.3    BUFR and CREX Descriptors
            2.1.4    BUFR and CREX Tables

2.2       Applications

            2.2.1    BUFR
                        2.2.1.1 Represent New Information
                        2.2.1.2 Facilitate Data Exchange
                        2.2.1.3 Include Quality and Monitoring Information
                        2.2.1.4 Facilitate Data Processing and Storage
                        2.2.1.5 Use in a Data Base

            2.2.2    CREX
                        2.2.2.1 Represent New Information With Readability Requirements
                        2.2.2.2 Include Quality and Monitoring Information
                        2.2.2.3 Facilitate Data Exchange
                        2.2.2.4 Reduce Training Costs

 

 

 


2.1      Code Layouts and Tables

2.1.1   Sections of a BUFR Message

Overview of a BUFR Message.

The term "message" refers to BUFR being used as a data transmission format; however, BUFR can be, and is, used in a number of meteorological data processing centers as an on-line storage format as well as a data archiving format.  For transmission of data, each BUFR message consists of a continuous binary stream comprising six sections.

 

 C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | Section 1 | Section 2 | Section 3 | Section 4 | Section 5

 Section
 Number    Name                        Contents

 0         Indicator Section           "BUFR" (coded according to the CCITT International Alphabet No. 5, which is functionally equivalent to ASCII), length of message, BUFR edition number
 1         Identification Section      Length of section, identification of the message
 2         Optional Section            Length of section and any additional items for local use by data processing centers
 3         Data Description Section    Length of section, number of data subsets, data category flag, data compression flag, and a collection of data descriptors which define the form and content of individual data elements
 4         Data Section                Length of section and binary data
 5         End Section                 "7777" (coded in CCITT International Alphabet No. 5)

 

Each of the sections of a BUFR message is made up of a series of octets.  The term octet, meaning 8 bits, was coined to avoid having to continually qualify byte as an 8-bit byte.  An individual section always consists of an even number of octets, with extra bits added on and set to zero when necessary. Within each section, octets are numbered 1, 2, 3, etc., starting at the beginning of each section. Bit positions within octets are referred to as bit 1 to bit 8, where bit 1 is the most significant, leftmost, or high order bit. An octet with only bit 8 set would have the integer value 1.
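
Because BUFR numbers bits from the most significant end, which is the reverse of the convention used by most programming languages, a small Python sketch may help.  The function names are hypothetical and serve only to illustrate the numbering described above.

```python
def bit_is_set(octet: int, bit: int) -> bool:
    """Test one bit using BUFR numbering: bit 1 is the most significant bit."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    if not 1 <= bit <= 8:
        raise ValueError("bit must be in the range 1..8")
    return bool(octet & (1 << (8 - bit)))


def octet_from_bits(*bits: int) -> int:
    """Build an octet value from the BUFR bit numbers that are set."""
    value = 0
    for bit in bits:
        if not 1 <= bit <= 8:
            raise ValueError("bit must be in the range 1..8")
        value |= 1 << (8 - bit)
    return value


print(octet_from_bits(8))       # only bit 8 set
print(octet_from_bits(1))       # only bit 1 set
print(bit_is_set(0x80, 1))      # is bit 1 set in binary 10000000?
```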

The upper limit to the size of a BUFR message is quite large, determined by the maximum number that can fit within octets 5 – 7 of the Indicator Section (2^24 – 1 or 16777215 octets).  However, by convention BUFR messages are restricted to 15000 octets or 120000 bits.  This limit is set by the capabilities of the Global Telecommunications System (GTS) of the WMO.  The BLOK feature, described elsewhere, can be used to break very long BUFR messages into parts.
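
The two size limits quoted above follow from simple arithmetic, which can be checked with a few lines of Python.  The constant and function names are illustrative only.

```python
# Largest value that fits in a 3-octet (24-bit) unsigned length field.
MAX_LENGTH_BY_FORMAT = 2**24 - 1

# Conventional limit for a message, as quoted in the text.
CONVENTIONAL_LIMIT_OCTETS = 15000


def within_conventional_limit(length_in_octets: int) -> bool:
    """True if a message of this length respects the 15000-octet convention."""
    return 0 < length_in_octets <= CONVENTIONAL_LIMIT_OCTETS


print(MAX_LENGTH_BY_FORMAT)                       # octets
print(CONVENTIONAL_LIMIT_OCTETS * 8)              # bits
print(int.from_bytes(b"\xff\xff\xff", "big"))     # all 24 bits set
```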

 


Section 0 – Indicator Section

 C O N T I N U O U S   B I N A R Y   S T R E A M

 SECTION 0 | Section 1 | Section 2 | Section 3 | Section 4 | Section 5

 Octet No.   Contents

 1 – 4       "BUFR" (coded according to the CCITT International Alphabet No. 5)

             OCTET NO.           1        2        3        4
             BINARY       01000010 01010101 01000110 01010010
             HEXADECIMAL     4   2    5   5    4   6    5   2
             DECODED             B        U        F        R

 5 – 7       Total length of BUFR message, in octets (including Section 0)
 8           BUFR edition number (currently 3)

 

The earlier editions of BUFR did not include the total message length in octets 5-7. Thus, in decoding BUFR Edition 0 and 1 messages, there was no way of determining the entire length of the message without scanning ahead to find the individual lengths of each of the sections.  Edition 2 eliminated this problem by including the total message length right up front.  By design, BUFR Edition 2 contained the BUFR Edition number in octet 8, the same octet position relative to the start of the message as it was in Editions 0 and 1.  Because the relative position is fixed, a decoder program can determine at the outset which BUFR edition was used for a particular message and then behave accordingly. This meant that archives of records in BUFR Editions 0 or 1 did not need to be updated.
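
A decoder's first step, as described above, might look like the following Python sketch.  It is a minimal illustration, not a reference implementation: the function name parse_indicator is hypothetical, and for Editions 0 and 1 it simply reports that no total length is available, since octets 5 – 7 do not carry one in those editions.

```python
from typing import Optional, Tuple


def parse_indicator(message: bytes) -> Tuple[int, Optional[int]]:
    """Return (edition, total length in octets or None) from the start of a message."""
    if len(message) < 8:
        raise ValueError("need at least 8 octets to read the edition number")
    if message[0:4] != b"BUFR":
        raise ValueError("message does not start with 'BUFR'")
    edition = message[7]  # octet 8, at the same position in every edition
    if edition >= 2:
        total_length = int.from_bytes(message[4:7], "big")  # octets 5 - 7
    else:
        total_length = None  # Editions 0 and 1 carry no total length here
    return edition, total_length


# Hypothetical first 8 octets of an Edition 3 message that is 46 octets long.
header = b"BUFR" + (46).to_bytes(3, "big") + bytes([3])
print(parse_indicator(header))
```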

 


Section 1 - Identification Section.

 C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | SECTION 1 | Section 2 | Section 3 | Section 4 | Section 5

 Octet No.   Contents

 1 – 3       Length of section, in octets
 4           BUFR master table number – this provides for BUFR to be used to represent data from other disciplines, and with their own versions of master tables and local tables.  For example, this octet is zero for standard WMO FM 94 BUFR tables, but ten for standard IOC FM 94 BUFR Tables whose use is focused on oceanographic data.
 5 – 6       Originating centre: code table 0 01 033
 7           Update sequence number (zero for original BUFR messages; incremented for updates)
 8           Bit 1 = 0 No optional section
                   = 1 Optional section included
             Bits 2 – 8 set to zero (reserved)
 9           Data Category type (BUFR Table A)
 10          Data Category sub-type (defined by local ADP centres)
 11          Version number of master tables used (currently 9 for WMO FM 94 BUFR tables)
 12          Version number of local tables used to augment the master table in use
 13          Year of century
 14          Month
 15          Day
 16          Hour
 17          Minute

The length of Section 1 can vary between BUFR messages.  Beginning with Octet 18, a data processing center may add any type of information as they choose.  A decoding program need not know what that information may be.  Knowing what the length of the Section is, as indicated in octets 1-3, a decoder program can skip over the information that begins at octet 18 and position itself at the next section, either Section 2, if included, or Section 3.  Bit 1 of octet 8 indicates if Section 2 is included.  If there is no information beginning at octet 18, one octet must still be included (set to 0) in order to have an even number of octets within the section.
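
The octet layout of Section 1 listed above translates directly into code.  The Python sketch below follows that table octet by octet; the function name, the dictionary keys and the sample values are all hypothetical, and the sketch assumes the caller passes a byte string that begins at octet 1 of Section 1.

```python
def parse_section1(section: bytes) -> dict:
    """Decode the fixed part of Section 1 using the octet table above."""
    if len(section) < 18:
        raise ValueError("Section 1 needs at least 18 octets")
    length = int.from_bytes(section[0:3], "big")        # octets 1 - 3
    if length < 18 or length > len(section):
        raise ValueError("inconsistent Section 1 length")
    return {
        "length": length,
        "master_table": section[3],                               # octet 4
        "originating_centre": int.from_bytes(section[4:6], "big"),  # octets 5 - 6
        "update_sequence": section[6],                            # octet 7
        "has_section2": bool(section[7] & 0x80),                  # octet 8, bit 1
        "category": section[8],                                   # octet 9
        "sub_category": section[9],                               # octet 10
        "master_table_version": section[10],                      # octet 11
        "local_table_version": section[11],                       # octet 12
        "year_of_century": section[12],                           # octet 13
        "month": section[13],                                     # octet 14
        "day": section[14],                                       # octet 15
        "hour": section[15],                                      # octet 16
        "minute": section[16],                                    # octet 17
        "local_use": section[17:length],                          # octet 18 onward
    }


# Hypothetical 18-octet Section 1 (values chosen only for illustration).
sample = bytes([0, 0, 18, 0, 0, 7, 0, 0x80, 2, 0, 9, 0, 2, 1, 1, 12, 0, 0])
info = parse_section1(sample)
print(info["length"], info["has_section2"], info["local_use"])
```

A decoder that does not understand the local information can ignore "local_use" and move on by adding "length" to the offset at which Section 1 began.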

It should be pointed out that the date/time in octets 13 – 17 is not currently well defined.  The BUFR manual only states these octets should describe the date/time "Most typical for the BUFR message contents".  While this may be clear for a group of 1200 UTC SYNOP reports, this statement could be interpreted differently by different data producers for other types of observations.

              

Section 2 - Optional Section.

 C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | Section 1 | SECTION 2 | Section 3 | Section 4 | Section 5

 Octet No.   Contents

 1 – 3       Length of section, in octets
 4           Set to zero (reserved)
 5 -         Reserved for use by ADP centers

Section 2 may or may not be included in a BUFR message.  When it is contained within a BUFR message, bit 1 of octet 8, Section 1, is set to 1.  If Section 2 is not included in a message then bit 1 of octet 8, Section 1 is set to 0.  Section 2 may be used for any purpose by an originating center.  The only restrictions on the use of Section 2 are that octets 1 - 3 are set to the length of the Section, octet 4 is set to zero, and the total length of the Section contains an even number of octets.

A typical use of the Optional Section could be in a data base context.  The Section might contain pointers into the Data Section of the message, pointers that indicate the relative location of the start of individual sets of observations (one station's worth, for example) in the data.  There could also be some sort of index term included, such as the WMO block and station number.  This would make it quite easy to find a particular observation quickly and avoid decoding the whole message just to find one or two specific data elements.

 

Section 3 - Data description section.

 C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | Section 1 | Section 2 | SECTION 3 | Section 4 | Section 5

 Octet No.   Contents

 1 – 3       Length of section, in octets
 4           Set to zero (reserved)
 5 – 6       Number of data subsets
 7           Bit 1 = 1    observed data
                   = 0    other data
             Bit 2 = 1    compressed data
                   = 0    non-compressed data
             Bits 3 – 8   set to zero (reserved)
 8 -         A collection of descriptors which define the form and content of individual data elements comprising one data subset in the data section

If octets 5-6 indicate that there is more than one data subset in the message, with the total number of the subsets given in those octets, then multiple sets of observations, all with the same format (as described by the data descriptors) will be found in Section 4. This is, for example, a means of building "collectives" of observations. Doing so realizes much of the potential efficiency of BUFR.

In the flag bits of octet 7, "observed data" is taken to mean just that; "other data" is by custom, if not explicit statement, presumed to be forecast information, or possibly some form of "observation" indirectly derived from "true" observations.  If the data in Section 4 is compressed, bit 2 of octet 7 is set to one.  If the data is not compressed, it is set to zero.  The nature of "data compression" will be described in Layer 3.
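
The fixed part of Section 3 and the list of descriptors that follows it can be read as in the Python sketch below.  The function name and the sample octets are hypothetical.  Each descriptor is returned as a raw 16-bit integer; interpreting it against the BUFR tables is outside the scope of this illustration.

```python
def parse_section3(section: bytes) -> dict:
    """Decode Section 3 from a byte string that starts at its octet 1."""
    if len(section) < 10:
        raise ValueError("Section 3 needs at least 10 octets")
    length = int.from_bytes(section[0:3], "big")          # octets 1 - 3
    if length < 10 or length > len(section):
        raise ValueError("inconsistent Section 3 length")
    flags = section[6]                                     # octet 7
    # Descriptors start at octet 8 and take 2 octets each; with an even
    # section length the final octet is padding and is ignored here.
    count = (length - 7) // 2
    descriptors = [
        int.from_bytes(section[7 + 2 * i: 9 + 2 * i], "big") for i in range(count)
    ]
    return {
        "length": length,
        "subsets": int.from_bytes(section[4:6], "big"),   # octets 5 - 6
        "observed": bool(flags & 0x80),                    # bit 1
        "compressed": bool(flags & 0x40),                  # bit 2
        "descriptors": descriptors,
    }


# Hypothetical 12-octet Section 3: 2 subsets, observed and compressed data,
# two placeholder descriptors and one padding octet.
sample = bytes([0, 0, 12, 0, 0, 2, 0xC0, 0x01, 0x01, 0x01, 0x02, 0])
result = parse_section3(sample)
print(result["subsets"], result["observed"], result["compressed"])
print([f"{d:04x}" for d in result["descriptors"]])
```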

 


Section 4 - Data Section.

 C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | Section 1 | Section 2 | Section 3 | SECTION 4 | Section 5

 Octet No.   Contents

 1 – 3       Length of section, in octets
 4           Set to zero (reserved)
 5 -         Binary data, as defined by the descriptors that begin at octet 8 of Section 3.

 

Section 5 - End Section.

 C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | Section 1 | Section 2 | Section 3 | Section 4 | SECTION 5

 Octet No.   Contents

 1 – 4       "7777" (coded according to the CCITT International Alphabet No. 5)

             OCTET NO.           1        2        3        4
             BINARY       00110111 00110111 00110111 00110111
             HEXADECIMAL     3   7    3   7    3   7    3   7
             DECODED             7        7        7        7

 


Required Entries.

There are required entries in any BUFR message.  The required entries for each section are:

Section 0, octets 1 - 8

Section 1, octets 1 – 18

Section 3, octets 1 – 10

The data descriptors begin in octet 8. A single data descriptor occupies 16 bits, or 2 octets.  Since the Section must contain at least one descriptor and have an even number of octets, there will be a minimum of 10 octets in Section 3. Note that Section 3 will always conclude with 8 bits set to zero since all descriptors are 16 bits in length and the first descriptor begins in octet 8.

Section 4, octets 1 – 6 

Section 4 must have at least 4 octets.  If there is any data, it is in octets 5 and beyond, and since the Section must contain an even number of octets, there must then be at least 2 octets after octet 4.

Section 5, octets 1 – 4

Since there are required entries, there will be a minimum number of bits (368) in any BUFR message.  For each section, the minimum number of bits is:

              C O N T I N U O U S   B I N A R Y   S T R E A M

 Section 0 | Section 1 | Section 2  | Section 3 | Section 4 | Section 5
 64 bits   | 144 bits  | (optional) | 80 bits   | 48 bits   | 32 bits
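
The total of 368 bits follows from the required octets of each section, as this short Python check shows.  The dictionary name is illustrative only.

```python
# Required octets per section, taken from the list of required entries above.
# Section 2 is optional and therefore contributes nothing to the minimum.
REQUIRED_OCTETS = {0: 8, 1: 18, 3: 10, 4: 6, 5: 4}

bits_per_section = {number: 8 * octets for number, octets in REQUIRED_OCTETS.items()}
minimum_octets = sum(REQUIRED_OCTETS.values())
minimum_bits = sum(bits_per_section.values())

for number, bits in bits_per_section.items():
    print(f"Section {number}: {bits} bits")
print(f"Minimum message: {minimum_octets} octets, {minimum_bits} bits")
```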

 

BUFR and Data Management.

Sections 3 and 4 of BUFR contain all of the information necessary for defining and representing data. The remaining sections are defined and included purely as aids to data management. Key information within these sections is available from fixed locations relative to the start of each section. It is thus possible to categorize and classify the main attributes of BUFR data without decoding the data description in Section 3, and the data in Section 4.
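
Because every section begins with its own length, a program can find the boundaries of all sections without interpreting any descriptors or data.  The Python sketch below illustrates this for Edition 2 and 3 messages.  It is an illustration under stated assumptions: the function name locate_sections is hypothetical, and the sample message is a structurally valid 46-octet placeholder whose descriptor and data octets carry no meteorological meaning.

```python
from typing import Dict, Tuple


def locate_sections(message: bytes) -> Dict[int, Tuple[int, int]]:
    """Map section number to (offset, length) for an Edition 2 or 3 message."""
    if len(message) < 8 or message[0:4] != b"BUFR":
        raise ValueError("not a BUFR message")
    if message[7] not in (2, 3):
        raise ValueError("this sketch handles Editions 2 and 3 only")
    total = int.from_bytes(message[4:7], "big")     # Section 0, octets 5 - 7
    if total > len(message) or total < 46:          # 46 octets is the minimum size
        raise ValueError("message is truncated or too short")

    sections = {0: (0, 8)}
    pos = 8

    def take(number: int) -> None:
        """Record one section and advance past it using its own length field."""
        nonlocal pos
        length = int.from_bytes(message[pos:pos + 3], "big")
        if length < 4 or pos + length > total - 4:
            raise ValueError(f"bad length for Section {number}")
        sections[number] = (pos, length)
        pos += length

    has_section2 = bool(message[pos + 7] & 0x80)    # Section 1, octet 8, bit 1
    take(1)
    if has_section2:
        take(2)
    take(3)
    take(4)
    if message[pos:pos + 4] != b"7777" or pos + 4 != total:
        raise ValueError("End Section not found where expected")
    sections[5] = (pos, 4)
    return sections


# Placeholder message made of the minimum required octets of each section.
sec1 = bytes([0, 0, 18, 0, 0, 7, 0, 0, 0, 0, 9, 0, 2, 1, 1, 0, 0, 0])
sec3 = bytes([0, 0, 10, 0, 0, 1, 0x80, 0x01, 0x01, 0])
sec4 = bytes([0, 0, 6, 0, 0, 0])
body = sec1 + sec3 + sec4 + b"7777"
sample = b"BUFR" + (8 + len(body)).to_bytes(3, "big") + bytes([3]) + body

layout = locate_sections(sample)
for number, (offset, length) in layout.items():
    print(f"Section {number}: offset {offset}, length {length}")
```

With the offsets in hand, attributes such as the originating centre or the data category can be read from their fixed positions in Section 1 while Sections 3 and 4 are left undecoded.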