
NOAA National Operational Model Archive & Distribution System

General Data Structure

Table of Contents

Organization of Data Directories
Organization of Data Files
Use of Templates in GrADS Data Descriptor Files
Dataset Templates

Organization of Data Directories

Our repository contains a directory for each model system (i.e., ETA, GFS, and RUC). Each model directory contains one directory for each month, and each month directory contains one directory for each day in that month. The directory for each day holds the GRIB files for all model runs initialized during that day. A parallel structure holds the table of contents (i.e., toc) listing of each GRIB file. A sample of our directory structure is shown below.

noaaport/
|-- merged
|   |-- eta
|   |   `-- 200305
|   |       `-- 20030505
|   |-- eta_toc
|   |   `-- 200305
|   |       `-- 20030505
|   |-- gfs
|   |   `-- 200305
|   |       `-- 20030505
|   |-- gfs_toc
|   |   `-- 200305
|   |       `-- 20030505
|   |-- ruc
|   |   `-- 200305
|   |       `-- 20030505
|   `-- ruc_toc
|       `-- 200305
|           `-- 20030505
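
As an illustration of the layout above, the following minimal Python sketch builds the directory path for a given model and day. The function name `day_directory` and the default root `noaaport/merged` are assumptions taken from the sample tree, not part of any NOMADS tool.

```python
from datetime import date
from pathlib import PurePosixPath


def day_directory(model: str, day: date, toc: bool = False,
                  root: str = "noaaport/merged") -> PurePosixPath:
    """Return <root>/<model>[_toc]/<yyyymm>/<yyyymmdd> for one day.

    The root and the helper itself are hypothetical; only the
    model/month/day nesting comes from the sample tree.
    """
    model_dir = f"{model}_toc" if toc else model
    return (PurePosixPath(root) / model_dir
            / day.strftime("%Y%m") / day.strftime("%Y%m%d"))


# GRIB files and their table-of-contents listings live in parallel trees.
print(day_directory("eta", date(2003, 5, 5)))
print(day_directory("eta", date(2003, 5, 5), toc=True))
```

The `toc` flag only switches between the data tree and its parallel table-of-contents tree; everything below the model directory is identical.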


Organization of Data Files

Our model data repository contains data from different climate and weather models. It has recently been expanded to place station datasets under the distributed data access framework. The data from the climate models are relatively static, while the data from the weather models are collected in near real time. The weather models are run at different frequencies and at different spatial resolutions and domains. The frequency of a model is the number of times (cycles) the model is run per day. A model is typically run at 4 cycles per day (i.e., 00Z, 06Z, 12Z, and 18Z). The resolution and domain for a model run are determined by its grid number. A model run is a single execution of a specific model on a specific grid, initialized at a specific time and run for a set number of forecast hours. The same model may be run on different grids. Our weather model data are cataloged by specific combinations of model and grid.

Output from a model run contains data from the analysis field (i.e., the initialization field) and the forecast data at the end of each time step (i.e., forecast hour). Our repository for weather model data contains a separate file for each analysis field and each forecast hour. Thus each file contains either the gridded analysis field or the data for one forecast hour from a single model run. The files contain gridded data, primarily in GRIB format. A model run with 20 steps will therefore produce 21 GRIB files: one analysis (hour 0) plus 20 forecast hours.

Files are named to identify the model run where the data originated. The filenames are divided into 5 sections, delimited by underscores. On certain occasions, the 4th and 5th sections may be excluded (such as a climate reanalysis with monthly time steps). The general pattern is:

<model>_<grid>_<yyyymmdd>_<cycle>_<hour>

Where hours and days have no meaning (e.g., daily and monthly datasets), the filenames take the following default values:

<model>_<999>_<00000101>_<0000>_<000>

  1. model - name of the model (e.g., early-eta)
  2. grid - grid number; 1-3 digits (e.g., 212). This number will serve other purposes when non-gridded data is involved.
  3. yyyymmdd - initialization year, month, day; 8 digits (e.g., 20030501)
  4. cycle - initialization time in hhmm; 4 digits (e.g., 0000)
  5. hour - forecast hour relative to initialization; 3 digits (e.g., 000)
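
The five sections can be built and split mechanically. The sketch below is one possible way to do so in Python; the names `RunFile`, `parse_filename`, and `build_filename` are hypothetical, and the pattern assumes that model names never contain underscores (true of the examples in this document, but not confirmed in general).

```python
import re
from typing import NamedTuple, Optional


class RunFile(NamedTuple):
    model: str
    grid: str                # 1-3 digits
    init_date: str           # yyyymmdd
    cycle: Optional[str]     # hhmm; may be absent
    hour: Optional[str]      # 3-digit forecast hour; may be absent
    ext: Optional[str]       # extension without the dot, if any


# Sections 4 and 5 are optional, as is the extension.
_NAME = re.compile(
    r"^(?P<model>[^_]+)_(?P<grid>\d{1,3})_(?P<init_date>\d{8})"
    r"(?:_(?P<cycle>\d{4})(?:_(?P<hour>\d{3}))?)?"
    r"(?:\.(?P<ext>[A-Za-z0-9]+))?$"
)


def parse_filename(name: str) -> RunFile:
    """Split a file name into its sections; raise ValueError if it does not fit."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized data file name: {name!r}")
    return RunFile(**match.groupdict())


def build_filename(model, grid, init_date, cycle="0000", hour="000", ext="grb"):
    """Assemble a file name from its five sections plus an extension."""
    return f"{model}_{grid}_{init_date}_{cycle}_{hour}.{ext}"


parsed = parse_filename("meso-eta_218_20030501_1800_045.grb")
print(parsed.model, parsed.grid, parsed.cycle, parsed.hour)
print(build_filename("meso-eta", 218, "20030501", "1800", "045"))
```

Names that omit the cycle and hour sections parse with those fields set to `None`.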

The following filename extensions, listed with their purpose, may be encountered while browsing the NOMADS data structure. Note that not every file type exists for every dataset on our system:
  • .grb - GRIB data
  • .unf - GrADS unformatted binary data file
  • .idx - index file produced by GrADS gribmap utility
  • .map - map file produced by GrADS stnmap utility
  • .ctl - GrADS data descriptor file
  • .toc - Basic table of contents dump produced by WGRIB. A .toc file exists for each individual file, as well as a merged file for each cycle (fff.toc)
  • .inv - WGRIB GRIB inventory data; contains the information that allows partial HTTP file transfers to work. Similar to, but more detailed than, .toc files

Examples:

meso-eta_218_20030501_1800_045.grb
meso-eta_218_20030501_1800_000.unf
meso-eta_218_20030501_1800_045.toc
meso-eta_218_20030501_1800_045.inv
meso-eta_218_20030501_1800_fff.ctl
meso-eta_218_20030501_1800_fff.idx
meso-eta_218_20030501_1800_fff.map
meso-eta_218_20030501_1800_fff.toc
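
When scanning a directory listing, the extension list above can be applied as a simple lookup. The sketch below is illustrative only; `FILE_TYPES` and `describe` are hypothetical names, and the descriptions are paraphrased from the list in this section.

```python
from pathlib import PurePosixPath

# Extension -> purpose, paraphrased from the list in this section.
FILE_TYPES = {
    ".grb": "GRIB data",
    ".unf": "GrADS unformatted binary data file",
    ".idx": "index file produced by the GrADS gribmap utility",
    ".map": "map file produced by the GrADS stnmap utility",
    ".ctl": "GrADS data descriptor file",
    ".toc": "table of contents dump produced by WGRIB",
    ".inv": "WGRIB GRIB inventory",
}


def describe(filename: str) -> str:
    """Describe a file by its extension and by whether it covers a whole run."""
    path = PurePosixPath(filename)
    kind = FILE_TYPES.get(path.suffix, "unknown file type")
    # "fff" in the forecast hour field marks a file that spans all hours.
    scope = "whole run" if path.stem.endswith("_fff") else "single file"
    return f"{kind} ({scope})"


for name in ("meso-eta_218_20030501_1800_045.grb",
             "meso-eta_218_20030501_1800_fff.ctl"):
    print(name, "->", describe(name))
```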

Use of Templates in GrADS Data Descriptor Files

Our repository has a single data descriptor file (i.e., .ctl) for each model run. We use the GrADS templates feature to aggregate all the forecast hour files for the same run. Since these files represent multiple forecast hours, we identify these files by using "fff" in the forecast hour field in the file name.

Examples:

meso-eta_218_20030501_1800_fff.ctl <-- GrADS descriptor file for run
meso-eta_218_20030501_1800_fff.idx <-- index file for run
meso-eta_218_20030501_1800_fff.toc <-- table of contents for run
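
To make the relationship concrete, the sketch below lists the per-hour GRIB files of one run alongside the "fff" files that describe the run as a whole. The helper names are hypothetical, and the 3-hourly output interval out to 60 hours is an assumption chosen only to illustrate a run with 20 forecast steps; actual intervals vary by model.

```python
from typing import Iterable, List


def run_basename(model, grid, init_date, cycle, hour) -> str:
    """Join the five filename sections with underscores."""
    return f"{model}_{grid}_{init_date}_{cycle}_{hour}"


def per_hour_grib_files(model, grid, init_date, cycle,
                        hours: Iterable[int]) -> List[str]:
    """One GRIB file per forecast hour, with the hour zero-padded to 3 digits."""
    return [run_basename(model, grid, init_date, cycle, f"{h:03d}") + ".grb"
            for h in hours]


def aggregate_files(model, grid, init_date, cycle) -> List[str]:
    """Per-run files that use "fff" in the forecast hour field."""
    base = run_basename(model, grid, init_date, cycle, "fff")
    return [f"{base}.{ext}" for ext in ("ctl", "idx", "toc")]


# Assumed schedule: analysis at hour 0 plus 20 forecast steps, 3 hours apart.
hours = range(0, 61, 3)
grib = per_hour_grib_files("meso-eta", 218, "20030501", "1800", hours)
print(len(grib))            # 21
print(grib[0], grib[-1])
print(aggregate_files("meso-eta", 218, "20030501", "1800"))
```

A client that opens the single .ctl file sees all of the per-hour files as one dataset, which is why only one descriptor is needed per run.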


Dataset Templates

For select datasets, GrADS control file templates have been built across the entire date range for only a few variables, forecast hours, and vertical levels. These were recently introduced to provide a solution to the problem of having to write a script to loop across hundreds or even thousands of cycle-organized files just to create a time series for one variable and level.

These templates (or subsets) are best utilized by the GrADS Data Server and Live Access Server. They can be found either in the top level of a dataset listing or in a subfolder named 'subsets'. For example, the top-level directory for the North American Regional Reanalysis:

http://nomads.ncdc.noaa.gov:9091/dods/NCEP_NARR_DAILY

contains the listing: narr-a_221_hgtprs.subset

Remotely opening this file from an OPeNDAP-enabled client such as gradsdods gives you access to the geopotential height field at any level for the entire reanalysis.

We have created many subsets for the Global Forecast System, so we have gathered them all into a subsets folder:

http://nomads.ncdc.noaa.gov:9091/dods/NCEP_GFS/subsets

The general naming scheme for subset templates is as follows. Square brackets denote portions that may be excluded:

<model>_<grid>_<Description>[_<fh1>][_<fh2>][.subset]

<model>_<grid>   Name and grid number of the dataset
<Description>    One-word description of the subset
<fh1>            Forecast hour, or start of forecast hour range.
                 Assumed to be 000 if not present.
<fh2>            End of forecast hour range.
                 Assumed to be not relevant if not present.
[.subset]        Filename extension that identifies subset templates.
                 Not required inside the /subsets subdirectory.
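
The naming scheme above can be unpacked with a small parser. The following Python sketch is one possible reading of the scheme: `SubsetName` and `parse_subset_name` are hypothetical names, and the pattern assumes that neither the model name nor the one-word description contains an underscore. The second name used below, `gfs_3_tmp_006_048`, is invented purely for illustration and is not a real listing.

```python
import re
from typing import NamedTuple, Optional


class SubsetName(NamedTuple):
    model: str
    grid: str
    description: str
    fh1: str                 # defaults to "000" when absent
    fh2: Optional[str]       # None when no forecast hour range applies


# [_<fh1>], [_<fh2>] and [.subset] are all optional.
_SUBSET = re.compile(
    r"^(?P<model>[^_]+)_(?P<grid>\d{1,3})_(?P<description>[^_.]+)"
    r"(?:_(?P<fh1>\d+))?(?:_(?P<fh2>\d+))?(?:\.subset)?$"
)


def parse_subset_name(name: str) -> SubsetName:
    """Split a subset template name into its parts, applying the defaults."""
    m = _SUBSET.match(name)
    if m is None:
        raise ValueError(f"not a recognized subset name: {name!r}")
    return SubsetName(m["model"], m["grid"], m["description"],
                      m["fh1"] or "000", m["fh2"])


print(parse_subset_name("narr-a_221_hgtprs.subset"))
print(parse_subset_name("gfs_3_tmp_006_048"))  # hypothetical name
```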