Advanced Data Access Methods
Table of Contents
- Subsetting Parameters and Levels using FTP4u
- Obtain ASCII data using GDS
- Scripting wget in a Time Loop
- Mass GRIB Subsetting: Utilizing Partial-file HTTP Transfers
The NOMADS web site provides a variety of options for accessing
our weather and climate model data. You can access the
on-line data using traditional access methods
(web-based or ftp), or you can use the open and distributed
access methods provided under the collaborative approach called
the NOAA Operational Model Archive and Distribution
System (NOMADS). This page describes
the different functions that you can use to get or
view data. Some of the access methods described in this section
are accessible from the Model
Data Access page. However, not all methods are
available for all datasets.
The following access methods involve the use of scripts
and common programs to automatically pull large amounts of
data from the NOMADS data directory structure. You will need
a working knowledge of the Practical Extraction and Report
Language (PERL) and access to a UNIX computer system.
While it is possible to convert these scripts for use on
machines running Microsoft Windows, the current scripts and guide
have not been designed with this operating system in mind.
Subsetting Parameters and Levels using FTP4u
The NOMADS Web Interface provides a utility which allows users to
select GRIB files from an order and extract the parameters and
vertical levels of interest. Our server then pushes the data to NCDC's anonymous FTP
server, where you can retrieve your subsetted data. You can easily
extract individual parameters across multiple days and
model runs. If the data in your order are in GRIB format (.grb), use
this function to get small subsets of the data rather than
transferring the entire files.
To begin, go to the Model Data
Access page, select the "plot" or "ftp" link for the desired
dataset, build a collection for the desired date range (maximum
possible length of the collection depends on the dataset), then
choose "Select files for FTP" at the bottom of the next page.
If your order requires an offline data request, you will need
to wait until this request is completed.
The FTP4u program allows you to quickly select the entire
set of files, or a particular type of file, by entering
"*" or "*.grb" respectively, into the Grib Filter text box and clicking
the button beside it.
Then use the list of check boxes below to specify the parameters
and levels you want. If you are unsure what the cryptic parameter
names correspond to, see the
Model Data Inventory Listings. Navigate to the model system you have ordered
and use the 'toc' option. You can also subset a spatial
region of interest. Please note that subsetting Lambert Conformal
grids will cause the GrADS control and index files that come with
the order to become invalid. After the above options are set,
enter your E-mail address in the form at the bottom of the page
and click 'Start FTP'. The script will then subset your files and
inform you where and how to retrieve your data. Please be patient and
wait until the entire page fully loads.
Obtain ASCII data using GDS
Introduction / Basics:
A web browser or wget can be used to obtain ASCII data directly
from the NCDC-NOMADS GrADS Data Server (GDS). Both require the user to
construct a DODS URL with a "DODS Constraint", which allows for
both spatial and temporal subsetting.
Example using a NARR-A dataset on GDS:
http://nomads.ncdc.noaa.gov/dods/NCEP_NARR_DAILY/199601/19960101/narr-a_221_19960101_0000_000.info
At the top of this ".info" page, you will find the DODS URL.
To this, the DODS constraint (examples below) must be appended:
The DODS constraint dimensions (t|z|lat|lon) are all grid-relative
index numbers. You need to reference the Latitude, Longitude,
Altitude, and Time lines on the info page to determine the numeric
values you need to use. Translated values are given at the bottom
of the output page. The Time value in this section is the
absolute number of days since year zero. The 'VAR' must be one
of the variables listed on the info page (immediately below the
dimension definition section).
The colon and second value in the brackets may be omitted to hold
that dimension constant. If the variable chosen is defined on only
one level (for example, tmp2m), the [z1:z2] bracket must be omitted.
If the brackets for any axis contain 3 entries separated by 2 colons,
the middle entry (Inc) is the increment, or step, for that axis. For example,
[0:2:10] is equivalent to the sequence (0,2,4,6,8,10).
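As a rough illustration of the increment notation, the Python sketch below expands a bracket entry into the grid indices it selects. The function name expand_range is hypothetical and is not part of GDS. It assumes the [first:increment:last] ordering of the general form and an inclusive last index, as in the [0:2:10] example.

```python
def expand_range(first, last=None, inc=1):
    """Return the grid indices selected by [first], [first:last] or [first:inc:last].

    Assumption: the last index is inclusive, as in the [0:2:10] example.
    """
    if last is None:
        # A single value holds the dimension constant.
        return [first]
    if inc < 1 or last < first:
        raise ValueError("expected inc >= 1 and first <= last")
    return list(range(first, last + 1, inc))


print(expand_range(0, 10, 2))  # [0, 2, 4, 6, 8, 10]
print(expand_range(0, 7, 2))   # [0, 2, 4, 6]
```

Note that when the last index is not reachable by whole steps, as in [0:2:7], the sequence simply stops at the last index that does not exceed it.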
General Form for DODS constraint expressions:
.ascii?VAR[t1:tInc:t2][z1:zInc:z2][Lat1:LatInc:Lat2][Lon1:LonInc:Lon2]
Examples
1) .ascii?tmpprs[0][10][100][340]
2) .ascii?tmpprs[0][6]
3) .ascii?tmp2m[3][130][330]
4) .ascii?ugrdprs[0:7][16][90][320]
5) .ascii?ugrdprs[0:2:7][16][90][320]
6) .ascii?ugrdprs[7][0:28][120][120]
7) .ascii?rh2m[2][70:100][140:160]
The examples below describe the constraint expressions listed above.
NARR-A data from January 1, 1996 is used in these instances. The
selection of samples is designed to demonstrate the versatility
of this access method.
Example 1
- Point - Single 750 millibar Air Temperature Point at Lat/Lon 37.5 °N / 92.5 °W at 00 Z.
Example 2
- 2D Grid Slice - The entire 850 millibar Air Temperature Grid at 00 Z.
Example 3
- Point, for variable with no vertical coordinate - 2 meter Air Temperature for Lat/Lon 48.75 °N / 96.25 °W
at 09 Z.
Example 4
- 3 hour Time Series at Point - 500 mb U-Wind values for 33.75 °N / 100 °W at each 3 hour time step.
Example 5
- 6 hour Time Series at Point - Same as example 4, except every other time step is skipped.
Example 6
- Vertical Sounding at Point - U-Wind component at all vertical levels for Lat/Lon 45.0 °N / 175.0 °W, time 21 Z.
Example 7
- Spatial Region Subset for variable with no vertical coordinate - 2 meter Relative Humidity for a Lat/Lon subregion
(26.25 - 37.5) °N / (167.5 - 160.0) °W, time 06 Z.
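The constraint expressions in the examples can also be assembled by a script. The following is a minimal Python sketch, not an official tool: the helper names are hypothetical, and it only formats the string in the general form given earlier. The caller supplies the axes in the order t, z (if the variable has one), lat, lon.

```python
def _axis(spec):
    """Format one axis: an int, a (first, last) pair, or a (first, inc, last) triple."""
    if isinstance(spec, int):
        return "[%d]" % spec
    if len(spec) not in (2, 3):
        raise ValueError("an axis needs 1, 2 or 3 entries")
    return "[" + ":".join(str(int(v)) for v in spec) + "]"


def dods_constraint(var, *axes):
    """Build '.ascii?VAR[...]...' from grid-relative indices.

    Axes are given in the order t, z, lat, lon; leave z out for
    variables defined on a single level (for example tmp2m).
    """
    return ".ascii?" + var + "".join(_axis(a) for a in axes)


print(dods_constraint("tmpprs", 0, 10, 100, 340))          # Example 1
print(dods_constraint("ugrdprs", (0, 2, 7), 16, 90, 320))  # Example 5
print(dods_constraint("rh2m", 2, (70, 100), (140, 160)))   # Example 7
```

The resulting string is appended to the DODS URL shown at the top of the info page.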
Once you have composed a URL, you can copy it into your
web browser or pass it as an argument to
wget to download the file.
To access data for other datasets in this manner, start at the
Data Access Page, click on the GDS
link for your dataset, then drop down the directory structure until
you find an Info link. Use the metadata on the info page to form
the DODS Constraint.
Determining grid relative x/y values for the dods constraint if you want a specific lat/lon
On the .info page given above, the Longitude line reads:
Sample: -220°E to -0.625°E (586 points, avg. res. 0.375°)
Generalized to: init to final (span points, avg. res. aRes)
So x can range from zero to span-1. To determine the grid-relative points x and y
for a particular location of interest, express its lon and lat
in decimal degrees (convert from degrees/minutes/seconds if needed), then
use the following linear equation, where int() truncates all decimals:
lon = init + aRes*x
(lon-init) = aRes*x
x = int( (lon-Init(Lon))/aRes(Lon) )
y = int( (lat-Init(Lat))/aRes(Lat) )
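The linear equation above can be sketched in Python as follows. The function name grid_index is hypothetical. The numbers in the usage line are the longitude values from the sample line (init -220, 586 points, resolution 0.375); west longitudes are entered as negative degrees east, as on the info page.

```python
def grid_index(value, init, ares, span):
    """Return int((value - init) / aRes), checked against the range 0..span-1.

    value, init and ares are in decimal degrees; span is the number of points.
    """
    offset = (value - init) / ares
    if offset < 0 or int(offset) > span - 1:
        raise ValueError("%s lies outside the grid axis" % value)
    return int(offset)  # int() truncates all decimals


# Longitude 92.5 W is -92.5 E on this axis.
print(grid_index(-92.5, -220.0, 0.375, 586))  # 340
```

The result, 340, is the longitude index used in Example 1 for 92.5 °W. The same function applies to latitude once the init, aRes, and span values are read from the Latitude line of the info page.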
GrADS interpolates all gridded data it knows how to serve with GDS onto a
lat/lon grid, so this linear equation will always work. However, there is a chance of
small map-projection interpolation error whenever you use the GDS with a
dataset which uses a non-rectangular native grid. The Latitude and
Longitude entries on the info page are not representative of
non-rectangular native grids!
Determining grid relative t values used for the dods constraint
The GDS uses an epoch-based time reference for its underlying time axis. This epoch begins at "0000-00-00 00:00:00" and has units of
decimal days.
Mapping a calendar date to the t index can be rather tricky for datasets
with a long time series.
On the .info page, the "Time: " line follows the same
format as lat and lon (see section above). In this case a t value
of zero corresponds exactly to the init time, and t increases linearly in
steps of aRes (in epoch time) up to the final time, which is t = span-1.
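One way to avoid working with epoch days directly is to compute t from calendar dates. The sketch below is only an illustration: the function name time_index is hypothetical, and the init time, 3-hour step, and 8 points in the usage line are assumed values chosen to match the January 1, 1996 NARR-A examples; read the real values from the Time line of your dataset's info page.

```python
from datetime import datetime, timedelta


def time_index(when, init, step, span):
    """Return t = (when - init) / step, valid for 0 <= t <= span - 1."""
    t, remainder = divmod(when - init, step)
    if remainder or not 0 <= t <= span - 1:
        raise ValueError("%s is not on the dataset's time axis" % when)
    return t


# Assumed axis: starts 1996-01-01 00 Z, 3-hour steps, 8 points.
init = datetime(1996, 1, 1, 0)
step = timedelta(hours=3)
print(time_index(datetime(1996, 1, 1, 9), init, step, 8))  # 3
```

With these assumed values, 09 Z maps to t = 3, the time index used in Example 3.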
Scripting wget in a Time Loop
This technique is best used to obtain large numbers of whole
files quickly. The method works with datasets which are accessible
via HTTP or FTP. The process involves setting up a script which
loops through a range of dates, creates a URL from the date
variables, then passes this URL to
wget.
Forming a URL for the NOMADS HTTP server:
-- Sample URL --
http://nomads.ncdc.noaa.gov/data/gfs/200306/20030607/gfs-avn_201_20030607_0000_000.grb
-- URL Generalization (for PERL variables): --
$URL = "http://${SERVER}/data/${MODEL}/${YYYYMM}/${YYYYMMDD}/${MODELNAME}_${GRID_NUM}_${YYYYMMDD}_${CYCLE_hr}00.grb";
-- Wget usage ( system/shell command ) --
wget -O [Local output file] $URL
Organization of Data Directories - Provides a description of the NCDC
NOMADS directory structure.
SERVER varies depending on the dataset. See
Model Data Access and click on the HTTP links
to determine this.
YYYYMM and YYYYMMDD are the variables you will need to loop.
This can be done using various iteration structures
(nested for, nested foreach, while, etc. loops)
in any scripting language.
This guide will not attempt to explain exactly how to script this.
MODEL is the top level directory for the corresponding
model system. This is not equal to MODELNAME and GRID_NUM.
For example, MODEL eta can contain files early-eta_211* and
meso-eta_218*. To determine the appropriate MODELNAME and GRID_NUM,
see the Model Data Access page.
CYCLE_hr may require another nested loop if the model is run
more than once a day.
To avoid overwriting output, [Local output file] should contain
the YYYYMMDD and CYCLE_hr variables.
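Although the scripts referred to here are written in PERL, the loop structure is the same in any language. The following Python sketch is one possible arrangement, not the method used by the NOMADS scripts. The function names are hypothetical, the cycle hours 0 and 12 are assumed for illustration, and the trailing _000 field is copied from the sample URL (it does not appear in the generalized pattern), so it is exposed as the hypothetical parameter fcst. The loop only prints the wget commands; nothing is downloaded.

```python
from datetime import date, timedelta


def build_url(server, model, modelname, grid_num, day, cycle_hr, fcst="000"):
    """Fill the URL pattern for one date and model cycle."""
    yyyymm = day.strftime("%Y%m")
    yyyymmdd = day.strftime("%Y%m%d")
    return "http://%s/data/%s/%s/%s/%s_%s_%s_%02d00_%s.grb" % (
        server, model, yyyymm, yyyymmdd,
        modelname, grid_num, yyyymmdd, cycle_hr, fcst)


def wget_commands(server, model, modelname, grid_num, start, end, cycles):
    """Yield one wget argument list per day and cycle from start to end inclusive."""
    day = start
    while day <= end:
        for cycle_hr in cycles:
            url = build_url(server, model, modelname, grid_num, day, cycle_hr)
            # The local name contains the date and cycle so nothing is overwritten.
            local = "%s_%s_%s_%02d00.grb" % (
                modelname, grid_num, day.strftime("%Y%m%d"), cycle_hr)
            yield ["wget", "-O", local, url]
        day += timedelta(days=1)


# Dry run: print the commands instead of executing them.
for cmd in wget_commands("nomads.ncdc.noaa.gov", "gfs", "gfs-avn", "201",
                         date(2003, 6, 7), date(2003, 6, 8), [0, 12]):
    print(" ".join(cmd))
```

To download for real, each argument list could be passed to subprocess.run(cmd, check=True) from the Python standard library, provided wget is installed.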
Related Topic: Using wget on a directory
Are all the data you need in the same HTTP or FTP directory?
If so, then you may use a single wget command rather than writing
a script. The general syntax is below:
wget -r [-lX] --no-parent -A[FILETYPE] -nd http://[SERVER NAME]/[PATH]
[SERVER NAME] & [PATH]
define the URL to the desired directory.
[-lX] defines the directory level depth "X" you want wget to scan.
Example: [-l1]
The -A[FILETYPE] option is optional. If it is omitted, wget will attempt to
download everything in the directory. If you use, for example,
[-A.grb], then only the files ending with .grb will be downloaded.
Note that this may not work on directories forbidden by robots.txt.
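If the directory command is launched from a script, the argument list can be assembled as in the Python sketch below. The function name is hypothetical, and the directory URL in the usage line is simply the directory part of the sample URL from the previous section. The command is printed, not executed.

```python
def wget_directory_command(url, depth=1, filetype=None):
    """Build the argument list for: wget -r -lX --no-parent [-AFILETYPE] -nd URL."""
    cmd = ["wget", "-r", "-l%d" % depth, "--no-parent"]
    if filetype:
        # Only files whose names end with this suffix are kept.
        cmd.append("-A" + filetype)
    cmd += ["-nd", url]
    return cmd


cmd = wget_directory_command(
    "http://nomads.ncdc.noaa.gov/data/gfs/200306/20030607/", depth=1, filetype=".grb")
print(" ".join(cmd))
```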
Mass GRIB Subsetting: Utilizing Partial-file HTTP Transfers
This recently implemented method allows you to install
and run scripts on your local
machine that will access and download parameter and vertical
level subsets of certain NCDC-NOMADS
datasets to your local disk. This method involves downloading
three PERL scripts to your working area, then running a single
command. Currently, this process is tested and works on UNIX/Linux
systems only. A Windows version is in the testing phase.
NCDC-NOMADS Datasets which currently support this access method:
Method:
- Visit and briefly review the following page:
Fast Downloading of GRIB Files
- Ensure your system has the dependencies, PERL and cURL,
installed.
- Download the two PERL scripts, get_inv.pl and get_grib.pl,
from the web page referenced in the first step.
- Add the directory containing these scripts to your PATH environment variable
- Download the following PERL script:
get-httpsubset.pl
- Run get-httpsubset.pl, see syntax and examples below
get-httpsubset.pl syntax:
get-httpsubset.pl interactive OR
get-httpsubset.pl <YYYYMMDDHH Start> <YYYYMMDDHH End> <GRIB Parameter names> <Levels> <output path> [Dataset] [OPTIONS]
interactive : Enter input parameters one at a time through a guided prompt (v1.3 or later).
YYYYMMDDHH Start and End: The date-time range for which you desire data. The program will exit if these are entered incorrectly.
GRIB Parameter Names: The list of parameters you wish to subset. These are the GRIB parameter codes; you can view a list of them on the
Model Data Inventories page (select Snapshot for NARR; the name is in the second column). To select multiple parameters, delimit them using the - character. Use "ALL" for all parameters.
Levels: The vertical level(s) you wish to subset,
specified in the same way as the parameter names. Reference the third column of the
snapshot inventory. Substitute underscores for spaces (for example, 500_mb for
500 millibars). Use "ALL" for all levels.
Output Path: The absolute path to the directory where you
want the downloaded subset files to be saved. The program will
automatically create a hierarchical date directory structure (YYYYMM/YYYYMMDD)
under the given directory to ease storage of a large number of files.
Dataset (Optional: default = narr-a): Can be one of:
narr-a, narr-b, narrmon-a, narrmon-b, narrmonhr-a, narrmonhr-b,
gfs, nam, gfs-avn, or meso-eta.
This determines which dataset the script will download. The
NARR, NAM, and GFS have different periods of record, and the Start and End dates you select
need to be consistent with the
dataset you select here; otherwise no data will be found.
OPTIONS: [-nocheck] Skips the dependency check; not recommended.
[--output-name-scheme=complex] Changes the naming scheme of the files the
program creates upon successful downloads to BASE/YYYYMM/YYYYMMDD/dataset-grid-yyyymmdd-hh00-000-vars-levs.grb. By default, the files are placed into a
simpler scheme: BASE/YYYYMM/dataset-grid-yyyymmdd-hh00-000.grb
Usage Examples:
get-httpsubset.pl 2003010100 2003010121 ACPCP sfc .
A day (Jan 01 2003) of Surface Precipitation from the narr-a set
downloads to current working directory
get-httpsubset.pl 2003010100 2003123121 HGT-TMP 500_mb-850_mb-1000_mb /home/user/data/
A year (2003) of 1000, 850, and 500 millibar
Temperature and Geopotential Height dumps into /home/user/data/
get-httpsubset.pl 1980010100 1980123121 UGRD-VGRD 250_mb . narrmon-a
12 monthly averaged 250 millibar U and V winds for a year (1980)
into current working directory (narrmon-a set)
get-httpsubset.pl 2003010100 2003010121 GFLUX sfc . narr-b
A day (Jan 01 2003) of Surface Ground Heat Flux from narr-b set
get-httpsubset.pl 2005040100 2005040121 TMP 2_m /home/user gfs
A day (Apr 01, 2005) of one-degree GFS two-meter temperature data downloads to the user's home directory.
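When many subset requests are generated from a script, the command line can be assembled from lists of parameters and levels. The Python sketch below is a hypothetical helper, not part of get-httpsubset.pl: it joins names with the - character, substitutes underscores for spaces in level names, and performs a simple format check on the two date-times. It does not run the PERL script.

```python
import re


def httpsubset_args(start, end, params, levels, out_path, dataset=None):
    """Build the argument list for get-httpsubset.pl from Python values."""
    for stamp in (start, end):
        if not re.fullmatch(r"\d{10}", stamp):
            raise ValueError("expected YYYYMMDDHH, got %r" % stamp)
    if start > end:
        raise ValueError("start must not be later than end")
    args = [
        "get-httpsubset.pl", start, end,
        "-".join(params),                                 # e.g. HGT-TMP
        "-".join(lev.replace(" ", "_") for lev in levels),  # e.g. 500_mb-850_mb
        out_path,
    ]
    if dataset:
        args.append(dataset)
    return args


print(" ".join(httpsubset_args(
    "2003010100", "2003123121", ["HGT", "TMP"],
    ["500 mb", "850 mb", "1000 mb"], "/home/user/data/")))
```

The printed line reproduces the second usage example above.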