Manipulating Data
Note that when specific time periods or locations are not selected, these operations are applied to the time and spatial grids in their entirety by default. It is also important to note that when multiple data variables are to be compared, as is common in the examples in this section, the time and spatial grids of those variables must be identical. You will see this issue addressed frequently.
1. Basic Arithmetic Operations
2. Setting Limits
a. Setting a minimum/maximum
b. Finding a minimum/maximum
c. Creating a numerical mask
d. Flagging Data
3. Creating Averages
a. Spatial Averages
b. Seasonal/Chunk Averages
c. Running Averages
4. Statistical and Other Mathematical Operations
a. Anomalies
b. Correlation
c. Trigonometric Functions
Example: Add 2.5°F to the minimum temperature data from Bismarck, ND.
Start at the GLOBALSOD*
dataset main page.
Select the station in Bismarck, ND. CHECK
Select the minimum temperature data variable.
CHECK
EXPERT
When adding a number to the field, the units of that data variable
are automatically used. Note the units of the minimum temperature data variable under
the Other Info heading.
While in expert mode, enter the following line
below the text already there.
2.5 add
Click "OK". CHECK
To see the results of this operation:
Select Tables link > Agree button> columnar table link.
CHECK
Compare with ORIGINAL DATA
Example: Create the observed monthly SST data by adding the monthly climatological SST data and the monthly SST anomaly data.
Start at the Reyn_Smith* dataset main page.
Because we are combining an observational data string with a climatological data string, we do not need to worry about the time grids matching. We must only make sure that the two strings have the same time scale (temporal resolution). In this example, the time grids of each of the data variables look like this:
Climatological SST
Time grid: /T (months since 01-Jan) periodic Jan to Dec by 1. N= 12 pts :grid
SSTA
Time grid: /T (months since 1960-01-01) ordered Nov 1981 to Jul 2002 by 1.
N= 249 pts :grid
Note that both of the data variables have a monthly time scale and that the time grid for the climatological data is periodic. Ingrid will automatically match that periodic data properly with the SSTA time grid.
Select the monthly SSTA and climatological SST data variables. EXPERT
While in expert mode, enter the following line below the
text already there.
add
Click "OK". CHECK
To see the results of this operation:
Select a small region (175.5-150°W, 10-15°N) and
a single time step (May 1988) to make the size of the data file more manageable.
CHECK
START
Select Tables link > Agree button> columnar table link.
CHECK
Compare with CLIMO DATA
, ANOMALY DATA, and OBSERVED DATA.
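The way a periodic 12-month climatology lines up with an ordered monthly anomaly series can be sketched in plain Python. This is only an illustration of the matching idea, not Ingrid's implementation; the function name `observed_sst` and all data values are hypothetical.

```python
# Hypothetical climatology values, index 0 = Jan ... 11 = Dec.
climatology = [26.0 + 0.1 * m for m in range(12)]

def observed_sst(anomalies, first_month_index):
    """Add a periodic monthly climatology to an ordered anomaly series.

    first_month_index is the calendar month (0 = Jan) of anomalies[0].
    The modulo wraps the 12-point climatology around the longer series.
    """
    return [
        anom + climatology[(first_month_index + i) % 12]
        for i, anom in enumerate(anomalies)
    ]

# Three hypothetical anomalies for Nov, Dec, Jan (series starts in November).
ssta = [0.5, -0.2, 0.3]
sst = observed_sst(ssta, first_month_index=10)
```

The third value is matched with the January climatology, which is the wrap-around behavior that a periodic time grid makes possible.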
Example: Subtract 2.5°F from the minimum temperature data from
Bismarck, ND.
Refer to the example in Section 1.a, Adding a Number to a Field.
Substitute the following Ingrid command for 2.5 add.
2.5 sub
Click "OK". CHECK
Note that Ingrid subtracts the second number/field listed (e.g.,
2.5) from the first number/field listed (e.g., min. temperature).
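The operand order described here is what one would expect from a postfix (stack-based) command language. The toy evaluator below is a sketch under that assumption; `eval_postfix` and `OPS` are hypothetical names, and the code is not a description of how Ingrid is actually implemented.

```python
def _elementwise(op, first, second):
    """Apply op to two operands, broadcasting a scalar across a list."""
    if isinstance(first, list) and isinstance(second, list):
        return [op(a, b) for a, b in zip(first, second)]
    if isinstance(first, list):
        return [op(a, second) for a in first]
    if isinstance(second, list):
        return [op(first, b) for b in second]
    return op(first, second)

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,  # first listed minus second listed
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,  # first listed divided by second listed
}

def eval_postfix(field, command):
    """Evaluate a command such as '2.5 sub' against an already selected field."""
    stack = [field]  # the selected data variable is listed first
    for token in command.split():
        if token in OPS:
            second = stack.pop()
            first = stack.pop()
            stack.append(_elementwise(OPS[token], first, second))
        else:
            stack.append(float(token))
    return stack.pop()

tmin_f = [10.0, 20.5]                      # hypothetical minimum temperatures
shifted = eval_postfix(tmin_f, "2.5 sub")  # field minus 2.5, not 2.5 minus field
```

Because the field is pushed first and the number second, `sub` and `div` always treat the field as the left-hand operand.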
Example: Create the monthly climatological SST data by subtracting the monthly SSTA data from the observed monthly SST data.
Start at the Reyn_Smith* dataset main page.
Note the time grids of the two variables to be compared.
SSTA
Time grid: /T (months since 1960-01-01) ordered Nov 1981 to Jul 2002 by 1.
N= 249 pts :grid
SST
Time grid: /T (months since 1960-01-01) ordered Nov 1981 to Jul 2002 by 1.
N= 249 pts :grid
As is common when comparing two variables from the same dataset, their time grids match exactly. However, you should always make a point of checking this.
Make sure that the spatial grids of these two variables match.
Now that we are sure that the grids match properly, this example
is very much like that in Section 1.b where we added the two fields. The primary
difference here is the order in which the variables are listed. As noted in Section 1.c,
Ingrid subtracts the second field listed from the first field listed.
Select the monthly SST and then the monthly SSTA data variables. EXPERT
While in expert mode, enter the following line below the
text already there.
sub
Click "OK". CHECK
Example: Convert the units of the mean sea level pressure data
from mb to Pa by multiplying the field by 100.
Note: there is an Ingrid command that converts units themselves
instead of just the data values. This is just an example and the units of the data will
still appear as mb after the arithmetic operation.
Start at the GLOBALSOD*
dataset main page.
Select the mean sea level pressure data variable. CHECK
EXPERT
While in expert mode, enter the following line below the text
already there.
100 mul
Click "OK". CHECK
This operation works just like that covered in Sections 1.b and 1.d.
Ensure that the grids of the variables match, select both of them, and use mul as the operator in expert mode. The mul command can also be used to find common entries in two different data strings.
Example: Divide the precipitation data by 2.54, the conversion factor between inches and cm.
Note: converting a depth from inches to cm actually requires multiplying by 2.54
(2.54 mul); dividing by 2.54 goes from cm to inches. Division is used here only to
demonstrate div. There is also an Ingrid command that converts units themselves instead of
just the data values. After this arithmetic operation the units of the data will still appear as
inches.
Start at the GLOBALSOD*
dataset main page.
Select the precipitation data variable. CHECK EXPERT
While in expert mode, enter the following line below the text already
there.
2.54 div
Click "OK". CHECK
Note that Ingrid divides the first number/field listed (e.g., precipitation) by the second number/field listed (e.g., 2.54).
This operation works just like that covered in Sections 1.b and 1.d.
Ensure that the grids of the variables match, select both of them, and use div as the operator in expert mode. Again, note that Ingrid divided the first field listed by the second field listed.
It is often useful to limit data values. You may want to set minimum and maximum limits on data as a means of quality control or only use data that meets particular criteria. Ingrid makes these types of operations very easy. Below are some common examples.
Example: Create a data string where all minimum temperature data values less than 0°C are given a value of 0°C.
Start at the GLOBALSOD*
dataset main page.
Select the minimum temperature data variable. CHECK EXPERT
As previously described, it is a good idea to note the units of the data in question as all values in Ingrid are automatically referenced to the units of the data variable.
Note the units of temperature by looking at the information
under the Other Info heading.
In this case, the units are in Fahrenheit and we must therefore give
our desired minimum temperature in Fahrenheit.
While in expert mode, enter the following line below the text already
there.
32. max
Click "OK".
To see the results of this operation:
Select a single station (Vienna, WMO ID:110360) and short time period
(e.g., 1996) to make the size of the data file more manageable.
CHECK
Select Tables link > Agree button> columnar table link. CHECK Compare with the ORIGINAL DATA .
An analogous operation, setting a maximum value of 0°C, can be done by replacing the command 32. max with 32. min.
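The effect of `32. max` and `32. min` can be pictured as an elementwise comparison against the threshold, after the threshold has been converted into the units of the data. A minimal Python sketch with hypothetical temperature values:

```python
def c_to_f(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0

# The limit is stated in Celsius, but the data are in Fahrenheit.
limit_f = c_to_f(0.0)  # 32.0

tmin_f = [25.0, 32.0, 40.1]  # hypothetical minimum temperatures in Fahrenheit

# Like "32. max": values below the limit are raised to the limit.
floored = [max(t, limit_f) for t in tmin_f]

# Like "32. min": values above the limit are lowered to the limit.
capped = [min(t, limit_f) for t in tmin_f]
```

Note the slightly counterintuitive naming: taking the maximum of each value and the limit sets a minimum, and taking the minimum sets a maximum.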
There are two common uses of this feature. You may want to find a minimum/maximum value in a particular region or in a particular time period. Let's look at examples of these operations.
Example: Find the largest SSTAs for the entire time grid.
This example finds the largest SSTA from the entire time grid for each
grid point. The result is the largest SSTA as a function of X (longitude) and Y (latitude).
Of course, you can limit the time grid to find the largest SSTAs in a more specific
time period.
Start at the Reyn_Smith*
dataset main page.
Select the monthly SSTA data variable. CHECK EXPERT
To find the largest positive SSTA:
While in expert mode, enter the following line below the text
already there.
[T] maxover
Click "OK". CHECK
To find the largest negative SSTA:
While in expert mode, enter the following line below the text
already there.
[T] minover
Click "OK". CHECK
To see the results of this operation:
Select the views icon furthest to the left in the function bar, the one that has
the land in black. CHECK
Example: Find the largest SSTAs for the entire spatial grid.
This example finds the largest SSTA from the entire spatial grid for each time
step. The result is the maximum global SSTA as a function of T (time). Of course, you can
limit the spatial grid to find the largest SSTAs in a specific region.
Start at the Reyn_Smith*
dataset main page.
Select the monthly SSTA data variable. CHECK EXPERT
To find the largest positive SSTA:
While in expert mode, enter the following line below the text already
there.
[X Y] maxover
Click "OK". CHECK
To find the largest negative SSTA:
While in expert mode, enter the following line below the text already
there.
[X Y] minover
Click "OK". CHECK
To see the results of this operation:
Select Tables link > columnar table link. CHECK
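The two uses differ only in which grids are collapsed. The sketch below uses a tiny hypothetical field indexed as `data[t][y][x]` to show what remains after reducing over T versus over X and Y. It illustrates the idea only and is not Ingrid code.

```python
# Hypothetical anomaly field: 2 time steps, 2 latitudes, 2 longitudes.
data = [
    [[0.1, -0.4], [0.3, 0.0]],   # t = 0
    [[0.5, -0.9], [-0.2, 0.7]],  # t = 1
]
nt, ny, nx = len(data), len(data[0]), len(data[0][0])

# Like "[T] maxover": collapse time, leaving a function of Y and X.
max_over_t = [
    [max(data[t][y][x] for t in range(nt)) for x in range(nx)]
    for y in range(ny)
]

# Like "[X Y] maxover" and "[X Y] minover": collapse space,
# leaving one value per time step.
max_over_xy = [max(v for row in grid for v in row) for grid in data]
min_over_xy = [min(v for row in grid for v in row) for grid in data]
```

The first result is a map and is best viewed as a plot; the second is a time series, which is why a columnar table works well for it.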
Masks make data values that meet a particular threshold equal to NaN.
Example: Mask out the maximum temperature values greater than 100°F.
Start at the GLOBALSOD*
dataset main page.
Select the maximum temperature data variable. CHECK
Note that the temperature unit is Fahrenheit.
This is good because our mask threshold is also in Fahrenheit. If the units
had not agreed, then we would have had to convert the mask threshold to the units of the data
variable.
While in expert mode, enter the following line below the text already
there.
100. maskgt
Click "OK". CHECK
An analogous operation, masking out the maximum temperature values less than 100°F, can be done by replacing maskgt with masklt.
To see the results of this operation:
Select a single station (Damascus, WMO ID: 400800) and a short time
period (Jun 1994) to make the size of the data file more manageable.
CHECK
Select Tables link > Agree button> columnar table link. CHECK Compare with the ORIGINAL DATA . Note that the data value from June 15 is missing. (Tables exclude NaN values.)
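As a rough picture of masking, the sketch below replaces values above a threshold with NaN and then drops them the way a table listing would. It assumes a strict greater-than comparison, which the text does not spell out; `mask_gt` is a hypothetical helper and the temperatures are made up.

```python
import math

def mask_gt(values, threshold):
    """Replace values strictly greater than threshold with NaN (assumed semantics)."""
    return [math.nan if v > threshold else v for v in values]

tmax_f = [98.6, 100.0, 101.3]    # hypothetical maximum temperatures
masked = mask_gt(tmax_f, 100.0)  # the last value becomes NaN

# A table that excludes NaN values shows only the surviving entries.
table_rows = [v for v in masked if not math.isnan(v)]
```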
Flags create a binary version of any variable based on a particular threshold. Those data that meet the threshold are given a value of 1 and those that do not receive a value of 0.
Example: Flag snow depth values greater than 1 meter.
Start at the GLOBALSOD*
dataset main page.
Select the snow depth data variable. CHECK
Note that the depth unit is inches.
Our flag threshold is in meters, so we must convert that depth to give Ingrid
the threshold in the units of the data variable.
While in expert mode, enter the following line below the text already
there.
39.4 flaggt
Click "OK". CHECK
To see the results of this operation:
Select Tables link > Agree button> columnar table link.
CHECK
Compare with the ORIGINAL DATA.
An analogous operation, flagging snow depths less than 1 meter, can be done by
replacing flaggt with flaglt.
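The unit conversion and the resulting binary field can be sketched as follows. The comparison is assumed to be strict, `flag_gt` is a hypothetical helper, and the snow depths are invented for illustration.

```python
# 1 inch is exactly 0.0254 m, so 1 m is about 39.37 inches.
INCHES_PER_METER = 1.0 / 0.0254
threshold_in = round(INCHES_PER_METER, 1)  # 39.4, the value used in the command

def flag_gt(values, threshold):
    """Return 1 where a value is strictly greater than threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

snow_depth_in = [0.0, 12.0, 39.4, 45.0]  # hypothetical depths in inches
flags = flag_gt(snow_depth_in, threshold_in)
```

Unlike a mask, a flag keeps every data point and only changes its value, so the output has the same length as the input.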
When creating a spatial average of station data, one typically wants to take
into account the location of each station (e.g., weighted average). That operation is beyond
the scope of this tutorial. However, creating a spatial average of gridded data is much more
straightforward and an example is given here.
Example: Find the spatial average of monthly SST data in a region in the Gulf
of Mexico defined by 83°-97°W, 21°-30°N for Jan-Dec 1998.
Start at the Reyn_Smith*
dataset main page.
Enter expert mode and enter the following line below the text already
there.
To see the results of this operation:
This procedure can be easily applied to other types of spatial averaging. For
example, if you wanted to create a zonal average, then you would use the following line of Ingrid
instead.
Example: Create seasonal averages (DJF, MAM, JJA, SON) of monthly SST data
from 1990-1999.
Start at the Reyn_Smith*
dataset main page.
While in expert mode, enter the following line below the text already there.
To see an animation of the seasonal averages you just created:
Note that if you had wanted JFM, AMJ, JAS, OND seasonal averages, then the selected
time period would have been Jan 1990 to Dec 1999. Another important point here is that the step over
which the average is created is always in the units of the data variable in question. For example,
had the SST data been at a daily time scale, the above Ingrid command would have created a 3-day
average instead of a 3-month average. Therefore, it is an excellent idea to get in the habit of
making sure the units of the data variable and the step agree with each other. The technique
used in this example is particularly useful when 12 is evenly divisible by the step over which
you want to average. The next example addresses the cases when this is not true.
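Averaging in fixed, non-overlapping chunks can be sketched in a few lines of Python. The helper name `box_average` is hypothetical, and dropping an incomplete trailing chunk is an assumption of this sketch rather than documented Ingrid behavior.

```python
def box_average(values, step):
    """Average consecutive, non-overlapping chunks of length step."""
    n_full = len(values) // step  # incomplete trailing chunk is ignored here
    return [
        sum(values[i * step:(i + 1) * step]) / step
        for i in range(n_full)
    ]

# Twelve hypothetical monthly values starting in December:
# chunks of 3 then correspond to DJF, MAM, JJA, SON.
monthly = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
seasonal = box_average(monthly, 3)
```

The chunk boundaries depend entirely on where the series starts, which is why the selected time period begins in December for DJF seasons and in January for JFM seasons.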
Example: Create May-Sept averages of SSTA data for the time period 1985-1994.
Start at the Reyn_Smith*
dataset main page.
While in expert mode, enter the following line below the text already there.
This Ingrid command splits the time grid with a period of 12. That is, in this example,
it creates a dataset of Jan data, a dataset of Feb data, etc. This is an important step, but we are not quite
finished.
Select May-Sept grids and average over them with the following Ingrid commands.
There is also a convenient option if you want to create averages/climatologies of
single months.
Example: Create a monthly climatology of SST data for the time period 1982-2001.
Start at the Reyn_Smith*
dataset main page.
Select the Filters link in the function bar.
Note that this command can only be applied to monthly data.
This operation offers a fast and easy way to smooth data temporally. Let's look at an
example.
Example: Create a 15-day running average of precipitation data.
Start at the GLOBALSOD*
dataset main page.
At this point, it is a good habit to check the temporal unit to make sure it agrees with
how you want to define your average step. In this example, we want to create a 15-day running mean.
Therefore, the unit over which we want to average is a day.
Make sure that the temporal unit of the precipitation data is the same as the unit
over which you want to average.
Note that this operation will truncate the data to fit the step. In this example, we have
a step of 15 days and are using the full time grid of Jan 8, 1994 - Dec 25, 1999. Therefore, after the
running mean is created, the data will include the dates Jan 15, 1994 - Dec 18, 1999.
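The truncation follows from using a centered window: a 15-point window needs 7 points on either side, so the first and last 7 days have no complete window. The sketch below reproduces the dates quoted above under that assumption; `running_average` is a hypothetical helper and the data values are placeholders.

```python
from datetime import date, timedelta

def running_average(values, window):
    """Centered running mean for an odd window; the ends are truncated."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]

start, end = date(1994, 1, 8), date(1999, 12, 25)
n_days = (end - start).days + 1

series = [float(i % 7) for i in range(n_days)]  # placeholder daily values
smoothed = running_average(series, 15)

# Seven days are lost at each end of the time grid.
first_kept = start + timedelta(days=7)
last_kept = end - timedelta(days=7)
```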
To see the results of this operation:
Select one of the views links in the function bar. Compare with those from the line, bar, and scatter
plots of the original data.
Below are a few examples of statistical and miscellaneous mathematical operations.
Earth science data is commonly viewed in terms of anomalies (i.e., the difference between
observations and climatology) rather than as raw values. Anomalies can be produced with Ingrid by first
calculating a climatology and then calculating the difference between it and the observed data. However,
Ingrid also has a single command that does all of these calculations. Let's look at an example.
Example: Recreate the SSTA data for the time period 1982-2001.
Start at the Reyn_Smith*
dataset main page.
While in expert mode, enter the following line below the text already there.
You have just created the SST anomalies for the time period 1982-2001 based on
a 1982-2001 climatology. While convenient, this operation is a bit limited in that it can only be applied to
monthly data. And like the yearly-climatology command, you can find this option via the
"Filters" link in the function bar.
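The two-step calculation described above (build a monthly climatology, then subtract it from the observations) can be sketched in Python. The function names are hypothetical, the data are invented, and the series is assumed to start in January and cover whole years.

```python
def monthly_climatology(monthly):
    """Mean of each calendar month over all years (series starts in January)."""
    return [
        sum(monthly[m::12]) / len(monthly[m::12])
        for m in range(12)
    ]

def anomalies(monthly):
    """Observed value minus the climatological value for the same month."""
    clim = monthly_climatology(monthly)
    return [value - clim[i % 12] for i, value in enumerate(monthly)]

# Two hypothetical years: a uniformly cool year followed by a warm one.
sst = [10.0] * 12 + [12.0] * 12
ssta = anomalies(sst)
```

Because the climatology is computed from the same period as the data, the anomalies for each calendar month average to zero over that period.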
This is an excellent example that combines many of the techniques covered to this point.
Example: Correlate precipitation amount observations in Chimbote, Peru with SSTA in
the region defined by 130°-90°W, 5°-15°S for the time period 1997-1998.
Up to this point, we have been using the GLOBALSOD dataset for our station data.
However, you can see in the time grid information of this dataset that its temporal unit is days while
that of the Reyn_Smith SST data we have been using is months since 1960. For simplicity, let's use another station dataset,
NOAA NCDC GHCN v2beta, that has monthly data defined as months since 1960.
Select the NOAA NCDC GHCN v2beta dataset by either searching for it or through the
SOURCES option. CHECK
At this point, when the first dataset selections have been made, it is typically easiest
to make the second dataset selections in expert mode.
While in expert mode, enter the following lines below the text already there. All
of these commands should look familiar to you from previous examples.
You now have two data fields with identical time grids. Let's correlate these fields.
While in expert mode, enter the following lines below the text already there.
To view the correlation data you just produced:
Basic trig functions are typically used with the spatial grids. The results of this
function can then be used as part of a broader technique, such as spatial weighting.
Example: Find the cosine of a latitudinal grid of weekly SST data.
Start at the Reyn_Smith*
dataset main page.
To view the data you just produced:
Select a single station (Pellston, MI, WMO ID: 727347) and a short time
period (Jan-Feb 1996) to make the size of the data file more manageable.
CHECK
3. Creating Averages
Select the monthly SST data variable. CHECK
Select the 1998 time period and the lat/lon defined region.
CHECK EXPERT
[X Y] average
Click "OK". CHECK
Select one of the views links in the function bar.
[X] average
This creates a zonal average as a function of T (time) and Y (latitude). Click here to see an example of this operation.
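The difference between `[X Y] average` and `[X] average` is simply which grids are averaged away. The sketch below uses a hypothetical single-time-step field indexed as `field[y][x]` and plain unweighted means; whether Ingrid applies any grid weighting is not stated here, so treat the equal weights as an assumption.

```python
# Hypothetical SST field for one time step: 2 latitudes by 3 longitudes.
field = [
    [1.0, 2.0, 3.0],  # y = 0
    [4.0, 5.0, 6.0],  # y = 1
]

# Like "[X Y] average": one number for the whole region.
n_points = sum(len(row) for row in field)
area_mean = sum(v for row in field for v in row) / n_points

# Like "[X] average": a zonal mean, one value per latitude.
zonal_mean = [sum(row) / len(row) for row in field]
```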
b. Seasonal/chunk averages
Select the monthly SST data variable. CHECK
Select the Dec 1989-Nov 1999 time period.
CHECK EXPERT
T 3 boxAverage
Click "OK". CHECK
Select the views link furthest to the left in the function bar.
Enter "Jan 1990 to Oct 1999" in the time text box at the top of the data viewer.
Click "Redraw". CHECK
This example creates an average over 5 months. Twelve is not evenly divisible by
this step (i.e., 5 months) so we must use a different technique than the one above.
Select the monthly SSTA data variable. CHECK
Select the Jan 1985-Dec 1994 time period.
CHECK EXPERT
T 12 splitstreamgrid
Click "OK". CHECK
T (May) (Jun) (Jul) (Aug) (Sep) VALUES
[T] average
Click "OK". CHECK
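The split, select, and average sequence (T 12 splitstreamgrid, then VALUES for May through Sep, then [T] average) can be pictured as reshaping the monthly series into years of 12 months and averaging five of those months within each year. The sketch below assumes the series starts in January and spans whole years; `may_sep_mean_by_year` is a hypothetical name.

```python
def may_sep_mean_by_year(monthly):
    """Return one May-Sep mean per year for a series that starts in January."""
    n_years = len(monthly) // 12
    # Split the time grid with a period of 12: one row per year.
    years = [monthly[y * 12:(y + 1) * 12] for y in range(n_years)]
    # Month indices 4..8 are May, Jun, Jul, Aug, Sep.
    return [sum(year[4:9]) / 5 for year in years]

# Two hypothetical years of monthly values, 1.0 through 24.0.
monthly = [float(i) for i in range(1, 25)]
warm_season = may_sep_mean_by_year(monthly)
```

The result has one value per year, which is what distinguishes this approach from a fixed-step chunk average.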
Select the monthly SST data variable. CHECK
Select the Jan 1982-Dec 2001 time period.
CHECK EXPERT
Select the monthly climatology link.
CHECK
EXPERT
c. Running averages
Select the precipitation data variable. CHECK
While in expert mode, enter the following line below the text already there.
T 15 runningAverage
Click "OK". CHECK
Select a station (Barcelona, Spain, WMO ID: 81810).
4. Statistical and Other Mathematical Operations
a. Anomalies
Select the monthly SST data variable. CHECK
Select the Jan 1982-Dec 2001 time period.
CHECK EXPERT
yearly-anomalies
Click "OK". CHECK
b. Correlation
Note: in order to correlate two sets of data, they must have the exact same
temporal unit.
Select the station in Chimbote, Peru (WMO ID: 84531000).
CHECK
Select the precipitation variable.
CHECK EXPERT
Select the Jan 1997-Dec 1998 time period.
CHECK EXPERT
SOURCES .NOAA .NCEP .EMC .CMB .GLOBAL .Reyn_SmithOIv1 .monthly .ssta
T (Jan 1997) (Dec 1998) RANGE
X (130W) (90W) RANGE
Y (15S) (5S) RANGE
Click "OK".
CHECK
[T] correlate
Click "OK". CHECK
Select one of the views links in the function bar.
OR
Select Tables link > Agree button > columnar table link.
CHECK
You can correlate over the spatial grids as well by replacing [T] with [X], [Y], [X Y], etc.
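Correlating a station series with a gridded field over T yields one coefficient per grid point. The sketch below assumes the statistic is the ordinary Pearson correlation coefficient, which this tutorial does not state explicitly; the function names and all data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlate_over_time(station, field):
    """Correlate a station series with field[t][y][x] at every grid point."""
    nt, ny, nx = len(field), len(field[0]), len(field[0][0])
    return [
        [pearson(station, [field[t][y][x] for t in range(nt)]) for x in range(nx)]
        for y in range(ny)
    ]

# Hypothetical data: 4 time steps, 1 latitude, 2 longitudes.
precip = [1.0, 2.0, 3.0, 4.0]
ssta = [[[2.0 * p, -p]] for p in precip]  # first point rises with precip, second falls
corr_map = correlate_over_time(precip, ssta)
```

Both inputs must share the same time grid, since the pairs of values are matched index by index along T.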
c. Trigonometric functions
Select the weekly SST data variable. CHECK
While in expert mode, enter the following line after the text already there.
Y cosd
Click "OK". CHECK
To find the sine of data, replace cosd with sind.
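A common use of the cosine of latitude is as a weight when averaging over a latitude grid, because grid cells cover less area toward the poles. The sketch below shows a degree-based cosine and a weighted mean; `cosd` mirrors the command name, while `weighted_mean` and the data are hypothetical.

```python
import math

def cosd(degrees):
    """Cosine of an angle given in degrees."""
    return math.cos(math.radians(degrees))

def weighted_mean(values, weights):
    """Mean of values with the given non-negative weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

lats = [-60.0, 0.0, 60.0]            # hypothetical latitude grid in degrees
weights = [cosd(lat) for lat in lats]

sst = [10.0, 20.0, 30.0]             # hypothetical values on that grid
mean_sst = weighted_mean(sst, weights)
```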
Select one of the views links in the function bar.
OR
Select Tables link > columnar table link. CHECK