Text Point Data

4.2 Text Point Data

RAMADDA provides rich support for structured CSV and other text data files. To try the examples, first download and install the pointtools.zip release. Each of the examples below, once saved to disk, can be processed with:

    sh <install path>/pointtools/pointchecker.sh  value.csv

Here are some examples of different point data readers.

4.2.0 Simple CSV Examples

If you have text-formatted data that you want to ingest into RAMADDA, you can either generate the data in a standard CSV text format or specify a set of metadata properties in an external properties file.

The "standard CSV" format has any number of comment and property lines, each starting with "#", at the beginning of the file, followed by any number of data records. The properties are defined in the header with:

#comment
#property name=property value
#property name=property value
#
value1,value2,valueN
value1,value2,valueN
...
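
To make the layout concrete, here is a minimal Python sketch that separates the "#" header lines from the data records. It is only an illustration and not RAMADDA's own parser: the function name split_header is made up, and treating every header line that contains "=" as a property is a simplifying assumption.

```python
def split_header(text):
    """Split "standard CSV" text into (properties, records).

    Illustrative only: this is not RAMADDA's parser.
    """
    properties = {}
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line[1:].strip()
            # Assumption: a header line containing "=" is a property,
            # anything else is a plain comment.
            if "=" in body:
                name, value = body.split("=", 1)
                properties[name.strip()] = value.strip()
        else:
            records.append([v.strip() for v in line.split(",")])
    return properties, records


sample = """#a comment line
#fields=value1,value2
#field.value2.missing=-999.0
#
-0.93,100.0
-0.23,-999.0
"""

props, rows = split_header(sample)
print(props)
print(rows)
```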

Here is a simple example with just a single column value:

value.csv

#fields=value[unit="some unit"]
-0.931363
-0.930391

The only required property is fields: a comma-separated list of field identifiers, each optionally followed by a set of attributes enclosed in "[" and "]".

    fieldname[attr1="value1" attr2="value2" ...]

Here is a simple example with 2 columns. The second column has a missing value defined.

2values.csv

#fields=value1[unit="some unit"],value2[unit="some unit" missing="-999.0"]
-0.93,100.0
-0.23,-999.0
-1.93,-999.0
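
As a rough illustration of the fields syntax, the Python sketch below splits a fields value into field names and attribute dictionaries. The helper names split_fields and parse_fields are hypothetical, and the sketch assumes that attribute values never contain "[", "]" or a double quote; RAMADDA's actual parsing rules may differ.

```python
import re

# Matches name="quoted value" or name=bare_value
ATTR_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def split_fields(spec):
    """Split on the commas that sit outside of [...] attribute blocks."""
    parts, depth, current = [], 0, ""
    for ch in spec:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def parse_fields(spec):
    """Return a list of (field name, attribute dict) pairs."""
    fields = []
    for part in split_fields(spec):
        name, _, rest = part.partition("[")
        attrs = {}
        for m in ATTR_RE.finditer(rest.rstrip("]")):
            quoted, bare = m.group(2), m.group(3)
            attrs[m.group(1)] = quoted if quoted is not None else bare
        fields.append((name.strip(), attrs))
    return fields


spec = 'value1[unit="some unit"],value2[unit="some unit" missing="-999.0"]'
for name, attrs in parse_fields(spec):
    print(name, attrs)
```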

An alternative way to specify attributes of fields is with other named properties as shown below.

2values_alt.csv

#fields=value1,value2
#field.value1.unit=some unit
#field.value2.unit=some other unit
#field.value2.missing=-999.0
-0.93,100.0
-0.23,-999.0
-1.93,-999.0
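
The sketch below shows one way the field.<name>.<attribute> properties could be folded into per-field attributes, and how a missing marker might then be applied to a column. The function names are made up, and mapping a missing value to None is an assumption for illustration; it is not a statement about how RAMADDA represents missing data internally.

```python
def field_attributes(properties):
    """Collect field.<name>.<attr>=value properties into per-field dicts."""
    names = [n.strip() for n in properties["fields"].split(",")]
    attrs = {name: {} for name in names}
    for key, value in properties.items():
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "field" and parts[1] in attrs:
            attrs[parts[1]][parts[2]] = value
    return attrs


properties = {
    "fields": "value1,value2",
    "field.value1.unit": "some unit",
    "field.value2.unit": "some other unit",
    "field.value2.missing": "-999.0",
}
attrs = field_attributes(properties)


def clean(name, raw):
    """Convert a raw column value to float; the missing marker becomes None."""
    missing = attrs[name].get("missing")
    if missing is not None and float(raw) == float(missing):
        return None
    return float(raw)


print(attrs)
print([clean("value2", v) for v in ["100.0", "-999.0", "-999.0"]])
```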

4.2.1 Date/Time

You can mark a field as a date/time by setting type="date". Use format="date format" to specify the date format. Here is an example with time and a single value.

time_value.csv

#fields=date[type="date" format="yyyy-MM-dd"],value
2001-01-01,-0.931363
2001-02-01,-0.930391
2001-03-01,-0.95
2001-04-01,-0.96
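
The format strings used here look like Java SimpleDateFormat patterns (yyyy for year, MM for month, dd for day, HH for hour, mm for minute, ss for second). Under that assumption, the Python sketch below translates the small subset of pattern letters used in this section into strptime directives so you can check a format against your data before uploading. The to_strptime helper is hypothetical and handles only these six tokens.

```python
from datetime import datetime

# Assumed meaning of the pattern letters; only this subset is handled.
TOKENS = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"),
          ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]


def to_strptime(pattern):
    """Translate e.g. "yyyy-MM-dd" into the strptime format "%Y-%m-%d"."""
    out = pattern
    for token, directive in TOKENS:
        out = out.replace(token, directive)
    return out


fmt = to_strptime("yyyy-MM-dd")
for text in ["2001-01-01", "2001-02-01", "2001-03-01", "2001-04-01"]:
    print(text, "->", datetime.strptime(text, fmt).date())
```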

Here we have a file where the date and time are in different columns. The isdate and istime attributes specify that the record's time is built from both columns. The dateformat property specifies the format of the combined date and time.

datetime.csv

#
#dateformat=yyyy/MM/dd HH:mm:ss
#fields=date[type="string" isdate="true"],time[type="string" istime="true"],value
#
2012/10/12 14:11:17.14 -0.931363
2012/10/12 14:11:17.24 -0.930391

If you have fields with the names yyyy (or year), month (or mm), day, hour, minute, second (or a subset of them) then RAMADDA will figure out the date/time of the records from the column values.

yymmddhhmmss.csv

#fields=yyyy[type="string"],month[type="string"],day[type="string"],hour[type="string"],minute[type="string"],second[type="string"],value
2001,01,01,01,00,00,-0.931363
2001,02,01,01,00,00,-0.930391
2001,03,01,01,00,00,-0.95
2001,04,01,01,00,00,-0.96
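
Conceptually, the date/time of each record is assembled from whichever of these named columns are present. The Python sketch below mimics that idea. The record_datetime function and the defaults used for absent components (1970-01-01 00:00:00) are assumptions made for the example, not RAMADDA behavior.

```python
from datetime import datetime

# Recognised column names and the datetime component each one supplies
ALIASES = {
    "yyyy": "year", "year": "year",
    "month": "month", "mm": "month",
    "day": "day", "hour": "hour", "minute": "minute", "second": "second",
}


def record_datetime(field_names, values):
    """Build a datetime from the date/time columns of one record."""
    # Assumed defaults for components that have no column in the file
    parts = {"year": 1970, "month": 1, "day": 1,
             "hour": 0, "minute": 0, "second": 0}
    for name, value in zip(field_names, values):
        component = ALIASES.get(name)
        if component:
            parts[component] = int(value)
    return datetime(**parts)


names = ["yyyy", "month", "day", "hour", "minute", "second", "value"]
row = "2001,02,01,01,00,00,-0.930391".split(",")
print(record_datetime(names, row))
```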

4.2.2 Georeferenced Data

If you have georeferenced data then specify latitude and longitude columns. Please, please, please, use decimal degrees east -180 to 180 and decimal degrees north -90 to 90.

latlon_value.csv

#fields=latitude[unit="degrees"],longitude[unit="degrees"],value
40,-107,-0.931363
45,-110,-0.930391
40,-107,-0.95
35,-120,-0.96
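
If your source data uses longitudes from 0 to 360, a small pre-processing step can bring it into the requested ranges before you write the CSV. The following Python sketch is one way to do that; the function names are made up for this example.

```python
def normalize_longitude(lon):
    """Wrap a longitude in degrees east into the range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def check_position(lat, lon):
    """Validate the latitude and return (lat, lon) with a wrapped longitude."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range: %s" % lat)
    return lat, normalize_longitude(lon)


# 253 degrees east is the same meridian as -107
print(check_position(40.0, 253.0))
print(check_position(45.0, -110.0))
```

Note that with this wrapping an input of exactly 180 comes out as -180, which is the same meridian.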

Here is a georeferenced time series:

latlon_time_value.csv

#fields=latitude[unit="degrees"],longitude[unit="degrees"],date[type="date" format="yyyy-MM-dd"],value
40,-107,2001-01-01,-0.931363
45,-110,2001-02-01,-0.930391
40,-107,2001-03-01,-0.95
35,-120,2001-04-01,-0.96

You can specify different coordinate reference systems with the crs property. For UTM coordinates, specify x and y fields along with the UTM zone and the north/south flag. Here is data in UTM zone 58 South:

utm58s_rgbi.csv

#crs=utm
#utm.zone=58
#utm.north=false
#fields=x,y,elevation[unit=m],red,green,blue,intensity
449929.47,  1382815.76, 21.01,  67, 66, 61, 0
449929.45,  1382815.77, 21.00,  67, 66, 61, 0
449929.47,  1382815.77, 21.02,  78, 77, 72, 0
449929.13,  1382815.69, 20.94,  89, 86, 77, 0
449929.16,  1382815.71, 20.94,  90, 90, 82, 0

Here is data with x, y, z coordinates referenced to the WGS84 ellipsoid:

wgs84_rgbi.csv

#crs=wgs84
#fields=x[precision=2],y[precision=3],z[precision=4],r[type=integer],g[type=integer],b[type=integer],intensity[type=integer]
-2313174.974,-3717949.974,4622885.034,4,4,4,1166
-2313175.009,-3717949.961,4622885.028,2,2,2,1799
-2313175.001,-3717950.058,4622884.669,2,2,2,1196
-2313175.012,-3717949.889,4622884.824,4,4,4,2659
-2313175.284,-3717950.819,4622883.842,4,4,4,1663
-2313175.097,-3717950.210,4622884.419,6,6,6,2101
-2313175.074,-3717949.930,4622884.744,5,5,4,1598
-2313175.198,-3717950.351,4622884.298,5,5,4,1937
-2313175.079,-3717949.814,4622884.857,3,3,3,1302
-2313175.195,-3717950.237,4622884.345,4,4,4,1425
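
The magnitudes of the x, y, z values above suggest that they are geocentric (earth-centered, earth-fixed) coordinates in meters. That is an assumption; the text does not say so explicitly. Under that assumption, the Python sketch below converts one record to latitude, longitude and ellipsoidal height with a standard iterative method, which can be a handy sanity check on where your points fall. It does not handle points exactly at the poles.

```python
import math

A = 6378137.0              # WGS84 semi-major axis in meters
F = 1.0 / 298.257223563    # WGS84 flattening
E2 = F * (2.0 - F)         # first eccentricity squared


def ecef_to_geodetic(x, y, z, iterations=10):
    """Convert geocentric x, y, z (m) to (lat deg, lon deg, height m)."""
    lon = math.atan2(y, x)
    p = math.hypot(x, y)                    # distance from the rotation axis
    lat = math.atan2(z, p * (1.0 - E2))     # first guess, assumes height 0
    for _ in range(iterations):
        n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - E2 * n / (n + h)))
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    h = p / math.cos(lat) - n
    return math.degrees(lat), math.degrees(lon), h


# First record of wgs84_rgbi.csv
lat, lon, height = ecef_to_geodetic(-2313174.974, -3717949.974, 4622885.034)
print(round(lat, 4), round(lon, 4), round(height, 1))
```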

You can also specify EPSG coordinate systems by setting the crs property to:

crs=epsg:<epsg code>

4.2.3 Site Based Data

If you have text values, such as a site identifier, specify type="string" for that field.

site_time_value.csv

#
#
#fields=latitude[unit="degrees"],longitude[unit="degrees"],site[type="string"],date[type="date" format="yyyy-MM-dd"],value
#
40,-107,site id 1,2001-01-01,-0.931363
45,-110,site id 2,2001-02-01,-0.930391
40,-107,site id 1,2001-03-01,-0.95
35,-120,site id 3,2001-04-01,-0.96

It's often the case that a file holds data for a single site, and the site and lat/long are implicit in a header, etc. In these cases we still want to be able to access the site and location as we read the data. So, we define a fake field with a value="..." attribute that supplies the value instead of a column in the file.

fixed_site.csv

#
#
#fields=latitude[unit="degrees" value="40"],longitude[unit="degrees" value="-107"],site[type="string" value="site id"],date[type="date" format="yyyy-MM-dd"],value
#
2001-01-01,-0.931363
2001-02-01,-0.930391
2001-03-01,-0.95
2001-04-01,-0.96

Likewise, you can specify the time value:

fixed_time.csv

#
#
#fields=latitude[unit="degrees" value="40"],longitude[unit="degrees" value="-107"],site[type="string" value="site id"],date[type="date" format="yyyy-MM-dd" value="2001-01-01"],value
#
-0.931363
-0.930391
-0.95
-0.96
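
One way to picture the value="..." attribute: fields that carry it take their value from the field definition, and only the remaining fields consume columns from each data row. The Python sketch below illustrates that reading using the fixed_site.csv definition; build_records is a hypothetical helper, not part of RAMADDA.

```python
def build_records(fields, rows):
    """fields: list of (name, attrs); rows: column values read from the file."""
    records = []
    for row in rows:
        columns = iter(row)
        record = {}
        for name, attrs in fields:
            # A field with a value attribute is not read from the file
            record[name] = attrs["value"] if "value" in attrs else next(columns)
        records.append(record)
    return records


fields = [
    ("latitude", {"unit": "degrees", "value": "40"}),
    ("longitude", {"unit": "degrees", "value": "-107"}),
    ("site", {"type": "string", "value": "site id"}),
    ("date", {"type": "date", "format": "yyyy-MM-dd"}),
    ("value", {}),
]
rows = [["2001-01-01", "-0.931363"], ["2001-02-01", "-0.930391"]]

records = build_records(fields, rows)
for record in records:
    print(record)
```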

You can also specify a pattern that is applied to the text in the header to extract out latitude, longitude, elevation, etc.

patternexample.csv

#
#fields=Site_Id[ type="string"   pattern="ID:\s(.*)ARGOS:"  ], Latitude[ pattern="Lat:\s(.*)Lon:"  ], Longitude[ pattern="Lon:\s(.*)Elev:"  ], Elevation[pattern="Elev:(.*)"  ], value
#
#
#Year: 2011  Month: 02  ID: KMS  ARGOS: 21364  Name: Kominko-Slade       
#Lat: 79.47S  Lon: 112.11W  Elev: 1801m
#
1
2
3
4
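
To see what each pattern captures, you can try the same regular expressions against the header text outside of RAMADDA. The Python sketch below does that with the re module, whose syntax agrees with these particular patterns. Searching the whole header at once and trimming whitespace from the captured group are assumptions about how the match is applied.

```python
import re

header = (
    "#Year: 2011  Month: 02  ID: KMS  ARGOS: 21364  Name: Kominko-Slade\n"
    "#Lat: 79.47S  Lon: 112.11W  Elev: 1801m\n"
)

# The pattern attributes from the fields definition above
patterns = {
    "Site_Id": r"ID:\s(.*)ARGOS:",
    "Latitude": r"Lat:\s(.*)Lon:",
    "Longitude": r"Lon:\s(.*)Elev:",
    "Elevation": r"Elev:(.*)",
}

extracted = {}
for name, pattern in patterns.items():
    match = re.search(pattern, header)
    if match:
        # The first capture group holds the value; trim surrounding spaces
        extracted[name] = match.group(1).strip()

print(extracted)
```

The captured strings still carry the hemisphere letter and the unit suffix; turning them into numbers is a separate step that this sketch does not attempt.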

4.2.4 Property Files

If you have text point files that are in some pre-existing format (i.e., you can't add "#" properties) then you can specify the properties in a separate properties file. Let's assume we have a simple CSV file with 4 columns:

example1.csv

latitude, longitude, date, value
40.0,-107,2012/10/12,  -0.931363
45.0,-110.0,2012/10/12, -0.930391

We can read this file with just a point.properties file. The key properties are delimiter, skiplines and fields.

example1.csv.properties


#Define the column delimiter
delimiter=,

#number of lines in header to skip
skiplines=1

#fields definition
fields=latitude[unit="degrees"],longitude[unit="degrees"],date[type="date" format="yyyy/MM/dd"],value[unit="some unit"]

example1_alt.csv.properties


#number of lines in header to skip
skiplines=1

#An alternative method to define the attributes of the fields
#define the field names here
fields=latitude,longitude,date,value


date.type=date
date.format=yyyy/MM/dd



#define the attributes with 
#field.<field name>.<attribute name>=<value>

field.latitude.unit=degrees
field.longitude.unit=degrees
field.date.type=date
field.date.format=yyyy/MM/dd
field.value.unit=some unit

#One can also set searchable and chartable attributes for ramadda's use

field.value.searchable=true
field.value.chartable=true
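
The Python sketch below shows roughly how the delimiter, skiplines and fields properties drive the reading of example1.csv. It is an illustration rather than RAMADDA's reader: the function names are hypothetical, the default delimiter of "," is an assumption, and the fields value is expected in the plain comma-separated form used in example1_alt.csv.properties.

```python
import csv


def load_properties(text):
    """Parse name=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        props[name.strip()] = value.strip()
    return props


def read_points(csv_text, props):
    """Return one dict per data record, keyed by field name."""
    skip = int(props.get("skiplines", 0))
    delimiter = props.get("delimiter", ",")   # assumed default
    names = [n.strip() for n in props["fields"].split(",")]
    lines = csv_text.splitlines()[skip:]
    reader = csv.reader(lines, delimiter=delimiter, skipinitialspace=True)
    return [dict(zip(names, (v.strip() for v in row))) for row in reader if row]


properties_text = """
#number of lines in header to skip
skiplines=1
fields=latitude,longitude,date,value
field.date.type=date
field.date.format=yyyy/MM/dd
"""

csv_text = """latitude, longitude, date, value
40.0,-107,2012/10/12,  -0.931363
45.0,-110.0,2012/10/12, -0.930391
"""

props = load_properties(properties_text)
points = read_points(csv_text, props)
for point in points:
    print(point)
```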

4.2.5 Integrating with RAMADDA

You can upload an arbitrary CSV point file and its accompanying properties file. When you are logged in, go to File->New Entry. Under the Point Data list choose Text Point Data. Specify the properties in the text field or upload the properties file.

You can also define a new entry type in RAMADDA for your point data. Embed the properties in a property tag named "record.properties". Install the types.xml file as a plugin.

exampletypes_.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<types>

    <type
     description="Test data"
     handler="org.ramadda.data.services.PointTypeHandler"
     name="type_point_test"
     super="type_point">

         <property name="record.file.class" value="org.ramadda.data.point.text.CsvFile"/>

         <property name="record.properties">
delimiter=
position.required=false
skiplines=1
dateformat=yyyy/MM/dd HH:mm:ss
fields=date[type="string" isdate="true"],time[type="string" istime="true"],value[searchable="true" chartable="true" unit="some unit"]
     </property>

      </type>

</types>