Csv
A comma-separated values (csv) file stores tabular data (numbers and text) in plain text.
It is a common data exchange format that is widely supported.
The GeoDMS supports two ways of reading csv files:
- gdal.vect: for most csv files we advise using gdal.vect, see the next subparagraph.
- TableChopper: we advise using the TableChopper:
  - if your csv file contains separators other than comma or semicolon, or
  - if your csv file is very large (for performance reasons).
The following example shows how to read a .csv file with the gdal.vect StorageManager.
unit<uint32> pc6
:  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
,  StorageType     = "gdal.vect"
,  StorageReadOnly = "True"
{
}
All attributes from the csv file are read. By default, the attribute names are derived from the first (header) row and all attributes get the value type string. Use conversion functions to cast the data to the requested values units.
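As a minimal sketch of such a cast: assuming the csv file has a column named inhabitants (a hypothetical name, not taken from the actual source data), a typed attribute could be derived from the string attribute with the uint32 conversion function.

```
// Sketch only: 'inhabitants' is a hypothetical column name in pc6_data.csv.
// gdal.vect reads it as a string attribute of the pc6 unit;
// uint32(...) converts the string values to unsigned integers.
attribute<uint32> nr_inhabitants (pc6) := uint32(pc6/inhabitants);
```

Other conversion functions, such as float32, would be applied in the same way.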
If a .csvt file is available, attributes can be configured directly with value types that match the data. A .csvt file can be written with gdalwrite.vect, see the write examples.
In earlier GeoDMS versions a geometry attribute was added, usually containing null values. This attribute can be ignored.
gdal.vect supports both comma-separated and semicolon-separated csv files; no separator has to be configured for reading data.
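To illustrate, the sketch below reads a semicolon-separated and a comma-separated file with identical settings. The file names are those produced by the write examples further down this page; the container and unit names are hypothetical.

```
container read_both_separators
{
    // semicolon-separated file, no separator configured
    unit<uint32> from_semicolon
    :  StorageName     = "%localDataProjDir%/export_semicolon.csv"
    ,  StorageType     = "gdal.vect"
    ,  StorageReadOnly = "True";

    // comma-separated file, same configuration
    unit<uint32> from_comma
    :  StorageName     = "%localDataProjDir%/export_comma.csv"
    ,  StorageType     = "gdal.vect"
    ,  StorageReadOnly = "True";
}
```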
GDAL Options can be configured for reading different csv files.
See: https://gdal.org/drivers/vector/csv.html#open-options for a full list of all open options.
Examples on how to configure read options:
container noheader
{
    // one open option: do not treat the first row as a header
    unit<uint32> optionSet := range(uint32, 0, 1);
    attribute<string> GDAL_Options (optionSet) : ['HEADERS=NO'];

    unit<uint32> pc6_ignore_header
    :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
    ,  StorageType     = "gdal.vect"
    ,  StorageReadOnly = "True";
}

container emptyvalues
{
    // one open option: read empty cells as null instead of empty strings
    unit<uint32> optionSet := range(uint32, 0, 1);
    attribute<string> GDAL_Options (optionSet) : ['EMPTY_STRING_AS_NULL=YES'];

    unit<uint32> pc6_empty_string_as_null
    :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
    ,  StorageType     = "gdal.vect"
    ,  StorageReadOnly = "True";
}
The first example configures a source file in which the header is ignored. The resulting field names will then be: field_1, field_2, ..., field_n.
The second example configures how empty cells are treated. By default they become empty strings; with the option EMPTY_STRING_AS_NULL=YES they become null values.
Multiple GDAL_Options can be configured in your optionSet; use a comma as separator between them.
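A sketch of how the two read options above might be combined in one optionSet, following the same pattern as the three-option write example on this page. The container and unit names are hypothetical, and the range of the optionSet is assumed to have to match the number of options.

```
container noheader_emptyvalues
{
    // two options, so the optionSet has two elements
    unit<uint32> optionSet := range(uint32, 0, 2);
    attribute<string> GDAL_Options (optionSet) : ['HEADERS=NO', 'EMPTY_STRING_AS_NULL=YES'];

    unit<uint32> pc6_combined
    :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
    ,  StorageType     = "gdal.vect"
    ,  StorageReadOnly = "True";
}
```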
We advise using different containers for configuring csv files with different open options.
The GeoDMS supports two ways of writing csv files:
- gdalwrite.vect: since GeoDMS version 7408, we advise using gdalwrite.vect to write most csv files, see the next subparagraph.
- TableComposer: we advise using the TableComposer:
  - if you need more flexibility in the contents of the header line(s) and/or body text, or
  - if you want a separator other than semicolon or comma, or
  - if your csv file is very large (for performance reasons).
The following example shows how to write a .csv file with the gdalwrite.vect StorageManager.
unit<uint32> pc6_export := src/pc6
,  StorageName     = "%localDataProjDir%/export_semicolon.csv"
,  StorageType     = "gdalwrite.vect"
,  StorageReadOnly = "false"
{
    attribute<uint32>  IntegerAtt := const(1, .);
    attribute<float32> FloatAtt   := const(1f, .);
    attribute<string>  StringAtt  := const('A', .);
    attribute<bool>    BoolAtt    := const(true, .);
}
Attributes of all value types, except for value types of the point group, are written to the csv file. This applies to both direct and indirect subitems.
The resulting file will contain one header line with the name of each attribute. By default a semicolon is used as separator and all attributes are double-quoted.
GDAL Options can be configured for writing different csv files.
See: https://gdal.org/drivers/vector/csv.html#layer-creation-options for a full list of all creation options.
Examples on how to configure write options:
container comma
{
    // one layer creation option: use a comma as separator
    unit<uint32> optionSet := range(uint32, 0, 1);
    attribute<string> GDAL_LayerCreationOptions (optionSet) : ["SEPARATOR=COMMA"];

    unit<uint32> pc6_export := src/pc6
    ,  StorageName     = "%localDataProjDir%/export_comma.csv"
    ,  StorageType     = "gdalwrite.vect"
    ,  StorageReadOnly = "false"
    {
        attribute<uint32>  IntegerAtt := const(1, .);
        attribute<float32> FloatAtt   := const(1f, .);
        attribute<string>  StringAtt  := const('A', .);
        attribute<bool>    BoolAtt    := const(true, .);
    }
}

container geometry_as_wkt
{
    // three layer creation options: WKT geometry, geometry column name, .csvt file
    unit<uint32> optionSet := range(uint32, 0, 3);
    attribute<string> GDAL_LayerCreationOptions (optionSet) : ["GEOMETRY=AS_WKT", "GEOMETRY_NAME=GEOMETRY", "CREATE_CSVT=YES"];

    unit<uint32> poly := EsriShape/Polygon
    ,  StorageName     = "%localDataProjDir%/poly.csv"
    ,  StorageType     = "gdalwrite.vect"
    ,  StorageReadOnly = "false"
    {
        attribute<fpoint> geometry (poly) := EsriShape/Polygon/Geometry;
        attribute<string> label           := EsriShape/Polygon/Label;
    }
}
The first example shows how to configure a csv file with a comma as separator.
The second example shows how you can also write a vector geometry (point, line or polygon) to a csv file. The data will be written in well-known text (WKT) format.
The second option in this example configures the name of the column in the csv file that holds the WKT. The third option indicates that a .csvt file is also written for the exported attributes.
We advise using different containers for configuring csv files with different creation options.
GeoDMS ©Object Vision BV. Source code distributed under GNU GPL-3. Documentation distributed under CC BY-SA 4.0.