eoudejans edited this page Nov 29, 2023 · 46 revisions

A comma-separated values (csv) file stores tabular data (numbers and text) in plain text.

It is a common data exchange format that is widely supported.

Read

The GeoDMS supports two ways of reading csv files:

  • gdal.vect: we advise using gdal.vect for most csv files, see the next subparagraph.
  • TableChopper: we advise using the TableChopper:
    • if your csv file contains separators other than comma or semicolon, or
    • if your csv file is very large (for performance reasons).

gdal.vect

The following example shows how to read a .csv file with the gdal.vect StorageManager.

example

unit<uint32> pc6
:  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
,  StorageType     = "gdal.vect"
,  StorageReadOnly = "True"
{
}

All attributes in the csv file are read. By default, the attribute names are derived from the first (header) row and the value type of every attribute is string. Use conversion functions to cast the data to the requested values units.
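The conversion step could look like the sketch below. It assumes the csv file has columns named inhabitants and avg_income; these names are hypothetical and only serve as illustration, so replace them with names from your own header row. The container name csv_conversion is also made up for this example.

```
container csv_conversion
{
   unit<uint32> pc6
   :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
   ,  StorageType     = "gdal.vect"
   ,  StorageReadOnly = "True"
   {
   }

   // pc6/inhabitants and pc6/avg_income are hypothetical string attributes read from the csv file
   attribute<uint32>  inhabitants_count (pc6) := uint32(pc6/inhabitants);
   attribute<float32> avg_income_value  (pc6) := float32(pc6/avg_income);
}
```

The converted attributes keep pc6 as domain unit, so they can be used next to the original string attributes.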

If a .csvt file is available, attributes can be configured directly with the value types that match the data. A .csvt file can be written with gdalwrite.vect, see the write examples.

In earlier GeoDMS versions a geometry attribute was added, usually containing null values. This attribute can be ignored.

gdal.vect supports comma-separated and semicolon-separated csv files. No separator has to be configured for reading data.

options

GDAL Options can be configured for reading different csv files.

See: https://gdal.org/drivers/vector/csv.html#open-options for a full list of all open options.

Examples of how to configure read options:
container noheader
{
   unit<uint32>      optionSet := range(uint32, 0, 1);
   attribute<string> GDAL_Options (optionSet) : ['HEADERS=NO'];

   unit<uint32> pc6_ignore_header
   :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
   ,  StorageType     = "gdal.vect"
   ,  StorageReadOnly = "True";
}

container emptyvalues
{
   unit<uint32>      optionSet := range(uint32, 0, 1);
   attribute<string> GDAL_Options (optionSet) : ['EMPTY_STRING_AS_NULL=YES'];

   unit<uint32> pc6_empty_string_as_null
   :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
   ,  StorageType     = "gdal.vect"
   ,  StorageReadOnly = "True";
}

The first example configures a source file in which the header is ignored. The resulting field names will then be: field_1, field_2, ..., field_n.

The second example configures how empty cells are treated. By default they become empty strings. With the option EMPTY_STRING_AS_NULL=YES they become null values.

Multiple GDAL_Options can be configured in your optionSet. Use a comma as separator between the options.
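As a sketch, the two read options from the examples above could be combined as follows. The container name and the unit name pc6_combined are made up for this example, and the upper bound of the range is assumed to equal the number of options, following the pattern of the geometry_as_wkt write example on this page.

```
container noheader_emptyvalues
{
   // two options, so the optionSet has two elements
   unit<uint32>      optionSet := range(uint32, 0, 2);
   attribute<string> GDAL_Options (optionSet) : ['HEADERS=NO', 'EMPTY_STRING_AS_NULL=YES'];

   unit<uint32> pc6_combined
   :  StorageName     = "%SourceDataDir%/CBS/pc6_data.csv"
   ,  StorageType     = "gdal.vect"
   ,  StorageReadOnly = "True";
}
```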

We advise using separate containers for csv files that need different open options.

Write

The GeoDMS supports two ways of writing csv files:

  • gdalwrite.vect: since GeoDMS version 7408, we advise using gdalwrite.vect to write most csv files, see the next subparagraph.
  • TableComposer: we advise using the TableComposer:
    • if you need more flexibility in the contents of the header line(s) and/or body text, or
    • if you want a separator other than semicolon or comma, or
    • if your csv file is very large (for performance reasons).

gdalwrite.vect

The following example shows how to write a .csv file with the gdalwrite.vect StorageManager.

example

unit <uint32> pc6_export := src/pc6
,  StorageName     = "%localDataProjDir%/export_semicolon.csv"
,  StorageType     = "gdalwrite.vect"
,  StorageReadOnly = "false"
{
   attribute<uint32>  IntegerAtt := const(1, .);
   attribute<float32> FloatAtt   := const(1f, .);
   attribute<string>  StringAtt  := const('A', .);
   attribute<bool>    BoolAtt    := const(true, .);
}

Attributes of all value types, except value types of the point group, are written to the csv file. This applies to both direct and indirect subitems.

The resulting file contains one header line with the name of each attribute. By default, a semicolon is used as separator and all attributes are double quoted.

options

GDAL Options can be configured for writing different csv files.

See: https://gdal.org/drivers/vector/csv.html#layer-creation-options for a full list of all creation options.

Examples of how to configure write options:
container comma
{
   unit<uint32> optionSet := range(uint32, 0, 1);
   attribute<string> GDAL_LayerCreationOptions (optionSet) : ["SEPARATOR=COMMA"];

   unit<uint32> pc6_export := src/pc6
   ,  StorageName     = "%localDataProjDir%/export_comma.csv"
   ,  StorageType     = "gdalwrite.vect"
   ,  StorageReadOnly = "false"
   {
      attribute<uint32>  IntegerAtt := const(1, .);
      attribute<float32> FloatAtt   := const(1f, .);
      attribute<string>  StringAtt  := const('A', .);
      attribute<bool>    BoolAtt    := const(true, .);
   }
}

container geometry_as_wkt
{
   unit<uint32> optionSet := range(uint32, 0, 3);
   attribute<string> GDAL_LayerCreationOptions (optionSet) : 
      ["GEOMETRY=AS_WKT", "GEOMETRY_NAME=GEOMETRY", "CREATE_CSVT=YES"];

   unit<uint32> poly := EsriShape/Polygon
   ,  StorageName     = "%localDataProjDir%/poly.csv"
   ,  StorageType     = "gdalwrite.vect"
   ,  StorageReadOnly = "false"
   {
      attribute<fpoint> geometry (poly) := EsriShape/Polygon/Geometry;
      attribute<string> label           := EsriShape/Polygon/Label;
   }
}

The first example shows how to configure a csv file with a comma as separator.

The second example shows how to write a vector geometry (point, line or polygon) to a csv file. The data is written as well-known text (WKT).

The second option in this example sets the name of the column in the csv file that contains the WKT. The third option indicates that a .csvt file, which lists the value types of the exported attributes, is also written.

We advise using separate containers for csv files that need different creation options.
