Object Design for Metadata and Results
Tom Kwong edited this page Feb 23, 2018 · 13 revisions
See issues
ResultSet is the primary object that represents data returned from reading a SAS data file. ResultSet implements the Base.Iteration interface as well as the IterableTables.jl interface.
Fields
- columns: a vector of columns, each being a vector itself
- names: a vector of column symbols
- size: a tuple (nrows, ncols)
Accessors
columns(::ResultSet)
names(::ResultSet)
size(::ResultSet)
size(::ResultSet, dim::Integer)
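The fields and accessors above can be sketched roughly as follows. This is a minimal illustration for current Julia (1.x), not the actual SASLib.jl source: the type name DemoResultSet, the field types, and the choice to iterate row by row are all assumptions made for the example.

```julia
# Hypothetical stand-in for ResultSet; not the actual SASLib.jl definition.
struct DemoResultSet
    columns::Vector{AbstractVector}   # one vector per column
    names::Vector{Symbol}             # column names
    size::Tuple{Int,Int}              # (nrows, ncols)
end

# Accessors: Base.names and Base.size are extended, columns is a new function.
columns(rs::DemoResultSet) = rs.columns
Base.names(rs::DemoResultSet) = rs.names
Base.size(rs::DemoResultSet) = rs.size
Base.size(rs::DemoResultSet, dim::Integer) = rs.size[dim]

# Row iteration (an assumption): each step yields one row as a tuple.
Base.length(rs::DemoResultSet) = size(rs, 1)
function Base.iterate(rs::DemoResultSet, i::Int = 1)
    i > size(rs, 1) && return nothing
    row = Tuple(col[i] for col in rs.columns)
    return (row, i + 1)
end

# Made-up sample data for illustration only.
rs = DemoResultSet(AbstractVector[[1.0, 2.0], ["a", "b"]], [:x, :y], (2, 2))
rows = collect(rs)   # a vector holding one tuple per row
```

Storing the data column-wise keeps each column a homogeneous vector, while the iterator presents the same data row-wise for consumers that expect rows.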
Single Row/Column Indexing
- rs[i] returns a tuple for row i
- rs[:c] returns a vector for column c
Multiple Row/Column Indexing
- rs[i:j] returns a view of ResultSet with rows between i and j
- rs[:c...] returns a view of ResultSet with the specified columns, e.g. rs[:A, :B]
Cell Indexing
- rs[i,j] returns a single value for row i, column j
- rs[i,:c] returns a single value for row i, column c
- A specific cell can be assigned using the above cell indexing methods
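The indexing rules above map naturally onto getindex and setindex! methods. The sketch below is one possible way to express them in Julia 1.x. The type IndexDemo and the helper colindex are hypothetical names for this example, and the real ResultSet may be implemented differently.

```julia
# Hypothetical stand-in type; not the actual SASLib.jl ResultSet.
struct IndexDemo
    columns::Vector{AbstractVector}
    names::Vector{Symbol}
end

# Hypothetical helper: position of a column by name (nothing if absent).
colindex(rs::IndexDemo, c::Symbol) = findfirst(isequal(c), rs.names)

# Single row -> tuple; single column -> vector.
Base.getindex(rs::IndexDemo, i::Integer) = Tuple(col[i] for col in rs.columns)
Base.getindex(rs::IndexDemo, c::Symbol) = rs.columns[colindex(rs, c)]

# Row range and column selection share the underlying data instead of copying.
Base.getindex(rs::IndexDemo, r::UnitRange{Int}) =
    IndexDemo(AbstractVector[view(col, r) for col in rs.columns], rs.names)
Base.getindex(rs::IndexDemo, cs::Symbol...) =
    IndexDemo(AbstractVector[rs[c] for c in cs], collect(cs))

# Cell read and cell assignment, by column position or by column name.
Base.getindex(rs::IndexDemo, i::Integer, j::Integer) = rs.columns[j][i]
Base.getindex(rs::IndexDemo, i::Integer, c::Symbol) = rs[c][i]
Base.setindex!(rs::IndexDemo, v, i::Integer, j::Integer) = (rs.columns[j][i] = v)
Base.setindex!(rs::IndexDemo, v, i::Integer, c::Symbol) = (rs[c][i] = v)

# Made-up sample data for illustration only.
rs = IndexDemo(AbstractVector[[1.0, 2.0, 3.0], ["a", "b", "c"]], [:x, :y])
rs[2]              # (2.0, "b")
rs[:y]             # ["a", "b", "c"]
sub = rs[2:3]      # rows 2 and 3, backed by views of the original columns
rs[1, :x] = 10.0   # assigns a single cell
```

Because a single Symbol argument is more specific than a Symbol vararg, rs[:y] dispatches to the column method while rs[:y, :x] dispatches to the column-selection method.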
Usage
rs = readsas("abc.sas7bdat")
df = DataFrame(columns(rs), names(rs))
size(rs)
Note: A design decision was made here to NOT include file-specific metadata in the result set.
Metadata contains information about a SAS data file.
Fields
- filename: file name/path of the SAS data set
- encoding: file encoding e.g. "ISO8859-1"
- endianness: either :LittleEndian or :BigEndian
- compression: could be :RLE, :RDC, or :none
- pagesize: size of each data page in bytes
- npages: number of pages in the file
- nrows: number of data rows in the file
- ncols: number of data columns in the file
- columnsinfo: vector of column symbols and their respective types (Float64 or String)
Usage
metadata("abc.sas7bdat")
handler = SASLib.open("abc.sas7bdat")
metadata(handler)
Note: Direct field access is encouraged for metadata.
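As a rough illustration of the Metadata fields and of direct field access, the following sketch defines a comparable struct. The name DemoMetadata, the field types, the use of Symbol => type pairs for columnsinfo, and all sample values are assumptions for this example rather than the library's actual definition.

```julia
# Hypothetical sketch of the Metadata fields; all types here are assumptions.
struct DemoMetadata
    filename::String
    encoding::String
    endianness::Symbol        # :LittleEndian or :BigEndian
    compression::Symbol       # :RLE, :RDC, or :none
    pagesize::Int             # bytes per data page
    npages::Int
    nrows::Int
    ncols::Int
    columnsinfo::Vector{Pair{Symbol,DataType}}   # column name => element type
end

# Made-up values for illustration only.
md = DemoMetadata("abc.sas7bdat", "ISO8859-1", :LittleEndian, :none,
                  4096, 3, 100, 2, [:x => Float64, :y => String])

# Direct field access, with no accessor functions needed.
md.nrows
md.compression
colnames = first.(md.columnsinfo)   # column symbols only
```

An immutable struct fits here because the metadata describes a file on disk and is not expected to change after it has been read.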
- readsas returns ::ResultSet
- read returns ::ResultSet
- metadata (a new function) returns ::Metadata