openoakland · Ethan-bradley · Mar 28, 2018 · Apr 9, 2018
diff --git a/.gitignore b/.gitignore
@@ -4,3 +4,7 @@ shifts/*
 *.txt
 yml_template/*
 !requirements.txt
+env/*
+.idea/*
+
+*.iml
diff --git a/README.md b/README.md
@@ -22,26 +22,27 @@ This data was collected in a series of surveyor shifts in which a surveyor colle
 
 ### Generating the files
 
+Run `python get_shifts.py`. It creates a directory called "shifts" and writes the CSV files to it.
 
-```bash
-docker-compose up
+The name of each CSV file designates the device name (A, B, etc.) and the expected range of readings, for reference. Columns include timestamp, lat/long, the filter size used on this shift, and the PM (particulate matter) reading.
 
-#creates a directory called 'shifts' and writes CSVs to it
-python scripts/get_shifts.py
-```
+Orphaned PM and GPS data is *not* included. That is, only readings that contain both PM and lat/long will be present in these files.
 
+### Joining Air Quality Data with GPS Data
 
-The name of each CSV file designates the device name (A, B, etc) and the expected range of readings, for reference. Columns include timestamp, lat/long, the filter size used on this shift, and PM (particulate matter) reading.
+The `files/joiner.py` script joins air quality data with GPS data. The air quality data should be a CSV file, as produced by a DustTrak II device (see examples/8530C_2-5_002.csv). The GPS data should be a log file containing NMEA sentences (see examples/GPS_20140717_193858_8530C.log). Additionally, pass an empty file that will become the output CSV file. If necessary, this file can be obtained by calling the getter `.getFile()` on the object. The steps listed below for obtaining a file will also work.
 
-Orphaned PM and GPS data is *not* included. That is, only readings that contain both PM and lat/long will be present in these files.
+The output of the script is a CSV file containing both GPS and air quality data (see examples/joiner-output.csv).
 
+The script can be run as `files/joiner.py --aq <air-quality-file.csv> --gps <gps-file.log> --out <output.csv> --tolerance 1 --filter 2.5`
 
+The command line options are as follows:
+
+- `-a`/`--aq` : the path to the CSV file containing air quality data
+- `-g`/`--gps` : the path to the log file containing GPS data
+- `-o`/`--out` : the path to where the output file will be created
+- `-t`/`--tolerance` : the maximum difference (in seconds) between an air quality datum and a GPS datum for the two to be joined. For example, if there is an air quality datum at 12:00:00 and the closest GPS datum is at 12:00:02, the output includes a row combining the two if the value of `-t` is >= 2; otherwise that air quality datum is dropped.
+- `-f`/`--filter` : the size of the filter used to collect the air quality data (2.5 is most common; 10 may also be used). This is often embedded in the filename, e.g., for 8530C_2-5_002.csv the filter size was 2.5.
 
-```bash
-# merges shifts from the same month into the shift_by_month directory
-scripts/get_shift_by_month.sh
 
-# Generates markdown pages for jekyll. Writes them to _posts
-scripts/make_markdown.sh
-```
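
Note on `--tolerance`: the matching rule the README describes for `-t` can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `files/joiner.py`. The function name `join_by_tolerance`, the tuple layouts, and the sample coordinates and PM value are all hypothetical, and it assumes the GPS rows are already sorted by timestamp.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def join_by_tolerance(aq_rows, gps_rows, tolerance_s):
    """Pair each air quality row with its nearest GPS row, if close enough.

    aq_rows:  list of (timestamp, pm) tuples (hypothetical layout)
    gps_rows: list of (timestamp, lat, lon) tuples, sorted by timestamp
    tolerance_s: maximum allowed time difference in seconds
    """
    gps_times = [row[0] for row in gps_rows]
    limit = timedelta(seconds=tolerance_s)
    joined = []
    for ts, pm in aq_rows:
        # Index of the first GPS fix at or after ts
        i = bisect_left(gps_times, ts)
        # The nearest fix is either the one just before or just after ts
        candidates = [j for j in (i - 1, i) if 0 <= j < len(gps_rows)]
        if not candidates:
            continue  # no GPS data at all
        nearest = min(candidates, key=lambda j: abs(gps_times[j] - ts))
        if abs(gps_times[nearest] - ts) <= limit:
            _, lat, lon = gps_rows[nearest]
            joined.append((ts, lat, lon, pm))
        # Otherwise the air quality row is dropped as an orphan
    return joined


# Made-up sample values mirroring the README's 12:00:00 / 12:00:02 example
aq = [(datetime(2014, 7, 17, 12, 0, 0), 0.015)]
gps = [(datetime(2014, 7, 17, 12, 0, 2), 37.80, -122.27)]

kept = join_by_tolerance(aq, gps, tolerance_s=2)     # one joined row
dropped = join_by_tolerance(aq, gps, tolerance_s=1)  # empty list
```

With a tolerance of 2 the readings two seconds apart are joined. With a tolerance of 1 the air quality row is dropped, which matches the behavior described for `-t`.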