Python3 porting #510

ircwaves · 2019-09-04T14:49:48Z

Thanks @ags-tolson and @bhbraswell for getting started on this. We anticipate needing to have some discussion on some facets of this. We're off to a good start. https://pythonclock.org/

(also see #376)

ra-tolson · 2019-09-04T14:55:11Z

So for now I'm working in private so I can rebase as-needed.

@bhbraswell, whenever we want to start collaborating, I can push, and if you want to see my commits sooner please just let me know.

In the mean time, here's the rough plan I'm working from at the moment:

make the existing test suite pass <-- in progress
merge in dev (now that we have a working test suite)
ensure landsat bqashadow product survives the merge, and unskip its sys test in std_process.py.
make the test suite pass again
figure out everything the test suite doesn't touch, and ideally, remedy that by writing tests (job 1 of at least two guides on converting to python 3 say to start with getting to high levels of code coverage with your test suite)
make those tests pass
figure out containerization (mostly, make the existing stuff in docker/, ie gips' "official" containerization, work for the test suite)
merge to dev

At some point in this process I think it'd be best for someone with scientific training to inspect some generated raster files, like products and exports, but I'm not sure when would be best. Computed stats look okay so far (eg std dev & mean only vary by a little), but obviously my eyes don't see as much as either of yours would.

ra-tolson · 2019-09-04T17:26:58Z

Another todo: Need to fix landsat bqashadow process sys test after we merge dev into the python 3 branch, because that product has been restored in the dev branch but was deleted entirely in the python 3 branch.

bhbraswell · 2019-09-09T20:25:07Z

@ags-tolson I'd be glad to begin looking at/ informally testing whenever you'd like to push your branch.

ra-tolson · 2019-09-09T21:26:08Z

Okay, the branch is named 510-py3-port; it has commits atop rob-upgrade. Here's the state of it:

drivers touched by the test suite (unit, integration, and system gips_process tests) now run under python 3 + gippy 1.0.
the unit & integration tests pass.
The drivers with gips_process system tests generally run and some of them pass their tests.
Some products are different enough in their outputs that I'm unsure if it's safe to accept the new behavior.
However, some of these tests don't pass in the dev branch in the first place. I'm looking into that presently, but I'm not sure it matters: If you guys like the new product files, that's good enough for me.

Drivers whose process system tests pass:

sar
prism
most of hls (but not msavi2)
landsat (but bqashadow is a special case)

These drivers are broken in their process system tests:

    merra -- the whole suite is different basically (no diff handy)
    s2 easy to fix with --record:
            cfmask, evi-toa, and cloudmask
    s2 bustified, generally with excessive variation in mean/stddev or other values like data type:
        <Function t_process[sentinel2-crcm]>
        <Function t_process[sentinel2-mtci-toa]>
        <Function t_process[sentinel2-mtci]>
        <Function t_process[sentinel2-s2rep-toa]>
        <Function t_process[sentinel2-s2rep]>
    hls
        <Function t_process[hls-msavi2]>
            saved new expectation but variations a bit big:
                -    '  Minimum=0.000, Maximum=10160.000, Mean=2014.882, StdDev=590.982',
                +    '  Minimum=0.000, Maximum=10160.000, Mean=1972.869, StdDev=651.815',

Also print is a bad idea for code intended to act like a library, but verbose_out has no default verbosity level and doesn't let you specify arguments conveniently like print(), so I wrote vprint() as an alternative.

If you want to observe these changes in output, you can build a container suitable for running the test suite with indigo-ci.docker, provided the needed parent container is named indigo-gips (ie use the Dockerfile you added for that). gips/test/README.md tells how to configure & run the tests. I can forward you the needed assets to save you from fetching them; let me know what driver you want to start with.

I can also send along newly-generated product files for your perusal, too.

bhbraswell · 2019-09-10T01:04:29Z

Thanks @ags-tolson. I'll look at MSAVI2 and MERRA, and also HLS which actually would be a priority for me personally. Just a question, instead of gips-indigo, could we make a generic container than anyone could use?

ra-tolson · 2019-09-10T14:58:18Z

I think we could and should. I'm not sure when the best moment to sort containerization would be though.

ra-tolson · 2019-09-10T18:57:52Z

An update on system tests in the dev branch + gippy 0.3.11: The CI pipeline passes, and for process tests, the passing drivers are hls, modis, merra, most of landsat, and some of sentinel2. The landsat problem is minor, and the fixes in place in the 510 branch seem trustworthy, so that just leaves sentinel2.

Sentinel2's problematic products under dev + gippy 0.3.11 are mostly the same as for the gippy 1.0 branch: crcm, s2rep, and mtci. Meanwhile s2rep-toa and mtci-toa pass under gippy 0.3.11 and fail under gippy 1.0.

crcm deviates from the expectation but resembles the 0.3.11 outcome, and I'm told it looks okay graphically, which makes me think it's okay to accept it as-is (maybe just a stale expectation):

# sys test expectation:
-    '  Minimum--8561.000, Maximum-32767.000, Mean-2954.280, StdDev-3506.046',
# dev branch under gippy 0.3.11 outcome:
+    '  Minimum=-8652.000, Maximum=32767.000, Mean=2375.639, StdDev=3342.788',
# branch 510-py3-port under gippy 1.0 outcome:
+    '  Minimum--8624.000, Maximum-32767.000, Mean-2395.565, StdDev-3341.763',

mtci is acting like it got pushed up to its maximum somehow, maybe a problem with the numpy faff (it's computed manually, not a gippy.algorithms.indices call):

# sys test expectation:
-    '  Minimum=-30000.000, Maximum=29998.000, Mean=12443.862, StdDev=11332.215',
# dev branch under gippy 0.3.11 outcome:
+    '  Minimum=-30000.000, Maximum=29998.000, Mean=12388.818, StdDev=11309.554',
# branch 510-py3-port under gippy 1.0 outcome:
+    '  Minimum=-30000.000, Maximum=32767.000, Mean=32742.273, StdDev=1026.779',

mtci-toa looks very different from its expectation, but weirdly similar to the surface mtci's expectation & gippy 0.3.11 outcome:

# sys test expectation:
-    '  Minimum--7.000, Maximum-6.000, Mean--0.635, StdDev-5.246',
# branch 510-py3-port under gippy 1.0 outcome:
+    '  Minimum=-30000.000, Maximum=29999.000, Mean=12659.624, StdDev=12332.159',

s2rep is similarly showing a huge blow-up to the max and is similarly computed by the driver, not gippy:

# sys test expectation:
-    '  Minimum=-1.000, Maximum=17498.000, Mean=8152.291, StdDev=1299.719',
# dev branch under gippy 0.3.11 outcome:
+    '  Minimum=-1.000, Maximum=17498.000, Mean=8163.189, StdDev=1364.628',
# branch 510-py3-port under gippy 1.0 outcome:
+    '  Description = 1',
+    '  Minimum=-10000.000, Maximum=32767.000, Mean=32689.483, StdDev=1458.024',

s2rep-toa looks very different from its expectation but in the ballpark of the surface s2rep expectation and gippy 0.3.11 outcome, which makes me think the expectation is wrong:

# sys test expectation:
-    '  Minimum=-911.000, Maximum=1100.000, Mean=301.864, StdDev=724.321',
# branch 510-py3-port under gippy 1.0 outcome:
+    '  Minimum=1.000, Maximum=17499.000, Mean=8317.195, StdDev=1353.935',

Due to the way sentinel-2 works I suppose I'll look at their surface versions first, since for sentinel-2 it represents less processing.

ra-tolson · 2019-09-10T20:26:42Z

So this is odd; I have some memory of having to work back from the s2 ref product, which is really just opening the asset, to get to the ref-toa product, but looking at the code, I can see no evidence of this. Apparently I'm mistaken? Anyway I'm starting with mtci-toa instead since that's closer to the source.

ircwaves · 2019-09-10T20:38:05Z

So, the idea there is that ESA provides, ref-toa, and coeffs to get back to rad-toa. So, once we convert back to rad-toa, we apply the 6s based atmospheric correction coeffs to get to ref.

ra-tolson · 2019-09-13T15:30:13Z

For the py3 + gippy 1 version of mtci-toa, in the histogram, there's strange spikes near the 0.5 marks of the histogram (see especially spiky spike at 0.0):

But otherwise I guess it looks good, and close to atmo-corrected mtci in any case. Let me know if this isn't an acceptable artifact kind of situation.

ircwaves · 2019-09-13T15:39:15Z

That looks like a possible quantization effect, but I would defer to @bziniti on whether that is problematic. I believe she's the only one who has utilized MTCI.

bziniti · 2019-09-13T18:07:25Z

That looks like a possible quantization effect, but I would defer to @bziniti on whether that is problematic. I believe she's the only one who has utilized MTCI.

I used MTCI once for a forest defoliation mapping project. I'm not sure I understand what the problem is. @lledoux may be more familiar than I. Or maybe this will help? https://www.sciencedirect.com/science/article/pii/S092427161300107X

lledoux · 2019-09-13T18:23:47Z

I would tend to agree with @ircwaves , as it's too much of a coincidence that those spikes happen at every 0.5 step; though I am unsure as to whether it may cause issues later on if/when the data are used. Like @bziniti , I only used this product for mapping deforestation (a while ago). Sorry I'm not more help! Perhaps Torbick may be able to provide some input?

ircwaves · 2019-09-13T18:28:56Z

So, I'd say that MTCI is good for now, but maybe file an issue to look at the level of quantization being employed.

ra-tolson · 2019-09-17T16:00:18Z

So merra processing went okay; I'm pivoting to export tests.

Modis index exports are showing a strange geolocation thing; going from 71d12\'50.33"W to 71d12\'50.24"W is a distance of as much as 0.0015 miles, if I'm doing the math right, ~~which seems significant enough to matter~~ is like 2 meters. Does that matter? I don't think so?

____________________________ t_project[modis-ndvi] _____________________________
E             'Origin = (1777848.52615334,4934179.33097106)',
E             -    'Pixel Size = (100.00000000,-100.00000000)',
E             +    'Pixel Size = (99.88880776,-99.93806065)',

(lots of new metadata, but missing):
E             -    '  SourceFiles=h12v04_2012337_MCD_ndvi',

(possibly relevant for geolocation issue):
E             +    '  GRINGPOINTLATITUDE.1=39.78578782, 49.99719181, 50.07541801, 39.84112776',
E             +    '  GRINGPOINTLONGITUDE.1=-78.20833299, -93.38216574, -77.75056839, '
E             +    '-65.07807811',
E             +    '  GRINGPOINTSEQUENCENO.1=1, 2, 3, 4',

(geolocation issue):
E             'Corner Coordinates:',
E             -    'Upper Left  ( 1777848.526, 4934179.331) ( 71d12\'50.33"W, 43d27\'22.98"N)',
E             ?                                                         ^^
E             +    'Upper Left  ( 1777848.526, 4934179.331) ( 71d12\'50.24"W, 43d27\'22.98"N)',
E             ?                                                         ^^
E             -    'Lower Left  ( 1777848.526, 4868579.331) ( 71d21\'48.67"W, 42d53\'16.00"N)',
E             ?                                    ^^   --               -            ^ ^^
E             +    'Lower Left  ( 1777848.526, 4868619.963) ( 71d21\'48.26"W, 42d53\'17.27"N)',
E             ?                                    ^^  ++               +             ^ ^^
E             -    'Upper Right ( 1825148.526, 4934179.331) ( 70d39\' 9.32"W, 43d22\'31.31"N)',
E             ?                       ^^^ ^ -                        ^^ ^^              ^^
E             +    'Upper Right ( 1825095.932, 4934179.331) ( 70d39\'11.44"W, 43d22\'31.64"N)',
E             ?                       ^^^ ^^                         ^^ ^^              ^^
E             -    'Lower Right ( 1825148.526, 4868579.331) ( 70d48\'24.75"W, 42d48\'30.03"N)',
E             ?                       ^^^ ^ -      ^^   --            ^ -             ^ ^^
E             +    'Lower Right ( 1825095.932, 4868619.963) ( 70d48\'26.53"W, 42d48\'31.62"N)',
E             ?                       ^^^ ^^       ^^  ++             ^  +            ^ ^^
E             -    'Center      ( 1801498.526, 4901379.331) ( 71d 0\'34.22"W, 43d 7\'56.48"N)',
E             ?                         -----       ^  ^^^            ^ ^^            ^ ^^
E             +    'Center      ( 1801472.229, 4901399.647) ( 71d 0\'35.07"W, 43d 7\'57.27"N)',
E             ?                        +++++        ^  ^^^            ^ ^^            ^ ^^

(content issue):
E             'Band 1 Block=473x8 Type=Int16, ColorInterp=Gray',
E             -    '  Description = ndvi',
E             -    '  Minimum=-9125.000, Maximum=8199.000, Mean=5985.630, StdDev=1326.924',
E             ?                 ^^^                 ^              ^  --           ^^ ^^^
E             +    '  Minimum=-9068.000, Maximum=8191.000, Mean=5980.926, StdDev=1317.353',
E             ?                 ^^^                 ^              ^ ++            ^^ ^^^
E             '  NoData Value=-32768',
E             -    '  Unit Type: other',

ircwaves · 2019-09-17T17:00:50Z

The piece that jumps out there to me is the pixel size being different. That shouldn't have changed.

ircwaves · 2019-09-17T17:21:50Z

Not sure how many spots it comes up in, but my guess is that the issue is here:
https://github.com/gipit/gippy/blob/master/GIP/GeoAlgorithms.cpp#L254
where cookie_cutter just grabs the bounding box from the GeoFeature, and ignores the resolution that is provided to cookie_cutter. Then that code just pulls the bbox from this code:
https://github.com/gipit/gippy/blob/master/GIP/GeoResource.cpp#L101-L103
bounding box being used to compute the resolution being stuffed into the image, where it used to be explicitly passed in:
https://github.com/Applied-GeoSolutions/gippy/blob/master/GIP/GeoAlgorithms.cpp#L394-L401

So, I believe that there is a bug in this gippy use-case, but it also points out again that we need to add ALL_TOUCHED support to gippy 1.0 cookie cutter.

ra-tolson · 2019-09-19T14:02:03Z

Okay, MRPR #513 is up for the another round of effort. As of now:

The test suite has been completely exercised; tests generally pass, or else fail for a known, marked reason (ie, most significantly, cookie_cutter: xres yres ignored gipit/gippy#172).
Fetch & process for drivers not exercised by the test suite have been tested manually where possible (see 510 second round of python 3 + gippy 1.0 porting #513 for details). Turns out we got a priori broken code for some minor drivers.

Thus, fetch & process for most drivers has been verified under python 3 + gippy 1.0, plus some random other testing by our unit & integration tests. Next steps (updated Fri 20 sep --T):

* ~~probably, the ORM hasn't been exercised yet, so do that.~~ The ORM was covered by unit tests. It's almost as if writing unit tests saves you work in the long run. Huh.
* we have a few tools under gips/scripts/ that haven't been exercised (and tiles.py, which it might be a good moment to remove?): I think just export & stats need exercise, export_meta.py is being removed, and tiles.py is a candidate for removal.
* I have a list of approx ten other source files that I'm vetting to see if they need further testing based on likely coverage by prior work.
* containerization & install process testing
* re-run the whole test suite in the revised container / installation
* 🍺 ?

ircwaves · 2019-09-24T14:47:40Z

A note for containerization, need to add apt-get install python-dev (2.7) even though we're going python3, because gippy freaks out about needing python2.7/Python.h:

w/o python-dev

bash$ pip3 install gippy 
  [ ... snip ... ]
gippy/gippy_wrap.cpp:5040:34: fatal error: python2.7/Python.h: No such file or directory
  compilation terminated.

w/ python-dev

bash$ pip3 install gippy ; echo "exit status: $?"
[ ... snip ... ]
Collecting gippy
Using cached https://files.pythonhosted.org/packages/c4/58/c79b0213b3a601a5d921f00fc1b3b443edd4c31882d7f69271c793125d9c/gippy-1.0.3.tar.gz
Requirement already satisfied: numpy>=1.9 in /usr/local/lib/python3.5/dist-packages (from gippy) (1.17.1)
Building wheels for collected packages: gippy
Building wheel for gippy (setup.py) ... done
Created wheel for gippy: filename=gippy-1.0.3-cp35-cp35m-linux_x86_64.whl size=7543861 sha256=9c9ffd98059112d851be997e84aa702a10272f37f0a2017e53eb831cfd8cec60
Stored in directory: /root/.cache/pip/wheels/e2/0c/f9/1403921925e62f69cb4c23f3b579a4296de183c6f6aff07a95
Successfully built gippy
Installing collected packages: gippy
Successfully installed gippy-1.0.3
exit status: 0

ra-tolson · 2019-09-26T18:55:16Z

As of now that PR (#513) is mature enough to pass the CI test suite (unit + int + sys), which means technically it can be merged into dev if it passes code review. @ircwaves note that as you recommended mapreduce.py hasn't been removed in this branch.

As we consider merging, the key issues I know of are:

gips_export can't work perfectly due to: cookie_cutter: xres yres ignored gipit/gippy#172
I think people have used the CI container build as their dev builds; the process may be somewhat different at the moment (but still easy enough to go on I think).

I'm going to attempt a local dev merge, see how miserable it is.

ircwaves · 2019-09-27T00:57:20Z

That's great to hear! My current thinking is that we move forward with dev @ python3+gippy1, and keep master at python2.7 until dependent apps have been ported to python 3+gippy1. Raise any flags for anyone?

bhbraswell · 2019-09-27T13:00:16Z

Really impressive @ags-tolson. Definitely doesn't raise any warning flags for me.

ra-tolson · 2019-10-01T13:43:30Z

Merge complete + test suite passes; if dev doesn't move much, the 510 branch head now contains dev as well. There were several merge conflicts but hopefully test passage is a good sign.

A bunch of coreg changes came in, but I don't have a test suite for coreg generally. I attempted to follow @bhbraswell's pattern of changing calls to match gippy but it's possible I didn't catch everything. I didn't do manual testing in any case because I don't know enough about coreg to know if it worked or changed behavior.

Likewise there are more dockerfiles than before. There's going to be a little layer cake with Dockerfile on the bottom:

Dockerfile, basic gips container
docker/gips-ci.docker builds atop it and runs the CI suite
docker/gips-test-full.docker builds atop the CI docker image and lets you run every non-skipped test (mainly adds sixs; note I punted on adding in earthdata creds etc since that's never run using automation right now).

Suggest adding coreg stuff (ortho?) to gips-test-full.docker in the near future and instrumenting coreg code with tests. Suggest that effort should follow this one to keep the bloat down; we can't have everything right away with something this large.

ra-tolson · 2019-10-01T16:00:01Z

Okay, the mergeable pushed branch is 510-py3-port. I'll issue a PR shortly; it's big so I'll suggest a procedure for managing it when I do. Other notes:

gips-bin & bits of gips uniquely called by gips-bin are unported (eg mapreduce.py)
same for tofu (eg algorithm.py)
acolite is unported due to lack of funding or interest at the moment
modtran is unported due to believed lack of likely usage moving forward
hls msavi2 product is being looked at by @bhbraswell; saved new expectation but variations a bit big:

    -    '  Minimum=0.000, Maximum=10160.000, Mean=2014.882, StdDev=590.982',
    +    '  Minimum=0.000, Maximum=10160.000, Mean=1972.869, StdDev=651.815',

t_project & t_stats remain xfailed for now due to gippy issue related to bounding box & computed pixel size
gips_export --rastermask has an a priori bug we're tracking internally.
gips_mask doesn't work a priori; another bug filed internally.
master hotfixy code seems to be in sync with dev: git diff dev...master yielded no diff
will need to notify existing gips users working on the dev branch to switch to a legacy branch (internally I've pushed dev-py2) as a safe haven because:
install.sh isn't yet ready; however it can be done post facto given the safe haven (I can work on it while I'm waiting on the PR review).

ircwaves added the adaptation label Sep 4, 2019

ircwaves assigned bhbraswell, ircwaves and ra-tolson Sep 5, 2019

ra-tolson mentioned this issue Sep 16, 2019

mtci-toa shows signs of quantization error #511

Open

ra-tolson mentioned this issue Sep 18, 2019

510 second round of python 3 + gippy 1.0 porting #513

Closed

ra-tolson mentioned this issue Oct 1, 2019

510 python 3 + gippy 1.0 gips port #514

Merged

ircwaves closed this as completed in #514 Oct 2, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python3 porting #510

Python3 porting #510

ircwaves commented Sep 4, 2019

ra-tolson commented Sep 4, 2019 •

edited

Loading

ra-tolson commented Sep 4, 2019

bhbraswell commented Sep 9, 2019

ra-tolson commented Sep 9, 2019

bhbraswell commented Sep 10, 2019

ra-tolson commented Sep 10, 2019

ra-tolson commented Sep 10, 2019 •

edited

Loading

ra-tolson commented Sep 10, 2019 •

edited

Loading

ircwaves commented Sep 10, 2019

ra-tolson commented Sep 13, 2019

ircwaves commented Sep 13, 2019

bziniti commented Sep 13, 2019

lledoux commented Sep 13, 2019 •

edited

Loading

ircwaves commented Sep 13, 2019

ra-tolson commented Sep 17, 2019 •

edited

Loading

ircwaves commented Sep 17, 2019

ircwaves commented Sep 17, 2019 •

edited

Loading

ra-tolson commented Sep 19, 2019 •

edited

Loading

ircwaves commented Sep 24, 2019

ra-tolson commented Sep 26, 2019 •

edited

Loading

ircwaves commented Sep 27, 2019

bhbraswell commented Sep 27, 2019

ra-tolson commented Oct 1, 2019 •

edited

Loading

ra-tolson commented Oct 1, 2019

Python3 porting #510

Python3 porting #510

Comments

ircwaves commented Sep 4, 2019

ra-tolson commented Sep 4, 2019 • edited Loading

ra-tolson commented Sep 4, 2019

bhbraswell commented Sep 9, 2019

ra-tolson commented Sep 9, 2019

bhbraswell commented Sep 10, 2019

ra-tolson commented Sep 10, 2019

ra-tolson commented Sep 10, 2019 • edited Loading

ra-tolson commented Sep 10, 2019 • edited Loading

ircwaves commented Sep 10, 2019

ra-tolson commented Sep 13, 2019

ircwaves commented Sep 13, 2019

bziniti commented Sep 13, 2019

lledoux commented Sep 13, 2019 • edited Loading

ircwaves commented Sep 13, 2019

ra-tolson commented Sep 17, 2019 • edited Loading

ircwaves commented Sep 17, 2019

ircwaves commented Sep 17, 2019 • edited Loading

ra-tolson commented Sep 19, 2019 • edited Loading

ircwaves commented Sep 24, 2019

w/o python-dev

w/ python-dev

ra-tolson commented Sep 26, 2019 • edited Loading

ircwaves commented Sep 27, 2019

bhbraswell commented Sep 27, 2019

ra-tolson commented Oct 1, 2019 • edited Loading

ra-tolson commented Oct 1, 2019

ra-tolson commented Sep 4, 2019 •

edited

Loading

ra-tolson commented Sep 10, 2019 •

edited

Loading

ra-tolson commented Sep 10, 2019 •

edited

Loading

lledoux commented Sep 13, 2019 •

edited

Loading

ra-tolson commented Sep 17, 2019 •

edited

Loading

ircwaves commented Sep 17, 2019 •

edited

Loading

ra-tolson commented Sep 19, 2019 •

edited

Loading

ra-tolson commented Sep 26, 2019 •

edited

Loading

ra-tolson commented Oct 1, 2019 •

edited

Loading