- Access to my Google Account (good luck).
- Up-to-date Google API credentials and token (credentials.json in ./app/ or a safer place).
- A virtual environment w/ requirements installed.
- Put the GDoc Document object's JSON in `a.json`.
  - Option to check which docs are already downloaded so as not to waste bandwidth (and also compare `revisionId` to see if there were any changes).
- Save doc metadata somewhere.
- Make a stash of functions to handle different parts of it.
- First, the "hard part" (lol): search for all images and their positions (so, `PositionedObject`s).
  - Download all of the images to `../yoinkstash/pos_objs/` and put their IDs into `../yoinkstash/pos_objs/rawposobjs.json`.
- Then grab all the text and tables.
  - Find and put tables in `../yoinkstash/tables/tables.json`.
  - The rest goes to `../yoinkstash/text/rawtext.json`.
- Develop some system to identify headers etc. (initially only for Notebook Zero).
- Implement a separate general header segmenting system for non-Notebook-Zero notebooks.
- Split `rawtext.json` into `segtext.json` (segmented text). 09/11/24 NOTE: maybe that's stupid.
- Make the app into a CLI to control what it is supposed to do at any given time.
- <❗> Everything stores its position!
- The first hellish attempt at trying to convert everything to XML or even HTML...
  - <❗> ...but maybe converting the JSON data into a very decent XML format I can engineer is a better idea?
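
A rough sketch of the "split everything and store its position" steps from the list, not the actual app code. It assumes the Document JSON follows the Google Docs API layout as I understand it (`body.content`, `paragraph.positionedObjectIds`, a top-level `positionedObjects` map, `table.tableRows`). The names `split_document`, `paragraph_text`, `SAMPLE_DOC`, and the shape of the output records are made up for illustration, and downloading the images is left out.

```python
def paragraph_text(paragraph):
    """Join all text runs of one paragraph into a single string."""
    return "".join(
        el["textRun"]["content"]
        for el in paragraph.get("elements", [])
        if "textRun" in el
    )


def split_document(doc):
    """Split a Document dict into positioned objects, tables, and raw text.

    Every record keeps the startIndex of the structural element it came from.
    Record shapes are hypothetical, not something the Docs API defines.
    """
    pos_objs, tables, text = [], [], []
    positioned = doc.get("positionedObjects", {})
    for element in doc.get("body", {}).get("content", []):
        start = element.get("startIndex", 0)  # the leading section break has none
        if "table" in element:
            rows = []
            for row in element["table"].get("tableRows", []):
                cells = []
                for cell in row.get("tableCells", []):
                    # A cell holds structural elements; keep only paragraph text.
                    cell_text = "".join(
                        paragraph_text(c["paragraph"])
                        for c in cell.get("content", [])
                        if "paragraph" in c
                    )
                    cells.append(cell_text.strip())
                rows.append(cells)
            tables.append({"startIndex": start, "rows": rows})
        elif "paragraph" in element:
            para = element["paragraph"]
            for obj_id in para.get("positionedObjectIds", []):
                props = positioned.get(obj_id, {}).get("positionedObjectProperties", {})
                uri = props.get("embeddedObject", {}).get("imageProperties", {}).get("contentUri")
                pos_objs.append({"id": obj_id, "startIndex": start, "contentUri": uri})
            content = paragraph_text(para)
            if content.strip():
                text.append({"startIndex": start, "text": content})
    return pos_objs, tables, text


# Tiny hand-written stand-in for a real Document response (assumption, not real data).
SAMPLE_DOC = {
    "positionedObjects": {
        "kix.abc": {
            "positionedObjectProperties": {
                "embeddedObject": {
                    "imageProperties": {"contentUri": "https://example.invalid/img"}
                }
            }
        }
    },
    "body": {
        "content": [
            {"endIndex": 1, "sectionBreak": {}},
            {
                "startIndex": 1,
                "endIndex": 7,
                "paragraph": {
                    "elements": [{"textRun": {"content": "Hello\n"}}],
                    "positionedObjectIds": ["kix.abc"],
                },
            },
            {
                "startIndex": 7,
                "endIndex": 20,
                "table": {
                    "tableRows": [
                        {
                            "tableCells": [
                                {"content": [{"paragraph": {"elements": [{"textRun": {"content": "a\n"}}]}}]},
                                {"content": [{"paragraph": {"elements": [{"textRun": {"content": "b\n"}}]}}]},
                            ]
                        }
                    ]
                },
            },
        ]
    },
}
```

The three returned lists could then be dumped with `json.dump` into `rawposobjs.json`, `tables.json`, and `rawtext.json` respectively. Keeping `startIndex` on every record is what would let a later step put images, tables, and text back in document order.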