Data Preprocessor

Cloning

git clone https://github.com/Research-Project-Crypto/DataPreprocessor.git --recursive

If you forgot to clone recursively, you can fetch the submodules afterwards with the following command:

git submodule update --init --recursive

Build Dependencies

The following instructions are written for Arch-based Linux systems, but they should give you an idea of how to port the setup to other systems.

pacman -S --noconfirm --needed gcc make
pacman -S --noconfirm --needed premake

yay -S ta-lib
# or
paru -S ta-lib

Compiling the application

premake5 gmake2

make config=release

Using the application

With arguments

Position	Argument
1	Data input folder
2	Data output folder

Example:

./bin/Release/DataPreprocessor data/input data/output

Downside of argument only

In argument-only mode you cannot specify the type of the input data; only CSV text data can be parsed.

Without argument using settings.json

If you don't give any arguments, the application defaults to reading its settings from settings.json.

{
    "input": {
        "input_folder": "data/input",
        "is_binary": false
    },
    "output": {
        "output_folder": "data/output"
    }
}
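
The choice between the two modes can be summarized in a short Python sketch. This is not the application's actual code: the function name `resolve_config` and the returned dictionary are hypothetical, and only the key names are taken from the settings.json shown above. How the application reacts to a single argument is not documented here, so the sketch simply rejects that case.

```python
import json
from pathlib import Path


def resolve_config(argv, settings_path="settings.json"):
    """Return input folder, output folder and binary flag for one run.

    Hypothetical helper that mirrors the behaviour described above;
    it is not the application's real implementation.
    """
    args = argv[1:]  # drop the program name

    if len(args) >= 2:
        # Argument mode: position 1 is the input folder, position 2 the
        # output folder. The input type cannot be chosen, so it is CSV.
        return {
            "input_folder": args[0],
            "output_folder": args[1],
            "is_binary": False,
        }

    if len(args) == 1:
        # Assumption: the single-argument case is undocumented, so reject it.
        raise ValueError("expected either no arguments or input and output folders")

    # No arguments: fall back to settings.json.
    settings = json.loads(Path(settings_path).read_text(encoding="utf-8"))
    return {
        "input_folder": settings["input"]["input_folder"],
        "output_folder": settings["output"]["output_folder"],
        "is_binary": bool(settings["input"]["is_binary"]),
    }
```

Calling `resolve_config(["DataPreprocessor", "data/input", "data/output"])` yields the two folders with `is_binary` set to `False`, while `resolve_config(["DataPreprocessor"])` reads everything from settings.json.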

Data Input Format

CSV

The CSV reader expects 6 fields per row, all of which should be double-precision floating-point numbers.

event_time,open,close,high,low,volume
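
To check a file against this layout before running the application, a small Python sketch like the one below may help. It is an illustration only, not the application's reader: the function name `parse_rows` is hypothetical, and whether real input files carry a header line is not stated here, so the sketch accepts both cases.

```python
import csv
import io

# Field order as listed above.
FIELDS = ("event_time", "open", "close", "high", "low", "volume")


def parse_rows(text):
    """Yield one dict of floats per CSV data row.

    Hypothetical helper: skips blank rows and an optional header row,
    and raises ValueError when a row does not have exactly six fields
    or contains a value that is not a number.
    """
    for row_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # ignore blank lines
        cells = [cell.strip() for cell in row]
        if row_no == 1 and cells == list(FIELDS):
            continue  # assumption: a header line may or may not be present
        if len(cells) != len(FIELDS):
            raise ValueError(
                f"row {row_no}: expected {len(FIELDS)} fields, got {len(cells)}"
            )
        yield dict(zip(FIELDS, (float(cell) for cell in cells)))
```

For example, `list(parse_rows(open("data/input/some_file.csv").read()))` returns one dictionary per row, where `some_file.csv` stands for any file in your input folder.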

Binary

You usually won't need this mode. The exception is input produced by TickerTimescaleSwap: in that case you must set the is_binary value to true in settings.json.

Verify Data Integrity

Included with this project is a Python script that you can use to verify the binary output data.

python3 scripts/binary_reader.py

The script requires numpy.

It loops over all the cells slowly and is mostly meant for quickly spot-checking the output for calculation mistakes in the program.
