# BTC_RL_TRADING_BOT

## About

A Bitcoin trading agent was built with deep reinforcement learning. Various experiments were performed on the type of neural network, the type of reinforcement learning algorithm, and the number of daily input values the agent needs to observe before making its first decision. The agent operates under the assumption that on each day it can either "sell" or "buy."

## Environment & RL Algorithms

The agent's environment (StocksEnv) simulates buying and selling situations and comes from the gym-anytrading project [1]. The transactions concern the Bitcoin cryptocurrency, and the agent was trained on the "Historical Bitcoin Dataset", which contains Bitcoin prices from January 2012 to March 2021. The reinforcement learning algorithms used are Advantage Actor Critic (A2C), Actor Critic using Kronecker-Factored Trust Region (ACKTR), Proximal Policy Optimization (PPO1), and Trust Region Policy Optimization (TRPO), all taken from the Stable Baselines library [2]. Every agent was trained on the whole dataset except the last 100 daily values, which were held out to evaluate the agent based on the profit it achieves. The observation window size was 20, 30, or 50 daily input values.
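The sketch below shows roughly how such a setup can be wired together with gym-anytrading and Stable Baselines. It is not the repository's exact script: the file name `btc.csv`, the timestep budget, and the A2C choice are illustrative assumptions; only the 20/30/50 window sizes and the 100-day hold-out follow the description above.

```python
# Minimal sketch (assumed setup, not the repo's exact code): train an A2C agent
# on gym-anytrading's StocksEnv and evaluate it on the last 100 daily values.
import gym
import pandas as pd
import gym_anytrading  # noqa: F401 -- importing registers the 'stocks-v0' env
from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv

df = pd.read_csv('btc.csv')   # hypothetical file with the columns StocksEnv expects (at least 'Close')
window_size = 20              # 20, 30, or 50 daily values, as in the experiments

# Train on everything except the last 100 days.
train_env = gym.make('stocks-v0',
                     df=df,
                     window_size=window_size,
                     frame_bound=(window_size, len(df) - 100))

model = A2C('MlpPolicy', DummyVecEnv([lambda: train_env]), verbose=1)
model.learn(total_timesteps=100000)   # illustrative budget, not the reported one

# Evaluate on the held-out 100 days; the info dict reports total reward and profit.
eval_env = gym.make('stocks-v0',
                    df=df,
                    window_size=window_size,
                    frame_bound=(len(df) - 100, len(df)))
obs = eval_env.reset()
while True:
    action, _ = model.predict(obs)
    obs, reward, done, info = eval_env.step(action)
    if done:
        print(info)   # contains 'total_reward' and 'total_profit'
        break
```

The same loop applies to ACKTR, PPO1, and TRPO by swapping the model class; hyperparameters and exact timestep counts used for the reported results may differ.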

## Results

| DRL Agent | Final Reward | Profit |
| --------- | ------------ | ------ |
| A2C_20    | 26420        | 1.371  |
| A2C_30    | 3574         | 0.814  |
| A2C_50    | 29394        | 1.498  |
| PPO1_20   | 12945        | 1.014  |
| PPO1_30   | 38136        | 1.782  |
| PPO1_50   | 30188        | 1.680  |
| ACKTR_20  | 19452        | 1.022  |
| ACKTR_30  | 30300        | 1.405  |
| ACKTR_50  | 20119        | 1.248  |
| TRPO_20   | 29530        | 1.544  |
| TRPO_30   | 19910        | 1.110  |
| TRPO_50   | 14969        | 1.187  |

## References

[1] https://github.com/AminHP/gym-anytrading
[2] https://stable-baselines.readthedocs.io/en/master/index.html