In order to use the system, we require two types of data:
- CSV file containing news headlines about the desired stock
In our research, we used this Kaggle dataset containing news headlines about the TSLA stock.
Each line in this file also contains a source link to the original website where the headline was published.
This file must contain the following columns:
1.1 News headline date.
1.2 News headline content.
- CSV file containing the desired stock data
In our research, we used Yahoo Finance to obtain TSLA stock price data spanning more than a decade.
This file must contain the following columns:
2.1 Trading day date.
2.2 Open stock price.
2.3 High stock price.
2.4 Low stock price.
2.5 Close stock price.
2.6 Adjusted close stock price.
2.7 Stock Volume.
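Before running the notebooks, it can help to confirm that both files contain the columns listed above. The following is a minimal sketch using only the standard library. The header names are assumptions: the stock names follow what Yahoo Finance historical-data exports typically use, and `Date` / `Headline` are hypothetical names for the headline file that should be adjusted to match your dataset. `missing_columns` is a helper introduced here for illustration, not part of the project.

```python
import csv
import io

# Assumed header names; adjust them to match your actual files.
HEADLINE_COLUMNS = {"Date", "Headline"}  # hypothetical names
STOCK_COLUMNS = {"Date", "Open", "High", "Low", "Close", "Adj Close", "Volume"}


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return sorted(required - present)


# In-memory stand-ins with made-up values; pass open(...) handles for real files.
stock_sample = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2020-01-02,10.0,11.0,9.5,10.5,10.5,1000\n"
)
headline_sample = io.StringIO("Date,Title\n2020-01-02,Example headline\n")

print(missing_columns(stock_sample, STOCK_COLUMNS))        # []
print(missing_columns(headline_sample, HEADLINE_COLUMNS))  # ['Headline']
```

An empty result means every required column is present. Any names returned need to be added or renamed in the file.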
After gathering the required pieces of data, we can start using the system.
The system is used by running the following Jupyter notebooks in order:
- Extract financial sentiment from the news headlines CSV file
Run the `Financial_Sentiment_Extraction_System.ipynb` notebook to extract the financial sentiment for each trading day. The final output of this notebook is a new column called `Sentiment`, added to the stock data CSV file. This column contains the sentiment score, where sentiment score ∈ {-1, 0, 1}. Save the output as `processed_TSLA.csv`.
- Predict TSLA Close Prices
Run the `predicting_close_LSTM.ipynb` notebook to predict TSLA close prices.
In this notebook, we train an LSTM model that learns the hidden patterns in TSLA close prices.
Using this model, we predict the closing price and save it in a new column called `Predicted_Close`, which is added to the `processed_TSLA.csv` file.
Save the output as `Processed_predicted_TSLA.csv`.
- Training Deep Reinforcement Learning Agent
At this stage, we should have a stock data CSV file with the additional columns `Sentiment` and `Predicted_Close`.
The Gym Anytrading Notebooks folder contains a selection of notebooks.
In each notebook, we train a deep reinforcement learning agent with a different set of data available in its environment. This lets us assess which pieces of information are effective for training an optimal agent for autonomous stock trading.
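To make the data flow through the steps above concrete, here is a rough sketch of two of its pieces: collapsing per-headline scores into one daily `Sentiment` value in {-1, 0, 1}, and choosing which columns an agent gets to observe. It is not the notebooks' implementation. Taking the sign of the daily mean and treating days without headlines as neutral are assumptions made for illustration. The `FEATURE_SETS` names and all helper functions are hypothetical, and the rows contain made-up values.

```python
from statistics import mean


def daily_sentiment(scores):
    """Collapse one day's per-headline scores into -1, 0, or 1 (sign of the mean)."""
    if not scores:
        return 0  # assumption: a day without headlines counts as neutral
    avg = mean(scores)
    return (avg > 0) - (avg < 0)


def add_sentiment(stock_rows, scores_by_date):
    """Return copies of the stock rows with a Sentiment column attached."""
    return [
        {**row, "Sentiment": daily_sentiment(scores_by_date.get(row["Date"], []))}
        for row in stock_rows
    ]


# Hypothetical feature sets, one per experiment.
FEATURE_SETS = {
    "prices_only": ["Close"],
    "prices_sentiment": ["Close", "Sentiment"],
    "prices_sentiment_prediction": ["Close", "Sentiment", "Predicted_Close"],
}


def select_features(rows, columns):
    """Build one observation vector per trading day from the chosen columns."""
    return [[row[name] for name in columns] for row in rows]


# Made-up rows standing in for the processed stock data.
rows = [
    {"Date": "2020-01-02", "Close": 10.0, "Predicted_Close": 10.2},
    {"Date": "2020-01-03", "Close": 11.0, "Predicted_Close": 10.8},
]
scores_by_date = {"2020-01-02": [1, 1, -1]}  # no headlines on 2020-01-03

with_sentiment = add_sentiment(rows, scores_by_date)
observations = select_features(with_sentiment, FEATURE_SETS["prices_sentiment"])
print(observations)  # [[10.0, 1], [11.0, 0]]
```

Swapping the key passed to `FEATURE_SETS` changes what the agent can see while the underlying data stays the same, which is what lets the experiments be compared with one another.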