This repository contains the source code for the matura project "Automated Stock Trading using News Headlines" developed by Robin Bacher & Lucien Gees during 2023. The "resultsData" folder also contains all the log data and graphs shown in the paper written for this matura project.
As is commonly done, Stack Overflow was consulted often during development.
To install all the necessary dependencies, the following steps must be completed in the order shown. The program was developed using Python version 3.10.4, but newer versions should work too.
- First, the TA-Lib library must be installed manually. Please refer to the operating system specific instructions listed under dependencies here.
- Next, all other dependencies can be installed using
pip install -r requirements.txt
- Finally, the specific NLTK modules required can be installed by running
py Downloads.py

Please note that the realtime webscraper requires the Google Chrome browser to be installed.
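As a quick sanity check after installation, the following sketch reports which packages cannot be found by the import system. It assumes the usual import names `talib` (TA-Lib) and `nltk` (NLTK); the helper name `find_missing_modules` is hypothetical and not part of this repository.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the top-level module names that the import system cannot find."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names for TA-Lib and NLTK
    missing = find_missing_modules(["talib", "nltk"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules are installed.")
```

The check only confirms that the packages can be located. It does not verify that the NLTK data downloaded by Downloads.py is present.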
The bot and the testing tool can be used in multiple ways.
- The main.py file located in the root folder of the git repository can be run from the explorer or through a terminal. This will launch a command line interface which guides the user through setting up a bot run or a testing cycle.
- When running the main.py file from the terminal, the above process can be skipped by providing the necessary arguments for either a bot run or a testing cycle using command line arguments.
A list of all command line arguments and examples can also be accessed using
py main.py --help
- Bot runs and testing cycles can also be performed from within a script by importing TradingBot.bot or DataGen.Testing.
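The `py` command used above is the Python launcher for Windows; on other systems the interpreter is typically started as `python3`. As a portable alternative, the sketch below builds the same help call using the interpreter that is currently running. Only `main.py` and `--help` come from this README; the helper name `build_help_command` and the `repo_root` parameter are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_help_command(repo_root="."):
    """Build the argument list that runs main.py --help with the current interpreter."""
    main_script = Path(repo_root) / "main.py"
    return [sys.executable, str(main_script), "--help"]


if __name__ == "__main__":
    command = build_help_command()
    # Only attempt the call when run from a checkout that contains main.py
    if Path(command[1]).exists():
        subprocess.run(command, check=False)
    else:
        print("main.py not found; run this from the repository root.")
```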
Two external datasets were used for this project. We are grateful to the authors for providing them to the public, since without these datasets this project would not have been possible.
- FinancialPhraseBank by Ankur Sinha, released under CC BY-NC-SA 4.0
This dataset was used to train the Naive Bayes Classifier. It is not included in this repository. However, a pretrained classifier saved using Pickle is included. All methods used to train that classifier are included in the code and can be executed if the FinancialPhraseBank dataset is placed in the SentimentAnalysis folder.
- Daily Financial News For 6000+ Stocks by Bot_Developer, released under CC0: Public Domain
This dataset is used as the source of historical news headlines the bot uses when running in historical data mode. This dataset is included in this repository as it is needed for the historical data mode to work, albeit not in the format originally published. The dataset has been converted into two .json files which are used by the Histdata.py module to provide easy access to the dataset.
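The pretrained classifier mentioned above is stored with Pickle. The sketch below shows, under stated assumptions, how an object can be saved and restored that way. The helper names and the file name `classifier.pickle` are hypothetical, a plain dictionary stands in for the classifier, and the repository's own loading code may differ. Loading a pickle file can execute arbitrary code, so only files from a trusted source should be opened like this.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Serialize obj to path using pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a pickled object from path (trusted files only)."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # A dictionary stands in for the trained classifier in this demo
    stand_in = {"labels": ["negative", "neutral", "positive"]}
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Hypothetical file name, not taken from the repository
        path = os.path.join(tmp_dir, "classifier.pickle")
        save_object(stand_in, path)
        print(load_object(path) == stand_in)
```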