neural networks vs stock market
Jan 19, 2020

1. Intro

1. Why I’m making this project

There are two main reasons for this project:

  • I’ve recently taken a deep dive into pytorch and I want to test it on a real task. I’ve been using pytorch extensively at work, but I usually didn’t have enough time to study its quirks as thoroughly as I wanted.

  • When I was studying Quantitative Methods in Economy, my all-time dream was to predict fluctuations on the stock market, i.e. to come up with a legal money factory. During my studies I realized that this isn’t possible, but I still want to give it a try ;)

2. What I expect to gain

  • a good understanding of LSTM nets

  • proficiency in pytorch

  • an end-to-end project in pytorch for future reference

  • any money ;)

3. First thoughts

  • I need to collect the data. The best source for me will be the Warsaw Stock Exchange, as I am Polish, so hopefully I’ll have some intuition about the stocks I buy/sell.

  • First I will concentrate on the regular stock market; later on I may delve into some more exotic instruments, like derivatives.

  • What if there are many irrational players on the market, so my algorithm could outsmart them?

2. Data

I’ve been searching for APIs to the Warsaw Stock Exchange for a while, and I convinced myself that maybe I should try something more user-friendly for a start. I found this article, which proposes yfinance, a Python library that downloads market data from Yahoo Finance. Let’s give it a try (but IEX looks interesting as well).

Aw, that is a great opportunity to remember how plotly works.

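import yfinance as yf
import plotly.express as px

# Download the full daily price history for Microsoft
msft = yf.Ticker("MSFT")
ms = msft.history(period="max")

# history() returns the dates as the index; turn them into a regular "Date" column
ms = ms.reset_index()

# Plot the closing price over time
fig = px.line(ms, x="Date", y="Close", title="Microsoft stock prices")
fig.show()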

By the way, displaying plotly in an R Markdown file is ridiculously complicated.

OK, it looks like I got some data.

3. Model type

At first I thought an LSTM would be perfect, as it is widely used for time series prediction. However, it does not seem to be able to model dependencies between different companies. (I am used to saying ‘model’, as in economics/econometrics, even though in machine learning we would rather say that we ‘fit’ functions to the data.) As a result, I will try to implement a convolutional neural network as a super-flexible function which will fit my data nicely.
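To make the idea a bit more concrete, here is a minimal sketch of what such a network could look like in pytorch. Everything in it is my assumption for illustration only: the class name PriceCNN, the layer sizes, and the choice to treat each company as one input channel of a 1D convolution, so that every filter sees all companies at once. It is not a tested trading model.

```python
import torch
from torch import nn


class PriceCNN(nn.Module):
    """Hypothetical 1D CNN: each ticker is one input channel."""

    def __init__(self, n_tickers: int, hidden: int = 16):
        super().__init__()
        self.features = nn.Sequential(
            # Each filter spans all tickers and 3 consecutive time steps.
            nn.Conv1d(n_tickers, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            # Average over the time axis so any window length works.
            nn.AdaptiveAvgPool1d(1),
        )
        # One predicted value per ticker.
        self.head = nn.Linear(hidden, n_tickers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, n_tickers, window)
        h = self.features(x).squeeze(-1)  # (batch, hidden)
        return self.head(h)               # (batch, n_tickers)


# Random numbers standing in for 8 samples of 4 tickers over 30 days.
model = PriceCNN(n_tickers=4)
batch = torch.randn(8, 4, 30)
prediction = model(batch)
```

Putting the companies on the channel axis is just one possible design; it lets the first convolution mix information across companies, which is the kind of cross-company dependency I am after.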

But first I need:

  • More data!

The “More data” point is as simple as changing “MSFT” into some other stock, but creating a valid dataset? For now I assume that:
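Whatever the final assumptions turn out to be, one common way to turn price series into training samples is a sliding window: take the last few days as input and a later day as the target. Below is a minimal sketch of that idea. The function name make_windows, the window length and the horizon parameter are hypothetical choices of mine, and the prices are synthetic numbers standing in for real closing prices.

```python
import numpy as np


def make_windows(prices: np.ndarray, window: int, horizon: int = 1):
    """Split a (time, n_tickers) price array into (inputs, targets).

    inputs[i]  = prices[i : i + window]            -> (window, n_tickers)
    targets[i] = prices[i + window + horizon - 1]  -> (n_tickers,)
    """
    n_samples = len(prices) - window - horizon + 1
    if n_samples <= 0:
        raise ValueError("not enough rows for the requested window and horizon")
    inputs = np.stack([prices[i : i + window] for i in range(n_samples)])
    targets = np.stack(
        [prices[i + window + horizon - 1] for i in range(n_samples)]
    )
    return inputs, targets


# Synthetic stand-in for the closing prices of 3 tickers over 10 days.
prices = np.arange(30, dtype=float).reshape(10, 3)
X, y = make_windows(prices, window=4)
```

With real data, the prices array could be built from the “Close” columns of several tickers, aligned on the same dates.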

To be continued…