In the next few posts we are going to discuss the design, development and testing of a machine learning artificial intelligence stock and forex trading system. Machine learning is often described as a new frontier, but much of it is data mining with statistical algorithms under a new name. What has changed is the computing power now available, which makes these algorithms practical. Before we proceed further in our discussion, I want you to watch the following video, which explains what machine learning is and covers some of its applications, including spam detection, handwriting recognition and speech recognition.
R is a powerful language, developed largely by academics around the world, that implements the statistical algorithms used in machine learning. R is open source software that can be downloaded free of charge. In the last few posts we discussed how we can use R to improve our trading. Before you continue, you should read the previous post on how to develop your algorithmic trading system using R. The first problem you are going to face is how to format the data properly before you start applying the different statistical models that R provides.
A few posts back, I discussed ARIMA and GARCH processes. Unfortunately, ARIMA+GARCH models don’t give good results when it comes to predicting the financial time series of stocks and currencies. Neural networks and support vector machines are two machine learning algorithms that quants now use extensively. If you don’t know what support vector machines are, you should read my post in which I have provided video tutorials on them. Neural networks are loosely modeled on the way neurons in the brain connect to each other, and they are very good at tasks such as image recognition. As we go along I will explain support vector machines and neural networks in detail. You can also read this post in which I discuss a paper that gives a method to predict GBPUSD daily returns. When dealing with financial time series, it is usually a better idea to model the returns rather than the raw prices, because returns are much closer to stationary and tend to give better results. Other algorithms that we should look at for our trading system are k-nearest neighbors (KNN) and hidden Markov models.
Design of Machine Learning Artificial Intelligence Stock And Forex Trading System
We want to design an automated trading system that, to begin with, can predict daily returns. We choose a threshold: the size of market move we require before we make a trading decision. For example, we want our trading system to predict how many pips a certain currency pair will move in the next 24 hours, and in which direction. So let’s say we choose a threshold of a 200 pip move in 24 hours. Our machine learning algorithm should then estimate the probability that GBPUSD will move up 200 pips in that time. Let’s say it predicts that GBPUSD will move up 200 pips in the next 24 hours with 70-80% probability. We then use our knowledge of candlesticks to make a long entry with a take profit target of 200 pips. Always keep in mind that all statistical algorithms are probabilistic in nature.
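To make the threshold idea concrete, here is a minimal sketch in base R of how such a target could be labeled from daily closes. The prices are made up, the helper name label_moves is hypothetical, the pip size of 0.0001 is an assumption that fits pairs like GBPUSD (not JPY pairs), and measuring the move from close to close is just one possible choice.

```r
# Hypothetical helper: label each day by whether the NEXT day's
# close-to-close move reaches the threshold (in pips), up or down.
label_moves <- function(close, threshold_pips = 200, pip_size = 0.0001) {
  # Move from today's close to tomorrow's close, expressed in pips.
  # The last day has no "tomorrow", so its move is NA.
  next_move <- c(diff(close), NA) / pip_size
  label <- ifelse(next_move >= threshold_pips, "up",
           ifelse(next_move <= -threshold_pips, "down", "flat"))
  data.frame(close = close,
             next_move_pips = round(next_move, 1),
             label = label,
             stringsAsFactors = FALSE)
}

# Made-up GBPUSD daily closes, for illustration only
close <- c(1.4200, 1.4415, 1.4390, 1.4180, 1.4195)
labels <- label_moves(close)
print(labels)
```

Labels like these are what a classifier would be trained on; the probability the algorithm reports is then its estimate of how likely the "up" (or "down") label is for the coming day.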
This is what we are going to do. We are going to calculate the daily returns of our chosen stock or currency pair and train the algorithms mentioned above on that daily returns dataset. Once we have trained an algorithm on the daily returns dataset, we are going to use it to predict the next daily candle. This will then help us make a better trading decision. As said above, we are dealing with probabilities. What this means is that, in the long run, the predictions should come true at a rate roughly equal to the predicted probability.
We also want to predict the next 4 hour and the next 1 hour candle and see if we can make predictions with 70-80% accuracy. If we succeed in achieving a predictive accuracy that is above 70% on average, then we can think of ourselves as having an edge over the market. So let’s get started.
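As a rough illustration of what a daily returns dataset looks like, the sketch below computes simple and log returns in base R and arranges the log returns so that each row holds one day's return next to the two previous days' returns. The prices are made up, and the choice of two lags is an arbitrary assumption for the example.

```r
# Made-up daily closing prices, for illustration only
close <- c(100, 102, 101, 103.02, 104, 103)

# Simple return: (P_t - P_{t-1}) / P_{t-1}
simple_ret <- diff(close) / head(close, -1)

# Log return: log(P_t) - log(P_{t-1}); log returns add up across days
log_ret <- diff(log(close))

# Each row: the return to predict, then the returns of the two previous days
lagged <- embed(log_ret, 3)
colnames(lagged) <- c("target", "lag1", "lag2")

print(round(simple_ret, 4))
print(round(lagged, 4))
```

A learning algorithm would be trained with the lag columns as inputs and the target column as the value to predict.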
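One simple way to measure predictive accuracy is the hit rate: the share of candles whose direction was predicted correctly. The sketch below uses made-up predictions and outcomes, purely to show the calculation.

```r
# Made-up predicted and realised candle directions, for illustration only
predicted <- c("up", "up", "down", "up", "down", "down", "up", "up", "down", "up")
actual    <- c("up", "down", "down", "up", "down", "up", "up", "up", "down", "down")

# Hit rate: fraction of candles whose direction was predicted correctly
hit_rate <- mean(predicted == actual)

# Confusion table: rows are predictions, columns are what actually happened
confusion <- table(predicted = predicted, actual = actual)

print(hit_rate)
print(confusion)
```

A hit rate computed on only ten candles says very little, and it only means something when it is measured on data the algorithm has not seen during training.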
How to read data into R?
As said above, we will be using the R language to make the predictions. You can download R for free. Once we have R installed, we need to import the data into it. Let’s start with forex. MT4 gives you an opportunity to download a csv file of any currency pair that is available on it. Open your MT4 platform, then go to Tools > History Center. Now download the csv files of the daily, 4 hour and 1 hour data. Use the following command to read the EURUSD daily data csv into R.
quotes <- read.csv("E:/MarketData/EURUSD1440.csv", header=FALSE)
E is the drive on which I have saved the EURUSD1440.csv file. You can save the file on C, D or E drive. Just make sure to change the drive name in the above R command. In the same manner, you can read H4 and H1 data into R.
quotes <- read.csv("E:/MarketData/GBPUSD240.csv", header=FALSE)
quotes <- read.csv("E:/MarketData/EURUSD60.csv", header=FALSE)
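Before converting anything, it can help to look at what read.csv returns for a file in this layout. The sketch below writes two made-up rows to a temporary file and reads them back. The column layout (date, time, open, high, low, close, volume) is an assumption about the MT4 export, and the values are invented for illustration.

```r
# Two made-up rows in the assumed MT4 export layout
# (date, time, open, high, low, close, volume)
sample_lines <- c(
  "2016.04.04,00:00,1.13890,1.14120,1.13570,1.13910,48211",
  "2016.04.05,00:00,1.13910,1.14050,1.13360,1.13840,52340"
)
csv_path <- tempfile(fileext = ".csv")
writeLines(sample_lines, csv_path)

# header = FALSE because the file has no header row;
# stringsAsFactors = FALSE keeps date and time as plain text on older R versions
quotes <- read.csv(csv_path, header = FALSE, stringsAsFactors = FALSE)

# Columns get the default names V1..V7: V1 and V2 are text, the rest numeric
str(quotes)
```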
This data is read as a data.frame, which is R’s basic format for tabular data. A data.frame looks like a matrix, except that each column can hold a different type of data. Every price in the file is stamped with a date and time, which makes it a time series, so we need to convert the data.frame into a time series object that R can treat as such. xts, from the xts package, is one of the most widely used time series classes in R, so we will convert the above data.frames into xts time series. We will use the following commands to load xts and convert the daily data.frame into an xts time series with daily periodicity.
library(xts)
x <- as.xts(quotes[,-(1:2)], as.Date(paste(quotes[,1]),format='%Y.%m.%d'))
In the same manner, we are going to use the following command to convert the H4 and H1 data frames into xts time series.
x <- as.xts(quotes[,-(1:2)], as.POSIXct(paste(quotes[,1],quotes[,2]),format='%Y.%m.%d %H:%M'))
Did you see any difference? We used as.Date with the format %Y.%m.%d for the daily data, and as.POSIXct with the format %Y.%m.%d %H:%M for H4 and H1. Why? Because for the intraday timeframes we need the hour as well as the minutes, while for the daily and weekly data we don’t. The above xts time series are now ready for use. But before we use them we need to name the data columns in the time series. Use the following command to name the data columns:
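A likely form of the naming step is sketched here on a tiny made-up H1 sample so that it can be run on its own. It assumes the five data columns arrive in the order Open, High, Low, Close, Volume, and the sample rows are invented for illustration.

```r
library(xts)

# Two made-up H1 rows in the assumed MT4 export layout
quotes <- read.csv(text = "2016.04.05,10:00,1.13840,1.13910,1.13790,1.13880,2150
2016.04.05,11:00,1.13880,1.13990,1.13850,1.13960,2480",
                   header = FALSE, stringsAsFactors = FALSE)

# Date and time columns become the index; the other five columns are the data
x <- as.xts(quotes[, -(1:2)],
            order.by = as.POSIXct(paste(quotes[, 1], quotes[, 2]),
                                  format = "%Y.%m.%d %H:%M"))

# Name the data columns
colnames(x) <- c("Open", "High", "Low", "Close", "Volume")

print(head(x))
```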
As you can see, the first column is the index column, which R has stamped with the time. The data columns are Open, High, Low, Close and Volume.
Now this was for forex data. If you are trading stocks, reading the data is much easier. R can download the data directly from Yahoo Finance for you, and it will also automatically convert that data into an xts time series. You just need to install the quantmod package to do this. quantmod is a powerful R package that can draw candlestick charts and do all sorts of technical analysis for you. I will discuss how to do technical analysis using quantmod as we progress in our discussion. Right now, you should use the following commands to download Amazon daily prices:
First we need to load the quantmod library into R. Then we download the data.
library(quantmod)
getSymbols("AMZN", from = "2010-01-01", to = "2016-04-05")
This is the daily stock price data from 2010 to 2016. The column names come back prefixed with the ticker symbol (AMZN.Open, AMZN.High and so on), which is slightly different from what we need, so we rename them.
colnames(AMZN) <- c("Open", "High", "Low", "Close", "Volume", "AdjClose")
You can also use the tseries R package to download the data from the web. Use the following command:
library(tseries)
AMZN <- as.xts(get.hist.quote("AMZN", start="2010-01-02", quote=c("Open", "High", "Low", "Close", "Volume", "AdjClose")))
First we load the tseries library. Then we use the get.hist.quote command to download the AMZN data. You can download data for any stock: just replace AMZN in the above R command with the symbol of the stock that you want, and R will download the data for you automatically from the web. The data is downloaded as a zoo object, so we use as.xts in the above command to coerce it into an xts time series object. If the call rejects "AdjClose", check ?get.hist.quote for the names your tseries version accepts.
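The coercion from zoo to xts can be tried without downloading anything. The sketch below builds a small zoo object from made-up closing prices, standing in for a download, and converts it with as.xts.

```r
library(xts)  # loading xts also attaches zoo

# Made-up daily closes stored as a zoo object, for illustration only
dates <- as.Date(c("2016-04-01", "2016-04-04", "2016-04-05"))
z <- zoo(cbind(Close = c(598.50, 593.19, 586.14)), order.by = dates)

# Coerce to xts; the values and the date index carry over
x <- as.xts(z)

print(class(z))
print(class(x))

# xts supports date-range subsetting with ISO 8601 style strings
print(x["2016-04-04/"])
```

The date-range subsetting in the last line is one practical reason to prefer xts over a plain zoo object when working with price history.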
In this post we have shown how to read the data from MT4 for forex pairs and how to read daily stock data directly from the web using the above R commands. It is very important that the data you download is correctly formatted; otherwise we will face problems when we apply the statistical algorithms to do the training and predictions. We use as.xts in the above commands to coerce the data into a proper time series.
As explained above, the next step is to use a statistical algorithm to predict whether the currency pair or the stock is going to move p% in the next k days. In the next post we are going to discuss how to do that.