When working with time series, one important question is whether one series “causes” changes in another. In other words, is there a strong correlation between two time series at a given number of lags? We can detect this by measuring cross correlation.
For instance, one time series could serve as a lagging indicator, where the effect of a change in one series transfers to the other several periods later. This is quite common in economic data; e.g. an economic shock having an effect on GDP two quarters later.
But how do we measure the lag at which this relationship is significant? One very handy way of doing so in R is the ccf (cross correlation) function.
Running this function allows us to determine the lag at which the correlation between two time series is strongest.
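To make the idea concrete, here is a minimal synthetic sketch (the variable names and the two-period lag are illustrative, not from the original analysis). We construct y as a lagged copy of x plus noise, and the correlation peaks at the constructed lag:

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.concatenate([np.zeros(2), x[:-2]]) + 0.1 * rng.standard_normal(500)  # y follows x with a 2-period lag

# Aligning x at time t-2 with y at time t recovers a near-perfect correlation
print(np.corrcoef(x[:-2], y[2:])[0, 1])  # close to 1
print(np.corrcoef(x, y)[0, 1])           # near 0 at lag 0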
Two important things that we must ensure when we run a cross correlation:

1. Our time series are stationary.
2. Once we have chosen the suitable lag, we are able to detect and correct for serial correlation if necessary.

In a previous post, we looked at how we can determine the extent of cross correlation among different currency pairs using the ccf function in R. Let's now see how this analysis can be conducted using Python.
Downloading currency data from Quandl

Firstly, we will import our libraries and download currency data for EUR/USD and GBP/USD from Quandl.
import numpy as np
import pandas as pd
import statsmodels
import statsmodels.tsa.stattools as ts
from statsmodels.tsa.stattools import acf, pacf
import matplotlib as mpl
import matplotlib.pyplot as plt
import quandl
import scipy.stats as ss

# Download currency data (daily FRED exchange rate series via Quandl)
eurusd = quandl.get("FRED/DEXUSEU", start_date="2015-05-01", end_date="2015-10-01")
eurusd

             Value
Date
2015-05-01  1.1194
2015-05-04  1.1145
2015-05-05  1.1174
...            ...
2015-09-29  1.1246
2015-09-30  1.1162
2015-10-01  1.1200

[107 rows x 1 columns]

gbpusd = quandl.get("FRED/DEXUSUK", start_date="2015-05-01", end_date="2015-10-01")
gbpusd

             Value
Date
2015-05-01  1.5137
2015-05-04  1.5118
2015-05-05  1.5178
...            ...
2015-09-29  1.5168
2015-09-30  1.5116
2015-10-01  1.5162

[107 rows x 1 columns]

We will now check the data types, rename as x and y, and extract the currency values from each series:
# Check the types
type(eurusd)
<class 'pandas.core.frame.DataFrame'>
type(gbpusd)
<class 'pandas.core.frame.DataFrame'>

# Extract the value columns and save as x and y
x = eurusd[eurusd.columns[0]]
y = gbpusd[gbpusd.columns[0]]

When dealing with financial data, it is good practice to express our data in returns rather than prices.
Since an investor is subject to the compounding effect when holding an asset, the series should be expressed in logarithmic form. Log returns are simply the first differences of log prices, so we take logs now and will first-difference later only if the stationarity tests below require it:

#Log format: convert prices to logs, which we can difference into returns if needed
x = np.log(x)
y = np.log(y)

Let us now plot the autocorrelation and partial autocorrelation functions:
acfx = statsmodels.tsa.stattools.acf(x)
plt.plot(acfx)
plt.title("Autocorrelation Function")
plt.show()

pacfx = statsmodels.tsa.stattools.pacf(x)
plt.plot(pacfx)
plt.title("Partial Autocorrelation Function")
plt.show()

acfy = statsmodels.tsa.stattools.acf(y)
plt.plot(acfy)
plt.title("Autocorrelation Function")
plt.show()

pacfy = statsmodels.tsa.stattools.pacf(y)
plt.plot(pacfy)
plt.title("Partial Autocorrelation Function")
plt.show()
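As an aside, statsmodels also ships plot_acf and plot_pacf helpers in statsmodels.graphics.tsaplots that draw the confidence bands for you. A minimal sketch (the lags=20 choice is purely illustrative):

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# ACF and PACF for the EUR/USD log series, with confidence bands
fig, axes = plt.subplots(2, 1)
plot_acf(x, ax=axes[0], lags=20)
plot_pacf(x, ax=axes[1], lags=20)
plt.show()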

Here are our currency plots:
#Plot currencies
plt.plot(x)
plt.title("EUR/USD")
plt.show()

plt.plot(y)
plt.title("GBP/USD")
plt.show()

Dickey-Fuller Test and First Differencing
As mentioned, we wish to ensure that our time series are stationary before obtaining a cross correlation reading.
To test for stationarity, we will use the Dickey-Fuller test, whose null hypothesis is that the series contains a unit root (i.e. is non-stationary). A p-value below 0.05 lets us reject the null and treat the series as stationary, while a p-value above this threshold indicates non-stationarity.
#Dickey-Fuller Tests
xdf = ts.adfuller(x, 1)
ydf = ts.adfuller(y, 1)

# adfuller returns (test statistic, p-value, lags used, observations, critical values, icbest)
xdf
(-3.0704779047168596, 0.028816508715839483, 0, 106, {'1%': -3.4936021509366793, '5%': -2.8892174239808703, '10%': -2.58153320754717}, -723.247574137278)
ydf
(-2.949959856756157, 0.03983919029636401, 1, 105, {'1%': -3.4942202045135513, '5%': -2.889485291005291, '10%': -2.5816762131519275}, -815.3639322514784)

Since our p-values are below 0.05 (xdf = 0.0288, ydf = 0.0398), we do not have to first-difference our series to achieve stationarity.
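Had the p-values come in above 0.05, the usual remedy would be to first-difference the log series (turning log prices into log returns) and re-run the test. This fallback is a sketch of my own, not a step taken in the original analysis:

# Hypothetical fallback: difference the log prices into log returns and re-test
x_diff = x.diff().dropna()
ts.adfuller(x_diff, 1)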
Cross Correlation Analysis

Now, we will calculate the cross correlation between these two currency pairs. The following guide gives a great overview of how to calculate cross correlations in Python, and I recommend viewing it for more detail.
Firstly, we will calculate the cross correlation between x and y at lag zero:
# Calculate the zero-lag correlation
cc1 = np.correlate(x - x.mean(), y - y.mean())[0]  # remove the means first
cc1
0.0016363869247897089
cc1 /= (len(x) * x.std() * y.std())  # normalise by the number of points and the product of standard deviations
cc1
0.09356342030097958

cc2 = np.corrcoef(x, y)[0, 1]
cc2
0.09444609407740386

print(cc1, cc2)
0.09356342030097958 0.09444609407740386

Now, we will generate the lags and calculate the cross correlations at each lag:
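The small gap between cc1 and cc2 is down to a standard deviation convention: pandas' .std() defaults to the sample estimator (ddof=1), whereas np.corrcoef effectively normalises with the population estimator (ddof=0). Forcing ddof=0 reconciles the two figures:

# Using the population standard deviation makes the manual figure match np.corrcoef
cc1_pop = np.correlate(x - x.mean(), y - y.mean())[0] / (len(x) * x.std(ddof=0) * y.std(ddof=0))
print(cc1_pop)  # equal to cc2 up to floating-point error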
# Generate the lags
lg = len(x)  # 107 observations
lags = np.arange(-lg + 1, lg)

# Remove the sample means
xr = x - x.mean()
yr = y - y.mean()

# Cross correlation at every lag, normalised as above so the values are correlation coefficients
ccor = np.correlate(xr, yr, mode='full')
ccor /= (lg * x.std() * y.std())

Now, we can plot the cross correlation:
fig, ax = plt.subplots()
ax.plot(lags, ccor, 'b')
ax.set_xlabel('lag')
ax.set_ylabel('correlation coefficient')
ax.grid(True)
plt.title("EUR/USD vs GBP/USD")
plt.show()
We see that while the correlations weaken as the lags increase (as we would expect), there are notably negative correlations around t = -50 and t = 50, with correlation coefficients below -0.2.
Overall, the cross correlation between EUR/USD and GBP/USD appears more negative than positive.
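As a cross-check, statsmodels provides a ccf function in statsmodels.tsa.stattools that mirrors R's ccf. A minimal sketch; note that it returns correlations for non-negative lags only, and that older statsmodels versions spell the bias-adjustment keyword unbiased rather than adjusted:

from statsmodels.tsa.stattools import ccf

cc_sm = ccf(x, y)     # cross correlations at lags 0, 1, 2, ...
plt.plot(cc_sm[:50])  # inspect the first 50 lags
plt.title("statsmodels ccf: EUR/USD vs GBP/USD")
plt.show()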
Let's compare this to two other currency pairs. We will choose JPY/USD vs CHF/USD:
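The original code for this step is not shown. Here is a sketch of how it could be reproduced, assuming the FRED codes DEXJPUS and DEXSZUS (the yen and franc rates against the dollar) and reusing the pipeline above:

# Hypothetical reproduction for JPY and CHF (the Quandl codes are assumptions)
jpyusd = quandl.get("FRED/DEXJPUS", start_date="2015-05-01", end_date="2015-10-01")
chfusd = quandl.get("FRED/DEXSZUS", start_date="2015-05-01", end_date="2015-10-01")

# Align the two log series on common dates before correlating
pair = pd.concat([np.log(jpyusd), np.log(chfusd)], axis=1, join='inner').dropna()
xj = pair.iloc[:, 0]
yc = pair.iloc[:, 1]

ccor_jc = np.correlate(xj - xj.mean(), yc - yc.mean(), mode='full')
ccor_jc /= (len(xj) * xj.std() * yc.std())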
[Cross correlation plot: JPY/USD vs CHF/USD]
Interestingly, we see more frequent negative correlations here, while the positive correlations are stronger than those between the EUR and GBP. Given that the CHF and JPY are two “safe haven” currencies that typically rise during “risk-off” periods in the market, it is not surprising that we see stronger positive correlations between the two, along with significantly negative correlations when demand for them is falling.
Conclusion

In this tutorial, you have learned:
- How to analyse financial data in Python using Quandl
- How to test a time series for stationarity
- How to conduct a cross correlation in Python

Many thanks for reading this tutorial, and please leave any questions you may have in the comments below.