Visualizing Alt-Coin Correlation

Yesterday I showed you how to download price data from Binance using Python. Today I will show you some of the cool stuff you can do with that data.

In today's tutorial, I want to show you how to download the price data for over 120 crypto pairs which trade on the Binance exchange. We will then look at the correlations between the cryptocurrencies and identify which cryptos trade most similarly to STEEM/USDT.

Finding the assets which are most similar to another asset is useful in a number of trading applications, including 1) price prediction, 2) hedging and 3) statistical arbitrage. We'll talk a little bit more about this in a second, but first, let's take a look at how to download so much data...

We could manually write down the list of symbols... but wait, fuck that!! We're programmers and the best programmers are lazy so let's get the symbols the lazy way using the Binance API:

import json
import requests
symbols = json.loads(requests.get("https://api.binance.com/api/v1/exchangeInfo").text)
symbols = [symbol['symbol'] for symbol in symbols['symbols'] if symbol['quoteAsset'] == 'ETH']  # keep only */ETH pairs

What we're left with here is a list of any pair with ETH as the quote side of the pair. We will combine this data with the price history of ETH/USDT to create time series for each cryptocurrency quoted in USDT instead of ETH.
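If you want to see what that filter does without hitting the API, here's a minimal sketch on a made-up payload. The three entries are hypothetical and only mimic the symbol, baseAsset and quoteAsset fields of the real response, which carries many more fields per pair.

```python
# Hypothetical stand-in for the exchangeInfo response (assumption: trimmed to three fields).
sample = {
    "symbols": [
        {"symbol": "STEEMETH", "baseAsset": "STEEM", "quoteAsset": "ETH"},
        {"symbol": "ETHUSDT", "baseAsset": "ETH", "quoteAsset": "USDT"},
        {"symbol": "OMGETH", "baseAsset": "OMG", "quoteAsset": "ETH"},
    ]
}

# Keep only the pairs quoted in ETH, the same test the list comprehension above applies.
eth_pairs = [s["symbol"] for s in sample["symbols"] if s["quoteAsset"] == "ETH"]
print(eth_pairs)
```

ETHUSDT drops out because ETH is its base asset, not its quote asset.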

To do this, we'll use our get_bars function to retrieve the data for ETH/USDT:

ethusdt = get_bars('ETHUSDT')

We'll use this to create a USDT-denominated time series for each */ETH pair in our list of symbols: multiplying a coin's price in ETH by the price of ETH in USDT gives the coin's price in USDT.
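Here's a tiny worked example of that conversion. The prices are invented for illustration, and the date index is an assumption standing in for whatever timestamps get_bars returns.

```python
import pandas as pd

# Hypothetical closing prices; the index stands in for bar timestamps.
idx = pd.date_range("2018-01-01", periods=3, freq="D")
steem_eth = pd.Series([0.004, 0.005, 0.004], index=idx)   # STEEM priced in ETH
eth_usdt = pd.Series([1000.0, 900.0, 1100.0], index=idx)  # ETH priced in USDT

# (ETH per STEEM) * (USDT per ETH) = USDT per STEEM; pandas lines the rows up by index.
steem_usdt = steem_eth * eth_usdt
print(steem_usdt.round(2))
```

Because pandas aligns on the index, any bar that is missing from one of the two series comes out as NaN instead of silently pairing up the wrong timestamps.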

Now let's download historical price data for all the symbols which trade against ETH on Binance:

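price_data = []
new_symbols = []
for symbol in symbols:
    print(symbol)  # progress indicator
    data = get_bars(symbol)
    # Swap only the trailing quote asset; str.replace would also rewrite an 'ETH' inside the base asset name
    new_symbols.append(symbol[:-len('ETH')] + 'USDT')
    # (price in ETH) * (ETH price in USDT) = price in USDT
    price_data.append(data['c'].astype('float') * ethusdt['c'].astype('float'))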

For each pair, we've stored the historical price data, now converted to USDT, in a list named price_data.

That's not terribly useful as is, so let's use pandas to merge all the individual pairs into one big dataframe:

combo = pd.concat(price_data, axis = 1)
combo.columns = new_symbols

You can see that we've got a shit ton of symbols in there:

combo.div(combo.iloc[0]).plot(figsize=(16,9))  # rebase every series to 1.0 at the first bar (.ix was removed from pandas)

I cut off the legend but it goes on and on and on...

There's got to be a better way to visualize which cryptos flock together... And luckily for us, there is!! W00t

Consider the following graph:

mst = sm.MinimumSpanningTree(dataset = np.log(combo).diff().T)

This was created using the same price data that we tried to visualize before. But instead of plotting the raw price data, this graph plots each symbol as a circle or node on a graph. Nodes are connected via edges if their price history is extremely similar.

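This type of graph is known as a network graph, and this particular one was created using an algorithm known as a Minimum Spanning Tree, aka MST. An MST links all N nodes together using only N-1 edges, chosen so that the total distance along the edges is as small as possible.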
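The software that drew the graph above is proprietary, so I can't show its internals. What follows is only a rough sketch of the general idea using numpy and pandas. Everything in it is an assumption on my part: the function name correlation_mst, the made-up symbols, the distance formula sqrt(2 * (1 - correlation)) and the use of Prim's algorithm are one common way to build a correlation MST, not necessarily what the tool does.

```python
import numpy as np
import pandas as pd


def correlation_mst(prices):
    """Return MST edges as (symbol_a, symbol_b, correlation) tuples."""
    returns = np.log(prices).diff().dropna()   # log returns, same transform as above
    corr = returns.corr()
    # Turn correlation into a distance: 0 = moves identically, 2 = moves exactly opposite.
    dist = np.sqrt(2.0 * (1.0 - corr.clip(-1.0, 1.0))).to_numpy()
    names = list(corr.columns)
    in_tree = {0}
    edges = []
    # Prim's algorithm: repeatedly attach the closest symbol that is not in the tree yet.
    while len(in_tree) < len(names):
        i, j = min(
            ((a, b) for a in in_tree for b in range(len(names)) if b not in in_tree),
            key=lambda pair: dist[pair[0], pair[1]],
        )
        in_tree.add(j)
        edges.append((names[i], names[j], float(corr.iloc[i, j])))
    return edges


# Made-up prices: AAA and BBB share a common driver, CCC moves on its own.
rng = np.random.default_rng(0)
common = rng.normal(0, 0.02, 200)
returns = pd.DataFrame({
    "AAAUSDT": common + rng.normal(0, 0.002, 200),
    "BBBUSDT": common + rng.normal(0, 0.002, 200),
    "CCCUSDT": rng.normal(0, 0.02, 200),
})
prices = 100 * np.exp(returns.cumsum())

edges = correlation_mst(prices)
for a, b, rho in edges:
    print(a, b, round(rho, 2))
```

With three symbols you get two edges, and the two coins that share a driver end up directly connected. On the real data you would pass in combo instead of the made-up prices.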

Where is STEEM in this network graph?

We can see from the graph that the price history of STEEM/USDT is most similar to that of OMG/USDT. You might find that surprising, but a quick look at the side-by-side prices reveals a striking similarity over time:

There are other algorithms for creating networks out of correlations. One of them is called the Correlation Filtered Graph (CFG). Here is the graph produced from the same price data using the CFG algorithm:

cfg = sm.CorrelationFilteredGraph(dataset = np.log(combo).diff().T)

You can see that the CFG algo generates a lot more edges between nodes. The set of nodes which are directly connected to one particular node is known as the neighborhood of that node.

In the case of STEEM, its neighborhood consists of:

cfg.neighborhood("STEEMUSDT")
"""
{'OMGUSDT': {'weight': 0.8315814103529764},
 'TNBUSDT': {'weight': 0.8197091679870396},
 'STORJUSDT': {'weight': 0.7506614670521796},
 'GRSUSDT': {'weight': 0.5237622731586296},
 'MANAUSDT': {'weight': 0.7429232474210377},
 'BNTUSDT': {'weight': 0.8303019286045419}}
"""
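I can't show how the proprietary CFG call filters its edges, but a crude open-source approximation of a neighborhood is easy to sketch. The assumptions here are mine: the function name neighborhood, the made-up symbols, the idea that the weights are correlations of log returns, and the arbitrary 0.5 cutoff.

```python
import numpy as np
import pandas as pd


def neighborhood(prices, symbol, threshold=0.5):
    """Symbols whose log-return correlation with `symbol` is at least `threshold`."""
    returns = np.log(prices).diff().dropna()
    corr = returns.corr()[symbol].drop(symbol)   # correlation with every other symbol
    return {name: {"weight": float(rho)} for name, rho in corr.items() if rho >= threshold}


# Made-up prices: AAA and BBB share a common driver, CCC moves on its own.
rng = np.random.default_rng(0)
common = rng.normal(0, 0.02, 200)
returns = pd.DataFrame({
    "AAAUSDT": common + rng.normal(0, 0.002, 200),
    "BBBUSDT": common + rng.normal(0, 0.002, 200),
    "CCCUSDT": rng.normal(0, 0.02, 200),
})
prices = 100 * np.exp(returns.cumsum())

hood = neighborhood(prices, "AAAUSDT")
print(hood)
```

Raising the threshold shrinks the neighborhood and lowering it adds more edges, so the cutoff is the main knob to play with in a filter like this.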

We can plot the neighborhood of STEEM to verify the similarities of each price history:

This would suggest there exists a pretty good portfolio rebalancing trade between all of these cryptos, or at least that's been the case over the time frame we looked at (the last month). Correlations are not constant in financial markets, and cryptocurrencies are no exception to the rule. Thus it is important to update your model's internal correlations as market conditions evolve. Fortunately, this is a trivial process using our get_bars function combined with the script above: all you have to do is download new data every day, re-run the script and update your trading models.
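One simple way to watch a correlation drift over time is a rolling window. This is just a sketch on invented daily returns; the two symbols, the 30-day window and the noise levels are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Made-up daily log returns for two hypothetical symbols that share a common driver.
rng = np.random.default_rng(1)
n = 120
common = rng.normal(0, 0.02, n)
returns = pd.DataFrame({
    "AAAUSDT": common + rng.normal(0, 0.005, n),
    "BBBUSDT": common + rng.normal(0, 0.005, n),
})

# Correlation over the trailing 30 days, recomputed for every day.
rolling_corr = returns["AAAUSDT"].rolling(window=30).corr(returns["BBBUSDT"])
latest = rolling_corr.iloc[-1]
print(round(latest, 2))
```

The first 29 values are NaN because the window isn't full yet. After each daily download you would append the new returns and read off the latest value.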

Another thing that's interesting about STEEM is that, in both graphs, it sits near the edge, or periphery, of the graph. Nodes near the periphery are some of the most distinctive in the dataset, since nodes which are very similar to many other nodes tend to cluster towards the center of the graph.

The bad news is that STEEM appears to be one of the worst performers in its cluster over the last month.

Another way to look at portfolio diversification is that you are really just maximizing the average distance between the nodes in your portfolio. If you have a portfolio which is clustered in the same part of the graph, then you basically have multiple statistical copies of the same asset... When one goes down they all go down, and vice versa. This is the opposite of diversification.

On the other hand, assembling a portfolio of symbols which are truly different from one another means that when one crypto is down, it's likely that another part of your portfolio is up, thus providing the hedging effect which is the foundation of successful diversification.
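To put a number on "average distance", here's a small sketch. The correlation matrix is invented, the helper name mean_pairwise_distance is hypothetical, and the distance sqrt(2 * (1 - correlation)) is my assumption about how to turn a correlation into a distance.

```python
import itertools

import numpy as np
import pandas as pd


def mean_pairwise_distance(corr, portfolio):
    """Average correlation distance over every pair of symbols in the portfolio."""
    pairs = itertools.combinations(portfolio, 2)
    dists = [np.sqrt(2.0 * (1.0 - corr.loc[a, b])) for a, b in pairs]
    return float(np.mean(dists))


# Invented correlations: AAA and BBB move together, CCC is mostly independent.
names = ["AAAUSDT", "BBBUSDT", "CCCUSDT"]
corr = pd.DataFrame(
    [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.2],
     [0.1, 0.2, 1.0]],
    index=names, columns=names,
)

clustered = mean_pairwise_distance(corr, ["AAAUSDT", "BBBUSDT"])
diversified = mean_pairwise_distance(corr, ["AAAUSDT", "CCCUSDT"])
print(round(clustered, 3), round(diversified, 3))
```

The clustered pair scores a much smaller distance than the mixed pair, which is the "statistical copies of the same asset" problem expressed as a number.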

So where's EOS in all this madness?

Right next to Cardano, apparently!

If you're interested in creating these graphs yourself, please let me know. These particular network graphs were created using proprietary software which requires an API key to use.


Follow @marketstack for more updates about trading, finance, and technology
