After a short break, just to make sure everybody had the time to catch up on all previous Learn Python Series episodes ;-) , I'm back! In this episode we'll discuss handling JSON data, using the json module as well as the .json() method from the external requests library, via some theory and a few practical examples.
JSON, short for JavaScript Object Notation, is a text based data format used to exchange data between applications and computers. Even though its name has the language "JavaScript" in it, JSON is a textual data format that's language independent, so it can be used with lots of computer languages, including of course Python. JSON is pretty easy to read and write (for humans) and easy to parse as well (for computers). As we will see in this tutorial episode JSON-formatted data is also often used with APIs.
The JSON syntax looks a lot like that of the Python dictionary we've discussed before. JSON stores data in name:value pairs, separated by commas, where curly braces { } hold objects and square brackets [ ] hold arrays (lists) of data.
In JSON, names must be strings surrounded by double quotes " ", and string values must be double-quoted as well. JSON values can be strings, numbers, arrays (lists), booleans, null, or an embedded (JSON) object. This is a little bit more restrictive than what we've been discussing regarding Python dictionaries.
The json module is part of Python's standard library, so it comes bundled with Python distributions, including Anaconda. Using the json module therefore doesn't require the installation of an external Python module; you can simply import it, like so:
import json
The json module is used to parse JSON data from files (or strings inside a running Python script) and the other way around, to convert a Python dictionary (or list for example) back into a JSON textual string. It allows you to convert ("serialize" and "deserialize") back and forth between JSON data and Python objects.
Let's look at a few basic examples to get our feet wet using JSON with the methods json.loads() and json.dumps():
json.loads()
json.loads() deserializes a string containing JSON data to a Python object.
json_string = '{"name": "Scipio", "utopian_reputation": "Elite"}'
json_dict = json.loads(json_string)
print(type(json_dict), json_dict)
<class 'dict'> {'name': 'Scipio', 'utopian_reputation': 'Elite'}
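To connect this with the syntax rules discussed earlier, here is a minimal sketch of how json.loads() maps each JSON value type to a Python type, and of what happens when the input breaks the double-quote rule. The sample document and the variable names are made up for illustration.

```python
import json

# A made-up JSON document that uses every JSON value type.
sample = '''
{
    "text": "hello",
    "integer": 14,
    "decimal": 3.05,
    "flag": true,
    "nothing": null,
    "items": [1, 2, 3],
    "nested": {"key": "value"}
}
'''

parsed = json.loads(sample)
for name, value in parsed.items():
    # Show which Python type each JSON value was converted to.
    print(name, '->', type(value).__name__)

# Single quotes are fine in a Python dict literal, but not in JSON.
quote_error = None
try:
    json.loads("{'name': 'Scipio'}")
except json.JSONDecodeError as err:
    quote_error = err
    print('Invalid JSON:', err)
```

Catching json.JSONDecodeError like this is one way to handle input you don't control, such as text typed by a user or returned by a remote server.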
json.dumps()
json.dumps() works the other way around: it serializes a Python object to textual JSON.
my_dict = {'gender': 'male', 'hasJob': True, 'number': 14}
my_json = json.dumps(my_dict)
print(type(my_json), my_json)
<class 'str'> {"gender": "male", "hasJob": true, "number": 14}
As you can see, the Python dictionary object my_dict has now been serialized into a (JSON data) string. Please notice the double quotes " " instead of the single quotes ' ' we began with, as well as how the boolean value was changed from True to true.
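Booleans are not the only values that change shape on the way out. The following sketch, using a made-up dictionary, shows a few other conversions json.dumps() performs, and why a round trip through JSON does not always give back exactly the object you started with.

```python
import json

# Made-up data: a None value, a tuple, and an integer key.
original = {'nothing': None, 'pair': (1, 2), 7: 'seven'}

as_json = json.dumps(original)
print(as_json)

# None became null, the tuple became an array, and the key 7 became "7",
# so deserializing yields a list and a string key instead.
round_trip = json.loads(as_json)
print(round_trip)
print(round_trip == original)
```

If your code relies on tuples or non-string keys, you would need to convert them back yourself after loading.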
json.dumps(indent=) keyword argument
When working with relatively large JSON datasets, you might prefer a bit of indentation to make the JSON data easier to read, which is often called pretty printing.
If we want to "pretty print" the my_dict data from the previous example, we can do so like this:
my_json = json.dumps(my_dict, indent=4)
print(my_json)
{
    "gender": "male",
    "hasJob": true,
    "number": 14
}
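Indentation only changes how the text looks, not what it contains. As a small sketch, the snippet below serializes the same dictionary once compactly and once pretty printed, and checks that both strings deserialize to the same object. The sort_keys=True argument is an optional extra of json.dumps() that orders the names alphabetically; it is not needed for pretty printing.

```python
import json

my_dict = {'gender': 'male', 'hasJob': True, 'number': 14}

compact = json.dumps(my_dict)
pretty = json.dumps(my_dict, indent=4, sort_keys=True)
print(pretty)

# Whitespace between tokens is not significant in JSON,
# so both strings parse to equal dictionaries.
same_data = json.loads(compact) == json.loads(pretty)
print(same_data)
```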
.json files
JSON data can be stored to and read from disk as well, most commonly using a .json file name extension. To do this, you can use the json.load() and json.dump() methods.
json.load()
Suppose you have a file called mydata.json with some JSON data stored inside it. You can open() that file as usual and deserialize it with json.load() as follows:
with open('mydata.json', 'r') as f:
    data = json.load(f)
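In practice the file may be missing, or it may not contain valid JSON. Here is a minimal sketch of a defensive loader. The helper name load_json_file and the choice to return a default value are assumptions made for illustration; they are not part of the json module. The demonstration uses a temporary directory so that it runs without touching any real files.

```python
import json
import os
import tempfile


def load_json_file(path, default=None):
    """Return the deserialized content of a JSON file, or default on failure.

    Hypothetical helper for illustration, not part of the json module.
    """
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except FileNotFoundError:
        print('No such file:', path)
    except json.JSONDecodeError as err:
        print('Not valid JSON:', err)
    return default


# Demonstrate with a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, 'mydata.json')
    with open(good_path, 'w', encoding='utf-8') as f:
        f.write('{"name": "Scipio"}')

    bad_path = os.path.join(tmp, 'broken.json')
    with open(bad_path, 'w', encoding='utf-8') as f:
        f.write('{"name": ')  # deliberately truncated JSON

    data = load_json_file(good_path)
    broken = load_json_file(bad_path, default={})
    missing = load_json_file(os.path.join(tmp, 'nope.json'), default={})

print(data, broken, missing)
```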
json.dump()
Writing JSON data to a file is also pretty easy, using the json.dump() method. You need to pass (at least) two positional arguments: json.dump(object, file), where object is the Python object you want to write to file as JSON and file is the open file object to write to. Like so:
some_dict = {"name": "Scipio",
             "quote": "Does it matter who's right, or who's left?"}
with open('scipio.json', 'w') as f:
    json.dump(some_dict, f)
As a result, the file scipio.json will be saved in your current working directory, containing the some_dict data converted to JSON.
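The only difference between json.dump() and json.dumps() is where the text ends up: in a file or in a string. The sketch below illustrates this by writing a dictionary with json.dump(), reading the raw file text back, and comparing it with what json.dumps() returns. It writes to a temporary directory instead of the current working directory, which is an assumption made only to keep the example tidy.

```python
import json
import os
import tempfile

some_dict = {"name": "Scipio",
             "quote": "Does it matter who's right, or who's left?"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'scipio.json')

    # Serialize the dictionary to disk.
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(some_dict, f)

    # Read the raw text that was written.
    with open(path, 'r', encoding='utf-8') as f:
        file_text = f.read()

# The file holds the same JSON text that json.dumps() produces,
# and deserializing it gives back an equal dictionary.
print(file_text == json.dumps(some_dict))
restored = json.loads(file_text)
print(restored == some_dict)
```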
Using requests to handle JSON data
In the previous tutorials, we've been using the external requests library to fetch data from the internet, in a web crawler context. But we haven't yet discussed that the requests library also has a very convenient JSON decoder built in, which we can use to grab structured JSON data from web-based APIs (after all: why code a web crawler to scrape semi-structured data, if we can simply use an API providing us with perfectly valid structured JSON data?).
Let's show a simple real-life example with which you can fetch the current Steem price data from the coinmarketcap.com Ticker API, using the requests built-in json() decoder:
import json
import requests
r = requests.get('https://api.coinmarketcap.com/v1/ticker/steem/')
steem_data = r.json()
print(type(steem_data), steem_data)
<class 'list'> [{'id': 'steem', 'name': 'Steem', 'symbol': 'STEEM', 'rank': '31', 'price_usd': '3.05373', 'price_btc': '0.00034873', '24h_volume_usd': '15685600.0', 'market_cap_usd': '776589645.0', 'available_supply': '254308549.0', 'total_supply': '271282643.0', 'max_supply': None, 'percent_change_1h': '0.49', 'percent_change_24h': '-3.64', 'percent_change_7d': '17.08', 'last_updated': '1524344348'}]
Alternatively, we can just deserialize the requests response using json.loads():
r = requests.get('https://api.coinmarketcap.com/v1/ticker/steem/')
steem_data = json.loads(r.text)
print(type(steem_data), steem_data)
<class 'list'> [{'id': 'steem', 'name': 'Steem', 'symbol': 'STEEM', 'rank': '31', 'price_usd': '3.05373', 'price_btc': '0.00034873', '24h_volume_usd': '15685600.0', 'market_cap_usd': '776589645.0', 'available_supply': '254308549.0', 'total_supply': '271282643.0', 'max_supply': None, 'percent_change_1h': '0.49', 'percent_change_24h': '-3.64', 'percent_change_7d': '17.08', 'last_updated': '1524344348'}]
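Notice that this API returns every number as a string ('price_usd': '3.05373') and that max_supply arrived as Python None. If you want to calculate with the values, you need to convert them yourself. The sketch below works on a shortened copy of the response shown above, written as raw JSON text, so it runs without network access. The helper name to_number is made up for this example.

```python
import json

# Shortened copy of the ticker response shown above, as raw JSON text.
response_text = '''
[{"id": "steem", "name": "Steem", "price_usd": "3.05373",
  "available_supply": "254308549.0", "max_supply": null,
  "last_updated": "1524344348"}]
'''


def to_number(value):
    """Convert a numeric string to float, passing None through unchanged."""
    return None if value is None else float(value)


# The API wraps the coin in a list, so take the first (and only) item.
steem = json.loads(response_text)[0]

price = to_number(steem['price_usd'])
supply = to_number(steem['available_supply'])
max_supply = to_number(steem['max_supply'])

# With real numbers we can do arithmetic, for example
# price times available supply.
print(steem['name'], price, price * supply, max_supply)
```

With a live response you would apply the same conversion to the list returned by r.json().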
In the examples above, we learned how to deserialize & serialize JSON data, read JSON data from an API and/or a file, and write JSON to file. Again using the CMC Ticker API, we can combine the above into a useful bit of code which...
- creates an empty ticks list,
- defines a list coins of your favorite coins,
- loops over the coins list, using the requests module to get the current data,
- appends the name, price and last-updated fields of each coin to the ticks list,
- and saves the ticks list to a file using json.dump() with a 4 character indentation depth.
Like so:
import json
import requests
ticks = []
cmc_base_url = 'https://api.coinmarketcap.com/v1/ticker/'
coins = ['bitcoin', 'steem', 'steem-dollars']
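for coin in coins:
    # Fetch the ticker data for one coin; the API returns a one-item list
    coin_json = requests.get(cmc_base_url + coin).json()
    # Keep only the fields we are interested in
    coin_dict = {
        'name': coin_json[0]['name'],
        'price_usd': coin_json[0]['price_usd'],
        'last_updated': coin_json[0]['last_updated']
    }
    ticks.append(coin_dict)

# Save the collected ticks as pretty-printed JSON
with open('cmc.json', 'w') as f:
    json.dump(ticks, f, indent=4)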
If we then open our saved file cmc.json located in our current working directory, we'll see the following output:
[
    {
        "name": "Bitcoin",
        "price_usd": "8790.62",
        "last_updated": "1524345872"
    },
    {
        "name": "Steem",
        "price_usd": "3.05278",
        "last_updated": "1524345847"
    },
    {
        "name": "Steem Dollars",
        "price_usd": "3.21964",
        "last_updated": "1524345848"
    }
]
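The saved file can later be loaded again for further processing. As a sketch, the snippet below reads such a file back and turns the last_updated values, which are Unix timestamps stored as strings, into readable UTC datetimes. To stay runnable without network access it first writes the sample output shown above to a temporary directory, which is an assumption made for the example; in your own script you would open cmc.json directly.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

# The ticks as saved in the example output above.
ticks = [
    {"name": "Bitcoin", "price_usd": "8790.62", "last_updated": "1524345872"},
    {"name": "Steem", "price_usd": "3.05278", "last_updated": "1524345847"},
    {"name": "Steem Dollars", "price_usd": "3.21964", "last_updated": "1524345848"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'cmc.json')
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(ticks, f, indent=4)

    # Read the file back, just as you would with a real cmc.json.
    with open(path, 'r', encoding='utf-8') as f:
        saved_ticks = json.load(f)

report = []
for tick in saved_ticks:
    # Numbers are stored as strings, so convert them before use.
    updated = datetime.fromtimestamp(int(tick['last_updated']), tz=timezone.utc)
    report.append((tick['name'], float(tick['price_usd']), updated))
    print(tick['name'], tick['price_usd'], updated.isoformat())
```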
In this episode, I showed you how to handle JSON data: deserializing with either json.load() or json.loads() and serializing with json.dump() or json.dumps() via the built-in json module, and using the convenient .json() method of a requests response in the case of web APIs returning JSON data.
I've deliberately only covered "the basics" of JSON conversion, because in most of the cases, the techniques covered in this tutorial episode are all you need.