Did you ever wanted to easily access data contained in the Steem blockchain and perform analysis or find valuable information. But not everybody has programming skills to gather those data and compute the wanted result.
Therefore, I created a publicly available SQL database with all the blockchain data in it.
Why use a SQL database?
The main advantage having such a database is the fact data are structured and easily accessible from any application able to connect to a SQL Server database. Having a SQL Server database makes it possible to produce quick answers to queries.
Simply put, a query is a question. You ask the server form something and it sends back an answer (called the query result set).
For example, when dealing with large amounts of data as Steem blockchain data, you might want quick answers to questions (queries) such as:
- What was the Steem power down volume during the past six weeks?
- Which are the top 10 most rewarded post ever?
- Did I get, me or my posts, mentioned in any post or comment?
- How many posts are talking about ants?
Browsing the blockchain over and over to retrieve and compute such information is time and resource consuming.
If you don’t have a local copy of the blockchain, instead of downloading the whole data from some external public node to process it, you will send your query to SteemSQL server and get only the requested information, saving tons of bandwidth.
Let’s have a look at some technical details
Database diagram
The Blocks table contains bare block information (timestamp, witness …)
Each block can contains Transactions
Depending on each transaction type, the associated transaction’s data is stored in the related table.
Full text search
The database has been full text search enabled. This allow fast search of information within post and replies.
Let say I want to know if anyone mentioned me in a post or comment, the following simple query will do the trick
SELECT author, title, body, url FROM TxComments WHERE CONTAINS(body, '@arcange')
Database Connection information:
Here the information to connect and query the database:
Server: sql.steemsql.com
User: steemit
Password: steemit
Database name: DBSteem
How to retrieve query the database
- Using Microsoft Excel
Check this tutorial to see how to create an analysis report with Excel
- If you’re a python programmer
import pypyodbc
connection = pypyodbc.connect('Driver={SQL Server};'
'Server=sql.steemsql.com;'
'Database=DBSteem;'
'uid=steemit;pwd=steemit')
cursor = connection.cursor()
SQLCommand = ("YOUR SQL QUERY”)
cursor.execute(SQLCommand)
# do whatever you want with the retrieved data
connection.close()
Support
If you need help, have any comment or request, join steemsql channel on steemit.chat
Availability and performance
The SQL server is hosted in a datacenter with 24/7/365 availability.
Available output bandwidth is up to 500Mb/s
New data from the latest blocks are injected in the database every 10 seconds.
The server is currently hosted in a shared infrastructure.
I will monitor server load and if it requires more resources, I will allocate any reward to this post to a dedicated infrastructure.
You like it, please upvote or follow me