Indexer For Blockchain Analysis
This tutorial walks through a few examples of using the indexer to calculate Algorand blockchain metrics. It covers client instantiation, data extraction, data processing, and metrics calculations.
Requirements
Required
- Python >= 3.6
- A Python IDE. Since this is a step-by-step tutorial, Jupyter Notebook/IPython is recommended
- py-algorand-sdk
- Indexer. There are two options for using the indexer:
  - An infrastructure service (like PureStake, Algorand Explorer, etc.) - the simpler solution. It provides read access to the indexer through the SDK.
  - A local indexer installation - this gives you access to Algorand blockchain data independently of external providers. Setup might take a few days, since the indexer needs to catch up with the Algorand blockchain from its genesis block.
Optional
- Virtual environment
Background
The Algorand blockchain can process transactions at high throughput. Parsing the blocks to find the information of interest might be time-consuming. The indexer provides convenient and quick access to the data recorded on the Algorand blockchain.
In this tutorial, we will go over the data extraction process by getting the data of all transactions from the first week of August 2020. We will go over how to process the results and apply some metrics.
Steps
1. Environment Setup
Install all dependencies in your environment:
pip install pandas
pip install python-dateutil
pip install py-algorand-sdk
2. Imports and Instantiations
First, let's import all the packages needed for this tutorial.
import os
import json
from time import sleep
from datetime import timezone
from dateutil.rrule import HOURLY, rrule
from dateutil.parser import parse
import pandas as pd
from algosdk.v2client import indexer
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
Now, we need to instantiate the indexer client. It takes three parameters - indexer_token, indexer_address, and an optional headers dict.
Info
If you are using an infrastructure service, this information can be found after logging in. For example, on purestake.io:
Tip
Save your API key as an environment variable for easy access
indexer_token = os.environ.get('TOKEN')
myindexer = indexer.IndexerClient(indexer_token=indexer_token, indexer_address='https://mainnet-algorand.api.purestake.io/idx2')
3. Transactions Extraction
In the function below, we extract all transactions recorded on the blockchain between start_time and end_time.
def get_txn_response(start_time, end_time):
    """
    Return all transactions added to the blockchain between 'start_time' and 'end_time'.
    """
    # The indexer expects time inputs in RFC 3339 format
    start_time = start_time.astimezone(timezone.utc).isoformat('T')
    end_time = end_time.astimezone(timezone.utc).isoformat('T')
    nexttoken = ""
    numtx = 1
    responses = []
    # Retrieve up to 1000 transactions per request
    while numtx > 0:
        response = myindexer.search_transactions(start_time=start_time, end_time=end_time,
                                                 next_page=nexttoken, limit=1000)
        transactions = response['transactions']
        responses += transactions
        numtx = len(transactions)
        if numtx > 0:
            # pointer to the next chunk of results
            nexttoken = response['next-token']
    return responses
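The RFC 3339 conversion at the top of the function can be checked in isolation; using a timezone-aware datetime here keeps the result deterministic:

```python
from datetime import datetime, timezone

# A timezone-aware timestamp formatted the way the indexer expects (RFC 3339)
dt = datetime(2020, 8, 1, tzinfo=timezone.utc)
rfc3339 = dt.astimezone(timezone.utc).isoformat('T')
print(rfc3339)  # 2020-08-01T00:00:00+00:00
```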
Note
The indexer provider might impose limits on the number of requests and their duration.
In this example, we split the week's worth of data into 6-hour chunks to avoid exceeding the request processing time limit.
Try experimenting with other chunk sizes.
start = "20200801T000000"
end = "20200808T000000"
# get a list of dates between the start and end dates at 6-hour intervals
dates = list(rrule(HOURLY,
                   dtstart=parse(start),
                   until=parse(end),
                   interval=6))
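As a sanity check, a week split at 6-hour intervals should produce 29 boundary datetimes, i.e. 28 request windows:

```python
from dateutil.rrule import HOURLY, rrule
from dateutil.parser import parse

# 7 days * 4 six-hour steps per day = 28 intervals, hence 29 boundaries
dates = list(rrule(HOURLY, dtstart=parse("20200801T000000"),
                   until=parse("20200808T000000"), interval=6))
print(len(dates))  # 29
```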
Now, we can loop over dates and combine all the data:
Tip
If you have memory limitations, save each chunk separately and use a parallel computing library (like Spark, Dask, etc.) for the computations.
responses = []
for start_time, end_time in zip(dates[:-1], dates[1:]):
    responses += get_txn_response(start_time, end_time)
df = pd.DataFrame(responses)
df.head()
asset-transfer-transaction | close-rewards | closing-amount | confirmed-round | fee | first-valid | genesis-hash | genesis-id | id | intra-round-offset | last-valid | note | receiver-rewards | round-time | sender | sender-rewards | signature | tx-type | payment-transaction | group | lease | keyreg-transaction | asset-config-transaction | created-asset-index | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | {'amount': 31000, 'asset-id': 2512768, 'close-... | 0 | 0 | 8195181 | 1000 | 8195123 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | RPQ7BVPQQENXXJDFLYCMSVFU7QV6Z44LDIIAVISBUHSXCO... | 0 | 8196123 | gqZwVHhuSWTZJGI0OGY1NTUwLTA4NGEtNDZkNS05OGUxLW... | 0 | 1596254404 | SN3FSQVLLNWSI3I7QCMQEQN6L42OW5RP4FNM6BNOZHBRKH... | 0 | {'sig': 'X73FojlwcRlV8A8A9bWvWPKq/6i5qxBOb/xQZ... | axfer | NaN | NaN | NaN | NaN | NaN | NaN |
1 | NaN | 0 | 0 | 8195181 | 1000 | 8195127 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | BGGD5TM6IFZBTDCDCNGUJ4PBDQFAG5RZTATKB7T34LGVKN... | 1 | 8196127 | gqZwVHhuSWTZJDgzZmU0MmQ5LTg2MjYtNDI0NC1hZDczLT... | 0 | 1596254404 | 62FMJKL2BDSYWSF3RYYZHAXA5HICQ57NFZCWWNM4ZJBYSV... | 0 | {'sig': 'P0qA6tng2r6e1aSwoG2sNKQEXjoGWTua+YykQ... | pay | {'amount': 201000, 'close-amount': 0, 'receive... | NaN | NaN | NaN | NaN | NaN |
2 | {'amount': 100, 'asset-id': 2513338, 'close-am... | 0 | 0 | 8195181 | 1000 | 8195127 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | HHFNE2KELVHPL4TGI6B6PXDIZQ6ROGFH7NOUDCJKCRBHXB... | 2 | 8196127 | gqZwVHhuSWTZJDI4NDE1NmM2LTQxNzAtNDhmNy1hMmFkLT... | 0 | 1596254404 | 62FMJKL2BDSYWSF3RYYZHAXA5HICQ57NFZCWWNM4ZJBYSV... | 0 | {'sig': 'jNSYv/8dclBfXkJU2JEDJIVj8rlcnLZRLdd1R... | axfer | NaN | NaN | NaN | NaN | NaN | NaN |
3 | {'amount': 0, 'asset-id': 2512768, 'close-amou... | 0 | 0 | 8195181 | 1000 | 8195134 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | NS4Z2S3NCKMVZSY6VX4CKE7QTN7N5JLT7AVYQFJSRVRLUS... | 3 | 8196134 | gqZwVHhuSWTZJDY0ZThhOGUyLTBhMTItNDdlNC04N2Q3LW... | 0 | 1596254404 | BXUZYY5XN2PFLBJE3BX2RDMQLQUW5ZUWX4ZCNDUL6QMXJZ... | 0 | {'sig': 'FbhxqPDlZxDcaAELShez+rXNQuTR4rRVP4vpe... | axfer | NaN | NaN | NaN | NaN | NaN | NaN |
4 | {'amount': 100, 'asset-id': 2513338, 'close-am... | 0 | 0 | 8195181 | 1000 | 8195123 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | 2S7TV7FI4TMPO7GXZI55PPW3FQE2PJVQ7F34LCS3C6Z7NW... | 4 | 8196123 | gqZwVHhuSWTZJGJmYjlhNGY4LTNmODgtNDQ1My04YWY5LW... | 0 | 1596254404 | 62FMJKL2BDSYWSF3RYYZHAXA5HICQ57NFZCWWNM4ZJBYSV... | 0 | {'sig': '3QqczHlaOjymMeoEZK/ULKSbXv8pYP59AJqt4... | axfer | NaN | NaN | NaN | NaN | NaN | NaN |
4. Preprocessing the Dataset
Notice that some of the columns in the table above hold fields of type dict, which makes calculations on nested fields complicated and messy. In this step, we parse those fields into new columns.
df = pd.concat([df.drop(['asset-transfer-transaction'], axis=1),
                df['asset-transfer-transaction'].apply(pd.Series).add_suffix('-asset-txn').drop(['0-asset-txn'], axis=1, errors='ignore')], axis=1)
df = pd.concat([df.drop(['payment-transaction'], axis=1),
                df['payment-transaction'].apply(pd.Series).add_suffix('-pay-txn').drop(['0-pay-txn'], axis=1, errors='ignore')], axis=1)
df = pd.concat([df.drop(['asset-config-transaction'], axis=1),
                df['asset-config-transaction'].apply(pd.Series).add_suffix('-asst-cnfg-txn').drop(['0-asst-cnfg-txn'], axis=1, errors='ignore')], axis=1)
df = pd.concat([df.drop(['signature'], axis=1),
                df['signature'].apply(pd.Series).add_suffix('-sig').drop(['0-sig'], axis=1, errors='ignore')], axis=1)
# format the unix seconds to a human-readable date
df['date'] = pd.to_datetime(df['round-time'], unit='s').dt.date
df.head()
close-rewards | closing-amount | confirmed-round | fee | first-valid | genesis-hash | genesis-id | id | intra-round-offset | last-valid | note | receiver-rewards | round-time | sender | sender-rewards | tx-type | group | lease | keyreg-transaction | created-asset-index | amount-asset-txn | asset-id-asset-txn | close-amount-asset-txn | close-to-asset-txn | receiver-asset-txn | sender-asset-txn | amount-pay-txn | close-amount-pay-txn | close-remainder-to-pay-txn | receiver-pay-txn | asset-id-asst-cnfg-txn | params-asst-cnfg-txn | sig-sig | multisig-sig | logicsig-sig | date | day | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 8195181 | 1000 | 8195123 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | RPQ7BVPQQENXXJDFLYCMSVFU7QV6Z44LDIIAVISBUHSXCO... | 0 | 8196123 | gqZwVHhuSWTZJGI0OGY1NTUwLTA4NGEtNDZkNS05OGUxLW... | 0 | 1596254404 | SN3FSQVLLNWSI3I7QCMQEQN6L42OW5RP4FNM6BNOZHBRKH... | 0 | axfer | NaN | NaN | NaN | NaN | 31000.0 | 2512768.0 | 0.0 | NaN | IWPA7EBIWKXLROZRFTNQKYKYCAMDSHJQL6QKEMQAO6DKXT... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | X73FojlwcRlV8A8A9bWvWPKq/6i5qxBOb/xQZlqXc+Zuba... | NaN | NaN | 2020-08-01 | Saturday |
1 | 0 | 0 | 8195181 | 1000 | 8195127 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | BGGD5TM6IFZBTDCDCNGUJ4PBDQFAG5RZTATKB7T34LGVKN... | 1 | 8196127 | gqZwVHhuSWTZJDgzZmU0MmQ5LTg2MjYtNDI0NC1hZDczLT... | 0 | 1596254404 | 62FMJKL2BDSYWSF3RYYZHAXA5HICQ57NFZCWWNM4ZJBYSV... | 0 | pay | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 201000.0 | 0.0 | NaN | MGY5TCZEUVOINTRXXPEHFQ3OTBVPOQGMMPANECG2WHASVY... | NaN | NaN | P0qA6tng2r6e1aSwoG2sNKQEXjoGWTua+YykQtf8VVvtwl... | NaN | NaN | 2020-08-01 | Saturday |
2 | 0 | 0 | 8195181 | 1000 | 8195127 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | HHFNE2KELVHPL4TGI6B6PXDIZQ6ROGFH7NOUDCJKCRBHXB... | 2 | 8196127 | gqZwVHhuSWTZJDI4NDE1NmM2LTQxNzAtNDhmNy1hMmFkLT... | 0 | 1596254404 | 62FMJKL2BDSYWSF3RYYZHAXA5HICQ57NFZCWWNM4ZJBYSV... | 0 | axfer | NaN | NaN | NaN | NaN | 100.0 | 2513338.0 | 0.0 | NaN | EBDDABI4F7IWCPLHMICNYKUCAEPCBFP5HXKUITS4SSSQKG... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | jNSYv/8dclBfXkJU2JEDJIVj8rlcnLZRLdd1RYvJbfOWMQ... | NaN | NaN | 2020-08-01 | Saturday |
3 | 0 | 0 | 8195181 | 1000 | 8195134 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | NS4Z2S3NCKMVZSY6VX4CKE7QTN7N5JLT7AVYQFJSRVRLUS... | 3 | 8196134 | gqZwVHhuSWTZJDY0ZThhOGUyLTBhMTItNDdlNC04N2Q3LW... | 0 | 1596254404 | BXUZYY5XN2PFLBJE3BX2RDMQLQUW5ZUWX4ZCNDUL6QMXJZ... | 0 | axfer | NaN | NaN | NaN | NaN | 0.0 | 2512768.0 | 0.0 | NaN | BXUZYY5XN2PFLBJE3BX2RDMQLQUW5ZUWX4ZCNDUL6QMXJZ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | FbhxqPDlZxDcaAELShez+rXNQuTR4rRVP4vpeDassKN0pt... | NaN | NaN | 2020-08-01 | Saturday |
4 | 0 | 0 | 8195181 | 1000 | 8195123 | wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8= | mainnet-v1.0 | 2S7TV7FI4TMPO7GXZI55PPW3FQE2PJVQ7F34LCS3C6Z7NW... | 4 | 8196123 | gqZwVHhuSWTZJGJmYjlhNGY4LTNmODgtNDQ1My04YWY5LW... | 0 | 1596254404 | 62FMJKL2BDSYWSF3RYYZHAXA5HICQ57NFZCWWNM4ZJBYSV... | 0 | axfer | NaN | NaN | NaN | NaN | 100.0 | 2513338.0 | 0.0 | NaN | M2UEUR6EP3JSI2BGGPXEXUIOTTJMNL7A6VSPE357N2CBFA... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3QqczHlaOjymMeoEZK/ULKSbXv8pYP59AJqt4PpRrUFfnp... | NaN | NaN | 2020-08-01 | Saturday |
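The flattening pattern used above can be demonstrated on a toy frame (the column and field names below are illustrative, not the full schema):

```python
import pandas as pd

# Toy frame with one dict-valued column, mimicking 'payment-transaction';
# the second row uses NaN, as non-payment rows do in the real table
toy = pd.DataFrame({'id': ['TX1', 'TX2'],
                    'payment-transaction': [{'amount': 201000, 'receiver': 'AAA'},
                                            float('nan')]})

# apply(pd.Series) expands each dict into columns; add_suffix avoids name
# clashes; the spurious '0' column comes from the NaN rows and is dropped
flat = pd.concat([toy.drop(['payment-transaction'], axis=1),
                  toy['payment-transaction'].apply(pd.Series)
                     .add_suffix('-pay-txn')
                     .drop(['0-pay-txn'], axis=1, errors='ignore')], axis=1)
print(flat.columns.tolist())
```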
5. Calculate Metrics
Now that we have the processed table, we can calculate some metrics and gain some insights into the Algorand blockchain.
Total number of transactions
len(df)
4687303
Types of transactions
df['tx-type'].value_counts()
axfer     3599820
pay       1087417
acfg           57
keyreg          9
Name: tx-type, dtype: int64
Here axfer is an asset transfer, pay is a payment, acfg is an asset configuration, and keyreg is a key registration.
More on transaction types can be found here.
Total amount of Algos transferred during this week (note that the amount field is denominated in microAlgos)
df['amount-pay-txn'].sum()
251299842396859.0
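Since payment amounts are recorded in microAlgos, dividing by 1,000,000 converts the total to Algos:

```python
# 1 Algo = 1,000,000 microAlgos
total_microalgos = 251299842396859.0
total_algos = total_microalgos / 1_000_000
print(total_algos)  # about 251.3 million Algos
```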
Distribution of transactions over the days of the week
df['day'] = pd.to_datetime(df['date']).dt.day_name()
df['day'].value_counts().plot()
Unique assets transferred
df['asset-id-asset-txn'].nunique()
64
df['asset-id-asset-txn'].value_counts()
2512768.0    1699636
2513338.0    1237993
2513746.0     661888
312769.0          87
163650.0          25
438840.0          22
1942776.0         14
604.0             14
438839.0          12
...
137489.0           1
Name: asset-id-asset-txn, dtype: int64
The transactions table gives us asset ids but not asset names. In the next step, we extract more data to learn more about the assets being transferred.
6. Extract Assets
Just like with the transaction extraction, we create a function that loops over the paginated requests.
def get_asset_response():
    nexttoken = ""
    numtx = 1
    responses = []
    # Retrieve up to 1000 assets per request
    while numtx > 0:
        response = myindexer.search_assets(next_page=nexttoken, limit=1000)
        assets = response['assets']
        responses += assets
        numtx = len(assets)
        if numtx > 0:
            # pointer to the next chunk of results
            nexttoken = response['next-token']
    return responses
assets = pd.DataFrame(get_asset_response())
assets.head()
index | params | |
---|---|---|
0 | 5 | {'clawback': 'L226FSG3LTZR4V2MI5M4SDKJSF5HP2TQ... |
1 | 6 | {'clawback': 'QYQYXRJ7FLQCRRGE3WH5CIAAL56LLMIN... |
2 | 235 | {'clawback': 'QA75IQ76F6H2T55G65BY7BPLF5QNWSLT... |
3 | 236 | {'clawback': 'QA75IQ76F6H2T55G65BY7BPLF5QNWSLT... |
4 | 237 | {'clawback': 'QA75IQ76F6H2T55G65BY7BPLF5QNWSLT... |
As in the preprocessing step above, we parse the column holding dict fields:
assets = pd.concat([assets.drop(['params'], axis=1),
                    assets['params'].apply(pd.Series)], axis=1)
assets.head()
And now, we combine the asset value counts from above with the asset names.
assets_counts = (pd.DataFrame(df['asset-id-asset-txn'].value_counts())
                 .join(assets[['index', 'name', 'unit-name']].set_index('index')))
assets_counts.head()
asset-id-asset-txn | name | unit-name | |
---|---|---|---|
2512768.0 | 1699636 | YouNow Pending Props | xPROPS |
2513338.0 | 1237993 | Camfrog Pending Props | xPROPS |
2513746.0 | 661888 | Paltalk Pending Props | xPROPS |
312769.0 | 87 | Tether USDt | USDt |
163650.0 | 25 | Asia Reserve Currency Coin | ARCC |
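The join in the cell above can be sketched on toy data (the asset ids, counts, and names below are illustrative):

```python
import pandas as pd

# Per-asset transaction counts, indexed by asset id
counts = pd.DataFrame({'txn-count': [87, 25]}, index=[312769, 163650])

# Asset metadata keyed by its 'index' column, as returned by search_assets
meta = pd.DataFrame({'index': [312769, 163650],
                     'name': ['Tether USDt', 'Asia Reserve Currency Coin'],
                     'unit-name': ['USDt', 'ARCC']})

# set_index aligns the metadata on asset id, so join matches rows by index
joined = counts.join(meta.set_index('index'))
print(joined)
```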
pd.DataFrame(assets_counts.groupby('unit-name')['asset-id-asset-txn'].sum()).nlargest(10, 'asset-id-asset-txn').plot()
Conclusion
There is a lot of information recorded on the blockchain, and a lot of fascinating insights can be derived from it. The indexer provides an easy way to access this data.
Learn more about how to use the indexer, and don't be shy about sharing your insights with us!