Full relational diagram for Bitcoin public data on Google BigQuery

Rif Kiamil
Google Cloud - Community
2 min readFeb 12, 2021

--

Find a full relational diagram for the Bitcoin data coming from an amazing opensource project https://github.com/blockchain-etl/bitcoin-etl. If you prefer to deal with star or snowflake schema more check out my simplified relational diagram.

An interactive diagram can be found https://dbdiagram.io/d/6026650980d742080a3a482c

Two RECORDS type fields in crypto_bitcoin.transactions

As each Bitcoin transaction can move money from multiple wallets (inputs) to multiple wallets (outputs) we need to be able to store this data in a two record type fields on the crypto_bitcoin.transactions.

transactions.inputs and transactions.outputs

Normalised schema using views

If you still prefer to use table joins vs array/UNNEST function then crypto_bitcoin.input and crypto_bitcoin.output might be helpful.

crypto_bitcoin.input is a view using the data crypto_bitcoin.transactions

Notice how you don’t need to use a table join on transactions and inputs

crypto_bitcoin.output is a view using the data crypto_bitcoin.transactions

Notice how you don’t need to use a table join on transactions and outputs

Partitioned tables - keeping cost low and queries fast!

  • When working with crypto_bitcoin.blocks and crypto_bitcoin.transactions try to use partitioned field block_timestamp_month. This will speed up your queries and reduce your costs. In YYYY-MM-DD the DD is always 01.
WHERE transactions.block_timestamp_month = '2018-02-01'
WHERE blocks.block_timestamp_month = '2010-12-01'
Setting for a table partitioned can be found under the details tab

SQL with partitioned field

Try out this query using a partitioned field.

161 MiB with partitioned field

SQL without a partitioned field

412.3 GiB ( 421,888 MiB ) with out partitioned field

--

--

Rif Kiamil
Google Cloud - Community

Happy to answer questions on SQL, ERP, Blockchain & Google Cloud Platform.