๐ฝ How to Get Bitcoin Transfers ๐ฝ
It's not that hard I promise
Basics:
BTC is UTXO - this means that transactions work like this:
- Address 1 inputs (sends) 10.06 BTC (everything in the address).
- Address 2 outputs (receives) 4 BTC
- Address 3 outputs (receives) 6.06 BTC.
๐คซ Secretly, Address 1 also owns address 3, so they effectively sent 4 BTC to Address 2. BUT we'll never know that Address 1 and Address 3 are owned by the same person (aka the two public keys have the same private key) UNTIL Address 1 and Address 3 appear as inputs together in the same transaction.
Address clustering means finding the networks of addresses that have appeared together as inputs and assigning them an identifier, which confers ownership.
Finally, a transfers table reduces the example transaction above to: FROM: Entity 1, TO: Entity 2, AMOUNT: 4 BTC
QED
Using our new Entities Table, here is some code to make one day of Bitcoin Transfers. Scroll to see the whole query or click the name of the query in the output table to see the query in the editor.
-- first get the input entity for each transaction
-- the input amount doesn't matter, we only need to know the input entity
-- we're grouping by every var at the end so that we don't have input
dupes
WITH inputs AS (
SELECT
tx_id,
COALESCE(project_name, CAST(address_group AS varchar)) AS from_entity
FROM
BITCOIN.CORE.FACT_INPUTS fi
LEFT JOIN BITCOIN.CORE.DIM_ENTITY_CLUSTERS ec ON fi.pubkey_script_address = ec.address
WHERE
block_timestamp >= '2023-10-24'
AND block_timestamp < '2023-10-25'
group by
tx_id,
from_entity
)
-- next get the outputs and merge in the inputs
-- sum up the output value by entity as the transfer amount
-- if there is no project name, or entity, use the pubkey
-- (presumably this is a new address - it could be the change address or just a new address,
-- we'll find out in time)
SELECT
fo.tx_id,
block_timestamp,
from_entity,
COALESCE(
project_name,
CAST(address_group AS varchar),
PUBKEY_SCRIPT_ADDRESS
) AS to_entity,
sum(value) AS transfer_amount
FROM
BITCOIN.CORE.FACT_OUTPUTS fo
LEFT JOIN BITCOIN.CORE.DIM_ENTITY_CLUSTERS ec ON fo.pubkey_script_address = ec.address
LEFT JOIN inputs ON inputs.tx_id = fo.tx_id
WHERE
-- don't forget to subset time or this query will be loooooong
block_timestamp >= '2023-10-24'
AND block_timestamp < '2023-10-25'
AND from_entity != to_entity
GROUP BY
fo.tx_id,
block_timestamp,
from_entity,
to_entity
LIMIT
10000```
โฌ๏ธ โฌ๏ธ As you can see above, with the November 2023 address groups (clusters), the ratio of to_entities to from_entities is about 3:1, but using the same entities (aka having future knowledge), we can calculate the transfers from 3 years ago and get a ratio of 2:1.
>> Over time the transfers table becomes more accurate ๐
>> Flipside has still some tricks up its sleeve to improve transfers accuracy, so buckle in, bc more goodies are on the way ๐บ
โก๏ธ โก๏ธ To the right I've painstakingly copied and pasted the same 1-day transfers query 6 times, so you can see how accuracy improves over time. If the to/from ratio is a good metric for accuracy (aka cluster address completeness), using just address clusters, full accuracy is not achieved until between 6m to 1 year!
๐ Coming soon to a dashboard near you, we'll be publishing a more sophisticated study on clustering accuracy
thank you for reading. Argue Chat with me on twitter @angelaminster