KumbhCoinorg

    A Brief History Of Wallet Clustering

    By kumbhorg, July 3, 2025

    Our previous post in this series introduced the basic idea behind wallet or address clustering, the trivial case of address reuse, and the merging of clusters based on the common input ownership heuristic (CIOH), also known as the multi-input heuristic.

    Today, we’ll expand on more sophisticated clustering methods, briefly summarizing several notable papers. The content here mostly overlaps with a live stream on this topic, which is a companion to this series. Note that the list of works cited is by no means exhaustive.

    Early Observational Studies – 2011-2013

    As far as I’m aware, the earliest published academic study that deals with clustering is Fergal Reid and Martin Harrigan’s An Analysis of Anonymity in the Bitcoin System (PDF). This work studies the anonymity properties of bitcoin more broadly. In its discussion of the on-chain transaction graph, it introduced the notion of a “User Network” to model the relatedness of a single user’s coins based on CIOH. Using this model, the authors critically examined WikiLeaks’ claim that it “accepts anonymous Bitcoin donations.”
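
To make the CIOH-based grouping concrete, here is a minimal sketch using a union-find structure. It is not the method of any paper cited here; the transaction format (a dict with an `input_addresses` list) and the function names are assumptions made for illustration.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure over arbitrary hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving keeps the trees shallow.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_by_cioh(transactions):
    """Merge all input addresses of each transaction into one cluster.

    `transactions` is a hypothetical format: a list of dicts, each with an
    "input_addresses" list. Returns a sorted list of sorted address lists.
    """
    uf = UnionFind()
    for tx in transactions:
        addresses = tx["input_addresses"]
        for address in addresses:
            uf.find(address)  # register single-input addresses too
        for other in addresses[1:]:
            uf.union(addresses[0], other)

    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    return sorted(sorted(members) for members in clusters.values())


example = [
    {"input_addresses": ["A", "B"]},
    {"input_addresses": ["B", "C"]},
    {"input_addresses": ["D"]},
]
print(cluster_by_cioh(example))
```

Because merging is transitive, one transaction that combines inputs belonging to unrelated users is enough to join their clusters, which is why the heuristic has to be applied with care.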

    Another study that was not published as a paper was Bitcoin – An Analysis (YouTube) by Kay Hamacher and Stefan Katzenbeisser, presented at 28c3. They studied money flows using transaction graph data and made some remarkably prescient observations about bitcoin.

    In Quantitative Analysis of the Full Bitcoin Transaction Graph (PDF), Dorit Ron and Adi Shamir analyzed a snapshot of the entire transaction graph. Among other things, they note a curious pattern, which may be an early attempt at subverting CIOH:

    We discovered that almost all these large transactions were the descendants of a single large transaction involving 90,000 bitcoins [presumably b9a0961c07ea9a28…] which took place on November 8th, 2010, and that the subgraph of these transactions contains many strange looking chains and fork-merge structures, in which a large balance is either transferred within a few hours through hundreds of temporary intermediate accounts, or split into many small amounts which are sent to different accounts only in order to be recombined shortly afterward into essentially the same amount in a new account.

    Another early confounding factor for CIOH was MtGox, which allowed users to upload their private keys. Many users’ keys were used as inputs to batch sweeping transactions that MtGox constructed to service this unusual pattern of deposits. The naive application of CIOH to those transactions resulted in cluster collapse, specifically the cluster previously known as MtGoxAndOthers on walletexplorer.com (now known as CoinJoinMess). Ron and Shamir seem to note this, too:

    However, there is a huge variance in [these] statistics, and in fact one entity is associated with 156,722 different addresses. By analyzing some of these addresses and following their transactions, it is easy to determine that this entity is Mt.Gox

    Although change identification is mentioned (Ron & Shamir refer to these as “internal” transfers), the first attempt at formalization appears to be in Evaluating User Privacy in Bitcoin (PDF) by Elli Androulaki, Ghassan O. Karame, Marc Roeschlin, Tobias Scherer, and Srdjan Capkun. They used the term “Shadow Addresses” for what these days are more commonly referred to as “change outputs.” This refers to self-spend outputs, typically one per transaction, controlled by the same entity as the inputs of the containing transaction. The paper introduces a heuristic for identifying such outputs in order to cluster them with the inputs. Subsequent work has iterated on this idea extensively, with several proposed variations. One example, based on the amounts in two-output transactions: if an output’s value is close to a round number when denominated in USD (based on historical exchange rates), that output is likely to be a payment, indicating that the other output is the change.
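
The round-USD-amount variation can be sketched as follows. This is only an illustration of the idea, not the heuristic as defined in any particular paper: the function name, the 5 USD rounding step, and the 0.02 USD tolerance are all assumptions, and the historical exchange rate is supplied by the caller.

```python
def guess_change_output(output_values_sats, usd_per_btc,
                        round_step_usd=5.0, tolerance_usd=0.02):
    """Guess which output of a two-output transaction is the change.

    Returns the index of the likely change output, or None when the
    heuristic does not apply (not exactly two outputs, or both or neither
    output looks like a round USD amount).
    The step and tolerance defaults are illustrative assumptions.
    """
    if len(output_values_sats) != 2:
        return None

    def looks_round(value_sats):
        usd = value_sats / 100_000_000 * usd_per_btc
        nearest = round(usd / round_step_usd) * round_step_usd
        return nearest > 0 and abs(usd - nearest) <= tolerance_usd

    flags = [looks_round(value) for value in output_values_sats]
    if flags[0] != flags[1]:
        # Exactly one output is a round USD amount: treat it as the payment
        # and the other one as change.
        return flags.index(False)
    return None


# At an assumed rate of 20,000 USD/BTC, 250,000 sats is 50 USD.
print(guess_change_output([250_000, 1_234_567], usd_per_btc=20_000))
```

Returning None in the ambiguous cases is a deliberate choice in this sketch: a heuristic that abstains is easier to combine with others than one that always guesses.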

    This early phase of Bitcoin privacy research saw the theory of wallet clustering become established as a foundational tool for the study of Bitcoin privacy. While this wasn’t entirely theoretical, evidential support was limited, necessitating relatively strong assumptions to interpret the observable data.

    Empirical Results – 2013-2017

    Although researchers attempted to validate the conclusions of these papers, for example, by interviewing Bitcoin users and asking them to confirm the accuracy of the clustering of their wallets or using simulations as in Androulaki et al.’s work, little information was available about the countermeasures users were utilizing.

    A fistful of bitcoins: characterizing payments among men with no names (PDFs: 1, 2) by Sarah Meiklejohn, Marjori Pomarole, Grant Jordan, Kirill Levchenko, Damon McCoy, Geoffrey M. Voelker, and Stefan Savage examined the use of Bitcoin mixers, and put the heuristics to the test by actually using such services with real Bitcoin. On the more theoretical side, they defined a more general and accurate change identification heuristic than previous work.

    In his thesis, Data-Driven De-Anonymization in Bitcoin, Jonas Nick was able to validate the CIOH and change identification heuristics using information obtained from a privacy bug in the implementation of BIP 37 bloom filters, mainly used by light clients built with bitcoinj. The underlying privacy leak was described in On the privacy provisions of Bloom filters in lightweight bitcoin clients (PDF) by Arthur Gervais, Srdjan Capkun, Ghassan O. Karame, and Damian Gruber. The leak demonstrated that the clustering heuristics were rather powerful, a finding which was elaborated on in Martin Harrigan and Christoph Fretter’s The Unreasonable Effectiveness of Address Clustering (PDF).

    Attackers have also been observed sending bitcoin not through a mixer, as in A fistful of bitcoins, but in small amounts to addresses that have already appeared on-chain. This behavior is called dusting or a dust attack[1], and it can deanonymize the victim in two ways. First, the receiving wallet may spend the funds, resulting in address reuse. Second, older versions of Bitcoin Core used to rebroadcast received transactions, so an attacker who was also connected to many nodes on the p2p network could observe whether any node was rebroadcasting its dusting transactions and link that node’s IP address to the cluster.[2]
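
As a rough illustration of what a dusting pattern looks like in transaction data, the sketch below flags small outputs sent to addresses that had already appeared in earlier transactions. The input format and function name are hypothetical, and the 546 satoshi threshold is an assumption chosen for the example; real analyses use more careful criteria.

```python
# Assumption for this sketch: treat outputs at or below this value as dust.
DUST_THRESHOLD_SATS = 546


def find_dust_candidates(transactions, threshold_sats=DUST_THRESHOLD_SATS):
    """Flag small outputs paid to addresses that were seen before.

    `transactions` is a hypothetical format: a list in chain order, each a
    dict with a "txid" and an "outputs" list of (address, value_sats) pairs.
    Returns a list of (txid, output_index, address) tuples.
    """
    seen_addresses = set()
    candidates = []
    for tx in transactions:
        for index, (address, value_sats) in enumerate(tx["outputs"]):
            if value_sats <= threshold_sats and address in seen_addresses:
                candidates.append((tx["txid"], index, address))
        # Only mark addresses as seen after the whole transaction is
        # scanned, so an address first used here is not flagged.
        seen_addresses.update(address for address, _ in tx["outputs"])
    return candidates


history = [
    {"txid": "tx1", "outputs": [("addr1", 100_000), ("addr2", 50_000)]},
    {"txid": "tx2", "outputs": [("addr1", 546), ("addr3", 546), ("addr9", 90_000)]},
]
print(find_dust_candidates(history))
```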

    Although Is Bitcoin gathering dust? An analysis of low-amount Bitcoin transactions (PDF) by Matteo Loporchio, Anna Bernasconi, Damiano Di Francesco Maesa, and Laura Ricci appeared in 2023, the data set it analyzed only extends to 2017. This work explored dust attacks and looked at their effectiveness in revealing clusters:

    This means that the dust attack transactions, despite being only 4.86% of all dust creating transactions, allow to cluster 66.43% of all dust induced clustered addresses. Considering the whole data set, the transactions suspected of being part of dust attacks are only 0.008% of all transactions but allow to cluster 0.14% of all addresses that would have otherwise remained isolated.

    This period of research was marked by a more critical examination of the theory of wallet clustering. It became increasingly clear that, in some cases, users’ behaviors can be easily and reliably observed and that privacy assurances are far from perfect, not just in theory but also based on a growing body of scientific evidence.

    Wallet Fingerprinting – 2021-2024

    Wallet fingerprints are identifiable patterns in transaction data that may indicate the use of particular wallet software. In recent years, researchers have applied wallet fingerprinting techniques to wallet clustering. The transactions of a single wallet cluster are typically created using the same software throughout, so any observable fingerprints should be fairly consistent within the cluster.[3]

    As a simple example of wallet fingerprinting, every transaction has an nLockTime field, which can be used to post-date transactions.[4] This can be done by specifying a height or a time. When no post-dating is required, any value representing a point in time that is already in the past can be used, typically 0, though such transactions were evidently not post-dated when they were signed. To avoid revealing intended behavior and to address the fee sniping concern, some wallets instead specify a more recent, partly randomized nLockTime value. However, some wallets always specify a value of 0, so when it’s not clear which output of a transaction is a payment and which is change, that information might be revealed by subsequent transactions. For example, suppose all of the transactions associated with the input coins specify an nLockTime of 0, but the spending transaction of one of the outputs does not. In this case, it would be reasonable to conclude that this output was a payment to a different user.
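
The nLockTime example above can be expressed as a small sketch. The function name, the input format, and the label strings are assumptions for illustration; a real implementation would compare distributions of several fingerprints rather than a single field.

```python
def classify_outputs_by_locktime(funding_locktimes, spending_locktimes):
    """Label outputs using the nLockTime fingerprint described above.

    funding_locktimes: nLockTime values of the transactions associated
        with the input coins (the sender's observable history).
    spending_locktimes: dict mapping output index to the nLockTime of the
        transaction that spends that output, or None if it is unspent.
    Returns a dict of labels, or an empty dict when the sender's history
    is not consistently 0 and the heuristic does not apply.
    """
    if not funding_locktimes or any(lt != 0 for lt in funding_locktimes):
        return {}

    labels = {}
    for index, locktime in spending_locktimes.items():
        if locktime is None:
            labels[index] = "unknown"
        elif locktime == 0:
            # Same fingerprint as the sender's wallet.
            labels[index] = "consistent with change"
        else:
            # Different fingerprint: probably spent by other software.
            labels[index] = "likely payment"
    return labels


print(classify_outputs_by_locktime([0, 0, 0], {0: 0, 1: 840_000}))
```

A label of "consistent with change" is weak evidence on its own, since the recipient's wallet might also use 0; the mismatch case is the more informative one.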

    There are many other known fingerprints. Wallet Fingerprints: Detection & Analysis by Ishaana Misra is a comprehensive account.

    Malte Möser and Arvind Narayanan’s Resurrecting Address Clustering in Bitcoin (PDF) applied fingerprinting to the clustering problem. They used it as the basis for refinements to change identification. They relied on fingerprints to train and evaluate improved change identification using machine learning techniques (random forests).

    Shortly thereafter, in How to Peel a Million: Validating and Expanding Bitcoin Clusters (PDF), George Kappos, Haaroon Yousaf, Rainer Stütz, Sofia Rollet, Bernhard Haslhofer and Sarah Meiklejohn extended and validated this approach using cluster data for a sample of transactions provided by a chain analytics company, indicating that the wallet fingerprinting approach is dramatically more accurate than only using CIOH and simpler change identification heuristics. Taking fingerprints into account when clustering makes deanonymization much easier. Likewise, taking fingerprints into account in wallet software can improve privacy.

    A recent paper, Exploring Unconfirmed Transactions for Effective Bitcoin Address Clustering (PDF) by Kai Wang, Yakun Cheng, Michael Wen Tong, Zhenghao Niu, Jun Pang, and Weili Han, analyzed patterns in the broadcast of transactions before they are confirmed. For example, different fee-bumping behaviors can be observed, either via replacement or with child-pays-for-parent. Such patterns are not strictly fingerprints derived from the transaction data, but they can still be thought of as wallet fingerprints: more ephemeral patterns related to certain wallet software, observable when connected to the Bitcoin P2P network but not apparent in the confirmed transaction history recorded in the blockchain.

    Similar to the Bitcoin P2P layer, the Lightning Network’s gossip layer shares information about publicly announced channels. This is not typically framed as a wallet fingerprint but might be loosely considered as such, in addition to the on-chain fingerprint that Lightning transactions have. Lightning channels are UTXOs, and they form the edges of a graph connecting Lightning nodes, which are identified by their public keys. Since a node may be associated with several channels, and channels are coins, this is somewhat analogous to address reuse.[5] Christian Decker has publicly archived historical graph data. One study that looks at clustering in this context is Cross-Layer Deanonymization Methods in the Lightning Protocol (PDF) by Matteo Romiti, Friedhelm Victor, Pedro Moreno-Sanchez, Peter Sebastian Nordholt, Bernhard Haslhofer, and Matteo Maffei.
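
To illustrate why announced channels resemble address reuse, the sketch below groups channel funding outputs by the node keys that announce them. The announcement dict format and function names are assumptions made for this example and do not reflect the archived data set’s actual schema; the short channel ID layout (block height, transaction index, output index) follows the Lightning specification.

```python
from collections import defaultdict


def decode_short_channel_id(scid):
    """Split a 64-bit short channel ID into its on-chain coordinates.

    Layout: 3 bytes block height, 3 bytes transaction index within the
    block, 2 bytes output index.
    """
    block_height = scid >> 40
    tx_index = (scid >> 16) & 0xFFFFFF
    output_index = scid & 0xFFFF
    return (block_height, tx_index, output_index)


def channels_by_node(announcements):
    """Map each node public key to the funding outputs of its channels.

    `announcements` is a hypothetical format: dicts with the keys
    "short_channel_id", "node_id_1" and "node_id_2".
    """
    by_node = defaultdict(set)
    for announcement in announcements:
        funding = decode_short_channel_id(announcement["short_channel_id"])
        for key in ("node_id_1", "node_id_2"):
            by_node[announcement[key]].add(funding)
    return {node: sorted(coins) for node, coins in by_node.items()}


gossip = [
    {"short_channel_id": (700_000 << 40) | (1_234 << 16) | 1,
     "node_id_1": "02aa", "node_id_2": "03bb"},
    {"short_channel_id": (700_100 << 40) | (7 << 16) | 0,
     "node_id_1": "02aa", "node_id_2": "03cc"},
]
print(channels_by_node(gossip))
```

Every funding output listed under one node key is publicly tied to the same identity, which is the sense in which the gossip data links coins together.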

    Clustering techniques have improved dramatically over the last decade and a half. Unfortunately, widespread adoption of Bitcoin privacy technologies is still far from being a reality. Even if it were, the software has not yet caught up to the state of the art in attack research.

    Not The Whole Story

    As we have seen, starting from the humble beginnings of address reuse and the CIOH described by Satoshi, wallet clustering is a foundational idea in Bitcoin privacy that has seen many developments over the years. A wealth of academic literature has called into question some of the overly optimistic characterizations of Bitcoin privacy, starting with WikiLeaks describing donations as anonymous in 2011. There are also many opportunities for further study and for the development of privacy protections.

    Something to bear in mind is that clustering techniques will only continue to improve over time. “[R]emember: attacks always get better, they never get worse.”[6] Given the nature of the blockchain, patterns in the transaction graph will be preserved for anyone to examine more or less forever. Light wallets that use the Electrum protocol will leak address clusters to their Electrum servers. Ones that submit xpubs to a service will leak clustering information of all past and future transactions in a single query. Given the nature of the blockchain analysis industry, proprietary techniques are at a significant advantage, likely benefiting from access to KYC information labeling a large subset of transactions. This and other kinds of blockchain-extrinsic clustering information are especially challenging to account for since, despite being shared with third parties, this information is not made public, unlike clustering based on on-chain data. Hence, these leaks aren’t as widely observable.

    Also, bear in mind that control over one’s privacy isn’t entirely in the hands of the individual. When one user’s privacy is lost, that degrades the privacy of all other users. Through the process of elimination, which suggests a linear progression of privacy decay, every successfully deanonymized user can be discounted as a possible candidate when attempting to deanonymize the transactions of the remaining users. In other words, even if you take precautions to protect your privacy, there will be no crowd to blend into if others don’t take precautions, too.

    However, as we shall see, assuming linear decay of privacy is often too optimistic; exponential decay is a safer assumption. This is because divide-and-conquer tactics also apply to wallet clustering, much like in the game of 20 questions. CoinJoin transactions are designed to confound the CIOH, and the topic of the next post will be a paper that combines wallet clustering with intersection attacks, a concept borrowed from the mixnet privacy literature, to deanonymize CoinJoins.
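
The difference between linear and exponential decay can be made concrete with a toy calculation. This is an idealized sketch, assuming each divide-and-conquer observation splits the remaining candidates evenly, as a well-chosen question does in 20 questions; the function names are made up for the example.

```python
import math


def remaining_after_elimination(n_users, k_deanonymized):
    """Linear decay: each deanonymized user removes one candidate."""
    return max(n_users - k_deanonymized, 1)


def remaining_after_bisection(n_users, k_observations):
    """Exponential decay: each observation halves the candidate set.

    Assumes perfectly even splits, which real observations rarely give.
    """
    return max(math.ceil(n_users / 2 ** k_observations), 1)


def observations_needed(n_users):
    """Number of even splits needed to single out one of n_users."""
    return math.ceil(math.log2(n_users))


crowd = 1_000_000
print(remaining_after_elimination(crowd, 20))
print(remaining_after_bisection(crowd, 20))
print(observations_needed(crowd))
```

Under these assumptions, 20 eliminations barely dent a crowd of a million, while 20 even splits are enough to single out one user.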

    [1] Not to be confused with a different kind of dust attack, such as this example, which LaurentMT and Antoine Le Calvez analyzed taking clustering into account.

    [2] A notable and somewhat related attack on Zcash and Monero nodes (Remote Side-Channel Attacks on Anonymous Transactions by Florian Tramer, Dan Boneh and Kenny Paterson) was able to link node IP addresses to viewing keys by exploiting timing side channels on the P2P layer.

    [3] More precisely: fingerprint distributions should be consistent within a cluster, as some wallets deliberately randomize certain attributes of transactions.

    [4] Note that for nLockTime to be enforced, the nSequence value of at least one input of the transaction must also be non-final, which complicates things both for post-dating and in terms of the different observable patterns this gives rise to.

    [5] Channel funds are shared by both parties to the channel, but the closing transaction resembles a payment from the funder of a channel. Dual-funded channels may confound CIOH, similarly to PayJoin transactions.

    [6] New Attack on AES – Schneier on Security
