With the advent of software development kits like BDK and LDK, building a bitcoin wallet has never been easier. However, while easier development is welcome, it’s important to build in a way that protects user security and privacy by default. For example, how a light wallet connects to a third-party server to receive and send transaction data is an important issue to address.
I believe that bitcoin wallets need block filters to respect a user’s privacy. Why? For a light wallet, they’re the only way to keep data from leaking to the server, a leak that would allow the server to link a user’s transactions beyond what is publicly available.
In this article, we will explore why bitcoin wallets need block filters by first looking at how many bitcoin users run full nodes, how API wallets offer good user experience but ultimately lead to all your transactions being linked together, how bloom filters have failed to protect privacy, how block filters are the only lightweight wallet network privacy solution, and finally how this can all be implemented using Tor-only communication to also protect a user’s IP address.
Only a Few Users Run Full Nodes
Running and using a Bitcoin node is the best thing you can do because you’re part of the network and you don’t need any intermediaries to receive and broadcast transaction data. However, running a full node is not for everyone; the need for light clients (Simplified Payment Verification) was even envisioned by Satoshi in the Bitcoin whitepaper.
We can’t know how many users are running a full node; we can only know how many nodes there are. Conservative estimates that count only listening nodes put this number at around 16,000, as seen on the Bitnodes.io site. Broader estimates that count both listening and non-listening nodes, such as Luke Dashjr’s node count tool, put it higher, at around 53,000.
It’s also important to be aware of the historical context of the number of full bitcoin nodes. According to the Bitcoin Node Count History by Luke Dashjr, the usage of bitcoin nodes is far from its peak. On January 13, 2018, the count reached 205,000. This was likely related to the fact that bitcoin had reached its previous all-time high a few weeks earlier. In 2021, the node count also increased when the price went up, but it only reached close to 90,000.
These figures suggest that few users run bitcoin nodes, and that this number is not increasing over time. Light wallets are much easier to use than a bitcoin node, so we need to find the right network privacy solution for them. Let’s take a look at the most used technology today, which is API wallets.
API Wallet Service Providers Collect Your Data by Default
Most bitcoin wallets use APIs (Application Programming Interfaces) to send and receive user transaction data. This technology is highly scalable and provides the best user experience, as requests are nearly instantaneous. However, it has an inherent privacy caveat. Let’s break down how it works and how service providers collect your data by default.
When you initialize a standard bitcoin wallet, you import or create a mnemonic seed phrase and set the desired derivation path (often automatically). This gives you an extended public key, often called an xpub. Here’s what it looks like:
xpub6CUGRUonZSQ4TWtTMmzXdrXDtypWKiKrhko4egpiMZbpiaQL2jkwSB1icqYh2cfDfVxdx4df189oLKnC5fSwqPfgyP3hooxujYzAu3fDVmz
Once that’s done, the xpub is automatically sent to the service provider’s server, which derives bitcoin addresses from it up to the gap limit (the number of consecutive unused addresses the server checks before it stops scanning for funds). These addresses are looked up in the server’s index, and if transactions are found, they are sent to the user’s client. The addresses are then watched in case new transactions occur. In addition, when a user sends a transaction, it’s sent through the same communication channel.
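To make the linkage concrete, here is a minimal Python sketch of the scan a server could run once it holds the xpub. It is an illustration under stated assumptions, not any provider’s actual code: `derive_address` is a hypothetical stand-in that hashes the xpub and index instead of performing real BIP32 derivation, `server_index` is a toy dictionary in place of a real address index, and the gap limit of 20 is only a common default.

```python
import hashlib

GAP_LIMIT = 20  # assumed default; individual wallets may use another value


def derive_address(xpub: str, index: int) -> str:
    """Hypothetical stand-in for BIP32 child derivation.

    A real server would derive child public keys from the xpub. Here we
    only hash (xpub, index) so the example stays self-contained.
    """
    digest = hashlib.sha256(f"{xpub}/{index}".encode()).hexdigest()
    return "addr_" + digest[:16]


def scan_wallet(xpub: str, server_index: dict, gap_limit: int = GAP_LIMIT) -> dict:
    """Return {address: [txids]} for every used address the server finds.

    Scanning stops once `gap_limit` consecutive addresses have no history.
    """
    found = {}
    unused_in_a_row = 0
    index = 0
    while unused_in_a_row < gap_limit:
        address = derive_address(xpub, index)
        txids = server_index.get(address, [])
        if txids:
            found[address] = txids
            unused_in_a_row = 0  # a used address resets the gap counter
        else:
            unused_in_a_row += 1
        index += 1
    return found


# Toy data: the addresses at index 0, 3 and 30 have transaction history.
xpub = "xpub-example"
server_index = {derive_address(xpub, i): [f"tx{i}"] for i in (0, 3, 30)}

# Everything in `history` is now known by the server to belong to one wallet.
history = scan_wallet(xpub, server_index)
print(len(history), "addresses linked to the same xpub")
```

With a gap limit of 20, the scan finds the addresses at index 0 and 3 and stops before reaching index 30, which is why funds sent far ahead of the last used address can appear to be missing. More importantly for privacy, every address the scan touches is tied to the same xpub on the server.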
This process is very efficient and allows API wallets to provide a fast and easy user experience. However, the service provider is able to link all of your transactions together, and thus collects your private information by default. Fortunately, many API wallets allow users to connect through Tor, so at least a user’s IP address is protected.
Let’s now examine an alternative method that does not depend on a single server: the use of Bloom filters in light wallets.
Why Bloom Filters Don’t Work for Privacy
Some wallets allow a user to receive and send transaction data through Bloom filters. This communication method was introduced in BIP37 and was originally thought to be private. In this section, we’ll break down what Bloom filters are and why they’re actually not good for privacy.
Bloom filters are probabilistic data structures used to test whether an element is a member of a set. In the bitcoin context, a light client builds a Bloom filter from its addresses (the set) and sends it to network peers, which test the blockchain data (the elements) against it. If there’s a match, the transaction data is sent to the light client. It’s probabilistic because there are false positives, but these are later discarded by the light client.
It was thought that the false positive rate would be high enough that a network peer wouldn’t be able to tell which transactions were really yours and which were false positives. However, due to an implementation error, the false positive rate was actually lower than intended.
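A small Python sketch may help show how such a filter behaves. This is a simplified illustration, not BIP37 itself: BIP37 uses MurmurHash3 and its own sizing rules, while this sketch assumes SHA-256 with a per-hash seed, and the class name, filter size and address strings are all made up for the example.

```python
import hashlib


class BloomFilter:
    """A basic Bloom filter (illustrative; not the BIP37 wire format)."""

    def __init__(self, size_bits: int, num_hashes: int, tweak: int = 0):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.tweak = tweak
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive `num_hashes` bit positions by seeding SHA-256 differently.
        for i in range(self.num_hashes):
            seed = f"{self.tweak}:{i}:".encode()
            digest = hashlib.sha256(seed + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # True for every added item, and occasionally for others (false positives).
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Light client side: put the wallet's addresses into a deliberately small filter.
own_addresses = [f"my-address-{i}".encode() for i in range(10)]
wallet_filter = BloomFilter(size_bits=256, num_hashes=2)
for address in own_addresses:
    wallet_filter.add(address)

# Peer side: test every address seen on chain against the received filter.
chain_addresses = own_addresses + [f"other-address-{i}".encode() for i in range(1000)]
relayed = [a for a in chain_addresses if wallet_filter.might_contain(a)]

false_positives = len(relayed) - len(own_addresses)
print(f"{len(relayed)} matches relayed, {false_positives} of them false positives")
```

The client always receives its own transactions, plus whatever happens to collide with the filter. The privacy of the scheme rests entirely on those collisions being numerous enough to hide the real matches.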
Additionally, a light client can create different Bloom filters for the same wallet, and if a network peer collects two or more of them, it can calculate their intersection to remove false positives. Finally, if blockchain data is analyzed and the user doesn’t coinjoin or use coin control, a network peer can infer which addresses don’t belong to the user.
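The intersection problem can be sketched in a few lines of Python. The assumptions are the same kind as before: a Bloom filter is modeled simply as the set of bit positions that are switched on, SHA-256 stands in for BIP37’s hash function, and the sizes, tweak values and address strings are invented for the illustration.

```python
import hashlib

SIZE_BITS = 256   # deliberately small so false positives are common
NUM_HASHES = 2


def positions(item: str, tweak: int) -> set:
    """Bit positions an item maps to under a given filter tweak."""
    return {
        int.from_bytes(
            hashlib.sha256(f"{tweak}:{i}:{item}".encode()).digest()[:8], "big"
        ) % SIZE_BITS
        for i in range(NUM_HASHES)
    }


def build_filter(items, tweak: int) -> set:
    """Model a Bloom filter as the set of bit positions that are set."""
    bits = set()
    for item in items:
        bits |= positions(item, tweak)
    return bits


def matches(candidates, bits: set, tweak: int) -> set:
    """Every candidate whose positions are all set in the filter."""
    return {c for c in candidates if positions(c, tweak) <= bits}


wallet = {f"wallet-addr-{i}" for i in range(20)}
chain = wallet | {f"other-addr-{i}" for i in range(5000)}

# The same wallet sends two filters built with different tweaks.
seen_a = matches(chain, build_filter(wallet, tweak=1), tweak=1)
seen_b = matches(chain, build_filter(wallet, tweak=2), tweak=2)

# A peer holding both keeps only the addresses that match each time.
suspects = seen_a & seen_b
print(len(seen_a), len(seen_b), len(suspects))
```

The wallet’s real addresses match both filters by construction, while a false positive in one filter is unlikely to also be a false positive in the other. Each additional filter a peer collects therefore shrinks the crowd the real addresses were supposed to hide in.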
You can read more about the privacy issues with BIP37 here. Now let’s examine the remaining light client network solution.
A Bitcoin Wallet Needs Block Filters for Privacy
Back in 2018, there was no real solution to this problem; block filters weren’t a thing yet. Fortunately, they were introduced the following year in BIP157 and BIP158, and are now implemented in several wallets and bitcoin software such as Wasabi, Blixt, Breez, LND, and LDK. They’re often referred to as Neutrino. In this section, we’ll examine how they work and why they’re the right solution for network privacy.
Block filters are compact summaries of each block’s contents. They let a wallet work out locally which blocks are relevant to it and then download those whole blocks from peers, instead of asking for individual transactions and revealing which ones it cares about.
The block filter process typically involves three steps. First, a user downloads the block filters representing the blockchain from a network peer in the case of Breez, or from the coordinator server in the case of Wasabi. Then, the light client checks to see if the addresses within the gap limit match a block filter. Finally, if there’s a match, the corresponding block is downloaded.
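The three steps above can be sketched in Python as follows. This is a heavily simplified model and not the BIP158 encoding: real filters are Golomb-coded sets built with SipHash, whereas this sketch assumes SHA-256 and a plain Python set. The constant `M` is taken from BIP158’s basic filter type; every function name, the toy chain and the `fetch_block` callback are hypothetical.

```python
import hashlib

M = 784931  # range multiplier used by BIP158 basic filters


def hash_to_range(script: str, block_hash: str, n_items: int) -> int:
    """Map a script to [0, n_items * M). Assumes n_items >= 1."""
    digest = hashlib.sha256(f"{block_hash}:{script}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % (n_items * M)


def build_block_filter(block_hash: str, scripts: list) -> dict:
    """Peer or server side: summarize a block's scripts as hashed values."""
    unique = sorted(set(scripts))
    values = {hash_to_range(s, block_hash, len(unique)) for s in unique}
    return {"block_hash": block_hash, "n": len(unique), "values": values}


def filter_matches(block_filter: dict, watched_scripts: list) -> bool:
    """Client side: the check runs locally, so nothing is sent to the peer."""
    return any(
        hash_to_range(s, block_filter["block_hash"], block_filter["n"])
        in block_filter["values"]
        for s in watched_scripts
    )


def sync(filters: list, watched_scripts: list, fetch_block) -> list:
    """Take the filters, match them locally, download matching blocks whole."""
    return [
        fetch_block(f["block_hash"])
        for f in filters
        if filter_matches(f, watched_scripts)
    ]


# Toy chain: only block-b contains a script the wallet is watching.
chain = {
    "block-a": ["script-1", "script-2"],
    "block-b": ["script-3", "my-script"],
    "block-c": ["script-4"],
}
filters = [build_block_filter(h, scripts) for h, scripts in chain.items()]
watched = ["my-script", "my-unused-script"]  # addresses within the gap limit

downloaded = sync(filters, watched, fetch_block=lambda h: (h, chain[h]))
print([block_hash for block_hash, _ in downloaded])
```

The design point is the direction of the data flow. The peer hands out the same filters to everyone, and the only thing it learns from a given client is which whole blocks that client requested.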
Because we’re downloading entire blocks instead of individual transactions, and because there’s a false positive rate, the block filter method works to protect a user’s privacy from network peers. Unlike with Bloom filters and API wallets, a peer or server can’t figure out (and doesn’t directly collect) the connection between a user’s transactions, other than what is publicly known on the blockchain.
Block filters are part of the solution to network privacy, but something else is needed to complete the picture.
Tor is the Last Remaining Piece to Solving Network Privacy
Tor and bitcoin go hand in hand, and together with block filters, can solve network privacy for lightweight clients. Tor hides a user’s IP address from the destination server by routing traffic through a network of nodes. This mechanism is called onion routing because of the multiple layers of encryption wrapped around each message.
Tor and block filters have one thing in common: both can slow down performance enough to be noticeable and degrade the user experience. Some people think you just have to accept this, but I think it can be improved to the point where it’s barely noticeable.
For example, the Tor community has implemented a communication reliability solution called Conflux. Instead of relying on a single circuit, clients use two different Tor circuits to increase the likelihood of fast completion. This, along with innovations in wallet loading for block filters like Turbosync on the Wasabi wallet, will lead us to a future where a user doesn’t have to choose between usability and privacy, but can enjoy both.
This is a guest post by Gustavo Flores Echaiz. Opinions expressed are entirely their own and do not necessarily reflect those of BTC Inc or Bitcoin Magazine.