I wrote this article mainly to compile my notes on the architecture of bitcoin as I studied the Satoshi Nakamoto whitepaper and the protocol documentation (doing some due diligence for a project I am working on).
There are many potentially very exciting applications made possible by bitcoin (or a successor cryptocurrency), with mind-blowing implications for the evolution of economic and social orders, but in this article I will focus specifically on the details of bitcoin as an existing technology, independent of possible uses and technologies that may be built on top of this versatile protocol.
Cryptocurrencies and the Problem of Double-spending
A fundamental concept common to all cryptocurrencies is that of an immutable and shared record of transactions, wherein transactions are digitally signed by the payer. Using digital signing algorithms like DSA and ECDSA, the public key that anonymously specifies the address of a payer may be used by any onlooker to verify that the payer held an associated private key. We can therefore be certain that only the individual controlling the private key associated with the address of those funds could have initiated the transfer. For this reason, cryptocurrencies do not have a problem analogous to counterfeit money.
The challenge faced by cryptocurrencies prior to bitcoin was how to solve the problem of double-spending. Double-spending occurs when a payer is able to spend funds they have already spent, because one party sees a lagging, divergent, or otherwise compromised state of the ledger of transactions. Prior to bitcoin, a central server was the only truly viable way to ensure that all parties maintained a consistent view of the ledger. The problem with central servers, much like financial institutions, is that they are entities that have to be trusted, and are a potential point of failure, fraud, and exploitation. It was likely this vulnerability that prevented centralized cryptocurrencies from ever gaining nearly as much traction as bitcoin has.
The Block Chain
The major innovation introduced by bitcoin is its strategy of creating a distributed public ledger, the block chain, whose integrity is maintained without the need for trust or a central authority. The block chain, as we will explore below, is an immutable chain of block data structures, each containing recent transactions, a reference to the previous block, and proof of work.
At every moment, bitcoin clients are broadcasting messages over TCP. These include transaction messages representing transfers of value, messages for connecting to a peer, messages announcing new blocks, and a number of other message types. After a transaction has been broadcast, it is not considered safely complete until it has been buried under several blocks. Six confirmations is the common convention; the often-quoted figure of 100 blocks is the waiting period before newly mined coins can be spent. Blocks are generated by miner nodes (we’ll discuss blocks in further detail in the section on mining). When we say that blocks are chained, we mean that each block contains a reference to the previous block, so its hash proves that the previous block existed when it was generated. The client accepts the chain with the most total difficulty as the valid chain. This is typically referred to as the “longest” chain, but it is determined by accumulated work, not by the number of blocks.
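The "most total difficulty" rule can be illustrated with a small Python sketch. It assumes that the work behind one block is approximated by the expected number of hashes needed to land at or below that block's target in a 256-bit hash space; the function names and the toy targets are hypothetical, chosen only for this illustration.

```python
from typing import List

def block_work(target: int) -> int:
    # Expected number of hashes needed to find a value <= target,
    # assuming hashes are uniform over a 256-bit space (an assumption
    # of this sketch).
    return (1 << 256) // (target + 1)

def chain_work(targets: List[int]) -> int:
    # A chain is represented here only by the targets of its blocks.
    return sum(block_work(t) for t in targets)

def best_chain(chains: List[List[int]]) -> List[int]:
    # Pick the chain with the most accumulated work, not the most blocks.
    return max(chains, key=chain_work)

EASY = 1 << 240          # hypothetical high (easy) target
HARD = 1 << 236          # lower target, so each block is 16x harder

long_easy_chain = [EASY] * 5
short_hard_chain = [HARD] * 2
winner = best_chain([long_easy_chain, short_hard_chain])
```

Here the two-block chain wins over the five-block chain, because each of its blocks represents sixteen times as much expected work.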
Mining and Proof of Work
Bitcoin’s p2p network maintains the integrity of its block chain by introducing the notion of proof of work. If any node could simply receive the transactions broadcast by payers and record them into new blocks in the ledger, then someone could create a large number of nodes that conspire to falsify the ledger, compromising the integrity of the network. If, in this hypothetical no-work version of bitcoin, validity were determined by which chain is literally the longest, then controlling over 50% of nodes and subverting this longest-chain heuristic would not be difficult. Such a so-called 51% attack, were it feasible in the real bitcoin, would pose an existential threat to the currency, because the ledger could no longer be relied upon.
Bitcoin reduces the probability of a 51% attack to near-impossibility by requiring nodes to perform work, and to provide proof of that work, in order to contribute to maintaining the ledger. Each block that a miner proposes contains:
- all recent unconfirmed transactions broadcast by clients,
- a hash of the block that came before,
- a nonce (see the mining steps below), and
- a valid proof of work (see the mining steps below).
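The block contents listed above can be sketched as a small Python data structure. This is a deliberately simplified model and not the real wire format: the `Block` class, its string-based transactions, and the `is_linked` helper are all hypothetical names for illustration (the actual protocol hashes a compact binary header rather than the transactions directly).

```python
import hashlib
from dataclasses import dataclass
from typing import List

@dataclass
class Block:
    prev_hash: str            # hash of the block that came before
    transactions: List[str]   # stand-in for real transaction data
    nonce: int = 0            # value varied by the miner

    def hash(self) -> str:
        # Simplified: hash a text rendering of every field.
        payload = f"{self.prev_hash}|{'|'.join(self.transactions)}|{self.nonce}"
        return hashlib.sha256(payload.encode()).hexdigest()

def is_linked(chain: List[Block]) -> bool:
    # Every block must reference the hash of its predecessor.
    return all(b.prev_hash == a.hash() for a, b in zip(chain, chain[1:]))

genesis = Block(prev_hash="0" * 64, transactions=["reward -> alice"])
second = Block(prev_hash=genesis.hash(), transactions=["alice -> bob: 1"])
chain = [genesis, second]
linked_before = is_linked(chain)

# Tampering with an earlier block changes its hash and breaks the link.
genesis.transactions[0] = "reward -> mallory"
linked_after = is_linked(chain)
```

Because each block commits to the hash of its predecessor, rewriting any earlier block invalidates every block after it, which is what makes the chain effectively immutable once work has been piled on top.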
Mining works as follows:
- A SHA-256 hash of the block’s header is generated (the protocol applies SHA-256 twice). A nonce, starting at an arbitrary value such as 1 and incremented on each subsequent attempt, is included in the header so that each attempt produces a different hash.
- The hash of the header must be <= the current target (explained below) for the block to be accepted. For this reason, lowering the target is the means of increasing difficulty. If the hash is greater than the target, repeat the first step with an incremented nonce (hash, lather, rinse, repeat).
- The first node to win the race by finding a hash <= the target receives a “block reward” of coins assigned to the miner’s address. The hash constitutes a proof of work because it is determined by the previous block, the included transactions, and the specific nonce that made the hash fit the constraints of the target. The reward is a special type of transaction that is recorded as the first transaction in the block. Every 210,000 blocks the block reward is cut in half.
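The nonce search described in the steps above can be sketched in a few lines of Python. This is a toy under stated assumptions: the header is a plain string rather than the real 80-byte binary header, the hash is SHA-256 applied twice, and `TOY_TARGET` is a hypothetical target that is vastly easier than anything on the real network so the loop finishes quickly.

```python
import hashlib
from typing import Optional

def header_hash(prev_hash: str, tx_digest: str, nonce: int) -> int:
    # Toy header: concatenated text fields (not the real binary layout).
    header = f"{prev_hash}{tx_digest}{nonce}".encode()
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    # Interpret the digest as a number so it can be compared to the target.
    return int.from_bytes(digest, "big")

def mine(prev_hash: str, tx_digest: str, target: int,
         max_tries: int = 5_000_000) -> Optional[int]:
    # Try successive nonces until the hash is <= target.
    for nonce in range(max_tries):
        if header_hash(prev_hash, tx_digest, nonce) <= target:
            return nonce
    return None  # gave up; a real miner would also vary other header fields

TOY_TARGET = 1 << 244   # hypothetical: about 1 hash in 4096 qualifies
found_nonce = mine("00" * 32, "demo-transactions", TOY_TARGET)
```

Finding the nonce takes thousands of attempts on average even at this toy difficulty, while checking it takes a single hash. That asymmetry is what lets every other node verify the proof of work cheaply.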
Block generation is automatically sped up or slowed so that 2016 blocks take about 2 weeks (roughly one block every 10 minutes). This is done by adaptively adjusting the target, which controls the difficulty level. At each 2016-block milestone, every client compares the time it actually took to generate those blocks with the 2-week ideal and scales the target by that ratio: if blocks came too quickly the target is lowered (harder), and if they came too slowly it is raised (easier). In this way the difficulty tracks the network’s total computational power. Because every client applies the same rule to the same block data, all clients arrive at the same target.
The effect of requiring this proof of work is that, in order to falsify the block chain, a dishonest miner would have to control more than half of the processing power of the network, not merely more than half of the network nodes, to produce a fraudulent chain that overtakes the honest one.
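The adjustment rule can be written down as a short Python sketch. The function name `retarget` is hypothetical. The limit on each adjustment to a factor of four in either direction follows the behavior of the reference client; the sketch omits the cap on the maximum allowed target and the compact encoding in which targets are actually stored.

```python
EXPECTED_TIMESPAN = 14 * 24 * 60 * 60   # two weeks, in seconds

def retarget(old_target: int, actual_timespan: int) -> int:
    # Limit the measured timespan so one adjustment never changes the
    # target by more than a factor of 4 in either direction.
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4),
                  EXPECTED_TIMESPAN * 4)
    # Blocks found too fast -> smaller target (harder);
    # blocks found too slowly -> larger target (easier).
    return old_target * clamped // EXPECTED_TIMESPAN

on_schedule = retarget(1000, EXPECTED_TIMESPAN)        # unchanged
twice_as_fast = retarget(1000, EXPECTED_TIMESPAN // 2) # target halves
```

For example, if the last 2016 blocks took one week instead of two, the target is halved, so each hash attempt is half as likely to succeed and block production drifts back toward the 2-week pace.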
Double-spending with Incomplete Confirmation
It must be noted here that while a 51% attack is made very unlikely, double-spending remains an issue for transactions in which the recipient of funds does not wait for a safe number of confirmations, but instead accepts fewer as enough proof for reasons of expedience. A customer and cashier at a register, for example, are unlikely to wait even the roughly 10 minutes needed for the first confirmation, so double-spending is possible. However, waiting for the full number of confirmations ensures to a high degree of confidence (given how difficult a 51% attack is) that no double-spending has occurred, provided that less than half of the computational power of the network is controlled by a dishonest miner (who would presumably need some form of computing technology not presently known to exist).
This lingering double-spending problem, while largely not an issue, should not be brushed off. It precludes the trust-free use of bitcoin in markets that demand real-time, instantly confirmed transactions. To bitcoin’s credit, however, no altcoin does significantly better in that regard. Litecoin, for example, is faster, but not to the extent that it can be used in real-time applications.
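How quickly the risk falls with each confirmation can be estimated with the calculation given in the Nakamoto whitepaper, ported here to Python as a sketch. In it, `q` is the fraction of the network's hashing power held by the attacker and `z` is the number of confirmations the recipient waits for. The early return for `q >= 0.5` is an addition for this sketch, reflecting that an attacker with at least half the hashing power eventually catches up.

```python
import math

def attacker_success_probability(q: float, z: int) -> float:
    # Probability that an attacker with hash-power share q ever catches
    # up from z blocks behind (Poisson model from the whitepaper).
    p = 1.0 - q
    if q >= p:
        return 1.0          # a majority attacker always catches up
    lam = z * (q / p)       # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1 - (q / p) ** (z - k))
    return total

# Risk for an attacker holding 10% of the hashing power.
risk_by_confirmations = {z: attacker_success_probability(0.10, z)
                         for z in range(0, 7)}
```

With zero confirmations the probability is 1, which is the cashier scenario above, and it drops roughly exponentially as confirmations accumulate. This is why accepting fewer confirmations is a trade-off between speed and safety.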
The money supply grows as a result of mining, but at increasing levels of difficulty, and only up to a fixed number of coins. However, coins can be split up to 8 decimal places, so a fixed supply need not limit how widely the currency can be used. To put this in perspective, if one bitcoin were valued at $100 million USD, then the smallest unit, a “satoshi”, would trade at $1. At the time of this writing, one bitcoin is valued at well under $1 thousand USD. Even if the price were to surpass this highly optimistic scenario (in which many of today’s speculators would become billionaires, and nearly all would become millionaires), the bitcoin source could be modified to allow division into more decimal places to keep its use convenient. The ability to indefinitely divide the currency into more decimal places is thought to prevent what is called a deflationary spiral.
Some currencies are backed by commodities such as gold. USD and the Euro, by contrast, are fiat currencies, backed by the perceived credibility of the authorities that issue them. The intrinsic value that backs bitcoin is considered to derive from a combination of factors, including its relative anonymity, reduced potential for currency manipulation, tendency to resist control by a small group, deflation over time, low transaction fees, global availability, and speed of transfer (compared to something like ACH bank transfers, for example).
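The fixed ceiling on the money supply follows directly from the halving schedule, which a short Python sketch can make concrete. The initial reward of 50 coins per block is not stated in this article and is taken here from the protocol as an assumption; amounts are tracked as whole satoshis (8 decimal places), and the function names are hypothetical.

```python
SATOSHI_PER_COIN = 10 ** 8               # 8 decimal places
HALVING_INTERVAL = 210_000               # blocks between halvings
INITIAL_REWARD = 50 * SATOSHI_PER_COIN   # assumed starting block reward

def block_reward(height: int) -> int:
    # Reward in satoshis for the block at the given height.
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings    # integer halving, rounds down

def total_supply() -> int:
    # Sum the rewards of every halving era until the reward reaches zero.
    total, reward = 0, INITIAL_REWARD
    while reward > 0:
        total += reward * HALVING_INTERVAL
        reward >>= 1
    return total

supply_in_coins = total_supply() / SATOSHI_PER_COIN   # just under 21 million
satoshi_price_usd = 100_000_000 / SATOSHI_PER_COIN    # the $100 million example
```

Because the reward is halved with integer arithmetic, it eventually rounds down to zero, and the total comes out slightly below 21 million coins. The last line reproduces the arithmetic behind the $1 satoshi example.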