Blockchain Basics - Part Two: Blocks and Creating Chains
In the previous post we defined hash functions and Merkle trees, which allowed us to generate a single 256-bit hash representing our data. We briefly mentioned a nonce and how we can use it to generate a hash with a certain number of zeros. In this post we'll build upon this foundation to define a block and see how we can link blocks together into a chain of blocks, or a blockchain.
Linked Lists and Chaining Blocks
First we review the linked list data structure. A linked list consists of blocks, each of which contains three fields: the block's data and pointers to the blocks that come before and after it. This is referred to as a doubly linked list; when there are links in only one direction, it is a singly linked list. Just as a linked list uses a pointer to refer to the previous block in the list, our blockchain will store the hash of the previous block in the chain within the current block.
Let's build a basic blockchain. Each block will contain the hash of the previous block and a generic data field. While this data would be complex in a real application, as long as we can generate a single hash representing the data we'll be able to use it within our chain. Note that we'll need some block at the chain's start. This is typically called the genesis block, and it has a preset previous hash value chosen as part of setting up the blockchain.
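As a minimal sketch of the structure just described, the Python below builds a three-element doubly linked list and walks it backwards. The names `Node` and `append` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One element of a doubly linked list (hypothetical name)."""
    data: str
    prev: Optional["Node"] = None  # pointer to the element before this one
    next: Optional["Node"] = None  # pointer to the element after this one


def append(tail: Node, data: str) -> Node:
    """Attach a new node after `tail` and return the new tail."""
    node = Node(data=data, prev=tail)
    tail.next = node
    return node


head = Node("a")
tail = append(append(head, "b"), "c")

# Walk backwards from the tail using only the `prev` pointers.
seen = []
current = tail
while current is not None:
    seen.append(current.data)
    current = current.prev
```

Dropping the `next` field would leave a singly linked list that can only be walked from the tail back toward the head, which is the direction a blockchain's previous-hash links point.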
[Interactive demo: a chain of three blocks, Block 0, Block 1, and Block 2]
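A chain like the one described above could be sketched in Python as follows. This is only an illustration under a few assumptions: SHA-256 is used as the hash function, a block's hash is taken over the previous hash concatenated with the data, and the genesis block's preset previous hash is a string of 64 zeros. The names `Block`, `build_chain`, and `GENESIS_PREV_HASH` are hypothetical.

```python
import hashlib
from dataclasses import dataclass

# Assumed placeholder for the genesis block's preset previous hash.
GENESIS_PREV_HASH = "0" * 64


@dataclass
class Block:
    prev_hash: str  # hash of the block that came before this one
    data: str       # generic data field

    def hash(self) -> str:
        # Assumption: hash the previous hash concatenated with the data.
        payload = (self.prev_hash + self.data).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def build_chain(items):
    """Link one block per item, starting from the genesis block."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for data in items:
        block = Block(prev_hash=prev_hash, data=data)
        chain.append(block)
        prev_hash = block.hash()  # the next block will point at this hash
    return chain


chain = build_chain(["genesis data", "first entry", "second entry"])
for index, block in enumerate(chain):
    print(index, block.prev_hash[:8], block.hash()[:8])
```

Each printed row shows a block's previous hash next to its own hash, so the second column of one row matches the first column of the next.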
Nonces, Mining a Block, and Block Validity
Adding a block to the chain is as easy as reading the previous block's hash, combining it with the new block's data, and then generating the new block's hash. Exactly how multiple users can add blocks to the chain will be discussed in a follow-up post, but this ease of adding a new block results in a few issues, in particular if there are bad actors (those in the network who wish to corrupt or sabotage the chain).
First, anyone with write permissions could spam the chain, adding many new blocks with possibly irrelevant data. Second, a new block could be attached to any existing block, not necessarily the one at the tail of the chain. This would result in divergent chains, and a blockchain with multiple divergent histories is not usable as a data store.
There are two techniques that blockchains use to address these issues. The first is a consensus mechanism (which will be discussed in the future) and the second is a difficulty mechanism applied when adding new blocks.
Recall that when hashing data it is difficult to predict what the hash will be. When a user wishes to add a block to the chain, we add the requirement that the new block's hash must meet some criterion. Here we require that the first three characters of the block's hash be zeros. A block that meets this criterion is said to be valid.
To modify the block's hash without changing the data or the previous block's hash, we add a new field termed the nonce. Because of the unpredictability of hashes, users have no better option than a brute-force search over different nonce values to see if the resulting block hash meets the criterion. Only when a nonce is found that satisfies this criterion is the block in a valid state and able to be added to the chain. This requirement means users need to put in effort to mine a block, which prevents new blocks from being added at will.
Try out a few nonces below or use the Mine Block button to automatically find one.
[Interactive demo: a single block with an editable nonce and a Mine Block button]
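For readers without the interactive demo, the brute-force nonce search can be sketched in Python. The details here are assumptions made for illustration: SHA-256 as the hash function, a hexadecimal digest, integer nonces tried in increasing order from zero, and the `|` separator between fields. The function names are hypothetical.

```python
import hashlib

# Validity rule from the text: the first three characters must be zeros.
DIFFICULTY_PREFIX = "000"


def block_hash(prev_hash: str, data: str, nonce: int) -> str:
    """Hash the block's three fields together (assumed field layout)."""
    payload = f"{prev_hash}|{data}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_valid(hash_hex: str) -> bool:
    """A block is valid when its hash starts with the required zeros."""
    return hash_hex.startswith(DIFFICULTY_PREFIX)


def mine(prev_hash: str, data: str) -> int:
    """Try nonces 0, 1, 2, ... until the block hash is valid."""
    nonce = 0
    while not is_valid(block_hash(prev_hash, data, nonce)):
        nonce += 1
    return nonce


prev_hash = "0" * 64
nonce = mine(prev_hash, "some data")
mined_hash = block_hash(prev_hash, "some data", nonce)
print(nonce, mined_hash)
```

With a hexadecimal digest, roughly one nonce in 16^3 = 4096 produces three leading zeros, so the search is cheap here. Requiring more leading zeros makes the expected number of attempts grow by a factor of 16 per extra character.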
Checking the Validity of the Chain
Let's take a moment to think through the implications of a valid block and the consequences it has for a bad actor trying to modify the contents of the chain. Suppose this actor wishes to alter some data in the middle of the chain. They first alter the data of a block, which changes that block's hash.
Due to the chaining, the previous-hash value stored in the next block no longer matches, which invalidates that block as well. The same is true for every block that follows, so the invalidation propagates forward. The actor will then need to re-mine the altered block and each block after it to carry their changes to the end of the chain. Because blockchains are deployed within a network where other, good actors are continually adding to the chain, the computational difficulty of re-mining each block makes it infeasible for a bad actor to alter data in this way and propagate it forward.
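The propagation described above can be traced with a small Python sketch. It assumes SHA-256, a three-zero validity prefix, and a hypothetical `Block` class and `first_invalid` helper. It is meant to show the shape of the argument, not how any particular blockchain implements validation.

```python
import hashlib
from dataclasses import dataclass

PREFIX = "000"  # validity rule: hash must start with three zeros


@dataclass
class Block:
    prev_hash: str
    data: str
    nonce: int = 0

    def hash(self) -> str:
        payload = f"{self.prev_hash}|{self.data}|{self.nonce}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def mine(self) -> None:
        """Brute-force search for a nonce that makes this block valid."""
        self.nonce = 0
        while not self.hash().startswith(PREFIX):
            self.nonce += 1


def first_invalid(chain):
    """Return the index of the first block failing a check, or None."""
    for i, block in enumerate(chain):
        if not block.hash().startswith(PREFIX):
            return i  # hash no longer meets the difficulty rule
        if i > 0 and block.prev_hash != chain[i - 1].hash():
            return i  # link to the previous block is broken
    return None


# Build and mine a valid four-block chain.
chain = []
prev = "0" * 64
for data in ["a", "b", "c", "d"]:
    block = Block(prev_hash=prev, data=data)
    block.mine()
    chain.append(block)
    prev = block.hash()
valid_before = first_invalid(chain) is None

# A bad actor alters block 1, then re-mines only that block.
chain[1].data = "tampered"
chain[1].mine()
broken_at = first_invalid(chain)  # block 2's link no longer matches

# To hide the change, every later block must be relinked and re-mined.
remined = 1
for i in range(2, len(chain)):
    chain[i].prev_hash = chain[i - 1].hash()
    chain[i].mine()
    remined += 1
valid_after = first_invalid(chain) is None
```

In this toy chain, altering block 1 forces three of the four blocks to be re-mined. In a long chain with a harder difficulty rule, that cost grows with every block that follows the altered one.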
Conclusion
In this post we saw how using the previous block's hash while generating the next block's hash provides a way to chain blocks together. We also discussed how we can introduce a notion of difficulty when adding a new block by using a nonce value, and how only certain nonce values produce a valid block.
So far we've mentioned users adding to the blockchain - both good and bad actors - but we haven't defined a way to identify these users. In the next post we'll dive into this, discussing how each user has a public identifier that can be read by others but not used to imitate them. We'll also see how this links to the mining process to incentivize users to contribute compute power to search for new nonces, resulting in digital currencies. Thanks for reading!