Blockchains, blockchains, blockchains. Have you heard of them? They’re kind of a big deal. Meet me behind the library after school and I might have a few to sell. Maybe you’ve read a blockchain guide before, or maybe you still have no idea how the fork they work. To get anywhere in the crypto space you have to know a little more than the basics of blockchain, so in this guide we’re going to go over the data structure properties of the blockchain and the technology that it enables.

Hashing

Without hashing, there would be no blockchain. A hash function is an algorithm that takes an input of any size and produces a fixed-size output, called a hash. There are many, many types of hash functions, each with its own benefits. One example of a cryptographic hash function is SHA-256 (Secure Hash Algorithm 256). This algorithm is used by Bitcoin’s Proof of Work system and was, interestingly enough, part of a set of hash functions developed by the National Security Agency (NSA). It generates a fixed-size 256-bit (32-byte) hash, which is usually written as 64 hexadecimal characters. Whether the input string is longer or shorter than that, the output length will always be the same. A fixed output size is a property of all hash functions. Here are some other important properties:

1.) Uniformity - If a hashing algorithm is uniform, any output is equally likely. In a collection of millions of hashes, there will not be a disproportionate number of hashes that begin with 0s, or with any other pattern. When two different inputs produce exactly the same hash, it is known as a collision. In software development, it is usually assumed that a good hashing algorithm will produce no collisions, although there is always still a chance.

 

2.) Non-invertibility - Hashing is one-directional. An input string can be turned into a hash, but given the resulting hash, the input string cannot feasibly be reverse engineered.

 

3.) Discontinuity - Given two similar inputs, a discontinuous algorithm will produce radically different hashes. This holds even if two 64-character input strings differ by only one character: their resulting hashes will differ in many characters, instead of just one.

 

4.) Speed - Computing a hash should be fast. This is maybe a more obvious feature, but it becomes increasingly important as the size of the network increases.
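
To make a couple of these properties concrete, here is a minimal sketch using SHA-256 from Python's standard hashlib module. The helper name sha256_hex and the sample inputs are just illustrative choices, not part of any blockchain. It shows that the output is always 64 hexadecimal characters, and that changing a single character of a 64-character input changes most of the hash.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Fixed output size: a 1-character input and a 10,000-character input
# both produce a 32-byte digest, written as 64 hex characters.
short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)
print(len(short_digest), len(long_digest))  # 64 64

# Discontinuity: two 64-character inputs that differ in one character.
input_a = "a" * 64
input_b = "a" * 63 + "b"
digest_a = sha256_hex(input_a)
digest_b = sha256_hex(input_b)

# Count the positions where the two hex digests disagree.
differing = sum(1 for x, y in zip(digest_a, digest_b) if x != y)
print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex characters differ")
```

Running the same input through sha256_hex twice always gives the same digest, which is what makes hashes usable as fingerprints for data.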

 

The Blockchain as a Data Structure

Blockchains are data structures: specialized ways of organizing and storing data on a computer. There are many different types of data structures that have been developed to store particular data sets with unique properties. Some popular data structures are arrays, linked lists, sets, binary heaps, stacks, etc. For blockchains, the unique property is that the order and integrity of the data must be maintained and not tampered with.

Blockchains are similar to other list-like data structures such as arrays and linked lists. A blockchain is composed of a series of units (blocks) that can store data and that have a way of connecting to one another in the correct sequential order. In cryptocurrencies, the data on each block is typically thousands of transactions lumped together, but any package of data could be stored. In fact, the type of data doesn’t matter at all to the behavior of the blockchain.

What is important, however, is that each block carries with it some metadata about itself. In a simple example this metadata could simply be an ID of the block. Each block must also hold some sort of reference to its neighbors. These references are typically pointers to the location of the neighboring blocks in memory. At this point we have nothing more than a doubly linked list.

Blockchains step up the game a bit. Every block on the chain stores a hash of the entire contents of its previous neighbor. This means the hash covers the data, the metadata, and the previous hash stored in that neighbor. The very first block has no previous neighbor, so its previous hash is typically set to all zeros. This block is known as the Genesis Block.
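
As a rough illustration of the structure described above, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the Block class, its field names, the JSON serialization, and the append_block helper are hypothetical and are not taken from any real blockchain implementation. Each block stores an ID, some data, and the SHA-256 hash of the full contents of the block before it.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

# Assumption: the first block uses an all-zero placeholder as its previous hash.
GENESIS_PREVIOUS_HASH = "0" * 64


@dataclass(frozen=True)
class Block:
    index: int          # simple metadata: the block's ID
    data: str           # any payload; a batch of transactions in a cryptocurrency
    previous_hash: str  # hash of the entire contents of the previous block

    def hash(self) -> str:
        """Hash the data, the metadata, and the stored previous hash together."""
        payload = json.dumps(
            {"index": self.index, "data": self.data, "previous_hash": self.previous_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Block], data: str) -> None:
    """Add a block that points back at the current last block via its hash."""
    previous_hash = chain[-1].hash() if chain else GENESIS_PREVIOUS_HASH
    chain.append(Block(index=len(chain), data=data, previous_hash=previous_hash))


chain: List[Block] = []
for payload in ["genesis", "alice pays bob 5", "bob pays carol 2"]:
    append_block(chain, payload)

for block in chain:
    print(block.index, block.previous_hash[:16], block.data)
```

Because each block's hash includes the previous hash it stores, the hash of the last block indirectly depends on every block before it.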

How does Hashing help?

How does this hashing property of blockchains help with security? Well, it turns out it’s very computationally inexpensive to validate a blockchain for fraud. For example, let’s say an algorithm begins at the first block on the blockchain and walks along the chain. At each block it passes, it compares the hash of the entire contents of the previous block with the previous hash recorded in the current block. If they match at every block, no fraud! This makes it extremely difficult for any nefarious individual to tamper with the data on a block, because doing so changes not only the data on that block but also the hash that every block after it would need to store. Neat, right?
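
The validation walk can be sketched in a few lines of Python. This is only an illustration under simple assumptions: blocks are plain dictionaries, the hash is SHA-256 over a JSON serialization, and the function names block_hash, build_chain, and first_invalid_block are made up for this example.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash the entire contents of a block (data, metadata, previous hash)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads):
    """Build a toy chain; the first block gets an all-zero previous hash."""
    chain = []
    previous_hash = "0" * 64
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "previous_hash": previous_hash}
        chain.append(block)
        previous_hash = block_hash(block)
    return chain


def first_invalid_block(chain):
    """Return the index of the first block whose recorded previous hash
    does not match the actual hash of the block before it, or None."""
    for i in range(1, len(chain)):
        if chain[i]["previous_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["genesis", "alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
print(first_invalid_block(chain))  # None

# Tamper with the data in block 1 without touching anything else.
chain[1]["data"] = "alice pays bob 500"
print(first_invalid_block(chain))  # 2
```

The check is a single pass over the chain, one hash per block. If the tamperer also rewrites the previous hash stored in block 2, the mismatch simply moves to block 3, and so on down the chain.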