Secureum Book
  • 🛡️Secureum Bootcamp
    • 🛡️Secureum Bootcamp
    • 🙌Participate
    • 📜History
  • 📚LEARN
    • Introduction
      • 🔷1. Ethereum Basics
        • 1.1 Ethereum: Concept, Infrastructure & Purpose
        • 1.2 Properties of the Ethereum Infrastructure
        • 1.3 Ethereum vs. Bitcoin
        • 1.4 Ethereum Core Components
        • 1.5 Gas Metering: Solving the Halting Problem
        • 1.6 web2 vs. web3: The Paradigm Shift
        • 1.7 Decentralization
        • 1.8 Cryptography, Digital Signature & Keys
        • 1.9 Ethereum State & Account Types
        • 1.10 Transactions: Properties & Components
        • 1.11 Contract Creation
        • 1.12 Transactions, Messages & Blockchain
        • 1.13 EVM (Ethereum Virtual Machine) in Depth
        • 1.14 Transaction Reverts & Data
        • 1.15 Block Explorer
        • 1.16 Mainnet & Testnets
        • 1.17 ERCs & EIPs
        • 1.18 Legal Aspects in web3: Pseudonymity & DAOs
        • 1.19 Security in web3
        • 1.20 web2 Timescales vs. web3 Timescales
        • 1.21 Test-in-Prod. SSLDC vs. Audits
        • Summary: 101 Keypoints
      • 🌀2. Solidity
        • 2.1 Solidity: Influence, Features & Layout
        • 2.2 SPDX & Pragmas
        • 2.3 Imports
        • 2.4 Comments & NatSpec
        • 2.5 Smart Contracts
        • 2.6 State Variables: Definition, Visibility & Mutability
        • 2.7 Data Location
        • 2.8 Functions
        • 2.9 Events
        • 2.10 Solidity Typing
        • 2.11 Solidity Variables
        • 2.12 Address Type
        • 2.13 Conversions
        • 2.14 Keywords & Shorthand Operators
        • 2.15 Solidity Units
        • 2.16 Block & Transaction Properties
        • 2.17 ABI Encoding & Decoding
        • 2.18 Error Handling
        • 2.19 Mathematical & Cryptographic Functions
        • 2.20 Control Structures
        • 2.21 Style & Conventions
        • 2.22 Inheritance
        • 2.23 EVM Storage
        • 2.24 EVM Memory
        • 2.25 Inline Assembly
        • 2.26 Solidity Version Changes
        • 2.27 Security Checks
        • 2.28 OpenZeppelin Libraries
        • 2.29 DAppSys Libraries
        • 2.30 Important Protocols
        • Summary: 201 Keypoints
      • 🔏3. Security Pitfalls & Best Practices
        • 3.1 Solidity Versions
        • 3.2 Access Control
        • 3.3 Modifiers
        • 3.4 Constructor
        • 3.5 Delegatecall
        • 3.6 Reentrancy
        • 3.7 Private Data
        • 3.8 PRNG & Time
        • 3.9 Math & Logic
        • 3.10 Transaction Order Dependence
        • 3.11 ecrecover
        • 3.12 Unexpected Returns
        • 3.13 Ether Accounting
        • 3.14 Transaction Checks
        • 3.15 Delete Mappings
        • 3.16 State Modification
        • 3.17 Shadowing & Pre-declaration
        • 3.18 Gas & Costs
        • 3.19 Events
        • 3.20 Unary Expressions
        • 3.21 Addresses
        • 3.22 Assertions
        • 3.23 Keywords
        • 3.24 Visibility
        • 3.25 Inheritance
        • 3.26 Reference Parameters
        • 3.27 Arbitrary Jumps
        • 3.28 Hash Collisions & Byte Level Issues
        • 3.29 Unicode RTLO
        • 3.30 Variables
        • 3.31 Pointers
        • 3.32 Out-of-range Enum
        • 3.33 Dead Code & Redundant Statements
        • 3.34 Compiler Bugs
        • 3.35 Proxy Pitfalls
        • 3.36 Token Pitfalls
        • 3.37 Special Token Pitfalls
        • 3.38 Guarded Launch Pitfalls
        • 3.39 System Pitfalls
        • 3.40 Access Control Pitfalls
        • 3.41 Testing, Unused & Redundand Code
        • 3.42 Handling Ether
        • 3.43 Application Logic Pitfalls
        • 3.44 Saltzer & Schroeder's Design Principles
        • Summary: 201 Keypoints
      • 🗜️4. Audit Techniques & Tools
        • 4.1 Audit
        • 4.2 Analysis Techniques
        • 4.3 Specification, Documentation & Testing
        • 4.4 False Positives & Negatives
        • 4.5 Security Tools
        • 4.6 Audit Process
        • Summary: 101 Keypoints
      • ☝️5. Audit Findings
        • 5.1 Criticals
        • 5.2 Highs
        • 5.3 Mediums
        • 5.4 Lows
        • 5.5 Informationals
        • Summary: 201 Keypoints
  • 🌱CARE
    • CARE
      • CARE Reports
  • 🚩CTFs
    • A-MAZE-X CTFs
      • Secureum A-MAZE-X
      • Secureum A-MAZE-X Stanford
      • Secureum A-MAZE-X Maison de la Chimie Paris
Powered by GitBook
On this page
  • Network
  • Data Structures
  • Consensus Algorithm
  • Ethereum PoW: Present and Future
  • Ethereum protocol upgrades (initially Eth2)
  • Ethereum Nodes and Clients
  • Ethereum Miners
  • GHOST protocol
  1. LEARN
  2. Introduction
  3. 1. Ethereum Basics

1.4 Ethereum Core Components

Previous1.3 Ethereum vs. BitcoinNext1.5 Gas Metering: Solving the Halting Problem

Last updated 1 year ago

Network

The underlying network that Ethereum is built on is a peer-to-peer network that's nowadays running on TCP port 30303. The protocol that enables this P2P network is known as ÐΞ\XiΞVp2p.

If you step back and think about the paradigm of the networks that we use today is built around the concept of clients and servers: Laptops, desktops, smartphones, iot gadgets..., that we use are all really the clients that are talking to servers sitting in the cloud on AWS or any of the application platforms that you're interacting with.

Compared to that, the key change when it comes to Ethereum is that the underlying network is all peer-to-peer: there are no clients and servers, they're all peers that are exchanging messages on a same layer.

On this network we have transactions, which imply the notion of a sender transferring some value (and some data) to a receiving entity.

And on top of that there is this abstraction of a state machine that is driven by the EVM. When it comes to programming that machine, there are high-level languages (HLLs) that programmers and developers work with, being the most common one (the most widely used) Solidity, which is converted into EVM instructions (machine language instructions; bytecode).

Data Structures

The Ethereum protocol itself has several common data structures. There is however a very specific data structure known as the Merkle-Patricia Tree that is used to optimize the way that Ethereum handles some of the states that are used within the context of the blockchain.

  • Merkle Tree

    A Merkle Tree is a type of binary tree that is composed of a set of nodes where the leaf node at the bottom of the contains the underlying data, and all the intermediate nodes between the leaf nodes and the root contain the combined hash of their two child nodes. Visually, the data is located at the bottom of the leaf nodes, and all the intermediate nodes (combining their hashes) lead to the root node, which is at the top of the tree and it's usually referred to as "the root hash".

  • Patricia Tree

    A Patricia tree (aka prefix free radix tree) is a different data structure where the (key, value) pairs that are contained within it have the values at the leaf nodes, and the keys let you traverse a path from the root of the tree to the leaves, so that the nodes that share the same prefix in the key also share the same path down the tree from the root to the leaves.

    Ethereum uses a modified combination of a Merkle tree and Patricia tree that specifically suits the Ethereum data structures. In this particular case, it uses a Hexane-Merkel-Patricia tree which means that each node has 16 children. There's also a recent proposal to convert this to a binary tree.

  • Other data structures

    When looking at the programming languages supported on Ethereum, let's take for example Solidity, there are many common data structures such as arrays, lists, the basic value types, reference types...

Consensus Algorithm

One of the critical core compontents (perhaps the most critical one) is the consensus algorithm.

  • Why is there a need for a consensus algorithm this?

    The Ethereum blockchain, as mentioned earlier, consists in many peer-to-peer nodes: these nodes are all talking to each other and trying to agree upon what is the global state of the blockchain.\

    Each of the nodes has a local state, which consits on new blocks being formed containing the transactions that are processed by the node. But then, everyone on the Ethereum blockchain should agree upon what is the one canonical blockchain that will be used in the future.\

    This is agreement is reached through the consensus algorithm. Decentralized consensus is critical to any blockchain. In the context of Ethereum it refers to the Nakamoto Consensus Protocol that is adapted from Bitcoin. This protocol addresses the problem of which miners' block should be included next to create the canonical blockchain. There are two components to this: Proof of Work (PoW) and The longest chain rule.\

    PoW is used to determine which entity on the Ethereum blockchain gets to add the next block; the consensus algorithm determines which is the longest chain so far, and all the nodes that are chosen to add the next block to the existing canonical blockchain build upon the existing longest chain. This is how a blockchain grows in state.\

    The consensus algorithm is used as a form of sybil resistance mechanism. There's this concept of 51% attack, which consists in the scenario of 51% of the participating entities colluding. Then, these malicious entities can change past states (and thus the state transitions) that have been encoded into the blockchain. This breaks the key property of immutability of a blockchain, thus it is very critical.

Ethereum PoW: Present and Future

As mentioned previously, Ethereum Proof of Work is fundamental to how the Ethereum consensus protocol works. Technically, it is referred to as Ethash, and it's formally defined as

(m,n)=PoW(Hᵰ,Hn,d)m=Hm∧n≤2256Hd(m,n)=\text{PoW}\left(H_{ᵰ},H_n,d\right)\\ m=H_m\wedge n\leq\frac{2^{256}}{H_d}(m,n)=PoW(Hᵰ​,Hn​,d)m=Hm​∧n≤Hd​2256​

where HᵰH_ᵰHᵰ​ is the new block's header but without the nonce (nnn) and mix-hash (mmm) components; HnH_nHn​ is the nonce of the header; ddd is a large data set needed to compute the mixHash and HdH_dHd​ is the new block's difficulty value.

What a miner has to do is to determine a combination of mix-hash mmm and nonce nnn for every block to satisfy the constraint on the difficulty for that particular block. This is a trial and error process so the miner has to keep repeating the computation until a combination of mmm and nnn that satisfies the difficulty is found. When this is done it means that a sufficient amount of work has been performed by the miner for this particular block, or in other words this indicates that there is proof of work by the miner for this block.

PoW is being replaced by what is known as proof of stake (PoS) with the merge. This change is important from a security perspective: the consensus algorithm is critical to the economic security of a blockchain, as it is what makes the blockchain resistant to attacks from any of the untrusted parties operating the infrastructure or operating on it. For now, this security is provided by PoW.

Ethereum protocol upgrades (initially Eth2)

The merge the first of a set of interconnected upgrades to the existing Ethereum network that are being made, perhaps the biggest set of upgrades since Ethereum started. Although it was initially coined as Eth2, this name was phased out because it is not a separate version of the protocol but a continuation of all the research and development activity that has happened on the protocol. These upgrades occur across three vectors: scalability, security and sustainability.

  • Scalability: made possible by the concept of sharding.\

  • Security: through the transition from PoW to proof of stake (PoS), also known as the merge.

    This is again a huge change to how the protocol functions and it offers immense economic and security benefits compared to PoW.\

  • Sustainability: with PoW, there's a certain amount of computation that has to be done to pass the difficulty level. This is for sybil resistance part of the consensus protocol.\

    As a result, there is real energy (in terms of running the mining nodes) that is consumed. This goes away to a great extent when we transition to PoS: it is going to make Ethereum much more sustainable when it comes to environmental impact.

These upgrades already started happening with the deployment of the Beacon chain and the transition to PoS with the merge.

Ethereum Nodes and Clients

An Ethereum node is a software application that implements the Ethereum protocol specification. It communicates with other Ethereum nodes on the network in a peer-to-peer fashion.

The Ethereum client is a specific implementation of the Ethereum Node. Within the Ethereum clients, a specification of the protocol itself (the consensus algorithms, data structures,\dots all these core components) is implemented.

If anyone is running an Ethereum node they're using one of these Ethereum clients that have been built by multiple teams around the world. The most popular ones are

  • Geth

  • Erigon

  • Nethermind

  • OpenEthereum

  • Reth

There have been a lot of transitions and changes. Some of the clients have been under development, and some of them are way more popular. Geth is the most popular with more than 80% of the running nodes. But, for the sake of the diversity and decentralization of Ethereum, other clients are being supported and being developed as well. These clients are open source so anyone is free to examine the client code and maybe even contribute to it.

Ethereum transactions are sent to the Ethereum Nodes, and these in turn broadcast them across the peer-to-peer network. This is how Ethereum transactions propagate across the Ethereum network, reach the various nodes, get combined into blocks and result in the blockchain.

Ethereum Miners

Ethereum miners are entities running Ethereum Nodes on the network. They are the ones that receive, validate, execute and combine the transactions into blocks.

They also provide a mathematical proof of their computation (proof of work; PoW). For all this work, if the miner's block gets chosen to be part of the blockchain, then they are rewarded with what is known as a block reward.

This block reward is currently 2 Ether and it has decreased over time. This is the crypto economics: incentive for miners to participate and to be honest on the Ethereum network. Along with this reward, they're also rewarded with transaction fees: the Ether spent on Gas by all the transactions included in that block.

So block reward and transaction fees are the crypto economic incentives that are paid out to the miner whose block gets accepted into the blockchain.

GHOST protocol

So transactions are sent, and miners validate them. They combine them into blocks and these blocks are propagated over the peer-to-peer network. Multiple miners are doing this process simultaneously. This leads to multiple valid blocks at any level of the blockchain. The canonical blockchain needs to choose one valid block at any level. The choice of that valid block is dictated by the "GHOST protocol" (the Greedy Heaviest Observed Subtree protocol). This protocol allows stale blocks up to seven levels in this calculation. Stale blocks, in Ethereum's nomenclature, are referred to as uncles or armors.

📚
🔷
Merkle Tree example consisting of 8 underlying transactions.
Patricia Tree Example