Decoding the EVM: The Engine Powering Ethereum

Unveiling the secrets of the Ethereum Virtual Machine and its role in the blockchain wizardry.

In the ever-evolving landscape of blockchain technology, the Ethereum Virtual Machine (EVM) sits at the core of the Ethereum blockchain. We can say that the EVM is the heart of Ethereum 💖.

In this article, we will go through the EVM and see how it helps Ethereum. We will try to decode the EVM architecture, EVM state, how the EVM works, and other related topics.

Virtual Machines (VM)

Virtual machines are a special type of software that can run other operating systems (OS) and applications. Each virtual machine operates as an independent entity, isolated from the others, and has its own set of resources, including CPU, memory, storage, and network interfaces.

We can create multiple VMs on a single physical device. Suppose you are using Windows on your device: with the help of a VM you can run Linux (Ubuntu, for example) or any other OS on that same Windows device.

Features of Virtual Machines

VMs have many features; some of the most important ones are discussed below.

Isolation

  • Virtual machines provide a high level of isolation from each other. Each VM operates independently, with its own set of resources, ensuring that activities in one VM do not impact others.

Encapsulation

  • VMs encapsulate the entire software environment, including the operating system, applications, and configurations, into a single, portable package. This encapsulation simplifies deployment, migration, and management.

Resource Allocation

  • Virtualization allows for flexible resource allocation. Resources such as CPU, memory, and storage can be dynamically adjusted to meet the changing demands of applications running within VMs.

Dynamic Scaling

  • Virtual machines can be dynamically scaled in response to changing workloads. This scalability allows for efficient use of resources during peak demand and resource conservation during off-peak times.

EVM stands for Ethereum Virtual Machine. It is built specifically for executing smart contracts and transactions on the Ethereum blockchain. Let's dive deep into the EVM and its role in building the glory of Ethereum.

Who Made the EVM?

We all know Vitalik Buterin proposed Ethereum, but it was Gavin Wood who specified the EVM in 2014. He is also a co-founder of Ethereum and the founder of Polkadot. His idea was heavily influenced by the concept of "One Computer for the Entire Planet", which also drove him to contribute to the development of the Solidity language.

Basic Terminologies

Before diving deep into the EVM, here are some basic terms you will come across while working with blockchain.

Accounts

  • An Ethereum account is an entity with an ether (ETH) balance that can send transactions on Ethereum. Accounts can be user-controlled or deployed as smart contracts.

  • There are mainly two types of accounts:

    • Externally Owned Accounts (EOA)

    • Contract Accounts

  • Every Ethereum account has four fields: nonce, balance, storage hash (storageRoot), and codeHash.

A diagram showing the make up of an account

Transactions

  • As we know, a blockchain is a P2P network. For the sake of simplicity, consider it a book that is shared among people from different parts of the world. Any operation that writes data into this book is a transaction; operations that only read from it do not need one. In simple words, a transaction is an operation that changes the blockchain state. Each transaction on the blockchain is identified by its transaction hash, which also serves as its transaction ID.

Blocks

  • At its core, a blockchain consists of blocks. Each block contains a list of transactions and their information, along with a hash pointer to the previous block. New transactions are collected and added to the chain in newly mined blocks. Each block also contains a Merkle root used to verify its transactions; we will cover Merkle trees later in this article.

Gas ⛽

  • Computing any transaction on the blockchain costs a fee, paid in the native currency of the blockchain, and the amount of work is measured in gas. The gas consumed also depends on how we write our smart contracts: every opcode has a defined gas cost, so the cost of calling a function depends on the operations it executes.
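To make the fee idea concrete, here is a minimal sketch in Java. It assumes the simplified model fee = gas used × price per unit of gas, with the price given in gwei (1 gwei = 10^9 wei). The class and method names are made up for this illustration; the 21,000 gas figure is the cost of a plain ETH transfer.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of any Ethereum library.
class GasFeeSketch {
    static final BigInteger WEI_PER_GWEI = BigInteger.TEN.pow(9);

    // Simplified model: fee in wei = gas used * price per gas unit (given in gwei).
    static BigInteger feeInWei(long gasUsed, long gasPriceGwei) {
        return BigInteger.valueOf(gasUsed)
                .multiply(BigInteger.valueOf(gasPriceGwei))
                .multiply(WEI_PER_GWEI);
    }

    public static void main(String[] args) {
        // A plain ETH transfer uses 21,000 gas; 20 gwei is an assumed price.
        System.out.println(feeInWei(21_000L, 20L) + " wei");
    }
}
```

A contract call that executes more opcodes simply has a larger gas used value, which is why the way a contract is written affects the fee.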

EVM: The Mastermind 🧠 Behind Ethereum

The EVM is the runtime environment for smart contracts, which are typically written in Solidity and compiled to bytecode. The EVM is a transaction-driven state machine, i.e. whenever a transaction is triggered (by an EOA, possibly calling into contract accounts), the EVM state changes from the previous state to a new state.

Y(S, T)= S' 
// Here , 
S -> Previous State Before Transaction
T -> Triggered Transaction
S'-> New State After Triggering Transaction
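The state transition function can be sketched in a few lines of Java. This is a toy model under loud assumptions: the state is only a map from address to balance, a transaction is only a value transfer, and there is no gas, nonce, or signature. The names `StateTransitionSketch` and `apply` are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (assumption): the world state is only a map from address to balance.
class StateTransitionSketch {

    // Y(S, T) = S': builds the new state and leaves the previous state untouched.
    static Map<String, Long> apply(Map<String, Long> state, String from, String to, long value) {
        long senderBalance = state.getOrDefault(from, 0L);
        if (value < 0 || senderBalance < value) {
            // An invalid transaction does not change the state.
            return state;
        }
        Map<String, Long> next = new HashMap<>(state);
        next.put(from, senderBalance - value);
        next.merge(to, value, Long::sum);
        return next;
    }

    public static void main(String[] args) {
        Map<String, Long> s = Map.of("alice", 100L, "bob", 5L);
        Map<String, Long> sPrime = apply(s, "alice", "bob", 30L);
        System.out.println("S  = " + s);
        System.out.println("S' = " + sPrime);
    }
}
```

Returning a fresh map instead of mutating the old one mirrors the formula: S and S' are two distinct states, and T is what takes us from one to the other.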

World State in EVM

The world state is the state of all accounts, including their balances. It is the current state of the EVM: whenever a transaction is executed, the world state changes to a new state. You can think of accounts as objects in the world state.

The world state is stored in a special data structure called a "Merkle Patricia Trie". Each block in the Ethereum blockchain contains a reference to the root of the Merkle Patricia Trie representing the world state at the end of that block.

As shown in the figure, the world state holds a mapping between addresses and the accounts associated with those addresses. A transaction can change the balance, storage, and other fields of accounts, so the world state changes after every transaction, as shown in the figure below.

Merkle Trees

Merkle trees are a special type of data structure used for validating and authenticating transactions. They work by using hash pointers to store hashes.

A Merkle tree is built bottom-up. First, the leaf nodes of the tree are the hashes of the transactions that occurred in the block. Each non-leaf node is the hash of its child nodes. In Bitcoin, these hashes are computed with the SHA-256 hashing algorithm, while Ethereum uses Keccak-256. This process of hashing continues until we get one final hash value, referred to as the "Merkle root". In a Bitcoin block, this root is stored inside the block header to verify transactions. Since we are talking about Ethereum, its block header has three such roots: the state root, the transaction root, and the receipt root.

Here, T1-T4 are transactions. We passed them through SHA-256 and got H1-H4 as the hashes of T1-T4 respectively. We then hashed H1 and H2 together to get H12, and similarly H3 and H4 to get H34. Finally, we hashed H12 and H34 together to get our Merkle root, H1234. This H1234 hash is added to the header of the block containing transactions T1-T4.

If any node's hash is maliciously changed at any stage, the result is a completely different Merkle root, and the block will be rejected by the network. This is how we prevent a malicious block from being added to the chain.
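The T1-T4 walkthrough can be sketched in Java as follows. This is a simplified illustration, not how any real client does it: it uses a single round of SHA-256 and treats transactions as plain strings. It also assumes that a level with an odd number of nodes pairs its last node with itself. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; transactions are plain strings here (assumption).
class MerkleRootSketch {

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 ships with every Java platform
        }
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Bottom-up: hash the leaves, then hash pairs until a single hash remains.
    static byte[] merkleRoot(List<String> transactions) {
        if (transactions.isEmpty()) {
            throw new IllegalArgumentException("need at least one transaction");
        }
        List<byte[]> level = new ArrayList<>();
        for (String tx : transactions) {
            level.add(sha256(tx.getBytes(StandardCharsets.UTF_8))); // H1, H2, ...
        }
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // Assumption: a node without a partner is paired with itself.
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                next.add(sha256(concat(left, right))); // H12, H34, ...
            }
            level = next;
        }
        return level.get(0);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(merkleRoot(List.of("T1", "T2", "T3", "T4"))));
        // Tampering with one transaction gives a completely different root.
        System.out.println(hex(merkleRoot(List.of("T1", "T2", "T3-tampered", "T4"))));
    }
}
```

Running `main` prints two different 64-character hex strings, which is the tamper-evidence property described above.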

TRIE

A trie (derived from "retrieval") is a multiway tree data structure used for storing strings over an alphabet. It is used to store a large number of strings, and pattern matching can be done efficiently using tries.

In a trie, each node typically has a collection of child nodes, where the number of child nodes corresponds to the size of the alphabet or character set being used.

It is also called a "prefix tree" because it stores and retrieves words with common prefixes efficiently. In the context of blockchain, we use a trie to store key-value pairs, mostly an address and the account associated with it.

  • Common Structure of a Trie Node

      // Trie node
      class TrieNode
      {
          // Size of the alphabet: lowercase letters 'a' to 'z'
          static final int ALPHABET_SIZE = 26;

          // children[i] is the child for letter ('a' + i), or null if absent
          TrieNode[] children = new TrieNode[ALPHABET_SIZE];
          // isEndOfWord is true if the node
          // represents end of a word
          boolean isEndOfWord;

          public TrieNode() {
              isEndOfWord = false;
          }
      }

      // Each node has one boolean value to mark the end of a word
    
  • Insertion

    Suppose we have to add four words to our trie: {"app", "apply", "bye", "by"}. We start with "app": the root node is empty, so we add a child node for a with isEndOfWord = false, link it to a child node holding the first p, and so on, marking only the last node of the word with isEndOfWord = true. We repeat this for all the words. After inserting all of them, our trie will look like the figures below.

    • Inserted App in Trie

  • Inserted Apply in Trie

  • Inserted Bye and by
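The insertion steps above can be sketched in Java like this. It is a minimal version that assumes lowercase keys 'a' to 'z' only; the class name `SimpleTrie` and its methods are chosen for this example.

```java
// Minimal trie sketch; assumes keys contain only lowercase letters 'a'..'z'.
class SimpleTrie {
    static final int ALPHABET_SIZE = 26;

    static class Node {
        Node[] children = new Node[ALPHABET_SIZE];
        boolean isEndOfWord;
    }

    final Node root = new Node(); // the root itself holds no character

    void insert(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (index < 0 || index >= ALPHABET_SIZE) {
                throw new IllegalArgumentException("unsupported character: " + c);
            }
            if (current.children[index] == null) {
                current.children[index] = new Node(); // create the missing child
            }
            current = current.children[index];
        }
        current.isEndOfWord = true; // only the last node marks a complete word
    }

    boolean contains(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (index < 0 || index >= ALPHABET_SIZE || current.children[index] == null) {
                return false;
            }
            current = current.children[index];
        }
        return current.isEndOfWord;
    }

    public static void main(String[] args) {
        SimpleTrie trie = new SimpleTrie();
        for (String w : new String[] {"app", "apply", "bye", "by"}) {
            trie.insert(w);
        }
        System.out.println(trie.contains("app")); // a stored word
        System.out.println(trie.contains("ap"));  // only a prefix, not a stored word
    }
}
```

Notice that "app" and "apply" share the same first three nodes, and "by" and "bye" share two. That sharing is what makes the trie a prefix tree.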

That was insertion without key-value pairs. Let us see what happens when key-value pairs come in.

Suppose we have to insert the key-value pairs {Aditya: "49", Adidem: "24"} into our trie. We follow the same procedure as above, but every node along a key holds a null value until we reach the end of the key, where we add a node holding the value. Let's observe the figure below.

In the context of blockchain, we have to store many key-value pairs in these tries. One disadvantage of the normal trie is visible here: apart from the value nodes, the other nodes serve no purpose of their own and consume extra space. To overcome this issue, we use an advanced version of the trie called the "PATRICIA trie".

PATRICIA TRIE

PATRICIA stands for Practical Algorithm to Retrieve Information Coded in Alphanumeric. In a normal trie, each node represents a single character of a key, whereas in a PATRICIA trie a node can represent a whole partial run of characters of a key.

Suppose we have to insert the key-value pairs {Aditya: "49", Adidem: "24"} into our trie. A PATRICIA trie stores the common prefix of the keys once, and then stores the differing remainder of each key separately along with its value. Let's see how that looks.

From this, we can observe that the normal trie needed 14-15 null nodes to store these keys, whereas the PATRICIA trie consumes less space, works faster, and is simpler than the normal trie.

The magic comes in blockchain when we merge Merkle trees with the PATRICIA trie. The resulting structure used in blockchain is called the "Merkle PATRICIA Trie".
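The core step of the PATRICIA trie, finding the shared prefix and splitting off the remainders, can be sketched in Java. This only shows the split for two keys and is not a full PATRICIA trie; the helper name `commonPrefix` is made up for this example.

```java
// Sketch of the prefix split a PATRICIA trie performs for two keys.
class PatriciaPrefixSketch {

    // Longest prefix shared by both keys.
    static String commonPrefix(String a, String b) {
        int i = 0;
        while (i < a.length() && i < b.length() && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }

    public static void main(String[] args) {
        String k1 = "Aditya";
        String k2 = "Adidem";
        String shared = commonPrefix(k1, k2);
        // One node for the shared part, then one node per differing remainder.
        System.out.println("shared node: " + shared);
        System.out.println("leaf 1: " + k1.substring(shared.length()) + " -> 49");
        System.out.println("leaf 2: " + k2.substring(shared.length()) + " -> 24");
    }
}
```

So the two keys need one shared node plus two leaves, instead of one node per character.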

PATRICIA Merkle TRIE / Merkle PATRICIA TRIE (PMT / MPT)

Merkle Patricia Trie is a combination of the Patricia Trie and the Merkle tree. Each node in the trie contains a hash of its child nodes, making it a Merkle tree. This enables efficient verification of the integrity of the entire data structure.

Why Use MPT?

If we consider Bitcoin, it only deals with transactions, and its architecture is designed to maintain the integrity of those transactions. Transaction data is static: once a transaction is mined, its data is stored permanently, so Bitcoin's case is handled by Merkle trees alone. Coming back to the EVM, it has to store the data of both transactions and accounts.

Transaction data can be handled by Merkle trees, but account data keeps changing, so we require a more advanced data structure. This type of data that keeps changing is called "ephemeral data".

Working of MPT

There are three types of nodes in an MPT, namely:

  • Extension Node

  • Branch Node

  • Leaf Node

These three node types make up the state trie and its hash in the EVM. Let's understand each one of them in detail.

Extension Node

An extension node stores a part of the path that is shared by several keys in the world state trie. It has one field for the shared part of the path and one pointer to the next node. The shared part is a sequence of nibbles (4 bits of data each).

The stored path starts with a prefix nibble: 0 means the extension node's path has an even number of nibbles, and 1 means it has an odd number.

Branch Node

It is the point in the trie where multiple paths diverge. A branch node can have more than one child. An extension node is like a branch node with only one child.

It has an array of 16 slots, one for each possible nibble value, plus one extra slot for a value. Each of the 16 slots can be either empty or refer to another node.

Leaf Node

A leaf node represents the end of a key and contains the associated value. It consists of the key's remaining nibbles and the corresponding value. The key's nibbles are stored directly in the node.

The stored path starts with a prefix nibble: 2 means the leaf node's remaining key has an even number of nibbles, and 3 means it has an odd number.
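The prefix rule for extension and leaf nodes (0 and 1 for extension, 2 and 3 for leaf) is known as hex-prefix encoding. Here is a small Java sketch of it, assuming the path is given as an array of nibbles (values 0 to 15). The class and method names are invented for this illustration.

```java
// Sketch of hex-prefix encoding for MPT paths; names are illustrative only.
class HexPrefixSketch {

    // nibbles: path as values 0..15; isLeaf: true for a leaf node, false for an extension node.
    static byte[] encode(int[] nibbles, boolean isLeaf) {
        for (int n : nibbles) {
            if (n < 0 || n > 15) {
                throw new IllegalArgumentException("not a nibble: " + n);
            }
        }
        boolean odd = nibbles.length % 2 == 1;
        // Prefix nibble: 0 = extension/even, 1 = extension/odd, 2 = leaf/even, 3 = leaf/odd.
        int flag = (isLeaf ? 2 : 0) + (odd ? 1 : 0);
        byte[] out = new byte[nibbles.length / 2 + 1];
        int next = 0;
        if (odd) {
            // Odd length: the first path nibble shares a byte with the prefix nibble.
            out[0] = (byte) ((flag << 4) | nibbles[0]);
            next = 1;
        } else {
            // Even length: the prefix nibble is followed by a zero padding nibble.
            out[0] = (byte) (flag << 4);
        }
        // Pack the remaining nibbles two per byte.
        for (int i = 1; next < nibbles.length; i++, next += 2) {
            out[i] = (byte) ((nibbles[next] << 4) | nibbles[next + 1]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new int[] {1, 2, 3, 4, 5}, false);
        StringBuilder sb = new StringBuilder();
        for (byte b : encoded) {
            sb.append(String.format("%02x ", b));
        }
        System.out.println(sb.toString().trim());
    }
}
```

The padding nibble exists because nodes are stored as whole bytes, so a path with an even number of nibbles plus one prefix nibble would otherwise not fill a whole number of bytes.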

Summary of MPT

After the nodes are computed, a final hash is created with the help of the keccak256() algorithm, and this hash is stored in the block header as the state root. This calculation is repeated for every block, as the state of the blockchain changes very rapidly.

That was a lot of distraction from the EVM. Now let's come back to the EVM and its architecture.

Architecture of EVM

A diagram showing the make up of the EVM

As shown in the figure, there are various components in the EVM architecture, such as:

  • Program Counter (PC)

    When a smart contract is compiled, we get bytecode. This bytecode is given to the EVM for execution. Bytecode contains a series of opcodes (an opcode is an instruction). The PC keeps a pointer that tells the EVM which instruction to perform next. Once the first opcode is executed, the pointer moves to the next opcode.

  • Stack

    The EVM is a stack-based virtual machine and does not use registers. The stack is used to store data temporarily during the execution of opcodes (operation codes) within the EVM bytecode. Data pushed onto the stack is consumed by subsequent opcodes, and it remains at the top of the stack until other operations are performed.

    There are more components, but they are self-explanatory. That is the architecture of the EVM.
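To see the program counter and the stack working together, here is a tiny interpreter sketch in Java. It is a heavy simplification: it supports only four real EVM opcodes (STOP 0x00, ADD 0x01, MUL 0x02, PUSH1 0x60) and ignores gas, memory, and the stack depth limit. The class name `MiniStackMachine` is made up for this example.

```java
import java.math.BigInteger;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreter for four EVM opcodes; no gas, memory, or stack limit (assumption).
class MiniStackMachine {
    static final int STOP = 0x00;
    static final int ADD = 0x01;
    static final int MUL = 0x02;
    static final int PUSH1 = 0x60;
    // EVM words are 256 bits wide, so arithmetic wraps around modulo 2^256.
    static final BigInteger WORD = BigInteger.ONE.shiftLeft(256);

    static Deque<BigInteger> run(byte[] code) {
        Deque<BigInteger> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next opcode to execute
        while (pc < code.length) {
            int opcode = code[pc] & 0xff;
            switch (opcode) {
                case STOP:
                    return stack;
                case PUSH1:
                    if (pc + 1 >= code.length) {
                        throw new IllegalArgumentException("PUSH1 without operand");
                    }
                    stack.push(BigInteger.valueOf(code[pc + 1] & 0xff));
                    pc += 2; // skip the opcode and its one-byte operand
                    break;
                case ADD:
                    stack.push(stack.pop().add(stack.pop()).mod(WORD));
                    pc += 1;
                    break;
                case MUL:
                    stack.push(stack.pop().multiply(stack.pop()).mod(WORD));
                    pc += 1;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported opcode: " + opcode);
            }
        }
        return stack;
    }

    public static void main(String[] args) {
        // PUSH1 2, PUSH1 3, ADD, STOP
        byte[] code = {0x60, 0x02, 0x60, 0x03, 0x01, 0x00};
        System.out.println(run(code).peek()); // top of the stack after execution
    }
}
```

There are no registers anywhere in this loop: every opcode takes its inputs from the top of the stack and pushes its result back, and the PC simply walks forward through the bytecode.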

Machine Space Of EVM

We already know that the stack is used for executing the opcodes provided by the EVM. Now let's explore some other parts of the EVM, such as memory and storage.

When to Use Memory and Storage? 🤔

Memory is a volatile space in the EVM. Temporary data used during smart contract execution is kept in memory. If you have worked with Solidity before, you will know that we use the memory keyword when working with strings inside functions. As soon as the contract execution stops, this memory is cleared, and the next execution starts with fresh memory.

Storage is something we have talked a lot about in this blog. It is persistent. Each contract account has its own storage, a key-value store that lives in the world state; it is organized as a Merkle Patricia Trie whose root is the storage hash field of the account.

So when a smart contract needs temporary space during execution we use memory, and when we require permanent storage we use storage.
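As a rough analogy in Java (not EVM code), memory behaves like a local variable that exists only during one call, while storage behaves like a field that survives between calls. Everything in this sketch, including the class name and the "lastSum" key, is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Analogy only: a Java object standing in for a contract (assumption).
class MemoryVsStorageSketch {
    // "Storage": persists between calls, like a contract's key-value storage.
    private final Map<String, Long> storage = new HashMap<>();

    long addAndStore(long a, long b) {
        // "Memory": scratch space that exists only while this call runs.
        long[] memory = new long[1];
        memory[0] = a + b;
        storage.put("lastSum", memory[0]); // only what we write to storage survives
        return memory[0];
    }

    long lastSum() {
        return storage.getOrDefault("lastSum", 0L);
    }

    public static void main(String[] args) {
        MemoryVsStorageSketch contract = new MemoryVsStorageSketch();
        contract.addAndStore(2, 3);
        // The scratch array is gone, but the stored value is still readable.
        System.out.println(contract.lastSum());
    }
}
```

In the real EVM, writing to storage is also far more expensive in gas than using memory, which is another reason to keep temporary data in memory.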

What Are Logs and Calldata?

Logs

Logs in the EVM are a way for smart contracts to communicate information to external entities and provide a mechanism for decentralized applications (DApps) to access events that occurred during contract execution. Here are the key details about logs:

  1. Purpose

    • Logs are typically used to emit events from a smart contract.

    • Events represent occurrences or state changes within a contract that are of interest to external parties. Logs cannot be read back by smart contracts themselves.

  2. How They Work

    • A smart contract emits a log by using one of the LOG opcodes (LOG0 to LOG4).

    • A log consists of up to four 32-byte topics plus a data field of arbitrary length.

    • Contracts define events with specific parameters, and emitting a log involves providing values for those parameters.

  3. Usage

    • DApps and external systems can "watch" for specific events by monitoring the logs.

    • Logs are often used to implement decentralized oracle services, track token transfers, or signal important state changes.

  4. Gas Cost

    • Emitting logs consumes gas, and the cost depends on the number of topics and data included.

CallData

Call data in the EVM refers to the input data provided when a contract is called or a transaction is made. It includes the function selector and any arguments passed to the function. Here are the key details about call data:

  1. Purpose:

    • Call data contains the input parameters for a contract call.

    • It is used by the EVM to determine which function to execute and with what arguments.

  2. Structure:

    • The first four bytes of the call data contain the function selector, which is the first four bytes of the Keccak-256 hash of the function signature and identifies the function to be called.

    • The remaining data contains the actual parameters and arguments passed to the function.

  3. How It Works:

    • When a contract is called, the EVM uses the function selector to identify the function to execute and passes the corresponding arguments from the call data to the function.

  4. Usage:

    • Call data is essential for enabling interactions between contracts and for providing input to contract functions.

  5. Gas Cost:

    • Reading from call data has a gas cost, and the cost depends on the amount of data read.
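The call data layout can be illustrated with a short Java sketch that splits a hex string into the 4-byte selector and the 32-byte argument words that follow it. It assumes only static arguments (no dynamic types such as strings or arrays). The class and method names are made up; the selector a9059cbb used in the example is the well-known selector of the ERC-20 transfer(address,uint256) function, while the address and amount are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: split call data into selector and 32-byte words (static arguments only).
class CallDataSketch {

    private static String strip0x(String hex) {
        return hex.startsWith("0x") ? hex.substring(2) : hex;
    }

    // First 4 bytes (8 hex characters): the function selector.
    static String selector(String callDataHex) {
        String hex = strip0x(callDataHex);
        if (hex.length() < 8) {
            throw new IllegalArgumentException("call data shorter than 4 bytes");
        }
        return hex.substring(0, 8);
    }

    // Everything after the selector, cut into 32-byte (64 hex character) words.
    static List<String> argumentWords(String callDataHex) {
        String hex = strip0x(callDataHex);
        if (hex.length() < 8 || (hex.length() - 8) % 64 != 0) {
            throw new IllegalArgumentException("arguments are not whole 32-byte words");
        }
        List<String> words = new ArrayList<>();
        for (int i = 8; i < hex.length(); i += 64) {
            words.add(hex.substring(i, i + 64));
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical call: transfer(0x1111...1111, 1000)
        String to = "0".repeat(24) + "1".repeat(40);  // 20-byte address, left-padded
        String amount = "0".repeat(61) + "3e8";       // 1000 as a 32-byte word
        String callData = "0xa9059cbb" + to + amount;

        System.out.println("selector: " + selector(callData));
        List<String> words = argumentWords(callData);
        System.out.println("to:       0x" + words.get(0).substring(24));
        System.out.println("amount:   " + new BigInteger(words.get(1), 16));
    }
}
```

This is roughly what the dispatcher at the start of a compiled contract does: compare the first four bytes against the known selectors, then read the arguments from fixed offsets.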

If you are reading up to this line then you are a warrior ⚔️. These are only a few parts of the EVM, and the actual working is more complex. This is the foundation we have to build before looking at how the EVM works.

Thank you for reading this blog! That's a wrap, see you in the next blog. In the next blog of this blockchain series we will cover:

  • ABI, bytecode, and opcodes in detail

  • SOLC compilers

  • Working with Solidity

  • ERC standards and many more to come

Do comment with suggestions! I am a pretty big noob at writing blogs. If anyone is working in blockchain, do let me know my mistakes.

Thank you very much. Let's sign off with one meme.

The Ultimate Merkle Tree Guide in Solidity
