Sig Engineering - Part 6 - Progress on the Sig SVM

This post is Part 6 of a multi-part blog post series we will periodically release to outline Sig's engineering updates. You can find other Sig engineering articles here.

This blog post highlights some major achievements in Sig since our last update in January. We've made significant progress on the implementation of the Solana Virtual Machine (SVM), including the development of the eBPF virtual machine, native programs (with some still in progress), native cross-program invocation (CPI), conformance testing, and more.

Solana’s Virtual Machine (SVM)

The Solana Virtual Machine (SVM) is the transaction execution layer of a Solana validator. At a high level, the SVM consists of two key components: a runtime and an sBPF virtual machine. The runtime orchestrates the execution of transactions and their instructions, while the virtual machine enables the execution of arbitrary sBPF programs.

Note: The precise definition of the Solana Virtual Machine (SVM) is often debated (as mentioned in Anza’s New SVM API blog post). Some argue it refers solely to the sBPF virtual machine, while others contend it encompasses the entire transaction processing pipeline.

Runtime

As previously mentioned, the runtime orchestrates the execution of transactions and their instructions. Each transaction contains one or more instructions, with each instruction specifying a target program, a command for the target program to execute, and a list of accounts to operate on. The following diagram provides a high-level illustration of how the runtime executes a batch of transactions.

Note: For more information on transactions and instructions, check out Solana’s explainer here.

Transactions within a batch are executed sequentially, as they may modify the same accounts. If a transaction fails, the error is recorded, and the runtime proceeds to the next transaction. Likewise, instructions within a transaction are executed in order, but if an instruction fails, the entire transaction fails immediately.
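The failure semantics described above can be illustrated with a minimal Python sketch. The `Transaction` type, the `execute_batch` function, and the dictionary of account balances are hypothetical names made up for this example, not Sig's actual data structures, and rolling back partial state changes of a failed transaction is deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration only; not Sig's actual data structures.
Accounts = Dict[str, int]
Instruction = Callable[[Accounts], None]  # mutates accounts, raises on failure


@dataclass
class Transaction:
    instructions: List[Instruction]


def execute_batch(transactions: List[Transaction], accounts: Accounts) -> List[Optional[str]]:
    """Execute transactions in order; return None or an error message per transaction."""
    results: List[Optional[str]] = []
    for tx in transactions:
        error: Optional[str] = None
        for instruction in tx.instructions:
            try:
                instruction(accounts)
            except Exception as exc:
                # The first failing instruction fails the whole transaction.
                error = str(exc)
                break
        # The error (if any) is recorded and the batch moves on.
        results.append(error)
    return results


def credit(accounts: Accounts) -> None:
    accounts["alice"] += 5


def always_fails(accounts: Accounts) -> None:
    raise RuntimeError("custom program error")


accounts = {"alice": 0}
batch = [Transaction([credit]), Transaction([always_fails, credit]), Transaction([credit])]
results = execute_batch(batch, accounts)
print(results)   # [None, 'custom program error', None]
print(accounts)  # {'alice': 10}
```

The second transaction stops at its first instruction, so its `credit` never runs, while the third transaction still executes.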

The execution of each instruction begins by pushing it onto an instruction stack. A processor then reads the instruction from the stack and attempts to invoke the target program with the associated accounts and data. In Solana, each program is either owned by the Native Loader, with its implementation defined within the validator (e.g., the SystemProgram), or owned by a BPF Loader Program and defined as an on-chain program (e.g., the Jupiter Program). 

Currently, the runtime directly invokes native programs (i.e., programs owned by the Native Loader) by looking up and calling the associated entrypoint. To execute on-chain programs, the runtime invokes the owning BPF Loader Program, which then executes the target program within a provisioned sBPF virtual machine.

If a program performs CPI, a corresponding new instruction is created and pushed onto the stack for execution. Once an instruction completes, it is popped from the stack. If the stack is empty, the next transaction instruction is pushed onto the stack; otherwise, the instruction must have been triggered by a CPI call, and the execution of the caller continues.
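As a rough sketch of the push/pop behaviour described above, the following Python models programs as plain functions that receive an `invoke` callback for performing CPI. All names (`process_transaction`, `swap_program`, `token_program`) are hypothetical, and error handling and stack depth limits are left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a program receives its instruction data and an `invoke`
# callback that it may call to perform a cross-program invocation (CPI).
Invoke = Callable[[str, bytes], None]
Program = Callable[[bytes, Invoke], None]


def process_transaction(
    programs: Dict[str, Program], instructions: List[Tuple[str, bytes]]
) -> List[str]:
    stack: List[str] = []  # program ids of the instructions currently executing
    trace: List[str] = []  # push/pop events, recorded for illustration

    def invoke(program_id: str, data: bytes) -> None:
        stack.append(program_id)
        trace.append(f"push {program_id} (depth {len(stack)})")
        programs[program_id](data, invoke)  # a nested invoke() here is a CPI
        stack.pop()
        trace.append(f"pop {program_id}")
        # If the stack is non-empty, control returns to the calling program.

    for program_id, data in instructions:
        assert not stack  # empty stack: push the next transaction instruction
        invoke(program_id, data)
    return trace


def token_program(data: bytes, invoke: Invoke) -> None:
    pass  # does its work without calling other programs


def swap_program(data: bytes, invoke: Invoke) -> None:
    invoke("token", b"transfer")  # CPI into the token program


trace = process_transaction(
    {"token": token_program, "swap": swap_program},
    [("swap", b""), ("token", b"")],
)
print("\n".join(trace))
```

The trace shows `token` being pushed at depth 2 while `swap` is still on the stack, then again at depth 1 as a top-level instruction.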

The current focus of Sig’s runtime development is supporting the execution of instructions that target key native programs and basic sBPF programs. The diagram below outlines the key requirements and our progress toward this objective.

Some Native Programs are omitted in the above diagram as they will be naturally supported due to their inclusion and current status in the Core BPF Migration.

sBPF Virtual Machine

At the core of the SVM is an Extended Berkeley Packet Filter (eBPF) Virtual Machine (VM). 

A VM is an abstract CPU that executes a set of defined instructions. For example, it can move values between registers, add values in registers, jump to different instructions, etc.

eBPF is a bytecode format that encodes instructions into a specific sequence of bytes. Solana programs are compiled into Solana’s modified version of eBPF (sBPF) and executed on a VM conforming to Solana’s Virtual Machine Instruction Set Architecture (SVM ISA). 
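To make "a specific sequence of bytes" concrete, here is a small Python sketch that decodes one instruction using the classic eBPF layout: an 8-byte little-endian slot holding an 8-bit opcode, two 4-bit register fields, a signed 16-bit offset, and a signed 32-bit immediate. The `Insn` type and `decode` function are hypothetical helpers, and individual sBPF versions may assign opcodes differently from classic eBPF.

```python
import struct
from typing import NamedTuple


class Insn(NamedTuple):
    opcode: int
    dst: int     # destination register index
    src: int     # source register index
    offset: int  # signed 16-bit offset (jumps, memory access)
    imm: int     # signed 32-bit immediate


def decode(raw: bytes) -> Insn:
    """Decode one 8-byte instruction slot (classic eBPF layout, little-endian)."""
    opcode, regs, offset, imm = struct.unpack("<BBhi", raw)
    # The low nibble of the register byte is dst, the high nibble is src.
    return Insn(opcode, regs & 0x0F, regs >> 4, offset, imm)


# 0xB7 is the classic eBPF opcode for a 64-bit "mov register, immediate".
insn = decode(bytes([0xB7, 0x00, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00]))
print(insn)  # Insn(opcode=183, dst=0, src=0, offset=0, imm=10)
```

Under these assumptions the eight bytes above correspond to `mov r0, 10`.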

As of this writing, Sig fully supports validation and parsing of Solana programs (including ELF files) as well as execution of all sBPF instructions included in V1, V2, and V3.

Sig’s Implementation

Execution of an instruction via the sBPF VM begins with the runtime loading an executable from the target program account and serializing the instruction accounts and data into a memory map, which can be interpreted and operated on by the VM. The executable and memory map are then provided to the VM for execution.

Solana programs are typically stored in Executable and Linkable Format (ELF). An ELF file contains “sections”, which are regions of the file dedicated to storing a certain type of data. For example, the .text section stores sBPF instructions, and the .rodata (read-only data) section stores constants.

To execute a program, we first parse the ELF file and decode the sBPF instructions, program entry point, declared functions, and other auxiliary references. Sig accomplishes this without heap allocation, ensuring a fast and reliable process.

Once an ELF file is parsed, it must be validated. This process ensures that there is only one .text section, the number of sections is valid, and no sections overlap, among many other checks. Currently, there are three versions of sBPF: V1 and V2 use a more lenient validation scheme, while V3 introduces a stricter and more secure approach.
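Two of the checks mentioned above (a single text section and no overlapping sections) can be sketched in a few lines of Python. The `Section` type and `validate_sections` function are hypothetical, the section name is spelled in the conventional lowercase ELF form `.text`, and the real validation performs many more checks than this.

```python
from typing import List, NamedTuple


class Section(NamedTuple):
    name: str
    offset: int  # byte offset of the section within the file
    size: int    # section length in bytes


def validate_sections(sections: List[Section]) -> None:
    """Raise ValueError unless there is exactly one .text and no sections overlap."""
    text_count = sum(1 for s in sections if s.name == ".text")
    if text_count != 1:
        raise ValueError(f"expected exactly one .text section, found {text_count}")

    # After sorting by offset, any overlap shows up between neighbours.
    ordered = sorted(sections, key=lambda s: s.offset)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.offset + prev.size > cur.offset:
            raise ValueError(f"sections {prev.name} and {cur.name} overlap")


good = [Section(".text", 0x100, 0x80), Section(".rodata", 0x180, 0x20)]
validate_sections(good)  # passes silently

bad = [Section(".text", 0x100, 0x90), Section(".rodata", 0x180, 0x20)]
try:
    validate_sections(bad)
    overlap_error = None
except ValueError as exc:
    overlap_error = str(exc)
print(overlap_error)  # sections .text and .rodata overlap
```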

Before execution within the VM, the sBPF instructions undergo static verification. This validates every instruction, even those that may never execute. If any invalid instructions are found, the program is rejected. For example, a "divide immediate" instruction with a zero divisor is illegal and will cause the program to be rejected, even if the instruction is unreachable.
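The zero-divisor rule can be sketched as a single pass over the instruction bytes. This Python example assumes the classic eBPF opcodes for division and modulo by an immediate (0x34, 0x37, 0x94, 0x97); newer sBPF versions may encode these operations differently, and `verify` is a hypothetical name covering only this one check.

```python
import struct

# Classic eBPF opcodes: div32, div64, mod32, mod64 with an immediate operand.
# Assumption: the sBPF version being verified keeps these encodings.
DIV_MOD_IMM_OPCODES = {0x34, 0x37, 0x94, 0x97}


def verify(program: bytes) -> None:
    """Reject any division or modulo by a zero immediate, reachable or not."""
    if len(program) % 8 != 0:
        raise ValueError("program length must be a multiple of 8 bytes")
    for pc in range(len(program) // 8):
        opcode, _regs, _offset, imm = struct.unpack_from("<BBhi", program, pc * 8)
        if opcode in DIV_MOD_IMM_OPCODES and imm == 0:
            raise ValueError(f"division by zero immediate at instruction {pc}")


EXIT = bytes([0x95, 0, 0, 0, 0, 0, 0, 0])           # exit
DIV64_R0_ZERO = bytes([0x37, 0, 0, 0, 0, 0, 0, 0])  # div r0, 0

try:
    # The division sits after `exit`, so it can never execute.
    verify(EXIT + DIV64_R0_ZERO)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

Because the check never follows control flow, the unreachable division is still enough to reject the program.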

Once the sBPF instructions have been successfully verified, they are executed within the VM. During execution, the VM checks for runtime errors such as out-of-bounds memory access or invalid instruction usage. The VM also handles system calls, allowing programs to perform operations like memcpy and memset and to log events for debugging. Programs running in the VM can use these system calls to interact with on-chain data, reading and writing from other accounts as needed. Throughout execution, the VM tracks the number of compute units used, enforcing limits to prevent excessive resource consumption and infinitely looping programs.
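The role of compute unit metering is easiest to see on a toy machine. The Python sketch below is not sBPF: it is a hypothetical one-register interpreter that charges one unit per instruction (an assumption made for simplicity) and aborts once the budget is spent, which is what stops an infinite loop.

```python
from typing import List, Tuple


class ComputeBudgetExceeded(Exception):
    pass


def run(program: List[Tuple[str, int]], budget: int) -> Tuple[int, int]:
    """Interpret a toy program; return (accumulator, units used)."""
    pc, acc, used = 0, 0, 0
    while True:
        if used >= budget:
            raise ComputeBudgetExceeded(f"budget of {budget} units exhausted")
        used += 1  # assumption: every instruction costs exactly one unit
        op, arg = program[pc]
        if op == "add":
            acc += arg
            pc += 1
        elif op == "jmp":
            pc = arg
        elif op == "exit":
            return acc, used
        else:
            raise ValueError(f"invalid instruction {op!r} at {pc}")


print(run([("add", 2), ("add", 3), ("exit", 0)], budget=100))  # (5, 3)

try:
    run([("jmp", 0)], budget=1000)  # jumps to itself forever
    halted = False
except ComputeBudgetExceeded:
    halted = True
print(halted)  # True
```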

Fibonacci in sBPF

Below is an example Fibonacci program written in sBPF assembly, followed by the command to run it with Sig’s VM implementation!

entrypoint:
    mov r0, 10
    call function_fib ; Compute the 10th Fibonacci number
    exit ; Returns the value in the r0 register

; Passes N in r0
; Returns the Nth Fibonacci number in r0
function_fib: 
    mov r1, 0       ; Fib(0) = 0
    mov r2, 1       ; Fib(1) = 1

    jle r0, 1, done ; If N <= 1, skip the loop (done returns r2, so N = 0 yields 1)
loop:
    add r1, r2      ; r1 = r1 + r2 (next Fib number)
    mov r3, r2      ; Store the old r2 value
    mov r2, r1      ; Update r2 to new Fib number
    mov r1, r3      ; Move the previous r2 into r1

    add r0, -1      ; Decrement N
    jgt r0, 1, loop ; Continue the loop if N > 1
done:
    mov r0, r2      ; Return the last computed Fibonacci number
    exit

$ zig build vm -- -a fib.asm
result: 55, count: 61
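
For readers less familiar with assembly, the register updates in `function_fib` can be transliterated line by line into Python. This is only a cross-check of the arithmetic, not a model of Sig's VM, and it does not reproduce the instruction count.

```python
def fib(n: int) -> int:
    """Mirror the register updates of function_fib from the assembly above."""
    r0, r1, r2 = n, 0, 1
    if r0 > 1:  # jle r0, 1, done
        while True:
            r1 = r1 + r2  # add r1, r2
            r3 = r2       # mov r3, r2
            r2 = r1       # mov r2, r1
            r1 = r3       # mov r1, r3
            r0 -= 1       # add r0, -1
            if not r0 > 1:  # jgt r0, 1, loop
                break
    return r2  # mov r0, r2 (so an input of 0 also returns 1)


print(fib(10))  # 55
```

The result matches the `result: 55` reported by the VM for an input of 10.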

Challenges and Future Work

To improve and ensure the VM’s correctness, the next steps will involve expanding the test suite, including additional unit tests and further fuzzing of both the SVM and the sBPF VM itself. Fuzzing is essential for uncovering rare edge cases that are difficult to anticipate, especially in a system as complex as the SVM.

One of the challenges in implementing the SVM is accurately replicating C’s implicit type conversions, particularly around signed versus unsigned behavior, where small differences can introduce subtle inconsistencies that are hard to reproduce. Ensuring that the VM behaves identically across different implementations requires careful handling and rigorous testing.
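A small Python sketch shows why the choice between signed and unsigned widening matters. It applies a 32-bit immediate of -1 to a 64-bit register in two ways. Which extension a given instruction uses is defined by the ISA; the helpers here (`sign_extend_32`, `zero_extend_32`, `add64`) are hypothetical and only illustrate how far apart the two results land.

```python
MASK64 = (1 << 64) - 1


def sign_extend_32(imm: int) -> int:
    """Widen a 32-bit value to 64 bits, copying the sign bit upward."""
    imm &= 0xFFFFFFFF
    return (imm | 0xFFFFFFFF00000000) if (imm & 0x80000000) else imm


def zero_extend_32(imm: int) -> int:
    """Widen a 32-bit value to 64 bits, filling the upper half with zeros."""
    return imm & 0xFFFFFFFF


def add64(a: int, b: int) -> int:
    """64-bit wrapping addition on unsigned bit patterns."""
    return (a + b) & MASK64


r0 = 10
print(add64(r0, sign_extend_32(-1)))  # 9
print(add64(r0, zero_extend_32(-1)))  # 4294967305
```

With sign extension the operation decrements the register, as `add r0, -1` does in the Fibonacci example. With zero extension the same bits add roughly four billion.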

Another area of exploration is optimizing execution through Just-In-Time (JIT) compilation. This approach converts the sBPF instructions into native machine instructions before executing them, which can offer large boosts in performance. One approach under consideration is a “copy-and-patch” technique, in which the sequences of native machine instructions are predetermined templates, making code generation both faster and safer.
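The core idea of copy-and-patch can be sketched in Python as follows: each operation has a fixed machine-code template (a "stencil") with a hole for its operand, and code generation only copies the template and fills in the hole. The stencil table below is a hand-written assumption using three well-known x86-64 encodings; real implementations typically derive stencils from compiler output, and this sketch only builds the bytes without executing them.

```python
import struct
from typing import List, Optional, Tuple

# Each stencil: (machine-code template, byte offset of its 32-bit hole or None).
# Hand-written x86-64 encodings, chosen for illustration only.
STENCILS = {
    "mov_imm": (bytes([0xB8, 0, 0, 0, 0]), 1),  # mov eax, imm32
    "add_imm": (bytes([0x05, 0, 0, 0, 0]), 1),  # add eax, imm32
    "ret": (bytes([0xC3]), None),               # ret
}


def compile_ops(ops: List[Tuple[str, Optional[int]]]) -> bytes:
    out = bytearray()
    for name, imm in ops:
        template, hole = STENCILS[name]
        code = bytearray(template)                   # copy the stencil
        if hole is not None:
            struct.pack_into("<i", code, hole, imm)  # patch the operand
        out += code
    return bytes(out)


code = compile_ops([("mov_imm", 10), ("add_imm", 32), ("ret", None)])
print(code.hex())  # b80a0000000520000000c3
```

No instruction selection or encoding logic runs at compile time, which is where the speed of the technique comes from.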

Conformance Testing

We leverage conformance testing frameworks and fuzzing infrastructure developed by Firedancer and Asymmetric Research to ensure compliance with Solana's implicit specification. Firedancer’s solana-conformance repository provides tooling and test harnesses to validate the following eight components against Solana’s Agave implementation:

Currently, we are passing all publicly available test vectors for the Elf Loader, Shred Parser, and VM Validate harnesses, and are actively working toward integrating the VM Interpret, Syscall, CPI, and Instruction harnesses. The publicly available test vectors include fuzz inputs that previously uncovered a conformance issue between the Firedancer and Agave implementations. While this is a great starting point, it is insufficient, as Sig-specific issues may be missed. To address this, we have also been working with Asymmetric Research to set up continuous fuzzing of all integrated components.
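The general shape of this kind of differential testing can be sketched as follows: feed the same inputs to a reference implementation and to the implementation under test, then report every input on which their outcomes differ. This is a generic Python illustration with hypothetical names (`find_mismatches`, `reference`, `candidate`), not the actual API or fixture format of solana-conformance.

```python
from typing import Any, Callable, List, Tuple

Outcome = Tuple[str, Any]


def outcome(fn: Callable[[bytes], Any], vector: bytes) -> Outcome:
    """Capture either the return value or the error type, so both can be compared."""
    try:
        return ("ok", fn(vector))
    except Exception as exc:
        return ("err", type(exc).__name__)


def find_mismatches(
    reference: Callable[[bytes], Any],
    candidate: Callable[[bytes], Any],
    vectors: List[bytes],
) -> List[Tuple[bytes, Outcome, Outcome]]:
    mismatches = []
    for vector in vectors:
        expected, actual = outcome(reference, vector), outcome(candidate, vector)
        if expected != actual:
            mismatches.append((vector, expected, actual))
    return mismatches


def reference(data: bytes) -> int:
    return sum(data) % 256


def candidate(data: bytes) -> int:
    return sum(data) & 0x7F  # deliberately buggy: drops the top bit


vectors = [b"\x01\x02", b"\xff\x01", b"\x80"]
mismatches = find_mismatches(reference, candidate, vectors)
print(mismatches)  # [(b'\x80', ('ok', 128), ('ok', 0))]
```

Only the third vector exposes the bug, which is why a fixed set of vectors is a starting point and continuous fuzzing is needed to find implementation-specific issues.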

Integrating conformance tests represents a significant advancement in the successful development of the SVM. With support from teams like Firedancer and Asymmetric Research, we’re confident we’ll be passing all conformance tests, allowing for a truly diverse set of SVM implementations to coexist.

Conclusion

The SVM is a vital component of a Solana validator, enabling account state modifications through transactions. Its implementation marks a major milestone in Sig’s development. While we are still in the early stages, we're encouraged by our progress and eager to tackle the challenges ahead. In the coming months, we'll begin conformance testing for instruction execution and will share further updates as the SVM is completed.