So You Want to Build a Language VM - Part 07 - REPL and Code Execution

Adds basic hex code evaluation to the REPL

A More Advanced REPL

Our current REPL doesn’t do a ton, so let’s fix that. In this post, we’ll add commands for inspecting the program bytecode and the contents of the registers, and we’ll actually execute code entered as hexadecimal.

Important
You will need a basic understanding of hexadecimal for this part. I’ll go over a few parts here, but you may want to read a more thorough explanation.

Hexadecimal

We use what is called the decimal numbering system, AKA base 10. Why? Probably because we have 10 fingers. Most of us are accustomed to using the digits 0-9. The hexadecimal numbering system, AKA base 16, uses the digits 0-9 and the letters A-F. This means that one place can represent 0-15. This is convenient for us, because one hex digit can represent 4 bits, so 2 digits can represent 8 bits, or 1 byte. That means we need only 8 digits to represent one of our 32-bit instructions.
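
To make the digit-to-bits relationship concrete, here is a small sketch using Rust's standard formatting. The helper names (byte_to_hex, byte_to_binary, nibbles) are made up for illustration and are not part of the VM.

```rust
/// Formats one byte as two uppercase hex digits, e.g. 232 -> "E8".
fn byte_to_hex(byte: u8) -> String {
    format!("{:02X}", byte)
}

/// Formats one byte as eight binary digits, e.g. 232 -> "11101000".
fn byte_to_binary(byte: u8) -> String {
    format!("{:08b}", byte)
}

/// Splits a byte into its high and low 4-bit halves (nibbles).
/// Each half corresponds to exactly one hex digit.
fn nibbles(byte: u8) -> (u8, u8) {
    (byte >> 4, byte & 0x0F)
}
```

For the byte 232, the two halves are 14 (E) and 8, which is why it is written E8: two hex digits stand in for eight binary digits.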

An Example

We’re going to build a LOAD instruction that loads the number 1000 into register 12.

Get the Opcode

If you refer back to instruction.rs, you can see that our LOAD instruction uses the number 0. In hexadecimal, padded to two digits, that is 00.

Get First Operand

When using the LOAD instruction, our first operand is the number of the register we want to store the number in. We want to use register 12, which is C in hex. Since this operand takes one byte, or two hex digits, we pad with a leading zero, like so: 0C

Get the Second and Third Operands

The last two bytes are used to store the number. Remember, currently we are limited to 16 bits for this value, so it must be less than 2^16 (65536). We are storing the number 1000, which is 03 E8.

Putting it All Together

Our complete instruction looks like: 00 0C 03 E8. Now, let’s add some commands to the REPL.
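
Before moving on, the byte layout we just assembled by hand can be double-checked with a short sketch. The function encode_load is a hypothetical helper, not something in the VM; it assumes opcode 0 for LOAD and that the 16-bit value is stored high byte first, as in the walkthrough above.

```rust
/// Hypothetical helper: packs a LOAD instruction into 4 bytes.
/// Assumes LOAD is opcode 0 and the value is stored high byte first.
fn encode_load(register: u8, value: u16) -> [u8; 4] {
    const LOAD_OPCODE: u8 = 0x00;
    // to_be_bytes splits the 16-bit value into [high, low]
    let [high, low] = value.to_be_bytes();
    [LOAD_OPCODE, register, high, low]
}
```

Calling encode_load(12, 1000) gives the bytes 00 0C 03 E8, matching what we built step by step.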

Expanding the REPL

We’re going to add two more commands to the REPL:

  1. .program

  2. .registers

.program

This will print out the bytecode of the entire program vector in the VM. The implementation looks like:

".program" => {
    println!("Listing instructions currently in VM's program vector:");
    for instruction in &self.vm.program {
        println!("{}", instruction);
    }
    println!("End of Program Listing");
},

Important
To be able to access the program and registers fields of the VM struct, you’ll need to add pub in front of them!

.registers

This will list all 32 registers and their current values. It is useful for verifying that our instructions are doing what they are supposed to do:

".registers" => {
    println!("Listing registers and all contents:");
    println!("{:#?}", self.vm.registers);
    println!("End of Register Listing")
},

Inputting Hex

Our final task is to accept user input consisting of 8 hex digits broken into groups of 2. For this, we’ll add a new function to our REPL, called parse_hex:

// Needs `use std::num::ParseIntError;` at the top of the file.

/// Accepts a hexadecimal string WITHOUT a leading `0x` and returns a Vec of u8
/// Example for a LOAD command: 00 01 03 E8
fn parse_hex(&mut self, i: &str) -> Result<Vec<u8>, ParseIntError> {
    let mut results: Vec<u8> = vec![];
    // Each space-separated group should be one byte written as two hex digits
    for hex_string in i.split(" ") {
        // `?` returns early with the error if a group is not a valid hex byte
        let byte = u8::from_str_radix(hex_string, 16)?;
        results.push(byte);
    }
    Ok(results)
}
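
To see what the standard library call at the heart of this function accepts and rejects, here is a minimal sketch. The wrapper parse_hex_byte is a made-up name for illustration; u8::from_str_radix is the real standard library function.

```rust
use std::num::ParseIntError;

/// Parses a single hex group such as "E8" into a byte.
/// Fails on non-hex characters, empty input, or values above FF.
fn parse_hex_byte(group: &str) -> Result<u8, ParseIntError> {
    u8::from_str_radix(group, 16)
}
```

Both "E8" and "e8" parse to 232. Groups like "ZZ", "0xE8", an empty string, or "100" (which is 256 and does not fit in a u8) all return an error, which is why the doc comment asks for input without a leading 0x.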

The idea with this function is that we can enter our bytecode directly in the REPL and execute it. Using hex for the input means we only have to enter 8 hex digits per instruction instead of 32 binary digits. Next, we need to change how the REPL handles the "everything else" arm of the input matching. Right now, it checks for a few commands and discards everything else:

match buffer {
    ".quit" => {
        println!("Farewell! Have a great day!");
        std::process::exit(0);
    },
    _ => {
        println!("Invalid input");
    }
}

Instead of printing invalid input, we’re going to try to parse out hex and give it to the VM to run:

_ => {
    let results = self.parse_hex(buffer);
    match results {
        Ok(bytes) => {
            for byte in bytes {
                self.vm.add_byte(byte);
            }
            // Only run the VM when the input parsed successfully
            self.vm.run_once();
        },
        Err(_e) => {
            println!("Unable to decode hex string. Please enter 4 groups of 2 hex characters.");
        }
    };
}
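
The arm above relies on the VM having an add_byte method and a run_once method. As a rough sketch of what those two could look like, here is a stripped-down stand-in. MiniVm, its field layout, and the LOAD-only run_once are assumptions made for illustration; the real VM from the earlier parts has more opcodes and may differ in detail.

```rust
/// Hypothetical stand-in for the VM, reduced to what this post needs.
pub struct MiniVm {
    /// `pub` so the REPL can print the registers
    pub registers: [i32; 32],
    /// `pub` so the REPL can list the program bytes
    pub program: Vec<u8>,
    pc: usize,
}

impl MiniVm {
    pub fn new() -> Self {
        MiniVm { registers: [0; 32], program: Vec::new(), pc: 0 }
    }

    /// Appends one byte to the end of the program vector.
    pub fn add_byte(&mut self, byte: u8) {
        self.program.push(byte);
    }

    /// Executes one 4-byte instruction if a complete one is available.
    /// Only LOAD (assumed to be opcode 0) is handled in this sketch.
    pub fn run_once(&mut self) {
        if self.pc + 4 > self.program.len() {
            return; // no complete instruction left to run
        }
        let opcode = self.program[self.pc];
        let register = self.program[self.pc + 1] as usize;
        let value = u16::from_be_bytes([
            self.program[self.pc + 2],
            self.program[self.pc + 3],
        ]);
        self.pc += 4;
        if opcode == 0 && register < self.registers.len() {
            self.registers[register] = i32::from(value);
        }
    }
}
```

Feeding this sketch the bytes 00 01 03 E8 and calling run_once leaves 1000 in register 1, which is the behavior we expect from the REPL session.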

Does it Work?

Let’s find out!

Welcome to Iridium! Let's be productive!
>>> .registers
Listing registers and all contents:
[
    0,
    0,
    <snip>,
]
End of Register Listing
>>> 00 01 03 E8
>>> .registers
Listing registers and all contents:
[
    0,
    1000,
    <snip>,
]
End of Register Listing
>>>

Neat, huh? We typed in hex characters, which our VM executed!

Now, the thought of doing all our programming in hex is horrifying; the Elder Coders refer to the period of time when all code had to be written in hex as The Bad Times. But one of the points of this project is to experience each layer!

End

That about does it for this post. In our next post, we’ll start on an assembler!

