So You Want to Build a Language VM - Part 21 - Header Offset
Adds in calculating the starting PC location based on how much read-only data there is
Intro
Hello! In this article, we’re going to fix a teensy bug. =)
The Problem
An Iridium program (bytecode) always starts with a 64-byte header. The first 4 bytes are a constant magic number that lets the Iridium VM know that it’s Iridium bytecode. The rest are zeros. An example would be:
[45, 50, 49, 45, ... 0]
Then after that comes the read-only section of the program that contains things like constants. We do not know how long this section is until we have assembled the program. If someone declares a constant for the string "Hello", the read-only section would look like:
[72, 101, 108, 108, 111, 0]
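In our program, we now have 64 (the header length) + 6 (the constant Hello, and remember strings are null-terminated) bytes, for a total of 70 bytes. The VM needs to start executing at offset 70 (counting from zero), which is the 71st byte of the program. Right now it always starts at offset 64, so it would try to execute our constant as if it were code.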
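Here is a minimal sketch of that arithmetic. The names HEADER_LENGTH, asciiz, and code_start are hypothetical helpers made up for illustration; they are not part of the Iridium source. With zero-based indexing, the first instruction lands at offset 70.

```rust
/// Length of the bytecode header in bytes (64, as described above).
const HEADER_LENGTH: usize = 64;

/// Hypothetical helper: encodes a string constant as null-terminated bytes,
/// the way it would sit in the read-only section.
fn asciiz(s: &str) -> Vec<u8> {
    let mut bytes = s.as_bytes().to_vec();
    bytes.push(0); // the null terminator
    bytes
}

/// Hypothetical helper: zero-based offset of the first instruction,
/// given the full contents of the read-only section.
fn code_start(read_only: &[u8]) -> usize {
    HEADER_LENGTH + read_only.len()
}
```

Calling code_start(&asciiz("Hello")) adds the 64 header bytes to the 6 bytes of the constant.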
Fixit Fixit
Worry not, it is pretty easy to fix this! We need to:
Calculate the final length of the read-only section during assembly
Write that to the header using the next 4 bytes
Initialize our VM with that PC
Calculating the Length
Head over to src/assembler/mod.rs and let’s modify a function. We’re going to use the byteorder crate’s nifty features for this. The function we’re going to modify is write_pie_header. Change it to be:
fn write_pie_header(&self) -> Vec<u8> {
    let mut header = vec![];
    for byte in &PIE_HEADER_PREFIX {
        header.push(*byte);
    }
    // Now we need to calculate the starting offset so that the VM knows where the RO section ends
    // First we declare an empty vector for byteorder to write to
    let mut wtr: Vec<u8> = vec![];
    // Convert the length of the read-only section to a u32 and write it to the vector
    // This is important because write_u32 always emits exactly 4 bytes, zero-padding small values
    wtr.write_u32::<LittleEndian>(self.ro.len() as u32).unwrap();
    // Append those 4 bytes to the header directly after the first four bytes
    header.append(&mut wtr);
    // Now pad the rest of the bytecode header
    while header.len() < PIE_HEADER_LENGTH {
        header.push(0u8);
    }
    header
}
The three new lines in the middle are the key; I’ve added comments to each one explaining what it does.
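If you want to poke at the header layout without pulling in byteorder, here is a rough standard-library-only sketch of the same idea. It swaps in u32::to_le_bytes for byteorder's write_u32, and build_header is a hypothetical free function rather than the assembler's actual method. The prefix bytes and the 64-byte length are taken from the examples in this article.

```rust
/// Magic number and header length, using the values shown in this article.
const PIE_HEADER_PREFIX: [u8; 4] = [45, 50, 49, 45];
const PIE_HEADER_LENGTH: usize = 64;

/// Hypothetical stand-in for write_pie_header that takes the read-only
/// section length as a parameter instead of reading it from the assembler.
fn build_header(ro_len: u32) -> Vec<u8> {
    let mut header = Vec::with_capacity(PIE_HEADER_LENGTH);
    // Bytes 0..4: the magic number
    header.extend_from_slice(&PIE_HEADER_PREFIX);
    // Bytes 4..8: the read-only length as exactly 4 little-endian bytes
    header.extend_from_slice(&ro_len.to_le_bytes());
    // Bytes 8..64: zero padding
    header.resize(PIE_HEADER_LENGTH, 0);
    header
}
```

Little-endian means the least significant byte comes first, so a length of 6 is stored as 6, 0, 0, 0.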
Don’t forget to add:
use byteorder::{LittleEndian, WriteBytesExt};
to the top of the src/assembler/mod.rs file.
One More Thing
We need to make sure we write the header after we’ve set up all the read-only data. In src/assembler/mod.rs, in the function assemble, move the call to write_pie_header to just after the body is generated, like this:
let mut body = self.process_second_phase(&program);
// Get the header so we can smush it into the bytecode later
let mut assembled_program = self.write_pie_header();
// Merge the header with the populated body vector
assembled_program.append(&mut body);
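To see why the ordering matters, here is a small sketch using a made-up MiniAssembler type. It is purely illustrative and not the real Assembler: it only tracks a read-only section and computes the 4 length bytes that would go into the header.

```rust
/// Hypothetical, stripped-down assembler that only tracks read-only data.
struct MiniAssembler {
    ro: Vec<u8>,
}

impl MiniAssembler {
    fn new() -> MiniAssembler {
        MiniAssembler { ro: Vec::new() }
    }

    /// Appends a null-terminated string constant to the read-only section.
    fn add_constant(&mut self, s: &str) {
        self.ro.extend_from_slice(s.as_bytes());
        self.ro.push(0);
    }

    /// The 4 little-endian bytes that would be written into the header
    /// if the header were generated right now.
    fn ro_length_field(&self) -> [u8; 4] {
        (self.ro.len() as u32).to_le_bytes()
    }
}
```

Asking for ro_length_field before add_constant("Hello") has run yields all zeros; asking afterwards yields 6, 0, 0, 0. Generating the header too early would bake in the stale length.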
Reading the Offset
Now we need to teach our VM how to read the offset. In src/vm.rs, add the following function:
// Needs `use std::io::Cursor;` and `use byteorder::{LittleEndian, ReadBytesExt};` at the top of src/vm.rs
fn get_starting_offset(&self) -> usize {
    // We only want to read the slice containing the 4 bytes right after the magic number
    let mut rdr = Cursor::new(&self.program[4..8]);
    // Read it as a u32, cast as a usize (since the VM's PC attribute is a usize), and return it
    rdr.read_u32::<LittleEndian>().unwrap() as usize
}
and then in the run function of the VM, replace:
self.pc = 64;
with:
self.pc = 64 + self.get_starting_offset();
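For comparison, here is a standard-library-only sketch of the read side. It uses u32::from_le_bytes instead of byteorder's read_u32, and it returns None for a program that is too short rather than panicking. The function names starting_offset and initial_pc are hypothetical and not part of the VM.

```rust
/// Header length in bytes, as described in this article.
const PIE_HEADER_LENGTH: usize = 64;

/// Hypothetical helper: reads the read-only section length from bytes 4..8.
/// Returns None if the program has fewer than 8 bytes.
fn starting_offset(program: &[u8]) -> Option<usize> {
    let b = program.get(4..8)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize)
}

/// Hypothetical helper: the zero-based offset where execution should begin,
/// i.e. just past the header and the read-only section.
fn initial_pc(program: &[u8]) -> Option<usize> {
    Some(PIE_HEADER_LENGTH + starting_offset(program)?)
}
```

For a program with the "Hello" constant from earlier, initial_pc comes out to 64 + 6. Whether a short program should be a hard error or a None is a design choice; the version in the article simply unwraps.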
Tests
Now let’s write a test to make sure it works! In src/assembler/mod.rs, add this test:
#[test]
/// Simple test of data that goes into the read only section
fn test_code_start_offset_written() {
    let mut asm = Assembler::new();
    let test_string = ".data\ntest1: .asciiz 'Hello'\n.code\nload $0 #100\nload $1 #1\nload $2 #0\ntest: inc $0\nneq $0 $2\njmpe @test\nhlt";
    let program = asm.assemble(test_string);
    assert_eq!(program.is_ok(), true);
    // Unwrap the Result before indexing into the assembled bytes
    let program = program.unwrap();
    // Byte 4 is the low byte of the read-only length: "Hello" plus its null terminator
    assert_eq!(program[4], 6);
}
With that test string, we should have a header that looks like:
[45, 50, 49, 45, 6, 0, 0, 0, ... ]
If we run our test, we should see:
$ cargo test test_code_start_offset_written -- --nocapture
Finished dev [unoptimized + debuginfo] target(s) in 0.11s
Running target/debug/deps/iridium-981657ef3cdcfc6e
running 1 test
test assembler::tests::test_code_start_offset_written ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 45 filtered out
Running target/debug/deps/iridium-87ed8e3d062c1031
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests iridium
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Yay, it works!
End
That wasn’t so bad, was it? You can see the code here: https://gitlab.com/subnetzero/iridium/tags/0.0.21.
Until next time!