So You Want to Build a Language VM - Part 18 - PIDs
Adding PIDs to the VM
Intro
Hey everyone! In this tutorial, we’ll add PID tracking to the Iridium VM. Please ensure you are starting from https://gitlab.com/subnetzero/iridium/tags/0.0.17.
PIDs
There are two components we need to uniquely identify:
Iridium VMs
Processes run by those VMs
This does bring up a more fundamental question, though. Is a VM long-lived, or short-lived? Should we create a VM, with its own registers and heap, for every application we want to run? Should we create a pool of VMs, each in their own threads, waiting to run any application we care to load?
Warning | There are security considerations with re-using VMs. If we do that, we have to make sure we zero out the registers and heap before allowing another application access to them. Otherwise, an application could read data left behind by whatever ran on that VM before it. |
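To make the zeroing step concrete, here is a minimal sketch of what a reset could look like before a pooled VM is handed to another application. The `reset` method and the cut-down `VM` struct are assumptions for illustration; the tutorial's VM does not define them.

```rust
/// Cut-down stand-in for the tutorial's VM (assumption: only the fields
/// that matter for re-use are shown here).
pub struct VM {
    pub registers: [i32; 32],
    pub pc: usize,
    pub heap: Vec<u8>,
}

impl VM {
    pub fn new() -> VM {
        VM { registers: [0; 32], pc: 0, heap: vec![] }
    }

    /// Hypothetical helper: scrub everything the previous application
    /// could have written before the VM is re-used.
    pub fn reset(&mut self) {
        self.registers = [0; 32];
        self.pc = 0;
        // Overwrite the old bytes first, then drop them from the Vec.
        // Note: this is not a hardened secure wipe; an optimizer is free
        // to elide writes to memory that is never read again.
        self.heap.iter_mut().for_each(|byte| *byte = 0);
        self.heap.clear();
    }
}
```

A pool would call `reset()` between applications; creating a fresh VM per application, as we end up doing later in this post, sidesteps the problem entirely.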
Identifiers for Iridium VMs
For this, we’re going to use a random UUID. On creation, a VM will generate a random identifier for itself. This will work regardless of what we end up doing about handling multiple VMs.
For generating a random UUID, this crate is quite handy: https://github.com/uuid-rs/uuid. Add it to your Cargo.toml, and don’t forget extern crate uuid; in main.rs.
Note | Because the UUIDs will be random ones, we need to enable the v4 feature: uuid = { version = "0.7", features = ["v4"] } . |
Now over in src/vm.rs, let’s add a field to our VM:
use uuid::Uuid;

/// Virtual machine struct that will execute bytecode
#[derive(Default, Clone)]
pub struct VM {
    /// Array that simulates having hardware registers
    pub registers: [i32; 32],
    /// Program counter that tracks which byte is being executed
    pc: usize,
    /// The bytecode of the program being run
    pub program: Vec<u8>,
    /// Used for heap memory
    heap: Vec<u8>,
    /// Contains the remainder of modulo division ops
    remainder: usize,
    /// Contains the result of the last comparison operation
    equal_flag: bool,
    /// Contains the read-only section data
    ro_data: Vec<u8>,
    /// Is a unique, randomly generated UUID for identifying this VM
    id: Uuid,
}
And our builder function in impl VM:
/// Creates and returns a new VM
pub fn new() -> VM {
    VM {
        registers: [0; 32],
        program: vec![],
        ro_data: vec![],
        heap: vec![],
        pc: 0,
        remainder: 0,
        equal_flag: false,
        id: Uuid::new_v4(),
    }
}
(I’m not going to write a test for it, since it is impossible to fail)
Processes
What we want is really just an event log: "Application X was run at <timestamp> and terminated at <timestamp> with an exit code of <code>".
In theory, it is possible that a long running VM could re-use IDs, which could be confusing. Let’s give each application a random UUID as well.
Head back to vm.rs and add in this:
use chrono::prelude::*;

#[derive(Clone, Debug)]
pub enum VMEventType {
    Start,
    GracefulStop,
    Crash,
}

#[derive(Clone, Debug)]
pub struct VMEvent {
    event: VMEventType,
    at: DateTime<Utc>,
}
Note the use of the chrono package: https://github.com/chronotope/chrono. This is so we can easily use dates and times.
Note | And yes, all times are going to be in UTC. I am scowling right now at everyone who uses timezones in logs. |
Add chrono to your Cargo.toml, along with an extern crate chrono; in main.rs, just as we did for uuid.
Tracking Events
For now, we’ll give the VM a list of VMEvents that we’ll keep appending to.
/// Virtual machine struct that will execute bytecode
#[derive(Default, Clone)]
pub struct VM {
    // I'm removing the other fields as we have already seen them
    events: Vec<VMEvent>,
}
and…
pub fn new() -> VM {
    VM {
        // I'm removing the other fields as we have already seen them
        events: Vec::new(),
    }
}
Almost There!
Let’s modify the VM to add an event when the run() function starts, stops, or crashes:
/// Wraps execution in a loop so it will continue to run until done or there is an error
/// executing instructions.
pub fn run(&mut self) -> u32 {
    self.events.push(VMEvent{event: VMEventType::Start, at: Utc::now()});
    // TODO: Should setup custom errors here
    if !self.verify_header() {
        self.events.push(VMEvent{event: VMEventType::Crash, at: Utc::now()});
        println!("Header was incorrect");
        return 1;
    }
    // If the header is valid, move the PC to byte offset 64 (the 65th byte).
    self.pc = 64;
    let mut is_done = false;
    while !is_done {
        is_done = self.execute_instruction();
    }
    // The enum has no plain Stop variant, so record a GracefulStop here
    self.events.push(VMEvent{event: VMEventType::GracefulStop, at: Utc::now()});
    0
}
Note that we are assuming the application terminated gracefully as long as the while loop ends. This is because execute_instruction returns a bool, not an integer. Sigh.
Let’s change it. It will be a little painful, but it will be much more painful later.
First, we have to change the return value:
fn execute_instruction(&mut self) -> u32
Then, in the check for whether the pc has run past the end of the program:
if self.pc >= self.program.len() {
    return 1;
}
For the HLT and IGL codes:
Opcode::HLT => {
    println!("HLT encountered");
    return 0;
}
Opcode::IGL => {
    println!("Illegal instruction encountered");
    return 1;
}
and the very last line, where we used to return false to signal that there is more to execute, now returns 0:
fn execute_instruction(&mut self) -> u32 {
    if self.pc >= self.program.len() {
        return 1;
    }
    match self.decode_opcode() {
        Opcode::LOAD => {
            let register = self.next_8_bits() as usize;
            let number = u32::from(self.next_16_bits());
            self.registers[register] = number as i32;
        }
        // <snip a lot of other opcodes>
    };
    0
}
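The next version of run builds its events as Crash{code: 1} and GracefulStop{code: ...}, so the two terminating variants need to carry an exit code. The post does not show that change; the sketch below is what it presumably looks like. `std::time::SystemTime` stands in for chrono's `DateTime<Utc>` only so the snippet compiles without extra crates, and `exit_code` is a hypothetical helper, not something from the project.

```rust
use std::time::SystemTime;

#[derive(Clone, Debug, PartialEq)]
pub enum VMEventType {
    Start,
    /// The program stopped on its own, with the given exit code
    GracefulStop { code: u32 },
    /// The VM gave up on the program, with the given exit code
    Crash { code: u32 },
}

#[derive(Clone, Debug)]
pub struct VMEvent {
    pub event: VMEventType,
    /// Assumption: stand-in for the tutorial's `DateTime<Utc>`
    pub at: SystemTime,
}

impl VMEvent {
    /// Hypothetical helper: the exit code, if this event carries one
    pub fn exit_code(&self) -> Option<u32> {
        match self.event {
            VMEventType::Start => None,
            VMEventType::GracefulStop { code } | VMEventType::Crash { code } => Some(code),
        }
    }
}
```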
And now we go to change the run function yet again:
pub fn run(&mut self) -> u32 {
    self.events.push(VMEvent{event: VMEventType::Start, at: Utc::now()});
    // TODO: Should setup custom errors here
    if !self.verify_header() {
        self.events.push(VMEvent{event: VMEventType::Crash{code: 1}, at: Utc::now()});
        println!("Header was incorrect");
        return 1;
    }
    // If the header is valid, move the PC to byte offset 64 (the 65th byte).
    self.pc = 64;
    let mut is_done = 0;
    while is_done == 0 {
        is_done = self.execute_instruction();
    }
    self.events.push(VMEvent{event: VMEventType::GracefulStop{code: is_done}, at: Utc::now()});
    0
}
Crap. The problem is that we are treating a return code of 0 as the signal to keep executing, but right now some instructions (e.g., HLT) also return 0 to say the application finished successfully. So the program will continue, even when it shouldn’t.
Does this mean that HLT should return something > 0? To be honest, I don’t know. I do know I don’t want to break from the *nix convention of 0 == ok, and > 0 is an error of some sort…
Oh, hrm, Rust has the wonderful Option<_>…hehe…option. Let’s try using an Option with nothing in it as the signal to keep executing.
Note | I’m writing this as I write the code, so you can see my thought process. |
Let’s try this as the run function in vm.rs:
/// Wraps execution in a loop so it will continue to run until done or there is an error
/// executing instructions.
pub fn run(&mut self) -> u32 {
    self.events.push(VMEvent{event: VMEventType::Start, at: Utc::now()});
    // TODO: Should setup custom errors here
    if !self.verify_header() {
        self.events.push(VMEvent{event: VMEventType::Crash{code: 1}, at: Utc::now()});
        println!("Header was incorrect");
        return 1;
    }
    // If the header is valid, move the PC to byte offset 64 (the 65th byte).
    self.pc = 64;
    let mut is_done = None;
    while is_done.is_none() {
        is_done = self.execute_instruction();
    }
    self.events.push(VMEvent{event: VMEventType::GracefulStop{code: is_done.unwrap()}, at: Utc::now()});
    0
}
Note the unwrap() on is_done when adding the stop event. And then in the execute_instruction function:
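fn execute_instruction(&mut self) -> Option<u32> {
    if self.pc >= self.program.len() {
        return Some(1);
    }
    // <snip>
}
The HLT and IGL arms change the same way, returning Some(0) and Some(1). And at last, the end of our execute_instruction function, where None tells run to keep going:
fn execute_instruction(&mut self) -> Option<u32> {
    // <snip>
    None
}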
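To see the Option-based loop in isolation, here is a toy sketch that is independent of the real VM. `ToyVM` and its two-opcode instruction set (0 halts, 1 does nothing) are made up for illustration. It uses `loop` with `break` to pull the exit code out, which is one way to avoid a separate `unwrap()` call.

```rust
/// Toy VM, invented for this sketch: opcode 0 = HLT, opcode 1 = NOP,
/// anything else is treated as illegal.
pub struct ToyVM {
    pub program: Vec<u8>,
    pub pc: usize,
}

impl ToyVM {
    pub fn new(program: Vec<u8>) -> ToyVM {
        ToyVM { program, pc: 0 }
    }

    /// Returns None to keep going, or Some(exit_code) once the program is done.
    fn execute_instruction(&mut self) -> Option<u32> {
        if self.pc >= self.program.len() {
            return Some(1); // ran off the end without a HLT
        }
        let opcode = self.program[self.pc];
        self.pc += 1;
        match opcode {
            0 => Some(0), // HLT: finished cleanly
            1 => None,    // NOP: nothing to do, keep executing
            _ => Some(1), // illegal instruction
        }
    }

    /// Runs until execute_instruction yields an exit code and returns it.
    pub fn run(&mut self) -> u32 {
        loop {
            if let Some(code) = self.execute_instruction() {
                break code;
            }
        }
    }
}
```

With this shape, 0 keeps its *nix meaning of success, and None is the only value that means "keep going".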
Run cargo test to make sure we didn’t break anything…
test result: ok. 44 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Yay!
Application ID
For now, let’s just use a new VM per application run. This makes the VM ID the same as the Application ID. We may want to think about building a slightly more abstract form of a program, so we can attach additional information to it.
Update VMEvent
Let’s update VMEvent to have an application_id field:
#[derive(Clone, Debug)]
pub struct VMEvent {
    event: VMEventType,
    at: DateTime<Utc>,
    application_id: Uuid,
}
And then in each of the three places we generate an event, clone it from the VM id. Our run function should look like this now:
/// Wraps execution in a loop so it will continue to run until done or there is an error
/// executing instructions.
pub fn run(&mut self) -> u32 {
    self.events.push(
        VMEvent{
            event: VMEventType::Start,
            at: Utc::now(),
            application_id: self.id.clone()
        }
    );
    // TODO: Should setup custom errors here
    if !self.verify_header() {
        self.events.push(
            VMEvent{
                event: VMEventType::Crash{
                    code: 1
                },
                at: Utc::now(),
                application_id: self.id.clone()
            }
        );
        println!("Header was incorrect");
        return 1;
    }
    // If the header is valid, move the PC to byte offset 64 (the 65th byte).
    self.pc = 64;
    let mut is_done = None;
    while is_done.is_none() {
        is_done = self.execute_instruction();
    }
    self.events.push(
        VMEvent{
            event: VMEventType::GracefulStop{
                code: is_done.unwrap()
            },
            at: Utc::now(),
            application_id: self.id.clone()
        }
    );
    0
}
And…damnit! We’re still returning 1 or 0 from the run function. So our nice collection of events vanishes.
Sigh. OK, let’s change the run function to return a list of our events, and we’ll change the 1 and 0 returns to return our entire Vector of events. The final run function should look like:
pub fn run(&mut self) -> Vec<VMEvent> {
    self.events.push(
        VMEvent{
            event: VMEventType::Start,
            at: Utc::now(),
            application_id: self.id.clone()
        }
    );
    // TODO: Should setup custom errors here
    if !self.verify_header() {
        self.events.push(
            VMEvent{
                event: VMEventType::Crash{
                    code: 1
                },
                at: Utc::now(),
                application_id: self.id.clone()
            }
        );
        println!("Header was incorrect");
        return self.events.clone();
    }
    // If the header is valid, move the PC to byte offset 64 (the 65th byte).
    self.pc = 64;
    let mut is_done = None;
    while is_done.is_none() {
        is_done = self.execute_instruction();
    }
    self.events.push(
        VMEvent{
            event: VMEventType::GracefulStop{
                code: is_done.unwrap()
            },
            at: Utc::now(),
            application_id: self.id.clone()
        }
    );
    self.events.clone()
}
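Note that the listing above records a GracefulStop no matter which code came back, so a program that ran off the end of its bytecode or hit IGL is still logged as graceful. One possible refinement, not taken from the tutorial's repository, is to pick the event type from the exit code. The function name `stop_event_for` is hypothetical, and the enum is redeclared here only so the snippet stands alone.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum VMEventType {
    Start,
    GracefulStop { code: u32 },
    Crash { code: u32 },
}

/// Hypothetical helper: follow the *nix convention that 0 means success
/// and anything else is an error of some sort.
pub fn stop_event_for(code: u32) -> VMEventType {
    if code == 0 {
        VMEventType::GracefulStop { code }
    } else {
        VMEventType::Crash { code }
    }
}
```

Whether a non-zero exit should always count as a crash is a design decision; a program that deliberately exits with an error code is arguably still a graceful stop.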
Run cargo test and:
error[E0308]: mismatched types
--> src/scheduler/mod.rs:21:7
|
20 | pub fn get_thread(&mut self, mut vm: VM) -> thread::JoinHandle<u32> {
| ----------------------- expected `std::thread::JoinHandle<u32>` because of return type
21 | / thread::spawn(move || {
22 | | vm.run()
23 | | })
| |________^ expected u32, found struct `std::vec::Vec`
|
= note: expected type `std::thread::JoinHandle<u32>`
found type `std::thread::JoinHandle<std::vec::Vec<vm::VMEvent>>`
Fine, compiler. Off we go to src/scheduler/mod.rs. Add an import:
use vm::{VM, VMEvent};
And change the signature of get_thread:
/// Takes a VM and runs it in a background thread
pub fn get_thread(&mut self, mut vm: VM) -> thread::JoinHandle<Vec<VMEvent>> {
    thread::spawn(move || {
        vm.run()
    })
}
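On the calling side, the events come back through the join handle. The sketch below shows that round trip with a stand-in worker; `Event`, `spawn_worker`, and `collect_events` are placeholders made up for this example, not the project's real types or functions.

```rust
use std::thread;

/// Placeholder for the tutorial's VMEvent
#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    Start,
    GracefulStop { code: u32 },
}

/// Placeholder for get_thread: runs the work in a background thread and
/// hands the event log back through the JoinHandle.
pub fn spawn_worker() -> thread::JoinHandle<Vec<Event>> {
    thread::spawn(|| {
        // Stand-in for vm.run()
        vec![Event::Start, Event::GracefulStop { code: 0 }]
    })
}

/// Waits for the worker and returns its events, or an empty log if the
/// thread panicked.
pub fn collect_events(handle: thread::JoinHandle<Vec<Event>>) -> Vec<Event> {
    handle.join().unwrap_or_default()
}
```

Because the closure's return value becomes the thread's result, changing what run returns is all it takes to change what join() hands back, which is exactly why the compiler complained about the old JoinHandle<u32> signature.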
cargo test says everything is fine, the compiler isn’t yelling at us…are we done?
Ha. No, of course not! We still aren’t displaying the results to the users.
Hackety Hack
For now, we’re just going to print out the event log when we call run. We have to do this in two places:
When the user runs a program from the CLI, e.g., iridium myfile.iasm
When the user runs a program via the REPL
We’ll format it later so that it looks nicer, but this post is already at 2033 words.
Let’s tackle them in sequence.
CLI
In main.rs, we have this section:
let program = asm.assemble(&program);
match program {
    Ok(p) => {
        vm.add_bytes(p);
        vm.run();
        std::process::exit(0);
    },
    Err(_e) => {
    }
}
Let’s assign the output of run to a variable, and then debug print it:
match program {
    Ok(p) => {
        vm.add_bytes(p);
        let events = vm.run();
        println!("VM Events");
        println!("--------------------------");
        for event in &events {
            println!("{:#?}", event);
        };
        std::process::exit(0);
    },
    Err(_e) => {
    }
}
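As a sketch of the nicer formatting mentioned earlier, each event could be turned into the kind of sentence we set out to produce. Everything here is an assumption made to keep the snippet standalone: the `Event` and `EventType` types, the `String` id in place of a `Uuid`, the `u64` seconds-since-epoch in place of a `DateTime<Utc>`, and the `describe` function itself.

```rust
/// Simplified stand-ins for VMEventType and VMEvent
#[derive(Clone, Debug)]
pub enum EventType {
    Start,
    GracefulStop { code: u32 },
    Crash { code: u32 },
}

#[derive(Clone, Debug)]
pub struct Event {
    pub event: EventType,
    /// Seconds since the Unix epoch (assumption; the tutorial uses DateTime<Utc>)
    pub at: u64,
    /// Stands in for the Uuid used in the tutorial
    pub application_id: String,
}

/// Hypothetical helper: one readable line per event instead of the {:#?} dump.
pub fn describe(e: &Event) -> String {
    match e.event {
        EventType::Start => format!("Application {} was run at {}", e.application_id, e.at),
        EventType::GracefulStop { code } => format!(
            "Application {} terminated at {} with an exit code of {}",
            e.application_id, e.at, code
        ),
        EventType::Crash { code } => format!(
            "Application {} crashed at {} with an exit code of {}",
            e.application_id, e.at, code
        ),
    }
}
```

In the CLI loop, `println!("{}", describe(event))` would then replace the debug print.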
REPL
I’ll leave getting it to display in the REPL to you. You can see mine in GitLab.
End
We’ll end here for this one, though I want to make one observation.
Coding Style
My coding style in Rust is oddly freeform for such a strict language. When writing Rust code, my goal in life becomes to appease the compiler. As long as I can do that, what I code usually works like I think it will.
See you next tutorial!