So You Want to Build a Language VM - Interlude 01

Goes back and adds docs, tests and other improvements

Fixy fixy

You know how I’ve written several times we’ll come back and fix/change stuff later?

Well…​this is later. In the development world, I know its traditional to never go back and fix things, but we’re maverick rebels, so off we go, == The Finnerale Fixes The dedicated and helpful reader Finnerale has pointed out a few things, so let’s take care of those first. Think of it is as a mini-boss fight before we get to the boss fight, Clippy.

AssemblerInstruction Errors

We have two errors here:

#[test]
fn test_parse_instruction_form_one() {
    let result = instruction_one(CompleteStr("load $0 #100\n"));
    assert_eq!(
        result,
        Ok((
            CompleteStr(""),
            AssemblerInstruction {
                label: None,
                opcode: Token::Opcode { code: Opcode::LOAD },
                operand1: Some(Token::Register { reg_num: 0 }),
                operand2: Some(Token::Number { value: 100 }),
                operand3: None
            }
        ))
    );
}

One is that the field label is still in the test. This is because I’m backporting code into the tutorials, and I’ve already added in labels in that version of the Assembler. These will be in one of the next few tutorials. For now, we’ll just remove that line.

The second error is opcode: Token::Opcode { code: Opcode::LOAD },. It should be: opcode: Token::Op { code: Opcode::LOAD },. This is the result of my changing Opcode to Op to help avoid confusion, and I missed changing it when I put the code in the tutorials. Correct version should look like:

#[test]
fn test_parse_instruction_form_one() {
    let result = instruction_one(CompleteStr("load $0 #100\n"));
    assert_eq!(
        result,
        Ok((
            CompleteStr(""),
            AssemblerInstruction {
                opcode: Token::Op { code: Opcode::LOAD },
                operand1: Some(Token::Register { reg_num: 0 }),
                operand2: Some(Token::Number { value: 100 }),
                operand3: None
            }
        ))
    );
}

Vec Reference

In the Getting at the Bits section of tutorial 9, we have this:

for operand in vec![&self.operand1, &self.operand2, &self.operand3] {
    match operand {
        Some(t) => AssemblerInstruction::extract_operand(t, &mut results),
        None => {}
    }
}

This is better written as:

for operand in &[&self.operand1, &self.operand2, &self.operand3] {
   if let Some(token) = operand {
       AssemblerInstruction::extract_operand(token, &mut results)
   }
}

It avoids the allocation of another vector and uses the Rust if let construct to look cleaner.

Program Parsing

In our REPL, we parse the program and use it like this:

_ => {
    let parsed_program = program(CompleteStr(buffer));
    if !parsed_program.is_ok() {
        println!("Unable to parse input");
        continue;
    }
    let (_, result) = parsed_program.unwrap();
    let bytecode = result.to_bytes();
    // TODO: Make a function to let us add bytes to the VM
    for byte in bytecode {
        self.vm.add_byte(byte);
    }
    self.vm.run_once();
}

This can be better written as:

_ => {
  // You can assign the result of a match to a variable
  // Rust can convert types using `Into` and `From`
  let program = match program(buffer.into()) {
      // Rusts pattern matching is pretty powerful an can even be nested
      Ok((_,  program)) => program,
      Err(_) => {
          println!("Unable to parse input");
          continue;
      }
  };
  // The `program` is `pub` anyways so you can just `append` to the `Vec`
  self.vm.program.append(&mut program.to_bytes());
}

This is more concise and efficient.

Thanks again to Finnerale for taking the time to read and send me these. I’ve updated the code in the repo, but I’m going to leave the tutorials as is, maybe add a note to them.

cargo test

Let’s start with a cargo test run on the latest commit in the repo:

warning: unused import: `nom::types::CompleteStr`
 --> src/repl/mod.rs:6:5
  |
6 | use nom::types::CompleteStr;
  |     ^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: #[warn(unused_imports)] on by default

warning: unused import: `Program`
 --> src/repl/mod.rs:9:34
  |
9 | use assembler::program_parsers::{Program, program};
  |                                  ^^^^^^^

warning: method is never used: `parse_hex`
  --> src/repl/mod.rs:87:5
   |
87 |     fn parse_hex(&mut self, i: &str) -> Result<Vec<u8>, ParseIntError>{
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: #[warn(dead_code)] on by default

Not bad! We can remove the unused CompleteStr easily. Same with the second and Program. The third one, the parse_hex function, we may use later. So for now, we’ll add #[allow(dead_code)] to the function:

/// Accepts a hexadecimal string WITHOUT a leading `0x` and returns a Vec of u8
/// Example for a LOAD command: 00 01 03 E8
#[allow(dead_code)]
fn parse_hex(&mut self, i: &str) -> Result<Vec<u8>, ParseIntError>{
    let split = i.split(" ").collect::<Vec<&str>>();
    let mut results: Vec<u8> = vec![];
    for hex_string in split {
        let byte = u8::from_str_radix(&hex_string, 16);
        match byte {
            Ok(result) => {
                results.push(result);
            },
            Err(e) => {
                return Err(e);
            }
        }
    }
    Ok(results)
}

And Finally, Clippy

If you don’t know, Clippy is a handy Rust tool that smugly points out errors you’ve made, best practices you have ignored, and how you’ve failed at everything and should probably just give up.

Ahem.

Run Clippy, patient wait, and…​.crap:

warning: redundant field names in struct initialization
  --> src/instruction.rs:61:13
   |
61 |             opcode: opcode
   |             ^^^^^^^^^^^^^^ help: replace it with: `opcode`
   |
   = note: #[warn(redundant_field_names)] on by default
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#redundant_field_names

warning: unneeded return statement
   --> src/vm.rs:187:9
    |
187 |         return opcode;
    |         ^^^^^^^^^^^^^^ help: remove `return` as shown: `opcode`
    |
    = note: #[warn(needless_return)] on by default
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#needless_return

warning: unneeded return statement
   --> src/vm.rs:194:9
    |
194 |         return result;
    |         ^^^^^^^^^^^^^^ help: remove `return` as shown: `result`
    |
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#needless_return

warning: unneeded return statement
   --> src/vm.rs:201:9
    |
201 |         return result;
    |         ^^^^^^^^^^^^^^ help: remove `return` as shown: `result`
    |
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#needless_return

warning: unused import: `super::*`
  --> src/assembler/opcode_parsers.rs:13:9
   |
13 |     use super::*;
   |         ^^^^^^^^
   |
   = note: #[warn(unused_imports)] on by default

warning: unused import: `super::*`
  --> src/assembler/operand_parsers.rs:21:9
   |
21 |     use super::*;
   |         ^^^^^^^^

warning: unused import: `super::*`
  --> src/assembler/register_parsers.rs:22:9
   |
22 |     use super::*;
   |         ^^^^^^^^

warning: unneeded return statement
  --> src/assembler/instruction_parsers.rs:38:9
   |
38 |         return results;
   |         ^^^^^^^^^^^^^^^ help: remove `return` as shown: `results`
   |
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#needless_return

warning: you should consider deriving a `Default` implementation for `vm::VM`
  --> src/vm.rs:19:5
   |
19 | /     pub fn new() -> VM {
20 | |         VM {
21 | |             registers: [0; 32],
22 | |             program: vec![],
...  |
26 | |         }
27 | |     }
   | |_____^
   |
   = note: #[warn(new_without_default_derive)] on by default
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#new_without_default_derive
help: try this
   |
4  | #[derive(Default)]
   |

warning: casting u16 to u32 may become silently lossy if types change
  --> src/vm.rs:62:30
   |
62 |                 let number = self.next_16_bits() as u32;
   |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `u32::from(self.next_16_bits())`
   |
   = note: #[warn(cast_lossless)] on by default
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#cast_lossless

warning: casting u8 to u16 may become silently lossy if types change
   --> src/vm.rs:199:23
    |
199 |         let result = ((self.program[self.pc] as u16) << 8) | self.program[self.pc + 1] as u16;
    |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `u16::from(self.program[self.pc])`
    |
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#cast_lossless

warning: casting u8 to u16 may become silently lossy if types change
   --> src/vm.rs:199:62
    |
199 |         let result = ((self.program[self.pc] as u16) << 8) | self.program[self.pc + 1] as u16;
    |                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `u16::from(self.program[self.pc + 1])`
    |
    = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#cast_lossless

warning: you should consider adding a `Default` implementation for `repl::REPL`
  --> src/repl/mod.rs:17:5
   |
17 | /     pub fn new() -> REPL {
18 | |         REPL {
19 | |             vm: VM::new(),
20 | |             command_buffer: vec![]
21 | |         }
22 | |     }
   | |_____^
   |
   = note: #[warn(new_without_default)] on by default
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#new_without_default
help: try this
   |
15 | impl Default for repl::REPL {
16 |     fn default() -> Self {
17 |         Self::new()
18 |     }
19 | }
   |

warning: single-character string constant used as pattern
  --> src/repl/mod.rs:87:29
   |
87 |         let split = i.split(" ").collect::<Vec<&str>>();
   |                             ^^^ help: try using a char instead: `' '`
   |
   = note: #[warn(single_char_pattern)] on by default
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#single_char_pattern

    Finished dev [unoptimized + debuginfo] target(s) in 4.32s

Well, I’ve seen worse. These are simple fixes, so I’m not going to go through each one. After fixing them all, it still complains about unused imports in some of the test modules:

mod tests {
    use super::register;
    use nom::types::CompleteStr;

    #[test]
    fn test_parse_register() {
        let result = register(CompleteStr("$0"));
        assert_eq!(result.is_ok(), true);
        let result = register(CompleteStr("0"));
        assert_eq!(result.is_ok(), false);
        let result = register(CompleteStr("$a"));
        assert_eq!(result.is_ok(), false);
    }
}

Not sure why. Adding #![allow(unused_imports)] like this:

mod tests {
    #![allow(unused_imports)]
    use super::register;
    use nom::types::CompleteStr;
    <snip>
}

And thus, Clippy is appeased and cargo test runs fine! You can check out the updated code in the repo.


If you need some assistance with any of the topics in the tutorials, or just devops and application development in general, we offer consulting services. Check it out over here or click Services along the top.