So You Want to Build a Language VM - Part 18b - REPL Command Parsing

Improves command processing in the REPL

Intro

Hello! In this part, we’re going to factor out the command parsing done in the REPL for user input. Right now, there isn’t much flexibility; a user cannot type:

.load_file /path/to/file

They must type:

.load_file\r
Please enter file path: /path/to/file\r

Commands and Such

There is one slightly annoying change we have to make first. As you may have noticed, our assembly language already uses the . character, so REPL commands can no longer start with it. For now, we’ll use ! as the command prefix.

Let’s make this a submodule of the repl module. Go ahead and make src/repl/command_parser.

Step 1: Command Parser Struct

This wasn’t too bad. All we need is a function to tokenize the input:

pub struct CommandParser {}

impl CommandParser {
    pub fn tokenize(input: &str) -> Vec<&str> {
        let split = input.split_whitespace();
        let vec: Vec<&str> = split.collect();
        vec
    }
}

There isn’t a need to create an instance of the struct here. All we need is to split the user input string on its whitespace, and return the tokens.
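To see what the tokenizer hands back, here is a small standalone sketch. The body of tokenize is condensed into a single expression, which should behave the same as the version above; the sample inputs are made up for illustration.

```rust
pub struct CommandParser {}

impl CommandParser {
    /// Splits the input on runs of whitespace and returns the tokens.
    pub fn tokenize(input: &str) -> Vec<&str> {
        input.split_whitespace().collect()
    }
}

fn main() {
    // `read_line` keeps the trailing newline; `split_whitespace` discards it.
    let tokens = CommandParser::tokenize("!load_file /path/to/file\n");
    assert_eq!(tokens, vec!["!load_file", "/path/to/file"]);

    // Input that is only whitespace produces no tokens at all.
    assert!(CommandParser::tokenize("  \n").is_empty());

    println!("{:?}", tokens);
}
```

One consequence of this simple approach is that a path containing spaces is split into several tokens, so quoting would need extra handling.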

Don’t forget, you’ll need to add pub mod command_parser; to src/repl/mod.rs.

Step 2: Break Out Functions

You know how we have that large match block that checks for ".whatever"? Each one of those options should be its own function, like this:

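// `quit` takes no arguments, so the slice is unused; the leading underscore
// silences the compiler's unused-variable warning.
fn quit(&mut self, _args: &[&str]) {
    println!("Farewell! Have a great day!");
    std::process::exit(0);
}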

I’m not going to put every one of those functions in this tutorial, since it is a simple copy-paste job.

Step 3: Run Function Changes

What’s more interesting is how we have to change the parsing logic. Here’s the new run function in src/repl/mod.rs:

pub fn run(&mut self) {
    println!("Welcome to Iridium! Let's be productive!");
    loop {
        // This allocates a new String in which to store whatever the user types each iteration.
        // TODO: Figure out how to allocate this outside of the loop and re-use it every iteration
        let mut buffer = String::new();

        // Blocking call until the user types in a command
        let stdin = io::stdin();

        // Annoyingly, `print!` does not automatically flush stdout like `println!` does, so we
        // have to do that here for the user to see our `>>> ` prompt.
        print!(">>> ");
        io::stdout().flush().expect("Unable to flush stdout");

        // Here we'll look at the string the user gave us.
        stdin
            .read_line(&mut buffer)
            .expect("Unable to read line from user");

        let historical_copy = buffer.clone();
        self.command_buffer.push(historical_copy);

        if buffer.starts_with("!") {
            self.execute_command(&buffer);
        } else {
            let program = match program(CompleteStr(&buffer)) {
                Ok((_remainder, program)) => {
                    program
                },
                Err(e) => {
                    println!("Unable to parse input: {:?}", e);
                    continue;
                }
            };
            self.vm.program.append(&mut program.to_bytes(&self.asm.symbols));
            self.vm.run_once();
        }
    }
}

And here is the new function execute_command:

fn execute_command(&mut self, input: &str) {
    let args = CommandParser::tokenize(input);
    match args[0] {
        "!quit" => self.quit(&args[1..]),
        "!history" => self.history(&args[1..]),
        "!program" => self.program(&args[1..]),
        "!clear_program" => self.clear_program(&args[1..]),
        "!clear_registers" => self.clear_registers(&args[1..]),
        "!registers" => self.registers(&args[1..]),
        "!symbols" => self.symbols(&args[1..]),
        "!load_file" => self.load_file(&args[1..]),
        "!spawn" => self.spawn(&args[1..]),
        _ => { println!("Invalid command") }
    };
}

Note how we strip the command from the slice we pass to each individual function, so they get the arguments and not a copy of the command.
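As a rough sketch of how a handler can use that argument slice, the standalone example below separates the command from its arguments and lets !load_file fall back to prompting when no path was given. The dispatch function and the strings it returns are hypothetical stand-ins for the real REPL methods, and split_first is used here instead of args[0] so that empty input cannot cause a panic.

```rust
/// Same whitespace tokenizer as `CommandParser::tokenize`.
fn tokenize(input: &str) -> Vec<&str> {
    input.split_whitespace().collect()
}

/// Hypothetical dispatcher: returns a description of what the REPL would do.
fn dispatch(input: &str) -> String {
    let tokens = tokenize(input);

    // `split_first` separates the command from its arguments and returns
    // `None` for empty input, where indexing with `[0]` would panic.
    let (command, args) = match tokens.split_first() {
        Some((command, args)) => (*command, args),
        None => return String::from("empty input"),
    };

    match command {
        // Use the path if the user supplied one, otherwise ask for it.
        "!load_file" => match args.first() {
            Some(path) => format!("load {}", path),
            None => String::from("prompt for a path"),
        },
        "!quit" => String::from("quit"),
        _ => String::from("Invalid command"),
    }
}

fn main() {
    println!("{}", dispatch("!load_file /path/to/file\n"));
    println!("{}", dispatch("!load_file\n"));
    println!("{}", dispatch("!nope\n"));
}
```

With this shape, both the one-line form and the older two-step prompt can be supported by the same handler.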

UTF-8 Annoyances

Because all strings in Rust are UTF-8, and a character can take up more than one byte, you can’t check the first character how you might expect, with something like: &buffer[0]. Rust simply won’t let you index a String with an integer. I was getting annoyed trying to figure out how to check whether the first character is "!" without making a ton of clones.

Happily, I found the starts_with function! There are lots of useful convenience functions like that lurking about, it seems.
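Here is a minimal sketch of the prefix check, next to chars().next() as another way to peek at the first character without cloning. The helper names is_command and first_char are made up for this example.

```rust
/// Hypothetical helper: true if the input looks like a REPL command.
fn is_command(input: &str) -> bool {
    // `starts_with` accepts a `char` or a `&str` and copies nothing.
    input.starts_with('!')
}

/// Hypothetical helper: the first Unicode scalar value of the input, if any.
fn first_char(input: &str) -> Option<char> {
    input.chars().next()
}

fn main() {
    let buffer = String::from("!quit\n");

    // `buffer[0]` would not compile: a `String` cannot be indexed by an integer.
    assert!(is_command(&buffer));
    assert!(!is_command("load $0 #10\n"));
    assert_eq!(first_char(&buffer), Some('!'));

    // A character is not always one byte: U+00E9 takes two bytes in UTF-8,
    // which is why byte offsets and characters do not line up in general.
    let accented = "\u{e9}lan";
    assert_eq!(accented.len(), 5);
    assert_eq!(first_char(accented), Some('\u{e9}'));

    println!("all checks passed");
}
```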

End

That’s it for this article! These changes will be in 0.0.18. I’m trying to figure out a good way to sync versions with tutorials. Code is available in GitLab!

