So You Want to Build a Language VM - Part 22 - Parallelism: Part 1

Detects the number of logical cores on the host system and adds a CLI flag for setting the thread count

Intro

Hello! In this post, we’re going to start adding parallelism to the Iridium VM.

Important
If you are uncertain about the differences between parallelism and concurrency, please read this article first: https://blog.subnetzero.io/article/concurrency-vs-parallelism/

The Problem

The Iridium VM uses a single OS thread, so it can only ever keep one core busy. CPython, stock Ruby, and similar language VMs end up with the same limitation, even though they let you create threads. They use something called a GIL, or Global Interpreter Lock; understanding how those work is important, so let’s start there.

GILs and OS Threads

If we create a thread in Python like so:

import threading

def some_func():
    pass

# Python 3: the old thread module and raw_input no longer exist
threading.Thread(target=some_func).start()
input("Type something to quit.")

We will get an actual OS thread. It’s logical to assume that if we created more threads, we could do more work, right? I’m afraid that is not true, though. =(

Even if we have multiple threads, there still has to be something executing the bytecode. In Python, that is the Python VM. The GIL is a mutex (or lock) that allows only one thread to use the Python VM at a time. This was put in for a good reason: it helps prevent race conditions in the underlying Python VM. So even if you have 10 OS threads, only one of them can execute Python bytecode at any given moment, and you effectively use one core.
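To make the effect of a global lock concrete, here is a minimal sketch in Rust. It is not how CPython is implemented; the Interpreter struct and run_with_global_lock function are hypothetical names made up for illustration. Ten real OS threads are spawned, but each one has to take the same mutex before it can do a unit of "interpreter" work, so the work is serialized.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for an interpreter whose state is not thread-safe.
struct Interpreter {
    instructions_executed: u64,
}

/// Spawns `threads` OS threads that each execute `steps` pretend instructions.
/// Every step must take the single global lock first, so the steps are
/// serialized even though the threads themselves are real OS threads.
fn run_with_global_lock(threads: usize, steps: u64) -> u64 {
    let gil = Arc::new(Mutex::new(Interpreter { instructions_executed: 0 }));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let gil = Arc::clone(&gil);
            thread::spawn(move || {
                for _ in 0..steps {
                    // Only the thread holding the lock may touch the interpreter.
                    let mut interp = gil.lock().unwrap();
                    interp.instructions_executed += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the value first so the lock guard is released before `gil` is dropped.
    let total = gil.lock().unwrap().instructions_executed;
    total
}

fn main() {
    let total = run_with_global_lock(10, 1_000);
    println!("instructions executed: {}", total);
}
```

The count always comes out right because the lock prevents races, which is exactly the benefit the GIL provides. The cost is that adding more threads does not let more of this work happen at the same time.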

BEAM

The BEAM VM (or Erlang VM), which is the major inspiration for Iridium, solves the data race problem in a different way that allows for parallelism. That paradigm is called the Actor model.

When writing code in Erlang or Elixir, you create processes. These are not OS processes, but BEAM processes. Other terms for them include green threads, micro threads, lightweight threads, co-routines, and fibers. All of those have different shades of meaning, but refer roughly to the same thing, which is non-OS threads managed by the language VM.

Actors

In the BEAM, each process is isolated from all other processes. Think of them as independent entities walking around an electronic landscape. Here’s the important part: Actors can only communicate with each other via message passing.

If we have process A and B, A cannot reach into B and do something like B.count = 10. It has to send a message to B telling B to change its count to 10. The result of this is that the only thing that can modify an Actor’s internal state is that Actor, thus preventing race conditions. Another effect of this is that an Actor can run on any core and be moved about without issues, even across networks.

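The downside is that message passing is slower than sharing memory directly, since data generally has to be copied between processes instead of being accessed in place.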
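The process A and process B example can be sketched in Rust using standard-library channels. This is only an illustration of the message-passing idea, not how the BEAM or Iridium implements processes: it uses one OS thread per actor as a stand-in for a lightweight process, and the Message enum, spawn_counter, and set_and_read are hypothetical names.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Messages the counter actor understands (hypothetical, for illustration).
enum Message {
    SetCount(i64),
    GetCount(Sender<i64>),
    Stop,
}

/// Spawns actor B, which owns `count`. No other thread can touch `count`
/// directly; the only way to change or read it is to send B a message.
fn spawn_counter() -> (Sender<Message>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut count: i64 = 0;
        for message in rx {
            match message {
                Message::SetCount(value) => count = value,
                Message::GetCount(reply_to) => {
                    // Ignore the error if the requester has gone away.
                    let _ = reply_to.send(count);
                }
                Message::Stop => break,
            }
        }
    });
    (tx, handle)
}

/// Plays the role of process A: asks B to set its count, then asks for it back.
fn set_and_read(value: i64) -> i64 {
    let (b, handle) = spawn_counter();
    b.send(Message::SetCount(value)).unwrap();

    let (reply_tx, reply_rx) = mpsc::channel();
    b.send(Message::GetCount(reply_tx)).unwrap();
    let seen = reply_rx.recv().unwrap();

    b.send(Message::Stop).unwrap();
    handle.join().unwrap();
    seen
}

fn main() {
    println!("B's count is now {}", set_and_read(10));
}
```

Note that A never holds a reference to count. Because B handles its mailbox one message at a time, there is no lock around count and no way for two threads to modify it at once.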

Iridium

In Iridium, we’re going to follow the BEAM model. It’s a big project, so we won’t do it all in this tutorial. We’ll start laying the foundation.

Step 1: Core Awareness

Let’s start by making Iridium aware of how many cores the host machine has. There’s a handy-dandy crate for this: https://github.com/seanmonstar/num_cpus. Go ahead and add num_cpus = "1.0" to the dependencies in Cargo.toml, and extern crate num_cpus; to src/bin/iridium.rs and src/lib.rs.

For now, we’ll add a field to the VM that holds how many cores it detects. In src/vm.rs, add this field to the VM struct:

pub logical_cores: usize,

and in the new function of the VM impl, add this to the creation of the VM struct:

logical_cores: num_cpus::get(),
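If you want to try the idea in isolation, here is a trimmed-down sketch that compiles without any external crates. It makes two assumptions: the VM struct below is a hypothetical stand-in with only the new field, and detect_logical_cores uses the standard library’s std::thread::available_parallelism in place of num_cpus::get(). The two may not report the same number in every environment, so treat this as an approximation of what the tutorial code does.

```rust
use std::thread;

/// Trimmed-down, hypothetical VM with only the field discussed here.
pub struct VM {
    pub logical_cores: usize,
}

/// Standard-library stand-in for `num_cpus::get()`.
/// Falls back to 1 if the platform cannot report a value.
fn detect_logical_cores() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

impl VM {
    pub fn new() -> VM {
        VM {
            logical_cores: detect_logical_cores(),
        }
    }
}

fn main() {
    let vm = VM::new();
    println!("detected {} logical cores", vm.logical_cores);
}
```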

Step 2: Adding a CLI Flag

While we’re here, let’s go ahead and give the user a way to set the number of threads the VM can use. Go to src/bin/cli.yml and add this parameter:

- THREADS:
    help: Number of OS threads the VM will utilize
    required: false
    takes_value: true
    long: threads
    short: t

And in src/bin/iridium.rs, add this before we check the target_file:

let num_threads = match matches.value_of("THREADS") {
    Some(number) => {
        match number.parse::<usize>() {
            Ok(v) => { v }
            Err(_e) => {
                println!("Invalid argument for number of threads: {}. Using default.", number);
                num_cpus::get()
            }
        }
    }
    None => {
        num_cpus::get()
    }
};
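The same fallback logic can be pulled into a small function, which makes it easy to check without going through the CLI parser. This is a sketch rather than part of Iridium: resolve_thread_count is a hypothetical name, and the default is passed in as a parameter instead of calling num_cpus::get() so that the example has no external dependencies.

```rust
/// Hypothetical helper: resolves the thread count from an optional CLI value,
/// falling back to `default` when the flag is missing or not a valid usize.
fn resolve_thread_count(arg: Option<&str>, default: usize) -> usize {
    match arg {
        Some(number) => match number.parse::<usize>() {
            Ok(v) => v,
            Err(_) => {
                eprintln!(
                    "Invalid argument for number of threads: {}. Using default.",
                    number
                );
                default
            }
        },
        None => default,
    }
}

fn main() {
    // 8 stands in for whatever the core detection reported.
    println!("{}", resolve_thread_count(Some("4"), 8));
    println!("{}", resolve_thread_count(Some("lots"), 8));
    println!("{}", resolve_thread_count(None, 8));
}
```

One thing this makes visible: "0" parses successfully as a usize, so a user can currently ask for zero threads. Whether to reject that or treat it as "use the default" is a decision for later, once the threads are actually used.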

The last thing we need to do is override the detected core count with the user’s value when we run a file (the REPL keeps the detected value):

match target_file {
    Some(filename) => {
        let program = read_file(filename);
        let mut asm = Assembler::new();
        let mut vm = VM::new();
        vm.logical_cores = num_threads;
        // <snip>

Now if we compile and run iridium --help, we see:

Interpreter for the Iridium language

USAGE:
    iridium [OPTIONS] [INPUT_FILE]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -t, --threads <THREADS>    Number of OS threads the VM will utilize

ARGS:
    <INPUT_FILE>    Path to the .iasm or .ir file to run

Yay!

End

Of course, this doesn’t mean that we can use those threads yet. That’s a bit more complex, and something we’ll start on in the next tutorial. See you then!

