So You Want to Build a Language VM - Part 11 - Memory
Adds in heap memory to our VM
Intro
Right now, we have one place to store data: registers. These are great, but we only have 32 of them, and they only store i32s. What if we want to store, say, strings? Or what if a user wants to create and store a large datastructure? We need more than registers. == Primer on Computer Memory Long ago, in a time when people wore parachute pants and dresses with shoulderpads, computer memory was a series of slots and any code could write to any slot. You could draw by writing a value directly to the memory slots that controlled what was displayed on screen, for example. But time passed, computers became more complicated, got more memory, and began running more than one program at a time. Having your program crash because some unrelated program wrote the wrong value to a memory slot wasn’t fun.
Nowadays, memory is much more complicated from the OS and CPU perspective. But to programmers, it still seems the same: their program has sole access to memory.
The OS
When a user starts an application and the OS begins executing it, it performs all sorts of trickery behind the scenes so that the running application can behave as if the following two statements are true:
The application has sole access to the memory
The application has unlimited memory
That program can use the system calls malloc
, free
, etc to get more memory and return it. In reality, the OS coordinates all the memory allocation to keep applications from interfering with each other. This memory available for use via malloc
is called…
The Heap
The heap is used to store things whose space requirements are not known at compile-time. If we declare an array of f64s
, with a size of 100, then the compiler knows exactly how much memory the application will need to request for it, and it does so at compile-time. But what if we want an array that can grow indefinitely? Then we would need to use the heap to request memory every time we needed to store more f64s
.
Imagine a system that has 4GB of RAM available. What happens when all the applications running on it request more than 4GB? This is often when a system begins that dreaded activity known as swapping. This means that it begins to use local disk as extra RAM.
Note | It is growing more common to disable swapping on servers. If a system can’t swap and has on more memory, the OS will usually pick a process and terminate it. |
Memory Leaks and Garbage Collection
If you’ve never worked in a language that requires explicit memory management (C, C++ are the most common), this may seem odd to you. We don’t even have to use malloc
or free
in Rust. Let’s look at this C snippet from Wikipedia:
#include <stdlib.h>
void function_which_allocates(void) {
/* allocate an array of 45 floats */
float *a = malloc(sizeof(float) * 45);
/* additional code making use of 'a' */
/* return to main, having forgotten to free the memory we malloc'd */
}
int main(void) {
function_which_allocates();
/* the pointer 'a' no longer exists, and therefore cannot be freed,
but the memory is still allocated. a leak has occurred. */
}
See how it mallocs
but never frees
the memory? That memory, from the perspective of the OS, still belongs to the C application; but the application has no way to get to it once a
goes out of scope.
Most higher level languages deal with this by garbage collection. When you run a program in Python, Ruby, Java, Go, or any other language with a GC, it keeps track of all data structures created, and reclaims the memory once it detects that it is no longer used. This frees humans from having to do memory bookkeeping (and we are REALLY bad at it), but the trade-off is that it has computational overhead and, perhaps more critically, causes pauses in your application of non-deterministic duration.
There are several common types of garbage collectors: reference counting, mark-and-sweep, generational, and tracing are some of the most common ones. For now, I’m going to cover reference counting and mark-and-sweep.
Reference Counting
In this model of garbage collection, the language keeps track of how many things reference something. When that count reaches 0, it knows that the thing referenced is no longer in use and it can destroyed.
Check out this Python code:
def leak():
class A:
def __init__(self):
self.b = None
class B:
def __init__(self):
self.a = None
a = A()
b = B()
a.b = b
b.a = a
leak()
a
has a reference to b
and b
has one to a
. And because we don’t return either a
or b
from the function leak
, once that function returns, we have no way to get to those class instances. But here’s the important part: they keep each other alive. Their reference counts will never go to zero. If all we did was rely on reference counting, this would be a memory leak.On the plus side, reference counting is fast, since it requires incrementing or decrementing a number.
Mark and Sweep
This is simplest garbage collector type that can deal with cycle dependencies. Imagine that every program has a root node, and as the application runs and creates new data structures, they become part of a tree, with that root node at the bottom. The mark and sweep algorithm can start at the root node, traverse everything attached to it, and set a bit on each thing it can reach. This is called marking.
But, this doesn’t help us with a cyclic dependency that has no connection to the root node. The solution to this is that our VM keeps its own references to everything it creates. The user/developer is unaware of this and doesn’t have access to the VMs global list of data structures. Once the GC has marked everything it can reach, the VM can go through its global list and delete everything that isn’t marked.
Of course, these types of GC have a downside, which is pauses. If you’ve ever worked in Java, you’ve probably experienced the infuriating pauses where your entire program seems to halt for some number of ms. This is often due to the GC doing its thing. Its difficult to get away from the need to pause execution to do a GC sweep, but you can optimize for pauses of low duration, or of amount of garbage collected.
Note | Java and Go are examples of languages that use more advanced garbage collectors. The JVM GC is quite advanced and can operate in multiple modes depending on your application’s needs. The Go GC is much simpler, and focuses on low latency. |
Allocation in Iridium
Before we can write a garbage collector for Iridium, we need to allow for the allocation and freeing of memory. Right now, our VM is pretending to be a CPU; now we need it to also pretend to be memory.
Memory in Assembly
Most assembly languages provide an instruction to request memory from the OS and to free it; often, brk
or something similar. We can do something similar.
Representing Memory
How do we represent memory in our VM? Why, with Vec<u8>
of course! Then requesting more memory becomes a simple matter of extending the vector.
Adding a Heap to Our VM
Right now, our VM looks like this:
pub struct VM {
pub registers: [i32; 32],
pc: usize,
pub program: Vec<u8>,
remainder: usize,
equal_flag: bool,
}
To add a heap, we can just do:
pub struct VM {
pub registers: [i32; 32],
pc: usize,
pub program: Vec<u8>,
heap: Vec<u8>,
remainder: usize,
equal_flag: bool,
}
And then in our impl
block:
pub fn new() -> VM {
VM {
registers: [0; 32],
program: vec![],
heap: vec![],
pc: 0,
remainder: 0,
equal_flag: false,
}
}
We now have a heap. =)
Allocation Opcode
For the last thing in this tutorial, let’s add an Opcode, called ALOC
, that extends the size of the heap vector by the amount in the register given as an argument. I’m sure you remember how to add in new Opcodes by this point, so I shan’t repeat it here. =)
The handler code for the ALOC
instruction is:
Opcode::ALOC => {
let register = self.next_8_bits() as usize;
let bytes = self.registers[register];
let new_end = self.heap.len() as i32 + bytes;
self.heap.resize(new_end as usize, 0);
}
Testing, testing…
#[test]
fn test_aloc_opcode() {
let mut test_vm = get_test_vm();
test_vm.registers[0] = 1024;
test_vm.program = vec![17, 0, 0, 0];
test_vm.run_once();
assert_eq!(test_vm.heap.len(), 1024);
}
End
Yay, our VM can now allocate memory! Don’t worry if this is confusing to you. Modern memory management is very complicated, and it takes a bit of practice to internalize the concepts. In our next part, we’ll add strings and printing to our assembler.
If you need some assistance with any of the topics in the tutorials, or just devops and application development in general, we offer consulting services. Check it out over here or click Services along the top.