Hello! This article gives you an overview of the major database types, a summary of how they work, and when and why you should use one.
This is not a short entry. Sorry. =)
You may have heard this term whispered amongst your colleagues. While it is an actual theorem, and an important one, people often take different meanings away after hearing about it. So let’s go through it.
C stands for
A stands for
P stands for
I’m going to steal from Wikipedia:
In theoretical computer science, the CAP theorem, also named Brewer’s theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:
Consistency: Every read receives the most recent write or an error
Availability: Every request receives a response that is not an error
Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
This doesn’t connect with the practical reality a lot of us deal with, though, so let’s discuss the three aspects through the lens of an infrastructure engineer building a system for production.
Pretend you have a distributed database. It doesn’t matter which one, but it runs on more than one discrete server. You have clients that talk to this distributed database and ask it questions, such as "What is the name of the user with the ID of 1?"
Or, in SQL-ish:
SELECT first_name, last_name FROM customers WHERE id = 1;
Let’s say that two clients issue this same query. Query 1 hits Server 1, Query 2 hits Server 2. If a system is consistent, then both clients will receive "Perceval Grimwhisker" back as the answer.
Now let’s say Perceval decided to change his name and updated his profile. Two clients again issue the same query. If a system is not consistent, one client would "Perceval Grimwhisker" while the other client might receive his new name "Xochipilli Deathpaw".
Traditional SQL databases are the canonical examples of consistent databases: Postgres, MySQL, MariaDB.
You still have your pretend database, but it now consists of 5 servers. If a system is available, then if Node 5 dies, Nodes 1-4 are still able to respond to queries. If it is not available then one node being down means the other servers cannot respond to queries.
MongoDB, Riak, RethinkDB, and most "NoSQL" databases are considered available.
This one is important: CAP Theorem only applies to distributed data stores. This means that they exist on a network. If that network fails, in whole or in part, you have what is known as a network partition. You may also hear the term split-brain used to refer to similar situations.
Let’s say you still have that 5 node cluster. The network fails in such a way that Nodes 1 and 2 can talk to each other, and nodes 3, 4, and 5 can talk to each other, but the two groups cannot talk to each other.
All distributed datastores have to be able to deal with this.
The "Pick Two" Fallacy
You’ll often hear "CAP, pick two". This is because the P is required, and general wisdom says you can have a CP system or an AP system, but not a CAP system. This is both true and not true. When you are choosing C or A, what you are choosing is which one you get when the network partitions. When the network is working fine, you get all three. This is the most frequent misunderstanding I hear from people about CAP.
I wasn’t sure how to explain this part, so I am going to just go through some common database configurations and explain how they deal with CAP.
Example #1: Postgres with a Primary and a Replica
Let’s say that in this setup, you have one Postgres node that can accept writes (this Primary node), and then you have one Replica node that mirrors all data on the Primary node. Where does this fall in terms of CAP? Let’s look at failure scenarios:
If no clients were using the replica, then there is no service interruption. However, you no longer have an up-to-date backup. Also note that you no longer have a distributed datastore. So you now have a full CAP system. =)
Let’s change it up a bit and say that clients were only writing to master, but they could read from either server. The Replica dies, and reads start going to the master. What happens if the Replica comes back up after, say, an hour? It will not have the same data as the
|For you DB peeps reading this, yes, it can resync, catch up, all that. We’re keeping this simple for now. =)|
If clients starting reading from the Replica again, the two servers will not be consistent.
Remember how we said that CAP only applies during a network partition, and this wasn’t really a network parition issue? Rather, a server failed, and that led to inconsistent data.
Let’s say the Replica didn’t fail, but the network between them did. But clients can still query either the Primary or the Replica. The replica is not getting updates from the primary, so the data is no long consistent. But it is still available. So our CAP system is now an AP system.
We could configure the cluster so that if the Primary cannot communicate with the Replica both stop responding to queries. In this case, we have built ourselves a CP system.
Example #2: MongoDB Replica Set
|MongoDB gets a lot of crap, some of which I agree with. That being said, it is a very popular option and you will have to deal with it. =)|
In a MongoDB Replica Set, you have 1 Primary and N Replicas. By default, the client will send all reads and writes to the Primary. If the Primary fails, one of the Replicas is promoted to Primary, and life goes on.
There’s some nuances to this that are worth talking about, though. Some are specific to Mongo, but some aren’t.
One Write, All Read
A MongoDB client can be configured to write to the Primary, but to read from the Replicas. What if Replica 3 is lagging for some reason? Perhaps the hypervisor it is running on is oversubscribed or some such.
If a client reads from that node, it is possible to read a value that has been changed on the primary, but hasn’t replicated out to all the replicas yet. The usual term for this is a "stale read".
|MongoDB Replicas will keep track of how far behind the Primary they are. You can configure your client to only read from clients that < N seconds behind, if desired.|
Another term you may have heard is "quorum", which means: > the minimum number of members of an assembly or society that must be present at any of its meetings to make the proceedings of that meeting valid.
A data storage system with more than one member (i.e., a cluster) usually has a concept of quorum. If you have a 5 node cluster, and 2 are down, then the cluster still has quorum, because a majority of members are present. If 3 nodes are down, we don’t have quorum. In this situation, a DB could choose to disallow all writes, flip to read only, or other variations.
|This is why database clusters often consist of odd numbers. If we have a 4 node cluster, and two fail, then we have a split-brain situation.|
Why Does Quorum Matter?
To make decisions. Let’s say we have a 5 node cluster, and that we are using a version where any cluster member can accept a write request. In a typical mode, member 1 would receive the write, replicate it out to all 4 other nodes, and receive back acknowledgements that the data was written to disk. If we have quorum, then we only need 3 nodes to agree on what the data is and that it has been written.
|Many systems allow you to specify the quorum required. So you can say something like "Write this data, and don’t know tell me you’re done until ALL nodes have acknowledged it, because it’s super important kitten pictures"|
No, not the drug.
Definitely not the drug.
ACID is an acronym:
A is for Atomicity C is for Consistency I is for Isolation D is for Durability
Oh God, More Explanations
Atomicity means that an action either works, or it doesn’t. If I am updating an employee record, and my query is updating 48 fields, the query should either update all 48, or none.
Consistency means that the database enforces its rules properly.
Isolation means that multiple transactions can happen concurrently and not effect each other.
Durability means that once a database says data is written to durable storage, it written.
A database has varying degrees of ACID-compliance. Postgres is an example of a full ACID-compliant database. NoSQL databases tend to be less ACID-compliant, but this can vary.
Other Random Thoughts
SQL DBs are proven technology and should be your default. Most of them now support accepting JSON strings and querying against the fields, for example.
SQL DBs are hard to build and manage at large scale. Horizontal scaling is often difficult.
NoSQL DBs are much easier to cluster and scale horizontally.
Even if you think you don’t need a schema, you need a schema. Trust me.
That’s all for this one! The one topic we didn’t discuss is the rise of this new type of database called NewSQL. I’ll cover that in a future article.
If you need some assistance with any of the topics in the tutorials, or just devops and application development in general, we offer consulting services. Check it out over here or click Services along the top.