Algebraic Types
I’m a math nerd. I’m also a language nerd - both for programming languages and for natural languages.
I’ve come across one thing that seems to consistently sit at the intersection of all three of those things: its name comes from math, but it’s used in the context of programming languages. And, I think the name doesn’t really communicate what it is very well (unless you did some undergrad or grad-level stuff). This thing is therefore (in my experience) a little contentious: some people are big fans, and some are pretty against it for some reason. And I contend that that reason is that they haven’t actually looked into it deeply enough to understand it, because the name is a turn-off.
But it doesn’t have to be that way! This thing isn’t really that hard to understand. This post will be my attempt to explain it! So, without further ado, what am I talking about? Algebraic data types.
Algebra
I’m sure everyone reading this remembers algebra from middle school and high
school. But “algebraic data types” comes up in the context of type systems of
different programming languages. So what on earth is the connection between
those things - what is “in common” between solving 3x = 18
and designing a
data type?
Sums and products. That’s it. They go by other names as well (which we’ll see very soon), but since those are the math-y names, I want to explain where those names come from. So let’s see what they are in a type.
Products
Every mainstream programming language has support for product types, but they’re not called that, usually. Instead, we call them “tuples”, “structs”, even “classes” (if we ignore the methods and just focus on the fields).
The general idea is that we take two (or more) types, a
and b
, and combine
them into a new type that has a * b
possible values. In practice, this means
that the new type has to have both an a
and a b
.
Here are some example product types (and the mathematical “product” that they correspond to) in a few different languages:
C++ | Rust | Haskell | Math |
---|---|---|---|
std::pair<int, bool>
|
(i32, bool)
|
(Integer, Bool)
|
232 * 2 |
template <typename T>
|
struct Foo<T> {
|
data Foo T = Foo {
|
28 * T |
I could go on, and all of these examples are products of only two types (though there can obviously be more!), but hopefully you can see what’s going on here: a “product” type is just when you have more than one thing in your data type. We’re used to that idea in programming, that’s nothing new.
Sums
With that in mind, let’s look at sums. It’s the same idea as before: we start
with two (or more) types, a
and b
, and combine them into a new type that has
a + b
inhabitants. In practice, this means your new type has to have either
an a
or b
. Let’s see some examples.
C++ | Rust | Haskell | Math |
---|---|---|---|
enum class ColorScheme {
|
enum ColorScheme {
|
data ColorScheme =
|
1 + 1 + 1 |
std::optional<T>
|
Option<T>
|
Maybe T
|
1 + T |
std::variant<T, E>
|
Result<T, E>
|
Either T E
|
T + E |
???
|
enum OpCode {
|
data OpCode =
|
1 + 232 + ... |
Both Rust and Haskell (as well as other languages) have decent support for sum types, as shown by the example above. In those languages, it’s easy to define new sum types that directly represent the problem we’re working on, like what the fourth row does. (In fact, the fourth row also starts to demonstrate that sum types can embed more complicated variants - even full-fledged products).
Some readers will undoubtedly think, “C++ could absolutely represent that fourth
row - just define an OpCode
class, and extend it for each concrete kind of
opcode we need to support.” And they’re not wrong - but look at how much more
complicated that approach is! It requires defining a whole class hierarchy,
instead of just writing a few lines of an enum
definition. (Not to mention
that it requires dynamic_cast
s at runtime to figure out what kind of OpCode
you have — kind of painful.)
This is what fans of algebraic types get so excited about. In an algebraic type system, it’s simple to make types that fit the problem domain, and it’s easy to keep those types updated over time (without having to refactor giant class hierarchies when the project’s needs change) because the system is very flexible and concise (compared to other approaches like OOP).
Tags
An aside: sum types are also called “discriminated” or “tagged” unions, because they are often implemented by using a union type (like the kind in C or C++) but with an additional “tag” or “discriminant” field that keeps track of what variant the union is currently holding. That way, the tag actually lets the client code operate in a typesafe way, instead of having to either guess which variant is occupied, or keep track of it separately.
It’s important that the tags keep track of what variant is currently held,
rather than what type is currently held, so that it’s possible to use a tagged
union where two of the different variants have the same type. For example,
imagine you’re implementing a read(buffer)
function. You’d like the function
to return the number of bytes read into the buffer on success; or an error code
on failure. So the signature could look something like read(buffer: &mut [byte]) ->
Result<usize, usize>
in Rust. (In actual idiomatic Rust there are better ways
to do it, but let’s ignore that for now.) If we could only distinguish what type
the Result
held, we wouldn’t actually be able to tell success from failure,
without resorting to other tricks.
The other way to see this issue is to notice that without the tag, unions don’t actually have the right number of distinct inhabitants! For example, a union type with two boolean variants can only actually represent 2 distinct values, instead of the 2+2 = 4 that it’s supposed to have.
Recursive types
It can be interesting to look at what happens when we have recursive types (types that refer to themselves), like a linked list. In C++, a linked list could look like this:
template <typename T>
struct LinkedListNode {
T payload;
LinkedListNode<T>* next;
};
We know that the next
pointer can be null OR a valid pointer to another node,
so an equivalent structure in Rust would be:
struct LinkedListNode<T> {
payload: T,
next: Option<Box<LinkedListNode<T>>>,
}
Or in Haskell:
data LinkedListNode t = LinkedListNode {
payload :: t,
next :: Maybe (LinkedListNode t)
}
How many inhabitants does such a type have? Well, obviously infinitely many, since we can always keep making the linked list longer. But let’s look at the math anyway. We can see in the Rust and Haskell versions that this type involves both a sum and a product.
Let L = the number of inhabitants of the LinkedListNode<T>
type. Then we have:
L = T * (1 + L) = T + T*L
Substituting in T + T*L for L in the right-hand side, we get:
L = T + T*(T + T*L) = T + T2 + T2*L
Substituting again:
L = T + T2 + T2*(T + T*L) = T + T2 + T3 + T3*L
We can see the pattern here now, so we’ll skip ahead:
L = T + T2 + T3 + T4 + …
Intuitively, the right-hand-side means “one element of type T, or two of them, or three of them, or four of them, or …”. Which makes total sense here - it’s the same as saying “a list of length 1, or 2, or 3, or 4, or …”. I don’t know about you, but I think it’s pretty cool that that works out so nicely, even though we switched into doing algebra instead of talking about types.
The math you can do on types gets even weirder, too. It turns out that taking the derivative of some generic type (with respect to its generic parameter is useful, and the result is the type of the “one-hole context” (or the zipper) for that type. I won’t try to explain that here - I’ll just link to a post I read (and stole some inspiration from) that does a better job than I would.