Computer Science

Algebraic Data Types: Structuring Data in Functional Programming

Delve into Algebraic Data Types (ADTs) in functional programming with a focus on Sum and Product types. Learn how ADTs enhance code readability and maintainability through a practical Scala 3 example, setting the stage for advanced topics like Functors

Chipotle Coder

27 Oct 2023 — 9 min read

A chalk board filled with mathematical symbo

Overview

In functional programming, how we structure data is vital. It impacts the clarity and maintainability of our code. Algebraic Data Types (ADTs) provide a clear, concise way to represent complex data structures.

Think of ADTs as primary ingredients in a recipe. They are simple yet powerful, and we can mix them in different ways to build complex types. ADTs come in two primary flavors: Sum and Product types. The names might sound intimidating, but the concepts are straightforward.

But, before diving into ADTs, a brief touch on the algebraic principles they stand on might be helpful. In algebra, we often work with operations and sets. Similarly, ADTs in functional programming allow us to perform operations on types and group data.

The term "algebraic" in ADTs hints at the operations and structures we can use on data types. This article will explore the core ideas behind Algebraic Data Types.
We'll delve into Sum and Product types. We'll see how they enable a more organized and expressive way to handle data.

Grasping ADTs will provide a solid foundation and allow us to explore advanced functional programming concepts.

Sets, Operations, and Laws: The Algebra of Programming

Sets, Operations, and Laws in mathematics define an algebra. Their counterparts in computer programming are Types, Functions, and Laws.

Sets to Types

A set is a collection of distinct objects with some common characteristics. For example, the set of natural numbers is a set of integer numbers one and above.

Types group values by their nature, like integers, strings, or custom data structures. Each type represents a set of values with one or more common characteristics.

In set theory, we encounter two unique sets, the empty and singleton sets. The empty set, often denoted as ∅, is the simplest, containing no elements. It's the very definition of emptiness, a foundational concept that signifies the absence of anything.

Contrastingly, a singleton set contains exactly one element. It's the epitome of simplicity while still holding content. For example, the set {42} is a singleton set because it contains one and only one element.

Translating these concepts into programming leads us to the notions of Bottom and Unit types, respectively. The Bottom type in programming languages is a type that can have no values. It is used to represent abnormal termination; for instance, a function that throws an exception or enters an infinite loop effectively has a return type of Bottom, as it produces no value.

The Unit type corresponds to the singleton set. Just as a singleton set has exactly one element, a Unit type has exactly one possible value, which carries no information. This is akin to the void keyword in many C-style languages, which indicates that a function returns no meaningful value even when it executes correctly.

It's important to note that these types interest those who design and create programming languages. For the average programmer, these types often work behind the scenes. We can't explicitly instantiate data of the Bottom type because, by definition, it represents non-termination or absence of a result.

Similarly, while you can use the Unit type in your programs, it's typically managed by the language to denote the lack of return value with significance. These types are essential in type theory and language semantics, providing a foundation for understanding how data and functions behave and interact.

As developers, we'll care primarily about the ADTs that allow us to create our types, usually called Sum and Product Types.

Operations to Functions

Operations are actions on one or more elements to produce a result. The inputs and results can be of the same or different sets.

The function's domain is the set that describes the valid inputs to that function, and the codomain is the group of possible results.

We can classify functions by the relation between their domain and codomain. We can refer to the function as an endofunction if they are equal or as a heterofunction otherwise.

A Graphical Explanation of Endo & Hetero Function

Laws

In mathematics, functions must return the same value for the same input. In programming, we refer to code that upholds this restriction as pure functions.

Besides this general law, operations might have rules governing their behavior in algebra and programming. They ensure consistency, making the system predictable and optimizations possible.

The type system can enforce some laws in programming, ensuring certain behaviors. Others might need the developer's attention to ensure they hold.

We'll gain correctness, reliability, and reusability in our code by paying attention to the laws.

Unveiling Sum and Product Types

Different kinds of Algebraic Data Types (ADTs) exist, but Bottom and Unit are beyond our control unless we design a programming language. On the other hand, most functional programming languages implement Sum and Product types, and they act as the building blocks, enabling us to create complex types.

Let's dissect these two integral types to grasp their essence.

Sum Types

Sum Types provide a way to represent a value that could be one of several possible variants. Each variant can hold different kinds of data.

The term "sum" reflects the total count of possible values this type can represent—the sum of all variants. In a way, Sum types embody the idea of 'either this or that,' meaning a logical disjunction between possible types.

This makes them crucial for statically strongly typed functional programming languages but less interesting for dynamically typed or imperative programming languages. However, these concepts have proven so helpful that they have found their way to Java and Python. However, their implementation might deliver different benefits than in Functional or Multi-Paradigm Programming languages like Haskell, Scala, or Rust.

💡

Java 17 introduced sealed classes, offering a way to define Sum Types albeit more verbosely. Sealed classes allow for defining a closed hierarchy of types, where the possible subtypes are known at compile-time, similar to how Sum Types operate in other languages. This feature brings a more robust type system to Java, allowing developers to express constraints and variations more explicitly.

They are also known as Union Types because of the union operation on sets. In some programming languages, they are named Tagged or Discriminated Unions.

Product Types

On the flip side, Product types are about combining data. They hold many values together in a single structure.

The term "product" has roots in the concept of Cartesian product in set theory, where we generate a set from the Cartesian product of two or more sets. Each tuple in the product set comprises elements from each contributing set.

Here's a fascinating read on Cartesian products that elucidates the concept further.

Composing Sum and Product Types:

The beauty of Sum and Product types lies in their composability. We can nest them within each other to model complex data structures, providing a strong typing foundation to our code making it more readable, maintainable, and less prone to errors.

Sum and Product types bring a slice of mathematical reasoning to our code, aligning how we model data with the domain we are working in.

As we venture further into functional programming, understanding these concepts will act as a springboard, propelling us toward more advanced topics.

Here are some simple examples of code using Sum and Product Types in different programming languages:

        Scala 3
        Haskell
        Rust
        Java 17
        Python 3
    

        
// Sum Type Example
enum DayOfWeek:
  case Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday

// Product Type Example
case class Person(name: String, age: Int)

// Composing Sum and Product Types
case class Event(name: String, day: DayOfWeek, organizer: Person)

// Usage
val alice = Person("Alice", 25)
val meetup = Event("Scala Meetup", DayOfWeek.Wednesday, alice)

// Output: Event(Scala Meetup,Wednesday,Person(Alice,25))
println(meetup)
        
    

        
-- Sum Type Example
data DayOfWeek = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday
    deriving (Show, Eq)

-- Product Type Example
data Person = Person { name :: String, age :: Int }
    deriving (Show, Eq)

-- Composing Sum and Product Types
data Event = Event { eventName :: String, day :: DayOfWeek, organizer :: Person }
    deriving (Show, Eq)

-- Usage
alice :: Person
alice = Person "Alice" 25

meetup :: Event
meetup = Event "Scala Meetup" Wednesday alice

-- Output: Event {eventName = "Scala Meetup", day = Wednesday, organizer = Person {name = "Alice", age = 25}}
main :: IO ()
main = print meetup
        
    

        
// Sum Type Example
#[derive(Debug)]
enum DayOfWeek {
    Monday,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday,
}

// Product Type Example
#[derive(Debug)]
struct Person {
    name: String,
    age: u32,
}

// Composing Sum and Product Types
#[derive(Debug)]
struct Event {
    name: String,
    day: DayOfWeek,
    organizer: Person,
}

// Usage
fn main() {
    let alice = Person {
        name: String::from("Alice"),
        age: 25,
    };
    let meetup = Event {
        name: String::from("Scala Meetup"),
        day: DayOfWeek::Wednesday,
        organizer: alice,
    };

    // Output: Print the event details using derived Debug implementation
    println!("{:?}", meetup);
}
        
    

        
// Sum Type Example using sealed classes
public sealed interface DayOfWeek permits DayOfWeek.Monday, DayOfWeek.Tuesday, DayOfWeek.Wednesday,
        DayOfWeek.Thursday, DayOfWeek.Friday, DayOfWeek.Saturday, DayOfWeek.Sunday {
    record Monday() implements DayOfWeek {}
    record Tuesday() implements DayOfWeek {}
    record Wednesday() implements DayOfWeek {}
    record Thursday() implements DayOfWeek {}
    record Friday() implements DayOfWeek {}
    record Saturday() implements DayOfWeek {}
    record Sunday() implements DayOfWeek {}
}

// Product Type Example using record
record Person(String name, int age) {}

// Composing Sum and Product Types
record Event(String name, DayOfWeek day, Person organizer) {}

public class Main {
    public static void main(String[] args) {
        // Usage
        Person alice = new Person("Alice", 25);
        Event meetup = new Event("Scala Meetup", new DayOfWeek.Wednesday(), alice);

        // Output: Event[name=Scala Meetup, day=Wednesday, organizer=Person[name=Alice, age=25]]
        System.out.println(meetup);
    }
}
        
    

        
from enum import Enum
from dataclasses import dataclass

# Sum Type Example using Enum
class DayOfWeek(Enum):
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5
    SATURDAY = 6
    SUNDAY = 7

# Product Type Example using dataclass
@dataclass
class Person:
    name: str
    age: int

# Composing Sum and Product Types
@dataclass
class Event:
    name: str
    day: DayOfWeek
    organizer: Person

# Usage
alice = Person("Alice", 25)
meetup = Event("Scala Meetup", DayOfWeek.WEDNESDAY, alice)

# Output: Event(name='Scala Meetup', day=, organizer=Person(name='Alice', age=25))
print(meetup)
        
    

⚠️

Java Enums should not be mistaken for Sum Types. Unlike Sum Types, which can encompass a variety of types and values, Java Enums are limited to a fixed set of constants. While Enums are great for listing known values, they lack the flexibility and power of true Sum Types that can hold different data and structures.

⚠️

Python's Enum is more flexible than Java's enumeration. It allows for complex data structures using Enum members with their properties and methods. However, it's less powerful than algebraic data types in languages like Scala or Rust, where an enum can have different data types associated with each variant. Sometimes, it might be better to use data classes and type hints.

In the above example, DayOfWeek is a sum type representing the days of the week. Person is a product type that aggregates a name and an age. Event is a composition of sum and product types, illustrating how we can nest these types to model more complex data structures.

Conclusion

We've unveiled the basic building blocks of Algebraic Data Types - Sum and Product types and glimpsed at the algebraic principles they hinge on. These concepts pave a smooth path for diving into more nuanced functional programming paradigms.

Next, we'll unravel the concept of Functors, which apply ADTs.

Functors open up possibilities, enabling a more robust and expressive data manipulation. They form a cornerstone in understanding data operations in a functional setting.

Stay tuned for our next article, where we'll venture into the heart of Functors, and we will also be able to help our functional programming palette.

Addendum: A Special Note for Our Readers

I decided to delay the introduction of subscriptions. You can read the full story here.

If you find our content helpful, there are several ways you can support us:

The easiest way is to share our articles and links page on social media; it is free and helps us greatly.
If you want a great experience during the Chinese New Year, I am renting my timeshare in Phuket. A five-night stay in this resort in Phuket costs 11,582 € on Expedia. I am offering it in USD at an over 40% discount compared to that price. I received the Year of the Snake in style.

If your finances permit it, we are happy over any received donation. It helps us offset the site's running costs and an unexpected tax bill. Any amount is greatly appreciated:

Finally, some articles have links to relevant goods and services; buying through them will not cost you more. And if you like programming swag, please visit the TuringTacoTales Store on Redbubble. Take a look. Maybe you can find something you like:

Algebraic Data Types: Structuring Data in Functional Programming

Chipotle Coder

Overview

Sets, Operations, and Laws: The Algebra of Programming

Sets to Types

Operations to Functions

Laws

Unveiling Sum and Product Types

Sum Types

Product Types

Composing Sum and Product Types:

Conclusion

Addendum: A Special Note for Our Readers

Read more

Introducing Mojo: The Next Frontier in Programming Languages

AI Trading Strategies Course 2: Data Wrangling in Mojo

Exploring the AI Trading Strategies Course Using Mojo

Type Checking In Python: Catching Bugs Before They Bite