Around 11 minutes (2275 words). Published January 22, 2021.

Workshop on Agent-Based Modeling

Context: on Thursday I ran a workshop on agent-based modeling for the “Columbia Undergraduate Mathematical Modelling Workshop”, a 4-week series of hands-on lectures on mathematical modeling for interested undergraduates. I’ve reproduced the notebook here.

CUMMW Workshop (Operations Research 2): Agent-Based Modeling

by Aapeli Vuorinen (https://aapelivuorinen.com). IEOR, Columbia University.

Materials available at: https://aap.li/cummw

Introduction

Agent-based modeling is a framework for modeling complex systems that focuses on agents, their individual behaviour, and the interactions between agents.

This allows us to study the structure of potentially very complicated systems without necessarily needing to be able to compute answers analytically or even write down the full system.

What is agent-based modeling?

Agent-based modeling works by building simple, autonomous agents that interact with each other over time. This allows us to build sophisticated models, simulate them, and then study the macro-scale, emergent behavior of large populations of agents. We can then modify the model and perform analyses to study how different rules or agent-level decisions change the outcome of the simulation.

Here’s a classical example used to study segregation: https://ncase.me/polygons/. You saw a bunch of examples last time in Ton’s slides, a lot of the 3D examples were agent-based mdoels. Even computer games can be thought of as agent-based models!

The power of these models is in their simplicity and ease of implementation, as well as the huge variety of different scenarios that can be studied. However, these come with some drawbacks that we’ll discuss throughout the workshop, such as their computational complexity meaning we have to make various tradeoffs while modeling.

A model for COVID-19 at Columbia

The simulation workflow

An important part of ABM (and any modeling regime) is the actual modeling part. We have to choose the right things to include in the model to make it accurate enough while keeping it computationally tractable. Once a realistic model is built, we can move to control by studying how controlling different parameters of the model might influence the outcome. For example, we might try out different rules to limit virus spread, such as reducing the number of students that go to class.

We’ll now construct a simple model of COVID-19 and play around with it a bit so you get a feel for how it works.

The model

To illustrate the point of ABM, we will start with a very basic model. Often when modeling, we first make a simple model and then iterate on it. This is an important part of modeling.

We will model students as the agents that interact with the campus environment, composed of classrooms and apartments. Agents move between classrooms (randomly) and apartments (initial random allocation, but then fixed). Disease spread is random through the population.

Agents (students)

There are \(N\) students going to campus
Students live in one of \(M\) apartment buildings, each apartment has \(B\) bedrooms, and students are allocated randomly into these rooms
Every weekday (Mon-Fri), students go to one of \(K\) (randomly assigned) classes and then back home
If a student gets infected on a given day, they become infectious at the start of the same day, then remain infectious for a randomly chosen amount of time between 7-10 days

Interactions between students

During each day in class, the students have a probability \(q\) of getting infected with COVID independently for each student in the same class that is at the time infectious
During each evening/night, the students have a probability \(p\) of getting infected with COVID independently for each infectious housemate

So for example if four students, \(A\), \(B\), \(C\), and \(D\) live together in an apartment, with \(B\) and \(C\) currently infectious, then each evening, student \(A\) has probability \(p\) of getting infected by \(B\), and an independent probability \(p\) of getting infected by \(C\). Therefore the probability of \(A\) getting infected by either \(B\) or \(C\) (or both) is \(p(1-p)+(1-p)p+p^2\). Similarly \(D\) has the same probability of being infected each evening.

If a student gets infected during the day, they only become infectious at the start of the next day.

Questions:

Is this a good model?
What assumptions have we made?
How should we pick the parameters? (Of course things like \(N\) can be chosen from data we have, but what about \(p\) and \(q\)?)

Imports

# randomization functions
from random import uniform, choice, shuffle

# an enumeration, a set of named things
from enum import Enum

# numpy is a numerical computation library for python
import numpy as np

Helper function

def bernoulli(prob):
    """
    Returns a random variate that takes value True with probability `prob` and False with probability `1-prob`
    """
    return uniform(0, 1) < prob

The main `Student` agent

def generate_infectious_time():
    return round(uniform(7,10))

class State(Enum):
    HEALTHY = 1 # also called susceptible
    INFECTIOUS = 2
    IMMUNE = 3 # also called recovered

class StudentAgent:
    def __init__(self):
        # initially we are healthy
        self.state = State.HEALTHY
        # whether the agent has been infected today, and we should mark
        # them as infected when next advancing the disease
        self._infected = False

    def simulate_disease(self, day):
        """
        Simulate the course of disease for this agent
        """
        if self.state == State.HEALTHY:
            if self._infected:
                # we're infected
                self.state = State.INFECTIOUS
                self.infectious_end = day + generate_infectious_time()
            else:
                # do nothing if we're healthy and not infected
                pass
        elif self.state == State.INFECTIOUS:
            if day == self.infectious_end:
                # it's the day we stop being infectious and become immune
                self.state = State.IMMUNE
            else:
                # we're infectious but no need to change state
                pass

    def infect(self):
        self._infected = True

    def infectious(self):
        """
        Whether we're currently infectious or not
        """
        return self.state == State.INFECTIOUS

    def have_been_sick(self):
        """
        If this agent has ever been sick or not
        """
        return self.state != State.HEALTHY

Some helper classes and functions

class Apartment:
    """
    This is just a placeholder for some info with a few convenience functions. It could easily be replaced by a list.
    """
    def __init__(self, capacity):
        self.occupants = []
        self.capacity = capacity

    def add_occupant(self, occupant):
        self.occupants.append(occupant)

    def is_full(self):
        return len(self.occupants) == self.capacity

class Days(Enum):
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5
    SATURDAY = 6
    SUNDAY = 7

def randomly_infect(group, prob):
    """
    Given a group of students, randomly infects each pair of students with probability prob
    """
    # for each pair of occupants, they get infected with prob p
    for agent1 in group:
        for agent2 in group:
            # we simulate a possible infection event for each pair where one is infected and the other is not
            if agent1.infectious() and not agent2.infectious():
                if bernoulli(prob):
                    agent2.infect()

def assign_to_classrooms(students, number_of_classrooms):
    """
    Reandomly assign students into classrooms with roughly the same number of students in each.

    This is a confusing function, don't worry if you don't understand how it works.

    Returns a list of lists.
    """
    # randomize the order of students
    shuffle(students)
    # divide into equal chunks
    return [students[i::K] for i in range(number_of_classrooms)]

The main simulation loop

def run_simulation(N, M, B, K, p, q):
    # create the students and apartments
    students = [StudentAgent() for _ in range(N)]
    apartments = [Apartment(capacity=B) for _ in range(M)]

    # infect ten student at the start
    for i in range(10):
        students[i].infect()

    # randomize the students (so those 10 end up in random spots)
    shuffle(students)

    # randomly allocate students into apartments
    for student in students:
        apartments_with_space = [apartment for apartment in apartments if not apartment.is_full()]

        # choose a random apartment
        random_apartment = choice(apartments_with_space)
        random_apartment.add_occupant(student)

    infected_count = []
    sick_count = []

    day_number = 0

    # let's run for 12 weeks
    for week in range(12):
        # and then for each day in the week, we simulate
        for day in Days:
            # on weekday mornings, we go to class
            if day not in [Days.SATURDAY, Days.SUNDAY]:
                # randomly assign to classrooms
                classrooms = assign_to_classrooms(students, K)

                # within classrooms agents may infect each other
                for classroom in classrooms:
                    randomly_infect(classroom, q)

            for apartment in apartments:
                randomly_infect(apartment.occupants, p)

            infected_count.append(np.sum([student.infectious() for student in students]))
            sick_count.append(np.sum([student.have_been_sick() for student in students]))

            day_number += 1

            # advance the state of the simulation for each agent
            for agent in students:
                agent.simulate_disease(day_number)

    infected_count_daily_change = np.diff(infected_count)
    sick_count_daily_change = np.diff(sick_count)

    # total number of agents who ever got sick
    total_sick = sick_count[-1]

    return total_sick, infected_count, infected_count_daily_change, sick_count, sick_count_daily_change

More helpers (you can ignore these)

# creates a context manager to time an operation
import time
from contextlib import contextmanager
import matplotlib.pyplot as plt

@contextmanager
def timer():
    start = time.perf_counter()
    yield
    print(f"Took {(time.perf_counter() - start)} sec.")

# plots two side-by-side plots
def plot_infected_and_change(infected_count, daily_change, title1, title2):
    f, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 6))

    ax1.plot(infected_count)
    ax1.set_xlabel("Day of simulation")
    ax1.set_title(title1)
    ax1.set_xticks(np.arange(0, len(infected_count) + 1, 7))

    ax2.plot(daily_change)
    ax2.set_xlabel("Day of simulation")
    ax2.set_title(title2)
    ax2.set_xticks(np.arange(0, len(infected_count) + 1, 7))

    plt.show()

Define some model parameters

# number of students
N = 1000

# number of apartments
M = 500

# capacity of each apartment
B = 3

# number of classrooms
K = 100

# daily probability of getting covid for each infected person when at home
p = 0.04

# daily probability of getting covid for each infected person when in classroom
q = 0.02

with timer():
    total_sick, infected_count, infected_count_daily_change, sick_count, sick_count_daily_change = run_simulation(N, M, B, K, p, q)
plot_infected_and_change(infected_count, infected_count_daily_change, "Number of students infectious", "Change in number infectious")
plot_infected_and_change(sick_count, sick_count_daily_change, "Cumulative number of infected students", "New infections per day")

Took 0.49706657499999984 sec.

png

We can see that if there is no intervention, even ten students will cause a full outbreak. You can also see the effect of weekends here with the “steps” in the graph.

Note that this simulation took around half a second to run (on my computer), compared to the much faster DESs. There is an important tradeoff here with model complexity and

Questions:

Why do only ~600 students ever get infected before the pandemic dies? Why not all 1000?
What happens if we run this again, do we get the same result?
Do our assumptions hold?
Was our model reasonable? What could we tweak?

What if we mandated people to use masks in class?

Suppose we set up a rule where people had to use masks in class. This might lead to the probability of infection in class reducing by, say 20%. We can now test out what happens in this scenario:

new_q = 0.8 * q

with timer():
    total_sick, infected_count, infected_count_daily_change, sick_count, sick_count_daily_change = run_simulation(N, M, B, K, p, new_q)
plot_infected_and_change(infected_count, infected_count_daily_change, "Number of students infectious", "Change in number infectious")
plot_infected_and_change(sick_count, sick_count_daily_change, "Cumulative number of infected students", "New infections per day")

Took 0.4382121269999999 sec.

png

Hands on: play around with the model and see what happens

Study how the current model reacts to changing the input parameters, or choose some tweak to implement to the model and see how the system behaves given that change. Come up with your own or pick one from the list below.

(Easy): Study the variability in the simulation output

Run the simulation multiple times (for example with a for loop), and look at how the simulation output changes. Why does the number of people who ever got sick vary so much?

(Easy): Study how the total number of people who got sick changes with the parameters

Write a for loop to run the simulation mutiple times Plot a graph of the total number of people who got sick at any time as a function of \(p\).

(Medium): Implement a “hybrid” mode of teaching

Set aside a proportion of students, say 20%, and make them not go to class (pretend they attend online and stay home during the day). Does that make a difference.

(Medium): Allocate students to apartments in a better way

What if we allocate students to apartments in order to minimize the number of students per apartment, instead of randomly? Does this change the outcome?

(Medium): Allocate students to classes more realistically

Currently, students wake up in the morning and get assigned to a new classroom randomly. Make some kind of schedule for each student, and see how it influences the simulation. For example, what if each student just went to the same class each day?

Implement some kind of logic for students to go out in the evenings and socialize. How does this extra mixing of students change the simulation output?

(Hard): Implement an incubation period

When people really get infected with COVID-19, they don’t get symptoms immediately, there is a period of time, up to a week when they do not exhibit symptoms. Implement an extra state for agents and see how this influences the simulation output.

(Really hard): Implement testing and quarantine

Starting from the above, see if you can implement a testing regime where student are randomly sampled and if they are found to be in incubating state, they are put into quarantine where they won’t go to class.

Python list comprehensions

Hopefully by now you’ve seen a bit of Python here and there. It’s an extremely expressive and powerful language, and mostly you can figure out what’s going on by just reading it. There’s however a few small piece of syntax I’ll highlight to help you see what’s going on.

List comprehensions

List comprehensions are a shortcut for loops where you want to perform some operation for everything in a list and store their return values in a list. I’ve used them a fair bit here to simplify code.

apartments_with_space = [apartment for apartment in apartments if not apartment.is_full()]

# is equivalent to this:
apartments_with_space = []

for apartment in apartments:
    if not apartment.is_full():
        apartments_with_space.append(apartment)

I also use the underscore (_) to denote a variable I don’t care about. In a list comprehension, you have to assign the list values to something. So:

students = [StudentAgent() for _ in range(N)]

# is equivalent to
students = []

for _ in range(N):
    students.append(StudentAgent())