gen_statem in context

Posted: 2024-01-11T17:56:34Z
Updated: 2024-02-20T06:03:09Z
Changelog

Update!

José Valim announced on Twitter that OTP 27 will generate its documentation with ExDoc! This includes the gen_statem docs!

The proof-of-concept docs are much easier on the eyes than their predecessors, especially the type specifications. The syntax is still Erlang (they are Erlang docs after all), so there's still some overhead for the uninitiated Elixir dev.

Once they're official, I'll probably replace the links to the OTP docs herein with their ExDoc counterparts.

Preface

In this post, I hope to capture some of the institutional knowledge around :gen_statem I wish I had when first using it. Overall, it's a really awesome behaviour, particularly for dealing with network connections and protocols.

Foundations of :gen_statem

Elixir has processes. Each has its own isolated memory, and they communicate (or rather coordinate) by message passing.

GenServer is a behaviour that makes a process act as a server. GenServers are designed to be highly available:

  • They handle every message received; none are left unanswered.
  • Messages are handled in the order received.

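:gen_statem is a state machine abstraction with the same server-like qualities as GenServer. Strictly speaking it is its own OTP behaviour, a sibling of gen_server rather than a layer on top of it, but thinking of it as "a GenServer with explicit states" is a useful mental model. By specifying:

  • The possible states a process can be in.
  • What messages it can handle in what states.
  • How those messages transition the process from one state to the next.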

We gain the following amenities:

  • Postponing the handling of messages until our process is in a state to handle them.
  • Comprehensive and easy to use timeouts.
  • State enter calls, which run some task every time a given state is entered.
  • Colocated callback functions for each state.
  • Easy to use internal messaging and handling.
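
To make a couple of these amenities concrete, here is a minimal sketch of postponing and a state timeout. The Uplink module, its states, and its messages are all hypothetical names invented for illustration; only the :gen_statem calls and return tuples come from OTP.

```elixir
defmodule Uplink do
  # Hypothetical example module; not taken from any library.
  @behaviour :gen_statem

  def start_link(), do: :gen_statem.start_link(__MODULE__, :ok, [])
  def connect(pid), do: :gen_statem.cast(pid, :connect)
  def push(pid, msg), do: :gen_statem.call(pid, {:push, msg})

  @impl true
  def callback_mode(), do: :state_functions

  @impl true
  def init(:ok), do: {:ok, :disconnected, []}

  # We can't push while disconnected, so postpone the call.
  # Postponed events are retried after the next state change.
  def disconnected({:call, _from}, {:push, _msg}, _data) do
    {:keep_state_and_data, [:postpone]}
  end

  def disconnected(:cast, :connect, data) do
    # The state timeout fires once we have been in :connected for 5 seconds.
    {:next_state, :connected, data, [{:state_timeout, 5_000, :idle}]}
  end

  def connected({:call, from}, {:push, msg}, data) do
    {:keep_state, [msg | data], [{:reply, from, :ok}]}
  end

  def connected(:state_timeout, :idle, data) do
    {:next_state, :disconnected, data}
  end
end
```

A caller that invokes Uplink.push/2 before Uplink.connect/1 simply blocks until the connection is up, then gets :ok. Events that have no matching clause (say, :connect while already connected) would crash this sketch; a real module would handle or ignore them explicitly.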

When's it appropriate to use?

The official :gen_statem behaviour docs provide guidance here, but I'd like to provide more concrete examples.

:gen_statem is ideal for organizing the possible states your GenServer process can be in, but not the possible states your application's data can be in.

The typical examples used to illustrate state machines (say a door that may be :open or :closed) aren't appropriate for :gen_statem. It'd be highly unusual to model a door as a process when a map would suffice:

%{
  id: 1337,
  name: "Front Door",
  position: :closed, # or :open
  handle: :locked, # or :unlocked
  deadbolt: :locked, # or :unlocked
}

In most cases, you should stick to modeling your domain logic as vanilla modules of plain old data structures with pure functions and pattern matching. Those modules should then be consumed by GenServers, which are simpler and more flexible than :gen_statem's.
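
As a rough illustration of that approach, the door could be modeled as a struct with a few pure functions. The Door module and its function names are assumptions made up for this sketch; the fields mirror the map shown earlier.

```elixir
defmodule Door do
  # Hypothetical module: plain data plus pure functions, no process involved.
  defstruct id: nil, name: nil, position: :closed, handle: :locked, deadbolt: :locked

  # Unlocking always succeeds and returns the updated struct.
  def unlock(%__MODULE__{} = door) do
    %{door | handle: :unlocked, deadbolt: :unlocked}
  end

  # Pattern matching encodes the legal transitions:
  # only a closed, fully unlocked door can be opened.
  def open(%__MODULE__{position: :closed, handle: :unlocked, deadbolt: :unlocked} = door) do
    {:ok, %{door | position: :open}}
  end

  def open(%__MODULE__{position: :open}), do: {:error, :already_open}
  def open(%__MODULE__{}), do: {:error, :locked}

  def close(%__MODULE__{position: :open} = door), do: {:ok, %{door | position: :closed}}
  def close(%__MODULE__{}), do: {:error, :already_closed}
end
```

Every transition here is an ordinary function call that can be unit tested without starting a process. If the door ever needs to live in one, a GenServer can hold the struct and delegate to these functions.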

That said, :gen_statem really shines in use-cases such as managing persistent connections, webserver sessions, and assembling packets. These use-cases all have a limited number of states and deal with data in motion.

Tutorials

I consider Andrea Leopardi's Persistent connections with gen_statem the best tutorial as persistent connections are an ideal case for :gen_statem.

Andrew Bennett's Time-Out: Elixir State Machines versus Servers summarizes :gen_statem's timeout mechanisms well where the official docs are less concise.

The official OTP design principles are otherwise comprehensive as are the module docs.

Pitfalls

Steep learning curve

The :gen_statem docs are very comprehensive but don't have a gentle learning curve.

Several callback modes

:gen_statem has two possible callback modes, :state_functions and :handle_event_function, along with the :state_enter modifier:

def callback_mode(), do: :state_functions
def callback_mode(), do: [:state_functions, :state_enter]
def callback_mode(), do: :handle_event_function
def callback_mode(), do: [:handle_event_function, :state_enter]

Pretty much every tutorial uses a different permutation, which makes examples hard to follow. I consider :state_functions preferable for its declarative syntax and explicitness compared to :handle_event_function, which groups everything under one callback.
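
To compare the two modes side by side, here is the same tiny machine written both ways. The module names, states, and events are hypothetical and exist only for this comparison.

```elixir
defmodule WithStateFunctions do
  @behaviour :gen_statem

  def start_link(), do: :gen_statem.start_link(__MODULE__, :ok, [])

  @impl true
  def callback_mode(), do: :state_functions

  @impl true
  def init(:ok), do: {:ok, :disconnected, %{}}

  # One function per state, named after the state.
  def disconnected(:cast, :connect, data), do: {:next_state, :connected, data}

  def disconnected({:call, from}, :status, _data),
    do: {:keep_state_and_data, [{:reply, from, :disconnected}]}

  def connected(:cast, :disconnect, data), do: {:next_state, :disconnected, data}

  def connected({:call, from}, :status, _data),
    do: {:keep_state_and_data, [{:reply, from, :connected}]}
end

defmodule WithHandleEvent do
  @behaviour :gen_statem

  def start_link(), do: :gen_statem.start_link(__MODULE__, :ok, [])

  @impl true
  def callback_mode(), do: :handle_event_function

  @impl true
  def init(:ok), do: {:ok, :disconnected, %{}}

  # A single callback; the current state arrives as the third argument.
  @impl true
  def handle_event(:cast, :connect, :disconnected, data), do: {:next_state, :connected, data}
  def handle_event(:cast, :disconnect, :connected, data), do: {:next_state, :disconnected, data}

  def handle_event({:call, from}, :status, state, _data),
    do: {:keep_state_and_data, [{:reply, from, state}]}
end
```

One trade-off this shows: with :handle_event_function an event that is handled identically in every state (like :status) needs only one clause, whereas with :state_functions it is repeated per state but each state's behaviour reads as a self-contained group.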

:state_enter functions are useful when entering a state always necessitates some work, but they shouldn't be used by default. They can be difficult to follow and refactor when you have too many.
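
A small sketch of a case where a state enter call earns its keep: every entry into :disconnected, from anywhere, should schedule a retry. The Reconnector module, its 100 ms delay, and the :drop event are hypothetical.

```elixir
defmodule Reconnector do
  @behaviour :gen_statem

  def start_link(), do: :gen_statem.start_link(__MODULE__, :ok, [])

  @impl true
  def callback_mode(), do: [:state_functions, :state_enter]

  @impl true
  def init(:ok), do: {:ok, :disconnected, %{attempts: 0}}

  # Runs on every entry into :disconnected, including the initial one,
  # so the retry timer is scheduled in exactly one place.
  def disconnected(:enter, _old_state, _data) do
    {:keep_state_and_data, [{:state_timeout, 100, :retry}]}
  end

  def disconnected(:state_timeout, :retry, data) do
    {:next_state, :connected, %{data | attempts: data.attempts + 1}}
  end

  # With :state_enter on, every state needs an :enter clause,
  # even when there is nothing to do.
  def connected(:enter, _old_state, _data), do: :keep_state_and_data
  def connected(:cast, :drop, data), do: {:next_state, :disconnected, data}
end
```

The no-op :enter clause in connected/3 hints at the cost: once the modifier is enabled, each state pays for it, which is part of why it is better opted into deliberately.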

Implicit syntax

Initially it can be difficult to follow :gen_statem's syntax, specifically:

  • The state-callback naming conventions
  • The returns e.g. :next_state, :keep_state, :keep_state_and_data, :repeat_state
  • The variety of possible transition actions

The docs outline all of these, but it requires lots of jumping back and forth and a careful eye to translate.
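
As a partial cheat sheet, the sketch below annotates the most common return tuples and a couple of transition actions in one place. The Counter module and its events are hypothetical; the tuples and actions are the ones documented for :gen_statem.

```elixir
defmodule Counter do
  @behaviour :gen_statem

  def start_link(), do: :gen_statem.start_link(__MODULE__, :ok, [])

  @impl true
  def callback_mode(), do: :state_functions

  # {:ok, initial_state, initial_data}
  @impl true
  def init(:ok), do: {:ok, :idle, 0}

  # With :state_functions the callback is named after the state (:idle here)
  # and receives (event_type, event_content, data).

  # {:next_state, state, data, actions}: change state, then run the actions.
  # {:next_event, :internal, term} queues an event handled before any
  # message already waiting in the mailbox.
  def idle(:cast, :start, count) do
    {:next_state, :running, count, [{:next_event, :internal, :tick}]}
  end

  # {:keep_state_and_data, actions}: change nothing, only run the actions.
  # {:reply, from, term} answers a :gen_statem.call/2.
  def idle({:call, from}, :count, count) do
    {:keep_state_and_data, [{:reply, from, count}]}
  end

  # {:keep_state, data}: same state, new data.
  # {:repeat_state, data} behaves the same, except that it re-runs the
  # state enter call when :state_enter is enabled.
  def running(:internal, :tick, count), do: {:keep_state, count + 1}

  def running({:call, from}, :count, count) do
    {:keep_state_and_data, [{:reply, from, count}]}
  end

  def running(:cast, :stop, count), do: {:next_state, :idle, count}
end
```

This is far from the full list of actions (timeouts, :postpone, and :hibernate among others are omitted), but it covers the shapes that show up in most callbacks.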

Lack of termination reports

By default Elixir won't log termination reports for modules using :gen_statem, but you can patch them in with this translator from nostrum. Add it with:

Logger.add_translator({StateMachineTranslator, :translate})

A similar effect can be had by setting :handle_sasl_reports to true, but this logs lots of extra unnecessary information that clutters your logs.
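
For reference, the SASL option is a Logger setting. Assuming a standard Mix project layout with a config/config.exs file, it would look roughly like this:

```elixir
# config/config.exs (assumed location in a standard Mix project)
import Config

# Forward SASL reports (crash, supervisor, and progress reports) to Logger.
config :logger, handle_sasl_reports: true
```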

How often is :gen_statem used?

The following are searches for invocations of :gen_statem and GenServer (at time of writing) across Elixir and Erlang repos on GitHub (excluding forks and archives):

For completeness one might also consider :gen_statem's predecessor :gen_fsm:

Calculating the ratios, we net some interesting results:

  • ~173:1 GenServer's to :gen_statem's for Elixir.
  • ~50:1 GenServer's to :gen_statem's for both languages.
  • ~26:1 gen_server's to gen_statem's for Erlang.
  • ~22:1 GenServer's to :gen_statem/:gen_fsm's for both languages.

As an interesting tidbit, Joe Armstrong's 2003 PhD thesis also breaks down some projects by behaviour. It's a comparatively small, telecom-adjacent sampling, landing at 122+56 gen_server's to 1+10 gen_fsm's, making a ratio of ~16:1.

Overall I'd say these results track with :gen_statem being useful for managing connections, and perhaps lacking first-class support in Elixir.

Where is :gen_statem being used in Elixir?

The following is a sampling of a few popular open source libs and their respective modules that use :gen_statem:

Blue Heron

A library for interfacing with Bluetooth Low Energy devices.

  • BlueHeron.ATT.Client
  • BlueHeron.HCI.Transport
  • BlueHeron.Peripheral

elixir-ecto/db_connection

DB connection behaviour and pool for Ecto.

  • DBConnection.Connection

Finch

Popular HTTP client built on Mint and NimblePool.

  • Finch.HTTP2.Pool

Livebook

Collaborative Elixir notebooks.

  • Livebook.Teams.Connection

elixir-mongo/mongodb

MongoDB driver for Elixir.

  • Mongo.Session

Nostrum

A library for interfacing with Discord's API with an emphasis on scaling.

  • Nostrum.Shard.Session
  • Nostrum.Api.Ratelimiter

Postgrex

Postgres driver for Elixir.

  • Postgrex.SimpleConnection
  • Postgrex.ReplicationConnection

Supavisor

Cloud-native, multi-tenant Postgres connection pooler.

  • Supavisor.ClientHandler
  • Supavisor.DbHandler

Xandra

Fast, simple, and robust Cassandra/ScyllaDB driver for Elixir.

  • Xandra.Connection
  • Xandra.Cluster.Pool

Additional Resources

Official Erlang docs

Existing articles

Talks