Posted Sunday, August 2nd, 2020
I first heard of Elixir when I joined SafeBoda as a Back-End Engineer. I'm sure you will agree that Elixir makes you feel productive: the syntax is sweet, and the functional style is very interesting to code in. You love each line that you write. I do, especially when I imagine what the code is doing :). It is even more fun when solving concurrency problems. I have been curious about how things work under the hood, because I used to think of Erlang as ugly and hard to get around. Now I am conquering my fears :-). Glad you could join me!
Elixir runs on top of the Erlang Runtime System (ERTS), with BEAM at its core, and this forms a big part of how concurrency is achieved. So I decided to take a peek into ERTS, especially BEAM, to understand what happens in there while Elixir code is being executed.
The Erlang Runtime System is a collection of tools, the Erlang VM (BEAM) being one of them. It follows a fault-tolerant, distributed, and concurrent computing approach that builds on top of the Actor Model. Therefore, to better understand how ERTS works, we have to get the basic idea behind the Actor Model.
The Actor Model is a mathematical model of concurrent computation that treats actors as the fundamental and universal primitive of concurrent computation.
ERTS, which Elixir builds upon, implements the actor model using processes as actors. These processes are lightweight and fast to create and terminate. Consider the diagram below.
The actors in this model can have the following properties among others depending on their design.
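To make the idea of a process acting as an actor concrete, here is a minimal sketch. The module name ActorSketch and its message shapes are hypothetical and chosen only for illustration; the sketch assumes nothing beyond the built-in spawn, send, and receive primitives. The actor keeps private state (a counter), handles one message at a time from its mailbox, and shares state only by replying with a message.

```elixir
defmodule ActorSketch do
  # Hypothetical module name, used only for illustration.

  # Start an "actor": a process that keeps a private counter as its state.
  def start, do: spawn(fn -> loop(0) end)

  # Ask the actor for its state and wait up to one second for the reply.
  def get(pid) do
    send(pid, {:get, self()})

    receive do
      {:count, count} -> count
    after
      1_000 -> :timeout
    end
  end

  # The actor handles one message at a time from its mailbox.
  defp loop(count) do
    receive do
      :increment ->
        loop(count + 1)

      {:get, caller} ->
        send(caller, {:count, count})
        loop(count)
    end
  end
end

pid = ActorSketch.start()
send(pid, :increment)
send(pid, :increment)
IO.inspect(ActorSketch.get(pid))
```

The caller never touches the counter directly; it can only send messages and wait for a reply, which is the property that makes actors safe to run concurrently.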
Now we know what we get when we deploy a node of the Erlang Runtime System. It's time to look in detail at what makes up the runtime. Consider the diagram below.
So far we know that an ERTS node runs as a single OS process on the host machine, by default with one scheduler thread for each core available on the CPU. From the diagram above, the runtime on a single node has the following layers.
Elixir - Provides the Elixir core and third-party modules.
OTP - Discussed below.
BEAM - Discussed below.
By default, there is just one node when you start ERTS with default settings. You can check the node info like below:
iex(4)> Node.list :this
[:nonode@nohost]
iex(5)> Node.alive?
false
iex(6)>
You can start more nodes on different machines or the same machine and connect them together for distributed computing. Here is a simple setup of two nodes in one host machine.
# shell
iex --sname node1@localhost
iex(node1@localhost)1> Node.alive?
true
iex(node1@localhost)2> Node.list([:this, :visible])
[:node1@localhost]
Start the second node and connect it to the first node above. Note the number of nodes listed after connecting node 2 to node 1:
# shell
iex --sname node2@localhost
iex(node2@localhost)1> Node.alive?
true
iex(node2@localhost)2> Node.list([:this, :visible])
[:node2@localhost]
iex(node2@localhost)3> Node.connect(:node1@localhost)
iex(node2@localhost)4> Node.list([:this, :visible])
[:node2@localhost, :node1@localhost]
OTP is a collection of libraries written in the Erlang programming language. It consists of the Erlang Runtime System (ERTS), a number of ready-to-use components, and a set of design principles for Erlang programs. It is an integral part of the open-source distribution of Erlang and ships with each distribution of Erlang.
From this definition, OTP has modules and behaviors that implement repeatable tasks like process spawning and supervision, interprocess communication, in-memory caching, etc. OTP modules, behaviors, and tools are available in any ERTS node. Let us name a few OTP components that you may have heard of.
supervisor - Behavior for implementing supervision trees.
gen_server - Behavior for implementing a standard client-server relation.
ets - In-memory data store.
gen_tcp - For implementing sockets.
More OTP libraries are listed on this link. Read through to find useful tools you don't have to rewrite.
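As a small taste of how these components are used from Elixir, here is a hedged sketch of the ets in-memory store. The table name :dish_cache and the stored values are hypothetical; the calls themselves (:ets.new/2, :ets.insert/2, :ets.lookup/2) are part of OTP and need no extra dependency.

```elixir
# Create an in-memory table owned by the current process.
# :dish_cache is a hypothetical name chosen for this example.
table = :ets.new(:dish_cache, [:set, :public])

# Insert two {key, value} tuples; the first element is the key.
true = :ets.insert(table, {:starter, 3_000})
true = :ets.insert(table, {:main_dish, 10_000})

# Lookup returns a list of matching tuples (at most one for a :set table)
# and an empty list when the key is absent.
IO.inspect(:ets.lookup(table, :starter))
IO.inspect(:ets.lookup(table, :dessert))
```

The table lives as long as its owner process does, which makes it a handy cache but not a durable store.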
BEAM (Björn’s Erlang Abstract Machine), aka the Erlang VM, is the virtual machine that executes Erlang/Elixir code in processes. Here are key points about BEAM. When it comes to concurrency, the most important part of BEAM is scheduling, so that is what we will talk about here to limit the scope.
Elixir/Erlang programs are run in BEAM/Erlang processes. We will now try to understand how these processes exist and some of the important things that we should know about the runtime system that helps us achieve concurrency. Below are some key points about ERTS and Elixir that make concurrency easy to achieve.
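To get a feel for how cheap BEAM processes are, the following sketch spawns ten thousand of them and collects one message from each. The count and the message shape {:done, value} are arbitrary choices for this example, not something prescribed by the runtime.

```elixir
parent = self()

# Spawn 10_000 processes; each one sends its square back to the parent.
for n <- 1..10_000 do
  spawn(fn -> send(parent, {:done, n * n}) end)
end

# Collect exactly one reply per spawned process and sum the results.
total =
  Enum.reduce(1..10_000, 0, fn _, acc ->
    receive do
      {:done, value} -> acc + value
    end
  end)

IO.inspect(total)
```

Each spawned function runs in its own process with its own mailbox and heap, and the runtime spreads them over the available schedulers without any thread management in user code.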
On each ERTS instance, there is one scheduler per CPU core. Each scheduler manages a FIFO process queue, assigns execution time, and does garbage collection and memory management for the processes in its queue. You can see the number of schedulers in an instance as shown below; here it is 4 for a CPU with 2 cores and 4 hyper-threads (logical CPU cores).
iex(1)> :erlang.system_info :schedulers_online
4
iex(2)>
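The same information is also reachable through Elixir's System module. The sketch below gathers the scheduler counts next to the number of logical processors the runtime detected, so you can compare them on your own machine. The map keys are just labels for this example, and the actual numbers depend on the host.

```elixir
# Schedulers currently online (the value queried in the session above).
online = System.schedulers_online()

# Total scheduler threads created when the node started.
total = System.schedulers()

# Logical processors the runtime detected; this can be :unknown on some platforms.
logical = :erlang.system_info(:logical_processors)

IO.inspect(%{online: online, total: total, logical_processors: logical})
```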
You can still specify the number of schedulers, up to 1024, using Erlang runtime flags. Here is the same machine with 4 logical cores after starting the node with iex --erl "+S 10": it now has 10 schedulers, 4 of them online.
# shell
iex --erl "+S 10"
iex(1)> :erlang.system_info :schedulers_online
4
iex(3)> :erlang.system_info :schedulers
10
iex(4)>
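The number of online schedulers can also be changed while the node is running. This is a minimal sketch using :erlang.system_flag/2, which the session above does not use. It brings a single scheduler online, then restores the previous setting; choosing 1 is only for illustration.

```elixir
# Total scheduler threads created at startup.
total = :erlang.system_info(:schedulers)

# Bring only one scheduler online; the call returns the previous setting.
previous = :erlang.system_flag(:schedulers_online, 1)
IO.inspect(:erlang.system_info(:schedulers_online))

# Restore the previous setting so the rest of the session is unaffected.
:erlang.system_flag(:schedulers_online, previous)
IO.inspect({:erlang.system_info(:schedulers_online), total})
```

The online count can never exceed the total, which is why the total has to be chosen at startup with a flag such as +S.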
Consider the code below, which has a module called Cooker whose task is to cook a meal. The chef in this case is the runtime, which can cook meals either asynchronously or synchronously. The idea is that when a customer requests a starter that takes 3 minutes to prepare and a main dish that takes 10 minutes to prepare (simulated below as 3 and 10 seconds), then regardless of the order in which the two are started, they should cook concurrently so that the starter is ready before the main dish.
Here is the module that defines a cooker. It's a module because it simulates a unit of computation that does some task and whose progress needs monitoring. I have used the Erlang :gen_server behavior, which is known in Elixir as GenServer.
Note: This abstracts many things. To fully understand how things work under the hood, I recommend you read Elixir in Action, Chapters 5 and 6.
defmodule Cooker do
  use GenServer

  @impl true
  def init(state), do: {:ok, state}

  # The dish name is both the process state and its registered name.
  def start(dish), do: GenServer.start_link(__MODULE__, dish, name: dish)

  @impl true
  def handle_cast({:cook, time}, state) do
    cook(state, time)
    {:noreply, state}
  end

  def handle_cast(:serve, state) do
    serve(state)
    {:noreply, state}
  end

  # Simulate cooking by sleeping for `time` milliseconds,
  # then ask this same process to serve the dish.
  defp cook(dish, time) do
    :timer.sleep(time)
    IO.puts("Cooked #{dish} in #{time}, Serving now")
    GenServer.cast(self(), :serve)
  end

  defp serve(dish), do: IO.puts("Served #{dish}")
end
The Chef module simulates a part of a system that starts a process for each task it receives, so that the tasks run concurrently and are easy to monitor or troubleshoot.
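defmodule Chef do
  def starter_main_dish() do
    # Start one Cooker process per dish, registered under the dish name.
    Cooker.start(:main_dish)
    Cooker.start(:starter)
    # Casts return immediately, so both dishes cook at the same time.
    GenServer.cast(:main_dish, {:cook, 10000})
    GenServer.cast(:starter, {:cook, 3000})
  end
end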
Notice that even though the main dish takes 10 seconds and is started before the starter, which takes 3 seconds, each dish completes independently of the other's cooking time.
iex(6)> Chef.starter_main_dish()
:ok
Cooked starter in 3000, Serving now
Served starter
Cooked main_dish in 10000, Serving now
Served main_dish
iex(7)>
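For comparison, the same idea can be sketched with Task instead of a GenServer when all you need is to run two jobs concurrently and collect their results. This is an alternative illustration, not the approach used above: the module name TaskChef is hypothetical, and the cooking times are scaled down to 100 and 30 milliseconds so the example finishes quickly.

```elixir
defmodule TaskChef do
  # Hypothetical module; times are in milliseconds and scaled down.
  def cook(dish, time) do
    Process.sleep(time)
    {dish, time}
  end

  def starter_main_dish do
    started = System.monotonic_time(:millisecond)

    # Each Task.async call starts a separate process for one dish.
    tasks = [
      Task.async(fn -> cook(:main_dish, 100) end),
      Task.async(fn -> cook(:starter, 30) end)
    ]

    # Await each task in turn; results come back in task order.
    results = Enum.map(tasks, &Task.await/1)
    elapsed = System.monotonic_time(:millisecond) - started
    {results, elapsed}
  end
end

{results, elapsed} = TaskChef.starter_main_dish()
IO.inspect(results)
IO.puts("Both dishes ready after about #{elapsed} ms")
```

Because the two dishes cook in separate processes, the elapsed time should be close to the longer cooking time rather than the sum of both.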
More resources
I learned the most important things that make concurrency work in Elixir/Erlang. Here are my key points.
Thank you for finding time to read my post. I hope you found this helpful and it was insightful to you. I enjoy creating content like this for knowledge sharing, my own mastery and reference.