When Discord handles 10 million concurrent voice users, or when a telecom switch routes thousands of calls per second, they can’t afford downtime. The secret? Architecture patterns originally developed for telephone switches in the 1980s—patterns that automatically recover from failures and scale horizontally without breaking a sweat.
In this deep dive, we’ll explore how to build systems that self-heal using Elixir’s GenServer—the same foundation powering WhatsApp (2 billion users) and countless gaming backends.
What is a GenServer?
A GenServer is a process that implements a client-server relationship. It encapsulates state, provides synchronous and asynchronous communication patterns, and integrates seamlessly with OTP supervision trees for fault tolerance.
Creating Your First GenServer
Let’s build a simple counter that demonstrates the core concepts:
```elixir
defmodule Counter do
  use GenServer

  # Client API

  def start_link(initial_value \\ 0) do
    GenServer.start_link(__MODULE__, initial_value, name: __MODULE__)
  end

  def increment do
    GenServer.call(__MODULE__, :increment)
  end

  def decrement do
    GenServer.call(__MODULE__, :decrement)
  end

  def get_value do
    GenServer.call(__MODULE__, :get_value)
  end

  def increment_async do
    GenServer.cast(__MODULE__, :increment)
  end

  # Server Callbacks

  @impl true
  def init(initial_value) do
    {:ok, initial_value}
  end

  @impl true
  def handle_call(:increment, _from, state) do
    new_state = state + 1
    {:reply, new_state, new_state}
  end

  @impl true
  def handle_call(:decrement, _from, state) do
    new_state = state - 1
    {:reply, new_state, new_state}
  end

  @impl true
  def handle_call(:get_value, _from, state) do
    {:reply, state, state}
  end

  @impl true
  def handle_cast(:increment, state) do
    {:noreply, state + 1}
  end
end
```
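A quick session in `iex` shows the public API in action; the return values below assume a fresh counter starting at 0:

```elixir
# Start the counter (registered under the module name, so no pid needed).
{:ok, _pid} = Counter.start_link()

Counter.increment()       # => 1
Counter.increment()       # => 2
Counter.decrement()       # => 1

Counter.increment_async() # => :ok (fire-and-forget)
Counter.get_value()       # => 2
```

Note that `get_value/0` sees the effect of the preceding cast: messages from the same sender are delivered in order, so the cast is processed before the call.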
Understanding Call vs Cast
GenServer provides two primary communication patterns:
- call/2: Synchronous. The client blocks until the server replies (or the call times out). Use it when you need the result immediately.
- cast/2: Asynchronous, fire-and-forget. Use it for operations where you don't need a response.
```elixir
# Synchronous - waits for the response
Counter.increment()       # Returns the new value

# Asynchronous - returns immediately
Counter.increment_async() # Returns :ok
```
Real-World Example: Rate Limiter
Let’s build something more practical—a rate limiter that tracks API requests per client:
```elixir
defmodule RateLimiter do
  use GenServer

  @max_requests 100
  @window_ms 60_000  # 1 minute

  # Client API

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, opts)
  end

  def check_rate(pid, client_id) do
    GenServer.call(pid, {:check_rate, client_id})
  end

  # Server Callbacks

  @impl true
  def init(_opts) do
    schedule_cleanup()
    {:ok, %{}}
  end

  @impl true
  def handle_call({:check_rate, client_id}, _from, state) do
    now = System.monotonic_time(:millisecond)

    client_requests =
      state
      |> Map.get(client_id, [])
      |> Enum.filter(fn timestamp -> now - timestamp < @window_ms end)

    if length(client_requests) < @max_requests do
      new_requests = [now | client_requests]
      new_state = Map.put(state, client_id, new_requests)
      {:reply, {:ok, @max_requests - length(new_requests)}, new_state}
    else
      {:reply, {:error, :rate_limited}, state}
    end
  end

  @impl true
  def handle_info(:cleanup, state) do
    now = System.monotonic_time(:millisecond)

    cleaned_state =
      state
      |> Enum.map(fn {client_id, requests} ->
        {client_id, Enum.filter(requests, fn ts -> now - ts < @window_ms end)}
      end)
      |> Enum.reject(fn {_client_id, requests} -> Enum.empty?(requests) end)
      |> Map.new()

    schedule_cleanup()
    {:noreply, cleaned_state}
  end

  defp schedule_cleanup do
    Process.send_after(self(), :cleanup, @window_ms)
  end
end
```
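Using it from an API endpoint might look like this (a minimal sketch using the functions defined above; `"client_42"` is a made-up client identifier):

```elixir
{:ok, limiter} = RateLimiter.start_link()

case RateLimiter.check_rate(limiter, "client_42") do
  {:ok, remaining} ->
    # Proceed with the request; `remaining` is the budget left in this window.
    IO.puts("Allowed; #{remaining} requests left")

  {:error, :rate_limited} ->
    # Reject the request, e.g. respond with HTTP 429.
    IO.puts("Rejected; try again later")
end
```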
Supervision and Fault Tolerance
The real power of GenServer comes when combined with Supervisors. If your GenServer crashes, the supervisor can restart it automatically:
```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      {Counter, 0},
      {RateLimiter, name: RateLimiter}
    ]

    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
```
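You can watch the restart behavior by killing the Counter process by hand. A sketch, assuming the application above is running (the `Process.sleep/1` is a crude way to wait out the restart in a demo; real code would not poll like this):

```elixir
# The registered name resolves to the currently running process.
pid_before = Process.whereis(Counter)
Process.exit(pid_before, :kill)

# Give the supervisor a moment to restart the child.
Process.sleep(100)

# The name now points at a fresh process with freshly initialized
# state: the counter is back to its initial value of 0.
pid_after = Process.whereis(Counter)
true = pid_before != pid_after
```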
Best Practices
- Keep state minimal: Only store what you need. Large states increase memory usage and make restarts slower.
- Use handle_continue/2 for initialization: Don't block init/1 with expensive operations. init/1 runs before start_link/3 returns, so a slow init blocks whoever starts the process, often a supervisor bringing up the whole tree:
```elixir
@impl true
def init(_opts) do
  # Return quickly; defer the expensive work to handle_continue/2,
  # which runs before any other message is processed.
  {:ok, %{}, {:continue, :load_data}}
end

@impl true
def handle_continue(:load_data, state) do
  data = expensive_data_load()
  {:noreply, Map.put(state, :data, data)}
end
```
- Handle timeouts: Protect against slow operations:

```elixir
def get_data(pid) do
  GenServer.call(pid, :get_data, 5_000) # 5 second timeout (also the default)
end
```
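If the timeout expires, GenServer.call/3 exits the calling process. Letting the caller crash and be restarted is often the idiomatic choice, but when the caller must survive, it can catch the exit. A sketch (`get_data_safe/1` is a hypothetical wrapper around the call above):

```elixir
def get_data_safe(pid) do
  try do
    {:ok, GenServer.call(pid, :get_data, 5_000)}
  catch
    # A timed-out call exits with {:timeout, {GenServer, :call, args}}.
    :exit, {:timeout, _} -> {:error, :timeout}
  end
end
```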
- Use terminate/2 for cleanup:

```elixir
@impl true
def terminate(_reason, state) do
  save_state_to_disk(state)
  :ok
end
```
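Be aware that terminate/2 is not guaranteed to run: a brutal kill, for example, skips it entirely. To have it called on a normal supervisor shutdown, trap exits in init/1, and persist anything truly critical as you go rather than only in terminate/2. A sketch:

```elixir
@impl true
def init(opts) do
  # With exits trapped, the supervisor's shutdown signal arrives as a
  # message instead of killing the process outright, so the GenServer
  # gets a chance to run terminate/2 before exiting.
  Process.flag(:trap_exit, true)
  {:ok, opts}
end
```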
Conclusion
These patterns are why gaming companies choose this architecture for multiplayer servers, why telecom providers trust it for carrier-grade reliability, and why fintech platforms use it for payment processing. The ability to handle failures gracefully, automatically isolating and restarting failed components, means a single crashed process rarely becomes an outage your users notice.
Whether you’re building a game that needs to handle launch-day traffic spikes, a trading platform that can’t lose a single transaction, or a messaging system for millions of users, these architectural patterns are your foundation.
At Sajima Solutions, we build fault-tolerant systems for gaming, telecom, and finance across the Gulf region. Contact us to discuss your next project.