[263] Phoenix Is An Interface, Not Your Application, Part 2

Completing our Channels-based collaboration, seeing another "I didn't think this through" mistake, and seeing some benefits that OTP app separation helps us accrue.

Phoenix Is An Interface, Not Your Application, Part 2 [08.23.2016]

In the last episode, we built a basic umbrella application that had a super simple GenServer and Supervisor structure in place. We ended up with an OTP application that could start and supervise counters. It's almost the simplest OTP application you could ever build, so why did we bother with it?

Project

Let's completely ignore that question for now, and instead just make a basic Phoenix application that lets us serve access to these processes via Phoenix Channels. You can follow along with the after_episode_262 tag.

cd apps
mix phoenix.new chocula_rump # because it's the backend...get it?

OK, once again, let's chill out for a second. We just brought in a database. And views. And templates. And controllers. And npm. But we already said our goal was to introduce a websocket interface to our counters. That doesn't require any of that stuff. We made more decisions without thinking about them.

Breathe.

We'll remove that and replace it with a stripped-down Phoenix app.

rm -fr chocula_rump
mix phoenix.new web --no-brunch --no-ecto --no-html
cd web

That's a bit nicer.

Let's write some tests for our channels:

vim test/channels/counter_channel_test.exs
defmodule Web.CounterChannelTest do
  use Web.ChannelCase
  alias Web.CounterChannel
  import TestHelper

  # When we join a counter channel, we want to make sure that the underlying
  # GenServer is started by the supervisor and registered.
  describe "interacting with a counter" do
    setup [:join_counter_channel]

    test "starts the counter on join", %{id: id} do
      assert is_pid(:global.whereis_name(id))
    end
  end

  # We'll make a basic helper for joining a channel, that also creates a new id
  defp join_counter_channel(_) do
    id = new_id()
    {:ok, _, socket} =
      socket("user_id", %{})
      |> subscribe_and_join(CounterChannel, "counter:#{id}")

    {:ok, socket: socket, id: id}
  end
end

We need that new_id function, so we'll copy the test helper into the Phoenix app and add the dependency:

cp ../count_chocula/test/test_helper.exs test
vim mix.exs
  defp deps do
    [
      {:phoenix, "~> 1.2.0"},
      {:phoenix_pubsub, "~> 1.0"},
      {:gettext, "~> 0.11"},
      {:cowboy, "~> 1.0"},
      {:uuid, "~> 1.1"}
    ]
  end
mix deps.get
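
The copied helper itself isn't shown in this episode, so here's a rough stand-in for what a new_id function could look like. This is only a sketch: the module name NewIdSketch is made up, and it uses :crypto from the standard library rather than the uuid dependency, producing a 32-character hex string similar in shape to the ids that show up in our topics.

```elixir
defmodule NewIdSketch do
  # Hypothetical stand-in for the copied TestHelper.new_id/0.
  # Assumption: all we need is a random, collision-unlikely string id.
  def new_id do
    16
    |> :crypto.strong_rand_bytes()
    |> Base.encode16(case: :lower)
  end
end

# Each call yields a fresh 32-character lowercase hex string.
IO.puts(NewIdSketch.new_id())
```

The point of a fresh id per test is that the counters are globally registered, so two tests sharing a name would share state.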

Alright, so let's try to run the test.

There's no CounterChannel, so we'll add one and add the join function.

vim web/channels/user_socket.ex
  channel "counter:*", Web.CounterChannel
vim web/channels/counter_channel.ex
defmodule Web.CounterChannel do
  use Web.Web, :channel

  def join("counter:" <> _id, _payload, socket) do
    {:ok, socket}
  end
end

Run the tests...and now it joins the channel just fine, but no counter was started. Let's start the counter when you join the channel:

defmodule Web.CounterChannel do
  use Web.Web, :channel

  def join("counter:"<>id, _payload, socket) do
    {:ok, _pid} = CountChocula.start_counter(id)
    {:ok, socket}
  end
end

Run the tests...and they fail because this application doesn't know about CountChocula. That's to be expected - we never added it as a dependency! Let's add it as a dependency and add it to our applications list so it gets started when we launch this app:

vim mix.exs
  def application do
    [mod: {Web, []},
     applications: [:phoenix, :phoenix_pubsub, :cowboy, :logger, :gettext, :count_chocula]]
  end
  # ...
  defp deps do
    [
      {:phoenix, "~> 1.2.0"},
      {:phoenix_pubsub, "~> 1.0"},
      {:gettext, "~> 0.11"},
      {:cowboy, "~> 1.0"},
      {:uuid, "~> 1.1"},
      {:count_chocula, in_umbrella: true}
    ]
  end

Now we can run the tests. And they pass. So you can join a channel and we start a process with that name using our CountChocula API, but you can't interact with it presently. Let's fix that. We want the client to get the counter's state as a message when it joins, so that we can show the shared state in our interface after joining. Let's test that:

  describe "interacting with a counter" do
    setup [:join_counter_channel]
    # ...
    test "sends the counter's state on join" do
      assert_push "counter:state", 0
    end
  end

Let's run the test. Of course, it fails. Let's make our channel send this so that it passes!

defmodule Web.CounterChannel do
  use Web.Web, :channel

  def join("counter:"<>id, _payload, socket) do
    {:ok, _pid} = CountChocula.start_counter(id)
    send(self(), :push_state)
    {:ok, socket}
  end

  def handle_info(:push_state, socket) do
    push socket, "counter:state", CountChocula.Server.get_count(some_id)
    {:noreply, socket}
  end
end

OK, so this won't work yet, because we don't know how to get the id. We have this ID if we look at the socket's topic:

  def handle_info(:push_state, socket) do
    some_id = socket.topic |> String.split(":") |> Enum.drop(1) |> hd
    push socket, "counter:state", CountChocula.Server.get_count({:global, some_id})
    {:noreply, socket}
  end

Run the tests...and it almost works:

 ** (ArgumentError) topic and event must be strings, message must be a map

The problem now is that we're asserting we get back a 0 but we must send a map out of our channel. Let's tweak our expectations in the test:

    test "sends the counter's state on join" do
      assert_push "counter:state", %{count: 0}
    end

And support this in the channel:

  def handle_info(:push_state, socket) do
    some_id = socket.topic |> String.split(":") |> Enum.drop(1) |> hd
    push socket, "counter:state", %{ count: CountChocula.Server.get_count({:global, some_id}) }
    {:noreply, socket}
  end

And the tests pass...I kind of don't like that push line, but I won't change it for aesthetics' sake alone. Let's move on.
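
If you want to poke at the topic parsing outside of Phoenix, here's a small sketch. TopicSketch and both function names are hypothetical. The first function is the same split pipeline the channel uses; the second is an alternative that pattern matches on the "counter:" prefix, the same way our join function already does.

```elixir
defmodule TopicSketch do
  # Same pipeline as the channel: split on ":" and take the second segment.
  def id_by_split(topic) do
    topic |> String.split(":") |> Enum.drop(1) |> hd()
  end

  # Alternative: binary pattern match on the fixed "counter:" prefix.
  def id_by_match("counter:" <> id), do: id
end

IO.inspect(TopicSketch.id_by_split("counter:abc123")) # => "abc123"
IO.inspect(TopicSketch.id_by_match("counter:abc123")) # => "abc123"

# The two differ when the id itself contains a colon.
IO.inspect(TopicSketch.id_by_split("counter:a:b"))    # => "a"
IO.inspect(TopicSketch.id_by_match("counter:a:b"))    # => "a:b"
```

Our ids are hex strings with no colons in them, so either approach gives the same answer here.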

Next, we want to be able to increment the shared state from a channel client. We'll introduce a new test:

    test "incrementing the counter", %{id: id, socket: socket} do
      assert 0 = get_count(id)
      ref = push socket, "counter:increment", %{}
      assert_reply ref, :ok
      assert 1 = get_count(id)
    end

So here we're saying we want some way to get the count - I'll just talk directly to the appropriate CountChocula.Server - then we push a message down the socket, wait for a reply, and assert that the count increased by 1. If we run the test, it will fail:

  1) test interacting with a counter incrementing the counter (Web.CounterChannelTest)
     test/channels/chocula_channel_test.exs:17
     No message matching %Phoenix.Socket.Reply{ref: ^ref, status: :ok, payload: %{}} after 100ms.
     The following variables were pinned:
       ref = #Reference<0.0.6.1664>
     Process mailbox:
       %Phoenix.Socket.Message{event: "counter:state", payload: %{count: 0}, ref: nil, topic: "counter:445f9e39bef3427caef62262b520dae5"}
     stacktrace:
       test/channels/chocula_channel_test.exs:20: (test)

OK, so it never responds that it received our counter:increment message, because it doesn't know how to handle it. Let's teach it!

  def handle_in("counter:increment", _, socket) do
    some_id = socket.topic |> String.split(":") |> Enum.drop(1) |> hd
    :ok = CountChocula.Server.increment({:global, some_id})
    {:reply, :ok, socket}
  end
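
The increment test calls a get_count helper that I never showed. In the test file it would presumably just forward to CountChocula.Server.get_count({:global, id}). Here's a self-contained sketch of that idea. SketchCounter is a made-up stand-in for the counter server from the last episode (its internals are an assumption), and ChannelTestHelpers is a hypothetical name.

```elixir
defmodule SketchCounter do
  # Assumed shape of the counter GenServer: an integer that starts at 0.
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, 0, name: name)
  def get_count(server), do: GenServer.call(server, :get_count)
  def increment(server), do: GenServer.call(server, :increment)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_call(:get_count, _from, count), do: {:reply, count, count}
  def handle_call(:increment, _from, count), do: {:reply, :ok, count + 1}
end

defmodule ChannelTestHelpers do
  # Hypothetical test helper: look the counter up by its global name.
  def get_count(id), do: SketchCounter.get_count({:global, id})
end

{:ok, _pid} = SketchCounter.start_link({:global, "demo"})
IO.inspect(ChannelTestHelpers.get_count("demo")) # => 0
:ok = SketchCounter.increment({:global, "demo"})
IO.inspect(ChannelTestHelpers.get_count("demo")) # => 1
```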

If we run the tests now, they work just fine. Let's clean that one bit up and add a function we can hand socket.topic to and get back our {:global, some_id}:

  def handle_info(:push_state, socket) do
    push socket, "counter:state", %{ count: CountChocula.Server.get_count(server_id(socket.topic)) }
    {:noreply, socket}
  end

  def handle_in("counter:increment", _, socket) do
    :ok = server_id(socket.topic) |> CountChocula.Server.increment()
    {:reply, :ok, socket}
  end

  defp server_id(topic_string) do
    {
      :global,
      topic_string |> String.split(":") |> Enum.drop(1) |> hd
    }
  end

OK, so that cleans some stuff up a little bit. So where have we gotten?

We have a channels-based interface to our GenServer. It doesn't know that it's a GenServer, so if we want to change our implementation later, that's fine. It does know that it's globally registered with a given name. I hate that. That's completely an implementation detail and it's dirty for it to leak out. We're also having to specify which process id we want to talk to, but our client doesn't and shouldn't know anything about processes, so now our channels interface is having to shim this knowledge in.

Let's modify CountChocula so that you can call something like CountChocula.increment(id) and it can be the layer that knows how to find the process it wants to talk to based on that.

We'll start off presuming this API exists in our channels code:

defmodule Web.CounterChannel do
  use Web.Web, :channel

  def join("counter:"<>id, _payload, socket) do
    {:ok, _pid} = CountChocula.start_counter(id)
    send(self(), :push_state)
    {:ok, socket}
  end

  def handle_info(:push_state, socket) do
    push socket, "counter:state", %{ count: CountChocula.get_count(counter_id(socket.topic)) }
    {:noreply, socket}
  end

  def handle_in("counter:increment", _, socket) do
    :ok = counter_id(socket.topic) |> CountChocula.increment()
    {:reply, :ok, socket}
  end

  defp counter_id(topic_string) do
    topic_string |> String.split(":") |> Enum.drop(1) |> hd
  end
end

We know this won't work. Let's go back to count_chocula and add tests for this bit in the count_chocula_test.exs:

defmodule CountChoculaTest do
  # ...
  test "incrementing a counter by id string" do
    id = new_id()
    {:ok, _pid} = id |> CountChocula.start_counter()
    :ok = id |> CountChocula.increment()
    assert 1 = CountChocula.get_count(id)
  end
end

We'll make this work:

defmodule CountChocula do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    children = [
      supervisor(CountChocula.Supervisor, [])
    ]

    opts = [strategy: :one_for_one]
    Supervisor.start_link(children, opts)
  end

  defdelegate start_counter(id), to: CountChocula.Supervisor

  def increment(id) do
    CountChocula.Server.increment({:global, id})
  end

  def get_count(id) do
    CountChocula.Server.get_count({:global, id})
  end
end

OK, so now this all works just fine. Do our channels?

mix test

Sweet. So I wanted to point out that moving our stuff out to a separate application gave us an obvious place that this API ought to live, because it makes sense that another OTP application shouldn't know about the internals of one that it depends on. If we didn't have a separate OTP application the best we could have done would have been to put these on the CounterSupervisor itself...that's not the end of the world, but I think it gives us more flexibility going forward.

But the fact that we were able to do this also shows another flaw in our design that isn't immediately apparent without taking this time. CountChocula knows that these processes are globally registered a certain way, but it's not the one that registered them. Consequently, if we change the CountChocula.Server module, we will have to change this module as well. This is a connascence that we'd like to avoid. The fix is that the Application module itself should be responsible for registering the GenServers that it spawns, and then all of this knowledge lives and changes in one place. Let's do that:

defmodule CountChocula do
  # ...
  def start_counter(id) do
    case :global.whereis_name(id) do
      :undefined ->
        {:ok, pid} = CountChocula.Supervisor.start_counter()
        :yes = :global.register_name(id, pid)
        {:ok, pid}
      pid ->
        {:ok, pid}
    end
  end

  def increment(id) do
    CountChocula.Server.increment({:global, id})
  end

  def get_count(id) do
    CountChocula.Server.get_count({:global, id})
  end
end

defmodule CountChocula.Supervisor do
  # ...
  def start_counter do
    Supervisor.start_child(@name, [])
  end
  # ...
end
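
To play with the lookup-then-register flow without the rest of the umbrella, here's a minimal sketch. RegistrySketch is a made-up module, and an unsupervised Agent stands in for the supervised counter process; both are assumptions for illustration only.

```elixir
defmodule RegistrySketch do
  # Look the name up first; only start and register a process if it's missing.
  def start_counter(id) do
    case :global.whereis_name(id) do
      :undefined ->
        # Assumption: a bare Agent stands in for Supervisor.start_child/2.
        {:ok, pid} = Agent.start(fn -> 0 end)
        :yes = :global.register_name(id, pid)
        {:ok, pid}

      pid when is_pid(pid) ->
        {:ok, pid}
    end
  end
end

{:ok, pid} = RegistrySketch.start_counter("sketch-counter")
# Starting again with the same id hands back the same process.
{:ok, ^pid} = RegistrySketch.start_counter("sketch-counter")
```

One thing worth knowing: :global.register_name returns :no if someone else registered the name between our lookup and our registration, so the :yes match would crash in that case. I'm not handling that here.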

So is this better? I'm actually not sure - it's possible it's way smarter to use via tuples and let GenServers do this for us - but I kind of think the fact that I can start a process and the fact that it gets registered make a lot more sense living outside of that process. Aside from the registration, it made no sense for the GenServer itself to think of itself as having a name. So for now I'll assume this is good. But my point is that by separating the applications I think it's a lot easier to reason about this sort of change. I can run my whole test suite on the count_chocula application now and find out which things broke in this application with this change:

mix test

OK, so what broke is everything underneath the application that expects to get an ID. Let's go fix up the tests:

vim test/count_chocula/server_test.exs
defmodule CountChocula.ServerTest do
  use ExUnit.Case
  alias CountChocula.Server

  test "starting the server" do
    assert {:ok, _pid} = Server.start_link()
  end

  test "getting the count" do
    assert {:ok, pid} = Server.start_link()
    assert 0 = Server.get_count(pid)
  end

  test "incrementing" do
    assert {:ok, pid} = Server.start_link()
    :ok = Server.increment(pid)
    assert 1 = Server.get_count(pid)
  end
end
vim test/count_chocula/supervisor_test.exs

...huh. Really we just want to make sure we can start a child at all in this test:

defmodule CountChocula.SupervisorTest do
  use ExUnit.Case, async: true

  test "creating new counter" do
    assert {:ok, _pid} = CountChocula.Supervisor.start_counter()
  end
end

So that's simpler, and we were able to remove the TestHelper from a couple of test files as well. The supervisor test used to ensure that starting a counter a second time with the same name gives back the same process. That guarantee now belongs to the application's API, so we should move the test up there:

defmodule CountChoculaTest do
  # ...
  test "starting a counter when one already exists by that name" do
    the_id = new_id()
    assert {:ok, pid} = CountChocula.start_counter(the_id)
    assert {:ok, ^pid} = CountChocula.start_counter(the_id)
  end
  # ...
end

Alright, so we have the same guarantees as before, but we moved the knowledge of registration to the application's public interface. This had the side effect of simplifying our other tests: they no longer need to create uuids to avoid stepping on global state. We only need uuids when testing a single module, which is the only one that introduces global state.

I think this is better. I don't think I'd end up here if I didn't separate my applications. Maybe I'm wrong about that. I just wanted to share a nice side effect that came out of separating these applications, for me.

Just because I haven't covered umbrella projects before to my knowledge, I figure it's worth pointing out that we can go to the umbrella itself and run the tests for every application:

cd ../..
mix test

So the only thing left is to verify that collaboration from different channel clients propagates as expected. We want all the connected clients to find out what the new state is when any of them makes a change. We're just going to assert that the state is broadcast after it receives an increment, because we can trust that Phoenix Channels work as expected, so we don't really need to connect multiple clients to prove this out:

cd apps/web
vim test/channels/counter_channel_test.exs
defmodule Web.CounterChannelTest do
  # ...
  describe "interacting with a counter" do
    setup [:join_counter_channel]

    # ...
    test "incrementing the counter", %{id: id, socket: socket} do
      assert 0 = get_count(id)
      ref = push socket, "counter:increment", %{}
      assert_reply ref, :ok
      assert 1 = get_count(id)
      assert_broadcast "counter:state", %{count: 1}
    end
  end
  # ...
end

If we run the test, it fails. This is because we aren't sending out the state, so a collaboration wouldn't be able to show state updates from other users yet. Let's fix it:

defmodule Web.CounterChannel do
  # ...
  def handle_in("counter:increment", _, socket) do
    :ok = counter_id(socket.topic) |> CountChocula.increment()
    broadcast! socket, "counter:state", %{ count: CountChocula.get_count(counter_id(socket.topic)) }
    {:reply, :ok, socket}
  end
  # ...
end

Alright, so now any time an action takes place, we send out the state to everyone. We'll call this done for now, though I'll discuss a few things about these collaboration servers at another time.
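
If you're curious what "send the state to everyone" amounts to, here's a sketch that uses Elixir's built-in Registry as a stand-in for Phoenix's PubSub. To be clear, this is not how Phoenix implements broadcast!, and SketchPubSub and BroadcastSketch are made-up names. It just shows two subscribed processes both receiving the new state.

```elixir
{:ok, _} = Registry.start_link(keys: :duplicate, name: SketchPubSub)

defmodule BroadcastSketch do
  # Subscribe the calling process to a topic.
  def subscribe(topic), do: Registry.register(SketchPubSub, topic, nil)

  # Send {event, payload} to every process subscribed to the topic.
  def broadcast(topic, event, payload) do
    Registry.dispatch(SketchPubSub, topic, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {event, payload})
    end)
  end
end

topic = "counter:demo"
{:ok, _} = BroadcastSketch.subscribe(topic)

# A second "client" in its own process, reporting back what it received.
parent = self()

spawn_link(fn ->
  {:ok, _} = BroadcastSketch.subscribe(topic)
  send(parent, :subscribed)

  receive do
    {"counter:state", payload} -> send(parent, {:other_client_got, payload})
  end
end)

# Wait until the second client has subscribed before broadcasting.
receive do
  :subscribed -> :ok
end

:ok = BroadcastSketch.broadcast(topic, "counter:state", %{count: 1})

own =
  receive do
    {"counter:state", payload} -> payload
  after
    1_000 -> :timeout
  end

other =
  receive do
    {:other_client_got, payload} -> payload
  after
    1_000 -> :timeout
  end

IO.inspect({own, other})
```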

Summary

So what's the point? Wasn't this pretty similar to Episode 259?

Kind of. But. Our chat server in that episode is baked into our Phoenix application and tied to Ecto. I almost want to say this is fine for prototyping, but increasingly I don't know that I feel that way either, because I almost never see this sort of thing be undone. There's no compelling reason that my chat server can only talk over websockets. But as it stands, in that episode we baked it into both Ecto and Phoenix Channels, and we'll almost certainly start to intertwine those pieces such that it becomes hard for us to disentangle them later.

A lot of people say you really shouldn't care about this sort of thing. I know it's the DHH philosophy.

Here's the thing. I've built big applications. One of those handled billions of dollars per year of transactions. It was a Rails monolith. When the business was scaling rapidly, it was architecturally nearly impossible to make it better and still be reactive to business demands regarding features.

I ran a consultancy. I saw a lot of Rails apps that were built by people that ran into problems keeping them maintainable, and I built a lot of the same. I am of the opinion that the world's better off if you separate things into chunks that make it easy to manage different domains in complete isolation and make strict APIs between them.

OTP applications make this easy. Elixir's umbrella projects make it even easier. We don't have any excuses to end up in a situation where no one knows how to architect their applications outside of a framework.

Phoenix is just HTTP+WS

Phoenix is a way to interact with a user over HTTP and WebSockets. In both cases it should have exactly two responsibilities:

  • When it receives data from a user, it validates that it matches certain criteria, then turns it into events inside of your actual application domain.
  • When it receives data from your application, it converts that data into the sort of thing that your user can understand and sends it out to them.

If you find it doing much more than that, then you should consider fixing your abstractions. We'll all end up better off if this becomes common.
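
Those two responsibilities can be sketched as a pair of pure functions. This is just an illustration: InterfaceSketch, decode/3, encode/1, and the tuple shapes are all hypothetical, not anything Phoenix provides.

```elixir
defmodule InterfaceSketch do
  # Inbound: validate what the user sent, then turn it into a domain event.
  def decode("counter:increment", "counter:" <> id, payload)
      when is_map(payload) and id != "" do
    {:ok, {:increment, id}}
  end

  def decode(_event, _topic, _payload), do: {:error, :invalid_message}

  # Outbound: turn a domain value into a payload the client understands.
  def encode({:count, count}) when is_integer(count), do: %{count: count}
end

IO.inspect(InterfaceSketch.decode("counter:increment", "counter:abc", %{}))
# => {:ok, {:increment, "abc"}}
IO.inspect(InterfaceSketch.decode("counter:reset", "counter:abc", %{}))
# => {:error, :invalid_message}
IO.inspect(InterfaceSketch.encode({:count, 3}))
# => %{count: 3}
```

Everything between decode and encode belongs to your application, and it doesn't need to know a websocket was involved.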

Thanks for letting me rant. See you soon!

Resources