
Ecto Validation with Changesets [04.17.2017]

In the last episode, we saw how to get started with Ecto. Today we'll look at how Changesets help you validate your data with a lot of flexibility, while still taking advantage of database-level consistency guarantees. Let's get started.

Project

Testing Setup

We'll start by writing some tests. Here's a basic shell with a single simple test in place:

mkdir -p test/firestorm_data/schema
vim test/firestorm_data/schema/user_test.exs
defmodule FirestormData.UserTest do
  alias FirestormData.{User, Repo}
  use ExUnit.Case

  # We set our test adapter to use the sandbox. This allows us to checkout
  # connections in our tests from a sandbox, so concurrent tests won't step on
  # one another and so they're automatically cleaned up after each test.
  setup do
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(Repo)
  end

  test "creating a user" do
    josh = %User{name: "Josh Adams"}
    assert {:ok, _} = Repo.insert josh
  end
end

If we try to run it, it will fail because there's no database. We could create the database by hand, but then we'd have to remember to run migrations before every test run. Instead, we can use a Mix alias to change what the mix test task does.

vim mix.exs
defmodule FirestormData.Mixfile do
  use Mix.Project

  def project do
    [
      # ...
      aliases: aliases()
    ]
  end
  # ...
  defp aliases do
    [
      test: ["ecto.drop --quiet", "ecto.create --quiet", "ecto.migrate", "test"]
    ]
  end
end

Now when we run mix test it will drop the database, create it, migrate it, and finally run the tests.

Next, since we're using the sandbox we should open up the test helper and configure it to require manual checkouts:

vim test/test_helper.exs
ExUnit.start()
Ecto.Adapters.SQL.Sandbox.mode(FirestormData.Repo, :manual)

Now we can run the test again, and it should pass.
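
The sandbox only works if the repo's connection pool is set to it in the test configuration. The comment in the test module alludes to this, but the configuration itself isn't shown here. A minimal sketch, assuming the OTP application is named :firestorm_data (a guess based on the module names, so adjust it to match your mix.exs):

```elixir
# config/test.exs
# Assumption: the OTP app is :firestorm_data; the rest of the repo settings
# (database name, credentials, hostname) live elsewhere in this file.
import Config # on Elixir versions before 1.9, write `use Mix.Config` instead

config :firestorm_data, FirestormData.Repo,
  pool: Ecto.Adapters.SQL.Sandbox
```

With the pool set this way, the checkout call in the setup block hands each test its own connection, and everything the test wrote is rolled back when the test finishes.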

Changesets

Having set our application up for proper testing with Ecto, let's move on to learning about Changesets.

Validations

In our test, we're inserting a User that has no email address. We'd like to make that impossible. Let's first implement a changeset function on our schema:

vim lib/firestorm_data/schema/user.ex
defmodule FirestormData.User do
  # Import Ecto.Changeset so cast/3, validate_required/2 and friends can be
  # called without the module prefix
  import Ecto.Changeset
  # ...
  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:username, :name, :email])
    |> validate_required([:email])
  end
end

This function takes a user struct and a map of params. If the email is missing, the changeset it returns is marked invalid and carries an error on the email field.

Let's look at our test, now that we're requiring an email. If you run it now, it will still succeed at inserting a User without an email address. This is because we're inserting the struct directly, which bypasses the changeset function entirely. Instead, we should pass a User through the changeset and insert the changeset into the Repo. Let's make the change:

defmodule FirestormData.UserTest do
  # ...
  test "creating a user" do
    josh_changeset =
      %User{}
      |> User.changeset(%{name: "Josh Adams"})

    assert {:ok, _} = Repo.insert josh_changeset
  end
end

Now when we run the test, we get a test failure and we can see the error that is returned:

bash-3.2$ mix test test/firestorm_data/schema/user_test.exs
Compiling 1 file (.ex)

02:26:39.937 [info]  == Running FirestormData.Repo.Migrations.CreateUsers.change/0 forward

02:26:39.937 [info]  create table users

02:26:39.942 [info]  == Migrated in 0.0s


  1) test creating a user (FirestormData.UserTest)
     test/firestorm_data/schema/user_test.exs:9
     match (=) failed
     code:  {:ok, _} = Repo.insert(josh_changeset)
     right: {:error,
             #Ecto.Changeset<action: :insert, changes: %{name: "Josh Adams"},
              errors: [email: {"can't be blank", [validation: :required]}],
              data: #FirestormData.User<>, valid?: false>}
     stacktrace:
       test/firestorm_data/schema/user_test.exs:14: (test)

Let's duplicate this test and ensure that we cannot add a user without an email:

  test "creating a user without an email" do
    josh_changeset =
      %User{}
      |> User.changeset(%{name: "Josh Adams"})

    assert {:email, {"can't be blank", [validation: :required]}} in josh_changeset.errors
  end

Note that we can check validations without trying to insert the user into the database.
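
To make that point concrete, here is a standalone sketch that validates a changeset with no repo or database at all. It assumes Elixir 1.12 or later (for Mix.install/1) and Ecto 3.x, and the Sketch.User module is a hypothetical stand-in for FirestormData.User that uses an embedded schema so no table is needed:

```elixir
# Assumption: Elixir 1.12+ so Mix.install/1 can fetch Ecto for a one-off script.
Mix.install([{:ecto, "~> 3.0"}])

defmodule Sketch.User do
  use Ecto.Schema
  import Ecto.Changeset

  # embedded_schema needs no table, so nothing here touches a database.
  embedded_schema do
    field :username, :string
    field :name, :string
    field :email, :string
  end

  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:username, :name, :email])
    |> validate_required([:email])
  end
end

changeset = Sketch.User.changeset(%Sketch.User{}, %{name: "Josh Adams"})

# Collapse the error keyword list into a map of field => [messages].
messages = Ecto.Changeset.traverse_errors(changeset, fn {msg, _opts} -> msg end)

IO.inspect(changeset.valid?, label: "valid?")
IO.inspect(messages, label: "messages")
```

Ecto.Changeset.traverse_errors/2 is handy when you want plain messages for a form or an API response rather than the raw error tuples.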

Now we can fix the original test to include an email and assert that we can create the user successfully:

  test "creating a user" do
    josh_changeset =
      %User{}
      |> User.changeset(%{name: "Josh Adams", email: "josh@dailydrip.com"})

    assert {:ok, _} = Repo.insert josh_changeset
  end

This ensures that an email address is required, but we can presently insert invalid email addresses. Let's write a test that exposes this; it will fail until we add a format check:

  test "creating a user with an invalid email" do
    josh_changeset =
      %User{}
      |> User.changeset(%{name: "Josh Adams", email: "notanemail"})

    refute josh_changeset.valid?
  end

To fix this, we can use validate_format to ensure the format of the data matches our expectation. We won't write a fully RFC-compliant email regular expression here. Instead, we'll simply require that an @ sign exists.

defmodule FirestormData.User do
  # ...
  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:username, :name, :email])
    |> validate_required([:email])
    |> validate_format(:email, ~r/@/)
  end
end

Now if we run the test, we can see that it passes.
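
It's worth seeing just how loose the ~r/@/ check is. The sketch below compares it with a slightly stricter pattern. The stricter pattern is my own illustration, not something from the episode, and it is still nowhere near RFC-compliant:

```elixir
# The pattern used in the changeset: any string containing an @ passes.
loose = ~r/@/

# Hypothetical stricter pattern (an assumption, not from the episode):
# exactly one @, with at least one non-space character on each side.
stricter = ~r/^[^@\s]+@[^@\s]+$/

for candidate <- ["josh@dailydrip.com", "notanemail", "@", "a@b@c"] do
  IO.inspect(
    {candidate, Regex.match?(loose, candidate), Regex.match?(stricter, candidate)},
    label: "{input, loose, stricter}"
  )
end
```

Either pattern can be passed to validate_format. Which one you want depends on how much you trust a later confirmation email to catch bad addresses.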

Constraints

So that covers basic data validation, but some rules can't be checked in isolation; they can only be verified by talking to the database. Ecto calls these constraints. The most common constraint I see is email uniqueness. Let's write a test asserting that two users in our database cannot have the same email address:

defmodule FirestormData.UserTest do
  # ...
  test "creating two users with the same email address" do
    josh_changeset =
      %User{}
      |> User.changeset(%{name: "Josh Adams", email: "josh@dailydrip.com"})

    assert {:ok, _} = Repo.insert(josh_changeset)
    assert {:error, _} = Repo.insert(josh_changeset)
  end
end

If we run the test right now, it will fail: nothing stops the second insert from succeeding, because we haven't added a constraint yet. Uniqueness can't be verified reliably without database support, so let's first add a new migration that adds a unique_index on this field in our database:

mix ecto.gen.migration add_user_email_uniqueness_constraint
defmodule FirestormData.Repo.Migrations.AddUserEmailUniquenessConstraint do
  use Ecto.Migration

  def change do
    create unique_index(:users, [:email])
  end
end

Now if we run the tests, this one still fails, and not in a great way: the insert raises an exception instead of returning an error tuple:

  1) test creating two users with the same email address (FirestormData.UserTest)
     test/firestorm_data/schema/user_test.exs:33
     ** (Ecto.ConstraintError) constraint error when attempting to insert struct:

         * unique: users_email_index

     If you would like to convert this constraint into an error, please
     call unique_constraint/3 in your changeset and define the proper
     constraint name. The changeset has not defined any constraint.

Let's add a unique_constraint to our changeset. With it in place, a violation of the unique index no longer raises; the insert returns an error tuple with the violation recorded as an error on the changeset:

defmodule FirestormData.User do
  # ...
  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:username, :name, :email])
    |> validate_required([:email])
    |> validate_format(:email, ~r/@/)
    |> unique_constraint(:email)
  end
end

Now when we run the test, it passes. Let's make it a bit more explicit, so we can see the error that gets returned:

  test "creating two users with the same email address" do
    josh_changeset =
      %User{}
      |> User.changeset(%{name: "Josh Adams", email: "josh@dailydrip.com"})

    assert {:ok, _} = Repo.insert(josh_changeset)
    {:error, new_changeset} = Repo.insert(josh_changeset)
    assert {:email, {"has already been taken", []}} in new_changeset.errors
  end

Nice! It's worth pointing out that constraints are only checked when there are no validation errors. This is because we have to attempt to insert the record into the database in order to see if the constraints are satisfied, and we don't wish to insert data we already know fails our validations.
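
One way to see that a constraint is only a declaration until the insert happens is to build the changeset without any repo and look at what it carries. This is a sketch under a few assumptions: Elixir 1.12+ for Mix.install/1, Ecto 3.x, and a hypothetical Sketch.User module standing in for FirestormData.User:

```elixir
# Assumption: Elixir 1.12+ so Mix.install/1 can fetch Ecto for a one-off script.
Mix.install([{:ecto, "~> 3.0"}])

defmodule Sketch.User do
  use Ecto.Schema
  import Ecto.Changeset

  # A regular (non-embedded) schema, because unique_constraint derives the
  # default index name from the table name. No repo or database is started.
  schema "users" do
    field :name, :string
    field :email, :string
  end

  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:name, :email])
    |> validate_required([:email])
    |> unique_constraint(:email)
  end
end

changeset = Sketch.User.changeset(%Sketch.User{}, %{email: "josh@dailydrip.com"})

# The constraint is stored as metadata; it produces no error by itself.
declared = Enum.map(changeset.constraints, &{&1.type, &1.field, &1.constraint})

IO.inspect(changeset.valid?, label: "valid?")
IO.inspect(changeset.errors, label: "errors")
IO.inspect(declared, label: "declared constraints")
```

The changeset is valid and has no errors even though a duplicate may already exist. The index name it records is the one the database reported in the ConstraintError earlier, which is how Ecto matches a violation to the :email field at insert time.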

Different changesets for different contexts

Ecto's changesets offer more flexibility than ORMs that have global validations, such as Rails' ActiveRecord. Since our changeset is just a function, we could define an admin_changeset function that is called when admins are interacting with the data and that skips certain validations. That's a simple example. Very often I'll see people add an extra argument to their changeset function that defines the type, and this seems reasonable as well.

The point is, due to the design of Ecto, you can pick and choose how you handle validations to support different scenarios, rather than being constrained to a single validation setup for all contexts.
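
As a rough sketch of the admin_changeset idea, the example below defines two changeset functions on one schema. Nothing here comes from the Firestorm codebase: the Sketch.User module and the rule that admins may omit the email are hypothetical, chosen only to show the shape. It assumes Elixir 1.12+ and Ecto 3.x:

```elixir
# Assumption: Elixir 1.12+ so Mix.install/1 can fetch Ecto for a one-off script.
Mix.install([{:ecto, "~> 3.0"}])

defmodule Sketch.User do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :name, :string
    field :email, :string
  end

  # Regular path: email is required and must contain an @.
  def changeset(user, params \\ %{}) do
    user
    |> cast(params, [:name, :email])
    |> validate_required([:email])
    |> validate_format(:email, ~r/@/)
  end

  # Hypothetical admin path: the email may be left out, but any email that
  # is supplied still has to pass the format check.
  def admin_changeset(user, params \\ %{}) do
    user
    |> cast(params, [:name, :email])
    |> validate_format(:email, ~r/@/)
  end
end

params = %{name: "Josh Adams"}

IO.inspect(Sketch.User.changeset(%Sketch.User{}, params).valid?, label: "changeset")
IO.inspect(Sketch.User.admin_changeset(%Sketch.User{}, params).valid?, label: "admin_changeset")
```

The caller picks the function that fits the context, so the schema itself stays free of conditional validation logic.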

Summary

In today's episode, we saw how to use Ecto changesets to validate our data and combine database-level data integrity with nice Elixir-side error handling. Changesets are fantastic, and have more flexibility than we could possibly cover in a single drip. See you soon!

Resources