
Integrating GitHub Issues as a Blogging Platform

Building a simple blog is my go-to project for trying out new backend or frontend technologies. While learning Erlang/Elixir at university, I created a prototype using an ad-hoc escript that regenerated the site via a cronjob.

It was quick, dirty, and fun. I started with concatenating handwritten .html files and building an index page. Eventually, it evolved to compiling .md files by shelling out to pandoc.

However, as I began drafting posts, I realized that some features were missing. Post tagging was limited to the directory structure, and syntax highlighting wasn't working well. I wanted a platform that allowed me to make edits from anywhere, including devices without a command-line or git.

I needed to be able to fix typos on the spot. After careful consideration, I decided to implement my blog by abusing GitHub as a backend.

GitHub is a free service with a nice web UI that provides excellent syntax highlighting, customizable tagging, pinning, and even comments for future extensibility.

Implementation

GitHub offers an open API with several options available. While I haven't really used it much beyond tinkering with language tutorials, I've heard great things about it.

Out of all the APIs provided, I opted for their GraphQL API since I'm a big fan of GraphQL and have prior experience with it, having worked on consulting projects for companies that use it on a large scale.

Initially, I stored my posts as files in my repo, and my backend server would crawl through directories based on the URL slug (e.g., localhost:4000/some/post would look for $MY_REPO/some/post.*). This approach was solid, but when I tried to query files through the repository { object(...) } API, it got a little gnarly 😅. I assume they designed it that way to avoid extensive post-processing on the underlying git protocol.

Honestly, I'd much rather query objects through repository { files("somePath") { ... }} instead, since it would make my server logic more manageable. The object field required me to pass instructions that git rev-parse could understand, and that's no fun 🤮.

After digging around, I found out that querying GitHub Issues could be done from the repository node and gave me the freedom to do whatever I wanted. So, I scrapped everything I had done so far and started fresh with this second iteration.

If you're interested, GitHub's GraphiQL explorer is awesome for playing around and getting a feel for GraphQL (if you're not already familiar with it) and understanding what's happening behind the scenes on my blog.

For instance, the basic query I'm doing is:

query($owner: String!, $name: String!) { 
  repository(owner: $owner, name: $name) {
    nameWithOwner,
    issues(first: 10, orderBy: {field: CREATED_AT, direction: DESC}) {
      nodes {
        number
        bodyHTML
        createdAt
        updatedAt
        title
        labels(first: 10) {
          nodes {
            name
          }
        }     
      }
      totalCount
    }
  }
}

This snippet gets the first ten issues ordered by CREATED_AT descending from a given repository. If you were concerned that anyone can open issues in a given repo, you could further constrain the query with something like:

issues(filterBy: {createdBy: $owner}, first: 10, orderBy: {field: CREATED_AT, direction: DESC})

Absinthe is an absolutely amazing GraphQL server implementation for Elixir, but since my site consumes GitHub's API rather than exposing its own, a server library wasn't what I needed; I needed a client. Whenever I've worked on frontend code interacting with GraphQL, I've always used a popular library called Apollo, but (at least at the time of implementation) there was no Apollo library for Elixir.

I ended up rolling my own simple GraphQL client because I didn't think it would be too complex. I'm sure proper Elixir client libraries will appear as time goes on, but honestly, it was easy enough to just POST to GitHub's API endpoint. For future reference, I ended up with something similar to the following:

def fetch_issues_for_repo(owner, repo) do
  # GitHub's GraphQL API requires authentication; here the token is assumed to
  # live in a GITHUB_TOKEN environment variable.
  %HTTPoison.Response{body: body} =
    HTTPoison.post!(
      "https://api.github.com/graphql",
      Jason.encode!(%{
        query: """
        query {
          repository(owner: \"#{owner}\", name: \"#{repo}\") {
            issues(first: 10, orderBy: {field: CREATED_AT, direction: DESC}) {
              nodes {
                title
                createdAt
                updatedAt
                body
                author {
                  login
                }
              }
            }
          }
        }
        """
      }),
      [{"Authorization", "bearer " <> System.get_env("GITHUB_TOKEN")}]
    )

  body
  |> Jason.decode!()
  |> Map.get("data")
  |> Map.get("repository")
  |> Map.get("issues")
  |> Map.get("nodes")
end

This gives me some HTML (and other metadata like tags, etc.) which I essentially insert ad-hoc into my own Phoenix templates, and it just works. If you're reading this post, you can see this approach works pretty well (I'm unlikely to change it going forwards 😜)

One downside I can see is that you're limited to a few thousand queries per hour by GitHub's rate limiting, so as time goes on, I'll probably cache posts in an ETS table or something so that I'm not wasting my quota whenever someone clicks on a post.
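
For reference, the caching layer doesn't need to be anything fancy. Here's a minimal sketch of the kind of thing I have in mind (the module name, table name, and one-hour TTL are all hypothetical):

defmodule Blog.PostCache do
  @table :blog_post_cache
  @ttl_ms :timer.hours(1)

  # Create the ETS table once at application start
  def init do
    :ets.new(@table, [:named_table, :public, read_concurrency: true])
  end

  # Return the cached value for `key`, or compute and cache it via `fun`
  def fetch(key, fun) do
    case :ets.lookup(@table, key) do
      [{^key, value, inserted_at}] ->
        if System.monotonic_time(:millisecond) - inserted_at < @ttl_ms do
          value
        else
          put(key, fun.())
        end

      [] ->
        put(key, fun.())
    end
  end

  defp put(key, value) do
    :ets.insert(@table, {key, value, System.monotonic_time(:millisecond)})
    value
  end
end

A call like Blog.PostCache.fetch({:issues, owner, repo}, fn -> fetch_issues_for_repo(owner, repo) end) would then only hit GitHub when the cache is cold or stale.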

Just for a bit of fun, you can see the source of this post. Features I'll definitely add going forwards are reactions to the main post, which I can display nicely on the index page, and comments!

I'll continue playing around with my blog and the technology driving it, as is tradition for software engineers, but all-in-all, this is a pretty comfy blogging setup 😈

Edit: I'm still using this approach after a few months, but I definitely needed to implement caching. I woke up one night to my Graph API rate limits exceeded and couldn't do anything about it but wait! That's a good problem to have though 😉

Edit: One fancy thing I've seen is adding estimated reading times to blog posts. This proved a little hard because when I request blog contents from GitHub, I get them all as one monolithic block, which I inject into my templates...

To get it working, I ended up processing the HTML returned by GitHub to calculate a reading time, injecting that reading time into the template, and thoroughly abusing CSS Grid to re-order the flow of the page as follows:

sub + h1 {
  grid-area: afterHeaderBeforeContent;
}

h1 + p:first-of-type {
  grid-area: afterContent;
}
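
The calculation itself is simple; here's a sketch of the idea (the 200 words-per-minute figure is an assumption, and the tag stripping is a crude regex rather than a real HTML parser):

defmodule Blog.ReadingTime do
  @words_per_minute 200

  # Estimate reading time in whole minutes from GitHub's rendered HTML
  def estimate(html) when is_binary(html) do
    word_count =
      html
      |> String.replace(~r/<[^>]*>/, " ")
      |> String.split(~r/\s+/, trim: true)
      |> length()

    max(1, div(word_count + @words_per_minute - 1, @words_per_minute))
  end
end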

Edit: Updated the writing since it was full of typos. This draft should be much clearer; my writing has really improved thanks to this blog... I highly recommend starting your own blog, both for the fun and flexibility it enables and as a great place to improve your writing skills!

Setting up a development environment on Windows 10

I find myself writing a lot of code, both professionally and as a hobby. When I write code, 99% of the time, I'm targeting a Unix-like environment.

I also find myself doing a lot of other things, such as playing games or streaming some TV. It is my sincere opinion that for these things, Windows is the operating system to beat: monopolistic malpractices or not, 4K Netflix only works on Windows 10, as do the majority of games I play which I use to keep in contact with old friends.

Even when I'm not using my powerful workstation with its beefy GPU to play games, I frequently use various models of the Microsoft Surface lineup because their form factor is really appealing to me, especially for a machine on the go. Linux, without surprise, does not run very well at all on these devices 😅

None of this is Linux's fault, of course. If I could use Linux on everything, I would, but the year of the Linux desktop is certainly not 2019 for me. It's not for lack of trying either: I've been a longtime Linux user, and I've tried many approaches to get the best of both worlds. I hope that this post can share what I've learnt along the way 💪

Popular (or not) approaches I've tried

As I've said, I've tried a lot of different ways to combine the leisure of Windows and the development potential (and customisability) of Linux. Some of the ones I actually stuck with for a while are outlined in the sections below:

Windows-native development

For a long time, primarily during the first half of my University experience, I used Windows as a host OS and developed exclusively on/for Windows. If I were writing Python, or Erlang, or OCaml, I'd use the Windows-native builds of those programming languages/compilers.

This felt natural to me because I started my developer journey when I was nine and playing around with Game Maker. It also worked relatively well for interpreted languages, as well as for C/C++ so long as I was developing against a Windows API. Everything kind of works, but a few weird things also don't work how you'd expect; maybe because most developers writing in languages such as Python or Erlang aren't using the Windows builds of their toolchains? 🤔 I assume if I wanted to stick to C# or .NET development, say, things would go a lot smoother, but alas.

When I couldn't manage to compile some NIFs for the BEAM (which I needed for a particular project at University), I ended up looking for another solution.

Linux VM on a Windows host

After trying to go all-in with native Windows development and being bitten by a couple of small and weird issues, I decided to do what anyone in my situation might and download a virtual machine to run Linux on.

The first thing I tried was VirtualBox, which ran pretty slow and had awful input latency when doing anything. I think this might have been a hardware issue.

VMware Workstation worked a lot better but still suffered from weird quirks like the virtual sound device not working correctly.

If not for the sound device issue, I probably could have lived with this setup. VMware could be maximised, and all of its window decorations/toolbars could be hidden. It had some rudimentary 3D acceleration support as well, and things generally ran very smoothly. Unfortunately, if I were to do this, I'd want the sound to work as expected; it was also expensive, both monetarily and battery-wise, for my non-workstation machines.

The Windows ricing community: Cygwin & bblean

In 2015, I really fell into the Windows ricing community. To eke out the customisation that Linux gave me on Windows, I hacked around with hundreds of custom AutoHotkey scripts for system automation and custom keybindings, custom theming, custom shells to replace Windows' default explorer.exe, and more.

I discovered a custom shell for Windows called bblean which was based on Blackbox, a Linux window manager, perhaps best known nowadays for being a precursor of the much more popular Openbox and Fluxbox window managers. This scratched my customisation itch like never before and was genuinely perfect—until I switched to Windows 10 for modern DirectX support. Unfortunately, bblean was built targeting Windows 2000, and it was a miracle it worked as well as it did on every OS until Windows 10, where it still kind of works to this day.

There weren't very many people using bblean and other alternative shells on Windows, however. The community was very tight-knit, and we shared a lot. Through being a part of this community, I also discovered Cygwin, which is essentially a collection of GNU (and other) tools, and a DLL providing POSIX-like functionality that you can install on Windows. Cygwin allows you to both install applications you might expect to find as part of a Linux distribution (such as zsh, vim, git etc.), and also compile POSIX-compliant applications to run on Windows natively (provided you have the DLL Cygwin provides).

The community was very focused on sharing setups and ideas. I'm unfortunately no longer an active participant in the community because I simply don't use Cygwin anymore (for my use case, a tool we'll explore in a moment is a much better fit). Still, I do have a few old screenshots of the lengths I'd go using these tools—something always fun to reminisce about.

The following images are Windows 7 configurations I had in 2015 and beyond. I taught myself CSS and basic JavaScript to be able to customise my Firefox 😁

image

image

Retrospectively, these tools taught me a lot and even contributed to my first paid software development experience (an internship at a small company during my University degree; we used Cygwin on Windows extensively, and I think my prior experience with Cygwin was a strong selling point!)

These tools shall remain forever in my heart ❤️

Windows VM on a Linux host with GPU passthrough

After a few years of the bblean/Cygwin setup, I decided that I would try and learn to use Linux properly and full-time: my internship had just ended, and I had a lot of time on my hands before my final year at University. I had used Linux as a server OS and was familiar with the GNU userland because of Cygwin, but I was not 100% comfortable using it as a host OS for day-to-day computing.

It worked relatively well, all things considered. Games, in general, don't work (though Valve's Proton is making genuinely amazing leaps to this end), especially online multiplayer games with anti-cheat. There were other small annoyances, such as high-dpi support not quite working properly (Wayland was not anywhere near being ready yet 😏), but it worked fine in general.

To get around the gaming issue, I set up a virtual machine using QEMU + GPU passthrough, which allowed me to run Windows 10 in a virtual machine with a dedicated GPU. This worked amazingly, and game performance was effectively native: I couldn't even tell I was in a virtual machine!

It did, however, require two monitors and a software or hardware KVM so that I could move my input devices from my host machine to the VM... and every time my GPU's driver updated, I'd end up getting bluescreens 😅

So close, yet so far...

Windows Subsystem for Linux

I heard about Windows Subsystem for Linux on Hacker News while I was still using my GPU passthrough solution and decided I'd try it out. If it worked well enough, I'd switch to it, given the grief my driver updates were causing me.

In short: it was similar to Cygwin in that it gave you a Linux-like CLI on Windows, but it worked by intercepting and translating Linux syscalls to NT syscalls in real time, providing a real Linux userspace! No more bundled Cygwin DLLs or compiling software for myself. I was effectively running a reverse Wine, and things worked pretty well!

This was pretty frictionless. The performance wasn't the best (primarily because of Windows Defender and NTFS filesystem performance, apparently), but it was good enough for most things. Since WSL worked via syscall translation, however, there were some things that would not work: the lack of a real Linux kernel meant no Docker, no rootfs, no app images, though you could typically work around these issues.

Microsoft then decided to up their game by providing an alternative implementation of WSL: instead of translating syscalls, they would do the following:

  1. Run a real Linux kernel alongside Windows' NT kernel using Hyper-V
  2. Wrap said Linux kernel into a super lightweight VM (kind of like a Docker container)
  3. Perform some black magic kernel extensions to enable Linux–Windows interoperability
  4. Name it WSL 2 and profit 💰💰💰

I have been using WSL 2 in production since it was first available to try out in a preview branch of Windows. It's been nothing but a pleasure to use, and when using it, I genuinely feel like I have the best of both worlds.

The rest of this post will be about how to build a development environment atop WSL 2; however, before carrying on, there are two main things to note:

  1. WSL 2 requires Hyper-V, which means your Windows 10 version will need to support it, and at the time of writing, this makes a machine using WSL 2 incompatible with VirtualBox or VMware software.

  2. WSL 2 is currently only available in preview builds of Windows 10 as it's currently in development. WSL 2 is due for general release in Q1 2020, which isn't too far off, at least 🙏

Setting up WSL 2

Right now, WSL 2 is only available on Insider builds of Windows 10. You need to be on build number 18917 or higher to install it. You can follow these instructions to opt into these Insider builds, but it is important to note that you'll need to re-install Windows if you ever want to roll back to the standard distribution channel.

Next, we can enable the pre-requisite services for WSL 2 by executing the following lines in an elevated Powershell terminal (win+x should help bring it up):

Enable-WindowsOptionalFeature -Online -FeatureName VirtualMachinePlatform
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

You'll be asked to reboot after running these commands, and after you do, you can go to the Microsoft Store and install any WSL distribution you'd like. I usually go for Ubuntu since it's the most supported one at the time of writing, though others are available.

To speed things up, after installing the distribution of your choice but before launching it for the first time, execute the following in a PowerShell or CMD prompt:

wsl.exe --set-default-version 2 

This will cause your installed distribution to initialise itself in WSL 2 mode. You can convert versions freely between WSL 1 mode and WSL 2 mode, but it takes quite a long time, even for a base installation of Ubuntu, so running wsl.exe --set-default-version is preferable in my mind.
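
If you've already initialised a distribution in WSL 1 mode, you should be able to convert it after the fact with something along these lines (substituting your distribution's name for Ubuntu):

wsl.exe --set-version Ubuntu 2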

Congratulations, you now have access to a pretty complete Linux computing environment inside Windows 10! 🎉 Assuming you're using Ubuntu, you'll be able to apt-get install and run most commands you usually would.

After getting this far, one glaringly obvious issue is that your WSL 2 installation is being hosted in an old-school CMD prompt. This isn't ideal because it doesn't do very much: no true colour support, copy/paste is broken, and it generally isn't very customisable. Thankfully we can work around that in the next section 😉

Choosing a Terminal Emulator

There isn't too much choice for Terminal Emulators on Windows; unfortunately, every single one has some tradeoffs to consider.

For example, the default CMD prompt is actually blazingly fast with regards to input latency. It lacks, however, pretty much everything else I'd want from a terminal, outside of decent mouse support (surprisingly). On top of that, both my customised Vim configuration and tmux completely break text rendering in it because of weird Unicode rendering issues 😅

Cygwin's default terminal is pretty good. It's called mintty, and it basically feels like the standard CMD prompt but supercharged. It has mouse support, 24-bit true colour support, and more. Unfortunately, it also doesn't support rendering Unicode, so it's a no-go for my use case. If you're okay with Unicode not being supported, you can download wsltty, which is essentially mintty with WSL integration; neat!

ConEmu is another super popular terminal emulator for Windows. It (and its several derivatives) are extremely customisable, approaching the level of terminal emulators you'd find on Linux. There are two main downsides with ConEmu, however: it doesn't easily support WSL by default (it's possible to get a working configuration, but of all the ones I tried, there are always tradeoffs with respect to losing some functionality, be it mouse support or weird text wrapping issues), and it gets very, very slow when running WSL for some reason 🤔 This might be due to the extreme newness of WSL 2 (or the fact that I'm running an Insider build of Windows). Because of these issues, I cannot recommend ConEmu at this moment in time.

Then there's Windows Terminal, a super-strong new contender from Microsoft. They really have been doing amazing stuff in the last few years! This terminal emulator supports almost everything I want (lightning-fast, customisation features like the ability to set window padding, ligature support, true colour support), but unfortunately, it doesn't support any kind of mouse integration at the moment, which is a crutch I use a lot in my workflow 😭 Support is planned for the future, and right now the project is super early access. This is one to keep an eye on I'm sure.

Foregoing any of the imperfect choices already talked about (but really, feel free to try them all yourself; it's pretty workflow dependent after all), you can actually run an X Server on Windows and simply use X11 forwarding to run your favourite Linux terminal emulator as though it were running natively. I personally use this approach with the popular terminal emulator called Terminator, but as Windows Terminal matures, I may switch to that.

X11-forwarding between Windows and WSL

A few working X Servers exist for Windows, and most of them are pretty similar. The most popular seem to be VcXsrv, MobaXterm, and X410.

I personally choose to use VcXsrv because it's the one I was already using from my Cygwin days (I believe Cygwin has its own X Server, but I've never tried it). VcXsrv is pretty fast, and it is free.

I have tried X410, and while it worked really seamlessly (it was built with WSL integration in mind, seemingly), my display is 1440p, and running applications would stutter and lag, unlike on VcXsrv. This is likely an issue that'll be fixed soon, but for what it's worth, you also need to buy a license to run X410 from the Microsoft Store.

Both VcXsrv and X410 (and I believe others) will enable you to start them in "multiwindow" mode. This essentially behaves as though Windows' explorer.exe is the window manager for your X11 applications, and everything feels super native. Here you can see an image of me running a popular database management GUI, DBeaver, on WSL with X11 forwarding to Windows:

image

As you can see, it looks and behaves just like any other application!

Before you can run any applications via X11 forwarding, you'll need to add the following lines to your shell's configuration:

# Resolve the Windows host's IP from WSL's generated resolv.conf, then point
# DISPLAY at the X server running there
export WSL_HOST=$(awk '/nameserver/ {print $2}' /etc/resolv.conf)
export DISPLAY=$WSL_HOST:0.0

This points your WSL instance's X11 display at the server running on Windows. For maximum performance, it's often recommended to add the following line as well:

export LIBGL_ALWAYS_INDIRECT=1

This forces rendering to happen on Windows' side, rather than in software inside WSL itself. Having done this, source your changes, and you should be able to launch any GUI program and have it just work. Even Chromium and VS Code work if you really want to run them inside WSL for some reason; performance is decent too!

You can make X410 or VcXsrv start automatically by adding shortcuts to them in the following directory: %appdata%\Microsoft\Windows\Start Menu\Programs\Startup\. I really recommend doing this, since there isn't much reason not to have an X Server running on Windows, and it improves the system interoperability story even further!

Other minor notes

How I use Docker

When I was using the original WSL, I was forced to run Docker on Windows and connect to it from my WSL instance. Occasionally things just broke, and this took time to debug and troubleshoot.

Now that WSL 2 is a thing, Docker for Windows has been updated to have its own WSL 2 backend. This works a lot better than standard WSL + Docker on Windows, but it's still not perfect. Occasionally, not shutting down or restarting my workstation for a few days would cause strange networking issues between my containers. I could get around this by running wsl.exe --shutdown but this, of course, also forced me to stop my work.

WSL 2 does, unlike WSL 1, include a real Linux kernel, however, so I opt for just running the native Linux implementation of Docker inside my WSL 2 distribution, and that just works. At this point, I'm not sure why Docker for Windows is even a thing 🤔

After switching to doing this, everything has been super stable! WSL doesn't support any kind of system services, however, so at least once per boot, you'll need to remember to run sudo service docker start to get everything in working order.
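
If remembering that gets tedious, a small snippet in your shell's rc file can handle it. A sketch (it assumes you're happy to type your sudo password on first launch, or have configured passwordless sudo for service):

# Start the Docker daemon if it isn't already running
if ! service docker status > /dev/null 2>&1; then
  sudo service docker start
fi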

Creating shortcuts to Linux tools

Right now, any GUI application you want to run from WSL needs a command prompt open. This is because WSL shuts itself down if no terminals are open.

You can use (probably with some minor tweaking necessary) the following Visual Basic script to spawn an invisible terminal as a host to launch your applications:

' Launch a WSL GUI application without keeping a visible console window open.
' Replace WSLHOST with your X server's address (or rely on DISPLAY being set in
' your shell configuration as described above) and <APPLICATION> with the program to run.
Set oShell = CreateObject("Wscript.Shell")
Dim strArgs
strArgs = "wsl bash -c 'DISPLAY=WSLHOST:0 <APPLICATION>'"
oShell.Run strArgs, 0, False

Simply change <APPLICATION> to your application of choice (e.g. vscode) and save this script as (in this example) vscode.vbs. Running this script will cause Visual Studio Code to open as though it were a normal Windows shortcut 💪

Miscellaneous tips

You can get Linux window management capabilities, such as easy dragging/resizing, with AltDrag.

Microsoft PowerToys is a collection of utilities, including a pseudo tiling window manager (FancyZones).

You can set up hotkeys easily with AutoHotkey. I personally use these hotkeys in conjunction with the Visual Basic shortcut script I detailed above to launch Linux applications via keyboard shortcuts.

WSL allows you to import and export your distribution by simply running wsl.exe --export <YOUR_DISTRO> <PATH_TO_EXPORT_TO> and wsl.exe --import <NAME_FOR_IMPORTED_DISTRO> <PATH_TO_IMPORT_TO> <PATH_OF_EXPORTED_DISTRO>
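
As a concrete example (the distribution name and paths here are just placeholders), backing up an installation and restoring it under a new name might look like:

wsl.exe --export Ubuntu C:\backups\ubuntu.tar
wsl.exe --import Ubuntu-dev C:\WSL\Ubuntu-dev C:\backups\ubuntu.tar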

Conclusion

I hope that this post helps you set up and configure a WSL 2 based development environment on Windows 10! There were definitely a few bumps along the way, but I think it works pretty well when you consider how new the technology is and how many small moving parts there are!

This technology is super exciting to me, and it's only going to get better going forward. I'm just happy I get all the benefits of running Windows while still being able to develop with Unix-like semantics/idioms. 👀 on the future for even more improvements and interoperability! 🤞

Elixir pet peeves—Ecto virtual fields

Ecto is an amazing ORM that is super easy to use, and with good practices and conventions, it becomes extremely composable and flexible while at the same time being super lightweight with minimal magic.

At times though, the lack of "magic" when working with Ecto can prove to be a little limiting: one case where this is an issue, in my opinion, is virtual fields.

What is a virtual field?

Virtual fields are just schema fields that aren't persisted to or fetched from the database. For example, if we define the following Ecto.Schema:

defmodule MyApp.User do
  use Ecto.Schema

  schema "users" do
    field :first_name, :string
    field :last_name, :string
  end
end

We could define a virtual field full_name as: field :full_name, :string, virtual: true. This tells Ecto that the struct has a full_name field, but Ecto won't try to read it from or write it to the database.

This is useful because, even though we're forced to write a function to resolve the virtual field either way, working with the struct itself in Elixir is a little more annoying without the schema field definition:

defmodule MyApp.User do
  use Ecto.Schema

  schema "users" do
    field :first_name, :string
    field :last_name, :string
  end

  def resolve_full_name(%__MODULE__{first_name: first_name, last_name: last_name} = user) do
    Map.put(user, :full_name, first_name <> " " <> last_name)
  end
end

iex(1)> MyApp.User.resolve_full_name(%MyApp.User{first_name: "Chris", last_name: "Bailey"})
%{__struct__: MyApp.User, first_name: "Chris", last_name: "Bailey", full_name: "Chris Bailey"}

Notice how that breaks the rendering of the MyApp.User struct? It also messes with autocomplete and Dialyzer at times, and it stops you being able to reason about what fields a struct has once you've resolved all the custom fields you might want to attach to it.

If we define the schema with the virtual field, we can do the following instead:

defmodule MyApp.User do
  use Ecto.Schema

  schema "users" do
    field :first_name, :string
    field :last_name, :string

    field :full_name, :string, virtual: true
  end

  def resolve_full_name(%__MODULE__{first_name: first_name, last_name: last_name} = user) do
    %__MODULE__{user | full_name: first_name <> " " <> last_name}
  end
end

iex(1)> MyApp.User.resolve_full_name(%MyApp.User{first_name: "Chris", last_name: "Bailey"})
%MyApp.User{first_name: "Chris", last_name: "Bailey", full_name: "Chris Bailey"}

Note that the struct now inspects as expected. Doing things this way also lets us set default field values, and gives us peace of mind that every field we'll ever want to work with (plus type definitions) is enumerated on the schema itself.

Using virtual fields

Certain virtual fields can be resolved during query time, but more often than not (at least on projects I've seen or worked on) virtual fields are resolved after query time (i.e. in your contexts).

Virtual fields can be resolved at query time using Ecto.Query.select_merge/3: I could resolve MyApp.User.full_name at query time via the following Ecto query:

# This is certainly ugly and unnecessary to do like this; more realistically, if your
# virtual field is an average over some aggregate or join, doing it at query time makes
# more sense. This just follows through with the example we've used thus far.
from u in MyApp.User,
  where: u.id == 1,
  select_merge: %{full_name: fragment("? || ' ' || ?", u.first_name, u.last_name)}

# A better example would be something like:
#   preload: [:friends],
#   select_merge: %{friend_count: count(u.friends)}

But for simplicity and readability's sake, resolving simple virtual fields in your contexts is much nicer:

defmodule MyApp.Users do
  alias MyApp.{Repo, User}

  def get_user_by_id(id) do
    User
    # where_id/1 is assumed to be a simple query helper defined on the schema module
    |> User.where_id(id)
    |> Repo.one()
    |> case do
      %User{} = user ->
        {:ok, User.resolve_full_name(user)}

      nil ->
        {:error, :not_found}
    end
  end
end

The only annoyance with the latter approach is composability and repetition: wherever we want to resolve this virtual field (likely everywhere), we need to remember to call User.resolve_full_name/1, and if we have multiple virtual fields to resolve, this can become unwieldy (one can have a User.resolve_virtual_fields/1 function instead, but the former point still stands). This is a longstanding and common pain point with virtual fields, in my opinion.

The query-time approach to resolving virtual fields thus has the advantage that you don't need to worry about any of this, since select_merge calls compose and you're building up queries for your context functions anyway. Whether all virtual fields can be resolved this way, and whether it's worth the reduction in readability, should be decided on a case-by-case basis, however.
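
One way to keep the query-time approach tidy is to wrap each select_merge in a small, pipeable query helper. A sketch of what I mean (with_full_name/1 is a hypothetical helper, not something Ecto provides):

defmodule MyApp.UserQueries do
  import Ecto.Query

  # Pipeable: merge a resolved full_name into whatever User query it's given
  def with_full_name(query \\ MyApp.User) do
    from u in query,
      select_merge: %{full_name: fragment("? || ' ' || ?", u.first_name, u.last_name)}
  end
end

# Usage in a context, composing with other query helpers:
#   User |> User.where_id(id) |> MyApp.UserQueries.with_full_name() |> Repo.one()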

Introducing Ecto.Hooks

In the past, Ecto actually provided functionality to execute callbacks whenever Ecto.Models (the old way of defining schemas) were created/read/updated/deleted from the database. This is exactly the kind of functionality one would like when trying to reduce duplication and points of modification when it comes to resolving virtual fields.

I wrote a pretty small library called EctoHooks which re-implements this callback functionality on top of modern Ecto.

You can simply add use EctoHooks.Repo instead of use Ecto.Repo and callbacks will be automatically executed in your schema modules if they're defined.

Going back to the example given above, we can centralise virtual field handling as follows, using EctoHooks:

defmodule MyApp.User do
  use Ecto.Schema

  schema "users" do
    field :first_name, :string
    field :last_name, :string

    field :full_name, :string, virtual: true
  end

  def after_get(%__MODULE__{first_name: first_name, last_name: last_name} = user) do
    %__MODULE__{user | full_name: first_name <> " " <> last_name}
  end
end

As mentioned above, simply swap use Ecto.Repo for the following line in your application's corresponding MyApp.Repo module:

use EctoHooks.Repo

Any time an Ecto.Repo callback successfully returns a struct, any corresponding hooks defined in that struct's schema module are executed.

All hooks are of arity one and take only the struct defined in the module as an argument. Hooks are expected to return an updated struct on success; any other value is treated as an error.

The valid hooks are listed below:

  • after_get/1 which is executed following Ecto.Repo.all/2, Ecto.Repo.get/3, Ecto.Repo.get!/3, Ecto.Repo.get_by/3, Ecto.Repo.get_by!/3, Ecto.Repo.one/2, Ecto.Repo.one!/2.
  • after_insert/1 which is executed following Ecto.Repo.insert/2, Ecto.Repo.insert!/2, Ecto.Repo.insert_or_update/2, Ecto.Repo.insert_or_update!/2
  • after_update/1 which is executed following Ecto.Repo.update/2, Ecto.Repo.update!/2, Ecto.Repo.insert_or_update/2, Ecto.Repo.insert_or_update!/2
  • after_delete/1 which is executed following Ecto.Repo.delete/2, Ecto.Repo.delete!/2

The EctoHooks repo has links to useful places like the Hex.pm entry for the library, as well as documentation!

I hope this proves useful!

My usage of Nix, and Lorri + Direnv

I recently wrote about transitioning my main working environment to NixOS. I've used NixOS on my main workstation and Nix atop Ubuntu on WSL 2 on my laptop for ~5 months now, and I feel I've really gotten into the flow with it on a handful of client projects.

The coolest bit of the setup is Nix/NixOS agnostic: lorri and direnv. At face value, their use case is pretty basic, and we'll dive into how you're meant to use them on projects, but using them actually enables some pretty cool functionality I've been trying to implement for a while now.

What are Lorri and Direnv?

Nix provides a really cool utility built in called nix-shell. Basically, invoking nix-shell starts a new interactive shell after evaluating a given Nix expression (which by default is assumed to be ./shell.nix, relative to the CWD). This means that you can write a Nix expression detailing what packages you want to install, what environment variables to set, and automatically set/build/enable them within the context of the interactive shell.

This is already extremely powerful, as you can declaratively manage software versions and the like, but lorri + direnv make it even nicer to use!

direnv is a super neat project which provides a shell hook that allows your shell to automatically set environment variables defined within a .envrc file (again, relative to the CWD) whenever you change directories. A basic .envrc is very simple and can look like the following:

export SOME_ENV_VARIABLE="test"

A major win here is that it's super fast, supports most popular shells (including some more esoteric ones), is language agnostic, and is pretty extensible.

lorri is essentially an extension of nix-shell built atop direnv. As soon as you change directory, if the new CWD has a shell.nix and an appropriate .envrc (autogenerated by running lorri init), the shell.nix is immediately evaluated. Lorri also improves upon nix-shell in a bunch of ways, such as caching built derivations, so it's super quick too.
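
For reference, running lorri init in a project directory scaffolds both files for you; as far as I remember, the generated .envrc boils down to a single line that hands evaluation over to the lorri daemon:

# .envrc generated by `lorri init`
eval "$(lorri direnv)"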

General usage

Once you have lorri and direnv set up, the most basic use case in my opinion is replacing something like asdf which does much of the same stuff:

  • it enables you to document what packages are needed for a certain project (within a certain limited, but extensible list)
  • it enables you to specify certain versions of said packages
  • it will automatically build said packages by running asdf install
  • those packages will only be available if you're in a directory (or a child directory) with a .tool-versions file

Since lorri is built with Nix however, we get the following instead:

  • it enables you to document what packages are needed for a certain project (any package available for Nix, which is huge)
  • it enables you to specify certain versions of said packages, override versions, build certain custom versions
  • it will automatically build said packages by entering the directory if needed
  • those packages will only be available if you're in a directory (or a child directory) with a shell.nix file
  • you can do anything Nix enables you to do, including file manipulation and describing environment variables

For example, here is a (trimmed) shell.nix I used to bootstrap an Elixir project, which set some private credentials and needed a bunch of cloud and Kubernetes tooling installed to run:

let
  pkgs = import <nixpkgs> {};

  # Placeholder credentials; the real values are, of course, not committed
  g_sec_user = "<REDACTED>";
  g_sec_password = "<REDACTED>";
in
pkgs.mkShell {
  buildInputs = [
    pkgs.elixir_1_10
    pkgs.nodejs-10_x
    pkgs.yarn
    pkgs.inotify-tools
    pkgs.openssl
    pkgs.kubectl
    pkgs.jq
    pkgs.google-cloud-sdk
    pkgs.minikube
    pkgs.kubernetes-helm
  ];

  # Attributes set on mkShell end up as environment variables inside the shell
  G_SEC_USER = g_sec_user;
  G_SEC_PASSWORD = g_sec_password;
}

Once this file is created, everything automagically works when entering that directory. Changes to the shell.nix are also automatically tracked so you don't need to do anything fancy: your shell.nix is your single source of truth.

Another advantage is that once your project has a shell.nix checked into its repository, anyone running Nix/NixOS can start hacking away by just running nix-shell, and if they use direnv + lorri, they just have to enter the directory and stuff just works 🎉

Usage for Nix unfriendly projects

I'm an Erlang/Elixir consultant, and as a result of this, I have the opportunity to work on a lot of projects for a lot of different clients. I also know that a relatively esoteric package manager (or worse, an entirely different Linux distribution) is a hard sell.

Thankfully though, even in environments where you can't get away with committing any of the lorri/direnv artefacts into the core repository of a project, you can still get away with using lorri + direnv!

Because this entire workflow is dependent on directory structure, my workstation typically has a structure like this:

/home/chris/git
├── esl
│   ├── shell.nix
│   ├── internal_project_1
│   └── internal_project_2
│       └── shell.nix
├── client_1
│   ├── shell.nix
│   ├── client_project_1
│   ├── client_project_2
│   ├── client_project_3
│   └── client_project_4
├── client_2
│   ├── shell.nix
│   └── client_project_1
└── vereis
    ├── shell.nix
    ├── blog
    │   └── shell.nix
    ├── build_you_a_telemetry
    │   └── shell.nix
    ├── horde
    ├── httpoison
    └── nixos

If I'm not allowed to add direnv/lorri artefacts to a repository, I can simply put them in the directory above it and I still get all of the benefits 😄 You need to add source_up to any child .envrc files so that the parent .envrc is also evaluated, but otherwise it works as expected.
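
As a sketch, a child project's .envrc under this layout ends up looking something like the following (paths are illustrative):

# /home/chris/git/vereis/blog/.envrc
source_up              # also evaluate the .envrc one directory up
eval "$(lorri direnv)" # evaluate this project's own shell.nix via lorri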

This means I can install packages and handle dependencies on a project-by-project or client-by-client basis. Because the effects of evaluating these Nix expressions are cumulative, I can even set something up for a whole client and then override it for a specific project belonging to that client.

As lorri/direnv let me install dependencies and set environment variables, this setup is also perfect as a seamless dev credential manager. For example, assuming a client has its own npm registry for JavaScript dependencies, you can simply do the following:

let
  pkgs = import <nixpkgs> {};
in
pkgs.mkShell {
  buildInputs = [
    pkgs.nodejs-10_x
  ];

  # npm reads its user config path from this environment variable
  npm_config_userconfig = toString ./. + "/.npmrc";
}

Where /home/chris/git/client_1/.npmrc contains:

registry=<REDACTED>
email=<REDACTED>
init.author.name="Chris Bailey"
init.author.email=<REDACTED>
always-auth=true
_auth=<REDACTED>

Notice that esl, client_1, client_2, and vereis all have a top level shell.nix. This is also perhaps a slight abuse of direnv/lorri, but the end effect is that I'm able to treat these different directories as completely separate development domains. Each of these top level shell.nix files also sets environment variables for my git username and email. This means that I no longer need to faff around with setting up each and every git repo I clone: if I'm in /home/chris/git/vereis, I'm going to be working as @vereis; if I'm in /home/chris/git/esl/, I'll be @cbaileyesl, etc.
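
A sketch of what that looks like in one of those top level shell.nix files (the values are stand-ins; git's standard GIT_AUTHOR_*/GIT_COMMITTER_* environment variables do the heavy lifting):

let
  pkgs = import <nixpkgs> {};
in
pkgs.mkShell {
  # Everything under this directory commits with this identity
  GIT_AUTHOR_NAME = "Chris Bailey";
  GIT_AUTHOR_EMAIL = "chris@example.com";
  GIT_COMMITTER_NAME = "Chris Bailey";
  GIT_COMMITTER_EMAIL = "chris@example.com";
}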

Conclusion

I hope this small write-up has been helpful 😄 I genuinely think that the Nix ecosystem offers a lot of advantages, especially for developers. Tools like nix-shell, lorri, and direnv are super helpful in streamlining the development process.

Of course, tools like asdf, dotenv, and others exist which allow you to emulate all of what I've described, but the true advantage of Nix/NixOS is that the configuration is very declarative and built into the core experience. Any Nix user can do exactly what I've described, even if they're not running lorri or direnv, simply by virtue of it all being built upon the first-class nix-shell.

If you're not a Nix user, I hope this has piqued your interest a little bit, and if you are, I hope it helps you get over a few of the small road bumps on the way to a development workflow that leverages all the niceties Nix provides.

Drinking the NixOS kool aid

The last time I blogged about my development environment, I had become disillusioned with the numerous problems that affected me when using Linux as my daily driver: scrolling and microstutters in Firefox, archaic and obscure mouse acceleration and sensitivity setup, GPU issues to work around.

Ultimately, even when I did get a comfortable setup going, after a few months something would inadvertently break; and if I tried to configure another machine, a lot of time went into emulating my exact setup.

I've heard about Nix and NixOS in passing over the last few years, and with some downtime in between projects at work, I decided to take a look at them in depth; I've really started to drink the kool aid.

What is NixOS?

At its simplest, NixOS is a Linux distribution that aims to let you do system and package management in a reliable, reproducible, and declarative way, built atop the Nix package manager (which can be installed independently on a plethora of operating systems).

It enables you to do all your OS management, package management, and configuration in the Nix programming language. What's more: every time you make a change to your NixOS configuration and tell NixOS to rebuild your system, the current specification of your system isn't just overwritten; instead, a new 'generation' is created, and you can switch between these versions of your operating system setup as you please. This means that if you somehow happen to break your environment, you can just roll back to how it was before.

When jumping to NixOS, I made the mistake of setting up my configuration so that I could do what I usually do on UNIX-like operating systems: install a bunch of packages that let me compile whatever tools I need for work. While this worked for a while for a few things, it wasn't the optimal way of doing things on NixOS. I spent some time looking at the Nix language and at other people's NixOS configurations around the internet, and I slowly pieced together a configuration that worked for me. Besides just working for me, I can take my configuration, put it on any machine I own, run nixos-rebuild switch and boom: my entire operating system is set up exactly how I like it.

My NixOS configuration has come a long, long way since the first commit (documentation for Nix is pretty sparse) and I've learned a bunch of things that I feel would be useful to share, so take a look at my NixOS configuration if you'd like a decent, modular starting point and I'll outline a few things below.

My configuration

Modular configuration

A lot of NixOS configurations I've looked at essentially just look like one absolutely huge configuration.nix file; this is an OK way to do things, but it can get unwieldy very easily. One of the really cool things about the Nix language is that it is a programming language. I don't leverage anything particularly complicated in my setup, but one nice thing is that you can define modules and import them in a top level configuration.nix module.

My NixOS configuration has the top level configuration.nix and hardware-configuration.nix files gitignored because installing NixOS will generate these for you anyway. Because these files contain auto-generated Nix expressions and are install/machine dependent (UEFI installation, GPU drivers, boot device, etc.), I don't think it makes sense to version them. Instead, my NixOS config is set up with the following three directories:

  1. profiles/, which contains nix modules such as desktop.nix, laptop.nix, server.nix. These nix modules, when imported, set up things that I'd expect from that particular class of machine. If I import server.nix, I won't bother setting up any X11 related functionality because I won't need a GUI. If I import laptop.nix, I might automatically include packages such as powertop or enable bluetooth. In fact, my profiles/ directory also contains a base.nix in which I define a standard set of packages to install which for me I use on all machines, as well as configure my preferred timezone, keyboard layout and other random things.
  2. modules/ which contains nix modules such as neovim.nix, dwm.nix, amd_gpu.nix. These are smaller nix modules which are set up to be enabled or disabled explicitly. My neovim.nix module not only downloads Neovim, but also downloads my dotfiles and configures Neovim exactly how I like it, plugins and all; my dwm.nix grabs my personal build of DWM, compiles it and installs it. Modules defined in modules/ can literally be anything that I might want to install but want to abstract away—installing GPU drivers on my workstation requires three different lines of configuration, so I'd rather just add modules.amd_gpu.enable = true; to my configuration instead. The idea there is that these modules can be easily enabled/disabled on a machine-machine basis.
  3. machines/ which contains nix modules which act as configuration for particular machines of mine. These modules do things such as set the hostname of a machine, as well as include any profiles/*.nix and modules/*.nix files necessary for that particular machine (a sketch of one of these machine modules follows this list). The VPS running my site has an HTTP server, SSL certificate generation, and related things configured in machines/cbailey.co.uk.nix, for example 😄
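
As a sketch (hostnames, paths, and module names here are illustrative rather than copied from my actual config), a machine module just imports the relevant profiles and flips on the modules that machine needs:

# machines/workstation.nix
{ config, lib, pkgs, ... }:
{
  imports = [
    ../profiles/base.nix
    ../profiles/desktop.nix
    ../modules/neovim.nix
    ../modules/dwm.nix
    ../modules/amd_gpu.nix
  ];

  networking.hostName = "workstation";

  modules.neovim.enable = true;
  modules.dwm.enable = true;
  modules.amd_gpu.enable = true;
}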

The main upside to this kind of configuration is that I can share configuration between multiple machines I want to manage in a DRY fashion. My girlfriend also recently installed NixOS, and since she's not too familiar with Linux, it's extremely convenient that she can use my NixOS configuration, swapping out, say, modules/dwm.nix for modules/plasma5.nix, and I can contribute changes that solve technical difficulties she might be having.

Writing a Nix module

I won't go too deeply into the details of the Nix language (I wouldn't call myself an authority on this; I picked up what I know from other configs and playing around), but there are a few nice features/quirks of Nix which you might not expect and which can be leveraged in your modules.

Configuration merging

Nix modules are essentially just attribute sets (basically a map/JSON-object-like thing) containing keys and values. There are a few different sections to a Nix module, but the main bit that matters for NixOS configuration is the attribute set bound to the config key, like this:

{ config, lib, pkgs, ... }:
{
  config = {
    something = true;
  };
}

When this module is imported by your top level configuration.nix, the key config gets merged with your configuration.nix's configuration.

We can utilise this to define a modules/vmware_guest.nix module for instance:

{ config, lib, pkgs, ... }:
{
  virtualisation.vmware.guest.enable = true;
}

When this module gets imported to your top level configuration.nix, virtualisation.vmware.guest.enable = true gets added to your configuration. This is pretty cool because you can utilise this to break your configuration up into smaller chunks.

The merging that nix does is a deep merge as well, so if you have services.xserver.enable = true; in your top level configuration and want to import multiple window managers or desktop environments, these window managers and desktop environments not only can set values nested inside services.xserver, but these values can be used in the top level configuration.nix as well.

If your top level configuration.nix has a list of packages to install, but you also include another Nix module that defines a separate list of packages to install:

# configuration.nix
environment.systemPackages = with pkgs; [
  firefox
];

# some_module.nix
environment.systemPackages = with pkgs; [
  wget
  unzip
  git
];

Then these lists get concatenated and merged together as well, installing all the packages defined across all imported modules. This seemed pretty unintuitive at first, but it's pretty cool!

Enabling/disabling modules

Alongside the config key, you can also declare options of varying types under your module's options key. In fact, all of the NixOS options you can set are themselves declared by modules in exactly this way!

In my modules/ directory, I have a convention where all of my modules define a modules.<name>.enable boolean option for instance, so I can explicitly enable or disable these modules regardless of whether or not they're imported.

You get access to a bunch of types such as paths, strings, integers, etc.; pretty much what you'd expect. I'm not sure where exactly this is documented, since I've only really needed simple boolean options, but reading other nixpkgs modules was a decent and quick way to learn.

You can define a modules.<name>.enable option like I do with the following boilerplate:

{ config, lib, pkgs, ... }:
with lib;
let
  cfg = config.modules.NAME;
in
{
  options.modules.NAME = {
    enable = mkOption {
      type = types.bool;
      default = false;
      description = ''
        Some documentation here
      '';
    };
  };

  config = mkIf cfg.enable (mkMerge [{
    # my optional configuration here
  }]);
}

Overriding packages

NixOS and Nix (the language) can be pretty intimidating at times. The documentation around Nix/NixOS is pretty sparse, potentially outdated and dense. I got a lot of help from the related IRC channels though and definitely suggest you seek help there since it's one of the friendliest and most helpful communities I've had the pleasure of interacting with!

One of the coolest features hidden behind the lack of documentation is the fact that you can override packages. This is especially useful if you want to install a standard package with a patch or small tweak; the standard library for Nix contains functions to fetch remote files from Git or just fetch a tarball from somewhere. I try not to overuse this feature though because it kind of clashes with the declarative-ness of the system otherwise; but for things like my DWM build, it's super helpful.

DWM is a window manager which is configured by recompiling it. When setting up NixOS, I didn't want to learn how to package my personal build of DWM before even getting to grips with the OS, so being able to override the existing DWM package but point at a different set of source code was super helpful. You can override a package with the following pattern:

nixpkgs.overlays = [
  (self: super: {
    dwm = super.dwm.overrideAttrs(_: {
      src = builtins.fetchGit {
        url = "https://github.com/vereis/dwm";
        rev = "b247aeb8e713ac5c644c404fa1384e05e0b8bc6f";
        ref = "master";
      };
    });
  })
];

Home-Manager

Home-Manager is a tool which lets you configure your applications declaratively (basically a Nix-ish take on dotfiles). It's super useful, and I've transitioned to using it for a bunch of the applications I use.

My terminal emulator's colors, default Firefox plugins and other things are configured via Home-Manager and thus are versioned in my NixOS generations and upgraded whenever I do a nixos-rebuild switch.
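
As a flavour of what that looks like (a sketch; programs.git and programs.firefox are built-in Home-Manager modules, and the values are placeholders):

# home.nix
{ config, pkgs, ... }:
{
  programs.git = {
    enable = true;
    userName = "Chris Bailey";
    userEmail = "chris@example.com";
  };

  programs.firefox.enable = true;
}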

There isn't too much else to say about Home-Manager; it integrates well with Nix and it just works. Check it out if you're using NixOS.

Developer Environments

NixOS comes with a built-in feature which lets you set up environments for different projects (similar to how asdf might work for your Erlang/Elixir projects).

Essentially, you can create a shell.nix file in a directory that contains a list of applications you want to be available in that directory and activate it by invoking nix-shell. You can see an example shell.nix set up for a Phoenix application below:

let
  pkgs = import <nixpkgs> {};
in
pkgs.mkShell {
  buildInputs = [
    pkgs.elixir_1_10
    pkgs.nodejs-10_x
    pkgs.yarn
  ];
}

If you're using direnv (a really nice application which lets you automatically set environment variables when you enter a directory), you can use a cool project called lorri which sets up a daemon and automatically updates your local dev environment when you enter a directory (avoiding the need to run nix-shell manually) as well as automatically watching for changes in your shell.nix.

This emulates the flow of using asdf and defining a .tool-versions file pretty well, and I've actually found it even more useful: if someone has trouble compiling one of my applications, even if they don't use Nix, I can look at the shell.nix and tell them exactly what packages they need and at what versions.

Wrapping up

The more I play with Nix and NixOS, the more in love I fall with it. I've not had this much fun (and drunk this much kool-aid) since falling for OTP or discovering Arch Linux or Gentoo when I was just starting to get into Linux.

Hopefully the contents of this post will be helpful for other people looking to get into NixOS and I'm sure I'll have more to contribute w.r.t this as time goes on!

Thanks for reading 😄

Edit: man configuration.nix gives you an exhaustive list of configuration options! Happy tinkering!

Testing with tracing on the BEAM

When I was writing some tests for an event broadcasting system (the one I outlined in a previous blog post), I found that, as expected, the unit tests were easy to write, but when it came to testing that the entire system worked as intended end to end, it was a little more difficult, because I'd be trying to test completely asynchronous and pretty decoupled behaviour.

Testing with Mock/:meck

At this point in time, the test suites inside our code base use mocks very extensively, which might be a topic for another day, because I don't believe that mocking code in tests is the best (nor most idiomatic) approach. One nice bit about Mock, the de-facto mocking library for Elixir, is that it also comes with a few assertions you can import into your test suites; namely assert_called/1.

Using this library, we can write our tests by mocking the event handling functions of the event listeners we're interested in for a particular test, and then use assert_called/1 to determine whether that function was called as expected; optionally, you can provide what looks roughly like an Erlang match spec to assert that the function was called with specific arguments.

Our tests end up looking roughly like the following:

defmodule E2ETests.EctoBroadcasterTest do
  use ExUnit.Case
  alias EventListener.EctoHandler
  import Mock

  setup_with_mocks(
    [
      {EctoHandler, [:passthrough],
       handle_event: fn {_ecto_action, _entity_type, _data} ->
         {:ok, :mocked}
       end}
    ],
    _tags
  ) do
    :ok
  end

  test "Creating entity foo is handled by EctoHandler" do
    assert {:ok, %MyApp.Foo{}} = MyApp.Foos.create_foo(...)
    assert_called EctoHandler.handle_event({:insert, MyApp.Foo, :_})
  end
end

This approach basically met all of our requirements, except for the fact that mocking via :meck (the de-facto Erlang mocking library, used internally by Mock) does things in a pretty heavy-handed way.

At a high level, all :meck does is replace the module you're trying to mock with a slightly edited one where the functions you want to mock are inserted in place of the original implementations. This comes with three disadvantages for our particular use case:

  1. Because we're essentially loading code into the BEAM VM, we can no longer run our tests asynchronously because we've patched code at the VM level and these patches are global
  2. Because we're replacing the function we want to call; we can't actually test that that particular function behaves as expected; only that it was called
  3. If we want to mock our functions purely for the assert_called/1 assertion (though this has annoyed us somewhat elsewhere too), we need to be careful and essentially re-implement any functionality that we actually expect to happen for it to be testable; which itself can lead to brittle mocks.

My main qualm was with the second problem; we ran into it pretty hard for a few edge cases (though we resolved the second case too!): say EctoHandler.handle_event({:delete, _, _}) needed to be implemented in a special way where it persists these actions into an ETS table and you want to assert that that actually happened; you'd be a little stuffed... imagine your source code looks like:

defmodule EventListener.EctoHandler do
  ...
  def handle_event({:delete, entity_type, data}) do
    Logger.info("Got delete event for #{entity_type} at #{System.monotonic_time()}")
    data
    |> EventListener.DeletePool.write()
    |> process_delete_pool_response()
    |> case do
      {:ok, _} = response -> 
        response
      error ->
        add_to_retry_queue({:delete, entity_type, data})
        Logger.error("Failed to add to delete pool, adding to retry queue")
        error
    end
  end
  ...
end

To mock this function properly, you'd probably just have to implement your mock such that it essentially duplicates the implementation, but this of course will blow up the moment the mock drifts out of sync with the real implementation.
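
For illustration, a "proper" mock of the delete path might end up looking something like the sketch below (same test module as before; EventListener.DeletePool is the hypothetical module from the snippet above). Note how it re-implements part of the real function while quietly dropping the retry queue and response processing; nothing will warn us when the two drift apart:

setup_with_mocks([
  {EctoHandler, [:passthrough],
   handle_event: fn
     {:delete, _entity_type, data} ->
       # Re-implements just enough of the real implementation for the test to pass;
       # nothing warns us when the real handle_event/1 changes underneath it
       data
       |> EventListener.DeletePool.write()
       |> case do
         {:ok, _} = response -> response
         error -> error
       end

     {_other_action, _entity_type, _data} ->
       {:ok, :mocked}
   end}
]) do
  :ok
end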

A less heavy-handed approach

One of the coolest but possibly least used features of the BEAM is the built in ability to do tracing.

Tracing essentially lets you capture messages being passed to particular processes, functions being called, and various other events. It does come with some disadvantages: if you're not careful you'll blow up your application because of the performance impact (be careful if you trace on a production deployment 😉), and it's very low level. Still, it works well, and there exist several higher-level wrappers.

I wanted to experiment with tracing because I've never used it very much, so I chose a pretty low level abstraction to do the work for me: the :dbg module, which comes built into the standard Erlang/OTP distribution.

I'm in the process of cleaning up the actual implementation, but if you wanted to build your own replacement for the assert_called/1 function (with a nice added feature) feel free to follow along!

You can start the :dbg application via the following:

def init() do
  # Flush :dbg in case it was already started prior to this
  :ok = :dbg.stop_clear()
  {:ok, _pid} = :dbg.start()

  # This is the important bit; by default :dbg will just log what gets traced on stdout
  # but we actually need to handle these traces because we want to capture that the
  # function was called programmatically rather than reading it off the shell/log file
  # This tells `:dbg` to call the `process_trace/2` function in the current running process
  # whenever we get a trace message
  {:ok, _pid} = :dbg.tracer(:process, {&process_trace/2, []})

  # And this sets up `:dbg` to handle function calls
  :dbg.p(:all, [:call])
end

Now that the tracer is started up; we need to tell it to start tracing certain functions. You can do this quite easily by passing an mfa into :dbg.tpl/4 as follows:

def trace(m, f, a) do
  # :dbg.tpl traces all function calls including private functions (which we needed for our use-case)
  # but I think it's a good default
  # 
  # The 4th parameter basically says to trace all invocations of the given mfa (`:_` means we pattern
  # match on everything), and we also pass in `{:return_trace}` to tell the tracer we want to receive
  # not just function invocation messages but also capture the return of those functions 
  :dbg.tpl(m, f, a, [{:_, [], [{:return_trace}]}])
end

Now, provided with a super simple stub implementation of process_trace/2, you should be able to see your traces come back looking good:

# Using the following process_trace/2 implementation:
def process_trace({:trace, _pid, :call, {module, function, args}} = _message, _state) do
  IO.inspect(inspect({module, function, args}) <> " - call start")
end

def process_trace({:trace, _pid, :return_from, {module, function, arity}, return_value}, _state) do
  IO.inspect(inspect({module, function, arity}) <> " - returns: " <> inspect(return_value))
end
-------------------------------------------------------------------------------------------------
iex(1)> Assertions.init()
{:ok, [{:matched, :nonode@nohost, 64}]}
iex(2)> Assertions.trace(Enum, :map, 2)
{:ok, [{:matched, :nonode@nohost, 1}, {:saved, 1}]}
iex(3)> Enum.map([1, 2, 3], & &1 + 2)
"{Enum, :map, [[1, 2, 3], #Function<44.33894411/1 in :erl_eval.expr/5>]} - call start"
"{Enum, :map, 2} - returns: [3, 4, 5]"
[3, 4, 5]

This gets us most of the way to what we want! We can see what functions were called and capture the variables passed into it; as well as capturing what the functions returned.

But if you look closely, the messages for :call and :return_from have a few issues:

  1. Ideally, I'd have preferred to have a single message rather than two, because we have some additional complexity here now...
  2. Especially because the :return_from message doesn't get the args passed into it but instead only the function's arity... which means we need to try our best to marry them up as part of the process_trace function...
  3. Just as a note: I don't think we can easily pattern match on lambdas on the BEAM (nor do things like send them over :rpc.call), so at least it's consistent 🙂

We can solve the first two problems by amending the implementation of process_trace/2 quite easily though (it might not be a foolproof implementation, but it worked well enough for our use case that I'll just add the code here verbatim):

def process_trace({:trace, _pid, :call, {module, function, args}} = _message, state) do
  [{:call, {module, function, args}, :pending} | state]
end

def process_trace({:trace, _pid, :return_from, {module, function, arity}, return_value}, [
      {:call, {module, function, args}, :pending} | remaining_stack
    ])
    when length(args) == arity do
  IO.inspect(inspect({module, function, args}) <> " - returns: " <> inspect(return_value))

  remaining_stack
end
-------------------------------------------------------------------------------------------------
iex(1)> Assertions.init()
{:ok, [{:matched, :nonode@nohost, 64}]}
iex(2)> Assertions.trace(Enum, :map, 2)
{:ok, [{:matched, :nonode@nohost, 1}, {:saved, 1}]}
iex(3)> Enum.map([1, 2, 3], & &1 + 2)
"{Enum, :map, [[1, 2, 3], #Function<44.33894411/1 in :erl_eval.expr/5>]} - returns: [3, 4, 5]"
[3, 4, 5]
iex(4)> Enum.map([[1, 2, 3], [:a, :b, :c], []], fn list -> Enum.map(list, &IO.inspect/1) end)
1
2
3
:a
"{Enum, :map, [[1, 2, 3], &IO.inspect/1]} - returns: [1, 2, 3]"
:b
:c
"{Enum, :map, [[:a, :b, :c], &IO.inspect/1]} - returns: [:a, :b, :c]"
"{Enum, :map, [[], &IO.inspect/1]} - returns: []"
[[1, 2, 3], [:a, :b, :c], []]

Because whatever gets returned from process_trace/2 accumulates into future calls of process_trace/2 via the second parameter, we can essentially do the above and treat it like a stack of function calls. We only emit the actual IO.inspect we care about if we end up popping the function call (because it's resolved) off the stack.

All I do from here is persist these calls into an :ets table of type :bag that we create when init/0 is called, and have the process_trace({_, _, :return_from, _, _}, _) clause write to it whenever a function call succeeds.
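
A minimal sketch of that persistence step, assuming @mod holds the ETS table name (the same attribute the macros below select from); this is not the cleaned-up library code:

def init do
  # :bag so multiple calls to the same mfa can be recorded; :public + :named_table so
  # the tracer handler (which runs in another process) can write to it
  :ets.new(@mod, [:bag, :public, :named_table])

  # ... the :dbg setup from earlier ...
end

def process_trace({:trace, _pid, :return_from, {module, function, arity}, return_value}, [
      {:call, {module, function, args}, :pending} | remaining_stack
    ])
    when length(args) == arity do
  # {timestamp, {mod, fun, arity, args, return}} mirrors the tuple shape the
  # called?/capture macros select on
  :ets.insert(@mod, {System.monotonic_time(), {module, function, arity, args, return_value}})

  remaining_stack
end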

For convenience I really hackily threw together the following macros (please don't judge me 🤣 -- this is going to be cleaned up a lot before I throw it up on hex.pm):

defp apply_clause(
       {{:., _, [{:__aliases__, _, mod}, fun]}, _, args},
       match \\ {:_, [], Elixir}
     ) do
  module = ["Elixir" | mod] |> Enum.map(&to_string/1) |> Enum.join(".") |> String.to_atom()

  arg_pattern =
    args
    |> Enum.map(fn
      {:_, _loc, _meta} -> :_
      other -> other
    end)

  quote do
    case :ets.select(unquote(@mod), [
           {{:_,
             {unquote(module), unquote(fun), unquote(length(arg_pattern)), unquote(arg_pattern),
              :_}}, [], [:"$_"]}
         ]) do
      [] ->
        false

      matches ->
        Enum.any?(matches, fn
          {_timestamp, {_mod, _fun, _arity, _args, unquote(match)}} -> true
          _ -> false
        end)
    end
  end
end

defp capture_clause({{:., _, [{:__aliases__, _, mod}, fun]}, _, args}) do
  module = ["Elixir" | mod] |> Enum.map(&to_string/1) |> Enum.join(".") |> String.to_atom()

  arg_pattern =
    args
    |> Enum.map(fn
      {:_, _loc, _meta} -> :_
      other -> other
    end)

  quote do
    case :ets.select(unquote(@mod), [
           {{:_,
             {unquote(module), unquote(fun), unquote(length(arg_pattern)), unquote(arg_pattern),
              :_}}, [], [:"$_"]}
         ]) do
      [] ->
        false

      [{_timestamp, {_mod, _fun, _arity, _args, val}} | _] ->
        val
    end
  end
end

defmacro called?({{:., _, _}, _, _} = ast) do
  apply_clause(ast)
end

defmacro called?({op, _, [lhs, {{:., _, _}, _, _} = rhs]}) when op in [:=, :"::"] do
  apply_clause(rhs, lhs)
end

defmacro called?({op, _, [{{:., _, _}, _, _} = lhs, rhs]}) when op in [:=, :"::"] do
  apply_clause(lhs, rhs)
end

defmacro capture({op, _, [lhs, {{:., _, _}, _, _} = rhs]}) when op in [:=, :<-] do
  captured_function = capture_clause(rhs)

  quote do
    var!(unquote(lhs)) = unquote(captured_function)
  end
end

Which lets us finally put it all together and do the following 🎉

iex(1)> require Assertions
Assertions
iex(2)> import Assertions
Assertions
iex(3)> Assertions.init()
{:ok, [{:matched, :nonode@nohost, 64}]}
iex(4)> Assertions.trace(Enum, :map, 2)
{:ok, [{:matched, :nonode@nohost, 1}, {:saved, 1}]}
iex(5)> called? Enum.map([:hi, "world!"], _) = [:atom, :string]
false
iex(6)> Enum.map([:hi, "world!"], & (if is_atom(&1), do: :atom, else: :string))
[:atom, :string]
iex(7)> called? Enum.map([:hi, "world!"], _) = [:atom, :string]
true
iex(8)> called? Enum.map([:hi, "world!"], _) = [_, :atom]
false
iex(9)> called? Enum.map([:hi, "world!"], _) = [_, _]
true
iex(10)> called? Enum.map([:hi, "world!"], _)
true
iex(11)> capture some_variable <- Enum.map([:hi, "world!"], _)
[:atom, :string]
iex(12)> [:integer | some_variable]
[:integer, :atom, :string]

So as you can see, these macros give us some nice ways to solve two of the three issues we had with Mock:

  1. We can assert that functions were called (passing in a real pattern match vs. needing to pass in a match-spec-like DSL)
  2. And we can assert that the functions were called and returned a particular value which again can be any Elixir pattern match
  3. And as a bonus, you can bind whatever some function returned to a variable with my capture x <- function macro! 🤯

The remaining steps for this are to add a before_compile step to automatically parse invocations of called? and capture so that I can automate the Assertions.trace/3 call and make it completely transparent (it'll automatically trace the functions you care about), as well as to extensively test some edge cases, because we've run into a few already.

I'll also be looking into massively cleaning up these macros, since they were thrown together in literally a couple of hours of playing around with whatever ASTs I thought to test. Still, I hope this blog post helped as an intro to tracing on the BEAM, as well as being a fun notepad of my iterative problem solving process 🙂

Taming my Vim configuration

Taming my Vim configuration

As a long-time Vim user, I've put a lot of time into my Vim configuration over the years. It is something that has grown organically as my editing habits and needs have changed over time.

Recently, when trying to figure out why certain things weren't working the way I expected, I realised that there was much in my config I either didn't need or just blatantly did not understand. I took this as an opportunity to remove all the old cruft and start my configuration file again from scratch, taking only what I absolutely needed.

My configuration file was also around 2000 lines in length. Organisation was a huge mess: related commands strewn around all over the place due to years of ad-hoc editing. During my refactor of my configuration, one key goal was also to clean this up.

When ranting to a colleague of mine about my configuration woes, he showed me his, and I was struck by inspiration 👼 A means of micromanaging—architecting—Vim configuration to make it saner, compositional, and more optimal too.

A few key ideas I've used in my configuration are outlined below: hopefully, they'll help anyone else also looking to tame their Vim configuration.

Initial refactoring

The first thing I did when breaking apart my 2000 line configuration file was to untangle the organisational mess I'd made.

I started this by moving all the configuration related to specific ideas (i.e. tabbing: tabstop, softtabstop, expandtab) into contiguous blocks like the following:

" Cache undos in a file, disable other backup settings
set noswapfile
set nobackup
set undofile

" Better searching functionality
set showmatch
set incsearch
set hlsearch
set ignorecase
set smartcase

This let me tell at a glance what essential "modules" I was dealing with. These "modules" would be blocks of configuration concerning both related settings and configuration for certain plugins. After doing this, my Vim configuration was actually larger, but at least it was easier to read 😅

Dealing with Plugins

I personally use vim-plug for managing my Vim plugins. Some of the reasons I use it include the minimal amount of boilerplate needed to get plugins working, as well as its hooks for lazy loading: we'll get to this later.

The only requirement to use vim-plug is to install it as per its installation instructions, followed by adding the following block to the top of your Vim configuration:

call plug#begin('~/.vim/my_plugin_directory_of_choice')

Plug 'https://github.com/some_repo/example.git'
Plug 'scrooloose/nerdtree'

call plug#end()

Most Vim plugins will have installation instructions for if you're using vim-plug (and if you're not, I'm sure you know how to configure this bit already anyway!), but that's pretty much it. Running :PlugInstall will fetch all of the configured plugins and download them to the path you pass to call plug#begin(...). Once this is done, you're basically good to go.

Because vim-plug is just standard viml, I use the same "module" idea and segregate groups of similar/related plugins together, as follows.

call plug#begin('MY_CONFIG_PATH/bundle')

" General stuff
Plug 'scrooloose/nerdtree'
Plug 'jistr/vim-nerdtree-tabs'
Plug 'neoclide/coc.nvim', {'branch': 'release'}
Plug 'easymotion/vim-easymotion'
Plug 'majutsushi/tagbar'
Plug 'ludovicchabant/vim-gutentags'
Plug 'tpope/vim-surround'
Plug 'tpope/vim-repeat'
Plug 'tpope/vim-eunuch'
Plug 'machakann/vim-swap'

" Setup fzf
Plug '/usr/local/opt/fzf'
Plug '~/.fzf'
Plug 'junegunn/fzf.vim'

" Git plugins
Plug 'Xuyuanp/nerdtree-git-plugin'
Plug 'airblade/vim-gitgutter'
Plug 'tpope/vim-fugitive'

" Erlang plugins
Plug 'vim-erlang/vim-erlang-tags'         , { 'for': 'erlang' }
Plug 'vim-erlang/vim-erlang-omnicomplete' , { 'for': 'erlang' }
Plug 'vim-erlang/vim-erlang-compiler'     , { 'for': 'erlang' }
Plug 'vim-erlang/vim-erlang-runtime'      , { 'for': 'erlang' }

" Elixir plugins
Plug 'elixir-editors/vim-elixir' , { 'for': 'elixir' }
Plug 'mhinz/vim-mix-format'      , { 'for': 'elixir' }

" Handlebar Templates
Plug 'mustache/vim-mustache-handlebars' , { 'for': 'html' }

" Bunch of nice themes
Plug 'flazz/vim-colorschemes'

call plug#end()

Here we can see another killer feature of vim-plug: by specifying options such as { 'for': 'erlang' }, I'm telling vim-plug that a particular plugin should only be loaded if the filetype of the currently opened file is erlang. A 2000 line config file isn't necessarily huge by hardcore Vim standards, but even at 2000 lines, my Vim configuration took a second to load, which was distracting. You can use many different hooks to optimise your vim-plug loading time, so I suggest you consult their README for more information; for what it's worth, I get by simply with this { 'for': $filetype } directive.

Turning comments into real "modules"

Now that everything is super well organised, I did a second refactoring pass over my configuration. Since all my related commands were already in logical blocks, I further broke them out into their own files so that I wouldn't need to have the informational overhead in my brain whenever I wanted to change a trivial setting.

The cool thing is that viml is a Turing complete programming language, able to do pretty much anything a normal programming language can do. One thing I'm abusing is the idea of automatically sourcing files from elsewhere, essentially breaking out my logically separated comments into real, logical modules.

I do this via a very naive for ... in ... loop as follows:

for config in split(glob('MY_CONFIG_PATH/config/*.vim'), '\n')
  exe 'source' config
endfor

This basically tells vim to look for all .vim files inside my configured config/ directory and source them. Therefore, I end up organising my configuration modules as follows:

MY_CONFIG_PATH/config
├── coc_config.vim
├── ctag_config.vim
├── easymotion_config.vim
├── editor_config.vim
├── elixir_config.vim
├── erlang_config.vim
├── fzf_config.vim
├── jsonc_config.vim
└── nerdtree_config.vim

In the future as this grows, I might even split this up further to house modules in different domains (i.e. MY_CONFIG_PATH/plugins/*.vim versus MY_CONFIG_PATH/languages/*.vim versus plain old MY_CONFIG_PATH/tab_settings.vim). In the meantime, however, this level of encapsulation is more than enough: editor_config.vim is where I keep all my editor commands such as tab stop settings; everything else is either plugin configuration (all my settings related to, say, fzf) or language configuration: files such as elixir_config.vim get loaded when I'm editing Elixir files specifically.

Plugin Configuration

My nerdtree_config.vim plugin configuration file is listed below, but essentially all of these files (except language configuration files) look the same:

" NERDTREE toggle (normal mode)
nnoremap <C-n> :NERDTreeToggle<CR>

" Close VIM if NERDTREE is only thing open
autocmd bufenter * if (winnr("$") == 1 && exists("b:NERDTree") && b:NERDTree.isTabTree()) | q | endif

" Show hidden files by default
let NERDTreeShowHidden=1

" If more than one window and previous buffer was NERDTree, go back to it.
autocmd BufEnter * if bufname('#') =~# "^NERD_tree_" && winnr('$') > 1 | b# | endif

" Selecting a file closes NERDTREE
let g:NERDTreeQuitOnOpen = 1

" Style
let NERDTreeMinimalUI = 1
let NERDTreeDirArrows = 1

Since the plugins can be set to be lazily loaded, I don't bother with any further tinkering with these files. They're literally just plain viml that is sourced by my top-level init.vim or .vimrc.

Language configuration

Language configuration is likewise simple, but a little different: I end up writing custom viml functions for overriding configuration which might be set elsewhere.

As an example, I've included a minimal copy of my elixir_config.vim file below:

function! LoadElixirSettings()
  set tabstop=2
  set softtabstop=2
  set shiftwidth=2
  set textwidth=80
  set expandtab
  set autoindent
  set fileformat=unix
endfunction

au BufNewFile,BufRead *.ex,*.exs,*.eex call LoadElixirSettings()

" Mix format on save
let g:mix_format_on_save = 1

As you can see, I basically do two different things:

  1. I set a few options specific to plugins that might load for filetypes marked elixir, i.e. let g:mix_format_on_save is a configuration option for a plugin I load in my top-level init.vim or .vimrc via vim-plug

  2. Settings that are sourced in other files (such as tabstop or autoindent) can't be guaranteed to be sourced in any particular order, because again, that's done via a naive for ... in ... loop. To ensure that these settings override the defaults I've defined elsewhere, I set up an autocommand: something that automatically executes based on some conditions. I'm telling vim that when the buffer is editing a file ending in .ex, .exs, or .eex, it should run the function I've defined—this, in turn, overrides the existing settings with no issues 💪

Conclusion

I find that structuring my Vim configuration this way is super helpful in cutting down the amount of mental overhead I have when trying to edit specific settings. All of my configuration is separated and logically grouped, and when I decide I no longer use a given plugin, I can delete its associated configuration file.

I hope this helps, and if you're interested, you can find my personal vim configuration here. This repo changes often and grows over time, so what you find there might not be 100% the same approach as I've outlined here, but that means I've improved and streamlined it even more! 💪

As an aside, I personally opt to use Neovim over Vim because of asynchronous plugin support.

I originally had a section covering how to convert an existing Neovim configuration to be loadable by Vim (such as to ignore any differences and to enable Vim users to make meaningful use of this guide), but this was removed simply because even if the raw configuration can be made to work for both editors, plugins cannot.

Following this realisation, I've edited this article to be completely editor agnostic. I apologise for any confusion early drafts of this article might have caused!

Refining Ecto Query Composition

Refining Ecto Query Composition

This post is a follow-up to the somewhat popular Ecto Query Patterns post I wrote over a year ago, so if you haven't read that, I definitely recommend skimming over it just to set the groundwork for this post.

In that post, I detailed my preferred way of breaking down Ecto Queries into small, reusable functions for three major reasons: minimising code duplication, isolating where Ecto queries live and are built in your codebase, and increasing the consistency of query patterns in your modules.

Despite these goals, when implementing this idea more fully in Vetspire's codebase, we still ran into a few pain points and oversights which we've attempted to address since the publishing of the original post—something I hope to share with you all now.

Pain Points & Oversights

One of the main issues with the composable Ecto Query pattern is that while it reduces code duplication by preventing you from writing ad-hoc Ecto Queries outside of your Ecto Schemas, it definitely doesn't go as far as it could. For example, if we have multiple functions dealing with Users in your context, your functions probably end up duplicating a lot of logic in your pipelines like so:

defmodule MyApp.Accounts do
  alias MyApp.Accounts.User
  alias MyApp.Repo

  def count_users(org_id) do
    User
    |> User.where_not_deleted()
    |> User.where_org_id(org_id)
    |> Repo.aggregate(:count)
  end

  def list_users(org_id) do
    User
    |> User.where_not_deleted()
    |> User.where_org_id(org_id)
    |> Repo.all()
  end
  
  def get_user(org_id, user_id) do
    User
    |> User.where_not_deleted()
    |> User.where_org_id(org_id)
    |> Repo.get_by(id: user_id)
  end
end

While this evidently has a lot of advantages over duplicating the raw Ecto Queries in all three functions (if only for IDE completion, being able to write multiple clauses per function, etc), we've still got a lot of rampant duplication. We still have to remember to add the User.where_not_deleted/1 call to each pipeline, and not doing so could easily lead to bugs; we potentially have to add nil pattern clauses to each composable function if we wanted to allow callers to opt out of passing parameters; we either have to duplicate built-in Ecto business logic (implementing a User.where_id/2 versus using Repo.get_by/2) or accept having multiple methods of querying data in our functions; etc.
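
As a concrete example of the "nil clause" workaround mentioned above, each composable query function tends to grow an extra head just so callers can skip a parameter. A hedged sketch, living inside the User schema (which already imports Ecto.Query):

# Sketch only: an extra nil head per filter, purely so callers can opt out of passing it
def where_org_id(query, nil), do: query

def where_org_id(query, org_id) do
  from u in query, where: u.org_id == ^org_id
end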

Additionally, while I touched briefly on being able to reuse any defined composable query functions as part of your application's Dataloader sources, or leveraging named bindings to more easily allow cross-schema query composition for joins/preloads, the solution still forces you to build queries in places outside of your schema and contexts.

This might be fine in the preload/join case if your schemas are related and coupled under a single context (i.e. MyApp.Accounts.User might build a preload query using functions defined in MyApp.Accounts.PhoneNumber or vice versa), but doing so in the case where you're reaching between contexts (MyApp.Billing.Invoice building queries reaching into MyApp.Accounts.User) or via Dataloader feels a lot less clean.

While I don't think we necessarily want to avoid reaching into other schemas entirely, especially since we want to leverage pattern matching on structs and type definitions across our codebase, we do want to avoid exposing and using the low-level details too much. I'd definitely count the kind of query fragments we usually compose as low-level details... so being able to avoid doing this as much as possible is a nice incentive to revisit how we do things!

Lastly, we end up creating an ad-hoc, informally specified DSL that barely wraps the underlying functionality. It was questioned in our team whether or not the compositional query pattern was quite the right level of abstraction for what we were trying to do.

As we were evaluating how our compositional query rollout was going, we decided as a team to double down and try to maximise how encapsulated and reusable our queries would be. The direction we ended up going very much embraces the DSL/pattern we were trying to enforce, so maybe this won't be as universally applicable in other applications, but it's definitely something I think applies to the vast majority of apps which utilise Ecto/Phoenix/Absinthe. For other specific use cases (i.e. building ETL pipelines handling very polymorphic data) it might still be beneficial to write very composable Ecto Queries, but as said, for most things, this approach simplifies and ties everything together a lot neater.

Metaprogramming Elixir

What's the solution you ask? The answer with Elixir is almost always (regrettably) metaprogramming!

The high-level idea is that we didn't want to build a pattern or abstraction for query composition; we wanted to build an abstraction to allow us to query data. This small shift in thinking led us down the metaprogramming rabbit hole: in short, entities in our codebase needed to be queryable. This queryability needed to be implemented in a generic but consistent way and needed to be trivially extendable on a schema-by-schema basis.

We took this thinking and started hacking something together quite literally: we decided that all schemas which wanted to be queryable needed to implement a new behaviour and some callbacks for consistency while retaining customisability, and that schemas implementing said behaviour should expose a very simple API to hide query details from the caller.

When looking at the bulk of our code which was already written with compositional query patterns, we saw that the most common pattern for allowing very customisable filtering was the Enum.reduce/3 pattern outlined in the original blog post. We decided that this would be a very flexible base for our behaviour callback; we intended to do something as follows:

defmodule MyApp.Queryable do
  @callback query(opts :: Keyword.t()) :: Ecto.Queryable.t()
end

defmodule MyApp.Accounts.User do
  use Ecto.Schema

  import Ecto.Changeset
  import Ecto.Query

  alias MyApp.Entities.Org
  alias __MODULE__

  @behaviour MyApp.Queryable

  # The :id primary key is defined automatically by Ecto
  schema "users" do
    field :name, :string
    field :password, :string
    field :deleted, :boolean

    belongs_to :org, Org
  end

  def changeset(%User{} = user, attrs) do
    cast(user, attrs, [:name, :password, :deleted, :org_id])
  end

  @impl MyApp.Queryable
  def query(opts) do
    Enum.reduce(opts, from(x in User, as: :user), fn
      {:id, id}, query when is_integer(id) ->
        from [user: user] in query, where: user.id == ^id

      {:org_id, org_id}, query when is_integer(org_id) ->
        from [user: user] in query, where: user.org_id == ^org_id

      {:name, name}, query when is_binary(name) ->
        from [user: user] in query, where: user.name == ^name

      {:deleted, true}, query ->
        from [user: user] in query, where: not user.deleted
    end)
  end
end

At a super basic level, this minimal implementation allows us to replace all of the examples in the context snippet above like so:

defmodule MyApp.Accounts do
  alias MyApp.Accounts.User
  alias MyApp.Repo

  def count_users(org_id), do:
    Repo.aggregate(User.query(org_id: org_id, deleted: true), :count)

  def list_users(org_id), do:
    Repo.all(User.query(org_id: org_id, deleted: true))
  
  def get_user(org_id, user_id), do:
    Repo.one(User.query(id: user_id, org_id: org_id, deleted: true))
end

Which is definitely a little terser and reduces the amount of duplicated code we had. One of the big tradeoffs this makes is losing IDE autocompletion as a perk, which, for Vetspire's use case, was acceptable. In lieu of IDE auto-completion, we only have one function we need to remember to use (User.query/1, Org.query/1, etc), and we can still catch bad key errors by using Dialyzer.

A Little Extensibility

Additionally, a lot of schemas in Vetspire happen to utilise soft deletion rather than hard deletion. In 99% of cases, we never want to query the deleted entities from the database. For cases like this, we can extend our MyApp.Queryable behaviour by adding a new @callback base_query :: Ecto.Queryable.t() which could be implemented thusly:

@impl MyApp.Queryable
def base_query, do:
  from x in User, as: :user, where: not x.deleted

@impl MyApp.Queryable
def query(base_query \\ base_query(), opts) do
  Enum.reduce(opts, base_query, fn
    {:id, id}, query when is_integer(id) ->
      from [user: user] in query, where: user.id == ^id

    {:org_id, org_id}, query when is_integer(org_id) ->
      from [user: user] in query, where: user.org_id == ^org_id

    {:name, name}, query when is_binary(name) ->
      from [user: user] in query, where: user.name == ^name
  end)
end

This lets us easily ensure that base queries always contain the filters we want in most cases, while still allowing callers to opt out by passing a different base query as the first parameter to the query/2 callback.
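
As a quick usage sketch, here is how the opt-out path might look from a context module (the include_deleted? option name, and the import Ecto.Query it relies on, are assumptions for illustration):

def list_users(org_id, opts \\ []) do
  if Keyword.get(opts, :include_deleted?, false) do
    # Supply a custom base query (keeping the :user named binding) to skip the soft-delete filter
    Repo.all(User.query(from(u in User, as: :user), org_id: org_id))
  else
    # The default base_query/0 applies, so soft-deleted users are excluded
    Repo.all(User.query(org_id: org_id))
  end
end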

Consistency is Key

Do note that the API for MySchema.query/2 is very similar to the API provided by Ecto.Repo.get_by/2. This, of course, is not unintentional.

One of the issues with the compositional query function pattern was that each schema would end up implementing and defining very low-level wrappers over simple Ecto Query clauses; increasing the function surface area of your app's API, and introducing a lot of room for inconsistency in function names/parameter handling semantics to sneak in.

Arguably, the built-in Ecto approach (Repo.get_by/2) avoids this by tightly coupling the supported parameters/filters to your schema's defined fields. In fact, this is something we can trivially do too, to avoid people needing to redefine benign filters over and over again across your codebase. Instead, we should only have to call out filters which are tightly coupled to a given schema; anything generic should be provided out of the box so as to minimise the amount of thinking engineers need to do.

Since our Queryable behaviour relies on implementing a query/2 callback and enumerating filters in an Enum.reduce/3 callback, we need to write a function which takes some {key :: atom(), value :: any()} pair and an Ecto.Queryable.t(), determines if the key is known for the given schema, and, if it is, automatically applies the filtering.

We ended up implementing something like the following to achieve this:

defmodule MyApp.Queryable do
  import Ecto.Query

  @callback base_query :: Ecto.Queryable.t()
  @callback query(base_query :: Ecto.Queryable.t(), opts :: Keyword.t()) :: Ecto.Queryable.t()

  @spec apply_filter(Ecto.Queryable.t(), field :: atom(), value :: any()) :: Ecto.Queryable.t()
  def apply_filter(query, :preload, value) do
    from(x in query, preload: ^value)
  end

  def apply_filter(query, :limit, value) do
    from(x in query, limit: ^value)
  end

  def apply_filter(query, :offset, value) do
    from(x in query, offset: ^value)
  end

  def apply_filter(query, :order_by, value) when is_list(value) do
    from(x in query, order_by: ^value)
  end

  def apply_filter(query, :order_by, {direction, value}) do
    from(x in query, order_by: [{^direction, ^value}])
  end

  def apply_filter(query, :order_by, value) do
    from(x in query, order_by: [{:desc, ^value}])
  end

  def apply_filter(query, field, value) when is_list(value) do
    from(x in query, where: field(x, ^field) in ^value)
  end

  def apply_filter(query, field, value) do
    from(x in query, where: field(x, ^field) == ^value)
  end
end

@impl MyApp.Queryable
def query(base_query \\ base_query(), opts) do
  Enum.reduce(opts, base_query, fn {key, value}, query ->
    Queryable.apply_filter(query, key, value)
  end)
end

Not only does this minimise the number of clauses we need to write, it also means that we can support generic Ecto-like semantics such as preloading, naive pagination, sorting, etc. Engineers are still able to specify more complex filters, or override filters entirely, by implementing their own reduce clauses above the apply_filter/3 call.
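
For instance, a schema that needs a bespoke filter might sketch its query/2 like this, with custom clauses matched before falling through to the generic apply_filter/3 (the :search filter is made up for illustration, and this sits in a schema like the User example above, which imports Ecto.Query):

@impl MyApp.Queryable
def query(base_query \\ base_query(), opts) do
  Enum.reduce(opts, base_query, fn
    # Schema-specific filter handled explicitly...
    {:search, term}, query when is_binary(term) ->
      from x in query, where: ilike(x.name, ^"%#{term}%")

    # ...everything else falls through to the generic filters
    {key, value}, query ->
      MyApp.Queryable.apply_filter(query, key, value)
  end)
end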

Solving Reusability

Now that we've extended our Queryable behaviour to be way more generic, we've come to a pretty foundational crossroads. Not only have we written a relatively small amount of code, we've got a very small API surface area. Any function that wants to query, join, or further compose other queries simply needs to know two things: one - the schema we want to join/query/filter/compose on, and two - the parameters which need to be passed into said schema's query/2 callback.

This lends itself well to, well, the intended purpose. To reiterate: one of the problems we didn't quite manage to solve with the previous compositional query pattern was preventing low-level query details from leaking out into adjacent contexts/domains. While we could somewhat mitigate this for our core contexts, Dataloader in particular proved a genuine code-duplication challenge.

With the old composition pattern, we would be forced to implement some Dataloader source roughly as follows:

defmodule MyAppWeb.Loader do
  def query(MyApp.Billing.Payment = query, args) do
    limit = Map.get(args, :limit)
    offset = Map.get(args, :offset)
    sort = Map.get(args, :sort, :asc)
    ...

    query
    |> MyApp.Billing.Payment.paginate(limit, offset)
    |> MyApp.Billing.Payment.sort(sort)
    |> ...
  end
end

This effectively means that we're duplicating all our context functions between the core contexts defined within MyApp and MyAppWeb. Not just that; either you split MyAppWeb.Loader into various submodules with delegated functions or dispatched function calls (thus even duplicating the layout of your core contexts...), or you build one massive god module which is a problem in and of itself.

Without even really trying, we've solved this code duplication problem using our Queryable behaviour: if a given schema is defined as queryable, then we know that it implements the query/2 callback. If we know this, then we can dynamically dispatch to it without ever needing to write additional code.

The only thing holding us back from easily doing this is programmatically determining whether or not a module implements our behaviour. There are two easy ways of doing this; behaviour reflection and function reflection.

On the BEAM, behaviour metadata is compiled into a module's bytecode, and once the module is loaded it can be reflected upon like so:

@spec implements_behaviour?(module :: atom()) :: boolean()
def implements_behaviour?(module) do
  behaviours =
    module.module_info(:attributes)
    |> Enum.filter(&match?({:behaviour, _behaviours}, &1))
    |> Enum.map(&elem(&1, 1))
    |> List.flatten()

  __MODULE__ in behaviours
end

Alternatively, you can check if the given schema exposes a function of the correct name and arity via function_exported?(module, function, arity). The former is probably more semantically correct and won't return any outliers, but I'm not really sure what's preferable. In fact, the Vetspire codebase does both things at times... we definitely have some more cleaning up to do!

Regardless, for the sake of this example, it doesn't particularly matter; so I'll go with the latter simply because the result is slightly terser:

defmodule MyApp.Queryable do
  import Ecto.Query

  @callback base_query :: Ecto.Queryable.t()
  @callback query(base_query :: Ecto.Queryable.t(), opts :: Keyword.t()) :: Ecto.Queryable.t()

  ...

  @spec implemented_by(module :: atom()) :: boolean()
  def implemented_by(module), do: function_exported?(module, :query, 2)
end

defmodule MyAppWeb.Loader do
  def query(module, args) do
    if MyApp.Queryable.implemented_by(module) do
      module.query(args)
    else
      # Alternatively, fall back to some `do_query/2` function to support both
      # queryable schemas and the compositional query pattern where it is advantageous.
      raise "Dataloader relations must implement the `MyApp.Queryable` behaviour."
    end
  end
end

Of course, this is much easier to do if your Absinthe schema's filtering options are relatively similar to your Ecto schema's fields. If this is not the case, you can always still supplement and extend the query/2 callback yourself; but you literally have to write no code if things align in your favor!

Finishing Touches

Looking over where we've gotten to: we've effectively reduced our API surface area (with respect to code duplication) to pretty much a single function call; we've taken a stab at minimising room for error and inconsistency in one's implementation; and we've got something that can be used across contexts without leaking too many implementation details. I think that's pretty impressive for how little code we've written here.

The main win has been taking a step back from the original level of abstraction and building something that abstracts over "queryability" rather than "query composition". Is there anything we can do to further improve the UX and make things even nicer or more automatic? Perhaps.

One of the main learnings of the Elixir community is to not reach for macros unnecessarily, and this is advice I generally agree with. However, for building developer tools, frameworks, and for improving DX, I do think that macros are an awfully tempting tool to use, especially if you don't overly hide the complexities of what you're building.

At Vetspire, we've coupled the above approach to making our schemas queryable with a few other conveniences via a very minor use macro as follows:

defmodule MyApp.Queryable do
  defmacro __using__(_opts \\ []) do
    quote do
      use Ecto.Schema

      import Ecto.Changeset
      import Ecto.Query

      import unquote(__MODULE__)

      @behaviour unquote(__MODULE__)
      
      @impl unquote(__MODULE__)
      def base_query, do: from x in __MODULE__, as: :self

      @impl unquote(__MODULE__)
      def query(base_query \\ base_query(), filters) do
        Enum.reduce(filters, base_query, fn {field, value}, query ->
          apply_filter(query, field, value)
        end)
      end

      defoverridable query: 1, query: 2, base_query: 0
    end
  end

  @callback base_query :: Ecto.Queryable.t()
  @callback query(base_query :: Ecto.Queryable.t(), Keyword.t()) :: Ecto.Queryable.t()
  @optional_callbacks base_query: 0

  @spec apply_filter(Ecto.Queryable.t(), field :: atom(), value :: any()) :: Ecto.Queryable.t()
  def apply_filter(query, :preload, value) do
    from(x in query, preload: ^value)
  end

  def apply_filter(query, :limit, value) do
    from(x in query, limit: ^value)
  end

  def apply_filter(query, :offset, value) do
    from(x in query, offset: ^value)
  end

  def apply_filter(query, :order_by, value) when is_list(value) do
    from(x in query, order_by: ^value)
  end

  def apply_filter(query, :order_by, {direction, value}) do
    from(x in query, order_by: [{^direction, ^value}])
  end

  def apply_filter(query, :order_by, value) do
    from(x in query, order_by: [{:desc, ^value}])
  end

  def apply_filter(query, field, value) when is_list(value) do
    from(x in query, where: field(x, ^field) in ^value)
  end

  def apply_filter(query, field, value) do
    from(x in query, where: field(x, ^field) == ^value)
  end
end

This means that we can swap any use Ecto.Schema with use MyApp.Queryable and we'll end up importing everything we need to define a schema, define our changeset, and define our queryable behaviour. Since we inject default implementations for base_query/0 and query/2, the majority of our simple schemas don't need to define any other code to make use of everything we've built. Any schemas that do require extension only need to implement whatever callbacks they explicitly want to extend; everything else should work out of the box!
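
For completeness, here's a minimal sketch of what that looks like in practice (the Post schema and its fields are made up for illustration):

defmodule MyApp.Blog.Post do
  # `use MyApp.Queryable` pulls in Ecto.Schema, Ecto.Changeset, Ecto.Query, and
  # the default base_query/0 and query/2 implementations shown above
  use MyApp.Queryable

  schema "posts" do
    field :title, :string
    field :deleted, :boolean, default: false

    timestamps()
  end

  def changeset(%__MODULE__{} = post, attrs) do
    cast(post, attrs, [:title, :deleted])
  end
end

# Elsewhere, no extra query code is needed:
# MyApp.Repo.all(MyApp.Blog.Post.query(title: "Hello", limit: 10, order_by: :inserted_at))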

Conclusion

This leaves us at an interesting place: I feel we're doing well with respect to our Ecto Queries; we've largely eliminated duplication and solved a few problems (though admittedly, making some tradeoffs along the way).

The next step we could take is extending our MyApp.Queryable module to not only give us better query semantics, but also to improve a bunch of the schema/changeset stuff which always grinds my gears with vanilla Ecto. We've taken some steps to solve this at Vetspire, and as that work matures, I'm sure I'll write up a related follow-up to this post!

Another interesting direction would be to take this idea one step further: if we're going as far as to couple our queryable behaviour with Absinthe schema definitions, how easily could we extend Ecto schemas to automatically generate Absinthe schemas/queries/mutations? I'm sure something like this would be both super fun to do and valuable too!

Until then, however, I'd love to hear what the Elixir community thinks about this abstraction—I'm aware this isn't the most mindblowing thing in the world, but it's a much-appreciated small step in the right direction for DX and discoverability in my own code, and where we've embraced it at Vetspire, it's definitely eased a lot of low-level, repetitive code writing.

Happy hacking, and thanks for reading!

Compositional Elixir—Ecto Query Patterns

Compositional Elixir—Ecto Query Patterns

When I was a consultant, one of the opportunities I was most grateful for was the chance to get knee-deep in a dizzying variety of Elixir codebases. A direct result of this is that I've honed in on a set of Elixir patterns and conventions that I end up pulling into just about every single new project I get my hands on.

I've written high-level posts about patterns I particularly like, such as railway-oriented programming and ways to make your Elixir code more composable, but today I felt like going on a little bit of a deep-dive into getting the most out of one of the most commonly used Elixir libraries: Ecto!

ELI5—Ecto

Obviously, for a much more fully-featured introduction to Ecto, I recommend jumping right in and reading the official docs/guide as well as Programming Ecto, but a brief and abridged overview I can do.

Ecto is effectively the equivalent of the ORMs that you'll find used in, or bundled with, popular web frameworks. While Ecto isn't strictly an ORM, you can think about it doing the same basic job: you define schemas and queries which allow you to read/write records stored in your database of choice (most commonly Postgres) using standard Elixir structs.

On a high level, Ecto is a collection of three different bits and pieces: Ecto.Schema, Ecto.Query, and Ecto.Changeset; providing you with the means of mapping between Elixir structs and your database schema, an extremely robust query builder, and a data validation library, respectively. Hopefully, that clears up the title of this post—today we'll be focusing on ways to make the most out of Ecto.Query specifically 💪

A Common Problem

One of the things I optimize for as a developer is separating my concerns whenever possible and creating boundaries/domains to isolate different "ideas" and "concepts" in whatever codebase I'm working on.

Unfortunately, on most projects I've worked on, Ecto code tends to be strewn all across the codebase with little regard for how/why it's being strewn about. When projects use libraries such as Absinthe (especially with Dataloader), problems usually end up compounding even more.

For (a surprisingly realistic) example's sake, imagine we're working on a standard Elixir project which uses Phoenix and Absinthe; it'll usually end up having a directory structure not very different from this:

├── my_project/
│   ├── my_context/my_entity.ex
│   └── my_context.ex
└── my_project_web/
    ├── controllers/my_entity_controller.ex
    ├── resolvers/my_entity_resolver.ex
    ├── types/my_entity_types.ex
    └── my_loader.ex

And while organization wise this is what we've come to expect when following standard Elixir conventions, the shocking thing is that every file I've listed here usually contains Ecto code. To make things even worse, these files don't just contain any old Ecto code, but they go as far as to define new Ecto.Querys! In my mind, this is a huge no–no.

Before I get ahead of myself, I'll briefly explain how I interpret the different roles of the files shown in the directory structure example:

  • Everything within my_project/ is what I'd define as "application logic". This is the real meat and potatoes of our app; this is where we would read/write stuff to/from the database, where we'd write code to actually do stuff (as nebulous as that sounds).
    • my_context.ex is a standard Phoenix style context, which, in other words, is just a simple Elixir module that we treat as the public API for doing "things" concerning that context. A better-named example than my_context.ex would be something like accounts.ex, which exposes functions such as Accounts.register_new_user/N, Accounts.create_login_token_for_user/N, and Accounts.send_forgot_password_email/N.
    • my_entity.ex is a standard Ecto.Schema, something that maps records in your database into standard Elixir structs. A more realistic schema name would be user.ex, or admin.ex.
  • Everything within my_project_web/ is what I'd define as a "consumer application". Typically, I don't expect very much real business logic to live in this directory. Modules defined in these files will simply wrap one or more functions defined by contexts in my_project/, maybe doing some mapping/rendering/converting of different data formats.
    • controllers/my_entity_controller.ex is a standard Phoenix controller which lets us trigger code in response to certain REST-like actions.
    • resolvers/my_entity_resolver.ex and types/my_entity_types.ex are relatively normal Absinthe files which are effectively the equivalent of our Phoenix controller, but for GraphQL.
    • my_loader.ex is a Dataloader specific concept; think of it as a way of batching GraphQL requests to avoid the N+1 problem.

Hopefully, then, you can see why having quite literally every single file here create Ecto.Querys is a code smell? To re-iterate, in case it isn't clear, Ecto.Querys are basically a way for you to write SQL queries to read/write from your database. Now, in reality, you'll likely have a bunch of different resolvers, controllers, types, schemas, contexts; so your problems can only get worse from here... 😅

The issues described here are all effectively different flavors of DRY: I've seen many maintenance nightmares where there is no source of truth for how something is meant to work; instead, you get hundreds of duplicated, one-off Ecto.Querys built all over the place, doing similar but slightly different things. Not only is this the antithesis of comprehensibility, but it's also absolutely unmaintainable: how can anyone sleep at night knowing the next day they'll need to find and update every single query dealing with SomeEntity?!

Much like how I describe error handling and propagation, Ecto.Querys are something I want to push deep into one layer of my codebase. In my mind, Ecto.Querys are an extremely low-level implementation detail that is tightly coupled to your Ecto.Schema and database, and we should treat them as such.

Thankfully, with some elbow grease and hard work (and hopefully, a comprehensive test suite 🤞), we can gradually refactor codebases from the state I've just described into something a little more "onion-like" in shape: having a dedicated database-layer (containing all your Ecto code in one centralized place), a business logic layer (Pretty much just contexts and loader.ex, consumers of your database layer), and one or more consumer layers (Phoenix controllers, Oban workers, Absinthe resolvers, etc.).

One Small Step...

Continuing along with the example directory structure above, let's imagine that our context currently looks like this:

defmodule MyProject.MyContext do
  alias MyProject.MyContext.MyEntity

  alias MyProject.Repo

  import Ecto.Query

  def does_entity_exist?(entity_id) do
    from(
      e in MyEntity,
      where: e.id == ^entity_id,
      where: coalesce(e.deleted, false) != true,
      limit: 1
    )
    |> Repo.exists?()
  end

  def count_entities do
    from(e in MyEntity, where: coalesce(e.deleted, false) != true)
    |> Repo.aggregate(:count)
  end

  def count_entities(org_id) do
    from(
      e in MyEntity,
      where: e.org_id == ^org_id,
      where: coalesce(e.deleted, false) != true
    )
    |> Repo.aggregate(:count)
  end
end

This looks okay so far, right? I'd go as far as to call this pretty comfortable, since this is what a lot of Elixir projects end up looking like. It works well and is easy to grok in isolation, but we can already foresee some issues that might come up (even with this one module):

  1. Note the repeated query expressions (where: coalesce(e.deleted, false)), which are present in every query. This makes sense since we don't want deleted entities to "exist" or be "counted", but it definitely makes future refactorings in this file a little daunting.

    What if we were asked to amend this where: coalesce(e.deleted, false) clause to also consider something like e.updated_at, so that we can effectively ignore the existence of entities that haven't been updated in some time? We'd need to go and carefully update every function and manually implement this check ourselves. Not exactly safe, and definitely error-prone.

  2. Since we're building queries on the fly, how is anyone supposed to know what fields are available in our schema? Someone who needs to maintain this file will need to know, say, that MyEntity has an org_id field or a deleted flag. IDE autocompletion gets us part of the way there, but it's flakey at best (at least in Vim).

    This might be fine right now, as we're working with a single schema, but imagine we need to start joining on different entities and remembering what each entity's respective fields are. This requires a great amount of additional mental overhead and exponentially increases the surface area for things to go wrong.

For me, both of these issues highlight the fact that we aren't working at the right level of abstraction for our contexts. The first issue shows us that we need to centralize our queries into a single source of truth. The second issue shows that we're leaking database-level implementation details all over the place (extrapolate what this means when you have queries being built this way across your entire codebase...)

The first thing I usually do when confronted with a codebase that looks like this is to look at all the individual query expressions being used and refactor those into functions. I like putting these functions directly in whatever schema the query's primary entity is (in our case, it'd be my_entity.ex). That way, we can compose our Ecto.Query the same way as before, but leveraging functions instead:

defmodule MyProject.MyContext do
  alias MyProject.MyContext.MyEntity

  alias MyProject.Repo

  # Ecto.Query.limit/2 is a macro, so we still need to require Ecto.Query here
  require Ecto.Query

  def does_entity_exist?(entity_id) do
    MyEntity
    |> MyEntity.where_not_deleted()
    |> MyEntity.where_id(entity_id)
    |> Ecto.Query.limit(1) 
    |> Repo.exists?()
  end

  def count_entities do
    MyEntity
    |> MyEntity.where_not_deleted()
    |> Repo.aggregate(:count)
  end

  def count_entities(org_id) do
    MyEntity
    |> MyEntity.where_not_deleted()
    |> MyEntity.where_org_id(org_id)
    |> Repo.aggregate(:count)
  end
end

defmodule MyProject.MyContext.MyEntity do
  ...

  def where_not_deleted(query) do
    from e in query,
      where: coalesce(e.deleted, false) != true
  end

  def where_id(query, id) do
    from e in query,
      where: e.id == ^id
  end

  def where_org_id(query, org_id) do
    from e in query,
      where: e.org_id == ^org_id
  end

  ...
end

Hopefully, it's clear that we're already starting to address the issues I described. Thankfully, Ecto makes it dead easy to make queries composable out-of-the-box, so this refactor ends up not being very much work.

The first issue ends up being resolved quite directly, whereas the second one is resolved by... you got me: IDE autocompletion! Why is this an acceptable fix now when it wasn't before? Because we're now working at a higher level of abstraction—it doesn't matter, say, if where_org_id needs to join some other entities to do its job; those are just database-level implementation details that are hidden from you. You're composing natural-language statements in far more of a declarative way than if you were to just write the Ecto.Querys by hand.

Going back to the bits of added complexity I mentioned when describing the problems above, imagine we really did want to implement additional filters such as this where_not_outdated check (perhaps where_not_archived would be a better name?): all we would need to do is implement that as a function in my_entity.ex and call it as part of whichever pipeline needs the addition!
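
A hedged sketch of what that hypothetical filter could look like (the function name, the cutoff argument, and the assumption that updated_at is a utc_datetime field are all illustrative):

# In my_entity.ex, alongside the other query helpers
def where_not_archived(query, cutoff) do
  from e in query,
    where: e.updated_at >= ^cutoff
end

# ...and in whichever context pipeline needs the new check:
def does_entity_exist?(entity_id) do
  thirty_days_ago = DateTime.add(DateTime.utc_now(), -30 * 24 * 60 * 60, :second)

  MyEntity
  |> MyEntity.where_not_deleted()
  |> MyEntity.where_not_archived(thirty_days_ago)
  |> MyEntity.where_id(entity_id)
  |> Repo.exists?()
end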

Not only do we minimize code duplication in our contexts; because these new composable query functions can be used elsewhere, we can also deduplicate the queries strewn all over the codebase (loader.ex, I'm looking at you 👀). The ultimate goal is to have the source of truth for anything database-related (insofar as MyEntity is concerned) be contained within my_entity.ex itself.

Down the Rabbit Hole

We're not done yet: iterative software development throws another spanner in the works! While the code we've been left with looks to be in a much better place, imagine we need to revisit how we've implemented the MyEntity.where_org_id/2 check above: instead of being able to store an org_id field directly on the entity's schema, we now need to join the owner of our entity, and then join the org of that owner.

Jumping right in, if we try to solve this problem naturally and iteratively using the pattern we've just described, we should end up with something quite similar to the following:

defmodule MyProject.MyContext.MyEntity do
  ...

  def where_org_id(query, org_id) do
    from e in query,
      join: c in assoc(e, :owner),
      join: o in assoc(c, :org),
      where: o.id == ^org_id
  end

  ...
end

Surprisingly, this teeny three-line change does the trick! Before we celebrate, however, I'll point out the Pandora's box we just unleashed into our codebase 😉

The new issue we've introduced is that MyEntity.where_org_id/2 implements a check on data it doesn't own! This might sound fine on such a small scale. Still, I've seen this kind of thing escalate to ridiculous levels—it's easy to see how this regresses into the same problem we were trying to solve at the beginning of all of this: if multiple entities all need to check their org_id, at this rate, we'd be reimplementing this same check in literally every schema in our project!

Ideally, we'd just be able to join on our owner struct and call Owner.where_org_id(owner, ^org_id) on it; however, this isn't something that can be expressed in Ecto.Query's DSL of course—the owner struct isn't realized here yet and just represents an expression in a yet-to-be-executed SQL query.

We can express something roughly analogous, however! If we keep going down the rabbit hole of trying to break the query example we arrived at into smaller and smaller pieces, let's see if anything clicks and becomes self-evident:

def where_org_id(query, org_id) do
  query
  |> join_owner()
  |> where_owner_org_id(org_id)
end

defp join_owner(query) do
  from e in query,
    join: c in assoc(e, :owner)
end

defp where_owner_org_id(query, org_id) do
  from [..., c] in query,
    where: c.org_id == ^org_id
end

This is a little closer to where we want to be: it encapsulates all of the concerns we have into their own functions, which is good, but we still have yet to solve our issue. Let's take a look at what exactly is insufficient with these three functions.

  1. We still can't move this where_owner_org_id/2 check into the Owner schema. We've decomposed our query function into atomic pieces that we literally can't split anymore, but we still have a big problem.

    Queries thus far have been composable because we're always querying against the main entity (from here on out, we'll call it the first binding) of a query. After we do a join_owner/1, we need to keep in mind that we have an owner binding in our query as well. In addition to this, if we want to move our where_owner_org_id/2 to Owner.where_org_id/2, we need to be somehow able to handle the fact that owner isn't the first binding of this query.

  2. Even if we could somehow determine that owner is the second binding of a query in some cases, how do we get this to work in general? What if we expose multiple join_blah/1 functions? We would need to micromanage the order entities are joined all over the place to try and be consistent. This isn't maintainable or anywhere near the idea of a fun time...

  3. If multiple queries end up calling join_owner/1 internally, Ecto.Query will end up generating SQL with duplicated joins. This might cause issues when running your final SQL query—it isn't optimal, that's for sure! This is likely(?) optimized away by some databases, but maybe not. Either way, it's something to avoid if we can help it.

Thankfully, Ecto provides us with the tool we need to solve all of these issues in one fell swoop by unlocking a new composition superpower: named bindings!

Named bindings work as they sound: they let you name bindings such that you can refer to them later in the query by a given name rather than position. This is in contrast to Ecto's default behavior, which is positional binding: the order in which you bind entities into your query determines how you use them later.

You can create a named binding as simply as using the as keyword following a from or join keyword like so:

defp join_owner(query) do
  from e in query,
    join: c in assoc(e, :owner),
    as: :owner
end

Since naming is hard, I follow the simple convention whereby if I'm joining on a module named SomeSchema, I tend to want to be consistent and name my binding something like :some_schema. This convention is nice because I can reuse any queries that work for :some_schema, no matter the source of the query.

Once a named binding has been created, you can query against said named binding with the syntax below:

defp where_owner_org_id(query, org_id) do
  from [owner: owner] in query,
    where: owner.org_id == ^org_id
end

This, if you noticed, is pretty much exactly what we want. This query is not coupled to anything but the fact that owner is a named binding in our query, which naturally leads us to my next big point (and quite likely my most contrarian one): I opt for never using positional bindings; instead, I tend to use named bindings for quite literally everything.

If we use named bindings (especially if they're consistently named), that means we can freely compose literally any queries between any schemas we've defined in our application! Talk about reaching the end of the rabbit hole 💪

(Consistently named) named bindings also let us avoid Ecto generating duplicate joins where none are needed: Ecto.Query provides the predicate function Ecto.Query.has_named_binding?/2, which returns true if the given Ecto.Query has a named binding with the given name. This means we can update our join_owner/1 function to be a no-op if we already have the required join in place:

defmodule MyProject.MyContext.MyEntity do
  alias MyProject.MyContext.Owner
  ...

  def where_org_id(query, org_id) do
    query
    |> join_owner()
    |> Owner.where_org_id(org_id)
  end

  defp join_owner(query) do
    if has_named_binding?(query, :owner) do
      query
    else
      from [my_entity: my_entity] in query,
        join: _owner in assoc(my_entity, :owner),
        as: :owner
    end
  end

  ...
end

defmodule MyProject.MyContext.Owner do
  ...

  def where_org_id(query, org_id) do
    from [owner: owner] in query,
      where: owner.org_id == ^org_id
  end

  ...
end

As a side note, I raised an issue on the Ecto mailing list about implementing a guard for this check (since all the data is available by inspecting the Ecto.Query struct).

It won't be available any time soon since it requires the latest version of Elixir. Still, hopefully, one day, we can re-implement the if check purely by pattern matching in a function head with when has_named_binding(query, binding) instead! How neat is that?!

Going the extra mile (and to make automating our named binding generation easier since we're going to want to do it every time we query something), instead of having query pipelines in our context start with a schema module (i.e. MyEntity), I typically define base_query/0 functions in all of my schemas.

These base_query/0 functions serve two purposes: the first being a place to hide the implementation detail of making a named binding, and the second being a convenient place to group queries we always want to do in one place (i.e., the where_not_deleted/1 and where_not_archived/1 checks we talked about above). This equates to the following changes:

# Instead of this...
defmodule MyProject.MyContext do
  ...

  def count_entities do
    MyEntity
    |> MyEntity.where_not_deleted()
    |> Repo.aggregate(:count)
  end

  ...
end

# We do this...
defmodule MyProject.MyContext do
  ...

  def count_entities do
    MyEntity.base_query()
    |> Repo.aggregate(:count)
  end

  ...
end

defmodule MyProject.MyContext.MyEntity do
  ...

  def base_query() do
    query = from e in __MODULE__, as: :my_entity

    query
    |> where_not_deleted()
  end

  ...
end

With this pattern in hand, you should find yourself empowered to weave, stitch, and compose even the most complex of Ecto queries together! Even so, there is much further we can go—Ecto.Query provides other ways of extending and composing queries beyond the simple expression chaining we've discussed.

If you're interested in this, I definitely suggest reading into fragments, dynamic queries, and subqueries. You can truly build some amazing things with these tools, but dedicated deep-dives for all of these will be saved for another time as they'll prove to be quite the detour.

Compose all the things!

There's still a little bit more we can do to eke out the most composability in our application layer. Now that our Ecto.Querys are written with composability in mind, we can go ahead and refactor our application layer to be dynamic rather than static: what I mean by this is until now, we've written our queries as standard and static Elixir pipelines.

There isn't a reason to limit ourselves to this since Ecto.Query composition is ultimately just accumulating expressions in a struct; it's something we can accumulate over! Since our example talked about being a Phoenix/Absinthe app in the beginning (though we don't really care too much about those details), naturally, our app exposes some public API that provides our users with the ability to provide us arbitrary query parameters that we need to handle.

Tangentially: while I recognize the necessity of a project like Dataloader, it's annoying to me that Dataloader effectively ends up forcing us to duplicate functions we've already defined in our contexts.

For the sake of simplicity, I'm going to ignore Dataloader's existence for the following example.

If your project does use Dataloader, at least with compositional queries, you can try to re-use them between your loader.ex and contexts.

If we assume our app exposes a GraphQL endpoint that ends up looking like this:

object :entity_queries do
  field :entities, non_null(list_of(non_null(:entity))) do
    description("Retrieve a list of all patients at your org")

    arg(:owner_id, :id, description: "Include only entities belonging to the given owner")

    arg(:include_deleted, :boolean, description: "Include deleted entities in results")
    arg(:include_archived, :boolean, description: "Include archived entities in results")

    arg(:limit, :integer, description: "Limit the number of results")
    arg(:offset, :integer, description: "For pagination, offset the return values by a number")

    resolve(fn args, _context = %{current_user: %{org: %Org{} = org}} ->
      # somehow implements this endpoint
    end)
  end
end

We now have the problem of needing to query data from our database based on whatever combination of args gets passed in by our API user.

One way of dealing with this need is to build the query in our resolver function itself dynamically (this is typically what gets done by Dataloader or Relay), but I don't want to conflate application layer concerns with consumer layer concerns; I don't want to go this route.

We already have a perfectly operational function in our contexts, so more often than not, my resolver functions end up being very lightweight wrappers around context functions (again, everything in my_project_web/ lives in our consumer layer, so this is to be expected). Usually I end up writing something analogous to:

# My resolver function in my Absinthe schema
resolve(fn args, _context = %{current_user: %{org: %Org{} = org}} ->
  MyContext.list_entities(org, args)
end)

# Corresponding in context
defmodule MyProject.MyContext do
  ...
  
  def list_entities(%Org{} = org, args) when is_map(args) do
    MyEntity.base_query()
    |> MyEntity.where_org_id(org.id)
    |> MyEntity.where_owner_id(args.owner_id)
    |> Ecto.Query.limit(^args.limit)
    |> Ecto.Query.offset(^args.offset)
    |> Repo.all()
  end

  ...
end

# Corresponding in schema
defmodule MyProject.MyContext.MyEntity do
  ...
  
  def where_owner_id(query, nil), do: query

  def where_owner_id(query, owner_id) do
    from [my_entity: entity] in query,
      where: entity.owner_id == ^owner_id
  end

  ...
end

As you can see, simply leveraging pattern matching in our composable query expression functions lets us define quite clearly what possible filters a given function can perform, while not explicitly requiring any particular pieces of data.

For particularly gnarly or complex outside-world-dependent queries (i.e., if a single arg ends up chaining multiple query expressions), I tend to prefer to write in the following pattern instead:

defmodule MyProject.MyContext do
  ...
  
  def list_entities(%Org{} = org, args) when is_map(args) do
    base_query =
      MyEntity.base_query()
      |> MyEntity.where_org_id(org.id)

    query =
      Enum.reduce(args, base_query, fn
        {:limit, limit}, query ->
          Ecto.Query.limit(query, ^limit)

        {:offset, offset}, query ->
          Ecto.Query.offset(query, ^offset)

        {:owner_id, owner_id}, query ->
          MyEntity.where_owner_id(query, owner_id)

        {:something_complex, complex}, query ->
          query
          |> MyEntity.do_something()
          |> MyEntity.something_else()
          |> MyEntity.where_complex_thing(complex)

        _unsupported_option, query ->
          query
      end)

    Repo.all(query)
  end

  ...
end

Alternatively, you can define MyEntity.where_complex_thing/2 to be a function that internally combines multiple other, small queries.
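
For instance, a sketch of that alternative could look something like this (the helper names and complex_field here are purely illustrative, following the document's own placeholders):

defmodule MyProject.MyContext.MyEntity do
  ...

  # Composes the same smaller query helpers internally, so the context's
  # reduce clause collapses down to a single call
  def where_complex_thing(query, complex) do
    query
    |> do_something()
    |> something_else()
    |> where([my_entity: e], e.complex_field == ^complex)
  end

  ...
end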

It can always be a delicate balance between creating hundreds of these one-off queries which map one-to-one with API options versus having super low-level query functions which don't do anything on their own. Still, much like the difficulty of naming things, this is where we just need to make judgment calls and refactor as we start seeing problems.

Thankfully, once we've reached this stage of API design and query composition, refactoring and reorganizing things tends to be much less daunting a task, so it's not too big of an ordeal ✊

Conclusion

Hopefully, I've said and shown enough to convince you that many projects use Ecto sub-optimally (or at the very least, that there are ways to make Ecto work harder for you!).

Ecto really does prove to be one of the nicest ORM-like pieces of technology I've ever used, and I'd love to see more people embrace the power that Ecto has to offer.

One of the huge benefits of composing queries like this outside of centralizing the source of truth for querying the database is the increase in confidence it gives us. I'm not personally a TDD Zealot (or hardcore believer in any development dogma), but I do aim to test the majority of the code I write, at the very least, with unit tests.

The nice thing about unit testing context functions made up of these composable query functions is that every single test compounds the confidence of my underlying query expressions more and more. Since I'm extensively testing the bare building blocks of all the business logic in my app, I'm really compounding what I get out of my test suite—definitely by a few orders of magnitude compared to projects which don't centralize their queries in one place.

There is still much, much more that could be said about Ecto; we've only really scratched the surface. I want to eventually talk about how you can make good use of Ecto.Query.fragment/N (has anyone else noticed it's a variadic macro?! How?! 🤯), subqueries, dynamic queries, etc. Until then, if you enjoyed this (admittedly opinionated) post, I've written another Ecto-related post on the shortcomings of, and solutions to, Ecto's virtual fields.

I hope this has helped! Until next time ☕

Compositional Elixir—Contexts, Error Handling, and Middleware

Compositional Elixir—Contexts, Error Handling, and Middleware

This post serves as a follow-up to a post I recently wrote about Railway Oriented Programming, and abusing Elixir's with statement to great effect. As part of that post, I briefly talked about one major advantage of the with pattern being the deference of error handling to the site at which one needs to consume the errors, rather than sprinkling error handling code all over the place.

The main idea is that all of the business logic you write should be made of small, composable units which you can string together in such a way that you shouldn't have to handle errors in each individual unit. The way I build programs is that each small unit essentially has a hard dependency on other units that it itself calls: if any of these sub-units fail, then that failure should propagate itself up to the client or call site—de-coupling high-level logic/thinking such as "what do I do if this subsystem is down?" from the expectation that the happy path should just work.

TLDR; the high-level pattern

For those uninterested or just wanting to get the gist of what was discussed in the linked post above: the with statement gives two major wins over Elixir's simpler case construct:

  1. It stops you from having to break every possible piece of conditional logic into either another function (in many cases, adding much noise to a module) or a further nested case (which only serves to make things much harder to follow, especially when combined with the second point).

  2. It allows you to be very terse. If you don't explicitly want to handle errors, you don't have to. Errors automatically bubble up for you to conditionally handle, or not, as you please. This is especially useful if you otherwise would have many nested case statements—something common in the Erlang world, and perhaps only isn't so in the Elixir world because of constructs such as with.

When I'm writing logic in Elixir, I almost always write it in the following form:

def some_function() do
  with {:ok, _step_1} <- do_step_one(),
       {:ok, _step_2} <- do_step_two(),
       {:ok, step_3} <- do_step_three() do
    {:ok, step_3}
  end
end

If I call such a function, any of do_step_one/0, do_step_two/0, or do_step_three/0 may fail. If any one of these functions doesn't return something in the shape {:ok, _something}, I will receive that value instead. This means that everything is dandy so long as those three functions each return a domain-specific and helpful error.

Assuming do_step_one/0 can fail, and assuming it does something super important as follows:

defp do_step_one() do
  case check_database_connection() do
    {:ok, %Db.Connection{} = connection} ->
      {:ok, connection}

    _otherwise ->
      {:error, :database_down}
  end
end

As a caller of some_function/0, I will very nicely get a reason as to why my invocation of some_function/0 failed. No additional code needed!
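
Concretely, calling it from a REPL while the database is down might look something like this (purely illustrative):

iex> some_function()
{:error, :database_down}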

Alas, not all systems are so conveniently self-contained. The majority of the systems I've worked on in the last two years as a consultant on a few large Elixir projects for multiple clients have been services exposed on the internet and utilised by numerous different devices. This kind of "letting errors propagate to the client" sounds great, but what does that actually mean in the context of a large Elixir application that's serving users over some API or website, versus someone playing around in a REPL?

High-level architecture for a Phoenix application

Phoenix is probably one of the most common frameworks we see as Elixir developers. Phoenix has a very loosely defined convention of splitting up your business logic into the following sections:

  • Controllers: these are modules responsible for mapping HTTP requests to business logic in Contexts. Controllers typically just encapsulate CRUD actions, but ultimately, are just small wrappers transforming external requests into one (or more) Context function invocations in series.

  • Contexts: these are small modules that encapsulate "units" of business logic. In never-so-certain terms, I might have a high-level context called Auth, which provides functions that work on entities vaguely related to the concept of "authorization", such as: create_user, validate_user_password, user_belongs_to_organization? etc.

  • Schemas: these are modules that define structs for your contexts to work against. More often than not, these structs are backed by Ecto (regardless of whether or not you're using a database). These modules might provide minimal utility functions for performing work on these structs, validating them, etc.

Even if you're not using Phoenix in your Elixir app, I've definitely found much merit in this way of splitting up a system. In the majority of projects or libraries I work on, I adopt something similar. You can get further decoupling using umbrella projects instead of standard Elixir projects, but this is not an important detail for now.

One nice thing about this controller/context/schema split in particular is that not only is the concern split (web-layer/business-logic-layer/database-layer) evident, the error handling concern is appropriately split too:

# This is our controller:
defmodule MyAppWeb.SomethingController do
  use MyAppWeb, :controller

  alias MyApp.Somethings
  alias MyApp.Somethings.Something

  def show(conn, %{"id" => id}) do
    case Somethings.get_something_by_id(id) do
      {:ok, %Something{} = something} ->
        conn
        |> put_status(:ok)
        |> render("show.json", %{something: something})

      {:error, reason} when reason in [:not_found, ...] ->
        conn
        |> put_status(reason)

      _error ->
        conn
        |> put_status(:internal_server_error)
    end
  end
end

# This is our context:
defmodule MyApp.Somethings do
  alias MyApp.Something
  alias MyApp.Repo

  def get_something_by_id(id) do
    Something
    |> Something.where_id(id)
    |> Something.where_not_deleted()
    |> Repo.one()
    |> case do
      %Something{} = something ->
        {:ok, something}

      nil ->
        {:error, :not_found}
    end
  end
end

# This is our schema:
defmodule MyApp.Somethings.Something do
  use Ecto.Schema
  import Ecto.Changeset
  import Ecto.Query

  schema "somethings" do
    field :something, :string
    field :deleted, :boolean, default: false
  end

  def changeset(%__MODULE__{} = entity, attrs) do
    entity
    |> cast(attrs, [:something, :deleted])
  end

  def where_id(query, id) do
    from s in query,
      where: s.id == ^id
  end

  def where_not_deleted(query) do
    from s in query,
      where: s.deleted == false
  end
end

Note that each layer of our business logic example here handles its own concerns:

  • Our schema only cares about providing composable, minimal functions which detail operating with the database. It validates stuff going into the database and provides composable queries which can be chained together arbitrarily.

  • Our context uses the small functions available to it from schemas (and other contexts, such as MyApp.Repo) to provide atomic chunks of logic: fetch X from the database, create a new Y, etc.

  • Our controller maps both successful and unsuccessful invocations of our context functions into domain-specific responses, which we return to the client making the web request.

An example of composability and extensibility

This is essentially the entire idea at work already, just a straightforward version of it. Imagine, instead of the simple example above, we're working to authenticate a user trying to log into a service. I'll leave out the schema implementation details for brevity, but an example would end up looking like this:

# This is our controller:
defmodule MyAppWeb.AuthTokenController do
  use MyAppWeb, :controller

  alias MyApp.Auth
  alias MyApp.Auth.User
  alias MyApp.Auth.Token
  alias MyApp.Auth.Organization

  @encodable_errors [:unauthorized, :not_found, :invalid_password]

  def create(conn, %{"username" => username, "password" => password, "org_code" => org_code}) do
    with {:ok, %Organization{} = organization} <- Auth.get_organization_by_code(org_code),
         {:ok, %User{} = user} <- Auth.get_user_by_username_and_organization(username, organization),
         {:ok, %User{} = user} <- Auth.validate_user_password(user, password),
         {:ok, %Token{} = token} <- Auth.create_auth_token(user, organization) do
       conn
       |> put_status(:ok)
       |> render("show.json", %{token: token})
    else
      {:error, reason} when reason in @encodable_errors ->
        conn
        |> put_status(reason)

      {:error, %Ecto.Changeset{} = error} ->
        conn
        |> put_status(:bad_request)
        |> json(build_error(error))
    end
  end

  # Some extensive pattern matching
  defp build_error(error), do: ...
  defp build_error(error), do: ...
  defp build_error(error), do: ...
  defp build_error(error), do: ...
end

# This is our context:
defmodule MyApp.Auth do
  alias MyApp.Auth.User
  alias MyApp.Auth.Token
  alias MyApp.Auth.Organization
  alias MyApp.Repo

  @ttl_minutes 60

  def get_organization_by_code(org_code) do
    Organization
    |> Organization.where_org_code(org_code)
    |> Repo.one()
    |> case do
      %Organization{} = organization ->
        {:ok, organization}

      nil ->
        {:error, :not_found}
    end
  end

  def get_user_by_username_and_organization(username, %Organization{id: organization_id}) do
    User
    |> User.where_username(username)
    |> User.where_is_authenticated()
    |> User.where_organization_id(organization_id)
    |> User.is_not_deleted()
    |> Repo.one()
    |> case do
      %User{} = user ->
        {:ok, user}

      nil ->
        {:error, :unauthorized}
    end
  end

  def validate_user_password(%User{} = user, password) do
    if User.encrypt_password(password) == user.hashed_password do
      {:ok, user}
    else
      {:error, :invalid_password}
    end
  end

  def create_auth_token(%User{} = user, %Organization{} = organization, ttl_minutes \\ @ttl_minutes) do
    %Token{}
    |> Token.changeset(%{user: user, organization: organization, ttl_minutes: ttl_minutes})
    |> Repo.insert()
  end
end

Ignoring whether or not we want to expose such fine-grained errors for an auth flow (we never do!), we can see the strength of being able to compose everything. Literally every single piece of business logic here is built to be trivially composable. If we wanted to make it such that a user could request a higher token ttl, we could extend the controller function:

def create(conn, %{"username" => username, "password" => password, "org_code" => org_code} = params) do
  custom_ttl = Map.get(params, "ttl", 1200)

  with {:ok, %Organization{} = organization} <- Auth.get_organization_by_code(org_code),
       {:ok, %User{} = user} <- Auth.get_user_by_username_and_organization(username, organization),
       {:ok, %User{} = user} <- Auth.validate_user_password(user, password),
       {:ok, %Token{} = token} <- Auth.create_auth_token(user, organization, custom_ttl) do
     conn
     |> put_status(:ok)
     |> render("show.json", %{token: token})
  else
    {:error, reason} when reason in @encodable_errors ->
      conn
      |> put_status(reason)

    {:error, %Ecto.Changeset{} = error} ->
      conn
      |> put_status(:bad_request)
      |> json(build_error(error))
  end
end

# Some extensive pattern matching
defp build_error(error), do: ...
defp build_error(error), do: ...
defp build_error(error), do: ...
defp build_error(error), do: ...

And suppose Auth.Token.changeset/2 had some validation for the minimum or maximum ttl that can be set? Well, those errors would propagate up through the context and the controller, be serialized into some format the client can understand, and be returned to the user!
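
As a hedged sketch, such a validation might be expressed with standard Ecto.Changeset helpers (the field names and bounds here are assumptions):

def changeset(%__MODULE__{} = token, attrs) do
  token
  |> cast(attrs, [:ttl_minutes])
  # ... plus whatever other casts/assocs the token actually needs ...
  |> validate_number(:ttl_minutes, greater_than: 0, less_than_or_equal_to: 1440)
end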

This is the power of writing code with respect primarily towards the happy path.

Committing to happy-path-programming: Middleware!

Now that we've outlined the gist, we can dive even deeper into this cult of composability: let's remove error handling from view entirely!

If we take the auth flow example we've been working on, this is all well and good, but we've ended up with a bunch of boilerplate. We still need to extensively pattern match and handle errors from our contexts to map said errors into something client-specific for every action we want to expose through our controllers. Isn't this orthogonal to the point of Railway-Oriented programming? Perhaps.

Unfortunately, for any case where we expect an error to occur, we really do want to map it for the user's benefit. Unexpected errors will always arise, and unexpected errors warrant exceptional behaviour (i.e. crashing the process responsible for handling the user's request a-la OTP), but errors that are entirely within the domain of errors we expect? We should handle those.

Thankfully, instead of writing macros we inject everywhere (a common approach on a few projects I've seen), or just deciding to say "screw it" and manually implementing the boilerplate, we can be a little smarter about it.

I'm not too familiar with the internals of Phoenix, but essentially, all requests that come in go through a pipeline of functions (plugs). I'm not sure if semantically your controller function is a plug, but it's essentially just another function that gets executed as part of this pipeline. When we call the render function on the happy path, we're essentially just saying: "return this to the client who requested this", but Phoenix actually has a built-in mechanism for handling the case where we aren't explicitly calling render: a fallback plug!

defmodule MyAppWeb.AuthTokenController do
  use MyAppWeb, :controller
  action_fallback MyAppWeb.ErrorHandler

  alias MyApp.Auth
  alias MyApp.Auth.User
  alias MyApp.Auth.Token
  alias MyApp.Auth.Organization

  @encodable_errors [:unauthorized, :not_found, :invalid_password]

  def create(conn, %{"username" => username, "password" => password, "org_code" => org_code}) do
    with {:ok, %Organization{} = organization} <- Auth.get_organization_by_code(org_code),
         {:ok, %User{} = user} <- Auth.get_user_by_username_and_organization(username, organization),
         {:ok, %User{} = user} <- Auth.validate_user_password(user, password),
         {:ok, %Token{} = token} <- Auth.create_auth_token(user, organization) do
       conn
       |> put_status(:ok)
       |> render("show.json", %{token: token})
    end
  end
end

defmodule MyAppWeb.ErrorHandler do
  import Plug.Conn
  import Phoenix.Controller, only: [json: 2]

  @http_errors [:forbidden, :not_found, :conflict, ...]

  def call(conn, {:error, :unauthorized}) do
    conn
    |> put_resp_header("www-authenticate", "Bearer")
    |> put_status(:unauthorized)
    |> halt()
  end

  def call(conn, {:error, http_error}) when http_error in @http_errors do
    conn
    |> put_status(http_error)
    |> json(%{errors: [%{code: http_error}]})
    |> halt()
  end

  ...
end

This is much nicer, in my opinion. We massively clean up the core business logic of our main controller, insofar as it essentially just describes a bunch of steps that have to happen to create an auth token. If anything unexpected happens, those errors are handled in a module whose entire purpose is to map context errors into client errors. We have the added benefit of centralising how we handle errors and thus returning errors consistently no matter where they happen, so long as our error handler is being used!

Anything unexpected will also either propagate to the client untouched (which is one valid option) or will just cause a crash (another valid option). That implementation detail can be left solely to you 😊

Addendum: Error handling for Absinthe and GraphQL

As stated previously, Phoenix is ubiquitous on Elixir projects. The majority of Elixir projects I've been brought to help out on have had Phoenix as part of their stack.

This is also true of Absinthe, a really nice framework for serving GraphQL.

At the time of writing, error handling honestly isn't even nearly standardized for GraphQL. I've worked on approaches that use union types to represent typed errors for the client as well as just settling for the more standard top-level errors list.

While there is a bunch of development here that will change whether or not this is the best thing to do going forwards, at the time of writing, we've ended up extending our error-handling Middleware idea to Absinthe as well.

Much like Phoenix, Absinthe is essentially designed as a pipeline of functions that end up with your resolvers being executed. GraphQL Schemas and their corresponding resolvers are kind of analogous to Phoenix controllers in that they essentially just call functions in our standard contexts.

Resolvers in Absinthe are expected to return either {:ok, _some_happy_path_result} or {:error, some_supported_error_format}, but much like our Phoenix controller functions, we can implement a middleware to take care of anything that isn't expected transparently.

A similar middleware to the Phoenix one can be implemented as follows:

defmodule MyAppGraphQL.ErrorHandler do
  @behaviour Absinthe.Middleware

  def call(%Absinthe.Resolution{errors: errors} = resolution, _config) do
    %Absinthe.Resolution{resolution | errors: Enum.flat_map(errors, &process_error/1)}
  end

  defp process_error(:unauthorized) do
    [{:error, %{status: 401, error: %{code: "unauthorized"}}}]
  end

  ...
end

And once written, it either needs to be added to Absinthe's top-level middleware execution pipeline, which is executed for all resolvers, or to individual resolvers:

# Top level middleware registration
defmodule MyAppGraphQL.Schema do
  use Absinthe.Schema

  def middleware(middleware, _field, _context) do
    middleware ++ [MyAppGraphQL.ErrorHandler]
  end
end

# Resolver specific middleware registration
field :auth_token, :auth_token do
  middleware MyAppGraphQL.AuthenticationErrorHandler
  ...
end

Addendum: More complex Ecto.Schema/Ecto.Query composition

I'm often asked how we can write composable functions to dynamically build queries involving all sorts of things such as joins, sub-queries etc.

This is a good question since the naive implementation gets verbose and extremely brittle very quickly.

Typically, if I need to define functions that do complex joins, instead of doing it as follows:

defmodule MyApp.Auth.User do
  use Ecto.Schema
  import Ecto.Changeset
  import Ecto.Query

  alias MyApp.Auth.Organization

  schema "user" do
    field :username, :string
    belongs_to :organization, Organization
  end

  def changeset(%__MODULE__{} = entity, attrs) do
    entity
    |> cast(attrs, [:username, :organization_id])
  end

  def where_id(query, id) do
    from u in query,
      where: u.id == ^id
  end

  def join_organization(query) do
    from u in query,
      join: o in assoc(u, :organization)
  end

  def where_organization_region(query, organization_region) do
    from [u, o] in query,
      where: o.region == ^organization_region
  end
end

# Used as follows:
def user_has_region?(user_id, region) do
  User
  |> User.where_id(user_id)
  |> User.join_organization()
  |> User.where_organization_region(region)
  |> Repo.exists?
end

I tend to use what Ecto calls named bindings: instead of composing queries and referring to joined entities based on position a-la from [a, b, c, d] in query, ..., you can do the following instead:

defmodule MyApp.Auth.User do
  use Ecto.Schema
  import Ecto.Changeset
  import Ecto.Query

  alias MyApp.Auth.Organization

  schema "user" do
    field :username, :string
    belongs_to :organization, Organization
  end

  def changeset(%__MODULE__{} = entity, attrs) do
    entity
    |> cast(attrs, [:username, :organization_id])
  end

  def where_id(query, id) do
    from u in query,
      where: u.id == ^id
  end

  def join_organization(query) do
    if has_named_binding?(query, :organization) do
      query
    else
      from u in query,
        join: o in assoc(u, :organization),
        as: :organization
    end
  end

  def where_organization_region(query, organization_region) do
    query = join_organization(query)

    from [organization: o] in query,
      where: o.region == ^organization_region
  end
end

# Used as follows:
def user_has_region?(user_id, region) do
  User
  |> User.where_id(user_id)
  |> User.where_organization_region(region)
  |> Repo.exists?
end

This way, not only do we not have to care what order entities were joined in (which for very complex queries, is hard to ascertain, and for usages in many functions across many different contexts, is basically impossible to guarantee), but it also means we never have to explicitly call join_organization/1, hiding more low-level implementation logic. Since join_organization/1 is a no-op if the relation is already joined as well, there's no harm in building extremely complex flows around this!

In addition to this, if we have functions that need to peek into the fields of structs that may not have been preloaded yet, one pattern I've ended up following (which perhaps breaks the encapsulation of concerns for structs into that struct's schema, or that struct's embodying context) is simply to call MyApp.Repo.preload(my_struct, [:the_field_i_care_about]) whenever I need to.

MyApp.Repo.preload/2 is actually similar to the join_organization/1 function we defined in that it only preloads the association if it hasn't already been preloaded, meaning that this is essentially costless! Nice 💪
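
As a sketch, such a (hypothetical) helper might look like this, reusing the User/Organization association from the example above:

def user_org_region(%User{} = user) do
  %User{organization: %Organization{region: region}} = MyApp.Repo.preload(user, [:organization])

  region
end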

Addendum: Compounding guarantees and unit testing

In short, one other, little-talked-about advantage of this approach of building complex flows by composing queries and delegating error handling to the edges of your application is the improved testing story.

If you unit test each of the functions in your context, you know your assumptions about your schema are sound, and your core business logic functions work as expected. This gives you a degree of guarantees.
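
For example, a context test might be as small as this sketch (assuming the usual generated MyApp.DataCase and some hypothetical insert_something! fixture):

defmodule MyApp.SomethingsTest do
  use MyApp.DataCase, async: true

  alias MyApp.Somethings

  test "get_something_by_id/1 does not return deleted records" do
    something = insert_something!(%{deleted: true})

    assert {:error, :not_found} = Somethings.get_something_by_id(something.id)
  end
end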

Following this, if you unit test your Phoenix controllers, you need only test one main thing: does your controller correctly transform the parameters given to it into parameters that you can pass into your context functions? If so, you inherit your context function unit tests; likewise for Absinthe resolvers.

If you unit test your Phoenix error handler, you gain guarantees about how your system performs under various error cases. Likewise for our Absinthe middleware.

Lastly, Phoenix and Absinthe aren't the only frameworks/patterns for building Elixir apps. One common component in most Elixir/Erlang applications is a set of gen_servers, gen_statems, or other OTP behaviours.

These, like Absinthe resolvers/Phoenix controllers, are essentially just a different way of composing functions that should exist in your contexts, and thus can be tested the same way: unit test the underlying context functions and then unit test each process's handle_$callback functions. You often don't even need to spin up a process to exhaustively test it, and if you do, one could make the argument those tests belong in a supervisor test anyway 😉

Conclusion

Hopefully, following this, you'll find that your Phoenix controllers, GraphQL resolvers, OTP Processes, and essentially anything else are a lot easier to reason about and extend!

These solutions were inspired by solutions we put in place for many different large production Elixir projects in different business domains. These patterns have essentially become my go-to tools for taming complexity and exponentially increasing developer productivity.

I hope these patterns and thoughts help you as much as they've helped my team and me! Thanks for reading 💪

Easy Ergonomic Telemetry in Elixir with Sibyl

Easy Ergonomic Telemetry in Elixir with Sibyl

One of the things I wanted to do was to start a series of posts which re-implement popular libraries in the BEAM ecosystem from first principles to help explain and teach how they work and perhaps why they work the way they do.

The only one I've managed to see through to date is my "Build you a Telemetry for such learn!" post, which walks users through the BEAM's de-facto telemetry library, aptly named :telemetry.

When I was writing that post, one of the things I tried to do was explain what :telemetry did and why, but one question I kept getting after publishing was how we could actually use :telemetry in real Elixir projects. This is something I wanted to try and answer.

In the Elixir community, we have a lot of examples of how :telemetry is used in the context of a library (i.e. how to consume telemetry events), but not necessarily anything more than that.

Drawing on lessons learned from instrumenting several such systems while I was a consultant, and during my time here at Vetspire, we built a library to make it as easy as possible to start using :telemetry right away! I'll start off by explaining, at a high level, how we'd do things manually, then tie it all together with Sibyl.

Telemetry on the BEAM

As stated, the de-facto telemetry library on the BEAM (i.e. across Erlang, Elixir, and other BEAM-hosted languages) is :telemetry, which is made up of two high level concepts: "event-emission" and "event-handling".

Typically, a lot of code in the wild will show off "event-emission". This is when code is annotated with calls to the :telemetry library to denote something happened, for example:

defmodule MyApp.Users do
  def register_user(attrs) do
    :telemetry.execute([:my_app, :users, :register], %{email: attrs.email}, %{})

    %User{}
    |> User.changeset(attrs)
    |> Repo.insert()
    |> case do
      {:ok, %User{} = user} ->
        :telemetry.execute([:my_app, :users, :register, :success], %{email: attrs.email}, %{})
        {:ok, user}

      {:error, %Ecto.Changeset{} = changeset} ->
        :telemetry.execute([:my_app, :users, :register, :error], %{email: attrs.email}, %{})
        {:error, changeset}
    end
  end
end

The calls to :telemetry.execute/3 don't really do very much out-of-the-box! Only if an application running in the BEAM opts into receiving (handling) any listed events will anything fancy happen. That's done elsewhere via the following:

# Handler IDs must be unique, so we derive one per event
for event <- @my_app_events do
  :telemetry.attach("my-handler-#{inspect(event)}", event, &MyApp.Telemetry.handle_event/4, %{})
end

Once the above code is run anywhere on the same BEAM VM as the code emitting events, the :telemetry.execute/3 calls now automatically dynamically dispatch to MyApp.Telemetry.handle_event/4 which is intended to be provided by the user.
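
For completeness, a minimal handler might just log whatever it receives; a sketch:

defmodule MyApp.Telemetry do
  require Logger

  # A deliberately boring handler: log the event and move on
  def handle_event(event, measurements, metadata, _config) do
    Logger.debug("#{inspect(event)}: #{inspect(measurements)} #{inspect(metadata)}")
  end
end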

It's important to note that telemetry handlers are executed synchronously (and that a single event might be consumed by any number of handlers), so care must be taken not to execute very intensive business logic inside of handlers. The intended use-case is to allow you to react to any telemetry event being emitted and do something like log to an ETS table, ship to an external analytics service, send messages in Slack, etc.

For more details on how this all is wired up internally, do take a look at the official :telemetry repo, whose source code is very easy to follow (though it's written in Erlang), or my Elixir re-implementation walkthrough; that detail isn't particularly important for this post.

Something I lightly hinted at above is that telemetry events and handlers are global and are a form of dynamic dispatch! I've witnessed, written, and repented for various eldritch hacks as a result of this, but the main thing we use these facts for is to ship telemetry metadata somewhere else, be it Phoenix performance metadata, Ecto query metadata, Oban worker metadata, or your own app's internal business events!

Ultimately, that's all there is to it. I've seen and worked on many projects where we're just emitting events inline and spinning up a Supervisor and GenServers to listen to telemetry events and offload any intensive work to background processes. These background processes might batch stuff up, but ultimately they just POST to an external service and you've got telemetry set up end-to-end.

Since the main work is annotating your code and writing code to ship your metrics elsewhere, one problem I wanted to look into solving was to maximize my bang for buck: what was the least amount of code I had to write to be able to send my metrics anywhere?

Thankfully the Open Source community at large has a solution in place!

Telemetry off the BEAM

While not quite as de-facto as :telemetry on the BEAM, there is an open-source telemetry "protocol" which allows you to ship your metrics to multiple different external consumers called (again, aptly) OpenTelemetry.

Unlike :telemetry however, OpenTelemetry aims to be a more generic way to implement observability rather than just, well, telemetry for your applications and infrastructure. I won't go into very much detail here, but OpenTelemetry aims to handle three main "signals": Traces, Metrics, and Logs.

Since the BEAM already has an official Logger, and many integrations and solutions exist for shipping logs out to places like Google Cloud or DataDog, neither myself nor Sibyl will cover logging. Rather, I'm more focused and excited about Traces and Metrics!

In short, a metric is a single measurement about something in your system. Metrics are the bread-and-butter of :telemetry, anything you :telemetry.execute/3 is arguably a metric. On the flipside, traces are higher level: what happens when a user clicks a button?

The simple user registration example above can be seen as a collection of metrics, which in tandem denote a trace! The metric would be how many users have registered, which can be derived by looking at the count of [:my_app, :users, :register, :success] events, and the trace gives us details like function runtime by comparing [:my_app, :users, :register] with either [:my_app, :users, :register, :success] or [:my_app, :users, :register, :error].

OpenTelemetry formalizes the above concepts and is supported by various products such as DataDog, LogScale, New Relic, and more. You can check the supported vendors on this list, though note that not all of them are consumers of OpenTelemetry; some are producers instead.

Assuming your organization already uses one of the supported vendors, nothing really needs to be done to hook up to Sibyl; however, if you're not already using one of those vendors, you can still use OpenTelemetry locally for debugging & development with Jaeger, which is something I highly recommend.

To start doing OpenTelemetry stuff in Elixir, you can use the OpenTelemetry libraries and instrument your code a second time, or if Sibyl is a fit for you, then we handle this for you!

Introducing Sibyl

Sibyl is a relatively small library which aims to do two things: abstract away implementation details like :telemetry event handling, emission, and OpenTelemetry instrumentation; and provide a better DX while doing it.

Auto-instrumentation

When you bring Sibyl into your project, all you have to do to start using it is to use Sibyl in any of your Elixir modules. Upon doing so, your modules get extended with a few telemetry-related superpowers thusly:

defmodule MyApp.Users do
  use Sibyl

  @decorate trace()
  def register_user(attrs) do
    %User{}
    |> User.changeset(attrs)
    |> Repo.insert()
  end
end

defmodule MyApp.Orders do
  use Sibyl

  @decorate_all trace()

  def add_to_basket(%User{} = user, %Product{} = product) do
    ...
  end

  def empty_basket(%User{} = user) do
    ...
  end
end

One minor, but real, issue when using the low-level :telemetry library is that as developers add events, code gets copied around, typoed, deleted, and more. Unfortunately, since :telemetry works by opting your application into handling said events, unless you're explicitly testing that all events are raised as expected, it is very easy for events to silently break.

By using the @decorate trace() directive, Sibyl will automatically wrap the function being traced with :start, :stop, and :exception events, as well as provide timings, function parameters, and more to any event handlers which have opted into said events! Additionally, @decorate_all trace() can be used to auto-instrument every function in your module for you.

This reduces the amount of boilerplate code you need to write to emit events, as well as making sure that telemetry events are named consistently and aren't typoed.

Internally, Sibyl will compile all events that it emits into your Elixir modules themselves, meaning that the event handling example from before can be simplified to:

:telemetry.attach_many("my-handler", Sibyl.Events.reflect(), &MyApp.Telemetry.handle_event/4, %{})

While it completely makes sense for :telemetry to be opt-in when consuming events from third party libraries, for application code, if I'm defining an event, I'm expecting to consume it.

This is where Sibyl's event reflection comes in handy. Sibyl allows you to reflect on events your application defines both on a module-by-module basis, or application wide. Doing so reduces the number of moving parts needed to be kept up-to-date when adding/listening to new telemetry events being added.

Compile Time Safety

Sibyl also provides its own Sibyl.emit/1 function to emit telemetry events. The benefit of this function, in tandem with the fact that Sibyl compiles a list of events it knows are going to be emitted, is the fact that Sibyl can validate event emissions are legitimate at compile time!

By default, when using the @decorate trace() and @decorate_all trace() annotations, Sibyl will automatically define events based on the current module name and function names plus :start | :stop | :exception, but you can also define custom events with define_event/1 in any module that uses Sibyl.

This means that we can expand the above example to look as follows:

defmodule MyApp.Users do
  use Sibyl

  define_event :registered
  define_event :email_validation_error
  define_event :password_length_error

  @decorate trace()
  def register_user(attrs) do
    %User{}
    |> User.changeset(attrs)
    |> Repo.insert()
    |> case do
      {:ok, %User{} = user} ->
        emit :registered
        {:ok, user}

      {:error, %Ecto.Changeset{} = changeset} ->
        case hd(changeset.errors) do
          ... -> emit :email_validation_error
          ... -> emit :password_length_error
        end

        {:error, changeset}
    end
  end
end

Events are validated by default to belong to the current module, but there are cases where metrics may be defined in a generic MyApp.Events module. If this is the case, Sibyl can still validate event emission at compile time via emit MyApp.Events, :something, provided :something was defined correctly.
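
A sketch of that shape, with purely illustrative module and event names:

defmodule MyApp.Events do
  use Sibyl

  define_event :cache_hit
  define_event :cache_miss
end

defmodule MyApp.Cache do
  use Sibyl

  def fetch(key) do
    case lookup(key) do
      {:ok, value} ->
        emit MyApp.Events, :cache_hit
        {:ok, value}

      :error ->
        emit MyApp.Events, :cache_miss
        :error
    end
  end

  defp lookup(_key), do: :error
end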

Bridging Telemetry & Open Telemetry

Sibyl not only abstracts away the instrumentation of emitting telemetry events, but coupled with reflection, it also allows us to automatically attach events.

Since writing code to ship events elsewhere ended up being one of the main things I wanted to do, and I didn't want to instrument my code with multiple different libraries, Sibyl also has the ability to tie :telemetry events together with OpenTelemetry traces.

As I touched on above, any function which has been decorated will emit :start | :stop | :exception events. Sibyl ships with the concept of "handlers", the most important one of which is Sibyl.Handlers.OpenTelemetry.

Sibyl handlers are just pre-packaged telemetry handlers. Attaching the Sibyl.Handlers.OpenTelemetry handler will automatically build OpenTelemetry traces as your application emits standard :telemetry events, removing the need for manual instrumentation entirely. Sibyl is also aware of how to stitch traces together across different Tasks, and more generic multi-process trace stitching can be built with helper functions in that module.

This means that by simply invoking Sibyl.Handlers.attach_all_events/1 in your application's top-level supervisor, you can automatically inherit all of the power that OpenTelemetry tracing gives you.

Due to this simplicity, I personally always use Jaeger when developing locally, and in production, I'll have all the same tracing/metric stuff being exported to New Relic.

It is also easy to build custom handlers if needed, Sibyl provides an example Sibyl.Handlers.Logger handler which attaches to Elixir's standard logger, and Sibyl.Handlers.FlameGraph which can export traces to a file format compatible with Google Chrome's built in flamegraph tooling, Speedscope, and others.

Definitely do see Sibyl's Documentation for more details, but with that, we've covered the core usage of Sibyl!

Future Work

While Sibyl's core functionality is more or less complete, we're still trying to build out some super cool features at Vetspire to help us debug and make Sibyl the best tool it can be!

Additional Handlers

As demonstrated with Sibyl.Handlers.FlameGraph, Sibyl is powerful enough and flexible enough to be able to output traces to various formats such as function call graphs, flame graphs, OpenTelemetry and more!

As we find use cases for building out new handlers, they will be added to the repo on an experimental basis, but hopefully they will become solid enough to be part of Sibyl core in the future.

One of the main things I want to do with this is implement a function call graph exporter, to produce output similar to mix xref.

Plugins

Another piece of low-hanging fruit is the ability to support plugins for all handlers which correctly implement the Sibyl.Handler behaviour. We're currently working on some prototype plugins, but the idea would be to allow Sibyl to automatically configure itself to hook into telemetry events emitted by third party libraries and potentially extend the amount of metadata they emit.

Currently, the API being worked on is as follows:

Sibyl.Handlers.attach_all_events("my-handler", plugins: [Sibyl.Plugins.Absinthe, Sibyl.Plugins.Phoenix, Sibyl.Plugins.Ecto])

This is very near ready for release, so stay tuned!

Dynamic

Lastly, possibly the most powerful feature of Sibyl already exists (though it's not quite production-ready at the time of writing!). Sibyl is designed to take :telemetry events as input, but is flexible enough to be able to take any sort of input.

As a proof of concept, Sibyl.Dynamic leverages the BEAM's built-in tracing functionality to allow you to build OpenTelemetry traces, FlameGraph traces, or traces for any other handler automatically at runtime rather than compile time, on live production systems!

Currently, you can try this out by running the following in a live remote shell:

iex(my_app@10.84.9.3)3> Sibyl.Dynamic.enable()
{:ok, #PID<0.16925.0>}
iex(my_app@10.84.9.3)4> Sibyl.Dynamic.trace(Enum, :map, 2)
{:ok, #PID<0.16925.0>}
iex(my_app@10.84.9.3)5> Enum.map([1, 2, 3], & &1 * 2) # emits traces to any attached handlers
[2, 4, 6]

Conclusion

Hopefully, I've been able to outline why I think Sibyl is super cool and ergonomic. Please feel free to take a look at Sibyl's official documentation, package link, and Git Repo.

Please do feel free to take Sibyl for a spin, thanks for reading!

Draft: Oban makes it so easy, you don't even stop to think if you should

Oban makes it so easy, you don't even stop to think if you should

In my opinion, Oban and Oban Pro are absolutely foundational libraries in the Elixir ecosystem, not unlike Ecto, Phoenix, or Absinthe.

I started using Oban because it was already used as a simple background job processing library at Vetspire. Since my first foray into using Oban however, we've really gotten to grips with it and started using it in anger.

Before trying it out, a lot of the stuff I do today with Oban was done via traditional message queues like RabbitMQ/Kafka/SQS, or with serialized gen_statems stored in Mnesia or Postgres which were re-initialized on application boot up.

So many small, low-level wants and needs are so perfectly fulfilled by Oban (even when it's not quite the right tool for the job), that I've realised two things:

One—I seldom ever need to really think hard about lower-level/fun/distributed/resilient code anymore;
Two—Oban makes things so easy, you don't even necessarily think twice before coercing it to do incredibly cursed things 🤣

The following points are a few fun things I've used Oban for at both Vetspire and personal projects. They're fun recipes which seem to be doing okay in production environments/workloads, but they're more examples of what can be done than of what should be done.

Definitely don't copy these ad-hoc, but here we go! We'll start off with the least cursed item and end with the most cursed item!

Oban as a retry mechanism

Outside of running truly background tasks, Oban can also be used to make any function call you're expecting to make more resilient by leveraging its retry functionality.

If an Oban job returns an error, or crashes for whatever reason, by default Oban will retry the job up to 20 times (with exponential backoff, but all of this is very configurable).

One of the main issues with doing this is that Oban jobs are run, well... as background jobs. But assuming we can live with the fact that anything you use Oban for will be asynchronous (for now...), simply wrapping any function call in an Oban job will let you make that call more resilient.

Vetspire integrates with a lot of third-party code. No matter what, even if your integrations are rock solid, due to the nature of HTTP and distributed systems in general, stuff will fail sometimes and there is very little recourse/predictability associated with this.

As a result, a lot of our very critical integrations have their communications wrapped in Oban jobs so that if they fail for random reasons, we'll try again a little later.

A pattern I personally like is having a more generic retry worker template that looks like the following:

defmodule MyApp.Integration.Submitter do
  use Oban.Worker, max_attempts: 5

  def perform(_job) do
    case MyApp.Integration.submit() do
      {:ok, %DataSyncLog{} = datasync_log} ->
        {:ok, datasync_log}

      {:error, something_totally_expected_to_fail} ->
        {:discard, error: something_totally_expected_to_fail}

      ... repeat as needed ...

      error ->
        error
    end
  end
end

Since these integrations can often return completely valid errors, I usually enumerate and handle them all explicitly in the job. Anything that isn't explicitly handled will cause a retry (even if your BEAM VM crashes, Oban will retry it!), but anything that is totally expected and BAU will be transformed into an Oban :discard instead.

I typically like co-locating my domain-specific workers in the same context as the modules which use them. I'll also usually create schedule_ functions which utilise them for me "automagically" like so:

defmodule MyApp.Integration do
  def schedule_submit!, do: Oban.insert!(MyApp.Integration.Submitter.new(%{}))

  def submit do
    HTTPoison.post(...)
    ...
  end
end

This way, instead of calling submit/0 in the example above, a caller can choose to call schedule_submit!/0 for the same effect, just more robust!

A lot of early Vetspire code would utilise spawn/3 to offload work in the background as well. Depending on the actual performance implications of doing so, we've ended up refactoring a lot of that to this exact pattern!
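
To make that refactor concrete, here's a hedged before-and-after sketch using the schedule_submit!/0 function from the example above:

# Before: fire-and-forget; if the call fails or the node restarts, the work is silently lost.
spawn(fn -> MyApp.Integration.submit() end)

# After: persisted in Postgres, retried on failure, and inspectable in the oban_jobs table.
MyApp.Integration.schedule_submit!()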

Oban as state

Another cool thing Oban can do is literally act as state. This is one of those things that I genuinely don't condone doing in production, but if you're using Oban anyway, Oban is a great way to have short-lived ad-hoc distributed state across your application/cluster.

This works because Oban allows you to define unique constraints for Oban jobs. Typically, unique constraints are used to protect against double enqueueing jobs, but the actual fields you define as unique are pretty flexible.

You can mark jobs as unique by their :args, :queue, :meta fields, etc. See [unique fields, options, and states](https://hexdocs.pm/oban/Oban.Job.html#t:unique_field/0) for more information.

By coupling a job with a unique constraint on its fields and its age, Oban literally becomes a cache. This relies on the fact that you can store arbitrary metadata in a job's :meta field: in some Oban cronjob you can perform expensive computations and store the result in the job's :meta field. Once this is done, your application can read from the jobs table to fetch whatever result was cached, until the job gets pruned or re-written.

You can combine this with upserting jobs via the replace option when creating new jobs to do all sorts of gross cursed stuff!
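
As a purely illustrative sketch (module and function names are made up, and this assumes Oban's stock Postgres engine leaves :meta untouched when it marks a job completed), the write and read sides of such a "cache" might look something like this:

defmodule MyApp.Stats.Refresher do
  # Hypothetical cron worker: recomputes expensive stats and stashes them in
  # its own job row's `:meta` column.
  use Oban.Worker, queue: :default

  import Ecto.Query

  @impl Oban.Worker
  def perform(%Oban.Job{} = job) do
    stats = compute_expensive_stats()

    job
    |> Ecto.Changeset.change(meta: %{"stats" => stats})
    |> MyApp.Repo.update!()

    :ok
  end

  # Readers pull the most recently cached value straight out of `oban_jobs`.
  def read_cached_stats do
    MyApp.Repo.one(
      from j in Oban.Job,
        where: j.worker == "MyApp.Stats.Refresher",
        where: j.state == "completed",
        order_by: [desc: j.completed_at],
        limit: 1,
        select: j.meta["stats"]
    )
  end

  defp compute_expensive_stats, do: %{"answer" => 42}
end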

Honestly, it isn't that bad though: it's not much different from rolling your own GenServer or Agent based cache, with the advantage that it's available to any app that has access to your database. If creating a new database table is too much work, you need distributed state, and you don't want to use :mnesia (who does! 😆), then this might be an approach?

Then again... probably not the best idea, even if it isn't the worst 😅

Oban as a throttle mechanism

Oban as a Flow replacement

Oban as an async/await runner

Provide an RSS/Atom Feed

I like your blog a lot and would like to subscribe to it via a Webfeed. It would be great if there was one.

Elixir's `with` statement and Railway Oriented Programming

Elixir's `with` statement and Railway Oriented Programming

I discovered programming as an aspirational game developer at the age of nine, using Game Maker 5. As I grew more experienced, I naturally tinkered around with the basics of JavaScript, PHP, and more.

It was really during university that I fell in love with Erlang. It was my first taste of functional programming, and despite some initial difficulties because of the syntax, I really came to appreciate the guarantees provided by the BEAM virtual machine.

My love for Elixir came about differently: I don't believe syntax (so long as it's reasonable) matters when making decisions about whether or not to use a language—I actually personally dislike Elixir's Ruby-inspired syntax compared to the minimalism of Erlang's—but developer experience is everything. For Elixir, syntactic sugar is a big piece of the pie.

Almost everything in Elixir can be said to be a macro, and in my mind, one of the most elegant macros in Elixir is the with statement.

The `with` statement

The with statement is a great example of how syntactic sugar can make code both more concise and maintainable; it was introduced in Elixir v1.2 and was the last of the control flow statements (think case, if, cond) to be added.

One common problem in Erlang code (and thus Elixir also) is code which reads like this:

my_function(Email, Password) ->
  case get_user(Email) of
    {user, _UserFields} = User ->
      case validate_password(User, Password) of
        true ->
          case generate_access_token(User) of
            {token, _data} = Token ->
              {ok, Token};

            AccessTokenError ->
              AccessTokenError
          end;

        false ->
          {error, <<"Incorrect password">>};

        Error ->
          Error
      end;

    nil ->
      {error, <<"User does not exist">>};

    Error ->
      Error
  end.

It's very nested, it's very easy to accidentally miss a clause when reading, and honestly, it's just a bit jarring to work with. Elixir's with statement cleans this up nicely: it's a macro designed to compile down to nested case statements such as the one above, maximising readability and linearity.

You can see a simple example of the with statement below:

# The same control flow as the Erlang example, written with a `with` statement
def my_function(email, password) do
  with %User{} = user <- get_user(email),
       true <- validate_password(user, password),
       %Token{} = token <- generate_access_token(user) do
    {:ok, token}
  else
    nil ->
      {:error, :user_does_not_exist}

    false ->
      {:error, :invalid_password}

    error ->
      error
  end
end

If the |> operator is the honeymoon phase of becoming an Elixir developer, the with statement is surely twenty years happily married 😝

As you can see, at a super high level, you get two main benefits of using with over case:

  1. You primarily focus on your happy path. You describe a pipeline of steps that you want to happen, and in the case that they do occur, your pipeline succeeds, and you get put into the do ... end part of the statement.

  2. You flatten out your potentially deeply nested unhappy paths into one pattern matching block. We no longer have to add repetitive "bubbling up" matches to expose errors deeply nested in our control flow, and no longer do we have to spend much time parsing N-level deep nesting.

The caveats of the `with` clause

There are arguably two major (related) issues with the with clause presented above:

  1. Because pattern matching against errors is decoupled from the execution of any single given function, it might be confusing to know which error patterns in the else clause correspond to which function in the case of failures.

  2. Any unmatched pattern in the else clause will cause an exception. You must take care to match any patterns you expect will be thrown (or add a catch-all clause), but this leads to another issue: it can be easy to write overly permissive error patterns which shadow other, potentially important, error cases.

Basically, in short—with clauses have the potential for making your code terse but obscure, which is never a good thing. One should never optimise purely for length, but instead for related factors such as readability, maintainability, clarity, etc.

For something the size of the example above, perhaps the issues are small enough that we feel it's ok to ignore them, but this issue is only exacerbated by the growing size of any given with clause.

However, there are ways we can work around these issues and perhaps abuse them for the sake of simplicity itself! 🦉

An ok pattern

There is a simple idiom in Erlang/Elixir (though a pet peeve of mine: it's not enforced nearly enough), which we can use to start refactoring our with example above to make it a lot cleaner.

Whenever I'm writing Elixir code that has a happy path and an unhappy path, I tend to implement the function with the following spec:

@spec my_function(some_argument :: any()) :: {:ok, success_state :: any()}
                                           | {:error, error_state :: any()}

What this means is, for any successful invocation of my_function/1, the result returned is expected to be wrapped inside a tuple: {:ok, _result}, otherwise any error will be wrapped inside a tuple: {:error, _result}. This is similar to the concept of an Option Type, though not one strictly enforced.

Using this pattern, our with clause example can be updated as follows:

def my_function(email, password) do
  with {:ok, user} <- get_user(email),
       {:ok, _response} <- validate_password(user, password),
       {:ok, token} <- generate_access_token(user) do
    {:ok, token}
  else
    {:error, nil} ->
      {:error, :user_does_not_exist}

    {:error, false} ->
      {:error, :invalid_password}

    error ->
      error
  end
end

This helps us remove a bit of the verbosity from our pattern matches: we don't bother pattern matching the response of validate_password/2 in our example because, so long as a function returns {:ok, _response}, we know it succeeded and that we can proceed. Likewise, we don't bother pattern matching against any of the struct definitions.

An oft-missed quirk of the with statement is also the fact that error handling clauses are, in fact, optional!

Take a look at the else clause of our most recent example: we're mapping very concrete errors into domain-specific error atoms such that we can tell what happened. I would argue this is very wrong: the error {:error, :user_does_not_exist} is indeed a domain-specific error; just not of the function we're defining; instead, this error is specific to the domain of the get_user/1 function!

Taking that aboard, we simplify our example further:

def my_function(email, password) do
  with {:ok, user} <- get_user(email),
       {:ok, true} <- validate_password(user, password),
       {:ok, token} <- generate_access_token(user) do
    {:ok, token}
  end
end

This, in my mind, is pretty clean 💎

A Railway Oriented Architecture

An introduction to railway oriented programming is probably out of the scope of this particular blog post but, for lack of a better word, it's a design pattern for describing your business logic as a series of composable steps (pipelines), focusing primarily on the happy path, with error handling abstracted away from the core business logic.

It's a pretty common pattern in the functional programming world, and there are nice libraries in languages such as F#, but even C++, Java, etc., which facilitate this approach to programming.

The cool omission from the list above is Elixir: it doesn't need a library to facilitate railway oriented programming, because Elixir comes batteries included. The following code is, in my mind, pretty idiomatic Elixir code, for example:

@spec login(String.t(), String.t()) :: {:ok, Token.t()} | {:error, term()}
def login(email, password) do
  email
  |> get_user()
  |> validate_password(password)
  |> generate_access_token()
end

@spec get_user(String.t()) :: {:ok, User.t()} | {:error, term()}
def get_user(email), do: ...

@spec validate_password(User.t(), String.t()) :: {:ok, User.t()} | {:error, term()}
def validate_password(%User{} = user, password), do: ...

def validate_password(_error, password), do: ...

@spec generate_access_token(User.t()) :: {:ok, Token.t()} | {:error, term()}
def generate_access_token(%User{} = user), do: ...

def generate_access_token(_), do: ...

This approach is pretty nice since your top-level API is literally a pipeline of simple functions. It becomes very easy to think about each step as a transformation being applied to a given input.

I've seen this pattern done again and again on several projects I've consulted on. However, I wasn't entirely speaking in jest when I described the |> operator as the honeymoon phase for someone learning Elixir...

The main issue with this way of doing things is that you have to pollute your function heads to pattern match errors which are returned by other, completely unrelated functions; in turn, breaking the semantics of your function as defined by its function name: yes, naming is hard, but why must validate_password/2 handle errors returned by get_user/1?

I'd make the very strong argument that the |> operator should only be used for genuine pipelines: data transformations that cannot go wrong. For anything else, we should be using with statements instead:

@spec login(String.t(), String.t()) :: {:ok, Token.t()} | {:error, term()}
def login(email, password) do
  with {:ok, user} <- get_user(email),
       {:ok, true} <- validate_password(user, password),
       {:ok, token} <- generate_access_token(user) do
    {:ok, token}
  end
end

@spec get_user(String.t()) :: {:ok, User.t()} | {:error, term()}
def get_user(email), do: ...

@spec validate_password(User.t(), String.t()) :: {:ok, User.t()} | {:error, term()}
def validate_password(%User{} = user, password), do: ...

@spec generate_access_token(User.t()) :: {:ok, Token.t()} | {:error, term()}
def generate_access_token(%User{} = user), do: ...

Each function is entirely responsible for handling its own errors, and your code is shorter and clearer to boot. Nice.

Of course, this pattern is not dogma. I try to follow this approach whenever I can, but when writing Phoenix controllers or Absinthe resolvers, it's entirely understandable to want to handle errors coming from functions you call, transforming them, perhaps, into appropriate domain-specific GraphQL errors or JSON error objects.

The client I'm currently consulting with, for example, has a project with several layers of nested with statements:

  • An outer with statement implemented as an Absinthe middleware to catch errors that bubble up from various places in our execution pipeline to transform them into error tuples that Absinthe can understand.

  • An inner with statement which implements authentication error handling on the level of specific GraphQL resolvers.

  • An internal with statement which encapsulates any non-trivial business logic in our actual contexts.

This is by no means a complete picture, but even so, it is by no means an extreme misuse of the with statement. We should isolate our functions into appropriate domains, and compose them. That is the zen of this pattern.
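
As a rough sketch of just the outer layer (the module and error atoms are hypothetical, and it's shown without a with for brevity), such a middleware only needs to normalise whatever errors the inner layers let bubble up:

defmodule MyAppWeb.Middleware.NormaliseErrors do
  @behaviour Absinthe.Middleware

  # Runs after each resolver; maps internal error atoms onto messages Absinthe
  # knows how to render in the GraphQL `errors` payload.
  @impl Absinthe.Middleware
  def call(%Absinthe.Resolution{errors: errors} = resolution, _config) do
    %{resolution | errors: Enum.map(errors, &normalise/1)}
  end

  defp normalise(:user_does_not_exist), do: "User does not exist"
  defp normalise(:invalid_password), do: "Incorrect password"
  defp normalise(error) when is_binary(error), do: error
  defp normalise(error), do: inspect(error)
end

You'd then append something like this to your fields via the schema's middleware/3 callback.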

One minor shortcoming

There is precisely one case where this pattern makes life a little less convenient: string interpolation. This pattern forces you to choose one of the following approaches:

"Hello, #{get_name(user_id) |> elem(1)}"

"Hello, #{{:ok, name} = get_name(user_id); name || "default"}"

"Hello, #{case get_name(user_id), do: ({:ok, name} -> name; _ -> "default")}"

None of which are, in my opinion, particularly nice ways of solving the problem.

  1. The first approach is problematic if the result of get_name/1 is {:error, error} as we'd be displaying error to the viewer of the string.

  2. The second approach is fine, but will crash if get_name/1 doesn't return {:ok, name}.

  3. The final approach is probably the best, but it is pretty messy to read.

Elixir, thankfully, has an idiom that can rescue us once again! In Elixir, typically, you can expect functions to be in one of two flavours (though again, this isn't always true and not as consistent as I'd wish for it to be):

  1. Functions that have the suffix !

  2. Functions that don't have the suffix !

Functions that belong to the latter class are expected to return tagged values as we described previously. Functions belonging to the former class are expected to return untagged values but may choose to throw an exception if they fail.

For any functions which I want to be able to interpolate cleanly, I can implement a !-variant of it as follows:

def login!(email, password) do
  case login(email, password) do
    {:ok, token} ->
      token

    {:error, error} ->
      raise error # This can always be made into a real exception and raised too
  end
end

def login(email, password) do
  with {:ok, user} <- get_user(email),
       {:ok, true} <- validate_password(user, password),
       {:ok, token} <- generate_access_token(user) do
    {:ok, token}
  end
end

But honestly, this case doesn't happen very much. If it does, it means either I modelled something that can never fail using this pattern (which is fine, but unnecessary), or I need to wrap my string interpolation inside a with clause and handle failures properly.

It might be nice to write a library that can automatically generate !-variants of functions for you; perhaps that'll be a project I start one day 🤔
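
For what it's worth, a minimal sketch of such a macro only takes a handful of lines; defbang is a made-up name here, not an existing library:

defmodule Bang do
  # Given `defbang login(email, password)`, defines `login!/2` in terms of an
  # existing `login/2` that returns `{:ok, _} | {:error, _}` tuples.
  defmacro defbang({name, _meta, args}) do
    bang_name = :"#{name}!"

    quote do
      def unquote(bang_name)(unquote_splicing(args)) do
        case unquote(name)(unquote_splicing(args)) do
          {:ok, result} -> result
          {:error, error} -> raise RuntimeError, message: inspect(error)
        end
      end
    end
  end
end

You'd import Bang and write defbang login(email, password) alongside the tagged-tuple version of the function.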

I hope this helped outline a pattern I find myself reaching for literally whenever I have to write new code or refactor existing code. Feel free to let me know what you think!

Thanks for reading 💪

A minimal Nix development environment on WSL

Creating a minimal Nix development environment for WSL

When doing my day-to-day development work, I primarily work within some Unix-like environment and I revel in the command line. I've briefly explained that I also prefer to use Windows on most of my computers to get the most out of my hardware (as far as GPU support is concerned), but also because I use them more as general purpose computing devices.

Background—my computers and use-cases

I have a few laptops in different form factors to support doing different things such as media consumption (primarily visual novels, ebooks, movies) or actually computing on the go. All of these devices run Windows since that is the operating system with the fewest sacrifices I need to make in order to get the most out of their intended use-case with respect to software availability / drivers / power management.

My main desktop machine is a workstation–gaming machine hybrid with a powerful CPU (for the former) but also a powerful GPU (for the latter). Whilst the gaming situation has improved dramatically with software like Valve's Proton, the majority of the games that lock me into using Windows are unfortunately multiplayer and secured by DRM such as EasyDRM, which is not easily supported by such endeavours. I've also tinkered with PCIe passthrough, which worked well enough but was difficult to maintain and required some convoluted workarounds to issues.

To avoid much more repetition: modern Windows is (in my opinion) a far better operating system for development work than macOS, for the simple reason that WSL provides a much more standardized environment for Unix-like development; with WSL 2 I literally have access to a Linux kernel and any* distribution I want without needing to make too many sacrifices on the OS interoperability front.

Background—Nix and NixOS

The main issue I've encountered in the past year since writing this post about setting up a development environment on Windows is that I've discovered Nix and NixOS which I now run on all my servers and dedicated Linux machines. Over the last year I've written a high-level introduction to NixOS as well as a guide on how I use Nix-isms to enable me to write code, but I've skimmed over one major issue: you cannot install an official NixOS distribution atop WSL!

There is work being done regarding this, but as of now it's still unofficial, and when I tried it out it didn't always work as I expected. I'm eagerly awaiting NixOS officially supporting WSL, which seems like it might happen, so 🤞.

Until a better, more official solution comes along I've just been running Nix (the package manager and language) atop a standard Ubuntu WSL installation; but we can do better 💪😉.

A minimally bloated base for Nix

My issue with using the standard Ubuntu WSL installation as a base for Nix was simply that it provided a lot of stuff out-of-the-box, which was great when I actually wanted to use Ubuntu to do my development; but since switching all of my workflows to Nix/OS I wanted to be able to simply either install NixOS on bare metal, or install Nix atop some existing distribution, and just be done. Because Ubuntu/Fedora/Debian/openSUSE/etc. provide a bunch of base packages, it proves relatively difficult to define one's environment almost entirely using home-manager the way one might on NixOS, since default built-in packages may or may not exist on different distributions or may even be called different things.

As I said, I want to be able to grab my Nix configuration from the web, do a home-manager install, and have that set everything up for me, much like doing the same via NixOS and nixos-rebuild switch (sans the hardware setup). So I wanted to port my Nix configuration over to a very commonly used (and thus well supported) yet super lean and minimal Linux distribution: [Alpine](https://alpinelinux.org/)!

Alpine is a great base distribution for this kind of thing:

  1. It's super lean. It contains basically no cruft whatsoever as it was originally designed for use on servers or on embedded devices. This isn't really a huge concern for me in terms of install-size but it means I'm basically working as "close to the metal" as possible with Nix; with the fewest moving parts.
  2. This leanness lends itself well to being basically the de-facto base Docker image of choice. While this doesn't really affect us as non-Docker focused users of Alpine, it does mean that it's very well supported in terms of packages we might want to install with apk and also it's very well documented. Docker images using Alpine also serve as de-facto recipes and documentation for how to get things working 😉
  3. Alpine provides a busybox environment and a musl libc whose project philosophies are rather interesting to me. This apparently can cause some issues down the line (though point number 2 shows this probably isn't an issue in the long run) but is cool enough that I wanted to give it a shot. This is my first experience using busybox/musl and it works really well and rather transparently!
  4. There is an officially supported Alpine WSL package you can install with winget (or the Microsoft Store) 🏆🏆🏆

Enabling WSL and installing Alpine WSL

Since I've already covered a more detailed WSL installation procedure (though it's a little outdated now), I'll simply give a very high level overview instead of dredging through the minute details and tradeoffs.

The first thing we can do is enable WSL. The only pre-requisites to using WSL are that your Windows version is at least 1903 and that you're using a 64-bit build of Windows. Provided this is true, you can enable WSL by executing the following in an elevated powershell prompt (Win+X):

# Enable Hyper-V (for WSL 2) and WSL features for Windows
Enable-WindowsOptionalFeature -Online -FeatureName VirtualMachinePlatform
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

# Use WSL 2 by default over WSL 1
wsl.exe --set-default-version 2

As you can see, we're also going to default to WSL 2. This is because it provides us a real Linux kernel (and thus all the niceties of using Linux on bare metal essentially; including Snapd and Docker support). It's also much faster for workflows which create/manipulate a lot of files (so long as you're doing this in the Linux environment, if your work requires you to work on the host filesystem then WSL 1 might be a better choice).

Once these commands are executed, you're going to need to reboot. After rebooting, you can install Alpine WSL via the Microsoft Store, though you can also install it with Windows' new package manager if you have that configured (it's still early access).

For maximal user experience, I also suggest installing Windows Terminal which honestly is the only sane choice on Windows and easily rivals terminal emulators I use on Linux such as urxvt or kitty. Windows Terminal, once installed, will allow you to automatically detect and switch profiles between any installed WSL distribution, the standard CMD prompt, Powershell and more.

Once all of this is done, simply run wsl.exe -d Alpine (or use Windows Terminal to launch it) which should cause it to prompt you to create a new user and you can begin hacking around if you want. We want to bootstrap Nix though so there is some more setup that needs to be done.

Bootstrapping Nix

Alpine is minimal and a little quirky to the uninitiated. This means there are a few pre-requisites we need to install before we're even going to be able to install and use Nix.

We need to actually install sudo ourselves since it's not one of the built-in packages and it's a hard dependency of the Nix installation script. Because our created user doesn't have permission to install new packages via apk (😅), we need to do the following:

# Change user to root and install `sudo`
su -
apk add --no-cache sudo

# Enable group `wheel` to use `sudo` (this is convention)
echo '%wheel ALL=(ALL) ALL' > /etc/sudoers.d/wheel

You also need to take care to add your created user to this wheel group to enable it to run the sudo command. This is a bit weird on Alpine WSL since your initially created user doesn't actually have a password set that can be entered for the sudo command to work. Thus, while still logged in as root, we need to do the following:

# Add user to the `wheel` group
adduser <USERNAME> wheel 

# Set password for your user. This is not set by default in Alpine-WSL
passwd <USERNAME>

# Return to your original user shell
exit                                               

Once this is done, confirm that you can now execute sudo as your base user by running something benign like sudo vi. If this works then we can carry onto the next step: installing Nix.

The Nix install script unfortunately has two more necessary prerequisites, curl and xz (three in total if we count the sudo we just installed), so we need to get those installed and out of the way. Execute the following:

sudo apk add --no-cache curl xz

And once this is done, we can simply follow the standard Nix installation instructions. For ease of following this guide I've copied the commands below, but make sure you don't just pipe random things into sh without understanding the consequences. If you'd rather not take my word for what commands to run (which is wise), follow the install procedure on the official site:

# Fetch and execute `nix` install script
curl -L https://nixos.org/nix/install | sh

# The install script asks you to do the following (this might be different based on the OS you use)
echo ". /home/<USERNAME>/.nix-profile/etc/profile.d/nix.sh" >> ~/.profile

From this point on, your Nix on WSL experience is more or less ready to be used. I want to set up a few other utilities which are coupled more-so to my own workflow that I've outlined here that I'll go over briefly but if you have your own configuration then feel free to try porting that over instead.

Installing Home Manager

The main issue I'm trying to solve is that I want a single nix expression to evaluate which will install all my applications and development setup, similar to how you might do so on NixOS. You can achieve something very similar by using home-manager, which is recommended even if you're using NixOS anyway. This way, I can literally share configuration regardless of whether my environment is a true NixOS install versus a Nix/Ubuntu, Nix/Fedora, or Nix/Alpine/WSL install, which is pretty cool 💪😎 You can simply execute the following to download and configure home-manager:

nix-channel --add https://github.com/rycee/home-manager/archive/master.tar.gz home-manager
nix-channel --update
nix-shell '<home-manager>' -A install

And after grabbing my dotfiles and symlinking them to the appropriate place, I can run a home-manager switch and everything is installed and configured automatically.

# Install `git` so that we can fetch dotfiles from the web
nix-env -i git
git clone https://github.com/vereis/nixos

# Remove default created `home.nix` configuration and replace with fetched one
ln -s $(pwd)/<MY_CONFIG>.nix $HOME/.config/nixpkgs/home.nix

# Install everything as specified in config
home-manager switch

Post-install—Setting login shells installed by Nix

Since I don't use ash and don't have any configuration for it (and personally just very much prefer zsh), I have nixpkgs.zsh specified in my home.nix. However, it takes a bit of hacking to be able to use chsh -s <SHELL> when the shell you've installed is managed via nix, since Alpine's chsh has no idea you've installed a new shell.

All packages installed by nix and home-manager are symlinked in $HOME/.nix-profile/bin and thus we need to manually add your newly installed shell to /etc/shells in order to be able to use chsh -s ... as expected:

For setting up zsh, I ran the following commands:

cd ~
echo "$(pwd)/.nix-profile/bin/<SHELL>" | sudo tee -a /etc/shells
chsh -s $(pwd)/.nix-profile/bin/<SHELL>

edit: For some reason, setting the shell this way doesn't quite work. Trying to run applications like tmux (or, weirdly, sometimes just normal editing in vim) causes the screen to get messed up. I'm not sure why this is the case, but it doesn't seem to happen if my terminal emulator is xterm (or anything else Linux-native). All the Windows-native terminal emulators I've tried are buggy 🤔

I've resorted to keeping the default shell on Alpine as /bin/ash and I added /home/chris/.nix-profile/bin/zsh; exit to my ~/.profile to automatically start zsh on login.

Of course, this isn't ideal but it works fine for all of my use cases and is pretty transparent.

edit: I ended up playing around with a few setups on a few different machines and WSL instances. I've written a basic script that is installed with home-manager to automatically make this change for me.

Post-install—Nix-ish services on WSL

WSL doesn't have a traditional init system (no SystemD or anything like it) and as such, a bunch of stuff doesn't work quite the same as usual. In more standard installations of WSL distributions, there still exists some service command which can be used for starting services manually; but no built-in way exists to do so automatically.

For me, the main service I want to be able to run is docker since I need it to do any kind of development work. It turns out that under most distributions, one shouldn't really install docker via nix since you won't be able to use nix to configure system services automatically (this is possible in NixOS but again, there is no official NixOS WSL distribution for us to use). In the past I've been forced to use my main distribution to install docker (i.e. via apt-get on Ubuntu WSL) and just accept the fact that one cannot use nix to manage docker outside of NixOS; but with some tweaking it's actually possible and ideal in Alpine!

Alternatively: you can install the Docker engine on Windows itself and configure it to use its WSL 2 backend, but I dislike doing this since I don't want to be managing anything development-centric inside Windows... regardless...

Hacking in Docker support

After digging around to find out what sudo service start docker is doing in Ubuntu WSL, it turns out that the main thing that needs to happen is that one needs to execute sudo dockerd manually. Doing so will allow docker to do its thing and even have docker-compose work exactly how you'd expect. Thus we can actually simply install docker via nix as per any other package, as long as we're ok with executing sudo dockerd ourselves.

This can be automated to run on shell startup and automatically daemonized using the nixpkgs.daemonize utility. While this utility doesn't really do anything fancy such as automatic restarting of processes, generally it works well enough. You can daemonize a command (essentially just backgrounding it) by installing nixpkgs.daemonize and then executing daemonize <APP> <ARGS...> as you might expect.

A good first step would simply be to write a shell script that does the following automatically after logging in:

sudo daemonize -u root dockerd

dockerd needs to be run as root though, and thus one needs to always enter the root password when this runs. If this proves too annoying to endure (which it is for me, since I already have to type in my ssh passphrase for keyring on first login), you can use the following dirty hack to circumvent needing a password to run things as root:

# DISCLAIMER:
# This hack allows us to execute any command as any user without needing password authentication.
# This is _disgusting and insecure and is probably pretty problematic._ This is however, a nice simple way
# to get around the lack of any init system in WSL distributions w.r.t starting the docker service on login.
#
# As bad as this is, anyone with access to a WSL installation is able to do this anyway, so us abusing it
# for a better user experience is probably fine all things being considered.
#
# Please evaluate _for yourself_ before doing something like this.
nixpath=$HOME/.nix-profile/bin
wsl.exe -u root -e $nixpath/daemonize -u root $nixpath/dockerd

Please read the disclaimer comment before actually doing this 😅 I won't be held responsible for any pain/suffering/pwnage that might follow; but honestly if someone has access to your machine/terminal anyway, you're already owned. Doing this for the sake of UX is worth it in my opinion—especially since wsl.exe -u ... makes it so easy...

Other (standard, simpler) services

Lorri is an example of another service we want to always have running. Luckily for us (and thanks to the creators of daemonize), we can simply write a script similar to our docker-service script above (without the ugly root access hack) to start lorri daemon in the background on startup:

daemonize $HOME/.nix-profile/bin/lorri daemon

These can simply be added to your .bashrc or .zshrc, and these scripts can even be automatically written for you with a custom nix derivation. You can check out my nix configuration for examples of how this is done. If interested, the particular files responsible for doing this are modules/wsl/service-lorri.nix and modules/wsl/service-docker.nix.

Conclusion

After trudging through all this, you should now have a very minimal Nix on WSL environment ready for you to start hacking away at. Following this guide, I only have 23 packages installed and managed by Alpine's apk compared to the few hundred on a standard Ubuntu WSL installation and thus pretty much everything is managed by Nix in combination with home-manager.

While services are a little bit of a hurdle, that's not a Nix or Alpine problem per se, but more of a general workaround for WSL's limitations, which hopefully one day will be fixed.

Hopefully this guide proves helpful to others trying to get something even remotely NixOS like on WSL 🍻 Thanks for reading!

About Me

About Me

Hi, I'm Chris.

I'm a fullstack software engineer problem solver based in London, currently working for Vetspire.

I graduated in 2017 from the University of Kent, Canterbury with a degree in Computer Science focusing on Artificial Intelligence, functional programming, and distributed systems.

I'm a seriously hardcore Erlang & Elixir evangelist, retro computing and mechanical keyboard nerd, and masochist who loves to eat (extremely) spicy food 🌶🔥 Sometimes I write about tech and stuff on my blog.

Professionally, you can find more details in my résumé if you're interested.

Contact

I'm known as vereis, or yiiniiniin in most places on the internet, such as on GitHub, or Twitter.

If you wish to contact me directly, feel free to get in touch via email, LinkedIn, or Twitter.

Build you a `:telemetry` for such learn!

Build you a :telemetry for such learn!

One of the things I hear time and time again when talking about Elixir is that, despite the plethora of great material out there teaching the language, more could be done to dive into how one can use it to develop real-world software.

I'm personally a proponent of learning by doing, and love following books such as Crafting Interpreters to learn new concepts/ideas by building software. Having said that, writing a book is both daunting and admittedly perhaps out of my reach currently, but this idea is still very practical and useful on a small scale!

This post, as the first in a series of posts I hope to write, is meant to be the same kind of thing: implementing a really small, rough version of commonly used libraries/frameworks/projects in Erlang/Elixir which you can follow through, commit-by-commit, concept-by-concept, hopefully learning a bit about how these languages are used in the real world and picking up a few things here and there.

Without further ado, we'll aim to build a small implementation of the :telemetry library in Elixir!

For those more interested in viewing said artefact, instead of following along, you can also view this repository which contains annotated source code for what this blogpost is attempting to build. This may also be a useful aide in following the post itself! Ideally each commit will build a new complete unit of work we can show off and talk about, so make use of the commit history!

What is :telemetry?

In short, :telemetry is a small Erlang library which allows you to essentially broadcast events in your application's business logic, and implement handlers responsible for listening to these events and doing work in response to them.

Some example usage might be as follows:

defmodule MyApp.User do
  def login(username, password) do
    if password_correct?(username, password) do
      :telemetry.execute([:user, :login], %{current_time: DateTime.utc_now()}, %{success: true, username: username})

      :ok
    else
      :telemetry.execute([:user, :login], %{current_time: DateTime.utc_now()}, %{success: false, username: username})

      :ok
    end
  end

  defp password_correct?(username, password), do: false
end

When this :telemetry.execute/3 function is called, any handlers implemented which listen for the event [:user, :login] are run, and these handlers can be implemented within the scope of your own project (i.e. Phoenix LiveView's LiveDashboard feature listens to metrics being raised elsewhere in your application), but also between applications (i.e. if you have a dependency which handles the database like Ecto, you can write your own custom handlers for its telemetry events).
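
For completeness, attaching a handler with the real :telemetry library looks something like the following; the handler function receives the event name, measurements, metadata, and a config term:

:telemetry.attach(
  "log-user-logins",
  [:user, :login],
  fn [:user, :login], measurements, metadata, _config ->
    IO.inspect({measurements, metadata}, label: "user login")
  end,
  nil
)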

:telemetry also provides some other functions such as span/3, which is meant to measure the time a given function takes to execute, and a bunch of other goodies you can get from using other libraries built on top of :telemetry, but these are all just abstractions utilising the base execute/3 function we outlined above, and as such the focus of this project will be building a minimal implementation of that.
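
For reference, span/3 just wraps a function, emits start and stop (or exception) events around it, and expects the function to return a {result, extra_metadata} tuple; MyApp.Users.fetch/1 and user_id here are placeholders:

:telemetry.span([:my_app, :fetch_user], %{user_id: user_id}, fn ->
  result = MyApp.Users.fetch(user_id)
  {result, %{}}
end)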

It is important to note that :telemetry is an Erlang library, but we will focus on building our clone of it in Elixir. This is because Erlang code is notoriously hard to read for someone without any experience reading/writing Erlang, but an Erlang version of this post can be written if people think it would be helpful. Writing it in Elixir will be a useful deep dive into how to bootstrap an Elixir project, and will also explain how :telemetry works, hopefully serving as a Rosetta Stone for people unfamiliar with, but interested in, Erlang.

Prerequisites

The only thing you need to start this project is a version of Elixir installed. I'm personally writing this post with Elixir 1.10.4, as that is a reasonably up to date version of Elixir at the time of writing.

Because I'm using NixOS, I've also included files to bootstrap the developer environment with lorri into the example project. If you're using this too, you should be able to simply copy the shell.nix and env.sh files into your local working directory.

For asdf users, follow your package manager's installation procedure for this project, or get a version of Elixir from literally anywhere else.

Starting the project

The first thing we need to do to start writing code is to bootstrap a new Elixir project. Thankfully, Elixir comes with a build tool called mix which can make this first step more or less trivial!

To create a new project with mix, you can simply run mix new $MY_PROJECT_NAME and it'll generate the minimal file structure you need to start hacking on it.

By default, mix will generate an empty project with no supervision tree, which for our use case would mean some unnecessary extra work; thankfully, mix supports generating different templates (even custom ones) such as umbrella applications or libraries with supervision trees. You can run mix help new to see more detailed information.

For our use case, we'll actually want to create a project with a supervision tree and application callback. This means that this project is startable by any project which includes it as a dependency, and it will have its own supervision tree and processes which can automatically be spawned—this is perhaps getting ahead of ourselves, but we'll need it further along.

Another consideration we'll have to make is the name of our library. Because mix generates a few files for us describing our project, we want to meaningfully name it. It is a convention in the Elixir ecosystem that for applications named :credo, :absinthe, :phoenix, there exist top level modules Credo, Absinthe, and Phoenix. This isn't actually the case for us though, as if we want to publish this package and make it widely available, we can't choose a name that already exists on hex.pm which of course, :telemetry already does.

We can circumvent this by generating a new project via the following snippet which will generate the standard Elixir application boilerplate, set up a basic supervision tree and name your project :build_you_a_telemetry but name the core module namespace Telemetry instead:

mix new build_you_a_telemetry --sup --module Telemetry

Once this is done, we can verify that everything is hooked up together by running mix test which should be all good. This completes the first step of building a :telemetry clone in Elixir. Your local project should now roughly look like the result of this commit in the example repository

If you now take a look into the lib/ directory, you can see that mix has generated a top level telemetry.ex module. This is where we will be adding all of our top level functions. It has also generated a lib/telemetry/ directory where we can add any modules tailored to more specific behaviour which we don't necessarily want to expose at the top level of our library; we'll get to this shortly. I bring this up because, conventionally, the test/ directory should pretty much mimic the file structure of the lib/ directory, so when we do get around to adding more files in lib/telemetry/ we will be doing the same in the test/ directory.

Scaffolding our implementation

I'm very much not a proponent of TDD, in that I don't believe in dogmatically writing tests before writing any code; however, for this post I think it is a good starting point, as we can start taking a look at what our spec is.

If we look at the brief usage example we've given above, we know that the core entrypoint of Telemetry is the execute/3 function, so we can add a test to tests/telemetry_test.exs to assert this contract:

# `describe` blocks let us group related unit tests. It is a fairly common convention to
# have a `describe` block for each main function in your modules, with individual nested
# test cases to describe behaviour based on different permutations of arguments/state.
describe "execute/3" do
  test "returns :ok" do
    assert :ok = Telemetry.execute([:example, :event], %{latency: 100}, %{status_code: "200"})
  end
end

Of course, if we run mix test, this will fail with the following output:

warning: Telemetry.execute/3 is undefined or private
  test/telemetry_test.exs:7: TelemetryTest."test execute/3 returns :ok"/1

  1) test execute/3 returns :ok (TelemetryTest)
     test/telemetry_test.exs:6
     ** (UndefinedFunctionError) function Telemetry.execute/3 is undefined or private
     code: assert :ok = Telemetry.execute([:example, :event], %{latency: 100}, %{status_code: "200"})
     stacktrace:
       (build_you_a_telemetry 0.1.0) Telemetry.execute([:example, :event], %{latency: 100}, %{status_code: "200"})
       test/telemetry_test.exs:7: (test)

We can see that, of course, Telemetry.execute/3 hasn't been defined yet so the test is failing. Let's quickly add the minimal implementation that passes this test case into lib/telemetry.ex:

def execute(_event, _measurements, _metadata) do
  :ok
end

And running mix test again reveals that this passed as expected:

Finished in 0.03 seconds
1 test, 0 failures

For those following this on the example project, this change implements what we just added.

Storing state

What we expect to happen now is that any attached event handlers listening for this event being raised are to be executed. Let's think about what we need to do to get this to work:

  1. Someone calls Telemetry.attach/4 to attach an event handler to Telemetry.
  2. Someone calls Telemetry.execute/3 like in our test.
  3. The invocation of Telemetry.execute/3 causes the handler which was attached in step 1 to be executed.

This implies that there is some hidden, shared state that lives somewhere, which allows us to remember what handlers were registered for a given event so that we can execute them when said events are raised. There are a few approaches for doing this in the Elixir world, but in short they all boil down to either using a separate process (such as an Agent or GenServer), or writing to an :ets table.

Usually, using a GenServer (or Agent, though GenServers are more flexible) is the more idiomatic approach for just storing some state, as what we'll need is actually quite simple, but :telemetry actually uses an :ets table for this as it gives us a bit more functionality than simply storing state. We'll do the same to try and match the functionality of :telemetry, but it's also beneficial for learning as we'll need to bootstrap some processes to handle this anyway.

What is an :ets table anyway?

The linked documentation for :ets can be quite daunting, but a high level overview is that :ets provides a way for us to create in-memory KV databases.

The cool thing about :ets is that it actually gives us constant-time data access, and it's super quick and easy to create/teardown :ets tables whenever we need to. The important thing is that :ets tables have to be explicitly owned by a process and if that owner process dies for whatever reason, the :ets table by default will be terminated.

We can also create :ets tables in a few different configurations, to enable/disable read/write concurrency and even change how we store entries (1 value per key, order by key vs insert time, many unique values per key, many potentially duplicate values per key). Tables can also be configured to be globally readable/writable, writable by owner but globally readable, or only readable/writable to owner.

It's an extremely useful piece of machinery that has a lot of configuration options, so it's definitely worth reading into a bit despite the dense documentation. There are even extensions such as :dets and :mnesia which build atop :ets to provide disk-based persistence and distribution support respectively.
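
To make that concrete, the basic create/insert/lookup flow looks like this (the table name and contents are arbitrary):

# Tables are created with a name and options, and owned by the creating process.
table = :ets.new(:scratch, [:set, :protected])

# Writes are tuples whose first element is the key...
true = :ets.insert(table, {:hello, "world"})

# ...and reads return every tuple stored under that key.
[hello: "world"] = :ets.lookup(table, :hello)
[] = :ets.lookup(table, :missing)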

Implementing handler attachment

Because :ets tables need to be owned by processes, we need to write a minimal process which will own this table and provide utility functions for reading from and writing to it.

Thankfully, doing this is pretty painless because of the supervision tree boilerplate we already generated with our project. We just need to create a new module that has the use GenServer macro as follows:

defmodule Telemetry.HandlerTable do
  use GenServer

  def start_link(_args) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  @impl GenServer
  def init(_args) do
    # We need to create a table with the format `duplicate_bag` as it needs to be
    # able to handle many different entries for the same key.
    table = :ets.new(__MODULE__, [:protected, :duplicate_bag, read_concurrency: true])
    {:ok, %{table: table}}
  end
end

This creates a minimal GenServer named Telemetry.HandlerTable that simply creates a new :ets table as part of its initialization and stores the reference to that table as part of the GenServer's state. As an aside, a convention I like to follow is to define a module such as Telemetry.HandlerTable.State that defines a typed struct to represent the GenServer state instead of passing maps around ad-hoc (if you follow suit, remember to also update init/1 to return {:ok, %State{table: table}} so that the later %State{} pattern matches work). This can be done in this case via:

defmodule Telemetry.HandlerTable.State do
  defstruct table: nil
end

Because GenServers are asynchronous processes that can encapsulate business logic and state, in order to get them to do anything, we need to send them a message, and have them handle said message. The GenServer abstraction provides a bunch of callbacks we can use, but because we want to synchronously do work when someone calls Telemetry.attach/4 and return a response to them, we need to implement the handle_call callback as follows:

@impl GenServer
def handle_call({:attach, {id, event, function, options}}, _from, %State{} = state) do
  true = :ets.insert(state.table, {event, {id, function, options}})
  {:reply, :ok, state}
end

One can call this by running GenServer.call(Telemetry.HandlerTable, {:attach, {id, event, function, options}}), which we can encapsulate in a function to make the API slightly nicer:

def attach(id, event, function, options) do
  GenServer.call(Telemetry.HandlerTable, {:attach, {id, event, function, options}})
end

This function can be added either to the current Telemetry.HandlerTable module or to the top level Telemetry module. I personally prefer that the API and callbacks of a GenServer are in one file, so I'll add it to Telemetry.HandlerTable. Elixir also provides us a nice macro called defdelegate which allows us to expose functions from one module in another, which we can add to the top level Telemetry module to do the actual implementation of Telemetry.attach/4 as follows:

alias Telemetry.HandlerTable
defdelegate attach(handler_id, event, function, opts), to: HandlerTable

Once this is done, we simply start the GenServer in our lib/telemetry/application.ex and our :ets table and server will automatically be started alongside our library:

defmodule Telemetry.Application do
  @moduledoc false

  use Application

  def start(_type, _args) do
    children = [
      {Telemetry.HandlerTable, nil}
    ]

    opts = [strategy: :one_for_one, name: Telemetry.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

And for completeness we can add some unit tests for all of the functions we've implemented, which for brevity you can see as part of this commit. All that is left now is to wire up the Telemetry.execute/3 function to call the functions we've saved in :ets.

Handling events on Telemetry.execute/3

So now that we have a list of functions attached for a given event :: list(atom), we just need to list all of those functions and execute them as part of Telemetry.execute/3. To do this we need to add a new handle_call/3 to our Telemetry.HandlerTable GenServer as follows:

def list_handlers(event) do
  GenServer.call(__MODULE__, {:list, event})
end

@impl GenServer
def handle_call({:list, event}, _from, %State{} = state) do
  handler_functions =
    Enum.map(:ets.lookup(state.table, event), fn {^event, {_id, function, _opts}} ->
      function
    end)

  {:reply, handler_functions, state}
end

Whilst it's not necessary to expose this function in our top level API module, :telemetry does it (possibly to allow other libraries to hook into it), so we can expose our variant of this as well via:

defdelegate list_handlers(event), to: HandlerTable

This should be a pretty small change, which we can quickly unit test, as per this commit.

This function returns a list of the attached functions for a given event, so we can update our implementation of Telemetry.execute/3 to simply iterate over all of these functions and execute them:

def execute(event, measurements, metadata) do
  for handler_function <- list_handlers(event) do
    handler_function.(event, measurements, metadata)
  end

  :ok
end

We can now augment our existing execute/3 tests by testing that handlers are actually executed!

describe "execute/3" do
  test "returns :ok" do
    assert :ok = Telemetry.execute([:example, :event], %{latency: 100}, %{status_code: "200"})
  end

  test "returns :ok, any attached handlers are executed" do
    test_process = self()

    assert :ok =
             Telemetry.attach(
               "test-handler-id-1",
               [:example, :event],
               fn _, _, _ ->
                 send(test_process, :first_handler_executed)
               end,
               nil
             )

    assert :ok =
             Telemetry.attach(
               "test-handler-id-2",
               [:example, :event],
               fn _, _, _ ->
                 send(test_process, :second_handler_executed)
               end,
               nil
             )

    assert :ok = Telemetry.execute([:example, :event], %{latency: 100}, %{status_code: "200"})

    assert_received(:first_handler_executed)
    assert_received(:second_handler_executed)
  end
end

You can see now that with this commit, the basic core functionality of the :telemetry library is complete.

Tooling and correctness

Before continuing to polish the core business logic of our library (there are a few small things left to do), this is a good point to introduce some common tooling I like to bring into my projects to make them more correct. I'd like to introduce two main tools:

  1. Credo—from what I understand this is basically like Elixir's version of Rubocop or ESlint. It's nice for improving and enforcing code readability and consistency rules among a few other things!
  2. :dialyzer and its Elixir wrapper Dialyxir—a static analyzer which can point out definite typing errors among other things!

Credo

First off, we'll add Credo to our list of dependencies in our mix.exs file:

defp deps do
  [
    {:credo, "~> 1.5", only: [:dev, :test], runtime: false}
  ]
end

After this is done, all we need to do is run mix deps.get to download packages from hex.pm and we're ready to continue. This dependency is set up to not actually execute at runtime. It basically just provides a new mix target called mix credo which will execute and report any problems it finds.

If we take a look at the example repository at the most recent commit until now, running mix credo actually results in the following errors:

==> credo
Compiling 213 files (.ex)
Generated credo app
Checking 6 source files ...

  Code Readability
┃
┃ [R] → Modules should have a @moduledoc tag.
┃       lib/telemetry/handler_table.ex:1:11 #(Telemetry.HandlerTable)
┃ [R] → Modules should have a @moduledoc tag.
┃       lib/telemetry/handler_table.ex:4:13 #(Telemetry.HandlerTable.State)

Please report incorrect results: https://github.com/rrrene/credo/issues

Analysis took 0.02 seconds (0.01s to load, 0.01s running 45 checks on 6 files)
14 mods/funs, found 2 code readability issues.

It's saying that we haven't properly documented the modules we added, which is very true. If we add some @moduledoc tags and re-run mix credo we see that no further issues are reported! (This is surprising given the fact I've just been writing code in parallel to writing this post, but hey ho!)

You can see the overall changes made to the example repository here.

Dialyzer and Dialyxir

Similarly to how we added Credo as a dependency of our application, we want to do the same with Dialyxir. Actually, in the BEAM world, :dialyzer comes as part of Erlang's standard library but using it raw from Elixir is a little painful. Dialyxir is a simple wrapper for :dialyzer which adds to mix a mix dialyzer target for you to run.

One should be warned that :dialyzer caches data after running in what is called a PLT (a persistent lookup table), but of course, this does not exist for the first run. This means :dialyzer can take a really long time to run initially, so don't be alarmed if it takes a while to finish: :dialyzer is statically analyzing your entire library, and the entire standard library! One should also set up some options to tell Dialyzer where to put the PLT; a full mix.exs configuration can be seen as follows:

diff --git a/mix.exs b/mix.exs
index 3db7007..7026ed5 100644
--- a/mix.exs
+++ b/mix.exs
@@ -7,7 +7,10 @@ defmodule Telemetry.MixProject do
       version: "0.1.0",
       elixir: "~> 1.10",
       start_permanent: Mix.env() == :prod,
-      deps: deps()
+      deps: deps(),
+      dialyzer: [
+        plt_file: {:no_warn, "priv/plts/dialyzer.plt"}
+      ]
     ]
   end

@@ -22,7 +25,8 @@ defmodule Telemetry.MixProject do
   # Run "mix help deps" to learn about dependencies.
   defp deps do
     [
-      {:credo, "~> 1.5", only: [:dev, :test], runtime: false}
+      {:credo, "~> 1.5", only: [:dev, :test], runtime: false},
+      {:dialyxir, "~> 1.0", only: [:dev], runtime: false}
     ]
   end
 end

To execute :dialyzer, simply run mix dialyzer and wait for it to finish running:

Starting Dialyzer
[
  check_plt: false,
  init_plt: '/home/chris/git/vereis/build_you_a_telemetry/priv/plts/dialyzer.plt',
  files: ['/home/chris/git/vereis/build_you_a_telemetry/_build/dev/lib/build_you_a_telemetry/ebin/Elixir.Telemetry.Application.beam',
   '/home/chris/git/vereis/build_you_a_telemetry/_build/dev/lib/build_you_a_telemetry/ebin/Elixir.Telemetry.HandlerTable.State.beam',
   '/home/chris/git/vereis/build_you_a_telemetry/_build/dev/lib/build_you_a_telemetry/ebin/Elixir.Telemetry.HandlerTable.beam',
   '/home/chris/git/vereis/build_you_a_telemetry/_build/dev/lib/build_you_a_telemetry/ebin/Elixir.Telemetry.beam'],
  warnings: [:unknown]
]
Total errors: 0, Skipped: 0, Unnecessary Skips: 0
done in 0m0.92s
done (passed successfully)

In my case, between the previous commit and this commit, everything worked out again (surprisingly), but having :dialyzer set up is indispensable for helping to catch bugs before they happen.

It's also possible to add custom type annotations to our code to help :dialyzer out when it's looking for type errors. Type annotations also serve as good documentation, so in my opinion it's always good practice to add typespecs, at least to the main functions of any public modules. You can see this commit for an example of how to go about doing this.
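
For illustration, the types referenced by the @specs later in this post could be defined something like the following. This is just a sketch: the names mirror the specs we use below, but the exact definitions in the example repository may differ.

# lib/telemetry.ex -- illustrative type definitions and an example spec'd delegate
@type event :: [atom(), ...]
@type handler_id :: String.t()
@type measurements :: %{optional(atom()) => number()}
@type metadata :: %{optional(atom()) => term()}
@type handler :: (event(), measurements(), metadata() -> any())

@spec list_handlers(event()) :: [{handler_id(), handler()}]
defdelegate list_handlers(event), to: HandlerTable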

Mix aliases

Now that we have Credo and Dialyxir set up and running, we can ponder about maybe setting up a CI pipeline that runs our tests, Credo, and Dialyzer in turn. Elixir provides a way of doing this with mix aliases which are just functions you can define in your mix.exs file:

defp aliases do
  [lint: ["format --check-formatted --dry-run", "credo --strict", "dialyzer"]]
end

def project do
  [
    aliases: aliases(),
    ...
  ]
end

...

What this does is run mix format --check-formatted --dry-run to ensure that all files are formatted as per your .formatter.exs file (add one if it wasn't generated with your project), then credo --strict and dialyzer, all in one fell swoop. Ideally, your CI pipeline can then consist of three different steps:

  1. Run mix compile to make sure everything actually builds correctly. For good measure, I set warnings_as_errors: true so warnings don't leak into my code (see the sketch after this list).
  2. Run mix lint to ensure code quality, conventions, etc. are as we expect.
  3. Run mix test --cover to ensure all tests pass and that we have adequate test coverage.
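
For reference, here's roughly how you can bake warnings_as_errors into your project configuration so it applies to every compile. This is just a sketch (the app name here is assumed from the example repository's name, and you could equally just run mix compile --warnings-as-errors in CI instead):

# mix.exs -- the elixirc_options entry is the relevant addition here
def project do
  [
    # app name assumed from the example repository; use your own
    app: :build_you_a_telemetry,
    version: "0.1.0",
    elixir: "~> 1.10",
    start_permanent: Mix.env() == :prod,
    elixirc_options: [warnings_as_errors: true],
    deps: deps(),
    aliases: aliases(),
    dialyzer: [
      plt_file: {:no_warn, "priv/plts/dialyzer.plt"}
    ]
  ]
end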

I'll leave implementing CI to the reader though, as that varies a lot depending on your preferences and repository hosting solution, but you can see our latest bit of progress on the example repository.

Final changes

Now that we've gotten out of the rabbit-hole which is setting up some useful tooling for our continued development, we need to add one last main feature: exception handling.

If we look at :telemetry's implementation of the equivalent of our execute/3 function, we can see that it's not just executing the handler functions. While iterating over the list of attached handlers, it makes sure to wrap each handler execution in a try/catch to prevent the caller from exploding if something goes wrong!

This way, the caller never has to worry about :telemetry calls blowing up the calling context, instead, we want to react to these exceptions and silence them, removing any handlers which fail from the handler table.

We can pretty much copy that approach wholesale: we need to update our Telemetry.execute/3 function as we do below, but we also need to be able to selectively detach a given handler. This is why we always pass in a handler_id when attaching handlers; these are meant to be unique IDs we can use to identify individual handlers. We will need to update our Telemetry.HandlerTable to treat these keys as unique, as well as implement a way of deleting handlers:

# lib/telemetry.ex
@spec detach_handler(handler_id(), event()) :: :ok
defdelegate detach_handler(handler_id, event), to: HandlerTable

@spec execute(event(), measurements(), metadata()) :: :ok
def execute(event, measurements, metadata) do
  for {handler_id, handler_function} <- list_handlers(event) do
    try do
      handler_function.(event, measurements, metadata)
    rescue
      error ->
        log_error(event, handler_id, error, __STACKTRACE__)
        detach_handler(handler_id, event)
    catch
      error ->
        log_error(event, handler_id, error, __STACKTRACE__)
        detach_handler(handler_id, event)
    end
  end

  :ok
end

defp log_error(event, handler, error, stacktrace) do
  Logger.error("""
  Handler #{inspect(handler)} for event #{inspect(event)} has failed and has been detached.
  Error: #{inspect(error)}
  Stacktrace: #{inspect(stacktrace)}
  """)
end
# lib/telemetry/handler_table.ex
@impl GenServer
def handle_call({:attach, {handler_id, event, function, options}}, _from, %State{} = state) do
  # Pattern match the existing table to make sure that no existing handlers exist for the given
  # event and handler id.
  #
  # If nothing is found, we're ok to insert this handler. Otherwise fail
  case :ets.match(state.table, {event, {handler_id, :_, :_}}) do
    [] ->
      true = :ets.insert(state.table, {event, {handler_id, function, options}})
      {:reply, :ok, state}

    _duplicate_id ->
      {:reply, {:error, :already_exists}, state}
  end
end

@impl GenServer
def handle_call({:list, event}, _from, %State{} = state) do
  response =
    Enum.map(:ets.lookup(state.table, event), fn {^event, {handler_id, function, _opts}} ->
      {handler_id, function}
    end)

  {:reply, response, state}
end

@impl GenServer
def handle_call({:detach, {handler_id, event}}, _from, %State{} = state) do
  # Try deleting any entry in the ETS table that matches the pattern below, caring only
  # about the given event and handler_id.
  #
  # If nothing is deleted, return an error, otherwise :ok
  case :ets.select_delete(state.table, [{{event, {handler_id, :_, :_}}, [], [true]}]) do
    0 ->
      {:reply, {:error, :not_found}, state}

    _deleted_count ->
      {:reply, :ok, state}
  end
end

@spec detach_handler(Telemetry.event(), handler_id :: String.t()) :: :ok | {:error, :not_found}
def detach_handler(handler_id, event) do
  GenServer.call(__MODULE__, {:detach, {handler_id, event}})
end

We can add unit tests for the individual handle_call clause we added:

test "{:detach, {event, handler}} returns :ok when trying to delete an attached handler", %{
  state: state
} do
  assert {:reply, :ok, _new_state} =
           Telemetry.HandlerTable.handle_call(
             {:attach, {"my-event", [:event], fn -> :ok end, nil}},
             nil,
             state
           )

  assert {:reply, :ok, _new_state} =
           Telemetry.HandlerTable.handle_call({:detach, {"my-event", [:event]}}, nil, state)

  assert length(:ets.tab2list(state.table)) == 0
end

test "{:detach, {event, handler}} returns {:error, :not_found} when trying to delete an unattached handler",
     %{
       state: state
     } do
  assert {:reply, {:error, :not_found}, _new_state} =
           Telemetry.HandlerTable.handle_call({:detach, {"my-event", [:event]}}, nil, state)

  assert length(:ets.tab2list(state.table)) == 0
end

And lastly, we can unit test the entire flow by testing the top level lib/telemetry.ex like so:

test "returns :ok, any attached handlers that raise exceptions are detached" do
  test_process = self()

  assert :ok =
           Telemetry.attach(
             "detach-handler-test-id-1",
             [:detach, :event],
             fn _, _, _ ->
               send(test_process, :first_handler_executed)
             end,
             nil
           )

  assert :ok =
           Telemetry.attach(
             "detach-handler-test-id-2",
             [:detach, :event],
             fn _, _, _ ->
               raise ArgumentError, message: "invalid argument foo"
             end,
             nil
           )

  assert length(Telemetry.list_handlers([:detach, :event])) == 2

  assert :ok = Telemetry.execute([:detach, :event], %{latency: 100}, %{status_code: "200"})

  assert_received(:first_handler_executed)

  assert length(Telemetry.list_handlers([:detach, :event])) == 1

  assert {:error, :not_found} =
           Telemetry.detach_handler("detach-handler-test-id-2", [:detach, :event])
end

That leaves us with a more or less functional version of the :telemetry library written in Elixir! More can definitely be built on top of this, but as a basic starting point, I think it serves as a good example of how one can build an application/library end-to-end in Elixir that provides some real, potentially ecosystem-improving benefit.

You can follow the example repository up to this commit which shows the library as what I'd call feature-complete.
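
As a quick sanity check, here's what using the finished library looks like end to end, based on the API we've built up over this post (the event name and measurements here are just made up for the example):

# Attach a handler that prints out HTTP request measurements
:ok =
  Telemetry.attach(
    "log-http-requests",
    [:http, :request],
    fn event, measurements, metadata ->
      IO.inspect({event, measurements, metadata}, label: "telemetry")
    end,
    nil
  )

# Somewhere in your application code, emit the event
:ok = Telemetry.execute([:http, :request], %{latency: 100}, %{status_code: 200})

# Detach the handler once you no longer care about the event
:ok = Telemetry.detach_handler("log-http-requests", [:http, :request])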

Publishing the package

The last thing you need to do is publish the package to hex.pm. You can do this quite easily with the following steps:

  1. You need to register with hex.pm. You can do this by running mix hex.user register
  2. You then need to add a description and package definition to your mix.exs file, as per this commit (a rough sketch follows this list)
  3. You need to run mix hex.publish
  4. Check out your newly minted project on hex.pm! The example repository of this project is uploaded here for instance.
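
If you're curious what step 2 roughly looks like, here's a sketch of the kind of description and package metadata you'd add to mix.exs. The field values below are illustrative assumptions (the license and repository URL are placeholders inferred from the paths shown earlier) rather than copied from the linked commit:

# mix.exs -- illustrative package metadata for hex.pm
def project do
  [
    # ... existing options ...
    description: "A toy re-implementation of the :telemetry library, built for learning",
    package: package()
  ]
end

defp package do
  [
    # license and URL are placeholder assumptions; set these to match your project
    licenses: ["MIT"],
    links: %{"GitHub" => "https://github.com/vereis/build_you_a_telemetry"}
  ]
end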

Conclusion

Hopefully this blog post was, and will continue to serve as, a helpful reference for how to take an Elixir project from beginning to end, ultimately publishing it to hex.pm.

I intend to write more posts of this type because, in my opinion, they're genuinely helpful and they also help me consolidate my own knowledge. Thanks for taking the time to read this and stay tuned :-)

Stuff I use

I write a lot about things I'm interested in, and in many posts, I've alluded to particular pieces of hardware or software I use. I'm also often explicitly asked what I think of some software or why I chose what I chose.

In case anyone is interested, I've put together this page to consolidate all of the above 😅. Despite being a timestamped blog post, I'll keep this page updated as time goes on (as well as linking to any relevant posts which might get written about changes in my setup).

Hardware

I'll outline computing devices and peripherals in separate sections below, simply because my peripherals are often used by more than one computer.

Computing Devices

Most of my general-purpose computers run the same software stack, so I'll detail those in another section, though I've expanded a little where that isn't 100% true.

  • Primary Workstation: Custom built
    My daily driver for all my development work is a custom-built workstation I built a few years ago.

    It's a little outdated so I'll probably upgrade soon but it's currently made up of the following components/parts:

    Case: Corsair Obsidian 750D
    CPU: AMD Ryzen 5950x (16C/32T) clocked @ 4.0Ghz
    GPU: Inno3D GeForce RTX 3080
    Memory: 32GB Corsair Vengeance @ 3200Mhz
    Storage: Samsung 860 EVO 250GB + 2TB WD Blue HDD
    Motherboard: Asrock X470 Taichi
    PSU: Corsair RM850

  • Secondary Workstation/Server: Custom built
    This computer was originally built before going to University, though it's gotten many upgrades since. It usually just gets hand-me-downs from my primary workstation as I upgrade that.

    Unlike literally all of my other machines, instead of running Windows 10, it runs NixOS, though I use PCIe-passthrough to run Windows 10 LTSC as a host for running/streaming games through Steam.

    I don't do anything fancy with the NixOS side of things, though I have plans to host many services locally via k3s. I'll update this as that happens. 😅

    On the Windows side, Windows 10 LTSC automatically boots into Steam's Big Picture Mode, which is displayed on my TV in my bedroom. Coupled with a bunch of dedicated Steam Links, weaker general-purpose computing devices, and Android TV boxes around the house, I can enjoy my entire Steam library (which is way too large), visual novels, and emulated games effortlessly and from the comfort of something that isn't a desk!

    It has the following specs:

    CPU: AMD Ryzen 2700x (8C/16T) clocked @ 4.2Ghz
    GPU: XFX AMD Radeon RX 550 (4GB) for NixOS
    GPU: Gigabyte GeForce GTX 1060 (6GB) for Windows
    Memory: 16GB Corsair Vengeance @ 3200Mhz
    Storage: Samsung 860 EVO 250GB + 2TB WD Blue HDD
    Motherboard: MSI A320M-A Pro
    PSU: Corsair TX550M

  • Primary Laptop: "New" Dell XPS 15 9500
    This is my primary, work-provided, laptop and I use it whenever I don't want to work at my desk.

    Day-to-day, though, since I'm usually using my workstation, I use Synergy to share my workstation's keyboard and mouse so I can keep all my work-related software installed here rather than polluting my daily driver's OS. This also means I can use the XPS as a second monitor for reading documentation, Slack, as well as a webcam.

    As a standalone machine, it runs everything I throw at it admirably, and the battery lasts most of the day without needing to recharge.

    It has the following specs:

    Display: Infinity Edge w/ touch @ 4k
    CPU: Intel i7-10750H (6C/12T) clocked @ 2.6Ghz
    GPU: NVIDIA GeForce GTX 1650 Ti
    Memory: 32GB OEM Provided @ 2900Mhz
    Storage: OEM Provided 1TB NVME

  • Secondary Laptop: Microsoft Surface Laptop 3 13-inch
    This was my primary laptop until not long ago. I bought it myself to replace an old, work-provided Macbook Pro 13 (2016) since I really didn't get along with MacOS 😅

    Nowadays, I don't really use it, and it acts as a secondary computing device in a pinch or as a general computer for family to use if they're staying over.

    It has the following specs:

    Chassis: Alcantara in grey 🧡
    CPU: Intel Core i7-1035G7 (4C/8T) clocked @ 1.3Ghz
    GPU: Intel Iris Plus 950
    Memory: 16GB OEM Provided @ 2400Mhz
    Storage: OEM Provided 512GB SSD

  • Media Consumption Tablet: Microsoft Surface Pro 4
    This is a tablet that I use for consuming media (Netflix, Visual Novels, ebooks, etc.) when I don't physically want to use a laptop.

    Because the form factor is so convenient, if I don't anticipate having to work when I travel (when that used to be a thing), this would double as my primary laptop. Despite its age and declining maximum battery life, it lets me do development work in a pinch.

    It has the following specs:

    Typecover: Alcantara in grey 💗
    CPU: Intel Core i7-6650U (2C/4T) clocked @ 2.2Ghz
    GPU: Intel Iris 540
    Memory: 16GB OEM Provided @ 2400Mhz
    Storage: OEM Provided 512GB SSD

  • Phone: Xiaomi Mi Mix 3
    This is a phone I'd be very hard-pressed to upgrade from. I've never been very embedded in the Apple ecosystem, so despite having owned iPhones in the past, I've never stuck with it.

    When the iPhone started popularizing the notch, I really admired the push to minimize bezels but really, really, really dislike the "notch". I think it's such an inelegant solution to needing to design around a front-facing camera.

    Thankfully, that's where the Mi Mix 3 shines! The phone is made of two sections that slide over each other (much like an old-school sliding phone) to optionally reveal the front-facing camera.

    Not only is the phone effectively bezel-less in every sense of the word, it's also amazingly satisfying to play with the magnetic slider mechanism. It has a real weight and quality to it. It comes with all the features one would expect from a contemporary flagship phone, including a fingerprint sensor, NFC support, wireless charging, and pretty much everything else I care about.

    Despite having bought one at launch, the battery still lasts the entire day, and the slider mechanism still feels as smooth and solid as the day I got it.

    It has the following specs:

    OS: MIUI Global 12.0.2 (Android 10)
    CPU: 4 powerful Kryo 385 Gold (2.8Ghz) cores and 4 weaker Kryo 385 Silver (1.7Ghz) cores
    GPU: Adreno 630
    Chipset: Snapdragon 845
    Memory: 8GB
    Storage: 128GB

Peripherals

  • Desk: Fully Jarvis Bamboo Standing Desk
    I used to use an IKEA Idasen standing desk after getting used to standing in the middle of the day when I used to work in an office, but a good friend recommended the Jarvis to me when I was shopping for something a bit more stable and less loud.

    I went with the 160x80cm bamboo tabletop because I loved how warm it looks. I also went with the gunmetal table legs plus the optional bamboo desk drawer and programmable controller.

    Overall, it was a great purchase. Despite the fact that I already had a standing desk, the smoothness, quality, and warmth that this desk emanates make me look forward to working atop it 😍

  • Headphones: Sony WH-1000XM4
    Just a really solid set of headphones. The noise cancelling is reliable and really effective, they're comfortable, the battery lasts forever, and they sound great.

    There are a few intelligent features such as pausing whatever may be playing and disabling noise cancellation when you speak, as well as doing the same when you perform some gestures. These are neat features but I don't use them very much outside of skipping/pausing media.

    My only complaint is that I have to be very careful wearing these outside, since they're not waterproof at all and not exactly cheap! Whenever it rains (which is often in London), I have to put them away pretty quickly.

    I'll probably invest in some wireless earbuds to make up for this weakness, but overall I'm pretty happy with the purchase.

  • Monitor: LG UltraGear 27GN95B
    This is my first time using any sort of high refresh-rate monitor. I primarily got it so that I could enjoy 4k 144hz gaming paired with my RTX 3080, but the improvements to fluidity even when browsing the web make me feel like I'll be hard-pressed to ever go back to a low refresh rate monitor.

    My previous monitor was a BenQ PD2700Q, which had amazing colour accuracy and decent blacks (especially for a monitor from 2017), and while the new monitor's colour accuracy and black levels are noticeably worse, the downgrade isn't so bad as to detract from an amazing viewing experience.

    I'm not a fan of "gamer" aesthetics, and whilst the back of the monitor features RGB and some pretty edgy design, the front of the monitor is incredibly tasteful, with very minimal bezels easily matching the look of my XPS 15. This monitor is highly recommended, especially due to the lack of choices at the time of writing for 4k + 144hz.

  • Keyboard: HHKB Professional Hybrid
    I originally bought this HHKB because I wanted to experience what people meant when they said that Topre key switches felt amazing to type on and were nothing like any of the switch types I had tried in the past.

    I ended up keeping it as my primary keyboard and seldom ever use any of the other keyboards in my collection.

    I really enjoy the feeling and sound of these unsilenced Topre switches, and despite some initial hardship with the layout, once I adapted, I couldn't bring myself to go to a larger form factor.

    The HHKB is super light, effortlessly fits into my bag, is wireless (and I only change the batteries once every 2 months or so), can connect to multiple profiles, and more! I admit the layout might be a deal-breaker for most people, but I'd definitely recommend trying out Topre if you're at all into mechanical keyboards.

    I'm also using aftermarket Japanese legend keycaps because it looks cool 💪

  • Primary Mouse: Logitech MX Master 3
    A lot has been said about the MX Master 3, so I won't ramble on for too long.

    It's a great mouse with arguably the best scroll-wheel money can buy. Once you get used to the thumb scroll-wheel, most mice feel lacking. It's adequately weighty, precise, wireless, and doesn't need ridiculously frequent recharging—a win on all fronts.

  • Secondary Mouse: Logitech M730
    My secondary mouse is almost permanently paired with my Surface Laptop 3 or Surface Pro 4.

    Its battery lasts a hell of a long time, even compared with the MX Master, and it's much smaller, which lets me throw it into a bag if I'm traveling.

  • Gamepads: Steam Controller
    Whenever I want to use a traditional controller for gaming, I use one of my Steam Controllers.

    These controllers seem to be a controversial topic on the web; people will either attack them as unwieldy and (admittedly) not very suitable for fighting games or defend them by calling them the most configurable and comfortable controller money can buy. I'm very much in the latter camp. 😁

    These controllers eschew the right analog stick and traditional D-pad for two touchpads which can be configured to do a bunch of things (oftentimes just to emulate the input method they replaced). I agree with the sentiment that they're not ideal for high precision inputs such as fighting games (though I used to play an embarrassing anime fighter: Naruto Ultimate Ninja Storm 2–4 competitively at a top 3 level in the UK 💪💪💪). They're great for enjoying shooters, though, so you win some, you lose some.

Software

I'm actually a bit of a software purist with respect to running as little software as possible. I'm definitely not suckless tier, but I try to avoid installing software if I can help it.

You can definitely see how that pans out with a few less popular choices below!

  • Primary Operating System: Windows 10 Pro
    I've written a lot about running Windows 10, so I won't ramble on too much here.

    In short, if I could run all my software across all my hardware without sacrificing driver stability, driver availability, software availability, or battery life, I'd definitely run Linux (most likely NixOS) on all of my machines. Unfortunately, that isn't the world we live in today unless you're okay with carefully choosing and considering your hardware from a subset of available hardware.

    In my experience, the most cited reasons why Windows isn't something people should use usually involve telemetry, uncontrollable automatic updates, or poor developer experience. I totally agree that these are problematic, to an extent.

    So long as I have my email hosted by Google or have any social media of any kind, people are collecting telemetry about me. It's gotten to the point where telemetry and "spying" is an unavoidable fact of life. I've simply given up this fight.

    Uncontrollable updates are annoying, and once the next release of Windows 10 LTSC is available, I'll be buying a copy of that. That way, I don't get any of the bloat that usually comes with Windows 10 or any updates outside of security updates. The only reason I'm not using LTSC at the moment is that WSL2 isn't available in the current LTSC release.

    WSL affords me a real Linux experience, running on a real Linux kernel, so as far as I'm concerned, the developer experience is there. You get all of the benefits (outside of ideological benefits) of running Linux while benefiting from all that Windows (and proprietary software) has to offer.

    If you're a macOS user, @ me up when you can run Docker natively 😉

  • Secondary Operating System: NixOS or Alpine Linux
    I've spoken about why I use Alpine Linux on WSL2 and why I drank the NixOS kool aid, so again, I won't ramble on too much 🤞

    Whenever I can run NixOS on bare metal, I will. I operate a few VPSes, and even those run NixOS. I really appreciate all of the configuration, stability, and reproducibility benefits that it affords you. I genuinely think NixOS is onto something and will become an indispensable tool in the future.

    Whenever I can't run NixOS, I'll opt for running Alpine Linux because, by default, it's super minimal and lightweight. I have my own extensive Nix + HomeManager config, which I install atop Alpine Linux (or literally any other OS that Nix is compatible with), and I'm ready to start hacking away.

  • Web Browser: Microsoft Edge
    I've already started receiving hate mail 🤣

    I used Firefox religiously for the past 15 years or so and only recently switched to Edge after Microsoft made "Edgium" publicly available.

    I do think that trying to avoid a browser (rendering engine) monopoly is a good thing, but Microsoft Edge is actually pretty good! The main features I like include built-in support for vertical tabs, tab collections (I can save all my work-related stuff into a tab "workspace"), and complete(?) interoperability with the Chrome ecosystem, so I never need to worry about stuff being janky because some developers only tested on Chrome.

    Vertical tabs are new to me, and I'm aware you can get them on Brave (out of the box?) and Firefox (with an extension), but how wild is it that Windows 10 comes with a modern, fast web browser with a built-in ad blocker (I still use uBlock though) and support for things like vertical tabs? That's pretty wild!

    Honestly, it's entirely possible I'll switch back to Firefox, but not having to install an alternative web browser is really nice. One less moving part!

  • Terminal Emulator: Windows Terminal or Kitty
    When I'm on Windows, Windows Terminal is literally the best terminal you can use when it comes to the trade-off between features and speed. What I mean by that is: it's both super featureful and fast 🤯

    When I'm on Linux, I use Kitty, which is also very featureful and GPU accelerated. I've heard a lot of discussion about whether or not GPU acceleration in the terminal is actually useful but it seems fine to me.

    The main considerations I have when choosing a terminal are whether or not they support window padding, mouse events, Unicode, and ligatures. Unfortunately, this excludes many other popular choices such as alacritty.

  • Shell: zsh with oh-my-zsh
    When I was still at University, I read Steve Losh's post about zsh, and I tried doing something similar myself. It just stuck!

    Admittedly, I mainly use zsh for eye candy. I know most of what I use is achievable in standard bash, or probably anything else, but it works well enough! More recently, I've started making use of CDPATH, which lets me set up "portals" to jump around my filesystem, but again, it's nothing fancy.

    Unfortunately, oh-my-zsh is pretty heavy and slow, and in the past, I've actually switched to zprezto, which is much lighter weight. When switching all of my dotfiles to Nix, I ran into some issues with the built-in zprezto support and didn't bother debugging it, since it was easy enough to port my config to oh-my-zsh instead.

  • Text Editor: Neovim
    I get by perfectly fine on vanilla Vim. Still, when working on larger and more complex projects, I really appreciate Neovim exclusive plugins such as coc.nvim, which provides really strong IntelliSense.

    I've pretty much always used Vim, so I don't have any objective reasons as to why I'm not using something like Visual Studio Code. Still, this way, I can continue living in my terminal and keeping everything configured easily, as well as maximally utilising years of building up Vim muscle memory.

    I don't use many plugins (though I suppose the amount one considers "a lot" is rather arbitrary). I wrote a walkthrough of my Vim configuration and plugin usage here if you're interested.

  • Colourscheme: Monokai
    I just really like Monokai and like that it's available for pretty much everything (including custom GitHub userstyles, a fact I abuse for the syntax highlighting of this blog!).
