How We Established Better Boundaries and Clearer Code Ownership at Alto

Feb 13, 2022

By

Alto Pharmacy

Some problems are nice to have. Over the course of 2020 the engineering team at Alto saw explosive growth, expanding from about fifteen to more than seventy individual contributors. Growth at this pace required us to rethink our team structures and priorities more than once. This growth felt amazing: the new capacity enabled us to make progress on our backlogs like never before.

But still, there were challenges. We saw a steady rise in merge conflicts as more people contributed to the same code base. Build times increased as more and more code was contributed. A big influx of new perspectives and talent also meant new gaps in historical context and responsibility. New teams took big bites out of their new priorities — great! But what about the old ones? Who was going to fix that bug? Who needs to review this PR? Who is accountable for this performance problem?

Who, in short, owned what? By the end of 2020, it was clear to us that a good, practical, working answer to this question would be the critical first step to continuing smooth, rapid, distributed iteration on our product.

Better Boundaries, Clearer Ownership

Up to this point, the large majority of product development at Alto had happened in the context of a Ruby on Rails monolith, which had grown pretty large over a half decade of intense development. We’d stuck more or less to the default Rails convention of organizing code by function rather than feature, which meant that our code was distributed across a few mostly flat and very broad subdirectories such as '/models' and '/controllers'. That made it difficult to learn much about the structure and ownership of our business domains by merely looking at how the code was laid out. We knew that this was not going to work for us over the long term.

Our long-term vision started to coalesce around a notion of “better boundaries.” First, we would modularize the monolith along the lines suggested by Shopify’s packwerk utility. Then, we would carve some of those modules out into separate engines that could be built and tested separately. Finally, we’d deploy and operate some of those engines independently as their own services. Clear ownership was fundamental to this plan: on either side of an effective boundary, you need to know not just what lives there but also who owns it. We lacked that clarity.

The RFC (Request for Comments) we ultimately adopted opened with four questions:

  • How might we produce a domain model and do something useful at the same time?

  • How will we avoid imposing (the wrong) prescriptive model from the top down?

  • How will we keep it up to date and accurate?

  • How will we know we're "done"?

The Owners Tag

We considered jumping straight into moving files around into a more modular structure. Ultimately we decided against this approach as a first step for a number of reasons. The benefit of moving files around is mostly abstract: it’s not immediately clear what to do with a better module structure. It’s relatively expensive to move files around in a Rails app, since doing so typically requires you to rename the classes and modules they contain (and update all of their callers). That amounts to a lot of busy work, and it’s extra expensive to be wrong. And the in-between state is rough: which example should I follow in deciding where to put my new file? You are obligated first to create a map of the future state, and then to keep that map in sync with reality as you go.

Instead, we opted for a lightweight solution that would allow us to map our domains and their owners in the code itself — helping to ensure it’s up to date and accurate — without having to restructure it at the same time. We’d do this by tagging code in code comments, and then building tools to leverage these tags in various ways to increase visibility and enforce ownership. This has the nice effect of producing a new domain map incrementally from the bottom up, rather than prescriptively from the top down.

We called this the '@owners' tag, and it looks like this:

    # @owners { team: platform, domain: orders }
    class Wunderbar::OrdersController < Wunderbar::WunderbarController
      def show
        ...
      end

      # @owners { team: inventory }
      def out_of_stock
        ...
      end
    end

We based our tooling around extending the YARD documentation tool to recognize this as a custom tag. You can teach YARD to recognize a custom tag by passing a command line option like '--tag owners:"Owners"', which by default will cause the tag value to appear in generated documentation under the header “Owners”. The Ruby API for declaring a tag programmatically is shown in the snippet below as well.

We wanted the tag to be interpreted hierarchically. This handles cases where ownership has diverged within the same class or module, so that we can tag one or two methods differently from the rest. It also eases the burden when all of the classes, modules and methods under a namespace belong to the same team and domain, since you can just tag the shared parent. We set an initial goal of getting 100% of our methods covered by at least a “team” value, at which point we would start requiring full coverage in our build process.
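As a rough illustration of that hierarchical lookup, the sketch below parses a tag value shaped like the example above and picks the nearest tag, looking from a method outward through its enclosing namespaces. It is a minimal sketch under assumptions, not the actual tooling: the helper names 'parse_owners_tag' and 'resolve_owners' are hypothetical, and it assumes the nearest tag wins wholesale rather than being merged key by key with its parents.

```ruby
# Hypothetical helper: turn the text of an @owners tag into a Hash.
# Assumes the "{ key: value, ... }" shape shown in the example above.
def parse_owners_tag(text)
  body = text.strip.delete_prefix('{').delete_suffix('}')
  body.split(',').each_with_object({}) do |pair, owners|
    key, value = pair.split(':', 2).map(&:strip)
    # Skip malformed pairs instead of guessing at their meaning.
    next if key.nil? || key.empty? || value.nil? || value.empty?

    owners[key.to_sym] = value
  end
end

# Hypothetical helper: given tag texts ordered from the innermost code block
# (e.g. a method) to the outermost namespace, return the nearest tag, parsed.
# Entries are nil where a level carries no tag. Assumption: nearest tag wins.
def resolve_owners(tag_texts_innermost_first)
  nearest = tag_texts_innermost_first.compact.first
  nearest && parse_owners_tag(nearest)
end
```

For instance, 'resolve_owners([nil, "{ team: platform, domain: orders }"])' describes an untagged method inside a tagged class, so it resolves to the class's team and domain.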

Working with Tags in YARD

To get a basic feel for how to work with these tags, make sure the 'yard' gem is available, tag a couple of your files, and then try running the code below in your project. It will print out the tag associated with each tagged code block.

    require 'yard'

    # Define the custom tag.
    YARD::Tags::Library.define_tag('Owners', :owners)

    # Parse your codebase. You might want to pass a more specific path here.
    YARD.parse('.')

    # Load all the method, class and module code blocks into memory.
    code_blocks = YARD::Registry.all(:method, :class, :module)

    # Print the tag associated with each tagged block.
    code_blocks.each do |code_block|
      # Walk up through the enclosing classes and modules so that nested
      # classes, modules and methods pick up the nearest ancestor's tag.
      # A separate variable keeps the original code_block for the report.
      owner = code_block
      tag = owner.tag('owners')
      while tag.nil? && [:class, :module].include?(owner.namespace&.type)
        owner = owner.namespace
        tag = owner.tag('owners')
      end

      # owner.line is the line where the tagged class, module or method starts.
      puts "Code Block: #{code_block}, tag: #{tag.text}, line: #{owner.line}" if tag
    end

We ultimately developed a handful of CLI tools based on this technique that allow us to produce reports measuring the percentage of code covered by team and domain at different levels, among other things:

    bin/yardowners --help
    Usage: bin/yardowners [options]
            --methods            view by methods
            --files              view by files
            --teams              teams report
            --domains            domains report
            --team=TEAM          filter by team
            --domain=DOMAIN      filter by domain
            --codeowners         update codeowners
            --lint               lint owners tags
            --require-domains    lint for presence of domain in owners tags
        -h, --help               display this help and exit

We use this CLI to add checks to our CI process around coverage / correctness, as well as to report progress toward our goals of 100% coverage.
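To make the coverage idea concrete, here is a minimal sketch of the kind of calculation such a check could run. It is not the 'bin/yardowners' implementation: the function names and the input shape (a Hash from method path to its resolved owners, or nil when nothing is tagged) are assumptions made for illustration.

```ruby
# Hypothetical sketch: percentage of methods that resolve to a "team" value.
# `owners_by_method` maps a method path (e.g. "Orders#show") to a Hash such
# as { team: 'platform', domain: 'orders' }, or to nil when nothing is tagged.
def team_coverage(owners_by_method)
  return 100.0 if owners_by_method.empty?

  covered = owners_by_method.count { |_path, owners| owners && owners[:team] }
  (covered * 100.0 / owners_by_method.size).round(1)
end

# Hypothetical sketch: the method paths a lint step would report as untagged.
def methods_missing_team(owners_by_method)
  owners_by_method.reject { |_path, owners| owners && owners[:team] }.keys.sort
end
```

A CI step built on this could print the list from 'methods_missing_team' and exit with a non-zero status whenever 'team_coverage' is below 100.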

Adoption

To get the tagging process started across the organization, we dumped all of our class and module names (thanks again, YARD!) into a big spreadsheet, and then asked each team to get together and spend a bit of time putting their names next to the ones they owned. We adopted the attitude that it was better to have the wrong value than no value, since it would be easy to adjust later.

We covered a good 80% of the code using this spreadsheet by the time each team had taken an initial pass. From there, we filled the blanks in with “TODO”, and then wrote a little script to automatically insert the tags into the code (again based on the YARD metadata, which conveniently gives you the line numbers for the top of each class / module declaration).
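A script like that can be quite small. The sketch below is an illustration under assumptions rather than the script we ran: 'insert_owner_tags' is a hypothetical name, and it expects a Hash from 1-based line number (the kind of value YARD reports for a declaration) to the tag text to insert above that line.

```ruby
# Hypothetical sketch: insert "# @owners ..." comments above declarations.
# `tags_by_line` maps a 1-based line number to the tag text for that line.
def insert_owner_tags(source, tags_by_line)
  lines = source.lines

  # Work from the bottom up so that earlier insertions do not shift the
  # line numbers of declarations further down the file.
  tags_by_line.sort.reverse_each do |line_number, tag_text|
    index = line_number - 1
    next if index.negative? || index >= lines.length

    # Match the indentation of the declaration being tagged.
    indent = lines[index][/\A[ \t]*/]
    lines.insert(index, "#{indent}# @owners #{tag_text}\n")
  end

  lines.join
end
```

Calling it with '{ 1 => "{ team: TODO }" }' puts a TODO tag above the first line of a file, which is how unclaimed classes could be marked for later follow-up.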

At that point, we enabled a linter requiring 100% team owner coverage on all new PRs. We set a new goal to eliminate TODOs, and chipped away at it over the course of a month or so. We then repeated this process for the “domain” values, driving that to 100% as well. We found that the spreadsheet-based divide-and-conquer approach allowed us to cover a lot of ground very quickly.

Once we were able to parse our '@owners' tags, the first and most important thing we did with them was to automatically generate our CODEOWNERS file. We enforce that each tagged team maps to a team name in GitHub and check that CODEOWNERS is up to date during the build process. This new process introduced some friction, as teams started getting tagged for review every time a file they were tagged on changed. This turned out to be great, because it sparked a number of very productive negotiations around ownership and scope that got hashed out in code review. We keep our CODEOWNERS file synced with Sentry, too, so that exceptions raised by files you tagged as yours get assigned to your team automatically. We also track and report on the performance of individual endpoints by team according to how they are tagged.
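Generating CODEOWNERS from parsed tags mostly comes down to mapping each file to the teams tagged inside it. The following is a minimal sketch of that step, not our generator: the function names, the input Hash, and the 'example-org' organization are all placeholders, and it assumes paths contain no spaces or other characters that need escaping.

```ruby
# Hypothetical sketch: build CODEOWNERS lines from a file-to-teams mapping.
# `teams_by_file` maps a repo-relative path to the team names tagged in it.
# `org` is a placeholder GitHub organization name.
def codeowners_lines(teams_by_file, org: 'example-org')
  tagged = teams_by_file.reject { |_path, teams| teams.empty? }
  tagged.sort.map do |path, teams|
    handles = teams.uniq.sort.map { |team| "@#{org}/#{team}" }
    # A leading slash anchors the pattern at the repository root.
    "/#{path} #{handles.join(' ')}"
  end
end

# Hypothetical sketch: true when the committed file differs from what the
# tags would generate, which a build step could treat as a failure.
def codeowners_stale?(current_text, teams_by_file, org: 'example-org')
  current_text.lines.map(&:chomp) != codeowners_lines(teams_by_file, org: org)
end
```

Sorting the paths and team handles keeps the generated file stable from run to run, so the up-to-date check only fails when ownership actually changes.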

The new owners tag hasn’t fixed all our problems by any means, but it has proved fundamental to our ability to operate as a growing set of increasingly independent teams with well-specified responsibilities. In subsequent posts, we’ll talk more about how we’re using our new domain map and a suite of other cool tools to build better boundaries within our monolith and ultimately escape from it.

Interested in learning more about engineering at Alto? Follow us on LinkedIn.