When you build Rails apps, you’ll use piles of gems. Some of them seem totally magical! But how does that magic happen?

In most gems, there’s nothing they’re doing that you couldn’t. It’s just Ruby code. Sometimes it’s complicated Ruby code. But if you explore that code, you’ll begin to understand where that magic comes from.

Finding the source code

To understand how a gem works, you have to find its code.

If there’s a method that you want to know more about, you have an easy way to find the source: method and source_location. I wrote a little bit about these earlier. Here’s an example:

irb(main):001:0> ActiveRecord::Base.method(:find).source_location
=> ["/usr/local/lib/ruby/gems/2.1.0/gems/
  activerecord-4.2.0.beta4/lib/active_record/core.rb", 127]
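
One caveat: source_location only helps for methods written in Ruby. Here’s a small sketch of what it returns for a Ruby method and for one implemented in C. The Greeter class is made up for illustration:

```ruby
# Greeter is a hypothetical class, defined here only so there is Ruby source to find.
class Greeter
  def greet
    "hello"
  end
end

# instance_method works on the class itself, so you don't need an instance.
file, line = Greeter.instance_method(:greet).source_location
puts "Greeter#greet is defined at #{file}:#{line}"

# Methods implemented in C (like String#upcase in MRI) have no Ruby source,
# so source_location returns nil instead of a [file, line] pair.
c_location = "".method(:upcase).source_location
puts "String#upcase: #{c_location.inspect}"
```

If you get nil back for a method you care about, the code lives in Ruby’s C source or a native extension, not in a .rb file you can open.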

But what if you’re interested in more than one method?

If you have a console open inside your Rails app’s directory, you can go right to a gem’s code:

~/Source/gem_example[master] jweiss$ bundle open rack

If it’s in your Gemfile, bundle open rack will open up the entire rack gem in your editor. You can comfortably browse all of the files inside it.

Where do you start?

Now that you know where the gem’s code is, how do you begin to understand it?

If you try to learn how activerecord works by reading lib/active_record.rb, you’re not going to get anywhere. You’re just going to find a lot of autoloads and requires.

It’s easiest to understand a gem after your app uses it a little bit, once you know more about the kind of work that the gem is doing for you. That way, you’ll already have an idea about which interesting classes and methods you should start with.

After you have the names of some interesting methods, you can use source_location, Find in Project in your editor, or ag on the command line to see where those methods are defined. And that’s when the fun starts.

The gem’s code is on your machine, right? That means you can change it however you want! You could even break it, and nobody else has to know.

When I’m trying to understand how a gem works, I don’t just read the code. I add puts statements into the gem, I throw exceptions to figure out how my app got to certain lines, and I mess with the code to understand why the author wrote it that way.
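
If you’d rather not raise an exception just to see a backtrace, Ruby’s built-in caller gives you the same information without stopping your app. This is a minimal sketch, with made-up method names standing in for the gem’s code and your own:

```ruby
# Hypothetical stand-ins for a method deep inside a gem and the app code calling it.
def deep_inside_the_gem
  # caller(0, 3) returns up to three stack frames, starting with this method's own frame.
  frames = caller(0, 3)
  puts "How did I get here?"
  puts frames
  frames
end

def controller_action
  deep_inside_the_gem
end

frames = controller_action
```

Each line it prints is a file, a line number, and a method name, just like the lines of an exception’s backtrace.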

Once you know how the trick’s done, it’s a lot less magical. And you won’t have to guess how that code will act in strange situations, because you’ll be able to see it for yourself.

Cleaning up after yourself

After you mess with gem code, your app could be in pretty bad shape. It might not even run anymore! And even if it does, it’s going to spam all those puts statements you added into your console.

But RubyGems has a quick way to bring things back to normal:

~ jweiss$ gem pristine activerecord -v 4.2.0
Restoring gems to pristine condition...
Restored activerecord-4.2.0

Or, if you don’t remember which gems you messed with, and you’re really patient:

~ jweiss$ gem pristine --all

After that, all your gems will be back to the way they were when you installed them.

What are you going to explore?

When you find, read, and explore the code inside your gems, you’ll understand the code you depend on at a deeper level. And you won’t have to rely on assumptions and guesses anymore.

This post was originally sent exclusively to my list. To get more posts like it in your inbox every Friday, sign up here!

Is your Rails app slow?

When it takes seconds to load what should be a simple view, you have a problem worth digging into.

You could have too many database calls or some slow methods. Or maybe it’s that speedup loop someone put in your code and forgot about.

You can find lots of tools to help you find out what’s slowing your app down. A few weeks ago, I talked about rbtrace. New Relic’s rpm gem has also helped me speed up my apps.

But my favorite tool for exploring performance problems does a lot more. Out of the box, it shows you what your code is doing. But when you add a plugin to it, it becomes even more powerful. It helps you see your app’s performance problems, visually. And that can help you find and fix slow apps, faster.

My favorite Rails profiler

My favorite Rails performance tool is called rack-mini-profiler. When you add that gem into your app, you get a little indicator on each of your pages. It looks like this:

[Image: A MiniProfiler indicator.]

If you click on that box, it expands and you can see all kinds of great stuff, like which SQL statements were run, how long it took to render partials, and more:

[Image: An expanded MiniProfiler indicator.]

MiniProfiler gives you a constant reminder of how long each page takes to load. That helps you learn more about how your app is performing. You’ll build an intuitive sense of which pages are slow, and which ones are fast. You’ll start to notice when a page takes surprisingly long to render. And you can start fixing it right away, while it’s still on your mind.

MiniProfiler can do more. But first, you’ll have to install the flamegraph gem.

When you do that, you’ll unlock a new way to see your app’s performance.

Flamegraphs: exactly as fun as they sound

A flamegraph looks like this:

[Image: Flamegraph!]

Pretty clear where the name comes from, right?

After you install the rack-mini-profiler gem and the flamegraph gem, you can see a flamegraph for any of your requests. Just add pp=flamegraph as an HTTP parameter, which would look like this:

http://www.example.com/restaurants?pp=flamegraph

The flamegraph will pop up, and you can zoom in and out, scroll around, and try to find interesting things to explore.

Each “layer” in the flamegraph is one line in a stack trace:

[Image: Flamegraphs. Identified.]

And the horizontal axis is time. So the far left side of the graph is when your request started, and the far right side is when the request finished.

So, it certainly looks cool. But what can you do with a flamegraph?

How to use a flamegraph

Because the X axis represents time, you can really get a clear picture of where your app’s getting bogged down. The widest layers take the longest to run. They’re the first areas you should look into, because speeding them up could have the biggest impact.

How much time is your app spending rendering your view? In the controller action? Hitting the database? Rendering partials?

All those are really easy to see, visually:

[Image: Parts of a request.]

There’s another useful thing that a flamegraph can show you:

Do you see a bunch of spikes that are all about the same height, like this?

[Image: Makes me uncomfortable just looking at it.]

That often means you have some kind of N+1 query. You’re missing an includes somewhere, or making a bunch of calls to an API. If you added an includes, you’d get a flamegraph that looks more like this:

[Image: Much better.]

N+1 SQL queries are pretty easy to see with most performance tools: you just look for SQL calls that look similar. But non-SQL N+1 issues, like hitting an API too many times, are a lot harder to notice. Especially if your logging isn’t that great.

With flamegraphs, though, those problems are a lot more visible.
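
To make the non-SQL case concrete, here’s a small sketch that counts requests for the two approaches. FakeReviewApi and its methods are hypothetical; they stand in for whatever remote service your app calls:

```ruby
# FakeReviewApi is a hypothetical stand-in for a slow remote service.
# It only counts requests, so you can compare the two approaches.
class FakeReviewApi
  attr_reader :request_count

  def initialize(reviews_by_restaurant_id)
    @reviews = reviews_by_restaurant_id
    @request_count = 0
  end

  # One request per restaurant.
  def reviews_for(restaurant_id)
    @request_count += 1
    @reviews.fetch(restaurant_id, [])
  end

  # One request for the whole batch.
  def reviews_for_all(restaurant_ids)
    @request_count += 1
    restaurant_ids.map { |id| [id, @reviews.fetch(id, [])] }.to_h
  end
end

data = { 1 => ["Great tacos"], 2 => ["Slow service"], 3 => [] }
restaurant_ids = data.keys

# N+1 style: every restaurant triggers its own call. In a flamegraph,
# each call shows up as one more spike of about the same height.
n_plus_one_api = FakeReviewApi.new(data)
one_by_one = restaurant_ids.map { |id| [id, n_plus_one_api.reviews_for(id)] }.to_h

# Batched style: a single call fetches everything.
batched_api = FakeReviewApi.new(data)
batched = batched_api.reviews_for_all(restaurant_ids)

puts "one at a time: #{n_plus_one_api.request_count} requests"
puts "batched:       #{batched_api.request_count} request"
```

Both approaches return the same data. The only difference is how many times you pay for the round trip, and that’s exactly what the row of spikes shows you.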

What not to pay attention to

Flamegraphs can be overwhelming. They show you a lot of information, and you’re pretty much forced to take it all in at once. So what can you ignore?

Usually, you can skip the bottom and top layers of the graph. Instead, I start exploring around the middle of the graph, or maybe ¾ of the way to the top. That’s where my code tends to hang out.

The top of the graph is usually ActiveRecord- or IO-related, and the bottom is Rails framework code, so it makes sense that your code would be somewhere toward the middle.


Have you ever used a flamegraph? They’re a great way to find the best places to optimize. So give it a try! Add the rack-mini-profiler and flamegraph gems to your Gemfile. You’ll be surprised how much more insight you’ll get into your code.

You know how painful it is to work with badly tested code. Every time you fix a bug, you create five more. And when things do work, you never really know if it was designed that way, or just worked coincidentally.

On the other hand, you just wrote what seems like 200 tests to ship one tiny feature. You constantly have to redesign already-working code to hit 100% test coverage. You can’t shake the feeling that your best-tested code is somehow getting less readable. And worst of all, you’re starting to get burned out on your app.

There must be a middle ground. So how much testing is the right amount?

It’d be great if there was a nice round number you could use as a rule: twice as many lines of test code as app code, maybe, or 95% test coverage. But even saying “95% test coverage” is ambiguous.

Coverage can be an indicator of well-tested code, but it’s not a guarantee of well-tested code. I’ve had 100%-covered apps that had more bugs than apps with 85% coverage.

So, the right amount of testing can’t be about a number. Instead, it’s about something fuzzier, and harder to define. It’s about testing efficiently.

Efficient testing

Testing efficiently is all about getting the most benefit for the least amount of work. Sounds great, doesn’t it?

But there’s a lot that goes into testing more efficiently. So it helps to think about three things in particular: size, isolation, and focus.

Size

Integration tests are awesome. They mirror a path an actual person takes through your app. They test all of your code, working together, the same way it’s used in the real world.

But integration tests are slow. They can be long and messy. And if you want to thoroughly test one small part of your system, they add a lot of overhead.

Unit tests are smaller. They run faster. They’re easy to think about, since you only need to keep one tiny part of your system in your head while you write them.

But they can also be fake. Just because something works inside a unit test doesn’t mean it’ll also work in the real world. (Especially if you’re doing a lot of mocking.)

So how do you balance those?

Since unit tests are fast and easy to write, it doesn’t cost much to have a lot of them. So they’re a great place to test things like edge cases and complicated logic.

Once you have a bunch of well-tested pieces of your system, you still have to fill in the gaps. You have to test how those parts interact, and the full journeys someone could take through your app. But because most of your edge cases and logic are tested by your unit tests, you only need a few of these more complicated, slower integration tests.

You’ll hear this idea called the “Test Pyramid.” It’s a few integration tests, sitting on top of a base of many unit tests. And if you want to learn more about it, take a look at the third chapter of my book, Practicing Rails.

Isolation

Still, if your system is complicated, it might take what feels like an infinite number of tests to cover every situation you might run into. This can be a sign that you need to rethink your app’s design. It means that parts of your system depend too closely on each other.

Say you have an object that could be in one of a few different states:

case user.type
when :admin
  message = admin_message
when :user
  message = user_message
when :author
  message = author_message
else
  message = anonymous_message
end

# Compare with ==. A single = would assign :email and always take the first branch.
if user.preferred_notification_method == :email
  send_email(message)
elsif user.preferred_notification_method == :text
  send_text_message(message)
else
  queue_notification(message)
end

If you wanted to test every possible path here, you’d have 12 different situations to test:

  1. User is an admin, preferred_notification_method is email
  2. User is an admin, preferred_notification_method is text
  3. User is an admin, preferred_notification_method is neither
  4. User is a user, preferred_notification_method is email
  5. User is a user, preferred_notification_method is text
  6. User is a user, preferred_notification_method is neither
  7. User is an author, preferred_notification_method is email
  8. User is an author, preferred_notification_method is text
  9. User is an author, preferred_notification_method is neither
  10. User is anonymous, preferred_notification_method is email
  11. User is anonymous, preferred_notification_method is text
  12. User is anonymous, preferred_notification_method is neither

There are so many cases to test because “sending a message based on a notification method” and “generating a message based on the type of a user” are tied together. You might be able to squeeze by with fewer, but it’s not obvious – and it’s just asking for bugs.

But what if you broke them apart?

message = get_message_based_on_user_type(user.type)

send_notification(message, user.preferred_notification_method)

Now you can test each part separately.

For the first part, you can test that the right message is returned for each type of user.

For the second part, you can test that a given message is sent correctly based on the value of preferred_notification_method.

And finally, you can test that the parent method will pass the message returned from get_message_based_on_user_type along to send_notification. So now, you have 8 states to test:

  1. User is an admin
  2. User is a user
  3. User is an author
  4. User is anonymous
  5. preferred_notification_method is email
  6. preferred_notification_method is text
  7. preferred_notification_method is neither
  8. and one test for the parent method

Here, you save four tests by breaking code apart so you can test it separately. In the second example, it’s a lot more obvious that you can get by with fewer tests. And you can imagine how as you add more states, splitting your code up becomes an even better idea.
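
Here’s one way the two pieces could look once they’re split apart. The method names come from the example above, but the bodies, the message text, and the notify parent method are assumptions made up for this sketch:

```ruby
# Picks the message for a user type. Anything unrecognized counts as anonymous.
def get_message_based_on_user_type(type)
  case type
  when :admin then "Hello, admin"
  when :user then "Hello, user"
  when :author then "Hello, author"
  else "Hello, stranger"
  end
end

# Records the delivery instead of really sending anything, and returns
# the channel it picked, so a test can check it directly.
def send_notification(message, preferred_notification_method, deliveries = [])
  channel =
    case preferred_notification_method
    when :email then :email
    when :text then :text
    else :queue
    end
  deliveries << [channel, message]
  channel
end

# The parent method only passes the message from one piece to the other.
def notify(user_type, preferred_notification_method, deliveries = [])
  message = get_message_based_on_user_type(user_type)
  send_notification(message, preferred_notification_method, deliveries)
end

deliveries = []
notify(:author, :text, deliveries)
puts deliveries.inspect
```

Neither method knows anything about the other, so each one gets its own short list of tests, and notify only needs one test to prove the hand-off works.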

It takes time and practice before you’ll find the best balance between isolation and readability. But if you break your dependencies in the right place, you can get by with a lot fewer tests.

Focus

Your app should be well-tested. But that doesn’t mean every part of your app deserves the same amount of attention on its tests.

Even if you do aim for 100% test coverage, you still won’t test everything. You probably won’t test every line of text in your views, for instance, or that you’re polling for updates every five seconds instead of ten.

That’s where focus comes in. Writing fewer, more useful tests. And making a conscious decision where you can best spend the time you have.

Focus is another thing that’s hard to get right. These are a few questions I ask myself that help me concentrate on the most important tests:

  • How interconnected is this with the rest of my app? If it breaks, how many other pieces will go down with it?

  • How likely is it that this will change naturally? If my tests fail, will it be because of a bug, or because someone updated some text in the UI?

  • What’s the impact of this breaking? Am I going to charge someone’s credit card twice, or is it just going to end up with some missing text?

  • How often is this part used? Is it critical to the app’s behavior, or is it an about page buried somewhere in the footer?

You shouldn’t only test the important parts. But you’ll have an app that feels higher quality if you spend your testing time well.


If you try to test every single possible path someone could take through your app, you’ll never ship. TDD helps, but it won’t solve all of your testing problems.

Of course, that doesn’t mean you shouldn’t test at all.

You can use the test pyramid to keep your tests small. You can isolate and break dependencies to turn m * n test cases into roughly m + n. And you can prioritize, so you can spend more time testing the most important parts of your app.

So, how much do you test? Do you consider any of these ideas as you build out your app? And how do you know which parts of your app to focus on? Leave a comment and tell me all about it!

I recently updated Ruby and upgraded a few projects. And while I did, I found some pretty cool RubyGems features that I didn’t know about before:

When your executables get out-of-date

I used to use rvm to manage Ruby versions. But the last time I set up my machine, I decided to try going without it. You don’t need Gemsets when you have Bundler, and you can use Homebrew to keep Ruby up to date.

This works great, until you update Ruby. Then rails new, bundle install, and all those other commands break. They’ll point to the old Ruby version, not the one you just installed.

You could fix this by uninstalling and reinstalling each gem one by one. But that’s just crazy. Instead, try gem pristine:

gem pristine --all --only-executables

gem pristine takes a gem, and resets it to the version you originally downloaded. It’s like uninstalling and reinstalling the gem. (This is also helpful if you decided to edit a gem while debugging, and forgot to change it back.)

--all means “all gems”, and --only-executables means “only reset files like /usr/local/bin/rails and /usr/local/bin/bundle”. That is, only fix the scripts you use to run a gem from the command line.

So this command resets files like /usr/local/bin/rails to what they would have been if you uninstalled and reinstalled the gem.

A minute later, you’ll be back to working on your app.

When you need an older version

When I wrote my post on respond_to, I created some tiny apps to learn more about it. I used a few different Rails versions to see how each version dealt with respond_to and respond_with.

How do you generate each Rails app with the right Rails version? You don’t have to do this:

gem install rails -v 4.0.0
rails new respond_to_4.0
gem uninstall rails

gem install rails -v 4.1.0
rails new respond_to_4.1
gem uninstall rails

There’s an easier way. You can tell RubyGems which version you want to run, right on the command line, with underscores:

rails _4.0.0_ new respond_to_4.0
rails _4.1.0_ new respond_to_4.1

Fuzzy gem versions

But in that last section, there’s still a problem. You probably don’t actually want to install 4.0.0. You want the newest version of 4.0, with all the minor updates.

But do you remember what the newest minor version of Rails 4.0 is?

There are lots of ways to look it up. But why look it up, when RubyGems can just do what you want?

gem install rails -v "~>4.0.0"

You can use all the version strings you know from Bundler:

gem install rails -v ">3.1, <4.1"

Useful! Especially if, like me, you hate doing stuff that a computer is better at.
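
If you’re ever unsure what a version string will match, you can ask RubyGems directly from irb. This sketch uses Gem::Requirement and Gem::Version, which ship with RubyGems. The version numbers are only examples:

```ruby
require "rubygems"

# "~> 4.0.0" means ">= 4.0.0 and < 4.1".
pessimistic = Gem::Requirement.new("~> 4.0.0")
in_4_0_series = pessimistic.satisfied_by?(Gem::Version.new("4.0.13")) # true
next_series   = pessimistic.satisfied_by?(Gem::Version.new("4.1.0"))  # false

# Several constraints are passed as separate arguments.
range = Gem::Requirement.new("> 3.1", "< 4.1")
inside_range = range.satisfied_by?(Gem::Version.new("4.0.13")) # true
lower_bound  = range.satisfied_by?(Gem::Version.new("3.1"))    # false, because > is strict
upper_bound  = range.satisfied_by?(Gem::Version.new("4.1.0"))  # false, because 4.1.0 equals 4.1

puts [in_4_0_series, next_series, inside_range, lower_bound, upper_bound].inspect
```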

But you don’t have to know all this

When I ran into these problems, I didn’t know there was an easy answer. But you could guess there would probably be one.

These were all situations where you can solve the problem by yourself, but it’d be repetitive and annoying.

And when you find a repetitive, annoying task like this, especially in a well-used project, it means one of two things:

  1. Someone has already automated it, or
  2. Lots of people are hoping you’ll automate it.

So before you do the busywork, dig a little deeper. Investigate a little bit. It’ll be worth your time.

In Rails, it’s easy to get a bunch of records from your database if you have their IDs:

Person.where(id: [1, 2, 3]).map(&:id) # => [1, 2, 3]

But what if you wanted to get the records back in a different order? Maybe your search engine returns the most relevant IDs first. How do you keep your records in that order?

You could try where again:

Person.where(id: [2, 1, 3]).map(&:id) # => [1, 2, 3]

But that doesn’t work at all. The list of IDs only filters the rows; it says nothing about the order they come back in. So how do you get your records back in the right order?

The compatible way: case statements

Just like Ruby, SQL supports case...when statements.

You don’t see it too often, but case...when statements can almost act like hashes. You can map one value to another:

case :b
when :a then 1
when :b then 2
when :c then 3
end # => 2

That case statement kind of looks like a hash:

{
  :a => 1,
  :b => 2,
  :c => 3
}[:b] # => 2

So, you have a way to map keys to the order they should appear in. And your database can sort and return your results by that arbitrary order.

Knowing that, you could put your IDs and their position into a case statement, and use it in a SQL order clause.

So if you wanted your objects returned in the order [2, 1, 3], your SQL could look like this:

SELECT * FROM people
  WHERE id IN (1, 2, 3)
  ORDER BY CASE id
    WHEN 2 THEN 0
    WHEN 1 THEN 1
    WHEN 3 THEN 2
    ELSE 3 END;

That way, your records are returned in the right order. The CASE transforms each ID into the order it should be returned in.

Of course, that looks ridiculous. And you could imagine how annoying a clause like that would be to build by hand.

But you don’t have to build it by hand. That’s what Ruby’s for:

lib/extensions/active_record/find_by_ordered_ids.rb
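module Extensions::ActiveRecord::FindByOrderedIds
  extend ActiveSupport::Concern
  module ClassMethods
    # Returns the records with the given IDs, in the same order as the ids array.
    def find_ordered(ids)
      # Builds "CASE id WHEN <id> THEN <position> ... ELSE <count> END".
      order_clause = "CASE id "
      ids.each_with_index do |id, index|
        order_clause << sanitize_sql_array(["WHEN ? THEN ? ", id, index])
      end
      order_clause << sanitize_sql_array(["ELSE ? END", ids.length])
      # Every value above was sanitized, so it's safe to mark the clause as SQL.
      # Newer Rails versions reject raw SQL strings in order without Arel.sql.
      where(id: ids).order(Arel.sql(order_clause))
    end
  end
end

ActiveRecord::Base.include(Extensions::ActiveRecord::FindByOrderedIds)

Person.find_ordered([2, 1, 3]).map(&:id) # => [2, 1, 3]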

Exactly how we wanted it!
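
If you want to see the clause on its own, here’s a plain-Ruby sketch that builds the same string without ActiveRecord. order_case_clause is a hypothetical helper, not part of Rails, and it swaps sanitize_sql_array for Integer(), so it only suits integer IDs:

```ruby
# order_case_clause is a hypothetical helper, not part of Rails. It coerces
# every ID with Integer(), so anything that isn't an integer raises an error.
def order_case_clause(ids)
  whens = ids.each_with_index.map do |id, index|
    "WHEN #{Integer(id)} THEN #{index}"
  end
  "CASE id #{whens.join(' ')} ELSE #{ids.length} END"
end

puts order_case_clause([2, 1, 3])
# CASE id WHEN 2 THEN 0 WHEN 1 THEN 1 WHEN 3 THEN 2 ELSE 3 END
```

With an empty list the clause has no WHEN branches, which most databases will reject, so you’d probably want to return early in that case.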

A cleaner, MySQL-specific way

If you use MySQL, there’s a cleaner way to do this. MySQL has special ORDER BY FIELD syntax:

SELECT * FROM people
WHERE id IN (1, 2, 3)
ORDER BY FIELD(id, 2, 1, 3);

You could also generate that from Ruby:

lib/extensions/active_record/find_by_ordered_ids.rb
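module Extensions::ActiveRecord::FindByOrderedIds
  extend ActiveSupport::Concern
  module ClassMethods
    # MySQL only: FIELD(id, 2, 1, 3) returns each id's position in the list.
    def find_ordered(ids)
      sanitized_id_string = ids.map {|id| connection.quote(id)}.join(",")
      # Every ID was quoted above, so it's safe to mark the clause as SQL.
      # Newer Rails versions reject raw SQL strings in order without Arel.sql.
      where(id: ids).order(Arel.sql("FIELD(id, #{sanitized_id_string})"))
    end
  end
end

ActiveRecord::Base.include(Extensions::ActiveRecord::FindByOrderedIds)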

So, if you’re using MySQL, and not too worried about compatibility, this is a good way to go. It’s a lot easier to read as those statements fly through your logs.


When you want to display records in a specific, arbitrary order, you don’t need to sort them in Ruby. With a little code snippet, you can let the database do what it’s good at: finding, sorting, and returning data to your app.

You’re about to check in your next small feature, so you kick off a full integration test run. You wait, and wait, as the dots fill your screen, until…

......FF....

:-(

You still have a few minutes before your tests finish running. But if you quit the test run early, you’ll have no idea which tests failed.

Do you really have to wait for the entire run to finish before you can see those failures?

Ctrl-T to the rescue!

If you’re using a Mac, there’s a way to see your test failures early:

Hit Ctrl-T while your tests are running.

When you do, you’ll see which test case is currently running, and how long it’s been running for. If any tests have failed so far, you’ll also see those failures, so you can get a head start on fixing them before your next run!

This is also really handy for debugging tests that just hang. Ctrl-T will tell you which test is trying to run, so you can isolate just that one test and fix it.

Finally, I’ve built a habit of hitting Ctrl-T anytime a test takes a noticeably long time (say, a second or longer) to finish. It’s pointed me to plenty of slow tests that I need to make faster.

How does Ctrl-T work?

On a Mac, Ctrl-T sends a message, or signal, called INFO, to whichever program is running:

signal_test.rb
puts "Starting..."
trap("INFO") { puts "INFO triggered!" }

loop { print "."; sleep 0.1 }

~/Source jweiss$ ruby signal_test.rb
Starting...
........^Tload: 7.14  cmd: ruby 6121 running 0.10u 0.08s
INFO triggered!
.......^Tload: 7.14  cmd: ruby 6121 running 0.10u 0.08s
INFO triggered!
................^Tload: 11.77  cmd: ruby 6121 running 0.10u 0.08s
.INFO triggered!
......^Csignal_test.rb:5:in `sleep': Interrupt
  from signal_test.rb:5:in `block in <main>'
  from signal_test.rb:5:in `loop'
  from signal_test.rb:5:in `<main>'

Minitest knows about INFO, and responds to it by printing information about the test run:

~/Source/rails/activesupport[master] jweiss$ be rake
/usr/local/Cellar/ruby/2.2.0/bin/ruby -w -I"lib:test"  "/usr/local/Cellar/ruby/2.2.0/lib/ruby/2.2.0/rake/rake_test_loader.rb" "test/**/*_test.rb"
Run options: --seed 33445

# Running:

.................F........^Tload: 1.62  cmd: ruby 29646 running 4.37u 1.40s
Current results:


  1) Failure:
CleanLoggerTest#test_format_message [/Users/jweiss/Source/rails/activesupport/test/clean_logger_test.rb:13]:
Expected "error\n" to not be equal to "error\n".



Current: DigestUUIDExt#test_invalid_hash_class 0.02s
............................

Pretty nice!

Knowing that this is possible, you might think of ways other apps could handle INFO:

  • Rails could display the currently running controller action or some performance stats.
  • Sidekiq could tell you what each worker is doing, so you could see where they get stuck.

And Sidekiq actually used to use INFO to print a backtrace of each thread it ran. But because INFO isn’t supported on Linux, Sidekiq switched to a different signal. Unfortunately, that signal can’t be triggered by a keyboard shortcut the way INFO can.

Because INFO isn’t available on Linux (and some might say that using INFO this way isn’t totally right, anyway), this behavior isn’t as widespread as it could be.

Still, it’s a little bit of extra help that could be useful in a wide range of situations. If you’re building an app, it’s worth thinking about what kind of status messages you could display on-demand to people who are interested.
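
As a rough sketch of what that could look like in your own script, this version reports progress on demand. It uses INFO where it’s available and falls back to USR1 elsewhere. The fallback and the progress hash are assumptions made up for this example, not something Minitest or Sidekiq does:

```ruby
# USR1 is an arbitrary fallback chosen for this sketch. Unlike INFO on a Mac,
# it has no keyboard shortcut; you'd send it with `kill -USR1 <pid>`.
status_signal = Signal.list.key?("INFO") ? "INFO" : "USR1"

progress = { done: 0, total: 5 }
reports = []

# Keep the handler tiny: Ruby restricts what you can do inside trap
# (for example, you can't lock a Mutex there).
trap(status_signal) do
  reports << "#{progress[:done]}/#{progress[:total]} items done"
end

progress[:total].times do |i|
  sleep 0.05 # stand-in for real work
  progress[:done] = i + 1
  # Simulate someone asking for a status update partway through.
  Process.kill(status_signal, Process.pid) if i == 1
end

puts reports
```

The process signals itself here only so the example runs on its own. In real use, you’d hit Ctrl-T or run kill from another terminal.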

You know that performance is a feature. And a lot of performance problems can be found and fixed during development.

But what about those slowdowns that only show up in production? Do you have to add log messages to every single line of code? That would just slow things down even more! Or do you ship tons of tiny “maybe this fixes it” commits to see what sticks?

You don’t have to ruin your code to analyze it. Instead, try rbtrace-ing it.

Trace your running Ruby app

With rbtrace, you can detect performance problems, run code inside another Ruby process, and log method calls without having to add any code. Just add gem "rbtrace" to your Gemfile.

I learned about rbtrace from Sam Saffron’s amazing post about debugging memory leaks in Ruby (which you should really check out, if you haven’t already).

In that post, Sam used rbtrace to see all of the objects a process used:

bundle exec rbtrace -p $SIDEKIQ_PID -e 'Thread.new{GC.start;require "objspace";io=File.open("/tmp/ruby-heap.dump", "w"); ObjectSpace.dump_all(output: io); io.close}'

This is awesome. But there’s a whole lot more you can do.

What can you do with rbtrace?

Ever wanted to see the SQL statements you’re running in production (and how long they took)?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID --methods "ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#execute_and_clear(sql)"
*** attached to process 7897
ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#execute_and_clear(sql="SELECT  \"articles\".* FROM \"articles\" WHERE \"articles\".\"id\" = $1 LIMIT 1") <0.002631>

All method calls that take longer than 2 seconds?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID --slow 2000
*** attached to process 8154
    Integer#times <2.463761>
        ArticlesController#create <2.558673>

Do you want to know every time a certain method gets called?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID --methods "ActiveRecord::Persistence#save"
*** attached to process 8154
ActiveRecord::Persistence#save <0.010964>

See which threads your app is running?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID -e "Thread.list"
*** attached to process 8154
>> Thread.list
=> [#<Thread:0x007ff4fcc9a8a8@/usr/local/lib/ruby/gems/2.2.0/gems/puma-2.6.0/lib/puma/server.rb:269 sleep>, #<Thread:0x007ff4fcc9aa10@/usr/local/lib/ruby/gems/2.2.0/gems/puma-2.6.0/lib/puma/thread_pool.rb:148 sleep>, #<Thread:0x007ff4fcc9ab50@/usr/local/lib/ruby/gems/2.2.0/gems/puma-2.6.0/lib/puma/reactor.rb:104 sleep>, #<Thread:0x007ff4f98c0410 sleep>]

Yep, with -e you can run Ruby code inside your server:

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID -e "ActiveRecord::Base.connection_config"
*** attached to process 8154
>> ActiveRecord::Base.connection_config
=> {:adapter=>"postgresql", :pool=>5, :timeout=>5000, :database=>"rbtrace_test"}

Yeah, OK, now I’m a little scared. But that’s still very cool. (And only users with permission to mess with the process can rbtrace it, so it’s probably OK.)


rbtrace gives you a ton of tools to inspect your Ruby processes in staging and production. You can see how your processes are using (or abusing) memory, trace slow function calls, and even execute Ruby code.

You don’t have to create tons of test commits and log messages to fix problems. You can just hop on to the server, get some data, and hop back out. And even if I’m not totally comfortable using it in production yet, I’m sure it’ll even help out in our test environments.

How about you? What could you use rbtrace for?

You’ve probably seen this pattern before. A method has an options hash as its last argument, which holds extra parameters:

def hello_message(name_parts = {})
  first_name = name_parts.fetch(:first_name)
  last_name = name_parts.fetch(:last_name)

  "Hello, #{first_name} #{last_name}"
end

Unfortunately, you need to extract those parameters out of the hash. And that means there’s a lot of setup to wade through before you get to the good part.

But if you changed this method to use keyword arguments in Ruby 2.0+, you wouldn’t have to pull :first_name and :last_name out of your hash. Ruby does it for you:

def hello_message(first_name:, last_name:)
  "Hello, #{first_name} #{last_name}"
end

Even better, if your app uses Ruby 1.9+ hash syntax, your methods can use keyword arguments without changing those methods’ callers:

def hello_message_with_an_options_hash(name_parts = {})
  first_name = name_parts.fetch(:first_name)
  last_name = name_parts.fetch(:last_name)

  "Hello, #{first_name} #{last_name}"
end

def hello_message_with_keyword_arguments(first_name:, last_name:)
  "Hello, #{first_name} #{last_name}"
end

hello_message_with_an_options_hash(first_name: "Justin", last_name: "Weiss")

hello_message_with_keyword_arguments(first_name: "Justin", last_name: "Weiss")

See? Those arguments are identical!

Pushing keyword argument syntax one step too far

What if you haven’t switched to the new Hash syntax, though? You could convert all your code. But, at least in Ruby 2.2.1, the old hash syntax works just fine with keyword arguments:

irb(main):007:0> hello_message_with_keyword_arguments(:first_name => "Justin", :last_name => "Weiss")
=> "Hello, Justin Weiss"

Nice! What about passing a hash object, instead of arguments?

irb(main):008:0> options = {:first_name => "Justin", :last_name => "Weiss"}
irb(main):009:0> hello_message_with_keyword_arguments(options)
=> "Hello, Justin Weiss"

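Whoa. (That implicit conversion is a Ruby 2.x behavior. Ruby 3.0 and later raise an ArgumentError instead.) What if we want to mix a hash and keyword arguments?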

irb(main):010:0> options = {last_name: "Weiss"}
irb(main):011:0> hello_message_with_keyword_arguments(first_name: "Justin", options)
SyntaxError: (irb):11: syntax error, unexpected ')', expecting =>
 from /usr/local/bin/irb:11:in `<main>'

OK. I guess we took that one step too far. To fix this, you could use Hash#merge to build a hash you could pass in on its own. But there’s a better way.

If you were using regular arguments instead of keyword arguments, you could splat arguments from an Array, using *:

def generate_thumbnail(name, width, height)
  # ...
end

dimensions = [240, 320]
generate_thumbnail("headshot.jpg", *dimensions)

But is there a way to splat keyword arguments into an argument list?

It turns out there is: **. Here’s how you’d fix that broken example with **:

irb(main):010:0> options = {last_name: "Weiss"}
irb(main):011:0> hello_message_with_keyword_arguments(first_name: "Justin", **options)
=> "Hello, Justin Weiss"

And if you’re really crazy, you can mix regular arguments, keyword arguments, and splats:

def hello_message(greeting, time_of_day, first_name:, last_name:)
  "#{greeting} #{time_of_day}, #{first_name} #{last_name}!"
end

args = ["Morning"]
keyword_args = {last_name: "Weiss"}

hello_message("Good", *args, first_name: "Justin", **keyword_args) # => "Good Morning, Justin Weiss!"

Of course, if you find yourself in the situation where that’s necessary, you probably made a mistake a lot earlier!

Capture keyword arguments the easy way

Do you know how you can turn all your method arguments into an array using *?

def argument_capturing_method(*args)
  args
end

argument_capturing_method(1, 2, 3) # => [1, 2, 3]

This also works with keyword arguments. They’re converted to a hash, and show up as the last argument of your args array:

argument_capturing_method(1, 2, 3, key: "value") # => [1, 2, 3, {:key=>"value"}]

But args.last[:key] isn’t the best way to read keyword arguments grabbed this way. Instead, you can use the new ** syntax to get the keyword arguments by themselves:

def dual_argument_capturing_method(*args, **keyword_args)
  {args: args, keyword_args: keyword_args}
end

dual_argument_capturing_method(1, 2, 3, key: "value") # => {:args=>[1, 2, 3], :keyword_args=>{:key=>"value"}}

With this syntax, you can access the first regular argument with args[0] and the :key keyword argument with keyword_args[:key].

… Of course, now we’re back to options hashes.


Keyword arguments are great for removing a ton of parameter extraction boilerplate from your code. And you might not even have to change any of your code to take advantage of them.

But when you write more generic methods, there are some new techniques you’ll have to learn to handle keyword arguments well. You might not have to use those techniques very often. But when you do, this power and flexibility will be there, waiting for you.
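
As one example of a more generic method, here's a minimal sketch of a wrapper that captures whatever it's given and passes it along unchanged. The names with_logging and CALL_LOG are invented for this illustration:

```ruby
CALL_LOG = []

def hello_message(greeting, first_name:, last_name:)
  "#{greeting}, #{first_name} #{last_name}!"
end

# Captures positional arguments with * and keyword arguments with **,
# records them, then splats both back into the wrapped method.
def with_logging(method_name, *args, **keyword_args)
  CALL_LOG << { method: method_name, args: args, keyword_args: keyword_args }
  send(method_name, *args, **keyword_args)
end

with_logging(:hello_message, "Hello", first_name: "Justin", last_name: "Weiss")
# => "Hello, Justin Weiss!"
```

The wrapper never needs to know which keywords hello_message expects. It just captures them on the way in and splats them on the way out.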

When you go bugfixing, the quick, obvious change isn’t always the best one. And the code in front of you is never the whole story. To go beyond the easy fix, you have to know why certain decisions were made. You have to understand the history behind the code. And there are three great ways to learn what you need to know to confidently change code.

git blame

With the help of git blame, you can trace through every version of every line of code in a project, all the way back to when it was written.

For example, say you were looking at ActiveJob’s queue_name.rb file, and you wanted to know what this queue_name_delimiter attribute was all about:

activejob/lib/active_job/queue_name.rb
included do
  class_attribute :queue_name, instance_accessor: false
  class_attribute :queue_name_delimiter, instance_accessor: false

  self.queue_name = default_queue_name
  self.queue_name_delimiter = '_' # set default delimiter to '_'
end

You could run git blame on it:

$ git blame queue_name.rb

...
da6a86f8 lib/active_job/queue_name.rb           (Douwe Maan               2014-06-09 18:49:14 +0200 34)     included do
1e237b4e activejob/lib/active_job/queue_name.rb (Cristian Bica            2014-08-25 17:34:50 +0300 35)       class_attribute :queue_name, instance_accessor: false
11ab04b1 activejob/lib/active_job/queue_name.rb (Terry Meacham            2014-09-23 15:51:44 -0500 36)       class_attribute :queue_name_delimiter, instance_accessor: false
11ab04b1 activejob/lib/active_job/queue_name.rb (Terry Meacham            2014-09-23 15:51:44 -0500 37)
...

And for each line, in order, you’ll see:

  • The revision that changed that line most recently (11ab04b1, for example),
  • The name of the author of that commit,
  • And the date the change was made.

To learn more about that line of code, you’ll need the revision id. Pass that id (the 11ab04b1 part) to git show or git log:

$ git show 11ab04b1

commit 11ab04b11170253e96515c3ada6f2566b092533a
Author: Terry Meacham <zv1n.fire@gmail.com>
Date:   Tue Sep 23 15:51:44 2014 -0500

    Added queue_name_delimiter attribute.

    - Added ActiveJob::Base#queue_name_delimiter to allow for
      developers using ActiveJob to change the delimiter from the default
      ('_') to whatever else they may be using (e.g., '.', '-', ...).

    - Updated source guide to include a blurb about the delimiter.

diff --git a/activejob/lib/active_job/queue_name.rb b/activejob/lib/active_job/queue_name.rb
index d167617..6ee7142 100644
...

Cool! You get to learn a little more about the change, why it’s useful, and see the part of the Rails Guide about it that you might have missed before.

Here, we got pretty lucky. We found the information we were looking for right away. But git blame only shows you the most recent time that line changed. And sometimes, you won’t find what you’re looking for until you go two or three commits back.

To see earlier commits, you can call git blame again. But this time, pass the revision before the commit git blame found. (In git, you can say “the commit before this other commit” by putting a ^ after the revision, like 11ab04b1^):

$ git blame 11ab04b1^ queue_name.rb

...
da6a86f8 lib/active_job/queue_name.rb           (Douwe Maan               2014-06-09 18:49:14 +0200 33)     included do
1e237b4e activejob/lib/active_job/queue_name.rb (Cristian Bica            2014-08-25 17:34:50 +0300 34)       class_attribute :queue_name, instance_accessor: false
94ae25ec activejob/lib/active_job/queue_name.rb (Cristian Bica            2014-08-15 23:32:08 +0300 35)       self.queue_name = default_queue_name
...

$ git blame 1e237b4e^ queue_name.rb
... and so on ...

That’s pretty mind-numbing, though.

Instead, explore your text editor. Most editors make tracing through history with git blame easy. For example, in Emacs, after git blame-ing your code, place your cursor on a line. Then, you can press a to step back through each change that affected that line, and use l and D to see more detail about a commit.

Does your team refer to issue numbers and pull requests in your commit messages? If so, git blame can easily take you from a line of code, to a commit, to the discussion about that commit. And that discussion is where you’ll find all the really good stuff.

GitHub Issues

A little while ago, I noticed that Rails 4.2 removed respond_with. In the docs, it’s clear that it was removed, but I didn’t understand why.

There’s a ton of great knowledge hidden behind GitHub’s issue search box. If you want to know why a feature was removed, there’s no better place to learn than the discussion the team had while they decided to remove it.

So, if you search Rails’ GitHub repo for respond_with, you’ll find some interesting threads about respond_with. If you’re trying to find out why it was removed, you’ll probably land on this thread. Unfortunately, it describes how it was removed, but not why.

Later on in that thread, though, you’ll find a comment that points to the actual discussion about removing respond_with. That’s where the good stuff is!

As with git blame, you might not find exactly what you’re looking for right away. You’ll have to follow references, read comments, and click on links. But GitHub’s issue search will get you started in the right place. And with a little curiosity and a sense of exploration, you’ll learn what you came for.

Ask questions

Unfortunately, not all knowledge about a project can be found in its history, issues, and pull requests. Not everything is written down.

So if you can find the person that originally wrote the code, ask about it. You’ll discover the dev culture and ideas that led to a decision. You’ll learn some history about the project that might never have been recorded. And you’ll hear about paths that were never taken, which might actually make sense to try now.

The easy way can be dangerous

Sometimes, a fix just seems so easy. All you have to do is rescue this one exception in this one place, and then you can go home!

But it’s dangerous to change code you don’t understand.

So when a line of code seems off, or a decision seems weird, or a call seems useless, put on your archeologist’s hat. Learn as much as you can, however you can, about that code. That’s how you’ll make your code change intentionally, and fix the problem at its root.

What if your Rails app couldn’t tell who was visiting it? If you had no idea that the same person requested two different pages? If all the data you stored vanished as soon as you returned a response?

That might be fine for a mostly static site. But most apps need to be able to store some data about a user. Maybe it’s a user id, or a preferred language, or whether they always want to see the desktop version of your site on their iPad.

session is the perfect place to put this kind of data. Little bits of data you want to keep around for more than one request.

Sessions are easy to use:

session[:current_user_id] = @user.id

But they can be a little magical. What is a session? How does Rails know to show the right data to the right person? And how do you decide where you keep your session data?

What is a session?

A session is just a place to store data during one request that you can read during later requests.

You can set some data in a controller action:

app/controllers/sessions_controller.rb
def create
  # ...
  session[:current_user_id] = @user.id
  # ...
end

And read it in another:

app/controllers/users_controller.rb
def index
  current_user = User.find_by_id(session[:current_user_id])
  # ...
end

It might not seem that interesting. But it takes coordination between your user’s browser and your Rails app to make everything connect up. And it all starts with cookies.

When you request a webpage, the server can set a cookie when it responds back:

~ jweiss$ curl -I http://www.google.com | grep Set-Cookie

Set-Cookie: NID=67=J2xeyegolV0SSneukSOANOCoeuDQs7G1FDAK2j-nVyaoejz-4K6aouUQtyp5B_rK3Z7G-EwTIzDm7XQ3_ZUVNnFmlGfIHMAnZQNd4kM89VLzCsM0fZnr_N8-idASAfBEdS; expires=Wed, 16-Sep-2015 05:44:42 GMT; path=/; domain=.google.com; HttpOnly

Your browser will store those cookies. And until the cookie expires, every time you make a request, your browser will send the cookies back to the server:

...
> GET / HTTP/1.1
> User-Agent: curl/7.37.1
> Host: www.google.com
> Accept: */*
> Cookie: NID=67=J2xeyegolV0SSneukSOANOCoeuDQs7G1FDAK2j-nVyaoejz-4K6aouUQtyp5B_rK3Z7G-EwTIzDm7XQ3_ZUVNnFmlGfIHMAnZQNd4kM89VLzCsM0fZnr_N8-idASAfBEdS
...

Many cookies just look like gibberish. And they’re supposed to. Because the information inside the cookie isn’t meant for the user. Your Rails app is in charge of figuring out what a cookie means. Your app set it, so your app can read it.

What does this have to do with a session?

So, you have a cookie. You put data in during one request, and you get that same data in the next. What’s the difference between that and a session?

By default, in Rails, there isn’t much of a difference. Rails does some work with the cookie to make it more secure. But besides that, it works the way you’d expect. Your Rails app puts some data into the cookie, the same data comes out of the cookie. If this was all there was, there’d be no reason to distinguish sessions from cookies.
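
To get a feel for the kind of work that makes a cookie harder to tamper with, here's a minimal sketch of signing a value with an HMAC. This is not Rails' actual implementation, just the general idea. SECRET, sign_cookie, read_cookie, and secure_compare are all names made up for this example:

```ruby
require "openssl"
require "base64"
require "json"

# Hypothetical secret for this sketch. A real app would use a long random
# value and keep it out of source control.
SECRET = "not-a-real-secret"

# Serializes the data, then appends an HMAC so tampering can be detected.
def sign_cookie(data)
  payload = Base64.strict_encode64(JSON.generate(data))
  signature = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SECRET, payload)
  "#{payload}--#{signature}"
end

# Compares two strings without stopping at the first differing byte.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).reduce(0) { |diff, (x, y)| diff | (x ^ y) }.zero?
end

# Returns the original data, or nil if the signature doesn't match.
def read_cookie(cookie)
  payload, signature = cookie.to_s.split("--", 2)
  return nil if payload.nil? || signature.nil?

  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SECRET, payload)
  return nil unless secure_compare(expected, signature)

  JSON.parse(Base64.strict_decode64(payload))
end

cookie = sign_cookie("current_user_id" => 1)
read_cookie(cookie)["current_user_id"] # => 1
```

Anyone can still decode the payload in a cookie like this. Without the secret, though, they can't change it and produce a signature that matches.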

But cookies aren’t always the right answer for session data:

  • You can only store about 4 KB of data in a cookie.

    This is usually enough, but sometimes it’s not.

  • Cookies are sent along with every request you make.

    Big cookies mean bigger requests and responses, which mean slower websites.

  • If you accidentally expose your secret_key_base, your users can change the data you’ve put inside your cookie.

    When this includes things like current_user_id, anyone can become whichever user they want!

  • Storing the wrong kind of data inside a cookie can be insecure.

If you’re careful, these aren’t big problems.

But when you can’t store your session data inside a cookie for one of these reasons, Rails has a few other places to keep your sessions:

Alternative session stores

All of the session stores that aren’t the cookie session store work in pretty much the same way. But it’s easiest to think about using a real example.

If you were keeping track of your sessions with ActiveRecord:

  1. When you call session[:current_user_id] = 1 in your app, and a session doesn’t already exist:

  2. Rails will create a new record in your sessions table with a random session ID (say, 09497d46978bf6f32265fefb5cc52264).

  3. It’ll store {current_user_id: 1} (Base64-encoded) in the data attribute of that record.

  4. And it’ll return the generated session ID, 09497d46978bf6f32265fefb5cc52264, to the browser using Set-Cookie.

The next time you request a page,

  1. The browser sends that same cookie to your app, using the Cookie: header.

    (like this: Cookie: _my_app_session=09497d46978bf6f32265fefb5cc52264)

  2. When you call session[:current_user_id]:

  3. Your app grabs the session ID out of your cookie, and finds its record in the sessions table.

  4. Then, it returns current_user_id out of the data attribute of that record.

Whether you’re storing sessions in the database, in Memcached, in Redis, or wherever else, they mostly follow this same process. Your cookie only contains a session ID, and your Rails app looks up the data in your session store using that ID.
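
The steps above can be sketched with a toy, in-memory store. ToySessionStore is a made-up class for illustration, not how any real Rails session store is written, but the shape is the same: a random ID goes to the browser, and the data stays on the server:

```ruby
require "securerandom"

# A toy stand-in for a sessions table, Memcached, or Redis.
class ToySessionStore
  def initialize
    @sessions = {}
  end

  # Creates a session and returns the ID that would go into Set-Cookie.
  def create(data)
    session_id = SecureRandom.hex(16) # 32 hex characters
    @sessions[session_id] = data
    session_id
  end

  # Looks up the data for the ID that came back in the Cookie header.
  # Returns nil for an unknown or expired ID.
  def find(session_id)
    @sessions[session_id]
  end
end

store = ToySessionStore.new

# First request: no session yet, so create one and hand back its ID.
session_id = store.create(current_user_id: 1)

# Later request: the browser sends the ID back, and the app looks it up.
store.find(session_id)[:current_user_id] # => 1
```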

Cookie store, cache store, or database store?

When it works, storing your sessions in cookies is by far the easiest way to go. It doesn’t need any extra infrastructure or setup.

But if you need to move beyond the cookie session store, you have two options:

Store sessions in a database, or store them in your cache.

Storing sessions in the cache

You might already be using something like Memcached to cache your partials or data. If so, the cache store is the second-easiest place to store session data, since it’s already set up.

You don’t have to worry about your session store growing out of control, because older sessions will automatically get kicked out of the cache if it gets too big. And it’s fast, because your cache will most likely be kept in memory.

But it’s not perfect:

  • If you actually care about keeping old sessions around, you probably don’t want them to get kicked out of the cache.

  • Your sessions and your cached data will be fighting for space. If you don’t have enough memory, you could be facing a ton of cache misses and early expired sessions.

  • If you ever need to reset your cache (say you upgraded Rails and your old cached data is no longer accurate), there’s no way to do that without expiring everyone’s sessions.

Still, this is how we store session data at Avvo, and it’s worked well for us so far.

Storing sessions in the database

If you want to keep your session data around until it legitimately expires, you probably want to keep it in some kind of database. Whether that’s Redis, ActiveRecord, or something else.

But database session storage also has drawbacks:

  • With some database stores, your sessions won’t get cleaned up automatically.

    So you’ll have to go through and clean expired sessions on your own.

  • You have to know how your database will behave when it’s full of session data.

    Are you using Redis as your session store? Will it try to keep all your session data in memory? Does your server have enough memory for that, or will it start swapping so badly you won’t be able to ssh in to fix it?

  • You have to be more careful about when you create session data, or you’ll fill your database with useless sessions.

    For example, if you accidentally touch the session on every request, Googlebot could create hundreds of thousands of useless sessions. And that would be a bad time.

Most of these problems are pretty rare. But you should still be aware of them.

So how should you store your sessions?

If you’re pretty sure you won’t run into any of the cookie store’s limitations, use it. It doesn’t need much setup, and isn’t a headache to maintain.

Storing sessions in the cache vs. the database is more a judgement call about how bad it would be to expire a session early. I treat session data as pretty temporary, so the cache store works well for me. So I usually try cookie first, then cache, then database.
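
Whichever you pick, switching is usually a one-line change in an initializer. Here's a sketch, assuming an app whose session cookie is named _my_app_session (use your own key), and assuming the activerecord-session_store gem is in your Gemfile if you go with the database option:

```ruby
# config/initializers/session_store.rb
# Pick one of these. The key is the name of the session cookie.

Rails.application.config.session_store :cookie_store, key: "_my_app_session"

# Stores session data in Rails.cache:
# Rails.application.config.session_store :cache_store, key: "_my_app_session"

# Needs the activerecord-session_store gem and its sessions table:
# Rails.application.config.session_store :active_record_store, key: "_my_app_session"
```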

But how about you? How do you store your sessions? Leave a comment and let me know!