I am a performance junkie and earlier this year after checking out the TechEmpower benchmarks, I became very curious about what the fastest and best framework choice is right now in Ruby. What this curiosity turned into is something else entirely.

Part of the reason is Market has an ad server written in Ruby and having the best possible performance helps us deliver the best product for our customers. Another reason is, in general, faster programs make for happier users, more sales, etc. So, from a business perspective, a faster program means more money.

Ruby historically is not viewed as a very fast programming language. Compared to C++ or Java, it’s quite slow and in recent years other platforms like Node.js, Scala, Go, and Clojure have peeled away some Ruby programmers. I don’t doubt that Ruby’s performance woes, whether real or not, caused some people to look elsewhere.

So, is Ruby slow?

To try to answer that question, I devised a benchmark inspired by the TechEmpower benchmarks, but with some changes in approach.

  1. I benchmarked every single version of Ruby available as of the beginning of this test — January 2014.
  2. On each version I benchmarked every Rack server and web framework combination I could find.
  3. I tested with both Apache Bench and wrk to see whether one benchmarking tool would pick up on flaws that the other missed.
  4. The number I report for each combination is its fastest benchmark run, not the average.

Testing Methodology

The tooling used to make this possible is a combination of Rake, Apache Bench, wrk, and RVM. There are separate rake tasks used to warm up the server, then to test it in Apache Bench or wrk.

The benchmark commands are run using Ruby 2.1.0, but the servers are run using whatever runtime is being tested at the time. That keeps the testing commands fast and consistent. The benchmark commands should probably be rewritten as a bash script at some point.
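As a rough sketch of that split, a helper can wrap a server command so RVM runs it under the runtime being tested. The method name, the RVM runtime strings, and the `rackup` command here are assumptions for illustration, not what the project's Rake tasks actually use.

```ruby
# Hypothetical helper: run a server command under a specific runtime via RVM.
# `rvm <ruby> do <command>` executes the command with that Ruby selected.
# The runtime strings below are assumed RVM names, not taken from the project.
RUNTIMES = %w[ruby-1.9.3 ruby-2.0.0 ruby-2.1.0 rbx-2.2.2 jruby-1.7.9].freeze

def server_command(runtime, server_cmd)
  ["rvm", runtime, "do", *server_cmd]
end

RUNTIMES.each do |runtime|
  # "rackup -p 9292" is only an illustrative server command and port.
  puts server_command(runtime, %w[bundle exec rackup -p 9292]).join(" ")
end
```

Returning the command as an array keeps it easy to pass to `system` or `Process.spawn` without shell quoting problems.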

Two benchmarks are run, one with Apache Bench and one with wrk. Both try to hit the endpoint as many times as they can: Apache Bench processes 5,000 requests as fast as possible, while wrk sends as many requests as it can over a 10 second period.
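The two load generators could be invoked along these lines. The 5,000 requests and the 10 second duration come from the description above; the URL, concurrency, connection, and thread values are assumptions, since those settings are not stated here.

```ruby
# Build the two load-generator command lines.
# ab: -n is the total number of requests, -c the concurrency level.
# wrk: -d is the duration, -c the open connections, -t the thread count.
# TARGET_URL and the concurrency/connection/thread values are assumptions.
TARGET_URL = "http://127.0.0.1:9292/"

def ab_command(url = TARGET_URL, requests: 5_000, concurrency: 10)
  ["ab", "-n", requests.to_s, "-c", concurrency.to_s, url]
end

def wrk_command(url = TARGET_URL, seconds: 10, connections: 10, threads: 2)
  ["wrk", "-d", "#{seconds}s", "-c", connections.to_s, "-t", threads.to_s, url]
end

puts ab_command.join(" ")
puts wrk_command.join(" ")
```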

Also, a warm-up task was run to give each runtime a chance to optimize before a benchmark was run, and I ran the benchmarks until the numbers topped out. Meaning, I ran the Apache Bench benchmark until the numbers hit a kind of average maximum and then I took the best result. Oftentimes this meant running tests 5-10 times until they stopped improving. This is especially true on JRuby, because the JVM's JIT compiler can really speed things up over multiple runs.
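One way to express "run until the numbers stop improving, then keep the best" is sketched below. The `patience` and `max_runs` parameters are assumptions for illustration; the process described above was manual, and the sample scores are made up.

```ruby
# Repeat a benchmark until the best score stops improving, then return the best.
# `patience` is how many non-improving runs in a row to tolerate (an assumption).
def best_run(max_runs: 10, patience: 2)
  best = nil
  stale = 0
  max_runs.times do
    score = yield
    if best.nil? || score > best
      best = score
      stale = 0
    else
      stale += 1
      break if stale >= patience
    end
  end
  best
end

# Made-up scores standing in for successive Apache Bench runs.
samples = [100.0, 150.0, 180.0, 185.0, 184.0, 183.0, 999.0].each
puts best_run { samples.next }
```

In this example the loop stops after two runs in a row fail to beat 185.0, so the final sample is never requested.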

The number we care about is requests per second, but if you look at the raw data, per request time is interesting as well.
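Both numbers appear in the text report Apache Bench prints. A minimal sketch of pulling them out follows; the method name is hypothetical and the sample report is made up, with only the two lines of interest shown.

```ruby
# Extract requests per second and mean time per request from ab's text report.
def parse_ab(report)
  rps = report[/^Requests per second:\s+([\d.]+)/, 1]
  ms  = report[/^Time per request:\s+([\d.]+) \[ms\] \(mean\)/, 1]
  { requests_per_second: rps && rps.to_f, ms_per_request: ms && ms.to_f }
end

# Made-up excerpt in the format ab prints.
SAMPLE_REPORT = <<~REPORT
  Requests per second:    1234.56 [#/sec] (mean)
  Time per request:       8.100 [ms] (mean)
  Time per request:       0.810 [ms] (mean, across all concurrent requests)
REPORT

p parse_ab(SAMPLE_REPORT)
```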

While I recorded all the numbers for both Apache Bench and wrk, all the request per second numbers you see in this report are for Apache Bench. I ran into fewer errors and more consistent reporting using Apache Bench than I did with wrk, but in most cases the performance numbers told a similar story across the different frameworks, servers, and runtimes.

The Test Machine

All tests are run on a 2013 MacBook Air with a 1.3 GHz Intel Core i5, 8 GB of DDR3 RAM, and a 250 GB SSD.

This is not meant to be a good representation of server hardware or anything like that. It's the dev machine I have on hand. It's a great machine, but understand that all of these performance numbers are relative to that level of hardware.

On Amazon EC2 or Rackspace Cloud, you're going to get a lot better performance in most cases and a modern dedicated server on bare metal will get you much, much better performance. I have no idea to what degree that would change the data.

What I Tested

First, let’s start with the different Ruby versions. I tested Ruby 1.9.3, Ruby 2.0.0, Ruby 2.1.0, Rubinius 2.2.2, and JRuby 1.7.9.

I tested with whatever point releases or patch levels were current in January 2014. Those versions have changed, so I probably will run this benchmark again later this year to see how much of a difference time makes.

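Second, let’s talk Rack servers. I wanted to try every possible Rack server on every possible Ruby, so I tested WEBrick, Thin, Puma, Unicorn, Passenger, Reel, Fishwife, Jubilee, Mizuno, Torquebox 3, Torqbox (Torquebox 4), and Trinidad.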

I think I got them all, but maybe not. I tried to run these on every platform where it made sense, so WEBrick, Thin, Puma, Unicorn, and Passenger were tested across almost all the Ruby versions, but the JRuby specific Fishwife, Jubilee, Mizuno, Torquebox, and Trinidad were just JRuby.

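Third, let’s talk web frameworks. Again, I wanted to try every possible web framework to see how much that mattered to our incredibly simple “Hello World” app. So, I tested plain Rack, Brooklyn, Camping, Cramp, Cuba, Grape, Nancy, NYNY, Pakyow, Rails, Rambutan, Ramaze, and Sinatra.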

There were even some others I tried to get in there, but didn’t have success with. You can find those in the git repo for this project.

Why Hello World?

The benchmark I devised is a simple app that would output “Hello World” in the smallest amount of code for that framework. The reasoning behind this is that outputting a “Hello World” string as a response is probably the simplest and fastest thing you can do. You will never write a web app that is less complicated or faster than Hello World.
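For plain Rack, such an app can be as small as the sketch below. This is a minimal illustration of the idea, not necessarily the exact code in the repo; the constant name is made up.

```ruby
# A Rack application is any object that responds to `call(env)` and returns
# [status, headers, body]. In a config.ru file it would be handed to `run`.
HELLO_APP = lambda do |_env|
  [200, { "Content-Type" => "text/plain" }, ["Hello World"]]
end

# Call it directly, the way a Rack server would for each request.
status, _headers, body = HELLO_APP.call({})
puts status
puts body.join
```

Every framework in the test ultimately sits on top of a call like this, which is why plain Rack serves as the baseline.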

This is useful because it gives us a theoretical top end number for a given runtime/server/framework combination. It’s like the top speed of a car: it doesn’t tell you how fast it goes most of the time, it tells you what the upper limit is.

Where this number becomes important is when you consider the limits of your technology and how you should approach scaling. A lot of startups start on Rails, then switch to Go or Scala or Clojure, in part because they perform better than Rails in terms of throughput.

What I was curious about is whether there is a way to make Ruby as fast as Go or Java/Scala. How fast can a Ruby web app be?

The Results…

There is a lot of data, so we have a full GitHub repo you can peruse with all the result data. Before diving into every single combination, let me tell you about the highlights.

The Slowest Runtime

I was a little surprised, but Rubinius didn’t do super well across the board. I haven’t used Rubinius enough to know exactly why, but regardless of framework or server, it didn’t perform well in my benchmarks.

The Slowest Server

WEBrick is the slowest rack server and this shouldn't surprise anyone. As a built in rack server, it does its job admirably and gives a batteries included Ruby experience, but you shouldn’t be using it in production. At all. Ever.

The Slowest Framework

I’m giving this to Ramaze. However, in many cases it was only a little slower than Rails. On Ruby 2.1 they perform nearly the same and both of them are at the bottom of the pack in terms of performance. Pakyow was also slow, but it was so buggy that it shouldn’t even be considered comparable.

The Fastest Runtime

JRuby is the fastest runtime pretty much across the board. Depending on your framework and server combination, you can see a 50% - 100% performance improvement using JRuby. For example, Rails on Ruby 2.1 can hit around 1,500 req/sec for Thin or Unicorn. Rails on JRuby can hit 3,300 req/sec on Torqbox (Torquebox 4) and 2,500+ req/sec on Puma, Fishwife, Mizuno, and Torquebox 3.

The one caveat of JRuby is that it takes multiple runs to reach its peak speed. I’m not an expert on this, but JRuby will JIT compile your code over time and it keeps getting faster until it reaches some theoretical maximum. In practice, this means the first couple benchmarks are a bit slower than standard Ruby, but after that it flies right by and doesn’t look back.

The Fastest Server

The fastest server I’ve seen is the Torqbox gem, which is the beta for Torquebox 4 I guess. It doesn’t matter what they decide to call it, that server is fast. It’s ridiculously fast. On a plain old Rack app it did 10,159 req/sec. To put that into perspective, a standard Go Hello World app hit about 10,500 req/sec on my machine.

Yes, with the right combination of runtime, server, and framework Ruby can be nearly as fast as Go.

The Fastest Framework

The fastest framework is Rack. Plain old Rack with no framework at all. It makes intuitive sense that Rack on its own would be the fastest possible, but it’s also worth understanding how much you are giving up by picking a framework like Sinatra or Rails.

For example, on Ruby 2.1 and Thin, Rack can hit 6,301 req/sec, but Rails only does 1,455 req/sec and Sinatra only manages 2,505 req/sec. So, by simply using those frameworks on top of Rack, you give up over 60% of your maximum possible throughput. That holds basically true regardless of runtime or server.
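The arithmetic behind that figure can be checked with a few lines of Ruby. The req/sec values are the Ruby 2.1.0 and Thin numbers from this post; the method and constant names are just for illustration.

```ruby
# Share of plain-Rack throughput lost by adding a framework (Ruby 2.1.0 + Thin).
RACK_RPS = 6301.64
FRAMEWORK_RPS = { "Sinatra" => 2505.79, "Rails" => 1455.28 }.freeze

def percent_given_up(framework_rps, baseline_rps)
  ((1.0 - framework_rps / baseline_rps) * 100).round(1)
end

FRAMEWORK_RPS.each do |name, rps|
  puts "#{name}: #{percent_given_up(rps, RACK_RPS)}% below plain Rack"
end
```

That works out to roughly 60% for Sinatra and 77% for Rails.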

The Rankings

Ranking the three different performance variables is sort of tricky, so we'll take the best score for each and rank accordingly. In some cases one option might have a higher maximum number, but be a bit slower overall. This is a failure in how I am ranking, not in the actual data.
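The "best score wins" rule can be sketched like this for the runtime ranking. The rows are a handful of the plain Rack results from this post, not the full data set, and the names are illustrative.

```ruby
# Rank runtimes by their single best result (req/sec).
Result = Struct.new(:runtime, :server, :rps)

RESULTS = [
  Result.new("JRuby 1.7.9",    "Torqbox", 10_159.46),
  Result.new("JRuby 1.7.9",    "Puma",     5_909.64),
  Result.new("Ruby 2.1.0",     "Unicorn",  7_634.51),
  Result.new("Ruby 2.1.0",     "Thin",     6_301.64),
  Result.new("Ruby 1.9.3",     "Unicorn",  7_125.39),
  Result.new("Ruby 2.0.0",     "Unicorn",  7_069.77),
  Result.new("Rubinius 2.2.2", "Unicorn",  5_156.36)
].freeze

def rank_by_best(results)
  results.group_by(&:runtime)
         .map { |_runtime, rows| rows.max_by(&:rps) }
         .sort_by { |row| -row.rps }
end

rank_by_best(RESULTS).each_with_index do |row, i|
  puts "#{i + 1}. #{row.runtime} - #{row.rps} req/sec on #{row.server}"
end
```

Because only the maximum per group is kept, a runtime with one outstanding result outranks one that is consistently good, which is the limitation noted above.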

Runtime Performance


  1. JRuby - 10,159 req/sec using Rack and Torqbox
  2. Ruby 2.1.0 - 7,634 req/sec using Rack and Unicorn
  3. Ruby 1.9.3 - 7,125 req/sec using Rack and Unicorn
  4. Ruby 2.0.0 - 7,069 req/sec using Rack and Unicorn
  5. Rubinius 2.2.2 - 5,156 req/sec using Rack and Unicorn

Server Performance


  1. Torqbox (Torquebox 4) - 10,159 req/sec using Rack and JRuby
  2. Jubilee - 9,505 req/sec using Rack and JRuby
  3. Torquebox 3 - 7,808 req/sec using Rack and JRuby
  4. Unicorn - 7,634 req/sec using Rack and Ruby 2.1.0
  5. Fishwife - 7,611 req/sec using Rack and JRuby
  6. Mizuno - 7,137 req/sec using Rack and JRuby
  7. Thin - 6,301 req/sec using Rack and Ruby 2.1.0
  8. Puma - 5,909 req/sec using Rack and JRuby
  9. Trinidad - 5,567 req/sec using Rack and JRuby
  10. Reel - 2,764 req/sec using Rack and Ruby 2.1.0
  11. Passenger - 2,083 req/sec using Rack and Ruby 2.0.0
  12. WEBrick - 1,997 req/sec using Rack and JRuby

Framework Performance


  1. Rack - 10,159 req/sec using Torqbox and JRuby
  2. Cuba - 9,169 req/sec using Torqbox and JRuby
  3. Brooklyn - 8,247 req/sec using Torqbox and JRuby
  4. Rambutan - 8,059 req/sec using Torqbox and JRuby
  5. NYNY - 7,716 req/sec using Torqbox and JRuby
  6. Nancy - 6,646 req/sec using Unicorn and Ruby 2.1.0
  7. Camping - 5,627 req/sec using Torqbox and JRuby
  8. Sinatra - 5,554 req/sec using Torqbox and JRuby
  9. Grape - 4,425 req/sec using Torqbox and JRuby
  10. Cramp - 3,885 req/sec using Thin and Ruby 2.1.0
  11. Rails - 3,343 req/sec using Torqbox and JRuby
  12. Ramaze - 3,244 req/sec using Torqbox and JRuby
  13. Pakyow - 1,566 req/sec using WEBrick and Ruby 2.1.0

The Runtimes…

When you look at all the data you will see that the runtime makes a significant difference, not just in relative performance, but also in which servers and other projects are available to you. The biggest impact the runtime has on performance comes from the options it makes available and the optimizations it unlocks.



For example, each standard Ruby gets a bit faster each version, but JRuby can be a big potential performance win just because it makes faster servers available and does a better job of optimizing than other Ruby runtimes. The decades of research and millions of dollars spent making the JVM fast has something to do with that I’m sure.

Ruby 1.9.3 (p484)

One thing to note is that each major version of Ruby seems to perform a bit better, so Ruby 1.9.3 is a bit slower than 2.0 or 2.1. Based on the results of these benchmarks, each version got about 10% faster, or maybe a bit more, depending on server and framework.
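The size of the gain is not uniform, and individual combinations land above or below that rough figure. As a sketch, the version-over-version change for a single row can be computed like this, using the plain Rack req/sec figures for Thin and Unicorn from this post (the names are illustrative):

```ruby
# Percent change between consecutive Ruby versions for plain Rack.
RACK_RPS_BY_VERSION = {
  "Thin"    => { "1.9.3" => 5789.4,  "2.0.0" => 6027.65, "2.1.0" => 6301.64 },
  "Unicorn" => { "1.9.3" => 7125.39, "2.0.0" => 7069.77, "2.1.0" => 7634.51 }
}.freeze

def percent_change(from, to)
  ((to / from - 1.0) * 100).round(1)
end

RACK_RPS_BY_VERSION.each do |server, by_version|
  by_version.each_cons(2) do |(old_v, old_rps), (new_v, new_rps)|
    puts "#{server}: #{old_v} -> #{new_v}: #{percent_change(old_rps, new_rps)}%"
  end
end
```

The same helper can be pointed at any other row of the tables to see how a particular framework and server pair moved between versions.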

Requests per second by framework (rows) and server (columns):
Framework  WEBrick    Thin       Puma       Unicorn    Passenger
Brooklyn   512.97     3935.52    2310.58    6443.22    1890.7
Camping    455.94     2218.37    1400.01    1831.74    1577.88
Cuba       524.45     4744.93    2491.91    6419.45    1961.62
Grape      374.2      1160.34    913.7      1725.43    1382.99
Nancy      526.66     2969.61    2061.5     5899.32    1907.97
NYNY       472.54     2481.18    1850.22    5370.43    1791.48
Pakyow     939.5      768.6      1024.12    error      error
Rack       435.42     5789.4     3044.66    7125.39    2027.15
Rails      332.57     900.99     1041.44    1490.94    1131.25
Rambutan   483.93     2865.03    2430.24    6316.72    1810.75
Ramaze     302.39     645.43     658.99     1073.43    1070.6
Sinatra    431.99     1474.96    1453.68    2578.47    1597.59
Cramp      353.39     2781.12    error      error      error

Framework  Self
Reel       2225.35

If you are running Rails, performance was in the 900 to 1,500 req/sec range depending on server.

Peak Ruby 1.9.3 performance was 7,125 req/sec using Rack and Unicorn

Ruby 2.0.0 (p287)

What you should expect from Ruby 2.0.0 is a bit faster Ruby. There isn’t much else interesting to say about it.

Requests per second by framework (rows) and server (columns):
Framework  WEBrick    Thin       Puma       Unicorn    Passenger
Brooklyn   508.6      4577.17    2505.45    6270       1825.27
Camping    460.03     2558.6     1591.35    3052.68    1559.85
Cuba       517.5      5156.4     2714.85    6375.45    2034.37
Grape      370.47     1452.54    1144.06    1496.58    1302.86
Nancy      517.81     3953.38    2453.47    6155.76    1948.93
NYNY       494.1      3472.98    2230.05    5342.17    1757.66
Pakyow     1364.35    1326.88    1311.16    error      error
Rack       553.62     6027.65    2911.11    7069.77    2083.81
Rails      388.33     1479.54    945.55     1574.87    1167.28
Rambutan   509.34     4202.41    2270.3     6203.95    1969.36
Ramaze     341.73     952.23     790.15     913.3      1074.33
Sinatra    470.71     2121.51    1521.26    2425.53    1574.74
Cramp      error      error      error      error      error

Framework  Self
Reel       1822.78

If you are running Rails, performance was in the 945 to 1,575 req/sec range depending on server.

Peak Ruby 2.0.0 performance was 7,069 req/sec using Rack and Unicorn.

Ruby 2.1.0 (p0)

What you should expect from Ruby 2.1.0 is a bit faster Ruby. There isn’t much else interesting to say about it.

Requests per second by framework (rows) and server (columns):
Framework  WEBrick    Thin       Puma       Unicorn    Passenger
Brooklyn   533.46     5010       3490.43    6837.9     1955.12
Camping    431.65     3187.41    2197.9     3828.82    1700.89
Cuba       515.05     5597.96    3807.7     7335.4     2005.41
Grape      455.03     2197.89    1551.67    2258.29    1488.18
Nancy      540.1      4569.47    3238.87    6646.03    1722.47
NYNY       520.72     4040.04    2878.3     5452.63    1425.75
Pakyow     1566.66    1548.82    1560.18    error      error
Rack       563.65     6301.64    4125.52    7634.51    2051.84
Rails      395.52     1455.28    1077.6     1533.81    1155.39
Rambutan   514.51     4639.73    3182.32    6247.88    1897.11
Ramaze     394.94     1256.9     1065.02    1380.54    1201.97
Sinatra    481.83     2505.79    1807.19    2813.4     1615.65
Cramp      error      3885.68    error      error      error

Framework  Self
Reel       2764.53

If you are running Rails, performance was in the 1,075 to 1,550 req/sec range depending on server.

Peak Ruby 2.1.0 performance was 7,634 req/sec using Rack and Unicorn

Rubinius 2.2.2

Rubinius didn’t do very well and I’m not sure why. I expected it to perform a lot better. I don’t know if it was misconfiguration on my part or what, but none of the numbers for Rubinius were great. If you are deciding to use or not use Rubinius, please do your own benchmarks if these numbers are important to you.

Requests per second by framework (rows) and server (columns):
Framework  WEBrick    Thin       Puma       Unicorn    Passenger
Brooklyn   error      2309.19    2237.01    3458.84    1385.01
Camping    error      1273.57    1501.04    1616.29    899.2
Cuba       error      2083.19    2110.44    4380.07    1331.92
Grape      error      846.68     1127.68    1322.61    800.67
Nancy      error      1876.33    2111.23    2940.56    1051.36
NYNY       error      2368.99    1746.95    2956.12    1113.32
Pakyow     794.5      759.77     753.45     error      error
Rack       error      2724.97    2833.24    5156.36    1561.96
Rails      error      604.91     1366.99    626.2      696.18
Rambutan   696.18     2025.54    2440.6     2907.87    1484.58
Ramaze     error      470.29     844.81     416.28     649.3
Sinatra    error      1151.73    1950.03    1559.92    1216.57
Cramp      error      2087.24    error      error      error

Framework  Self
Reel       1813.46

If you are running Rails, performance was in the 600 to 1,370 req/sec range depending on server.

Peak Rubinius 2.2.2 performance was 5,156 req/sec using Rack and Unicorn

JRuby 1.7.9

I was really surprised by how fast JRuby can be. Once the code is warmed up and JIT compiled, it can be way faster than any other Ruby platform by a good margin.

Requests per second by framework (rows) and server (columns):
Framework  WEBrick    Puma       Passenger  Fishwife   Jubilee
Brooklyn   1990.62    5249.74    899.54     6979.41    5790.01
Camping    1752.92    4397.38    673.88     5034.17    error
Cuba       1964.52    5007.51    912.03     6907.66    7649.18
Grape      1514.31    3380       14.1       3955.25    error
Nancy      1826.54    4547.18    554.81     4850.38    6460.57
NYNY       1786.84    4726.11    1.96       6104.66    error
Pakyow     1098.73    1214.67    error      1107.32    1117.5
Rack       1997.82    5909.64    1105.44    7611.03    9505.61
Rails      1134.41    2588.18    269.78     2687.64    error
Rambutan   1743.21    4438.94    662.37     4494.62    7685.04
Ramaze     1250.64    2343.07    1.97       2326.57    3020.81
Sinatra    1620.67    3432.3     739.83     4184.09    error
Cramp      error      error      error      error      error

Framework  Mizuno       Torquebox 3  Torqbox      Trinidad
Brooklyn   6341.96      7077.28      8247.4       4635.28
Camping    5159.03      5407.66      5627.71      3330.98
Cuba       6464.43      7099.98      9169.82      4879.29
Grape      3724.34      3486.19      4425.06      2520.6
Nancy      4283.41      5763.73      5284.58      3800.1
NYNY       5608.24      6658.24      7716.72      3257.51
Pakyow     893.05       error        1078.49      error
Rack       7137.58      7808.69      10159.46     5567.88
Rails      2666.54      2504.63      3343.91      1145.57
Rambutan   3836.56      6359.79      8059.5       4324.22
Ramaze     2290.06      2209.46      3244.25      1038.71
Sinatra    3351.5       4051.02      5554.89      2608.8
Cramp      error        error        error        error

Framework  Self
Reel       837.9

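If you are running Rails, performance was in the 2,500 to 3,350 req/sec range on the faster servers (Puma, Fishwife, Mizuno, Torquebox 3, and Torqbox). WEBrick, Trinidad, and Passenger were well below that.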

Peak JRuby 1.7.9 performance was 10,159 req/sec using Rack and Torqbox

The Servers…

What server you choose can make an enormous impact on benchmark throughput and as I mentioned previously, the fastest servers are only available on JRuby because they are backed by various Java projects like Netty and Undertow. If you look at TechEmpower’s benchmarks, you will see that the fastest raw performance is in the JVM and that the server can make a real difference.



In Ruby, the server makes a significant difference in performance, but the runtime also has a significant impact on how well a given server performs. For example, Puma and Passenger both ran on all of the Ruby runtimes, but Puma was fastest on JRuby and Passenger was slowest on JRuby.

Also, note that the fastest combination of server and runtime won’t make a slow framework fast. Rails peaked at 3,343 req/sec. on JRuby and Torqbox, but a fast framework like Cuba hit 4,744 req/sec. on Ruby 1.9.3 and Thin, which is not the fastest combination by any stretch of the imagination.

WEBrick

Before saying anything else, I just want to say DO NOT USE WEBrick IN PRODUCTION ENVIRONMENTS!!!

WEBrick is slow and when it comes to performance testing it was buggy and very annoying to test. It is included to show just how slow and terrible it is for production environments. Even though WEBrick’s intended role is for development, you will probably have a better development experience using Thin instead, especially if you are developing inside a virtual machine.

Requests per second by framework (rows) and runtime (columns):
Framework  Ruby 1.9.3      Ruby 2.0.0      Ruby 2.1.0      Rubinius 2.2.2  JRuby 1.7.9
Brooklyn   512.97          508.6           533.46          error           1990.62
Camping    455.94          460.03          431.65          error           1752.92
Cuba       524.45          517.5           515.05          error           1964.52
Grape      374.2           370.47          455.03          error           1514.31
Nancy      526.66          517.81          540.1           error           1826.54
NYNY       472.54          494.1           520.72          error           1786.84
Pakyow     939.5           1364.35         1566.66         794.5           1098.73
Rack       435.42          553.62          563.65          error           1997.82
Rails      332.57          388.33          395.52          error           1134.41
Rambutan   483.93          509.34          514.51          696.18          1743.21
Ramaze     302.39          341.73          394.94          error           1250.64
Sinatra    431.99          470.71          481.83          error           1620.67
Cramp      353.39          error           error           error           error

If you are running Rails, performance was in the 330 to 1,135 req/sec range depending on runtime. JRuby was fastest and the standard Ruby versions were 330 - 395 req/sec.

Peak WEBrick performance was 1,997 req/sec using Rack and JRuby

Thin

Thin is a solid performer and for most Ruby web apps it is going to be a pretty good choice. It’s not very fiddly, it’s easy to set up, and performance is pretty good out of the box. Unicorn is en vogue right now and there are good reasons for some people to use Unicorn, but if I had to choose between Thin and Unicorn on a new project, I’d just start with Thin and see how far it takes me.

Thin’s solid performance comes from using EventMachine’s evented model to get the most performance possible out of a single thread, similar to Node.js, but without you having to write evented code. As I understand it, each request is handled as an event, so it’s pretty fast, but its performance is limited by being single threaded.

The biggest downside of Thin is that it doesn’t run on JRuby, so you won’t get the maximum runtime performance possible with Thin, but it’s a solid performer.

Requests per second by framework (rows) and runtime (columns):
Framework  Ruby 1.9.3      Ruby 2.0.0      Ruby 2.1.0      Rubinius 2.2.2
Brooklyn   3935.52         4577.17         5010            1484.87
Camping    2218.37         2558.6          3187.41         1273.57
Cuba       4744.93         5156.4          5597.96         2083.19
Grape      1160.34         1452.54         2197.89         846.68
Nancy      2969.61         3953.38         4569.47         1876.33
NYNY       2481.18         3472.98         4040.04         2368.99
Pakyow     768.6           1326.88         1548.82         759.77
Rack       5789.4          6027.65         6301.64         2724.97
Rails      900.99          1479.54         1455.28         604.91
Rambutan   2865.03         4202.41         4639.73         2025.54
Ramaze     645.43          952.23          1256.9          470.29
Sinatra    1474.96         2121.51         2505.79         1151.73
Cramp      2781.12         3521.38         3885.68         0

If you are running Rails, performance was in the 600 to 1,480 req/sec range depending on runtime. Standard Ruby was the fastest with a range of 900 to 1,480 req/sec.

Peak Thin performance was 6,301 req/sec using Rack and Ruby 2.1.0

Puma

Puma is a server built for concurrency and speed, but the benchmarks I ran don’t quite line up with that claim exactly… sort of… Well, let me explain.

Standard Ruby’s concurrency story by default isn’t great because it has a Global Interpreter Lock (the GIL). The GIL lets only one thread execute Ruby code at a time, so threads in standard Ruby don’t give you true parallelism. For that you need Rubinius or JRuby, which both allow threads to run in parallel.

Puma is built to be able to take advantage of real threaded concurrency when available, which is great if you are on Rubinius or JRuby, but it means it won’t be as fast on standard Ruby. Generally speaking this held true in the benchmarks.
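A small experiment makes the difference visible. The sketch below times the same CPU-bound work run sequentially and in threads; on standard Ruby the two timings would be expected to be similar, while a runtime with parallel threads can finish the threaded version faster on a multi-core machine. The workload size and job count are arbitrary assumptions, and the script only prints timings without asserting anything about them.

```ruby
# CPU-bound busy work with no I/O, so threads only help if they run in parallel.
def busy_work(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

def run_sequential(jobs, size)
  Array.new(jobs) { busy_work(size) }
end

def run_threaded(jobs, size)
  Array.new(jobs) { Thread.new { busy_work(size) } }.map(&:value)
end

# Wall-clock seconds taken by the block.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

JOBS = 4
SIZE = 200_000 # arbitrary workload size chosen for illustration

seq = elapsed { run_sequential(JOBS, SIZE) }
thr = elapsed { run_threaded(JOBS, SIZE) }
puts format("sequential: %.3fs  threaded: %.3fs", seq, thr)
```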

Requests per second by framework (rows) and runtime (columns):
Framework  Ruby 1.9.3      Ruby 2.0.0      Ruby 2.1.0      Rubinius 2.2.2  JRuby 1.7.9
Brooklyn   2310.58         2505.45         3490.43         2237.01         5249.74
Camping    1400.01         1591.35         2197.9          1802.87         4397.38
Cuba       2491.91         2714.85         3807.7          2110.44         5007.51
Grape      913.7           1144.06         1551.67         1127.68         3380
Nancy      2061.5          2453.47         3238.87         2111.23         4547.18
NYNY       1850.22         2230.05         2878.3          1746.95         4726.11
Pakyow     1024.12         1311.16         1560.18         753.45          1214.67
Rack       3044.66         2911.11         4125.52         2833.24         5909.64
Rails      1041.44         945.55          1077.6          1366.99         2588.18
Rambutan   2430.24         2270.3          3182.32         2440.6          4438.94
Ramaze     658.99          790.15          1065.02         844.81          2343.07
Sinatra    1453.68         1521.26         1807.19         1950.03         3432.3
Cramp      error           error           error           error           error

If you are running Rails, performance was in the 945 to 2,590 req/sec range depending on runtime. JRuby was fastest and the standard Ruby versions were 945 - 1,078 req/sec.

Peak Puma performance was 5,909 req/sec using Rack and JRuby

Unicorn

Unicorn is gaining in popularity because it is probably the fastest option for a standard Ruby webapp deployment. However, it’s not the fastest option available overall. In fact, while in our tests getting to about 7,000 req/sec looks impressive on standard Ruby, it would only be the 4th or 5th fastest server on JRuby.

The reason Unicorn looks fast on standard Ruby in most benchmarks is that it’s “cheating”. Or, more precisely, comparing Unicorn and Thin performance is a bit apples and oranges. Unicorn uses multiple processes to achieve higher performance, while Thin uses EventMachine to get more out of a single thread. In the benchmarks I limited Unicorn to a single worker, but it still gets a bit of benefit from having a dedicated dispatch process as well, which I believe accounts for some of the performance advantage over Thin.

I specifically didn’t test Unicorn with more workers because it would potentially skew the benchmark results in Unicorn’s favor simply because it would have access to more resources and processing power.
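In Unicorn's Ruby-based config file, limiting it to one worker looks roughly like the fragment below. The file path, address, and port are assumptions; the benchmark's actual config lives in the project repo and may differ.

```ruby
# config/unicorn.rb (hypothetical path), loaded with `unicorn -c config/unicorn.rb`
worker_processes 1        # a single worker, as in the benchmark
listen "127.0.0.1:9292"   # address and port are assumptions
```

Raising `worker_processes` is the usual way to use more cores, which is exactly the extra resource advantage this benchmark was avoiding.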

One interesting anomaly for Unicorn is that it didn’t perform well being benchmarked by wrk. I’m not sure why exactly, but I think this might be part of the reason standard Ruby doesn’t perform well on the TechEmpower Benchmarks. They are using wrk for benchmark numbers. I benchmarked with both Apache Bench and wrk, but am using Apache Bench’s numbers because they were more consistent across all the different runtime, server, and framework combinations.

Requests per second by framework (rows) and runtime (columns):
Framework  Ruby 1.9.3      Ruby 2.0.0      Ruby 2.1.0      Rubinius 2.2.2
Brooklyn   6443.22         6270            6837.9          3458.84
Camping    1831.74         3052.68         3828.82         1616.29
Cuba       6419.45         6375.45         7335.4          4380.07
Grape      1725.43         1496.58         2258.29         1322.61
Nancy      5899.32         6155.76         6646.03         2940.56
NYNY       5370.43         5342.17         5452.63         2956.12
Pakyow     error           error           error           error
Rack       7125.39         7069.77         7634.51         5156.36
Rails      1490.94         1574.87         1533.81         626.2
Rambutan   6316.72         6203.95         6247.88         2907.87
Ramaze     1073.43         913.3           1380.54         416.28
Sinatra    2578.47         2425.53         2813.4          1559.92
Cramp      error           error           error           error

If you are running Rails, performance was in the 625 to 1,575 req/sec range depending on runtime. Ruby 2.0.0 was fastest (barely) and the standard Ruby versions were 1,490 - 1,575 req/sec.

Peak Unicorn performance was 7,634 req/sec using Rack and Ruby 2.1.0

Passenger

In the benchmarks Passenger did not perform well. It was about on par with Puma in many cases, but performed incredibly poorly on JRuby, which was very surprising. From what I’ve read since doing these benchmarks, Passenger out of the box is maybe configured to conserve resources, not maximize performance. This is a reasonable tradeoff for most users, especially if you are running on a tiny VPS on Digital Ocean or Linode.

However, that doesn’t mean you shouldn’t use Passenger. In fact, the value in Passenger is probably not the highest possible performance as much as it is the easiest possible deploy process. Passenger’s deploy process on a production environment is akin to PHP. Update the files and you’re pretty much good to go. That is far easier for most developers than writing your own Unicorn management scripts or killing Thin processes or whatever.

Also, on Digital Ocean and Amazon you can get a server with Passenger already set up out of the box, which again, is a huge time saver.

Requests per second by framework (rows) and runtime (columns):
Framework  Ruby 1.9.3      Ruby 2.0.0      Ruby 2.1.0      Rubinius 2.2.2  JRuby 1.7.9
Brooklyn   1890.7          1825.27         1955.12         1385.01         899.54
Camping    1577.88         1559.85         1700.89         899.2           673.88
Cuba       1961.62         2034.37         2005.41         1331.92         912.03
Grape      1382.99         1302.86         1488.18         800.67          14.1
Nancy      1907.97         1948.93         1722.47         1051.36         554.81
NYNY       1791.48         1757.66         1425.75         1113.32         1.96
Pakyow     error           error           error           error           error
Rack       2027.15         2083.81         2051.84         1561.96         1105.44
Rails      1131.25         1167.28         1155.39         696.18          269.78
Rambutan   1810.75         1969.36         1897.11         1484.58         662.37
Ramaze     1070.6          1074.33         1201.97         649.3           1.97
Sinatra    1597.59         1574.74         1615.65         1216.57         739.83
Cramp      error           error           error           error           error

If you are running Rails, performance was in the 270 to 1,170 req/sec range depending on runtime. Ruby 2.0.0 was fastest (barely) and the standard Ruby versions were 1,131 - 1,170 req/sec.

Peak Passenger performance was 2,083 req/sec using Rack and Ruby 2.0.0.

Reel

Reel is an interesting server because it is part of the Celluloid project, which enables real threaded concurrency in Ruby where the runtime supports it, namely on Rubinius and JRuby. I was very curious to see how much of a performance increase Reel might have over other servers. Reel also doubles as a framework, so the benchmarks treat it as both.

The results were underwhelming. It's about as fast as Sinatra on Thin. It's faster than Rails, but it's nowhere near as fast as a really fast framework on a really fast server.

Requests per second by runtime:
Framework  Ruby 1.9.3      Ruby 2.0.0      Ruby 2.1.0      Rubinius 2.2.2  JRuby 1.7.9
Reel       2225.35         1822.78         2764.53         1813.46         837.9

Reel was fastest on Ruby 2.1.0 with performance of 2,764 req/sec. On JRuby it only hit 837 req/sec.

Fishwife

Fishwife is the first of our JRuby only servers and it is based off of Jetty. Performance is good and this is where you start to see the power of the JVM and JVM-based projects really start to shine.

Fishwife is forked from Mizuno and as you’ll see in the benchmarks, it’s about 10% faster in most cases. Generally speaking, while Fishwife is not the fastest JRuby server, it is faster than any of the standard Ruby servers. It’s about on par with Torquebox 3, but isn’t as fast as Jubilee or Torqbox (Torquebox 4).

JRuby 1.7.9
Brooklyn 6979.41
Camping 5034.17
Cuba 6907.66
Grape 3955.25
Nancy 4850.38
NYNY 6104.66
Pakyow 1107.32
Rack 7611.03
Rails 2687.64
Rambutan 4494.62
Ramaze 2326.57
Sinatra 4184.09
Cramp error

If you are running Rails, performance was 2,687 req/sec.

Peak Fishwife performance was 7,611 req/sec using Rack and JRuby

Jubilee

Jubilee is a very fast server based off of the Vert.x platform. Performance is very, very good and it’s the 2nd fastest server available for Ruby. However, in my benchmarking I would run into errors. It might just be an older version of Jubilee or something, I’m not sure, but it was really difficult to get a complete benchmark run using Jubilee.

If Jubilee can get a bit more stable, it wouldn't take much more performance to own the performance crown and be on par with Go and other high performance platforms.

JRuby 1.7.9
Brooklyn 5790.01
Camping error
Cuba 7649.18
Grape error
Nancy 6460.57
NYNY error
Pakyow 1117.5
Rack 9505.61
Rails error
Rambutan 7685.04
Ramaze 3020.81
Sinatra error
Cramp error

If you are running Rails, performance was unclear because it errored in the benchmarks.

Peak Jubilee performance was 9,505 req/sec using Rack and JRuby.

Mizuno

Mizuno is the predecessor to Fishwife and it’s also a Jetty powered server. Performance is good, but pales somewhat in comparison to the other servers on the JVM. There isn’t a lot else to say about Mizuno. It works well and it’s pretty fast.

JRuby 1.7.9
Brooklyn 6341.96
Camping 5159.03
Cuba 6464.43
Grape 3724.34
Nancy 4283.41
NYNY 5608.24
Pakyow 893.05
Rack 7137.58
Rails 2666.54
Rambutan 3836.56
Ramaze 2290.06
Sinatra 3351.5
Cramp error

If you are running Rails, performance was 2,666 req/sec.

Peak Mizuno performance was 7,137 req/sec using Rack and JRuby

Torquebox 3

Torquebox 3 is not your average Ruby server. It’s built on JBoss AS and it does web serving, message queueing, job scheduling, and even daemons/services. That’s all pretty awesome, but for the purposes of this test, I only care about web server performance and Torquebox 3 doesn’t disappoint.

I would call Torquebox 3 the 3rd or 4th best performer alongside Fishwife. It is fast and when you consider all the other things that come with Torquebox, there is a lot of value in that server. My one complaint is that it was sort of annoying to set up easy deployment for this benchmark that would allow me to quickly and easily switch frameworks. Most people won’t run into this problem.

JRuby 1.7.9
Brooklyn 7077.28
Camping 5407.66
Cuba 7099.98
Grape 3486.19
Nancy 5763.73
NYNY 6658.24
Pakyow error
Rack 7808.69
Rails 2504.63
Rambutan 6359.79
Ramaze 2209.46
Sinatra 4051.02
Cramp error

If you are running Rails, performance was 2,504 req/sec.

Peak Torquebox 3 performance was 7,808 req/sec using Rack and JRuby

Torqbox (Torquebox 4)

Ok, so Torqbox, which is the next version of Torquebox, is really fast. Also, it seemed to be quite stable and didn’t error out on any of the tests. Once you get Torqbox warmed up, it is in the same performance ballpark as Go or Scala or Clojure.

This is a big deal for a lot of reasons. When you consider how many development shops moved from Ruby to Scala or Go or Java or Clojure or whatever other compiled language is the flavor of the month for performance reasons, what if they didn’t have to? What if you could use the same Ruby language, but get near the performance of a compiled language like Go?

Torqbox and plain old Rack get you pretty much there without switching languages.

However, you won’t get there with Rails. The closest you can get to raw Torqbox and Rack performance right now is to use Cuba, which got 9,169 req/sec. That’s right, Cuba gets you OVER 9000!!!!

Seeing this level of performance being possible on Ruby, as well as the stark performance disparity between Rails and Rack, I wonder if in many cases the performance wins from switching languages come as much from using something that is faster than Rails as they do from using something that is faster than Ruby can potentially be.

JRuby 1.7.9
Brooklyn 8247.4
Camping 5627.71
Cuba 9169.82
Grape 4425.06
Nancy 5284.58
NYNY 7716.72
Pakyow 1078.49
Rack 10159.46
Rails 3343.91
Rambutan 8059.5
Ramaze 3244.25
Sinatra 5554.89
Cramp error

If you are running Rails, performance was 3,343 req/sec.

Peak Torqbox performance was 10,159 req/sec using Rack and JRuby

Trinidad

Trinidad is a server built on top of Apache Tomcat. Tomcat is not the fastest Java server, but it is reliable and it works well.

In the benchmarks, Trinidad performed about as well as Puma. That doesn’t make it fast for a JRuby based server, but it is decent compared to what you get in Standard Ruby.

JRuby 1.7.9
Brooklyn 4635.28
Camping 3330.98
Cuba 4879.29
Grape 2520.6
Nancy 3800.1
NYNY 3257.51
Pakyow error
Rack 5567.88
Rails 1145.57
Rambutan 4324.22
Ramaze 1038.71
Sinatra 2608.8
Cramp error

If you are running Rails, performance was 1,145 req/sec.

Peak Trinidad performance was 5,567 req/sec using Rack and JRuby

The Frameworks…

Your choice of framework might be the single most important choice you make in terms of Ruby performance because what you pick will largely determine your scaling options moving forward. In fact, whereas runtime and server can be changed relatively easily, switching frameworks potentially costs hundreds of man hours, so the switching cost on a framework is dramatically higher than just about any other choice you make in your system design, outside of perhaps your choice of database. The performance implications of framework choice are equally dramatic.



So, let’s get the 800 pound gorilla in the room out of the way - Rails. Rails severely limits your potential application performance. It’s not particularly fast at routing or outputting “hello world”. That means no matter what else you do, you’re already kind of slow. It is no wonder that a lot of Rails developers found node.js and said, “Holy crap this is fast!” It’s not because node.js is the fastest thing available, it’s because Rails is just not that fast.

The human side effect of Rails not being fast is that teams eventually end up leaving their entire Ruby codebases behind for other languages once they need to scale. You don’t even have to look far to find examples of that. This is bad for Ruby because it means some of the best talent leaves the ecosystem never to return.

My hope is that by showing other framework options with better performance, it might lead more projects to stick with Ruby that would otherwise leave. Simply being aware of what is possible with frameworks that aren’t Rails could make a big impact on how these choices are made. Well, here’s to hoping anyway.

Brooklyn

Brooklyn is a small web framework I found via https://github.com/luislavena/bench-micro. I don’t know much about it, but the code itself looks pretty decent and it’s very fast. Outside of pure rack, Brooklyn was the 2nd fastest framework. It’s only about 10% behind Cuba. It’s 2.5x - 3.0x the speed of Rails.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 512.97 508.6 533.46 error 1990.62
Thin 3935.52 4577.17 5010 2309.19 n/a
Puma 2310.58 2505.45 3490.43 2237.01 5249.74
Unicorn 6443.22 6270 6837.9 3458.84 n/a
Passenger 1890.7 1825.27 1955.12 1385.01 899.54
Fishwife n/a n/a n/a n/a 6979.41
Jubilee n/a n/a n/a n/a 5790.01
Mizuno n/a n/a n/a n/a 6341.96
Torquebox 3 n/a n/a n/a n/a 7077.28
Torqbox n/a n/a n/a n/a 8247.4
Trinidad n/a n/a n/a n/a 4635.28

On standard Ruby runtimes, performance was in the 3,935 - 5,010 req/sec range using Thin.

Peak Brooklyn performance was 8,247 req/sec using Torqbox and JRuby

Camping

Camping is a fairly famous microframework from the well known Ruby developer _why. It’s got a pretty reasonable structure and performance is decent, but not spectacular. It’s about as fast or sometimes a bit faster than Sinatra. In most benchmarks, it wasn’t nearly as fast as Cuba.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 455.94 460.03 431.65 error 1752.92
Thin 2218.37 2558.6 3187.41 1273.57 n/a
Puma 1400.01 1591.35 2197.9 1501.04 4397.38
Unicorn 1831.74 3052.68 3828.82 1616.29 n/a
Passenger 1577.88 1559.85 1700.89 899.2 673.88
Fishwife n/a n/a n/a n/a 5034.17
Jubilee n/a n/a n/a n/a error
Mizuno n/a n/a n/a n/a 5159.03
Torquebox 3 n/a n/a n/a n/a 5407.66
Torqbox n/a n/a n/a n/a 5627.71
Trinidad n/a n/a n/a n/a 3330.98

On standard Ruby runtimes, performance was in the 2,220 - 3,190 req/sec range using Thin.

Peak Camping performance was 5,627 req/sec using Torqbox and JRuby

Cuba

Cuba is a microframework that I hadn’t heard of before this benchmark, but I wish I had. It reminds me a bit of Sinatra, but the performance story is very different.

Cuba is the fastest framework I tested that wasn’t just a pure Rack app. It pretty consistently got about 90% of the performance of the pure Rack app. That makes it about 3x as fast as Rails in most of the benchmarks.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 524.45 517.5 515.05 error 1964.52
Thin 4744.93 5156.4 5597.96 2083.19 n/a
Puma 2491.91 2714.85 3807.7 2110.44 5007.51
Unicorn 6419.45 6375.45 7335.4 4380.07 n/a
Passenger 1961.62 2034.37 2005.41 1331.92 912.03
Fishwife n/a n/a n/a n/a 6907.66
Jubilee n/a n/a n/a n/a 7649.18
Mizuno n/a n/a n/a n/a 6464.43
Torquebox 3 n/a n/a n/a n/a 7099.98
Torqbox n/a n/a n/a n/a 9169.82
Trinidad n/a n/a n/a n/a 4879.29

On standard Ruby runtimes, performance was in the 4,745 - 5,600 req/sec range using Thin.

Peak Cuba performance was 9,169 req/sec using Torqbox and JRuby

Grape

Grape is a framework designed to make it easy to design and implement RESTful API’s. It does that job admirably, but it’s not very fast. Performance wise in most tests it sat between Rails and Sinatra. That’s not much of an achievement. I expected it to be faster.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 374.2 370.47 455.03 error 1514.31
Thin 1160.34 1452.54 2197.89 846.68 n/a
Puma 913.7 1144.06 1551.67 1127.68 3380
Unicorn 1725.43 1496.58 2258.29 1322.61 n/a
Passenger 1382.99 1302.86 1488.18 800.67 14.1
Fishwife n/a n/a n/a n/a 3955.25
Jubilee n/a n/a n/a n/a error
Mizuno n/a n/a n/a n/a 3724.34
Torquebox 3 n/a n/a n/a n/a 3486.19
Torqbox n/a n/a n/a n/a 4425.06
Trinidad n/a n/a n/a n/a 2520.6

On standard Ruby runtimes, performance was in the 1,160 - 2,200 req/sec range using Thin.

Peak Grape performance was 4,425 req/sec using Torqbox and JRuby

Nancy

Nancy is a microframework for web development inspired by Sinatra and Cuba. Performance wise, Nancy is faster than Sinatra, but slower than Cuba. On standard Ruby, it did pretty well, but it didn’t get as big of a boost from JRuby as other frameworks did and I’m not sure why.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 526.66 517.81 540.1 error 1826.54
Thin 2969.61 3953.38 4569.47 1876.33 n/a
Puma 2061.5 2453.47 3238.87 2111.23 4547.18
Unicorn 5899.32 6155.76 6646.03 2940.56 n/a
Passenger 1907.97 1948.93 1722.47 1051.36 554.81
Fishwife n/a n/a n/a n/a 4850.38
Jubilee n/a n/a n/a n/a 6460.57
Mizuno n/a n/a n/a n/a 4283.41
Torquebox 3 n/a n/a n/a n/a 5763.73
Torqbox n/a n/a n/a n/a 5284.58
Trinidad n/a n/a n/a n/a 3800.1

On standard Ruby runtimes, performance was in the 2,970 - 4,570 req/sec range using Thin.

Peak Nancy performance was 6,460 req/sec using Jubilee and JRuby

NYNY

NYNY is a tiny web framework that is pretty fast and has reasonable looking code. It doesn’t perform as well as Nancy, Brooklyn, or Cuba, but it’s still a lot faster than Sinatra or Rails.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 472.54 494.1 520.72 error 1786.84
Thin 2481.18 3472.98 4040.04 2368.99 n/a
Puma 1850.22 2230.05 2878.3 1746.95 4726.11
Unicorn 5370.43 5342.17 5452.63 2956.12 n/a
Passenger 1791.48 1757.66 1425.75 1113.32 1.96
Fishwife n/a n/a n/a n/a 6104.66
Jubilee n/a n/a n/a n/a error
Mizuno n/a n/a n/a n/a 5608.24
Torquebox 3 n/a n/a n/a n/a 6658.24
Torqbox n/a n/a n/a n/a 7716.72
Trinidad n/a n/a n/a n/a 3257.51

On standard Ruby runtimes, performance was in the 2,480 - 4,040 req/sec range using Thin.

Peak NYNY performance was 7,716 req/sec using Torqbox and JRuby

Pakyow

Before this test I didn’t know much about Pakyow and after this test I don’t much like it. Pakyow was buggy and obnoxious in a lot of scenarios. In most scenarios Pakyow wasn’t very fast either, so it was a whole lot of disappointment all around.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 939.5 1364.35 1566.66 794.5 1098.73
Thin 768.6 1326.88 1548.82 759.77 n/a
Puma 1024.12 1311.16 1560.18 753.45 1214.67
Unicorn error error error error n/a
Passenger error error error error error
Fishwife n/a n/a n/a n/a 1107.32
Jubilee n/a n/a n/a n/a 1117.5
Mizuno n/a n/a n/a n/a 893.05
Torquebox 3 n/a n/a n/a n/a error
Torqbox n/a n/a n/a n/a 1078.49
Trinidad n/a n/a n/a n/a error

On standard Ruby runtimes, performance was in the 770 - 1,550 req/sec range using Thin.

Peak Pakyow performance was 1,566 req/sec using WEBrick and Ruby 2.1.0

Rack

Rack isn’t really a framework per se, so you probably wouldn’t write a whole app in pure Rack, but it was worth testing simply to see what peak potential performance looked like. What I found was pretty surprising. Pure Rack paired with the right runtime and server can be really, really fast. A lot faster than anyone gives Ruby credit for.
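For readers who have never seen one, a pure Rack "hello world" is tiny. The snippet below is a minimal sketch and not the app from the benchmark repo; the constant name HELLO_APP is made up for illustration. It relies only on the Rack convention that an app is any object responding to `call(env)` and returning a status, a headers hash, and a body that yields strings.

```ruby
# A minimal "hello world" Rack-style app.
# HELLO_APP is a hypothetical name, not taken from the benchmark code.
HELLO_APP = lambda do |_env|
  [200, { "Content-Type" => "text/plain" }, ["Hello World"]]
end

# Calling it directly with an empty env shows the response triple
# that a Rack server would turn into an HTTP response.
status, headers, body = HELLO_APP.call({})
puts status
puts headers["Content-Type"]
puts body.join
```

In a config.ru file you would hand the same object to the server with `run HELLO_APP`. Everything a framework adds on top of this (routing, params parsing, rendering) is the overhead measured in the tables.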

What is also interesting is how much overhead exists for everything on top of Rack. The best result relative to Rack came from Cuba, which hits about 90% of Rack performance, while Sinatra only hits 30-50% and Rails only hits 20-30%. That means that probably 95% or more of all Ruby projects are giving up 50-80% of potential performance before they even do “hello world”. No wonder Ruby has a reputation for being slow.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 435.42 553.62 563.65 error 1997.82
Thin 5789.4 6027.65 6301.64 2724.97 n/a
Puma 3044.66 2911.11 4125.52 2833.24 5909.64
Unicorn 7125.39 7069.77 7634.51 5156.36 n/a
Passenger 2027.15 2083.81 2051.84 1561.96 1105.44
Fishwife n/a n/a n/a n/a 7611.03
Jubilee n/a n/a n/a n/a 9505.61
Mizuno n/a n/a n/a n/a 7137.58
Torquebox 3 n/a n/a n/a n/a 7808.69
Torqbox n/a n/a n/a n/a 10159.46
Trinidad n/a n/a n/a n/a 5567.88

On standard Ruby runtimes, performance was in the 5,790 - 6,300 req/sec range using Thin.

Peak Rack performance was 10,159 req/sec using Torqbox and JRuby
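To make the "percent of Rack" comparison concrete, here is a small sketch that recomputes it for one server and runtime. The numbers are the Thin on Ruby 2.1.0 results copied from the tables in this article; the constant and method names are made up for illustration, and other server and runtime combinations will give somewhat different percentages.

```ruby
# Thin on Ruby 2.1.0, in req/sec, copied from the tables in this article.
THIN_RUBY_210 = {
  "Rack"    => 6301.64,
  "Cuba"    => 5597.96,
  "Sinatra" => 2505.79,
  "Rails"   => 1455.28
}.freeze

# Hypothetical helper: express each result as a percentage of the
# pure Rack result measured on the same server and runtime.
def share_of_rack(results)
  rack = results.fetch("Rack")
  results.transform_values { |rps| (rps / rack * 100).round(1) }
end

share_of_rack(THIN_RUBY_210).each do |framework, percent|
  puts format("%-8s %5.1f%% of Rack", framework, percent)
end
```

On this particular combination Cuba lands a little under 90% of Rack, Sinatra around 40%, and Rails in the low 20s, which lines up with the ranges quoted above.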

Rails

If you’ve read everything so far, you’ve probably noticed I beat up on Rails quite a bit and for good reason. It’s the most popular Ruby web framework by a wide margin and it’s basically the slowest reasonable framework option. The only framework that was slower was Ramaze.

What makes this even worse is I am not using a full Rails app, but rather a single file Rails Metal app to give it the best chance of performing well. It’s just not that fast. I’m pretty sure a standard full-on Rails app would be even slower.

Obviously the reason people use Rails is not performance, but rather the ecosystem and the things you get “for free” so that you don’t have to reinvent the wheel. I totally get the value of those things, but the performance tradeoff is very significant and I don’t think there is enough discussion or data around how significant of a performance tradeoff that really is.

The most common combination of Standard Ruby, Thin (or Unicorn), and Rails yields performance in the range of 1,500 req/sec. That puts it at about 15% of the speed of the fastest Rack combination in the benchmarks. Most Ruby web apps are giving away 85% of peak performance by their choice of server, runtime, and framework.

So, Rails did poorly on the benchmarks, but most projects are going to stick with it anyway because they have a different reason than performance to use Rails. That’s totally fine and 100% expected. Just don’t be surprised when your Rails app is slow.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 332.57 388.33 395.52 error 1998.7
Thin 900.99 1479.54 1455.28 604.91 n/a
Puma 1041.44 945.55 1077.6 1366.99 2588.18
Unicorn 1490.94 1574.87 1533.81 626.2 n/a
Passenger 1131.25 1167.28 1155.39 696.18 269.78
Fishwife n/a n/a n/a n/a 2687.64
Jubilee n/a n/a n/a n/a error
Mizuno n/a n/a n/a n/a 2666.54
Torquebox 3 n/a n/a n/a n/a 2504.63
Torqbox n/a n/a n/a n/a 3343.91
Trinidad n/a n/a n/a n/a 1145.57

On standard Ruby runtimes, performance was in the 900 - 1,480 req/sec range using Thin.

Peak Rails performance was 3,343 req/sec using Torqbox and JRuby

Rambutan

I don’t know much about Rambutan. It’s not a very popular project, but it is a good performer. In fact, it’s in the upper echelon of frameworks in terms of performance. It runs circles around Rails and Sinatra and is right there with Brooklyn and Nancy in terms of performance.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 483.93 509.34 514.51 696.18 1743.21
Thin 2865.03 4202.41 4639.73 2025.54 n/a
Puma 2430.24 2270.3 3182.32 2440.6 4438.94
Unicorn 6316.72 6203.95 6247.88 2907.87 n/a
Passenger 1810.75 1969.36 1897.11 1484.58 662.37
Fishwife n/a n/a n/a n/a 4494.62
Jubilee n/a n/a n/a n/a 7685.04
Mizuno n/a n/a n/a n/a 3836.56
Torquebox 3 n/a n/a n/a n/a 6359.79
Torqbox n/a n/a n/a n/a 8059.5
Trinidad n/a n/a n/a n/a 4324.22

On standard Ruby runtimes, performance was in the 2,865 - 4,640 req/sec range using Thin.

Peak Rambutan performance was 8,059 req/sec using Torqbox and JRuby

Ramaze

Ramaze bills itself as being simple and light, but in my benchmarking Ramaze was slow. In fact, it was slower than Rails and it was the slowest framework I tested. I’m not sure if it’s being maintained anymore or what, but it’s certainly not a performance oriented framework.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 302.39 341.73 465.15 470.29 1250.64
Thin 645.43 952.23 1256.9 470.29 n/a
Puma 658.99 790.15 1065.02 844.81 2343.07
Unicorn 1073.43 913.3 1380.54 416.28 n/a
Passenger 1070.6 1074.33 1201.97 649.3 1.97
Fishwife n/a n/a n/a n/a 2326.57
Jubilee n/a n/a n/a n/a 3020.81
Mizuno n/a n/a n/a n/a 2290.06
Torquebox 3 n/a n/a n/a n/a 2209.46
Torqbox n/a n/a n/a n/a 3244.25
Trinidad n/a n/a n/a n/a 1038.71

On standard Ruby runtimes, performance was in the 645 - 1,255 req/sec range using Thin.

Peak Ramaze performance was 3,244 req/sec using Torqbox and JRuby

Sinatra

Of all the benchmarks I did in this crazy project, Sinatra was the biggest surprise. Before this project I always assumed Sinatra was the smallest, lightest, fastest option for Ruby frameworks. Apparently that isn’t true at all.

When it comes to benchmarks Sinatra is faster than Rails, but pretty unremarkable beyond that. There are like 4 or 5 other frameworks that look and feel like Sinatra, but perform a lot better in these benchmarks.

I still like Sinatra and will use it on some projects, but I wish it was as fast as Cuba.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 431.99 470.71 481.83 error 1620.67
Thin 1474.96 2121.51 2505.79 1151.73 n/a
Puma 1453.68 1521.26 1807.19 1950.03 3432.3
Unicorn 2578.47 2425.53 2813.4 1559.92 n/a
Passenger 1597.59 1574.74 1615.65 1216.57 739.83
Fishwife n/a n/a n/a n/a 4184.09
Jubilee n/a n/a n/a n/a error
Mizuno n/a n/a n/a n/a 3351.5
Torquebox 3 n/a n/a n/a n/a 4051.02
Torqbox n/a n/a n/a n/a 5554.89
Trinidad n/a n/a n/a n/a 2608.8

On standard Ruby runtimes, performance was in the 1,475 - 2,505 req/sec range using Thin.

Peak Sinatra performance was 5,554 req/sec using Torqbox and JRuby

Cramp

Cramp is an EventMachine based asynchronous web framework for Ruby. So, it should be fast and awesome, right? Apparently not. Cramp was the most frustrating thing to try to benchmark for a few reasons. First, it seems to only run on Thin or WEBrick. Second, it requires an outdated version of ActiveSupport (3.2 or something), so it required a different bundle to run. When I did try to benchmark it, it wouldn't even run on most servers.

Here’s the best part… It’s not that fast.

I don't think it will be back for the next round of benchmarks.

Ruby 1.9.3 Ruby 2.0.0 Ruby 2.1.0 Rubinius 2.2.2 JRuby 1.7.9
WEBrick 353.39 error error error error
Thin 2781.12 3521.38 3885.68 2087.24 n/a
Puma error error error error error
Unicorn error error error error n/a
Passenger error error error error error
Fishwife n/a n/a n/a n/a error
Jubilee n/a n/a n/a n/a error
Mizuno n/a n/a n/a n/a error
Torquebox 3 n/a n/a n/a n/a error
Torqbox n/a n/a n/a n/a error
Trinidad n/a n/a n/a n/a error

On standard Ruby runtimes, performance was in the 2,780 - 3,885 req/sec range using Thin.

It pretty much didn’t run on anything else.

The Data...

You can get the full data dump on GitHub. All the source code, data output, everything is available as open source.

I believe it's important to advance the community knowledge by sharing the information we have, so please take the time to look at the code and the data and by all means fork the repo and make improvements.

The Surprises…

There were quite a few things that surprised me in going through the benchmarking process. Going into this project I had a few ideas about how fast the different frameworks were relative to each other, and those were mostly confirmed, but after running each and every test a few things really shocked me.

Ruby Can Be Fast

With the right combination of server, runtime, and framework Ruby can be as fast as a language like Go or Scala. In my tests Hello World Go was only 350 req/sec faster than JRuby + Torqbox + Rack. JRuby + Torqbox + Rack is a powerful combination. If plain old Rack isn’t your cup of tea, JRuby + Torqbox + Cuba is only about 10% slower.

Sinatra Is Slow

Rails doesn’t have a great performance reputation, but Sinatra has always been seen as “lightweight and fast” and compared to Rails, Sinatra is a lot faster. However, Sinatra is slow compared to Cuba, Brooklyn, and Rambutan.

Unicorn Isn’t That Fast

For standard Ruby and Rails, Unicorn isn’t especially faster than Thin in my testing. I understand that with Unicorn you can easily spawn a lot of Unicorn instances to get more throughput, but compared to the performance of other servers on JRuby, you aren’t gaining anything by doing that. The fastest servers live in JRuby and Unicorn is somewhat hamstrung by the Standard Ruby implementation.

We Are Using The Slowest Stuff

As an industry, many/most Ruby shops are using the standard Ruby versions to run mostly Rails apps. If you look at job openings, probably 95% of Ruby jobs are for Rails. A much smaller group uses Sinatra for various things and almost everything else I tested is obscure projects that most developers have never heard of.

However, the combination of Standard Ruby + Thin/Unicorn + Rails is about the slowest possible combination. That on Ruby 2.1 gives you 1,455 req/sec. The fastest possible combination (JRuby + Torqbox + Rack) gets you 10,159 req/sec.

That is giving up 85.67% of potential performance before you write a single useful line of code. Put another way, even at its best, the standard Rails app is 6.98 times slower than the fastest Ruby stack could be. That also means your standard Rails app is at least 7 times slower than the comparable Go or Scala app, simply because of your server, runtime, and framework.
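The arithmetic behind those two figures is simple enough to check. The sketch below uses the Ruby 2.1.0 + Thin + Rails and JRuby + Torqbox + Rack results from the tables; the method and variable names are made up for illustration.

```ruby
# req/sec figures copied from the tables in this article.
rails_on_thin   = 1455.28   # Ruby 2.1.0 + Thin + Rails
rack_on_torqbox = 10159.46  # JRuby 1.7.9 + Torqbox + Rack

# Hypothetical helper: how much throughput the slower stack gives up,
# and how many times slower it is than the faster one.
def performance_gap(slow, fast)
  {
    given_up_percent: (1 - slow / fast) * 100,
    slowdown_factor:  fast / slow
  }
end

gap = performance_gap(rails_on_thin, rack_on_torqbox)
puts format("given up: %.2f%%", gap[:given_up_percent])
puts format("slowdown: %.2fx", gap[:slowdown_factor])
```

The result comes out at roughly 85.7% given up and a slowdown factor of just under 7, matching the numbers quoted above to within rounding.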

JRuby Is A Huge Ruby Ecosystem Win

Before doing these benchmarks, I didn't realize just how much performance is hiding inside of JRuby. The JVM is an impressive piece of technology and what the JRuby team has done is outstanding. One thing that I don't see mentioned often is the benefit of the various JVM web serving platforms like Jetty, Undertow, JBoss, and Vert.x. There is a lot of investment in these projects by large companies and it really helps the JRuby ecosystem to build on top of those to get servers like Torquebox, Jubilee, and Fishwife.

Another benefit of JRuby is that being on the JVM means all the performance improvements made to the JVM end up making JRuby faster too. Oracle invests a lot in the JVM and with projects like Truffle and Graal, there are even more performance wins on the way for JRuby.

I think the most surprising benefit to using JRuby is you can tap into the performance investments other companies make in JVM related technologies while still using Ruby. You might not have million dollar R&D budgets or scale problems like Google or Twitter, but that doesn't mean you can't benefit from their investment in the Java ecosystem while still using Ruby, right?

Other Notes

These are a few other things I scribbled down while doing benchmarks.

  • WEBrick is a slow performer and it is difficult to benchmark with Apache Bench.
  • Sometimes wrk doesn't seem to jive well with certain frameworks
  • Unicorn performance doesn't seem very predictable
  • Cramp doesn't like anything outside of thin it seems
  • Even after tweaking Passenger settings, it lags behind the other servers in performance. I wish there was a --high-performance option.
  • Unicorn and wrk don't seem to benchmark well together
  • I couldn't get rainbows to work
  • reel didn't work super well on rubinius, wrk caused it to crash
  • ab doesn't like webrick on rubinius
  • you can see the JIT happening over time if you keep running benchmarks
  • pakyow is truly unpredictable as to where it will work/benchmark and why
  • it takes quite a few benchmark runs on rubinius for it to top out
  • JRuby JIT keeps making things faster over time - Rails 300 to 2800 req/sec on Torquebox 3
  • Torquebox 3's deploy is a bit annoying compared to Torqbox (Torquebox 4) for specifying a Rakefile
  • JRuby config settings make some difference, but not night and day difference
  • Running servers like thin under rack command makes them slower, not sure why
  • JRuby JIT pauses make benchmarking weird
  • Passenger JRuby perf isn't terribly predictable
  • JRuby/Passenger/Grape benchmarked very poorly and I'm not exactly sure why
  • "lightweight" or "micro" is a relative term
  • Jubilee is a bit buggy and thus hard to benchmark
  • Jubilee is fast when it works
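Several of these notes come down to warm-up: on JRuby and Rubinius the first runs are slow and later runs get faster, which is why the goal of this benchmark was the fastest run rather than the average. A rough sketch of that "best of N runs" idea in plain Ruby follows. It is not the Rake tooling used for the benchmark; the method names are made up, and the workload is an in-process lambda call standing in for a real HTTP request.

```ruby
# Hypothetical helper: call the block `runs` times, keep every
# measurement, and report the best one rather than the average.
def best_of(runs)
  samples = Array.new(runs) { yield }
  { samples: samples, best: samples.max }
end

# Stand-in workload: time direct calls to a hello-world lambda.
# A real benchmark would drive a running server with ab or wrk instead.
def calls_per_second(iterations = 50_000)
  app = ->(_env) { [200, { "Content-Type" => "text/plain" }, ["Hello World"]] }
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { app.call({}) }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  iterations / elapsed
end

result = best_of(5) { calls_per_second }
result[:samples].each_with_index do |sample, index|
  puts format("run %d: %.0f calls/sec", index + 1, sample)
end
puts format("best:  %.0f calls/sec", result[:best])
```

On a JIT-based runtime the per-run numbers printed here would be expected to climb over the first few runs, which is the effect described in the notes above.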

Credits

I want to thank the people who helped directly and indirectly to put this report together and who inspired its creation in the first place.

  • TechEmpower Benchmarks - Their efforts to benchmark every framework and language are awesome and full of great info.
  • Luis Lavena's Benchmark Project - His project provided some really useful code for some of the Hello World apps.
  • The Ruby, JRuby, and Rubinius teams as well as every other project on this benchmark. Their work goes mostly unnoticed, but without all these open source projects, our code wouldn't run. We truly stand on the shoulders of giants.
  • My Twitter Followers - It's a little known fact, but this whole project started in the middle of the night some Saturday night when I should have been sleeping. Charles Nutter noticed some benchmark tweets and retweeted them. All of a sudden a bunch of people were interested in my ruby benchmarks. After that I knew I needed to publish a full suite of benchmarks, I just didn't realize it was going to be this comprehensive.
  • My Family & My Job - My wife and kids put up with me staying up too late to work on this thing and the team at Market let me do some of this during work time. Sure, it benefits Market, but it would have been easy to tell me to stop writing and start doing some "real work".

Conclusion

This project turned out to be a much bigger undertaking than I realized when I started almost six months ago. What started as a curious question about what the fastest Ruby framework and server is turned into possibly the most comprehensive Ruby web benchmarking project I've ever seen. It was a lot of work, but it was also a lot of fun.

If I had to sum up this entire project I would say that the Ruby ecosystem is not as focused on performance in some areas and I think that has made it easier for developers to justify leaving Ruby for Node.js or Go or Scala over the years. Frankly, I don’t blame them because Ruby’s perceived lack of performance is basically true when it comes to Rails and Sinatra on standard Ruby setups.

However, as these tests show, Ruby is not slow in the absolute sense of the word. Given the right combination of runtime, server, and framework, Ruby can be surprisingly fast and offers a lot more performance than it’s often given credit for.

Obviously this is just a “Hello World” web benchmark so it doesn’t tell us much beyond potential performance in Ruby, but I would personally rather have Ruby spending time doing useful things like calculations or talking to a database or rendering strings than have it take 3x as long just to output “Hello World”. More importantly, knowing that most apps are giving up 85% performance before they do anything at all, I think it makes sense to reconsider that tradeoff, especially at scale.

Also, since these benchmark framework versions are about 4 months old as of this writing, I plan on running them again in the future and maybe adding new frameworks like Lotus to see what if anything has changed.

Beyond that, I plan on benchmarking database ORM’s to find the fastest combination of those available and to see how much of an impact talking to a database has on your app performance. I fully expect to be surprised by those results as well.


Article posted on July 11, 2014
Top image by: Yukihiro Matsumoto


