I use Amazon EC2 infrastructure to run my Ruby on Rails application and lately the load time for my application has gone in minutes, actually couple of minutes. After some googling I realized that it does not matter much if I have lots of models and controllers in my code. Most of the time consumed is while loading gems when the application starts. [Source]
So I realized that in my Gemfile I have couple of Gems which were used only during db:seed and are not at all used in production server. But when server is restarted, rails still load these gems as they are mentioned in Gemfile. To avoid this, we can mention ‘require => false’ for such gems and rails will do following:
- Install these gems whenever you run bundle install
- Do not load them when a server starts
- You can still load them by explicitly specifying “require <gemname>” where you need them
This brought down my server load time by 25% of original load time. Here is the log:
- Before using ‘require => true’ :
time r s -d
=> Booting WEBrick
=> Rails 3.0.3 application starting in development on http://0.0.0.0:3000
real 0m21.213s
user 0m17.353s
sys 0m1.852s - After using require => true (e.g. gem testgem, :require => true)
time r s -d
=> Booting WEBrick
=> Rails 3.0.3 application starting in development on http://0.0.0.0:3000
real 0m17.351s
user 0m13.865s
sys 0m1.648s
With fragment caching floating around it always remain to decide how small piece of code should I cache. Is it worth caching a code as it renders a partial, or just because it has a few lines of HTML/ JS / Ruby code, or it makes a few DB calls to fetch data.
This blog is dedicated to my findings and observations regarding the same. I am using Rails 3 + Memcachedb + Ubuntu for my experiments. Please note that I have mentioned average response time after coupling a few requests.
1. Small partials – caching not needed
About partial: Few lines of HTML / CSS and JS code (Around 10 lines)
- Response Time when cache was set:
Started GET “/get_top_menu.js” for 127.0.0.1 at 2011-04-20 11:58:58 +0530
Processing by HomeController#get_top_menu as JS
User Load (9.9ms) SELECT `users`.* FROM `users` WHERE (`users`.`id` = 1) LIMIT 1
Exist fragment? views/logged_in-true (1.9ms)
Read fragment views/logged_in-true (0.1ms)
Rendered home/get_top_menu.js.erb (1201.0ms)
Completed 200 OK in 1259ms (Views: 1247.3ms | ActiveRecord: 9.9ms | Sphinx: 0.0ms) - Response time when cache was not set:
Rails.cache.delete(‘views/logged_in-true’)
Started GET “/get_top_menu.js” for 127.0.0.1 at 2011-04-20 12:07:53 +0530
Processing by HomeController#get_top_menu as JS
User Load (1.5ms) SELECT `users`.* FROM `users` WHERE (`users`.`id` = 1) LIMIT 1
Exist fragment? views/logged_in-true (6.7ms)
Rendered devise/menu/_logged_in_menu.html.erb (7.7ms)
Rendered devise/menu/_logged_in.html.erb (19.2ms)
Write fragment views/logged_in-true (1.2ms)
Rendered home/get_top_menu.js.erb (1225.8ms)
Completed 200 OK in 1284ms (Views: 1281.5ms | ActiveRecord: 1.5ms | Sphinx: 0.0ms)Please observe there was no time saved with cache here. Probably because sending a tcp request to memcachedb (even on localhost) is itself time consuming. Here we made 2 such requests, one to check if cache exists and second to actually fetch it. This nullifies the time we saved on rendering a small partial.
- Response time when we did not implement any cache:
Started GET “/get_top_menu.js” for 127.0.0.1 at 2011-04-20 12:14:25 +0530
Processing by HomeController#get_top_menu as JS
User Load (3.3ms) SELECT `users`.* FROM `users` WHERE (`users`.`id` = 1) LIMIT 1
Rendered devise/menu/_logged_in_menu.html.erb (1.3ms)
Rendered devise/menu/_logged_in.html.erb (7.6ms)
Rendered home/get_top_menu.js.erb (425.6ms)
Completed 200 OK in 462ms (Views: 457.6ms | ActiveRecord: 3.3ms | Sphinx: 0.0ms)Please observe there we got a faster response when cache was not at all used. Probably because sending a tcp request to memcachedb (even on localhost) is itself time consuming.
I guess results are pretty self explanatory and make sure that you use fragment cache when there is a need and not for every small request. The need arises when there are sql queries (they will anyway make tcp requests to db) or heavy computation in ruby code. You will also save memory by not dumping in huge contents (HTML / CSS / JS) which do not require much time to render.
Stuck with Rails debugging? Here are some useful Commands that you should know and will help you in diagnose your system. First few commands are in general for any production environment. Last few ones are specific to ruby on rails. This is basic diagnose and my next post will elaborate more on the various tools you can use to fix things up.
- last – Checkout out who all last logged in on the machine. This helps a little to get an idea whether the issue was self triggered or due to someone’s mischief.
- who – Who all are logged in right now? Is someone else experimenting on the machine?
- pstree -ap -u <user> – Let us dive into more serious debugging. This command will give you various processes running under a user’s name. It shows the processes in a tree format so that you know which process spawned subsequent children processes. By default you will observe that there are x number of ruby processes spawned by init. This x is the configuration specified to nginx for how many processes to spawn in parallel. Here is the output that I caught on a stable production m/c:
- ps -ef – This command will give you long list of processes, their parent processes and the commands with which they were invoked. We can do a grep to find out the details about a particular process. Here is the output of the process we are interested in (pid: 12368 – as mentioned by pstree output).
$ ps -ef | grep 12368
ubuntu 12368 1 0 Apr05 ? 00:01:18 Rack: /home/ubuntu/github/trips3m
ubuntu 27806 27651 0 11:04 pts/0 00:00:00 grep –color=auto 12368Usually, the parent process of this ruby child is 1 (init). The reason is that any process spawned by a demon (here nginx/passenger) will get its parent id as that of init, 1. This is to save the process by getting killed in case its parent process was killed. init lives in your system untill you reboot. Another observation is that is shows the time when it was spawned (here Apr 05). This is the date when you restarted your server!
Finally the command with which the process was spawned: Rack. This command is provided by rails so that 3rd party servers or scripts can invoke rails and serve the query to browser. So this is a very generic command. - top - Ok, so we now know about the processes running on our system and their utility. How do we debug if we are not getting a timely response or nothing is showing up in logs. Well, if for a web request you are seeing that browser is just waiting for data from your server, first check if any entry is getting populated in:
- logs/production.log, or
- /opt/nginx/logs/error.log
The entries in these files are populated when the request gets completed by the web server (rails). If nothing is getting populated, it means that rails is still stuck while resolving the response for your request. There are high chances that your web request is still running on the server due to one of many possible issues. (may be in infinite loop, or is hanging somewhere, etc.). Check out this command (top) to know what exactly is happening on system now. This will give you leads on CPU usage, memory usage and which process is eating how much resources. See for following symptoms:
- High CPU consumption – This means one of your process is in infinite loop or running some complex algorithms which you need to fix. Normally when a web request comes to your server, the ruby processes consumes 90% cpu for couple of seconds and then chills down. If this is eating high CPU for long, you need to improve your code. Check out the controller and other ruby functions you invoked while serving for this particular action.
- High Memory consumption - Memory Leaks! You are doomed. Resort to one of the memory debugger tools or take a careful look at your code again.
- lsof – It talks about the files that are being handled by a running process. Not sure what you can make out of it, but at times it may be useful.
- /proc/$pid/ – Contains all the information about any running process. All the commands derive their data from this virtual filesystem. So hack it down if you are searching for any answers related to a process.
pstree -ap -u ubuntu
redis-server,2725 redis.conf
ruby,12368
—{ruby},12370
ruby,12376
—{ruby},26025
searchd,5346 –pidfile –config /home/ubuntu/github/trips3m/config/production.sphinx.conf
sshd,27650
—bash,27651
——pstree,27804 -ap -u ubuntu
It is easy to note that the two processes of ruby were started by nginx and they will serve to the requests made by the client. They work as pool threads and nginx keeps them up even if there is no current request to be served.
System information: I am using nginx + Passenger on Rails 3 on Ubuntu.
