Sunday 20 May 2012

High performance Publish/Subscribe solution for Ruby On Rails or Juggernaut vs. Faye


In this post I want to share some experience with pub/sub for RoR. First, I'll briefly explain what kind of application we have and why we chose Faye, and then I'll show how to run many instances of Faye without any load balancers and without Redis.

One of the RoR apps I've developed is handling more than 30 000 RPM (requests per minute) as I'm writing this post (and it is ready to handle more). About 90% of requests push to Faye. About 6 000 users are connected to Faye right now. Everything is running stably and smoothly. And we are going to grow :)

But before Faye we used Juggernaut. These technologies are pretty similar, and both are easy to use and integrate with a Rails app. We chose Juggernaut because it was... more popular, I think. So, we patched it a little bit for our needs and quickly integrated it into our application. Everything was good until the first deployment to production, with fewer than a hundred online users. It loaded the servers - but we were ready to accept that. But then we found another issue - it was very unstable. If somebody pushes broken data to it, the process dies. Totally dies. So we added monit to monitor Juggernaut. Not really cool, right? We kept patching it a lot. Still, it was not stable, and sometimes it used so much CPU that we decided to look for another solution. And that's how we found Faye.

Before describing Faye I want to say that Juggernaut has one great benefit which Faye doesn't have - you can push events directly to Redis and Juggernaut will pick them up and process them. You can't easily do the same with Faye (maybe it will be possible in the future). Instead you have to send an HTTP request, which is slower and puts load on Faye's server.

So, we decided to switch to Faye. We chose node.js instead of Thin, and after the first deployment we saw the difference - Faye is stable and barely loads the system at all.

And now, finally, about high-performance and scalability.

Both Node.js and Thin are very slow if you run them with SSL - check my next post to see how to solve this problem.

First of all, you need many instances of Faye to support a lot of concurrent users. Faye supports Redis as shared storage (in experimental mode, but it seems to be stable). That lets you run many instances on many servers, but it's slow - every instance needs to communicate with Redis. So, we decided to create our own simple sharding mechanism instead of using Redis. And we didn't want to add one more load balancer for it.

Note: this mechanism is designed for the case where each user subscribes to his own channel. To push data to a global channel you need to send a request to every shard.

Let's say we have a domain name app.com pointing to the load balancer for the Rails app. All our servers have sub-domains like server1.app.com, server2.app.com, etc.

Configuration files:

We've created a YAML config where we listed all our instances. It looks like:
  production:
    shards:
      -
        node_port: 42000
        node_host: server1.app.com
        node_local_host: 10.x.x.1 #local IP of server
        run_on: server1_hostname

      -
        node_port: 42001
        node_host: server1.app.com
        node_local_host: 10.x.x.1
        run_on: server1_hostname
      -
        node_port: 42000
        node_host: server2.app.com
        node_local_host: 10.x.x.2
        run_on: server2_hostname
  .....

The run_on option is used by our own Rake task to detect which shards to start on a specific server during deployment.
node_host is a public domain name or IP - we use it to generate the URL for users.
node_local_host is the local IP of the server, because we want to push data through local interfaces.
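
To make the role of run_on more concrete, here is a minimal sketch of how a deployment task might pick the shards for the current machine. It assumes the config has the structure shown above; the helper name shards_for_host, the EXAMPLE_CONFIG constant and the 10.0.0.x addresses are made up for illustration and are not taken from our Rake task.

```ruby
require 'yaml'
require 'socket'

# Hypothetical helper: returns the shard entries whose run_on value
# matches the given hostname (the current machine by default).
def shards_for_host(config_yaml, env, hostname = Socket.gethostname)
  config = YAML.safe_load(config_yaml)
  config.fetch(env).fetch('shards').select { |shard| shard['run_on'] == hostname }
end

# Example config with the same keys as above; the IPs are placeholders.
EXAMPLE_CONFIG = <<~CONFIG
  production:
    shards:
      - node_port: 42000
        node_host: server1.app.com
        node_local_host: 10.0.0.1
        run_on: server1_hostname
      - node_port: 42001
        node_host: server1.app.com
        node_local_host: 10.0.0.1
        run_on: server1_hostname
      - node_port: 42000
        node_host: server2.app.com
        node_local_host: 10.0.0.2
        run_on: server2_hostname
CONFIG

# Each selected entry would then be handed to whatever starts a Faye process.
shards_for_host(EXAMPLE_CONFIG, 'production', 'server1_hostname').each do |shard|
  puts "would start Faye on port #{shard['node_port']}"
end
```

With this layout, adding capacity on a server is just one more entry in the list with the same run_on and a new port.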

Sharding:

We assign a shard to each user with a very simple formula:
   shard = @shards[user.id % @shards.size]

If you have 3 shards, users with ids 0, 3, 6 are connected to the 1st shard; users with ids 1, 4, 7 to the 2nd, and so on.
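
The modulo mapping can be tried out in isolation. The sketch below is only an illustration: the SHARDS list of host:port strings and the standalone shard_for method are assumptions, simplified from the shard objects used in the app.

```ruby
# Illustrative shard list; in the app this would come from the YAML config.
SHARDS = %w[server1.app.com:42000 server1.app.com:42001 server2.app.com:42000].freeze

# Picks a shard by taking the user id modulo the number of shards.
def shard_for(user_id, shards = SHARDS)
  shards[user_id % shards.size]
end

# Group a few ids by shard to see the distribution.
(0..7).group_by { |id| shard_for(id) }.each do |shard, ids|
  puts "#{shard}: #{ids.join(', ')}"
end
```

One thing to keep in mind with plain modulo: changing the number of shards changes the assignment for most users, so clients have to reconnect to their new shard after the list changes.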

Client side code:
client = new Faye.Client(<%= raw Faye.shard_for(user).url.inspect %>);
client.subscribe(<%= raw Faye.shard_for(user).channel.inspect %>, function(data){...});

The .url method returns a URL like http://server1.app.com:42000/faye

The channel for a user is just "/#{user.id}", e.g. '/123'.
So, now we need to push events to the right shard:
...
  # Needs net/http, uri and json to be loaded.
  shard = Faye.shard_for(user)           # look up the shard once
  uri = URI.parse(shard.local_url)
  http = Net::HTTP.new(uri.host, uri.port)
  req = Net::HTTP::Post.new(uri.path)
  # The message goes as JSON in the 'message' form field.
  body = {'channel' => shard.channel, 'data' => data.to_json, 'ext' => {:auth_token => FAYE_TOKEN}}
  req.set_form_data('message' => body.to_json)
  http.request(req)
...

The .local_url method returns a URL like http://10.x.x.1:42000/faye.
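
As noted earlier, pushing to a global channel means sending the same message to every shard. Here is a rough sketch of how that could look, reusing the request format from the snippet above. The method names build_push_request and broadcast, the '/global' channel and the token value are hypothetical, and the sketch assumes every shard accepts the same auth token.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Builds (but does not send) the POST request for one shard.
# Returns the parsed URI together with the request object.
def build_push_request(local_url, channel, data, token)
  uri = URI.parse(local_url)
  req = Net::HTTP::Post.new(uri.path)
  message = { 'channel' => channel, 'data' => data.to_json, 'ext' => { 'auth_token' => token } }
  req.set_form_data('message' => message.to_json)
  [uri, req]
end

# Sends the same message to every shard, one HTTP request per shard.
def broadcast(local_urls, channel, data, token)
  local_urls.map do |local_url|
    uri, req = build_push_request(local_url, channel, data, token)
    Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
  end
end

# Usage (commented out because it needs running Faye instances):
# broadcast(['http://10.0.0.1:42000/faye', 'http://10.0.0.2:42000/faye'],
#           '/global', { 'text' => 'hello' }, 'secret-token')
```

The cost grows linearly with the number of shards, which is why this setup fits best when most pushes go to per-user channels.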


https://github.com/ntenisOT/Faye-Example-Application
http://rails-alex.blogspot.com.au/2011/10/high-performance-publishsubscribe.html
http://railscasts.com/episodes/316-private-pub
