asemanfar - a blog about programming

Pandemic -- because information needs to be contagious

April 10, 2009

Pandemic is a map-reduce framework. You supply the map, process, and reduce methods, and it handles the rest. It's designed to serve requests in real time, but it can also be used for offline tasks.

It's different from the typical map-reduce framework in that it doesn't have a master-worker structure. Every node can map, process, and reduce. It also has no concept of jobs; everything is a request.

The framework is designed to be as flexible as possible. There is no rigid request format or API; you can structure requests however you want. You can send HTTP-style headers and a body, you can send JSON, or you can send a single line and have the handler do whatever you want with it. The only requirement is that your handler acts on the request appropriately and returns the response.
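Since the request body is just a string, a JSON request comes down to parsing and serializing inside process. Here is a minimal sketch of that idea in plain Ruby, with no Pandemic dependency. The JsonProcessSketch class and the 'action' and 'text' fields are made up for illustration; in a real handler this logic would live in the process method of your Pandemic handler.

```ruby
require 'json'

# Hypothetical stand-in for a handler's process method; NOT Pandemic code.
# The 'action' and 'text' fields are an invented request format.
class JsonProcessSketch
  def process(body)
    request = JSON.parse(body)
    result =
      case request['action']
      when 'reverse' then request['text'].reverse
      when 'upcase'  then request['text'].upcase
      else raise ArgumentError, "unknown action: #{request['action'].inspect}"
      end
    # The response is also just a string, so serialize it back to JSON.
    JSON.generate({ 'result' => result })
  end
end

puts JsonProcessSketch.new.process('{"action": "reverse", "text": "abc"}')
```

The same shape works for any other format: parse the body at the top of process and serialize the result at the bottom.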

Here is how you use it:

Server

    require 'rubygems'
    require 'pandemic'

    # Only process is overridden; map and reduce keep their default behavior.
    class Handler < Pandemic::ServerSide::Handler
      def process(body)
        body.reverse
      end
    end

    pandemic_server = epidemic!
    pandemic_server.handler = Handler.new
    pandemic_server.start.join

In this example, the handler doesn't define the map or reduce methods, so the defaults are used. The defaults are as follows:

  • map: Send the full request body to every connected node
  • process: Return the body (do nothing)
  • reduce: Concatenate all the responses
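To make those defaults concrete, here is a rough simulation in plain Ruby that runs without Pandemic or any network. The DefaultsSketch class, the simulate helper, and the method signatures are all hypothetical; only the behavior (send the full body to every node, return the body, concatenate the responses) is taken from the list above.

```ruby
# Hypothetical stand-in for the default handler behavior; NOT Pandemic's
# actual implementation, and the method signatures are assumptions.
class DefaultsSketch
  # map: send the full request body to every connected node
  def map(body, nodes)
    nodes.each_with_object({}) { |node, plan| plan[node] = body }
  end

  # process: return the body (do nothing)
  def process(body)
    body
  end

  # reduce: concatenate all the responses
  def reduce(responses)
    responses.join
  end
end

# Mirrors the server example: only process is overridden.
class ReverseSketch < DefaultsSketch
  def process(body)
    body.reverse
  end
end

# Runs map, process, and reduce in a single Ruby process, pretending
# each entry in nodes is a separate server.
def simulate(handler, body, nodes)
  plan = handler.map(body, nodes)
  responses = plan.map { |_node, node_body| handler.process(node_body) }
  handler.reduce(responses)
end

puts simulate(ReverseSketch.new, "abc", ["host1:4000", "host2:4000"])
# prints "cbacba": each of the two simulated nodes reversed the full body
```

With two simulated nodes the reversed string appears twice, which is why a real handler usually overrides map or reduce as well.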

Client

    require 'rubygems'
    require 'pandemic'

    class TextFlipper
      include Pandemize
      def flip(str)
        pandemic.request(str)
      end
    end

Config

Both the server and client have config files:

    # pandemic_server.yml
    servers:
      - host1:4000
      - host2:4000
    response_timeout: 0.5

Each value for the server list is the host:port that a node can bind to. The servers value can be a hash or an array of hashes, but I'll get to that later. The response timeout is how long to wait for responses from nodes before returning to the client.

    # pandemic_client.yml
    servers:
      - host1:4000
      - host2:4000
    max_connections_per_server: 10
    min_connections_per_server: 1
    response_timeout: 1

The min/max settings control how many connections the client keeps open to each node. If you're using the client in Rails, just use 1 for both, since it's single-threaded.

More Config

There are three ways to start a server:

  • ruby server.rb -i 0
  • ruby server.rb -i machine1hostname
  • ruby server.rb -a localhost:4000

The first refers to the index in the servers array:

    servers:
      - host1:4000 # started with ruby server.rb -i 0
      - host2:4000 # started with ruby server.rb -i 1

The second refers to the key in the servers hash. This can be particularly useful if you use the hostname as the key.

    servers:
      machine1: host1:4000 # started with ruby server.rb -i machine1
      machine2: host2:4000 # started with ruby server.rb -i machine2

The third is to specify the host and port explicitly. Make sure the host and port you specify are actually in the config; otherwise the other nodes won't be able to communicate with it.
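The three start-up options can be pictured as one lookup against the servers value. The sketch below is only an illustration of that lookup. resolve_address is a hypothetical helper, not Pandemic's code, and it only handles the plain array and hash forms shown above.

```ruby
require 'yaml'

# Hypothetical helper, NOT part of Pandemic: picks this node's host:port
# from the servers value, mimicking the -i and -a options described above.
def resolve_address(servers, index: nil, address: nil)
  known = servers.is_a?(Hash) ? servers.values : servers

  if address
    # -a host:port: only accept an address that is listed in the config
    raise ArgumentError, "#{address} is not in the config" unless known.include?(address)
    return address
  end

  case servers
  when Array then servers.fetch(Integer(index)) # -i 0, -i 1, ...
  when Hash  then servers.fetch(index.to_s)     # -i machine1, -i machine2, ...
  else raise ArgumentError, "servers must be an array or a hash"
  end
end

ARRAY_CONFIG = <<~CONFIG
  servers:
    - host1:4000
    - host2:4000
CONFIG

HASH_CONFIG = <<~CONFIG
  servers:
    machine1: host1:4000
    machine2: host2:4000
CONFIG

array_servers = YAML.load(ARRAY_CONFIG)["servers"]
hash_servers  = YAML.load(HASH_CONFIG)["servers"]

puts resolve_address(array_servers, index: "1")             # host2:4000
puts resolve_address(hash_servers, index: "machine1")       # host1:4000
puts resolve_address(array_servers, address: "host1:4000")  # host1:4000
```

Rejecting an unlisted address up front is a design choice of this sketch; it turns the "other nodes can't reach it" problem into an immediate error.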

You can also set node-specific configuration options.

    servers:
      - host1:4000:
          database: pandemic_node_1
          host: localhost
          username: foobar
          password: f00bar
      - host2:4000:
          database: pandemic_node_2
          host: localhost
          username: fizzbuzz
          password: f1zzbuzz

And you can access these additional options using config.get(keys) in your handler:

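    require 'mysql'

    class Handler < Pandemic::ServerSide::Handler
      def initialize
        # The values are splatted in the order Mysql.real_connect expects:
        # host, username, password, database.
        @dbh = Mysql.real_connect(*config.get('host', 'username',
                                              'password', 'database'))
      end
    end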
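The splat in that call only works if config.get returns the values in the same order as the keys. A minimal sketch of that behavior, assuming the node's options are a plain hash, could look like this. NodeConfigSketch is a hypothetical stand-in, not Pandemic's config class.

```ruby
# Hypothetical stand-in for the handler's config object; NOT Pandemic's class.
class NodeConfigSketch
  def initialize(options)
    @options = options
  end

  # Returns the values for the given keys in the order requested, so the
  # result can be splatted into a positional argument list.
  def get(*keys)
    @options.values_at(*keys)
  end
end

config = NodeConfigSketch.new({
  'database' => 'pandemic_node_1',
  'host'     => 'localhost',
  'username' => 'foobar',
  'password' => 'f00bar'
})

host, username, password, database =
  config.get('host', 'username', 'password', 'database')

puts "connecting to #{database} on #{host} as #{username}"
```

The order comes from the call, not from the YAML file, so the keys can be listed in any order in the config.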

Code: github repository

Install:
sudo gem sources -a http://gems.github.com
sudo gem install arya-pandemic

Comments

posted by sarfraz on 04/13/09 04:11 AM PDT

Found useful! thx

