Pandemic -- because information needs to be contagious
April 10, 2009
Pandemic is a map-reduce framework. You give it the map, process, and reduce methods and it handles the rest. It's designed to serve requests in real time, but it can also be used for offline tasks.
It's different from the typical map-reduce framework in that it doesn't have a master-worker structure. Every node can map, process, and reduce. It also doesn't have the concept of jobs; everything is a request.
The framework is designed to be as flexible as possible. There is no rigid request format or API, so you can specify requests however you want. You can send it HTTP-style headers and a body, you can send it JSON, or you can even just send it a single line and have it do whatever you want. The only requirement is that you write your handler to act appropriately on the request and return the response.
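To make the "any format you like" point concrete, here is a rough sketch of the kind of parsing a handler could do before acting on a request. It is plain Ruby and does not touch Pandemic at all; parse_request and the hash it returns are names I made up for illustration.

```ruby
require 'json'

# Hypothetical helper (not part of Pandemic): guess the request format
# and return a hash describing what was found.
def parse_request(body)
  stripped = body.strip
  if stripped.start_with?('{')
    # Looks like a JSON object.
    { format: :json, payload: JSON.parse(stripped) }
  elsif body =~ /\r?\n\r?\n/
    # HTTP-style: header lines, a blank line, then the body.
    head, content = body.split(/\r?\n\r?\n/, 2)
    pairs = head.lines.map { |line| line.chomp.split(/:\s*/, 2) }
    headers = pairs.select { |pair| pair.size == 2 }.to_h
    { format: :headers, headers: headers, payload: content }
  else
    # Anything else is treated as a single line.
    { format: :line, payload: stripped }
  end
end
```

A handler's process method could call something like this first and then branch on the :format value.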
Here is how you use it:
Server
    require 'rubygems'
    require 'pandemic'

    class Handler < Pandemic::ServerSide::Handler
      def process(body)
        body.reverse
      end
    end

    pandemic_server = epidemic!
    pandemic_server.handler = Handler.new
    pandemic_server.start.join
In this example, the handler doesn't define the map or reduce methods, so the defaults are used. The default for each method is as follows:
- map: Send the full request body to every connected node
- process: Return the body (do nothing)
- reduce: Concatenate all the responses
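If it helps to see those three defaults working together, here is a tiny in-process simulation. It is only a sketch of the behavior described above, not Pandemic's implementation: SketchHandler, ReverseHandler, and simulate_request are made-up names, and the "nodes" are just strings rather than real connections.

```ruby
# Plain-Ruby stand-in for the request flow; none of these names are Pandemic's API.
class SketchHandler
  # Default map: every node receives the full request body.
  def map(body, nodes)
    nodes.each_with_object({}) { |node, plan| plan[node] = body }
  end

  # Default process: return the body unchanged.
  def process(body)
    body
  end

  # Default reduce: concatenate all the responses.
  def reduce(responses)
    responses.join
  end
end

# Mirrors the server example: only process is overridden.
class ReverseHandler < SketchHandler
  def process(body)
    body.reverse
  end
end

# Run map, then process once per node, then reduce, all in one process.
def simulate_request(handler, body, nodes)
  plan = handler.map(body, nodes)
  responses = plan.map { |_node, part| handler.process(part) }
  handler.reduce(responses)
end

puts simulate_request(ReverseHandler.new, "abc", %w[host1:4000 host2:4000])
# prints "cbacba"
```

With two nodes and the default map and reduce, each node reverses the whole body and the two results are concatenated, which is why the reversed string shows up twice.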
Client
    require 'rubygems'
    require 'pandemic'

    class TextFlipper
      include Pandemize
      def flip(str)
        pandemic.request(str)
      end
    end
Config
Both the server and client have config files:
    # pandemic_server.yml
    servers:
      - host1:4000
      - host2:4000
    response_timeout: 0.5
Each value for the server list is the host:port that a node can bind to. The servers value can be a hash or an array of hashes, but I'll get to that later. The response timeout is how long to wait for responses from nodes before returning to the client.
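To illustrate what a response timeout means in practice, here is a small sketch that collects whatever responses arrive before a deadline and drops the rest. It uses plain Ruby threads and is my own illustration, not how Pandemic does it internally; gather_responses is a hypothetical helper, and each "node" is just a lambda.

```ruby
# Hypothetical helper: run every job in its own thread and keep only the
# results that arrive within `timeout` seconds of the call.
def gather_responses(jobs, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  threads = jobs.map { |job| Thread.new { job.call } }

  threads.each_with_object([]) do |thread, responses|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Thread#join(limit) returns nil if the thread is still running.
    if thread.join([remaining, 0].max)
      responses << thread.value
    else
      thread.kill # too slow, give up on this node
    end
  end
end
```

For example, calling gather_responses with one lambda that returns immediately and one that sleeps for two seconds, with a timeout of 0.5, would return only the fast response.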
    # pandemic_client.yml
    servers:
      - host1:4000
      - host2:4000
    max_connections_per_server: 10
    min_connections_per_server: 1
    response_timeout: 1
The min/max connection settings control how many connections the client keeps open to each node. If you're using the client in Rails, just use 1 for both min and max, since it's single-threaded.
More Config
There are three ways to start a server:
- ruby server.rb -i 0
- ruby server.rb -i machine1hostname
- ruby server.rb -a localhost:4000
The first refers to the index in the servers array:
    servers:
      - host1:4000 # started with ruby server.rb -i 0
      - host2:4000 # started with ruby server.rb -i 1
The second refers to the key in the servers hash. This can be particularly useful if you use the hostname as the key.
    servers:
      machine1: host1:4000 # started with ruby server.rb -i machine1
      machine2: host2:4000 # started with ruby server.rb -i machine2
The third is to specify the host and port explicitly. Make sure the host and port you specify are actually in the config; otherwise the other nodes won't be able to communicate with it.
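The three start-up styles boil down to one lookup against the servers value. Here is a sketch of that lookup under my own assumptions: resolve_address and addresses are made-up helpers, not Pandemic's code, and they only model the array and hash layouts shown above.

```ruby
require 'yaml'

# Every host:port in the config, whether servers is an array or a hash.
def addresses(servers)
  servers.is_a?(Hash) ? servers.values : servers
end

# Hypothetical lookup: turn an -i or -a value into a host:port string.
def resolve_address(servers, index: nil, address: nil)
  if address
    # -a host:port must already be listed, or other nodes can't reach it.
    return address if addresses(servers).include?(address)
    raise ArgumentError, "#{address} is not in the servers list"
  end

  case servers
  when Array then servers.fetch(Integer(index)) # -i 0
  when Hash  then servers.fetch(index.to_s)     # -i machine1
  end
end

array_style = YAML.safe_load("servers:\n  - host1:4000\n  - host2:4000\n")
puts resolve_address(array_style['servers'], index: '1')
# prints "host2:4000"
```

Raising on an unlisted -a address is a choice I made for the sketch, so the mistake surfaces at start-up rather than as a node nobody can reach.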
You can also set node-specific configuration options.
    servers:
      - host1:4000:
          database: pandemic_node_1
          host: localhost
          username: foobar
          password: f00bar
      - host2:4000:
          database: pandemic_node_2
          host: localhost
          username: fizzbuzz
          password: f1zzbuzz
And you can access these additional options using config.get(keys) in your handler:
    class Handler < Pandemic::ServerSide::Handler
      def initialize
        @dbh = Mysql.real_connect(*config.get('host', 'username',
                                              'password', 'database'))
      end
    end
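The splat in that handler only works if the values come back in the order the keys were requested. Here is a minimal stand-in that shows the idea. NodeConfig is a made-up class, not Pandemic's config object, and the single-key behavior in particular is my assumption.

```ruby
# Hypothetical stand-in for the handler's config object; only get is modeled.
class NodeConfig
  def initialize(options)
    @options = options
  end

  # Several keys return their values in the order asked, ready for a splat.
  # Returning a bare value for a single key is an assumption of this sketch.
  def get(*keys)
    values = keys.map { |key| @options[key] }
    keys.size == 1 ? values.first : values
  end
end

config = NodeConfig.new(
  'database' => 'pandemic_node_1',
  'host'     => 'localhost',
  'username' => 'foobar',
  'password' => 'f00bar'
)

p config.get('host', 'username', 'password', 'database')
# prints ["localhost", "foobar", "f00bar", "pandemic_node_1"]
```

The key order matters because the splat passes the values positionally, so they have to line up with the host, user, password, and database arguments that Mysql.real_connect expects.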
Code: github repository
Install:
sudo gem sources -a http://gems.github.com
sudo gem install arya-pandemic