Tag: cluster

MapReduce for Ruby: Ridiculously Easy Distributed Programming

Posted on 21-Aug-2006

Google’s MapReduce is now available for Ruby (via gem install starfish). MapReduce is the technique Google uses to do monstrous distributed programming over 30-terabyte files.

Here is the basic code that will get you up and running with MapReduce in Starfish.

    # item.rb
    ActiveRecord::Base.establish_connection(
      :adapter  => "mysql",
      :host     => "localhost",
      :username => "root",
      :password => "",
      :database => "some_database"
    )

    class Item < ActiveRecord::Base; end

    server do |map_reduce|
      map_reduce.type = Item
    end

    client do |item|
      logger.info item.id
    end

Now just run:

    starfish item.rb

and Starfish takes care of the rest. The code above does the following:

  • The server grabs all the items via: Item.find(:all)
  • Each of the clients grabs an item from the collection
  • When there are no more items to be grabbed, everything shuts down
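To make the three steps above concrete, here is a minimal single-process sketch of the same pattern, using threads and a shared queue. It is not how Starfish is implemented; the Item struct, build_queue, and run_clients are hypothetical names made up for this illustration, and no database is involved.

```ruby
require "thread" # Queue lives here on old Rubies; harmless on modern ones

# Hypothetical stand-in for the ActiveRecord model; only carries an id.
Item = Struct.new(:id)

# "Server" step: put the whole collection on a shared queue
# (plays the role of Item.find(:all) in the Starfish example).
def build_queue(items)
  queue = Queue.new
  items.each { |item| queue << item }
  queue
end

# "Client" step: each worker thread grabs one item at a time and hands it
# to the given block. A worker stops once the queue is empty.
# Returns the ids of all processed items (order is not guaranteed).
def run_clients(queue, client_count, &handler)
  processed = Queue.new
  workers = Array.new(client_count) do
    Thread.new do
      loop do
        begin
          item = queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break                  # nothing left to grab, so this client shuts down
        end
        handler.call(item)
        processed << item.id
      end
    end
  end
  workers.each(&:join)
  Array.new(processed.size) { processed.pop }
end
```

For example, run_clients(build_queue(items), 3) { |item| puts item.id } would print each id exactly once, spread across three workers. Starfish does the same kind of hand-off across separate processes and machines rather than threads.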

Just add REST (which comes by default with Edge Rails) and you’ll have your own S3 or GDrive for free ;)

Mongrel Clustering

Posted on 24-May-2006

mongrel_cluster makes it easy to manage multiple Mongrel processes behind a reverse-proxy server and load balancer such as Pound, Balance, Lighttpd, or Apache.
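As a rough picture of that setup, the sketch below assumes the cluster runs its Mongrel processes on consecutive ports starting from a base port, and that the proxy hands requests to them in turn. The backend_ports helper and the RoundRobin class are hypothetical, written only for this illustration; they are not part of mongrel_cluster or of any of the proxies named above, which each have their own balancing strategies.

```ruby
# Assumption: N Mongrel processes listen on consecutive ports,
# e.g. base port 8000 with 3 servers gives 8000, 8001, 8002.
def backend_ports(base_port, servers)
  (base_port...(base_port + servers)).to_a
end

# Hypothetical round-robin chooser standing in for the load balancer:
# each call returns the next backend port, wrapping around at the end.
class RoundRobin
  def initialize(backends)
    @backends = backends
    @index = 0
  end

  def next_backend
    backend = @backends[@index % @backends.size]
    @index += 1
    backend
  end
end
```

With three backends, four consecutive calls to next_backend visit each port once and then return to the first.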

See also:

The adventures of scaling

Posted on 28-Mar-2006

Must-read articles for everybody doing RoR systems administration:

03-Apr-2006 Update: