MongoDB, Mongoid and Map/Reduce

I’ve been experimenting with MongoDB’s map-reduce, called from Ruby, as a replacement for Ruby’s Enumerable methods (map/collect, inject). It’s faster. Much faster.

Next, the details but first – the disclaimers:

  • My database is not optimised (using, e.g. indices)
  • My non-map/reduce code is not optimised; I’m sure there are better ways to perform database queries and use Enumerable than those in this post
  • My map/reduce code is not optimised – these are my first attempts

In short nothing is optimised, my code is probably awful and I’m making it up as I go along. Here we go then!
Read the rest…

MongoDB: post-discussion thoughts

It’s good to talk. In my previous post, I aired a few issues concerning MongoDB database design. There’s nothing like including a buzzword (“NoSQL”) in your blog posts to bring out comments from the readers. My thoughts are much clearer, thanks in part to the discussion – here are the key points:

It’s OK to be relational
When you read about storing JSON in MongoDB (or similar databases), it’s easy to form the impression that the “correct” approach is always to embed documents and that if you use relational associations, you have somehow “failed”. This is not true. If you feel that your design and subsequent operations benefit from relational association between collections, go for it. That’s why MongoDB provides the option – for flexibility.

Focus on the advantages
Remember why you chose MongoDB in the first place. In my case, a big reason was easy document saves. Even though I’ve chosen two or more collections in my design, the approach still affords advantages. The documents in those collections still contain arrays and hashes, which can be saved “as is”, without parsing. Not to mention that in a Rails environment, doing away with migrations is, in my experience, a huge benefit.

Get comfortable with map-reduce – now!
It’s tempting to pull records out of MongoDB and then “loop through” them, or use Ruby’s Enumerable methods (e.g. collect, inject) to process the results. If this is your approach, stop, right now and read Map-reduce basics by Kyle Banker. Then, start converting all of your old Ruby code to use map-reduce instead.

Before map-reduce, pages in my Rails application were taking seconds – sometimes tens of seconds – to render, when fetching only hundreds or thousands of database records. After: a second or less. That was my map-reduce epiphany and I’ll describe the details in the next blog post.