MongoDB: post-discussion thoughts

It’s good to talk. In my previous post, I aired a few issues concerning MongoDB database design. There’s nothing like including a buzzword (“NoSQL”) in your blog posts to bring out comments from the readers. My thoughts are much clearer, thanks in part to the discussion – here are the key points:

It’s OK to be relational
When you read about storing JSON in MongoDB (or similar databases), it’s easy to form the impression that the “correct” approach is always to embed documents and that if you use relational associations, you have somehow “failed”. This is not true. If you feel that your design and subsequent operations benefit from relational association between collections, go for it. That’s why MongoDB provides the option – for flexibility.
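To make the two options concrete, here is a minimal sketch using plain Ruby hashes as stand-ins for MongoDB documents. The data, the `post_id` field and the `comments_for` helper are all hypothetical, chosen only for illustration; no database or driver is involved.

```ruby
# Hypothetical data: plain Ruby hashes standing in for MongoDB documents.

# Embedded design: the comments live inside the post document itself.
EMBEDDED_POST = {
  "_id" => 1,
  "title" => "MongoDB: post-discussion thoughts",
  "comments" => [
    { "author" => "alice", "body" => "Good post." }
  ]
}

# Referenced design: comments sit in their own collection and point
# back to the post through a (hypothetical) post_id field.
POSTS = [{ "_id" => 1, "title" => "MongoDB: post-discussion thoughts" }]
COMMENTS = [
  { "_id" => 10, "post_id" => 1, "author" => "alice", "body" => "Good post." },
  { "_id" => 11, "post_id" => 2, "author" => "bob",   "body" => "Unrelated." }
]

# With references, the application resolves the association itself.
def comments_for(post, comments)
  comments.select { |c| c["post_id"] == post["_id"] }
end

# Both designs can answer the same question: who commented on this post?
def embedded_authors(post)
  post["comments"].map { |c| c["author"] }
end

def referenced_authors(post, comments)
  comments_for(post, comments).map { |c| c["author"] }
end
```

Both designs answer the same question. The referenced one costs an extra lookup, but lets comments be queried and updated on their own.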

Focus on the advantages
Remember why you chose MongoDB in the first place. In my case, a big reason was easy document saves. Even though I’ve chosen two or more collections in my design, the approach still affords advantages. The documents in those collections still contain arrays and hashes, which can be saved “as is”, without parsing. Not to mention that in a Rails environment, doing away with migrations is, in my experience, a huge benefit.
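As a rough illustration of saving arrays and hashes “as is”, the sketch below round-trips a nested Ruby hash through JSON. This is an assumption-laden stand-in: MongoDB stores BSON rather than JSON, and the `RECIPE` document and `round_trip` helper are hypothetical. The point is only that the nested structure survives without a schema or a migration.

```ruby
require "json"

# Hypothetical document: nested arrays and hashes, with no schema
# declared anywhere and no migration needed to add a field.
RECIPE = {
  "name" => "Pancakes",
  "tags" => ["breakfast", "quick"],
  "ingredients" => [
    { "item" => "flour", "grams" => 200 },
    { "item" => "milk",  "grams" => 300 }
  ]
}

# JSON stands in here for the driver's BSON serialisation (an assumption
# made so the example runs without a database).
def round_trip(doc)
  JSON.parse(JSON.generate(doc))
end
```

Calling `round_trip(RECIPE)` gives back a hash equal to the original, nested arrays included.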

Get comfortable with map-reduce – now!
It’s tempting to pull records out of MongoDB and then “loop through” them, or use Ruby’s Enumerable methods (e.g. collect, inject) to process the results. If this is your approach, stop right now and read “Map-reduce basics” by Kyle Banker. Then start converting your old Ruby code to use map-reduce instead.
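For readers new to the idea, here is a minimal sketch of the shape of a map-reduce job, simulated in plain Ruby. It is not how MongoDB runs it: in MongoDB the map and reduce functions are written in JavaScript and run on the server, which is where the speed-up comes from. The `VIEWS` records and all method names below are hypothetical.

```ruby
# Hypothetical records: page views tagged with a section name.
VIEWS = [
  { "section" => "news", "hits" => 3 },
  { "section" => "blog", "hits" => 5 },
  { "section" => "news", "hits" => 4 }
]

# Map phase: emit one [key, value] pair per record.
def map_phase(records)
  records.map { |r| [r["section"], r["hits"]] }
end

# Reduce phase: combine all values emitted for one key into a single value.
def reduce_phase(_key, values)
  values.sum
end

# Group the emitted pairs by key, then reduce each group.
def map_reduce(records)
  map_phase(records)
    .group_by { |key, _value| key }
    .each_with_object({}) do |(key, pairs), out|
      out[key] = reduce_phase(key, pairs.map { |_k, v| v })
    end
end
```

Here `map_reduce(VIEWS)` totals the hits per section. When the database does this work, only the small result set crosses the wire, rather than every record.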

Before map-reduce, pages in my Rails application were taking seconds – sometimes tens of seconds – to render, when fetching only hundreds or thousands of database records. After: a second or less. That was my map-reduce epiphany and I’ll describe the details in the next blog post.

One thought on “MongoDB: post-discussion thoughts”

  1. Good post.

    RE: Relational Stuff
    IMO, even in entirely non-relational document stores, you still end up storing ids against entities (referring to other entities). There is nothing to enforce consistency, but consistency isn’t why you do it, so I don’t see that as an ‘evil’. The domain takes care of itself in this circumstance.

    RE: Advantages
    This is the *only* reason you’d opt for a “NoSQL” database over a traditional RDBMS: there is something you really want from the alternative that you can’t get from the RDBMS. I think “simplicity” comes foremost for a lot of document stores, although on more complicated systems there are just as many questions about document design, if not more.

    RE: Map/Reduce
    Yes, yes and yes. Nearly all the new-breed document stores expose map/reduce in some capacity. Not only does this make things a lot faster, it also makes you think about your documents up front by forcing you to decide how you want to access them.

    The doc-db I tend to use (RavenDB) doesn’t allow data queries unless you have defined a map/reduce index, and while this is a barrier to accessibility, it does mean that you are forced down a best-practice route from the outset.
