It has been … a little over a week since I closed with HBase so it is time to move to the next database in the program: MongoDB. It is a so-called document oriented database, relying on Javascript and JSON like Riak but, MongoDB, unlike Riak, has a built-in support for querying, and can also perform partial updates.
Fittingly, this first day is about CRUD and queries.
CRUD in MongoDB
Unlike Riak, MongoDB makes a difference between Create and Update operations. It is not possible to insert documents with the same id twice in the same collection:
1 2 3 4 5 |
|
Like in SQL, the Read, Update and Delete operations all operate on the result of a search. And because the criteria query for a search is just a JSON document, it is possible to reuse it in different contexts. Combined with JavaScript as a query/shell language, this makes a very flexible combination.
Reaching into objects
As documents are naturally nested, it is important to know how to refer to nested attributes and how to use them in queries.
Using the JSON nesting notation is possible but does not have the expected semantic (unless you change your expectations):
1 2 |
|
The query above returns nothing, despite the fact that there is a town
whose mayor is named ‘Sam Adams’. The problem here is that this
specific notation will match documents in the collection whose mayor
attribute has the exact value { name: 'Sam Adams' }
, rather than
merely those whose mayor
attribute has a sub-attribute name
with
the Sam Adams
value (and possible other sub-attributes, a
possibility that the first query denies).
The correct form for the query is
1 2 3 |
|
This documentation from the official website was very useful to me in clarifying this aspect.
The $not
operator
I first got curious about this operator because the example from the book did not work. It generated an error in MongoDB 2.0.2 (the latest at the time of writing).
1 2 3 4 5 6 7 8 |
|
The error is
1
|
|
From the
source code,
it seems $not cannot be applied to a (implicit) equal operator. So the
book code should be using $ne
:
1
|
|
It does not help that the online shell (accessible from the ‘Try it out’ link on http://www.mongodb.org/) does not return an error for the book query. It does not return anything at all, so it is still not correct.
Still, even with 2.0.2, this query:
1 2 |
|
is accepted but does not return every country but US. Yet again it returns nothing.
As far as I can tell, this query is actually incorrect (the $not
operator should be applied to an operator, not a match), so its value
is irrelevant, but it is worrying that the parser allows this code.
The general problem is that the semantic of $not
is not exactly
the same as the similar operator from logic. So basic logic reasoning
cannot help.
I suppose this is just something to keep in mind: be wary of using
$not
.
Exercises
The online MongoDB documentation
is here.
Query using regular expressions
As explained
here,
MongoDB supports two notations: a direct JavaScript regexp object
using /.../
and slightly more verbose but more flexible $regexp
operator. The syntax for the regular expression in both cases is the
same.
MongoDB drivers
The list of drivers.
Print a JSON document containing { "hello" : "world" }
The tojson
function can display (and pretty-print) JSON documents:
1
|
|
Find a town by case insensitive regular expression search
This would be the i
option to the regular expression:
1
|
|
Find all cities whose names contain an ‘e’, and are famous for food or beer.
A good use for the
$in
operator:
1 2 3 4 |
|
or the equivalent but verbose:
1 2 3 4 |
|
Find all countries that do not export tasty bacon
To find countries that export tasty bacon, one has to use
$elemMatch
. It
should not come as a surprise that it is used here as well, but with
the dreaded $not
:
1 2 3 4 5 6 |
|
produces
1 2 |
|
(I had not deleted Canada from the database).
The seemingly equivalent
1 2 3 4 5 6 |
|
will not work. It does not return anything. See above my findings on $not
.
Create a database blogger
with a collection of articles
A new database:
Creating a new database is extremely easy: just name it when you
launch the mongo
shell command:
1
|
|
A new collection:
Like for the database, a collection is created just by inserting something in it:
1 2 3 4 |
|
Update the article with an array of comments
With the operator
$set
,
I can specify just what I need to update, not the whole document as in
Riak:
1 2 3 4 5 6 7 |
|
I can check the inserted value with db.articles.findOne()
(without
arguments it returns the first element in the collection, and pretty
prints it):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
|
Tomorrow will cover indexing, and more advanced uses of the data (such as MapReduce).