I am working my way through the homework exercises, and so far I have had more success than with the warmups. Here’s what I have solved so far.
It took me far longer than it should have, and I had a very partial success; I guess my excuse is that my brain was still cold…
At least I can claim I did try to solve all the exercises; I really spent hours on this.
For the first post of this hopefully long series, I have a few notes I wrote down as I was reading Chapter 1. Nothing revolutionary, but it gives me a chance to play with math notation.
Stephen Hawking once said that his editor had warned him that each equation in his book would halve the readership.
With that in mind, and taking into account the number of readers of this blog (or lack thereof), would I dare put any equations?
You better believe it!
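For what it is worth, the editor's warning can itself be written as an equation. This is only a sketch of the arithmetic: $R_0$ (the initial readership) and $n$ (the number of equations) are symbols I am introducing here, not anything from the book.

```latex
R(n) = \frac{R_0}{2^{n}},
\qquad
R(n) < 1 \iff n > \log_2 R_0
```

In other words, under that rule the last reader is gone once the number of equations exceeds $\log_2 R_0$.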
Third, last and quite short day with Neo4j. Today on the menu: transactions, replication, and backups.
Transactions are a standard feature of relational databases, but NoSQL databases seem to consider them too costly (of the other databases in the book, only HBase and Redis also support transactions, as far as I can tell). Neo4j does support them, along with rollbacks.
Replication is Neo4j’s answer for high availability and, to some extent, scaling. The latter is limited because Neo4j does not partition the data, so everything has to fit on each computer in the cluster.
Finally, backups are exactly what you would expect them to be. Neo4j offers both full backups and incremental ones, which update a previous backup.
Today we play further with Neo4j, exploring the REST API, indexes, and algorithms in various languages.
The REST API is always available, although it is not the easiest thing to work with. Besides what the book covers, I also learned how to extend it, and how to bypass it for large loads.
Indexing can be manual, as the book shows, or automatic (although the documentation warns this is still an experimental feature).
Finally, the algorithms are mostly provided by an external library, JUNG, so using them requires direct access to the data, bypassing the server.
As the book is still in beta and incomplete, I skip CouchDB (the chapter is not there yet in beta 2.0), and will spend this week with Neo4j.
Neo4j is a graph database, meaning it focuses on navigation between vertices (called nodes in Neo4j) through edges (called relationships). While other databases make it possible to join various pieces of data, Neo4j treats this as the main semantic mechanism.
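To make the idea of navigating relationships concrete, here is a minimal in-memory sketch in plain Python. It does not use Neo4j or its API at all; the data, the relationship types (`KNOWS`, `LIKES`) and the `reachable` helper are all hypothetical names made up for illustration.

```python
from collections import deque

# Hypothetical toy graph: each node maps to its outgoing
# relationships, stored as (relationship_type, target_node) pairs.
relationships = {
    "alice": [("KNOWS", "bob")],
    "bob": [("KNOWS", "carol"), ("LIKES", "wine")],
    "carol": [("LIKES", "wine")],
    "wine": [],
}


def reachable(start, rel_type, max_depth):
    """Return the nodes reachable from `start` by following only
    relationships of type `rel_type`, within `max_depth` hops,
    in breadth-first order."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the hop limit
        for kind, target in relationships[node]:
            if kind == rel_type and target not in seen:
                seen.add(target)
                found.append(target)
                queue.append((target, depth + 1))
    return found


print(reachable("alice", "KNOWS", 2))
```

The point of the sketch is that the query is expressed as "start here and follow these relationships", rather than as a join between tables.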
Final day with MongoDB. First to cover geospatial indexing; then to explore MongoDB’s approach to the CAP theorem.
Today the book covers all kinds of query goodness in MongoDB: indexing, advanced group queries, and MapReduce.
Fittingly, this first day is about CRUD and queries.