Some thoughts on NoSQL
A problem at work led me to run a few quick queries to determine whether NoSQL was the way to go; we had some legacy code constraints and some legacy skills constraints.
- https://www.sitepoint.com/sql-vs-nosql-differences/ – a non-judgemental review comparing features, schemas and query complexity.
- https://docs.mongodb.com/manual/tutorial/getting-started/ – MongoDB’s documentation.
- http://programmers.stackexchange.com/questions/54373/when-would-someone-use-mongodb-or-similar-over-a-relational-dbms – a good discussion which talks about activity profiling, i.e. I/O and read/write rates, physical design, and the hierarchical propensity of NoSQL solutions.
- http://stackoverflow.com/questions/441441/why-should-i-use-document-based-database-instead-of-relational-database – argues that it depends upon your data. If tables are a good fit, stick with relational; if the data is recursive, variable length or highly variously typed, maybe NoSQL is best. If table size is large, index management overhead can be a problem for SQL. NoSQL stores generally don’t do joins, so if you have a multi-table problem, use SQL; if you have only one or two tables then maybe not, but see my unscientific SWAG below. Also, once the data is best implemented as a key-value pair, even if the value is a blob or text field, we are leaving the territory of the RDBMS.
- http://stackoverflow.com/questions/10087806/mongodb-access-through-sql-like-syntax – when and if to use SQL interfaces with MongoDB; the view would seem to be: don’t! If you need SQL, then you shouldn’t be using Mongo.
- http://blog.nahurst.com/visual-guide-to-nosql-systems – Nathan Hurst has a go, using CAP as a selector; he has a rather terrific graphic, see below.
- http://blog.parityresearch.com/21-nosql-innovators-to-look-for-in-2020/ – a useful NoSQL review; it seems to have lifted from Nathan Hurst’s work, and the credit link is broken.
- http://nosql-database.org/index.html claims to be Your Ultimate Guide to the Non-Relational Universe!
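To make the tables-versus-documents point from the Stack Overflow discussion above a bit more concrete, here is a minimal sketch in Python. The blog-post-with-comments data, the table names and the "post:1" key are all made up for illustration, and a plain dict stands in for a key-value store; it is not how any particular NoSQL product works.

```python
import json
import sqlite3

# Relational shape: a post and its comments live in two tables and are
# reassembled with a join. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES post(id),
        body TEXT
    );
    INSERT INTO post VALUES (1, 'Some thoughts on NoSQL');
    INSERT INTO comment VALUES (1, 1, 'Nice post'), (2, 1, 'What about joins?');
""")
rows = conn.execute(
    "SELECT p.title, c.body FROM post p "
    "JOIN comment c ON c.post_id = p.id "
    "WHERE p.id = ? ORDER BY c.id",
    (1,),
).fetchall()

# Document shape: the same data as one key-value pair. The value is just
# text (JSON here), and its variable-length parts are nested inside it,
# so reading the whole post needs no join.
store = {}  # a dict standing in for a key-value store
store["post:1"] = json.dumps({
    "title": "Some thoughts on NoSQL",
    "comments": [{"body": "Nice post"}, {"body": "What about joins?"}],
    "tags": ["nosql", "cap"],  # extra field; no schema change needed
})
doc = json.loads(store["post:1"])

print(rows)
print(doc["title"], [c["body"] for c in doc["comments"]])
```

The relational version pays for the join but can answer questions across posts; the document version reads one post in a single lookup but has to be pulled apart by the application if you want to query across documents.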
Dave’s Unscientific SWAG
From experience, I believe that 80% of an RDBMS database’s data will be held in one or two tables. The corollary is that those tables can get very big, so big as to be unmanageable and impossible to reorganise.
Don’t do triples or graphs unless there’s recursion.
A shared disk cluster is not a defence against network partition. (Are you sure?)
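On the triples-and-graphs point, this is roughly what I mean by recursion: a question whose answer needs an unknown number of hops. A small Python sketch, with made-up parts data and a hypothetical `all_parts` helper, walking (subject, predicate, object) triples:

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples for illustration.
triples = [
    ("bike", "has_part", "wheel"),
    ("wheel", "has_part", "spoke"),
    ("wheel", "has_part", "rim"),
    ("bike", "has_part", "frame"),
]


def all_parts(triples, root):
    """Return every part reachable from root via has_part, at any depth."""
    children = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == "has_part":
            children[subject].append(obj)

    found = []
    visited = {root}
    stack = [root]
    while stack:  # iterative depth-first walk; depth is not known up front
        node = stack.pop()
        for child in children[node]:
            if child not in visited:  # guards against cycles in the data
                visited.add(child)
                found.append(child)
                stack.append(child)
    return found


parts = all_parts(triples, "bike")
print(sorted(parts))
```

If every question you ask is a fixed number of hops deep, a couple of ordinary tables will do and the graph machinery is overhead.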
Here’s a great blog article, https://www.airpair.com/postgresql/posts/sql-vs-nosql-ko-postgres-vs-mongo, which I found when looking for a picture to decorate this blog article. It starts with a discussion of the CAP theorem and posits that distributed systems can’t avoid network partitions, so the CAP theorem choice is between consistency and availability. He agrees with me! Hooray. Identifying the CAP properties is a useful way of selecting databases.