Along with cloud computing, another buzzword is getting a lot of attention: NoSQL. The term is relatively new (it was introduced in 1998), but the concept already works for giants such as Facebook and Amazon, and for Google's BigTable project.
Initially the term was applied to a lightweight open source database without a SQL interface. It is sometimes also called ‘NoREL’, because the main characteristic of this class of databases is a fresh look at the relational model, one that meets the current needs of big databases such as those of eBay or Digg, which are 3 TB each.
*
A typical relational database such as MS SQL Server or Oracle will be slow for workloads that involve frequent but small read-write operations over large ranges of data with a floating structure. That is the case for Facebook pages, which have lots of optional parameters that are almost never filled in completely.
So let’s list the pros and cons of this kind of database and then review a few implementations.
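To make the "floating structure" point concrete, here is a minimal sketch in plain Python with no real database involved. The field names and the `as_row` and `filled_ratio` helpers are invented for illustration. The idea: a relational table needs a column for every optional parameter, so sparsely filled profiles end up mostly empty, while a document-style record keeps only what was actually entered.

```python
# Hypothetical profile fields; a relational table would need one column per field.
COLUMNS = ["name", "city", "employer", "school", "phone", "website"]

# Document-style records: each profile keeps only the fields that were filled in.
profiles = [
    {"name": "Alice", "city": "Kyiv"},
    {"name": "Bob", "website": "http://example.com", "phone": "555-0100"},
    {"name": "Carol"},
]


def as_row(profile):
    """Flatten a document into a fixed-width row, padding missing fields with None."""
    return tuple(profile.get(column) for column in COLUMNS)


def filled_ratio(rows):
    """Share of cells in the fixed-width rows that actually hold a value."""
    cells = [cell for row in rows for cell in row]
    return sum(cell is not None for cell in cells) / len(cells)


rows = [as_row(p) for p in profiles]
print(rows[2])  # ('Carol', None, None, None, None, None)
print(filled_ratio(rows))  # fraction of non-empty cells in the relational layout
```

With real profile pages the number of optional fields is much larger, so the fixed-width layout wastes proportionally more space and makes schema changes more painful.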
Pros:
- Good for storing large amounts of weakly structured data
- Scales well. Most implementations use a distributed hash table to share data between several instances of the database
- Advanced support for associative arrays or XQuery in most implementations (by the way, XML was one of the key features of SQL Server 2005 and was extended further in later versions)
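The scalability point can be sketched roughly as follows. This is a toy illustration, not how any particular product does it: the `ShardedStore` class is hypothetical, each "instance" is just an in-memory dict, and keys are routed by a simple hash modulo the instance count. Real distributed hash tables usually rely on consistent hashing so that adding an instance does not remap most keys.

```python
import hashlib


class ShardedStore:
    """Toy key/value store that spreads keys over several in-memory 'instances'."""

    def __init__(self, instance_count):
        # Each dict stands in for a separate database instance.
        self.instances = [{} for _ in range(instance_count)]

    def _instance_for(self, key):
        # A stable hash (unlike Python's salted built-in hash()) sends a key
        # to the same instance on every run.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.instances[int(digest, 16) % len(self.instances)]

    def put(self, key, value):
        self._instance_for(key)[key] = value

    def get(self, key, default=None):
        return self._instance_for(key).get(key, default)


store = ShardedStore(instance_count=3)
for user_id in range(1000):
    store.put(f"user:{user_id}", {"id": user_id})

print(store.get("user:42"))  # {'id': 42}
print([len(instance) for instance in store.instances])  # keys per instance
```

Because the client can compute the target instance from the key alone, reads and writes need no central lookup, which is what makes this approach scale out.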
Cons:
- Poor data integrity guarantees. In contrast with relational databases, which check foreign and primary keys on every operation, NoSQL databases mostly do not guarantee full integrity of the stored data; sometimes even the simplest constraints are hard to enforce
- Almost no integration with the top database frameworks (which are actually targeted at classic SQL)
- Weak aggregation support
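The integrity and aggregation drawbacks can be illustrated with a small sketch, again in plain Python with dicts standing in for a key/value store. All names here (`users`, `orders`, `put_order_checked`, `total_per_user`) are made up for the example. The point is that checks and sums a relational engine would do for you move into application code.

```python
# Two hypothetical "collections" in a key/value store, modelled as plain dicts.
users = {"u1": {"name": "Alice"}, "u2": {"name": "Bob"}}
orders = {}


def put_order(order_id, order):
    # The store accepts any value; nothing checks that order["user_id"] exists.
    orders[order_id] = order


def put_order_checked(order_id, order):
    # Application-level stand-in for a foreign key constraint.
    if order["user_id"] not in users:
        raise ValueError(f"unknown user: {order['user_id']}")
    orders[order_id] = order


def total_per_user():
    # Application-level stand-in for SUM(amount) ... GROUP BY user_id.
    totals = {}
    for order in orders.values():
        totals[order["user_id"]] = totals.get(order["user_id"], 0) + order["amount"]
    return totals


put_order("o1", {"user_id": "u1", "amount": 30})
put_order("o2", {"user_id": "u1", "amount": 12})
put_order("o3", {"user_id": "ghost", "amount": 5})  # dangling reference, stored silently

print(total_per_user())  # {'u1': 42, 'ghost': 5}
```

Calling `put_order_checked` with an unknown `user_id` raises `ValueError`, but only because the application remembered to use it; the store itself enforces nothing.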
The main types and most popular implementations are:
- Document/XML stores. This kind of database stores and operates on whole documents, which can have different internal structures. Well-known engines are MongoDB, IBM Lotus NSF and eXist
- Key/value databases, which just save key-value pairs as a plain associative array on the hard drive (BigTable, Redis, Keyspace) or in RAM (Redis, Velocity)
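- Object databases, which handle complex object structures and are closest to the object-oriented paradigm, such as InterSystems Caché and ZODB for Python. Cassandra (originally developed at Facebook) is often mentioned alongside them, although it is more accurately described as a wide-column store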
It is hard to predict the future of a market as inert as the database one; however, this new trend is becoming more and more popular and has successfully passed the test drive by the main players, so it makes sense to keep an eye on this family of products.
Read about a practical test drive of MongoDB in my next post of the NoSQL series.
* Image from http://notonlysql.com/nosql.png