Monday, January 18, 2016

SQL or not SQL?

So much propaganda I have read in the Web about the so-called NoSQL databases that I would like here to say something about this. I will assume that the reader have knowledge of relational databases.

In general, every piece of data needs a non-random structure, otherwise it can hardly represent useful information (worth to be stored and then retrieved for some human purpose). Considering the client-server paradigm, the structure can be imposed to data in the client, in the server or both. Relational databases have the ability to relate data within tables and these can be related to other tables within the same database (server). However, the relational schema can have a minimal relation description of a key-value class. For example, we can define single documents into relational databases by creating a table like this (assuming a maximum document length of 8 kilobytes, which of course can be extended):

CREATE TABLE docs (id INTEGER PRIMARY KEY, docdata VARCHAR(8192))

and then we can store XML or JSON documents into the docdata field, which can be interpreted in the clients of the respective database. Therefore, relational databases can be perfectly used as document-oriented database, provided an appropriate schema. Unfortunately, this quite obvious application of relational databases is normally omitted by proponents of the so-called NoSQL databases, promoting a lot of misunderstanding of what a relational database can or can not do.

On the other hand, the wrongly-termed NoSQL or document-oriented databases can store documents in such a way that the SQL language is avoided, and also providing sometimes automatic parsing or processing of the XML or JSON content. These document formats have become very popular among web developers and other computer-programming enthusiasts and thus there are many software packages that can be used to process these formats. 

In conclusion:
  1. NoSQL databases should be called “client-side structured databases” because the interpretation of the structure of the data is usually left as a task for the client, not the database itself (server) 
  2. In SQL databases the structure of the data is explicit on the database itself (in the form of relations of formal first-order logic, with SQL as the query language). In NoSQL databases the structure is less formally (and thus more flexible) defined in document formats like XML or JSON that are normally parsed or processed by the client
  3. Relational databases can also be used to store/retrieve formatted documents (e.g. XML and JSON)
  4. The real and tangible advantages of NoSQL databases comes from the consequence of storing unstructured data, which can be more easily distributed among servers. Also, that the structure is left to be interpreted to the client in most cases, makes these more scalable. However, relational databases are much more secure for the integrity of the data. 
Using SQLite

No comments:

Post a Comment