Real world use case for JSON in RDBMS PostgreSQL
The NoSQL movement incepted in 2009 and has been blooming ever since, with increasing variety of colors and flavors. Some of the NoSQL vendors claim that any kind of application can use their NoSQL engine as the underlying database. I argue that though possible, it is unadvisable, and choosing a database engine should be one of the most carefully considered decisions a head of R&D is in charge of.
In this short post I would like to share a use case where one of our customers had several massive tables both horizontally (hundreds of columns) and vertically (billons of records). These tables contain data coming from millions of sensors that are spread worldwide. Every sensor has about 10-20 metrics it sends. Because data from all sensors is stored in the same table, there was a set of columns designated for each sensor-type.
A DBA keen on harnessing the power of NoSQL database would have arguably recommended using a wide-column-store such as Cassandra to host this data model.
After carefully considering the implication of changing a database technology and the customer’s needs; I decided to use two methods to overcome the overwhelming heaps of data:
Use partitioning and archiving (old data) to control the number of records in the OLTP system
Use a JSONB column to collect all the sensors’ data into one column
Using partitions with data control sped up queries by a factor of 7. Using one JSONB column instead of hundreds of integer/varchar/timestamp columns with mostly nulls, sped up queries by a factor of 3, and also circumvent the need to alter the tables every time a new sensor type is released to the market.
Traditional RDMBS engines are far easier to maintain than their distributed NoSQL counterparts. Though many applications do need the scale and flexibility of NoSQL – an organization must bear in mind that any distributed system, being much more complex, is harder to manage and provision; therefore, more costly.
In the case of our customer, it might be that we would need to migrate to a NoSQL solution in the future, but according to capacity planning vs computing power – we are good to go with partitioning and JSON for several years. We have managed to come up with a good solution with minimal changes to the application and minimal costs which made the customer very happy with the result.