top of page
Shalom Shmila

TimescaleDB deep dive: unveiling the mysteries of Hypertable partitions | Part 1

A client of mine recently asked me an interesting question: When creating a Hypertable in Timescale DB, if we have multiple columns that we need to measure over time, how many partitions will be created and how will it affect the application performance? In this post, I will mainly focus on simply answering this question. Although there is more light to shed upon it which will be covered by my next articles.

Timescale DB is the most common PostgreSQL extension to manage time series data. It uses special tables called Hypertables to manage that data. Hypertables are partitioned and managed by Timescale’s internal mechanism.


How does a Hypertable work?

The best way to understand that is to create one and examine it. So let’s run a quick example:

The first stage is to create a regular empty table for time-series data. In our example we will create a table for weather samples:

When we describe the table definition in PSQL we get the following output (normal table definition):

Now let’s convert it into a TimescaleDB Hypertable, in order to do so we use the create_hypertable function and provide our table name and its time column (we did not provide any of our measurement columns):


When we describe our Hypertable in PSQL we get:


We can see that there are no new partitions. That’s because we did not insert any data yet.

There is a new trigger on our Hypertable which will execute before every insertion of a new record.

Time to add some data into our Hypertable, we will insert 5 records spread across 5 weeks time period 1 record per each week:


Now let’s see our table description again:


So what just happened?

As we can see 5 partitions (chunks) were created, why is that? Timescale’s default (but changeable) partitioning interval is 7 days, which means each week will have a different chunk.

We can now see the true behavior of the trigger on our table: It will look for a fitting partition (chunk) to place the new record in and if no such partition exists it will be created for our data record.

This also explains why the table must be empty before converting it into a Hypertable.


The answer

As we’ve seen, it does not matter how many columns we are measuring over time. When we create a Hypertable we are only required to provide a timestamp column to partition the data by.

Time partitioning improves query performance when presenting specific time slices of the data we want to measure, it does so by only accessing the corresponding partition of the requested time window (window size defaults to 1 week).

A Hypertable is always partitioned by time first, However it can also be partitioned further by additional columns. Those are called: space partitions and they will be explained in the next article.


Comments


bottom of page