PostgreSQL Columnar Store for Analytic Workloads
We are excited to open source our columnar store extension for PostgreSQL and share it with the community! Columnar stores bring notable benefits for analytic workloads, where data is loaded in batches.
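As a rough illustration of why the layout matters for analytics, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not the extension's code or on-disk format: the column names, the sample rows, and the helper functions are all hypothetical. It only counts how many values an aggregate has to touch when data is stored row by row versus column by column.

```python
# Toy model only: hypothetical columns and data, not the extension's storage format.
rows = [
    {"customer_id": "c1", "product_id": "p1", "review_rating": 5, "review_votes": 10},
    {"customer_id": "c2", "product_id": "p1", "review_rating": 3, "review_votes": 2},
    {"customer_id": "c3", "product_id": "p2", "review_rating": 4, "review_votes": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


def avg_rating_row_layout(rows):
    """Average rating when whole rows must be read to reach one field."""
    values_read = sum(len(row) for row in rows)  # every field of every row
    average = sum(row["review_rating"] for row in rows) / len(rows)
    return average, values_read


def avg_rating_column_layout(columns):
    """Average rating when only the needed column is read."""
    ratings = columns["review_rating"]
    return sum(ratings) / len(ratings), len(ratings)


columns = to_columns(rows)
row_avg, row_reads = avg_rating_row_layout(rows)
col_avg, col_reads = avg_rating_column_layout(columns)
print(f"row layout:    avg={row_avg}, values read={row_reads}")
print(f"column layout: avg={col_avg}, values read={col_reads}")
```

Both layouts return the same average, but the column layout reads one value per row instead of one value per field. The gap widens as the table gets wider, which is the intuition behind using a columnar layout for queries that aggregate over a few columns.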
This columnar store extension uses the Optimized Row Columnar (ORC) format for its data layout. ORC improves upon the RCFile format developed at Facebook, and brings the following benefits:
Further, we used the Postgres foreign data wrapper APIs and type representations with this extension. This brings:
It’s worth noting that the columnar store extension is self-contained. If you’re a PostgreSQL user, you can get the entire source code and build it using the instructions on our GitHub page. You can even join columnar store tables and regular Postgres tables in the same SQL query.
Now, let’s see how all this fits together with an example. We start by downloading customer review data from Amazon for the year 1998. In this table, each review represents an event. We find that event or fact tables, and tables with a large number of columns, are good fits for the columnar store.