At a time when cloud costs keep rising, I want to outline how our data team ran the infrastructure of our energy trading company during its first year on only two PostgreSQL database instances, each with just two CPU cores and 4 GB of RAM. Despite this minimal use of resources, internal business processes and analytics dashboards .
TL;DR: DuckDB has fully parallelised range joins that can efficiently join millions of range predicates. Range intersection joins are an important operation in areas such as temporal analytics, and occur when two inequality conditions are present in a join predicate. Database implementations often rely on slow O(N^2) algorithms that compare every pair of rows for these operations. Instead, DuckDB leverages its fast sorting logic to implement two highly optimized parallel join operators for these kinds of range predicates, resulting in 20-30x faster queries. With these operators, DuckDB can be used effectively in more time-series-oriented use cases.
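To make the contrast concrete, here is a minimal Python sketch of a range intersection join done two ways: by comparing every pair of rows, and by sorting first and sweeping. This is not DuckDB's implementation, only an illustration of why sorting helps. The function names `nested_loop_join` and `sweep_join` are made up for this example, and intervals are assumed to be half-open `(start, end)` tuples with `start < end`.

```python
def overlaps(l, r):
    # Two inequality conditions in the join predicate: a range intersection.
    return l[0] < r[1] and r[0] < l[1]


def nested_loop_join(left, right):
    # O(N^2) baseline: compare every pair of rows.
    return [(l, r) for l in left for r in right if overlaps(l, r)]


def sweep_join(left, right):
    # Sort all intervals by start; on ties, left-side intervals come first.
    events = sorted(
        [(iv[0], 0, iv) for iv in left] + [(iv[0], 1, iv) for iv in right]
    )
    active = ([], [])  # intervals still "open" on the left and right sides
    out = []
    for start, side, iv in events:
        other = 1 - side
        # Drop intervals on the other side that ended at or before this start.
        active[other][:] = [o for o in active[other] if o[1] > start]
        # Every interval still active on the other side overlaps this one.
        for o in active[other]:
            out.append((iv, o) if side == 0 else (o, iv))
        active[side].append(iv)
    return out


left = [(1, 5), (6, 9)]
right = [(4, 7), (9, 12)]
pairs = sweep_join(left, right)
```

Both functions return the same set of `(left, right)` pairs, possibly in a different order. The sweep pays for one sort up front and then only looks at intervals that are still open, so it avoids most of the pairwise comparisons when few ranges overlap at any one time.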