273

For pagination purposes, I need to run a query with the LIMIT and OFFSET clauses. But I also need a count of the number of rows that would be returned by that query without the LIMIT and OFFSET clauses.

I want to run:

SELECT * FROM table WHERE /* whatever */ ORDER BY col1 LIMIT ? OFFSET ? 

And:

SELECT COUNT(*) FROM table WHERE /* whatever */ 

At the same time. Is there a way to do that, particularly a way that lets Postgres optimize it, so that it's faster than running both individually?


4 Answers

370

Yes. With a simple window function.

Add a column with the total count

SELECT *, count(*) OVER() AS full_count
FROM   tbl
WHERE  /* whatever */
ORDER  BY col1
OFFSET ?
LIMIT  ?
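For illustration, here is a minimal sketch of how this might look for one concrete page; the table and column names (items, status, created_at) are made up for the example and not part of the original answer:

-- Hypothetical example: page 3 with 20 rows per page
SELECT id, title,
       count(*) OVER() AS full_count  -- total rows matching the WHERE clause, before LIMIT/OFFSET
FROM   items
WHERE  status = 'published'
ORDER  BY created_at DESC, id
LIMIT  20
OFFSET 40;  -- (page_number - 1) * page_size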

Be aware that the cost will be substantially higher than without the total number. Postgres has to actually count all qualifying rows either way, which imposes a cost depending on the total number of qualifying rows.

Two separate queries (one for the result set, one for the total count) may or may not be faster. But the overhead of executing two separate queries and processing results often tips the scales. Depends on the nature of the query, indexes, resources, cardinalities ...

However, as Dani pointed out, when OFFSET is at least as great as the number of rows returned from the base query, no rows are returned. So we get no full_count, either. If that's a rare case, just run a second query for the count when it happens.
If that's not acceptable, here is a single query always returning the full count, with a CTE and an OUTER JOIN. This adds more overhead and only makes sense for certain cases (expensive filters, few qualifying rows).

WITH cte AS (
   SELECT *
   FROM   tbl
   WHERE  /* whatever */
   -- ORDER BY col1  -- ①
   )
SELECT *
FROM  (
   TABLE  cte
   ORDER  BY col1
   LIMIT  ?
   OFFSET ?
   ) sub
RIGHT  JOIN (SELECT count(*) FROM cte) c(full_count) ON true;

① Typically it does not pay to add (the same) ORDER BY in the CTE. That forces all rows to be sorted. With LIMIT, typically only a small fraction has to be sorted (with "top-N heapsort").
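If you want to verify that on your own data, comparing plans is a sketch-level way to do it; a limited sort shows up as "top-N heapsort" in EXPLAIN ANALYZE output (the placeholders below follow the answer's template and need to be filled in):

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   tbl
WHERE  /* whatever */
ORDER  BY col1
LIMIT  20 OFFSET 40;
-- Look for "Sort Method: top-N heapsort" (rather than a full sort of all qualifying rows) in the Sort node.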

If OFFSET is too big, you get one row of null values with the full_count appended. Otherwise, the count is appended to every row like in the first query.

If a row with all null values is a possible valid result you have to check offset >= full_count to disambiguate the origin of the empty row.
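If you want that check computed by the same query, here is a sketch (the offset_out_of_range alias and the extra, repeated OFFSET parameter are my additions, not part of the original answer):

WITH cte AS (
   SELECT *
   FROM   tbl
   WHERE  /* whatever */
   )
SELECT sub.*, c.full_count,
       ?::bigint >= c.full_count AS offset_out_of_range  -- bind the same value as the OFFSET parameter
FROM  (
   TABLE  cte
   ORDER  BY col1
   LIMIT  ?
   OFFSET ?
   ) sub
RIGHT  JOIN (SELECT count(*) FROM cte) c(full_count) ON true;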

This still executes the base query only once. But it adds more overhead to the query and only pays if that's less than repeating the base query for the count.

Either way, the total count is returned with every row (redundantly). Doesn't add much cost. But if that's an issue, you could instead ...

Add a row with the total count

The added row must match the row type of the query result, and the count must fit into the data type of one of the columns. A bit of a hack. Like:

WITH cte AS (
   SELECT col1, col2, int_col3
   FROM   tbl
   WHERE  /* whatever */
   )
SELECT null AS col1, null AS col2, count(*)::int AS int_col3  -- maybe cast the count
FROM   cte
UNION ALL
(  -- parentheses required
TABLE  cte
ORDER  BY col1
LIMIT  ?
OFFSET ?
);

Again, sometimes it may be cheaper to just run a separate count (still in a single query!):

SELECT null AS col1, null AS col2, count(*)::int AS int_col3
FROM   tbl
WHERE  /* whatever */
UNION ALL
(  -- parentheses required
SELECT col1, col2, int_col3
FROM   tbl
WHERE  /* whatever */
ORDER  BY col1
LIMIT  ?
OFFSET ?
);

About the syntax shortcut TABLE tbl, see: stackoverflow.com/a/30276023/939860


17 Comments

With both LIMIT and conditions we have rows to return, but with the given OFFSET the query returns no result. In that situation, how would we be able to get the row count?
@julealgon: Please start a new question with the defining details. You can always link to this one for context and leave a comment here to link back (and get my attention) if you wish.
For anyone wondering: if you also want to limit the COUNT(*) done over the view, for example when you have a huge table and want to prevent counting everything beyond a certain number, you can use COUNT(*) OVER(ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING), where 1000 is the number at which the count stops regardless of whether your query (without the LIMIT) would return even more rows.
@JustinL.: The added overhead should only be significant for relatively cheap base queries. Also, Postgres 12 has improved CTE performance in multiple ways. (Though this CTE is still MATERIALIZED by default, being referenced twice.)
@user2959071: About TABLE, see stackoverflow.com/a/30276023/939860
21

While Erwin Brandstetter's answer works like a charm, it returns the total count of rows in every row, like the following:

col1 - col2 - col3 - total
--------------------------
aaaa - aaaa - aaaa - count
bbbb - bbbb - bbbb - count
cccc - cccc - cccc - count

You may want to consider using an approach that returns total count only once, like the following:

total - rows
------------
count - [{col1: 'aaaa', col2: 'aaaa', col3: 'aaaa'},
         {col1: 'bbbb', col2: 'bbbb', col3: 'bbbb'},
         {col1: 'cccc', col2: 'cccc', col3: 'cccc'}]

SQL query:

SELECT
  (SELECT count(*)
   FROM   table
   WHERE  /* sth */
  ) AS count,
  (SELECT json_agg(t.*)
   FROM  (
      SELECT *
      FROM   table
      WHERE  /* sth */
      ORDER  BY col1
      OFFSET ?
      LIMIT  ?
      ) AS t
  ) AS rows
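One small refinement you may want, as a sketch: json_agg returns NULL over an empty page (for example when OFFSET points past the end), so the rows column can be coalesced to an empty JSON array. Otherwise this is the same query as above, keeping the answer's placeholder names:

SELECT
  (SELECT count(*)
   FROM   table
   WHERE  /* sth */
  ) AS count,
  COALESCE(
    (SELECT json_agg(t.*)
     FROM  (
        SELECT *
        FROM   table
        WHERE  /* sth */
        ORDER  BY col1
        OFFSET ?
        LIMIT  ?
        ) AS t
    ), '[]'::json   -- empty array instead of NULL when the page has no rows
  ) AS rows;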

2 Comments

You'd also need to WHERE the count(*) subquery, otherwise you'll just get the entire table count, won't you?
@BenNeill you are right, I edited the answer to include your fix.
6

edit: this answer is valid when retrieving the unfiltered table. I'll leave it here in case it helps someone, but it might not exactly answer the initial question.

Erwin Brandstetter's answer is perfect if you need an accurate value. However, on large tables you often only need a pretty good approximation. Postgres gives you just that and it will be much faster as it will not need to evaluate each row:

SELECT *
FROM  (
   SELECT *
   FROM   tbl
   WHERE  /* something */
   ORDER  BY /* something */
   OFFSET ?
   LIMIT  ?
   ) data
RIGHT JOIN (SELECT reltuples FROM pg_class WHERE relname = 'tbl') pg_count(total_count) ON true;
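As a side note (my addition, not part of the answer): matching on relname alone can pick up a same-named table in another schema, and reltuples is only an estimate maintained by VACUUM / ANALYZE. A slightly more defensive variant of the query above could look like this; the public schema name is an assumption:

SELECT *
FROM  (
   SELECT *
   FROM   tbl
   WHERE  /* something */
   ORDER  BY /* something */
   OFFSET ?
   LIMIT  ?
   ) data
RIGHT JOIN (
   SELECT reltuples::bigint             -- estimated row count, not exact
   FROM   pg_class
   WHERE  oid = 'public.tbl'::regclass  -- schema-qualified, avoids same-named tables elsewhere
   ) pg_count(total_count) ON true;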

I'm actually not quite sure whether there is an advantage to externalizing the RIGHT JOIN or keeping it in a standard query. It would deserve some testing.

SELECT t.*, pgc.reltuples AS total_count
FROM   tbl AS t
RIGHT JOIN pg_class pgc ON pgc.relname = 'tbl'
WHERE  /* something */
ORDER  BY /* something */
OFFSET ?
LIMIT  ?

1 Comment

About fast count estimates: stackoverflow.com/a/7945274/939860 Like you said: valid when retrieving the whole table - which is contradicted by the WHERE clause in your queries. The second query is logically wrong (it retrieves one row for every table in the DB) - and more expensive when fixed.
-24

No.

There's perhaps some small gain you could theoretically get over running them individually, given enough complicated machinery under the hood. But if you want to know how many rows match a condition, you'll have to count them rather than just a LIMITed subset.

