Limited SELECT count(*)

A SELECT statement with COUNT returns the number of rows retrieved by the SELECT statement (see mysql select count).

For performance reasons, the desired result is to limit that count.
Including a LIMIT clause in the SELECT statement will not work since it only restricts the number of rows returned, which is always one.
 
The solution, what I call “Limited-Count”, is done by limiting a non-count SELECT statement and wrapping it in COUNT(*).

For example:
If your count statement looks like:

SELECT COUNT(*) FROM a_table WHERE some_conditions;

You can limit the results by moving the query into:

SELECT COUNT(*) FROM (SELECT 1 FROM a_table WHERE some_conditions LIMIT x) t;

Notes:
1. COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause (this is true only for MyISAM tables. For InnoDB tables, it’s slow – thanks for Roland Bouman’s comment)
2. If you need to know how many rows your non-count SELECT statement would have returned without the LIMIT, you could consider using the information function, FOUND_ROWS(). It will fetch the total rows number without running the statement again.

MySQL Quiz
24 Comments
  1. Gerry says:

    I used a different solution to this issue, which can become a performance issue under the right conditions: use TRIGGERS w/ an auxiliary table to hold the counts. Then you can just do a ‘SELECT count WHERE some_condition’. SELECT COUNT(*) that would take several seconds (even minutes in some cases), were reduced to fractions of a second.

    My $.02
    G

  2. josh says:

    I have a hard time seeing how this is beneficial. Perhaps you can post some benchmarks and explain when/where this query is useful.

  3. Roland Bouman says:

    Hi there!

    a few comments:

    “For performance, the desired result is to limit that count.”

    Well, perhaps…but what good is a count if you put an arbitrary limit on it?

    “COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause.”

    This is true for MyISAM tables. For InnoDB tables, it’s slow.

  4. Bella says:

    nice post thank your for sharing ?

  5. school grants says:

    Pretty nice post. I just stumbled upon your blog and wanted to say that I have really enjoyed browsing your blog posts. In any case I’ll be subscribing to your feed and I hope you write again soon!

  6. Anse says:

    Alternatively, to avoid the subselect, try this one:
    SELECT IF(COUNT(*) > 100, 100, COUNT(*)) FROM a_table WHERE some_conditions

  7. Shlomi Noach says:

    Hi,

    Be careful. To use FOUND_ROWS() you need to supply with the CALC_FOUND_ROWS hint. What MySQL will do is actually calculate all found rows, and only return you with your LIMITed number.
    Performance-wise, this means you may be waiting for a query to return with 100,000 rows, even though you only expect 20 in your LIMIT clause.

    Regards

  8. Psih says:

    When your query is optimized and indexes are up and running, using FOUND_ROWS() with SQL_CALC_FOUND_ROWS slows things down when you have much data. IT slowes so hard, that 2 separate queries are 2-3 times faster. Just search the google for it.

  9. Hazan Ilan says:

    To Josh:
    This limited count is useful for showing list with a limited roller. If your Roller shows only 10 pages at a time you don’t need to perform a full count(*) query. A limited count(*) is sufficient and much more performed.

  10. Hazan Ilan says:

    To Anse,
    This will not bring the required performance improvement.
    MySQL will need to calculate the full count(*) before returning the answer.

    Ilan.

  11. Anse says:

    Yes, exactly, performance breaks as soon as there are InnoDB tables involved, or if there is a WHERE clause.

    A similar approach, although slightly different topic than discussed here, is the one used for HeidiSQL’s data grid:
    SELECT col1, … FROM table [WHERE …] [ORDER BY col1] LIMIT 0, 1000
    The crux is that COUNT(*) is slow in too many cases, so HeidiSQL avoids using it at all. Instead, chunks of 1000 rows are fetched while you scroll down the grid. If a chunk returns less than 1000 rows this is the indicator for the end of a result set.

  12. scholarships for women says:

    Keep up the good work, I like your writing.

  13. Eran Binkin says:

    Nice post.

  14. student scholarships says:

    Valuable info. Lucky me I found your site by accident, I bookmarked it.

  15. Steve says:

    Keep up the good work, I like your writing.

  16. Emily says:

    To Josh:
    This limited count is useful for showing list with a limited roller. If your Roller shows only 10 pages at a time you don’t need to perform a full count(*) query. A limited count(*) is sufficient and much more performed.

  17. Binod Suman says:

    Thanks a lot, this below SQL worked perfectly in my project. But I am still surprise why count(*) does not work with limit clause.

    Any how thank a lot of this better workaround.

    SELECT COUNT(1) FROM (SELECT 1 FROM subscriber LIMIT 18, 2000)s

    Thanks,

    Binod Suman
    Bangalore,India

  18. Hazan Ilan says:

    The Count will return only one row of result. The limit clause is limiting the number of result rows. Hence, any limit that greater than one will will still show one result which is the only result (that count(*) returns).

  19. XPerez says:

    Firts of all it’s to have good indexes, and it’s better to count a field using these indexes:

    SELECT COUNT(1) FROM users WHERE postal_code = ‘23456’

    If you have indexes on postal_code, these count takes few time.

    Also, take a look that I make count(1), not count(*), because I’m saying to mysql that I don’t need fields, only count, and the query it’s executed quickest.

    Remember that the best policy to make a count before to listing results (and paging), on large sites and large tables, it’s to have a general count in a separate table, almost for those tables that you need to make full scan (or limited where). Also, you can obtain full count on base mysql tables and indexes.

    In relation to LIMIT, on all queries it’s better to use a LIMIT:

    SELECT count(1) FROM users WHERE postal_code=’67677′ LIMIT 1

    SELECT count(1) FROM users LIMIT 1

    Even when you have 2 or 3 millions of users, these counts will be made on less that a sec.

    Remember ever to use cache in your code, and to save those queries that can take a long time.

  20. Hazan Ilan says:

    Thanks for your reply,
    Dont you think that the usage of Limit with a count query is redundant? (because the limit applies only on the result and a count results is always one)

  21. XPerez says:

    Dear Hazan,

    Yes, would be redundant, but the parser understands better the command, takes less time and assures to reserve limited memory results.

    This can be checked and tested for a 1 milion record table, selecting conditional to am indexed field, example:

    SELECT count(1) FROM cities
    WHERE country_id = 50
    LIMIT 1

  22. Stephen Kimberling says:

    Looking forward to reading more. Great blog post.Really looking forward to read more.

  23. Ashley says:

    hey, nice blog…really like it and added to bookmarks. keep up with good work

  24. Angelin says:

    Awsome job!

Leave a Reply