I have a MySQL table with about 1.76 million records, and it keeps growing. It seems that the more records it has, the slower it gets. The simple query below takes about 65 seconds to run. date_run is a timestamp field, and I wonder if that makes the query slower. Are there any settings I can tweak in the options file to make this faster?

select * from stocktrack where date(date_run) >= '2014-5-22' and date(date_run) <= '2014-5-29' 
  1. MySQL version 5.6
  2. Windows 8.1, 64-bit
  3. Intel Core i7-4770, 3.40 GHz, 12 GB RAM
  • Do you have index on your date column? Commented May 30, 2014 at 6:08
  • @AK47: An index with a leading column of date_run is a good start, but the query (as written) won't be able to use a range scan operation; the query will still need to evaluate the expression (the DATE() function) for every flipping row in the table. Commented May 30, 2014 at 6:26

2 Answers

To improve performance of this query, have a suitable index available (with date_run as the leading column in the index), and reference the "bare column" in equivalent predicates.

Wrapping the column in a function (like DATE(), as in your query) prevents the MySQL optimizer from using a range scan operation. With your query, even with an index available, MySQL does a full scan of every single row in the table each time you run that query.

For improved performance, use a predicate on the "bare" column, like this, for example:

WHERE date_run >= '2014-5-22' AND date_run < '2014-5-29' + INTERVAL 1 DAY 

(Note that when we leave out the time portion of a date literal, MySQL assumes a time component of midnight '00:00:00'. We know every datetime/timestamp value with a date component equal to '2014-05-29' is guaranteed to be less than midnight of '2014-05-30'.)
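For reference, here is a sketch of how the whole statement might look with that predicate dropped into the query from the question. The second form is an assumption on my part: it spells out the upper bound as a literal instead of using INTERVAL arithmetic, and should select the same rows.

```sql
-- Half-open range on the bare column: from midnight on May 22
-- up to (but not including) midnight on May 30.
SELECT *
  FROM stocktrack
 WHERE date_run >= '2014-05-22'
   AND date_run <  '2014-05-29' + INTERVAL 1 DAY;

-- Same range with the upper bound written out explicitly.
SELECT *
  FROM stocktrack
 WHERE date_run >= '2014-05-22 00:00:00'
   AND date_run <  '2014-05-30 00:00:00';
```

The strict "less than" on the upper bound is what keeps every row from May 29 in the result without needing DATE() on the column.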

An appropriate index is needed for MySQL to use an efficient range scan operation for this particular query. The simplest index suitable for this query would be:

... ON stocktrack (date_run) 

(Note that any index with date_run as the leading column would be suitable.)
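A minimal sketch of creating such an index is below. The index name stocktrack_ix_date_run is hypothetical (it does not come from the question), and building an index on a table of this size may take a little while.

```sql
-- Hypothetical index name; any unused name will do.
CREATE INDEX stocktrack_ix_date_run ON stocktrack (date_run);

-- List the indexes on the table to confirm the new one is there.
SHOW INDEX FROM stocktrack;
```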

The range scan operation using an index is (usually) much more efficient (and faster) on large sets, because MySQL can very quickly eliminate vast swaths of rows from consideration. Absent a range scan operation, MySQL has to check every single row in the table.

Use EXPLAIN to compare MySQL query plans, between the original and the modified.
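As a rough sketch, the comparison could look like this. The plan details described in the comments are what one would typically expect, not output captured from the asker's server; the optimizer makes the final decision based on its statistics.

```sql
-- Original form: DATE() wraps the column, so expect a full scan
-- (typically type = ALL, with rows close to the table's row count).
EXPLAIN
SELECT *
  FROM stocktrack
 WHERE DATE(date_run) >= '2014-5-22'
   AND DATE(date_run) <= '2014-5-29';

-- Modified form: bare column, so with an index on date_run the plan
-- can show type = range and name that index in the key column.
EXPLAIN
SELECT *
  FROM stocktrack
 WHERE date_run >= '2014-5-22'
   AND date_run <  '2014-5-29' + INTERVAL 1 DAY;
```

The columns worth comparing between the two outputs are type, key, and rows.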


The question you asked...

"... any tweaks to the options file ..."

The answer to that question is really going to depend on which storage engine you are using (MyISAM or InnoDB). The biggest bang for the buck comes from allocating sufficient buffer area to hold database blocks in memory, to reduce I/O... but that comes at the cost of having less memory available to whatever else is running, and there's no benefit to over-allocating memory. Questions about MySQL server tuning, beyond query performance, would probably be better asked on dba.stackexchange.com.
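Before touching the options file, it may help to check which engine the table uses and how the main buffers are currently sized. This is only a sketch: the two variable names below are my assumption about which settings "buffer area" refers to (innodb_buffer_pool_size for InnoDB, key_buffer_size for MyISAM index blocks).

```sql
-- Which storage engine does the table use?
SELECT engine
  FROM information_schema.tables
 WHERE table_schema = DATABASE()
   AND table_name   = 'stocktrack';

-- Current buffer sizes, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';  -- relevant for InnoDB
SHOW VARIABLES LIKE 'key_buffer_size';          -- relevant for MyISAM
```

Run the first statement while connected to the database that holds stocktrack, since it filters on DATABASE().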

2 Comments

Oh my goodness. I did exactly what you suggested, including indexing date_run first and ID second, and I also ran the query without the date() function. It dramatically cut the run time down to 0.63 seconds. I think the date() function is a drag. Thank you so much, everyone.
@user3690095: the DATE() function itself is very efficient. The performance issue is executing that function for every one of the 1.76 million rows in the table. Referencing the "bare" column in the predicate (WHERE clause) does avoid the call to DATE(), but the big benefit is that this enables a VERY efficient range scan operation on an index.

First, you must create an index on the date_run column, then compare like this (in case your format is 'Y-m-d H:i:s'):

select * from stocktrack where date_run >= '2014-05-22 00:00:00' and date_run < '2014-05-30 00:00:00' 

4 Comments

I can see the benefit if the date_run column is defined as a character type, if it's in a standard canonical format, in year, month, day, hour, minute, second order, in a fixed-length format (with leading zeros on month, day, etc.). But I assumed that the date_run column was defined as the TIMESTAMP datatype. For a TIMESTAMP, the substring functions don't provide much benefit.
substr and substring_index work well with the MySQL TIMESTAMP datatype (tested locally).
Can you collect the EXPLAIN output for that query, and compare it to the EXPLAIN for the OP's DATE() version, and the version in my answer, for a TIMESTAMP column? What version of MySQL are you running?
You're right: after running EXPLAIN, there's no difference between the date and substr methods!