How to Tune “Order by Limit N” SQL Statement?

sql performance monitoring

There may be some business requirements that need to retrieve the first N number of rows from a join tables. Some people may encounter unexpected performance problem.

Here is an example SQL that retrieves first 1000 row from a join tables of Employee and Department in the order of dpt_id.  Where emp_dept and dpt_id columns are both indexed.

select *
     from employee ,department
where emp_dept=dpt_id
order by dpt_id
limit 1000

Here the following are the query plans of this SQL, it takes 1 mins and 8 seconds to finish. The query shows a nested loop from “Full Table Scan Employee” to “Unique Key Lookup Department” table to extract all records. An “Order by” operation of dpt_id of Department table is executed followed from this join result.

You can see the most expensive step is to extract all Employee data with Department data and then an Order by dpt_id  of Department table. Let’s see if we rewrite the original SQL text into the syntax that “order by dpt_id” is changed to “order by emp_dept”. It is a semantically eqvialent rewrite of original SQL statement since “emp_dept = dpt_id”.

select       *
from          employee,
where       emp_dept = dpt_id
order by    emp_dept
limit            1000

Here is the query plan of the rewritten SQL with less cost and run much faster. The new query plan shows that no “Use temporary; Using filesort” in Tabular Explain, it means the Nested Loop operation will be stop at the Limit 1000 records is done.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 70 times faster than the original SQL.