How to Tune “order by rand() limit 1” SQL Statement ?

There may be some business requirements that needs to retrieve only some rows from a table (or join tables) randomly. This kind of SQL is normally hard to tune.

For example the following SQL retrieve one row from two tables join of Employee and Department, where Employee’s department code is ‘AAA’, and both Emp_dept and Dpt_id columns are indexed.

select  *
  from employee ,department
where emp_dept=dpt_id
and emp_dept = ‘AAA’
order by rand()
limit 1

Here the following is the query plan of this SQL, it takes 3.126 seconds to finish. The query shows a nested loop from Department table to Employee table to extract all records with ‘AAA’ department code. An Order operation is executed followed from the join result.

You can see the most expensive step is to extract all Employee data with department code “AAA”, if there is an unique key such Employee ID (EMP_ID) which can uniquely identify a row from the query, you can use the With common table expressions in MySQL version 8 to randomly select rows from the KEY column only with the same conditions given by the original query, for example the following blue colored SQL text in “With” clause, it randomly select 1 row of EMP_ID from the join query. It not only significantly reduces the size of the data retrieved from Employee table, but it also shrank the size of Order By operation. And then the main query will use the selected EMP_ID to extract specific row from original query, so the whole query will run faster with this new rewrite syntax.

with DT1
     as (select  EMP_ID
        from   employee,
        where   emp_dept = dpt_id
         and emp_dept = ‘AAA’
        order by rand() limit 1
      select    *
      from       employee,
      where    emp_dept = dpt_id
     and emp_dept = ‘AAA’
     and EMP_ID in (select EMP_ID
             from   DT1)
      order by rand() limit 1

/* Remark:  “order by rand() limit 1” is used to make sure that only 1 row will be selected if the EMP_ID cannot uniquely identify only one row */

Here is the query plan of the rewritten SQL with less cost and run much faster.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 100 times faster than the original SQL.