How To Use 80/20 Rule To Tune A Database Application II ?

The previous article “How To Use 80/20 Rule To Tune A Database Application I “ demonstrated how the 80/20 Rule can be applied to evaluate the overall performance of SQL workload in a database. In this example, a set of 90 SQL statements retrieved from Oracle SGA is presented in a chart that lists each statement based on its resource usage in descending order, with the most resource-intensive SQL on the left. The analysis reveals that roughly 14.44% of the SQL statements consume 80% of the total elapsed time, while 21.11% of the SQL statements consume 80% of the total CPU time, indicating that the SQL workload distribution aligns well with the 80/20 rule. Therefore, tuning the SQL may not be necessary since it is unlikely to result in significant performance improvements.

However, to further optimize the database performance cost-effectively, it is recommended to conduct an in-depth investigation of the top 20% of high workload SQL statements. This will reveal that the resource utilization drops steeply in the first few SQL statements, making them the most critical candidates for optimization.

Let’s aim to reduce the proportion of the total resource consumption from 80% to 60% and examine the SQL statements that are responsible for utilizing the resources. The results are interesting and reveal that 3 SQL statements account for 60% of the elapsed time, 6 SQL statements account for 60% of the CPU time, and only one SQL statement accounts for 60% of the disk reads. By focusing on these SQL statements, it is possible to enhance up to 60% of the database workload. For instance, if the database is experiencing an IO bottleneck, concentrating on the one SQL statement could yield savings of up to 60% on disk reads.

You can utilize Excel to conduct a simulation of the 80/20 rule analysis described above, providing a comprehensive overview of the distribution of the SQL workload. This approach facilitates a rapid evaluation of the overall health of the database’s SQL performance, as well as the associated costs and benefits of optimizing high workload SQL statements. Furthermore, the SQL resource spectrum analysis is integrated into our Tosska DB Ace for Oracle software.

Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

DBAO Inspect SQL – YouTube

How To Use 80/20 Rule To Tune A Database Application I ?

In 2002, Microsoft reported that the majority of errors and crashes in their software were caused by a minority of the bugs they detected, specifically about 20%. This phenomenon, known as the Pareto principle or the eighty-twenty rule, is also relevant to the analysis of database SQL behavior. It is assumed that only 20% of SQL statements should account for 80% of the resource consumption of the entire database system. Any application system in which less than 20% of SQL statements consume more than 80% of database resources is considered abnormal and should be reviewed and optimized.

Using a SQL resource spectrum analysis technique, users have the ability to set a resource threshold percentage, typically around 80%, to pinpoint the top M% of SQL statements that are responsible for the majority of resource consumption. If the analysis reveals that only a few SQL statements, such as only 4%, are responsible for over 80% of the total resource consumption, it suggests that optimizing these specific SQL statements has the potential to significantly enhance the performance of the entire database.

The presented example displays a collection of 90 SQL statements obtained from Oracle SGA, with each statement listed in a chart based on its resource usage, arranged in descending order starting from the highest consumption SQL on the left-hand side. The analysis reveals that approximately 14.44% of the SQL statements account for 80% of the total elapsed time, while 21.11% of the SQL statements account for 80% of the total CPU time. This indicates that the distribution of the SQL workload aligns closely with the 80/20 rule. Therefore, there may not be a pressing need for SQL tuning as it is unlikely to result in significant performance improvements.

Another example highlights a significant deviation from the 80/20 rule in the distribution of the SQL workload. The analysis shows that around 4 SQL statements are accountable for 80% of the total elapsed time, while another 4 SQL statements are responsible for 80% of the total CPU time, and 2 SQL statements contribute to 80% of the total disk reads. This indicates that by optimizing just these 4 SQL statements, it is possible to achieve a substantial improvement in the overall performance of the database.

You can utilize Excel to conduct a simulation of the 80/20 rule analysis described above, providing a comprehensive overview of the distribution of the SQL workload. This approach facilitates a rapid evaluation of the overall health of the database’s SQL performance, as well as the associated costs and benefits of optimizing high workload SQL statements. Furthermore, the SQL resource spectrum analysis is integrated into our Tosska DB Ace for Oracle software.

Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

DBAO Inspect SQL – YouTube

How to Tune SQL with DB Link for Oracle II ?

Here is an example SQL,  the query is retrieving employee, department, and grade tables from the remote database @richdb

SELECT   *
FROM     emp_subsidiary@richdb a,
         department@richdb,
         grade@richdb
WHERE    emp_grade < 1200
         AND emp_dept = dpt_id
         AND emp_grade = grd_id
ORDER BY emp_id

Here the following is the query plan of this SQL, it takes 15.92 seconds to finish.  The first step of the query plan is ‘SELECT STATEMENT REMOTE’, it means the entire query will be execute on the remote database @richdb and the result will be sent back to the local database. The query plan is a little bit complicated and not easy to tell it is optimal or not. But one thing we can try if the query is partially executed in the local database @local.

In order to request that Oracle perform certain join operations in the local database, the SQL query must include at least one table that is executed in the local database. This allows the use of the hint /*+ DRIVING_SITE ( [ @ queryblock ] tablespec ) */ in the SQL query. If no tables are explicitly executed in the local database, there is no means to request that Oracle attempt to perform join operations in the local database.

Let’s added a dummy condition “EXISTS (SELECT ‘X’ FROM DUAL)” and a hints /*+ DRIVING_SITE(DUAL) */ to the SQL to force Oracle to execute some join operations in the local database.

SELECT   /*+ DRIVING_SITE(DUAL) */ *
FROM     emp_subsidiary@richdb a,
         department@richdb,
         grade@richdb
WHERE    emp_grade < 1200
         AND emp_dept = dpt_id
         AND emp_grade = grd_id
         AND EXISTS ( SELECT ‘x’
                      FROM   dual)
ORDER BY emp_id

Below is the query plan for the modified SQL, which takes 4.08 seconds and is approximately 4 times faster than the original SQL statement where only one join operation is performed in the remote database.

Adding an ORDERED hint to the SQL query can result in further optimization. This will break down the compound statement highlighted in the previous query plan into individual table data remote extraction, as shown in the following query plan.

SELECT   /*+ DRIVING_SITE(DUAL) ORDERED */ *
FROM     emp_subsidiary@richdb a,
         department@richdb,
         grade@richdb
WHERE    emp_grade < 1200
         AND emp_dept = dpt_id
         AND emp_grade = grd_id
         AND EXISTS ( SELECT ‘x’
                      FROM   dual)
ORDER BY emp_id

If you are familiar with Oracle Exadata, you may notice that the data retrieval process for REMOTE tables in remote database @richdb works similarly to that of the Exadata Storage Server.

It is important to remember that applying this technique to SQL queries with a DB Link is only beneficial in certain environments. For instance, it is ideal when the network speed is good, data traffic is not heavy, and the workload on the local database is low.

Tosska DB Ace for Oracle can automatically perform this type of rewrite, resulting in an SQL query that runs almost 10 times faster than the original.

Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

DBAO Tune DB Link SQL – YouTube

How to Tune SQL with DB Link for Oracle I?

Here is an example SQL query used to calculate the average salary of employees on the remote database @richdb in each department in the local database whose department name starts with the letter “D”.

SELECT   Avg(emp_salary),
         emp_dept
FROM     employee@richdb
WHERE    emp_dept IN (SELECT dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE ‘D%’)
GROUP BY emp_dept

Here the following is the query plan of this SQL, it takes 9.16 seconds to finish.  The query plan shows a Nested Loops from DEPARTMENT in local to EMPLOYEE in the remote database. Due to the size of the EMPLOYEE table being much larger than that of the DEPARTMENT table, the nested loop join path is not optimal in this case.

To ask Oracle to consider doing the join operation in the remote database @richdb, I added a Hint  /*+ DRIVING_SITE(employee) */ to tell Oracle to use EMPLOYEE table’s database @richdb as the driving site for the distributed query.

SELECT   /*+ DRIVING_SITE(employee) */ Avg(emp_salary),
         emp_dept
FROM     employee@richdb
WHERE    emp_dept IN (SELECT dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE ‘D%’)
GROUP BY emp_dept

The following query shows the driving site is changed to @richdb and remote retrieves DEPARTMENT data from the “local” database. Now the speed is improved to 5.94 seconds. But the query plan shows a little bit complicated, there is a view that is construed by a Hash Join of two “index fast full scan” of indexes from EMPLOYEE and DEPARTMENT.

I further change the SQL and added a dummy operation Coalesce(dpt_id,dpt_id) in the select list of the subquery to block the index fast full scan of the DEMPARTMENT table.

SELECT   /*+ DRIVING_SITE(employee) */ Avg(emp_salary),
         emp_dept
FROM     employee@richdb
WHERE    emp_dept IN (SELECT Coalesce(dpt_id,dpt_id)
                      FROM   department
                      WHERE  dpt_name LIKE ‘D%’)
GROUP BY emp_dept

The change gives the SQL a new query plan shown in the following, the performance significantly improved to 0.71 seconds. You can learn how the dummy operation Coalesce(dpt_id,dpt_id) affected the Oracle SQL optimizer decision in this example.

This kind of rewrite can be achieved by Tosska DB Ace for Oracle automatically, it shows that the rewrite is almost 13 times faster than the original SQL.

Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

How to Tune SQL with IN Subquery with Intersect for Oracle?

Here is an example SQL that retrieves data from EMPLOYEE and DEPARTMENT table with the employee’s grade code in the GRADE table.

SELECT emp_id,
       emp_name,
       dpt_name
FROM   employee,
       department
WHERE  emp_dept = dpt_id
       AND emp_grade IN (SELECT grd_id
                         FROM grade
                         WHERE grd_min_salary < 200000)
and emp_dept < ‘D’

Here the following is the query plan of this SQL, it takes 8.3 seconds to finish. The query plan shows a Hash Join with GRADE and EMPLOYEE and then hash join to DEPARTMENT. It looks like Oracle gave up any Nested Loops operations after the actual number of rows is returned from the GRADE table in this adaptive plan.

In order to ask Oracle to consider the Nested Loops operations, I added an extra Intersect operation in the subquery to rapidly narrow down the result set of grd_id returned from the GRADE table first.

SELECT emp_id,
       emp_name,
       dpt_name
FROM   employee,
       department
WHERE  emp_dept = dpt_id
       AND emp_grade IN (SELECT grd_id
                         FROM   grade
                         WHERE  grd_min_salary < 200000                          INTERSECT SELECT e1.emp_grade
                                   FROM employee e1
                                   WHERE emp_dept < ‘D’)
       AND emp_dept < ‘D’

The rewritten SQL generates a query plan that is entirely different from the original query plan, The new plan is using “Nested Loops” from DEPARTMENT to EMPLOYEE as the first steps and then Hash Join to the GRADE table. The new plan now takes 0.81 seconds only.


This kind of rewrite can be achieved by Tosska SQL Tuning Expert Pro for Oracle automatically, it shows that the rewrite is more than 10 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL Statement with CASE Expression by Hints Injection for Oracle?

Here the following is a simple SQL statement with a CASE expression syntax.

SELECT *
FROM   employee
WHERE
      CASE
      WHEN emp_salary< 1000
      THEN  ‘low’
      WHEN emp_salary>100000
      THEN  ‘high’
      ELSE  ‘Normal’
      END = ‘low’

Here the following are the query plans of this SQL, it takes 4.64 seconds to finish. The query shows a Full Table Scan of the EMPLOYEE table due to the CASE expression cannot utilize the emp_salary index. It is because the CASE statement disabled the index range search of the emp_salary index.

Commonly, we will try to enable index search by forcing the SQL with an Index hint as the following:

SELECT/*+ INDEX(@SEL$1 EMPLOYEE) */ *
FROM   employee
WHERE CASE
      WHEN emp_salary < 1000
      THEN  ‘low’
      WHEN emp_salary > 100000
      THEN  ‘high’
      ELSE  ‘Normal’
     END = ‘low’

Although the CASE statement disabled the index range search of the emp_salary index, an index full scan is now enabled to help filter the result more quickly compared with the original full table scan of the EMPLOYEE table.

This hint injection takes 0.38 seconds and it is 12 times faster than the original SQL will full table scan. For this kind of SQL statement that you cannot change your source code, you can use SQL Patch with the hints and SQL text deployed to the database without the need of changing your source code.

If you can modify your source code, the best performance will be to rewrite the CASE expression into the following syntax with multiple OR conditions.

SELECT *
FROM   employee
WHERE emp_salary < 1000
     AND ‘low’ = ‘low’
     OR NOT  ( emp_salary < 1000 )
        AND  emp_salary > 100000
        AND  ‘high’ = ‘low’
     OR NOT  ( emp_salary < 1000
           OR emp_salary > 100000 )
        AND  ‘Normal’ = ‘low’

The new query plan shows an INDEX RANGE SCAN OF emp_salary index.

This kind of rewrite and hints injection can be achieved by Tosska SQL Tuning Expert Pro for Oracle automatically,

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to use ROWID to improve an UPDATE statement for Oracle?

Here the following is an Update SQL with a subquery that updates the EMPLOYEE table if the emp_dept satisfies the records returned from a subquery.

update  employee
   set  emp_name = ‘testing’
 where  emp_dept IN (select dpt_id
            from department
          where dpt_name like ‘A%’)
and emp_grade>2000

You can see Oracle uses a Hash join of the DEPARTMENT table and EMPLOYEE table to execute the update process. This query plan takes 1.96 seconds to complete and no index is used even though emp_dept, dpt_id, and emp_grade are indexed columns. It looks like the most expansive operation is the Table Access Full scan of the EMPLOYEE table.

Let’s rewrite the SQL into the following syntax to eliminate EMPLOYEE’s Table Access Full operation from the query plan.  The new subquery with the italic Bold text is used to force the EMPLOYEE to extract records with emp_dept in the DEPARTMENT table with the dpt_name like ‘A%’. The ROWID returned from the EMPLOYEE(subquery) is to make sure a more efficient table ROWID access to the outer EMPLOYEE table.

UPDATE  employee
SET   emp_name=‘testing’
WHERE   ROWID IN (SELECT  ROWID
          FROM   employee
          WHERE  emp_dept IN (SELECT  dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE‘A%’))
     AND emp_grade > 2000

You can see the final query plan with this syntax has a better cost without full table access to the EMPLOYEE table. The new syntax takes 0.9 seconds and it is more than 2 times faster than the original syntax.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert Pro for Oracle automatically, there is another SQL rewrite with similar performance, but it is not suitable to discuss in this short article, maybe I can discuss it later in my blog.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

A Quick Guide to Stored Procedures in Oracle Database and SQL

sql tuning for MySQL

Stored procedures are increasing in popularity in Oracle Database and SQL Server because of quicker execution. Earlier, application code mostly resided in external programs. However, its shift toward database engine interiors compels database professionals to keep their memory requirements in mind.

This is as necessary as planning for times when the code related to database access will be present within the database. They also need to know how they can handle these stored procedures to maintain ideal database performance. We will look at some of these methods and the advantages of using stored procedures and triggers in the Oracle database.

Perks of Stored Procedures for Oracle Database Performance Tuning

Until recently, a majority of Oracle databases had limited code within their stored procedures. This shift in trends is because of the various advantages that come with placing larger amounts of code, such as the following:

Performance Improvement – Using more stored procedures means you don’t require Oracle database performance tuning as much. That’s because each of these only has to load once into the shared pool. Executing them, therefore, is quicker than running external code.

Code Segregation – The stored procedures have all the SQL codes which turn all the application programs into calls for those procedures. This is an improvement in the data retrieval process because changing databases gets simpler.

Therefore, one advantage you get through stored procedures is the ability to transfer large amounts of SQL code to the data dictionary. Doing this will enable you to perform SQL tuning without involving the application layer.

Group Data Easily – You can gather relational tables with data that shares certain behaviour before looking for Oracle performance tuning tips. Simply use Oracle stored procedures as methods, along with suitable naming conventions. For example, link the behaviour of the table data to the table name in the form of prefixes.

The users may then request the data dictionary to display all the traits connected to one table. This makes it more convenient to recognise and reuse code with the help of stored procedures.

Other Reasons Behind the Increasing Popularity of Stored Procedures

There are plenty of other reasons ‌stored procedures and triggers take less time in comparison with conventional code. One of these has something to do with SGA caching in Oracle database and SQL.

Once the shared pool within the SGA gets a hold of a stored procedure, it keeps it there until the procedure gets paged out from the memory. The SGA mostly does this to create space for other stored procedures. The paging out process takes place based on a Least Recently Used or LRU algorithm.

Two parameters help determine the amount of space that Oracle uses on startup. These are the Cache Size and the Shared Pool Size parameters. They also help users check how much storage space is available for various tasks. These include caching SQL code, data blocks, and stored procedures.

Stored procedures will run extremely fast once you load them into the shared pool’s RAM – as long as you can avoid pool thrashing. This is important because several procedures compete for varying quantities of memory in the shared pool. 

How to index SQL with aggregate function SQL for Oracle?

Here the following is an example SQL shows you that select the maximum emp_address which is not indexed in the EMPLOYEE table with 3 million records, the emp_grade is an indexed column.

select max(emp_address) from employee a
where emp_grade<4000

As 80% of the EMPLOYEE table’s records will be retrieved to examine the maximum emp_address string. The query plan of this SQL shows a Table Access Full on EMPLOYEE table is reasonable.

How many ways to build an index to improve this SQL?
Although it is simple SQL, there are still 3 ways to build an index to improve this SQL, the following are the possible indexes that can be built for the SQL, the first one is a single column index and the 2 and 3 are the composite index with a different order.
1. EMP_ADDRESS
2. EMP_GRADE, EMP_ADDRESS
3. EMP_ADDRESS, EMP_GRADE

Most people may use the EMP_ADDRESS as the first choice to improve this SQL, let’s see what the query plan is if we build a virtual index for the EMP_ADDRESS column in the following, you can see the estimated cost is reduced by almost half, but this query plan is finally not being used after the physical index is built for benchmarking due to actual statistics is collected.

The following query shows the EMP_ADDRESS index is not used and the query plan is the same as the original SQL without any new index built.

Let’s try the second composite index (EMP_GRADE, EMP_ADDRESS), the new query plan shows an Index Fast Full Scan of this index, it is a reasonable plan which no table’s data is needed to retrieve. So, the execution time is reduced from 16.83 seconds to 3.89 seconds.

Let’s test the last composite index (EMP_ADDRESS, EMP_GRADE) that EMP_ADDRESS is placed as the first column in the composite index, it creates a new query plan that shows an extra FIRST ROW operation for the INDEX FULL SCAN (MIN/MAX), it highly reduces the execution time from 16.83 seconds to 0.08 seconds.

So, indexing sometimes is an art that needs you to pay more attention to it, some potential solutions may perform excess your expectation.

The best index solution is now more than 200 times better than the original SQL without index, this kind of index recommendation can be achieved by Tosska SQL Tuning Expert for Oracle automatically.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

Transferring Data in SQL Server with an Eye on Performance

improve performance of sql query

A lot of database professionals often need to archive older data in SQL Server by transferring it from one table to another. There are multiple ways to achieve the transfer, the most useful of which we will discuss in this blog. We will also provide tips to ensure the performance of the database doesn’t get affected as these approaches are carried out.

Different Methods to Move Data from One Table to Another

Consider the following techniques that various DBAs take when they have to take data from a table to add to another table along with some ways to improve performance of SQL query while using them:

  1. Insert data with the INSERT INTO command – The INSERT INTO query is one of the basic methods of moving data from table 1 to table 2. You can help decrease the time it takes to enter information using this method. If the database is running under the full recovery model, just change it to the bulk-logged model. Doing this saves execution time as it skips over complete logging of bulk operations. The following query should help with this:

ALTER DATABASE <database name> SET RECOVERY <BULK_LOGGED>

Once you switch to the bulk-logged recovery model, you will have to use a truncate statement to flush table 2 (destination). You can carry out the same script you were using to transfer data after this.

  1. Use the SELECT INTO query – Using the SELECT INTO rather than the INSERT INTO command can prove useful in some cases. However, the benefits are significant when the recovery model is bulk-logged due to the reason mentioned above. Although users lack the ability to place the data in an existing table, SQL Server brought with it a feature to make things easier. It essentially enables them to pick the filegroup where they want to create a table.
  1. INSERT INTO query + Tab lock hint – Using both in combination has been known to provide better database performance. To achieve this, you will have to use TABLOCK for table 2. If the destination table is without a clustered index or other constraints, that data will remain as a heap. It helps to use the TABLOCK hint for the destination table during data insertion into a heap using the INSERT INTO statement. Doing this enhances query logging and locking since a shared lock is placed on the whole table rather than every row or page.
  2. Adding data using the SWITCH TO query – You can also try moving the data with the help of the SWITCH TO command. Although this query typically finds its use while transferring information between partitions among separate tables, it can help here as well. How? By moving data from one partition to the next using the ALTER TABLE command. If there are no allocated partitions, the data will transfer through tables instead. Before you begin data insertion, make sure you disable any constraints or indexes that exist on the table. It is better to enable constraints and rebuild indexes after insertion from a performance perspective.

Tips for Enhancing Performance During Data Transfer and Insertion

  • Reduce IO lag – Latency can negatively impact the process of writing database files on disk. You can decrease latency and bottlenecks using SSD drives that are comparatively better than SATA or SCSI drives.
  • Maintain Robust Server Infrastructure – The system needs to be properly built to ensure competent performance for various database operations. The greater the pressure on the resources, the greater the effect on performance.
  • Follow ACID Properties – ACID properties make sure each transaction contains certain properties when it gets processed. In the case of data insertion, the isolation factor is also important to consider because the values have another source. Here, the statements should contain the suitable isolation level to maintain integrity within the database.
  • Database Settings – One of the best ways to achieve improved outcomes is to maintain the right database configuration. This is because the settings can have a significant effect on performance. For instance, the location of the database files on the disk along with TempDB settings.

These are the various ways in which you can gain better performance at the query, trace, and constraint levels along with additions that can improve the execution of insert operations.