Improve SQL Queries & Database for Better Efficiency: Part 1

SQL is probably the most popular and powerful means to handle data, but sometimes you need actionable advice to unleash its power and make the most of such a robust language.

In case you’re operating in the absence of a data warehouse or a segregated analytical database for assessment, you may be able to gain updated information only from the live production database.

However, optimization and tuning are very important while writing queries for an Oracle database, especially a production database. In this two-part series, we will cover eight of the most useful ways to supercharge your database by enhancing the SQL queries used.

Make Tuning Your SQL Queries Easy Using these Tips

Consider the following ways to improve the performance of your database –

Clarify the organization’s requirements first

There are certain practices that benefit not just the users optimizing SQL queries but also the organization in general, such as:

  • Determining relevant stakeholders
  • Concentrate on business implications
  • Structuralize the discussion for the ideal specifications
  • Ask the right questions (Who? What? Where? When? Why?)
  • Make the requirements as specific as possible, and confirm those with the stakeholders.

Limit the scope of the SELECT query

A majority of SQL professionals have a bad habit of using SELECT * as a shorthand and end up fetching all available information from a table. If the table in question has numerous rows and fields, this takes up a lot of Oracle database and SQL resources by returning plenty of unrequired data.

The use of the SELECT statement must be done in a way that makes the database fetch only the data required to fulfill the business requirements. Consider the following instance, where the organization’s requirements request postal addresses for clients –

The query SELECT * from Clients is inefficient as it might bring in other information also fed in the client table that isn’t needed here. Instead, a query like this would only fetch the data necessary as per the requirements.

SELECT Name, Address, City, State, Zip

FROM Clients

Refrain from Using SELECT DISTINCT

Eliminating redundant information from a query is easy with SELECT DISTINCT, which GROUPs certain fields in the statement to return distinct results. Achieving this goal, however, requires substantial processing power. Moreover, the grouping of data may not be accurate, which is why you must avoid using this ‘trick’ altogether.

Take this example, for instance –

SELECT DISTINCT Name, City, State

FROM Clients

This statement doesn’t account for several people in the same city or state having the same name. Common names like John or Jane will be grouped together, leading to an incorrect quantity of records. Bigger databases that contain numerous Johns and Janes will not benefit from this query.

In its place, try to use something like this –

SELECT Name, Address, City, State, Zip

FROM Customers

By increasing the fields, unique records will be fetched without the use of SELECT DISTINCT. The database also wouldn’t have to cluster any fields, and the result set would be accurate.

Use INNER JOIN to Make a Join, not WHERE

Some SQL professionals choose WHERE clauses for joining, like in the example below –

SELECT Clients.ClientID, Clients.Name, Sales.LastSaleDate

FROM Clients, Sales

WHERE Clients.ClientID = Sales.ClientID

This kind of join creates a Cartesian Join, also known as Cross Join or a Cartesian Product. It involves the creation of every potential combination of the variables. If we suppose the table in the above example contains a thousand clients with a thousand total sales, the statement would generate a million results in the first go, after which it would filter those results to fetch records where ClientID is correctly joined.

It’s an unnecessary waste of database resources since the database ends up doing over a hundred times more work than is actually needed. Moreover, Cartesian Joins are particularly cumbersome in large-scale databases because it is likely to fetch billions or trillions of results.

Therefore, it becomes essential to use something like an INNER JOIN instead, in order to avoid the creation of a Cartesian Join –

SELECT Clients.ClientID, Clients.Name, Sales.LastSaleDate

FROM Clients

   INNER JOIN Sales

   ON Clients.ClientID = Sales.ClientID

In this case, the database would only fetch a thousand records where ClientID matches. Certain DBMSs identify WHERE joins and run them as INNER JOINs on their own accord, which is why they won’t show any change in performance between a WHERE and INNER JOIN.