Till now in my previous articles we have discussed about the tools that are used in the performance tuning.
Please find the previous article here.
Before we go in depth of the Performance tuning, it is important to know about the Optimizer and its behaviour.
So, in this article we will be discussion about the Optimizer.
Oracle Optimizer
Basics of Optimizer:
Optimizer is a mechanism which is inside the instance as part of the software components which builds the best execution plan for a query to execute. Any SQL statement will go through the optimizer and the optimizer tell how it should be executed. Hence, optimizer is the decision maker for a query to execute in the instance.
The database executes the SQL statements in many ways like full table scan, index scan, loops, joins etc., Who will define the best plan it may be through index scan, or full table scan or loop or joins and which is the fastest way to execute? That is decided by the optimizer.
There are 3 major activities happens whenever the user submits the SQL statement for execution to the instance.
- The optimizer will generate a set of all the possible plans for SQL statements based on the available access paths and hints. For each of the plans the optimizer will estimates the cost involved. It includes I/O, CPU utilisation, Memory utilisation etc., With the help of the statistics available in the data dictionaries. So, the statistics are the very important data that optimizer uses to define the best execution plan.
- Once the optimizer estimates the cost, it compares the plan and chooses the plan with the lowest cost. The query which quickly gets the data which utilises lowest CPU and lowest memory is the lowest cost.
After the execution plan is released by the optimizer then the database instance will start following the same execution plan that the optimizer is defined then it will complete the query execution and gives the results to the user.
Below are the different operations that the optimizer will perform:
Stage 1: Evaluation of expressions and conditions:
The first step the optimizer does is to go and check all the conditions that are available inside the query.
Stage 2: Statement Transformation:
Once the optimizer checks the conditions in the query, it will transform the query to a different form if it executes faster than the original query.
Example: If a query uses IN clause and it can be substituted with an AND clause and get the output faster, then it will automatically use the AND clause for the execution.
Stage 3: Choice of Optimizer goals:
The optimizer will set goals of optimization.
Stage 4: Choice of access paths:
For each table access the optimizer will chooses one or more available access paths to get the table data.
Stage 6: Choice of Join orders:
For a join statement that joins two or more tables, Optimizer will choose which 2 tables will be joined first, and then the result has to be joined with other set of tables so how can this be possible. Weather table1 joins table2, table2 joins table3 or table2 joins table3.
Understanding the behaviour of the Optimizer:
Behaviour of the optimizer meaning the functionality of the optimizer.
The query optimizer operation includes:
- Query Transformation
- Estimation
- Plan Generation
Here in the parsing stage the optimizer operates as:
1st it will check if the query must be transformed which will be the Query Transformation.
Now after the query is transformed, it will estimate the cost, cardinality, selectivity to identify which is the best execution plan. This is with the help of statistics available in the data dictionaries. If the statistics are not available then the optimizer calculations might have the false values. This is done at the Estimation stage.
Now once the estimations are ready, it will have a clear idea on which plan must be used and using the plan it will generate an execution plan. So, this execution plan will be designed with the help of all the statistical information that had in the earlier stage.
Query Transformation can be done by the optimizer in 4 ways.
- View Merge
- Predicte Pushing
- Subquery unnesting
- Query rewrite with materialized views
There is no rule that every SQL statement will be transformed. But ther is a possibility for the Optimizer to transform a query.
View Merge:
Creating a view on top of a table with the where clause.
1 2 3 4 5 6 7 8 |
Example: create view employee_50_vw as Select employee_id, last_name, job_id, salary, commission_pct, department_id from employees Where department_id = 50; Now we will query the view as below: Select employee_id from employee_50_vw Where employee_id >150; |
The Optimizer may use the view merging technique to transform the query as below.
1 2 3 |
Select employee_id from employees Where department_id = 50 AND employee_id >150 |
Here the optimizer Internally inside the instance will transforms the “Select employee_id from employee_50_vw Where employee_id >150;” query to the original format with all the conditions.
So, on top a view whenever we are running a query, View Merging might happen inside the oracle optimizer in the transformation phase.
Predicate Pushing:
In this stage the optimizer pushes the predicates form containing query block into the view query block. This technique will improve the subplan of the unmerged view because the database can use the pushed-in predicates form to access indexes or to use as filters.
For example, we create a view that reference two employee tables, and is defined with the compound query which uses UNION set operator
1 2 3 4 5 6 7 |
CREATE VIEW all_employee_vw AS (SELECT employee_id, last_name, job_id, commission_pct, department_id FROM employees) UNION (SELECT employee_id, last_name, job_id, commission_pct, department_id FROM contract workers); Now on top of this view if we write a condition as below. SELECT last_name FROM all_employee_vw where department_id = 50; |
Here in the above, the condition cannot be for one select statement since the UNION operator is used, the condition should be applied for both the select statements in the view. Thus, whatever we give inside the where clause is called the predicate. This predicate is substituted to each and every sub query which is being used in creating the view. And it will substitute the condition inside each select statement and then it will go for the evaluation stage.
Subquery unnesting:
In this subquery unnesting the optimizer will transforms the nested statement into the join statement and then it is optimised. This is performed by the optimizer only when both the statements (one with original statement and the one with the join statement) gives exactly same results and the subquery does not have the aggregate functions like AVG.
1 2 |
Example: SELECT * FROM sales WHERE cust_id IN (SELECT cust_id FROM customers); |
Since the customers.cust_id column is the primary key, the optimizer can transform the statement into simple join statement as it give exactly same result in both the cases.
1 2 |
SELECT sales.* FORM sales, customers WHERE sales.cust_id = customers.cust_id; |
Estimation:
This is the phase where the estimator will determine the overall cost for the different plans generated by the optimizer. Below are the three measures that are considered to achieve the goal.
- Selectivity
- Cardinality
- Cost
Selectivity: It represents the fraction of the rows from the row set. If there are 100 rows in a rowset and you select 1 row then the selectivity will be 0.01. If you have 1000 rows and you are trying to select 100 rows then the selectivity will be 0.1. This is depending on what predicate you are using like last_name=’MORA’, or a combination of predicates like last_name=’MORA’ AND job_type=’ADMIN’.
Selectivity is the estimation with the help of the statistics. If there are no statistics, then the optimizer can get the wrong selectivity. Range is from 0.0 to 1.0. here 0.0 means no rows are selected form the row set and 1.0 means all the rows are selected.
Cardinality: This represents the total number of row that it has to fetch from the table. Example, if the table has 1000 rows and the query has to fetch 40 rows, then the cardinality will be 40 and the selectivity will be 40/1000.
Cost: Cost includes the memory usage, I/O operations, CPU consumption to get the data is the cost. It decides between table scan or full index scan or can be index scan.
After all these (selectivity, Cardinality and cost) are defined then only the cost of the particular plan will be evaluated. Then the optimizer decides the best execution plan. Once the optimizer decides the best execution plan, and is generated.
I HOPE WITH THIS YOU GET AN CLEAR IDEA ABOUT HOW OPTIMIZER WORKS…
Please give any suggestions or feedback below in the comment section…
THANK YOU…