I came across this odd limitation (maybe defect) with pushing predicates (join predicate push down) a few years ago. Fixing it made a dramatic difference to a client query, but the problem managed to hide itself rather cunningly until you looked closely at what was going on. Searching my library for something completely different, I’ve just rediscovered the model I built to demonstrate the issue, so I’ve tested it against a couple of newer versions of Oracle (including 18.1) and found that the anomaly still exists. It’s an interesting little detail about checking execution plans properly, so I’ve written up the details. The critical feature of the problem is a union all view:
rem
rem     Script:         push_pred_limitation.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2015
rem
rem     Last tested
rem             18.1.0.0        via LiveSQL
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table t1
as
select  *
from    all_objects
where   rownum <= 10000 -- > comment to avoid WordPress format issue
;

create table t2
as
select  *
from    all_objects
where   rownum <= 10000 -- > comment to avoid WordPress format issue
;

create table t3
as
select  *
from    all_objects
where   rownum <= 10000 -- > comment to avoid WordPress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1 for columns owner size 254'
        );
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T2',
                method_opt  => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T3',
                method_opt  => 'for all columns size 1'
        );
end;
/

create index t2_id on t2(object_id);
-- create index t2_id_ot on t2(object_id, object_type);

create index t3_name_type on t3(object_name, object_type);

create or replace view v1
as
select
        /*+ qb_name(part1) */
        t2.object_id,
        t2.object_type          object_type_2,
        t3.object_type          object_type_3,
        t2.created              date_2,
        t3.created              date_3
from
        t2, t3
where
        t3.object_name = t2.object_name
union all
select
        /*+ qb_name(part2) */
        t2.object_id,
        t2.object_type          object_type_2,
        t3.object_type          object_type_3,
        t2.last_ddl_time        date_2,
        t3.last_ddl_time        date_3
from
        t2, t3
where
        t3.object_name = t2.object_name
;
Two points to note so far. First, the view is basically joining the same two tables in the same way twice but selecting different columns. It’s a close model of what the client was doing, but so much simpler that it wouldn’t be hard to find a different way of getting the same result: the client’s version would have been much harder to rewrite. Secondly, I’ve listed two possible indexes for table t2 but commented one of them out. The indexing will make a difference that I’ll describe later.
So here’s the query with execution plan (from explain plan – but pulling the plan from memory gives the same result):
select
        /*+ qb_name(main) */
        t1.object_name, t1.object_type,
        v1.object_id, v1.date_2, v1.date_3
from
        t1,
        v1
where
        v1.object_id = t1.object_id
and     v1.object_type_2 = t1.object_type
and     v1.object_type_3 = t1.object_type
and     t1.owner = 'OUTLN'
;

Plan hash value: 4123301926

---------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |     7 |   588 |    82   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                            |              |     7 |   588 |    82   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                      | T1           |     7 |   280 |    26   (4)| 00:00:01 |
|*  3 |   VIEW                                   | V1           |     1 |    44 |     8   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE            |              |       |       |            |          |
|   5 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|   6 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|   7 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
|  11 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|  12 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|  13 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|* 14 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|* 15 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  16 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T1"."OWNER"='OUTLN')
   3 - filter("V1"."OBJECT_TYPE_2"="T1"."OBJECT_TYPE")
   8 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
   9 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  14 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
  15 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
The execution plan appears to be fine – we can see at operation 4 that the union all view has been accessed with the pushed predicate option and that the subsequent sub-plan has used index-driven nested loop joins in both branches – until we look a little more closely and examine the Predicate section of the plan. What, exactly, has been pushed?
Look at the predicate for operation 3: "V1"."OBJECT_TYPE_2"="T1"."OBJECT_TYPE". It’s a join predicate that hasn’t been pushed into the view. On the other hand the original, and similar, join predicate v1.object_type_3 = t1.object_type has been pushed into the view, appearing at operations 9 and 15. There is a difference, of course: the object_type_3 column appears as the second column of the index on table t3, whereas the index on t2 holds only object_id.
Two questions then: (a) will the object_type_2 predicate be pushed if we add the column to the relevant index on table t2, and (b) is there a way to get the predicate pushed without adding it to the index? The answer to both questions is yes. First the index – re-run the test but create the alternative index on t2 and the plan changes to:
Plan hash value: 497545587

---------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |     7 |   553 |    82   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                            |              |     7 |   553 |    82   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                      | T1           |     7 |   280 |    26   (4)| 00:00:01 |
|   3 |   VIEW                                   | V1           |     1 |    39 |     8   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE            |              |       |       |            |          |
|   5 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|   6 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|   7 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | T2_ID_OT     |     1 |       |     1   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
|  11 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|  12 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|  13 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|* 14 |        INDEX RANGE SCAN                  | T2_ID_OT     |     1 |       |     1   (0)| 00:00:01 |
|* 15 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  16 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T1"."OWNER"='OUTLN')
   8 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID" AND "T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
   9 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  14 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID" AND "T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  15 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
Notice how the predicate at operation 3 has disappeared, and the access predicates at operations 8 and 14 now include the predicate "T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE".
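The index change behind that second plan can be sketched as follows. The create statement is the one commented out in the script above; dropping t2_id first is my assumption about one way to re-run the test without rebuilding the tables, and it is what leaves the optimizer with only the two-column index to use on t2.

```sql
-- Assumption: t2_id exists from the first run of the test;
-- remove it so that only the two-column index is available on t2.
drop index t2_id;

-- The alternative index from the script, now carrying object_type
-- as its second column.
create index t2_id_ot on t2(object_id, object_type);
```

After this, re-running the explain plan for the main query should show T2_ID_OT at operations 8 and 14, though the exact costs will depend on your data and version.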
Alternatively, don’t mess about with the indexes – just tell Oracle to push the predicate. Normally I would just try /*+ push_pred(v1) */ as the hint to do this, but the Outline section of the original execution plan already included a push_pred() hint that looked like this: PUSH_PRED(@"MAIN" "V1"@"MAIN" 3 1), so I first copied exactly that into the SQL to see if it would make any difference. It did – I got the following plan (and the hint in the outline changed to PUSH_PRED(@"MAIN" "V1"@"MAIN" 3 2 1), so this may be a case where the plan produced by a baseline will perform better than the plan that produced the baseline!):
Plan hash value: 4123301926

---------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |     7 |   553 |    82   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                            |              |     7 |   553 |    82   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                      | T1           |     7 |   280 |    26   (4)| 00:00:01 |
|   3 |   VIEW                                   | V1           |     1 |    39 |     8   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE            |              |       |       |            |          |
|   5 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|   6 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
|  11 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|  12 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|* 13 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|* 14 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|* 15 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  16 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T1"."OWNER"='TEST_USER')
   7 - filter("T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
   8 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
   9 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  13 - filter("T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  14 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
  15 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
In this case we see that the critical late-joining predicate has disappeared from operation 3 and re-appeared as a filter predicate at operation 7 (and at operation 13 for the second branch of the union all). In many cases you may find that the change in predicate use makes little difference to the performance – in my example the variation in run time over several executions of each query was larger than the average run time of the query. Nevertheless it’s worth noting that the delayed use of the predicate could have increased the number of probes into table t3 for both branches of the union all and resulted in redundant data passing up through several layers of the call stack before being eliminated … and “eliminate early” is one of the major commandments of optimisation.
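For reference, the hinted statement would look something like the sketch below. The hint text is simply the push_pred() entry copied from the Outline section of the original plan; the query is the one from the test, and the layout of the hint block is my own choice.

```sql
select
        /*+
                qb_name(main)
                push_pred(@"MAIN" "V1"@"MAIN" 3 1)
        */
        t1.object_name, t1.object_type,
        v1.object_id, v1.date_2, v1.date_3
from
        t1,
        v1
where
        v1.object_id = t1.object_id
and     v1.object_type_2 = t1.object_type
and     v1.object_type_3 = t1.object_type
and     t1.owner = 'OUTLN'
;
```

The trailing numbers in the hint are internally generated, so it is safer to copy them from your own plan's outline than to assume the values shown here will match your version.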
You might notice that the Plan Hash Value for the hinted execution plan is the same as for the original execution plan: the hashing algorithm doesn’t take the predicates into account (just one of many points that Randolf Geist raised in a blog post several years ago). This is one of the little details that makes it easy to miss the little changes in a plan that can make a big difference in performance.
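Since the plan hash value can't tell you that the predicates have moved, one option is to list the predicates directly from the cursor cache. This is a minimal sketch rather than a polished script: m_sql_id is a hypothetical substitution variable holding the sql_id of the statement, and child number 0 is an assumption.

```sql
select
        id, operation, options, object_name,
        access_predicates, filter_predicates
from
        v$sql_plan
where
        sql_id       = '&m_sql_id'      -- hypothetical: supply your own sql_id
and     child_number = 0                -- assumption: the first child cursor
and     (   access_predicates is not null
         or filter_predicates is not null
        )
order by
        id
;
```

Running this for the unhinted and hinted versions of the query should show the object_type_2 predicate moving from the VIEW operation to the T2 table access, even though the two plans report the same hash value.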
Summary
If you have SQL that joins simple tables to set-based (union all, etc.) views and you see the pushed predicate option appearing, take a little time to examine the predicate section of the execution plan to see if the optimizer is pushing all the join predicates that it should and, if it isn’t, test the effects of pushing more predicates.
In many cases adding the hint /*+ push_pred(your_view_name) */ at the top of the query may be sufficient to get the predicate pushing you need, but you may need to look at the outline section of the execution plan and add a series of more complicated push_pred() and no_push_pred() hints because the push_pred hint has evolved over time to deal with increasingly complicated transformations.
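As a starting point, the simple form of the hint and a way of reporting the Outline section might look like the following sketch. It uses the test query from this note; the 'outline' format option of dbms_xplan.display() is widely used, but I am assuming it is available in your version.

```sql
explain plan for
select
        /*+ qb_name(main) push_pred(v1) */
        t1.object_name, t1.object_type,
        v1.object_id, v1.date_2, v1.date_3
from
        t1,
        v1
where
        v1.object_id = t1.object_id
and     v1.object_type_2 = t1.object_type
and     v1.object_type_3 = t1.object_type
and     t1.owner = 'OUTLN'
;

-- Report the plan with its Outline section so that the generated
-- push_pred() / no_push_pred() hints can be inspected and copied.
select * from table(dbms_xplan.display(null, null, 'typical +outline'));
```

If the predicate section still shows a join predicate stranded at the VIEW operation, the full push_pred() entries in the outline are the next thing to experiment with.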