As I’ve often pointed out, this blog isn’t AskTom, or the OTN forum, so I don’t expect to have people asking me to solve their problems; neither do I answer email questions about specific problems. Occasionally, though, questions do appear that are worth a little public airing, and one of these came in by email a couple of weeks ago. The question is longer than the answer I sent; my contribution to the exchange doesn’t start until the heading “My reply”.
The question
Last week I accidentally found a very interesting thing about the use_hash hint: when you join two tables using a unique column from one table and you have an equality predicate on that column, you cannot use a hint to make them use a hash join. I know that it does not make sense to use a hash join in this case because a nested loop is the best way to do it. The point is why Oracle ignores the hint here.
Here is the test case.
-- Table creation
create table t1 as
select rownum id, object_name name
from   all_objects
where  rownum <= 1000
;

create table t2 as
select mod(rownum,20)+1 id
from   dual
connect by rownum <= 1000;

-- Index creation (for table T1, we can create a primary key index or unique index)
alter table t1 add constraint t1_pk primary key(id);
create index ind_t2 on t2(id);

-- Gather table statistics here

select /*+ ordered use_hash(t2) */ *
from   t1, t2
where  t1.id = t2.id
and    t1.id = 1
;
Here, the use_hash hint will be ignored. Without rewriting the query, Oracle only uses a nested loop (which is the best choice; the other join methods make no sense at all).
In your article, you said:
Why do people think that Oracle “ignores” hints ? There are two main reasons.
- The available hints are not properly documented
- the hints are rarely used properly – because they are not documented properly.
- (there are a few bugs that make things go really wrong anyway) – and yes, I know that’s the third reason of two.
In the above test case, reasons 1 and 2 do not apply. So is it an Oracle feature or a bug?
We can argue that this is a feature, since in this case Oracle really knows that a nested loop is the best thing. Then for table T1:
select /*+ full(t1) */ * from t1 where t1.id = 1
The above query should also ignore the FULL hint, but it does not.
Now, if we change the query to:
select /*+ ordered use_hash(t2) */ * from t1,t2 where t1.id = t2.id and t1.id in (1,2) ;
The hint works and Oracle uses a hash join here.
Since Oracle can use transitive closure when generating the query plan, we can rewrite the first query as an equivalent one:
select /*+ ordered use_hash(t2) */ * from t1,t2 where t1.id = t2.id and t2.id = 1 ;
Oracle will obey the hint here, even though Oracle knows we want t1.id = 1.
So it looks more like a bug than a feature. What do you think?
My reply
If you read chapter 6 of Cost Based Oracle – Fundamentals, somewhere around page 142, you will see what’s going on. There is an inconsistency in this part of the optimizer code which could be addressed but might be hard to change. (So the answer to your question is that this is more like a bug than a feature – but it’s a side effect of a more significant defect in the code, rather than a very local bug.)
In the first case (predicate on t1), Oracle uses transitive closure to generate a predicate on t2 - and as it does so it drops the join predicate. This makes the hash join impossible and puts the hint out of context.
In the second case, Oracle keeps the join predicate (and that’s the inconsistency in behaviour), so the hash join is still legal and therefore the hint has to be obeyed.
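One way to check this for yourself is to compare the Predicate Information section of the two execution plans. The following is a minimal sketch using the t1 and t2 tables from the test case; the exact plans will depend on your Oracle version and statistics, so treat it as a way of looking rather than a statement of what you will see.

```sql
-- Case 1: filter on t1.id (the unique column).
-- Look for whether any predicate of the form "T1"."ID"="T2"."ID"
-- survives in the Predicate Information section.
explain plan for
select /*+ ordered use_hash(t2) */ *
from   t1, t2
where  t1.id = t2.id
and    t1.id = 1;

select * from table(dbms_xplan.display);

-- Case 2: the same filter expressed on t2.id.
-- Here the join predicate should still be listed, which is what
-- keeps the hash join legal.
explain plan for
select /*+ ordered use_hash(t2) */ *
from   t1, t2
where  t1.id = t2.id
and    t2.id = 1;

select * from table(dbms_xplan.display);
```

If the first plan shows only the two constant predicates and the second shows the join predicate as well, that matches the behaviour described above.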
I would guess that the logic works like this:
Predicate on t1: We generate a transitive predicate, but the source predicate is equality on a unique key, so the join predicate is redundant and can be dropped.
Predicate on t2: We generate a transitive predicate, but the source predicate is not guaranteed to be a single-row predicate, so the join predicate has to be kept and checked. (The “not unique” requirement is also why the join on the in-list behaves the way it does).
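As an illustration of that guess, the two predicate sets the optimizer might end up working with could be written out by hand as follows. These are hand-written sketches of the assumed internal state, not optimizer output, and running them as ordinary queries will not necessarily reproduce the behaviour, because the optimizer will process the predicates again.

```sql
-- Case 1 (source predicate t1.id = 1, equality on a unique key):
-- closure generates t2.id = 1 and the join predicate is dropped,
-- leaving no equality join for a hash join to use.
select *
from   t1, t2
where  t1.id = 1
and    t2.id = 1;

-- Case 2 (source predicate t2.id = 1, not guaranteed single row):
-- closure generates t1.id = 1 but the join predicate is kept,
-- so a hash join on t1.id = t2.id is still legal.
select *
from   t1, t2
where  t1.id = t2.id
and    t2.id = 1
and    t1.id = 1;
```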
Arguably Oracle could introduce a second pass in the optimizer code which could note that the generated predicate has resulted in a predicate with equality on a unique key, and with this change in place the optimizer could decide to drop the join predicate and we would be back to consistent behaviour. In fact, to my mind, the code should never drop predicates – but it needs to change so that it recognises “redundant” predicates properly and doesn’t double count them in the calculation of join selectivity.
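To put rough numbers on the double-counting risk, here is a small worked calculation for the test case. It assumes the commonly quoted simplified join selectivity formula, 1 / greater(num_distinct(t1.id), num_distinct(t2.id)), with no nulls; the real optimizer arithmetic varies with version and is more subtle than this. The figures come from the test data: t1.id has 1,000 distinct values, t2.id has 20, t1.id = 1 returns 1 row and t2.id = 1 returns 50 rows.

```sql
-- Assumed simplified model, for illustration only.
select
    1 * 50                        as rows_if_join_pred_dropped,  -- 1 row x 50 rows
    1 * 50 / greatest(1000, 20)   as rows_if_join_pred_counted   -- same, times join selectivity
from dual;
```

Under these assumptions the first expression gives 50, which is the correct row count, while the second gives 0.05, because the effect of t1.id = t2.id has already been accounted for by the two constant predicates.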
