Here’s a little example of why you should be very cautious about implementing undocumented discoveries. If you take a look at the view v$sql_hints in 11.2.0.4 you’ll discover a hint (no_)cluster_by_rowid; and if you look in v$parameter you’ll discover two new parameters _optimizer_cluster_by_rowid and _optimizer_cluster_by_rowid_control.
It doesn’t take much imagination to guess that the parameters and hint have something to do with the costs of accessing compressed data by rowid on an Exadata system (see, for example, this posting) and it’s very easy to check what the hint does:
create table t1 as select rownum id, trunc(dbms_random.value(0,1000)) n1, trunc(dbms_random.value(0,1000)) n2, rpad('x',100,'x') padding from all_objects where rownum <= 10000 ; create index t1_i1 on t1(n1,n2); -- collect stats set autotrace on explain select /*+ index(t1) cluster_by_rowid(t1) */ id from t1 where n1 = 50 ;
I’ve created an index on the two randomly generated columns (n1, n2) and then forced Oracle to walk the index searching for n1 = 50. Because of the presence of n2 in the index the order in which Oracle will visit the rows will be randomised relative to the order in which they are stored in the table – but if I can get Oracle to sort the rowids into order before visiting the table I could (in theory) get a performance benefit from minimising the number of consistent gets I do (some will turn into “buffer is pinned count”), and in the Exadata case I could get a big improvement by ensuring that I decompressed each compression unit no more than once. That’s what the hint does – and to prove the point I can select the id column – if the rows are being visited in order of rowid then the output will be in order of id:
ID ---------- 198 681 1257 1318 5593 7094 7471 8048 8798 8855 9670 11 rows selected. --------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 80 | 12 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 10 | 80 | 12 | | 2 | SORT CLUSTER BY ROWID | | 10 | | 2 | |* 3 | INDEX RANGE SCAN | T1_I1 | 10 | | 2 | --------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("N1"=50)
Sure enough – with the hint the data arrives in sorted order, without it the results are not sorted. And, of course, we can check the execution plan and see that the optimizer has inserted a SORT operation between the index range scan and the table access by rowid.
So here’s the fun bit – the parameters are still there in 12.1.0.1 (and _optimizer_cluster_by_rowid is now true by default when it used to be false) but the hint is no longer available. So if you want the sort to take place, you have to depend (presumably) on the luck of the statistics – when does Oracle think it’s worth doing automatically – or you still have to write the query as a two-part job with an inline, non-mergeable view that accesses and sorts the rowids before visiting the table.
