Oracle SQL优化干货教程

Posted by 道行尚浅 on April 11, 2012

什么是执行计划?

SQL语言本身比较倾向于描述性语言,语句只表述目标,Oracle会对语句解析,并生成执行计划,执行计划包含操作的具体动作与顺序.

执行计划收集

理论执行计划 & 实际执行计划

  • 理论执行计划:由优化器生成的执行计划,并不会执行
  • 实际执行计划:由优化器生成的执行计划,并交由SQL引擎执行

EXPLAIN PALN FOR

生成执行计划


SQL> explain plan for select * from scott.emp ;

Explained.

获取执行计划

  • @?/rdbms/admin/utlxpls

SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    14 |  1218 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMP  |    14 |  1218 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

12 rows selected.

  • @?/rdbms/admin/utlxplp

SQL> @?/rdbms/admin/utlxplp

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    14 |  1218 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMP  |    14 |  1218 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

12 rows selected.

  • DBMS_XPLAN.DISPLAY
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) ;

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    14 |  1218 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMP  |    14 |  1218 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

12 rows selected.

AUTOTRACE

命令 执行计划类型 是否展示结果 是否显示执行计划 是否显示统计信息
SET AUTOTRACE ON 实际执行计划 YES YES YES
SET AUTOTRACE TRAC 实际执行计划 NO YES YES
SET AUTOTRACE TRAC EXP SELECT 理论执行计划, DML实际执行计划 NO YES NO
SET AUTOTRACE TRAC EXP 实际执行计划 NO NO YES
SET AUTOTRACE office 关闭AUTOTRACE NO NO NO

set autotrace on


SQL> set autotrace on
SQL> select * from scott.dept ;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON


Execution Plan
----------------------------------------------------------
Plan hash value: 3383998547

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     4 |   120 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| DEPT |     4 |   120 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         30  recursive calls
          0  db block gets
         74  consistent gets
          7  physical reads
          0  redo size
        802  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
          4  rows processed


set autotrace trace

SQL> set autotrace trace
SQL>  select * from scott.dept ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3383998547

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     4 |   120 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| DEPT |     4 |   120 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
         16  consistent gets
          0  physical reads
          0  redo size
        802  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed

set autotrace trace exp

SQL> set autotrace trace  exp
SQL>  select * from scott.dept ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3383998547

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     4 |   120 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| DEPT |     4 |   120 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


SQL> set autotrace trace  stat 
SQL>  select * from scott.dept ;


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        802  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed

查看library cache中的执行计划

SQL> select sql_id,sql_text 
  2    from v$sql 
  3    where sql_text = 'select * from scott.dept ' ;

SQL_ID
-------------
SQL_TEXT
-----------------------------------------------------------------------------------

6735nv767qy6s
select * from scott.dept


SQL> select * from  table(dbms_xplan.display_cursor('6735nv767qy6s'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------

SQL_ID  6735nv767qy6s, child number 0
-------------------------------------
select * from scott.dept

Plan hash value: 3383998547

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS FULL| DEPT |     4 |   120 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


17 rows selected.

gather_plan_statistics 提示显示精确执行计划


SQL> create table dao_objects1 as select *  from dba_objects ;

Table created.

SQL> create table emp as select * from scott.emp ;

Table created.

SQL> create table dept as  select * from scott.dept ;

Table created.



SQL> select /*+ gather_plan_statistics use_nl(emp dao_objects1)  */count(*)
  2   from  emp , dao_objects1
  3   where empno=object_id ;

  COUNT(*)
----------
        14

SQL>  select * 
  2    from table(dbms_xplan.display_cursor.  (format => 'allstats last'));
  from table(dbms_xplan.display_cursor.  (format => 'allstats last'))
                                         *
ERROR at line 2:
ORA-01747: invalid user.table.column, table.column, or column specification


SQL> select *
  2      from table( dbms_xplan.display_cursor( format=> 'allstats last' ) );

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------
SQL_ID  78r58rj4b2gu6, child number 0
-------------------------------------
select /*+ gather_plan_statistics use_nl(emp dao_objects1)  */count(*)
from  emp , dao_objects1  where empno=object_id

Plan hash value: 2878422018

-------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |      1 |        |      1 |00:00:00.09 |   15039 |  14994 |
|   1 |  SORT AGGREGATE     |              |      1 |      1 |      1 |00:00:00.09 |   15039 |  14994 |
|   2 |   NESTED LOOPS      |              |      1 |     14 |     14 |00:00:00.24 |   15039 |  14994 |
|   3 |    TABLE ACCESS FULL| EMP          |      1 |     14 |     14 |00:00:00.01 |       3 |      0 |
|*  4 |    TABLE ACCESS FULL| DAO_OBJECTS1 |     14 |      1 |     14 |00:00:00.09 |   15036 |  14994 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("EMPNO"="OBJECT_ID")

Note
-----
   - dynamic sampling used for this statement (level=2)


26 rows selected.


10046 trace


SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> select /*+ gather_plan_statistics use_nl(emp dao_objects1)  */count(*)
  2   from  emp , dao_objects1
  3   where empno=object_id ;

  COUNT(*)
----------
        14

SQL> alter session set events '10046 trace name context off' ;

Session altered.


SQL> select display_value || '/ora_' || instance_name || '_' || sid || '.trc' trace_file_name
  2    from v$parameter vp, 
  3         v$instance,
  4         (select sid from v$mystat where rownum = 1) vm
  5   where name like '%user_dump_dest%' ;

TRACE_FILE_NAME
---------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/ora_orcl_44.trc

10053 trace


SQL> alter session set events '10053 trace name context forever, level 12';

Session altered.

SQL> select /*+ gather_plan_statistics use_nl(emp dao_objects1)  */count(*)
  2   from  emp , dao_objects1
  3   where empno=object_id ;

  COUNT(*)
----------
        14

SQL> alter session set events '10046 trace name context off' ;

Session altered.


SQL> select display_value || '/ora_' || instance_name || '_' || sid || '.trc' trace_file_name
  2    from v$parameter vp, 
  3         v$instance,
  4         (select sid from v$mystat where rownum = 1) vm
  5   where name like '%user_dump_dest%' ;

TRACE_FILE_NAME
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/ora_orcl_44.trc

执行计划阅读

执行计划列含义

列名 含义 特别说明
ID 操作编号  
Operateion 操作内容  
Name 操作对象名称  
Rows 行源行数 剩余行数,并不是处理的行数
Bytes 行源字节数 行数*平均行长度
Cost 成本 当前操作成本=成本-子操作成本
TIme 预计执行时间 很少有准的时候

执行计划阅读顺序

  1. 从执行计划的最顶层ID=0开始和结束,通常是SELECT statement,还 可以是insert、Update或Delete
  2. 向下遍历行源直到生成数据的行源操作,并且此行源操作不在调用任 何一个行源操作。这个行源是开始行源
  3. 开始行源称为主驱动集,通常主驱动集越小执行计划性能越优
  4. 接下来查看同一父ID下的兄弟节点,也就是同级行源操作中ID较大者
  5. 一旦子节点的操作行源操作执行完数据返回给父节点,接着执行父节 点行源操作
  6. 一旦父节点和其子节点行源操作执完成,接着执行父节点的兄弟节点
  7. 同一父节点的行源操作,总是ID小者先执行
  8. 按上述规则遍历完整个执行计划的所有行源操作
SQL> set autotrace trace exp
SQL> select *
  2    from emp, dept, dao_objects1
  3   where emp.deptno = dept.deptno
  4     and dao_objects1.data_object_id = empno ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1609509133

------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |   244 | 79056 |   308   (1)| 00:00:04 |
|*  1 |  HASH JOIN          |              |   244 | 79056 |   308   (1)| 00:00:04 |
|*  2 |   HASH JOIN         |              |    14 |  1638 |     7  (15)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| DEPT         |     4 |   120 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP          |    14 |  1218 |     3   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL | DAO_OBJECTS1 | 80098 |    15M|   301   (1)| 00:00:04 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DAO_OBJECTS1"."DATA_OBJECT_ID"="EMPNO")
   2 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

Note
-----
   - dynamic sampling used for this statement (level=2)

执行计划阅读助手

SQL> with fis as
 (select *
  2    3      from v$sql_plan
  4     where plan_hash_value = 1609509133
  5       and sql_id = (select min(sql_id)
  6                       from v$sql_plan
  7                      WHERE PLAN_HASH_VALUE = 1609509133))
  8  select path
  9    from (select distinct sys_connect_by_path(id, '->') path, id
 10            from fis
 11           where CONNECT_BY_ISLEAF = 1
 12           start with id = 0
 13          connect by prior id = fis.parent_id
 14           order by id);

PATH
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
->0->1->2->3
->0->1->2->4
->0->1->5

根据结果可以画出执行计划树

graph TD

0[0] --> 1[1]
1[1] --> 2[2]
2[2] --> 3[3]
2[2] --> 4[4]
1[1] --> 5[5]


执行计划分析

执行计划概述

执行计划中的信息通常分为以下几类

  • 访问路径 描述了如何获取数据
  • 联结方式 两个结果集做联结时,采用哪种算法
  • 连接顺序 多表关联时,哪些表先做关联,哪些表后做关联
  • 其他 除上述三类以外的操作, 比如 union all ,order by, group by

其中DBA能干预的只有前三类

访问路径

全表扫描

SQL> select * from dao_objects1 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 415406725

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              | 80098 |    15M|   301   (1)| 00:00:04 |
|   1 |  TABLE ACCESS FULL| DAO_OBJECTS1 | 80098 |    15M|   301   (1)| 00:00:04 |
----------------------------------------------------------------------------------

纯索引扫描

SQL> create index idx_1 on dao_objects1 (object_id) ;

Index created.

SQL> select object_id from daO_objects1 where object_id<=5 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3428004586

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |     4 |    52 |     2   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_1 |     4 |    52 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_ID"<=5)

Note
-----
   - dynamic sampling used for this statement (level=2)
   

先索引扫描,后根据rowid访问表中数据

SQL> select *  from daO_objects1 where object_id<=5 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 81928650

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     4 |   828 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DAO_OBJECTS1 |     4 |   828 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_1        |     4 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"<=5)

Note
-----
   - dynamic sampling used for this statement (level=2)


直接给定rowid


SQL> set autotrace off


SQL> select * from dao_objects1 where rownum=1 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2302929987

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |     1 |   207 |     3  (34)| 00:00:01 |
|*  1 |  COUNT STOPKEY     |              |       |       |            |          |
|   2 |   TABLE ACCESS FULL| DAO_OBJECTS1 | 80098 |    15M|     3  (34)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)

Note
-----
   - dynamic sampling used for this statement (level=2)


索引

索引唯一性扫描

索引唯一扫描出现的必要条件

  • 列存在唯一约束
  • 比较运算符为等号

SQL>  alter table dao_objects1 add constraint pk_object_id primary  key (object_id) ;

Table altered.



SQL> select * from dao_objects1 where object_id=888 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 81928650

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   207 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DAO_OBJECTS1 |     1 |   207 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_1        |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=888)

索引范围扫描

  • 普通索引且比较运算符为等式
SQL> create index idx_name_obj1 on dao_objects1(object_name) ;

Index created.

SQL>  select * from dao_objects1 where object_name ='EMP' ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3503313396

---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |   207 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DAO_OBJECTS1  |     1 |   207 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_NAME_OBJ1 |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME"='EMP')

Note
-----
   - dynamic sampling used for this statement (level=2)

  • 有效的范围条件
 
SQL> select * from dao_objects1 where object_id<66 ;      

Execution Plan
----------------------------------------------------------
Plan hash value: 81928650

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |    64 | 13248 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DAO_OBJECTS1 |    64 | 13248 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_1        |    64 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"<66)

Note
-----
   - dynamic sampling used for this statement (level=2) 

索引全扫描

  • 所需要的列包含在索引中,并且结果按索引列排序

SQL>  select object_id from dao_objects1  order by object_id ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3375905911

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       | 80098 |  1016K|   182   (1)| 00:00:03 |
|   1 |  INDEX FULL SCAN | IDX_1 | 80098 |  1016K|   182   (1)| 00:00:03 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

有一种特别类型的索引全扫描(max/min)模式,这种情况下的成本极低

SQL> select min(object_Id) from dao_objects1 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1642255482

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |     1 |     5 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |       |     1 |     5 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| IDX_1 |     1 |     5 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

索引快速全扫描

  • 所需要的列包含在索引中,并且不要求数据排序

SQL> select object_id from dao_objects1  ;             

Execution Plan
----------------------------------------------------------
Plan hash value: 2692558622

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       | 80098 |  1016K|    51   (0)| 00:00:01 |
|   1 |  INDEX FAST FULL SCAN| IDX_1 | 80098 |  1016K|    51   (0)| 00:00:01 |
------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

索引跳跃扫描

  • 当联合索引中首列选择性不好,而后面的列选择性非常好,并且条件不包含首列时出现

begin
  execute immediate 'create table test_skip (c1 number ,c2 number )';
  for i in 1..500000 loop
    execute immediate 'insert into test_skip values(0,:b1)'
      using i;
    execute immediate 'insert into test_skip values(0,:b2)'
      using i;
    if mod(i, 5000) = 0 then
      commit;
    end if;
  end loop;
  execute immediate 'create index idx_skip on test_skip(c1,c2) ';
  dbms_stats.gather_table_stats(ownname          => user,
                                tabname          => 'TEST_SKIP',
                                estimate_percent => 100,
                                cascade          => TRUE);
end;

SQL> select * from test_skip where c2=5 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3089276301

-----------------------------------------------------------------------------
| Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |          |     2 |    14 |     3   (0)| 00:00:01 |
|*  1 |  INDEX SKIP SCAN | IDX_SKIP |     2 |    14 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("C2"=5)
       filter("C2"=5)


索引的结构演示

注意听! 注意听 ! 注意听 !

选择性

一般情况系,索引的选择性主要由三个因素决定

  1. 表总行数
  2. 列中空值的个数
  3. 列中唯一值个数

根据上面三个因素可以得到

某一值对应的行数=(表总行数-列中空值个数)/唯一值个数

二元高度

索引的二元高度=索引的高度-1

索引的二元高度描述了从索引的根节点出发到索引的最底层叶子节点要跳跃几次


graph TD

    1[root] -->2[branch-level1-1]
    1 -->3[branch-level1-2]
    2 -->4[leaf1]
    3 -->5[branch-level2-1]
    5-->6[leaf2]

上图中索引层高为4,二元高度为3 及从根节点到最底层叶子节点leaf2所跳跃的次数.

聚簇因子

聚簇因子描述了根据rowid访问表的难易程度. 其算法为:当索引条目中的rowid与上一条目中的rowid指向的不是同一个数据块,则聚簇因子加一.

索引的创建规则讨论

关联方式

下面我们来讲解一下关联方式,并演示其关键思想.当然具体实现要比我们演示的内容复杂的多.

嵌套循环

循环嵌套联结,选取行数较小的行源作为驱动表,读取驱动表中的行内包含的关联列的值,并使用此值到被驱动表中查找匹配的行.

  • 嵌套循环的执行计划

SQL> select /*+ use_nl(emp, dept) */ * from emp natural join dept ;

14 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 4192419542

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |   812 |     9   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |      |    14 |   812 |     9   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| DEPT |     4 |    80 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| EMP  |     4 |   152 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("EMP"."DEPTNO"="DEPT"."DEPTNO")


Statistics
----------------------------------------------------------
         57  recursive calls
          0  db block gets
         98  consistent gets
          0  physical reads
          0  redo size
       1760  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         11  sorts (memory)
          0  sorts (disk)
         14  rows processed
  • 嵌套循环的关键算法思想

image

image

image

image

  • nested loop的适用范围

nested loop 适用于小表与大表的关联.并强烈建议大表的关联列上有索引.

排序合并

排序合并联结,先将数据读取到PGA中,并对数据进行排序,然后进行合并操作

  • 排序合并的执行计划

SQL> select /*+ use_merge(emp, dept) */ * from emp natural join dept ;

14 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1407029907

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    14 |   812 |     8  (25)| 00:00:01 |
|   1 |  MERGE JOIN         |      |    14 |   812 |     8  (25)| 00:00:01 |
|   2 |   SORT JOIN         |      |     4 |    80 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| DEPT |     4 |    80 |     3   (0)| 00:00:01 |
|*  4 |   SORT JOIN         |      |    14 |   532 |     4  (25)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| EMP  |    14 |   532 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
       filter("EMP"."DEPTNO"="DEPT"."DEPTNO")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
       1763  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         14  rows processed
  • 排序合并的关键算法思想

image

image

image

image

image

哈希

  • sort merge join的适用范围

sort merge join 适用于两个预先按关联列排好序的行源.

  • hash的执行计划

SQL> select * from emp natural join dept ;

14 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 615168685

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |   812 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN         |      |    14 |   812 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| DEPT |     4 |    80 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| EMP  |    14 |   532 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
       1911  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         14  rows processed

  • 哈希的关键算法思想

我们先来说一下什么是hash,本质上来说hash是一种标准. 如果一个函数满足以下标准,我们就称这个函数为哈希函数:

给定一个值A经过一个函数处理后一定能等到一个固定的结果B 但给定一个结果B,不能得到一个固定的值A

输入值 函数 结果
1 对3取余 1
2 对3取余 2
3 对3取余 0
4 对3取余 1
5 对3取余 2
6 对3取余 0

假设有两张表做关联

表A 只有一列 c1 包含1-12的数字 (1,2,3…12)

表B 只有一列 c2 包含1-20的数字(1,2,3…20)

image

现在我们选取mod4(对4取余)为哈希函数

对表A进行hash运算然后构造hash桶

读取表B的数据,也经过hash运算 得到桶的ID

image

  • hash join 的适用范围

hash 链接适用于大表和大表的关联 但hash join有一个限制就是仅适用于等式联结

关联顺序

对于多表关联,Oracle原则上使用中间结果集大小作为决定联结顺序的参考标准.

image

以上图为例,关联顺序为 A–>C–>B

我们在后面的实战课中有关于这方面的讲解与演示