用pl/sql分割字符串

pl/sql 自带的package里面有一个dbms_utility.comma_to_table,可以用来分割以’,’为分割符
的字符串,并且对返回的字符数组类型也有限制,并不是一个通用的方法。如果你的需求恰
好符合该方法的定义,可以考虑使用。与此对应的方法是dbms_utility.table_to_comma,是
把字符串数组转换成字符串。

以下实现改自Tom Kyte的例子

功能: 用pl/sql function分割字符串,并返回相应的number数组

CREATE OR REPLACE TYPE typ_number_table as table of NUMBER;

CREATE OR REPLACE FUNCTION split_string(p_string    IN VARCHAR2
                                                                         ,p_delimiter IN VARCHAR2 DEFAULT ‘,’)
  RETURN typ_number_table IS
  l_string    VARCHAR2(32000) DEFAULT p_string || p_delimiter;
  l_num       NUMBER;
  l_num_table typ_number_table := typ_number_table();
BEGIN
  LOOP
    l_num := instr(l_string
                  ,p_delimiter);
    EXIT WHEN(nvl(l_num
                 ,0) = 0);
    l_num_table.EXTEND;
    l_num_table(l_num_table.COUNT) := to_number(trim(substr(l_string
                                                                                               ,1
                                                                                               ,l_num – 1)));
    l_string := substr(l_string
                      ,l_num + length(p_delimiter));
  END LOOP;
  RETURN l_num_table;
END;

一个并行查询的例子

系统有2个cpu,每个4核,raid 1+0,db为oracle 10gr2 64bit

1. 先直接来个查询看看
SQL> set autot trace
SQL> SELECT COUNT(*)
FROM channels
WHERE channels.snapshot_time > to_date(’10/28/2008 00:00′
                                      ,’MM/DD/YYYY HH24:MI:SS’)
      AND channels.snapshot_time <= to_date(’10/29/2008 23:50′
                                           ,’MM/DD/YYYY HH24:MI:SS’);

Elapsed: 00:01:15.96

Execution Plan
———————————————————-
Plan hash value: 1848729565

———————————————————————————————————–
| Id  | Operation                  | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
———————————————————————————————————–
|   0 | SELECT STATEMENT           |              |     1 |    11 |   533K  (7)| 01:46:45 |       |       |
|   1 |  SORT AGGREGATE            |              |     1 |    11 |            |          |       |       |
|*  2 |   FILTER                   |              |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR|              |    89M|   936M|   533K  (7)| 01:46:45 |   KEY |   KEY |
|*  4 |     INDEX FAST FULL SCAN   | CHANNELS_UNQ |    89M|   936M|   533K  (7)| 01:46:45 |   KEY |   KEY |
———————————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

   2 – filter(TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’)<TO_DATE(’10/29/2008
              23:50′,’MM/DD/YYYY HH24:MI:SS’))
   4 – filter(“CHANNELS”.”SNAPSHOT_TIME”<=TO_DATE(’10/29/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’) AND
              “CHANNELS”.”SNAPSHOT_TIME”>TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’))
Statistics
———————————————————-
       8962  recursive calls
          0  db block gets
     308209  consistent gets
     306423  physical reads
          0  redo size
        518  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
        141  sorts (memory)
          0  sorts (disk)
          1  rows processed

可以看到,默认走了索引,用时1分16秒左右

2. 强制走全表扫描看看
SQL> alter system flush buffer_cache;
SQL> alter system flush shared_pool;

SQL>
SELECT /*+ full(channels)*/
 COUNT(*)
FROM channels
WHERE channels.snapshot_time > to_date(’10/28/2008 00:00′
                                      ,’MM/DD/YYYY HH24:MI:SS’)
      AND channels.snapshot_time <= to_date(’10/29/2008 23:50′
                                           ,’MM/DD/YYYY HH24:MI:SS’);

Elapsed: 00:00:49.31

Execution Plan
———————————————————-
Plan hash value: 4225188383

——————————————————————————————————-
| Id  | Operation                  | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
——————————————————————————————————-
|   0 | SELECT STATEMENT           |          |     1 |    11 |   714K  (6)| 02:22:51 |       |       |
|   1 |  SORT AGGREGATE            |          |     1 |    11 |            |          |       |       |
|*  2 |   FILTER                   |          |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR|          |    89M|   936M|   714K  (6)| 02:22:51 |   KEY |   KEY |
|*  4 |     TABLE ACCESS FULL      | CHANNELS |    89M|   936M|   714K  (6)| 02:22:51 |   KEY |   KEY |
——————————————————————————————————-

Predicate Information (identified by operation id):
—————————————————

   2 – filter(TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’)<TO_DATE(’10/29/2008
              23:50′,’MM/DD/YYYY HH24:MI:SS’))
   4 – filter(“CHANNELS”.”SNAPSHOT_TIME”<=TO_DATE(’10/29/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’)
              AND “CHANNELS”.”SNAPSHOT_TIME”>TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’))
Statistics
———————————————————-
       8922  recursive calls
          0  db block gets
     404835  consistent gets
     403115  physical reads
          0  redo size
        518  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
        141  sorts (memory)
          0  sorts (disk)
          1  rows processed

全表扫描,用时49秒

3. 并行度为4的查询

SQL> alter system flush buffer_cache;
SQL> alter system flush shared_pool;
SQL>
SELECT /*+ full(channels) parallel(channels,4) */
 COUNT(*)
FROM channels
WHERE channels.snapshot_time > to_date(’10/28/2008 00:00′
                                      ,’MM/DD/YYYY HH24:MI:SS’)
      AND channels.snapshot_time <= to_date(’10/29/2008 23:50′
                                           ,’MM/DD/YYYY HH24:MI:SS’);

Elapsed: 00:00:11.96

Execution Plan
———————————————————-
Plan hash value: 166461870

———————————————————————————————————————————
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
———————————————————————————————————————————
|   0 | SELECT STATEMENT        |          |     1 |    11 |   198K  (6)| 00:39:37 |       |       |        |      |            |
|   1 |  SORT AGGREGATE         |          |     1 |    11 |            |          |       |       |        |      |            |
|*  2 |   PX COORDINATOR        |          |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)  | :TQ10000 |     1 |    11 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE      |          |     1 |    11 |            |          |       |       |  Q1,00 | PCWP |            |
|*  5 |      FILTER             |          |       |       |            |          |       |       |  Q1,00 | PCWC |            |
|   6 |       PX BLOCK ITERATOR |          |    89M|   936M|   198K  (6)| 00:39:37 |   KEY |   KEY |  Q1,00 | PCWC |            |
|*  7 |        TABLE ACCESS FULL| CHANNELS |    89M|   936M|   198K  (6)| 00:39:37 |   KEY |   KEY |  Q1,00 | PCWP |            |
———————————————————————————————————————————

Predicate Information (identified by operation id):
—————————————————

   2 – filter(TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’)<TO_DATE(’10/29/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))
   5 – filter(TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’)<TO_DATE(’10/29/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))
   7 – filter(“CHANNELS”.”SNAPSHOT_TIME”<=TO_DATE(’10/29/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’) AND
              “CHANNELS”.”SNAPSHOT_TIME”>TO_DATE(’10/28/2008 00:00′,’MM/DD/YYYY HH24:MI:SS’))
Statistics
———————————————————-
       9661  recursive calls
          4  db block gets
     405834  consistent gets
     403140  physical reads
        632  redo size
        518  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
        152  sorts (memory)
          0  sorts (disk)
          1  rows processed

全表加并行,用时11秒.

Oracle10gR2在RHEL 5下开启异步IO

记录一下步骤,内容主要来自这篇文章

平台:Redhat Enterprise Linux 5 64bit,Oracle 10gR2 10.2.0.4 64bit

1、首先用root用户安装以下必要的rpm包

# rpm -Uvh libaio-0.3.106-3.2.x86_64.rpm
# rpm -Uvh libaio-devel-0.3.106-3.2.x86_64.rpm

2、在系统级支持异步I/O
  与[Note 225751.1]介绍的在RHEL 3里面设置异步IO不同,不需要设置aio-max-size,而且’/proc/sys/fs’路径下也没有这个文件。因为从2.6 kernel开始,已经取消了对IO size的限制[Note 549075.1]。另外根据[Note 471846.1],Oracle建议将aio-max-nr的值设置为1048576或更高。

# echo > /proc/sys/fs/aio-max-nr 1048576

要永久修改这个内核参数,需要在/etc/sysctl.conf加上下面这句
fs.aio-max-nr = 1048576
使参数生效
#/sbin/sysctl -p

3、在数据库级启用异步I/O
  首先修改数据库参数。与[Note 225751.1]在RHEL 3里面设置异步IO不同,Oracle10gR2默认是打开了对异步IO的支持的,不需要重新编译数据库软件。在’$ORACLE_HOME/rdbms/lib’路径下,也没有’skgaioi.o’这个文件。在某些情况下,Oracle无法将IO行为或事件报告给操作系统[Note 365416.1],因此需要做以下操作。

这里开始换成oracle用户

SQL> alter system set disk_asynch_io=TRUE scope=spfile;

SQL> alter system set filesystemio_options=asynch scope=spfile;

SQL>shutdown immediate
$ cd $ORACLE_HOME/rdbms/lib
$ ln -s /usr/lib/libaio.so.1 skgaio.o
$ make PL_ORALIBS=-laio -f ins_rdbms.mk async_on
SQL>startup

  在Oracle10gR2中AIO默认已经是开启的了。可以通过ldd或者nm来检查oracle是否已经启用了AIO支持,有输出代表已经启用。

[oraprod@db01 ~]$ /usr/bin/ldd $ORACLE_HOME/bin/oracle | grep libaio
libaio.so.1 => /usr/lib64/libaio.so.1 (0x00002aaaac4a9000)
[oraprod@db01 ~]$ /usr/bin/nm $ORACLE_HOME/bin/oracle | grep io_getevent
w io_getevents@@LIBAIO_0.4

4、检查异步I/O是否在使用
  根据[Note 370579.1],可以通过查看slabinfo统计信息查看操作系统中AIO是否运行,slab是Linux的内存分配器,AIO相关的内存结构已经分配,kiocb值的第二列和第三列非0即是已使用。与kernel 2.4.x不同,没有显示kiobuf,因为从kernel 2.5.43开始,kiobuf已经从内核中被移除。

$ cat /proc/slabinfo | grep kio
kioctx 64 110 384 10 1 : tunables 54 27 8 : slabdata 11 11 0
kiocb 13 315 256 15 1 : tunables 120 60 8 : slabdata 21 21 44

一个CBO错用索引的例子

channels表,索引是channels_unq(svr_grp_id, channels_record_id, snapshot_time)

执行如下语句

SELECT channels.channel_record_id
      ,SUM(channels.max_viewers) AS viewer_sum
FROM channels
WHERE channels.snapshot_time > to_date(’11/25/2008 23:50′
                                      ,’MM/DD/YYYY HH24:MI:SS’)
      AND channels.snapshot_time <= to_date(’12/06/2008 23:50′
                                           ,’MM/DD/YYYY HH24:MI:SS’)
GROUP BY channels.channel_record_id;

474 rows selected.

Elapsed: 02:06:21.62

Execution Plan
———————————————————-
Plan hash value: 443160641

———————————————————————————————————————

| Id  | Operation                            | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

———————————————————————————————————————

|   0 | SELECT STATEMENT                     |              |     1 |    39 | 365   (1)| 00:00:05 |       |       |

|   1 |  HASH GROUP BY                       |              |     1 |    39 | 365   (1)| 00:00:05 |       |       |

|*  2 |   FILTER                             |              |       |       | |          |       |       |

|   3 |    PARTITION RANGE ITERATOR          |              |     1 |    39 | 364   (0)| 00:00:05 |   KEY |   KEY |

|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| CHANNELS     |     1 |    39 | 364   (0)| 00:00:05 |   KEY |   KEY |

|*  5 |      INDEX SKIP SCAN                 | CHANNELS_UNQ |     1 |       | 364   (0)| 00:00:05 |   KEY |   KEY |

———————————————————————————————————————
Predicate Information (identified by operation id):
—————————————————

   2 – filter(TO_DATE(’11/25/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’)<TO_DATE(’12/06/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))
   5 – access(“CHANNELS”.”SNAPSHOT_TIME”>TO_DATE(’11/25/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’) AND “CHANNELS”.”SNAPSHOT_TIME”<=TO_DATE(’12/06/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))
       filter(“CHANNELS”.”SNAPSHOT_TIME”>TO_DATE(’11/25/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’) AND “CHANNELS”.”SNAPSHOT_TIME”<=TO_DATE(’12/06/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))

Statistics
———————————————————-
       8100  recursive calls
          0  db block gets
  577374676  consistent gets
    4631561  physical reads
      59904  redo size
      11269  bytes sent via SQL*Net to client
        833  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
         91  sorts (memory)
          0  sorts (disk)
        474  rows processed

用了两个多小时,可以看到,在第5步不正确地走了索引 channels_unq

把查询语句加个hint,强制用全表扫描

SELECT /*+ FULL(channels) */channels.channel_record_id
      ,SUM(channels.max_viewers) AS viewer_sum
FROM channels
WHERE channels.snapshot_time > to_date(’11/25/2008 23:50′
                                      ,’MM/DD/YYYY HH24:MI:SS’)
      AND channels.snapshot_time <= to_date(’12/06/2008 23:50′
                                           ,’MM/DD/YYYY HH24:MI:SS’)
GROUP BY channels.channel_record_id;

474 rows selected.

Elapsed: 00:12:33.59

Execution Plan
———————————————————-
Plan hash value: 4197359631

——————————————————————————————————-

| Id  | Operation                  | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

——————————————————————————————————-

|   0 | SELECT STATEMENT           |          |     1 |    39 |  1770K  (1)| 05:54:04 |       |       |

|   1 |  HASH GROUP BY             |          |     1 |    39 |  1770K  (1)| 05:54:04 |       |       |

|*  2 |   FILTER                   |          |       |       |            | |       |       |

|   3 |    PARTITION RANGE ITERATOR|          |     1 |    39 |  1770K  (1)| 05:54:04 |   KEY |   KEY |

|*  4 |     TABLE ACCESS FULL      | CHANNELS |     1 |    39 |  1770K  (1)| 05:54:04 |   KEY |   KEY |

——————————————————————————————————-
Predicate Information (identified by operation id):
—————————————————

   2 – filter(TO_DATE(’11/25/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’)<TO_DATE(’12/06/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))
   4 – filter(“CHANNELS”.”SNAPSHOT_TIME”>TO_DATE(’11/25/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’) AND “CHANNELS”.”SNAPSHOT_TIME”<=TO_DATE(’12/06/2008 23:50′,’MM/DD/YYYY HH24:MI:SS’))
Statistics
———————————————————-
          1  recursive calls
          0  db block gets
    2762409  consistent gets
    2391028  physical reads
       4236  redo size
      11269  bytes sent via SQL*Net to client
        833  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        474  rows processed

这次走全表了,用时12分钟左右。

没有时间仔细去找原因,猜想:
1. 可能是最新统计信息没收集
2. 由于使用了绑定变量(在这个例子中简化了,没体现出来),可能是bind varible peeking的问题