Moved a few posts over from itpub; from now on this is the blog I will maintain.
This platform seems to force every post into a category (historical reasons?). Does anyone know how to display only tags, not categories, on a post?
Platform: Red Hat Enterprise Linux 5.3 64-bit, Oracle 10gR2 10.2.0.4 64-bit
Before enabling direct I/O in Oracle:
SQL> show parameter filesystemio_options
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
filesystemio_options                 string      ASYNCH
After restarting the Oracle instance, cached memory sits at 160700 KB:
# free
total used free shared buffers cached
Mem: 32887744 4583296 28304448 0 2992 160700
-/+ buffers/cache: 4419604 28468140
Swap: 4192924 0 4192924
Now run a query against a large table:
SQL> select /*+ full(channels) parallel(channels,4) */ count(*) from channels;
COUNT(*)
----------
793103894
Elapsed: 00:13:09.42
Check memory usage again: cached has grown to 9669284 KB.
# free
total used free shared buffers cached
Mem: 32887744 14155316 18732428 0 26484 9669284
-/+ buffers/cache: 4459548 28428196
Swap: 4192924 0 4192924
Enable direct I/O:
SQL> alter system set filesystemio_options=SETALL scope=spfile;
System altered.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
Drop the page cache:
# sync; echo 3 > /proc/sys/vm/drop_caches
# free
total used free shared buffers cached
Mem: 32887744 4374724 28513020 0 540 47532
-/+ buffers/cache: 4326652 28561092
Swap: 4192924 0 4192924
Restart the Oracle instance:
SQL> startup
ORACLE instance started.
Total System Global Area 4294967296 bytes
Fixed Size 2089432 bytes
Variable Size 301993512 bytes
Database Buffers 3976200192 bytes
Redo Buffers 14684160 bytes
Database mounted.
Database opened.
SQL> show parameter filesystemio_options
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
filesystemio_options                 string      SETALL
At this point cached memory is 94612 KB:
# free
total used free shared buffers cached
Mem: 32887744 4503964 28383780 0 1872 94612
-/+ buffers/cache: 4407480 28480264
Swap: 4192924 0 4192924
Run the same query again:
SQL> select /*+ full(channels) parallel(channels,4) */ count(*) from channels;
COUNT(*)
----------
793103894
Elapsed: 00:03:37.87
Noticeably faster, and cached memory stays at 96872 KB, essentially unchanged:
# free
total used free shared buffers cached
Mem: 32887744 4559484 28328260 0 43556 96872
-/+ buffers/cache: 4419056 28468688
Swap: 4192924 0 4192924
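If you want to script the before/after comparison, the cached figure can be pulled out of `free` output. A minimal sketch (the function name is mine, and the sample string uses the numbers from the first `free` run above):

```python
def parse_free(output):
    """Parse `free` output (values in KB) into a dict keyed by row label."""
    rows = {}
    for line in output.splitlines():
        if ':' in line:
            label, rest = line.split(':', 1)
            rows[label.strip()] = [int(x) for x in rest.split() if x.isdigit()]
    return rows

sample = """             total       used       free     shared    buffers     cached
Mem:      32887744    4583296   28304448          0       2992     160700
Swap:      4192924          0    4192924"""

mem = parse_free(sample)['Mem']
print(mem[-1])  # 160700  (the "cached" column)
```

Running it before and after the test query makes the page-cache growth easy to log.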
On a 64-bit OS, enabling huge page memory mapping for the Oracle SGA lets the system use memory more efficiently.
For background on huge pages and Oracle memory allocation, the two articles below explain the details very well and are worth reading:
Memory
Pythian Goodies: The Answer to Free Memory, Swap, Oracle, and Everything
Platform: Red Hat Enterprise Linux 5.3 64-bit, Oracle 10gR2 10.2.0.4 64-bit
Steps to enable huge pages.
Check the default (small) page size:
# getconf PAGE_SIZE
4096
Check the huge page size:
# grep Hugepagesize /proc/meminfo
Hugepagesize: 2048 kB
Comparing the two: the default page is 4 KB, while a huge page is 2 MB.
Suppose we want to allocate 4 GB for the SGA; that requires 4G / 2M = 2048 pages.
Add the following line to /etc/sysctl.conf:
vm.nr_hugepages = 2052 # slightly larger than 2048
# echo "vm.nr_hugepages = 2052" >> /etc/sysctl.conf
# sysctl -p
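The page count above generalizes to any SGA size. A small sketch (the function name and the headroom of 4 pages are my own choices, mirroring the 2052 used here):

```python
def hugepages_needed(sga_bytes, hugepage_kb=2048, headroom=4):
    """Ceil-divide the SGA size by the huge page size, plus a small cushion."""
    pages = -(-sga_bytes // (hugepage_kb * 1024))
    return pages + headroom

print(hugepages_needed(4 * 1024**3))  # 2052
```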
Huge pages are locked in memory and are never swapped out.
Add the following to /etc/security/limits.conf (16777216 KB = 16 GB leaves headroom for growth; any value above 4 GB would do):
# cat >> /etc/security/limits.conf <<EOF
> oracle soft memlock 16777216
> oracle hard memlock 16777216
> EOF
#
Restart the Oracle instance, then check huge page usage:
# cat /proc/meminfo |grep Huge
HugePages_Total: 2052
HugePages_Free: 1702
HugePages_Rsvd: 1699
Hugepagesize: 2048 kB
If you try to build a fast-refresh materialized view directly on a complex query, Oracle returns an error like this:
ORA-12015: cannot create a fast refresh materialized view from a complex query
Take this query, for example:
SELECT t2.c1
,t4.c1
,MAX(t3.c1)
FROM (SELECT MAX(t1.c1) c1
FROM t1) t4
,t2
,t3
WHERE t3.c1 > t4.c1
AND t2.c1 = t3.c2
GROUP BY t2.c1
,t4.c1
Thanks to yangtingkun, who when consulted suggested a workaround using nested materialized views, as follows:
SQL> CREATE TABLE T1(C1 NUMBER);
Table created.
SQL> CREATE TABLE T2(C1 NUMBER);
Table created.
SQL> CREATE TABLE T3(C1 NUMBER,C2 NUMBER);
Table created.
SQL> CREATE MATERIALIZED VIEW LOG ON T1
2 WITH ROWID, SEQUENCE (C1)
3 INCLUDING NEW VALUES;
Materialized view log created.
SQL> CREATE MATERIALIZED VIEW LOG ON T2
2 WITH ROWID, SEQUENCE (C1)
3 INCLUDING NEW VALUES;
Materialized view log created.
SQL> CREATE MATERIALIZED VIEW LOG ON T3
2 WITH ROWID, SEQUENCE (C1, C2)
3 INCLUDING NEW VALUES;
Materialized view log created.
SQL> CREATE MATERIALIZED VIEW MV_T4
2 REFRESH FAST AS
3 SELECT COUNT(*) CN, COUNT(C1), MAX(T1.C1) C1
4 FROM T1;
Materialized view created.
SQL> CREATE MATERIALIZED VIEW LOG ON MV_T4
2 WITH ROWID, SEQUENCE (C1)
3 INCLUDING NEW VALUES;
Materialized view log created.
SQL> CREATE MATERIALIZED VIEW MV_T123
2 REFRESH FAST AS
3 SELECT T2.C1 T2_C1, MV_T4.C1 T4_C1, COUNT(*) CNT, COUNT(T3.C1), MAX(T3.C1)
4 FROM MV_T4, T2, T3
5 WHERE T3.C1 > MV_T4.C1
6 AND T2.C1 = T3.C2
7 GROUP BY T2.C1, MV_T4.C1;
Materialized view created.
The fast-refresh materialized view is now created successfully.
PL/SQL's built-in packages include dbms_utility.comma_to_table, which splits a string on ',' delimiters, but it constrains the returned array type and is not a general-purpose routine. If your requirements happen to match its definition, it is worth considering. Its counterpart, dbms_utility.table_to_comma, converts a string array back into a comma-separated string.
The implementation below is adapted from an example by Tom Kyte.
Purpose: split a string with a PL/SQL function and return the pieces as a NUMBER array.
CREATE OR REPLACE TYPE typ_number_table AS TABLE OF NUMBER;
/
CREATE OR REPLACE FUNCTION split_string(p_string    IN VARCHAR2,
                                        p_delimiter IN VARCHAR2 DEFAULT ',')
  RETURN typ_number_table IS
  l_string    VARCHAR2(32000) DEFAULT p_string || p_delimiter;
  l_num       NUMBER;
  l_num_table typ_number_table := typ_number_table();
BEGIN
  LOOP
    l_num := instr(l_string, p_delimiter);
    EXIT WHEN nvl(l_num, 0) = 0;
    l_num_table.EXTEND;
    l_num_table(l_num_table.COUNT) := to_number(trim(substr(l_string, 1, l_num - 1)));
    l_string := substr(l_string, l_num + length(p_delimiter));
  END LOOP;
  RETURN l_num_table;
END;
/
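For comparison, the same splitting logic in Python (a rough sketch, not part of the original PL/SQL; the function name is mine):

```python
def split_numbers(s, delimiter=','):
    """Split a delimited string and return the pieces as numbers,
    mirroring the PL/SQL split_string function above."""
    return [int(p) if p.strip().isdigit() else float(p)
            for p in s.split(delimiter) if p.strip()]

print(split_numbers('1, 2,3'))  # [1, 2, 3]
```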
The system has 2 CPUs with 4 cores each and RAID 1+0 storage; the database is Oracle 10gR2 64-bit.
1. First, run the query as-is:
SQL> set autot trace
SQL> SELECT COUNT(*)
FROM channels
WHERE channels.snapshot_time > to_date('10/28/2008 00:00'
,'MM/DD/YYYY HH24:MI:SS')
AND channels.snapshot_time <= to_date('10/29/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS');
Elapsed: 00:01:15.96
Execution Plan
----------------------------------------------------------
Plan hash value: 1848729565
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 533K (7)| 01:46:45 | | |
| 1 | SORT AGGREGATE | | 1 | 11 | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 89M| 936M| 533K (7)| 01:46:45 | KEY | KEY |
|* 4 | INDEX FAST FULL SCAN | CHANNELS_UNQ | 89M| 936M| 533K (7)| 01:46:45 | KEY | KEY |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS')<TO_DATE('10/29/2008
23:50','MM/DD/YYYY HH24:MI:SS'))
4 - filter("CHANNELS"."SNAPSHOT_TIME"<=TO_DATE('10/29/2008 23:50','MM/DD/YYYY HH24:MI:SS') AND
"CHANNELS"."SNAPSHOT_TIME">TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS'))
Statistics
----------------------------------------------------------
8962 recursive calls
0 db block gets
308209 consistent gets
306423 physical reads
0 redo size
518 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
141 sorts (memory)
0 sorts (disk)
1 rows processed
As shown, the default plan used the index and took about 1 minute 16 seconds.
2. Now force a full table scan:
SQL> alter system flush buffer_cache;
SQL> alter system flush shared_pool;
SQL>
SELECT /*+ full(channels)*/
COUNT(*)
FROM channels
WHERE channels.snapshot_time > to_date('10/28/2008 00:00'
,'MM/DD/YYYY HH24:MI:SS')
AND channels.snapshot_time <= to_date('10/29/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS');
Elapsed: 00:00:49.31
Execution Plan
----------------------------------------------------------
Plan hash value: 4225188383
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 714K (6)| 02:22:51 | | |
| 1 | SORT AGGREGATE | | 1 | 11 | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 89M| 936M| 714K (6)| 02:22:51 | KEY | KEY |
|* 4 | TABLE ACCESS FULL | CHANNELS | 89M| 936M| 714K (6)| 02:22:51 | KEY | KEY |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS')<TO_DATE('10/29/2008
23:50','MM/DD/YYYY HH24:MI:SS'))
4 - filter("CHANNELS"."SNAPSHOT_TIME"<=TO_DATE('10/29/2008 23:50','MM/DD/YYYY HH24:MI:SS')
AND "CHANNELS"."SNAPSHOT_TIME">TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS'))
Statistics
----------------------------------------------------------
8922 recursive calls
0 db block gets
404835 consistent gets
403115 physical reads
0 redo size
518 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
141 sorts (memory)
0 sorts (disk)
1 rows processed
The full table scan took 49 seconds.
3. Query with a parallel degree of 4:
SQL> alter system flush buffer_cache;
SQL> alter system flush shared_pool;
SQL>
SELECT /*+ full(channels) parallel(channels,4) */
COUNT(*)
FROM channels
WHERE channels.snapshot_time > to_date('10/28/2008 00:00'
,'MM/DD/YYYY HH24:MI:SS')
AND channels.snapshot_time <= to_date('10/29/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS');
Elapsed: 00:00:11.96
Execution Plan
----------------------------------------------------------
Plan hash value: 166461870
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 198K (6)| 00:39:37 | | | | | |
| 1 | SORT AGGREGATE | | 1 | 11 | | | | | | | |
|* 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 11 | | | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 11 | | | | | Q1,00 | PCWP | |
|* 5 | FILTER | | | | | | | | Q1,00 | PCWC | |
| 6 | PX BLOCK ITERATOR | | 89M| 936M| 198K (6)| 00:39:37 | KEY | KEY | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL| CHANNELS | 89M| 936M| 198K (6)| 00:39:37 | KEY | KEY | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS')<TO_DATE('10/29/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
5 - filter(TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS')<TO_DATE('10/29/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
7 - filter("CHANNELS"."SNAPSHOT_TIME"<=TO_DATE('10/29/2008 23:50','MM/DD/YYYY HH24:MI:SS') AND
"CHANNELS"."SNAPSHOT_TIME">TO_DATE('10/28/2008 00:00','MM/DD/YYYY HH24:MI:SS'))
Statistics
----------------------------------------------------------
9661 recursive calls
4 db block gets
405834 consistent gets
403140 physical reads
632 redo size
518 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
152 sorts (memory)
0 sorts (disk)
1 rows processed
Full scan plus parallelism: about 12 seconds.
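As a rough comparison of the three timings above (a quick calculation, not from the original post):

```python
# Elapsed times in seconds, from the three runs above
timings = {
    'index fast full scan': 75.96,
    'serial full scan': 49.31,
    'parallel full scan (DOP 4)': 11.96,
}
base = timings['index fast full scan']
for name, secs in timings.items():
    print(f'{name}: {base / secs:.1f}x speedup vs. the index plan')
```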
Recording the steps here; the content mainly comes from this article.
Platform: Red Hat Enterprise Linux 5 64-bit, Oracle 10gR2 10.2.0.4 64-bit
1. As root, install the required RPM packages:
# rpm -Uvh libaio-0.3.106-3.2.x86_64.rpm
# rpm -Uvh libaio-devel-0.3.106-3.2.x86_64.rpm
2. Enable asynchronous I/O at the OS level.
Unlike the RHEL 3 procedure described in [Note 225751.1], there is no need to set aio-max-size, and that file no longer exists under /proc/sys/fs: the limit on I/O size was removed starting with the 2.6 kernel [Note 549075.1]. Per [Note 471846.1], Oracle recommends setting aio-max-nr to 1048576 or higher.
# echo 1048576 > /proc/sys/fs/aio-max-nr
To make this kernel parameter permanent, add the following line to /etc/sysctl.conf:
fs.aio-max-nr = 1048576
Apply the setting:
# /sbin/sysctl -p
3. Enable asynchronous I/O at the database level.
First, change the database parameters. Unlike the RHEL 3 procedure in [Note 225751.1], Oracle 10gR2 enables asynchronous I/O support by default, so there is no need to relink the database software, and there is no 'skgaioi.o' file under '$ORACLE_HOME/rdbms/lib'. In some cases, however, Oracle cannot report I/O activity or events to the operating system [Note 365416.1], so the following steps are needed.
Switch to the oracle user from this point on:
SQL> alter system set disk_asynch_io=TRUE scope=spfile;
SQL> alter system set filesystemio_options=asynch scope=spfile;
SQL> shutdown immediate
$ cd $ORACLE_HOME/rdbms/lib
$ ln -s /usr/lib/libaio.so.1 skgaio.o
$ make PL_ORALIBS=-laio -f ins_rdbms.mk async_on
SQL> startup
In Oracle 10gR2, AIO is already enabled by default. You can check whether the oracle binary was linked with AIO support using ldd or nm; any output means support is enabled.
[oraprod@db01 ~]$ /usr/bin/ldd $ORACLE_HOME/bin/oracle | grep libaio
libaio.so.1 => /usr/lib64/libaio.so.1 (0x00002aaaac4a9000)
[oraprod@db01 ~]$ /usr/bin/nm $ORACLE_HOME/bin/oracle | grep io_getevent
w io_getevents@@LIBAIO_0.4
4. Verify that asynchronous I/O is actually in use.
Per [Note 370579.1], you can tell whether AIO is active by inspecting the slabinfo statistics (slab is the Linux memory allocator): the AIO-related memory structures have been allocated when the second and third columns of the kiocb row are non-zero. Unlike kernel 2.4.x, no kiobuf row appears, because kiobuf was removed from the kernel starting with 2.5.43.
$ cat /proc/slabinfo | grep kio
kioctx 64 110 384 10 1 : tunables 54 27 8 : slabdata 11 11 0
kiocb 13 315 256 15 1 : tunables 120 60 8 : slabdata 21 21 44
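The kiocb check can be scripted; a sketch (aio_in_use is a hypothetical helper, and the sample string is the kiocb row from the output above):

```python
def aio_in_use(slabinfo_text):
    """Return True if the kiocb slab row shows allocated objects,
    i.e. its second and third columns are both non-zero."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == 'kiocb':
            active, total = int(fields[1]), int(fields[2])
            return active > 0 and total > 0
    return False

sample = "kiocb 13 315 256 15 1 : tunables 120 60 8 : slabdata 21 21 44"
print(aio_in_use(sample))  # True
```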
1. Find lines that do not contain "2009": ^\(.*2009\)\@!.*$
2. Find words that do not contain "2009": \<\(\(\(2009\)\@!\)\w\)\+\>
The channels table has the index channels_unq(svr_grp_id, channels_record_id, snapshot_time).
Execute the following statement:
SELECT channels.channel_record_id
,SUM(channels.max_viewers) AS viewer_sum
FROM channels
WHERE channels.snapshot_time > to_date('11/25/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS')
AND channels.snapshot_time <= to_date('12/06/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS')
GROUP BY channels.channel_record_id;
474 rows selected.
Elapsed: 02:06:21.62
Execution Plan
----------------------------------------------------------
Plan hash value: 443160641
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 39 | 365 (1)| 00:00:05 | | |
| 1 | HASH GROUP BY | | 1 | 39 | 365 (1)| 00:00:05 | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR | | 1 | 39 | 364 (0)| 00:00:05 | KEY | KEY |
| 4 | TABLE ACCESS BY LOCAL INDEX ROWID| CHANNELS | 1 | 39 | 364 (0)| 00:00:05 | KEY | KEY |
|* 5 | INDEX SKIP SCAN | CHANNELS_UNQ | 1 | | 364 (0)| 00:00:05 | KEY | KEY |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE('11/25/2008 23:50','MM/DD/YYYY HH24:MI:SS')<TO_DATE('12/06/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
5 - access("CHANNELS"."SNAPSHOT_TIME">TO_DATE('11/25/2008 23:50','MM/DD/YYYY HH24:MI:SS') AND "CHANNELS"."SNAPSHOT_TIME"<=TO_DATE('12/06/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
filter("CHANNELS"."SNAPSHOT_TIME">TO_DATE('11/25/2008 23:50','MM/DD/YYYY HH24:MI:SS') AND "CHANNELS"."SNAPSHOT_TIME"<=TO_DATE('12/06/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
Statistics
----------------------------------------------------------
8100 recursive calls
0 db block gets
577374676 consistent gets
4631561 physical reads
59904 redo size
11269 bytes sent via SQL*Net to client
833 bytes received via SQL*Net from client
33 SQL*Net roundtrips to/from client
91 sorts (memory)
0 sorts (disk)
474 rows processed
It took over two hours. As you can see, at step 5 the optimizer incorrectly chose an index skip scan on channels_unq.
Add a hint to the query to force a full table scan:
SELECT /*+ FULL(channels) */channels.channel_record_id
,SUM(channels.max_viewers) AS viewer_sum
FROM channels
WHERE channels.snapshot_time > to_date('11/25/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS')
AND channels.snapshot_time <= to_date('12/06/2008 23:50'
,'MM/DD/YYYY HH24:MI:SS')
GROUP BY channels.channel_record_id;
474 rows selected.
Elapsed: 00:12:33.59
Execution Plan
----------------------------------------------------------
Plan hash value: 4197359631
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 39 | 1770K (1)| 05:54:04 | | |
| 1 | HASH GROUP BY | | 1 | 39 | 1770K (1)| 05:54:04 | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 1 | 39 | 1770K (1)| 05:54:04 | KEY | KEY |
|* 4 | TABLE ACCESS FULL | CHANNELS | 1 | 39 | 1770K (1)| 05:54:04 | KEY | KEY |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE('11/25/2008 23:50','MM/DD/YYYY HH24:MI:SS')<TO_DATE('12/06/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
4 - filter("CHANNELS"."SNAPSHOT_TIME">TO_DATE('11/25/2008 23:50','MM/DD/YYYY HH24:MI:SS') AND "CHANNELS"."SNAPSHOT_TIME"<=TO_DATE('12/06/2008 23:50','MM/DD/YYYY HH24:MI:SS'))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2762409 consistent gets
2391028 physical reads
4236 redo size
11269 bytes sent via SQL*Net to client
833 bytes received via SQL*Net from client
33 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
474 rows processed
This time it ran a full scan, finishing in about 12 minutes.
I have not had time to track down the root cause. Two guesses:
1. The statistics may be stale.
2. Since bind variables were used (simplified away in this example), it may be a bind variable peeking issue.
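For a sense of scale, compare the I/O statistics of the two runs (numbers taken from the outputs above):

```python
# consistent gets: index skip scan run vs. forced full scan run
index_gets, full_gets = 577374676, 2762409
ratio = index_gets / full_gets
print(f'{ratio:.0f}x more consistent gets with the skip scan')  # 209x ...
```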