https://andreynikolaev.wordpress.com/2010/10/28/appetizer-for-dtrace/

Appetizer for DTrace

Filed under: DTrace,Latch — andreynikolaev @ 3:33 pm 

To discover how the Oracle latch works, we need a tool. The Oracle Wait Interface lets us explore only the waits. Oracle X$/V$ tables instrument latch acquisition and give us performance counters. To see how a latch works through time and to observe short-duration events, we need something like a stroboscope in physics. Luckily, such a tool exists in Oracle Solaris: DTrace, the Solaris 10 Dynamic Tracing framework!

Here I would like to give a brief, DBA-oriented intro to some DTrace topics. Tanel Poder, James Morle and Doug Burns have used DTrace for performance diagnostics for years. But it is still not as popular as it should be in our DBA community. One of the problems is that it speaks another “language”. The best DTrace presentations talk about “probes”, “actions”, unfamiliar Solaris kernel structures, etc. Begging the pardon of the DTrace inventors, I will use more database-like terminology here.

I will not even try to touch the most popular DTrace usage. Anyone who’s interested in this revolutionary technology should read the DTrace introduction, the DTrace user guide, the upcoming DTrace book and the DTrace community materials.

DTrace is event-driven instrumentation of the Solaris kernel and user applications. This is the key point. No application change is needed to use DTrace. This is very similar to triggers in the Oracle database. You define the probe (trigger) to fire on an event, and write the action (body) to execute. I will use this analogy in this post.

A probe is a point of instrumentation made available by a provider. A provider is analogous to the Oracle trigger type (system/user triggers, DML, DDL, etc.). Officially, the provider represents a methodology for instrumenting the system.

Popular Solaris providers are pid, syscall, sysinfo, … As of now (11.2.0.2), we are still waiting for Oracle-specific providers to integrate Oracle Server and DTrace. But we can do a lot with the generic pid provider, which allows us to set triggers on any function call in a user application. My goal is to see the latch operations on the fly.

DTrace describes the triggering probe in a four-field format: provider:module:function:name. If one needs to set a trigger inside the oracle process with Solaris spid 16444, firing on entry to the function kslgetl (get exclusive latch), the probe description will be pid16444:oracle:kslgetl:entry

Surprisingly, this is enough to start using DTrace with Oracle. Suppose I would like to see latch acquisitions by the MMON process. In my database the MMON process currently has spid 16444. Ask your Solaris SysAdmin for the dtrace_user privilege and type:

$ /usr/sbin/dtrace -n 'pid16444:oracle:kslgetl:entry'
dtrace: description 'pid16444:oracle:kslgetl:entry' matched 1 probe
CPU ID FUNCTION:NAME
1 67480 kslgetl:entry
1 67480 kslgetl:entry
0 67480 kslgetl:entry

<ctrl-C>

This simple DTrace one-liner traces calls of the kslgetl() function and, through the CPU column, shows how the process migrates between CPUs.

How does it work? Unlike standard tracing tools, DTrace works in the Solaris kernel. When I activated this probe, dtrace set a trigger at the entry to the kslgetl function. When the oracle process entered this function, execution went to the Solaris kernel and DTrace filled its buffers with the data. The dtrace program then printed out these buffers.

Let us compare DTrace and the obsolete Oracle Trace. Both are event driven. Otrace tried to catch all the events in the instance; DTrace catches only what you ask for. Otrace allowed you to set filters; in DTrace you write a program. Otrace was fully userland; DTrace works in the OS kernel.

Kernel-based tracing is much more stable and has less overhead than userland tracing. DTrace sees all the system activity and can take into account the ‘unaccounted for’ time associated with kernel calls, scheduling, etc.

Actions (trigger bodies!) are what happens when a probe is hit. Actions are fully programmable in a language that will be familiar to anybody who has ever used C and awk. Action code is enclosed in curly brackets {} and can use the arguments of the function call as arg0, arg1, etc.

Naturally, the next step in our DTrace latch tracing is to see the latch function arguments. It is easy to write such a script (ksl_args.d). Remember that Oracle acquires exclusive latches using kslgetl(laddr, wait, why, where), and shared latches using kslgetsl(laddr, wait, why, where, rs) (ksl_get_shared_latch() in 11g):

#!/usr/sbin/dtrace -Zs
#pragma D option quiet
pid$target::kslgetl:entry
{
printf("%s(0x%X,wait=%d,why=0x%X,whr=%d)\n",probefunc,arg0,arg1,arg2,arg3);
}
 
pid$target::kslgetsl:entry,
pid$target::ksl_get_shared_latch:entry
{
printf("%s(0x%X,wait=%d,why=0x%X,whr=%d,rs=%d)\n",probefunc,arg0, arg1,arg2,arg3,arg4);
}
pid$target::kslfre:entry
{
printf(" %s(0x%X)\n",probefunc,arg0);
}

The script probes (triggers) will fire on each entry to the latch acquisition functions. The printf()s inside the trigger bodies (actions) will print out the arguments of these functions to the dtrace kernel buffers. The $target macro will be replaced at runtime by the spid number from the -p script option. And this is the output:


$ ./ksl_args.d -p 16444
kslgetl(0x38000CC98,wait=1,why=0x0,whr=175)
kslfre(0x38000CC98)
ksl_get_shared_latch(0x38DE6DD00,wait=1,why=0x38DE6DCB8,whr=290,rs=16)
kslfre(0x38DE6DD00)
kslgetl(0x38BE4C328,wait=1,why=0x0,whr=3487)
kslfre(0x38BE4C328)
kslgetl(0x38BE4C328,wait=1,why=0x0,whr=3510)
...
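
When the per-call stream gets too long to read, the same probes can feed a DTrace aggregation instead of printf(). The following is only a sketch under the assumptions already used above (the same function names, the script started with -p); the aggregation name @calls and the 10-second interval are arbitrary choices for illustration, not part of the original script:

```d
#!/usr/sbin/dtrace -Zs
#pragma D option quiet

/* Count calls per latch function instead of printing every call. */
pid$target::kslgetl:entry,
pid$target::kslgetsl:entry,
pid$target::ksl_get_shared_latch:entry,
pid$target::kslfre:entry
{
    @calls[probefunc] = count();
}

/* Every 10 seconds print the counters, then start again from zero. */
profile:::tick-10sec
{
    printa("%-24s %@10d\n", @calls);
    trunc(@calls);
    printf("\n");
}
```

Aggregations are summarized inside the kernel, so only the counters travel to the dtrace program rather than one line per latch get.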

One can focus on a particular latch using a predicate (WHEN clause!). A predicate takes the form / … / just before the action code. The action will run only when the predicate evaluates to true.
Look at my test Oracle instance, which suffered from “transaction allocation” latch contention.

select addr,latch#,name,gets,misses,sleeps,spin_gets,wait_time from v$latch_parent
where name='transaction allocation'

ADDR     LATCH# NAME                        GETS    MISSES SLEEPS SPIN_GETS WAIT_TIME
50010AEC    180 transaction allocation 520710921 387182198  74543 387117330 380364770

To see what happens with this latch I used the script:

#!/usr/sbin/dtrace -Zs
#pragma D option quiet
pid$target::kslgetl:entry
/ arg0 == 0x50010AEC /
{
printf("%s(0x%X,wait=%d,why=0x%X,whr=%d)\n",probefunc,arg0,arg1,arg2,arg3);
}
 
pid$target::kslfre:entry
/ arg0 == 0x50010AEC /
{
printf(" %s(0x%X)\n",probefunc,arg0);
}

...
kslgetl(0x50010AEC,wait=1,why=0x0,whr=2098)
kslfre(0x50010AEC)
...

Latch “where”=2098 means “ktcxbr: parent” in x$ksllw. Oracle acquires the parent “transaction allocation” latch at this “where” during a select from v$transaction. Concurrent selects from this fixed view are indeed the root cause of the latch contention.
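
If the latch is taken from many code locations, it may be easier to count the gets per “where” than to read them one by one. This is a hedged sketch, not the script I actually ran: it reuses the latch address 0x50010AEC from the query above and assumes, as before, that arg3 of kslgetl is the “where” code. The aggregation name @gets is made up for the example:

```d
#!/usr/sbin/dtrace -Zs
#pragma D option quiet

/* Same predicate as before: only the "transaction allocation" parent latch. */
pid$target::kslgetl:entry
/ arg0 == 0x50010AEC /
{
    @gets[arg3] = count();      /* key: the "where" code of this get */
}

/* On Ctrl-C print one line per "where", sorted by number of gets. */
dtrace:::END
{
    printf("%8s %10s\n", "WHERE", "GETS");
    printa("%8d %@10d\n", @gets);
}
```

The “where” numbers in the output can then be looked up in x$ksllw, exactly as was done for 2098.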

And finally I would like to show the latch in memory itself. No problem. On entry to kslgetl, arg0 is the latch address. I will check this location on entry to and return from the latch functions. The only thing to remember is that a DTrace probe acts in the kernel address space. You need to copy the latch value from the user address space into a kernel buffer using the copyin(user_address, size) DTrace function:

#!/usr/sbin/dtrace -Zs
#pragma D option quiet
 
pid$target::kslgetl:entry,
pid$target::kslfre:entry
/ arg0 == 0x50010AEC /
{
laddress = arg0;                           /* save laddress */
latch= *(uint32_t *)copyin(laddress, 4);   /* copy latch value from user space*/
printf("%s(0x%X...) \tlatch=0x%X (entry),",probefunc,arg0,latch);
}
 
pid$target::kslgetl:return,
pid$target::kslfre:return
/ laddress /
{
latch= *(uint32_t *)copyin(laddress, 4);   /* copy latch value from user space*/
printf(" 0x%X (return)\n",latch);
laddress =0;
}

...
kslgetl(0x50010AEC...) latch=0x00 (entry), 0xFF (return)
kslfre(0x50010AEC...) latch=0xFF (entry), 0x00 (return)
...
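
The same entry probes can also give a rough picture of how long the latch is kept. The sketch below is an illustration under stated assumptions rather than a precise measurement: it uses the thread-local variable self->get_ts (a name invented here), measures from entry into kslgetl to entry into kslfre, and therefore includes any time spent spinning or sleeping for the latch. It also assumes every get is followed by a kslfre for the same address; a failed no-wait get would simply be overwritten by the next one.

```d
#!/usr/sbin/dtrace -Zs
#pragma D option quiet

/* Remember when this process asked for the latch. */
pid$target::kslgetl:entry
/ arg0 == 0x50010AEC /
{
    self->get_ts = timestamp;   /* nanoseconds */
}

/* On free, add the elapsed time to a power-of-two histogram. */
pid$target::kslfre:entry
/ arg0 == 0x50010AEC && self->get_ts /
{
    @held["ns from kslgetl entry to kslfre entry"] =
        quantize(timestamp - self->get_ts);
    self->get_ts = 0;
}
```

The histogram is printed when the script is stopped with Ctrl-C.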
