Killing an Oracle process from inside Oracle

I had a following situation few days ago – I was running a CREATE TABLE AS SELECT over a heterogenous services dblink. However I cancelled this command via pressing CTRL+C twice in Windows sqlplus (this actually just kills the client sqlplus and not the call).

Anyway, when I wanted to drop that table involved, this happened:

SQL> drop table MYTABLE;
drop table MYTABLE
*
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified

I can’t drop a table as someone is holding a lock on it. Fair enough, this was a dev environment used only by me, so I used DBA_OBJECTS.OBJECT_ID to find out the object ID of that table:

SQL> @o MYTABLE

owner                     object_name                    object_type        CREATED           LAST_DDL_TIME     status           OID      D_OID
------------------------- ------------------------------ ------------------ ----------------- ----------------- --------- ---------- ----------
XYZ_DEV01_OWNER MYTABLE TABLE 20080616 11:08:44 20080616 11:08:44 VALID 63764 63764

…and then I queried what enqueue locks were held on that object:

SQL> select * from v$lock where id1=63764;

ADDR     KADDR           SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
40034278 40034290 130 TM 63764 0 6 0 2662 0

Ok, I see session 130 holding a TM lock on that table. I queried the corresponding SERIAL# from v$session as well and killed the session:

SQL> alter system kill session '130,8764';
alter system kill session '130,8764'
*
ERROR at line 1:
ORA-00031: session marked for kill SQL> select * from v$lock where id1=63764; ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
40034278 40034290 130 TM 63764 0 6 0 2668 0

After hanging for 60 seconds, my kill command gave up (and marked my session for kill), but my lock was still not released… Now what?

This happens when the victim session is so stuck somewhere in an Oracle call that it never has a chance to receive the message it has been killed. And apparently some resources used can in that call can’t be released (for whatever reason, it may be by design, it may just be a bug).

The below queries against V$SESSION and V$LOCK show that even though the session has been marked to be in killed status, it’s still holding a lock:

--===session usid =====

select sid,s.serial#,taddr,sql_hash_value,prev_hash_value,p.spid,s.username,machine,logon_time
from v$session s,v$process p where s.paddr=p.addr and sid='&sid';

SQL> @usid 130

USERNAME                SID                 AUDSID OSUSER           MACHINE            PROGRAM              SPID          HASH_VALUE   LASTCALL STATUS
----------------------- -------------- ----------- ---------------- ------------------ -------------------- ------------ ----------- ---------- --------
XYZ_DEV01_OWNER '130,8764' 33533 1288249 \XYZHOST001 sqlplus.exe 3872 3564023715 4032 KILLED SQL> select * from v$lock where id1=63764; ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
40034278 40034290 130 TM 63764 0 6 0 2695 0

Ok, I tried various other options, like kill immediate and disconnect, which should have higher chance to clean up my session properly:

SQL> alter system kill session '130,8764' immediate;
alter system kill session '130,8764' immediate
*
ERROR at line 1:
ORA-00031: session marked for kill SQL> alter system disconnect session '130,8764' immediate;
alter system disconnect session '130,8764' immediate
*
ERROR at line 1:
ORA-00031: session marked for kill SQL> select * from v$lock where id1=63764; ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
40034278 40034290 130 TM 63764 0 6 0 2710 0

Still no luck, lock is there.

So I found the OS PID of my server process (or actually OS Thread ID inside oracle.exe process as I was on Windows) and used oradebug short_stack to check where that process was stuck (the output is slightly formatted):

SQL> oradebug setospid 3872;
Oracle pid: 18, Windows thread id: 3872, image: ORACLE.EXE (SHAD) SQL> oradebug short_stack;
_ksdxfstk+14<-_ksdxcb+1481<-_ssthreadsrgruncallback+428<-_OracleOradebugThreadStart@4+819
<-7C80B680<-00000000<-71A557C4<-71AB4376<-6298540C<-6298325E<-60A0D931<-609D005F<-609B073D<-609AF9
endExitAgent+202<-_hoxexit+188<-_hogmdis+890<-_hogdisc+8<-_xupidhs+137<-_upidhs+20<-_kpudtch+305
<-_OCIServerDetach+12<-_ncodncf+268<-_npidr0+2300<-_npifcc+46<-_qerrmFBu+457<-_qerrmFetch+1
+1291<-_opiodr+1099<-_rpidrus+178<-_rpidru+88<-_rpiswu2+426<-_rpidrv+1461<-_psddr0+449
<-_psdnal+283<-_pevm_EXIM+153<-_pfrinstr_EXIM+34<-_pfrrun_no_tool+56<-_pfrrun+781<-_plsql_run+738
<-_pr+1099<-_opidrv+819<-_sou2o+45<-_opimai_real+112<-_opimai+92<-_OracleThreadStart@4+726<-7C80B680

This terse stack shows (start reading from bottom right to left) this process is stuck somewhere “above” qerrmFetch (Remote Fetch). I guess the functions starting with “h” above that are heterogenous services functions. By the way, V$SESSION_WAIT didn’t show any wait state changes either and the session itself was still constantly waiting for “HS message to agent” event. So, being stuck in a HS call was probably the reason why that session could not clean itself up.

Now, in such situations one normally would proceed with ORAKILL on Windows or just killing that server process at Unix level (after carefully thinking what you’re about to do). Killing the process makes PMON to perform the cleanup and PMON usually does clean all resources up ok. However I didn’t have access to that Windows server box, so OS kill was out of question.

So, I used another trick. While being connected to the victim process using oradebug, I issued this:

SQL> oradebug event immediate crash;
ORA-00072: process "Windows thread id: 3872, image: ORACLE.EXE (SHAD)" is not active
SQL>

After waiting for a while (crashdump occurred), sqlplus reported that the target process doesn’t exist anymore. Thanks to the crash, PMON had woken up and performed the cleanup.

Let’s see if it helped:

SQL> oradebug short_stack;
ORA-00072: process "Windows thread id: 3872, image: ORACLE.EXE (SHAD)" is not active SQL> select * from v$lock where id1=63764; no rows selected SQL> @usid 130 no rows selected SQL> drop table MYTABLE; Table dropped.

Yippee, I finally got rid of that session, lock and could drop my table!

Note that I would still prefer killing the processes with Unix kill or Windows orakill, however there are some bugs with orakill (search in Metalink) that it doesn’t always succeed killing the thread properly. And in my case I didn’t have OS access anyway.

Ideally, the ALTER SYSTEM KILL session command should do all needed cleanup for us, but there are some corner cases involving loops, hangs and bugs where this approach won’t work. There’s a Metalink note 1020720.102 with a list of ways for killing Oracle processes/threads from OS level.

NB! If you want to move to the "New World" - offload your data and workloads to Hadoop, without having to re-write your existing applications - check out Gluent. We are making history! ;-)

最新文章

  1. 【网络编程/C++】修改本机ip地址
  2. .NET技术大系概览 (迄今为止最全的.NET技术栈)
  3. Java多线程编程核心技术---对象及变量的并发访问(一)
  4. NBU7.0.1迁移C:\Veritas\Netbackup\db到其他盘
  5. APKTool 提取APK文件的资源
  6. sql 联合查询速度慢,需要对其进行分组
  7. LINUX Find命令使用
  8. hdu 5443 The Water Problem 线段树
  9. hdu 4099 Revenge of Fibonacci Trie树与模拟数位加法
  10. JAVA实现word doc docx pdf excel的在线浏览 - 仿百度文库 源码
  11. 你好,C++(19)“老师,我这次四级考试过了没有?”——4.2 条件选择语句
  12. javac不是内部或外部命令
  13. sqlserver提高篇
  14. C++ UTF8和GB2312相互转换
  15. 微信小程序前端开发踩坑(一)
  16. RabbitMQ教程(二) ——linux下安装rabbitmq
  17. IIS 网站 HTTP 转 HTTPS
  18. 版本控制(svn、Git)环境创建
  19. Net SMTP QQ 发送邮件
  20. ActiveMQ 处理不同类型的消息

热门文章

  1. activity栈管理的3种方式
  2. numpy计算
  3. EventStore文件存储设计
  4. 【Nginx安装】CentOS7安装Nginx及配置
  5. 如何修改织梦dedecms文章标题的最大长度
  6. for、while循环(java基础知识四)
  7. IDEA maven dependency自动提示
  8. javaScript实现增删改查
  9. C#在一段数字区间内随机生成若干个互不相同的随机数
  10. CentOS 6.5升级到CentOS 7