Quartz is a widely known job scheduling framework. Here is how the official website introduces it:

-------------------------------------------------------------------------------------------------------------------------

What is the Quartz Job Scheduling Library?

Quartz is a richly featured, open source job scheduling library that can be integrated within virtually any Java application - from the smallest stand-alone application to the largest e-commerce system. Quartz can be used to create simple or complex schedules for executing tens, hundreds, or even tens-of-thousands of jobs; jobs whose tasks are defined as standard Java components that may execute virtually anything you may program them to do. The Quartz Scheduler includes many enterprise-class features, such as support for JTA transactions and clustering.

Quartz is freely usable, licensed under the Apache 2.0 license.

-------------------------------------------------------------------------------------------------------------------------

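In short: Quartz is a feature-rich, open source job scheduling library that can be embedded in almost any Java application, from the smallest stand-alone program to the largest e-commerce system. It can schedule anywhere from a handful to tens of thousands of jobs, each defined as a standard Java component, and it includes enterprise features such as JTA transaction support and clustering.

Quartz is free to use under the Apache 2.0 license.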

-------------------------------------------------------------------------------------------------------------------------

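Our company's projects also use Quartz. I recently ran into some Quartz problems and read parts of the Quartz source code to answer them; this post shares what I found.

I started out by studying Quartz's misfire policies. When a trigger passes its next scheduled fire time without firing, for example because a job ran too long, the service was down, or the job was paused, Quartz has to decide how to handle the misfire (a missed firing). Quartz triggers come in two main kinds, SimpleTrigger and CronTrigger. Projects mostly use CronTrigger, so this post covers only the CronTrigger misfire policies (SimpleTrigger's policies differ and are left for a later post).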

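The misfire policy constants available on CronTrigger are listed below (the first two are declared in CronTrigger itself, the last two are inherited from Trigger):

  1. MISFIRE_INSTRUCTION_FIRE_ONCE_NOW = 1
  2. MISFIRE_INSTRUCTION_DO_NOTHING = 2
  3. MISFIRE_INSTRUCTION_SMART_POLICY = 0
  4. MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY = -1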

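According to the JavaDoc and the official documentation, the policies behave as follows:

  1. MISFIRE_INSTRUCTION_FIRE_ONCE_NOW: fire once immediately, then continue on the schedule defined by the cron expression
  2. MISFIRE_INSTRUCTION_DO_NOTHING: do nothing and wait for the next fire time defined by the cron expression
  3. MISFIRE_INSTRUCTION_SMART_POLICY: a "smart" policy whose behaviour depends on the trigger type; for a CronTrigger it is MISFIRE_INSTRUCTION_FIRE_ONCE_NOW
  4. MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY: make up every missed fire time. For example, if a job runs every 15 s and 4 minutes of fire times were missed, the job is run 4 * (60 / 15) = 16 times in quick succession to catch up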

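However, when I wrote an example, the actual behaviour of policies 1 and 2 did not quite match the documentation. The example job's cron expression is 0/10 * * * * ?, so it fires every 10 s.

The test steps were:

The job's next fire time is 15:05:10 and the misfire policy is MISFIRE_INSTRUCTION_FIRE_ONCE_NOW (1).

1. Pause the job until 15:05:35.

2. Resume the job. It immediately runs 3 times.

Setting the misfire policy to MISFIRE_INSTRUCTION_DO_NOTHING produces the same behaviour. This result does not match the documentation.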
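The three immediate executions line up with the fire times that passed during the pause. The following is a minimal sketch, assuming a fixed-interval schedule like the one in the experiment; `MissedFireTimes` and `missedBetween` are hypothetical names for illustration and are not part of Quartz.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Quartz class.
final class MissedFireTimes {

    /**
     * Lists the fire times of a fixed-interval schedule that fall between
     * firstMissed and resumedAt (both inclusive, same day).
     */
    static List<LocalTime> missedBetween(LocalTime firstMissed, LocalTime resumedAt, int intervalSeconds) {
        if (intervalSeconds <= 0) {
            throw new IllegalArgumentException("intervalSeconds must be positive");
        }
        List<LocalTime> missed = new ArrayList<>();
        // Work in seconds of the day so the loop cannot wrap around midnight.
        for (int s = firstMissed.toSecondOfDay(); s <= resumedAt.toSecondOfDay(); s += intervalSeconds) {
            missed.add(LocalTime.ofSecondOfDay(s));
        }
        return missed;
    }

    public static void main(String[] args) {
        // Values from the experiment: first missed fire time 15:05:10, resumed at 15:05:35, every 10 s.
        List<LocalTime> missed = missedBetween(LocalTime.of(15, 5, 10), LocalTime.of(15, 5, 35), 10);
        System.out.println(missed.size() + " missed fire times: " + missed);
        // prints "3 missed fire times: [15:05:10, 15:05:20, 15:05:30]"
    }
}
```

The same helper reproduces the earlier arithmetic: a 15 s schedule that misses just under 4 minutes yields 16 fire times.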

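So I turned to the Quartz source. Starting from the job itself, I set a breakpoint and found that jobs are executed by WorkerThread objects, which are managed by the SimpleThreadPool thread pool.

The core code is: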

    // WorkerThread.class
    // Hand a task to the worker thread
    public void run(Runnable newRunnable) {
        synchronized(lock) {
            if(runnable != null) {
                throw new IllegalStateException("Already running a Runnable!");
            }

            runnable = newRunnable;
            lock.notifyAll();
        }
    }

    // Loop, executing each task as it is handed in
    @Override
    public void run() {
        boolean ran = false;

        while (run.get()) {
            try {
                synchronized(lock) {
                    while (runnable == null && run.get()) {
                        lock.wait(500);
                    }

                    if (runnable != null) {
                        ran = true;
                        runnable.run();
                    }
                }
            } catch (InterruptedException unblock) {
                // do nothing (loop will terminate if shutdown() was called)
                try {
                    getLog().error("Worker thread was interrupt()'ed.", unblock);
                } catch(Exception e) {
                    // ignore to help with a tomcat glitch
                }
            } catch (Throwable exceptionInRunnable) {
                try {
                    getLog().error("Error while executing the Runnable: ",
                        exceptionInRunnable);
                } catch(Exception e) {
                    // ignore to help with a tomcat glitch
                }
            } finally {
                synchronized(lock) {
                    runnable = null;
                }
                // repair the thread in case the runnable mucked it up...
                if(getPriority() != tp.getThreadPriority()) {
                    setPriority(tp.getThreadPriority());
                }

                if (runOnce) {
                    run.set(false);
                    clearFromBusyWorkersList(this);
                } else if(ran) {
                    ran = false;
                    makeAvailable(this);
                }
            }
        }
        // (rest of the method not shown in this excerpt)
    }

As you can see, a task is executed once it has been handed to a worker thread. Working backwards from there leads to the thread pool code:

    // SimpleThreadPool.class
    public boolean runInThread(Runnable runnable) {
        if (runnable == null) {
            return false;
        }

        synchronized (nextRunnableLock) {
            handoffPending = true;

            // Wait until a worker thread is available
            while ((availWorkers.size() < 1) && !isShutdown) {
                try {
                    nextRunnableLock.wait(500);
                } catch (InterruptedException ignore) {
                }
            }

            if (!isShutdown) {
                WorkerThread wt = (WorkerThread)availWorkers.removeFirst();
                busyWorkers.add(wt);
                wt.run(runnable);
            } else {
                // If the thread pool is going down, execute the Runnable
                // within a new additional worker thread (no thread from the pool).
                WorkerThread wt = new WorkerThread(this, threadGroup,
                        "WorkerThread-LastJob", prio, isMakeThreadsDaemons(), runnable);
                busyWorkers.add(wt);
                workers.add(wt);
                wt.start();
            }
            nextRunnableLock.notifyAll();
            handoffPending = false;
        }

        return true;
    }

The thread pool takes a worker thread from the queue of available workers and hands the task to it (a WorkerThread), after which the task is executed.
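The hand-off between the pool and a worker can be modelled in a few lines of plain Java. This is a simplified sketch, not the Quartz classes: `HandOffWorker`, `submit` and `shutdown` are hypothetical names, there is only one worker and no pool, and unlike the Quartz code above the task runs outside the lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Simplified model of the wait/notify hand-off; not the Quartz implementation.
final class HandOffWorker extends Thread {
    private final Object lock = new Object();
    private Runnable runnable;                 // pending or running task, guarded by lock
    private volatile boolean running = true;

    /** Hands a task to this worker; fails if the worker is still busy. */
    void submit(Runnable task) {
        synchronized (lock) {
            if (runnable != null) {
                throw new IllegalStateException("Already running a Runnable!");
            }
            runnable = task;
            lock.notifyAll();                  // wake the worker loop
        }
    }

    /** Asks the worker loop to stop after the current task, if any. */
    void shutdown() {
        running = false;
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        while (running) {
            Runnable task;
            synchronized (lock) {
                // Sleep in short slices until a task arrives or shutdown is requested.
                while (runnable == null && running) {
                    try {
                        lock.wait(500);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                task = runnable;
            }
            if (task != null) {
                try {
                    task.run();
                } finally {
                    synchronized (lock) {
                        runnable = null;       // mark the worker as free again
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HandOffWorker worker = new HandOffWorker();
        worker.start();
        CountDownLatch done = new CountDownLatch(1);
        worker.submit(done::countDown);
        System.out.println("task executed: " + done.await(2, TimeUnit.SECONDS));
        worker.shutdown();
        worker.join();
    }
}
```

The short timed wait mirrors the `lock.wait(500)` in the excerpt: the worker rechecks its run flag periodically, so a shutdown request is noticed even if a notification is missed.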

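Working backwards again to the caller of runInThread leads to the class QuartzSchedulerThread (around line 398). QuartzSchedulerThread extends Thread and is itself a thread that runs an endless loop. Its run() method is long, so it is not reproduced in full here. Roughly, it loops to find the jobs that are due, hands them to the thread pool, and the pool hands them to worker threads.

Some of the key code:

1. The code that finds the jobs to execute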
    try {
        // Query the database for the triggers that are about to fire
        triggers = qsRsrcs.getJobStore().acquireNextTriggers(
                now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
        acquiresFailed = 0;
        if (log.isDebugEnabled())
            log.debug("batch acquisition of " + (triggers == null ? 0 : triggers.size()) + " triggers");
    } catch (JobPersistenceException jpe) {
        if (acquiresFailed == 0) {
            qs.notifySchedulerListenersError(
                "An error occurred while scanning for the next triggers to fire.",
                jpe);
        }
        if (acquiresFailed < Integer.MAX_VALUE)
            acquiresFailed++;
        continue;
    } catch (RuntimeException e) {
        if (acquiresFailed == 0) {
            getLog().error("quartzSchedulerThreadLoop: RuntimeException "
                    +e.getMessage(), e);
        }
        if (acquiresFailed < Integer.MAX_VALUE)
            acquiresFailed++;
        continue;
    }

The key is the commented call, acquireNextTriggers. Stepping further into that method in the debugger leads to the SQL query:

    // StdJDBCDelegate.class
    public List<TriggerKey> selectTriggerToAcquire(Connection conn, long noLaterThan, long noEarlierThan, int maxCount)
        throws SQLException {
        PreparedStatement ps = null;
        ResultSet rs = null;
        List<TriggerKey> nextTriggers = new LinkedList<TriggerKey>();
        try {
            ps = conn.prepareStatement(rtp(SELECT_NEXT_TRIGGER_TO_ACQUIRE));

            // Set max rows to retrieve
            if (maxCount < 1)
                maxCount = 1; // we want at least one trigger back.
            ps.setMaxRows(maxCount);

            // Try to give jdbc driver a hint to hopefully not pull over more than the few rows we actually need.
            // Note: in some jdbc drivers, such as MySQL, you must set maxRows before fetchSize, or you get exception!
            ps.setFetchSize(maxCount);

            ps.setString(1, STATE_WAITING);
            ps.setBigDecimal(2, new BigDecimal(String.valueOf(noLaterThan)));
            ps.setBigDecimal(3, new BigDecimal(String.valueOf(noEarlierThan)));
            rs = ps.executeQuery();

            while (rs.next() && nextTriggers.size() <= maxCount) {
                nextTriggers.add(triggerKey(
                        rs.getString(COL_TRIGGER_NAME),
                        rs.getString(COL_TRIGGER_GROUP)));
            }

            return nextTriggers;
        } finally {
            closeResultSet(rs);
            closeStatement(ps);
        }
    }

With the parameter values observed while debugging substituted in, the SQL is:

    SELECT
        TRIGGER_NAME,
        TRIGGER_GROUP,
        NEXT_FIRE_TIME,
        PRIORITY
    FROM
        qrtz_TRIGGERS
    WHERE
        SCHED_NAME = 'schedulerFactoryBean'
        AND TRIGGER_STATE = 'WAITING'
        AND NEXT_FIRE_TIME <= (now + idleWaitTime)
        AND (
            MISFIRE_INSTR = -1
            OR (
                MISFIRE_INSTR != -1
                AND NEXT_FIRE_TIME >= (now - misfireThreshold)
            )
        )
    ORDER BY NEXT_FIRE_TIME ASC, PRIORITY DESC

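Here, now is the current system time; idleWaitTime is the scheduler thread's idle wait time, 30 s by default; and misfireThreshold is a configuration parameter, the maximum misfire delay the system tolerates, 60 s by default (our system is also configured with 60 s, and until now I had never understood what this value meant).

The SQL makes the behaviour clear. It selects the triggers that are due within the next 30 s and that either have MISFIRE_INSTR = -1 (MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY), or have a different MISFIRE_INSTR but a NEXT_FIRE_TIME that lies no more than the 60 s threshold in the past.

That solves the puzzle: the other parameter that affects misfire handling is misfireThreshold, set in quartz.properties as org.quartz.jobStore.misfireThreshold: 60000 (in milliseconds). In other words, a trigger that is late by no more than 60 s is not treated as a misfire at all: the misfire policy is not applied, and the missed fire times are simply executed one after another. A trigger that is late by more than 60 s is handled according to its misfire policy.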
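The WHERE clause can be restated as a small predicate. This is a sketch for illustration only: `AcquireCheck` and `isAcquirable` are hypothetical names, the timestamps are arbitrary epoch milliseconds, and the 30 s and 60 s values are the defaults mentioned above.

```java
// Hypothetical restatement of the SQL filter; not Quartz code.
final class AcquireCheck {
    // Same numeric value as MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY.
    static final int IGNORE_MISFIRE_POLICY = -1;

    /** Returns true if a WAITING trigger would be selected by the query. All times in milliseconds. */
    static boolean isAcquirable(int misfireInstr, long nextFireTime, long now,
                                long idleWaitTime, long misfireThreshold) {
        boolean dueSoon = nextFireTime <= now + idleWaitTime;
        boolean withinThreshold = nextFireTime >= now - misfireThreshold;
        return dueSoon && (misfireInstr == IGNORE_MISFIRE_POLICY || withinThreshold);
    }

    public static void main(String[] args) {
        long now = 1_000_000L;          // arbitrary "current time" for the example
        long idleWaitTime = 30_000L;    // 30 s
        long misfireThreshold = 60_000L; // 60 s

        // 25 s late with FIRE_ONCE_NOW (1): still inside the threshold, fired normally.
        System.out.println(isAcquirable(1, now - 25_000L, now, idleWaitTime, misfireThreshold));  // true
        // 90 s late with FIRE_ONCE_NOW (1): outside the threshold, left to misfire handling.
        System.out.println(isAcquirable(1, now - 90_000L, now, idleWaitTime, misfireThreshold));  // false
        // 90 s late with IGNORE_MISFIRE_POLICY (-1): always selected.
        System.out.println(isAcquirable(-1, now - 90_000L, now, idleWaitTime, misfireThreshold)); // true
    }
}
```

The first case matches the 25 s pause in the experiment, which is why every missed fire time was executed regardless of the configured policy.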

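Based on this conclusion I repeated the experiment with the job paused for more than 60 s, and this time the results matched the documentation.

In addition, tracing the code that starts a job leads to the method that handles misfires: org.quartz.impl.triggers.CronTriggerImpl.updateAfterMisfire(Calendar)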

    @Override
    public void updateAfterMisfire(org.quartz.Calendar cal) {
        int instr = getMisfireInstruction();

        if(instr == Trigger.MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY)
            return;

        if (instr == MISFIRE_INSTRUCTION_SMART_POLICY) {
            instr = MISFIRE_INSTRUCTION_FIRE_ONCE_NOW;
        }

        if (instr == MISFIRE_INSTRUCTION_DO_NOTHING) {
            Date newFireTime = getFireTimeAfter(new Date());
            while (newFireTime != null && cal != null
                    && !cal.isTimeIncluded(newFireTime.getTime())) {
                newFireTime = getFireTimeAfter(newFireTime);
            }
            setNextFireTime(newFireTime);
        } else if (instr == MISFIRE_INSTRUCTION_FIRE_ONCE_NOW) {
            setNextFireTime(new Date());
        }
    }

The misfire handling logic is clear from this code.
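To see what each branch does to the next fire time, here is a minimal model of the same decision. It assumes a schedule that fires at every multiple of a fixed interval and ignores calendar exclusions; `CronMisfireModel` and `nextFireTimeAfterMisfire` are hypothetical names, and only the constant values come from the list earlier in this post.

```java
// Hypothetical model of the branches in updateAfterMisfire; not Quartz code.
final class CronMisfireModel {
    static final int IGNORE_MISFIRE_POLICY = -1;
    static final int SMART_POLICY = 0;
    static final int FIRE_ONCE_NOW = 1;
    static final int DO_NOTHING = 2;

    /**
     * Returns the next fire time (in milliseconds) after misfire handling.
     * Assumes now >= 0 and a schedule firing at every multiple of intervalMillis.
     */
    static long nextFireTimeAfterMisfire(int instr, long staleNextFireTime, long now, long intervalMillis) {
        if (instr == IGNORE_MISFIRE_POLICY) {
            return staleNextFireTime;              // left untouched
        }
        if (instr == SMART_POLICY) {
            instr = FIRE_ONCE_NOW;                 // cron triggers resolve "smart" to fire-once-now
        }
        if (instr == DO_NOTHING) {
            // First scheduled slot strictly after the current time.
            return (now / intervalMillis + 1) * intervalMillis;
        }
        if (instr == FIRE_ONCE_NOW) {
            return now;                            // fire immediately
        }
        return staleNextFireTime;                  // unknown instruction: no change
    }

    public static void main(String[] args) {
        long stale = 10_000L;    // fire time that was missed
        long now = 95_000L;      // 85 s later
        long interval = 10_000L; // fires every 10 s
        System.out.println(nextFireTimeAfterMisfire(FIRE_ONCE_NOW, stale, now, interval));         // 95000
        System.out.println(nextFireTimeAfterMisfire(DO_NOTHING, stale, now, interval));            // 100000
        System.out.println(nextFireTimeAfterMisfire(IGNORE_MISFIRE_POLICY, stale, now, interval)); // 10000
    }
}
```

With FIRE_ONCE_NOW the trigger becomes due at once, with DO_NOTHING it skips ahead to the next regular slot, and with IGNORE_MISFIRE_POLICY the stale fire time is kept so that every missed slot is still executed.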

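While reading the source I also looked into a few other things that had puzzled me, such as how Quartz's job execution mechanism is implemented. Questions like these are easy to answer by reading the source, and interested readers can browse the code themselves.

You could, of course, find the cause of this problem by searching online. Still, I feel that tracking it down in the source gives a deeper understanding of the problem and also shows how Quartz is implemented. I encourage everyone to read a framework's source when they hit a problem; it is not as complicated as you might imagine.

(If anything above is wrong, please point it out. Comments are welcome.)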
