3. Python regex consumes a lot of memory when it fails to match

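Problem: Normally, the content I need can be matched quickly with a regular expression against the fetched page source. But when something goes wrong and the page does not contain the expected content, the regex keeps backtracking: it cycles through candidate matches for a very long time and memory usage surges.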
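To make the backtracking problem concrete, here is a minimal sketch. The pattern `(a+)+$` and the test strings are illustrative assumptions, not the pattern from the original scraper. It only measures running time on a failed match (not memory), and the exact numbers depend on the machine.

```python
import re
import time

# Illustrative pattern (an assumption, not the scraper's real regex):
# nested quantifiers give the engine many ways to split a run of "a"s.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (match, seconds) for a string of n 'a's followed by 'b'."""
    text = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    match = PATTERN.search(text)
    return match, time.perf_counter() - start


if __name__ == "__main__":
    for n in (14, 16, 18, 20):
        match, seconds = time_failed_match(n)
        print(f"n={n:2d} matched={match is not None} time={seconds:.4f}s")
```

On a string that does match (for example "aaa"), the same pattern returns immediately; the cost only shows up when the match fails and the engine has to try every way of splitting the input.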

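Solution ideas: 1. If the target content has a distinctive marker, check for that marker first and run the regex only when it is present. When the marker is missing, skip the regex entirely so the backtracking never starts.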

2. Wrap the call in a function with a timeout, so that it is forcibly ended once it runs longer than a set time (an example is given below).
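The first idea (check for a marker before running the regex) can be sketched as follows. The marker string, the pattern, and the function name `extract_price` are hypothetical and only serve as an illustration; in practice the marker would be whatever fixed text reliably appears on a valid page.

```python
import re

# Hypothetical marker and pattern, chosen only for illustration.
MARKER = '<div class="price">'
PRICE_RE = re.compile(r'<div class="price">\s*(\d+(?:\.\d+)?)\s*</div>')


def extract_price(html):
    """Return the price string, or None if the page lacks the marker."""
    # Cheap substring test first: no regex is run on unexpected pages.
    if MARKER not in html:
        return None
    match = PRICE_RE.search(html)
    return match.group(1) if match else None
```

A plain `in` check is a linear scan with no backtracking, so an error page or an empty response is rejected before the regex engine is ever involved.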

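Use threading.Timer in a start -> sleep -> cancel pattern to stop a scheduled function from being called again. Note that cancel() only stops a timer that is still waiting to fire; it does not interrupt a function that is already running.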

import threading
import time

def fun_timer():
    print('hello timer')
    global timer
    # Re-create the timer so that fun_timer keeps rescheduling itself
    timer = threading.Timer(5.8, fun_timer)
    timer.start()

# Schedule the first call after 2 seconds
timer = threading.Timer(2, fun_timer)
timer.start()

# Stop the timer after 50 seconds
time.sleep(50)
timer.cancel()
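Since a timer thread cannot stop a regex match that is already in progress, one possible way to enforce a real timeout is to run the match in a child process and terminate that process when the time is up. The sketch below is an assumption-laden example, not the method from the referenced articles: the names `search_with_timeout` and `_search_worker` are hypothetical, and it uses the "fork" start method, so it is meant for Unix-like systems only.

```python
import multiprocessing as mp
import re


def _search_worker(pattern, text, conn):
    """Run the regex in the child and send back the matched text (or None)."""
    match = re.search(pattern, text)
    # Send a plain string rather than the match object itself.
    conn.send(match.group(0) if match else None)
    conn.close()


def search_with_timeout(pattern, text, timeout=2.0):
    """Return the matched text or None; raise TimeoutError if too slow."""
    ctx = mp.get_context("fork")  # assumption: Unix-like OS
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_search_worker,
                       args=(pattern, text, child_conn))
    proc.start()
    child_conn.close()  # the parent only reads from the pipe
    try:
        # poll() waits up to `timeout` seconds for the child's result.
        if parent_conn.poll(timeout):
            return parent_conn.recv()
        raise TimeoutError(f"regex did not finish within {timeout}s")
    finally:
        if proc.is_alive():
            proc.terminate()  # forcibly end a runaway match
        proc.join()
        parent_conn.close()
```

For example, `search_with_timeout(r"\d+", "abc 123 def")` returns "123", while a pathological pattern on a non-matching input raises TimeoutError instead of hanging the scraper. Starting a process per match has a cost, so this is better suited to guarding the occasional risky match than to wrapping every regex call.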

Reference: https://blog.csdn.net/lxcnn/article/details/4756030

Reference: https://blog.csdn.net/Homewm/article/details/92127567 (three ways to handle function timeouts)
