The following is the problem statement and requirements for a Python data-processing exercise:

The attachment is a log file that shows the running status of a set-top box. Each line in the file follows the format “LineNumber + Time + ProcessName + (ProcessID) + Logs”, and the logs are currently displayed in time order. Please write a Python script that supports the following features:

  1. Sort the logs in alphabetical order of process name, e.g. halserver, processman, etc.
  2. Filter the logs by process name so that the output shows only the logs of interest, e.g. “procman”, and hides the rest.
  3. Count the number of log lines for each process.
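
The line format described above can be picked apart with a single regular expression. The sketch below is only an illustration, written for Python 3. It assumes a layout such as `12 08:15:30.123 halserver (1234) message`; the sample line, the `LINE_RE` pattern, and the `parse_line` helper are hypothetical and are not taken from the attachment.

```python
import re

# Assumed layout: LineNumber Time ProcessName (ProcessID) Logs
LINE_RE = re.compile(
    r'^\s*(?P<lineno>\d+)\s+'                 # line number
    r'(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+'  # HH:MM:SS.mmm
    r'(?P<process>[A-Za-z][\w.\-]*)\s*'       # process name
    r'\((?P<pid>\d+)\)\s*'                    # process ID in parentheses
    r'(?P<message>.*)$'                       # rest of the line
)

def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None

# Hypothetical sample line, made up for illustration only.
sample = "12 08:15:30.123 halserver (1234) tuner locked"
fields = parse_line(sample)
print(fields["process"], fields["pid"])
```

Returning `None` for lines that do not match makes it easy to spot lines in the real file that break the assumed layout.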

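This is the log text file produced by the set-top box. A partial screenshot of the opened file is shown below:

At first glance it looks very messy. In fact, it should not be opened with Microsoft's Notepad; after opening it with Notepad++ the structure is much clearer. A partial screenshot is shown below: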

The code is given below:

Code for task 1:

#coding=utf-8
import re

f1 = open('stblog.txt', 'r')
f2 = open('cc1.txt', 'w')
list1 = f1.readlines()
list_process = []  # list holding the process name matched on each line
res = r'\d\D\d\d:\d\d:\d\d\.\d{3}\s([a-z]+)'
for i in range(len(list1)):
    list_process.append(re.findall(res, str(list1[i])))
for i in range(len(list_process)):  # check that the regex behaves as expected
    if len(list_process[i]) > 1:
        print('regex failed')
#print(len(list_process))
#print(len(list1))
#print(list_process[141])
#print(list1[141])
for m in range(len(list1)):  # bubble sort
    for n in range(m + 1, len(list1)):
        if list_process[m] > list_process[n]:
            # swap the process names and the log lines together
            list_process[m], list_process[n] = list_process[n], list_process[m]
            list1[m], list1[n] = list1[n], list1[m]
f2.writelines(list1)
f1.close()
f2.close()
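
The hand-written sort above compares every pair of lines, so its cost grows quadratically with the number of lines. As an alternative sketch (not the original author's method), Python's built-in `sorted` with a key function does the same job and is stable, so lines of the same process stay in time order. The code is written for Python 3; the `process_name` and `sort_by_process` helpers and the sample lines are hypothetical, while the pattern is the one used in the script above.

```python
import re

# Same pattern as in the script above: digit, separator, HH:MM:SS.mmm, process name.
PATTERN = re.compile(r'\d\D\d\d:\d\d:\d\d\.\d{3}\s([a-z]+)')

def process_name(line):
    """Return the process name found in a log line, or '' if none is found."""
    match = PATTERN.search(line)
    return match.group(1) if match else ''

def sort_by_process(lines):
    """Sort lines by process name; ties keep their original (time) order."""
    return sorted(lines, key=process_name)

# Hypothetical sample lines, made up for illustration only.
lines = [
    "1 00:00:01.000 procman (10) start\n",
    "2 00:00:02.000 halserver (20) init\n",
    "3 00:00:03.000 procman (10) ready\n",
]
for line in sort_by_process(lines):
    print(line, end='')
```

To apply this to the real file, the lines read from `stblog.txt` can be passed to `sort_by_process` and the result written to `cc1.txt` with `writelines`.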

Code for tasks 2 and 3:

#coding=utf-8
import re

f1 = open('stblog.txt', 'r')
f2 = open('cc2.txt', 'w')
list1 = f1.readlines()
list_process = []  # list holding the process name matched on each line
list2 = []
count = 0
res = r'\d\D\d\d:\d\d:\d\d\.\d{3}\s([a-z\.\-]+)'
for i in range(len(list1)):
    list_process.append(re.findall(res, str(list1[i])))
for i in range(len(list_process)):  # check that the regex behaves as expected
    if len(list_process[i]) > 1:
        print('regex failed')
s = input("Please enter the process name you are interested in: ")
for i in range(len(list_process)):
    if list_process[i] == s.split():
        list2.append(list1[i])  # collect the lines of the requested process for cc2.txt
        count += 1
print(count)
f2.writelines(list2)
f1.close()
f2.close()
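
The script above counts only the lines of the process the user enters, whereas task 3 asks for a count for each process. One way this could be sketched is with `collections.Counter`, as below. This is an illustrative Python 3 sketch rather than the original author's code; `process_name`, `filter_by_process`, `count_by_process`, and the sample lines are hypothetical, and the pattern is the one used in the script above.

```python
import re
from collections import Counter

# Same pattern as in the script above; the process name may contain '.' and '-'.
PATTERN = re.compile(r'\d\D\d\d:\d\d:\d\d\.\d{3}\s([a-z\.\-]+)')

def process_name(line):
    """Return the process name found in a log line, or None if none is found."""
    match = PATTERN.search(line)
    return match.group(1) if match else None

def filter_by_process(lines, name):
    """Keep only the lines whose process name equals `name`."""
    return [line for line in lines if process_name(line) == name]

def count_by_process(lines):
    """Count log lines per process, skipping lines without a process name."""
    names = (process_name(line) for line in lines)
    return Counter(name for name in names if name is not None)

# Hypothetical sample lines, made up for illustration only.
lines = [
    "1 00:00:01.000 procman (10) start\n",
    "2 00:00:02.000 halserver (20) init\n",
    "3 00:00:03.000 procman (10) ready\n",
]
counts = count_by_process(lines)
for name in sorted(counts):
    print(name, counts[name])
```

Because the counts are keyed by process name, a single pass over the file answers task 3 for every process at once, and `filter_by_process` covers task 2 without interactive input.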
