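Over the past few days I needed to retrain YOLO, which requires an ImageNet-pretrained model. Because the network is my own design, I first had to train it on ImageNet with darknet. I could not find any tutorial covering this, and in particular nothing that explained how image paths are mapped to class labels, which left me puzzled. In the end I read the darknet source and found that the author uses string matching: for each image path, darknet checks whether a class label string occurs in it as a substring, and determines the class that way.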

1. The darknet commands for training and validating an image classifier are:

./darknet classifier train cfg/imagenet1k.data cfg/extraction.cfg extraction.weights

./darknet classifier valid cfg/imagenet1k.data cfg/extraction.cfg extraction.weights

2. Data format: the data paths are read mainly from cfg/imagenet1k.data:

classes=1000
train  = imagenet/darknet_train.txt
valid  = imagenet/darknet_val.txt
backup = backup/
labels = data/imagenet.labels.list
names  = data/imagenet.shortnames.list
top=5
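
To make the file format concrete, here is a minimal Python sketch that reads such a .data file into a dictionary. It is not darknet's own parser (darknet reads the file in C). The function name parse_data_file and the rule of skipping blank lines and lines starting with # are assumptions made for this illustration.

```python
def parse_data_file(text):
    """Parse 'key = value' lines into a dict (hypothetical helper, not darknet code)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no option
        if not line or line.startswith('#'):
            continue
        key, sep, value = line.partition('=')
        if not sep:
            continue  # no '=' on this line, nothing to store
        # Whitespace around '=' is not significant, as in "train  = ..."
        options[key.strip()] = value.strip()
    return options


example = """classes=1000
train  = imagenet/darknet_train.txt
valid  = imagenet/darknet_val.txt
backup = backup/
labels = data/imagenet.labels.list
names  = data/imagenet.shortnames.list
top=5
"""

cfg = parse_data_file(example)
print(cfg['train'])   # imagenet/darknet_train.txt
print(cfg['labels'])  # data/imagenet.labels.list
```

All values come back as strings, so classes and top would need int() before being used as numbers.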

darknet_train.txt and darknet_val.txt contain only image paths, one per line, for example:

/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10026.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10027.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10029.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10040.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10042.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10043.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10048.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10066.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10074.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_1009.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10095.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10108.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10110.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10120.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10124.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10150.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10159.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10162.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10183.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10194.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10211.JPEG
/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/n01440764/n01440764_10218.JPEG
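
A list in this format can be produced by walking the training directory. The following is a minimal sketch, assuming the images sit in one folder per class as in the paths above. The helper names list_images and write_list are made up for this example.

```python
import os


def list_images(root, extensions=('.JPEG',)):
    """Return the sorted absolute paths of all image files under root."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                paths.append(os.path.join(os.path.abspath(dirpath), name))
    return sorted(paths)


def write_list(paths, out_file):
    """Write one image path per line, the format darknet_train.txt uses."""
    with open(out_file, 'w') as w:
        for p in paths:
            w.write(p + '\n')


# Usage (paths are the ones from this post, adjust to your machine):
# train_root = '/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train'
# write_list(list_images(train_root), 'imagenet/darknet_train.txt')
```

Because each image keeps its class folder (for example n01440764) in the absolute path, the label is already part of every line.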

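So how does darknet know which class label each image path belongs to? It reads the label strings from:

data/imagenet.labels.list

It then matches each label string against every image path in the list above, checking whether the label occurs as a substring, and uses that to determine the class. So when training, make sure every image path contains its class label; in the paths above, n01440764 is such a label.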
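
The matching rule can be sketched in a few lines of Python. This is a simplified illustration of the substring check described above, not darknet's actual C code. The function names and the second label in the list are only there for the example.

```python
def match_labels(path, labels):
    """Return the index of every label that occurs as a substring of path."""
    return [i for i, label in enumerate(labels) if label in path]


def check_paths(paths, labels):
    """Return the paths that do not match exactly one label."""
    return [p for p in paths if len(match_labels(p, labels)) != 1]


labels = ['n01440764', 'n01443537']
train_path = ('/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/'
              'train/n01440764/n01440764_10026.JPEG')
# A path with no label in it, like the raw val images (file name is illustrative)
val_path = '/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/val/some_val_image.JPEG'

print(match_labels(train_path, labels))            # [0]
print(check_paths([train_path, val_path], labels))  # only val_path is reported
```

Running a check like check_paths over the train and val lists before training is a cheap way to catch paths that contain no label, or that happen to contain more than one.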

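3. The ImageNet val images are all stored in a single directory, so their paths do not contain a label. You therefore need to read the val annotation .xml files and copy each val image into a folder named after its class label: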

# coding=utf-8
import os
import shutil
from bs4 import BeautifulSoup  # BeautifulSoup 4 (pip package: beautifulsoup4)

# train.txt can be downloaded by running the script caffe/data/get_ilsvrc_aux.sh
'''with open("../imagenet/train.txt") as f:
    with open("../imagenet/darknet_train.txt", 'w') as w:
        for l in f.readlines():
            w.write('/home/research/disk1/imagenet/ILSVRC2015/Data/CLS-LOC/train/' + l.split()[0] + '\n')'''

# val
dataroot = '/home/research/disk1/imagenet/ILSVRC2015/'
vallabel = dataroot + 'Annotations/CLS-LOC/val'
valimage = dataroot + 'Data/CLS-LOC/val'
with open("../imagenet/darknet_val.txt", 'w') as w:
    for l in os.listdir(vallabel):
        # Read one annotation file and parse it once
        with open(os.path.join(vallabel, l)) as f:
            xml = f.read()
        soup = BeautifulSoup(xml, 'html.parser')

        label = soup.find('name').string  # class label, e.g. n01440764
        filename = soup.find('filename').string + '.JPEG'

        # Copy the image into a folder named after its label
        saveroot = '../temp/' + label
        if not os.path.exists(saveroot):
            os.makedirs(saveroot)
        shutil.copy(os.path.join(valimage, filename), os.path.join(saveroot, filename))
        # The listed path must include the label folder, otherwise darknet cannot match a label
        w.write('/home/research/disk1/compress_yolo/temp/' + label + '/' + filename + '\n')
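
If you would rather not install BeautifulSoup, the same two fields can be read with the standard library. This is a sketch under one assumption: that the annotation follows a PASCAL VOC-style layout, with filename directly under the root element and name inside an object element. The sample XML and the function name read_val_annotation are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_val_annotation(xml_text):
    """Return (label, image file name) from one annotation XML string."""
    root = ET.fromstring(xml_text)
    # Assumed layout: <annotation><filename>...</filename><object><name>...</name></object></annotation>
    filename = root.findtext('filename')
    label = root.findtext('object/name')
    return label, filename + '.JPEG'


# Made-up sample in the assumed layout, not a real ImageNet annotation
sample = """<annotation>
  <folder>val</folder>
  <filename>example_val_0001</filename>
  <object>
    <name>n01440764</name>
  </object>
</annotation>"""

print(read_val_annotation(sample))  # ('n01440764', 'example_val_0001.JPEG')
```

If an annotation lists several objects, findtext returns the first name it finds, which matches what find('name') does in the script above.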
