Resequencing 302 wild and cultivated accessions identifies genes related to domestication and improvement in soybean

This text is excerpted from "Resequencing 302 wild and cultivated accessions identifies genes related to domestication and improvement in soybean".

Variation calling and annotation.

Mapping.

SAMtools (Version: 0.1.18) software was used to convert mapping results into the BAM format and to filter the unmapped and non-unique reads.

Duplicated reads were filtered with the Picard package (picard.sourceforge.net, Version: 1.87).

The BEDtools (Version: 2.17.0) coverageBed program was used to compute the coverage of sequence alignments. (A sequence was defined as absent if coverage was lower than 90% and present if coverage was greater than 90%.)
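The presence/absence rule above can be sketched as a small function. This is a minimal illustration, not the authors' script: the function name and its inputs are hypothetical, and because the text does not say how a coverage of exactly 90% is treated, the sketch assumes it counts as present.

```python
def classify_presence(covered_bases: int, sequence_length: int,
                      threshold_percent: int = 90) -> str:
    """Label a sequence 'present' or 'absent' from the share of its bases covered."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    if not 0 <= covered_bases <= sequence_length:
        raise ValueError("covered_bases must lie between 0 and sequence_length")
    # Integer comparison avoids floating-point rounding at the threshold.
    # Assumption: exactly 90% coverage is treated as present.
    if covered_bases * 100 >= threshold_percent * sequence_length:
        return "present"
    return "absent"


if __name__ == "__main__":
    print(classify_presence(950, 1000))
    print(classify_presence(500, 1000))
```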

SNP calling.

SNP detection was performed using the Genome Analysis Toolkit (GATK, version 2.4-7-g5e89f01) and SAMtools. Only the SNPs detected by both methods were analyzed further.

The detailed processes were as follows:

(1) After BWA alignment, the reads around indels were realigned.

Realignment was performed with GATK in two steps.

The first step used the RealignerTargetCreator tool to identify regions where realignment was needed.

The second step used IndelRealigner to realign the regions found in the first step, which produced a realigned BAM file for each accession.

(2) SNPs were called at the population level with GATK and SAMtools. For GATK, the SNP confidence threshold was set to 30 (parameter -stand_call_conf 30). The same realigned BAM files were used for SNP calling with the SAMtools mpileup command.

(3) In the filtering step, we kept the sites identified by both GATK and SAMtools, using the SelectVariants tool; SNPs with allele frequencies lower than 1% in the population were discarded.
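The logic of step (3) can be sketched in plain Python, independent of the actual GATK tooling. This is an illustration under stated assumptions: sites are represented as hypothetical (chromosome, position, ref, alt) tuples, allele counts are supplied by the caller, and "allele frequency" is interpreted as minor allele frequency, which the text does not state explicitly.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical site key: (chromosome, position, reference allele, alternate allele)
Site = Tuple[str, int, str, str]


def passes_frequency_filter(alt_count: int, total_alleles: int) -> bool:
    """Keep a site whose minor allele frequency is at least 1%."""
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("invalid allele counts")
    # Assumption: the filter is applied to the minor allele.
    minor_count = min(alt_count, total_alleles - alt_count)
    # minor_count / total_alleles >= 0.01, written with integers.
    return minor_count * 100 >= total_alleles


def filter_snps(gatk_sites: Set[Site], samtools_sites: Set[Site],
                allele_counts: Dict[Site, Tuple[int, int]]) -> List[Site]:
    """Return sites called by both methods that pass the frequency filter."""
    common = gatk_sites & samtools_sites
    kept = [site for site in common
            if passes_frequency_filter(*allele_counts[site])]
    return sorted(kept)


if __name__ == "__main__":
    gatk = {("Chr01", 100, "A", "G"), ("Chr01", 250, "C", "T"), ("Chr02", 40, "G", "A")}
    samtools = {("Chr01", 100, "A", "G"), ("Chr01", 250, "C", "T"), ("Chr02", 99, "T", "C")}
    # (alternate allele count, total alleles); 604 = 302 diploid accessions
    counts = {("Chr01", 100, "A", "G"): (120, 604),
              ("Chr01", 250, "C", "T"): (5, 604)}
    print(filter_snps(gatk, samtools, counts))
```

In this toy input, the second shared site has 5 alternate alleles out of 604 (below 1%) and is dropped, while sites seen by only one caller never reach the frequency filter.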

Indel calling.

Indel calling was similar to SNP calling, but the UnifiedGenotyper parameter -glm INDEL was used so that only indels were reported. Only insertions and deletions of 6 bp or shorter were taken into account.
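The length cutoff can be illustrated with a short sketch. It assumes simple VCF-style alleles in which the reference and alternate alleles share an anchor base, so the indel length is the difference in allele lengths; the function names are hypothetical.

```python
def indel_length(ref: str, alt: str) -> int:
    """Length in bp of a simple insertion or deletion given VCF-style alleles."""
    return abs(len(ref) - len(alt))


def keep_indel(ref: str, alt: str, max_length: int = 6) -> bool:
    """Keep insertions and deletions of at most max_length bp."""
    length = indel_length(ref, alt)
    # A length of 0 means the alleles are the same size, i.e. not an indel.
    return 0 < length <= max_length


if __name__ == "__main__":
    print(keep_indel("A", "ATTT"))        # 3 bp insertion
    print(keep_indel("ACGTACGT", "A"))    # 7 bp deletion
```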

Annotation.

SNPs were annotated against the reference genome using the ANNOVAR package (Version: 2013-08-23).

Based on the genome annotation, SNPs were categorized into exonic regions (overlapping with a coding exon), splicing sites (within 2 bp of a splicing junction), 5′UTRs and 3′UTRs, intronic regions (overlapping with an intron), upstream and downstream regions (within 1 kb upstream of the transcription start site or 1 kb downstream of the transcription end site), and intergenic regions.

SNPs in coding exons were further grouped into synonymous SNPs (did not cause amino acid changes) or nonsynonymous SNPs (caused amino acid changes; mutations causing stop gain and stop loss were also classified into this group).
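The grouping of coding SNPs can be sketched with the standard genetic code. This is a minimal illustration rather than ANNOVAR's implementation: the function name and its inputs (reference codon, 0-based offset of the SNP within the codon, alternate base) are hypothetical, and the codon is assumed to be given on the coding strand.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids listed with codon bases cycling in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_coding_snp(ref_codon: str, codon_offset: int, alt_base: str) -> str:
    """Return 'synonymous' or 'nonsynonymous' for a SNP inside a codon."""
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if len(ref_codon) != 3 or codon_offset not in (0, 1, 2) or alt_base not in BASES:
        raise ValueError("expected a 3-base codon, an offset of 0-2, and a DNA base")
    alt_codon = ref_codon[:codon_offset] + alt_base + ref_codon[codon_offset + 1:]
    # Stop codons are encoded as '*', so stop gain and stop loss change the
    # translated symbol and fall into the nonsynonymous group, as in the text.
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


if __name__ == "__main__":
    print(classify_coding_snp("GAA", 2, "G"))  # GAA -> GAG, Glu -> Glu
    print(classify_coding_snp("GAA", 1, "T"))  # GAA -> GTA, Glu -> Val
    print(classify_coding_snp("TAC", 2, "A"))  # TAC -> TAA, Tyr -> stop
```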

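Indels in the exonic regions were classified as frame-shift mutations (insertion or deletion length not a multiple of 3 bp) or non-frame-shift mutations (length a multiple of 3 bp, for example a 3 bp insertion or deletion).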
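Under the usual definition, an exonic indel shifts the reading frame when its length is not a multiple of 3 bp. The sketch below applies that rule to the 1 to 6 bp lengths considered here; the function name is hypothetical and the length is assumed to be already known.

```python
def indel_frame_class(length_bp: int) -> str:
    """Classify an exonic indel by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    # A multiple of 3 adds or removes whole codons and preserves the frame.
    return "non-frameshift" if length_bp % 3 == 0 else "frameshift"


if __name__ == "__main__":
    for length in range(1, 7):
        print(length, indel_frame_class(length))
```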
