Using biomaRt in R: code backup
2024-08-24 20:07:19
Goal: use an R script to download transcript data from Ensembl.
No preamble, straight to the code. An introduction to biomaRt will be added later.
# this file helps extract information from ensembl with gene name as input
# 11/02/2018, pxy7896
library(biomaRt)
# use command-line arguments
# args=commandArgs(T)
# read from files
geneNamesFile <- "geneNames.txt"
otherInfoFile <- "otherInfo.txt"
raw <- read.table(geneNamesFile, col.names = c("geneNames"), stringsAsFactors = FALSE)
# geneNames is character
geneNames <- raw[["geneNames"]]
#geneNames <- args[1]
otherInfo <- read.table(otherInfoFile, stringsAsFactors = FALSE)
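# the first value in otherInfo.txt is the Ensembl dataset name (for example "hsapiens_gene_ensembl")
dataSet <- otherInfo[[1]][1]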
# choose database
#dataSet <- args[2]
mart = useMart("ensembl", dataset=dataSet)
attr <- c("hgnc_symbol", "ensembl_transcript_id", "chromosome_name", "transcript_start", "transcript_end")
# get transcript ids
ids <- getBM(attributes = attr, filters = "hgnc_symbol", values = geneNames, mart = mart)
write.table(ids, "ids.txt", sep="\t", quote=FALSE, row.names=FALSE)
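# take the column as a plain character vector; ids["ensembl_transcript_id"] would keep a one-column data frame
targetIds <- unique(ids[["ensembl_transcript_id"]])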
# get exons info
exonAttr <- c("5_utr_start","5_utr_end","3_utr_start","3_utr_end","strand", "ensembl_transcript_id", "ensembl_exon_id", "exon_chrom_start", "exon_chrom_end")
#attr2 <- c(attr, exonAttr)
result <- getBM(attributes = exonAttr, filters = "ensembl_transcript_id", values = targetIds, mart = mart)
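# args is undefined while the commandArgs line above stays commented out, so write to a fixed file;
# "exons.txt" is an assumed name standing in for the original args[4]
write.table(result, "exons.txt", sep = "\t", quote = FALSE, row.names = FALSE)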
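
A small follow-up check that may be useful after the first getBM call: gene names that match nothing in the chosen dataset produce no rows in the result, so they can go missing without any warning. The sketch below compares the input names against the returned hgnc_symbol column. The helper name findMissingGenes and the mock data frame are assumptions made for illustration (neither is part of biomaRt nor of the script above); the mock table stands in for a real query result so the example runs without a network connection.

```r
# Hypothetical helper (not part of biomaRt): list the input symbols that
# have no row in a getBM() result.
findMissingGenes <- function(geneNames, ids, column = "hgnc_symbol") {
  # Symbols that actually came back from the query
  found <- unique(ids[[column]])
  # Input symbols with no matching row, without duplicates
  setdiff(unique(geneNames), found)
}

# Mock data standing in for a real getBM() result; the transcript IDs are
# placeholders, not real Ensembl identifiers.
mockIds <- data.frame(
  hgnc_symbol = c("TP53", "TP53", "BRCA1"),
  ensembl_transcript_id = c("tx1", "tx2", "tx3"),
  stringsAsFactors = FALSE
)

missingGenes <- findMissingGenes(c("TP53", "BRCA1", "NOT_A_GENE"), mockIds)
print(missingGenes)
```

In the script above, the equivalent call would be findMissingGenes(geneNames, ids), placed right after ids is written to ids.txt.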