
New finding: the genome has its own 'no-go zones', as revealed by the ENCODE Blacklist

最编程 2024-02-11 15:02:21

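Blacklist: as the name suggests, these are problematic regions! How exactly are they defined, and what information do they contain? Let's go through it in detail: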

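The paper "The ENCODE Blacklist: Identification of Problematic Regions of the Genome" (https://www.nature.com/articles/s41598-019-45839-z) defines the blacklist regions of the genome: regions that are anomalous, or that show high signal in next-generation sequencing experiments no matter which experiment is run. Excluding these regions gives a quality safeguard for downstream analysis of functional genomics data.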
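To make "excluding these regions" concrete, here is a minimal sketch in Python. It assumes peaks and blacklist regions are BED-style half-open intervals `(chrom, start, end)`; the function names and the coordinates are made up for illustration and are not taken from the paper or from any real blacklist.

```python
from typing import List, Tuple

# BED-style interval: (chromosome, start, end), 0-based and half-open.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals on the same chromosome overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_blacklisted(peaks: List[Interval], blacklist: List[Interval]) -> List[Interval]:
    """Keep only the peaks that do not overlap any blacklist region."""
    return [p for p in peaks if not any(overlaps(p, b) for b in blacklist)]


# Hypothetical coordinates, for illustration only.
blacklist = [("chr1", 1000, 2000)]
peaks = [
    ("chr1", 500, 900),    # before the region: kept
    ("chr1", 1500, 1600),  # inside the region: removed
    ("chr1", 1900, 2500),  # partial overlap: removed
    ("chr2", 1000, 2000),  # different chromosome: kept
    ("chr1", 2000, 2100),  # starts exactly at the region's end: kept
]
kept = remove_blacklisted(peaks, blacklist)
print(kept)
```

The nested scan is quadratic, which is fine for a sketch; with real peak sets a sorted sweep or an interval-tree lookup would be the more sensible choice.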

The paper provides a comparison between a blacklist region and a normal region:


Figure: example of a blacklist region compared with a normal region.

Signal in the blacklist region is extremely high, reaching roughly 6400× the background.

Blacklist regions are currently available for ce10, ce11, dm3, dm6, hg19, hg38 and mm10, and can be downloaded from the following sites: https://github.com/Boyle-Lab/Blacklist/ and https://www.encodeproject.org/annotations/ENCSR636HFF/

  • HUMAN (hg38/GRCh38): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz; ENCODE portal link: https://www.encodeproject.org/annotations/ENCSR636HFF/ (Select GRCh38)
  • HUMAN (hg19/GRCh37): ENCODE portal link: https://www.encodeproject.org/annotations/ENCSR636HFF/ (Select hg19); UCSC Genome Browser track: http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeMapability; README on how this track was generated: http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg19-human/hg19-blacklist-README.pdf
  • MOUSE (mm10): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/mm10.blacklist.bed.gz; ENCODE portal link: https://www.encodeproject.org/annotations/ENCSR636HFF/ (Select mm10)
  • MOUSE (mm9): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm9-mouse/mm9-blacklist.bed.gz
  • WORM (ce10): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/ce10-C.elegans/ce10-blacklist.bed.gz
  • FLY (dm3): http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/dm3-D.melanogaster/dm3-blacklist.bed.gz
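
The files listed above are gzip-compressed BED files. Below is a minimal sketch of reading one into memory with Python's standard library. It assumes tab-separated lines whose first three columns are chromosome, start and end; any further columns are ignored, and the function name `read_blacklist` is just an illustrative choice.

```python
import gzip
from typing import List, Tuple


def read_blacklist(path: str) -> List[Tuple[str, int, int]]:
    """Read (chrom, start, end) tuples from a gzip-compressed BED file."""
    regions = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, comments and optional BED header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split("\t")
            # Assumption: the first three columns are chrom, start, end.
            regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions
```

After downloading one of the files, for example `hg38.blacklist.bed.gz`, a call such as `read_blacklist("hg38.blacklist.bed.gz")` should return the regions as a list of tuples that can then be compared against your own peaks.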