KnownGenesToBed
##Motivation
Converts a UCSC knownGene file to BED.
##Compilation
- java 1.8 http://www.oracle.com/technetwork/java/index.html (NOT the old java 1.7 or 1.6). Please check that this java is in the `${PATH}`. Setting JAVA_HOME is not enough (e.g. https://github.com/lindenb/jvarkit/issues/23 )
- GNU Make > 3.81
- curl/wget
- git
- apache ant is only required to compile htsjdk
- xsltproc http://xmlsoft.org/XSLT/xsltproc2.html
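Because the build relies on the `java` found in `${PATH}` rather than on JAVA_HOME, a quick check before compiling can save a failed build. This is a minimal sketch using only standard shell features; the helper name `java_in_path` is made up for this illustration:

```shell
# Hypothetical helper: succeeds only if a 'java' executable is reachable through PATH.
java_in_path() {
  command -v java >/dev/null 2>&1
}

if java_in_path; then
  # 'java -version' writes to stderr; the first line carries the version string.
  java -version 2>&1 | head -n 1
else
  echo "java not found in PATH (setting JAVA_HOME alone is not enough)" >&2
fi
```

If the printed version is not 1.8, put the right java first in `${PATH}` before running make.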
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ make kg2bed
By default, the libraries are not included in the jar file, so you shouldn't move them (https://github.com/lindenb/jvarkit/issues/15#issuecomment-140099011 ). You can create a bigger but standalone executable jar by adding `standalone=yes` on the command line:
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ make kg2bed standalone=yes
The required libraries will be downloaded and installed in the `dist` directory.
A file `local.mk` can be created and edited to override or add some paths.
For example it can be used to set the HTTP proxy:
http.proxy.host=your.host.com
http.proxy.port=124567
##Synopsis
$ java -jar dist/kg2bed.jar [options] (stdin|file)
- -o|--output (OUTPUT-FILE) Output file. Default: stdout
- -i|--intron Hide introns
- -u|--utr Hide UTRs
- -c|--cds Hide CDSs
- -x|--exon Hide exons
- -t|--transcript Hide transcripts
- -h|--help Print help
- -version|--version Show version and exit
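Each of the feature options removes one kind of record, so combining them selects what remains. As a sketch, and assuming the options can be combined freely, the following hides introns, UTRs, CDSs and transcripts and therefore keeps only the exon records. The input name `knownGene.txt` and the output name `exons.bed` are placeholders, and the jar path is the one used in the example further down:

```shell
# Placeholder paths: adjust to your own files.
JAR=dist/kg2bed.jar
INPUT=knownGene.txt
OUTPUT=exons.bed

if [ -f "$JAR" ] && [ -f "$INPUT" ]; then
  # -i, -u, -c and -t hide introns, UTRs, CDSs and transcripts; exons remain.
  java -jar "$JAR" -i -u -c -t -o "$OUTPUT" "$INPUT"
else
  echo "build the jar with 'make kg2bed' and provide $INPUT first" >&2
fi
```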
##Source Code
Main code is: https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/misc/KnownGenesToBed.java
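The linked Java class is the reference implementation. As a rough illustration of the idea only, and not the tool's actual code, the sketch below derives exon and intron intervals from one knownGene row with awk. It assumes the standard UCSC knownGene column order (name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, proteinID, alignID). The row itself is illustrative: its exon coordinates are taken from the example output below, while the CDS and protein columns are placeholders. The helper names `kg_line` and `exons_and_introns` are hypothetical.

```shell
# Illustrative knownGene row (tab-separated); CDS and protein columns are placeholders.
kg_line() {
  printf 'uc001aaa.3\tchr1\t+\t11873\t14409\t11873\t11873\t3\t11873,12612,13220,\t12227,12721,14409,\t\tuc001aaa.3\n'
}

# Emit one line per exon, and one line per gap between consecutive exons (intron).
exons_and_introns() {
  awk -F '\t' -v OFS='\t' '{
    n = $8 + 0                 # exonCount
    split($9, starts, ",")     # exonStarts, comma-separated with a trailing comma
    split($10, ends, ",")      # exonEnds
    for (i = 1; i <= n; i++) {
      print $2, starts[i], ends[i], $3, $1, "EXON", "Exon " i
      if (i < n) print $2, ends[i], starts[i + 1], $3, $1, "INTRON", "Intron " i
    }
  }'
}

kg_line | exons_and_introns
```

The sketch numbers exons in genomic order and ignores UTR and CDS records; how the tool numbers features on the minus strand is not covered here.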
##Example
$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/knownGene.txt.gz" |\
gunzip -c |\
java -jar dist/kg2bed.jar
chr1 11873 14409 + uc001aaa.3 TRANSCRIPT uc001aaa.3
chr1 11873 12227 + uc001aaa.3 EXON Exon 1
chr1 12227 12612 + uc001aaa.3 INTRON Intron 1
chr1 11873 12227 + uc001aaa.3 UTR UTR3
chr1 12612 12721 + uc001aaa.3 EXON Exon 2
chr1 12721 13220 + uc001aaa.3 INTRON Intron 2
chr1 12612 12721 + uc001aaa.3 UTR UTR3
chr1 13220 14409 + uc001aaa.3 EXON Exon 3
chr1 13220 14409 + uc001aaa.3 UTR UTR3
chr1 11873 14409 + uc010nxr.1 TRANSCRIPT uc010nxr.1
chr1 11873 12227 + uc010nxr.1 EXON Exon 1
chr1 12227 12645 + uc010nxr.1 INTRON Intron 1
chr1 11873 12227 + uc010nxr.1 UTR UTR3
chr1 12645 12697 + uc010nxr.1 EXON Exon 2
chr1 12697 13220 + uc010nxr.1 INTRON Intron 2
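Since every feature is written on its own line, the output can also be post-filtered with standard tools. This is a minimal sketch, assuming the columns are tab-separated and the feature type (TRANSCRIPT, EXON, INTRON, UTR) sits in column 6, as the sample above suggests. The three sample lines are copied from that output and stand in for the real stream; `sample` and `exons_only` are hypothetical helper names.

```shell
# Sample lines copied from the output above; assumed to be tab-separated.
sample() {
  printf 'chr1\t11873\t14409\t+\tuc001aaa.3\tTRANSCRIPT\tuc001aaa.3\n'
  printf 'chr1\t11873\t12227\t+\tuc001aaa.3\tEXON\tExon 1\n'
  printf 'chr1\t12227\t12612\t+\tuc001aaa.3\tINTRON\tIntron 1\n'
}

# Keep only the exons and reduce them to a plain 4-column BED (chrom, start, end, name).
exons_only() {
  awk -F '\t' -v OFS='\t' '$6 == "EXON" { print $1, $2, $3, $5 }'
}

sample | exons_only
```

In real use, `sample` would be replaced by the `curl | gunzip | java -jar dist/kg2bed.jar` pipeline shown above.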
##History
- 2014: Creation
- 2015-07-21 : removed duplicate exon
##Contribute
- Issue Tracker: http://github.com/lindenb/jvarkit/issues
- Source Code: http://github.com/lindenb/jvarkit
##License
The project is licensed under the MIT license.
##Citing
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030