convert2bed

Convert common binary and text genomic formats to BED

View the Project on GitHub alexpreynolds/convert2bed

The convert2bed tool converts common binary and text genomic formats (BAM, GFF, GTF, GVF, PSL, RepeatMasker annotation output (OUT), SAM, VCF and WIG) to unsorted or sorted extended BED, or to BEDOPS Starch (compressed BED). Additional options are available for each format.

A convenience wrapper bash script is provided for each format. Each wrapper converts standard input to unsorted or sorted BED, or to BEDOPS Starch (compressed BED), and exposes the format-specific convert2bed options.

We also provide bam2bed_sge, bam2bed_gnuParallel, bam2starch_sge and bam2starch_gnuParallel convenience scripts, which parallelize the conversion of indexed BAM to BED or to BEDOPS Starch via a Sun Grid Engine-based computational cluster or local GNU Parallel installation.

Installation

The following compiles convert2bed and copies the binary and wrappers to /usr/local/bin:

$ make && make install

Usage

Generally, to convert data in format xyz to sorted BED:

$ convert2bed -i xyz < input.xyz > output.bed

Add the -o starch option to write a BEDOPS Starch file, which stores compressed BED data and feature metadata:

$ convert2bed -i xyz -o starch < input.xyz > output.starch
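When the input format is not known ahead of time, a script can derive it from the file extension. The following is a minimal sketch under stated assumptions: format_for is a hypothetical helper, not part of convert2bed, and it assumes the value passed to -i is the lowercase format name (bam, gff and so on). RepeatMasker OUT is left out because the corresponding -i value is not given here; check convert2bed --help for the actual names.

```shell
# Hypothetical helper: print a format name for a file, based on its extension.
# Assumption: convert2bed's -i option accepts these lowercase names.
format_for() {
    ext=${1##*.}                                           # text after the last dot
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]') # normalize case
    case $ext in
        bam|gff|gtf|gvf|psl|sam|vcf|wig)
            printf '%s\n' "$ext"
            ;;
        *)
            echo "unsupported extension: $ext" >&2
            return 1
            ;;
    esac
}
```

A call might then look like: convert2bed -i "$(format_for input.vcf)" < input.vcf > input.bed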

Wrappers are available for each of the supported formats to convert to BED or Starch, e.g.:

$ bam2bed < reads.bam > reads.bed
$ bam2starch < reads.bam > reads.starch
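To convert several files with one of the wrappers, a loop along the following lines may help. This is only a sketch: the function name convert_all and its optional second argument are hypothetical and not part of convert2bed, and it assumes the wrapper reads standard input and writes standard output as in the commands above.

```shell
# Hypothetical helper: convert every *.bam file in a directory to BED.
# $1 = directory to scan
# $2 = converter command (defaults to the bam2bed wrapper)
convert_all() {
    local dir=$1 converter=${2:-bam2bed} f
    for f in "$dir"/*.bam; do
        [ -e "$f" ] || continue          # no matches: the glob stays literal
        # Write reads.bam -> reads.bed next to the input file.
        "$converter" < "$f" > "${f%.bam}.bed" || return 1
    done
}
```

For example, convert_all ./alignments would write one .bed file beside each .bam file in ./alignments, and convert_all ./alignments bam2starch would run the Starch wrapper instead (the output files would still be named .bed in this sketch).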

Format-specific options are available for each wrapper; use --help with a wrapper script or --help-bam, --help-gff etc. with convert2bed to get a format-specific description of the conversion procedure and options.

Dependencies

This tool depends on samtools for BAM conversion, and on BEDOPS sort-bed and starch to generate sorted BED and Starch (compressed BED) output. The directory containing these binaries should be present in the end user's PATH environment variable.

If the samtools binary is not installed, BAM conversion will fail. If the sort-bed binary is not installed, every format conversion will fail under the default settings, which sort the output. If the starch binary is not installed, the starch output format option will be unavailable.
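Before running a conversion, it can be useful to confirm that the three dependencies are on PATH. The following is a minimal sketch, not something shipped with convert2bed; missing_tools is a hypothetical helper name, and the check relies only on the shell builtin command -v.

```shell
# Hypothetical helper: report which of the named tools are not on PATH.
# Prints one "missing: NAME" line per absent tool and returns the count.
missing_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Check the dependencies named above.
missing_tools samtools sort-bed starch || echo "some conversions will be unavailable"
```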