Binary Alignment Map (BAM) is a file format that stores the comprehensive raw data of genome sequencing;[1] it is the lossless, compressed binary representation of Sequence Alignment Map (SAM) files.[2][3]

BAM is the compressed binary representation of SAM (Sequence Alignment Map), a compact and indexable representation of nucleotide sequence alignments.[4] Indexing makes it possible to quickly retrieve the alignments that overlap a specific location without reading through all of them. Before a BAM file can be indexed, it must be sorted by reference ID and then by leftmost coordinate.[5] BAM files are compressed in the BGZF format.
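The sort order required before indexing can be illustrated with a minimal Python sketch. The `Alignment` record and the `coordinate_sort` and `overlapping` helpers are hypothetical names introduced here for illustration, not part of any BAM library, and the query is a simple scan with an early exit rather than the binning index that real BAM index files use.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical, simplified alignment record (not the real BAM layout)."""
    ref_id: int  # index of the reference sequence
    pos: int     # 0-based leftmost coordinate
    end: int     # 0-based end coordinate, exclusive
    name: str


def coordinate_sort(alignments: List[Alignment]) -> List[Alignment]:
    # Sort by reference ID first, then by leftmost coordinate.
    return sorted(alignments, key=lambda a: (a.ref_id, a.pos))


def overlapping(sorted_alignments: List[Alignment],
                ref_id: int, start: int, end: int) -> List[Alignment]:
    """Return alignments overlapping the half-open region [start, end)."""
    hits = []
    for a in sorted_alignments:
        # Because the input is sorted, nothing past this point can overlap.
        if (a.ref_id, a.pos) >= (ref_id, end):
            break
        if a.ref_id == ref_id and a.end > start:
            hits.append(a)
    return hits
```

The early `break` is what sorting buys: once a record starts at or beyond the end of the queried region, every later record can be skipped. An index goes one step further by also letting a reader skip the records before the region.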

The BAM format; image from: https://samtools.github.io/hts-specs/SAMv1.pdf

The structure of a BAM file includes a header section and an alignment section.[6]
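As a rough illustration of the header section, the following Python sketch parses it from an already decompressed byte stream. It assumes the layout given in the SAM/BAM specification (the magic bytes `BAM\1`, the header text length and text, then the number of reference sequences followed by each name and length, all little-endian); `read_bam_header` is a hypothetical helper name, and the sketch stops before the alignment section.

```python
import io
import struct
from typing import BinaryIO, List, Tuple


def read_bam_header(stream: BinaryIO) -> Tuple[str, List[Tuple[str, int]]]:
    """Parse the BAM header from a stream of decompressed bytes."""
    if stream.read(4) != b"BAM\x01":
        raise ValueError("stream does not start with the BAM magic bytes")

    # Length of the SAM-style header text, then the text itself.
    (l_text,) = struct.unpack("<I", stream.read(4))
    text = stream.read(l_text).rstrip(b"\x00").decode("utf-8")

    # Number of reference sequences, then one (name, length) entry per sequence.
    (n_ref,) = struct.unpack("<I", stream.read(4))
    references = []
    for _ in range(n_ref):
        (l_name,) = struct.unpack("<I", stream.read(4))
        # The stored name is NUL-terminated; strip the terminator.
        name = stream.read(l_name).rstrip(b"\x00").decode("ascii")
        (l_ref,) = struct.unpack("<I", stream.read(4))
        references.append((name, l_ref))
    return text, references
```

Since BGZF blocks are designed to be valid gzip members, a stream opened with Python's `gzip.open(path, "rb")` should be usable as input here, although that has not been verified against real files in this sketch.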

The BAM format uses a 0-based coordinate system, whereas SAM uses a 1-based coordinate system. Integers stored in BAM optional fields can take values in the range [−2^31, 2^32).[5]
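The coordinate offset and the integer range can be made concrete with a short Python sketch. The function names are hypothetical. The type codes `c`, `C`, `s`, `S`, `i` and `I` are the integer types listed in the SAM/BAM specification, but the rule of choosing the first (smallest) type that fits is an assumption made for this example, not something the specification requires.

```python
BAM_INT_MIN = -2**31      # inclusive lower bound
BAM_INT_LIMIT = 2**32     # exclusive upper bound

# (type code, inclusive minimum, inclusive maximum), narrowest types first.
INTEGER_TYPES = [
    ("c", -2**7, 2**7 - 1),    # int8
    ("C", 0, 2**8 - 1),        # uint8
    ("s", -2**15, 2**15 - 1),  # int16
    ("S", 0, 2**16 - 1),       # uint16
    ("i", -2**31, 2**31 - 1),  # int32
    ("I", 0, 2**32 - 1),       # uint32
]


def sam_pos_to_bam(pos_1based: int) -> int:
    """Convert a 1-based SAM position to a 0-based BAM position."""
    return pos_1based - 1


def bam_pos_to_sam(pos_0based: int) -> int:
    """Convert a 0-based BAM position to a 1-based SAM position."""
    return pos_0based + 1


def fits_bam_integer(value: int) -> bool:
    """Check whether value lies in the range [-2^31, 2^32)."""
    return BAM_INT_MIN <= value < BAM_INT_LIMIT


def smallest_type_code(value: int) -> str:
    """Pick the first listed integer type that can hold value (assumed policy)."""
    for code, low, high in INTEGER_TYPES:
        if low <= value <= high:
            return code
    raise ValueError(f"{value} is outside the range BAM integers can represent")
```

The range is asymmetric because it is the union of the signed 32-bit and unsigned 32-bit ranges: negative values need the signed type, while values of 2^31 and above need the unsigned one.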


References

  1. "Carl Zimmer's Game of Genomes, Season 1: Episode 3, BAM Reveals All". STAT. Retrieved 2016-08-21.
  2. Li, Heng (2009-06-08). "The Sequence Alignment/Map format and SAMtools" (PDF). Bioinformatics. 25: 2078–9. doi:10.1093/bioinformatics/btp352. PMC 2723002. PMID 19505943.
  3. "Binary Alignment Map". National Cancer Institute Wiki. Retrieved 2016-08-21.
  4. "Genome Browser BAM Track Format". genome.ucsc.edu. Retrieved 2022-05-05.
  5. "Sequence Alignment/Map Format Specification" (PDF). The SAM/BAM Format Specification Working Group. 3 Jun 2021.
  6. "BAM File Format". support.illumina.com. Retrieved 2022-05-05.