Hotspot and the SPOT data quality metric

Hotspot is a program for identifying regions of local enrichment of short-read sequence tags mapped to the genome using a binomial distribution model. Regions flagged by the algorithm are called "hotspots." The algorithm uses a local background model that automatically normalizes for large regions of elevated tag levels due to, for example, copy number effects. Hotspot can also detect regions of enrichment of highly variable size, making it applicable to both broad and highly punctate signals. We have applied it extensively to DNase-seq and ChIP-seq data, including transcription factor (CTCF) and histone modification (H3K4me3, H3K36me3, H3K27me3) data.
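
To make the idea of a binomial model with a local background more concrete, here is a minimal Python sketch. It is not Hotspot's own code: the function name and the window sizes used in the example (250 bp and 50,000 bp) are assumptions chosen for illustration. The sketch scores a small window by asking how surprising its tag count is, given the number of tags in a larger surrounding window.

```python
import math


def binomial_z_score(observed, background_tags, small_window, background_window):
    """Z-score of a small-window tag count under a binomial null model.

    Assumption: each of the `background_tags` tags in the surrounding
    background window lands in the small window independently with
    probability p = small_window / background_window.
    """
    p = small_window / background_window
    expected = background_tags * p
    sd = math.sqrt(background_tags * p * (1.0 - p))
    if sd == 0:
        return 0.0  # no background tags (or degenerate windows): nothing to score
    return (observed - expected) / sd


# Illustrative window sizes (assumed, not taken from Hotspot's configuration).
SMALL, BACKGROUND = 250, 50_000

# A window holding 20 tags where the local background predicts about 5.
enriched = binomial_z_score(20, 1000, SMALL, BACKGROUND)

# A uniformly elevated region (for example, a copy number gain): twice as many
# tags everywhere, so the small window matches its local expectation exactly.
flat = binomial_z_score(10, 2000, SMALL, BACKGROUND)
```

Because the expectation is computed from the surrounding window rather than from a genome-wide average, a region whose tag level is uniformly elevated scores near zero, while a window that stands out from its own neighborhood scores high.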

Hotspot was originally conceived and implemented by Mike Hawrylycz. Additional contributors and developers include Bob Thurman, Eric Haugen, and Scott Kuehn.

This distribution also includes scripts for computing SPOT (Signal Portion of Tags), a quality measure for short-read sequence experiments. SPOT is simply the percentage of all tags that fall in hotspots.
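
As a rough illustration of the definition above, the following Python sketch computes the percentage of tags that fall in hotspots. It is not the SPOT script shipped in the distribution: the function name, the in-memory data layout, and the assumption that hotspots are non-overlapping half-open intervals are ours.

```python
from bisect import bisect_right
from collections import defaultdict


def spot_score(tags, hotspots):
    """Percentage of tags that fall inside hotspots.

    tags:     list of (chrom, position) pairs.
    hotspots: list of (chrom, start, end) half-open intervals, assumed
              non-overlapping within each chromosome.
    """
    if not tags:
        return 0.0

    # Group hotspot intervals by chromosome and sort them by start.
    by_chrom = defaultdict(list)
    for chrom, start, end in hotspots:
        by_chrom[chrom].append((start, end))
    starts = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts[chrom] = [s for s, _ in intervals]

    in_hotspots = 0
    for chrom, pos in tags:
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue
        # Rightmost interval whose start is <= pos, if any.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            in_hotspots += 1

    return 100.0 * in_hotspots / len(tags)


example_tags = [("chr1", 100), ("chr1", 150), ("chr1", 500), ("chr2", 10)]
example_hotspots = [("chr1", 90, 160)]
example_spot = spot_score(example_tags, example_hotspots)  # 2 of 4 tags -> 50.0
```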

  • Documentation for hotspot (Word and PowerPoint documents are slightly out of date)

  • Feb 2014

    Hotspot is currently hosted on GitHub

    See the hotspot GitHub repository for the current version of hotspot.

    5 Jul 2013

    Hotspot-SPOT distribution (v4)


    25 Jan 2013

    Hotspot-SPOT distribution (v3)

    Mappability files

    The files below contain coordinates of uniquely mappable regions of the genome for various read lengths and genomes. Use them to set the _MAPPABLE_FILE_ variable defined in runall.tokens.txt. NOTE: the .starch files are BED files compressed with the starch tool, which is part of the BEDOPS suite. The file given in _MAPPABLE_FILE_ must be uncompressed (you can use unstarch from BEDOPS for this purpose). If you need a particular combination not available below, feel free to contact the authors at rthurman(at)uw.edu.
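
The decompression step can be scripted. The Python sketch below assumes that the unstarch executable from BEDOPS is on your PATH and relies on unstarch writing the uncompressed BED data to standard output. The helper names, the example file name, and the .starch to .bed naming convention are hypothetical.

```python
import subprocess
from pathlib import Path


def bed_path_for(starch_path):
    """Hypothetical naming convention: replace the .starch suffix with .bed."""
    return Path(starch_path).with_suffix(".bed")


def unstarch_command(starch_path):
    """Command line that writes the uncompressed BED data to stdout."""
    return ["unstarch", str(starch_path)]


def decompress_starch(starch_path):
    """Run unstarch and capture its output in a .bed file next to the input.

    Requires BEDOPS to be installed; raises CalledProcessError on failure.
    """
    bed_path = bed_path_for(starch_path)
    with open(bed_path, "wb") as out:
        subprocess.run(unstarch_command(starch_path), stdout=out, check=True)
    return bed_path


# Hypothetical file name, for illustration only.
example_bed = bed_path_for("example.K36.mappable_only.starch")
```

Calling decompress_starch on a downloaded .starch file should leave a plain BED file whose path can then be given as _MAPPABLE_FILE_ in runall.tokens.txt.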

    11 June 2010

    Hotspot-SPOT distribution (v2)