HAMMER algorithm: Hashing with arithmetic modulo-4 for motif extraction of regulatory elements

Huitao Sheng, Kishan Mehrotra, Chilukuri K Mohan, Ramesh Raina

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

A new algorithm, HAMMER, discovers cis-elements in promoter regions of co-regulated genes. We show that HAMMER is faster and more accurate than well-known tools currently in use to identify cis-elements. Given input sequences that represent promoter regions of genes, the algorithm searches for subsequences of a desired length w whose frequency of occurrence is relatively high, while accounting for slightly corrupted variants (with up to d substitutions). The w-mers are numerically encoded and stored in a hash table, and d-neighbors are discovered efficiently using a modulo-4 arithmetic operation. Profile matrices are constructed and evaluated using a high-order Markov model based on background data (from a gene database). HAMMER discovers the most frequently occurring w-mers (permitting corruption in at most d positions). Experimental results show that HAMMER is significantly faster and discovers more of the motifs present in the test sequences than two well-known motif-discovery tools (MDScan and AlignACE).
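
The encoding and neighbor search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the encoding or the hash layout, so the base-4 packing (A=0, C=1, G=2, T=3), the function names (encode, decode, neighbors, count_with_neighbors), and the brute-force enumeration of substitutions are all assumptions made for illustration. The sketch only shows how a substitution at one position can be expressed as adding 1, 2, or 3 to a base-4 digit modulo 4, and how counts of exact w-mers in a hash table can then be pooled over their d-neighbors.

```python
from collections import Counter
from itertools import combinations, product

BASES = "ACGT"                                # assumed digit order: A=0, C=1, G=2, T=3
CODE = {base: digit for digit, base in enumerate(BASES)}


def encode(wmer):
    """Pack a w-mer into one integer, one base-4 digit per nucleotide."""
    value = 0
    for base in wmer:
        value = value * 4 + CODE[base]
    return value


def decode(value, w):
    """Inverse of encode: recover the w-mer string from its integer code."""
    letters = []
    for _ in range(w):
        value, digit = divmod(value, 4)
        letters.append(BASES[digit])
    return "".join(reversed(letters))


def neighbors(value, w, d):
    """Yield every code within Hamming distance d of value, value included.

    A substitution at digit position pos is modelled as adding a shift of
    1, 2, or 3 to that digit modulo 4, which reaches the three other bases.
    """
    yield value
    for k in range(1, d + 1):
        for positions in combinations(range(w), k):
            for shifts in product((1, 2, 3), repeat=k):
                neighbor = value
                for pos, shift in zip(positions, shifts):
                    weight = 4 ** pos
                    digit = (neighbor // weight) % 4
                    new_digit = (digit + shift) % 4
                    neighbor += (new_digit - digit) * weight
                yield neighbor


def count_with_neighbors(sequences, w, d):
    """Count each observed w-mer together with its observed d-neighbors."""
    exact = Counter()                         # hash table keyed by integer code
    for seq in sequences:
        for start in range(len(seq) - w + 1):
            window = seq[start:start + w]
            if all(base in CODE for base in window):   # skip N and other symbols
                exact[encode(window)] += 1
    totals = Counter()
    for code in exact:
        totals[code] = sum(exact.get(n, 0) for n in neighbors(code, w, d))
    return totals


if __name__ == "__main__":
    # Hypothetical toy input, not data from the paper.
    promoters = ["ACGTACGA", "TTACGTAA"]
    totals = count_with_neighbors(promoters, w=4, d=1)
    for code, total in totals.most_common(3):
        print(decode(code, 4), total)
```

Enumerating neighbors this way costs on the order of C(w, d) * 3^d lookups per w-mer, which is practical only for small d. The profile-matrix construction and Markov-model scoring mentioned in the abstract are not covered by this sketch.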

Original language: English (US)
Title of host publication: Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
Pages: 753-758
Number of pages: 6
DOIs: https://doi.org/10.1109/BIBE.2007.4375645
State: Published - 2007
Event: 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE - Boston, MA, United States
Duration: Jan 14 2007 to Jan 17 2007

Other

Other: 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
Country: United States
City: Boston, MA
Period: 1/14/07 to 1/17/07

Fingerprint

Genes
Genetic Promoter Regions
Substitution reactions
Databases
Experiments

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Bioengineering

Cite this

Sheng, H., Mehrotra, K., Mohan, C. K., & Raina, R. (2007). HAMMER algorithm: Hashing with arithmetic modulo-4 for motif extraction of regulatory elements. In Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE (pp. 753-758). [4375645] https://doi.org/10.1109/BIBE.2007.4375645

@inproceedings{dcf922f3664e4d9cb745521b3c068f1f,
title = "HAMMER algorithm: Hashing with arithmetic modulo-4 for motif extraction of regulatory elements",
abstract = "A new algorithm, HAMMER, discovers cis-elements in promoter regions of the co-regulated genes. We show that HAMMER is faster and more accurate than well-known tools currently in use to identify cis-elements. Given input sequences that represent promoter regions of genes, this algorithm searches for subsequences of desired length w whose frequency of occurrence is relatively high, while accounting for slightly corrupted variants (with up to d substitutions). Various w-mers are numerically encoded and represented in a hash table, and d-neighbors are efficiently discovered using a modulo-4 arithmetic operation. Profile matrices are constructed and evaluated using a high-order Markov model based on background data (from a gene database). HAMMER discovers the most frequently occurring w-mers (permitting corruption in at most d positions). Experiment results show that HAMMER is significantly faster and discovers more motifs present in the test sequences, when compared with two well-known motif-discovery tools (MDScan and AlignACE).",
author = "Huitao Sheng and Kishan Mehrotra and Mohan, {Chilukuri K} and Ramesh Raina",
year = "2007",
doi = "10.1109/BIBE.2007.4375645",
language = "English (US)",
isbn = "1424415098",
pages = "753--758",
booktitle = "Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE",

}

TY - GEN

T1 - HAMMER algorithm

T2 - Hashing with arithmetic modulo-4 for motif extraction of regulatory elements

AU - Sheng, Huitao

AU - Mehrotra, Kishan

AU - Mohan, Chilukuri K

AU - Raina, Ramesh

PY - 2007

Y1 - 2007

N2 - A new algorithm, HAMMER, discovers cis-elements in promoter regions of the co-regulated genes. We show that HAMMER is faster and more accurate than well-known tools currently in use to identify cis-elements. Given input sequences that represent promoter regions of genes, this algorithm searches for subsequences of desired length w whose frequency of occurrence is relatively high, while accounting for slightly corrupted variants (with up to d substitutions). Various w-mers are numerically encoded and represented in a hash table, and d-neighbors are efficiently discovered using a modulo-4 arithmetic operation. Profile matrices are constructed and evaluated using a high-order Markov model based on background data (from a gene database). HAMMER discovers the most frequently occurring w-mers (permitting corruption in at most d positions). Experiment results show that HAMMER is significantly faster and discovers more motifs present in the test sequences, when compared with two well-known motif-discovery tools (MDScan and AlignACE).

UR - http://www.scopus.com/inward/record.url?scp=47649104914&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=47649104914&partnerID=8YFLogxK

U2 - 10.1109/BIBE.2007.4375645

DO - 10.1109/BIBE.2007.4375645

M3 - Conference contribution

AN - SCOPUS:47649104914

SN - 1424415098

SN - 9781424415090

SP - 753

EP - 758

BT - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE

ER -