Structured motif extraction using affinity based cluster analysis

Faisal Alobaid, Kishan Mehrotra, Chilukuri K Mohan, Ramesh Raina

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper addresses the problem of extracting DNA structured motifs, which are overrepresented gapped patterns in the promoter regions of co-regulated genes. Existing algorithms suffer from three major drawbacks: (1) they can directly extract only patterns that strictly conform to user-specified parameters (templates), which demand an unreasonable level of prior knowledge; (2) some algorithms can find only limited classes of patterns, such as dyads; and (3) the computational effort required by exact algorithms increases exponentially with the number of allowed mismatches in the pattern and the number of boxes in the given template. We present SMExtract, a versatile and efficient algorithm for finding patterns ranging from simple motifs to multi-box structured motifs. The essence of this novel approach is to construct the unknown target pattern by multi-alignment of its fragments. The key benefits are a reduction in the number of user-specified parameters and flexibility in specifying the number of allowed mismatches, regardless of the characteristics of the unknown patterns.
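To make the terminology in the abstract concrete, the following minimal Python sketch shows what it means for a sequence to contain a two-box structured motif under a user-specified template: two boxes, a permitted gap range between them, and a per-box mismatch limit. This is only an illustration of the template-driven matching that the abstract lists as the first drawback of existing methods; it is not SMExtract, whose fragment-alignment procedure is not described on this page. The function names, the toy boxes, and the toy sequences are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, box1, box2, min_gap, max_gap, max_mismatches):
    """Return (box1_start, box2_start) pairs where both boxes occur in seq,
    each with at most max_mismatches substitutions, separated by a gap of
    min_gap..max_gap bases. All parameters are illustrative assumptions."""
    hits = []
    for i in range(len(seq) - len(box1) + 1):
        if hamming(seq[i:i + len(box1)], box1) > max_mismatches:
            continue
        for gap in range(min_gap, max_gap + 1):
            j = i + len(box1) + gap
            if j + len(box2) > len(seq):
                break  # larger gaps would also run past the end of seq
            if hamming(seq[j:j + len(box2)], box2) <= max_mismatches:
                hits.append((i, j))
    return hits


def support(sequences, **template):
    """Count the sequences containing at least one occurrence of the template."""
    return sum(1 for s in sequences if find_structured_motif(s, **template))


# Toy promoter regions (made up for this example).
seqs = [
    "ACGTTGACAAAAATATAATGC",  # both boxes exact, gap of 4
    "GGTTGTCACCCTATAATA",     # first box has one substitution, gap of 3
    "CCCCCCCCCCCCCCCCCCCC",   # no occurrence
]
template = dict(box1="TTGACA", box2="TATAAT", min_gap=3, max_gap=5)

exact = support(seqs, max_mismatches=0, **template)
relaxed = support(seqs, max_mismatches=1, **template)
```

Every value in `template` has to be supplied in advance, which is the prior-knowledge burden the abstract describes. The sketch also only checks one fixed candidate pattern; an exact extraction algorithm must additionally search over candidate patterns, which is where the cost growth with mismatches and boxes noted in the abstract arises.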

Original language: English (US)
Title of host publication: 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013
Pages: 59-66
Number of pages: 8
State: Published - 2013
Event: 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013 - Honolulu, HI, United States
Duration: Mar 4 2013 to Mar 6 2013

Other

Other: 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013
Country: United States
City: Honolulu, HI
Period: 3/4/13 to 3/6/13

Fingerprint

Cluster analysis
Nucleotide Motifs
Genetic Promoter Regions
DNA
Genes

Keywords

  • Fragment assembly
  • Gap constraints
  • Multiple sequence alignment
  • Structured motif extraction

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Information Management

Cite this

Alobaid, F., Mehrotra, K., Mohan, C. K., & Raina, R. (2013). Structured motif extraction using affinity based cluster analysis. In 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013 (pp. 59-66)

@inproceedings{29c4c40d943f42e8bcf271208bc8a36f,
title = "Structured motif extraction using affinity based cluster analysis",
keywords = "Fragment assembly, Gap constraints, Multiple sequence alignment, Structured motif extraction",
author = "Faisal Alobaid and Kishan Mehrotra and Mohan, {Chilukuri K} and Ramesh Raina",
year = "2013",
language = "English (US)",
isbn = "9781622769711",
pages = "59--66",
booktitle = "5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013",
}

TY - GEN
T1 - Structured motif extraction using affinity based cluster analysis
AU - Alobaid, Faisal
AU - Mehrotra, Kishan
AU - Mohan, Chilukuri K
AU - Raina, Ramesh
PY - 2013
Y1 - 2013
N2 - This paper addresses the problem of extracting DNA structured motifs, which are overrepresented gapped patterns in the promoter regions of co-regulated genes. Existing algorithms suffer from three major drawbacks: 1) They are only capable of directly extracting patterns strictly conforming to user specified parameters (templates) that require an unreasonable level of prior knowledge. 2) Some algorithms are only capable of finding limited patterns, such as dyads. 3) The computational effort required by exact algorithms increases exponentially with the number of allowed mismatches in the pattern and the number of boxes in the given template. We present SMExtract, a versatile and efficient algorithm for finding patterns ranging from simple motifs to multi-box structured motifs. The essence of this novel approach is to construct the target unknown pattern by multi-alignment of its fragments. The key benefits are reduction in the number of user specified parameters and flexibility in specifying the number of allowed mismatches regardless of the characteristics of the unknown patterns.

KW - Fragment assembly
KW - Gap constraints
KW - Multiple sequence alignment
KW - Structured motif extraction
UR - http://www.scopus.com/inward/record.url?scp=84883636004&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883636004&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84883636004
SN - 9781622769711
SP - 59
EP - 66
BT - 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013
ER -