Structural analysis of proteins and nucleic acids is complicated by their inherent flexibility, conferred, for example, by linkers between their contiguous domains. Therefore, the macromolecule needs to be represented by an ensemble of conformations instead of a single conformation. Determining this ensemble is challenging because the experimental data are a convoluted average of contributions from multiple conformations. As the number of ensemble degrees of freedom generally greatly exceeds the number of independent observables, directly deconvolving experimental data into a representative ensemble is an ill-posed problem. Recent developments in sparse approximations and compressive sensing have demonstrated that useful information can be recovered from underdetermined (ill-posed) systems of linear equations by using sparsity regularization. Inspired by these advances, we designed the Sparse Ensemble Selection (SES) method for recovering multiple conformations from a limited number of observations. SES is more general and accurate than previously published minimum-ensemble methods, and we use it to obtain representative conformational ensembles of Lys48-linked diubiquitin, characterized by residual dipolar coupling data measured at several pH conditions. These representative ensembles are validated against NMR chemical shift perturbation data and compared to maximum-entropy results. The SES method reproduced and quantified the previously observed pH dependence of the major conformation of Lys48-linked diubiquitin, and revealed lesser-populated conformations that are preorganized for binding known diubiquitin receptors, thus providing insights into possible mechanisms of receptor recognition by polyubiquitin. SES is applicable to any experimental observables that can be expressed as a weighted linear combination of data for individual states.
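The underlying problem can be illustrated with a toy sketch: observables are modeled as a weighted linear combination of per-conformer predictions, and a small set of conformers with non-negative weights is sought that reproduces the data. The code below is not the SES algorithm itself. It assumes a brute-force search over small subsets with weights fitted by projected gradient descent, and the matrix `A`, the data `y`, and all function names are hypothetical.

```python
from itertools import combinations

def predict(A, w):
    """Ensemble-averaged observables: y_i = sum_j A[i][j] * w[j]."""
    return [sum(a * x for a, x in zip(row, w)) for row in A]

def residual_sq(A, w, y):
    """Squared misfit between predicted and measured observables."""
    return sum((p - t) ** 2 for p, t in zip(predict(A, w), y))

def fit_nonneg(A, y, cols, steps=2000):
    """Non-negative least squares restricted to the conformers in `cols`,
    solved by projected gradient descent (an illustrative choice)."""
    m, n = len(A), len(A[0])
    w = [0.0] * n
    # The squared Frobenius norm of the selected columns bounds the
    # largest eigenvalue of A_s^T A_s, so 1/L is a safe step size.
    L = sum(A[i][j] ** 2 for i in range(m) for j in cols) or 1.0
    for _ in range(steps):
        r = [p - t for p, t in zip(predict(A, w), y)]
        for j in cols:
            g = sum(A[i][j] * r[i] for i in range(m))
            w[j] = max(0.0, w[j] - g / L)  # project onto w_j >= 0
    return w

def sparse_select(A, y, max_size):
    """Try every subset of at most `max_size` conformers; keep the best fit.
    Smaller subsets are tried first, so ties favor the sparser ensemble."""
    best_err, best_w = None, None
    for k in range(1, max_size + 1):
        for cols in combinations(range(len(A[0])), k):
            w = fit_nonneg(A, y, cols)
            err = residual_sq(A, w, y)
            if best_err is None or err < best_err - 1e-12:
                best_err, best_w = err, w
    return best_w, best_err

# Hypothetical data: 3 observables, 4 candidate conformers (underdetermined).
A = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
# Synthetic "measurement": 70% conformer 0 plus 30% conformer 2.
y = [0.7, 0.0, 0.3]

weights, err = sparse_select(A, y, max_size=2)
print(weights, err)
```

Exhaustive search is only feasible for very small candidate pools; it is used here because it makes the sparsity constraint explicit. With three observables and four conformers the unconstrained system has infinitely many solutions, and limiting the ensemble size is what singles out one of them.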
ASJC Scopus subject areas
- Colloid and Surface Chemistry