Featured Websites

First insight into the prediction of protein folding rate change upon point mutation.

October 5, 2014

SUMMARY: The accurate prediction of protein folding rate change upon mutation is an important and challenging problem in protein folding kinetics and design. In this work, we have collected experimental data on protein folding rate change upon mutation from various sources and constructed a reliable...

Presenting and exploring biological pathways with PathVisio.

October 5, 2014

The Red Queen said, "It takes all the running you can do, to keep in the same place." Lewis Carroll. MOTIVATION: Newly solved protein structures are routinely scanned against structures already in the Protein Data Bank (PDB) using Internet servers. In favourable cases, comparing 3D structures may reveal...

BATS: a Bayesian user-friendly software for analyzing time series microarray experiments.

October 5, 2014

Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for ...

CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure.

October 5, 2014

In this article we describe a new Bioconductor package CALIB for normalization of two-color microarray data. This approach is based on the measurements of external controls and estimates an absolute target level for each gene and condition pair, as opposed to working with log-ratios as a relative me...

Ergatis: a web interface and scalable software system for bioinformatics workflows.

October 5, 2014

Profile-based similarity search is an essential step in structure-function studies of proteins. However, inclusion of non-homologous sequence segments into a profile causes its corruption and results in false positives. Profile corruption is common in multidomain proteins, and single domains with lo...

Joint stage recognition and anatomical annotation of Drosophila gene expression patterns.

October 5, 2014

We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions...

ChEBI - Chemical Entities of Biological Interest

October 5, 2014

Small molecules, atoms, ions and radicals of biological interest ...

Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

October 5, 2014

BACKGROUND: Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and ama...

Gene ranking and biomarker discovery under correlation.

October 5, 2014

Mapping of next-generation sequencing data derived from RNA samples (RNAseq) presents different genome mapping challenges than data derived from DNA. For example, tags that cross exon-junction boundaries will often not map to a reference genome, and the strand specificity of the data needs to be ret...

MAMMOT--a set of tools for the design, management and visualization of genomic tiling arrays.

October 5, 2014

The MAMMOT software suite is a collection of Perl and PHP scripts for designing, annotating and visualizing genome tiling arrays to, for example, facilitate studies into the epigenetics of gene regulation. The web design allows rapid experimental data entry from multiple users, and results can easi...


Protein-protein binding affinity prediction on a diverse set of structures.

MOTIVATION: Accurate binding free energy functions for protein-protein interactions are imperative for a wide range of purposes. Their construction is predicated upon ascertaining the factors that influence binding and their relative importance. A recent benchmark of binding affinities has allowed, for the first time, the evaluation and construction of binding free energy models using a diverse set of complexes, and a systematic assessment of our ability to model the energetics of conformational changes. RESULTS: We construct a large set of molecular descriptors using commonly available tools, introducing the use of energetic factors associated with conformational changes and disorder-to-order transitions, as well as features calculated on structural ensembles. The descriptors are used to train and test a binding free energy model using a consensus of four machine learning algorithms, whose performance constitutes a significant improvement over the other state-of-the-art empirical free energy functions tested. The internal workings of the learners show how the descriptors are used, illuminating the determinants of protein-protein binding. AVAILABILITY: The molecular descriptor set and descriptor values for all complexes are available in the supplementary material. A web server for the learners and coordinates for the bound and unbound structures can be accessed from the website: http://bmm.cancerresearchuk.org/%7EAffinity CONTACT: paul.bates@cancer.org.uk.

Reconstructing transcription factor activities in hierarchical transcription network motifs.

MOTIVATION: Knowledge of the dynamics of transcription factors is fundamental to understanding the mechanisms of transcriptional regulation. Measuring transcription factor activities in vivo remains an experimental challenge. Several methods have been developed to infer these activities from easily measurable quantities such as mRNA expression of target genes. A limitation of these methods is that they rely on very simple single-layer structures, typically consisting of one or more transcription factors regulating a number of target genes. RESULTS: We present a novel statistical inference methodology to reverse engineer the dynamics of transcription factors in hierarchical network motifs such as feed-forward loops. The approach is based on a continuous-time representation of the system in which the high-level master transcription factor is represented as a two-state Markov jump process driving a system of differential equations. We solve the inference problem using an efficient variational approach and demonstrate our method on simulated data and two real datasets. The results on real data show that the predictions of our approach capture biological behaviours more effectively than single-layer models of transcription, and can lead to novel biological insights. AVAILABILITY: http://homepages.inf.ed.ac.uk/gsanguin/software.html CONTACT: g.sanguinetti@ed.ac.uk.
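
The model class described in this abstract can be illustrated with a small forward simulation. The sketch below is not the authors' inference method or software; it only simulates a two-state Markov jump process (the on/off state of a hypothetical master transcription factor) and uses it to drive a single linear differential equation for a target's mRNA level. The switching rates, the form of the equation, and every function and parameter name (`rate_on`, `rate_off`, `basal`, `gain`, `decay`) are assumptions chosen for illustration.

```python
import random


def simulate_telegraph(t_end, rate_on, rate_off, rng, state=0):
    """Sample switch times and states of a two-state Markov jump process."""
    times, states = [0.0], [state]
    t = 0.0
    while True:
        # Waiting time in the current state is exponentially distributed.
        rate = rate_on if state == 0 else rate_off
        t += rng.expovariate(rate)
        if t >= t_end:
            break
        state = 1 - state
        times.append(t)
        states.append(state)
    return times, states


def state_at(times, states, t):
    """Return the (piecewise-constant) process state at time t."""
    current = states[0]
    for switch_time, s in zip(times, states):
        if switch_time > t:
            break
        current = s
    return current


def drive_ode(times, states, t_end, dt=0.01, basal=0.1, gain=1.0, decay=0.5, x0=0.0):
    """Euler-integrate dx/dt = basal + gain * mu(t) - decay * x."""
    n_steps = int(round(t_end / dt))
    x = x0
    xs = [x]
    for i in range(n_steps):
        mu = state_at(times, states, i * dt)
        x += dt * (basal + gain * mu - decay * x)
        xs.append(x)
    return xs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
times, states = simulate_telegraph(20.0, rate_on=0.5, rate_off=0.5, rng=rng)
xs = drive_ode(times, states, 20.0)
```

Inference, as described in the abstract, works in the opposite direction: given noisy observations of `xs`, it estimates the hidden switching trajectory and the kinetic parameters.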

survcomp: an R/Bioconductor package for performance assessment and comparison of survival models.

SUMMARY: The survcomp package provides functions to assess and statistically compare the performance of survival/risk prediction models. It implements state-of-the-art statistics to (i) measure the performance of risk prediction models, (ii) combine these statistical estimates from multiple datasets using a meta-analytical framework, and (iii) statistically compare the performance of competing models. AVAILABILITY: The R/Bioconductor package survcomp is provided open source under the Artistic-2.0 License with a user manual containing installation, operating instructions and use case scenarios on real datasets. survcomp requires R version 2.13.0 or higher. URL: http://bioconductor.org/packages/release/bioc/html/survcomp.html CONTACT: Benjamin Haibe-Kains <bhaibeka@jimmy.harvard.edu>, Markus Schröder <mschroed@jimmy.harvard.edu>
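
One widely used statistic of the kind described in (i) is the concordance index (Harrell's C), which measures how often a model assigns the higher risk score to the subject who experiences the event first. The abstract does not list the statistics the package implements, so the following is only an illustrative Python sketch under that assumption; it is not survcomp's R code or API, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] is truthy) strictly before subject j's follow-up time.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up times, event indicators (0 = censored), risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; for the toy data above, four of the five comparable pairs are ordered correctly.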