Featured Websites


October 5, 2014 |

A platform for mining and visualization of next generation sequencing data ...

Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts.

October 5, 2014 |

SUMMARY: We present a visualization tool applied on genome-wide association data, revealing disease-associated haplotypes, epistatically interacting loci, as well as providing visual signatures of multivariate correlations of genetic markers with respect to a phenotype. AVAILABILITY: Freely availabl...

Compounds In Literature (CIL): screening for compounds and relatives in PubMed.

October 5, 2014 |

AStream, an R-statistical software package for the curation and identification of feature peaks extracted from liquid chromatography mass spectrometry (LC/MS) metabolomics data, is described. AStream detects isotopic, fragment and adduct patterns by identifying feature pairs that fulfill expected re...

maxdLoad2 and maxdBrowse: standards-compliant tools for microarray experimental annotation, data management and dissemination.

October 5, 2014 |

BACKGROUND: maxdLoad2 is a relational database schema and Java application for microarray experimental annotation and storage. It is compliant with all standards for microarray meta-data capture; including the specification of what data should be recorded, extensive use of standard ontologies and su...

OligoSpawn: a software tool for the design of overgo probes from large unigene datasets.

October 5, 2014 |

SUMMARY: New additional methods are presented for processing and visualizing mass spectrometry based molecular profile data, implemented as part of the recently introduced MZmine software. They include new features and extensions such as support for mzXML data format, capability to perform batch pro...

STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data.

October 5, 2014 |

BACKGROUND: Classification using microarray datasets is usually based on a small number of samples for which tens of thousands of gene expression measurements have been obtained. The selection of the genes most significant to the classification problem is a challenging issue in high dimension data a...

A tree-based approach for motif discovery and sequence classification.

October 5, 2014 |

SUMMARY: Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includ...


October 5, 2014 |

Human mitochondrial protein database ...

Pep-3D-Search: a method for B-cell epitope prediction based on mimotope analysis.

October 5, 2014 |

SUMMARY: The accuracy of Bayesian phylogenetic inference using molecular data depends on the use of proper models of sequence evolution. Although choosing the best model available from a pool of alternatives has become standard practice in statistical phylogenetics, assessment of the chosen models a...

A new method to measure the semantic similarity of GO terms.

October 5, 2014 |

We have developed a package named Application for Bioinformatics Computing Grid (ABCGrid). ABCGrid was designed for biology laboratories to use heterogeneous computing resources and access bioinformatics applications from one master node. ABCGrid is very easy to install and maintain at the premise o...

Protein-protein binding affinity prediction on a diverse set of structures.

MOTIVATION: Accurate binding free energy functions for protein-protein interactions are imperative for a wide range of purposes. Their construction is predicated upon ascertaining the factors that influence binding and their relative importance. A recent benchmark of binding affinities has allowed, for the first time, the evaluation and construction of binding free energy models using a diverse set of complexes, and a systematic assessment of our ability to model the energetics of conformational changes. RESULTS: We construct a large set of molecular descriptors using commonly available tools, introducing the use of energetic factors associated with conformational changes and disorder to order transitions, as well as features calculated on structural ensembles. The descriptors are used to train and test a binding free energy model using a consensus of four machine learning algorithms, whose performance constitutes a significant improvement over the other state of the art empirical free energy functions tested. The internal workings of the learners show how the descriptors are used, illuminating the determinants of protein-protein binding. AVAILABILITY: The molecular descriptor set and descriptor values for all complexes are available in the supplementary. A web server for the learners and coordinates for the bound and unbound structures can be accessed from the website: http://bmm.cancerresearchuk.org/%7EAffinity CONTACT: paul.bates@cancer.org.uk.

Reconstructing transcription factor activities in hierarchical transcription network motifs.

MOTIVATION: A knowledge of the dynamics of transcription factors is fundamental to understand the transcriptional regulation mechanism. Nowadays an experimental measure of transcription factor activities in vivo represents a challenge. Several methods have been developed to infer these activities from easily measurable quantities such as mRNA expression of target genes. A limitation of these methods is represented by the fact that they rely on very simple single-layer structures, typically consisting of one or more transcription factors regulating a number of target genes. RESULTS: We present a novel statistical inference methodology to reverse engineer the dynamics of transcription factors in hierarchical network motifs such as feed-forward loops. The approach we present is based on a continuous time representation of the system where the high level master transcription factor is represented as a two state Markov jump process driving a system of differential equations. We solve the inference problem using an efficient variational approach and demonstrate our method on simulated data and two real datasets. The results on real data show that the predictions of our approach can capture biological behaviours in a more effective way than single-layer models of transcription, and can lead to novel biological insights. AVAILABILITY: http://homepages.inf.ed.ac.uk/gsanguin/software.html CONTACT: g.sanguinetti@ed.ac.uk.

survcomp: an R/Bioconductor package for performance assessment and comparison of survival models.

SUMMARY: The survcomp package provides functions to assess and statistically compare the performance of survival/risk prediction models. It implements state-of-the-art statistics to (i) measure the performance of risk prediction models, (ii) combine these statistical estimates from multiple datasets using a meta-analytical framework, and (iii) statistically compare the performance of competitive models. AVAILABILITY: The R/Bioconductor package survcomp is provided open source under the Artistic-2.0 License with a user manual containing installation, operating instructions and use case scenarios on real datasets. survcomp requires R version 2.13.0 or higher.URL: http://bioconductor.org/packages/release/bioc/html/survcomp.html CONTACT: Benjamin Haibe-Kains <bhaibeka@jimmy.harvard.edu>, Markus Schröder <mschroed@jimmy.harvard.edu>