Kooperativer Bibliotheksverbund

Berlin Brandenburg

and
and

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
Filter
  • Computer Science
Type of Medium
Language
Year
  • 1
    Language: English
    In: PLoS ONE, 2011, Vol.6(10), p.e25364
    Description: Diagnostic and prognostic biomarkers for cancer based on gene expression profiles are viewed as a major step towards a better personalized medicine. Many studies using various computational approaches have been published in this direction during the last decade. However, when comparing different gene signatures for related clinical questions often only a small overlap is observed. This can have various reasons, such as technical differences of platforms, differences in biological samples or their treatment in lab, or statistical reasons because of the high dimensionality of the data combined with small sample size, leading to unstable selection of genes. In conclusion retrieved gene signatures are often hard to interpret from a biological point of view. We here demonstrate that it is possible to construct a consensus signature from a set of seemingly different gene signatures by mapping them on a protein interaction network. Common upstream proteins of close gene products, which we identified via our developed algorithm, show a very clear and significant functional interpretation in terms of overrepresented KEGG pathways, disease associated genes and known drug targets. Moreover, we show that such a consensus signature can serve as prior knowledge for predictive biomarker discovery in breast cancer. Evaluation on different datasets shows that signatures derived from the consensus signature reveal a much higher stability than signatures learned from all probesets on a microarray, while at the same time being at least as predictive. Furthermore, they are clearly interpretable in terms of enriched pathways, disease associated genes and known drug targets. In summary we thus believe that network based consensus signatures are not only a way to relate seemingly different gene signatures to each other in a functional manner, but also to establish prior knowledge for highly stable and interpretable predictive biomarkers.
    Keywords: Research Article ; Biology ; Computer Science ; Medicine ; Genetics And Genomics ; Molecular Biology ; Computational Biology ; Oncology ; Computer Science ; Pathology ; Biochemistry
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 2
    In: PLoS ONE, 2013, Vol.8(6)
    Description: Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available.
    Keywords: Research Article ; Biology ; Computer Science
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 3
    In: PLoS ONE, 2013, Vol.8(9)
    Description: Predictive, stable and interpretable gene signatures are generally seen as an important step towards a better personalized medicine. During the last decade various methods have been proposed for that purpose. However, one important obstacle for making gene signatures a standard tool in clinics is the typical low reproducibility of signatures combined with the difficulty to achieve a clear biological interpretation. For that purpose in the last years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. We here propose a technique that integrates network information as well as different kinds of experimental data (here exemplified by mRNA and miRNA expression) into one classifier. This is done by smoothing t-statistics of individual genes or miRNAs over the structure of a combined protein-protein interaction (PPI) and miRNA-target gene network. A permutation test is conducted to select features in a highly consistent manner, and subsequently a Support Vector Machine (SVM) classifier is trained. Compared to several other competing methods our algorithm reveals an overall better prediction performance for early versus late disease relapse and a higher signature stability. Moreover, obtained gene lists can be clearly associated to biological knowledge, such as known disease genes and KEGG pathways. We demonstrate that our data integration strategy can improve classification performance compared to using a single data source only. Our method, called stSVM, is available in R-package netClass on CRAN ( http://cran.r-project.org ).
    Keywords: Research Article ; Biology ; Computer Science ; Engineering ; Mathematics ; Medicine
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 4
    In: PLoS ONE, 2016, Vol.11(11)
    Description: Background The advance of omics technologies has made possible to measure several data modalities on a system of interest. In this work, we illustrate how the Non-Parametric Combination methodology, namely NPC, can be used for simultaneously assessing the association of different molecular quantities with an outcome of interest. We argue that NPC methods have several potential applications in integrating heterogeneous omics technologies, as for example identifying genes whose methylation and transcriptional levels are jointly deregulated, or finding proteins whose abundance shows the same trends of the expression of their encoding genes. Results We implemented the NPC methodology within “omicsNPC”, an R function specifically tailored for the characteristics of omics data. We compare omicsNPC against a range of alternative methods on simulated as well as on real data. Comparisons on simulated data point out that omicsNPC produces unbiased / calibrated p-values and performs equally or significantly better than the other methods included in the study; furthermore, the analysis of real data show that omicsNPC (a) exhibits higher statistical power than other methods, (b) it is easily applicable in a number of different scenarios, and (c) its results have improved biological interpretability. Conclusions The omicsNPC function competitively behaves in all comparisons conducted in this study. Taking into account that the method (i) requires minimal assumptions, (ii) it can be used on different studies designs and (iii) it captures the dependences among heterogeneous data modalities, omicsNPC provides a flexible and statistically powerful solution for the integrative analysis of different omics data.
    Keywords: Research Article ; Physical Sciences ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Research And Analysis Methods ; Medicine And Health Sciences ; Medicine And Health Sciences ; Medicine And Health Sciences ; Medicine And Health Sciences ; Biology And Life Sciences ; Research And Analysis Methods
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 5
    In: PLoS ONE, 2017, Vol.12(2)
    Description: We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key aspect of this analysis is to show that GRBMs can be formulated as a constrained mixture of Gaussians, which gives a much better insight into the model’s capabilities and limitations. We further show that GRBMs are capable of learning meaningful features without using a regularization term and that the results are comparable to those of independent component analysis. This is illustrated for both a two-dimensional blind source separation task and for modeling natural image patches. Our findings exemplify that reported difficulties in training GRBMs are due to the failure of the training algorithm rather than the model itself. Based on our analysis we derive a better training setup and show empirically that it leads to faster and more robust training of GRBMs. Finally, we compare different sampling algorithms for training GRBMs and show that Contrastive Divergence performs better than training methods that use a persistent Markov chain.
    Keywords: Research Article ; Physical Sciences ; Physical Sciences ; Physical Sciences ; Biology And Life Sciences ; Physical Sciences ; Research And Analysis Methods ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Computer And Information Sciences
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 6
    In: PLoS ONE, 2014, Vol.9(3)
    Description: The purpose of feature selection is to identify the relevant and non-redundant features from a dataset. In this article, the feature selection problem is organized as a graph-theoretic problem where a feature-dissimilarity graph is shaped from the data matrix. The nodes represent features and the edges represent their dissimilarity. Both nodes and edges are given weight according to the feature’s relevance and dissimilarity among the features, respectively. The problem of finding relevant and non-redundant features is then mapped into densest subgraph finding problem. We have proposed a multiobjective particle swarm optimization (PSO)-based algorithm that optimizes average node-weight and average edge-weight of the candidate subgraph simultaneously. The proposed algorithm is applied for identifying relevant and non-redundant disease-related genes from microarray gene expression data. The performance of the proposed method is compared with that of several other existing feature selection techniques on different real-life microarray gene expression datasets.
    Keywords: Research Article ; Biology ; Computer Science ; Engineering ; Mathematics ; Medicine
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 7
    In: PLoS ONE, 2016, Vol.11(3)
    Description: Reverse-engineering of biological networks is a central problem in systems biology. The use of intervention data, such as gene knockouts or knockdowns, is typically used for teasing apart causal relationships among genes. Under time or resource constraints, one needs to carefully choose which intervention experiments to carry out. Previous approaches for selecting most informative interventions have largely been focused on discrete Bayesian networks. However, continuous Bayesian networks are of great practical interest, especially in the study of complex biological systems and their quantitative properties. In this work, we present an efficient, information-theoretic active learning algorithm for Gaussian Bayesian networks (GBNs), which serve as important models for gene regulatory networks. In addition to providing linear-algebraic insights unique to GBNs, leading to significant runtime improvements, we demonstrate the effectiveness of our method on data simulated with GBNs and the DREAM4 network inference challenge data sets. Our method generally leads to faster recovery of underlying network structure and faster convergence to final distribution of confidence scores over candidate graph structures using the full data, in comparison to random selection of intervention experiments.
    Keywords: Research Article ; Computer And Information Sciences ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Biology And Life Sciences ; Computer And Information Sciences ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Biology And Life Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Biology And Life Sciences
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 8
    In: PLoS ONE, 2015, Vol.10(6)
    Description: Background The joint study of multiple datasets has become a common technique for increasing statistical power in detecting biomarkers obtained from smaller studies. The approach generally followed is based on the fact that as the total number of samples increases, we expect to have greater power to detect associations of interest. This methodology has been applied to genome-wide association and transcriptomic studies due to the availability of datasets in the public domain. While this approach is well established in biostatistics, the introduction of new combinatorial optimization models to address this issue has not been explored in depth. In this study, we introduce a new model for the integration of multiple datasets and we show its application in transcriptomics. Methods We propose a new combinatorial optimization problem that addresses the core issue of biomarker detection in integrated datasets. Optimal solutions for this model deliver a feature selection from a panel of prospective biomarkers. The model we propose is a generalised version of the (α , β)-k -Feature Set problem. We illustrate the performance of this new methodology via a challenging meta-analysis task involving six prostate cancer microarray datasets. The results are then compared to the popular RankProd meta-analysis tool and to what can be obtained by analysing the individual datasets by statistical and combinatorial methods alone. Results Application of the integrated method resulted in a more informative signature than the rank-based meta-analysis or individual dataset results, and overcomes problems arising from real world datasets. The set of genes identified is highly significant in the context of prostate cancer. The method used does not rely on homogenisation or transformation of values to a common scale, and at the same time is able to capture markers associated with subgroups of the disease.
    Keywords: Research Article
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 9
    In: PLoS ONE, 2014, Vol.9(2)
    Description: The recently proposed modified-compressive sensing (modified-CS), which utilizes the partially known support as prior knowledge, significantly improves the performance of recovering sparse signals. However, modified-CS depends heavily on the reliability of the known support. An important problem, which must be studied further, is the recoverability of modified-CS when the known support contains a number of errors. In this letter, we analyze the recoverability of modified-CS in a stochastic framework. A sufficient and necessary condition is established for exact recovery of a sparse signal. Utilizing this condition, the recovery probability that reflects the recoverability of modified-CS can be computed explicitly for a sparse signal with nonzero entries. Simulation experiments have been carried out to validate our theoretical results.
    Keywords: Research Article ; Computer Science ; Engineering ; Mathematics
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
  • 10
    In: PLoS ONE, 2014, Vol.9(3)
    Description: Time-course gene expression datasets, which record continuous biological processes of genes, have recently been used to predict gene function. However, only few positive genes can be obtained from annotation databases, such as gene ontology (GO). To obtain more useful information and effectively predict gene function, gene annotations are clustered together to form a learnable and effective learning system. In this paper, we propose a novel multi-instance hierarchical clustering (MIHC) method to establish a learning system by clustering GO and compare this method with other learning system establishment methods. Multi-label support vector machine classifier and multi-label K-nearest neighbor classifier are used to verify these methods in four yeast time-course gene expression datasets. The MIHC method shows good performance, which serves as a guide to annotators or refines the annotation in detail.
    Keywords: Research Article ; Biology ; Computer Science ; Mathematics
    E-ISSN: 1932-6203
    Library Location Call Number Volume/Issue/Year Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. Further information can be found on the KOBV privacy pages