Kooperativer Bibliotheksverbund Berlin-Brandenburg

Filter: PMC (PubMed Central) (52)
  • 1
    Language: English
    In: PLoS ONE, 2011, Vol.6(10), p.e25364
    Description: Diagnostic and prognostic biomarkers for cancer based on gene expression profiles are viewed as a major step towards better personalized medicine. Many studies using various computational approaches have been published in this direction during the last decade. However, when different gene signatures for related clinical questions are compared, often only a small overlap is observed. This can have various reasons, such as technical differences between platforms, differences in biological samples or their treatment in the lab, or statistical reasons arising from the high dimensionality of the data combined with small sample sizes, which leads to unstable selection of genes. As a consequence, retrieved gene signatures are often hard to interpret from a biological point of view. We here demonstrate that it is possible to construct a consensus signature from a set of seemingly different gene signatures by mapping them onto a protein interaction network. Common upstream proteins of close gene products, which we identified via our developed algorithm, show a very clear and significant functional interpretation in terms of overrepresented KEGG pathways, disease-associated genes and known drug targets. Moreover, we show that such a consensus signature can serve as prior knowledge for predictive biomarker discovery in breast cancer. Evaluation on different datasets shows that signatures derived from the consensus signature are much more stable than signatures learned from all probesets on a microarray, while at the same time being at least as predictive. Furthermore, they are clearly interpretable in terms of enriched pathways, disease-associated genes and known drug targets. In summary, we thus believe that network-based consensus signatures are not only a way to relate seemingly different gene signatures to each other in a functional manner, but also a way to establish prior knowledge for highly stable and interpretable predictive biomarkers.
    Keywords: Research Article ; Biology ; Computer Science ; Medicine ; Genetics And Genomics ; Molecular Biology ; Computational Biology ; Oncology ; Pathology ; Biochemistry
    E-ISSN: 1932-6203
  • 2
    In: PLoS ONE, 2013, Vol.8(6)
    Description: Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework for gaining insights into biological systems. However, the inherent noise in experimental data, coupled with a limited sample size, reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal-to-noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, such as pathway databases, GO terms and protein domain data, and is flexible enough to integrate new sources, if available.
    Keywords: Research Article ; Biology ; Computer Science
    E-ISSN: 1932-6203
  • 3
    In: PLoS ONE, 2017, Vol.12(11)
    Description: Parkinson’s Disease (PD) is a progressive neurodegenerative movement disease affecting over 6 million people worldwide. Loss of dopamine-producing neurons results in a range of both motor and non-motor symptoms; however, there is currently no definitive test for PD for use by non-specialist clinicians, especially in the early disease stages, where the symptoms may be subtle and poorly characterised. This results in a high misdiagnosis rate (up to 25% by non-specialists), and people can have the disease for many years before diagnosis. There is a need for a more accurate, objective means of early detection, ideally one which can be used by individuals in their home setting. In this investigation, keystroke timing information from 103 subjects (comprising 32 with mild PD severity and the remainder non-PD controls) was captured as they typed on a computer keyboard over an extended period; the data showed that PD affects various characteristics of hand and finger movement and that these can be detected. A novel methodology was used to classify the subjects’ disease status by utilising a combination of many keystroke features, which were analysed by an ensemble of machine learning classification models. When applied to two separate participant groups, this approach was able to successfully discriminate between early-PD subjects and controls with 96% sensitivity, 97% specificity and an AUC of 0.98. The technique does not require any specialised equipment or medical supervision, and does not rely on the experience and skill of the practitioner. Regarding more general application, it currently does not incorporate a second cardinal disease symptom, so it may not differentiate PD from similar movement-related disorders.
    Keywords: Research Article ; Medicine And Health Sciences ; Biology And Life Sciences ; Computer And Information Sciences ; Physical Sciences
    E-ISSN: 1932-6203
  • 4
    In: PLoS ONE, 2013, Vol.8(9)
    Description: Predictive, stable and interpretable gene signatures are generally seen as an important step towards better personalized medicine. During the last decade, various methods have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinics is the typically low reproducibility of signatures, combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. We here propose a technique that integrates network information as well as different kinds of experimental data (here exemplified by mRNA and miRNA expression) into one classifier. This is done by smoothing t-statistics of individual genes or miRNAs over the structure of a combined protein-protein interaction (PPI) and miRNA-target gene network. A permutation test is conducted to select features in a highly consistent manner, and subsequently a Support Vector Machine (SVM) classifier is trained. Compared to several other competing methods, our algorithm shows an overall better prediction performance for early versus late disease relapse and a higher signature stability. Moreover, the obtained gene lists can be clearly associated with biological knowledge, such as known disease genes and KEGG pathways. We demonstrate that our data integration strategy can improve classification performance compared to using a single data source only. Our method, called stSVM, is available in the R package netClass on CRAN ( http://cran.r-project.org ).
    Keywords: Research Article ; Biology ; Computer Science ; Engineering ; Mathematics ; Medicine
    E-ISSN: 1932-6203
  • 5
    In: PLoS ONE, 2016, Vol.11(11)
    Description: Background: The advance of omics technologies has made it possible to measure several data modalities on a system of interest. In this work, we illustrate how the Non-Parametric Combination (NPC) methodology can be used for simultaneously assessing the association of different molecular quantities with an outcome of interest. We argue that NPC methods have several potential applications in integrating heterogeneous omics technologies, for example identifying genes whose methylation and transcriptional levels are jointly deregulated, or finding proteins whose abundance shows the same trends as the expression of their encoding genes. Results: We implemented the NPC methodology within “omicsNPC”, an R function specifically tailored to the characteristics of omics data. We compare omicsNPC against a range of alternative methods on simulated as well as on real data. Comparisons on simulated data show that omicsNPC produces unbiased / calibrated p-values and performs equally or significantly better than the other methods included in the study; furthermore, the analysis of real data shows that omicsNPC (a) exhibits higher statistical power than other methods, (b) is easily applicable in a number of different scenarios, and (c) yields results with improved biological interpretability. Conclusions: The omicsNPC function behaves competitively in all comparisons conducted in this study. Taking into account that the method (i) requires minimal assumptions, (ii) can be used on different study designs and (iii) captures the dependences among heterogeneous data modalities, omicsNPC provides a flexible and statistically powerful solution for the integrative analysis of different omics data.
    Keywords: Research Article ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Medicine And Health Sciences
    E-ISSN: 1932-6203
  • 6
    In: PLoS ONE, 2017, Vol.12(2)
    Description: We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key aspect of this analysis is to show that GRBMs can be formulated as a constrained mixture of Gaussians, which gives a much better insight into the model’s capabilities and limitations. We further show that GRBMs are capable of learning meaningful features without using a regularization term and that the results are comparable to those of independent component analysis. This is illustrated for both a two-dimensional blind source separation task and for modeling natural image patches. Our findings exemplify that reported difficulties in training GRBMs are due to the failure of the training algorithm rather than the model itself. Based on our analysis we derive a better training setup and show empirically that it leads to faster and more robust training of GRBMs. Finally, we compare different sampling algorithms for training GRBMs and show that Contrastive Divergence performs better than training methods that use a persistent Markov chain.
    Keywords: Research Article ; Physical Sciences ; Biology And Life Sciences ; Research And Analysis Methods ; Computer And Information Sciences
    E-ISSN: 1932-6203
  • 7
    In: PLoS ONE, 2014, Vol.9(9)
    Description: Recent advances in big data and analytics research have provided a wealth of large data sets that are too big to be analyzed in their entirety, due to restrictions on computer memory or storage size. New Bayesian methods have been developed for data sets that are large only due to large sample sizes. These methods partition big data sets into subsets and perform independent Bayesian Markov chain Monte Carlo analyses on the subsets. The methods then combine the independent subset posterior samples to estimate a posterior density given the full data set. These approaches were shown to be effective for Bayesian models including logistic regression models, Gaussian mixture models and hierarchical models. Here, we introduce the R package parallelMCMCcombine which carries out four of these techniques for combining independent subset posterior samples. We illustrate each of the methods using a Bayesian logistic regression model for simulation data and a Bayesian Gamma model for real data; we also demonstrate features and capabilities of the R package. The package assumes the user has carried out the Bayesian analysis and has produced the independent subposterior samples outside of the package. The methods are primarily suited to models with unknown parameters of fixed dimension that exist in continuous parameter spaces. We envision this tool will allow researchers to explore the various methods for their specific applications and will assist future progress in this rapidly developing field.
    Keywords: Research Article ; Computer And Information Sciences ; Physical Sciences ; Research And Analysis Methods
    E-ISSN: 1932-6203
  • 8
    In: PLoS ONE, 2014, Vol.9(3)
    Description: The purpose of feature selection is to identify the relevant and non-redundant features from a dataset. In this article, the feature selection problem is formulated as a graph-theoretic problem in which a feature-dissimilarity graph is constructed from the data matrix. The nodes represent features and the edges represent their dissimilarity. Nodes and edges are weighted according to the features’ relevance and the dissimilarity among the features, respectively. The problem of finding relevant and non-redundant features is then mapped to the problem of finding the densest subgraph. We have proposed a multiobjective particle swarm optimization (PSO)-based algorithm that optimizes the average node-weight and average edge-weight of the candidate subgraph simultaneously. The proposed algorithm is applied to identify relevant and non-redundant disease-related genes from microarray gene expression data. The performance of the proposed method is compared with that of several other existing feature selection techniques on different real-life microarray gene expression datasets.
    Keywords: Research Article ; Biology ; Computer Science ; Engineering ; Mathematics ; Medicine
    E-ISSN: 1932-6203
  • 9
    In: PLoS ONE, 2016, Vol.11(3)
    Description: Reverse-engineering of biological networks is a central problem in systems biology. Intervention data, such as gene knockouts or knockdowns, are typically used for teasing apart causal relationships among genes. Under time or resource constraints, one needs to carefully choose which intervention experiments to carry out. Previous approaches for selecting the most informative interventions have largely focused on discrete Bayesian networks. However, continuous Bayesian networks are of great practical interest, especially in the study of complex biological systems and their quantitative properties. In this work, we present an efficient, information-theoretic active learning algorithm for Gaussian Bayesian networks (GBNs), which serve as important models for gene regulatory networks. In addition to providing linear-algebraic insights unique to GBNs, leading to significant runtime improvements, we demonstrate the effectiveness of our method on data simulated with GBNs and on the DREAM4 network inference challenge data sets. In comparison to random selection of intervention experiments, our method generally leads to faster recovery of the underlying network structure and faster convergence to the final distribution of confidence scores over candidate graph structures obtained using the full data.
    Keywords: Research Article ; Computer And Information Sciences ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences
    E-ISSN: 1932-6203
  • 10
    In: PLoS ONE, 2014, Vol.9(10)
    Description: Dependence measures and tests for independence have recently attracted a lot of attention, because they are the cornerstone of algorithms for network inference in probabilistic graphical models. Pearson's product moment correlation coefficient is still by far the most widely used statistic, yet it is largely constrained to detecting linear relationships. In this work we provide an exact formula for the kth nearest neighbor distance distribution of rank-transformed data. Based on that, we propose two novel tests for independence. An implementation of these tests, together with a general benchmark framework for independence testing, is freely available as a CRAN software package ( http://cran.r-project.org/web/packages/knnIndep ). In this paper we have benchmarked Pearson's correlation, Hoeffding's D, dcor, Kraskov's estimator for mutual information, maximal information criterion and our two tests. We conclude that no particular method is generally superior to all other methods. However, dcor and Hoeffding's D are the most powerful tests for many different types of dependence.
    Keywords: Research Article ; Biology And Life Sciences ; Computer And Information Sciences ; Physical Sciences
    E-ISSN: 1932-6203
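Entry 10 notes that Pearson's product moment correlation coefficient is largely constrained to detecting linear relationships. The following is a minimal sketch of that limitation, written in Python with only the standard library. The helper `pearson` and the toy data are hypothetical illustrations; they are not taken from the article or from the knnIndep package.

```python
import math


def pearson(xs, ys):
    """Pearson's product moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data (an assumption for illustration): x values symmetric around zero.
xs = [i / 10 for i in range(-50, 51)]
linear = [2 * x + 1 for x in xs]   # y is a linear function of x
quadratic = [x * x for x in xs]    # y is fully determined by x, but not linearly

r_linear = pearson(xs, linear)
r_quadratic = pearson(xs, quadratic)

print(f"linear:    r = {r_linear:.3f}")
print(f"quadratic: r = {r_quadratic:.3f}")
```

In the quadratic case y is a deterministic function of x, yet the coefficient is essentially zero because the positive and negative halves of the parabola cancel. This is the kind of dependence that the non-linear tests benchmarked in the article are meant to detect.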