Kooperativer Bibliotheksverbund Berlin-Brandenburg

  • 1
    Language: English
    In: F1000Research, 01 September 2019, Vol.7
    Description: Feature (or variable) selection is the process of identifying the minimal set of features with the highest predictive performance on the target variable of interest. Numerous feature selection algorithms have been developed over the years, but...
    Keywords: Medicine ; Women's Studies
    ISSN: 2046-1402
    E-ISSN: 2046-1402
  • 2
    In: PLoS ONE, 2015, Vol.10(5)
    Description: We address the problem of predicting the position of a miRNA duplex on a microRNA hairpin via the development and application of a novel SVM-based methodology. Our method combines a unique problem representation and an unbiased optimization protocol to learn an accurate predictive model, termed MiRduplexSVM, from miRBase 19.0. This is the first model that provides precise information about all four ends of the miRNA duplex. We show that (a) our method outperforms four state-of-the-art tools, namely MaturePred, MiRPara, MatureBayes and MiRdup, as well as a Simple Geometric Locator, when applied to the same training datasets employed for each tool and evaluated on a common blind test set; (b) in all comparisons, MiRduplexSVM shows superior performance, achieving up to a 60% increase in prediction accuracy for mammalian hairpins, and generalizes very well to plant hairpins without any special optimization; and (c) the tool has a number of important applications, such as the ability to accurately predict the miRNA or the miRNA*, given the opposite strand of a duplex. Its performance on this task is superior to the 2-nt overhang rule commonly used in computational studies and similar to that of a comparative genomic approach, without the need for prior knowledge or the complexity of performing multiple alignments. Finally, it is able to evaluate novel, potential miRNAs found either computationally or experimentally. In relation to recent confidence evaluation methods used in miRBase, MiRduplexSVM was successful in identifying high-confidence potential miRNAs.
    Keywords: Research Article
    E-ISSN: 1932-6203
  • 3
    In: PLoS ONE, 2016, Vol.11(11)
    Description: Background: The advance of omics technologies has made it possible to measure several data modalities on a system of interest. In this work, we illustrate how the Non-Parametric Combination (NPC) methodology can be used for simultaneously assessing the association of different molecular quantities with an outcome of interest. We argue that NPC methods have several potential applications in integrating heterogeneous omics technologies, for example identifying genes whose methylation and transcriptional levels are jointly deregulated, or finding proteins whose abundance shows the same trends as the expression of their encoding genes. Results: We implemented the NPC methodology within “omicsNPC”, an R function specifically tailored to the characteristics of omics data. We compare omicsNPC against a range of alternative methods on simulated as well as real data. Comparisons on simulated data show that omicsNPC produces unbiased / calibrated p-values and performs equally to or significantly better than the other methods included in the study; furthermore, the analysis of real data shows that omicsNPC (a) exhibits higher statistical power than other methods, (b) is easily applicable in a number of different scenarios, and (c) yields results with improved biological interpretability. Conclusions: The omicsNPC function behaves competitively in all comparisons conducted in this study. Taking into account that the method (i) requires minimal assumptions, (ii) can be used on different study designs and (iii) captures the dependences among heterogeneous data modalities, omicsNPC provides a flexible and statistically powerful solution for the integrative analysis of different omics data.
    Keywords: Research Article ; Physical Sciences ; Research And Analysis Methods ; Biology And Life Sciences ; Medicine And Health Sciences
    E-ISSN: 1932-6203
  • 4
    Language: English
    In: PLoS ONE, 2010, Vol.5(8), p.e11843
    Description: MicroRNAs (miRNAs) are small, single-stranded RNAs with a key role in post-transcriptional regulation of thousands of genes across numerous species. While several computational methods are currently available for identifying miRNA genes, accurate prediction of the mature miRNA remains a challenge. Existing approaches fall short not only in predicting the location of mature miRNAs but also in finding the functional strand(s) of miRNA precursors. Here, we present a computational tool that incorporates a Naive Bayes classifier to identify mature miRNA candidates based on sequence and secondary structure information of their miRNA precursors. We take into account both positive (true mature miRNAs) and negative (same-size non-mature miRNA sequences) examples to optimize sensitivity as well as specificity. Our method can accurately predict the start position of experimentally verified mature miRNAs for both human and mouse, achieving significantly higher (often double) accuracy compared with two existing methods. Moreover, the method exhibits very high generalization performance on miRNAs from two other organisms. More importantly, our method provides direct evidence about the features of miRNA precursors which may determine the location of the mature miRNA. We find that the triplet of positions 7, 8 and 9 from the mature miRNA end towards the closest hairpin has the largest discriminatory power, is relatively conserved in terms of sequence composition (mostly containing a Uracil) and is located within or in very close proximity to the hairpin loop, suggesting the existence of a possible recognition site for Dicer and associated proteins. This work describes a novel algorithm for identifying the start position of mature miRNA(s) produced by miRNA precursors. Our tool has significantly better (often double) performance than two existing approaches and provides new insights about the potential use of specific sequence/structural information as recognition signals for Dicer processing. Web Tool available at:
    Keywords: Research Article ; Biochemistry -- RNA Structure ; Computational Biology -- Genomics ; Genetics And Genomics -- Bioinformatics ; Molecular Biology -- Bioinformatics ; Molecular Biology -- Post-translational Regulation Of Gene Expression ; Molecular Biology -- RNA-protein Interactions
    E-ISSN: 1932-6203
  • 5
    Language: English
    In: International Journal on Artificial Intelligence Tools, October 2015, Vol.24(5)
    Description: In a typical supervised data analysis task, one needs to perform the following two tasks: (a) select an optimal combination of learning methods (e.g., for variable selection and classification) and tune their hyper-parameters (e.g., K in K-NN), also called model selection, and (b) provide an estimate of the performance of the final, reported model. Combining the two tasks is not trivial because when one selects the set of hyper-parameters that seems to provide the best estimated performance, this estimate is optimistic (biased/overfitted) due to performing multiple statistical comparisons. In this paper, we discuss the theoretical properties of performance estimation when model selection is present and we confirm that simple Cross-Validation with model selection is indeed optimistic (overestimates performance) in small-sample scenarios and should be avoided. We present in detail and investigate the theoretical properties of Nested Cross-Validation and a method by Tibshirani and Tibshirani for removing the estimation bias. In computational experiments with real datasets, both protocols provide conservative estimates of performance and should be preferred. These statements hold true even if feature selection is performed as preprocessing.
    Keywords: Performance Estimation ; Model Selection ; Cross Validation ; Stratification ; Comparative Evaluation ; Computer Science
    ISSN: 0218-2130
    E-ISSN: 1793-6349
  • 6
    Language: English
    In: Machine Learning, 2018, Vol.107(12), pp.1895-1922
    Description: Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best configuration is optimistically biased. We present an efficient bootstrap method that corrects for the bias, called Bootstrap Bias Corrected CV (BBC-CV). BBC-CV’s main idea is to bootstrap the whole process of selecting the best-performing configuration on the out-of-sample predictions of each configuration, without additional training of models. In comparison to the alternatives, namely the nested cross-validation (Varma and Simon in BMC Bioinform 7(1):91, 2006) and a method by Tibshirani and Tibshirani (Ann Appl Stat 822–829, 2009), BBC-CV is computationally more efficient, has smaller variance and bias, and is applicable to any metric of performance (accuracy, AUC, concordance index, mean squared error). Subsequently, we again employ the idea of bootstrapping the out-of-sample predictions to speed up the CV process. Specifically, using a bootstrap-based statistical criterion, we stop training models on new folds for configurations that are inferior with high probability. We name the method Bootstrap Bias Corrected with Dropping CV (BBCD-CV); it is both efficient and provides accurate performance estimates.
    Keywords: Performance estimation ; Bias correction ; Cross-validation ; Hyper-parameter optimization
    ISSN: 0885-6125
    E-ISSN: 1573-0565
  • 7
    In: Nucleic Acids Research, 2017, Vol.45(W1), pp.W270-W275
    Description: Flow and mass cytometry technologies can probe proteins as biological markers in thousands of individual cells simultaneously, providing unprecedented opportunities for reconstructing networks of protein interactions through machine learning algorithms. The network reconstruction (NR) problem has been well studied by the machine learning community. However, the potential of available methods remains largely unknown to the cytometry community, mainly due to their intrinsic complexity and the lack of comprehensive, powerful and easy-to-use NR software implementations specific to cytometry data. To bridge this gap, we present Single CEll NEtwork Reconstruction sYstem (SCENERY), a web server featuring several standard and advanced cytometry data analysis methods coupled with NR algorithms in a user-friendly, online environment. In SCENERY, users may upload their data and set their own study design. The server offers several data analysis options categorized into three classes of methods: data (pre)processing, statistical analysis and NR. The server also provides interactive visualization and download of results as ready-to-publish images or multimedia reports. Its core is modular and based on the widely used and robust R platform, allowing power users to extend its functionality by submitting their own NR methods. SCENERY is available at scenery.csd.uoc.gr or http://mensxmachina.org/en/software/ .
    Keywords: Web Server Issue
    ISSN: 0305-1048
    E-ISSN: 1362-4962
  • 8
    Language: English
    In: Nucleic Acids Research, May 2013, Vol.41(9), pp.4938-48
    Description: We report the genomic occupancy profiles of the key hematopoietic transcription factor GATA-1 in pro-erythroblasts and mature erythroid cells fractionated from day E12.5 mouse fetal liver cells. Integration of GATA-1 occupancy profiles with available genome-wide transcription factor and epigenetic profiles assayed in fetal liver cells enabled us to evaluate GATA-1 involvement in modulating local chromatin structure of target genes during erythroid differentiation. Our results suggest that GATA-1 associates preferentially with changes of specific epigenetic modifications, such as H4K16, H3K27 acetylation and H3K4 di-methylation. Furthermore, we used random forest (RF) non-linear regression to predict changes in the expression levels of GATA-1 target genes based on the genomic features available for pro-erythroblasts and mature fetal liver-derived erythroid cells. Remarkably, our prediction model explained a high proportion (62%) of the variation in gene expression. Hierarchical clustering of the proximity values calculated by the RF model produced a clear separation of upregulated versus downregulated genes and a further separation of downregulated genes into two distinct groups. Thus, our study of GATA-1 genome-wide occupancy profiles in mouse primary erythroid cells and their integration with global epigenetic marks reveals three clusters of GATA-1 gene targets that are associated with specific epigenetic signatures and functional characteristics.
    Keywords: Epigenesis, Genetic ; Erythropoiesis -- Genetics ; Gata1 Transcription Factor -- Metabolism ; Liver -- Metabolism
    ISSN: 0305-1048
    E-ISSN: 1362-4962
  • 9
    Language: English
    In: F1000Research, 2018, Vol.7, pp.1505
    Description: Feature (or variable) selection is the process of identifying the minimal set of features with the highest predictive performance on the target variable of interest. Numerous feature selection algorithms have been developed over the years, but only few have been implemented in R and made publicly available...
    Keywords: Feature Selection ; R Package ; Algorithms ; Computational Efficiency
    E-ISSN: 2046-1402
  • 10
    Language: English
    In: Journal of Diabetes and Its Complications, July 2013, Vol.27(4), pp.407-413
    Description: This work presents a systematic review of long-term risk assessment models for evaluating the probability of developing complications in diabetes patients. Diabetes mellitus can cause many complications if not adequately controlled; risk assessment models can help physicians and patients identify the complications most likely to arise and take the necessary countermeasures. We identified six large medical studies related to diabetes mellitus upon which currently available risk assessment models are built; all these studies lasted more than 5 years and most of them included some common demographic and clinical data strongly related to diabetic complications. The most common predictions for long-term diabetes complications are related to cardiovascular diseases and diabetic retinopathy. Our analysis of the literature led us to the conclusion that researchers and medical practitioners should take into account that some limitations undermine the applicability of risk assessment models; for example, it is hard to judge whether results obtained on a specific cohort can be effectively translated to other populations. Nevertheless, all these studies have contributed significantly to identifying significant risk factors associated with the major diabetes complications.
    Keywords: Diabetes Complications ; Risk Assessment ; Statistical Models ; Risk Factors ; Prognosis ; Systematic Review ; Medicine
    ISSN: 1056-8727
    E-ISSN: 1873-460X