Kooperativer Bibliotheksverbund Berlin-Brandenburg


  • 1
    In: PLoS ONE, 2018, Vol.13(10)
    Description: Public hospital spending consumes a large share of government expenditure in many countries. The large cost variability observed between hospitals and also between patients in the same hospital has fueled the belief that consumption of a significant portion of this funding may result in no clinical benefit to patients, thus representing waste. Accurate identification of the main hospital cost drivers and relating them quantitatively to the observed cost variability is a necessary step towards identifying and reducing waste. This study identifies prime cost drivers in a typical, mid-sized Australian hospital and classifies them as sources of cost variability that are either warranted or not warranted—and therefore contributing to waste. An essential step is dimension reduction using Principal Component Analysis to pre-process the data by separating out the low value ‘noise’ from otherwise valuable information. Crucially, the study then adjusts for possible co-linearity of different cost drivers by the use of the sparse group lasso technique. This ensures reliability of the findings and represents a novel and powerful approach to analysing hospital costs. Our statistical model included 32 potential cost predictors with a sample size of over 50,000 hospital admissions. The proportion of cost variability potentially not clinically warranted was estimated at 33.7%. Given the financial footprint involved, once the findings are extrapolated nationwide, this estimation has far-reaching significance for health funding policy.
    Keywords: Research Article ; Medicine And Health Sciences ; Research And Analysis Methods ; Physical Sciences ; People And Places ; Biology And Life Sciences ; Computer And Information Sciences
    E-ISSN: 1932-6203
  • 2
    Language: English
    In: PLoS ONE, 2012, Vol.7(5), p.e35077
    Description: Network inference deals with the reconstruction of biological networks from experimental data. A variety of different reverse engineering techniques are available; they differ in the underlying assumptions and mathematical models used. One common problem for all approaches stems from the complexity of the task, due to the combinatorial explosion of different network topologies for increasing network size. To handle this problem, constraints are frequently used, for example on the node degree, number of edges, or constraints on regulation functions between network components. We propose to exploit topological considerations in the inference of gene regulatory networks. Such systems are often controlled by a small number of hub genes, while most other genes have only limited influence on the network's dynamics. We model gene regulation using a Bayesian network with discrete, Boolean nodes. A hierarchical prior is employed to identify hub genes. The first layer of the prior is used to regularize weights on edges emanating from one specific node. A second prior on hyperparameters controls the magnitude of the former regularization for different nodes. The net effect is that central nodes tend to form in reconstructed networks. Network reconstruction is then performed by maximization of or sampling from the posterior distribution. We evaluate our approach on simulated and real experimental data, indicating that we can reconstruct main regulatory interactions from the data. We furthermore compare our approach to other state-of-the-art methods, showing superior performance in identifying hubs. Using a large publicly available dataset of over 800 cell cycle regulated genes, we are able to identify several main hub genes. Our method may thus provide a valuable tool to identify interesting candidate genes for further study. Furthermore, the approach presented may stimulate further developments in regularization methods for network reconstruction from data.
    Keywords: Research Article ; Biology ; Genetics And Genomics ; Computational Biology
    E-ISSN: 1932-6203
  • 3
    In: PLoS ONE, 2014, Vol.9(5)
    Description: To obtain predictive genes with lower redundancy and better interpretability, a hybrid gene selection method encoding prior information is proposed in this paper. To begin with, the prior information, referred to as gene-to-class sensitivity (GCS), of all genes from microarray data is extracted by a single-hidden-layer feedforward neural network (SLFN). Then, to select more representative and less redundant genes, all genes are grouped into clusters by the K-means method, and some low-sensitivity genes are filtered out according to their GCS values. Finally, a modified binary particle swarm optimization (BPSO) encoding the GCS information is proposed to perform further gene selection from the remaining genes. By considering the GCS information, the proposed method selects those genes highly correlated to sample classes. Thus, the low-redundancy gene subsets obtained by the proposed method also contribute to improving classification accuracy on microarray data. The experimental results on some open microarray data verify the effectiveness and efficiency of the proposed approach.
    Keywords: Research Article ; Biology And Life Sciences ; Computer And Information Sciences ; Physical Sciences ; Research And Analysis Methods
    E-ISSN: 1932-6203
  • 4
    In: PLoS ONE, 2014, Vol.9(4)
    Description: Inferring gene regulatory networks (GRNs) is a major issue in systems biology, which explicitly characterizes regulatory processes in the cell. The Path Consistency Algorithm based on Conditional Mutual Information (PCA-CMI) is a well-known method in this field. In this study, we introduce a new algorithm (IPCA-CMI) and apply it to a number of gene expression data sets in order to evaluate the accuracy of the algorithm to infer GRNs. The IPCA-CMI can be categorized as a hybrid method, using the PCA-CMI and Hill-Climbing algorithm (based on MIT score). The conditional dependence between variables is determined by the conditional mutual information test, which can take into account both linear and nonlinear gene relations. IPCA-CMI uses a score and search method and defines a selected set of variables which is adjacent to one of X or Y. This set is used to determine the dependency between X and Y. This method is compared with the method of evaluating dependency by PCA-CMI, in which the set of variables adjacent to both X and Y is selected. The merits of the IPCA-CMI are evaluated by applying this algorithm to the DREAM3 Challenge data sets with n variables and n samples ( ) and to experimental data from Escherichia coli containing 9 variables and 9 samples. Results indicate that applying the IPCA-CMI improves the precision of learning the structure of the GRNs in comparison with that of the PCA-CMI.
    Keywords: Research Article ; Biology And Life Sciences ; Computer And Information Sciences ; Physical Sciences
    E-ISSN: 1932-6203
  • 5
    In: PLoS ONE, 2014, Vol.9(8)
    Description: Background: The spread of Bluetongue virus (BTV) among ruminants is caused by movement of infected host animals or by movement of infected Culicoides midges, the vector of BTV. Biologically plausible models of Culicoides dispersal are necessary for predicting the spread of BTV and are important for planning control and eradication strategies. Methods: A spatially-explicit simulation model which captures the two underlying population mechanisms, population dynamics and movement, was developed using extensive data from a trapping program for C. brevitarsis on the east coast of Australia. A realistic midge flight sub-model was developed and the annual incursion and population establishment of C. brevitarsis was simulated. Data from the literature was used to parameterise the model. Results: The model was shown to reproduce the spread of C. brevitarsis southwards along the east Australian coastline in spring, from an endemic population to the north. Such incursions were shown to be reliant on wind-dispersal; Culicoides midge active flight on its own was not capable of achieving known rates of southern spread, nor was re-emergence of southern populations due to overwintering larvae. Data from midge trapping programmes were used to qualitatively validate the resulting simulation model. Conclusions: The model described in this paper is intended to form the vector component of an extended model that will also include BTV transmission. A model of midge movement and population dynamics has been developed in sufficient detail such that the extended model may be used to evaluate the timing and extent of BTV outbreaks. This extended model could then be used as a platform for addressing the effectiveness of spatially targeted vaccination strategies or animal movement bans as BTV spread mitigation measures, or the impact of climate change on the risk and extent of outbreaks. These questions involving incursive Culicoides spread cannot be simply addressed with non-spatial models.
    Keywords: Research Article ; Biology And Life Sciences ; Computer And Information Sciences ; Medicine And Health Sciences
    E-ISSN: 1932-6203
  • 6
    In: PLoS ONE, 2017, Vol.12(10)
    Description: High-throughput gene expression data are often obtained from pure or complex (heterogeneous) biological samples. In the latter case, data obtained are a mixture of different cell types and the heterogeneity imposes some difficulties in the analysis of such data. In order to make conclusions on gene expression data obtained from heterogeneous samples, methods such as microdissection and flow cytometry have been employed to physically separate the constituting cell types. However, these manual approaches are time consuming when measuring the responses of multiple cell types simultaneously. In addition, exposed samples, on many occasions, end up being contaminated with external perturbations and this may result in an altered yield of molecular content. In this paper, we model the heterogeneous gene expression data using a Bayesian framework, treating the cell type proportions and the cell-type specific expressions as the parameters of the model. Specifically, we present a novel sequential Monte Carlo (SMC) sampler for estimating the model parameters by approximating their posterior distributions with a set of weighted samples. The SMC framework is a robust and efficient approach where we construct a sequence of artificial target (posterior) distributions on spaces of increasing dimensions which admit the distributions of interest as marginals. The proposed algorithm is evaluated on simulated datasets and publicly available real datasets, including Affymetrix oligonucleotide arrays and National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO), with varying number of cell types. The results obtained on all datasets show a superior performance with an improved accuracy in the estimation of cell type proportions and the cell-type specific expressions, and in addition, more accurate identification of differentially expressed genes when compared to other widely known methods for blind decomposition of heterogeneous gene expression data such as Dsection and the nonnegative matrix factorization (NMF) algorithms. MATLAB implementation of the proposed SMC algorithm is available to download at https://github.com/moyanre/smcgenedeconv.git .
    Keywords: Research Article ; Biology And Life Sciences ; Physical Sciences ; Research And Analysis Methods ; Medicine And Health Sciences ; Science Policy ; Computer And Information Sciences ; Engineering And Technology
    E-ISSN: 1932-6203
  • 7
    Language: English
    In: PLOS ONE, 11/26/2018, Vol.13(11), p.e0207579
    Description: Recently, a number of analytical approaches for probing medical databases have been developed to assist in disease risk assessment and to determine the association of a clinical condition with others, so that better and intelligent healthcare can be provided. The early assessment of disease risk is an emerging topic in medical informatics. If diseases are detected at an early stage, prognosis can be improved and medical resources can be used more efficiently. For example, if rheumatoid arthritis (RA) is detected at an early stage, appropriate medications can be used to prevent bone deterioration. In early disease risk assessment, finding important risk factors from large-scale medical databases and performing individual disease risk assessment have been challenging tasks. A number of recent studies have considered risk factor analysis approaches, such as association rule mining, sequential rule mining, regression, and expert advice. In this study, to improve disease risk assessment, machine learning and matrix factorization techniques were integrated to discover important and implicit risk factors. A novel framework is proposed that can effectively assess early disease risks, and RA is used as a case study. This framework comprises three main stages: data preprocessing, risk factor optimization, and early disease risk assessment. This is the first study integrating matrix factorization and machine learning for disease risk assessment that is applied to a nation-wide and longitudinal medical diagnostic database. In the experimental evaluations, a cohort established from a large-scale medical database was used that included 1007 RA-diagnosed patients and 921,192 control patients examined over a nine-year follow-up period (2000-2008). The evaluation results demonstrate that the proposed approach is more efficient and stable for disease risk assessment than state-of-the-art methods.
    Keywords: Risk Assessment – Case Studies ; Rheumatoid Factor – Case Studies ; Machine Learning – Case Studies ; Arthritis – Prognosis ; Arthritis – Development and Progression ; Arthritis – Case Studies ; Medical Research – Case Studies ; Antiarthritic Agents – Case Studies ; Medical Informatics – Case Studies ; Natural Language Processing – Case Studies ; Online Health Care Information Services – Case Studies
    E-ISSN: 1932-6203
  • 8
    In: PLoS ONE, 2018, Vol.13(2)
    Description: Building prediction models based on complex omics datasets such as transcriptomics, proteomics, and metabolomics remains a challenge in bioinformatics and biostatistics. Regularized regression techniques are typically used to deal with the high dimensionality of these datasets. However, due to the presence of correlation in the datasets, it is difficult to select the best model and application of these methods yields unstable results. We propose a novel strategy for model selection where the obtained models also perform well in terms of overall predictability. Several three-step approaches are considered, where the steps are 1) network construction, 2) clustering to empirically derive modules or pathways, and 3) building a prediction model incorporating the information on the modules. For the first step, we use weighted correlation networks and Gaussian graphical modelling. Identification of groups of features is performed by hierarchical clustering. The grouping information is included in the prediction model by using group-based variable selection or group-specific penalization. We compare the performance of our new approaches with standard regularized regression via simulations. Based on these results we provide recommendations for selecting a strategy for building a prediction model given the specific goal of the analysis and the sizes of the datasets. Finally we illustrate the advantages of our approach by application of the methodology to two problems, namely prediction of body mass index in the DIetary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome study (DILGOM) and prediction of response of each breast cancer cell line to treatment with specific drugs using a breast cancer cell lines pharmacogenomics dataset.
    Keywords: Research Article ; Research And Analysis Methods ; Physical Sciences ; Computer And Information Sciences ; Medicine And Health Sciences ; Biology And Life Sciences
    E-ISSN: 1932-6203
  • 9
    Language: English
    In: BMC Bioinformatics, 22 July 2014, Vol.15, p.250
    Description: Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system's response after systematic perturbations are available. We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway. Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.
    Keywords: Algorithms ; Signal Transduction ; Systems Biology -- Methods
    E-ISSN: 1471-2105
  • 10
    In: PLoS ONE, 2015, Vol.10(9)
    Description: Inferring the gene regulatory network (GRN) is crucial to understanding the working of the cell. Many computational methods attempt to infer the GRN from time series expression data, instead of through expensive and time-consuming experiments. However, existing methods make the convenient but unrealistic assumption of causal sufficiency, i.e. all the relevant factors in the causal network have been observed and there are no unobserved common causes. In principle, in the real world, it is impossible to be certain that all relevant factors or common causes have been observed, because some factors may not have been conceived of, and therefore are impossible to measure. In view of this, we have developed a novel algorithm named HCC-CLINDE to infer a GRN from time series data allowing the presence of hidden common cause(s). We assume there is a sparse causal graph (possibly with cycles) of interest, where the variables are continuous and each causal link has a delay (possibly more than one time step). A small but unknown number of variables are not observed. Each unobserved variable has only observed variables as children and parents, with at least two children, and the children are not linked to each other. Since it is difficult to obtain very long time series, our algorithm is also capable of utilizing multiple short time series, which is more realistic. To our knowledge, our algorithm is far less restrictive than previous works. We have performed extensive experiments using synthetic data on GRNs of size up to 100, with up to 10 hidden nodes. The results show that our algorithm can adequately recover the true causal GRN and is robust to slight deviation from Gaussian distribution in the error terms. We have also demonstrated the potential of our algorithm on small YEASTRACT subnetworks using limited real data.
    Keywords: Research Article
    E-ISSN: 1932-6203
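Note on result 4: the abstract relies on a conditional mutual information (CMI) test to decide whether two genes X and Y remain dependent given a set of adjacent variables Z. The sketch below illustrates that kind of test only. It assumes jointly Gaussian data, in which case CMI reduces to a combination of covariance log-determinants; this Gaussian estimator, the function name `gaussian_cmi`, and the synthetic data are assumptions made for illustration and are not taken from the catalogued paper.

```python
import numpy as np


def gaussian_cmi(x, y, z=None):
    """Estimate I(X;Y|Z) in nats, ASSUMING jointly Gaussian variables.

    x, y: 1-D arrays of samples; z: optional 1-D or 2-D array (samples x variables).
    Hypothetical helper for illustration, not the paper's implementation.
    """
    def log_det_cov(*blocks):
        # Stack the given variables as columns and take log|covariance|.
        data = np.column_stack(blocks)
        cov = np.atleast_2d(np.cov(data, rowvar=False))
        _, log_det = np.linalg.slogdet(cov)
        return log_det

    if z is None:
        # Plain mutual information I(X;Y).
        return 0.5 * (log_det_cov(x) + log_det_cov(y) - log_det_cov(x, y))
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z); Gaussian constants cancel.
    return 0.5 * (log_det_cov(x, z) + log_det_cov(y, z)
                  - log_det_cov(z) - log_det_cov(x, y, z))


# Synthetic example: X and Y are both driven by Z and by nothing else.
rng = np.random.default_rng(0)
z = rng.normal(size=5000)
x = z + rng.normal(size=5000)
y = z + rng.normal(size=5000)

print("I(X;Y)   =", gaussian_cmi(x, y))     # clearly above zero
print("I(X;Y|Z) =", gaussian_cmi(x, y, z))  # close to zero
```

In this toy setup X and Y look dependent on their own, but the dependence vanishes once Z is conditioned on, which is the situation in which a path-consistency style procedure would drop the X-Y edge. The threshold used to call a value "close to zero" is a tuning choice and is left out here.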