  • 1
    Oxford University Press (OUP) ; 2013
    In: Journal of the Royal Statistical Society Series A: Statistics in Society, Oxford University Press (OUP), Vol. 176, No. 3 (2013-06-01), p. 777-793
    Abstract: Linguistic corpora are databases of text which are linguistically marked up or otherwise structured and designed to be representative of a specific language. The growing availability of such corpora has brought with it opportunities for statistical analysis. The paper develops and uses statistical approaches to address questions pertaining to an important linguistic phenomenon: the use of different syntactic alternatives. We present a model-selection-based approach for determining possible driving attributes affecting verb complementation for written sentence constructions using the verb ‘give’ in three varieties of English. We are interested in explaining the choice of alternatives in terms of a variety of sentence level linguistic features such as the meaning of the verb, in addition to the country of origin.
    Type of Medium: Online Resource
    ISSN: 0964-1998 , 1467-985X
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2013
    ZDB-ID: 204794-9
    ZDB-ID: 1490715-X
  • 2
    In: Nature, Springer Science and Business Media LLC, Vol. 594, No. 7862 (2021-06-10), p. 265-270
    Abstract: Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine [1,2]. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes [3]. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation [4,5]. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning—a decentralized machine-learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination while maintaining confidentiality without the need for a central coordinator, thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.
    Type of Medium: Online Resource
    ISSN: 0028-0836 , 1476-4687
    Language: English
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2021
    ZDB-ID: 120714-3
    ZDB-ID: 1413423-8
    SSG: 11
  • 3
    Walter de Gruyter GmbH ; 2013
    In: Corpus Linguistics and Linguistic Theory, Walter de Gruyter GmbH, Vol. 9, No. 2 (2013-10-25), p. 187-225
    Abstract: This paper examines parallels and differences between South Asian Englishes and British English with regard to various factors driving the selection of verb-complementation patterns. Focusing on the prototypical ditransitive verb give and its complementation, we use large web-derived corpora and distinguish between two possible response cases, one based on the dative and prepositional construction (i.e. the dative alternation), the other including monotransitive complementation. Our data has been additionally coded for a number of potential driving factors, such as pronominality and discourse accessibility of the participants in the constructions. Applying a model-exploration technique we isolate the main driving factors for the varieties under scrutiny (Indian English, Pakistani English and British English) and analyze their influence on pattern selection based on a multinomial logistic regression formulation. Our findings show that, while there is a large area of overlap between the varieties, Pakistani English is closer to British English with regard to relevant driving factors than Indian English. Furthermore, we reveal interesting parallels between all three varieties in the use of monotransitive complementation.
    Type of Medium: Online Resource
    ISSN: 1613-7035 , 1613-7027
    Language: Unknown
    Publisher: Walter de Gruyter GmbH
    Publication Date: 2013
    ZDB-ID: 2180564-7
    SSG: 7,11
  • 4
    Oxford University Press (OUP) ; 2022
    In: Biostatistics, Oxford University Press (OUP), Vol. 24, No. 1 (2022-12-12), p. 85-107
    Abstract: Risk prediction models are a crucial tool in healthcare. Risk prediction models with a binary outcome (i.e., binary classification models) are often constructed using methodology which assumes the costs of different classification errors are equal. In many healthcare applications, this assumption is not valid, and the differences between misclassification costs can be quite large. For instance, in a diagnostic setting, the cost of misdiagnosing a person with a life-threatening disease as healthy may be larger than the cost of misdiagnosing a healthy person as a patient. In this article, we present Tailored Bayes (TB), a novel Bayesian inference framework which “tailors” model fitting to optimize predictive performance with respect to unbalanced misclassification costs. We use simulation studies to showcase when TB is expected to outperform standard Bayesian methods in the context of logistic regression. We then apply TB to three real-world applications, a cardiac surgery, a breast cancer prognostication task, and a breast cancer tumor classification task and demonstrate the improvement in predictive performance over standard methods.
    Type of Medium: Online Resource
    ISSN: 1465-4644 , 1468-4357
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2022
    ZDB-ID: 2020601-X
    SSG: 12
  • 5
    Oxford University Press (OUP) ; 2009
    In: Bioinformatics, Oxford University Press (OUP), Vol. 25, No. 2 (2009-01-15), p. 265-271
    Abstract: Motivation: Combinatorial effects, in which several variables jointly influence an output or response, play an important role in biological systems. In many settings, Boolean functions provide a natural way to describe such influences. However, biochemical data using which we may wish to characterize such influences are usually subject to much variability. Furthermore, in high-throughput biological settings Boolean relationships of interest are very often sparse, in the sense of being embedded in an overall dataset of higher dimensionality. This motivates a need for statistical methods capable of making inferences regarding Boolean functions under conditions of noise and sparsity. Results: We put forward a statistical model for sparse, noisy Boolean functions and methods for inference under the model. We focus on the case in which the form of the underlying Boolean function, as well as the number and identity of its inputs are all unknown. We present results on synthetic data and on a study of signalling proteins in cancer biology. Availability:  go.warwick.ac.uk/sachmukherjee/sci Contact:  s.n.mukherjee@warwick.ac.uk
    Type of Medium: Online Resource
    ISSN: 1367-4811 , 1367-4803
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2009
    ZDB-ID: 1468345-3
    SSG: 12
  • 6
    Oxford University Press (OUP) ; 2012
    In: Bioinformatics, Oxford University Press (OUP), Vol. 28, No. 18 (2012-09-15), p. 2342-2348
    Abstract: Motivation: Network inference approaches are widely used to shed light on regulatory interplay between molecular players such as genes and proteins. Biochemical processes underlying networks of interest (e.g. gene regulatory or protein signalling networks) are generally nonlinear. In many settings, knowledge is available concerning relevant chemical kinetics. However, existing network inference methods for continuous, steady-state data are typically rooted in statistical formulations, which do not exploit chemical kinetics to guide inference. Results: Herein, we present an approach to network inference for steady-state data that is rooted in non-linear descriptions of biochemical mechanism. We use equilibrium analysis of chemical kinetics to obtain functional forms that are in turn used to infer networks using steady-state data. The approach we propose is directly applicable to conventional steady-state gene expression or proteomic data and does not require knowledge of either network topology or any kinetic parameters. We illustrate the approach in the context of protein phosphorylation networks, using data simulated from a recent mechanistic model and proteomic data from cancer cell lines. In the former, the true network is known and used for assessment, whereas in the latter, results are compared against known biochemistry. We find that the proposed methodology is more effective at estimating network topology than methods based on linear models. Availability:  mukherjeelab.nki.nl/CODE/GK_Kinetics.zip Contact:  c.j.oates@warwick.ac.uk; s.mukherjee@nki.nl Supplementary Information:  Supplementary data are available at Bioinformatics online.
    Type of Medium: Online Resource
    ISSN: 1367-4811 , 1367-4803
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2012
    ZDB-ID: 1468345-3
    SSG: 12
  • 7
    Springer Science and Business Media LLC ; 2020
    In: Statistics and Computing, Springer Science and Business Media LLC, Vol. 30, No. 3 (2020-05), p. 697-719
    Abstract: Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper, we present a large-scale comparison of penalized regression methods. We distinguish between three related goals: prediction, variable selection and variable ranking. Our results span more than 2300 data-generating scenarios, including both synthetic and semisynthetic data (real covariates and simulated responses), allowing us to systematically consider the influence of various factors (sample size, dimensionality, sparsity, signal strength and multicollinearity). We consider several widely used approaches (Lasso, Adaptive Lasso, Elastic Net, Ridge Regression, SCAD, the Dantzig Selector and Stability Selection). We find considerable variation in performance between methods. Our results support a “no panacea” view, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods. The study allows us to make some recommendations as to which approaches may be most (or least) suitable given the goal and some data characteristics. Our empirical results complement existing theory and provide a resource to compare methods across a range of scenarios and metrics.
    Type of Medium: Online Resource
    ISSN: 0960-3174 , 1573-1375
    Language: English
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2020
    ZDB-ID: 2017741-0
  • 8
    Springer Science and Business Media LLC ; 2023
    In: Statistics and Computing, Springer Science and Business Media LLC, Vol. 33, No. 5 (2023-10)
    Abstract: Causal structure learning (CSL) refers to the estimation of causal graphs from data. Causal versions of tools such as ROC curves play a prominent role in empirical assessment of CSL methods and performance is often compared with “random” baselines (such as the diagonal in an ROC analysis). However, such baselines do not take account of constraints arising from the graph context and hence may represent a “low bar”. In this paper, motivated by examples in systems biology, we focus on assessment of CSL methods for multivariate data where part of the graph structure is known via interventional experiments. For this setting, we put forward a new class of baselines called graph-based predictors (GBPs). In contrast to the “random” baseline, GBPs leverage the known graph structure, exploiting simple graph properties to provide improved baselines against which to compare CSL methods. We discuss GBPs in general and provide a detailed study in the context of transitively closed graphs, introducing two conceptually simple baselines for this setting, the observed in-degree predictor (OIP) and the transitivity assuming predictor (TAP). While the former is straightforward to compute, for the latter we propose several simulation strategies. Moreover, we study and compare the proposed predictors theoretically, including a result showing that the OIP outperforms in expectation the “random” baseline on a subclass of latent network models featuring positive correlation among edge probabilities. Using both simulated and real biological data, we show that the proposed GBPs outperform random baselines in practice, often substantially. Some GBPs even outperform standard CSL methods (whilst being computationally cheap in practice). Our results provide a new way to assess CSL methods for interventional data.
    Type of Medium: Online Resource
    ISSN: 0960-3174 , 1573-1375
    Language: English
    Publisher: Springer Science and Business Media LLC
    Publication Date: 2023
    ZDB-ID: 2017741-0
  • 9
    Elsevier BV ; 2011
    In: SSRN Electronic Journal, Elsevier BV
    Type of Medium: Online Resource
    ISSN: 1556-5068
    Language: English
    Publisher: Elsevier BV
    Publication Date: 2011
  • 10
    MIT Press ; 2020
    In: Harvard Data Science Review, MIT Press (2020-09-30)
    Type of Medium: Online Resource
    Language: English
    Publisher: MIT Press
    Publication Date: 2020