Kooperativer Bibliotheksverbund Berlin-Brandenburg

  • 1
    Dissertation
    University of North Texas
    Language: English
    Description: Clustering techniques are important for gene expression data analysis. However, efficient computational algorithms for clustering time-series data are still lacking. This work documents two improvements on an existing profile-based greedy algorithm for short time-series data: the first is a scaling method applied during pre-processing of the raw data to handle some extreme cases; the second modifies the strategy to generate better clusters. Simulation data and real microarray data were used to evaluate these improvements; this approach could efficiently generate more accurate clusters. A new feature-based algorithm was also developed in which steady-state value, overshoot, rise time, settling time, and peak time are generated by a second-order control system for clustering purposes. This feature-based approach is much faster and more accurate than the existing profile-based algorithm for long time-series data.
    Keywords: Microarray Data ; Time Series ; Algorithm ; Clustering Analysis ; Distance Matrix ; Time Points
    Source: University of North Texas
  • 2
    Language: English
    Description: Conservation biologists are increasingly using phylogenetics as a tool to understand evolutionary relationships and taxonomic classification. The taxonomy of North American prairie grouse (sharp-tailed grouse, T. phasianellus; lesser prairie-chicken, T. pallidicinctus; greater prairie-chicken, T. cupido; including multiple subspecies) has been designated based on physical characteristics, geography, and behavior. However, previous studies have been inconclusive in determining the evolutionary history of prairie grouse based on genetic data. Therefore, additional research investigating the evolutionary history of prairie grouse is warranted. In this study, ten loci (including mitochondrial, autosomal, and Z-linked markers) were sequenced across multiple populations of prairie grouse, and both traditional and coalescent-based phylogenetic analyses were used to address the evolutionary history of this genus. Results from this study indicate that North American prairie grouse diverged in the last 200,000 years, with species-level taxa forming well-supported monophyletic clades in species tree analyses. With these results, managers of the critically endangered Attwater's prairie-chicken (T. c. attwateri) can better evaluate whether outcrossing Attwater's with greater prairie-chickens would be a viable management tool for Attwater's conservation.
    Keywords: Tympanuchus ; Phylogenetics ; Species Tree
    Source: University of North Texas
  • 3
    Dissertation
    University of North Texas
    Language: English
    Description: Obesity is a common disease among all ages that has threatened human health and has become a global concern. Gut microbiota can affect human metabolism and thus may modulate obesity. Certain mixes of gut microbiota can keep the host healthy or predispose the host to obesity. Modern next-generation sequencing techniques give access to a huge amount of genetic information underlying the microbiota and thus provide new insights into the functionality of these micro-organisms and their interactions with the host. Multiple previous studies have demonstrated that the microbiome might contribute to obesity by increasing dietary energy harvest, promoting fat deposition, and triggering systemic inflammation. However, these studies are based either on lab cultivation or on basic statistical analysis. In order to further explore how gut microbiota affect obesity, this thesis utilizes a series of machine learning methods to analyze a large amount of metagenomics data from the human gut microbiome. The publicly available HMP (Human Microbiome Project) metagenomic sequencing data, which contain microbiome data for healthy adults, including overweight and obese individuals, were used for this study. HMP gut data were organized based on two different feature definitions: taxonomic information and metabolic reconstruction information. Several widely used classification algorithms, namely Naive Bayes, Random Forest, SVM, and elastic net logistic regression, were applied to predict the healthy or obese status of the subjects and evaluated by cross-validation accuracy. Furthermore, the corresponding feature selection algorithms were used to identify signature features in each dataset that lead to the differences between healthy and obese samples. The results showed that these algorithms perform more poorly on taxonomic data than on metabolic pathway data, though many of the selected taxa are still supported by the literature.
    Among all the combinations of algorithms and data, elastic net logistic regression has the best cross-validation performance and is thus the best model. In this model, several important features are found, and some of these are consistent with previous studies. Rerunning the classifiers using the features selected by elastic net logistic regression further improved their performance. On the other hand, this study uncovered some new features that have not been supported by previous studies. These new features could also be potential targets for distinguishing obese and healthy subjects. The present thesis compares the strengths and weaknesses of different machine learning techniques with different types of features originating from the same metagenomics data. The features selected by these models could provide a deep understanding of the metabolic mechanisms of micro-organisms. It is therefore worthwhile to understand comprehensively the differences in gut microbiota between healthy and obese subjects, and particularly how the gut microbiome affects obesity.
    Keywords: Gut Microbiome ; Obesity ; Machine Learning
    Source: University of North Texas
  • 4
    Language: English
    Description: Extracting information from a stack of data is a tedious task, and the scenario is no different in proteomics. Volumes of research papers are published on the study of various proteins in several species, their interactions with other proteins, and the identification of proteins as possible biomarkers in causing diseases. It is a challenging task for biologists to keep track of these developments manually by reading through the literature. Several tools have been developed by computational linguists to assist the identification, extraction, and hypothesis generation of proteins and protein-protein interactions from biomedical publications and protein databases. However, they are confronted with the challenges of term variation, term ambiguity, access only to abstracts, and inconsistencies in the time-consuming manual curation of protein and protein-protein interaction repositories. This work attempts to attenuate these challenges by extracting protein-protein interactions in humans and eliciting possible interactions using association rule mining on full text, abstracts, and figure captions from publicly available biomedical literature databases. Two such databases are used in our study: Directory of Open Access Journals (DOAJ) and PubMed Central (PMC). A corpus is built using articles based on search terms. A dataset of more than 38,000 protein-protein interactions from the Human Protein Reference Database (HPRD) is cross-referenced to validate discovered interactive pairs. An optimally sized set of possible binary protein-protein interactions is generated to be made available for clinical or biological validation. A significant change in the number of new associations was found by altering the thresholds for the support and confidence metrics. This study reduces the limitations biologists face in keeping pace with the discovery of protein-protein interactions by manually reading the literature, and their need to validate each and every possible interaction.
    Keywords: Information Retrieval ; Association Rule Mining ; Text Mining ; Protein-Protein Interactions ; Information Extraction ; Protein Binding ; Proteins -- Reactivity ; Medical Informatics
    Source: University of North Texas
  • 5
    Dissertation
    University of North Texas
    Language: English
    Description: Metagenomics is the study of the totality of the genetic elements recovered from a defined environment. Unlike traditional microbiology, which analyzes only the small percentage of microbes that can survive in the laboratory, metagenomics allows researchers to obtain the entire genetic information from all the samples in the communities. Metagenomics thus enables understanding of the target environments and the hidden relationships between bacteria and diseases. In order to analyze metagenomics data efficiently, cutting-edge technologies for analyzing the relationships among microbes and communities are required. To overcome the challenges brought by rapid growth in metagenomics datasets, advances in novel methodologies for interpreting metagenomics data are clearly needed. The first two chapters of this dissertation summarize and compare the widely used methods in metagenomics and integrate these methods into pipelines. Properly analyzing metagenomics data requires a variety of bioinformatics and statistical approaches to deal with different situations. The raw reads from sequencing centers need to be processed and denoised in several steps and then further interpreted by ecological and statistical analysis. Understanding these algorithms and combining different approaches could therefore reduce the influence of noise and bias at different steps, and an efficient and accurate pipeline is important to robustly decipher the differences and functionality of bacteria in communities. Traditional statistical analysis and machine learning algorithms have limitations in analyzing metagenomics data. Thus, the remaining three chapters describe a new phylogeny-based machine learning and feature selection algorithm to overcome these problems. The new method outperforms traditional algorithms and can provide more robust candidate microbes for further analysis.
    With growing sample sizes, a deep neural network could potentially describe more complicated characteristics of the data and thus improve model accuracy. A deep learning framework is therefore designed on top of the shallow learning algorithm described above in order to further improve prediction and selection accuracy. The present dissertation provides a powerful tool that utilizes machine learning techniques to identify signature bacteria and key information from huge amounts of metagenomics data.
    Keywords: Metagenomics ; Machine Learning
    Source: University of North Texas
  • 6
    Dissertation
    University of North Texas
    Language: English
    Description: Significant research efforts have been devoted to large-scale dynamical systems, with the aim of understanding their complicated behaviors and managing their responses in real time. One pivotal technological obstacle in this process is the existence of uncertainty. Although many of these large-scale dynamical systems function well in the design stage, they may easily fail when operating in realistic environments, where environmental uncertainties modulate system dynamics and complicate real-time prediction and management tasks. This dissertation aims to develop systematic methodologies to evaluate the performance of large-scale dynamical systems under uncertainty, as a step toward real-time decision support. Two uncertainty evaluation approaches are pursued: the analytical approach and the effective simulation approach. The analytical approach abstracts the dynamics of the original stochastic systems and develops tractable analysis (e.g., jump-linear analysis) for the approximated systems. Despite the potential bias introduced in the approximation process, the analytical approach provides rich insights valuable for evaluating and managing the performance of large-scale dynamical systems under uncertainty. When a system’s complexity and scale are beyond tractable analysis, the effective simulation approach becomes very useful. The effective simulation approach aims to use a few smartly selected simulations to quickly evaluate a complex system’s statistical performance. This approach was originally developed to evaluate a single uncertain variable. This dissertation extends the approach to be scalable and effective for evaluating large-scale systems under a large number of uncertain variables.
    While a large portion of this dissertation focuses on the development of generic methods and theoretical analysis applicable to a broad range of large-scale dynamical systems, many results are illustrated through a representative large-scale application, strategic air traffic management, which is concerned with designing robust management plans subject to a wide range of weather possibilities at a look-ahead time of 2-15 hours.
    Keywords: Air Traffic Flow Management ; Stochastic Modeling ; Optimal Decision Making ; Uncertainty Evaluation ; Dynamics ; Large Scale Systems ; Uncertainty -- Mathematical Models
    Source: University of North Texas
  • 7
    Language: English
    Description: Thesis (Ph.D.)--University of Wisconsin--Madison, 2007. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (p. 126-131).
    Source: Networked Digital Library of Theses and Dissertations
  • 8
    Language: English
    Description: Thesis (Ph.D.)--University of Wisconsin--Madison, 2007. Includes bibliographical references (p. 126-131). Also available on the Internet.
    Source: Networked Digital Library of Theses and Dissertations
  • 9
    Language: English
    Description: Extracting information from a stack of data is a tedious task, and the scenario is no different in proteomics. Volumes of research papers are published on the study of various proteins in several species, their interactions with other proteins, and the identification of proteins as possible biomarkers in causing diseases. It is a challenging task for biologists to keep track of these developments manually by reading through the literature. Several tools have been developed by computational linguists to assist the identification, extraction, and hypothesis generation of proteins and protein-protein interactions from biomedical publications and protein databases. However, they are confronted with the challenges of term variation, term ambiguity, access only to abstracts, and inconsistencies in the time-consuming manual curation of protein and protein-protein interaction repositories. This work attempts to attenuate these challenges by extracting protein-protein interactions in humans and eliciting possible interactions using association rule mining on full text, abstracts, and figure captions from publicly available biomedical literature databases. Two such databases are used in our study: Directory of Open Access Journals (DOAJ) and PubMed Central (PMC). A corpus is built using articles based on search terms. A dataset of more than 38,000 protein-protein interactions from the Human Protein Reference Database (HPRD) is cross-referenced to validate discovered interactive pairs. An optimally sized set of possible binary protein-protein interactions is generated to be made available for clinical or biological validation. A significant change in the number of new associations was found by altering the thresholds for the support and confidence metrics. This study reduces the limitations biologists face in keeping pace with the discovery of protein-protein interactions by manually reading the literature, and their need to validate each and every possible interaction.
    Keywords: Information Retrieval ; Association Rule Mining ; Text Mining ; Protein-Protein Interactions ; Information Extraction ; Protein Binding ; Proteins -- Reactivity ; Medical Informatics
    Source: Networked Digital Library of Theses and Dissertations
  • 10
    Dissertation
    University of North Texas
    Language: English
    Description: Clustering techniques are important for gene expression data analysis. However, efficient computational algorithms for clustering time-series data are still lacking. This work documents two improvements on an existing profile-based greedy algorithm for short time-series data: the first is a scaling method applied during pre-processing of the raw data to handle some extreme cases; the second modifies the strategy to generate better clusters. Simulation data and real microarray data were used to evaluate these improvements; this approach could efficiently generate more accurate clusters. A new feature-based algorithm was also developed in which steady-state value, overshoot, rise time, settling time, and peak time are generated by a second-order control system for clustering purposes. This feature-based approach is much faster and more accurate than the existing profile-based algorithm for long time-series data.
    Keywords: Microarray Data ; Time Series ; Algorithm ; Clustering Analysis ; Distance Matrix ; Time Points
    Source: Networked Digital Library of Theses and Dissertations