    Online Resource
    Published: San Francisco: Elsevier Science & Technology
    UID: almahu_9949746877602882
    Format: 1 online resource (655 pages)
    Edition: 4th ed.
    ISBN: 9780128043578
    Series Statement: Morgan Kaufmann Series in Data Management Systems
    Note: Front Cover -- Data Mining -- Copyright Page -- Contents -- List of Figures -- List of Tables -- Preface -- Updated and Revised Content -- Second Edition -- Third Edition -- Fourth Edition -- Acknowledgments
      I. Introduction to data mining
      1 What's it all about? -- 1.1 Data Mining and Machine Learning -- Describing Structural Patterns -- Machine Learning -- Data Mining -- 1.2 Simple Examples: The Weather Problem and Others -- The Weather Problem -- Contact Lenses: An Idealized Problem -- Irises: A Classic Numeric Dataset -- CPU Performance: Introducing Numeric Prediction -- Labor Negotiations: A More Realistic Example -- Soybean Classification: A Classic Machine Learning Success -- 1.3 Fielded Applications -- Web Mining -- Decisions Involving Judgment -- Screening Images -- Load Forecasting -- Diagnosis -- Marketing and Sales -- Other Applications -- 1.4 The Data Mining Process -- 1.5 Machine Learning and Statistics -- 1.6 Generalization as Search -- Enumerating the Concept Space -- Bias -- Language bias -- Search bias -- Overfitting-avoidance bias -- 1.7 Data Mining and Ethics -- Reidentification -- Using Personal Information -- Wider Issues -- 1.8 Further Reading and Bibliographic Notes
      2 Input: concepts, instances, attributes -- 2.1 What's a Concept? -- 2.2 What's in an Example? -- Relations -- Other Example Types -- 2.3 What's in an Attribute? -- 2.4 Preparing the Input -- Gathering the Data Together -- ARFF Format -- Sparse Data -- Attribute Types -- Missing Values -- Inaccurate Values -- Unbalanced Data -- Getting to Know Your Data -- 2.5 Further Reading and Bibliographic Notes
      3 Output: knowledge representation -- 3.1 Tables -- 3.2 Linear Models -- 3.3 Trees -- 3.4 Rules -- Classification Rules -- Association Rules -- Rules With Exceptions -- More Expressive Rules -- 3.5 Instance-Based Representation -- 3.6 Clusters -- 3.7 Further Reading and Bibliographic Notes
      4 Algorithms: the basic methods -- 4.1 Inferring Rudimentary Rules -- Missing Values and Numeric Attributes -- 4.2 Simple Probabilistic Modeling -- Missing Values and Numeric Attributes -- Naïve Bayes for Document Classification -- Remarks -- 4.3 Divide-and-Conquer: Constructing Decision Trees -- Calculating Information -- Highly Branching Attributes -- 4.4 Covering Algorithms: Constructing Rules -- Rules Versus Trees -- A Simple Covering Algorithm -- Rules Versus Decision Lists -- 4.5 Mining Association Rules -- Item Sets -- Association Rules -- Generating Rules Efficiently -- 4.6 Linear Models -- Numeric Prediction: Linear Regression -- Linear Classification: Logistic Regression -- Linear Classification Using the Perceptron -- Linear Classification Using Winnow -- 4.7 Instance-Based Learning -- The Distance Function -- Finding Nearest Neighbors Efficiently -- Remarks -- 4.8 Clustering -- Iterative Distance-Based Clustering -- Faster Distance Calculations -- Choosing the Number of Clusters -- Hierarchical Clustering -- Example of Hierarchical Clustering -- Incremental Clustering -- Category Utility -- Remarks -- 4.9 Multi-instance Learning -- Aggregating the Input -- Aggregating the Output -- 4.10 Further Reading and Bibliographic Notes -- 4.11 Weka Implementations
      5 Credibility: evaluating what's been learned -- 5.1 Training and Testing -- 5.2 Predicting Performance -- 5.3 Cross-Validation -- 5.4 Other Estimates -- Leave-One-Out -- The Bootstrap -- 5.5 Hyperparameter Selection -- 5.6 Comparing Data Mining Schemes -- 5.7 Predicting Probabilities -- Quadratic Loss Function -- Informational Loss Function -- Remarks -- 5.8 Counting the Cost -- Cost-Sensitive Classification -- Cost-Sensitive Learning -- Lift Charts -- ROC Curves -- Recall-Precision Curves -- Remarks -- Cost Curves -- 5.9 Evaluating Numeric Prediction -- 5.10 The MDL Principle -- 5.11 Applying the MDL Principle to Clustering -- 5.12 Using a Validation Set for Model Selection -- 5.13 Further Reading and Bibliographic Notes
      II. More advanced machine learning schemes
      6 Trees and rules -- 6.1 Decision Trees -- Numeric Attributes -- Missing Values -- Pruning -- Estimating Error Rates -- Complexity of Decision Tree Induction -- From Trees to Rules -- C4.5: Choices and Options -- Cost-Complexity Pruning -- Discussion -- 6.2 Classification Rules -- Criteria for Choosing Tests -- Missing Values, Numeric Attributes -- Generating Good Rules -- Using Global Optimization -- Obtaining Rules From Partial Decision Trees -- Rules With Exceptions -- Discussion -- 6.3 Association Rules -- Building a Frequent Pattern Tree -- Finding Large Item Sets -- Discussion -- 6.4 Weka Implementations
      7 Extending instance-based and linear models -- 7.1 Instance-Based Learning -- Reducing the Number of Exemplars -- Pruning Noisy Exemplars -- Weighting Attributes -- Generalizing Exemplars -- Distance Functions for Generalized Exemplars -- Generalized Distance Functions -- Discussion -- 7.2 Extending Linear Models -- The Maximum Margin Hyperplane -- Nonlinear Class Boundaries -- Support Vector Regression -- Kernel Ridge Regression -- The Kernel Perceptron -- Multilayer Perceptrons -- Backpropagation -- Radial Basis Function Networks -- Stochastic Gradient Descent -- Discussion -- 7.3 Numeric Prediction With Local Linear Models -- Model Trees -- Building the Tree -- Pruning the Tree -- Nominal Attributes -- Missing Values -- Pseudocode for Model Tree Induction -- Rules From Model Trees -- Locally Weighted Linear Regression -- Discussion -- 7.4 Weka Implementations
      8 Data transformations -- 8.1 Attribute Selection -- Scheme-Independent Selection -- Searching the Attribute Space -- Scheme-Specific Selection -- 8.2 Discretizing Numeric Attributes -- Unsupervised Discretization -- Entropy-Based Discretization -- Other Discretization Methods -- Entropy-Based Versus Error-Based Discretization -- Converting Discrete to Numeric Attributes -- 8.3 Projections -- Principal Component Analysis -- Random Projections -- Partial Least Squares Regression -- Independent Component Analysis -- Linear Discriminant Analysis -- Quadratic Discriminant Analysis -- Fisher's Linear Discriminant Analysis -- Text to Attribute Vectors -- Time Series -- 8.4 Sampling -- Reservoir Sampling -- 8.5 Cleansing -- Improving Decision Trees -- Robust Regression -- Detecting Anomalies -- One-Class Learning -- Outlier Detection -- Generating Artificial Data -- 8.6 Transforming Multiple Classes to Binary Ones -- Simple Methods -- Error-Correcting Output Codes -- Ensembles of Nested Dichotomies -- 8.7 Calibrating Class Probabilities -- 8.8 Further Reading and Bibliographic Notes -- 8.9 Weka Implementations
      9 Probabilistic methods -- 9.1 Foundations -- Maximum Likelihood Estimation -- Maximum a Posteriori Parameter Estimation -- 9.2 Bayesian Networks -- Making Predictions -- Learning Bayesian Networks -- Specific Algorithms -- Data Structures for Fast Learning -- 9.3 Clustering and Probability Density Estimation -- The Expectation Maximization Algorithm for a Mixture of Gaussians -- Extending the Mixture Model -- Clustering Using Prior Distributions -- Clustering With Correlated Attributes -- Kernel Density Estimation -- Comparing Parametric, Semiparametric and Nonparametric Density Models for Classification -- 9.4 Hidden Variable Models -- Expected Log-Likelihoods and Expected Gradients -- The Expectation Maximization Algorithm -- Applying the Expectation Maximization Algorithm to Bayesian Networks -- 9.5 Bayesian Estimation and Prediction -- Probabilistic Inference Methods -- Probability propagation -- Sampling, simulated annealing, and iterated conditional modes -- Variational inference -- 9.6 Graphical Models and Factor Graphs -- Graphical Models and Plate Notation -- Probabilistic Principal Component Analysis -- Inference with PPCA -- Marginal log-likelihood for PPCA -- Expected log-likelihood for PPCA -- Expected gradient for PPCA -- EM for PPCA -- Latent Semantic Analysis -- Using Principal Component Analysis for Dimensionality Reduction -- Probabilistic LSA -- Latent Dirichlet Allocation -- Factor Graphs -- Factor graphs, Bayesian networks, and the logistic regression model -- Markov Random Fields -- Computing Using the Sum-Product and Max-Product Algorithms -- Marginal probabilities -- The sum-product algorithm -- Sum-product algorithm example -- Most probable explanation example -- The max-product or max-sum algorithm -- 9.7 Conditional Probability Models -- Linear and Polynomial Regression as Probability Models -- Using Priors on Parameters -- Matrix vector formulations of linear and polynomial regression -- Multiclass Logistic Regression -- Matrix vector formulation of multiclass logistic regression -- Priors on parameters, and the regularized loss function -- Gradient Descent and Second-Order Methods -- Generalized Linear Models -- Making Predictions for Ordered Classes -- Conditional Probabilistic Models Using Kernels -- 9.8 Sequential and Temporal Models -- Markov Models and N-gram Methods -- Hidden Markov Models -- Conditional Random Fields -- From Markov random fields to conditional random fields -- Linear chain conditional random fields -- Learning for chain-structured conditional random fields -- Using conditional random fields for text mining -- 9.9 Further Reading and Bibliographic Notes -- Software Packages and Implementations -- 9.10 Weka Implementations
      10 Deep learning.
    Additional Edition: Print version: Witten, Ian H. Data Mining. San Francisco: Elsevier Science & Technology, c2016. ISBN 9780128042915
    Language: English