UID:
almahu_9948198288902882
Extent:
1 online resource (xxxi, 683 pages)
ISBN:
9781118950845
,
1118950844
,
9781118950807
,
1118950801
,
9781322317465
,
1322317461
Summary:
"This book narrows down the scope of data mining by adopting a heavily modeling-oriented perspective"--
Note:
Preliminaries --
,
Tasks --
,
Introduction --
,
Knowledge --
,
Inference --
,
Inductive learning tasks --
,
Domain --
,
Instances --
,
Attributes --
,
Target attribute --
,
Input attributes --
,
Training set --
,
Model --
,
Performance --
,
Generalization --
,
Overfitting --
,
Algorithms --
,
Inductive learning as search --
,
Classification --
,
Concept --
,
Training set --
,
Model --
,
Performance --
,
Generalization --
,
Overfitting --
,
Algorithms --
,
Regression --
,
Target function --
,
Training set --
,
Model --
,
Performance --
,
Generalization --
,
Overfitting --
,
Algorithms --
,
Clustering --
,
Motivation --
,
Training set --
,
Model --
,
Crisp vs. soft clustering --
,
Hierarchical clustering --
,
Performance --
,
Generalization --
,
Algorithms --
,
Descriptive vs. predictive clustering --
,
Practical issues --
,
Incomplete data --
,
Noisy data --
,
Conclusion --
,
Further readings --
,
References --
,
Basic statistics --
,
Introduction --
,
Notational conventions --
,
Basic statistics as modeling --
,
Distribution description --
,
Continuous attributes --
,
Discrete attributes --
,
Confidence intervals --
,
m-Estimation --
,
Relationship detection --
,
Significance tests --
,
Continuous attributes --
,
Discrete attributes --
,
Mixed attributes --
,
Relationship detection caveats --
,
Visualization --
,
Boxplot --
,
Histogram --
,
Barplot --
,
Conclusion --
,
Further readings --
,
References --
,
Classification --
,
Decision trees --
,
Introduction --
,
Decision tree model --
,
Nodes and branches --
,
Leaves --
,
Split types --
,
Growing --
,
Algorithm outline --
,
Class distribution calculation --
,
Class label assignment --
,
Stop criteria --
,
Split selection --
,
Split application --
,
Complete process --
,
Pruning --
,
Pruning operators --
,
Pruning criterion --
,
Pruning control strategy --
,
Conversion to rule sets --
,
Prediction --
,
Class label prediction --
,
Class probability prediction --
,
Weighted instances --
,
Missing value handling --
,
Fractional instances --
,
Surrogate splits --
,
Conclusion --
,
Further readings --
,
References --
,
Naive Bayes classifier --
,
Introduction --
,
Bayes rule --
,
Classification by Bayesian inference --
,
Conditional class probability --
,
Prior class probability --
,
Independence assumption --
,
Conditional attribute value probabilities --
,
Model construction --
,
Prediction --
,
Practical issues --
,
Zero and small probabilities --
,
Linear classification --
,
Continuous attributes --
,
Missing attribute values --
,
Reducing naivety --
,
Conclusion --
,
Further readings --
,
References --
,
Linear classification --
,
Introduction --
,
Linear representation --
,
Inner representation function --
,
Outer representation function --
,
Threshold representation --
,
Logit representation --
,
Parameter estimation --
,
Delta rule --
,
Gradient descent --
,
Distance to decision boundary --
,
Least squares --
,
Discrete attributes --
,
Conclusion --
,
Further readings --
,
References --
,
Misclassification costs --
,
Introduction --
,
Cost representation --
,
Cost matrix --
,
Per-class cost vector --
,
Instance-specific costs --
,
Incorporating misclassification costs --
,
Instance weighting --
,
Instance resampling --
,
Minimum-cost rule --
,
Instance relabeling --
,
Effects of cost incorporation --
,
Experimental procedure --
,
Conclusion --
,
Further readings --
,
References --
,
Classification model evaluation --
,
Introduction --
,
Dataset performance --
,
Training performance --
,
True performance --
,
Performance measures --
,
Misclassification error --
,
Weighted misclassification error --
,
Mean misclassification cost --
,
Confusion matrix --
,
ROC analysis --
,
Probabilistic performance measures --
,
Evaluation procedures --
,
Model evaluation vs. modeling procedure evaluation --
,
Evaluation caveats --
,
Hold-out --
,
Cross-validation --
,
Leave-one-out --
,
Bootstrapping --
,
Choosing the right procedure --
,
Evaluation procedures for temporal data --
,
Conclusion --
,
Further readings --
,
References --
,
Regression --
,
Linear regression --
,
Introduction --
,
Linear representation --
,
Parametric representation --
,
Linear representation function --
,
Nonlinear representation functions --
,
Parameter estimation --
,
Mean square error minimization --
,
Delta rule --
,
Gradient descent --
,
Least squares --
,
Discrete attributes --
,
Advantages of linear models --
,
Beyond linearity --
,
Generalized linear representation --
,
Enhanced representation --
,
Polynomial regression --
,
Piecewise-linear regression --
,
Conclusion --
,
Further readings --
,
References --
,
Regression trees --
,
Introduction --
,
Regression tree model --
,
Nodes and branches --
,
Leaves --
,
Split types --
,
Piecewise-constant regression --
,
Growing --
,
Algorithm outline --
,
Target function summary statistics --
,
Target value assignment --
,
Stop criteria --
,
Split selection --
,
Split application --
,
Complete process --
,
Pruning --
,
Pruning operators --
,
Pruning criterion --
,
Pruning control strategy --
,
Prediction --
,
Weighted instances --
,
Missing value handling --
,
Fractional instances --
,
Surrogate splits --
,
Piecewise linear regression --
,
Growing --
,
Pruning --
,
Prediction --
,
Conclusion --
,
Further readings --
,
References --
,
Regression model evaluation --
,
Introduction --
,
Dataset performance --
,
Training performance --
,
True performance --
,
Performance measures --
,
Residuals --
,
Mean absolute error --
,
Mean square error --
,
Root mean square error --
,
Relative absolute error --
,
Coefficient of determination --
,
Correlation --
,
Weighted performance measures --
,
Loss functions --
,
Evaluation procedures --
,
Hold-out --
,
Cross-validation --
,
Leave-one-out --
,
Bootstrapping --
,
Choosing the right procedure --
,
Conclusion --
,
Further readings --
,
References --
,
Clustering --
,
(Dis)similarity measures --
,
Introduction --
,
Measuring dissimilarity and similarity --
,
Difference-based dissimilarity --
,
Euclidean distance --
,
Minkowski distance --
,
Manhattan distance --
,
Canberra distance --
,
Chebyshev distance --
,
Hamming distance --
,
Gower's coefficient --
,
Attribute weighting --
,
Attribute transformation --
,
Correlation-based similarity --
,
Discrete attributes --
,
Pearson's correlation similarity --
,
Spearman's correlation similarity --
,
Cosine similarity --
,
Missing attribute values --
,
Conclusion --
,
Further readings --
,
References --
,
K-Centers clustering --
,
Introduction --
,
Basic principle --
,
(Dis)similarity measures --
,
Algorithm scheme --
,
Initialization --
,
Stop criteria --
,
Cluster formation --
,
Implicit cluster modeling --
,
Instantiations --
,
k-Means --
,
Center adjustment --
,
Minimizing dissimilarity to centers --
,
Beyond means --
,
k-Medians --
,
k-Medoids --
,
Beyond (fixed) k --
,
Multiple runs --
,
Adaptive k-centers --
,
Explicit cluster modeling --
,
Conclusion --
,
Further readings --
,
References --
,
Hierarchical clustering --
,
Introduction --
,
Basic approaches --
,
(Dis)similarity measures --
,
Cluster hierarchies --
,
Motivation --
,
Model representation --
,
Agglomerative clustering --
,
Algorithm scheme --
,
Cluster linkage --
,
Divisive clustering --
,
Algorithm scheme --
,
Wrapping a flat clustering algorithm --
,
Stop criteria --
,
Hierarchical clustering visualization --
,
Hierarchical clustering prediction --
,
Cutting cluster hierarchies --
,
Cluster membership assignment --
,
Conclusion --
,
Further readings --
,
References --
,
Clustering model evaluation --
,
Introduction --
,
Dataset performance --
,
Training performance --
,
True performance --
,
Per-cluster quality measures --
,
Diameter --
,
Separation --
,
Isolation --
,
Silhouette width --
,
Davies-Bouldin index --
,
Overall quality measures --
,
Dunn index --
,
Average Davies-Bouldin index --
,
C index --
,
Average silhouette width --
,
Loglikelihood --
,
External quality measures --
,
Misclassification error --
,
Rand index --
,
General relationship detection measures --
,
Using quality measures --
,
Conclusion --
,
Further readings --
,
References --
,
Getting Better Models --
,
Model ensembles --
,
Introduction --
,
Model committees --
,
Base models --
,
Different training sets --
,
Different algorithms --
,
Different parameter setups --
,
Algorithm randomization --
,
Base model diversity --
,
Model aggregation --
,
Voting/Averaging --
,
Probability averaging --
,
Weighted voting/averaging --
,
Using as attributes --
,
Specific ensemble modeling algorithms --
,
Bagging --
,
Stacking --
,
Boosting --
,
Random forest --
,
Random Naive Bayes --
,
Quality of ensemble predictions --
,
Conclusion --
,
Further readings --
,
References --
,
Kernel methods --
,
Introduction --
,
Support vector machines --
,
Classification margin --
,
Maximum-margin hyperplane --
,
Primal form --
,
Dual form --
,
Soft margin --
,
Support vector regression --
,
Regression tube --
,
Primal form --
,
Dual form --
,
Kernel trick --
,
Kernel functions --
,
Linear kernel --
,
Polynomial kernel --
,
Radial kernel --
,
Sigmoid kernel --
,
Kernel prediction --
,
Kernel-based algorithms --
,
Kernel-based SVM --
,
Kernel-based SVR --
,
Conclusion --
,
Further readings --
,
References --
,
Attribute transformation --
,
Introduction --
,
Attribute transformation task --
,
Target task --
,
Target attribute --
,
Transformed attribute --
,
Training set --
,
Modeling transformations --
,
Nonmodeling transformations --
,
Simple transformations --
,
Standardization --
,
Normalization --
,
Aggregation --
,
Imputation --
,
Binary encoding --
,
Multiclass encoding --
,
Encoding and decoding functions --
,
1-of-k encoding --
,
Error-correcting encoding --
,
Effects of multiclass encoding --
,
Conclusion --
,
Further readings --
,
References --
,
Discretization --
,
Introduction --
,
Discretization task --
,
Motivation --
,
Task definition --
,
Discretization as modeling --
,
Discretization quality --
,
Unsupervised discretization --
,
Equal-width intervals --
,
Equal-frequency intervals --
,
Nonmodeling discretization --
,
Supervised discretization --
,
Pure-class discretization --
,
Bottom-up discretization --
,
Top-down discretization --
,
Effects of discretization --
,
Conclusion --
,
Further readings --
,
References --
,
Attribute selection --
,
Introduction --
,
Attribute selection task --
,
Motivation --
,
Task definition --
,
Algorithms --
,
Attribute subset search --
,
Search task --
,
Initial state --
,
Search operators --
,
State selection --
,
Stop criteria --
,
Attribute selection filters --
,
Simple statistical filters --
,
Correlation-based filters --
,
Consistency-based filters --
,
Relief --
,
Random forest --
,
Cutoff criteria --
,
Filter-driven search --
,
Attribute selection wrappers --
,
Subset evaluation --
,
Wrapper attribute selection --
,
Effects of attribute selection --
,
Conclusion --
,
Further readings --
,
References --
,
Case studies --
,
Introduction --
,
Datasets --
,
Packages --
,
Auxiliary functions --
,
Census income --
,
Data loading and preprocessing --
,
Default model --
,
Incorporating misclassification costs --
,
Pruning --
,
Attribute selection --
,
Final models --
,
Communities and crime --
,
Data loading --
,
Data quality --
,
Regression trees --
,
Linear models --
,
Attribute selection --
,
Piecewise-linear models --
,
Cover type --
,
Data loading and preprocessing --
,
Class imbalance --
,
Decision trees --
,
Class rebalancing --
,
Multiclass encoding --
,
Final classification models --
,
Clustering --
,
Conclusion --
,
Further readings --
,
References --
,
Closing --
,
Notation --
,
Attribute values --
,
Data subsets --
,
Probabilities --
,
R packages --
,
CRAN packages --
,
DMR packages --
,
Installing packages --
,
References --
,
Datasets.
Other editions:
Print version: Cichosz, Pawel. Data mining algorithms. Chichester, West Sussex, United Kingdom : Wiley, 2015 ISBN 9781118332580
Language:
English
Subject(s):
Electronic books.
URL:
https://onlinelibrary.wiley.com/doi/book/10.1002/9781118950951