UID:
almafu_9958130986802883
Format:
1 online resource (660 p.)
Edition:
1st ed.
ISBN:
1-280-63374-3
,
9786610633746
,
0-08-045940-4
Series Statement:
Handbook of statistics, v. 24
Content:
This book focuses on dealing with large-scale data, a field commonly referred to as data mining. The book is divided into three sections. The first deals with an introduction to statistical aspects of data mining and machine learning and includes applications to text analysis, computer intrusion detection, and hiding of information in digital files. The second section focuses on a variety of statistical methodologies that have proven to be effective in data mining applications. These include clustering, classification, multivariate density estimation, tree-based methods, pattern recognition, o
Note:
Description based upon print version of record.
,
front cover; copyright; front matter; Preface; Table of contents; Contributors; body; 1. Statistical Data Mining; Introduction 1; Computational complexity; Order of magnitude considerations; Feasibility limits due to CPU performance; Feasibility limits due to file transfer performance; Feasibility limits due to visual resolution; The computer science roots of data mining; Knowledge discovery in databases and data mining; Association rules; Data preparation; Missing values and outliers; Quantization; Databases; SQL; Data cubes and OLAP; Statistical methods for data mining; Density estimation
,
Cluster analysis; Hierarchical clustering; The number of groups problem; Artificial neural networks; The biological basis; Functioning of an artificial neural network; Back propagation; Visual data mining; The four stages of data graphics; Graphics constructs for visual data mining; Example 1 - PRIM 7 data; Example 2 - iterative denoising with hyperspectral data; Streaming data; Recursive analytic formulations; Counts, moments and densities; Evolutionary graphics; Waterfall diagrams and transient geographic mapping; Block-recursive plots and conditional plots; A final word; Acknowledgements 1
,
References 1; 2. From Data Mining to Knowledge Mining; Introduction 2; Knowledge generation operators; Discovering rules and patterns via AQ learning; Types of problems in learning from examples; Clustering of entities into conceptually meaningful categories; Automated improvement of the search space: constructive induction; Reducing the amount of data: selecting representative examples; Integrating qualitative and quantitative methods of numerical discovery; Predicting processes qualitatively; Knowledge improvement via incremental learning; Summarizing the logical data analysis approach
,
Strong patterns vs. complete and consistent rules; Ruleset visualization via concept association graphs; Integration of knowledge generation operators; Summary 2; Acknowledgements 2; References 2; 3. Mining Computer Security Data; Introduction 3; Basic TCP/IP; Overview of networking; The threat; Probes and scans; Denial of service attacks; Gaining access; Network monitoring; TCP sessions; Signatures versus anomalies; User profiling; Program profiling; Conclusions 3; References 3; 4. Data Mining of Text Files; 4. Introduction and background
,
Natural language processing at the word and sentence level; Hidden Markov models; Probabilistic context-free grammars; Word sense disambiguation; Supervised disambiguation; Unsupervised disambiguation; Approaches beyond the word and sentence level; Information retrieval; Vector space model; Generic implementation; Using term weights; Latent Semantic Indexing (LSI); Other approaches; The bigram proximity matrix; Measures of semantic similarity; Matching coefficient; Jaccard coefficient; Ochiai measure (also called cosine); L1 distance; Information radius measure (IRad)
,
Document classification via supervised learning.
,
English
Additional Edition:
ISBN 0-444-51141-5
Language:
English