Format:
1 online resource (428 pages)
Edition:
1st ed.
ISBN:
9781784392659
Content:
If you are a data analyst who has a firm grip on some advanced data analysis techniques and wants to learn how to leverage the features of R, this is the book for you. You should have some basic knowledge of the R language and be familiar with some data science topics.
Note:
Intro -- R for Data Science -- Table of Contents -- R for Data Science -- Credits -- About the Author -- About the Reviewers -- www.PacktPub.com -- Support files, eBooks, discount offers, and more -- Why subscribe? -- Free access for Packt account holders -- Preface -- What this book covers -- What you need for this book -- Who this book is for -- Conventions -- Reader feedback -- Customer support -- Downloading the example code -- Downloading the color images of this book -- Errata -- Piracy -- Questions -- 1. Data Mining Patterns -- Cluster analysis -- K-means clustering -- Usage -- Example -- K-medoids clustering -- Usage -- Example -- Hierarchical clustering -- Usage -- Example -- Expectation-maximization -- Usage -- List of model names -- Example -- Density estimation -- Usage -- Example -- Anomaly detection -- Show outliers -- Example -- Example -- Another anomaly detection example -- Calculating anomalies -- Usage -- Example 1 -- Example 2 -- Association rules -- Mine for associations -- Usage -- Example -- Questions -- Summary -- 2. Data Mining Sequences -- Patterns -- Eclat -- Usage -- Using eclat to find similarities in adult behavior -- Finding frequent items in a dataset -- An example focusing on highest frequency -- arulesNBMiner -- Usage -- Mining the Agrawal data for frequent sets -- Apriori -- Usage -- Evaluating associations in a shopping basket -- Determining sequences using TraMineR -- Usage -- Determining sequences in training and careers -- Similarities in the sequence -- Sequence metrics -- Usage -- Example -- Questions -- Summary -- 3. Text Mining -- Packages -- Text processing -- Example -- Creating a corpus -- Converting text to lowercase -- Removing punctuation -- Removing numbers -- Removing words -- Removing whitespaces -- Word stems -- Document term matrix -- Using VectorSource -- Text clusters -- Word graphics
Analyzing the XML text -- Questions -- Summary -- 4. Data Analysis - Regression Analysis -- Packages -- Simple regression -- Multiple regression -- Multivariate regression analysis -- Robust regression -- Questions -- Summary -- 5. Data Analysis - Correlation -- Packages -- Correlation -- Example -- Visualizing correlations -- Covariance -- Pearson correlation -- Polychoric correlation -- Tetrachoric correlation -- A heterogeneous correlation matrix -- Partial correlation -- Questions -- Summary -- 6. Data Analysis - Clustering -- Packages -- K-means clustering -- Example -- Optimal number of clusters -- Medoids clusters -- The cascadeKM function -- Selecting clusters based on Bayesian information -- Affinity propagation clustering -- Gap statistic to estimate the number of clusters -- Hierarchical clustering -- Questions -- Summary -- 7. Data Visualization - R Graphics -- Packages -- Interactive graphics -- The latticist package -- Bivariate binning display -- Mapping -- Plotting points on a map -- Plotting points on a world map -- Google Maps -- The ggplot2 package -- Questions -- Summary -- 8. Data Visualization - Plotting -- Packages -- Scatter plots -- Regression line -- A lowess line -- scatterplot -- Scatterplot matrices -- splom - display matrix data -- cpairs - plot matrix data -- Density scatter plots -- Bar charts and plots -- Bar plot -- Usage -- Bar chart -- ggplot2 -- Word cloud -- Questions -- Summary -- 9. Data Visualization - 3D -- Packages -- Generating 3D graphics -- Lattice Cloud - 3D scatterplot -- scatterplot3d -- scatter3d -- cloud3d -- RgoogleMaps -- vrmlgenbar3D -- Big Data -- pbdR -- Common global values -- Distribute data across nodes -- Distribute a matrix across nodes -- bigmemory -- pbdMPI -- snow -- More Big Data -- Research areas -- Rcpp -- parallel -- microbenchmark -- pqR -- SAP integration -- roxygen2
bioconductor -- swirl -- pipes -- Questions -- Summary -- 10. Machine Learning in Action -- Packages -- Dataset -- Data partitioning -- Model -- Linear model -- Prediction -- Logistic regression -- Residuals -- Least squares regression -- Relative importance -- Stepwise regression -- The k-nearest neighbor classification -- Naïve Bayes -- The train Method -- predict -- Support vector machines -- K-means clustering -- Decision trees -- AdaBoost -- Neural network -- Random forests -- Questions -- Summary -- 11. Predicting Events with Machine Learning -- Automatic forecasting packages -- Time series -- The SMA function -- The decompose function -- Exponential smoothing -- Forecast -- Correlogram -- Box test -- Holt exponential smoothing -- Automated forecasting -- ARIMA -- Automated ARIMA forecasting -- Questions -- Summary -- 12. Supervised and Unsupervised Learning -- Packages -- Supervised learning -- Decision tree -- Regression -- Neural network -- Instance-based learning -- Ensemble learning -- Support vector machines -- Bayesian learning -- Random forests -- Unsupervised learning -- Cluster analysis -- Density estimation -- Expectation-maximization -- Hidden Markov models -- Blind signal separation -- Questions -- Summary -- Index
Additional Edition:
Print version: Toomey, Dan. R for Data Science. Birmingham : Packt Publishing, Limited, c2014. ISBN 9781784390860
Language:
English
Keywords:
Electronic books
URL:
FULL
((OIS Credentials Required))