KOBV Portal

Online Resource

Hands-On Exploratory Data Analysis with Python : Perform EDA Techniques to Understand, Summarize, and Investigate Your Data (2020)

Mukhiya, Suresh Kumar ; Ahmed, Usman

Birmingham : Packt Publishing, Limited

Details

UID: kobvindex_INT59022

Format: 1 online resource (342 pages)

Edition: 1st ed.

ISBN: 9781789535624

Content: This book provides practical knowledge about the main pillars of EDA, including data cleaning, data preparation, data exploration, and data visualization. You can leverage the power of Python to understand, summarize, and investigate your data in the best way possible. The book presents a unique approach to exploring hidden features in your data.

Note: Cover -- Title Page -- Copyright and Credits -- About Packt -- Contributors -- Table of Contents -- Preface -- Section 1: The Fundamentals of EDA -- Chapter 01: Exploratory Data Analysis Fundamentals -- Understanding data science -- The significance of EDA -- Steps in EDA -- Making sense of data -- Numerical data -- Discrete data -- Continuous data -- Categorical data -- Measurement scales -- Nominal -- Ordinal -- Interval -- Ratio -- Comparing EDA with classical and Bayesian analysis -- Software tools available for EDA -- Getting started with EDA -- NumPy -- Pandas -- SciPy -- Matplotlib -- Summary -- Further reading -- Chapter 02: Visual Aids for EDA -- Technical requirements -- Line chart -- Steps involved -- Bar charts -- Scatter plot -- Bubble chart -- Scatter plot using seaborn -- Area plot and stacked plot -- Pie chart -- Table chart -- Polar chart -- Histogram -- Lollipop chart -- Choosing the best chart -- Other libraries to explore -- Summary -- Further reading -- Chapter 03: EDA with Personal Email -- Technical requirements -- Loading the dataset -- Data transformation -- Data cleansing -- Loading the CSV file -- Converting the date -- Removing NaN values -- Applying descriptive statistics -- Data refactoring -- Dropping columns -- Refactoring timezones -- Data analysis -- Number of emails -- Time of day -- Average emails per day and hour -- Number of emails per day -- Most frequently used words -- Summary -- Further reading -- Chapter 04: Data Transformation -- Technical requirements -- Background -- Merging database-style dataframes -- Concatenating along with an axis -- Using df.merge with an inner join -- Using the pd.merge() method with a left join -- Using the pd.merge() method with a right join -- Using pd.merge() methods with outer join -- Merging on index -- Reshaping and pivoting -- Transformation techniques -- Performing data deduplication -- Replacing values -- Handling missing data -- NaN values in pandas objects -- Dropping missing values -- Dropping by rows -- Dropping by columns -- Mathematical operations with NaN -- Filling missing values -- Backward and forward filling -- Interpolating missing values -- Renaming axis indexes -- Discretization and binning -- Outlier detection and filtering -- Permutation and random sampling -- Random sampling without replacement -- Random sampling with replacement -- Computing indicators/dummy variables -- String manipulation -- Benefits of data transformation -- Challenges -- Summary -- Further reading -- Section 2: Descriptive Statistics -- Chapter 05: Descriptive Statistics -- Technical requirements -- Understanding statistics -- Distribution function -- Uniform distribution -- Normal distribution -- Exponential distribution -- Binomial distribution -- Cumulative distribution function -- Descriptive statistics -- Measures of central tendency -- Mean/average -- Median -- Mode -- Measures of dispersion -- Standard deviation -- Variance -- Skewness -- Kurtosis -- Types of kurtosis -- Calculating percentiles -- Quartiles -- Visualizing quartiles -- Summary -- Further reading -- Chapter 06: Grouping Datasets -- Technical requirements -- Understanding groupby() -- Groupby mechanics -- Selecting a subset of columns -- Max and min -- Mean -- Data aggregation -- Group-wise operations -- Renaming grouped aggregation columns -- Group-wise transformations -- Pivot tables and cross-tabulations -- Pivot tables -- Cross-tabulations -- Summary -- Further reading -- Chapter 07: Correlation -- Technical requirements -- Introducing correlation -- Types of analysis -- Understanding univariate analysis -- Understanding bivariate analysis -- Understanding multivariate analysis -- Discussing multivariate analysis using the Titanic dataset -- Outlining Simpson's paradox -- Correlation does not imply causation -- Summary -- Further reading -- Chapter 08: Time Series Analysis -- Technical requirements -- Understanding the time series dataset -- Fundamentals of TSA -- Univariate time series -- Characteristics of time series data -- TSA with Open Power System Data -- Data cleaning -- Time-based indexing -- Visualizing time series -- Grouping time series data -- Resampling time series data -- Summary -- Further reading -- Section 3: Model Development and Evaluation -- Chapter 09: Hypothesis Testing and Regression -- Technical requirements -- Hypothesis testing -- Hypothesis testing principle -- statsmodels library -- Average reading time -- Types of hypothesis testing -- T-test -- p-hacking -- Understanding regression -- Types of regression -- Simple linear regression -- Multiple linear regression -- Nonlinear regression -- Model development and evaluation -- Constructing a linear regression model -- Model evaluation -- Computing accuracy -- Understanding accuracy -- Implementing a multiple linear regression model -- Summary -- Further reading -- Chapter 10: Model Development and Evaluation -- Technical requirements -- Types of machine learning -- Understanding supervised learning -- Regression -- Classification -- Understanding unsupervised learning -- Applications of unsupervised learning -- Clustering using MiniBatch K-means clustering -- Extracting keywords -- Plotting clusters -- Word cloud -- Understanding reinforcement learning -- Difference between supervised and reinforcement learning -- Applications of reinforcement learning -- Unified machine learning workflow -- Data preprocessing -- Data collection -- Data analysis -- Data cleaning, normalization, and transformation -- Data preparation -- Training sets and corpus creation -- Model creation and training -- Model evaluation -- Best model selection and evaluation -- Model deployment -- Summary -- Further reading -- Chapter 11: EDA on Wine Quality Data Analysis -- Technical requirements -- Disclosing the wine quality dataset -- Loading the dataset -- Descriptive statistics -- Data wrangling -- Analyzing red wine -- Finding correlated columns -- Alcohol versus quality -- Alcohol versus pH -- Analyzing white wine -- Red wine versus white wine -- Adding a new attribute -- Converting into a categorical column -- Concatenating dataframes -- Grouping columns -- Univariate analysis -- Multivariate analysis on the combined dataframe -- Discrete categorical attributes -- 3-D visualization -- Model development and evaluation -- Summary -- Further reading -- Appendix -- String manipulation -- Creating strings -- Accessing characters in Python -- String slicing -- Deleting/updating from a string -- Escape sequencing in Python -- Formatting strings -- Using pandas vectorized string functions -- Using string functions with a pandas DataFrame -- Using regular expressions -- Further reading -- Other Books You May Enjoy -- Index

Additional Edition: Print version: Mukhiya, Suresh Kumar. Hands-On Exploratory Data Analysis with Python. Birmingham : Packt Publishing, Limited, c2020. ISBN 9781789537253

Language: English

Keywords: Electronic books

URL: FULL ((OIS Credentials Required))

Kooperativer Bibliotheksverbund Berlin-Brandenburg