  • 1
    Online resource
    Hoboken, NJ : John Wiley & Sons, Inc.
    UID: almafu_9959328033402883
    Extent: 1 online resource
    ISBN: 9781119092933, 1119092930, 9781119092926, 1119092922, 9781119092919, 1119092914
    Summary: Giving extensive coverage to computer science and software engineering since they play such a central role in the daily work of a data scientist, this comprehensive book provides a crash course in data science, combining all the necessary skills into a unified discipline.
    Note: Cover -- Title Page -- Copyright -- Dedication -- Contents -- Preface -- Chapter 1 Introduction: Becoming a Unicorn -- 1.1 Aren't Data Scientists Just Overpaid Statisticians? -- 1.2 How Is This Book Organized? -- 1.3 How to Use This Book? -- 1.4 Why Is It All in Python™, Anyway? -- 1.5 Example Code and Datasets -- 1.6 Parting Words -- Part I The Stuff You'll Always Use -- Chapter 2 The Data Science Road Map -- 2.1 Frame the Problem -- 2.2 Understand the Data: Basic Questions -- 2.3 Understand the Data: Data Wrangling -- 2.4 Understand the Data: Exploratory Analysis -- 2.5 Extract Features -- 2.6 Model -- 2.7 Present Results -- 2.8 Deploy Code -- 2.9 Iterating -- 2.10 Glossary -- Chapter 3 Programming Languages -- 3.1 Why Use a Programming Language? What Are the Other Options? -- 3.2 A Survey of Programming Languages for Data Science -- 3.3 Python Crash Course -- 3.4 Strings -- 3.5 Defining Functions -- 3.6 Python's Technical Libraries -- 3.7 Other Python Resources -- 3.8 Further Reading -- 3.9 Glossary -- Interlude: My Personal Toolkit -- Chapter 4 Data Munging: String Manipulation, Regular Expressions, and Data Cleaning -- 4.1 The Worst Dataset in the World -- 4.2 How to Identify Pathologies -- 4.3 Problems with Data Content -- 4.4 Formatting Issues -- 4.5 Example Formatting Script -- 4.6 Regular Expressions -- 4.7 Life in the Trenches -- 4.8 Glossary -- Chapter 5 Visualizations and Simple Metrics -- 5.1 A Note on Python's Visualization Tools -- 5.2 Example Code -- 5.3 Pie Charts -- 5.4 Bar Charts -- 5.5 Histograms -- 5.6 Means, Standard Deviations, Medians, and Quantiles -- 5.7 Boxplots -- 5.8 Scatterplots -- 5.9 Scatterplots with Logarithmic Axes -- 5.10 Scatter Matrices -- 5.11 Heatmaps -- 5.12 Correlations -- 5.13 Anscombe's Quartet and the Limits of Numbers -- 5.14 Time Series -- 5.15 Further Reading -- 5.16 Glossary.
    Chapter 6 Machine Learning Overview -- 6.1 Historical Context -- 6.2 Supervised versus Unsupervised -- 6.3 Training Data, Testing Data, and the Great Boogeyman of Overfitting -- 6.4 Further Reading -- 6.5 Glossary -- Chapter 7 Interlude: Feature Extraction Ideas -- 7.1 Standard Features -- 7.2 Features That Involve Grouping -- 7.3 Preview of More Sophisticated Features -- 7.4 Defining the Feature You Want to Predict -- Chapter 8 Machine Learning Classification -- 8.1 What Is a Classifier, and What Can You Do with It? -- 8.2 A Few Practical Concerns -- 8.3 Binary versus Multiclass -- 8.4 Example Script -- 8.5 Specific Classifiers -- 8.6 Evaluating Classifiers -- 8.7 Selecting Classification Cutoffs -- 8.8 Further Reading -- 8.9 Glossary -- Chapter 9 Technical Communication and Documentation -- 9.1 Several Guiding Principles -- 9.2 Slide Decks -- 9.3 Written Reports -- 9.4 Speaking: What Has Worked for Me -- 9.5 Code Documentation -- 9.6 Further Reading -- 9.7 Glossary -- Part II Stuff You Still Need to Know -- Chapter 10 Unsupervised Learning: Clustering and Dimensionality Reduction -- 10.1 The Curse of Dimensionality -- 10.2 Example: Eigenfaces for Dimensionality Reduction -- 10.3 Principal Component Analysis and Factor Analysis -- 10.4 Skree Plots and Understanding Dimensionality -- 10.5 Factor Analysis -- 10.6 Limitations of PCA -- 10.7 Clustering -- 10.8 Further Reading -- 10.9 Glossary -- Chapter 11 Regression -- 11.1 Example: Predicting Diabetes Progression -- 11.2 Least Squares -- 11.3 Fitting Nonlinear Curves -- 11.4 Goodness of Fit: R² and Correlation -- 11.5 Correlation of Residuals -- 11.6 Linear Regression -- 11.7 LASSO Regression and Feature Selection -- 11.8 Further Reading -- 11.9 Glossary -- Chapter 12 Data Encodings and File Formats -- 12.1 Typical File Format Categories -- 12.2 CSV Files -- 12.3 JSON Files -- 12.4 XML Files.
    17.5 Smoothing Signals -- 17.6 Logarithms and Other Transformations -- 17.7 Trends and Periodicity -- 17.8 Windowing -- 17.9 Brainstorming Simple Features -- 17.10 Better Features: Time Series as Vectors -- 17.11 Fourier Analysis: Sometimes a Magic Bullet -- 17.12 Time Series in Context: The Whole Suite of Features -- 17.13 Further Reading -- 17.14 Glossary -- Chapter 18 Probability -- 18.1 Flipping Coins: Bernoulli Random Variables -- 18.2 Throwing Darts: Uniform Random Variables -- 18.3 The Uniform Distribution and Pseudorandom Numbers -- 18.4 Nondiscrete, Noncontinuous Random Variables -- 18.5 Notation, Expectations, and Standard Deviation -- 18.6 Dependence, Marginal and Conditional Probability -- 18.7 Understanding the Tails -- 18.8 Binomial Distribution -- 18.9 Poisson Distribution -- 18.10 Normal Distribution -- 18.11 Multivariate Gaussian -- 18.12 Exponential Distribution -- 18.13 Log-Normal Distribution -- 18.14 Entropy -- 18.15 Further Reading -- 18.16 Glossary -- Chapter 19 Statistics -- 19.1 Statistics in Perspective -- 19.2 Bayesian versus Frequentist: Practical Tradeoffs and Differing Philosophies -- 19.3 Hypothesis Testing: Key Idea and Example -- 19.4 Multiple Hypothesis Testing -- 19.5 Parameter Estimation -- 19.6 Hypothesis Testing: t-Test -- 19.7 Confidence Intervals -- 19.8 Bayesian Statistics -- 19.9 Naive Bayesian Statistics -- 19.10 Bayesian Networks -- 19.11 Choosing Priors: Maximum Entropy or Domain Knowledge -- 19.12 Further Reading -- 19.13 Glossary -- Chapter 20 Programming Language Concepts -- 20.1 Programming Paradigms -- 20.2 Compilation and Interpretation -- 20.3 Type Systems -- 20.4 Further Reading -- 20.5 Glossary -- Chapter 21 Performance and Computer Memory -- 21.1 Example Script -- 21.2 Algorithm Performance and Big-O Notation -- 21.3 Some Classic Problems: Sorting a List and Binary Search.
    21.4 Amortized Performance and Average Performance -- 21.5 Two Principles: Reducing Overhead and Managing Memory -- 21.6 Performance Tip: Use Numerical Libraries When Applicable -- 21.7 Performance Tip: Delete Large Structures You Don't Need -- 21.8 Performance Tip: Use Built-In Functions When Possible -- 21.9 Performance Tip: Avoid Superfluous Function Calls -- 21.10 Performance Tip: Avoid Creating Large New Objects -- 21.11 Further Reading -- 21.12 Glossary -- Part III Specialized or Advanced Topics -- Chapter 22 Computer Memory and Data Structures -- 22.1 Virtual Memory, the Stack, and the Heap -- 22.2 Example C Program -- 22.3 Data Types and Arrays in Memory -- 22.4 Structs -- 22.5 Pointers, the Stack, and the Heap -- 22.6 Key Data Structures -- 22.7 Further Reading -- 22.8 Glossary -- Chapter 23 Maximum Likelihood Estimation and Optimization -- 23.1 Maximum Likelihood Estimation -- 23.2 A Simple Example: Fitting a Line -- 23.3 Another Example: Logistic Regression -- 23.4 Optimization -- 23.5 Gradient Descent and Convex Optimization -- 23.6 Convex Optimization -- 23.7 Stochastic Gradient Descent -- 23.8 Further Reading -- 23.9 Glossary -- Chapter 24 Advanced Classifiers -- 24.1 A Note on Libraries -- 24.2 Basic Deep Learning -- 24.3 Convolutional Neural Networks -- 24.4 Different Types of Layers. What the Heck Is a Tensor? -- 24.5 Example: The MNIST Handwriting Dataset -- 24.6 Recurrent Neural Networks -- 24.7 Bayesian Networks -- 24.8 Training and Prediction -- 24.9 Markov Chain Monte Carlo -- 24.10 PyMC Example -- 24.11 Further Reading -- 24.12 Glossary -- Chapter 25 Stochastic Modeling -- 25.1 Markov Chains -- 25.2 Two Kinds of Markov Chain, Two Kinds of Questions -- 25.3 Markov Chain Monte Carlo -- 25.4 Hidden Markov Models and the Viterbi Algorithm -- 25.5 The Viterbi Algorithm -- 25.6 Random Walks -- 25.7 Brownian Motion.
    Other editions: Print version: Cady, Field, 1984- Data science handbook. Hoboken, NJ : John Wiley & Sons, Inc., 2017. ISBN 9781119092940
    Language: English
    Subjects: Computer science
    Keywords: Electronic books ; Handbooks and manuals ; Handbook
    URL: Full text (license required)