    Joshi, Gauri: Optimization Algorithms for Distributed Machine Learning
    Online resource
    Cham, Switzerland : Springer
    UID: edoccha_9961000629602883
    Extent: 1 online resource (137 pages)
    ISBN: 9783031190674
    Series: Synthesis Lectures on Learning, Networks, and Algorithms
    Note: Intro -- Preface -- Contents -- Acronyms and Symbols
      1 Distributed Optimization in Machine Learning -- 1.1 SGD in Supervised Machine Learning -- 1.1.1 Training Data and Hypothesis -- 1.1.2 Empirical Risk Minimization -- 1.1.3 Gradient Descent -- 1.1.4 Stochastic Gradient Descent -- 1.1.5 Mini-batch SGD -- 1.1.6 Linear Regression -- 1.1.7 Logistic Regression -- 1.1.8 Neural Networks -- 1.2 Distributed Stochastic Gradient Descent -- 1.2.1 The Parameter Server Framework -- 1.2.2 The System-Aware Design Philosophy -- 1.3 Scalable Distributed SGD Algorithms -- 1.3.1 Straggler-Resilient and Asynchronous SGD -- 1.3.2 Communication-Efficient Distributed SGD -- 1.3.3 Decentralized SGD
      2 Calculus, Probability and Order Statistics Review -- 2.1 Calculus and Linear Algebra -- 2.1.1 Norms and Inner Products -- 2.1.2 Lipschitz Continuity and Smoothness -- 2.1.3 Strong Convexity -- 2.2 Probability Review -- 2.2.1 Random Variable -- 2.2.2 Expectation and Variance -- 2.2.3 Some Canonical Random Variables -- 2.2.4 Bayes Rule and Conditional Probability -- 2.3 Order Statistics -- 2.3.1 Order Statistics of the Exponential Distribution -- 2.3.2 Order Statistics of the Uniform Distribution -- 2.3.3 Asymptotic Distribution of Quantiles
      3 Convergence of SGD and Variance-Reduced Variants -- 3.1 Gradient Descent (GD) Convergence -- 3.1.1 Effect of Learning Rate and Other Parameters -- 3.1.2 Iteration Complexity -- 3.2 Convergence Analysis of Mini-batch SGD -- 3.2.1 Effect of Learning Rate and Mini-batch Size -- 3.2.2 Iteration Complexity -- 3.2.3 Non-convex Objectives -- 3.3 Variance-Reduced SGD Variants -- 3.3.1 Dynamic Mini-batch Size Schedule -- 3.3.2 Stochastic Average Gradient (SAG) -- 3.3.3 Stochastic Variance Reduced Gradient (SVRG)
      4 Synchronous SGD and Straggler-Resilient Variants -- 4.1 Parameter Server Framework -- 4.2 Distributed Synchronous SGD Algorithm -- 4.3 Convergence Analysis -- 4.3.1 Iteration Complexity -- 4.4 Runtime per Iteration -- 4.4.1 Gradient Computation and Communication Time -- 4.4.2 Expected Runtime per Iteration -- 4.4.3 Error Versus Runtime Convergence -- 4.5 Straggler-Resilient Variants -- 4.5.1 K-Synchronous SGD -- 4.5.2 K-Batch-Synchronous SGD
      5 Asynchronous SGD and Staleness-Reduced Variants -- 5.1 The Asynchronous SGD Algorithm -- 5.1.1 Comparison with Synchronous SGD -- 5.2 Runtime Analysis -- 5.2.1 Runtime Speed-Up Compared to Synchronous SGD -- 5.3 Convergence Analysis -- 5.3.1 Implications of the Asynchronous SGD Convergence Bound -- 5.4 Staleness-Reduced Variants of Asynchronous SGD -- 5.4.1 K-Asynchronous SGD -- 5.4.2 K-Batch-Asynchronous SGD -- 5.5 Adaptive Methods to Improve the Error-Runtime Trade-Off -- 5.5.1 Adaptive Synchronization -- 5.5.2 Adaptive Learning Rate Schedule to Compensate Staleness -- 5.6 HogWild and Lock-Free Parallelism
      6 Local-Update and Overlap SGD -- 6.1 Local-Update SGD Algorithm -- 6.1.1 Convergence Analysis -- 6.1.2 Runtime Analysis -- 6.1.3 Adaptive Communication -- 6.2 Elastic and Overlap SGD -- 6.2.1 Elastic Averaging SGD -- 6.2.2 Overlap Local SGD
      7 Quantized and Sparsified Distributed SGD -- 7.1 Quantized SGD -- 7.1.1 Uniform Stochastic Quantization -- 7.1.2 Convergence Analysis -- 7.1.3 Runtime Analysis -- 7.1.4 Adaptive Quantization -- 7.2 Sparsified SGD -- 7.2.1 Rand-k Sparsification -- 7.2.2 Top-k Sparsification -- 7.2.3 Rand-k Sparsified Distributed SGD -- 7.2.4 Error Feedback in Sparsified SGD
      8 Decentralized SGD and Its Variants -- 8.1 Network Topology and Graph Notation -- 8.1.1 Adjacency Matrix -- 8.1.2 Laplacian Matrix -- 8.1.3 Mixing Matrix -- 8.2 Decentralized SGD -- 8.2.1 The Algorithm -- 8.2.2 Variants of Decentralized SGD -- 8.3 Error Convergence Analysis -- 8.3.1 Assumptions -- 8.3.2 Convergence Analysis of Decentralized SGD -- 8.3.3 Convergence Analysis of Decentralized Local-Update SGD -- 8.4 Runtime Analysis
      9 Beyond Distributed Training in the Cloud.
    Other edition: Print version: Joshi, Gauri. Optimization Algorithms for Distributed Machine Learning. Cham : Springer International Publishing AG, c2023. ISBN 9783031190667
    Language: English