Distributed Computing With Python

Distributed Computing With Python Book in PDF, ePub and Kindle version is available to download in english. Read online anytime anywhere directly from your device. Click on the download button below to get a free pdf file of Distributed Computing With Python book. This book definitely worth reading, it is an incredibly well-written.

Distributed Computing with Python

Author : Francesco Pierfederici
Publisher : Packt Publishing Ltd
Page : 171 pages
File Size : 52,9 Mb
Release : 2016-04-12
Category : Computers
ISBN : 9781785887048

Get Book

Distributed Computing with Python by Francesco Pierfederici Pdf

Harness the power of multiple computers using Python through this fast-paced informative guide About This Book You'll learn to write data processing programs in Python that are highly available, reliable, and fault tolerant Make use of Amazon Web Services along with Python to establish a powerful remote computation system Train Python to handle data-intensive and resource hungry applications Who This Book Is For This book is for Python developers who have developed Python programs for data processing and now want to learn how to write fast, efficient programs that perform CPU-intensive data processing tasks. What You Will Learn Get an introduction to parallel and distributed computing See synchronous and asynchronous programming Explore parallelism in Python Distributed application with Celery Python in the Cloud Python on an HPC cluster Test and debug distributed applications In Detail CPU-intensive data processing tasks have become crucial considering the complexity of the various big data applications that are used today. Reducing the CPU utilization per process is very important to improve the overall speed of applications. This book will teach you how to perform parallel execution of computations by distributing them across multiple processors in a single machine, thus improving the overall performance of a big data processing task. We will cover synchronous and asynchronous models, shared memory and file systems, communication between various processes, synchronization, and more. Style and Approach This example based, step-by-step guide will show you how to make the best of your hardware configuration using Python for distributing applications.

Parallel Python with Dask

Author : Tim Peters
Publisher : GitforGits
Page : 172 pages
File Size : 44,8 Mb
Release : 2023-10-19
Category : Computers
ISBN : 9788119177462

Get Book

Parallel Python with Dask by Tim Peters Pdf

Unlock the Power of Parallel Python with Dask: A Perfect Learning Guide for Aspiring Data Scientists Dask has revolutionized parallel computing for Python, empowering data scientists to accelerate their workflows. This comprehensive guide unravels the intricacies of Dask to help you harness its capabilities for machine learning and data analysis. Across 10 chapters, you'll master Dask's fundamentals, architecture, and integration with Python's scientific computing ecosystem. Step-by-step tutorials demonstrate parallel mapping, task scheduling, and leveraging Dask arrays for NumPy workloads. You'll discover how Dask seamlessly scales Pandas, Scikit-Learn, PyTorch, and other libraries for large datasets. Dedicated chapters explore scaling regression, classification, hyperparameter tuning, feature engineering, and more with clear examples. You'll also learn to tap into the power of GPUs with Dask, RAPIDS, and Google JAX for orders of magnitude speedups. This book places special emphasis on practical use cases related to scalability and distributed computing. You'll learn Dask patterns for cluster computing, managing resources efficiently, and robust data pipelines. The advanced chapters on DaskML and deep learning showcase how to build scalable models with PyTorch and TensorFlow. With this book, you'll gain practical skills to: Accelerate Python workloads with parallel mapping and task scheduling Speed up NumPy, Pandas, Scikit-Learn, PyTorch, and other libraries Build scalable machine learning pipelines for large datasets Leverage GPUs efficiently via Dask, RAPIDS and JAX Manage Dask clusters and workflows for distributed computing Streamline deep learning models with DaskML and DL frameworks Packed with hands-on examples and expert insights, this book provides the complete toolkit to harness Dask's capabilities. It will empower Python programmers, data scientists, and machine learning engineers to achieve faster workflows and operationalize parallel computing. Table of Content Introduction to Dask Dask Fundamentals Batch Data Parallel Processing with Dask Distributed Systems and Dask Advanced Dask: APIs and Building Blocks Dask with Pandas Dask with Scikit-learn Dask and PyTorch Dask with GPUs Scaling Machine Learning Projects with Dask

Python Parallel Programming Cookbook

Author : Giancarlo Zaccone
Publisher : Packt Publishing Ltd
Page : 286 pages
File Size : 47,8 Mb
Release : 2015-08-26
Category : Computers
ISBN : 9781785286728

Get Book

Python Parallel Programming Cookbook by Giancarlo Zaccone Pdf

Master efficient parallel programming to build powerful applications using Python About This Book Design and implement efficient parallel software Master new programming techniques to address and solve complex programming problems Explore the world of parallel programming with this book, which is a go-to resource for different kinds of parallel computing tasks in Python, using examples and topics covered in great depth Who This Book Is For Python Parallel Programming Cookbook is intended for software developers who are well versed with Python and want to use parallel programming techniques to write powerful and efficient code. This book will help you master the basics and the advanced of parallel computing. What You Will Learn Synchronize multiple threads and processes to manage parallel tasks Implement message passing communication between processes to build parallel applications Program your own GPU cards to address complex problems Manage computing entities to execute distributed computational tasks Write efficient programs by adopting the event-driven programming model Explore the cloud technology with DJango and Google App Engine Apply parallel programming techniques that can lead to performance improvements In Detail Parallel programming techniques are required for a developer to get the best use of all the computational resources available today and to build efficient software systems. From multi-core to GPU systems up to the distributed architectures, the high computation of programs throughout requires the use of programming tools and software libraries. Because of this, it is becoming increasingly important to know what the parallel programming techniques are. Python is commonly used as even non-experts can easily deal with its concepts. This book will teach you parallel programming techniques using examples in Python and will help you explore the many ways in which you can write code that allows more than one process to happen at once. Starting with introducing you to the world of parallel computing, it moves on to cover the fundamentals in Python. This is followed by exploring the thread-based parallelism model using the Python threading module by synchronizing threads and using locks, mutex, semaphores queues, GIL, and the thread pool. Next you will be taught about process-based parallelism where you will synchronize processes using message passing along with learning about the performance of MPI Python Modules. You will then go on to learn the asynchronous parallel programming model using the Python asyncio module along with handling exceptions. Moving on, you will discover distributed computing with Python, and learn how to install a broker, use Celery Python Module, and create a worker. You will also understand the StarCluster framework, Pycsp, Scoop, and Disco modules in Python. Further on, you will learn GPU programming with Python using the PyCUDA module along with evaluating performance limitations. Next you will get acquainted with the cloud computing concepts in Python, using Google App Engine (GAE), and building your first application with GAE. Lastly, you will learn about grid computing concepts in Python and using PyGlobus toolkit, GFTP and GASS COPY to transfer files, and service monitoring in PyGlobus. Style and approach A step-by-step guide to parallel programming using Python, with recipes accompanied by one or more programming examples. It is a practically oriented book and has all the necessary underlying parallel computing concepts.

Distributed Machine Learning with Python

Author : Guanhua Wang
Publisher : Packt Publishing Ltd
Page : 284 pages
File Size : 43,5 Mb
Release : 2022-04-29
Category : Computers
ISBN : 9781801817219

Get Book

Distributed Machine Learning with Python by Guanhua Wang Pdf

Build and deploy an efficient data processing pipeline for machine learning model training in an elastic, in-parallel model training or multi-tenant cluster and cloud Key FeaturesAccelerate model training and interference with order-of-magnitude time reductionLearn state-of-the-art parallel schemes for both model training and servingA detailed study of bottlenecks at distributed model training and serving stagesBook Description Reducing time cost in machine learning leads to a shorter waiting time for model training and a faster model updating cycle. Distributed machine learning enables machine learning practitioners to shorten model training and inference time by orders of magnitude. With the help of this practical guide, you'll be able to put your Python development knowledge to work to get up and running with the implementation of distributed machine learning, including multi-node machine learning systems, in no time. You'll begin by exploring how distributed systems work in the machine learning area and how distributed machine learning is applied to state-of-the-art deep learning models. As you advance, you'll see how to use distributed systems to enhance machine learning model training and serving speed. You'll also get to grips with applying data parallel and model parallel approaches before optimizing the in-parallel model training and serving pipeline in local clusters or cloud environments. By the end of this book, you'll have gained the knowledge and skills needed to build and deploy an efficient data processing pipeline for machine learning model training and inference in a distributed manner. What you will learnDeploy distributed model training and serving pipelinesGet to grips with the advanced features in TensorFlow and PyTorchMitigate system bottlenecks during in-parallel model training and servingDiscover the latest techniques on top of classical parallelism paradigmExplore advanced features in Megatron-LM and Mesh-TensorFlowUse state-of-the-art hardware such as NVLink, NVSwitch, and GPUsWho this book is for This book is for data scientists, machine learning engineers, and ML practitioners in both academia and industry. A fundamental understanding of machine learning concepts and working knowledge of Python programming is assumed. Prior experience implementing ML/DL models with TensorFlow or PyTorch will be beneficial. You'll find this book useful if you are interested in using distributed systems to boost machine learning model training and serving speed.

Topics in Parallel and Distributed Computing

Author : Sushil K Prasad,Anshul Gupta,Arnold L Rosenberg,Alan Sussman,Charles C Weems
Publisher : Morgan Kaufmann
Page : 360 pages
File Size : 54,6 Mb
Release : 2015-09-16
Category : Computers
ISBN : 9780128039380

Get Book

Topics in Parallel and Distributed Computing by Sushil K Prasad,Anshul Gupta,Arnold L Rosenberg,Alan Sussman,Charles C Weems Pdf

Topics in Parallel and Distributed Computing provides resources and guidance for those learning PDC as well as those teaching students new to the discipline. The pervasiveness of computing devices containing multicore CPUs and GPUs, including home and office PCs, laptops, and mobile devices, is making even common users dependent on parallel processing. Certainly, it is no longer sufficient for even basic programmers to acquire only the traditional sequential programming skills. The preceding trends point to the need for imparting a broad-based skill set in PDC technology. However, the rapid changes in computing hardware platforms and devices, languages, supporting programming environments, and research advances, poses a challenge both for newcomers and seasoned computer scientists. This edited collection has been developed over the past several years in conjunction with the IEEE technical committee on parallel processing (TCPP), which held several workshops and discussions on learning parallel computing and integrating parallel concepts into courses throughout computer science curricula. Contributed and developed by the leading minds in parallel computing research and instruction Provides resources and guidance for those learning PDC as well as those teaching students new to the discipline Succinctly addresses a range of parallel and distributed computing topics Pedagogically designed to ensure understanding by experienced engineers and newcomers Developed over the past several years in conjunction with the IEEE technical committee on parallel processing (TCPP), which held several workshops and discussions on learning parallel computing and integrating parallel concepts

Python Parallel Programming Cookbook

Author : Giancarlo Zaccone
Publisher : Unknown
Page : 370 pages
File Size : 53,5 Mb
Release : 2019-09-06
Category : Application software
ISBN : 1789533732

Get Book

Python Parallel Programming Cookbook by Giancarlo Zaccone Pdf

Implement effective programming techniques in Python to build scalable software that saves time and memory Key Features Design distributed computing systems and massive computational tasks coherently Learn practical recipes with concise explanations that address development pain points encountered while coding parallel programs Understand how to host your parallelized applications on the cloud Book Description Nowadays, it has become extremely important for programmers to understand the link between the software and the parallel nature of their hardware so that their programs run efficiently on computer architectures. Applications based on parallel programming are fast, robust, and easily scalable. This updated edition features cutting-edge techniques for building effective concurrent applications in Python 3.7. The book introduces parallel programming architectures and covers the fundamental recipes for thread-based and process-based parallelism. You'll learn about mutex, semaphores, locks, queues exploiting the threading, and multiprocessing modules, all of which are basic tools to build parallel applications. Recipes on MPI programming will help you to synchronize processes using the fundamental message passing techniques with mpi4py. Furthermore, you'll get to grips with asynchronous programming and how to use the power of the GPU with PyCUDA and PyOpenCL frameworks. Finally, you'll explore how to design distributed computing systems with Celery and architect Python apps on the cloud using PythonAnywhere, Docker, and serverless applications. By the end of this book, you will be confident in building concurrent and high-performing applications in Python. What you will learn Synchronize multiple threads and processes to manage parallel tasks Use message passing techniques to establish communication between processes to build parallel applications Program your own GPU cards to address complex problems Manage computing entities to execute distributed computational task Write efficient programs by adopting the event-driven programming model Explore cloud technology with Django and Google App Engine Apply parallel programming techniques that can lead to performance improvements Who this book is for The Python Parallel Programming Cookbook is for software developers who are well-versed with Python and want to use parallel programming techniques to write powerful and efficient code. This book will help you master the basics and the advanced of parallel computing.

Mastering Large Datasets with Python

Author : John Wolohan
Publisher : Simon and Schuster
Page : 451 pages
File Size : 46,7 Mb
Release : 2020-01-15
Category : Computers
ISBN : 9781638350361

Get Book

Mastering Large Datasets with Python by John Wolohan Pdf

Summary Modern data science solutions need to be clean, easy to read, and scalable. In Mastering Large Datasets with Python, author J.T. Wolohan teaches you how to take a small project and scale it up using a functionally influenced approach to Python coding. You’ll explore methods and built-in Python tools that lend themselves to clarity and scalability, like the high-performing parallelism method, as well as distributed technologies that allow for high data throughput. The abundant hands-on exercises in this practical tutorial will lock in these essential skills for any large-scale data science project. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology Programming techniques that work well on laptop-sized data can slow to a crawl—or fail altogether—when applied to massive files or distributed datasets. By mastering the powerful map and reduce paradigm, along with the Python-based tools that support it, you can write data-centric applications that scale efficiently without requiring codebase rewrites as your requirements change. About the book Mastering Large Datasets with Python teaches you to write code that can handle datasets of any size. You’ll start with laptop-sized datasets that teach you to parallelize data analysis by breaking large tasks into smaller ones that can run simultaneously. You’ll then scale those same programs to industrial-sized datasets on a cluster of cloud servers. With the map and reduce paradigm firmly in place, you’ll explore tools like Hadoop and PySpark to efficiently process massive distributed datasets, speed up decision-making with machine learning, and simplify your data storage with AWS S3. What's inside An introduction to the map and reduce paradigm Parallelization with the multiprocessing module and pathos framework Hadoop and Spark for distributed computing Running AWS jobs to process large datasets About the reader For Python programmers who need to work faster with more data. About the author J. T. Wolohan is a lead data scientist at Booz Allen Hamilton, and a PhD researcher at Indiana University, Bloomington. Table of Contents: PART 1 1 ¦ Introduction 2 ¦ Accelerating large dataset work: Map and parallel computing 3 ¦ Function pipelines for mapping complex transformations 4 ¦ Processing large datasets with lazy workflows 5 ¦ Accumulation operations with reduce 6 ¦ Speeding up map and reduce with advanced parallelization PART 2 7 ¦ Processing truly big datasets with Hadoop and Spark 8 ¦ Best practices for large data with Apache Streaming and mrjob 9 ¦ PageRank with map and reduce in PySpark 10 ¦ Faster decision-making with machine learning and PySpark PART 3 11 ¦ Large datasets in the cloud with Amazon Web Services and S3 12 ¦ MapReduce in the cloud with Amazon’s Elastic MapReduce

Concurrent and Distributed Computing in Java

Author : Vijay K. Garg
Publisher : John Wiley & Sons
Page : 331 pages
File Size : 44,7 Mb
Release : 2005-01-28
Category : Computers
ISBN : 9780471721260

Get Book

Concurrent and Distributed Computing in Java by Vijay K. Garg Pdf

Concurrent and Distributed Computing in Java addresses fundamental concepts in concurrent computing with Java examples. The book consists of two parts. The first part deals with techniques for programming in shared-memory based systems. The book covers concepts in Java such as threads, synchronized methods, waits, and notify to expose students to basic concepts for multi-threaded programming. It also includes algorithms for mutual exclusion, consensus, atomic objects, and wait-free data structures. The second part of the book deals with programming in a message-passing system. This part covers resource allocation problems, logical clocks, global property detection, leader election, message ordering, agreement algorithms, checkpointing, and message logging. Primarily a textbook for upper-level undergraduates and graduate students, this thorough treatment will also be of interest to professional programmers.

Learning Ray

Author : Max Pumperla,Edward Oakes,Richard Liaw
Publisher : O'Reilly Media
Page : 0 pages
File Size : 50,7 Mb
Release : 2023-03-31
Category : Electronic
ISBN : 1098117220

Get Book

Learning Ray by Max Pumperla,Edward Oakes,Richard Liaw Pdf

Get started with Ray, the open source distributed computing framework that greatly simplifies the process of scaling compute-intensive Python workloads. With this practical book, Python programmers, data engineers, and data scientists will learn how to leverage Ray locally and spin up compute clusters. You'll be able to use Ray to structure and run machine learning programs at scale. Authors Max Pumperla, Edward Oakes, and Richard Liaw show you how to build reinforcement learning applications that serve trained models with Ray. You'll understand how Ray fits into the current landscape of data science tools and discover how this programming language continues to integrate ever more tightly with these tools. Distributed computation is hard, but with Ray you'll find it easy to get started. Learn how to build your first distributed application with Ray Core Conduct hyperparameter optimization with Ray Tune Use the Ray RLib library for reinforcement learning Manage distributed training with the RaySGD library Use Ray to perform data processing Learn how work with Ray Clusters and serve models with Ray Serve Build an end-to-end machine learning application with Ray

Parallel Programming with Python

Author : Jan Palach
Publisher : Packt Publishing Ltd
Page : 124 pages
File Size : 48,6 Mb
Release : 2014-06-25
Category : Computers
ISBN : 9781783288403

Get Book

Parallel Programming with Python by Jan Palach Pdf

A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.

Julia High Performance

Author : Avik Sengupta
Publisher : Packt Publishing Ltd
Page : 132 pages
File Size : 51,8 Mb
Release : 2016-04-26
Category : Computers
ISBN : 9781785887826

Get Book

Julia High Performance by Avik Sengupta Pdf

Design and develop high performing programs with Julia About This Book Learn to code high reliability and high performance programs Stand out from the crowd by developing code that runs faster than your peers' codes This book is intended for developers who are interested in high performance technical programming. Who This Book Is For This book is for beginner and intermediate Julia programmers who are interested in high performance technical computing. You will have a basic familiarity with Julia syntax, and have written some small programs in the language. What You Will Learn Discover the secrets behind Julia's speed Get a sense of the possibilities and limitations of Julia's performance Analyze the performance of Julia programs Measure the time and memory taken by Julia programs Create fast machine code using Julia's type information Define and call functions without compromising Julia's performance Understand number types in Julia Use Julia arrays to write high performance code Get an overview of Julia's distributed computing capabilities In Detail Julia is a high performance, high-level dynamic language designed to address the requirements of high-level numerical and scientific computing. Julia brings solutions to the complexities faced by developers while developing elegant and high performing code. Julia High Performance will take you on a journey to understand the performance characteristics of your Julia programs, and enables you to utilize the promise of near C levels of performance in Julia. You will learn to analyze and measure the performance of Julia code, understand how to avoid bottlenecks, and design your program for the highest possible performance. In this book, you will also see how Julia uses type information to achieve its performance goals, and how to use multuple dispatch to help the compiler to emit high performance machine code. Numbers and their arrays are obviously the key structures in scientific computing – you will see how Julia's design makes them fast. The last chapter will give you a taste of Julia's distributed computing capabilities. Style and approach This is a hands-on manual that will give you good explanations about the important concepts related to Julia programming.

Mastering Large Datasets

Author : J. T. Wolohan
Publisher : Manning Publications
Page : 350 pages
File Size : 41,6 Mb
Release : 2020-01-06
Category : Electronic
ISBN : 1617296236

Get Book

Mastering Large Datasets by J. T. Wolohan Pdf

With an emphasis on clarity, style, and performance, author J.T. Wolohan expertly guides you through implementing a functionally-influenced approach to Python coding. You'll get familiar with Python's functional built-ins like the functools operator and itertools modules, as well as the toolz library. Mastering Large Datasets teaches you to write easily readable, easily scalable Python code that can efficiently process large volumes of structured and unstructured data. By the end of this comprehensive guide, you'll have a solid grasp on the tools and methods that will take your code beyond the laptop and your data science career to the next level! Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

Introducing Python

Author : Bill Lubanovic
Publisher : "O'Reilly Media, Inc."
Page : 630 pages
File Size : 51,6 Mb
Release : 2019-11-06
Category : Computers
ISBN : 9781492051329

Get Book

Introducing Python by Bill Lubanovic Pdf

Easy to understand and fun to read, this updated edition of Introducing Python is ideal for beginning programmers as well as those new to the language. Author Bill Lubanovic takes you from the basics to more involved and varied topics, mixing tutorials with cookbook-style code recipes to explain concepts in Python 3. End-of-chapter exercises help you practice what you’ve learned. You’ll gain a strong foundation in the language, including best practices for testing, debugging, code reuse, and other development tips. This book also shows you how to use Python for applications in business, science, and the arts, using various Python tools and open source packages.

Distributed Computing in Big Data Analytics

Author : Sourav Mazumder,Robin Singh Bhadoria,Ganesh Chandra Deka
Publisher : Springer
Page : 162 pages
File Size : 49,8 Mb
Release : 2017-08-29
Category : Computers
ISBN : 9783319598345

Get Book

Distributed Computing in Big Data Analytics by Sourav Mazumder,Robin Singh Bhadoria,Ganesh Chandra Deka Pdf

Big data technologies are used to achieve any type of analytics in a fast and predictable way, thus enabling better human and machine level decision making. Principles of distributed computing are the keys to big data technologies and analytics. The mechanisms related to data storage, data access, data transfer, visualization and predictive modeling using distributed processing in multiple low cost machines are the key considerations that make big data analytics possible within stipulated cost and time practical for consumption by human and machines. However, the current literature available in big data analytics needs a holistic perspective to highlight the relation between big data analytics and distributed processing for ease of understanding and practitioner use. This book fills the literature gap by addressing key aspects of distributed processing in big data analytics. The chapters tackle the essential concepts and patterns of distributed computing widely used in big data analytics. This book discusses also covers the main technologies which support distributed processing. Finally, this book provides insight into applications of big data analytics, highlighting how principles of distributed computing are used in those situations. Practitioners and researchers alike will find this book a valuable tool for their work, helping them to select the appropriate technologies, while understanding the inherent strengths and drawbacks of those technologies.

Distributed Machine Learning Patterns

Author : Yuan Tang
Publisher : Simon and Schuster
Page : 375 pages
File Size : 51,5 Mb
Release : 2024-01-30
Category : Computers
ISBN : 9781638354192

Get Book

Distributed Machine Learning Patterns by Yuan Tang Pdf

Practical patterns for scaling machine learning from your laptop to a distributed cluster. Distributing machine learning systems allow developers to handle extremely large datasets across multiple clusters, take advantage of automation tools, and benefit from hardware accelerations. This book reveals best practice techniques and insider tips for tackling the challenges of scaling machine learning systems. In Distributed Machine Learning Patterns you will learn how to: Apply distributed systems patterns to build scalable and reliable machine learning projects Build ML pipelines with data ingestion, distributed training, model serving, and more Automate ML tasks with Kubernetes, TensorFlow, Kubeflow, and Argo Workflows Make trade-offs between different patterns and approaches Manage and monitor machine learning workloads at scale Inside Distributed Machine Learning Patterns you’ll learn to apply established distributed systems patterns to machine learning projects—plus explore cutting-edge new patterns created specifically for machine learning. Firmly rooted in the real world, this book demonstrates how to apply patterns using examples based in TensorFlow, Kubernetes, Kubeflow, and Argo Workflows. Hands-on projects and clear, practical DevOps techniques let you easily launch, manage, and monitor cloud-native distributed machine learning pipelines. About the technology Deploying a machine learning application on a modern distributed system puts the spotlight on reliability, performance, security, and other operational concerns. In this in-depth guide, Yuan Tang, project lead of Argo and Kubeflow, shares patterns, examples, and hard-won insights on taking an ML model from a single device to a distributed cluster. About the book Distributed Machine Learning Patterns provides dozens of techniques for designing and deploying distributed machine learning systems. In it, you’ll learn patterns for distributed model training, managing unexpected failures, and dynamic model serving. You’ll appreciate the practical examples that accompany each pattern along with a full-scale project that implements distributed model training and inference with autoscaling on Kubernetes. What's inside Data ingestion, distributed training, model serving, and more Automating Kubernetes and TensorFlow with Kubeflow and Argo Workflows Manage and monitor workloads at scale About the reader For data analysts and engineers familiar with the basics of machine learning, Bash, Python, and Docker. About the author Yuan Tang is a project lead of Argo and Kubeflow, maintainer of TensorFlow and XGBoost, and author of numerous open source projects. Table of Contents PART 1 BASIC CONCEPTS AND BACKGROUND 1 Introduction to distributed machine learning systems PART 2 PATTERNS OF DISTRIBUTED MACHINE LEARNING SYSTEMS 2 Data ingestion patterns 3 Distributed training patterns 4 Model serving patterns 5 Workflow patterns 6 Operation patterns PART 3 BUILDING A DISTRIBUTED MACHINE LEARNING WORKFLOW 7 Project overview and system architecture 8 Overview of relevant technologies 9 A complete implementation