High Performance Distributed TensorFlow with GPUs - New York - Sept 2017

Title
PipelineAI Distributed Spark ML + TensorFlow AI + GPU Workshop
Highlights
We will each build an end-to-end, continuous TensorFlow AI model training and deployment pipeline on our own GPU-based cloud instance.
At the end, we will combine our cloud instances to create the LARGEST Distributed TensorFlow AI Training and Serving Cluster in the WORLD!
Agenda
Spark ML
TensorFlow AI
Storing and Serving Models with HDFS
Trade-offs of CPU vs. GPU, Scale Up vs. Scale Out
CUDA + cuDNN GPU Development Overview
TensorFlow Model Checkpointing, Saving, Exporting, and Importing (sketched after this agenda)
Distributed TensorFlow AI Model Training (Distributed TensorFlow; sketched after this agenda)
TensorFlow's Accelerated Linear Algebra Framework (XLA)
TensorFlow's Just-in-Time (JIT) and Ahead-of-Time (AOT) Compilers (JIT config sketched after this agenda)
Centralized Logging and Visualizing of Distributed TensorFlow Training (TensorBoard; sketched after this agenda)
Distributed TensorFlow AI Model Serving/Predicting (TensorFlow Serving; client sketched after this agenda)
Centralized Logging and Metrics Collection (Prometheus, Grafana)
Continuous TensorFlow AI Model Deployment (TensorFlow, Airflow)
Hybrid Cross-Cloud and On-Premise Deployments (Kubernetes)
High-Performance and Fault-Tolerant Micro-services (NetflixOSS)
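Code Sketches

Checkpointing and exporting: a minimal sketch assuming the TensorFlow 1.x API that was current in 2017. The toy linear model, the /tmp paths, and the "predict" signature name are illustrative placeholders, not from the workshop materials.

```python
import tensorflow as tf

# Toy linear model: y = Wx + b
x = tf.placeholder(tf.float32, shape=[None, 1], name="x")
W = tf.Variable(tf.zeros([1, 1]), name="W")
b = tf.Variable(tf.zeros([1]), name="b")
y = tf.add(tf.matmul(x, W), b, name="y")

saver = tf.train.Saver()  # writes and restores variable checkpoints

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Checkpoint the variables, then restore them from the same path.
    ckpt_path = saver.save(sess, "/tmp/linear/ckpt", global_step=0)
    saver.restore(sess, ckpt_path)

    # Export a versioned SavedModel that TensorFlow Serving can load.
    sig = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={"x": x}, outputs={"y": y})
    builder = tf.saved_model.builder.SavedModelBuilder("/tmp/linear/export/1")
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={"predict": sig})
    builder.save()
```

TensorFlow Serving watches the model base path (here /tmp/linear/export) and serves the highest-numbered version directory it finds.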
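Distributed training: a sketch of between-graph replication, again assuming TensorFlow 1.x. The host names, ports, and job/task layout are placeholders; each process in the cluster would run this script with its own job_name and task_index.

```python
import tensorflow as tf

# Placeholder cluster: one parameter server, two workers.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# This process plays worker 0; the other processes change job_name/task_index.
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# replica_device_setter pins variables to the ps job and ops to this worker.
with tf.device(tf.train.replica_device_setter(
        worker_device="/job:worker/task:0", cluster=cluster)):
    x = tf.random_normal([32, 1])
    y_true = 3.0 * x + 1.0
    y_pred = tf.layers.dense(x, 1)
    loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)
    global_step = tf.train.get_or_create_global_step()
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
        loss, global_step=global_step)

# MonitoredTrainingSession coordinates the workers and shares checkpoints.
with tf.train.MonitoredTrainingSession(
        master=server.target,
        is_chief=True,  # task 0 is conventionally the chief
        checkpoint_dir="/tmp/dist-ckpt",
        hooks=[tf.train.StopAtStepHook(last_step=1000)]) as sess:
    while not sess.should_stop():
        sess.run(train_op)
```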
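XLA JIT: enabling just-in-time compilation is a one-line session config in TensorFlow 1.x, sketched below; the matmul is only there to give XLA something to compile. (AOT compilation goes through the separate tfcompile tool and is not shown.)

```python
import tensorflow as tf

# Ask XLA to JIT-compile eligible subgraphs for the whole session.
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)

with tf.Session(config=config) as sess:
    a = tf.random_normal([1024, 1024])
    b = tf.random_normal([1024, 1024])
    print(sess.run(tf.reduce_sum(tf.matmul(a, b))))
```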
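TensorBoard: a sketch of centralized logging, assuming TensorFlow 1.x. The log directory is a placeholder; in a distributed run it would typically be a shared path (for example on HDFS) so every worker's event files land in one place.

```python
import tensorflow as tf

# A throwaway scalar to log; a real model would log its training loss.
loss = tf.reduce_mean(tf.square(tf.random_normal([100])))
tf.summary.scalar("loss", loss)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    writer = tf.summary.FileWriter("/tmp/tb-logs", sess.graph)
    for step in range(100):
        writer.add_summary(sess.run(merged), global_step=step)
    writer.close()

# Then, from any machine that can read the log directory:
#   tensorboard --logdir=/tmp/tb-logs
```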
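Serving client: a sketch of a gRPC Predict call against TensorFlow Serving, assuming the SavedModel exported above is served under the placeholder name "linear" on the default gRPC port 8500 (for example via tensorflow_model_server --port=8500 --model_name=linear --model_base_path=/tmp/linear/export), and that the grpcio and tensorflow-serving-api packages are installed.

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a Predict request against the "predict" signature exported earlier.
request = predict_pb2.PredictRequest()
request.model_spec.name = "linear"  # placeholder model name
request.model_spec.signature_name = "predict"
request.inputs["x"].CopyFrom(
    tf.contrib.util.make_tensor_proto([[1.0]], dtype=tf.float32))

response = stub.Predict(request, timeout=10.0)
print(response.outputs["y"])
```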
More Info including GitHub and Docker Repos

Tags
machine learning, artificial intelligence, models in production, ai, ml, pipeline, performance, cuda, gpu, spark ml, hdfs, cudnn, prometheus, jenkins, grafana, kubernetes, tensorflow, tensorflow serving, linear algebra
