Computing Course • Jillur Quddus

Real-Time Machine Learning

Learn how to apply statistical learning techniques to real-time event-driven data in Python by integrating distributed machine learning models with scalable, high-throughput and fault-tolerant streaming platforms.

Jillur Quddus • Founder & Chief Data Scientist • 1st Sep 2020

Course Details

This course provides a hands-on exploration of the industry-standard Apache Kafka distributed streaming platform and shows how to integrate it with distributed machine learning models, via Apache Spark and its Structured Streaming engine, to build high-throughput, low-latency real-time machine learning systems. It follows on from our Applied Machine Learning and Distributed Machine Learning courses, and enables experienced senior data scientists and data engineers to learn from event-driven data and make predictions in real time. The course also covers real-time architectural patterns, and how to build continuous real-time feedback loops that automate the training of machine learning models based on the actions of system users and customers.
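
As a rough illustration of the feedback-loop idea described above, the following minimal sketch updates a linear regression model one event at a time. It is not the course's own material: it uses plain Python rather than Apache Kafka or Apache Spark, and the class `OnlineLinearRegressor`, the functions `simulated_event_stream` and `run_feedback_loop`, the learning rate and the synthetic data are all hypothetical names and values chosen for the example. In a real system the simulated generator would be replaced by a consumer reading from a streaming platform.

```python
import random
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class OnlineLinearRegressor:
    """Hypothetical linear model updated per event with stochastic gradient descent."""

    n_features: int
    learning_rate: float = 0.05  # assumed value, chosen for this toy data
    weights: List[float] = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self) -> None:
        # Start from an all-zero model if no weights were supplied.
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, x: List[float]) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x: List[float], y: float) -> float:
        """Take one gradient step on the squared error for a single event."""
        error = self.predict(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error
        return error


def simulated_event_stream(
    n_events: int, seed: int = 0
) -> Iterator[Tuple[List[float], float]]:
    """Stand-in for a streaming source: yields (features, observed outcome) pairs.

    The outcome follows an assumed relationship y = 2*x0 - 3*x1 + 0.5.
    """
    rng = random.Random(seed)
    for _ in range(n_events):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 2.0 * x[0] - 3.0 * x[1] + 0.5
        yield x, y


def run_feedback_loop(n_events: int = 5000) -> OnlineLinearRegressor:
    """Predict on each event, then learn from the outcome once it is observed."""
    model = OnlineLinearRegressor(n_features=2)
    for x, y in simulated_event_stream(n_events):
        model.predict(x)    # the prediction that would be served to the user
        model.update(x, y)  # the feedback step that retrains the model
    return model


if __name__ == "__main__":
    trained = run_feedback_loop()
    print(trained.weights, trained.bias)
```

The sketch keeps prediction and training in the same loop for brevity. In a distributed deployment these would typically be separate stages connected by the streaming platform, which is the kind of architectural choice the later modules address.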

Course Modules

  • 1. Introduction to Apache Kafka
  • 2. Apache Kafka and Python
  • 3. Apache Spark Structured Streaming
  • 4. Real-Time Regression
  • 5. Real-Time Classification
  • 6. Real-Time Clustering
  • 7. Real-Time Collaborative Filtering
  • 8. Real-Time Feedback and Training
  • 9. Real-Time Architectural Patterns

Requirements

Outcomes

  • The ability to apply statistical learning techniques to event-driven data in real time.
  • The ability to integrate distributed machine learning models with distributed streaming platforms in order to learn from data and make predictions in real time.
  • The ability to build feedback loops that automate updates to machine learning models.
  • Knowledge of real-time architectural patterns and best practices.
  • Knowledge of the industry-standard Apache Spark Structured Streaming engine and Apache Kafka distributed streaming platform.