Data Science and Machine Learning – Corporate and Professional Education in and for St. Louis

About This Offering

This course introduces Data Science and Machine Learning, covering data cleaning, exploratory analysis, and visualization with Pandas and NumPy. Participants gain hands-on experience, progressing to building robust models and tackling real-world data challenges.


About the Instructor

Suman Maity

Suman Maity is an Assistant Professor in the Department of Computer Science at Missouri S&T. Previously, he was a postdoctoral research associate at MIT and a postdoctoral fellow at CSSI and NICO, Northwestern University. He earned his PhD from IIT Kharagpur and received IBM and Microsoft Research India PhD Fellowships.

His research focuses on Social NLP and Responsible Machine Learning, with work published in top-tier venues like WWW, EMNLP, and ACL. He has served as a reviewer for conferences such as AAAI, WWW, and ICWSM.

About the Course

Course cost: $1,400
Dates: July 31 and Aug 1
Location: 2127 Innerbelt Business Center Drive, St. Louis, MO 63114
- The program will be held in person, with a livestream option if needed.
Prerequisites: Basic Python programming knowledge

Course Goals and Objectives

Gain practical experience in data preprocessing, visualization, and model development.
Understand and apply key machine learning techniques, including regression, classification, and clustering.
Build and implement both supervised and unsupervised ML models.
Learn to analyze time series data and develop forecasting models.
Explore the fundamentals of neural networks and their applications.

Course Overview

Introduction to Data Science and Python Basics - Overview of data science and its real-world applications; Python fundamentals: syntax, data structures, and control flow.
Hands-on: Writing Python scripts for data manipulation.
Data Preprocessing and Visualization - Data cleaning techniques, Feature engineering, Data visualization with Matplotlib and Seaborn.
Hands-on: Exploratory Data Analysis (EDA) with a real dataset.
Supervised Learning: Regression and Classification - Regression vs. classification, linear regression, logistic regression for classification problems, and model evaluation metrics (MAE, RMSE, precision, recall, F1-score).
Hands-on: Implementing regression and classification models using Scikit-learn.
Unsupervised Learning: Clustering and Dimensionality Reduction - K-Means Clustering, Principal Component Analysis (PCA) for dimensionality reduction.
Hands-on: Clustering real-world data and visualizing high-dimensional data.
Advanced Classification Techniques - Support Vector Machines (SVM) for classification tasks, Hyperparameter tuning for better performance.
Hands-on: Training and tuning an SVM model.
Time Series Analysis and Forecasting - Basics of Time Series data, Forecasting techniques using moving averages and ARIMA.
Hands-on: Forecasting trends using historical data.
Introduction to Neural Networks - Neural networks basics: Perceptron, activation functions, forward/backpropagation.
Hands-on: Building a simple neural network.
Model Interpretability - Why model interpretability matters, the importance of explainability in ML, the trade-off between accuracy and interpretability, regulatory and ethical considerations (e.g., GDPR, fairness in AI), and explaining black-box models with tools such as SHAP and LIME.
