About This Offering

This course introduces Data Science and Machine Learning, covering data cleaning, exploratory analysis, and visualization with Pandas and NumPy. Participants gain hands-on experience, progressing to building robust models and tackling real-world data challenges.

Suman Maity

Suman Maity is an Assistant Professor in the Department of Computer Science at Missouri S&T. Previously, he was a postdoctoral research associate at MIT and a postdoctoral fellow at CSSI and NICO at Northwestern University. He earned his PhD from IIT Kharagpur, where he received IBM and Microsoft Research India PhD Fellowships.

His research focuses on Social NLP and Responsible Machine Learning, with work published in top-tier venues such as WWW, EMNLP, and ACL. He has served as a reviewer for conferences including AAAI, WWW, and ICWSM.

  • Course cost: $1,400
  • Dates: May 15-16
  • Prerequisites: Basic Python programming knowledge

Learning Outcomes

  • Gain practical experience in data preprocessing, visualization, and model development.
  • Understand and apply key machine learning techniques, including regression, classification, and clustering.
  • Build and implement both supervised and unsupervised ML models.
  • Learn to analyze time series data and develop forecasting models.
  • Explore the fundamentals of neural networks and their applications.

Course Outline

  • Introduction to Data Science and Python Basics - Overview of Data Science and its real-world applications; Python fundamentals (syntax, data structures, and control flow).
    Hands-on: Writing Python scripts for data manipulation.
  • Data Preprocessing and Visualization - Data cleaning techniques; feature engineering; data visualization with Matplotlib and Seaborn.
    Hands-on: Exploratory Data Analysis (EDA) with a real dataset.
  • Supervised Learning: Regression and Classification - Regression vs. classification; linear regression; logistic regression for classification problems; model evaluation metrics (MAE and RMSE for regression, and precision, recall, and F1-score for classification).
    Hands-on: Implementing regression and classification models using Scikit-learn.
  • Unsupervised Learning: Clustering and Dimensionality Reduction - K-Means clustering; Principal Component Analysis (PCA) for dimensionality reduction.
    Hands-on: Clustering real-world data and visualizing high-dimensional data.
  • Advanced Classification Techniques - Support Vector Machines (SVMs) for classification tasks; hyperparameter tuning to improve model performance.
    Hands-on: Training and tuning an SVM model.
  • Time Series Analysis and Forecasting - Basics of time series data; forecasting with moving averages and ARIMA.
    Hands-on: Forecasting trends using historical data.
  • Introduction to Neural Networks - Neural network basics: the perceptron, activation functions, forward propagation, and backpropagation.
    Hands-on: Building a simple neural network.
  • Model Interpretability - Why model interpretability matters; the importance of explainability in ML; trade-offs between accuracy and interpretability; regulatory and ethical considerations (e.g., GDPR, fairness in AI); explaining black-box models with tools such as SHAP and LIME.