Natural Language Processing – Corporate and Professional Education in and for St. Louis

About This Offering

This course gives practitioners a hands-on introduction to Natural Language Processing, from foundational concepts to advanced techniques. It covers key techniques for text representation, classification, and generation, along with modern advancements such as Transformers and Large Language Models (LLMs).

About the Instructor

Suman Maity

Suman Maity is an Assistant Professor in the Department of Computer Science at Missouri S&T. Previously, he was a postdoctoral research associate at MIT and a postdoctoral fellow at CSSI and NICO, Northwestern University. He earned his PhD from IIT Kharagpur and received IBM and Microsoft Research India PhD Fellowships.

His research focuses on Social NLP and Responsible Machine Learning, with work published in top-tier venues like WWW, EMNLP, and ACL. He has served as a reviewer for conferences such as AAAI, WWW, and ICWSM.

About the Course

Course cost: $1,400
Dates: Late July/early August. Exact dates to be announced soon.
Location: 2127 Innerbelt Business Center Drive, St. Louis, MO 63114
Prerequisites: Basic Python and familiarity with machine learning concepts.

Course Goals and Objectives

At the end of this course, students should be able to:

1. Understand the fundamentals of NLP and its real-world applications.
2. Preprocess text data using techniques like tokenization, stemming, stopword removal, and special character handling.
3. Implement text classification models and build a basic text classifier.
4. Understand text generation techniques and experiment with RNNs or transformer-based models.
5. Understand the basics of Large Language Models (LLMs).
6. Gain insights into ethical considerations in NLP.
7. Work on a hands-on NLP project to apply learned skills.

Course Overview

Introduction to NLP - Overview of NLP and its applications, Challenges in NLP
Text Preprocessing Pipeline - Tokenization, stopword removal, stemming, lemmatization
Hands-on: Preprocessing text using NLTK & spaCy
Text Representation - Traditional methods: Bag of Words (BoW), TF-IDF; Word embeddings: Word2Vec, GloVe
Hands-on: Implementing TF-IDF and Word2Vec using scikit-learn and gensim
Text Classification - Understanding text classification and basic ML models (Naïve Bayes, Logistic Regression)
Hands-on: Building a text classifier using scikit-learn
Text Generation and Sequence Models - Overview of text generation techniques; RNNs, LSTMs for text generation
Transformers - Introduction to Transformers and BERT, Fine-tuning pre-trained models, Using Hugging Face for NLP
Hands-on: Using Hugging Face for NLP
Large Language Models - Pretraining, fine-tuning, prompting, capabilities of LLMs, and challenges (hallucination and bias)
Hands-on: Various prompting strategies
Bias, Fairness, and Ethical Considerations in NLP - Bias in language models, Ethical considerations and responsible AI, Mitigation techniques
Final Project & Wrap-up
