About This Offering

We will start with the fundamentals—tokenization, embeddings, and transformer architecture—and move toward cutting-edge advancements in Large Language Models. Along the way, you will gain hands-on experience with platforms and tools like Hugging Face, LangChain, and real-world APIs.

Registration form


Suman Maity

Suman Maity is an Assistant Professor in the Department of Computer Science at Missouri S&T. Previously, he was a postdoctoral research associate at MIT and a postdoctoral fellow at CSSI and NICO at Northwestern University. He earned his PhD from IIT Kharagpur and received the IBM and Microsoft Research India PhD Fellowships.

His research focuses on Social NLP and Responsible Machine Learning, with work published in top-tier venues like WWW, EMNLP, and ACL. He has served as a reviewer for conferences such as AAAI, WWW, and ICWSM.

  • Course cost: $1,400
  • Dates: Aug 21-22 
  • Location: 2127 Innerbelt Business Center Drive, St. Louis, MO 63114
    • The program will be held in person, with a livestream option if needed.
  • Prerequisites: Basic Python and familiarity with machine learning concepts.

By the end of this course, students should be able to:
  1. Describe the evolution of language models and the significance of LLMs in modern NLP.
  2. Understand the architecture and functionality of transformer-based models.
  3. Apply prompt engineering techniques for various tasks such as summarization, Q&A, and creative generation.
  4. Use libraries and tools such as Hugging Face Transformers and LangChain to develop LLM-based solutions.
  5. Build a simple Retrieval-Augmented Generation (RAG) pipeline for domain-specific applications.
  6. Design and implement a small-scale LLM-powered project demonstrating practical skills learned in the course.

Course Outline

  • Introduction to Language Models
    • Evolution of language modeling: N-grams → RNNs → Transformers → LLMs
    • Why LLMs matter: zero-shot, few-shot, and chain-of-thought reasoning (see the short sketch below)
    • Overview of popular models: open-source (LLaMA, Mistral) vs. proprietary (GPT, Claude)
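
The zero-shot vs. few-shot contrast above is easiest to see in code. The following is a minimal sketch, assuming the Hugging Face transformers library; gpt2 is used only because it is small enough to run on a laptop, and a base model this small follows instructions poorly, which is exactly why larger, instruction-tuned LLMs matter.

```python
# Minimal zero-shot vs. few-shot sketch with the Hugging Face pipeline API.
# "gpt2" is an illustrative small checkpoint, not a model used in the course.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Zero-shot: only the task description is given.
zero_shot = generator("Translate to French: 'Good morning'", max_new_tokens=20)

# Few-shot: a couple of worked examples are prepended to the prompt.
few_shot_prompt = (
    "English: Hello -> French: Bonjour\n"
    "English: Thank you -> French: Merci\n"
    "English: Good morning -> French:"
)
few_shot = generator(few_shot_prompt, max_new_tokens=10)

print(zero_shot[0]["generated_text"])
print(few_shot[0]["generated_text"])
```
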
  • Tokens and Embeddings
    • What are tokens? Tokenization strategies: Byte Pair Encoding (BPE), WordPiece, SentencePiece
    • From static to contextual embeddings: Word2Vec → BERT → GPT
    • Hands-on: Visualize tokenization and embedding spaces using dimensionality reduction (e.g., t-SNE or UMAP); a minimal sketch follows below
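
A minimal sketch of what the tokenization and embedding hands-on could look like, assuming the transformers, torch, and scikit-learn packages; the checkpoints (gpt2 for byte-level BPE, bert-base-uncased for WordPiece) and the word list are illustrative stand-ins, not the exact class notebook.

```python
# Compare BPE and WordPiece tokenization, then project a few word embeddings
# to 2-D with t-SNE. Checkpoints and words are illustrative placeholders.
import torch
from sklearn.manifold import TSNE
from transformers import AutoModel, AutoTokenizer

text = "Large language models tokenize text into subwords."

bpe_tok = AutoTokenizer.from_pretrained("gpt2")                      # byte-level BPE
wordpiece_tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # WordPiece

print("BPE tokens:      ", bpe_tok.tokenize(text))
print("WordPiece tokens:", wordpiece_tok.tokenize(text))

# Token IDs index into the model's embedding matrix.
model = AutoModel.from_pretrained("bert-base-uncased")
words = ["king", "queen", "apple", "banana", "paris", "london"]
ids = wordpiece_tok(words, add_special_tokens=False)["input_ids"]

with torch.no_grad():
    # One vector per word: average the input embeddings of its WordPiece tokens.
    vecs = torch.stack(
        [model.embeddings.word_embeddings(torch.tensor(i)).mean(dim=0) for i in ids]
    )

# t-SNE projection to 2-D (UMAP would work the same way).
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(vecs.numpy())
for word, (x, y) in zip(words, coords):
    print(f"{word:8s} -> ({x:7.1f}, {y:7.1f})")
```
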
  • Transformer Model
    • Key components: self-attention, multi-head attention, positional encodings
    • Transformer variants: encoder-only (BERT), decoder-only (GPT), encoder-decoder (T5)
    • Anatomy of a layer: attention heads, feedforward layers, residual connections, layer normalization
    • Hands-on: Explore a simplified transformer architecture (a toy self-attention sketch follows below)
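
To make the attention arithmetic concrete, here is a toy single-head self-attention layer in plain NumPy. It is a simplified sketch for building intuition (no masking, no multi-head split, no trained parameters), not the notebook used in class.

```python
# Toy single-head self-attention: each output vector is a weighted mix of the
# value vectors, with weights given by scaled dot-product similarity.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token embeddings; returns contextualized vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (seq_len, seq_len) attention logits
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (4, 8)
```

A full transformer layer wraps this in multi-head attention, adds a position-wise feedforward network, and applies residual connections and layer normalization around both.
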
  • Prompt Engineering
    • Prompting strategies: zero-shot, few-shot, chain-of-thought, and system-level prompts
    • Instruction tuning vs prompt engineering
    • Hands-on: Design prompt templates for summarization, Q&A, and reasoning
    • Build a prompt-based mini app (e.g., travel planner, story generator) using the OpenAI API or open-source LLMs via Hugging Face (see the sketch below)
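
One way the prompt-template mini app could look: a tiny travel planner built on a reusable template string. This sketch assumes the OpenAI Python client (v1-style chat completions) and an OPENAI_API_KEY in the environment; the model name is illustrative, and the same template can be sent to an open-source chat model through Hugging Face instead.

```python
# Prompt-template mini app sketch: a reusable template filled in per request.
# Assumes `pip install openai` and OPENAI_API_KEY set; model name is illustrative.
from openai import OpenAI

TEMPLATE = (
    "You are a travel planner.\n"
    "Destination: {destination}\n"
    "Days: {days}\n"
    "Write a short day-by-day itinerary with one food recommendation per day."
)

def plan_trip(destination: str, days: int, model: str = "gpt-4o-mini") -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Be concise and practical."},
            {"role": "user", "content": TEMPLATE.format(destination=destination, days=days)},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(plan_trip("St. Louis", 2))
```
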
  • Retrieval-Augmented Generation (RAG)
    • What is RAG? Concept, motivation, and key applications
    • Core components: document chunking, text embeddings, vector stores, retrievers, re-rankers
    • Popular frameworks and libraries: LangChain, LlamaIndex, FAISS, and Chroma, among others
    • Hands-on: Build a mini RAG system to answer questions from PDF files or Wikipedia articles (a from-scratch sketch follows below)
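
A from-scratch sketch of the RAG pipeline's moving parts (chunking, embeddings, a vector index, retrieval, and prompt assembly), assuming the sentence-transformers and faiss-cpu packages; frameworks such as LangChain and LlamaIndex wrap these same steps, and a re-ranker would sit between retrieval and prompt assembly.

```python
# Mini RAG sketch: embed chunks, retrieve the closest ones for a question,
# and assemble a grounded prompt. Chunks and model name are illustrative.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# 1. Document chunking (here, a few toy chunks instead of a parsed PDF).
chunks = [
    "The course runs Aug 21-22 in St. Louis, Missouri.",
    "Prerequisites are basic Python and familiarity with machine learning.",
    "The final project is a small LLM-powered application.",
]

# 2. Embed the chunks and index them in a vector store.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(vectors.shape[1])   # inner product = cosine on normalized vectors
index.add(np.asarray(vectors, dtype="float32"))

# 3. Retrieve the top-k chunks for a question.
question = "What do I need to know before taking the course?"
q_vec = np.asarray(embedder.encode([question], normalize_embeddings=True), dtype="float32")
_, top_idx = index.search(q_vec, 2)
retrieved = [chunks[i] for i in top_idx[0]]

# 4. Assemble the augmented prompt that would be sent to an LLM.
prompt = (
    "Answer using only the context below.\n\n"
    "Context:\n" + "\n".join(retrieved) +
    f"\n\nQuestion: {question}\nAnswer:"
)
print(prompt)
```
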
  • Final Project & Wrap-Up