About This Offering

This course gives practitioners a hands-on introduction to Natural Language Processing (NLP), from foundational concepts to advanced techniques. It covers key techniques for text representation, classification, and generation, along with modern advancements such as Transformers and Large Language Models (LLMs).

Registration form


Suman Maity

Suman Maity is an Assistant Professor in the Department of Computer Science at Missouri S&T. Previously, he was a postdoctoral research associate at MIT and a postdoctoral fellow at CSSI and NICO, Northwestern University. He earned his PhD from IIT Kharagpur and received IBM and Microsoft Research India PhD Fellowships.

His research focuses on Social NLP and Responsible Machine Learning, with work published in top-tier venues like WWW, EMNLP, and ACL. He has served as a reviewer for conferences such as AAAI, WWW, and ICWSM.

  • Course cost: $1,400
  • Dates: May 29-30
  • Prerequisites: Basic Python and familiarity with machine learning concepts.
At the end of this course, students should be able to:
    1. Understand the fundamentals of NLP and its real-world applications.
    2. Preprocess text data using techniques like tokenization, stemming, stopword removal, and special character handling.
    3. Implement text classification models and build a basic text classifier.
    4. Understand text generation techniques and experiment with RNNs or transformer-based models.
    5. Understand the basics of Large Language Models (LLMs).
    6. Gain insights into ethical considerations in NLP.
    7. Work on a hands-on NLP project to apply learned skills.
  • Introduction to NLP - Overview of NLP and its applications, Challenges in NLP
    Text preprocessing pipeline - Tokenization, stopword removal, stemming, lemmatization
    Hands-on: Preprocessing text using NLTK & spaCy
  • Text Representation - Traditional methods: Bag of Words (BoW), TF-IDF; Word embeddings: Word2Vec, GloVe
    Hands-on: Implementing TF-IDF and Word2Vec using scikit-learn and gensim
  • Text Classification - Understanding text classification and basic ML models (Naïve Bayes, Logistic Regression)
    Hands-on: Building a text classifier using scikit-learn
  • Text Generation and Sequence Models - Overview of text generation techniques; RNNs, LSTMs for text generation
  • Transformers - Introduction to Transformers and BERT, Fine-tuning pre-trained models, Using Hugging Face for NLP
    Hands-on: Using Hugging Face for NLP
  • Large Language Models - Pretraining, Fine-tuning, Prompting, Capabilities of LLMs, Challenges - Hallucination and Bias
    Hands-on: Various prompting strategies
  • Bias, Fairness, and Ethical Considerations in NLP - Bias in language models, Ethical considerations and responsible AI, Mitigation techniques
  • Final Project & Wrap-up
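
To give a flavor of the early hands-on sessions, the short Python sketch below walks through tokenization, stopword removal, and TF-IDF weighting using only the standard library. It is an illustrative example, not taken from the course materials: the stopword list, the function names `preprocess` and `tf_idf`, and the sample sentences are all assumptions made for this sketch. It uses the textbook weighting tf × log(N / df); libraries such as scikit-learn apply a smoothed variant, so their values will differ.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; NLTK and spaCy ship fuller ones).
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "on"}


def preprocess(text):
    """Lowercase the text, split it into alphanumeric tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample corpus.
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "Cats and dogs are pets.",
]
vectors = tf_idf(docs)
```

In the first document, "cat" receives a higher weight than "sat" because "sat" also appears in the second document. Note that "cat" and "cats" are treated as different terms here, which is the kind of gap that stemming or lemmatization is meant to close.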