DS-GA 1011 course labs + assignments + materials
Preparation: Python, PyTorch, Jupyter Lab
Multi-class classification in PyTorch
- NLLLoss in PyTorch
- Autodiff in PyTorch
- Optimization in PyTorch
- Embedding layer in PyTorch
- One-hot vector vs. Indexing
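The topics above can be sketched in a few lines of PyTorch (all dimensions and token ids here are made up for illustration): NLLLoss applied to log-probabilities, an Embedding layer, autodiff via backward(), and the equivalence between indexing an embedding table and multiplying by a one-hot vector.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, num_classes = 100, 16, 3
emb = nn.Embedding(vocab_size, embed_dim)
linear = nn.Linear(embed_dim, num_classes)

tokens = torch.tensor([5, 17, 42])           # a batch of 3 token ids
logits = linear(emb(tokens))                 # (3, num_classes)
log_probs = torch.log_softmax(logits, dim=-1)

targets = torch.tensor([0, 2, 1])
loss = nn.NLLLoss()(log_probs, targets)      # NLLLoss expects log-probabilities

# One-hot vs. indexing: one_hot @ weight equals emb(tokens), but the
# indexed lookup avoids materializing the one-hot matrix.
one_hot = torch.zeros(3, vocab_size)
one_hot[torch.arange(3), tokens] = 1.0
assert torch.allclose(one_hot @ emb.weight, emb(tokens))

loss.backward()                              # autodiff populates .grad
```

Note that `log_softmax` + `NLLLoss` together are what `CrossEntropyLoss` computes from raw logits.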
Count-based N-gram language models
- Using a trie structure for n-gram modeling
- Add-one smoothing
- If time permits, back-off
- How to use KenLM
- Sampling from count-based n-gram models
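A minimal sketch of a count-based bigram LM with add-one smoothing and ancestral sampling; the toy corpus and the sentence-boundary symbols are made up here.

```python
import random
from collections import defaultdict

corpus = [["<s>", "the", "cat", "sat", "</s>"],
          ["<s>", "the", "dog", "sat", "</s>"]]
vocab = sorted({w for sent in corpus for w in sent})

# Bigram counts: counts[prev][cur] = c(prev, cur)
counts = defaultdict(lambda: defaultdict(int))
for sent in corpus:
    for prev, cur in zip(sent, sent[1:]):
        counts[prev][cur] += 1

def prob(cur, prev):
    # Add-one (Laplace) smoothing: (c(prev, cur) + 1) / (c(prev) + |V|)
    total = sum(counts[prev].values())
    return (counts[prev][cur] + 1) / (total + len(vocab))

def sample(max_len=10):
    # Draw the next word from p(. | prev) until </s> or max_len.
    out, prev = [], "<s>"
    for _ in range(max_len):
        weights = [prob(w, prev) for w in vocab]
        prev = random.choices(vocab, weights=weights)[0]
        if prev == "</s>":
            break
        out.append(prev)
    return out
```

Because every continuation gets at least pseudo-count 1, the smoothed distribution assigns nonzero probability to unseen bigrams and still sums to 1.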
Introduction to Google Colab
Recurrent neural networks in PyTorch
- How to properly handle sequence data
- RNNCell vs. RNN
- How to compute the loss function properly
RNN Language modeling in PyTorch
- Computing perplexity
- Comparison to FF-LM and count-based n-gram LM
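A sketch of perplexity computation for a recurrent LM (the toy model and random batch below are illustrative only): perplexity is the exponential of the mean per-token cross-entropy on shifted input/target pairs.

```python
import math
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 50, 8, 16

class RNNLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)                        # (batch, seq, vocab)

model = RNNLM()
tokens = torch.randint(0, vocab_size, (2, 11))    # toy batch of sequences
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict the next token

logits = model(inputs)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size),
                             targets.reshape(-1))
perplexity = math.exp(loss.item())
```

An untrained model's perplexity sits near the vocabulary size (a uniform guess); the same formula applies to FF-LMs and, via log-probabilities, to count-based n-gram LMs, which is what makes the comparison fair.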
Demonstration of learning difficulties in PyTorch
- A deep feedforward network vs. a deep highway network
- A recurrent network vs. an LSTM network
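The highway-network side of the comparison can be sketched as a single layer (dimensions here are illustrative): a learned gate mixes the transformed input with the raw input, which eases gradient flow in deep stacks, analogously to how LSTM gates help recurrent networks.

```python
import torch
import torch.nn as nn

class Highway(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)
        # Bias the gate toward carrying the input through at initialization,
        # so early training behaves like a near-identity mapping.
        nn.init.constant_(self.gate.bias, -2.0)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))           # transform gate in (0, 1)
        h = torch.relu(self.transform(x))
        return t * h + (1 - t) * x                # carry gate = 1 - t

layer = Highway(32)
y = layer(torch.randn(4, 32))
```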
Demonstration of BERT-like training of language models
- Explain the BERT objective
- Sampling from the BERT language model
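A toy sketch of the BERT masked-LM objective (the tiny "model" below is made up and is not a transformer): mask a random subset of tokens, compute the loss only on the masked positions, and sample replacements for those positions from the predicted distribution.

```python
import torch
import torch.nn as nn

vocab_size, dim, MASK = 30, 16, 0                 # reserve id 0 as [MASK]

model = nn.Sequential(nn.Embedding(vocab_size, dim),
                      nn.Linear(dim, vocab_size))

tokens = torch.randint(1, vocab_size, (1, 12))
mask = torch.rand(tokens.shape) < 0.15            # mask ~15% of positions
mask[0, 0] = True                                 # guarantee at least one mask

corrupted = tokens.masked_fill(mask, MASK)
logits = model(corrupted)

# As in BERT, the loss is taken only over masked positions.
loss = nn.CrossEntropyLoss()(logits[mask], tokens[mask])

# "Sampling from the model": draw replacements for the masked slots.
probs = torch.softmax(logits, dim=-1)
draws = torch.distributions.Categorical(probs).sample()
sampled = torch.where(mask, draws, tokens)
```

Unmasked positions are left untouched by the sampling step, so repeated mask-and-resample rounds can be used as a (crude) generation procedure.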
There will be five programming assignments. They are designed to be done by teams of 3-4 students and will be graded in person by the section leader and/or graders.
Assignment 1
Dataset: SNLI, MNLI
Models: bag-of-words (BoW) neural networks
Evaluation criteria: Accuracy, Analysis
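One plausible starting point for this assignment (all dimensions are made up, and the real model choices are left to the team): a bag-of-words NLI classifier that averages word embeddings per sentence and classifies the concatenated premise/hypothesis vectors into {entailment, neutral, contradiction}.

```python
import torch
import torch.nn as nn

vocab_size, dim = 200, 32

class BoWNLI(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.clf = nn.Linear(2 * dim, 3)          # 3 NLI labels

    def forward(self, premise, hypothesis):
        p = self.emb(premise).mean(dim=1)         # (batch, dim) sentence vector
        h = self.emb(hypothesis).mean(dim=1)
        return self.clf(torch.cat([p, h], dim=-1))

model = BoWNLI()
logits = model(torch.randint(0, vocab_size, (4, 7)),
               torch.randint(0, vocab_size, (4, 9)))
```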
Assignment 2
Dataset: WikiText-2 or WikiText-103
Models: count-based n-gram LM, neural n-gram LM, recurrent LM
Evaluation criteria: Perplexity, Analysis
Assignment 3
Target dataset: MNLI (using 5%, 10%, 20%, or 40% of the training set)
Source dataset: WikiText-103
Models: BoW, recurrent nets
Evaluation criteria: Accuracy improvement, Analysis
Assignment 4
Dataset: PersonaChat
Models: attention-based sequence-to-sequence using either recurrent nets or transformers
Evaluation criteria: interactive demo of a chit-chat bot
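The core of either model family is the attention step, which can be sketched as scaled dot-product attention over encoder states (shapes below are illustrative):

```python
import math
import torch

def attention(query, keys, values):
    # query: (batch, d); keys, values: (batch, src_len, d)
    scores = torch.bmm(keys, query.unsqueeze(-1)).squeeze(-1)
    scores = scores / math.sqrt(query.size(-1))      # scale by sqrt(d)
    weights = torch.softmax(scores, dim=-1)          # (batch, src_len)
    context = torch.bmm(weights.unsqueeze(1), values).squeeze(1)
    return context, weights                          # context: (batch, d)

q = torch.randn(2, 8)                                # one decoder state per example
k = torch.randn(2, 5, 8)                             # five encoder states
ctx, w = attention(q, k, k)
```

In a recurrent seq2seq model the query is the decoder hidden state; in a transformer the same operation runs for many queries and heads in parallel.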
Assignment 5
Dataset: SQuAD 1.0
Model: Attention sum reader or later models
Evaluation criteria: an interactive demo of a machine reading system