CS 4782 Course Notes

CS 4782 Deep Learning · Cornell University
Comprehensive lecture notes compiled by students. Each topic covers key concepts, mathematical foundations, and practical insights from the course.
Foundations
Topic 0
Recap, Linear Models
Review of linear algebra, probability, and linear classifiers as a foundation for deep learning.
Topic 1
MLP, SGD, and Optimization
Multilayer perceptrons, forward pass, backpropagation, and optimization algorithms including SGD, momentum, AdaGrad, RMSProp, and Adam.
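The optimizers listed above differ only in how they turn a gradient into a parameter update. A minimal NumPy sketch of two of them, SGD with momentum and Adam (function and variable names here are illustrative, not the course's code):

```python
import numpy as np

def sgd_momentum(w, grad, v, lr=0.01, beta=0.9):
    """SGD with momentum: v accumulates a decaying sum of past gradients,
    smoothing the update direction."""
    v = beta * v + grad
    w = w - lr * v
    return w, v

def adam(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: adapts the step size per parameter using estimates of the
    gradient's first moment (m) and second moment (v)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias correction for zero initialization
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy run: minimize f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 501):
    w, m, v = adam(w, 2 * w, m, v, t, lr=0.05)
```

After a few hundred steps `w` sits near the minimum at the origin, oscillating with amplitude on the order of the learning rate, which is why Adam is typically run with a decaying schedule.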
Topic 2
Regularization
Techniques to prevent overfitting: L1/L2 regularization, dropout, batch normalization, and data augmentation.
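Two of the techniques named above fit in a few lines. A hedged NumPy sketch of an L2 penalty and inverted dropout (the scaling convention most frameworks use; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_penalty(w, lam):
    """L2 regularization adds lam * ||w||^2 to the loss, shrinking weights
    toward zero (closely related to weight decay)."""
    return lam * np.sum(w ** 2)

def dropout(x, p=0.5, train=True):
    """Inverted dropout: at train time, zero each unit with probability p and
    scale survivors by 1/(1-p) so the expected activation is unchanged; at
    test time the input passes through untouched."""
    if not train:
        return x
    mask = (rng.random(x.shape) >= p) / (1 - p)
    return x * mask

x = np.ones((4, 8))
y = dropout(x, p=0.5)   # entries are either 0.0 (dropped) or 2.0 (kept, scaled)
```

Because of the 1/(1-p) scaling, no rescaling is needed at inference, which is the main reason the inverted form is preferred.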
Computer Vision
Topic 3
Convolutional Neural Networks (CNNs)
Convolution operations, pooling, and the architecture of convolutional neural networks for image processing.
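The convolution operation itself is just a sliding dot product. A minimal NumPy sketch of a single-channel "valid" convolution (technically cross-correlation, which is what deep-learning frameworks compute; names are illustrative):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation: slide the kernel over the image and take the
    elementwise product-sum at each position. No padding, stride 1."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

edge = np.array([[1.0, -1.0]])           # simple horizontal edge detector
img = np.array([[0.0, 0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0, 1.0]])
response = conv2d(img, edge)             # nonzero only at the 0 -> 1 boundary
```

The output is smaller than the input by `kernel_size - 1` in each dimension, which is why padding is introduced when spatial size must be preserved.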
Topic 4
Modern ConvNets
Advanced architectures: AlexNet, VGG, ResNet, and recent developments in convolutional networks.
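ResNet's central idea, the skip connection, can be sketched in a few lines of NumPy. This is an illustrative fully-connected residual block, not the convolutional blocks used in the actual architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """A residual block computes out = x + F(x): the layers learn only a
    correction F to the identity, which keeps gradients flowing through
    very deep stacks."""
    return x + relu(x @ W1) @ W2

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
W1, W2 = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
y = residual_block(x, W1, W2)
```

With all-zero weights the block reduces exactly to the identity, which is why adding residual blocks cannot make the representable function set worse, a key argument in the ResNet paper.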
Natural Language Processing
Topic 5
Word Embeddings
Representing words as dense vectors: Word2Vec, GloVe, and contextual embeddings.
Topic 6
Recurrent Neural Networks (RNNs)
Sequential data processing with RNNs, LSTMs, and GRUs for language modeling and sequence prediction.
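The core recurrence is one line of math: the new hidden state mixes the current input with the previous state through a nonlinearity. A minimal vanilla-RNN sketch in NumPy (LSTMs and GRUs add gating on top of this same loop; names are illustrative):

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One vanilla RNN step: h_t = tanh(x_t Wx + h_{t-1} Wh + b).
    The same weights are reused at every time step."""
    return np.tanh(x_t @ Wx + h_prev @ Wh + b)

rng = np.random.default_rng(0)
Wx = rng.standard_normal((3, 8)) * 0.1
Wh = rng.standard_normal((8, 8)) * 0.1
b = np.zeros(8)

h = np.zeros(8)                          # initial hidden state
for t in range(4):                       # unroll over a length-4 sequence
    h = rnn_step(rng.standard_normal(3), h, Wx, Wh, b)
```

Backpropagation through this unrolled loop multiplies by `Wh` repeatedly, which is the source of the vanishing/exploding-gradient problem that LSTMs and GRUs were designed to mitigate.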
Topic 7
Attention and Transformers
Self-attention mechanisms, the Transformer architecture, and positional encodings.
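Scaled dot-product self-attention is compact enough to sketch directly. A single-head, unmasked version in NumPy (variable names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # shift for stability
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: each token's query is compared
    against all keys; the resulting weights mix the value vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))    # (seq, seq) attention weights
    return A @ V, A

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))         # 5 tokens, model dim 16
Wq, Wk, Wv = (rng.standard_normal((16, 8)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)   # out: (5, 8); each row of A sums to 1
```

Note that nothing here depends on token order, which is exactly why the Transformer needs positional encodings added to `X`.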
Topic 8
Large Language Models (LLMs)
GPT, BERT, and modern large language models: pre-training, fine-tuning, and emergent capabilities.
Vision-Language & Pre-Training
Topic 9
Vision Pre-Training
Supervised and self-supervised pre-training strategies for visual representation learning.
Topic 10
Vision-Language Models
CLIP, Flamingo, and multimodal models that connect vision and language understanding.
Generative Models
Topic 11
Discriminators and GANs
Generative Adversarial Networks: generator-discriminator training, mode collapse, and GAN variants.
Topic 12
U-Nets and VAEs
Encoder-decoder architectures, skip connections, and variational autoencoders for generation.
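Two pieces of the VAE are worth sketching: the reparameterization trick that makes sampling differentiable, and the closed-form KL term of the ELBO for a diagonal-Gaussian encoder. A minimal NumPy illustration (names are illustrative, not the course's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Reparameterization trick: draw z = mu + sigma * eps with eps ~ N(0, I),
    so gradients flow through mu and log_var instead of through sampling."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), the
    regularization term of the ELBO."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

mu, log_var = np.zeros(4), np.zeros(4)   # encoder output for one example
z = reparameterize(mu, log_var)          # differentiable latent sample
```

When the encoder outputs exactly the prior (zero mean, unit variance), the KL term is zero, a useful sanity check when implementing the loss.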
Reinforcement Learning
Ethics & Safety