From Regression to Deep Learning
- Why Regression Isn't Enough
- Single Neuron & Bias
- ReLU & Sigmoid
- Forward Pass
- Decision Regions
Deep Learning with PyTorch
A practical journey from neurons and CNNs to Transformers and multi-modal vision–language models like CLIP.
Understand how neural networks learn and predict
Seeing the world through convolutional neural networks
Creating new data with Variational Autoencoders
Teaching machines to understand text
Connecting vision and language together
Jupyter notebooks with runnable code examples
Generate a spiral dataset, train a small PyTorch network with ReLU/Sigmoid, and visualize decision regions.
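The core of this notebook is the forward pass: a weighted sum plus a bias, followed by a nonlinearity. A minimal sketch in plain Python with made-up weights (toy numbers, not from the notebook):

```python
import math

def relu(x):
    # ReLU: max(0, x) -- passes positives through, zeroes out negatives
    return max(0.0, x)

def sigmoid(x):
    # Sigmoid squashes any input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, activation):
    # Forward pass of a single neuron: weighted sum + bias, then activation
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical weights and bias, chosen only for illustration
out_relu = neuron([1.0, -2.0], [0.5, 0.25], 0.1, relu)     # z = 0.5 - 0.5 + 0.1 = 0.1
out_sig = neuron([1.0, -2.0], [0.5, 0.25], 0.1, sigmoid)
print(out_relu, out_sig)
```

Stacking layers of such neurons, with nonlinearities between them, is what lets the network carve out the curved decision regions the spiral dataset demands.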
Classify MNIST digits using an MLP, reaching ~98% accuracy, with live loss visualization and a confusion matrix.
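The confusion matrix used here is simple to build by hand: rows are true labels, columns are predictions. A minimal sketch with toy labels (not the notebook's actual data):

```python
def confusion_matrix(y_true, y_pred, n_classes):
    # m[t][p] counts examples with true label t predicted as p
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

# Toy example: one class-1 example is misclassified as class 2
cm = confusion_matrix([0, 1, 1, 2], [0, 1, 2, 2], 3)
print(cm)  # diagonal = correct predictions
```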
Train a CNN on MNIST for ~99% accuracy. Visualize learned convolution filters and activation maps.
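The filters this notebook visualizes are just small matrices slid across the image. A bare-bones sketch of the operation (a valid cross-correlation, which is what `nn.Conv2d` computes), with a toy edge-detector kernel:

```python
def conv2d(image, kernel):
    # Slide the kernel over the image, taking an elementwise-product
    # sum at each position (no padding, stride 1)
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = []
    for i in range(oh):
        row = []
        for j in range(ow):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A vertical-edge detector on a 4x4 image with a sharp left/right boundary
img = [[0, 0, 1, 1]] * 4
edge_kernel = [[-1, 1], [-1, 1]]  # responds where intensity jumps left-to-right
fmap = conv2d(img, edge_kernel)
print(fmap)  # activation is strongest at the edge column
```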
Compare training with and without augmentation using FastAI and xResNet18 on CIFAR-10.
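Augmentation just applies label-preserving transforms to each training image. The simplest one, a horizontal flip, can be sketched in a few lines (treating an image as rows of pixel values; FastAI's transforms operate on tensors, but the idea is the same):

```python
def hflip(image):
    # Horizontal flip: reverse each row of pixels
    return [row[::-1] for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
flipped = hflip(img)
print(flipped)
```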
From tokenization to Word2Vec magic. Explore word analogies, Skip-Gram, and CBOW.
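Skip-Gram trains on (center, context) word pairs drawn from a sliding window. Generating those pairs is a small, instructive piece of the pipeline, sketched here in plain Python:

```python
def skipgram_pairs(tokens, window=2):
    # For each center word, emit (center, context) pairs for every
    # other token within the window
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

tokens = "the cat sat on the mat".split()
pairs = skipgram_pairs(tokens, window=1)
print(pairs[:4])
```

CBOW simply reverses the roles: the context words predict the center word.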
Classify AG News topics using a pretrained AWD-LSTM with FastAI's text module.
A simpler alternative: average pretrained GloVe embeddings and classify with a linear layer.
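The whole model fits in two steps: mean-pool the word vectors, then score with a linear unit. A sketch with tiny 3-d stand-ins for GloVe vectors (all values hypothetical; real GloVe embeddings are 50-d and up):

```python
# Toy "embeddings" standing in for pretrained GloVe vectors
emb = {
    "great": [0.9, 0.1, 0.0],
    "movie": [0.2, 0.5, 0.3],
    "awful": [-0.8, 0.1, 0.1],
}

def average_embedding(tokens, emb):
    # Mean-pool the word vectors; out-of-vocabulary words are skipped
    vecs = [emb[t] for t in tokens if t in emb]
    n = len(vecs)
    return [sum(v[d] for v in vecs) / n for d in range(len(vecs[0]))]

def linear_score(x, weights, bias):
    # A single linear unit: w . x + b (weights made up for illustration)
    return sum(w * xi for w, xi in zip(weights, x)) + bias

doc = average_embedding(["great", "movie"], emb)
score = linear_score(doc, [1.0, 0.0, 0.0], 0.0)  # score > 0 => positive class
print(doc, score)
```

Averaging throws away word order, which is exactly the trade-off against the AWD-LSTM approach above.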
Self-attention from scratch, positional encoding, multi-head attention. Use BERT & GPT-2 via Hugging Face.
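The "from scratch" part boils down to one formula: softmax(QKᵀ/√d)V. A dependency-free sketch with toy 2-d queries, keys, and values:

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Two tokens; each query matches its own key most strongly (toy numbers)
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = attention(Q, K, V)
print(out)  # each output row is pulled toward its matching value row
```

Multi-head attention runs several of these in parallel on learned projections of Q, K, and V, then concatenates the results.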
Zero-shot image classification, prompt engineering, image-text retrieval, and contrastive learning with CLIP.
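Zero-shot classification with CLIP reduces to a similarity search: encode the image, encode one prompt per class, pick the closest. A sketch of that final step with hand-made 3-d vectors (real CLIP embeddings are 512-d or larger, and come from the model's encoders):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product of the two vectors over their norms
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical, already-encoded embeddings (illustrative values only)
image_emb = [0.9, 0.1, 0.2]
text_embs = {
    "a photo of a cat": [0.8, 0.2, 0.1],
    "a photo of a dog": [0.1, 0.9, 0.3],
}

# Zero-shot classification: pick the prompt whose embedding is most similar
scores = {prompt: cosine(image_emb, emb) for prompt, emb in text_embs.items()}
best = max(scores, key=scores.get)
print(best)
```

Prompt engineering changes only the text side ("a photo of a {label}" vs. bare labels); the same similarity machinery powers image-text retrieval in the other direction.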
Image captioning and VQA with BLIP, zero-shot classification with HuggingFace pipelines.
Train an LSTM caption decoder using frozen CLIP embeddings. Learn the "Show and Tell" architecture.
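At inference time the decoder generates a caption one token at a time, feeding each prediction back in. A toy sketch of that greedy loop, with a lookup table standing in for the LSTM step conditioned on a frozen CLIP embedding (the vocabulary and scores are invented for illustration):

```python
# Toy "decoder": previous token -> next-token scores, standing in for
# an LSTM step conditioned on a frozen CLIP image embedding
next_scores = {
    "<s>": {"a": 0.9, "the": 0.1},
    "a": {"cat": 0.8, "dog": 0.2},
    "cat": {"</s>": 1.0},
}

def greedy_decode(start="<s>", end="</s>", max_len=10):
    # Repeatedly take the highest-scoring next token until end-of-sequence
    tokens, prev = [], start
    for _ in range(max_len):
        nxt = max(next_scores[prev], key=next_scores[prev].get)
        if nxt == end:
            break
        tokens.append(nxt)
        prev = nxt
    return " ".join(tokens)

print(greedy_decode())
```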
Larger hands-on projects to apply your skills
Train a Variational Autoencoder on LFW faces, analyze the latent space with PCA, and explore an interactive Gradio app that lets you manipulate facial features with sliders.
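Two ideas make this project tick: the reparameterization trick, which lets gradients flow through the VAE's sampling step, and latent-space interpolation, which is what the sliders drive. A dependency-free sketch with a 2-d latent space (real VAE latents are learned and much larger):

```python
import math
import random

def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps with eps ~ N(0, 1): sampling becomes a
    # deterministic function of mu and log_var plus external noise
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def interpolate(z1, z2, t):
    # Linear walk through latent space: the idea behind slider-based editing
    return [(1 - t) * a + t * b for a, b in zip(z1, z2)]

rng = random.Random(0)
z = reparameterize([0.0, 1.0], [-2.0, -2.0], rng)  # low variance keeps z near mu
mid = interpolate([0.0, 0.0], [2.0, 2.0], 0.5)
print(z, mid)
```

Decoding points along such an interpolation path is what produces the smooth face morphs in the Gradio app.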