01

Deep Learning Fundamentals

Understand how neural networks learn and predict

02

Computer Vision with CNNs

Seeing the world through convolutional neural networks

03

Generative Models

Creating new data with Variational Autoencoders

04

Natural Language Processing

Teaching machines to understand text

05

Multi-Modal AI

Connecting vision and language together

📓

Hands-on Notebooks

Jupyter notebooks with runnable code examples

L1

Regression to Deep Learning Demo

Generate a spiral dataset, train a small PyTorch network with ReLU/Sigmoid, and visualize decision regions.

  • Spiral Dataset
  • PyTorch MLP
  • Decision Boundaries

L2

MNIST with Fully-Connected Network

Classify MNIST digits with an MLP, reaching ~98% accuracy, with live loss visualization and a confusion matrix.

  • MNIST
  • MLP Classifier
  • Confusion Matrix
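
The confusion matrix at the end is simple to compute by hand. A plain-NumPy sketch of what it counts (sklearn's `confusion_matrix` does the same thing):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=10):
    """cm[i, j] counts samples whose true label is i and prediction is j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy labels for illustration
y_true = np.array([0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 2, 1, 1])
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = np.trace(cm) / cm.sum()   # correct predictions sit on the diagonal
print(accuracy)  # 0.8
```

Off-diagonal cells show which digits get confused with which, something a single accuracy number hides.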

L3

MNIST with CNN

Train a CNN on MNIST for ~99% accuracy. Visualize learned convolution filters and activation maps.

  • CNN Architecture
  • Filter Visualization
  • Activation Maps
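
Underneath `nn.Conv2d`, each learned filter is just a small kernel slid across the image. An illustrative NumPy version of the valid, stride-1 case:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a 2-D kernel
    (what deep-learning convolutions compute, minus padding/stride options)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector applied to a step image
image = np.zeros((5, 5)); image[:, 2:] = 1.0
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
response = conv2d(image, sobel_x)
print(response.shape)  # (3, 3); strongest response where the edge sits
```

Strictly this is cross-correlation; deep-learning libraries skip the kernel flip of textbook convolution.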

L4

Data Augmentation (CIFAR-10)

Compare training with and without augmentation using FastAI and xResNet18 on CIFAR-10.

  • FastAI
  • Augmentation Pipeline
  • Transfer Learning
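
FastAI handles augmentation through its own transform pipeline; the idea behind two classic CIFAR-10 transforms can be sketched directly in NumPy (the function name and padding value are illustrative):

```python
import numpy as np

def random_flip_and_crop(img, pad=4, rng=None):
    """Classic CIFAR-style augmentation: random horizontal flip,
    then a random crop from a zero-padded image."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    if rng.random() < 0.5:
        img = img[:, ::-1]                      # horizontal flip
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)))
    top = rng.integers(0, 2 * pad + 1)          # random crop offsets
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

img = np.arange(32 * 32 * 3, dtype=float).reshape(32, 32, 3)
aug = random_flip_and_crop(img, rng=np.random.default_rng(0))
print(aug.shape)  # (32, 32, 3) — same shape, shifted/mirrored content
```

Because each epoch sees a slightly different version of every image, the network memorizes less and generalizes more.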

L6

Word Embeddings

From tokenization to Word2Vec magic. Explore word analogies, Skip-Gram, and CBOW.

  • Tokenization
  • nn.Embedding
  • Word2Vec Analogies
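
The classic analogy trick, sketched with tiny hand-built vectors (real Word2Vec or GloVe vectors have hundreds of dimensions; these are invented for illustration):

```python
import numpy as np

def analogy(a, b, c, emb):
    """Solve a : b :: c : ? by nearest neighbour to (b - a + c)."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -np.inf
    for word, vec in emb.items():
        if word in (a, b, c):   # exclude the query words themselves
            continue
        sim = vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = word, sim
    return best

# Toy vectors: dimension 0 ~ "royalty", dimension 1 ~ "gender"
emb = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
}
print(analogy("man", "woman", "king", emb))  # queen
```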

L6

Bag of Embeddings Classifier

A simpler alternative: average pretrained GloVe embeddings and classify with a linear layer.

  • GloVe Embeddings
  • Bag of Embeddings
  • Topic Classification
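
The whole model fits in a few lines: average the token vectors, then apply a linear layer. A toy NumPy sketch (the vectors and weights here are made up for illustration, not real GloVe values):

```python
import numpy as np

def bag_of_embeddings(tokens, emb, dim=3):
    """Average the embeddings of known tokens into one document vector."""
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

emb = {"stocks": np.array([1.0, 0.0, 0.0]),
       "market": np.array([0.9, 0.1, 0.0]),
       "goal":   np.array([0.0, 1.0, 0.0]),
       "match":  np.array([0.1, 0.9, 0.0])}

doc = bag_of_embeddings(["stocks", "market", "unknownword"], emb)
# The classifier is then just W @ doc + b; a fixed toy W here:
W = np.array([[1.0, 0.0, 0.0],   # row 0: finance
              [0.0, 1.0, 0.0]])  # row 1: sports
scores = W @ doc
print(scores.argmax())  # 0 (finance)
```

Averaging throws away word order, which is exactly the trade-off this notebook examines against sequence models.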

L7

Transformers & Attention

Self-attention from scratch, positional encoding, multi-head attention. Use BERT & GPT-2 via Hugging Face.

  • Self-Attention
  • BERT & GPT-2
  • Hugging Face
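
The "from scratch" part is smaller than it sounds: single-head scaled dot-product attention is one equation, softmax(Q K^T / sqrt(d_k)) V. A NumPy sketch with illustrative shapes:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax (stabilised by subtracting the row max)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8); each row of weights sums to 1
```

Multi-head attention simply runs several of these in parallel with smaller d_k and concatenates the results.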

L8

CLIP Basics

Zero-shot image classification, prompt engineering, image-text retrieval, and contrastive learning with CLIP.

  • Zero-Shot Classification
  • Prompt Engineering
  • Image-Text Retrieval
  • InfoNCE Loss
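
Contrastive learning with InfoNCE treats each matched image-text pair as the positive among a batch of negatives. A NumPy sketch of the symmetric loss (shapes are illustrative; 0.07 is the commonly cited initial temperature in CLIP):

```python
import numpy as np

def info_nce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched image/text pairs:
    each image should score highest against its own caption, and vice versa."""
    # L2-normalise, as CLIP does before the dot product
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (batch, batch) similarities
    labels = np.arange(len(img))                # diagonal = matched pairs
    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2

rng = np.random.default_rng(0)
img_emb = rng.standard_normal((4, 16))
txt_emb = img_emb + 0.01 * rng.standard_normal((4, 16))  # nearly aligned pairs
loss = info_nce_loss(img_emb, txt_emb)  # small when pairs are well aligned
```

Zero-shot classification reuses the same machinery at inference time: the "captions" are prompts like "a photo of a dog", and the highest-scoring prompt wins.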

L8

Vision-Language Pipelines

Image captioning and VQA with BLIP, plus zero-shot classification with Hugging Face pipelines.

  • BLIP Captioning
  • Visual Question Answering
  • Hugging Face Pipelines

L8

Caption Generator with CLIP

Train an LSTM caption decoder using frozen CLIP embeddings. Learn the "Show and Tell" architecture.

  • CLIP Encoder
  • LSTM Decoder
  • Greedy Decoding
  • Caption Training
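
Greedy decoding itself is model-agnostic: repeatedly ask for the most probable next word until the end token appears. A sketch with a stand-in for the CLIP-conditioned LSTM (the toy model is invented for illustration):

```python
def greedy_decode(step_fn, start_token="<sos>", end_token="<eos>", max_len=20):
    """Greedy caption decoding: at each step feed the sequence so far
    and take the single most probable next word."""
    tokens = [start_token]
    for _ in range(max_len):
        next_token = step_fn(tokens)   # model's top-1 prediction
        if next_token == end_token:
            break
        tokens.append(next_token)
    return tokens[1:]

# Toy "model": a canned caption, standing in for CLIP features -> LSTM -> vocab
caption = ["a", "dog", "on", "grass", "<eos>"]
def toy_step(tokens):
    return caption[len(tokens) - 1]

print(greedy_decode(toy_step))  # ['a', 'dog', 'on', 'grass']
```

Beam search replaces the top-1 pick with the k best partial sequences, usually at a noticeable quality gain for captions.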

🚀

Projects

Larger hands-on projects to apply your skills

VAE

Face Autoencoder

Train a Variational Autoencoder on LFW faces, analyze the latent space with PCA, and explore an interactive Gradio app that lets you manipulate facial features with sliders.

  • VAE
  • LFW Dataset
  • Latent PCA
  • Gradio App
  • Face Generation
View Project →
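
Two pieces of VAE machinery the project leans on, sketched in NumPy: the reparameterization trick, and the closed-form KL term for a diagonal Gaussian posterior (shapes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    which keeps sampling differentiable w.r.t. mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """KL(q(z|x) || N(0, I)) per sample, closed form for a diagonal Gaussian."""
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)

mu = np.zeros((2, 8)); log_var = np.zeros((2, 8))   # posterior already N(0, I)
z = reparameterize(mu, log_var)
print(z.shape, kl_divergence(mu, log_var))  # (2, 8), KL = 0 when q = N(0, I)
```

The training loss is reconstruction error plus this KL term, which pulls the latent space toward a standard normal; that regularity is what makes the slider-based face manipulation in the Gradio app work.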