CS25: Transformers United!

Since their introduction in 2017, transformers have revolutionized natural language processing (NLP). They are now finding applications across deep learning, including computer vision (CV), reinforcement learning (RL), generative adversarial networks (GANs), speech, and even biology. Among other things, transformers have enabled the creation of powerful language models such as GPT-3 and were instrumental in DeepMind's recent AlphaFold2, which tackles protein folding.

In this seminar, we examine the details of how transformers work, and dive deep into the different kinds of transformers and how they're applied in different fields. We do this through a combination of instructor lectures, guest lectures, and classroom discussions. We will invite people at the forefront of transformers research across different domains for guest lectures.

Prerequisites: Basic knowledge of deep learning (you must understand attention), or having taken CS224N, CS231N, or CS230.

Lecture recordings are now available online

Instructors

Div Garg

Chetanya Rastogi

Advay Pal

Faculty Advisor

Chris Manning

Logistics

Lectures are on Mondays from 10AM - 11:20AM Pacific Time (460-334)
Class Structure: We will follow a hybrid structure, with some remote and some in-person talks
Contact: If you have any questions about the course, contact us at cs25-aut2122-staff@lists.stanford.edu

Content

The bulk of this class will consist of talks from researchers discussing the latest breakthroughs with transformers and explaining how they apply them in their fields of research. The objective of the course is to bring together ideas about transformers from the ML, NLP, CV, biology, and other communities, understand their broad implications, and spark cross-collaborative research.

The current class schedule is below (subject to change)

Schedule

Date	Description	Course Materials
Mon Sep 20	Introduction to Transformers	Recommended Readings: Attention Is All You Need; The Illustrated Transformer; The Annotated Transformer (Assignment)
Mon Sep 27	Transformers in Language: GPT-3, Codex. Speaker: Mark Chen (OpenAI)	Recommended Readings: Language Models are Few-Shot Learners; Evaluating Large Language Models Trained on Code
Mon Oct 4	Applications in Vision. Speaker: Lucas Beyer (Google Brain)	Recommended Readings: An Image is Worth 16x16 Words (Vision Transformer). Additional Readings: How to train your ViT?
Mon Oct 11	Transformers in RL & Universal Compute Engines. Speaker: Aditya Grover (FAIR)	Recommended Readings: Pretrained Transformers as Universal Computation Engines; Decision Transformer: Reinforcement Learning via Sequence Modeling
Mon Oct 18	Scaling transformers. Speaker: Barret Zoph (Google Brain) with Irwan Bello and Liam Fedus	Recommended Readings: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity; ST-MoE: Designing Stable and Transferable Sparse Expert Models
Mon Oct 25	Perceiver: Arbitrary IO with transformers. Speaker: Andrew Jaegle (DeepMind)	Recommended Readings: Perceiver: General Perception with Iterative Attention; Perceiver IO: A General Architecture for Structured Inputs & Outputs
Mon Nov 1	Self Attention & Non-Parametric Transformers. Speaker: Aidan Gomez (University of Oxford)	Recommended Readings: Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning
Mon Nov 8	GLOM: Representing part-whole hierarchies in a neural network. Speaker: Geoffrey Hinton (UoT)	Recommended Readings: How to represent part-whole hierarchies in a neural network
Mon Nov 15	Interpretability with transformers. Speaker: Chris Olah (AnthropicAI)	Recommended Readings: Multimodal Neurons in Artificial Neural Networks. Additional Readings: The Building Blocks of Interpretability
Mon Nov 29	Transformers for Applications in Audio, Speech and Music: From Language Modeling to Understanding to Synthesis. Speaker: Prateek Verma (Stanford)
