Content

Since their introduction in 2017, transformers have revolutionized Natural Language Processing (NLP). They are now finding applications across deep learning, from computer vision (CV), reinforcement learning (RL), and generative adversarial networks (GANs) to speech and even biology. Among other things, transformers have enabled the creation of powerful language models like GPT-3 and were instrumental in DeepMind's recent AlphaFold2, which tackles protein folding.

In this seminar, we examine the details of how transformers work and dive deep into the different kinds of transformers and how they are applied in different fields. We do this through a combination of instructor lectures, guest lectures, and classroom discussions, and we will invite people at the forefront of transformer research across different domains to deliver the guest lectures.

The bulk of this class will consist of talks from researchers discussing the latest breakthroughs with transformers and explaining how they apply them to their fields of research. The objective of the course is to bring together ideas on transformers from the ML, NLP, CV, biology, and other communities, understand their broad implications, and spark cross-collaborative research.
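
As a small taste of the mechanics covered early in the course, here is a minimal, illustrative sketch of scaled dot-product attention, the core operation introduced in "Attention Is All You Need". The function name, shapes, and random data below are assumptions for illustration only and are not taken from any course material.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d_model); returns (seq_len, d_model)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                                    # weighted sum of value vectors

# Example: self-attention over 4 tokens with 8-dimensional embeddings
x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```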

Schedule

The current class schedule is below (subject to change).


Jan 10: Introduction to Transformers
Speaker: Andrej Karpathy
Recommended Readings:
  1. Attention Is All You Need
  2. The Illustrated Transformer
  3. The Annotated Transformer
Jan 17: Language and Human Alignment
Speaker: Jan Leike (OpenAI)
Recommended Readings:
  1. ChatGPT
  2. InstructGPT
  3. Language Models are Few-Shot Learners (GPT-3)
Jan 24: Emergent Abilities and Scaling in LLMs
Speaker: Jason Wei (Google Brain)
Recommended Readings:
  1. Emergent Abilities of Large Language Models
  2. Chain of Thought Prompting Elicits Reasoning in Large Language Models
  3. Scaling Instruction-Finetuned Language Models
Jan 31: Strategic Games
Speaker: Noam Brown (FAIR)
Recommended Readings:
  1. Human-level play in the game of Diplomacy by combining language models with strategic reasoning
  2. Modeling Strong and Human-Like Gameplay with KL-Regularized Search
  3. No-Press Diplomacy from Scratch
Feb 7: Robotics and Imitation Learning
Speaker: Ted Xiao (Google Brain)
Recommended Readings:
  1. RT-1: Robotics Transformer for Real-World Control at Scale
  2. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
  3. Inner Monologue: Embodied Reasoning through Planning with Language Models
Feb 14: Common Sense Reasoning
Speaker: Yejin Choi (U. Washington / Allen Institute for AI)
Recommended Readings:
  1. Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
  2. Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
  3. Can Machines Learn Morality? The Delphi Experiment
Feb 21: Biomedical Transformers
Speaker: Vivek Natarajan (Google Health AI)
Recommended Readings:
  1. Large Language Models Encode Clinical Knowledge
  2. ProtNLM: Model-based Natural Language Protein Annotation
  3. Effective gene expression prediction from sequence by integrating long-range interactions
Feb 28: In-Context Learning & Faithful Reasoning
Speakers: Stephanie Chan (DeepMind) & Antonia Creswell (DeepMind)
Recommended Readings:
  1. Data Distributional Properties Drive Emergent In-Context Learning in Transformers
  2. Faithful Reasoning Using Large Language Models
  3. Language models show human-like content effects on reasoning
Mar 7: Neuroscience-Inspired Artificial Intelligence
Speakers: Trenton Bricken (Harvard/Redwood Center for Theoretical Neuroscience/Anthropic) & Will Dorrell (UCL Gatsby Computational Neuroscience Unit/Stanford)
Recommended Readings:
  1. Attention Approximates Sparse Distributed Memory
  2. The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation
  3. Relating transformers to models and neural representations of the hippocampal formation
Additional Readings:
  1. Sparse Distributed Memory is a Continual Learner
  2. Sparse Distributed Memory and Related Models
  3. How to build a cognitive map
Mar 14: Wrap Up
Speaker: TBA