Spring 2025

Lectures

Introduction: Learning word and sentence representations
Martha Lewis. 2025-04-01.
Abstract

In this introductory lecture I will give an overview of the course and we will discuss learning word and sentence representations from text.

Slides Further reading

Samuel R Bowman, Gabor Angeli, Christopher Potts, and Christopher D Manning. A large annotated corpus for learning natural language inference. arXiv preprint arXiv:1508.05326, 2015.
Alexis Conneau and Douwe Kiela. SentEval: An evaluation toolkit for universal sentence representations. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), 2018.
Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, and Antoine Bordes. Supervised learning of universal sentence representations from natural language inference data. arXiv preprint arXiv:1705.02364, 2017.

Overview of the research projects
Martha Lewis. 2025-04-04.
Abstract

In this session we will give an overview of the research projects and some general advice on conducting research.

Slides

Attention and transformers
Ivo Verhoeven. 2025-04-08.
Abstract

In this session we will introduce attention and transformer architectures.

Slides Further reading

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Attention Is All You Need. In Proceedings of NIPS 2017.
A very helpful blog post explaining the transformer architecture.
Visualization of attention heads for BERT: https://github.com/jessevig/bertviz

Seminar: The BERT model
Martha Lewis. 2025-04-11.
Abstract

In this session we will discuss the BERT model.

Discussion

In this session we will discuss the following papers:
Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL 2019.
Ian Tenney, Dipanjan Das, Ellie Pavlick. 2019. BERT Rediscovers the Classical NLP Pipeline. In Proceedings of ACL 2019.

Seminar: Model pruning and modularity
Martha Lewis. 2025-04-15.
Abstract

In this session we will discuss recent techniques for structured and unstructured pruning and finding task-specific subnetworks in Transformer models.

Further reading

Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. 2020. Gradient surgery for multi-task learning. arXiv preprint arXiv:2001.06782.

Discussion

In this session we will discuss the following papers:
Paul Michel, Omer Levy, Graham Neubig. Are Sixteen Heads Really Better than One? In Proceedings of NeurIPS 2019.
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin. The Lottery Ticket Hypothesis for Pre-trained BERT Networks. In Proceedings of NeurIPS 2020.

Multilingual models
Martha Lewis. 2025-04-22.
Abstract

In this session we will discuss learning multilingual word and sentence representations.

Slides

Seminar: LLMs: Instruction-tuning and prompting
Martha Lewis. 2025-04-25.
Abstract

In this session we will discuss recent research on generative LMs, prompting, instruction-tuning and in-context learning.

Discussion

In this session we will discuss the following papers:
Sanh et al., 2022. Multitask Prompted Training Enables Zero-Shot Task Generalization. In Proceedings of ICLR 2022.
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. In Proceedings of NeurIPS 2022.

Seminar: Bias in NLP models
Vera Neplenbroek. 2025-05-06.
Abstract

In this session we will discuss research on bias in NLP models, including how to diagnose it and how to de-bias models.

Discussion

In this session we will discuss the following papers:
Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, and Samuel Bowman. 2022. BBQ: A hand-built bias benchmark for question answering. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2086–2105, Dublin, Ireland. Association for Computational Linguistics.
Flor Miriam Plaza-del-Arco, Amanda Cercas Curry, Alba Curry, Gavin Abercrombie, and Dirk Hovy. 2024. Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7682–7696, Bangkok, Thailand. Association for Computational Linguistics.

Seminar: In-context learning
Martha Lewis. 2025-05-09.
Abstract

In this session we will discuss recent research on generative LMs, prompting, instruction-tuning and in-context learning.

Discussion

In this session we will discuss the following papers:
Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. 2022. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? In Proceedings of EMNLP 2022.
Ziqian Lin and Kangwook Lee. 2024. Dual operating modes of in-context learning. In Proceedings of the Forty-first International Conference on Machine Learning.

Seminar: Language and Vision
Martha Lewis. 2025-05-13.
Abstract

In this session, we will discuss research on joint modelling of language and vision.

Discussion

In this session we will discuss the following papers:
Wai Keen Vong, Wentao Wang, A. Emin Orhan, and Brenden M. Lake. 2024. Grounded language acquisition through the eyes and ears of a single child.
Fu et al. 2024. BLINK: Multimodal Large Language Models Can See but Not Perceive.

Lecture: Visual Storytelling
Aditya Surikuchi. 2025-05-16.
Abstract

In this session, we will discuss recent research on Visual Storytelling.

Seminar: (Mechanistic) Interpretability in NLP
Martha Lewis. 2025-05-20.
Abstract

In this session we will discuss research on the interpretability of LLM computations.

Discussion

In this session we will discuss the following papers:
Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson. 2023. Dissecting Recall of Factual Associations in Auto-Regressive Language Models. In Proceedings of EMNLP 2023.
Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West. 2024. Do Llamas Work in English? On the Latent Language of Multilingual Transformers. arXiv preprint arXiv:2402.10588.

Project presentations
Martha Lewis, Vera Neplenbroek, and Ivo Verhoeven. 2025-05-23.
Abstract

In this session, you will present the results of your research projects.