Student Projects

If you are interested in working with us, here are some additional projects that we would be happy to work on together!

Arrow of time in algorithmic tasks
The recent success of transformers in natural language processing has led to research on their reasoning capabilities. This line of research usually focuses on how learning occurs in transformers trained from scratch on a specific algorithmic task. Surprisingly, even on a task as simple as addition, training transformers from scratch does not succeed out of the box, and non-trivial modifications are required. These modifications are task-specific and take the form of either modifying the data, for example by reordering it or adding metadata, or modifying model components, such as the positional encoding. For addition in particular, writing the digits in reverse order helps transformers learn the task. In this project, we aim to develop a general training procedure that can handle different algorithmic tasks by considering generalized orderings of the data. The primary objective is to benchmark such a training procedure on various algorithmic tasks and compare it with solutions from the literature.
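As a concrete illustration of the reversed-digit trick mentioned above, a minimal data-generation sketch could look as follows; the function name and the exact string format are hypothetical choices for illustration, not a specific benchmark from the literature.

```python
import random

def make_addition_example(max_digits=3, reverse=True):
    """Generate one addition example as a text string.

    With reverse=True, the digits of both operands and the answer are
    written least-significant first, the reordering that has been
    observed to help transformers learn addition.
    """
    a = random.randint(0, 10**max_digits - 1)
    b = random.randint(0, 10**max_digits - 1)
    fmt = (lambda n: str(n)[::-1]) if reverse else str
    return f"{fmt(a)}+{fmt(b)}={fmt(a + b)}"

# Example output: "321+54=861" encodes 123 + 45 = 168 with reversed digits.
print(make_addition_example())
```

A generalized ordering of the data, as proposed in the project, would replace the simple digit reversal above with other permutations of the input and output tokens.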

Contact person: Oguz Kaan Yüksel

Sparse principal component analysis for interpretability
With the rapid deployment of neural networks, interpretability is becoming increasingly important. Dictionary learning is a simple approach for extracting features from transformers’ hidden representations. In this project, we will study an even simpler approach based on sparse principal component analysis. Our main goal is to evaluate how well such approaches explain the representations of transformers and of neural networks trained with self-supervised objectives.
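As a rough sketch of the kind of analysis involved, sparse PCA can be run directly on a matrix of hidden representations; the data below is a random placeholder and the hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

# Placeholder for hidden representations collected from a transformer,
# e.g. one row per token with the model's hidden dimension as columns.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(2048, 768))

# Sparse PCA extracts components with few non-zero loadings; the sparsity
# level is controlled by the L1 penalty `alpha` (value here is illustrative).
spca = SparsePCA(n_components=32, alpha=1.0, random_state=0)
codes = spca.fit_transform(hidden_states)

# Each row of `components_` is a candidate interpretable direction; its
# non-zero entries indicate which hidden dimensions it reads from.
sparsity = np.mean(spca.components_ == 0)
print(f"fraction of zero loadings: {sparsity:.2f}")
```

Compared with dictionary learning, this keeps the components orthogonal-like and the optimization simpler, which is exactly the trade-off the project would examine.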

Contact person: Oguz Kaan Yüksel

Longer or More?
Language models are trained on large corpora of text whose size is measured by the number of tokens (T), rather than by the classical notion of sample size (S). The token count can be seen as the product of two quantities: the number of documents (D) and the length of each document (L). However, since tokens from the same document are correlated with one another, the i.i.d. assumption central to generalization theory does not hold, and the relationship between (T) and (S) is unclear. In this project, we will study how (D) and (L) influence learning in order to define a notion of effective sample size (S). The main goal will be to identify scaling laws with respect to (D) and (L), rather than (T), using baby language models on real-world data and in synthetic settings such as Markovian languages.
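As a sketch of what identifying such scaling laws might involve, one could fit a simple multiplicative power law in (D) and (L) to measured validation losses; the functional form and the numbers below are purely illustrative assumptions, not results.

```python
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(X, a, alpha, beta, c):
    """Hypothetical power law in the number of documents D and document length L."""
    D, L = X
    return a * D**(-alpha) * L**(-beta) + c

# Placeholder measurements: validation losses of baby language models trained
# on a grid of (D, L); real values would come from the project's experiments.
D = np.array([1e3, 1e3, 1e4, 1e4, 1e5, 1e5])
L = np.array([128, 1024, 128, 1024, 128, 1024])
loss = np.array([4.1, 3.8, 3.4, 3.2, 3.0, 2.9])

# Fit the exponents; comparing alpha and beta indicates whether more documents
# or longer documents drive improvement, i.e. how an effective sample size scales.
params, _ = curve_fit(scaling_law, (D, L), loss, p0=(10.0, 0.1, 0.1, 2.0), maxfev=10000)
a, alpha, beta, c = params
print(f"alpha (documents) = {alpha:.3f}, beta (length) = {beta:.3f}")
```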

Contact person: Oguz Kaan Yüksel