Universität Bonn

Institute of Computer Science

14. May 2024

Colloquium: Research Talk by Dr. Ankit Anand Colloquium: Research Talk by Dr. Ankit Anand on May 17

Colloquium: Research Talk by Ankit Anand
Colloquium: Research Talk by Ankit Anand © Unsplash
Download all images in original size The impression in connection with the service is free, while the image specified author is mentioned.

We cordially invite you to the colloquium of Prof Gill Barequet!

On May 17, 2024, from 10:00 to 11:00 a.m, Ankit Anand, Senior Research Scientist bei Google DeepMind Montreal einen Research Talk (in englischer Sprache) halten.

The research talk is dedicated to the topic "Taking reinforcment learning(RL) to formal world of theorem proving and bringing formal world (code) back to RL".


AI has made great strides in multiple domains including robotics, game playing, biology and climate science etc.. In this talk, we will describe some of our recent attempts to use AI in formal world of theorem proving  and how we also brought back formal methods for improvements in world of reinforcement learning using LLMs.

Firstly, I will describe how we adapted the idea of hindsight experience replay from reinforcement learning to the automated theorem proving domain, so as to use the intermediate data generated during unsuccessful proofs. We show that provers trained in this way can outperform previous machine learning approaches and compete with the state of the art heuristic-based theorem prover E in its best configuration, on the popular benchmarks MPTP2078, M2k and Mizar40. The proofs generated by our algorithm are also almost always significantly shorter than E’s proofs.

Secondly, I will describe our recent work "Code-As-Reward" on how we brought the formal representation to the world of reinforcement learning. Specifically, one key challenge in applying RL to any problem is reward specification. In this work, we use pre-trained large Vision Language models to produce interpretable dense reward functions as code blocks.  This hinges on the strong capability of these large models to produce interpretable code. It not only brings formal specification to reward functions but is also significantly cheap as these code blocks are generated with very few calls to VLMs and can be subsequently used in multiple tasks.

The event is free of charge. Interested individuals are warmly invited to attend!

When? Friday, May 17 2024, 10:00 to 11:00 a.m.

Where? Institute of Computer Science, Friedrich-Hirzebruch-Allee 5, 53113 Bonn, Room 0.016

Michaela Musselmann
Institute of Computer Science
University of Bonn
Phone.: +49 228 73-4502
Mail: musselmann@iai.uni-bonn.de

Wird geladen