Anton Schäfer

I work on machine learning. I enjoy challenging technical problems, and I like building and understanding things. I study Computer Science at ETH Zurich (BSc 2017–2021, MSc 2021–present), with a brief stay at MIT (2019).

In the past, I have worked on NLP, graph machine learning, reinforcement learning, and learning heuristics for combinatorial optimization problems. These days, I am excited about large models and their applications.



Current Projects

I am mostly researching character-level language models for my Master's thesis under the guidance of Imanol Schlag, Tiago Pimentel, and Thomas Hofmann.

I am also contributing to the Swiss AI Initiative project on large multimodal models, and I am exploring ideas around leveraging negative samples to improve LLM reasoning capabilities.

Feel free to contact me about these topics or anything else!



Previous Employment

Apple. I worked on novel ideas with diffusion models under the guidance of Markus Rempfler and Thomas Deselaers.

cognote.ai. I co-founded cognote.ai with Nils Blach, Nils Krüger, and Oliver Rausch. We worked on automating medical documentation via structured conversation summarization using speech recognition and deep NLP.

Oracle Labs. I worked on explainable graph machine learning for ad fraud detection with Valentin Venzin and Rhicheek Patra.

ETH Zurich. I held tutorials as a TA for the Discrete Mathematics lecture by Ueli Maurer.