Chinmaya Kausik

Mathematics Ph.D. Student, University of Michigan. Resume


Hi there! I’m Chinmaya Kausik, a 5th-year mathematics Ph.D. candidate at UMich working on sequential decision-making under uncertainty, ranging across reinforcement learning, bandits, RLHF, and LLM agents. I am co-advised by Prof. Ambuj Tewari and Prof. Martin Strauss.

I design and implement principled algorithms and agents, and provide theoretical and empirical guarantees on their performance. My work spans reinforcement learning, bandits, RLHF, and the design and post-training of LLM agents. I also work on personal projects involving other aspects of sequence models such as LLMs, transformers, and state space models.

You can find my resume at this link. Check out my papers, projects, and personal interests!

What do I care about, broadly?
  • Tackling tangible, real-world questions with a principled mathematical approach. These days, my PhD research focuses on sequential decision-making in various settings: offline-to-online transfer, partial observability/latent information, and non-standard feedback and reward models. I also have side projects and internship research in deep learning, LLM agents, transformers, Bayesian inference, and more. On the other hand, much of my undergraduate background was in geometry, topology, and dynamics, with work in computer-assisted topology and geometry.
  • Increasing accessibility to and within higher mathematics, and creating communities where ideas cross-pollinate and people pull each other up. I started the Stats, Physics, Astronomy, Math (SPAM) graduate student social initiative at the University of Michigan, and I co-founded and co-organize Monsoon Math Camp. I have also been involved in building and expanding other mathematical communities, such as platforms for the PolyMath REU, DRP programs, and the undergraduate math organization at IISc.
What am I doing these days?
  • Working on how we can robustly combine offline models and online interaction, in contrast to combining offline data and online interaction. With pretrained models (LLMs, foundation models for protein folding, etc.) readily available but lacking performance guarantees, we need principled sequential decision-making algorithms that can safely leverage such models.
  • Continuing the LLM agent research from my Netflix internship, where we designed a generic retrieval/memory agent; we are now formalizing the problem to design more informed experiments.
  • Writing a paper based on my internship at Microsoft on the advertiser optimization team under Ajith Moparthi! I designed and implemented a fast algorithm for updating the models used for advertiser bidding.

  • Organizing an interdepartmental social initiative, SPAM (Statistics, Physics, Astronomy, Mathematics).
  • Fleshing out ideas for more academic communities like Monsoon Math.
What do I want to learn about/do in the future?

primary goals

  • Conduct experiments on the efficacy of sequence model architectures with explicit separate latent states (TRMs, HRMs, the free transformer), and conduct an extensive interpretability study of their latent states.

  • Work on designing an agent/model to automatically annotate graded UMich calculus answer sheets, focusing on preventing hallucinations.

  • Work on a large scale applied recommender systems project using the latent bandit algorithms that I designed (LOCAL-UCB and ProBALL-UCB).

side-quests

  • Design Codenames bots, both at the agent level and by training them from scratch, with separate as well as combined spymasters and guessers.
  • Explore the nuances of implementing various RL algorithms in simulated motion settings.

news

Aug 12, 2025 I recently finished a Machine Learning Research Internship at Jane Street, where I worked on two projects and attended several classes and activities related to modeling and trading! In my first project, I used text data to predict returns, focusing on overfitting and memorization issues. In my second project, I trained foundation models on order-by-order data.
May 28, 2025 I have received the Edward S. & Amanda C. Everett Memorial Scholarship from the UMich math department.
May 01, 2025 🎉 Our paper “Leveraging Offline Data in Linear Latent Contextual Bandits” has been accepted to ICML 2025! This is joint work with Kevin Tan and my advisor, Ambuj Tewari.
Feb 10, 2025 I have started a Machine Learning Research Internship in the Machine Learning and Inference Research (MLIR) team at Netflix! I will be working under Nathan Kallus and Adith Swaminathan on post-training LLM agents to handle long contexts. Excited to push the envelope on LLM agent capabilities!
Jan 22, 2025 🎉 Our paper “A Theoretical Framework for Partially Observed Reward-States in RLHF” has been accepted to ICLR 2025! This is joint work with Mirco Mutti, Aldo Pacchiano, and my advisor Ambuj Tewari.