ICLR 2025 Acceptance
Our paper on RLHF under intermediate feedback and partial observability of rewards, "A Theoretical Framework for Partially Observed Reward-States in RLHF", has been accepted to ICLR 2025! This is joint work with Mirco Mutti, Aldo Pacchiano, and my advisor, Ambuj Tewari.