an attempt at a progress log and a glimpse into grad school life.

March 2022
March 6 Finished setting up my website and applying to a fellowship. Reviewed Math 626 material, worked out all of the assignment besides Q.1., which seems to be presenting issues.
March 7 Outside of a day full of classes and talks, spent about 3 hours ironing out Q.1., which proved to be non-trivial, and discovered an error in my solution to Q.4. Eventually talked to a student who said they saw the first passage decomposition being used in Q.4, which worked out very nicely. Read about some practical issues with RL research to mentally prepare myself before launching back into reading Sutton/Barto and following Silver's RL course.
March 8 Wrote up and submitted my assignment, attended Math 626. We constructed the stationary distribution for an irreducible Markov chain with a positive recurrent state. Started talking about periods of Markov chains and aperiodic Markov chains. Completed Chapter 3 of Sutton/Barto and reviewed chapter 2, added my questions to Workflowy. Alekh Agarwal et al.'s RL theory book seems way more mathematically sound and attractive. Once I get a general overview of things from Sutton/Barto and Silver, I'll go to that book.
March 9, 10 and 11 Corrected mistakes in my Math 626 assignment, almost finished the Math 597 assignment due next week. Read part of Rishi Sonthalia's TreeRep paper. Love how they're leveraging the finite combinatorial possibilities in a tree stemming from Gromov products! Will finish this over the weekend. Read the bulk of chapter 1 of the AJKS RL theory book, finished chapter 4 of Sutton/Barto and went through some of chapter 5. Glanced at Chen and Poor's paper on learning mixtures of linear dynamical systems sample efficiently, which seems like a fun problem! Also found Vidyasagar's notes on RL, which seem nice. Talked to a few more people about my changing interests, got more advice and got my exploration of RL and statistical learning theory greenlit by one more relevant entity in the math department.
March 12-21 Completed and submitted the 597 assignment, read more AJKS and Sutton/Barto, took a segue into importance sampling and statistical theory (Slutsky's theorem, the delta method, etc.). Read Chapter 4 and about half of Chapter 5 of Sutton/Barto. Completed implementing the GUI for py_knots. Added an implementation of Casson-Gordon invariants. Thought more about using Chen and Poor's ideas for learning confounded MDPs (met the undergrad involved in the project). Read about the optimal solution to the gambling problem (from How To Gamble If You Must, adapted from the book on inequalities for stochastic processes).
March 22-24 Completed and submitted another 597 assignment. Completed chapter 5 of Sutton/Barto and formulated some questions I had with greater precision. Read the proof of MCES convergence for stochastic feed-forward MDPs as given by Che Wang and Keith Ross. The idea is nice, but I'd also like to understand the counterexamples to the general convergence of MCES someday. Read Rishi Sonthalia's TreeRep paper, which was fun! Graded Math 116 exams. Contacted a few more people for advice about switching into RL/ML and data analysis work. Talked to my housemate Max about fleshing out my idea for a "six degrees of wikipedia" version of semantle. Removed Casson-Gordon invariants from py_knots, because of various technicalities in their use and the lack of a wide audience for them.
March 25-April 2 Explained Rishi's TreeRep paper to Alex, derived a possible distortion bound for TreeRep based on Gromov's tree approximation algorithm from 1987 (which Alex pointed me to). Worked out some initial ideas for a related problem that Rishi told me about. Completed Part I of Sutton and Barto (so chapters 6, 7 and 8), which covers all of their treatment of tabular RL. Watched David Silver's Lectures 1-5. Looked up the proof of convergence of generalized Q-learning in the GMDP paper by Littman and Szepesvari, tried to pin down the bottlenecks for convergence rates. Seems to involve convergence rates for the Robbins-Monro type fixed point approximation algorithm. Is that the bottleneck for convergence rates? Looked up a few other early papers in RL. Completed another 597 assignment. Added clarifications in an earlier assignment and corrected some mistakes in the new one. Prepared and gave my talk on entropy in topological and measure-preserving dynamical systems.
April 2022
April 3-April 11 Completed Chapters 9 and 10 of Sutton/Barto along with Lecture 6 of Silver's course. Attended both the talks in the Tuesday virtual RL theory seminar. Lots of questions about agnostic questions for RL, stability of offline RL, etc. Completed 2 more 597 assignments (done with all 597 assignments now). Hopefully solved most of the 626 assignment, up to smoothing out some details. Will write it down now. Reviewing 626 as well. Briefly read about differential privacy. Not a supremely productive week for my RL reading.
April 12-April 30 (morning) Completed chapters 11 and 13 of Sutton/Barto and skimmed chapter 12 (more or less completing Parts I and II of the book), along with lecture 7 of Silver's course and Emma Brunskill's tutorial on offline RL. Read through some of Csaba Szepesvari's notes from his RL theory graduate course. Completed and submitted the last 626 assignment, took the 597 exam. Wrapped up teaching and grading, proctored the Math 116 final and graded finals with other instructors. Met Prof. Tewari, read half of the NeurIPS 2016 paper on learning mixtures of Markov chains from 3-trails. I have a question about a certain sample complexity bound for estimating a key matrix used in their algorithm (I think it can be reduced?). Read parts of William Hamilton's graph representation learning book to get introduced to GNNs. Read the first bottlenecks and over-squashing paper for applying to LOGML, reading the second one now (on stochastic discrete Ricci flow). Super excited! Skimming some introductions to optimal transport to better understand Wasserstein distance and how it relates to discrete notions of Ricci curvature (like Ollivier curvature).
May 2022 Unaccounted for, sorry. Lost track of my website.
June 2022
June 1-June 20 Unaccounted for. Too busy to update this website.