Topological data analysis

Analysing UNGA votes, 1945-2015.

Introduction

  • Topological data analysis is a way to find “higher order” structure in data by not just connecting pairs of similar datapoints, but also keeping track of how these pairs build up to groups of three, four, etc.
  • This can help discover natural clusters in data, reveal how closely tied the datapoints within a cluster are, and provide ways to quantify the answers to such questions using algebraic topology.
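
To make the idea of pairs building up to groups of three concrete, here is a minimal sketch of one common construction, the Vietoris-Rips complex. It is only an illustration and not the method used in the project: the function name `rips_complex`, the scale parameter `eps`, and the toy points are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def rips_complex(points, eps):
    """Return (edges, triangles) of a Vietoris-Rips complex at scale eps.

    Hypothetical helper for illustration only.
    """
    n = len(points)
    # Connect every pair of points that lie within eps of each other.
    edges = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= eps
    }
    # A triple becomes a triangle only when all three of its pairs are connected.
    triangles = [
        (i, j, k)
        for i, j, k in combinations(range(n), 3)
        if (i, j) in edges and (i, k) in edges and (j, k) in edges
    ]
    return sorted(edges), triangles


# Toy data: three mutually close points, plus a fourth that is close to only one of them.
points = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (2.0, 0.0)]
edges, triangles = rips_complex(points, eps=1.1)
print(edges)      # [(0, 1), (0, 2), (1, 2), (1, 3)]
print(triangles)  # [(0, 1, 2)]
```

Points 0, 1 and 2 form a tightly knit group (a filled triangle), while point 3 hangs off the group by a single edge. Plain pairwise clustering would not distinguish these two kinds of membership.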

What I did

  • In a team, we analysed United Nations General Assembly votes over the years using the Kepler Mapper.
  • This involved some non-trivial preprocessing and cleaning of a dataset of raw UN General Assembly votes.
  • I learnt that topological data analysis is better viewed as a sophisticated version of clustering with qualitative input, as opposed to a lot of fancy higher order topology.
  • Since this involved choosing some hyperparameters, we statistically verified our results to make sure that parameter tuning had not artificially manufactured them through p-hacking or other unsound methods.
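
The Kepler Mapper is an implementation of the Mapper algorithm. The sketch below is a toy, pure-Python version of that algorithm on one-dimensional data, meant only to show where the hyperparameters enter and why the output depends on them. It does not use the Kepler Mapper API, and the names `mapper_1d`, `n_intervals`, `overlap` and `link_dist`, as well as the toy values, are assumptions made for this example rather than anything from the project.

```python
from itertools import combinations


def mapper_1d(values, n_intervals, overlap, link_dist):
    """Toy Mapper on 1-D data, using each value as its own lens.

    Returns (nodes, edges): nodes is a list of sets of point indices,
    edges lists pairs of node indices whose clusters share a point.
    """
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals
    pad = length * overlap / 2  # widen each interval on both sides
    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        # Points whose lens value falls in this cover element, sorted by value.
        members = sorted(
            (i for i, v in enumerate(values) if a <= v <= b),
            key=lambda i: values[i],
        )
        # Single-linkage clustering in 1-D: split wherever a gap exceeds link_dist.
        cluster = []
        for i in members:
            if cluster and values[i] - values[cluster[-1]] > link_dist:
                nodes.append(set(cluster))
                cluster = []
            cluster.append(i)
        if cluster:
            nodes.append(set(cluster))
    # Two nodes are joined when their clusters have at least one point in common.
    edges = [
        (p, q)
        for p, q in combinations(range(len(nodes)), 2)
        if nodes[p] & nodes[q]
    ]
    return nodes, edges


values = [0, 1, 2, 3, 4, 5, 6, 10, 11, 12]

# Same data and cover, two different clustering thresholds.
coarse_nodes, coarse_edges = mapper_1d(values, n_intervals=3, overlap=0.5, link_dist=1.5)
fine_nodes, fine_edges = mapper_1d(values, n_intervals=3, overlap=0.5, link_dist=0.5)
print(len(coarse_nodes), coarse_edges)  # 3 [(0, 1)]
print(len(fine_nodes), fine_edges)      # 13 [(3, 6), (4, 7), (5, 8)]
```

Changing a single threshold turns a three-node graph into a thirteen-node one on identical data, which is why conclusions drawn from a Mapper graph should be checked for stability across parameter choices before they are trusted.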

Presentation Slides

Link here.