Our research aims to identify the distance scale of fundamental interactions and to understand how other scales emerge. A model system that our group often studies is the strong force, which displays complex emergent quantum phenomena at high energies in the form of jets. Jets are collimated sprays of particles that result from high-energy quarks and gluons. These objects are challenging to measure experimentally and difficult to analyze statistically. Jets and many other phenomena in fundamental physics naturally live in high-dimensional phase spaces that are impossible to explore with traditional methods. We call this the hyper challenge: going beyond multivariate analysis to discover new patterns by treating data holistically in their natural high dimensionality. Complex detectors measure these data using the equivalent of hyperspectral cameras, and new methods have the potential to qualitatively change data analysis for high energy physics and beyond. This work spans two areas of research:
Data analysis in fundamental physics experiments relies heavily on complex simulations to connect fundamental theories to observable quantities. These simulations span an impressive dynamic range in energy and encode a vast set of formal and phenomenological physics models. However, because of the hyper challenge, we often do not know the probability density of the data explicitly. Instead, we can sample from our simulators to produce realistic data. Likelihood-free learning is a set of tools for making full use of our data without ever explicitly constructing their probability density.
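One common way to work with samples instead of densities is the classifier-based likelihood-ratio trick, sketched below as an illustration rather than as a description of any specific analysis. The idea is that a classifier f(x) trained to separate samples from two simulator settings satisfies f/(1 - f) ≈ p1(x)/p0(x), so the density ratio is obtained without writing down either density. Everything in the sketch is an assumption made for illustration: the hypothetical `simulate` function is a one-dimensional Gaussian toy standing in for a real simulator, and `fit_logistic` is a plain gradient-descent logistic regression standing in for a deep network.

```python
import numpy as np


def simulate(mu, n, rng):
    """Hypothetical toy simulator: we may sample from it but treat its density as unknown."""
    return rng.normal(loc=mu, scale=1.0, size=n)


def fit_logistic(x, y, steps=2000, lr=0.5):
    """Fit f(x) = sigmoid(w * x + b) by gradient descent on the mean cross-entropy."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        grad = p - y  # derivative of the cross-entropy with respect to the logit
        w -= lr * np.mean(grad * x)
        b -= lr * np.mean(grad)
    return w, b


rng = np.random.default_rng(0)
x0 = simulate(0.0, 20000, rng)  # samples from simulator setting 0
x1 = simulate(1.0, 20000, rng)  # samples from simulator setting 1

x = np.concatenate([x0, x1])
y = np.concatenate([np.zeros_like(x0), np.ones_like(x1)])
w, b = fit_logistic(x, y)


def log_ratio(x_new):
    """Estimated log p1(x) / p0(x): the logit of the trained classifier."""
    return w * x_new + b
```

In this toy the exact answer is known, log p1(x)/p0(x) = x - 1/2, so the fitted `w` and `b` can be compared with 1 and -1/2. For real simulators no such closed form exists, which is the situation the classifier is meant to handle.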
While simulations are a critical aspect of analysis in fundamental physics experiments, many science questions require simulation-independent methods. This need arises when simulations are too expensive, intractable, or unreliable, which is often the case when probing extreme regions of phase space. In these situations, the strategy shifts from using simulation to learning directly from data. Due to the size, high dimensionality, or quantum nature of experimental datasets, it is often not possible to know the generative process for a particular datum. As a result, one must go beyond traditional supervised methods for hypervariate analysis with deep learning. To this end, we are researching label-free learning methods that can deploy less-than-supervised techniques directly on unlabeled data.
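As one illustration of a less-than-supervised technique, the sketch below follows the idea of classification without labels: a classifier is trained only to tell apart two mixed samples with different (and unknown to the classifier) signal fractions, and the resulting score is then useful for separating signal from background. This is an example chosen for illustration, not necessarily the method used in any particular analysis. The two-dimensional Gaussian toy events, the signal fractions of 0.8 and 0.2, and the helper names `draw_mixture`, `fit_logistic`, and `auc` are all assumptions. Per-event truth is used only at the end, to evaluate the result.

```python
import numpy as np


def draw_mixture(n, signal_fraction, rng):
    """Toy 2D events. Returns features and the hidden truth (kept for evaluation only)."""
    is_signal = rng.random(n) < signal_fraction
    x = rng.normal(size=(n, 2))
    x[is_signal] += 1.5  # signal events are shifted relative to background
    return x, is_signal


def fit_logistic(x, y, steps=3000, lr=0.3):
    """Logistic regression by gradient descent on the mean cross-entropy."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        grad = p - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def auc(scores, is_signal):
    """Area under the ROC curve from ranks (Mann-Whitney U); assumes no tied scores."""
    ranks = np.argsort(np.argsort(scores)) + 1
    n_sig = int(is_signal.sum())
    n_bkg = len(scores) - n_sig
    return (ranks[is_signal].sum() - n_sig * (n_sig + 1) / 2) / (n_sig * n_bkg)


rng = np.random.default_rng(1)

# Two mixed samples; the training target is only "which sample", never signal vs background.
xa, _ = draw_mixture(20000, 0.8, rng)  # signal-rich mixture
xb, _ = draw_mixture(20000, 0.2, rng)  # signal-poor mixture
x_train = np.concatenate([xa, xb])
y_train = np.concatenate([np.ones(len(xa)), np.zeros(len(xb))])
w, b = fit_logistic(x_train, y_train)

# Evaluation only: check how well the mixture-trained score separates true signal from background.
x_test, truth = draw_mixture(10000, 0.5, rng)
cwola_auc = auc(x_test @ w + b, truth)
```

The design choice to note is that the training target is sample membership, which is always available in data, while the quantity of interest (signal versus background) is never supplied to the classifier. An AUC well above 0.5 on the labeled test events indicates that the mixture-trained score carries signal-versus-background information.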