Research Highlights
Explorations: Some New Problems
Optimizing Policy in Epidemic-Human Behavior Co-evolution: Graph algorithms to identify the best policies under simple models of epidemic spread and behavior flow.
Better approximation algorithms for Immunization and Influence Maximization.
Data-driven Discovery of Natural Laws: Designing special sparse deep-learning networks to identify equations that represent the data (a simple illustrative sketch follows this list).
Simultaneous Learning of Architecture and Parameters: Instead of fixing an architecture and then training its parameters, can gradient descent lead to both simultaneously? [FITML@NeurIPS]
Simplicity-Complexity Gap in Epidemic Models: Given a certain amount of noise, how much do simple models differ from complex ones, in both homogeneous-mixing and network settings?
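As a rough illustration of the equation-discovery direction above, the sketch below uses a classical sparse-regression baseline (sequentially thresholded least squares over a library of candidate terms), not the specialized sparse deep networks we design; the dynamical system, candidate library, and threshold are purely hypothetical.

```python
# Minimal sparse-regression sketch for data-driven equation discovery.
# This is a classical SINDy-style baseline, not the sparse deep networks
# described above; the system, library, and threshold are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Simulate noisy observations of dx/dt = -2*x + 0.5*x**2 (hypothetical system).
x = np.linspace(0.1, 2.0, 200)
dx = -2.0 * x + 0.5 * x**2 + 0.01 * rng.standard_normal(x.size)

# Library of candidate terms: [1, x, x^2, x^3, sin(x)].
library = np.column_stack([np.ones_like(x), x, x**2, x**3, np.sin(x)])
names = ["1", "x", "x^2", "x^3", "sin(x)"]

# Sequentially thresholded least squares: fit, zero out small coefficients, refit.
coef, *_ = np.linalg.lstsq(library, dx, rcond=None)
for _ in range(10):
    small = np.abs(coef) < 0.1          # sparsity threshold (illustrative)
    coef[small] = 0.0
    big = ~small
    if big.any():
        coef[big], *_ = np.linalg.lstsq(library[:, big], dx, rcond=None)

terms = [f"{c:+.2f}*{n}" for c, n in zip(coef, names) if c != 0.0]
print("Recovered equation: dx/dt =", " ".join(terms))
```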
Graph Neural Networks (Applications)
We develop graph neural networks for several applications, including: (i) autism detection [ICASSP 2023], (ii) prediction of the emergence of virus variants [arXiv], and (iii) network creation from surveys.
Infectious Disease Modeling, Forecasting, and Projections
We proposed the SIkJalpha model at the beginning of the COVID-19 pandemic. Over the years, as the pandemic evolved, more complexity was added to capture crucial factors and variables that can assist with projecting desired future scenarios. Throughout the pandemic, multi-model collaborative efforts have been organized to predict short-term outcomes (cases, deaths, and hospitalizations) of COVID-19, Influenza, and RSV, and to produce long-term scenario projections. We have participated in many such efforts: the US Scenario Modeling Hub, the US Forecast Hub, the Europe Scenario Modeling Hub, the Europe Forecast Hub, and the Germany/Poland Forecast Hub. For Influenza forecasting, we proposed a Tree Ensemble model design that utilizes the individual predictors of SIkJalpha to improve performance.
[Paper on evolution of the model][Scenario Modeling at CDC MMWR][PNAS on US Forecast Hub][Influenza paper][Influenza presentation]
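As a rough, self-contained illustration of the tree-ensemble idea, the sketch below feeds the outputs of several individual predictors (synthetic stand-ins, not actual SIkJalpha components) into a gradient-boosted tree regressor that learns how to combine them; all data and hyperparameters are hypothetical.

```python
# Illustrative sketch only: a tree ensemble that combines the outputs of several
# individual predictors into a final forecast. The "predictor outputs" here are
# synthetic stand-ins, not actual SIkJalpha components.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_weeks, n_predictors = 500, 5

# Each column is one individual predictor's forecast; the target is the observed value.
individual_forecasts = rng.normal(size=(n_weeks, n_predictors))
truth = individual_forecasts @ np.array([0.4, 0.3, 0.1, 0.1, 0.1]) \
        + 0.05 * rng.normal(size=n_weeks)

X_train, X_test, y_train, y_test = train_test_split(
    individual_forecasts, truth, test_size=0.2, random_state=0)

# Gradient-boosted trees learn how to weight and combine the individual predictors.
ensemble = GradientBoostingRegressor(n_estimators=200, max_depth=3, learning_rate=0.05)
ensemble.fit(X_train, y_train)
print("Held-out R^2 of the tree ensemble:", round(ensemble.score(X_test, y_test), 3))
```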
Shape-based Representation, Evaluation, Clustering, Classification, and Ensembles
While there is great value in predicting these numerical targets to assess the burden of the disease, we argue that there is also value in communicating the future trend (a description of the shape) of the epidemic. Instead of treating this as a classification problem (one out of n shapes), we define a transformation of numerical forecasts into a shapelet-space representation. In this representation, each dimension corresponds to the similarity of the shape with one of the shapes of interest (a shapelet). We prove that this representation satisfies the property that two shapes one would consider similar are mapped close to each other, and vice versa. With this representation, we define an evaluation measure and a measure of agreement among multiple models. We also define the shapelet-space ensemble of multiple models, which is the mean of the shapelet-space representations of all the models. We show that this ensemble can accurately predict the shape of the future trend for COVID-19 cases. We also show that the agreement between models provides a good indicator of the reliability of the forecast.
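A minimal sketch of the shapelet-space idea follows: each forecast is mapped to a vector of similarities against a few reference shapes, and those vectors are averaged across models to form the ensemble. The shapelets, the centered-cosine similarity, and the forecasts below are illustrative simplifications, not the exact construction from our papers.

```python
# Minimal sketch of a shapelet-space representation (SSR) and ensemble.
# The shapelets and the similarity measure are illustrative.
import numpy as np

# Reference shapes ("shapelets") on a 4-point window.
SHAPELETS = {
    "increase": np.array([0.0, 1.0, 2.0, 3.0]),
    "peak":     np.array([0.0, 2.0, 2.0, 0.0]),
    "surge":    np.array([0.0, 0.5, 1.5, 4.0]),
    "decline":  np.array([3.0, 2.0, 1.0, 0.0]),
}

def ssr(window):
    """Map a window to similarities against each shapelet (cosine on centered values)."""
    w = window - window.mean()
    out = []
    for shape in SHAPELETS.values():
        s = shape - shape.mean()
        denom = np.linalg.norm(w) * np.linalg.norm(s)
        out.append(0.0 if denom == 0 else float(w @ s) / denom)
    return np.array(out)

# Hypothetical 4-week forecasts from three models for the same location.
forecasts = [
    np.array([100.0, 130.0, 170.0, 220.0]),   # model A: steady increase
    np.array([100.0, 140.0, 190.0, 260.0]),   # model B: sharper increase
    np.array([100.0, 115.0, 125.0, 130.0]),   # model C: flattening increase
]

reps = np.stack([ssr(f) for f in forecasts])
ensemble = reps.mean(axis=0)          # shapelet-space ensemble = mean of the SSRs

print("dimensions:", list(SHAPELETS))
print("ensemble SSR:", np.round(ensemble, 2))
print("predicted shape:", list(SHAPELETS)[int(np.argmax(ensemble))])
```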
To address the evaluation of long-term forecasts, we extend this idea. (i) First, we use a moving window and transform each window into our Shapelet Space Representation (SSR), where each dimension represents the similarity of the shape to one of the "shapelets" (shapes of interest, e.g., increase, peak, surge, flat). This results in a matrix representation of the time-series in which each column is the SSR of one window. (ii) Then, given the matrix representations of two trajectories, we use dynamic time warping (DTW) to allow flexibility in the alignment of the time-series and compare shapes in the form of columns of the matrix representations; as a result, similar local trends are aligned before being compared. We call this measure DTW+S. We have already shown that measuring distance with DTW+S results in better clustering and classification.
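Continuing the same illustrative setup, the sketch below computes a DTW+S-style distance: slide a window over each trajectory, stack the per-window SSR vectors into a matrix, and run dynamic time warping in which the local cost is the distance between SSR columns. The shapelets, window length, and trajectories are hypothetical.

```python
# Sketch of DTW+S: compare two trajectories by (1) converting sliding windows to a
# shapelet-space representation (SSR) and (2) aligning the resulting column
# sequences with dynamic time warping. Shapelets, window size, and data are illustrative.
import numpy as np

SHAPELETS = np.array([
    [0.0, 1.0, 2.0, 3.0],   # increase
    [0.0, 2.0, 2.0, 0.0],   # peak
    [0.0, 0.5, 1.5, 4.0],   # surge
    [3.0, 2.0, 1.0, 0.0],   # decline
])

def ssr_matrix(series, window=4):
    """One SSR column (centered-cosine similarity to each shapelet) per sliding window."""
    cols = []
    for i in range(len(series) - window + 1):
        w = series[i:i + window] - series[i:i + window].mean()
        sims = []
        for s in SHAPELETS:
            s = s - s.mean()
            denom = np.linalg.norm(w) * np.linalg.norm(s)
            sims.append(0.0 if denom == 0 else float(w @ s) / denom)
        cols.append(sims)
    return np.array(cols).T                      # shape: (num_shapelets, num_windows)

def dtw_distance(A, B):
    """Classic DTW over column sequences, with Euclidean cost between SSR columns."""
    n, m = A.shape[1], B.shape[1]
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(A[:, i - 1] - B[:, j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two hypothetical weekly trajectories: a wave that arrives earlier vs. later.
early = np.array([10, 30, 80, 150, 120, 60, 30, 20, 15, 12], dtype=float)
late  = np.array([10, 12, 15, 30, 80, 150, 120, 60, 30, 20], dtype=float)

print("DTW+S distance:", round(dtw_distance(ssr_matrix(early), ssr_matrix(late)), 3))
```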
Graph Neural Networks (Theory)
Training and inference for deep GNNs on large graphs are difficult due to computational complexity and the lack of accuracy improvements with deeper layers. Subgraph-based methods exist to address training on large graphs, but they do not apply during inference, making inference the bottleneck. Such methods also do not address the poor accuracy of deep networks due to "oversmoothing". We address the following challenges: (i) developing subgraph-based schemes that apply to both training and inference; (ii) identifying good subgraph-sampling strategies; and (iii) pruning weights to reduce computation during inference.
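A bare-bones sketch of the subgraph (neighbor-sampling) idea is below: a node's representation is computed from a small sampled subgraph, so the same routine is cheap enough to use at both training and inference time. The graph, features, fan-out, and mean aggregation are illustrative stand-ins, not our actual scheme (there are no learned weights here).

```python
# Illustrative neighbor-sampling sketch: compute a node's representation from a
# sampled subgraph so the same procedure works at training and inference time.
# Graph, features, and fan-out are synthetic; this is not our actual scheme.
import numpy as np

rng = np.random.default_rng(0)
num_nodes, feat_dim, fanout = 1000, 16, 10

# Random sparse graph (adjacency lists) and node features, for illustration.
adj = [rng.choice(num_nodes, size=rng.integers(5, 50), replace=False).tolist()
       for _ in range(num_nodes)]
features = rng.standard_normal((num_nodes, feat_dim))

def sampled_neighbors(node, k=fanout):
    """Sample at most k neighbors instead of using the full neighborhood."""
    nbrs = adj[node]
    if len(nbrs) <= k:
        return nbrs
    return rng.choice(nbrs, size=k, replace=False).tolist()

def two_hop_embedding(node):
    """Mean-aggregate features over a sampled 2-hop subgraph rooted at `node`."""
    hop1 = sampled_neighbors(node)
    hop1_embs = []
    for u in hop1:
        hop2 = sampled_neighbors(u)
        # First layer: combine u's features with its sampled neighbors' mean.
        agg = features[hop2].mean(axis=0) if hop2 else np.zeros(feat_dim)
        hop1_embs.append(0.5 * (features[u] + agg))
    agg1 = np.mean(hop1_embs, axis=0) if hop1_embs else np.zeros(feat_dim)
    # Second layer: combine the root's features with its sampled neighbors' embeddings.
    return 0.5 * (features[node] + agg1)

# Because only a small sampled subgraph is touched, the same routine is cheap
# enough to run per node at inference as well as during training.
print("embedding of node 0:", np.round(two_hop_embedding(0)[:4], 3))
```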
Past Research
Prior to my faculty position, I worked on a range of problems, from theoretical to experimental to real-world deployments, involving a mix of Algorithms, Network Science, and Data Mining. The figure summarizes my past research. Please see my Publications page or contact me to learn more about my contributions to these problems.