Publications

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

Published in CVPR, 2025

We introduce FRAMES-VQA, a benchmark for evaluating robust fine-tuning strategies for visual question answering (VQA) under diverse multi-modal distribution shifts. Using ten existing VQA datasets categorized into in-distribution (ID), near-OOD, and far-OOD scenarios, we systematically analyze the impact of uni-modal, multi-modal, and adversarial shifts. Our study compares existing robust fine-tuning methods, quantifies distribution shifts using Mahalanobis distance, and explores the interactions between uni- and multi-modal shifts, with the aim of informing the development of more robust VQA models.

Download here

Directional Gradient Projection for Robust Fine-tuning of Foundation Models

Published in ICLR, 2025

This paper introduces Directional Gradient Projection (DiGraP), a novel layer-wise trainable method that incorporates directional information from gradients to bridge regularization and multi-objective optimization. Beyond demonstrating the method on image classification, we also extend the evaluation of robust fine-tuning to multi-modal settings.

Download here

Rethinking Weight Decay for Robust Fine-Tuning of Foundation Models

Published in NeurIPS, 2024

This paper introduces Selective Projection Decay (SPD), a weight decay technique that regularizes only selected layers to balance fitting new data against retaining pre-trained knowledge, improving generalization and robustness when fine-tuning foundation models.

Download here

From Local to Global: Spectral-Inspired Graph Neural Networks

Published in NeurIPS GLFrontiers Workshop, 2022

This paper addresses over-smoothing and over-squashing in deep GNNs by proposing PowerEmbed, a normalization technique for message-passing algorithms that encodes global spectral information, inspired by spectral embeddings.

Download here