Publications

Mimicking or Reasoning: Rethinking Multi-Modal In-Context Learning in Vision-Language Models

Published in ICCV-W, 2025

Vision-language models (VLMs) are widely assumed to exhibit in-context learning (ICL), a property similar to that of their language-only counterparts. While recent work suggests VLMs can perform multimodal ICL (MM-ICL), studies show that they often rely on shallow heuristics, such as copying or majority voting, rather than true task understanding. We revisit this assumption by evaluating VLMs under distribution shifts, where support examples are drawn from a different dataset than the query. Surprisingly, performance often degrades as more demonstrations are added, and models tend to copy answers rather than learn from them. To investigate further, we propose a new MM-ICL with Reasoning pipeline that augments each demonstration with a generated rationale alongside the answer. We conduct extensive experiments on both perception- and reasoning-intensive datasets, using open-source VLMs from 3B to 72B parameters and proprietary models such as Gemini 2.0, with controlled studies varying shot count, retrieval method, rationale quality, and distribution shift. Our results show limited performance sensitivity across these factors, suggesting that current VLMs do not effectively utilize demonstration-level information as intended in MM-ICL.
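
As a concrete illustration of the rationale-augmented demonstration format described above, here is a minimal sketch of how such a prompt could be assembled. The `build_mm_icl_prompt` helper and the `<image_k>` placeholder tokens are hypothetical, and the paper's actual templates may differ.

```python
# Hedged sketch: assembling a rationale-augmented MM-ICL prompt. Each
# demonstration carries an image placeholder, a question, a generated
# rationale, and the answer; the query ends with "Rationale:" so the
# model is prompted to reason before answering.

def build_mm_icl_prompt(demos, query):
    """demos: list of dicts with keys image_token, question, rationale, answer.
    query: dict with keys image_token and question."""
    parts = []
    for d in demos:
        parts.append(
            f"{d['image_token']}\n"
            f"Question: {d['question']}\n"
            f"Rationale: {d['rationale']}\n"
            f"Answer: {d['answer']}\n"
        )
    parts.append(
        f"{query['image_token']}\n"
        f"Question: {query['question']}\n"
        f"Rationale:"
    )
    return "\n".join(parts)

demos = [{
    "image_token": "<image_1>",
    "question": "What color is the bus?",
    "rationale": "The bus fills most of the frame and its body panels are red.",
    "answer": "red",
}]
query = {"image_token": "<image_2>", "question": "How many dogs are visible?"}
print(build_mm_icl_prompt(demos, query))
```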

Download here

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

Published in CVPR, 2025

We introduce FRAMES-VQA, a benchmark for evaluating robust fine-tuning strategies for visual question answering (VQA) under diverse multi-modal distribution shifts. Leveraging ten existing VQA datasets categorized into in-distribution (ID), near out-of-distribution (OOD), and far-OOD scenarios, we systematically analyze the impact of uni-modal, multi-modal, and adversarial shifts. Our study compares existing robust fine-tuning methods, quantifies distribution shifts using the Mahalanobis distance, and explores the interactions between uni- and multi-modal shifts, providing insights for developing more robust VQA models.
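
To make the distance-based shift measurement concrete, the sketch below scores candidate datasets against ID embedding statistics with the Mahalanobis distance. The feature matrices are synthetic placeholders, and the benchmark's actual feature extractor and protocol are not reproduced here.

```python
import numpy as np

def mean_mahalanobis(id_feats, ood_feats, eps=1e-6):
    """id_feats: (n, d) ID embeddings; ood_feats: (m, d) embeddings to score."""
    mu = id_feats.mean(axis=0)
    # Regularize the covariance so it stays invertible for small samples.
    cov = np.cov(id_feats, rowvar=False) + eps * np.eye(id_feats.shape[1])
    cov_inv = np.linalg.inv(cov)
    diffs = ood_feats - mu                                 # (m, d)
    d2 = np.einsum("md,de,me->m", diffs, cov_inv, diffs)   # squared distances
    return float(np.sqrt(np.maximum(d2, 0)).mean())

rng = np.random.default_rng(0)
id_feats = rng.normal(size=(1000, 16))
near_ood = rng.normal(loc=0.5, size=(200, 16))
far_ood = rng.normal(loc=3.0, size=(200, 16))
print(mean_mahalanobis(id_feats, near_ood))  # smaller: close to ID
print(mean_mahalanobis(id_feats, far_ood))   # larger: far from ID
```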

Download here

Directional Gradient Projection for Robust Fine-tuning of Foundation Models

Published in ICLR, 2025

This paper introduces Directional Gradient Projection (DiGraP), a novel layer-wise trainable method that incorporates directional information from gradients to bridge regularization and multi-objective optimization. Beyond demonstrating the method on image classification, we also generalize robust fine-tuning evaluation to multi-modal settings.
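
As a minimal sketch of one plausible reading of the directional idea (not the official DiGraP update rule): per layer, the task gradient is compared with the proximal direction w - w0, and the component that conflicts with it is projected out. The `project_gradient_` helper below is an assumption for illustration.

```python
import torch
import torch.nn as nn

def project_gradient_(param, pretrained, eps=1e-12):
    """Hypothetical layer-wise projection: r = w - w0 is the gradient of the
    proximal regularizer 0.5 * ||w - w0||^2; if the task gradient g conflicts
    with it, remove the conflicting component so the update does not increase
    drift from the pre-trained weights."""
    g, r = param.grad, param.data - pretrained
    inner = torch.sum(g * r)
    if inner < 0:  # the task update would push the weights further from w0
        param.grad = g - inner / (r.norm() ** 2 + eps) * r

model = nn.Linear(4, 2)
w0 = {n: p.detach().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 4), torch.randn(8, 2)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
for n, p in model.named_parameters():
    if p.grad is not None:
        project_gradient_(p, w0[n])
opt.step()
```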

Download here

Rethinking Weight Decay for Robust Fine-Tuning of Foundation Models

Published in NeurIPS, 2024

This paper introduces Selective Projection Decay (SPD), a weight decay technique that selectively regularizes certain layers to balance fitting the downstream task with retaining pre-trained knowledge, improving generalization and robustness when fine-tuning foundation models.
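
Below is a minimal sketch of selective decay toward the pre-trained weights. The layer-selection rule shown, decaying a layer only when its current update would push it further from the pre-trained values, is an illustrative assumption; SPD's actual criterion may differ.

```python
import torch
import torch.nn as nn

def selective_decay_(model, pretrained, lam=0.1):
    """Apply decay toward w0 only on layers selected by a per-layer condition
    (here: the current gradient step would increase drift from w0)."""
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        drift = p.data - pretrained[name]
        if torch.sum(p.grad * drift) < 0:       # update would increase drift
            p.grad = p.grad + lam * drift       # decay this layer toward w0

model = nn.Linear(4, 2)
w0 = {n: p.detach().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(model.parameters(), lr=0.05)
x, y = torch.randn(16, 4), torch.randn(16, 2)
for _ in range(3):
    opt.zero_grad()
    nn.functional.mse_loss(model(x), y).backward()
    selective_decay_(model, w0)
    opt.step()
```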

Download here

From Local to Global: Spectral-Inspired Graph Neural Networks

Published in NeurIPS GLFrontiers Workshop, 2022

This paper mitigates over-smoothing and over-squashing in deep GNNs by proposing PowerEmbed, a normalization technique for message passing that encodes global spectral information, inspired by spectral embeddings.
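
A minimal sketch of the idea under a normalized power-iteration reading of the summary above: repeated propagation interleaved with normalization drives deep layers toward the leading eigenvectors of the propagation operator (global spectral information) instead of over-smoothing toward a constant vector. The `power_embed` function and its column-wise normalization are assumptions, not the paper's exact formulation.

```python
import numpy as np

def power_embed(adj, x, num_layers=8, eps=1e-12):
    """adj: (n, n) adjacency matrix; x: (n, d) input features.
    Returns the list of embeddings from local (k=0) to global (k=K)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, eps))
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]  # D^-1/2 A D^-1/2
    embeddings = [x]
    for _ in range(num_layers):
        x = a_norm @ x
        # Column-wise normalization, as in simultaneous power iteration,
        # keeps the signal from collapsing as depth grows.
        x = x / (np.linalg.norm(x, axis=0, keepdims=True) + eps)
        embeddings.append(x)
    return embeddings

rng = np.random.default_rng(0)
adj = (rng.random((20, 20)) < 0.2).astype(float)
adj = np.maximum(adj, adj.T)          # make the random graph undirected
embs = power_embed(adj, rng.normal(size=(20, 4)))
print(len(embs), embs[-1].shape)      # 9 (20, 4)
```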

Download here