Research | Lab for Artificial Visual Intelligence

Our research focuses on building adaptable AI systems that learn from multiple modalities, perform well under resource constraints, and remain reliable in practice.

Multimodal Learning

We study methods that jointly represent and reason over images, text, and structured inputs. This enables richer task understanding, better generalization, and stronger cross-modal retrieval.

Selected publications:

ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models (ECCV 2026)

arXiv 🏷️Generative AI 🏷️Human-Computer Interaction

Organizing Unstructured Image Collections using Natural Language (CVPR Findings 2026)

arXiv 🏷️Clustering 🏷️Generative AI

Democratizing Fine-grained Visual Recognition with Large Language Models (ICLR 2024)

arXiv 🏷️Fine-grained Recognition 🏷️Generative AI

Learning with Limited Resources

Our work targets models that are both data-efficient and compute-aware, allowing intelligent systems to work well in constrained environments and on edge devices.

Selected publications:

Predict-then-Diffuse: Adaptive Response Length for Compute-Budgeted Inference in Diffusion LLMs (IJCNN 2026)

arXiv 🏷️Generative AI

FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models (ICCV 2025)

arXiv 🏷️Federated Learning

Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning (CoLLAs 2024)

arXiv 🏷️Continual Learning

Trustworthy AI

We investigate the robustness, privacy, and predictive uncertainty of vision-language systems, so that model predictions remain reliable under distribution shift and in safety-critical use cases.

Selected publications:

Ensembling Pruned Attention Heads For Uncertainty-Aware Efficient Transformers (ICLR 2026)

arXiv 🏷️Uncertainty Quantification 🏷️Pruning

How (Mis)calibrated is Your Federated CLIP and What To Do About It? (pre-print, 2026)

arXiv 🏷️Calibration 🏷️Federated Learning

Group-robust Machine Unlearning (TMLR 2025)

arXiv 🏷️Fairness 🏷️Privacy Preservation
