RoRA-VLM: Robust Retrieval-Augmented Vision Language Models
Published in ICCV 2025 Workshop on Knowledge-Intensive Multimodal Reasoning, 2025
Jingyuan Qi, Zhiyang Xu, Rulin Shao, Zihao Lin, Yang Chen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang
Recommended citation: Jingyuan Qi, Zhiyang Xu, Rulin Shao, Zihao Lin, Yang Chen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang. "RoRA-VLM: Robust Retrieval-Augmented Vision Language Models." ICCV 2025 Workshop on Knowledge-Intensive Multimodal Reasoning.
