Recent News
[Jul. 2024] One paper has been accepted by ECCV 2024 and one by ACMMM 2024.
[Dec. 2023] I have been awarded my PhD degree and the Dean's Commendation for Doctoral Thesis Excellence.
[Aug. 2023] I have been selected for the ICCV 2023 Doctoral Consortium, mentored by Prof. Judy Hoffman.
[Jul. 2023] Two papers have been accepted by ICCV 2023.
[Dec. 2022] One paper has been accepted by TPAMI.
[Mar. 2022] One paper has been accepted by CVPR 2022.
|
Research
|
GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation
Qianyi Liu, Siqi Zhang, Yanyuan Qiao, Junyou Zhu, Xiang Li, Longteng Guo, Qunbo Wang, Xingjian He, Qi Wu, Jing Liu
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
|
|
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
Yanyuan Qiao, Wenqi Lyu, Hui Wang, Zixu Wang, Zerui Li, Yuan Zhang, Mingkui Tan, Qi Wu
project / arxiv
|
|
VL-Mamba: Exploring State Space Models for Multimodal Learning
Yanyuan Qiao, Zheng Yu, Zijia Zhao, Sihan Chen, Mingzhen Sun, Longteng Guo, Qi Wu, Jing Liu
NeurIPS Workshop on Efficient Natural Language and Speech Processing, 2024
project / arxiv / code
|
|
MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation
Junyou Zhu, Yanyuan Qiao, Siqi Zhang, Xingjian He, Qi Wu, Jing Liu
arxiv
|
|
Effective Tuning Strategies for Generalist Robot Manipulation Policies
Wenbo Zhang, Yang Li, Yanyuan Qiao, Siyuan Huang, Jiajun Liu, Feras Dayoub, Xiao Ma, Lingqiao Liu
arxiv
|
|
MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation
Mingzhen Sun, Weining Wang, Yanyuan Qiao, Jiahui Sun, Zihan Qin, Longteng Guo, Xinxin Zhu, Jing Liu
ACM Multimedia Conference (ACMMM), 2024
arxiv
|
|
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang*, Ziqiao Ma*, Jialu Li*, Yanyuan Qiao*, Zun Wang*, Joyce Chai, Qi Wu, Mohit Bansal, Parisa Kordjamshidi (*equal contribution)
arxiv
|
|
Improving Online Source-free Domain Adaptation for Object Detection by Unsupervised Data Acquisition
Xiangyu Shi, Yanyuan Qiao, Qi Wu, Lingqiao Liu, Feras Dayoub
ECCV Workshop on ROAM, 2024
arxiv
|
|
LLM as Copilot for Coarse-grained Vision-and-Language Navigation
Yanyuan Qiao, Qianyi Liu, Jiajun Liu, Jing Liu, Qi Wu
European Conference on Computer Vision (ECCV), 2024
paper
|
|
Multi-Modal Adapter for Medical Vision-and-Language Learning
Zheng Yu, Yanyuan Qiao, Yutong Xie, Qi Wu
International Workshop on Machine Learning in Medical Imaging (MLMI@MICCAI), 2023
paper
|
|
March in Chat: Interactive Prompting for Remote Embodied Referring Expression
Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
International Conference on Computer Vision (ICCV), 2023
paper / arxiv / code
|
|
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation
Yanyuan Qiao, Zheng Yu, Qi Wu
International Conference on Computer Vision (ICCV), 2023
paper / arxiv / code
|
|
HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation
Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
paper
|
|
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
paper / arxiv / code
|
|
R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks
Yanyuan Qiao, Qi Chen, Chaorui Deng, Ning Ding, Yuankai Qi, Mingkui Tan, Xincheng Ren, Qi Wu
ACM International Conference on Multimedia (ACMMM), 2021
paper
|
|
Referring Expression Comprehension: A Survey of Methods and Datasets
Yanyuan Qiao, Chaorui Deng, Qi Wu
IEEE Transactions on Multimedia (TMM), 2020
paper / arxiv
|
|
RANKVQA: Answer Re-ranking for Visual Question Answering
Yanyuan Qiao*, Zheng Yu*, Jing Liu (*equal contribution)
IEEE International Conference on Multimedia and Expo (ICME), 2020 (Oral)
paper
|
|
VC-VQA: Visual Calibration Mechanism for Visual Question Answering
Yanyuan Qiao*, Zheng Yu*, Jing Liu (*equal contribution)
IEEE International Conference on Image Processing (ICIP), 2020
paper
|
|
Improving Visual Question Answering Using Dropout and Enhanced Question Encoder
Zhiwei Fang, Jing Liu, Yong Li, Yanyuan Qiao, Qu Tang, Hanqing Lu
Pattern Recognition (PR), 2019
paper
|
|
Enhancing Visual Question Answering Using Dropout
Zhiwei Fang, Jing Liu, Yanyuan Qiao, Qu Tang, Yong Li, Hanqing Lu
ACM International Conference on Multimedia (ACMMM), 2018
paper