
DiffuVolume: A Novel Method for Stereo Matching Based on Diffusion Models
Research Background and Problem Statement
Stereo matching is a crucial task in computer vision, with wide applications in autonomous driving, robot navigation, and other areas. Its core objective is to generate a dense disparity map from a pair of rectified stereo ima...

Enhancing Pose Awareness in Self-Supervised Facial Representation Learning
Research Background and Problem Statement
In the field of computer vision, facial representation learning is a crucial research task. By analyzing facial images, we can extract information such as identity, emotions, and poses, thereby supporting downstream tasks like facial...

A Mutual Supervision Framework for Referring Expression Segmentation and Generation
Research Background and Problem Statement
In recent years, vision-language interaction technology has made remarkable progress in the field of artificial intelligence. Among these advancements, referring expression segmentation (RES) and referring expression generat...

GL-MCM: Global and Local Maximum Concept Matching for Zero-Shot Out-of-Distribution Detection
Research Background and Problem Statement
In real-world applications, machine learning models often face changes in data distribution, such as the emergence of new categories. Inputs that fall outside the training distribution are called out-of-distribution (OOD) data, and identifying them is the task of OOD detection. To ensure the...


Lidar-Guided Geometric Pretraining Enhances Performance of Vision-Centric 3D Object Detection
Background Introduction
In recent years, multi-camera 3D object detection has garnered significant attention in the field of autonomous driving. However, vision-based methods still face challenges in precisely extracting geometric information from RGB imag...

An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-training
Academic Background
In recent years, self-supervised learning (SSL) has made significant progress in the field of computer vision. In particular, the successful application of masked image modeling (MIM) pre-training methods on large-sca...

DiffuVolume: Diffusion Model for Volume Based Stereo Matching

Sample-Cohesive Pose-Aware Contrastive Facial Representation Learning

A Mutual Supervision Framework for Referring Expression Segmentation and Generation

Global and Local Maximum Concept Matching for Zero-Shot Out-of-Distribution Detection

Lidar-guided Geometric Pretraining for Vision-centric 3D Object Detection

An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-training