Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation

Contrastive Decoupled Representation Learning in Speech-Preserving Facial Expression Manipulation. Background: In recent years, with the rapid development of virtual reality, film and television production, and human-computer interaction technologies, facial expression manipulation has become a research hotspot in the fields of computer ...

Sample-Cohesive Pose-Aware Contrastive Facial Representation Learning

Enhancing Pose Awareness in Self-Supervised Facial Representation Learning. Research Background and Problem Statement: In the field of computer vision, facial representation learning is a crucial research task. By analyzing facial images, we can extract information such as identity, emotions, and poses, thereby supporting downstream tasks like facial...

A Mutual Supervision Framework for Referring Expression Segmentation and Generation

Research Background and Problem Statement: In recent years, vision-language interaction technology has made remarkable progress in the field of artificial intelligence. Among these advancements, referring expression segmentation (RES) and referring expression generat...

Global and Local Maximum Concept Matching for Zero-Shot Out-of-Distribution Detection

GL-MCM: Global and Local Maximum Concept Matching for Zero-Shot Out-of-Distribution Detection. Research Background and Problem Statement: In real-world applications, machine learning models often face changes in data distribution, such as the emergence of new categories. Detecting inputs that fall outside the training distribution is known as out-of-distribution (OOD) detection. To ensure the...

Lidar-guided Geometric Pretraining for Vision-centric 3D Object Detection

Lidar-Guided Geometric Pretraining Enhances Performance of Vision-Centric 3D Object Detection. Background: In recent years, multi-camera 3D object detection has garnered significant attention in the field of autonomous driving. However, vision-based methods still face challenges in precisely extracting geometric information from RGB imag...