LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

High-Quality Video Generation with Cascaded Latent Diffusion Models: LaVie. Academic Background: In recent years, with the breakthrough progress of Diffusion Models (DMs) in image generation, Text-to-Image (T2I) generation has achieved significant success. However, extending this technology to Text-to-Video (T2V) generation st...

SLIDE: A Unified Mesh and Texture Generation Framework with Enhanced Geometric Control and Multi-View Consistency

Academic Background: With the increasing demand for high-quality 3D content across industries such as gaming, architecture, and social media, the manual creation of 3D assets has become time-consuming, technically demanding, and costly. In the gaming industry, the aesthetic quality of assets like characters and furniture significantly impacts the im...

From Behavior to Natural Language: Generative Approach for UAV Intent Recognition

UAV Behavior Intent Recognition Based on Generative Models: A Cross-Modal Study From Behavior to Natural Language. Background and Research Objectives: In recent years, Unmanned Aerial Vehicle (UAV) technology has advanced rapidly and has found widespread applications in civilian and military domains, including search and rescue, precision agriculture...

Q-Cogni: An Integrated Causal Reinforcement Learning Framework

Research Insight Report: Q-Cogni, an Integrated Causal Reinforcement Learning Framework. In recent years, the rapid advancement of artificial intelligence (AI) has prompted researchers to develop more efficient and interpretable reinforcement learning (RL) systems. Due to its ability to mimic human decision-making, reinforcement l...

Epi-Curriculum: Episodic Curriculum Learning for Low-Resource Domain Adaptation in Neural Machine Translation

Epi-Curriculum: Episodic Curriculum Learning for Low-Resource Domain Adaptation. Research Background and Problem Statement: In recent years, Neural Machine Translation (NMT) has become a benchmark technology in natural language processing. However, while NMT achieves near-human translation performance on large-scale parallel corpora, its effectivenes...

Enhancing Aerial Object Detection with Selective Frequency Interaction Network

Selective Frequency Interaction Network for Improved Aerial Object Detection. Background and Problem Statement: With advances in computer vision, aerial object detection has become a critical research focus in remote sensing. This task aims to identify targets such as vehicles or buildings in aerial images captured at varying angles and alt...