With the rapid advancement of artificial intelligence, unmanned systems such as autonomous driving and embodied intelligence are continuously being promoted and applied in real-world scenarios, leading to a new wave of technological revolution and industrial transformation. Visual perception, a core means of information acquisition, plays a crucial role in these intelligent systems. However, achieving efficient, precise, and robust visual perception in dynamic, diverse, and unpredictable environments remains an open challenge.