GraspMamba: A Mamba-based Language-driven Grasp Detection Framework with Hierarchical Feature Learning

Publication: Chapter in book or conference proceedings › Presentation with paper in proceedings › Peer-reviewed

Abstract

Grasp detection is a fundamental robotic task critical to the success of many industrial applications. However, current language-driven models for this task often struggle with cluttered images, lengthy textual descriptions, or slow inference speed. We introduce GraspMamba, a new language-driven grasp detection method that employs hierarchical feature fusion with Mamba vision to tackle these challenges. By leveraging the rich visual features of a Mamba-based backbone alongside textual information, our approach effectively enhances the fusion of multimodal features. GraspMamba is the first Mamba-based grasp detection model to extract vision and language features at multiple scales, delivering robust performance and rapid inference. Extensive experiments show that GraspMamba outperforms recent methods by a clear margin. We validate our approach through real-world robotic experiments, highlighting its fast inference speed.
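The abstract's core idea, fusing textual information into visual features at multiple backbone scales, can be illustrated with a minimal sketch. The function below is a hypothetical illustration, not the paper's actual architecture: it assumes one sentence-level text embedding and a list of multi-scale feature maps, and injects the text into each scale via a per-scale linear projection (random weights here stand in for learned ones).

```python
import numpy as np

def fuse_hierarchical(visual_feats, text_feat, rng=None):
    """Hypothetical sketch of hierarchical vision-language fusion:
    project one text embedding into each visual scale's channel
    dimension and add it at every spatial location of that scale."""
    rng = np.random.default_rng(0) if rng is None else rng
    fused = []
    for feat in visual_feats:  # feat: (C, H, W), one per backbone stage
        c = feat.shape[0]
        d = text_feat.shape[0]
        # per-scale projection of the text embedding (random stand-in
        # for a learned linear layer)
        proj = rng.standard_normal((c, d)) / np.sqrt(d)
        t = proj @ text_feat              # (C,)
        fused.append(feat + t[:, None, None])  # broadcast over H, W
    return fused

# toy multi-scale features, as a Mamba-style backbone might emit
feats = [np.zeros((16, 32, 32)), np.zeros((32, 16, 16)), np.zeros((64, 8, 8))]
text = np.ones(12)  # toy text embedding
out = fuse_hierarchical(feats, text)
```

Fusing at every scale, rather than only at the final stage, is what lets coarse layout cues and fine grasp-point details both be conditioned on the instruction; the sketch preserves each scale's shape so downstream heads are unaffected.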
Original language: English
Title: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems
Pages: 15808-15815
Number of pages: 8
DOIs
Publication status: Published - 2025
Event: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems - Hangzhou, China
Duration: 19 Oct 2025 - 25 Dec 2025
https://www.iros25.org/

Conference

Conference: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems
Short title: IROS
Country/Territory: China
City: Hangzhou
Period: 19/10/25 - 25/12/25
Internet address

Research Field

  • Complex Dynamical Systems

