Abstract
Foundation models such as ChatGPT have made significant strides in robotic tasks due to their universal representation of real-world domains. In this paper, we leverage foundation models to tackle grasp detection, a persistent challenge in robotics with broad industrial applications. Despite numerous grasp datasets, their object diversity remains limited compared to the real world. Fortunately, foundation models possess an extensive repository of real-world knowledge, including the objects we encounter in our daily lives. Consequently, a promising solution to the limited representation in previous grasp datasets is to harness the universal knowledge embedded in these foundation models. We present Grasp-Anything, a new large-scale grasp dataset synthesized from foundation models to implement this solution. Grasp-Anything excels in diversity and magnitude, boasting 1M samples with text descriptions and more than 3M objects, surpassing prior datasets. Empirically, we show that Grasp-Anything successfully facilitates zero-shot grasp detection in vision-based tasks and real-world robotic experiments. Our dataset and code are available at https://airvlab.github.io/grasp-anything/.
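For readers unfamiliar with language-conditioned grasp data, the sketch below illustrates the kind of sample the abstract describes: a scene image, a foundation-model-generated text description, and planar grasp annotations. It is not the released format; the class names, fields, and the 5-parameter rectangle convention are assumptions for illustration only, so consult the dataset page above for the actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GraspRectangle:
    # Planar grasp as a rotated rectangle: centre (x, y), gripper opening
    # width, jaw size height, and in-plane rotation angle in radians.
    # This 5-parameter form is a common convention in grasp detection;
    # the exact annotation format used by Grasp-Anything may differ.
    x: float
    y: float
    width: float
    height: float
    angle: float


@dataclass
class GraspSample:
    # Hypothetical view of one dataset entry: a scene image path, the
    # language description of the scene, and grasp annotations for the
    # objects it contains.
    image_path: str
    description: str
    grasps: List[GraspRectangle]


# Illustrative instance only; field names and values are assumptions.
sample = GraspSample(
    image_path="scenes/000001.jpg",
    description="A ceramic mug next to a notebook on a wooden desk.",
    grasps=[GraspRectangle(x=212.0, y=148.5, width=64.0, height=28.0, angle=0.35)],
)
print(sample.description, len(sample.grasps))
```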
Original language | English |
---|---|
Title of host publication | 2024 IEEE International Conference on Robotics and Automation (ICRA) |
Pages | 14030-14037 |
DOIs | |
Publication status | Published - 8 Aug 2024 |
Event | IEEE International Conference on Robotics and Automation, Yokohama, Japan. Duration: 13 May 2024 → 17 May 2024. https://2024.ieee-icra.org/ |
Publication series
Name | 2024 IEEE International Conference on Robotics and Automation (ICRA) |
---|---|
Conference
Conference | IEEE International Conference on Robotics and Automation |
---|---|
Abbreviated title | ICRA |
Country/Territory | Japan |
City | Yokohama |
Period | 13/05/24 → 17/05/24 |
Internet address | https://2024.ieee-icra.org/ |
Research Field
- Complex Dynamical Systems