Construction and Validation of Action-Conditioned VideoGPT
- TABATA Koudai (The University of Tokyo Matsuo Institute)
- KAMOHARA Junnosuke (Tohoku University Matsuo Institute)
- UNNO Ryosuke (The University of Tokyo Matsuo Institute)
- SATO Makoto (Nara Institute of Science and Technology Matsuo Institute)
- WATANABE Taiju (Waseda University Matsuo Institute)
- KUME Taiga (Keio University Matsuo Institute)
- NEGISHI Masahiro (The University of Tokyo Matsuo Institute)
- OKADA Ryo (The University of Tokyo Matsuo Institute)
- IWASAWA Yusuke (The University of Tokyo)
- MATSUO Yutaka (The University of Tokyo)
Bibliographic Information
- Other Title: 行動条件付けVideoGPTの構築と検証
Description
<p>World models learn the structure of the external world from observations and can predict how its state will change in response to an agent's actions. Recent advances in generative models and language models have enabled multimodal world models, which are expected to find applications in domains such as autonomous driving and robotics. Video prediction is a field that has advanced in both fidelity and long-horizon prediction, and it is potentially applicable to world models as a way of acquiring temporal representations. One architecture that has performed well combines an encoder-decoder latent variable model for image reconstruction with an autoregressive model that predicts the latent sequence. In this work, we extend VideoGPT, a video prediction model built on VQ-VAE and Image-GPT, by introducing action conditioning. Experiments on CARLA and RoboNet showed improved performance over the model without action conditioning.</p>
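The two-stage idea in the abstract (discrete latent codes per frame, then autoregressive prediction over the code sequence with actions as extra context) can be illustrated with a toy sketch. The abstract does not say how actions are injected, so interleaving a discrete action token after each frame is only an assumption made here for illustration; the function names `quantize`, `encode_frame`, and `build_sequence` and the `action_offset` parameter are hypothetical and are not taken from the paper or from the VideoGPT code.

```python
from typing import List, Sequence


def quantize(vector: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Vector-quantization step: index of the nearest codebook entry
    under squared Euclidean distance (first entry wins on ties)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def encode_frame(latents: Sequence[Sequence[float]],
                 codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each continuous latent vector of one frame to a discrete code index."""
    return [quantize(v, codebook) for v in latents]


def build_sequence(frame_tokens: Sequence[Sequence[int]],
                   actions: Sequence[int],
                   action_offset: int) -> List[int]:
    """Assumed conditioning scheme: the tokens of frame t are followed by the
    action taken at step t. Action ids are shifted by `action_offset` (the
    codebook size) so they cannot collide with frame code indices."""
    sequence: List[int] = []
    for t, tokens in enumerate(frame_tokens):
        sequence.extend(tokens)
        if t < len(actions):
            sequence.append(action_offset + actions[t])
    return sequence


# Toy data: a 2-entry codebook, two frames with two latent vectors each,
# and one action (id 2) taken between the frames.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [
    [[0.1, 0.0], [0.9, 1.2]],
    [[0.8, 0.7], [1.1, 0.9]],
]
frame_tokens = [encode_frame(f, codebook) for f in frames]
sequence = build_sequence(frame_tokens, actions=[2], action_offset=len(codebook))
```

An autoregressive model trained on sequences like `sequence` would see the action token before it predicts the codes of the next frame, which is one simple way for predictions to depend on the agent's action. Other mechanisms, such as cross-attention over an action embedding, would serve the same purpose.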
Journal
- Proceedings of the Annual Conference of JSAI
Proceedings of the Annual Conference of JSAI JSAI2023 (0), 1G4OS21a02-1G4OS21a02, 2023
The Japanese Society for Artificial Intelligence
Details
- CRID: 1390296808221013504
- ISSN: 2758-7347
- Text Lang: ja
- Data Source: JaLC
- Abstract License Flag: Disallowed