Motoki Omura
I am a Research Scientist at SB Intuitions, where I work on vision-language-action (VLA) models on the Robotics team. I received my Ph.D. from the University of Tokyo, where I was advised by Tatsuya Harada and Takayuki Osa.
My research focuses on reinforcement learning (RL) algorithms, particularly on improving stability and efficiency in both online and offline settings. I apply these techniques to domains such as robotics and LLM alignment (e.g., DPO).
Email /
X (Twitter) /
GitHub /
LinkedIn
Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps
Motoki Omura, Yusuke Mukuta, Kazuki Ota, Takayuki Osa, Tatsuya Harada
RLC, 2025
paper | code
Entropy Controllable Direct Preference Optimization
Motoki Omura, Yasuhiro Fujita, Toshiki Kataoka
ICML Workshop on Models of Human Feedback for AI Alignment, 2025
paper
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
ICML, 2025
paper | code
Latent Space Curriculum Reinforcement Learning in High-Dimensional Contextual Spaces and Its Application to Robotic Piano Playing
Haruki Abe, Takayuki Osa, Motoki Omura, Jen-Yen Chang, Tatsuya Harada
Humanoids, 2025 (Oral)
paper
Stabilizing Extreme Q-learning by Maclaurin Expansion
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
RLC, 2024
paper | code
Symmetric Q-Learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
AAAI, 2024
paper
Model Compression and Acceleration by Combining Reinforcement Learning and Meta-Pruning
Yu Kono, Motoki Omura, Tomohiro Kato, Yusuke Uchida
JSAI, 2024
paper