Motoki Omura

I am a Research Scientist at SB Intuitions, where I work on vision-language-action (VLA) models in the Robotics team. I received my Ph.D. from the University of Tokyo, where I was advised by Tatsuya Harada and Takayuki Osa.

My research focuses on reinforcement learning (RL) algorithms, particularly on improving stability and efficiency in both online and offline settings. I apply these techniques to domains such as robotics and LLM alignment (e.g., DPO).

Email  /  X (Twitter)  /  Github  /  LinkedIn


Publications

Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps
Motoki Omura, Yusuke Mukuta, Kazuki Ota, Takayuki Osa, Tatsuya Harada
RLC, 2025
paper | code

Entropy Controllable Direct Preference Optimization
Motoki Omura, Yasuhiro Fujita, Toshiki Kataoka
ICML 2025 Workshop on Models of Human Feedback for AI Alignment
paper

Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
ICML, 2025
paper | code

Latent Space Curriculum Reinforcement Learning in High-Dimensional Contextual Spaces and Its Application to Robotic Piano Playing
Haruki Abe, Takayuki Osa, Motoki Omura, Jen-Yen Chang, Tatsuya Harada
Humanoids, 2025 (Oral)
paper

Stabilizing Extreme Q-learning by Maclaurin Expansion
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
RLC, 2024
paper | code

Symmetric Q-Learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
AAAI, 2024
paper

Model Compression and Acceleration by Combining Reinforcement Learning and Meta-Pruning
Yu Kono, Motoki Omura, Tomohiro Kato, Yusuke Uchida
JSAI, 2024
paper

Awards

5th Place and Honorable Mention in “The Multi-Agent Reinforcement Learning in Malmö (MARLÖ) Competition 2019”
Motoki Omura
Competition page | Winners announcement

