(Open) Research Engineer (Physical AI)

Role Overview — Research Engineer (Physical AI)

“Responsible for everything from R&D of the Foundation Embodied Agent to real-world robotics deployment.”

The WoRV (World model for Robotics and Vehicle control) team builds agents that, like humans, integrate diverse, high-dimensional information to decide and act—tackling challenges in robotics and autonomous driving.

The Physical AI Research team is currently focused on the following four core missions:

Building Training Recipes for the Foundation Embodied Agent
- In the still-forming field of Embodied AI, research and develop new training recipes to build a Foundation Embodied Agent that responds robustly to previously unseen environments, rules, robots, and instructions.
- Our goal is a general-purpose driving agent capable of cooperation, reasoning, and adaptation across varied scenarios.
Solving Memory & Efficiency Challenges of the Foundation Embodied Agent
- Most RFMs (Robotics Foundation Models) decide actions solely from the current observation, without memory, which severely restricts the range of tasks they can perform. Because the humans who provide demonstrations act on memory, a memoryless policy also breaks a key assumption of imitation learning: that the agent observes the same information the demonstrator acted on.
- The way MLLMs (Multimodal LLMs) process visual information is often inefficient. We are researching solutions to these issues to build a more efficient Embodied Agent.
Real-World Robotics System Implementation & Integration
- Together with CANVAS, we will progressively replace modular, rule-based legacy components in autonomous driving (Perception, Planning, Control, Localization, Mapping) with new approaches and integrate them into a single model to achieve human-level, intuitive, intelligent driving.
Closing the Sim-to-Real Gap
- For robotics, simulation-based data generation is practically essential. However, if the gap between simulation and real-world data is too large, simulation data becomes unusable for training.
- To address this, we research approaches such as domain randomization and delayed reinforcement learning to mitigate the sim-to-real gap.

Research Support

Ultra-High-Performance GPU Cluster: We operate CORE (Compute-Oriented Research Environment) [CORE Introduction]
- 12 on-premises DGX H100 nodes (H100×96), 30+ A100s, and 10+ V100s in operation.
High-quality, fully human-driven driving data pipelines for both simulation and real-world environments
- Unlimited custom datasets and annotations tailored to the research team’s needs.
- Over 200 hours of map-based driving data, with unlimited access for the research team.
Comprehensive support for conferences & publications
- Full support for attending and submitting to NeurIPS, ICLR, CVPR, ECCV, Interspeech, ACL, etc.
- All conference participation expenses are covered when a paper is accepted at an international conference or journal.
WoRV Tech Blog

Publications | maum.ai BRAIN Team

Open-Source Activities | maum.ai BRAIN Team

Key Responsibilities

Design and implement models for pretraining and post-training of the Foundation Embodied Agent.
Build and advance pipelines for diverse vision-language-action data.
Research methods for agents to form and leverage long-term memory.
Optimize multimodal model architectures for efficient training and inference.