Research Scientist - Robotics, Learning From Videos (LFV)
Company: Toyota Research Institute
Location: Los Altos
Posted on: February 16, 2026
Job Description:
At Toyota Research Institute
(TRI), we’re on a mission to improve the quality of human life.
We’re developing new tools and capabilities to amplify the human
experience. To lead this transformative shift in mobility, we’ve
built a world-class team advancing the state of the art in AI,
robotics, driving, and material sciences.

The Mission
Make general-purpose robots a reality.

The Challenge
We envision a
future where robots assist with household chores and cooking, aid
the elderly in maintaining their independence, and enable people to
spend more time on the activities they enjoy most. To achieve this,
robots must be able to operate reliably in complex, unstructured
environments. Our mission is to answer the question “What will it
take to create truly general-purpose robots that can accomplish a
wide variety of tasks in settings like human homes with minimal
human supervision?” We believe that the answer lies in cultivating
large-scale datasets of physical interaction from a variety of
sources and building on the latest advances in machine learning to
learn general-purpose robot behaviors from this data.

The Team
The
Learning From Videos (LFV) team in the Robotics division focuses on
the development of foundation models capable of leveraging
large-scale multi-modal (RGB, depth, flow, semantics, bounding
boxes, tactile, audio, etc.) data from multiple domains (driving,
robotics, indoors, outdoors, etc.) to improve the performance of
downstream tasks. This paradigm targets training scalability, since
data from multiple modalities can be equally leveraged to learn
useful data-driven priors (3D geometry, physics, dynamics, etc.) for
world understanding. Our topics of interest include, but are not
limited to, Video Generation, World Models, 4D Reconstruction,
Multi-Modal Models, Multi-View Geometry, Data Augmentation, and
Video-Language-Action models, with a primary focus on foundation
models for embodied applications. We are aiming to make progress on
some of the hardest scientific challenges around spatio-temporal
reasoning, and how it can lead to the deployment of autonomous
agents in real-world unstructured environments.

The Opportunity
Our
Learning From Videos (LFV) team is looking for a Computer Vision
Research Scientist with expertise in Video Generation,
Spatio-temporal Representation Learning, World Models, Foundation
Models, Multi-Modal Learning, Vision-as-Inverse-Graphics (including
Differentiable Rendering), or related fields, to improve dynamic
scene understanding for robots. We are working on some of the
hardest scientific challenges around the safe and effective usage
of large robotic fleets, simulation, and prior knowledge (geometry,
physics, domain knowledge, behavioral science), not only for
automation but also for human augmentation. As a Research
Scientist, you will work with a team proposing, conducting, and
transferring innovative research. You will use large amounts of
sensory data (real and synthetic) to address open problems, train
models at scale, publish at top academic venues, and test your
ideas in the real world (including on our robots). You will also
work closely with other teams at TRI to transfer and ship our most
successful algorithms and models towards world-scale long-term
autonomy and advanced assistance systems.

Responsibilities
- Conduct ambitious research that solves high-value problems, and
  validate the solutions on well-established benchmarks and systems.
- Push the boundaries of knowledge and the state of the art in ML
  areas, including simulation, perception, prediction, and planning
  for autonomous driving and robotics.
- Partner with a multidisciplinary team including other research
  scientists and engineers across the CV team, TRI, Toyota, and our
  university partners.
- Present results in verbal and written communications, internally,
  at top international venues, and via open-source contributions to
  the community.
- Work closely with robotics and machine learning researchers and
  engineers to understand theoretical and practical needs.
- Lead collaborations with our external research partners and mentor
  research interns.
- Follow best practices for producing maintainable code, both for
  internal use and for open-sourcing to the scientific community.

Qualifications
- PhD or equivalent years of experience in Machine Learning,
  Robotics, Computer Vision, or a related field.
- Deep expertise in at least one key ML area among Computer Vision,
  Large-Scale Pre-Training, Multi-Modal Learning, World Models, and
  4D Reconstruction.
- Consistent record of publishing at high-impact
  conferences/journals (CVPR, ICLR, NeurIPS, RSS, ICRA, ICCV, ECCV,
  PAMI, IJCV, etc.) on the aforementioned topics.
- Proficiency in scientific Python, Unix, and a common DL framework
  (preferably PyTorch).
- Experience with distributed learning (especially on AWS) for
  large-scale training of foundation models is a plus.
- You can identify, propose, and lead new research efforts, working
  in collaboration with other researchers and engineers to complete
  them from initial idea to working solution.
- You are intrigued by large-scale challenges in ML, especially in
  the space of Robotics, Automated Driving, and for societal good in
  general.
- You are a reliable teammate. You like to think big and go deeper.
  You care about openness and delivering with integrity.

Please submit a brief cover letter and include a link to your Google
Scholar profile with a full list of publications when submitting
your CV for this position.

The pay range for this
position at commencement of employment is expected to be between
$176,000 and $264,000/year for California-based roles. Base pay
offered will depend on multiple individualized factors, including,
but not limited to, business or organizational needs, market
location, job-related knowledge, skills, and experience. TRI offers
a generous benefits package including medical, dental, and vision
insurance, 401(k) eligibility, paid time off benefits (including
vacation, sick time, and parental leave), and an annual cash bonus
structure. Additional details regarding these benefit plans will be
provided if an employee receives an offer of employment. Please
refer to this Candidate Privacy Notice for information on the
categories of personal information that we collect from individuals
who inquire about and/or apply to work for Toyota Research
Institute, Inc. or its subsidiaries, including Toyota A.I. Ventures
GP, L.P., and the purposes for which we use such personal
information. TRI is fueled by a diverse and inclusive community of
people with unique backgrounds, education and life experiences. We
are dedicated to fostering an innovative and collaborative
environment by living the values that are an essential part of our
culture. We believe diversity makes us stronger and are proud to
provide Equal Employment Opportunity for all, without regard to an
applicant’s race, color, creed, gender, gender identity or
expression, sexual orientation, national origin, age, physical or
mental disability, medical condition, religion, marital status,
genetic information, veteran status, or any other status protected
under federal, state or local laws. It is unlawful in Massachusetts
to require or administer a lie detector test as a condition of
employment or continued employment. An employer who violates this
law shall be subject to criminal penalties and civil liability.
Pursuant to the San Francisco Fair Chance Ordinance, we will
consider qualified applicants with arrest and conviction records
for employment. We may use artificial intelligence (AI) tools to
support parts of the hiring process, such as reviewing
applications, analyzing resumes, or assessing responses. These
tools assist our recruitment team but do not replace human
judgment. Final hiring decisions are ultimately made by humans. If
you would like more information about how your data is processed,
please contact us.