Key Takeaway
Sora 2 is designed to produce videos that obey established physics, avoiding the unrealistic transformations seen in prior models: if a basketball player misses a shot, the ball rebounds realistically instead of teleporting into the hoop. The system maintains consistency across multiple shots and supports a range of visual styles, including photorealistic, cinematic, and anime. It can also generate audio, such as background soundscapes, dialogue, and sound effects. A new feature lets users insert recordings of real people or objects into generated environments while preserving their appearance and voice. Sora 2 is currently accessible only through an invite-only iOS app.
Unlike earlier video models, Sora 2 is designed to produce results that align more closely with established physics.
“Previous video models tend to be overly optimistic—they will alter objects and distort reality to fulfill a text prompt,” the team explains.
“For instance, if a basketball player misses a shot, the ball might suddenly teleport to the hoop. In Sora 2, the ball will instead bounce off the backboard.”
The system can follow instructions across multiple shots while maintaining consistency of elements within the generated scenes.
It accommodates a variety of visual styles, ranging from photorealistic and cinematic to anime.
In addition to video, the model can generate audio elements such as background soundscapes, dialogue, and sound effects.
OpenAI has introduced a new feature that allows users to embed recordings of people or objects directly into the generated environments.
“By watching a video of one of our teammates, the model can accurately place them into any Sora-generated environment, capturing both their appearance and voice,” the team notes.
How does Sora 2 work?
The company has launched an iOS mobile app that provides access to Sora 2; access is currently invite-only.
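For readers curious what programmatic access could look like if OpenAI later exposes Sora 2 through its API, the sketch below shows a plausible request-and-poll flow using the official openai Python SDK. The article only confirms the invite-only iOS app, so the videos endpoint, parameter names, status values, and download helper shown here are assumptions for illustration, not a documented interface.

```python
# Hypothetical sketch of programmatic Sora 2 access. The article only
# confirms an invite-only iOS app; the "videos" endpoint, its parameters,
# the status values, and the download helper are all assumptions.
import time

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Submit a text prompt describing the scene (assumed endpoint and model name).
job = client.videos.create(
    model="sora-2",
    prompt="A basketball player misses a shot; the ball bounces off the backboard.",
)

# Video generation is long-running, so poll until the job leaves its
# in-progress states (assumed status values).
while job.status in ("queued", "in_progress"):
    time.sleep(10)
    job = client.videos.retrieve(job.id)

if job.status == "completed":
    # Download the rendered clip, including its generated audio track
    # (assumed download helper).
    content = client.videos.download_content(job.id)
    content.write_to_file("missed_shot.mp4")
else:
    print(f"Generation did not complete: {job.status}")
```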