I discovered this while building and exploring what I now call an early ๐ฆ๐ฝ๐ฎ๐๐ถ๐ผ-๐ง๐ฒ๐บ๐ฝ๐ผ๐ฟ๐ฎ๐น ๐ฃ๐ฒ๐ฟ๐ฐ๐ฒ๐ฝ๐๐ถ๐ผ๐ป ๐๐ป๐ด๐ถ๐ป๐ฒ (๐ฆ๐ง๐ฃ๐) - an early infrastructure experiment turned breakthrough that converts raw video and simulation data into structured, time-aware, space-aware training signals.
The key insight wasnโt about games.
It was about control, observability, and ground truth.
and in my opinion, Roblox, among all game engines available right now, is the easiest way to provide :
- fully deterministic physics
- complete access to object states and world metadata
- massive diversity of human-created environments
- perfect alignment between visual input and underlying state
My early STPE (version 1) sits on top of this and extracts spatial structure + temporal continuity before any model training happens, producing datasets with object trajectories, motion, captions, and state transitions - instead of flat, lossy frames.
In other words, Roblox can be used as more than just a game platform. It can be used as a scalable world simulator with native ground truth.
Thatโs what makes it powerful for training AI systems that need to understand how the world changes over time, not just what it looks like in a single frame.
GitHub (open-source), expanded with new important discoveries and in-depth insights:
https://github.com/Froredion/Spatio-Temporal-Perception-Engine
After deep-diving into frontier ML, this direction genuinely excites me.
The next generation of AI can grow far beyond what we have today, and weโre only just getting started.
For Generating Self-Supervised Training Datasets for Spatio-Temporal Foundation Models - Froredion/Spatio-Temporal-Perception-Engine