Visincept Founder Interviewed on Shenzhen TV’s "Frontier of Innovation": Embarking on the World Model Journey with the EgoTwin Data Engine

On May 16, 2026, Shenzhen TV's program Frontier of Innovation visited Visincept. The program focused on EgoTwin, the egocentric human hand 3D alignment data engine for world models jointly launched by Visincept and Baidu Intelligent Cloud. It also featured an in-depth conversation with Zhang Lei, Founder and CEO of Visincept, and Liu Wei, Co-founder of Visincept, on the technological breakthroughs and paths to industrial deployment in the world model field.

World models are a core AI technology that simulates environmental dynamics and predicts future states, enabling AI to move from "seeing" the world to "understanding" it. Over the past year, global competition in this field, where the virtual and the real intertwine, has intensified: Yann LeCun founded AMI Labs, and Li Fei-Fei's World Labs launched the World API. Meanwhile, Chinese teams working on world models have emerged in rapid succession. In this closely watched field, the Visincept team, which previously developed the globally leading DINO series of large vision models, has set out to tackle the core pain points of embodied intelligence.


Figure 1 EgoTwin Product Image

During the interview, Zhang Lei acknowledged that while AI has achieved great success in the digital world, it still hits bottlenecks when extending into the physical world. The core challenge is a severe shortage of physical-world data: image and text data from the internet cannot teach robots basic practical skills such as folding clothes or wiping tables, leaving large robot models in a state of "data hunger".

Figure 2 Zhang Lei, Founder and CEO of Visincept

"The core of a world model is next state prediction, which is fundamentally different from the next token prediction of language models," Zhang Lei explained. Visincept holds that a world model must have three core properties: it should be object-centric, action-aligned across embodiments, and causality-driven. The EgoTwin data engine is the key vehicle for putting this philosophy into practice. As the industry's latest egocentric human hand 3D alignment data engine, EgoTwin addresses the problem of converting human hand manipulation data into data that robots can use, which significantly lowers the difficulty of data collection and improves its efficiency.

Liu Wei, Co-founder of Visincept, laid out the company's roadmap for industrial deployment during the interview. In the short term, working with Baidu Intelligent Cloud and other partners through the EgoTwin data engine, Visincept will supply data to large-model and embodied-intelligence companies in China and abroad, helping upgrade robots' "brains". In the long term, by building world models and extending them to empower robot bodies, Visincept aims to serve the real economy and form a complete closed loop of "data engine - model iteration - body implementation".

Figure 3 Liu Wei, Co-founder of Visincept

From research and development to industrial deployment, Visincept's exploration epitomizes the dedication of Shenzhen's sci-tech innovation enterprises to hard technology. The interview on Shenzhen TV's Frontier of Innovation shows the public that, whether through breakthroughs in lightweight data engines or an integrated software and hardware strategy, China's sci-tech innovators are solving industry challenges through independent innovation, accelerating the arrival of an intelligent era in which AI can perceive, reason about, and transform the real world.

Full interview video from Shenzhen TV: [Click Here]