Episode 14 · Founder Story · May 6, 2026 · ~79 min

He Left Sweden for China to Build GPS for Robots

"The AI revolution has almost definitionally not begun until we get physical AI."

With Nils Pihl, Founder & CEO of Auki Labs

  • "The AI revolution has almost definitionally not begun." Nils argues physical AI is the real revolution. 70% of the world economy is still tied to physical locations and physical labor, and digital devices have no idea where they actually are in it.
  • Every robot wakes up at 0, 0, 0. Put two robots in the same room and both insist they are at coordinates 0, 0, 0. The Posemesh is Auki Labs' fix: a decentralized spatial layer that gives robots, drones, and AR glasses a shared coordinate system anchored to the real world.
  • The pyramids' worth of time, lost in traffic. Roughly the human-hours it took to build the pyramids are lost in Beijing traffic every single week. Nils uses that number to argue spatial intelligence is a civilization-scale problem, not a niche developer tool.
  • Sweden's largest retailer signed. Auki Labs' first big enterprise client is Sweden's largest retailer. The product, Cactus AI for Retail, is positioned as a store-manager robot, not a store-worker robot. Augmenting managers in 2026, replacing nothing.
  • Three new dimensions of the internet. Nils extends Naval Ravikant's "missing fifth protocol" framing: the next layer is an internet of spaces, sensors, and actuators. The 43-gram Mentra glasses are the consumer wedge; humanoid robots are the industrial one.
  • "Move fast and break robots." $10,000 of Unitree robot hands shattered in seconds during testing, and Nils calls it cheap tuition. His indictment of Western VCs is direct: they are too scared of hardware risk to compete with what China is already shipping.
00:00 Cold Open: Why AI Hasn't Started Yet
01:00 Welcome and the Great Reversal Thesis
06:35 70% of the Economy Still Runs on Atoms
10:48 How He Fell Into Robotics: A Tabletop Game
13:45 Augmented Reality Is the Original Human Language
16:22 Spread Good Memes: His North Star
20:00 White-Collar Workers and the Cushy European Job
23:14 The Six Layers of a Useful Robot
25:27 Beijing Has More Cars Than People in LA
27:16 Why Robots Wake Up at Coordinates 0, 0, 0
30:57 Naval's Fifth Protocol and the Missing Sixth
33:27 Pyramids Lost in Beijing Traffic Every Week
35:04 Three New Dimensions of the Internet
39:41 Drone Delivery and Apartment 30C
42:43 Glasses Will Delay Humanoid Robots
49:45 Battery as the Hidden Bottleneck
51:23 The 43-Gram Mentra Glasses
55:25 Closing Sweden's Largest Retailer
58:53 Store Manager Robots, Not Store Workers
1:02:13 Why Sweden First, Not China
1:05:46 7 Years in Beijing, 6 Months Back in Sweden
1:09:13 "Developing Country" Versus "Developed Country"
1:12:54 Move Fast and Break Robots
1:13:41 $10,000 of Robot Hands Shattered in Seconds
1:17:32 Be an Agent of Change or Get Replaced
Nils Pihl during the Asiabits Podcast recording

The AI Revolution Hasn't Started Yet

Nils Pihl opens the episode with a line that sounds like a provocation and turns out to be a thesis. "The AI revolution has almost definitionally not begun until we get physical AI." His argument is structural, not rhetorical. Roughly 70 percent of the world economy is still tied to physical locations and physical labor. Bits-to-bits AI moves text around. Atoms are where the real GDP lives. Until software can pick something up, walk it across a warehouse, restock a shelf, or unload a truck, the revolution is on paper.

"70% of the world economy is still tied to physical locations and physical labor. Digital devices don't know where they are in the world. They don't know I'm sitting in this chair. They don't know I'm on this floor. They barely know I'm in this building."

That gap, the gap between an AI model and a piece of the world it can actually touch, is what Auki Labs is trying to close. Nils calls the missing layer the Posemesh. In the simplest framing, it is GPS for robots. In the longer framing, it is a decentralized spatial protocol that gives every device, robot, and pair of AR glasses a shared coordinate system anchored to physical space.

From Sweden to Beijing, Seven Years In

Nils grew up in Sweden. He moved to Beijing for what was supposed to be a three-month visit and stayed for seven years. After a brief detour back to Sweden, he relocated Auki Labs to Hong Kong to stay inside the robotics ecosystem that, in his view, is currently the only one moving fast enough to matter. The Posemesh idea grew partly out of a side project: a tabletop game whose mechanics needed devices to agree on where they were in a room. The same primitive turned out to be missing from every spatial-computing stack he looked at.

He frames the bet in personal terms. Europe, he argues, is structured around protecting cushy white-collar jobs. Asia is structured around shipping. The first will be hollowed out by AI agents inside the next decade. The second is already absorbing humanoids and AR wearables into actual production lines. His own move from Stockholm to Beijing to Hong Kong tracks that thesis.

"I went to visit Beijing for three months and ended up staying seven years because I realized when I arrived in Beijing that we are still living in history. History is still happening."
Nils Pihl at the Asiabits podcast studio

Why Every Robot Wakes Up at Coordinates 0, 0, 0

Most people never notice the most embarrassing problem in robotics. Put two humanoids in the same room and ask each one where it is. Both will answer the same thing: "I am at coordinates 0, 0, 0." Neither knows about the other. Neither knows about the room. They each carry a private map glued to their own startup position, and there is no shared frame between them.

Nils uses Naval Ravikant's framing of "the missing fifth protocol" as a launchpad and then adds his own sixth. The internet, in his telling, is incomplete in three new dimensions: an internet of spaces, an internet of sensors, and an internet of actuators. The Posemesh is the spaces layer. Without it, every robot, drone, and AR headset has to rebuild the world from scratch each time it powers on. That is a tax civilization quietly pays in lost hours.
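To make the shared-frame idea concrete, here is a minimal sketch of what it means for two robots to stop insisting they are both at the origin. It is not Auki's Posemesh implementation. It assumes, purely for illustration, that each robot has somehow learned where its own startup origin sits in a common venue frame (the names Pose2D, local_to_shared, and shared_to_local are hypothetical), and it works in 2D rather than 3D to stay short.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose2D:
    """Where a robot's private origin sits in the shared venue frame (assumed known)."""
    x: float      # metres
    y: float      # metres
    theta: float  # heading in radians, counter-clockwise from the shared x axis


def local_to_shared(origin: Pose2D, point: tuple[float, float]) -> tuple[float, float]:
    """Rotate and translate a point from a robot's private frame into the shared frame."""
    px, py = point
    c, s = math.cos(origin.theta), math.sin(origin.theta)
    return (origin.x + c * px - s * py, origin.y + s * px + c * py)


def shared_to_local(origin: Pose2D, point: tuple[float, float]) -> tuple[float, float]:
    """Inverse transform: express a shared-frame point in a robot's private frame."""
    dx, dy = point[0] - origin.x, point[1] - origin.y
    c, s = math.cos(origin.theta), math.sin(origin.theta)
    return (c * dx + s * dy, -s * dx + c * dy)


# Hypothetical startup poses: both robots call this spot (0, 0) in their own maps.
robot_a = Pose2D(x=2.0, y=1.0, theta=0.0)
robot_b = Pose2D(x=5.0, y=4.0, theta=math.pi / 2)

# In the shared frame the two "origins" are clearly different places.
a_self = local_to_shared(robot_a, (0.0, 0.0))
b_self = local_to_shared(robot_b, (0.0, 0.0))

# Robot A sees a spill one metre straight ahead; robot B can now locate it too.
spill_shared = local_to_shared(robot_a, (1.0, 0.0))
spill_for_b = shared_to_local(robot_b, spill_shared)

print(a_self, b_self, spill_shared, spill_for_b)
```

The hard part in practice is the assumption itself: estimating each device's pose in the common frame. The sketch only shows why everything else becomes simple arithmetic once that pose is available.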

"This robot will say, I am at coordinates 0, 0, 0. And this robot will be like, no, I'm at coordinates 0, 0, 0."

Pyramids of Time Lost in Beijing Traffic

The number that does the most work in this episode is the pyramid number. By Nils's best estimate, the human-hours it took to build the pyramids of Giza roughly equal the hours lost to Beijing traffic in a single week. Every single week, Beijing alone burns the labor required to build the pyramids. Multiply that out across megacities and the cost of imprecise spatial coordination becomes a civilizational tax.

Nils argues that fixing the spatial layer is a precondition for any of the obvious near-term applications. Drone delivery to apartment 30C does not work today because the drone has no way of knowing which window is 30C. AR glasses cannot pin a useful overlay onto a real shelf without a shared map. Humanoids cannot coordinate inside a store without agreeing on the aisles. These are not separate problems. They are the same missing layer.

Nils Pihl with Thomas and Michael

Closing Sweden's Largest Retailer

Auki Labs' first big enterprise customer is Sweden's largest retailer. The product is called Cactus AI for Retail. The deliberate choice, repeated several times in the conversation, is that Auki is building a store-manager robot, not a store-worker robot. The robot does not stock shelves and does not replace cashiers. It walks the floor, watches inventory and presentation, talks to the manager, and acts as a force multiplier for the human running the store.

That framing is part product strategy and part politics. Nils thinks AR glasses and store-manager robots will, paradoxically, delay the moment humanoids displace retail workers. The fastest way to make a human worker more valuable is to give them better spatial tools. The slowest path to a humanoid takeover is one where the existing workforce is augmented first. Sweden is the wedge market because the buyer is sophisticated, the regulatory environment is forgiving, and the unit economics make sense before scale.

"We're deploying store-manager robots, not store-worker robots. Augment the manager. Let the humans keep their jobs and do them better."

Move Fast and Break Robots

The most quoted moment of the episode is the one Ralph put in the cold open. During internal testing, Auki Labs broke roughly $10,000 worth of Unitree robot hands in seconds. Nils calls it cheap tuition. He uses the incident as a wedge into a longer point about Western venture capital. Western VCs, in his view, will not fund the kind of hardware experimentation that ships physical AI on schedule, because the loss profile of a single afternoon of testing scares them off. China-based teams treat broken hardware as a cost of learning. The result, he argues, is that the gap between deployed robots in Asia and on-paper robots in the West is widening every quarter.

"I accidentally broke $10,000 worth of robot hands in a few seconds. That's cheap tuition. Move fast and break robots."

His closing line is the one the production team kept for the trailer. It is also the cleanest summary of why he chose to build from Asia and not from Stockholm. "You can either choose to be an agent of change and exercise your agency and have an impact on history, or you can get replaced by people that are trying to change the world."

Nils Pihl

Founder & CEO, Auki Labs

Nils Pihl is the founder and CEO of Auki Labs, a spatial-computing startup based in Hong Kong building the real-world web, a protocol that gives robots, drones, and AR glasses a shared map of physical space. Born in Sweden, seven years in Beijing, now in Hong Kong. Auki's products include Posemesh, a decentralized spatial layer for robots and devices, and Cactus AI for Retail, already deployed with Sweden's largest retailer. Nils is also known for his work on memetic engineering and was writing and speaking about how ideas spread through populations long before founding Auki.

[00:00] The AI revolution has almost definitionally not begun until we get physical AI. 70% of the world economy is still tied to physical locations and physical labor. Roughly the time it took to build the pyramids, to our best estimate, is lost in Beijing traffic alone every single week. Digital devices don't know where they are in the world. They don't know I'm sitting in this chair. They don't know I'm on this floor. They barely know I'm in this building. This robot will say, "I am at coordinate

[00:31] 0, 0, 0." And this robot will be like, "No, I'm at coordinate 0, 0, 0." And you can either choose to be an agent of change and make decisions and exercise your agency and have an impact on history, or you can get replaced by people that are trying to change the world. I'm always impressed by all these great founders that we have on

[01:01] our podcast. They are so smart. And everybody's deep into hardware and tech here in Shenzhen, which is also obvious. Most of the reason why they come here is because it's so convenient to have all these manufacturers and suppliers within easy reach, right? Yeah, no place like Shenzhen. Exactly. That's why one of our guests also told us, "If you want to build hardware, if you want to build your product, you have to come here," right? But that's also one of the questions we get asked the most: how do I start? I mean, it's obvious. When I came here, there were still some barriers. China is still a closed ecosystem.

[01:32] And it's always the question of how do I find reliable partners, right? How do I know I can trust this business? We've helped so many startups and people source products, but it's still hard to navigate, right? It is. I mean, we have so many people coming to us asking the same questions all the time: how do I find business partners? How do we do contracts with them? How do we do the payments? And that's why we're very happy that we partnered up with WorldFirst, the sponsor of today's episode. They have this great system

[02:03] that solves a big, big problem: payment. So WorldFirst is originally a London startup and got acquired in 2019 by Ant Financial, which is the parent of Alipay. We all know Alipay, right? And they have this system now where 1.5 million businesses are already inside the system. And if you do business in and with China, they help you resolve a lot of your problems. They have real-time payment, no hidden fees, and you can do it inside the ecosystem. So it is safe, reliable, and fast. So if we have people watching

[02:35] this and they want to do business in and with China, we think WorldFirst is a really great choice, and we are very happy they partnered with us. We put the link in the description. If you're interested in doing safe, reliable business in China without any hidden fees, WorldFirst is your choice number one. You will find all the information in the description. Nils, welcome, welcome. Thanks for having me. So let's directly dive in. I mean, we already talked a lot off camera. What I'd be interested in: you have a long-term vision. How do you imagine the world in 10 years?

[03:08] How will we live? How will we interact with robots? I think that we're going through a period of time now that I like to call the Great Reversal. And what the Great Reversal is: for decades, we human beings have been disappearing into digital worlds, be it social media or computer games. But now during the Great Reversal, digital things are coming into the physical world instead, and they're doing so through robotics, physical AI, but also augmented reality. We are making the physical world accessible to AI now. And that's going to be very impactful for many reasons. Like economically, 70% of the world economy is still tied to physical locations and physical labor, both as a rough percentage of GDP

[03:38] and as a rough percentage of headcount, right? So the AI revolution has almost definitionally not begun until we get physical AI, because that's where most of what AI could do is. But I also actually think that number is going to grow. As it gets cheaper to operate atoms and do things in the physical world, we are going to want more things in the physical world. So I think 10 years from now, more than 70% of the world economy is going to be tied to the physical world. And perhaps paradoxically, more of it is also going to be tied up with artificial intelligence. I think it's likely that over the next 10 years we might see

[04:11] the world economy grow by a full order of magnitude, right?

[05:43] That the world economy in 10 years it'll be 10 times bigger than it is today. And that might sound crazy, but you actually only have to believe that the economy will grow three times faster than it grew in the previous decade for something like that

[06:15] to be true. The economy is already like 10 times bigger than when I was born in the 80s. And I think more and more of the economy is going to be wrapped up in AI to AI transactions.
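As a quick arithmetic check on what a tenfold decade would mean, the sketch below converts a growth multiple over a number of years into the equivalent compound annual rate. The function name annual_rate is hypothetical and the calculation is generic compound-growth math, not a figure from the episode.

```python
def annual_rate(multiple: float, years: float) -> float:
    """Compound annual growth rate needed to grow by `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0


# A full order of magnitude in ten years, as described in the conversation.
tenfold = annual_rate(10.0, 10.0)

# For comparison: the rate needed to merely double over the same ten years.
doubling = annual_rate(2.0, 10.0)

print(f"10x in 10 years needs about {tenfold:.1%} per year")
print(f"2x in 10 years needs about {doubling:.1%} per year")
```

Under these assumptions, a tenfold decade works out to roughly 26 percent compound growth per year, which gives a sense of how aggressive the forecast is.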

[06:47] I think an important part of the economy that we're starting to talk about now is the economy of electricity. We're talking a lot about data centers now, but I think ultimately what we will be talking about a lot is how to move electricity. So, like batteries, right? Because today,

[07:27] let's think about two technologies that are for sure coming and are for sure going to be very impactful on civilization. One is robots, obviously, and the other is augmented reality glasses. I very much believe that both of these technologies are going to be hugely,

[07:58] hugely impactful. And both of them have these very serious battery constraints, right? Like if you ask a humanoid robot to actually do something, it'll be out of battery in two hours. Right. And so range anxiety for robots is very, very real. And I think that

[08:28] more and more of the world economy will be running on batteries. Like more and more of the world's labor will be in robots and glasses that are running on batteries. And one of my most perhaps sci-fi visions of

[09:01] where we will be in 10 years is that we will be thinking about battery life much closer to how we're thinking about currency today. Like, how much battery am I willing to spend on this? How much battery am I willing to spend on this? And potentially that currency

[09:31] as we have it today is maybe not necessarily going to be as meaningful. I mean, I guess it's always true that whatever decade we're in is the most impactful decade because we've been on this exponential for a long time. But it's very clear to me that the world is

[10:02] going to be very different 10 years from now from where it is today. And the big theme is this great reversal that now the computers are coming into our world, the digital intelligence is coming into the physical world.

[10:34] And I don't want to gloss over augmented reality in this because that's where we have our roots and how we fell into robotics. I think that augmented reality is going to play a huge role in human to [...] And GPS is actually a line-of-sight technology. So it doesn't know where we are right now.

[11:04] It doesn't know I'm sitting in this chair. It doesn't know I'm on this floor. It barely knows I'm in this building. And that realization was, oh my God. Like, how are we going to get AR glasses and robots and things like this if we don't know where things are? And I realized there's a missing part of the tech tree and fell in love with that problem space. And what's great with augmented reality is it's essentially the problem of robotics if you remove the arms and legs, right? It's only the part of how do I, as a digital device, reason about my movement through space? So the way augmented reality works is literally the same way a robot reasons about moving through space using things like SLAM, simultaneous localization and mapping.

[11:36] So we realized that, okay, augmented reality is a great way to communicate with the robot. If you can get the phone and the robot into the same coordinate system, then I can actually, when I see this spill in the supermarket, I can just take out my phone, point at the spill, and tell the robot, come clean this up. And because the phone and the robot are in the same coordinate system, it can understand. Last year, I think November or something like that, I think we were the first team in the world to demonstrate that we could communicate with a robot using glasses. So we had a pair of glasses that had a camera, and the camera was streaming to essentially a robot movement server that analyzes the camera feed for understanding how we're moving through space.

[12:07] So that server could keep track of the precise position of the glasses. That meant that when you say, hey, robot, come help me, the robot understood precisely where we were. It's like, oh yeah, I'll come to where the glasses are, because I understand where the glasses are. So augmented reality is a very helpful way to communicate with the robots, but also, you know, spiritually, a very helpful way to communicate with humans, right? Human language is the original augmented reality, right? A story I used to tell all the time was, if we were to imagine that we're walking through a forest together, and we come across a fallen tree, and I point at that fallen tree, and I say, hey, look at that beautiful couch. Something very interesting happens in our collective psyche when I do that, because now we see the "sittingness" of this fallen tree, right?

[12:37] And that may not have been in our consciousness before. What language allows us to do, which is incredibly powerful, is that we can manifest our imagination in the minds of other people, right? It allows us to do intercognitive computing, in a sense. We want to make sure that we have the same state across our minds, and language allows us to do that. And part of making language better and better is finding better compression for language. Every industry, every hobby develops its own kind of language because it compresses how many words I need to say for you to understand me, right? And augmented reality is an incredibly powerful way to compress information. Consider, for example, again, the supermarket scenario. If you're the store manager and you have a store associate, and there's a task that you want to leave for him today, what you would have to do is explain in words where this task is. Like, oh, it's in this aisle next to blah blah blah, let's say there's a broken fixture or something, right? You're going to have to spend so many words explaining precisely where this is.

[13:10] Or you're going to have to walk them over there. Or you leave a piece of digital information in physical space, right? Augmented reality, I super believe, is the future of human language. I think 10 years from now we'll be able to sit on a balcony here in Shenzhen. We're all wearing the glasses and across there's another building and we see something funny in one of the specific windows. And using augmented reality, I'll be able to put a marker in your field of view, precisely which window it is, right? Like, look, that one. And it just appears in your field of view. When you can do that and when you can

[13:41] project your imagination into someone else's field of view, that's as close as we can get to telepathy without literally messing with the brain. So this is the great reversal. The robots are coming, the digital things are coming, but also we are going to start using digital intelligence [...] that was very crucial. It was very dangerous for my parents' marriage because they also were. Yeah, they were fighting. But this is super interesting because we don't

[14:17] need it anymore. Now we have like GPS and it tells us where to go and what to do. So, yeah, the GPS works pretty great outdoors, a lot of the time.

[14:50] So this is actually how I fell into robotics. I wanted to make augmented reality for my favorite tabletop game. I didn't know anything about augmented reality at the time, and what I imagined was like, hey, here's this tabletop game. And I want us both to be able to see augmented

[15:23] reality overlays. And since I didn't know anything about GPS or spatial computing, I just kind of assumed that, like, oh, you know, the device knows where it is, right? It turns out,

[15:53] no, in hindsight it should have been obvious. Like digital devices don't know where they are in the world.

[16:24] So this is the great reversal. The great reversal is that the robots are coming, the digital things are coming, but also we are going to start using digital intelligence to talk to each other through augmented reality. Yeah, that's a very precise way to collect a lot of data too. Right. I remember last month there was something going super viral on X, like Rent a Human, where AI agents now use humans in order to let them do some jobs and also collect data. Right.

[16:54] So yeah, it's funny how this whole industry is evolving now, how we interact with humans, humanoids, or robots. It's crazy. The question is: is this funny or scary? It's both. The window example, for instance. It's both, right? Like every time that we get better at communicating, we get better at telling the truth,

[17:24] but we also get better at lying. Right. So it depends a lot on who we are as a civilization and where our collective psyche is at. Like AI and video generation. Is this good or bad? The Internet, social media. Yeah, it depends very much on who we are as people. Right. I'm very excited that we have the Internet. I'm also very scared of the Internet. I'm very excited that we have AI.

[17:55] I'm also very scared of AI. It's very powerful technology and it's happening a lot faster than our introspection is. We don't spend enough time thinking about how we interact with this technology and what it means to be a man, a citizen, or, you know, what kind of responsibilities we have to our family, to our state, things like this. In a world where we

[18:26] are all so powerful, I wish we would spend more time just thinking about it. Yeah, like what does it mean to be a good man this decade? Like, what does responsible information consumption look like? What does responsible information propagation look like? My personal motto in life, my North Star, is three words: Spread good memes.

[18:59] Right? Just be mindful of how I interact with the information landscape, right? If a meme is a piece of copyable behavior, I need to think about what behaviors am I copying, what behaviors am I putting out that other people may copy? Am I being mindful of my input and output? And I think one of the most meaningful things to do as a human being, because I think what makes humans

[19:29] so unique is that we have this memetic landscape. We have language, we have culture built on language. Everything that makes us truly unique is made of language, as the late great Terence McKenna used to say. So if we really want to lean on our humanity in the age of AI, I think start with spreading good memes. Think about how you consume information, how you put information out there. So what does it

[20:00] mean for a white-collar worker? I mean, you talked about it; you said a lot of white-collar jobs will be dead or are already dead. Actually, like now in 2026, I would be very scared if I were a white-collar worker. What does all this mean for them? I think it's very important to exercise agency right now. You have to be a self-starter. Like, there are so many problems worth solving in the world, but less and less will there be someone telling

[20:34] you what to do and paying you to do it. More and more you're going to have to find a problem yourself, and you'll also be more capable than ever to solve it yourself. So you need agency. The era of having a cushy European job with very little output any given week, you know, like "Oh, I'm just doing some data entry or whatever," that's clearly gone. And that's both good and terrifying. You know, everyone needs to

[21:05] know how to feed their family. Obviously, we all also need to find a way to actually contribute to society, right? And now we have more power than ever to contribute, but also more pressure than ever to contribute. That's what I find the most super scary. Like, you really have to do something. You can't be too comfortable, like we talked about it, right? If you get too comfortable with life and with your achievements from the

[21:36] past—you look at some countries in Europe—it's very dangerous. See, I am worried how employable I will be five years from now, but maybe employment is not the right way to think about it. It's more about what can I do that other people will be willing to pay me for? I think companies will shrink. Right, but in terms of headcount, yeah, companies will shrink in terms of headcount. They've already stopped growing.

[22:06] I feel like there's an endless amount of problems that humanity has. There's a lot that you can do. You're just going to have to be self-guided and agentic yourself. If not, you're going to have a tough time. The problem you identified is GPS, right? GPS has some restrictions. I'm very interested in this because for me, as a non-expert, I would say GPS is an amazing thing. And even if you think of AirTags, for example, in your phone or in your headphones, they also work with GPS, right? They work with GPS and Wi-Fi trilateration and ultra-wideband trilateration. They're using a lot of different techniques to position them. Like when you find your AirTag

[22:36] between the couch cushions, that's using ultra-wideband to do that. Whereas when you have forgotten your AirTag somewhere and you see it on a map, it may have used Wi-Fi, a mix of Wi-Fi and GPS to correctly place the thing, but that last, you know, couple of centimeters of where is it— that's using ultra-wideband, that's not using GPS. So then you said, okay, your mission is to solve this problem. We want to make sure that devices—machines—can have an improved shared understanding of the physical environment so that they can coordinate with

[23:08] others but also with themselves over time, right? If you think of robotics, you could say that robotics is made up of six layers that are very important. One layer is locomotion, the ability for the robot to traverse the environment. And then there's manipulation, the ability for the robot to affect the environment. But even if you had a robot that knows how to grab things and knows how to traverse the environment,

[23:39] you still can't ask it to go empty all the trash cans here in this building. Because to do that, it's also going to need perception. More specifically, it's going to need spatio-semantic perception. Spatial, meaning it understands the difference between something being far away or close, which is not a given, right? Like, this is something you really have to teach computers to do well. And semantic reasoning, meaning it needs to understand the difference between a trash can and a baby, right. And once you have those first three things, locomotion, manipulation, and perception,

[24:10] now you have the first robot that you could in theory ask to go empty all the trash cans, but it's going to bounce around like a first-generation Roomba looking for trash cans until you give it mapping. So what is mapping? Mapping is just a memory of perception, like, where did I see things before? That's what a map is, right? Once you have mapping, then you need to also understand where you are in relation to the map. Like, where am I on the map? Sure, I know that there are trash cans here on the map and a trash can here on the map. But where am I on the map again, you know? 60% of people younger than me can't

[24:40] read a map. Robots don't read maps, right? So you have to give them positioning. And outdoors they get positioning from GPS with a few meters of accuracy, but indoors they don't. And then finally, the sixth layer, the deployment applications layer. You need to tie all of these capabilities together to get the robot to actually do something. And what we saw was that almost all attention and money is going towards locomotion and manipulation, and very little is going towards solving perception, mapping, and positioning. And what we want to do is work on collaborative perception, collaborative mapping, and collaborative positioning so that multiple devices can have a shared understanding of the environment. And to that end, we're building this protocol we call the Real World Web, which is essentially the Internet for physical spaces—making physical spaces browsable, searchable, and navigable to robots and AI.
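The description of mapping as "a memory of perception" combined with positioning can be sketched in a few lines. This is a toy illustration, not the Real World Web protocol: the class name SemanticMap, its methods, and the coordinates are all hypothetical, and it assumes perception has already produced labeled sightings and positioning has already produced the robot's location in the same frame.

```python
import math
from collections import defaultdict


class SemanticMap:
    """Mapping as a memory of perception: remembers where each label was seen."""

    def __init__(self) -> None:
        self._seen: dict[str, list[tuple[float, float]]] = defaultdict(list)

    def remember(self, label: str, position: tuple[float, float]) -> None:
        """Store one perceived object (assumed already labeled by the perception layer)."""
        self._seen[label].append(position)

    def nearest(self, label: str, robot_position: tuple[float, float]):
        """Use positioning: return the closest remembered sighting, or None if unseen."""
        candidates = self._seen.get(label, [])
        if not candidates:
            return None
        return min(candidates, key=lambda p: math.dist(p, robot_position))


# Hypothetical sightings recorded on an earlier pass through the building.
venue_map = SemanticMap()
venue_map.remember("trash can", (1.0, 8.0))
venue_map.remember("trash can", (6.0, 2.0))
venue_map.remember("baby", (3.0, 3.0))

# Without positioning the map is useless; with it, the robot can pick a target.
robot_position = (5.0, 1.0)
target = venue_map.nearest("trash can", robot_position)
print(target)
```

If several devices wrote into and read from the same map in the same coordinate frame, one device's sightings would become another's memory, which is the collaborative part of the idea described above.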

[25:11] And on top of the Real World Web, we've started building some successful applications of our own. But we also enable other robotics companies to build better products on top of the Real World Web themselves. This is a crazy task. I mean, it took hundreds and thousands of years to map the world as we now know it. And now your task is you want to map every building, every space in the world. Even if you take this building we're sitting in, it's like, I don't know, 30, 40 floors and hundreds of offices. This would be like a huge task. And you lived in Beijing before? Beijing has more cars than people live in LA, for example. So this is— how do you tackle this? So an important distinction is that Auki is not trying

[25:41] to build one big map of the world. Just like the Internet is not one big website, right? We make it possible for any place in the world to make their own local physical website, right? So you can choose to map this space and have it on your hardware,

[27:48] just like you could host your website on your server, right? That's why we call it the Real World Web. It's really like the Internet. It's not that we are trying to build one big website that contains the map of the entire world. We are making a protocol for robots to move between local maps of the world. So if I have a map of my space and you have a map of your space, a robot will be able to navigate both my space and your space by talking to the Real World Web.

[28:19] So our job is not to map the world specifically. Our job is to build a protocol that allows any kind of device capable of mapping the world or needing maps of the world to find each other and collaborate with each other. So how does it actually work? If we have two robots here in this building, they are responsible for all the bins, and one does floors one to 12 and then needs

[28:49] to go and charge the battery. How does this robot then in the future tell the other robot, "The last trash bin I emptied was in 1204, in the boss's office. You continue from there"? So the way you just did it was purely semantic, right? Like you didn't use any map information. You just said this floor and you gave a room number. That's a semantic description. And a semantic description

[29:21] is not super helpful unless you already have a sense of how to navigate the space, right? Just because you know it's room 1204 or whatever, if you don't know where room 1204 is, you're going to be looking for room 1204. What you want is for both robots to have a shared map of the space, a shared coordinate system, so that when one robot says 1204, the other robot immediately understands what that means and already knows how to get there,

[29:51] right? So that's what the Real World Web allows. The first problem we tackled was, before anything else, how do we at least get you into the same coordinate system? Right? Because that's not free, right? The devices don't exist in the same coordinate system today. Anytime a robot wakes up and starts doing SLAM (simultaneous localization and mapping), it invents a single-use coordinate system every time.

[30:22] Meaning that if one robot wakes up here and one robot wakes up here, this robot will say, "I'm at coordinate 0,0,0," and this robot will be like, "No, I'm at coordinate 0,0,0," right? How do you get them into the same coordinate system? That was the first problem that we tackled. Once you have that—once they are in the same coordinate system—then you can start building shared maps. Then you can start putting digital information relative to physical space.
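
The "both robots are at 0,0,0" problem can be made concrete with a small sketch. This is not Auki's actual method. It assumes a simplified 2D world in which both robots observe the same physical landmark (for example a printed marker) and each reports the landmark's pose in its own SLAM frame; the shared coordinate system is then simply the landmark's frame. The names `Pose2D`, `compose`, `invert`, and `to_shared` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose2D:
    """Position (metres) and heading (radians) in some 2D frame."""
    x: float
    y: float
    theta: float


def compose(a: Pose2D, b: Pose2D) -> Pose2D:
    """Re-express pose b, given relative to frame a, in a's parent frame."""
    c, s = math.cos(a.theta), math.sin(a.theta)
    return Pose2D(a.x + c * b.x - s * b.y, a.y + s * b.x + c * b.y, a.theta + b.theta)


def invert(a: Pose2D) -> Pose2D:
    """Inverse rigid transform of pose a."""
    c, s = math.cos(a.theta), math.sin(a.theta)
    return Pose2D(-(c * a.x + s * a.y), s * a.x - c * a.y, -a.theta)


def to_shared(landmark_in_local: Pose2D, pose_in_local: Pose2D) -> Pose2D:
    """Express a robot-local pose in the shared frame attached to the landmark."""
    return compose(invert(landmark_in_local), pose_in_local)


# Each robot believes it sits at the origin of its own single-use frame.
origin = Pose2D(0.0, 0.0, 0.0)

# Assumed observations: the landmark is 2 m straight ahead of robot A,
# and 3 m to the left of robot B (rotated 90 degrees in B's frame).
robot_a = to_shared(Pose2D(2.0, 0.0, 0.0), origin)
robot_b = to_shared(Pose2D(0.0, 3.0, math.pi / 2), origin)

print(robot_a)  # A is 2 m behind the landmark along the landmark's x-axis
print(robot_b)  # B is 3 m behind the landmark, so the robots are 1 m apart
```

Once both poses are expressed relative to the same landmark, the two robots no longer disagree about where the origin is, which is the precondition for the shared maps discussed next.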

[30:53] So when we talked about the good and the bad of the Internet, you read something from Naval and said, "Okay, this is something that was a wake-up call for you, and where you said, 'Okay, I have to tackle this problem. I have to build a business out of it.'" So yeah, in 2014, Naval Ravikant wrote a blog post that it seems he's since taken down, which I'm a little confused by. But it argued that the Internet was missing a fifth protocol—a way

[31:25] to deal with monetary value, a way to transact between machines, like agent-to-agent payments. And he was saying this already in 2014, which is pretty wild. And what he used as an example that really resonated with me at the time—because I was living in Beijing—is he said, "Well, imagine a world where everyone has self-driving cars, and these self-driving cars have to negotiate the use of a scarce resource: the road.

[31:56] There's limited space on the road, and they want to do something like a lane merge. How do you get two AIs to agree who should let who go first? Like, how is that resolved? Especially if they are from different manufacturers running different logic, how can we make this possible?" And what he envisioned was there's going to be a layer of the Internet that he called the fifth protocol that will allow devices

[32:29] to pay each other. So one car will say, "I am willing to pay $0.01 to cut ahead of you right now." And the other car is like, "Yeah, I'll take that. I'll take one cent, sure." Right? The idea was that if you can put some kind of numerical value to your priorities so that you can communicate priorities across different agents, that is super helpful. And if those priorities actually map to the economic interests of the agent

[32:59] or the agent's owner, that makes it even better. So we need to build this fifth protocol. And what I realized several years later is like, yes, there is a fifth protocol missing, but there's also a sixth protocol missing, which is: how do these devices talk about space? Right? You preempted this story by bringing up Beijing has more cars on the road than people in Los Angeles, right? So just for the listeners to get a sense of scale, right? There are millions and millions of people commuting by car in Beijing on any given day. And the average commute is close to two hours.

[33:29] And if you do a quick back-of-the-napkin calculation, it means that roughly the time it took to build the pyramids to our best estimate is lost in Beijing traffic alone every single week, right? So if we want to imagine us getting back some percentage of that lost human productivity, we want to imagine our self-driving cars able to coordinate with each other, right? To say that, okay, I'm driving this fast, here's precisely where I am. And now that you know that, maybe you want to slow down just a little bit so we don't hit each other at the intersection, etc. But to do that, they need to be able to communicate very, very precisely about where they are in space and very, very precisely about how quickly they're moving, etc.
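
A toy version of that exchange can be sketched as follows. It assumes each car broadcasts only its remaining distance to the intersection and its speed, that both drive at constant speed, and that the car calling `yielding_speed` is the one that gives way. The names `CarState`, `arrival_time`, and `yielding_speed`, and the 2-second safety gap, are illustrative assumptions, not part of any real protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CarState:
    distance_m: float  # distance left to the intersection, metres
    speed_mps: float   # current speed, metres per second


def arrival_time(car: CarState) -> float:
    """Seconds until the car reaches the intersection at constant speed."""
    if car.speed_mps <= 0:
        raise ValueError("speed must be positive")
    return car.distance_m / car.speed_mps


def yielding_speed(me: CarState, other: CarState, gap_s: float = 2.0) -> float:
    """Speed `me` should hold so the two arrivals are at least gap_s apart.

    Assumes `me` is the car that yields; deciding who yields is a separate question.
    """
    t_me, t_other = arrival_time(me), arrival_time(other)
    if abs(t_me - t_other) >= gap_s:
        return me.speed_mps  # already far enough apart, keep going
    return me.distance_m / (t_other + gap_s)  # arrive gap_s after `other`


me = CarState(distance_m=100.0, speed_mps=10.0)     # would arrive in 10 s
other = CarState(distance_m=90.0, speed_mps=10.0)   # would arrive in 9 s
print(yielding_speed(me, other))  # slightly below 10 m/s
```

The sketch also shows why precision matters: at 10 m/s, every metre of position error shifts the computed arrival time by a tenth of a second, so a few metres of positioning error is a meaningful fraction of the safety gap.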

[33:59] And there's no rails on the Internet to do that today. Like, GPS is not precise enough for that today. And there's no spatial communication channel where they know how to find each other and tell each other these kinds of things. So yeah, I think that's a missing part of the Internet. In fact, I think there are three missing pieces of the Internet that are going to start getting built out over this next decade. So you can think of it as three new dimensions that the Internet is growing into. So one is an Internet of spaces, an Internet of maps, essentially. We've been talking about this already, right? The next layer is the Internet of sensors. How can one AI look at the world through another machine, right? How can my self-driving car look through the cameras of another car when it goes around a corner so that I can figure out: do I need to slow down? Is there anything on the road? That's the Internet of sensors.

[34:30] You might want your agent to be able to log in to different public CCTV cameras, things like this. Pay a small amount to the local government for borrowing the camera to look for your missing child or whatever. So the Internet of sensors. And the third is the Internet of actuators, right? The Internet of robot hardware, which you can think of as like teleoperation for robots, like robots teleoperating other robots, if that makes sense. Like AI teleoperating other robots. So that when you tell your agent, like, hey, I really want you to buy my wife this really rare purse that's on sale right now in New York, I can't buy it online. The agent goes, no problem, I'm just going to sign in to this nearby humanoid robot, I'm going to pilot that humanoid robot, I'm going to buy the bag and I'm going to drop it off at the mail and then I'm going to sign out of the humanoid robot, right?

[35:02] That's the Internet of actuators. So I think the Internet is going to get these three new dimensions that will just be a permanent part of civilization after that. The Internet of spaces, the Internet of sensors, and the Internet of actuators. This is what we're building towards with the Real World Web. We started with the spaces, and now we are increasingly adding support for signing into sensors and signing into hardware in general. So you said if I would be the management of this building, I would buy some robots to do the cleaning, whatever, then you would help me teach them to communicate with each other, to set up everything. But how would it work for private households? Because people have the dream. I talk to people that say, oh, in

[35:32] the morning I need a coffee, I'm still in bed on the weekend. And then I just call a robot service from outside, and the robot knows what is my favorite cup, where are my beans? And then the robot will come to my house, make my coffee, bring it to my bed, and then leave and go to other people's houses. How would that work? Well, today the robot would have a tough time finding your apartment in a building like this, right? And if you... It's actually even more fun to think of it as a drone delivery problem, because let's say you live in apartment 30C, right? So on the 30th floor, apartment C, there is no way today for a drone to know which apartment that is from the outside.

[38:10] Why? Because there's no public registry of which floors this building skips. Like, a lot of buildings here will skip the fourth floor, some will skip the 14th and the 24th, but not all, right? Almost everyone skips the fourth. Some skip the 14th, some skip the 24th. Right. The drone doesn't know. And also which one is A, B, and C facing which direction? Right. So how would you explain to a drone where your balcony is? How would

[38:43] you do that? Right. You need to be able to express it in some way that the drone can understand. You could do it in a GPS coordinate system, but the GPS reading for the drone is not going to be accurate enough. GPS has an accurate coordinate system, it just doesn't have accurate positioning. So you can describe this precise location on the table right here, in theory, in the GPS coordinate system. The problem is, how did you make that measurement? Does that make sense? Right. How accurate was your measurement?
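
The skipped-floor problem above can be sketched in a few lines. The sketch assumes floor labels start at 1 at ground level and that the set of skipped labels is known for the building, which is exactly the information the drone lacks today. The function name `physical_story` is hypothetical.

```python
from typing import Iterable


def physical_story(label: int, skipped: Iterable[int]) -> int:
    """Map a floor label to the physical story, counting from 1 at ground level.

    `skipped` holds the labels this particular building leaves out.
    """
    skipped_set = set(skipped)
    if label < 1:
        raise ValueError("floor labels are assumed to start at 1")
    if label in skipped_set:
        raise ValueError(f"floor {label} does not exist in this building")
    # Every skipped label below this one shifts the physical story down by one.
    return label - sum(1 for s in skipped_set if s < label)


# The same label lands on different physical stories in different buildings.
print(physical_story(30, []))           # 30
print(physical_story(30, [4]))          # 29
print(physical_story(30, [4, 14, 24]))  # 27
```

Apartment "30C" can therefore sit anywhere from the 27th to the 30th physical story in these three cases, which a drone counting balconies from the outside cannot resolve without building-specific data.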

[39:16] So what will happen, I think, is your building will create a publicly accessible map that they either store on someone else's cloud or they store it on their own machines, just as they do with their website. You know, like maybe they self-host, maybe they put it on a cloud somewhere. That's up to them. Right? But they are going to create a physical website. We call it a domain. Right. They're going to create a domain of the building so that when the delivery robot or delivery drone shows up, it can authenticate against that map and be like, okay,

[39:48] I'm going to Thomas's apartment. Now I understand how to get to Thomas's apartment because this building had a map of itself. But that's also, like, super scary, right? Like, most of the people that I talk to, when I tell them, okay, I moved to China, it's like, aren't you afraid of all these cameras? They take your data and they do whatever. I don't know. But then this is even a step further, right? If you think of it, okay, I'm living now in this building and there's a map

[40:19] from this building. And like, somewhere someone actually knows exactly how to enter my apartment. What is the concern? I think today there are already, you know, like, fire escape maps and things like this. And someone already knows where you live. The problem is the people that you want to know where you live don't. Right? You want the delivery driver to know where you live. You want the delivery robot to be able to get to you, and they don't know how to do

[40:49] that today. Like, if you're worried about some intelligence agency or organized crime or something like that, they have no problem finding out, right? It's the robots that need help. And that's where we come back to the conspiracy theory that all the delivery, especially in China, it's all a big data collection program. But it kind of makes sense, right? If I order my milk tea to 30C, then the guy, he will come, and then maybe

[41:20] he can film the elevator. He can also already map: Is there a 14th floor? Yes or no? Then he puts the milk tea in front of my door, he takes a picture and calls me. Yeah. So, yeah. And you also talk a lot about human and robotics cooperation, how they work together. And you talked about glasses, right? Glasses are very important for this. Can you elaborate a little bit on this? Yeah. I think in the very near future, the majority of humans will wear glasses, just like,

[41:52] you know, most, at least in the developed world. Like, most of us have phones and laptops and we're going to have glasses. The reason why is because increasingly we are going to work with AI, and AI is going to help us be the best versions of ourselves. And what glasses allow the AI to do is to hear what we hear and see what we see so that it can always give us, you know, contextually relevant recommendations. So I actually think glasses

[42:24] might be the biggest delay to robots taking our jobs, because a human with glasses will be able to do their job so much better than a human without glasses that the ROI on the robot gets pushed into the future, right? So if you think of something like a car mechanic today, right, they know how to use a certain set of tools. Maybe they're a licensed Toyota car mechanic, and they know their way around a Toyota very well,

[42:55] but maybe they don't know their way around a BMW and don't feel comfortable repairing a BMW. Put a pair of glasses on them and an AI that knows every car, and now this human who knows how to operate the tools and follow the instructions from the AI can now repair anything. A washing machine, a spaceship, or surgeries, for example. Well, I think surgeries are tricky because you need very good fine motor skills. But maybe you could get Warhammer painters or something like that to become surgeons.

[43:27] But maybe it's tricky now, but not in the future when they will. Because I saw some videos of surgery robots doing it with a lemon or with a fruit or something, and they did very, very precise stitches. The robots are getting very, very good at it now, right? You know, there's a lot of robots building luxury cars, and it's not because they're cheaper than human workers. It's because they are better than human workers at getting consistent results, right? And if you make a high quality car, you want consistent results, right? You want the gap between the door to be exactly 1.5 millimeters on every single car. And a robot or, you know, an army of robots is better at getting that result than humans are. Because you can teach robots today to be better at some manipulation tasks than most or even all humans,

[43:58] but humans are very good at general purpose manipulation, right? Like we're not all good enough to be a surgeon. Like we're not all skilled enough with our hands to be surgeons. Even if someone told us exactly what to do, some of us have too shaky hands and things like that, right? But almost everyone can perform almost every task. They just don't know how to do it. And AI will show us how to do it. So think of, for example, retail workers, right? Retail workers with these glasses will have a perfect understanding of where every product is,

[44:28] what needs to get done, what am I doing next, and will just be way, way more productive, but also happier, like having a better time at work. And, you know, if the human being gets 50 or 100% more productive, which I think might actually be a low estimate for some of these low skill and low motivation jobs. I think it's possible, you know, that with good gamification and good AI, good glasses, your typical retail worker will be three times more productive than they are today. Well, that really makes it a lot harder to replace them with a robot. Right. Because all of a sudden a robot today is like a state of the art retail robot is maybe 10% as good as a human. We're still pretty far away.

[44:58] So if the glasses just make us 300% better over the next five years, then that's going to delay the coming of the robots. I think almost everyone's going to wear glasses because they're just going to make us so good at everything that we want to do. Yeah, it's also with the AI agents and everything right now. Like as soon as I feel disconnected from the Internet, I feel like, what am I going to do now? Anxiety. Like I cannot continue my task. Like I feel super unproductive. Like if I'm wearing those glasses and they run out of battery, I would, you know. Yeah. Imagine a pair of glasses

[45:28] that as you're on your way back home from work and you see the grocery store, it reminds you that, hey, you promised your wife that you would get milk actually. Amazing, right? We will be better husbands, we will be better workers, we will be better parents. The glasses like AI will just help us be the best version of ourselves. And this is what we saw with white collar work too, you know, like it didn't take all the programming jobs, all the lawyer jobs, but every good programmer uses AI now, and every good lawyer uses AI now. And they are better at their job because they're doing that. And that's going to be true for everything in the physical world as well. Doesn't matter if you're stitching together footballs in Pakistan or disassembling UFOs in Area 51.

[45:59] Right. Doesn't matter. You put on the glasses and you're going to be better at that job. There are already a lot of glasses on the market right now, but you don't see anyone wearing them besides some nerds. So what do you think? When will they actually be part of our everyday life? So one of my very best friends runs a glasses company, and he blew my mind with a, in hindsight, very obvious realization. He said that glasses were invented roughly 800 years ago,

[48:03] but it wasn't until the 1920s or so that people started wearing glasses all day. Why? Well, it wasn't until the 1920s or so that material science had gotten good enough that you could make glasses that were around 40 grams. It turns out the human face doesn't deal

[48:36] well with weight over that. And so if you try to put on a pair of AR glasses that weigh like 100 grams, they're going to end up in your pocket. Like the Apple glasses. Right. That's also why they discontinued them— they need to be light enough. So my friend just put out a pair of fully open-source programmable camera glasses

[49:15] that are only 43 grams. They're comfortable enough that you can actually wear them all day. They just launched them just under a month ago. As soon as I've made the apps that I want for these glasses, I'm gonna wear them. I think the big challenge—to throw back to our earlier

[49:45] conversation about battery economy— is these kinds of glasses with that form factor maybe have an hour of battery. Right. But luckily here in Asia, things like neck fans are pretty popular, and there are neck batteries. So I'm definitely

[50:17] getting a neck battery for my glasses, and then I'm gonna have my own personal AI co-pilot that reminds me of everything. You know, I'm never going to forget anyone's name anymore. I'm not gonna

[50:48] forget to do the grocery shopping. I'm not gonna miss my appointments, and I'm gonna record your meetings and everything. I'll be a better human because of it. Yeah. So we need to get the glasses under 40 grams. They need to be under 45 grams.

[51:18] I mean, the lighter, the better. The lighter, the better. The Mentra glasses are 43 grams now, and you can wear them all day. If they were 35 grams, then, you know, obviously even better. And they are getting smaller and smaller, but battery is

[51:48] going to be a big concern. In fact, battery was one of the first problems that I realized about augmented reality, and you know, the not too distant future, that made me want to build the Real World Web as a big civilization-scale thing. Because before then I was just kind of working on simple techniques for doing shared augmented reality,

[52:24] and there was no civilization-scale infrastructure play to it. And what I realized was when I was talking about it in 2021, but now people see it like, "Oh, yeah, energy management is going to be one of the big questions of our time." And I think even for the robots, the ability for the robot when it goes to the mall shopping for you to offload some of its compute to local compute resources in the mall in exchange for a bit of money— so it gets longer battery life—it's just like a no-brainer

[52:54] economic decision for everyone, right? You're happy to pay half a dollar to give your robot another hour's worth of battery so it can finish your tasks for you. And half a dollar is a good markup on the electricity for the mall, and they now make a little bit of money. Robots don't look at ads. It's going to be a problem for our economy. Robots don't look at ads. So how are we going to monetize our retail spaces? Well, we're going to monetize it by providing maps

[53:24] and context and things that the robot needs to perform its job. So things like visual positioning systems that allow augmented reality navigation and robot navigation have been solved for a pretty long time, but there hasn't been a good economic model for it. Like, there are malls out there that have visual positioning systems now, but they're not open to your robot. Your robot can't connect to it because they built that visual positioning system for some specific, maybe AR use case or something that they had in mind so they could do navigation with ads or something, and your robot can't access it.

[53:54] So we're basically telling them all, "Ten years from now, a very sizable percentage of everyone shopping here is going to be a robot, and they don't look at ads. So what you're going to do is open up your visual positioning system and you're going to be able to sell mapping data and positioning data to the robot at a comparable rate to what you get for advertising to humans. And that's how you don't die. Right? That's how you keep making money in the age of robots." And I just fully believe that this is where civilization is going and that civilization will still work that way

[54:24] a thousand years from now. Right? I've had long conversations with the team about what we can build today that will still be relevant in five years, in fifty years, in five hundred years, in five thousand years. And what that is—well, these three new dimensions of the Internet. The Internet is still going to be around five thousand years from now. The Internet is still going to be important, and robots are still going to want to be able to ask questions about the physical world five thousand years from now.

[54:54] They're still going to want to be able to ask questions about other people's sensors five thousand years from now. And they're still going to want to be able to hop into some other robot body five thousand years from now. So that's the particular flavor of Kool-Aid that we're drinking at Auki—that, hey, let's build something that's still going to be relevant five thousand years from now. Yeah, and it's not just talking or the vision. You actually made a big deal with a European retailer to help them solve some of their problems. What was their problem? Yeah, we closed our first big enterprise client not that

[55:25] long ago. It's Sweden's largest retailer. And what we realized we could do for them with spatial computing is, if you think about the computer game of retail, what the game is about is which products should I carry that my customers want to buy? And there's some AI tooling for that and analytics for that. "How should I price my products?" And there's some AI tooling and stuff for that. But then there's also, "Where should I place them in my store? How much space should I give them?" Because that's a scarce resource that I have. I only have this many shelf meters and I need to choose how much space each product gets and where to put it

[55:58] to optimize the amount of sales that I do. And this is a crazy black box for retailers today because they don't actually know what their stores look like. Because how would they know? Right? Someone at headquarters might have made a plan called a planogram of, "Here's how we want the store to look." But how sure are they that the store actually looks that way? It turns out that store managers are not very good at following those plans also because the plans aren't, you know, they don't think they're good. Right? Like, "I know my store well. I've run out of this. I'm going to put something else in this spot. And then it turns out that when I put something else in that spot, I was selling more of it. So I'm going to keep it there now, but I'm not telling headquarters about it." And, you know, globally, maybe like 60% planogram compliance across retailers, which means that all of these analytics teams working at retail companies are working with absolute garbage data. Yeah, especially in China. Sorry to interrupt, but I also always tell you,

[57:30] like, I go to the supermarket next to our office and the water is on the right aisle, second aisle, and then I go tomorrow, and everything is different. They love to just throw things around. So, yeah. And I think in China, you will have an even bigger market to do this. Right? Indeed. Right. So the value prop was just, hey, we're going to tell your humans and your AI systems and your analytics systems what your store actually looks like. We're going to do robotic vision using this little robot with no arms and legs that you already have in your pocket. And that will allow you to get AR navigation for your staff and shoppers.
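
The idea of comparing the planned store with the observed store can be sketched as a simple compliance score. This is only an illustration: it assumes a planogram can be reduced to a mapping from shelf slot to product code, and it defines compliance as the share of planned slots whose observed product matches the plan. The slot names, product codes, and the function `planogram_compliance` are made up for the example.

```python
from typing import Mapping


def planogram_compliance(planned: Mapping[str, str], observed: Mapping[str, str]) -> float:
    """Fraction of planned shelf slots whose observed product matches the plan."""
    if not planned:
        raise ValueError("planogram is empty")
    matches = sum(1 for slot, sku in planned.items() if observed.get(slot) == sku)
    return matches / len(planned)


# What headquarters thinks the shelf looks like (hypothetical data).
planned = {
    "aisle2/shelf1/slot1": "water-1.5l",
    "aisle2/shelf1/slot2": "water-0.5l",
    "aisle2/shelf1/slot3": "cola-0.5l",
    "aisle2/shelf2/slot1": "milk-tea",
    "aisle2/shelf2/slot2": "juice-1l",
}

# What a scan of the store actually found (hypothetical data).
observed = {
    "aisle2/shelf1/slot1": "water-1.5l",
    "aisle2/shelf1/slot2": "water-0.5l",
    "aisle2/shelf1/slot3": "energy-drink",  # manager swapped this and kept it
    "aisle2/shelf2/slot1": "milk-tea",
    # slot2 on shelf2 is empty, so it is missing from the scan
}

print(planogram_compliance(planned, observed))  # 0.6
```

Any analytics that assumes the planned layout would be wrong about two of these five slots, which is the "garbage data" problem described above.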

[58:03] It's actually a bigger deal for the staff than for the shoppers, believe it or not, because there's such high staff turnover in retail and it takes a long time to train people up. We found that we could reduce the walking distance for staff members by 25 to 45% for our biggest customer that already had some kind of navigation system, just because we had better data about where the products actually are. Right. So you make your workers more productive. There's actually American research that indicates that 6% of baskets would contain at least one more item if the staff were more knowledgeable about where the products are. Right. So yeah, you get AR navigation, you get an actual accurate understanding

[58:33] of where the products are and you get an augmented reality task manager so that the store manager can leave tasks asynchronously to people. We found that just that shaves off 15 minutes per employee per day, just on handovers. Just on handovers. Right. And now we're going to use the robots to populate that same augmented reality task manager. So we're putting robots into stores this year that are just going to drive around the store and find out what needs to get done. It's a store manager robot, not a store worker robot. So the humans will still be doing the work. It's not taking any human's job. The store manager is going to keep their job for sure. Right. But all the human workers will have a

[59:03] better understanding of what needs to get done and probably a better relationship to work as well, because now it's not their manager telling them what to do, it's the AI telling them what to do, which is, you know, probably a better feeling. So yeah, we're looking to deploy several hundred of those perception-based store manager robots this year. So yeah, at Auki, we're focusing a lot on retail this year. But other people building on the real world web, like Bud Break, the agricultural robotics company, they're doing robots that help detect crop-destroying diseases in vineyards so farmers don't lose so many grapes

[59:34] every year. There's a lot of interesting stuff happening on the real world web. So yeah, this sounds very theoretical to me still. Like, how do you actually onboard? If you have this deal, how do you start the project with the retailer? So the retailers either print or receive markers that are like road signs for robots that they start placing around their store. Then they film the venue using their own phones and their own staff. They upload the video and the video will contain these markers. In fact, we have a special video recording app that instructs them like, okay, let's find one marker now, let's find another marker, let's find overlapping markers. So like a little copilot that teaches you how to film the store the right way and then you upload that video to the real world web where it gets analyzed by AI and turned into a map.

[1:00:05] So essentially they place markers, they film the store, they upload the video, and that's how they get a map. Do they have robots working already? No, we're going to deploy the first robots in Q2 this year. Okay, and why is it a European retailer and not a Chinese one? I think it was a Guanxi thing, kind of. Our head of sales is Swedish and lives in Sweden and met another Swedish retail startup at a conference. And that retail startup had Sweden's largest retailer as a customer and knew that they were looking for something

[1:00:36] like what we're doing. So that's how it happened. Yeah, there are very big opportunities in China,

[1:02:08] but it's also a very tough landscape. And as a foreigner, you still have to be a little bit careful. You need to make sure you have good partners

[1:02:39] in mainland China so your things don't get just copied and stolen. So we've kind of been holding off on going into the mainland Chinese market until we have stronger partners and better relationships with the robot OEMs, which we have now, some good heavy-name Chinese investors, things like this to just protect ourselves a little bit. The world is a big place. There are plenty of retail opportunities outside of China, but obviously we want to be a

[1:03:10] global company. Obviously, China is an important part of our strategy. We just thought it was best to be a little bit careful until we've developed the right kind of muscle to be successful here. And you've built business with the US, with Europe, and with China. Where do you see the biggest difference in how they approach AI and robotics in these three different markets? I really appreciate how open

[1:03:40] and collaborative the Chinese ecosystem is. There's a lot of open source, and there's a lot of cross-pollination. Like, people talk pretty openly—gossip, bagua—across companies. So there's a lot of information traveling between companies, meaning companies learn faster together and the ecosystem moves faster. Whereas Western companies are quite secretive, so they don't learn as much from each other, which I

[1:04:10] think is slowing them down. Like, one of the great things about Shenzhen, as a guy named Mehdi pointed out on X recently, is just the information landscape here. When you have so many different manufacturers in walking distance from each other, things like price discovery and process discovery all happen super quick and information propagates in the ecosystem much better. I was at a closed-door VIP dinner for the robotics industry about half a year ago in Shanghai, and

[1:04:41] one of the world's largest robot OEMs made the joke that, hey, Americans don't buy robots because they have immigrants. And the point of that joke was not to make fun of immigrants or Americans even. It was in the context that I was asking, why is it so hard to buy your robots outside of China? Why are the wait times so long? And the explanation was, European and American companies are so slow at making decisions, it takes them months to years to decide to buy one robot.

[1:05:15] So it never makes sense for us to have stock overseas because the companies here make decisions in days to weeks. So every new robot we build, the smartest thing to do is to keep it in a Chinese warehouse because chances are we'll sell it tomorrow. So because European and American businesses are so slow at making decisions, there's no stock and no replacement parts. No. Like, the Chinese manufacturers don't bother keeping the same level of service because it just doesn't make sense because we're so slow at making decisions. After I had lived in Beijing

[1:05:46] for seven years, I went back to Sweden for six months. Sweden is where I was born. I went back to Sweden for six months and did some consulting. And I noticed something about European and Western work culture that I hadn't noticed before I was taken out of it, which is, how are meetings scheduled, right? So in Beijing, it's like, I'm talking to the CEO of so-and-so. And like, hey, here's something I think we should do together. And it's like, okay, let's meet tonight or tomorrow. Or you just figure it out on WeChat, right? And everything in Sweden was like, yes, yes, let's have a meeting about this in two weeks. Why? What are you doing for two weeks?

[1:06:18] Like, what are you doing? Like, I know you're not busy. Why don't we just go for dinner tonight, right? Like, even working at the same company, you know, like, hey, here's something that we really should fix. Like, we need to do an overhaul on the website or something. Like, yes, yes, let's do it. Let's book a meeting about this for Thursday. Yeah, I mean, we're already. Right. Can we just fix it right now? That sense of urgency is just, I think, pathologically missing in the West. People talk a lot about, like, long working hours. I don't think that's it at all.

[1:06:49] I think it's just the sense of urgency and how quickly decisions are made in Asia and in China in particular that has made this place be so successful. They fucking make decisions. Yeah, we talked about it earlier, right? Yeah. About the concept of developing countries and developed countries. It's people getting so comfortable

[1:07:51] with what they achieved in the past, especially in Europe, that they don't see the urgency of doing something. So they get really, really slow. Yeah, I mean, I contacted you two or three days ago and I said, do you have time for a podcast? We have a free spot and we booked this studio for six spots and when we arrived here, we had like two filled. But I was not worried at all that I would get great guests and great stories because if I would contact you two or three weeks ago, what are you doing in two or three weeks? You say, well, I don't know. Who knows? Yeah, yeah. The same with all

[1:08:21] the robotics companies. Like, oh, yeah, we plan the trip in May and then, yeah, let's fix it maybe one week earlier. There's a German robotics influencer on X that wanted to do an interview or something with me. And it was again this like, yeah, let's have a meeting about it in three weeks. Right? Like, I gave him my calendar and he booked something three weeks out. Yeah, but I really think I should come and visit your lab. But he's in Germany. Okay, yeah, sure. Come visit my lab. Absolutely. When do you want to come? I'm thinking like August or September. You have my number. So how can we help Europe? We as Europeans—it's not that we just want to criticize Europe, but I honestly hope

[1:08:51] that Europe can find its place in this world. And if we have AI, if we have robotics, it doesn't seem right now that Europe is playing an important role between the US and China. Europe just needs to make a decision to get its act together. And I think it starts with learning how to make decisions. Right? Stop having decision paralysis. So what you were referencing earlier about developed nations—I was out having drinks in San Francisco with Elvis Nava, who is the CTO of Mimic Robotics in Switzerland, one of the better-funded European robotics labs. And he made this great observation that it was a mistake—memetically, it was a mistake—to introduce this term "developing country" and "developed country," because we started calling ourselves developed countries

[1:09:21] many, many decades ago. And it did something to our psyche that made us think that we are on top and we don't need to improve. Now you're like, oh, China's a developing country, but it's also ahead of us. What does that mean when the person on top is developing and you're not? Right? Europe has gone stagnant. And a lot of it, I think, is just this missing sense of urgency. We've gotten incredibly comfortable. We're not motivated. We don't aspire, we don't look outside to the world and see how much better things could be. I find, you know, my home country of Sweden to be very

[1:09:51] low on motivation, very low on action and agency. And these are cultural problems that we can fix with better memes, right? Correct. We have to inspire ourselves to want to accomplish more. We have to make it okay to strive for success, which I think is something that in Sweden in particular is a little bit frowned upon. And coming from Germany, we can sing a song about that. That's also one of the reasons why I just left and decided, okay, I can spend most of my time here working at a company, get a good salary, but I don't have the feeling that I really earn what I'm

[1:10:21] doing. And that is something that most of the people also—my close friends—they're like, I like it this way. Like, okay, good. That explains a lot. I feel the same. I want to earn my bread. I want to know that I'm doing something that's contributing to society. I was going crazy working at these European companies where, like, nothing ever happens and decision-making is so slow. And you talk like, oh, we have a very flat hierarchy. No, you just have decision paralysis. It's not a flat hierarchy. It's that everything becomes a fucking committee doing a meeting for a meeting because we want to have a meeting in two weeks. Yeah, it's crazy. Europe and the US need to learn how to take some risk. I'm releasing a

[1:10:54] new article this week called "Move Fast and Break Robots," which is about this, right? So "move fast and break things" has been one of these mantras for software in Silicon Valley. It's basically this idea that the only way a startup can compete with a bigger company is by learning things faster. Time to insight is like the most sacred KPI for a startup. Like, how fast can we learn?

[1:12:25] Can we learn faster than these bigger companies? It's going to take two weeks just to book a meeting to discuss what happened in the sales call, and blah, blah, blah. Don't be afraid to break something, because when you break something, you learn from that too, right? So move fast and break things.

[1:12:59] You know, Elon Musk is not afraid to blow up some rockets on his way to Mars, right? Move fast and break things. But for some reason, Western VCs are terrified of the idea of spending money on robots because they're like, "What if they break?" Yeah, yeah, what if? Don't you want to learn under what circumstances robots break? Right? Don't you want to build up that muscle? If I went to a Western VC and said, "Hey, I'm going to spend a million dollars on a compute experiment. I have a new world model architecture. Maybe it'll go somewhere, maybe it won't. But we're going to spend a million dollars of compute to run that experiment," they'd go, "Yeah, yeah, okay." If I go, "Hey, I'm going to spend a million dollars on a new generation of Chinese robots that I think are capable of doing this task," they're like, "That's crazy. Why are you going to spend a million dollars?" Well, because if I'm right, the upside

[1:13:29] is billions and billions of dollars. It's like, "Yeah, but what if it breaks?" Like, yeah, what if my LLM experiment doesn't work, right? There's this weird blind spot in the Western psyche about spending money on hardware—some learned trauma from the hardware winters before that we just need to get over. You have to be willing to break some robots, right? I accidentally broke $10,000 worth of robot hands in a few seconds once because we had these nice hands. I think they're $8,000 or something like that. And we'd attach them to our humanoid robot. Then the humanoid robot had just received a software upgrade from Unitree, and now it could do some new dance. It's like, "Hey, let's look at the new dance just for fun." The thing was that dance was not calibrated for the weight of the hands, right? Which meant the movement was off by just a little bit. So it smashed its hands into its thighs, and the hands just shattered, right? Boom. $10,000 gone. And now we've learned something very valuable, right? That I

[1:13:59] would very willingly pay a consultant $10,000 to tell us, like, "Hey, these motion policies are actually very sensitive to weight." So we learned a lot of very valuable things in those few seconds of robot dancing, seeing the big pieces of hands fly all over. And you're not going to learn that at the whiteboard. And if you're trying to learn it at the whiteboard, it's going to cost you more than $10,000, right? Yeah, for sure. And I always see people commenting on these videos on Instagram where robots do something wrong—they stumble or they do something—and people are like, "Hahaha, this is the future." But then I tell people, they don't understand that this is actually learning. It's not failing. And they're failing now, but the learning

[1:14:29] they have in five or ten years will be so valuable for them. Yeah, yeah. Imagine looking at computer games in the mid-90s and being like, "These don't look realistic at all. The physics are so buggy. When a car collides with another car, it'll fly up in the air. Computers are so stupid." And yeah, okay, but then a few years later, the computer game industry was bigger than Hollywood and music combined, right? It's the same with LLMs. Also, like in the beginning, everybody was like, "Oh yeah, ChatGPT is good because it just helps me rewrite something." And then in the end it's not able to calculate one plus one. And now, like only a year later, it's part of it. It's already developing itself. It's crazy. Yeah. Also the meme went viral. Don't know if you saw it on LinkedIn. Then people asking their LLM, "I want to wash my car, but the car wash is just

[1:14:59] 500 meters away." And then the LLM said, "No, no, just walk there." And then it's so stupid. But this is so dangerous. Like, it gives people who are against technology and change a good feeling, or at least what they feel in that very moment is like, "Oh, I have this good feeling. Okay, this is too far away. It's too stupid right now. I don't have to deal with this now because it's not going

[1:17:02] to happen." So they get comfortable and lean back. No, it's not worth my time looking at it because obviously

[1:17:32] somebody supports my theory that it's bad. We live in history. Actually, that was the thing that made me stay in Beijing. I went to visit Beijing for three months and ended up staying seven years because I realized when I arrived in Beijing that we are still living in history. History is still happening. And growing up in Europe, I was raised to believe that we lived in post-history. Right? Like after the fall of the Berlin Wall, history is done. We now live in the new stable world order and we are developed countries and blah, blah. No, we live in history. And we live at the very foot of an exponential takeoff. And you need to really embrace the idea that change is coming very, very quickly and history is changing, civilization is changing, the universe is changing and you can either choose to be an agent of change and make decisions

[1:18:03] and exercise your agency and have an impact on history, or you can get replaced by people that are trying to change the world. Nothing to add to that, right? Yes. Thank you so much. This was amazing. Thank you. Thank you, guys. Good closing? Yeah. We are very excited to see what the future brings and what you bring to the world. Let's start deploying some robots. Yeah, definitely. I'd love to see them at your lab. Yeah.

