
WebXR & A-Frame's Co-Creator On The Immersive Web's Future & Why It Matters

Diego Marcos is one of the most important figures in WebXR’s story so far.

At Mozilla he helped kickstart the WebVR standard, which then became WebXR.

He co-created A-Frame, an open source library to create VR experiences on the web, and remains its lead developer, maintainer, and community manager. Marcos now works at Supermedium, a company he co-founded in 2018, developing WebXR showcases and a new kind of browser for VR headsets.

We sat down with Diego in our virtual studio to talk about all of the above, and the future of WebXR as an approach to immersive content in general:

I’m Diego Marcos, main maintainer of A-Frame, a framework to develop VR experiences on the web. The idea is that anyone with any web development knowledge can pick it up and start developing VR and AR content in the same way that they are used to developing web applications.

I used to work at Mozilla. I was part of the team that started the WebVR initiative. For those that don’t know, what is now called WebXR used to be called WebVR. That was the first API and standard that we developed at Mozilla. When we released it, other companies like Samsung, Microsoft and Oculus jumped in and joined the initiative, and the scope of the API grew to incorporate AR use cases as well.

WebXR is the set of APIs that browsers implement in order to add VR and AR capabilities to websites. Within that team I was working more on the front end and tools part of it, and A-Frame was, and is, one of those initiatives to enable and empower all web developers out there to start developing AR and VR content.
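
As a rough illustration of what a set of APIs that browsers implement looks like in practice, the TypeScript sketch below asks a WebXR-capable browser which immersive modes it supports. `isSessionSupported` and the `immersive-vr` / `immersive-ar` mode names come from the WebXR Device API; the `XRSystemLike` interface and the `detectXRSupport` helper are hypothetical names introduced here for illustration, not something from the interview or from A-Frame.

```typescript
// Hypothetical minimal shape of the part of the WebXR Device API used here.
// In a browser, `navigator.xr` provides this method when WebXR is implemented.
interface XRSystemLike {
  isSessionSupported(
    mode: "inline" | "immersive-vr" | "immersive-ar"
  ): Promise<boolean>;
}

// Hypothetical helper: report whether immersive VR and AR sessions are available.
// A missing `xr` object, or a rejected query, is treated as "not supported".
async function detectXRSupport(
  xr: XRSystemLike | undefined
): Promise<{ vr: boolean; ar: boolean }> {
  if (!xr) {
    return { vr: false, ar: false };
  }
  const [vr, ar] = await Promise.all([
    xr.isSessionSupported("immersive-vr").catch(() => false),
    xr.isSessionSupported("immersive-ar").catch(() => false),
  ]);
  return { vr, ar };
}
```

In a browser you would pass `navigator.xr`, which is undefined where WebXR is not implemented; taking it as a parameter keeps the sketch runnable outside a browser as well.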

Talk us through that year where WebVR came out of Mozilla. What was the initial feeling there and how did that progress to browsers?

It was early 2014. I was working on FirefoxOS. If you remember, FirefoxOS was an operating system for smartphones that was built on web technologies – everything from the developer APIs to the front end of the browser, all the first party applications.

I was part of the mindset of that time: so we have this cool stack of technologies, to what other spaces can we apply the same philosophy of openness, of open ecosystems and open standards rather than walled gardens? At the time I was very interested in VR in my spare time. I had that first DK1 and then a DK2, and one of the very first Vive developer kits, and I was tinkering on the side with VR. A couple of colleagues – Josh Carpenter, Kevin Ngo, Casey Yee – and I had matching interests, so we decided to start prototyping some things and to sell the idea within Mozilla. The idea got momentum and then we formed the team to fully focus on VR. And one of the things that came out of that was the first draft of the WebVR API.

FirefoxOS on smartphones – an OS that’s built from the ground up for web content – some people have said this idea seems like it would work quite well in VR. Do you see that happening? Have you heard of anything in that sense, sort of going away from installing apps and having all these fragmented systems, toward a headset that you boot into WebXR and browse around an open web?

So there are several schools of thought in that space. One is that the web always has a space in any platform – it doesn’t compete, but complements native. The web is very well suited for that kind of bite-sized content, the kind of content for which you would have a hard time convincing someone to install an app, because it’s consumed so quickly. Let’s say you want to read a quick article about something and it comes with some VR piece of content that illustrates the information. Would you install an app to consume just one article? WebXR is for those use cases. You can just share that article via a link. You click the link and immediately, without downloading anything, you are inside the experience, consuming that information.

On the other hand, the FirefoxOS approach was more of a holistic approach. It’s like, OK, we have this very cool set of standards and APIs: we could actually build the whole system and platform using web technologies. So far we haven’t seen that in the VR and AR space, but I think there’s potential for someone to offer a different approach. Instead of building another kind of closed ecosystem, a walled garden, they could approach building a headset based on standards and open technologies.

The first approach, the web complementing the native ecosystem, is materializing already because in the Oculus Quest we have an amazing browser. The Oculus folks are doing an excellent job at pushing the web forward and have a top implementation of the APIs and a top browser for people to access that content.

The holistic approach we haven’t seen yet, and I would love someone to tackle it.

If we look at the VR ecosystem today, WebXR seems to still be stuck in that kind of bite-sized content showcase. Why do you think that is? And what do you think needs to happen until we start seeing some of the most popular VR experiences be built on WebXR?

That’s a question that doesn’t have a single answer. It’s complex. First of all, WebXR is still very young. We’ve been talking about it for many years, but it was not until early 2020 that actual browsers started to ship the final API. So it’s still very, very young. And since VR and AR have been focused on games, there’s a lot of inertia, and the industry is built on certain workflows and certain tools that people have learned over the years. When you develop for the web, it requires some retraining. People have to learn new tools and new patterns to develop content.

The second thing is monetization. The gaming focus of VR also implies the way to make money – you develop your video game, you put it on a store and you charge some money for that piece of content. This doesn’t apply as well on the web. If you see the way that people make money on the web today, either you have advertisements or you have some sort of subscription model – those are not that common in the video game space.

What I’m convinced of is that it’s not a technology problem anymore, because we have the final WebXR standard shipping in browsers. All the technology pieces are there, so it’s more a problem of education, of people learning how to take advantage of the potential of the web. I think it’s going to happen, but as always in technology, things take a bit longer. Especially if you’re an early adopter you feel ‘I’ve been here for five years’, but when you see things at a global scale, people are just starting to learn. Most people are still surprised that you can actually consume AR & VR through the browser. Most people are still not aware of that.

That monetization issue is something we’ve certainly heard from developers when asking them their thoughts on this. Do you think the Web Payments API is going to change that? What I’ve always wondered and hoped for is that we’d see the Oculus Browser adopt that API, such that if I go to purchase something in the Oculus browser, it’ll bring up that same pin code as when I’m purchasing a native app.

Yep. They are in a position to actually solve it because they have control of the platform underneath and also the browser – they could tie both ends together. I guess in those ecosystems, like Android or iOS or Oculus Quest, there’s always this tension between native and web: in which cases to prioritize the web over native, or the other way around. I imagine there are tons of internal discussions around it.

It feels like there are parallels to when modern smartphones launched. At the start of the iPhone you had this native SDK and there was kind of lip service to support for web apps, but often Apple prioritized bringing features to native first and made web apps second-class citizens. Do you worry that Facebook will go down the same path, in that they have a financial disincentive to make the web work when they’re making all this money from the native store?

These are just my guesses, I don’t have any internal knowledge: Apple’s roots are full control of everything, right? From the bottom to the top, from the hardware to the software. For them, the web feels like something foreign. They don’t speak the web language. But I have high hopes about Facebook because Facebook is the epitome of web success, right?

So Facebook truly understands, it has the web in its DNA. And hopefully they will be able to figure out and introspect: oh, actually, do I be awesome, right!?

Do you think WebXR needs Unity and Unreal to add WebXR export, or do you think that the existing web development tools are sufficient, that they can create this ecosystem without support of these big engines?

People that are involved in native development with Unity and Unreal, those people are in their own kind of bubble and they don’t care that much about the web. It’s mostly people that are already on the web that see the potential in those new technologies, like WebGL and WebXR and say: now I have this whole new world for me to create new stuff, right. It was not possible before.

This is our goal with A-Frame: not so much convincing people who do native to come to the web, but enabling people who are already on the web, who know its value and know how to use it, to create VR and AR content.

The support of native engines would be super welcome, and that would make it very easy for some people who don’t participate in the web to enter the space, but I don’t think it is required.

Can you talk about, both from a development perspective and an end-user perspective, some of the advantages for VR apps that are on the web rather than native apps?

The advantage of the web is immediacy, right? You can just share a piece of content via a link and you click and immediately you are consuming that piece of content. That’s an advantage from the user’s point of view, that immediacy and not having to install the content in order to consume it.

From the developer’s point of view, it’s that full control of your own work. So you don’t have to ask for permission and you don’t have to pass a curation process or approval process to get your application out there. And you can just publish and based on its own merit, users will tell if your content is good or not.

That’s really needed because we don’t know yet what the killer app of VR and AR is. Oculus doesn’t know, the people who curate the store don’t know. I don’t know. When the web was born, we didn’t know. No-one anticipated Twitter or Facebook or anything. We need as many people as possible developing stuff, trying out ideas and seeing what happens.

Also from the point of view of developers, you don’t have a gateway, as I mentioned before, to make money; that’s the downside. But on the upside, it’s your business. Nobody is going to take 30% or 20% or 15% of your revenue. And things will last forever, right? You can visit websites made in the ’90s that still work. If I publish an app today on iOS, in three years if I don’t update that app it won’t work anymore. So the web has this property of archival, because once web standards ship, browsers don’t remove APIs. So anything that works today is going to work in 10 years.

That’s a really great point because we’ve already seen so many in the industry notice that this Oculus Go and Gear VR and Daydream content is already lost to history. And that was only a few years ago. What may make WebXR the preferred way to make VR apps is that you have this kind of separation of the content and the engine. The browser is the engine, and it is a living thing. If someone releases a native VR app they’re using the version of Unity or Unreal that was current in their day, and 10 years later the fundamental technology of it will be outdated.

Even with experiments I was doing in 2014, I can just grab the code. Obviously there was a bit of churn there, WebVR was still not a standard, but with a few tweaks here and there I can make it work in an afternoon. It has a ton of value!

You talked about one of the big challenges here being bringing developers on board. What do you think is missing in the current web and WebXR development ecosystem? What tools or frameworks or ideas or tutorials do you think are the missing pieces that we really need to move into the next phase of the WebXR ecosystem?

That’s a good question. With A-Frame we are trying to solve that issue. There are millions of web developers out there who don’t have the skillsets, or are not even aware that they can develop AR and VR content. Our goal is to educate these people and provide tools that resonate with their workflow. Our background is also as web developers, so we understand that mindset. We understand those ways of doing things, and it’s much needed.

If we want this space to flourish, we need to onboard as many web developers as possible. So we need more and more tools that resonate with them, rather than trying so hard to convince native developers, because they already have a workflow that works for them and they love their tools. And those are amazing. This is a matter of enabling people who already love the web but are not aware of, or don’t know how to develop, VR and AR content. If we only convince like 5% of that audience of millions of developers, the space will bloom!

In the longer term do you think native developers could be brought in too, as the advantages of this decoupling of the engine and the content become more apparent over time? I don’t think the general audience are going to enjoy downloading apps to join their friends in an experience, and I think if people can deliver these kinds of rich experiences where getting to your friend in an entirely different experience is just one tap away… do you see that displacing native engines in the long-term?

Indeed, there’s the potential to displace native engines. I would love native developers to come to the web. Ours was more of a practical approach, because it’s much easier. With A-Frame I go and talk to a web developer. It’s like, oh, you like to use Node or Webpack or React or Angular. You like all those tools, so I can give you an engine to develop VR and AR content that integrates with the tools you already know. It’s a much easier path to onboard those people versus going to native developers and saying, okay, leave all your toolsets on the side, I’m going to teach you all these new things.

It’s a much harder task to convince native developers, but eventually, if WebXR flourishes, the value of it will become obvious for them. And I’m convinced they will see the value as we make progress.

Over time, do you see A-Frame focusing in on this just-above-WebXR level in the stack functionality, or do you see higher level things over time that do more and more for the developer?

Yeah, I think we need both. We see this less and less now, but at the beginning most of the web developers jumping into AR wanted to do 360° video or photos. That was the main use case. They were saying: all I want to do is a tour of 360° video or pictures. For these people it would have been very convenient to have high-level visual tools: I have all these 360° pictures, I bind them together, link them in this way and add a little bit of text here and there. Having tools like that to build those tours would have been very useful for them.

It’s a bit premature because those categories of content are still not super defined in VR. Sometimes the value of the experience is in the nuances of the interaction model. In those cases you want to have a lot of control, and to have a lot of control you need tools that allow you to customize those interactions in a very detailed way. So you will also need a category of low-level tools that allow you to open the box and customize things the way you want. With A-Frame we’re trying to tackle this middle ground. It’s very approachable and you can get something going in a minute: you can put up a 360° video or a 360° panorama in a minute, publish it in a minute and share it with your friends via Twitter as a simple link. But at the same time, you can actually open the box and look inside and deepen your knowledge.
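
To make the "something going in a minute" claim concrete, here is a minimal sketch of the markup involved. A-Frame scenes are written as HTML, and `<a-scene>` and `<a-sky>` are A-Frame primitives; the TypeScript helper below only assembles such a page as a string. The function names and both URL parameters are assumptions for illustration, and the caller supplies the A-Frame script URL rather than this sketch choosing a release.

```typescript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build a minimal A-Frame page that shows a single
// equirectangular 360° image as the scene's sky.
// `aframeScriptUrl` is whichever A-Frame build the caller wants to load;
// `panoramaUrl` points at the 360° image. Both are assumptions of this sketch.
function panoramaPage(aframeScriptUrl: string, panoramaUrl: string): string {
  return [
    "<!DOCTYPE html>",
    "<html>",
    "  <head>",
    `    <script src="${escapeAttr(aframeScriptUrl)}"></script>`,
    "  </head>",
    "  <body>",
    "    <a-scene>",
    `      <a-sky src="${escapeAttr(panoramaUrl)}"></a-sky>`,
    "    </a-scene>",
    "  </body>",
    "</html>",
  ].join("\n");
}
```

Writing the returned string to an `.html` file and hosting it anywhere gives a page that can be shared as a plain link, which is the workflow described above.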

Do you see that kind of ability to modify and change, and to bring all these different frameworks together, being really important in WebXR’s future?

So I learned how to develop and program using the web. And I think most people started in the same way, right? You see a website you like, you wonder: oh, how is this made? And in the web you always had the ability to open the developer tools that come built-in with a browser.

I know this is a CSS property that changes the color of this button, and you can also inspect the JavaScript and figure things out, how things are built. And there’s a lot of value in learning from others. And you have some ability to access the insides and learn how it’s built. It’s not a black box as an engine can be.

How important do you think it is that any WebXR application by its nature has access to the existing web of 2D content? We always hear people wondering where the metaverse is, and there’s this kind of core assumption that it’s going to be this one native app that provides everything in one. But isn’t it arguable that the web already is this and that the metaverse will simply be the extension of the web into these new platforms – and as long as you have multitasking in WebXR, an environment where I can bring up a 2D browser tab here and Discord there, isn’t that the metaverse? Do you see that as the metaverse in itself?

The web is already the metaverse, we just have to make it 3D. That is what we used to say at Mozilla.

VR and AR is another kind of media, and the web already does text and does images and does video and audio. But now I can also do VR and AR. So yeah, it’s a multimedia environment and it’s up to the browser. This is why we started Supermedium, the company; that was our goal. OK, we have 2D browsers. It’s clear what a 2D browser is; we settled on those UI patterns. We have a window with a URL bar on the top and maybe some tabs, and you can switch between different open websites.

But we don’t know if that pattern translates to VR. And if we get rid of the frame of the window, how can we access the web, and different pieces of the web, in VR and AR? We haven’t discovered how it’s going to happen. But yeah, as you were saying, the technology is already there; it’s a matter of someone putting those pieces together in a way that makes sense.

And I think native applications will have a far harder time replicating all that functionality because they will have to reinvent the wheel, right? They will have to figure out all the things that the world has figured out already over 30 years. They will have to reinvent not only those pieces, but also convince developers and content producers to adopt this new way of doing things and convince them that this is better than the actual web.

In the WebXR experiences I’ve tried performance is equal to native apps, but there is this perception out there that WebXR is still slow. How do you fight that perception? What do you think the path forward there is to get sort of consumer and developer trust in the performance of the web?

At Mozilla we always struggled with that kind of perception, which in many cases is a myth. Once the opinion has settled in people’s heads it’s very hard to convince them otherwise. There are tons of metrics and statistics and profiling of browsers and applications showing that it all actually works.

But at some point we decided we were wasting our time. The better way to convince people is to show. If you’re able to show a piece of content that performs well and users enjoy, that’s undeniable, right? That was our goal with Moon Rider – try to take the most beloved piece of content out there, Beat Saber, and replicate it with web technologies, as a tangible proof that the web is ready to deliver compelling content, if you are willing to put in some time to profile and tune the performance in the same way that you see in the apps that are published on the store.

You don’t see the experiments that people do on native because they are very hard to share. Those are hidden. So you don’t see all the things that don’t perform. You don’t see the things that are full of bugs because people don’t share them, but those things are exposed on the web. Because as soon as you feel some pride in something you made, you’re going to tweet about it and people are going to click on it and they are going to see.

Moon Rider was out there for 18 months and it has like three or four thousand daily active users and more than 16 minutes average session – people love it and people use it.

What are your thoughts on the future of WebXR, A-Frame, and the entire web ecosystem going into the spatial computing age?

I’m super excited. A couple of years ago I was getting impatient, because it seemed like we started the first version of the standard in 2014 and we were in late 2019 and I was thinking: oh it’s been five years and the standard has not yet shipped. I was kind of getting a bit antsy.

But once the standard shipped, everything fell into place. We have this done there. We saw Quest and the Oculus folks, who are doing amazing work with the browser, also shipping the new standard, and everything kind of clicked. And you see new headsets like Magic Leap, which has also made a very nice effort trying to push the web as a way to develop content. And you have the HoloLens folks doing an excellent job also incorporating the WebXR APIs in their browser. Everything is clicking right now and it’s a matter of volume. As more headsets and devices get released and as more people get onboard using VR and AR headsets, I think the web is ready. It’s just waiting for someone to take advantage of it, and for the potential to manifest for everybody.

I’m super, super optimistic. And this only happened in the last 12 to 15 months. I used to say: you have to wait till we finish the standard, you have to wait till all the browsers ship. But those things are done already, and it’s there. The web is your oyster now. So I’m super, super happy and super pumped about what’s coming.

Now the next frontier is AR. The Google folks, they are doing an excellent job at incorporating AR features into the WebXR standard. And we see some initial moves and some interest from Apple in the standard, and also some rumors that maybe they might enter the space. I’m super pumped about what’s going to happen there too.

Do you think the industry needs Apple to really support WebXR in full, or do you think it’s fine if Apple remains with its stance of native-first?

I think Apple entering this space is going to grow the pie, both for native and the web. This is going to make people really take AR and VR seriously and is going to raise all boats, right? Both native and web. And those companies that are investing in VR and WebXR right now, but more as an experiment, may be able to justify going full on once Apple enters the space and matures the industry. I think the web is gonna flourish with it regardless of how seriously Apple takes it.

People criticize Apple a lot for how it approaches the web. I’m mostly a Windows user, but when I’m using a Mac I’m always surprised how good Safari on desktop is compared to alternatives. You cannot claim that Apple doesn’t care about the web. They kind of always take different paths or put the web in the background, because the way they make money doesn’t align that much with the web, but eventually they always come around and they put something out there that is really, really good.

Diego, it was an honor to have you in the studio. The future of WebXR seems bright.

I’m really, really excited.
