Bringing AI to the Metaverse, and viceversa
MetaGen is a project to explore and develop the intersection between AI and VR/AR.
Recording tool released!
You may be asking yourself: what does this mean for me? Well, it depends on who you are, and what things you care about of course.
What I can tell you is that the goal of this is foundational science and technology, which can have impact in many things. Here are some I can think of.
We want to understand the subtle things that humans intuitively know, but machines currently have no way of knowing. Furthermore, even though we intuitively know these things, can we say that we understand them? Think of things like: how do I judge how someone else is feeling, when I am speaking with them? Conversely, how do I express feelings with some intent in mind? How do humans know which places are interesting to explore, when they enter a new unknown environment? Or what questions to ask when you meet a new person?
Gut-feelings, emotions, knee-jerk reactions, subconcious perceptions, judgements, consciousness. We all know these things. They are the very essence of what makes us human. And we’ve wondered about them since humans have been able to wonder. Yet, only in the last century of so, have we developed tools precise, and safe enough, and methodologies designed carefully enough, to be able to get a little bit deeper into their mysteries. Psychology has gathered large amounts of observational records of what people do and how they behave. Neuroscience has begun to connect these behaviours with the underlying machinery of our biology and below.
However, probing deep into the brain is a risky business; even more so if you want to “play” with it and see how it reacts to stimuli. We don’t yet have good enough brain-computer interfaces to throw the wonders of computational science at the hard problem of the brain. So, how close can we get to the brain with the technology we have today?
Virtual reality is the latest advance in Human-Computer interaction. It connects our senses, and our actions directly to a computer. What you see, hear, feel, and what you do, say , and express, are all being digitalized. In the age of big data and machine learning, the opportunity is clear. We have the tools to answer hard questions, like the ones we posed above (and have been posing for a long time), if we have enough data. And we have the technology to gather that data. Answering hard questions (science), is also how we make the biggest breakthroughs in solving hard problems (engineering), in this case the problems related to the mind and human behaviour.
Let me be clear: I don’t think VR will give us all that we need to fully answer these questions and solve these problems. We will almost certainly need to dig deeper into the brain. But I think there is a big chance it can still lead to big advances in the above things.
Even if the progress in the above quesionts is limited, the progress in AI technologies may be more noticable. I have made a case in this post that models that learn from human-produced data have made impressive progress in restricted domains like text, image understanding, speech comprehension, etc, and we are beginning to understand the properties of this advancements, like how much data and compute are needed.
I argued that for multimodal understanding (related to the subtle questions about emotion perception, etc. I mentioned above), we are currently primarily data-limited.
Therefore, to advance AI research in this important direction we need, like above, multimodal data abou how humans speak, move, and behave, in different situations, represented by the stream of multi-sensory stimuli (hearing, vision, touch, etc). The kind of data that VR makes available in digital form.
Just look at this video
This technology will have a huge impact in the fields of animation, and AI in games. More realistic characters, more autonomy, diversity, etc.
But that video is only the beginning. Those characters currently don’t see, or hear, or feel anything. To solve those problems will require some clever ideas in model design. But as I mentioned in the previous section, my experience tells me that the main bottleneck is data. The models in the video above are trained on a few tens of hours of data at most. VRChat generated 16 million hours of VR data in 2018 alone. I am not kidding when I say that I think the difference between a standard game NPC and a virtual being that is able to understand your feelings and, perhaps, have some proto-versions of them itself, is one year of VRChat data (or something equivalent). Some more recent evidence towards this is this new work by deepmind showing that they could learn to ground language in an interactive setting, with enough data (about 20K hours) of humans interacting in their simplified environment.
I think getting agents with intuition and common sense is also the beginning. Giving them autonomy and curiosity is the next step in the path towards sparking the Metaverse with life, intelligence, and whatever else may lie in the infinite depths of the computational universe (see the awesome work on Lenia for example).
Virtual beings that are more human-like would enable whole new game and entertaining experiences. Imagine the first truly AI VTuber, or an AI friend who gets to know you and offer you support, or whatever you may (reasonably) need.
Imagine a virtual coach that teaches you to dance, or to do public speaking, or to get better at emotional intelligence, and can understand how and why you fail, and how you can do better. Imagine autocomplete, but where it’s funny not because of how stupid it is, but because it actually makes witty suggestions which make sense in context.
The tl;dr is: data. The more the better. The higher the quality and the diversity, the better. The more open, transparent, and publicly available for the advancement of science, the better. The more we can reward the participants and volunteers who donate their data, and their time, and work, the better.
These are not easy things to achieve. But they are definitely not impossible. I have been working as hard as I can to solve the technical hurdles (https://github.com/MetaGenAI/MetaGenNeos https://github.com/oxai/vrai), but that alone means nothing, without people. I want everyone to own this project and this vision. I can only show you the door; you are the one that has to walk through it.
Please, super most definitely contact me (@guillefix in twitter) if you want to help with technical or social aspects of this project. But the easiest thing everyone can contribute just by virtue of being human (or sentient) is data. Your data can be worth more than ads. Your data could be our data communism anthem plays (lol, ok that sounds creepy). Seriously, you have the freedom to do whatever you want with your data. I’m just showing you one cool thing you can do with some bits of it, if you want. You always choose what to share or not to share. (but really, we could do super cool things:)).
Many things. We have a bunch of projects in development.
For one of them, we are looking for VR dancers to gather data from their dancing. We wanna teach an AI to dance. We think it would be a cool project to start with, and showing the scientific community the value of this idea, while having fun!
We are also looking for other kinds of data, and we are offering some NCR for it. You can check out more at the MetaGen Data Challenge. The data will be made publicly available for other researchers to use, and the tools will be open sourced.
More recently, we are looking to pay people to do certain kinds of simple social activities and play games in VR. Check out this short poll to gauge interest in participating.
Contact me @guillefix on twitter, or guillefix#5692 on discord, or ｇｕｉｌｌｅｆｉｘ＠ｇｍａｉｌ．ｃｏｍ, if you’re interested