Author: Jason Michael Perry
- 
	DALL-E
DALL-E and the world of generated images have captured the attention and imaginations of many. When I first watched the video Humans Need Not Apply close to 8 years ago, the technologies felt possible but still distant. DALL-E alerts you, like a flashing red light, to how far we’ve come over a relatively short period. DALL-E is an AI-based system that generates realistic images from a string of text. What makes it uncanny is its understanding of the text and its ability to apply context, and to do all of this across different artistic styles. Using the tool is addictive; if you have not tried it, I suggest you create a free account and give it a whirl. Our CEO Todd has also turned me on to many other AI tools, like jasper.ai, that allow you to generate blog posts from a simple topic or description. While the results may miss the depth and meat many expect in a well-crafted post, they are a shockingly great starter (and better than some content on the Internet).
What I find fascinating in the new AI space are the same copyright issues we already struggle to answer around ownership, especially when referencing prior art. For example, one can sample music and use it to make a new song, but we have defined a line that determines when a new song is unique and when the sampling requires the artist to pay royalties to the previous artist. In the case of tools like DALL-E, the prior art is exactly how you train a machine to create something new or unique. You give it as many samples of images and artwork as possible and provide it with metadata to describe each piece of work. That training is what allows you to ask it to generate an image of a dog in the style of Van Gogh. Is this a new, unique piece of art? To what extent is it based on the prior works the AI used to create this new piece of work? Is the use of training sources any different from me asking a human to do the same thing? If one profits from the work, who should receive the royalty? 
The engineer who developed the AI? The company who created the AI? The license holder who typed a string of text to generate this new work of art? Or maybe the AI itself? 
- 
	Alexa
The recent layoffs by Amazon targeting its device unit have prompted many articles on Amazon’s inability to monetize Alexa, especially knowing that Amazon’s strategy has long focused on selling devices at cost and making revenue from its greater ecosystem. While sold at cost, the Kindle is a gateway product to Amazon’s extensive library of ebooks. Alexa as a product felt like the future released into everyday consumers’ hands. This is not something that happens often, but the Echo was a genuinely awe-inspiring product. But looking at Alexa after ten years of constant development and iteration, it’s hard not to think of it less as an awakening to a new way of interacting with computers and more as a pony with a few tricks. I long ago gravitated to Apple’s Siri for my house voice AI needs, not because I disliked Alexa, but because Apple has me very tightly in the grip of its entire ecosystem. Even with that, I rarely use a voice AI to do more than a few mundane tasks: check the weather, play music, and control other smart home devices. I wish I could do more, but the promise of devices like Star Trek’s computer still feels very distant. Heck, using the word “and” to combine two requests is still impossible for the lion’s share of smart speakers or home AI.
Many imagined that voice would become a new interaction stream that could be monetized. Instead, we have learned more about the value of different devices and the pros and cons of each kind of interaction. Speaking has a ton of limitations. It requires a generally quiet location, privacy is limited to what the people around you can hear, and listening takes more concentration than we realize. How often have you asked about the day’s weather only to forget the answer and ask again as soon as your smart speaker is done? I love my smart speaker, and I still find Alexa a fantastic device, but I wonder what we all collectively want in Alexa 2.0. 
- 
	Welcome to the metaverse
When Mindgrub announced our move to the metaverse, we wanted to explore the many sprouting virtual worlds and determine where to plant our virtual roots. The way people talk about it, the metaverse sounds like one world or one place you can enter to access a broad land of virtual content. The truth is: there is no singular virtual world (yet). Futurists imagine that instead of one world, the metaverse may parallel the internet. It might be a network of worlds that allows all of us to enter and leave at a whim. Facebook, now Meta, believes it has what it takes to help create that future. I’m not sure if their vision will win, but it’s helping us all see what the end could be. If you find yourself, like me, office hunting in the virtual world, you quickly learn that what exists now is a patchwork of siloed communities, each with different levels of immersion, rules, and financial expectations. In many ways, these worlds’ quirky ideas and explorative nature give the feel of a new frontier to explore, reminiscent of the early days of the internet. The heart of this leads to one hugely important question.
What is the Metaverse?
Let’s start with the Wikipedia definition: “In futurism and science fiction, the metaverse is a hypothetical iteration of the internet as a single, universal and immersive virtual world facilitated by the use of virtual reality (VR) and augmented reality (AR) headsets. In colloquial use, a metaverse is a network of 3D virtual worlds focused on social connection.” This definition limits what many may see as the actual metaverse. An immersive virtual world does not require virtual reality headsets; levels of immersion can happen on a computer in a networked 3D world. The foundation of the metaverse was built by communities like Second Life. Second Life is primarily known as one of the first virtual worlds with an expansive economy and a vast set of communities. 
In 2003, it was a pioneer, giving anyone connected to the internet a place to create a new life to live and explore. Many have spent thousands of hours immersed in this world. The key to that experience is being immersed, which is what defines a metaverse. I describe the metaverse as: An immersive network of interconnected worlds or communities commonly accessed through devices such as a phone, a computer, or a virtual or augmented reality headset. These worlds can be used for dating, fun, social connection, work, or recreation. This definition better encapsulates what currently exists and what is possible. The key to this definition is immersion. Imagine measuring immersion on a scale similar to the six levels of vehicle autonomy:
- Level 0 (No Driving Automation)
- Level 1 (Driver Assistance)
- Level 2 (Partial Driving Automation)
- Level 3 (Conditional Driving Automation)
- Level 4 (High Driving Automation)
- Level 5 (Full Driving Automation)
Vehicle autonomy is a scale that differentiates cars by their autonomous driving abilities. Having such a scale allows the US Department of Transportation to define better rules and regulations for a car based on how autonomous a person should expect a vehicle to be. On this scale, a level 5 vehicle would no longer need a steering wheel; it is so autonomous that we can depend on it to handle all driving conditions and spend our time watching a movie or relaxing. These rules allow us to acknowledge the foundation of autonomous driving and see what the future will bring us. Many of today’s US cars, including Teslas, come standard with technologies such as adaptive cruise control, parallel parking, blind spot monitoring, and lane assistance, all of which rate as level 2 features. A scale like this also lets us pause and see how much technology has evolved in a few quick years while realizing the massive chasm of technical intelligence we must cross to move from a level 2 vehicle to a level 4.
The six levels of human immersion
If we keep those same six levels of vehicle autonomy in mind and use them as a template for the software and hardware that enables the metaverse, we get the scale of immersion:
- Level 0 (No Augmentation)
- Level 1 (Device Augmentation)
- Level 2 (Augmented/Mixed Reality)
- Level 3 (Virtual Reality)
- Level 4 (Physically Immersive Virtual Reality)
- Level 5 (Full Mental Reality)
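The scale above can be sketched as a small data structure. This is only an illustration: the level names come from the list, but the enum, the `describeLevel` helper, and where I place each example device are my own assumptions, not a standard.

```typescript
// Illustrative sketch only: the level names come from the list above,
// while the enum, helper, and device placements are assumptions.
enum ImmersionLevel {
  NoAugmentation = 0,
  DeviceAugmentation = 1,
  AugmentedMixedReality = 2,
  VirtualReality = 3,
  PhysicallyImmersiveVirtualReality = 4,
  FullMentalReality = 5,
}

// Human-readable names, indexed by level number.
const LEVEL_NAMES: string[] = [
  "No Augmentation",
  "Device Augmentation",
  "Augmented/Mixed Reality",
  "Virtual Reality",
  "Physically Immersive Virtual Reality",
  "Full Mental Reality",
];

// Formats a level the same way the list above does.
function describeLevel(level: ImmersionLevel): string {
  return "Level " + level + " (" + LEVEL_NAMES[level] + ")";
}

// Hypothetical placements of a few devices named in this post.
const exampleDevices: { name: string; level: ImmersionLevel }[] = [
  { name: "Laptop running Second Life", level: ImmersionLevel.DeviceAugmentation },
  { name: "Google Glass", level: ImmersionLevel.AugmentedMixedReality },
  { name: "HTC Vive", level: ImmersionLevel.VirtualReality },
];

for (const device of exampleDevices) {
  console.log(device.name + ": " + describeLevel(device.level));
}
```

Treating the levels as ordered numbers is a deliberate choice: it lets us compare devices and talk about moving up or down the scale.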
Level 1
At level 0, we as humans exist with no augmentation or connection to any reality but the one we can see or imagine. We step into level 1 immersion with the assistance of a device, such as a game console, laptop, or phone. Each of these transports you into an immersive land. Ever get lost in Second Life, World of Warcraft, Minecraft, or Roblox? Lost in the scrolling feeds of TikTok, Instagram, or Snap? These worlds exist now and function with whole economies, social interactions, rules, and regulations. Level 1 immersion tends to focus more heavily on sight, with the option for advanced audio. On a scale of immersion, it requires concentration and, sometimes, our imagination to remove our existing reality and truly feel enveloped.
Level 2
Level 2 devices connect us with a virtual community as an overlay of the real world. They can overlay virtual or contextual information on the real world. The first notable example is Google Glass, which allows you to overlay directions or store reviews over the real world while looking around. It also expanded our idea of sharing by imagining the ability to let someone truly see your viewpoint. Other older successes include games like Pokemon Go that use a phone’s camera to meld the Pokemon world with our own. Additional credit goes to products like Microsoft’s HoloLens, nreal’s AR glasses, and Snap’s glasses. Other fringe devices in this space include AR drones and game consoles that require physical toys to interact. These devices are less about the immediate plane but still invite a user to connect to reality in a different and more immersive way. The rumor mill continues to circle on an Apple device targeted to this level of immersion. We can only speculate what Apple may bring to the table, but the idea of contextual visual interfaces that build on Google Glass seems probable. 
In recent years Apple and Google have incorporated LIDAR and other stereographic sensors into their devices, paired with developer-friendly tools such as ARKit, making level 2 devices easier to bring to the masses.
Level 3
Level 3 requires a virtual reality headset that masks a person’s vision and, optionally, hearing, immersing them in a new world. A clear sign of level 3 is a device that attempts to remove you from your current reality as much as possible while offering an interactive and immersive experience. This means a device should allow interaction through head tracking, hand tracking, or an external gamepad. At level 3, a person should feel as if their senses of sight and hearing have been transported into a different world. Popular devices in this category include the Meta Oculus, PlayStation VR, and HTC Vive. Many of these devices can quickly move between an augmented (level 2) and a level 3 world. Until recently, level 3 VR has mainly been a space for immersive games like Half-Life: Alyx and impressive demos. The pandemic sped up the development of social spaces, games, and work environments for VR, but many of these are new and early. In social, notable names include Meta Horizons, among others. For work, Spatial IO, among others. For games, Roblox, among others. Virtual reality is still limited to a tiny segment of the population. Few have regular access to it, so the possibilities and impacts remain largely unexplored. Our content consumption is essentially 2D; for all the visual advances in movies and television, we still look at a 2D plane and primarily rely on audio to create the feeling of 3D immersion. VR changes that idea and opens the world of storytelling up into a different and much more immersive experience. A horror movie no longer directs you; instead, your experience and fear may change based on how you orient yourself to that world.
Level 4
Levels 4 and 5 often feel like a dream but are much closer than you realize. Level 4 devices must trick three senses. 
These often focus on sight, hearing, and touch, immersing your body in a different world. Level 4 devices commonly track movement to allow users to move around an environment or feel vibrations and feedback. The Meta Oculus Quest and Quest Pro are notable for allowing you to define a boundary and physically walk within those confines while masking them virtually to give users an infinite playground. CES is always a great place to see the many level 4 devices that take this idea further. Devices like an immersive body suit or glove transmit the feelings of touch or impact; walking devices allow a person to move, walk, or run in place; others even simulate a rollercoaster. Those examples give a good taste of what is possible in level 4 devices and the amount of equipment needed, which puts them out of reach for many homes. At the same time, the technology gets more portable, and arcades, art exhibits, and other experiences open with immersive level 4 options. A new chain with locations in many major cities opened with arcades that offer real-life arena games, including virtual laser tag. The difference is that in these worlds, you play with real people and feel the impact of others shooting at you. One experience I hope to try combines a satellite with sensory deprivation tanks to simulate floating in space.
Level 5
At the peak of our imagination, and in scores of anime like Sword Art Online, is level 5 immersion. Level 5 requires an immersion that tricks every one of our senses: sight, hearing, touch, taste, and smell. Imagine the ability to travel to a distant country and smell the countryside while tasting the food. That is the true pinnacle of an immersive world: a place nearly indistinguishable from our reality. Much research and development is needed for level 5 immersion, but a surprising amount is coming from technologies focused on accessibility that have recently begun to converge with big tech. This technology is also further along than many realize. 
Researchers have worked on robotic implants, hearing technologies, assistive sight devices, and brain control for decades. Some of this has begun to merge into products for the everyday consumer. Apple, for example, gives AirPods the ability to alter or magnify external audio, similar to hearing aids. Its watch uses sensors to detect the movements of our fingers (or the muscles attached to them) to allow assistive touch control options. Elon Musk has a company called Neuralink that enables primates to play Pong with their minds using brain implants.
How immersed are we?
Using our definition of the metaverse, Mindgrub wanted an approachable environment that allows anyone to interact without the need to invest in a headset or other hardware. For Mindgrub, the environment we invite users to should embrace the best immersion possible without requiring more than a laptop or a phone. Accessibility on the run or while traveling feels like something that should be essential in an office environment. I believe that any true metaverse needs to be hospitable to varied ways of connecting. An individual should be able to cross many, not all, levels of immersion. Think of our current world: I may invite you to a Zoom, Amazon Chime, or Microsoft Teams meeting, but does that exclude you from connecting with a phone call? You may have a diminished experience, but as a tool, it includes folks and allows them to connect however they can rather than excluding them. True magic happens when different levels of immersion mix. Each device offers a different perspective on the world and allows users to interact differently. Gamers may think of this as an MMO that allows PC users to play with console users (PlayStation, Xbox, Nintendo Switch, etc.): a keyboard and mouse can be very precise, but a joystick or gamepad can sometimes allow for faster movement. In the end, we have the same meta realm but different ways of interacting through the lens of different devices. 
Of course, this mixing can be contentious. It is widely believed that the precision I mentioned from a mouse and keyboard can offer an unfair competitive advantage over users on a gamepad. Regardless of strengths and different levels of immersion, we, as users, always judge our environments to determine the best option for our needs. A TV is far superior for watching a movie or a quick how-to video, but a phone’s ability to play a video anywhere often outweighs the best experience. In our vision of the metaverse, this is crucial. It also leads us to three rules for how we expect to use the metaverse:
- It should target the level of immersion that best delivers the message
- It needs to support devices from many levels of immersion
- It should feel simple (and ideally effortless)
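As a rough sketch of rule 2, here is one way to model a space and the immersion levels it supports. Everything in it is hypothetical: the `VirtualSpace` shape, the function names, the example space, and the assumption that a device able to reach a higher level can also fall back to a lower one. Rules 1 and 3 are judgment calls and are not modeled.

```typescript
// Hypothetical model: a virtual space and the immersion levels (0 to 5) it supports.
interface VirtualSpace {
  name: string;
  supportedLevels: number[];
}

// Rule 2: the space should support devices from more than one level.
// The default minimum of 2 is an assumption, not a figure from this post.
function supportsManyLevels(space: VirtualSpace, minimum: number = 2): boolean {
  // Count each level once, even if it is listed twice.
  const distinct = space.supportedLevels.filter(
    (level, index, all) => all.indexOf(level) === index
  );
  return distinct.length >= minimum;
}

// Highest level the space offers that the visitor's device can reach,
// or null when the device cannot join at all.
function bestSharedLevel(space: VirtualSpace, deviceLevel: number): number | null {
  const reachable = space.supportedLevels.filter((level) => level <= deviceLevel);
  return reachable.length > 0 ? Math.max(...reachable) : null;
}

// Assumed example: a space open to phones and laptops (level 1) and VR headsets (level 3).
const lobby: VirtualSpace = { name: "lobby", supportedLevels: [1, 3] };

console.log(supportsManyLevels(lobby));
console.log(bestSharedLevel(lobby, 1));
console.log(bestSharedLevel(lobby, 4));
```

The point of `bestSharedLevel` is the phone-call analogy from earlier: a visitor with a lesser device still gets in, just with a diminished experience.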
The most important of these rules is #1. Much of the angst and unhappiness with some of Meta’s ideas around the metaverse is that it ignores looking for the right technology for the right message. When I call my parents, it is not to interact with my mom and dad’s virtual avatars but to connect with them. For those who choose FaceTime or video chat over a voice call, it is to create the intimacy you otherwise only get in person. We want to see each other and know that the other person looks and feels well. Google has a fascinating research project that attempts to recreate a 3D visual representation of a person. If given the option, that would be my favorite way to converse without having that person with me. Virtual reality allows a whimsical possibility that would not replace a video chat but will enable me to show ideas in a way I could not before. Some of the Disney Lucas art ideas of creating a world within a world are among the most amazing things I have ever seen. How can we keep imagining 3D CAD designs as 2D printouts when a much better medium now exists to view them?
Mindgrub’s new office
Ultimately, we learned a lot and had to reimagine what the metaverse meant to us as a company. We did that by coupling our reimagining of the metaverse with our understanding of the vast landscape of existing technologies and tools. The number one rule we came back to is to target the best immersion level for the message we need to deliver. In many, many cases, some of the technologies we currently have, while aged, do that very well. We use Slack, Zoom, and Google Workspace and find these to be ok, not always great, but ok tools for a lot of our work. These tools are not going away, and until something better comes along, that will not change. We quickly fell in love with Mozilla Hubs, an open-source platform that lets you freely create a world around a well-backed and robust framework. Its structure balances openness by allowing you to produce a world and host it independently. 
This openness allowed us to let people experience one world using different devices of varying levels of immersion. It also allowed us to test the boundaries by pulling Zoom, Slack, and our other go-to tools into a place we could control and further iterate on. For Mindgrub and me, the internet and the interconnected virtual realm are essential. While many technologies make up the tools we use as the foundation of the internet, we can move from website to website regardless of what was used to construct them. Hubs aligns with that model, with each web domain or URL representing an independent world. I find Hubs akin to modern development frameworks (like Drupal, WordPress, or Node): a solid bedrock with low-cost or free hosted tools to create an environment, but with the ability to venture out and design and develop something different while riffing on the ideas of others. With Hubs, we can build a module or plug-in and decide to make it freely available to the greater world. I’m sure we will all have much more to say as we begin our adventure, but in the meantime, feel free to visit and check out our “lobby.” If you like it, stay a while – VR headset optional. 
- 
	The algorithms did it
Over the weekend, I read a piece on algorithms running the city of Washington DC. Articles like this frustrate me in their ability to take a word that once felt clear and make it crowded and confusing. The word algorithm, even when used correctly, covers such a broad and vast definition that using it necessitates a pause. I can write an algorithm that lets you determine if a number is even or odd: `function isEven(x:uint):Boolean { /* even when x divides by 2 with no remainder */ return x % 2 == 0; }` That is an algorithm, a set of instructions that solves a problem. Guess what? Alexa, Siri, and Google Search are built on algorithms. Do you know what else is? A calculator, a light switch, a TV remote. The number of algorithms we interact with daily is so massive that it may be uncountable. Pointing to an algorithm is like pointing to a single ant in a colony of millions. Any large system includes millions and millions of lines of code, and many almost certainly rely on code written by third-party developers, all focused on solving different types of problems. Articles that blame algorithms feel incorrect and morph the meaning by making the nature of math or development out to be the issue. Developers write code that makes simple and complex decisions based on product plans and designs. Those designs say what we anticipate a system will do and provide criteria to determine if it does what we ask. Code written and designed well will not lie. Could the code or a rogue programmer cause issues? Yes, but many times the true design decisions are made on purpose. If I want an application to find me candidates, I need to establish criteria for a good candidate. To build a search engine that returns excellent results, I must first define what makes a great search result. These criteria are always subjective, and this is the actual question. Should we fear the mountain of algorithms or acknowledge that company-designed software is purposefully built to do something? 
Let’s not blame the algorithm. Instead, let’s ask whether the results we see are exactly what we configured or designed. 
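To make the candidate example concrete, here is a minimal sketch. The `Candidate` and `Criteria` shapes, the field names, and the sample people are all made up for illustration; no real hiring system is being described. The filtering code stays identical in both runs, and only the criteria, a human design decision, change the outcome.

```typescript
// Hypothetical data shapes, invented for this illustration.
interface Candidate {
  name: string;
  yearsExperience: number;
  hasDegree: boolean;
}

interface Criteria {
  minYearsExperience: number;
  requireDegree: boolean;
}

// The "algorithm": it only applies whatever criteria it is handed.
function isGoodCandidate(candidate: Candidate, criteria: Criteria): boolean {
  if (candidate.yearsExperience < criteria.minYearsExperience) {
    return false;
  }
  if (criteria.requireDegree && !candidate.hasDegree) {
    return false;
  }
  return true;
}

function findCandidates(pool: Candidate[], criteria: Criteria): string[] {
  return pool
    .filter((candidate) => isGoodCandidate(candidate, criteria))
    .map((candidate) => candidate.name);
}

// Made-up sample pool.
const pool: Candidate[] = [
  { name: "Ada", yearsExperience: 10, hasDegree: false },
  { name: "Ben", yearsExperience: 2, hasDegree: true },
  { name: "Cy", yearsExperience: 6, hasDegree: true },
];

// Same code, two different design decisions.
const degreeRequired = findCandidates(pool, { minYearsExperience: 5, requireDegree: true });
const degreeOptional = findCandidates(pool, { minYearsExperience: 5, requireDegree: false });

console.log(degreeRequired); // ["Cy"]
console.log(degreeOptional); // ["Ada", "Cy"]
```

Ada disappears from the first result not because the code misbehaved but because someone decided a degree was required.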
