A fundamental characteristic of our reality is our ability to interact with the world around us. Janet Murray defines agency as “the satisfying power to take meaningful action and see the results of our decisions and choices”.
We can break agency down into two components: the ability to make meaningful choices and the freedom to move around in the world.
Choice
The predecessors of simulated realities offered little to no meaningful interaction. Fresco paintings on the ceiling of a church, panoramic video in an IMAX cinema or multi-sensory art installations in a museum did not respond to the spectator.
A larger degree of choice can be found in early generations of computer games. Adventure games such as Zork, Leisure Suit Larry and The Secret of Monkey Island can be seen as interactive narratives in which the reader has to choose between a small number of predetermined responses, leading to different storylines and outcomes. It often took considerable time to discover the right combination of actions to trigger the dialogue or event that would advance the plot, and saving the game frequently was necessary to avoid getting stuck or having to replay an entire section of the game.
Modern computer games such as Call of Duty, Halo and Grand Theft Auto let players walk, run, drive, fly and fight their way through large, open environments and offer a much wider degree of choice. Look closer, however, and many interactions in the simulated reality feel scripted, narrow and unreal. Hand-written decision trees often produce “artificial stupidity”: repetitive behavior, loss of immersion, or abnormal behavior in situations the developers did not plan for. Enemies always seem to appear at the same window first, or walk the same route no matter how many times you have shot them before.
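The “artificial stupidity” of a hand-written decision tree can be made concrete with a small sketch. The names and route below are hypothetical, not taken from any particular game; the point is that every situation outside the authored branches falls through to the same fixed default, which is exactly what produces repetitive, predictable behavior:

```python
# Hypothetical hand-scripted NPC decision tree. Any state the authors did not
# anticipate falls through to the same scripted patrol, forever.

PATROL_ROUTE = ["window", "door", "courtyard"]  # fixed, hand-authored route

def npc_next_action(state):
    """Pick the NPC's next action by walking a fixed decision tree."""
    if state.get("player_visible"):
        return "attack"
    if state.get("heard_noise"):
        return "investigate"
    # Default branch: walk the same route regardless of anything that
    # happened before (e.g. being shot a hundred times at the window).
    step = state.get("patrol_step", 0)
    return "walk_to:" + PATROL_ROUTE[step % len(PATROL_ROUTE)]

print(npc_next_action({"patrol_step": 0}))        # walk_to:window
print(npc_next_action({"player_visible": True}))  # attack
```

However rich the world looks, an idle NPC governed by this tree will always start its route at the same window, which is the repetitive behavior players quickly notice.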
Artificial intelligence in games is becoming increasingly sophisticated, in some cases reaching a level of play that exceeds human ability. Google’s DeepMind subsidiary made news headlines in 2016 when AlphaGo defeated Go master Lee Se-dol, and AlphaGo has since proved unbeatable by human players. Dota 2, the standalone successor to Defense of the Ancients (originally a mod for the Warcraft III real-time strategy game), lets multiple players share an online battle arena, and OpenAI’s Dota 2 bot is capable of defeating professional players. The ever-evolving nature of neural networks and other forms of machine learning will no doubt make non-player characters and game AI smarter, more unpredictable and more like real human players in this decade, adding more games to the list where AI has outsmarted humans.
A much harder agency problem is that truly meaningful interaction includes being able to perform meaningless actions. You can walk up to a virtual car, smash the window and drive the car in Grand Theft Auto, but if you stand in front of a wall you cannot pick up a pen from the table and write your name on the wall. The programmers of the open world have built in the logic that determines what happens in the game world when an event is triggered. An event is triggered when all its preconditions are satisfied, e.g. the player is at the right location, carries the right object and the enemy has seen the player. If the game developers did not think to program the trigger and game logic, and to create the game assets (graphics, audio), for writing your name on the wall, there is no way for you to make that choice. Game artificial intelligence therefore needs to learn not only the game logic but also the game content and objects themselves.
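The precondition mechanism described above can be sketched in a few lines. All names here are hypothetical illustrations, not any engine’s actual API; the sketch shows both sides of the argument: an authored event fires only when every precondition holds, and an action nobody authored simply has no event to fire:

```python
# Minimal sketch of precondition-gated event triggers (names hypothetical).

def all_preconditions_met(event, world):
    """An event may fire only if every one of its preconditions holds."""
    return all(check(world) for check in event["preconditions"])

# An authored event: the developers wrote the trigger, logic and assets.
open_car_door = {
    "name": "open_car_door",
    "preconditions": [
        lambda w: w["player_location"] == "parking_lot",  # right location
        lambda w: "crowbar" in w["inventory"],            # right object
        lambda w: not w["enemy_alerted"],                 # not yet seen
    ],
}

world = {"player_location": "parking_lot",
         "inventory": ["crowbar"],
         "enemy_alerted": False}

print(all_preconditions_met(open_car_door, world))  # True

# "Write your name on the wall" was never authored, so no trigger exists
# for it and the player cannot make that choice at all.
authored_events = [open_car_door]
print(any(e["name"] == "write_on_wall" for e in authored_events))  # False
```

The limitation is structural: agency extends exactly as far as the list of authored events, no further.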
One popular way to increase agency is to let users create and shape the game world and its in-game assets themselves. The best example today is the popular Minecraft game, in which players explore a vast blocky 3D world and build structures ranging from simple houses to large castles and earthworks. The simulated reality becomes meaningful because players collaborate and co-create the experience as they would in the real world. A key challenge with user-generated content is reducing the authoring time it takes to design aesthetically pleasing structures and game assets, so that players maintain their interest in the game. Earlier virtual worlds such as Second Life required considerable artistic and programming skill to design interesting in-game content. Procedural content generation is a technique used in game level editors to help developers design large, varied open world environments faster and at lower cost, and it can also be applied in in-game editors. Spore, the evolutionary game in which players evolve from a single-cell organism to a space-faring civilization, applied procedural content generation in its creature editor.
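A toy example illustrates the core idea of procedural content generation (the function and its parameters are invented for illustration): instead of hand-placing every block, an algorithm plus a single seed deterministically generates a piece of world, so large, varied terrain costs no extra authoring time:

```python
# Toy procedural terrain generator: a seeded random walk over column heights.
# Same seed -> identical terrain, so a whole world can be shared as one number.

import random

def generate_terrain(seed, width, base_height=8, max_step=1):
    """Generate `width` terrain column heights from a deterministic seed."""
    rng = random.Random(seed)   # local RNG: reproducible, no global state
    heights, h = [], base_height
    for _ in range(width):
        h = max(1, h + rng.randint(-max_step, max_step))  # gentle slopes
        heights.append(h)
    return heights

terrain = generate_terrain(seed=42, width=16)
print(terrain)
# Regenerating from the same seed reproduces the terrain exactly:
assert terrain == generate_terrain(seed=42, width=16)
```

Real generators layer noise functions, grammars or evolutionary search on top of the same principle, which is how Spore’s creature editor and modern level editors produce variety without per-asset authoring effort.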
Movement
In the real world we use our fingers, hands, arms, legs and bodies to move objects, use tools, control devices or move ourselves physically from one place to another. We can turn our head to look in whatever direction we want. We experience agency in a simulated reality if we feel our interaction with the world is meaningful. Even in the earliest immersive spaces of prehistoric and antique times, such as the caves of Lascaux and the frescoes of the Villa dei Misteri in Pompeii, spectators could move freely and interact with the environment. The earliest computer games restricted movement to pressing a few keys on a keyboard or pushing a joystick in four possible directions. Recent advances in human-computer interaction are increasingly bringing the same degrees of freedom of movement to virtual and augmented reality environments.
Virtual reality headsets today let us freely move our head and view the virtual reality around us in 360 degrees. Microsoft’s Xbox Kinect and similar motion-sensing and tracking technologies based on computer vision enable people and object tracking without the user having to hold a special device or wear a suit with actuators. Computer vision algorithms can detect and distinguish between different gestures and movements, and advances in deep learning make it possible to control games with just a webcam and no extra hardware.
Everybody who has ever worn a VR headset knows that if you try to physically walk, you quickly run into your computer screen or a chair you overlooked. The ability to freely walk and run in all directions in virtual reality is enhanced by omni-directional treadmill technology. Omni-directional treadmills hold the player in place using a band or ring around the waist and a platform under the user’s feet that keeps the user in the center, no matter how fast or in which direction the user moves. Omni-directional treadmills today are too expensive for most consumers and, unlike a headset or game controller, require a lot of free space to use and store; for these and other reasons their use is largely limited to professional VR environments. Omni by Virtuix and Infinadeck are examples of commercial VR treadmills available on the market. These VR treadmills are not only developed and designed for gamers and esport tournaments, but are also used in psychiatry to treat patients with irrational fears of, for example, narrow spaces or heights, in fitness centers to create a more immersive and pleasant running experience, and in military combat training.
Natural interaction in simulated reality, however, still faces significant challenges. First are usability challenges from physical restrictions such as the goggles and gloves we need to wear or carry and the cables that impair our movement; improvements are needed in downsizing the equipment and making users forget they are wearing or using it. Second, the precision and reliability of tracking multiple independent users, their gestures and their movements with computer vision algorithms must improve for true interaction: writing a book on a virtual reality laptop still requires heavy use of the ‘backspace’ key. Finally, there are psychological hurdles to overcome: our brains must adapt to a new way of interacting (walking on a treadmill differs from real walking), and socially we need to accept that other people will see us walking in a giant baby walker with a VR headset on.
From a distance it may seem that the current level of agency in simulated reality is approaching true agency, indistinguishable from the real, but if we zoom in closer we see that our ability to make meaningful choices and see the results of our decisions by interacting with the simulated reality breaks down. Given the rapid advances in artificial intelligence, computer vision and natural interaction, we expect that believable agency is within reach before the end of this decade. True natural agency, however, will require advanced brain-computer interaction and breakthroughs in artificial intelligence to make us forget we are constrained by devices and to let us perform the meaningless actions that create meaning.





