Sunday, 6 July 2025

Interaction Design for Livestream Performance

One of the great joys of my work - which spans educational games, alternative controllers, and playable stage shows - is that I’m constantly coming across new ways audiences will approach a piece of technology. The more I explore, the more I find myself revisiting the same key design questions. In particular, how does my work create a space where users are ready to interact with it?

Put yourself in the mind of your user. What’s the first question you need to answer? The most obvious answer is what do I do? 

But think again. What do I do presupposes that the user already knows that there is something to be done. Perhaps the user doesn’t even know that there’s something that can be interacted with. Even if they did, they may not even know why they'd want to interact with it at all.

Designing for interactions is not just about conveying information. It's about constructing meaning, providing context, and provoking emotions.

How does the user believe the interactive space already works?

What does the user want out of interacting?

What is the user afraid of?

Codex Bash at the Wellcome Collection, 2015

To give an example, in Codex Bash - a game built for exhibitions and museums - most users don’t know the game is there until they start playing it by accident. 

In a video game exhibition, pressing a big colourful button is usually encouraged. So each of my big colourful buttons triggers the start of the game, ready to be pressed by an unwitting passer-by. With one press the game springs to life with colour and sound, displaying a puzzle so trivial it would be impolite not to complete it.

Once this has been solved, the player is in an active headspace, willing to participate. Buoyed by success in a simple first puzzle, they're emotionally prepared for the more complex challenges the game begins to throw at them.

The Book Ritual at GDC San Francisco, 2019

In The Book Ritual - a game where I invite players to shred a physical book - players are often afraid to deface a book. Defacing books is quite the taboo, especially in public.

So how can I assuage this fear? I sit the game among the mess and clutter left behind by the others who have come and shredded books before you. It's less intimidating to shred a book knowing that you will not be the first person to do it.


Interaction Design in Artholomew Video’s Reading Challenge

In this article I will take one of my interactive Twitch streams, and pick apart the user experience (UX) decisions that went into making a great show for viewers. Even in an environment where the only interaction the players have is to type into Twitch’s chat feature, there’s a surprising amount to be understood to allow for great interactivity.


Let’s have a look at one of my signature livestream games. This is the Reading Challenge segment from Artholomew Video’s Stream Challenge. Artholomew Video is a livestream video game collection I designed to get streamers and their viewers being creative together. It won the grand prize at the Creative Gaming Awards in Hamburg in 2022.

On screen I am reading Alice's Adventures in Wonderland, but with a twist. Viewers can edit the text as I’m reading, simply by typing into Twitch chat. Type replace ___ with ___ into Twitch’s built-in chat window and all instances of the first word will be replaced with the second. 

So, if you type in replace Alice with Richard Nixon, Alice will, for the remainder of the text, be referred to as Richard Nixon. That is, until someone types replace Richard Nixon with a baked potato.
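As a rough illustration of this mechanic, here is a minimal Python sketch of how a chat line could be parsed and applied to the text. The function names parse_replace and apply_replacements are hypothetical, and the choice of case-insensitive, whole-word matching is an assumption, not a description of how Artholomew Video actually works.

```python
import re

# Hypothetical command shape: "replace <old> with <new>".
REPLACE_PATTERN = re.compile(r"^\s*replace\s+(.+?)\s+with\s+(.+?)\s*$", re.IGNORECASE)


def parse_replace(message):
    """Return (old, new) if the chat message is a replace command, else None."""
    match = REPLACE_PATTERN.match(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


def apply_replacements(text, rules):
    """Apply each (old, new) rule in the order viewers submitted them."""
    for old, new in rules:
        # Assumption: match whole words only, ignoring case.
        pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
        # A function replacement keeps backslashes in `new` literal.
        text = pattern.sub(lambda _match, new=new: new, text)
    return text


rules = []
for line in ["replace Alice with Richard Nixon",
             "hello everyone!",
             "replace Richard Nixon with a baked potato"]:
    parsed = parse_replace(line)
    if parsed is not None:
        rules.append(parsed)

# Prints: a baked potato was beginning to get very tired.
print(apply_replacements("Alice was beginning to get very tired.", rules))
```

Because rules are applied in order, a later replacement can rewrite the result of an earlier one, which is what lets Richard Nixon become a baked potato.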

My job as streamer is to simply read with passion, bring the newfound nonsense to life as if it’s a serious work of literature, and see if I can get to the end of the book.

Viewers can also type remove to stop a word from ever appearing again, hesitate to add increasingly many um and ah noises into the text, and undo to revert changes they don’t like. Or at least, in the initial versions they could. I have since removed these keywords for UX reasons that I will explain later.


The Key Questions

Now let’s look at what’s going on in this livestream. It’s important to understand it, step by step. Here are the key questions I'm concerned with as a designer:

Do they understand what’s going on?

Do they feel safe to join in?

Do they want to interact?

How do I deal with these questions in order to provide a livestream that is coherent and welcoming, and that inspires viewers to join in?


Do they understand what’s going on?

In most of my games I aim for a very clear goal: players should be able to understand it just by looking at it.

As best as I can, I try to meet that standard with my livestreams too. Even when a viewer joins the stream midway through they should be able to understand what's going on. Even if it takes a little time, a clear initial idea should be legible at a glance.

What should be immediately obvious is that there is a large amount of text on-screen. It takes up much of the view and is clearly the centre of attention.


Sitting with the video for a few seconds the viewer should probably notice that the streamer, visible on-camera, is saying words that match the text, or talking about that text. Looking a bit longer they'll probably notice the text is fairly nonsensical.

Hopefully this should inspire a question: why is it nonsensical?

Fortunately, most viewers have made a conscious choice to tune into my stream, so they come in with some amount of curiosity. While they remain curious, looking around the screen for clues that would explain this nonsense should not be a big ask.

They'll find clues in what people are typing into chat: why does everyone keep on typing replace? There are clues in the popups that appear when other viewers make changes. When the text rewrites itself at the bottom of the screen it tells you that something has changed. Someone has done something. And when there's no activity from other viewers, an explanation cycles at the top: Reading Alice in Wonderland while you edit it for 20 seconds, then Type "replace ___ with ___" to change the text for another 20.
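The cycling explanation could be driven by something as small as the following sketch. The 20-second duration and both messages come from the description above. The function name banner_at, and the idea of measuring time since the last viewer activity, are assumptions made for illustration.

```python
# Each banner message is shown for 20 seconds before cycling to the next.
BANNER_SECONDS = 20
BANNER_MESSAGES = [
    "Reading Alice in Wonderland while you edit it",
    'Type "replace ___ with ___" to change the text',
]


def banner_at(idle_seconds):
    """Pick the banner to show after `idle_seconds` without viewer activity."""
    index = int(idle_seconds // BANNER_SECONDS) % len(BANNER_MESSAGES)
    return BANNER_MESSAGES[index]


for seconds in (0, 19, 20, 45):
    print(seconds, banner_at(seconds))
```

With only two short messages, a viewer who arrives at any point sees the full explanation within 40 seconds.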

It’s important to understand that this is a video medium, so until viewers have understood that it’s interactive, they will consume it passively. If this were a menu in a game, or a payment screen on a website, the user would have a desire or pressure to do something. In these cases the interaction would need to be obvious as soon as you see it. Passive consumption is a luxury afforded to the interactive livestream.

However, they will only be willing to decipher the view as long as they remain curious. So it’s important that these clues are obvious, concise, and reinforce each other. The viewer should not have to work hard at hunting for them.

Of course, many viewers simply ask in chat. For some viewers, simply asking is a more efficient way to gain understanding than reading the on-screen text. But asking a question relies on a little courage, on the viewer's part, to make themselves known to the streamer and perhaps interrupt his flow.

Once a viewer has had the mechanics explained to them by me verbally, they build a mental model of how to interact. The on-screen text is still useful for them, though. If their model matches the on-screen instructions they can confirm they have understood.


Do they feel safe to join in?

Having built an understanding of what’s going on in the stream, the viewer may still not be ready to interact. Before typing replace into chat they still need to feel safe and welcome.

There are, of course, plenty of risks to joining in with a chat. After all, you're putting yourself on the spot, with whatever message you post under the scrutiny of the other viewers. What if you make yourself sound stupid? What if you accidentally single yourself out as a bad apple?

After all, if you only type keywords, you may come off as rude. You'd be treating the streamer as an object to be toyed with rather than a human being. What if you make a change that throws the streamer off his flow and ruins the stream for everyone else?

That's why, even before trying out the replace keyword, viewers will type a greeting into chat. Or, they may ask some variation of what do I do? For many viewers, asking how to join is a passive way to confirm they are allowed to. 


Having me, as the streamer, confirm that you are invited to join in welcomes you into the magic circle. It's typically up to me to personally reassure the first few viewers that there is no bad suggestion. Once the stream is well underway, and chat is filled with ambitious and bizarre edits, the anything-goes tone is self-evident and viewers can feel a little less self-conscious as one voice among many.


Do they want to interact?

Knowing how to interact with the stream is one challenge. Knowing that you are allowed to is the second. But the third vital question, before you can interact at all, is why would you choose to?

Any interaction comes with a risk. What are the consequences of getting it wrong? For a user to choose to do any interaction, the desire for the outcome must outweigh the risk.

Even in a simple video game like Super Mario Bros, a new player may choose not to touch the controller at all. They may desire the sense of achievement of beating a level, or they may desire feeding their curiosity about what’s on the next screen. But they also risk embarrassing themselves by being bad at the game, or by wasting their valuable time. In order to interact with Super Mario Bros, the desire must outweigh the risk.

So what does a viewer desire from interacting with my livestream? 

Maybe they want to be funny. Maybe they want to be witty or smart. Maybe they want to create genuinely beautiful poetry. Maybe they want to make me inadvertently say something rude. Maybe they want to wreak havoc, or throw me off of my rhythm, or maybe they want to make things easier for me. Maybe they want to see if they can break the software, and explore its technological limits.

When these desires are satisfied, viewers are inspired to interact again. Perhaps they may type something more ambitious, hoping for an even better payoff. But if their desires go unsatisfied, the viewer may feel the payoff is no longer worth the risk, and cease interacting.

It’s here where the streamer becomes a living part of the user experience. As a performer, I strive to make good on every interaction. It’s important to respond to any suggestion with aplomb, with enthusiasm, and with curiosity. 

I am literally part of the feedback mechanism.

If a player types something in to mess me up (players have replaced words with Welsh words, Cyrillic characters, and even emoticons) my job is not to feign being flummoxed by it, but to try my genuine best to read it. The underlying desire for the player is not actually to mess me up. Instead, they want to satisfy their curiosity, and see how I roll with their punch. They are making a clown game of messing-me-up. If I don’t genuinely try to read the text I’m not playing their game and I’m not fulfilling their desire.


Functional UX Choices in the Reading Challenge

I want to take some time to go one by one through various design decisions that went into this stream - some larger, and some more minute. Some of these individual choices were designed to make the stream readable and intuitive. Others serve a more emotional purpose: creating a space that doesn’t just facilitate interaction, but also invites it.


When you make a change a popup appears

Whenever a viewer types in a change, a message appears at the top of the screen, telling them - in no uncertain terms - that their input has been received and interpreted. If you type something and the book doesn’t change, the fact that the popup appeared shows you that your input was recognised, even if it didn’t have the outcome you expected, and reassures you that the game isn’t broken.

As a viewer, when you see these messages come up, you should also have enough information to relate it back to a line in the chat log - who sent it and what words did they use? This reinforces how to join in without me or the instructions having to tell you.
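To make the popup's role concrete, here is a minimal sketch of how such an acknowledgement might be built. The function name popup_for, the wording of the message, and the "not in the text yet" note are all hypothetical. The only point taken from the stream itself is that a popup appears for every recognised command, names the viewer, and echoes their words.

```python
import re


def popup_for(viewer, old, new, text):
    """Build the on-screen acknowledgement for a replace command.

    A popup is produced even when `old` is not found, so the viewer can
    tell that the input was received and the game is not broken.
    """
    # Assumption: count whole-word, case-insensitive matches in the text.
    pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
    matches = len(pattern.findall(text))
    # Name the viewer and echo their words so others can find the chat line.
    message = f'{viewer} replaced "{old}" with "{new}"'
    if matches == 0:
        # Hypothetical wording for a command that changed nothing on screen.
        message += " (not in the text yet)"
    return message


page = "Alice was beginning to get very tired of sitting by her sister."
print(popup_for("viewer1", "Alice", "Richard Nixon", page))
print(popup_for("viewer2", "Gryphon", "a baked potato", page))
```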


When you make a change you quickly get to see its impact


If you type in a change, it should always lead to a change on-screen, so you can see the impact of what you have done. In the reading challenge this is largely determined by the text of the book. If you change a word that won’t appear for the next 10 chapters, you won’t see its effects right away. This is a limitation that can lead to disappointment for first-time viewers.

Because of this I changed the format of the stream for future versions. I now read poems instead of novels, as you can see in the video above. Reading a 14-line poem on a loop, I can guarantee that whatever change you make you will see the impact within the next minute.

This serves a functional purpose of reinforcing your understanding of the mechanic. It also serves an emotional purpose of reassuring you that your voice is being listened to. Once you see your choice has made an impact you feel empowered to make more choices and explore the boundaries of the game.


When you make a change I honour it in good faith

A machine that does not respond as expected appears to be broken. It suggests “don’t touch me.” When a performer does not play fair it suggests “don’t interact.”

If you replace text with something unreadable it is important that I still try to read it out. Again, on a functional level this gives you the feedback of the human-machine system working, and on an emotional level it shows you you have the power to create play, and inspires curiosity. Try again and you might get another strange response out of me.


Keywords are required to make an input

One of my first interactive streams was called the Ideas Challenge. I come up with 100 game ideas based on prompts typed into the chat. In this version I got the game to take every single message typed into chat and use it as a prompt. I loved the idea that I’d have to come up with game ideas around themes of “what is this?” and “wait, my messages come up in the game?”

Unfortunately, the reality was that if every chat was treated as an input, nobody wanted to chat.

People needed to have the space to bounce ideas around, respond to funny moments, and congratulate each other on their suggestions, without every message being committed to the performance. Viewers were afraid their messages would derail the performance, and didn’t like the idea that their minor whims would be put on blast or picked apart on-camera by the artist.

Chat has an expected functionality, and regardless of what you’re playing, chat should be usable as a chat.
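The difference between the two designs comes down to a single routing decision, sketched below. The function name route_message is hypothetical, and using replace as the keyword is borrowed from the Reading Challenge for illustration.

```python
KEYWORD = "replace"


def route_message(message):
    """Classify a chat line as a game input or as ordinary conversation."""
    words = message.strip().split()
    # Only a line that opens with the keyword is committed to the performance.
    if words and words[0].lower() == KEYWORD:
        return "input"
    # Everything else stays in chat, where viewers can talk freely.
    return "chat"


for line in ["replace Alice with Richard Nixon",
             "what is this?",
             "lol nice one",
             "can I replace things too?"]:
    print(route_message(line), "|", line)
```

Requiring the keyword at the start of the message means a viewer who merely mentions it mid-sentence is still treated as chatting.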


Your range of possible interactions is limited

As I mentioned earlier, originally there were multiple ways to edit the book. Type replace to replace one word with another. Type remove to get rid of a word. Type hesitate to add random um and ah noises into the text. You could even undo changes you didn’t like.

The problem was, this gave the player a massive toolset of actions to wrap their heads around. It also meant I had four instructions to explain on-screen instead of just one. A viewer has to first see that the instruction is there, then parse it, then try it out, before they can feel empowered to explore it.

All of that for outcomes far less interesting than replace.


With just one keyword - replace - you can see one viewer use it and already understand the entirety of the toolset available. replace is a surprisingly versatile toy. By focusing the viewer’s attention on that one keyword they become inclined to explore its possibilities, rather than continue working to understand the other keywords.

Playing with the toy you have is emotionally empowering - it gives you control of the play-space. Working to understand a suite of toys is emotionally disempowering. It puts me, the creator, in charge of the play-space. 

I’d rather you felt empowered with one toy than disempowered with four.


Black text and grey text

When a viewer makes a change they need to see the effects of that change immediately. However, I need to be able to read complete sentences without the text rewriting before my eyes.

To try and allow for both, I split the visible text into two sections: the black text at the top will not change, but the grey text at the bottom changes as soon as a viewer edits it. When I turn a page, the grey text moves to the top of the screen and becomes black, revealing new grey text below it.


It’s an imperfect solution, because ideally the viewer should immediately see the impact of any change they make. When looking for words to replace, viewers will look to see what’s in the text, which are usually the words in black. This was another reason to switch from novels to poetry, as viewers will now see the impact of changes to black words, albeit only after the poem has looped back around.

While imperfect as a solution, it’s worth noting why people choose the words in black: because it feels natural to them. As a designer it is better for me to work with what viewers do naturally, rather than find a way to discourage it.
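The black and grey split described above can be modelled as pages where only the upcoming page accepts edits. This is a minimal sketch under stated assumptions: the class name Reader and its methods are hypothetical, each list item stands in for one screen section, and plain case-sensitive string replacement is used for brevity.

```python
class Reader:
    """Pages of text where only the not-yet-read (grey) pages accept edits."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.current = 0  # index of the black page the streamer is reading

    @property
    def black(self):
        return self.pages[self.current]

    @property
    def grey(self):
        upcoming = self.current + 1
        return self.pages[upcoming] if upcoming < len(self.pages) else ""

    def replace(self, old, new):
        # Edits skip the black page so sentences never change mid-reading.
        for i in range(self.current + 1, len(self.pages)):
            self.pages[i] = self.pages[i].replace(old, new)

    def turn_page(self):
        # The grey page becomes the black one, revealing new grey text.
        if self.current + 1 < len(self.pages):
            self.current += 1


reader = Reader(["Alice sat.", "Alice ran.", "Alice slept."])
reader.replace("Alice", "Richard Nixon")
print(reader.black)  # Alice sat.
print(reader.grey)   # Richard Nixon ran.
reader.turn_page()
print(reader.black)  # Richard Nixon ran.
```

The sketch also shows the limitation: a viewer who picks a word from the black text sees nothing change until that word turns up again in grey.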


(Don’t) make me tap the sign

You may spot a small notice on the screen, showing the text family friendly mode. The notice is there to signal that I don’t want people to make rude requests, as I feel these can sour the experience for some viewers.


The sign is not important to understand the stream, so I give it very little screen space.

During streams I will probably get a couple of inputs that are a little on the risqué side. In these moments I can remind viewers of the notice. I don’t expect any viewer to have spotted it, but the fact that it’s always been there makes that viewer feel less like they’re being personally singled out. 

Having the sign there makes the difference between “ah, I should have noticed” and “that’s not fair, I didn’t know!”

Secretly, I expect to tap the sign. The sign is there to provide authority. I don’t expect you to actually read it.


Strategies for Developing User Experience

So that's how user experience is built into an interactive livestream. But what of this can be applied generally to all interactive experiences? How do I identify UX problems? And how do I go about solving them?

This is something I've done across my work, from mobile games to alternative controllers. When I used to work for Sparx Learning, we'd regularly take my in-development times tables games to primary school classrooms to test with the children. I noticed my approach to getting feedback was very unusual compared to the engineers and content writers who would accompany me.

My approach often asks the designer to remain passive, and expect to see their best-laid plans proven wrong.


Sit back and observe

When given any piece of software users will use it however feels natural to them. When you're watching a user struggle it can be tempting to jump in and help them. Instead, just watch it. Can they figure it out for themselves, and if they can, how did they reach the solution? What futile actions do they keep trying? 

Often users will gravitate to what looks like it could be a solution, and it's helpful to know what imagined solution they're reaching for. They might be looking for a tool that you know won't solve the problem: in that case, what in your interface is misleading them? Have they come with prior expectations or from previous software? Are you using a misleading visual metaphor? 

In the classroom I noticed that students were reluctant to do any action that might be deemed "a wrong answer." Even in what's clearly the play part of a maths-game, students were reluctant to do anything by trial-and-error. In the classroom environment students approach learning games as schoolwork, not as a toy, and in mathematics schoolwork trial-and-error is discouraged.

At first, a student who can't find a button to press behaves the same as a student who's seen a button but is afraid to press it. You need to watch and wait to understand what their real thought process is.

On a livestream this observation can be harder, as the only live feedback you have is in the chat window. But the kinds of replace actions players try show me a lot about their headspace. Streams typically start with a lot of puns and wordplay, and only get daft and experimental later on. This taught me that players feel nervous about their first input, and work hard to make sure it's "good." To be playful and experimental requires the ice to be broken.


Get users to talk aimlessly

Analytics and questionnaires have the aura of getting accurate and statistically rigorous evidence. However, for developing a baseline understanding of how your interface is being used, they are borderline useless. They are reliant on you already knowing what problem you're looking for. Through the development of an interface what you want to understand is the user's mental model of what's going on, and in reality it will be very different from yours as the designer.

To reach this understanding, I like to just get users talking. I like to strike up a conversation with a player, during or after one of my games. I don't need to ask them anything specific, but just keep them talking as long as possible. Most people will return multiple times to the topics that are most important to them. It may be the flaw that really bothered them. It may be the opportunity that really excited them. 

Indeed, the way they describe what they're trying to do can provide a lot of insight into their mental model. If a player describes all of your NPCs as enemies you may have failed to convey an important piece of meaning about your friendly community of creatures.

After my livestreams I often message viewers to ask what they thought about it. When viewers told me about bugs in the text-replacement, there was a reason these bugs had particularly bothered them. These viewers came up with good jokes which failed to land, as the text-replacement did not function how they expected. These weren't bugs but intentional oversights that made logical sense and that I expected would be fun to toy with - but that expectation was part of my mental model that did not match my viewers'.

On a deeper level this taught me that my anything-goes ethos, where mess and nonsense are beautiful, does not come naturally to most viewers. I realised that my job as streamer was to cultivate that mentality, and this ultimately changed the design goal of Artholomew Video as software. Where I had originally intended it as a dadaist art-making tool, the game became educational, with a goal of teaching creative flexibility.


Accept what users find natural

During my first livestreams, viewers didn’t interact how I expected they would. I didn't expect them to want to chat to me before they joined in. I didn't expect them to try to change words in the screen-area I'd earmarked to not change. I didn’t expect that parsing every chat message as an input to the game would stop viewers from using the chat entirely.

In each of these cases, viewers were acting in a way that was totally natural to them. Even if it went against my original design intent, my job was to work with that and support their natural expectations - not to discourage them.

Indeed, users' natural behaviour can often lead to something better than your initial intent. I created Codex Bash as a single-player game, but players instinctively invited friends to help them out. In fact, the game was much more exciting as a co-operative game, and my players could see that before I could.


Don’t make them have to think

In my reading livestreams I needed to reduce the number of keywords available to viewers to just one: replace. With only one keyword they only had one mechanic to understand and, when understood, were free to explore the range of possibilities it offered. 

Users want to feel independent and empowered. If they understand all the mechanics they are in full control. If they do not understand all the mechanics, they feel like an interloper in a space the designer controls.


Reframe your assumptions

Often when we build something it’s not until we see it up against a real audience that we truly understand what experience we wanted to build.

It was like this with the Reading Challenge. Initially I got the audience to edit complete novels because I pictured the stream as a Sisyphean task. I must, against the odds, try to read a complete story as the audience endlessly fudges and extends it. 

That’s not what viewers responded to. Instead, the magic came from giving them a way to creatively transform the work. The text begins complete and polished. It becomes a pool of nonsense. It then develops into its own kind of experimental poetry, something everyone has created together out of their shared sense of humour.

Playful transformation was the heart of the livestream, and switching my source material from novels to poems played better into that idea. It made sure that small changes had big impact. To reach that important change I needed to jettison my initial intention of it being a Sisyphean struggle.


A World of User Experience

Interactivity is all around us. Every action we take as humans is done with an understanding of the space around us, and what it allows us to do. In other words, its affordances.

What does our living room allow us to do as we move around its nooks and crannies? We can sit on our sofa or crawl under it. We can watch our television or balance a glass on top of it. Every environment offers multiple ways to interact with it. Some of these seem instinctive and natural. Some of them are taught to us by experience. Some of these seem subversive, surprising, perhaps risky and dangerous.

Making games for unusual spaces forces us to see the affordances in spaces we'd not previously have investigated. And in doing so we learn which of these affordances come naturally and which require a leap of logic or courage. We can take this skill back with us when we return to the world of software.

Let's say you make a game to be played in a garden. Maybe you notice the grass, and you notice its affordances: it's soft and springy, enough to roll around on it. So let's make a game where you roll around on the grass.

Invite people to join you and you may learn: the affordances of grass are not necessarily an invitation. A player may not want to roll on the grass because they’re afraid of staining their favourite white shirt. They may be reluctant to roll on the grass for fear of getting into trouble. There may have been a keep off the grass sign they'd neglected to spot, after all!

What choices can you make to help players feel more comfortable rolling on the grass? Here we see laid bare the emotional negotiation that must go on before a player will choose to interact.

An emotional negotiation is required to make players feel comfortable joining hands and spinning around. An emotional negotiation is required for a player to reach over to another player's stomach, pressing the attached tablet as a video game button.

Live-testing Go! Power Team! at JOIN Summit in 2015

There's an emotional negotiation required with any interface, no matter how small or how simple.

To create good user experience we need to regard our users through a compassionate lens. They do what feels natural to them. They do what they are driven to by their desires. They choose not to do things that they are afraid of. And they cannot have a sense of desire or confidence to overcome fear until they have been clearly informed.

Once these needs are fulfilled the user can approach your software with a sense of trust.