Technoculture, Art and Games (TAG) is an interdisciplinary centre for research/creation in game studies and design, digital culture and interactive art.



Alexa Studies – On Ubiquitous Audio Agents

Posted by Bart

I ordered an Amazon Echo for the Milieux nerve centre. For a while I have been dancing around projects, presentations, and tinkerish writing on non-human and machine agencies in contexts of play, and as fond as I am of thinking about video games, consumer robots and toys, this hype cycle around “intelligent personal assistants (IPAs)” like Alexa, Siri, and Cortana is intriguing. Right now, it’s a think-thing but I’d welcome projects that could push this as a make-thing.

Alexa is worrying for all the old reasons. What are the corporate motivations behind the pushing of these IPAs? Why is there so much social normativity explicit in the conceptualization and design of IPAs (especially vis-à-vis gender)? What are the implications for our ever-encroaching surveillance society of the ubiquity of a device that is always listening, and when did the idea of someone or something that is “always listening” become a value? The list of worries is long.

But it’s a good time for critical engagement with the Amazon-Alexa-Echo assemblage (amongst others) around these questions because the objects exist but they are fairly useless (or in my play-theoretical idiom they are “unserious”). The crucial work has less to do with what these things are than with what these things can become. The mistake would be to dismiss them out of hand in some kind of artistic, critical or moral sneer born out of not-nuanced-enough assumptions about the military-industrial-entertainment complex which churns out all these gadgets.

I have suggested this before… critical intervention in the early adoption phase of these technology cycles allows for the possibility of subversion, appropriation, remixing and derailing in ways that become more difficult as the socio-material scaffolds for the use and meaning of these things are put in place. It’s never the objects that should worry us, it’s the scaffolds that are scary. The strategy is to keep Amazon-Alexa-Echo off balance, all David and Goliath style, before the phalanx shores things up. Since techno-capitalism requires no a priori scaffolding so that profit can be extracted from anywhere and anyone at any time, there is always an opening; a crack, a vantage… a clear shot for someone with steady aim.

As long as all anyone can do is order pizza with Alexa and play Spotify tunes then it’s possible that the killer app for Alexa can also kill, or at least wound, Amazon’s supposedly benign Orwellian fantasy. The key feature of my techno-politics on this front is to risk being complicit. This requires a certain hubris, but then so does refusal.

Back to Alexa. Nomenclature notwithstanding, what we are dealing with here is automatic speech recognition on the one hand and natural language processing on the other. What are the possibilities for “natural language” interaction with intelligent agents? Behind this is a further conversation about machine learning and the question of what it might mean for these naturally conversing agents to be able to adapt to their interlocutors and conversational situations. Right now Alexa is little more than a kind of universal voice-activated remote control, but the pretense is about much more than this and thus we have a cultural opening. As with VR and other gadgets, the consumer subject is constituted in marketing schemes and now waits with bated breath.

There are three core entities in this assemblage, however, and I want to consider them all. There is the Echo hardware, the Alexa software and Amazon. I am interested in this Echo-Alexa-Amazon thingy.

The Echo part is just a wifi and Bluetooth enabled speaker and microphone array. The teardown web posts are instructive. There are the beginnings here of thinking about what the located hardware/software configuration could be with something like the Echo, and it is certainly hackable as it is.

Intriguingly, there has been some discussion on blogs about the effectiveness of the design configuration of the Echo’s microphone array in terms of voice input in public and semi-public places which shifts the context of voice recognition from a kind of private affair of speaking into a mic at close proximity (phone mics ask speakers to come close) to a more public affair of talking out loud.

It’s interesting from a surveillance studies point of view that whatever you say to Alexa (via the Echo) will also be heard by anyone else in proximity and that means, sociologically speaking, that it’s only appropriate to say certain things. Of course the same is true when people speak at their phones (to Siri or to other people with those dumb headsets) but the key there is you are not supposed to be listening in (even though everyone does). With Echo, the possibility for Alexa to be an interlocutor in group/public conversation, as well as the presumption that it might be okay, makes for an interesting set of radical design affordances. Most functional apps presume dyadic relations between Alexa and a user but the intriguing ones are the party apps in which Alexa becomes a member of, or helps constitute, a group (via Simmel, with you, me and Alexa we have a group and therefore sociology can happen).

What also intrigues me about the Echo (but also the Dot and/or other dedicated Alexa hosts) is that despite the fact that everything that makes Alexa distinctive is happening in the software located on Amazon servers via “the cloud,” people seem to find themselves talking to the black cylinder as if it were Alexa. Echo is just one body for the Amazon-Alexa agent “to possess” in this sense. That Echo might be a material body of Amazon-Alexa then becomes an interesting investigation which would consider patterns of everyday activity with the device as well as interventions through the production of alternative host bodies. The more interaction vectors that can be configured in the host body, the more located Alexa might become, though the danger in this would be that it could become easier to ignore the Amazon-Borg brain to which Alexa ultimately owes its fealty.

As for the cloud-based Alexa software and APIs themselves, I am less intrigued by the idiocy of trying to impute a corporately massaged personality to the thing as a sweet-talking, ever-at-your-service, docile, female lackey that reinforces norms and stereotypes around the division of labour in the domestic sphere… “honey, can you turn on the lights?”, “honey, can you change the channel?”, “honey, why don’t you just order us a pizza?” The critique here is low-hanging fruit even if the situation is worrisome.

The challenge as I see it is in the opportunity for subversion and satire that might transcend the moment of the critical design sketch or prototype and actually make it into the Alexa “skills” store while there is a clear sense that folks who bought the thing are still reflexive about it (as they periodically search for anything available to help justify the expense before the thing gets relegated to a closet). There is everything here from creating a bitchy Alexa to more subtle interactive audio adventure stories and games that aim to unsettle the platform, the interaction context and its meanings. Critical, subversive-minded game designers could really do something here, I think, because Alexa is a perfect cover for lots of interactions. A lot can happen if you take “honey, can you turn on the lights?” as simply the opening move for an unfolding game or story.

Which brings us finally to Amazon and the failure of corporate imagination. The prevailing imaginaries for the intelligent personal assistant aren’t really scary so much as they are epically sad. Given the long history of fictional and non-fictional imagination around human interaction with non-humans (in and out of bodies) be they machines, spirits, aliens, or animals it’s astounding that even the corporate designers and executives could be so uninspired, so vacuous and so banal. At the very least we should have an Alexa skill to recapitulate Captain Kirk’s battle of wits with Norman (amongst other machines he managed to talk to death). Alexa is already a capable foil for humanist pretensions. It helps also to be messing with Alexa in Canada where it is even more aimless. Canadian Alexa just isn’t worth Amazon’s attention at the moment.

Alexa will give way to viable open source alternatives soon enough and Amazon will be cut out of the equation for everything but the most banal of operations. But for a brief moment… could Amazon be de-natured by its own IP?  Because it is near useless, it’s possible that Alexa is still a free agent. We need more non-humans on our side for once.