Scrying through AI
Group experience with “synthetic verbal collaging” - a projective technique for co-imagining divergent futures with AI through a text-to-image feedback loop, 2021-22
In January 2021, OpenAI published a demo of DALL-E, a tool that generates (almost) photorealistic images from a simple text caption. Text-to-image neural networks existed before (e.g., the freely available AttnGAN model), but in 2021 they made a significant leap in the quality and realism of their output. Although DALL-E itself is not freely available, open-source alternatives quickly emerged that achieved similar results and became a sensation among "creative AI" enthusiasts. What can these generated images tell us about our world? Can generating images from text be anything more than an addictively entertaining cabaret?
More and more generative AI tools are converging on interaction via text input, and the essence of design is thus shifting towards mastering so-called prompt engineering, i.e. crafting effective text input. Although it may sound simple, writing the right text prompt is not a trivial matter and must take many different factors into account. There is friction between visual and verbal representation, between human and algorithmic logic, between our cultural references and the statistical representation of our visual culture by artificial intelligence. Can this constant translation between the verbal and the visual, between our language and computer code, between ourselves and artificial intelligence, teach us anything?
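As a purely illustrative sketch (the specific phrasing and separators below are hypothetical, not taken from the workshop), a prompt is rarely a single noun; it is already a small verbal collage of subject, atmosphere and stylistic references that the model has to reconcile statistically:

```python
# Hypothetical example of composing a text prompt from fragments.
subject = "a city square after the last car has left"
modifiers = ["overgrown with moss", "golden hour", "35mm photograph"]

# Joining the fragments is itself a design decision: word order,
# separators and repetition all shift how the model "reads" the caption.
prompt = ", ".join([subject, *modifiers])
print(prompt)
# a city square after the last car has left, overgrown with moss, golden hour, 35mm photograph
```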
The workshop method consists of formulating an initial verbal statement that serves as the first text prompt, generating a series of images from this prompt, analyzing the results, and writing a new text prompt that modifies the next generated series. Each generated image is a collage of the individual words in the text input (plus the initial input image, if one is used). By editing the words in the next iteration, participants build on the previously generated image and rework it. The goal is to collectively co-create a shared image that may be difficult to imagine or visualize in the first place. Visualization is achieved simply by adding, removing or changing words in the verbal input.
No additional tampering with the generated visuals is allowed, and the chain between the previous iteration and the new text prompt must never be broken.
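The loop can be summarized in pseudo-code. This is a minimal sketch of the procedure, not the workshop's actual tooling: the function generate_images stands in for whichever open-source text-to-image model is used, and the prompt revision happens through group discussion rather than a single keyboard.

```python
from typing import List, Optional

def generate_images(prompt: str, init_image: Optional[bytes] = None,
                    n: int = 4) -> List[bytes]:
    """Stand-in for the text-to-image model (e.g. an open-source
    DALL-E-like system). Returns a series of n image buffers;
    here it only returns placeholders so the loop structure is visible."""
    return [b"" for _ in range(n)]

prompt = "a shared future that is hard to picture"  # initial verbal statement
init_image: Optional[bytes] = None                  # optional seed image

for iteration in range(5):
    # Generate a series of images from the current prompt (and prior image).
    series = generate_images(prompt, init_image)

    # Collective analysis of the series happens here, off-screen.
    # The only permitted intervention: add, remove or change words.
    prompt = input(f"Iteration {iteration + 1}: revise the prompt > ")

    # The chain is never broken: the next iteration builds directly
    # on the previously generated image; no retouching is allowed.
    init_image = series[0]
```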