By Rosemary Lee
Chapter 1 of Algorithm, Image, Art
Images are increasingly informed by their engagement with algorithms, with the growing ubiquity and processing power of machine learning giving rise to a proliferation of new practices, aesthetics, and theories in image-making, art, and visual culture at large. The effects of algorithms on images tend to subvert existing theoretical conventions, as they are connected to highly automated and dynamic processes in ways that are not always intelligible to viewers. Characterizing images, art, and visual media as “algorithmic” emphasizes that these may be defined to a greater extent by their procedural qualities than by their position relative to human perceptual experience and expressions of agency. In this sense, recent developments in the technical production of images present certain aspects of genuine novelty, challenging traditional evaluation criteria that have typically prioritized visual aesthetics, human authorship, and directly referential forms of representation, yet they are often accompanied by narratives that originally emerged in relation to earlier visual paradigms such as photography, painting, drawing, or printmaking. Examining how current examples, methods, and contexts are informed by historical tendencies in visual technologies and their surrounding discourses, this investigation focuses on the theoretical and artistic implications of algorithmic methods, specifically those involving machine learning. By situating recent developments in the use of machine learning to produce images in relation to instances drawn from the history of art and of visual media, it seeks to develop a better understanding of the historical threads that converge in the use of algorithmic approaches in recent artistic practice.
An algorithm can be explained as a recipe of sorts, or “a set of modular or autonomous instructions — in execution — for the doing or making of something, which includes necessary elements, constraints, and procedure, taken together dynamically” (Bianco, 2018, p. 24). When generating an image using a machine learning system, the algorithm is the sequence of operations performed by a computer in solving a given problem. In this case, the “problem” may be framed as a question of how to create an image that reflects the attributes of a given dataset. It’s important to note that, contrary to a popular misconception, it is not the algorithm that changes over time, but the model. The algorithm is performed, often repeatedly, while a statistical model is updated and adjusted to improve the system’s performance at a given task. A trained machine learning model, therefore, comes to bear the impressions of the content and context to which it is applied.
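This distinction can be made concrete in a few lines of code. The following is a minimal sketch in Python, not drawn from any particular system discussed here: a one-parameter model is fitted by repeated adjustment, and the loop of compare-and-adjust (the algorithm) never changes, while the model’s parameter does.

```python
# Minimal sketch: the algorithm (the loop below) stays fixed;
# only the model's parameter changes as training proceeds.

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (input, target) pairs
weight = 0.0          # the "model": a single adjustable parameter
learning_rate = 0.01

for step in range(1000):                      # the algorithm, performed repeatedly
    for x, target in data:
        prediction = weight * x               # apply the current model
        error = prediction - target          # compare the output to the data
        weight -= learning_rate * error * x  # adjust the model, not the algorithm

print(f"trained weight: {weight:.2f}")  # the model now bears the data's impression
```

However simplified, the sketch shows why a trained model can be said to bear the impression of its data: the procedure is generic, while the resulting parameter values are specific to what the system was exposed to.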
Melanie Mitchell defines machine learning as “a subfield of artificial intelligence (AI) in which machines ‘learn’ from data or from their own ‘experiences’” (2019, p. 8). In such an approach, a statistical model is “trained,” or adjusted, in relation to the relative success or accuracy of its output, with the aim of improving the performance of an algorithm at a given task over time. This enables visual processing tasks, such as the generation, classification, or labeling of images, to be performed by highly automated computer systems. Mitchell situates machine learning within artificial intelligence, which she describes as “a branch of computer science that studies the properties of intelligence by synthesizing intelligence” (p. 7). While rather recursive, this definition gives us a starting point to work from and illustrates the topic’s potential for ambiguity.
It’s historically significant that the term artificial intelligence (AI) was coined with the explicit intention of establishing a direction distinct from cybernetics research, from which both it and machine learning emerged. Cybernetics draws inspiration from biological and environmental feedback systems in the design of technical systems: by modeling human cognition or biological vision in computational systems, for example, parallels between various kinds of processes could drive new forms of technological development. Remnants of this history are invoked by the use of artificial intelligence as a marketing tool and in cultural metaphors that draw cybernetic comparisons between human brains and programmed machines. Such instances frequently rely on misconceptions regarding the nature of intelligence and technical attempts at replicating it computationally, and they can distract from what is truly at stake in AI.
This area of research has developed rapidly in the past ten years, with artists often keen to be early adopters of emerging technical affordances as they are developed. As a result, machine learning has been implemented in an increasingly broad range of visual applications beyond its prior limits within computer science research. From the direct generation of images to determining the visibility of networked content, machine learning has become pervasive in many tools and forms of visual media that are accessible to non-experts. Algorithmic processes have thereby come to exert a far-reaching influence on visual media, as well as on the cultural imaginaries surrounding this tendency.
In combination with the growing technical potential of machine learning, the conceptual associations attached to artificial intelligence and the performance of algorithms have proven especially relevant in art contexts. In recent years, this has resulted in a growing number of exhibitions, funding opportunities, research projects, and even the establishment of labs dedicated to the topic of AI art (Zylinska, 2020). The wide-ranging technical, theoretical, and societal implications of machine learning and artificial intelligence have drawn the interest of high-profile artists, who have engaged with the technology both thematically and from a technical standpoint. Theorists have also weighed in on this topic, providing philosophical insights and often working closely with practitioners across several fields. It is a great challenge to select which examples to focus on, as those discussed here are just a few of the growing number of practitioners and thinkers who have made significant artistic and theoretical contributions relevant to this topic.
There are various ways of incorporating machine learning in visual art, among which image generator systems have become especially widespread. Generative adversarial networks (GANs), introduced in 2014, became a popular machine learning approach among artists for the relatively high-quality, photographic images they generate. Since early 2022, GANs have been largely overshadowed by diffusion models. Popular examples of this kind of image generator system include DALL-E 2, Midjourney, and Stable Diffusion, which have recently received widespread media coverage for their ability to create images with photographic aesthetics from input text prompts. These shifting tendencies have been encouraged by a number of factors, including new technical developments, the growing accessibility of the technology and of knowledge of how to use it, as well as the increasing cultural visibility of artificial intelligence.
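Part of this accessibility lies in how little is asked of the user. The sketch below suggests what invoking such a system can look like in practice; it assumes the open-source diffusers library and a publicly released Stable Diffusion checkpoint, and the model identifier, prompt, and GPU requirement are illustrative rather than prescriptive.

```python
# Illustrative sketch of text-to-image generation with a diffusion model,
# assuming the Hugging Face `diffusers` library and a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative model identifier
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# A short text prompt is the only input; the image is synthesized
# from patterns the model has learned from its training data.
image = pipe("a photograph of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```

The brevity of such an interaction is itself telling: the dataset, the training procedure, and the criteria embedded in the model remain entirely out of view at the moment of use.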
In the context of image production, machine learning enables images to be informed by the application of statistical models, so that new visual content may be generated based on the analysis of a dataset. Recent developments in machine learning research for graphical applications have afforded new ways of creating and analyzing images. A statistical model is said to be “trained” on a dataset, from which the system “learns” or extracts patterns that in turn inform adjustments to the model with the aim of improving its performance. There are various ways of doing this, but one common practice is to compose machine learning datasets of examples that exhibit the attributes to be learned.
Beyond directly shaping visual content, machine learning is frequently integrated into other software, where its influence is often obscure to users. Digital cameras and many apps may automatically detect faces, gestures, or facial expressions, adjust lighting and focus, or add a filter or lens in real time as photographs are taken. Machine learning has also become ubiquitous in networked contexts, where it is unclear what criteria are entailed in determining the visibility of online content. The opacity of the algorithmic processes behind many forms of visual media contributes to ambiguity in mainstream understandings of what algorithms are, and what role they play in digital media.
Training a model relies on providing a machine learning system with suitable data, usually meaning an ample amount of data that is of high quality and fits a specific scope. What high quality means, in this case, is quite open to interpretation, and the quantity and kind of data required to successfully train a machine learning model are also variable. It is generally agreed, however, that the more data there is, and the more representative it is of the phenomenon under consideration, the greater the likelihood of achieving the intended results. But as has often been pointed out, the express intentions behind a machine learning application may have little to do with the actual outcomes that result from such unpredictable processes.
This touches on the problem of measuring the accuracy of visual media, which is especially fraught in the case of machine learning systems, which persistently prove prone to embedded bias and error and are open to variation based on the parameters, data, methods, and contexts involved. On the one hand, machine learning systems have demonstrated a capacity to achieve statistically unpredictable results. On the other, it has become clear in recent years that one thing that is not unpredictable in machine learning is its tendency towards highly problematic instances of built-in bias, which originate in the human perspectives that have informed the design, application, and evaluation of machine learning systems.
Although algorithms are, in themselves, clearly defined and deterministic, the results of applying them to the production of images entail various potential openings for ambiguity. There may be substantial differences between the way that the visual qualities of an image, or the processes involved in its production, are perceived and understood by humans and the way the same image may be interpreted by a computer. Variations between the particularities of one system and another may also radically impact the results of automating a visual processing task. The outputs of machine learning systems are far from self-evident, often requiring specific knowledge or external information in order to be “read” or understood on more than a superficial level. Aspects of machine learning, such as the fact that different results may be achieved each time a given operation is performed, make its results difficult to predict, meaning that these systems and their outputs can be opaque to human understanding. While this may, on the one hand, allow them to deliver surprising results, it again raises the issue of the limits of human perspectives on algorithmic media at the level of design, implementation, and assessment of their outputs.
Creating images in accordance with clearly defined constraints and procedures serves as a starting point from which to discuss relationships between recent developments in machine learning and aspects of earlier forms of image-making technologies. Images may be transcribed as — or enacted from — written instructions outlining the specifics of how an image is to be produced. Algorithmic sets of instructions may be used to produce several different iterations of the same image that are constrained by pre-defined rules. This enables a degree of unity to be maintained between several instantiations, as well as affording the transcription of instructions for the creation of an image in written form, like a program for its execution. These aspects raise several challenges to the development of image ontologies and the evaluation of art, issues that become especially visible in the highly automated and networked contexts in which algorithmic media are ubiquitous today.
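As a simple illustration of this kind of rule-bound iteration, the following sketch enacts a single written instruction several times. It is a hypothetical example in Python using the Pillow imaging library, not a reconstruction of any work discussed here; the instruction, canvas size, and file names are invented for the purpose of illustration.

```python
# Hypothetical sketch: one set of written instructions, several iterations.
# Instruction: draw ten straight lines, each connecting a random point on the
# left edge to a random point on the right edge of a 200 x 200 white canvas.
import random
from PIL import Image, ImageDraw

def enact_instruction(seed):
    random.seed(seed)
    canvas = Image.new("RGB", (200, 200), "white")
    draw = ImageDraw.Draw(canvas)
    for _ in range(10):
        start = (0, random.randint(0, 199))    # a point on the left edge
        end = (199, random.randint(0, 199))    # a point on the right edge
        draw.line([start, end], fill="black")
    return canvas

# Three instantiations of the "same" image: unified by the rule, differing in detail.
for i in range(3):
    enact_instruction(seed=i).save(f"iteration_{i}.png")
```

Each output obeys the same constraints, and the instruction itself can circulate in written form independently of any one of its executions.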
The increasing formulation of images in terms of algorithmic processes draws attention to the act of imaging as an event, a mode of committing to memory, of making a spatial experience perceptible and communicable. In this light, it becomes understandable why one might photograph something without any intention of ever looking at that image again. The act of capturing the moment, of participating in its orchestration and archiving, outweighs the image as a visual record. More than visually documenting objects, ideas, and contexts, producing images allows us to form a particular relation to the visual. The conditions surrounding an image’s performance and articulation correspondingly inform its reception, making it a site of mediation between human perception, meaning-making, and agency, and demanding new forms of technical literacy to interpret the products of highly automated algorithmic systems.
The widespread adoption of machine learning in a variety of different fields emerges from a long history in which conflicting perspectives on visual technologies have shaped thinking about images. In the history of photography, for example, the application of technical processes and apparatuses has demonstrated the capacity to produce highly accurate visual representations while also being open to the manipulation of appearances. Mechanical automation, and later the increasingly computational nature of visual media, enabled production processes to become increasingly programmatic, with implications for evaluating the products of highly automated machines not only in terms of their own characteristics, but also in terms of other factors such as their position relative to human perception, interpretation, and agency. Data-intensive approaches to image-making such as machine learning in many ways expand on, rather than distinctly depart from, these narratives, and thus highlight existing ambiguities in image studies concerning not only the defining qualities of images but also how their meaning is constructed.
The recent popularity of algorithmic methods has been characterized by William Uricchio (2011) as an algorithmic turn that departs in certain respects from prior conceptualizations of images and visual media. Rather than irrevocably breaking from them, I argue that this shift expands on tendencies present in image production, even in much older, analog practices. While machine learning offers new ways of creating and interpreting images, it does not make a complete break from earlier technical and visual paradigms of image-making. Instead, it often draws on elements derived from a variety of different forms of visual media. For example, images generated using machine learning often have what may be considered a photographic appearance, both in the sense of applying a particular realistic visual aesthetic as well as entailing processual connotations inherited from analog photography. This two-fold character draws on conceptions of the image as at once a visual representation of the world, as in the tradition of photography, and the product of a database (Hoelzl and Marie, 2017), which in machine learning often entails being directly derived from a database composed of digital photographs.
Recent technical developments in visual media have given rise to discussions of the “changing ontology of the image” (Lund, 2021) in the sense that the modalities of algorithmic processes often subvert traditional expectations of what defines an image. As Harun Farocki (2004) points out, highly automated forms of images may be operative in the sense that they “do not represent an object, but rather are part of an operation” (p. 17). In this way, the use of algorithms in image-making emphasizes processual qualities and the interpretation of data over conceptions of images as materially fixed, visual likenesses of real-world objects.
The generation of images based on learned patterns in datasets disrupts the apparently direct visual and referential connection between the image and the world that is found in other forms of image-making. A generated digital image’s pixel values may be determined by statistical patterns in data that have little to do with how it is interpreted by human viewers. The incorporation of machine learning into the production of images thereby adds to an already complex area of discourse surrounding the technologically mediated nature of image-making, being at once highly technical and based on data while open to ambiguity and error. Visual technologies tend to be viewed as offering a level of scientific accuracy in the act of representation, creating referential connections between the image and the things or ideas it is intended to point to or stand in for. This is especially noticeable in the history of photography, where technical apparatus and process play a significant role in mediating the referentiality of the images produced in this fashion. Because of this, there is a tendency to assume that images that appear photographic, or that employ technical processes like those of photography, imply stable relationships between visual representations and their referents.
The growing technical promise, ubiquity, and cultural implications of algorithmic approaches to image production draw together multiple, overlapping value systems and traditions surrounding images, especially those at the convergence of art and technology. But while this may appear to be a new phenomenon, structuring the production of images according to defined sets of rules, calculations, data, or constraints is not exclusive to the digital forms that are currently predominant. Many aspects of image-making techniques and technologies far older than machine learning, or even digital technology, have become embedded in the invisible infrastructure behind algorithmic visual media. This includes the value systems that color the experience, interpretation, and assessment of images, systems shaped by earlier visual traditions such as those built up around photography, painting, and drawing.
In “The Finiteness of Algorithms,” Friedrich Kittler (2007) traces the origin of the word “algorithm” back to the corruption of the name Muḥammad ibn Mūsā al-Khwārizmī through its successive interpretation, articulation, and reinterpretation. In so doing, Kittler points not only to the long, non-linear history of algorithms, but also to the fact that this history mirrors the very qualities of algorithms themselves: culturally programmed, reiterated, and decoded ad libitum.
We have algorithms and processes on one side, and artworks on the other side, and between them things like the camera obscura, the Turing machine, computers, and more basic things: palettes, painting tools, and musical instruments. Things within which knowledge, often thousands of years of knowledge, have accumulated, knowledge that is, however, different from that which is in artworks; instruments and machines collect knowledge in order to create works and processes. (Kittler, 2007)
Images, algorithms, and art transcend any individual material instantiation in ways that defy attempts at pinning them down. They also share a complex, intertwined history that engages a number of largely unresolved issues across several intersecting fields. In terms of methods and apparatus, as well as the ideas associated with them, early precursor technologies to those that are currently prevalent have influenced thinking about visual media in formative ways. Prior to the development of the computational methods that are now employed in visual media, related ideas and modalities have either influenced the present context or offer insight into it. Image-making has had a longstanding entanglement with algorithms, albeit often in forms that are unfamiliar yet nevertheless relevant today.
The history of algorithmic visual media is much longer than that of digital computers. Analog algorithmic processes have been employed in image-making and in art for a very long time, and while they are especially prevalent now, accelerated with the aid of automation, digital computation, and machine learning, algorithmic media are distinct from any particular technology through which they may be articulated. Examples from this history offer insight into several aspects of current forms of algorithmic media that are often highly opaque to intuitive understanding. Drawing connections to distant precursors of today’s use of algorithms in visual contexts also allows us to delve into some of the historical value judgments that have accompanied imaging technologies and that persistently haunt discourse on visual media.
Examining the deep history behind the use of machine learning in visual media, the following chapters consider the accumulated knowledge that has become embedded in the technical production of images. The view of images as the product of data holds resounding implications for many different aspects of visual media. Not only has this tendency reshaped the way we make sense of the world around us, directly in terms of human visual perception; it also bears on the way we make sense of what is perceived, as advanced visual technologies change how we think about what we see. Examining these themes through historical examples, the following chapters build toward an understanding of how ideas that have accumulated around visual technologies contribute to current narratives surrounding machine learning.
References
Bianco, Jamie “Skye.” “Algorithm.” In Posthuman Glossary, edited by Rosi Braidotti and Maria Hlavajova, 24. London: Bloomsbury, 2018.
Farocki, Harun. “Phantom Images.” Public 29 (2004): 12–22.
Hoelzl, Ingrid, and Rémi Marie. “From Softimage to Postimage.” Leonardo 50, no. 1 (2017): 72–73.
Kittler, Friedrich. “The Finiteness of Algorithms.” Presented at the transmediale festival, March 2, 2007.
Lund, Jacob. “Questionnaire on the Changing Ontology of the Image.” The Nordic Journal of Aesthetics 30 (July 2021): 6–7.
Mitchell, Melanie. Artificial Intelligence: A Guide for Thinking Humans. London: Penguin Books, 2019.
Uricchio, William. “The Algorithmic Turn: Photosynth, Augmented Reality and the State of the Image.” Visual Studies 26, no. 1 (March 2011): 25–35.
Zylinska, Joanna. AI Art. London: Open Humanities Press, 2020.