
Installation view at MUŻA - National Museum of Art, Valletta, MT

Woman, holding
Steel, epoxy, printed image,
paraffin hand-made candles,
personal hard drives with image data,
tablets, thread, fabric.
190cm x 260cm x 60cm
shown at MUŻA National Museum of Art, Valletta (MT)
commissioned by Unfinished Art Space, as part of
Strangers in a Strange Land
(2020)
The work addresses algorithmic bias in commercially available image description and text-to-image services. Presented in an ambiguous form resembling a shrine, memorial, or futuristic display, it examines the biases embedded in commercial facial analysis and image description services trained on datasets such as ImageNet.
To create this work, multiple images of the artist were processed through various commercial machine learning image description services. These services displayed little bias when describing non-human subjects or male-presenting figures, avoiding evaluative descriptors like "pretty," "good looking," or "sexy" for nature, cityscapes, or clothed men. When describing woman-presenting subjects, they consistently employed evaluative and sexualising language. Shirtless male images were described in terms of demeanour or presence: one image of a shirtless man in a club advertisement was labelled "serious" and "fine-looking," while equivalent or more neutral images of woman-presenting subjects were given sexualised descriptions. This asymmetry reflects a well-documented pattern in which AI systems focus on physical appearance when the subject presents as a woman, and on profession or action when the subject presents as a man.
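A minimal sketch of how such an audit could be set up is shown below. The captioning endpoint and the list of evaluative descriptors are assumptions made for illustration only; the actual commercial services used for the work are not reproduced here.

```python
from pathlib import Path

# Hypothetical list of evaluative descriptors to look for in returned captions.
EVALUATIVE_TERMS = {"pretty", "good looking", "good-looking", "sexy", "fine-looking"}


def describe_image(image_path: Path) -> str:
    """Placeholder for a commercial image-description API call.

    A real implementation would upload the image and return the caption
    string produced by the service; only the interface is sketched here.
    """
    raise NotImplementedError("swap in the vendor SDK or HTTP call")


def count_evaluative_terms(caption: str) -> int:
    """Count how many evaluative descriptors appear in one caption."""
    lowered = caption.lower()
    return sum(term in lowered for term in EVALUATIVE_TERMS)


def audit_folder(folder: str) -> dict:
    """Caption every image in a folder and tally evaluative descriptors per image."""
    results = {}
    for image_path in sorted(Path(folder).glob("*.jpg")):
        caption = describe_image(image_path)
        results[image_path.name] = count_evaluative_terms(caption)
    return results
```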





Installation elements: personal hard drives forming the core of a candle
The work's title, Woman, holding, derives from a recurring observation during this research. When a woman-presenting figure was photographed holding an everyday object such as a jacket or a bottle of water, commercial image recognition systems would frequently describe her as holding or carrying a child, or frame her within a caregiving role. The object became irrelevant to the algorithm; what the system consistently identified was a carrier.


In a converse action, the text output from the image description services was fed back into commercial text-to-image generators, with evaluative descriptors removed by the artist. A description as sparse as "woman in front of a mirror" consistently generated semi-abstract representations echoing posed selfies in underwear, beach photographs, or mirror selfies. These images share a visual language of objectification, produced automatically and without any such explicit prompt.
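The descriptor-stripping step can be sketched in the same spirit. Again, the term list and the text-to-image call are illustrative assumptions rather than the pipeline actually used for the work.

```python
import re

# Illustrative list of evaluative descriptors removed before generation.
EVALUATIVE_TERMS = ["pretty", "good looking", "good-looking", "sexy", "fine-looking"]


def strip_evaluative_terms(caption: str) -> str:
    """Remove evaluative descriptors so only the neutral description remains."""
    cleaned = caption
    for term in EVALUATIVE_TERMS:
        cleaned = re.sub(rf"\b{re.escape(term)}\b", "", cleaned, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def generate_image(prompt: str) -> bytes:
    """Placeholder for a commercial text-to-image API call."""
    raise NotImplementedError("swap in the vendor SDK or HTTP call")


# Even a prompt reduced to its neutral core reproduced an objectifying
# visual language in the generated output.
prompt = strip_evaluative_terms("pretty woman in front of a mirror")
assert prompt == "woman in front of a mirror"
```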



These outcomes are structural. Datasets like ImageNet were built using crowdsourced microwork platforms, distributing labelling tasks across tens of thousands of workers around the globe who had no knowledge of how their annotations would be used (this is echoed in elements of the installation, such as hand-made epoxied keys in the shape of diamonds and dice). The platforms presented this labour as an impersonal, objective process, obscuring the human judgement embedded at every stage. Workers were compelled to align their labels with majority answers in order to retain assignments, a mechanism that enforced conformity and embedded the most prevalent cultural assumptions directly into the dataset. The categories they were asked to apply were designed by academic and corporate researchers, the majority of whom were male and drawn from a narrow band of elite institutions, whose assumptions about gender, bodies, and social roles were built into the taxonomy before a single image was labelled. Descriptive algorithms trained on such data go on to shape perceptions of who looks like a potential criminal and who belongs in professions like medicine, law, and science. The algorithm inherits this structure and presents it as neutral.
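The conformity mechanism can be illustrated with a small sketch, assuming a simple majority vote rather than reconstructing any platform's actual pipeline: many individual judgements are reduced to a single "ground truth" label, and minority readings are discarded.

```python
from collections import Counter


def majority_label(worker_labels):
    """Return the most common label for one image; ties resolve arbitrarily."""
    return Counter(worker_labels).most_common(1)[0][0]


# Three annotators read the same image differently, but only the majority
# reading survives into the dataset; dissenting workers risk losing assignments.
labels = ["woman, holding child", "woman, holding jacket", "woman, holding child"]
print(majority_label(labels))  # -> "woman, holding child"
```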


