Best Practices for Working Creatively with Personal Data

Participation, Authorship, and Dissemination

In the previous section, we discussed consent and recommended involving data subjects/participants in artistic projects as much as possible so that they are informed and able to make ongoing choices with regard to their data. Their involvement raises the question of whether the data subjects in the datasets are in fact collaborators or co-authors of the resulting artwork. If they have been heavily involved in the project by giving ongoing feedback and making aesthetic and conceptual suggestions, should they be considered collaborators in the artwork and credited as such? If so, what kind of acknowledgement is appropriate while respecting their privacy? In some cases, acknowledgement may compromise the confidentiality of the data subject (Cox et al. 2014, 19). The participant may, however, be very comfortable with their identity being linked to the data in an artistic context, especially if the work engages with themes they are passionate about, and may therefore waive their privacy rights under such conditions.

If secondary data is used in the creation of an artwork, should the holding institution be acknowledged? Are data accreditation/citation requirements made explicit in the terms of use? How should the ways in which the datasets were obtained be acknowledged in the work? The different kinds of data permissions and citation regulations were explained in the Current Data Protection Guidelines section, but here we pause to consider what the artwork needs conceptually and how and where citation and accreditation are appropriate. Should data accreditation be displayed next to the artwork when it is exhibited, or is it better to reserve this information for catalogues and exhibition statements? Could there be a webpage for the project that includes such credits, with an acknowledgement accessible via a QR code? Displaying too much textual information about the artwork as part of the exhibit may confuse the experience and intention of the work. For example, an artist who creates an immersive audio-visual installation is unlikely to want it interrupted by a large, lit text panel. Furthermore, when artworks are curated into exhibitions where the artist is not present or not in charge of the visual identity of the show, the amount of didactic information is harder for the artist to control.


The nature of artworks is that they re-present, make visible, and reframe their subject matter (in this instance, data) for a public audience in galleries, museums, publications, and online platforms. Often, the experience of an artwork occurs in a collective and public space. This level of visibility is different from that of a scientific paper, which is typically read in a more private setting such as an office or library. A scientific paper also does not usually aim to elicit intense, emotive, embodied encounters with data as a way to explore a personal or socio-political question, which is often the goal of artists.

It is now more or less standard for visitors to exhibitions to use their smartphones to take photographs of artworks and share them on social media (with comments the artist is not able to control), making the works even more visible. Not only does the sharing of images of the exhibit risk exposing the data subject (if they wanted to remain anonymous) to the facial recognition algorithms that social media platforms use to tag people in images, but it also means images can become part of feeds with which the data subject may not be comfortable.

Your Data Body

One of the KTVR VR projects, Your Data Body, works with the data of others. The project has evolved (and keeps evolving), continually raising practical and ethical challenges, including those around authorship. The work was originally made using a combination of open-source and donated datasets, with the goal of focusing on issues of data privacy. In VR, the user is able to pick up, move, resize, recolour, and duplicate the scanned body parts, or stack them to make Frankenstein-like figures. Audio files attached to each scan play when the user holds and manipulates each dataset. Anonymized open-source datasets are accompanied by an automated voice recounting the study data published alongside the dataset, whereas datasets “donated with explicit consent” have a “personal story” based on the original subject of the data. Early in the project, a request for “donations” of scan data was sent out via email to potential donors by means of university listservs. Several people responded to the call but preferred not to make a recording and wished only to donate their data anonymously. Those who did provide a recording can choose to be listed as contributors in the credits of the project.

One participant who responded to the call with a donation of over ten datasets is Canadian artist Liz Ingram. Ingram has created several artworks with her own medical scans, which she has been acquiring since 2014 as part of her ongoing oncological care. In 2019, for instance, Ingram worked with her husband and collaborator Bernd Hildebrandt to create Light Touch, a large silk fabric tent printed with images of her brain scan held tenderly in both her own and her husband’s hands. Inside the tent, on the floor, is a poem written by Hildebrandt in mirror vinyl. Marilène Oliver, who is leading the creation of Your Data Body, struggled to work artistically with Ingram’s scans knowing that Ingram and Hildebrandt had already made strong aesthetic choices about how Ingram’s scans are presented in their artworks (typically very fragile, transparent, and intermingled with images of flowing water and poetic text). Both Ingram and Hildebrandt have been invited to become collaborators and will work with the creators of Your Data Body to decide how Ingram’s scans are rendered in the project. Inviting Ingram and Hildebrandt’s collaboration now requires the consideration of new questions: How will Ingram and Hildebrandt be credited? Will they be full authors of the work, even if they only work with Ingram’s own scans and not other parts of the project (which will include several other datasets)? Will Ingram and Hildebrandt partly “own” Your Data Body? What if Your Data Body is sold at a future date? What percentage of any sales would they receive? To avoid misunderstandings and any future conflict with regard to artistic control and ownership, it will be wise in this instance to consult CARFAC’s sample Artists’ Collaboration Agreement and agree on the terms of the collaboration (Sanderson and Hier 2006).

The way in which different open-access datasets are used in Your Data Body has also evolved over the duration of the project. Not all open-access datasets come with the same amount of information; accompanying information ranges from just the dataset file name to detailed scientific papers describing the demographic information and pathologies of the data subjects, the scanner used to acquire the scans, what was discovered from the data, and so on.

One of the open-access datasets used in Your Data Body is the Visible Human Project (VHP), a widely used, open-access scan dataset of Joseph Paul Jernigan (also discussed in Provenance, Access, and Licencing). There are several documentaries, books, and news articles about the VHP that detail Jernigan’s personal life before he was executed and the process of his corpse being digitized. Additionally, the creators of the VHP, the American National Library of Medicine, now have a whole webpage about the project that provides links to four VHP conference proceedings and several academic papers. One of the original goals of Your Data Body was to have a conversational AI avatar that would advise or guide the viewer through the VR experience. Through group discussions about what the AI avatar should know and what form it should take (e.g., a CGI humanoid or scanned body part), it was suggested that the AI avatar could be the Visible Human. We are now attempting to train a conversational AI model on information available about the VHP, in the hope that viewers of the Your Data Body artwork will be able to “chat” with the Visible Human scan dataset about its history and what it has been used for. To train the model, it was necessary to create a dataset of fifty conversations between the Visible Human and future viewers. We started by making a character profile listing what the Visible Human knows about their data, feels about their data, and wonders about their data. In some ways, this is an attempt to give some authorship to the Visible Human and Joseph Paul Jernigan. As it is impossible to get posthumous consent from Jernigan, is this more appropriation than attribution? Or is it fact-based fiction? Working with data of the deceased is an important ethical grey area to trouble and expose, especially as large corporations such as Microsoft are already working to commercialize the creation of bots from specific people (Abramson and Johnson 2020).

Marilène Oliver, Screen capture of Your Data Body, VR artwork (in progress), 2022.
Image courtesy of the artist.

Participation, Authorship, and Dissemination Discussion Questions

• Are the data subjects considered participants, contributors, collaborators, or authors in the artwork? How should data subjects be acknowledged in a specific work? 

• Is the institution that created and stores the dataset (the custodial institution) a participant? Should the institution be acknowledged as a participant? Does acknowledgement of the institution compromise the identity of the data subjects represented in the dataset? Has the institution been contacted to determine whether they want acknowledgement in any research project created with their data? Does any user agreement or terms of use stipulate whether acknowledgement is required? 

• How is acknowledgement done? Does it risk disclosing the identity of the data subject? 

• Do data subjects or the custodial institution hold any rights to the resulting artwork? Should data subjects or the institution have access to the resulting work (perhaps as digital or physical copies of the whole or part of the resulting work)?

• How are the works shared online, if they are shared online? Will visitors to exhibits be able to take photos of the works and share them as they wish on their social media feeds or elsewhere?

• If images of artworks are posted online, how are comments managed? Is it appropriate to turn comments off or to actively manage comments, or is it more important to allow a free discussion to happen? If comments are open, how are problematic comments dealt with, particularly if data subjects are the target of the comments and if those comments are disrespectful to them? Is it appropriate to delete comments? Is deletion transparent, with the necessary steps in the decision-making process made available?

• How are alternate readings of the work acknowledged, discussed, and, if necessary, refuted? Is there a risk that images of the data subject in the artwork will be documented and shared out of context?