Lab Notebook 6

 

Google Sheets Copy of Visual-Style-Shot-Data

Introduction

In this post, I provide a critical commentary on the visual-style-shot-data dataset accompanying Taylor Arnold, Lauren Tilton, and Annie Berke’s “Visual Style in Two Network Era Sitcoms”. The first half of the post is an observational outline of the dataset’s metadata and content. The second half is an appraisal and critique of the distant viewing methodology (of which the visual-style-shot-data dataset is a part), and of what it may offer future film theorists and researchers.

Observation and Description

The visual-style-shot-data dataset is part of Taylor Arnold and Lauren Tilton’s broader Distant Viewing Lab project, which seeks to provide an empirical and quantifiable basis for film theory through the use of facial recognition technology. The project isolates frame and shot composition by tracking faces and their distance within the frame. The dataset comprises 21 metadata columns, ranging from the series being analyzed, to the episode and season number, to the specific frames associated with a shot. The rationale for the project, as mentioned above, is to give film theory a computational, facial-recognition-based method for analyzing large batches of “semantic” shot data. Semantic shot data includes, for example, the identification of a face as belonging to a given actor within a frame, or the location of the break dividing two shots. The distant viewing tool is, hypothetically, capable of compiling all of this data to provide researchers with at-a-glance shot information for the analysis of a large corpus of film and television.

The toolset behind the dataset works by being “fed” video files for films and television programs. It then scrubs through each video file frame by frame. Using facial recognition software, it determines which characters are on screen and how many characters are present in a given frame. Based on the position of the characters’ faces within the frame, the software then determines whether the shot is an over-the-shoulder, close-up, two-shot, long shot, and so on. In this way, Arnold and Tilton are able to define algorithmically the constitutive blocking patterns of film and television programs. Film theorists may then use the distant viewing methodology to theorize the visual style of 1960s television programs and their aesthetic relationship to contemporary television sitcoms.
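To make the idea of classifying a shot from face positions concrete, here is a minimal Python sketch. It is not Arnold and Tilton's actual code: the Face class, the classify_shot function, the shot labels, and the size thresholds are all hypothetical, chosen only to illustrate how the number and relative size of detected faces could map onto a shot type.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Face:
    """A detected face as a bounding box in pixels (hypothetical format)."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def height(self) -> int:
        return self.bottom - self.top


def classify_shot(faces: List[Face], frame_height: int) -> str:
    """Label a frame by the count and relative size of its detected faces.

    The thresholds below are illustrative assumptions, not values
    taken from the Distant Viewing project.
    """
    if not faces:
        return "no-face"
    # Height of the largest face as a fraction of the frame height.
    largest = max(face.height for face in faces) / frame_height
    if len(faces) == 1:
        if largest >= 0.40:
            return "close-up"
        if largest >= 0.15:
            return "medium"
        return "long"
    if len(faces) == 2:
        return "two-shot" if largest >= 0.15 else "long"
    return "group"
```

Under these assumed thresholds, a single face filling about half of a 480-pixel-tall frame would be labelled a close-up, while three or more faces of any size would be labelled a group shot. A real system would also need to handle missed or false detections, which this sketch ignores.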

The dataset is structured around two 1960s sitcoms: Bewitched and I Dream of Jeannie. Both programs premiered in the mid-60s (1964 and 1965, respectively) and were visually and thematically similar. The distant viewing methodology can plot the distribution of shot types over the course of an episode and, for example, visualize the increasing complexity of plots within I Dream of Jeannie by charting the movement within episodes from close-ups to wide, or group, shots. In this example, a standard episode of I Dream of Jeannie moves chronologically from a single character or object that anchors and orients the plot to an increasingly complex distribution of faces and characters across the frame. In other words, the frame is occupied by a single figure at the beginning of an episode and, as the episode progresses, becomes enmeshed in the lives of several characters. This is ultimately represented by the increasing number of characters within a frame.
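The trend described above, more faces per frame as an episode progresses, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name mean_faces_by_segment, the (start time, face count) input format, and the toy numbers are all invented here and do not correspond to the dataset's actual columns or values.

```python
from typing import List, Optional, Tuple


def mean_faces_by_segment(
    shots: List[Tuple[float, int]],
    episode_seconds: float,
    n_segments: int = 4,
) -> List[Optional[float]]:
    """Average the number of faces per shot within equal slices of an episode.

    shots: (start time in seconds, number of faces detected) for each shot.
    Returns one mean per slice, or None for a slice that contains no shots.
    """
    sums = [0] * n_segments
    counts = [0] * n_segments
    for start, faces in shots:
        # Map the shot's start time to a slice index, clamping the final edge.
        idx = min(int(start / episode_seconds * n_segments), n_segments - 1)
        sums[idx] += faces
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Invented toy data for a 1500-second episode, for illustration only.
toy_shots = [(0, 1), (100, 1), (400, 2), (800, 3), (1100, 3), (1400, 4)]
print(mean_faces_by_segment(toy_shots, episode_seconds=1500, n_segments=3))
```

If the per-slice averages rise from the first slice to the last, that would be consistent with the movement from singly occupied frames to group shots that the authors describe. Splitting the episode into equal time slices is one simple design choice; slicing by shot count or by act breaks would be equally defensible.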

Critique

While the dataset is a fascinating attempt at grounding conceptual and theoretical developments in an empirical, computational methodology, I am skeptical of the tool’s utility for contemporary film theory and, specifically, for the theorization of non-traditional film and television programs. The modern television landscape is saturated with so-called prestige TV. These programs have broken with the “sitcom” visual style and format, and tell stories in a manner that resembles film. Directors have much more freedom in designing the look and feel of a program, and are able to take creative risks with aesthetics. Arguably, the same does not hold for television sitcoms of the 1960s. Most programs were filmed on a set with a multi-camera rig, in a manner that resembled stage productions. I am curious to see how the distant viewing methodology translates to the analysis of television series such as Twin Peaks, The Wire, and Mad Men, among a myriad of other programs. These serials are visually disparate and, while they employ the traditional array of camera framing and blocking (over-the-shoulder, close-up, two-shot, etc.), they also prioritize an experimental use of mise-en-scène and lighting.

Further, the two examples on display in Arnold and Tilton’s case study are serial sitcoms with predominantly, if not exclusively, white casts. Yet Arnold and Tilton do not address the critical racial limitations and discriminations immanent to facial recognition software. In “Racial Discrimination in Face Recognition Technology,” Alex Najibi details facial recognition software’s inability to accurately capture non-white faces and recounts significant discrepancies in error rates, most noticeably between “darker-skinned females” and “lighter-skinned males.” The supplementary materials provided by Arnold and Tilton make no reference to these existing issues with facial recognition software. This does not necessarily spell doom for the critical interdisciplinary method Arnold and Tilton are attempting to produce. Rather, it opens up a legitimate pathway of additional testing to determine the validity and utility of facial recognition software for computational film studies methodologies such as distant viewing.

Alongside the above potential critiques, Arnold, Tilton, and other distant viewing adherents should address the troubled racial history of facial recognition software, and its use as a tool for law enforcement. As Najibi writes,

“Black people are overrepresented in mug shot data, which face recognition uses to make predictions. The Black presence in such systems creates a feed-forward loop whereby racist policing strategies lead to disproportionate arrests of Black people, who are then subject to further surveillance.”

We should read the distant viewing methodology, and the problematic history of facial recognition software and surveillance, alongside Jennifer Guiliano and Carolyn Heitman’s “Difficult Heritage and the Complexities of Indigenous Data.” While the distant viewing methodology is a laudable advancement in the inter-relationship between cultural analytics, film studies, and digital humanities, it also raises a series of questions about the ethical responsibility of digitization and large-scale batch analyses of visual corpora. I will here quote Guiliano and Heitman at length:

For communities who have been traumatized through colonization, the desire of digital humanists to use their ancestor’s histories as data to be experimented upon recall a past where Natives were casualties to be acted upon rather than sovereign agents of their own lives. The ethical, and we argue the only path forward is through slow, thoughtful, inclusive, and collaborative practices that recognize and privilege indigenous-centric research practices and ways of knowing. (25).

The distant viewing methodology handles the incredibly sensitive data of a person’s image. The image is marked, denoted, and placed into circulation with other images and other faces. The face becomes the foundation for a broad-level analysis of form and yet, in this way, it disappears from sight. Distant Viewing practitioners must be sensitive to these potential ethical problems, to the sensitivity of the data they hold, and to the problematic history of this method of data collection. If left unaddressed, Distant Viewing has the potential to recapitulate racist modes of seeing and surveillance practices. The neutrality of the methodology must be grounded in the actuality of the faces and places on screen.