Throughout my career I’ve felt a great deal of ambivalence from those who have witnessed my dual position within computing and photography. For many it is difficult to see that the two are intrinsically interconnected. Our natural need to differentiate (discriminate?), identify, and taxonomize often prevents us from seeing the links between the analog and the digital image. Indeed, I’ve never personally felt that the transition from analog to digital was all that significant aside from the material and economic shift. The core of photography is the capture of data. That was true in 1826 when Niépce captured the first chemical image, and it is still true of digital images today.
Our class has moved through this history quickly, but in moving quickly we’ve experienced first-hand the collapse of traditional distinctions (like analog/digital or classical/computational). For this I am grateful. Some universal truths about photography have emerged from this holistic approach.
The first truth is truth itself; more specifically, the veracity of the photographic image has been a predominant theme from the beginning of photography (I think of Rejlander as an iconic example) to today (I think of deepfakes and, specifically, the talking-head video and paper).
The second truth is the question of agency. Who is the artist when image making is mechanized? Early in photography’s history, in his review of the Salon of 1859, Baudelaire, a vociferous critic of photography and modernity, said this:
Each day art further diminishes its self-respect by bowing down before external reality [i.e. photography’s influence]; each day the painter becomes more and more given to painting not what he dreams but what he sees.
(https://www.csus.edu/indiv/o/obriene/ar ... graphy.htm)
He hated photography and what he saw as the demise of the artist’s subjective role (the dreams) in art creation.
The last truth I’ll mention is the collapse between art and science within photography, summarized in George Legrady’s essay for Drone: Reflections on Computational Photography (https://www.mat.ucsb.edu/~g.legrady/aca ... ion_gl.pdf). Throughout the readings I’ve paid particular attention to the scientific papers, as my current interest is participating in the field of computational photography as both a researcher and an artist. The papers I read were:

- Few-Shot Adversarial Learning of Realistic Neural Talking Head Models (https://arxiv.org/abs/1905.08233v1), along with the authors’ statement beneath the YouTube demonstration video (https://www.youtube.com/watch?v=p1b5aiTrGzY)
- Single-View View Synthesis with Multiplane Images (https://single-view-mpi.github.io/)
- DeepView: View Synthesis with Learned Gradient Descent (https://augmentedperception.github.io/deepview/)
- Femto-Photography: Capturing and Visualizing the Propagation of Light (https://dspace.mit.edu/bitstream/handle ... sAllowed=y)
In reading these papers, my focus was on methodology, ideology, and terminology. To be concise, I’m going to say a bit about each of these three areas, treating the papers together rather than one by one.
Methodology
Methodology here means how the papers are structured: what a computer vision or machine learning paper generally includes, and what kind of testing is required for it to be considered complete.
Among my general observations about machine learning methodology is the way in which distinctly programmed components are generally composed together into larger networks.
Ideology
To address the possibility of their technology being used in deepfakes, the authors of the “talking head” paper published a statement accompanying their demonstration video, which I found particularly interesting in its ideological clarity (for better or worse). They state their belief that their work amounts to “the democratization of the [sic] certain special effects technologies” and that “the net effect of democratization on the world has been positive.”
This statement, though made by computer scientists, seems in line with responses to the public’s ambivalence toward imagery that is machine-generated (today) and machine-captured (at the beginning of photography).
For me, their statement falls flat in its amorality and ahistoricity. What the authors fail to account for is the massive transformation of image-transfer infrastructure and the amplified role that images play in social interactions today. What they should address is the ethical implications of creating a technology that absolutely will be used for harm. They write about the need for realistic avatars in VR spaces to make people feel more connected. This is, to my mind, currently an edge case. This technology’s more likely use is for propaganda, and that needs to be addressed more directly.
These are ideas I will have to address directly in my own future work, so engaging with these ethical issues now matters to me.
Just because you can do something doesn’t mean you should. Computer science’s drive to replicate every process needs to be reconsidered… or simply considered at all.
Terminology
There seems to be a general push in computer science to approach all phenomena ontologically; in other words, to study how something comes into being so as to reproduce it digitally. This is evidenced in the many approaches to computer vision that began by replicating human and biological vision and pattern-recognition processes with computational ones. There is a reason François Chollet, the creator of Keras and a software engineer at Google, is a big fan of neuroscience and childhood development (see these interviews:
https://www.youtube.com/watch?v=Bo8MY4JpiXE and
https://www.youtube.com/watch?v=PUAdj3w3wO4).
Indeed, my favorite borrowed terminology from the papers was the term ablation study, which machine learning takes from neuroscience (Chollet has been one of its loudest champions, though he didn’t coin it). Within neuroscience, an ablation study is one in which a portion of the physical brain is removed or disabled in order to assess its function and necessity. The same is done in machine learning: a portion of the neural network is removed to test its effect on performance. It gives researchers a glimpse into the black box.
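To make this concrete, here is a minimal sketch of an ablation study, assuming a small hypothetical Keras classifier on MNIST (the model, dataset, and one-epoch training budget are my illustrative choices, not anything from the papers above): train the full model and a variant with one convolutional block removed, then compare their test accuracy.

# A minimal sketch of an ablation study (hypothetical model and dataset,
# not taken from any of the papers discussed above).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(include_second_block=True):
    """Small CNN classifier; optionally ablate the second conv block."""
    model = models.Sequential()
    model.add(layers.Conv2D(16, 3, activation="relu",
                            input_shape=(28, 28, 1)))
    model.add(layers.MaxPooling2D())
    if include_second_block:
        # The component under study: remove it and retrain to measure
        # how much it actually contributes.
        model.add(layers.Conv2D(32, 3, activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(10, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0   # add channel axis, scale to [0, 1]
x_test = x_test[..., None] / 255.0

results = {}
for name, full in [("full", True), ("ablated", False)]:
    model = build_model(include_second_block=full)
    model.fit(x_train, y_train, epochs=1, batch_size=128, verbose=0)
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    results[name] = accuracy

# A small accuracy drop means the ablated block contributed little;
# a large drop means it was doing real work.
print(results)

Papers typically repeat this for each component and report the resulting accuracy deltas in a table, which is what makes the case that every piece of the network earns its keep.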
One of my other term-related discoveries throughout the readings was how a convolutional neural network actually works.
Convolution is essentially the process of sliding a small kernel across an image, pixel by pixel, producing a feature map that records where the kernel’s pattern appears; stacked convolutional layers progressively collapse the image into compact features that make identification easy and efficient. There is a beautiful visualization I found on YouTube that illustrates this process very well. The video shows a convolutional neural network identifying the letter ‘A’.
https://www.youtube.com/watch?v=f0t-OCG ... L&index=2
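To make the mechanics concrete, here is a minimal sketch of the kernel operation in plain NumPy; the hand-written Sobel edge-detection kernel stands in for the learned kernels of a real network, and the toy image is my own invention.

# A minimal sketch of 2D convolution with a hand-written kernel
# (illustrative only; real CNNs learn their kernel weights in training).
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image, producing a feature map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is a weighted sum of one local patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A classic vertical-edge detector (the Sobel kernel).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

image = np.zeros((8, 8))
image[:, 4:] = 1.0          # a hard vertical edge down the middle
feature_map = convolve2d(image, sobel_x)
print(feature_map)          # strong responses at the edge, zero elsewhere

One caveat: like most deep learning libraries, this slides the kernel without flipping it, so strictly speaking it computes cross-correlation, which the CNN literature still calls convolution.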
Another useful resource is 3Blue1Brown’s excellent series on Neural Networks:
https://www.youtube.com/playlist?list=P ... _ZCJB-3pi