Report 2: The Pixel-based Image & its Processing

glegrady · Post by **glegrady** » Mon Oct 05, 2020 12:57 pm

MAT594GL Techniques, History & Aesthetics of the Computational Photographic Image
https://www.mat.ucsb.edu/~g.legrady/aca ... f594b.html

Please provide a response to any of the material covered in this week's two presentations by clicking on "Post Reply". Consider this to be a journal to be viewed by class members. The idea is to share thoughts, other information through links, anything that may be of interest to you and the topic at hand.

Report for this topic is due by October 27, 2020 but each of your submissions can be updated throughout the length of the course.

ehrenzeller · Post by **ehrenzeller** » Mon Oct 26, 2020 2:38 pm

When it comes to noise, it seems as if our society is still overcoming an obsession with cleanliness and order—removing background noise in recorded music, radio stations having multiple call signals to prevent overlap with neighboring channels, images growing crisper every few months when new digital cameras with faster processors, supporting higher quality images are released. However, there is also an inherent beauty, even something mysterious, when it comes to knowingly introducing noise through seemingly conflicting forms of media and observing what happens.

The musical recording artist and songwriter Tom Waits is known for his obscure ways of seeking inspiration, most notably playing multiple songs at the same time and looking for overlapping elements (https://reverb.com/ca/news/songwriting- ... he-masters). He even staged a recording studio to look and operate like a restaurant, capturing the audience and background noise to set the tone for the album Nighthawks at the Diner (https://www.tomwaitsfan.com/tom%20waits ... hawks.html), and perhaps society at large is starting to catch on. There are countless ways to introduce noise into our digital photos through artificial grain, sepia, and vintage-type filters.

Our survey of the work of Ikeda, Kurokawa, and George Legrady, which transforms visuals into sound, reminded me of the reverse process (going from sound to visual), particularly the behavior of non-Newtonian fluid on speakers (https://www.youtube.com/watch?v=3zoTKXXNQIU), which not only broadens our understanding of science but also provides a stunning visual. I’m curious about other forms of moving from sound to video, particularly in the CGI world. In commercial applications, soundtracks are often found or created afterwards to fit a specific scene; perhaps the future will include an automated AI means of selecting or composing music to fit a scene’s emotional mood, or even ways to create a scene from the sound itself.

Lastly, the Sun Gardens cyanotypes featured reminded me of the microscopy photos of diatoms and radiolarians from my undergraduate years studying marine biology. This led me to ponder the possibilities of Deepfakes outside of the photography realm. As new technology emerges, will we be able to tell a real X-Ray from a doctored one? True electron microscopy from an imposter? Perhaps digital watermarking is a field worth investing in….

wqiu · Post by **wqiu** » Mon Oct 26, 2020 11:04 pm

Several responses to pixel

When images are presented as pixels, photographs can be viewed as purely numeric data. Many artworks reference information theory: they remove redundancies in a photograph, compress the data, and present the information with a minimal number of pixels. Jim Campbell is a great example of an artist using this concept. In his famous series of LED works, he uses only a small number of LEDs to present moving images, and viewers are still able to perceive the original information.

This leads to an appreciation of entropy. An image with high entropy has little redundancy, so it can't be effectively reduced. Therefore, it cannot be reconstructed from Jim Campbell's small number of LEDs. However, an image is not interesting either if it has an extreme amount of entropy, as in an image of white noise. Images with this kind of aesthetic strike a balance between chaos and order.
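To make the entropy idea a bit more concrete, here is a minimal sketch that measures Shannon entropy from the histogram of gray values. The three test "images" (a flat field, black and white stripes, random noise) are made up for illustration, and this first-order measure ignores spatial structure, so it is only a rough stand-in for the redundancy discussed above.

```python
import math
import random
from collections import Counter


def shannon_entropy(pixels):
    """Entropy in bits per pixel, computed from the histogram of gray values."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Three hypothetical 64x64 grayscale images, flattened to lists of 4096 values.
flat = [128] * 4096                                            # one gray value
stripes = [0 if (i // 8) % 2 == 0 else 255 for i in range(4096)]  # two values
rng = random.Random(0)                                         # fixed seed
noise = [rng.randrange(256) for _ in range(4096)]              # white noise

for name, img in [("flat", flat), ("stripes", stripes), ("noise", noise)]:
    print(name, round(shannon_entropy(img), 3))
```

The flat image carries no information per pixel, the stripes carry one bit, and the noise comes close to the 8-bit maximum, which is the "nothing left to remove" case.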

: Ryoji Ikeda

The other aesthetic is to appreciate the essence of data. Images consist of data, so any data can be rendered as an image, even erratic data. Stan Douglas's Corrupt Files series asks the audience to appreciate the erratic data caused by an overheated CCD or SD card during image capture. This is like Kazimir Malevich's Black Square, which exaggerates the presence of the canvas and paint texture by removing the painting's content. It is also like Alison Rossiter's work of exposing outdated film to reveal the data of the material itself.

: Corrupt Files 2010_3024 by Stan Douglas, 2013

: Black Square by Kazimir Malevich

: Alison Rossiter

Another kind of work with image pixels finds the statistical characteristics of the pixels. Averaging is the most commonly used. Hiroshi Sugimoto's "Theater" series, which "shoot a whole movie in a single frame", essentially averages the motion frames. This process is digitized by Jim Campbell in his Illuminated Average series, such as HITCHCOCK'S PSYCHO 2000. Jason Salavon takes this technique even further. His works Portrait and 100 Special Moments use this averaging idea. Of course, averaging is only one of the techniques used to extract the statistical characteristics of image data. In fact, today's deep neural networks can be thought of as data mining tools too. What they do is analyze an image dataset and summarize the information in it. Tom White's Perception Engine outlines the essential strokes a computer needs to recognize an image as containing a certain object. GANs, such as those behind deepfakes, basically derive a complex equation from a huge face dataset. It is an equation that generates a given number of pixel values which, if placed in a 3-channel RGB 2D grid, will be recognized as a face by the human mind.
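As a rough illustration of the averaging idea, the sketch below takes the per-pixel mean over a stack of grayscale frames. The tiny frames (a bright dot crossing a gray background) and the helper `make_frame` are hypothetical; this is not how any of the artists named above actually produced their work.

```python
def average_frames(frames):
    """Per-pixel mean over equally sized grayscale frames (lists of rows)."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    n = len(frames)
    return [
        [sum(frame[y][x] for frame in frames) / n for x in range(width)]
        for y in range(height)
    ]


def make_frame(dot_x, width=5, height=3):
    """Hypothetical frame: gray background (50) with one bright dot (255)."""
    frame = [[50] * width for _ in range(height)]
    frame[1][dot_x] = 255
    return frame


# The dot moves one pixel to the right in each of five frames.
frames = [make_frame(x) for x in range(5)]
avg = average_frames(frames)

for row in avg:
    print(row)
```

The static background survives unchanged, while the moving dot is smeared into a faint streak along its path: persistent signal stays, transient signal fades.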

: Carpenter Center by Hiroshi Sugimoto, 1993

https://www.sugimotohiroshi.com/new-page-7

: hitchcocks_psycho by Jim Campbell

https://www.jimcampbell.tv/portfolio/st ... index.html

: Portrait (Rembrandt) by Jason Salavon 2009

: 100 Special Moments by Jason Salavon 2004

: Perception Engine by Tom White

https://medium.com/artists-and-machine- ... 46bc598d57

k_parker · Post by **k_parker** » Tue Oct 27, 2020 9:44 am

This week I was particularly drawn to the artists working in layered transparencies and pixel-averaging image processing. To name a few that were mentioned in lecture and also linked in the syllabus: Idris Khan, Jason Salavon, Jim Campbell and Michael Najjar. In my studio practice I am consistently playing around with layered transparencies, such as digital photo manipulation, projection reflected onto canvas or glass panels, and the painting material itself, egg tempera, which involves layering many, many films of translucent paint over one another. I am interested in overlaying duplicity to create a singular image.

I was struck by the idea that layering all of these images was a way to get to some kind of a “truth”, that by finding the places where these photographs/ videos overlap there is a hidden regularity and order to the process. Formulaically, the “truth” that is arrived at is interestingly blurred, visually vibrates, ghostly, and, going a bit off of Alex’s post, is reminiscent of medical imaging. Referring specifically to Idris Khan, the image could be seen as an x-ray of sorts where the “skin” of the city becomes translucent and what is left are the “bones”.

: Idris Khan

I am interested in the visual vibration that seems to happen in the layers of transparencies. This brought me back to basic vibrating color theory (I’ll spare your eyes by not including an image). The hard lines in the layered pieces look as though they are vibrating back and forth as I try to visually decode the structures (or faces with Salavon). I wonder what would happen if the “blurry” transparency images were plugged into a neural network that tried to remove the noise and make sense of the image again. Or maybe it would be more interesting to overlay the images and then remove what doesn’t line up, to try to find structural repetitions in different locations as a commentary on social/political narratives.

chadress · Post by **chadress** » Tue Oct 27, 2020 11:39 am

: “Winona”
Eigenface (Colorized),
Labelled Faces in the Wild Dataset
2016

Trevor Paglen & Hal Foster

If I told those outside this classroom and field of research (my septuagenarian mother for example) that there was a vast alternate universe where machines created, tracked, and surveilled images largely for the processing and consumption of other machines, for purposes largely unknown and invisible to humans yet greatly affecting them, they might think I was describing a piece of science fiction. The Matrix, perhaps? Take the blue pill, Morpheus would say, and you’ll never know this universe existed. Take the red pill, and “I’ll show you how deep the rabbit hole goes.”

Trevor Paglen wants everyone to take the red pill. “Something dramatic has happened,” claims Paglen. “The overwhelming majority of images are now made by machines for other machines, with humans rarely in the loop.” For Paglen, this occurrence can be described as a break in our historic understanding of visual culture, from one based on “fleshy things” viewed with another set of “fleshy things” (our eyeballs), to one where machines do all of the looking. This shift, resulting in what Paglen describes as a culture of machine vision, is “detached from human eyes”, and therefore remains largely invisible.

So how deep does the machine vision rabbit hole go? In Invisible Images (Your Pictures Are Looking at You) from this week's readings, Paglen makes a convincing case that we’re staring into the Nietzschean abyss: “Its continued expansion is starting to have profound effects on human life, eclipsing even the rise of mass culture in the mid 20th century.” Paglen goes on to describe in detail the inner workings of this burgeoning, mostly unseen machine vision culture, and its wider implications for a society that increasingly communicates with images. He touches on issues ranging from ideological pitfalls to exercises and abuses of power from both corporate and governmental entities.

For those interested in a more critical-theoretical discussion on many of the topics we’ve been covering in this class thus far, I would point to two essays in Hal Foster’s recent publication: What Comes After Farce?

In Smashed Screens, Foster largely covers the implications of Hito Steyerl’s work in, around, and through her seminal publication Duty Free Art, with both past and future nods to Walter Benjamin, Harun Farocki and Trevor Paglen.

: Screen Capture from Hito Steyerl's How Not to Be Seen: A Fucking Didactic Educational .MOV File, 2013

The essay Machine Images dwells almost entirely on the work of Trevor Paglen. One question Foster asks warrants my attention in particular: “So why present this research in the form of art at all?” He then answers with the following, which I believe is worth quoting in its entirety:

"… but the primary justification is that Paglen continues the critique of representations and institutions developed by artistic predecessors from Hans Haacke to Jenny Holzer. He also implies that, however embedded in the neoliberal economy, the art world can still provide limited occasions for media safe houses."

Paglen’s response here implies that exploring, or perhaps exposing such modus operandi may threaten power structures that are invested in keeping certain technological wizardry tucked safely behind the curtain, away from the larger scrutiny of society. Which begs the question: What exactly aren’t we seeing?

Links:
https://www.versobooks.com/books/3170-w ... fter-farce
https://thenewinquiry.com/invisible-ima ... ng-at-you/

isalty · Post by **isalty** » Tue Oct 27, 2020 11:06 pm

In thinking about this week’s lectures and readings, conceptions of “space” and “form” appear to be central to how we create and use pixel-based images. For example, the JPEG is used to average the relationship of pixels in order to compress the digital image and therefore save space in the process. The more an image is compressed, the lower the clarity and quality of the digital image’s form. The manipulation of form therefore is directly in relation to the spatial qualities of the material image. Thomas Ruff, a contemporary German photographer from the Düsseldorf School, was mentioned in lecture as an example of an artistic practice that engages digital portraiture and found JPEGs. Thomas Ruff manipulates the spatial qualities of the JPEG itself and the formal qualities of the pixel-based image by enlarging the pixels to create a complex interplay between imagination and reality through the blurred image. In 2009 Aperture Foundation published a book dedicated to Ruff’s “Jpeg” series that investigates how pixelated images, when enlarged so that the whole image appears imprecise and unclear, allow for a different experience of seeing the image that relies on the viewer to recognize what is visible, and make recognizable what appears less visible.

https://aperture.org/books/jpegs/

These conceptions of space and form of the digital image constitute the framework for James Elkins's article “Art History and the Criticism of Computer-Generated Images,” in which he describes how space is transformed in digital art through its apparent “limitless-ness.” (For temporal context: the JPEG was introduced in 1992, and Elkins is writing from a contemporaneous perspective only 2 years after this technological milestone.) While I think there is something interesting about the idea of space being, or better yet appearing, limitless because of software and processing that allow more expansive experimentation with spatial perception in an image (and this can easily tie into philosophies of space and representations of space), I take issue with a majority of Elkins's project. In trying to bridge the gap between the field of art history and its minimal examination of computer images, he makes reductionist claims about computer-image processing in terms of form and spatial qualities, namely stating that the “computer-assisted drawing is more rapid and less pictorially informed than in previous centuries, but it is also lucid and schematic as never before.” There is a lot going on here, and I think the main cause of this is Elkins's language in describing the digital image. He continues by arguing that even though the digital image appears distinct from traditional, institutionalized art historical conceptions of "art", these computer-generated "visualizations" in fact have roots in the history of Western painting. With formal elements that allow for the visual perception of images in certain spatial ways, this is certainly the case if we consider linear perspective, light, shadow, color, etc., but I think this connection can only extend so far as recognizing that all images, digital and non-digital, maintain certain formal qualities.
In this project, I don't think Elkins is necessarily looking that closely and critically at isolated digital images themselves, which is my main issue with a project presenting itself as intervening in an apparent disconnect/lag between digital images and art historical criticism. Rather, he is trying to place them in a larger visual convention of how images are used to represent and convey information in realistic ways as communicative data.

merttoka · Post by **merttoka** » Wed Oct 28, 2020 12:35 am

These days, it is commonplace to transform digital images with various algorithms after capture. However, the intricate system inside digital cameras, where photons are converted into digital signals, can also produce interesting image effects without any post-processing. The cheapest and most ubiquitous sensors used in digital cameras are CMOS sensors. They have many advantages over the older CCD technology, yet most of them suffer from the rolling shutter problem. A rolling shutter (as opposed to a global shutter) does not take a snapshot of the scene instantaneously but fills the image array by scanning the scene one line at a time. This "feature" produces image effects when the sensor or the target object is moving fast. (Smarter Every Day's explanation of rolling shutter with a wide variety of examples: https://www.youtube.com/watch?v=dNVtMmLlnoE)
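The line-by-line readout can be sketched in a few lines of code. This is a toy model under simple assumptions: the "scene" is a hypothetical vertical bar that moves one column per time step, and the rolling shutter reads exactly one row per time step. Real sensors have very different timing, so treat it only as an illustration of why straight objects come out slanted.

```python
def scene(t, width=8, height=4):
    """Hypothetical scene at time t: a one-pixel-wide vertical bar at column t % width."""
    return [[1 if x == t % width else 0 for x in range(width)] for _ in range(height)]


def global_shutter(t0, height=4):
    """Every row is sampled at the same instant t0."""
    return [scene(t0)[y] for y in range(height)]


def rolling_shutter(t0, height=4):
    """Row y is sampled one time step later than row y - 1."""
    return [scene(t0 + y)[y] for y in range(height)]


def bar_columns(image):
    """Column index of the bar in each row, to make the skew easy to read."""
    return [row.index(1) for row in image]


print("global :", bar_columns(global_shutter(0)))
print("rolling:", bar_columns(rolling_shutter(0)))
```

With the global shutter the bar sits in the same column in every row; with the rolling shutter it drifts one column per row, which is the diagonal skew seen in photos of fast-moving objects.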

Once the pictures are on a digital medium, we are presented with an overwhelming number of operations to transform our images. One operation I found interesting is averaging multiple frames into a single image, where repeated information stands out in a ghostly fashion and less persistent signals disappear. Jason Salavon's Rembrandt and Idris Khan's A World Within are good examples in this regard. The viewer can recognize the silhouette of a portrait or a building, yet it does not correspond to any particular portrait or building. Another interesting work I found that uses the averaging method is "The average leader" by Matty Mariansky (2012). The project collects portraits of world leaders (presidents, vice-presidents, chiefs of staff, etc.) and creates averaged portraits of them. As you might expect, all the averages show white males. The average prime minister is given below (Who do you see here?):

In conjunction with averaging images to achieve a silhouette, object detection on images helps machines to summarize an image with higher-level representations. With the help of a vast amount of data, machine learning methods can learn objects and annotate them on the images. This method is now used in almost every aspect of our technological civilization, from airports to traffic lights to counting people entering a smart building.

When a digital image contains repetitive information, it is straightforward to reduce that information to save disk space by compromising on full resolution. "jpeg ny02" by Thomas Ruff is a great example of such a deliberate attempt. We can still recognize the scene of a smoky building, and with our past experience, we can even deduce the timestamp of the picture.
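A crude way to see this trade-off is to replace every block of pixels with its mean value. Note the assumption: this is plain block averaging, a simplified stand-in and not the actual JPEG algorithm (which, as far as I know, transforms 8x8 blocks with a discrete cosine transform and quantizes the coefficients). The 4x4 gradient image is made up for the example.

```python
def block_average(image, block):
    """Replace each block x block tile of a grayscale image with its mean value."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for by in range(0, height, block):
        for bx in range(0, width, block):
            ys = range(by, min(by + block, height))
            xs = range(bx, min(bx + block, width))
            mean = sum(image[y][x] for y in ys for x in xs) / (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


# Hypothetical 4x4 diagonal gradient with 7 distinct gray values.
image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [4, 6, 8, 10],
    [6, 8, 10, 12],
]
pixelated = block_average(image, 2)

distinct_before = len({v for row in image for v in row})
distinct_after = len({v for row in pixelated for v in row})
print(distinct_before, "->", distinct_after)
for row in pixelated:
    print(row)
```

Each 2x2 tile now needs only one stored number instead of four, and the gradient is still recognizable even though the fine detail is gone.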

Rather than grouping similar pixels to compress images, object detection could be seen as a better way of doing such compression. In his book "A New Kind of Science", Stephen Wolfram extensively examines complex systems that originate from simple initial conditions. In the later sections, he argues that being able to "summarize" images using higher-level information gives the best image compression. He also mentions that our brains use exactly this type of compression to perceive the visual information received from our environment.

masood · Post by **masood** » Sun Nov 01, 2020 1:43 pm

Working through the content of week two, my interest in convolutional neural networks (CNNs) was piqued by the discussion of image classification, a related sub-topic of CNNs that has huge implications in terms of computational ethics and photography. In this response I'd like to focus on one artifact from the history of computation invented in 1980 by Dr. Kunihiko Fukushima of Kansai University: the Neocognitron. Not only does it have an amazing name, but it sets the precedent for a whole array of technologies that have continued to be developed to this day.

The main purpose of the Neocognitron was to identify hand-written alphanumeric characters.

Link:
http://www.scholarpedia.org/article/Neocognitron

This work relies upon the ideas of two Nobel Prize-winning scientists, David Hubel and Torsten Wiesel.

Link:
https://knowingneurons.com/2014/10/29/h ... erception/

In the 1950s, Hubel and Wiesel discovered that the visual cortex of cats responded to certain transformations of light signals as they changed and became more complex. Through this research they discovered that the visual cortex in the brain has a topographic relationship with the visual field, meaning that certain specific regions in the brain corresponded to the seen image. There are 3 different topographic maps within the brain's visual cortex:

Retinotopic Maps: Regions in the physical visual cortex that correspond directly to areas of the seen image, i.e. if a part of that brain is destroyed, one would see a blind spot in the image.

Ocular Dominance Maps: Unknown what this is used for, but it corresponds to whichever eye is dominant. It is believed that this is used for stereovision, although some binocular animals do not have this.

Orientation Maps: This region corresponds to transformations of objects and these neurons fire when something is rotated or turned.

Sources:
https://en.wikipedia.org/wiki/Topograph ... anatomy%29
https://en.wikipedia.org/wiki/Ocular_dominance_column

The idea that certain areas of the brain correspond to different attributes of an image during recognition inspired Fukushima's Neocognitron neural network. The network is composed of many layers with two distinct types of "cells" that gradually extract features of the image, from low order to high order details.

https://www.youtube.com/watch?v=Qil4kmvm2Sw

In sight perception, it was discovered that the brains of small animals (cats in the case of Hubel and Wiesel) respond to low order features of images first. Examples of low order features would be spots, edges, angles, and contrast. Higher order features are what those smaller features combine to create, like the corners of letters or whole letters themselves.

The best thing about the Neocognitron is that it functions to identify letters independent of transformations made to the letters. For example if a letter is shifted over to one side or scaled up or down, the system still successfully identifies the letter.

This also represents computer science's "bottom up" approach to replicating human neurophysiology and I think it's a positive contribution.
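The shift tolerance described above can be hinted at with a toy one-dimensional example. This is only an analogy and not Fukushima's actual model: a hypothetical "S layer" marks where a small template matches exactly, and a hypothetical "C layer" takes the maximum over neighboring positions so that small shifts no longer change the response.

```python
def s_layer(signal, template):
    """Feature extraction: 1 wherever the template matches at that offset, else 0."""
    n = len(template)
    return [1 if signal[i:i + n] == template else 0 for i in range(len(signal) - n + 1)]


def c_layer(responses, pool):
    """Shift tolerance: maximum response over non-overlapping windows of size pool."""
    return [max(responses[i:i + pool]) for i in range(0, len(responses), pool)]


template = [1, 0, 1]                      # the "feature" to look for
original = [0, 1, 0, 1, 0, 0, 0, 0, 0]    # feature starts at position 1
shifted = [0, 0, 1, 0, 1, 0, 0, 0, 0]     # same feature, moved one step right

s_original = s_layer(original, template)
s_shifted = s_layer(shifted, template)
print("S responses differ:", s_original != s_shifted)

c_original = c_layer(s_original, 4)
c_shifted = c_layer(s_shifted, 4)
print("C responses match: ", c_original == c_shifted)
```

The pooled responses agree here because the shift stays inside one pooling window; a larger shift that crosses a window boundary would still change the output, which is one reason such layers are stacked.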

Yann LeCun, who is the current head of AI at Facebook and the inventor of back-propagating convolutional neural networks (ConvNets), cites Fukushima in his own original papers. It's incredible that this work was done as early as 1980 and has had such a huge influence on us.

Here's a better image that might explain a bit better:

zhangweidilydia · Post by **zhangweidilydia** » Wed Nov 04, 2020 8:21 pm

This week George shared with us two topics: from analog to digital pixels, and image processing algorithms & image layering

For the pixel-based image part, this quote resonated with me:

How much information is required for recognition and what information is the most important

and it echoes the statement from the first week:

Image is a constructed artifact rather than a document of the world.
A representation that follows rules of how culture has evolved its understanding of images

It makes me rethink photography as a medium to convey meaning and deliver information. For machine learning algorithms, photographs are more like a vehicle of features. Machine learning algorithms classify and categorize the world based on their vocabulary. In this case, an image can be a weapon.

The overwhelming majority of images are now made by machines for another machine, humans are rarely in the loop

On the digital humanities side, images need to be viewed critically in terms of politics, culture, and surveillance. On the art side, I think the pixel-based image can be related to abstraction and metaphor. As a CalArtian, I am familiar with the work of CalArts faculty member Charles Gaines. I think his works are very well connected to this topic. - Evidencing Reality

Layering is a language of transparency

Charles's works connect to the meaning of the image and critique the meaning of representation by building up a system to generate new paintings.
One of his famous paintings, face1: identity politics, is a triptych of system-generated paintings. He uses his own grid system to calculate new paintings from the previous ones, then layers them by playing with transparency and materials. I think this work also criticizes the current social issue of representation and political identity. When I went back to China in recent years, I found that my face is actually an identity representation everywhere: I use my face to enter the airport, my face is scanned by different kinds of cameras in shopping malls to pay, and even to take a flight. Every time a machine scans my face, it automatically generates a series of information that relates to me: my gender, my fingerprint, my bank account, my payment history, etc. The large set of private data is the information underlying my face. But can my face really tell who I am? Can we generalize a creature by using sets of data through a representational image? Is the camera a weapon? Is meaning shifted when it is processed by a system?