4 Final Project

Posts: 8
Joined: Wed Apr 12, 2017 5:15 pm

Re: 4 Final Project

Post by chantelchan » Thu Mar 15, 2018 1:01 pm

My dataset comes from the Museum of Modern Art's API on GitHub. It consists of JSON records for over 200,000 artworks, each with attributes such as Title, Artist, Nationality, Year, and Medium. I only used the title, artist, and year in my project.
In my engineering classes, we studied image analysis and the information we can extract from a digital image. Many applications use this process today, such as Photoshop, which computes an RGB histogram from a photo. I always wondered what the main colors in an artwork were, and was curious what percentage of the artwork each color actually made up. I decided to create a pie chart that takes all the pixel values in the image and gives each color a proportional slice of the circle.
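The pixel-counting step can be sketched like this minimal Node example (the function names are my own and the pixels are assumed to already be read out as [r, g, b] triples; this is not the project's actual code):

```javascript
// Quantize each channel into coarse buckets and count how large a
// share of the image each bucket occupies. Each share maps directly
// to a pie-chart slice angle (share * TWO_PI).
function colorShares(pixels, bucketSize = 64) {
  const counts = {};
  for (const [r, g, b] of pixels) {
    // collapse each channel to its bucket so similar colors merge
    const key = [r, g, b].map(c => Math.floor(c / bucketSize) * bucketSize).join(',');
    counts[key] = (counts[key] || 0) + 1;
  }
  const total = pixels.length;
  return Object.fromEntries(
    Object.entries(counts).map(([k, n]) => [k, n / total])
  );
}

// an image that is three-quarters reddish and one-quarter blue
console.log(colorShares([[255, 0, 0], [250, 0, 0], [255, 5, 0], [0, 0, 255]]));
```

Coarser buckets merge near-identical pixel colors, which keeps the pie chart from fragmenting into thousands of one-pixel slices.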
Because 200,000 artworks is too many to process, I narrowed the set down to the 101 Most Important Western Painters according to "The Art Wolf." I wrote a script, "filter.js," that compiles a new JSON file, "data_compressed.json," containing only these painters. This file contains 3,970 artworks.

Code: Select all

const fs = require('fs');
let finished = [];
let constituentID = [//url: http://www.theartwolf.com/articles/most-important-painters.htm
                      4609, //PABLO PICASSO
                      //GIOTTO DI BONDONE
                      //LEONARDO DA VINCI
                      4869, //PAUL CÉZANNE
                      //REMBRANDT VAN RIJN
                      //DIEGO VELÁZQUEZ
                      2981, //WASSILY KANDINSKY
                      4058, //CLAUDE MONET
                      //JOSEPH MALLORD WILLIAM TURNER
                      //JAN VAN EYCK
                      //ALBRECHT DÜRER
                      4675, //JACKSON POLLOCK
                      //MICHELANGELO BUONARROTI
                      2098, //PAUL GAUGUIN
                      2274, //FRANCISCO DE GOYA //did not find any
                      2206, //VINCENT VAN GOGH
                      3721, //ÉDOUARD MANET
                      5047, //MARK ROTHKO
                      3832, //HENRI MATISSE
                      370, //JEAN-MICHEL BASQUIAT
                      4614, //EDVARD MUNCH
                      4057, //PIET MONDRIAN
                      //PIERO DELLA FRANCESCA
                      //PETER PAUL RUBENS
                      6246, //ANDY WARHOL
                      4016, //JOAN MIRÓ
                      //TOMMASO MASACCIO
                      1055, //MARC CHAGALL
                      //GUSTAVE COURBET
                      //NICOLAS POUSSIN
                      3213, //WILLEM DE KOONING
                      3130, //PAUL KLEE
                      272, //FRANCIS BACON
                      3147, //GUSTAV KLIMT
                      1474, //EUGÈNE DELACROIX //not found
                      //PAOLO UCCELLO
                      //WILLIAM BLAKE
                      3710, //KAZIMIR MALEVICH
                      //ANDREA MANTEGNA
                      //JAN VERMEER
                      //EL GRECO
                      //CASPAR DAVID FRIEDRICH
                      //WINSLOW HOMER
                      1634, //MARCEL DUCHAMP
                      2963, //FRIDA KAHLO
                      //HANS HOLBEIN THE YOUNGER
                      1465, //EDGAR DEGAS
                      //FRA ANGELICO
                      5358, //GEORGES SEURAT
                      //JEAN-ANTOINE WATTEAU
                      1364, //SALVADOR DALÍ
                      1752, //MAX ERNST
                      2923, //JASPER JOHNS
                      //SANDRO BOTTICELLI
                      2678, //DAVID HOCKNEY
                      624, //UMBERTO BOCCIONI
                      //JOACHIM PATINIR
                      //DUCCIO DA BUONINSEGNA
                      //ROGER VAN DER WEYDEN
                      //JOHN CONSTABLE
                      //JACQUES-LOUIS DAVID
                      2252, //ARSHILLE GORKY
                      //HIERONYMUS BOSCH
                      //PIETER BRUEGEL THE ELDER
                      //SIMONE MARTINI
                      //FREDERIC EDWIN CHURCH
                      2726, //EDWARD HOPPER
                      1930, //LUCIO FONTANA
                      3748, //FRANZ MARC
                      4869, //PIERRE-AUGUSTE RENOIR
                      6336, //JAMES MCNEILL WHISTLER
                      //THEODORE GÉRICAULT
                      //WILLIAM HOGARTH
                      1253, //CAMILLE COROT
                      744, //GEORGES BRAQUE
                      //HANS MEMLING
                      4907, //GERHARD RICHTER
                      4038, //AMEDEO MODIGLIANI
                      //GEORGES DE LA TOUR
                      //GENTILESCHI, ARTEMISIA
                      //JEAN FRANÇOIS MILLET
                      //FRANCISCO DE ZURBARÁN
                      1739, //JAMES ENSOR
                      3692, //RENÉ MAGRITTE
                      3569, //EL LISSITZKY
                      5215, //EGON SCHIELE
                      //DANTE GABRIEL ROSSETTI
                      //FRANS HALS
                      //CLAUDE LORRAIN
                      3542, //ROY LICHTENSTEIN
                      4360, //GEORGIA O'KEEFE
                      //GUSTAVE MOREAU
                      //GIORGIO DE CHIRICO
                      6624, //FERNAND LÉGER
                      //JEAN-AUGUSTE-DOMINIQUE INGRES
]; //constituentID array
let originalFilePath = 'Artworks.json'; //put your original file path here, e.g. dumbHoney.json
let endFilePath = 'data_compressed.json'; //put what you want your new JSON file to be called here, e.g. honeyIsTheBest.json
fs.readFile(originalFilePath, (err, data) => {
    if (err) throw err;
    let artwork = JSON.parse(data);
    for (let i = 0; i < artwork.length; i++) {
        // this if statement might need changing, if you want more stuff like sculptures and whatnot
        // (MoMA lists ConstituentID as an array; match on the first artist)
        if (artwork[i].ConstituentID && constituentID.includes(artwork[i].ConstituentID[0])
            && (artwork[i].Classification !== 'Sculpture' && artwork[i].Classification !== 'Architecture')
            && (artwork[i].URL !== null && artwork[i].ThumbnailURL !== null)) {
            finished.push(artwork[i]);
        }
    }
    fs.writeFile(endFilePath, JSON.stringify(finished), (err) => {
        if (err) throw err;
    });
});

Part of my code also downloads each image and saves it into my local directory so that processing time is shorter. Instead of retrieving the image from the URL every time, the sketch retrieves it locally, which cuts down on load time.

Code: Select all

for (int i = 0; i < resultCounts; i++) {
  String imageURL = artworks.getJSONObject(i).getString("ThumbnailURL");
  covers[i] = loadImage(imageURL);
  println(resultCounts - i); //print countdown
}
Because we are looking at "modern art," not every artist on the list appears. However, other notable artists such as Pablo Picasso, Roy Lichtenstein, and Marcel Duchamp are included.

In my project, I was able to browse through all the artworks and see a large-scale image of each. However, drawing the pie chart takes more processing time, so I had to reduce the number of results to prevent excessive lag. On top of that, adding the histogram values for red, green, blue, hue, saturation, and value required even more processing, so I had to reduce the number of results again. Although my functions work, my laptop's processing power keeps me from easily browsing through all the artworks while also getting the pie chart and histogram information.

Users can use the scrollbar on the bottom to look through the catalog of art. In addition, when the user hovers over a square image of an art piece, a larger view of the artwork is scaled to size along with its title, artist, and the year it was made.
The color wheel does not transition smoothly; I believe this is because when I sorted the color values, I sorted by hue first and then by brightness. This makes the wheel patchy and makes it harder to see at a glance what percentage of the artwork each color makes up.
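The hue-first ordering can be illustrated with a small comparator sketch (colors assumed to be [r, g, b] triples; the names are mine, not the project's):

```javascript
// Convert RGB to hue (degrees) and brightness (value), the two keys
// used for sorting. Sorting by hue first means colors with similar
// hue but very different brightness land next to each other, which
// is what makes the wheel look patchy.
function toHueBrightness([r, g, b]) {
  const max = Math.max(r, g, b), min = Math.min(r, g, b);
  const d = max - min;
  let h = 0;
  if (d !== 0) {
    if (max === r) h = ((g - b) / d + 6) % 6;
    else if (max === g) h = (b - r) / d + 2;
    else h = (r - g) / d + 4;
    h *= 60; // hue in degrees, 0..360
  }
  return { h, v: max / 255 };
}

function sortHueFirst(colors) {
  return [...colors].sort((a, b) => {
    const ca = toHueBrightness(a), cb = toHueBrightness(b);
    return ca.h - cb.h || ca.v - cb.v; // hue first, brightness as tie-break
  });
}
```

A smoother wheel would need a perceptual ordering (or at least sorting within narrow hue bands) rather than this strict lexicographic sort.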

I wrote another script, "unique.js," that compiles a string array of unique elements. For example, with "unique.js" I created "uniqNationality.json," which lists every unique nationality mentioned in the "data_compressed.json" file.

Code: Select all

const fs = require('fs');
let filepath = 'data_compressed.json';
let writeFilePath = 'uniqDates.json'; //'uniqNationalities.json';

const helper = (bunchOfData) => {
    let uniqueStorage = {};
    let uniqueNames = [];
    bunchOfData.forEach((element) => {
        if (!uniqueStorage[element.Date]) { //element.Nationality[0] for nationalities
            uniqueStorage[element.Date] = true;
            uniqueNames.push(element.Date);
        }
    });
    let data = JSON.stringify(uniqueNames);
    fs.writeFile(writeFilePath, data, (err) => {
        if (err) throw err;
    });
};

fs.readFile(filepath, 'utf8', (err, data) => {
    if (err) {
        throw err;
    } else {
        data = data.slice(1, data.length); //skip a stray leading character before the JSON
        helper(JSON.parse(data));
    }
});
I would have liked to keep filtering the artworks so that they are grouped by category, with buttons users could click to sort them. But because limited processing power keeps the sketch from running smoothly, this was frustrating to implement. Maybe I can develop a much faster algorithm to retrieve this kind of information in the future.
Last edited by chantelchan on Thu Mar 15, 2018 2:12 pm, edited 3 times in total.

Posts: 6
Joined: Fri Jan 19, 2018 11:03 am

Re: 4 Final Project

Post by annikatan » Thu Mar 15, 2018 1:19 pm

The goal of my text mining project is to build a visual classifier that breaks each text into designated clusters.

My dataset is from records of TED Talks (TED.com) up to September 21st, 2017 (brought to us by Kaggle: https://www.kaggle.com/rounakbanik/ted-talks). The dataset includes 17 attributes such as name, title, description, event, tags, and so forth. For my project, I am only focusing on tags, which holds a list of themes associated with each talk.

Preprocessing Dataset
The original dataset holds more than 2,500 rows. However, due to technical issues and fatal errors, I had to cut my data down to 10 lines. For the tags, I removed special symbols and punctuation and split each tag string into word vectors on spaces.
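The tag cleanup might look like this minimal sketch (the exact cleanup rules and function name are my reconstruction, not the project's code):

```javascript
// Strip special symbols and punctuation from a raw tags string, then
// split on whitespace into individual word tokens for clustering.
function tokenizeTags(tagString) {
  return tagString
    .replace(/[^\w\s]/g, ' ') // drop punctuation and special symbols
    .split(/\s+/)             // split on runs of whitespace
    .filter(w => w.length > 0);
}

console.log(tokenizeTags("['culture', 'social change', 'TED Fellows']"));
```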

Library / Inspirations
I have previously done text mining projects and visualized them with word clouds, simple bar graphs, and line plots. I wanted to challenge myself and learn a new kind of exploratory analysis.

I was inspired by Rodger’s “Reporting on Boko Haram” project. He created a single text web for each category.

My project is heavily based on the toxiclibs example “Force Directed Graph” by Daniel Shiffman.

Press Key
N - New Graph
P - Points
C - Connections

Final Results

Future Work
- Add more data points without crashing program
- Display color by cluster
- Add additional animations (inspired by example, “Ref_ClosestPoint_Mesh3D”)
- Add GUI functions
Screen Shot 2018-03-15 at 2.35.39 AM.png
Last edited by annikatan on Thu Mar 15, 2018 2:53 pm, edited 2 times in total.

Posts: 9
Joined: Fri Apr 01, 2016 2:34 pm

Re: 4 Final Project

Post by zhenyuyang » Thu Mar 15, 2018 2:03 pm

Forex tree
- 2D data visualization based on the real-time Forex (foreign exchange) rate

2D data visualization, Recursion Tree, Foreign Exchange, Reinforcement Learning, PID controller.

Data source
I use a dataset from OANDA Corporation (a registered Futures Commission Merchant and Retail Foreign Exchange Dealer). With a simulation account, I can obtain the latest rate of any foreign exchange pair (e.g. EUR/USD).

After registering an account and setting up the REST API, I successfully fetched rate data for the EUR/USD pair in the Processing environment. The data is two-dimensional, with variables of time and rate. The market is active from Sunday to Friday, 24 hours a day.
The main idea is to use Processing to visualize how the EUR/USD rate moves, and also to visualize the future trend predicted by a pre-trained reinforcement learning model. To visualize the two cases (rate rising/dropping), I decided to use structures with branches. Each branch can be viewed as a timestamp, from which the future divides into two directions.

Visual Inspiration
I was inspired by a photo of a tree standing on the ground. The outline of the tree starts from its trunk, then keeps branching and growing in all directions.
Then I found the following picture, in which a tree is placed in the center. It made me think that I could treat the center of the screen as a new starting point for each moment as the tree grows continuously. This idea led me to a design for autonomous camera control that always keeps the growing point in the center of the screen.
I further simplified the image of a tree and obtained various representations of it. Some were particularly simple and beautiful (shown in the following pictures).
Looking at the simplified image, I realized that the tree shape has all the features I was looking for: it is a set of branches growing from a base point in different directions, which can be mapped onto the different future trends of a forex pair's rate. After some drawing and sketching, I decided to use one branch to represent the EUR/USD rate over a period of 600 frames. During each period, a pre-trained reinforcement learning module determines the probability that the rate will go up in the next period. This probability then guides how the branch divides into two directions and how each part grows.

The two divided parts are two new branches and will continue to grow for a while. The colors and shapes of these two new branches convey the probability that the forex pair's rate will rise in the next 600 frames.

Color coding:
The branches are colored red and green, mainly based on the color scheme of the US stock market: stocks are shown in green when their prices are rising and in red when their prices are dropping.
The probability that the rate rises controls the exuberance of the two sub-branches. If a sub-branch is exuberant, the forex rate has a higher chance of moving in the direction that sub-branch represents.
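The branching rule can be sketched as follows. The linear weighting here is my illustration of the idea, not the exact formula used in the sketch:

```javascript
// A rise probability p in [0, 1] splits one branch into an "up"
// (green) and a "down" (red) sub-branch; p controls each sub-branch's
// exuberance, modeled here as length and stroke weight.
function splitBranch(p, parentLength, parentWeight) {
  return {
    up:   { length: parentLength * p,       weight: parentWeight * p,       color: 'green' },
    down: { length: parentLength * (1 - p), weight: parentWeight * (1 - p), color: 'red' },
  };
}

// a 70% chance of rising makes the green sub-branch more exuberant
console.log(splitBranch(0.7, 100, 10));
```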
Some other sketches:
Final Results:
Key "T": Show/Hide timeline(rate history)

Screen Record
Last edited by zhenyuyang on Tue Mar 20, 2018 9:26 pm, edited 12 times in total.

Posts: 4
Joined: Fri Jan 19, 2018 11:14 am

Final Project Echo

Post by echotheohar » Thu Mar 15, 2018 2:28 pm


I chose to work with George Legrady's dataset, sourced from the Centre Pompidou, an arts institution based in Paris, France. The data seemed to be a small snippet of a much larger set, and included query histories ranging from popular French news sources such as Le Monde and Le Figaro, to search queries from dailymotion.com, to general search queries from Bing. Each record was captured with multiple fields: date, time, year, URL, and the query string. I chose to focus on the general search queries from bing.com.



I was curious about how the data was collected by the institution and wanted to visualize the data in such a way that the user was implicated in the viewing of the data. I decided to use the OpenCV library to use the internal camera built into my laptop to capture live video and use it as an input for moving between the first two columns of the data, being the session ID and date of access.


The project used all data points in the dataset, with the exception of the Bing.com mentions themselves, since they were not necessary for understanding the final visualization.



A large portion of the project involved building a text parser that could act similarly to MySQL, in the sense that I could isolate specific terms I deemed frequently occurring in the Bing searches. The parser worked by looking at a string of text that could be cross-referenced with the information in three different CSV documents, split up by month.
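The filtering idea could be sketched like this (the row fields and term list are assumptions about the CSV layout, not the actual parser):

```javascript
// Keep only rows whose query string contains one of a hand-picked
// set of frequent terms, a bit like SQL's WHERE ... LIKE filter.
function filterQueries(rows, terms) {
  const wanted = terms.map(t => t.toLowerCase());
  return rows.filter(row =>
    wanted.some(term => row.query.toLowerCase().includes(term))
  );
}

const rows = [
  { date: '2017-01-03', query: 'le monde actualites' },
  { date: '2017-01-03', query: 'meteo paris' },
  { date: '2017-01-04', query: 'Le Figaro sport' },
];
console.log(filterQueries(rows, ['le monde', 'le figaro']));
```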



Overall, the project didn't work exactly the way I wanted it to because I had trouble shifting between the differing months and sheets of data. In the future, I would like to build upon this project and clean up the code since I "brute forced" a lot of functionality that could probably be accomplished in a more elegant way.

Posts: 10
Joined: Wed Jan 06, 2016 1:38 pm

Re: 4 Final Project

Post by junxiangyao » Thu Mar 15, 2018 2:58 pm

Cultivation of Energy
Screen Shot 2018-03-15 at 2.49.20 PM.png
In this project, I used data from the EXIOBASE dataset, which provided me with supply and consumption data for 6 main kinds of crops across 42 countries. There are also data for categories like “rest of Asia,” which I ultimately decided not to use since there is no applicable geographical data.
Screen Shot 2018-03-15 at 2.50.27 PM.png
I used a spherical coordinate system to map the data onto a globe model. Locating a point in spherical coordinates requires three parameters: two angles and a radius, the distance between that location and the center of the coordinate system. In my visualization, the two angles are latitude and longitude, while the radius is offset by the supply value of one kind of crop from the country the point represents. The higher a point sits above the surface of the globe, the larger the supply value.
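The spherical mapping works out to something like this (the axis conventions and the supply scale factor are my assumptions, not values from the project):

```javascript
// Latitude and longitude give the two angles; the supply value is
// added to the globe radius so higher-supply countries float further
// from the surface.
function supplyPoint(latDeg, lonDeg, baseRadius, supply, scale = 0.001) {
  const lat = latDeg * Math.PI / 180;
  const lon = lonDeg * Math.PI / 180;
  const r = baseRadius + supply * scale; // lift the point by supply value
  return {
    x: r * Math.cos(lat) * Math.cos(lon),
    y: r * Math.sin(lat),
    z: r * Math.cos(lat) * Math.sin(lon),
  };
}

// a zero-supply country at lat 0, lon 0 sits exactly on the surface
console.log(supplyPoint(0, 0, 200, 0)); // { x: 200, y: 0, z: 0 }
```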

I drew a sphere at every point. The radius of the sphere is controlled by the self-consumption data: the larger the sphere, the larger the self-consumption.
Screen Shot 2018-03-15 at 2.50.10 PM.png
If a country consumes a certain kind of crop from another country, a line is drawn between the two points. The brightness of the line shows the amount of international consumption between the two countries. When all the data are displayed, the lines form a net.
Screen Shot 2018-03-15 at 2.50.41 PM.png
There is also an individual mode that enables the user to examine one country in detail on the globe.
Screen Shot 2018-03-15 at 2.49.32 PM.png
Screen Shot 2018-03-15 at 2.49.36 PM.png
Screen Shot 2018-03-15 at 2.51.03 PM.png
Screen Shot 2018-03-15 at 2.51.11 PM.png


Posts: 5
Joined: Fri Jan 19, 2018 11:07 am

Re: Adam Jahnke Final Project

Post by admjahnke » Mon Mar 19, 2018 9:56 am

With my final project I was interested in using and viewing the Pompidou dataset from Professor George Legrady. I was interested in seeing the frequency of words searched, in particular searches for words that also reflect my daily web activity. So I focused on Google, Instagram, BBC, Yahoo, YouTube, and Facebook.
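A minimal sketch of that frequency count (the query format is an assumption about the compiled CSV, not the actual sketch code):

```javascript
// Tally how often each chosen site name appears across the search
// queries, which gives one bar per site for the visualized columns.
function siteFrequencies(queries, sites) {
  const counts = Object.fromEntries(sites.map(s => [s, 0]));
  for (const q of queries) {
    const lower = q.toLowerCase();
    for (const site of sites) {
      if (lower.includes(site.toLowerCase())) counts[site]++;
    }
  }
  return counts;
}

const queries = ['youtube music', 'facebook login', 'youtube cats', 'bbc news'];
console.log(siteFrequencies(queries, ['Google', 'YouTube', 'BBC', 'Facebook']));
```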

I also compiled all of the search data into one CSV file and have those results streaming on the side of the sketch. The moving list is not linked to the visualized columns, but I feel it adds an interesting effect.
Screen Shot 2018-03-19 at 10.43.58 AM.png
Screen Shot 2018-03-19 at 10.44.10 AM.png
Screen Shot 2018-03-19 at 10.44.27 AM.png
