Proj 4: 3D Interactive Final Project

glegrady
Posts: 203
Joined: Wed Sep 22, 2010 12:26 pm

Proj 4: 3D Interactive Final Project

Post by glegrady » Tue Jan 14, 2014 3:31 pm

This is the final project in the course. The visualization should be built with the PeasyCam library in Processing 1.5.1. PeasyCam provides 3D navigation and interactivity: you can rotate your data object, zoom in and out, etc.

You can transform any project realized in the course into a 3D interactive version or create something from scratch. Once again, use the SPL data and any correlation data if that is part of your concept.

3D means that you can map any number of the multivariate metadata attributes attached to each dataset in the SPL data onto the spatial dimensions.

Criteria for evaluation:
1) interesting, varied, or complex data
2) 3D spatialization of the data
3) Try variations in your choice of formal elements: colors, scaling, labeling, etc.
George Legrady
legrady@mat.ucsb.edu

songgaogeo
Posts: 4
Joined: Tue Jan 14, 2014 11:48 am

Re: Proj 4: 3D Interactive Final Project

Post by songgaogeo » Sun Mar 02, 2014 5:05 pm

Concept
For this project, I am interested in exploring the temporal check-out patterns of all Dewey classes. I will classify all 1000 of them into clustered groups or dispersed outliers based on the similarity of their temporal signatures, and investigate how to effectively visualize the temporal patterns and the classification results together in an interactive 3D environment.
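A minimal sketch of how such a classification could work, not the method used in the project: the post does not name an algorithm, so k-means is used here as a stand-in, and the names `signature` and `kmeans` as well as the sample counts are hypothetical. Each class's check-out counts are first normalized to proportions so that classes are compared by the shape of their temporal pattern rather than by volume.

```python
import math

def signature(counts):
    """Normalize raw check-out counts to proportions (the 'temporal signature')."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

def distance(a, b):
    """Euclidean distance between two signatures of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical quarterly counts for four classes (not real SPL data).
counts = [[10, 0, 0, 0], [0, 0, 0, 20], [18, 2, 0, 0], [0, 0, 3, 27]]
labels, _ = kmeans([signature(c) for c in counts], k=2)
```

Classes that end up far from every centroid could then be treated as the dispersed outliers; the threshold for that would have to be chosen by inspection.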
Visual Design
We are the astronomers of the information universe! I would like to draw on the beauty of the pictures of the universe and galaxies captured by space telescopes when designing the visualization of the Dewey class clusters. The color hue scheme will refer to galaxy colors, with brightness analogous to the sparkling sky.
Time Schedule
2014.03.01~03.04: Develop the query and collect the data from Seattle Public Library;
2014.03.04~03.07: Implement the classification algorithm to classify all 1000 Dewey classes based on their temporal check-out frequencies;
2014.03.07~03.13: Detailed visual design and implementation in Processing; discuss with classmates and receive feedback.
2014.03.14~03.18: Improve the project based on feedback and comments; Prepare for the final presentation.
Download:
http://www.geog.ucsb.edu/~sgao/mat259/SongGao_P4.zip
SongGao_MAT259_HW4_Final.pdf
(1.91 MiB) Downloaded 781 times
3D_spiral_colorlegend2.png
3Dgalaxy.png
3Dgalaxy_label.png
Attachments
SongGao_MAT259_HW4.pdf
(1.49 MiB) Downloaded 724 times
Last edited by songgaogeo on Tue Mar 18, 2014 4:15 pm, edited 9 times in total.

milrober
Posts: 4
Joined: Tue Jan 14, 2014 11:44 am

Re: Proj 4: 3D Interactive Final Project

Post by milrober » Mon Mar 03, 2014 9:35 pm

Concept:
The Seattle Public Library has many items that have a large number of physical copies. I will collect all of the items in the SPL that have more than 50 physical copies, and look at the distribution of various item types. Within these items, I will also collect information about the subject matter and the relative popularity of these items. My goal is to try to explore whether the items that appear in bulk in the SPL actually warrant their quantity given their popularity. The visualization will essentially be a collection of several 3D tree maps.
Attachments
doodle.png
Miller-FinalDoodle.docx
(156.14 KiB) Downloaded 1248 times
Last edited by milrober on Thu Mar 06, 2014 4:09 pm, edited 1 time in total.

currier
Posts: 4
Joined: Tue Jan 14, 2014 11:50 am

Re: Proj 4: 3D Interactive Final Project

Post by currier » Tue Mar 04, 2014 3:11 pm

Exploratory data visualization to investigate bar code anomalies
(Revised on 3/13/2014)

Project Description:
For the final project, I return to the subject of my first project: bar code anomalies. The barcode field only appears in two tables of the spl2 database: inraw and outraw. Tables derived from these (e.g. activity, callnum, collection) include the item number but not the bar code as columns, suggesting that the bar code is redundant information. In the first project I posed queries to investigate whether, in fact, the bar code and item number are both unique to individual items and found that this is not the case: some item numbers are associated with more than one bar code and vice versa.

While the majority of itemNumbers are associated with only one bar code, a handful (26,900, or <1%) have more than one bar code over their checkout history at the SPL.

My final project is a visualization to investigate the characteristics of these anomalous items. Are there temporal patterns corresponding to the date(s) at which items acquire new bar codes? Does this depend on item type or item location (Central library branch or other branch)? To investigate this I will visualize items’ patterns of bar code behavior over time, highlighting the time at which each item acquires a new barcode.
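As a rough sketch of the underlying bookkeeping, assuming the check-out records have been exported as (itemNumber, barcode, date) tuples with ISO-formatted date strings: the function name `barcode_changes` and the sample records are hypothetical and not taken from the project files.

```python
from collections import defaultdict

def barcode_changes(records):
    """Return items with more than one barcode, with the first date each barcode appears.

    records: iterable of (item_number, barcode, date) where date is an ISO
    string such as "2006-01-05", so string comparison matches date order.
    """
    first_seen = defaultdict(dict)  # item_number -> {barcode: earliest date}
    for item, barcode, date in records:
        seen = first_seen[item]
        if barcode not in seen or date < seen[barcode]:
            seen[barcode] = date
    anomalies = {}
    for item, seen in first_seen.items():
        if len(seen) > 1:
            # Sort barcodes by the date they were first observed.
            anomalies[item] = sorted(seen.items(), key=lambda kv: kv[1])
    return anomalies

# Hypothetical sample records (not real SPL data).
records = [
    (1, "A", "2006-01-05"),
    (1, "A", "2006-03-01"),
    (1, "B", "2007-02-10"),
    (2, "C", "2006-05-05"),
]
result = barcode_changes(records)
```

The dates in the result are the points at which an item acquires a new barcode, which is what the visualization sets out to highlight.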
Attachments
currier_proj4_final.pdf
(1.17 MiB) Downloaded 747 times
currier_proj4_color.zip
(207.23 KiB) Downloaded 605 times
final_images.zip
(3.3 MiB) Downloaded 1895 times
DAYS360_3_or_more.csv
(21.02 KiB) Downloaded 1947 times
Last edited by currier on Thu Mar 13, 2014 5:10 pm, edited 2 times in total.

grant.mckenzie
Posts: 4
Joined: Tue Jan 14, 2014 11:47 am

Re: Proj 4: 3D Interactive Final Project

Post by grant.mckenzie » Wed Mar 05, 2014 2:25 pm

From a research perspective, one of the most fascinating aspects of a dataset such as the Seattle Public Library check-outs is exploring the similarities between items. By similarity, I mean the ways in which two items are similar or dissimilar. For example, do the authors of two books use similar language? Is the subject matter similar? Do the topics or themes discussed in the books flow along the same lines? Given this concept of topic or theme similarity, I've always thought it would be very interesting to visualize data points (books in this case) in thematic space. In this case the placement of an item in a three-dimensional space would be based on its similarity to other items in the same set. For example, two children's books on the topic of dogs would be placed very close together in space, while a book on Cold War politics would be placed much farther away.

One statistical method for approaching this idea is Multi-Dimensional Scaling (MDS). MDS takes a series of attributes for each item in a dataset and uses these attributes to compute a location in N-dimensional space. Items that are more similar are clustered together while those dissimilar are placed further apart. The number of dimensions “N” is up to the user to determine and in this case I have chosen to represent the data in three dimensions.
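To make the idea concrete, here is a minimal pure-Python sketch of one common way to compute an MDS layout, the SMACOF iteration (repeated Guttman transforms). The post does not say which MDS variant or software was used, so this choice and the helper names `initial_layout`, `smacof_step`, `stress`, and `mds` are assumptions for illustration only. It takes a symmetric matrix of target distances and returns 3D coordinates.

```python
import math
import random

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def stress(X, D):
    """Raw stress: squared mismatch between layout distances and target distances."""
    n = len(X)
    return sum((euclid(X[i], X[j]) - D[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def initial_layout(n, dims=3, seed=0):
    """Random starting coordinates; seeded so runs are repeatable."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]

def smacof_step(X, D):
    """One Guttman transform; it never increases the raw stress."""
    n, dims = len(X), len(X[0])
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = euclid(X[i], X[j])
                if d > 1e-12:          # skip coincident points
                    B[i][j] = -D[i][j] / d
        B[i][i] = -sum(B[i])           # diagonal still 0.0 when summed
    return [[sum(B[i][k] * X[k][c] for k in range(n)) / n
             for c in range(dims)] for i in range(n)]

def mds(D, dims=3, iterations=200, seed=0):
    X = initial_layout(len(D), dims, seed)
    for _ in range(iterations):
        X = smacof_step(X, D)
    return X

# Hypothetical example: four items that are all one unit apart.
D = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
coords = mds(D)
```

For 5,000 books this double loop would be slow; a library implementation would be the practical choice, and the sketch is only meant to show what the method does.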

The basis for MDS will be a series of attributes related to each item. These attributes will be based on descriptive data pertaining to each item. Given textual descriptions of each book, for example, a latent Dirichlet allocation (LDA) model can be run, resulting in a finite set of topics. LDA is an unsupervised, generative topic model that approaches text as a bag-of-words. The co-occurrence of words in a specific document and across documents produces a set of topics from which each original document (a book, in this case) can be defined. In essence, each book will be given a unique distribution of topic probabilities. This distribution of topics can then be compared to every other distribution of topics (through a Euclidean distance measure, for example), producing a matrix of similarity values.
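The comparison step can be sketched as follows, assuming each book is already represented by a list of topic probabilities of equal length. The function names and the sample distributions are hypothetical. Note that a Euclidean distance is a dissimilarity: smaller values mean more similar books.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two topic distributions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(topic_distributions):
    """Symmetric matrix of pairwise distances, with zeros on the diagonal."""
    n = len(topic_distributions)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = euclidean(topic_distributions[i], topic_distributions[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix

# Hypothetical two-topic distributions for three books.
dists = distance_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```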

Data
The first step in collecting data for this project is to determine the number and type of items that should be explored. Given the vast amount of data it is sensible to reduce the study dataset. For this project I chose to take the 5,000 most checked-out books from the Seattle Public Library. The query to get the titles, bibNumbers and check-out counts for these books is listed below.

Code:

SELECT bibNumber, title, count(*) AS ct
FROM public.spl2
WHERE substring(title,3,2) = 'bk'
GROUP BY bibNumber, title
ORDER BY ct DESC
LIMIT 5000;
After exporting these results in CSV format, the next step is to gather more descriptive content about the specific books. To do this I chose to explore the Google Books API. The title of each book is queried against the Google Books database, subject to a limit of 1,000 requests a day, and metadata about each book is returned. The following information can be accessed and downloaded:
  • Direct Google Books Item link
  • Title
  • Authors
  • Publisher
  • Published Date
  • Description
  • ISBN
  • Page Count
  • Categories
  • Thumbnail
  • Language
  • Text Snippet
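A hedged sketch of how such a lookup might be written against the public Google Books volumes endpoint. The post does not show the actual requests, so the `intitle:` query form, the helper names, and the mapping of response fields to the list above are assumptions based on the v1 volumes response format.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/books/v1/volumes"

def build_query_url(title, max_results=1):
    """Build a title search URL (assumed query form: intitle:<title>)."""
    return API_URL + "?" + urlencode({"q": "intitle:" + title,
                                      "maxResults": max_results})

def extract_metadata(item):
    """Pull the fields listed above out of one volume record, if present."""
    info = item.get("volumeInfo", {})
    return {
        "link": info.get("infoLink"),
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "publisher": info.get("publisher"),
        "published_date": info.get("publishedDate"),
        "description": info.get("description"),
        "isbn": [i.get("identifier")
                 for i in info.get("industryIdentifiers", [])
                 if i.get("type", "").startswith("ISBN")],
        "page_count": info.get("pageCount"),
        "categories": info.get("categories", []),
        "thumbnail": info.get("imageLinks", {}).get("thumbnail"),
        "language": info.get("language"),
        "text_snippet": item.get("searchInfo", {}).get("textSnippet"),
    }

def fetch_first_match(title):
    """Query the API for one title; returns None when nothing matches."""
    with urlopen(build_query_url(title)) as response:
        items = json.load(response).get("items", [])
    return extract_metadata(items[0]) if items else None
```

With a daily request limit, the 5,000 titles would need to be spread over several days, so caching each response to disk as it arrives is probably worthwhile.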
Please read the PDF for further details and Concept Doodle.
Attachments
McKenzie_projFinal.pdf
(1.11 MiB) Downloaded 611 times
Last edited by grant.mckenzie on Mon Mar 10, 2014 8:44 am, edited 1 time in total.

m_uppal
Posts: 4
Joined: Tue Jan 14, 2014 11:43 am

Re: Proj 4: 3D Interactive Final Project

Post by m_uppal » Thu Mar 06, 2014 4:03 pm

Concept: The goal of this project is to understand the emergence of entrepreneurship and individual businesses before and after the recession. A study reported by Forbes in 2013 concluded that U.S. entrepreneurship reached its highest level since 1999 in 2013.
3D visualization will be used to depict the situation for each year, month, day, and hour before and after the recession (2007).

Attached are the link to the document and the CSV files.
After studying the counts for each year, each book was taken into account by its frequency, that is, the total number of times the book has been checked out.
Sample query:
SELECT year(cout) AS year, month(cout) AS month, hour(cout) AS hour, count(*) AS count
FROM spl2.inraw
WHERE (title LIKE "%entrepreneur%" OR title LIKE "%self business%" OR title LIKE "%own business%" OR title LIKE "%business%")
  AND (year(cout) < 2008 AND year(cout) > 2005)
GROUP BY month(cout), year(cout), hour(cout)
ORDER BY year(cout), month(cout), hour(cout);
Attachments
Doodle-Updated1.jpg
entre3.csv
(5.44 KiB) Downloaded 601 times
entre2.csv
(11.73 KiB) Downloaded 596 times
entre.txt.csv
Before 2008
(12.56 KiB) Downloaded 590 times
FinalProjectProposal[small].pdf
(134.6 KiB) Downloaded 725 times
Last edited by m_uppal on Thu Mar 13, 2014 12:26 pm, edited 7 times in total.

hellobuaazl
Posts: 4
Joined: Tue Jan 14, 2014 11:54 am

Re: Proj 4: 3D Interactive Final Project

Post by hellobuaazl » Sun Mar 09, 2014 5:12 pm

Basically, my idea is to visualize the top 100 cities in the world, ranked by their checked-out books at the SPL, and to predict how hot these cities will be in the following year.
Attachments
3D visualiztion proposal.pdf
(170.3 KiB) Downloaded 1217 times

mohithingorani
Posts: 5
Joined: Tue Jan 14, 2014 11:46 am

Re: Proj 4: 3D Interactive Final Project

Post by mohithingorani » Mon Mar 10, 2014 9:52 pm

Concept:
Computer science is a very dynamic field; it has been changing rapidly since its inception and will continue to do so. The majority of computer code is written in so-called ‘programming languages’, which are essentially a way of instructing the computer to perform a set of operations. Over the years these programming languages have evolved; some caught on, some did not. Seattle, with heavyweight companies like Microsoft and Amazon, can possibly be a good measure of this. I queried for the most popular languages: Java, C/C++, Objective-C, PHP, Python, Ruby, JavaScript, SQL, and Perl.
I may include HTML and CSS for comparison, though they are not exactly programming languages.

I will be looking into data from (2005-2013): an 8-year period.

Design:
The design has been inspired by Circos (circos.ca), a data visualization tool used to show relationships. I intend to extrude the circle into a spiral, thereby using the third dimension to represent time. I have chosen bright blue for the spiral staircase, with fine lines to demarcate the days. Each step represents a month. The background is black.


Color Scheme:
Blue & Black
Attachments
Chasing PL.pdf
(409.09 KiB) Downloaded 1288 times

mohithingorani
Posts: 5
Joined: Tue Jan 14, 2014 11:46 am

Re: Proj 4: 3D Interactive Final Project

Post by mohithingorani » Tue Mar 18, 2014 10:23 am

Extruding Circos
By Mohit Hingorani
Concept:
Computer science is a very dynamic field; it has been changing rapidly since its inception and will continue to do so. The majority of computer code is written in so-called ‘programming languages’, which are essentially a way of instructing the computer to perform a set of operations. Over the years these programming languages have evolved; some caught on, some did not. Seattle, with heavyweight companies like Microsoft and Amazon, can possibly be a good measure of this. I queried for the most popular languages: Java, C/C++, Objective-C, PHP, Python, Ruby, JavaScript, SQL, Perl, and Lisp.
I may include HTML and CSS for comparison, though they are not exactly programming languages.

I will be looking into data from (2005-2013): an 8-year period.

Design:
The design has been inspired by Circos (circos.ca), a data visualization tool used to visualize relationships. I intend to extrude the circle into a spiral, thereby using the third dimension to represent time. I have chosen bright blue for the spiral staircase, with fine lines to demarcate the days. Each step represents a month. The background is black. I have color-coded each language, and each book loan (cin and cout) is represented by an arc of that color. Since time runs upward, the check-out sits lower on the spiral than the check-in. The height of the arc is governed by how long the book was checked out.
I have used ControlP5 for adding a graphical user interface for interaction and used peasycam for exploration in 3D space.
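The spiral mapping described above can be sketched as a simple helix formula. This is an illustration, not the code in the attached zip: the function name, the radius, the rise per step, and the choice of one full turn per twelve months are all assumptions, since the post only states that each step represents a month.

```python
import math

def spiral_position(month_index, radius=100.0, rise_per_month=10.0,
                    months_per_turn=12):
    """Map a month index (0 = first month of the data) to a point on a helix.

    x and y trace a circle (assumed: one full turn per 12 months),
    while z grows with time so later months sit higher up.
    """
    angle = 2 * math.pi * (month_index % months_per_turn) / months_per_turn
    return (radius * math.cos(angle),
            radius * math.sin(angle),
            rise_per_month * month_index)

start = spiral_position(0)        # first month
one_year_on = spiral_position(12)  # same angle, one turn higher
```

An arc for a single loan would then connect the position for its check-out date to the position for its check-in date, with the same month of successive years stacked vertically.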

This visualization allows individuals who are not familiar with the SPL to interact with the data, explore the trends in programming languages, and draw their own conclusions. The aim is not to visualize anomalous data but to educate people on current trends and possible predictions.
Attachments
EXTRUDINGCIRCOS3.zip
zip file with everything in it
(6.64 MiB) Downloaded 544 times
Screen Shot 2014-03-18 at 10.50.25 AM.png
screenshot
Chasing PL2.pdf
PDF
(848.08 KiB) Downloaded 666 times
