Proj 3 - Student Defined Visualization Project

Professor George Legrady
glegrady
Posts: 203
Joined: Wed Sep 22, 2010 12:26 pm

Proj 3 - Student Defined Visualization Project

Post by glegrady » Wed Jan 19, 2022 12:54 pm

Proj 3 - Student Defined Visualization Project

Final Project Schedule
Feb 15 - Schedule review, examples of previous final projects - Students do research about data, MySQL, visualization examples
Feb 17 - Class presentation of examples, Association Rule-Learning [FP Tree Algorithm] demo
--
Feb 22 - Lab and individual meetings
Feb 24 - Lab and individual meetings
--
Mar 01 - Work-in-Progress group presentation
Mar 03 - Overview of online project documentation template for vislab.mat.ucsb.edu website
--
Mar 08 - Final Project Class Presentation
Mar 10 - Final Project Class Presentation & Online Documents
--
Mar 15 - Latest to submit online documents

--------------
DETAILS
The final project integrates everything learned through the course, such as asking an interesting MySQL question, working with large data, and visualizing the data through Processing. The big difference is that each student selects their own data and visualization design. The data must be multi-dimensional and the visualization has to be in 3D.

Your first task is to identify and select your data. Some of you may already be working with datasets in your studies; this is an opportunity to try an alternative visualization of that data. Key current research topics of interest include environmental data, political data, news language analysis, biodiversity, etc. One of the challenges of data visualization is that it takes time to learn what the data can do, so continuing to work with the Seattle Library data is also an option.

Project evaluation criteria: significant effort toward innovative approaches in data content, data sampling, and analysis is key to a successful project. Data can also be correlated across multiple sources. The visualization environment to be used is Processing, but data pre-processing can be done in other software such as Python or R. In fact, Processing also offers Python and R modes, so these are possibilities as well.

The data should be relevant and granular, meaning there should be a significant density of data to visualize in 3D space. Each data point's x, y, z position should be directly defined by the data's values. The preference is for the data to determine the visual form, rather than matching data to an existing form; for instance, a geographic map has a pre-determined visual/spatial organization.
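As a minimal sketch of what "position directly defined by the data's values" can look like in a pre-processing step, the snippet below min-max normalizes three chosen columns into x, y, z coordinates (the column choice and the [-200, 200] output range are only illustrative assumptions, not part of the assignment):

```python
def min_max(values):
    """Normalize a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo if hi != lo else 1.0
    return [(v - lo) / span for v in values]

def to_xyz(rows, scale=200.0):
    """Map three data columns directly to x, y, z positions.

    `rows` is a list of (a, b, c) tuples; each normalized value is
    rescaled to [-scale, scale] for plotting in a 3D Processing sketch.
    """
    cols = list(zip(*rows))
    norm = [min_max(col) for col in cols]
    return [tuple(2 * scale * v - scale for v in point)
            for point in zip(*norm)]

# Toy example: three records with three numeric features each.
points = to_xyz([(0.0, 10.0, 5.0), (1.0, 20.0, 15.0), (2.0, 30.0, 25.0)])
```

The same idea transfers directly to a Processing sketch's `map()` calls once the columns are loaded.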

The project should reveal an understanding of how to use spatial relationships, color coding, interaction methods, and all the features of visual language basics covered in the previous demos and projects.

--------------
George Legrady
legrady@mat.ucsb.edu

sdinulescu
Posts: 3
Joined: Fri Jan 07, 2022 12:30 pm

Re: Proj 3 - Student Defined Visualization Project

Post by sdinulescu » Thu Feb 17, 2022 11:38 pm

Work 03/12 [StejaraDinulescu_Final_V6]:

Updated color palette using a color picker tool that uses machine learning to generate visually distinctive colors --> https://mokole.com/palette.html.

Implemented a drag feature that slows movement every frame based on the value of the selected feature (letters with higher values will travel faster but have more drag).
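A sketch of the kind of per-frame drag described above (the exact scaling of drag with the feature value is my assumption, not code from the attached sketch):

```python
def step(position, velocity, feature_value, base_drag=0.02):
    """Advance an agent one frame: move, then damp velocity.

    Agents with higher feature values get a larger drag coefficient,
    so faster letters also slow down more each frame.
    """
    position = [p + v for p, v in zip(position, velocity)]
    drag = base_drag * feature_value           # drag grows with the feature
    velocity = [v * (1.0 - drag) for v in velocity]
    return position, velocity

pos, vel = [0.0, 0.0, 0.0], [1.0, 2.0, 0.0]
pos, vel = step(pos, vel, feature_value=5.0)   # drag of roughly 0.1
```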
StejaraDinulescu_Final_V6.zip
(531.6 KiB) Downloaded 52 times
------------------------------------------------------------

Work 03/09-10 [StejaraDinulescu_Final_V5]:

Fixed bugs with the connecting lines that were found during the in-class presentation.
StejaraDinulescu_Final_V5.zip
(531.08 KiB) Downloaded 50 times
------------------------------------------------------------

Work 03/04-07 [StejaraDinulescu_Final_V4]:

Updated color mappings for letters of the alphabet and trial number

Building geometries for starting alphabet and input letters

The screenshots show the resulting geometries for the words "hello" and "superior".
StejaraDinulescu_Final_V4.zip
(530.9 KiB) Downloaded 53 times
------------------------------------------------------------

Work 02/26-03/01 [StejaraDinulescu_Final_V3]:

The visualized feature (driving the position and movement of the data points) is now selected via a controlP5 dropdown menu. I have implemented 5 of the 24 features, but I want to see if I can use PCA or another data-compression method to make this more interesting than a dropdown menu containing 24 visualization options.

Textual input via controlP5 enables users to visualize the features extracted from tactile sign gestures that occur when spelling out the input word.
StejaraDinulescu_Final_V3.zip
(530.44 KiB) Downloaded 57 times
------------------------------------------------------------

Work 02/24 [StejaraDinulescu_Final_V2]:

1. Positions of agents are now mapped to the screen based on their Mean Absolute Deviation feature values:
X = Mean Absolute Deviation computed from sensor 1,
Y = Mean Absolute Deviation computed from sensor 2,
Z = Mean Absolute Deviation computed from sensor 3.

2. Agents move with constant velocity, computed from their peak-to-peak feature value (signal absolute max - signal absolute min):
X = Peak-to-Peak computed from sensor 1,
Y = Peak-to-Peak computed from sensor 2,
Z = Peak-to-Peak computed from sensor 3

3. Every frame, forces are computed for each agent in the particle system (using the formula force = mass * acceleration).
- The mass of each agent is its Mean Absolute Deviation feature value, computed from sensor 4
- The acceleration of each agent equals the Spectral Centroid feature values:
X = Spectral Centroid computed from sensor 1,
Y = Spectral Centroid computed from sensor 2,
Z = Spectral Centroid computed from sensor 3.

TO DO: implement a drag force (force of friction) on the agent, based on another feature value
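The per-frame update in steps 1-3 above can be sketched roughly as follows (a simplified stand-in with unit time steps; the drag term from the TO DO is omitted, and the feature wiring is as described above):

```python
def update_agent(pos, vel, mass, spectral_centroid):
    """One frame of the agent update: force = mass * acceleration
    (step 3), then Euler-integrate velocity and position.

    `mass` is the agent's Mean Absolute Deviation from sensor 4 and
    `spectral_centroid` is its per-axis acceleration.
    """
    force = [mass * a for a in spectral_centroid]   # F = m * a
    acc = [f / mass for f in force]                 # recover a to integrate
    vel = [v + a for v, a in zip(vel, acc)]
    pos = [p + v for p, v in zip(pos, vel)]
    return pos, vel

p, v = update_agent([0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
                    mass=2.0, spectral_centroid=[0.5, 0.0, 0.0])
```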

Conceptual TO DOs:
- Figure out when agents should be stationary versus when they should move, based on user interface
- Tailor the color to tease apart which letter and trial number is represented by each agent
StejaraDinulescu_Final_V2.zip
(529.81 KiB) Downloaded 54 times
------------------------------------------------------------

Work 02/22 [StejaraDinulescu_Final_V1]:

- Loaded my tactile sign language data into Processing
- Created an "Agent" class to store features and agent properties
- Populated agent parameters from the data
(hue is mapped to the letter, from A-Z, and saturation to the trial number, from 0-40)
- Visualized agents at random starting positions with a default size
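The letter/trial color mapping can be sketched like this (assuming Processing's `colorMode(HSB, 255)` ranges; the exact ranges in the attached sketch may differ):

```python
def letter_trial_color(letter, trial, n_trials=40):
    """Map a letter A-Z to hue and a trial index 0-39 to saturation.

    Returns (hue, saturation) on a 0-255 scale, as in colorMode(HSB, 255).
    """
    letter_index = ord(letter.upper()) - ord('A')   # 0..25
    hue = 255.0 * letter_index / 25                 # spread A-Z over the hue wheel
    saturation = 255.0 * trial / (n_trials - 1)     # later trials more saturated
    return hue, saturation

h, s = letter_trial_color('A', 0)
```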
StejaraDinulescu_Final_V1.zip
(529.57 KiB) Downloaded 56 times
------------------------------------------------------------

Concept/Description:
I would like to create data-driven sculptural forms based on a dataset from my recent IEEE Haptics Symposium 2022 conference paper on a haptic device for supporting tactile interaction and communication (such as transcription or translation of Tactile Sign Languages, known as TSLs).

Data: This dataset consists of 96 features extracted from captured RMS acceleration signals from 40 repetitions of all 26 letters of the Deafblind Manual alphabet, an Australian TSL --> [ 96 x 40 repetitions x 26 letters = 99840 cells of data ].

System Overview: A user interacts with the system by typing in a word and seeing a geometric form emerge from the motion of data points through space. Each row of data (one trial of one letter) contains one value for each of the 96 features, placing it in a 96-dimensional coordinate system (when each feature is considered a dimension or axis). I will then use PCA or t-SNE to compress my 96-dimensional feature data down to 3 axes (X, Y, Z) to visualize in Processing. The combination of letters in the input word makes up the geometry of the resulting form, where multiple instances of a letter exert greater influence of that letter on the structure/architecture. In this way, the "genotype" of feature values yields a specific "phenotype" (a generated 3D form) for each letter; used as building blocks and chained together into words, these become larger forms. Communication via the tactile sense becomes digitally inscribed into geometry.
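As a sketch of the planned dimensionality reduction (plain PCA via numpy's SVD on a synthetic stand-in matrix; the actual project may use scikit-learn's PCA or t-SNE instead):

```python
import numpy as np

def pca_3d(features):
    """Project an (n_samples, 96) feature matrix onto its top 3
    principal components, giving x, y, z coordinates per row."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)                     # center each feature
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:3].T                        # (n_samples, 3)

# Toy stand-in for the 1040-row (40 trials x 26 letters) x 96-feature matrix.
rng = np.random.default_rng(0)
coords = pca_3d(rng.normal(size=(1040, 96)))
```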

In-Class Presentation Link:
https://docs.google.com/presentation/d/ ... sp=sharing

References:
https://www.zhangweidi.com/vov
https://www.stejarasart.com/memoriae
https://www.karlsims.com/evolved-virtua ... raph94.pdf
https://www.pnas.org/content/101/14/4728
https://www.wolframscience.com/nks/note ... -automata/
https://jsantell.com/l-systems/
https://drive.google.com/file/d/1l4Tah5 ... sp=sharing --> accepted IEEE Haptics Symposium 2022 Conference Paper
Attachments
Screen Shot 2022-03-12 at 10.04.25 PM.png
Screen Shot 2022-03-12 at 10.03.50 PM.png
Screen Shot 2022-03-12 at 10.03.33 PM.png
Screen Shot 2022-03-12 at 10.03.24 PM.png
Screen Shot 2022-03-07 at 11.32.07 PM.png
Screen Shot 2022-03-07 at 11.31.47 PM.png
Screen Shot 2022-03-07 at 11.31.34 PM.png
Screen Shot 2022-03-07 at 11.31.07 PM.png
Last edited by sdinulescu on Sat Mar 12, 2022 10:05 pm, edited 14 times in total.

zijianwan
Posts: 3
Joined: Fri Jan 07, 2022 12:32 pm

Re: Proj 3 - Student Defined Visualization Project

Post by zijianwan » Tue Feb 22, 2022 1:55 pm

Concept
Quote from a paper I recently submitted: "As movement is arguably the most significant way by which animals respond to changes in their surrounding environment, it can serve as an instrument for environmental response and understanding movement can elucidate the relations between environmental drivers of animal behavior and demography (Eikelboom et al., 2020; Nathan et al., 2008)."
In this project, I would like to vividly visualize how animals interact with the surrounding environment (i.e., how environmental variables might influence the movement path selection) using the turkey vulture migration example.

Data
The raw tracking datasets and environmental annotations were obtained from Movebank at https://www.doi.org/10.5441/001/1.f3qt46r2 (Bildstein et al., 2021). The tracking dataset was preprocessed and clustered to obtain the dataset for this visualization. As the paper is not yet published, I cannot share the code for preprocessing or clustering, but the datasets resulting from those steps (attached) are as follows:
Environment sample dataset: a series of systematically sampled points, each with a series of environmental variables corresponding to that location
17112 sample points in total
Data_Sample_Pts.png
Turkey vulture dataset: tracking points recorded. Note that in the field "cluster_id", 10000+ represents fall migration and 20000+ represents spring migration.
16402 tracking points in total
Data_Track_Pts.png
Results
Scrennshot1.png
Buttons.png
Trajs_Terrains.png
References
Visualization examples
Daniel Shiffman Smoke particle system https://processing.org/examples/smokepa ... ystem.html
Braun, E., Düpmeier, C., Mirbach, S., & Lang, U. (2017, May). 3D Volume Visualization of Environmental Data in the Web. In International Symposium on Environmental Software Systems (pp. 457-469). Springer, Cham.

Research on movement and environment
Bildstein, K. L., Barber, D., Bechard, M. J., Graña Grilli, M., & Therrien, J. (2021). Data from: Study “Vultures Acopian Center USA GPS” (2003-2021). Movebank data repository. https://doi.org/doi:10.5441/001/1.f3qt46r2
Eikelboom, J. A. J., de Knegt, H. J., Klaver, M., van Langevelde, F., van der Wal, T., & Prins, H. H. T. (2020). Inferring an animal’s environment through biologging: quantifying the environmental influence on animal movement. Movement Ecology, 8(1), 1–18. https://doi.org/10.1186/s40462-020-00228-4
Nathan, R., Getz, W. M., Revilla, E., Holyoak, M., Kadmon, R., Saltz, D., & Smouse, P. E. (2008). A movement ecology paradigm for unifying organismal movement research. 105(49), 19052–19059.
Attachments
Wan_Proj3_Final.zip
(1.85 MiB) Downloaded 49 times
Sample_Pts_wEnv_TailCross.csv
(3.42 MiB) Downloaded 58 times
Clustered_Pts_Lat30N.csv
(1.66 MiB) Downloaded 57 times
Last edited by zijianwan on Sun Mar 13, 2022 5:00 pm, edited 3 times in total.

zilongliu
Posts: 3
Joined: Fri Jan 07, 2022 12:32 pm

Re: Proj 3 - Student Defined Visualization Project

Post by zilongliu » Thu Feb 24, 2022 11:55 am

Concept
I am interested in visualizing entities with temporal scopes (in fact, almost every entity has a temporal scope) in knowledge graphs. Therefore, I chose KnowWhereGraph (http://knowwheregraph.org/), a constantly growing knowledge graph with rich spatial and temporal information. My first attempt was to follow the tutorial here (https://www.youtube.com/watch?v=WEBOTRboXBE) to rasterize a 2D hazard image (e.g., fire) into a 3D image, given that I want to focus on hazard-related entities and use visualization techniques that carry metaphors. However, this idea drew the concern that the shape would be too determined by the data, leaving the form too fixed.

Then, inspired by the 3D rain simulation here (https://discourse.processing.org/t/simp ... tion/10834), I realized that by controlling the velocity I can visualize entities with different temporal scopes. I also plan to visualize the spatial information of a droplet (i.e., an entity in this visualization project) by showing that information in the splash it causes when falling onto the ground.
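One possible wiring of temporal scope into the rain metaphor (this is my guess at the mapping, not code from the attached sketch) is to let entities with longer temporal scopes fall more slowly:

```python
def fall_speed(scope_days, min_speed=1.0, max_speed=10.0, max_scope=365.0):
    """Map an entity's temporal-scope length (in days) to a droplet
    fall speed: short-lived events fall fast, long-lived ones drift."""
    t = min(scope_days, max_scope) / max_scope      # 0..1
    return max_speed - t * (max_speed - min_speed)

fast = fall_speed(0.0)      # an instantaneous earthquake event
slow = fall_speed(365.0)    # a year-long temporal scope
```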

Data
The data can be retrieved from this endpoint: https://stko-kwg.geog.ucsb.edu/graphdb/sparql using SPARQL, a query language similar to SQL but tailored to information retrieval in knowledge graphs. A sample query that retrieves all earthquakes, wildfires, and hurricanes along with their temporal and spatial information is included below. The data repository is KWG-V3.

Code: Select all

PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?hazard ?hazard_label ?hazard_type ?time ?time_label ?location ?location_label
{
    ?hazard kwg-ont:locatedIn ?location;
            sosa:isFeatureOfInterestOf/sosa:phenomenonTime | kwg-ont:hasTemporalScope ?time;
            rdfs:label ?hazard_label;
            rdf:type ?hazard_type.
    ?location rdfs:label ?location_label.
    ?time rdfs:label ?time_label.
    values ?hazard_type {kwg-ont:EarthquakeEvent kwg-ont:Hurricane kwg-ont:Wildfire}
} order by ?time_label
The retrieved data can be found here
query-results-statement-rain.zip
(998.67 KiB) Downloaded 55 times
References
All references have been attached in the sections above.
Attachments
ZilongLiu_KWGHazardEntityRainFall.zip
(1.99 MiB) Downloaded 48 times
Last edited by zilongliu on Sun Mar 13, 2022 1:28 pm, edited 4 times in total.

yifei_liu
Posts: 3
Joined: Fri Jan 07, 2022 12:31 pm

Re: Proj 3 - Student Defined Visualization Project

Post by yifei_liu » Thu Feb 24, 2022 1:22 pm

Concept
I am interested in visualizing the trajectories of moving agents; however, the tiger GPS data that I use in my research is confidential, so I decided to continue using the Harry Potter data. I would like to show how every book moves from its check-out date to its check-in date and then jumps to the next check-out date. The trajectory each agent is currently creating is shown in a bright color, while past trajectories are shown in gray. Each agent looks like a swimming tadpole. The moving speed and size of the point depend on how long the borrowed book is kept.

Data
In Project 2 I summarized the total number of Harry Potter books in each category that were borrowed each day. In this project I trace the check-out and check-in dates of each book according to its itemNumber (a unique id for every book). The organized Harry Potter data is attached below.

References
https://github.com/move-ucsb/DynamoVis

Process
I first created a box whose x, y, and z axes are year (2005-2022), day (Jan 1 to Dec 31), and time (00:00-23:59), and plotted all the book records into the box as points, with 12 colors for the categories. The colors are the same as those used in Project 2. The points look like a cloud in the box. Then I use the PVector class to make each point and its trajectory move with time. The current trajectory is shown in a bright, fully opaque color, while the past trajectory is gray and partly transparent according to how long the book has been kept.
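The bright-current / gray-past trajectory treatment might be sketched as follows (the alpha falloff and the cutoff for switching to gray are illustrative assumptions):

```python
def segment_style(age, max_age=100.0):
    """Style for one trajectory segment, given its age in frames.

    Age 0 (the current segment) is fully bright and opaque; older
    segments fade toward transparent gray.
    """
    t = min(age, max_age) / max_age       # 0 = current, 1 = oldest
    alpha = 255.0 * (1.0 - t)             # fade out with age
    gray = t > 0                          # past segments drawn in gray
    return gray, alpha

is_gray, alpha = segment_style(0)         # current segment: bright, opaque
```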
process_one.png
process_two.png
process_three.png
The third step was to add interaction to the project. I added a slider to control the progress of the movement (time), and another slider to control the speed (how fast the points move). The number of books currently on loan (Active Book Numbers) and the total checkout count at the current time are shown at the bottom left corner, grouped by the 12 categories.
The point size is modified according to how long the book is kept. Thus, large points move slowly, while small points move and vanish faster.
final_one.png
final_two.png
Attachments
Yifei Liu_Project3_Harry Potter V4.zip
(2.17 MiB) Downloaded 41 times
Last edited by yifei_liu on Thu Mar 10, 2022 10:46 pm, edited 9 times in total.

jiaxinwu
Posts: 3
Joined: Fri Jan 07, 2022 12:21 pm

Re: Proj 3 - Student Defined Visualization Project

Post by jiaxinwu » Thu Feb 24, 2022 1:44 pm

Data
I got the Museum of Modern Art Collection dataset on Kaggle https://www.kaggle.com/momanyc/museum-collection. The MoMA dataset includes two datasets: the artworks dataset and the artists dataset.
The artworks dataset contains 130,262 records, representing all of the works that have been accessioned into MoMA's collection and cataloged in its database. It includes basic metadata for each work, including title, artist, date, medium, dimensions, and the date acquired by the Museum. The artists dataset contains 15,091 records, representing all the artists who have work in MoMA's collection and have been cataloged in its database. It includes basic metadata for each artist, including name, nationality, gender, birth year, and death year.

I merged the two datasets and did some basic data preprocessing. Then I used LabelEncoder to map the strings in the data to integers. After that, I used the t-SNE algorithm https://scikit-learn.org/stable/modules ... .TSNE.html to reduce the dimensionality of the data to 3D, calculating the x, y, z coordinates of each artwork and artist.
Then I link the coordinates of each artwork to those of its artist. The black points represent artists and the red points represent artworks. Specifically, I use different shades of red https://image.baidu.com/search/detail?c ... IsOQ%3D%3D to represent artworks from different departments.
I visualize the movement of each artwork to show how it was created by its artist.
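The string-to-integer step above uses scikit-learn's LabelEncoder; a dependency-free sketch of the same idea (LabelEncoder assigns codes in sorted order of the unique values):

```python
def label_encode(values):
    """Map each distinct string to an integer code, mirroring
    sklearn.preprocessing.LabelEncoder's sorted-order assignment."""
    classes = sorted(set(values))
    code = {c: i for i, c in enumerate(classes)}
    return [code[v] for v in values], classes

# Hypothetical department strings, for illustration only.
encoded, classes = label_encode(["Painting", "Drawing", "Painting", "Print"])
```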
Attachments
3D Visualization for MoMA Dataset.zip
(26.49 MiB) Downloaded 45 times
screenshot2.png
screenshot.png
Last edited by jiaxinwu on Thu Mar 03, 2022 3:06 pm, edited 2 times in total.

ziyu_zhang309
Posts: 3
Joined: Fri Jan 07, 2022 12:34 pm

Re: Proj 3 - Student Defined Visualization Project

Post by ziyu_zhang309 » Thu Feb 24, 2022 1:51 pm

Concept
The topic of my final project is analyzing the environment and economy of the countries people most want to immigrate to, specifically CO2 emissions, population change, and GDP development. By comparing these three aspects, we can find the healthiest, greenest, and most economy-friendly country best suited for immigration. My original plan is to create a different form for each aspect: dots and lines for the change in CO2 emissions, and polygons for the corresponding population and GDP increases.

Data
I selected the top 22 countries that people most want to immigrate to, according to this website: https://www.weforum.org/agenda/2017/11/ ... o-move-to/ and sorted out the above data from 1970 to 2020.
data_01.xlsx
(1.65 MiB) Downloaded 51 times
Screen shot
03.png
01.png
02.png
04.png

Description
The lines in the first picture show the distribution of the population and GDP of each country from 1970 to 2018, in a coordinate system with year as the x coordinate, population increase as the z coordinate, and GDP as the y coordinate. Figures 2 to 4 show CO2 emissions and energy consumption per GDP for 18 countries. The size of a dot represents the country's carbon dioxide emissions in that year: the larger the dot's radius, the higher the emissions per GDP. The line segments representing energy consumption work the same way. The color order follows the ranking in the list on the website linked above.
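Since a dot's radius encodes emissions, one common design choice (an assumption on my part, not necessarily what the attached sketch does) is to scale radius by the square root of the value, so that dot area rather than radius is proportional to emissions:

```python
import math

def dot_radius(value, max_value, max_radius=20.0):
    """Radius such that dot *area* is proportional to the data value,
    avoiding the visual overstatement that linear radius scaling causes."""
    return max_radius * math.sqrt(value / max_value)

r = dot_radius(0.25, 1.0)   # a quarter of the max value gives half the radius
```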

Project
ZiyuZhang_FinalProject.zip
(27.26 KiB) Downloaded 31 times
Last edited by ziyu_zhang309 on Sun Mar 13, 2022 5:01 pm, edited 2 times in total.

senyuan
Posts: 3
Joined: Tue Jan 11, 2022 10:44 am

Re: Proj 3 - Student Defined Visualization Project

Post by senyuan » Thu Feb 24, 2022 1:59 pm

Russell Liu - Final project

Concept
I will focus on the air temperature of 28 cities in the United States. I acquired the air temperature data (updated every hour) and the city data, covering 2012 to 2017, from Kaggle https://www.kaggle.com/selfishgene/hist ... /version/2
There are three axes, representing longitude, latitude, and air temperature. Each dot represents a city and moves up and down with the numerical value of the temperature.

Visualization Explanation
There are two modes in this visualization: hourly air temperature movement and daily average temperature movement. Simply click anywhere on the screen to switch modes; the current mode is shown at the upper left of the screen. To adjust the speed of the movement, move your mouse near the slider area and drag it, without clicking. The far left of the slider is the fastest speed and the far right the slowest. The time of the current movement is shown beside the slider as well. I also mark the current temperature value and the city name on each dot. For color, I use colorMode(HSB) rather than plain stroke() and background() values. The brightness of the color corresponds to the numerical value of the temperature: if a city's temperature is high, its color is light and red; otherwise it is green. This makes the temperature differences among cities easy to see.
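A sketch of the kind of temperature-to-HSB mapping described above (the hue endpoints, green near 85 and red at 0 on a 0-255 wheel, and the Kelvin range are my assumptions about the sketch):

```python
def temp_to_hsb(temp_k, t_min=250.0, t_max=320.0):
    """Map an air temperature (Kelvin) to (hue, saturation, brightness)
    on 0-255 scales: cold cities green, hot cities light red."""
    t = max(0.0, min(1.0, (temp_k - t_min) / (t_max - t_min)))
    hue = 85.0 * (1.0 - t)            # 85 is roughly green, 0 is red in HSB
    brightness = 128.0 + 127.0 * t    # hotter cities are lighter
    return hue, 255.0, brightness

cold = temp_to_hsb(250.0)
hot = temp_to_hsb(320.0)
```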

Data
temperature.csv
(12.72 MiB) Downloaded 28 times
city.csv
(890 Bytes) Downloaded 31 times
Data Source
https://www.kaggle.com/selfishgene/hist ... /version/2

Code
TempDataVis.zip
(3.89 MiB) Downloaded 29 times
My visualization on YouTube
https://www.youtube.com/watch?v=WZu94kk-t1Y
Last edited by senyuan on Sat Mar 12, 2022 7:17 pm, edited 4 times in total.

lijuan
Posts: 3
Joined: Fri Jan 07, 2022 12:25 pm

Re: Proj 3 - Student Defined Visualization Project

Post by lijuan » Thu Feb 24, 2022 2:19 pm

Final Project Idea
Lijuan Cheng


Concept
Due to Covid-19, most people have to work from home or study online, so there has been significant growth in the view counts of many kinds of videos. Hence, the research topic of my final project is to find interesting results from an analysis of YouTube trending videos. As recommended by the professor and TAs, I plan to focus on a few specific categories of trending videos. Once I finish the data cleaning work, I can look for the rules hiding behind these videos and then decide on the design pattern of my visualization. For now, I am still learning how to process the original data with Python, but I have narrowed the dataset down to three major categories: Howto & Style (category_id 26), Education (category_id 27), and Science & Technology (category_id 28). As the next step I will try some data statistics to find common rules.

Dataset
I have found the original dataset from Kaggle website, and here is the link:
https://www.kaggle.com/rsrishav/youtube ... g_data.csv
The following is my preliminary data after deleting some columns.
mat259_finalProject_data_ 26.csv
(1.34 MiB) Downloaded 33 times
mat259_finalProject_data_27.csv
(878.61 KiB) Downloaded 33 times
mat259_finalProject_data_28.csv
(1.32 MiB) Downloaded 33 times

Work in Progress
March 1st 2022
Data cleaning:
I used python to clean the primary data and stored them in three new data files as follows.
2021_26_v2.csv
(218.75 KiB) Downloaded 29 times
2021_27_v2.csv
(189.68 KiB) Downloaded 26 times
2021_28_v2.csv
(246.71 KiB) Downloaded 33 times
I parsed the original date and time information into three new columns: published date, published time, and trending day. Then I created a new column, response_count, which is the sum of likes, dislikes, and comment_count. The new dataset has 8 columns: title, channelTitle, categoryId, view_count, response_count, published_date, published_time, and trending_day. Every row represents one record of a trending video. Although some videos share the same title, published_date, and published_time, they have different view_count, response_count, and trending_day values, so by analyzing these records you can see how their counts change over time.
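A stdlib sketch of the cleaning step described above (the ISO timestamp format is an assumption based on the Kaggle export; the actual cleaning ran over the full CSVs in Python):

```python
from datetime import datetime

def clean_record(published_at, trending_date, likes, dislikes, comment_count):
    """Split a timestamp into date/time columns and sum the engagement
    counts into response_count, as described above."""
    dt = datetime.fromisoformat(published_at)
    return {
        "published_date": dt.date().isoformat(),
        "published_time": dt.time().isoformat(),
        "trending_day": trending_date,
        "response_count": likes + dislikes + comment_count,
    }

row = clean_record("2021-07-12T15:30:00", "2021-07-14", 1200, 34, 89)
```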

Visualization:
In the visualization process, I draw every video record as a 3D point, with the coordinates designed like this: the X axis shows published_date, the Y axis shows published_time, and the Z axis shows view_count. I also use response_count as the length of a line starting at the video's point. The categoryId decides the color of every video: red for Howto & Style, yellow for Education, and blue for Science & Technology. Because the view_count numbers are significantly larger than the day and time values, I mapped the X and Y axes to the range -200 to 200 and the Z axis to -600 to 600, to clearly show the variation in view_count among the video points. I also designed some interactions and rotating effects with controlP5 buttons and key presses. The following images illustrate the latest progress of my visualization.
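The axis rescaling works like Processing's map() function; in Python terms (the output ranges match those stated above, while the input ranges and `map_range` helper are illustrative):

```python
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly remap value from [in_min, in_max] to [out_min, out_max],
    like Processing's map()."""
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)

x = map_range(183, 1, 366, -200, 200)            # mid-year day lands near 0
z = map_range(500_000, 0, 1_000_000, -600, 600)  # view_count onto the Z axis
```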
Screen Shot 2022-03-01 at 10.09.46 AM.png
Screen Shot 2022-03-01 at 10.09.22 AM.png
Last edited by lijuan on Wed Mar 02, 2022 8:51 pm, edited 2 times in total.

siming
Posts: 3
Joined: Fri Jan 07, 2022 12:29 pm

Re: Proj 3 - Student Defined Visualization Project

Post by siming » Thu Feb 24, 2022 3:13 pm

Final Project Idea
Siming, Su

Concept
For this final project, I would like to make a visualization that analyses the attractiveness of certain users of a dating app using a machine learning algorithm called XGBoost. Even though I have not worked with this algorithm before, I am confident I can tackle it given my statistics background. I would like to predict how attractive a certain user will be on the dating app and visualize that prediction.

Data
The dataset I used is from Kaggle and contains the profile info of all female users of a dating app called Lovoo.

Reference
https://arxiv.org/abs/1603.02754 --- XGBoost's original paper.
https://vislab.mat.ucsb.edu/2012/p4/rj/index.html (Because XGBoost is a tree algorithm, I would probably use a tree structure to visualize the prediction, though I am not sure yet.)

Final update on March 10th
For this final project, I made a visualization that analyses the attractiveness of certain users of a dating app using three machine learning algorithms: Linear Regression, XGBoost, and a Neural Network. The data is taken from the Kaggle website (Kaggle DataSet). There are 42 variables in this dataset; after data cleaning, 25 variables were selected. I then used the different machine learning algorithms to see each one's perception of attractiveness by predicting the variable "Counts_Kisses" (the likes a user gets from other users).
More details can be found on the website, where the approach is explained much more clearly, algorithm by algorithm.
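As a minimal stand-in for the Linear Regression baseline (numpy least squares on synthetic data; the target name Counts_Kisses comes from the dataset, while the feature count, weights, and everything else here are illustrative):

```python
import numpy as np

# Synthetic stand-in: 200 users, 3 of the 25 cleaned features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
true_w = np.array([5.0, -2.0, 1.0])
counts_kisses = X @ true_w + 10.0            # noiseless toy target

# Fit weights and intercept by least squares (the linear baseline).
A = np.hstack([X, np.ones((200, 1))])        # append an intercept column
coef, *_ = np.linalg.lstsq(A, counts_kisses, rcond=None)
pred = A @ coef
```

XGBoost and a neural network would then be fit to the same feature matrix to compare their predictions against this baseline.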

Visualization:
final_heart.png
final_heart1.png
final_heart2.png
final_heart3.png
Reference:
Kaggle.com -- kaggle website
Attachments
final_heart.zip
(1.91 MiB) Downloaded 32 times
lovoo_data.csv
(953.55 KiB) Downloaded 35 times

Post Reply