wk10 - 11.29.22 Final Project

Post by glegrady » Fri Sep 16, 2022 8:01 am

The final project is a project of your choice. You can present anything of interest to you that deals with data analytics. You may use the database we have been working with or any other database. This is also an opportunity to recommend a topic that was not covered but could be important to add. Finally, please rank the assignments from most to least important.
George Legrady

Post by shaokang » Mon Nov 28, 2022 10:44 pm

Final Project: Clustering & Dimensionality Reduction
Clustering and dimensionality reduction are two effective approaches in data analysis. Clustering is often used to check whether the data contain any grouping patterns, while dimensionality reduction is helpful for visualizing high-dimensional data.

In this project, I collect three-dimensional data for each title in the database: `number of copies`, `number of checkouts`, and `average borrow duration`. Using clustering and dimensionality reduction algorithms, I am able to find some patterns in the data and visualize them in a 2D plane.
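As a rough illustration of the idea (not the code from the attached notebook), here is a minimal standard-library sketch that standardizes three features per title, groups the titles with k-means, and projects them onto two principal components for plotting. The sample rows, the choice of k = 2, the farthest-first initialization, and all function names are assumptions made for this example.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-first initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center, then move the centers.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

def pca_project(points, n_components=2, iters=200):
    """Project points onto the leading principal components.

    Components are found by power iteration on the covariance matrix,
    deflating the matrix after each component.
    """
    n, d = len(points), len(points[0])
    means = [sum(c) / n for c in zip(*points)]
    centered = [[v - m for v, m in zip(p, means)] for p in points]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0 / (i + 1) for i in range(d)]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * cov[i][j] * v[j]
                         for i in range(d) for j in range(d))
        components.append(v)
        # Remove the component just found so the next pass finds a new one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(a * b for a, b in zip(row, comp)) for comp in components]
            for row in centered]

# Made-up rows: (number of copies, number of checkouts, average borrow days).
titles = [
    (1, 12, 20.0), (2, 15, 21.5), (1, 9, 19.0), (2, 14, 22.0), (1, 11, 20.5),
    (12, 410, 6.0), (15, 390, 7.5), (11, 450, 5.5), (14, 420, 6.5), (13, 400, 7.0),
]
scaled = standardize(titles)
labels = kmeans(scaled, k=2)
coords = pca_project(scaled)
for label, (x, y) in zip(labels, coords):
    print(label, round(x, 2), round(y, 2))
```

Standardizing first matters here because checkout counts are on a much larger scale than copy counts or day counts and would otherwise dominate the distance calculation. In practice, a library such as scikit-learn would likely replace the hand-written k-means and PCA.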

The analysis is mainly written in a Jupyter notebook (in Python). See the .ipynb.zip file in the attachments (uploading .ipynb files directly is not allowed).
Final-Clustering & Dimensionality Reduction.pdf

Post by nataliadubon » Tue Nov 29, 2022 2:16 am

For most of this course, we focused primarily on analyzing past data. For this project, I wanted to focus on future predictions as a way of exploring a topic that has not been assigned yet. During week 5, I did a similar project on trends, but this time I focus solely on prediction, using week 8’s data set (outliers) with some tweaking. The goal is to look at the future rather than the past.
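The post does not say which prediction method was used. One common baseline, sketched here as an assumption, is to fit a least-squares trend line to past counts and extrapolate it forward. The yearly checkout numbers and the `fit_line` helper are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up yearly checkout totals, for illustration only.
years = [2015, 2016, 2017, 2018, 2019]
checkouts = [10400, 10950, 11300, 12100, 12450]

slope, intercept = fit_line(years, checkouts)

# Extrapolate the fitted trend to years outside the observed range.
forecast = {year: slope * year + intercept for year in (2020, 2021)}
for year, value in forecast.items():
    print(year, round(value))
```

A straight-line extrapolation assumes the past trend continues unchanged, so a disruption such as the pandemic period would need to be modeled separately or excluded from the fit.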

The attached PDF is labeled "Week 10 Future Predictions" and is accompanied by its respective queries.

Ranking assignments (not including midterm and final):
1. Discover patterns
2. Outliers
3. Random Sampling
4. New MySQL commands
5. 2nd MySQL project
6. 1st MySQL project
Pandemic - Week8QueryC (1).pdf
2021 Dataset - Week10_QueryA.pdf
Week 10_ Future Predictions.pdf

Post by briannagriffin » Tue Nov 29, 2022 11:37 am

For my final project, I will use Python, R, and Tableau to analyze a data set that I found online. The data set is from Kaggle and originally contained 5 different CSV files. It describes Udemy courses; Udemy is an online platform where you can take courses in a variety of subjects. These courses are either free or paid. I will first clean the data set, then perform some statistical analysis and linear modeling, and visualize some of the results and findings. My ranking of past projects for the course is also included in the PDF below.

Here are the PDFs containing my project write-up and code:
Final Project - MAT 265.pdf

Post by ilianikiforov » Tue Nov 29, 2022 2:17 pm

In this report, I try to model the amount of time that passes between check-out and return using several variables that I constructed and a sample of 2,813 observations. The linear regression showed that adult items, CDs, and DVDs tend to be returned faster. However, the regression failed several important diagnostics on my dataset, so I conclude that these preliminary findings should be tested using a different method that is more appropriate for this data.
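To make the setup concrete, here is a small sketch of a regression of loan duration on indicator variables, followed by one simple residual check. It is not the model from the report: the rows are made up, the three indicators (adult, CD, DVD) only mirror the variables named above, and residual skewness is just one possible diagnostic, since the post does not say which ones failed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def skewness(values):
    """Sample skewness; values far from 0 suggest non-normal residuals."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 else 0.0

# Made-up rows: [intercept, is_adult, is_cd, is_dvd] and days until return.
X = [
    [1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0],
    [1, 1, 1, 0], [1, 0, 0, 1], [1, 1, 0, 1], [1, 0, 0, 0],
]
days = [21, 19, 17, 18, 12, 9, 10, 7, 60]  # the last row is a long-overdue item

coefs = ols(X, days)
residuals = [t - sum(c * v for c, v in zip(coefs, row)) for row, t in zip(X, days)]
print("coefficients:", [round(c, 2) for c in coefs])
print("residual skewness:", round(skewness(residuals), 2))
```

Loan durations are bounded below by zero and can have a long right tail from overdue items, which is one plausible reason a plain linear regression would fail normality or constant-variance checks on this kind of data.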

My ranking of assignments:
1) Patterns
2) New commands
3) Outliers
4) 2nd project
5) 1st project
6) Sampling