wk10 - 11.29.22 Final Project

Posted: Fri Sep 16, 2022 8:01 am
by glegrady
11.29.22 Final Project

The final project is a project of your choice. You can present anything of interest to you that deals with data analytics. The database can be the one we have been using or any other database. Additionally, it is an opportunity to recommend a topic that was not covered but could be important to add. And finally, please rank the assignments from most important to least.

Re: wk10 - 11.29.22 Final Project

Posted: Mon Nov 28, 2022 10:44 pm
by shaokang
Final Project: Clustering & Dimensionality Reduction
Clustering and dimensionality reduction are two effective approaches in data analysis. Clustering is often used to check whether the data contains any grouping pattern, while dimensionality reduction is helpful for visualizing high-dimensional data.

In this project, I collect three-dimensional data for each title in the database: `Number of copies`, `Number of checkouts`, and `average borrow duration`. Using clustering and dimensionality reduction algorithms, I am able to find some patterns in the data and also visualize it in a 2D plane.
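
As a rough illustration of that workflow (not the notebook's actual code), here is a minimal sketch using only NumPy. It assumes k-means for the clustering step and PCA for the reduction to 2D; the post does not say which algorithms were used, so both are stand-ins. The function names and the synthetic data from `make_toy_titles` are hypothetical and do not come from the course database.

```python
import numpy as np


def make_toy_titles(n=300, seed=0):
    """Synthetic stand-in for the per-title data (NOT the real database).

    Columns: number of copies, number of checkouts, average borrow duration.
    """
    rng = np.random.default_rng(seed)
    copies = rng.integers(1, 20, size=n)
    checkouts = copies * rng.integers(1, 50, size=n)
    duration = np.clip(rng.normal(14.0, 5.0, size=n), 1.0, None)
    return np.column_stack([copies, checkouts, duration]).astype(float)


def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def assign(X, centers):
    """Return the index of the nearest center for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; assumed here, not confirmed by the post."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers


def pca_2d(X):
    """Project the rows of X onto the first two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T


# Usage: cluster the standardized features, then get 2D coordinates to plot.
Z = standardize(make_toy_titles())
labels, centers = kmeans(Z, k=3)
coords = pca_2d(Z)
print(coords.shape, np.bincount(labels, minlength=3))
```

Standardizing first matters in a setup like this because checkout counts are on a much larger scale than copy counts or durations, and both k-means and PCA are sensitive to scale. The choice of `k=3` is arbitrary here.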

The analysis is mainly written in a Jupyter notebook (in Python). See the .ipynb.zip file in the attachments (zipped because uploading .ipynb files is not allowed).
project.ipynb.zip
(160.66 KiB) Downloaded 174 times
data.csv
(2.82 MiB) Downloaded 160 times
Final-Clustering & Dimensionality Reduction.pdf
(1.01 MiB) Downloaded 157 times

Re: wk10 - 11.29.22 Final Project

Posted: Tue Nov 29, 2022 2:16 am
by nataliadubon
Abstract
For most of this course, we primarily focused on analyzing data from the past and keeping it there. For this project, I thought I could focus more on future predictions as another means of exploring a topic that hasn’t been assigned yet. During week 5, I did a similar project focusing on trends, but this time I plan on solely focusing on prediction using week 8’s data set (outliers) with some tweaking. The goal is to focus on the future versus the past.

The PDF, labeled "Week 10 Future Predictions", is attached along with its respective queries.

Ranking assignments (not including midterm and final):
1. Discover patterns
2. Outliers
3. Random Sampling
4. New MySQL commands
5. 2nd MySQL project
6. 1st MySQL project

Re: wk10 - 11.29.22 Final Project

Posted: Tue Nov 29, 2022 11:37 am
by briannagriffin
For my final project, I will be using Python, R, and Tableau to analyze a data set that I found online. The data set is from Kaggle and originally contained 5 different CSV files. The context of the data is Udemy courses. Udemy is an online platform on which you can take courses in a variety of subjects. These courses are either free or paid. I will first clean the data set, then analyze it, perform some statistical analysis and linear modeling, and visualize some of the results and findings. The ranking of past projects for the course is also included in the PDF below.

Here are the PDFs containing my project write-up and code:
Final Project - MAT 265.pdf
(956.53 KiB) Downloaded 159 times
r_script_finalproject.pdf
(95.16 KiB) Downloaded 166 times
final_project_statistics.pdf
(152.24 KiB) Downloaded 162 times

Re: wk10 - 11.29.22 Final Project

Posted: Tue Nov 29, 2022 2:17 pm
by ilianikiforov
In this report, I try to model the amount of time that passes between check-out and return using several variables that I constructed and a sample of 2813 observations. The linear regression showed that adult items, CDs, and DVDs tend to be returned faster. However, the regression method with my dataset failed several important diagnostics, so I conclude that these preliminary findings should be tested using a different method, more appropriate for this data.
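
To make the kind of check described above concrete, here is a minimal sketch in Python with NumPy. It is an illustration only: the post names neither the software nor the diagnostics that failed, so the indicator variables, the synthetic data from `make_toy_loans`, and the use of residual skewness as an example diagnostic are all assumptions. The negative effects are built into the toy data on purpose to mirror the direction of the reported finding; they are not results.

```python
import numpy as np


def make_toy_loans(n=2000, seed=0):
    """Synthetic loans (NOT the report's 2813 observations).

    Predictors are 0/1 indicators: adult item, CD, DVD (books are the baseline).
    The response is the number of days between check-out and return.
    """
    rng = np.random.default_rng(seed)
    is_adult = rng.integers(0, 2, size=n).astype(float)
    item_type = rng.integers(0, 3, size=n)  # 0 = book, 1 = CD, 2 = DVD
    is_cd = (item_type == 1).astype(float)
    is_dvd = (item_type == 2).astype(float)
    X = np.column_stack([is_adult, is_cd, is_dvd])
    # Right-skewed noise, chosen so the normality assumption is violated.
    noise = rng.exponential(scale=3.0, size=n)
    y = 20.0 - 3.0 * is_adult - 5.0 * is_cd - 7.0 * is_dvd + noise
    return X, y


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (beta, residuals)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta


def residual_skewness(residuals):
    """Sample skewness of the residuals; values far from 0 suggest non-normality."""
    r = residuals - residuals.mean()
    return (r ** 3).mean() / (r ** 2).mean() ** 1.5


# Usage: fit the model, then inspect one simple residual diagnostic.
X, y = make_toy_loans()
beta, residuals = fit_ols(X, y)
print("coefficients (intercept, adult, CD, DVD):", np.round(beta, 2))
print("residual skewness:", round(float(residual_skewness(residuals)), 2))
```

In a case like this the coefficient signs can still look sensible while the residuals are clearly skewed, which is one reason a different method (for example, one designed for duration or count data) might fit better.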

My ranking of assignments:
1) Patterns
2) New commands
3) Outliers
4) 2nd project
5) 1st project
6) sampling