Proj 1: Data Query

milrober
Posts: 4
Joined: Tue Jan 14, 2014 11:44 am

Re: Proj 1: Data Query

Post by milrober » Tue Jan 21, 2014 3:11 pm

Attachments
RMiller-MAT259-DataQuery.pdf
(83.93 KiB) Downloaded 272 times

laks316
Posts: 3
Joined: Tue Jan 14, 2014 11:52 am

Re: Proj 1: Data Query

Post by laks316 » Tue Jan 21, 2014 3:54 pm

MAT 259 Assignment 1, Lakshman Nataraj

As an avid reader of Japanese manga, I was interested in seeing how many people read Japanese manga and how many watch the anime of the manga. I wanted to analyze this on the current most popular manga: Naruto. Some of the specific questions that I was interested in were:

How many checkouts of Naruto manga books?

SELECT count(*) FROM spl2.inraw where title like "%naruto%" and itemtype = "acbk";
Ans: 44555, Query time: 50.047 sec

How many checkouts of Naruto anime CD/DVDs?

SELECT count(*) FROM spl2.inraw where title like "%naruto%" and (itemtype = "accd" or itemtype = "acdvd") ;

Ans: 44519, Query Time: 54.875 sec

What is the number of Naruto manga books checked out every year after 2006?

SELECT count(*),year(cout) FROM spl2.inraw
where (title like "%naruto%" and itemtype="acbk")
and year(cout)>="2006"
group by year(cout);

9 rows returned, Query Time: 53.343 sec, CSV file: naruto_books_checkouts_per_year.csv

What is the number of Naruto anime CD/DVDs checked out every year after 2006?

SELECT count(*),year(cout) FROM spl2.inraw
where (title like "%naruto%" and (itemtype="accd" or itemtype="acdvd"))
and year(cout)>="2006"
group by year(cout);

9 rows returned, Query Time: 54.578 sec, CSV file: naruto_cddvd_checkouts_per_year.csv

Comparison Table (in pdf)

What is the checkout time (in number of days) for every book and CD/DVD checked out?

SELECT DATEDIFF(cin,cout) FROM spl2.inraw where (title like "%naruto%" and itemtype = "acbk") AND year(cout)>='2006' order by dayofyear(cout);

43185 rows returned, Query Time: 54.094 sec, CSV file: naruto_books_chkout_time.csv

SELECT DATEDIFF(cin,cout) FROM spl2.inraw where (title like "%naruto%" and (itemtype = "accd" or itemtype=”acdvd”)) AND year(cout)>='2006' order by dayofyear(cout) ;

43654 rows returned, Query Time: 57.656 sec, CSV file: naruto_cddvd_chkout_time.csv

Comments and Analysis:
The overall number of books and CD/DVDs of Naruto that have been checked out seeem to be almost the same. However, a year by year analysis shows that from 2006 to 2008, the books seem to be more popular. But from 2009-2011, the number of CD/DVDs checked out were significantly more than the number of books. This could probably be due to the popularity of the new TV show Naruto Shippudden which was released in the USA around mid 2008. The popularity of the CD/DVDs seem to fade from 2012 onwards. I was also interested to see the checkout times for books and CD/DVDs. However, since the rows returned were large (40K+), I plan to do a comparative visualization of both the checkout times in the next assignment.

Link to csv files:
https://drive.google.com/folderview?id= ... sp=sharing
Attachments
mat-259-assign-1-lakshman.pdf
(89.29 KiB) Downloaded 251 times

Post Reply