wk6 - 11.01.22 MidTerm Presentation
Posted: Fri Sep 16, 2022 7:57 am
The presentation focuses on Frequent Pattern Mining.
. What patterns emerge in terms of what circulates at what hour of the day?
. What are the temporal patterns throughout the day, days of the week, months, years?
. Is there a correlation between checkout and return times and topics?
. What are co-occurrence patterns through frequency-pattern algorithm searches?
. Prediction analysis: If certain things circulate over certain periods, what are the chances of
--
. Are there correlations between topics and items that disappear?
. What is the short-term and long-term performance of titles, topics, media, etc.?
. What is an object’s life expectancy in relation to the subject’s performance based on their ID?
. Sequential history: when something is returned, what items are then checked out?
--
Frequent Pattern Mining (also known as Association Rule Mining) is an analytical process that finds frequent patterns, associations, or causal structures in data sets held in various kinds of databases, such as relational databases, transactional databases, and other data repositories. Given a set of transactions, the process aims to find rules that let us predict the occurrence of a specific item based on the occurrence of other items in the same transaction.
Let’s look at an example of Frequent Pattern Mining. First, we need to understand the terminology used in this type of analysis. While many metrics are used in this technique, for this example we will consider only two, namely Support and Confidence.
Support: The support of a rule x -> y (where x and y are each items, events, etc.) is the proportion of transactions in the data set that contain both x and y. So, Support (x -> y) = (no. of transactions containing both x and y) / (total no. of transactions).
Confidence: The confidence of a rule x -> y is defined as Support (x -> y) / Support (x). In other words, it is the ratio of the number of transactions that contain both the antecedent (x) and the consequent (y) to the number of transactions that contain the antecedent (x).
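The two definitions above can be written as small Python functions. This is only a minimal sketch: the function names and the five baskets are invented for illustration and are not taken from any data set in this post.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support(x -> y) / Support(x), computed directly from counts."""
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    if n_antecedent == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return n_both / n_antecedent


# Hypothetical baskets (assumption): milk appears in 4 of 5,
# milk together with bread in 2 of 5.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk", "eggs"},
    {"milk"},
    {"eggs"},
]

print(support(baskets, {"milk", "bread"}))       # 2 / 5 = 0.4
print(confidence(baskets, {"milk"}, {"bread"}))  # 2 / 4 = 0.5
```

Confidence is computed from raw counts rather than by dividing two support values; the result is the same ratio, and it avoids a second floating-point division.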
In the table below, Support (milk->bread) = 0.4 means that milk and bread are purchased together in 40% of all transactions. Confidence (milk->bread) = 0.5 means that out of every 100 transactions containing milk, 50 also contain bread.
The attached drawing comes from this website: https://www.dataversity.net/frequent-pa ... -analysis/#
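For the co-occurrence question in the list above, the same two measures can be used to filter candidate rules. The sketch below counts every pair of items by brute force and keeps the rules that pass both thresholds; it is not Apriori or FP-Growth, and it would not scale to a full circulation data set. The function name, the checkout sessions, the subject labels, and the threshold values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return rules (x, y, support, confidence) for item pairs above both thresholds."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the order of each pair
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), n_ab in pair_counts.items():
        sup = n_ab / n
        if sup < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            conf = n_ab / item_counts[x]
            if conf >= min_confidence:
                rules.append((x, y, sup, conf))
    # Highest confidence first, then alphabetical for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical checkout sessions (assumption): subjects borrowed together.
sessions = [
    ["cooking", "travel"],
    ["cooking", "travel", "history"],
    ["cooking", "history"],
    ["travel", "history"],
    ["cooking", "travel"],
]

for x, y, sup, conf in mine_pair_rules(sessions, min_support=0.5, min_confidence=0.7):
    print(f"{x} -> {y}: support={sup:.2f}, confidence={conf:.2f}")
```

With these invented sessions only the cooking/travel pair clears the support threshold, so the output is the two rules cooking -> travel and travel -> cooking, each with support 0.60 and confidence 0.75.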