Association rule mining and itemset-correlation based variants

Niels Mündler

arxiv: 1907.09535 · v1 · pith:IJNANPNYnew · submitted 2019-07-22 · 💻 cs.DB · cs.IR


Pith reviewed 2026-05-24 17:29 UTC · model grok-4.3


keywords: association rule mining · Apriori algorithm · downward closure · quantitative attributes · item generalizations · support and confidence · itemset correlation


The pith

The Apriori algorithm narrows the search for association rules by discarding candidate itemsets that fall below the minimum support threshold.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents the Apriori algorithm for discovering association rules in databases of itemsets. The method generates candidate itemsets level by level and uses the downward closure property to prune any candidate that has a subset below the user-specified minimum support. Variants extend the approach to quantitative attributes and to rules involving generalizations of items. These extensions keep the same pruning mechanism intact. The paper also proposes intertransformations between certain variants in special cases.
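
The level-by-level procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `apriori`, the relative-support convention, the toy transactions, and the use of `frozenset` are all assumptions made here for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune step (downward closure): keep a candidate only if
        # every one of its (k-1)-subsets was frequent at the previous level.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count step: only the surviving candidates are checked against the data.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Toy database (hypothetical, not from the paper).
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
frequent = apriori(transactions, min_support=0.5)
```

On this toy database the three single items and the three pairs are frequent, while {A, B, C} is counted but rejected because its support is 0.4.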

Core claim

The paper presents the Apriori algorithm, the basis for most association rule mining algorithms. It works by pruning away rules that need not be evaluated, based on the user-specified minimum support and minimum confidence. The paper also presents variations of the algorithm that handle quantitative attributes and extract rules about generalizations of items while preserving the downward closure property that enables pruning. Intertransformation of the extensions is proposed for special cases.

What carries the argument

The downward closure property of frequent itemsets, which lets the algorithm prune any candidate whose subsets fall below the minimum support threshold.
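
Stated formally, under the usual relative-support definition, the property reads as follows. The symbols $D$ (the transaction database), $\operatorname{supp}$ and $\sigma_{\min}$ (the minimum support) are notation chosen here, not taken from the paper.

```latex
\operatorname{supp}(X) = \frac{|\{\, T \in D : X \subseteq T \,\}|}{|D|},
\qquad
X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y),
\qquad
\operatorname{supp}(X) < \sigma_{\min} \;\Longrightarrow\; \operatorname{supp}(Y) < \sigma_{\min} \ \text{ for all } Y \supseteq X.
```

Every transaction that contains $Y$ also contains $X$, so the count for $X$ is at least as large as the count for $Y$. The last implication is the contrapositive reading that licenses the pruning.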

If this is right

Association rules can be mined from databases containing quantitative attributes without losing the original pruning power.
Rules involving generalizations of items can be extracted while still discarding infrequent candidates early.
Intertransformations between some of the variants become possible in special cases.
The same level-by-level candidate generation and pruning strategy remains applicable across the extensions.
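
For the quantitative case, the pruning plausibly carries over because each numeric value is first mapped to an interval item, after which records look like ordinary transactions. The sketch below shows only that mapping step, assuming fixed, user-supplied cut points. The helper name `to_interval_items`, the `bins` dictionary, and the half-open interval convention are illustrative choices made here; any further steps of the cited method are not reproduced.

```python
def to_interval_items(record, bins):
    """Map a record of numeric attributes to a set of (attribute, lo, hi) items.

    record: {attribute: numeric value}
    bins:   {attribute: sorted list of cut points}; intervals are half-open [lo, hi).
    """
    items = set()
    for attr, value in record.items():
        # Surround the cut points with infinities so every value falls in one interval.
        edges = [float("-inf")] + list(bins[attr]) + [float("inf")]
        for lo, hi in zip(edges, edges[1:]):
            if lo <= value < hi:
                items.add((attr, lo, hi))
                break
    return frozenset(items)


# Hypothetical attribute and cut points.
bins = {"age": [30, 50]}
item_set = to_interval_items({"age": 34}, bins)
```

Because each record now yields exactly one interval item per attribute, support can be counted with the same subset test as in the Boolean setting.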

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The preserved pruning could let the same framework scale to larger itemset collections than exhaustive search would allow.
Similar downward-closure arguments might be tested on other pattern-mining tasks that currently lack efficient candidate reduction.
Hybrid algorithms could combine the quantitative and generalization extensions when both kinds of data appear together.

Load-bearing premise

The downward closure property continues to hold for the quantitative-attribute and generalization variants, allowing the same level of pruning as the original Apriori algorithm.
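
One way to see why the premise is plausible for generalized items: if each transaction is extended with the ancestors of its items, a generalized item becomes just another item, and support is counted exactly as before. The sketch assumes a hypothetical child-to-parent dictionary `parent` describing a tree-shaped, acyclic taxonomy; the function names and the example items are illustrative.

```python
def ancestors(item, parent):
    """Return all generalizations of item under a child -> parent mapping."""
    result = []
    while item in parent:
        item = parent[item]
        result.append(item)
    return result


def extend_transaction(transaction, parent):
    """Add every ancestor of every item so generalized items can be counted."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, parent))
    return frozenset(extended)


# Hypothetical taxonomy: tea and coffee are hot drinks, hot drinks are beverages.
parent = {"tea": "hot drink", "coffee": "hot drink", "hot drink": "beverage"}
extended = extend_transaction({"tea", "bread"}, parent)
```

Since the extended transactions are ordinary sets of items, the subset argument behind downward closure applies to them unchanged.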

What would settle it

A worked example in which one of the quantitative or generalization variants must evaluate every possible itemset without any pruning based on support would show that the claimed efficiency does not hold.

Figures

Figures reproduced from arXiv: 1907.09535 by Niels Mündler.

**Figure 2.** Example transaction database for a market providing Aubergines, …

**Figure 3.** Example for small shop selling tea (t) and coffee (c) where the association rule Tea → Coffee with negative correlation is generated.

**Figure 4.** Graphs showing why sensitivity to the underlying data may be …

**Figure 5.** Intervals of a quantitative attribute represented as a taxonomy.

**Figure 6.** A taxonomy converted into quantities. The subintervals shown below …
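
The Tea → Coffee situation named in the Figure 3 caption can be made concrete with a small calculation. The counts below are hypothetical, chosen here for illustration and not taken from the figure, and lift is used as one common correlation measure, which is an assumption about what "negative correlation" refers to.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift of a rule antecedent -> consequent from raw counts."""
    support = n_both / n_total
    confidence = n_both / n_antecedent
    # Lift compares the rule's confidence with the consequent's overall frequency.
    lift = confidence / (n_consequent / n_total)
    return support, confidence, lift


# Hypothetical shop: 100 baskets, 25 with tea, 90 with coffee, 20 with both.
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=25, n_consequent=90, n_both=20
)
```

With these counts the rule has confidence 0.8 yet a lift below 1, because tea buyers purchase coffee less often than customers overall (0.9). A rule can therefore pass a confidence threshold while the two items are negatively correlated.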

Original abstract

Association rules express implication formed relations among attributes in databases of itemsets. The apriori algorithm is presented, the basis for most association rule mining algorithms. It works by pruning away rules that need not be evaluated based on the user specified minimum support confidence. Additionally, variations of the algorithm are presented that enable it to handle quantitative attributes and to extract rules about generalizations of items, but preserve the downward closure property that enables pruning. Intertransformation of the extensions is proposed for special cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

This is a plain recap of Apriori plus two textbook extensions, with no new results or derivations.


The main thing to know is that this paper restates the Apriori algorithm and describes two standard extensions already covered in the literature it cites, without claiming or showing anything new. The abstract walks through the basic pruning step based on minimum support and downward closure, then notes that the same property holds after discretizing quantitative attributes or adding item taxonomies. That part is accurate and matches what has been known since the mid-1990s. The only extra remark is that the two extensions can be intertransformed in special cases, but the text gives no examples, proof, or even a sketch of how that would work. The stress-test note is correct: nothing here contradicts itself or requires independent verification because no technical claim is being advanced. The title mentions itemset-correlation variants yet the content stays inside ordinary association rules, which is a minor mismatch but does not change the substance. Overall the writing is clear enough for an overview, but it supplies no data, no code, no proofs, and no fresh observation that would move the field. A reader who already knows the 1994 paper and the later work on quantitative and generalized association rules will learn nothing. This is the sort of material that belongs in lecture notes or a survey chapter, not a research submission. I would not bring it to reading group, would not cite it, and would not send it to referees.

Referee Report

1 major / 1 minor

Summary. The manuscript presents the Apriori algorithm for mining association rules, highlighting its pruning mechanism based on user-specified minimum support and confidence that relies on the downward closure property. It describes two variants—one for quantitative attributes (via discretization or partitioning) and one for item generalizations using taxonomies—asserting that both preserve downward closure to support equivalent pruning. It further proposes intertransformation between the extensions in special cases.

Significance. If accurate, the work offers a structured, pedagogical exposition of a foundational algorithm and two well-known extensions in association rule mining. Its value lies in consolidating standard techniques for readers new to the area, but it introduces no new theoretical results, proofs, empirical comparisons, or machine-checked derivations, limiting its significance to consolidation rather than advancement of the database mining literature.

major comments (1)

[Abstract] The claim that the quantitative-attribute and generalization variants preserve the downward closure property (enabling the same pruning) is asserted without derivation, explicit re-definition of support on binned or generalized items, or citation to the specific literature establishing the property. Because this property is load-bearing for the central pruning argument, a short justification or pointer to the relevant definitions would make the presentation self-contained and verifiable.

minor comments (1)

[Abstract] The abstract phrase 'minimum support confidence' is imprecise and should read 'minimum support and minimum confidence'.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review. The manuscript is an expository presentation of the Apriori algorithm and its standard extensions; we address the single major comment below.


Referee: [Abstract] The claim that the quantitative-attribute and generalization variants preserve the downward closure property (enabling the same pruning) is asserted without derivation, explicit re-definition of support on binned or generalized items, or citation to the specific literature establishing the property. Because this property is load-bearing for the central pruning argument, a short justification or pointer to the relevant definitions would make the presentation self-contained and verifiable.

Authors: We agree that the abstract asserts preservation of the downward-closure property without a derivation or citation. The property follows from the standard redefinition of support on binned quantitative items (Srikant & Agrawal, 1996) and on generalized items via taxonomies (Srikant & Agrawal, 1995), both of which retain anti-monotonicity and therefore the same pruning. Because the manuscript is pedagogical rather than a new theoretical contribution, we did not re-derive these known facts in the abstract. We will revise by adding a concise pointer (or one-sentence justification) either in the abstract or, if length-constrained, in the introduction, together with the two foundational citations. This change will be incorporated in the next version.

revision: yes

Circularity Check

0 steps flagged

No circularity: purely expository description of known algorithms


The paper is an expository survey of the Apriori algorithm and two standard extensions (quantitative attributes and item generalizations). It describes the downward-closure property and pruning behavior without introducing novel derivations, equations, fitted parameters, or self-citations that serve as load-bearing premises. All claims reduce to well-known prior literature rather than internal construction. No steps qualify under any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No new free parameters, axioms, or invented entities are introduced; the text is an exposition of existing algorithms.


Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1] R. Agrawal, T. Imielinski, and A. N. Swami, “Mining association rules between sets of items in large databases,” in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 26-28, 1993, P. Buneman and S. Jajodia, Eds. ACM Press, 1993, pp. 207–216. [Online]. Available: https://doi.org/10.1145/170035.170072

[2] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules in large databases,” in VLDB’94, Proceedings of 20th International Conference on Very Large Data Bases, September 12-15, 1994, Santiago de Chile, Chile, J. B. Bocca, M. Jarke, and C. Zaniolo, Eds. Morgan Kaufmann, 1994, pp. 487–499. [Online]. Available: http://www.vldb.org/conf/1994...

[3] M. Chen, J. Han, and P. S. Yu, “Data mining: An overview from a database perspective,” IEEE Trans. Knowl. Data Eng., vol. 8, no. 6, pp. 866–883, 1996. [Online]. Available: https://doi.org/10.1109/69.553155

[4] S. Brin, R. Motwani, and C. Silverstein, “Beyond market baskets: Generalizing association rules to correlations,” in SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA, J. Peckham, Ed. ACM Press, 1997, pp. 265–276. [Online]. Available: https://doi.org/10.1145/253260.253327

[5] L. Geng and H. J. Hamilton, “Interestingness measures for data mining: A survey,” ACM Comput. Surv., vol. 38, no. 3, p. 9, 2006. [Online]. Available: https://doi.org/10.1145/1132960.1132963

[6] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, “Discovering frequent closed itemsets for association rules,” in Database Theory - ICDT ’99, 7th International Conference, Jerusalem, Israel, January 10-12, 1999, Proceedings, ser. Lecture Notes in Computer Science, C. Beeri and P. Buneman, Eds., vol. 1540. Springer, 1999, pp. 398–416. doi:10.1007/3-540-49257-7_25

[7] R. Srikant and R. Agrawal, “Mining quantitative association rules in large relational tables,” in Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, June 4-6, 1996, H. V. Jagadish and I. S. Mumick, Eds. ACM Press, 1996, pp. 1–12. [Online]. Available: https://doi.org/10.1145/233269.233311

[8] M. Moreno García, S. Segrera, V. Batista, and M. Jose, “Improving the quality of association rules by preprocessing numerical data,” May 2019.

[9] S. C. Tan, “Improving association rule mining using clustering-based discretization of numerical data,” in 2018 International Conference on Intelligent and Innovative Computing Applications (ICONIC), Dec 2018, pp. 1–5.

[10] L. Helm, “Fuzzy association rules: An implementation in r,” Aug 2007. [Online]. Available: https://michael.hahsler.net/students/stud/done/helm/fuzzy_AR_helm.pdf

[11] B. Thomas and G. Raju, “A novel unsupervised fuzzy clustering method for preprocessing of quantitative attributes in association rule mining,” Information Technology and Management, vol. 15, no. 1, pp. 9–17. [Online]. Available: https://doi.org/10.1007/s10799-013-0168-7

[13] R. Srikant and R. Agrawal, “Mining generalized association rules,” in VLDB’95, Proceedings of 21th International Conference on Very Large Data Bases, September 11-15, 1995, Zurich, Switzerland, U. Dayal, P. M. D. Gray, and S. Nishio, Eds. Morgan Kaufmann, 1995, pp. 407–419. [Online]. Available: http://www.vldb.org/conf/1995/P407.PDF

[14] G. Shaw, Y. Xu, and S. Geva, “Using association rules to solve the cold-start problem in recommender systems,” in Advances in Knowledge Discovery and Data Mining, 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010. Proceedings. Part I, ser. Lecture Notes in Computer Science, M. J. Zaki, J. X. Yu, B. Ravindran, and V. Pudi, Eds.... doi:10.1007/978-3-642-13657-3_37

[15] F. Erlandsson, P. Bródka, A. Borg, and H. Johnson, “Finding influential users in social media using association rule learning,” Entropy, vol. 18, no. 5, p. 164, 2016. [Online]. Available: https://doi.org/10.3390/e18050164

[16] Y. Shen, Z. Zhang, and Q. Yang, “Objective-oriented utility-based association mining,” in Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 9-12 December 2002, Maebashi City, Japan. IEEE Computer Society, 2002, pp. 426–433. [Online]. Available: https://doi.org/10.1109/ICDM.2002.1183938
