arxiv: 1907.09535 · v1 · pith:IJNANPNYnew · submitted 2019-07-22 · 💻 cs.DB · cs.IR

Association rule mining and itemset-correlation based variants

Pith reviewed 2026-05-24 17:29 UTC · model grok-4.3

classification 💻 cs.DB cs.IR
keywords association rule mining · Apriori algorithm · downward closure · quantitative attributes · item generalizations · support and confidence · itemset correlation

The pith

The Apriori algorithm prunes association rules by discarding candidates that fail minimum support thresholds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents the Apriori algorithm for discovering association rules in databases of itemsets. The method generates candidate itemsets level by level and uses the downward closure property to prune any set whose subsets fail the user-specified minimum support. Variants extend the approach to quantitative attributes and to rules involving generalizations of items. These extensions keep the same pruning mechanism intact. The paper also proposes intertransformations between certain variants in special cases.
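The level-by-level procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `apriori`, the parameter `min_support`, and the toy database `db` are hypothetical, and support is assumed to be the fraction of transactions containing an itemset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (downward closure): keep a candidate only if every
        # (k-1)-subset was frequent at the previous level.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support only for the surviving candidates.
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
    return frequent


# Hypothetical toy database, not taken from the paper.
db = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
result = apriori(db, min_support=0.6)
```

With this toy data every single item and every pair reaches the 0.6 threshold, while {A, B, C} appears in only two of five transactions and is rejected after counting. The prune step matters on larger inputs, where it removes candidates before any pass over the database.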

Core claim

The paper presents the Apriori algorithm, the basis for most association rule mining algorithms. It works by pruning away rules that need not be evaluated, based on the user-specified minimum support and minimum confidence. The paper also presents variations of the algorithm that handle quantitative attributes and extract rules about generalizations of items while preserving the downward closure property that enables pruning. It proposes intertransformation of the extensions for special cases.

What carries the argument

The downward closure property of frequent itemsets, which lets the algorithm prune any candidate whose subsets fall below the minimum support threshold.
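Stated formally, the property reads as follows. The notation is an assumption of this note and may differ from the paper's: D is the transaction database, supp is relative support, and s_min is the minimum support threshold.

```latex
\operatorname{supp}(X) = \frac{\lvert \{\, T \in D : X \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y).

% Contrapositive used for pruning:
\operatorname{supp}(X) < s_{\min} \;\Longrightarrow\; \operatorname{supp}(Y) < s_{\min}
\quad \text{for every } Y \supseteq X.
```

The implication holds because every transaction that contains Y also contains each subset X of Y, so the set of transactions counted for Y is a subset of the set counted for X.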

If this is right

  • Association rules can be mined from databases containing quantitative attributes without losing the original pruning power.
  • Rules involving generalizations of items can be extracted while still discarding infrequent candidates early.
  • Intertransformations between some of the variants become possible in special cases.
  • The same level-by-level candidate generation and pruning strategy remains applicable across the extensions.
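One common way to realize both extensions is to rewrite each record into an ordinary transaction before mining. The sketch below illustrates that idea under stated assumptions: the helper names (`bin_item`, `extend_with_ancestors`, `support`), the taxonomy, the bin edges, and the records are all hypothetical and not taken from the paper.

```python
def bin_item(attribute, value, edges):
    """Map a numeric value to an 'attribute=[lo,hi)' item using half-open bins."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= value < hi:
            return f"{attribute}=[{lo},{hi})"
    raise ValueError(f"{value} lies outside the bin edges {edges}")


def extend_with_ancestors(transaction, parent):
    """Add every taxonomy ancestor of each item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        # Walk up the taxonomy until a root (an item without a parent) is reached.
        while item in parent:
            item = parent[item]
            extended.add(item)
    return frozenset(extended)


def support(itemset, transactions):
    """Fraction of (extended) transactions that contain the whole itemset."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)


# Hypothetical taxonomy (child -> parent) and records (items bought, customer age).
parent = {"jacket": "outerwear", "ski pants": "outerwear", "outerwear": "clothes"}
age_edges = [0, 30, 60, 120]
records = [
    ({"jacket"}, 25),
    ({"ski pants"}, 41),
    ({"jacket", "ski pants"}, 35),
]

# Each record becomes a plain set of items: originals, ancestors, and one age bin.
transactions = [
    extend_with_ancestors(items, parent) | {bin_item("age", age, age_edges)}
    for items, age in records
]
```

After this rewriting, binned values and generalized items are just items, so support is still counted by set inclusion and adding an item to an itemset can never raise its support. In the toy data, `support({"outerwear"}, transactions)` is 1.0, while adding the item `"age=[30,60)"` lowers it to two thirds.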

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The preserved pruning could let the same framework scale to larger itemset collections than exhaustive search would allow.
  • Similar downward-closure arguments might be tested on other pattern-mining tasks that currently lack efficient candidate reduction.
  • Hybrid algorithms could combine the quantitative and generalization extensions when both kinds of data appear together.

Load-bearing premise

The downward closure property continues to hold for the quantitative-attribute and generalization variants, allowing the same level of pruning as the original Apriori algorithm.

What would settle it

A worked example in which one of the quantitative or generalization variants must evaluate every possible itemset without any pruning based on support would show that the claimed efficiency does not hold.

Figures

Figures reproduced from arXiv: 1907.09535 by Niels Mündler.

Figure 1: Visualization of the frequent itemset generation of the apriori algorithm on …
Figure 2: Example transaction database for a market providing Aubergines, …
Figure 3: Example for small shop selling tea (t) and coffee (c) where the association rule Tea → Coffee with negative correlation is generated.
  … frequent itemsets of all lengths. The next step is the generation of association rules from the set of frequent itemsets. The procedure will be shown by the example of the frequent itemset {A, B, C} ∈ L3. First, single consequent rules are generated and their confidence is…
Figure 4: Graphs showing why sensitivity to the underlying data may be …
Figure 5: Intervals of a quantitative attribute represented as a taxonomy.
Figure 6: A taxonomy converted into quantities. The subintervals shown below …
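
The Figure 3 caption mentions a rule Tea → Coffee that is generated despite a negative correlation. A small calculation shows how that can happen. The counts below are illustrative assumptions, not the values in the paper's figure, and lift is used here as one common correlation measure; whether the paper uses exactly this measure is also an assumption. The function name `rule_metrics` is hypothetical.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    support = n_both / n_total
    confidence = n_both / n_antecedent
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_consequent / n_total)
    return support, confidence, lift


# Hypothetical shop: 100 transactions, 25 with tea, 90 with coffee, 20 with both.
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=25, n_consequent=90, n_both=20
)
```

Here the confidence of Tea → Coffee is 0.8, high enough to pass a typical confidence threshold, yet the lift is below 1 because coffee alone appears in 90% of transactions. Buying tea therefore makes coffee slightly less likely, which is the sense in which the rule is negatively correlated.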

Original abstract

Association rules express implication formed relations among attributes in databases of itemsets. The apriori algorithm is presented, the basis for most association rule mining algorithms. It works by pruning away rules that need not be evaluated based on the user specified minimum support confidence. Additionally, variations of the algorithm are presented that enable it to handle quantitative attributes and to extract rules about generalizations of items, but preserve the downward closure property that enables pruning. Intertransformation of the extensions is proposed for special cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents the Apriori algorithm for mining association rules, highlighting its pruning mechanism based on user-specified minimum support and confidence that relies on the downward closure property. It describes two variants—one for quantitative attributes (via discretization or partitioning) and one for item generalizations using taxonomies—asserting that both preserve downward closure to support equivalent pruning. It further proposes intertransformation between the extensions in special cases.

Significance. If accurate, the work offers a structured, pedagogical exposition of a foundational algorithm and two well-known extensions in association rule mining. Its value lies in consolidating standard techniques for readers new to the area, but it introduces no new theoretical results, proofs, empirical comparisons, or machine-checked derivations, limiting its significance to consolidation rather than advancement of the database mining literature.

major comments (1)
  1. [Abstract] The claim that the quantitative-attribute and generalization variants preserve the downward closure property (enabling the same pruning) is asserted without derivation, explicit re-definition of support on binned or generalized items, or citation to the specific literature establishing the property; because this property is load-bearing for the central pruning argument, a short justification or pointer to the relevant definitions would make the presentation self-contained and verifiable.
minor comments (1)
  1. [Abstract] The abstract phrase 'minimum support confidence' is imprecise and should read 'minimum support and minimum confidence'.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review. The manuscript is an expository presentation of the Apriori algorithm and its standard extensions; we address the single major comment below.

  1. Referee: [Abstract] The claim that the quantitative-attribute and generalization variants preserve the downward closure property (enabling the same pruning) is asserted without derivation, explicit re-definition of support on binned or generalized items, or citation to the specific literature establishing the property; because this property is load-bearing for the central pruning argument, a short justification or pointer to the relevant definitions would make the presentation self-contained and verifiable.

    Authors: We agree that the abstract asserts preservation of the downward-closure property without a derivation or citation. The property follows from the standard redefinition of support on binned quantitative items (Srikant & Agrawal, 1996) and on generalized items via taxonomies (Srikant & Agrawal, 1995), both of which retain anti-monotonicity and therefore the same pruning. Because the manuscript is pedagogical rather than a new theoretical contribution, we did not re-derive these known facts in the abstract. We will revise by adding a concise pointer (or one-sentence justification) either in the abstract or, if length-constrained, in the introduction, together with the two foundational citations. This change will be incorporated in the next version.

    Revision: yes
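
The redefinitions of support that the rebuttal refers to can be written out briefly. This is a sketch in notation assumed for this note (T* for a transaction extended with taxonomy ancestors, r.A for the value of attribute A in record r), not a quotation from the paper or from the cited works.

```latex
% Generalized items: extend each transaction with all taxonomy ancestors.
T^{*} = T \cup \{\, a : a \text{ is an ancestor of some } i \in T \,\},
\qquad
\operatorname{supp}_{\mathrm{gen}}(X) = \frac{\lvert \{\, T \in D : X \subseteq T^{*} \,\} \rvert}{\lvert D \rvert}.

% Quantitative attributes: an item is a pair (A, [l, u]); a record r supports it
% iff l \le r.A \le u, and supports an itemset iff it supports every item in it.
\operatorname{supp}_{\mathrm{quant}}(X) =
\frac{\lvert \{\, r \in D : \forall (A,[l,u]) \in X,\; l \le r.A \le u \,\} \rvert}{\lvert D \rvert}.

% In both cases, for X \subseteq Y the supporters of Y are a subset of the
% supporters of X, hence
X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y).
```

In both settings a record must satisfy every item of an itemset to count toward its support, so enlarging the itemset can only shrink the set of supporting records. That is the anti-monotonicity the pruning step relies on.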

Circularity Check

0 steps flagged

No circularity: purely expository description of known algorithms

The paper is an expository survey of the Apriori algorithm and two standard extensions (quantitative attributes and item generalizations). It describes the downward-closure property and pruning behavior without introducing novel derivations, equations, fitted parameters, or self-citations that serve as load-bearing premises. All claims reduce to well-known prior literature rather than internal construction. No steps qualify under any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No new free parameters, axioms, or invented entities are introduced; the text is an exposition of existing algorithms.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

  1. [1]

    Mining association rules between sets of items in large databases,

    R. Agrawal, T. Imielinski, and A. N. Swami, “Mining association rules between sets of items in large databases,” in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 26-28, 1993, P. Buneman and S. Jajodia, Eds. ACM Press, 1993, pp. 207–216. [Online]. Available: https://doi.org/10.1145/170035.170072

  2. [2]

    Fast algorithms for mining association rules in large databases,

    R. Agrawal and R. Srikant, “Fast algorithms for mining association rules in large databases,” in VLDB’94, Proceedings of 20th International Conference on Very Large Data Bases, September 12-15, 1994, Santiago de Chile, Chile, J. B. Bocca, M. Jarke, and C. Zaniolo, Eds. Morgan Kaufmann, 1994, pp. 487–499. [Online]. Available: http://www.vldb.org/conf/1994...

  3. [3]

    Data mining: An overview from a database perspective,

    M. Chen, J. Han, and P. S. Yu, “Data mining: An overview from a database perspective,” IEEE Trans. Knowl. Data Eng., vol. 8, no. 6, pp. 866–883, 1996. [Online]. Available: https://doi.org/10.1109/69.553155

  4. [4]

    Beyond market baskets: Generalizing association rules to correlations,

    S. Brin, R. Motwani, and C. Silverstein, “Beyond market baskets: Generalizing association rules to correlations,” in SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA, J. Peckham, Ed. ACM Press, 1997, pp. 265–276. [Online]. Available: https://doi.org/10.1145/253260.253327

  5. [5]

    Interestingness measures for data mining: A survey,

    L. Geng and H. J. Hamilton, “Interestingness measures for data mining: A survey,” ACM Comput. Surv., vol. 38, no. 3, p. 9, 2006. [Online]. Available: https://doi.org/10.1145/1132960.1132963

  6. [6]

    Discovering frequent closed itemsets for association rules,

    N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, “Discovering frequent closed itemsets for association rules,” in Database Theory - ICDT ’99, 7th International Conference, Jerusalem, Israel, January 10-12, 1999, Proceedings, ser. Lecture Notes in Computer Science, C. Beeri and P. Buneman, Eds., vol. 1540. Springer, 1999, pp. 398–416. [Online]. Availab...

  7. [7]

    Mining quantitative association rules in large relational tables,

    R. Srikant and R. Agrawal, “Mining quantitative association rules in large relational tables,” in Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, June 4-6, 1996, H. V. Jagadish and I. S. Mumick, Eds. ACM Press, 1996, pp. 1–12. [Online]. Available: https://doi.org/10.1145/233269.233311

  8. [8]

    Improving the quality of association rules by preprocessing numerical data,

    M. Moreno García, S. Segrera, V. Batista, and M. Jose, “Improving the quality of association rules by preprocessing numerical data,” May 2019

  9. [9]

    Improving association rule mining using clustering-based discretization of numerical data,

    S. C. Tan, “Improving association rule mining using clustering-based discretization of numerical data,” in 2018 International Conference on Intelligent and Innovative Computing Applications (ICONIC), Dec 2018, pp. 1–5

  10. [10]

    Fuzzy association rules: An implementation in r,

    L. Helm, “Fuzzy association rules: An implementation in r,” Aug 2007. [Online]. Available: https://michael.hahsler.net/students/stud/done/helm/fuzzy_AR_helm.pdf

  11. [11]

    A novel unsupervised fuzzy clustering method for preprocessing of quantitative attributes in association rule mining,

    B. Thomas and G. Raju, “A novel unsupervised fuzzy clustering method for preprocessing of quantitative attributes in association rule mining,” Information Technology and Management, vol. 15, no. 1, pp. 9–17. [Online]. Available: https://doi.org/10.1007/s10799-013-0168-7

  13. [13]

    Mining generalized association rules,

    R. Srikant and R. Agrawal, “Mining generalized association rules,” in VLDB’95, Proceedings of 21th International Conference on Very Large Data Bases, September 11-15, 1995, Zurich, Switzerland, U. Dayal, P. M. D. Gray, and S. Nishio, Eds. Morgan Kaufmann, 1995, pp. 407–419. [Online]. Available: http://www.vldb.org/conf/1995/P407.PDF

  14. [14]

    Using association rules to solve the cold-start problem in recommender systems,

    G. Shaw, Y. Xu, and S. Geva, “Using association rules to solve the cold-start problem in recommender systems,” in Advances in Knowledge Discovery and Data Mining, 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010. Proceedings. Part I, ser. Lecture Notes in Computer Science, M. J. Zaki, J. X. Yu, B. Ravindran, and V. Pudi, Eds....

  15. [15]

    Finding influential users in social media using association rule learning,

    F. Erlandsson, P. Bródka, A. Borg, and H. Johnson, “Finding influential users in social media using association rule learning,” Entropy, vol. 18, no. 5, p. 164, 2016. [Online]. Available: https://doi.org/10.3390/e18050164

  16. [16]

    Objective-oriented utility-based association mining,

    Y. Shen, Z. Zhang, and Q. Yang, “Objective-oriented utility-based association mining,” in Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 9-12 December 2002, Maebashi City, Japan. IEEE Computer Society, 2002, pp. 426–433. [Online]. Available: https://doi.org/10.1109/ICDM.2002.1183938