
arxiv: 1906.10686 · v2 · pith:GBMSNVC7new · submitted 2019-06-26 · 💻 cs.OH

Wise Data: A Novel Approach in Data Science from a Network Science Perspective

Pith reviewed 2026-05-25 15:23 UTC · model grok-4.3

classification 💻 cs.OH
keywords: data modeling, network science, data quality, uniqueness, reference frequency, data networks, WizQuestions

The pith

Modeling data as a network measures quality and reveals what makes some pieces unique and frequently referenced.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper poses three questions about why certain data gets referenced more often than others and what traits make those pieces stand out. It introduces a model that treats the collection of data as a network to address those questions from combined data science and network science angles. The approach claims to let users quantify data quality and examine the data network in detail. A reader would care if the network view surfaces reference patterns and uniqueness traits that standard data handling misses.

Core claim

Our proposed approach enables us to model the data (as a network), measure the quality of data, and study the network of data deeply and thoroughly.

What carries the argument

The network representation of data that links pieces by reference frequency to expose uniqueness characteristics.

If this is right

  • Users can answer why some data is referenced more by examining network link patterns.
  • Data quality becomes quantifiable through properties of the constructed data network.
  • Thorough analysis of data becomes possible by applying network science tools to the model.
  • The three WizQuestions about reference habits and uniqueness receive direct treatment via the network view.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The model could be tested by building networks from citation or access logs and checking whether high-degree nodes align with human judgments of data importance.
  • If the network view works, it might extend to dynamic datasets where links update over time to track shifting uniqueness.
  • One open connection is whether the same network construction could flag low-quality or redundant entries for removal.
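
The first extension above can be sketched in a few lines. This is an illustration only, not the paper's method: the paper supplies no network construction, so the log format (pairs of `(source, target)` references), the helper names `build_reference_network` and `rank_by_in_degree`, and the choice of in-degree as the importance signal are all assumptions made here.

```python
from collections import defaultdict


def build_reference_network(reference_log):
    """Build a directed graph from (source, target) reference pairs.

    Hypothetical helper: returns a dict mapping each node to the set of
    distinct nodes that reference it (its in-neighbours).
    """
    in_neighbours = defaultdict(set)
    for source, target in reference_log:
        in_neighbours[target].add(source)
        in_neighbours[source]  # touch the key so pure sources are nodes too
    return dict(in_neighbours)


def rank_by_in_degree(in_neighbours):
    """Return nodes sorted by in-degree (descending), ties broken by name."""
    return sorted(in_neighbours, key=lambda n: (-len(in_neighbours[n]), n))


# Toy log, invented for illustration only.
log = [("a", "d"), ("b", "d"), ("c", "d"), ("a", "b"), ("c", "b"), ("d", "a")]
network = build_reference_network(log)
ranking = rank_by_in_degree(network)
print(ranking)
```

The resulting ranking would then be compared against an independent human ordering of the same items; that comparison step is not shown because it depends on how the judgments are collected.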

Load-bearing premise

Framing data as a network will reveal characteristics of uniqueness and frequency that are not already captured by existing data-quality or graph-based methods.

What would settle it

A side-by-side test on the same dataset showing that network-derived quality scores and uniqueness rankings match or add nothing beyond results from conventional frequency counts or standard graph metrics.
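
One way such a side-by-side test might look is sketched below, under stated assumptions: the reference log is a list of `(source, target)` pairs, the conventional baseline is the raw reference count, the network-derived score is in-degree (number of distinct referrers), and agreement is measured with Kendall's tau-a. None of these choices comes from the paper, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequency_scores(log):
    """Conventional baseline: how many times each item is referenced."""
    return Counter(target for _, target in log)


def distinct_referrer_scores(log):
    """Simple graph metric: in-degree, i.e. number of distinct referrers."""
    referrers = {}
    for source, target in log:
        referrers.setdefault(target, set()).add(source)
    return {node: len(sources) for node, sources in referrers.items()}


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a over shared keys; tied pairs count as neither."""
    nodes = sorted(set(scores_a) & set(scores_b))
    concordant = discordant = 0
    for x, y in combinations(nodes, 2):
        product = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    pairs = len(nodes) * (len(nodes) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


# Toy log, invented for illustration: "x" is referenced often by one source.
log = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y"), ("c", "y"), ("d", "z")]
tau = kendall_tau(frequency_scores(log), distinct_referrer_scores(log))
print(tau)
```

A tau close to 1 on real data would suggest the graph metric merely reproduces the frequency ranking; a clearly lower value would show only that the two rankings differ, not that the network ranking is better.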

Figures

Figures reproduced from arXiv: 1906.10686 by Mike Raeini (Mike WiseMan).

Figure 1: Data Visualization using the WizData Model
Figure 2: WizNet (network of wizwords) vs. BuzzNet (network of buzzwords)
Figure 3: Factfulness [16], Courtesy of Bill Gates, [Source: BusinessInsider]
Figure 4: The 80/20 Rule, Pareto Distribution and Power-Law Distribution [1]
Original abstract

Human beings have been generating data since very long times ago. We ask the following common-sense and wise questions (WizQuestions): 1. Why do we refer to some pieces of data more often than referring to other pieces? 2. What does make those commonly-referred pieces of data so unique and different? 3. What are the characteristics of data that sometimes make the data so unique and different? In this article, we introduce a novel approach (model) that helps us answer these questions from data science and network science perspectives. WizWordily speaking, our proposed approach enables us to model the data (as a network), measure the quality of data, and study the network of data deeply and thoroughly.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper poses three questions (WizQuestions) about why some data pieces are referred to more often than others, what makes them unique, and the characteristics driving uniqueness. It asserts that a novel 'Wise Data' approach, viewed from data science and network science perspectives, enables modeling data as a network, measuring data quality, and studying the network of data deeply and thoroughly.

Significance. If a concrete, validated network model with explicit quality metrics were supplied and shown to yield non-redundant insights on frequency and uniqueness, the work could offer an integrative framework linking data quality to network properties. No such model, metric, algorithm, or result is present, so significance cannot be assessed.

major comments (1)
  1. [Abstract] Abstract: the central claim that the proposed approach 'enables us to model the data (as a network), measure the quality of data, and study the network of data deeply and thoroughly' is unsupported; the manuscript supplies neither a network construction, a quality metric, nor any algorithm or example that would allow evaluation of whether the three WizQuestions are answered.
minor comments (1)
  1. The phrase 'WizWordily speaking' is unclear and appears to be either a typo or an undefined neologism; replace with standard phrasing.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and constructive feedback on our manuscript. We respond to the major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that the proposed approach 'enables us to model the data (as a network), measure the quality of data, and study the network of data deeply and thoroughly' is unsupported; the manuscript supplies neither a network construction, a quality metric, nor any algorithm or example that would allow evaluation of whether the three WizQuestions are answered.

    Authors: We acknowledge that the manuscript introduces the WizQuestions and outlines the Wise Data approach at a conceptual level, without supplying an explicit network construction method, concrete quality metrics, algorithms, or empirical examples. The paper's contribution is the framing of these questions from combined data science and network science perspectives, with the approach proposed for future elaboration. The referee correctly notes that the abstract overstates the current deliverables. We will revise the abstract to describe the work as a conceptual proposal rather than a completed model that enables the claimed modeling and measurement. This change will be made in the revised version.

    Revision: yes

Circularity Check

0 steps flagged

No derivation chain or equations present; circularity not applicable

Full rationale

The manuscript poses three high-level questions on data frequency and uniqueness, then asserts that modeling data as a network will answer them and enable quality measurement. No formal network construction, quality metric, algorithm, equations, or analytic result is supplied in the provided text. With no claimed derivation, no self-citations, and no reduction of any result to its inputs, there are no load-bearing steps that can be evaluated for circularity under the defined patterns. The paper is an unelaborated proposal rather than a derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no free parameters, axioms, or invented entities; the proposal is too high-level to populate the ledger.

pith-pipeline@v0.9.0 · 5647 in / 952 out tokens · 19219 ms · 2026-05-25T15:23:54.900233+00:00 · methodology


Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages

  1. [1] Barabási, A.L., et al.: Network science. Cambridge university press (2016)

  2. [2] Corry, L.: Nicolas bourbaki and the concept of mathematical structure. Synthese 92(3), 315–348 (1992)

  3. [3] Hegedus, S.J., Moreno-Armella, L.: The emergence of mathematical structures. Educational Studies in Mathematics 77(2-3), 369–388 (2011)

  4. [4] Juran, J.M.: The non-pareto principle; mea culpa. Quality Progress 8(5), 8–9 (1975)

  5. [5] Juran, J.M., Gryna, F.M., Bingham, R.S.: Quality control handbook, vol. 3. McGraw-Hill New York (1951)

  6. [6] Juran, J.M., et al.: Juran on quality by design: the new steps for planning quality into goods and services. Simon and Schuster (1992)

  7. [7] Katz, V.J.: The History of Mathematics: An Introduction. Pearson (2008)

  8. [8] Koch, R.: The 80/20 Principle: The secret of achieving more with less: Updated 20th anniversary edition of the productivity and business classic. Hachette UK (2011)

  9. [9] Koch, R.: The 80/20 principle and 92 other powerful laws of nature: the science of success. Hachette UK (2013)

  10. [10] Lipovetsky, S.: Pareto 80/20 law: derivation via random partitioning. International Journal of Mathematical Education in Science and Technology 40(2), 271–277 (2009)

  11. [11] Marshall, P.: 80/20 Sales and Marketing: The Definitive Guide to Working Less and Making More. entrepreneur Press (2013)

  12. [12] Merkle, R.C.: Method of providing digital signatures (Jan 5 1982), US Patent 4,309,569

  13. [13] Merton, R.K.: The matthew effect in science: The reward and communication systems of science are considered. Science 159(3810), 56–63 (1968)

  14. [14] Pareto, V.: Cours d'économie politique, vol. 1. Librairie Droz (1964)

  15. [15] Reed, I.S., Solomon, G.: Polynomial codes over certain finite fields. Journal of the society for industrial and applied mathematics 8(2), 300–304 (1960)

  16. [16] Rosling, H.: Factfulness. Flammarion (2019)

  17. [17] Sanders, R.: The pareto principle: its use and abuse. Journal of Services Marketing 1(2), 37–40 (1987)

  18. [18] Stross, R.E.: The Wizard of Menlo Park: How Thomas Alva Edison Invented the Modern World. Broadway Books (2008)

  19. [19] Welch, L.R., Berlekamp, E.R.: Error correction for algebraic block codes (Dec 30 1986), US Patent 4,633,470

  20. [20] Wilkinson, L.: Revising the pareto chart. The American Statistician 60(4), 332–334 (2006)

Note: Mike Raeini (Mike WiseMan) is also affiliated with the department of CEECS, Florida Atlantic University (FAU), Florida, United States