Formalization of the generalized Pareto principle and structural typicality of the 20/80-rule

arxiv: 2602.11131 · v2 · pith:VATU4PQH · submitted 2026-02-11 · ⚛️ physics.soc-ph · math.ST · stat.TH

Antti Hippeläinen

Pith reviewed 2026-05-16 05:36 UTC · model grok-4.3

keywords: Pareto principle, 20/80 rule, Kolkata index, Lorenz curve, truncated distributions, finite-sample effects, gain densities, decreasing rearrangement

The pith

The generalized Pareto principle, where a fraction p of inputs produces a fraction 1-p of outputs, emerges structurally from truncated exponential and normal distributions for sample sizes between 100 and 100000, concentrating near the 20/

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formalizes the generalized Pareto principle as the property that a fraction p of the largest inputs accounts for a fraction 1-p of the total output, obtained uniquely via the decreasing rearrangement of a non-negative gain density. For probability distributions this p equals one minus the Kolkata index of the associated Lorenz curve. Closed-form expressions are derived for p in the truncated power-law, exponential, and normal families. When these expressions are paired with estimates of the truncation cutoff as a function of sample size N, the resulting predictions show that p for exponential data falls in [0.15, 0.26] and for normal data in [0.20, 0.29] when N lies between 10^2 and 10^5.
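
The link between p and the Kolkata index stated above can be spelled out in a few lines. The sketch below assumes that the cumulative gain function L* of the rearranged density and the Lorenz curve L_F are related by reflection, L_F(u) = 1 − L*(1 − u). That identification and the helper symbol g are notation introduced here for illustration, not quoted from the paper.

```latex
% \ell^{*}: decreasing rearrangement of the gain density \ell on [0,1]
\begin{align}
  L^{*}(t) &= \int_{0}^{t} \ell^{*}(s)\,\mathrm{d}s, \qquad L^{*}(0) = 0,\quad L^{*}(1) = 1, \\
  g(t) &= L^{*}(t) - 1 + t, \qquad g(0) = -1 < 0 < 1 = g(1), \\
  L^{*}(p) &= 1 - p \iff g(p) = 0 .
\end{align}
% g is continuous and strictly increasing, so by the IVT the root p exists and is unique.
% Assumed reflection between the cumulative gain function and the Lorenz curve:
\begin{align}
  L_F(u) &= 1 - L^{*}(1 - u)
  \;\Longrightarrow\; L_F(1 - p) = 1 - L^{*}(p) = p , \\
  L_F(k_F) &= 1 - k_F \quad \text{(Kolkata index)}
  \;\Longrightarrow\; k_F = 1 - p .
\end{align}
```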

Core claim

The central claim is that the imbalance parameter p defined by the generalized Pareto principle is a direct, calculable consequence of the decreasing rearrangement applied to truncated common distributions; when finite-sample truncation is taken into account, p for both exponential and normal families concentrates in narrow intervals around the canonical 0.2 value for realistic dataset sizes, remaining strictly below the infinite-sample saturation conjectured earlier.

What carries the argument

The decreasing rearrangement of the gain density ℓ, which produces a unique p satisfying the integral condition that the rearranged density over [0,p] equals 1-p.
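
As a numerical illustration of this mechanism, the sketch below approximates p for an arbitrary gain density on [0, 1]. It is not the paper's procedure: the function name `pareto_fraction`, the uniform midpoint grid, and the test density ℓ(t) = 2t are all assumptions chosen for the example. On a uniform grid, sorting the sampled values from largest to smallest is a discrete stand-in for the decreasing rearrangement.

```python
def pareto_fraction(density, n=100_000):
    """Approximate the p with L*(p) = 1 - p for a gain density on [0, 1].

    Hypothetical helper: `density` is any non-negative callable on [0, 1];
    it does not need to be normalised.
    """
    # Midpoint samples of the density on a uniform grid of n cells.
    values = [density((i + 0.5) / n) for i in range(n)]
    if min(values) < 0:
        raise ValueError("gain density must be non-negative")
    total = sum(values)
    # Discrete decreasing rearrangement: order the cells from largest to smallest.
    values.sort(reverse=True)
    cumulative = 0.0
    for i, v in enumerate(values, start=1):
        cumulative += v / total          # L*(i / n), normalised to unit mass
        if cumulative >= 1.0 - i / n:    # first grid point where L*(p) >= 1 - p
            return i / n
    return 1.0


# Example density (an assumption for illustration): l(t) = 2t, whose
# rearrangement is 2(1 - t), so L*(p) = 2p - p^2 and p = (3 - sqrt(5)) / 2.
p_linear = pareto_fraction(lambda t: 2.0 * t)
print(f"p for l(t) = 2t: {p_linear:.4f}")
```

The grid resolution n controls the accuracy; the returned value is the first grid point at which the rearranged cumulative gain meets 1 − p.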

If this is right

For exponential distributions of size N between 100 and 100000, p is predicted to lie between 0.15 and 0.26.
For normal distributions of the same sizes, p is predicted to lie between 0.20 and 0.29.
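Expressed through the Kolkata index k = 1 − p, both ranges (k between 0.74 and 0.85 for exponential data, and between 0.71 and 0.80 for normal data) lie strictly below the saturation value k ≈ 0.865 conjectured for infinite samples.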
The structural appearance of such imbalances in standard distributions implies that Pareto-type imbalances arise without requiring special generative mechanisms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same truncation-plus-rearrangement mechanism could be applied to log-normal or other commonly observed families to check whether they also produce p near 0.2 at realistic N.
If the finite-sample effect dominates, many empirical 20/80 observations in social or economic data may be explained by ordinary sampling from standard distributions rather than by domain-specific processes.
The framework supplies a quantitative way to test whether a given dataset's imbalance is typical or atypical for its size and distribution family.

Load-bearing premise

The estimates of the truncation parameter as a function of sample size N are accurate enough to be combined with the closed-form expressions for p.
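
To make this premise concrete, here is a minimal sketch of how a cutoff estimate could be combined with a closed form for the exponential family. Two things are assumed rather than taken from the paper: the form h(Λ; p) = (1 − e^(−Λp)) / (1 − e^(−Λ)) − 1 + p, written by analogy with the normal-family expression and consistent with the limits −1 + 2p and p quoted for h, and the cutoff Λ(N) = ln N + γ, the standard large-N approximation to the expected maximum of N unit-rate exponential draws. The names `truncated_exp_h`, `solve_p` and `p_exponential` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def truncated_exp_h(lam, p):
    """Assumed form: h(Lambda; p) = (1 - exp(-Lambda p)) / (1 - exp(-Lambda)) - 1 + p."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return (-math.expm1(-lam * p)) / (-math.expm1(-lam)) - 1.0 + p


def solve_p(h, tol=1e-12):
    """Bisection for the root of an increasing function h on [0, 1/2]."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_exponential(n_samples):
    # Hypothetical cutoff model: expected sample maximum ~ ln N + gamma.
    lam = math.log(n_samples) + EULER_GAMMA
    return solve_p(lambda p: truncated_exp_h(lam, p))


for n in (10**2, 10**3, 10**4, 10**5):
    print(f"N = {n:>6}: p = {p_exponential(n):.3f}")
```

Under these assumptions the printed values for N = 10² and N = 10⁵ can be compared directly with the interval [0.15, 0.26] reported for exponential data; a different cutoff model would shift them.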

What would settle it

Measuring p directly on large numbers of synthetic datasets of size N=1000 drawn from truncated exponential or normal distributions and finding that the observed values fall consistently outside the predicted intervals [0.15,0.26] or [0.20,0.29] would falsify the finite-sample concentration claim.
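
A check of this kind could be sketched as follows. The empirical definition of p used here is an assumption, not the paper's stated protocol: the observed range [0, max] is treated as the input axis, and p is the fraction of that range, counted from zero, that contains a fraction 1 − p of the samples. The function names, the 200 replications and the fixed seed are likewise illustrative choices.

```python
import bisect
import random


def empirical_p(samples):
    """Empirical p: the lowest fraction p of the range [0, max] holds a fraction 1 - p of the samples."""
    xs = sorted(samples)
    n, x_max = len(xs), xs[-1]
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        # Fraction of samples at or below the point mid * x_max.
        frac_below = bisect.bisect_right(xs, mid * x_max) / n
        if frac_below < 1.0 - mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate(n_samples=1000, replications=200, seed=0):
    """Draw `replications` unit-rate exponential datasets and measure p on each."""
    rng = random.Random(seed)
    return [
        empirical_p([rng.expovariate(1.0) for _ in range(n_samples)])
        for _ in range(replications)
    ]


ps = simulate()
inside = sum(0.15 <= p <= 0.26 for p in ps) / len(ps)
print(f"share of replications with p in [0.15, 0.26]: {inside:.2f}")
```

Because the sample maximum fluctuates from one dataset to the next, the measured p scatters; the share of replications inside the predicted interval is the quantity such a test would examine.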

Figures

Figures reproduced from arXiv: 2602.11131 by Antti Hippeläinen.

**Figure 2.** Minimal inequality index or ratio of gain densities a distribution must have to satisfy a …

**Figure 3.** An L¹-integrable divergent density with its decreasing rearrangement, and their related cumulative gain functions. As must be, for all t ∈ [0, 1], L∗(t) ≥ L(t). Remembering the requirement of no padding by zeros, define a distribution of periodic diminishing returns with essential support in [0, 1]. To make finding the rearrangement …

**Figure 4.** A periodic gain density with its decreasing rearrangement, and their related cumulative …

**Figure 5.** Polynomial densities with α = 1/4, 1, 3 and 10, and their related cumulative gain functions. A natural question arises: can we always find an α ∈ [0, ∞) such that any given generalized principle is satisfied? That is, can we always find an α such that $\frac{p\,(1 - p^{\alpha} + \alpha)}{\alpha} = 1 - p$ (16) for any p ∈ (0, 1/2]? On one hand, at the limit α → ∞ we obtain the uniform distribution, for which p → 1/2. On the other h…

**Figure 6.** Power-law densities with ratio and scale combinations …

**Figure 7.** The maximal value of p with which the generalized Pareto principle can be satisfied by varying α with a fixed r. The canonical 0.2/0.8-principle becomes impossible with a pure power-law after r ≈ 3100. 3.2.2. Exponential distribution: Any process where decay has the same probability of occurring at any moment in time follows an exponential law. For a given rate λ > 0, the truncated exponential distribution …

**Figure 8.** Exponential densities with Λ = 0.1, 1, 4 and 10, and their related cumulative gain functions. The singularity at Λ = 0 is removable, and for all Λ > 0, h is continuous. The limits are $\lim_{\Lambda \to 0} h(\Lambda; p) = -1 + 2p \le 0$ and $\lim_{\Lambda \to \infty} h(\Lambda; p) = p > 0$, (37) and by IVT, there exists at least one value of Λ with which any generalized principle can be satisfied. 3.2.3. Normal distribution: Finally, the normal distributi…

**Figure 9.** Normal densities with Σ = 1, 3, 5 and 10, and their related cumulative gain functions. Analogous to previous cases, set $h(\Sigma; p) = \frac{\operatorname{erf}(\Sigma p)}{\operatorname{erf}(\Sigma)} - 1 + p$. (41) The singularity at Σ = 0 is removable, and h is continuous for all Σ > 0. Studying the limits, we again find $\lim_{\Sigma \to 0} h(\Sigma; p) = -1 + 2p \le 0$ and $\lim_{\Sigma \to \infty} h(\Sigma; p) = p > 0$, (42) and by IVT, there exists at least one value of Σ with which any generalized p…

**Figure 10.** The generalized Pareto principles satisfied by truncated power-laws on common param…
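
Equation (41) in the Figure 9 excerpt gives h(Σ; p) for the truncated normal family, and the generalized principle satisfied at a given Σ is the root of h in p. The sketch below finds that root by bisection for the Σ values named in the caption. The function names and the choice of bisection are assumptions made for illustration; only the formula for h comes from the excerpt.

```python
import math


def truncated_normal_h(sigma, p):
    """Eq. (41): h(Sigma; p) = erf(Sigma * p) / erf(Sigma) - 1 + p, for Sigma > 0."""
    return math.erf(sigma * p) / math.erf(sigma) - 1.0 + p


def p_normal(sigma, tol=1e-12):
    """Bisection for the root of h(Sigma; .) on [0, 1/2]."""
    # h(Sigma; 0) = -1 < 0, and h(Sigma; 1/2) >= 0 because erf is concave on [0, inf).
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_normal_h(sigma, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Sigma values taken from the Figure 9 caption.
roots = {sigma: p_normal(sigma) for sigma in (1, 3, 5, 10)}
for sigma, p in roots.items():
    print(f"Sigma = {sigma:>2}: p = {p:.4f}")
```

Larger Σ means a wider truncation window relative to the scale of the distribution, and the root p shrinks accordingly.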

Original abstract

We formalize a generalized form of the Pareto principle - ``fraction $p$ of inputs yields fraction $1-p$ of outputs'' - as a property of non-negative gain densities $\ell \in L^1([0,1])$, working with the decreasing rearrangement to obtain a unique characterization. For probability distributions, the resulting $p$ coincides with $1 - k_F$, where $k_F$ is the Kolkata index of the corresponding Lorenz curve. Within this framework we analyze both constructed gain densities and commonly encountered distribution families. We derive closed-form expressions for $p$ for truncated power-law, exponential, and normal distribution families. Combining these with estimates of the truncation parameter as a function of sample size $N$, we predict that datasets of size $N \in [10^2, 10^5]$ from exponential and normal families concentrate $p$ near $[0.15, 0.26]$ and $[0.20, 0.29]$ - values close to the canonical 0.2/0.8-rule, and strictly below the saturation $k \approx 0.865$ conjectured earlier by Ghosh and Chakrabarti. We discuss the implications of the structural ubiquity of Pareto-type imbalances for their use as prescriptive targets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Formalization of generalized Pareto via rearrangements is new and clean, but finite-N claims need the truncation details to hold up.

The punchline is that this paper offers a fresh mathematical framing of the generalized Pareto principle using decreasing rearrangements of gain densities, which cleanly characterizes p and connects it to the Kolkata index. They derive closed forms for several common distribution families and use that to argue for structural reasons why the 20/80 rule appears often.

What works well is the self-contained formal part. By working with the rearrangement, they get a unique p for each density, and the link to Lorenz curves and Kolkata index is a nice tie-in to existing work. The closed-form expressions for truncated versions of power-law, exponential, and normal distributions are concrete and allow direct calculation. This gives a structural explanation for the ubiquity of such imbalances without relying on specific mechanisms.

The soft spots are around the finite-sample predictions. The claims that p concentrates near 0.15-0.26 for exponential and 0.20-0.29 for normal at N from 100 to 100,000 come from plugging in estimates of the truncation parameter as a function of N. Without seeing how those estimates are obtained (whether from theory, simulation, or fitting), the intervals are hard to assess for robustness. The abstract mentions combining closed forms with these estimates, but the derivation and validation steps are not visible, which makes the concentration predictions the load-bearing but least transparent part.

Overall, the work is for people in social physics, econophysics, or applied stats who want a mathematical baseline for Pareto imbalances. It shows clear thinking on the formal side and engages with prior conjectures like the saturation k from Ghosh and Chakrabarti. The central argument holds up on the formalization, though the predictions would benefit from more detail. I think it deserves a serious referee to go through the math and check if the truncation handling is solid.
Recommendation: send to peer review but require the full details on the N-dependent truncation estimates and any reproducibility materials.

Referee Report

1 major / 2 minor

Summary. The paper formalizes the generalized Pareto principle ('fraction p of inputs yields fraction 1-p of outputs') as a property of non-negative gain densities ℓ in L¹([0,1]) via decreasing rearrangement, yielding a unique characterization. For probability distributions this p equals 1 - k_F, the complement of the Kolkata index of the Lorenz curve. Closed-form expressions for p are derived for truncated power-law, exponential, and normal families. These are combined with estimates of the truncation parameter as a function of sample size N to predict that datasets with N ∈ [10², 10⁵] from exponential and normal families concentrate p near [0.15, 0.26] and [0.20, 0.29] respectively, values close to the canonical 20/80 rule and below the saturation k ≈ 0.865 conjectured by Ghosh and Chakrabarti. Implications for prescriptive use of the principle are discussed.

Significance. If the finite-N predictions hold after proper justification of the truncation estimates, the manuscript supplies a structural, distribution-family-independent explanation for the frequent appearance of Pareto-type imbalances, thereby accounting for the typicality of the 20/80 rule in data drawn from common continuous distributions. The closed-form derivations for the truncated families constitute a clear technical contribution that could be reused in other contexts.

major comments (1)

[Abstract and finite-sample prediction section] The headline numerical claims (that p concentrates in [0.15, 0.26] for exponential and [0.20, 0.29] for normal families when N ∈ [10², 10⁵]) are obtained only after substituting separate estimates of the truncation cutoff, as a function of N, into the closed-form expressions. No derivation, simulation protocol, error analysis, or validation of these N-dependent estimates appears in the manuscript, so the reported intervals cannot be reproduced or stress-tested.

minor comments (2)

[Section 2] The definition of the decreasing rearrangement and its application to the gain density ℓ should be stated explicitly with an equation number in the main text rather than left implicit.
[Section 4] Notation for the truncation parameter (e.g., its symbol and dependence on N) is introduced only in the abstract and should be defined consistently in the body before the finite-N predictions are presented.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We appreciate the positive assessment of the formalization, the closed-form derivations, and the potential significance of the finite-N predictions. We address the single major comment below and will revise the manuscript to incorporate the requested justification.

Referee: [Abstract and finite-sample prediction section] The headline numerical claims (that p concentrates in [0.15, 0.26] for exponential and [0.20, 0.29] for normal families when N ∈ [10², 10⁵]) are obtained only after substituting separate estimates of the truncation cutoff, as a function of N, into the closed-form expressions. No derivation, simulation protocol, error analysis, or validation of these N-dependent estimates appears in the manuscript, so the reported intervals cannot be reproduced or stress-tested.

Authors: We agree that the truncation estimates require explicit derivation, a simulation protocol, and validation to support reproducibility of the headline intervals. In the revised manuscript we will add a dedicated subsection to the finite-sample prediction section that (i) derives the N-dependent truncation cutoff from the expected value of the sample maximum for the exponential and normal families using standard order-statistic results, (ii) specifies the Monte Carlo protocol (10,000 replications per N) used to obtain the estimates, and (iii) supplies error bounds and concentration diagnostics confirming that p remains inside the stated intervals for N ∈ [10², 10⁵]. These additions will make the numerical claims fully reproducible and stress-testable while preserving the original closed-form expressions for p.
revision: yes

Circularity Check

1 step flagged

Finite-N predictions combine closed forms with auxiliary estimates of truncation parameter

specific steps

fitted input called prediction [Abstract]
"Combining these with estimates of the truncation parameter as a function of sample size N, we predict that datasets of size N ∈ [10^2, 10^5] from exponential and normal families concentrate p near [0.15, 0.26] and [0.20, 0.29]"

The paper derives closed-form expressions for p from the generalized Pareto principle, then obtains the headline finite-sample intervals only after inserting externally estimated values of the truncation cutoff (as a function of N). Because those estimates are auxiliary inputs rather than outputs of the formalization, the reported 'predictions' reduce to the closed forms evaluated at fitted truncation values; the numerical concentration near the 20/80 rule is therefore conditioned on the estimates rather than emerging solely from the rearrangement characterization.

full rationale

The core formalization of the generalized Pareto principle via decreasing rearrangement and the closed-form derivations for p in truncated families are self-contained and independent. The load-bearing numerical claims for finite N, however, are produced only by inserting separately estimated truncation parameters, as a function of N, into those closed forms. This matches the fitted-input-called-prediction pattern at moderate strength because the interval predictions [0.15, 0.26] and [0.20, 0.29] are fixed once the N-dependent estimates are supplied, yet the estimates themselves are not derived from the formalization. No self-citation chain or self-definitional loop is present, so the overall circularity remains limited and the central result retains independent content.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The framework rests on standard rearrangement theory for L1 functions and the definition that p coincides with 1 - k_F; the finite-sample predictions additionally rest on an externally estimated truncation parameter whose functional form is not derived inside the paper.

free parameters (1)

truncation parameter
Estimated as a function of sample size N and combined with closed-form expressions to obtain the reported concentration intervals for p.

axioms (2)

standard math: Decreasing rearrangement yields a unique characterization of the generalized Pareto property for non-negative gain densities in L¹([0,1])
Invoked to obtain the unique p for any such density.
domain assumption: For probability distributions, p coincides with 1 - k_F where k_F is the Kolkata index of the Lorenz curve
Stated as part of the framework linking the new definition to existing inequality measures.

pith-pipeline@v0.9.0 · 5532 in / 1528 out tokens · 35963 ms · 2026-05-16T05:36:52.660618+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: Definition 1. ... L∗(p)=1−p

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: Theorem 1 ... G(t)=L(t)−1+kt ... IVT

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages · 3 internal anchors

[1] 20/80–rule
INTRODUCTION The so-called Pareto principle or “20/80–rule” is among the most widely quoted heuristics in economics, management, and cognitive science. It states that 20% of causes result in 80% of effects. Originally formulated by Vilfredo Pareto [1], it was an empirical observation about the distribution of wealth, later generalized to domains as divers...

[2] fraction p of inputs yields fraction 1−p of outputs
FORMALIZATION We model bounded cumulative processes with a non-negative gain density $\ell : I_t \to [0, \infty)$, $\int_{I_t} dt\, \ell(t) = 1$, (1) where $I_t = [t_{\min}, t_{\max}]$ is a compact (closed and bounded) interval. Since total gains are assumed to be finite, we require $\ell \in L^1(I_t)$, and that ℓ is normalized to unity. In fact, if either the domain of input or the total output were not fin...

[3] inequality
EXAMPLE DISTRIBUTIONS AND EXISTENCE OF GENERALIZED PRINCIPLES We now examine gain density examples to illustrate how the generalized principle emerges in diverse functional forms. These cases demonstrate, in addition to the framework, the lim- … (Footnote 1: In fact, historically, Pareto himself was aware of the possibility of a probabilistic interpretation, but dec...)

[4] In such a simple case, the decreasing rearrangement is achieved by shifting the right-hand side of the distribution to start from zero and by sending t → t/2, so together, t → t/2 + 1…

[5] This can be thought of as the continuous version of doubling the length of every bin. Note that shifting the divergence to zero and re-normalizing is not in general equivalent to the decreasing rearrangement; as the rearrangement is done correctly, the density profile stays normalized. The rearranged distribution is $\ell^*(t) = \ell\left(\frac{t}{2} + \frac{1}{2}\right) = \frac{1}{2\sqrt{t}}$. (12) The ...

[6] cut off its tail
COMMON GENERALIZED PRINCIPLES AND SOCIAL DISCUSSION With the generalized Pareto principle formalized and explicit density functions analyzed, we now turn to two questions. First, given common distributions and realistic parameter ranges, which p/(1−p)-principles should one expect to observe in practice? Second, if such asymmetries are in large part structu...

[7] Eureka moments
CONCLUSIONS We have formalized and studied the existence of the (generalized) Pareto principle or the p/(1−p)-principle. The principle was formalized in Section 2 with the decreasing rearrangement of a density function ℓ(t) describing the density of gains on the unit interval. This allowed us to define the satisfied generalized principle unambiguously as ...

[8] padding by zeros
Padding by zeros Another limitation we must impose is on the length of intervals where ℓ(t) = 0, that is, on the support of ℓ(t). A continuous family of generalized principles is trivial to satisfy if one allows such “padding by zeros”. The possibility of padding by zeros does seem natural, since there can for example be periods of various lengths when no g...

[9] negative learning
Negative gains In principle, one could consider the possibility of negative gains as well. The question seems to be mostly semantic; take as an example the case of learning: is forgetting something you have learned “negative learning”, or is “forgetting” a separate process from learning? Conversely, is obtaining such forgotten information “learning again”...

[10] Define the function evaluating the total mass inside an interval of length p with $M(s) = \int_{s}^{s+p} dt\, \ell^*(t)$, (B7) with s ∈ [0, 1−p]. M is continuous and $M(0) = p_1$, $M(1-p) = p_2$
Since ℓ∗ is decreasing and positive, $\int_0^p dt\, \ell^*(t) = p_1 \ge 1 - p^*$, $\int_{1-p}^{1} dt\, \ell^*(t) = p_2 \le p^*$. (B5) Hence, we find that integrating over an interval of length p will result in a total mass $p_1 \ge 1 - p^* > p^* \ge p_2$. (B6) Since p > p∗, we would like to find an interval with mass 1−p such that $1 - p^* > 1 - p > p^*$. Define the function evaluating the total mass inside an ...
[11] V. Pareto, Cours d’Économie Politique, The Economic Journal 7, 91 (1897)

[12] J. Nielsen, The 90-9-1 Rule for Participation Inequality in Social Media and Online Communities (2006), accessed 2026-01-25

[13] G. Zipf, Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology, Social Forces 28, 340 (1950)

[14] A. Lotka, The frequency distribution of scientific productivity, Journal of the Washington Academy of Sciences 16, 317 (1926)

[15] R. Merton, The Matthew Effect in Science, Science 159, 56 (1968)

[16] M. Newman, Power laws, Pareto distributions and Zipf’s law, Contemporary Physics 46, 323 (2005), cond-mat/0412004

[17] H. Simon, A behavioral model of rational choice, The Quarterly Journal of Economics 69, 99 (1955)

[18] G. Tusset, Pareto and probability distributions, International Review of Economics 71, 521 (2024)

[19] M. Hardy, Pareto’s Law, Math. Intell. 32, 38 (2010)

[20] A. Clauset, C. Shalizi, and M. Newman, Power-Law Distributions in Empirical Data, SIAM Review 51, 661 (2009), 0706.1062

[21] B. Arnold, N. Balakrishnan, and H. Nagaraja, A First Course in Order Statistics (Classics in Applied Mathematics) (Society for Industrial and Applied Mathematics, 2008)

[22] E. Limpert, W. Stahel, and M. Abbt, Log-Normal Distributions Across the Sciences: Keys and Clues, BioScience 51, 341 (2001)

[23] A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science 286, 509 (1999), cond-mat/9910332

[24] A. Ghosh, N. Chattopadhyay, and B. Chakrabarti, Inequality in societies, academic institutions and science journals: Gini and k-indices, Physica A: Statistical Mechanics and its Applications 410, 30 (2014), 1401.6951

[25] S. Banerjee, B. Chakrabarti, M. Mitra, and S. Mutuswami, Inequality Measures: The Kolkata Index in Comparison With Other Measures, Frontiers in Physics 8, 10.3389/fphy.2020.562182 (2020), 2005.08762

[26] P. Halmos, Measure Theory (Springer, 1974)

[27] A. Kechris, Classical Descriptive Set Theory (Springer New York, 1995)

[28] E. Lieb and M. Loss, Analysis, Graduate Studies in Mathematics, Vol. 14 (American Mathematical Society, 2001)
