{"paper":{"title":"Formalization of the generalized Pareto principle and structural typicality of the 20/80-rule","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"The generalized Pareto principle, where a fraction p of inputs produces a fraction 1-p of outputs, emerges structurally from truncated exponential and normal distributions for sample sizes between 100 and 100000, concentrating near the 20/ ","cross_cats":["math.ST","stat.TH"],"primary_cat":"physics.soc-ph","authors_text":"Antti Hippeläinen","submitted_at":"2026-02-11T18:42:37Z","abstract_excerpt":"We formalize a generalized form of the Pareto principle - ``fraction $p$ of inputs yields fraction $1-p$ of outputs'' - as a property of non-negative gain densities $\\ell \\in L^1([0,1])$, working with the decreasing rearrangement to obtain a unique characterization. For probability distributions, the resulting $p$ coincides with $1 - k_F$, where $k_F$ is the Kolkata index of the corresponding Lorenz curve. Within this framework we analyze both constructed gain densities and commonly encountered distribution families. 
We derive closed-form expressions for $p$ for truncated power-law, exponentia"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"datasets of size N ∈ [10^2, 10^5] from exponential and normal families concentrate p near [0.15, 0.26] and [0.20, 0.29] - values close to the canonical 0.2/0.8-rule, and strictly below the saturation k ≈ 0.865 conjectured earlier by Ghosh and Chakrabarti.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The estimates of the truncation parameter as a function of sample size N that are combined with the closed-form expressions to produce the finite-sample predictions.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A formalization of the generalized Pareto principle derives that exponential and normal distributions with 100 to 100,000 samples produce p values near 0.2, close to the 80/20 rule and below prior saturation conjectures.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"The generalized Pareto principle, where a fraction p of inputs produces a fraction 1-p of outputs, emerges structurally from truncated exponential and normal distributions for sample sizes between 100 and 100000, concentrating near the 20/ ","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"f7b495274ab39026df8a33012b910dc1425c55d8dd0b9cd99ad90b31e9eb156a"},"source":{"id":"2602.11131","kind":"arxiv","version":2},"verdict":{"id":"91359b83-11a8-41c7-97a1-2dacb8d412ce","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T05:36:52.660618Z","strongest_claim":"datasets of size N ∈ [10^2, 10^5] from exponential and normal families concentrate p near [0.15, 0.26] and [0.20, 0.29] - values close 
to the canonical 0.2/0.8-rule, and strictly below the saturation k ≈ 0.865 conjectured earlier by Ghosh and Chakrabarti.","one_line_summary":"A formalization of the generalized Pareto principle derives that exponential and normal distributions with 100 to 100,000 samples produce p values near 0.2, close to the 80/20 rule and below prior saturation conjectures.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The estimates of the truncation parameter as a function of sample size N that are combined with the closed-form expressions to produce the finite-sample predictions.","pith_extraction_headline":"The generalized Pareto principle, where a fraction p of inputs produces a fraction 1-p of outputs, emerges structurally from truncated exponential and normal distributions for sample sizes between 100 and 100000, concentrating near the 20/ "},"references":{"count":28,"sample":[{"doi":"","year":null,"title":"INTRODUCTION The so-called Pareto principle or “20/80–rule” is among the most widely quoted heuristics in economics, management, and cognitive science. It states that 20% of causes result in 80% of ef","work_id":"1ec79be7-0d61-4077-91f8-d325abeb45a0","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"fraction p of inputs yields fraction 1−p of outputs","work_id":"62e5f016-7c08-4831-ba9d-c4c7c7e28fe9","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"EXAMPLE DISTRIBUTIONS AND EXISTENCE OF GENERALIZED PRINCIPLES We now examine gain density examples to illustrate how the generalized principle emerges in diverse functional forms. 
These cases demons","work_id":"59e433ea-09b9-4b98-9076-a1dd0ce7fe3d","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"In such a simple case, the decreasing rearrangement is achieved by shifting the right-hand side of the distribution to start from zero and by sendingt→t/2, so together,t→ t 2 + 1","work_id":"d76d3983-61d7-4733-a28f-2090eade1b08","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"This can be thought of as the continuous version of doubling the length of every bin. Note that shifting the divergence to zero and re-normalizing is not in general equivalent to the decreasing rearra","work_id":"9af0340b-aafa-4a56-ae5a-0bc13646b13a","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":28,"snapshot_sha256":"36606cd5db229cc10f740f5ca3959a0293e1a7291183649d4f0ec50e8567f9b2","internal_anchors":3},"formal_canon":{"evidence_count":2,"snapshot_sha256":"89150a33f38f8f465397c72a015283e52ae3cf18694a553eb557d1694ad6335d"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}