
arxiv: 2606.05623 · v1 · pith:UAG6HW3Onew · submitted 2026-06-04 · 💱 q-fin.RM · stat.AP

Bankruptcy Prediction from 10-K Narratives: Evidence from Interpretable Text Scores and Accounting Baselines

Pith reviewed 2026-06-27 23:06 UTC · model grok-4.3

classification 💱 q-fin.RM stat.AP
keywords bankruptcy prediction · 10-K narratives · text analysis · distress score · accounting ratios · risk monitoring · dictionary method

The pith

A dictionary-based Pre-Bankruptcy Stress Score from 10-K text raises AUC from 0.8323 to 0.9019 when added to accounting ratios.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks whether narrative language in 10-K filings can flag corporate bankruptcy risk earlier than accounting numbers alone. It builds a simple dictionary count, the Pre-Bankruptcy Stress (PB Stress) Score, of words tied to liquidity problems, debt refinancing stress, operating decline, restructuring, and business fragility. In the main holdout test, adding this score to the accounting baseline raises both the area under the ROC curve (AUC) and the share of bankrupt firms placed in the highest-risk decile. The gain persists under bootstrap checks, alternative accounting benchmarks, and out-of-time samples. The result matters because earlier, interpretable signals could let lenders and regulators respond before losses crystallize.

Core claim

Distress-specific language in 10-K narratives supplies incremental information for one-year-ahead bankruptcy prediction beyond a five-variable accounting baseline and the Loughran-McDonald negative-word list.

What carries the argument

The Pre-Bankruptcy Stress (PB Stress) Score, a transparent dictionary count of terms across five distress categories that quantifies narrative signals of emerging trouble.
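
A dictionary count of this kind can be sketched in a few lines. The version below is a minimal illustration, not the paper's implementation: the term lists are hypothetical placeholders (the source does not publish the actual dictionary), and scaling the hit count per 1,000 words is an assumed normalization.

```python
import re
from collections import Counter

# Hypothetical example terms for the five themes named in the paper.
# The paper's actual term list is not given in the source.
PB_DICTIONARY = {
    "liquidity_funding": ["going concern", "substantial doubt", "liquidity shortfall"],
    "covenant_refinancing": ["covenant violation", "waiver", "refinance"],
    "operating_deterioration": ["declining revenue", "operating losses"],
    "restructuring_legal": ["restructuring", "chapter 11"],
    "business_fragility": ["customer concentration", "single supplier"],
}


def pb_stress_score(text, dictionary=PB_DICTIONARY, per_words=1000):
    """Return (score, per-category counts).

    The score is the number of dictionary hits per `per_words` words
    (an assumed normalization, not one confirmed by the source).
    """
    lowered = text.lower()
    n_words = len(re.findall(r"[a-z0-9']+", lowered))
    counts = Counter()
    for category, terms in dictionary.items():
        for term in terms:
            # Whole-word, case-insensitive match of each term or phrase.
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    total_hits = sum(counts.values())
    score = per_words * total_hits / n_words if n_words else 0.0
    return score, dict(counts)


sample = (
    "The company received a waiver after a covenant violation. "
    "Substantial doubt exists about its ability to continue as a going concern."
)
score, by_category = pb_stress_score(sample)
print(round(score, 2), by_category)
```

Keeping the per-category counts alongside the total is what makes such a score auditable: a reviewer can see which theme drove a high reading.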

If this is right

  • Narrative disclosures can precede the accounting deterioration that standard models rely on.
  • A combined text-plus-accounting model improves top-decile capture from 44.12% to 64.71% in the primary holdout test.
  • The incremental lift holds across bootstrap inference, alternative benchmarks, and out-of-time windows.
  • Transparent dictionary scores remain usable for routine risk monitoring without black-box models.
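
The two headline metrics, AUC and top-decile capture, can be computed as in the sketch below. It assumes binary labels (1 = bankrupt within the horizon) and that a higher score means higher predicted risk. The toy data are invented purely for illustration and are not the paper's sample.

```python
def auc(scores, labels):
    """Rank-based AUC: probability that a randomly chosen event has a higher
    score than a randomly chosen non-event, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_decile_capture(scores, labels):
    """Share of all events that fall among the 10% highest-scored observations."""
    n_top = max(1, len(scores) // 10)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    captured = sum(y for _, y in ranked[:n_top])
    return captured / sum(labels)


# Toy data: 20 firm-years with scores 20..1 and four events.
toy_scores = list(range(20, 0, -1))
toy_labels = [1 if s in (20, 18, 15, 5) else 0 for s in toy_scores]
print(auc(toy_scores, toy_labels), top_decile_capture(toy_scores, toy_labels))
```

As an arithmetic aside, the reported capture rates of 44.12% and 64.71% coincide to two decimals with 15/34 and 22/34, but the source does not state the number of holdout bankruptcies, so that reading is only a conjecture.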

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Regulators could scan 10-K language for early systemic-distress flags before ratios turn.
  • Firms aware of the score might alter wording in future filings to manage perceived risk.
  • The same dictionary approach could be tested on other infrequent events such as covenant violations or delistings.

Load-bearing premise

The selected dictionary words capture distress signals that accounting ratios and the Loughran-McDonald list have not already picked up.

What would settle it

An exact replication on the same holdout sample in which adding the PB Stress Score produces no rise in AUC or top-decile capture rate.

Original abstract

Bankruptcy is a low-frequency but high-impact corporate event, making early risk identification important for creditors, investors, regulators, and risk managers. Traditional bankruptcy-prediction models rely primarily on accounting ratios, but these measures may reflect financial deterioration only after it appears in reported financial statements. Narrative disclosures in annual 10-K filings may therefore provide incremental warning signals about emerging distress. This study examines whether 10-K narratives improve bankruptcy prediction beyond conventional accounting variables. Using firm-year observations matched to 10-K text, SEC financial statement data, and bankruptcy events from the Florida-UCLA-LoPucki Bankruptcy Research Database, the analysis evaluates bankruptcy risk over the year following the 10-K filing date. The paper develops a transparent Pre-Bankruptcy Stress (PB Stress) Score, a dictionary-based measure designed to capture distress-specific language related to liquidity and funding stress, debt covenant and refinancing stress, operating deterioration, restructuring and legal distress, and business fragility. The score is evaluated against a five-variable accounting baseline and a Loughran-McDonald dictionary benchmark. In the primary one-year holdout test, adding the PB Stress Score increases AUC from 0.8323 to 0.9019 and raises top-decile bankruptcy capture from 44.12% to 64.71%. The positive incremental pattern remains visible across bootstrap inference, alternative accounting benchmarks, alternative outcome definitions, and out-of-time validation. The findings indicate that distress-specific 10-K narratives provide interpretable incremental information for bankruptcy-risk monitoring beyond conventional accounting ratios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that a transparent, dictionary-based Pre-Bankruptcy Stress (PB Stress) Score extracted from 10-K narratives supplies incremental information for one-year-ahead bankruptcy prediction. In the primary holdout test it raises AUC from 0.8323 (five-variable accounting baseline) to 0.9019 and top-decile capture from 44.12% to 64.71%, with the lift persisting under bootstrap, out-of-time, and alternative-specification checks.

Significance. If the PB Stress Score is shown to be orthogonal to the accounting ratios, the result would supply a replicable, interpretable text-based early-warning signal that complements conventional ratio models and could be directly useful for creditors and regulators.

major comments (2)
  1. [Primary one-year holdout test] Primary holdout results (abstract and §4): the reported AUC increase of 0.0696 is presented as evidence of incremental information, yet no correlation matrix, variance-inflation factors, or residualization of the PB Stress Score against the five accounting predictors is supplied; without this check the lift could be partly mechanical if liquidity- or covenant-related dictionary terms simply proxy ratios already in the baseline.
  2. [Variable construction / PB Stress Score] Methods section on dictionary construction: the abstract states that terms were chosen to reflect five distress themes, but provides neither the exact term list, the selection protocol, nor an explicit statement that no terms were tuned on the bankruptcy outcome; this detail is load-bearing for the claim that the score supplies genuinely new narrative information rather than re-expressing known deterioration.
minor comments (1)
  1. [Benchmark comparisons] The Loughran-McDonald benchmark is mentioned but its exact implementation (word lists, weighting) is not compared side-by-side with the PB Stress Score in the reported tables.
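
The first diagnostic the referee asks for, a correlation check between the text score and the accounting predictors, is straightforward to produce. The sketch below is a plain-Python illustration under assumed inputs: the ratio names and values are hypothetical, since the source does not name the five baseline variables.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_with_score(score, ratios):
    """Correlation of the text score with each named accounting ratio."""
    return {name: pearson(score, values) for name, values in ratios.items()}


# Hypothetical values for five firm-years; names are placeholders only.
text_score = [0.2, 1.5, 0.4, 3.1, 2.2]
accounting = {
    "current_ratio": [2.1, 1.0, 1.9, 0.6, 0.9],
    "leverage": [0.3, 0.7, 0.4, 0.9, 0.8],
}
print(correlations_with_score(text_score, accounting))
```

Correlations near plus or minus one would support the referee's concern that the dictionary terms mostly restate ratios already in the baseline. Low correlations would not by themselves establish incremental information.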

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our incremental-information results. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Primary one-year holdout test] Primary holdout results (abstract and §4): the reported AUC increase of 0.0696 is presented as evidence of incremental information, yet no correlation matrix, variance-inflation factors, or residualization of the PB Stress Score against the five accounting predictors is supplied; without this check the lift could be partly mechanical if liquidity- or covenant-related dictionary terms simply proxy ratios already in the baseline.

    Authors: We agree that explicit checks for multicollinearity and orthogonality would strengthen the claim. The PB Stress Score is constructed from narrative language intended to capture forward-looking distress signals that may not yet appear in the accounting ratios. The lift remains stable in out-of-time validation, which already mitigates concerns about mechanical proxying. Nevertheless, we will add a correlation matrix, variance-inflation factors, and a residualization exercise (regressing the PB Stress Score on the five accounting variables and using the residuals) to the revised manuscript. revision: yes

  2. Referee: [Variable construction / PB Stress Score] Methods section on dictionary construction: the abstract states that terms were chosen to reflect five distress themes, but provides neither the exact term list, the selection protocol, nor an explicit statement that no terms were tuned on the bankruptcy outcome; this detail is load-bearing for the claim that the score supplies genuinely new narrative information rather than re-expressing known deterioration.

    Authors: The dictionary terms were selected a priori from the academic literature on corporate distress to map onto the five themes (liquidity/funding stress, covenant/refinancing stress, operating deterioration, restructuring/legal distress, and business fragility) without any tuning on the bankruptcy labels. We will include the complete term list, the explicit selection protocol, and a clear statement confirming the absence of outcome-based tuning in the methods section or a new appendix of the revised manuscript. revision: yes
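
The residualization exercise promised in the first response can be sketched as an ordinary least squares regression of the text score on the accounting variables, keeping the residuals as the part of the score the ratios do not explain. The code below is a minimal plain-Python illustration with made-up numbers and two predictors instead of five. It is an assumption about how such a check might be run, not the authors' procedure.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def residualize(y, X):
    """Residuals from an OLS regression of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return [yi - fi for yi, fi in zip(y, fitted)]


# Made-up example: text score for five firm-years and two accounting ratios.
accounting = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
text_score = [3, 1, 4, 1, 5]
print(residualize(text_score, accounting))
```

If the residualized score still improves AUC when added to the baseline, the lift is harder to attribute to mechanical overlap. The R² of the same regression also yields the variance-inflation factor for the text score, VIF = 1 / (1 - R²).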

Circularity Check

0 steps flagged

No circularity; out-of-sample holdout validation is independent of score construction

Full rationale

The paper defines the PB Stress Score via a transparent, hand-crafted dictionary targeting specific distress themes (liquidity, covenants, operating deterioration, etc.) and then tests its incremental AUC contribution (0.8323 to 0.9019) in a one-year holdout sample, with bootstrap, out-of-time, and alternative-benchmark checks. No equation or step equates the reported lift to a fitted parameter, self-referential definition, or self-citation chain; the outcome variable comes from an external bankruptcy database and the evaluation is statistically independent of the dictionary construction. This is a standard empirical design with external benchmarks, so no load-bearing circularity exists.
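
The bootstrap check on the AUC lift can be illustrated with a paired percentile bootstrap: resample firm-years with replacement, recompute both models' AUC on each resample, and read an interval off the distribution of differences. The source does not describe the paper's bootstrap scheme, so the resampling unit, the number of draws, and the percentile interval below are all assumptions.

```python
import random


def auc(scores, labels):
    """Rank-based AUC with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_gain(base, combined, labels, n_boot=1000, seed=0):
    """95% percentile interval for AUC(combined) - AUC(base).

    Both models are scored on the same resampled observations (paired design).
    Resamples that lack either class are skipped, because AUC is undefined there.
    """
    rng = random.Random(seed)
    n = len(labels)
    gains = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:
            gain = auc([combined[i] for i in idx], y) - auc([base[i] for i in idx], y)
            gains.append(gain)
    if not gains:
        raise ValueError("no resample contained both classes")
    gains.sort()
    m = len(gains)
    return gains[int(0.025 * (m - 1))], gains[int(0.975 * (m - 1))]


# Toy check: a perfect combined model against an uninformative baseline.
labels = [0] * 16 + [1] * 4
print(bootstrap_auc_gain([0.0] * 20, [float(y) for y in labels], labels, n_boot=200))
```

An interval that excludes zero would indicate that the lift is unlikely to be a resampling artifact. With rare events, many resamples contain few bankruptcies, so intervals can be wide even when the point estimate is large.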

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are described. The central claim implicitly rests on the unstated premise that the chosen distress dictionary terms are fixed ex ante and independent of the bankruptcy outcome labels.

pith-pipeline@v0.9.1-grok · 5825 in / 1171 out tokens · 21911 ms · 2026-06-27T23:06:12.622025+00:00 · methodology

