RTI-Bench: A Structured Dataset for Indian Right-to-Information Decision Analysis
Pith reviewed 2026-05-19 21:11 UTC · model grok-4.3
The pith
RTI-Bench supplies the first structured collection of Indian Central Information Commission decisions with outcome labels and exemption details.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper's central claim is that RTI-Bench is the first publicly released structured dataset for Indian RTI administrative decisions. It is assembled from 1,218 cases in a public instruction-response corpus plus 298 CIC decision PDFs spanning five commissioners and three document format generations. Outcome labels, exemption citations, reasoning components, and timelines are extracted through rule-based methods, and a manually reviewed sample of 50 labelled cases yields 95.3 percent label precision.
What carries the argument
Rule-based field extraction from PDF decisions, supplemented by manual review on a 50-case sample, to produce consistent labels for outcomes, exemptions, and reasoning across the collected cases.
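To make "rule-based field extraction" concrete, here is a minimal sketch of what such rules could look like. The patterns, label names, and function names are all assumptions made for illustration; the review does not reproduce the paper's actual rules.

```python
import re
from typing import List, Optional

# Hypothetical outcome rules, checked in order so that the more specific
# "partly allowed" phrasing wins over a plain "allowed".
OUTCOME_PATTERNS = [
    ("partly_allowed", re.compile(r"\bpart(?:ly|ially)\s+allowed\b", re.I)),
    ("allowed", re.compile(r"\bappeal\s+is\s+(?:hereby\s+)?allowed\b", re.I)),
    ("dismissed", re.compile(r"\bappeal\s+is\s+(?:hereby\s+)?dismissed\b", re.I)),
]

# Matches statutory citations such as "Section 8(1)(j)" (assumed citation style).
SECTION_PATTERN = re.compile(r"\bSection\s+(\d+(?:\(\w+\))*)", re.I)


def extract_outcome(text: str) -> Optional[str]:
    """Return the first matching outcome label, or None if no rule fires."""
    for label, pattern in OUTCOME_PATTERNS:
        if pattern.search(text):
            return label
    return None


def extract_exemptions(text: str) -> List[str]:
    """Return cited section numbers in order of first appearance, without repeats."""
    seen: List[str] = []
    for match in SECTION_PATTERN.finditer(text):
        section = match.group(1)
        if section not in seen:
            seen.append(section)
    return seen


def coverage(texts: List[str]) -> float:
    """Fraction of texts for which an outcome rule fired."""
    if not texts:
        return 0.0
    labelled = sum(1 for t in texts if extract_outcome(t) is not None)
    return labelled / len(texts)
```

Decisions for which no rule fires stay unlabelled, which is how a rule-based pipeline ends up with partial coverage rather than noisy labels.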
If this is right
- Models can be trained to predict whether an RTI appeal is likely to succeed before filing.
- Researchers can measure how often particular exemptions are upheld across different commissioners.
- Citizens gain a way to scan past decisions for patterns that match their own information requests.
- Future datasets can extend the same extraction approach to state-level RTI bodies.
Where Pith is reading between the lines
- The same labeling scheme could be applied to older decisions to trace changes in exemption use over time.
- Linking the dataset to actual filed appeals would let users test whether the predicted outcomes match real results.
- Automated extraction rules might be refined by training on the manually reviewed subset to raise coverage on the PDF subset beyond the current 51 percent.
Load-bearing premise
Rule-based extraction plus manual review on a small sample produces accurate enough labels to represent the full set of decisions despite partial coverage.
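How much a small review sample can say about label quality can be gauged with a confidence interval on the observed precision. The sketch below computes a Wilson score interval. The review does not state the raw counts behind the 95.3 percent figure, so the counts in the example are purely hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts for illustration only: 48 correct labels out of 50 reviewed.
low, high = wilson_interval(48, 50)
print(f"precision interval: {low:.3f} to {high:.3f}")
```

With samples of this size the interval spans several percentage points on either side, which is why the premise depends on the sample being representative as well as accurate.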
What would settle it
A full manual audit of the labeled cases that finds precision below 80 percent or a test on newly released CIC decisions where the outcome-prediction baseline falls to the majority-class level.
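The comparison against the "majority-class level" can be made explicit. The following sketch computes macro-F1 from scratch and builds a majority-class predictor. The toy label distribution is invented for illustration and is not the dataset's.

```python
from collections import Counter
from typing import List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in truth or prediction."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_baseline(y_true: Sequence[str]) -> List[str]:
    """Predict the most frequent true label for every case."""
    majority = Counter(y_true).most_common(1)[0][0]
    return [majority] * len(y_true)


# Toy, imbalanced labels (hypothetical): 6 dismissed, 3 allowed, 1 partly allowed.
y_true = ["dismissed"] * 6 + ["allowed"] * 3 + ["partly_allowed"]
y_base = majority_baseline(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_base)) / len(y_true)
print(accuracy, macro_f1(y_true, y_base))
```

On this toy set the majority predictor reaches 0.6 accuracy but only 0.25 macro-F1, because every minority class contributes an F1 of zero. That gap is why macro-F1, rather than accuracy, is the informative comparison on imbalanced outcome labels.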
Original abstract
India's Right to Information Act, 2005 gives every citizen the right to demand information from public authorities, yet in practice most people cannot make sense of the dense administrative language used in Central Information Commission (CIC) decisions, let alone predict whether an appeal is worth filing. This paper introduces RTI-Bench, a structured dataset of CIC decisions with outcome labels, exemption citations, IRAC-style reasoning components, and procedural timelines. To the best of our knowledge it is the first publicly released structured dataset for Indian RTI administrative decisions. The dataset draws from two sources: 1,218 cases from a publicly available instruction-response corpus (with structured fields added through rule-based extraction), and 298 CIC decision PDFs collected directly from the Commission portal, spanning five commissioners and three document format generations from 2023 to 2026. Label coverage reaches 89% on the instruction-response corpus. For the PDF subset of 239 primary decisions, coverage is 51% in this first release. A random sample of 50 labelled cases was manually reviewed, yielding a label precision of 95.3%. A zero-shot Mistral 7B baseline on 100 cases gives 57.3% accuracy and 37.0% macro-F1 on outcome prediction, well above the majority-class baseline of 14.3% macro-F1. RTI-Bench is available at https://huggingface.co/datasets/joyboseroy/rti-bench
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces RTI-Bench, the first publicly released structured dataset of Central Information Commission (CIC) decisions under India's Right to Information Act. It comprises 1,218 cases from an existing instruction-response corpus (89% label coverage via rule-based extraction) and 298 PDF decisions (239 primary decisions with 51% label coverage in the initial release), annotated for outcome labels, exemption citations, IRAC-style reasoning components, and procedural timelines. A zero-shot Mistral 7B baseline on 100 cases achieves 57.3% accuracy and 37.0% macro-F1 on outcome prediction, exceeding the majority-class baseline of 14.3% macro-F1. The dataset is released at https://huggingface.co/datasets/joyboseroy/rti-bench.
Significance. If the reported label quality and coverage hold, RTI-Bench fills an important gap as the first public structured resource for analyzing Indian RTI administrative decisions, enabling downstream NLP work on legal reasoning, exemption prediction, and citizen-facing applications. The transparent reporting of extraction coverage, the 95.3% precision on the 50-case manual sample, and the public Hugging Face release with baseline results are clear strengths that lower barriers for reproducibility and extension.
major comments (1)
- [Data Collection and Annotation (PDF subset)] PDF subset extraction (§3 or equivalent data section): The 51% label coverage on the 239 primary PDF decisions, obtained via rule-based methods, leaves open whether the successfully labeled subset is representative across the five commissioners and three document formats. The 50-case manual review (95.3% precision) does not address potential systematic bias toward simpler or more standardized decisions, which could affect the reliability of the reported Mistral baseline and any downstream uses of the labeled PDF portion.
minor comments (2)
- [Abstract] Clarify the exact time span of the 298 PDFs (abstract states '2023 to 2026'); confirm whether this includes future or projected decisions and update accordingly.
- [Experiments / Baseline] The baseline evaluation uses 100 cases; specify whether these are drawn exclusively from the labeled PDF subset or the full dataset, and report the exact split to aid reproducibility.
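One way to report the exact split asked for above is to publish a seed together with the resulting case identifiers. This is a minimal sketch; the identifier format, the seed value, and the function name are assumptions, not details from the paper.

```python
import random
from typing import Iterable, List


def draw_eval_sample(case_ids: Iterable[str], k: int, seed: int) -> List[str]:
    """Draw k case IDs reproducibly; sorting first makes the draw order-independent."""
    pool = sorted(case_ids)
    rng = random.Random(seed)
    return sorted(rng.sample(pool, k))


# Hypothetical identifiers standing in for labelled cases.
all_ids = [f"case-{i:04d}" for i in range(500)]
eval_ids = draw_eval_sample(all_ids, k=100, seed=13)
print(len(eval_ids), eval_ids[:3])
```

Publishing the list of selected identifiers alongside the seed guards against differences in random number generators across library versions.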
Simulated Author's Rebuttal
We thank the referee for their positive assessment of RTI-Bench and for the constructive comment on the PDF subset. We address the major comment below and will incorporate revisions to strengthen transparency.
- Referee: PDF subset extraction (§3 or equivalent data section): The 51% label coverage on the 239 primary PDF decisions, obtained via rule-based methods, leaves open whether the successfully labeled subset is representative across the five commissioners and three document formats. The 50-case manual review (95.3% precision) does not address potential systematic bias toward simpler or more standardized decisions, which could affect the reliability of the reported Mistral baseline and any downstream uses of the labeled PDF portion.
  Authors: We agree that the 51% coverage achieved through rule-based extraction on the 239 primary PDF decisions could introduce selection bias if extraction succeeds preferentially on simpler or more standardized cases, and that the random 50-case manual review confirms precision but does not test for distributional differences across the five commissioners or three document formats. In the revised manuscript we will add a stratified breakdown (new table in the data section) reporting the number of decisions and label coverage rates for each commissioner and each format, comparing the full PDF set to the successfully labeled subset. We will also expand the limitations discussion to explicitly note the possibility of such bias and its potential implications for the Mistral baseline and downstream applications. These additions increase transparency without changing the core dataset claims or baseline results.
  Revision: yes
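The stratified breakdown promised in the response could be computed along the following lines. This is a sketch under assumed field names (`commissioner`, `format`, `outcome`) and invented toy records; the released dataset's schema may differ.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]


def coverage_by_stratum(records: List[Record], key: str) -> Dict[str, Tuple[int, int, float]]:
    """Map each stratum value to (labelled count, total count, coverage rate)."""
    totals: Dict[str, int] = defaultdict(int)
    labelled: Dict[str, int] = defaultdict(int)
    for record in records:
        stratum = record[key]
        totals[stratum] += 1
        if record.get("outcome") is not None:  # None means no rule fired
            labelled[stratum] += 1
    return {s: (labelled[s], totals[s], labelled[s] / totals[s]) for s in sorted(totals)}


# Toy records (hypothetical commissioners and format generations).
records = [
    {"commissioner": "A", "format": "v1", "outcome": "allowed"},
    {"commissioner": "A", "format": "v2", "outcome": None},
    {"commissioner": "B", "format": "v2", "outcome": "dismissed"},
    {"commissioner": "B", "format": "v2", "outcome": None},
    {"commissioner": "B", "format": "v1", "outcome": "dismissed"},
]
for key in ("commissioner", "format"):
    print(key, coverage_by_stratum(records, key))
```

Large differences in coverage rate between strata would indicate that the labelled subset over-represents particular commissioners or formats, which is the bias the referee raises.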
Circularity Check
No significant circularity; dataset release with transparent rule-based construction
The paper is a data release describing collection of CIC decisions from public sources, application of rule-based extraction to add structured fields, and manual review on a 50-case sample for validation. The central claim of being the first such structured dataset rests on the public release itself rather than any derivation. The reported baseline uses zero-shot inference on Mistral 7B with no fitted parameters or predictions that reduce to the extraction rules by construction. No equations, self-citations, uniqueness theorems, or ansatzes appear in the methodology. The work is self-contained against external benchmarks of public data availability and does not invoke prior author work to justify its core outputs.