RTI-Bench: A Structured Dataset for Indian Right-to-Information Decision Analysis

Joy Bose

arxiv: 2605.16843 · v1 · pith:Y573BBBUnew · submitted 2026-05-16 · 💻 cs.CL

Pith reviewed 2026-05-19 21:11 UTC · model grok-4.3

keywords: Right to Information, dataset, Central Information Commission, legal decision analysis, exemption prediction, administrative law, outcome labeling

The pith

RTI-Bench supplies the first structured collection of Indian Central Information Commission decisions with outcome labels and exemption details.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds RTI-Bench from two sources: an existing instruction-response corpus of 1,218 cases, to which it adds structured fields, and 298 recent CIC decision PDFs. It labels case outcomes, cited exemptions, IRAC-style reasoning steps, and procedural timelines. The motivation is that citizens face dense administrative language that makes it difficult to judge whether an RTI appeal has a reasonable chance of success. Label coverage reaches 89 percent on the corpus portion and 51 percent on the 239 primary decisions in the PDF portion, and a manual check of 50 labeled cases found 95.3 percent precision. A zero-shot Mistral 7B baseline already beats the majority-class guess on outcome prediction, which suggests the data can support further modeling work.
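To make the coverage figures concrete, the percentages can be turned into approximate case counts. This is a back-of-the-envelope sketch, not a figure reported by the paper: it assumes the 89% applies to all 1,218 corpus cases and the 51% to the 239 primary PDF decisions named in the abstract. Because the published percentages are rounded, the counts are estimates.

```python
def approx_labelled(total_cases: int, coverage_pct: float) -> int:
    """Estimate how many cases carry a label from a rounded coverage percentage."""
    return round(total_cases * coverage_pct / 100)


# Totals and percentages are taken from the abstract; the counts are estimates.
corpus_labelled = approx_labelled(1218, 89)
pdf_labelled = approx_labelled(239, 51)
pdf_unlabelled = 239 - pdf_labelled

print(f"corpus: ~{corpus_labelled} of 1218 labelled")
print(f"PDFs:   ~{pdf_labelled} of 239 labelled, ~{pdf_unlabelled} unlabelled")
```

On these assumptions roughly half of the primary PDF decisions carry no label in this release, which is why the representativeness of the labelled half matters.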

Core claim

The paper's central claim is that RTI-Bench constitutes the first publicly released structured dataset for Indian RTI administrative decisions, assembled from 1,218 cases in a public corpus plus 298 CIC PDFs spanning multiple commissioners and document formats, with outcome labels, exemption citations, reasoning components, and timelines extracted through rule-based methods and validated at 95.3 percent precision on a reviewed sample.

What carries the argument

Rule-based field extraction from PDF decisions, supplemented by manual review on a 50-case sample, to produce consistent labels for outcomes, exemptions, and reasoning across the collected cases.
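A minimal sketch of what rule-based field extraction of this kind can look like is given here. The paper's actual rules are not reproduced on this page, so the regular expressions, the label names (`allowed`, `dismissed`, `partly_allowed`), and the function `extract_fields` are all hypothetical. The only external fact relied on is that the RTI Act lists its main exemptions as clauses (a) to (j) of Section 8(1).

```python
import re

# Hypothetical pattern for citations such as "Section 8(1)(j)".
EXEMPTION_RE = re.compile(r"[Ss]ection\s+8\s*\(\s*1\s*\)\s*\(\s*([a-j])\s*\)")

# Hypothetical outcome rules, checked in order; the first match wins.
OUTCOME_RULES = [
    ("partly_allowed", re.compile(r"\bpartly allowed\b", re.IGNORECASE)),
    ("dismissed", re.compile(r"\b(?:is|stands)\s+dismissed\b", re.IGNORECASE)),
    ("allowed", re.compile(r"\b(?:is|stands)\s+allowed\b", re.IGNORECASE)),
]


def extract_fields(text):
    """Return an outcome label (or None) and the sorted exemption clauses cited."""
    exemptions = sorted({f"8(1)({clause})" for clause in EXEMPTION_RE.findall(text)})
    outcome = None
    for label, pattern in OUTCOME_RULES:
        if pattern.search(text):
            outcome = label
            break
    return {"outcome": outcome, "exemptions": exemptions}


sample = (
    "The CPIO denied the information under Section 8(1)(j) of the RTI Act. "
    "The appeal is partly allowed."
)
result = extract_fields(sample)
print(result)
```

When no rule fires, the outcome stays `None`. In a pipeline built this way, those cases are the ones that reduce label coverage, and they are likely to be the decisions with non-standard wording.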

If this is right

Models can be trained to predict whether an RTI appeal is likely to succeed before filing.
Researchers can measure how often particular exemptions are upheld across different commissioners.
Citizens gain a way to scan past decisions for patterns that match their own information requests.
Future datasets can extend the same extraction approach to state-level RTI bodies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same labeling scheme could be applied to older decisions to trace changes in exemption use over time.
Linking the dataset to actual filed appeals would let users test whether the predicted outcomes match real results.
Automated extraction rules might be refined by training on the manually reviewed subset to raise coverage beyond 51 percent.

Load-bearing premise

Rule-based extraction plus manual review on a small sample produces accurate enough labels to represent the full set of decisions despite partial coverage.

What would settle it

A full manual audit of the labeled cases that finds precision below 80 percent or a test on newly released CIC decisions where the outcome-prediction baseline falls to the majority-class level.
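How far a small review sample can bound the true precision can be sketched with a Wilson score interval. The interval method is an illustration chosen here, not something the paper reports. Treating the 50 reviewed cases as the sample size is also an assumption: 95.3% of 50 is not a whole number, so the precision may have been computed over individual labels rather than cases.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion p_hat observed on n items."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Assumption: n = 50 reviewed items, observed precision 0.953.
low, high = wilson_interval(0.953, 50)
print(f"approx. 95% interval for precision: {low:.3f} to {high:.3f}")
print("lower bound above 0.80:", low > 0.80)
```

Under these assumptions the lower bound stays above the 80 percent threshold named above, but the interval is wide, which is consistent with the call for a larger audit.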

Original abstract

India's Right to Information Act, 2005 gives every citizen the right to demand information from public authorities, yet in practice most people cannot make sense of the dense administrative language used in Central Information Commission (CIC) decisions, let alone predict whether an appeal is worth filing. This paper introduces RTI-Bench, a structured dataset of CIC decisions with outcome labels, exemption citations, IRAC-style reasoning components, and procedural timelines. To the best of our knowledge it is the first publicly released structured dataset for Indian RTI administrative decisions. The dataset draws from two sources: 1,218 cases from a publicly available instruction-response corpus (with structured fields added through rule-based extraction), and 298 CIC decision PDFs collected directly from the Commission portal, spanning five commissioners and three document format generations from 2023 to 2026. Label coverage reaches 89% on the instruction-response corpus. For the PDF subset of 239 primary decisions, coverage is 51% in this first release. A random sample of 50 labelled cases was manually reviewed, yielding a label precision of 95.3%. A zero-shot Mistral 7B baseline on 100 cases gives 57.3% accuracy and 37.0% macro-F1 on outcome prediction, well above the majority-class baseline of 14.3% macro-F1. RTI-Bench is available at https://huggingface.co/datasets/joyboseroy/rti-bench
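The gap between accuracy and macro-F1 for a majority-class baseline can be illustrated with a small worked example. The label names and the class distribution below are invented for illustration and are not taken from RTI-Bench. The point is only that a predictor that always outputs the most frequent class scores zero F1 on every other class, which drags the macro average down.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy distribution: 6 dismissed, 2 allowed, 2 partly allowed.
y_true = ["dismissed"] * 6 + ["allowed"] * 2 + ["partly_allowed"] * 2
majority_label = Counter(y_true).most_common(1)[0][0]
y_majority = [majority_label] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_majority)) / len(y_true)
majority_macro_f1 = macro_f1(y_true, y_majority)
print(f"accuracy={accuracy:.2f}, macro-F1={majority_macro_f1:.2f}")
```

In this toy case the majority predictor reaches 0.60 accuracy but only 0.25 macro-F1, which is why macro-F1 is the more informative comparison for imbalanced outcome labels.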

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RTI-Bench is a first public structured dataset for Indian CIC decisions but the 51% PDF coverage and 50-case validation leave real questions about label quality and representativeness.

The paper's main contribution is the release of RTI-Bench, the first public structured dataset of decisions from India's Central Information Commission under the RTI Act. It includes outcome labels, exemption details, IRAC-style components, and timelines drawn from both an existing corpus and recent PDFs. The author does a good job laying out the data sources and the construction process. Rule-based extraction reaches 89% coverage on the 1,218 cases from the instruction-response set and 51% on the 239 primary PDF decisions. The 50-case manual review at 95.3% precision is reassuring, and the zero-shot Mistral 7B baseline beating the majority class on 100 cases shows the data carries some signal for modeling.

Where it falls short is the partial coverage on the PDFs and the scale of the validation. Checking only 50 labeled cases doesn't give a full picture of accuracy across the five commissioners or the three document formats. Rule-based extraction can miss nuances in reasoning or procedural steps, so the labeled subset might under-represent the harder decisions.

This kind of resource is useful for people doing NLP on legal or administrative texts, particularly those focused on Indian government data or building tools for transparency. It has enough new material and documentation to go through peer review rather than be rejected at the desk. I'd suggest sending it out for comments, as the dataset release can help others even with the current limitations on completeness.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces RTI-Bench, the first publicly released structured dataset of Central Information Commission (CIC) decisions under India's Right to Information Act. It comprises 1,218 cases from an existing instruction-response corpus (89% label coverage via rule-based extraction) and 298 PDF decisions (239 primary decisions with 51% label coverage in the initial release), annotated for outcome labels, exemption citations, IRAC-style reasoning components, and procedural timelines. A zero-shot Mistral 7B baseline on 100 cases achieves 57.3% accuracy and 37.0% macro-F1 on outcome prediction, exceeding the majority-class baseline of 14.3% macro-F1. The dataset is released at https://huggingface.co/datasets/joyboseroy/rti-bench.

Significance. If the reported label quality and coverage hold, RTI-Bench fills an important gap as the first public structured resource for analyzing Indian RTI administrative decisions, enabling downstream NLP work on legal reasoning, exemption prediction, and citizen-facing applications. The transparent reporting of extraction coverage, the 95.3% precision on the 50-case manual sample, and the public Hugging Face release with baseline results are clear strengths that lower barriers for reproducibility and extension.

major comments (1)

[Data Collection and Annotation (PDF subset)] PDF subset extraction (§3 or equivalent data section): The 51% label coverage on the 239 primary PDF decisions, obtained via rule-based methods, leaves open whether the successfully labeled subset is representative across the five commissioners and three document formats. The 50-case manual review (95.3% precision) does not address potential systematic bias toward simpler or more standardized decisions, which could affect the reliability of the reported Mistral baseline and any downstream uses of the labeled PDF portion.

minor comments (2)

[Abstract] Clarify the exact time span of the 298 PDFs (abstract states '2023 to 2026'); confirm whether this includes future or projected decisions and update accordingly.
[Experiments / Baseline] The baseline evaluation uses 100 cases; specify whether these are drawn exclusively from the labeled PDF subset or the full dataset, and report the exact split to aid reproducibility.
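One way to make a 100-case evaluation subset reproducible is to draw it with a fixed seed from a sorted list of identifiers and publish the resulting IDs. This is a sketch of that practice only. The paper's actual sampling procedure is not described here, and the identifier format, the pool size, and the helper `sample_eval_ids` are hypothetical.

```python
import random


def sample_eval_ids(case_ids, k=100, seed=0):
    """Draw k identifiers reproducibly: sort first, then sample with a seeded RNG."""
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(case_ids), k))


# Hypothetical identifiers standing in for the pool of labeled cases.
case_ids = [f"case-{i:04d}" for i in range(1, 1219)]
eval_ids = sample_eval_ids(case_ids)
print(len(eval_ids), eval_ids[:3])
```

Sorting before sampling removes any dependence on the order in which the cases were loaded, so the same seed yields the same split on another machine.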

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive assessment of RTI-Bench and for the constructive comment on the PDF subset. We address the major comment below and will incorporate revisions to strengthen transparency.

Referee: PDF subset extraction (§3 or equivalent data section): The 51% label coverage on the 239 primary PDF decisions, obtained via rule-based methods, leaves open whether the successfully labeled subset is representative across the five commissioners and three document formats. The 50-case manual review (95.3% precision) does not address potential systematic bias toward simpler or more standardized decisions, which could affect the reliability of the reported Mistral baseline and any downstream uses of the labeled PDF portion.

Authors: We agree that the 51% coverage achieved through rule-based extraction on the 239 primary PDF decisions could introduce selection bias if extraction succeeds preferentially on simpler or more standardized cases, and that the random 50-case manual review confirms precision but does not test for distributional differences across the five commissioners or three document formats. In the revised manuscript we will add a stratified breakdown (new table in the data section) reporting the number of decisions and label coverage rates for each commissioner and each format, comparing the full PDF set to the successfully labeled subset. We will also expand the limitations discussion to explicitly note the possibility of such bias and its potential implications for the Mistral baseline and downstream applications. These additions increase transparency without changing the core dataset claims or baseline results.

Revision: yes
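The stratified breakdown promised in the rebuttal could be computed along the following lines. This is a sketch under stated assumptions: the field names (`commissioner`, `format`, `outcome`), the toy records, and the helper `coverage_by` are hypothetical, and an unlabeled decision is assumed to be stored with `outcome` set to `None`.

```python
from collections import defaultdict


def coverage_by(records, key):
    """Per-group (labeled, total, coverage rate) for the given grouping field."""
    totals = defaultdict(int)
    labeled = defaultdict(int)
    for rec in records:
        totals[rec[key]] += 1
        if rec["outcome"] is not None:
            labeled[rec[key]] += 1
    return {
        group: (labeled[group], totals[group], labeled[group] / totals[group])
        for group in sorted(totals)
    }


# Hypothetical toy records; real commissioner names and formats would go here.
records = [
    {"commissioner": "A", "format": "v1", "outcome": "dismissed"},
    {"commissioner": "A", "format": "v1", "outcome": "allowed"},
    {"commissioner": "A", "format": "v2", "outcome": None},
    {"commissioner": "B", "format": "v2", "outcome": None},
    {"commissioner": "B", "format": "v2", "outcome": "dismissed"},
    {"commissioner": "B", "format": "v1", "outcome": None},
]

by_commissioner = coverage_by(records, "commissioner")
by_format = coverage_by(records, "format")
for group, (n_labeled, n_total, rate) in by_commissioner.items():
    print(f"commissioner {group}: {n_labeled}/{n_total} labeled ({rate:.0%})")
```

Large differences in coverage between groups would be a direct sign that the labeled subset is not representative of the full PDF set.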

Circularity Check

0 steps flagged

No significant circularity; dataset release with transparent rule-based construction

The paper is a data release describing collection of CIC decisions from public sources, application of rule-based extraction to add structured fields, and manual review on a 50-case sample for validation. The central claim of being the first such structured dataset rests on the public release itself rather than any derivation. The reported baseline uses zero-shot inference on Mistral 7B with no fitted parameters or predictions that reduce to the extraction rules by construction. No equations, self-citations, uniqueness theorems, or ansatzes appear in the methodology. The work is self-contained against external benchmarks of public data availability and does not invoke prior author work to justify its core outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central contribution is a new dataset release; no free parameters, invented entities, or non-standard axioms are required beyond standard assumptions about legal text extraction and label accuracy.

Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages · 1 internal anchor

[1] Government of India (2022). Annual Report 2021-22. Central Information Commission, New Delhi. https://cic.gov.in/sites/default/files/Reports/AR2021-22E.pdf

[2] Satija, N. (2021). Over 32,000 RTI appeals pending with Central Information Commission. Hindustan Times, December 16, 2021. https://www.hindustantimes.com/india-news/over-32-000-rti-appeals-pending-with-central-information-commission-govt-101639657691173.html

[3] Malik, V., Sanjay, R., Nigam, S. K., Ghosh, K., Guha, S. K., Bhattacharya, A., & Modi, A. (2021, August). ILDC for CJPE: Indian legal documents corpus for court judgment prediction and explanation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Proc...

[4] Nigam, S. K., Sharma, A., Khanna, D., Shallum, N., Ghosh, K., & Bhattacharya, A. (2024, August). Legal judgment reimagined: PredEx and the rise of intelligent AI interpretation in Indian courts. In Findings of the Association for Computational Linguistics: ACL 2024 (pp. 4296-4315). arXiv:2406.04136

[5] Pal, A. (2022, December). Deepparliament: A legal domain benchmark & dataset for parliament bills prediction. In Proceedings of the Workshop on Unimodal and Multimodal Induction of Linguistic Structures (UM - IoS) (pp. 73-81). arXiv:2211.15424

[6] Exploration-Lab (2025). IL-PCSR: Indian Legal Corpus for Prior Case and Statute Retrieval. HuggingFace Datasets. https://huggingface.co/datasets/Exploration-Lab/IL-PCSR

[7] jatinmehra (2023). RTI-CASE-DATASET. HuggingFace Datasets. https://huggingface.co/datasets/jatinmehra/RTI-CASE-DATASET

[8] Jiang, A. Q. et al. (2023). Mistral 7B. arXiv:2310.06825

[9] Chalkidis, I., Androutsopoulos, I., & Aletras, N. (2019, July). Neural legal judgment prediction in English. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 4317-4323)

[10] Government of India (2005). The Right to Information Act, 2005. Ministry of Law and Justice, New Delhi. https://cic.gov.in/sites/default/files/RTI-Act_English.pdf
