Embracing Biased Transition Matrices for Complementary-Label Learning with Many Classes
Pith reviewed 2026-05-20 19:55 UTC · model grok-4.3
The pith
By deliberately designing a biased (non-uniform) generation process that restricts complementary labels to class subsets, CLL scales to 100+ classes with more than sevenfold accuracy improvements over traditional methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a deliberately biased transition matrix, induced by restricting complementary labels to a known subset of classes, preserves a usable learning signal and thereby enables complementary-label learning to succeed on problems with 100 or more classes, as shown by the BICL framework's performance improvements.
What carries the argument
The biased (non-uniform) transition matrix that encodes the restricted complementary-label generation process and is used directly in the training objective.
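A minimal sketch of what such a matrix could look like. The details are assumptions, not taken from the paper: each true class is given a fixed random subset of `subset_size` other classes, and the complementary label is drawn uniformly from that subset. The function names are hypothetical.

```python
import numpy as np

def restricted_transition_matrix(num_classes, subset_size, rng):
    """Row y holds P(complementary label | true class y).

    Assumption: each class gets a random subset of other classes,
    and labels are drawn uniformly within that subset.
    """
    Q = np.zeros((num_classes, num_classes))
    for y in range(num_classes):
        others = np.delete(np.arange(num_classes), y)  # never the true class
        allowed = rng.choice(others, size=subset_size, replace=False)
        Q[y, allowed] = 1.0 / subset_size
    return Q

def sample_complementary_labels(true_labels, Q, rng):
    """Draw one complementary label per instance from the matching row of Q."""
    return np.array([rng.choice(Q.shape[0], p=Q[y]) for y in true_labels])

rng = np.random.default_rng(0)
Q = restricted_transition_matrix(num_classes=100, subset_size=4, rng=rng)
true_labels = rng.integers(0, 100, size=5)
print(sample_complementary_labels(true_labels, Q, rng))
```

Because the same matrix is used to generate the labels and to train, the generation process and the training objective agree by construction, which is the premise the rest of the argument rests on.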
If this is right
- CLL becomes practical for 100-class and 200-class image datasets rather than remaining limited to 10 classes.
- Accuracy improvements of more than sevenfold over traditional methods are achievable when the bias is known.
- Real-world CLL applications become feasible when annotation processes can enforce and record the restricted label generation.
Where Pith is reading between the lines
- Data annotation pipelines could be redesigned to intentionally introduce and document known biases instead of striving for uniformity.
- The same bias-leveraging principle may extend to other weak-supervision settings where label noise or incompleteness can be controlled.
- Optimal subset size for the restriction could be studied as a tunable parameter for different numbers of classes.
Load-bearing premise
The data collection process can be controlled so the biased complementary-label generation is known and matches the transition matrix used at training time.
What would settle it
If applying BICL with the correctly specified biased transition matrix on CIFAR-100 produces no accuracy improvement over uniform-assumption baselines, the central claim would be falsified.
Original abstract
Complementary-label learning (CLL) is a weakly supervised paradigm where instances are labeled with classes they do not belong to. Despite a decade of research, CLL methods remain competitive mainly on 10-class classification, with scaling to large label spaces continuing to be an enduring bottleneck. This limitation stems from the common assumption of uniform label generation in traditional methods, which fatally dilutes the learning signal in many-class settings. In this paper, we demonstrate that this long-standing barrier can be overcome by deliberately designing a biased (non-uniform) generation process that restricts complementary labels to a subset of classes. This finding motivates us to propose Bias-Induced Constrained Labeling (BICL), a principled framework spanning data collection to training that leverages this bias. BICL enables effective learning on CIFAR-100 and TinyImageNet-200, achieving more than sevenfold accuracy improvements over traditional methods. Our findings establish a new trajectory for making CLL feasible for many classes in real-world applications.
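One way to make the dilution argument concrete is a back-of-the-envelope entropy calculation. The sketch below is not taken from the paper. It assumes a uniform class prior over C classes and, for the restricted case, a regular design in which every class serves as a complementary label for exactly k true classes, each with probability 1/k. The notation S(y) for the allowed subset of class y is hypothetical.

```latex
% Uniform generation: \bar{y} is drawn uniformly from the C-1 wrong classes.
P(y \mid \bar{y}) = \frac{1}{C-1} \quad (y \neq \bar{y})
\;\Longrightarrow\;
H(Y \mid \bar{Y}) = \log_2 (C-1)

% Restricted generation (assumed regular design with subset size k).
P(y \mid \bar{y}) = \frac{1}{k} \quad (\bar{y} \in S(y))
\;\Longrightarrow\;
H(Y \mid \bar{Y}) = \log_2 k

% Example with C = 100 and k = 4:
\log_2 99 \approx 6.63 \text{ bits}
\qquad \text{versus} \qquad
\log_2 4 = 2 \text{ bits}
```

Under these assumptions, a single complementary label leaves far less uncertainty about the true class when its generation is restricted, which is the intuition the abstract appeals to.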
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that traditional complementary-label learning (CLL) fails to scale beyond 10 classes due to the uniform transition matrix assumption, which dilutes the signal in large label spaces. It proposes Bias-Induced Constrained Labeling (BICL), a framework that deliberately imposes a biased (non-uniform) complementary-label generation process restricting labels to class subsets, derives an unbiased risk estimator from the known biased transition matrix T, and reports more than sevenfold accuracy gains on CIFAR-100 and TinyImageNet-200 over prior CLL methods.
Significance. If the central claim holds under the stated assumptions, BICL would represent a meaningful shift in CLL by moving from passive uniform labeling to controlled biased data collection, potentially making the paradigm viable for real-world many-class problems. The empirical scale of the reported gains, if reproducible with proper controls, would be notable; however, the significance hinges on whether the bias can be practically enforced and whether the estimator remains robust outside idealized settings.
major comments (2)
- [BICL framework] The unbiased risk estimator derivation (BICL framework section) treats the biased transition matrix T as known exactly and matching the data-generating process. No estimation procedure, sensitivity analysis, or robustness experiments are provided for cases where the assumed T deviates from the true generation process; this is load-bearing because even modest misspecification would bias the estimator and undermine the reported gains.
- [Experiments] Experiments on CIFAR-100 and TinyImageNet-200 claim >7x accuracy improvements, but lack details on how the biased complementary labels were actually generated and imposed during data collection, exact parameterization of the bias distribution, error bars across runs, and ablations isolating the effect of the bias parameters versus other modeling choices.
minor comments (1)
- [Introduction / Framework] Notation for the biased transition matrix T and the subset restriction should be introduced with an explicit equation early in the framework section to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.
- Referee: [BICL framework] The unbiased risk estimator derivation (BICL framework section) treats the biased transition matrix T as known exactly and matching the data-generating process. No estimation procedure, sensitivity analysis, or robustness experiments are provided for cases where the assumed T deviates from the true generation process; this is load-bearing because even modest misspecification would bias the estimator and undermine the reported gains.
  Authors: We appreciate this observation. In the BICL framework, the transition matrix T is deliberately designed and imposed as part of the controlled data collection process, so it is known exactly by construction rather than estimated. This enables the exact unbiased risk estimator under the stated assumptions. We agree that robustness to misspecification is important to demonstrate. In the revised manuscript we will add a sensitivity analysis together with experiments that quantify estimator degradation under controlled deviations from the assumed T.
  Revision: yes
- Referee: [Experiments] Experiments on CIFAR-100 and TinyImageNet-200 claim >7x accuracy improvements, but lack details on how the biased complementary labels were actually generated and imposed during data collection, exact parameterization of the bias distribution, error bars across runs, and ablations isolating the effect of the bias parameters versus other modeling choices.
  Authors: We agree that these details are necessary for reproducibility and for isolating the contribution of the bias. In the revision we will expand the experimental section to describe the exact procedure used to generate and impose the biased complementary labels, provide the precise parameterization of the bias distribution, report error bars from multiple independent runs, and include ablation studies that vary the bias parameters while holding other modeling choices fixed.
  Revision: yes
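A sensitivity analysis of the kind discussed in the first response could be prototyped along the following lines. This is an illustrative sketch, not the authors' protocol. It assumes a circulant restricted design (class y may receive complementary labels y+1, ..., y+k modulo C), a true generation process that drifts toward uniform by a mixing weight `eps`, and a made-up per-class loss vector. All function names are hypothetical.

```python
import numpy as np

def circulant_restricted_q(num_classes, subset_size):
    """Designed matrix: class y maps uniformly to the next subset_size classes.

    For this circulant design the matrix is singular when subset_size and
    num_classes share a common factor, so (10, 3) is used below.
    """
    q = np.zeros((num_classes, num_classes))
    for y in range(num_classes):
        for offset in range(1, subset_size + 1):
            q[y, (y + offset) % num_classes] = 1.0 / subset_size
    return q

def uniform_q(num_classes):
    """Uniform complementary-label generation over all wrong classes."""
    q = np.full((num_classes, num_classes), 1.0 / (num_classes - 1))
    np.fill_diagonal(q, 0.0)
    return q

def expected_loss_gap(q_true, q_assumed, losses):
    """Largest per-class gap between the expected corrected loss and the true loss."""
    corrected = np.linalg.solve(q_assumed, losses)  # q_assumed^{-1} @ losses
    return float(np.max(np.abs(q_true @ corrected - losses)))

num_classes, subset_size = 10, 3
q_design = circulant_restricted_q(num_classes, subset_size)
losses = np.linspace(0.1, 2.0, num_classes)  # illustrative per-class losses
for eps in (0.0, 0.05, 0.2):
    q_true = (1 - eps) * q_design + eps * uniform_q(num_classes)
    print(eps, expected_loss_gap(q_true, q_design, losses))
```

The gap is zero when the assumed and true matrices coincide and becomes nonzero once they differ, which is the degradation a full robustness study would need to quantify on real models.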
Circularity Check
BICL framework derivation is self-contained with no reduction to inputs by construction
The paper introduces BICL by proposing a deliberately biased complementary-label generation process whose transition matrix T is treated as known and controlled at data collection time. The derivation of the unbiased risk estimator follows directly from this known T via standard risk correction techniques for complementary labels; this is a forward mathematical construction rather than a tautology or fitted quantity renamed as prediction. No equations in the abstract or described framework reduce the final performance claim to a parameter fit on the target data, nor does the argument rest on self-citation chains or imported uniqueness theorems. Empirical gains on CIFAR-100 and TinyImageNet-200 are presented as validation under the stated assumption, not as evidence that forces the result. The central premise therefore remains an external modeling choice (controllable biased labeling) rather than a self-referential loop.
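The risk-correction step mentioned in this rationale can be sketched in a few lines. The code assumes the general inverse-transition-matrix form quoted in the appendix excerpt of this page, R(g) = (1/N) Σ_i e_{ȳ_i}^T Q^{-1} L(g(x_i)), with Q[y, ȳ] = P(ȳ | y) and cross-entropy as the per-class loss. The function names and the small circulant Q are illustrative choices, not the paper's implementation.

```python
import numpy as np

def per_class_cross_entropy(logits):
    """Return L with L[i, c] = -log softmax(logits[i])[c]."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs

def ure_risk(logits, comp_labels, Q):
    """Average of (Q^{-1} L(g(x_i)))[ybar_i] over the batch."""
    comp_labels = np.asarray(comp_labels)
    losses = per_class_cross_entropy(logits)        # shape (N, C)
    corrected = np.linalg.solve(Q, losses.T).T      # row i is Q^{-1} L(g(x_i))
    return float(corrected[np.arange(len(comp_labels)), comp_labels].mean())

# Illustrative 5-class design: class y may receive labels y+1 or y+2 (mod 5).
num_classes = 5
Q = np.zeros((num_classes, num_classes))
for y in range(num_classes):
    Q[y, (y + 1) % num_classes] = 0.5
    Q[y, (y + 2) % num_classes] = 0.5

logits = np.array([[2.0, 0.5, -1.0, 0.0, 1.0]])
# Averaging over the two possible complementary labels of true class 0
# recovers the ordinary cross-entropy loss for class 0.
expected = 0.5 * ure_risk(logits, [1], Q) + 0.5 * ure_risk(logits, [2], Q)
print(expected, per_class_cross_entropy(logits)[0, 0])
```

Since the inverse of Q can contain negative entries, the corrected loss for an individual sample can be negative even though its expectation matches the ordinary loss.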
Axiom & Free-Parameter Ledger
free parameters (1)
- bias distribution parameters
axioms (1)
- domain assumption: The biased generation process can be enforced at data collection time and the resulting transition matrix is known exactly for training.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Passage: We provide an information-theoretic perspective... Theorem 1 (Lower Bound on Supervision Error)... H_Q(Y|Ȳ) quantifies the uncertainty...
- IndisputableMonolith/Foundation/BranchSelection.lean: branch_selection
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Passage: BICL... deliberately designing a biased (non-uniform) generation process that restricts complementary labels to a subset of classes.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Tan-Ha Mai, Nai-Xuan Ye, Yu-Wei Kuan, Po-Yi Lu, and Hsuan-Tien Lin. The unexplored potential of vision-language models for generating large-scale complementary-label learning data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 90–102, 2025.
- [2] Masashi Sugiyama, Han Bao, Takashi Ishida, Nan Lu, Tomoya Sakai, and Gang Niu. Machine learning from weak supervision: An empirical risk minimization approach. MIT Press, 2022.
- [3] Charles Elkan and Keith Noto. Learning classifiers from only positive and unlabeled data. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 213–220, 2008.
- [4] Ryuichi Kiryo, Gang Niu, Marthinus C. du Plessis, and Masashi Sugiyama. Positive-unlabeled learning with non-negative risk estimator. In Advances in Neural Information Processing Systems, volume 30, pages 1675–1685, 2017.
- [5] Yuzhou Cao, Lei Feng, Yitian Xu, Bo An, Gang Niu, and Masashi Sugiyama. Learning from similarity-confidence data. In Proceedings of the 38th International Conference on Machine Learning, pages 1272–1282, 2021.
- [6] Wei Wang, Lei Feng, Yuchen Jiang, Gang Niu, Min-Ling Zhang, and Masashi Sugiyama. Binary classification with confidence difference. In Advances in Neural Information Processing Systems 36, pages 5936–5960, 2023.
- [7] Han Bao, Gang Niu, and Masashi Sugiyama. Classification from pairwise similarity and unlabeled data. In Proceedings of the 35th International Conference on Machine Learning, pages 461–470, 2018.
- [8] Han Bao, Takuya Shimada, Liyuan Xu, Issei Sato, and Masashi Sugiyama. Pairwise supervision can provably elicit a decision boundary. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, pages 2618–2640, 2022.
- [9] Takashi Ishida, Gang Niu, Weihua Hu, and Masashi Sugiyama. Learning from complementary labels. In Advances in Neural Information Processing Systems, pages 5639–5649, 2017.
- [10] Yu-Ting Chou, Gang Niu, Hsuan-Tien Lin, and Masashi Sugiyama. Unbiased risk estimators can mislead: A case study of learning with complementary labels. In International Conference on Machine Learning, pages 1929–1938, 2020.
- [11] Rong Jin and Zoubin Ghahramani. Learning with multiple labels. Advances in Neural Information Processing Systems, 15, 2002.
- [12] Jiaqi Lv, Miao Xu, Lei Feng, Gang Niu, Xin Geng, and Masashi Sugiyama. Progressive identification of true labels for partial-label learning. In International Conference on Machine Learning, pages 6500–6510, 2020.
- [13] Nagarajan Natarajan, Inderjit S Dhillon, Pradeep K Ravikumar, and Ambuj Tewari. Learning with noisy labels. In Advances in Neural Information Processing Systems, volume 26, 2013.
- [14] Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, and Lizhen Qu. Making deep neural networks robust to label noise: A loss correction approach. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1944–1952, 2017.
- [15] Wei Wang, Takashi Ishida, Yu-Jie Zhang, Gang Niu, and Masashi Sugiyama. Learning with complementary labels revisited: The selected-completely-at-random setting is more practical. In Proceedings of the 41st International Conference on Machine Learning, 2024.
- [16] Hsiu-Hsuan Wang, Mai Tan Ha, Nai-Xuan Ye, Wei-I Lin, and Hsuan-Tien Lin. CLImage: Human-annotated datasets for complementary-label learning. Transactions on Machine Learning Research, 2025.
- [17] Xiyu Yu, Tongliang Liu, Mingming Gong, and Dacheng Tao. Learning with biased complementary labels. In European Conference on Computer Vision, pages 68–83, 2018.
- [18] Youngdong Kim, Junho Yim, Juseung Yun, and Junmo Kim. NLNL: Negative learning for noisy labels. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 101–110, 2019.
- [19] Yi Gao and Min-Ling Zhang. Discriminative complementary-label learning with weighted loss. In International Conference on Machine Learning, pages 3587–3597, 2021.
- [20] Wei-I Lin and Hsuan-Tien Lin. Reduction from complementary-label learning to probability estimates. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 469–481, 2023.
- [21] Nai-Xuan Ye, Tan-Ha Mai, Hsiu-Hsuan Wang, Wei-I Lin, and Hsuan-Tien Lin. libcll: an extendable python toolkit for complementary-label learning, 2024.
- [22] Alex Krizhevsky. Learning multiple layers of features from tiny images. Computer Science University of Toronto, Canada, 2009.
- [23] Ya Le and Xuan Yang. Tiny ImageNet visual recognition challenge. Report of CS231N: Deep Learning for Computer Vision Course, 2015. Stanford University.
- [24] Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning. In Advances in Neural Information Processing Systems, volume 36, pages 34892–34916. Curran Associates, Inc., 2023.
- [25] Tan-Ha Mai and Hsuan-Tien Lin. Intra-cluster mixup: An effective data augmentation technique for complementary-label learning. Transactions on Machine Learning Research, 2026.
- [26] Xinlei Chen and Kaiming He. Exploring simple siamese representation learning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15750–15758, 2021.
- [27] Thomas M Cover and Joy A Thomas. Elements of Information Theory. Wiley-Interscience, Hoboken, NJ, 2nd edition, 2006.
- [28] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
- [29] Takashi Ishida, Gang Niu, Aditya Menon, and Masashi Sugiyama. Complementary-label learning for arbitrary losses and models. In International Conference on Machine Learning, pages 2971–2980, 2019.
- [30] Lei Feng, Takuo Kaneko, Bo Han, Gang Niu, Bo An, and Masashi Sugiyama. Learning with multiple complementary labels. In International Conference on Machine Learning, pages 3072–3081, 2020.
- [31] Haoran Jiang, Zhihao Sun, and Yingjie Tian. Comco: Complementary supervised contrastive learning for complementary label learning. Neural Networks, 169:44–56, 2024.
- [32] Yiwei You, Jinglong Huang, Qiang Tong, and Bo Wang. Tackling biased complementary label learning with large margin. Information Sciences, 687:121400, 2025.
- [33] Hiroki Ishiguro, Takashi Ishida, and Masashi Sugiyama. Learning from noisy complementary labels with robust loss functions. IEICE Transactions on Information and Systems, 105:364–376, 2022.
- [34] Meng Wei, Yong Zhou, Zhongnian Li, and Xinzheng Xu. Class-imbalanced complementary-label learning via weighted loss. Neural Networks, 166:555–565, 2023.
- [35] Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, and Yang Liu. Learning with noisy labels revisited: A study using real-world human annotations. In International Conference on Learning Representations, 2022.
- [36] Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, and Tengyu Ma. Learning imbalanced datasets with label-distribution-aware margin loss. In Advances in Neural Information Processing Systems, volume 32, pages 1565–1576, 2019.
- [37] Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, and Serge Belongie. Class-balanced loss based on effective number of samples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9268–9277, 2019.
- [38] Terrance DeVries and Graham W Taylor. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017.
- [39] Ekin D Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V Le. Autoaugment: Learning augmentation strategies from data. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 113–123, 2019.
- [40] Ekin D Cubuk, Barret Zoph, Jonathon Shlens, and Quoc V Le. Randaugment: Practical automated data augmentation with a reduced search space. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 702–703, 2020.
- [41] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020.
- [42] Pierre H Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, et al. Byol works even without batch statistics. arXiv preprint arXiv:2010.10241, 2020.
- [43] Xinlei Chen, Saining Xie, and Kaiming He. An empirical study of training self-supervised vision transformers. In IEEE/CVF international conference on computer vision, pages 9640–9649, 2021.
- [44] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, Chujie Zheng, Dayiheng Liu, Fan Zhou, Fei Huang, Feng Hu, Hao Ge, Haoran Wei, Huan Lin, Jialong Tang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jing Zhou, Jingren Zhou, Junyang Lin, Kai Dang, Keqin Bao, Kexin Yang, ...
- [45] Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Anto...
- [46] Wei Wang, Takashi Ishida, Yu-Jie Zhang, Gang Niu, and Masashi Sugiyama. Learning with complementary labels revisited: The selected-completely-at-random setting is more practical. In International Conference on Machine Learning, volume 235, pages 50683–50710, 2024.
- [47] Shuqi Liu, Yuzhou Cao, Qiaozhen Zhang, Lei Feng, and Bo An. Consistent complementary-label learning via order-preserving losses. In International Conference on Artificial Intelligence and Statistics, pages 8734–8748, 2023.
- [48] Haobo Wang, Ruixuan Xiao, Yixuan Li, Lei Feng, Gang Niu, Gang Chen, and Junbo Zhao. PiCO: Contrastive label disambiguation for partial label learning. In International Conference on Learning Representations, 2022.
- [49] Haobo Wang, Mingxuan Xia, Yixuan Li, Yuren Mao, Lei Feng, Gang Chen, and Junbo Zhao. Solar: Sinkhorn label refinery for imbalanced partial-label learning. Advances in neural information processing systems, 35:8104–8117, 2022.
A Proofs of Theoretical Results
A.1 Proof of Theorem 1
Proof. We start with the standard Fano's Inequality [27], which bounds th...
- [50] Dense Bias: We generated a random transition matrix $Q^{\mathrm{Bias}} \in \mathbb{R}^{C \times C}$ where $Q^{\mathrm{Bias}}_{ij} \sim U[0,1]$ for $i \neq j$ and $Q^{\mathrm{Bias}}_{ii} = 0$. Rows were normalized to sum to 1.
- [51] Sparse Bias (Ours): From $Q^{\mathrm{Bias}}$, we derived a sparse matrix $Q^{\mathrm{Ours}}$ by retaining $k$ ($k = 4$) randomly selected elements per row and re-normalizing.
  Results. We computed the conditional entropy $H(Y \mid \bar{Y})$ for both matrices. The simulation revealed that $H^{\mathrm{Ours}}(Y \mid \bar{Y}) \le H^{\mathrm{Bias}}(Y \mid \bar{Y})$ holds in 100% of the trials across all tested dimensions ($10 \times 10$, 100×1...
- [52] VLM annotator: the results are those provided in Appendix C.6, "Effect of the Number of Sampled Labels". We selected 4 candidate labels as CLs for each class. Note that the setup in Appendix C.6 also discards the true label from the candidate labels.
- [53] Rule-based annotator: discard the true label (reducing the candidate set to 4 classes), then uniformly select one of the remaining 4 classes (all 4 are CLs). This is possible because the true class is available in Figure 2.
  Table 6: Comparison between the VLM annotator and the rule-based annotator on CIFAR-20 (accuracy (%), mean±std). Annotator Method Dataset...
- [54] A candidate set of 4 labels is uniformly sampled from the label space.
- [55] The VLM (LLaVA) is prompted to select the label from this set that does not describe the image.
  Characteristics. While using the same protocol, the VLM annotator significantly reduces label noise, achieving a noise rate of approximately 0.24 percentage points on CIFAR-10, which is much lower than CLImage. However, contrary to the expectation that uniform ca...
- [56] proposed a framework to estimate the classification risk unbiasedly using CLs. The core idea relies on an inverse transition matrix to recover the unbiased risk of the true classifier. The general loss formulation is:
  $R_{\mathrm{URE}}(g) = \frac{1}{N} \sum_{i=1}^{N} e_{\bar{y}_i}^{\top} Q^{-1} L(g(x_i))$,
  where $e_{\bar{y}_i}$ is the one-hot vector of the complementary label, and $L(g(x_i))$ denotes the vector of lo...
- [57] introduced the Complementary Probability Estimation (CPE) framework. Unlike risk-correction methods, CPE focuses on directly estimating the probability of a label being complementary, denoted as $p(\bar{y} \mid x)$. The objective is to minimize the divergence between the model's output and the complementary target. CPE employs a surrogate complementary estimation l...
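The dense-versus-sparse entropy comparison described in entries [50] and [51] above can be approximated with a short simulation. This is a sketch under stated assumptions rather than the authors' script: it assumes a uniform class prior when computing H(Y | Ȳ), picks the k retained entries uniformly among off-diagonal positions, and uses hypothetical function names.

```python
import numpy as np

def dense_bias_matrix(num_classes, rng):
    """Off-diagonal entries ~ U[0, 1], zero diagonal, rows normalized to 1."""
    q = rng.uniform(0.0, 1.0, size=(num_classes, num_classes))
    np.fill_diagonal(q, 0.0)
    return q / q.sum(axis=1, keepdims=True)

def sparsify_rows(q_dense, k, rng):
    """Keep k randomly chosen off-diagonal entries per row, then renormalize."""
    num_classes = q_dense.shape[0]
    q_sparse = np.zeros_like(q_dense)
    for y in range(num_classes):
        candidates = np.delete(np.arange(num_classes), y)
        keep = rng.choice(candidates, size=k, replace=False)
        q_sparse[y, keep] = q_dense[y, keep]
    return q_sparse / q_sparse.sum(axis=1, keepdims=True)

def conditional_entropy_bits(q):
    """H(Y | Ybar) in bits, assuming a uniform prior over true classes."""
    num_classes = q.shape[0]
    joint = q / num_classes                       # P(y, ybar)
    p_comp = joint.sum(axis=0, keepdims=True)     # P(ybar)
    posterior = joint / np.where(p_comp > 0, p_comp, 1.0)
    mask = joint > 0
    return float(-(joint[mask] * np.log2(posterior[mask])).sum())

rng = np.random.default_rng(0)
for num_classes in (10, 100):
    q_dense = dense_bias_matrix(num_classes, rng)
    q_sparse = sparsify_rows(q_dense, k=4, rng=rng)
    print(num_classes,
          conditional_entropy_bits(q_dense),
          conditional_entropy_bits(q_sparse))
```

A lower conditional entropy for the sparse matrix means each complementary label narrows down the true class more, which is the property the appendix simulation reports.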