Position: Weight Space Should Be a First-Class Generative AI Modality
Pith reviewed 2026-05-20 12:06 UTC · model grok-4.3
The pith
Treating neural network checkpoints as a first-class generative modality lets models be synthesized in weight space to match fine-tuning at far lower cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Neural network checkpoints should be treated as a first-class data modality, and generative modeling in weight space should be standardized as a core machine learning primitive. High-performing models occupy low-dimensional, highly structured regions of weight space shaped by symmetry, flatness, modularity, and shared subspaces, allowing weights to be synthesized on demand that often match fine-tuning performance while reducing adaptation cost by orders of magnitude.
What carries the argument
Generative synthesis in weight space: learning distributions over trained checkpoints to sample new weight vectors that inherit the structural properties of high-performing models.
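To make "learning distributions over trained checkpoints" concrete, here is a minimal sketch. It is not the paper's method: the toy checkpoint zoo, the low-rank Gaussian model, and the helper names `fit_lowrank_gaussian` and `sample_weights` are all assumptions chosen for illustration. Real systems in this area use far richer generators such as diffusion models or hypernetworks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "checkpoint zoo" (an assumption of this sketch): 200 weight vectors of
# dimension 50 that lie close to a 3-dimensional affine subspace.
n_ckpt, dim, k = 200, 50, 3
basis = rng.normal(size=(k, dim))
mean_w = rng.normal(size=dim)
zoo = (mean_w
       + rng.normal(size=(n_ckpt, k)) @ basis
       + 0.01 * rng.normal(size=(n_ckpt, dim)))


def fit_lowrank_gaussian(weights, rank):
    """Fit a Gaussian restricted to the top-`rank` principal directions."""
    mu = weights.mean(axis=0)
    _, s, vt = np.linalg.svd(weights - mu, full_matrices=False)
    scales = s[:rank] / np.sqrt(len(weights) - 1)  # std. dev. per direction
    return mu, vt[:rank], scales


def sample_weights(mu, directions, scales, n, rng):
    """Draw n new weight vectors from the fitted low-rank Gaussian."""
    z = rng.normal(size=(n, len(scales))) * scales
    return mu + z @ directions


mu, directions, scales = fit_lowrank_gaussian(zoo, rank=k)
new_w = sample_weights(mu, directions, scales, n=5, rng=rng)
print(new_w.shape)
```

Every sampled vector lies in the fitted subspace by construction, which is the toy analogue of new weights inheriting the structure of the checkpoints they were learned from.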
If this is right
- New checkpoints can be created for specific tasks without running full fine-tuning or optimization from random initialization.
- Adaptation to new domains or architectures becomes feasible at orders-of-magnitude lower compute cost than current practice.
- Methods can be standardized into a five-stage pipeline covering data collection, representation learning, distribution modeling, sampling, and evaluation.
- Practical deployment is already possible for adapter-scale and conditional generation settings.
- AI systems can begin to improve or create other AI systems by sampling directly from learned weight distributions.
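The five stages named above can be walked through on a toy problem. The sketch below is only an illustration under strong assumptions: tasks are noisy linear regressions, each task has a hypothetical 4-dimensional descriptor `c`, the "representation" is the raw flat weight vector, and the conditional model is a plain least-squares map. None of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_task, d_w, n_tasks, n_pts = 4, 10, 100, 64
A = rng.normal(size=(d_task, d_w))  # hidden task-to-weight relation (toy assumption)


def make_task(c):
    """Generate regression data whose true weights depend on descriptor c."""
    X = rng.normal(size=(n_pts, d_w))
    y = X @ (c @ A) + 0.01 * rng.normal(size=n_pts)
    return X, y


def train(X, y):
    """Per-task optimization, standing in for ordinary training."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))


# Stage 1, data collection: (descriptor, trained weights) pairs.
C = rng.normal(size=(n_tasks, d_task))
W = np.stack([train(*make_task(c)) for c in C])

# Stage 2, representation learning: identity here (raw flat weights).
Z = W

# Stage 3, distribution modeling: linear conditional map from descriptor to weights.
M = np.linalg.lstsq(C, Z, rcond=None)[0]  # shape (d_task, d_w)

# Stage 4, sampling: generate weights for an unseen task with no optimization.
c_new = rng.normal(size=d_task)
w_gen = c_new @ M

# Stage 5, evaluation: compare against weights trained directly on the new task.
X_tr, y_tr = make_task(c_new)
X_te, y_te = make_task(c_new)
mse_gen = mse(w_gen, X_te, y_te)
mse_fit = mse(train(X_tr, y_tr), X_te, y_te)
print(f"generated: {mse_gen:.5f}  trained: {mse_fit:.5f}")
```

In this toy setting the generated weights are competitive with the trained ones only because the task-to-weight relation is exactly linear; whether anything comparable holds at scale is the open question the page discusses.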
Where Pith is reading between the lines
- Model repositories could evolve from static collections into primary training corpora for meta-generative systems.
- Conditional control over sampled weights might enable systematic creation of models with targeted properties such as efficiency or robustness.
- The same low-dimensional structure could inform new approaches to model merging, compression, and modular composition.
Load-bearing premise
The structural properties observed in recent adapter-scale and conditional generation results will scale to unrestricted frontier-scale checkpoint synthesis without additional fundamental limitations.
What would settle it
An experiment in which generative synthesis from weight distributions fails to reach fine-tuning accuracy on a large new task, or in which no low-dimensional structured regions are found among frontier-model weights.
Original abstract
Neural network checkpoints have quietly become a large-scale data resource: millions of trained weight vectors now exist, each encoding task-, domain-, and architecture-specific knowledge. This position paper argues that model checkpoints should be treated as a first-class data modality, and that generative modeling in weight space should be standardized as a core machine learning primitive. Recent advances demonstrate that neural weights can be synthesized on demand, often matching fine-tuning performance while reducing adaptation cost by orders of magnitude. We contend that these results reflect an underlying structural fact: high-performing models occupy low-dimensional, highly structured regions of weight space shaped by symmetry, flatness, modularity, and shared subspaces. Building on this view, we organize existing methods into a five-stage pipeline, survey applications where the approach is already practical, and clarify current limits: adapter-scale and conditional generation are advancing rapidly, while unrestricted frontier-scale checkpoint synthesis remains open. Our goal is to shift the community's default mindset from optimizing models per task to sampling models from learned weight distributions, accelerating toward an era in which AI systems routinely improve or create other AI systems.
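Of the structural properties the abstract lists, symmetry is the easiest to check directly. The following sketch, using a small two-layer ReLU network with made-up sizes, shows that reordering hidden units changes the weight vector but not the function it computes. The network shape and the helper `mlp` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny MLP: 4 inputs -> 8 hidden ReLU units -> 3 outputs.
W1 = rng.normal(size=(8, 4))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(3, 8))


def mlp(x, W1, b1, W2):
    """Forward pass of a two-layer ReLU network."""
    hidden = np.maximum(x @ W1.T + b1, 0.0)
    return hidden @ W2.T


# Reorder the hidden units: permute rows of W1 and b1, and the matching
# columns of W2.
perm = np.roll(np.arange(8), 1)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=(5, 4))
same_function = np.allclose(mlp(x, W1, b1, W2), mlp(x, W1p, b1p, W2p))
same_weights = np.array_equal(W1, W1p)
print(same_function, same_weights)
```

Because many distinct weight vectors encode the same function, a generative model over raw weights has to either learn this redundancy or have it removed beforehand, for example by aligning checkpoints to a common unit ordering.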
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This position paper argues that neural network checkpoints should be treated as a first-class generative AI modality. It claims that high-performing models occupy low-dimensional, highly structured regions of weight space due to symmetry, flatness, modularity, and shared subspaces. Recent advances in weight synthesis are said to match fine-tuning performance at orders-of-magnitude lower adaptation cost. The authors organize existing methods into a five-stage pipeline, survey practical applications, and note that adapter-scale and conditional generation are advancing while unrestricted frontier-scale checkpoint synthesis remains open. The goal is to shift the community from per-task optimization toward sampling models from learned weight distributions.
Significance. If the position holds, it could drive a paradigm shift in machine learning by standardizing generative modeling over weight distributions, enabling AI systems to create or improve other models with substantially reduced compute. This would build directly on cited advances in adapters and conditional generation to realize large efficiency gains. The significance is tempered by the acknowledged open problem at frontier scale, but the framing as a core primitive could usefully redirect research priorities if the structural assumptions prove robust.
major comments (2)
- [Structural fact paragraph] The paragraph beginning 'We contend that these results reflect an underlying structural fact': the central claim that observed synthesis results reflect low-dimensional, symmetric, flat, and modular structure enabling orders-of-magnitude cost reduction is asserted on the basis of adapter-scale and conditional-generation advances. No measurement of effective dimensionality, no scaling relation between manifold dimension and parameter count, and no ablation showing that these properties survive removal of adapters are supplied, leaving the extrapolation to unrestricted frontier-scale synthesis untested and load-bearing for the main thesis.
- [Five-stage pipeline section] The section organizing existing methods into a five-stage pipeline: while the pipeline provides a useful taxonomy, the manuscript does not analyze how each stage would scale when the effective dimension of high-performing weight regions grows with model size, nor does it identify capacity limits of current generative models that could prevent the claimed cost reductions at frontier scale.
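As a rough idea of what the effective-dimensionality measurement requested in the first major comment might look like, the sketch below computes two common summaries over a collection of flattened checkpoints: the participation ratio of the PCA spectrum and the number of components needed for 95% of the variance. The toy data and the function names are assumptions of this sketch, and these estimators are one possible choice, not something the paper or the referee specifies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy checkpoints (assumption): 300 vectors in 40 dimensions, generated near
# a 3-dimensional subspace plus small isotropic noise.
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 40))
checkpoints = latent @ mixing + 0.01 * rng.normal(size=(300, 40))


def pca_eigenvalues(weights):
    """Eigenvalues of the sample covariance, via singular values."""
    centered = weights - weights.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s ** 2 / (len(weights) - 1)


def participation_ratio(weights):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    eig = pca_eigenvalues(weights)
    return float(eig.sum() ** 2 / (eig ** 2).sum())


def components_for_variance(weights, frac=0.95):
    """Smallest number of principal components explaining `frac` of variance."""
    eig = pca_eigenvalues(weights)
    cumulative = np.cumsum(eig) / eig.sum()
    return int(np.searchsorted(cumulative, frac) + 1)


pr = participation_ratio(checkpoints)
n95 = components_for_variance(checkpoints)
print(f"participation ratio: {pr:.2f}, components for 95% variance: {n95}")
```

Repeating such a measurement across model sizes would give the scaling relation between manifold dimension and parameter count that the comment says is missing. Linear estimators like these can overstate the dimension of a curved manifold, so they are best read as upper-bound style diagnostics.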
minor comments (2)
- The abstract states that 'millions of trained weight vectors now exist' without a supporting citation or rough estimate of the current scale of public checkpoints.
- [Applications survey] The survey of applications would benefit from explicit cross-references to the specific performance numbers or cost-reduction factors reported in the cited works.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our position paper. We address the major comments below, clarifying our approach as a synthesis of existing work and outlining planned revisions to strengthen the manuscript.
- Referee: [Structural fact paragraph] The paragraph beginning 'We contend that these results reflect an underlying structural fact': the central claim that observed synthesis results reflect low-dimensional, symmetric, flat, and modular structure enabling orders-of-magnitude cost reduction is asserted on the basis of adapter-scale and conditional-generation advances. No measurement of effective dimensionality, no scaling relation between manifold dimension and parameter count, and no ablation showing that these properties survive removal of adapters are supplied, leaving the extrapolation to unrestricted frontier-scale synthesis untested and load-bearing for the main thesis.
  Authors: As a position paper, our intent is to highlight the implications of recent advances in weight-space generation rather than to conduct new empirical studies. The structural properties are supported by the body of cited work on neural network geometry. We will revise the relevant paragraph to explicitly note that the low-dimensional structure is inferred from adapter-scale results and to emphasize that extension to frontier-scale models is a motivating hypothesis rather than a proven fact. We will also incorporate additional citations on measurements of effective dimensionality in weight spaces to better ground the claim.
  Revision: partial
- Referee: [Five-stage pipeline section] The section organizing existing methods into a five-stage pipeline: while the pipeline provides a useful taxonomy, the manuscript does not analyze how each stage would scale when the effective dimension of high-performing weight regions grows with model size, nor does it identify capacity limits of current generative models that could prevent the claimed cost reductions at frontier scale.
  Authors: We agree that a more explicit discussion of scaling would be beneficial. In the revised manuscript, we will add analysis to the pipeline section addressing how the stages might be affected by increasing effective dimensionality and the known limitations of current generative models (e.g., mode collapse or computational intractability in very high dimensions). This will better contextualize why unrestricted frontier-scale synthesis is presented as an open challenge.
  Revision: yes
Circularity Check
No circularity: position paper references external advances without internal reduction
This is a high-level position paper that organizes existing methods into a five-stage pipeline and interprets recent external results as evidence for low-dimensional structure in weight space. No equations, fitted parameters, or derivations appear in the manuscript. The central contention that results 'reflect an underlying structural fact' is presented as an interpretive claim supported by cited prior work rather than a self-referential construction or load-bearing self-citation chain internal to this document. The paper explicitly flags frontier-scale synthesis as open, avoiding any claim that reduces to its own inputs by definition. This is the expected non-finding for a survey-style position statement.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: High-performing models occupy low-dimensional, highly structured regions of weight space shaped by symmetry, flatness, modularity, and shared subspaces.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "high-performing models occupy low-dimensional, highly structured regions of weight space shaped by symmetry, flatness, modularity, and shared subspaces"
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "Permutation Symmetries and Quotient Geometry; Flatness and Low Intrinsic Dimension"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.