Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning
Pith reviewed 2026-05-25 05:42 UTC · model grok-4.3
The pith
A single intelligent editing task with complex instructions improves understanding, generation, and editing in unified multimodal models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Uni-Edit shows that image editing is inherently suited as a general task for unified multimodal models because it requires both visual understanding and generation; existing simplistic editing data underuses understanding capacity, but an automated synthesis pipeline that embeds reasoning into instructions creates data that lets one training stage on one dataset improve all three capabilities at once.
What carries the argument
The automated data synthesis pipeline that converts diverse VQA data into complex editing instructions with embedded questions and nested logic to form the Uni-Edit-148k dataset.
If this is right
- One training stage on one dataset replaces multi-stage mixed pipelines and balancing tricks.
- Performance on understanding, generation, and editing all rise together without task conflicts.
- Editing data can be scaled automatically from existing VQA sources while preserving reasoning demands.
- The approach works across different unified models without model-specific auxiliary operations.
Where Pith is reading between the lines
- Task complexity may matter more than task diversity for avoiding conflicts in multimodal training.
- The synthesis method could be adapted to create reasoning-intensive data for other paired capabilities such as captioning paired with generation.
- If the gains hold, unified models might be tuned more efficiently by focusing on one well-designed cross-cutting task.
Load-bearing premise
The synthesized complex editing instructions are what drive gains in understanding capacity, rather than dataset artifacts, scale, or evaluation choices.
What would settle it
Retraining the same base models on the Uni-Edit-148k dataset and measuring no gain or a drop on standard understanding and generation benchmarks would falsify the claim.
Figures
read the original abstract
Currently, enhancing Unified Multimodal Models (UMMs) with image understanding, generation, and editing capabilities mainly relies on mixed multi-task training. Due to inherent task conflicts, such strategy requires complex multi-stage pipelines, massive data mixing, and balancing tricks, merely resulting in a performance trade-off rather than true mutual reinforcement. To break this paradigm, we propose Uni-Edit, an intelligent image editing task that serves as the first general task for UMM tuning. Unlike complex mixed pipelines, Uni-Edit improves performance across all three abilities at once using only one task, one training stage, and one dataset. Specifically, we first identify image editing as an inherently ideal general task, as it naturally demands both visual understanding and generation. However, existing editing data relies on simplistic instructions that severely underutilize a model's understanding capacity. To address this, we introduce the first automated and scalable data synthesis pipeline for intelligent editing, transforming diverse VQA data into complex and effective editing instructions with embedded questions and nested logic. This yields Uni-Edit-148k, pairing diverse reasoning-intensive instructions with high-quality edited images. Extensive experiments on BAGEL and Janus-Pro demonstrate that tuning solely on Uni-Edit achieves comprehensive enhancements across all three capabilities without any auxiliary operations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that image editing can function as a general task for unified multimodal model (UMM) tuning. By synthesizing a dataset (Uni-Edit-148k) of complex editing instructions with embedded questions and nested logic from VQA data, tuning solely on this single task, single stage, and single dataset improves understanding, generation, and editing capabilities simultaneously on models such as BAGEL and Janus-Pro, avoiding the performance trade-offs of mixed multi-task training.
Significance. If the central empirical claim holds after proper controls, the result would be significant: it would demonstrate that a single inherently multi-capability task (editing) can produce mutual reinforcement across understanding and generation without auxiliary operations or multi-stage balancing, simplifying UMM training pipelines. The automated synthesis pipeline for reasoning-intensive editing instructions would also be a methodological contribution if shown to be the driver of gains.
major comments (2)
- [Abstract; Experiments] Abstract and Experiments section: The claim that 'tuning solely on Uni-Edit achieves comprehensive enhancements across all three capabilities' and that gains arise specifically from the 'complex and effective editing instructions with embedded questions and nested logic' (rather than the editing task itself or dataset artifacts) is load-bearing but unsupported by controls. No ablation is reported comparing Uni-Edit to (a) simplistic editing instructions on identical image pairs or (b) non-editing multi-task baselines using the same VQA source data; without these, attribution to the 'intelligent' property cannot be verified.
- [Method / Data Synthesis] Data synthesis pipeline description: The pipeline is presented as transforming VQA data into complex instructions, but no quantitative validation is supplied (e.g., distribution of nesting depth, percentage of embedded questions, or inter-annotator agreement on instruction quality and edited-image fidelity). This leaves open whether the synthesized data actually demands and improves understanding capacity as asserted.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive feedback. The two major comments highlight important aspects of experimental rigor and methodological validation. We respond to each point below and outline revisions that will strengthen the attribution of results to the proposed intelligent editing task.
read point-by-point responses
-
Referee: [Abstract; Experiments] Abstract and Experiments section: The claim that 'tuning solely on Uni-Edit achieves comprehensive enhancements across all three capabilities' and that gains arise specifically from the 'complex and effective editing instructions with embedded questions and nested logic' (rather than the editing task itself or dataset artifacts) is load-bearing but unsupported by controls. No ablation is reported comparing Uni-Edit to (a) simplistic editing instructions on identical image pairs or (b) non-editing multi-task baselines using the same VQA source data; without these, attribution to the 'intelligent' property cannot be verified.
Authors: We agree that isolating the contribution of instruction complexity is critical for the central claim. Our current experiments compare single-task Uni-Edit tuning against the base models and against mixed multi-task baselines reported in the literature, showing simultaneous gains without trade-offs. However, we did not include the exact controls suggested: (a) a simplistic-instruction variant on the same image pairs and (b) a non-editing multi-task setup derived directly from the VQA source data. These ablations would provide stronger evidence that the embedded questions and nested logic are the key drivers. In the revised manuscript we will add both controls on a held-out subset of Uni-Edit-148k, reporting performance deltas for understanding, generation, and editing metrics. revision: yes
-
Referee: [Method / Data Synthesis] Data synthesis pipeline description: The pipeline is presented as transforming VQA data into complex instructions, but no quantitative validation is supplied (e.g., distribution of nesting depth, percentage of embedded questions, or inter-annotator agreement on instruction quality and edited-image fidelity). This leaves open whether the synthesized data actually demands and improves understanding capacity as asserted.
Authors: We acknowledge that the Method section describes the pipeline at a procedural level without accompanying statistics. While the pipeline is fully automated and scalable, the absence of quantitative descriptors (nesting-depth histograms, fraction of embedded questions, or quality metrics) limits the ability to verify that the generated instructions are reasoning-intensive. In the revision we will add a dedicated subsection with these statistics for Uni-Edit-148k, including average nesting depth, percentage of instructions containing embedded questions, and results from a small-scale human evaluation of instruction quality and edited-image fidelity (reported as agreement rates). revision: yes
Circularity Check
No circularity; empirical results rest on external benchmarks and data synthesis, not self-referential reduction
full rationale
The paper advances an empirical claim: tuning UMMs solely on the synthesized Uni-Edit-148k dataset yields simultaneous gains in understanding, generation, and editing on BAGEL and Janus-Pro. No equations, derivations, fitted parameters renamed as predictions, or uniqueness theorems appear. The synthesis pipeline transforms VQA data into instructions, but this is a constructive data-generation step whose outputs are then evaluated on independent benchmarks; it does not reduce the performance claim to a definitional identity. No self-citations are invoked as load-bearing mathematical facts. The central result is therefore self-contained against external test sets rather than circular by construction.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu, Sangkyung Kwak, Huiwon Jang, Jongheon Jeong, Jonathan Huang, Jinwoo Shin, and Saining Xie. Representation alignment for generation: Training diffusion transformers is easier than you think.arXiv preprint arXiv:2410.06940, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[2]
Emerging Properties in Unified Multimodal Pretraining
Chaorui Deng, Deyao Zhu, Kunchang Li, Chenhui Gou, Feng Li, Zeyu Wang, Shu Zhong, Weihao Yu, Xiaonan Nie, Ziang Song, et al. Emerging properties in unified multimodal pretraining.arXiv preprint arXiv:2505.14683, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[3]
Yufeng Cui, Honghao Chen, Haoge Deng, Xu Huang, Xinghang Li, Jirong Liu, Yang Liu, Zhuoyan Luo, Jinsheng Wang, Wenxuan Wang, et al. Emu3. 5: Native multimodal models are world learners.arXiv preprint arXiv:2510.26583, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[4]
Anyedit: Mastering unified high-quality image editing for any idea
Qifan Yu, Wei Chow, Zhongqi Yue, Kaihang Pan, Yang Wu, Xiaoyang Wan, Juncheng Li, Siliang Tang, Hanwang Zhang, and Yueting Zhuang. Anyedit: Mastering unified high-quality image editing for any idea. InCVPR, 2025
work page 2025
-
[5]
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training
Xiang An, Yin Xie, Kaicheng Yang, Wenkang Zhang, Xiuwei Zhao, Zheng Cheng, Yirui Wang, Songcen Xu, Changrui Chen, Didi Zhu, et al. Llava-onevision-1.5: Fully open framework for democratized multimodal training.arXiv preprint arXiv:2509.23661, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[6]
Google. Nano-banana-pro. Accessed November, 2025 [Online] https://deepmind.google/models/ gemini-image/pro/, 2025
work page 2025
-
[7]
OpenAI. Gpt-4o. Accessed November 18, 2024 [Online]https://chatgpt.com/, 2024
work page 2024
-
[8]
Shuai Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Sibo Song, Kai Dang, Peng Wang, Shijie Wang, Jun Tang, Humen Zhong, Yuanzhi Zhu, Mingkun Yang, Zhaohai Li, Jianqiang Wan, Pengfei Wang, Wei Ding, Zheren Fu, Yiheng Xu, Jiabo Ye, Xi Zhang, Tianbao Xie, Zesen Cheng, Hang Zhang, Zhibo Yang, Haiyang Xu, and Junyang Lin. Qwen2.5-vl technical report.ar...
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[9]
An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report.arXiv preprint arXiv:2505.09388, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[10]
The llama 3 herd of models.arXiv e-prints, 2024
Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. The llama 3 herd of models.arXiv e-prints, 2024
work page 2024
-
[11]
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Peiyuan Zhang, Yanwei Li, Ziwei Liu, et al. Llava-onevision: Easy visual task transfer.arXiv preprint arXiv:2408.03326, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[12]
Emu3: Next-Token Prediction is All You Need
Xinlong Wang, Xiaosong Zhang, Zhengxiong Luo, Quan Sun, Yufeng Cui, Jinsheng Wang, Fan Zhang, Yueze Wang, Zhen Li, Qiying Yu, et al. Emu3: Next-token prediction is all you need.arXiv preprint arXiv:2409.18869, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[13]
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team. Chameleon: Mixed-modal early-fusion foundation models.arXiv preprint arXiv:2405.09818, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[14]
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Yecheng Wu, Zhuoyang Zhang, Junyu Chen, Haotian Tang, Dacheng Li, Yunhao Fang, Ligeng Zhu, Enze Xie, Hongxu Yin, Li Yi, et al. Vila-u: a unified foundation model integrating visual understanding and generation.arXiv preprint arXiv:2409.04429, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[15]
Longcat-next: Lexicalizing modalities as discrete tokens.arXiv preprint arXiv:2603.27538, 2026
Meituan LongCat Team, Bin Xiao, Chao Wang, Chengjiang Li, Chi Zhang, Chong Peng, Hang Yu, Hao Yang, Haonan Yan, Haoze Sun, et al. Longcat-next: Lexicalizing modalities as discrete tokens.arXiv preprint arXiv:2603.27538, 2026
-
[16]
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Xiaokang Chen, Zhiyu Wu, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, and Chong Ruan. Janus-pro: Unified multimodal understanding and generation with data and model scaling.arXiv preprint arXiv:2501.17811, 2025. 10 Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[17]
HunyuanImage 3.0 Technical Report
Siyu Cao, Hangting Chen, Peng Chen, Yiji Cheng, Yutao Cui, Xinchi Deng, Ying Dong, Kipper Gong, Tianpeng Gu, Xiusen Gu, et al. Hunyuanimage 3.0 technical report.arXiv preprint arXiv:2509.23951, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[18]
Han Li, Xinyu Peng, Yaoming Wang, Zelin Peng, Xin Chen, Rongxiang Weng, Jingang Wang, Xunliang Cai, Wenrui Dai, and Hongkai Xiong. Onecat: Decoder-only auto-regressive model for unified understanding and generation.arXiv preprint arXiv:2509.03498, 2025
-
[19]
AIA: Rethinking Architecture Decoupling Strategy In Unified Multimodal Model
Dian Zheng, Manyuan Zhang, Hongyu Li, Kai Zou, Hongbo Liu, Ziyu Guo, Kaituo Feng, Yexin Liu, Ying Luo, Yan Feng, et al. Architecture decoupling is not all you need for unified multimodal model.arXiv preprint arXiv:2511.22663, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[20]
Scaling rectified flow transformers for high-resolution image synthesis
Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow transformers for high-resolution image synthesis. InICML, 2024
work page 2024
-
[21]
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. Sdxl: Improving latent diffusion models for high-resolution image synthesis.arXiv preprint arXiv:2307.01952, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[22]
Flux.https://github.com/black-forest-labs/flux, 2024
Black Forest Labs. Flux.https://github.com/black-forest-labs/flux, 2024
work page 2024
-
[23]
Chenfei Wu, Jiahao Li, Jingren Zhou, Junyang Lin, Kaiyuan Gao, Kun Yan, Sheng-ming Yin, Shuai Bai, Xiao Xu, Yilei Chen, et al. Qwen-image technical report.arXiv preprint arXiv:2508.02324, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[24]
Glm-image.https://huggingface.co/zai-org/GLM-Image, 2026
Zhipu AI. Glm-image.https://huggingface.co/zai-org/GLM-Image, 2026
work page 2026
-
[25]
Step1X-Edit: A Practical Framework for General Image Editing
Shiyu Liu, Yucheng Han, Peng Xing, Fukun Yin, Rui Wang, Wei Cheng, Jiaqi Liao, Yingming Wang, Honghao Fu, Chunrui Han, et al. Step1x-edit: A practical framework for general image editing.arXiv preprint arXiv:2504.17761, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[26]
NextStep Team, Chunrui Han, Guopeng Li, Jingwei Wu, Quan Sun, Yan Cai, Yuang Peng, Zheng Ge, Deyu Zhou, Haomiao Tang, et al. Nextstep-1: Toward autoregressive image generation with continuous tokens at scale.arXiv preprint arXiv:2508.10711, 2025
-
[27]
Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, et al. Editthinker: Unlocking iterative reasoning for any image editor.arXiv preprint arXiv:2512.05965, 2025
-
[28]
Chenhui Gou, Zilong Chen, Zeyu Wang, Feng Li, Deyao Zhu, Zicheng Duan, Kunchang Li, Chaorui Deng, Hongyi Yuan, Haoqi Fan, et al. Vq-va world: Towards high-quality visual question-visual answering.arXiv preprint arXiv:2511.20573, 2025
-
[29]
Le Zhuo, Songhao Han, Yuandong Pu, Boxiang Qiu, Sayak Paul, Yue Liao, Yihao Liu, Jie Shao, Xi Chen, Si Liu, and Hongsheng Li. Factuality matters: When image generation and editing meet structured visuals.arXiv preprint arXiv:2510.05091, 2025
-
[30]
Yi Zhang, Bolin Ni, Xin-Sheng Chen, Heng-Rui Zhang, Yongming Rao, Houwen Peng, Qinglin Lu, Han Hu, Meng-Hao Guo, and Shi-Min Hu. Bee: A high-quality corpus and full-stack suite to unlock advanced fully open mllms.arXiv preprint arXiv:2510.13795, 2025
-
[31]
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, et al. Mme: A comprehensive evaluation benchmark for multimodal large language models.arXiv preprint arXiv:2306.13394, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[32]
Mmbench: Is your multi-modal model an all-around player? InECCV, 2024
Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, et al. Mmbench: Is your multi-modal model an all-around player? InECCV, 2024
work page 2024
-
[33]
Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert agi
Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, et al. Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert agi. In CVPR, 2024
work page 2024
-
[34]
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, and Jianfeng Gao. Mathvista: Evaluating mathematical reasoning of foundation models in visual contexts.arXiv preprint arXiv:2310.02255, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[35]
Eyes wide shut? exploring the visual shortcomings of multimodal llms
Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, and Saining Xie. Eyes wide shut? exploring the visual shortcomings of multimodal llms. InCVPR, 2024
work page 2024
-
[36]
Geneval: An object-focused framework for evaluating text-to-image alignment
Dhruba Ghosh, Hannaneh Hajishirzi, and Ludwig Schmidt. Geneval: An object-focused framework for evaluating text-to-image alignment. InNeurIPS, 2023
work page 2023
-
[37]
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
Yuwei Niu, Munan Ning, Mengren Zheng, Weiyang Jin, Bin Lin, Peng Jin, Jiaqi Liao, Chaoran Feng, Kunpeng Ning, Bin Zhu, et al. Wise: A world knowledge-informed semantic evaluation for text-to-image generation.arXiv preprint arXiv:2503.07265, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[38]
ImgEdit: A Unified Image Editing Dataset and Benchmark
Yang Ye, Xianyi He, Zongjian Li, Bin Lin, Shenghai Yuan, Zhiyuan Yan, Bohan Hou, and Li Yuan. Imgedit: A unified image editing dataset and benchmark.arXiv preprint arXiv:2505.20275, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[39]
Xiangyu Zhao, Peiyuan Zhang, Kexian Tang, Xiaorong Zhu, Hao Li, Wenhao Chai, Zicheng Zhang, Renqiu Xia, Guangtao Zhai, Junchi Yan, et al. Envisioning beyond the pixels: Benchmarking reasoning-informed visual editing.arXiv preprint arXiv:2504.02826, 2025
-
[40]
"Your output must be a single JSON object.\n\n
Ji Xie, Trevor Darrell, Luke Zettlemoyer, and XuDong Wang. Reconstruction alignment improves unified multimodal models. arXiv preprint arXiv:2509.07295, 2025. 11 Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning A System Prompt Task Type Classification System Prompt "You are an expert data processor. Your task is to analyze the inpu...
-
[41]
**Blurriness & Artifacts**: - Is the image significantly blurry, pixelated, or noisy? - Are there compression artifacts or "fried" textures? - Is the text (if any) legible, or is it garbled/gibberish?
-
[42]
**Structural Coherence (The "Uncanny Valley" Check)**: - Do objects look physically plausible? - Are there distorted limbs, melted faces, or floating objects that defy gravity? - Is the composition chaotic or nonsensical?
-
[43]
**Visual Harmony**: - Do the lighting and shadows match across the image? - Are there harsh, unnatural seams or "pasted-on" effects (bad compositing)? - Are the colors overly saturated, washed out, or broken? ### Scoring Scale (1-5): - **5 (High Quality)**: Sharp, coherent, natural-looking, and aesthetically pleasing. No visible artifacts. - **4 (Good)**:...
-
[44]
**Original Image**: The first input image, which is before editing and is a realistic image
-
[45]
**Edited Image**: The second input image, which is the one after editing
-
[46]
**edit_instruction**: The command the model was supposed to follow. Note that this instruction may involve: - **Spatial Grounding**: Referring to specific regions (e.g., "the region in the answer"). - **Visual Transformation**: Changing style, objects, attributes or doing ocr, caption
-
[47]
replace bushes with flower beds
**original_question & process_answer**: These define the **target** or **premise** of the edit. - If the Answer is a coordinate (bounding box), it defines *where* the edit must happen. - If the Answer is a caption/description, it defines the *answer* for the region and it need to be pushed into a blackboard or letter based on the edit_instruction. ### Eva...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.