ConfusionPrompt: Practical Private Inference for Online Large Language Models
Pith reviewed 2026-05-24 04:56 UTC · model grok-4.3
The pith
ConfusionPrompt protects prompts sent to black-box LLMs by splitting them into sub-prompts mixed with generated pseudo-prompts that the user later filters and recomposes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ConfusionPrompt achieves private inference on black-box LLMs in four steps: it decomposes the original prompt into sub-prompts, generates pseudo-prompts to form a privacy-preserving group, transmits the mixed set to the server, and lets the user filter and recompose the responses into the correct output. The paper reports higher utility than local open-source inference or perturbation methods, and lower memory use than running a full model locally.
What carries the argument
The ConfusionPrompt framework, which decomposes prompts into genuine sub-prompts, interleaves them with pseudo-prompts, and relies on user-side recomposition of server responses.
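The client-side flow described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `build_mixed_batch`, `filter_and_recompose`, `fake_server`, and the random-tag bookkeeping are all hypothetical names and choices, and the sub-prompts and pseudo-prompts are assumed to have been produced already by some local decomposer and generator.

```python
import random
import secrets
from dataclasses import dataclass


@dataclass
class TaggedPrompt:
    tag: str   # random identifier; reveals nothing about genuineness
    text: str


def build_mixed_batch(sub_prompts, pseudo_prompts, rng=None):
    """Mix genuine and pseudo prompts; return the batch and a client-side secret."""
    rng = rng or random.Random()
    genuine = [TaggedPrompt(secrets.token_hex(8), t) for t in sub_prompts]
    decoys = [TaggedPrompt(secrets.token_hex(8), t) for t in pseudo_prompts]
    batch = genuine + decoys
    rng.shuffle(batch)  # hide which positions are genuine
    genuine_tags = [p.tag for p in genuine]  # kept locally, in sub-prompt order
    return batch, genuine_tags


def filter_and_recompose(responses, genuine_tags, recompose):
    """Drop decoy responses, restore sub-prompt order, recompose locally."""
    kept = [responses[tag] for tag in genuine_tags]
    return recompose(kept)


def fake_server(batch):
    # Hypothetical stand-in for the black-box LLM service.
    return {p.tag: f"answer({p.text})" for p in batch}


batch, tags = build_mixed_batch(["Q1", "Q2"], ["D1", "D2", "D3"], random.Random(0))
responses = fake_server(batch)
final = filter_and_recompose(responses, tags, " | ".join)
print(final)
```

In this sketch the filtering step is trivial because the client remembers which tags it issued for genuine sub-prompts; the hard part, which the sketch hides behind the `recompose` callable, is merging the kept answers into one correct output.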
If this is right
- Black-box LLM services can be used privately without model changes or local model hosting.
- Prompt decomposition reduces the computational burden compared to full local open-source models.
- The (λ, μ, ρ)-privacy model quantifies the protection level of any mixed prompt group.
- Complexity analysis shows decomposition lowers the effective query cost for privacy.
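The review does not reproduce the paper's complexity analysis, so the following is only a counting intuition under an assumed model, not the paper's derivation. Assume a prompt contains `k` private attributes, each hidden among `m` candidate values (the genuine one plus decoys), and assume decomposition yields one sub-prompt per attribute. Both function names are hypothetical.

```python
def group_size_without_decomposition(k: int, m: int) -> int:
    # Assumption: every combination of candidate values is a distinct full prompt.
    return m ** k


def group_size_with_decomposition(k: int, m: int) -> int:
    # Assumption: each sub-prompt carries one attribute and gets its own m candidates.
    return k * m


for k, m in [(2, 4), (3, 4), (4, 4)]:
    print(k, m, group_size_without_decomposition(k, m), group_size_with_decomposition(k, m))
```

Under these assumptions the number of prompts sent grows exponentially in `k` without decomposition and linearly with it, which is one way a lower query cost could arise.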
Where Pith is reading between the lines
- The approach may extend to multi-turn conversations if recomposition logic can track context across exchanges.
- If pseudo-prompt generation can be made domain-specific, utility loss could drop further for specialized tasks.
- Adoption would require users to run a local client for decomposition and recomposition, shifting some compute from server to client.
Load-bearing premise
The user can reliably identify which server responses come from genuine sub-prompts and recombine them into the correct final output without large accuracy loss.
What would settle it
A test set of prompts on which an automated or human recomposer cannot recover the original answer at a rate comparable to direct LLM use, or on which an adversary distinguishes genuine from pseudo sub-prompts above the (λ, μ, ρ) threshold.
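The two failure conditions above can be turned into simple measurements. The sketch below is illustrative only: the function names are hypothetical, correctness flags are assumed to come from some external grader, and the adversary's advantage is measured against a majority-class guess because the review does not give the formal (λ, μ, ρ) threshold.

```python
def accuracy(flags):
    """Fraction of True values in a non-empty list of booleans."""
    return sum(flags) / len(flags)


def recomposition_gap(direct_correct, recomposed_correct):
    """Accuracy of direct LLM use minus accuracy after filter-and-recompose."""
    return accuracy(direct_correct) - accuracy(recomposed_correct)


def adversary_advantage(guesses, is_genuine):
    """Adversary accuracy at labelling prompts genuine/pseudo, minus the
    accuracy of always guessing the majority class."""
    hits = [g == t for g, t in zip(guesses, is_genuine)]
    n_genuine = sum(is_genuine)
    baseline = max(n_genuine, len(is_genuine) - n_genuine) / len(is_genuine)
    return accuracy(hits) - baseline


gap = recomposition_gap(
    direct_correct=[True, True, True, False],
    recomposed_correct=[True, True, False, False],
)
adv = adversary_advantage(
    guesses=[True, False, False, True],
    is_genuine=[True, False, False, False],
)
print(gap, adv)
```

A large positive gap would count against the utility claim, and a clearly positive advantage would count against the privacy claim; what counts as "large" is left to the evaluator.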
Original abstract
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers. This raises significant privacy concerns. In response, we introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by: (i) decomposing the original prompt into smaller sub-prompts, and (ii) generating pseudo-prompts alongside the genuine sub-prompts, which are then sent to the LLM. The server responses are later recomposed by the user to reconstruct the final output. This approach offers key advantages over previous LLM privacy protection methods: (i) it integrates seamlessly with existing black-box LLMs, and (ii) it delivers a significantly improved privacy-utility trade-off compared to existing text perturbation methods. We also develop a $(\lambda, \mu, \rho)$-privacy model to formulate the requirements for a privacy-preserving group of prompts and provide a complexity analysis to justify the role of prompt decomposition. Our empirical evaluation shows that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques, while also reducing memory consumption compared to open-source LLMs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ConfusionPrompt, a framework for private inference on black-box online LLMs. The method decomposes a user prompt into sub-prompts, mixes them with generated pseudo-prompts according to a (λ, μ, ρ)-privacy model, sends the batch to the LLM server, and relies on the user to recompose the returned responses into the final output. It claims seamless integration with existing LLMs, a significantly improved privacy-utility tradeoff versus text-perturbation baselines, lower memory use than local open-source models, and supports these claims with a complexity analysis plus empirical evaluation.
Significance. If the recomposition step can be shown to recover outputs reliably, the approach would provide a practical black-box privacy mechanism that avoids both the utility loss of perturbation methods and the memory/compute cost of local open-source LLMs. The explicit (λ, μ, ρ) privacy formulation and complexity analysis are positive elements that could be built upon.
major comments (2)
- [Framework description and recomposition step] The recomposition step is described only at high level in the framework overview. The central utility claim—that ConfusionPrompt delivers higher utility than perturbation baselines—rests on the unverified assumption that users can accurately isolate and recombine genuine sub-prompt responses from a batch of indistinguishable pseudo-prompt responses without substantial loss; no concrete filtering mechanism, algorithm, or experimental measurement of filtering fidelity (e.g., accuracy or semantic overlap under LLM nondeterminism) is supplied, rendering the reported gains unsupported.
- [Empirical evaluation] Empirical evaluation section: the abstract states that ConfusionPrompt “achieves significantly higher utility” than local open-source inference and perturbation techniques, yet no dataset details, baseline implementations, error bars, statistical tests, or exact recomposition procedure are referenced. Without these, the quantitative privacy-utility claims cannot be assessed and the comparison to perturbation methods remains unverifiable.
minor comments (1)
- [Privacy model] The (λ, μ, ρ) privacy model is introduced without an explicit equation or formal definition in the provided abstract; a numbered definition or boxed formulation would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the framework and evaluation. We address each major comment below and will revise the manuscript to provide the requested details.
-
Referee: [Framework description and recomposition step] The recomposition step is described only at high level in the framework overview. The central utility claim—that ConfusionPrompt delivers higher utility than perturbation baselines—rests on the unverified assumption that users can accurately isolate and recombine genuine sub-prompt responses from a batch of indistinguishable pseudo-prompt responses without substantial loss; no concrete filtering mechanism, algorithm, or experimental measurement of filtering fidelity (e.g., accuracy or semantic overlap under LLM nondeterminism) is supplied, rendering the reported gains unsupported.
Authors: We agree that the current manuscript describes the recomposition step at a high level and does not supply a concrete algorithm or fidelity measurements. In the revised version we will add a detailed filtering and recombination algorithm, including handling of nondeterminism, together with new experiments quantifying its accuracy and semantic overlap. revision: yes
-
Referee: [Empirical evaluation] Empirical evaluation section: the abstract states that ConfusionPrompt “achieves significantly higher utility” than local open-source inference and perturbation techniques, yet no dataset details, baseline implementations, error bars, statistical tests, or exact recomposition procedure are referenced. Without these, the quantitative privacy-utility claims cannot be assessed and the comparison to perturbation methods remains unverifiable.
Authors: We acknowledge that the empirical section lacks the listed details. The revised manuscript will include dataset descriptions, baseline implementations, error bars, statistical tests, and the precise recomposition procedure used in the experiments. revision: yes
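The fidelity measurements and error bars promised in these responses could take many forms. As one hedged possibility, the sketch below scores each recomposed answer against a reference with token-level F1 (a common QA overlap metric, not confirmed as the paper's choice) and puts a percentile bootstrap interval on the paired difference between two methods. The function names and the default resample count are illustrative assumptions.

```python
import random
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two whitespace-tokenised, lower-cased strings."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(scores_a - scores_b), paired by prompt."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    means = sorted(
        sum(rng.choice(diffs) for _ in range(n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


f1 = token_f1("paris france", "paris")
low, high = paired_bootstrap_ci([0.5] * 4, [0.25] * 4)
print(f1, low, high)
```

An interval that excludes zero would support a claim that one method scores higher than the other on the sampled prompts; repeating the scoring over several LLM samples per prompt would also expose the effect of nondeterminism.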
Circularity Check
No load-bearing circularity; privacy model and recomposition described independently of results
The paper defines a (λ, μ, ρ)-privacy model to formulate prompt-group requirements and provides a complexity analysis for decomposition. Both are presented as design choices, not as derived predictions that reduce to fitted inputs or self-citations. The recomposition step is described at a high level, with no equations that loop back to the privacy claims by construction. No self-citation chains or ansatzes are invoked to justify the core privacy-utility tradeoff. The result is a low circularity score: the method description is self-referential in the ordinary way, and nothing in it forces the central result.
Axiom & Free-Parameter Ledger
free parameters (1)
- λ, μ, ρ
axioms (1)
- domain assumption: the user can accurately recompose the final output from mixed sub-prompt responses
Reference graph
Works this paper leans on
-
[1]
Language models are few-shot learners
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020
-
[2]
Code Llama: Open Foundation Models for Code
Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Tal Remez, Jérémy Rapin, et al. Code llama: Open foundation models for code. arXiv preprint arXiv:2308.12950, 2023
-
[3]
Instruct2act: Mapping multi-modality instructions to robotic actions with large language model
Siyuan Huang, Zhengkai Jiang, Hao Dong, Yu Qiao, Peng Gao, and Hongsheng Li. Instruct2act: Mapping multi-modality instructions to robotic actions with large language model. arXiv preprint arXiv:2305.11176, 2023
-
[4]
Large language models in medicine
Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan, Laura Gutierrez, Ting Fang Tan, and Daniel Shu Wei Ting. Large language models in medicine. Nature medicine, 29(8):1930–1940, 2023
-
[5]
BloombergGPT: A Large Language Model for Finance
Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, and Gideon Mann. Bloomberggpt: A large language model for finance. arXiv preprint arXiv:2303.17564, 2023
-
[6]
Training language models to follow instructions with human feedback
Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022
-
[7]
OpenAI. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023
-
[8]
Llms can understand encrypted prompt: Towards privacy-computing friendly transformers
Xuanqi Liu and Zhuotao Liu. Llms can understand encrypted prompt: Towards privacy-computing friendly transformers. arXiv preprint arXiv:2305.18396, 2023
-
[9]
The-x: Privacy-preserving transformer inference with homomorphic encryption
Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, and Furu Wei. The-x: Privacy-preserving transformer inference with homomorphic encryption. arXiv preprint arXiv:2206.00216, 2022
-
[10]
Dp-forward: Fine-tuning and inference on language models with differential privacy in forward pass
Minxin Du, Xiang Yue, Sherman SM Chow, Tianhao Wang, Chenyu Huang, and Huan Sun. Dp-forward: Fine-tuning and inference on language models with differential privacy in forward pass. arXiv preprint arXiv:2309.06746, 2023
-
[11]
A survey on homomorphic encryption schemes: Theory and implementation
Abbas Acar, Hidayet Aksu, A Selcuk Uluagac, and Mauro Conti. A survey on homomorphic encryption schemes: Theory and implementation. ACM Computing Surveys (Csur), 51(4):1–35, 2018
-
[12]
Ronald Cramer, Ivan Bjerre Damgård, et al. Secure multiparty computation. Cambridge University Press, 2015
-
[13]
Lingjuan Lyu, Xuanli He, and Yitong Li. Differentially private representation for nlp: Formal guarantee and an empirical study on privacy and fairness. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2355–2365, 2020
-
[14]
Cynthia Dwork. Differential privacy. In International colloquium on automata, languages, and programming, pages 1–12. Springer, 2006
-
[15]
Natural language understanding with privacy-preserving bert
Chen Qu, Weize Kong, Liu Yang, Mingyang Zhang, Michael Bendersky, and Marc Najork. Natural language understanding with privacy-preserving bert. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 1488–1497, 2021
-
[16]
Split-and-denoise: Protect large language model inference with local differential privacy
Peihua Mai, Ran Yan, Zhe Huang, Youjia Yang, and Yan Pang. Split-and-denoise: Protect large language model inference with local differential privacy. In Forty-first International Conference on Machine Learning
-
[17]
Mohammad Malekzadeh and Fahim Kawsar. Salted inference: Enhancing privacy while maintaining efficiency of split inference in mobile computing. In Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications, pages 14–20, 2024
-
[18]
Trusted execution environment: What it is, and what it is not
Mohamed Sabt, Mohammed Achemlal, and Abdelmadjid Bouabdallah. Trusted execution environment: What it is, and what it is not. In 2015 IEEE Trustcom/BigDataSE/Ispa, volume 1, pages 57–64. IEEE, 2015
-
[19]
Named entity recognition and classification in historical documents: A survey
Maud Ehrmann, Ahmed Hamdi, Elvys Linhares Pontes, Matteo Romanello, and Antoine Doucet. Named entity recognition and classification in historical documents: A survey. ACM Computing Surveys, 56(2):1–47, 2023
-
[20]
Neural Architectures for Named Entity Recognition
Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360, 2016
-
[21]
Zhigang Kan, Linbo Qiao, Hao Yu, Liwen Peng, Yifu Gao, and Dongsheng Li. Protecting user privacy in remote conversational systems: A privacy-preserving framework based on text sanitization. arXiv preprint arXiv:2306.08223, 2023
-
[22]
Hide and seek (has): A lightweight framework for prompt privacy protection
Yu Chen, Tingxin Li, Huiming Liu, and Yang Yu. Hide and seek (has): A lightweight framework for prompt privacy protection. arXiv preprint arXiv:2309.03057, 2023
-
[23]
t-plausibility: Generalizing words to desensitize text
Balamurugan Anandan, Chris Clifton, Wei Jiang, Mummoorthy Murugesan, Pedro Pastrana-Camacho, and Luo Si. t-plausibility: Generalizing words to desensitize text. Trans. Data Priv., 5(3):505–534, 2012
-
[24]
Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy
Ran Gilad-Bachrach, Nathan Dowlin, Kim Laine, Kristin Lauter, Michael Naehrig, and John Wernsing. Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy. In International conference on machine learning, pages 201–210. PMLR, 2016
-
[25]
Iron: Private inference on transformers
Meng Hao, Hongwei Li, Hanxiao Chen, Pengzhi Xing, Guowen Xu, and Tianwei Zhang. Iron: Private inference on transformers. Advances in Neural Information Processing Systems, 35:15718–15731, 2022
-
[26]
Differentially private language models benefit from public pre-training
Gavin Kerrigan, Dylan Slack, and Jens Tuyls. Differentially private language models benefit from public pre-training. arXiv preprint arXiv:2009.05886, 2020
-
[27]
Differentially private fine-tuning of language models
Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, et al. Differentially private fine-tuning of language models. arXiv preprint arXiv:2110.06500, 2021
-
[28]
Flocks of stochastic parrots: Differentially private prompt learning for large language models, 2023
Haonan Duan, Adam Dziedzic, Nicolas Papernot, and Franziska Boenisch. Flocks of stochastic parrots: Differentially private prompt learning for large language models, 2023
-
[29]
Privacy-preserving prompt tuning for large language model services, 2023
Yansong Li, Zhixing Tan, and Yang Liu. Privacy-preserving prompt tuning for large language model services, 2023
-
[30]
Privacy-and utility-preserving textual analysis via calibrated multivariate perturbations
Oluwaseyi Feyisetan, Borja Balle, Thomas Drake, and Tom Diethe. Privacy-and utility-preserving textual analysis via calibrated multivariate perturbations. In Proceedings of the 13th international conference on web search and data mining, pages 178–186, 2020
-
[31]
Saiteja Utpala, Sara Hooker, and Pin Yu Chen. Locally differentially private document generation using zero shot prompting. arXiv preprint arXiv:2310.16111, 2023
-
[32]
The limits of word level differential privacy
Justus Mattern, Benjamin Weggenmann, and Florian Kerschbaum. The limits of word level differential privacy. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 867–881, 2022
-
[33]
Embellishing text search queries to protect user privacy
Hwee Hwa Pang, Xuhua Ding, and Xiaokui Xiao. Embellishing text search queries to protect user privacy. In Proceedings of the VLDB Endowment: 36th International Conference on Very Large Data Bases: Singapore, pages 13–17, 2010
-
[34]
Constructing plausible innocuous pseudo queries to protect user query intention
Zongda Wu, Jie Shi, Chenglang Lu, Enhong Chen, Guandong Xu, Guiling Li, Sihong Xie, and S Yu Philip. Constructing plausible innocuous pseudo queries to protect user query intention. Information Sciences, 325:215–226, 2015
- [35]
-
[36]
Did aristotle use a laptop? a question answering benchmark with implicit reasoning strategies
Mor Geva, Daniel Khashabi, Elad Segal, Tushar Khot, Dan Roth, and Jonathan Berant. Did aristotle use a laptop? a question answering benchmark with implicit reasoning strategies. Transactions of the Association for Computational Linguistics, 9:346–361, 2021
-
[37]
Musique: Multihop questions via single-hop question composition
Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, and Ashish Sabharwal. Musique: Multihop questions via single-hop question composition. Transactions of the Association for Computational Linguistics, 10:539–554, 2022
-
[38]
Bhargav Srinivasa-Desikan. Natural Language Processing and Computational Linguistics: A practical guide to text analysis with Python, Gensim, spaCy, and Keras. Packt Publishing Ltd, 2018
-
[39]
Flair: An easy-to-use framework for state-of-the-art nlp
Alan Akbik, Tanja Bergmann, Duncan Blythe, Kashif Rasul, Stefan Schweter, and Roland Vollgraf. Flair: An easy-to-use framework for state-of-the-art nlp. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics (demonstrations), pages 54–59, 2019
-
[40]
Gender classification using twitter text data
Pradeep Vashisth and Kevin Meehan. Gender classification using twitter text data. In 2020 31st Irish Signals and Systems Conference (ISSC), pages 1–6. IEEE, 2020
-
[41]
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W Cohen, Ruslan Salakhutdinov, and Christopher D Manning. Hotpotqa: A dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600, 2018
-
[42]
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023
-
[43]
Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality
Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E Gonzalez, et al. Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality. See https://vicuna.lmsys.org (accessed 14 April 2023), 2023
-
[44]
Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871–7880, 2020
-
[45]
Exploring the limits of transfer learning with a unified text-to-text transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research, 21(140):1–67, 2020
-
[46]
Scaling instruction-finetuned language models
Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, et al. Scaling instruction-finetuned language models. Journal of Machine Learning Research, 25(70):1–53, 2024
-
[47]
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers and Iryna Gurevych. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084, 2019
-
[48]
Gopichand Kanumolu, Lokesh Madasu, Pavan Baswani, Ananya Mukherjee, and Manish Shrivastava. Unsupervised approach to evaluate sentence-level fluency: Do we really need reference? arXiv preprint arXiv:2312.01500, 2023
-
[49]
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692, 2019
-
[50]
Squad: 100,000+ questions for machine comprehension of text
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. Squad: 100,000+ questions for machine comprehension of text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2383–2392, 2016
-
[51]
Drop: A reading comprehension benchmark requiring discrete reasoning over paragraphs
Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, and Matt Gardner. Drop: A reading comprehension benchmark requiring discrete reasoning over paragraphs. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers...
-
[52]
Bleu: a method for automatic evaluation of machine translation
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pages 311–318, 2002
Appendix A. Proof of Theorem 11 and 12. We begin with the proof for Theorem 11 as follows: Proof. For the prompt wi...
-
[53]
and DROP [51]. Appendix C.3. Semantic Similarity Model and Discriminator. The comparison data collection for the generator involves a local similarity evaluation model and discriminator. Similarity evaluation model: We adopt a finetuned version of MiniLM-6L model [47] to extract the embedding of each private attribute. The semantic relevance between a pa...
-
[54]
Inarticulate/non-fluent sentence
Score 1: Incomprehensible. Inarticulate/non-fluent sentence
-
[55]
Score 2: Low Quality. Partially fluent sentence: (a) only half of the sentence is fluent or (b) more than 1 missing words or (c) more than 1 misspelt words or (d) contains individual fluent word-groups with missing coherence between them
-
[56]
Score 3: Moderate. Sentence is predominantly fluent but contains either (a) misspelt word or (b) missing word or (c) multiple occurrence of a word
-
[57]
Perfectly fluent sentence without any syntactic or grammatical error
Score 4: Perfect. Perfectly fluent sentence without any syntactic or grammatical error. Strictly respond in the form of JSON with the following format: {"S1": the score, "S2": the score}. Sentences: {dictionary of sentences} On obtaining 4000 training and 700 validation samples, we finetune a Bert-base (110M parameters) to train a local discriminator. Ap...