Beyond Scaling: Agents Are Heading to the Edge

Chunlin Tian; Dongqi Cai; Nicholas D. Lane; Wanru Zhao

arxiv: 2605.18535 · v1 · pith:Y3CSLXRAnew · submitted 2026-05-18 · 💻 cs.LG · cs.MA

Pith reviewed 2026-05-20 11:47 UTC · model grok-4.3

classification 💻 cs.LG cs.MA

keywords agentic intelligence · edge computing · personal agents · local context · zero-latency execution · prefrontal turn · data geography paradox · interaction alignment

The pith

Personal agents must move to edge devices because their tasks couple tightly to local context that degrades in cloud transmission.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that the bottleneck for useful agentic intelligence has shifted from model scale to coordinated execution in personal systems. It argues this requires edge architectures, since agent tasks structurally depend on high-fidelity local context and zero-latency loops that cloud designs cannot preserve. Three shifts drive the argument: executive control now matters more than pre-training and must stay near the action environment; local data such as file hierarchies and sensor streams lose meaning when prepared for cloud transmission; and sustainable refinement comes only from real-time implicit preference signals generated through local interaction. A sympathetic reader would care because this reframes deployment away from centralized scaling and toward on-device systems that keep agents aligned with personal environments.

Core claim

Personal-agent architecture must move to the edge because the core properties of agentic intelligence tasks, particularly their structural coupling with high-fidelity local context and the need for zero-latency execution loops, do not sit well with cloud-centric designs. This is developed through the Prefrontal Turn where marginal capability gains come from framework-level executive control that requires physical proximity to the environment, the Data-Geography Paradox where local data degrades or disappears in transmission, and the interaction-alignment loop where real-time local signals provide the only sustainable source of refinement data.

What carries the argument

The three structural shifts—Prefrontal Turn, Data-Geography Paradox, and interaction-alignment loop—that together establish why cloud transmission severs agents from ground-truth context.

If this is right

Executive control frameworks must execute on-device to preserve cognitive alignment with the immediate environment.
Agents will rely on local data sources that cannot be fully replicated or sent upstream without loss of meaning.
Refinement data will come primarily from high-fidelity implicit signals collected through ongoing local user interactions.
Future personal agent deployments will prioritize hardware and software stacks that keep execution loops at the point of action.
Cloud-centric scaling approaches will yield diminishing returns for agentic tasks once local coupling becomes the dominant constraint.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Edge agents could enable new forms of continuous personal adaptation that cloud latency would make impossible.
Hardware designs focused on low-power local inference might become more central than further increases in model size.
Privacy and data sovereignty arguments for edge deployment follow directly but remain unstated in the paper.
The same logic may apply to other real-time embodied systems such as robotics or augmented reality interfaces.

Load-bearing premise

High-fidelity local context and real-time implicit preference signals from personal environments cannot be adequately approximated or transmitted without significant degradation in cloud-based systems.

What would settle it

A controlled comparison showing that cloud-based personal agents match or exceed edge-based ones on tasks involving local file hierarchies, real-time sensor streams, and transient OS states would falsify the central claim.
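
The comparison described above could be scored roughly as follows. This is a minimal sketch under stated assumptions, not the paper's protocol: the category labels, the `TaskResult` record, and the `cloud_matches_edge` helper are all hypothetical names introduced for illustration, and a real study would also need matched task sets and significance testing.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels for the three local-context types named in the text.
CATEGORIES = ("file_hierarchy", "sensor_stream", "os_state")


@dataclass(frozen=True)
class TaskResult:
    """One trial outcome (hypothetical record format)."""
    category: str    # one of CATEGORIES
    deployment: str  # "edge" or "cloud"
    success: bool


def success_rates(results):
    """Return {(category, deployment): fraction of successful trials}."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for r in results:
        key = (r.category, r.deployment)
        totals[key] += 1
        wins[key] += int(r.success)
    return {key: wins[key] / totals[key] for key in totals}


def cloud_matches_edge(results, tolerance=0.0):
    """True if the cloud success rate is at least the edge rate (minus an
    assumed tolerance) in every category, which is the outcome that would
    count against the central claim. Raises KeyError if any category lacks
    trials for either deployment."""
    rates = success_rates(results)
    return all(
        rates[(c, "cloud")] >= rates[(c, "edge")] - tolerance
        for c in CATEGORIES
    )
```

Requiring the cloud agent to hold up in every category, rather than on average, mirrors the wording of the test: the claim concerns all three kinds of local context, so a single category where cloud falls short leaves the claim standing.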

Figures

Figures reproduced from arXiv: 2605.18535 by Chunlin Tian, Dongqi Cai, Nicholas D. Lane, Wanru Zhao.

Original abstract

The bottleneck of useful agentic intelligence has shifted from compressing world knowledge into a single model to executing a coordinated system. This position paper argues that personal-agent architecture must move to the edge because the core properties of agentic intelligence tasks, particularly their structural coupling with high-fidelity local context and the need for zero-latency execution loops, do not sit well with cloud-centric designs. We develop this claim through three structural shifts. First, the Prefrontal Turn: the main marginal lever of capability has moved from pre-training scale to framework-level executive control. Such control must remain physically close to the environment of action if the agent is to preserve cognitive alignment. Second, the Data-Geography Paradox, the "dark matter" of agentic data (local file hierarchies, real-time sensor streams, and transient OS states) degrades, disappears, or loses meaning once prepared for cloud transmission, thereby cutting the agent off from ground-truth context. Third, the interaction-alignment loop, the only economically and ecologically sustainable source of agentic refinement data is the high-fidelity implicit preference signal produced through real-time local interaction. Third, the interaction-alignment loop, the only economically and ecologically sustainable source of agentic refinement data is the high-fidelity implicit preference signal produced through real-time local interaction. We conclude with falsifiable predictions for the next deployment cycle of personal agents.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Desk Editor's Note · private letter to a colleague

This position paper frames a shift to edge agents around three conceptual changes but rests its core case on an unquantified assumption about irreversible context loss.

The main thing to know is that the authors argue personal agents need to move to the edge because their tasks tie tightly to local context and real-time loops that cloud setups handle poorly. They organize this around the Prefrontal Turn, where control frameworks matter more than scale, the Data-Geography Paradox, where local data like file structures and sensor streams loses value in transmission, and the interaction-alignment loop, where useful refinement signals come only from on-device use. The predictions at the end give something concrete to watch for in the next round of deployments.

That framing is the clearest part of the work and connects agent properties to practical choices about latency and privacy in a direct way. It gives readers a way to think about why cloud models might hit limits even if they keep getting bigger. The paper does a decent job pulling together these points without overclaiming new experiments or proofs.

The soft spot sits in the Data-Geography Paradox. The argument treats local context as something that necessarily degrades or disappears when sent to the cloud, yet it supplies no measure of how much loss occurs, no bounds on when that loss becomes task-critical, and no counter-examples of cases where preprocessing or partial transmission works well enough. This leaves the rejection of cloud designs hanging on an assumption whose size is not specified. If the loss is often tolerable with current techniques, the structural claim weakens. The rest of the paper stays at the level of assertion rather than derivation or data, which fits a position piece but limits how far the conclusions can be pressed.

This is for people working on agent architectures who want to consider deployment tradeoffs and hardware implications. A reader looking for high-level framing and future directions will find it useful even without new measurements. It deserves a serious referee because the ideas are coherent enough to spark discussion, though any review would likely ask for tighter support on the central assumption about context. I would send it to review with the expectation that the authors add either quantitative estimates or clearer falsification tests around the paradox.

Referee Report

2 major / 1 minor

Summary. This position paper argues that personal-agent architectures must shift to the edge because agentic intelligence tasks are structurally coupled to high-fidelity local context and zero-latency execution loops, which are incompatible with cloud-centric designs. It develops the argument via three shifts: the Prefrontal Turn (capability now driven by local executive control rather than pre-training scale), the Data-Geography Paradox (local 'dark matter' data such as file hierarchies, sensor streams, and OS states degrades or loses meaning when prepared for cloud transmission), and the interaction-alignment loop (real-time local implicit preference signals as the only sustainable source of refinement data). The paper concludes with falsifiable predictions for the next deployment cycle.

Significance. If the structural claims hold, the paper could meaningfully redirect agentic-AI research and deployment toward edge-first designs that preserve local context and interaction signals. A clear strength is the explicit listing of falsifiable predictions, which supplies a concrete basis for empirical testing rather than purely conceptual assertion.

major comments (2)

[Data-Geography Paradox] Data-Geography Paradox section: the claim that high-fidelity local context necessarily degrades or loses ground-truth value upon preparation for cloud transmission is treated as a structural fact, yet no information-theoretic bounds, quantitative measures of fidelity loss, or counter-examples are supplied to show when degradation becomes task-critical versus tolerable. This unquantified magnitude is load-bearing for the rejection of cloud-centric designs.
[interaction-alignment loop] interaction-alignment loop section: the assertion that real-time local interaction is the 'only economically and ecologically sustainable source' of agentic refinement data is a strong, central claim that lacks comparative analysis against alternative data sources or any supporting quantification, leaving the sustainability argument unsupported.

minor comments (1)

[Abstract] Abstract: the sentence introducing the third shift is duplicated verbatim ('Third, the interaction-alignment loop...'), which is a typographical error that should be removed for readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our position paper. Below we respond to each major comment, clarifying the conceptual framing while indicating where revisions can strengthen the presentation.

Referee: [Data-Geography Paradox] Data-Geography Paradox section: the claim that high-fidelity local context necessarily degrades or loses ground-truth value upon preparation for cloud transmission is treated as a structural fact, yet no information-theoretic bounds, quantitative measures of fidelity loss, or counter-examples are supplied to show when degradation becomes task-critical versus tolerable. This unquantified magnitude is load-bearing for the rejection of cloud-centric designs.

Authors: We agree that illustrative examples would help readers assess the practical significance of the Data-Geography Paradox. Although the paper advances a structural rather than empirical argument, the revised manuscript will incorporate concrete scenarios (such as the irreversible loss of hierarchical file semantics or the meaning of transient sensor streams once serialized and compressed for transmission) to show when fidelity degradation crosses from tolerable to task-critical. We do not claim a universal information-theoretic bound, but these additions will make the magnitude of the issue more tangible without altering the position-paper scope.

revision: yes

Referee: [interaction-alignment loop] interaction-alignment loop section: the assertion that real-time local interaction is the 'only economically and ecologically sustainable source' of agentic refinement data is a strong, central claim that lacks comparative analysis against alternative data sources or any supporting quantification, leaving the sustainability argument unsupported.

Authors: The claim is offered as a structural observation grounded in the prohibitive bandwidth, latency, and energy costs of moving high-volume, high-fidelity interaction traces to the cloud. In revision we will add a short qualitative comparison with alternatives such as synthetic data augmentation and federated logging, explaining why each falls short for capturing personal, real-time preference signals at scale. As this remains a position paper, we do not introduce new quantitative sustainability metrics; however, the falsifiable predictions already listed in the conclusion supply a concrete route for subsequent empirical evaluation.

revision: partial

Circularity Check

0 steps flagged

No circularity: structural observations without derivations or self-referential reductions

The paper is a position paper that advances its core thesis through three descriptive structural shifts (Prefrontal Turn, Data-Geography Paradox, and interaction-alignment loop) framed as inherent properties of agentic tasks. These are presented as observational arguments about context fidelity, latency, and data sources rather than any mathematical derivation, equation, fitted parameter, or self-citation chain. No load-bearing step reduces a claimed prediction or result to its own inputs by construction, and the text supplies no equations, uniqueness theorems, or ansatzes that could create circularity. The conclusions are offered as falsifiable predictions for future agent deployments, keeping the reasoning self-contained and independent of any internal fitting or renaming.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about the irreplaceable nature of local context and interaction signals for agentic tasks, with no free parameters, invented entities, or formal axioms stated.

axioms (2)

domain assumption: Agentic intelligence tasks have structural coupling with high-fidelity local context that cannot be preserved under cloud transmission.
Invoked directly in the abstract as a core property that disqualifies cloud-centric designs.
domain assumption: Zero-latency execution loops are required for effective agentic intelligence.
Presented as a fundamental need that conflicts with cloud latency.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages · 11 internal anchors

[1]

SARATHI: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills

Amey Agrawal, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S Gulavani, and Ramachandran Ramjee. Sarathi: Efficient llm inference by piggybacking decodes with chunked prefills.arXiv preprint arXiv:2308.16369, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[2]

Llm in a flash: Efficient large language model inference with limited memory

Keivan Alizadeh, Seyed Iman Mirzadeh, Dmitry Belenko, S Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, and Mehrdad Farajtabar. Llm in a flash: Efficient large language model inference with limited memory. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12562–12584, 2024

work page 2024
[3]

Private Cloud Compute: A new frontier for AI privacy in the cloud

Apple Security Research. Private Cloud Compute: A new frontier for AI privacy in the cloud. https://security.apple.com/blog/private-cloud-compute/, 2024. Ac- cessed: 2024-06-10

work page 2024
[4]

Seven failure points when engineering a retrieval augmented generation system

Scott Barnett, Stefanus Kurniawan, Srikanth Thudumu, Zach Brannelly, and Mohamed Ab- delrazek. Seven failure points when engineering a retrieval augmented generation system. InProceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI, pages 194–199, 2024

work page 2024
[5]

Small Language Models are the Future of Agentic AI

Peter Belcak, Greg Heinrich, Shizhe Diao, Yonggan Fu, Xin Dong, Saurav Muralidharan, Yingyan Celine Lin, and Pavlo Molchanov. Small language models are the future of agentic ai. arXiv preprint arXiv:2506.02153, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[6]

Distributed inference and fine-tuning of large language models over the internet

Alexander Borzunov, Max Ryabinin, Artem Chumachenko, Dmitry Baranchuk, Tim Dettmers, Younes Belkada, Pavel Samygin, and Colin Raffel. Distributed inference and fine-tuning of large language models over the internet. InAdvances in Neural Information Processing Systems, 2023

work page 2023
[7]

Citizens’ data privacy in china: The state of the art of the personal information protection law (pipl).Smart Cities, 5(3):1129–1150, 2022

Igor Calzada. Citizens’ data privacy in china: The state of the art of the personal information protection law (pipl).Smart Cities, 5(3):1129–1150, 2022

work page 2022
[8]

Extracting training data from large language models

Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-V oss, Kather- ine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, et al. Extracting training data from large language models. In30th USENIX security symposium (USENIX Security 21), pages 2633–2650, 2021

work page 2021
[9]

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

Lingjiao Chen, Matei Zaharia, and James Zou. Frugalgpt: How to use large language models while reducing cost and improving performance.arXiv preprint arXiv:2305.05176, 2023. 10

work page internal anchor Pith review Pith/arXiv arXiv 2023
[10]

Octopus v2: On-device language model for super agent.arXiv preprint arXiv:2404.01744, 2024

Wei Chen and Zhiyuan Li. Octopus v2: On-device language model for super agent.arXiv preprint arXiv:2404.01744, 2024

work page arXiv 2024
[11]

Using autonomy flight software to improve science return on earth observing one.Journal of Aerospace Computing, Information, and Communication, 2(4):196–216, 2005

Steve Chien, Rob Sherwood, Daniel Tran, Benjamin Cichy, Gregg Rabideau, Rebecca Castano, Ashley Davis, Dan Mandl, Stuart Frye, Bruce Trout, et al. Using autonomy flight software to improve science return on earth observing one.Journal of Aerospace Computing, Information, and Communication, 2(4):196–216, 2005

work page 2005
[12]

Orbital edge computing: Nanosatellite constellations as a new class of computer system

Bradley Denby and Brandon Lucia. Orbital edge computing: Nanosatellite constellations as a new class of computer system. InProceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 939–954, 2020

work page 2020
[13]

Intelli- gence and the frontal lobe: The organization of goal-directed behavior.Cognitive psychology, 30(3):257–303, 1996

John Duncan, Hazel Emslie, Phyllis Williams, Roger Johnson, and Charles Freer. Intelli- gence and the frontal lobe: The organization of goal-directed behavior.Cognitive psychology, 30(3):257–303, 1996

work page 1996
[14]

Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach

Alireza Fallah, Aryan Mokhtari, and Asuman Ozdaglar. Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach. InAdvances in Neural Information Processing Systems, 2020

work page 2020
[15]

Model swarms: Col- laborative search to adapt llm experts via swarm intelligence.arXiv preprint arXiv:2410.11163, 2024

Shangbin Feng, Zifeng Wang, Yike Wang, Sayna Ebrahimi, Hamid Palangi, Lesly Miculicich, Achin Kulshrestha, Nathalie Rauschmayr, Yejin Choi, Yulia Tsvetkov, et al. Model swarms: Col- laborative search to adapt llm experts via swarm intelligence.arXiv preprint arXiv:2410.11163, 2024

work page arXiv 2024
[16]

Openai creates a scale to track progress toward human-level AI.Bloomberg, July 2024

Shirin Ghaffary. Openai creates a scale to track progress toward human-level AI.Bloomberg, July 2024

work page 2024
[17]

Cellular basis of working memory.Neuron, 14(3):477–485, 1995

Patricia S Goldman-Rakic. Cellular basis of working memory.Neuron, 14(3):477–485, 1995

work page 1995
[18]

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang, Shirong Ma, Xiao Bi, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning.arXiv preprint arXiv:2501.12948, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[19]

Thomas E Hazy, Michael J Frank, and Randall C O’reilly. Towards an executive without a homunculus: computational models of the prefrontal cortex/basal ganglia system.Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1485):1601–1613, 2007

work page 2007
[20]

Scaling Laws for Autoregressive Generative Modeling

Tom Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B Brown, Prafulla Dhariwal, Scott Gray, et al. Scaling laws for autoregressive generative modeling.arXiv preprint arXiv:2010.14701, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010
[21]

Training Compute-Optimal Large Language Models

Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, DDL Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, et al. Training compute-optimal large language models.arXiv preprint arXiv:2203.15556, 10, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[22]

Metagpt: Meta programming for a multi-agent collaborative framework

Sirui Hong, Mingchen Zhuge, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, et al. Metagpt: Meta programming for a multi-agent collaborative framework. InThe twelfth international conference on learning representations, 2023

work page 2023
[23]

Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding

Kyle Hundman, Valentino Constantinou, Christopher Laporte, Ian Colwell, and Tom Soderstrom. Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, pages 387–395, 2018

work page 2018
[24]

Scaling Laws for Neural Language Models

Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models.arXiv preprint arXiv:2001.08361, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2001
[25]

Evaluating language-model agents on realistic autonomous tasks

Megan Kinniment, Lucas Jun Koba Sato, Haoxing Du, Brian Goodrich, Max Hasin, Lawrence Chan, Luke Harold Miles, Tao R Lin, Hjalmar Wijk, Joel Burget, et al. Evaluating language- model agents on realistic autonomous tasks.arXiv preprint arXiv:2312.11671, 2023. 11

work page arXiv 2023
[26]

A path towards autonomous machine intelligence version 0.9

Yann LeCun et al. A path towards autonomous machine intelligence version 0.9. 2, 2022-06-27. Open Review, 62(1):1–62, 2022

work page 2022
[27]

Lenovo unveils Lenovo and Motorola Qira

Lenovo Group. Lenovo unveils Lenovo and Motorola Qira. https://news.lenovo.com/ pressroom/press-releases/lenovo-unveils-lenovo-and-motorola-qira/, 2026

work page 2026
[28]

Retrieval-augmented generation for knowledge-intensive nlp tasks.Advances in neural information processing systems, 33:9459–9474, 2020

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge-intensive nlp tasks.Advances in neural information processing systems, 33:9459–9474, 2020

work page 2020
[29]

A comprehensive survey on large language model compression for artificial intelligence applications in edge systems.IEEE Internet of Things Journal, 2026

Yuzhu Liang, Changfu Xu, Yaxin Mei, Haodong Zou, Jianxiong Guo, Xinggang Fan, Tian Wang, and Haiyang Huang. A comprehensive survey on large language model compression for artificial intelligence applications in edge systems.IEEE Internet of Things Journal, 2026

work page 2026
[30]

AgentBench: Evaluating LLMs as Agents

Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, et al. Agentbench: Evaluating llms as agents.arXiv preprint arXiv:2308.03688, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[31]

Mobilellm: Optimizing sub-billion parameter language models for on-device use cases

Zechun Liu, Changsheng Zhao, Forrest Iandola, Chen Lai, Yuandong Tian, Igor Fedorov, Yunyang Xiong, Ernie Chang, Yangyang Shi, Raghuraman Krishnamoorthi, et al. Mobilellm: Optimizing sub-billion parameter language models for on-device use cases. InForty-first International Conference on Machine Learning, 2024

work page 2024
[32]

Communication-efficient learning of deep networks from decentralized data

Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. InArtificial intelligence and statistics, pages 1273–1282. Pmlr, 2017

work page 2017
[33]

Gaia: a benchmark for general ai assistants

Grégoire Mialon, Clémentine Fourrier, Thomas Wolf, Yann LeCun, and Thomas Scialom. Gaia: a benchmark for general ai assistants. InThe Twelfth International Conference on Learning Representations, 2023

work page 2023
[34]

Distributed mixture-of-agents for edge inference with large language models.arXiv preprint arXiv:2412.21200, 2024

Purbesh Mitra, Priyanka Kaswan, and Sennur Ulukus. Distributed mixture-of-agents for edge inference with large language models.arXiv preprint arXiv:2412.21200, 2024

work page arXiv 2024
[35]

Kimi agent swarm: Open agentic intelligence, 2026

Moonshot AI. Kimi agent swarm: Open agentic intelligence, 2026

work page 2026
[36]

Position: Levels of agi for operationalizing progress on the path to agi

Meredith Ringel Morris, Jascha Sohl-Dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, and Shane Legg. Position: Levels of agi for operationalizing progress on the path to agi. InForty-first International Conference on Machine Learning, 2024

work page 2024
[37]

Analysis of india’s digital personal data protection act, 2023.International Journal of Law and Management, 67(5):543–553, 2025

Paarth Naithani. Analysis of india’s digital personal data protection act, 2023.International Journal of Law and Management, 67(5):543–553, 2025

work page 2023
[38]

A comprehensive overview of large language models.ACM Transactions on Intelligent Systems and Technology, 16(5):1–72, 2025

Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, and Ajmal Mian. A comprehensive overview of large language models.ACM Transactions on Intelligent Systems and Technology, 16(5):1–72, 2025

work page 2025
[39]

NimbleEdge: Enabling real-time, on-device personalization and machine learning

NimbleEdge. NimbleEdge: Enabling real-time, on-device personalization and machine learning. https://www.nimbleedge.com/, 2023

work page 2023
[40]

Hermes Agent: The open-source autonomous agent with a closed learning loop

Nous Research. Hermes Agent: The open-source autonomous agent with a closed learning loop. https://github.com/nousresearch/hermes-agent, 2026

work page 2026
[41]

The final frontier of space computing has arrived, 2026

NVIDIA Corporation. The final frontier of space computing has arrived, 2026

work page 2026
[42]

Training language models to follow instructions with human feedback.Advances in neural information processing systems, 35:27730–27744, 2022

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback.Advances in neural information processing systems, 35:27730–27744, 2022

work page 2022
[43]

Snapdragon 8 Elite Gen 5: Redefining on-device AI compute, 2026

Qualcomm Technologies, Inc. Snapdragon 8 Elite Gen 5: Redefining on-device AI compute, 2026. 12

work page 2026
[44]

Sensitivity of revised diagnostic criteria for the behavioural variant of frontotemporal dementia

Katya Rascovsky, John R Hodges, David Knopman, Mario F Mendez, Joel H Kramer, John Neuhaus, John C Van Swieten, Harro Seelaar, Elise GP Dopper, Chiadi U Onyike, et al. Sensitivity of revised diagnostic criteria for the behavioural variant of frontotemporal dementia. Brain, 134(9):2456–2477, 2011

work page 2011
[45]

An- droidinthewild: A large-scale dataset for android device control.Advances in Neural Information Processing Systems, 36:59708–59728, 2023

Christopher Rawles, Alice Li, Daniel Rodriguez, Oriana Riva, and Timothy Lillicrap. An- droidinthewild: A large-scale dataset for android device control.Advances in Neural Information Processing Systems, 36:59708–59728, 2023

work page 2023
[46]

Regulation (eu) 2016/679 of the european parliament and of the council

Protection Regulation. Regulation (eu) 2016/679 of the european parliament and of the council. Regulation (eu), 679(2016):10–3, 2016

work page 2016
[47]

LaMP: When large language models meet personalization

Alireza Salemi et al. LaMP: When large language models meet personalization. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

work page 2024
[48]

Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face.Advances in Neural Information Processing Systems, 36:38154–38180, 2023

Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, and Yueting Zhuang. Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face.Advances in Neural Information Processing Systems, 36:38154–38180, 2023

work page 2023
[49]

S-LoRA: Serving thousands of concurrent LoRA adapters

Ying Sheng et al. S-LoRA: Serving thousands of concurrent LoRA adapters. InProceedings of the 31st ACM Symposium on Operating Systems Principles (SOSP), 2023

work page 2023
[50]

Edge computing: Vision and challenges.IEEE internet of things journal, 3(5):637–646, 2016

Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. Edge computing: Vision and challenges.IEEE internet of things journal, 3(5):637–646, 2016

work page 2016
[51]

Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems, 36:8634–8652, 2023.
[52]

Volker Strobel, Marco Dorigo, and Mario Fritz. LLM2Swarm: Robot swarms that responsively reason, plan, and collaborate through LLMs. arXiv preprint arXiv:2410.11387, 2024.
[53]

Theodore Sumers, Shunyu Yao, Karthik R Narasimhan, and Thomas L Griffiths. Cognitive architectures for language agents. Transactions on Machine Learning Research, 2023.
[54]

Sara M Szczepanski and Robert T Knight. Insights into human behavior from lesions to the prefrontal cortex. Neuron, 83(5):1002–1018, 2014.
[55]

Karthik Valmeekam, Matthew Marquez, Sarath Sreedharan, and Subbarao Kambhampati. On the planning abilities of large language models-a critical investigation. Advances in Neural Information Processing Systems, 36:75993–76005, 2023.
[56]

Pablo Villalobos, Jaime Sevilla, Lennart Heim, Tamay Besiroglu, Marius Hobbhahn, and Anson Ho. Will we run out of data? An analysis of the projected depletion of human-generated text. arXiv preprint arXiv:2211.04325, 2022.
[57]

Tu Vu, Mohit Iyyer, Xuezhi Wang, Noah Constant, Jerry Wei, Jason Wei, Chris Tar, Yun-Hsuan Sung, Denny Zhou, Quoc Le, et al. Freshllms: Refreshing large language models with search engine augmentation. In Findings of the Association for Computational Linguistics: ACL 2024, pages 13697–13720, 2024.
[58]

Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, and Jitao Sang. Mobile-agent: Autonomous multi-modal mobile device agent with visual perception. arXiv preprint arXiv:2401.16158, 2024.
[59]

Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. A survey on large language model based autonomous agents. Frontiers of Computer Science, 18(6):186345, 2024.
[60]

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, et al. Emergent abilities of large language models. arXiv preprint arXiv:2206.07682, 2022.
[61]

David Gray Widder, Sarah Myers West, and Meredith Whittaker. Open (for business): Big tech, concentrated power, and the political economy of open AI. SSRN 4543807, 2023.
[62]

Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, and Lingpeng Kong. Os-copilot: Towards generalist computer agents with self-improvement. arXiv preprint arXiv:2402.07456, 2024.
[63]

Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, et al. The rise and potential of large language model based agents: A survey. Science China Information Sciences, 68(2):121101, 2025.
[64]

LLM-Core Xiaomi. Mimo: Unlocking the reasoning potential of language model – from pretraining to posttraining, 2025.
[65]

Renchao Xie, Qinqin Tang, Qiuning Wang, Xu Liu, F Richard Yu, and Tao Huang. Satellite-terrestrial integrated edge computing networks: Architecture, challenges, and open issues. IEEE Network, 34(3):224–231, 2020.
[66]

Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh J Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, et al. Osworld: Benchmarking multimodal agents for open-ended tasks in real computer environments. Advances in Neural Information Processing Systems, 37:52040–52094, 2024.
[67]

Jiajun Xu, Zhiyuan Li, Wei Chen, Qun Wang, Xin Gao, Qi Cai, and Ziyuan Ling. On-device language models: A comprehensive review. arXiv preprint arXiv:2409.00088, 2024.
[68]

Zelai Xu, Zhexuan Xu, Ruize Zhang, Chunyang Zhu, Shi Yu, Weilin Liu, Quanlu Zhang, Wenbo Ding, Chao Yu, and Yu Wang. Wideseek-r1: Exploring width scaling for broad information seeking via multi-agent reinforcement learning. arXiv preprint arXiv:2602.04634, 2026.
[69]

Aohan Zeng, Mingdao Liu, Rui Lu, Bowen Wang, Xiao Liu, Yuxiao Dong, and Jie Tang. Agenttuning: Enabling generalized agent abilities for llms. In Findings of the Association for Computational Linguistics: ACL 2024, pages 3053–3077, 2024.
[70]

Junchen Zhao, Yurun Song, Simeng Liu, Ian G Harris, and Sangeetha Abdu Jyothi. Lingualinked: Distributed large language model inference on mobile devices. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 160–171, 2024.
[71]

Shuyan Zhou, Frank F Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, et al. Webarena: A realistic web environment for building autonomous agents. arXiv preprint arXiv:2307.13854, 2023.
[72]

Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang. Edge intelligence: Paving the last mile of artificial intelligence with edge computing. Proceedings of the IEEE, 107(8):1738–1762, 2019.
[73]

Weiming Zhuang, Chen Chen, Jingtao Li, Chaochao Chen, Yaochu Jin, and Lingjuan Lyu. When foundation model meets federated learning: Motivations, challenges, and future directions, 2025.
[74]

Mingchen Zhuge, Wenyi Wang, Louis Kirsch, Francesco Faccio, Dmitrii Khizbullin, and Jürgen Schmidhuber. Language agents as optimizable graphs. In International Conference on Machine Learning, 2024.
