{"total":11,"items":[{"citing_arxiv_id":"2605.27696","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Structure over Pixels: Learning Variable-Length Visual Programs","primary_cat":"cs.CV","submitted_at":"2026-05-26T21:16:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"STROP learns variable-length discrete visual programs for images by training a length head against frozen DINOv3 features in a four-phase curriculum while bypassing pixel reconstruction.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.25166","ref_index":17,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Training Transformers as a Universal Computer","primary_cat":"cs.AI","submitted_at":"2026-04-28T03:15:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A transformer trained on random meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence it can act as a universal computer.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18907","ref_index":109,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Gradient-Based Program Synthesis with Neurally Interpreted Languages","primary_cat":"cs.LG","submitted_at":"2026-04-20T23:14:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prior methods on combinatorial generalization 
tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.06425","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Neural Computers","primary_cat":"cs.LG","submitted_at":"2026-04-07T20:01:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Neural Computers are introduced as a new machine form where computation, memory, and I/O are unified in a learned runtime state, with initial video-model experiments showing acquisition of basic interface primitives from traces.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"All full-resolution pages are in Appendix E; below we keep clickable thumbnails at the original location for quick navigation. 11 CLIGen Visualization Thumbnails Click any thumbnail to jump to its full-resolution page in Appendix CLIGen (General) Visualizations The terminal displays a series of ANSI escape code formatted texts with changing background and foreground colors, executing commands like `\\\\u001b[48;2;255;128;128;38;2;0;0;0m` which set the background to a shade of pink and text to black, and printing numbered lists with colors. The output includes specific numbers, such as \\\"1\\\", \\\"5\\\", \\\"7\\\", and \\\"9\\\", in different colors, creating a visually dynamic and colorful display, but the exact username, hostname, and path are not specified in the provided terminal session content. The user types the command `CREATE TABLE posts(ID INTEGER)`, with the terminal displaying the command in a dark background with colored syntax highlighting, including green and yellow text, and the cursor moving character-by-character as the user types, with some corrections and backspacing along the way. The output shows the command being executed, with keywords like `CREATE` and `TABLE` in distinct colors, and the filename `posts` appearing in the command line. 
Samples A At the `root@localhost:~#` prompt, the user types the `date` command, which displays the current date and time in a plain text format as \\\"2021.10.11.22:47:43KST\\\", then begins typing the `cat` command. The terminal displaying progress bars, package names like `pillow`, `notebook`, and `tzlocal`, and version changes in green and red text. The output shows downloading and installing statuses, including percentages, for packages like `smmap`, `tomli`, and `protobuf`, with the terminal scrolling through the output rapidly. Samples B At the unspecified username@hostname prompt, the terminal displays a partition editor with a disk image file named \\\"sd.img\\\" (128MiB) and the user interacts with it, creating a new Linux partition from free space, with key output content showing partition details in a table format, including \\\"sd.img1\\\" and \\\"sd.img2\\\" with their respective sizes and types, and a new partition \\\"sd.img3\\\" with 55M size and Linux type (83). The terminal shows a mix of black"},{"citing_arxiv_id":"2211.14275","ref_index":33,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Solving math word problems with process- and outcome-based feedback","primary_cat":"cs.LG","submitted_at":"2022-11-25T18:19:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"On GSM8K, outcome-based supervision achieves similar final-answer error rates to process-based with less labeling, but process-based or learned reward models are needed to reach 3.4% reasoning error among correct solutions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2201.02177","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets","primary_cat":"cs.LG","submitted_at":"2022-01-06T18:43:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Neural networks exhibit grokking on 
small algorithmic datasets, achieving perfect generalization well after overfitting.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2112.00114","ref_index":15,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Show Your Work: Scratchpads for Intermediate Computation with Language Models","primary_cat":"cs.LG","submitted_at":"2021-11-30T21:32:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Training language models to generate intermediate computation steps on a scratchpad enables them to perform multi-step tasks such as long addition and arbitrary program execution that they otherwise fail at.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2104.13478","ref_index":67,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges","primary_cat":"cs.LG","submitted_at":"2021-04-27T21:09:51+00:00","verdict":"ACCEPT","verdict_confidence":"HIGH","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Geometric deep learning provides a unified mathematical framework based on grids, groups, graphs, geodesics, and gauges to explain and extend neural network architectures by incorporating physical regularities.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2102.04664","ref_index":67,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation","primary_cat":"cs.SE","submitted_at":"2021-02-09T06:16:25+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CodeXGLUE supplies a standardized collection of 10 code-related tasks, 
14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1907.09273","ref_index":70,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Why Build an Assistant in Minecraft?","primary_cat":"cs.AI","submitted_at":"2019-07-22T12:32:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A rationale is presented for developing an assistant in Minecraft to advance natural language understanding and dialogue learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1603.08983","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Adaptive Computation Time for Recurrent Neural Networks","primary_cat":"cs.NE","submitted_at":"2016-03-29T22:09:00+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":8.0,"formal_verification":"none","one_line_summary":"ACT lets RNNs dynamically adapt computation depth per input via a differentiable halting unit, yielding large gains on synthetic tasks and structural insights on language data.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[23] B. A. Olshausen et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607-609, 1996. [24] B. Recht, C. Re, S. Wright, and F. Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems, pages 693-701, 2011. [25] S. Reed and N. de Freitas. Neural programmer-interpreters. Technical Report arXiv:1511.06279, 2015. [26] J. Schmidhuber. Self-delimiting neural networks. arXiv preprint arXiv:1210.0118, 2012. [27] J. Schmidhuber and S. Hochreiter. 
Guessing can outperform many long time lag algorithms. Technical report, 1996. [28] N. Srivastava, G. Hinton, A. Krizhevsky, I."}],"limit":50,"offset":0}