{"work":{"id":"07ef7360-d385-4033-83f7-8384a6325204","openalex_id":null,"doi":null,"arxiv_id":"1711.05101","raw_key":null,"title":"Decoupled Weight Decay Regularization","authors":null,"authors_text":"Ilya Loshchilov and Frank Hutter","year":2017,"venue":"cs.LG","abstract":"L$_2$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is \\emph{not} the case for adaptive gradient algorithms, such as Adam. While common implementations of these algorithms employ L$_2$ regularization (often calling it \"weight decay\" in what may be misleading due to the inequivalence we expose), we propose a simple modification to recover the original formulation of weight decay regularization by \\emph{decoupling} the weight decay from the optimization steps taken w.r.t. the loss function. We provide empirical evidence that our proposed modification (i) decouples the optimal choice of weight decay factor from the setting of the learning rate for both standard SGD and Adam and (ii) substantially improves Adam's generalization performance, allowing it to compete with SGD with momentum on image classification datasets (on which it was previously typically outperformed by the latter). Our proposed decoupled weight decay has already been adopted by many researchers, and the community has implemented it in TensorFlow and PyTorch; the complete source code for our experiments is available at https://github.com/loshchil/AdamW-and-SGDW","external_url":"https://arxiv.org/abs/1711.05101","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-07-02T08:46:49.002839+00:00","pith_arxiv_id":"1711.05101","created_at":"2026-05-08T19:09:02.472902+00:00","updated_at":"2026-07-02T08:46:49.002839+00:00","title_quality_ok":true,"display_title":"Decoupled Weight Decay Regularization","render_title":"Decoupled Weight Decay Regularization"},"hub":{"state":{"work_id":"07ef7360-d385-4033-83f7-8384a6325204","tier":"mega_hub","tier_reason":"1,000+ Pith inbound or 100,000+ external citations","pith_inbound_count":1039,"external_cited_by_count":null,"distinct_field_count":56,"first_pith_cited_at":"2019-07-21T17:08:50+00:00","last_pith_cited_at":"2026-06-30T17:25:45+00:00","author_build_status":"needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"needed","reader_status":"needed","recognition_status":"needed","updated_at":"2026-07-02T09:42:23.136111+00:00","tier_text":"mega_hub"},"tier":"mega_hub","role_counts":[{"context_role":"method","n":102},{"context_role":"background","n":64},{"context_role":"dataset","n":5},{"context_role":"baseline","n":3},{"context_role":"other","n":3}],"polarity_counts":[{"context_polarity":"use_method","n":102},{"context_polarity":"background","n":57},{"context_polarity":"unclear","n":9},{"context_polarity":"use_dataset","n":5},{"context_polarity":"baseline","n":3},{"context_polarity":"support","n":1}],"runs":{"ask_index":{"job_type":"ask_index","status":"succeeded","result":{"title":"Decoupled Weight Decay Regularization","claims":[{"claim_text":"L$_2$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is \\emph{not} the case for adaptive gradient algorithms, such as Adam. While common implementations of these algorithms employ L$_2$ regularization (often calling it \"weight decay\" in what may be misleading due to the inequivalence we expose), we propose a simple modification to recover the original formulation of weight decay regularization by \\emph{decoupling} the weight decay from the optimization steps taken w.r.t. the","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Decoupled Weight Decay Regularization because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-13T18:03:36.731653+00:00"},"author_expand":{"job_type":"author_expand","status":"succeeded","result":{"authors_linked":[{"id":"04d63bfc-63bb-4ef5-aa6e-77f634e82f86","orcid":null,"display_name":"Ilya Loshchilov and Frank Hutter"}]},"error":null,"updated_at":"2026-05-13T18:03:36.729295+00:00"},"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-13T18:03:36.632513+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"Adam: A Method for Stochastic Optimization","work_id":"1910796d-9b52-4683-bf5c-de9632c1028b","shared_citers":61},{"title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale","work_id":"e96730e3-129b-4db6-b981-15ab7932e297","shared_citers":55},{"title":"GPT-4 Technical Report","work_id":"b928e041-6991-4c08-8c81-0359e4097c7b","shared_citers":38},{"title":"DINOv2: Learning Robust Visual Features without Supervision","work_id":"26b304e5-b54a-4f26-be7e-83299eca52e4","shared_citers":33},{"title":"SGDR: Stochastic Gradient Descent with Warm Restarts","work_id":"ad476478-c5ea-495b-a454-168c504bbfcc","shared_citers":33},{"title":"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning","work_id":"e6b75ad5-2877-4168-97c8-710407094d20","shared_citers":27},{"title":"Flow Matching for Generative Modeling","work_id":"6edb71c4-5d64-40af-a394-9757ea051a36","shared_citers":27},{"title":"LLaMA: Open and Efficient Foundation Language Models","work_id":"c018fc23-6f3f-4035-9d02-28a2173b2b9d","shared_citers":27},{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":27},{"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","shared_citers":27},{"title":"DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models","work_id":"c5006563-f3ec-438a-9e35-b7b484f34828","shared_citers":26},{"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","shared_citers":24},{"title":"Classifier-Free Diffusion Guidance","work_id":"acf2c588-c088-4a6c-938e-150ad7c666d7","shared_citers":24},{"title":"Gaussian Error Linear Units (GELUs)","work_id":"0466fd22-03a1-4a61-af0a-a900e77bb023","shared_citers":24},{"title":"Scaling Laws for Neural Language Models","work_id":"b7dd8749-9c45-4977-ab9b-64478dce1ae8","shared_citers":23},{"title":"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge","work_id":"28ea1282-d657-4c61-a83c-f1249be6d6b1","shared_citers":23},{"title":"Auto-Encoding Variational Bayes","work_id":"97d95295-30e1-42b4-bbf6-85f0fa4edb44","shared_citers":22},{"title":"Proximal Policy Optimization Algorithms","work_id":"240c67fe-d14d-4520-91c1-38a4e272ca19","shared_citers":22},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":22},{"title":"Qwen3-VL Technical Report","work_id":"1fe243aa-e3c0-4da6-b391-4cbcfc88d5c0","shared_citers":21},{"title":"Score-Based Generative Modeling through Stochastic Differential Equations","work_id":"d9110e53-a5d4-4794-a4c5-a575e91c31ad","shared_citers":21},{"title":"Wan: Open and Advanced Large-Scale Video Generative Models","work_id":"ad3ebc3b-4224-46c9-b61d-bcf135da0a7c","shared_citers":21},{"title":"Denoising Diffusion Implicit Models","work_id":"8fa2128b-d18c-405c-ac92-0e669cf89ac0","shared_citers":20},{"title":"DINOv3","work_id":"c8b07deb-8fe7-4e18-9620-f3569d3529ce","shared_citers":20}],"time_series":[{"n":1,"year":2019},{"n":2,"year":2020},{"n":4,"year":2021},{"n":9,"year":2022},{"n":7,"year":2023},{"n":9,"year":2024},{"n":11,"year":2025},{"n":381,"year":2026}]},"error":null,"updated_at":"2026-05-13T17:25:55.545393+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"fixed":1,"items":[{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-13T18:03:36.019663+00:00"},"reader_index":{"job_type":"reader_index","status":"succeeded","result":{"note":"annotated reader requires full-text/OA fetch; shell is wired for mega hubs","status":"reader queued"},"error":null,"updated_at":"2026-07-01T22:01:51.820676+00:00"},"recognition_alignment":{"job_type":"recognition_alignment","status":"succeeded","result":{"modules":["IndisputableMonolith.Gravity.PropagationSpeed","IndisputableMonolith.Foundation.PreTemporalForcingOrder","IndisputableMonolith.Physics.LightConeCausalityFromRS","IndisputableMonolith.Cosmology.EtaBPrefactorDerivation","IndisputableMonolith.Physics.MaxwellEquationsFromRS","IndisputableMonolith.Gravity.BlackHoleEntropyFromLedger","IndisputableMonolith.Thermodynamics.FermiDirac","IndisputableMonolith.Gravity.BlackHoleHorizonStates"],"query_chars":1302},"error":null,"updated_at":"2026-07-01T22:01:51.819683+00:00"},"role_polarity":{"job_type":"role_polarity","status":"succeeded","result":{"title":"Decoupled Weight Decay Regularization","claims":[{"claim_text":"L$_2$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is \\emph{not} the case for adaptive gradient algorithms, such as Adam. While common implementations of these algorithms employ L$_2$ regularization (often calling it \"weight decay\" in what may be misleading due to the inequivalence we expose), we propose a simple modification to recover the original formulation of weight decay regularization by \\emph{decoupling} the weight decay from the optimization steps taken w.r.t. the","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Decoupled Weight Decay Regularization because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-13T18:03:36.634368+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"Decoupled Weight Decay Regularization","claims":[{"claim_text":"L$_2$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is \\emph{not} the case for adaptive gradient algorithms, such as Adam. While common implementations of these algorithms employ L$_2$ regularization (often calling it \"weight decay\" in what may be misleading due to the inequivalence we expose), we propose a simple modification to recover the original formulation of weight decay regularization by \\emph{decoupling} the weight decay from the optimization steps taken w.r.t. the","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Decoupled Weight Decay Regularization because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-13T17:25:52.715956+00:00"}},"summary":{"title":"Decoupled Weight Decay Regularization","claims":[{"claim_text":"L$_2$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is \\emph{not} the case for adaptive gradient algorithms, such as Adam. While common implementations of these algorithms employ L$_2$ regularization (often calling it \"weight decay\" in what may be misleading due to the inequivalence we expose), we propose a simple modification to recover the original formulation of weight decay regularization by \\emph{decoupling} the weight decay from the optimization steps taken w.r.t. the","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Decoupled Weight Decay Regularization because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"Adam: A Method for Stochastic Optimization","work_id":"1910796d-9b52-4683-bf5c-de9632c1028b","shared_citers":61},{"title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale","work_id":"e96730e3-129b-4db6-b981-15ab7932e297","shared_citers":55},{"title":"GPT-4 Technical Report","work_id":"b928e041-6991-4c08-8c81-0359e4097c7b","shared_citers":38},{"title":"DINOv2: Learning Robust Visual Features without Supervision","work_id":"26b304e5-b54a-4f26-be7e-83299eca52e4","shared_citers":33},{"title":"SGDR: Stochastic Gradient Descent with Warm Restarts","work_id":"ad476478-c5ea-495b-a454-168c504bbfcc","shared_citers":33},{"title":"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning","work_id":"e6b75ad5-2877-4168-97c8-710407094d20","shared_citers":27},{"title":"Flow Matching for Generative Modeling","work_id":"6edb71c4-5d64-40af-a394-9757ea051a36","shared_citers":27},{"title":"LLaMA: Open and Efficient Foundation Language Models","work_id":"c018fc23-6f3f-4035-9d02-28a2173b2b9d","shared_citers":27},{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":27},{"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","shared_citers":27},{"title":"DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models","work_id":"c5006563-f3ec-438a-9e35-b7b484f34828","shared_citers":26},{"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","shared_citers":24},{"title":"Classifier-Free Diffusion Guidance","work_id":"acf2c588-c088-4a6c-938e-150ad7c666d7","shared_citers":24},{"title":"Gaussian Error Linear Units (GELUs)","work_id":"0466fd22-03a1-4a61-af0a-a900e77bb023","shared_citers":24},{"title":"Scaling Laws for Neural Language Models","work_id":"b7dd8749-9c45-4977-ab9b-64478dce1ae8","shared_citers":23},{"title":"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge","work_id":"28ea1282-d657-4c61-a83c-f1249be6d6b1","shared_citers":23},{"title":"Auto-Encoding Variational Bayes","work_id":"97d95295-30e1-42b4-bbf6-85f0fa4edb44","shared_citers":22},{"title":"Proximal Policy Optimization Algorithms","work_id":"240c67fe-d14d-4520-91c1-38a4e272ca19","shared_citers":22},{"title":"Training Verifiers to Solve Math Word Problems","work_id":"acab1aa8-b4d6-40e0-a3ee-25341701dca2","shared_citers":22},{"title":"Qwen3-VL Technical Report","work_id":"1fe243aa-e3c0-4da6-b391-4cbcfc88d5c0","shared_citers":21},{"title":"Score-Based Generative Modeling through Stochastic Differential Equations","work_id":"d9110e53-a5d4-4794-a4c5-a575e91c31ad","shared_citers":21},{"title":"Wan: Open and Advanced Large-Scale Video Generative Models","work_id":"ad3ebc3b-4224-46c9-b61d-bcf135da0a7c","shared_citers":21},{"title":"Denoising Diffusion Implicit Models","work_id":"8fa2128b-d18c-405c-ac92-0e669cf89ac0","shared_citers":20},{"title":"DINOv3","work_id":"c8b07deb-8fe7-4e18-9620-f3569d3529ce","shared_citers":20}],"time_series":[{"n":1,"year":2019},{"n":2,"year":2020},{"n":4,"year":2021},{"n":9,"year":2022},{"n":7,"year":2023},{"n":9,"year":2024},{"n":11,"year":2025},{"n":381,"year":2026}]},"authors":[{"id":"04d63bfc-63bb-4ef5-aa6e-77f634e82f86","orcid":null,"display_name":"Ilya Loshchilov and Frank Hutter","source":"manual","import_confidence":0.72}]},"citers":{"total":1039,"items":[{"citing_arxiv_id":"2606.31988","ref_index":59,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Joint inference of weak lensing convergence map and cosmology with diffusion models","primary_cat":"astro-ph.CO","submitted_at":"2026-06-30T17:25:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A transformer-based diffusion model learns the joint distribution of convergence maps and cosmology from log-normal weak lensing simulations and generates calibrated posterior samples matching MCMC results.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31986","ref_index":31,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"CoLT: Teaching Multi-Modal Models to Think with Chain of Latent Thoughts","primary_cat":"cs.CV","submitted_at":"2026-06-30T17:24:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CoLT replaces text-based chain-of-thought in MLLMs with 3-step latent thought chains supervised by a removable external decoder in forward and backward modes, yielding 10.1x faster inference on eight benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31924","ref_index":31,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"InstanceControl: Controllable Complex Image Generation without Instance Labeling","primary_cat":"cs.CV","submitted_at":"2026-06-30T16:33:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"InstanceControl uses VLMs to auto-generate instance masks from text and visual conditions, with adaptive refinement, to enable controllable multi-object image generation without manual labeling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31859","ref_index":10,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Review Residuals: Update-Conditioned Residual Gating for Transformers","primary_cat":"cs.LG","submitted_at":"2026-06-30T15:53:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Review Residuals add an update-conditioned gate to transformer residual connections, yielding depth-stable training and performance gains that emerge and grow with model size from 590M parameters upward.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31839","ref_index":55,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Towards Voxel Spacing Consistency for Medical Image Segmentation","primary_cat":"cs.CV","submitted_at":"2026-06-30T15:42:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Consispace is a semantic-aware resampling method that uses an implicit neural network with ODE constraints and feature reweighting to achieve consistent axial voxel spacing while preserving anatomy and semantics, improving downstream segmentation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31777","ref_index":9,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Mesh BDF: Barycentric Dominance Field for 3D Native Mesh Generation","primary_cat":"cs.CV","submitted_at":"2026-06-30T14:58:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Barycentric Dominance Field converts discrete mesh connectivity into a continuous surface signal that diffusion models can use directly for higher-quality native 3D mesh generation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31748","ref_index":43,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Addressing Over-Refusal in LLMs with Competing Rewards","primary_cat":"cs.LG","submitted_at":"2026-06-30T14:38:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SEAR trains one LLM via adversarial process rewards to explore harmful reasoning paths but flip to safe outputs, reducing over-refusal while preserving safety.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31688","ref_index":78,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Semantic Occupancy Prediction with Dual Range-Voxel Representation","primary_cat":"cs.CV","submitted_at":"2026-06-30T14:01:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DRVR uses range-view and geometry-aware voxel-view encoders plus fusion to deliver 5.4% higher mIoU and 2.1x faster inference than multi-sweep baselines on nuScenes-Occupancy from single sweeps.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31626","ref_index":27,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"PrISM-IQA: Image Quality Assessment Made Practical for Smartphone Photography","primary_cat":"cs.CV","submitted_at":"2026-06-30T13:10:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"PrISM-IQA reformulates IQA as multi-issue ordinal diagnosis predicting absent/minor/severe/critical levels for 53 ISP issues using cumulative encoding and structured inference.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31570","ref_index":17,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Mitigating Positional Leakage in 3D Masked Autoencoders for Robust Representation Learning","primary_cat":"cs.CV","submitted_at":"2026-06-30T12:29:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"MPL-MAE introduces recalibrated positional embedding and gated positional interface modules to reduce positional over-reliance in 3D masked autoencoders and improve semantic representation quality.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31513","ref_index":25,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"PRISM: Latent Composition Consistency for Single-Image Reflection Removal","primary_cat":"cs.CV","submitted_at":"2026-06-30T11:28:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"PRISM performs single-image reflection removal by linear decomposition in pretrained latent space with flow matching and latent composition consistency losses, outperforming prior methods on six benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31288","ref_index":9,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Probabilistic Inversion with Flow Matching","primary_cat":"cs.LG","submitted_at":"2026-06-30T08:04:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Adapts Flow Matching from generative AI to probabilistic inversion, evaluated on a simple 2D velocity model and the OpenFWI seismic dataset.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31268","ref_index":25,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"TDGT: A Tabular Data Generation Toolkit supporting adaptive GPU-accelerated Bayesian mixture models, diffusion-based models, and latent-space generative modeling","primary_cat":"cs.LG","submitted_at":"2026-06-30T07:42:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TDGT toolkit introduces ABMS for adaptive synthetic tabular data generation with multi-metric fidelity assessment and a web interface.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31247","ref_index":177,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"FlexiSLM: A Dynamic and Controllable Frame Rate Spoken Language Model","primary_cat":"cs.SD","submitted_at":"2026-06-30T07:24:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"FlexiSLM is the first spoken language model supporting dynamic and controllable frame rates on speech input and output, outperforming fixed-rate 7B models at high quality and enabling faster inference at lower rates like 6.25 Hz.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31127","ref_index":41,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SkillSpotter: Pose-Aware Multi-View Skilled Action Detection and Grading in Ego-Exo Videos","primary_cat":"cs.CV","submitted_at":"2026-06-30T04:43:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SkillSpotter raises class-specific mAP from 12.40 to 21.82 and balanced accuracy to 60.40% on Ego-Exo4D by adding adaptive temporal suppression, gated pose fusion, and bidirectional cross-view attention to temporal action detectors.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31099","ref_index":25,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Seeing Through Multiple Views: Parameter-Efficient Fine-Tuning via Selective Neurons for Consistent Radiology Report Generation","primary_cat":"cs.CV","submitted_at":"2026-06-30T03:48:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"View-PNDF detects and selectively fine-tunes view-specific neurons for consistent multi-view chest X-ray report generation, followed by LLM consolidation of reports.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.31029","ref_index":34,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"TerraDiT-$\\Omega$: Unified Spatial Control for Satellite Image Synthesis with Any Geospatial Primitive","primary_cat":"cs.CV","submitted_at":"2026-06-30T01:56:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TerraDiT-Ω generates satellite imagery from native geospatial primitives via Geometry-Aware Local Attention and outperforms dense and sparse control baselines while boosting downstream GeoAI tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30914","ref_index":65,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Beyond Clean Text: Evaluating Encoder and Decoder Robustness for Bangla Event Detection in Noisy Text","primary_cat":"cs.CL","submitted_at":"2026-06-29T21:03:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Creates a Bangla event detection benchmark with clean, ASR, and corrupted text variants and finds decoder-only LLMs more robust to noise than encoder models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30813","ref_index":20,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Gradient Smoothing: Coupling Layer-wise Updates for Improved Optimization","primary_cat":"cs.LG","submitted_at":"2026-06-29T18:37:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Gradient Smoothing applies depth-wise smoothing to optimizer updates from base methods like Adam, yielding consistent gains in optimization and generalization on language, RL, diffusion, and vision tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30765","ref_index":50,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Deep Reinforcement Learning for Individual Atomic Control and Cooling","primary_cat":"quant-ph","submitted_at":"2026-06-29T18:01:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Deep reinforcement learning achieves real-time cooling of single-atom motion with a 388 microsecond time constant using cavity feedback, outperforming a linear differentiator controller.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30562","ref_index":39,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Morphing into Hybrid Attention Models","primary_cat":"cs.CL","submitted_at":"2026-06-29T17:02:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"FlashMorph formulates hybrid layer selection as budget-constrained optimization, trains per-layer gates on synthetic retrieval data with linearization regularization, then discretizes and distills to produce efficient hybrid architectures.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30489","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Factorizable Normalizing Flows for parameter-dependent density morphing","primary_cat":"stat.ML","submitted_at":"2026-06-29T15:54:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Factorizable Normalizing Flows represent parameter-dependent densities via a reference flow composed with a factorized polynomial transformation, enabling isolated per-parameter learning and linear scaling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30408","ref_index":52,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SA-Homo: Scale Adaptive Homography Estimation for Scale Variation Scenarios","primary_cat":"cs.CV","submitted_at":"2026-06-29T14:51:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SA-Homo introduces a hierarchical scale-adaptive homography estimation framework with SDBM, MLAC, CSMB, and IHERM modules plus the HMSA dataset that claims robust performance under up to 8x scale discrepancies.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30319","ref_index":11,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"BrainJanus: A Unified Model for Understanding and Generation across Brain, Vision, and Language","primary_cat":"cs.CV","submitted_at":"2026-06-29T14:02:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"BrainJanus presents a unified autoregressive model with a brain tokenizer that maps between neural activity, vision, and language for encoding and decoding tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30190","ref_index":30,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Few-Shot Domain Incremental Learning via Continual Vision-Language Consolidation","primary_cat":"cs.CV","submitted_at":"2026-06-29T12:04:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CVLC fuses calibrated vision prototypes with LLM-generated language prototypes and applies dual coalescent projection plus latent space reservation to enable few-shot adaptation across sequential domains, reporting up to 16% gains over prior methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30140","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"DNA Language Models: An Assessment of Pre-Training for Fine-Tuning Tasks","primary_cat":"q-bio.GN","submitted_at":"2026-06-29T11:20:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Benchmark assessment of pretraining contribution and BPE tokenization in transformer versus convolutional DNA language models for genomics fine-tuning tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30082","ref_index":8,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Clinical Risk-Aware Multi-Level Grading for Coronary Artery Stenosis through Curved Feature Reconstruction","primary_cat":"cs.CV","submitted_at":"2026-06-29T10:17:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Introduces CFR module for point-by-point feature alignment across curved vessel modalities and CR Loss for risk-weighted multi-level stenosis grading, reporting outperformance on an in-house dataset.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30047","ref_index":57,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Argus: Metric Panoramic 3D Reconstruction for Indoor Scenes","primary_cat":"cs.CV","submitted_at":"2026-06-29T09:39:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Argus is a feed-forward network for metric panoramic 3D reconstruction, trained on the new Realsee3D dataset of 10K indoor scenes and using a learned covisibility module plus decomposed mapping supervision to achieve SOTA on camera pose, depth, and point cloud tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.30015","ref_index":15,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Parametric Skills","primary_cat":"cs.CL","submitted_at":"2026-06-29T09:19:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"ParametricSkills uses a hypernetwork to turn textual skills into LoRA adapters, outperforming in-context learning by 6.44 points on average across six SWE subtasks with higher BERT Score and F1.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29843","ref_index":78,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Gappy Reconstruction of Bubbly Flows by Guided Diffusion Models","primary_cat":"physics.flu-dyn","submitted_at":"2026-06-29T06:30:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A guided diffusion model trained on DNS data reconstructs bubble-phase velocity fields in bubbly flows from liquid measurements, reproducing key statistics and supporting 3D reconstruction via 2D slice patching.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29760","ref_index":23,"ref_count":2,"confidence":0.98,"is_internal_anchor":true,"paper_title":"MR-IQA: A Unified Margin View of Regression and Ranking for Blind Image Quality Assessment","primary_cat":"cs.CV","submitted_at":"2026-06-29T04:07:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MR-IQA unifies regression and ranking in BIQA via a quality-margin optimization framework in RL, showing competitive performance on six benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29575","ref_index":43,"ref_count":2,"confidence":0.98,"is_internal_anchor":true,"paper_title":"TF-MoE: Time-Frequency Mixture-of-Experts for Efficient Speech Separation","primary_cat":"cs.SD","submitted_at":"2026-06-28T19:37:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TF-MoE uses dynamic per-frame and per-mel-band expert selection in time and frequency dimensions to improve speech separation performance at comparable compute cost to prior models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29473","ref_index":45,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"MAVIN: Multi-Shot Audio-Visual Generation with Narrative Control","primary_cat":"cs.CV","submitted_at":"2026-06-28T16:01:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"MAVIN proposes boundary-aware attention, ID-aware propagation, a multi-agent scripting pipeline, and the MAVINSet dataset as the first framework for multi-shot audio-visual generation with narrative control, claiming SOTA results.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29166","ref_index":19,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Self-Supervised Learning Framework for Video Encoding Complexity Clustering","primary_cat":"eess.IV","submitted_at":"2026-06-28T03:01:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CECL pretrains video encoders via compression responses for downstream clustering by encoding complexity, claiming gains over SOTA encoders and bitrate/quality savings vs fixed ladders.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29162","ref_index":11,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Spatially Localized Image Degradation Embeddings for Image Quality Assessment","primary_cat":"cs.CV","submitted_at":"2026-06-28T02:55:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SLIDE-IQA uses a dual-branch ViT with Threshold-Bounded Exclusion Mechanism for contrastive pretraining on localized degradations to boost sensitivity in NR-IQA while matching existing SSL benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28996","ref_index":71,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"On Surrogate Modeling of Static Response of AM Short-Fiber Thermoplastics Using Graph Neural Networks","primary_cat":"cs.LG","submitted_at":"2026-06-27T16:18:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A GNN-LSTM surrogate trained on Voronoi-cell homogenized nonlinear FE data predicts unseen SFT microstructure responses with R²≈0.98 and >100x speedup over direct FE.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28991","ref_index":8,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Learning from Acquisition: Metadata-driven Multimodal Pre-training for Cardiac MRI","primary_cat":"cs.CV","submitted_at":"2026-06-27T15:59:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MetaCLIP-CMR applies CLIP-style contrastive learning to cardiac MRI by treating acquisition metadata as text labels, delivering 86.8% modality and 86.5% view accuracy plus top Dice scores on ACDC/M&Ms segmentation with far less pre-training data than recent large-scale CMR models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28831","ref_index":19,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"HARD-KV: Head-Adaptive Regularization for Decoding-time KV Compression","primary_cat":"cs.LG","submitted_at":"2026-06-27T09:36:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"HARD-KV bridges dynamic head-adaptive KV cache compression with static inference engine constraints via Cascade Cache and Logits Calibration, reporting up to 2x throughput gains on long-context math benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28757","ref_index":41,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Physics-Grounded Benchmark for Multi-Agent Dynamics in World Models","primary_cat":"cs.CV","submitted_at":"2026-06-27T06:13:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CrashTwin is a new benchmark framework that exposes physical violations in state-of-the-art world models during multi-agent collisions despite high visual quality.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28719","ref_index":24,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"ComMem: Complementary Memory Systems for Test-Time Adaptation of Vision-Language Models","primary_cat":"cs.AI","submitted_at":"2026-06-27T03:55:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"ComMem proposes complementary fast visual cache and slow textual prototype memories for test-time adaptation of VLMs, claiming superior performance on 15 benchmarks under distribution shifts.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28677","ref_index":11,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SATB-VR: Training Few-Step Video Restoration Diffusion Model using SNR-Aware Trajectory Blending","primary_cat":"cs.CV","submitted_at":"2026-06-27T01:32:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SATB-VR trains few-step video restoration diffusion models via SNR-aware trajectory blending of predictor outputs with ground-truth and a denoiser-driven consistency loss to achieve favorable performance on benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28551","ref_index":188,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"DataComp-VLM: Improved Open Datasets for Vision-Language Models","primary_cat":"cs.CV","submitted_at":"2026-06-26T19:11:29+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27978","ref_index":7,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Parallel Rollout Approximation for Pixel-Space Autoregressive Image Generation","primary_cat":"cs.CV","submitted_at":"2026-06-26T11:27:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PRA approximates sequential rollout training in parallel for pixel-space AR models via intermediate states and a pixel decoder, achieving FID 2.58 (135M params) and 1.94 (511M params) on ImageNet-1K 256x256, new SOTA among pixel-space AR models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28453","ref_index":20,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"DeVAR: Low-Dose CT Denoising via Visual Autoregressive Modeling","primary_cat":"eess.IV","submitted_at":"2026-06-26T09:55:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"DeVAR is the first application of visual autoregressive modeling to low-dose CT denoising, using next-scale token prediction, a residual refiner, and hybrid discrete-continuous decoding to outperform prior methods on two public datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27872","ref_index":53,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"S$^2$-VLA: State-Space Guided Vision-Language-Action Models for Long-Horizon Manipulation","primary_cat":"cs.RO","submitted_at":"2026-06-26T09:13:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"S²-VLA uses a state-space model to maintain a belief state that produces dynamic gating weights for fusing visual, language, and action features, claiming better long-horizon manipulation than 7B models with only 2B parameters.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28445","ref_index":32,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"LoRA-Tuned Large Language Models for Dementia Detection via Multi-View Speech-Derived Features","primary_cat":"cs.SD","submitted_at":"2026-06-26T08:29:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A LoRA-tuned LLM integrates four complementary speech views to detect dementia, reaching 90.14% F1 on ADReSSo with ablation support for each view's contribution.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27733","ref_index":28,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"BashCoder-R1: Towards Robust and Explainable Bash Code Generation with Robustness-Aware Group Relative Policy Optimization","primary_cat":"cs.SE","submitted_at":"2026-06-26T05:29:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"BashCoder-R1 applies CPT, L-CoT SFT, and R-GRPO to reach higher syntax, robustness, and functionality rates than baselines on the new BashBench benchmark of 952 tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27708","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"ZooClaw-FashionSigLIP2: Distilled Fine-tuning for Robust Fashion Retrieval","primary_cat":"cs.CV","submitted_at":"2026-06-26T04:13:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"ZooClaw-FashionSigLIP2 applies distilled full fine-tuning plus WiseFT interpolation to SigLIP2-base and reports outperforming LoRA, larger backbones, and external data on fashion retrieval benchmarks while releasing a new benchmark and bias analysis.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27634","ref_index":4,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Continual Learning for Sequential Personalization of Small Language Models: A Stability Monitoring Analysis","primary_cat":"cs.LG","submitted_at":"2026-06-26T01:16:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Checkpoint monitoring during sequential LoRA adaptation of SLMs reveals instability patterns via reference set diagnostics that standard task metrics can miss.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27617","ref_index":28,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Masked Language Flow Models","primary_cat":"cs.CL","submitted_at":"2026-06-26T00:16:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MLFMs combine masking with continuous flows to scale flow-based language models to reasoning and instruction-following tasks on GSM8K and MT-Bench.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27554","ref_index":24,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Understanding Cross-Rig Generalization in Automotive Perception: a Multi-Rig Benchmark and Rig Variation Metrics","primary_cat":"cs.CV","submitted_at":"2026-06-25T21:06:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces Plentiful CARLA Camera Rigs benchmark and two calibration-derived metrics (Rig Variance, Rig Contrastive Distance) showing geometric rig differences correlate with cross-rig performance drops in multi-view automotive perception.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.27514","ref_index":23,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Tessellating The Earth","primary_cat":"cs.CV","submitted_at":"2026-06-25T19:58:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"TTE replaces fixed spherical bases with differentiable Voronoi partitions plus shared semantic tokens to create adaptive geolocation encoders that reach new SOTA on geospatial tasks and iNaturalist species classification.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.26711","ref_index":23,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Mask to Concept: Auto-Promptable SAM3 via Efficient Test-Time Concept Embedding Search for Few-Shot Annotation","primary_cat":"cs.CV","submitted_at":"2026-06-25T07:44:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"M2C turns SAM3 into an auto-promptable annotator for medical few-shot segmentation via test-time concept embedding optimization and uncertainty-driven active refinement.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.25937","ref_index":17,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Event-Aware Loss Design for Forecasting of Convective Precipitation and Lightning","primary_cat":"physics.ao-ph","submitted_at":"2026-06-24T15:14:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A multi-task Patch-cGAN with lightning-derived spatial loss weighting improves post-processed forecasts of intense precipitation and lightning occurrence over the Korean Peninsula in summer 2025.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.28405","ref_index":57,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Enhancing Layer Interaction Using Key-Correlated Layer Attention","primary_cat":"cs.CV","submitted_at":"2026-06-24T13:54:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"KCLA is a linear-complexity layer attention mechanism that exploits high key cosine similarity to preserve dynamic updates and long-range cross-layer connections.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.25478","ref_index":29,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"TACO: Towards Task-Consistent Open-Vocabulary Adaptation in Video Recognition","primary_cat":"cs.CV","submitted_at":"2026-06-24T07:06:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"TACO proposes Relative Structure Distillation and a lightweight specialization projection to mitigate inconsistency between fine-tuning and evaluation objectives in open-vocabulary video recognition, claiming state-of-the-art results on cross-dataset and base-to-novel benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.25086","ref_index":24,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Training for the Model You Return: Improving Optimization for Iterate-Averaged Language Models","primary_cat":"cs.LG","submitted_at":"2026-06-23T18:47:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PACE is a clipped per-coordinate controller added to AdamW that improves the limiting error of the returned iterate average in both quadratic analysis and LM experiments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.21066","ref_index":2,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Demographic Metadata as Construct-Irrelevant Noise in DistilBERT-Based Automated Essay Scoring","primary_cat":"cs.CL","submitted_at":"2026-06-19T03:32:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"Naive concatenation of demographic metadata to text input in a DistilBERT AES model reduces QWK from 0.727 to 0.656, raises validation loss, and lowers score parity instances from 15 to 12 on the ASAP 2.0 dataset via 10-fold cross-validation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.19781","ref_index":37,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Towards Engineering Scaling Laws with Pretraining Data Composition","primary_cat":"hep-ex","submitted_at":"2026-06-18T04:32:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Pretraining data composition can be used to engineer neural scaling laws in hadronic jet classification toward data-heavy rather than model-size-heavy regimes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.18698","ref_index":39,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Leveraging Energy Features for Surface Classification with Deep Learning: A Comparative Analysis Across Three Independent Datasets","primary_cat":"cs.RO","submitted_at":"2026-06-17T05:24:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"Energy features support 85-90% surface classification accuracy with DL models across three datasets and yield 1-2% gains when fused with inertial data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.18460","ref_index":48,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"ParticleTransformer is all you need for reconstructing hadronic tau leptons","primary_cat":"hep-ex","submitted_at":"2026-06-16T20:06:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"First fully machine-learned hadronic tau reconstruction at FCC-ee using ParticleTransformer achieves high performance on simulated data for identification, decay mode, charge, and kinematics.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.15129","ref_index":29,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"EyeMVP: OCT-Informed Fundus Representation Learning via Paired CFP--OCT Pretraining","primary_cat":"cs.CV","submitted_at":"2026-06-13T05:44:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"EyeMVP learns OCT-informed CFP representations via cross-modal masked reconstruction on 674k paired triples and reports competitive or superior performance on 15 retinal classification and segmentation tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.12949","ref_index":14,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"ViPER: Vision-based Packing-Aware Encoder for Robust Malware Detection","primary_cat":"cs.CR","submitted_at":"2026-06-11T06:21:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ViPER uses a LoRA-adapted ViT-B/14 with dual heads for malware classification and packing detection plus a gating mechanism and weighted losses to reach 0.8521 balanced accuracy on 200k Windows PE images while detecting packing at 0.9949 AUC.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.12661","ref_index":5,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Finding Novel Precursors for Solar Wind Stream Interaction Regions with Interpretable Deep Learning","primary_cat":"astro-ph.SR","submitted_at":"2026-06-10T20:45:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SIREN is a ~100k-parameter Transformer that detects SIRs with ROC-AUC 0.93 on held-out data and attributes 24% importance to proton density and 13-17% to transverse velocity, identifying flow deflection as a consistent signature.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.12635","ref_index":6,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"CD-RCM: Generalizable Continuous-Depth Novel View Synthesis for Reflectance Confocal Microscopy","primary_cat":"cs.CV","submitted_at":"2026-06-10T19:54:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CD-RCM is a feedforward neural model for novel view synthesis that predicts unseen depths in reflectance confocal microscopy stacks to produce isotropic 3D volumes for arbitrary sectioning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.10547","ref_index":43,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Unsupervised Deep Learning for Limited-Angle STEM-EDX Tomography -- Application to 3D Chemical Analysis of Phase-Change Memory Devices","primary_cat":"eess.IV","submitted_at":"2026-06-09T08:16:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Unsupervised multi-channel DIP-TV reconstructs near-isotropic 3D elemental maps from limited-angle EDX tomography data using only EDX signals, applied to GST memory devices in virgin and SET states.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.08783","ref_index":104,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality","primary_cat":"math.OC","submitted_at":"2026-06-07T18:59:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"OptMuon combines orthogonalized momentum with trajectory-dependent AdaGrad-Norm adaptation to obtain expected-stationarity rates of order T^{-1/2} + sigma^{1/2}T^{-1/4} or T^{-1/2} + sigma^{1/3}T^{-1/3} that reduce to near-optimal deterministic first-order rates in the zero-noise regime.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.08276","ref_index":51,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"QnRL: Quantum-Native Reinforcement Learning","primary_cat":"quant-ph","submitted_at":"2026-06-06T17:54:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"QnRL is a distributional quantum RL framework that distills conditional action policies from moments of quantum generative models in Hilbert space via the QuAK algorithm, reporting higher scores and fewer parameters than baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.05071","ref_index":29,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"InstantRetouch: Efficient and High-Fidelity Instruction-Guided Image Retouching with Bilateral Space","primary_cat":"cs.CV","submitted_at":"2026-06-03T16:30:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"InstantRetouch performs efficient high-fidelity language-guided retouching via bilateral grid prediction of affine transforms combined with variational score distillation from diffusion models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.05261","ref_index":2,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"NIV: Neural Axis Variations for Variable Font Generation","primary_cat":"cs.CV","submitted_at":"2026-06-03T16:17:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"NIV trains a neural model with property embeddings to predict per-point displacements on vector glyphs, turning static fonts into multi-axis variable fonts using a new dataset of over one million tuples from Google Fonts.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.05030","ref_index":15,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Imbuing Large Language Models with Bidirectional Logic for Robust Chain Repair","primary_cat":"cs.CL","submitted_at":"2026-06-03T15:58:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TRI trains LLMs on goal-conditioned fill-in-the-middle tasks via PSM token rearrangement and symbolic verification to surgically repair erroneous CoT segments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04776","ref_index":13,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SoftPINCH: EMG-Driven Soft Exoskeleton Assistance for Finger Flexion and Grasping","primary_cat":"cs.RO","submitted_at":"2026-06-03T11:59:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"SoftPINCH is an EMG-driven soft exoskeleton using CNN+LSTM decoding and magnetic fingertip sensing that achieves 99.4% cross-subject accuracy and reduces muscular effort during pinch grasping.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04737","ref_index":47,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Physics-Informed Video Generation via Mixture-of-Experts Latent Alignment","primary_cat":"cs.CV","submitted_at":"2026-06-03T11:20:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"PILA aligns frozen flow-matching video models to a physics attribute bank via MoE experts and operational residuals, reporting SOTA physical plausibility on VBench-2.0, VideoPhy-2 and PhyGenBench while preserving visual quality.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04552","ref_index":6,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling","primary_cat":"cs.CL","submitted_at":"2026-06-03T07:38:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LDARNet learns adaptive token boundaries via dynamic chunking in a genomic foundation model and reports gains on histone modification tasks over larger models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04438","ref_index":26,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"LoopMoE: Unifying Iterative Computation with Mixture-of-Experts for Language Modeling","primary_cat":"cs.LG","submitted_at":"2026-06-03T04:38:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LoopMoE is a looped MoE language model that outperforms matched vanilla MoE on 8 of 9 downstream benchmarks at 3B scale and continues to outperform at 9B scale under strictly controlled budgets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04366","ref_index":8,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"MeshTok: Efficient Multi-Scale Tokenization for Scalable PDE Transformers","primary_cat":"cs.LG","submitted_at":"2026-06-03T02:29:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MeshTok uses AMR-inspired adaptive multiscale tokenization to improve the efficiency-accuracy trade-off of Transformer models for PDEs over uniform-grid baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03994","ref_index":38,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SimuScene: Simulation-Ready Compositional 3D Scene Reconstruction from a Single Image","primary_cat":"cs.CV","submitted_at":"2026-06-02T17:59:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SimuScene feeds physics simulation diagnostics back into shape and layout estimation to correct geometric errors and output simulation-ready compositional scenes from single images.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03915","ref_index":12,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"PatchScene: Patch-based Voxel Diffusion for Large-Scale Scene Completion","primary_cat":"cs.CV","submitted_at":"2026-06-02T17:09:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"PatchScene introduces patch-based voxel diffusion with spatio-temporal fusion and annular-flow propagation for large-scale LiDAR scene completion, claiming SOTA results on SemanticKITTI and generalization from 20m to 50m ranges.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03874","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"DyaPlex: Full-Duplex Speech-Motion Model for Dyadic Interaction","primary_cat":"cs.CV","submitted_at":"2026-06-02T16:42:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DyaPlex introduces a dual-tower Transformer that adds a streaming motion pathway to a frozen full-duplex speech model using dyadic token interleaving and time-aligned RoPE for synchronized multimodal dyadic interaction.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03810","ref_index":34,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Consistency Training Can Entrench Misalignment","primary_cat":"cs.CL","submitted_at":"2026-06-02T15:54:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Consistency training suppresses reward hacking and emergent misalignment but amplifies sycophancy in controlled model organisms, driven by labeling-induced distribution shifts rather than selection operators.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03577","ref_index":30,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching","primary_cat":"cs.CV","submitted_at":"2026-06-02T12:46:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Authors create ReasonMatch-Bench and DCRL training to boost MLLM performance on wide-baseline matching, reporting gains over baselines while preserving general capabilities.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03539","ref_index":23,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Knowledge-Preserved Model Tuning in Null-Space for Robust Spatio-Temporal Video Grounding","primary_cat":"cs.CV","submitted_at":"2026-06-02T11:59:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Null-Space Tuning injects learnable residuals into input features confined to the null-space for high-quality inputs to preserve pre-trained knowledge while directing restoration components for low-quality inputs outside that space.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03376","ref_index":95,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"P$^2$-DPO: Grounding Hallucination in Perceptual Processing via Calibration Direct Preference Optimization","primary_cat":"cs.CV","submitted_at":"2026-06-02T09:22:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"P²-DPO generates on-policy preference pairs targeting focus-and-enhance perception and visual robustness, combined with a calibration loss, to reduce hallucinations in LVLMs more effectively than human-feedback baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04048","ref_index":5,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Unlocking Feature Learning in Gated Delta Networks at Scale","primary_cat":"cs.LG","submitted_at":"2026-06-02T08:45:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Derives μP-style scaling rules for Gated Delta Networks and validates stable learning-rate transfer in language model pre-training experiments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03287","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"BA-T: An Iterative Transformer for Two-View Bundle Adjustment","primary_cat":"cs.CV","submitted_at":"2026-06-02T07:51:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BA-T is an iterative Transformer that implements bundle adjustment as a repeatable lightweight layer to progressively refine pose and geometry predictions in two-view 3D reconstruction while using far fewer decoder parameters than prior models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.09871","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SD-GRPO: Verifiable Segment Decomposition for Long-Form Vision-Language Generation","primary_cat":"cs.CV","submitted_at":"2026-06-02T07:50:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SD-GRPO extends GRPO by computing per-segment advantages via z-normalization of verifiable segment rewards, yielding gains on long-form VL tasks with varying semantic independence across segments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03254","ref_index":17,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"FreeStreamGS: Online Feed-forward 3D Gaussian Splatting from Unposed Streaming Inputs","primary_cat":"cs.CV","submitted_at":"2026-06-02T07:16:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"FreeStreamGS achieves online NVS from unposed streaming inputs competitive with offline 3DGS methods via decoupled intrinsic recovery and dynamic point refinement.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03251","ref_index":48,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Do Real-World Datasets Contain Natural Experiments? An Empirical Study Using Causal Feature Selection","primary_cat":"cs.AI","submitted_at":"2026-06-02T07:12:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Empirical evaluation on synthetic and real-world datasets indicates that natural experiments are present and can be leveraged via causal feature selection to boost model performance.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02937","ref_index":45,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"BEAST3D: Animal behavioral analysis and neural encoding from multi-view video via Gaussian splatting","primary_cat":"q-bio.NC","submitted_at":"2026-06-01T22:34:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BEAST3D learns viewpoint-invariant 3D features from calibrated multi-view animal videos via Gaussian splatting for novel view synthesis, pose estimation, and neural encoding across four species.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02912","ref_index":60,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Data-Driven Forecasting of three-Component Seismograms Using Transformer Architectures","primary_cat":"astro-ph.IM","submitted_at":"2026-06-01T21:38:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SeismoGPT is a transformer autoregressive model achieving median normalized cross-correlation above 0.93 when forecasting synthetic three-component seismograms up to 240 s ahead from P- and S-wave context.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02565","ref_index":112,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Policy-based Foveated Imaging and Perception","primary_cat":"cs.CV","submitted_at":"2026-06-01T17:55:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A task-aware policy learned via reinforcement learning allocates high-resolution pixels on dual-stream sensors in real time, outperforming fixed or non-predictive baselines under tight pixel budgets in both simulation and 200 MP hardware tests.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02491","ref_index":29,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"MORPHOS: Autoregressive 4D Generation with Temporal Structured Latents","primary_cat":"cs.CV","submitted_at":"2026-06-01T17:01:21+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MORPHOS introduces an autoregressive 4D generation method with Temporal Structured Latents (T-SLAT) that produces dynamic 3D assets from videos while handling topological changes and long sequences.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02441","ref_index":42,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Spatial-Temporal Decoupled Reference Conditioning for Identity-Preserving Text-to-Video Generation","primary_cat":"cs.CV","submitted_at":"2026-06-01T16:12:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"ST-DRC proposes latent in-context injection, TASS-RoPE, appearance-invariant augmentation, and three-stream guidance to improve identity preservation in text-to-video diffusion models built on LTX-2.3.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02427","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Spectral Audit of In-Context Operator Networks","primary_cat":"math.NA","submitted_at":"2026-06-01T16:04:21+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The paper defines a Jacobian-Fourier audit that extracts frequency-dependent gains, phase structure, and cross-mode coupling from in-context operator networks to test local operator fidelity beyond prediction error.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02366","ref_index":27,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"PRIMA: Boosting Animal Mesh Recovery with Biological Priors and Test-Time Adaptation","primary_cat":"cs.CV","submitted_at":"2026-06-01T15:13:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"PRIMA boosts 3D quadruped mesh recovery by injecting BioCLIP biological priors and using test-time adaptation with 2D constraints to build the Quadruped3D pseudo-3D dataset and reach SOTA on imbalanced animal benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02350","ref_index":35,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"TROPHIES: Temporal Reconstruction of Places, Humans, and Cameras from Multi-view Videos","primary_cat":"cs.CV","submitted_at":"2026-06-01T15:00:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"UNKNOWN","novelty_score":7.0,"formal_verification":"none","one_line_summary":"TROPHIES introduces a unified framework for human-scene-camera reconstruction from multi-view videos, achieving globally aligned and physically plausible 4D outputs on EgoHuman and EgoExo4D.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02300","ref_index":59,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization","primary_cat":"cs.CL","submitted_at":"2026-06-01T14:23:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"PHF applies Bourdieu's Theory of Practice to create hierarchical user models for LLM personalization and reports consistent gains on the LaMP benchmark.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02209","ref_index":33,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Conditional Graph Diffusion for Negotiation Support: Overcoming Discrete Infeasibility and Preference Elicitation Gaps","primary_cat":"cs.GT","submitted_at":"2026-06-01T13:10:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Conditional Graph Diffusion generates continuous negotiation outcomes with high individual rationality using GATv2 encoders, cross-attention fusion, and inference-time normative guidance gradients.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02145","ref_index":54,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Hybrid Neural Ordinary Differential Equations for Data-Efficient Polymerization Modeling with Incomplete Kinetics","primary_cat":"cs.LG","submitted_at":"2026-06-01T12:10:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Hybrid NODE retains mechanistic kinetics for free-radical polymerization and learns only the radical concentration closure, achieving RMSE 0.013 on noisy unseen conditions versus 0.31 and 0.68 for data-driven baselines with as few as ten measurements.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02133","ref_index":61,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Variational Learning for Insertion-based Generation","primary_cat":"cs.LG","submitted_at":"2026-06-01T11:59:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"UNKNOWN","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces the Insertion Process model for variable-length non-monotonic sequence generation via a bijective permutation mapping and permutation-based variational inference.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":100,"offset":0}}