A Gaussian process surrogate gate inserted between generative crystal models and property oracles matches or exceeds ungated fine-tuning while using roughly one-fifth the oracle calls for heat capacity and bulk modulus.
Mixed citations
Mixed citation behavior. Most common role is method (57%).
representative citing papers
LightGBM models on citation and diversity features predict exogenous diffusion of quantum computing concepts with R² up to 0.78 while endogenous reinforcement remains largely unpredictable after growth controls, with replications in other fields.
Introduces EURO-5K dataset from 136 EU acts and benchmarks full fine-tuning vs QLoRA for BERT and LLM models on reporting obligation extraction, reporting 0.89 F1 with limited gains from legal pretraining except under parameter-efficient adaptation.
Latent prediction SSL recovers latent trees from PCFG data with sample complexity that is constant in the hierarchy depth L (up to log factors), whereas token-level and supervised methods need sample complexity exponential in L.
A taxonomy of SNN training algorithms is presented with the release of NeuroTrain, an open benchmarking framework for reproducible comparisons across datasets and architectures.
Introduces Calibrated Size Ratio (CSR) and confidence-weighted metrics to better detect overconfidence risk and calibration issues beyond the limitations of ECE.
EstGraph benchmark evaluates LLMs on estimating properties of very large graphs from random-walk samples that fit in context limits.
A low-stake adversary can degrade a liquid staking pool's performance via consensus manipulation and profit from the resulting drop in its LST value through application-layer financial positions.
Players exhibit consistent flexibility or specialization behavior across two games with conflicting performance incentives, indicating individual agency dominates structural differences.
Unified framework proves the score function yields the minimum-variance unbiased shear estimator and that response-weighted inverse-variance weights minimize shape noise independent of galaxy shape distributions, with RDSM reducing noise by ~17.5% at LSST depth.
On eight PMLB tabular benchmarks, an LLM HPO advisor adds only +0.40 pp CV accuracy beyond a fixed default seed and is overtaken by seeded classical methods within 5-12 evaluations, with no held-out test gain.
Online conformal prediction post-processing guarantees calibrated uncertainty coverage for GenCast, NeuralGCM, and AIFS-ENS forecasts of temperature and precipitation including extremes.
P²CE is a model-agnostic algorithm for plausible Pareto-optimal counterfactual explanations that uses isolation forest for plausibility and SHAP for efficiency, claiming better quality and speed on three datasets.
MSC-CMA-ES makes CMA-ES restarts structure-aware via cyclic nearest-better basin discovery on Sobol pre-samples, achieving 2.7x higher target coverage than BIPOP-CMA-ES on composition functions across CEC suites.
A two-stage LightGBM model on 59 features from concept networks forecasts link formation and intensity with ROC-AUC 0.95-0.967 across domains.
A triplet-based plateau search algorithm is proposed to adaptively determine a near-minimal number of trees for random forests by monitoring relative OOB score changes across forest size triplets, removing n_trees from the TPE search space.
ML-accelerated screening of 8640 AB2C2D variants yields 34 low-hull-energy altermagnets with spin splittings exceeding 1.5 eV, including RbMn2Te2O with 1.88 eV splitting and ~390 K Neel temperature.
An LLM-orchestrated physics simulation search identifies polymers with strong insulin interactions, outperforming standard optimization methods by significant margins.
AutoLLMResearch trains agents in a multi-fidelity LLMConfig-Gym environment formulated as a long-horizon MDP to enable cross-fidelity extrapolation for automating high-cost LLM experiment configurations.
Coverage tests for simulation-based inference of f_NL can pass while the posteriors are underconfident in the tails and sometimes yield weaker constraints than using power spectrum or bispectrum alone.
RL-STPA adapts STPA for RL via hierarchical subtask decomposition, coverage-guided perturbation testing, and iterative checkpoints that feed hazards back into training, demonstrated on autonomous drone navigation to reveal loss scenarios missed by standard evaluations.
CivBench trains models on turn-level states in Civilization V to predict victory probabilities, providing a progress-based evaluation of LLM strategic capabilities across 307 games with 7 models.
Physics-informed graph attention networks predict multi-phase equilibria in Ag-Bi-Cu-Sn alloys with 96% exact-set accuracy on in-domain data and strong generalization to unseen sections.
Unsupervised domain adaptation via feature alignment raises radioisotope identification accuracy on real LaBr3 gamma spectra from 0.754 to 0.904 for models trained only on synthetic data.
citing papers explorer
- AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration - Learning from Cheap, Optimizing Expensive