Model cards for model reporting
Mixed citation behavior. Most common role is background (64%).
citing papers explorer
- Acceptance Cards: A Four-Diagnostic Standard for Safe Fine-Tuning Defense Claims
Acceptance Cards is a new four-diagnostic standard for safe fine-tuning defense claims that requires statistical reliability, fresh semantic generalization, mechanism alignment, and cross-task transfer; under this protocol SafeLoRA fails the full-card pass on Gemma-2-2B-it.
- Towards Measuring the Representation of Subjective Global Opinions in Language Models
LLMs default to responses more similar to opinions from the USA and some European and South American countries; prompting for a country shifts alignment but can introduce stereotypes, while translation does not reliably match language speakers.
- Diversed Model Discovery via Structured Table Discovery
StructuredSemanticSearch uses table discovery operators and orientation-aware integration on model-card tables to improve evidence coverage and diversity in model recommendation queries over a semantic baseline.
- The Open-Box Fallacy: Why AI Deployment Needs a Calibrated Verification Regime
AI deployment in high-stakes areas requires domain-scoped calibrated verification with monitoring and revocation, using a proposed six-component Verification Coverage standard instead of mechanistic interpretability.
- Can Agent Benchmarks Support Their Scores? Evidence-Supported Bounds for Interactive-Agent Evaluation
Agent benchmarks can report evidence-supported score bounds instead of single misleading success rates by adding a layer that checks required artifacts for outcome verification.
- CIVeX: Causal Intervention Verification for Language Agents
CIVeX maps agent tool calls to structural causal queries, checks identifiability, and issues auditable verdicts to prevent false executions while preserving utility on confounded benchmarks.
- Auditable Agents
No agent system can be accountable without auditability, which requires five dimensions (action recoverability, lifecycle coverage, policy checkability, responsibility attribution, evidence integrity) and mechanisms to detect, enforce, and recover.
- AI Disclosure with DAISY
DAISY is a structured form tool that generates more complete AI disclosure statements for research papers without reducing author comfort levels.
- A Human-Centric Framework for Data Attribution in Large Language Models
Introduces a parameter-driven framework for data attribution in LLMs that enables negotiation among creators, users, and intermediaries to meet stakeholder goals within the data economy.
- Industrial AI Robustness Card for Time Series Models
The paper proposes the IARC-TS protocol that combines drift monitoring, uncertainty quantification, and stress tests to generate reproducible robustness evidence for industrial time series models mapped to EU AI Act obligations.
- Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
- PaLM: Scaling Language Modeling with Pathways
PaLM 540B demonstrates continued scaling benefits by setting new few-shot SOTA results on hundreds of benchmarks and outperforming humans on BIG-bench.
- CTRL: A Conditional Transformer Language Model for Controllable Generation
CTRL is a large conditional transformer language model that uses naturally occurring control codes to steer text generation style and content.
- Ontological Knowledge Blocks: Executable Compliance and Profile-Based Validation for Trustworthy AI Systems
Ontological Knowledge Blocks formalize regulatory obligations as 5-tuples linking RDF/OWL schemas, SHACL rules, evidence requirements and provenance, with a compiler enabling profile-based validation demonstrated in an HPC allocation scenario.
- NIMROD-to-IMAS workflow for extended-magnetohydrodynamic data with reusable datasets and implications for IMAS schema development
A NIMROD-to-IMAS conversion workflow preserves equilibrium, profile, perturbation and grid data from an edge harmonic oscillation simulation and identifies gaps in the IMAS schema for extended MHD.
- The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents
Empirical analysis of 1,524 AI incident reports shows 83% arise from worker-AI trait misalignments, with 74% of those traceable to developers prioritizing efficiency over precision or personalization.
- The Agentic Economy: Humans, AI Agents, Robots, and the Measurable Transition toward Distributed Economic Action
The agentic economy features distributed economic action across humans, AI agents, robots, protocols, and energy systems, with quantitative diagnostics from public data indicating accelerating AI adoption, robot capacity, and task reallocation rather than labor disappearance.
- Beyond Model Readiness: Institutional Readiness for AI Deployment in Public Systems
Introduces the Institutional Alignment Readiness (IAR) framework with five dimensions to evaluate institutional deployment readiness for AI in public systems, motivated by two anonymized education-sector cases.
- Voices in the Loop: Mapping Participatory AI
Authors build a harmonized, geolocated atlas of participatory AI projects from existing and new sources, documenting geographic concentration and participation mostly at problem formulation and evaluation stages while providing update and governance mechanisms.
- Mechanism Plausibility in Generative Agent-Based Modeling
Introduces the Mechanism Plausibility Scale, a four-level framework separating generative sufficiency from mechanistic plausibility in LLM-based agent-based models.
- Exploring CoCo Challenges in ML Engineering Teams: Insights From the Semiconductor Industry
Interviews in a semiconductor company reveal 16 collaboration and communication challenges in ML engineering teams, with unclear roles and responsibilities as the top issue, and list effective mitigation practices under hardware-driven constraints.
- Recommender Systems as Control Systems
Modeling recommender systems as control systems shows that time-optimized fairness interventions can improve overall long-term performance rather than merely trading off against utility.
- Governing What the EU AI Act Excludes: Accountability for Autonomous AI Agents in Smart City Critical Infrastructure
The EU AI Act narrows accountability for multi-agent AI in critical infrastructure by excluding safety components from key explanation and impact assessment rights, and the paper proposes AgentGov-SC, a three-layer architecture with 25 measures to address this through traceability to existing AI and
- Fairness-First Design Thinking for Software Architecture
A fairness-first Design Thinking method is proposed and tested in software architecture education to systematically address hidden fairness issues in digital systems.
- Reckoning with the Political Economy of AI: Avoiding Decoys in Pursuit of Accountability
AI accountability efforts are undermined by five decoys that create illusions of progress while co-constituting the extractive political economy of the AI Project.
- Towards A Framework for Levels of Anthropomorphic Deception in Robots and AI
A conceptual framework classifies anthropomorphic deception into four levels using humanlikeness, agency, and selfhood to guide ethical and practical decisions in HCI and HRI.
- AI of the People, by the People, for the People: A Social Choice Approach to Collective Control of Artificial Intelligence
Proposes applying social choice theory as a modeling language and axiomatic tool for incorporating collective input across the ML development pipeline.
- Playing Games with My Heart: An Evaluation of AI Companion Apps
All five AI companion apps use substantial dark patterns for monetization and engagement, prevalent erotica and gamification, and highly anthropomorphic designs that may foster parasocial relationships.
- The Imbalanced User-AI Relationships as an Ethical Failure of Front-End Design in Healthcare AI
Imbalanced user-AI relationships form a distinct front-end ethical failure in healthcare AI: design choices such as restricted inputs and suppressed uncertainty can undermine agency, while reciprocity offers a path to more balanced interactions.
- What Is The Political Content in LLMs' Pre- and Post-Training Data?
Training data for open LLMs is systematically left-leaning, with pre-training corpora containing more political material than post-training data and model stances aligning with data distributions.
- StarCoder: may the source be with you!
StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
- AIMBio-Mat: An AI-Native FAIR Platform for Closed-Loop Materials Discovery and Biomedical Translation
AIMBio-Mat is a conceptual blueprint for an AI-native, FAIR, governance-aware decision layer that formulates biomedical-materials discovery as constrained multi-objective optimization under uncertainty.
- FASE: A Fairness-Aware Spatiotemporal Event Graph Framework for Predictive Policing
FASE pairs a spatiotemporal graph neural network and multivariate Hawkes process for crime prediction with a fairness-constrained linear program for patrol allocation, showing that allocation fairness holds in simulation but a 3.5 percentage point detection gap between minority and non-minority ZIPs
- Mapping the Stochastic Penal Colony
Content moderation operates as a stochastic penal colony that banishes users through the constant threat of account suspension, shown via auto-ethnographic case studies of Twitter, OpenAI DALL-E 2, and Pinterest.
- Human-aligned AI Model Cards with Weighted Hierarchy Architecture
Introduces CRAI-MCF, an eight-module framework distilling 217 parameters from 240 projects into a quantitative sufficiency criterion for cross-model LLM comparison grounded in Value Sensitive Design.
- Building a Regional Data-Centric Materials Science Ecosystem for Processing-Rich Materials Innovation in the Great Plains
Proposes a regional data-centric materials science ecosystem for the Great Plains, identifying five barriers to data sharing and outlining a staged roadmap illustrated by a high-purity germanium pilot.
- AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval
AgriIR is a configurable RAG framework using modular stages and 1B-parameter models to deliver grounded, citable answers for Indian agricultural information access.
- LLMs in Qualitative Research: Opportunities, Limitations, and Practical Considerations
The paper outlines opportunities, limitations, and practical parameters for integrating LLMs into qualitative research while aligning with epistemological commitments like reflexivity and interpretive judgment.
- Causal state binding predicts action control in language agents