Beyond Individual Accountability: (Re-)Asserting Democratic Control of AI
Mixed citation behavior. Most common role is background (62%).
verdicts
UNVERDICTED 16 representative citing papers
citing papers explorer
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
GPTQ is equivalent to Babai's nearest plane algorithm for CVP on the Hessian lattice of layer inputs, yielding geometric interpretation, inherited error bounds, and improved clipping-free quantization with GPU kernels.
- The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining
Full development of 7B and 32B Olmo 3 models used 12.3 GWh datacenter energy and emitted 4,251 tCO2eq, with development overheads accounting for 82% of compute and reasoning models costing 17x more to post-train than instruction-tuned ones.
- Integrating Domain-Specialized Language Models with AI Measurement Tools for Deterministic Atomic-Resolution Experimentation
Domain-specialized small language models enable deterministic atomic-resolution scanning probe microscopy control with 99.3% command accuracy, lower computational cost, and better domain performance than larger general models.
- Towards Resource-Efficient LLMs: End-to-End Energy Accounting of Distillation Pipelines
An end-to-end energy measurement framework for LLM distillation pipelines reveals hidden teacher-side costs and yields selection guidelines plus an open-source harness.
- Push and Pushback in Contesting AI: Demands for and Resistance to Accountability
Thematic analysis of 43 AI contestation cases, using Bovens's relational accountability model, produces categories of demands from below, institutional pushback, outcomes, and contextual factors shaping effective contestation.
- KAIROS: Stateful, Context-Aware Power-Efficient Agentic Inference Serving
KAIROS reduces power by 27% on average (up to 39.8%) for agentic AI inference by using long-lived context to jointly manage GPU frequency, concurrency, and request routing across instances.
- Growing Pains: Extensible and Efficient LLM Benchmarking Via Fixed Parameter Calibration
A fixed-parameter multidimensional IRT calibration approach allows extending LLM benchmark suites over time, predicting full performance within 2-3 points and preserving rankings (Spearman ρ ≥ 0.9) using only 100 anchor questions per dataset.
- Babbling Suppression: Making LLMs Greener One Token at a Time
Babbling Suppression stops LLM code generation upon test passage to reduce token output and energy consumption by up to 65% across Python and Java benchmarks.
- Designing Around Stigma: Human-Centered LLMs for Menstrual Health
Researchers created a stigma-aware WhatsApp chatbot for menstrual health education in Pakistan through co-design workshops and a two-week deployment, yielding insights on its use for challenging taboos alongside tensions around trust and cultural explanations.
- The Energy Cost of Execution-Idle in GPU Clusters
Execution-idle accounts for 19.7% of GPU execution time and 10.7% of energy in a large cluster, motivating power management that treats it as a distinct operating state.
- GreenZ: A Sustainable UX Framework for Complex Digital Systems
GreenZ is a conceptual three-layer sustainable UX framework built on ten principles, five operational systems, and practical tools, centered on an eight-type Digital Waste Taxonomy and a model questioning AI necessity before implementation.
- Position: LLM Inference Should Be Evaluated as Energy-to-Token Production
LLM inference should be reframed and evaluated as energy-to-token production with a Token Production Function that accounts for power, cooling, and efficiency ceilings.
- Reckoning with the Political Economy of AI: Avoiding Decoys in Pursuit of Accountability
AI accountability efforts are undermined by five decoys that create illusions of progress while co-constituting the extractive political economy of the AI Project.
- AI of the People, by the People, for the People: A Social Choice Approach to Collective Control of Artificial Intelligence
Proposes applying social choice theory as a modeling language and axiomatic tool for incorporating collective input across the ML development pipeline.
- Context Collapse: Barriers to Adoption for Generative AI in Workplace Settings
Expert interviews demonstrate that context in generative AI workplace use collapses or rots over time, limiting tool effectiveness and revealing pitfalls in computational context approaches.
- LLMs in Qualitative Research: Opportunities, Limitations, and Practical Considerations
The paper outlines opportunities, limitations, and practical parameters for integrating LLMs into qualitative research while aligning with epistemological commitments like reflexivity and interpretive judgment.