A simulation model of intermittently controlled point-and-click behaviour
11 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 11
polarities
background 4
citing papers explorer
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
- RoboLineage: Agent-Native Data Lifecycle Governance Across Robot Policy Iterations
  RoboLineage introduces an agent-native data lifecycle governance system that represents robot policy iteration steps as typed lineage artifacts to improve speed and auditability in real-robot workflows.
- Instrumented data for causal scientific machine learning
  Instrumented data augments observations with mechanistic models, uncertainty, and counterfactuals to enable causal interventions via Pearl's do-operator in scientific machine learning.
- What Would GPT Click: Practical Effects of Human-AI Behavioral Misalignment and the Cost of Synthetic Participants in User Experience
  GPT produces click distributions significantly different from real humans in 53% of UX first-click tasks, with prompting techniques like personas and chain-of-thought failing to improve alignment.
- Evaluating Structured Documentation as a Tool for Reflexivity in Dataset Development
  Structured dataset documentation shows little engagement with major reflexivity themes from FAccT literature, leading to a new codebook and extended datasheet questions.
- Exploring CoCo Challenges in ML Engineering Teams: Insights From the Semiconductor Industry
  Interviews in a semiconductor company reveal 16 collaboration and communication challenges in ML engineering teams, with unclear roles and responsibilities as the top issue, and list effective mitigation practices under hardware-driven constraints.
- Architecture-agnostic Lipschitz-constant Bayesian header and its application to resolve semantically proximal classification errors with vision transformers
  LipB-ViT adds bi-Lipschitz Bayesian layers to vision transformers and uses uncertainty-aware fusion to identify corrupted labels with over 93% recall at 15% noise, beating kNN baselines.
- Regimes of Scale in AI Meteorology
  AI/ML weather tools face integration challenges from mismatched 'regimes of scale' in how data and models are organized compared to traditional meteorology practices.
- The Paradox of Prioritization in Public Sector Algorithms
  Prioritization algorithms in public services generate relative disparities among intersectional groups as resources become scarce, intensifying perceptions of inequality.
- The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation
  A literature review concludes that pursuing consensus in data annotation creates biased AI by dismissing subjective disagreements and enforcing geographic hegemony, and proposes mapping diversity instead.
- Can Data Work be Reparative?
  Ethnographic study of feminist civic-tech data work argues reparative AI dataset production requires resetting accountability ties to center those harmed by current practices.