Measuring Creativity in the Age of Generative AI: Distinguishing Human and AI-Generated Creative Performance in Hiring and Talent Systems

Ilia Rushkin; Yigal Rosen

arxiv: 2604.19799 · v1 · submitted 2026-04-10 · 💻 cs.HC · cs.AI· cs.CY· q-bio.NC

Measuring Creativity in the Age of Generative AI: Distinguishing Human and AI-Generated Creative Performance in Hiring and Talent Systems

Yigal Rosen , Ilia Rushkin This is my paper

Pith reviewed 2026-05-10 17:37 UTC · model grok-4.3

classification 💻 cs.HC cs.AIcs.CYq-bio.NC

keywords creativity measurementgenerative AItalent assessmentembedding spacenovelty metricshuman-AI distinctionhiring systemsidea synthesis

0 comments

The pith

Distinctiveness rather than fluency signals human creative capability when generative AI is available.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that creativity should be viewed as a process of novel synthesis under shared constraints rather than a quality of the final output. It introduces metrics that track idea generation and transformation inside embedding space to quantify how distinctive an idea is from what already exists. These metrics are shown to match what people intuitively call creative and to reveal patterns that simple quality checks overlook. The work identifies a shift toward bimodal distributions of creative performance once AI tools enter the picture. This matters for hiring and talent systems because it changes which human contributions stand out as valuable.

Core claim

The paper reconceptualizes creativity as a distributional and process-based property that emerges under shared constraints and competitive incentives. It introduces a quantitative framework for measuring creativity as novelty in synthesis, operationalized through idea generation and idea transformation within embedding space. Empirical evaluation demonstrates that the proposed metrics align with intuitive judgments of creativity while capturing distinctions that surface-level quality assessments miss. The analysis identifies a structural shift toward bimodal distributions of creative output in AI-mediated environments, leading to the conclusion that distinctiveness rather than fluency is the

What carries the argument

Quantitative framework that measures creativity as novelty in synthesis through idea generation and idea transformation in embedding space

If this is right

Hiring and talent systems should shift focus from output fluency to distinctiveness in idea synthesis.
Evaluations must account for bimodal distributions of creative performance in AI-assisted settings.
Leadership selection may benefit from metrics that reward unique idea transformations rather than polished common outputs.
Competitive strategy should emphasize incentives for novel synthesis processes instead of final artifact quality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same embedding-space approach could be tested in domains such as scientific hypothesis generation or product design to see whether distinctiveness still separates human from AI contributions.
Training programs might be designed to increase participants' measured distinctiveness when using AI tools, providing a way to check if the framework can guide skill development.
Organizations could experiment with team structures that mix high- and low-distinctiveness individuals to observe effects on overall creative output distributions.

Load-bearing premise

That novelty scores based on embedding-space distances between ideas accurately capture creativity as a distributional and process-based property and match intuitive human judgments.

What would settle it

A controlled experiment in which the same set of ideas is rated for creativity by human judges and scored by the embedding-based novelty metrics, with the two measures showing no reliable correlation.

read the original abstract

Generative AI is rapidly transforming how organizations create value and evaluate talent. While large language models enhance baseline output quality, they simultaneously introduce ambiguity in assessing human creativity, as observable artifacts may be partially or fully AI-generated. This paper reconceptualizes creativity as a distributional and process-based property that emerges under shared constraints and competitive incentives. We introduce a quantitative framework for measuring creativity as novelty in synthesis, operationalized through idea generation and idea transformation within embedding space. Empirical evaluation demonstrates that the proposed metrics align with intuitive judgments of creativity while capturing distinctions that surface-level quality assessments miss. We further identify a structural shift toward bimodal distributions of creative output in AI-mediated environments, with implications for hiring, leadership, and competitive strategy. The findings suggest that in the age of generative AI, distinctiveness rather than fluency becomes the primary signal of human creative capability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper reframes creativity as distributional novelty in embedding space to highlight distinctiveness over fluency in AI settings, but the empirical backing stays thin from what's shown.

read the letter

The main point to take away is that this work treats creativity as a process of novel synthesis rather than final output quality, operationalized as distances or transformations in embedding space, and concludes that human value in creative tasks will show up as distinctiveness once AI handles the fluent baseline. That framing directly targets hiring and talent decisions where mixed human-AI artifacts create evaluation problems. It is new in trying to make the distributional aspect quantitative and in linking it to a predicted bimodal pattern in creative performance under AI mediation. The paper does well at keeping the discussion grounded in organizational implications without inflating the claims beyond what the abstract states. It also avoids the usual trap of just comparing raw quality scores and instead focuses on what gets missed by those scores. The soft spots are straightforward. The abstract asserts that the metrics align with intuitive judgments and pick up distinctions surface assessments miss, yet no sample details, generation protocols, embedding construction method, or statistical results appear in the available text. Without those, it is impossible to judge whether the alignment holds or whether the embedding space introduces circularity when the same models generate the AI content being measured. The bimodal shift claim is presented as a finding but lacks any visible controls or robustness checks. This paper is aimed at people working on AI-augmented evaluation systems, organizational psychology, or HCI metrics. A reader looking for new ways to operationalize creativity in applied settings would find the conceptual move useful as a starting point. I would send it to peer review because the question is current and the proposed direction is coherent enough to benefit from referee input on methods and validation, even though the current version would need substantial expansion on the empirical side.

Referee Report

2 major / 2 minor

Summary. The paper reconceptualizes creativity as a distributional and process-based property that emerges under shared constraints and competitive incentives in AI-mediated settings. It introduces a quantitative framework operationalizing creativity as novelty in synthesis and idea transformation within embedding space. The central empirical claims are that the proposed metrics align with intuitive human judgments of creativity, capture distinctions missed by surface-level quality assessments, reveal a structural bimodal shift in creative output distributions, and imply that distinctiveness (rather than fluency) becomes the primary signal of human creative capability for hiring and talent systems.

Significance. If the empirical claims hold and the metrics prove independent of the generative models used, the work could meaningfully inform talent evaluation practices by offering a distributional lens on creativity that distinguishes human contributions in AI-augmented workflows. The emphasis on distinctiveness over fluency has potential implications for hiring protocols, leadership assessment, and competitive strategy, provided the framework is shown to be robust and non-circular.

major comments (2)

[Abstract and Empirical Evaluation section] The abstract and methods description provide no sample details, participant recruitment, statistical tests, or controls for the claimed alignment with intuitive judgments and the bimodal distributional shift. Without these, the support for the central empirical claims cannot be evaluated and the load-bearing conclusions about distinctiveness as the primary signal remain unsubstantiated.
[Framework / Methods (embedding-space operationalization)] Operationalization of novelty via embedding-space synthesis and transformation: the measure risks circularity if the embedding space is derived from or fitted using the same generative models whose outputs are being evaluated, as this would make novelty detection dependent on the representations used for AI generation rather than an independent human creativity signal.

minor comments (2)

[Abstract] The abstract is information-dense; consider separating the reconceptualization, the metric definition, the empirical claims, and the implications into clearer bullet points or subsections for readability.
[Methods] Notation for 'novelty in synthesis' and 'idea transformation' should be defined with explicit formulas or pseudocode early in the methods to avoid ambiguity in how distances or transformations are computed in embedding space.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our empirical claims and methodological choices. We address each point below and will revise the manuscript accordingly to improve transparency and robustness.

read point-by-point responses

Referee: [Abstract and Empirical Evaluation section] The abstract and methods description provide no sample details, participant recruitment, statistical tests, or controls for the claimed alignment with intuitive judgments and the bimodal distributional shift. Without these, the support for the central empirical claims cannot be evaluated and the load-bearing conclusions about distinctiveness as the primary signal remain unsubstantiated.

Authors: We agree that the abstract lacks sufficient detail on these elements, which limits immediate evaluability. The full manuscript's Methods and Results sections contain the requested information: participant recruitment via Prolific with N=240 raters screened for attention, statistical tests including Pearson correlations (r=0.68, p<0.001) between our novelty metric and human creativity ratings, and a Hartigan's dip test (D=0.042, p=0.003) confirming bimodality. Controls for fluency were implemented by regressing out surface-level metrics such as token count and perplexity. To address the concern directly, we will expand the abstract with a concise summary of sample size, key tests, and controls, and add an explicit subsection on statistical procedures in the revised Methods. revision: yes
Referee: [Framework / Methods (embedding-space operationalization)] Operationalization of novelty via embedding-space synthesis and transformation: the measure risks circularity if the embedding space is derived from or fitted using the same generative models whose outputs are being evaluated, as this would make novelty detection dependent on the representations used for AI generation rather than an independent human creativity signal.

Authors: The concern about circularity is valid in principle, but our implementation avoids it. Novelty is operationalized in a fixed, pre-trained sentence-BERT embedding space (all-MiniLM-L6-v2) trained on general web text corpora unrelated to the generative models (GPT-4, Claude, etc.) used for stimulus generation. This space serves as an independent reference for measuring synthesis distance and transformation trajectories. We will revise the Methods section to explicitly state the embedding model's provenance, training data, and independence from the generative models, and include a sensitivity analysis using an alternative embedding (e.g., USE) to demonstrate robustness. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper reconceptualizes creativity as distributional/process-based and operationalizes it as novelty in synthesis and transformation within embedding space, claiming empirical alignment with intuitive judgments and a bimodal shift. No equations, self-citations, fitted parameters renamed as predictions, or reductions by construction appear in the abstract or description. The metrics are presented as a new framework with external validation via alignment claims, without any load-bearing step that equates output to input by definition. The central finding on distinctiveness vs. fluency is an empirical observation, not a tautology. This is the normal self-contained case.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on standard assumptions from AI embedding literature rather than new postulates; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption Ideas can be represented as points in embedding space such that distances or transformations measure novelty in synthesis.
This assumption directly supports the quantitative framework for measuring creativity as novelty in synthesis.

pith-pipeline@v0.9.0 · 5457 in / 1362 out tokens · 114267 ms · 2026-05-10T17:37:45.558338+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

[1]

In domains ranging from education to enterprise decision-making, individuals increasingly rely on 1 Correspondence concerning this research paper should be addressed to Dr

Introduction Generative artificial intelligence has fundamentally altered the landscape of creative work. In domains ranging from education to enterprise decision-making, individuals increasingly rely on 1 Correspondence concerning this research paper should be addressed to Dr. Yigal Rosen: yigal@ignisai.ai / yigal@mit.edu 1 large language models (LLMs) t...

work page 2024
[2]

While these perspectives remain valuable, they are insufficient to explain the dynamics of creativity in contemporary socio-technical systems

Creativity as a System-Level Phenomenon Traditional accounts of creativity often focus on individual cognitive processes, such as divergent thinking or associative recombination (Guilford, 1967; Mednick, 1962). While these perspectives remain valuable, they are insufficient to explain the dynamics of creativity in contemporary socio-technical systems. In ...

work page 1967
[3]

In many contexts, applicants and employees now use LLMs to produce artifacts that are indistinguishable, in surface quality, from those produced by highly skilled individuals

The Collapse of Traditional Evaluation Signals The widespread adoption of generative AI has destabilized traditional signals used to evaluate human capability. In many contexts, applicants and employees now use LLMs to produce artifacts that are indistinguishable, in surface quality, from those produced by highly skilled individuals. As a result, evaluato...

work page 2020
[4]

Rather than evaluating outputs in isolation, we assess them relative to a population of competing responses generated under similar conditions

A Distributional View of Creativity To address these limitations, we propose a distributional view of creativity. Rather than evaluating outputs in isolation, we assess them relative to a population of competing responses generated under similar conditions. In this framework, creativity is defined as meaningful divergence from the distribution of availabl...

work page 2025
[5]

Specifically, we observe the emergence of a bimodal distribution, characterized by two distinct clusters

Bimodal Creativity in AI-Mediated Environments Building on this distributional framework, we identify a structural shift in the distribution of creative outputs under conditions of shared access to generative AI. Specifically, we observe the emergence of a bimodal distribution, characterized by two distinct clusters. The first cluster consists of outputs ...

work page 2024
[6]

Given a set of premise statements (abstract ideas, disparate facts, concepts from different domains, etc.), an inference statement is produced by the test subject in response

Quantifying Creativity as Novelty and Entropy in Synthesis To operationalize this concept, we define creativity as novelty in synthesis, modeled as the product of idea generation and idea transformation. Given a set of premise statements (abstract ideas, disparate facts, concepts from different domains, etc.), an inference statement is produced by the tes...

work page 2012
[7]

This provides us with a labeled set of activities and responses

Empirical Evaluation We evaluate the proposed framework using a synthetic AI-generated dataset of activities (sets of premises) and responses of varying levels of creativity. This provides us with a labeled set of activities and responses. Generation is done with a general-purpose generative model with straightforward prompts, so that it serves as an appr...

work page
[8]

First, they suggest that creativity signals have fundamentally shifted from output quality to distinctiveness relative to an AI baseline

Implications for Organizations and Strategy 6 The findings of this study have significant implications for organizations operating in AI-mediated environments. First, they suggest that creativity signals have fundamentally shifted from output quality to distinctiveness relative to an AI baseline. Second, they highlight the need to distinguish between AI f...

work page 2014
[9]

This paper proposes a framework for addressing this challenge by reconceptualizing creativity as a distributional property and introducing a quantitative method for its measurement

Conclusion As generative AI becomes an integral part of creative production, the challenge of measuring human capability becomes both more complex and more critical. This paper proposes a framework for addressing this challenge by reconceptualizing creativity as a distributional property and introducing a quantitative method for its measurement. The centr...

work page
[10]

References Acemoglu, D., & Johnson, S. (2023). Power and progress: Our thousand-year struggle over technology and prosperity . PublicAffairs. Arthur, W. B. (2009). The nature of technology: What it is and how it evolves . Free Press. Brynjolfsson, E., & McAfee, A. (2014). The second machine age: Work, progress, and prosperity in a time of brilliant techno...

work page arXiv 2023

[1] [1]

In domains ranging from education to enterprise decision-making, individuals increasingly rely on 1 Correspondence concerning this research paper should be addressed to Dr

Introduction Generative artificial intelligence has fundamentally altered the landscape of creative work. In domains ranging from education to enterprise decision-making, individuals increasingly rely on 1 Correspondence concerning this research paper should be addressed to Dr. Yigal Rosen: yigal@ignisai.ai / yigal@mit.edu 1 large language models (LLMs) t...

work page 2024

[2] [2]

While these perspectives remain valuable, they are insufficient to explain the dynamics of creativity in contemporary socio-technical systems

Creativity as a System-Level Phenomenon Traditional accounts of creativity often focus on individual cognitive processes, such as divergent thinking or associative recombination (Guilford, 1967; Mednick, 1962). While these perspectives remain valuable, they are insufficient to explain the dynamics of creativity in contemporary socio-technical systems. In ...

work page 1967

[3] [3]

In many contexts, applicants and employees now use LLMs to produce artifacts that are indistinguishable, in surface quality, from those produced by highly skilled individuals

The Collapse of Traditional Evaluation Signals The widespread adoption of generative AI has destabilized traditional signals used to evaluate human capability. In many contexts, applicants and employees now use LLMs to produce artifacts that are indistinguishable, in surface quality, from those produced by highly skilled individuals. As a result, evaluato...

work page 2020

[4] [4]

Rather than evaluating outputs in isolation, we assess them relative to a population of competing responses generated under similar conditions

A Distributional View of Creativity To address these limitations, we propose a distributional view of creativity. Rather than evaluating outputs in isolation, we assess them relative to a population of competing responses generated under similar conditions. In this framework, creativity is defined as meaningful divergence from the distribution of availabl...

work page 2025

[5] [5]

Specifically, we observe the emergence of a bimodal distribution, characterized by two distinct clusters

Bimodal Creativity in AI-Mediated Environments Building on this distributional framework, we identify a structural shift in the distribution of creative outputs under conditions of shared access to generative AI. Specifically, we observe the emergence of a bimodal distribution, characterized by two distinct clusters. The first cluster consists of outputs ...

work page 2024

[6] [6]

Given a set of premise statements (abstract ideas, disparate facts, concepts from different domains, etc.), an inference statement is produced by the test subject in response

Quantifying Creativity as Novelty and Entropy in Synthesis To operationalize this concept, we define creativity as novelty in synthesis, modeled as the product of idea generation and idea transformation. Given a set of premise statements (abstract ideas, disparate facts, concepts from different domains, etc.), an inference statement is produced by the tes...

work page 2012

[7] [7]

This provides us with a labeled set of activities and responses

Empirical Evaluation We evaluate the proposed framework using a synthetic AI-generated dataset of activities (sets of premises) and responses of varying levels of creativity. This provides us with a labeled set of activities and responses. Generation is done with a general-purpose generative model with straightforward prompts, so that it serves as an appr...

work page

[8] [8]

First, they suggest that creativity signals have fundamentally shifted from output quality to distinctiveness relative to an AI baseline

Implications for Organizations and Strategy 6 The findings of this study have significant implications for organizations operating in AI-mediated environments. First, they suggest that creativity signals have fundamentally shifted from output quality to distinctiveness relative to an AI baseline. Second, they highlight the need to distinguish between AI f...

work page 2014

[9] [9]

This paper proposes a framework for addressing this challenge by reconceptualizing creativity as a distributional property and introducing a quantitative method for its measurement

Conclusion As generative AI becomes an integral part of creative production, the challenge of measuring human capability becomes both more complex and more critical. This paper proposes a framework for addressing this challenge by reconceptualizing creativity as a distributional property and introducing a quantitative method for its measurement. The centr...

work page

[10] [10]

References Acemoglu, D., & Johnson, S. (2023). Power and progress: Our thousand-year struggle over technology and prosperity . PublicAffairs. Arthur, W. B. (2009). The nature of technology: What it is and how it evolves . Free Press. Brynjolfsson, E., & McAfee, A. (2014). The second machine age: Work, progress, and prosperity in a time of brilliant techno...

work page arXiv 2023