CHIMERA: A Knowledge Base of Scientific Idea Recombinations for Research Analysis and Ideation
Pith reviewed 2026-05-19 13:54 UTC · model grok-4.3
The pith
CHIMERA is a large-scale knowledge base that automatically extracts examples of scientific idea recombinations from research papers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce CHIMERA, the first large-scale knowledge base of recombination examples automatically mined from the scientific literature, which enables empirical analysis of how scientists recombine concepts and draw inspiration from different areas, and training of models that propose cross-disciplinary research directions.
What carries the argument
LLM-based extraction model fine-tuned on an expert-annotated dataset to identify recombination instances in papers
If this is right
- Recombination patterns across AI subfields can be systematically analyzed using the KB.
- A hypothesis generation model trained on the KB proposes research directions that researchers rate as inspiring.
- The extraction approach generalizes to domains beyond AI, such as biology.
- Empirical studies of inspiration and cross-disciplinary idea integration become possible at scale.
Where Pith is reading between the lines
- Such knowledge bases could help identify promising but underexplored combinations of ideas for future research.
- Training data from recombinations might improve AI systems for scientific discovery beyond current methods.
- Tracking changes in recombination patterns over time could reveal shifts in research focus.
Load-bearing premise
The fine-tuned LLM model correctly identifies genuine recombination instances in papers without substantial false positives, false negatives, or biases.
What would settle it
A large-scale manual review by domain experts of extracted instances showing many are not true recombinations or many real ones are missed would disprove the utility of the KB.
Figures
read the original abstract
A hallmark of human innovation is recombination -- the creation of novel ideas by integrating elements from existing concepts and mechanisms. In this work, we introduce CHIMERA, the first large-scale Knowledge Base (KB) of recombination examples automatically mined from the scientific literature. CHIMERA enables empirical analysis of how scientists recombine concepts and draw inspiration from different areas, and enables training models that propose cross-disciplinary research directions. To construct this KB, we define a new information extraction task: identifying recombination instances in papers. We curate an expert-annotated dataset and use it to fine-tune an LLM-based extraction model, which we apply to a broad corpus of AI papers. We also demonstrate generalization to a biological domain. We showcase the utility of CHIMERA through two applications. First, we analyze patterns of recombination across AI subfields. Second, we train a scientific hypothesis generation model using the KB, showing that it can propose directions that researchers rate as inspiring.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CHIMERA, the first large-scale knowledge base of scientific idea recombinations automatically mined from the literature. It defines a new information extraction task for identifying recombinations in papers, curates an expert-annotated dataset to fine-tune an LLM-based extractor, applies the model to a broad corpus of AI papers, demonstrates generalization to biology, and showcases utility via (1) analysis of recombination patterns across AI subfields and (2) training a hypothesis generation model whose outputs researchers rate as inspiring.
Significance. If the KB construction is reliable, CHIMERA could enable new quantitative studies of how scientists recombine ideas across domains and support improved AI systems for research ideation. The two applications provide concrete demonstrations of downstream use, including human ratings of generated hypotheses as an external validity check. However, the absence of reported quantitative metrics on annotation quality and extraction accuracy makes it difficult to gauge the robustness of these contributions.
major comments (1)
- [Methods / Extraction Model Evaluation] The description of the expert annotation process and the subsequent fine-tuning of the LLM extractor (in the methods and evaluation sections) does not report inter-annotator agreement statistics or held-out precision, recall, or F1 scores for the extraction model. Because the central claims about KB utility rest on the extractor correctly and comprehensively identifying genuine recombinations without substantial false positives, false negatives, or domain biases, these metrics are load-bearing for interpreting both the pattern analysis and the hypothesis-generation results.
minor comments (2)
- [Abstract] The abstract would be strengthened by including at least one quantitative indicator of corpus scale or extraction performance to give readers an immediate sense of the resource size and reliability.
- [Introduction / Task Definition] Notation for recombination instances and the precise definition of the IE task could be clarified with a short example table early in the paper to aid readers unfamiliar with the new task formulation.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which highlights an important aspect of our methods reporting. We address the single major comment below and have revised the manuscript to incorporate the requested quantitative metrics.
read point-by-point responses
-
Referee: [Methods / Extraction Model Evaluation] The description of the expert annotation process and the subsequent fine-tuning of the LLM extractor (in the methods and evaluation sections) does not report inter-annotator agreement statistics or held-out precision, recall, or F1 scores for the extraction model. Because the central claims about KB utility rest on the extractor correctly and comprehensively identifying genuine recombinations without substantial false positives, false negatives, or domain biases, these metrics are load-bearing for interpreting both the pattern analysis and the hypothesis-generation results.
Authors: We agree that inter-annotator agreement and held-out extraction metrics are necessary to substantiate the reliability of the mined KB. In the revised manuscript we have expanded the Methods section to describe the expert annotation protocol in greater detail and now report inter-annotator agreement statistics (pairwise agreement and Fleiss’ kappa) computed on the double-annotated subset. We have also added a dedicated evaluation subsection that presents precision, recall, and F1 scores on a held-out test set of expert-annotated recombinations, together with error analysis that addresses potential domain biases. These additions directly support the validity of the downstream pattern analysis and hypothesis-generation experiments. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper constructs CHIMERA via LLM-based extraction from external AI and biology literature after fine-tuning on an expert-annotated dataset. Downstream applications consist of empirical pattern analysis across subfields and training a hypothesis-generation model whose outputs are assessed via independent human researcher ratings for inspiration. These steps operate on external corpus data and external human judgments rather than any fitted parameter, self-referential definition, or self-citation chain that reduces the reported results back to the inputs by construction. No equations, uniqueness theorems, or ansatzes are invoked that would create circularity.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We define a new information extraction task: identifying recombination instances in papers. We curate an expert-annotated dataset and use it to fine-tune an LLM-based extraction model
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
CHIMERA contains over 28K recombinations... analyze patterns of recombination across AI subfields
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Joel Chan, Joseph Chee Chang, Tom Hope, Dafna Sha- haf, and Aniket Kittur
The cambridge handbook of creativity. Joel Chan, Joseph Chee Chang, Tom Hope, Dafna Sha- haf, and Aniket Kittur. 2018. Solvent.Proceedings of the ACM on Human-Computer Interaction, 2:1 – 21. Joel Chan, Katherine Fu, Christian Schunn, Jonathan Cagan, Kristin Wood, and Kenneth Kotovsky. 2011. On the benefits and pitfalls of analogies for innovative design: ...
work page 2018
-
[2]
10 DaEun Choi, Sumin Hong, Jeongeon Park, John Joon Young Chung, and Juho Kim
Visiblends: A flexible workflow for visual blends.Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 10 DaEun Choi, Sumin Hong, Jeongeon Park, John Joon Young Chung, and Juho Kim. 2023. Creative- connect: Supporting reference recombination for graphic design ideation with generative ai.Proceed- ings of the CHI Conference on Huma...
-
[3]
LoRA: Low-Rank Adaptation of Large Language Models
Iter: Iterative transformer-based entity recog- nition and relation extraction. InConference on Em- pirical Methods in Natural Language Processing. Keith J. Holyoak and Paul Thagard. 1994. Mental leaps: Analogy in creative thought. Tom Hope, Joel Chan, Aniket Kittur, and Dafna Sha- haf. 2017. Accelerating innovation through analogy mining. InProceedings o...
work page internal anchor Pith review Pith/arXiv arXiv 1994
-
[4]
Constraint relaxation and chunk decomposi- tion in insight problem solving.Journal of Experi- mental Psychology: Learning, Memory, and Cogni- tion, 25(6):1534–1555. 00691. Mario Krenn, Lorenzo Buffoni, Bruno Coutinho, Sagi Eppel, Jacob Gates Foster, Andrew Gritsevskiy, Har- lin Lee, Yichao Lu, João P. Moutinho, Nima Sanjabi, Rishi Sonthalia, Ngoc M. Tran,...
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[5]
00117. 11 Céline McKeown. 2014. The cognitive science of sci- ence: explanation, discovery, and conceptual change. Ergonomics, 57:632 – 633. Yan Meng, Liangming Pan, Yixin Cao, and Min-Yen Kan. 2023. Followupqg: Towards information- seeking follow-up question generation. InInterna- tional Joint Conference on Natural Language Pro- cessing. Aakanksha Naik, ...
work page internal anchor Pith review arXiv 2014
-
[6]
C-Pack: Packed Resources For General Chinese Embeddings
Scimon: Scientific inspiration machines opti- mized for novelty.ACL. Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, and 1 others. 2022. Chain-of-thought prompting elic- its reasoning in large language models.Advances in Neural Information Processing Systems, 35:24824– 24837. Shitao Xiao, Zheng Liu, Peitian ...
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[7]
Massw: A new dataset and benchmark tasks for ai-assisted scientific workflows.arXiv preprint arXiv:2406.06357. Zexuan Zhong and Danqi Chen. 2021. A frustratingly easy approach for entity and relation extraction. In North American Chapter of the Association for Com- putational Linguistics. Appendix Contents A Annotator Agreement. . . . . . . . . . . . . . ...
-
[8]
Boundary disagreements, where annota- tors selected different spans with overlapping meaning. Here, the expert favored the span that preserved more context (e.g., "reinforce- ment learning which uses traditional time se- 13 Annotators’ Disagreement Examples Abstract: ". . . This research proposed a framework based on Long Short-Term Memory (LSTM) deep lea...
-
[9]
These were resolved through further discus- sion and clarification
Conceptual disagreements, where annota- tors identified fundamentally different entities. These were resolved through further discus- sion and clarification. B Additional Extraction Details B.1 Recombination keywords We use keyword-based filtering to identify works that are more likely to discuss recombination before assigning papers to human annotators. ...
work page 2021
-
[10]
inspiration-source: A concept, idea, problem, approach, or domain the authors drew inspiration from
First, familiarize yourself with the possible entity types for recombinations: <entity_types> combination-element: An idea, method, model, technique, or approach combined in the text with other elements. inspiration-source: A concept, idea, problem, approach, or domain the authors drew inspiration from. inspiration-target: A concept, idea, problem, approa...
-
[20]
Now, provide your final output in the specified JSON format. Ensure that the output is a valid JSON string. If the output is empty, return {}. Place your answer within <answer> tags. Remember to carefully analyze the abstract and only identify a recombination if it is clearly present and central to the work described. Figure 7: E2E extraction prompt. {TEX...
work page 2021
-
[21]
inspiration-src: A concept, idea, problem, approach, or domain the authors drew inspiration from
First, familiarize yourself with the possible entity types for recombinations: <entity_types> comb-element: An idea, method, model, technique, or approach combined in the text with other elements. inspiration-src: A concept, idea, problem, approach, or domain the authors drew inspiration from. inspiration-target: A concept, idea, problem, approach, or dom...
-
[22]
Review the following examples to understand the expected output format and the process of identifying recombinations: <examples>{EXAMPLES}</examples>
-
[23]
Now, carefully read the following scientific abstract: <abstract>{TEXT}</abstract>
-
[24]
Your task is to extract the most salient recombination from this abstract. A recombination can be either: a) Combination: The authors combine two or more ideas, methods, models, techniques, or approaches to obtain a certain goal. b) Inspiration: The authors draw inspiration or similarities from one concept, idea, problem, approach, or domain and implement...
-
[25]
After identifying the recombination, you will format it as a JSON string in the following structure: <recombination>{recombination_type: {entity_type_1: [ent_1, ent_2], entity_type_2: [ent_3],...}}</recombination> If you don't think the text discusses a recombination, or that the recombination is not a central part of the work, return an empty JSON object: {}
-
[26]
Before providing your final answer, use the following scratchpad to think through the process: <scratchpad>
-
[27]
Identify the main ideas, methods, or approaches discussed in the abstract
-
[28]
Determine if there is a clear combination of ideas or if one idea inspired the application in another domain
-
[29]
Identify the specific entities involved in the recombination
-
[30]
Classify the entities according to the provided entity types
-
[31]
Determine the recombination type (combination or inspiration). </scratchpad>
-
[32]
Ensure that the output is a valid JSON string
Now, provide your final output in the specified JSON format. Ensure that the output is a valid JSON string. If the output is empty, return {}. Place your answer within <recombination> tags. Remember to carefully analyze the abstract and only identify a recombination if it is clearly present and central to the work described. Figure 9: E2E ICL prompt. {TEX...
-
[33]
comb-element: An idea, method, model, technique, or approach combined in the text with other elements
-
[34]
inspiration-src: A concept, idea, problem, approach, or domain the authors drew inspiration from
-
[35]
inspiration-target: A concept, idea, problem, approach, or domain in which the authors utilize the inspiration they drew from the inspiration source. Here is the text you need to analyze: <text>{TEXT}</text> Please read the text carefully and identify all entities that belong to the types listed above. Pay close attention to the context and relationships ...
work page 2021
-
[36]
We use it in the extraction evaluation process as discussed in Section 4.1. To mitigate position bias, we query the model twice per pair with re- versed orderings, accepting a match only if both judgments are positive. We prefer GPT-4o-mini over GPT-4o based on a comparison which found only3disagreements across the test set. You are tasked with comparing ...
-
[37]
First, read the full text for context: <full_text>{TEXT}</full_text>
-
[38]
Now, consider these two spans extracted from the text above: <span1>{SPAN1}</span1> <span2>{SPAN2}</span2>
-
[39]
The idea the spans discuss should be exactly the same, up to minor lexical or semantic variations
Your task is to carefully analyze these two spans and determine if they discuss the same {ENTITY_TYPE}. The idea the spans discuss should be exactly the same, up to minor lexical or semantic variations
-
[40]
The main topic or idea presented in each span b
In your analysis, consider the following: a. The main topic or idea presented in each span b. The context in which these spans appear in the full text c. Any potential contradictions between the spans
-
[41]
After your analysis, provide a justification for your determination. Explain your reasoning clearly, referencing specific elements from the spans and the full text if necessary
- [42]
-
[43]
Present your response in the following format: <justification>[Your detailed justification here]</justification> <answer>[Your "Yes" or "No" answer here]</answer> Figure 11: Span similarity prompt. {ENTITY_TYPE} is either "combination-element", "inspiration-source" or "inspiration-target". {TEXT} is a placeholder for the paper’s abstract. {SPAN1}, {SPAN2}...
work page 2019
-
[45]
Identify the scientific branch from the provided branches list. If there's insufficient information, use "insufficient-info". If no branch name in the list describes the source properly, use "other". Return your output in the following format: <output> [{"text": "element1", "arxiv_category": "category1", "scientific_branch": "branch1"}, {"text": "element2...
-
[46]
other". If there's insufficient information, use
Identify the best matching arXiv taxonomy category from the provided list. If it doesn't match any category, use "other". If there's insufficient information, use "insufficient-info"
-
[47]
Identify the scientific branch from the provided branches list. If there's insufficient information, use "insufficient-info". If no branch name in the list describes the source properly, use "other". Repeat the same process for the inspiration target. Provide your analysis in the following format: <source-branch>[Insert the scientific branch of the inspir...
-
[48]
Abbreviation Expansion: We standardize en- tities using (a) Regex-based cleaning (e.g., “Chain of Thought (CoT)” → “Chain of Thought”) and (b) Scispacy’s abbreviation ex- pansion mechanism 8, which resolves short- forms to their long-form versions using the document context (e.g., “CoT” → “Chain of Thought”)
-
[49]
the snap-through action of a steel hairclip
Agglomerative Clustering: We group entities by performing agglomerative clustering with cosine similarity on all-mpnet-base-v2 embeddings, us- ing average linkage and a distance threshold of0.05. 8https://github.com/allenai/scispacy# abbreviationdetector C.3 Entity Postprocessing After the initial large-scale extraction, we ap- ply an LLM-based postproces...
-
[50]
Large language models (LLMs) commonly employ autoregressive generation during inference, leading to high memory bandwidth demand and consequently extended latency
-
[51]
However, existing methods suffer from slow rendering speed, greatly limiting their practical use
Reconstructing deformable tissues from endoscopic videos is essential in many downstream surgical applications. However, existing methods suffer from slow rendering speed, greatly limiting their practical use
-
[52]
Many industrial tasks-such as sanding, installing fasteners, and wire harnessing-are difficult to automate due to task complexity and variability
-
[53]
Multi-legged robots offer enhanced stability in complex terrains, yet autonomously learning natural and robust motions in such environments remains challenging. Now, consider this methodology statement: <methodology_statement>{{METHODOLOGY_STATEMENT}}</methodology_statement> To complete this task, follow these steps:
-
[54]
Analyze the abstract thoroughly, focusing on: - The context or reasons that justify the methodology choice - Any challenges, limitations, or research needs the methodology addresses - Mentions of previous research or knowledge gaps that the methodology aims to target
-
[55]
- Use exclusively the information from the abstract
When formulating your response: - Phrase your response as a general 1-2 sentence description of a challenge, limitation research needs, etc. - Use exclusively the information from the abstract. Do not incorporate external knowledge or assumptions. - Minimize including information from the methodology statement in your answer. - Do not include information ...
-
[56]
Combine <source-entity> and <target-entity>
Format your response as follows: <background> [1-2 background sentences] </background> Remember to base your response strictly on the provided abstract and statement. Do not include additional information or assumptions. Figure 16: Context extraction prompt. {{ABSTRACT}} is a placeholder for the input abstract. {{METHODOL- OGY_STATEMENT}} is a sentence de...
work page 2091
-
[57]
Read the following query: <query>{{QUERY}}</query>
-
[58]
Now, read the corresponding answer: <answer>{{ANSWER}}</answer>
-
[59]
Analyze the query for any information that might disclose the answer. Look for words, phrases, or implications in the query that directly relate or reveal information from the answer
-
[60]
Write your analysis in the following format: <analysis> [If you identified a leakage, briefly explain what information from the answer is included in the query. If you did not identify a leakage, write "no leakage".] </analysis>
-
[61]
Based on your analysis, determine if there is a leakage
-
[62]
yes" if there is a leakage, or
Provide your response in the following format: <leakage> [Write "yes" if there is a leakage, or "no" if there is no leakage. Do not include any additional explanation or reasoning.] </leakage> Remember, your task is to identify leakages, not to answer the query or explain your reasoning. Stick strictly to the output format provided. Figure 17: Leak detect...
work page 2023
-
[63]
the inherent human attribute of engaging in logical rea- soning to facilitate decision-making
-
[64]
principles of rational decision-making
-
[65]
the Level-K framework from game theory and behavioral economics, which extends reasoning from simple reactions to structured strategic depth ... Post-reranking (top-20)
-
[66]
the Level-K framework from game theory and behavioral economics, which extends reasoning from simple reactions to structured strategic depth
-
[67]
Bayesian inference: conditioning a prior on evidence
-
[68]
What would be a good source of inspiration forData-driven storytelling?" Pre-reranking (top-20)
principles of rational decision-making Query: "...while Large Language Models (LLMs) excel in various NLP tasks, their ability to generate comprehensive data stories remains underexplored... What would be a good source of inspiration forData-driven storytelling?" Pre-reranking (top-20)
-
[69]
the human storytelling process
-
[70]
Interactive digital stories
- [71]
-
[72]
story analysis and generation systems
-
[73]
generative artificial intelligence (Gen-AI)-driven narrative personalization
-
[74]
narrative structure designs
-
[75]
the human storytelling process ... (ii) Semantically similar variants Query: "Prior methods for aligning large language models face challenges in tuning to maximize non-differentiable and non-binary objectives...This highlights a need for a more flexible approach that can generalize to various user preferences... while maintaining alignment... What could ...
-
[76]
aligning Large Language Models with human preferences
-
[78]
direct preference optimization
-
[79]
State-of-the-art language model fine-tuning techniques, such as Direct Preference Optimization ... Post-reranking (top-20)
-
[80]
Direct Preference Optimization for preference alignment
-
[81]
State-of-the-art language model fine-tuning techniques, such as Direct Preference Optimization
-
[82]
contrastive learning-based methods like Direct Preference Optimization
-
[83]
a Semi-Policy Preference Optimization method
-
[84]
This study reviews the problem of
direct preference optimization ... Table 22: Illustrative examples where the reranker preferred a different answer over the gold one. 32 Figure 20: User study interface. granite-embedding-125m-english to retrieve semantically similar contexts to this description from the relevant arXiv categories. We manually verify that the retrieved contexts match the d...
-
[85]
BV-MAPP (Verbal Behavior Milestones Assessment and Placement Program)
Figure 21a presents how general scientific IE schemas lack relation types to model recombina- tions. The figure presents the results of our spe- cialized extraction method besides a transformer- based extraction model (Hennen et al., 2024) fine- tuned on SciERC (Luan et al., 2018), a general IE schema. While our new data schema easily mod- els the recombi...
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.