SciEval: A multi-level large language model evaluation benchmark for scientific research

Liangtai Sun, Yang Han, Zihan Zhao, Da Ma, Zhennan Shen, Baocai Chen, Lu Chen, Kai Yu · 2024 · DOI 10.1609/aaai.v38i17.29872

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

cs.IR · 2026-06-06 · unverdicted · novelty 6.0

GIScholarBench shows LLMs exhibit consistent overconfidence across three scholarly tasks in GIS, with different manifestations in factual retrieval, citation expansion, and idea generation.

Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator

cs.DL · 2025-07-16 · unverdicted · novelty 4.0

The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.

citing papers explorer

Showing 2 of 2 citing papers.

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research
cs.IR · 2026-06-06 · unverdicted · none · ref 23
Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator
cs.DL · 2025-07-16 · unverdicted · none · ref 167
