METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary: use method 1
verdicts: UNVERDICTED 7
citing papers explorer
- LMM-Track4D: Eliciting 4D Dynamic Reasoning in LMMs via Trajectory-Grounded Dialogue
  LMM-Track4D formulates a trajectory-grounded dialogue task, releases Track4D-Bench with 526 samples, and proposes RTGE encoding, TRK state token, and OSK-RA decoder to elicit better 4D spatiotemporal reasoning in LMMs.
- Analysis and Explainability of LLMs Via Evolutionary Methods
  Evolutionary trees from LLM weights recover ground-truth training topologies and identify key datasets and layers through phenotypic analysis.
- On the Importance and Evaluation of Narrativity in Natural Language AI Explanations
  XAI explanations should be narratives with continuous structure, cause-effect, fluency and diversity, and new metrics are needed to evaluate this better than standard NLP scores.
- A-MEM: Agentic Memory for LLM Agents
  A-MEM is a dynamic memory system for LLM agents that builds and refines an interconnected network of notes with agent-driven linking and evolution, showing performance gains over prior memory methods on six models.
- SciCore-Mol: Augmenting Large Language Models with Pluggable Molecular Cognition Modules
  SciCore-Mol augments LLMs with three integrated modules for molecular perception, latent diffusion generation, and reaction reasoning, claiming an 8B open model competes with or exceeds proprietary systems on chemical tasks.
- ShellfishNet: A Domain-Specific Benchmark for Visual Recognition of Marine Molluscs
  ShellfishNet is a new benchmark of 8,691 images across 32 mollusc taxa for evaluating vision models on real-world underwater ecological monitoring tasks including robustness to degradation.
- Prompt-Driven Code Summarization: A Systematic Literature Review
  A systematic review that categorizes prompting strategies for LLM-based code summarization, assesses their effectiveness, and identifies gaps in research and evaluation practices.