S-GRPO unifies SFT and RL for LVLMs via conditional ground-truth injection that supplies a maximal-reward anchor when group exploration fails completely.
Bleu: a method for automatic evaluation of machine translation
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 6verdicts
UNVERDICTED 6representative citing papers
ScrapeGraphAI-100k releases 93,695 real telemetry examples pairing web page content with prompts, schemas, and LLM responses to support training and benchmarking of schema-constrained generation.
HotComment is a new multimodal benchmark that quantifies online comment popularity via content quality assessment, interaction-based prediction, and agent-simulated user engagement, accompanied by the StyleCmt stylistic model.
A small recency window of 3-5 prior ADRs as context produces higher-fidelity LLM-generated Architecture Decision Records than no context, full history, or retrieval-augmented selection in typical sequential workflows.
CARE fine-tunes LLMs on counselor-validated crisis dialogues to produce responses with stronger semantic and strategic alignment to expert standards than general-purpose models in Hebrew and Arabic.
UNO distills user logs into semi-structured rules and preferences, applies query-and-feedback clustering to handle heterogeneity, quantifies cognitive gaps to filter noise, and builds primary and reflective modules that outperform RAG and memory baselines.
citing papers explorer
-
S-GRPO: Unified Post-Training for Large Vision-Language Models
S-GRPO unifies SFT and RL for LVLMs via conditional ground-truth injection that supplies a maximal-reward anchor when group exploration fails completely.
-
ScrapeGraphAI-100k: Dataset for Schema-Constrained LLM Generation
ScrapeGraphAI-100k releases 93,695 real telemetry examples pairing web page content with prompts, schemas, and LLM responses to support training and benchmarking of schema-constrained generation.
-
HotComment: A Benchmark for Evaluating Popularity of Online Comments
HotComment is a new multimodal benchmark that quantifies online comment popularity via content quality assessment, interaction-based prediction, and agent-simulated user engagement, accompanied by the StyleCmt stylistic model.
-
Context Matters: Evaluating Context Strategies for Automated ADR Generation Using LLMs
A small recency window of 3-5 prior ADRs as context produces higher-fidelity LLM-generated Architecture Decision Records than no context, full history, or retrieval-augmented selection in typical sequential workflows.
-
CARE: Counselor-Aligned Response Engine for Online Mental-Health Support
CARE fine-tunes LLMs on counselor-validated crisis dialogues to produce responses with stronger semantic and strategic alignment to expert standards than general-purpose models in Hebrew and Arabic.
-
Improve Large Language Model Systems with User Logs
UNO distills user logs into semi-structured rules and preferences, applies query-and-feedback clustering to handle heterogeneity, quantifies cognitive gaps to filter noise, and builds primary and reflective modules that outperform RAG and memory baselines.