arXiv preprint arXiv:2410.23261 (2025), https://arxiv.org/abs/2410.23261

Khandelwal, A · 2025 · arXiv 2410.23261

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Enhanced LLM Reasoning by Optimizing Reward Functions with Search-Driven Reinforcement Learning

cs.CL · 2026-05-03 · unverdicted · novelty 7.0 · 2 refs

Iterative search over reward functions with ranked feedback in GRPO training improves LLM math reasoning, achieving an F1 of 0.795 on GSM8K versus 0.609 for the baseline.

Towards Automated Pentesting with Large Language Models

cs.CR · 2026-04-13 · unverdicted · novelty 5.0

RedShell fine-tunes LLMs on enhanced malicious PowerShell data to produce syntactically valid offensive code for pentesting. It reports over 90% validity, strong semantic match to references, better edit-distance similarity than prior methods, and functional execution success.

citing papers explorer

Showing 2 of 2 citing papers.

Enhanced LLM Reasoning by Optimizing Reward Functions with Search-Driven Reinforcement Learning · cs.CL · 2026-05-03 · unverdicted · none · ref 33 · 2 links
Towards Automated Pentesting with Large Language Models · cs.CR · 2026-04-13 · unverdicted · none · ref 26
