pith. sign in

Title resolution pending

16 Pith papers cite this work. Polarity classification is still indexing.

16 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

roles

background 1

polarities

background 1

clear filters

representative citing papers

Self-Rewarding Language Models

cs.CL · 2024-01-18 · conditional · novelty 7.0

Iterative self-rewarding via LLM-as-Judge in DPO training on Llama 2 70B improves instruction following and self-evaluation, outperforming GPT-4 on AlpacaEval 2.0.

Towards an AI co-scientist

cs.AI · 2025-02-26 · unverdicted · novelty 6.0

A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.

Capabilities of Gemini Models in Medicine

cs.AI · 2024-04-29 · unverdicted · novelty 6.0

Med-Gemini sets new records on 10 of 14 medical benchmarks including 91.1% on MedQA-USMLE, beats GPT-4V by 44.5% on multimodal tasks, and surpasses humans on medical text summarization.

Effects of Cross-lingual Evidence in Multilingual Medical Question Answering

cs.CL · 2026-04-22 · unverdicted · novelty 4.0

Combining English and target-language web retrieval boosts medical QA for low-resource languages to match high-resource performance, while English web data benefits high-resource languages most and specialized sources like PubMed lack multilingual coverage.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • Towards an AI co-scientist cs.AI · 2025-02-26 · unverdicted · none · ref 252

    A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.

  • Capabilities of Gemini Models in Medicine cs.AI · 2024-04-29 · unverdicted · none · ref 256

    Med-Gemini sets new records on 10 of 14 medical benchmarks including 91.1% on MedQA-USMLE, beats GPT-4V by 44.5% on multimodal tasks, and surpasses humans on medical text summarization.