Bin Bi

Identifiers

  • name variant · Bin Bi · 0.60 · backfill

Papers (5)

  1. UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function · cs.CL · 2024 · author #2
  2. UNA: A Unified Supervised Framework for Efficient LLM Alignment Across Feedback Types · cs.LG · 2024 · author #2
  3. Reinforcement Learning for LLM Post-Training: A Survey · cs.CL · 2024 · author #3
  4. A Neural Comprehensive Ranker (NCR) for Open-Domain Question Answering · cs.CL · 2017 · author #1
  5. KeyVec: Key-semantics Preserving Document Representations · cs.CL · 2017 · author #1

Mentions

  • 2407.16216 #3 · arxiv_oai · confidence 0.70 · Bin Bi

Frequent Coauthors