Yong Jae Lee

Identifiers

  • name variant Yong Jae Lee 0.60 · backfill

Papers (30)

  1. Personal AI Agent for Camera Roll VQA cs.CV · 2026 · author #4
  2. MAOAM: Unified Object and Material Selection with Vision-Language Models cs.CV · 2026 · author #7
  3. Latent Recurrent Transformer: Architecture Exploration, Training Strategies, and Scaling Behavior cs.LG · 2026 · author #10
  4. Your Embedding Model is SMARTer Than You Think cs.IR · 2026 · author #6
  5. From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing cs.CV · 2026 · author #3
  6. Exploration and Exploitation Errors Are Measurable for Language Model Agents cs.AI · 2026 · author #6
  7. MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models cs.CV · 2026 · author #5
  8. Relational Visual Similarity cs.CV · 2025 · author #8
  9. Multimodal Reinforcement Learning with Adaptive Verifier for AI Agents cs.AI · 2025 · author #17
  10. See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models cs.CV · 2025 · author #11
  11. Low-Resolution Editing is All You Need for High-Resolution Editing cs.CV · 2025 · author #3
  12. Revisiting Active Speaker Detection: An In-the-Wild Benchmark for Generalization and Robustness cs.CV · 2025 · author #11
  13. Improved Baselines with Visual Instruction Tuning cs.CV · 2023 · author #4
  14. Visual Instruction Tuning cs.CV · 2023 · author #4
  15. FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery cs.CV · 2018 · author #3
  16. Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond cs.CV · 2018 · author #5
  17. A Visual Attention Grounding Neural Model for Multimodal Machine Translation cs.CL · 2018 · author #3
  18. DOCK: Detecting Objects by transferring Common-sense Knowledge cs.CV · 2018 · author #4
  19. Learning to Anonymize Faces for Privacy Preserving Action Detection cs.CV · 2018 · author #2
  20. Video Object Detection with an Aligned Spatial-Temporal Memory cs.CV · 2017 · author #2
  21. Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery cs.CV · 2017 · author #2
  22. Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks cs.CV · 2017 · author #6
  23. Weakly-supervised Visual Grounding of Phrases with Linguistic Structures cs.CV · 2017 · author #3
  24. Identifying First-person Camera Wearers in Third-person Videos cs.CV · 2017 · author #5
  25. Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-supervised Object and Action Localization cs.CV · 2017 · author #2
  26. Interspecies Knowledge Transfer for Facial Keypoint Detection cs.CV · 2017 · author #3
  27. End-to-End Localization and Ranking for Relative Attributes cs.CV · 2016 · author #2
  28. Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection cs.CV · 2016 · author #3
  29. Predicting Important Objects for Egocentric Video Summarization cs.CV · 2015 · author #1
  30. Weakly-supervised Discovery of Visual Pattern Configurations cs.CV · 2014 · author #2

Mentions

  • 2505.21954 #11 · arxiv_oai · confidence 0.70 Yong Jae Lee
  • 2606.05275 #4 · arxiv_oai · confidence 0.70 Yong Jae Lee
  • 2606.04880 #7 · arxiv_oai · confidence 0.70 Yong Jae Lee
  • 1505.04803 #1 · backfill · confidence 0.70 Yong Jae Lee
  • 2511.19945 #3 · arxiv_oai · confidence 0.70 Yong Jae Lee
  • 1406.6507 #2 · backfill · confidence 0.70 Yong Jae Lee
  • 2605.26797 #10 · arxiv_oai · confidence 0.70 Yong Jae Lee
  • 2605.24938 #6 · arxiv_oai · confidence 0.70 Yong Jae Lee

Frequent Coauthors