Yuexian Zou
Identifiers
- name variant Yuexian Zou 0.60 · backfill
Papers (104)
- Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors cs.CV · 2026 · author #5
- Image Conductor: Precision Control for Interactive Video Synthesis cs.CV · 2024 · author #7
- Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning cs.CL · 2024 · author #5
- VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding cs.CV · 2024 · author #10
- VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework cs.CV · 2024 · author #10
- WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs cs.CV · 2024 · author #8
- Retrieval is Accurate Generation cs.CL · 2024 · author #6
- Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning cs.CV · 2024 · author #6
- AFL-Net: Integrating Audio, Facial, and Lip Modalities with a Two-step Cross-attention for Robust Speaker Diarization in the Wild cs.MM · 2023 · author #4
- ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding cs.CL · 2023 · author #6
- UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework cs.CV · 2023 · author #9
- Video Referring Expression Comprehension via Transformer with Content-conditioned Query cs.CV · 2023 · author #6
- NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement cs.SD · 2023 · author #5
- MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning cs.CV · 2023 · author #6
- G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory cs.CV · 2023 · author #6
- Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels cs.CV · 2023 · author #7
- Customizing General-Purpose Foundation Models for Medical Report Generation cs.CV · 2023 · author #3
- HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec cs.SD · 2023 · author #6
- WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research eess.AS · 2023 · author #8
- TLAG: An Informative Trigger and Label-Aware Knowledge Guided Model for Dialogue-based Relation Extraction cs.CL · 2023 · author #5
- Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation cs.CV · 2023 · author #6
- PoseRAC: Pose Saliency Transformer for Repetitive Action Counting cs.CV · 2023 · author #3
- Improve Retrieval-based Dialogue System via Syntax-Informed Attention cs.AI · 2023 · author #5
- ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation cs.CL · 2023 · author #3
- Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss cs.SD · 2023 · author #3
- Improving Weakly Supervised Sound Event Detection with Causal Intervention cs.SD · 2023 · author #5
- FTM: A Frame-level Timeline Modeling Method for Temporal Graph Representation Learning cs.LG · 2023 · author #4
- FiTs: Fine-grained Two-stage Training for Knowledge-aware Question Answering cs.CL · 2023 · author #5
- SSVMR: Saliency-based Self-training for Video-Music Retrieval cs.MM · 2023 · author #5
- Exploiting Auxiliary Caption for Video Grounding cs.CV · 2023 · author #6
- Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation cs.SD · 2022 · author #3
- M3ST: Mix at Three Levels for Speech Translation cs.CL · 2022 · author #6
- Aligning Source Visual and Target Language Domains for Unpaired Video Captioning cs.CV · 2022 · author #5
- A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding cs.CL · 2022 · author #5
- NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS cs.SD · 2022 · author #6
- DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention cs.CV · 2022 · author #7
- Prophet Attention: Predicting Attention with Future Attention for Image Captioning cs.CV · 2022 · author #5
- Video Referring Expression Comprehension via Transformer with Content-aware Query cs.CV · 2022 · author #4
- Correspondence Matters for Video Referring Expression Comprehension cs.CV · 2022 · author #4
- LocVTP: Video-Text Pre-training for Temporal Localization cs.CV · 2022 · author #6
- Diffsound: Discrete Diffusion Model for Text-to-sound Generation cs.SD · 2022 · author #6
- Competence-based Multimodal Curriculum Learning for Medical Report Generation cs.CL · 2022 · author #3
- LAE: Language-Aware Encoder for Monolingual and Multilingual ASR cs.CL · 2022 · author #5
- Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention eess.AS · 2022 · author #3
- End-to-end Spoken Conversational Question Answering: Task, Dataset and Model cs.CL · 2022 · author #6
- Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction eess.AS · 2022 · author #5
- RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection cs.SD · 2022 · author #4
- A Mixed supervised Learning Framework for Target Sound Detection cs.SD · 2022 · author #3
- Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches eess.AS · 2022 · author #5
- Improving Target Sound Extraction with Timestamp Information cs.SD · 2022 · author #5
- Learning Decoupling Features Through Orthogonality Regularization cs.SD · 2022 · author #6
- SpatioTemporal Focus for Skeleton-based Action Recognition cs.CV · 2022 · author #3
- Integrating Lattice-Free MMI into End-to-End Speech Recognition cs.CL · 2022 · author #4
- Unsupervised Pre-training for Temporal Action Localization Tasks cs.CV · 2022 · author #6
- Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model cs.CL · 2022 · author #4
- Detect what you want: Target Sound Detection cs.SD · 2021 · author #3
- Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI cs.AI · 2021 · author #7
- CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter cs.CV · 2021 · author #3
- Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information cs.SD · 2021 · author #4
- A Mutual learning framework for Few-shot Sound Event Detection cs.SD · 2021 · author #3
- Towards Joint Intent Detection and Slot Filling via Higher-order Attention cs.CL · 2021 · author #5
- On Pursuit of Designing Multi-modal Transformer for Video Grounding cs.CV · 2021 · author #5
- Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering cs.CL · 2021 · author #3
- HAN: Higher-order Attention Network for Spoken Language Understanding cs.CL · 2021 · author #3
- Fully Non-Homogeneous Atmospheric Scattering Modeling with Convolutional Neural Networks for Single Image Dehazing cs.CV · 2021 · author #3
- Joint Multiple Intent Detection and Slot Filling via Self-distillation cs.CL · 2021 · author #3
- Deep Motion Prior for Weakly-Supervised Temporal Action Localization cs.CV · 2021 · author #5
- Text Anchor Based Metric Learning for Small-footprint Keyword Spotting cs.SD · 2021 · author #4
- O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning cs.CL · 2021 · author #6
- Audio-Oriented Multimodal Machine Comprehension: Task, Dataset and Model cs.CL · 2021 · author #7
- Long-Short Temporal Modeling for Efficient Action Recognition cs.CV · 2021 · author #2
- SRF-Net: Selective Receptive Field Network for Anchor-Free Temporal Action Detection cs.CV · 2021 · author #3
- All You Need is a Second Look: Towards Arbitrary-Shaped Text Detection cs.CV · 2021 · author #4
- Exploring Semantic Relationships for Unpaired Image Captioning cs.CV · 2021 · author #4
- Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation cs.CV · 2021 · author #5
- Self-supervised Dialogue Learning for Spoken Conversational Question Answering cs.CL · 2021 · author #3
- Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification cs.SD · 2021 · author #3
- Rethinking Skip Connection with Layer Normalization in Transformers and ResNets cs.LG · 2021 · author #5
- RR-Net: Injecting Interactive Semantics in Human-Object Interaction Detection cs.CV · 2021 · author #2
- Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation in Complex Domain cs.SD · 2021 · author #3
- Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency cs.CL · 2021 · author #4
- SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification eess.AS · 2021 · author #2
- CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning cs.CV · 2021 · author #5
- A Global-local Attention Framework for Weakly Labelled Audio Tagging eess.AS · 2021 · author #2
- FWB-Net: Front White Balance Network for Color Shift Correction in Single Image Dehazing via Atmospheric Light Estimation cs.CV · 2021 · author #3
- Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension cs.CL · 2020 · author #5
- Knowledge Distillation for Improved Accuracy in Spoken Question Answering cs.CL · 2020 · author #3
- Contextualized Attention-based Knowledge Transfer for Spoken Conversational Question Answering cs.CL · 2020 · author #3
- Towards Data Distillation for End-to-end Spoken Conversational Question Answering cs.CL · 2020 · author #5
- PIN: A Novel Parallel Interactive Network for Spoken Language Understanding cs.CL · 2020 · author #4
- PAN: Towards Fast Action Recognition via Learning Persistence of Appearance cs.CV · 2020 · author #2
- A Graph-based Interactive Reasoning for Human-Object Interaction Detection cs.CV · 2020 · author #2
- Acoustic Scene Classification with Spectrogram Processing Strategies cs.SD · 2020 · author #2
- All you need is a second look: Towards Tighter Arbitrary shape text detection cs.CV · 2020 · author #2
- Multi-modal Multi-channel Target Speech Separation eess.AS · 2020 · author #5
- GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency cs.CV · 2020 · author #2 · as printed: YueXian Zou
- Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning eess.AS · 2020 · author #7
- Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation cs.SD · 2020 · author #2
- Environmental Sound Classification with Parallel Temporal-spectral Attention cs.SD · 2019 · author #2
- Non-Autoregressive Coarse-to-Fine Video Captioning cs.CV · 2019 · author #2
- C-RPNs: Promoting Object Detection in real world via a Cascade Structure of Region Proposal Networks cs.CV · 2019 · author #2 · as printed: YueXian Zou
- End-to-End Multi-Channel Speech Separation cs.SD · 2019 · author #8
- KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization cs.CV · 2014 · author #6
- Comparison of Spearman's rho and Kendall's tau in Normal and Contaminated Normal Models cs.IT · 2010 · author #4
Mentions
- 2303.17395 #8 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2406.15339 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.06458 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2405.20852 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2301.05997 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2403.09530 #10 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2402.17532 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2403.09027 #10 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2312.05730 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2403.07944 #8 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2401.17186 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2307.14277 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2311.11375 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2311.10125 #9 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2310.16402 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2309.01212 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2308.13218 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.15932 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2307.01969 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2306.05642 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2305.02765 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2207.09983 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2210.10914 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2206.14579 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.05681 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.17119 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.08450 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2302.11814 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2302.11799 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.06605 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2303.05678 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2302.09328 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2212.08348 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2212.03657 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2211.12148 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2211.04023 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2211.02448 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2210.16431 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2210.02953 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2108.05607 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2203.15614 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2111.15162 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2207.10400 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2207.10362 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2204.02088 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2112.10153 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2206.02093 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2110.04474 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2205.01280 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2204.14272 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2204.07375 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2201.01995 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2109.06085 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2204.02143 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2204.01355 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2204.00821 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2203.16772 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2203.16767 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2203.13609 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2108.02359 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2112.02498 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2110.06100 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2109.08890 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2109.03381 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2104.12359 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2108.11916 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2108.11292 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2108.08042 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2106.10658 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2108.05516 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2103.16392 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2107.01571 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2106.15787 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2106.15258 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2106.06963 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2106.12720 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2106.02182 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2010.11066 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2103.16858 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2105.10340 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2105.07205 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2105.00812 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2104.15015 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2010.11067 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 1911.12018 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2102.01931 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2012.10877 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2101.08465 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2003.07032 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2010.08923 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2009.13431 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2008.03462 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2007.06925 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2007.03781 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 1912.06808 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2004.12436 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2003.03927 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
- 2003.05242 #2 · arxiv_oai · confidence 0.70 YueXian Zou
- 2001.00391 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
- 1908.06665 #2 · arxiv_oai · confidence 0.70 YueXian Zou
Frequent Coauthors
- Fenglin Liu 17 shared papers
- Dongchao Yang 16 shared papers
- Helin Wang 15 shared papers
- Meng Cao 14 shared papers
- Bang Yang 13 shared papers
- Can Zhang 12 shared papers
- Rongzhi Gu 12 shared papers
- Xian Wu 11 shared papers
- Xuxin Cheng 11 shared papers
- Dong Yu 10 shared papers
- Nuo Chen 10 shared papers
- Chenyu You 9 shared papers
- Zhihong Zhu 9 shared papers
- Chao Weng 8 shared papers
- Shen Ge 8 shared papers
- Jianwei Yu 7 shared papers
- Jinchuan Tian 7 shared papers
- Wenwu Wang 7 shared papers
- Dongming Yang 6 shared papers
- Hongxiang Li 6 shared papers