Yuexian Zou

Identifiers

name variant Yuexian Zou 0.60 · backfill

Papers (104)

Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors cs.CV · 2026 · author #5
Image Conductor: Precision Control for Interactive Video Synthesis cs.CV · 2024 · author #7
Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning cs.CL · 2024 · author #5
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding cs.CV · 2024 · author #10
VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework cs.CV · 2024 · author #10
WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs cs.CV · 2024 · author #8
Retrieval is Accurate Generation cs.CL · 2024 · author #6
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning cs.CV · 2024 · author #6
AFL-Net: Integrating Audio, Facial, and Lip Modalities with a Two-step Cross-attention for Robust Speaker Diarization in the Wild cs.MM · 2023 · author #4
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding cs.CL · 2023 · author #6
UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework cs.CV · 2023 · author #9
Video Referring Expression Comprehension via Transformer with Content-conditioned Query cs.CV · 2023 · author #6
NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement cs.SD · 2023 · author #5
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning cs.CV · 2023 · author #6
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory cs.CV · 2023 · author #6
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels cs.CV · 2023 · author #7
Customizing General-Purpose Foundation Models for Medical Report Generation cs.CV · 2023 · author #3
HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec cs.SD · 2023 · author #6
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research eess.AS · 2023 · author #8
TLAG: An Informative Trigger and Label-Aware Knowledge Guided Model for Dialogue-based Relation Extraction cs.CL · 2023 · author #5
Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation cs.CV · 2023 · author #6
PoseRAC: Pose Saliency Transformer for Repetitive Action Counting cs.CV · 2023 · author #3
Improve Retrieval-based Dialogue System via Syntax-Informed Attention cs.AI · 2023 · author #5
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation cs.CL · 2023 · author #3
Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss cs.SD · 2023 · author #3
Improving Weakly Supervised Sound Event Detection with Causal Intervention cs.SD · 2023 · author #5
FTM: A Frame-level Timeline Modeling Method for Temporal Graph Representation Learning cs.LG · 2023 · author #4
FiTs: Fine-grained Two-stage Training for Knowledge-aware Question Answering cs.CL · 2023 · author #5
SSVMR: Saliency-based Self-training for Video-Music Retrieval cs.MM · 2023 · author #5
Exploiting Auxiliary Caption for Video Grounding cs.CV · 2023 · author #6
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation cs.SD · 2022 · author #3
M3ST: Mix at Three Levels for Speech Translation cs.CL · 2022 · author #6
Aligning Source Visual and Target Language Domains for Unpaired Video Captioning cs.CV · 2022 · author #5
A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding cs.CL · 2022 · author #5
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS cs.SD · 2022 · author #6
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention cs.CV · 2022 · author #7
Prophet Attention: Predicting Attention with Future Attention for Image Captioning cs.CV · 2022 · author #5
Video Referring Expression Comprehension via Transformer with Content-aware Query cs.CV · 2022 · author #4
Correspondence Matters for Video Referring Expression Comprehension cs.CV · 2022 · author #4
LocVTP: Video-Text Pre-training for Temporal Localization cs.CV · 2022 · author #6
Diffsound: Discrete Diffusion Model for Text-to-sound Generation cs.SD · 2022 · author #6
Competence-based Multimodal Curriculum Learning for Medical Report Generation cs.CL · 2022 · author #3
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR cs.CL · 2022 · author #5
Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention eess.AS · 2022 · author #3
End-to-end Spoken Conversational Question Answering: Task, Dataset and Model cs.CL · 2022 · author #6
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction eess.AS · 2022 · author #5
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection cs.SD · 2022 · author #4
A Mixed supervised Learning Framework for Target Sound Detection cs.SD · 2022 · author #3
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches eess.AS · 2022 · author #5
Improving Target Sound Extraction with Timestamp Information cs.SD · 2022 · author #5
Learning Decoupling Features Through Orthogonality Regularization cs.SD · 2022 · author #6
SpatioTemporal Focus for Skeleton-based Action Recognition cs.CV · 2022 · author #3
Integrating Lattice-Free MMI into End-to-End Speech Recognition cs.CL · 2022 · author #4
Unsupervised Pre-training for Temporal Action Localization Tasks cs.CV · 2022 · author #6
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model cs.CL · 2022 · author #4
Detect what you want: Target Sound Detection cs.SD · 2021 · author #3
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI cs.AI · 2021 · author #7
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter cs.CV · 2021 · author #3
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information cs.SD · 2021 · author #4
A Mutual learning framework for Few-shot Sound Event Detection cs.SD · 2021 · author #3
Towards Joint Intent Detection and Slot Filling via Higher-order Attention cs.CL · 2021 · author #5
On Pursuit of Designing Multi-modal Transformer for Video Grounding cs.CV · 2021 · author #5
Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering cs.CL · 2021 · author #3
HAN: Higher-order Attention Network for Spoken Language Understanding cs.CL · 2021 · author #3
Fully Non-Homogeneous Atmospheric Scattering Modeling with Convolutional Neural Networks for Single Image Dehazing cs.CV · 2021 · author #3
Joint Multiple Intent Detection and Slot Filling via Self-distillation cs.CL · 2021 · author #3
Deep Motion Prior for Weakly-Supervised Temporal Action Localization cs.CV · 2021 · author #5
Text Anchor Based Metric Learning for Small-footprint Keyword Spotting cs.SD · 2021 · author #4
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning cs.CL · 2021 · author #6
Audio-Oriented Multimodal Machine Comprehension: Task, Dataset and Model cs.CL · 2021 · author #7
Long-Short Temporal Modeling for Efficient Action Recognition cs.CV · 2021 · author #2
SRF-Net: Selective Receptive Field Network for Anchor-Free Temporal Action Detection cs.CV · 2021 · author #3
All You Need is a Second Look: Towards Arbitrary-Shaped Text Detection cs.CV · 2021 · author #4
Exploring Semantic Relationships for Unpaired Image Captioning cs.CV · 2021 · author #4
Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation cs.CV · 2021 · author #5
Self-supervised Dialogue Learning for Spoken Conversational Question Answering cs.CL · 2021 · author #3
Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification cs.SD · 2021 · author #3
Rethinking Skip Connection with Layer Normalization in Transformers and ResNets cs.LG · 2021 · author #5
RR-Net: Injecting Interactive Semantics in Human-Object Interaction Detection cs.CV · 2021 · author #2
Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation in Complex Domain cs.SD · 2021 · author #3
Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency cs.CL · 2021 · author #4
SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification eess.AS · 2021 · author #2
CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning cs.CV · 2021 · author #5
A Global-local Attention Framework for Weakly Labelled Audio Tagging eess.AS · 2021 · author #2
FWB-Net: Front White Balance Network for Color Shift Correction in Single Image Dehazing via Atmospheric Light Estimation cs.CV · 2021 · author #3
Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension cs.CL · 2020 · author #5
Knowledge Distillation for Improved Accuracy in Spoken Question Answering cs.CL · 2020 · author #3
Contextualized Attention-based Knowledge Transfer for Spoken Conversational Question Answering cs.CL · 2020 · author #3
Towards Data Distillation for End-to-end Spoken Conversational Question Answering cs.CL · 2020 · author #5
PIN: A Novel Parallel Interactive Network for Spoken Language Understanding cs.CL · 2020 · author #4
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance cs.CV · 2020 · author #2
A Graph-based Interactive Reasoning for Human-Object Interaction Detection cs.CV · 2020 · author #2
Acoustic Scene Classification with Spectrogram Processing Strategies cs.SD · 2020 · author #2
All you need is a second look: Towards Tighter Arbitrary shape text detection cs.CV · 2020 · author #2
Multi-modal Multi-channel Target Speech Separation eess.AS · 2020 · author #5
GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency cs.CV · 2020 · author #2 as printed: YueXian Zou
Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning eess.AS · 2020 · author #7
Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation cs.SD · 2020 · author #2
Environmental Sound Classification with Parallel Temporal-spectral Attention cs.SD · 2019 · author #2
Non-Autoregressive Coarse-to-Fine Video Captioning cs.CV · 2019 · author #2
C-RPNs: Promoting Object Detection in real world via a Cascade Structure of Region Proposal Networks cs.CV · 2019 · author #2 as printed: YueXian Zou
End-to-End Multi-Channel Speech Separation cs.SD · 2019 · author #8
KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization cs.CV · 2014 · author #6
Comparison of Spearman's rho and Kendall's tau in Normal and Contaminated Normal Models cs.IT · 2010 · author #4

Mentions

2303.17395 #8 · arxiv_oai · confidence 0.70 Yuexian Zou
2406.15339 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.06458 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2405.20852 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2301.05997 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2403.09530 #10 · arxiv_oai · confidence 0.70 Yuexian Zou
2402.17532 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2403.09027 #10 · arxiv_oai · confidence 0.70 Yuexian Zou
2312.05730 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2403.07944 #8 · arxiv_oai · confidence 0.70 Yuexian Zou
2401.17186 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2307.14277 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2311.11375 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2311.10125 #9 · arxiv_oai · confidence 0.70 Yuexian Zou
2310.16402 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2309.01212 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2308.13218 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.15932 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2307.01969 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
2306.05642 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2305.02765 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2207.09983 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2210.10914 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2206.14579 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.05681 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.17119 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.08450 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2302.11814 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2302.11799 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.06605 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2303.05678 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2302.09328 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2212.08348 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2212.03657 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2211.12148 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2211.04023 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2211.02448 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2210.16431 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
2210.02953 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2108.05607 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2203.15614 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2111.15162 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2207.10400 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2207.10362 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2204.02088 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2112.10153 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2206.02093 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2110.04474 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2205.01280 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2204.14272 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2204.07375 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2201.01995 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2109.06085 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2204.02143 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2204.01355 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2204.00821 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2203.16772 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2203.16767 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2203.13609 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2108.02359 #6 · arxiv_oai · confidence 0.70 Yuexian Zou
2112.02498 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
2110.06100 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2109.08890 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2109.03381 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2104.12359 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2108.11916 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2108.11292 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2108.08042 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2106.10658 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2108.05516 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2103.16392 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2107.01571 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
2106.15787 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2106.15258 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2106.06963 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2106.12720 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2106.02182 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2010.11066 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2103.16858 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2105.10340 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2105.07205 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2105.00812 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2104.15015 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2010.11067 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
1911.12018 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2102.01931 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2012.10877 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2101.08465 #3 · arxiv_oai · confidence 0.70 Yuexian Zou
2003.07032 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2010.08923 #5 · arxiv_oai · confidence 0.70 Yuexian Zou
2009.13431 #4 · arxiv_oai · confidence 0.70 Yuexian Zou
2008.03462 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2007.06925 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2007.03781 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
1912.06808 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2004.12436 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
2003.03927 #7 · arxiv_oai · confidence 0.70 Yuexian Zou
2003.05242 #2 · arxiv_oai · confidence 0.70 YueXian Zou
2001.00391 #2 · arxiv_oai · confidence 0.70 Yuexian Zou
1908.06665 #2 · arxiv_oai · confidence 0.70 YueXian Zou

Frequent Coauthors

Fenglin Liu 17 shared papers
Dongchao Yang 16 shared papers
Helin Wang 15 shared papers
Meng Cao 14 shared papers
Bang Yang 13 shared papers
Can Zhang 12 shared papers
Rongzhi Gu 12 shared papers
Xian Wu 11 shared papers
Xuxin Cheng 11 shared papers
Dong Yu 10 shared papers
Nuo Chen 10 shared papers
Chenyu You 9 shared papers
Zhihong Zhu 9 shared papers
Chao Weng 8 shared papers
Shen Ge 8 shared papers
Jianwei Yu 7 shared papers
Jinchuan Tian 7 shared papers
Wenwu Wang 7 shared papers
Dongming Yang 6 shared papers
Hongxiang Li 6 shared papers