Hybrid Congestion Classification Framework Using Flow-Guided Attention and Empirical Mode Decomposition

Armstrong Aboah; Blessing Agyei Kyem; Eugene Kofi Okrah Denteh; Joshua Kofi Asamoah

arxiv: 2605.04752 · v1 · submitted 2026-05-06 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-08 17:04 UTC · model grok-4.3

keywords: traffic congestion, optical flow, empirical mode decomposition, attention mechanism, video classification, spatiotemporal modeling, hybrid framework, motion analysis

The pith

Flow-guided attention and empirical mode decomposition together classify traffic congestion levels more effectively by integrating spatial motion cues with adaptive temporal analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper seeks to establish that a unified video analysis framework can overcome the separate weaknesses of appearance-based and signal-based methods for detecting road congestion. By using optical flow to direct attention to moving parts of the scene and applying empirical mode decomposition to break down motion patterns into intrinsic components, the approach captures both location and timing of traffic flow. A sympathetic reader would care because reliable congestion classification from existing camera feeds could support better traffic control systems and reduce reliance on physical sensors. The reported results indicate that this combination leads to high accuracy and stability under different conditions.

Core claim

The authors claim that their FLO-EMD model enables effective classification of light, medium, and heavy congestion. The model uses dense optical flow to guide attention so that RGB features are refined toward motion-relevant regions, applies empirical mode decomposition to aggregated flow statistics to obtain intrinsic temporal components, and fuses the resulting EMD embedding with learned spatiotemporal representations.

What carries the argument

The hybrid FLO-EMD architecture that links motion evidence from optical flow to spatial feature selection through attention and performs data-adaptive temporal characterization via empirical mode decomposition.

If this is right

The combined model reaches 97.5% overall test accuracy and a weighted F1 of 0.9742 in experiments on 1,050 five-second clips.
It outperforms several established baseline methods.
Performance stays robust across the varied conditions in the four surveillance networks.
Ablation experiments show the specific contributions of the EMD step, the number of intrinsic mode functions, and the motion descriptors used.
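
For readers who want to check figures of this kind, overall accuracy and weighted F1 can both be derived from a confusion matrix. The following is a minimal sketch, not the authors' evaluation code; the function names and the toy counts are hypothetical and are not taken from the paper.

```python
def accuracy(confusion):
    """Overall accuracy from a square confusion matrix (rows = true, cols = predicted)."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[k][k] for k in range(len(confusion))) / total


def weighted_f1(confusion):
    """Per-class F1 averaged with weights proportional to each class's support."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    score = 0.0
    for k in range(n):
        tp = confusion[k][k]
        support = sum(confusion[k])                          # true samples of class k
        predicted = sum(confusion[r][k] for r in range(n))   # samples predicted as k
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * support / total
    return score


# Hypothetical counts for light / medium / heavy; NOT the paper's data.
toy = [[50, 1, 0],
       [1, 47, 2],
       [0, 3, 46]]
print(f"accuracy={accuracy(toy):.4f}  weighted F1={weighted_f1(toy):.4f}")
```

Because the weights are class supports, weighted F1 tracks accuracy closely when classes are balanced, which is why the two headline numbers sit so near each other.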

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This approach could be tested for extension to predicting congestion evolution over longer time periods rather than classifying current state.
The method's emphasis on motion might make it suitable for low-light or poor visibility scenarios where color cues fail.
Integrating such a system with existing traffic management software could provide granular congestion information for dynamic signal timing.

Load-bearing premise

The selected video clips and motion features sufficiently capture the essential variations in traffic behavior without introducing selection bias or overfitting through post-hoc choices.

What would settle it

Evaluating the trained model on a new collection of traffic video clips from additional locations or different times that results in substantially reduced classification accuracy would disprove the claim of robust high performance.

Figures

Figures reproduced from arXiv: 2605.04752 by Armstrong Aboah, Blessing Agyei Kyem, Eugene Kofi Okrah Denteh, Joshua Kofi Asamoah.

**Figure 1.** Overview of the FLO-EMD framework. The architecture processes traffic video sequences through parallel RGB and optical flow backbones, employs flow-guided attention mechanisms for spatial feature enhancement, integrates EMD-based temporal analysis of motion statistics, and fuses multimodal features through bidirectional LSTM encoding for final classification.

**Figure 2.** RGB backbone: hierarchical convolutions (64, 128, 256, 512 channels) with decreasing spatial resolution, followed by global average pooling to obtain a fixed-dimensional frame representation. Optical-flow backbone: This mirrors the RGB backbone design but operates on dense flow fields of shape B × (T−1) × H × W × 2. The first convolution adapts to the two-channel input (u, v) and is followed by the same se…

**Figure 3.** Flow-guided channel attention mechanism. RGB and optical flow features undergo global average and maximum pooling, followed by processing through shared MLPs to generate channel attention weights that emphasize motion-relevant feature channels for traffic analysis.

**Figure 4.** Flow-guided spatial attention mechanism. Channel-refined RGB features are combined with flow magnitude information through channel-wise pooling operations. The concatenated spatial maps undergo convolution and sigmoid activation to produce spatial attention weights that focus on motion-active regions while suppressing static background elements.

**Figure 5.** Dataset diversity showing various traffic conditions and environmental scenarios: (a) light traffic flow on a multi-lane highway under clear daytime conditions, (b) moderate traffic density with well-spaced vehicles during clear weather, (c) increased traffic density showing more congested but still flowing conditions, (d) highway infrastructure with moderate traffic under overcast conditions, (e) modera…

**Figure 6.** Accurate classification examples showing the model's ability to correctly identify light, medium, and heavy traffic conditions with high confidence scores ranging from 97.40% to 99.20%. The attention heatmaps demonstrate consistent focus on traffic-relevant regions across different congestion levels.

**Figure 7.** Confusion matrix for the proposed model showing classification performance across all traffic congestion classes. The matrix reveals strong diagonal performance with minimal misclassification between adjacent congestion levels.

**Figure 8.** Representative misclassification examples showing boundary cases where the model struggles with ambiguous traffic scenarios. The moderate confidence scores (65–81%) indicate uncertainty in these challenging transition cases between medium and heavy congestion.

**Figure 9.** Temporal attention visualization analysis across 16 consecutive frames showing attention pattern evolution over time. Top: FLO-EMD with flow-guided attention demonstrating consistent tracking of vehicle locations and adaptive focus on traffic-active regions throughout the sequence. Bottom: FLO-EMD V2 without flow-guided attention exhibiting static attention patterns concentrated on road infrastructure elem…
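
The channel and spatial attention steps described in the captions of Figures 3 and 4 can be sketched roughly as below. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the pooled RGB and flow descriptors are assumed to be summed after the shared MLP, the spatial convolution is simplified to a 1×1 weighting of the three pooled maps, and all names, shapes, and random weights are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def shared_mlp(v, w1, w2):
    """Two-layer MLP with ReLU, shared across all pooled descriptors."""
    return np.maximum(v @ w1, 0.0) @ w2


def channel_attention(rgb, flow, w1, w2):
    """rgb, flow: (C, H, W) feature maps. Returns channel weights of shape (C,)."""
    logits = 0.0
    for feat in (rgb, flow):
        avg = feat.mean(axis=(1, 2))   # global average pooling -> (C,)
        mx = feat.max(axis=(1, 2))     # global max pooling -> (C,)
        # Assumption: descriptors are combined by summation after the shared MLP.
        logits = logits + shared_mlp(avg, w1, w2) + shared_mlp(mx, w1, w2)
    return sigmoid(logits)


def spatial_attention(rgb_refined, flow_mag, kernel):
    """rgb_refined: (C, H, W); flow_mag: (H, W); kernel: (3,) weights of a 1x1 conv."""
    maps = np.stack([rgb_refined.mean(axis=0),   # channel-wise average pooling
                     rgb_refined.max(axis=0),    # channel-wise max pooling
                     flow_mag])                  # flow magnitude map
    # Simplification: a 1x1 convolution stands in for the larger spatial kernel.
    return sigmoid(np.tensordot(kernel, maps, axes=1))   # (H, W)


rng = np.random.default_rng(0)
C, H, W, hidden = 8, 6, 6, 4
rgb = rng.standard_normal((C, H, W))
flow = rng.standard_normal((C, H, W))
flow_mag = np.abs(rng.standard_normal((H, W)))
w1 = rng.standard_normal((C, hidden)) * 0.1
w2 = rng.standard_normal((hidden, C)) * 0.1
kernel = rng.standard_normal(3)

ch = channel_attention(rgb, flow, w1, w2)
rgb_refined = rgb * ch[:, None, None]           # reweight channels
sp = spatial_attention(rgb_refined, flow_mag, kernel)
out = rgb_refined * sp[None, :, :]              # reweight spatial locations
```

The ordering (channel first, then spatial) follows the Figure 4 caption, which says the spatial step operates on channel-refined RGB features.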

Original abstract

Accurate traffic congestion classification requires models that jointly capture roadway scene context and non-stationary traffic motion, yet most prior work treats these requirements in isolation. Vision-based methods often depend on appearance cues with standard temporal pooling, which can bias predictions toward static infrastructure, whereas signal-based approaches characterize temporal dynamics but lack the spatial context needed for scene-level localization. These complementary limitations motivate a unified framework that links motion evidence to spatial feature selection while preserving data-adaptive temporal characterization. This study therefore proposes FLO-EMD, a hybrid approach that couples motion-guided attention with empirical, data-driven temporal decomposition. Dense optical flow guides channel and spatial attention so that RGB features are refined toward motion-relevant regions. In parallel, aggregated flow statistics form compact motion traces that are decomposed using Empirical Mode Decomposition (EMD) to extract intrinsic temporal components. The resulting EMD embedding is fused with learned spatiotemporal representations to classify light, medium, and heavy congestion. Experiments on 1,050 five-second clips from four surveillance networks show that FLO-EMD achieves 97.5% overall test accuracy (weighted F1 = 0.9742), outperforming established baselines and remaining robust across diverse environmental conditions; ablation and sensitivity analyses further quantify the contributions of EMD, the number of intrinsic mode functions, and the selected motion descriptors.
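
To make the temporal branch concrete, the sketch below builds a motion trace from dense flow and decomposes it by sifting. It is a simplified illustration only: the trace is assumed to be the mean flow magnitude per frame pair (the abstract says only "aggregated flow statistics"), envelopes use linear interpolation where standard EMD uses cubic splines, and a fixed number of sifting iterations replaces a proper stopping criterion. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def motion_trace(flow):
    """flow: (T-1, H, W, 2) dense flow. Returns mean flow magnitude per frame pair."""
    return np.linalg.norm(flow, axis=-1).mean(axis=(1, 2))


def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the extrema, pinned to the endpoints."""
    knots = np.concatenate(([0], idx, [len(x) - 1]))
    return np.interp(np.arange(len(x)), knots, x[knots])


def emd(x, max_imfs=3, sift_iters=10):
    """Simplified EMD. Returns (imfs, residue) with sum(imfs) + residue == x."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 3:    # too few oscillations left
            break
        h = residue.copy()
        for _ in range(sift_iters):
            maxima, minima = local_extrema(h)
            if len(maxima) == 0 or len(minima) == 0:
                break
            mean_env = 0.5 * (envelope(h, maxima) + envelope(h, minima))
            h = h - mean_env                 # remove the local mean
        imfs.append(h)
        residue = residue - h
    return imfs, residue


# Synthetic flow whose speed has a slow drift plus a faster oscillation.
t = np.linspace(0.0, 1.0, 120)
speed = 2.0 + t + 0.5 * np.sin(2 * np.pi * 12 * t)
flow = speed[:, None, None, None] * np.ones((120, 4, 4, 2)) / np.sqrt(2)

trace = motion_trace(flow)
imfs, residue = emd(trace)
print(len(imfs), "IMFs extracted")
```

In practice a maintained EMD implementation with spline envelopes would be preferable; the point here is only that the decomposition is data-adaptive and exactly additive.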

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper combines flow-guided attention with EMD decomposition for congestion classification and reports 97.5% accuracy, but the four-network dataset and unspecified splits leave open whether the gains come from motion patterns or scene-specific cues.

The core contribution is a straightforward hybrid: dense optical flow steers spatial and channel attention on RGB frames, while aggregated flow traces are decomposed into intrinsic mode functions via EMD; the two streams are then fused for three-class congestion prediction. That pairing is not a new organizing principle, but it is a clean way to link spatial focus with adaptive temporal decomposition on short clips. The ablations on the number of modes and the choice of motion descriptors are useful and show each piece adds something measurable.

The reported 97.5% overall test accuracy and 0.9742 weighted F1 on the 1,050-clip dataset beat the listed baselines, and the authors note robustness across lighting and weather variations in their collection. Those numbers are the main practical takeaway for anyone building traffic monitors from fixed cameras.

The soft spot is the data partition. With only four surveillance networks supplying all clips, any split that allows clips from the same camera or network into both train and test sets lets the model pick up fixed background statistics, camera angle, or illumination patterns that happen to co-occur with the labels. The abstract gives no indication that the splits were made network-disjoint or camera-disjoint, and the sensitivity analysis on mode count does not address whether that choice was tuned on the test distribution. That gap makes the generalization claim harder to trust without further checks.

The work is aimed at applied computer-vision groups doing video-based traffic monitoring who want a ready-to-implement fusion that improves on standard temporal pooling. Readers looking for large-scale cross-city validation or a new theoretical framing will find little here. The reasoning is coherent and the experiments are at least partially documented, so the paper merits a serious referee who can ask for explicit split details and cross-network testing. I would send it to review rather than desk-reject.
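
The network-disjoint partition the letter asks about can be sketched as a leave-one-network-out loop. This is a minimal illustration, assuming each clip carries a network identifier; the clip IDs, network names, and labels below are invented and do not describe the paper's dataset.

```python
from collections import defaultdict


def leave_one_network_out(clips):
    """clips: list of (clip_id, network, label). Yields (held_out, train, test)."""
    by_network = defaultdict(list)
    for clip in clips:
        by_network[clip[1]].append(clip)
    for held_out in sorted(by_network):
        test = by_network[held_out]
        # Training data comes only from the other networks.
        train = [c for net, group in by_network.items()
                 if net != held_out for c in group]
        yield held_out, train, test


# Hypothetical metadata: four networks, three congestion labels.
labels = ["light", "medium", "heavy"]
clips = [(f"clip{i:03d}", f"net{i % 4}", labels[i % 3]) for i in range(24)]

folds = list(leave_one_network_out(clips))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

With only four networks this gives four folds, so per-fold class balance should be reported alongside accuracy; a held-out network may be missing or short on one congestion level.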

Referee Report

2 major / 2 minor

Summary. The paper proposes FLO-EMD, a hybrid congestion classification framework that couples dense optical flow-guided channel and spatial attention on RGB features with Empirical Mode Decomposition (EMD) applied to aggregated flow statistics for extracting intrinsic temporal components. These are fused to classify light, medium, and heavy congestion. On a dataset of 1,050 five-second clips from four surveillance networks, the method reports 97.5% overall test accuracy (weighted F1 = 0.9742), outperforming baselines, with supporting ablation and sensitivity analyses on EMD components, number of intrinsic mode functions, and motion descriptors.

Significance. If the reported accuracy and robustness hold under properly controlled validation, the work offers a concrete advance in hybrid vision-signal methods for non-stationary scene understanding, addressing the complementary weaknesses of pure appearance-based and pure signal-based congestion classifiers. The explicit use of data-adaptive EMD on motion traces and the reported ablation quantifications of component contributions are strengths that could inform follow-on work in traffic monitoring and related dynamic classification tasks.

major comments (2)

[Experiments / abstract] The experimental evaluation (abstract and §4/§5) does not specify whether the train/test split on the 1,050 clips is network-disjoint or camera-disjoint. With only four source networks, any non-disjoint split risks the model exploiting network-specific lighting, camera geometry, or background statistics that correlate with congestion labels, rather than learning general motion dynamics; this directly undermines the central 97.5% accuracy claim and the assertion of robustness across diverse conditions.
[Ablation and sensitivity analyses] The sensitivity analysis on the number of intrinsic mode functions (a free hyperparameter listed in the axiom ledger) is mentioned but lacks details on how the value was selected without post-hoc optimization on the test distribution; if the chosen number was tuned after seeing test performance, the ablation results quantifying EMD contribution become circular and no longer support the headline performance numbers.

minor comments (2)

[Experiments] Baseline implementations are referenced but lack explicit details on data splits, hyperparameter search, or statistical testing (e.g., multiple runs with standard deviations), making it difficult to assess whether the reported outperformance is robust.
[Method] Notation for the fused embedding and attention modules could be clarified with an explicit diagram or equation showing how the EMD embedding is concatenated or attended with the spatiotemporal features.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate clarifications and additional details as outlined.

Referee: [Experiments / abstract] The experimental evaluation (abstract and §4/§5) does not specify whether the train/test split on the 1,050 clips is network-disjoint or camera-disjoint. With only four source networks, any non-disjoint split risks the model exploiting network-specific lighting, camera geometry, or background statistics that correlate with congestion labels, rather than learning general motion dynamics; this directly undermines the central 97.5% accuracy claim and the assertion of robustness across diverse conditions.

Authors: We acknowledge that the manuscript does not explicitly describe the train/test split procedure. The 1,050 clips were randomly partitioned into training, validation, and test sets (70/15/15 ratio) at the clip level without enforcing network-disjoint or camera-disjoint splits, as this was necessary to maintain class balance and adequate sample sizes given only four source networks. We agree this introduces a risk of the model capturing network-specific cues rather than purely general motion dynamics. In the revised manuscript, we will clearly state the split method, add a dedicated limitations discussion on this point, and include leave-one-network-out cross-validation results to quantify generalization across networks. revision: yes
Referee: [Ablation and sensitivity analyses] The sensitivity analysis on the number of intrinsic mode functions (a free hyperparameter listed in the axiom ledger) is mentioned but lacks details on how the value was selected without post-hoc optimization on the test distribution; if the chosen number was tuned after seeing test performance, the ablation results quantifying EMD contribution become circular and no longer support the headline performance numbers.

Authors: We agree that additional transparency is needed on the selection of the number of intrinsic mode functions. This hyperparameter was determined exclusively via grid search (values 1–10) on the training set using 5-fold cross-validation, selecting the value that maximized average validation accuracy; the test set was never used. We will revise the sensitivity analysis section to document this procedure in full, including the validation performance for each candidate number of IMFs, and explicitly confirm that no test data influenced the choice. revision: yes
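
The selection procedure described in the second response (grid search over 1–10 IMFs with 5-fold cross-validation on the training set only) can be sketched as follows. This is an illustrative outline, not the authors' code: `fit_and_score` is a hypothetical callback standing in for model training and validation, and the toy scorer is a made-up curve used only so the example runs.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def select_num_imfs(train_items, candidates, fit_and_score, k=5):
    """Pick the IMF count with the best mean validation score over k folds."""
    folds = k_fold_indices(len(train_items), k)
    mean_scores = {}
    for n_imfs in candidates:
        scores = []
        for held in range(k):
            val = [train_items[i] for i in folds[held]]
            fit = [train_items[i]
                   for j, f in enumerate(folds) if j != held for i in f]
            scores.append(fit_and_score(n_imfs, fit, val))
        mean_scores[n_imfs] = sum(scores) / k
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


def toy_scorer(n_imfs, fit, val):
    """Stand-in scorer: an invented curve peaking at 4 IMFs. Replace with real training."""
    return 0.9 - 0.01 * (n_imfs - 4) ** 2


train_items = list(range(100))   # placeholders for training clips only; no test data
best, scores = select_num_imfs(train_items, range(1, 11), toy_scorer)
print("selected number of IMFs:", best)
```

The test set never enters `select_num_imfs`, which is the property the referee's comment turns on; reporting `scores` for every candidate would supply the per-value validation results the authors promise.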

Circularity Check

0 steps flagged

No significant circularity in derivation chain

This is an empirical ML paper proposing a hybrid FLO-EMD architecture that fuses flow-guided attention with EMD-based temporal decomposition, then trains and evaluates a classifier on held-out video clips. No equations, uniqueness theorems, or self-citations are invoked to derive performance metrics or architectural choices by construction; accuracy and ablation results are obtained via standard supervised training on a fixed dataset split rather than any reduction of outputs to fitted inputs or prior self-referential claims.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The framework rests on standard computer-vision and signal-processing assumptions plus a modest number of design choices whose impact is quantified only via ablation on the authors' own data.

free parameters (1)

number of intrinsic mode functions
Hyperparameter controlling the EMD embedding dimensionality; its effect is studied via sensitivity analysis but remains a tunable choice.

axioms (2)

Domain assumption: Dense optical flow reliably identifies motion regions relevant to congestion level.
Used to guide both channel and spatial attention without independent validation that flow errors do not systematically bias attention maps.
Domain assumption: Aggregated flow statistics contain non-stationary temporal structure that EMD can meaningfully decompose for classification.
Core premise of the temporal branch; no proof or external benchmark is supplied.

Reference graph

Works this paper leans on

113 extracted references · 113 canonical work pages

[1]

2015 IEEE 18th International Conference on Intelligent Transportation Systems , pages=

Reliability of probe speed data for detecting congestion trends , author=. 2015 IEEE 18th International Conference on Intelligent Transportation Systems , pages=. 2015 , organization=

work page 2015
[2]

Multimedia systems , volume=

Video-based driver action recognition via hybrid spatial--temporal deep learning framework , author=. Multimedia systems , volume=. 2021 , publisher=

work page 2021
[3]

Journal of Intelligent Transportation Systems , volume=

Convolutional neural network for recognizing highway traffic congestion , author=. Journal of Intelligent Transportation Systems , volume=. 2020 , publisher=

work page 2020
[4]

Transportation Research Record , volume=

Traffic congestion detection from camera images using deep convolution neural networks , author=. Transportation Research Record , volume=. 2018 , publisher=

work page 2018
[5]

Journal of advanced transportation , volume=

A deep learning based traffic state estimation method for mixed traffic flow environment , author=. Journal of advanced transportation , volume=. 2022 , publisher=

work page 2022
[6]

2011 14th international IEEE conference on intelligent transportation systems (ITSC) , pages=

Video processing techniques for traffic flow monitoring: A survey , author=. 2011 14th international IEEE conference on intelligent transportation systems (ITSC) , pages=. 2011 , organization=

work page 2011
[7]

IEEE Transactions on intelligent transportation systems , volume=

A review of computer vision techniques for the analysis of urban traffic , author=. IEEE Transactions on intelligent transportation systems , volume=. 2011 , publisher=

work page 2011
[8]

Proceedings of the AAAI conference on artificial intelligence , volume=

Deep spatio-temporal residual networks for citywide crowd flows prediction , author=. Proceedings of the AAAI conference on artificial intelligence , volume=. 2017 , organization=

work page 2017
[9]

IAES International Journal of Artificial Intelligence , volume=

Adaptive real time traffic prediction using deep neural networks , author=. IAES International Journal of Artificial Intelligence , volume=. 2019 , publisher=

work page 2019
[10]

2022 , school=

Traffic congestion detection and optimizing traffic flow using object detection, optical flow and fluid dynamics , author=. 2022 , school=

work page 2022
[11]

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=

Optical-flow features empirical mode decomposition for motion anomaly detection , author=. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=. 2017 , organization=

work page 2017
[12]

Journal of Built Environment, Technology and Engineering , volume=

Deterministic algorithm for traffic detection in free-flow and congestion using video sensor , author=. Journal of Built Environment, Technology and Engineering , volume=

work page
[13]

Journal of industrial information integration , volume=

Anomaly detection in NetFlow network traffic using supervised machine learning algorithms , author=. Journal of industrial information integration , volume=. 2023 , publisher=

work page 2023
[14]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Learning memory-guided normality for anomaly detection , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[15]

IEEE transactions on signal processing , volume=

Variational mode decomposition , author=. IEEE transactions on signal processing , volume=. 2013 , publisher=

work page 2013
[16]

Electronics , volume=

A complex empirical mode decomposition for multivariant traffic time series , author=. Electronics , volume=. 2023 , publisher=

work page 2023
[17]

Mechanical Systems and Signal Processing , volume=

Enhancement of adaptive mode decomposition via angular resampling for nonstationary signal analysis of rotating machinery: Principle and applications , author=. Mechanical Systems and Signal Processing , volume=. 2021 , publisher=

work page 2021
[18]

Proceedings of the AAAI conference on artificial intelligence , volume=

Spatial temporal graph convolutional networks for skeleton-based action recognition , author=. Proceedings of the AAAI conference on artificial intelligence , volume=. 2018 , organization=

work page 2018
[19]

The Journal of Supercomputing , volume=

Spatial-temporal graph convolutional networks for traffic flow prediction considering multiple traffic parameters , author=. The Journal of Supercomputing , volume=. 2023 , publisher=

work page 2023
[20]

IEEE Transactions on Knowledge and Data Engineering , volume=

Spatio-temporal joint graph convolutional networks for traffic forecasting , author=. IEEE Transactions on Knowledge and Data Engineering , volume=. 2023 , publisher=

work page 2023
[21]

Alexandria Engineering Journal , volume=

A combined method for short-term traffic flow prediction based on recurrent neural network , author=. Alexandria Engineering Journal , volume=. 2021 , publisher=

work page 2021
[22]

Pattern Recognition , volume=

A decomposition dynamic graph convolutional recurrent network for traffic forecasting , author=. Pattern Recognition , volume=. 2023 , publisher=

work page 2023
[23]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

The 5th AI City Challenge , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=. 2021 , organization=

work page 2021
[24]

International Transportation Economic Development Conference

Congestion evaluation best practices , author=. International Transportation Economic Development Conference. Sheraton Dallas Hotel, Dallas, USA , pages=. 2014 , organization=

work page 2014
[25]

Texas: Texas Transportation Institute , year=

Urban mobility report texas transportation institute , author=. Texas: Texas Transportation Institute , year=

work page
[26]

Transportation research record , volume=

Real-world carbon dioxide impacts of traffic congestion , author=. Transportation research record , volume=. 2008 , publisher=

work page 2008
[27]

Clean Air Journal , volume=

Ambient air pollution: A global assessment of exposure and burden of disease , author=. Clean Air Journal , volume=

work page
[28]

Science of the total environment , volume=

Quantifying on-road vehicle emissions during traffic congestion using updated emission factors of light-duty gasoline vehicles and real-world traffic monitoring big data , author=. Science of the total environment , volume=. 2022 , publisher=

work page 2022
[29]

Transportation Research Part C: Emerging Technologies , volume=

On feature selection for traffic congestion prediction , author=. Transportation Research Part C: Emerging Technologies , volume=. 2013 , publisher=

work page 2013
[30]

Middle-East Journal of Scientific Research , volume=

A survey on intelligent transportation systems , author=. Middle-East Journal of Scientific Research , volume=

work page
[31]

Transportation Research Part C: Emerging Technologies , volume=

A real-time computer vision system for vehicle tracking and traffic surveillance , author=. Transportation Research Part C: Emerging Technologies , volume=. 1998 , publisher=

work page 1998
[32]

Journal of Intelligent Transportation Systems , volume=

Connected and automated vehicle systems: Introduction and overview , author=. Journal of Intelligent Transportation Systems , volume=. 2018 , publisher=

work page 2018
[33]

2018 3rd International conference on computational systems and information technology for sustainable solutions (CSITSS) , pages=

A review on video based vehicle detection, recognition and tracking , author=. 2018 3rd International conference on computational systems and information technology for sustainable solutions (CSITSS) , pages=. 2018 , organization=

work page 2018
[34]

International Journal of Signal Processing, Image Processing and Pattern Recognition , volume=

Moving object tracking of vehicle detection: A concise review , author=. International Journal of Signal Processing, Image Processing and Pattern Recognition , volume=

work page
[35]

IEEE Transactions on Intelligent Transportation Systems , volume=

Deep learning on traffic prediction: Methods, analysis, and future directions , author=. IEEE Transactions on Intelligent Transportation Systems , volume=. 2021 , publisher=

work page 2021
[36]

IEEE Transactions on Intelligent Transportation Systems , volume=

A hybrid deep learning model with attention-based conv-LSTM networks for short-term traffic flow prediction , author=. IEEE Transactions on Intelligent Transportation Systems , volume=. 2020 , publisher=

work page 2020
[37]

Comprehensive Survey and Analysis of Techniques, Advancements, and Challenges in Video-Based Traffic Surveillance Systems , author=. Int. J. Recent Innov. Trends Comput. Commun , volume=

work page
[38]

2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA) , pages=

Dynamic traffic system based on real time detection of traffic congestion , author=. 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA) , pages=. 2018 , organization=

work page 2018
[39]

Proceedings of the IEEE/CVF international conference on computer vision , pages=

Motion guided attention for video salient object detection , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=. 2019 , organization=

work page 2019
[40]

IEEE Access , volume=

Adaptive signal processing algorithms based on EMD and ITD , author=. IEEE Access , volume=. 2019 , publisher=

work page 2019
[41]

Proceedings of the IEEE/CVF international conference on computer vision , pages=

Tam: Temporal adaptive module for video recognition , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=. 2021 , organization=

work page 2021
[42]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Tea: Temporal excitation and aggregation for action recognition , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=. 2020 , organization=

work page 2020
[43]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

X3d: Expanding architectures for efficient video recognition , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=. 2020 , organization=

work page 2020
[44]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Mvitv2: Improved multiscale vision transformers for classification and detection , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=. 2022 , organization=

work page 2022
[45]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Video swin transformer , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=. 2022 , organization=

work page 2022
[46]

Icml , volume=

Is space-time attention all you need for video understanding? , author=. Icml , volume=. 2021 , organization=

work page 2021
[47]

Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

Understanding traffic density from large-scale web camera data , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=. 2017 , organization=

work page 2017
[48]

2019 IEEE Pune section international conference (PuneCon) , pages=

Hog, lbp and svm based traffic density estimation at intersection , author=. 2019 IEEE Pune section international conference (PuneCon) , pages=. 2019 , organization=

work page 2019
[49]

Proceedings of the IEEE conference on Computer Vision and Pattern Recognition , pages=

A closer look at spatiotemporal convolutions for action recognition , author=. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition , pages=. 2018 , organization=

work page 2018
[50]

Artificial intelligence , volume=

Determining optical flow , author=. Artificial intelligence , volume=. 1981 , publisher=

work page 1981
[51]

Two-Frame Motion Estimation Based on Polynomial Expansion , volume =

Farnebäck, Gunnar , year =. Two-Frame Motion Estimation Based on Polynomial Expansion , volume =. In: Image analysis , doi =

work page
[52]

and Singh, Sameer , title =

Rodriguez-Serrano, Jose A. and Singh, Sameer , title =. Pattern Anal. Appl. , month = nov, pages =. 2012 , issue_date =. doi:10.1007/s10044-012-0269-7 , abstract =

work page doi:10.1007/s10044-012-0269-7 2012
[53] Similarity based vehicle trajectory clustering and anomaly detection. IEEE International Conference on Image Processing 2005, 2005.
[54] Quo vadis, action recognition? A new model and the Kinetics dataset. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[55] Learning spatiotemporal features with 3D convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, 2015.
[56] Two-stream convolutional networks for action recognition in videos. Advances in Neural Information Processing Systems.
[57] CBAM: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[58] Non-local neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[59] Optical flow guided feature: A fast and robust motion representation for video action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[60] Flow-guided feature aggregation for video object detection. Proceedings of the IEEE International Conference on Computer Vision, 2017.
[61] Motion Guided Spatial Attention for Video Captioning. Proceedings of the AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/aaai.v33i01.33018191
[62] Long-term recurrent convolutional networks for visual recognition and description. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[63] Two-stream collaborative learning with spatial-temporal attention for video classification. IEEE Transactions on Circuits and Systems for Video Technology, 2018.
[64] GTA: Global temporal attention for video action understanding. arXiv preprint arXiv:2012.08510.
[65] The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 1998.
[66] Rui, Yikang and Gong, Yannan and Zhao, Yan and Luo, Kaijie and Lu, Wenqi. Sustainability, 2024.
[67] Probabilistic kernels for the classification of auto-regressive visual processes. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
[68] 2025 urban mobility report. 2025.
[69] INRIX global traffic scorecard.
[70] Traffic Monitoring Guide. 2016.
[71] Traffic detector handbook: Volume I. 2006.
[72] Uender Barbosa de Souza and João Paulo Lemos Escola and Leonardo da Cunha Brito. A survey on Hilbert-Huang transform: Evolution, challenges and solutions. 2022. doi:10.1016/j.dsp.2021.103292
[73] MobileNetV2 with Spatial Attention module for traffic congestion recognition in surveillance images. Expert Systems with Applications, 2024.
[74] Traffic congestion recognition based on convolutional neural networks in different scenarios. Engineering Applications of Artificial Intelligence, 2025.
[75] Efficient road traffic video congestion classification based on the multi-head self-attention vision transformer model. Transport and Telecommunication, 2024.
[76] Mitigating and evaluating static bias of action representations in the background and the foreground. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[77] Seeing Beyond the Scene: Analyzing and Mitigating Background Bias in Action Recognition. arXiv preprint arXiv:2512.17953.
[78] GMFlow: Learning optical flow via global matching. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[79] Deep learning-based congestion detection at urban intersections. Sensors, 2021.
[80] Two-stream video-based deep learning model for crashes and near-crashes. Transportation Research Part C: Emerging Technologies, 2024.
