(2017) Focal loss for dense object detection
19 Pith papers cite this work.
citing papers explorer
- End-to-End Text Line Detection and Ordering
  Orli is an autoregressive image-to-sequence model that jointly detects text lines and determines their reading order on historical documents via chord-frame baselines, trained on 196k pages across ten scripts.
- TRAVELFRAUDBENCH: A Configurable Evaluation Framework for GNN Fraud Ring Detection in Travel Networks
  TravelFraudBench is a new configurable benchmark for GNN-based fraud ring detection in travel networks, simulating star, clique, and chain topologies and showing GraphSAGE outperforming MLP baselines on AUC and ring recovery.
- Towards Symmetry-sensitive Pose Estimation: A Rotation Representation for Symmetric Object Classes
  SARR modifies trigonometric rotation encodings with object symmetry orders to produce unique continuous poses, enabling standard CNNs to outperform existing methods on symmetry-aware 6D pose estimation without custom losses or 3D models.
- FRTSearch: Unified Detection and Parameter Inference of Fast Radio Transients using Instance Segmentation
  FRTSearch reframes fast radio transient detection as instance segmentation on dynamic spectra and uses the segmented shapes to infer dispersion measure and time of arrival, achieving 98% recall with over 99.9% fewer false positives than traditional methods.
- BRIDGE and TCH-Net: Heterogeneous Benchmark and Multi-Branch Baseline for Cross-Domain IoT Botnet Detection
  BRIDGE creates the first formal heterogeneous multi-dataset benchmark for IoT botnet detection with LODO evaluation, and TCH-Net achieves mean LODO F1 of 0.5577 while reaching F1 0.8296 on standard tests, outperforming twelve baselines.
- Venice-H1: Failure-Aware Query Re-Ranking with Multi-Scale Grid Signatures for Referring Image Segmentation
  Venice-H1 improves failure-case mIoU by 0.89-1.40 points in referring image segmentation via multi-scale grid signatures and a failure-aware re-ranker, with positive CIs on all tested pairs and low harmful-switch rates.
- Redact or Keep? A Fully Local AI Cascade for Educational Dialogue De-Identification
  A local cascade framework for educational dialogue de-identification reaches 0.958 macro F1 on math tutoring transcripts, outperforming same-family LLM-only and commercial baselines while remaining fully on-device.
- Tetris: Tile-level Sampling for Efficient and High-Fidelity Video Object Tracking
  Tetris decomposes stationary videos into tile polyominoes and applies classifier plus ILP pruning to cut detector calls, staying within 5% accuracy loss while delivering up to 17.4x throughput gains over priors.
- Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training
  Learning-Zone Energy is a new online data selection framework for RL post-training that retains 40% of data per step yet matches or exceeds full-data baselines on math tasks with 36% lower FLOPs.
- Modulation Feature Enhancement with a Multi-Stage Attention Network for Underwater Acoustic Target Recognition
  A 1-D CNN with novel multi-stage spectral attention mechanisms and adjustable class-balanced focal loss improves recognition accuracy on real ship-radiated noise datasets.
- SEAGAN: domain-Specific and Edge-Aware Graph Attention Network for Dynamic Plant Processes
  SEAGAN applies a domain-specific graph attention network to classify limitation states in A-Ci curves, achieving F1-score 0.857 and accuracy 0.882 on synthetic data with known ground truth.
- Towards Family-Grouped Hierarchical Federated Learning on Sub-5KB Models: A Feasibility Study of Privacy-Preserving ECG Monitoring for Ultra-Resource-Constrained Wearables
  Family-FL uses family-level aggregation in a three-tier setup with a sub-5KB quantized CNN-LSTM to cut communication by 76.7% versus FedAvg while reaching 91.9% accuracy on MIT-BIH arrhythmia data.
- Predicting Associations between Solar Flares and Coronal Mass Ejections Using SDO/HMI Magnetograms and a Hybrid Neural Network
  Hybrid neural network predicts eruptive versus confined solar flares from SDO/HMI magnetogram sequences, reports good performance, and links results to magnetic flux cancellation in polarity inversion lines.
- InternLM2 Technical Report
  InternLM2 is a new open-source LLM that outperforms prior versions on 30 benchmarks and long-context tasks through scaled pre-training to 32k tokens and a conditional online RLHF alignment strategy.
- Opportunistic Bone-Loss Screening from Routine Knee Radiographs Using a Multi-Task Deep Learning Framework with Sensitivity-Constrained Threshold Optimization
  STR-Net achieves AUROC of 0.933 for binary bone-loss screening and 0.801 correlation for T-score estimation from knee X-rays on a held-out test set.
- SG-UniBuc-NLP at SemEval-2026 Task 6: Multi-Head RoBERTa with Chunking for Long-Context Evasion Detection
  A multi-head RoBERTa model with overlapping chunking and max-pooling achieves Macro-F1 of 0.80 on 3-way clarity classification and 0.51 on 9-way evasion strategy detection, ranking 11th in both subtasks of SemEval-2026 Task 6.
- Towards Accurate and Efficient Waste Image Classification: A Hybrid Deep Learning and Machine Learning Approach
  A hybrid deep learning plus classical ML pipeline for waste image classification reaches up to 100% accuracy on TrashNet and a corrected household dataset while cutting feature dimensionality by over 95%.