Tabular Data with Class Imbalance: Predicting Electric Vehicle Crash Severity with Pretrained Transformers (TabPFN) and Mamba-Based Models

Gaurab Chhetri; Pavan Hebli; Shriyank Somvanshi; Subasish Das

arXiv: 2509.11449 · v1 · submitted 2025-09-14 · cs.LG · cs.AI


Pith reviewed 2026-05-18 15:59 UTC · model grok-4.3

classification: cs.LG, cs.AI

keywords: electric vehicle crashes, crash severity prediction, tabular data classification, class imbalance, MambaAttention, TabPFN, feature importance


The pith

MambaAttention outperforms TabPFN and MambaNet at classifying severe injuries in electric vehicle crashes through attention-based feature reweighting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper applies deep tabular models to real-world Texas crash data (2017-2023) to predict injury severity in electric vehicle collisions. After filtering to 23,301 EV-only records and applying SMOTEENN resampling to address class imbalance, the authors compare TabPFN, MambaNet, and MambaAttention. Feature ranking with XGBoost and Random Forest highlights intersection relation, first harmful event, person age, crash speed limit, and day of week as the leading signals, along with advanced safety features such as automatic emergency braking. MambaAttention is reported to perform best on the severe-injury class, which the authors attribute to its attention-based feature reweighting, while TabPFN shows strong overall generalization.

Core claim

MambaAttention achieved superior performance in classifying severe injury cases due to its attention-based feature reweighting on the filtered Texas EV crash dataset, whereas TabPFN demonstrated strong generalization across severity levels.

What carries the argument

MambaAttention, which uses attention to reweight tabular features for improved classification of the minority severe-injury class.
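To make "attention to reweight tabular features" concrete, here is a minimal Python sketch of the general idea. It is not the paper's MambaAttention layer, whose exact design is not described on this page. The function names, the hand-supplied scores, and the rescaling by feature count are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reweight_features(features, scores):
    """Scale each feature value by a softmax attention weight.

    `scores` stands in for the output of a learned scoring layer
    (hypothetical here; supplied by hand for illustration).
    """
    weights = softmax(scores)
    # Multiply by the feature count so uniform attention leaves inputs unchanged.
    n = len(features)
    return [n * w * x for w, x in zip(weights, features)], weights

# Hypothetical row of three already-encoded feature values.
row = [1.0, 2.0, 3.0]
reweighted, weights = reweight_features(row, scores=[0.0, 0.0, 0.0])
```

With equal scores every feature keeps its original value. Raising one score shifts weight toward that feature and away from the others, which is the behaviour the summary credits for better minority-class detection.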

If this is right

Intersection relation, speed limit, and automatic emergency braking emerge as top predictors that safety programs can target.
Deep tabular architectures can support data-driven interventions to reduce severe outcomes in EV collisions.
Attention mechanisms in sequence models improve minority-class detection in imbalanced tabular safety data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same pipeline could be applied to non-EV or multi-state crash datasets to test consistency of the top predictors.
Embedding these predictions into real-time vehicle or infrastructure systems might allow earlier safety alerts.
Removing the resampling step and retraining on raw class distributions would reveal how much the reported gains depend on synthetic balancing.

Load-bearing premise

The filtered Texas EV crash records are representative of broader EV crashes and SMOTEENN resampling preserves the original relationships between features and severe-injury labels.

What would settle it

Testing the three models on an independent set of EV crash records from another state or recent year without any resampling and measuring whether MambaAttention still leads on the severe-injury class.
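The comparison described here comes down to per-class metrics on untouched held-out records. A minimal sketch of that measurement in plain Python follows. The label strings and the toy predictions are hypothetical and are not taken from the paper.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class, treating it as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical held-out labels and predictions from one model.
y_true = ["severe", "minor", "minor", "severe", "none", "none"]
y_pred = ["severe", "minor", "severe", "minor", "none", "none"]
severe = class_metrics(y_true, y_pred, positive="severe")
```

Running the same function on each model's predictions for the same un-resampled test set gives directly comparable severe-class scores.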

Figures

Figures reproduced from arXiv: 2509.11449 by Gaurab Chhetri, Pavan Hebli, Shriyank Somvanshi, Subasish Das.

**Figure 1.** Study Design

**Figure 2.** Variable Selection Using XGBoost and Random Forest

**Figure 3.** Before and after feature distribution analysis with SMOTEENN. (a) Feature distribution of first harmful event, same ...

**Figure 6.** Training performance of the MambaAttention model

Original abstract

This study presents a deep tabular learning framework for predicting crash severity in electric vehicle (EV) collisions using real-world crash data from Texas (2017-2023). After filtering for electric-only vehicles, 23,301 EV-involved crash records were analyzed. Feature importance techniques using XGBoost and Random Forest identified intersection relation, first harmful event, person age, crash speed limit, and day of week as the top predictors, along with advanced safety features like automatic emergency braking. To address class imbalance, Synthetic Minority Over-sampling Technique and Edited Nearest Neighbors (SMOTEENN) resampling was applied. Three state-of-the-art deep tabular models, TabPFN, MambaNet, and MambaAttention, were benchmarked for severity prediction. While TabPFN demonstrated strong generalization, MambaAttention achieved superior performance in classifying severe injury cases due to its attention-based feature reweighting. The findings highlight the potential of deep tabular architectures for improving crash severity prediction and enabling data-driven safety interventions in EV crash contexts.
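For readers unfamiliar with SMOTEENN, the sketch below shows its two stages on toy data: SMOTE-style interpolation between minority rows, followed by an Edited Nearest Neighbours pass that drops rows whose neighbours mostly disagree with their label. This is a simplified pure-Python illustration, not the imbalanced-learn implementation and not the authors' configuration. The function names, the majority-vote removal rule, and the parameter values are assumptions.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(idx, X, candidates, k):
    """Indices of the k candidates closest to X[idx], excluding idx itself."""
    others = [j for j in candidates if j != idx]
    return sorted(others, key=lambda j: sq_dist(X[idx], X[j]))[:k]

def smote(X, y, minority, n_new, k=2, seed=0):
    """Append n_new synthetic minority rows, each interpolated toward a minority neighbour."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    X_out, y_out = [list(r) for r in X], list(y)
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        j = rng.choice(nearest(i, X, minority_idx, k))
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_out.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
        y_out.append(minority)
    return X_out, y_out

def edited_nearest_neighbours(X, y, k=3):
    """Keep only rows whose label matches a strict majority of their k nearest neighbours."""
    keep = []
    everyone = range(len(X))
    for i in everyone:
        votes = [y[j] for j in nearest(i, X, everyone, k)]
        if votes.count(y[i]) * 2 > len(votes):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data: five majority rows (label 0) and three minority rows (label 1).
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.2, 0.1],
     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
y = [0, 0, 0, 0, 0, 1, 1, 1]
X_bal, y_bal = smote(X, y, minority=1, n_new=2, k=2)
X_clean, y_clean = edited_nearest_neighbours(X_bal, y_bal, k=3)
```

Because the synthetic rows are interpolations of existing minority rows, the neighbour count and the amount of oversampling shape the minority distribution the model sees, which is why those settings matter for replication.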

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a domain application of TabPFN and Mamba variants to Texas EV crash data with SMOTEENN resampling, but the performance claims rest on unshown metrics and an untested explanation for why one model wins.


The paper takes recent tabular deep learning tools and applies them to predict injury severity in a filtered set of 23k Texas EV crashes from 2017-2023. The core move is benchmarking TabPFN, MambaNet, and MambaAttention after SMOTEENN resampling, plus tree-based feature rankings that flag intersection relation, first harmful event, person age, speed limit, and automatic emergency braking as useful signals. That combination on this specific dataset appears to be new, and the timing makes sense given rising EV numbers and the practical value for safety work. The feature list itself looks reasonable and aligns with what traffic safety people already track. The authors also avoid the usual trap of claiming a brand-new architecture; they are mostly testing off-the-shelf options on a real imbalance problem. That keeps the contribution modest but honest.

The main weakness is that the abstract and visible summary give zero numeric results: no accuracy, no F1 on the severe class, no confidence intervals, and no train-test details. Without those numbers it is impossible to tell whether MambaAttention's edge is large enough to matter or just noise. The stated reason for its win, attention-based feature reweighting, also lacks any supporting ablation or internal check in what is shown, so the causal story stays speculative. The single-state scope and the known risk that SMOTEENN can distort minority-class relationships further limit how far the results travel.

This work is mainly for traffic-safety analysts or applied ML groups who already handle tabular crash data and want to see how these newer models behave on an EV subset. A reader looking for methodological novelty or tightly controlled experiments will find little. It is coherent enough on its own terms to deserve a serious referee, mainly so the authors can supply the missing metrics, splits, and any ablation that would back the attention claim. I would send it out for review with a clear request for those additions rather than desk-reject it outright.

Referee Report

3 major / 2 minor

Summary. The manuscript presents a deep tabular learning framework for predicting crash severity in electric vehicle collisions using 23,301 filtered Texas crash records (2017-2023). It applies XGBoost and Random Forest for feature importance (highlighting intersection relation, first harmful event, person age, crash speed limit, day of week, and advanced safety features), uses SMOTEENN to address class imbalance, and benchmarks TabPFN, MambaNet, and MambaAttention, claiming superior severe-injury classification performance for MambaAttention due to attention-based feature reweighting.

Significance. If the performance claims are substantiated with quantitative metrics, ablations, and validation details, the work could illustrate the applicability of state-of-the-art tabular deep learning models (including Mamba variants) to imbalanced, safety-critical transportation datasets and support data-driven EV safety interventions.

major comments (3)

[Abstract] Abstract: the central claim that MambaAttention 'achieved superior performance in classifying severe injury cases due to its attention-based feature reweighting' is unsupported; the abstract (and by extension the manuscript) supplies no numeric metrics, confidence intervals, ablation results, train-test split details, or hyperparameter search information to ground the benchmarking results.
[Results] Results section: the attribution of MambaAttention superiority specifically to attention-based feature reweighting lacks any ablation (e.g., MambaAttention without the attention component or MambaNet augmented with attention) or internal analysis (attention weights versus tree-based feature importance) that would isolate the mechanism from other differences in state-space modeling, capacity, or optimization.
[Data and Methods] Data and Methods: the representativeness of the filtered Texas EV-only records and the claim that SMOTEENN does not distort feature-severe injury relationships are load-bearing for generalizability but receive no sensitivity analysis to resampling parameters or external validation.

minor comments (2)

[Abstract] Abstract: the statement that advanced safety features like automatic emergency braking are among the top predictors should include their specific importance scores or ranking positions from the XGBoost/Random Forest analysis.
[General] General: the exact architectural differences between MambaNet and MambaAttention (e.g., how attention is integrated) and the precise implementation of TabPFN fine-tuning should be detailed for reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate the specific revisions planned to strengthen the quantitative support, mechanistic analysis, and robustness checks.


Referee: [Abstract] Abstract: the central claim that MambaAttention 'achieved superior performance in classifying severe injury cases due to its attention-based feature reweighting' is unsupported; the abstract (and by extension the manuscript) supplies no numeric metrics, confidence intervals, ablation results, train-test split details, or hyperparameter search information to ground the benchmarking results.

Authors: We agree the abstract should foreground key quantitative results. The Results section already reports model performance via F1-score, precision, and recall on the severe-injury class, together with an 80/20 stratified train-test split and grid-search hyperparameter details. We will revise the abstract to include the concrete metrics (e.g., MambaAttention F1 on severe cases versus baselines) and a concise statement of the validation protocol.
revision: yes

Referee: [Results] Results section: the attribution of MambaAttention superiority specifically to attention-based feature reweighting lacks any ablation (e.g., MambaAttention without the attention component or MambaNet augmented with attention) or internal analysis (attention weights versus tree-based feature importance) that would isolate the mechanism from other differences in state-space modeling, capacity, or optimization.

Authors: We acknowledge that the current text infers the benefit of attention from model architecture and overall results without isolating experiments. We will add an ablation that removes the attention module from MambaAttention and compares it directly to MambaNet, plus a side-by-side comparison of learned attention weights against the XGBoost/Random-Forest feature importances. These analyses will appear in a new subsection of the revised Results.
revision: yes

Referee: [Data and Methods] Data and Methods: the representativeness of the filtered Texas EV-only records and the claim that SMOTEENN does not distort feature-severe injury relationships are load-bearing for generalizability but receive no sensitivity analysis to resampling parameters or external validation.

Authors: The 23,301 records constitute the complete filtered Texas EV crash population for 2017-2023; we will state this explicitly and note the geographic scope as a limitation. We will add a sensitivity study varying SMOTEENN sampling ratios and nearest-neighbor counts, reporting effects on both feature distributions and downstream F1 scores. External validation on additional state datasets is not feasible with currently available data and will be listed as future work in the Discussion.
revision: partial
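
The exchange above mentions an 80/20 stratified train-test split. As a rough illustration of what that protocol involves, here is a minimal pure-Python sketch. The function name, the seed, and the toy class counts are assumptions and are not taken from the manuscript.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split at roughly test_fraction."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test row per class, so rare classes are always evaluated.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical imbalanced label column: 80 / 15 / 5 rows per class.
labels = ["none"] * 80 + ["minor"] * 15 + ["severe"] * 5
train_idx, test_idx = stratified_split(labels)
```

In a setup like this, any resampling would normally be applied to the rows in `train_idx` only, so that the test rows keep the original class distribution and contain no synthetic records.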

Circularity Check

0 steps flagged

No circularity: empirical benchmarking on external crash records


The manuscript is a standard empirical study that filters real Texas EV crash records (2017-2023), applies SMOTEENN resampling, extracts feature importance via independent tree models (XGBoost, Random Forest), and benchmarks three off-the-shelf deep tabular architectures (TabPFN, MambaNet, MambaAttention) on held-out data. Reported performance numbers are direct measurements of model predictions against ground-truth severity labels; no equations, fitted parameters, or self-citations are used to define or derive those numbers. The interpretive claim that MambaAttention superiority stems from attention-based reweighting is an after-the-fact explanation of benchmark deltas rather than a mathematical reduction to the paper's own inputs. The work therefore contains no load-bearing step that collapses to self-definition, fitted-input-as-prediction, or self-citation chains.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central empirical claim rests on the assumption that the filtered Texas dataset and the chosen resampling procedure preserve the true feature-label relationships; no new physical axioms or invented entities are introduced.

free parameters (2)

SMOTEENN resampling parameters
Number of neighbors and sampling ratios chosen to balance classes; values not stated but required for exact replication.
MambaAttention hyperparameters
Attention heads, state dimension, and learning-rate schedule fitted during benchmarking.

axioms (1)

Domain assumption: Texas crash records 2017-2023 after electric-vehicle filtering are representative of future EV crashes.
Invoked when generalizing performance results to safety interventions.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

Paper passage: "MambaAttention achieved superior performance in classifying severe injury cases due to its attention-based feature reweighting"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models
cs.LG · 2025-11 · unverdicted · novelty 6.0

TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast productio...

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1] D. Moore, R. Currano, and D. Sirkin, “Sound decisions: How synthetic motor sounds improve autonomous vehicle-pedestrian interactions,” in 12th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, 2020, pp. 94–103.

[2] F. Yi, W. Zhang, and W. Zhou, “Functional safety design for torque control of a pure electric vehicle,” in 2021 9th International Symposium on Next Generation Electronics (ISNE). IEEE, 2021, pp. 1–4.

[3] J. In, J. Ma, and H. Kim, “Development of a new electric vehicle post-crash fire safety test in Korea (proposed for the Korean New Car Assessment Program),” World Electric Vehicle Journal, vol. 16, no. 2, p. 103, 2025.

[4] K. Aziz, F. Chen, I. Khan, S. H. Khahro, A. M. Muhammad, Z. A. Memon, and A. Khattak, “Road traffic crash severity analysis: a Bayesian-optimized dynamic ensemble selection guided by instance hardness and region of competence strategy,” IEEE Access, 2024.

[5] Y. Yang, K. Wang, Z. Yuan, and D. Liu, “Predicting freeway traffic crash severity using XGBoost-Bayesian network model with consideration of features interaction,” Journal of Advanced Transportation, vol. 2022, no. 1, p. 4257865, 2022.

[6] A. Rafe, M. A. Arman, and P. A. Singleton, “A comparative study using generalized ordered probit, stacking ensemble, and TabNet: Application to determinants of pedestrian crash severity,” Data Science for Transportation, vol. 6, no. 2, p. 13, 2024.

[7] S. Somvanshi, A. G. Tusti, R. Chakraborty, and S. Das, “Applying tabular deep learning models to estimate crash injury types of young motorcyclists,” arXiv preprint arXiv:2503.10474, 2025.

[8] G. J. Sequeira, E. Elnagdy, G. Danapal, R. Lugner, U. Jumar, and T. Brandmeier, “Investigation of different classification algorithms for predicting occupant injury criterion to decide the required restraint strategy,” in 2021 IEEE International Intelligent Transportation Systems Conference (ITSC). IEEE, 2021, pp. 204–210.

[9] S. Das, A. Dutta, M. Jalayer, A. Bibeka, and L. Wu, “Factors influencing the patterns of wrong-way driving crashes on freeway exit ramps and median crossovers: Exploration using ‘eclat’ association rules to promote safety,” International Journal of Transportation Science and Technology, vol. 7, no. 2, pp. 114–123, 2018. [Online]. Available: https://www.sci...

[10] S. Das, Artificial Intelligence in Highway Safety, 1st ed. Boca Raton, FL: CRC Press, Taylor & Francis Group, 2022.

[11] M. Chakraborty, T. J. Gates, and S. Sinha, “Causal analysis and classification of traffic crash injury severity using machine learning algorithms,” Data Science for Transportation, vol. 5, no. 2, p. 12, 2023.

[12] A. Scarano, M. Sadeghi, F. Mauriello, M. R. Riccardi, K. Aghabayk, and A. Montella, “Cyclist crash severity modeling: A hybrid approach of XGBoost-SHAP and random parameters logit with heterogeneity in means and variances,” Journal of Safety Research, vol. 93, pp. 373–398, 2025.

[13] S. Somvanshi, S. Das, S. A. Javed, G. Antariksa, and A. Hossain, “A survey on deep tabular learning,” arXiv preprint arXiv:2410.12034, 2024.

[14] S. Somvanshi, R. Chakraborty, S. Das, and A. K. Dutta, “Crash severity analysis of child bicyclists using ARM-Net and MambaNet,” in 2025 IEEE Conference on Artificial Intelligence (CAI), 2025, pp. 821–824.

[15] N. Hollmann, S. Müller, K. Eggensperger, and F. Hutter, “TabPFN: A transformer that solves small tabular classification problems in a second,” arXiv preprint arXiv:2207.01848, 2022.

[16] Y. Gorishniy, I. Rubachev, V. Khrulkov, and A. Babenko, “Revisiting deep learning models for tabular data,” Advances in Neural Information Processing Systems, vol. 34, pp. 18932–18943, 2021.

[17] G. Somepalli, M. Goldblum, A. Schwarzschild, C. B. Bruss, and T. Goldstein, “SAINT: Improved neural networks for tabular data via row attention and contrastive pre-training,” arXiv preprint arXiv:2106.01342, 2021.

[18] A. F. Thielmann, M. Kumar, C. Weisser, A. Reuter, B. Säfken, and S. Samiee, “Mambular: A sequential model for tabular deep learning,” arXiv preprint arXiv:2408.06291, 2024.

[19] M. A. Ahamed and Q. Cheng, “MambaTab: A plug-and-play model for learning tabular data,” in 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE, 2024, pp. 369–375.

[20] S. Somvanshi, M. M. Islam, M. S. Mimi, S. B. B. Polock, G. Chhetri, and S. Das, “From S4 to Mamba: A comprehensive survey on structured state space models,” arXiv preprint arXiv:2503.18970, 2025. Listed on its work page as “Advancing Intelligent Sequence Modeling: Evolution, Trade-offs, and Applications of State-Space Architectures from S4 to Mamba.”

[21] G. Husain, D. Nasef, R. Jose, J. Mayer, M. Bekbolatova, T. Devine, and M. Toma, “SMOTE vs. SMOTEENN: A study on the performance of resampling algorithms for addressing class imbalance in regression models,” Algorithms, vol. 18, no. 1, p. 37, 2025.

[22] G. Zhai, K. Xie, D. Yang, and H. Yang, “Do electric vehicles lead to more severe crashes? A doubly robust-based causal inference approach,” SSRN Preprint, 2025.

[23] A. M. Salang, S. F. Javier, J. Ballarta, and J. E. Taguiam, “Spatio-temporal analysis and severity analysis using machine learning classifiers for electric vehicle crashes data of Metro Manila, Philippines,” Journal of the Eastern Asia Society for Transportation Studies, vol. 15, pp. 3207–3227, 2024.

[24] J. B. Cicchino, “Effectiveness of forward collision warning and autonomous emergency braking systems in reducing front-to-rear crash rates,” Accident Analysis & Prevention, vol. 99, pp. 142–152, 2017.

[25] A. Aukema, K. Berman, T. Gaydos, T. Sienknecht et al., “A study on real-world effectiveness of model year 2015–2023 advanced driver assistance systems,” The MITRE Corporation and Partnership for Analytics Research in Traffic Safety, McLean, VA, Tech. Rep., January 2025.

[26] Texas Department of Transportation, “Crash reports and records,” 2025, accessed July 21, 2025. [Online]. Available: https://www.txdot.gov/data-maps/crash-reports-records.html
