and Pfister, Tomas , month = dec, year =
4 Pith papers cite this work.
citing papers explorer
- The Chandra-Gaia Catalog of Counterparts: Resolving ambiguous Gaia matches to X-ray sources in the Chandra Source Catalog using Machine Learning
  A LightGBM classifier trained on NWAY Bayesian matches identifies true Chandra-Gaia counterparts for 113k X-ray sources, flags 7k ambiguous cases, and attributes half of 20k separation-only matches to chance coincidences, validated at 95% on COUP without positional features.
- ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold
  ReSS extracts decision paths from trees as scaffolds to guide LLM reasoning generation, fine-tunes the LLM on the resulting dataset with scaffold-invariant augmentation, and reports up to 10% gains on medical and financial tabular benchmarks with new faithfulness metrics.
- From Uniform to Learned Knots: A Study of Spline-Based Numerical Encodings for Tabular Deep Learning
  Spline encodings for numerical features show task-dependent performance in tabular deep learning, with piecewise-linear encoding robust for classification and variable results for regression depending on spline family, knot strategy, and backbone.
- TabTransformer: Tabular Data Modeling Using Contextual Embeddings
  TabTransformer uses Transformer self-attention to generate contextual embeddings from categorical features in tabular data, outperforming prior deep learning methods by at least 1% mean AUC and matching tree-based ensembles on 15 public datasets while showing robustness to missing and noisy features.