Canonical reference

A comprehensive survey on pre-trained foundation models: A history from BERT to ChatGPT

2023 · arXiv 2302.09419

Canonical reference. All classified citations from Pith papers cite this work as background.

9 Pith papers citing it

Background: 100% of classified citations

citation-role summary

background 6

citation-polarity summary

background 6

representative citing papers

GeoGNN: Time Series Geo-Localization using Two-Tower Graph Neural Networks

cs.LG · 2026-06-06 · unverdicted · novelty 6.0

GeoGNN is a two-tower GNN that learns geographic cell embeddings from adjacency graphs and matches them to temporal representations via dot-product similarity plus classification, improving geolocalization accuracy by ~27% on electricity datasets.

ADAPTive Input Training for Many-to-One Pre-Training on Time-Series Classification

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

ADAPT is a new pre-training paradigm that aligns physical properties of time-series data to allow simultaneous training on 162 diverse classification datasets, achieving new state-of-the-art performance.

Mixture-of-Experts Transformer for Automatic Modulation Recognition

eess.SP · 2026-06-08 · unverdicted · novelty 5.0

MoEformer uses temporal resampling, input-dependent gating, and RoPE in a Transformer to achieve 63.74%, 66.24%, and 64.22% average accuracy on RadioML2016.10a, 2016.10b, and 2018.01A benchmarks.

FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion

cs.LG · 2026-04-21 · unverdicted · novelty 5.0

FedProxy replaces weak adapters with a proxy SLM for federated LLM fine-tuning, outperforming prior methods and approaching centralized performance via compression, heterogeneity-aware aggregation, and training-free fusion.

Can LLMs Generate and Solve Linguistic Olympiad Puzzles?

cs.CL · 2025-09-26 · unverdicted · novelty 4.0

LLMs like o1 outperform humans on most linguistic olympiad puzzle types except writing systems and understudied languages, with insights applied to the new task of puzzle generation.

Large Language Models: A Survey

cs.CL · 2024-02-09 · accept · novelty 3.0

The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.

A Survey of Large Language Models

cs.CL · 2023-03-31 · accept · novelty 3.0

This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.

Small Language Models (SLMs) Can Still Pack a Punch: A survey (updated 2026)

cs.CL · 2025-01-03 · unverdicted · novelty 2.0

A literature survey of Small Language Models (1-8B parameters) that can perform comparably to or better than larger models, covering general-purpose and task-specific approaches plus creation techniques.

A Comprehensive Overview of Large Language Models

cs.CL · 2023-07-12 · unverdicted · novelty 2.0

A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.

citing papers explorer

Showing 3 of 3 citing papers after filters.

GeoGNN: Time Series Geo-Localization using Two-Tower Graph Neural Networks

cs.LG · 2026-06-06 · unverdicted · none · ref 68

ADAPTive Input Training for Many-to-One Pre-Training on Time-Series Classification

cs.LG · 2026-04-09 · unverdicted · none · ref 18

FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion

cs.LG · 2026-04-21 · unverdicted · none · ref 28
