JAX: composable transformations of Python+NumPy programs, 2018
6 Pith papers cite this work.
citing papers explorer
- CRONOS: Enhancing Deep Learning with Scalable GPU Accelerated Convex Neural Networks
  CRONOS introduces scalable convex optimization for two-layer neural networks reaching ImageNet scale, with CRONOS-AM extending to arbitrary multi-layer architectures while matching tuned deep learning performance.
- PaLI: A Jointly-Scaled Multilingual Language-Image Model
  PaLI jointly scales a 4B-parameter vision transformer with language models on a new 10B multilingual image-text dataset to reach state-of-the-art results on vision-language tasks while keeping a simple modular design.
- Perceiver IO: A General Architecture for Structured Inputs & Outputs
  Perceiver IO is a general architecture that processes arbitrary structured inputs and outputs with linear scaling and achieves strong results on GLUE, Sintel optical flow, multi-task reasoning, and StarCraft II without task-specific components.
- Deep Learning Alternatives of the Kolmogorov Superposition Theorem
  ActNet is a new KST-based neural network that outperforms KANs and competes with MLPs in PINN benchmarks for PDE simulation tasks.
- Gemini: A Family of Highly Capable Multimodal Models
  Gemini Ultra reaches human-expert performance on MMLU for the first time and sets new state-of-the-art results on 30 of 32 benchmarks, including all 20 multimodal ones tested.
- Sinc Kolmogorov-Arnold network and its application for solving PDEs with singularities