Martin Jaggi

Identifiers

  • name variant Martin Jaggi 0.60 · backfill

Papers (49)

  1. Local MixVR: Breaking the Communication-Sample Dependence in Distributed Learning cs.LG · 2026 · author #4
  2. Apertus LLM Family Expansion via Distillation and Quantization cs.LG · 2026 · author #3
  3. Toward Cross-Lingual Quality Classifiers for Multilingual Pretraining Data Selection cs.CL · 2026 · author #4
  4. An Engineering Journey Training Large Language Models at Scale on Alps: The Apertus Experience cs.DC · 2026 · author #17
  5. Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining cs.CL · 2025 · author #4
  6. A Split-Client Approach to Second-Order Optimization math.OC · 2025 · author #2
  7. MEDITRON-70B: Scaling Medical Pretraining for Large Language Models cs.CL · 2023 · author #19
  8. Correlating Twitter Language with Community-Level Health Outcomes cs.CL · 2019 · author #5
  9. Better Word Embeddings by Disentangling Contextual n-Gram Information cs.CL · 2019 · author #3
  10. Crosslingual Document Embedding as Reduced-Rank Ridge Regression cs.CL · 2019 · author #4
  11. Overcoming Multi-Model Forgetting cs.LG · 2019 · author #4
  12. Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication cs.LG · 2019 · author #3
  13. Error Feedback Fixes SignSGD and other Gradient Compression Schemes cs.LG · 2019 · author #4
  14. Efficient Greedy Coordinate Descent for Composite Problems math.OC · 2018 · author #4
  15. Sparsified SGD with Memory cs.LG · 2018 · author #3
  16. COLA: Decentralized Linear Learning cs.DC · 2018 · author #3
  17. A Distributed Second-Order Algorithm You Can Trust cs.LG · 2018 · author #6
  18. Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients cs.LG · 2018 · author #3
  19. Training DNNs with Hybrid Block Floating Point cs.LG · 2018 · author #3
  20. On Matching Pursuit and Coordinate Descent stat.ML · 2018 · author #7
  21. Simple Unsupervised Keyphrase Extraction using Sentence Embeddings cs.CL · 2018 · author #5
  22. An Accelerated Communication-Efficient Primal-Dual Optimization Framework for Structured Machine Learning math.OC · 2017 · author #2
  23. Safe Adaptive Importance Sampling cs.LG · 2017 · author #3
  24. Efficient Use of Limited-Memory Accelerators for Linear Learning on Heterogeneous Systems cs.LG · 2017 · author #3
  25. Learning Aerial Image Segmentation from Online Maps cs.CV · 2017 · author #4
  26. Unsupervised robust nonparametric learning of hidden community properties stat.ML · 2017 · author #3
  27. Approximate Steepest Coordinate Descent cs.LG · 2017 · author #3
  28. Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees cs.LG · 2017 · author #4
  29. Generating Steganographic Text with LSTMs cs.AI · 2017 · author #2
  30. Faster Coordinate Descent via Adaptive Importance Sampling cs.LG · 2017 · author #3
  31. Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features cs.CL · 2017 · author #3
  32. Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification cs.CL · 2017 · author #8
  33. A Unified Optimization View on Generalized Matching Pursuit and Frank-Wolfe cs.LG · 2017 · author #4
  34. CoCoA: A General Framework for Communication-Efficient Distributed Optimization cs.LG · 2016 · author #6
  35. Screening Rules for Convex Problems math.OC · 2016 · author #5
  36. Primal-Dual Rates and Certificates cs.LG · 2016 · author #4
  37. Pursuits in Structured Non-Convex Matrix Factorizations cs.LG · 2016 · author #3
  38. Distributed Optimization with Arbitrary Local Solvers cs.LG · 2015 · author #3
  39. L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework cs.LG · 2015 · author #4
  40. On the Global Linear Convergence of Frank-Wolfe Optimization Variants math.OC · 2015 · author #2
  41. Adding vs. Averaging in Distributed Primal-Dual Optimization cs.LG · 2015 · author #3
  42. Communication-Efficient Distributed Dual Coordinate Ascent cs.LG · 2014 · author #1
  43. An Affine Invariant Linear Convergence Analysis for Frank-Wolfe Algorithms math.OC · 2013 · author #2
  44. An Equivalence between the Lasso and Support Vector Machines cs.LG · 2013 · author #1
  45. An Optimal Affine Invariant Smooth Minimization Algorithm math.OC · 2013 · author #3
  46. Block-Coordinate Frank-Wolfe Optimization for Structural SVMs cs.LG · 2012 · author #2
  47. Convex Optimization without Projection Steps math.OC · 2011 · author #1
  48. A Combinatorial Algorithm to Compute Regularization Paths cs.LG · 2009 · author #3
  49. An Exponential Lower Bound on the Complexity of Regularization Paths cs.LG · 2009 · author #2

Mentions

  • 1108.1170 #1 · arxiv_oai · confidence 0.70 Martin Jaggi
  • 1502.03508 #3 · backfill · confidence 0.70 Martin Jaggi
  • 2606.01128 #4 · arxiv_oai · confidence 0.70 Martin Jaggi
  • 1409.1458 #1 · backfill · confidence 0.70 Martin Jaggi
  • 1312.7864 #2 · backfill · confidence 0.70 Martin Jaggi
  • 2605.29128 #3 · arxiv_oai · confidence 0.70 Martin Jaggi
  • 1303.1152 #1 · backfill · confidence 0.70 Martin Jaggi
  • 1301.0465 #3 · backfill · confidence 0.70 Martin Jaggi
  • 1207.4747 #2 · backfill · confidence 0.70 Martin Jaggi
  • 2311.16079 #19 · arxiv_oai · confidence 0.70 Martin Jaggi
  • 1108.1170 #1 · backfill · confidence 0.70 Martin Jaggi
  • 2510.15714 #2 · arxiv_oai · confidence 0.70 Martin Jaggi
  • 0903.4856 #3 · backfill · confidence 0.70 Martin Jaggi
  • 0903.4817 #2 · backfill · confidence 0.70 Martin Jaggi

Frequent Coauthors