arxiv: 2511.17972 · v2 · submitted 2025-11-22 · ❄️ cond-mat.mtrl-sci

PyAPX: Python toolkit for atomic configuration pattern exploration

Pith reviewed 2026-05-17 06:47 UTC · model grok-4.3

classification ❄️ cond-mat.mtrl-sci
keywords: atomic configuration, Bayesian search, materials discovery, encoding methods, crystal structure, machine learning, h-BCN

The pith

The PyAPX toolkit searches for stable atomic configurations using new encodings that, in tests on the h-BCN system, converge faster than one-hot encoding.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

PyAPX is a Python toolkit that performs Bayesian searches to identify stable atomic arrangements within fixed crystal structures and compositions. Properties can still vary based on how atoms occupy crystallographic sites, so the toolkit focuses on efficient exploration of these configurations. It introduces specialized encoding methods for this search task and tests them on the h-BCN system, where they achieve better convergence than standard one-hot encoding. The approach aims to enable more detailed material design by combining first-principles calculations with machine learning for configuration optimization.
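
To give a sense of why efficient exploration matters, the size of the configuration space at fixed structure and composition can be estimated with a multinomial coefficient. The sketch below is illustrative only: the 18-site cell with equal B, C, and N counts is a hypothetical example, not a setting taken from the paper, and the count ignores symmetry-equivalent arrangements.

```python
from math import factorial


def count_configurations(site_counts):
    """Count the ways to place atoms on labelled sites for a fixed composition.

    site_counts maps each species to its number of atoms. The result is the
    multinomial coefficient and does not merge symmetry-equivalent patterns.
    """
    total = sum(site_counts.values())
    ways = factorial(total)
    for n in site_counts.values():
        ways //= factorial(n)
    return ways


# Hypothetical composition: 6 B, 6 C and 6 N atoms on 18 sites.
print(count_configurations({"B": 6, "C": 6, "N": 6}))  # 17153136
```

Even this small hypothetical cell has about 1.7 × 10^7 labelled arrangements, which is far more than one would want to evaluate one by one with first-principles calculations.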

Core claim

The central claim is that encoding methods introduced in PyAPX for atomic configuration search produce superior convergence compared to commonly used one-hot encoding, as shown through evaluation on the h-BCN system. This supports Bayesian optimization for finding favorable atomic patterns in crystalline materials even when structure and composition are held constant.
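
For reference, the one-hot baseline mentioned above can be sketched as follows. This is a generic illustration of one-hot encoding applied to site occupations, not PyAPX code; the function name and the species ordering are assumptions made for the example.

```python
SPECIES = ("B", "C", "N")  # assumed ordering, chosen only for this example


def one_hot_encode(configuration, species=SPECIES):
    """Turn a list of site occupants into a flat 0/1 vector.

    Each site contributes len(species) entries, exactly one of which is 1.
    """
    index = {s: i for i, s in enumerate(species)}
    vector = []
    for atom in configuration:
        row = [0] * len(species)
        row[index[atom]] = 1
        vector.extend(row)
    return vector


# Four sites occupied by B, N, C, C in that order.
print(one_hot_encode(["B", "N", "C", "C"]))
# [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0]
```

A vector of this kind records which species sits on which labelled site but carries no explicit information about which sites are neighbors.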

What carries the argument

Encoding methods suitable for configuration search that replace one-hot encoding to improve the performance of Bayesian searches for stable atomic arrangements.
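
The definitions of the paper's encodings are not reproduced on this page, so the following is not the authors' method. It is a minimal sketch of one generic way a representation can include neighborhood information: counting nearest-neighbor bond types. The six-site ring, the neighbor list, and the function name are all hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

SPECIES = ("B", "C", "N")


def bond_count_features(configuration, neighbor_pairs, species=SPECIES):
    """Count bonds of each unordered species pair over a neighbor list.

    Illustrative descriptor only; it is an assumption of this sketch and
    not the encoding introduced in the paper.
    """
    counts = Counter()
    for i, j in neighbor_pairs:
        counts[tuple(sorted((configuration[i], configuration[j])))] += 1
    keys = combinations_with_replacement(sorted(species), 2)
    return [counts[k] for k in keys]  # order: BB, BC, BN, CC, CN, NN


# Hypothetical six-site ring in which site i is bonded to site i + 1.
ring_pairs = [(i, (i + 1) % 6) for i in range(6)]
pattern = ["B", "N", "B", "N", "C", "C"]
rotated = pattern[2:] + pattern[:2]  # same pattern, sites relabelled

print(bond_count_features(pattern, ring_pairs))  # [0, 1, 3, 1, 1, 0]
print(bond_count_features(rotated, ring_pairs))  # [0, 1, 3, 1, 1, 0]
```

The two patterns differ as one-hot vectors but share the same bond counts, which illustrates how a neighborhood-aware representation can treat symmetry-related arrangements alike.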

If this is right

  • Bayesian searches become practical for identifying low-energy atomic arrangements in a range of crystalline systems.
  • Material properties that depend on site occupations can be optimized more systematically alongside structure and composition.
  • The toolkit can be combined with existing first-principles workflows to accelerate targeted property tuning.
  • Exploration of configuration space supports finer control in materials discovery pipelines.
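
A Bayesian search over a finite pool of configurations can be sketched in a few dozen lines. Everything specific here is an assumption made for illustration: the toy bond-energy model stands in for a first-principles calculation, the Gaussian-process surrogate with an RBF kernel and the lower-confidence-bound acquisition are chosen for brevity, and none of the names correspond to the PyAPX API. The page does not state which surrogate or acquisition function the paper uses.

```python
import math
import random
from itertools import permutations

SPECIES = ("B", "C", "N")
# Hypothetical bond energies (arbitrary units), not fitted to any data.
BOND_ENERGY = {("B", "N"): -1.0, ("C", "C"): -0.8, ("B", "C"): -0.2,
               ("C", "N"): -0.2, ("B", "B"): 0.5, ("N", "N"): 0.5}


def toy_energy(config):
    """Stand-in for a first-principles energy on a six-site ring."""
    n = len(config)
    return sum(BOND_ENERGY[tuple(sorted((config[i], config[(i + 1) % n])))]
               for i in range(n))


def one_hot(config):
    return [1.0 if atom == s else 0.0 for atom in config for s in SPECIES]


def rbf(x, y, length=2.0):
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * length ** 2))


def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_upper(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(X, y, noise=1e-4):
    mean = sum(y) / len(y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, [v - mean for v in y]))
    return L, alpha, mean


def predict(model, X, x_new):
    L, alpha, mean = model
    k_star = [rbf(a, x_new) for a in X]
    mu = mean + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    return mu, math.sqrt(max(1.0 - sum(t * t for t in v), 0.0))


def bayesian_search(candidates, n_init=5, n_iter=15, kappa=2.0, seed=0):
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    evaluated, pool = pool[:n_init], pool[n_init:]
    energies = [toy_energy(c) for c in evaluated]
    for _ in range(n_iter):
        X = [one_hot(c) for c in evaluated]
        model = fit_gp(X, energies)

        def lcb(c):
            # Lower confidence bound: favor low predicted energy and high uncertainty.
            mu, sigma = predict(model, X, one_hot(c))
            return mu - kappa * sigma

        best = min(pool, key=lcb)
        pool.remove(best)
        evaluated.append(best)
        energies.append(toy_energy(best))
    return evaluated, energies


candidates = sorted(set(permutations("BBCCNN")))  # 90 labelled arrangements
configs, history = bayesian_search(candidates)
print(len(candidates), len(history), min(history))
```

Swapping `one_hot` for a different feature function is the only change needed to compare encodings under otherwise identical search settings, which is the kind of comparison the page describes.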

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The encodings may scale to larger supercells or systems with more atom types if computational cost remains manageable.
  • Similar representation choices could be tested in related tasks such as defect configuration sampling.
  • Integration with active learning loops might further reduce the number of required first-principles calculations.

Load-bearing premise

That the new encodings and Bayesian search will generalize effectively to other crystalline materials beyond the specific h-BCN test case described.

What would settle it

Applying the new encodings to a different material system such as silicon carbide or a perovskite and checking whether convergence remains faster than one-hot encoding.
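
One simple way to make "faster convergence" concrete is to compare best-so-far energy traces and the number of evaluations needed to reach a target energy. The sketch below assumes that definition; the two sampling histories are made-up numbers for illustration and are not results from the paper.

```python
from itertools import accumulate


def best_so_far(energies):
    """Running minimum of a sampling history."""
    return list(accumulate(energies, min))


def evaluations_to_target(energies, target, tol=1e-9):
    """Return the 1-based step at which the target energy is first reached."""
    for step, energy in enumerate(energies, start=1):
        if energy <= target + tol:
            return step
    return None  # target never reached within this history


# Made-up sampling histories (arbitrary units).
history_a = [-2.1, -2.7, -2.4, -3.0, -2.8, -4.2]
history_b = [-2.3, -3.0, -4.2, -3.5, -2.8, -2.9]

print(best_so_far(history_a))                   # [-2.1, -2.7, -2.7, -3.0, -3.0, -4.2]
print(evaluations_to_target(history_a, -4.2))   # 6
print(evaluations_to_target(history_b, -4.2))   # 3
```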

Figures

Figures reproduced from arXiv: 2511.17972 by Akira Kusaba, Karol Kawka, Pawel Kempisty, Tetsuji Kuboyama, Yoshihiro Kangawa.

Figure 1. (a, b) Typical problem settings in materials discovery based …
Figure 2. Scheme of the PyAPX workflow. Items in black boxes are …
Figure 3. Schematic illustration of (a) the neighbor-atom (NA) encod…
Figure 4. An example of an atomic configuration pattern in the (3…
Figure 5. Sampling histories obtained by Bayesian optimization for each encoding case: (a) one-hot, (b) NA, (c) modified NA, and (d) modified …

Original abstract

In materials discovery, the integration of first-principles calculations with machine learning techniques has been actively studied for two key tasks: crystal structure prediction, which searches for stable structures given a chemical composition, and elemental substitution, which explores chemical compositions that yield desirable properties in a given crystal structure. However, even when both the crystal structure and chemical composition are fixed, material properties can still vary depending on the atomic arrangements (configurations) at crystallographic sites. To support detailed material design, we present PyAPX, a Python toolkit that performs Bayesian searches of stable atomic configurations. A distinctive feature of this initial release is the introduction of encoding methods suitable for configuration search, and we evaluate their performance using the h-BCN system. As a result, they were confirmed to yield superior convergence compared to commonly used one-hot encoding. PyAPX is broadly applicable to crystalline materials and is expected to further advance materials discovery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents PyAPX, a Python toolkit for Bayesian optimization searches over atomic configurations in fixed crystal structures and compositions. A distinctive feature is the introduction of new encoding methods for configuration search; these are evaluated on the h-BCN system and reported to achieve superior convergence relative to standard one-hot encoding. The toolkit is described as broadly applicable to crystalline materials for advancing materials discovery.

Significance. If the reported performance advantage holds under broader testing, the specialized encodings and accompanying toolkit would offer a practical aid for efficient exploration of configuration spaces in materials design. The work supplies an open implementation that could support reproducibility, though the single-system evaluation limits the strength of claims about general utility.

major comments (2)
  1. §4 (Performance evaluation): The central claim that the new encodings yield superior convergence is demonstrated exclusively on the h-BCN test case. No additional benchmarks on other lattices, varying site counts, or different symmetry constraints are provided, leaving open the possibility that the observed advantage is tied to h-BCN-specific features of the energy landscape rather than intrinsic properties of the encodings.
  2. §3 (Methods): The evaluation lacks specification of the Bayesian optimization details (surrogate model, acquisition function), quantitative convergence metrics, number of independent runs, error bars, data selection criteria, and any statistical tests used to establish superiority over one-hot encoding.
minor comments (2)
  1. Abstract: The statement that the encodings 'were confirmed to yield superior convergence' should be accompanied by the specific metrics and trial counts used.
  2. Introduction: The distinction between crystal structure prediction, elemental substitution, and fixed-structure configuration search could be clarified with a brief schematic to better position the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript to improve clarity and completeness where needed.

Point-by-point responses
  1. Referee: §4 (Performance evaluation): The central claim that the new encodings yield superior convergence is demonstrated exclusively on the h-BCN test case. No additional benchmarks on other lattices, varying site counts, or different symmetry constraints are provided, leaving open the possibility that the observed advantage is tied to h-BCN-specific features of the energy landscape rather than intrinsic properties of the encodings.

    Authors: The h-BCN system was chosen because it features a ternary composition on a hexagonal lattice with a combinatorially large configuration space and diverse local environments, making it a demanding test case for configuration search. We agree that the single-system evaluation limits the strength of general claims. In the revised manuscript we will expand §4 to justify the choice of h-BCN, explicitly note that the observed advantage may be influenced by system-specific aspects of the energy landscape, and add a forward-looking statement on planned benchmarks across additional lattices and compositions.

    Revision: partial

  2. Referee: §3 (Methods): The evaluation lacks specification of the Bayesian optimization details (surrogate model, acquisition function), quantitative convergence metrics, number of independent runs, error bars, data selection criteria, and any statistical tests used to establish superiority over one-hot encoding.

    Authors: We will revise §3 to fully specify the Bayesian optimization procedure, including the surrogate model, acquisition function, quantitative convergence metrics, number of independent runs, error bars on reported results, data selection criteria, and the statistical tests used to compare encodings. These additions will make the evaluation reproducible and strengthen the basis for the reported performance differences.

    Revision: yes
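
The kind of reporting requested here (independent runs, error bars, and a statistical test) could look like the following sketch. It assumes the metric is the number of evaluations needed to reach the lowest-energy configuration and uses an exact permutation test on the difference in means; both choices are illustrative, and the run data are made-up numbers rather than results from the paper.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Mean and standard error of the mean over independent runs."""
    return mean(values), stdev(values) / sqrt(len(values))


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


# Made-up evaluations-to-target for five independent runs per encoding.
one_hot_runs = [48, 55, 60, 52, 58]
new_encoding_runs = [30, 41, 35, 38, 33]

print(mean_and_sem(one_hot_runs))
print(mean_and_sem(new_encoding_runs))
print(permutation_p_value(one_hot_runs, new_encoding_runs))
```

The exact test enumerates every split of the pooled runs, so it is only practical for small run counts; with many runs a random subsample of splits would be the usual substitute.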

Circularity Check

0 steps flagged

No circularity detected; empirical evaluation on external test system

full rationale

The paper introduces PyAPX for Bayesian atomic configuration search and presents new encodings whose performance is evaluated empirically on the h-BCN system, showing better convergence than one-hot encoding. This is a direct comparison on an external crystalline test case rather than any derivation that reduces to fitted inputs renamed as predictions, self-definitional quantities, or load-bearing self-citations. No equations or claims in the abstract or description exhibit the enumerated circular patterns; the central result remains an observable benchmark outcome independent of the toolkit's internal definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review reveals no explicit free parameters, axioms, or invented entities; the contribution centers on software implementation and empirical comparison rather than new theoretical constructs.

pith-pipeline@v0.9.0 · 5464 in / 1001 out tokens · 62925 ms · 2026-05-17T06:47:41.060277+00:00 · methodology


