pith. sign in

arxiv: 1907.11321 · v1 · pith:MUYVG75Rnew · submitted 2019-07-25 · 💻 cs.AI · cs.LG· cs.LO· q-bio.CB

Probabilistic Approximate Logic and its Implementation in the Logical Imagination Engine

Pith reviewed 2026-05-24 15:55 UTC · model grok-4.3

classification 💻 cs.AI cs.LGcs.LOq-bio.CB
keywords probabilistic approximate logicmean approximate probabilitylogical independencecontinuous semanticsstochastic gradient descentMarkov chain Monte Carlodomain knowledge integrationnetwork synthesis
0
0 comments X

The pith

Probabilistic Approximate Logic uses mean approximate probability and logical independence assumptions to enable efficient inference in probabilistic settings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Probabilistic Approximate Logic (PALO) to address conceptual and computational difficulties in strictly probabilistic logics by relying on the notion of mean approximate probability. Logical independence assumptions generate approximate probabilities, which averaging over many formula instances converts into estimates of mean probability with known confidence. A continuous semantics supports efficient inference through stochastic gradient descent and Markov chain Monte Carlo methods, even though it preserves only a subset of classical logic's structural properties, with richer theories from classical inference providing partial compensation. The approach is realized in the Logical Imagination Engine implementation and illustrated on a bioinformatics network synthesis task. A sympathetic reader would care because it supplies a systematic route for embedding domain knowledge into machine learning rather than relying on ad hoc methods.

Core claim

PALO is a logic based on the notion of mean approximate probability to overcome conceptual and computational difficulties inherent to strictly probabilistic logics. Logical independence assumptions are used to obtain approximate probabilities, but by averaging over many instances of formulas a useful estimate of mean probability with known confidence can usually be obtained. To enable efficient computational inference, the logic has a continuous semantics that reflects only a subset of the structural properties of classical logic, but this imprecision can be partly compensated by richer theories obtained by classical inference or other means. Computational inference, which refers to the con

What carries the argument

mean approximate probability obtained via logical independence assumptions and averaging over instances, paired with continuous semantics

If this is right

  • Domain knowledge can be incorporated systematically into machine learning models through logical theories rather than ad hoc means.
  • Efficient construction of models and validation of logical properties becomes feasible using SGD and MCMC techniques.
  • Useful estimates of mean probability with known confidence can be obtained despite the approximations involved.
  • The method supports applications such as network synthesis and analysis in bioinformatics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The averaging step might offer statistical robustness that exact probabilistic logics lack when theories contain uncertainty.
  • PALO-style approximations could be tested in hybrid systems that combine logical constraints with neural network training.
  • The continuous semantics might generalize to other inference engines if the compensation via classical inference scales to larger theories.

Load-bearing premise

Averaging over many instances of formulas under independence assumptions reliably produces useful estimates of mean probability with known confidence, and losses of structural properties from continuous semantics can be offset by richer theories from classical inference.

What would settle it

A controlled logical theory where exact probability computation is feasible and shows that the averaged mean approximate probabilities deviate substantially from the true values or that property validation via the continuous semantics produces incorrect results.

Figures

Figures reproduced from arXiv: 1907.11321 by Akos Vertes, Carolyn L. Talcott, Mark-Oliver Stehr, Merrill Knapp, Minyoung Kim.

Figure 1
Figure 1. Figure 1: Sample Model with a Likelihood of 0.97. The subgraph on the right (related to the cholesterol pathway) is shown in context of the larger network limited to causal dependencies with probability at least 0.7. Edges are colored so that darker colors correspond to higher probabilities. The node coloring reflects the average fold-change over the entire time series (green and red correspond to up and down regula… view at source ↗
Figure 2
Figure 2. Figure 2: Relational Histograms of our Sample Model. Through sampling we can visualize the ”probabilistic shape” of relations in our model. On the horizontal axis we show the probability, and on the logarithmic (!) vertical axis we show the number of samples (pairs of genes in the relation) with approximately that probability. The graph of [PITH_FULL_IMAGE:figures/full_fig_p025_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Mean Probability Analysis of the Sample Model. Using a slightly different logical form and notation, for each axiom we show the computed importance weight, the (normalized) probability, and the mean probability of the axiom without the top￾level quantifier (which can be more intuitive for the user). to the normal user. For example, the axiom involving lidata is satisfied in the model with an adequate (norm… view at source ↗
Figure 4
Figure 4. Figure 4: Lower and Upper Bound Mean Probability Analysis of the Sample Model. We show only the mean probabilities of each axiom without the top-level quantifier and the corresponding lower and upper bounds with confidence intervals. as the approximate probability does not account for logical dependencies. To this end, we use the full semantics that includes lower and upper mean proba￾bilities. The results of model … view at source ↗
Figure 5
Figure 5. Figure 5: Propositional Histograms for our Sample Model. Through sampling we can also visualize the ”probabilistic shape” of the axioms (and other properties) in our model. To this end, we again consider the axiom without the top-level quantifier. Two examples are given for axioms that seem to exhibit a perfect mean probability of 1.0, but it is important to note that the mean probability can still hide their detail… view at source ↗
Figure 6
Figure 6. Figure 6: Comparing Mean Probabilities under the Approximate (Soft) Semantics and under the Concrete Classical Semantics. In spite of the fact, that crispification leads to a significant loss of information for individual instantiations of formulas, it turns out that the mean probabilities are quite similar in our application (albeit with clearly noticeable differences). project. This is, however, not necessary with… view at source ↗
read the original abstract

In spite of the rapidly increasing number of applications of machine learning in various domains, a principled and systematic approach to the incorporation of domain knowledge in the engineering process is still lacking and ad hoc solutions that are difficult to validate are still the norm in practice, which is of growing concern not only in mission-critical applications. In this note, we introduce Probabilistic Approximate Logic (PALO) as a logic based on the notion of mean approximate probability to overcome conceptual and computational difficulties inherent to strictly probabilistic logics. The logic is approximate in several dimensions. Logical independence assumptions are used to obtain approximate probabilities, but by averaging over many instances of formulas a useful estimate of mean probability with known confidence can usually be obtained. To enable efficient computational inference, the logic has a continuous semantics that reflects only a subset of the structural properties of classical logic, but this imprecision can be partly compensated by richer theories obtained by classical inference or other means. Computational inference, which refers to the construction of models and validation of logical properties, is based on Stochastic Gradient Descent (SGD) and Markov Chain Monte Carlo (MCMC) techniques and hence another dimension where approximations are involved. We also present the Logical Imagination Engine (LIME), a prototypical implementation of PALO based on TensorFlow. Albeit not limited to the biological domain, we illustrate its operation in a quite substantial bioinformatics machine learning application concerned with network synthesis and analysis in a recent DARPA project.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces Probabilistic Approximate Logic (PALO) as an approximate logic based on the notion of mean approximate probability. It uses logical independence assumptions and averaging over formula instances to obtain estimates of mean probability with known confidence, adopts a continuous semantics that preserves only a subset of classical logic properties (compensated partly by richer theories from classical inference), and performs computational inference via SGD and MCMC. The paper also presents the Logical Imagination Engine (LIME) implementation in TensorFlow and illustrates it on a bioinformatics network synthesis task from a DARPA project.

Significance. If the averaging procedure under independence assumptions can be shown to deliver estimates with quantifiable confidence and if the continuous semantics can be rigorously related to the classical properties it approximates, PALO could offer a practical route to incorporating domain knowledge into ML pipelines. The TensorFlow-based LIME implementation is a concrete strength that supports reproducibility and immediate experimentation.

major comments (2)
  1. [Abstract] Abstract (second paragraph): the central claim that 'by averaging over many instances of formulas a useful estimate of mean probability with known confidence can usually be obtained' is load-bearing for the entire approach, yet the manuscript supplies neither a derivation of the confidence quantity nor an analysis of bias or variance under the independence assumptions.
  2. [Abstract] Abstract (second paragraph): the statement that loss of structural properties from continuous semantics 'can be partly compensated by richer theories obtained by classical inference' is asserted without any formal relation or bound showing how the compensation affects the mean-probability estimates.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. The referee raises valid points regarding the need for more formal justification of key claims in the abstract. We address these below and commit to revisions that will strengthen the manuscript by providing the requested derivations and relations.

read point-by-point responses
  1. Referee: [Abstract] Abstract (second paragraph): the central claim that 'by averaging over many instances of formulas a useful estimate of mean probability with known confidence can usually be obtained' is load-bearing for the entire approach, yet the manuscript supplies neither a derivation of the confidence quantity nor an analysis of bias or variance under the independence assumptions.

    Authors: The referee correctly identifies that a formal derivation is absent from the current manuscript. The approach relies on averaging under independence assumptions to obtain estimates with quantifiable confidence, but we did not include the explicit bias and variance analysis. In the revision, we will add this analysis, including a derivation of the confidence interval using standard statistical methods such as concentration bounds applied to the formula instances. revision: yes

  2. Referee: [Abstract] Abstract (second paragraph): the statement that loss of structural properties from continuous semantics 'can be partly compensated by richer theories obtained by classical inference' is asserted without any formal relation or bound showing how the compensation affects the mean-probability estimates.

    Authors: We agree that a formal relation or bound is not provided. The compensation is described conceptually in the manuscript. We will revise to include a more detailed explanation with examples or a proposition illustrating how additional classical axioms affect the mean probability estimates under the continuous semantics. revision: yes

Circularity Check

0 steps flagged

No circularity detected in derivation chain

full rationale

The abstract introduces PALO via the notion of mean approximate probability together with logical independence assumptions, averaging over instances, continuous semantics, and SGD/MCMC inference. No equations, fitted parameters, or self-citations appear in the provided text that would reduce any claimed prediction or result to a definition or input by construction. The central claims rest on conceptual and computational choices whose validity is presented as independent of the target estimates themselves. This matches the reader's assessment of score 2.0 and qualifies as a self-contained presentation with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The logic relies on unstated background assumptions about probability and logic that cannot be audited from the given text.

pith-pipeline@v0.9.0 · 5811 in / 1128 out tokens · 19622 ms · 2026-05-24T15:55:28.844310+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 5 internal anchors

  1. [1]

    Abadi, P

    M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghe- mawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Sys...

  2. [2]

    F. Bacchus. Representing and Reasoning with Probabilistic Knowledge: A Logical Approach to Probabilities. MIT Press, Cambridge, MA, USA, 1990

  3. [3]

    S. H. Bach, M. Broecheler, B. Huang, and L. Getoor. Hinge-loss Markov random fields and probabilistic soft logic. J. Mach. Learn. Res. , 18(1):3846–3912, Jan. 2017

  4. [4]

    Bouhoula, J.-P

    A. Bouhoula, J.-P. Jouannaud, and J. Meseguer. Specification and proof in mem- bership equational logic. In M. Bidoit and M. Dauchet, editors, TAPSOFT’97: Theory and Practice of Software Development, 7th International Joint Conference CAAP/FASE, Lille, France, April 14-18, 1997, Proceedings , volume 1214 of Lec- ture Notes in Computer Science , pages 67–92...

  5. [5]

    A. Bundy. Incidence calculus: A mechanism for probabilistic reasoning. J. Autom. Reason., 1(3):263–283, Jan. 1985

  6. [6]

    Esteva and L

    F. Esteva and L. Godo. Putting together Lukasiewicz and Product logics.Mathware and Soft Computing , 6(2-3):219–234, 1999

  7. [7]

    M. V. M. Fran¸ ca, G. Zaverucha, and A. S. d’Avila Garcez. Fast relational learning using bottom clause propositionalization with artificial neural networks. Machine Learning, 94(1):81–104, 2014

  8. [8]

    B. Gaines. Fuzzy and probability uncertainty logics. Information and Control , 38(2):154169, 1978

  9. [9]

    Gens and P

    R. Gens and P. M. Domingos. Deep symmetry networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27: Annual Conference on Neural Infor- mation Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 2537–2545, 2014

  10. [10]

    Ghosh, W

    S. Ghosh, W. Steiner, G. Denker, and P. Lincoln. Probabilistic modeling of fail- ure dependencies using Markov Logic Networks. In 2013 IEEE 19th Pacific Rim International Symposium on Dependable Computing , pages 162–171, Dec 2013

  11. [11]

    Graves, M

    A. Graves, M. G. Bellemare, J. Menick, R. Munos, and K. Kavukcuoglu. Auto- mated curriculum learning for neural networks. In Proceedings of the 34th Inter- national Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017 , pages 1311–1320, 2017

  12. [12]

    Grefenstette

    E. Grefenstette. Towards a formal distributional semantics: Simulating logical cal- culi with tensors. In M. T. Diab, T. Baldwin, and M. Baroni, editors, Proceedings of the Second Joint Conference on Lexical and Computational Semantics, *SEM 2013, June 13-14, 2013, Atlanta, Georgia, USA. , pages 1–10. Association for Com- putational Linguistics, 2013

  13. [13]

    R. V. Guha. Towards a model theory for distributed representations. CoRR, abs/1410.5859, 2014. http://arxiv.org/abs/1410.5859

  14. [14]

    Haaren, G

    J. Haaren, G. Broeck, W. Meert, and J. Davis. Lifted generative learning of Markov logic networks. Mach. Learn., 103(1):27–55, Apr. 2016

  15. [15]

    J. Y. Halpern. An analysis of first-order logics of probability. In Proceedings of the 11th International Joint Conference on Artificial Intelligence - Volume 2 , 35 IJCAI’89, pages 1375–1381, San Francisco, CA, USA, 1989. Morgan Kaufmann Publishers Inc

  16. [16]

    P. Hjek, L. Godo, and F. Esteva. A complete many-valued logic with product conjunction. Archive for Mathematical Logic, 35:191–208, 1996

  17. [17]

    D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014

  18. [18]

    Kummer and M.-O

    O. Kummer and M.-O. Stehr. Petri’s axioms of concurrency- A selection of recent results. In P. Az´ ema and G. Balbo, editors,Application and Theory of Petri Nets 1997, 18th International Conference, ICATPN ’97, Toulouse, France, June 23- 27, 1997, Proceedings, volume 1248 of Lecture Notes in Computer Science , pages 195–214. Springer, 1997

  19. [19]

    LeCun, Y

    Y. LeCun, Y. Bengio, and G. E. Hinton. Deep learning. Nature, 521(7553):436–444, 2015

  20. [20]

    On the Semantic Relationship between Probabilistic Soft Logic and Markov Logic

    J. Lee and Y. Wang. On the semantic relationship between probabilistic soft logic and Markov logic. CoRR, abs/1606.08896, 2016

  21. [21]

    C. Li, C. Chen, D. Carlson, and L. Carin. Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence , AAAI’16, pages 1788–1794. AAAI Press, 2016

  22. [22]

    Mandt, M

    S. Mandt, M. D. Hoffman, and D. M. Blei. Stochastic gradient descent as approx- imate bayesian inference. J. Mach. Learn. Res. , 18(1):4873–4907, Jan. 2017

  23. [23]

    Mart´ ı-Oliet and J

    N. Mart´ ı-Oliet and J. Meseguer. Rewriting logic as a logical and semantic frame- work. Electr. Notes Theor. Comput. Sci. , 4:190–225, 1996

  24. [24]

    Meseguer

    J. Meseguer. Conditional rewriting logic as a unified model of concurrency. Theor. Comput. Sci., 96(1):73–155, Apr. 1992

  25. [25]

    https://www.sri.com/sites/default/files/ brochures/sri_palmilitary.pdf

    PAL Technologies for the Military. https://www.sri.com/sites/default/files/ brochures/sri_palmilitary.pdf

  26. [26]

    Poon and P

    H. Poon and P. Domingos. Sum-product networks: A new deep architecture. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intel- ligence, UAI’11, pages 337–346, Arlington, Virginia, United States, 2011. AUAI Press

  27. [27]

    http://www.ai.sri.com/pal/ PAL-software-downloads/PAL-zipfiles-and-docs/pce-doc/pal_pce.pdf

    Probabilistic Consistency Engine (PCE). http://www.ai.sri.com/pal/ PAL-software-downloads/PAL-zipfiles-and-docs/pce-doc/pal_pce.pdf

  28. [28]

    Reichenbach

    H. Reichenbach. The Theory of Probability . University of California Press, 1949

  29. [29]

    S. Rendle. Factorization machines. In Proceedings of the 2010 IEEE International Conference on Data Mining , ICDM ’10, pages 995–1000, Washington, DC, USA,

  30. [30]

    IEEE Computer Society

  31. [31]

    Richardson and P

    M. Richardson and P. Domingos. Markov logic networks. Mach. Learn., 62(1- 2):107–136, Feb. 2006

  32. [32]

    Rocktaschel, S

    T. Rocktaschel, S. Singh, M. Bosnjak, and S. Riedel. Low-dimensional embeddings of logic. In ACL 2014 Workshop on Semantic Parsing (SP14) , 2014

  33. [33]

    Rockt¨ aschel, S

    T. Rockt¨ aschel, S. Singh, and S. Riedel. Injecting logical background knowledge into embeddings for relation extraction. In R. Mihalcea, J. Y. Chai, and A. Sarkar, editors, NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31 - June ...

  34. [34]

    Logic Tensor Networks: Deep Learning and Logical Reasoning from Data and Knowledge

    L. Serafini and A. S. d’Avila Garcez. Logic tensor networks: Deep learning and logical reasoning from data and knowledge. CoRR, abs/1606.04422, 2016. http: //arxiv.org/abs/1606.04422. 36

  35. [35]

    Serafini, I

    L. Serafini, I. Donadello, and A. d. Garcez. Learning and reasoning in logic tensor networks: Theory and application to semantic image interpretation. In Proceedings of the Symposium on Applied Computing , SAC ’17, pages 125–130, New York, NY, USA, 2017. ACM

  36. [36]

    Socher, D

    R. Socher, D. Chen, C. D. Manning, and A. Y. Ng. Reasoning with neural tensor networks for knowledge base completion. In C. J. C. Burges, L. Bottou, Z. Ghahra- mani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems

  37. [37]

    Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States., pages 926–934, 2013

  38. [38]

    Learning Causality: Synthesis of Large-Scale Causal Networks from High-Dimensional Time Series Data

    M.-O. Stehr, P. Avar, A. R. Korte, L. Parvin, Z. J. Sahab, D. I. Bunin, M. Knapp, D. Nishita, A. Poggio, C. L. Talcott, B. M. Davis, C. A. Morton, C. J. Sevinsky, M. I. Zavodszky, and A. Vertes. Learning causality: Synthesis of large-scale causal networks from high-dimensional time series data. CoRR, abs/1905.02291, 2019

  39. [39]

    http://maude.csl.sri.com/

    The Maude System. http://maude.csl.sri.com/

  40. [40]

    http://yices.csl.sri.com/

    The Yices SMT Solver. http://yices.csl.sri.com/

  41. [41]

    Vertes, A

    A. Vertes, A. Arul, P. Avar, A. R. Korte, H. Li, P. Nemes, L. Parvin, S. Stopka, S. Hwang, Z. J. Sahab, L. Zhang, D. I. Bunin, M. Knapp, A. Poggio, M.-O. Stehr, C. L. Talcott, B. M. Davis, S. R. Dinn, C. A. Morton, C. J. Sevinsky, and M. I. Zavodszky. Inferring mechanism of action of an unknown compound from time series omics data. In M. Ceska and D. Safr...

  42. [42]

    Welling and Y

    M. Welling and Y. W. Teh. Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on International Conference on Machine Learning, ICML’11, pages 681–688, USA, 2011. Omnipress. 37