Transfer learning for nonparametric Bayesian networks
Pith reviewed 2026-05-13 22:26 UTC · model grok-4.3
The pith
Two transfer learning algorithms improve nonparametric Bayesian network estimation from limited data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PCS-TL and HC-TL are transfer learning procedures for nonparametric Bayesian networks that improve structure-learning and parameter accuracy under scarce target data by selectively importing information from related source datasets, with dedicated metrics to prevent negative transfer. Parameters are combined by log-linear pooling, and the gains are confirmed on both synthetic networks and real UCI data via statistical testing.
What carries the argument
PCS-TL (PC-stable transfer learning) and HC-TL (hill-climbing transfer learning) algorithms that embed negative-transfer detection metrics and apply log-linear pooling to parameter estimates.
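The log-linear pooling step amounts to a weighted geometric combination of the source and target density estimates. A minimal sketch in Python, assuming a shared discretization grid and a fixed pooling weight `w` (both illustrative; the paper's exact formulation may differ):

```python
import numpy as np

def log_linear_pool(log_p_target, log_p_source, w):
    """Log-linear (geometric) pooling: w*log p_t + (1-w)*log p_s,
    renormalized so the pooled values form a distribution on the grid."""
    log_pool = w * log_p_target + (1.0 - w) * log_p_source
    log_pool -= np.log(np.exp(log_pool).sum())  # renormalize
    return log_pool

# Two discretized densities over the same grid (illustrative numbers).
p_target = np.array([0.1, 0.2, 0.4, 0.3])
p_source = np.array([0.3, 0.3, 0.2, 0.2])
pooled = np.exp(log_linear_pool(np.log(p_target), np.log(p_source), w=0.7))
```

With `w = 1` the pool reduces to the target estimate alone; lowering `w` leans harder on the source, which is exactly where a negative-transfer safeguard matters.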
If this is right
- Structure and parameter estimates for kernel-density-estimation Bayesian networks become more accurate than standard learning when target samples are few.
- Negative-transfer metrics succeed in protecting performance when source and target distributions differ.
- Statistical tests confirm the methods outperform non-transfer baselines across multiple dataset sizes and noise levels.
- Deployment time for such networks in data-scarce industrial settings is reduced because less target data needs to be collected.
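The scarce-data failure mode these claims address can be seen directly with a kernel density estimator. A minimal sketch using SciPy's `gaussian_kde`; the sample sizes and the shifted source distribution are assumptions for illustration, not the paper's experimental setup:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=20)      # scarce target sample
source = rng.normal(0.2, 1.0, size=2000)    # related, abundant source

kde_target = gaussian_kde(target)                           # high-variance fit
kde_naive = gaussian_kde(np.concatenate([target, source]))  # naive pooling

grid = np.linspace(-4.0, 4.0, 9)
dens_target = kde_target(grid)
dens_naive = kde_naive(grid)
```

Naive pooling only helps when the source distribution is close to the target; the paper's contribution is deciding when such an import should be blocked.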
Where Pith is reading between the lines
- The same negative-transfer safeguards could be adapted to other nonparametric density estimators beyond Bayesian networks.
- If source datasets arrive incrementally, the pooling step could be updated online without restarting the structure search.
- The approach may shorten model-building cycles in any domain where related but not identical data sources are easier to obtain than perfectly matched data.
Load-bearing premise
Suitable related source datasets exist and the proposed negative-transfer metrics can reliably detect and block harmful transfers without adding new biases.
What would settle it
On fresh scarce-data problems where the chosen sources are unrelated to the target, either PCS-TL or HC-TL yields lower accuracy than learning from the target data alone, or the negative-transfer metrics fail to flag the mismatch.
Original abstract
This paper introduces two transfer learning methodologies for estimating nonparametric Bayesian networks under scarce data. We propose two algorithms, a constraint-based structure learning method, called PC-stable-transfer learning (PCS-TL), and a score-based method, called hill climbing transfer learning (HC-TL). We also define particular metrics to tackle the negative transfer problem in each of them, a situation in which transfer learning has a negative impact on the model's performance. Then, for the parameters, we propose a log-linear pooling approach. For the evaluation, we learn kernel density estimation Bayesian networks, a type of nonparametric Bayesian network, and compare their transfer learning performance with the models alone. To do so, we sample data from small, medium and large-sized synthetic networks and datasets from the UCI Machine Learning repository. Then, we add noise and modifications to these datasets to test their ability to avoid negative transfer. To conclude, we perform a Friedman test with a Bergmann-Hommel post-hoc analysis to show statistical proof of the enhanced experimental behavior of our methods. Thus, PCS-TL and HC-TL demonstrate to be reliable algorithms for improving the learning performance of a nonparametric Bayesian network with scarce data, which in real industrial environments implies a reduction in the required time to deploy the network.
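The Friedman test named in the abstract is available in SciPy; a sketch with hypothetical per-dataset scores (the Bergmann-Hommel post-hoc analysis is not in SciPy and would be run separately, e.g. with R's scmamp package):

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical held-out log-likelihoods of three learners on six datasets.
rng = np.random.default_rng(1)
baseline = rng.normal(-120.0, 5.0, size=6)
pcs_tl = baseline + rng.normal(3.0, 1.0, size=6)  # assumed improvement
hc_tl = baseline + rng.normal(4.0, 1.0, size=6)   # assumed improvement

# Friedman test: do the three learners rank differently across datasets?
stat, pvalue = friedmanchisquare(baseline, pcs_tl, hc_tl)
```

A small p-value licenses the pairwise post-hoc comparisons; the scores above are fabricated purely to show the call signature.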
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces PCS-TL, a constraint-based structure learning method, and HC-TL, a score-based method, for transfer learning in nonparametric Bayesian networks with scarce data. It defines metrics to address negative transfer, employs log-linear pooling for parameter estimation, and evaluates the approaches on synthetic networks of varying sizes and UCI datasets by adding noise and modifications, demonstrating statistical improvements via Friedman tests with Bergmann-Hommel post-hoc analysis.
Significance. If the proposed negative-transfer metrics are shown to be robust, the work could facilitate more efficient deployment of Bayesian network models in data-scarce industrial applications by leveraging related source datasets. The inclusion of statistical hypothesis testing provides a solid empirical foundation for the performance claims.
major comments (3)
- [Methods] Methods (PCS-TL and HC-TL definitions): The exact mathematical definitions and thresholds of the negative-transfer metrics are not specified. This is load-bearing for the central reliability claim, as the abstract states these metrics are used to avoid negative transfer; without explicit forms, it is impossible to assess whether they generalize beyond the tested noise additions or introduce new biases.
- [Evaluation] Evaluation section: No ablation results are reported to separate the effect of the negative-transfer metrics from the log-linear pooling step or the base PC-stable/HC algorithms. The performance claims rest on high-level summaries of Friedman/Bergmann-Hommel tests without error bars or detailed cases of prevented negative transfer, weakening the assertion that the methods reliably improve learning under scarce data.
- [Experimental setup] Experimental setup: The specific perturbations ('noise and modifications') applied to synthetic networks and UCI datasets are not detailed (e.g., whether they affect higher-order moments or conditional independencies). This leaves open whether the metrics detect harmful transfers only under the tested conditions or more broadly, directly impacting the industrial deployment-time reduction claim.
minor comments (2)
- [Abstract] Abstract: Refers to 'particular metrics' without naming or briefly describing them; expanding this would improve clarity for readers.
- [Throughout] Throughout: Missing implementation details such as code availability, exact hyperparameter settings for kernel density estimation, or the precise form of the log-linear pooling weights would strengthen reproducibility.
Simulated Author's Rebuttal
We appreciate the referee's detailed feedback, which highlights areas for improvement in clarity and empirical validation. We will make the suggested revisions to strengthen the manuscript.
Point-by-point responses
-
Referee: [Methods] Methods (PCS-TL and HC-TL definitions): The exact mathematical definitions and thresholds of the negative-transfer metrics are not specified. This is load-bearing for the central reliability claim, as the abstract states these metrics are used to avoid negative transfer; without explicit forms, it is impossible to assess whether they generalize beyond the tested noise additions or introduce new biases.
Authors: We thank the referee for pointing this out. Upon review, the definitions of the negative-transfer metrics for PCS-TL and HC-TL were described in prose but lacked explicit mathematical formulations and specific threshold values. In the revised version, we will include the precise equations for these metrics, such as the condition for detecting negative transfer based on performance degradation, and specify the thresholds (e.g., a 5% drop in accuracy or similar). This will allow better assessment of their robustness. revision: yes
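The threshold rule the authors mention can be sketched as a simple gate on a held-out score; the 5% relative threshold and the higher-is-better score convention are illustrative, not the paper's published metric:

```python
def accept_transfer(score_target_only, score_with_transfer, rel_threshold=0.05):
    """Reject a transfer if the held-out score (higher is better, e.g.
    log-likelihood) drops more than rel_threshold relative to
    learning from the target data alone."""
    drop = (score_target_only - score_with_transfer) / abs(score_target_only)
    return drop <= rel_threshold

accept_transfer(-100.0, -98.0)   # transfer improves the score -> True
accept_transfer(-100.0, -110.0)  # 10% relative drop -> False
```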
-
Referee: [Evaluation] Evaluation section: No ablation results are reported to separate the effect of the negative-transfer metrics from the log-linear pooling step or the base PC-stable/HC algorithms. The performance claims rest on high-level summaries of Friedman/Bergmann-Hommel tests without error bars or detailed cases of prevented negative transfer, weakening the assertion that the methods reliably improve learning under scarce data.
Authors: We acknowledge that no explicit ablation studies were presented to isolate the contributions of the negative-transfer metrics versus the log-linear pooling and the base algorithms. To address this, we will add ablation experiments in the revised manuscript, comparing variants with and without the metrics, to demonstrate their individual impacts. Additionally, we will include error bars in the performance summaries and detail specific cases where negative transfer was prevented. revision: yes
-
Referee: [Experimental setup] Experimental setup: The specific perturbations ('noise and modifications') applied to synthetic networks and UCI datasets are not detailed (e.g., whether they affect higher-order moments or conditional independencies). This leaves open whether the metrics detect harmful transfers only under the tested conditions or more broadly, directly impacting the industrial deployment-time reduction claim.
Authors: We agree that the specific perturbations applied to the datasets were not detailed sufficiently. In the revision, we will expand the experimental setup section to describe the exact noise additions (e.g., Gaussian noise with specific variances) and modifications (e.g., altering conditional probability tables or removing edges), including how they impact higher-order moments and conditional independencies. This will clarify the conditions under which the metrics operate. revision: yes
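A perturbation of this kind can be sketched as follows; the function name `perturb`, the noise level, and the row-dropping step are hypothetical stand-ins for the procedures the revision will specify:

```python
import numpy as np

def perturb(data, noise_std=0.5, drop_frac=0.1, seed=0):
    """Additive Gaussian noise plus random row removal, to build a
    deliberately mismatched source dataset for negative-transfer tests."""
    rng = np.random.default_rng(seed)
    noisy = data + rng.normal(0.0, noise_std, size=data.shape)
    keep = rng.random(len(noisy)) >= drop_frac  # drop ~drop_frac of rows
    return noisy[keep]

clean = np.zeros((100, 3))
shifted = perturb(clean)
```

Perturbations that only add isotropic noise leave conditional independencies largely intact, which is precisely the referee's point about testing the metrics under structural modifications as well.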
Circularity Check
Methods defined independently of evaluation data; central claims do not reduce to fitted inputs or self-citation chains
Full rationale
The paper defines PCS-TL and HC-TL algorithms, negative-transfer metrics, and log-linear pooling explicitly in terms of structure learning and parameter estimation steps that operate on source and target datasets. Evaluation uses external UCI repository datasets and synthetic networks with added noise/modifications, followed by Friedman/Bergmann-Hommel tests. No equation or definition in the derivation chain equates a reported performance gain to a quantity fitted on the same validation data, nor does any load-bearing premise rest solely on prior self-citation without independent content. This yields only a minor self-citation score with no circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- [domain assumption] The Markov condition and faithfulness assumptions standard to constraint-based and score-based Bayesian network structure learning.
Reference graph
Works this paper leans on
-
[1]
Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22:1345–1359, 2010
work page 2010
-
[2]
Wen Zhang, Lingfei Deng, Lei Zhang, and Dongrui Wu. A survey on negative transfer. IEEE/CAA Journal of Automatica Sinica, 10(2):305–329, 2023
work page 2023
-
[3]
Wouter M. Kouw and Marco Loog. A review of domain adaptation without target labels. IEEE Transactions on Pattern Analysis & Machine Intelligence, 43(3):766–785, 2021
work page 2021
-
[4]
Maryam Azarkesht and Fatemeh Afsari. Instance reweighting and dynamic distribution alignment for domain adaptation. Journal of Ambient Intelligence and Humanized Computing, 13(10):4967–4987, 2022
work page 2022
-
[5]
Aurick Zhou and Sergey Levine. Bayesian adaptation for covariate shift. In Advances in Neural Information Processing Systems, volume 34, pages 914–927. Curran Associates, Inc., 2021
work page 2021
-
[6]
Alexandru Niculescu-Mizil and Rich Caruana. Inductive transfer for Bayesian network structure learning. In Proceedings of the ICML Workshop on Unsupervised and Transfer Learning, volume 27 of Proceedings of Machine Learning Research, pages 167–180. PMLR, 2012
work page 2012
-
[7]
Diane Oyen and Terran Lane. Transfer learning for Bayesian discovery of multiple Bayesian networks. Knowledge and Information Systems, 43(1):1–28, 2015
work page 2015
-
[8]
Sarah Benikhlef, Philippe Leray, Guillaume Raschia, Montassar Ben Messaoud, and Fayrouz Sakly. Multi-task transfer learning for Bayesian network structures. In Symbolic and Quantitative Approaches to Reasoning with Uncertainty, pages 217–228. Springer, 2021
work page 2021
-
[9]
D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. The MIT Press, 2009
work page 2009
-
[10]
Peter Spirtes and Clark Glymour. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9:62–72, 1991
work page 1991
-
[11]
Peter Spirtes, Clark Glymour, and Richard Scheines. Causality from probability. Technical report, Department of Philosophy, Carnegie Mellon University, 1989
work page 1989
- [12]
-
[13]
Gregory F. Cooper and Edward Herskovits. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9(4):309–347, 1992
work page 1992
- [14]
-
[15]
Fred Glover and Manuel Laguna.Tabu Search. John Wiley & Sons, 1997
work page 1997
-
[16]
Neville Kenneth Kitson, Anthony C. Constantinou, Zhigao Guo, Yang Liu, and Kiattikun Chobtham. A survey of Bayesian network structure learning. Artificial Intelligence Review, 56(8):8721–8814, 2023
work page 2023
-
[17]
Nir Friedman and Zohar Yakhini. On the sample complexity of learning Bayesian networks. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996), pages 274–282, 1996
work page 1996
-
[18]
Sanjoy Dasgupta. The sample complexity of learning fixed-structure Bayesian networks. Machine Learning, 29(2):165–180, 1997
work page 1997
-
[19]
Mayank Mishra, Paulo B. Lourenço, and G. V. Ramana. Structural health monitoring of civil engineering structures by using the internet of things: A review. Journal of Building Engineering, 48:103954, 2022
work page 2022
-
[20]
Mohd Javaid, Abid Haleem, Ravi Pratap Singh, Rajiv Suman, and Shanay Rab. Significance of machine learning in healthcare: Features, pillars and applications. International Journal of Intelligent Networks, 3:58–73, 2022
work page 2022
-
[21]
Roger Luis, Luis Sucar, and Eduardo Morales. Inductive transfer for learning Bayesian networks. Machine Learning, 79:227–255, 2010
work page 2010
-
[22]
M. Stone. The opinion pool. The Annals of Mathematical Statistics, 32(4):1339–1342, 1961
work page 1961
-
[23]
Lindsey J. Fiedler, L. Enrique Sucar, and Eduardo F. Morales. Transfer learning for temporal nodes Bayesian networks. Applied Intelligence, 43(3):578–597, 2015
work page 2015
-
[24]
Hao Yan, Shiji Song, Fuli Wang, Dakuo He, and Jianjun Zhao. Operational adjustment modeling approach based on Bayesian network transfer learning for new flotation process under scarce data. Journal of Process Control, 128, 2023
work page 2023
-
[25]
Hao Yan, Xinchun Jia, Kang Li, and Fuli Wang. A Bayesian network method using transfer learning for solving small data problems in abnormal condition diagnosis of fused magnesia smelting process. Control Engineering Practice, 147, 2024
work page 2024
-
[26]
Ping Yuan, Yufeng Sun, Hui Li, Fuli Wang, and Hongru Li. Abnormal condition identification modeling method based on Bayesian network parameters transfer learning for the electro-fused magnesia smelting process. IEEE Access, 7:149764–149775, 2019
work page 2019
-
[27]
Yongyan Hou, Ao Yang, Wenqiang Guo, Enrang Zheng, Qinkun Xiao, Zhigao Guo, and Zixuan Huang. Bearing fault diagnosis under small data set condition: A Bayesian network method with transfer learning for parameter estimation. IEEE Access, 10:35768–35783, 2022
work page 2022
- [28]
-
[29]
Yun Zhou, Timothy M. Hospedales, and Norman Fenton. When and where to transfer for Bayesian network parameter learning. Expert Systems with Applications, 55:361–373, 2016
work page 2016
-
[30]
Xuetong Wu, Jonathan H. Manton, Uwe Aickelin, and Jingge Zhu. A Bayesian approach to (online) transfer learning: Theory and algorithms. Artificial Intelligence, 324, 2023
work page 2023
- [31]
-
[32]
Milan Papež and Anthony Quinn. Transferring model structure in Bayesian transfer learning for Gaussian process regression. Knowledge-Based Systems, 251:108875, 2022
work page 2022
-
[33]
Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006
work page 2006
-
[34]
Omar Alotaibi and Antonia Papandreou-Suppappola. Bayesian nonparametric learning and knowledge transfer for object tracking under unknown time-varying conditions. Frontiers in Signal Processing, 2:868638, 2022
work page 2022
-
[35]
Kai Wang, Jian Li, and Fugee Tsung. Distribution inference from early-stage stationary data streams by transfer learning. IISE Transactions, pages 1–25, 2021
work page 2021
-
[36]
Lingquan Zeng, Junhua Zheng, Le Yao, and Zhiqiang Ge. Dynamic Bayesian networks for feature learning and transfer applications in remaining useful life estimation. IEEE Transactions on Instrumentation and Measurement, 72:1–12, 2023
work page 2023
-
[37]
Jia-Qi Chen, Yu-Lin He, Ying-Chao Cheng, Philippe Fournier-Viger, and Joshua Zhexue Huang. A multiple kernel-based kernel density estimator for multimodal probability density functions. Engineering Applications of Artificial Intelligence, 132:107979, 2024
work page 2024
-
[38]
David W. Scott. Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons, Inc., 2015
work page 2015
-
[39]
Gunther Koliander, Yousef El-Laham, Petar M. Djuric, and Franz Hlawatsch. Fusion of probability density functions. Proceedings of the IEEE, 110(4):404–453, 2022
work page 2022
-
[40]
Eric V. Strobl, Kun Zhang, and Shyam Visweswaran. Approximate kernel-based conditional independence tests for fast non-parametric causal discovery. Journal of Causal Inference, 7(1):20180017, 2019
work page 2019
-
[41]
Reimar Hofmann and Volker Tresp. Discovering structure in continuous variables using Bayesian networks. Advances in Neural Information Processing Systems, 8:501–507, 1995
work page 1995
-
[42]
Christian Genest. A characterization theorem for externally Bayesian groups. The Annals of Statistics, 12(3):1100–1105, 1984
work page 1984
-
[43]
M. P. Wand. Error analysis for general multivariate kernel estimators. Journal of Nonparametric Statistics, 2(1):1–15, 1992
work page 1992
-
[44]
Joseph Ramsey, Peter Spirtes, and Jiji Zhang. Adjacency-faithfulness and conservative causal inference. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006), pages 401–408, 2006
work page 2006
-
[45]
Christopher Meek. Causal inference and causal explanation with background knowledge. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pages 403–410, 1994
work page 1994
- [46]
-
[47]
David Atienza, Concha Bielza, and Pedro Larrañaga. Semiparametric Bayesian networks. Information Sciences, 584:564–582, 2022
work page 2022
-
[48]
Ali Rahimi and Benjamin Recht. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20 (NIPS 2007), pages 1177–1184, 2007
work page 2007
-
[49]
Kun Zhang, Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Kernel-based conditional independence test and application in causal discovery. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011), pages 804–813. AUAI Press, 2011
work page 2011
-
[50]
Dheeru Dua and Casey Graff. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml, 2017
work page 2017
-
[51]
Salvador García and Francisco Herrera. An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. Journal of Machine Learning Research, 9:2677–2694, 2008
work page 2008
-
[52]
J. E. Chacón and T. Duong. Multivariate Kernel Smoothing and Its Applications. Chapman & Hall/CRC, 1st edition, 2018
work page 2018
-
[53]
M. P. Wand and M. C. Jones. Kernel Smoothing. Chapman & Hall/CRC, 1st edition, 1994
work page 1994
-
[54]
David Atienza, Concha Bielza, and Pedro Larrañaga. PyBNesian: An extensible Python package for Bayesian networks. Neurocomputing, 504:204–209, 2022
work page 2022
-
[55]
Ross D. Shachter and C. Robert Kenley. Gaussian influence diagrams. Management Science, 35(5):527–550, 1989
work page 1989
-
[56]
Rafael Sojo, Javier Díaz-Rozo, Concha Bielza, and Pedro Larrañaga. Binned semiparametric Bayesian networks for efficient kernel density estimation, 2025. https://arxiv.org/abs/2506.21997
-
[57]
Marco Scutari. Learning Bayesian networks with the bnlearn R package. Journal of Statistical Software, 35(3):1–22, 2010
work page 2010
-
[58]
Daniel Marín, Joshua Llano-Viles, Zouhair Haddi, Alexandre Perera-Lluna, and Jordi Fonollosa. Home monitoring for older singles: A gas sensor array system. Sensors and Actuators B: Chemical, 393:134036, 2023
work page 2023
-
[59]
Robert J. Lyon, Ben W. Stappers, Sally Cooper, J. M. Brooke, and Joshua D. Knowles. Fifty years of pulsar candidate selection: From simple filters to a new principled real-time classification approach. Monthly Notices of the Royal Astronomical Society, 459:1104–1123, 2016
work page 2016
-
[61]
R. Bock. MAGIC gamma telescope. UCI Machine Learning Repository, 2004. https://doi.org/10.24432/C58K54
-
[62]
Luis M. Ibarra Candanedo, Veronique Feldheim, and Dominique Deramaix. Data driven prediction models of energy use of appliances in a low-energy house. Energy and Buildings, 140:81–97, 2017
work page 2017
-
[63]
Janez Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7(1):1–30, 2006
work page 2006