Mutual Information Surprise: Rethinking Unexpectedness in Autonomous Systems
Pith reviewed 2026-05-18 21:02 UTC · model grok-4.3
The pith
Mutual Information Surprise redefines unexpectedness as a measure of epistemic growth to guide adaptive reactions in autonomous systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Mutual Information Surprise quantifies the impact of new observations on the mutual information between those observations and the system's internal model. This quantity serves as a direct indicator of epistemic growth. The paper shows that a statistical test built around this quantity can trigger a reaction policy that governs system behavior by adjusting sampling and forking processes, and that systems controlled by this policy outperform those based on classical surprise measures in stability, responsiveness, and accuracy on both synthetic domains and a dynamic pollution map estimation task.
What carries the argument
Mutual Information Surprise, which computes how new data changes the mutual information shared with the internal model and thereby signals epistemic growth to drive sampling adjustments and process forking.
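A minimal sketch of how this quantity could be computed for discrete data, assuming the plug-in (maximum-likelihood) estimator that the excerpted Theorem 1 calls "MLE-based" and the definition MIS ≜ Î_{n+m} − Î_n quoted further down this page. The function names and the toy data are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import log


def plugin_mutual_information(pairs):
    """Plug-in (MLE) estimate of I(X; Y) in nats from discrete (x, y) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    # Sum of p(x, y) * log(p(x, y) / (p(x) p(y))) using empirical frequencies.
    return sum(
        (c / n) * log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def mutual_information_surprise(existing, new):
    """MIS as the change in the plug-in estimate after appending `new`."""
    return plugin_mutual_information(existing + new) - plugin_mutual_information(existing)


# Toy data (hypothetical): y copies x, then a previously unseen pair arrives.
existing = [(0, 0), (1, 1)] * 2
new = [(2, 2)]
print(mutual_information_surprise(existing, new))
```

Whether a given value counts as a surprise depends on the paper's test bounds, which this sketch does not reproduce.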
If this is right
- A system using the MIS-based reaction policy exhibits greater stability and responsiveness than systems driven by Shannon or Bayesian surprise.
- The measure shifts surprise from a purely reactive signal to one that supports reflection on the system's own learning progression.
- Dynamic adjustment of sampling and process forking under MIS leads to higher predictive accuracy in tasks such as pollution map estimation.
- The approach supplies a concrete mechanism for autonomous systems to become more self-aware and adaptive in complex, changing environments.
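For comparison, the two classical measures named in the list above can be sketched for a small discrete hypothesis set. Shannon surprise is taken here as the negative log predictive probability of an observation, and Bayesian surprise as KL(posterior || prior), which is one common convention (following Itti and Baldi). The coin-bias grid and the function names are illustrative assumptions, not the paper's baselines.

```python
from math import log


def shannon_surprise(predictive_prob):
    """Negative log probability the model assigned to the observation (nats)."""
    return -log(predictive_prob)


def bayes_update(prior, likelihoods):
    """Return (posterior, predictive probability) for a discrete hypothesis set."""
    unnormalised = [p * l for p, l in zip(prior, likelihoods)]
    evidence = sum(unnormalised)
    return [u / evidence for u in unnormalised], evidence


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical model: three candidate coin biases with a uniform prior.
thetas = [0.1, 0.5, 0.9]
prior = [1 / 3, 1 / 3, 1 / 3]
observation = 1  # the coin landed heads

likelihoods = [t if observation == 1 else 1 - t for t in thetas]
posterior, predictive = bayes_update(prior, likelihoods)

s_shannon = shannon_surprise(predictive)    # depends on the single observation
s_bayes = kl_divergence(posterior, prior)   # depends on how beliefs moved
print(s_shannon, s_bayes)
```

Both quantities score one observation at a time, whereas MIS as described on this page compares mutual information estimates before and after a batch of new observations.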
Where Pith is reading between the lines
- If the link between mutual information change and epistemic growth holds, MIS could be inserted into active learning loops to decide when to query new data based on information gain rather than prediction error alone.
- The same quantity might help multi-agent systems coordinate by letting each agent share its current surprise level to align exploration priorities.
- Testing MIS in reinforcement learning agents could reveal whether the reflective policy reduces unnecessary exploration in stable regimes while accelerating adaptation when the environment shifts.
Load-bearing premise
That changes in mutual information reliably indicate genuine epistemic growth and can be converted into a reaction policy that improves adaptation without creating fresh instabilities or demanding extensive tuning.
What would settle it
Run the MIS policy and a classical surprise policy side-by-side on a rapidly changing environment where the internal model must be updated continuously; if the MIS version exhibits lower stability or worse long-term accuracy than the classical version, the claim that MIS produces superior reflective adaptation would be falsified.
Original abstract
A community of researchers appears to think that a machine can be surprised and have introduced various surprise measures, principally the Shannon Surprise and the Bayesian Surprise. The questions of what constitutes a surprise and how to react to one still elicit debates. In this work, we introduce Mutual Information Surprise (MIS), a new framework that redefines surprise not as anomaly measure, but as a signal of epistemic growth. Furthermore, we develop a statistical test sequence that could trigger a surprise reaction and propose a MIS-based reaction policy that dynamically governs system behavior through sampling adjustment and process forking. Empirical evaluations -- on both synthetic domains and a dynamic pollution map estimation task -- show that a system governed by the MIS-based reaction policy significantly outperforms those under classical surprise-based approaches in stability, responsiveness, and predictive accuracy. The important implication of our new proposal is that MIS quantifies the impact of new observations on mutual information, shifts surprise from reactive to reflective, enables reflection on learning progression, and thus offers a path toward self-aware and adaptive autonomous systems. We expect the new surprise measure to play a critical role in further advancing autonomous systems on their ability to learn and adapt in a complex and dynamic environment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Mutual Information Surprise (MIS) as a new framework that redefines surprise in autonomous systems as the impact of new observations on mutual information between data and the internal model, shifting it from a reactive anomaly measure to a reflective signal of epistemic growth. It develops a statistical test sequence to trigger reactions and a MIS-based reaction policy that dynamically adjusts sampling and performs process forking. Empirical evaluations on synthetic domains and a dynamic pollution map estimation task claim that systems governed by this policy significantly outperform classical surprise-based approaches in stability, responsiveness, and predictive accuracy, with implications for self-aware and adaptive autonomous systems.
Significance. If the empirical claims are substantiated, the work could meaningfully advance research on surprise measures and adaptive agents in machine learning by providing a mechanism for systems to reflect on their learning progression via mutual information. This addresses ongoing debates around Shannon and Bayesian surprise by emphasizing epistemic growth and could support more robust behavior in non-stationary environments.
major comments (2)
- [Abstract] The abstract asserts empirical superiority in stability, responsiveness, and accuracy on synthetic and pollution tasks, but supplies no details on baselines, statistical tests, data splits, or error analysis, so it is not possible to verify whether the data actually supports the central claims.
- [MIS-based reaction policy] The stability of the MIS reaction policy hinges on untested assumptions about MI estimation robustness in non-stationary environments; the manuscript does not address whether the test thresholds or forking logic were tuned per task or if performance degrades under modest changes to observation noise or model capacity.
minor comments (1)
- The manuscript would benefit from an explicit mathematical definition of MIS (e.g., how the impact on mutual information is computed) and a direct comparison to Bayesian surprise to clarify the claimed novelty.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate the revisions we will make.
Point-by-point responses
-
Referee: [Abstract] The abstract asserts empirical superiority in stability, responsiveness, and accuracy on synthetic and pollution tasks, but supplies no details on baselines, statistical tests, data splits, or error analysis, so it is not possible to verify whether the data actually supports the central claims.
Authors: We agree that the abstract is concise and omits key experimental details. The full manuscript describes the baselines (Shannon and Bayesian surprise), the statistical test sequence, synthetic data generation procedures, the pollution map task setup including train/test splits, and error metrics for predictive accuracy. To improve verifiability, we will revise the abstract to include a short reference to these elements and add explicit pointers to the experimental sections.
revision: yes
-
Referee: [MIS-based reaction policy] The stability of the MIS reaction policy hinges on untested assumptions about MI estimation robustness in non-stationary environments; the manuscript does not address whether the test thresholds or forking logic were tuned per task or if performance degrades under modest changes to observation noise or model capacity.
Authors: The manuscript derives test thresholds from the statistical test sequence and evaluates the policy across synthetic domains and the pollution task to show consistent behavior. We did not include explicit sensitivity analysis for observation noise levels or model capacity changes. We will add a discussion of the underlying assumptions regarding MI estimation and include additional robustness experiments in the revised version.
revision: partial
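One way to probe the estimation concern raised in this exchange is to check how a plug-in mutual information estimate behaves when the true mutual information is zero. The sketch below draws independent uniform discrete pairs, so any positive average is finite-sample bias. The alphabet size, sample sizes, and trial count are arbitrary illustrative choices and do not reproduce any experiment in the paper.

```python
import random
from collections import Counter
from math import log


def plugin_mi(pairs):
    """Plug-in (MLE) mutual information estimate in nats."""
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def mean_mi_independent(n, k, trials, seed=0):
    """Average plug-in estimate over `trials` samples of n independent pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # x and y are drawn independently, so the true mutual information is 0.
        pairs = [(rng.randrange(k), rng.randrange(k)) for _ in range(n)]
        total += plugin_mi(pairs)
    return total / trials


small_sample = mean_mi_independent(n=50, k=8, trials=20)
large_sample = mean_mi_independent(n=5000, k=8, trials=20)
print(small_sample, large_sample)
```

The small-sample average sits well above the large-sample one, which is the kind of dependence on sample size that a sensitivity analysis of test thresholds would need to account for.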
Circularity Check
No significant circularity: MIS defined from standard mutual information with separate empirical validation
Full rationale
The paper defines Mutual Information Surprise (MIS) directly from the established mutual information quantity between observations and the internal model, then proposes a statistical test sequence and reaction policy (sampling adjustment and process forking) as downstream applications. Empirical results on synthetic domains and pollution mapping are presented as separate evaluations of the policy's performance in stability and accuracy. No equations or steps reduce the central claims to fitted parameters on the same data or to self-citations that bear the load of the derivation; the definition and policy remain independent of the reported outcomes.
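A rough sketch of how the two reactions named above might be dispatched. It estimates the change in input entropy H(x) and in conditional entropy H(y | x) after new data arrive, then picks a reaction with a biased coin. The selection probability used here, |ΔĤ(x)| / (|ΔĤ(x)| + |ΔĤ(y | x)|), is an assumption: the excerpt of the paper's formula on this page is cut off, so the actual rule may differ. All function names and action labels are hypothetical.

```python
import random
from collections import Counter
from math import log


def entropy(counts, n):
    """Empirical Shannon entropy in nats from a Counter of category counts."""
    return -sum((c / n) * log(c / n) for c in counts.values())


def entropy_terms(pairs):
    """Return plug-in estimates of H(x) and H(y | x) = H(x, y) - H(x)."""
    n = len(pairs)
    h_x = entropy(Counter(x for x, _ in pairs), n)
    h_xy = entropy(Counter(pairs), n)
    return h_x, h_xy - h_x


def choose_reaction(existing, new, rng=random):
    """Pick a reaction label from the entropy changes caused by `new`."""
    h_x0, h_yx0 = entropy_terms(existing)
    h_x1, h_yx1 = entropy_terms(existing + new)
    d_hx, d_hyx = h_x1 - h_x0, h_yx1 - h_yx0
    total = abs(d_hx) + abs(d_hyx)
    if total == 0:
        return "no_reaction", d_hx, d_hyx
    # ASSUMPTION: normalised magnitudes; the source formula is truncated.
    p_adjust = abs(d_hx) / total
    action = "sampling_adjustment" if rng.random() < p_adjust else "process_forking"
    return action, d_hx, d_hyx


# Hypothetical data: new inputs land in unseen regions, the mapping stays exact.
existing = [(0, 0), (1, 1)] * 5
print(choose_reaction(existing, [(2, 2), (3, 3)], random.Random(0)))
```

In this toy case only the input entropy moves, so the coin always selects the sampling reaction; when both terms change by similar amounts the choice becomes genuinely random.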
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Mutual information between new observations and the system's model can be used as a reliable indicator of epistemic growth.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
unclear: Relation between the paper passage and the cited Recognition theorem.
MIS ≜ Î_{n+m} − Î_n ... Theorem 1 ... MIS± bounds ... three-pronged reaction policy (sampling adjustment, process forking, coin-toss)
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
unclear: Relation between the paper passage and the cited Recognition theorem.
mutual information ... epistemic growth ... reflection on learning progression
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
This is true when we regard the initial observations as true system information
We assume that the existing observations are typical in the sense of the Asymptotic Equipartition Property (44), meaning that empirical statistics computed from the data are representative of their corresponding expected values under the experimental design's intended distribution, i.e., $\hat{I}_n \approx \mathbb{E}[\hat{I}_n]$. This is true when we regard the initial observati...
-
[2]
The number of existing observations $n$ is much smaller than the cardinality of the spaces $X$ and $Y$: $n \ll |X|, |Y|$
-
[3]
The number of new observations $m$ is much smaller than the number of existing observations: $m \ll n$. Theorem 1. Consider a well-regulated autonomous system defined in Section 3.1, which satisfies the conditions in Assumption 1. With probability at least $1 - \rho$, the change in MLE-based mutual information estimates satisfies: $\hat{I}_{n+m} - \hat{I}_n \in (\log(m + n) - \log n)$ ...
-
[4]
Stagnation in Exploration: A downward shift driven by a decrease in input entropy, $\Delta H(\mathbf{x}) < 0$, suggests the system repeatedly samples in a limited region, thus gathering redundant data with minimal new information
-
[5]
Increased Noise or Process Drift: A downward shift could also result from increased conditional entropy, $\Delta H(\mathbf{y} \mid \mathbf{x}) > 0$, indicating greater uncertainty in predicting $\mathbf{y}$ given $\mathbf{x}$. Practically, this often signifies increased external noise or a fundamental change in the underlying process. Violation from Above: Sudden Growth in Understanding. If MIS > MIS+, thi...
-
[6]
Aggressive Exploration: If the increase is driven by higher input entropy, $\Delta H(\mathbf{x}) > 0$, the system is likely exploring previously unvisited regions aggressively, potentially inflating knowledge gains without sufficient validation
-
[7]
Reduction in Noise: An increase due to reduced conditional entropy, $\Delta H(\mathbf{y} \mid \mathbf{x}) < 0$, signals a desirable decrease in uncertainty, thus generally representing a beneficial development
-
[8]
Novel Discovery: An increase in output entropy, $\Delta H(\mathbf{y}) > 0$, suggests discovery of novel and previously rare outputs, particularly valuable in exploratory or scientific contexts.
Summary Table
Violation Type | Possible Causes | Trend in Mutual Information
Violation from Below | Stagnation in exploration | $\downarrow H(\mathbf{x}) \Rightarrow \downarrow I(\mathbf{x}, \mathbf{y})$
 | Increased noise / process drift | $\uparrow H(\mathbf{y} \mid \mathbf{x})$...
-
[9]
Sampling Adjustment. The first policy addresses variations in input entropy $H(\mathbf{x})$. If $\Delta \hat{H}(\mathbf{x}) > 0$ dominates MIS, indicating overly aggressive exploration, the system should moderate exploration and emphasize exploitation to prevent fitting to noise. Conversely, if $\Delta \hat{H}(\mathbf{x}) < 0$, suggesting redundant sampling, the system should enhance exploration to restore...
-
[10]
Process Forking. The second policy responds to variations in conditional entropy $H(\mathbf{y} \mid \mathbf{x})$, i.e., changes in function mapping. Upon surprise triggered by $\Delta \hat{H}(\mathbf{y} \mid \mathbf{x})$, the system forks into two subprocesses, each consisting of $n$ existing observations and $m$ new observations divided at the surprise moment (Theorem 1). The two subprocesses represent the p...
-
[11]
the extra resources spent on deciding the nature of an observation
Coin Toss Resolution. There are occasions where changes in $\Delta \hat{H}(\mathbf{x})$ and $\Delta \hat{H}(\mathbf{y} \mid \mathbf{x})$ are comparable, making selecting a reaction policy challenging. Instead of arbitrarily favoring the slightly larger change, we always use a biased coin toss approach, stochastically selecting which entropy to address based on the magnitude of changes: $p_{\text{adjust}} = |\Delta \hat{H}(\mathbf{x})|$...
-
[12]
SR: The surprise-reactive sampling method (14) switches between exploration and exploitation modes based on observed Shannon or Bayesian Surprise. By default, SR operates in an exploration mode guided by the widely used space-filling principle (53), selecting new sampling locations via the min-max objective: $\mathbf{x}^* = \arg\max_{\mathbf{x}} \min_{\mathbf{x}_i \in X} \|\mathbf{x} - \mathbf{x}_i\|_2$, where...
-
[13]
SC/E: The subtractive clustering/entropy active learning strategy (51) selects the next sampling location by maximizing a custom acquisition function. For an unseen region $X$ and a probabilistic predictive function $\hat{f}(\mathbf{x})$ trained on the observed data, the acquisition function is defined as: $a(\mathbf{x}) = (1 - \eta)\,\mathbb{E}_{\mathbf{x}' \in X}\left[e^{-\|\mathbf{x} - \mathbf{x}'\|_2}\right] + \eta H(\hat{f}(\mathbf{x}))$, where $\eta$ is the ...
-
[14]
GS/QBC: The greedy search/query by committee active learning strategy (52) uses a different acquisition function. Given the set of seen observations $\{X, Y\}$ and a model committee $F$ composed of multiple predictive models trained on this data, the acquisition function is defined as: $a(\mathbf{x}) = (1 - \eta) \min_{\mathbf{x}', \mathbf{y}' \in X, Y} \|\mathbf{x} - \mathbf{x}'\|_2 \, \|\hat{f}(\mathbf{x}) - \mathbf{y}'\|_2 + \eta \max_{\hat{f}(\cdot),\, \hat{f}'(}$...
-
[15]
in one $X$-category and one $Y$-category the counts change by $\pm 1$ (all other marginal counts are unchanged)
-
[16]
in one joint cell the count changes by $-1$ and in another joint cell the count changes by $+1$. Step 1. How much can one empirical Shannon entropy change? Assume a single observation is moved from category $A$ to category $B$. Let the counts before the move be $A = a$ (with $a \ge 1$) and $B = b$ (with $b \ge 0$). After the move the counts become $a - 1$ and $b + 1$. Only these ...
-
[17]
B. Burger, P. M. Maffettone, V. V. Gusev, C. M. Aitchison, Y. Bai, X. Wang, X. Li, B. M. Alston, B. Li, R. Clowes, N. Rankin, B. Harris, R. S. Sprick, and A. I. Cooper, “A mobile robotic chemist,” Nature, vol. 583, pp. 237–241, 2020
-
[18]
Scaling deep learning for materials discovery,
A. Merchant, S. Batzner, S. S. Schoenholz, M. Aykol, G. Cheon, and E. D. Cubuk, “Scaling deep learning for materials discovery,” Nature, vol. 624, pp. 80–85, 2023
-
[19]
An autonomous laboratory for the accelerated synthesis of novel materials,
N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, H. Kim, A. Jain, C. J. Bartel, K. Persson, Y. Zeng, and G. Ceder, “An autonomous laboratory for the accelerated synthesis of novel materials,” Nature, vol. 624, pp. 86–91, 2023
-
[20]
Autonomous mobile robots for exploratory synthetic chemistry,
T. Dai, S. Vijayakrishnan, F. T. Szczypiński, J.-F. Ayme, E. Simaei, T. Fellowes, R. Clowes, L. Kotopanov, C. E. Shields, Z. Zhou, J. W. Ward, and A. I. Cooper, “Autonomous mobile robots for exploratory synthetic chemistry,” Nature, vol. 635, pp. 890–897, 2024
-
[21]
Towards fully autonomous driving: Systems and algorithms,
J. Levinson, J. Askeland, J. Becker, J. Dolson, D. Held, S. Kammel, J. Z. Kolter, D. Langer, O. Pink, V. Pratt, M. Sokolsky, G. Stanek, D. Stavens, A. Teichman, M. Werling, and S. Thrun, “Towards fully autonomous driving: Systems and algorithms,” in Proceedings of the 2011 IEEE Intelligent Vehicles Symposium, (Baden-Baden, Germany), June 2011
-
[22]
Self-driving laboratory for accelerated discovery of thin-film materials,
B. P. MacLeod, F. G. Parlane, T. D. Morrissey, F. Häse, L. M. Roch, K. E. Dettelbach, R. Moreira, L. P. Yunker, M. B. Rooney, and J. R. Deeth, “Self-driving laboratory for accelerated discovery of thin-film materials,” Science Advances, vol. 6, no. 20, p. eaaz8867, 2020
-
[23]
A survey of autonomous driving: Common practices and emerging technologies,
E. Yurtsever, J. Lambert, A. Carballo, and K. Takeda, “A survey of autonomous driving: Common practices and emerging technologies,” IEEE Access, vol. 8, pp. 58443–58469, 2020
-
[24]
Anomaly detection in autonomous driving: A survey,
D. Bogdoll, M. Nitsche, and J. M. Zöllner, “Anomaly detection in autonomous driving: A survey,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (New Orleans, USA), June 2022
-
[25]
An autonomous manufacturing system based on swarm of cognitive agents,
H.-S. Park and N.-H. Tran, “An autonomous manufacturing system based on swarm of cognitive agents,” Journal of Manufacturing Systems, vol. 31, no. 3, pp. 337–348, 2012
-
[26]
Towards resilience in industry 5.0: A decentralized autonomous manufacturing paradigm,
J. Leng, Y. Zhong, Z. Lin, K. Xu, D. Mourtzis, X. Zhou, P. Zheng, Q. Liu, J. L. Zhao, and W. Shen, “Towards resilience in industry 5.0: A decentralized autonomous manufacturing paradigm,” Journal of Manufacturing Systems, vol. 71, pp. 95–114, 2023
-
[27]
High-tech defense industries: Developing autonomous intelligent systems,
J. Reis, Y. Cohen, N. Melão, J. Costa, and D. Jorge, “High-tech defense industries: Developing autonomous intelligent systems,” Applied Sciences, vol. 11, no. 11, p. 4920, 2021
-
[28]
Autonomy in materials research: A case study in carbon nanotube growth,
P. Nikolaev, D. Hooper, F. Webber, R. Rao, K. Decker, M. Krein, J. Poleski, R. Barto, and B. Maruyama, “Autonomy in materials research: A case study in carbon nanotube growth,” NPJ Computational Materials, vol. 2, p. 16031, 2016
-
[29]
Efficient closed-loop maximization of carbon nanotube growth rate using Bayesian optimization,
J. Chang, P. Nikolaev, J. Carpena-Núñez, R. Rao, K. Decker, A. E. Islam, J. Kim, M. A. Pitt, J. I. Myung, and B. Maruyama, “Efficient closed-loop maximization of carbon nanotube growth rate using Bayesian optimization,” Scientific Reports, vol. 10, p. 9040, 2020
-
[30]
Toward futuristic autonomous experimentation—a surprise-reacting sequential experiment policy,
I. Ahmed, S. T. Bukkapatnam, B. Botcha, and Y. Ding, “Toward futuristic autonomous experimentation—a surprise-reacting sequential experiment policy,” IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 7912–7926, 2025
-
[31]
Z.-G. Zhou and P. Tang, “Continuous anomaly detection in satellite image time series based on z-scores of season-trend model residuals,” in Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium, (Beijing, China), July 2016
-
[32]
Active hypothesis testing for anomaly detection,
K. Cohen and Q. Zhao, “Active hypothesis testing for anomaly detection,” IEEE Transactions on Information Theory, vol. 61, no. 3, pp. 1432–1450, 2015
-
[33]
Null hypothesis test for anomaly detection,
J. F. Kamenik and M. Szewc, “Null hypothesis test for anomaly detection,” Physics Letters B, vol. 840, p. 137836, 2023
-
[34]
A survey of distance and similarity measures used within network intrusion anomaly detection,
D. J. Weller-Fahy, B. J. Borghetti, and A. A. Sodemann, “A survey of distance and similarity measures used within network intrusion anomaly detection,” IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 70–91, 2014
-
[35]
Artificial immune system via Euclidean distance minimization for anomaly detection in bearings,
L. Montechiesi, M. Cocconcelli, and R. Rubini, “Artificial immune system via Euclidean distance minimization for anomaly detection in bearings,” Mechanical Systems and Signal Processing, vol. 76, pp. 380–393, 2016
-
[36]
Online anomaly detection for hard disk drives based on Mahalanobis distance,
Y. Wang, Q. Miao, E. W. Ma, K.-L. Tsui, and M. G. Pecht, “Online anomaly detection for hard disk drives based on Mahalanobis distance,” IEEE Transactions on Reliability, vol. 62, no. 1, pp. 136–145, 2013
-
[37]
Mahalanobis distance based adversarial network for anomaly detection,
Y. Hou, Z. Chen, M. Wu, C.-S. Foo, X. Li, and R. M. Shubair, “Mahalanobis distance based adversarial network for anomaly detection,” in Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, (Virtual), May 2020
-
[38]
F-anoGAN: Fast unsupervised anomaly detection with generative adversarial networks,
T. Schlegl, P. Seeböck, S. M. Waldstein, G. Langs, and U. Schmidt-Erfurth, “F-anoGAN: Fast unsupervised anomaly detection with generative adversarial networks,” Medical Image Analysis, vol. 54, pp. 30–44, 2019
-
[39]
B. Lian, Y. Kartal, F. L. Lewis, D. G. Mikulski, G. R. Hudas, Y. Wan, and A. Davoudi, “Anomaly detection and correction of optimizing autonomous systems with inverse reinforcement learning,” IEEE Transactions on Cybernetics, vol. 53, no. 7, pp. 4555–4566, 2022
-
[40]
A. Barto, M. Mirolli, and G. Baldassarre, “Novelty or surprise?,” Frontiers in Psychology, vol. 4, p. 907, 2013
-
[41]
Bayesian surprise attracts human attention,
L. Itti and P. Baldi, “Bayesian surprise attracts human attention,” Vision Research, vol. 49, no. 10, pp. 1295–1306, 2009
-
[42]
Learning in volatile environments with the Bayes factor surprise,
V. Liakoni, A. Modirshanechi, W. Gerstner, and J. Brea, “Learning in volatile environments with the Bayes factor surprise,” Neural Computation, vol. 33, no. 2, pp. 269–340, 2021
-
[43]
Balancing new against old information: The role of puzzlement surprise in learning,
M. Faraji, K. Preuschoff, and W. Gerstner, “Balancing new against old information: The role of puzzlement surprise in learning,” Neural Computation, vol. 30, no. 1, pp. 34–83, 2018
-
[44]
Anomaly detection for autonomous guided vehicles using Bayesian surprise,
O. Çatal, S. Leroux, C. De Boom, T. Verbelen, and B. Dhoedt, “Anomaly detection for autonomous guided vehicles using Bayesian surprise,” in Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, (Las Vegas, USA), October 2020
-
[45]
A Bayesian surprise approach in designing cognitive radar for autonomous driving,
Y. Zamiri-Jafarian and K. N. Plataniotis, “A Bayesian surprise approach in designing cognitive radar for autonomous driving,” Entropy, vol. 24, no. 5, p. 672, 2022
-
[46]
A. Dinparastdjadid, I. Supeene, and J. Engstrom, “Measuring surprise in the wild,” arXiv preprint arXiv:2305.07733, 2023
-
[47]
An augmented surprise-guided sequential learning framework for predicting the melt pool geometry,
A. S. Raihan, H. Khosravi, T. H. Bhuiyan, and I. Ahmed, “An augmented surprise-guided sequential learning framework for predicting the melt pool geometry,” Journal of Manufacturing Systems, vol. 75, pp. 56–77, 2024
-
[48]
Autonomous experimentation systems and benefit of surprise-based Bayesian optimization,
S. Jin, J. R. Deneault, B. Maruyama, and Y. Ding, “Autonomous experimentation systems and benefit of surprise-based Bayesian optimization,” in Proceedings of the 2022 International Symposium on Flexible Automation, (Yokohama, Japan), July 2022
-
[49]
A taxonomy of surprise definitions,
A. Modirshanechi, J. Brea, and W. Gerstner, “A taxonomy of surprise definitions,” Journal of Mathematical Psychology, vol. 110, p. 102712, 2022
-
[50]
A computational theory of surprise,
P. Baldi, “A computational theory of surprise,” in Information, Coding and Mathematics: Proceedings of Workshop Honoring Prof. Bob Mceliece on his 60th Birthday, pp. 1–25, 2002
-
[51]
Human inference in changing environments with temporal structure,
A. Prat-Carrabin, R. C. Wilson, J. D. Cohen, and R. Azeredo da Silveira, “Human inference in changing environments with temporal structure,” Psychological Review, vol. 128, no. 5, pp. 879–912, 2021
-
[52]
Alternatives to the median absolute deviation,
P. J. Rousseeuw and C. Croux, “Alternatives to the median absolute deviation,” Journal of the American Statistical Association, vol. 88, no. 424, pp. 1273–1283, 1993
-
[53]
Clustering and unsupervised anomaly detection with l-2 normalized deep auto-encoder representations,
C. Aytekin, X. Ni, F. Cricri, and E. Aksu, “Clustering and unsupervised anomaly detection with l-2 normalized deep auto-encoder representations,” in Proceedings of the 2018 International Joint Conference on Neural Networks, (Rio de Janeiro, Brazil), October 2018
-
[54]
Anomaly detection with multiple-hypotheses predictions,
D. T. Nguyen, Z. Lou, M. Klar, and T. Brox, “Anomaly detection with multiple-hypotheses predictions,” in Proceedings of the 36th International Conference on Machine Learning, (Long Beach, USA), June 2019
-
[55]
A computational analysis of the neural bases of Bayesian inference,
A. Kolossa, B. Kopp, and T. Fingscheidt, “A computational analysis of the neural bases of Bayesian inference,” Neuroimage, vol. 106, pp. 222–237, 2015
-
[56]
A mathematical theory of communication,
C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948
-
[57]
Estimation of entropy and mutual information,
L. Paninski, “Estimation of entropy and mutual information,” Neural Computation, vol. 15, no. 6, pp. 1191–1253, 2003
-
[58]
The permutation test for feature selection by mutual information,
D. François, V. Wertz, and M. Verleysen, “The permutation test for feature selection by mutual information,” in Proceedings of the 14th European Symposium on Artificial Neural Networks, (Bruges, Belgium), April 2006
-
[59]
Mutual information-based feature selection for multilabel classification,
G. Doquire and M. Verleysen, “Mutual information-based feature selection for multilabel classification,” Neurocomputing, vol. 122, pp. 148–155, 2013
-
[60]
T. M. Cover, Elements of Information Theory. John Wiley & Sons, 1999
-
[61]
Exploration vs. exploitation in active learning: A Bayesian approach,
A. Bondu, V. Lemaire, and M. Boullé, “Exploration vs. exploitation in active learning: A Bayesian approach,” in Proceedings of the 2010 International Joint Conference on Neural Networks, (Barcelona, Spain), July 2010
-
[62]
A unifying view on dataset shift in classification,
J. G. Moreno-Torres, T. Raeder, R. Alaiz-Rodríguez, N. V. Chawla, and F. Herrera, “A unifying view on dataset shift in classification,” Pattern Recognition, vol. 45, no. 1, pp. 521–530, 2012
-
[63]
Covariate shift adaptation by importance weighted cross validation,
M. Sugiyama, M. Krauledat, and K.-R. Müller, “Covariate shift adaptation by importance weighted cross validation,” Journal of Machine Learning Research, vol. 8, no. 5, pp. 985–1005, 2007
-
[64]
Discriminative learning under covariate shift,
S. Bickel, M. Brückner, and T. Scheffer, “Discriminative learning under covariate shift,” Journal of Machine Learning Research, vol. 10, no. 9, pp. 2137–2155, 2009
-
[65]
An overview of concept drift applications,
I. Žliobaitė, M. Pechenizkiy, and J. Gama, “An overview of concept drift applications,” Big Data Analysis: New Algorithms for a New Society, vol. 16, pp. 91–114, 2016
-
[66]
Concept drift monitoring and diagnostics of supervised learning models via score vectors,
K. Zhang, A. T. Bui, and D. W. Apley, “Concept drift monitoring and diagnostics of supervised learning models via score vectors,” Technometrics, vol. 65, no. 2, pp. 137–149, 2023
-
[67]
Active learning for object classification: From exploration to exploitation,
N. Cebron and M. R. Berthold, “Active learning for object classification: From exploration to exploitation,” Data Mining and Knowledge Discovery, vol. 18, pp. 283–299, 2009
-
[68]
U. J. Islam, K. Paynabar, G. Runger, and A. S. Iquebal, “Dynamic exploration–exploitation trade-off in active learning regression with Bayesian hierarchical modeling,” IISE Transactions, vol. 57, no. 4, pp. 393–407, 2025
-
[69]
Space-filling designs for computer experiments: A review,
V. R. Joseph, “Space-filling designs for computer experiments: A review,” Quality Engineering, vol. 28, no. 1, pp. 28–35, 2016
-
[70]
Generalization errors and learning curves for regression with multi-task Gaussian processes,
K. Chai, “Generalization errors and learning curves for regression with multi-task Gaussian processes,” in Proceedings of the 23rd Advances in Neural Information Processing Systems, (Vancouver, Canada), December 2009
-
[71]
Entropy and information in neural spike trains,
S. P. Strong, R. Koberle, R. R. D. R. Van Steveninck, and W. Bialek, “Entropy and information in neural spike trains,” Physical Review Letters, vol. 80, p. 197, 1998
-
[72]
On the method of bounded differences,
C. McDiarmid, “On the method of bounded differences,” Surveys in Combinatorics, vol. 141, no. 1, pp. 148–188, 1989