The Ethical Dilemma when (not) Setting up Cost-based Decision Rules in Semantic Segmentation

Fabian Hüger; Hanno Gottschalk; Matthias Rottmann; Peter Schlicht; Radin Dardashti; Robin Chan

arxiv: 1907.01342 · v1 · pith:AAKI3WAWnew · submitted 2019-07-02 · 💻 cs.CV

Robin Chan, Matthias Rottmann, Radin Dardashti, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

Pith reviewed 2026-05-25 11:14 UTC · model grok-4.3

classification 💻 cs.CV

keywords semantic segmentation · decision theory · cost functions · ethical dilemmas · precision recall · false positive rate · MAP rule · urban scenes

The pith

Cost functions from egoistic and altruistic views alter precision, recall and error rates when replacing the MAP rule in semantic segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Neural networks for semantic segmentation output probability distributions over classes for each pixel and normally select the class with the highest probability via the maximum a-posteriori rule. This rule is optimal only under a symmetric cost function that treats every class confusion equally. The paper instead defines two explicit cost matrices, one egoistic and one altruistic, then forms new decision rules by linear interpolation between the MAP rule and each of these cost-based rules. It demonstrates that the resulting segmentations produce different values for precision, recall, and segment-wise false-positive and false-negative rates on urban street scenes.
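As a minimal sketch of the two kinds of decision rule described above, the following NumPy snippet compares the MAP rule with a rule that minimizes expected cost. The function names, the index convention `cost[i, j]` (cost of predicting class `i` when the true class is `j`), the two-class setup (0 = street, 1 = person), and all numbers are assumptions made for illustration; they are not the paper's matrices.

```python
import numpy as np

def map_rule(probs):
    """Pick the class with the highest posterior probability per pixel."""
    return np.argmax(probs, axis=-1)

def cost_based_rule(probs, cost):
    """Pick the class with the lowest expected cost per pixel.

    Assumed convention: cost[i, j] is the cost of predicting class i
    when the true class is j, so the expected cost of predicting i is
    sum_j cost[i, j] * probs[..., j].
    """
    expected_cost = probs @ cost.T  # shape (..., K)
    return np.argmin(expected_cost, axis=-1)

# Hypothetical softmax output for a 1 x 2 image with classes (street, person).
probs = np.array([[[0.60, 0.40],
                   [0.95, 0.05]]])

# Symmetric 0-1 cost: every confusion costs the same.
zero_one = 1.0 - np.eye(2)

# Hypothetical asymmetric cost: missing a person (predicting street when
# the truth is person) is ten times as costly as the reverse confusion.
asym = np.array([[0.0, 10.0],
                 [1.0, 0.0]])

map_labels = map_rule(probs)
sym_labels = cost_based_rule(probs, zero_one)
asym_labels = cost_based_rule(probs, asym)
```

Under the 0-1 cost the cost-based rule reproduces the MAP labels. Under the asymmetric cost the first pixel switches to person even though street has the higher probability, while the second pixel stays street.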

Core claim

We define two cost functions from different extreme perspectives, an egoistic and an altruistic one, and show how safety relevant quantities like precision / recall and (segment-wise) false positive / negative rate change when interpolating between MAP, egoistic and altruistic decision rules.

What carries the argument

Linear interpolation between the MAP decision rule and two explicitly defined cost matrices (egoistic and altruistic) that assign different penalties to specific class confusions.

If this is right

Different class confusions can be weighted unequally, so that mistaking a person for a street incurs a different cost from mistaking a building for a tree.
Safety quantities such as precision, recall, and segment-wise false-positive and false-negative rates become tunable by the choice of cost perspective.
The standard MAP rule is revealed as only one point on a continuum of possible decision rules once costs are made explicit.
Explicit cost assignment immediately surfaces ethical questions about whose interests the model should prioritize.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same interpolation technique could be applied to other dense prediction tasks such as depth estimation or instance segmentation.
Practical deployment would require a method for eliciting or validating the numerical cost values from stakeholders.
The observed metric shifts could be used to calibrate decision thresholds in safety-critical systems once the costs are fixed.

Load-bearing premise

That numerical costs reflecting distinct ethical perspectives can be meaningfully assigned to each type of class confusion and that linear interpolation between the resulting rules yields interpretable shifts in safety metrics.

What would settle it

A concrete test on a segmentation model where the interpolated egoistic and altruistic decision rules produce no measurable change in precision, recall, or segment-wise false-positive/negative rates compared with the MAP rule.
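One way such a comparison could be set up is sketched below for pixel-wise precision and recall only. The helper `precision_recall` and the toy label maps are hypothetical; segment-wise false-positive and false-negative rates would additionally need a connected-component step that is not shown here.

```python
import numpy as np

def precision_recall(pred, truth, cls):
    """Pixel-wise precision and recall of class `cls` for two label maps."""
    pred_c = pred == cls
    true_c = truth == cls
    tp = np.sum(pred_c & true_c)    # predicted cls, truly cls
    fp = np.sum(pred_c & ~true_c)   # predicted cls, truly something else
    fn = np.sum(~pred_c & true_c)   # missed cls pixels
    precision = tp / (tp + fp) if tp + fp > 0 else float("nan")
    recall = tp / (tp + fn) if tp + fn > 0 else float("nan")
    return float(precision), float(recall)

# Toy ground truth and two hypothetical predictions (1 = person, 0 = other).
truth = np.array([[1, 1, 0, 0],
                  [0, 0, 0, 0]])
pred_map = np.array([[1, 0, 0, 0],
                     [0, 0, 0, 0]])   # e.g. output of the MAP rule
pred_cost = np.array([[1, 1, 1, 0],
                      [0, 0, 0, 0]])  # e.g. output of a cost-based rule

p_map, r_map = precision_recall(pred_map, truth, cls=1)
p_cost, r_cost = precision_recall(pred_cost, truth, cls=1)
```

In this toy case the cost-based prediction trades precision for recall on the person class. If both rules returned identical values on real data, the paper's claim would not be supported for that model.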

Figures

Figures reproduced from arXiv: 1907.01342 by Fabian Hüger, Hanno Gottschalk, Matthias Rottmann, Peter Schlicht, Radin Dardashti, Robin Chan.

**Figure 2.** Illustration of two segmentation masks obtained with the … [PITH_FULL_IMAGE:figures/full_fig_p004_2.png]

**Figure 4.** Regions of interest derived from the priors of the classes … [PITH_FULL_IMAGE:figures/full_fig_p005_4.png]

**Figure 5.** Illustration of three semantic segmentation masks and different perception obtained by the application of cost-based decision … [PITH_FULL_IMAGE:figures/full_fig_p006_5.png]

**Figure 6.** Two extreme confusion cost matrices that we study in our experiments. [PITH_FULL_IMAGE:figures/full_fig_p006_6.png]

**Figure 7.** Confusion cost matrix space V spanned by our exemplary altruistic (C_A) and egoistic (C_E) cost matrices and the robotistic (C_R) cost matrix. The heatmap inside the triangle shows rec(V(C) | person), the recall of person pixels. Blue indicates high recall, red indicates low recall. […] from this analysis is that DeepLabv3+ confuses only persons which are not completely visible, e.g., persons standi…

**Figure 9.** Falsely detected (false positive) person (top row) and building (bottom row) segments. [PITH_FULL_IMAGE:figures/full_fig_p008_9.png]

**Figure 10.** Non-detected (false negative) person (top row) and building (bottom row) segments. [PITH_FULL_IMAGE:figures/full_fig_p008_10.png]

Original abstract

Neural networks for semantic segmentation can be seen as statistical models that provide for each pixel of one image a probability distribution on predefined classes. The predicted class is then usually obtained by the maximum a-posteriori probability (MAP) which is known as Bayes rule in decision theory. From decision theory we also know that the Bayes rule is optimal regarding the simple symmetric cost function. Therefore, it weights each type of confusion between two different classes equally, e.g., given images of urban street scenes there is no distinction in the cost function if the network confuses a person with a street or a building with a tree. Intuitively, there might be confusions of classes that are more important to avoid than others. In this work, we want to raise awareness of the possibility of explicitly defining confusion costs and the associated ethical difficulties if it comes down to providing numbers. We define two cost functions from different extreme perspectives, an egoistic and an altruistic one, and show how safety relevant quantities like precision / recall and (segment-wise) false positive / negative rate change when interpolating between MAP, egoistic and altruistic decision rules.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Flags ethical weight in segmentation cost rules but leaves the interpolation step undefined.

The main thing here is that the authors treat the usual MAP rule in semantic segmentation as one particular cost choice and then define two others (an egoistic matrix and an altruistic matrix) to show that safety metrics such as precision, recall, and segment-wise false-positive rates shift when you move between the three rules. The explicit naming of those two extreme ethical standpoints is the piece that is not routine in the cited decision-theoretic work on segmentation. That framing does surface a real point: once costs are allowed to differ by class pair, the numbers chosen carry consequences that are worth naming in safety-critical settings. The paper does that naming plainly.

The limitation is that the interpolation operator itself is not described. The abstract simply says the metrics change "when interpolating" without stating whether the cost matrices are blended first, whether the decision thresholds on the posteriors are averaged, or whether the final pixel labels are combined in some other way. Without that step, the trajectories in the metrics cannot be read as the result of any coherent family of cost functions. The manuscript also contains no numerical example or figure, so the demonstration stays conceptual.

This is for readers already working on cost-sensitive methods or on the ethics of deployed vision systems. Someone who needs a reproducible procedure or quantitative validation will not get it. Someone who wants a short prompt for thinking about whose costs are encoded in the loss function may find the language useful.

I would send the paper to referees. The topic is worth a public record even if the current version needs the interpolation rule written down before the central claim can be evaluated.

Referee Report

2 major / 1 minor

Summary. The paper claims that semantic segmentation networks typically employ MAP decision rules derived from symmetric 0-1 costs, but that asymmetric cost matrices can be defined from contrasting ethical standpoints (egoistic versus altruistic). It asserts that interpolating between the MAP rule and the two cost-derived Bayes rules produces observable, safety-relevant shifts in precision, recall, and segment-wise false-positive/negative rates, thereby illustrating the ethical difficulties of assigning numerical confusion costs.

Significance. If the interpolation construction and resulting metric trajectories could be made rigorous and reproducible, the manuscript would usefully foreground the ethical implications of cost-sensitive decision rules in safety-critical vision systems. At present the contribution remains conceptual and lacks any quantitative demonstration or explicit mathematical construction, limiting its technical impact.

major comments (2)

[Abstract] The interpolation operator between the MAP, egoistic, and altruistic decision rules is never defined. It is therefore impossible to determine whether the reported changes in precision/recall and segment-wise FP/FN rates arise from a coherent family of cost functions (e.g., convex combination of the three cost matrices) or from an ad-hoc blending of outputs; this specification is load-bearing for the central claim that the metric trajectories are interpretable as ethical trade-offs.
[Abstract] No concrete cost matrices, no explicit Bayes decision rules, and no quantitative results or experimental protocol are supplied. The demonstration therefore rests solely on conceptual description, leaving the weakest assumption (that linear interpolation between rules yields meaningful safety-relevant changes) untested.

minor comments (1)

[Title] The parenthetical “(not)” in the title is ambiguous; a clearer phrasing would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive critique emphasizing the need for explicit mathematical definitions and quantitative support. We agree these elements are required to make the ethical trade-off claim rigorous and will revise the manuscript accordingly.

Referee: [Abstract] The interpolation operator between the MAP, egoistic, and altruistic decision rules is never defined. It is therefore impossible to determine whether the reported changes in precision/recall and segment-wise FP/FN rates arise from a coherent family of cost functions (e.g., convex combination of the three cost matrices) or from an ad-hoc blending of outputs; this specification is load-bearing for the central claim that the metric trajectories are interpretable as ethical trade-offs.

Authors: The referee correctly identifies that the interpolation operator is not formally defined. We will revise the manuscript to define it explicitly as a convex combination of cost matrices: for λ ∈ [0,1], C(λ) = (1−λ) C_MAP + λ C_ego (and analogously for the altruistic matrix). The pixel-wise Bayes rule is then obtained by minimizing the expected cost under C(λ). This construction will be added to the abstract, methods, and a new figure showing the resulting metric trajectories, ensuring the changes are reproducible and directly interpretable as ethical trade-offs.
revision: yes

Referee: [Abstract] No concrete cost matrices, no explicit Bayes decision rules, and no quantitative results or experimental protocol are supplied. The demonstration therefore rests solely on conceptual description, leaving the weakest assumption (that linear interpolation between rules yields meaningful safety-relevant changes) untested.

Authors: We agree that concrete matrices, explicit rules, and quantitative results are necessary. In the revision we will supply numerical egoistic and altruistic cost matrices (with clear ethical rationales), derive the corresponding Bayes decision rules, and report results from applying the interpolated rules to a semantic segmentation network on an urban scene dataset. The experimental protocol, including how segment-wise FP/FN rates are computed, will be detailed so that the safety-relevant metric shifts can be verified.
revision: yes
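The convex combination C(λ) = (1−λ) C_MAP + λ C_ego named in the simulated rebuttal can be sketched as follows. This is an illustration of that formula only: the asymmetric matrix `c_asym`, the index convention `cost[i, j]` (cost of predicting class `i` when the true class is `j`), and the single-pixel posterior are hypothetical, and nothing here is taken from the paper's experiments.

```python
import numpy as np

def cost_based_rule(probs, cost):
    """Return the class with minimal expected cost.

    Assumed convention: cost[i, j] = cost of predicting i when truth is j.
    """
    return np.argmin(probs @ cost.T, axis=-1)

def interpolate_cost(cost_a, cost_b, lam):
    """Convex combination (1 - lam) * cost_a + lam * cost_b, lam in [0, 1]."""
    return (1.0 - lam) * cost_a + lam * cost_b

c_map = 1.0 - np.eye(2)            # symmetric 0-1 cost, equivalent to MAP
c_asym = np.array([[0.0, 10.0],    # hypothetical asymmetric cost matrix
                   [1.0, 0.0]])    # classes: 0 = street, 1 = person

pixel = np.array([0.6, 0.4])       # hypothetical posterior for one pixel
lambdas = [0.0, 0.05, 0.1, 0.5, 1.0]

# Decision for the same pixel as the cost matrix moves from c_map to c_asym.
labels = [
    int(cost_based_rule(pixel, interpolate_cost(c_map, c_asym, lam)))
    for lam in lambdas
]
```

For this pixel the label flips from street to person once λ exceeds roughly 0.056, which is the point where the expected cost of predicting street, 0.4 · (1 + 9λ), overtakes the expected cost of predicting person, 0.6.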

Circularity Check

0 steps flagged

No circularity: demonstration constructed directly from explicitly defined cost functions

The paper defines egoistic and altruistic cost matrices by explicit construction, derives the associated Bayes decision rules (distinct from MAP), and then examines metric changes under interpolation. No equations reduce a claimed result to a fitted parameter, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled via prior work. The demonstration is therefore self-contained against the introduced definitions; the interpolation step itself is presented as an exploratory device rather than a derived prediction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on the standard decision-theoretic fact that MAP is optimal under symmetric costs; no free parameters, new entities, or ad-hoc axioms beyond this background fact are introduced in the abstract.

axioms (1)

[standard math] Bayes rule (MAP) is optimal with respect to the simple symmetric cost function.
Invoked in the first paragraph of the abstract as established decision theory.
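For reference, the standard argument behind this axiom can be written out as follows. The notation (K classes, posterior p(k | x), and cost c(k, j) for predicting k when the true class is j) is chosen here for illustration and need not match the paper's symbols.

```latex
\begin{align*}
\hat{k}(x) &= \operatorname*{arg\,min}_{k \in \{1,\dots,K\}} \sum_{j=1}^{K} c(k, j)\, p(j \mid x), \\
c(k, j) = 1 - \delta_{kj} \;&\Longrightarrow\; \sum_{j=1}^{K} c(k, j)\, p(j \mid x) = \sum_{j \neq k} p(j \mid x) = 1 - p(k \mid x), \\
\text{hence}\quad \hat{k}(x) &= \operatorname*{arg\,max}_{k}\, p(k \mid x).
\end{align*}
```

Minimizing the expected symmetric cost is therefore the same as maximizing the posterior, and any cost matrix with unequal off-diagonal entries can lead to a different decision.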

pith-pipeline@v0.9.0 · 5746 in / 1178 out tokens · 51175 ms · 2026-05-25T11:14:24.114408+00:00 · methodology

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 8 internal anchors

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanh... (2015)
[2] E. Awad, S. Dsouza, R. Kim, J. Schulz, J. Henrich, A. Shariff, J.-F. Bonnefon, and I. Rahwan. The moral machine experiment. Nature, 563:59–64, 2018.
[3] V. Badrinarayanan, A. Kendall, and R. Cipolla. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. CoRR, abs/1511.00561, 2015.
[4] N. T. S. Board. Preliminary report highway hwy18mh010,
[5] J. Broome. Weighing lives. Oxford University Press, 2004.
[6] R. Chan, M. Rottmann, F. Hüger, P. Schlicht, and H. Gottschalk. Application of decision rules for handling class imbalance in semantic segmentation. CoRR, abs/1901.08394, 2019.
[7] L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. CoRR, abs/1606.00915, 2016.
[8] L. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. CoRR, abs/1802.02611, 2018.
[9] F. Chollet. Xception: Deep learning with depthwise separable convolutions. CoRR, abs/1610.02357, 2016.
[10] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[11] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The pascal visual object classes challenge: A retrospective. International Journal of Computer Vision, 111(1):98–136, Jan 2015.
[12] L. Fahrmeir, A. Hamerle, and W. Häussler. Multivariate statistical Methods (in German). Walter De Gruyter, 2 edition,
[13] P. Foot. The problem of abortion and the doctrine of double effect. Oxford Review, 5:5–15, 1967.
[14] J. Himmelreich. Never mind the trolley: The ethics of autonomous vehicles in mundane situations. Ethical Theory and Moral Practice, 21(3):669–684, 2018.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012.
[16] P. Lin. Why ethics matters for autonomous cars. In Autonomous driving, pages 69–85. Springer, Berlin, Heidelberg, 2016.
[17] P. Lin, K. Abney, and G. A. Bekey. Robot ethics: the ethical and social implications of robotics. The MIT Press, 2014.
[18] P. Lin, K. Abney, and R. Jenkins. Robot Ethics 2.0: From Autonomous Cars to Artificial Intelligence. Oxford University Press, 2017.
[19] E. C. on automated, networked driving of the German Federal Ministry for Transport, and Infrastructure. Report of the ethics commission automated and networked driving (in German), 2017.
[20] W. H. Organization. Road traffic injuries, 2018.
[21] O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. CoRR, abs/1505.04597, 2015.
[22] E. Shelhamer, J. Long, and T. Darrell. Fully convolutional networks for semantic segmentation. PAMI, 2016.
[23] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
[24] M. Taylor. Self-driving mercedes-benzes will prioritize occupant safety over pedestrians. Car and Driver, Oct. 7, 2016.
[25] F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. CoRR, abs/1511.07122, 2015.
