Leveraging Knowledge Bases And Parallel Annotations For Music Genre Translation

Anis Khlif; Elena V. Epure; Romain Hennequin

arXiv: 1907.08698 · v2 · pith:74JTB6E4new · submitted 2019-07-18 · 💻 cs.SD · cs.IR · cs.LG · eess.AS · stat.ML

Pith reviewed 2026-05-24 19:23 UTC · model grok-4.3

classification 💻 cs.SD cs.IR cs.LG eess.AS stat.ML

keywords: music genre translation, tag system unification, knowledge bases, logistic regression, hybrid translation, multilabel classification, taxonomy mapping, parallel annotations

The pith

A hybrid of taxonomy mapping and logistic regression translates music genres between tag systems more effectively than either method alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the problem of predicting how musical items would be labeled with genres in a target tag system, given their labels in a source system. It divides real situations into three cases according to whether the systems share no annotated items, a large set of them, or only a few. Corresponding methods are a knowledge-based mapping through taxonomies, a statistical model using maximum-likelihood logistic regression, and a hybrid that feeds the taxonomy mapping in as prior probabilities to a maximum a posteriori logistic regression. Across the cases the hybrid solution yields the strongest results on multilabel classification metrics. The work therefore offers a concrete route to aligning separate genre vocabularies while respecting both their structured knowledge and their observed usage patterns.

Core claim

The central claim is that the hybrid translation modeled with maximum a posteriori logistic regression, using priors supplied by the knowledge-based taxonomy mapping, is systematically the most effective solution with respect to multilabel classification metrics and fits the three identified cases of common annotations between source and target tag systems.

What carries the argument

Hybrid translation that combines taxonomy mapping from knowledge bases with maximum a posteriori logistic regression.
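As a rough illustration of how a knowledge-based mapping can act as a prior, here is a minimal sketch for a single target tag. It assumes a Gaussian prior centred on weights derived from the taxonomy mapping; the abstract does not state the prior's form, so that choice, along with the names `fit_map`, `prior_w`, and `lam` and the toy data, is hypothetical. Setting `lam=0` reduces the fit to plain maximum likelihood.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_map(X, y, prior_w, lam=1.0, lr=0.1, steps=2000):
    """Gradient descent on: mean log-loss + (lam / 2) * ||w - prior_w||^2.

    X: rows of binary source-tag indicators, y: 0/1 labels for one target tag.
    prior_w: weights suggested by the knowledge-based mapping (assumed given).
    """
    w = list(prior_w)  # start from the knowledge-based guess
    n = len(X)
    for _ in range(steps):
        # Gradient of the (assumed Gaussian) prior term.
        grad = [lam * (wj - pj) for wj, pj in zip(w, prior_w)]
        # Gradient of the mean log-loss over the parallel annotations.
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy data: columns are source tags ("alternative rock", "jazz"),
# the label says whether the target system uses "indie rock".
X = [[1, 0], [1, 0], [0, 1]]
y = [1, 1, 0]
prior_w = [2.0, -2.0]                              # assumed KB-derived prior
w_ml = fit_map(X, y, [0.0, 0.0], lam=0.0)          # maximum likelihood only
w_map = fit_map(X, y, prior_w, lam=5.0)            # prior plus data
w_cold = fit_map([], [], prior_w)                  # no parallel data at all
```

With no parallel annotations the weights stay at the prior, and with a large `lam` they stay close to it, which mirrors the intuition that the hybrid falls back on the taxonomy when data is scarce.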

If this is right

Knowledge-based taxonomy mapping suffices when source and target systems share no annotated corpus.
Maximum-likelihood logistic regression suffices when source and target systems share a large annotated corpus.
The hybrid method is required and performs best when source and target systems share only a few annotations.
Musical items can be enriched with consistent tags drawn from multiple genre systems.
Genre tag systems can be unified by jointly using representation diversity in taxonomies and interpretation diversity in annotations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same case division and hybrid construction could be tested on subjective tagging tasks outside music, such as film or book genres.
Large music platforms holding multiple internal tag vocabularies could adopt the hybrid to reduce label inconsistency in recommendation pipelines.
Further experiments could check whether replacing logistic regression with other probabilistic models preserves the hybrid advantage.
The approach suggests a general pattern for cross-system label translation whenever both structured hierarchies and partial parallel data are available.

Load-bearing premise

The three cases based on the quantity of shared annotations cover the main real-world situations, and logistic regression can capture the probabilistic relationships between genre assignments across systems.

What would settle it

A pair of genre tag systems whose overlap falls outside the three defined cases, or for which the hybrid method does not produce the highest multilabel classification scores.

Original abstract

Prevalent efforts have been put in automatically inferring genres of musical items. Yet, the proposed solutions often rely on simplifications and fail to address the diversity and subjectivity of music genres. Accounting for these has, though, many benefits for aligning knowledge sources, integrating data and enriching musical items with tags. Here, we choose a new angle for the genre study by seeking to predict what would be the genres of musical items in a target tag system, knowing the genres assigned to them within source tag systems. We call this a translation task and identify three cases: 1) no common annotated corpus between source and target tag systems exists, 2) such a large corpus exists, 3) only few common annotations exist. We propose the related solutions: a knowledge-based translation modeled as taxonomy mapping, a statistical translation modeled with maximum likelihood logistic regression; a hybrid translation modeled with maximum a posteriori logistic regression with priors given by the knowledge-based translation. During evaluation, the solutions fit well the identified cases and the hybrid translation is systematically the most effective w.r.t. multilabel classification metrics. This is a first attempt to unify genre tag systems by leveraging both representation and interpretation diversity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper frames genre tag alignment as a translation task across three overlap regimes and finds the hybrid MAP logistic method strongest, which is a clean practical contribution for MIR data integration.

The main contribution is treating genre prediction as translating between independent tag systems rather than just classifying from audio. They split the problem into three realistic cases based on shared annotations and match methods to them: taxonomy mapping for zero overlap, plain logistic regression for large overlap, and a MAP version that folds in knowledge-base priors for the sparse case. The hybrid coming out ahead on multilabel metrics is the central result and feels like a sensible way to combine structured knowledge with whatever data is available.

Referee Report

2 major / 2 minor

Summary. The manuscript frames music genre translation as a mapping task between source and target tag systems under three data-overlap regimes (no common annotated corpus, large common corpus, few common annotations). It proposes three methods—knowledge-based taxonomy mapping, maximum-likelihood logistic regression, and a hybrid maximum a posteriori logistic regression that uses the knowledge-based output as priors—and claims that the hybrid is systematically strongest on multilabel classification metrics, constituting a first step toward unifying genre tag systems by combining representation and interpretation diversity.

Significance. If the empirical results hold under rigorous evaluation, the work offers a practical framework for aligning heterogeneous music genre ontologies, which could improve data integration and tag enrichment in MIR. The explicit partitioning into three overlap regimes and the hybrid construction that injects knowledge priors into a statistical model are clear strengths; the paper also earns credit for grounding the methods in both structured knowledge bases and parallel annotations rather than relying on a single paradigm.

major comments (2)

[Abstract] Abstract: the central claim that 'the hybrid translation is systematically the most effective w.r.t. multilabel classification metrics' is load-bearing, yet the abstract supplies no quantitative results, dataset descriptions, baseline comparisons, or statistical significance tests; without these details in the evaluation section the superiority assertion cannot be verified.
[Abstract] The modeling choice that logistic regression reliably captures cross-system genre relationships (weakest assumption) is presented without a concrete diagnostic (e.g., calibration plots or residual analysis) that would confirm the probabilistic mapping is not misspecified for the genre label distributions.
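The kind of calibration diagnostic mentioned in the second comment can be sketched in a few lines. The snippet below is an illustration only, not something taken from the paper: it bins predicted probabilities for one target tag and compares the mean prediction with the observed positive rate in each bin. The function names and the equal-width binning are assumptions.

```python
def reliability_table(probs, labels, n_bins=5):
    """Group predictions into equal-width bins and compare the mean predicted
    probability with the observed positive rate in each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for i, items in enumerate(bins):
        if not items:
            continue
        mean_p = sum(p for p, _ in items) / len(items)
        pos_rate = sum(y for _, y in items) / len(items)
        table.append((i / n_bins, (i + 1) / n_bins, mean_p, pos_rate, len(items)))
    return table

def expected_calibration_error(probs, labels, n_bins=5):
    """Average gap between confidence and observed frequency, weighted by bin size."""
    rows = reliability_table(probs, labels, n_bins)
    return sum(n * abs(mean_p - pos_rate) for _, _, mean_p, pos_rate, n in rows) / len(probs)

# Hypothetical predictions for one target tag.
probs = [0.1, 0.1, 0.9, 0.9]
labels = [0, 0, 1, 1]
table = reliability_table(probs, labels)
ece = expected_calibration_error(probs, labels)
```

A well-calibrated model gives rows where the mean prediction and the positive rate are close; large gaps in specific bins would point to the misspecification the referee is worried about.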

minor comments (2)

[Abstract] Abstract: the phrase 'the solutions fit well the identified cases' is vague; a one-sentence summary of how each method was instantiated on real tag systems would improve clarity.
[Abstract] The abstract could name the specific multilabel metrics (precision, recall, F1, etc.) rather than referring only to 'multilabel classification metrics'.
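For readers unfamiliar with the metrics named in the last comment, here is a small sketch of micro- and macro-averaged precision, recall and F1 over tag sets. It is a generic illustration, not the paper's evaluation code; the function names, the toy tags, and the convention of scoring 0.0 when a denominator is zero are assumptions.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from counts, returning 0.0 for empty denominators."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def multilabel_scores(y_true, y_pred, average="micro"):
    """y_true, y_pred: lists of tag sets, one set per musical item."""
    tags = sorted(set().union(*y_true, *y_pred))
    counts = {t: [0, 0, 0] for t in tags}  # [tp, fp, fn] per tag
    for true, pred in zip(y_true, y_pred):
        for t in true & pred:
            counts[t][0] += 1
        for t in pred - true:
            counts[t][1] += 1
        for t in true - pred:
            counts[t][2] += 1
    if average == "micro":
        # Pool the counts over all tags, then score once.
        tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
        return prf(tp, fp, fn)
    # Macro: score each tag separately, then take the unweighted mean.
    per_tag = [prf(*c) for c in counts.values()]
    if not per_tag:
        return (0.0, 0.0, 0.0)
    return tuple(sum(s[i] for s in per_tag) / len(per_tag) for i in range(3))

# Hypothetical target-system annotations and predictions.
y_true = [{"rock", "indie rock"}, {"jazz"}]
y_pred = [{"rock"}, {"jazz", "blues"}]
micro = multilabel_scores(y_true, y_pred, "micro")
macro = multilabel_scores(y_true, y_pred, "macro")
```

Micro averaging favours frequent tags, while macro averaging weights rare tags equally, which matters when some target tags barely appear in the parallel corpus.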

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the recommendation of minor revision. The comments focus on strengthening the abstract and methodological justification; we respond point by point and indicate where revisions will be made.

Referee: [Abstract] Abstract: the central claim that 'the hybrid translation is systematically the most effective w.r.t. multilabel classification metrics' is load-bearing, yet the abstract supplies no quantitative results, dataset descriptions, baseline comparisons, or statistical significance tests; without these details in the evaluation section the superiority assertion cannot be verified.

Authors: The evaluation section (Sections 4–5) already contains the requested details: dataset descriptions for the three overlap regimes, multilabel metrics (e.g., micro/macro F1, precision, recall) for all three methods, explicit baseline comparisons, and statistical significance testing via paired t-tests or McNemar tests where appropriate. The hybrid method outperforms the others consistently. We will revise the abstract to include concise quantitative highlights and dataset references so the central claim is supported within the abstract itself. revision: yes

Referee: [Abstract] The modeling choice that logistic regression reliably captures cross-system genre relationships (weakest assumption) is presented without a concrete diagnostic (e.g., calibration plots or residual analysis) that would confirm the probabilistic mapping is not misspecified for the genre label distributions.

Authors: Logistic regression is used because it directly estimates conditional genre probabilities from parallel annotations and admits a natural MAP extension with knowledge-based priors. Its suitability is supported by the hybrid’s systematic gains over the pure maximum-likelihood baseline. While calibration plots are not present, we will add a short paragraph in the methods section explaining the modeling assumptions and noting that the empirical superiority of the hybrid serves as validation of the base mapping. revision: partial

Circularity Check

0 steps flagged

No significant circularity

The paper enumerates three explicit data-overlap regimes and defines three distinct modeling strategies (taxonomy mapping, MLE logistic regression, MAP logistic regression with knowledge-derived priors). The hybrid approach uses the output of the first method as a prior for the third; this is a standard Bayesian construction, not a definitional reduction. Superiority is reported as an empirical outcome on multilabel metrics rather than a theorem or fitted quantity renamed as prediction. No self-citation chains, uniqueness theorems, or ansatzes smuggled via prior work appear in the provided abstract or method descriptions. The derivation chain remains self-contained against external benchmarks.
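The "standard Bayesian construction" can be written out as follows. This is a sketch under an assumed Gaussian prior; the symbols $W$ (regression weights), $W_{\mathrm{KB}}$ (weights implied by the knowledge-based mapping) and $\lambda$ (prior strength) are introduced here for illustration and are not taken from the paper.

```latex
\begin{align}
\hat{W}_{\mathrm{ML}}  &= \arg\max_{W} \sum_{n} \log p(\mathbf{y}_n \mid \mathbf{x}_n, W) \\
\hat{W}_{\mathrm{MAP}} &= \arg\max_{W} \Big[ \sum_{n} \log p(\mathbf{y}_n \mid \mathbf{x}_n, W) + \log p(W) \Big] \\
\log p(W) &= -\tfrac{\lambda}{2}\,\lVert W - W_{\mathrm{KB}} \rVert_F^2 + \text{const}
\quad \text{(assumed Gaussian prior centred on the KB mapping)}
\end{align}
```

The prior enters only as an additive penalty on the same likelihood, so the MAP estimate is a regularised version of the ML estimate rather than a restatement of the knowledge-based output.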

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The paper relies on standard logistic regression assumptions for modeling genre probabilities and on the existence of usable taxonomies for the knowledge-based component. No new entities are postulated. The logistic regression coefficients are free parameters fitted to data in the statistical and hybrid cases.

free parameters (1)

logistic regression coefficients
Coefficients in the maximum likelihood and MAP logistic regression models are fitted to the available annotation data.

axioms (2)

Domain assumption: Logistic regression can model the conditional probability of target-system genre assignments given source-system genres.
Invoked when modeling the statistical and hybrid translations with logistic regression.
Domain assumption: Taxonomies in the knowledge bases provide a valid basis for mapping genres across tag systems.
Used in the knowledge-based translation modeled as taxonomy mapping.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 2 internal anchors

[1] Leveraging knowledge bases and parallel annotations for music genre translation, 2019. INTRODUCTION. Music genres have been long studied as semantic dimensions of artists and tracks [8]. Rooted in musicology, music experts have mainly undertaken this endeavour. With digitization of music and prevalence of Internet music consumption, online communities have also shown increasing interest in annotating musical items with genres (e.g. cre...

[2] Leveraging Knowledge Bases And Parallel Annotations For Music Genre Translation (arXiv:1907.08698v2 [cs.SD], 27 Jul 2019). A cold-start case, when genre tag systems of the target and sources are known, but there is no parallel corpus. We address this case with a Knowledge-Based (KB) system based on taxonomy mapping (Section 3).

[3] Many parallel annotations are available allowing to learn mappings between genre interpretations (e.g. when some sources use alternative rock the target tends to use alt. rock and indie rock). To deal with this case, we use a simple linear multilabel classifier, namely a logistic regression model trained with Maximum Likelihood (ML) (Section 4.1).

[4] The case in-between when less annotations are available and some target tags may be missing in the parallel corpus. We tackle this scenario with an hybrid Bayesian approach that leverages the KB translation as a prior for the logistic regression model trained with Maximum A Posteriori (MAP). This case, presented in Section 4.2, is the most general. Find...

[5] NOTATIONS AND PROBLEM FORMULATION. In this work, we denote matrices by bold capital letters, M; vectors by bold lower case letters, v; the n-th row vector of matrix M by m_n; scalars by italic lower case letters, x; the coefficient at row i and column j of matrix M by m_ij; the i-th element of vector v by v_i. Calligraphic font is used for sets of sets (e.g. S) and capital letters for sets (e.g. S)...

[6] KNOWLEDGE-BASED GENRE TRANSLATION. We propose a translation method based on multiple genre taxonomies brought together under a genre graph. Section 3.1 introduces the graph types of concepts and relations and presents the genre taxonomies. In Section 3.2, we show how we create the links between the genre taxonomies using advanced normalization and to... -", "_"). The basic normalization converts tags to lower case and brings tags containing...

[7] ... and a probabilistic tokenization built on Wikipedia unigrams [3]. A trie is a tree data structure that efficiently stores and retrieves strings. Each node has a char and a flag to mark if the path from the root to it forms a word. We modify the way we populate the trie as follows. At first, we sort the tokens obtained from the basic tokenization and normal...
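
The trie described in this excerpt can be sketched as below. This is a plain textbook trie with a character and a word flag per node; the paper's modified way of populating it is cut off in the excerpt and is not reproduced here. The `longest_prefix` helper is a hypothetical addition showing how such a structure could help split a fused tag like "poprock".

```python
class TrieNode:
    def __init__(self, char=""):
        self.char = char        # character stored at this node
        self.is_word = False    # True if the path from the root spells a word
        self.children = {}      # next character -> child node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word, creating nodes along its path as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode(ch))
        node.is_word = True

    def contains(self, word):
        """Return True only if the exact word was inserted."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def longest_prefix(self, text):
        """Return the longest stored word that is a prefix of `text`, or ''."""
        node, best = self.root, ""
        for i, ch in enumerate(text):
            node = node.children.get(ch)
            if node is None:
                break
            if node.is_word:
                best = text[:i + 1]
        return best


trie = Trie()
for token in ("pop", "rock", "post"):
    trie.insert(token)
```

Here `trie.longest_prefix("poprock")` yields `"pop"`, after which the remainder `"rock"` can be looked up in turn.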

[8] Normalize s with the process described in Section 3.2 (e.g. Rock/Pop becomes pop rock).

[9] Check if the normalized s equals any normalized genre of B. If true, all entries in ZD linked to the DBpedia aliases of the found genres are set to 1 and all others to 0 (e.g. acid house is mapped to Acid_house, with aliases Acid_(electronic_music), Warehouse_music, etc.).

[10] If the normalized s is not in B, then map it using its context genres in D: compounds with each parent tag in D and check if the normalized compounded tag equals any normalized genre of B (inspired from [43]). If true, proceed as in Step 1. (e.g. stoner has parent rock in Lastfm; search by rock stoner and map it to Stoner_rock)
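
The normalize, exact-match, and parent-compound steps quoted in these excerpts can be sketched together. The normalization below is an assumption chosen to reproduce the quoted examples (lower-casing, splitting on common delimiters, sorting tokens); the paper's actual Section 3.2 process is not shown in the excerpts. The names `normalize` and `map_tag` are hypothetical, and the alias handling and the score matrix are left out.

```python
import re

def normalize(tag):
    """Assumed normalization: lower-case, split on common delimiters,
    and sort tokens so that word order does not matter."""
    tokens = re.split(r"[\s/_\-&,()]+", tag.lower())
    return " ".join(sorted(t for t in tokens if t))

def map_tag(tag, parents, kb_genres):
    """Return the knowledge-base genre matching `tag`, trying the tag alone
    first and then compounds of the tag with each of its parent tags."""
    index = {normalize(g): g for g in kb_genres}
    key = normalize(tag)
    if key in index:
        return index[key]
    for parent in parents:
        compound = normalize(f"{parent} {tag}")
        if compound in index:
            return index[compound]
    return None  # left for the later fallback steps

kb_genres = ["Acid_house", "Stoner_rock", "Pop_rock"]
```

With these toy genres, `map_tag("acid house", [], kb_genres)` matches directly, while `map_tag("stoner", ["rock"], kb_genres)` only succeeds through the compound with its parent tag.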

[11] If Steps 2 and 3 are unsuccessful, consider two cases: (a) s is a concept genre as defined in Section 3.2. First, retrieve the DBpedia directed subgraph composed of the nodes which contain the normalized s as a substring in their normalized form. Second, map s to the nodes with the highest in-degree centrality [5] in this subgraph. The intuition is that ...
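
The substring-subgraph fallback in this excerpt can be illustrated with a small sketch. It ranks nodes by raw in-degree, which gives the same ordering as in-degree centrality within one subgraph. The edge direction (subgenre pointing to parent), the default normalizer, and the function name `map_by_in_degree` are assumptions, and nodes are discovered only through the edge list.

```python
def map_by_in_degree(norm_tag, edges, norm=lambda g: g.lower().replace("_", " ")):
    """Keep nodes whose normalized name contains `norm_tag`, restrict the
    directed edges to those nodes, and return the nodes with most incoming edges."""
    nodes = {n for edge in edges for n in edge if norm_tag in norm(n)}
    in_degree = {n: 0 for n in nodes}
    for src, dst in edges:
        if src in nodes and dst in nodes:
            in_degree[dst] += 1
    if not in_degree:
        return []
    best = max(in_degree.values())
    return sorted(n for n, d in in_degree.items() if d == best)

# Hypothetical edges, assumed to point from a subgenre to its parent genre.
edges = [
    ("Acid_house", "House_music"),
    ("Deep_house", "House_music"),
    ("Acid_house", "Electronic_music"),
]
```

For the tag "house", only the three house-related nodes survive the substring filter, and `House_music` receives both remaining edges.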

[12] For each genre in B associated to s in Steps 1–4, propagate half of the value of its score to its neighbors in B. The intuition is that parent genres or subgenres could be relevant and sometimes specified by other sources. For each s not mapped in the previous process, we compute its scores by averaging the rows in ZD of its related genres in the input t...
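
The propagation rule in this excerpt can be sketched as a single pass over the mapped genres. The excerpt does not say how a propagated value combines with an existing score, so keeping the larger of the two is an assumption, as are the function name `propagate` and the toy neighbourhood.

```python
def propagate(scores, neighbors, factor=0.5):
    """One propagation pass: each directly mapped genre passes `factor` times
    its score to its neighbours. A neighbour keeps the largest value it
    receives and never overwrites a larger existing score (assumed rule)."""
    out = dict(scores)
    for genre, score in scores.items():
        for nb in neighbors.get(genre, ()):
            out[nb] = max(out.get(nb, 0.0), factor * score)
    return out

# Hypothetical neighbourhood in the knowledge base.
neighbors = {"Stoner_rock": ["Rock_music", "Desert_rock"]}
scores = propagate({"Stoner_rock": 1.0}, neighbors)
```

The result keeps the direct match at 1.0 and gives each neighbour 0.5, so related parent genres and subgenres receive a weaker but non-zero score.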

[13] DATA-INFORMED GENRE TRANSLATION. In this section, we consider that a parallel corpus is available and present two statistical approaches: ML that relies only on annotations (Section 4.1), and MAP that leverages the KB results as a prior knowledge (Section 4.2). 4.1 Maximum Likelihood logistic regression. In statistical approaches to the tag translation ta...

[14] EXPERIMENTS. We report the performances of the proposed models on a recording-based tag translation task. This also serves as an indirect evaluation of the DBpedia mapping, which, in a work dedicated to taxonomy mapping, could have been assessed by experts. Due to its novelty, we do not benchmark our work against other genre-related research from MIR. 5....

[15] CONCLUSION. In this work, we investigated the translation of tags from various source tag systems to a common target tag system. We show that the availability of large amounts of data advantages statistical methods over the knowledge-based one in terms of multilabel classification metrics. Moreover, the proposed hybrid method consistently outperforms both...

[16] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org
[17] Manel Achichi, Pasquale Lisena, Konstantin Todorov, Raphaël Troncy, and Jean Delahousse. Doremus: A graph of linked musical works. In International Semantic Web Conference, pages 3–19, 2018.
[18] Derek Anderson. Word ninja, 2019. Software available at https://github.com/keredson/wordninja
[19] Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. Dbpedia: A nucleus for a web of open data. In The semantic web, pages 722–735. Springer, 2007.
[20] Jørgen Bang-Jensen and Gregory Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer Publishing Company, Incorporated, 2nd edition, 2008.
[21] Christopher M Bishop. Pattern recognition and machine learning. Springer, 2006. Pages 30–31.
[22] Dmitry Bogdanov, Alastair Porter, Julián Urbano, and Hendrik Schreiber. The mediaeval 2017 acousticbrainz genre task: content-based music genre recognition from multiple sources. In MediaEval 2017 AcousticBrainz, 2017.
[23] David Brackett. Categorizing sound: genre and twentieth-century popular music. Univ of California Press, 2016.
[24] Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou. Word translation without parallel data. In International Conference on Learning Representations, 2018.
[25] Emanuele Coviello, Riccardo Miotto, and Gert R.G. Lanckriet. Combining content-based auto-taggers with decision-fusion. In Conference of the International Society on Music Information Retrieval, pages 705–710, 2011.
[26] Alastair JD Craft, Geraint A Wiggins, and Tim Crawford. How many beans make five? the consensus problem in music-genre classification and a new evaluation method for single-genre categorisation systems. In Conference of the International Society on Music Information Retrieval, pages 73–76, 2007.
[27] Rene De La Briandais. File searching using variable length keys. In Western Joint Computer Conference, IRE-AIEE-ACM ’59 (Western), pages 295–298, New York, NY, USA, 1959. ACM.
[28] Arthur Flexer. A closer look on artist filters for musical genre classification. In Conference of the International Society on Music Information Retrieval, pages 341–344, 2007.
[29] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Springer series in statistics New York, 2001. Pages 241–249.
[30] Andrew Gelman, Aleks Jakulin, Maria Grazia Pittau, and Yu-Sung Su. A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4):1360–1383, 2008.
[31] Romain Hennequin, Jimena Royo-letelier, and Manuel Moussallam. Audio based disambiguation of music genre tags. In Conference of the International Society of Music Information Retrieval, pages 645–652, 2018.
[32] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[33] Jin Ha Lee and J Stephen Downie. K-pop genres: A cross-cultural exploration. In Conference of the International Society on Music Information Retrieval, pages 529–534, 2013.
[34] Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, and Christian Bizer. Dbpedia - a large-scale, multilingual knowledge base extracted from wikipedia. Semantic Web, 6:167–195, 2015.
[35] Pasquale Lisena, Konstantin Todorov, Cécile Cecconi, Françoise Leresche, Isabelle Canno, Frédéric Puyrenier, Martine Voisin, Thierry Le Meur, and Raphaël Troncy. Controlled vocabularies for music metadata. In Conference of the International Society on Music Information Retrieval, pages 424–430, 2018.
[36] Michael Mandel, Douglas Eck, and Yoshua Bengio. Learning tags that vary within a song. In International Society for Music Information Retrieval Conference, ISMIR 2010, pages 399–404, 08 2010.
[37] Sergio Oramas, Oriol Nieto, Francesco Barbieri, and Xavier Serra. Multi-label music genre classification from audio, text and images using deep features. In Conference of the International Society on Music Information Retrieval, pages 23–30, 2017.
[38] Lorena Otero-Cerdeira, Francisco J. Rodríguez-Martínez, and Alma Gómez-Rodríguez. Ontology matching: A literature review. Expert Systems with Applications, 42(2):949–971, 2015.
[39] Panagiotis Papadimitriou, Panayiotis Tsaparas, Ariel Fuxman, and Lise Getoor. Taci: Taxonomy-aware catalog integration. IEEE Transactions on Knowledge and Data Engineering, 25(7):1643–1655, July 2013.
[40] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[41] Simone Paolo Ponzetto and Roberto Navigli. Large-scale taxonomy mapping for restructuring and integrating wikipedia. In International Joint Conference on Artificial Intelligence, IJCAI’09, pages 2083–2088, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc.
[42] Natalia Prytkova, Gerhard Weikum, and Marc Spaniol. Aligning multi-cultural knowledge taxonomies by combinatorial optimization. In International Conference on World Wide Web, WWW ’15 Companion, pages 93–94, New York, NY, USA, 2015. ACM.
[43] Jesse Read, Bernhard Pfahringer, Geoff Holmes, and Eibe Frank. Classifier chains for multi-label classification. Machine learning, 85(3):333, 2009.
[44] Steven S. Aanen, Damir Vandic, and Flavius Frasincar. Automated product taxonomy mapping in an e-commerce environment. Expert Systems with Applications, 42(3):1298–1313, 2015.
[45] Hendrik Schreiber. Improving genre annotations for the million song dataset. In Conference of the International Society of Music Information Retrieval, pages 241–247, 2015.
[46] Hendrik Schreiber. Genre ontology learning: Comparing curated with crowd-sourced ontologies. In Conference of the International Society on Music Information Retrieval, pages 400–406, 2016.
[47] Konstantinos Sechidis, Grigorios Tsoumakas, and Ioannis Vlahavas. On the stratification of multi-label data. In European Conference on Machine Learning and Knowledge Discovery in Databases, pages 145–158, Berlin, Heidelberg, 2011. Springer-Verlag.
[48] Mohamed Sordo, Oscar Celma, Matin Blech, and Enric Guaus. The Quest for Musical Genres: Do the Experts and the Wisdom of Crowds Agree? In Conference of the International Society on Music Information Retrieval, pages 255–260, 2008.
[49] Robyn Speer, Joshua Chin, and Catherine Havasi. Conceptnet 5.5: An open multilingual graph of general knowledge. In AAAI Conference on Artificial Intelligence, pages 4444–4451, 2017.
[50–51] Dennis Spohr, Laura Hollink, and Philipp Cimiano. A machine learning approach to multilingual and cross-lingual ontology matching. In International Semantic Web Conference, pages 665–680, Berlin, Heidelberg. Springer Berlin Heidelberg.
[52] Bob L. Sturm. Classification accuracy is not enough. Journal of Intelligent Information Systems, 41(3):371–406, 2013.
[53] Aaron Swartz. Musicbrainz: A semantic web service. IEEE Intelligent Systems, 17(1):76–77, 2002.
[54] Tobias Swoboda, Matthias Hemmje, Mihai Dascalu, and Stefan Trausan-Matu. Combining taxonomies using word2vec. In ACM Symposium on Document Engineering, DocEng’16, pages 131–134, New York, NY, USA, 2016. ACM.
[55] Brian Tomasik, Joon Hee Kim, Margaret Ladlow, Malcolm Augat, Derek Tingle, Rich Wicentowski, and Douglas Turnbull. Using regression to combine data sources for semantic music discovery. In Conference of the International Society on Music Information Retrieval, pages 405–410, 2009.
[56] Strother H. Walker and David B. Duncan. Estimation of the probability of an event as a function of several independent variables. Biometrika, 54:167–178, 1967.
[57] Jun Wang, Xiaoou Chen, Yajie Hu, and Tao Feng. Predicting High-level Music Semantics using Social Tags via Ontology-based Reasoning. In Conference of the International Society on Music Information Retrieval, pages 405–410, 2010.
[58] Eric W Weisstein. Statistical rank, 2019. From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/StatisticalRank.html
[59] Tianxing Wu, Guilin Qi, Haofen Wang, Kang Xu, and Xuan Cui. Cross-lingual taxonomy alignment with bilingual biterm topic model. In AAAI Conference on Artificial Intelligence, pages 287–293, 2016.
[60] G. K. Zipf. Human behavior and the principle of least effort. Cambridge, MA, Addison-Wesley, 1949.

[1] [1]

Leveraging knowledge bases and parallel annotations for music genre translation

INTRODUCTION Music genres have been long studied as semantic dimen- sions of artists and tracks [8]. Rooted in musicology, music experts have mainly undertaken this endeavour. With dig- itization of music and prevalence of Internet music con- sumption, online communities have also shown increasing interest in annotating musical items with genres (e.g. cre...

work page 2019

[2]

A cold-start case, in which the genre tag systems of the target and sources are known but there is no parallel corpus. We address this case with a Knowledge-Based (KB) system based on taxonomy mapping (Section 3).

arXiv:1907.08698v2 [cs.SD] 27 Jul 2019

[3]

Many parallel annotations are available, which allows mappings between genre interpretations to be learned (e.g. when some sources use alternative rock, the target tends to use alt. rock and indie rock). To deal with this case, we use a simple linear multilabel classifier, namely a logistic regression model trained with Maximum Likelihood (ML) (Section 4.1).

[4]

The in-between case, in which fewer annotations are available and some target tags may be missing from the parallel corpus. We tackle this scenario with a hybrid Bayesian approach that leverages the KB translation as a prior for the logistic regression model trained with Maximum A Posteriori (MAP). This case, presented in Section 4.2, is the most general. Find...

[5]

NOTATIONS AND PROBLEM FORMULATION

In this work, we denote matrices by bold capital letters, $\mathbf{M}$; vectors by bold lower case letters, $\mathbf{v}$; the $n$-th row vector of matrix $\mathbf{M}$ by $\mathbf{m}_n$; scalars by italic lower case letters, $x$; the coefficient at row $i$ and column $j$ of matrix $\mathbf{M}$ by $m_{ij}$; the $i$-th element of vector $\mathbf{v}$ by $v_i$. Calligraphic font is used for sets of sets (e.g. $\mathcal{S}$) and capital letters for sets (e.g. $S$)...

[6]

-", "_"). The basic normalization converts tags to lower case and brings tags containing

KNOWLEDGE-BASED GENRE TRANSLATION

We propose a translation method based on multiple genre taxonomies brought together under a genre graph. Section 3.1 introduces the graph types of concepts and relations and presents the genre taxonomies. In Section 3.2, we show how we create the links between the genre taxonomies using advanced normalization and to...

[7]

and a probabilistic tokenization built on Wikipedia unigrams [3]. A trie is a tree data structure that efficiently stores and retrieves strings. Each node holds a character and a flag marking whether the path from the root to that node forms a word. We modify the way we populate the trie as follows. At first, we sort the tokens obtained from the basic tokenization and normal...
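The trie described in this excerpt can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the names `TrieNode`, `Trie`, `insert`, `contains` and `prefixes_of` are hypothetical, the tokens are toy data, and the modified population order mentioned in the text is not reproduced because the excerpt is truncated.

```python
class TrieNode:
    """One trie node: child links keyed by character plus an end-of-word flag."""

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Store a word, creating one node per character as needed."""
        node = self.root
        for char in word:
            node = node.children.setdefault(char, TrieNode())
        node.is_word = True

    def contains(self, word):
        """Return True only if the exact word was inserted."""
        node = self.root
        for char in word:
            node = node.children.get(char)
            if node is None:
                return False
        return node.is_word

    def prefixes_of(self, text):
        """Return every stored word that is a prefix of text, shortest first."""
        found, node = [], self.root
        for i, char in enumerate(text):
            node = node.children.get(char)
            if node is None:
                break
            if node.is_word:
                found.append(text[: i + 1])
        return found


# Toy vocabulary (hypothetical, not taken from the paper).
trie = Trie()
for token in ["rock", "pop", "post", "postrock"]:
    trie.insert(token)
```

A prefix lookup such as `trie.prefixes_of("postrock")` is the kind of query a tokenizer would use to find candidate split points in an unsegmented tag.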

[8]

Normalize s with the process described in Section 3.2 (e.g. Rock/Pop becomes pop rock).

[9]

Check if the normalized s equals any normalized genre of B. If true, all entries in ZD linked to the DBpedia aliases of the found genres are set to 1 and all others to 0 (e.g. acid house is mapped to Acid_house, with aliases Acid_(electronic_music), Warehouse_music, etc.).

[10]

If the normalized s is not in B, then map it using its context genres in D: compound s with each parent tag in D and check if the normalized compounded tag equals any normalized genre of B (inspired by [43]). If true, proceed as in Step 1 (e.g. stoner has parent rock in Lastfm; search by rock stoner and map it to Stoner_rock).

[11]

If Steps 2 and 3 are unsuccessful, consider two cases: (a) s is a concept genre as defined in Section 3.2. First, retrieve the DBpedia directed subgraph composed of the nodes which contain the normalized s as a substring in their normalized form. Second, map s to the nodes with the highest in-degree centrality [5] in this subgraph. The intuition is that ...
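Case (a) above can be illustrated with a short sketch. It is only an approximation under stated assumptions: `map_by_in_degree` is a hypothetical helper, lower-casing stands in for the paper's normalization, the edge list is invented toy data, and edges are assumed to point from a subgenre to its parent. Raw in-degree is used because dividing by the number of other nodes, as in normalized in-degree centrality, does not change which nodes rank highest.

```python
def map_by_in_degree(tag, edges, normalize=str.lower):
    """Map tag to the node(s) with the highest in-degree in the substring subgraph.

    edges: list of (source, target) pairs of a directed genre graph.
    """
    needle = normalize(tag)
    # Nodes whose normalized name contains the normalized tag as a substring.
    nodes = {n for edge in edges for n in edge if needle in normalize(n)}
    in_degree = {n: 0 for n in nodes}
    # Count only edges whose two endpoints both belong to the subgraph.
    for source, target in edges:
        if source in nodes and target in nodes:
            in_degree[target] += 1
    if not in_degree:
        return []
    best = max(in_degree.values())
    return sorted(n for n, degree in in_degree.items() if degree == best)


# Toy graph (hypothetical, not extracted from DBpedia).
edges = [
    ("Hard_rock", "Rock_music"),
    ("Punk_rock", "Rock_music"),
    ("Pop_rock", "Rock_music"),
    ("Pop_rock", "Pop_music"),
    ("Rock_music", "Popular_music"),
]
rock_targets = map_by_in_degree("rock", edges)
```

In this toy graph the tag rock is mapped to `Rock_music`, the node that most other rock-related nodes point to.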

[12]

For each genre in B associated to s in Steps 1–4, propagate half of the value of its score to its neighbors in B. The intuition is that parent genres or subgenres could be relevant and sometimes specified by other sources. For each s not mapped in the previous process, we compute its scores by averaging the rows in ZD of its related genres in the input t...
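The propagation step can be sketched as below. This is a hedged illustration: `propagate_scores` and the neighbor table are hypothetical, and the excerpt does not say how a propagated value combines with a score a neighbor already has, so the sketch assumes the larger of the two is kept.

```python
def propagate_scores(scores, neighbors, factor=0.5):
    """Spread a fraction of each mapped genre's score to its direct neighbors.

    scores: dict genre -> score for the genres mapped to one source tag.
    neighbors: dict genre -> iterable of adjacent genres (parents or subgenres).
    """
    result = dict(scores)
    for genre, value in scores.items():
        for other in neighbors.get(genre, ()):
            # Assumption: keep the larger of the existing and propagated score.
            result[other] = max(result.get(other, 0.0), factor * value)
    return result


# Toy neighborhood (hypothetical, not taken from DBpedia).
neighbors = {"Stoner_rock": ["Rock_music", "Doom_metal"]}
propagated = propagate_scores({"Stoner_rock": 1.0}, neighbors)
```

Only the genres mapped in the earlier steps act as sources here, so scores travel one hop and do not cascade through the graph.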

[13]

DATA-INFORMED GENRE TRANSLATION

In this section, we consider that a parallel corpus is available and present two statistical approaches: ML, which relies only on annotations (Section 4.1), and MAP, which leverages the KB results as prior knowledge (Section 4.2).

4.1 Maximum Likelihood logistic regression

In statistical approaches to the tag translation ta...
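The difference between the ML and MAP variants can be illustrated with a small sketch for a single target tag. It is not the authors' model: the excerpt does not give the form of the prior, so a Gaussian prior centered on hypothetical KB-derived weights `w_prior` is assumed, and `fit_map_logreg`, `strength`, `lr` and `steps` are illustrative names and settings.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_map_logreg(X, y, w_prior, strength=1.0, lr=0.1, steps=500):
    """Gradient descent on the negative log-posterior of a logistic model.

    X: list of binary source-tag vectors; y: list of 0/1 labels for one target tag.
    w_prior: prior mean of the weights (assumed here to come from the KB translation).
    strength: precision of the assumed Gaussian prior; 0 gives maximum likelihood.
    """
    w = list(w_prior)
    for _ in range(steps):
        # The prior term pulls each weight back toward its prior mean.
        grad = [strength * (wj - pj) for wj, pj in zip(w, w_prior)]
        # The likelihood term adds (prediction - label) * feature for each item.
        for xn, yn in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xn))) - yn
            for j, xj in enumerate(xn):
                grad[j] += err * xj
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy corpus: the second source tag never occurs, so only the prior informs it.
X = [[1, 0], [1, 0], [1, 0]]
y = [1, 1, 0]
weights = fit_map_logreg(X, y, w_prior=[0.0, 3.0])
```

Under these assumptions, a source tag that never appears in the parallel corpus keeps its prior weight, while a tag with observations moves toward what the annotations support.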

[14]

EXPERIMENTS

We report the performance of the proposed models on a recording-based tag translation task. This also serves as an indirect evaluation of the DBpedia mapping, which, in a work dedicated to taxonomy mapping, could have been assessed by experts. Due to its novelty, we do not benchmark our work against other genre-related research from MIR. 5....

[15]

CONCLUSION

In this work, we investigated the translation of tags from various source tag systems to a common target tag system. We show that the availability of large amounts of data favors statistical methods over the knowledge-based one in terms of multilabel classification metrics. Moreover, the proposed hybrid method consistently outperforms both...

[16]

Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org

[17]

Manel Achichi, Pasquale Lisena, Konstantin Todorov, Raphaël Troncy, and Jean Delahousse. Doremus: A graph of linked musical works. In International Semantic Web Conference, pages 3–19, 2018

[18]

Derek Anderson. Word ninja, 2019. Software available at https://github.com/keredson/wordninja

[19]

Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. DBpedia: A nucleus for a web of open data. In The semantic web, pages 722–735. Springer, 2007

[20]

Jørgen Bang-Jensen and Gregory Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer Publishing Company, Incorporated, 2nd edition, 2008

[21]

Christopher M Bishop. Pattern recognition and machine learning. Springer, 2006. pages 30–31

[22]

Dmitry Bogdanov, Alastair Porter, Julián Urbano, and Hendrik Schreiber. The MediaEval 2017 AcousticBrainz genre task: content-based music genre recognition from multiple sources. In MediaEval 2017 AcousticBrainz, 2017

[23]

David Brackett. Categorizing sound: genre and twentieth-century popular music. Univ of California Press, 2016

[24]

Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou. Word translation without parallel data. In International Conference on Learning Representations, 2018

[25]

Emanuele Coviello, Riccardo Miotto, and Gert R.G. Lanckriet. Combining content-based auto-taggers with decision-fusion. In Conference of the International Society on Music Information Retrieval, pages 705–710, 2011

[26]

Alastair JD Craft, Geraint A Wiggins, and Tim Crawford. How many beans make five? The consensus problem in music-genre classification and a new evaluation method for single-genre categorisation systems. In Conference of the International Society on Music Information Retrieval, pages 73–76, 2007

[27]

Rene De La Briandais. File searching using variable length keys. In Western Joint Computer Conference, IRE-AIEE-ACM ’59 (Western), pages 295–298, New York, NY, USA, 1959. ACM

[28]

Arthur Flexer. A closer look on artist filters for musical genre classification. In Conference of the International Society on Music Information Retrieval, pages 341–344, 2007

[29]

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Springer series in statistics New York, 2001. pages 241–249

[30]

Andrew Gelman, Aleks Jakulin, Maria Grazia Pittau, and Yu-Sung Su. A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4):1360–1383, 2008

[31]

Romain Hennequin, Jimena Royo-Letelier, and Manuel Moussallam. Audio based disambiguation of music genre tags. In Conference of the International Society of Music Information Retrieval, pages 645–652, 2018

[32]

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

[33]

Jin Ha Lee and J Stephen Downie. K-pop genres: A cross-cultural exploration. In Conference of the International Society on Music Information Retrieval, pages 529–534, 2013

[34]

Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, and Christian Bizer. DBpedia - a large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web, 6:167–195, 2015

[35]

Pasquale Lisena, Konstantin Todorov, Cécile Cecconi, Françoise Leresche, Isabelle Canno, Frédéric Puyrenier, Martine Voisin, Thierry Le Meur, and Raphaël Troncy. Controlled vocabularies for music metadata. In Conference of the International Society on Music Information Retrieval, pages 424–430, 2018

[36]

Michael Mandel, Douglas Eck, and Yoshua Bengio. Learning tags that vary within a song. In International Society for Music Information Retrieval Conference, ISMIR 2010, pages 399–404, 08 2010

[37]

Sergio Oramas, Oriol Nieto, Francesco Barbieri, and Xavier Serra. Multi-label music genre classification from audio, text and images using deep features. In Conference of the International Society on Music Information Retrieval, pages 23–30, 2017

[38]

Lorena Otero-Cerdeira, Francisco J. Rodríguez-Martínez, and Alma Gómez-Rodríguez. Ontology matching: A literature review. Expert Systems with Applications, 42(2):949–971, 2015

[39]

Panagiotis Papadimitriou, Panayiotis Tsaparas, Ariel Fuxman, and Lise Getoor. Taci: Taxonomy-aware catalog integration. IEEE Transactions on Knowledge and Data Engineering, 25(7):1643–1655, July 2013

[40]

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011

[41]

Simone Paolo Ponzetto and Roberto Navigli. Large-scale taxonomy mapping for restructuring and integrating Wikipedia. In International Joint Conference on Artificial Intelligence, IJCAI’09, pages 2083–2088, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc

[42]

Natalia Prytkova, Gerhard Weikum, and Marc Spaniol. Aligning multi-cultural knowledge taxonomies by combinatorial optimization. In International Conference on World Wide Web, WWW ’15 Companion, pages 93–94, New York, NY, USA, 2015. ACM

[43]

Jesse Read, Bernhard Pfahringer, Geoff Holmes, and Eibe Frank. Classifier chains for multi-label classification. Machine learning, 85(3):333, 2009

[44]

Steven S. Aanen, Damir Vandic, and Flavius Frasincar. Automated product taxonomy mapping in an e-commerce environment. Expert Systems with Applications, 42(3):1298–1313, 2015

[45]

Hendrik Schreiber. Improving genre annotations for the million song dataset. In Conference of the International Society of Music Information Retrieval, pages 241–247, 2015

[46]

Hendrik Schreiber. Genre ontology learning: Comparing curated with crowd-sourced ontologies. In Conference of the International Society on Music Information Retrieval, pages 400–406, 2016

[47]

Konstantinos Sechidis, Grigorios Tsoumakas, and Ioannis Vlahavas. On the stratification of multi-label data. In European Conference on Machine Learning and Knowledge Discovery in Databases, pages 145–158, Berlin, Heidelberg, 2011. Springer-Verlag

[48]

Mohamed Sordo, Oscar Celma, Martin Blech, and Enric Guaus. The Quest for Musical Genres: Do the Experts and the Wisdom of Crowds Agree? In Conference of the International Society on Music Information Retrieval, pages 255–260, 2008

[49]

Robyn Speer, Joshua Chin, and Catherine Havasi. ConceptNet 5.5: An open multilingual graph of general knowledge. In AAAI Conference on Artificial Intelligence, pages 4444–4451, 2017

[50]

Dennis Spohr, Laura Hollink, and Philipp Cimiano. A machine learning approach to multilingual and cross-lingual ontology matching. In International Semantic Web Conference, pages 665–680, Berlin, Heidelberg. Springer Berlin Heidelberg

[52]

Bob L. Sturm. Classification accuracy is not enough. Journal of Intelligent Information Systems, 41(3):371–406, 2013

[53]

Aaron Swartz. MusicBrainz: A semantic web service. IEEE Intelligent Systems, 17(1):76–77, 2002

[54]

Tobias Swoboda, Matthias Hemmje, Mihai Dascalu, and Stefan Trausan-Matu. Combining taxonomies using word2vec. In ACM Symposium on Document Engineering, DocEng’16, pages 131–134, New York, NY, USA, 2016. ACM
