Understanding and Classifying Cultural Music Using Melodic Features: Case of Hindustani, Carnatic and Turkish Music
Pith reviewed 2026-05-25 18:49 UTC · model grok-4.3
The pith
Melodic features based on pitch contours and energy can classify Hindustani, Carnatic, and Turkish music styles in a way that matches human listener judgments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A set of melody-based features derived from musicology captures distinct characteristics of the melodic contour in Hindustani, Carnatic, and Turkish music. These features exploit pitch contour transitions, the presence of micro-tonal notes, and the nature of variations in vocal energy. When used for automatic classification, the style labels correlate well with subjective listening judgments as verified by statistical tests. The melody features also improve performance when combined with timbre-based features.
What carries the argument
A set of highly discriminatory melodic features that capture pitch contour transitions, micro-tonal notes, and vocal energy variations to distinguish style in the absence of instrumentation cues.
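To make the three feature families above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's feature set: the function names, the 25-cent tolerance, the 50-cent jump threshold, and the use of the 12-tone equal-tempered grid as the reference for "micro-tonal" are all hypothetical choices made for this example. It assumes a voiced pitch contour in Hz, a per-frame energy track of the same length, and a known tonic.

```python
import math
from statistics import mean, pstdev


def hz_to_cents(f_hz, tonic_hz):
    """Pitch in cents relative to the tonic (1200 cents per octave)."""
    return 1200.0 * math.log2(f_hz / tonic_hz)


def melodic_features(pitch_hz, energy, tonic_hz, tol_cents=25.0, jump_cents=50.0):
    """Toy melodic descriptors (hypothetical; not the paper's exact features).

    pitch_hz : per-frame fundamental frequency in Hz, voiced frames only
    energy   : per-frame energy values, same length as pitch_hz
    """
    if len(pitch_hz) < 2 or len(pitch_hz) != len(energy):
        raise ValueError("need at least two frames and equal-length inputs")

    cents = [hz_to_cents(f, tonic_hz) for f in pitch_hz]

    # Distance of each frame from the nearest equal-tempered semitone (0 to 50 cents).
    deviation = [abs(c - 100.0 * round(c / 100.0)) for c in cents]
    microtonal_ratio = sum(d > tol_cents for d in deviation) / len(deviation)

    # Fraction of frame-to-frame steps that move by more than jump_cents.
    steps = [abs(b - a) for a, b in zip(cents, cents[1:])]
    transition_rate = sum(s > jump_cents for s in steps) / len(steps)

    # Relative spread of the energy track (coefficient of variation).
    avg_energy = mean(energy)
    energy_cv = pstdev(energy) / avg_energy if avg_energy else 0.0

    return {
        "microtonal_ratio": microtonal_ratio,
        "transition_rate": transition_rate,
        "energy_cv": energy_cv,
    }


# Usage: four frames at 0, 100, 140 and 200 cents above a 220 Hz tonic.
pitch = [220.0 * 2 ** (c / 1200.0) for c in (0, 100, 140, 200)]
features = melodic_features(pitch, [1.0, 3.0, 1.0, 3.0], tonic_hz=220.0)
```

In this toy input only the 140-cent frame lies more than 25 cents from a semitone, so one frame in four counts as micro-tonal. A real system would first need melody extraction and tonic identification, which this sketch takes as given.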
If this is right
- Style labels produced by the melodic classifier match those from subjective listening tests, as confirmed by statistical comparison.
- Melody alone suffices to discriminate the styles when similar ragas or makams are used and instrumentation is controlled.
- Combining the melodic features with timbre features raises overall classification accuracy beyond either set used alone.
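The first point, agreement between classifier labels and listener labels, can be illustrated with a chance-corrected agreement score. The sketch below uses Cohen's kappa as one possible statistic. The abstract only says that "statistical tests" were used, so the choice of kappa, the function name, and the example labels are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    if not labels_a or len(labels_a) != len(labels_b):
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if both sources labeled independently
    # with their own marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both sources used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Usage with made-up labels (H = Hindustani, C = Carnatic, T = Turkish).
listeners = ["H", "H", "C", "C", "T", "T"]
classifier = ["H", "H", "C", "T", "T", "T"]
kappa = cohens_kappa(listeners, classifier)
```

A kappa near 1 indicates agreement well beyond what matching label frequencies alone would produce, while a value near 0 indicates agreement no better than chance.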
Where Pith is reading between the lines
- The same feature approach could be tested on other improvisational traditions that share theme-and-structure concert formats.
- These features might support music search tools that group pieces by melodic style identity rather than artist or region alone.
- Controlled recordings across multiple performers could check whether the features remain stable when recording conditions vary.
Load-bearing premise
The chosen melodic features derived from musicology are highly discriminatory and capture the distinct characteristics of the melodic contour independent of performer or recording quality.
What would settle it
An experiment in which the automatic classifier using only the melodic features performs no better than chance at matching the style labels that human listeners assign when all non-melodic cues are removed from the audio.
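One way to phrase "no better than chance" is an exact one-sided binomial test on the number of clips where the classifier matches the listener label. The sketch below assumes three equally likely styles, so the chance rate is 1/3. That balance, the function name, and the example counts are assumptions for illustration and are not taken from the paper.

```python
from math import comb


def binomial_p_value(correct, total, chance=1.0 / 3.0):
    """Exact one-sided p-value P(X >= correct) for X ~ Binomial(total, chance)."""
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return sum(
        comb(total, k) * chance**k * (1.0 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Usage with hypothetical counts: 40 matches out of 60 clips.
p_value = binomial_p_value(40, 60)
```

A large p-value would mean the observed match rate is compatible with guessing, which is the outcome that would undercut the core claim. A small p-value would only rule out chance. It would not by itself rule out the performer or recording confounds raised in the referee report.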
Original abstract
We present a melody based classification of musical styles by exploiting the pitch and energy based characteristics derived from the audio signal. Three prominent musical styles were chosen which have improvisation as integral part with similar melodic principles, theme, and structure of concerts namely, Hindustani, Carnatic and Turkish music. Listeners of one or more of these genres can discriminate between these based on the melodic contour alone. Listening tests were carried out using melodic attributes alone, on similar melodic pieces with respect to raga/makam, and removing any instrumentation cue to validate our hypothesis that style distinction is evident in the melody. Our method is based on finding a set of highly discriminatory features, derived from musicology, to capture distinct characteristics of the melodic contour. Behavior in terms of transitions of the pitch contour, the presence of micro-tonal notes and the nature of variations in the vocal energy are exploited. The automatically classified style labels are found to correlate well with subjective listening judgments. This was verified by using statistical tests to compare the labels from subjective and objective judgments. The melody based features, when combined with timbre based features, were seen to improve the classification performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that pitch- and energy-derived melodic features (pitch transitions, micro-tonal notes, vocal energy variations) extracted from audio can classify Hindustani, Carnatic, and Turkish music styles. It reports that these features, when used for automatic classification, correlate well with human listening judgments on melodic contour alone (after matching pieces on raga/makam and removing instrumentation cues), with the correlation verified by statistical tests; combining the melodic features with timbre features further improves classification performance.
Significance. If the central claim holds after addressing potential confounds, the work would demonstrate that musicology-derived melodic features can capture style-specific contour characteristics in a manner that aligns with listener perception, offering a concrete contribution to MIR for improvisation-based genres. The explicit use of listening tests with statistical validation and the reported improvement from feature combination are positive elements that would strengthen the result.
major comments (1)
- [Abstract, and implied Methods] The central claim is that the selected melodic features discriminate style independently of performer and recording variables. This rests on the assumption that the pitch and energy quantities encode only style-specific contour. However, no explicit cross-performer or cross-recording controls are described that would isolate the style signal from ornamentation habits or acoustics, so the automatic labels may correlate with listener judgments for spurious reasons.
Simulated Author's Rebuttal
We thank the referee for the detailed review and constructive feedback. We address the major comment below, providing clarification on the controls employed in the study while acknowledging areas where the manuscript can be strengthened.
- Referee: [Abstract, and implied Methods] The central claim is that the selected melodic features discriminate style independently of performer and recording variables. This rests on the assumption that the pitch and energy quantities encode only style-specific contour. However, no explicit cross-performer or cross-recording controls are described that would isolate the style signal from ornamentation habits or acoustics, so the automatic labels may correlate with listener judgments for spurious reasons.
Authors: We appreciate this observation on potential confounds. The manuscript describes selection of pieces matched on raga/makam to control for melodic theme and structure, with instrumentation cues removed in the listening tests to isolate melodic contour. These steps were intended to focus the analysis on style-specific melodic characteristics. However, we acknowledge that explicit cross-performer normalization (e.g., multiple performers per raga across styles) and acoustic normalization across recordings are not detailed beyond the raga/makam matching. The statistical validation against human judgments on melodic attributes alone provides supporting evidence against purely spurious correlations. To address the concern directly, we will revise the Methods and Discussion sections to describe the controls used, discuss limitations regarding performer ornamentation habits and recording acoustics, and clarify how the feature set targets contour transitions, micro-tones, and energy variations independent of those factors.
Revision: partial
Circularity Check
No circularity: classification validated against independent subjective judgments
The paper derives melodic features (pitch transitions, micro-tones, vocal energy) from audio, trains a classifier on style labels, and validates automatic outputs against separate human listening tests via statistical comparison. No self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided derivation chain. The central result is an empirical correlation between feature-based labels and external judgments, which does not reduce to the inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- feature selection thresholds
axioms (1)
- domain assumption: The selected features from musicology capture style-specific melodic characteristics.