Can anyone learn overtone singing?

The short answer: yes.

The longer answer: yes, and recent acoustic research is beginning to explain why your voice is already designed for it. What follows is the most honest account I can give you, as someone who teaches overtone singing and publishes research on how it works.

Why people doubt it

Overtone singing sounds extraordinary. When you hear a skilled singer produce a clear, ringing harmonic floating above a deep drone, your brain files it under “impossible for me.” It seems like something that requires a special anatomy, a genetic gift, or a childhood spent on the Mongolian steppe.

This reaction is understandable. It’s also wrong.

I know because I’ve heard it from nearly every student I’ve ever worked with, right before they produced their first overtone.

What overtone singing actually is

Let’s strip away the mystery.

When you sing a single note, your vocal folds produce not one frequency but dozens: a fundamental pitch and a whole series of harmonics above it (2×, 3×, 4×, 5× the fundamental, and so on). You already produce overtones every time you speak. Their relative strengths are what make the vowel “ee” sound different from “oo,” even at the same pitch.
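
To make the harmonic series concrete, here is a minimal Python sketch. The 150 Hz drone is an assumed example value, not a figure from this article, and `harmonic_series` is a name chosen purely for illustration.

```python
def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics of a fundamental, in Hz.

    Harmonic 1 is the fundamental itself; harmonic n sits at n times it.
    """
    return [n * fundamental_hz for n in range(1, count + 1)]


# Assumed example: a drone at 150 Hz (an illustrative value only).
for n, freq in enumerate(harmonic_series(150.0, 12), start=1):
    print(f"harmonic {n:2d}: {freq:7.1f} Hz")
```

Every one of these frequencies is present in the sung note at once; overtone singing changes how loud each one is, not which ones exist.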

In overtone singing, you learn to amplify one of those naturally occurring harmonics until it becomes audible as a separate, whistling tone above your drone. The tool you use is your vocal tract, the tube of air between your vocal folds and your lips. By shaping it with precision (tongue position, jaw opening, lip rounding), you create what acousticians call formant tuning: concentrating your resonance on a single harmonic.

You’ve been doing a rough version of this since you learned to talk. Overtone singing is the precise version.
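
The difference between the rough version and the precise version can be sketched with a toy resonance curve. This is a simplified illustration, not a vocal tract model: the bell-shaped gain function, the 1/n source spectrum, the 150 Hz drone, and both bandwidth values are assumptions, and the function names are hypothetical.

```python
import math


def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Toy single-resonance gain curve: 1.0 at the center, falling off with distance.

    This is a simplified bell-shaped curve, not a measured vocal tract response.
    """
    detune = (freq_hz - center_hz) / (bandwidth_hz / 2.0)
    return 1.0 / math.sqrt(1.0 + detune * detune)


def prominence_db(fundamental_hz, target_harmonic, bandwidth_hz):
    """How far (in dB) the target harmonic stands above its louder neighbor
    when the resonance is centered exactly on it (target_harmonic must be >= 2)."""
    center = target_harmonic * fundamental_hz

    def level(n):
        source = 1.0 / n  # assumed source spectrum: amplitude falls off as 1/n
        return source * resonance_gain(n * fundamental_hz, center, bandwidth_hz)

    loudest_neighbor = max(level(target_harmonic - 1), level(target_harmonic + 1))
    return 20.0 * math.log10(level(target_harmonic) / loudest_neighbor)


# Assumed values: 150 Hz drone, 8th harmonic, broad vs. narrow resonance.
for bandwidth in (400.0, 50.0):
    print(f"bandwidth {bandwidth:5.0f} Hz -> harmonic 8 stands out by "
          f"{prominence_db(150.0, 8, bandwidth):5.1f} dB")
```

In this toy model, the narrow resonance lifts the target harmonic well clear of its neighbors, while the broad one barely separates them. That is the sense in which a precisely shaped vocal tract isolates one harmonic.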

What’s the difference between overtone singing and throat singing? →

Your voice is built for this

Here’s something most tutorials won’t tell you: the physics of your vocal tract is actually on your side.

In my published research (“Flute in the Throat,” Zenodo 2025), I describe how the vocal tract during overtone singing may function as a system where multiple control parameters (lip opening, tongue-palate constriction, and subtler adjustments deeper in the vocal tract) all move frequency in the same direction. Larger opening or wider aperture → higher harmonic. Smaller → lower.

In control theory, this is called monotonicity, and it’s the best possible news for a learner. It means:

  • The system is self-correcting. If one adjustment drifts, others can compensate.
  • There are no contradictory feedback signals. You’re never pushing in two directions at once.
  • Multiple configurations work for the same target. There’s not one “right” tongue position; there’s a range of valid ones.
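
The compensation idea in the list above can be illustrated with a deliberately simple toy model. It is not the model from “Flute in the Throat”: the linear form, the coefficients, and the parameter names `lip_opening` and `tongue_aperture` are assumptions, chosen only so that both controls push the resonance in the same direction.

```python
def toy_resonance_hz(lip_opening, tongue_aperture):
    """Toy monotonic map from two controls (each 0.0 to 1.0) to a resonance in Hz.

    Assumed linear form with made-up coefficients: raising either control
    raises the resonance, which is the only property this sketch relies on.
    """
    return 800.0 + 600.0 * lip_opening + 1000.0 * tongue_aperture


def compensate_tongue(target_hz, lip_opening):
    """Solve the toy model for the tongue aperture that reaches target_hz."""
    return (target_hz - 800.0 - 600.0 * lip_opening) / 1000.0


target = 1500.0  # assumed target resonance in Hz
for lip in (0.2, 0.5, 0.8):
    tongue = compensate_tongue(target, lip)
    reached = toy_resonance_hz(lip, tongue)
    print(f"lip={lip:.1f} tongue={tongue:.2f} -> {reached:.0f} Hz")
```

Three different lip settings reach the same target because the second control can always make up the difference. If the two controls pulled in opposite directions, a drift in one would be much harder to correct by feel.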

This is why different teachers can emphasize different techniques and still produce similar results. It’s also why your body, once it gets the first clear harmonic, can find its way to the next one. The architecture of your voice is designed to make this learnable.

[Figure: spectrogram showing frequencies, resonances and overtones]

What you're actually learning
Skill hierarchy

One thing the research revealed is that overtone singing isn’t a single skill. It’s a progression of skills, each building on the last.

Stage 1: Your first overtones. You learn to sustain a stable drone and make coarse adjustments, mostly with your lips and jaw. The harmonics are faint and unstable, but audible. On a spectrogram, you’d see a flicker of brightness at one harmonic. This is the “I heard something!” moment.

Stage 2: Control and stability. You refine your tongue position and learn to target specific harmonics. The overtone becomes louder, more stable. You start to hear which harmonic you’re on (the 7th? the 8th? the 9th?) and can move between them. The spectrogram shows a clear, bright line above your drone.

Stage 3: Precision and expressiveness. You develop finer control, using deeper structures in the vocal tract that aren’t part of normal speech articulation. The overtone becomes crisp, the neighboring harmonics quiet down, and you can play melodic phrases. This is where years of practice make the difference: not learning new positions, but refining control over structures you didn’t know you had.

Each stage uses the same basic physics. What changes is the precision of your control, and that precision comes from practice, not from anatomy.

What affects how quickly you learn

Let me be direct about this.

Musical experience helps, especially singing. If your ear is already trained and your vocal coordination is refined, you’ll progress faster. Experience with tonal languages, didgeridoo, or jaw harp also helps, because they develop sensitivity to overtones.

But it’s not required. I’ve seen complete beginners with no musical background produce their first audible harmonic within a single lesson. And I’ve seen experienced singers struggle initially, not because they lack skill, but because the habits and techniques they’ve built over years make it harder to approach the voice from a completely new perspective. When you’ve trained for a decade to produce a certain kind of sound, letting go of that control long enough to discover a different one takes patience. It’s not a lack of ability; it’s the weight of expertise.

The biggest predictor of progress is consistent, attentive practice. Fifteen minutes a day of focused work is worth more than two hours of distracted experimentation. The neural pathways that control formant precision need repetition to develop, just like any fine motor skill.

And there’s a fascinating reason to keep at it: recent neuroscience research suggests that the active practice of overtone singing may engage the brain in ways relevant to working memory and cognitive health, but that’s a story for another article.

The real barrier isn't in your mouth, it's in your ear

This may be the most important thing I’ll say in this article.

Most people assume the hard part is the production: getting the tongue right, the jaw right, the airflow right. After years of teaching, I’m convinced the deepest challenge lies elsewhere: not in what you do with your voice, but in how you interpret what you hear.

Since infancy, your brain has been trained to interpret vocal sounds as language. Vibrations become vowels. Frequencies become words. When most people try overtone singing by following an instruction like “go from ‘oo’ to ‘ee’ slowly,” they hear a sequence of vowels, because that’s what their brain is trained to do. But the harmonic you’re trying to isolate isn’t a vowel. It’s a frequency, perhaps at 1200 or 1500 Hz, and your ear needs to learn to hear it as itself.
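
A frequency like 1200 or 1500 Hz corresponds to a numbered harmonic once you know the drone. The sketch below assumes a 150 Hz drone, which is an illustrative value and not one stated in this article; `nearest_harmonic` is a hypothetical helper name.

```python
def nearest_harmonic(frequency_hz, fundamental_hz):
    """Return (harmonic number, exact harmonic frequency) closest to frequency_hz."""
    n = max(1, round(frequency_hz / fundamental_hz))
    return n, n * fundamental_hz


# Assumed drone of 150 Hz; 1200 and 1500 Hz are the frequencies named above.
for freq in (1200.0, 1500.0):
    n, exact = nearest_harmonic(freq, 150.0)
    print(f"{freq:.0f} Hz is closest to harmonic {n} ({exact:.0f} Hz)")
```

On that assumed drone, 1200 Hz is the 8th harmonic and 1500 Hz is the 10th. Thinking in harmonic numbers rather than vowel names is the kind of relabeling this section is about.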

I remember this from my own early training with Johanni Curtet, the ethnomusicologist who first introduced me to overtone singing. I could hear those ringing frequencies appearing above my drone (bright, almost crystalline) and in the same instant, my brain would rush to categorize them. I’d stop and ask: “Which vowel was that?” It took time to understand that the question itself was the obstacle.

In my teaching, I start with vowels, I use the vowel chart, I work with the familiar. But progressively, we learn to forget them. To shift from semantic listening to tonal, musical listening. This perceptual shift is at the heart of everything I teach, and I’ll write much more about it, and the neuroscience behind it, in a dedicated article soon.

[Figure: spectrogram with subharmonic throat singing]

What makes the difference: understanding

This is where science becomes your ally.

Most online tutorials tell you to make a shape with your mouth and hope for the best. Sometimes it works. When it doesn’t, you have no idea why and no way to troubleshoot, because you’re working blind, relying on vowel labels for a process that goes beyond vowels.

When you understand what you’re trying to do acoustically (narrow a formant to align with a specific harmonic), you can work systematically. You’re not guessing; you’re adjusting. A spectrogram gives you visual confirmation of what your ear is learning to hear: not a vowel, but a frequency. You can see the harmonic brighten, dim, or shift, and correlate what you see with what you feel.
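
What a spectrogram reports for a single moment can be sketched in a few lines of plain Python. This is a minimal illustration on a synthesized signal, not a replacement for a real spectrogram tool. The 150 Hz drone, the 8000 Hz sample rate, the 1/n harmonic amplitudes, and the boosted 8th harmonic are all assumptions, and the function names are hypothetical.

```python
import math


def synthesize(fundamental_hz, amplitudes, sample_rate, duration_s):
    """Sum of sine harmonics; amplitudes[i] belongs to harmonic i + 1."""
    count = int(sample_rate * duration_s)
    return [
        sum(a * math.sin(2.0 * math.pi * (i + 1) * fundamental_hz * t / sample_rate)
            for i, a in enumerate(amplitudes))
        for t in range(count)
    ]


def amplitude_at(samples, freq_hz, sample_rate):
    """Estimate the amplitude of one frequency by correlating the signal with
    a cosine and a sine at that frequency (a single-bin Fourier transform)."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * t / sample_rate)
             for t, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * t / sample_rate)
             for t, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


def strongest_overtone(samples, fundamental_hz, sample_rate, max_harmonic):
    """Return the harmonic number (2 or higher) with the largest estimated amplitude."""
    return max(range(2, max_harmonic + 1),
               key=lambda n: amplitude_at(samples, n * fundamental_hz, sample_rate))


# Assumed test signal: 12 harmonics falling off as 1/n, with the 8th boosted.
amplitudes = [1.0 / n for n in range(1, 13)]
amplitudes[7] = 0.6  # index 7 is harmonic 8
samples = synthesize(150.0, amplitudes, sample_rate=8000, duration_s=0.2)

print("strongest overtone: harmonic",
      strongest_overtone(samples, 150.0, 8000, max_harmonic=12))
```

A spectrogram repeats this kind of measurement across many frequencies and many short time windows, then draws the results as brightness. The bright line you watch while practicing is the harmonic whose measured amplitude stands out.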

This is why I published the “Flute in the Throat” hypothesis: not because you need acoustic theory to sing, but because understanding the mechanics gives you a map. It turns a mysterious skill into one you can navigate. And in my experience, the learners who progress fastest are the ones who stop asking “which vowel am I on?” and start asking “which harmonic am I amplifying?”

That shift, from semantic listening to acoustic listening, is the deepest thing I teach. Everything else is technique.

Read more: The science of overtone singing →

Ready to find out for yourself? Book a discovery session →. Your first overtone might be closer than you think.
