Hearing aids: an introduction to DSP in hearing aids — let’s start with the part before it hits the processor

September 26, 2012 – 4:55 pm

The last hearing aids post wasn’t particularly technical — it focused on the perceptual and cognitive aspects of DNR (digital noise reduction) so we’d have something concrete to talk about when we dove into an explanation of DSP (digital signal processing), which is what we’re about to do now.

I should give a disclaimer first. I’m writing this first and foremost for myself, to help me synthesize and understand ideas behind hearing aid technologies. Since I studied electrical engineering as an undergraduate, I take a lot of basic DSP concepts for granted; to some extent, I don’t feel like I need to take notes on this, except for application-specific statistics and measures I want to note. However, there’s a secondary audience — it’s been interesting to see what aspects of the things I take for granted are new to my audiology-PhD classmates, and what things they take for granted that I have no clue about. So as I go along in my fast clip of I’m-an-engineer dialogue, I’ll try to pause and explain the terms I’ve noticed are less familiar for folks without that technical background. Here goes.

DNR is a feature implemented via DSP, which stands for “digital signal processing.” To start talking about either, we need to understand the “D,” which stands for “digital.”

The world is not digital. The world is analog — it is continuous, with infinite resolution in both time and amplitude. But computers are not analog; they only understand numbers. You don’t ask how many megapixels the Grand Canyon is — the answer, if anything, is “infinitely many.” But you do say that your photograph of the Grand Canyon is 5 megapixels, because that’s the resolution at which your digital camera was able to capture its image of the Grand Canyon. (Just as the original Grand Canyon will always be cooler than your picture, analog signals will always be higher-fidelity than any digitization, because they are the original signal we’re trying to replicate.)

In order to convert between the two, we use an ADC, which stands for analog-to-digital converter — our name for anything (component, process, algorithm, whatever) that does the job of making something analog into something digital, turning something continuous into something discrete.

The word for “making something discrete” is discretization. Discretization of time is called sampling; think about sampling cookies off a baking sheet every few minutes to check when they’re done baking. (“15 minutes, undercooked. 20 minutes, almost there. 25 minutes, perfect. 30 minutes, burnt.”) Discretization of amplitude is called quantization; think about how many different levels of cookie-doneness you have (is it just undercooked/perfect/burnt, or do you say raw/gloopy/slightly soft/crisp/dry/charcoal?)

One thing the cookie analogy is good for is showing how sampling and quantization are independent of each other. We could only have undercooked/perfect/burnt but be sampling cookies every 5 seconds (om nom nom.) Or we could have a really, really fine gradient of classifications with 30 levels on our “cookie doneness” scale, but only be checking the oven every 20 minutes. (Sadface.) When we say “high-resolution,” we have to ask: are you tasting the cookies often (high sampling rate), or do you have a really detailed cookie-doneness scale (fine quantization)? Usually, “high-resolution” means you’ve got a lot of detail on both, but it’s good to know which one you mean.
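The two discretizations really are separate operations, and that’s easy to see in code. Here’s a toy Python sketch (my own illustration, not anything from an actual hearing aid): `sample` chops up time, `quantize` chops up amplitude, and you can dial either one independently of the other.

```python
import math

def sample(signal, duration_s, fs):
    """Discretize time: evaluate the continuous signal every 1/fs seconds."""
    n = int(duration_s * fs)
    return [signal(i / fs) for i in range(n)]

def quantize(samples, n_bits, full_scale=1.0):
    """Discretize amplitude: snap each sample to one of 2**n_bits evenly
    spaced levels spanning [-full_scale, +full_scale]."""
    levels = 2 ** n_bits
    step = 2 * full_scale / levels
    return [round(s / step) * step for s in samples]

# A 5 Hz sine, sampled at 100 Hz for one second, then quantized to 3 bits.
sine = lambda t: math.sin(2 * math.pi * 5 * t)
xs = sample(sine, 1.0, 100)  # 100 samples: the "how often" knob
xq = quantize(xs, 3)         # 8 levels: the "how detailed" knob
```

Crank `fs` up without touching `n_bits` and you’re tasting cookies more often on the same crude doneness scale; crank `n_bits` up without touching `fs` and you’ve refined the scale without opening the oven any more often.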

Let’s hang out in the time domain for a while and talk about sampling first. Sampling is expensive; it takes time, energy, battery, etc, so we want to sample as infrequently as possible. How do we know how often to sample? You already intuitively know the answer to this question; it depends on how fast the thing we’re sampling is liable to change. If the turkey needs to bake for 7 hours, maybe we’ll check in every hour or two. If the shrimp crackers puff up after 5 seconds in hot oil and then rapidly burn, we’re going to watch the wok like a hawk, not looking away for more than 2-3 seconds at a time.

If we have a signal that changes often — with high frequency — it is (drumroll…) a high-frequency signal. Low-frequency signals change more slowly — that’s by definition, that’s what high and low frequency mean. In order to capture a higher-frequency signal (shrimp chips), we need a higher-frequency sampling rate (check wok more often). To be precise, the Nyquist Sampling Theorem (technically the “Nyquist-Shannon Sampling Theorem,” but we usually forget about Shannon) says we need to sample at twice the maximum frequency we want to capture. There’s some beautiful, beautiful math behind it that I won’t go into, though I highly recommend the journey through that proof for anyone keen on understanding signal processing. But the gist that you actually need is this: want to capture a 500Hz signal? You need a sample rate of at least 1kHz. (But don’t go over that too much — it’s just a waste of resources, like the kid in the back of the car that keeps asking “are we there yet?” when we obviously aren’t.) Another way of putting this is that the “Nyquist frequency” of a 1kHz sample rate is 500Hz — the Nyquist frequency being half the sampling rate, the highest frequency that can faithfully make it into the reconstructed output.
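The rule of thumb is one line of code in each direction (the function names here are mine, just for illustration):

```python
def min_sample_rate(max_signal_hz):
    """Nyquist: sample at (at least) twice the highest frequency you care about."""
    return 2 * max_signal_hz

def nyquist_frequency(sample_rate_hz):
    """The flip side: the highest frequency a given sample rate can capture."""
    return sample_rate_hz / 2

rate = min_sample_rate(500)      # a 500 Hz signal needs a 1 kHz sample rate
limit = nyquist_frequency(1000)  # and a 1 kHz sample rate tops out at 500 Hz
```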

What happens if you sample at too low a rate? Aliasing. I’ll let Wikipedia explain that. All you really need to know is that it makes a high signal frequency sound like a low alias frequency instead. To be precise, for a signal between the Nyquist frequency and the sampling frequency, the alias frequency is the sampling frequency minus the signal frequency; to be descriptive, it sounds terrible. Take a piece of piano music and play all the high Fs as low C-sharps and you’ll see what I mean.

Therefore, right before a signal hits the ADC (analog to digital converter, remember) we usually have an anti-aliasing filter, which is just a low-pass filter. Sometimes the microphone itself acts as the low-pass filter; if the mic can’t physically capture sounds beyond the Nyquist frequency, those sounds just never get a chance to be digitized and aliased. But in case the mic does pass through signals above the Nyquist frequency, the anti-aliasing filter chucks them out (introducing a bit of distortion in the process, since there are no perfect filters).
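To get a feel for what a low-pass filter does, here’s a first-order (“one-pole”) filter in Python. I should hedge loudly: a real anti-aliasing filter lives in the analog domain, before the ADC, and rolls off much more steeply than this toy — but the behavior is the same in spirit: low frequencies pass, high frequencies get crushed.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, fs):
    """A first-order low-pass filter (discrete RC filter): a toy stand-in
    for an anti-aliasing filter."""
    dt = 1.0 / fs
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing coefficient: how far to lean toward input
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # output only partially tracks fast changes
        out.append(y)
    return out

fs = 8000
low_tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]
high_tone = [math.sin(2 * math.pi * 3000 * n / fs) for n in range(fs)]
# With a 500 Hz cutoff, the 100 Hz tone sails through nearly untouched
# while the 3 kHz tone comes out strongly attenuated.
passed = one_pole_lowpass(low_tone, 500, fs)
blocked = one_pole_lowpass(high_tone, 500, fs)
```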

We have now set our bandwidth limitations. We have decided that we will not capture or work with any signals above half our sampling frequency. Higher sounds are gone forever. A system only has as much bandwidth as its smallest component.

How fast do modern hearing aids sample? The typical one nowadays is about 16-32kHz. (Quick reminder for my audiology classmates: “Hz” is short for Hertz, which means “samples per second” — 32kHz is “thirty-two kilohertz,” or “thirty-two thousand samples per second.”) Compared to CD-quality audio at 44.1kHz, that sampling rate seems downright sluggish. But hearing aid processors are working under all sorts of time and space and power limitations — they can only process so much data at a time, and a 16-32kHz sample rate is already handing them 16,000-32,000 samples to deal with every second.
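For a sense of scale, the raw data rate is just the sample rate times the bits per sample. A quick sketch (uncompressed, one channel; the hearing-aid numbers are the ballpark figures from this post, not any particular model’s spec):

```python
def data_rate_bits_per_s(sample_rate_hz, bits_per_sample):
    """Raw data rate of an uncompressed, mono digitized audio stream."""
    return sample_rate_hz * bits_per_sample

ha = data_rate_bits_per_s(16_000, 16)  # low-end hearing-aid ballpark
cd = data_rate_bits_per_s(44_100, 16)  # CD-quality audio, one channel
# Even at the low end, the processor is chewing through a quarter of a
# million bits every second; CD audio is nearly three times that.
```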

I should backtrack, because some of my classmates wondered: what are bits?

Bits are units of digital data. Computers work in binary (base 2), so the word “bit” is short for “BInary digiT.” I won’t go into detail as to why and how this works because it’s not important (to audiology students, anyway — engineering students should totally look this up) and Wikipedia does it well, but basically, if you have more bits, you can represent more numbers; 1 bit lets you represent 2 numbers, 2 bits lets you represent 4, 3 bits lets you represent 8, and so on down the line: 8 bits lets you represent 256 numbers. We use bits in units of 8 so frequently that we have a unit for that: 8 bits equals one byte. That’s the same “byte” as in “kilobyte” (kB, 1000 bytes) or “megabyte” or “terabyte” and… basically, the words and numbers on the packages of thumbdrives that you buy. Bytes and bits describe data size, the amount of information you have. The more bytes (or bits) you have, the more data you have.
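The doubling pattern above is just powers of two, which takes one line to express:

```python
def representable_values(n_bits):
    """Each extra bit doubles how many distinct numbers you can represent."""
    return 2 ** n_bits

BITS_PER_BYTE = 8  # the grouping we use so often it earned its own unit
```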

How big — how many bits — is an incoming digital signal? Well, that’s going to depend on a few things, so let’s go back to the Grand Canyon with our cameras again and take a time-lapse film. How much hard drive space will we need? This depends on how long the time-lapse is and how often we are taking pictures (sampling rate), but also how high-resolution each picture is (quantization). The more detail we are trying to capture with each sample, the more bits we’re going to need.

If you remember from earlier: sampling is going from analog-to-digital in time, and quantization is going from analog-to-digital in amplitude. We need a reasonable number of “loudness levels” in order to be able to make sense out of sound; think of how terrible it would be if every sound you heard were coming out at exactly the same volume — if the soft low hum of your air conditioner and the faint high cheeps of the cicada outside your window became exactly as loud as the phone call you were trying to listen to.

Through a bunch of (I assume) magical math that I didn’t get to see and therefore don’t understand, the general rule of thumb is that each bit — each doubling of the number of different “loudness levels” you can have — can give you another 6dB of dynamic range. A typical hearing aid will run at 16-20 bits and with a 96-120dB dynamic range, meaning that the difference in volume between the softest and loudest sounds the hearing aid will detect (and process and amplify) is 96-120dB.
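The magical math is actually one line: doubling the number of levels adds 20·log10(2) ≈ 6.02 dB, so n bits buys roughly 6n dB of dynamic range. You can check the numbers from this paragraph directly:

```python
import math

def dynamic_range_db(n_bits):
    """Each bit doubles the number of amplitude levels, and each doubling
    adds 20*log10(2) ~ 6.02 dB of dynamic range."""
    return 20 * math.log10(2 ** n_bits)

# 16 bits -> ~96 dB, 20 bits -> ~120 dB: exactly the 96-120 dB range
# quoted for typical hearing aids.
```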

You really, really want to leave “headroom” at the ceiling of your dynamic range — about 6dB, as a rule of thumb. If the loudest sound your mic can pick up is 100dB, give yourself 6dB of headroom (also called “reserve gain”) in your ADC so that in case you get a really loud sound, you’ll have a bit of wiggle room before the processor goes awry and everything begins to sound like crap. (Also put a limiting compressor in as a protection circuit just before your ADC to make sure it never gets a voltage signal higher than what it can handle, the same way you put the anti-aliasing filter in to make sure it never gets a frequency higher than it can handle.)

Note that the softest sound does not need to be 0 dB SPL.

And once again I need to sidestep to explain a term: SPL stands for “sound pressure level,” and 0 dB SPL is a reference point for the sensitivity threshold of normal human hearing — in other words, we call “the volume of the softest possible detectable 1kHz sound” 0dB SPL, then talk about sound volumes in decibels relative to that (the same way 0 degrees Celsius is calibrated against the freezing point of water, and we talk about temperatures in degrees Celsius relative to that).
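For the engineers in the audience: the reference pressure behind 0 dB SPL is 20 micropascals, and every SPL figure is just decibels relative to it:

```python
import math

P_REF_PA = 20e-6  # 20 micropascals: the pressure defined as 0 dB SPL

def spl_db(pressure_pa):
    """Sound pressure level: decibels relative to the 0 dB SPL reference."""
    return 20 * math.log10(pressure_pa / P_REF_PA)

# The reference itself is 0 dB SPL by construction, and every 10x increase
# in pressure adds 20 dB.
```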

Anyway. Note that the softest sound does not need to be 0dB SPL. In fact, it shouldn’t be — things that soft are probably background noise rather than signal you’ll need to pay attention to. (If they wanted you to pay attention, they’d have made it louder, right?) So we can squeeze a bit more out of our dynamic range by raising the noise floor — if we have a 100dB dynamic range and want to be able to process volume distinctions up to 120dB, we simply make the noise floor 20dB. This means that the softest sound the hearing aid will even detect is 20dB; anything below 20dB is discarded, turned into silence, all in the name of preservation of digital space.
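The noise-floor trick is simple sliding-window arithmetic, which a tiny sketch makes concrete (the function name is mine):

```python
def capture_window(dynamic_range_db, noise_floor_db_spl):
    """With a fixed dynamic range, raising the noise floor raises the ceiling
    by the same amount; anything below the floor is discarded as silence."""
    floor = noise_floor_db_spl
    ceiling = noise_floor_db_spl + dynamic_range_db
    return floor, ceiling

# Same 100 dB of range, but the window slides up: a 20 dB floor buys
# a 120 dB ceiling, at the cost of never hearing anything below 20 dB SPL.
```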

We’re just discarding data left and right, aren’t we? Yes, we are. The heavier your backpack is, the longer it’s going to take you to hike across the mountain — so you only, only, only take essentials. (In a later post, we’ll talk about why latency sucks, but for right now, trust me: latency sucks.) Processors can only do so much when they’re tiny and current-strapped and have to be fast. If information isn’t necessary for understanding or comfort or something of that sort, out it goes; high frequencies (above the Nyquist) get slapped off by the antialiasing filter so they don’t alias, low frequencies get discarded underneath the noise floor.

Even “useful” signals are squeezed as small as possible. If we’ve decided on a 20dB noise floor (“things softer than 20dB SPL are probably not important so we’re not even going to pick them up”) then maybe we can also decide that soft sounds — say, between 20-40dB SPL — are things we want to be aware of, but we don’t need to hear them particularly clearly — it’s ok if they’re low-resolution. So we encode them in less space (say, 3-4 bits) whereas louder sounds (say, 60dB SPL) might be encoded with 16 bits.
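A level-dependent bit budget like this might be sketched as follows — and let me stress that the cutoffs and bit counts here are purely illustrative, pulled from the example numbers above, not any real manufacturer’s encoding scheme:

```python
def bits_for_level(level_db_spl):
    """Hypothetical bit allocation by loudness: spend fewer bits on sounds
    we only need to be aware of, full resolution on sounds that matter."""
    if level_db_spl < 20:
        return 0   # below the noise floor: not even captured
    if level_db_spl < 40:
        return 4   # soft: audible, but low resolution is acceptable
    return 16      # loud: full resolution
```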

Maybe this sounds abstract, so here’s a musical analogy: say you want to hear a violin concerto, and you’re going to get the violin melody (loud) and the orchestral background (soft) as 2 separate mp3 tracks. One of them will be a fantastic mp3; brilliant sound quality, all that — the other will use the lossiest encoding possible. You get to pick which track gets encoded which way. You’d pick the complex, loud violin melody, right? The soft orchestral stuff is great to hear, but if fidelity is going to get lost somewhere, you’d rather it be there rather than in Joshua Bell’s solo.

So yes. Squeeze, squeeze, squeeze. Don’t need that frequency? Discard it. Don’t need that resolution? Mangle it into fewer bits. Ultralight backpackers trim the borders off their maps and drill holes in their toothbrush handles to save extra precious ounces, and we want to do the same before the (decimated, somewhat battered) signals hit the DSP.

What happens when it does will be the topic of our next post.

9 Responses to “Hearing aids: an introduction to DSP in hearing aids — let’s start with the part before it hits the processor”

  2. Quick point of clarification: I thought the Nyquist frequency was (usually) half the sampling frequency, not identical to it. Wikipedia suggests some textbooks use a different definition, but f_N = 1/2 f_s seems to be the one I come across most often…. so 500 Hz would be the Nyquist frequency if the sampling frequency is 1kHz. Then, ideally, the Nyquist frequency is equal to the highest frequency present in the signal, or at least the highest frequency we’re interested in.

    By Grant on Sep 26, 2012

  3. Gah! Yes. Thank you. Fixing.

    By Mel on Oct 13, 2012

  4. I’d like to know how to get an analog-to-digital converter in my hearing aids that has enough headroom to handle live musical instrument input. Most of the hearing aid companies use ADCs that only process up to 90-96dB inputs. Thus an instrument like my banjo causes distortion from the beginning of the processing cycle and no amount of compression changing will correct it. I guess the hearing aid companies feel that hearing impaired musicians do not exist or do not count because the companies seldom respond to any inquiries on this subject. We need an aid with an ADC that can process at least 105 dB.

    By Bruce on Apr 11, 2013

  5. Bruce, I suspect the reason HA manufacturers don’t include ADCs that go that high is because they’d probably get sued a lot more often. The reason is that 100 dB is easily loud enough to cause hearing damage. Anything over 80 dB can cause damage if exposure is long enough, and the higher you go, the shorter the time needed for the damage to happen. To say you want ADCs that go up to 105 dB is also pretty dangerous, I think, because the pain threshold is generally around 120 dB.

    While it would be nice to have the option to go this high in cases like yours that would like to hear undistorted sound from instruments, HA companies can’t do this because most non-technical HA users wouldn’t be careful enough. As Mel has mentioned in other blog posts, it’s tempting to just ramp up the HAs to full volume at times. If there weren’t a cutoff, it would be easy to cause even more damage.

    Mel, am I on base here or am I missing something?

    By Grant on Apr 11, 2013

  6. Grant: my completely unprofessional, I-am-not-an-audiologist, opinion is that you are on base (thanks for beating me to that answer!)

    Bruce, I can sympathize with the lack of headroom; I’m a musician as well. My guess is that the hearing aid companies are feeling nervousness regarding regulations and the possibility of giving users things they could use to hurt themselves, as Grant said — but also resource constraints, because it takes a lot of time and effort to make what we might see as trivial/small changes sometimes. For instance, the headroom change you suggest might require a new mic design — which means creating a whole new factory setup to produce them, then getting FDA approval on the new component/design/factory, and so on. It’s hard to tell what the constraints are from the outside.

    I’d love to see more direct dialogue between savvy hearing aid users (musicians, engineers, teachers, doctors, etc. who happen to wear hearing aids) and the companies that make hearing aids — if you’re interested in that as well, let me know and we’ll keep working to make it happen.

    By Mel on Apr 18, 2013

  7. Two of the best live music/hearing aid related articles I have found are:

    http://www.audiologyonline.com/articles/music-as-input-to-hearing-954 by Marshall Chasin

    Hearing Journal Sept 2010-Vol 63 -Issue 9 Programming hearing instruments to make music more enjoyable by Hockley, Bahlmann and M. Chasin.

    Marshall Chasin, an Audiologist at Musicians clinics of Canada, has done a lot of research on the subject and his description of the differences in speech and music “physics” and sound processing has been quite enlightening to me.

    Also over the last several years I have e-mailed most of the major hearing aid companies about getting more ‘headroom’ on the input side but only 1 ever responded which was Sebotec. In fact most of the companies won’t even tell my audiologist what the specs on their ADC’s are.

    And yes, I agree that we need a place to discuss our actual experiences with hearing instruments as musicians on a consistent basis.
    I have noticed that some of the existing hearing aid blog sites don’t even have a category for musicians. And the musicians that do comment from time to time get frustrated and give up. Felt that way myself in fact. I have not checked to see what percent of the hearing impaired population are musicians/live music listeners, but we probably don’t make up much of the ‘market share’, which is why I thought that hearing aid companies are none responsive.

    By Bruce on Apr 18, 2013

  8. Bruce, have you seen https://www.bigtent.com/groups/aamhl (Association of Adult Musicians with Hearing Loss)?

    By Mel on Apr 19, 2013

  9. Digital audio is not, by definition, lower fidelity than analog. Yes, it must be band-limited, but so are our ears. A sampling rate can always be chosen that exceeds 2x the required bandwidth. Quantization is also not a limitation, as long as dithering is properly employed. Dithering completely de-correlates and linearizes the distortion due to quantization. Dither is the most counter-intuitive and misunderstood aspect in all of digital audio. If you do not fully understand dither, then you do not fully understand digital audio.

    As for clipping and headroom, true 24-bit ADCs can yield 144 dB of dynamic range. Since 120 dB is the approximate range between the threshold of audibility and the threshold of pain, I would say 24-bit ADCs are more than adequate. And no, that doesn’t imply any legal liability for hearing aid vendors. It simply means that the audio front end will be more immune to clipping and overload. DSP dynamic range compression and loudness limiters are the proper mechanisms to prevent excessive loudness from reaching the wearer’s ears. What’s more, compressors and limiters work better with undistorted audio.

    I am an audio and music lover, and I bought my hearing aids based on their audio quality. The fidelity of these hearing aids is nothing less than spectacular. I have never noticed distortion in live concerts, rock or orchestral.

    By Karl U on Jan 16, 2015

  10. Thanks, Karl — I definitely don’t fully understand dither (it’s been a long time since my signal processing class), but would like to — if there are any resources you find especially helpful, I’d love to know what they are.

    Audio quality perceptions depend on a whole host of complex factors — details of hearing loss (what frequencies, what severity, what kind of distortion if any), type of hearing aids (which sadly doesn’t always fit the hearing loss for many reasons), programming of hearing aids (also sometimes doesn’t fit), listening history (i.e. did you grow up hearing and establish a sound baseline, then suddenly lose it — or have you always been deaf/hard-of-hearing? Are you a musician/audiophile, or a casual listener?)

    I’d love it if more people unpacked their decision-making process for hearing aid selection (and included these sorts of factors in their discussion so I and others could figure out what things apply to us). It would be fun to play with music with other non-hearing audio geeks and *really* geek out — not from any sort of “oh, woe is me, I cannot hear things” perspective (which annoys me to no end), but with a sense of “wow, we get to experience sound in such different ways, let’s play with it!”

    By Mel on Nov 17, 2016
