
Why We Choose 44.1K And 48K? A Historical Look At Audio Sample Rates

Published: 2022-6-20 11:17 | Publisher: 4531209

Following on from our recent discussions about the use of different sample rates, it struck me that history played a significant part in why sample rates were chosen and how good even the early digital options were when compared to the analog specs we were working to at the time. In this article, Mike Thornton offers a perspective based on his 45 years of experience in UK professional audio.

Being old enough to have now retired, I started work in pro-audio in an all-analogue world and have lived and worked through the analogue to hybrid transition through to full end-to-end digital audio workflows.

Consider The Delivery Path
Irrespective of the time frame, when choosing the technology for a particular task it makes sense to start from the delivery specs and how the content will be consumed. Those should inform and direct our choice of equipment and techniques, so that we meet, or preferably exceed, the required delivery specification for the platform.

Back To The Bad Old Days Of Analog

To start my perspective, I am going to roll back to the 80s, when the path was analog from end to end. To give you a benchmark, at this time I was working in independent commercial radio here in the UK and we had a technical code of practice that we had to meet. Fail and the Independent Broadcasting Authority had the power to take us off-air. These were all laid out in the IBA Technical Review 13 - Standards for Television and Local Radio Stations, a red paperback book, that was effectively our ‘bible’. 

Slightly bizarrely, I managed to find a copy that had been scanned in on the NTL Pension Association website if you would like to read it for yourself.

Let's start with the spec we had to meet for using analog tape recorders. If you are too young to have worked in the analog world you might be shocked at what we worked to back in the day.

Frequency response: 40Hz to 15kHz +2.0dB to -2.5dB

Wow and Flutter: No more than 0.12%. This is something that doesn’t affect the digital world, but wow and flutter is a measure of the speed variation, something you don’t want in audio recording and playback. Wow is the slow changes in speed and flutter the fast variations. This required the machines to be well maintained mechanically.

Distortion: 2% at +8dBu at 1kHz. As this was tape the distortion increased progressively as you increased the level, whereas analog circuits were pretty good, typically 0.1% up to around +18dBu. 

Signal to noise ratio: 45dB unweighted peak level. Note that our headroom was +8dBu so we are looking at a noise floor that had to exceed -37dBu!
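The noise-floor figure quoted in the signal to noise spec is simple subtraction. As a minimal sketch (the variable names are illustrative, and it assumes the 45dB ratio is measured down from the +8dBu peak level, as the text implies):

```python
# Figures taken from the tape-recorder spec above; names are illustrative.
peak_level_dbu = 8.0     # peak level the ratio is measured against
required_snr_db = 45.0   # minimum unweighted signal to noise ratio

# The noise floor sits the full signal to noise ratio below the peak level.
noise_floor_dbu = peak_level_dbu - required_snr_db

print(f"Noise must sit at or below {noise_floor_dbu:.0f} dBu")
```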

Remember these were real-world day-to-day figures. Could we get better? Yes, but that required frequent lineup and cleaning, which wasn’t practical when a local station could easily have 20 to 30 ¼” tape recorders. Latterly, at Piccadilly Radio, we chose to go for Studer B67s, which stayed lined up both electrically and mechanically for much longer. However, the budget wouldn’t stretch to the likes of A80s etc.

For music, we played vinyl records and the ‘disc reproducers’ had to meet this spec…

Frequency response: 40Hz to 15kHz +/-2.5dB

Wow and Flutter: No more than 0.12%.

Signal to noise ratio: 55dB unweighted peak level

Then there was the electronic studio path. This would be measured through the transmission chain from mic preamp input to the studio output, going to the transmitter.

Frequency response: 40Hz to 15kHz +/-1.0dB

Signal to noise ratio with a mic preamp providing 70dB of gain: 46dB unweighted peak level giving an equivalent input noise of -116dB. This usually wasn’t a problem as the Neve mic preamps we used were much better than that but the more budget desks were not so good, especially at high gains, hence the real-world spec.

Just for reference, the line-level inputs had to exceed a signal to noise ratio of 63dB

We also had to do a full loop test. This we did in the middle of the night, even though Piccadilly Radio was a 24-hour station. So this was from the studio desk out to the transmitter, in our case using a radio link, but more often than not copper telephone links optimised for ‘music’, through the transmitter and back to the studio via the off-air check receiver. 

I can’t remember what the spec was but I do remember with FM radio on a good day we were lucky to get a 45dB signal to noise ratio.

So there you have it. In the days of all analog paths, we were constrained by the delivery system to the consumer as well as key elements in the signal flow, especially the analog tape recorders. Even with a well lined up ¼” tape machine, signal to noise ratio and the levels of distortion meant we had a pretty tight window of 45dB. That was it.

Yes, I know that with a lot of work you could stretch that to around 50 to 55dB without resorting to noise reduction. From memory, Dolby A could give you an extra 10dB to play with, but I hope this goes some way to explain why even the early digital audio platforms were so much better than our analog options.

The Onset Of Digital

With the coming of digital audio recorders and signal chains, wow and flutter effectively went away. Headroom and distortion became binary: either the signal was very good or, once you hit the digital ceiling, it was very bad, and as little as 0.1dB could make all the difference. With 16-bit digital audio, signal to noise ratios jumped from 45dB up to 96dB. Frequency response became very flat from 20Hz to 20kHz, with no low-frequency bumps and no deterioration at the high end because of tape head wear. Yes, with digital audio, things got a little ‘rocky’ close to the anti-aliasing filters, which were set close to half the sample rate, but overall 16-bit/44.1kHz audio was stunning, especially when you compare it with the analog world we had become used to up until that point. The idea of a ‘noise-free’ world with a significant increase in dynamic range and no wow and flutter all the way to the consumer’s equipment was a game-changer.
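For readers who want to see where the 96dB and "half the sample rate" figures come from, here is a rough sketch. It assumes the commonly used rule that dynamic range is the ratio of full scale to one quantisation step, which is an approximation and not a calculation given in this article; the function names are illustrative.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantisation step, expressed in dB."""
    return 20 * math.log10(2 ** bits)


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


print(round(dynamic_range_db(16), 1))  # about 96.3 dB for 16-bit audio
print(nyquist_hz(44_100))              # 22050.0 Hz, just above the 20kHz audio band
```

The small gap between 20kHz and 22.05kHz is why the anti-aliasing filters at 44.1K had to be so steep.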

But it wasn’t just the broadcast sector that benefited. With the birth of the compact disc, the delivery of music moved to digital too. Not only could high-quality audio be delivered all the way to the end-user without data compression, the CD was also very convenient, and the benefits outweighed the drawbacks even in the early days when anti-aliasing filters were not as good as they are now. Like it or not, the CD cemented 44.1K/16 bit as a high-quality and very convenient format for delivering music to the consumer.

Why 44.1K And 48K Sample Rates?

Back to the choice of sample rate and you might also be wondering why we ended up with sample rates with less than round numbers. If so then do check out our article Why Are The Sample Rates Based Around 44.1K or 48K? You will see that history and the hardware we had at the start of digital audio together with some math meant we ended up with 44.1K and 48K sample rates.
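The "some math" is not spelled out in this article. The commonly cited explanation is that early PCM adaptors stored digital audio on video tape recorders, at 3 samples per usable video line. The sketch below works through that arithmetic; the line counts and field rates are the usually quoted figures for 60-field and 50-field video, not numbers taken from this article.

```python
SAMPLES_PER_LINE = 3  # samples stored on each usable video line (commonly cited figure)


def pcm_adaptor_rate(fields_per_second: int, usable_lines_per_field: int) -> int:
    """Audio sample rate obtained by storing samples on video lines."""
    return fields_per_second * usable_lines_per_field * SAMPLES_PER_LINE


print(pcm_adaptor_rate(60, 245))  # 60-field video: 44100
print(pcm_adaptor_rate(50, 294))  # 50-field video: 44100
```

Both video systems land on the same figure, which is one reason 44.1K was a convenient choice for the hardware of the day.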

Here in the UK, radio chose 44.1K as the standard sample rate because so much of the content came from CDs. By this point, I was editing and mixing content for national radio on the BBC. DAT was adopted as the first digital format, again largely using 44.1K/16 bit. As DAT wasn’t completely reliable, we always supplied a main and a backup tape, and when the programme was played out both tapes would run at the same time in case there was an issue with the main tape.

Once CDR became reliable we migrated to delivering CDs for prerecorded programmes. For some time these were played out live, before the transition to a playout system into which the CD could be ripped and uploaded. Finally, a file upload system bypassed the need to deliver programmes on any kind of media, but the spec was still 44.1K/16 bit. Remember that this is still way better than the analog FM transmission system, which we still have in parallel with DAB. DAB offers a digital end-to-end path, but it relies on heavy data compression to work, and signal coverage is still not good enough to listen reliably inside buildings!

Moving away from broadcasting, the CD was an incredibly cost-effective and convenient way of delivering high-quality audio to the end-user. But, as with DAB in broadcasting, the additional convenience of personal digital audio players like the Apple iPod came at a price: to make them practical, especially in the early days, it was necessary to resort to data compression to get enough content onto the devices.

When it came to content involving pictures 48K/16 bit was settled on for the audio paths, even though the math meant that 44.1K would work too. It has been suggested that the slight increase in sample rate from 44.1K to 48K meant the side effects of the early anti-aliasing filters were lessened and so the quality benefit was worth having, and TV was much less dependent on CDs for the bulk of its output.
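One way to read "the math meant that 44.1K would work too" is that both rates give a whole number of audio samples per video frame. The sketch below checks this for nominal 25 and 30 frames-per-second video. Those frame rates are an assumption made for illustration; the article does not say which ones were considered.

```python
def samples_per_frame(sample_rate_hz: int, frames_per_second: int) -> tuple[int, int]:
    """Return (whole samples per frame, leftover samples per second)."""
    return divmod(sample_rate_hz, frames_per_second)


for rate in (44_100, 48_000):
    for fps in (25, 30):
        whole, leftover = samples_per_frame(rate, fps)
        print(f"{rate} Hz at {fps} fps: {whole} samples per frame, remainder {leftover}")
```

A remainder of zero in every case means audio and picture frames stay aligned without fractional samples at either rate.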

Why All These Sample Rates?

In the recent comments to our article, Why Are Audio Engineers Avoiding High Sample Rates? Jay Tei said…

“Some of us remember a time when sample rate conversion sounded much worse than it does now. We tend to record at the sample rate of the final deliverable.”

Alan Hardiman quoted Bob Katz from his book Mastering Audio (3rd edition, chapter 23): the difference in reproduced audio quality at different sample rates has to do with the steepness of the low-pass filters in the digital-to-analog converter.

"Steep low-pass filters at or near the high-frequency limit of the ear interact with the cochlear filter, creating pre-echoes that the ear interprets as a loss of transient response, obscuring the sharpness or clarity of the sound."


These are just two examples, but there is no doubt that in the early days of digital audio, sample rate conversion and anti-aliasing filters were not as good as they are now. To get increased quality, it therefore made sense to use multiples of either sample rate rather than convert from one family to the other. It is our understanding that these limitations produced the 44.1/88.2/176.4 and 48/96/192 ranges of sample rates.
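To illustrate why staying within one family was attractive, the sketch below reduces a few conversion ratios to their lowest terms. It is only an illustration of the arithmetic, not a description of how any particular converter works; the function name is hypothetical.

```python
from fractions import Fraction


def conversion_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Target rate divided by source rate, reduced to lowest terms."""
    return Fraction(target_hz, source_hz)


print(conversion_ratio(88_200, 44_100))  # 1/2
print(conversion_ratio(96_000, 48_000))  # 1/2
print(conversion_ratio(48_000, 44_100))  # 147/160
```

Within a family the ratio is a simple 2:1 or 4:1, whereas crossing from 48K to 44.1K needs the far less friendly 147/160.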

Original Content Creation Or Compilation Of Existing Content?
Another factor in determining what technology to use in the production process is whether you are working with content that is already at 44.1K or 48K. Until recently, in post-production, especially in TV, most of the content was at 48K, at which point it is debatable whether it is worth using higher sample rates and bit depths. However, when recording and mixing an album project, when most, if not all, of the content is being acquired in high-quality studios, there is a lot more merit in using higher sample rates and bit depths.

Nowadays, with the combination of lower-cost storage, much-improved anti-aliasing filters and sample rate conversion, using 96K/24Bit makes sense as a pretty universal format now, irrespective of the final delivery sample rate, but back in the early days, it mattered.
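To put a number on the storage side of that trade-off, here is a rough sketch of uncompressed PCM data rates. It assumes stereo audio and decimal megabytes, and the function name is illustrative.

```python
def megabytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Uncompressed PCM storage per minute, in decimal megabytes."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second * 60 / 1_000_000


cd_quality = megabytes_per_minute(44_100, 16)
high_res = megabytes_per_minute(96_000, 24)

print(f"44.1K/16 bit: {cd_quality:.1f} MB per minute")
print(f"96K/24 bit:   {high_res:.1f} MB per minute")
print(f"Ratio:        {high_res / cd_quality:.2f}x")
```

Under these assumptions 96K/24 bit needs a little over three times the space of 44.1K/16 bit, which mattered far more when drives were small and expensive.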

So Why Aren’t More People Using Higher Sample Rates?

In our 2019 poll of nearly 2,000 professionals and hobbyists recording and mixing audio, only 1 in 5 said they bothered recording and mixing above 44.1kHz/48kHz. Two groups were polled: those who said they were professionals and those who said they recorded and mixed as a hobby. In both cases, the majority opted to record at either 44.1kHz or 48kHz.

In the comments on the poll results, ‘mattjhuber’ said…

“The gains in fidelity are too minimal to justify the loss in computer performance and storage space.”

‘Joel.d’ made an interesting comment…

“I'll say this as nicely as possible cause I am so sick of this debate. The problem is, is most "professionals" aren't professionals anymore. And with all due respect - the majority of everyone in this industry now-a-days are idiots, so they don't see or understand why to do things the best way possible anymore regardless of price or consequences.”

But there is no doubt that the benefits are there to be had and are much more achievable now than they have ever been. Computers are more and more powerful and able to handle the increased demands of processing higher sample rate content, and much larger storage devices are available that are also cost-effective. As ‘Joel.d’ went on to say in the comments to our poll results…

“Plugins sound WAY better at 96k and especially reverbs tails are simply night and day. Never heard it? Test and listen.”

 
In the comments on our article, Why Are Audio Engineers Avoiding High Sample Rates? Michael Carnes put it so well…
“The truth is that any instrument will have a considerable and rapid variance in frequencies of all the partials. In many cases (piano, percussion, etc) the variance is very fast, very strong and discernible well below 20K. Your 44.1 reconstruction filter will only give you a smoothed over version of all that change--there simply aren't enough sample points to really catch this important part of the sound. Some of the result is phase-shifted by a little and some just doesn't match up to what the instrument did. Now push that through a reverb or IIR filter and the inaccuracy will be multiplied. High sample rates will give you a better representation of what came into the mics (and there's LOTS of stuff above 20K) and will reduce what happens when the signal slops around through a filter a few hundred times.

The argument that comes now is that it all has to get knocked down to 44.1, so why use up all that (incredibly cheap) storage space? Well, even when you downsample, all of that error hasn't built up in filters and reverbs. You only smooth out those partials one time.

I'd recorded at 44.1K for years and years, probably repeating some of the arguments I see here. But I realized I was just parroting what people said and not going by the evidence of my own ears. So I decided to record a chamber music concert at 96K. I was in my 50s at the time. You could have knocked me over with a feather. I'm in my 70s now and 192K is my standard sample rate for any live recording. I can't imagine going back.”

In Conclusion
Hopefully, in this article, I have been able to provide some background to show where we have come from. In the early days of digital, things weren’t perfect, with not very good anti-aliasing filters and sample rate conversion, but even so digital audio was so much better in so many ways than the analog audio that preceded it.

The good news is that most, if not all, of the deficiencies early digital audio suffered from are now a thing of the past. You can choose to use higher sample rates and bit depths without getting stung by computers unable to handle the increased load, by a lack of hard drive storage, or by having to work around less than ideal filters and sample rate conversion.

That said, at the end of the day, it is your choice. Hopefully you can now make that choice a little better informed about the factors at play and why some of the choices were made the way they were back in the day.

Mike Thornton 
Mike Thornton has been involved in the broadcast audio industry for all his working life, some 45 years. Mike has worked with Pro Tools since the mid-1990s recording, editing and mixing documentaries, comedy and drama for both radio and TV as well as doing the occasional music project. He was the co-founder of Pro Tools Expert and has now retired and has taken up the role of Chairman of Production Expert Ltd.
