I got pretty deep into building a modular synthesis environment using it (https://github.com/rsimmons/plinth) before deciding that working within the constraints of the built-in nodes was ultimately futile.
Even building a well-behaved envelope generator (e.g. that handles retriggering correctly) is extremely tricky with what the API provides. How could such a basic use case have been overlooked? I made a library (https://github.com/rsimmons/fastidious-envelope-generator) to solve that problem, but it's silly to have to work around the API for basic use cases.
Ultimately we have to hold out for the AudioWorklet API (which itself seems potentially over-complicated) to finally get the ability to do "raw" output.
I will second this. I wanted to make a live streaming playback feature using the API so I could remotely monitor an audio matrix/routing system that I have in the office.
The API has _zero_ provision for streaming MP3. You either load and play back a complete MP3 file, or you get corrupted playback because the API simply won't maintain state between decoding calls.
What I ended up having to do was write a port of libMAD to JavaScript and then use that to produce a PCM stream, which I _could_ then convert into an AudioBuffer, attach a timer, and then send into the audio API for correct playback.
Which is an insane amount of work to cover a gaping oversight in a common use case of the API; a simple flag on the browser's native decoder would've sufficed.
Did you look into Media Source Extensions[0,1]? Fetching and playing the various audio formats is a bit outside the purview of Web Audio. But you can feed streaming MSE into Web Audio. If I recall, you use Web Audio's `AudioContext.createMediaElementSource()` to use a (potentially chunked) MSE source with web audio, but it's been a while since I did this.
That said, Media Source Extensions (MSE) is only supported on relatively modern browsers (IE11+) but you should be able to use it to stream mp3 to the Web Audio API on supported browsers.
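To make that concrete, here's a minimal sketch of the MSE-to-Web-Audio route, assuming an MP3 served in chunks from a hypothetical /stream.mp3 endpoint (whether 'audio/mpeg' is accepted by addSourceBuffer varies by browser, so treat this as an outline, not a recipe):

// Sketch: feed a chunked MP3 stream into Web Audio via MSE.
// The /stream.mp3 endpoint is illustrative, not a real service.
const audio = new Audio();
const mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const sourceBuffer = mediaSource.addSourceBuffer('audio/mpeg');
  const response = await fetch('/stream.mp3');
  const reader = response.body.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    sourceBuffer.appendBuffer(value);
    // Wait until the SourceBuffer has consumed this chunk before appending more.
    await new Promise(resolve =>
      sourceBuffer.addEventListener('updateend', resolve, { once: true }));
  }
  mediaSource.endOfStream();
});

// Route the media element through Web Audio for further processing.
const ctx = new AudioContext();
const source = ctx.createMediaElementSource(audio);
source.connect(ctx.destination);
audio.play(); // may require a user gesture depending on autoplay policy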
There's also a way to do this without using MSE for older browsers. See the 72lions repo below for an example[2]. It's a bit convoluted, but not as much work as your workaround. As described in the README of the 72lions proof-of-concept:
"The moment the first part is loaded then the playback starts immediately and it loads the second part. When the second part is loaded then then I create a new AudioBuffer by combining the old and the new, and I change the buffer of the AudioSourceNode with the new one. At that point I start playing again from the new AudioBuffer."
0. https://developer.mozilla.org/en-US/docs/Web/API/Media_Sourc...
The MP3 issues don't end there, which is something the article touches on obliquely: you can't reuse many of the important constructs you might want to.
Here's my use case. I have a couple of games (https://arcade.ly/games/starcastle, https://arcade.ly/games/asteroids), each of which has three pieces of music: title screen, in game, and game over. If you play the game a couple of times you're going to hear the title screen audio probably once, in game twice or more (because it loops from the beginning after every playthrough), and game over twice. To put it simply: I need to play the same MP3s multiple times each.
To play an MP3 you have to decode it, which is an expensive operation. Firstly, it takes time to decode - enough time that the user will notice the lag even on a fast machine. The main problem, however, is memory: decoding takes you from a couple of MB of compressed MP3 to potentially hundreds of MB of uncompressed audio. The problem only worsens with multiple tracks.
I discovered the memory issues via Chrome Task Manager, when I noticed my page using hundreds of MB of native memory, and traced this usage back to the music. You can often get away with this when running on a desktop browser, but not so much on mobile.
You can mitigate the memory issue to some extent by dropping the sample rate of your uncompressed PCM audio to 22.05kHz, which obviously halves its uncompressed size. Quality starts to suffer too much for music if you go much below this though. (Note here that I'm talking about the uncompressed sample rate, and NOT the MP3 bitrate. A 44.1kHz MP3 encoded at 64kbps and one encoded at 128kbps will decompress to the same size, although the 64kbps version will obviously sound worse because more information will have been lost.)
But the inability to reuse a source buffer, which holds compressed audio, is absolutely aggravating, and something I've posted at length about here: https://github.com/WebAudio/web-audio-api/issues/1175. The reason you might want to do this is because it means you're only using as much memory as the compressed audio takes up and (hopefully) the rest will have been freed by the browser's runtime (no guarantees, obviously).
The downside of this approach is that you can't start a piece of music at a defined instant, which is extremely frustrating when you might want to synchronise it with events happening on screen.
Also, because of the re-decoding every time, and its asynchronous nature, I've now introduced a weird bug where it's possible to end up with both title and in game music playing at the same time if the user starts the game before the title music has finished decoding. It's fixable (although I haven't had time yet), but it's just one more irritation with a poorly designed API.
I'm actually thinking of going back to using the good old HTML5 AUDIO element just for playing music, since it seems a bit more reliable, but I need to do some experimentation to see what the memory impact is. I also had issues with AUDIO misbehaving quite badly in Firefox with multiple sounds playing simultaneously.
Sound effects are less of an issue because they're obviously quite short and therefore don't take an excessive amount of memory even when uncompressed, so I can at least keep buffer sources around for them. Nonetheless the API's excessive complexity shows through even here: why is it such a drama just to play a sound? Why do I need to create and connect a bunch of objects together just to play a single sound at a given volume? Ridiculous. Asinine.
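For what it's worth, here's roughly the amount of ceremony I mean, just to play one already-decoded sound at half volume (a sketch; `decodedBuffer` is assumed to be an AudioBuffer from an earlier decodeAudioData call):

// All of this just to play one sound at a given volume.
const ctx = new AudioContext();
const source = ctx.createBufferSource();
source.buffer = decodedBuffer;   // AudioBuffer from decodeAudioData
const gain = ctx.createGain();
gain.gain.value = 0.5;           // the "given volume"
source.connect(gain);
gain.connect(ctx.destination);
source.start();
// ...and the source is one-shot: to play the same sound again you have to
// build a new BufferSourceNode and wire the whole chain up again.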
Within that constraint I don't think it's a terrible API, but it's a big constraint and naturally raw access would be far preferable.
That being said, the AudioParam "automation" methods still make me want to cry.
I just threw up in my mouth a little :/
If you give Web developers access to raw samples, they are going to expect it to work. When it doesn't on Chrome on Android, lots of people are going to start complaining and filing bugs.
So, instead of fixing the audio path, they decided to bury its crappiness under a "higher-level" API which has fuzzier latency and can be built with hacks in the audio driver stacks themselves.
https://issuetracker.google.com/issues/36908622
AAudio is a new C API. It is designed for high-performance audio applications that require low latency. It is currently in the Android O developer preview and not ready for production use. (Jun 2017)
This demo doesn't keep a straight 120 BPM on my machine; it's incapable of holding the rhythm after 10 seconds of playback (I tried the first patch on the left, in the Edge browser).
The MPU-401 also had a "dumb" a.k.a. "UART" mode. You had to do everything yourself... and therefore could do anything. It turned out that early PCs were fast enough -- especially because you could install raw interrupt service routines and DOS didn't get in the way. :)
As a sequencer/DAW creator, you really want the system to give you raw hardware buffers and zero latency -- or as close to that as it can -- and let you build what you need on top.
If a system is far from that, it's understandable and well-meaning to try to compensate with some pre-baked engine/framework. It might even meet some folks' needs. But....
And on that note, what I think Web Audio tried to be was a drop-in kit for game engines. Getting the full functionality of Unreal into the browser motivated the requirement for audio processing. But the actual implementation was muddled from the start: basic audio playback remains challenging (try to stream a BGM loop instead of load+uncompress and discover, to your woe, that it won't loop gaplessly even when the codec is designed to allow that), and my hobby stab at an independent implementation ran out of gas when I tried to get their envelope model working. The spec has a lot of features but not enough detail, and my morale sank further when I looked at how Chrome did it (stateful pasta code). I got something half-working, put it aside and never came back.
OTOH I had also tried Mozilla's system. That was very simple, and I got a synth working in no time at all with decent performance and latency. Optimizing from that point would have been the way to do it, but something in browser vendor politics at that time led to it being dropped.
Very few games used the MPU-401's intelligent mode, actually. Never mind how I know, that was a long time ago...
(If I could return to Cakewalk, I would. Wrote some of my best tracks with that little ISR of yours!)
I've been heavily into procedural audio for a year or two, and have had no big issues with using Web Audio. There are solid libraries that abstract it away (Tone.js and Tuna, e.g.), and since I outgrew them working directly with audio nodes and params has been fine too.
The big caveat is, when I first started I set myself the rule that I would not use script processor nodes. Obviously it would be nice to do everything manually, but for all the reasons in the article they're not good enough, so I set them aside, and everything's been smooth since.
So I feel like the answer to the article's headline is: today, as of this moment, the Web Audio API is made for anyone who doesn't need script nodes. If you can live within that constraint it'll suit you fine; if not, it won't.
(Hopefully audio worklets will change this and it'll be for everyone, but I haven't followed them and don't know how they're shaping up.)
Other proposals for audio APIs solved a wider set of use cases, while also making it possible to do procedural audio without depending on browser vendors to implement key features for you.
The effects are useful in one setting: hobbyist and toy usage, where you really don't have that many constraints and can play with whatever cool things are around. That said, I'm sure you'd actually get a lot more mileage out of a library of user-made script nodes, rather than whatever the browsers have built for you.
If you're trying to build something production-ready, or port an existing system to the web, most of the fun toys seem like just that: toys.
AudioWorklets don't look like they would improve things for me, but that's a topic for another blog post.
And obviously not having raw script access isn't a good thing. Nonetheless, the other nodes mostly work as advertised, in my limited experience so far, so the stuff that you'd expect to be able to do with them (e.g. FM/AM synthesis) seems to work pretty well.
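For instance, a minimal two-operator FM sketch of my own (not from any particular library) does behave as you'd hope:

// Simple FM: modulator -> gain (modulation depth) -> carrier.frequency
const ctx = new AudioContext();
const carrier = ctx.createOscillator();
carrier.frequency.value = 220;

const modulator = ctx.createOscillator();
modulator.frequency.value = 110;

const modDepth = ctx.createGain();
modDepth.gain.value = 100;            // +/- 100 Hz of frequency deviation

modulator.connect(modDepth);
modDepth.connect(carrier.frequency);  // AudioParams accept node connections
carrier.connect(ctx.destination);

modulator.start();
carrier.start();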
> AudioWorklets don't look like they would improve things for me, but that's a topic for another blog post.
AFAIK worklets are supposed to be script processor nodes that work performantly. They wouldn't solve the sample rate problems mentioned in TFA but apart from that I'd think they should be pretty usable if they someday work as advertised.
Then if you look into it:
dictionary DynamicsCompressorOptions : AudioNodeOptions {
  float attack = 0.003;
  float knee = 30;
  float ratio = 12;
  float release = 0.25;
  float threshold = -24;
};
Which are indeed the basics that you need and totally enough for most use cases. Check out a vintage compressor that has a dozen implementations as VST plugins:
http://media.uaudio.com/assetlibrary/t/e/teletronix_la2a_car...
However, I can take your "simple" compressor and swap it out of my audio chain for a more complex one if I need to.
I can't do that for the Web Audio API. That's really what everybody is complaining about.
The problem is that if each piece only covers 95% of my use case and I'm using 10 pieces, I'm practically guaranteed to hit a mismatch somewhere--and I can't escape.
https://developer.mozilla.org/en-US/docs/Web/API/DynamicsCom...
It's a conspiracy theory, I know. Reality is probably far more boring and depressing. :/
Like the blog poster, I cut my teeth on the Mozilla API, and I was able to get passable sound out of an OPL3 emulator in a week's time. Perhaps Mozilla could convince other browser vendors to adopt their API in addition to the Web Audio API?
1. Firefox always clicks when starting and stopping each tone. I think that's due to a longstanding Firefox bug and not the Web Audio API. I could mostly eliminate the clicks by ramping the gain (see the sketch below), but the threshold was different for each computer.
2. This was the deal-breaker. Every mobile device I tested had such terrible timing in JavaScript (off by tens of milliseconds) that it was impossible to produce reasonably correct-sounding Morse code faster than about 5-8 WPM.
I found these implementation problems more frustrating than the API itself. At this point I'm pretty sure the only way to reliably generate Morse code is to record and play audio samples of each character, which wastes bandwidth and can be done more easily without using the Web Audio API at all.
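In case it's useful to anyone, the gain-ramp workaround I mentioned for point 1 looked roughly like this (values are illustrative; the usable ramp time varied per machine):

// Fade each Morse element in and out instead of hard-keying the oscillator,
// to avoid the click caused by a waveform discontinuity.
// gainNode sits between a continuously running OscillatorNode and the destination.
function keyElement(gainNode, startTime, duration) {
  const ramp = 0.005; // 5 ms; the tolerable value differed per computer
  const g = gainNode.gain;
  g.setValueAtTime(0, startTime);
  g.linearRampToValueAtTime(1, startTime + ramp);
  g.setValueAtTime(1, startTime + duration - ramp);
  g.linearRampToValueAtTime(0, startTime + duration);
}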
You sure it is not due to the sound files you are using not having a normalized start?
That said, this current pull request on emscripten is a fantastic step forward and I'm very excited to see its completion: https://github.com/kripken/emscripten/pull/5367
To do it properly would require just giving up on WebAudio's features completely and doing all the mixing in software via WebAssembly. Honestly though, if you're going to do that, you may as well just compile OpenAL-Soft with emscripten and use that, so I opted to just try to get the best out of WebAudio that I could. Hopefully it's good enough.
I put some weekends into trying to build a higher-level abstraction framework of sorts for my own sound art projects on top of Web Audio, and it was full of headaches for similar reasons to those mentioned.
The thing that I put the most work into is mentioned here, the lack of proper native support for tightly (but prospectively dynamically) scripted events, with sample accuracy to prevent glitching.
Through digging and prior work I came to a de facto standard solution using two layers of timers: one in Web Audio (which supports sample accuracy but gives you no hook to e.g. cancel or reschedule events), and one using coarse but flexible JS timers. Fugly, but it worked. But why is this necessary...!?
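For anyone curious, the two-layer pattern looks roughly like this (a bare sketch of the usual lookahead scheduler; `scheduleEvent` is just an illustrative stand-in for whatever you actually want to fire):

// Coarse JS timer wakes up regularly; everything inside the lookahead window
// gets scheduled on the sample-accurate Web Audio clock.
const ctx = new AudioContext();
const LOOKAHEAD = 0.1;      // seconds of audio to schedule ahead
const TICK_MS = 25;         // how often the JS-side timer fires

let nextEventTime = ctx.currentTime;

function scheduleEvent(when) {
  // Illustrative payload: a short blip at an exact AudioContext time.
  const osc = ctx.createOscillator();
  osc.connect(ctx.destination);
  osc.start(when);
  osc.stop(when + 0.05);
}

setInterval(() => {
  while (nextEventTime < ctx.currentTime + LOOKAHEAD) {
    scheduleEvent(nextEventTime);
    nextEventTime += 0.5;   // e.g. one event every 500 ms
  }
}, TICK_MS);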
There's a ton of potential here, and someone like myself looking to implement interactive "art" or play spaces is desperate for a robust cross-platform web solution, it'd truly be a game-changer...
...so far Web Audio isn't there. :/
Other areas I wrestled with:
• buffer management, especially with CORS issues and having to write my own stream support (preloading then freeing buffers in series, to get seamless playback of large resources...)
• lack of direction on memory management, particularly what the application is obligated to do to release resources and prevent memory leaks
• the "disposable buffer" model makes perfect sense from an implementation view but could have easily been made a non-issue for clients. This isn't GL; do us some solids yo.
Will keep watching, and likely, wrestling...
One thing that really irks me at the moment is the huge variation in sound volume of the increasing plethora of videos in my social media feed. If there was some way we could use a real time WebAudio manipulation on the browser to equalise the volume on all these home made videos, so much the better. Not just volume up/down, but things like real time audio compression to make vocals stand out a little.
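Something like this minimal sketch would be the starting point, assuming you can get at the video element at all (createMediaElementSource on a cross-origin video without CORS headers outputs silence, which is the practical catch; compressor settings here are just illustrative):

// Run a page's <video> through a compressor to tame loudness differences.
const ctx = new AudioContext();
const video = document.querySelector('video');
const source = ctx.createMediaElementSource(video);

const comp = ctx.createDynamicsCompressor();
comp.threshold.value = -30;  // start compressing fairly early
comp.ratio.value = 6;
comp.knee.value = 20;

source.connect(comp);
comp.connect(ctx.destination);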
Add delay and reverb to talk tracks etc. for podcasts.
EQ filters to reduce white noise on outdoor videos etc. would also help. People with hearing difficulties in particular ranges, or who suffer from tinnitus etc., would be able to reduce certain frequencies via parametric equalisation.
It would be intriguing to see a podcast service or SoundCloud etc. offer real time audio manipulation, or let you add post processing mastering effects on your audio productions before releasing them in the wild.
For a while there was a huge footgun that made it easy to synchronously decode entire mp3 files on the ui thread by accident. Oops (:
Even better, for a while there was no straightforward way to pause playback of a buffer. It took a while for the spec people to come around on that one, because they insisted it wasn't necessary.
EDIT: I should also add that the teams behind the apis are quite responsive. You can make an impact in the direction of development simply by making your needs/desires known.
As someone mentioned elsewhere on this thread, Android suffered from a crappy Audio/MIDI library. iOS's CoreMIDI was great, but not transportable outside of iOS/OSX. Web MIDI (alongside Web Audio) seemed a great way to go - just build a cross platform interface using an Electron app and use the underlying API to fire off MIDI messages.
Unfortunately, at the time of developing the project, the Web MIDI SYSEX spec was still too fluid or not completely defined, so I had trouble sending/reading SYSEX messages via the API, and thus shelved the project for another day.
Oh, and we needed to use SYSEX a LOT in order to intercept clock timing messages, as well as complex data like preset names and multi parameter effect settings (EQ etc.). None of the messages sent/received affected music notes at all - it was all setting configuration only.
Not really; the full range of human hearing is over 120 dB. Getting to 120 dB within 16 bits requires tricks like noise shaping. Otherwise, simple rounding at 16 bits gives about 80 dB and horrible-sounding artifacts around quiet parts.
It's even more complicated in audio production, where 16 bits just doesn't provide enough room for post-production editing.
This is why the API is floating-point. Things like noise shaping need to be encapsulated within the API, or handled at the DAC if it's a high-quality one. (Edit) There's nothing wrong with consumer-grade DACs that are limited to about 80-90 dB of dynamic range; but the API shouldn't force that limitation on the entire world.
The floats are only required if you have a complex audio graph -- with a sample-based API, you can totally do the production in floats, and then have a final mix pass which does the render to an Int16Array. All in JavaScript.
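e.g., the final render pass is nothing more exotic than this (a sketch; `mixBus` is assumed to be your float mix buffer):

// Final mix pass: clamp the float mix bus and render it to 16-bit PCM.
function renderToInt16(mixBus /* Float32Array in [-1, 1] */) {
  const out = new Int16Array(mixBus.length);
  for (let i = 0; i < mixBus.length; i++) {
    const s = Math.max(-1, Math.min(1, mixBus[i]));
    out[i] = Math.round(s * 32767);
  }
  return out;
}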
Round(Sample * 32767) is really that slow?
If you're doing integer DSP, you still need to deal with 16 -> 24, or 24 -> 16 overhead; and then the DAC still is converting to its own internal resolution. (Granted, 16 <-> 24 can be simple bit shifting if aliasing is acceptable.)
I think the whole point is that JavaScript used to be slow, and using the CPU as a DSP to process samples prevents acceleration. Seems to me what is needed is something like "audio shaders", equivalent to compute/pixel shaders, that you farm off to an OpenAL-like API which can be compiled to run on native HW.
Even if you grant emscripten produces reasonable code, it's still bloated, and less efficient on mobile devices than leveraging OS level DSP capability.
As a side note, for some common audio DSP tasks, you could presumably take better advantage of highly parallel processing by doing a Fourier transform and working in the spectral domain. There has been research on doing this on GPUs and it works. However, if you do this you'll have high latency, and it's not a hardware problem, it's inherent to the FFT algorithm, so it's kind of a dead end for many applications.
I hadn't heard that, but some of the "processor node" stuff does sound familiar.
What OS X also has, though, is proper low-level low-latency sound APIs. And that's why there are so many Mac (and iOS) music apps.
const audioContext = new AudioContext();
const osc = audioContext.createOscillator();
osc.frequency.value = 440;
osc.connect(audioContext.destination);
osc.start();
"BufferSourceNode" is intended to play back samples like a sampler would. The method the author proposes of creating buffers one after the other is a bizarre solution.Please use your imagination and try to imagine one of infinitely many other streams that I could make at runtime that are not easily made with the built-in toy oscillators.
Somebody already did. Check out Fourier Theory. The oscillators (well actually just sin, the rest will give you some help as well) can be used to make any stream, technically.
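To be fair, a handful of sines does get you surprisingly far. A rough additive sketch (harmonic count and levels are just illustrative) that approximates a sawtooth from its partials:

// Additive synthesis: approximate a 110 Hz sawtooth from its first
// 16 harmonics (amplitude 1/n each), using only OscillatorNodes.
const ctx = new AudioContext();
const master = ctx.createGain();
master.gain.value = 0.2;
master.connect(ctx.destination);

for (let n = 1; n <= 16; n++) {
  const osc = ctx.createOscillator();
  osc.frequency.value = 110 * n;
  const g = ctx.createGain();
  g.gain.value = 1 / n;     // sawtooth partial amplitudes fall off as 1/n
  osc.connect(g);
  g.connect(master);
  osc.start();
}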
It also misses the point because, as with Vulkan, you just want a stable, sane, fast low-level API to access the hardware; OpenGL immediate mode doesn't get you beyond kindergarten in today's computer graphics. In audio, that is a sample-level API. Everything else should be handled by the application!
You can still make a source/sink directed graph system with components like "oscillators". In a fricking library!
I used the same solution when I tried to perform realtime audio streaming from a daemon on an embedded device to a browser (which is probably an even more realistic use case for a browser audio API than generating sine waves). I basically stumbled over the same issues as the author: a deprecated ScriptProcessorNode and high-level APIs which don't help me (like the oscillator one).
In the end I opted for a very similar solution as the author: whenever I got enough samples through the websocket (I encoded them simply as raw 16-bit samples there) I created a BufferSource, copied all samples into it (with conversion to floating point), and enqueued the buffer for playback at the position where the last buffer finished.
I really didn't expect that to work well, due to all the overhead of creating and copying buffers and due to the uncertainty of whether the browser would switch between two buffers without missing samples. But surprisingly it worked and did the job. I included 200ms of buffering, which means I only started playback after 200ms, to be able to receive more data in the background and have a little more time to append further buffers. I experimented a little with that number but can't remember how low the limit was before getting dropouts regularly. It definitely wasn't usable for low-latency playback.
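In code, the scheme was roughly this (reconstructed from memory, so treat the URL, buffering amount, and mono/sample-rate assumptions as illustrative):

// Receive raw 16-bit PCM over a websocket, convert to float, and queue each
// chunk to start exactly where the previous one ends.
const ctx = new AudioContext();
const BUFFER_AHEAD = 0.2;            // ~200 ms of pre-buffering
let nextStartTime = 0;

const ws = new WebSocket('wss://example.invalid/pcm'); // illustrative URL
ws.binaryType = 'arraybuffer';

ws.onmessage = (event) => {
  const int16 = new Int16Array(event.data);
  // Assumes mono samples at the context's sample rate.
  const buffer = ctx.createBuffer(1, int16.length, ctx.sampleRate);
  const ch = buffer.getChannelData(0);
  for (let i = 0; i < int16.length; i++) {
    ch[i] = int16[i] / 32768;        // convert to [-1, 1) float
  }

  const src = ctx.createBufferSource();
  src.buffer = buffer;
  src.connect(ctx.destination);

  if (nextStartTime < ctx.currentTime) {
    nextStartTime = ctx.currentTime + BUFFER_AHEAD; // (re)prime the queue
  }
  src.start(nextStartTime);
  nextStartTime += buffer.duration;
};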
So the most important question is: why isn't this interface implemented in any browser yet?
That a BufferSourceNode cannot be abused to generate precision oscillators isn't very enlightening.
Partially because in addition to the interface itself it relies on a bunch of generic worklet machinery which also doesn't exist in any browser and is not trivial to implement in non-sucky ways.
But also partially because the spec has kept mutating, so no one wants to spend time implementing until there's some indication that maybe that will stop.
I think there were some false starts where previous specs were written and then found to have issues.
So basically, Web Audio is unusable in release Chrome on a measurable subset of user machines, for multiple releases (until the fix makes it out), all because of AudioWorklet. Which isn't available yet.
I am being a little unfair here, because this bug isn't really the fault of any of the people doing the AudioWorklet work. But it sucks, and the blame for this horrible situation lies largely with the people who designed WebAudio initially. :(
To be frank, the graphics world has had a quasi-standard (OpenGL) for a long time, alongside DirectX, so WebGL had a good example to follow. In the audio world, however, we haven't seen a cross-platform quasi-standard spec covering Mac, Linux and Windows. So IMHO non-web audio also lacks common standards for mixing, sound engineering, and music-making. That's why web audio appears to lack a use case. IMHO, that smells like opportunity.
I use Web Audio, in canvas-WebGL based games where music making is needed. I understand the issues - we definitely need more than "play" functionality.
If you provide a low-level "play" API, others can build stuff on top because it's just numbers. Sure, sometimes there's "expensive numbers" like MP3 decoders, FFTs, etc., but these can be added as needed.
I think the bigger issue is that non-experts sometimes get tasked with adding support for things.
The "audio device API that leaves the sample rate completely unspecified" example is, believe it or not, one I've seen before elsewhere. And yet, if you know the first thing about PCM samples, you know this is a mind-numbingly stupid mistake to make. Yet it's a mistake that a few people have made into shipping products, because they can't or won't reason about audio, and this did not stop them from being in charge of an audio API.
I'd rather have a comprehensive API that someone can dumb down than one that's so crippled as to be unusable beyond very basic functionality.
OpenAL: https://www.openal.org/
The actual audio equivalent to OpenGL is OpenSL [0], which I don't think picked up any support from anybody.
If we had gone with the Audio Data API, it wouldn't have been satisfying, because the web platform's compute engine simply could not meet the requirement of reliably delivering audio samples on schedule. Fortunately, that is in the process of changing.
Given these constraints, the complexity of building a signal processing graph (with the signal path happening entirely in native code) is justified, if those signal processing units are actually useful. I don't think we've seen the evidence for that.
I'd personally be happy with a much simpler approach based on running wasm in a real-time thread, and removing (or at least deprecating) the in-built behavior. It's very hard to specify the behavior of something like DynamicsCompressorNode precisely enough that people can count on consistent behavior across browsers. To me, that's a sign perhaps it shouldn't be in the spec.
Disclaimer: I've worked on some of this stuff, and have been playing with a port of my DX7 emulator to emscripten. Opinions are my own and not that of my employer.
1. I'm not convinced this is the case. From what I see, GC pauses constitute the big blockers, rather than event processing and repaints. Introducing an API that's friendlier to GC would be a huge win here.
2. We have WebWorkers. What would have prevented a WebWorker from calling new global.Audio() for the Audio Data API?
2. Some form of WebWorker is obviously where we're going. But does postMessage() have the potential to cause delay in the worker that receives it? (There are ways to solve this but it requires some pretty heavy engineering)
You just can't do that with the same tightness of rhythm on low-end hardware with web tech today. Flash was bad, yet Flash also opened up insane possibilities on the web for multimedia applications that just can't be matched with web tech. asm.js might fill the gap, but I haven't seen an equivalent yet.
Without using ScriptProcessorNode, there was no way of tuning the synthesizer, because of the limitation that any loop in the audio graph has a delay of at least 128 samples.
Maybe a more "compilation-oriented" handling of the audio graphs (at the user's choice) could help overcome this?
Video AND audio? They got you all covered with nice APIs!
Just audio? You're screwed!
I cannot think of one.
In Chrome's implementation, none of the mixing, DSP, etc. go through the hardware, and I'm more than certain that's the case for every other browser out there.
But my question was more like: is Web Audio a mess mostly because it's an attempt to expose the features of the twenty-odd different OS audio backends on Windows/Mac/Linux, where the odd inclusions and exclusions map to the things that all the OS audio backends happen to share that Chrome can then expose?
So if you've been wanting to try some intervention to make web standards less poor, or just want to observe how they end up the way they do, here's an opportunity.
This is not a wart, this is a security feature. Of course, it wouldn't be a necessary limitation if the web wasn't so complicated, but the web is complicated.
For a more generic guide I've heard a lot of good things about a free (in electronic form) book called DSPGuide (http://dspguide.com/). Haven't had a chance to dive into this one, though.
Not to be semantic, but that's technically incorrect. Indeed, if WebGL were to be supplanted by a lower-level graphics API, that would make a lot of people happy.[0]
As far as the author's thesis concerning the Web Audio API: I agree that it's a total piece of shit.
I've come to suspect that my phone's autocorrect functionality, HN's two-hour edit window, and my own brain routinely conspire against me to paint a picture of total idiocy.
I've said it before, I'll say it again: it exists in a vacuum, and is run by people who have never done any significant work on the web, with titles like "Senior Specifications Specialist". Huge chunks of their work are purely theoretical (note: not academic, just theoretical) and have no bearing on the real world.
Indeed. Because people who do browser internals and never do any web development are a good fit to create APIs for web developers
Why not? I linked a test app [0] in my post that generates PCM data on demand, and fast. It works deterministically on all the browsers. Mozilla certainly implemented the Audio Data API back in 2011 and it was fast enough for them.
> Hell, I wrote a tron game and did the audio using audio node chains and managed to get something really close to the real tron cycle sounds, without resorting to sample level tweaking
Why couldn't this be a high-level userspace library like three.js? Yes, with a lot of creative energy you can recreate a lot of sounds, I'm willing to believe that. But I think a low-level API would have been more useful from the get-go.
Who at Apple beat him with a stick to get audio right? Can we get that person to design the audio API's for the Web and Android?
(Edit: I realized that this was an unfair comment born of my frustration with Audio APIs from Google.
The real issue driving this is that audio is still a dumpster fire on Android. So, if he gives web developers access to audio samples, everybody is going to expect it to work. And, on Android, it will fail miserably. So, better to isolate audio functions, give them "fuzzy" latency which you can bury in C code drivers, and hide the fact that audio on Android is a flaming pile of poo rather than piss off even more developers and get even more bugs filed against Android's shitty audio.)
That pre-browser era where we would have sounds for everything. Minimize window, user logged in, logged out, all that crap.
Also, the API has good support for visuals, e.g. spectrum analysis. This makes it pretty good for a beginners' course on sound processing.
I wouldn't use it for anything serious like a DAW.