web-audio-api: The Browser's Audio Engine, Now Running in Node

The Gray Cat

The Web Audio API is one of the great success stories of the browser platform: a graph-based audio engine with oscillators, filters, convolution reverb, and sample-accurate scheduling, all available through window.AudioContext. The catch has always been that you needed a browser. If you wanted to synthesize a notification sound on a server, render audio in a CI pipeline, or build a command-line synth, you were stuck reaching for native bindings or rolling your own DSP.

web-audio-api removes that catch. It is a portable, pure-JavaScript implementation of the W3C Web Audio API, maintained by the audiojs organization, that runs on Node.js and Bun. To be clear: this is not the browser's built-in API, and it is not something you import into a web page. It is the same standard API surface (AudioContext, OscillatorNode, GainNode, and friends) reimplemented for environments that have no browser at all. You write Web Audio code once and run it in the terminal, on a server, or in a test runner, where it is meant to behave as the spec says it should in Chrome.

The current 1.x line is effectively a 2026 relaunch of a much older, long-stale package. The rewrite is near-complete and spec-conformant, and the headline claim is a strong one: it passes 100% of the official Web Platform Tests for Web Audio. There is no node-gyp step, no prebuilt binaries to fail on install, and no native toolchain to babysit.

Why Run Web Audio Outside the Browser

The appeal becomes obvious once you see the situations it unlocks.

  • Audio in CI. OfflineAudioContext renders deterministically with no speakers and no sound card, so you can unit-test audio code on a headless build agent.
  • Server-side sound. Generate audio from an API, a chat bot, or a background job: notification tones, procedural music, text-to-speech post-processing.
  • CLI tooling. Synthesize, process, and pipe raw PCM straight from the terminal into tools like aplay or sox.
  • Porting browser libraries. A polyfill mode makes existing browser-targeted libraries, most notably Tone.js, run in Node unchanged.
  • Zero native dependencies. Everything is pure JavaScript from the audiojs ecosystem, so installs are fast and reliable across platforms.

Because the implementation tracks the spec faithfully, almost everything you already know from MDN applies directly. The contexts, the source and processing nodes, AudioParam automation, FFT analysis through AnalyserNode, and decodeAudioData all behave the way the standard says they should.

Installation

The core package is all you need to get started:

npm install web-audio-api
# or, with Yarn:
yarn add web-audio-api

Microphone capture in Node is the one feature that requires an extra optional peer dependency:

npm install audio-mic

Without audio-mic installed, the library still works fully for synthesis, decoding, effects, and offline rendering. The mic dependency only matters when you want live input through getUserMedia.

Making Sound From Scratch

The "hello world" of Web Audio is a 440 Hz sine wave, and it looks identical here to the browser version, with one server-friendly addition: you actually get sound out of your system speakers, no extra native setup required.

import { AudioContext } from 'web-audio-api'

const ctx = new AudioContext()
await ctx.resume()

const osc = ctx.createOscillator()
osc.frequency.value = 440
osc.connect(ctx.destination)
osc.start()
// A440 plays through your speakers

One spec detail worth internalizing: contexts start in a suspended state. In a browser this exists because audio cannot begin without a user gesture, and the library honors the same lifecycle. That is why the await ctx.resume() call comes before realtime playback. When you are done, close the context with await ctx.close(). On a runtime that supports TC39 explicit resource management, you can instead declare the context as using ctx = new AudioContext() and have it cleaned up automatically when the enclosing scope exits.

Rendering Audio With No Hardware At All

For tests and pipelines, realtime playback is the wrong tool. OfflineAudioContext renders the entire graph as fast as the machine allows and hands you a buffer of samples, with no audio device involved whatsoever.

import { OfflineAudioContext } from 'web-audio-api'

const ctx = new OfflineAudioContext(2, 44100, 44100) // 1 second, stereo

const osc = ctx.createOscillator()
osc.frequency.value = 440
osc.connect(ctx.destination)
osc.start()

const buffer = await ctx.startRendering()
const samples = buffer.getChannelData(0) // Float32Array of 44100 samples

Notice there is no resume() call here: offline contexts do not need one, because nothing is playing in real time. The output is fully deterministic, which is exactly what you want when asserting that a synthesis routine produced the right waveform. Run it on a build agent with no sound card and it behaves identically to your laptop. This is the single most compelling reason to reach for the library: audio you can actually test.
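
Because the rendered samples are just a Float32Array, a test can assert on them with a few lines of plain JavaScript. The sketch below is one possible approach and not part of the library: estimateFrequency and peakAmplitude are hypothetical helper names, the sine is synthesized by hand as a stand-in for buffer.getChannelData(0), and zero-crossing counting is only reliable for clean, single-pitch signals.

```javascript
// Hypothetical test helpers (not part of web-audio-api).

// Estimate the pitch of a clean periodic signal by counting upward zero crossings.
function estimateFrequency(samples, sampleRate) {
  let crossings = 0
  for (let i = 1; i < samples.length; i++) {
    if (samples[i - 1] < 0 && samples[i] >= 0) crossings++
  }
  return crossings / (samples.length / sampleRate)
}

// Largest absolute sample value, useful for checking levels and clipping.
function peakAmplitude(samples) {
  let peak = 0
  for (const s of samples) peak = Math.max(peak, Math.abs(s))
  return peak
}

// Stand-in for buffer.getChannelData(0): one second of a 440 Hz sine at 44.1 kHz.
const sampleRate = 44100
const samples = Float32Array.from({ length: sampleRate }, (_, i) =>
  Math.sin((2 * Math.PI * 440 * i) / sampleRate)
)

console.log(estimateFrequency(samples, sampleRate)) // roughly 440, within about 1 Hz
console.log(peakAmplitude(samples)) // very close to 1
```

In a real test you would pass the channel data from startRendering() instead of the hand-built array and compare against a tolerance rather than an exact value.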

Decoding Existing Audio Files

Working with real audio is just as direct. Read a file off disk, hand the bytes to decodeAudioData, and you get back an AudioBuffer ready to feed into the graph.

import { readFileSync } from 'node:fs'
import { AudioContext } from 'web-audio-api'

const ctx = new AudioContext()
const buffer = await ctx.decodeAudioData(readFileSync('track.mp3'))
// WAV, MP3, FLAC, OGG, and AAC are all supported

Decoding is backed by the audio-decode package and handles WAV, MP3, FLAC, OGG, and AAC. From there a classic processing pipeline (decode, run through a BiquadFilterNode for EQ, add a DynamicsCompressorNode, then render to a buffer) is just ordinary Web Audio node wiring.
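
As a rough illustration of that pipeline, the sketch below wires the nodes using only standard Web Audio factory methods. The function name renderProcessed, the filter type, and the compressor settings are illustrative assumptions, not anything the library prescribes. The context constructor is passed in as a parameter, so the same function should work with this package's OfflineAudioContext or a browser's.

```javascript
// Hypothetical helper: decode elsewhere, then EQ, compress, and render offline.
// `OfflineCtx` is any spec-compliant OfflineAudioContext constructor;
// `input` is an AudioBuffer such as the one returned by decodeAudioData.
function renderProcessed(OfflineCtx, input) {
  const ctx = new OfflineCtx(input.numberOfChannels, input.length, input.sampleRate)

  const source = ctx.createBufferSource()
  source.buffer = input

  // Illustrative EQ: remove rumble below 80 Hz.
  const eq = ctx.createBiquadFilter()
  eq.type = 'highpass'
  eq.frequency.value = 80

  // Illustrative compressor settings.
  const compressor = ctx.createDynamicsCompressor()
  compressor.threshold.value = -18
  compressor.ratio.value = 4

  // connect() returns the node it was given, so the chain reads left to right.
  source.connect(eq).connect(compressor).connect(ctx.destination)
  source.start()

  return ctx.startRendering() // Promise resolving to the processed AudioBuffer
}
```

With the decode snippet above, a call might look like const out = await renderProcessed(OfflineAudioContext, buffer), with OfflineAudioContext imported from 'web-audio-api'.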

Running Browser Audio Libraries in Node

This is where the package earns its keep for anyone with existing audio code. It ships a polyfill that installs the Web Audio globals (AudioContext, GainNode, and the rest) onto the global scope, so code written for browsers simply finds them.

import 'web-audio-api/polyfill'
// AudioContext, GainNode, and friends are now global

The marquee use case is Tone.js, the popular browser framework for music and interactive audio. With the polyfill in place, it runs server-side untouched.

import 'web-audio-api/polyfill'
const Tone = await import('tone')

Tone.setContext(new AudioContext())
const synth = new Tone.Synth().toDestination()
synth.triggerAttackRelease('C4', '8n')

There is one gotcha that will bite you if you miss it: Tone.js must be loaded with a dynamic import(), as shown above. Static imports are hoisted to the top of the module and execute before the polyfill has a chance to install the globals, so Tone.js would initialize against a missing AudioContext. If you would rather keep clean static imports, use the loader flag instead:

node --import web-audio-api/polyfill app.js

With that flag the polyfill is installed before your module body runs, so a plain import * as Tone from 'tone' works as expected. The polyfill also wires up navigator.mediaDevices.getUserMedia({ audio: true }), backed by the optional audio-mic dependency, so browser microphone-capture code can run verbatim.

Beyond the Spec: Node-Native Extensions

Faithful spec conformance is the foundation, but the library adds a handful of Node-only extensions that make server and CLI work genuinely pleasant. These are deliberately not portable to browsers; they are the value-add for non-browser environments.

Piping Raw PCM to Any Stream

In a browser, audio goes to the speakers and nowhere else. On a server, you often want the raw samples elsewhere. The sinkId option lets you point a context's output at any writable stream.

import { AudioContext } from 'web-audio-api'

const ctx = new AudioContext({ sinkId: process.stdout })
// ... build your graph ...
// then: node synth.js | aplay -f cd

Suddenly your synth script is a Unix citizen, piping PCM into aplay, sox, or a file. The constructor also accepts numberOfChannels and bitDepth so you can dictate the exact output format.
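
For context on what arrives at the other end of that pipe: aplay -f cd expects 16-bit little-endian samples at 44.1 kHz with the left and right channels interleaved. The library produces its own output bytes, so you do not need to write this yourself. The sketch below only illustrates that byte layout, and floatToPcm16 is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: interleave planar float channels into 16-bit little-endian PCM.
function floatToPcm16(channels) {
  const frames = channels[0].length
  const out = Buffer.alloc(frames * channels.length * 2) // 2 bytes per sample
  let offset = 0
  for (let i = 0; i < frames; i++) {
    for (const channel of channels) {
      // Clamp to [-1, 1], then scale to the signed 16-bit range.
      const s = Math.max(-1, Math.min(1, channel[i]))
      const int = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767)
      out.writeInt16LE(int, offset)
      offset += 2
    }
  }
  return out
}

const left = Float32Array.of(0, 1, -1)
const right = Float32Array.of(0.5, -0.5, 0)
const pcm = floatToPcm16([left, right])
console.log(pcm.length) // 12 bytes: 3 frames x 2 channels x 2 bytes
```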

Defining AudioWorklets Without a File

The browser's AudioWorklet requires a separate worklet file loaded by URL, which is awkward on a server. This implementation adds addModule(fn), letting you register a processor with a plain callback. No second file, no URL juggling, just inline DSP. Worth knowing for the technically curious: worklets here run synchronously on the same thread rather than in an isolated audio thread, a deliberate simplification of the browser model that keeps the pure-JS implementation straightforward.

Feeding Live Microphone Input

For real input, CustomMediaStreamTrack extends the standard MediaStreamTrack with a public constructor and a pushData method, modeled after the existing CanvasCaptureMediaStreamTrack pattern. Combined with the optional audio-mic dependency, you can route a live mic into the graph.

import {
  AudioContext,
  MediaStreamAudioSourceNode,
  CustomMediaStreamTrack,
  MediaStream,
} from 'web-audio-api'
import mic from 'audio-mic'

const ctx = new AudioContext()
await ctx.resume()

const track = new CustomMediaStreamTrack({
  kind: 'audio',
  label: 'mic',
  settings: { channelCount: 1, sampleSize: 16, sampleRate: ctx.sampleRate },
})
const stream = new MediaStream([track])

const src = new MediaStreamAudioSourceNode(ctx, { mediaStream: stream })
src.connect(ctx.destination) // live monitor

const read = mic({ sampleRate: ctx.sampleRate, channels: 1, bitDepth: 16 })
const pump = () =>
  read((err, buf) => {
    if (err || !buf) return
    track.pushData(buf, { channels: 1, bitDepth: 16 })
    pump()
  })
pump()

The track accepts Float32Array, arrays of Float32Array, or interleaved 8/16/32-bit integer PCM, with conversion handled by pcm-convert. One platform note for macOS users: if the mic opens but stays silent, pass backend: 'process' to audio-mic so it uses sox or ffmpeg instead of the native CoreAudio binding.
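
To make the integer-PCM case concrete, this is roughly what converting one interleaved 16-bit chunk to floats involves. It is a sketch of the general technique, not the code that pcm-convert or the library actually runs, and pcm16ToFloat32 is a hypothetical name.

```javascript
// Hypothetical helper: split interleaved 16-bit little-endian PCM into float channels.
function pcm16ToFloat32(buf, channelCount) {
  const frames = Math.floor(buf.length / (2 * channelCount))
  const channels = Array.from({ length: channelCount }, () => new Float32Array(frames))
  for (let i = 0; i < frames; i++) {
    for (let c = 0; c < channelCount; c++) {
      const int = buf.readInt16LE((i * channelCount + c) * 2)
      channels[c][i] = int / 32768 // map [-32768, 32767] to [-1, 1)
    }
  }
  return channels
}

// Two stereo frames: (L=16384, R=-32768), then (L=0, R=32767).
const chunk = Buffer.alloc(8)
chunk.writeInt16LE(16384, 0)
chunk.writeInt16LE(-32768, 2)
chunk.writeInt16LE(0, 4)
chunk.writeInt16LE(32767, 6)

const [l, r] = pcm16ToFloat32(chunk, 2)
console.log(l[0], r[0]) // 0.5 -1
```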

What You Can Actually Build

The repository ships a generous pile of runnable examples that double as a tour of what the engine handles. There are test signals (tones, exponential sweeps, white/pink/brown noise, a DTMF dialer, a metronome, even a guitar tuner that reads pitch in cents from the mic), auditory illusions like the endlessly rising Shepard tone and Risset rhythm, and a whole synthesis menu: subtractive synths with filter sweeps and ADSR envelopes, additive synthesis, DX7-style FM, and Karplus-Strong plucked strings. The generative examples wander further afield into step sequencers, twelve-tone rows, Balinese gamelan, and modal jazz. Each example takes parameters on the command line, so node examples/tone.js f=440 works, as do note names such as A4 and C#3.
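
If you want the same note-name convenience in your own scripts, the mapping from a name to a frequency is short. The sketch below assumes equal temperament with A4 = 440 Hz; noteToFrequency is a hypothetical helper, and the repository's examples may parse their arguments differently.

```javascript
// Hypothetical parser (the examples' own argument handling may differ).
const SEMITONES = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 }

// Convert a note name such as "A4" or "C#3" to a frequency in Hz,
// using equal temperament with A4 = 440 Hz.
function noteToFrequency(name) {
  const match = /^([A-G])([#b]?)(-?\d+)$/.exec(name)
  if (!match) throw new Error(`Unrecognized note name: ${name}`)
  const [, letter, accidental, octave] = match
  const shift = accidental === '#' ? 1 : accidental === 'b' ? -1 : 0
  // MIDI numbering: C-1 is 0, so C4 is 60 and A4 is 69.
  const midi = (Number(octave) + 1) * 12 + SEMITONES[letter] + shift
  return 440 * 2 ** ((midi - 69) / 12)
}

console.log(noteToFrequency('A4')) // 440
console.log(noteToFrequency('C#3').toFixed(2)) // 138.59
```

The result can be assigned directly to osc.frequency.value in any of the oscillator snippets above.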

A Word on Performance and Trade-offs

Every benchmarked scenario renders faster than real time, which is what makes offline rendering and CI use practical. On simple graphs, pure JavaScript holds its own against the Rust napi bindings of competing projects. On heavier DSP such as convolution and dynamics compression, it currently runs roughly two to four times slower than native code; WASM kernels are planned to close that gap, with the DSP cleanly separated from the graph plumbing to make that swap possible.
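
"Faster than real time" has a simple definition: seconds of audio produced per second of wall-clock time, with anything above 1 qualifying. The helper below is a hypothetical sketch for measuring that on your own graphs, and the numbers in the example call are arbitrary, not benchmark results.

```javascript
// Hypothetical helper: seconds of audio produced per second of wall-clock time.
function realtimeFactor(frames, sampleRate, elapsedMs) {
  const audioSeconds = frames / sampleRate
  return audioSeconds / (elapsedMs / 1000)
}

// Arbitrary example: a 10-second render at 44.1 kHz that took 250 ms.
console.log(realtimeFactor(441000, 44100, 250)) // 40
```

To fill in elapsedMs, record performance.now() before and after awaiting startRendering() and pass the difference.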

The honest framing is this. If you need maximum throughput on heavy convolution and you do not mind a native build step, the Rust-based node-web-audio-api is the throughput champion. If you value a zero-native-deps install, full spec conformance, CI and serverless friendliness, and the ability to run Tone.js and other browser libraries unchanged, this library is the one to reach for. For the vast majority of server-side and tooling work, faster-than-realtime pure JS is more than enough.

Conclusion

web-audio-api takes one of the browser's best APIs and sets it free. The same AudioContext you know from the web now runs in Node and Bun, with no native dependencies, full Web Platform Tests conformance, and thoughtful Node-only extensions for streams, inline worklets, and live input. Whether you want to test audio in CI without a sound card, build a command-line synth, generate sound on a server, or finally run Tone.js outside the browser, it gives you a familiar, standards-faithful path. Write your audio graph once, and run it wherever JavaScript runs.