| -*- outline -*- |
| |
| * Pro Audio with GStreamer |
| |
| This file attempts to document the use of GStreamer for so-called "pro |
| audio"[0]. It addresses two audiences: programmers who are considering |
| GStreamer for their pro-audio app, and GStreamer developers who want to |
| know which parts of GStreamer pro-audio applications rely on. |
| |
| [0] I actually don't like this term, because it's elitist. Of course |
| other audio applications are not inferior, but they are different. |
| I'll stick with the term out of established practice. |
| |
| ** What GStreamer Offers the Pro Audio Developer |
| |
| Choosing GStreamer for your application gives you lots of things for |
| free. |
| |
| *** A high penetration into POSIX desktops |
| |
| GStreamer is included with GNOME, so you'll find it already installed on |
| an increasing number of desktops. This makes it easier for a user to |
| install your app. However, you still have to check for the individual |
| plugins that you depend on. |
| |
| *** An extremely flexible signal flow graph |
| |
| You have elements, connection points, different kinds of processing |
| functions, schedulers, etc. You can subclass just about everything, or |
| replace whole subsystems as you need to. |
| |
| All of this you would have to implement somehow. The downside is, of |
| course, that it's extremely flexible. The graph isn't run by clock-tick |
| -- the delays are carried out by the timekeeping element (if any), when |
| execution reaches it. It's cooperative, rather than dictator-style like |
| Jack. If all problems have been worked out, etc, it runs smoothly, but |
| one poorly coded element can stall the graph. |
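| |
| To make the cooperative model concrete, here is a toy sketch in plain |
| Python. It is not GStreamer API: the names source, gain, slow and run |
| are made up for illustration. Each element pulls from the one upstream |
| of it, so nothing moves until the sink asks, and a single element that |
| blocks holds up everything downstream of it. |

```python
import time


def source(n_buffers, frames):
    """Hypothetical source: yield n_buffers buffers of silence."""
    for _ in range(n_buffers):
        yield [0.0] * frames


def gain(upstream, factor):
    """Pull each buffer from upstream and scale its samples."""
    for buf in upstream:
        yield [s * factor for s in buf]


def slow(upstream, delay):
    """A badly behaved element: blocks for `delay` seconds per buffer."""
    for buf in upstream:
        time.sleep(delay)
        yield buf


def run(chain):
    """The sink: pull until upstream runs dry; return (count, seconds)."""
    start = time.monotonic()
    count = sum(1 for _ in chain)
    return count, time.monotonic() - start


# Same graph twice, once with a stalling element in the middle.
fast_count, fast_time = run(gain(source(5, 64), 0.5))
slow_count, slow_time = run(gain(slow(source(5, 64), 0.02), 0.5))
```

| Both runs deliver the same five buffers, but the second takes at least |
| as long as the sleeps in slow() add up to. No clock drives the graph; |
| the delay happens wherever execution happens to reach it. |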
| |
| Restricting graph operation to clock-ticks and using buses instead, like |
| SuperCollider 3, would introduce many simplifications to scheduling and |
| such, I would think. However, you'd still have to implement your |
| signal-flow infrastructure from scratch if you decided to go it alone. |
| |
| I might revise the above paragraph, though. I like GStreamer's level of |
| flexibility a bit too much :) |
| |
| *** A wide variety of existing plugins |
| |
| This includes inputs like ALSA, OSS, sndfile, etc, as well as their |
| corresponding sinks (outputs). Then there are the network transports, |
| and the sound servers (including Jack). You get LADSPA plugins for free. |
| There are some DSP elements, but admittedly not many -- this is an area |
| for future expansion. |
| |
| *** Generic plugin behavior |
| |
| Of course you still have to know some specifics about the plugins you |
| use (which properties they have, for example), but in general elements |
| of a "pipeline" (signal flow graph -- and no, it doesn't have to look |
| like a pipe) are replaceable. Your user can choose between ALSA or OSS |
| or even ESD (shudder), and it's simple to implement. |
| |
| *** Easy threads |
| |
| Adding threads to your signal flowgraph does take some thought, but |
| once you've decided how to set things up it's reasonably easy. |
| Unfortunately realtime threads aren't implemented yet, but that should |
| be an easy project, knock on wood. |
| |
| *** Other Stuff |
| |
| GStreamer is big these days. I wouldn't say bloated, but there are a lot |
| of subsystems relating to "media" that just aren't applicable to |
| processing float data. There's a whole system (called "caps") that deals |
| with negotiating common formats between elements, when all pro audio has |
| to deal with is sample-rate and the number of frames per buffer. There's |
| a typefinding and pipeline autoplugging subsystem. There are "tags", |
| like the ones in ID3. |
| |
| You might find uses for these things, and thankfully these uses blur the |
| lines between "pro" and "consumer" audio. To an extent, these features |
| complicate GStreamer programming. But mostly they stay out of your way |
| -- besides caps, they only bother you when you ask them to :-) |
| |
| ** Pro Audio for GStreamer Programmers |
| |
| Pro audio is a restricted, almost purely mathematical domain. There's |
| not that much to worry about. Each channel is separate from the rest |
| (never interleaved). All data is in float format, and native byte order. |
| The sample rate is typically the same in the whole system. Same with the |
| number of frames in a buffer. |
| |
| So it's simple, but it's different from "normal" audio processing (a |
| whole mess of variables to synchronise and convert between, interleaved |
| data, codecs, etc). It's sufficiently different that in the past we've |
| had discussions every 8 months or so about why things are implemented |
| in such-and-such a way, and why don't we change them, and so on. So |
| this part of the document is aimed at GStreamer developers, as a kind |
| of documentation for the whole float-caps space. |
| |
| *** The Format |
| |
| Pro audio deals with floats. I'm not really worried about doubles -- |
| although LADSPA carefully typedefs LADSPA_Data so that the sample type |
| could be changed, everything's in float. |
| |
| There are two variables to be concerned about. One is sample rate, which |
| is pretty obvious. The not-so-obvious one is buffer-frames, specifying |
| the number of frames that will come in a buffer. If a buffer has fewer |
| frames, that indicates EOS is coming on the next pull. This property is |
| an optimization to allow easy chaining of buffers in multi-pad elements, |
| as well as to prevent deadlocks in circular pipelines, and to comply |
| with systems like Jack that operate on clock ticks. |
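| |
| The buffer-frames rule can be sketched in a few lines of plain Python. |
| This is a toy model, not GStreamer code: fixed_size_buffers and consume |
| are hypothetical names, and a "buffer" is just a list of floats. Every |
| buffer carries exactly buffer_frames frames, except possibly the last |
| one, and a consumer treats a short buffer as notice that the stream |
| ends right after it. |

```python
def fixed_size_buffers(samples, buffer_frames):
    """Cut a stream into buffers of buffer_frames frames each.

    Only the final buffer may be shorter than buffer_frames.
    """
    for start in range(0, len(samples), buffer_frames):
        yield samples[start:start + buffer_frames]


def consume(buffers, buffer_frames):
    """Count frames; return (total_frames, saw_short_buffer).

    A short buffer means EOS is expected on the next pull, so any data
    arriving after one is treated as an error.
    """
    total = 0
    eos_expected = False
    for buf in buffers:
        if eos_expected:
            raise ValueError("data arrived after a short buffer")
        total += len(buf)
        if len(buf) < buffer_frames:
            eos_expected = True
    return total, eos_expected


stream = [float(i) for i in range(10)]
sizes = [len(b) for b in fixed_size_buffers(stream, 4)]
total, saw_short = consume(fixed_size_buffers(stream, 4), 4)
```

| With ten frames and buffer_frames of 4, the buffers come out as 4, 4 |
| and 2 frames. Because every other buffer has a known size, an element |
| with several pads can line its inputs up without any bookkeeping. |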
| |
| *** Channels |
| |
| One variable that does not exist in pro audio is the number of channels |
| in a stream. Streams are always mono. All DSP algorithms expect to |
| receive mono data. Multichannel processing is done via multiple inputs. |
| This is the complicated part of pro audio for GStreamer, because it |
| means lots of multi-pad elements and complicated pipelines, which are a |
| pain to code for (if you're not coding it in Scheme, of course ;). So |
| yes, it's kind of a pain, but it is a flexibility that's necessary. |
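| |
| As a rough illustration of "always mono", here is a sketch in plain |
| Python using the standard array module, whose 'f' typecode stores C |
| floats in native byte order. The function names deinterleave and |
| mono_gain are invented for this example and are not GStreamer elements. |
| An interleaved stereo signal is split into one mono stream per channel, |
| and the same mono routine is then run once per stream. |

```python
from array import array


def deinterleave(interleaved, channels):
    """Split interleaved frames into one mono float array per channel."""
    return [array('f', interleaved[c::channels]) for c in range(channels)]


def mono_gain(buf, factor):
    """A DSP routine that only ever sees a single mono stream."""
    return array('f', (s * factor for s in buf))


# Two stereo frames, laid out as L R L R.
interleaved = array('f', [0.5, -0.5, 0.25, -0.25])
left, right = deinterleave(interleaved, 2)

# Multichannel processing is the mono routine applied once per input.
outputs = [mono_gain(channel, 2.0) for channel in (left, right)]
```

| The DSP routine never needs to know how many channels exist. The cost |
| is that the graph needs one connection per channel, which is exactly |
| where the multi-pad elements come from. |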
| |
| *** Stability |
| |
| DSP routines written years back still work, because all you need to use |
| them is to link with -lm. GStreamer is a step towards DLL hell. And |
| audio developers are a funny bunch. Look at Paul Davis's Ardour CVS, |
| for instance. He has a local copy of every library ever coded, ever. No |
| joke. |
| |
| If our platform is to remain attractive to this group, we need to start |
| to stabilize the way GStreamer works. Of course the API and ABI change; |
| we're young. But outside of media-related work, the core is pretty |
| stable. When we move to change things after 0.8, the changes should be |
| well documented. |
| |
| That's all pretty normal, but there is one special consideration. DSP |
| involves lots of custom plugins, maintained outside the GStreamer tree. |
| So just because you grep the tree and don't find any use of some |
| function, it doesn't necessarily mean the feature/behaviour |
| is unused. This will be increasingly true for other GStreamer users in |
| the future, but it's true now for DSP. I'm talking about me now ;) |
| |
| OK, enough rambling. Hope this clarifies things a bit. |
| |
| Andy Wingo, 24 Jan 2004. |
| |