A personal journey with audio

(Warning: This part is largely autobiographical and of little technical interest; please feel free to skip to the next section.)

First steps

My first experiences with audio recording began when I was four years old and were conducted with one of those Fisher Price cassette players most people born in the eighties will readily remember. Later, I appropriated my parents' deck which I used to conduct long recording sessions of anything and nothing. In general, throughout my childhood, I used various recording devices to great effect, usually for clandestine purposes (I once spied on the neighbours extensively for several weeks, though they sadly had nothing to hide at the time). In my teens, I began playing guitar and various other instruments, and began doing poor man's multi-tracking with a few double-decks with interesting, if not altogether melodious, results.

Going digital

Then came my first computer with a sound interface and Windows 95, along with the discovery of such applications as "Sound Recorder" and CoolEdit, and what they could do. Granted, looking back fifteen years later, that was in fact not very much, but it did seem to open many new and exciting possibilities at the time. Even then, though, these programmes had a serious drawback for me: being blind, I found it somewhat difficult to manipulate the audio (moving in time, cutting and pasting, etc.) in an intuitive manner, unlike with the tape-based recorders I had been using before.

Then came my discovery of Linux, my eventual complete switch to that platform, and hence my return to an entirely command-line and text-oriented computer experience. After a rather frustrating time trying to get a generic sound card to work properly (remember OSS and isapnp?), I purchased a used SoundBlaster 16 (SB16) from a friend, and eventually upgraded to an SB Live! with much more satisfactory results.

Now that I finally had the ability to record and play audio in my text environment, the next step in my quest for audio freedom was to find text-based applications which would allow me to manipulate the data in interesting ways. The first programme I discovered was SoX, already quite mature then and still very much alive today. SoX allows one to process audio in all manner of ways and is capable of crude multi-tracking. However, it has an extremely involved syntax and, more to the point, it is mostly suitable for batch operations, having been designed back when real-time audio processing was more dream than reality. After extensive research into the multitude of audio applications and utilities available on Linux even then, I finally stumbled on a very promising project named Ecasound.
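
To give an idea of the kind of batch invocation involved, here is a minimal sketch of the sort of thing SoX can do from a shell prompt (the file names are hypothetical, and the options shown are only one combination among many):

    # Mix two takes into a single file, keep the first 30 seconds
    # and apply a one-second linear fade-out at the end.
    sox -m take1.wav take2.wav mix.wav trim 0 30 fade t 0 30 1

Everything happens in a single pass: once the command returns, the result is already on disk, which is precisely the batch-oriented model described above.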

I will not dwell on Ecasound's features, because they will be covered in much greater detail in the next section of this article, but suffice it to say that it could do nearly all that SoX could do, but with a much clearer command structure and the ability to work interactively. Despite its many advantages and the obvious flexibility and power it offered, however, it rapidly became obvious to me that Ecasound alone would not be sufficient to handle even simple audio tasks, such as mixing and processing more than two or three audio streams efficiently, and that some kind of front-end would be required. I therefore began making tentative plans for writing such a front-end.

In parallel, I had been seriously considering attending a sound engineering school, having been fascinated by audio production and recording before I even knew what they were called, but abandoned the idea after discovering that much of the training was done using either graphical computer programmes or hardware workstations with graphical interfaces. I ended up going back to college in an altogether different field and my plans for a shell-like front-end to Ecasound were dropped, though my interest in audio was never entirely forsaken.

Trying something new

After some five years had elapsed, full of experiences both joyful and sad, none of which are relevant to this article, my interest in audio engineering was revived while helping someone polish pieces they were working on and about to submit. That person kindly encouraged me to return to audio production, suggesting I consider getting a Mac and trying my hand at Logic Studio, which, after I discovered that Mac OS now came with an integrated screen-reader (called VoiceOver), is exactly what I did.

At that point, I had been using Linux exclusively for the past decade and I welcomed the opportunity to try something new. My first impressions were very favourable and, though I feel guilty to admit it, I did bask somewhat in the glow of ease surrounding such actions as adding peripherals and connecting to a wifi access point. (On the other hand, I must also confess that I still made a beeline for the terminal, like a drunkard for a gin-shop, and had little peace until I could type "mpg123 file.mp3" and hear music come forth.) The honeymoon was fairly short-lived, however, as, upon installing Apple Logic, I quickly discovered that much if not most of it was inaccessible to me using VoiceOver. This was a serious blow to my hopes of using Logic as a DAW (digital audio workstation), but I persisted, trying to learn key shortcuts which would allow me to perform the tasks I needed. I eventually discovered that Logic could be controlled externally via "control surfaces" and decided to purchase a Novation Zero SL in hopes of remedying my accessibility woes.

The Novation board turned out to be an utter annoyance to set up, both because of the complexity of the software I was attempting to control and because of the graphical interface on the board itself. After many frustrating tries, I did manage to control the volume and pan in the mixer section, but the whole set-up was liable to become unhinged at the slightest mispress of a button and was nearly impossible to restore without sighted help. Aggravated, I searched the internet for other blind Logic users and eventually found one. Upon contacting him and asking for advice, I received a very bleak answer indeed. He told me that Logic was just barely usable through his control surface (a very high-end console with motorised faders), and that even so, he needed to rely on external rack units because most of the internal plugins were inaccessible. He also asseverated that he had contacted Apple and several other DAW makers, offering his help and advice to make their software more accessible, with very mixed results. His final summation was this: he had invested several thousand dollars in hardware and software to set himself up as a competitive, independent engineer and producer, and was now seriously considering giving it up, mostly due to what he saw as a general lack of support from the industry. Needless to say, this gave me cause to think. At this point, a nagging suspicion crept into my mind that the only way to achieve what I wished was through open source software projects.

Breaking free again

With that thought in mind, I applied myself to dual-booting my Mac, which I fairly easily succeeded in doing, and returned to an almost entirely Debian-based life, leaving the Macness behind without much regret. The first thing to do, once this was achieved, was to find out what had been going on in the Linux Audio world since I had left it some six years previously. After some research, I made a startling discovery: someone else had decided to write an Ecasound front-end, and the resulting software was called Nama.

All about Nama

It turned out that a fellow named Joel Roth had had much the same experience with Ecasound. Initially attracted by its obvious power and simplicity, he quickly came to realise that translating real-life engineering scenarios (mixing, monitoring, doubling, etc.) into the syntax Ecasound expected was a surprisingly clumsy and often error-prone endeavour, rendering the enterprise frustrating and time-consuming (which we all know can be fatal to inspiration). The only difference was that, where I had envisioned the remedy as a shell-like system of macros and variables, he originally conceived of the solution as a setup generator covering common use-cases. However, as the use-cases multiplied (once you record, you want to apply effects, don't you?), the project, then simply named Ecmd, kept growing in complexity, much the same way a pearl does, until it looked less and less like a set-up generator and more and more like a crude DAW. When it eventually acquired a user interface, it was graphical, but Roth decided to try his hand at an alternative text interface, inspired both by the desire to learn about parsers and by the knowledge that there were blind Ecasound users, experiencing frustrations similar to those I had contended with several years before, who might benefit.

All this was before my time, however. Nama's graphical interface is not very much in use nowadays, and nobody could call the software "crude", nor yet a simple front-end. It is, for all intents and purposes, a DAW-like abstraction layer on top of Ecasound.

What is Ecasound?

(Skip to the next section if you already know.)

We cannot possibly speak of Nama without spending even more time on Ecasound, since it is responsible for all of the actual audio processing. Ecasound is a very mature (the copyright notice boasts 1997-2012) audio processing, multi-tracking, real-time capable engine written and maintained by Kai Vehmanen. Its set-ups, which can incorporate as many input, output and operator objects as the hardware will tolerate, are described by a flexible syntax which can be expressed on the command-line, loaded from a file, or controlled in real time through the interactive mode (IAM) or the Ecasound Control Interface (ECI). The latter offers an extensive and well-documented set of commands, and can be used from various programming languages, such as C, Perl, Python and Ruby, or via a simple network socket (netECI). This, as you can imagine, offers a wealth of ways in which Ecasound can be used to solve audio processing problems, from the simplest to the most complex. Ecasound comes with many decent built-in effect operators gained over the years (filters, delays, reverbs, etc.), but, more importantly, it supports LADSPA plugins, and has recently gained LV2 support as well. All in all, it is quite as capable as most other audio engines out there, and a great deal more flexible.
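
As an illustration, here is a minimal sketch of a two-chain Ecasound invocation from the shell (the file names and filter value are merely assumptions for the example):

    # Chain 1 plays guitar.wav; chain 2 plays vocals.wav through a
    # 4 kHz low-pass filter; both chains are mixed into mix.wav.
    ecasound -a:1 -i guitar.wav \
             -a:2 -i vocals.wav -efl:4000 \
             -a:1,2 -o mix.wav

Adding the -c option to the same command line drops you into the interactive mode (IAM) instead, where commands such as "start", "stop" and "status" can be issued while the engine is running.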

What can one do with Nama?

Upon loading Nama for the first time and starting with an empty project, a user accustomed to mainstream DAWs such as Logic or the excellent Ardour may well feel confused and bewildered by a simple prompt. As is and will always be the case, a command-driven interface necessarily implies a steeper learning curve than that presented by a graphical interface. Once that hurdle is passed, however, I believe it offers the opportunity for much greater efficiency: after all, most people working with DAW software on a regular basis advocate using the mouse as little as possible and focusing on keyboard shortcuts instead. One can think of Nama as being nothing but keyboard shortcuts and no mouse at all! Much effort has been and still is focused by Roth and the user-base on creating a syntax which is logical, easy to master, and as flexible as possible, with a strong emphasis on abbreviated commands. Those commands are all documented through an extensive online help system along with examples. Once one manages to deal with the paradigm shift of manipulating audio through textual commands, one will find much the same concepts as are present in any respectable DAW software. Here is an overview of such concepts and features, followed by a short session sketch:

  • Transport:
    • Ability to set transport to arbitrary position.
    • Ability to move backward and forward by an arbitrary amount of time.
    • Set and manage markers (marks) which can be referred to by index or arbitrary name.
    • Move to markers arbitrarily or sequentially.
    • JACK transport support, master and slave.
  • Tracks:
    • Add/remove tracks, giving them arbitrary names.
    • Record from specified source or import existing audio.
    • Full track versioning system with optional version comments.
    • Ability to perform operations on multiple tracks using a "for" style syntax.
    • Ability to create groups (bunches) of tracks.
    • Volume, pan, mute, solo, fade and temporary exclusion (off).
    • Track freezing (caching) for resource economy.
  • Effects:
    • Supports Ecasound's internal effects and effect presets as well as LADSPA and LV2 plugins.
    • Add, remove, insert, bypass and modify effects.
    • Ability to create reusable effect chains.
    • Ability to use inserts (hardware, JACK, or local) with mix (wetness) control.
  • Busses:
    • Ability to create arbitrary hierarchies of busses and sub-busses.
    • Auxiliary (send) busses.
    • Busses have all the features tracks have: mute/solo, effects, inserts, etc.
  • Recording:
    • Preview mode to adjust recording conditions.
    • Easy recording workflow: type "rerec" to record a new version of a track.
    • Punch-in/out style edits.
  • Mastering:
    • Mastering mode toggles creation of a buss of the same name.
    • Distinct compressor and spatialiser for low, mid, and high frequencies.
    • Preset parametric equaliser, or one can choose one's own.
    • Limiter.
    • Versioned mixdown.
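
To give a flavour of how those features translate into actual commands, here is a deliberately simplified, annotated session sketch. Apart from "rerec", quoted above, the command spellings are approximations from memory rather than gospel, and the prompt is stylised; Nama's online help is the authoritative reference:

    nama> add-track guitar    # create a new track called "guitar"
    nama> rec                 # arm the current track to record from its source
    nama> start               # roll the transport; "stop" ends the take
    nama> rerec               # record a fresh version of the same track
    nama> vol 80              # trim the current track's volume
    nama> mixdown             # bounce the whole project to a mixdown file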

Why use a text interface?

Still, the question may lurk in the reader's mind: "Why would anyone not afflicted with accessibility issues or hopeless geekiness care to use a text interface?" Well, in the end, the answer may be quite subjective, but here are a few arguments nevertheless.

A graphical interface consumes CPU power and memory which might otherwise be devoted to processing audio. Granted, with the specifications found on most modern systems that is hardly an issue, but people working with more modest resources may find the saving appealing. A text interface also makes it easy to control Nama remotely, whether from a netbook over the network or with a simple wireless keyboard, letting the engineer interact with the mix from any position in his listening environment. To my mind, however, these are only superficial advantages. The greatest advantage of a text interface could well be that it breaks the mould. Graphical interfaces must be designed, and their design forces the user into whatever workflow was envisioned, which may or may not correspond with his inclination or particular application. I have also heard a variety of engineers, first-hand, on forums and in print, complain that graphical interfaces and plugins have led to a certain shift away from traditional engineering values, with engineers, especially amateur or inexperienced ones, relying less and less on their ears and more and more on their displays. While a text interface might not really be a cure for this change of attitude, the challenge of using a different approach, one providing less distraction and requiring slightly more forethought, could very well encourage users to rethink the way in which they work and rely more on their senses and knowledge.
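
Incidentally, the remote-control scenario requires nothing exotic. A minimal sketch, assuming the studio machine runs an SSH server and has GNU screen installed (the host, user and session names are of course hypothetical):

    # From the netbook: log into the studio machine and attach to
    # (or create) a persistent terminal session running Nama.
    ssh engineer@studiobox
    screen -dR mixsession
    nama

Detaching from screen leaves the session, and the mix, running; reattaching from anywhere in the room picks it up exactly where it was left.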

Limitations

So, if Nama is so great, can one honestly say it is fit to use for a medium-scale production? Not quite. Though it has gained an amazing amount of functionality, even in the past year, some sticking points remain which would currently render it difficult to use for a small to medium commercial-grade production. To my mind, here they are in decreasing order of importance:

  • Nama currently does not have any automation mechanism, either manual or MIDI. This, to me, is by far the greatest obstacle to using Nama in third-party production, as such a feature would almost unfailingly be required at one stage or another.
  • Nama's support of "regions" is currently in its infancy. While it allows the creation of tracks from snippets of a mother track, there is no unified way in which these can be managed or coordinated. For small projects, there are several ways to work around this limitation, but it would become difficult to manage for even a modest commercial production, where splitting a single track into multiple contiguous regions and stitching a track together from multiple takes are common practices.
  • Nama currently does not have a time grid and its notion of time is restricted to seconds expressed as decimal numbers. Although this is not a capability strictly required for any kind of production, it means some extra work for the engineer and is such a common feature in most DAWs that many users might well feel unsettled to find it missing.
  • Although its help system is informative and extensive, there is currently no documentation outside the program itself, which, in the face of a steep learning curve, might well prove discouraging to potential users. (I hope to address this partly with some tutorials I plan to write soon.)

Apart from the issues above, Nama is also unable and, in my opinion, unsuited to perform fine edits, which one might use to correct slight timing errors in a recording. I believe this functionality might best be left to a separate piece of software (sadly unwritten at this time).

Setting this aside, Nama is perfectly suitable for home recording projects where one has full control over the material, as well as for smaller, relaxed commercial productions such as live shows, demos, and recordings of small bands.

Development, present and future

Many exciting developments have been, and still are, cooking in the Nama pot. Joel Roth has done a wonderful job of restructuring, documenting and clarifying the codebase in the last year, to the point where even a relative Perl idiot like myself can find his way around when needed. Some of the most exciting new features which should make their way into an upcoming release are:

  • Support for LV2 plugins. (Thanks to Jeremy Salwent's implementation for Ecasound.)
  • Git-based project tracking, which would allow, among other things:
    • Undo and redo capabilities.
    • Messages associated with project saves.
    • Ability to tag saves to identify project phases.
    • Branching, to allow exploration, comparison and export of alternate mixes and remixes.
    • A/B'ing of said alternate mixes.
    • Online collaboration.
  • Latency compensation, to account for such things as JACK inserts.
  • Rework of reusable effect chains to include inserts.
  • And much more under the hood.

As for longer term goals, things aren't quite as clear, but here are some of the ideas which have been floating around, some recently and some less so:

  • MIDI integration using the Midish text-based sequencer.
  • Support for external MIDI controllers.
  • Support for OSC.
  • Sample-accurate positioning.
  • Automation support. (Guess who proposed that one.)

Sampling and trying it

If you are curious as to what Nama can do, the first place to go to is Mr. Julien Claassen's music page, where you can find many excellent pieces recorded and produced with Nama.

Until a new release comes, the best way to try Nama is to pull the master branch from GitHub and join the mailing list for support. I look forward to seeing you there.

Conclusion

I have already successfully mixed a few small projects for people with very satisfactory results using Nama and I hope to continue doing so in the future (I am hoping to build a portfolio). While neither Nama nor I may be anywhere near ready for Abbey Road or Nashville Row just yet, this piece of software has definitely allowed me to perform audio production tasks which would otherwise have been needlessly complex and clumsy using available mainstream software, free or otherwise, and I believe it could benefit anybody craving a different audio engineering experience.

Thanks and acknowledgments

  • Thanks go to Joel Roth for his great work on Nama and for providing me with a short account of how Nama got started.
  • Thumbs up to Mr. Julien Claassen for being such a good Nama advocate through his many musical realisations and for being such a capital sparring partner on the Nama mailing list.
  • And, finally, great thanks to Kai Vehmanen for all his hard work on the Ecasound engine, without which Nama simply could not be.