So, a while back, I took a digital music programming class for a language called ChucK. OMG! I found out that I know NOTHING about music! (5th - 8th grade clarinet in band; I thought I knew a little - but I was wrong! ;)
Now comes an opportunity to code some sound effects in iOS - and I'm still struggling with the concepts of digital/electronic music. What's a “mixer”? Do I need one? What is an audio node, or an audio unit? What exactly are “reverb” and “delay”? Which pieces do I need, and how do I put them together to make an “echo” or change the pitch? Well, I still don't know exactly, but I can tell you enough to get some interesting sound effects going in your Swift 2.1 iOS app. Read on...
In ChucK, you “ChucK” stuff to the DAC to make sound come out. Really? It looks something like this:
Gain masterGainLeft => Pan2 panLeft => dac.left;
Gain masterGainRight => Pan2 panRight => dac.right;
Gain masterGainCenter => Pan2 panCenter => dac;
The => is the “ChucK” symbol. In this example, I'm taking variables of type Gain named “masterGain”-something and “ChucKing” them to Pan2 objects named “pan”-something, which are each then getting “ChucKed” to the DAC: its left channel, its right channel, or (for my “center”) the DAC as a whole.
If you want to know what this stuff means, see the footnotes at the bottom. For now, just know that they're all particular facets of an audio “sound chain”.
I was able to understand enough of the concepts to make music and complete the ChucK course. But now I needed to do it again in Swift…
As I tried to “gain” (sorry! ;) a better understanding of how the pieces fit together, this video helped:
AVAudio in Practice - WWDC 2014, Session 502
In short, my Swift “sound chain” needed something like this ChucK statement:
input => effect => DAC
To do this in Swift at a most basic level, I followed these steps:
1. Create an AVAudioEngine.
2. Create an AVAudioPlayerNode, and attach it to the audio engine.
3. Set up one or more effects. (Some example effects are: AVAudioUnitTimePitch, AVAudioUnitDelay, and AVAudioUnitReverb.) Set any values associated with the effects (such as pitch or “wet/dry mix”), then attach the effect(s) to the audio engine.
4. Connect the pieces of the sound chain together using the audio engine's connect function, starting with the AVAudioPlayerNode and ending with the AVAudioEngine.outputNode (representing the DAC in iOS).
5. Using the audio player node's scheduleFile function, set the chain up to play.
6. Start the engine.
7. Using the audio player node, play the sound.
Here's what that might look like in code for a pitch change:
/**
 Plays audio at specified pitch.
 - Parameter pitchVal: Pitch shift to apply, in cents (0.0 leaves the pitch unchanged).
 */
func playAudioAtPitch(pitchVal: Float) {
    // the audio engine is a global variable,
    // instantiated in viewDidLoad

    // my resetAudio function stops and resets
    // the engine before we set up our sound chain
    resetAudio()

    // set up audio player node, and attach it to the engine
    // (declared locally here; the engine keeps a reference once it's attached)
    let audioPlayerNode = AVAudioPlayerNode()
    audioEngine.attachNode(audioPlayerNode)

    // set up pitch effect node, and attach it to the engine
    let pitchEffect = AVAudioUnitTimePitch()
    pitchEffect.pitch = pitchVal
    audioEngine.attachNode(pitchEffect)

    // connect the nodes to each other through
    // the audio engine to make the sound chain:
    // AVAudioPlayerNode => AVAudioUnitTimePitch
    // AVAudioUnitTimePitch => AVAudioEngine's output node
    audioEngine.connect(audioPlayerNode, to: pitchEffect, format: nil)
    audioEngine.connect(pitchEffect, to: audioEngine.outputNode, format: nil)

    // schedule the recording to play
    // audioFile is a global AVAudioFile variable,
    // instantiated in viewDidLoad with a sound file
    audioPlayerNode.scheduleFile(audioFile, atTime: nil, completionHandler: nil)

    // start the engine; start() can throw, so handle the
    // failure rather than crashing with try!
    do {
        try audioEngine.start()
    } catch {
        print("Could not start audio engine: \(error)")
        return
    }

    // play the audio
    audioPlayerNode.play()
}
Easy peasy, eh? It plays the sound file with the pitch effect. Here's more detail on how this works...
You’ve got a sound file you want to play with effects. You set up an audio player node to be the “input”, representing your sound file. That input will be put through whatever effects you set up. Then, the whole thing will be put out through the DAC, er, audio engine’s output node. (BTW, the output node points to the default sound output for your device.)
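One detail worth spelling out: the pitch value on AVAudioUnitTimePitch is measured in cents (hundredths of a semitone), and Apple documents its range as -2400 to 2400, with 0 meaning “no change”. Here's a minimal sketch of a helper for thinking in semitones instead. The name centsFromSemitones is my own invention for illustration (it is not part of AVFoundation), and it's written in current Swift syntax rather than Swift 2.1:

```swift
/// Converts a pitch shift in semitones to cents, clamped to the
/// -2400...2400 range documented for AVAudioUnitTimePitch.pitch.
/// (Hypothetical helper for illustration; not an AVFoundation API.)
func centsFromSemitones(_ semitones: Float) -> Float {
    let cents = semitones * 100          // 1 semitone = 100 cents
    return min(max(cents, -2400), 2400)  // clamp to two octaves either way
}

let upOneOctave = centsFromSemitones(12)   // 12 semitones = 1200 cents
let tooFarDown = centsFromSemitones(-30)   // -3000 is out of range, so it clamps to -2400
```

You could then hand the result to a pitch function like the one above, so that 12 means “one octave up” and -12 means “one octave down”.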
I assume you get pitch. What about “reverb” and “delay”? What are those? You can see a nice GIF and description here that can give you a head start:
Reflection: Echo vs Reverberation
Basically, reverberation is like a much smaller echo. Reverberation is what happens in a room (or singing in your shower ;) while echo is what happens when you yell around a bunch of rock walls, such as in a canyon. This is where “delay” comes in. Take a reverb, add a delay - and you’ve got an echo.
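To make “delay” a bit more concrete, here's a toy sketch of an echo on a plain array of samples: each sample gets a quieter copy of whatever played a fixed number of samples earlier added back in. This is only my illustration of the idea, not how AVAudioUnitDelay is actually implemented, and addEcho and its parameter names are made up for the example (current Swift syntax):

```swift
/// Adds a simple feedback echo to a buffer of samples.
/// - delaySamples: how many samples later each repeat arrives.
/// - feedback: how much of each repeat carries into the next (0 up to 1).
/// (Hypothetical toy function; not an AVFoundation API.)
func addEcho(to input: [Float], delaySamples: Int, feedback: Float) -> [Float] {
    var output = input
    // Nothing to do if the delay is zero/negative or longer than the buffer.
    guard delaySamples > 0, delaySamples < output.count else { return output }
    for n in delaySamples..<output.count {
        // Mix in a scaled copy of what was heard delaySamples ago.
        output[n] += feedback * output[n - delaySamples]
    }
    return output
}

// A single "click" followed by silence...
let click: [Float] = [1, 0, 0, 0, 0, 0, 0]
// ...comes back every 2 samples at half the level of the previous repeat.
let echoed = addEcho(to: click, delaySamples: 2, feedback: 0.5)
// echoed is [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125]
```

With a long delay you hear distinct repeats (the canyon). Pack lots of very short, overlapping repeats together and it starts to smear into something more like the room or the shower.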
So, we can do this with the code above. We’ve got a pitch node, so let’s add delay and reverb nodes, using the same pattern we used for the pitch effect:
// set up delay node for echo
let delayUnit = AVAudioUnitDelay()
delayUnit.wetDryMix = echoWetDryMix
audioEngine.attachNode(delayUnit)
// set up reverb effect node
let reverbEffect = AVAudioUnitReverb()
reverbEffect.loadFactoryPreset(.Cathedral)
reverbEffect.wetDryMix = reverbWetDryMix
audioEngine.attachNode(reverbEffect)
Er… what is this wetDryMix property? Numerous explanations exist on the web; maybe you’ll be able to find one that makes sense to you (unless you work with MIDI equipment or the like and totally get it already!). The value is a Float, representing the percentage of the effect’ed (“wet”) sound blended in with the original (“dry”) sound. A value of 0.0 will give you no effect, while a value of 100.0 will give you nothing but the effect’ed sound. (Apple’s documentation gives the valid range as 0 through 100.)
I set my echoWetDryMix to 0.0 and my reverbWetDryMix to 25.0 to get a lovely cathedral sound. This provided all reverb, no echo. Alternatively, I set my echoWetDryMix to 10.0 and my reverbWetDryMix to 0.0 to get an awesome echo, with no reverb. Experiment with your values to see what interesting things happen!
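If the wet/dry idea is still fuzzy, here's how I picture it for a single sample, as a small sketch. It assumes a plain linear blend on the 0 to 100 percentage scale that Apple documents for wetDryMix. I don't know how the audio units do the blend internally, and wetDryBlend is a made-up name for illustration (current Swift syntax):

```swift
/// Blends an unprocessed ("dry") sample with a processed ("wet") one.
/// mixPercent uses a 0...100 scale: 0 = all dry, 100 = all wet.
/// (Hypothetical helper; assumes a simple linear blend.)
func wetDryBlend(dry: Float, wet: Float, mixPercent: Float) -> Float {
    // Clamp to 0...100, then turn the percentage into a 0...1 fraction.
    let wetAmount = min(max(mixPercent, 0), 100) / 100
    return dry * (1 - wetAmount) + wet * wetAmount
}

let allDry = wetDryBlend(dry: 1, wet: 0, mixPercent: 0)     // 1.0: original only
let quarter = wetDryBlend(dry: 1, wet: 0, mixPercent: 25)   // 0.75: mostly original
let allWet = wetDryBlend(dry: 1, wet: 0, mixPercent: 100)   // 0.0: effect only
```

So a mix of 25.0 keeps three quarters of the original and folds in one quarter of the effect.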
Now that we’ve added the delay and reverb effects, we need to adjust our sound chain to include them. It should look something like this:
input => pitch => delay => reverb => output
So, in Swift, change your connections section to match your sound chain:
// connect nodes to each other through audio engine
audioEngine.connect(audioPlayerNode, to: pitchEffect, format: nil)
audioEngine.connect(pitchEffect, to: delayUnit, format: nil)
audioEngine.connect(delayUnit, to: reverbEffect, format: nil)
audioEngine.connect(reverbEffect, to: audioEngine.outputNode, format: nil)
Everything else stays the same.
What about that “mixer” thing I mentioned at the beginning? It's not related to the “wet/dry mix”; a “mixer” comes into play when you want to combine audio in special ways - from multiple sources, say. Alongside its outputNode property, the audio engine has a default mainMixerNode to help with this. The WWDC video can give you great info on it if you’re ready to go to that level. But, it works the same way: the mixer just gets added to the sound chain, along with the effects, as appropriate based on how you’re using it. I don't need a mixer, so I'll leave it at that for now.
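For the curious, the basic idea of mixing can be sketched without any audio framework at all: line up the sources and add them together sample by sample, each at its own volume. This is just my conceptual picture, not how the engine's mixer node is implemented, and mixSources is a made-up name (current Swift syntax):

```swift
/// Mixes several sources into one buffer by summing them sample by sample,
/// each scaled by its own volume. Shorter sources just run out early.
/// (Hypothetical toy function; not an AVFoundation API.)
func mixSources(_ sources: [(samples: [Float], volume: Float)]) -> [Float] {
    // The mixed buffer is as long as the longest source.
    let length = sources.map { $0.samples.count }.max() ?? 0
    var output = [Float](repeating: 0, count: length)
    for source in sources {
        for (i, sample) in source.samples.enumerated() {
            output[i] += sample * source.volume
        }
    }
    return output
}

let mixed = mixSources([
    (samples: [1, 1, 1], volume: 0.5),   // a longer source at half volume
    (samples: [0.5, 0.5], volume: 1)     // a shorter source at full volume
])
// mixed is [1.0, 1.0, 0.5]
```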
Feel free to play with this stuff. Changing the order of the chain can sometimes affect the effect. Changing other values within an effect (such as delayTime within the AVAudioUnitDelay object) can create interesting sounds.
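Why would the order matter? Here's a small sketch that treats each effect as a function from samples to samples and runs them in sequence, like input => a => b => output. The two effects are toys I made up (a boost and a hard limiter), not anything from AVFoundation, and the code is in current Swift syntax:

```swift
/// An effect is just "samples in, samples out".
typealias Effect = ([Float]) -> [Float]

/// Runs the samples through each effect in order.
func runChain(_ input: [Float], through effects: [Effect]) -> [Float] {
    return effects.reduce(input) { samples, effect in effect(samples) }
}

// Two toy effects (hypothetical, for illustration only).
let boost: Effect = { samples in samples.map { $0 * 2 } }                 // double the level
let limit: Effect = { samples in samples.map { min(max($0, -1), 1) } }    // hard-limit to -1...1

let source: [Float] = [0.25, 0.75]
let boostThenLimit = runChain(source, through: [boost, limit])   // [0.5, 1.0]
let limitThenBoost = runChain(source, through: [limit, boost])   // [0.5, 1.5]
```

Same two effects, same input, different output, just because they swapped places in the chain.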
Use the free documentation on ChucK => Strongly-timed, On-the-fly Music Programming Language, or check out the book Programming for Musicians and Digital Artists - even if you’re not interested in learning the language of ChucK, you’ll likely find the discussions on the workings of digital music to be helpful in your sound programming efforts.
This has been a very basic level discussion of how to set up some sound effects in iOS using Swift. I know very little about digital music, but I hope I have given you a great start to expanding your own knowledge way past mine! Enjoy!
Footnotes:
ChucK => Strongly-timed, On-the-fly Music Programming Language - free, open source; includes Cocoa and iOS source code that makes up the underpinnings of ChucK. Note: some of the developers of ChucK have a company called Smule - makers of several popular music and sound-related apps on the App Store. See Dr. Ge’s TED talk linked on the ChucK site.
Programming for Musicians and Digital Artists - text for learning to program in ChucK, including discussions on the workings of digital music. (Even if you don’t care to learn ChucK, the digital music explanations can be very helpful!)
DAC:
Digital to Analog Converter - it represents the speakers, headset, or whatever outputs the sound on your computer.
Gain:
I'm probably not going to explain this right, but I'll try based on my understanding: gain is to audio output as bandwidth is to an Internet connection. On the Internet connection, if multiple people are downloading files, one might take up the whole bandwidth while the others have to wait their turn. Alternatively, the network might be set so all of them can download at the same time, but each can only use a fraction of the bandwidth.
Gain, then, is the amount of audio pipeline the sound is allowed to take up. In my ChucK example, I'm splitting my sound output up between 3 panning objects, so each has 1/3 of the sound “bandwidth”. (Note that volume is separate from gain: you could have something at full volume, yet it'll still sound quieter if it only has a fraction of the gain. This is similar to how you can still do a complete download with only part of your Internet bandwidth - the download will just be slower.) Strictly speaking, a gain is just a number the signal gets multiplied by: 1.0 leaves it alone, and 0.5 halves its amplitude. The bandwidth picture is only my analogy.
Pan:
Panning only works in stereo. It controls where you hear the sound: left speaker, right speaker, or both, as well as combinations in between. As an example, I really like songs or videos that have a car racing by - you hear the car come in from the left, say, then it seems to pass in front of you, then it leaves to the right. This is an awesome use of panning! Listen to this Car Passing By Sound Effect video.
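As a rough sketch of what a pan control does to one mono sample: it decides how much goes to each side. This assumes the common convention where pan runs from -1 (hard left) through 0 (center) to 1 (hard right), and it uses a simple linear split. Real panners often use a different curve, and panned is a made-up name for illustration (current Swift syntax):

```swift
/// Splits a mono sample into left/right using a simple linear pan.
/// pan runs from -1 (hard left) through 0 (center) to 1 (hard right).
/// (Hypothetical helper; real panners often use an equal-power curve.)
func panned(_ sample: Float, pan: Float) -> (left: Float, right: Float) {
    let position = min(max(pan, -1), 1)    // clamp to -1...1
    let rightShare = (position + 1) / 2    // 0 = all left, 1 = all right
    return (left: sample * (1 - rightShare), right: sample * rightShare)
}

// The "car passing by": the same sound swept from left to right.
let carPositions: [Float] = [-1, 0, 1]
let carPassingBy = carPositions.map { panned(1, pan: $0) }
// carPassingBy is [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)] as (left, right) pairs
```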