AVAudioEngine

RSS for tag

Use a group of connected audio node objects to generate and process audio signals and perform audio input and output.

Posts under AVAudioEngine tag

55 Posts
Sort by:

Post

Replies

Boosts

Views

Activity

Compressing AVAudioPCMBuffer within AVAudioEngine Tap
Hi everyone, I’m working on a project that involves streaming audio over WebSockets, and I need to compress the audio to reduce bandwidth usage. I’m currently using AVAudioEngine to capture and process audio in PCM format (AVAudioPCMBuffer), but I want to compress the buffer into Opus (or another efficient codec) before sending it over the network. Has anyone worked with compressing an AVAudioPCMBuffer into Opus format within a tap on the inputNode, or could you recommend the best approach for compressing the PCM buffer into a different format? I haven’t been able to find a working solution for this. Any advice or code examples would be greatly appreciated! Thanks in advance, Ondřej -- My current code without the compression: inputNode.installTap(onBus: .zero, bufferSize: 1440, format: nil) { [weak self] buffer, time in guard let self else { return } // 1. Send data // a) Convert the buffer into the desired format if let outputBuffer = buffer.convert(toFormat: Self.websocketInputFormat) { // b) Use the converted buffer // TODO: compress it into a different format if let data = outputBuffer.convertToData() { self.sendAudio(data) } } // 2. Get sound level self.visualizeRecorderBuffer(buffer) } func convert(toFormat outputFormat: AVAudioFormat) -> AVAudioPCMBuffer? { let outputFrameCapacity = AVAudioFrameCount( round(Double(frameLength) * (outputFormat.sampleRate / format.sampleRate)) ) guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: outputFrameCapacity), let converter = AVAudioConverter(from: format, to: outputFormat) else { return nil } converter.convert(to: outputBuffer, error: nil) { packetCount, status in status.pointee = .haveData return self } return outputBuffer } static private let websocketInputFormat = AVAudioFormat( commonFormat: .pcmFormatInt16, sampleRate: 16000, channels: 1, interleaved: false )!
2
0
1.2k
Sep ’24
ApplicationMusicPlayer Audio Session Issue When Switching to AVAudioEngine in Background
Hi! I'm developing a music player app that interchanges between ApplicationMusicPlayer and AVAudioEngine. I'm facing an issue when switching from playback via ApplicationMusicPlayer to AVAudioEngine while the app is in background. Based on testing, it seems like the issue has to do with being unable to set audio focus in background, causing error AVAudioSessionErrorCodeCannotInterruptOthers. I would like to check if ApplicationMusicPlayer has its own audio focus separated from the app's own audio focus. If it is, is there anything that I can do to ensure that ApplicationMusicPlayer returns focus to the app? (I notice that the issue does not occur if we are moving playback from AVAudioEngine to ApplicationMusicPlayer. Not sure why the opposite does not work)
1
0
598
Aug ’24
AVAudioSession gets interrupted when closing a window
I have a visionOS app that plays audio using AVAudioEngine and presents both a window and an immersive space. If I close the window, the audio session gets interrupted and attempting to restart the session and audio engine has no effect. I need to dismiss the app, then reopen it, which reopens the main window, in order for audio to start playing again. This is in all visionOS 2 betas. Note that I have background audio enabled for my app.
2
1
799
Nov ’24
Recordings on iOS 18.0 beta start with stuttering.
I'm experiencing stuttering every time I record something with my iOS app on iOS 18 beta. The code ran fine on previous iOS versions. The stuttering occurs for the first 2 seconds. Here's an example: https://soundcloud.com/thomas-walther-219010679/ios-18-stuttering The way I set up AVAudioEngine and AVAudioSession was vetted quite thoroughly during sessions at WWDC '23. Here is how the engine and the tap is configured: let engine = AVAudioEngine() let recorderNode = AVAudioMixerNode() engine.attach(recorderNode) engine.connect(engine.mainMixerNode, to: engine.outputNode, format: engine.outputNode.inputFormat(forBus: 0)) engine.connect(recorderNode, to: engine.mainMixerNode, format: recordingOutputFormat) engine.connect(engine.inputNode, to: recorderNode, format: engine.inputNode.inputFormat(forBus: 0)) let bufferSize: AVAudioFrameCount = 4096 recorderNode.installTap(onBus: 0, bufferSize: bufferSize, format: nil) { [weak self] buffer, time in guard let self = self else { return } do { // Write recording to disk try audioFile.write(buffer) } catch { // ... } } I tried setting a different buffer size, but with no luck. I also can't see any hangs in Instruments. Do you have any pointers on how to debug this?
5
0
1k
Aug ’24
How do I output different sounds to headphones and speakers while simultaneously recording, all without using AVAudioSession.Category.multiroute?
I need to find a way to allow recording from the mic while outputting two different sound streams to two different devices (speaker and headphones). I've done a fair bit of reading around using AVAudioSession.Category.multiroute but haven't found any modern examples. @theanalogkid posted a nice example using obj-C nine years ago, but others have noted that the code isn't readily translatable to Swift. To make matters worse, this is one of the very few examples on how to properly use multirouting. The official documentation is lacking, to say the least, and the WWDC 2012 session is, well, old enough to attend middle school and be a Taylor Swift fan, but definitely not in Swift. The few relevant forum posts here are spread over this middle schooler's life span and likely outdated, with most having no responses other than the poster's own plightful echo. They don't paint a pretty picture of .multiroute's health, with a recent poster noting that volume buttons don't work in this mode, contacting DTS and finding that there's no fix; another finding that it just doesn't work for certain devices, etc. Audio is giving me enough of a headache so I'd like to avoid slogging through this if possible. .multiroute feels like the developer mode of AVAudioSession, but without documentation. tl;dr - Without using .multiroute, is there a way to allow an app to output two different devices while simultaneously recording audio? If .multiroute is the only way to achieve this, can someone give me a quick rundown of how this category works?
1
0
764
Aug ’24
Configuring Apple Vision Pro's microphones to effectively pick up other speaker's voice
I am developing a visionOS app that captions speech in real environments. Currently, I am using Apple's built-in speech recognizer. However, when I was testing the app with a Vision Pro, the device seemed to only pick up the user's voice (in other words, the voices of the wearer of the Vision Pro device). For example, when the speech recognition task is running, and another person in front of me is talking, the system does not pick up the speech well. I tried to set the AVAudioSession to be equally sensitive to all directions: private func configureAudioSession() { do { try audioSession.setCategory(.record, mode: .measurement) try audioSession.setActive(true) if #available(visionOS 1.0, *) { let availableDataSources = audioSession.availableInputs?.first?.dataSources if let omniDirectionalSource = availableDataSources?.first(where: {$0.preferredPolarPattern == .omnidirectional}) { try audioSession.setInputDataSource(omniDirectionalSource) } } } catch { print("Failed to set up audio session: \(error)") } } And here is how I set up the speech recognition and configure the microphone inputs: private func startSpeechRecognition(completion: @escaping (String) -> Void) { do { // Cancel the previous task if it's running. if let recognitionTask = recognitionTask { recognitionTask.cancel() self.recognitionTask = nil } // The AudioSession is already active, creating input node. let inputNode = audioEngine.inputNode try inputNode.setVoiceProcessingEnabled(false) // Create and configure the speech recognition request recognitionRequest = SFSpeechAudioBufferRecognitionRequest() guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a recognition request") } recognitionRequest.shouldReportPartialResults = true // Keep speech recognition data on device if #available(iOS 13, *) { recognitionRequest.requiresOnDeviceRecognition = true } // Create a recognition task for speech recognition session. // Keep a reference to the task so that it can be canceled. recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest) { result, error in // var isFinal = false if let result = result { // Update the recognizedText completion(result.bestTranscription.formattedString) } else if let error = error { completion("Recognition error: \(error.localizedDescription)") } if error != nil || result?.isFinal == true { // Stop recognizing speech if there is a problem self.audioEngine.stop() inputNode.removeTap(onBus: 0) self.recognitionRequest = nil self.recognitionTask = nil } } // Configure the microphone input let recordingFormat = inputNode.outputFormat(forBus: 0) inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in self.recognitionRequest?.append(buffer) } audioEngine.prepare() try audioEngine.start() } catch { completion("Audio engine could not start: \(error.localizedDescription)") } }
0
0
951
Jul ’24
Issue with AVAudioEngine and AVAudioSession after Interruption and Background Transition - 561145187 error code
Description: I am developing a recording-only application that supports background recording using AVAudioEngine. The app segments the recording into 60-second files for further processing. For example, a 10-minute recording results in ten 60-second files. Problem: The application functions as expected in the background. However, after the app receives an interruption (such as a phone call) and the interruption ends, I can successfully restart the recording. The problem arises when the app then transitions to the background; it fails to restart the recording. Specifically, after ending the call and transitioning the app to the background, the app encounters an error and is unable to restart AVAudioSession and AVAudioEngine. The only resolution is to close and restart the app, which is not ideal for user experience. Steps to Reproduce: 1. Start recording using AVAudioEngine. 2. The app records and saves 60-second segments. 3. Receive an interruption (e.g., an incoming phone call). 4. End the call. 5. Transition the app to the background. 6. Transition the app to the foreground and the session will be activated again. 7. Attempt to restart the recording. Expected Behavior: The app should resume recording seamlessly after the interruption and background transition. Actual Behavior: The app fails to restart AVAudioSession and AVAudioEngine, resulting in a continuous error. The recording cannot be resumed without closing and reopening the app. How I’m Starting the Recording: Configuration: internal func setAudioSessionCategory() { do { try audioSession.setCategory( .playAndRecord, mode: .default, options: [.defaultToSpeaker, .mixWithOthers, .allowBluetooth] ) } catch { debugPrint(error) } } internal func setAudioSessionActivation() { if UIApplication.shared.applicationState == .active { do { try audioSession.setPrefersNoInterruptionsFromSystemAlerts(true) try audioSession.setActive(true, options: .notifyOthersOnDeactivation) if audioSession.isInputGainSettable { try audioSession.setInputGain(1.0) } try audioSession.setPreferredIOBufferDuration(0.01) try setBuiltInPreferredInput() } catch { debugPrint(error) } } } Starting AVAudioEngine: internal func setupEngine() { if callObserver.onCall() { return } inputNode = audioEngine.inputNode audioEngine.attach(audioMixer) audioEngine.connect(inputNode, to: audioMixer, format: AVAudioFormat.validInputAudioFormat(inputNode)) } internal func beginRecordingEngine() { audioMixer.removeTap(onBus: 0) audioMixer.installTap(onBus: 0, bufferSize: 1024, format: AVAudioFormat.validInputAudioFormat(inputNode)) { [weak self] buffer, _ in guard let self = self, let file = self.audioFile else { return } write(file, buffer: buffer) } audioEngine.prepare() do { try audioEngine.start() recordingTimer = Timer.scheduledTimer(withTimeInterval: recordingInterval, repeats: true) { [weak self] _ in self?.handleRecordingInterval() } } catch { debugPrint(error) } } On the try audioEngine.start() call, I receive error code 561145187 in the catch block. Logs/Error Messages: • Error code: 561145187 Request: I would appreciate any guidance or solutions to ensure the app can resume recording after interruptions and background transitions without requiring a restart. Thank you for your assistance.
2
0
1.2k
Aug ’24
Setting Audio Input node for AVAudioEngine causes outside audio to stop
I'm building an app that will allow users to record voice notes. The functionality of all that is working great; I'm trying to now implement changes to the AudioSession to manage possible audio streams from other apps. I want it so that if there is audio playing from a different app, and the user opens my app; the audio keep playing. When we start recording, any third party app audio should stop, and can then can resume again when we stop recording. This is my main audio setup code: private var audioEngine: AVAudioEngine! private var inputNode: AVAudioInputNode! func setupAudioEngine() { audioEngine = AVAudioEngine() inputNode = audioEngine.inputNode audioPlayerNode = AVAudioPlayerNode() audioEngine.attach(audioPlayerNode) let format = AVAudioFormat(standardFormatWithSampleRate: AUDIO_SESSION_SAMPLE_RATE, channels: 1) audioEngine.connect(audioPlayerNode, to: audioEngine.mainMixerNode, format: format) } private func setupAudioSession() { let audioSession = AVAudioSession.sharedInstance() do { try audioSession.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker, .allowBluetooth]) try audioSession.setPreferredSampleRate(AUDIO_SESSION_SAMPLE_RATE) try audioSession.setPreferredIOBufferDuration(0.005) // 5ms buffer for lower latency try audioSession.setActive(true) // Add observers setupInterruptionObserver() } catch { audioErrorMessage = "Failed to set up audio session: \(error)" } } This is all called upon app startup so we're ready to record whenever the user presses the record button. However, currently when this happens, any outside audio stops playing. I isolated the issue to this line: inputNode = audioEngine.inputNode When that's commented out, the audio will play -- but I obviously need this for recording functionality. Is this a bug? Expected behavior?
0
0
664
Jul ’24
Implementing Multi-Channel Audio Recording on iOS with Built-In and External Mics
Hi there community, First and foremost, a big thank you to everyone who takes the time to read this. TL;DR: How, if even possible, can I record multiple audio streams simultaneously on an iOS application (iPad/iPhone)? I'm working on a recorder for the iPad to gather data for a machine learning project focused on speech recognition. Our goal is to capture extensive speech data, which requires recording from multiple microphones. Specifically, I need to record from all mics connected to our Scarlett 4i4 audio interface and, most importantly, also record from the built-in mic on the iPad or iPhone at the same time. As a newcomer to Swift development, I initially explored AVAudioRecorder. However, I quickly realized that it only supports one active audio node at a time, making multi-channel recording impossible. (perhaps you can proof me wrong, would make my day) Next, I transitioned to using AVAudioEngine, but encountered the same limitation: I couldn't manage to get input nodes for both the built-in mic and the Scarlett interface channels simultaneously. The application started behaving oddly, often resulting in identical audio data being recorded across all files. Determined to find a solution, I delved deeper into the Core Audio framework, specifically using Audio Toolbox. My approach involved creating and configuring multiple Audio Units, each corresponding to a different audio input device. Here's a brief overview of my current implementation: Listing Available Input Devices: I used AVAudioSession to enumerate all available input devices. Creating Audio Units: For each device, I created an Audio Unit and attempted to configure it for recording. Setting Up Callbacks: I set up input and output callbacks to handle the audio processing. Despite my efforts over the last few days, I haven't had much success. The callbacks for the Audio Units don't seem to be invoked correctly, and I'm struggling to achieve simultaneous multi-channel recording. Below is a snippet of my latest attempt: let audioUnitCallback: AURenderCallback = { ( inRefCon: UnsafeMutableRawPointer, ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>, inTimeStamp: UnsafePointer<AudioTimeStamp>, inBusNumber: UInt32, inNumberFrames: UInt32, ioData: UnsafeMutablePointer<AudioBufferList>? ) -> OSStatus in guard let ioData = ioData else { return noErr } print("Input callback invoked") let audioUnit = inRefCon.assumingMemoryBound(to: AudioUnit.self).pointee var bufferList = AudioBufferList( mNumberBuffers: 1, mBuffers: AudioBuffer( mNumberChannels: 1, mDataByteSize: 0, mData: nil ) ) let status = AudioUnitRender(audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList) if status != noErr { print("AudioUnitRender failed: \(status)") return status } // Copy rendered data to output buffer let buffer = UnsafeMutableAudioBufferListPointer(ioData)[0] buffer.mData?.copyMemory(from: bufferList.mBuffers.mData!, byteCount: Int(bufferList.mBuffers.mDataByteSize)) buffer.mDataByteSize = bufferList.mBuffers.mDataByteSize print("Rendered audio data") return noErr } let outputCallback: AURenderCallback = { ( inRefCon: UnsafeMutableRawPointer, ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>, inTimeStamp: UnsafePointer<AudioTimeStamp>, inBusNumber: UInt32, inNumberFrames: UInt32, ioData: UnsafeMutablePointer<AudioBufferList>? ) -> OSStatus in guard let ioData = ioData else { return noErr } print("Output callback invoked") // Process the output data if needed return noErr } In essence, I'm stuck and in need of guidance. Has anyone here successfully implemented multi-channel recording on iOS, especially involving both built-in microphones and external audio interfaces? Any shared experiences, insights, or suggestions on how to proceed would be immensely appreciated. Thank you once again for your time and assistance!
0
0
915
Jul ’24
How to show VoIP calls on Apple Watch (WatchOS9) with CallKit?
Now my main app is already invoked voip callkit, I would want to invoke voip on iWatch app, but I have some issue: 1、How to deal audio data and network connect of iWatch voip ? can we depends on iPhone app? 2、How to use voip callkit on iWatch that only via bluetooth connect ? 3、If main app is already support voip callkit, how to support callkit for iwatch? Do we need to repeat and independently implement callkit, network, and audio on iWatch? 4、how to add support dial number on iWatch use by the thirdpart app? the case is the same with on the iPhone use, user can send dial by system call record . Any help is appreciated, thanks in advance.
1
0
1.1k
Jul ’24
AVAudioEngine Dolby Atmos
Hi! I have a music app using AVAudioEngine. Right now, I have set it up to play multi channel tracks and show "Multichannel" in the volume controls. However, I am unable to figure out how to get it to use Dolby Atmos. Is there something that needs to be enabled? Is it even possible for AVAudioEngine? I saw some apps that are able of playing with Dolby Atmos, but they do not have EQ feature, so I'm guessing that they are not using AVAudioEngine.
2
0
929
Nov ’24
Understanding AVAudioTime in AVAudioNodeTapBlock? Is there a way to get time relative to a scheduled Buffer?
I'm using AVAudioEngine to play AVAudioPCMBuffers. I'd like to synchronize some events with the playback. For example if the audio's frame position is >= some point && less than some point trigger some code. So I'm looking at - (void)installTapOnBus:(AVAudioNodeBus)bus bufferSize:(AVAudioFrameCount)bufferSize format:(AVAudioFormat * __nullable)format block:(AVAudioNodeTapBlock)tapBlock; Now I have frame positions calculated (predetermined before audio is scheduled I already made all necessary computations) . So I just need to fire code at certain points during playback: [playerNode installTapOnBus:bus bufferSize:bufferSize format:format block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) { //Inspect current audio here and fire... }]; [playerNode scheduleBuffer:fullbuffer atTime:startTime options:0 completionCallbackType:AVAudioPlayerNodeCompletionDataPlayedBack completionHandler:^(AVAudioPlayerNodeCompletionCallbackType callbackType) { // some code is here, not important to this question. }]; The problem I'm having is figuring out at what point in full buffer I'm at within the tap block. The tap block passes chunks (not the full audio buffer). I tried using the when parameter of the block to calculate the frame position relative to the entire audio but have be unsuccessful so far. I'm assuming the when parameter is relative to the buffer passed in the tap block (not my entire audio buffer I scheduled). Not installing a tap and just using a timer before scheduling my fullBuffer has given me good results but I'd rather avoid using a timer if possible and use sample time.
3
0
1.4k
Dec ’24
Playing Timed Sound Effects in Background
Hi, I'm relatively new to iOS development and kindly ask for some feedback on a strategy to achieve this desired behavior in my app. My Question: What would be the best strategy for sound effect playback when an app is in the background with precise timing? Is this even possible? Context: I created a basic countdown timer app (targeting iOS 17 with Swift/SwiftUI.). Countdown sessions can last up to 30-60 mins. When the timer is started it progresses through a series of sub-intervals and plays a short sound for each one. I used AVAudioPlayer and everything works fine when the app is in the foreground. I'm considering switching to AVAudioEngine b/c precise timing is very important and the AIs tell me this would have better precision. I'm already setting "App plays audio or streams audio/video using AirPlay" in my Plist, and have configured: AVAudioSession.sharedInstance().setCategory(.playback, mode: .default, options: .mixWithOthers) Curiously, when testing on my iPhone 13 mini, sounds sometimes still play when the app is in the background, but not always. What I've considered: Background Tasks: Would they make any sense for this use-case? Seems like not if the allowed time is short &amp; limited by the system. Pre-scheduling all Sounds: Not sure this would even work and seems like a lot of memory would be needed (could be hundreds of intervals). ActivityKit Alerts: works but with a ~50ms delay which is too long for my purposes. Pre-Render all SFX to 1 large audio file: Seems like a lot of work and processing time and probably not worth it. I hope there's a better solution. I'd really appreciate any feedback.
1
0
1.1k
Dec ’24
Indicate Packet Loss With AVAudioConverter for OPUS Decoding
I'm using an AVAudioConverter object to decode an OPUS stream for VoIP. The decoding itself works well, however, whenever the stream stalls (no more audio packet is available to decode because of network instability) this can be heard in crackling / abrupt stop in decoded audio. OPUS can mitigate this by indicating packet loss by passing a null pointer in the C-library to int opus_decode_float (OpusDecoder * st, const unsigned char * data, opus_int32 len, float * pcm, int frame_size, int decode_fec), see https://opus-codec.org/docs/opus_api-1.2/group__opus__decoder.html#ga9c554b8c0214e24733a299fe53bb3bd2. However, with AVAudioConverter using Swift I'm constructing an AVAudioCompressedBuffer like so:         let compressedBuffer = AVAudioCompressedBuffer(             format: VoiceEncoder.Constants.networkFormat,             packetCapacity: 1,             maximumPacketSize: data.count         )         compressedBuffer.byteLength = UInt32(data.count)         compressedBuffer.packetCount = 1   compressedBuffer.packetDescriptions! .pointee.mDataByteSize = UInt32(data.count)         data.copyBytes(             to: compressedBuffer.data .assumingMemoryBound(to: UInt8.self),             count: data.count         ) where data: Data contains the raw OPUS frame to be decoded. How can I specify data loss in this context and cause the AVAudioConverter to output PCM data whenever no more input data is available? More context: I'm specifying the audio format like this:         static let frameSize: UInt32 = 960         static let sampleRate: Float64 = 48000.0         static var networkFormatStreamDescription = AudioStreamBasicDescription(             mSampleRate: sampleRate,             mFormatID: kAudioFormatOpus,             mFormatFlags: 0,             mBytesPerPacket: 0,             mFramesPerPacket: frameSize,             mBytesPerFrame: 0,             mChannelsPerFrame: 1,             mBitsPerChannel: 0,             mReserved: 0         )         static let networkFormat = AVAudioFormat( streamDescription: &networkFormatStreamDescription )! I've tried 1) setting byteLength and packetCount to zero and 2) returning nil but setting .haveData in the AVAudioConverterInputBlock I'm using with no success.
1
1
761
May ’25
AVAudioPCMBuffer Memory Management
I’m using AVAudioEngine to get a stream of AVAudioPCMBuffers from the device’s microphone using the usual installTap(onBus:) setup. To distribute the audio stream to other parts of the program, I’m sending the buffers to a Combine publisher similar to the following: private let publisher = PassthroughSubject<AVAudioPCMBuffer, Never>() I’m starting to suspect I have some kind of concurrency or memory management issue with the buffers, because when consuming the buffers elsewhere I’m getting a range of crashes that suggest some internal pointer in a buffer is NULL (specifically, I’m seeing crashes in vDSP.convertElements(of:to:) when I try to read samples from the buffer). These crashes are in production and fairly rare — I can’t reproduce them locally. I never modify the audio buffers, only read them for analysis. My question is: should it be possible to put AVAudioPCMBuffers into a Combine pipeline? Does the AVAudioPCMBuffer class not retain/release the underlying AudioBufferList’s memory the way I’m assuming? Is this a fundamentally flawed approach?
1
1
1.4k
Aug ’24