Seeking Confirmation on Picture-in-Picture Support for Audio Calls

Hi Apple review team, I’m developing an app with audio calling functionality, and I’d like to take advantage of Picture-in-Picture (PiP) so that when the user moves the app to the background, the ongoing call can remain minimized on the Home screen. Based on my research, it seems possible to display a view in PiP mode and have it play, and I haven’t found any documentation stating that this is prohibited. Could you please confirm if this is allowed?

Based on my research, it seems possible to display a view in PiP mode and have it play, and I haven’t found any documentation stating that this is prohibited. Could you please confirm if this is allowed?

Not only is it allowed, the document "Adopting Picture in Picture for video calls" specifically describes how video call apps should adopt PiP.
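As a rough illustration of the setup that document describes, here is a minimal sketch using AVKit's video-call PiP classes (iOS 15 and later). `CallViewController`, `RemoteVideoView`, and the property names are hypothetical, and feeding frames into the display layer is left out.

```swift
import AVFoundation
import AVKit
import UIKit

/// Hypothetical view backed by a sample buffer display layer.
/// A real app would enqueue decoded frames into this layer.
final class RemoteVideoView: UIView {
    override class var layerClass: AnyClass { AVSampleBufferDisplayLayer.self }
}

/// Hypothetical call screen; names are illustrative only.
@available(iOS 15.0, *)
final class CallViewController: UIViewController {
    /// The in-app view that the PiP window animates from.
    private let callSourceView = UIView()
    private var pipController: AVPictureInPictureController?

    override func viewDidLoad() {
        super.viewDidLoad()
        callSourceView.frame = view.bounds
        callSourceView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(callSourceView)
        configurePictureInPicture()
    }

    private func configurePictureInPicture() {
        guard AVPictureInPictureController.isPictureInPictureSupported() else { return }

        // The system hosts this view controller's view inside the PiP window.
        let pipCallViewController = AVPictureInPictureVideoCallViewController()
        pipCallViewController.preferredContentSize = CGSize(width: 1080, height: 1920)

        let remoteView = RemoteVideoView(frame: pipCallViewController.view.bounds)
        remoteView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        pipCallViewController.view.addSubview(remoteView)

        let contentSource = AVPictureInPictureController.ContentSource(
            activeVideoCallSourceView: callSourceView,
            contentViewController: pipCallViewController
        )

        let controller = AVPictureInPictureController(contentSource: contentSource)
        // Let the system start PiP when the app moves to the background.
        controller.canStartPictureInPictureAutomaticallyFromInline = true
        pipController = controller
    }
}
```

The content view controller is an ordinary view hierarchy, which is why the API itself places no technical restriction on what appears in the PiP window.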

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thank you for your response, Kevin. From what I can see, the document appears to focus only on video calls, while my use case involves audio-only communication. Could you kindly confirm whether it’s acceptable to display a Picture-in-Picture view for an audio-only call, even if the document primarily focuses on video calls?

What are you actually trying to do?

While the documentation primarily describes "camera to camera" streaming and the idea of "two people looking at each other through their cameras", nothing in the API actually requires that. As far as the API is concerned, it has no idea what content it's actually showing and will happily display whatever you tell it to.

Similarly, many VoIP apps allow their users to send content other than "the image of the person currently talking" (screen captures, shared documents, whiteboards, etc.). I'm not aware of us ever having had any issue with that sort of thing, particularly when the user has direct control of what's happening and the value to the user is obvious.

However, that doesn't mean that any app can simply use PiP for "whatever it wants". For example, an app shouldn't be using PiP to put up things like:

  • An empty black screen.

  • A random static image.

...as that's simply not what PiP is "for". One thing to understand here is that Apple's concerns around how our APIs are used are not simply about the technical details of how those APIs are used; they're also about the functionality being provided to the user.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi Kevin, what I’m trying to do is use Picture-in-Picture to display relevant call information to the user, such as the current call status and connection quality, when the app goes into the background. The purpose is to keep the user informed during an ongoing call, not to misuse PiP for unrelated or static content. Could you please confirm if this is allowed?

Hi Kevin, what I’m trying to do is use Picture-in-Picture to display relevant call information to the user—such as the current call status and connection quality—when the app goes into the background.

My read of that is that it seems like something that would be much better handled through Live Activities and/or some other mechanism like the notification system. Notably:

  • Live Activities work in broader contexts, like on the lock screen.

  • Live Activities can be initiated from the background, while starting PiP requires your app to be in the foreground.

  • Using PiP for this is going to disrupt whatever other PiP activity the user was doing.

Basically, I think using PiP for something like this is going to require significantly more work than Live Activities and will only work in a much narrower context.
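For comparison, a minimal sketch of publishing call status through a Live Activity with ActivityKit (iOS 16.2 and later). `CallActivityAttributes`, `CallStatusActivity`, and the field names are hypothetical. The sketch assumes the app declares Live Activity support in its Info.plist and ships a widget extension that renders the state; neither is shown here.

```swift
import ActivityKit
import Foundation

/// Hypothetical attributes describing an ongoing call.
@available(iOS 16.2, *)
struct CallActivityAttributes: ActivityAttributes {
    /// Dynamic state that changes during the call.
    struct ContentState: Codable, Hashable {
        var status: String
        var connectionQuality: String
    }

    /// Static data fixed for the lifetime of the activity.
    var remotePartyName: String
}

/// Hypothetical wrapper that starts, updates, and ends the activity.
@available(iOS 16.2, *)
final class CallStatusActivity {
    private var activity: Activity<CallActivityAttributes>?

    func start(remotePartyName: String) {
        // The user can disable Live Activities for the app in Settings.
        guard ActivityAuthorizationInfo().areActivitiesEnabled else { return }

        let initialState = CallActivityAttributes.ContentState(
            status: "Connecting",
            connectionQuality: "Unknown"
        )
        do {
            activity = try Activity<CallActivityAttributes>.request(
                attributes: CallActivityAttributes(remotePartyName: remotePartyName),
                content: ActivityContent(state: initialState, staleDate: nil),
                pushType: nil
            )
        } catch {
            print("Could not start the Live Activity: \(error)")
        }
    }

    func update(status: String, connectionQuality: String) async {
        let newState = CallActivityAttributes.ContentState(
            status: status,
            connectionQuality: connectionQuality
        )
        await activity?.update(ActivityContent(state: newState, staleDate: nil))
    }

    func end() async {
        // Passing nil keeps the last content; remove the activity right away.
        await activity?.end(nil, dismissalPolicy: .immediate)
        activity = nil
    }
}
```

A call screen could call `start(remotePartyName:)` when the call begins, `update(status:connectionQuality:)` as conditions change, and `end()` on hang-up.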

The purpose is to keep the user informed during an ongoing call.

Just to clarify something, you are planning to use CallKit, LiveCommunicationKit, or the PushToTalk framework, correct? My concern here is that you're trying to use PiP to avoid those APIs and that would be a very serious mistake. While it's technically possible to implement a VoIP app without those APIs, the audio behavior of that app will be SEVERELY compromised. Most notably, if an incoming call arrives while your app is "on a call", then:

  • Your app will immediately receive an audio interruption.

  • Even if the user quickly declines the call, your app will NOT be able to resume its audio session, because a PlayAndRecord session cannot be activated from the background.

  • Shortly after that, your app will suspend, because it doesn't have an active audio session keeping it awake.
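The sequence above can be observed with an audio session interruption observer. This is a minimal sketch for an app that manages its own session without CallKit; `CallAudioInterruptionObserver` is a hypothetical name, and the failure described in the comment is the behavior stated in the list above, not something the sketch verifies.

```swift
import AVFoundation
import Foundation

/// Hypothetical observer for an app that manages its own audio session.
final class CallAudioInterruptionObserver {
    private var token: NSObjectProtocol?

    init() {
        token = NotificationCenter.default.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: AVAudioSession.sharedInstance(),
            queue: .main
        ) { [weak self] notification in
            self?.handle(notification)
        }
    }

    deinit {
        if let token = token {
            NotificationCenter.default.removeObserver(token)
        }
    }

    private func handle(_ notification: Notification) {
        guard
            let rawType = notification.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
            let type = AVAudioSession.InterruptionType(rawValue: rawType)
        else { return }

        switch type {
        case .began:
            // Step 1: another call (from any app) has taken over audio.
            print("Audio session interrupted")
        case .ended:
            // Step 2: try to take the session back. Per the list above,
            // a PlayAndRecord session cannot be activated while the app
            // is in the background, so expect this to throw there.
            do {
                try AVAudioSession.sharedInstance().setActive(true)
            } catch {
                // Step 3 follows: with no active session, the app suspends.
                print("Could not reactivate the audio session: \(error)")
            }
        @unknown default:
            break
        }
    }
}
```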

Note that the incoming "call" can be from any app using any of those three frameworks, not just Phone.app. In practical terms, users will be forced to "hang up" any time someone contacts them through any other VoIP app.

*This is actually exactly what happened to all VoIP apps before we introduced CallKit and is, in fact, the major reason we created CallKit.
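For contrast, a minimal sketch of the CallKit side, where the system owns audio session activation and tells the app when audio can start and stop. `CallManager` is a hypothetical name, the media engine is omitted, and a production app would handle more actions and error cases than shown.

```swift
import AVFoundation
import CallKit

/// Hypothetical call manager; only the CallKit plumbing is sketched.
final class CallManager: NSObject, CXProviderDelegate {
    private let provider: CXProvider
    private let callController = CXCallController()

    override init() {
        let configuration = CXProviderConfiguration()
        configuration.supportsVideo = false
        configuration.maximumCallsPerCallGroup = 1
        configuration.supportedHandleTypes = [.generic]
        provider = CXProvider(configuration: configuration)
        super.init()
        // A nil queue delivers delegate callbacks on the main queue.
        provider.setDelegate(self, queue: nil)
    }

    /// Ask the system to start an outgoing call.
    func startCall(to handleValue: String) {
        let handle = CXHandle(type: .generic, value: handleValue)
        let action = CXStartCallAction(call: UUID(), handle: handle)
        callController.request(CXTransaction(action: action)) { error in
            if let error = error {
                print("Start call request failed: \(error)")
            }
        }
    }

    // MARK: CXProviderDelegate

    func providerDidReset(_ provider: CXProvider) {
        // Tear down any active calls and stop audio here.
    }

    func provider(_ provider: CXProvider, perform action: CXStartCallAction) {
        provider.reportOutgoingCall(with: action.callUUID, startedConnectingAt: Date())
        action.fulfill()
    }

    func provider(_ provider: CXProvider, perform action: CXSetHeldCallAction) {
        // Pause or resume media depending on action.isOnHold.
        action.fulfill()
    }

    func provider(_ provider: CXProvider, perform action: CXEndCallAction) {
        action.fulfill()
    }

    func provider(_ provider: CXProvider, didActivate audioSession: AVAudioSession) {
        // The system activated the session; start call audio here.
    }

    func provider(_ provider: CXProvider, didDeactivate audioSession: AVAudioSession) {
        // The system deactivated the session; stop call audio here.
    }
}
```

The point of the design is that the app starts and stops audio in response to `didActivate` and `didDeactivate` instead of activating the session itself.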

Not to misuse PiP for unrelated or static content. Could you please confirm if this is allowed?

No, not based on what you've described. What you're describing isn't what PiP was directly designed to support, so the final determination would depend on the specific behavior and implementation details of the product you're creating.

My recommendation is that you contact App Review directly, so you can walk them through the specifics of exactly what you want to do and why. However, as part of that, I would strongly recommend that you prepare some screenshots or mockups of what you're doing, not just a general description. Ultimately, this is really about the value your unusual usage provides to the user, not an abstract decision about policy.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
