LanguageModelSession always returns very lengthy responses

No matter what I try, LanguageModelSession returns very lengthy, verbose responses. I set the maximumResponseTokens option to various small numbers, but it doesn't appear to have any effect. I've even used this instructions format to keep responses between 3 and 8 words, but it still returns multiple paragraphs. Is there a way to manage the model's response length? Thanks.

I tried the same instructions and prompt you provided with different text, and the issue doesn't happen for me: every time I tried, the model generated a response of no more than 8 words.

I built my code with Xcode 26.0 beta (17A5241e) on macOS 15.5 (24F74) and ran it on my iPhone 16 Plus running build 23A5260h. The code is pretty straightforward, so I am wondering whether your test environment differs in some way...
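For comparison, here is a minimal sketch of the kind of test described above. It is not the code from either post: the instructions string, the token limit of 30, and the helper names `shortAnswer(to:)` and `wordCount(_:)` are all assumptions made for illustration. As far as I understand, `maximumResponseTokens` only cuts generation off once the limit is reached and does not by itself make the model write concisely, so the word limit is stated in the session instructions as well.

```swift
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Counts whitespace-separated words, so a response can be checked
/// against the 3-8 word limit mentioned in the question.
func wordCount(_ text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace }).count
}

#if canImport(FoundationModels)
/// Hypothetical helper: asks the on-device model for a short answer.
@available(iOS 26.0, macOS 26.0, *)
func shortAnswer(to prompt: String) async throws -> String {
    // Assumed instructions; the wording used in the thread is not shown.
    let session = LanguageModelSession(
        instructions: "Answer in 3 to 8 words. Do not add explanations."
    )
    // Upper bound on generated tokens; 30 is an arbitrary illustrative value.
    let options = GenerationOptions(maximumResponseTokens: 30)
    let response = try await session.respond(to: prompt, options: options)
    return response.content
}
#endif
```

Calling `try await shortAnswer(to:)` from an async context and passing the result to `wordCount(_:)` should show whether a given build and device honor the limit, which may help narrow down an environment difference.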

Best,
——
Ziqiao Chen
 Worldwide Developer Relations.
