App works fine in development but crashes in hardened runtime

I am building an application using .NET and Avalonia UI. The application is cross-platform. One of the tasks of the application is to coordinate data collection that is then routed into a Docker container for analysis.

Everything works as expected in Windows. Everything works as expected in macOS on the development workstation and before packaging. After I package/codesign into a hardened runtime, I start seeing crashes at the moment when I try to execute the system calls to Docker.

I am reasonably confident that this has something to do with an entitlement flag or some other permissions issue. I have been trying to sort this on my own for a while. I am only hoping someone can nudge me in the right direction.

Thanks, Kevin

Answered by DTS Engineer in 847789022

Look at the backtrace of the crashing thread here:

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib  0x7ff8089fb846 __pthread_kill + 10
1   libsystem_pthread.dylib 0x7ff808a36b16 pthread_kill + 259
2   libsystem_c.dylib       0x7ff80895573e abort + 126
3   libcoreclr.dylib           0x10fb351d9 PROCAbort + 57
4   libcoreclr.dylib           0x10fb350f9 TerminateProcess + 137
5   libcoreclr.dylib           0x10fd45de9 UnwindManagedExceptionPass1(PAL_SEHException&, _CONTEXT*) + 1081
6   libcoreclr.dylib           0x10fd45e2d DispatchManagedException(PAL_SEHException&, bool) + 61
7   libcoreclr.dylib           0x10fcb559d IL_Throw(Object*) + 445
8   ???                        0x11475fdca ???
9   ???                        0x1139f5148 ???
10  ???                        0x1139f4c42 ???
11  ???                        0x1139f4981 ???
12  ???                        0x1139f4633 ???
13  ???                        0x113716f35 ???
14  ???                        0x111942cf3 ???
15  ???                        0x11193194e ???
16  libcoreclr.dylib           0x10fdf7db1 CallDescrWorkerInternal + 124
17  libcoreclr.dylib           0x10fc57e13 MethodDescCallSite::CallTargetWorker(unsigned long long const*, unsi…
18  libcoreclr.dylib           0x10fb51588 RunMain(MethodDesc*, short, int*, PtrArray**) + 712
19  libcoreclr.dylib           0x10fb51992 Assembly::ExecuteMainMethod(PtrArray**, int) + 450
20  libcoreclr.dylib           0x10fb7bd81 CorHost2::ExecuteAssembly(unsigned int, char16_t const*, int, char16…
21  libcoreclr.dylib           0x10fb3eb72 coreclr_execute_assembly + 226
22  libhostpolicy.dylib        0x10f44b57d run_app_for_context(hostpolicy_context_t const&, int, char const**) …
23  libhostpolicy.dylib        0x10f44c4f4 corehost_main + 276
24  libhostfxr.dylib           0x10f36ee55 fx_muxer_t::handle_exec_host_command(std::__1::basic_string, std::__…
25  libhostfxr.dylib           0x10f36df61 fx_muxer_t::execute(std::__1::basic_string, std::__1::allocator>, in…
26  libhostfxr.dylib           0x10f36ac83 hostfxr_main_startupinfo + 131
27  CVRAnalysisToolkit         0x10f2b0b4d exe_start(int, char const**) + 1565
28  CVRAnalysisToolkit         0x10f2b0daf main + 175
29  dyld                    0x7ff808691530 start + 3056

Frames 28 through 16 seem to be your language runtime starting up. Frames 15 through 8 seem to be JITed code, because the addresses aren’t in any image listed in the Binary Images section. Frame 7 suggests that your JITed code threw a language exception, and that leads to frames 6 through 3, which suggest that the language exception went unhandled and thus crashed the process.

It’s hard to offer definitive advice without knowing more about the issue, such as:

  • The identity of the code in frames 15 through 8.
  • The nature of the exception thrown in frame 7.

I don’t have any hints for the first point. That’s something you’d need to discuss with your tools vendor.

My experience with the second point is that such exceptions are often logged to stderr, which you can see if you manually run the app from Terminal. See my advice on this topic in Resolving Trusted Execution Problems (search that page for Terminal).
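
If running from Terminal is not convenient, the same information can be captured from inside the app. The following is a minimal sketch, not anything taken from the poster's project: it hooks the .NET AppDomain.CurrentDomain.UnhandledException event and appends the exception text to a file before the runtime aborts. The CrashLogging class, its method names, and the log location are all hypothetical.

```csharp
using System;
using System.IO;

public static class CrashLogging
{
    // Hypothetical helper: call once at the very start of Main,
    // before any other work, so later unhandled exceptions are recorded.
    public static void Install(string logPath)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            WriteEntry(logPath, e.ExceptionObject);
    }

    // Appends one timestamped entry. Formatting an exception as a string
    // includes its type, message, and managed stack trace.
    public static void WriteEntry(string logPath, object exception)
    {
        File.AppendAllText(logPath,
            $"{DateTime.UtcNow:O} {exception}{Environment.NewLine}");
    }
}
```

A call such as CrashLogging.Install(Path.Combine(Path.GetTempPath(), "crash.log")) would then leave the exception type and message on disk, which covers the second bullet above even when the app is launched from Finder. The file name here is only a placeholder.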

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

After I package/codesign into a hardened runtime, I start seeing crashes at the moment when I try to execute the system calls to Docker.

I have some general advice on this front in my Resolving Hardened Runtime Incompatibilities post, part of the Resolving Trusted Execution Problems series.

If you post a crash report, I might be able to offer more specific advice. See Posting a Crash Report for advice on how to do that.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Any help you can offer will be fantastic and greatly appreciated. Our Mac testers are getting jealous of our Windows testers. We can't have THAT!!

Thanks for the insight. I will try to explain a little more about what is happening at the point of failing...and to express my frustration that I can't even reproduce the problem outside of the hardened runtime.

The application is processing MR images using a mix of our own code, in C#, and a third-party MRI image processing suite running in a docker/podman container.

The application sets up the data so that podman can access it and then triggers a podman process to start its processing. The triggering happens by preparing a 'Process' object...like a command shell invocation. Like:

podman exec <options> <container_name> <command>

In development, this all works perfectly on all platforms. In packaged form, it works on Windows...because Microsoft has nothing even close to the hardened runtime in their distribution model. On the Mac, it generates the crash log provided above.

This leads me to believe that I need some sort of Entitlement, but which one? The test loop time is around 30 minutes and I don't know which one might be the culprit.

Alternately, I need some flag or other as-yet-unknown configuration in the ProcessStartInfo object. In C#, the ProcessStartInfo object describes the command that we intend to 'launch'. This object is injected into a Process object, and Start() is called.

Something like:

...
var process = new Process();
process.StartInfo = startInfo;
try
{
    process.Start();
}
catch (ApplicationException exception)
{
    Console.WriteLine(exception.Message);
}
...
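
One detail worth noting about a handler like this: when the target executable cannot be found or started, Process.Start() throws System.ComponentModel.Win32Exception, which does not derive from ApplicationException, so the catch block never runs and the exception goes unhandled. A minimal sketch of a wrapper that catches it follows. SafeLaunch and TryStart are hypothetical names, and the sketch assumes UseShellExecute is false, as is typical for a command-line launch.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

public static class SafeLaunch
{
    // Returns true if the process started. On failure, returns false
    // and places a human-readable reason in 'error'.
    public static bool TryStart(ProcessStartInfo startInfo, out string error)
    {
        error = string.Empty;
        try
        {
            // Disposing the Process object releases the handle only;
            // it does not stop the child process.
            using var process = new Process { StartInfo = startInfo };
            process.Start();
            return true;
        }
        catch (Win32Exception ex)
        {
            // Raised when the file is missing or not executable.
            error = $"Could not start '{startInfo.FileName}': {ex.Message}";
            return false;
        }
    }
}
```

With this in place, a failed launch produces a message that names the file that could not be started, instead of an abort deep inside the runtime.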

If there's anything else I can offer, just let me know. I have resisted building a 'non-product' product that contains only the suspect code because it would mean packaging and signing something that isn't meant for distribution, which feels wrong. I now don't see any other way to ensure that I have truly isolated the offending code.

In the interest of education: a solution has been found...and maybe this is obvious to everyone. The reporting afforded by the environment certainly did not get me to this conclusion!

The issue between packaged/hardened applications and the same code in development is: The PATH

I finally set up a repeatable test where I didn't need to sign anything to get it to fail. After a couple of hours I had it pinned down to the process.Start() not being able to find the executable that I was trying to start! In development the external process was on the path...so everything just worked. In the hardened runtime the path is the barebones '/usr/bin'...etc.
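
For anyone hitting the same thing, one way to avoid depending on the inherited PATH is to resolve the executable to an absolute path before launching. A GUI-launched app generally does not see the PATH set up by shell startup files, so the sketch below searches the current PATH plus a caller-supplied list of extra directories. ExecutableLocator and TryFind are hypothetical names, and the directories in the usage note are assumptions about where a tool such as podman might be installed, not something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExecutableLocator
{
    // Looks for 'name' in each PATH directory, then in 'extraDirs'.
    // Returns true and the absolute path of the first match.
    public static bool TryFind(string name, IEnumerable<string> extraDirs,
                               out string fullPath)
    {
        var pathDirs = (Environment.GetEnvironmentVariable("PATH") ?? string.Empty)
            .Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries);

        foreach (var dir in pathDirs.Concat(extraDirs))
        {
            var candidate = Path.Combine(dir, name);
            if (File.Exists(candidate))
            {
                fullPath = candidate;
                return true;
            }
        }

        fullPath = string.Empty;
        return false;
    }
}
```

A caller might pass something like new[] { "/opt/homebrew/bin", "/usr/local/bin", "/opt/podman/bin" } as extraDirs and assign the result to startInfo.FileName. If nothing is found, the app can report that clearly rather than calling Start() on a bare command name.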

I am 'smarterer' now! ;)

Thanks for all of your help!!
