I have difficulty imagining what could be done to rectify the program
such that it can be resumed, except simply killing the offending
thread so the program (operating system) as a whole can continue.
The stack boundary is a kernel-mode mechanism. Its intent, I believe, is to protect the interrupt vectors from corruption. Vector corruption is very bad: a wild jump to somewhere is probably going to happen at some point in the future.
As for recovery: this is the kernel. It probably hasn't any mechanism to abort a "thread" of execution and it probably has only a single kernel stack anyway. The systems I am used to had non-reentrant kernels (rescheduling took place only on exit from kernel mode) so one k-stack was all you needed.
You could conceivably forcibly empty the stack (reload SP with the stack bottom) and then exit (to user mode or the null loop), but you have basically aborted kernel processing at some random point, so who knows what state the world is in. It's no more recoverable than most other trap 4s in k-mode.
I therefore suppose that the only way to recover from stack overflow is to completely reinitialize the kernel. Maybe you disable interrupts, reset the stack, and reload the core image from disk.
Remember that process control was a considerable part of the PDP-11 target base. If your system is so borked that it just got a stack violation, maybe the best way to avoid disaster is to restart ASAP. It's a lot cleaner than random jumps through corrupted interrupt vectors.
The specific question of when "it's ok to use the yellow zone" ends is a good one. I have no authoritative answer. I suspect it might be a consequence of reloading SP. But that's very hand-wavy.
P.S. You figured the yellow zone as 346-400. I make it 340 to 400. It's 16 words, or 32 bytes, or 40 in god's own radix.
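For anyone who wants to check the octal arithmetic, here is a quick sketch in Python. The variable names are mine, purely for illustration; the two boundaries are the ones discussed above.

```python
# Yellow zone boundaries in octal, as discussed above.
yellow_top = 0o400
yellow_bottom = 0o340

size_bytes = yellow_top - yellow_bottom   # byte addresses, so this is a byte count
size_words = size_bytes // 2              # PDP-11 words are 2 bytes

# Same calculation using 346 as the lower bound, for comparison.
alt_bytes = yellow_top - 0o346
alt_words = alt_bytes // 2

print(f"340..400: {size_bytes:o} octal = {size_bytes} bytes = {size_words} words")
print(f"346..400: {alt_bytes:o} octal = {alt_bytes} bytes = {alt_words} words")
```

With 340 as the lower bound the zone comes out at a round 16 words; with 346 it would be 13.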
I have a hypothesis, completely untested. Here it is:
The yellow zone is a spatial construct, a range of addresses rather than a period of time. Note that the description says you only get a trap on a reference of the form -(SP) or @-(SP).
Therefore (I guess), you get a "yellow trap" on an instruction that actually crosses the limit; for a conventional push, like MOV R0,-(SP), it would be the transition from 400 to 376; for something like the useless MOV -(SP),-(SP) it would be a transition from 400 to 374. The cue is the value of SP before the decrement equaling the limit.
Once the SP is less than 400, it's ok to reference through it until it goes below 340, at which point you get the "red trap".
According to this hypothesis, if you get a yellow trap on MOV R0,-(SP), and the trap service routine immediately executes RTI, then you're still in the yellow zone.
An interesting experiment might be to transport yourself into the yellow zone without passing through the limit: MOV #370,SP; CLR -(SP). Trap or no trap?
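To pin down what the hypothesis predicts, here is a toy model in Python. It encodes only my guess as stated above (a yellow trap when the SP value before the decrement equals the limit, a red trap once SP goes below 340). It is not documented hardware behaviour, and the names LIMIT, RED_BOUNDARY and push are invented for this sketch.

```python
LIMIT = 0o400          # stack limit, top of the yellow zone (value used in this thread)
RED_BOUNDARY = 0o340   # hypothesis: a red trap once SP goes below this

def push(sp, words=1):
    """Model one instruction making `words` word-sized -(SP) references.

    Returns (new_sp, trap) where trap is "yellow", "red", or None.
    This is the untested hypothesis, not a description of real hardware.
    """
    new_sp = sp - 2 * words
    if new_sp < RED_BOUNDARY:
        return new_sp, "red"
    if sp == LIMIT:            # the before-value equals the limit
        return new_sp, "yellow"
    return new_sp, None

cases = [
    ("MOV R0,-(SP) at the limit", 0o400, 1),
    ("MOV -(SP),-(SP) at the limit", 0o400, 2),
    ("push after RTI, already in the zone", 0o376, 1),
    ("MOV #370,SP; CLR -(SP)", 0o370, 1),
    ("push at the bottom of the zone", 0o340, 1),
]
for label, sp, words in cases:
    new_sp, trap = push(sp, words)
    print(f"{label}: SP {sp:o} -> {new_sp:o}, trap={trap}")
```

In this model the MOV #370,SP; CLR -(SP) case produces no trap, but that only restates the hypothesis. Whether a real machine agrees is exactly what the experiment would tell us.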