User:Anotheridiot/Error Handling
What is error handling?
Error handling is a fundamental component of an operating system, comparable to exception handling in application software. In an operating system, error-handling mechanisms include system-wide failure responses such as kernel panics, the Windows Blue Screen of Death (BSOD), and my own RSOD (Red Screen Of Death).
Why do I need error handling?
Error handling serves a critical role in maintaining system stability, diagnosing issues, and preventing data loss.
Operating systems encounter various types of errors, ranging from hardware failures to software faults, and an effective error-handling mechanism ensures that these issues are managed appropriately. If the fault is confined to a single thread, whether that thread runs in user space or kernel space, the best option is usually to kill the offending thread. If the fault lies within the kernel itself, you should stop everything by pausing the system, then start the kernel debugger if the problem looks fixable, or otherwise write a log file and shut down, if that is still an option at that point.
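The policy described above can be sketched as a small decision function. This is only an illustration: every name here (`fault_origin`, `fault_action`, `choose_action`) is hypothetical and not taken from any real kernel, and the final `ACTION_HALT` fallback for the case where even logging is impossible is an assumption the text does not spell out.

```c
#include <stdbool.h>

/* Hypothetical classification of where a fault came from. */
enum fault_origin {
    FAULT_USER_THREAD,    /* a thread running in user space */
    FAULT_KERNEL_THREAD,  /* an isolated thread running in kernel space */
    FAULT_KERNEL_CORE     /* the kernel itself is in a bad state */
};

/* Hypothetical responses the error handler can choose from. */
enum fault_action {
    ACTION_KILL_THREAD,       /* terminate only the offending thread */
    ACTION_PAUSE_AND_DEBUG,   /* pause everything, enter the kernel debugger */
    ACTION_LOG_AND_SHUTDOWN,  /* write a log file, then shut down */
    ACTION_HALT               /* assumed last resort: nothing else is possible */
};

/* Pick a response. `fixable` and `can_log` are assumed to be determined
 * elsewhere, for example by inspecting the fault type and the state of
 * the storage driver. */
enum fault_action choose_action(enum fault_origin origin, bool fixable, bool can_log)
{
    if (origin == FAULT_USER_THREAD || origin == FAULT_KERNEL_THREAD)
        return ACTION_KILL_THREAD;

    /* From here on the fault is within the kernel itself. */
    if (fixable)
        return ACTION_PAUSE_AND_DEBUG;

    return can_log ? ACTION_LOG_AND_SHUTDOWN : ACTION_HALT;
}
```

A fault handler could call `choose_action(FAULT_KERNEL_CORE, false, true)` and then dispatch on the result. Keeping the decision separate from the code that carries it out makes the policy easy to test outside the kernel.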
What should an error handler do?
An error handler can be implemented in many ways. At one extreme, it could simply shut everything down and report no information at all. At the other, it could give you everything, including a full stack trace, a kernel dump, and so on.
A user-friendly error handler might halt the system without shutting it down, and instead show the user something in between the two approaches above: an error code, a possible cause, and some human-readable diagnostic information.
A debug error handler (one that is used in a debug build) could do more logging, print a stack trace, produce a full kernel dump, list the running threads, and possibly give you a debug shell.
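The three styles of handler described in this section (silent, user-friendly, debug) can be sketched as a single report formatter with a verbosity level. This is a minimal sketch under stated assumptions: the names `report_level`, `error_info` and `format_report` are hypothetical, the message layout is invented for illustration, and the code uses the hosted C library's `snprintf`, which a real kernel would have to replace with its own formatting routine. Kernel dumps, thread lists and the debug shell are left out.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical verbosity levels matching the three handler styles. */
enum report_level { REPORT_SILENT, REPORT_USER, REPORT_DEBUG };

/* Hypothetical description of an error; a real kernel would fill
 * `frames` by walking the stack at the point of failure. */
struct error_info {
    unsigned code;                /* numeric error code */
    const char *cause;            /* human-readable possible cause */
    const unsigned long *frames;  /* return addresses for the stack trace */
    size_t frame_count;
};

/* Write a report into buf and return the number of characters written,
 * not counting the terminating NUL. Output is truncated to fit. */
size_t format_report(char *buf, size_t size,
                     const struct error_info *err, enum report_level level)
{
    if (size == 0)
        return 0;
    buf[0] = '\0';
    if (level == REPORT_SILENT)
        return 0;  /* report nothing at all */

    /* User-friendly part: error code plus a possible cause. */
    int n = snprintf(buf, size, "ERROR 0x%04X: %s\n", err->code, err->cause);
    if (n < 0)
        return 0;
    if ((size_t)n >= size)
        return size - 1;  /* truncated */
    size_t used = (size_t)n;

    /* Debug builds additionally append a stack trace. */
    if (level == REPORT_DEBUG) {
        for (size_t i = 0; i < err->frame_count; i++) {
            n = snprintf(buf + used, size - used, "  #%zu 0x%lx\n",
                         i, err->frames[i]);
            if (n < 0)
                break;
            if ((size_t)n >= size - used)
                return size - 1;  /* truncated */
            used += (size_t)n;
        }
    }
    return used;
}
```

A debug build might pass `REPORT_DEBUG` and a release build `REPORT_USER`, selected at compile time, so that both builds share the same handler code and differ only in how much they show.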