Finding hidden exceptions in your application


What are hidden errors and why do you need to find them?

Hidden exceptions can occur by design. For example, wrapping code in a try-catch block is an easy way to stop errors from reaching your users and to let the rest of the code carry on executing. In some situations this is required, for example when you are handling a return code from another service. In other cases, hiding these errors can be detrimental: you may have a problem that neither your users nor a traditional monitoring tool can see.
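To make the difference concrete, here is a minimal sketch in Python, where a try-except block plays the role of try-catch. The function names and the "store" logger name are hypothetical and chosen only for illustration. The first function hides the failure completely; the second returns the same safe default but leaves a trace in the logs.

```python
import logging

logger = logging.getLogger("store")  # hypothetical logger name


def item_at_hidden(items, index):
    """Swallow the error: neither the caller nor a monitoring tool sees it."""
    try:
        return items[index]
    except IndexError:
        return None  # the failure is hidden by design


def item_at_logged(items, index):
    """Return the same safe default, but record the exception in the logs."""
    try:
        return items[index]
    except IndexError:
        # logger.exception logs at ERROR level and appends the traceback
        logger.exception("index %d out of range for %d items", index, len(items))
        return None
```

Both functions behave identically from the user's point of view; only the second one produces a log line that a later search can find.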

How do you start finding hidden errors in your application?

Traditionally, developers have relied on debug logging to find hidden exceptions, or have simply added log statements that write to the application’s standard output to give extra context about what’s happening in the code. If you have an issue, you can add some general dump statements to see what’s going on at the time.
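As a rough illustration of this kind of debug logging, the sketch below uses Python's standard logging module to write dump-style statements to standard output. The function, its parameters, and the logger name are made up for the example.

```python
import logging
import sys

# Send DEBUG-level messages to standard output with a timestamp.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("store.checkout")  # hypothetical logger name


def apply_discount(price, percent):
    """Hypothetical business function with debug statements around it."""
    logger.debug("apply_discount called: price=%s percent=%s", price, percent)
    discounted = price * (100 - percent) / 100
    logger.debug("apply_discount result: %s", discounted)
    return discounted
```

Statements like these only help if someone reads the output at the right time, which is the weakness the next section describes.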

Manual log checking is a drain on resources.

Provided you know when the issue occurred, this approach works, but you need to actively monitor the application’s log files. That depends on someone remembering to keep checking the logs, and manual checks are not always done because priorities, like team members, can change. Using exception handlers such as try-catch to ignore errors in the code, or adding extra logs, can be helpful, but it can sometimes leave you without the complete picture. A log query language such as LogQL, which FusionReactor uses, lets you run full-text searches across all of the system’s logs. In the video, I run a query to search all of my application logs for the text “Exception.”

{ job=~"store-.+" } |= "Exception"

The result displays all of the hidden exceptions occurring in the background that we may not have been aware of. In the video example, you can see them occurring reasonably regularly. For instance, we can see an array index out of bounds exception fire within our application. If you click the error line, FusionReactor shows you all the logs surrounding that particular error message. This context gives you the insight you need to solve the issue. The video also demonstrates how you can go even further, to the exact error and the line it fires on.
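The idea of "find the matching line, then show the logs around it" can be sketched locally in a few lines of Python. This is only an analogue of what the query and the surrounding-logs view do, not how FusionReactor or LogQL implement it; the function name and the sample log lines are invented for the example.

```python
def find_with_context(lines, needle="Exception", context=2):
    """Return (line_number, surrounding_lines) for each line containing needle."""
    hits = []
    for i, line in enumerate(lines):
        # Case-sensitive substring match, like the |= line filter in the query
        if needle in line:
            start = max(0, i - context)
            hits.append((i + 1, lines[start:i + context + 1]))
    return hits


# Made-up sample log lines, for illustration only
sample = [
    "INFO  request started",
    "DEBUG loading basket",
    "ERROR java.lang.ArrayIndexOutOfBoundsException: Index 5 out of bounds for length 3",
    "INFO  request finished",
]

for line_number, surrounding in find_with_context(sample, context=1):
    print(f"match on line {line_number}:")
    for entry in surrounding:
        print("   ", entry)
```

Because the match is case-sensitive, searching for "exception" in lower case would not find the line above; the query in this article searches for "Exception" for the same reason.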

We can go further by configuring an alert that notifies us each time an exception is logged, so we know there is a hidden error to investigate.
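In FusionReactor the alert is configured in the product itself. Purely to illustrate the principle of "notify on every logged exception", here is a small sketch using a custom handler from Python's logging module. The class name, the logger name, and the use of a list as the notification target are assumptions for the example; a real setup would send an email or a chat message instead.

```python
import logging


class ExceptionAlertHandler(logging.Handler):
    """Call notify(text) for every log record that carries exception info."""

    def __init__(self, notify):
        super().__init__(level=logging.ERROR)
        self._notify = notify

    def emit(self, record):
        if record.exc_info:
            # format() returns the message followed by the traceback
            self._notify(self.format(record))


alerts = []  # stand-in for a real notification channel
logger = logging.getLogger("store.alerts")  # hypothetical logger name
logger.addHandler(ExceptionAlertHandler(alerts.append))

try:
    [1, 2, 3][5]
except IndexError:
    logger.exception("basket lookup failed")
```

After this runs, `alerts` holds one entry containing the message and the IndexError traceback, even though the exception itself was caught and never reached the user.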

How to fix the hidden error

So we have seen that errors can happen without you being aware of them, and we have shown how you can quickly find them using log queries. Once you have found a hidden error, you have several options for fixing it:

  • First of all, you could enhance the code to throw the error. Once the error starts firing in your APM, you can use automatic root cause analysis, such as FusionReactor’s Event Snapshot, to give you even greater context.
  • Alternatively, you could enhance your existing logging and add more specific debug logs to capture the problem when it happens.
  • You could connect the debugger to your code and see what’s happening.
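The first two options can be combined, as in the following sketch: the handler adds a specific debug log and then rethrows the error instead of swallowing it, so that monitoring can see it. `BasketError`, `item_at`, and the logger name are hypothetical names used only for this illustration.

```python
import logging

logger = logging.getLogger("store.basket")  # hypothetical logger name


class BasketError(Exception):
    """Hypothetical application-level error."""


def item_at(items, index):
    try:
        return items[index]
    except IndexError as exc:
        # More specific debug logging: record the values that caused the failure
        logger.debug("item_at failed: index=%d size=%d", index, len(items))
        # Throw the error instead of hiding it; "from exc" keeps the original cause
        raise BasketError(f"no item at index {index}") from exc
```

Callers now have to deal with `BasketError` explicitly, and the original IndexError stays attached as its cause for anyone inspecting the traceback.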

But without the logs, you wouldn’t know the error was happening in the first place.