Unmasking unstructured logs with FusionReactor Cloud Logging LogQL pattern parser


Log data is an untapped or mismanaged resource for many organizations. Even if they can extract valuable insights from structured logs through their log management system, unstructured logs pose the most significant challenge.

However, FusionReactor Cloud’s logging feature makes it easier to write LogQL queries that access and parse unstructured log formats. Its pattern parser can also parse unstructured log data considerably faster than the conventional regex parser. Let’s unmask more!

Log-parsing challenges

When log volumes are large, parsing converts log lines into simple data fields that you can query with LogQL. Writing regex-based parsing queries can be challenging and time-consuming, unlike JSON and Logfmt parsing, which is easy to use and fast.

With LogQL, performing a full text-based search to analyze unstructured logs becomes simple. FusionReactor logging comes with LogQL parsers that manage JSON, regex, and Logfmt.

For instance, when extracting labels and values from NGINX logs, finding the rate of requests by status and method can be challenging. Consider the regex query highlighted within this example:

sum by (method, status) (rate({stream="stdout", container="nginx"}
| regexp "^(\\S+) (?P<user_identifier>\\S+) (?P<user>\\S+) \\[(.*)\\] \"(?P<method>\\S+) (?P<path>\\S+) HTTP/(?P<http_version>\\d+\\.\\d+)\" (?P<status>\\d+) (?P<bytes_sent>\\d+|-)"
[1m]))
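To make the cost of the regex approach concrete, here is a minimal Python sketch that applies the same regular expression to access-log lines and counts requests by method and status. It only illustrates what the regex extracts and is not how FusionReactor evaluates the query. The first sample line uses the values from the NGINX example later in this article; the second line, and the names NGINX_RE, SAMPLE_LINES, and count_by_method_status, are hypothetical.

```python
import re
from collections import Counter

# Python form of the regex in the query above. A raw string needs only
# single backslashes, unlike the double-quoted LogQL string.
NGINX_RE = re.compile(
    r'^(\S+) (?P<user_identifier>\S+) (?P<user>\S+) \[(.*)\] '
    r'"(?P<method>\S+) (?P<path>\S+) HTTP/(?P<http_version>\d+\.\d+)" '
    r'(?P<status>\d+) (?P<bytes_sent>\d+|-)'
)

# Illustrative log lines. The second one is made up for this sketch.
SAMPLE_LINES = [
    '203.0.113.0 - - [08/Nov/2021:19:12:04 +0000] "GET /healthz HTTP/1.1" '
    '200 15 "-" "GoogleHC/1.0" "-" "-" "-" "-"',
    '203.0.113.7 - - [08/Nov/2021:19:12:05 +0000] "POST /login HTTP/1.1" '
    '404 0 "-" "curl/7.79.1" "-" "-" "-" "-"',
    'this line is not an access log entry',
]


def count_by_method_status(lines):
    """Count lines per (method, status); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = NGINX_RE.match(line)
        if match:
            counts[(match.group("method"), match.group("status"))] += 1
    return counts


if __name__ == "__main__":
    for (method, status), n in count_by_method_status(SAMPLE_LINES).items():
        print(method, status, n)
```

Every field has to be spelled out as a named group even though only method and status are used, which is the overhead the pattern parser removes.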

Using FusionReactor Pattern Parser

The latest FusionReactor logging feature comes with a pattern parser that is simple to use and makes it easy to extract insights from unstructured logs. Pattern expressions look very different from regular expressions: they are much shorter to write, and the pattern parser also processes logs faster than the other parsers.

To illustrate this capability, here is the same query written with the FusionReactor logging pattern parser:

sum by (method, status) (rate({stream="stdout", container="nginx"}
| pattern `<_> - - <_> "<method> <_> <_>" <status> <_> <_> "<_>" <_>`
[$__interval]))

Pattern parser syntax and semantics

To invoke the pattern parser, add one of the following expressions to a LogQL query:

| pattern "<pattern-expression>"

or

| pattern `<pattern-expression>`

<pattern-expression> defines the structure the log line is expected to have. It is composed of captures and literals.

A capture is a field name delimited by the < and > characters. In the example above, <status> defines a capture named status, while <_> is an unnamed capture that skips the matched content. Literals are any other characters, such as spaces or quotation marks, and must appear in the log line exactly as written.
Each capture is matched from the start of the line, or from the previous literal, up to the next literal or the end of the line. This makes it easy to see when a capture does not match, and if one does not match, the parser immediately stops extracting data from the log line.
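The capture-and-literal semantics described above can be approximated in a few lines of Python. This is a sketch for building intuition, not FusionReactor's implementation: it assumes that captures behave like non-greedy wildcards bounded by the neighbouring literals, that matching is anchored at the start of the line, and that a line which does not match yields no fields at all. The names pattern_to_regex and parse_line are hypothetical helpers.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern expression into a roughly equivalent regex.

    Assumption: each capture expands to a non-greedy wildcard, so it
    stops at the next literal (or at the end of the line).
    """
    # Splitting on a capturing group keeps the capture names:
    # even indexes hold literals, odd indexes hold capture names.
    pieces = re.split(r"<([A-Za-z_]\w*)>", pattern)
    regex = ""
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            regex += re.escape(piece)          # literal text
        elif piece == "_":
            regex += ".*?"                     # unnamed capture, discarded
        else:
            regex += "(?P<%s>.*?)" % piece     # named capture, kept
    if pieces[-1] == "":
        regex += "$"  # a trailing capture runs to the end of the line
    return re.compile(regex)


def parse_line(pattern, line):
    """Return the named captures, or an empty dict if the line does not match."""
    match = pattern_to_regex(pattern).match(line)
    return match.groupdict() if match else {}


NGINX_PATTERN = '<_> - - <_> "<method> <_> <_>" <status> <_> <_> "<_>" <_>'
NGINX_LINE = (
    '203.0.113.0 - - [08/Nov/2021:19:12:04 +0000] "GET /healthz HTTP/1.1" '
    '200 15 "-" "GoogleHC/1.0" "-" "-" "-" "-"'
)

if __name__ == "__main__":
    print(parse_line(NGINX_PATTERN, NGINX_LINE))
    # {'method': 'GET', 'status': '200'}
    print(parse_line(NGINX_PATTERN, "not an access log line"))
    # {}
```

Only the two named captures come back as fields. Everything matched by <_> is consumed and dropped, which is why the pattern expression stays so much shorter than the regex version.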

Understanding the Pattern parser with examples

The following example applies the pattern parser to an NGINX log line. The table below dissects the log line against the pattern expression:

NGINX log line field    NGINX sample                    Pattern expression
$remote_addr            203.0.113.0                     <_>
-                       -                               -
$remote_user            -                               -
[$time_local]           [08/Nov/2021:19:12:04 +0000]    <_>
"$request"              "GET /healthz HTTP/1.1"         "<method> <_> <_>"
$status                 200                             <status>
$bytes_sent             15                              <_>
"$http_referer"         "-"                             <_>
"$http_user_agent"      "GoogleHC/1.0"                  "<_>"
(remaining fields)      "-" "-" "-" "-"                 <_>

Kubernetes example

Let’s look at another example, this time in a Kubernetes environment. Here the pattern parser is applied to Envoy proxy logs, and the query returns the 99th percentile latency per path and method. The final division by 1e3 converts the result to seconds.

quantile_over_time(0.99, {container="envoy"} | pattern `[<_>] "<method> <path> <_>" <_> <_> <_> <_> <latency>` | unwrap latency [$__interval]) by (method, path) / 1e3

Moving Forward

FusionReactor’s log parsing capabilities make it easier and faster to parse logs across multiple environments.

How to create fast LogQL queries to filter terabytes of logs per second with FusionReactor

Performance matters when retrieving or analyzing data. Fast retrieval depends on efficient queries, which is why it is critical to write queries that can filter terabytes of logs quickly.

FusionReactor with LogQL makes it easier to create fast queries to filter terabytes of logs per second. This article will break down the key concepts and give you simple tips to create fast queries in seconds.