![gopherbadger-tag-do-not-edit]
Parser is in charge of turning raw log lines into objects that can be manipulated by heuristics. Parsing has several stages represented by directories on config/stage. The alphabetical order dictates the order in which the stages/parsers are processed.
The runtime representation of a line being parsed (or an overflow) is an Event. It has fields that the user can manipulate :
- Parsed : a string dict containing parser outputs
- Meta : a string dict containing meta information about the event
- Line : a raw line representation
- Overflow : a representation of the overflow if applicable
The Event structure goes through the stages, being altered with each parsing step. It's the same object that will be later poured into buckets.
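As a rough illustration of the fields listed above, here is a minimal Go sketch of what such an object could look like. The `Line` sub-fields (`Raw`, `Src`, `Labels`) are taken from the expressions used elsewhere in this document; the exact types, the `newEvent` helper and the opaque `Overflow` field are assumptions made for the example, not the actual definitions in runtime.go.

```go
package main

import "fmt"

// Line is a hypothetical stand-in for the raw line representation.
type Line struct {
	Raw    string            // the unparsed log line
	Src    string            // where the line was read from
	Labels map[string]string // labels attached to the line
}

// Event is a simplified sketch of the runtime object described above.
type Event struct {
	Parsed   map[string]string // parser outputs
	Meta     map[string]string // meta information about the event
	Line     Line              // raw line representation
	Overflow interface{}       // overflow representation, left opaque here
}

// newEvent (hypothetical helper) wraps a raw line into an Event with
// empty Parsed and Meta dicts.
func newEvent(raw, src string) Event {
	return Event{
		Parsed: map[string]string{},
		Meta:   map[string]string{},
		Line:   Line{Raw: raw, Src: src, Labels: map[string]string{}},
	}
}

func main() {
	evt := newEvent("xxheader VALUE1 trailing stuff", "/var/log/foobar")
	evt.Line.Labels["type"] = "testlog"
	// A parsing step would fill Parsed; a static would fill Meta.
	evt.Parsed["extracted_value"] = "VALUE1"
	evt.Meta["log_type"] = "parsed_testlog"
	fmt.Println(evt.Parsed["extracted_value"], evt.Meta["log_type"])
}
```

The same value is handed from one parsing step to the next, which is why the dicts are created once and then only mutated.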
A parser configuration is a Node object that can contain grok patterns and enrichment instructions.
For example :
```yaml
filter: "evt.Line.Labels.type == 'testlog'"
debug: true
onsuccess: next_stage
name: tests/base-grok
pattern_syntax:
  MYCAP: ".*"
nodes:
  - grok:
      pattern: ^xxheader %{MYCAP:extracted_value} trailing stuff$
      apply_on: Line.Raw
statics:
  - meta: log_type
    value: parsed_testlog
```

- optional name: if present and Prometheus or profiling are activated, stats will be generated for this node.
filter: "Line.Src endsWith '/foobar'"
- optional filter: an expression that will be evaluated against the runtime of a line (Event)
  - if the filter is present and returns false, the node is not evaluated
  - if the filter is absent, or present and returns true, the node is evaluated
debug: true
- optional debug: a bool that enables debug output for the node (applies at runtime and during configuration parsing)
onsuccess: next_stage|continue
- mandatory: indicates the behavior to follow if the node succeeds. next_stage makes the line go to the next stage, while continue keeps processing the current stage.
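The interaction between stage ordering and onsuccess can be sketched as follows. This is a simplified model, not the actual implementation: the `node` type, the `runStages` function and the stage names are hypothetical, a plain predicate stands in for filter and grok, and the sketch simply moves on to the next stage when no node succeeds, a case this document does not describe.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical, reduced parser node: a name, a predicate standing
// in for filter+grok, and the onsuccess behavior.
type node struct {
	name      string
	matches   func(line string) bool
	onSuccess string // "next_stage" or "continue"
}

// runStages walks the stages in alphabetical order and returns the nodes
// that succeeded, in evaluation order.
func runStages(stages map[string][]node, line string) []string {
	names := make([]string, 0, len(stages))
	for name := range stages {
		names = append(names, name)
	}
	sort.Strings(names) // alphabetical order dictates processing order

	var succeeded []string
	for _, stage := range names {
		for _, n := range stages[stage] {
			if !n.matches(line) {
				continue // node failed: try the next node of this stage
			}
			succeeded = append(succeeded, stage+"/"+n.name)
			if n.onSuccess == "next_stage" {
				break // leave this stage and move on to the next one
			}
			// "continue": keep evaluating the remaining nodes of this stage
		}
	}
	return succeeded
}

func main() {
	always := func(string) bool { return true }
	stages := map[string][]node{
		"s01-parse": {
			{"a", always, "next_stage"},
			{"b", always, "continue"}, // not reached: "a" already moved the line on
		},
		"s00-raw": {
			{"c", always, "continue"},
			{"d", always, "continue"},
		},
	}
	fmt.Println(runStages(stages, "some line"))
}
```

With these inputs, both nodes of the first stage run because they use continue, while the second stage stops after its first successful node.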
```yaml
statics:
  - meta: service
    value: tcp
  - meta: source_ip
    expression: "Event['source_ip']"
  - parsed: "new_connection"
    expression: "Event['tcpflags'] contains 'S' ? 'true' : 'false'"
  - target: Parsed.this_is_a_test
    value: foobar
```

Statics apply when a node is considered successful, and are used to alter the Event structure. An empty node, a node with a grok pattern that succeeded, or an enrichment directive that worked are successful nodes. Statics can :
- meta: add/alter an entry in the Meta dict
- parsed: add/alter an entry in the Parsed dict
- target: indicate a destination field by name, such as Meta.my_key

The source of data can be :
- value: a static value
- expression: the result of an expression
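To make the three destinations concrete, here is a small sketch of how statics with a static value could be applied to an event. The `Static` struct and `applyStatics` function are hypothetical names for this example, only the Meta and Parsed dicts are supported as targets, and expression sources are deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a reduced stand-in for the runtime event: only the two string dicts.
type Event struct {
	Parsed map[string]string
	Meta   map[string]string
}

// Static is a hypothetical, reduced statics entry. One of Meta, Parsed or
// Target names the destination; Value is the static source of data.
type Static struct {
	Meta   string
	Parsed string
	Target string
	Value  string
}

// applyStatics writes each static into the event and reports bad destinations.
func applyStatics(evt *Event, statics []Static) error {
	for _, s := range statics {
		switch {
		case s.Meta != "":
			evt.Meta[s.Meta] = s.Value
		case s.Parsed != "":
			evt.Parsed[s.Parsed] = s.Value
		case s.Target != "":
			// Target is expected to look like "Meta.my_key" or "Parsed.my_key".
			dict, key, found := strings.Cut(s.Target, ".")
			if !found {
				return fmt.Errorf("target %q is not of the form Dict.key", s.Target)
			}
			switch dict {
			case "Meta":
				evt.Meta[key] = s.Value
			case "Parsed":
				evt.Parsed[key] = s.Value
			default:
				return fmt.Errorf("unsupported target dict %q", dict)
			}
		default:
			return fmt.Errorf("static has no destination")
		}
	}
	return nil
}

func main() {
	evt := &Event{Parsed: map[string]string{}, Meta: map[string]string{}}
	err := applyStatics(evt, []Static{
		{Meta: "service", Value: "tcp"},
		{Target: "Parsed.this_is_a_test", Value: "foobar"},
	})
	fmt.Println(evt.Meta, evt.Parsed, err)
}
```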
Grok patterns are used to parse one field of Event into one or several others :
```yaml
grok:
  name: "TCPDUMP_OUTPUT"
  apply_on: message
```

name is the name of a pattern loaded from patterns/. Base patterns can be seen on the repo : https://github.com/crowdsecurity/grokky/blob/master/base.go
```yaml
grok:
  pattern: "^%{GREEDYDATA:request}\\?%{GREEDYDATA:http_args}$"
  apply_on: request
```

pattern is a valid grok pattern, optionally accompanied by an apply_on that indicates which field it should be applied to.
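Conceptually, a grok pattern is a regular expression with named captures. The sketch below expands the pattern above by hand into Go's standard `regexp` syntax, taking GREEDYDATA as `.*`; it does not use the grokky library, and the `capture` helper is a hypothetical name for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// requestArgs is the hand-expanded form of
// ^%{GREEDYDATA:request}\?%{GREEDYDATA:http_args}$ with GREEDYDATA as ".*".
var requestArgs = regexp.MustCompile(`^(?P<request>.*)\?(?P<http_args>.*)$`)

// capture returns the named groups of re matched against s, or nil when
// the pattern does not match.
func capture(re *regexp.Regexp, s string) map[string]string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	out := map[string]string{}
	for i, name := range re.SubexpNames() {
		if i > 0 && name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	// Mimics apply_on: request, with the captures written back into Parsed.
	parsed := map[string]string{"request": "/index.php?id=1&debug=0"}
	for k, v := range capture(requestArgs, parsed["request"]) {
		parsed[k] = v
	}
	fmt.Println(parsed["request"], parsed["http_args"])
}
```

Here one field (request) is split into two: request now holds the path, and http_args holds the query string.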
Declared at the node level, pattern_syntax is a list of subgroks (named sub-patterns) that the node's grok patterns can then reference.
```yaml
pattern_syntax:
  DIR: "^.*/"
  FILE: "[^/].*$"
```

The Enrichment mechanism is exposed via statics :
```yaml
statics:
  - method: GeoIpCity
    expression: Meta.source_ip
  - meta: IsoCode
    expression: Enriched.IsoCode
  - meta: IsInEU
    expression: Enriched.IsInEU
```

The GeoIpCity method is called with the value of Meta.source_ip. Enrichment plugins can output one or more key:values in the Enriched map, and it's up to the user to copy the relevant values to Meta or such.
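The two-step flow (call a method, then copy what you need out of Enriched) might look like the sketch below. Everything here is illustrative: the `enricher` signature, the `enrich` and `copyToMeta` helpers are hypothetical, and `stubGeoIpCity` is a stub returning canned values for one documentation address, not a real GeoIP lookup.

```go
package main

import "fmt"

// Event is a reduced stand-in holding only what enrichment touches.
type Event struct {
	Meta     map[string]string
	Enriched map[string]string
}

// enricher is a hypothetical plugin signature: it receives one field value
// and returns the key:values to store in Enriched.
type enricher func(value string) map[string]string

// stubGeoIpCity is NOT a real GeoIP lookup: it returns made-up values for a
// single documentation address and nothing for anything else.
func stubGeoIpCity(ip string) map[string]string {
	if ip == "192.0.2.1" {
		return map[string]string{"IsoCode": "FR", "IsInEU": "true"}
	}
	return nil
}

// enrich calls the method with the value of a Meta field and merges the
// result into Enriched.
func enrich(evt *Event, method enricher, metaField string) {
	for k, v := range method(evt.Meta[metaField]) {
		evt.Enriched[k] = v
	}
}

// copyToMeta copies the selected Enriched keys to Meta, as the meta statics
// in the example above do.
func copyToMeta(evt *Event, keys ...string) {
	for _, k := range keys {
		if v, ok := evt.Enriched[k]; ok {
			evt.Meta[k] = v
		}
	}
}

func main() {
	evt := &Event{
		Meta:     map[string]string{"source_ip": "192.0.2.1"},
		Enriched: map[string]string{},
	}
	enrich(evt, stubGeoIpCity, "source_ip")
	copyToMeta(evt, "IsoCode", "IsInEU")
	fmt.Println(evt.Meta["IsoCode"], evt.Meta["IsInEU"])
}
```

Keeping Enriched separate from Meta means a plugin can return many keys without polluting the event; only the keys explicitly copied end up in Meta.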
The Node object also accepts a nodes entry, which is a list of Node entries, allowing you to build trees.
```yaml
filter: "Event['program'] == 'nginx'" #A
nodes: #A'
  - grok: #B
      name: "NGINXACCESS"
    # these statics will apply only if the above grok pattern matched
    statics: #B'
      - meta: log_type
        value: "http_access-log"
  - grok: #C
      name: "NGINXERROR"
    statics:
      - meta: log_type
        value: "http_error-log"
statics: #D
  - meta: service
    value: http
```

The evaluation process of a node is as follows:
- apply the filter (A); if it doesn't match, exit
- iterate over the list of nodes (A') and apply the node process to each
- if a grok entry is present, process it
  - if the grok entry returned data, apply the local statics of the node (if the grok 'B' was successful, apply B' statics)
- if any of the nodes or the grok was successful, apply the statics (D)
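The steps above can be sketched as a short recursive function. This is a simplified model under stated assumptions, not the actual ProcessNode: the reduced `Node` type, the `process`, `contains` and `nginxTree` functions are hypothetical, groks are replaced by substring checks on a `message` field, and statics only write to Meta.

```go
package main

import (
	"fmt"
	"strings"
)

type Event struct {
	Parsed map[string]string
	Meta   map[string]string
}

// Node is a hypothetical, reduced node: nil Filter means "no filter",
// nil Grok means "no grok entry", Statics maps Meta keys to values.
type Node struct {
	Filter  func(*Event) bool
	Grok    func(*Event) bool // returns true if the grok returned data
	Statics map[string]string
	Nodes   []*Node
}

// process evaluates n against evt and reports whether the node succeeded.
func process(n *Node, evt *Event) bool {
	if n.Filter != nil && !n.Filter(evt) {
		return false // the filter does not match: exit
	}
	// A node with neither grok nor children is an empty, successful node.
	success := n.Grok == nil && len(n.Nodes) == 0
	for _, child := range n.Nodes {
		if process(child, evt) {
			success = true
		}
	}
	if n.Grok != nil && n.Grok(evt) {
		success = true
	}
	if success {
		for k, v := range n.Statics {
			evt.Meta[k] = v
		}
	}
	return success
}

// contains is a stand-in for a grok: it "matches" when message holds substr.
func contains(substr string) func(*Event) bool {
	return func(e *Event) bool { return strings.Contains(e.Parsed["message"], substr) }
}

// nginxTree mirrors the shape of the YAML tree above (A, B, C, D).
func nginxTree() *Node {
	return &Node{
		Filter: func(e *Event) bool { return e.Parsed["program"] == "nginx" },
		Nodes: []*Node{
			{Grok: contains("GET "), Statics: map[string]string{"log_type": "http_access-log"}},
			{Grok: contains("[error]"), Statics: map[string]string{"log_type": "http_error-log"}},
		},
		Statics: map[string]string{"service": "http"},
	}
}

func main() {
	evt := &Event{
		Parsed: map[string]string{"program": "nginx", "message": "GET /index.html"},
		Meta:   map[string]string{},
	}
	fmt.Println(process(nginxTree(), evt), evt.Meta)
}
```

Because the parent's statics are applied only after its children have been processed, the service entry is set only when at least one of the child groks matched.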
Main structs :
- Node (config.go) : the runtime representation of parser configuration
- Event (runtime.go) : the runtime representation of the line being parsed
Main funcs :
- CompileNode : turns YAML into a runtime-ready tree (Node)
- ProcessNode : processes the raw line against the parser tree, and produces ready-for-buckets data