Commit 21178de

pipeline: filters: lookup added new filter
New documentation page outlines description, example configuration and CSV handling for the new LookUp filter.

Signed-off-by: Oleg Mukhin <oleg.v.mukhin@gmail.com>
1 parent 3002555 commit 21178de

1 file changed

+134
-0
lines changed

pipeline/filters/lookup.md

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
# Lookup

The Lookup plugin looks up the value of a record key in a specified CSV file and, if a match is found, adds the corresponding value from the CSV to the record as a new key-value pair.

## Configuration parameters

The plugin supports the following configuration parameters:

| Key | Description | Default |
| :-- | :---------- | :------ |
| `file` | The CSV file that Fluent Bit uses as a lookup table. The file should contain two columns (key and value), with the first row as an optional header that is skipped. Supports quoted fields and escaped quotes. | _none_ |
| `lookup_key` | The key in the input record whose value is looked up in the first column of the CSV file. Supports [record accessor](../../administration/configuring-fluent-bit/record-accessor). | _none_ |
| `result_key` | The name of the key added to the output record when a match is found. Its value is taken from the second column of the matched CSV row. | _none_ |
| `ignore_case` | Ignore case when matching the lookup key against the CSV keys. | `false` |

## Example configuration

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
parsers:
  - name: json
    format: json

pipeline:
  inputs:
    - name: tail
      tag: test
      path: devices.log
      read_from_head: true
      parser: json

  filters:
    - name: lookup
      match: test
      file: device-bu.csv
      lookup_key: $hostname
      result_key: business_line
      ignore_case: true

  outputs:
    - name: stdout
      match: test
```

{% endtab %}
{% tab title="fluent-bit.conf" %}

```text
[PARSER]
    Name   json
    Format json

[INPUT]
    Name           tail
    Tag            test
    Path           devices.log
    Read_from_head On
    Parser         json

[FILTER]
    Name        lookup
    Match       test
    File        device-bu.csv
    Lookup_key  $hostname
    Result_key  business_line
    Ignore_case On

[OUTPUT]
    Name  stdout
    Match test
```

{% endtab %}
{% endtabs %}

The preceding configuration reads log records from `devices.log`, which contains the following values for device hostnames:

```text
{"hostname": "server-prod-001"}
{"hostname": "Server-Prod-001"}
{"hostname": "db-test-abc"}
{"hostname": 123}
{"hostname": true}
{"hostname": " host with space "}
{"hostname": "quoted \"host\""}
{"hostname": "unknown-host"}
{}
{"hostname": [1,2,3]}
{"hostname": {"sub": "val"}}
{"hostname": " "}
```

The filter uses the value of the `hostname` field (set as the `lookup_key`) to find matching values in the first column of the `device-bu.csv` CSV file:

```text
hostname,business_line
server-prod-001,Finance
db-test-abc,Engineering
db-test-abc,Marketing
web-frontend-xyz,Marketing
app-backend-123,Operations
"legacy-system true","Legacy IT"
" host with space ","Infrastructure"
"quoted ""host""", "R&D"
no-match-host,Should Not Appear
```

Where a match is found, the filter adds a new key (named by the `result_key` parameter) whose value comes from the second column of the matched CSV row.

For the preceding configuration, the following output can be expected. Matching is case-insensitive because `ignore_case` is set to `true`. In this example, `db-test-abc` appears twice in the CSV file and the value from the later row (`Marketing`) is used:

```text
{"hostname"=>"server-prod-001", "business_line"=>"Finance"}
{"hostname"=>"Server-Prod-001", "business_line"=>"Finance"}
{"hostname"=>"db-test-abc", "business_line"=>"Marketing"}
{"hostname"=>123}
{"hostname"=>true}
{"hostname"=>" host with space ", "business_line"=>"Infrastructure"}
{"hostname"=>"quoted "host"", "business_line"=>"R&D"}
{"hostname"=>"unknown-host"}
{}
{"hostname"=>[1, 2, 3]}
{"hostname"=>{"sub"=>"val"}}
```

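The matching behavior in this example can be sketched in Python. This is only an illustration and not the plugin's implementation: `apply_lookup` and `TABLE` are hypothetical names, the lookup key is treated as a plain top-level key rather than a record accessor expression, and skipping non-string values is a simplifying assumption of this sketch (the page doesn't state how the filter handles them).

```python
def apply_lookup(record, table, lookup_key, result_key, ignore_case=False):
    """Return a copy of record, with result_key added when lookup_key matches."""
    value = record.get(lookup_key)
    # Assumption of this sketch: only string values are compared.
    if not isinstance(value, str):
        return dict(record)
    candidate = value.lower() if ignore_case else value
    if candidate in table:
        # Match found: add the looked-up value under result_key.
        return {**record, result_key: table[candidate]}
    # No match: the record passes through unchanged.
    return dict(record)


# Hypothetical in-memory table; keys are lowercased because ignore_case is on.
TABLE = {"server-prod-001": "Finance", "db-test-abc": "Marketing"}

records = [
    {"hostname": "Server-Prod-001"},
    {"hostname": "unknown-host"},
    {"hostname": 123},
    {},
]
enriched = [
    apply_lookup(r, TABLE, "hostname", "business_line", ignore_case=True)
    for r in records
]
```

In this sketch, only the first record gains a `business_line` key. The other three records are returned unchanged, which mirrors the pass-through rows in the example output.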
## CSV import

The CSV file is used to create an in-memory key-value lookup table. The first column of the CSV is always used as the key, and the second column is used as the value. All other columns are ignored.

This filter is intended for static datasets. The CSV file is loaded once when Fluent Bit starts and isn't reloaded.

Multiline values in the CSV file aren't currently supported.
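
As a rough illustration of the import described in this section, the following Python sketch builds a key-value table from CSV text using the standard `csv` module. It isn't the plugin's code: `load_lookup_table` and `SAMPLE` are hypothetical names, and the sketch assumes that the first row is always a header, that spaces after a comma are skipped, and that a repeated key keeps the value from the later row (which is consistent with the `db-test-abc` row in the example output).

```python
import csv
import io


def load_lookup_table(csv_text, ignore_case=False):
    """Build a dict from the first two columns of CSV text."""
    table = {}
    # skipinitialspace tolerates a space after the comma, as in: "a", "b"
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for index, row in enumerate(reader):
        if index == 0:
            continue  # assumption: the first row is a header
        if len(row) < 2:
            continue  # blank or single-column rows carry no value
        key, value = row[0], row[1]  # any further columns are ignored
        if ignore_case:
            key = key.lower()
        table[key] = value  # a later duplicate key overwrites an earlier one
    return table


SAMPLE = '''hostname,business_line
server-prod-001,Finance
db-test-abc,Engineering
db-test-abc,Marketing
" host with space ","Infrastructure"
"quoted ""host""", "R&D"
'''

table = load_lookup_table(SAMPLE, ignore_case=True)
```

Quoted fields keep their inner whitespace, and doubled quotes inside a quoted field are read as a single literal quote, so the last row is stored under the key `quoted "host"`.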
