Graylog is a great tool for visualizing and analyzing what’s happening in your backend systems.
Production log files are usually too noisy to read by eye: running tail -f on a busy log shows new entries arriving faster than anyone can follow. Moreover, logs normally live on different hosts, so you may need multiple terminals open just to ‘tail’ them all.
Traditionally, you would copy the logs to one location and run text analysis on them (sed and awk) to extract useful information. With Graylog (or ELK or Splunk), centralizing and searching logs is much easier.
Create a Graylog Input
In Graylog, click on ‘System’ -> ‘Inputs’, then select ‘Beats’ and click on ‘Launch new input’.
In the dialog, give it a meaningful title. I also recommend checking the last checkbox (Do not add Beats type as prefix), because the input will then work with the default ‘Source’ dashboard without any field changes.
After clicking on ‘Save’, your Graylog is ready to receive logs via the Beats input.
Install Filebeat on your Nginx Server
Filebeat is an agent that ships logs stored on the filesystem to a log server. Download and install the RPM:
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.2.0-x86_64.rpm
rpm -ivh filebeat-8.2.0-x86_64.rpm
Modify the Filebeat config in /etc/filebeat/filebeat.yml and add the following at the top, replacing {{ ansible_hostname }} (an Ansible template variable) with the name you want the source to have. Without this chunk, the source will appear as ‘unknown’ in Graylog. Reference:
https://community.graylog.org/t/filebeat-linux-source-unknown/10793
# ============================== Filebeat inputs ===============================
# Needed for Graylog
fields_under_root: true
fields.collector_node_id: {{ ansible_hostname }}
fields.source: {{ ansible_hostname }}
fields.gl2_source_collector: {{ ansible_hostname }}
# /Needed for Graylog
filebeat.inputs:
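If you are not deploying the file through Ansible, the placeholder has to be filled in by hand. As a minimal sketch, assuming the machine's own hostname is an acceptable source name, the substitution could look like this (render_graylog_fields is a hypothetical helper, not part of Filebeat or Graylog):

```python
import socket

# The Graylog-related settings from the snippet above, with the Ansible
# placeholder kept as a literal marker to be substituted.
TEMPLATE = """\
fields_under_root: true
fields.collector_node_id: {{ ansible_hostname }}
fields.source: {{ ansible_hostname }}
fields.gl2_source_collector: {{ ansible_hostname }}
"""


def render_graylog_fields(source_name: str) -> str:
    """Return the settings with the placeholder replaced by source_name."""
    return TEMPLATE.replace("{{ ansible_hostname }}", source_name)


if __name__ == "__main__":
    # Assumption: the local hostname is what should show up as the source.
    print(render_graylog_fields(socket.gethostname()))
```

The printed lines can then be pasted at the top of filebeat.yml.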
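Under filebeat.inputs, modify the paths. If the input is still set to enabled: false, as in the stock config, change it to true as well:
  paths:
    - /var/log/nginx/*.log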
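To preview which files such a pattern would pick up, here is a rough sketch using Python's glob module. It only approximates Filebeat's own glob matching, and files_matching is a hypothetical helper name:

```python
import glob
from typing import List


def files_matching(pattern: str) -> List[str]:
    """Return the paths that match a shell-style glob pattern, sorted."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Same pattern as in the Filebeat paths setting above.
    for path in files_matching("/var/log/nginx/*.log"):
        print(path)
```

On a typical Nginx install this pattern would likely match error.log as well as access.log. Error log lines use a different format, so the access log Grok pattern later in this post will not split them into fields.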
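Modify the output under output.logstash. Filebeat allows only one output to be enabled at a time, so comment out output.elasticsearch if it is still active: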
# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["my-graylog-hostname:5044"]
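Before starting the service, it can be worth confirming that the host and port given in hosts are actually reachable from the Nginx server. The following is a minimal sketch that only checks whether a TCP connection can be opened. It does not speak the Beats protocol, and beats_port_reachable is a hypothetical helper name:

```python
import socket


def beats_port_reachable(host: str, port: int = 5044, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

For example, beats_port_reachable("my-graylog-hostname") should return True once the Beats input is running and no firewall blocks port 5044.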
Now start and enable the Filebeat service:
systemctl start filebeat
systemctl enable filebeat
Grok Pattern for Nginx
Now in Graylog, under ‘Search’, you should see logs coming in. They appear as plain text strings, that is, they are not yet split into fields. To split them, you need a Grok pattern.
Under ‘System’ -> ‘Grok Patterns’, click on ‘Create pattern’. Name it ‘NGINX’ and use the following pattern:
%{IPORHOST:clientip} %{HTTPDUSER:ident} %{USER:auth} \[%{HTTPDATE:timestamp;date;dd/MMM/yyyy:HH:mm:ss Z}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}
You can test it with some sample data from the raw messages you received.
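To get a feel for which fields the pattern produces, here is a simplified Python regular expression that approximates it. This is a sketch, not Graylog's Grok engine: the sub-patterns are looser than the real Grok definitions, and the sample line is made up for illustration. It assumes an access log format that ends with three quoted strings (referrer, user agent, forwarded-for), as the pattern above expects.

```python
import re
from datetime import datetime

# Loose approximation of the NGINX Grok pattern, using named groups.
NGINX_LINE = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?:(?P<verb>\w+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?'
    r'|(?P<rawrequest>[^"]*))" '
    r'(?P<response>\d+) (?:(?P<bytes>\d+)|-) '
    r'(?P<referrer>"[^"]*") (?P<agent>"[^"]*") (?P<forwarder>"[^"]*")'
)


def parse_line(line):
    """Split one access log line into a dict of fields, or return None."""
    match = NGINX_LINE.match(line)
    if match is None:
        return None
    # Drop groups that did not take part in the match (e.g. rawrequest).
    fields = {k: v for k, v in match.groupdict().items() if v is not None}
    # Mirrors the date conversion in the Grok pattern (dd/MMM/yyyy:HH:mm:ss Z).
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields


# Made-up sample line, for illustration only.
SAMPLE = (
    '203.0.113.7 - - [10/May/2022:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.79.1" "-"'
)

if __name__ == "__main__":
    for name, value in parse_line(SAMPLE).items():
        print(name, "=", value)
```

As with Grok's QS pattern, the referrer, agent and forwarder values keep their surrounding double quotes.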
After saving the Grok pattern, go to ‘System’ -> ‘Inputs’. On your Beats input, click on ‘Manage extractors’, create a new extractor, and specify the Grok pattern you just made:
%{NGINX}
After saving it, newly ingested Nginx logs arrive with their fields split out. The fields make searching and visualization easier. For example, you can build a dashboard that displays the top 10 requesters by IP address, the top referrer pages, and so on.
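A "top requesters" widget boils down to counting how often each value of a field occurs. The sketch below shows the same idea in Python over a handful of made-up messages. It is not how Graylog computes it internally, and top_values is a hypothetical helper name:

```python
from collections import Counter

# Hypothetical extracted messages; only the fields used here are shown.
messages = [
    {"clientip": "203.0.113.7", "response": "200"},
    {"clientip": "198.51.100.4", "response": "404"},
    {"clientip": "203.0.113.7", "response": "200"},
    {"clientip": "192.0.2.15", "response": "200"},
    {"clientip": "198.51.100.4", "response": "200"},
    {"clientip": "203.0.113.7", "response": "304"},
]


def top_values(records, field, n=10):
    """Return the n most frequent values of field as (value, count) pairs."""
    counts = Counter(r[field] for r in records if field in r)
    return counts.most_common(n)


if __name__ == "__main__":
    for ip, hits in top_values(messages, "clientip"):
        print(ip, hits)
```

The same function works for any extracted field, for example top_values(messages, "response") for a breakdown by status code.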