Metrics: Part 2 - InfluxDB
26 Mar 2018
The first stop in our metrics adventure was to install and configure Netdata to collect system-level statistics. We then configured all of the remote Netdata agents to send their data to a central proxy node. This is a great start, but it is not without challenges: the data supplied by Netdata, while extensive, needs to be stored elsewhere for anything more than basic reporting, analysis, and trending.
What then should we do with said data? In addition to allowing you to stream data between Netdata instances as we did in our prior post, you can also stream to various databases, both standard and specialized.
As we are exploring the TICK stack, we will stream our metrics data into InfluxDB, a specialized time-series database.
InfluxDB
InfluxDB is the “I” in TICK Stack. InfluxDB is a time series database, designed specifically for metrics and event data. This is a good thing, as we have quite an extensive set of system metrics provided by Netdata that we will want to retain so we can observe trends over time or search for anomalies.
In this post we will configure InfluxDB to receive data from Netdata. Additionally, we will reconfigure our Netdata proxy node to ship metric data to InfluxDB.
The following sections rely heavily on the Ansible playbooks from Larry Smith Jr. A basic understanding of Ansible is assumed.
Netdata - Configure Netdata to export metrics
Take a moment to review the configuration and metrics collection architecture from our first post.
Reviewed? Good. While Netdata will allow us to ship metrics data from each installed instance of Netdata, doing so can be quite noisy and may not give you the control you would like. Fortunately, the configuration to send metrics data is the same whether you apply it to every instance or only to the proxy.
One other consideration when shipping data from Netdata to InfluxDB is how best to take the data in. Netdata supports several data export types: graphite, opentsdb, json, and prometheus. Our environment will be configured to send data using the opentsdb telnet interface.
Note: As none of these are native to InfluxDB, they are exceedingly difficult to use with InfluxDB-Relay.
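To make the opentsdb telnet interface a little more concrete: each data point travels as a single text line of the form "put <metric> <timestamp> <value> <tag=value> ...". The following is a minimal Python sketch of that line format, not Netdata's actual implementation. The helper name format_put_line and the metric name netdata.system.cpu.user are made up for illustration.

```python
def format_put_line(metric, timestamp, value, tags):
    """Build one OpenTSDB telnet-style 'put' line.

    Illustrative only: this mirrors the wire format, it is not code
    taken from Netdata or InfluxDB.
    """
    # Tags are written as space-separated key=value pairs.
    tag_text = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"put {metric} {int(timestamp)} {value} {tag_text}"


# Hypothetical metric name; the host tag matches the one seen later in this post.
line = format_put_line(
    "netdata.system.cpu.user", 1523730543, 2.5, {"host": "netdata-01"}
)
print(line)
```

The metric name becomes the measurement in InfluxDB and the tags become InfluxDB tags, which is why the series listed later in this post look like "name,host=netdata-01".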
To reconfigure your Netdata proxy using the ansible-netdata role, the following playbook can be used:
---
- hosts: netdata-proxies
  vars:
    netdata_configure_archive: true
    netdata_archive_enabled: 'yes'
    netdata_archive_type: 'opentsdb'
    netdata_archive_destination: ":4242"
    netdata_archive_prefix: 'netdata'
    netdata_archive_data_source: 'average'
    netdata_archive_update: 1
    netdata_archive_buffer_on_failures: 30
    netdata_archive_timeout: 20000
    netdata_archive_send_names: true
  roles:
    - role: ansible-netdata
The variables above tell netdata to:
- Archive data to a backend
- Enable said backend (as netdata only supports one at a time)
- Configure the opentsdb protocol to send data
- Configure the host and port to send to
They also set a few further options:
- Send data once a second
- Keep 30 seconds of data in case of connection issues
- Send field names instead of UUID
- Specify connection timeout in milliseconds
Once this playbook has run, your Netdata instance will start shipping data. Or trying to, anyway: we haven’t yet installed and configured InfluxDB to capture it. Let’s do that now.
InfluxDB - Install and configure InfluxDB
As discussed above, InfluxDB is a stable, reliable time-series database, tuned for storing our metrics for long-term trending. For this environment we are going to install a single small node; optimizing and scaling are a topic in and of themselves. To ease installation and maintenance, InfluxDB will be installed using the ansible-influxdb role.
The following Ansible playbook configures the ansible-influxdb role to listen for opentsdb messages from our Netdata instance.
---
- hosts: influxdb
  vars:
    influxdb_config: true
    influxdb_version: 1.5.1
    influxdb_admin:
      enabled: true
      bind_address: "0.0.0.0"
    influxdb_opentsdb:
      database: netdata
      enabled: true
    influxdb_databases:
      - host: localhost
        name: netdata
        state: present
  roles:
    - role: ansible-influxdb
A quick breakdown of the settings supplied:
- Configure influxdb rather than use the default config
- Use influxdb 1.5.1
- Enable the admin interface
- Have InfluxDB listen for opentsdb messages and store them in the netdata database
- Create the netdata database
After this playbook runs successfully, you will have an instance of InfluxDB collecting stats from your Netdata proxy!
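If you would like to poke the opentsdb listener by hand before trusting the pipeline, one option is to push a single hand-written put line at it. The following is a minimal Python sketch under a few assumptions: the listener is reachable on port 4242 (the port used in the Netdata playbook above), send_put_line is a hypothetical helper rather than part of InfluxDB or Netdata, and netdata.test.value is a made-up metric name.

```python
import socket


def send_put_line(host, port, line, timeout=5.0):
    """Open a TCP connection and send one newline-terminated 'put' line.

    Returns the number of bytes sent. Hypothetical helper for manual testing.
    """
    # The telnet-style protocol is line based, so ensure exactly one trailing newline.
    payload = (line.rstrip("\n") + "\n").encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)


# Example call (assumes InfluxDB is on localhost with the opentsdb listener on 4242):
# send_put_line("localhost", 4242,
#               "put netdata.test.value 1523730543 1 host=netdata-01")
```

Keep in mind that a test point sent this way is stored in the netdata database like any other, so you may want to drop the test series afterwards.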
Did it work?
If both playbooks ran successfully, system metrics will be flowing something like this:
nodes ==> netdata-proxy ==> influxdb
You can confirm this by logging into your InfluxDB node and running the following commands:
Check that InfluxDB is running:
# systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2018-04-14 18:29:03 UTC; 9min ago
Docs: https://docs.influxdata.com/influxdb/
Main PID: 11770 (influxd)
CGroup: /system.slice/influxdb.service
└─11770 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
Check that the netdata database was created:
# influx
Connected to http://localhost:8086 version 1.5.1
InfluxDB shell version: 1.5.1
> SHOW DATABASES;
name: databases
name
----
netdata
_internal
Check that the netdata database is receiving data:
# influx
Connected to http://localhost:8086 version 1.5.1
InfluxDB shell version: 1.5.1
> use netdata;
Using database netdata
> show series;
key
---
netdata.apps.cpu.apps.plugin,host=netdata-01
netdata.apps.cpu.build,host=netdata-01
netdata.apps.cpu.charts.d.plugin,host=netdata-01
netdata.apps.cpu.cron,host=netdata-01
netdata.apps.cpu.dhcp,host=netdata-01
netdata.apps.cpu.kernel,host=netdata-01
netdata.apps.cpu.ksmd,host=netdata-01
netdata.apps.cpu.logs,host=netdata-01
netdata.apps.cpu.netdata,host=netdata-01
netdata.apps.cpu.nfs,host=netdata-01
netdata.apps.cpu.other,host=netdata-01
netdata.apps.cpu.puma,host=netdata-01
netdata.apps.cpu.python.d.plugin,host=netdata-01
netdata.apps.cpu.ssh,host=netdata-01
netdata.apps.cpu.system,host=netdata-01
netdata.apps.cpu.tc_qos_helper,host=netdata-01
netdata.apps.cpu.time,host=netdata-01
netdata.apps.cpu_system.apps.plugin,host=netdata-01
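The same check can be scripted against the InfluxDB 1.x HTTP API, which exposes a /query endpoint on port 8086 that takes db and q parameters. The following is a minimal Python sketch and makes some assumptions: the helper names (build_query_url, series_keys, split_series_key) are made up for illustration, and the key parser ignores escaped commas, which is good enough for the keys shown above but not for arbitrary series.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, database, query):
    """Build a URL for the InfluxDB 1.x /query endpoint."""
    return f"{base_url}/query?{urlencode({'db': database, 'q': query})}"


def series_keys(response_text):
    """Pull the series keys out of the JSON returned for SHOW SERIES."""
    body = json.loads(response_text)
    keys = []
    for result in body.get("results", []):
        for series in result.get("series", []):
            # Each row of SHOW SERIES output holds a single column: the key.
            keys.extend(row[0] for row in series.get("values", []))
    return keys


def split_series_key(key):
    """Split 'measurement,tag=value,...' into (measurement, tags).

    Simplified: does not handle escaped commas or equals signs.
    """
    measurement, _, tag_text = key.partition(",")
    tags = dict(pair.split("=", 1) for pair in tag_text.split(",") if pair)
    return measurement, tags


url = build_query_url("http://localhost:8086", "netdata", "SHOW SERIES LIMIT 5")
print(url)
print(split_series_key("netdata.apps.cpu.cron,host=netdata-01"))
```

Fetching the URL (for example with urllib.request.urlopen) and passing the response body to series_keys should return the same keys as the show series output above, assuming InfluxDB is listening on localhost:8086 without authentication.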
Summary
With that, you now have high-resolution system metrics being collected and sent to InfluxDB for longer-term storage, analysis, and more.