NetDATA

01 May, 2024
  C

What is Netdata?

Netdata collects metrics per second and presents them in beautiful low-latency dashboards. It is designed to run on all of your physical and virtual servers, cloud deployments, Kubernetes clusters, and edge/IoT devices, to monitor your systems, containers, and applications.

It scales nicely from just a single server to thousands of servers, even in complex multi/mixed/hybrid cloud environments, and given enough disk space it can keep your metrics for years.
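As a rough illustration of what "enough disk space" can mean, the sketch below estimates on-disk size from the number of metrics, the retention period, and the bytes stored per sample. The function name and the example figures (2,000 metrics, 1 byte per sample, per-second collection) are assumptions chosen for illustration. Actual usage depends on compression and on how the database is configured.

```python
def estimate_disk_bytes(metrics, days, bytes_per_sample=1.0, interval_seconds=1):
    """Rough estimate of on-disk size for metrics collected at a fixed interval.

    All parameters are illustrative assumptions, not Netdata settings.
    """
    samples_per_metric = days * 24 * 60 * 60 // interval_seconds
    return metrics * samples_per_metric * bytes_per_sample


# One year of 2,000 per-second metrics at an assumed 1 byte per sample.
one_year = estimate_disk_bytes(metrics=2000, days=365)
print(f"{one_year / 1e9:.1f} GB")  # prints "63.1 GB"
```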


Netdata Features

  • Collects metrics from 800+ integrations: Operating system metrics, container metrics, virtual machines, hardware sensors, application metrics, OpenMetrics exporters, StatsD, and logs.

  • Real-Time, Low-Latency, High-Resolution: All metrics are collected per second and are on the dashboard immediately after data collection. Netdata is designed to be fast.

  • Unsupervised Anomaly Detection: Trains multiple Machine-Learning (ML) models for each metric collected and detects anomalies based on the past behavior of each metric individually.

  • Powerful Visualization: Clear and precise visualization that allows you to quickly understand any dataset, and also to filter, slice and dice the data directly on the dashboard, without the need to learn any query language.

  • Out-of-the-box Alerts: Comes with hundreds of alerts to detect common issues and pitfalls, revealing issues that can easily go unnoticed. It supports several notification methods to let you know when your attention is needed.

  • systemd Journal Logs Explorer: Provides a systemd journal logs explorer to view, filter and analyze system and application logs by directly accessing systemd journal files on individual hosts and on infrastructure-wide log centralization servers.

  • Low Maintenance: Fully automated in every aspect: automated dashboards, out-of-the-box alerts, auto-detection and auto-discovery of metrics, zero-touch machine learning, easy scalability and high availability, and CI/CD friendly.

  • Open and Extensible: Netdata is a modular platform that can be extended in all possible ways, and it also integrates nicely with other monitoring solutions.


What’s New and Coming?

| What | Description | When | Status |
|---|---|---|---|
| WebRTC | Browser to Agent communication via WebRTC. | later | POC |
| Advanced Troubleshooting | Expanded view of dashboard charts integrating Metrics Correlations, Anomaly Advisor, and many more. | later | interrupted |
| Easy Custom Dashboards | Drag and drop charts to create custom dashboards on the fly, while troubleshooting! | soon | planned |
| More Customizability | Set default settings for all charts and views! | soon | planned |
| UCUM Units | Migrate all metrics to the Unified Code for Units of Measure. | soon | in progress |
| Click to Activate | Configure Alerts and Data Collectors from the UI! | soon | in progress |
| Netdata Cloud On-Prem | Netdata Cloud available for On-Prem installation! | available | fill this form |
| systemd journal | View the systemd journal logs of your systems on the dashboard. | Oct 2023 | v1.43 |
| Integrations | Netdata Integrations Marketplace! | Aug 2023 | v1.42 |
| New Agent UI | Now Netdata Cloud and Netdata Agent share the same dashboard! | Jul 2023 | v1.41 |
| Summary Dashboards | High level tiles everywhere! | Jun 2023 | v1.40 |
| Machine Learning | Multiple ML models per metric. | Jun 2023 | v1.40 |
| SSL | Netdata Agent gets a new SSL layer. | Jun 2023 | v1.40 |
| New Cloud UI | Filter, slice and dice any dataset from the UI! ML-first! | May 2023 | v1.39 |
| Microsoft Windows | Monitor Windows hosts and apps! | May 2023 | v1.39 |
| Virtual Nodes | Go collectors can now be assigned to virtual nodes! | May 2023 | v1.39 |
| DBENGINE v2 | Faster, more reliable, far more scalable! | Feb 2023 | v1.38 |
| Netdata Functions | Netdata beyond metrics! Monitoring anything! | Feb 2023 | v1.38 |
| Events Feed | Live feed of events about topology changes and alerts. | Feb 2023 | v1.38 |
| Role Based Access Control | More roles, offering finer control over access to infrastructure. | Feb 2023 | v1.38 |
| Infinite Scalability | Streaming compression. Replication. Active-active clustering. | Nov 2022 | v1.37 |
| Grafana Plugin | Netdata Cloud as a data source for Grafana. | Nov 2022 | v1.37 |
| PostgreSQL | Completely rewritten, to reveal all the info, even at the table level. | Nov 2022 | v1.37 |
| Metrics Correlations | Advanced algorithms to find the needle in the haystack. | Aug 2022 | v1.36 |
| Database Tiering | Netdata gets unlimited retention! | Aug 2022 | v1.36 |
| Kubernetes | Monitor your Kubernetes workloads. | Aug 2022 | v1.36 |
| Machine Learning | Anomaly Rate information on every chart. | Aug 2022 | v1.36 |
| Machine Learning | Anomaly Advisor! Bottom-up unsupervised anomaly detection. | Jun 2022 | v1.35 |
| Machine Learning | Metrics Correlation on the Agent. | Jun 2022 | v1.35 |

Getting Started

1. Install Netdata everywhere :v:

Netdata can be installed on all Linux, macOS, and FreeBSD systems. We provide binary packages for the most popular operating systems and package managers.

See also the Netdata Deployment Strategies to decide how to deploy it in your infrastructure.

By default, a local dashboard is available immediately. Netdata starts a web server for its dashboard on port 19999. Open your web browser of choice and navigate to http://NODE:19999, replacing NODE with the IP address or hostname of your Agent. If Netdata is installed on the machine you are using, you can access it at http://localhost:19999.
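To check from a script whether a node's dashboard is up, a minimal sketch like the following may help. It only builds the URL described above and tests whether the port accepts TCP connections. The helper names `dashboard_url` and `is_listening` are hypothetical, and the sketch does not call any Netdata API.

```python
import socket

NETDATA_PORT = 19999  # default dashboard port mentioned above


def dashboard_url(node, port=NETDATA_PORT):
    """Return the dashboard URL for a node given as a hostname or IPv4 address."""
    return f"http://{node}:{port}"


def is_listening(node, port=NETDATA_PORT, timeout=2.0):
    """Return True if a TCP connection to node:port can be opened."""
    try:
        with socket.create_connection((node, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    url = dashboard_url("localhost")
    state = "reachable" if is_listening("localhost") else "not reachable"
    print(f"{url} is {state}")
```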

2. Configure Collectors :boom:

Netdata auto-detects and auto-discovers most operating system data sources and applications. However, many data sources require some manual configuration, usually to allow Netdata to get access to the metrics.

  • For a detailed list of the 800+ collectors available, check this guide.
  • To monitor Windows servers and applications, use this guide.
  • To monitor SNMP devices, check this guide.

3. Configure Alert Notifications :bell:

Netdata comes with hundreds of pre-configured alerts that automatically check your metrics as soon as they start being collected.

Netdata can dispatch alert notifications to multiple third-party systems, including: email, Alerta, AWS SNS, Discord, Dynatrace, flock, gotify, IRC, Matrix, MessageBird, Microsoft Teams, ntfy, OPSgenie, PagerDuty, Prowl, PushBullet, PushOver, RocketChat, Slack, SMS tools, Syslog, Telegram, Twilio.

By default, Netdata sends e-mail notifications if an MTA is configured on the system.

4. Configure Netdata Parents :family:

Optionally, configure one or more Netdata Parents. A Netdata Parent is a Netdata Agent that has been configured to accept streaming connections from other Netdata Agents.

Netdata Parents provide:

  • Infrastructure level dashboards, at http://parent.server.ip:19999/.

    Each Netdata Agent has an API listening on TCP port 19999 of its server. When you open that port with a web browser (e.g. http://server.ip:19999/), the Netdata Agent UI is presented. When the Netdata Agent is also a Parent, the UI of the Parent includes data for all nodes that stream metrics to that Parent.

  • Increased retention for all metrics of all your nodes.

    Each Netdata Agent maintains its own database of metrics, but Parents can be given additional resources to keep a much longer history than individual Netdata Agents.

  • Central configuration of alerts and dispatch of notifications.

    With Netdata Parents, all alert notification integrations can be configured just once, at the Parent, and disabled at the Netdata Agents.

You can also use Netdata Parents to:

  • Offload your production systems (the parents run ML, alerts, queries, etc. for all their children)
  • Secure your production systems (the parents accept user connections, for all their children)

5. Connect to Netdata Cloud :cloud:

Optionally, sign in to Netdata Cloud and claim your Netdata Agents and Parents. If you connect your Netdata Parents, there is no need to connect your Netdata Agents separately: they are connected via the Parents.

When your Netdata nodes are connected to Netdata Cloud, you can (on top of the above):

  • Access your Netdata agents from anywhere
  • Access sensitive Netdata agent features (like “Netdata Functions”: processes, systemd-journal)
  • Organize your infra in spaces and rooms
  • Create, manage, and share custom dashboards
  • Invite your team and assign roles to them (Role Based Access Control - RBAC)
  • Get infinite horizontal scalability (multiple independent Netdata Agents are viewed as one infra)
  • Configure alerts from the UI (coming soon)
  • Configure data collection from the UI (coming soon)
  • Netdata Mobile App notifications (coming soon)

:love_you_gesture: Netdata Cloud does not prevent you from using your Netdata Agents and Parents directly, and vice versa.

:ok_hand: Your metrics are still stored in your network when you connect your Netdata Agents and Parents to Netdata Cloud.


How it works

Netdata is built around a modular metrics processing pipeline.

Each Netdata Agent can perform the following functions:

  1. COLLECT metrics from their sources: Uses internal and external plugins to collect data from their sources.

    Netdata auto-detects and collects almost everything from the operating system, including CPU, Interrupts, Memory, Disks, Mount Points, Filesystems, Network Stack, Network Interfaces, Containers, VMs, Processes, systemd units, Linux Performance Metrics, Linux eBPF, Hardware Sensors, IPMI, and more.

    It collects metrics from applications: PostgreSQL, MySQL/MariaDB, Redis, MongoDB, Nginx, Apache, and hundreds more.

    Netdata also collects your custom application metrics by scraping OpenMetrics exporters, or via StatsD.

    It can convert web server log files to metrics and apply ML and alerts to them, in real time.

    It also supports synthetic tests / white box tests, so you can ping servers, check API responses, or even check filesystem files and directories to generate metrics, train ML and run alerts and notifications on their status.

  2. STORE metrics to a database: Uses database engine plugins to store the collected data in memory and/or on disk. We have developed our own dbengine to store the data very efficiently, allowing Netdata to use less than 1 byte per sample on disk and to answer queries quickly.

  3. LEARN the behavior of metrics (ML): Trains multiple Machine-Learning (ML) models per metric to learn the behavior of each metric individually. Netdata uses the k-means algorithm and by default creates one model per metric per hour, based on the values collected for that metric over the last 6 hours. The trained models are persisted to disk.

  4. DETECT anomalies in metrics (ML): Uses the trained machine learning (ML) models to detect outliers and mark collected samples as anomalies. Netdata stores anomaly information together with each sample and also streams it to Netdata Parents, so that the anomaly information is available at query time for the whole retention of each metric.

  5. CHECK metrics and trigger alert notifications: Uses its configured alerts (you can configure your own) to check the metrics for common issues and uses notification plugins to send alert notifications.

  6. STREAM metrics to other Netdata Agents: Pushes metrics in real time to Netdata Parents.

  7. ARCHIVE metrics to 3rd party databases: Exports metrics to industry standard time-series databases, like Prometheus, InfluxDB, OpenTSDB, Graphite, etc.

  8. QUERY metrics and present dashboards: Provides an API to query the data and presents interactive dashboards to users.

  9. SCORE metrics to reveal similarities and patterns: Scores the metrics according to the given criteria, to find the needle in the haystack.
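The COLLECT step mentions StatsD as one way to feed custom application metrics. As a minimal sketch, an application could emit a gauge in the plain StatsD line format (`name:value|type`) over UDP. The metric name `myapp.queue.length` is hypothetical, and the host and port (127.0.0.1, 8125, the conventional StatsD port) are assumptions that should be checked against your Agent's configuration.

```python
import socket


def format_statsd(name, value, metric_type="g"):
    """Format one metric in the plain StatsD line format: name:value|type."""
    return f"{name}:{value}|{metric_type}"


def send_statsd(line, host="127.0.0.1", port=8125):
    """Send one StatsD line over UDP (fire and forget, no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # "myapp.queue.length" is a hypothetical metric name used for illustration.
    send_statsd(format_statsd("myapp.queue.length", 42))
```

UDP is used here because it keeps the application decoupled from the collector: if nothing is listening, the datagram is simply dropped.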

When using Netdata Parents, all the functions of a Netdata Agent (except data collection) can be delegated to Parents to offload production systems.
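The LEARN and DETECT steps can be illustrated with a deliberately simplified sketch: train k-means on short windows of recent values, then flag a new window as anomalous when it lies farther from every cluster center than any training window did. This is not Netdata's implementation. The window size, k=2, the fixed seed, and the "farther than the training maximum" rule are all assumptions made for illustration.

```python
import math
import random


def windows(values, size=3):
    """Turn a series into overlapping feature vectors of `size` consecutive values."""
    return [tuple(values[i:i + size]) for i in range(len(values) - size + 1)]


def kmeans(points, k=2, iterations=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length tuples."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[i] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids


def nearest_distance(point, centroids):
    return min(math.dist(point, c) for c in centroids)


def train(points, k=2):
    """Return (centroids, largest distance seen during training)."""
    centroids = kmeans(points, k)
    return centroids, max(nearest_distance(p, centroids) for p in points)


def is_anomalous(point, model):
    centroids, max_training_distance = model
    return nearest_distance(point, centroids) > max_training_distance


history = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11]
model = train(windows(history))
print(is_anomalous((11, 10, 12), model))  # False: this window was seen in training
print(is_anomalous((11, 95, 12), model))  # True: far from both cluster centers
```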

The core of Netdata is developed in C. We have our own libnetdata, that provides:

  • DICTIONARY: A high-performance algorithm to maintain both indexed and ordered pools of the structures Netdata needs. It uses JudyHS arrays for indexing, although it is modular: any hashtable or tree can be integrated into it. Despite being in C, dictionaries follow object-oriented programming principles, so there are constructors, destructors, automatic memory management, garbage collection, and more. For more see here.

  • ARAL: ARray ALlocator (ARAL) is used to minimize the system allocations made by Netdata. ARAL is optimized for peak performance when multi-threaded. It also allows all structures that use it to be allocated in memory-mapped files (shared memory) instead of RAM. For more see here.

  • PROCFILE: A high-performance parser and text tokenizer for /proc files (and any other file). It achieves its performance by keeping files open and adjusting its buffers to read the entire file in one call (which is also required by the Linux kernel). For more see here.

  • STRING: A string interning mechanism, for string deduplication and indexing (using JudyHS arrays), optimized for multi-threaded usage. For more see here.

  • ARL: Adaptive Resortable List (ARL) is a very fast list iterator that keeps the expected items on the list in the same order in which they are found in the input list. So, the first iteration is somewhat slower, but all the following iterations are perfectly aligned for best performance. For more see here.

  • BUFFER: A flexible text buffer management system that allows Netdata to automatically handle dynamically sized text buffer allocations. The same mechanism is used for generating consistent JSON output by the Netdata APIs. For more see here.

  • SPINLOCK: Like POSIX MUTEX and RWLOCK but a lot faster, based on atomic operations, with significantly smaller memory impact, while being portable.

  • PGC: A caching layer that can be used to cache any kind of time-related data, with automatic indexing (based on a tree of JudyL arrays), memory management, evictions, flushing, and pressure management. This is extensively used in dbengine. For more see here.
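To make the idea behind STRING concrete, here is a minimal sketch of string interning with reference counting. It uses a Python dictionary where libnetdata uses JudyHS arrays, it is single-threaded, and the class and method names are hypothetical. It only illustrates the deduplication idea and does not reflect the C API.

```python
class StringPool:
    """Deduplicate strings: equal strings share one stored object and a refcount."""

    def __init__(self):
        self._entries = {}  # text -> [canonical string object, reference count]

    def intern(self, text):
        """Return the canonical object for `text`, storing it on first use."""
        entry = self._entries.get(text)
        if entry is None:
            entry = [text, 0]
            self._entries[text] = entry
        entry[1] += 1
        return entry[0]

    def release(self, text):
        """Drop one reference; remove the string when nobody uses it anymore."""
        entry = self._entries[text]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[text]

    def __len__(self):
        return len(self._entries)


pool = StringPool()
a = pool.intern("".join(["system", ".cpu"]))       # built at runtime
b = pool.intern("".join(["system", ".", "cpu"]))   # equal text, different object
assert a is b           # both callers now hold the same stored object
assert len(pool) == 1   # only one copy is kept
```

Because every user of an interned string holds the same object, comparisons can be done by identity, and the text is stored only once.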

The above, and many more, allow Netdata developers to work on the application quickly and with confidence. Most of the business logic in Netdata is built by combining the above.
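The ARL idea described above can be sketched as follows: keep the expected keys in a list, try the next expected key first when reading each input line, and move keys into input order the first time they are found out of place. This is a simplified illustration under assumed names (`AdaptiveList`, `scan`, `reorders`), not the libnetdata implementation.

```python
class AdaptiveList:
    """Look up expected keys in an input stream, learning the input's order."""

    def __init__(self, expected):
        self._order = list(expected)  # expected keys, reordered as input is seen
        self.reorders = 0             # how many times a key had to be moved

    def scan(self, pairs):
        """Collect values for expected keys from an iterable of (name, value)."""
        found = {}
        pos = 0  # index of the key we expect to see next
        for name, value in pairs:
            if pos == len(self._order):
                break  # every expected key has been found
            if self._order[pos] == name:
                found[name] = value  # fast path: input matches learned order
                pos += 1
                continue
            try:
                idx = self._order.index(name, pos)  # slow path: search the rest
            except ValueError:
                continue  # not an expected key (or already handled)
            self._order.insert(pos, self._order.pop(idx))  # adapt to input order
            self.reorders += 1
            found[name] = value
            pos += 1
        return found


arl = AdaptiveList(["MemFree", "MemTotal", "Cached"])
lines = [("MemTotal", 100), ("MemFree", 40), ("Buffers", 5), ("Cached", 20)]
first = arl.scan(lines)   # one key is moved to match the input order
second = arl.scan(lines)  # no further moves: expected keys hit the fast path
print(first == second, arl.reorders)  # prints "True 1"
```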

Netdata data collection plugins can be developed in any language. Most of our application collectors, though, are developed in Go.