- 1 Collection/Shippers
- 2 Programming Language Collection/Shippers
- 3 Generalized Collection
- 4 Cloud Services
- 4.0.1 sematext
- 4.0.2 LogDNA
- 4.0.3 Stackdriver
- 4.0.4 Loggly
- 4.0.5 Loki
- 4.0.6 LogDevice
- 4.0.7 Splunk
- 4.0.8 NewRelic
- 4.0.9 Graylog
- 4.0.10 Flume
- 4.0.11 Papertrail/Timber.io
- 4.0.12 Scalyr
- 4.0.13 Sumologic
- 4.0.14 sentry.io
- 4.0.15 rollbar
- 4.0.16 CloudWatch
- 4.0.17 DataDog
- 4.0.18 Coralogix
- 4.0.19 Logentries
- 4.0.20 Humio
- 4.0.21 Seq
- 4.0.22 Logz.io
- 4.0.23 Honeycomb
- 4.1 Specific
- 5 TODO
1 Collection/Shippers
Standalone (i.e. not a programming library) log collectors/shippers.
- Filebeat is a lightweight shipper for forwarding and centralizing log data. Installed as an agent on your servers, Filebeat monitors the log files or locations that you specify, collects log events, and forwards them either to Elasticsearch or Logstash for indexing.
- Filebeat guarantees that events will be delivered to the configured output at least once and with no data loss. Filebeat is able to achieve this behavior because it stores the delivery state of each event in the registry file.
- Built with Go https://github.com/elastic/beats/tree/master/filebeat
- Has interesting common problems, such as https://www.elastic.co/guide/en/beats/filebeat/current/inode-reuse-issue.html
- Supports inputs as of 3/2020 (called modules): ActiveMQ, Apache, Auditd, AWS, Azure, CEF, Cisco, CoreDNS, Elasticsearch, Envoyproxy, Google Cloud, haproxy, IBM MQ, Icinga, IIS, Iptables, Kafka, Kibana, Logstash, MISP, MongoDB, MSSQL, MySQL, nats, NetFlow, Nginx, Osquery, Palo Alto Networks, PostgreSQL, RabbitMQ, Redis, Santa, Suricata, System, Traefik, Zeek (Bro)
- Has own keystore to manage secrets
- Has subcommand CLI interface
- Does parsing and exports "fields" to use parts of log line
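The bullets above can be sketched as a minimal `filebeat.yml`, assuming output straight to a local Elasticsearch; the log path and host are placeholders:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log   # placeholder path to monitor
output.elasticsearch:
  hosts: ["localhost:9200"]    # or output.logstash for the Logstash route
```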
- Fluentd decouples data sources from backend systems by providing a unified logging layer in between.
- Fluentd's 500+ plugins connect it to many data sources and outputs while keeping its core simple.
- 5,000+ data-driven companies rely on Fluentd. Its largest user currently collects logs from 50,000+ servers.
- Built with C and Ruby
- 650 plugins available
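The "unified logging layer" idea amounts to tailing sources on one side and routing to backends on the other, wired up in `fluent.conf`. A sketch with placeholder paths, tag, and host; the `elasticsearch` output assumes the fluent-plugin-elasticsearch plugin is installed:

```
<source>
  @type tail
  # placeholder paths
  path /var/log/myapp/app.log
  pos_file /var/log/fluentd/app.log.pos
  tag myapp.access
  <parse>
    @type json
  </parse>
</source>

<match myapp.**>
  @type elasticsearch
  host localhost
  port 9200
</match>
```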
- Built in Rust, Vector is blisteringly fast and memory-efficient. It's designed to handle the most demanding environments.
- Vector does not favor any storage and fosters a fair, open ecosystem with your best interest in mind. Lock-in-free and future-proof.
- Vector aims to be the single, and only, tool needed to get data from A to B, deploying as an agent or service.
- Vector supports logs, metrics, and events, making it easy to collect and process all observability data.
- Programmable transforms give you the full power of programmable runtimes. Handle complex use cases without limitation.
- Guarantees matter, and Vector is clear on its guarantees, helping you to make the appropriate trade-offs for your use case.
- Rust source https://github.com/timberio/vector
- Claims to be faster than comparable shippers, per its published benchmarks
- Uses a config file to wire things up
- Supported sources as of 3/2020: docker, file, http, journald, kafka, logplex, prometheus, socket, splunk_hec, statsd, stdin, syslog, vector
- Supported transforms as of 3/2020: add_fields, add_tags, ansi_stripper, aws_ec2_metadata, coercer, concat, field_filter, geoip, grok_parser, json_parser, log_to_metric, logfmt_parser, lua, merge, regex_parser, remove_fields, remove_tags, rename_fields, sampler, split, swimlanes, tokenizer
- Supported sinks as of 3/2020: aws_cloudwatch_logs, aws_cloudwatch_metrics, aws_kinesis_firehose, aws_kinesis_streams, aws_s3, blackhole, clickhouse, console, datadog_metrics, elasticsearch, file, gcp_cloud_storage, gcp_pubsub, gcp_stackdriver_logging, http, humio_logs, influxdb_metrics, kafka, logdna, loki, new_relic_logs, prometheus, sematext_logs, socket, splunk_hec, statsd, vector
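The source → transform → sink pipeline is wired up in Vector's TOML config file. A sketch assuming a local Elasticsearch, using the `file` source and `json_parser` transform listed above; the component names and paths are arbitrary placeholders:

```toml
# Read JSON-formatted log lines from files...
[sources.app_logs]
  type = "file"
  include = ["/var/log/myapp/*.log"]   # placeholder path

# ...parse each line's JSON into structured fields...
[transforms.parsed]
  type = "json_parser"
  inputs = ["app_logs"]

# ...and ship the structured events to Elasticsearch.
[sinks.es]
  type = "elasticsearch"
  inputs = ["parsed"]
  host = "http://localhost:9200"       # placeholder host
```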
2 Programming Language Collection/Shippers
Specific libraries for various programming languages that can do interesting things with collecting or shipping logs.
Gollum is an n:m multiplexer that gathers messages from different sources and broadcasts them to a set of destinations.
Gollum originally started as a tool to MUL-tiplex LOG-files (read it backwards to get the name). It quickly evolved to a one-way router for all kinds of messages, not limited to just logs. Gollum is written in Go to make it scalable and easy to extend without the need to use a scripting language.
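The n:m idea, where every source's messages are broadcast to every registered destination, can be sketched in a few lines (illustrative Python rather than Gollum's actual Go internals; the sink names are made up):

```python
class Multiplexer:
    """Toy n:m fan-in/fan-out: any number of sources push messages in,
    and each message is broadcast to every registered destination."""

    def __init__(self):
        self.destinations = []

    def add_destination(self, dest):
        # A destination is anything with append(); real shippers would
        # use files, sockets, or message queues here.
        self.destinations.append(dest)

    def publish(self, source: str, message: str):
        # Broadcast: every destination receives every message.
        for dest in self.destinations:
            dest.append((source, message))


mux = Multiplexer()
console_sink, archive_sink = [], []   # two stand-in destinations
mux.add_destination(console_sink)
mux.add_destination(archive_sink)
mux.publish("syslog", "disk almost full")
mux.publish("app", "request handled")
```

Gollum's actual routing is richer (per-stream routing rather than pure broadcast), but this is the multiplexing shape the description refers to.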
3 Generalized Collection
3.1 Apache Kafka
4 Cloud Services
4.0.2 LogDNA
- From Kubernetes, syslog, or code libraries to REST APIs, LogDNA supports more than 30 integrations to ingest log data.
- Auto-parse the most popular formats, or request custom parsing from any data source.
- Search, monitor, and analyze logs across multiple deployments, and see results in a single pane.
- Save your search query as a view so you can access it later, much like a shortcut.
- Enterprise-level authentication and custom controls for all team members.
- LogDNA is HIPAA, SOC2, PCI, Privacy Shield and GDPR compliant.
- participated in Y Combinator in 2015 https://newscenter.io/2017/11/former-y-combinator-partners-lead-7-million-series-logdna/
4.0.5 Loki
Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.
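Because only labels are indexed, a Loki query first selects a stream by its label matchers and then scans the content with a line filter. A minimal LogQL sketch (the label names are made-up examples):

```logql
{app="myapp", env="prod"} |= "error"
```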
4.0.6 LogDevice
LogDevice is a scalable and fault tolerant distributed log system. While a file-system stores and serves data organized as files, a log system stores and delivers data organized as logs. The log can be viewed as a record-oriented, append-only, and trimmable file.
LogDevice is designed from the ground up to serve many types of logs with high reliability and efficiency at scale. It's also highly tunable, allowing each use case to be optimized for the right set of trade-offs in the durability-efficiency and consistency-availability space. Here are some examples of workloads supported by LogDevice:
- Write-ahead logging for durability
- Transaction logging in a distributed database
- Event logging
- Stream processing
- ML training pipelines
- Replicated state machines
- Journals of deferred work items
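The record-oriented, append-only, trimmable abstraction behind those workloads can be sketched as a toy in-memory model (illustrative Python, nothing like LogDevice's distributed implementation):

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    lsn: int        # log sequence number, monotonically increasing
    payload: bytes


@dataclass
class AppendOnlyLog:
    """Toy model of a log: records are appended at the tail,
    read in LSN order, and trimmed from the head."""
    records: list = field(default_factory=list)
    next_lsn: int = 1
    trim_point: int = 0   # LSNs <= trim_point have been discarded

    def append(self, payload: bytes) -> int:
        lsn = self.next_lsn
        self.next_lsn += 1
        self.records.append(Record(lsn, payload))
        return lsn

    def read(self, from_lsn: int):
        """Yield untrimmed records at or after from_lsn."""
        for rec in self.records:
            if rec.lsn >= max(from_lsn, self.trim_point + 1):
                yield rec

    def trim(self, up_to_lsn: int):
        """Discard records up to and including up_to_lsn,
        e.g. after a WAL checkpoint makes them unnecessary."""
        self.trim_point = max(self.trim_point, up_to_lsn)
        self.records = [r for r in self.records if r.lsn > self.trim_point]


log = AppendOnlyLog()
log.append(b"begin txn 1")
log.append(b"write x=1")
log.append(b"commit txn 1")
log.trim(2)   # checkpoint past the first two records
remaining = [r.payload for r in log.read(1)]
```

The write-ahead-logging use case above is exactly this pattern: append before applying, trim after checkpointing.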
4.0.20 Humio
- Humio ingests log data as quickly as it comes, without indexing, and regardless of bursts. Efficiently stored data means you can ingest terabytes of data per day and search it all in a matter of seconds.
- Humio is index-free, and it works with any structured or unstructured data format. Because you don’t need to define fields up front, you can ask any question with live or archived data and experience fast response times.
- The Humio engine was built from scratch to ensure that ingest and search scales to terabytes per day. Humio has virtually no latency even at huge volumes. And with a constant focus on optimizing infrastructure and storage use, Humio requires very few resources.
4.0.21 Seq
- Application logs are the most useful data available for detecting and solving a wide range of production issues and outages. Seq makes it easier to pinpoint the events and patterns in application behavior that show your system is working correctly — or why it isn't.
- Seq is built for modern structured logging with message templates. Rather than waste time and effort trying to extract data from plain-text logs with fragile log parsing, the properties associated with each log event are captured and sent to Seq in a clean JSON format. Message templates are supported natively by ASP.NET Core, Serilog, NLog, and many other libraries, so your application can use the best available diagnostic logging for your platform.
- Seq accepts logs via HTTP, GELF, custom inputs, and the seqcli command-line client, with plug-ins or integrations available for .NET Core, Java, Node.js, Python, Ruby, Go, Docker, message queues, and many other technologies.
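The structured-logging idea can be sketched by building a single event in Seq's compact JSON (CLEF) format: the message template travels as data alongside the properties, so nothing has to be parsed back out of plain text. The `@t`/`@l`/`@mt` field names follow Seq's documented CLEF format; the template and properties here are made-up examples:

```python
import json
from datetime import datetime, timezone


def clef_event(template: str, level: str = "Information", **props) -> str:
    """Build one newline-delimited event in Seq's compact JSON
    (CLEF) format, keeping properties structured rather than
    flattened into a rendered message string."""
    event = {
        "@t": datetime.now(timezone.utc).isoformat(),  # timestamp
        "@l": level,                                   # log level
        "@mt": template,                               # message template
        **props,                                       # structured properties
    }
    return json.dumps(event)


# Hypothetical event; in practice a logging library emits these for you.
line = clef_event("Order {OrderId} shipped in {Elapsed} ms",
                  OrderId=1234, Elapsed=56)
# Lines like this are what gets POSTed to Seq's raw HTTP ingestion
# endpoint (endpoint URL and API key are deployment-specific).
```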
My main advice is to avoid ELK. I have no clue how Elastic managed to convince the world that Elasticsearch should be the default log database when it is terrible for logs.
We used to use an ELK cluster, but it was always breaking. I'm sure this stuff can be made reliable, but we just wanted an easy way to search ~300GB of logs (10GB/day).