The document compares two popular log collectors, Fluentd and Logstash, focusing on their features, configurations, performance, and transport protocols. Fluentd, written in cruby, uses tagged logs for routing, while Logstash, written in jruby, aggregates logs without tagging. The document also discusses integration with OpenStack, log handling, and some unique features of each log collector such as Fluentd's support for active-active load balancing.
About Me
● MasakiMATSUSHITA
● Software Engineer at
○ We are providing Internet access here!
● Github: mmasaki Twitter: @_mmasaki
● 16 Commits in Liberty
○ Trove, oslo_log, oslo_config
● CRuby Commiter
○ 100+ commits for performance improvement
2
3.
What are LogCollectors?
● Provide pluggable and unified logging layer
Without Log Collectors With Log Collectors
Images from http://fluentd.org/ 3
4.
Input, Filter andOutput
4
Input Plugins
tail
syslog
Filter Plugins
grep
hostname
Output Plugins
InfluxDB
Elasticsearch
● They are implemented as plugins
● Can be replaced easily
Log FIles
Components
5.
Two Popular LogCollectors
● Fluentd
○ Written in CRuby
○ Used in Kubernetes
○ Maintained by Treasure Data Inc.
● Logstash
○ Written in JRuby
○ Maintained by elastic.co
● They have similar features
● Which one is better for you? 5
6.
Agenda
● Comparisons
○ Configuration
○Supported Plugins
○ Performance
○ Transport Protocol
● Integrate OpenStack with Fluentd/Logstash
○ Considering High Availability 6
7.
Configuration: Fluentd
● Everyinputs are tagged
● Logs will be routed by tag
nova-api.log
(tag: openstack.nova)
cinder-api.log
(tag: openstack.cinder)
<match openstack.nova>
<match openstack.cinder>
Filter/Route
7
8.
Fluentd Configuration: Input
<source>
@typetail
path /var/log/nova/nova-api.log
tag openstack.nova
</source>
Example of tailing nova-api log
● Every inputs will be tagged
8
9.
Fluentd Configuration: Output
<matchopenstack.nova> # nova related logs
@type elasticsearch
host example.com
</match>
<match openstack.*> # all other OpenStack related logs
@type influxdb
# …
</match>
Routed by tag
(First match is priority)
Wildcards can be used
9
10.
Fluentd Configuration: Copy
<matchopenstack.*>
@type copy
<store>
@type influxdb
</store>
<store>
@type elasticsearch
</store>
</match>
Copy plugin enables multiple
outputs for a tag
Copied Output
tag: openstack.*
10
11.
Logstash Configuration
● Notags
● All inputs will be aggregated
● Logs will be scattered to outputs
nova-api.log
cinder-api.log
Filter/Aggregate
aggregated logs
11
Configuration
● Fluentd
○ Routedby simple tag matching
○ Suited to handle log streams separately
● Logstash
○ Logs are aggregated
○ Suited to handle logs in gather-scatter style
17
18.
Plugins
● Both providemany plugins
○ Fluentd: 300+, Logstash: 200+
● Popular plugins are bundled with Logstash
○ They are maintained by the Logstash project
● Fluentd contains only minimal plugins
○ Most plugins are maintained by individuals
● Plugins can be installed easily by one command
18
19.
Performance
● Depends oncircumstances
● More than enough for OpenStack logs
○ Both can handle 10000+ logs/s
● Applying heavy filters is not a good idea
● CRuby is slow because of GVL?
○ GVL: Global VM (Interpreter) Lock
○ It’s not true for IO bound loads
19
20.
GVL on IObound loads
● IO operation can be performed in parallel
20
Thread 1 Thread 2
Idle :
User Space:
Kernel Space:
Actual Read/Write
Ruby Code Execution
GVL Released/
Acquired
IO operations
in parallel
21.
Transport Protocol
● Bothcollectors have their own transport protocol.
○ Failure Detection and Fallback
● Logstash: Lumberjack protocol
○ Active-Standby only
● Fluentd: forward protocol
○ Active-Active (Load Balancing), Active-Standby
○ Some additional features
21
22.
Logstash Transport: lumberjack
●Active-Standby lumberjack { #config@source
hosts => [
“primary”,
“secondary”
]
port => 1234
ssl_certificate => …
}
primary
secondary
source
secondary is used
when primary fails
Fail
Fallback
22
Fluentd Transport: forward
●At-least-one Semantics
(may affect performance)
<match openstack.*>
type forward
require_ack_response
<server>
host dest
</server>
</match>
destsource
send logs
ACK
Logs are re-transmitted
until ACK is received
26
27.
Transport Protocol
● Bothcan be configured as Active-Standby mode.
● Fluentd has great features:
○ Active-Active Mode (Load Balancing)
○ At-least-one Semantics
○ Weighted Load Balancing
27
28.
Forwarders
● Fluentd/Logstash havetheir own “forwarders”
○ Lightweight implementation written in Golang
○ Low memory consumption
○ One binary: Less dependent and easy to install
28
Node
Tail log files
Forwarder
Log AggregatorForward/
Lumberjack
Protocol
29.
Forwarders: Config Example
fluentd-forwarder:
[fluentd-forwarder]
to= fluent://fluentd1:24224
to = fluent://fluentd2:24224
logstash-forwarder:
"network": {
"servers": [
"logstash1:5043",
"logstash2:5043"
]
}Always send logs to both servers.
Pick one active server and send logs only to it.
Fallback to another server on failure. 29
30.
Integration with OpenStack
●Tail log files by local Fluentd/Logstash
○ must parse many form of log files
● Rsyslog
○ installed by default in most distribution
○ can receive logs in JSON format
● Direct output from oslo_log
○ oslo_log: logging library used by components
○ Logging without any parsing 30
Direct output fromoslo_log
# logging.conf:
[handler_fluent]
class = fluent.handler.FluentHandler # fluent-logger
formatter = fluent
args = (’openstack.nova', 'localhost', 24224)
[formatter_fluent]
class = fluent.handler.FluentFormatter # our Blueprint
43
Format logs as Dictionary
44.
Our BP inoslo_log: FluentFormatter
{
"hostname":"allinone-vivid",
"extra":{"project":"unknown","version":"unknown"},
"process_name":"MainProcess",
"module":"wsgi",
"message":"(4132) wsgi starting up on http://0.0.0.0:8774/",
"filename":"wsgi.py",
"name":"nova.osapi_compute.wsgi.server",
"level":"INFO",
"traceback":null,
"funcname":"server",
"time":"2015-10-15 10:09:12,255"
}
Don’t need to parse!
44
45.
Conclusion
● Log Handling
○Fluentd: Logs are distinguished by tag
○ Logstash: No tags. Logs are aggregated
● Transport Protocol
○ Both supports active-standby mode
○ Fluentd supports some additional features
■ Client-side load balancing (Active-Active)
■ At-least-one semantics
■ Weighted load balancing 45
46.
Conclusion
● Integration withOpenStack
○ Tail log files: regular expression hell
○ Rsyslog: No agents are needed
○ Direct output from oslo_log w/o any parsing
○ Review is welcome for our Blueprint
(oslo_log: fluent-formatter)
46