Simple logging with Elastic Cloud Kubernetes and Fluentd

Alex G · Kubernauts · Sep 11, 2020


Introduction

At Kubernauts we care about setting up resilient, scalable and observable environments, and unified event logging is an essential pillar of that. This post can serve as a starting point for centralizing your log storage and tracing. Our Kubernautic cloudless service with Rancher is such a use case: we run multiple clusters with even more nodes, and to keep the effort for debugging and tracing as low as possible we use Elastic Cloud on Kubernetes (ECK) with Fluentd for log collection. This stack is completely open source and a powerful solution for logging. In this post I want to introduce you to a basic setup for it.

The components we are going to use are Fluentd, Elasticsearch and Kibana.

  • Fluentd collects the logs. Comparable products are FluentBit (mentioned in the Fluentd deployment section) or Logstash.
  • Elasticsearch stores the logs. A comparable product is Cassandra, for example.
  • Kibana is the user interface. A similar product would be Grafana.

As of September 2020, the current Elasticsearch and Kibana version is 7.9.0, the Elastic Operator is at 1.2.1, and the Fluentd Kubernetes Daemonset is at v1.11.2.

Prerequisites:

  • kubectl 1.11+
  • Kubernetes 1.12+ or OpenShift 3.11+
  • Storage provider

Get the source on GitHub

I have prepared a GitHub repository with all the necessary resources here

tl;dr: there is a full deployment shell script, deploy.sh, in the root directory of the repository if you want to skip the individual steps.

Let’s get started

Elastic Operator deployment:

First we’re going to deploy the Elastic operator. Operators in general are pieces of software that ease the operational complexity of running another piece of software. What does this operator do? It sets up nodes, security, TLS and the internal services that link the Elastic Stack together. The all-in-one.yaml manifest contains the following resources:

Namespace, RBAC roles, ServiceAccount, Secret, the Elasticsearch, Kibana, ApmServer, Beat and EnterpriseSearch CustomResourceDefinitions, a ValidatingWebhookConfiguration and a Service

kubectl apply -f https://download.elastic.co/downloads/eck/1.2.1/all-in-one.yaml
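If you want to verify that the operator started cleanly, you can follow its logs; per the ECK documentation, the operator runs as a StatefulSet named elastic-operator in the elastic-system namespace:

kubectl -n elastic-system logs -f statefulset.apps/elastic-operator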

For demo purposes we keep it simple and put all resources in the same namespace, so we switch to the namespace that the operator just created:

kubectl config set-context --current --namespace=elastic-system

The final setup should be structured as follows, with the operator and, later on, the Elasticsearch, Kibana and Fluentd workloads all living in the elastic-system namespace.
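To get an overview of what has been created so far, you can list the resources in the namespace (kubectl get all only shows the common resource types; the exact listing depends on your cluster):

kubectl get all -n elastic-system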

Elasticsearch deployment:

Elasticsearch is a RESTful, distributed search and analytics engine, and it’s open source. It accepts JSON requests and returns JSON data.

So we deploy the Elasticsearch custom resource for storing the logs:

kubectl apply -f https://raw.githubusercontent.com/Alex77g/simple-logging-eck-fluentd-fluentbit/master/quickstart-es.yaml
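The referenced manifest is presumably close to the official ECK quickstart example. Here is a minimal sketch of what it likely contains; the node count and the mmap setting are assumptions, and the actual file is quickstart-es.yaml in the repository:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.9.0
  nodeSets:
  - name: default        # produces pods like quickstart-es-default-0
    count: 1
    config:
      node.store.allow_mmap: false   # avoids raising vm.max_map_count on the host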

Check if the deployment is successful with:

kubectl get pods quickstart-es-default-0

Expected output if successful:

NAME                          READY   STATUS    RESTARTS   AGE
pod/quickstart-es-default-0   1/1     Running   0          2m45s

The Elastic operator tries to create a persistent volume and its claim, but if the deployment fails, check manually whether a persistent volume is available and redeploy.

The persistent volume claim requires a matching persistent volume, and the persistent volume in turn requires a storage class. For demo purposes I used a local storage provisioner.

I have prepared a local storage provider and persistent volume deployment:

kubectl apply -f k8s/storage-class.yaml
kubectl apply -f k8s/pv.yaml
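For reference, manifests of roughly this shape satisfy the requirement. This is only a sketch: the real definitions live in k8s/storage-class.yaml and k8s/pv.yaml, and the names, capacity, path and node name below are illustrative assumptions:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # static local volumes, no dynamic provisioning
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elasticsearch-data
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/elastic          # must exist on the node
  nodeAffinity:                 # local volumes must be pinned to a node
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - my-node             # replace with a real node name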

Redeploy Elasticsearch with:

kubectl delete -f https://raw.githubusercontent.com/Alex77g/simple-logging-eck-fluentd-fluentbit/master/quickstart-es.yaml
kubectl apply -f https://raw.githubusercontent.com/Alex77g/simple-logging-eck-fluentd-fluentbit/master/quickstart-es.yaml

Again, check whether Elasticsearch is running successfully:

kubectl get pods quickstart-es-default-0

Log:

NAME                          READY   STATUS    RESTARTS   AGE
pod/quickstart-es-default-0   1/1     Running   0          2m45s

Kibana deployment:

Kibana is an open source application and part of the Elastic Stack. It visualizes the log data indexed in Elasticsearch and also serves as a user interface for monitoring, managing and securing an Elastic Stack cluster. Kibana’s tracing capabilities help you investigate and mitigate issues.

Kibana custom resource deployment:

kubectl apply -f https://raw.githubusercontent.com/Alex77g/simple-logging-eck-fluentd-fluentbit/master/quickstart-kb.yaml
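As with Elasticsearch, the manifest is presumably close to the ECK quickstart example. A minimal sketch (the actual file is quickstart-kb.yaml in the repository; the instance count is an assumption):

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: quickstart
spec:
  version: 7.9.0
  count: 1
  elasticsearchRef:
    name: quickstart   # connects Kibana to the Elasticsearch resource deployed above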

To check the deployment status:

kubectl rollout status deployment/quickstart-kb

If you want to go into more detail, I suggest taking a further look at the official documentation.

Fluentd deployment:

Fluentd is an open source data collector that helps you centralize your logs. It is easy to use out of the box and very flexible, and it can be extended with plugins for many services and platforms. Fluentd decouples data sources from backend systems by providing a unified logging layer in between, aggregating logs into a consistent format so that downstream systems can process them more easily.

We are deploying Fluentd with the standard configuration, which I copied from the Docker Hub container image. Three logging layers are configured.

Container logging

These are the logs generated by your containerized applications. The standard output (stdout) and standard error (stderr) streams are collected. These logs give us information about application behavior.

Node logging

If a container terminates or restarts, the kubelet keeps its logs on the node. These logs are not as detailed as the container logs, but they give us information about container behavior. To prevent these files from consuming all of the host’s storage, the Kubernetes node implements a log rotation mechanism, and when a pod is evicted from the node, its containers are evicted along with their log files.

Cluster logging

These give a bird’s-eye view of the environment by collecting logs from the core components of the cluster(s): the kube-apiserver, kube-scheduler, etcd, kubelet and kube-proxy, plus the Cluster Autoscaler and ingress controller if deployed.

For this demo deployment it is not necessary to deploy an extra config map, but in case you want to change the configuration, this configmap simply replaces the original configuration inside the pod:

kubectl apply -f fluentd/fluentd-cm.yaml
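To give an idea of what lives in that configmap, here is an abridged sketch of the kind of configuration the fluentd-kubernetes-daemonset image ships with: tail the container log files, enrich the records with Kubernetes metadata, and forward everything to Elasticsearch. The exact file is fluentd/fluentd-cm.yaml in the repository; treat this as illustrative, not a verbatim copy:

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
  </parse>
</source>

<filter kubernetes.**>
  # attach pod, namespace and label metadata to each record
  @type kubernetes_metadata
</filter>

<match **>
  # connection settings come from the daemonset environment variables
  @type elasticsearch
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
  user "#{ENV['FLUENT_ELASTICSEARCH_USER']}"
  password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}"
  logstash_format true
</match>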

Before we can deploy Fluentd, the daemonset needs the Elasticsearch host, port and credentials, passed in as environment variables. Because of a known issue (“Fluentd stopped sending data to ES for some while”, uken/fluent-plugin-elasticsearch issue #525) we need to set FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS as well.

kubectl apply -f fluentd/fluentd-daemonset.yaml

We will briefly go through the daemonset environment variables. They are passed into the Fluentd configuration and cover only a subset of the configmap’s possible parameters. The parameter documentation can be found here, and the configmap is fluentd/fluentd-cm.yaml in the Git repository.

Explanation:
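A sketch of the env section the daemonset likely uses. The service name quickstart-es-http and the secret quickstart-es-elastic-user are what ECK creates for an Elasticsearch resource named quickstart; the scheme and SSL settings are assumptions for this setup:

env:
- name: FLUENT_ELASTICSEARCH_HOST
  value: "quickstart-es-http"           # ECK's HTTP service for the "quickstart" cluster
- name: FLUENT_ELASTICSEARCH_PORT
  value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
  value: "https"                        # ECK enables TLS by default
- name: FLUENT_ELASTICSEARCH_SSL_VERIFY
  value: "false"                        # self-signed certificate in this demo
- name: FLUENT_ELASTICSEARCH_USER
  value: "elastic"
- name: FLUENT_ELASTICSEARCH_PASSWORD
  valueFrom:
    secretKeyRef:                       # password generated by the operator
      name: quickstart-es-elastic-user
      key: elastic
- name: FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS
  value: "false"                        # workaround for uken/fluent-plugin-elasticsearch#525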

The official Fluentd documentation can be found here.

With kubectl rollout status, Kubernetes tells us when the Fluentd daemonset has successfully rolled out.

kubectl rollout status daemonset.apps/fluentd

Log:

Waiting for daemon set "fluentd" rollout to finish: 0 of 3 updated pods are available...
Waiting for daemon set "fluentd" rollout to finish: 1 of 3 updated pods are available...
Waiting for daemon set "fluentd" rollout to finish: 2 of 3 updated pods are available...

We are almost finished. We need credentials to log in to Kibana, so we fetch the secret that the Elasticsearch deployment already created and decode the base64-encoded string:

elasticpw=$(kubectl get secret quickstart-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')
echo "Your Kibana credentials: user elastic, password ${elasticpw}"
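If you want to verify the credentials against Elasticsearch directly, you can port-forward its HTTP service and issue a request. The -k flag skips verification of ECK’s self-signed certificate; this check is an extra step, not part of the original walkthrough:

kubectl port-forward service/quickstart-es-http 9200 &
curl -u "elastic:${elasticpw}" -k "https://localhost:9200"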

Wait for the Kibana deployment to finish (it could take a while depending on your hardware and internet connection):

kubectl rollout status deployment/quickstart-kb

Now forward the Kibana port to access the UI:

kubectl port-forward service/quickstart-kb-http 5601

Open http://localhost:5601 in your browser.

With the user elastic and the given password you are able to view your logs. The logger daemonset ensures cluster-wide log collection.

For multi-cluster logging, deploy the operator with Elasticsearch and Kibana in your central logging cluster and expose Elasticsearch, for example with an ingress. Then point the logger daemonset at that endpoint instead of the internal quickstart-es-http service by setting FLUENT_ELASTICSEARCH_PORT and FLUENT_ELASTICSEARCH_HOST to the new domain, as sketched below. Finally, deploy the elastic secret quickstart-es-elastic-user in each cluster where the log collector runs, since Elasticsearch uses basic auth. That’s it.
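A sketch of the overridden environment variables (logging.example.com is a hypothetical ingress hostname, and port 443 assumes TLS termination at the ingress):

- name: FLUENT_ELASTICSEARCH_HOST
  value: "logging.example.com"   # hypothetical external endpoint
- name: FLUENT_ELASTICSEARCH_PORT
  value: "443"                   # assumes a TLS-terminating ingress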

The final setup then looks like this: each workload cluster runs the logger daemonset and ships its logs to the exposed Elasticsearch endpoint in the central logging cluster.

We can replace Fluentd with FluentBit if we are using small nodes like Raspberry Pis. FluentBit is a fast and lightweight log processor and forwarder. It is open source, cloud oriented and part of the Fluentd ecosystem. It ships with fewer plugins out of the box, but it also consumes fewer resources. You can find a corresponding deployment in the GitHub repository; the configuration and behavior are similar to Fluentd.

Conclusion

To get the most out of observability, each of the many layers and components of Kubernetes should be well monitored and tracked. On the other hand, continuously analyzing too many data points generates volumes of unnecessary alerts, data and false flags. Each logging stack therefore needs to be customized to its environment, depending on the software and infrastructure setup, to keep the Kubernetes environment secure and performant. This deployment gives you a basic understanding of how unified logging can be used in Kubernetes environments.
