Bug 1363932 - [intservice_public_138] Failed to show logs in kibana console with log driver used journald when OSE is containerized installation
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.3.0
Assignee: Luke Meyer
QA Contact: chunchen
Reported: 2016-08-04 03:35 UTC by chunchen
Modified: 2016-09-30 02:15 UTC

Description chunchen 2016-08-04 03:35:02 UTC
Description of problem:
Logs fail to appear in the Kibana console when docker uses the journald log driver and OSE is installed via the containerized method. The issue does not reproduce with an rpm installation.

Version-Release number of selected component (if applicable):
3.3.0               e71d2b04669c
3.3.0               80847240fa91
3.3.0               1c127f4f36a0
3.3.0               2c88e1273c11
3.3.0               c0b7d9b08a2e
3.3.0               32d276bb46ae

openshift v3.3.0.13
kubernetes v1.3.0+57fb9ac
etcd 2.3.0+git

 Version:         1.10.3
 API version:     1.22
 Package version: docker-common-1.10.3-46.el7.10.x86_64
 Go version:      go1.6.2
 Git commit:      2a93377-unsupported
 Built:           Fri Jul 29 13:45:25 2016
 OS/Arch:         linux/amd64

How reproducible:

Steps to Reproduce:
1. Install OSE env via containerized method

2. Configure the docker daemon to use journald as the log driver on the node machine
# ps -ef |grep docker

3. Deploy the logging stack with "use-journal=true" set, according to

The command used to create the configmap:
$ oc create configmap logging-deployer  --from-literal --from-literal public-master-url= --from-literal es-cluster-size=1 --from-literal enable-ops-cluster=false --from-literal use-journal=true

The command used to create the deployer pod:
$ oc new-app logging-deployer-template -p,IMAGE_VERSION=3.3.0,MODE=install

4. Check the pods and configmap
$ oc get pods -o wide
$ oc get configmap logging-deployer -o yaml

5. Check the rubygem/rpm packages in fluentd container
$ oc rsh logging-fluentd-4xlg3

6. Try to check the pod logs in kibana console via browser

Actual results:
at step 2:
root     11903     1  8 21:49 ?        00:01:34 /usr/bin/docker-current daemon --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --selinux-enabled --insecure-registry= --log-driver=journald --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/rhel-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true -b=lbr0 --mtu=1450 --add-registry --add-registry --insecure-registry --insecure-registry --insecure-registry --insecure-registry

root     12068     1  0 21:49 ?        00:00:00 /usr/bin/docker-current run --name atomic-openshift-node --rm --privileged --net=host --pid=host --env-file=/etc/sysconfig/atomic-openshift-node -v /:/rootfs:ro -e CONFIG_FILE=/etc/origin/node/node-config.yaml -e OPTIONS=--loglevel=5 -e HOST=/rootfs -e HOST_ETC=/host-etc -v /var/lib/origin:/var/lib/origin:rslave -v /etc/origin/node:/etc/origin/node -v /etc/localtime:/etc/localtime:ro -v /etc/machine-id:/etc/machine-id:ro -v /run:/run -v /sys:/sys:ro -v /usr/bin/docker:/usr/bin/docker:ro -v /var/lib/docker:/var/lib/docker -v /lib/modules:/lib/modules -v /etc/origin/openvswitch:/etc/openvswitch -v /etc/origin/sdn:/etc/openshift-sdn -v /etc/systemd/system:/host-etc/systemd/system -v /var/log:/var/log -v /dev:/dev --volume=/usr/bin/docker-current:/usr/bin/docker-current:ro --volume=/etc/sysconfig/docker:/etc/sysconfig/docker:ro openshift3/node:v3.3.0.13
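
Note: the log driver check from step 2 can be scripted instead of read off the ps output by eye. The following is only a sketch; docker_log_driver is a hypothetical helper name, not part of docker or OpenShift, and it assumes the daemon command line is passed in as a single string.

```shell
# Hypothetical helper: print the value of --log-driver found in a docker
# daemon command line. Accepts both "--log-driver=journald" and
# "--log-driver journald". Prints nothing and returns 1 if the flag is absent.
docker_log_driver() {
    # $1: the daemon command line, e.g. one line taken from `ps -ef`
    printf '%s\n' "$1" | tr ' ' '\n' | awk '
        /^--log-driver=/       { sub(/^--log-driver=/, ""); print; found = 1; exit }
        prev == "--log-driver" { print; found = 1; exit }
                               { prev = $0 }
        END                    { exit(found ? 0 : 1) }
    '
}
```

With the step 2 command line above as input, this would print "journald". On a node it could be fed from something like `ps -eo args | grep '[d]ocker-current daemon'`.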

at step 4:
$ oc get pod -o wide
NAME                          READY     STATUS      RESTARTS   AGE       IP          NODE
logging-curator-1-f371u       1/1       Running     0          5m
logging-deployer-4ck4z        0/1       Completed   0          7m
logging-es-foe2d5ab-1-phly1   1/1       Running     0          5m
logging-fluentd-4xlg3         1/1       Running     0          5m
logging-kibana-1-w4owb        2/2       Running     0          5m

$ oc get configmap logging-deployer -o yaml
apiVersion: v1
data:
  enable-ops-cluster: "false"
  es-cluster-size: "1"
  use-journal: "true"
kind: ConfigMap
metadata:
  creationTimestamp: 2016-08-04T01:57:06Z
  name: logging-deployer
  namespace: clogg
  resourceVersion: "17772"
  selfLink: /api/v1/namespaces/clogg/configmaps/logging-deployer
  uid: bf0661b4-59e6-11e6-a394-fa163ee090d2

at step 5:
$ oc rsh logging-fluentd-4xlg3
sh-4.2# gem list |grep -e jour -e fluen
fluent-plugin-docker_metadata_filter (0.1.1)
fluent-plugin-elasticsearch (1.3.0)
fluent-plugin-flatten-hash (0.2.0)
fluent-plugin-kubernetes_metadata_filter (0.24.0)
fluent-plugin-rewrite-tag-filter (1.5.5)
fluent-plugin-systemd (0.0.3)
fluentd (0.12.20)
systemd-journal (1.2.2)

sh-4.2# rpm -qa | grep -e jour -e fluent

at step 6: No logs are shown in the Kibana console; please refer to the attached screenshot

Expected results:
Logs should be shown in the Kibana console when docker uses the journald log driver and OSE is installed via the containerized method

Additional info:
On node machine:
[root@openshift-147 ~]# ls /var/log/journal* -ltr
-rw-r--r--. 1 root root 124 Aug  3 22:35 /var/log/journal.pos

[root@openshift-147 ~]# ls /run/log/journal* -ltr
total 0
drwxr-s---+ 2 root systemd-journal 200 Aug  3 22:30 d9ae96c9a4a24a52ad714d489bb55293

[root@openshift-147 ~]# ls /run/log/journal//d9ae96c9a4a24a52ad714d489bb55293/*
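
Note: the listings above show no /var/log/journal directory on this node (only the journal.pos position file), while /run/log/journal exists. With journald's default Storage=auto setting, the persistent /var/log/journal is used only if that directory exists; otherwise the journal stays in the volatile /run/log/journal. A minimal sketch of that check follows. journal_dir and its optional root-prefix argument are hypothetical, the prefix being there only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: print the journal directory that exists under the
# given root prefix, preferring the persistent location over the volatile one.
# Returns 1 if neither directory exists.
journal_dir() {
    jd_root="${1:-}"    # optional prefix, empty means the real filesystem root
    for jd in /var/log/journal /run/log/journal; do
        if [ -d "$jd_root$jd" ]; then
            echo "$jd"
            return 0
        fi
    done
    return 1
}
```

On the node above, `journal_dir` with no argument would be expected to print /run/log/journal, which is the path the collector would then need to read from.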

Comment 1 chunchen 2016-08-04 03:35:43 UTC
Created attachment 1187333
kibana console screenshot

Comment 2 Rich Megginson 2016-08-05 00:42:12 UTC
# show the deployer logs
oc logs logging-deployer-4ck4z

# show the fluentd logs
oc logs logging-fluentd-4xlg3

# see if the files are in the right places
oc exec logging-fluentd-4xlg3 -- ls -alrtF /etc/fluent /etc/fluent/configs.d /etc/fluent/configs.d/dynamic

# see if configured to use journal
oc exec logging-fluentd-4xlg3 -- cat /etc/fluent/configs.d/dynamic/input-syslog-default-syslog.conf

# see if anything is in elasticsearch
oc exec logging-kibana-1-w4owb -- curl -s -k --cert /etc/kibana/keys/cert --key /etc/kibana/keys/key https://logging-es:9200/_search | python -mjson.tool

Comment 3 chunchen 2016-08-05 06:13:07 UTC
For the pod logs, please refer to the attachments. The output of the requested commands is below:

[chunchen@F17-CCY daily]$ oc exec logging-fluentd-h9qwd -- ls -alrtF /etc/fluent /etc/fluent/configs.d /etc/fluent/configs.d/dynamic
total 8
drwxr-xr-x. 2 root root 4096 Jul 26 15:49 openshift/
drwxr-xr-x. 5 root root   47 Jul 26 15:51 ./
drwxrwxrwx. 3 root root 4096 Aug  5 01:21 user/
drwxr-xr-x. 4 root root   51 Aug  5 01:21 ../
drwxrwxrwx. 2 root root  101 Aug  5 01:21 dynamic/

total 4
lrwxrwxrwx.  1 root root   38 Jul 26 15:51 fluent.conf -> /etc/fluent/configs.d/user/fluent.conf
drwxr-xr-x.  5 root root   47 Jul 26 15:51 configs.d/
drwxrwxrwt.  3 root root  140 Aug  5 01:21 keys/
drwxr-xr-x. 51 root root 4096 Aug  5 01:21 ../
drwxr-xr-x.  4 root root   51 Aug  5 01:21 ./

total 12
drwxr-xr-x. 5 root root  47 Jul 26 15:51 ../
-rw-r--r--. 1 root root 150 Aug  5 01:21 input-syslog-default-syslog.conf
-rw-r--r--. 1 root root   1 Aug  5 01:21 es-copy-config.conf
-rw-r--r--. 1 root root   1 Aug  5 01:21 es-ops-copy-config.conf
drwxrwxrwx. 2 root root 101 Aug  5 01:21 ./

[chunchen@F17-CCY daily]$ oc exec logging-fluentd-h9qwd -- cat /etc/fluent/configs.d/dynamic/input-syslog-default-syslog.conf
<source>
  @type systemd
  @label @INGRESS
  path "/run/log/journal"
  pos_file /var/log/journal.pos
  tag journal
  read_from_head "false"
</source>
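
Note: the "configured to use journal" check on input-syslog-default-syslog.conf can also be scripted. This is a rough sketch only; fluentd_journal_path is a hypothetical helper, and it assumes the config file has been copied locally (or is readable at the given path), with the journal path containing no spaces.

```shell
# Hypothetical helper: succeed only if the given fluentd input config uses
# the systemd input type, and print the journal path it reads from.
# Output is empty if the config has no "path" line.
fluentd_journal_path() {
    # $1: path to a fluentd input config file
    grep -Eq '^[[:space:]]*@type[[:space:]]+systemd[[:space:]]*$' "$1" || return 1
    # Extract the value of the "path" parameter, with or without double quotes.
    sed -n 's/^[[:space:]]*path[[:space:]]\{1,\}"\{0,1\}\([^"[:space:]]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}
```

Against the config shown above this would print /run/log/journal, which can then be compared with the directory where journald is actually writing on the node.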

[chunchen@F17-CCY daily]$ oc exec logging-kibana-1-v2lgw  -- curl -s -k --cert /etc/kibana/keys/cert --key /etc/kibana/keys/key https://logging-es:9200/_search | python -mjson.tool
{
    "error": "ForbiddenException[Attempt from null to _all indices for indices:data/read/search and User [name=system.logging.kibana, roles=[]]]",
    "status": 403
}

Comment 4 chunchen 2016-08-05 06:14:04 UTC
Created attachment 1187750
deployer logs

Comment 5 chunchen 2016-08-05 06:14:35 UTC
Created attachment 1187751
fluentd log

Comment 6 chunchen 2016-08-05 06:27:57 UTC
After I reinstalled a fresh OSE env and used the latest images, the issue no longer reproduced.

Could you change the bug's status to ON_QA so that I can mark it as verified?


Comment 7 chunchen 2016-08-08 02:05:41 UTC
Per comment #6, marking this bug as verified.

