Bug 1519679 - logging-fluentd not using output-ops-extra-localfile.conf after update from v3.6.173.0.21 to v3.6.173.0.49.
Summary: logging-fluentd not using output-ops-extra-localfile.conf after update from v3.6.173.0.21 to v3.6.173.0.49
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.6.z
Assignee: Noriko Hosoi
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-12-01 07:32 UTC by Jatan Malde
Modified: 2018-01-23 17:58 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: There was a logic error in the fluentd startup script: when an ops cluster was first disabled and then enabled, the proper ops configuration file was not enabled.
Consequence: Sub-configuration files starting with output-ops-extra- were never included from the ops configuration file.
Fix: The logic error was fixed.
Result: When an ops cluster is first disabled and then enabled, the proper ops configuration file is enabled and its sub-configuration files are enabled as well.
Clone Of:
Environment:
Last Closed: 2018-01-23 17:58:09 UTC


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0113 normal SHIPPED_LIVE OpenShift Container Platform 3.7 and 3.6 bug fix and enhancement update 2018-01-23 22:55:59 UTC

Description Jatan Malde 2017-12-01 07:32:22 UTC
Description of problem:

logging-fluentd not using output-ops-extra-localfile.conf after update from v3.6.173.0.21 to v3.6.173.0.49. The ops logs were not written to the local file inside the fluentd pod.

Version-Release number of selected component (if applicable):

-OCP v3.6
-RHEL 7.4

How reproducible:


Steps to Reproduce:
1. Deploy the logging project using the Ansible playbooks.
2. Initially, do not set the variables 'openshift_logging_es_ops_host=logging-es-ops' and 'openshift_logging_use_ops=true' in the inventory file.
3. Once deployed, check the environment variables ES_HOST and OPS_HOST; both have the same value (one way to check is sketched below).
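
One way to check is to read the environment of a running fluentd pod (the pod name below is illustrative, and the namespace assumes the default "logging" project created by the playbooks):

$ oc -n logging exec logging-fluentd-jvt5h -- env | grep -E '^(ES_HOST|OPS_HOST)='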

Actual results:

Because both variables have the same value, the fluentd-ops output file is not created inside the fluentd pod.

$ oc rsh logging-fluentd-jvt5h
sh-4.2# ls -ltr /var/fluentd-out/
total 3584
-rw-r--r--. 1 root root 1710120 Nov 29 13:25 fluentd.20171129.b55f1e3152b0f0b44

Expected results:

Both files should be created inside the fluentd pod:

sh-4.2# ls -l /var/fluentd-out/
total 1792
-rw-r--r--.  1 root root 957428 Nov 29 03:43 fluentd-ops.20171129.b55f1b0aabe17a1c6
-rw-r--r--.  1 root root 235110 Nov 29 03:43 fluentd.20171129.b55f1b0aacb5bda36


Additional info:

The configmap file is attached.
- apiVersion: v1
  data:
    fluent.conf: |
      # This file is the fluentd configuration entrypoint. Edit with care.

      @include configs.d/openshift/system.conf

      # In each section below, pre- and post- includes don't include anything initially;
      # they exist to enable future additions to openshift conf as needed.

      ## sources
      ## ordered so that syslog always runs last...
      @include configs.d/openshift/input-pre-*.conf
      @include configs.d/dynamic/input-docker-*.conf
      @include configs.d/dynamic/input-syslog-*.conf
      @include configs.d/openshift/input-post-*.conf
      ##

      <label @INGRESS>
      ## filters
        @include configs.d/openshift/filter-pre-*.conf
        @include configs.d/openshift/filter-retag-journal.conf
        @include configs.d/openshift/filter-k8s-meta.conf
        @include configs.d/openshift/filter-kibana-transform.conf
        @include configs.d/openshift/filter-k8s-flatten-hash.conf
        @include configs.d/openshift/filter-k8s-record-transform.conf
        @include configs.d/openshift/filter-syslog-record-transform.conf
        @include configs.d/openshift/filter-viaq-data-model.conf
        @include configs.d/openshift/filter-post-*.conf
      ##
      </label>

      <label @OUTPUT>
      ## matches
        @include configs.d/openshift/output-pre-*.conf
        @include configs.d/openshift/output-operations.conf
        @include configs.d/openshift/output-applications.conf
        # no post - applications.conf matches everything left
      ##
      </label>
    output-extra-localfile.conf: |
      <store>
        @type file
        path /var/fluentd-out/fluentd
        format json
        time_slice_format %Y%m%d
        time_slice_wait 1m
        buffer_chunk_limit 256m
        time_format %Y%m%dT%H:%M:%S%z
        compress gzip
        utc
      </store>
    output-ops-extra-localfile.conf: |
      <store>
        @type file
        path /var/fluentd-out/fluentd-ops
        format json
        time_slice_format %Y%m%d
        time_slice_wait 1m
        buffer_chunk_limit 256m
        time_format %Y%m%dT%H:%M:%S%z
        compress gzip
        utc
      </store>
    secure-forward.conf: |
      # @type secure_forward

      # self_hostname ${HOSTNAME}
      # shared_key <SECRET_STRING>

      # secure yes
      # enable_strict_verification yes

      # ca_cert_path /etc/fluent/keys/your_ca_cert
      # ca_private_key_path /etc/fluent/keys/your_private_key
        # for private CA secret key
      # ca_private_key_passphrase passphrase

      # <server>
        # or IP
      #   host server.fqdn.example.com
      #   port 24284
      # </server>
      # <server>
        # ip address to connect
      #   host xxx.xx.xx.x
        # specify hostlabel for FQDN verification if ipaddress is used for host
      #   hostlabel server.fqdn.example.com
      # </server>
    throttle-config.yaml: |
      # Logging example fluentd throttling config file

      #example-project:
      #  read_lines_limit: 10
      #
      #.operations:
      #  read_lines_limit: 100
  kind: ConfigMap
  metadata:
    creationTimestamp: null
    name: logging-fluentd
kind: List
metadata: {}

Comment 2 Ruben Romero Montes 2017-12-01 08:58:30 UTC
The problem lies in the fact that once the daemonset/logging-fluentd exists, it is not updated or replaced (not even the env variables), as seen here:
  https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/openshift_logging_fluentd/tasks/main.yaml#L154-L186

Therefore, if aggregated logging is first deployed without the OPS cluster and later redeployed with the OPS cluster (i.e. `openshift_logging_use_ops=true`), the OPS_HOST env variable will keep the value `logging-es`.

That causes the fluentd start script to assume OPS is not deployed, so it uses filter-post-z-retag-one.conf instead of filter-post-z-retag-two.conf.

The consequence is that all logs (ops and non-ops) will go to the non-ops outputs, ignoring the ops ones.
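
A quick way to confirm the stale value on the existing object (assuming the default "logging" namespace and daemonset name; illustrative only, not a fix):

$ oc -n logging set env daemonset/logging-fluentd --list | grep OPS_HOST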

Comment 5 Anping Li 2017-12-18 09:16:48 UTC
Verified with openshift3/logging-fluentd/images/v3.6.173.0.83-2.
After adding the ops stack:
1) The fluentd environment has OPS_HOST=logging-es-ops.
2) filter-post-z-retag-one.conf was replaced with filter-post-z-retag-two.conf (a command-line spot check is sketched after item 3).

The following configuration is added to route system-level logs to the ops ES stack:

<match journal.** system.var.log** **_default_** **_openshift_** **_openshift-infra_**>
  @type rewrite_tag_filter
  @label @OUTPUT
  rewriterule1 message .+ output_ops_tag
  rewriterule2 message !.+ output_ops_tag
</match>

3) Kibana shows the project logs and the operations logs written before the change; Kibana-ops shows the operations logs written after the change.
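
A rough way to spot-check 1) and 2) from the command line (the pod name is illustrative, and the /etc/fluent/configs.d path is an assumption based on the includes in fluent.conf above):

$ oc exec logging-fluentd-jvt5h -- env | grep OPS_HOST
$ oc exec logging-fluentd-jvt5h -- find /etc/fluent/configs.d -name 'filter-post-z-retag*'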

Comment 6 Noriko Hosoi 2017-12-18 18:10:08 UTC
(In reply to Ruben Romero Montes from comment #2)
> The problem lays in the fact that once the daemonset/logging-fluent exists
> it is not updated or replaced (not even the env variables) as seen here:
>  
> https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/
> openshift_logging_fluentd/tasks/main.yaml#L154-L186
> 
> Therefore if the aggregated logging is deployed without OPS cluster and
> later on with the OPS cluster (i.e. `openshift_logging_use_ops=true`) the
> OPS_HOST env variable will remain with value `logging-es`. 

Hi @Ruben,

I revisited your comment #c2 and am worried that the customer's case may not be addressed by the PR #774.

The customer's system is configured with OPS, but the application logs and the system logs are both sent to the same Elasticsearch, logging-es.  Right?

Now I wonder: is the problem that 1) deploying with no ops, then 2) redeploying with ops via ansible with `openshift_logging_use_ops=true`, leaves the OPS_HOST value at `logging-es`?

The customer expects it to be updated to `logging-es-ops`, but that did not happen?
Thanks.

Comment 7 Noriko Hosoi 2017-12-19 01:41:12 UTC
(In reply to Anping Li from comment #5)
Thank you Anping, for the verification.  I'd assume the behaviour is acceptable for the customer.

Comment 16 errata-xmlrpc 2018-01-23 17:58:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0113

