Bug 1361363 - [Intsvc_public_275_281] Log of the curator pod is stating that all indices are ignored during index deletion
Keywords:
Status: NEW
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Vikram Goyal
QA Contact: Vikram Goyal
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Depends On: 1352489
Blocks:
Reported: 2016-07-28 22:55 UTC by Rich Megginson
Modified: 2018-02-07 17:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1352489
Environment:
Last Closed:



Description Rich Megginson 2016-07-28 22:55:46 UTC
+++ This bug was initially created as a clone of Bug #1352489 +++

Problem description: 
The log of the curator pod states that all indices were ignored during index deletion, but in fact indices were deleted:
$ oc logs -f logging-curator-1-sh7gc
curator running [5] jobs
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': None, 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': ('.searchguard*', '.kibana*', '.apiman_*', 'project-qe.*', 'myapp.*', '.operations.*', 'test.*', 'project-prod.*', 'project-dev.*'), 'older_than': 30, 'all_indices': False}
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': '.operations.', 'time_unit': 'months', 'timestring': '%Y.%m.%d', 'exclude': (), 'older_than': 2, 'all_indices': False}
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': 'project-dev.', 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': (), 'older_than': 1, 'all_indices': False}
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': 'project-prod.', 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': (), 'older_than': 28, 'all_indices': False}
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': 'test.', 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': (), 'older_than': 7, 'all_indices': False}
curator run finish

Version-Release number of selected component (if applicable):
openshift3/logging-curator         3.3.0               0f4e933a812a

How reproducible:
Always

Steps to Reproduce:
Please refer to the "Problem description" part

Actual Result:


Expected Result:


Additional info:

--- Additional comment from Luke Meyer on 2016-07-06 16:08:50 EDT ---

Sorry, I'm a little confused about the problem description.

Are you saying that:
1. Some indices *should* have been deleted (which ones?) but logs report that none were?
2. Some indices *were* deleted (which ones?) but logs report that none were?

I assume you created some data that was old enough it should have been deleted?

--- Additional comment from Luke Meyer on 2016-07-06 16:57:26 EDT ---

Guessing this is in relation to https://bugzilla.redhat.com/show_bug.cgi?id=1352486 so (2) above.

--- Additional comment from Luke Meyer on 2016-07-06 17:12:19 EDT ---

The indices that were deleted from that bug were:

.operations.2016.04.30
project-dev.2016.07.03
project-prod.2016.06.06

However the logs say (respectively):
No indices matched provided args: {'prefix': '.operations.'
No indices matched provided args: {'prefix': 'project-dev.'
No indices matched provided args: {'prefix': 'project-prod.'

In fact it looks like these might be exactly the inverse of what happened, as the index that was *not* deleted (though expected to be) was project-qe.2016.06.27 and the logs don't say anything about that.
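
To make it easier to see which job each "No indices matched" line belongs to, here is a minimal sketch that parses one such line into a dict. It assumes the text after the marker is a plain Python literal, as it is in the log lines quoted in the description; the function name parse_no_match is made up for illustration and is not part of curator.

```python
import ast

MARKER = "No indices matched provided args: "

def parse_no_match(line):
    """Return the args dict from a 'No indices matched' log line, else None."""
    if not line.startswith(MARKER):
        return None
    # The remainder is a Python dict literal (None, tuples, strings, ints).
    return ast.literal_eval(line[len(MARKER):].strip())

# Log line copied from the description of this bug.
line = ("No indices matched provided args: {'regex': None, 'index': (), "
        "'suffix': None, 'newer_than': None, 'closed_only': False, "
        "'prefix': '.operations.', 'time_unit': 'months', "
        "'timestring': '%Y.%m.%d', 'exclude': (), 'older_than': 2, "
        "'all_indices': False}")

args = parse_no_match(line)
print(args["prefix"], args["older_than"], args["time_unit"])
```

The truncated lines quoted in this comment are not complete literals, so only full log lines can be parsed this way.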

--- Additional comment from Xia Zhao on 2016-07-06 21:55:25 EDT ---

(In reply to Luke Meyer from comment #3)
> The indices that were deleted from that bug were:
> 
> .operations.2016.04.30
> project-dev.2016.07.03
> project-prod.2016.06.06
> 
> However the logs say (respectively):
> No indices matched provided args: {'prefix': '.operations.'
> No indices matched provided args: {'prefix': 'project-dev.'
> No indices matched provided args: {'prefix': 'project-prod.'
> 
> In fact it looks like these might be exactly the inverse of what happened,
> as the index that was *not* deleted (though expected to be) was
> project-qe.2016.06.27 and the logs don't say anything about that.

Yes, Luke, that's exactly what I meant.

Thanks,
Xia

--- Additional comment from Rich Megginson on 2016-07-07 17:56:12 EDT ---

So the problem here is that curator is reporting that it could not find matching indices because there are none? That is, curator is working (it is deleting the right indices) but it is printing spurious error messages?

--- Additional comment from Rich Megginson on 2016-07-07 21:25:01 EDT ---

Try to reproduce it with debugging enabled.

# oadm policy add-scc-to-user privileged system:serviceaccount:logging:aggregated-logging-curator

# oc get pods -l component=curator # set pod name to curpod
# oc exec $curpod -- cat /opt/app-root/src/run_cron.py > /tmp/run_cron.py

edit /tmp/run_cron.py - change INFO to DEBUG, and change ERROR to DEBUG

# oc edit dc logging-curator

add under volumeMounts:

        - mountPath: /opt/app-root/src/run_cron.py
          name: run-cron
          readOnly: true

add under volumes:

      - hostPath:
          path: /tmp/run_cron.py
        name: run-cron

# redeploy the dc

# oc deploy dc/logging-curator --latest

# wait for the new pod to start

# oc logs $newcurpod # see if it spews a lot of messages

NOTE: This will make the curator run take a lot longer - you will need to look for "curator run finish" in the curator pod log to know for sure if the run is complete

# oc logs $newcurpod | grep "curator run finish"

After it has run, capture the log (oc logs $newcurpod > curator.log 2>&1) and attach curator.log to the bug
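
As a rough sketch of the completeness check described above, the following looks for the "curator running [N] jobs" and "curator run finish" lines in captured log text. The function name run_status is hypothetical, and the sample text is the short log quoted elsewhere in this bug, not real debug output.

```python
import re

def run_status(log_text):
    """Report the announced job count and whether the run has finished."""
    started = re.search(r"curator running \[(\d+)\] jobs", log_text)
    return {
        "jobs": int(started.group(1)) if started else None,
        "finished": "curator run finish" in log_text,
    }

# Sample text; in practice this would be the contents of curator.log.
sample = "curator running [1] jobs\ncurator run finish\n"
status = run_status(sample)
print(status)
```

If "finished" is False, the run is probably still in progress and the log should be captured again later.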

--- Additional comment from Rich Megginson on 2016-07-08 23:40:45 EDT ---

PR https://github.com/openshift/origin-aggregated-logging/pull/197

--- Additional comment from Rich Megginson on 2016-07-21 17:58:55 EDT ---

So I'm not sure what to do about this bug.  Yes, if you do not have any indices matching, you will get an error like this:

curator running [1] jobs
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': None, 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': ('.searchguard*', '.kibana*'), 'older_than': 30, 'all_indices': False}
curator run finish

Which is correct - there are no matching indices.

How should we handle this case?

Is this a documentation issue?

Are we going to have an issue when curator is spewing too many of these messages, and customers want us to make it stop?

--- Additional comment from Xia Zhao on 2016-07-27 06:31:47 EDT ---

Hi Rich,

This is not a documentation issue, and I'm not going to address the concern that "curator is spewing too many messages".

I intended to use the curator pod log to find out what happened during index deletion, but the log did not tell me that. I tried with the latest 3.3.0 images on the brew registry; here is the whole log from the curator pod:

$ oc logs -f logging-curator-2-5apiw
curator running [3] jobs
No indices matched provided args: {'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': None, 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': ('.searchguard*', '.kibana*', 'xiazhao1\\.*', 'xiazhao2\\.*', 'myapp\\.*', 'myapp1\\.*', 'xiazhao\\.*'), 'older_than': 30, 'all_indices': False}
curator run finish

but in the meantime, I did see that these indices were deleted:
myapp.2016.05.19
myapp.2016.07.19
myapp1.2016.05.19
xiazhao.2015.12.31
xiazhao.2016.07.19
xiazhao1.2016.01.22
xiazhao2.2016.01.22

More info on my scenario above:

In my test I have these policies in curator:

    # example for a normal project
    myapp:
      delete:
        weeks: 1

    xiazhao:
      delete:
        weeks: 1

    myapp1:
      delete:
        months: 1

    xiazhao1:
      delete:
        months: 1

    xiazhao2:
      delete:
        months: 1


Here are the indices (only the ones I'm interested in) before deletion:
myapp.2016.05.19
myapp.2016.07.19
myapp1.2016.05.19
myapp1.2016.07.22
xiazhao.2015.12.31
xiazhao.2016.07.19
xiazhao.2ed9fe86-53c8-11e6-87c5-fa163eb6f4c1.2016.07.27
xiazhao1.2016.01.22
xiazhao1.2016.07.22
xiazhao2.2016.01.22

Here are the indices after deletion:
myapp1.2016.07.22
xiazhao.2ed9fe86-53c8-11e6-87c5-fa163eb6f4c1.2016.07.27
xiazhao1.2016.07.22
xiazhao2.2016.07.22
xiazhao2.2016.07.22
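
For cross-checking, a small sketch that derives the deleted indices from the two lists above using a set difference. The index names are copied from this comment; the variable names are illustrative only.

```python
# Indices before deletion, as listed in this comment.
before = {
    "myapp.2016.05.19", "myapp.2016.07.19",
    "myapp1.2016.05.19", "myapp1.2016.07.22",
    "xiazhao.2015.12.31", "xiazhao.2016.07.19",
    "xiazhao.2ed9fe86-53c8-11e6-87c5-fa163eb6f4c1.2016.07.27",
    "xiazhao1.2016.01.22", "xiazhao1.2016.07.22",
    "xiazhao2.2016.01.22",
}

# Indices after deletion, as listed in this comment (a set drops repeats).
after = {
    "myapp1.2016.07.22",
    "xiazhao.2ed9fe86-53c8-11e6-87c5-fa163eb6f4c1.2016.07.27",
    "xiazhao1.2016.07.22",
    "xiazhao2.2016.07.22",
}

deleted = sorted(before - after)      # present before, gone afterwards
unexpected = sorted(after - before)   # present afterwards, not listed before

for name in deleted:
    print("deleted:", name)
for name in unexpected:
    print("not in the 'before' list:", name)
```

The difference gives the same seven indices reported as deleted. It also flags xiazhao2.2016.07.22, which appears only in the "after" list.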

I'm trying to enable logging and will provide more curator logs later. (It seems I need to redeploy my logging pods and do the enable-logging steps before configuring the index deletion policy in curator?)

--- Additional comment from Rich Megginson on 2016-07-27 10:09:38 EDT ---

This:

{'regex': None, 'index': (), 'suffix': None, 'newer_than': None, 'closed_only': False, 'prefix': None, 'time_unit': 'days', 'timestring': '%Y.%m.%d', 'exclude': ('.searchguard*', '.kibana*', 'xiazhao1\\.*', 'xiazhao2\\.*', 'myapp\\.*', 'myapp1\\.*', 'xiazhao\\.*'), 'older_than': 30, 'all_indices': False}

Is the definition of the default curator policy for any indexes not specified in the configuration.  However, in your configuration, you have specified all of the indexes, so the default match finds nothing.
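
To illustrate why the default match finds nothing, here is a hedged sketch that applies the logged exclude patterns to the indices from the previous comment. It assumes each exclude pattern is applied as an unanchored regular expression; that is an assumption made for this example, not a statement about how curator is implemented. The function name default_job_candidates is made up.

```python
import re

# Exclude patterns copied from the logged default job.
EXCLUDE = (".searchguard*", ".kibana*", "xiazhao1\\.*", "xiazhao2\\.*",
           "myapp\\.*", "myapp1\\.*", "xiazhao\\.*")

# Indices before deletion, copied from the previous comment.
INDICES = [
    "myapp.2016.05.19", "myapp.2016.07.19",
    "myapp1.2016.05.19", "myapp1.2016.07.22",
    "xiazhao.2015.12.31", "xiazhao.2016.07.19",
    "xiazhao.2ed9fe86-53c8-11e6-87c5-fa163eb6f4c1.2016.07.27",
    "xiazhao1.2016.01.22", "xiazhao1.2016.07.22",
    "xiazhao2.2016.01.22",
]

def default_job_candidates(indices, exclude):
    """Return the indices left for the default job after exclusion.

    Assumption: patterns are unanchored regexes (re.search semantics).
    """
    compiled = [re.compile(p) for p in exclude]
    return [i for i in indices if not any(c.search(i) for c in compiled)]

remaining = default_job_candidates(INDICES, EXCLUDE)
print(remaining)  # [] under the assumption above
```

Under that assumption every listed index is excluded, so the default job has no candidates and logs "No indices matched".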

This:
curator running [3] jobs

means that there were 3 jobs.  Curator breaks down the jobs according to the time - all jobs that have the same time value and unit are run in a single job.  Curator runs the job for 1 month, then for 1 week, then for the default.  The first 2 succeeded and your indices were deleted.
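
The grouping can be sketched roughly as below. This is an illustration of the idea only, not the code in run_cron.py: the function name group_jobs, the dict layout, and the job ordering are assumptions, and the 30-day default is taken from 'older_than': 30 in the logged default job.

```python
from collections import defaultdict

DEFAULT_DAYS = 30  # from 'older_than': 30 in the logged default job

def group_jobs(config):
    """Group per-project delete policies by (time unit, count)."""
    groups = defaultdict(list)
    for project, ops in config.items():
        for unit, count in ops.get("delete", {}).items():
            groups[(unit, count)].append(project)
    jobs = [
        {"time_unit": unit, "older_than": count, "projects": sorted(projects)}
        for (unit, count), projects in sorted(groups.items())
    ]
    # One extra job covers every index not named in the configuration.
    jobs.append({"time_unit": "days", "older_than": DEFAULT_DAYS,
                 "projects": None})
    return jobs

# Policies from the configuration quoted in the previous comment.
config = {
    "myapp": {"delete": {"weeks": 1}},
    "xiazhao": {"delete": {"weeks": 1}},
    "myapp1": {"delete": {"months": 1}},
    "xiazhao1": {"delete": {"months": 1}},
    "xiazhao2": {"delete": {"months": 1}},
}

jobs = group_jobs(config)
print(len(jobs))  # 3
for job in jobs:
    print(job)
```

With this configuration the sketch yields one job for 1 month, one for 1 week, and the default job, which matches "curator running [3] jobs".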

My suggestion is that we document this - if your curator configuration matches all of your indices, you will get this message when it attempts to delete the default indices and there are none.

--- Additional comment from Xia Zhao on 2016-07-28 05:06:59 EDT ---

@rmeggins Cool! This is exactly what I want, and it'll be good if we can doc it. Thank you for the info! Please feel free to transfer this bz back for closure.

Thanks,
Xia

