Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1602156 - imfile symlink uses source filename in metadata filename instead of given name
Summary: imfile symlink uses source filename in metadata filename instead of given name
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: rsyslog
Version: 7.6
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: rc
: ---
Assignee: Rich Megginson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-07-17 21:54 UTC by Rich Megginson
Modified: 2018-10-30 10:18 UTC (History)
6 users (show)

Fixed In Version: rsyslog-8.24.0-28.el7
Doc Type: No Doc Update
Doc Text:
undefined
Clone Of:
Environment:
Last Closed: 2018-10-30 10:17:55 UTC
Target Upstream Version:


Attachments (Terms of Use)
patch for https://github.com/rsyslog/rsyslog/pull/2749 (deleted)
2018-07-18 00:43 UTC, Rich Megginson
no flags Details | Diff
patch to apply to current rhel-7.6 dist-git (deleted)
2018-07-18 00:45 UTC, Rich Megginson
no flags Details | Diff


Links
System ID Priority Status Summary Last Updated
Github rsyslog rsyslog pull 2749 None None None 2018-07-18 00:46:33 UTC
Red Hat Bugzilla 1531295 None CLOSED [RFE]: rsyslog imfile support for symlink wildcard combination 2019-03-15 13:26:58 UTC
Red Hat Product Errata RHEA-2018:3135 None None None 2018-10-30 10:18:06 UTC

Internal Links: 1531295

Description Rich Megginson 2018-07-17 21:54:11 UTC
Description of problem:

The $!metadata!filename field is the name of the "source" file for the symlink instead of the name of the link.  It must be the name of the link.

Example:

input(type="imfile" file="/var/log/containers/*.log" tag="kubernetes" addmetadata="on")

The files in /var/log/containers/*.log are symlinks to /var/log/pods/

# ll /var/log/containers/logging-kibana-1-ctwqh_openshift-logging_kibana-1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79.log 
lrwxrwxrwx. 1 root root 63 Jul 13 15:04 /var/log/containers/logging-kibana-1-ctwqh_openshift-logging_kibana-1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79.log -> /var/log/pods/f1736be8-86ad-11e8-b7bd-545200b22332/kibana/0.log

The /var/log/pods files are symlinks to /var/lib/docker/containers:

# ll /var/log/pods/f1736be8-86ad-11e8-b7bd-545200b22332/kibana/0.log
lrwxrwxrwx. 1 root root 165 Jul 13 15:04 /var/log/pods/f1736be8-86ad-11e8-b7bd-545200b22332/kibana/0.log -> /var/lib/docker/containers/1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79/1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79-json.log

Version-Release number of selected component (if applicable):

rsyslog-8.24.0-27.el7.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

$!:{ "metadata": { "filename": "\/var\/lib\/docker\/containers\/1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79
cab77626e6d0c024ecdb40674e2b8102b194d5af232d9a7f\/1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f7924ecdb40674e2b8102b194d5af232d9a7f-json.log", "fileoffset": "5955682" } }

Expected results:

$!:{ "metadata": { "filename": "\/var\/log\/containers\/logging-kibana-1-ctwqh_openshift-logging_kibana-1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79.log", "fileoffset": "5955682" } }

Additional info:

Comment 2 Rich Megginson 2018-07-17 21:55:14 UTC
Noriko, have you seen this in your testing?

Comment 3 Rich Megginson 2018-07-18 00:43:36 UTC
Created attachment 1459578 [details]
patch for https://github.com/rsyslog/rsyslog/pull/2749

This is a patch to apply on top of https://github.com/rsyslog/rsyslog/pull/2749

Comment 4 Rich Megginson 2018-07-18 00:45:03 UTC
Created attachment 1459579 [details]
patch to apply to current rhel-7.6 dist-git

This patch applies to the current rhel-7.6 dist-git code - it is a backport to rhel-7.6 of https://bugzilla.redhat.com/attachment.cgi?id=1459578&action=edit

Comment 5 Rich Megginson 2018-07-18 00:46:34 UTC
@jvymazal - I could not submit a PR against your github jvymazal:wildcard_symlink so please apply https://bugzilla.redhat.com/attachment.cgi?id=1459578 to your PR

Comment 6 Noriko Hosoi 2018-07-18 01:18:12 UTC
(In reply to Rich Megginson from comment #2)
> Noriko, have you seen this in your testing?

Yes, I do.  The same result:
{ "metadata": { "filename": "\/var\/lib\/docker\/containers\/cbc3840aa3fe025d29b4452e46114045e1f4501d2e621e152eb9cd788ea7db39\/cbc3840aa3fe025d29b4452e46114045e1f4501d2e621e152eb9cd788ea7db39-json.log", 
...

The filename is the real path although I'm specifying the symlink to watch in the config:
  input(type="imfile" File="/var/log/containers/loader-*.log" Tag="docker" ruleset="docker_logs")

Comment 7 Rich Megginson 2018-07-18 02:10:45 UTC
hold on the patches - with the new code I'm getting crashes at shutdown - valgrind reports this, a double free:

123 ==13777== Invalid read of size 8
124 ==13777==    at 0x8604542: act_obj_destroy (imfile.c:958)
125 ==13777==    by 0x860482C: act_obj_destroy_all (imfile.c:1021)
126 ==13777==    by 0x8604964: fs_node_destroy (imfile.c:1073)
127 ==13777==    by 0x8604940: fs_node_destroy (imfile.c:1070)
128 ==13777==    by 0x86080CB: freeCnf (imfile.c:2022)
129 ==13777==    by 0x14EE25: freeCnf (rsconf.c:220)
130 ==13777==    by 0x14EFFE: rsconfDestruct (rsconf.c:232)
131 ==13777==    by 0x121A33: deinitAll (rsyslogd.c:1987)
132 ==13777==    by 0x121C93: main (rsyslogd.c:2083)
133 ==13777==  Address 0x7b86638 is 24 bytes inside a block of size 136 free'd
134 ==13777==    at 0x4C2FDAC: free (vg_replace_malloc.c:530)
135 ==13777==    by 0x8604789: act_obj_destroy (imfile.c:1005)
136 ==13777==    by 0x860482C: act_obj_destroy_all (imfile.c:1021)
137 ==13777==    by 0x8604964: fs_node_destroy (imfile.c:1073)
138 ==13777==    by 0x8604940: fs_node_destroy (imfile.c:1070)
139 ==13777==    by 0x86080CB: freeCnf (imfile.c:2022)
140 ==13777==    by 0x14EE25: freeCnf (rsconf.c:220)
141 ==13777==    by 0x14EFFE: rsconfDestruct (rsconf.c:232)
142 ==13777==    by 0x121A33: deinitAll (rsyslogd.c:1987)
143 ==13777==    by 0x121C93: main (rsyslogd.c:2083)
144 ==13777==  Block was alloc'd at
145 ==13777==    at 0x4C30B06: calloc (vg_replace_malloc.c:711)
146 ==13777==    by 0x8603738: act_obj_add (imfile.c:705)
147 ==13777==    by 0x8604328: poll_tree (imfile.c:911)
148 ==13777==    by 0x8604A4B: fs_node_walk (imfile.c:1091)
149 ==13777==    by 0x86088E5: in_processEvent (imfile.c:2204)
150 ==13777==    by 0x8608BB3: do_inotify (imfile.c:2277)
151 ==13777==    by 0x8608CF2: runInput (imfile.c:2426)
152 ==13777==    by 0x1A1E1B: thrdStarter (threads.c:223)
153 ==13777==    by 0x5059593: start_thread (in /usr/lib64/libpthread-2.27.so)
154 ==13777==    by 0x624202E: clone (in /usr/lib64/libc-2.27.so)

Comment 8 Noriko Hosoi 2018-07-18 03:31:12 UTC
Comment on attachment 1459579 [details]
patch to apply to current rhel-7.6 dist-git

Reviewed and tested.

Comment 9 Noriko Hosoi 2018-07-18 03:33:08 UTC
Hmmm, my test did not show the double free... :p

Comment 10 Noriko Hosoi 2018-07-18 03:40:09 UTC
Rich, is your crash observed in rsyslog-8.36 (rhlog-1.0-rhel-7) or 8.24 (rhel-7.6)?  I tested with rhel-7.6 (https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17226456).

Comment 11 Jiří Vymazal 2018-07-18 06:45:39 UTC
Thanks you both for testing the new build, I was not aware of the requirement to list target file name anywhere, I will take a patch and add it to the next build and also update upstream PR with it. As for the possible double free I think I found it, there was one extra dereference  which might cause it in some cases.

Comment 12 Jiří Vymazal 2018-07-18 07:17:36 UTC
Just one more question, I just properly re-red all the examples and affected code, so If I understood it correctly you want to list *symlink* name in the metadata, and not the actual file name from which data are coming from?

If yes then this definitely needs some broader discussion because that is a major change of current behavior and goes directly against documentation of imfile which clearly states that: "Name of the file where the message originated from. This is most useful when using wildcards inside file monitors, because it then is the only way to know which file the message originated from. The value can be accessed using the %$!metadata!filename% property."

Comment 13 Rich Megginson 2018-07-18 14:35:08 UTC
(In reply to Jiří Vymazal from comment #12)
> Just one more question, I just properly re-red all the examples and affected
> code, so If I understood it correctly you want to list *symlink* name in the
> metadata, and not the actual file name from which data are coming from?

Correct.  That is the way it works currently.

> 
> If yes then this definitely needs some broader discussion because that is a
> major change of current behavior and goes directly against documentation of
> imfile which clearly states that: "Name of the file where the message
> originated from. This is most useful when using wildcards inside file
> monitors, because it then is the only way to know which file the message
> originated from. The value can be accessed using the %$!metadata!filename%
> property."

I don't understand.

Before the imfile wildcard symlink patch, imfile works fine with symlinks.  That is, with https://bugzilla.redhat.com/show_bug.cgi?id=1602156#c0

I get the expected result:

$!:{ "metadata": { "filename": "\/var\/log\/containers\/logging-kibana-1-ctwqh_openshift-logging_kibana-1ce792f7c50f2b2e38d56050f32e6c527589515e299ee2670a099a2d08361f79.log", "fileoffset": "5955682" } }

I know this for a fact because I relied on this developing mmkubernetes.  mmkubernetes _will not work_ if $!metadata!filename _does not contain_ the name of the symlink.

Comment 14 Rich Megginson 2018-07-18 14:35:36 UTC
(In reply to Noriko Hosoi from comment #10)
> Rich, is your crash observed in rsyslog-8.36 (rhlog-1.0-rhel-7) or 8.24
> (rhel-7.6)?  I tested with rhel-7.6
> (https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17226456).

crash is observed with latest upstream master 8.36 code.

Comment 15 Rich Megginson 2018-07-18 14:39:01 UTC
If I am gathering logs from /var/log/containers/*.log, and I configure imfile like this:

input(type="imfile" file="/var/log/containers/*.log" tag="kubernetes" addmetadata="on")

Doesn't this mean I want $!metadata!filename to contain filenames matching /var/log/containers/*.log?  It seems very counter-intuitive that $!metadata!filename would contain filenames in /var/lib/docker/containers/uuid/uuid-json.log

Comment 16 Rich Megginson 2018-07-18 23:24:44 UTC
@jvymazal - looks like you decided to go in a different direction in your latest commit to https://github.com/jvymazal/rsyslog/tree/wildcard_symlink

I have a complete implementation that works for the first step mentioned at https://github.com/rsyslog/rsyslog/pull/2749#issuecomment-406024642

my implementation is here: https://github.com/richm/rsyslog/commit/d9a2e869780cdcc7068b8160485f50309e29a86e

basically, for every act_obj_s object, keep track of the link_source if the file has a link to it, and report the link_source as the $!metadata!filename.  For example, if we have 

A -> B -> C

The act_obj for C has a link_source of A since it is the "top" link, and $!metadata!filename will list A (if imfile is configured with file="A").

One wrinkle is that I'm using readlink without lstat - I'm using a buffer size of PATH_MAX+1, and truncating the buffer to the returned length.  I don't know if it is possible for the returned buffer size to be greater than PATH_MAX - the man pages are unclear and I cannot find the source code for the implementation of readlink(2).  If so, then the returned buffer will need to be truncated at the max of the returned length or PATH_MAX.

The implementation in the man page using lstat seems needlessly complex when readlink will return -1 and EINVAL if the given path is not a symlink.

Comment 17 Rich Megginson 2018-07-18 23:33:16 UTC
There is an alternate way to call readlink, listed in the glibc filesystem manual:

char *
readlink_malloc (const char *filename)
{
  int size = 100;
  char *buffer = NULL;

  while (1)
    {
      buffer = (char *) xrealloc (buffer, size);
      int nchars = readlink (filename, buffer, size);
      if (nchars < 0)
        {
          free (buffer);
          return NULL;
        }
      if (nchars < size)
        return buffer;
      size *= 2;
    }
}

Comment 18 Jiří Vymazal 2018-07-19 06:24:51 UTC
W(In reply to Rich Megginson from comment #15)
> If I am gathering logs from /var/log/containers/*.log, and I configure
> imfile like this:
> 
> input(type="imfile" file="/var/log/containers/*.log" tag="kubernetes"
> addmetadata="on")
> 
> Doesn't this mean I want $!metadata!filename to contain filenames matching
> /var/log/containers/*.log?  It seems very counter-intuitive that
> $!metadata!filename would contain filenames in
> /var/lib/docker/containers/uuid/uuid-json.log

Well, my first though would be that filename should be actual file name from which the data came from (which is also what is in the doc and why I have originally written in that way). On the other hand one may also want to know why the data came from where it did. I will edit the PR to preserve current behavior (have not tested the code without my changes so I did not know it was that way).

Comment 19 Jiří Vymazal 2018-07-19 06:25:38 UTC
(In reply to Rich Megginson from comment #17)
> There is an alternate way to call readlink, listed in the glibc filesystem
> manual:
> 
> char *
> readlink_malloc (const char *filename)
> {
>   int size = 100;
>   char *buffer = NULL;
> 
>   while (1)
>     {
>       buffer = (char *) xrealloc (buffer, size);
>       int nchars = readlink (filename, buffer, size);
>       if (nchars < 0)
>         {
>           free (buffer);
>           return NULL;
>         }
>       if (nchars < size)
>         return buffer;
>       size *= 2;
>     }
> }

Thanks for note, will use it, it will be more clear what is it doing this way.

Comment 26 Anping Li 2018-08-22 09:04:27 UTC
The feature rsyslog-mmkubernetes-8.24.0-33.el7.x86_64 works well when Openshift use docker json driver. The imfile symlink work as expected.

Comment 29 errata-xmlrpc 2018-10-30 10:17:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:3135


Note You need to log in before you can comment on or make changes to this bug.