Bug 1056672 - tar uses wrong magic number for xz archives
Summary: tar uses wrong magic number for xz archives
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: tar
Version: 6.5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
: ---
Assignee: Pavel Raiskup
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 1070830 1159820
Reported: 2014-01-22 17:01 UTC by Bryn M. Reeves
Modified: 2015-07-24 09:05 UTC

Fixed In Version: tar-1.23-12.el6
Doc Type: Bug Fix
Doc Text:
Previously, tar did not automatically detect archives compressed by the xz program if the user did not specify the "-J" or "--xz" option on the command line. As a consequence, if the processed archive had the ".xz" extension, tar extracted or listed the contents of the archive but printed an error message and eventually exited with a non-zero exit status. If the archive did not have this extension, tar failed. With this update, the automatic recognition mechanism has been improved. As a result, tar no longer prints an error message in this scenario, and it extracts or lists the contents of such archives correctly regardless of the extension.
Clone Of:
Environment:
Last Closed: 2015-07-22 06:13:30 UTC
Target Upstream Version:




Links
System                  ID              Priority  Status        Summary             Last Updated
Red Hat Product Errata  RHBA-2015:1285  normal    SHIPPED_LIVE  tar bug fix update  2015-07-20 17:48:54 UTC

Description Bryn M. Reeves 2014-01-22 17:01:28 UTC
Description of problem:
The tar program uses a built-in table of magic numbers to identify compressed archives. If the table fails to match a given file then tar will still attempt to open it by assuming the compression type based on file extension:

static struct zip_magic const magic[] = {
  { ct_tar },
  { ct_none, },
  { ct_compress, 2, "\037\235",  COMPRESS_PROGRAM, "-Z" },
  { ct_gzip,     2, "\037\213",  GZIP_PROGRAM,     "-z"  },
  { ct_bzip2,    3, "BZh",       BZIP2_PROGRAM,    "-j" },
  { ct_lzip,     4, "LZIP",      LZIP_PROGRAM,     "--lzip" },
  { ct_lzma,     6, "\xFFLZMA",  LZMA_PROGRAM,     "--lzma" },
  { ct_lzop,     4, "\211LZO",   LZOP_PROGRAM,     "--lzop" },
  { ct_xz,       6, "\0xFD7zXZ", XZ_PROGRAM,       "-J" },
};
[...]
            case ct_none:
              if (shortfile)
                ERROR ((0, 0, _("This does not look like a tar archive")));
              set_comression_program_by_suffix (archive_name_array[0], NULL);
              if (!use_compress_program_option)
                return archive;
              break;

The exception is "short" files (smaller than one block): in this case an error message is logged and tar eventually exits with a failure status, even though the archive is processed:

# tar tf foo.tar.xz 
tar: This does not look like a tar archive
foo/
foo/bar
tar: Exiting with failure status due to previous errors
# echo $?
2

This is misleading, because tar has correctly guessed the compression type and successfully processed the archive.
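
The magic-number lookup described above can be sketched roughly as follows. This is a simplified illustration and not tar's actual code: the names magic_entry, magic_table, and detect_by_magic are hypothetical, and the table is trimmed to three entries with a correct xz magic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for tar's zip_magic table. */
enum comp_type { CT_NONE, CT_GZIP, CT_BZIP2, CT_XZ };

struct magic_entry {
  enum comp_type type;
  size_t length;       /* number of leading bytes to compare */
  const char *magic;   /* expected leading bytes */
};

static const struct magic_entry magic_table[] = {
  { CT_GZIP,  2, "\037\213" },
  { CT_BZIP2, 3, "BZh" },
  { CT_XZ,    6, "\xFD" "7zXZ" },  /* sixth byte is the literal's trailing NUL */
};

/* Return the type whose magic matches the start of buf, or CT_NONE
   if nothing matches (a caller could then fall back to the suffix). */
enum comp_type
detect_by_magic (const void *buf, size_t len)
{
  size_t i;

  assert (buf != NULL);
  for (i = 0; i < sizeof magic_table / sizeof magic_table[0]; i++)
    if (len >= magic_table[i].length
        && memcmp (buf, magic_table[i].magic, magic_table[i].length) == 0)
      return magic_table[i].type;
  return CT_NONE;
}
```

With a table entry whose magic bytes are wrong, a lookup like this returns CT_NONE for every real xz file, which is the path that leads to the suffix-based fallback shown in the excerpt.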

Version-Release number of selected component (if applicable):
tar-1.23-11.el6

How reproducible:
100%

Steps to Reproduce:
1. Create an xz-compressed archive that is smaller than one block
2. Run tar on the resulting archive (t, x, or any other operation that reads it)

Actual results:
# rm -rf foo
# mkdir foo
# touch foo/bar
# tar cf foo.tar foo
# xz foo.tar
# tar tf foo.tar.xz 
tar: This does not look like a tar archive
foo/
foo/bar
tar: Exiting with failure status due to previous errors
# echo $?
2


Expected results:
# rm -rf foo
# mkdir foo
# touch foo/bar
# tar cf foo.tar foo
# xz foo.tar
# tar tf foo.tar.xz 
foo/
foo/bar
# echo $?
0


Additional info:
This happens because tar uses an invalid magic number for XZ files in its magic table:

  { ct_xz,       6, "\0xFD7zXZ", XZ_PROGRAM,       "-J" },

The correct magic for XZ is the six bytes "fd37 7a58 5a00" (\xFD, '7', 'z', 'X', 'Z', \0). In the table entry, the leading '\0' encodes a null byte, so the magic string appears empty when viewed as a C string:

(gdb) p *p                               -----------
$38 = {type = ct_xz, length = 6, magic = 0x445e20 "", program = 0x44596a "xz", rpl_option = 0x44596d "-J"}

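The 'xFD' that follows is then stored as the plain ASCII characters 'x', 'F', 'D', and so on, so the first six bytes of the entry never match a real xz header.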
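
As a rough check of what the compiler actually stores for that literal, the following sketch compares it with the real xz header bytes. The helper names buggy_strlen, buggy_byte, and buggy_matches_xz are made up for this illustration; only the literal itself comes from the table.

```c
#include <stddef.h>
#include <string.h>

/* The literal from the tar-1.23 table, and the real 6-byte xz header. */
static const char buggy_magic[] = "\0xFD7zXZ";
static const unsigned char xz_header[6] =
  { 0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00 };

/* Viewed as a C string the literal is empty: strlen stops at the NUL. */
size_t
buggy_strlen (void)
{
  return strlen (buggy_magic);
}

/* Return byte i of the stored literal (valid for i < sizeof buggy_magic). */
unsigned char
buggy_byte (size_t i)
{
  return (unsigned char) buggy_magic[i];
}

/* Compare the first 6 bytes, as a length-based table lookup would.
   Returns 1 on match, 0 otherwise. */
int
buggy_matches_xz (void)
{
  return memcmp (buggy_magic, xz_header, sizeof xz_header) == 0;
}
```

Here buggy_strlen () returns 0 and buggy_matches_xz () returns 0, consistent with the empty magic string in the gdb output.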

It seems the intent was to avoid gcc treating '\xfd7' as a single hex escape; that causes a "hex escape sequence out of range" warning because the 7 is interpreted as part of the escape.

It seems like the simplest way to solve this is to also use a hex escape for the '7' char; this leaves the string unambiguous and fixes the problem with very small tar archives for me:

  { ct_xz,       6, "\xFD\x37zXZ", XZ_PROGRAM,       "-J" },

Comment 1 Bryn M. Reeves 2014-01-22 17:04:04 UTC
Turns out this was already fixed upstream in a couple of commits in 2010:

commit 80a6ef7d94ce144db0249384e55846baa404f4dd
Author: Sergey Poznyakoff <gray@gnu.org.ua>
Date:   Mon Jun 28 00:04:49 2010 +0300

    Minor fix.
    
    * src/buffer.c (magic): Split the character constant to help
    cc recognize character boundaries (7 is a valid hex character).

diff --git a/src/buffer.c b/src/buffer.c
index 5b7cbc7..444f612 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -225,7 +225,7 @@ static struct zip_magic const magic[] = {
   { ct_lzip,     4, "LZIP",      LZIP_PROGRAM,     "--lzip" },
   { ct_lzma,     6, "\xFFLZMA",  LZMA_PROGRAM,     "--lzma" },
   { ct_lzop,     4, "\211LZO",   LZOP_PROGRAM,     "--lzop" },
-  { ct_xz,       6, "\xFD7zXZ",  XZ_PROGRAM,       "-J" },
+  { ct_xz,       6, "\xFD" "7zXZ",  XZ_PROGRAM,       "-J" },
 };

commit 9b31db388e6af753ec2e1c84db53a5d47e94ec15
Author: Sergey Poznyakoff <gray@gnu.org.ua>
Date:   Sun Jun 27 23:42:08 2010 +0300

    Minor fix.
    
    * src/buffer.c (magic): Fix xz magic.

diff --git a/src/buffer.c b/src/buffer.c
index 239d3f1..5b7cbc7 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -225,7 +225,7 @@ static struct zip_magic const magic[] = {
   { ct_lzip,     4, "LZIP",      LZIP_PROGRAM,     "--lzip" },
   { ct_lzma,     6, "\xFFLZMA",  LZMA_PROGRAM,     "--lzma" },
   { ct_lzop,     4, "\211LZO",   LZOP_PROGRAM,     "--lzop" },
-  { ct_xz,       6, "\0xFD7zXZ", XZ_PROGRAM,       "-J" },
+  { ct_xz,       6, "\xFD7zXZ",  XZ_PROGRAM,       "-J" },
 };
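
The two spellings of the fix, the hex-escaped '7' proposed in the description and the split literal adopted upstream, should produce identical bytes, because a hex escape cannot extend past the end of its own string literal. A small sketch to confirm this, using hypothetical names (fix_hex, fix_split, fixes_are_identical):

```c
#include <stddef.h>
#include <string.h>

/* Fix proposed in the description: escape the '7' as well. */
static const char fix_hex[] = "\xFD\x37zXZ";

/* Upstream fix: adjacent literals, so the escape ends at the quote. */
static const char fix_split[] = "\xFD" "7zXZ";

/* Return 1 if both literals have the same size and the same bytes. */
int
fixes_are_identical (void)
{
  return sizeof fix_hex == sizeof fix_split
         && memcmp (fix_hex, fix_split, sizeof fix_hex) == 0;
}

/* Size of the stored literal, including the terminating NUL. */
size_t
fix_size (void)
{
  return sizeof fix_hex;
}
```

Both arrays occupy six bytes including the terminating NUL, which matches the length of 6 given in the table entry.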

Comment 8 errata-xmlrpc 2015-07-22 06:13:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1285.html

