Bug 224679 - FEAT: Executing >2GB binaries with <2GB code but >2GB debug
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Anderson
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: RHEL5u2_relnotes 393501 425461
 
Reported: 2007-01-26 23:53 UTC by Jan Kratochvil
Modified: 2008-05-21 14:41 UTC (History)
CC List: 4 users

Fixed In Version: RHBA-2008-0314
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 14:41:06 UTC


Attachments
.gz.bz2 of 2.5GB RHEL5.i386 ELF for convenience on memory-limited systems. (deleted)
2007-12-06 16:21 UTC, Jan Kratochvil


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0314 normal SHIPPED_LIVE Updated kernel packages for Red Hat Enterprise Linux 5.2 2008-05-20 18:43:34 UTC

Description Jan Kratochvil 2007-01-26 23:53:02 UTC
The tested kernel-2.6.18-4.el5.x86_64 refuses to load a >2GB file even though it
has a very small code section (the size comes from a >2GB `.debug_macinfo' DWARF section).
  execve("./main", ["./main"], [/* 59 vars */]) = -1 EFBIG (File too large)

While loading >2GB of code could be a major effort, loading a large file with
standard <2GB code/data/bss sizes on x86_64 should not be much of a problem;
only some file offsets need to be 64-bit.

Such functionality should also apply to i686.

While not directly requested by a customer, this was found during the Bug 222814
evaluation - the customer is using such >2GB debug-data libraries (they work fine).

-- Additional comment from jkratoch@redhat.com on 2007-01-22 18:39 EST --
Created an attachment (id=146256)
Testcase .tar.gz creating >2GB file with >2GB `.debug_macinfo' DWARF section.

A machine with >=5GB of physical RAM is required for an acceptable build time.


-- Additional comment from jkratoch@redhat.com on 2007-01-22 19:53 EST --
Created an attachment (id=146263)
.bz2.bz2 of 2.5GB RHEL5.x86_64 ELF for convenience on memory-limited systems.

Comment 2 Dave Anderson 2007-10-26 20:47:33 UTC
AFAICT, despite all of the discussion about the contents of the
attached "main" executable (the big .debug_macinfo' DWARF section,
etc...), the issue at hand appears to be simply a matter of file size.

Tinkering with kprobes, I found that when the failing sys_execve()
occurs with EFBIG, the function trace is this:

 sys_execve
  do_execve
   open_exec
    nameidata_to_filp
     __dentry_open
      generic_file_open  (ext3_file_operations.open)

and generic_file_open() returns EFBIG because the
inode's file size is greater than MAX_NON_LFS (2GB-1):
  
  /*
   * Called when an inode is about to be open.
   * We use this to disallow opening large files on 32bit systems if
   * the caller didn't specify O_LARGEFILE.  On 64bit systems we force
   * on this flag in sys_open.
   */
  int generic_file_open(struct inode * inode, struct file * filp)
  {
          if (!(filp->f_flags & O_LARGEFILE) && i_size_read(inode) > MAX_NON_LFS)
                  return -EFBIG;
          return 0;
  }
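
For reference, MAX_NON_LFS is the 2GB-1 limit mentioned above.  From memory (so
the exact form may differ slightly between kernel versions), include/linux/fs.h
defines it roughly as:

  /* Largest i_size an open() without O_LARGEFILE is allowed to see: 2GB - 1 */
  #define MAX_NON_LFS     ((1UL<<31) - 1)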
  
As the function's comment indicates, when the file is explicitly opened
by the sys_open() system call, O_LARGEFILE gets set:

  asmlinkage long sys_open(const char __user *filename, int flags, int mode)
  {
          long ret;
  
          if (force_o_largefile())
                  flags |= O_LARGEFILE;
  
          ret = do_sys_open(AT_FDCWD, filename, flags, mode);
          /* avoid REGPARM breakage on x86: */
          prevent_tail_call(ret);
          return ret;
  }

where force_o_largefile() looks like this:

  #define force_o_largefile() (BITS_PER_LONG != 32)
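
As a side note (this user-space demo is hypothetical and not part of the original
report), the same check can be reproduced by a 32-bit program: an open() without
O_LARGEFILE is refused for a >2GB file such as the "hello" binary built below,
while an explicit O_LARGEFILE open succeeds.  RHEL5-era kernels report EFBIG
here; later kernels switched this particular error to EOVERFLOW.

  /* lfs_test.c -- build 32-bit: gcc -m32 -o lfs_test lfs_test.c
   * (no -D_FILE_OFFSET_BITS=64, so the plain open() lacks O_LARGEFILE) */
  #define _GNU_SOURCE                    /* exposes O_LARGEFILE in <fcntl.h> */
  #include <fcntl.h>
  #include <unistd.h>
  #include <stdio.h>
  #include <string.h>
  #include <errno.h>

  int main(int argc, char **argv)
  {
          const char *path = argc > 1 ? argv[1] : "./hello";
          int fd;

          fd = open(path, O_RDONLY);               /* no O_LARGEFILE */
          if (fd < 0)
                  printf("open(%s): %s\n", path, strerror(errno));
          else
                  close(fd);

          fd = open(path, O_RDONLY | O_LARGEFILE); /* what sys_open forces on 64-bit */
          if (fd < 0)
                  printf("open(%s) O_LARGEFILE: %s\n", path, strerror(errno));
          else {
                  printf("open(%s) O_LARGEFILE: ok\n", path);
                  close(fd);
          }
          return 0;
  }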
    
Anyway this would occur on any executable greater than 2GB in size.
For example, I took this program:

  #include <stdio.h>

  int main(void)
  {
          printf("hello world\n");
          return 0;
  }

compiled it into "hello":

  # hello
  hello world
  #

then did this:

  # cat main >> hello
  # ./hello
  -bash: ./hello: File too large
  #

I wrote a jprobe handler that catches the generic_file_open() function,
prints the inode and file pointer information, dumps the stack, and
then force-sets the O_LARGEFILE bit in filp->f_flags:

  # insmod jprobe.ko
  # ./hello
  hello world
  # dmesg
  Planted jprobe at ffffffff810a1bd3, handler addr ffffffff88215000
  generic_file_open: inode=0xffff81004c58adc0, i_size: 2588871678 \
                     filp=0xffff81008a62f700 f_flags: 0

  Call Trace:
   [<ffffffff8821502f>] :jprobe:jdo_fork+0x2f/0x64
   [<ffffffff810a1f3a>] __dentry_open+0xd9/0x1b0
   [<ffffffff810a7401>] open_exec+0x76/0xc0
   [<ffffffff8109c115>] init_object+0x27/0x6e
   [<ffffffff8109dd18>] kmem_cache_alloc+0x7a/0xa0
   [<ffffffff810544bd>] trace_hardirqs_on+0x12e/0x151
   [<ffffffff810a8549>] do_execve+0x46/0x1f6
   [<ffffffff8100a61d>] sys_execve+0x36/0x4c
   [<ffffffff8100bff7>] stub_execve+0x67/0xb0

  pid: 10186 comm: bash (setting O_LARGEFILE)
  #

(BTW jprobes is pretty cool -- it's the first time I've ever used it...)
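
For anyone wanting to reproduce the experiment, a minimal jprobe module along
these lines looks roughly like the following.  This is a from-memory sketch
against a 2.6.23-era kprobes API and not the exact module used above; older
kernels may need JPROBE_ENTRY() for the .entry assignment and may lack
.symbol_name support:

  #include <linux/module.h>
  #include <linux/kernel.h>
  #include <linux/kprobes.h>
  #include <linux/fs.h>
  #include <linux/sched.h>

  /* Mirrors the signature of generic_file_open(); called with its arguments. */
  static int jgeneric_file_open(struct inode *inode, struct file *filp)
  {
          printk("generic_file_open: inode=0x%p, i_size: %lld filp=0x%p f_flags: %x\n",
                 inode, (long long)i_size_read(inode), filp, filp->f_flags);
          dump_stack();

          /* Force-set O_LARGEFILE so the real generic_file_open() lets it pass. */
          filp->f_flags |= O_LARGEFILE;
          printk("pid: %d comm: %s (setting O_LARGEFILE)\n",
                 current->pid, current->comm);

          jprobe_return();        /* mandatory: returns control to the probed function */
          return 0;               /* never reached */
  }

  static struct jprobe jp = {
          .entry          = (void *)jgeneric_file_open,
          .kp.symbol_name = "generic_file_open",
  };

  static int __init jp_init(void)
  {
          int ret = register_jprobe(&jp);

          if (ret < 0)
                  return ret;
          printk("Planted jprobe at %p, handler addr %p\n", jp.kp.addr, jp.entry);
          return 0;
  }

  static void __exit jp_exit(void)
  {
          unregister_jprobe(&jp);
  }

  module_init(jp_init);
  module_exit(jp_exit);
  MODULE_LICENSE("GPL");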

Anyway, it seems as simple as forcing O_LARGEFILE somewhere in the
sys_execve() path.  I wonder why nobody has ever complained about this
before.

Oh yeah -- my testing above was on an FC8 (2.6.23) kernel.
 


Comment 3 Dave Anderson 2007-10-30 19:40:03 UTC
> Anyway, it seems as simple as forcing O_LARGEFILE somewhere in the
> sys_execve() path.  I wonder why nobody has ever complained about this
> before.

My proposed RHEL5 linux-kernel-test.patch to open_exec(), the brew-built
kernel, and the kernel src.rpm can be found here:

  http://people.redhat.com/anderson/BZ_224679

Tested with the attached "main" executable.


Comment 11 Dave Anderson 2007-12-06 13:28:16 UTC
> Such functionality should apply even for i686.

Jan,

Andi Kleen agrees with you there, i.e., that this should
also apply to 32-bit arches.

Can you create a >2GB i386 executable that I can test?

Thanks,
  Dave
 

Comment 12 Jan Kratochvil 2007-12-06 16:21:14 UTC
Created attachment 279851 [details]
.gz.bz2 of 2.5GB RHEL5.i386 ELF for convenience on memory-limited systems.

But gcc.i386 is unable to create such a file (I used gcc.x86_64 -m32 instead):
cc1: out of memory allocating 8016 bytes after a total of 925716480 bytes

Comment 13 Dave Anderson 2007-12-07 15:47:56 UTC
The i386 test program runs fine with my posted patch.

Upstream, Andi Kleen extended my posted patch to unconditionally
set O_LARGEFILE in open_exec(), and also in sys_uselib():

=========================================================

To: Dave Anderson <anderson@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>, linux-kernel@vger.kernel.org
Subject: [NEW-PATCH] exec: allow > 2GB executables to run on 64-bit systems

Since Dave didn't post an updated patch, this is how I think the
patch should be.  I also changed sys_uselib just to be complete.

----

Always use O_LARGEFILE for opening executables

This allows executables >2GB to be used.

Based on a patch by Dave Anderson

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux-2.6.24-rc3/fs/exec.c
===================================================================
--- linux-2.6.24-rc3.orig/fs/exec.c
+++ linux-2.6.24-rc3/fs/exec.c
@@ -119,7 +119,7 @@ asmlinkage long sys_uselib(const char __
 	if (error)
 		goto exit;
 
-	file = nameidata_to_filp(&nd, O_RDONLY);
+	file = nameidata_to_filp(&nd, O_RDONLY|O_LARGEFILE);
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
 		goto out;
@@ -658,7 +658,8 @@ struct file *open_exec(const char *name)
 			int err = vfs_permission(&nd, MAY_EXEC);
 			file = ERR_PTR(err);
 			if (!err) {
-				file = nameidata_to_filp(&nd, O_RDONLY);
+				file = nameidata_to_filp(&nd,
+							O_RDONLY|O_LARGEFILE);
 				if (!IS_ERR(file)) {
 					err = deny_write_access(file);
 					if (err) {


Comment 15 Don Zickus 2008-01-10 20:39:40 UTC
in 2.6.18-66.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 17 Don Domingo 2008-02-06 02:36:06 UTC
added to RHEL5.2 release notes under "Kernel-Related Updates":

<quote>
Executing binaries with more than 2GB of debug information no longer fails.
</quote>

please advise if any further revisions are required. thanks!

Comment 18 Dave Anderson 2008-02-06 13:53:08 UTC
(In reply to comment #17)
> added to RHEL5.2 release notes under "Kernel-Related Updates":
> 
> <quote>
> Executing binaries with more than 2GB of debug information no longer fails.
> </quote>
> 
> please advise if any further revisions are required. thanks!

It does not require 2GB of *debug* information -- it's simply a matter of
the file size of 64-bit binaries.  So you could say something like:

<quote>
Executing 64-bit binaries greater than 2GB no longer fails.
</quote>




Comment 19 Jan Kratochvil 2008-02-06 18:23:19 UTC
While technically correct, I would find it more misleading, as RHEL still does not
support general executables with >2GB of code/data because GCC cannot produce them.

See man gcc, -mcmodel=large:
  Generate code for the large model: This model makes no assumptions about
  addresses and sizes of sections.  Currently GCC does not implement this model.
Linking fails because everything uses R_X86_64_32 / R_X86_64_PC32 relocations under the default model,
-mcmodel=small:
  Generate code for the small code model: the program and its symbols must be
  linked in the lower 2 GB of the address space.  Pointers are 64 bits.
  Programs can be statically or dynamically linked.
  This is the default code model.
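
To make the relocation limit concrete, consider a hypothetical example (not from
this bug): a single static object larger than 2GB cannot even be linked with the
default small code model, because its addresses are not reachable through the
32-bit R_X86_64_32 / R_X86_64_PC32 relocations:

  /* big.c -- ~3GB of zero-initialized static data */
  static char blob[3UL * 1024 * 1024 * 1024];

  int main(void)
  {
          blob[sizeof(blob) - 1] = 1;
          return blob[sizeof(blob) - 1];
  }

With the default -mcmodel=small this typically dies at link time with
"relocation truncated to fit" errors; -mcmodel=medium (small code, large data)
is the usual workaround where available, and -mcmodel=large is unimplemented as
quoted above.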


Comment 20 Dave Anderson 2008-02-06 18:37:37 UTC
(In reply to comment #19)
> While technically right I would find it more misleading as RHEL still does not
> support general executables with >2GB of code/data as GCC cannot produce it.

Point taken -- I was just looking at it from the kernel point of view, which
doesn't care about which parts of the binary cause the "> 2gb size".

Anyway, I rescind my release note suggestion.

Comment 21 Don Domingo 2008-04-02 02:16:40 UTC
Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don

Comment 23 errata-xmlrpc 2008-05-21 14:41:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html


