Bug 153015 - GFS 6.0 causes poor app performance and kernel panic when not using "localflock" mount option
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: gfs
Version: 3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ben Marzinski
QA Contact: GFS Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-03-31 21:44 UTC by Kyle Gonzales
Modified: 2010-01-12 03:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-12-13 19:54:55 UTC


Attachments:
Strace of app over GFS (deleted), 2005-03-31 21:46 UTC, Kyle Gonzales

Description Kyle Gonzales 2005-03-31 21:44:42 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Description of problem:
From a customer:

"I have narrowed down the issue with GFS.  I created my new file system
with the locking protocol of LOCK_GULM, and mounted it normally using
mount -t gfs -o acl ......  It showed the exact same problems as before
-- the app would freeze up at a certain point, and I would have to kill
it to continue.  I have created an strace of the app which ends at the
freeze.  It is attached to this email.  When I unmounted the file system
after killing my app, I got the following kernel panic:

Kernel panic: GFS: Assertion failed on line 1076 of file linux_super.c
GFS: assertion: "list_empty(&sdp->sd_plock_list)"
GFS: time = 1112216791
GFS: fsid=TheBorg:gfs_test.0

(The hardware is a Dell GX280 single proc with Hyperthreading enabled.
The running kernel is 2.4.21-27.0.2.ELsmp and GFS-6.0.2-25.)

After rebooting, I mounted the file system using mount -t gfs -o
acl,localflocks ...., and my app appears to be working perfectly -- the
performance is there, I don't see any hesitation, and most of all, I'm
able to get into the parts of the app which were hanging before.  Not
only that, I'm able to get multiple instances of the app running in the
same areas, as it was designed.  Everything looks really nice.

It appears that I am running into a locking performance problem.  I
would welcome any ideas on how to solve this, if there are any
available.  :)"

Version-Release number of selected component (if applicable):
GFS 6.0

How reproducible:
Always

Steps to Reproduce:
1. Mount GFS filesystem using "mount -t gfs -o acl <file system> <mount point>"
2. Run app
3. Stop app and unmount
  

Actual Results:  Poor application performance and lockups, and a kernel panic on unmount

Expected Results:  Good application performance, and no kernel panic

Additional info:
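The assertion in the panic names sd_plock_list, which suggests that POSIX (fcntl) locks are involved; that reading is an assumption, not something confirmed in this report. Under that assumption, a minimal sketch of a probe for comparing lock latency between a mount with and without localflocks could look like the following. It is not part of the original report, the function name time_posix_locks and the default probe path are hypothetical, and it targets a current Python 3 rather than the interpreter shipped with the affected release.

```python
import fcntl
import os
import sys
import tempfile
import time


def time_posix_locks(path, iterations=1000):
    """Return the mean wall-clock seconds per lock/unlock pair on `path`.

    Uses fcntl.lockf, which wraps the fcntl() record-locking calls,
    so each iteration takes and releases one exclusive POSIX lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
            fcntl.lockf(fd, fcntl.LOCK_UN)  # release it again
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / iterations


if __name__ == "__main__":
    # Hypothetical default: a scratch file in the system temp directory.
    # Pass a path on the file system under test as the first argument.
    default = os.path.join(tempfile.gettempdir(), "plock_probe.tmp")
    target = sys.argv[1] if len(sys.argv) > 1 else default
    mean = time_posix_locks(target)
    print("mean seconds per lock/unlock pair: %.6f" % mean)
```

Running the script against a file on each mount and comparing the two means would show whether lock round trips alone account for the slowdown. It exercises a single uncontended lock from one process, so it does not reproduce the hang or the panic.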

Comment 1 Kyle Gonzales 2005-03-31 21:46:25 UTC
Created attachment 112546
Strace of app over GFS

Strace of the app that is seeing poor performance over GFS when not using
the localflocks mount option

Comment 2 Peter Shearer 2005-03-31 22:48:03 UTC
The filesystem was created using the following command:

mkfs_gfs -p lock_gulm -t TheBorg:gfs_test -j 10 /dev/pool/gfs_test

The cluster only has two computers in it, and both mount this file system.  The 
10 journals were created for expansion.  We will deploy it starting with 3 
servers in the cluster (all mounting the file system), and probably the same 
number of journals so that file system resizing is not needed in the near 
future when more servers are added to the cluster.

--Peter

Comment 3 Ben Marzinski 2005-04-22 21:57:13 UTC
Some questions:

What is the App? Can I get a copy of it to play with? Is the performance bad right
from the start, or does it get worse over time? If over time, then how long does
it have to run until it starts having problems?

Comment 4 Kyle Gonzales 2005-04-22 22:02:00 UTC
> Some questions
> 
> What is the App? Can I get a copy of it to play with? Is the performance bad 
> right from the start, or does it get worse over time? If over time, then how 
> long does it have to run until it starts having problems?

Peter can provide more information about the app.  If I remember correctly,
though, performance was bad right from the start.

Comment 5 Ben Marzinski 2005-04-22 22:47:27 UTC
More questions:

Is the App just a single process running on each machine? If so, getting
traces like the one attached earlier, but of both machines, would be really
helpful.

Comment 6 Ben Marzinski 2006-09-15 19:49:02 UTC
This bug has been inactive for over a year. Does anyone object to me closing it out?

