Bug 590151 - cxgb3i_ddp Error occurred on the host and the Sessions failed to login to controller during controller reboot test - (CR175840)
Keywords:
Status: CLOSED DUPLICATE of bug 567444
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: powerpc
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Mike Christie
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-05-07 20:37 UTC by hong.chung
Modified: 2010-05-26 19:15 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-05-26 19:14:27 UTC
Target Upstream Version:


Attachments
cxgb3i-dsmeg and messages log (deleted)
2010-05-07 20:37 UTC, hong.chung

Description hong.chung 2010-05-07 20:37:26 UTC
Created attachment 412437 [details]
cxgb3i-dsmeg and messages log

Description of problem:

While running a 12-hour controller reboot test, after several array reboots (the number varies), the login sessions failed to log back in to the target, which caused the host to report read/write errors.

We have another RHEL 5.5 host running over plain TCP, without the TOE (cxgb3i), and we did not see this issue there.

Version-Release number of selected component (if applicable):

Hostname:          Ayeka-RH55
Host IP:           10.10.10.35
Kernel Release:    2.6.18-194.el5
RHEL Release:      Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Version:           Linux version 2.6.18-194.el5 (mockbuild@ppc-008.build.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Tue Mar 16 22:03:12 EDT 2010
Platform:          ppc64

filename:       /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/cxgb3i/cxgb3i.ko
version:        1.0.2
license:        GPL
description:    Chelsio S3xx iSCSI Driver
author:         Karen Xie <kxie@chelsio.com>
srcversion:     92C55DB348F36EC6BDAA7B6
depends:        libiscsi2,libiscsi_tcp,cxgb3,scsi_transport_iscsi2,scsi_mod
vermagic:       2.6.18-194.el5 SMP mod_unload gcc-4.1
parm:           cxgb3_rcv_win:TCP receive window in bytes (default=256KB) (int)
parm:           cxgb3_snd_win:TCP send window in bytes (default=128KB) (int)
parm:           cxgb3_rx_credit_thres:int
parm:           rx_credit_thres:RX credits return threshold in bytes (default=10KB)
parm:           cxgb3_max_connect:Max. # of connections (default=8092) (uint)
parm:           cxgb3_sport_base:starting port number (default=20000) (uint)
module_sig:	883f3504ba0409b9ba3c1dbd688a8a41123a3d09f5e7f92eac401995da8ff3e3d5baca33afe8bd00a0b618675fcf9083773f11c5cdfe5871846c1f146b


How reproducible:
Often.

Steps to Reproduce:
1. Created 64 volumes in each array.
2. Created snapshots of all volumes.
3. Mapped 64 base volumes from each array to the RHEL 5.5 host.
4. Started the 4 cxgb3i sessions, 2 for each array:
cxgb3i: [1] 10.10.10.10:3260,1 iqn.1992-01.com.lsi:4981.60080e500017b96a000000004b4db14b
cxgb3i: [2] 10.10.10.20:3260,1 iqn.1992-01.com.lsi:4981.60080e500017b962000000004b4db188
cxgb3i: [3] 11.11.11.10:3260,2 iqn.1992-01.com.lsi:4981.60080e500017b96a000000004b4db14b
cxgb3i: [4] 11.11.11.20:3260,2 iqn.1992-01.com.lsi:4981.60080e500017b962000000004b4db188

5. Started I/O using LunixSmash and started the sysreboot test.
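
To verify step 4 after each controller reboot, the session listing can be grouped by target IQN and checked for two sessions per array. The following is a minimal sketch; it assumes the lines above are in the `iscsiadm -m session` format (transport, session id, portal, portal group tag, target name), and the function and variable names are illustrative only.

```python
import re
from collections import defaultdict

# Assumed line format: "transport: [sid] ip:port,tpgt target-iqn"
SESSION_RE = re.compile(
    r"^(?P<transport>\S+): \[(?P<sid>\d+)\] "
    r"(?P<portal>[\d.]+:\d+),(?P<tpgt>\d+) (?P<target>\S+)$"
)


def sessions_by_target(listing):
    """Group session portals by target IQN; non-matching lines are skipped."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        m = SESSION_RE.match(line.strip())
        if m:
            groups[m.group("target")].append(m.group("portal"))
    return dict(groups)


# Session listing copied from the report.
LISTING = """\
cxgb3i: [1] 10.10.10.10:3260,1 iqn.1992-01.com.lsi:4981.60080e500017b96a000000004b4db14b
cxgb3i: [2] 10.10.10.20:3260,1 iqn.1992-01.com.lsi:4981.60080e500017b962000000004b4db188
cxgb3i: [3] 11.11.11.10:3260,2 iqn.1992-01.com.lsi:4981.60080e500017b96a000000004b4db14b
cxgb3i: [4] 11.11.11.20:3260,2 iqn.1992-01.com.lsi:4981.60080e500017b962000000004b4db188
"""

if __name__ == "__main__":
    for target, portals in sessions_by_target(LISTING).items():
        status = "ok" if len(portals) == 2 else "MISSING SESSION"
        print(target, portals, status)
```

A target with fewer than two portals after a reboot cycle would indicate a session that did not log back in.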

*sysreboot test:
The test reboots (sysReboot) both A controllers, sleeps ten minutes, then reboots both B controllers, sleeps ten minutes, and so on. The test should run for 12 hours.
While the A controllers are rebooting, I/O runs on the alternate path ("B path"). When the reboot completes, I/O returns to its preferred path ("A path").
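
The alternating reboot cycle can be sketched as a schedule generator. This is only an illustration of the timing described above, not the actual test script (which is not included in this report); it assumes the cycle starts with the A controllers and ignores the time the reboot itself takes. The function name and parameters are hypothetical.

```python
from itertools import cycle


def reboot_schedule(total_minutes=12 * 60, sleep_minutes=10, sides=("A", "B")):
    """Yield (minute_offset, side) for each reboot in the alternating test.

    Assumed cycle: reboot one side, sleep, reboot the other side, sleep.
    Reboot duration is not modelled.
    """
    minute = 0
    for side in cycle(sides):
        if minute >= total_minutes:
            return
        yield minute, side
        minute += sleep_minutes


if __name__ == "__main__":
    plan = list(reboot_schedule())
    print("reboots planned:", len(plan))
    for minute, side in plan[:4]:
        print(f"t+{minute:3d} min: reboot both {side} controllers")
```

Under these assumptions each side is rebooted several dozen times in a 12-hour run, which matches the observation that the failure shows up after a varying number of reboots rather than at a fixed point.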
  
Actual results:
After several target reboots (the number varies), the login sessions failed to log back in to the target, which caused the host to report read/write errors.

*The following log entries appeared in /var/log/messages before the I/O errors occurred:

May  6 13:05:23 Ayeka-RH55 kernel: cxgb3i_ddp: ERR! release 0x1005274b, idx 0x149d, gl 0x0000000000000000, 0.
May  6 13:05:24 Ayeka-RH55 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
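
To find every occurrence of this pair over a 12-hour run, the messages file can be scanned with two patterns built from the lines above. This is a rough sketch; the regular expressions are derived only from the two sample lines, and the field labels (`release`, `idx`, `gl`, `sid`, `cid`) are my own names, not taken from the driver or iscsid sources.

```python
import re

# Patterns derived from the two sample lines in this report.
DDP_RE = re.compile(
    r"cxgb3i_ddp: ERR! release (?P<release>0x[0-9a-f]+), "
    r"idx (?P<idx>0x[0-9a-f]+), gl (?P<gl>0x[0-9a-f]+)"
)
CONN_RE = re.compile(
    r"iscsid: Kernel reported iSCSI connection (?P<sid>\d+):(?P<cid>\d+) "
    r"error \((?P<error>\d+)\) state \((?P<state>\d+)\)"
)


def scan(lines):
    """Return (timestamp, kind, fields) for each matching log line."""
    events = []
    for line in lines:
        stamp = line[:15]  # syslog timestamp, e.g. "May  6 13:05:23"
        m = DDP_RE.search(line)
        if m:
            events.append((stamp, "ddp_release", m.groupdict()))
            continue
        m = CONN_RE.search(line)
        if m:
            events.append((stamp, "conn_error", m.groupdict()))
    return events


# The two lines quoted in this report.
SAMPLE = [
    "May  6 13:05:23 Ayeka-RH55 kernel: cxgb3i_ddp: ERR! release 0x1005274b, "
    "idx 0x149d, gl 0x0000000000000000, 0.",
    "May  6 13:05:24 Ayeka-RH55 iscsid: Kernel reported iSCSI connection 2:0 "
    "error (1011) state (3)",
]

if __name__ == "__main__":
    for stamp, kind, fields in scan(SAMPLE):
        print(stamp, kind, fields)
```

For a real run, pass an open file object for /var/log/messages to `scan` in place of `SAMPLE`.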

Expected results:
Sessions log back in to the target and I/O continues without any errors.

Additional info:

Setup information (1x2)
Switch Cisco Nexus 5020 4.1(3)N2(1a)

HOST
Ayeka		172.22.229.116
OS - RHEL 5.5
Server Brand - IBM
Server Model - P520
Failover - MPP
Architecture - PPC
HBA 1 Model / Protocol - RedRiver / iscsi

eth0 : 10.10.10.35
eth1 : 11.11.11.35

/var/lib/iscsi/iface/

iface.iscsi_ifacename = cxgb3i.00:14:5e:99:04:68
iface.hwaddress = 00:14:5e:99:04:68
iface.ipaddress = 10.10.10.35
iface.net_ifacename = <empty>
iface.transport_name = cxgb3i
iface.initiator_name = iqn.1994-05.com.redhat:7ae845b4ba6b

iface.iscsi_ifacename = cxgb3i.00:14:5e:99:04:66
iface.hwaddress = 00:14:5e:99:04:66
iface.ipaddress = 11.11.11.35
iface.net_ifacename = <empty>
iface.transport_name = cxgb3i
iface.initiator_name = iqn.1994-05.com.redhat:7ae845b4ba6b


Array 1
Model - 49xx
CFW - 07.60.35.00
Protocol - iscsi
Speed - 1Gb

Array 2
Model - 49xx
CFW  - 07.70.10.00
Protocol - iscsi
Speed - 1Gb

Comment 1 Mike Christie 2010-05-10 19:46:13 UTC
Chelsio has confirmed this is fixed upstream. They were planning on bringing the fix into 5.6 in this bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=567444
requesting that we update their driver.

Comment 2 Abdel Sadek 2010-05-11 21:21:49 UTC
(In reply to comment #1)
> Chelsio has confirmed this is fixed upstream. They were planning on bringing
> the fix into 5.6 in this bugzilla
> https://bugzilla.redhat.com/show_bug.cgi?id=567444
> requesting that we update their driver.    

Will the fix be backported to RHEL 5.5 maintenance?

Comment 3 Andrius Benokraitis 2010-05-26 19:14:27 UTC

*** This bug has been marked as a duplicate of bug 567444 ***

Comment 4 Andrius Benokraitis 2010-05-26 19:15:28 UTC
LSI - please add yourselves to the dupe'd bug 567444. Lobbying for z-stream should be done there.

