Sun Fire V480 Systems Using a Specific Configuration May Unexpectedly Reset
Category: Availability
Release Phase: Resolved
Product: Sun Fire V480 Server
Bug Id: 4898531
Date of Resolved Release: 04-AUG-2004
Impact
With a limited number of applications and with a specific system configuration, Sun Fire V480 systems may experience a system reset followed by a reboot, or possibly a hard hang.
Contributing Factors
This issue can occur in the following releases:
SPARC Platform
This issue only occurs if all of the following apply:
- the "ce1" onboard interface is configured on PCI bus C, and
- there is a PCI card present in either of the 66 MHz PCI slots (slot 0 or 1), and
- high data activity (read or write) occurs on the PCI bus
Notes:
1) It is very difficult to determine exactly how much PCI activity is needed to trigger this issue.
2) With network cards P/N 501-5524 or 501-5902 installed, the second onboard network port may be numbered something other than "ce1". The reason is that controller IDs are assigned according to the order in which the PCI busses are numbered. On the Sun Fire V480, there may be 4 PCI busses, as shown by the example below:
/pci@8,700000 "B" (33 MHz) bus on "first" schizo
/pci@8,600000 "A" (66 MHz) bus on "first" schizo
/pci@9,700000 "D" (33 MHz) bus on "second" schizo
/pci@9,600000 "C" (66 MHz) bus on "second" schizo
Solaris walks through the device tree and assigns controller IDs in the order in which devices are found, so in the V480 case above, the PCI probe list (which determines the device build order) looks like the following:
/pci@8,700000/@2 (bus B, PCI slot 2, 33 MHz)
/pci@8,700000/@3 (bus B, PCI slot 3, 33 MHz)
/pci@8,700000/@4 (bus B, PCI slot 4, 33 MHz)
/pci@8,700000/@5 (bus B, PCI slot 5, 33 MHz)
/pci@8,700000/ide@6 (bus B, onboard IDE, DVD-ROM)
/pci@8,600000/@1 (bus A, PCI slot 0, 66 MHz)
/pci@8,600000/@2 (bus A, PCI slot 1, 66 MHz)
/pci@9,700000/ebus@1 (bus D, serial, pmc, rsc, etc.)
/pci@9,700000/usb@1,3 (bus D, USB ports)
/pci@9,700000/network@2 (bus D, ce0, net0, onboard 10/100/1000 ethernet interface)
/pci@9,600000/network@1 (bus C, ce1, net1, onboard 10/100/1000 cassini interface)
/pci@9,600000/SUNW,qlc@2 (bus C, onboard FC-AL, ISP2200)
For instance, if one of these network cards is installed in slot 2, it is found first in the probe list and takes the controller number "ce0". The onboard ports then become "ce1" and "ce2" respectively.
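The numbering rule described in Note 2 can be illustrated with a minimal Python sketch. It models only the probe-order rule stated above and is not a Solaris tool. The function name `assign_ce_instances` and the assumption that each add-in card presents a single port using the "ce" driver are hypothetical and introduced here for illustration. The device paths are taken from the probe list in this document.

```python
# Network-relevant entries from the probe list above, in probe order.
# The second field is the PCI slot number, or None for an onboard port.
PROBE_ORDER = [
    ("/pci@8,700000/@2", 2),            # bus B, PCI slot 2, 33 MHz
    ("/pci@8,700000/@3", 3),            # bus B, PCI slot 3, 33 MHz
    ("/pci@8,700000/@4", 4),            # bus B, PCI slot 4, 33 MHz
    ("/pci@8,700000/@5", 5),            # bus B, PCI slot 5, 33 MHz
    ("/pci@8,600000/@1", 0),            # bus A, PCI slot 0, 66 MHz
    ("/pci@8,600000/@2", 1),            # bus A, PCI slot 1, 66 MHz
    ("/pci@9,700000/network@2", None),  # bus D, first onboard port (net0)
    ("/pci@9,600000/network@1", None),  # bus C, second onboard port (net1)
]


def assign_ce_instances(ce_card_slots=()):
    """Return {device path: controller name} for a device-tree walk.

    ce_card_slots: slot numbers that hold a network card.
    Assumption (hypothetical): each such card exposes exactly one "ce" port.
    """
    names = {}
    for path, slot in PROBE_ORDER:
        if slot is None or slot in ce_card_slots:
            # Controller IDs are handed out in the order devices are found.
            names[path] = f"ce{len(names)}"
    return names


if __name__ == "__main__":
    for slots in [(), (2,)]:
        print("cards in slots:", list(slots))
        for path, name in assign_ce_instances(slots).items():
            print(f"  {name}  {path}")
```

With no add-in card, the bus C onboard port is "ce1". With a card in slot 2, the same port shifts to "ce2", so the first condition under Contributing Factors has to be checked against the device path on bus C rather than the interface name alone.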
Symptoms
Should the described issue occur and lead to a reboot, the following message will be displayed on the system console:
ERROR: System "FATAL RESET" from DAR/DCS/CDX
The above message is not logged to the "/var/adm/messages" file. Kernel core files are not generated.
In rare cases, the system may experience a hard hang with no error messages displayed or logged.
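Because the reset message appears only on the system console, it can be found afterwards only if console output was captured somewhere. The following minimal Python sketch searches such a capture for the message text quoted above. The function name `find_fatal_resets` is hypothetical, and it assumes the site already saves console output to a text file.

```python
# Marker text taken from the console message quoted in this document.
FATAL_RESET_MARKER = 'System "FATAL RESET"'


def find_fatal_resets(console_lines):
    """Return (line_number, text) for each line containing the marker.

    console_lines: any iterable of strings, for example an open text file
    holding captured console output (an assumption; the system itself does
    not log this message to /var/adm/messages).
    """
    hits = []
    for number, line in enumerate(console_lines, start=1):
        if FATAL_RESET_MARKER in line:
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    sample = [
        "console login:\n",
        'ERROR: System "FATAL RESET" from DAR/DCS/CDX\n',
        "Rebooting...\n",  # illustrative filler line, not from a real log
    ]
    for number, text in find_fatal_resets(sample):
        print(f"line {number}: {text}")
```

A hard hang produces no message at all, so an empty result does not rule out this issue.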
Workaround
Please see the Resolution section below.
Resolution
Identification of this issue requires troubleshooting to be performed by your Sun Services representative in order to determine the appropriate resolution.
The issue described in this Sun Alert is addressed with the implementation of a Field Change Order (FCO). If you are concerned that your operation may be affected by this issue, please contact your local Sun Services representative and reference this document.