| title: | Re: [PATCH v3 05/11] PCI: beef up pci do scan |
|
* Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>:
Alex Chiang wrote:
The more I think about it though, the more I think that even
without the below patch to clean up the callers of
pci_do_scan_bus, we should be ok, because:
- all the old code (which I removed below) existed
because the old PCI core would refuse to scan PCI buses
that had already been discovered
- that meant that it would never descend past a known
bridge to try and find new child bridges
- that meant that hotplug drivers had to manually
discover new bridges and add them, essentially
duplicating functionality in pci_scan_bridge
This patch series allows the PCI core to scan existing bridges
and descend down into the children every time, looking for new
bridges and devices, so all the code in shpchp, cpcihp, and other
callers of pci_do_scan_bus shouldn't be necessary anymore.
Also, if we do add new bridges once manually in shpchp, and then
call the new pci_do_scan_bus again, we will _not_ add devices
twice because the core should check each bridge and device for
struct pci_dev.is_added.
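
To illustrate why a second scan should be harmless, here is a minimal userspace sketch of the is_added check. It is a toy model, not the kernel code: struct toy_dev and toy_bus_add_devices() are hypothetical names, and only the flag described above is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev; only the is_added flag is modeled. */
struct toy_dev {
    bool present;   /* hardware responds at this slot */
    bool is_added;  /* already registered in the toy device tree */
};

/*
 * Register every present device that has not been added yet.
 * Returns the number of devices newly added by this call.
 */
int toy_bus_add_devices(struct toy_dev *devs, size_t n)
{
    int added = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].present || devs[i].is_added)
            continue;   /* skip empty slots and already-known devices */
        devs[i].is_added = true;
        added++;
    }
    return added;
}
```

In this model, calling toy_bus_add_devices() twice in a row adds nothing the second time; only a device that appeared in between is picked up.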
So anyway, I think that cleaning up the callers of
pci_do_scan_bus is a good idea, but multiple calls to the
interface definitely should not result in problems. If they do,
then that's a bug in my patch series.
I'm sorry, but I didn't have enough time to try your patch in
my environment. So I'm still just looking at the code.
Ok.
I looked at shpchp_configure_device() from the view point of
bridge hot-add. I think it is broken regardless of your change
because it calls pci_bus_add_devices() (through pci_do_scan_bus)
before assigning resources. So I think it must be changed
regardless of your change. But it's a little difficult for me
because I don't have any test environment, as I mentioned before.
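
The ordering concern can be sketched with a toy userspace model. Everything here is hypothetical (struct toy_bridge and the toy_* helpers are not kernel interfaces), and the explicit failure return is only a modeling device to make the bad ordering visible; the real pci_bus_add_devices() does not behave this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a hot-added bridge; not the kernel's struct pci_dev. */
struct toy_bridge {
    bool resources_assigned;  /* windows/BARs have been programmed */
    bool is_added;            /* visible to drivers in the toy device tree */
};

void toy_assign_resources(struct toy_bridge *br)
{
    assert(br != NULL);
    br->resources_assigned = true;
}

/*
 * Adding a device exposes it to drivers, so the model refuses to add
 * a device that has not been programmed yet.
 * Returns 0 on success, -1 if resources were not assigned first.
 */
int toy_add_device(struct toy_bridge *br)
{
    assert(br != NULL);
    if (!br->resources_assigned)
        return -1;
    br->is_added = true;
    return 0;
}

/* Ordering being criticized: add first, assign afterwards. */
int toy_configure_add_first(struct toy_bridge *br)
{
    int ret = toy_add_device(br);

    toy_assign_resources(br);
    return ret;
}

/* Ordering being suggested: assign first, then add. */
int toy_configure_assign_first(struct toy_bridge *br)
{
    toy_assign_resources(br);
    return toy_add_device(br);
}
```

In this model only toy_configure_assign_first() ends with a usable, added bridge.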
Hm, what you say makes sense.
I managed to find a very old machine supported by cpqphp, and
also found a card with a bridge.
cpqhp_configure_device() follows a similar algorithm to
shpchp_configure_device(). I'm just starting my testing now, and
there is good news and bad news.
The bad news is that although cpqphp loads successfully, and we
can successfully offline a card, we cannot online it again
afterwards due to BAR collisions. This failure occurs even
without my changes (2.6.27 kernel), and I haven't had time to
track the regression down yet.
We do discover the bridge on the device correctly and it is added
back into the device tree correctly, but we can't use it because
it's not programmed correctly.
The good news is, after rewriting cpqhp_configure_device() to
resemble the shpchp patch I gave you, we still discover the
bridge correctly and add it back into the device tree in the
proper place. We no longer get BAR collisions, but we fail in a
slightly different way.
At least I'm not introducing a new regression in cpqphp, and I
suspect shpchp will be similar.
But I'm still worried about your change to pci_do_scan_bus().
Without your change, pci_do_scan_bus() scans child buses and adds
devices without assigning resources. I guess that means existing
callers of pci_do_scan_bus() have some mechanism to assign resources
by themselves, and they don't expect pci_do_scan_bus() to assign
resources.
I looked through shpchp and couldn't find this assumption. Is it
stored in the struct controller, under mmio_base and mmio_size?
I am motivated to get this patch series into 2.6.30 for several
reasons, so I think for now, I will not change pci_do_scan_bus().
Instead, I'll create a new interface that only the PCI core will
use, and leave the drivers alone.
Over time, we can migrate the drivers to the PCI core interface.
By the way, I have one question about rescan. Please suppose that
we enable the bridge(B) and its children using the rescan interface
in the picture below.
                    |
  --------------------------------------  parent bus
        |                        |
    bridge(A)                bridge(B)
    (working)              (Not working)
        |                        |
  -------------            -------------
   |         |              |         |
  dev       dev            dev       dev
(working) (working)        (Not working)
In this case, your rescan mechanism calls pci_do_scan_bus() for
the parent bus, and pci_do_scan_bus() calls pci_bus_assign_resources()
for the parent bus. My question is, does pci_bus_assign_resources() do
nothing to bridge(A), which is currently working? I guess
pci_bus_assign_resources() would update some registers of bridge(A)
and that would break the currently working devices.
This is a very good catch, thank you.
I added another patch to prevent this situation. We now check to
see if the bridge is already added inside of pci_setup_bridge().
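
As a rough sketch of the idea, assuming a toy userspace model rather than the actual patch: struct toy_bridge, toy_setup_bridge() and toy_bus_assign_resources() are hypothetical names, and window_base merely stands in for whatever registers the real pci_setup_bridge() programs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_bridge {
    bool is_added;         /* already live in the toy device tree */
    unsigned window_base;  /* stand-in for the programmed bridge window */
};

/*
 * Program a bridge window, but leave already-added bridges untouched.
 * Returns true if the (toy) registers were written.
 */
bool toy_setup_bridge(struct toy_bridge *br, unsigned new_base)
{
    assert(br != NULL);
    if (br->is_added)
        return false;   /* working bridge: do not reprogram it */
    br->window_base = new_base;
    return true;
}

/* Walk all bridges on a parent bus; returns how many were reprogrammed. */
int toy_bus_assign_resources(struct toy_bridge *brs, size_t n,
                             unsigned first_base)
{
    int written = 0;

    assert(brs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (toy_setup_bridge(&brs[i], first_base + (unsigned)i * 0x1000u))
            written++;
    }
    return written;
}
```

With bridge(A) marked as added and bridge(B) not, a pass over the parent bus in this model rewrites only bridge(B) and leaves bridge(A) as it was.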
Thanks.
/ac
|