From 05f1084f44d270131803dc559393ffa4cabbe34c Mon Sep 17 00:00:00 2001
From: Myron Stowe <mstowe@redhat.com>
Date: Tue, 25 Mar 2025 12:20:53 -0600
Subject: [PATCH] PCI: Avoid FLR for Mediatek MT7922 WiFi
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

JIRA: https://issues.redhat.com/browse/RHEL-83611
Upstream Status: 81f64e925c29fe6e99f04b131fac1935ac931e81

commit 81f64e925c29fe6e99f04b131fac1935ac931e81
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Feb 12 13:35:16 2025 -0600

    PCI: Avoid FLR for Mediatek MT7922 WiFi

    The Mediatek MT7922 WiFi device advertises FLR support, but it apparently
    does not work, and all subsequent config reads return ~0:

      pci 0000:01:00.0: [14c3:0616] type 00 class 0x028000 PCIe Endpoint
      pciback 0000:01:00.0: not ready 65535ms after FLR; giving up

    After an FLR, pci_dev_wait() waits for the device to become ready.  Prior
    to d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS"),
    it polls PCI_COMMAND until it is something other than PCI_POSSIBLE_ERROR
    (~0).  If it times out, pci_dev_wait() returns -ENOTTY and
    __pci_reset_function_locked() tries the next available reset method.
    Typically this is Secondary Bus Reset, which does work, so the MT7922 is
    eventually usable.

    After d591f6804e7e, if Configuration Request Retry Status Software
    Visibility (RRS SV) is enabled, pci_dev_wait() polls PCI_VENDOR_ID until it
    is something other than the special 0x0001 Vendor ID that indicates a
    completion with RRS status.

    When RRS SV is enabled, reads of PCI_VENDOR_ID should return either 0x0001,
    i.e., the config read was completed with RRS, or a valid Vendor ID.  On the
    MT7922, it seems that all config reads after FLR return ~0 indefinitely.
    When pci_dev_wait() reads PCI_VENDOR_ID and gets 0xffff, it assumes that's
    a valid Vendor ID and the device is now ready, so it returns with success.

    After pci_dev_wait() returns success, we restore config space and continue.
    Since the MT7922 is not actually ready after the FLR, the restore fails and
    the device is unusable.

    We considered changing pci_dev_wait() to continue polling if a
    PCI_VENDOR_ID read returns either 0x0001 or 0xffff.  This "works" as it did
    before d591f6804e7e, although we have to wait for the timeout and then fall
    back to SBR.  But it doesn't work for SR-IOV VFs, which *always* return
    0xffff as the Vendor ID.

    Mark Mediatek MT7922 WiFi devices to avoid the use of FLR completely.  This
    will cause fallback to another reset method, such as SBR.

    Link: https://lore.kernel.org/r/20250212193516.88741-1-helgaas@kernel.org
    Fixes: d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS")
    Link: https://github.com/QubesOS/qubes-issues/issues/9689#issuecomment-2582927149
    Link: https://lore.kernel.org/r/Z4pHll_6GX7OUBzQ@mail-itl
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
    Cc: stable@vger.kernel.org

Signed-off-by: Myron Stowe <mstowe@redhat.com>
---
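The readiness check and reset fallback described in the commit message can
be modeled in a few lines of standalone C. This is only a sketch under
stated assumptions: struct model_dev, ready_rrs_sv(), ready_strict(),
wait_ready() and reset_used() are hypothetical names that do not exist in
the kernel, the poll count of 8 is arbitrary, and the real pci_dev_wait()
sleeps between config reads rather than spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_ID_RRS   0x0001u /* Vendor ID that signals a completion with RRS */
#define VENDOR_ID_ERROR 0xffffu /* all ones, what the MT7922 returns after FLR */

/* Hypothetical device model, not a kernel structure. */
struct model_dev {
	uint16_t vendor_after_flr; /* value every Vendor ID read returns after FLR */
	bool no_flr;               /* stands in for the marking done by quirk_no_flr() */
};

enum reset_method { RESET_FLR, RESET_SBR };

/* Check described for RRS SV: anything other than 0x0001 counts as ready. */
static bool ready_rrs_sv(uint16_t vendor)
{
	return vendor != VENDOR_ID_RRS;
}

/* Alternative that was considered: also treat 0xffff as "not ready". */
static bool ready_strict(uint16_t vendor)
{
	return vendor != VENDOR_ID_RRS && vendor != VENDOR_ID_ERROR;
}

/* Poll up to max_polls times; true means the device was judged ready. */
static bool wait_ready(const struct model_dev *dev,
		       bool (*ready)(uint16_t), int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++) {
		if (ready(dev->vendor_after_flr))
			return true;
	}
	return false; /* timed out */
}

/*
 * Report which reset method ends up being relied on: FLR is skipped when
 * the device is marked no_flr, and SBR is the fallback when the wait after
 * FLR times out.
 */
static enum reset_method reset_used(const struct model_dev *dev,
				    bool (*ready)(uint16_t))
{
	if (!dev->no_flr && wait_ready(dev, ready, 8))
		return RESET_FLR;
	return RESET_SBR;
}
```

For a model device whose reads return 0xffff, reset_used() with
ready_rrs_sv() reports RESET_FLR, i.e. the bogus success described above.
With ready_strict() it falls back to RESET_SBR only after the polls are
exhausted, and with no_flr set it goes to RESET_SBR without polling. A
healthy SR-IOV VF also reads 0xffff, so ready_strict() cannot tell it apart
from a dead device, which is why the quirk was chosen instead.
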
 drivers/pci/quirks.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 02545f14d625e..c4618d9b3d66c 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5546,7 +5546,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
  * AMD Matisse USB 3.0 Host Controller 0x149c
  * Intel 82579LM Gigabit Ethernet Controller 0x1502
  * Intel 82579V Gigabit Ethernet Controller 0x1503
- *
+ * Mediatek MT7922 802.11ax PCI Express Wireless Network Adapter
  */
 static void quirk_no_flr(struct pci_dev *dev)
 {
@@ -5558,6 +5558,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr);
 
 /* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */
 static void quirk_no_flr_snet(struct pci_dev *dev)
-- 
GitLab