#Proxmox passthrough debugging


lost escarp
#

Please run these on your node via SSH and tell me what it says

pvesh get /nodes/YOURNODENAMEHERE/hardware/pci --pci-class-blacklist ""
cat /proc/cmdline
grep -sR hostpci /etc/pve/qemu-server
lsusb -vvt
lspci -k
random heraldBOT
#

Please use a code share site to share code or logs, for example:

Please don't use Pastebin, since it can randomly add spaces to the main view. Please also don't share text as images since it makes it harder for people to help you. Remember that others may have colour blindness, impaired vision, etc.

wheat fjord
#

Let me do it

#

first:

#
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.15.149-1-pve root=/dev/mapper/pve-root ro quiet
root@pve:~#
#
/etc/pve/qemu-server/100.conf:hostpci0: 0000:05:00.0,pcie=1
/etc/pve/qemu-server/100.conf:hostpci1: 0000:07:00.1
/etc/pve/qemu-server/101.conf:hostpci0: 0000:0a:00.0,pcie=1
root@pve:~#
#
root@pve:~#
#

Here you go!

lost escarp
#

Huh. You have no IOMMU kernel args set.
If you compare 0000:07:00.1 from your 100.conf to the output from pvesh you see that it belongs to the USB 3 controller.
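A minimal sketch of the check behind "no IOMMU kernel args": scan a kernel command line for the usual IOMMU switches. The function name `has_iommu_args` is made up for illustration, and the list of argument prefixes is an assumption about what counts as "IOMMU settings" here.

```shell
# Return 0 if the given kernel command line contains an IOMMU-related
# argument, 1 otherwise. With no argument, read the running kernel's
# command line from /proc/cmdline.
# NOTE: has_iommu_args is a hypothetical helper, not a Proxmox tool.
has_iommu_args() {
  cmdline=${1-$(cat /proc/cmdline)}
  for arg in $cmdline; do
    case $arg in
      intel_iommu=*|amd_iommu=*|iommu=*) return 0 ;;
    esac
  done
  return 1
}

# Example use on the node itself:
#   has_iommu_args && echo "IOMMU args present" || echo "no IOMMU args"
```

For the command line pasted above (`... ro quiet`) this would report that no IOMMU args are set.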

wheat fjord
#

Yes, I need to pass that PCIe device containing a Coral TPU

#

Passing the USB port/device leads to interruptions and drops

#

I used to have it this way, as well as the other ports passed to HA

lost escarp
#

It's possible the USB 2 ports belong to that. Weird that lsusb is quiet.

wheat fjord
#

it worked earlier

#

odd

lost escarp
#

Did you change anything between? Re-plugged something, added something new, updated the kernel?

wheat fjord
#

I rebooted.

#

After some reboots some are there, some not

#

and it changes

lost escarp
#

I'm confused why your kernel args have no IOMMU settings.

wheat fjord
#

how do I fix that?

lost escarp
#

Depends. What does efibootmgr -v say?

wheat fjord
lost escarp
#

In your case GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub could look something like

GRUB_CMDLINE_LINUX_DEFAULT="debug amd_iommu=on iommu=pt"
#

Also follow the module steps and then run

update-initramfs -u -k all
update-grub

and reboot.
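If it helps, here is a rough sketch of how the new value for `GRUB_CMDLINE_LINUX_DEFAULT` could be put together without duplicating arguments that are already there. `add_kernel_args` is a made-up helper name; it only prints the resulting string and does not touch `/etc/default/grub`, so the file still has to be edited by hand before running `update-grub`.

```shell
# Print the first argument (the current kernel arg string) with each of the
# remaining arguments appended, skipping any that are already present as a
# whole word.
# NOTE: add_kernel_args is a hypothetical helper for illustration only.
add_kernel_args() {
  result=$1
  shift
  for arg in "$@"; do
    case " $result " in
      *" $arg "*) ;;                              # already present, skip
      *) result="${result:+$result }$arg" ;;      # append with a space
    esac
  done
  printf '%s\n' "$result"
}

# Example:
#   add_kernel_args "quiet" amd_iommu=on iommu=pt
```

After the reboot, `cat /proc/cmdline` should show the same arguments if the change took effect.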

#

I'd recommend disabling the auto start for 100 and 101 for now.

wheat fjord
#

let me try this!

#

Oh my... I think it worked for one of the devices (Zigbee dongle)

#

I still don't see the Zwave one

#

Let me see!

lost escarp
#

Not a fan of working around something like this but a PoE/Ethernet one would circumvent your issue.

wheat fjord
#

ODD! I see them at first.

But they disappear eventually

#

Like 1 minute after coming online

lost escarp
#

Are the VMs stopped?

wheat fjord
#

Let me stop them all

lost escarp
#

They might not give it back. Hence the suggestion not to start them at all for now.

wheat fjord
#

ah gotcha

#

Another reboot

lost escarp
#

My guess is currently that passing through the USB device is causing your issue.

wheat fjord
#

But it worked before replacing the 1660ti with a 3090.

#

That is the only change

#

Same PCIE port is used

lost escarp
#

0000:07:00.1 is in group 19 which includes both USB controllers and some Reserved SPP thing I don't know anything about.
Groups can switch around when devices change. I'd give the acs arg a try but let's see what happens without any VMs running first.
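To see which devices share a group, something along these lines could be run on the node. It is only a sketch: `list_iommu_groups` is a made-up name, and it assumes the kernel exposes groups under `/sys/kernel/iommu_groups` (the directory stays empty when the IOMMU is not active).

```shell
# Print "group <n>: <pci-address>" for every device in every IOMMU group.
# The optional argument overrides the sysfs root (handy for dry runs).
# NOTE: list_iommu_groups is a hypothetical helper for illustration only.
list_iommu_groups() {
  root=${1:-/sys/kernel/iommu_groups}
  for dev in "$root"/*/devices/*; do
    [ -e "$dev" ] || continue        # glob matched nothing: no groups exposed
    group=${dev%/devices/*}          # strip "/devices/<address>"
    group=${group##*/}               # keep only the group number
    printf 'group %s: %s\n' "$group" "${dev##*/}"
  done
}

# Example: show everything that shares a group with 0000:07:00.1
#   list_iommu_groups | grep "^group 19:"
```

Every address printed for the same group number gets handed to the VM together, which is why a group containing both USB controllers matters here.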

wheat fjord
#

Ok! This is when nothing is booted up (vms)

#

i still am missing my nortek zwave

#

its light is on ( the usb stick)

#

let me try that arg

lost escarp
#

What does lsusb -vvt say now?

#

Also try dmesg -Tw, then re-plug the zwave device and see what it says.

wheat fjord
#

root@pve:~# lsusb -vvt
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    /sys/bus/usb/devices/  /dev/bus/usb/006/001
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    /sys/bus/usb/devices/usb5  /dev/bus/usb/005/001
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    /sys/bus/usb/devices/  /dev/bus/usb/004/001
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    /sys/bus/usb/devices/usb3  /dev/bus/usb/003/001
    |__ Port 6: Dev 2, If 0, Class=Vendor Specific Class, Driver=, 12M
        ID 0b05:18f3 ASUSTek Computer, Inc.
        /sys/bus/usb/devices/3-6  /dev/bus/usb/003/002
    |__ Port 6: Dev 2, If 2, Class=Human Interface Device, Driver=usbhid, 12M
        ID 0b05:18f3 ASUSTek Computer, Inc.
        /sys/bus/usb/devices/3-6  /dev/bus/usb/003/002
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    /sys/bus/usb/devices/  /dev/bus/usb/002/001
    |__ Port 2: Dev 2, If 0, Class=Vendor Specific Class, Driver=, 5000M
        ID 18d1:9302 Google Inc.
        /sys/bus/usb/devices/2-2  /dev/bus/usb/002/002
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    /sys/bus/usb/devices/usb1  /dev/bus/usb/001/001
    |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=cp210x, 12M
        ID 10c4:ea60 Silicon Labs CP210x UART Bridge
        /sys/bus/usb/devices/1-1  /dev/bus/usb/001/002
root@pve:~#
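A small sketch for checking a listing like the one above for one specific device instead of reading it by eye. `usb_id_present` is a made-up helper; it reads `lsusb` output on stdin and looks for a vendor:product ID. The ID of the missing stick is not known from this thread, so it would have to be read off a machine where the stick still shows up.

```shell
# Succeed (exit 0) if the lsusb output on stdin contains the given
# vendor:product ID, e.g. 10c4:ea60. Matching is case-insensitive.
# NOTE: usb_id_present is a hypothetical helper for illustration only.
usb_id_present() {
  grep -Eqi "ID $1( |\$)"
}

# Example:
#   lsusb -vvt | usb_id_present 10c4:ea60 && echo "found" || echo "missing"
```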


#

nothing in the Dmesg when i unplug/replug the usb stick

#

should i still try the arg?

lost escarp
#

Can you try plugging one of the other devices into that port and see if it says something then?

wheat fjord
#

yes

lost escarp
#

The ACS arg is supposed to split the groups more but if no VM was started nothing should steal the port.

#

I see that the GPU has a USB controller too but I'm not familiar if that can steal something that way.

#

I'd ask the guys in the proxmox discord or the forum about this.

lost escarp
#

:<

wheat fjord
#

I feel like adding that PCIe device to my VM, right before the issue, fucked something up

#

Cause that's when the issue started.

Could it be grabbing the usb ports still ?

lost escarp
#

I don't think so.
Try

qm set 100 --onboot 0
qm set 101 --onboot 0
qm set 100 --delete hostpci0
qm set 100 --delete hostpci1
qm set 101 --delete hostpci0
qm set 101 --delete hostpci1
reboot

This basically disables auto boot for the VMs with pass through and removes their PCI devices and reboots. Just to test for now.
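The delete commands can also be derived from the earlier `grep -sR hostpci /etc/pve/qemu-server` output rather than typed by hand. This is just a sketch: `hostpci_delete_cmds` is a made-up helper, it assumes the `path/VMID.conf:hostpciN: ...` line shape shown in that output, and it only prints the commands so they can be reviewed before anything is run.

```shell
# Read "grep -sR hostpci /etc/pve/qemu-server" output on stdin and print one
# "qm set <vmid> --delete hostpci<N>" command per matching line.
# Nothing is executed; the output is meant to be checked first.
# NOTE: hostpci_delete_cmds is a hypothetical helper for illustration only.
hostpci_delete_cmds() {
  sed -n 's|^.*/\([0-9][0-9]*\)\.conf:\(hostpci[0-9][0-9]*\):.*$|qm set \1 --delete \2|p'
}

# Example:
#   grep -sR hostpci /etc/pve/qemu-server | hostpci_delete_cmds
```

Printing first keeps the step reversible: the original `hostpci` lines from the grep output are what would need to go back into the configs later.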

wheat fjord
#

let me try

#

Still missing the fourth item, the Z-Wave stick.

I even changed its port

#

Would reinstalling Proxmox fix this issue, since I would have the PCIe stuff already in during install?

lost escarp
#

Seems more like some hardware thing at the moment.

wheat fjord
#

Oh?

lost escarp
#

My suggestion would be to live boot ubuntu/mint or something like that.

#

If it happens there too you know it's not PVE related.

#

I can recommend ventoy. Just format your USB stick with it and drop a few isos on there.

wheat fjord
#

I do have 2x 8-port USB 3 PCIe x4 cards

#

would this work in any way? Like by adding another group

#

I could install one

#

(they are in a drawer atm)

lost escarp
#

Maybe. I would try the above first.

wheat fjord
#

So you think maybe the controller for the 4 ports is not working (they are USB 3 btw, not 2, I just checked)
But the one for the two other ports is?

lost escarp
#

I'm not quite sure. All I know is that it apparently doesn't respond to anything being plugged into it.

#

With a live iso you have a complete fresh environment, no PVE, no VMs, no IOMMU kernel args or stuff like that so if it happens there too then we (or you) can concentrate on the hardware. Perhaps some UEFI setting is causing it. I know that sometimes certain PCI-E ports disable other slots but I can't imagine this happens for the USB ports on the Mainboard itself.

wheat fjord
#

I found a workaround.

I use the two working usb 3.0 A ports for my zigbee and zwave stuff.

And I use the usb-c port for my coral TPU

That works as a band aid

#

I will now boot and see what happens on mint

lost escarp
#

The true workaround would be a USB HUB 😄

wheat fjord
#

Well, tonight.

today is sunday and people are complaining about the jellyfin being down

#

Can't keep the mom away from young sheldon ya know

lost escarp
#

That's a good sign. I heard the guys on reddit's /r/selfhosted aspire to have their services be depended upon.

wheat fjord
#

100tb and counting, like 30 users at this point

#

Thank god for Canada's lax laws on piracy

lost escarp
#

Gotta lie down a bit anyways. You can ping me if you have news.

wheat fjord
#

Ay!