Infosec Scribbles

November 9, 2019

eGPU Adventures on Linux

I was pretty impressed with two new pieces of technology recently:

  1. Being able to run Quake Champions on Linux using Proton
  2. External GPUs reportedly taking only about a 20% performance hit on external displays, while being compatible with Linux

Below are the results of my experiments.

Test Hardware

Desktop

  • Core i7-6700
  • 16GB DDR4 2666MHz
  • Samsung 950 EVO SATA SSD
  • Windows 7

XPS 9560

  • Core i7-7700HQ
  • 2 PCIe lanes wired to the TB3 controller
  • 16GB DDR4 2400MHz
  • GTX 1050 4GB
  • Toshiba XG5 NVMe M.2 PCIe SSD
  • Ubuntu 18.04 LTS

XPS 7590

  • Core i7-9750H
  • 4 PCIe lanes wired to the TB3 controller
  • 16GB DDR4 2400MHz
  • GTX 1650 4GB
  • Samsung 970 EVO NVMe M.2 PCIe SSD
  • Windows 10 and Ubuntu 18.04 LTS

GPUs

  • Sonnet Breakaway Box 550 eGPU enclosure
  • Zotac GTX 970 4GB - my old GPU
  • EVGA RTX 2070 Super 8GB - my new GPU

Setup

Setting up eGPU on Windows 10

In the case of the XPS 7590, connecting the eGPU was simply a matter of downloading the latest nVidia driver that supports the RTX 2070 Super. It’s a very plug-and-play experience: both the internal and external screens are supported out of the box, and hotplugging just works.

Setting up eGPU on Linux

Both laptops required some fiddling on Linux.

First, you need to disable Thunderbolt security in the BIOS. Yes, the Linux kernel has the option to authorize devices, and yes, there is now a GUI for that. It doesn’t work for eGPUs, and I don’t know why.
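
If you want to poke at the kernel-side authorization anyway, the bolt tooling is the usual way to do it. A quick sketch (the device UUID is whatever boltctl reports on your machine, not a value from this post):

# list Thunderbolt devices the kernel sees, along with their authorization status
boltctl list

# authorize a device for the current boot (boltctl enroll would persist it)
boltctl authorize <device-uuid>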

Next, you probably want to add nvidia.modeset=1 to /etc/default/grub and run sudo update-grub.
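
Concretely, the change looks something like this, assuming the stock Ubuntu defaults of quiet splash (note that many guides spell the parameter nvidia-drm.modeset=1 instead):

# in /etc/default/grub, append the parameter to the default kernel command line:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvidia.modeset=1"
# then regenerate the grub configuration:
sudo update-grub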

You have to run prime-select nvidia, of course, and then add the BusID to the Xorg configuration for the eGPU only, not for the dGPU. Otherwise everything will always render on the dGPU, even though the system will see the eGPU in nvidia-smi. You should be familiar with this routine already, as XPS laptops tend to boot into a blinking black screen when an nVidia GPU is selected but the BusID is not specified in the configuration.

This is my config file for it:

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    BusID          "PCI:11:0:0"
    Option         "AllowEmptyInitialConfiguration"
    Option         "AllowExternalGpus" "True"
EndSection
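
The BusID above is specific to my setup. A quick way to find yours is below; keep in mind that lspci prints the bus number in hexadecimal while Xorg expects decimal, so 0b:00.0 becomes PCI:11:0:0:

# find the PCI bus IDs of the nVidia GPUs (dGPU and eGPU)
lspci | grep -i nvidia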

Next, if you are running GDM, which Ubuntu ships by default, your external screens won’t be recognized. You have to either add needs_root_rights=yes to /etc/X11/Xwrapper.config and run Xorg as root, or switch to LightDM, which does not suffer from this issue. There is a good chance you will also have to run xrandr --setprovideroutputsource modesetting NVIDIA-0 once to make it switch to the eGPU, as shown below.
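
For completeness, the two changes look roughly like this; the Xwrapper.config edit is only needed if you stay on GDM and accept running Xorg as root:

# allow Xorg to run with root rights (the GDM route)
echo "needs_root_rights=yes" | sudo tee -a /etc/X11/Xwrapper.config

# once the session is up, point the modesetting outputs at the nVidia provider
xrandr --setprovideroutputsource modesetting NVIDIA-0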

Initially, before I got the external screens working, running the xrandr command above and restarting the current user session would result in that session being rendered on the eGPU and fed back to the laptop’s internal screen. However, after switching to the external screens just once, I could never get this to work again. Something odd is going on there, but whatever; I was never going to use the internal screen with the eGPU anyway, given the severe performance penalty of that setup.

Hot-plugging the eGPU in will make the device show up in lspci. Hot-unplugging it will result in a kernel freeze that requires a hard reboot. Ideally, each time you want to switch, you run prime-select, rearrange the Xorg config files for the GPU you want, and then reboot.
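
A minimal sketch of that switching routine, assuming the config above lives somewhere like /etc/X11/xorg.conf.d/10-egpu.conf (the filename is my own convention, not a requirement):

# eGPU mode: select the nVidia stack and drop the eGPU config in place
sudo prime-select nvidia
sudo cp ~/10-egpu.conf /etc/X11/xorg.conf.d/10-egpu.conf
sudo reboot

# undocked mode: remove the eGPU BusID config again (and pick whichever GPU you want via prime-select)
sudo rm /etc/X11/xorg.conf.d/10-egpu.conf
sudo reboot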

You should see something like this when it’s working:

$ nvidia-smi
Mon Nov  4 20:41:35 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.26       Driver Version: 440.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   42C    P8    N/A /  N/A |      0MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 207...  Off  | 00000000:0B:00.0  On |                  N/A |
|  0%   40C    P8    17W / 215W |    444MiB /  7982MiB |      5%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    1      1358      G   /usr/lib/xorg/Xorg                           195MiB |
|    1      4574      G   /usr/bin/gnome-shell                         107MiB |
|    1      5196      G   /opt/google/chrome/chrome                    138MiB |
+-----------------------------------------------------------------------------+

Bottom line: not plug-and-play and not feature-complete, but straightforward enough to get working, at least as far as my Linux experience goes.

Benchmarking

CPU Influence

Of course, my tests will not be perfectly comparable to one another, because the CPUs are slightly different.

I use this page as a guide for how big the adjustment should be. The XPS 9560’s CPU is a little behind the desktop’s, while the XPS 7590’s CPU is quite a bit ahead. Therefore, if the CPU were responsible for any significant gap between the desktop and the older XPS, the newer XPS should come out ahead of the desktop by an even larger margin.

Choice of Benchmarks

Since I am doing cross-platform testing, the standard benchmark to use seems to be Unigine. I decided to run all three of their benchmarks each time, since it costs me nothing and might provide useful data to someone who finds this post later. On Windows, I also ran Fire Strike on the desktop and on the 7590 to provide more data points, and I ran some of the tests on the GTX 970 just out of curiosity.

Unigine Heaven

Platform  OS               GPU        Render  Score
Desktop   Windows 7        GTX 970    DX11     1291
Desktop   Windows 7        GTX 970    OpenGL   1144
Desktop   Windows 7        RTX 2070S  DX11     3234
Desktop   Windows 7        RTX 2070S  OpenGL   3049
XPS 9560  Ubuntu 18.04.03  GTX 1050   OpenGL    628
XPS 9560  Ubuntu 18.04.03  GTX 970    OpenGL   1105
XPS 9560  Ubuntu 18.04.03  RTX 2070S  OpenGL   2282
XPS 7590  Windows 10       GTX 1650   DX11     1033
XPS 7590  Windows 10       GTX 1650   OpenGL    945
XPS 7590  Windows 10       RTX 2070S  DX11     2312
XPS 7590  Windows 10       RTX 2070S  OpenGL   2675
XPS 7590  Ubuntu 18.04.03  RTX 2070S  OpenGL   2276

Unigine Valley

Platform  OS               GPU        Render  Score
Desktop   Windows 7        GTX 970    DX11     2295
Desktop   Windows 7        GTX 970    OpenGL   2003
Desktop   Windows 7        RTX 2070S  DX11     4584
Desktop   Windows 7        RTX 2070S  OpenGL   4390
XPS 9560  Ubuntu 18.04.03  GTX 1050   OpenGL   1167
XPS 9560  Ubuntu 18.04.03  GTX 970    OpenGL   1749
XPS 9560  Ubuntu 18.04.03  RTX 2070S  OpenGL   2905
XPS 7590  Windows 10       GTX 1650   DX11     1683
XPS 7590  Windows 10       GTX 1650   OpenGL   1523
XPS 7590  Windows 10       RTX 2070S  DX11     4194
XPS 7590  Windows 10       RTX 2070S  OpenGL   3943
XPS 7590  Ubuntu 18.04.03  RTX 2070S  OpenGL   2929

Unigine Superposition

Platform  OS               GPU        Render  Score
Desktop   Windows 7        GTX 970    DX11     1955
Desktop   Windows 7        GTX 970    OpenGL   1775
Desktop   Windows 7        RTX 2070S  DX11     6305
Desktop   Windows 7        RTX 2070S  OpenGL   5308
XPS 9560  Ubuntu 18.04.03  GTX 1050   OpenGL    860
XPS 9560  Ubuntu 18.04.03  GTX 970    OpenGL   1774
XPS 9560  Ubuntu 18.04.03  RTX 2070S  OpenGL   5105
XPS 7590  Windows 10       GTX 1650   DX11     1759
XPS 7590  Windows 10       GTX 1650   OpenGL   1275
XPS 7590  Windows 10       RTX 2070S  DX11     5743
XPS 7590  Windows 10       RTX 2070S  OpenGL   4975
XPS 7590  Ubuntu 18.04.03  RTX 2070S  OpenGL   5055

Fire Strike

Platform  OS          GPU        Render  Score
Desktop   Windows 7   GTX 970    DX11     9666
Desktop   Windows 7   RTX 2070S  DX11    18318
XPS 7590  Windows 10  RTX 2070S  DX11    14228

Fire Strike Ultra

Platform  OS          GPU        Render  Score
Desktop   Windows 7   GTX 970    DX11     2506
Desktop   Windows 7   RTX 2070S  DX11     5839
XPS 7590  Windows 10  RTX 2070S  DX11     5211

Benchmark Musings: eGPU vs Desktop

Looking just at Windows for now:

Platform  Benchmark          Score  % Loss
Desktop   Heaven              3234     0 %
XPS 7590  Heaven              2312  28.5 %
Desktop   Valley              4584     0 %
XPS 7590  Valley              4194   8.5 %
Desktop   Superposition       6305     0 %
XPS 7590  Superposition       5743   8.9 %
Desktop   Fire Strike        18318     0 %
XPS 7590  Fire Strike        14228  22.3 %
Desktop   Fire Strike Ultra   5839     0 %
XPS 7590  Fire Strike Ultra   5211  10.8 %
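
For clarity, the loss column is just the straight percentage drop relative to the desktop score. For example, for Heaven:

# (desktop - eGPU) / desktop
echo "scale=3; (3234 - 2312) / 3234" | bc    # prints .285, i.e. 28.5 %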

This is not quite the 20% performance drop I expected.

Quake Champions

Since this is the only game I play regularly, and the only game that matters to me until Cyberpunk 2077 comes out, its performance was what really counted in all of this testing. It was also a way to gauge whether this “rig replacement” would hold up for Cyberpunk next year.

My test setup was a custom 4x4 bot game on Blood Run. I settled on it after playing through a couple of my favorite maps (Awoken, Blood Covenant, Blood Run, Burial Chamber) and noticing that the FPS range on this one was rather wide. To rule out stutter from shader caching, I would click through every champion and play multiple rounds before starting to pay attention to the FPS counter. I found that although PROTON_NO_ESYNC=1 helps with the stutter initially, removing it later actually adds about 10 fps, so the numbers below were taken without it.
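
For reference, Proton environment variables like this one go into the game’s launch options in Steam. While I was still testing with esync disabled, the launch options line looked like this (%command% is Steam’s placeholder for the game’s own command line):

PROTON_NO_ESYNC=1 %command%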

First off, on the desktop the game lets me set every setting to Ultra, even on the GTX 970. With the eGPU, all platforms share a limitation: the textures setting can’t go above High. This means my comparison is already skewed, because the graphics settings on the eGPU are actually lower than on the desktop. Nonetheless, these are the raw numbers:

Platform  OS               GPU        Render  FPS Low  FPS High  FPS Typical  Stutter
Desktop   Windows 7        GTX 970    DX11         79       143           98  No
Desktop   Windows 10       RTX 2070S  DX11        117       251          189  No
XPS 9560  Ubuntu 18.04.03  RTX 2070S  DXVK         38       113           75  Yes
XPS 7590  Windows 10       RTX 2070S  DX11         99       153          128  No
XPS 7590  Ubuntu 18.04.03  RTX 2070S  DXVK         40       119           84  Yes

“Typical” is what I kept seeing when I glanced at the counter; it’s probably close to the median FPS, if there is such a thing. Windows 10 on the desktop is not a typo: I upgraded the gaming rig at the end of this test.

The difference between the XPS 9560 and the XPS 7590 is most likely due to the two extra cores on the latter. I had to switch off Turbo Boost on the 7590 to avoid constantly hitting 100°C on the CPU, and it didn’t affect performance one bit; the 9560 doesn’t have this thermal problem.

Desktop vs eGPU: a 32% performance loss. Windows vs Linux: another 34% loss, and stutter sets in. Linux eGPU vs Windows desktop (my use case): a 56% performance loss, plus stutter. This is now performing worse than a five-year-old midrange desktop with a GTX 970.

Accounting for the CPU differences between my setups and for the High texture cap on the laptops, the actual losses attributable to the eGPU and to Linux are probably closer to 40% each.

/me drops the mic