1. Introduction
As one of the top brands for video hardware and graphics cards in particular, NVIDIA has support for many platforms. Although previously reserved mostly for gaming, a graphics processing unit (GPU) in personal computers now has alternative applications such as cryptocurrency mining, encryption, and machine learning. These new uses make Linux systems with their lightweight footprint and flexible kernels invaluable, despite the reduced gaming capabilities.
In this tutorial, we talk about the management of an NVIDIA graphics adapter. First, we discuss NVIDIA driver options. After that, we explore how to check the current video hardware that we have. Next, we delve into ways to configure a given driver. Finally, we turn to methods to disable and reenable a specific NVIDIA GPU.
We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. It should work in most POSIX-compliant environments unless otherwise specified.
2. Drivers
As part of the Linux graphics system, GPU drivers have the important role of ensuring optimal communication between the kernel and the graphics hardware.
Still, it’s important to match the driver package with the actual hardware in the system. For example, each card belongs to an NVIDIA architecture generation such as Maxwell, and newer driver versions might not support older cards.
NVIDIA provides native drivers for Linux, and there’s a community alternative as well. The main choices have different configuration options, catering to different system types and scenarios:
- nvidia driver: proprietary, with broader device and general support
- nvidia-open experimental driver: open source, supports fewer devices
- nouveau experimental driver: freedesktop.org open-source implementation, covers most NVIDIA cards but with limited feature support
In theory, both nvidia and nvidia-open can support additional kernel modules:
- nvidiafb: framebuffer support
- nvidia_modeset: Kernel Mode Setting (KMS) support
- nvidia_uvm: Unified Virtual Memory (UVM) support
- nvidia_drm: Direct Rendering Manager (DRM) support
Each of these modules can add an extra feature. How they interoperate with each other and with the different nvidia drivers depends on multiple factors. So, if a feature isn’t enabled automatically, tests are usually the best way to establish compatibility.
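Before testing a feature, it can help to see which of these modules the kernel has actually loaded. The following is a minimal sketch that reads /proc/modules, the same data lsmod shows. The function name nvidia_modules_loaded and its optional file argument are assumptions for illustration:

```shell
#!/usr/bin/env bash
# List loaded kernel modules whose names start with "nvidia" or equal "nouveau".
# The optional argument (a hypothetical convenience) is a file in /proc/modules
# format; it defaults to /proc/modules itself.
nvidia_modules_loaded() {
  # The first field of each line is the module name
  awk '$1 ~ /^nvidia/ || $1 == "nouveau" { print $1 }' "${1:-/proc/modules}" |
    LC_ALL=C sort
}

# Usage: nvidia_modules_loaded
```

An empty result suggests that no NVIDIA-related module is loaded, for instance because the driver isn’t installed or another driver handles the card.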
3. Check Video Hardware
First, it’s usually best to verify what our current GPU is:
$ lspci -k | grep -A 2 -E '(3D|VGA)'
08:00.0 VGA compatible controller: NVIDIA Corporation GR666GL [GeForce GX 666] (rev a0)
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia
Here, we use the lspci command with its -k switch to also show any kernel drivers. Further, to ensure we only look at graphics hardware, we filter through grep, showing 2 lines [-A]fter each one matching the (3D|VGA) [-E]xtended regular expression.
In this case, we have an NVIDIA card with the respective drivers.
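When a script needs only the PCI address of each NVIDIA adapter, we can filter the lspci output further. This is a minimal sketch, assuming the usual lspci line layout where the address is the first field. The function name nvidia_gpu_slots is hypothetical:

```shell
#!/usr/bin/env bash
# Print the PCI address of every NVIDIA display controller found in
# lspci-style output read from standard input.
nvidia_gpu_slots() {
  # Match "VGA compatible controller" or "3D controller" lines from NVIDIA;
  # the first whitespace-separated field is the PCI address
  awk '/(VGA compatible|3D) controller: NVIDIA/ { print $1 }'
}

# Usage: lspci | nvidia_gpu_slots
```

Reading from standard input keeps the filter easy to try on saved lspci output from another machine.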
Let’s verify that via the NVIDIA-native nvidia-smi:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.129 Driver Version: 535.129 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GX 666 ... Off | 0000:08:00.0 Off | N/A |
| 23% 49C P8 33W / 200W | 10666MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
[...]
Thus, we see the same graphics adapter at the same PCI bus slot that lspci showed.
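To compare the two tools in a script, we can ask nvidia-smi for machine-readable fields via --query-gpu and --format. Its bus IDs carry a PCI domain prefix that lspci omits by default, so the sketch below normalizes an address to the short form. The helper name short_pci_addr is an assumption for illustration:

```shell
#!/usr/bin/env bash
# Reduce a PCI address to the bus:device.function form that lspci prints
# by default: lowercase the hex digits and drop a leading "domain:" if present.
short_pci_addr() {
  printf '%s\n' "$1" |
    tr 'A-F' 'a-f' |
    sed -E 's/^[0-9a-f]+:([0-9a-f]+:[0-9a-f]+\.[0-7])$/\1/'
}

# Usage (needs the NVIDIA driver and nvidia-smi installed):
#   nvidia-smi --query-gpu=pci.bus_id --format=csv,noheader |
#     while read -r id; do short_pci_addr "$id"; done
```

An address that’s already in the short form passes through unchanged.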
4. Set Drivers
There are several ways to use a given graphics driver or configure its features:
- kernel boot parameters
- module loading system
- graphics server changes
In some cases, we may need all three.
For example, let’s enable kernel mode setting for the nvidia_drm module via the boot command line:
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.10.0-666-amd64 root=UUID=606660e8-7a07-4fc2-dead-27ed8425e7a0 nvidia-drm.modeset=1 ro quiet
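Since boot parameters only take effect after a restart, /proc/cmdline is the place to confirm what the running kernel actually received. Here’s a minimal sketch of such a check. The function name has_boot_param and its optional second argument are assumptions, not part of any standard tool:

```shell
#!/usr/bin/env bash
# Succeed if the given parameter (e.g. "quiet" or "name=value") appears as a
# whole word in a kernel command line. The optional second argument is a
# hypothetical convenience for testing; it defaults to /proc/cmdline contents.
has_boot_param() {
  case " ${2-$(cat /proc/cmdline)} " in
    *" $1 "*) return 0 ;;   # exact, space-delimited match
    *)        return 1 ;;
  esac
}

# Usage: has_boot_param ro && echo 'root filesystem is mounted read-only at boot'
```

Matching whole words avoids false positives when one parameter name is a prefix of another.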
To change driver options, we can use module parameters as well. For that, we usually create a file under /etc/modprobe.d/:
$ cat /etc/modprobe.d/nvidia.conf
options nvidia <OPTIONS>
Here, we can include any NVIDIA driver module option. Notably, not all hardware supports all options.
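To find out which options a module accepts, we can run modinfo -p with the module name. Once a module is loaded, the kernel also exposes its current parameter values under /sys/module/. The following sketch prints them. The function name module_params and its optional directory argument are assumptions for illustration:

```shell
#!/usr/bin/env bash
# Print "name=value" for each readable parameter of a loaded kernel module.
#   $1: module name, e.g. nvidia
#   $2: sysfs module directory (hypothetical convenience, default /sys/module)
module_params() {
  local dir f
  dir="${2:-/sys/module}/$1/parameters"
  [ -d "$dir" ] || return 1            # module not loaded or has no parameters
  for f in "$dir"/*; do
    # Some parameters are unreadable without root privileges; skip those
    [ -r "$f" ] && printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
  done
  return 0
}

# Usage: module_params nvidia
```

Comparing this output before and after editing the file under /etc/modprobe.d/ shows whether an option actually took effect once the module was reloaded.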
Finally, we can set the nvidia driver in Xorg via an /etc/X11/xorg.conf.d/ configuration file:
$ cat /etc/X11/xorg.conf.d/10-nvidia.conf
Section "Device"
Identifier "NVIDIA Card"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GX 666"
EndSection
Now, let’s understand how to enable and disable an NVIDIA card from the shell.
5. Enable and Disable NVIDIA Hardware
Especially in systems with more than one GPU, we might want to switch between different hardware for the rendering of applications. Further, we may choose to disable a given graphics card completely.
Let’s understand how to do the latter.
5.1. Remove Hardware
As with most other physical components, removing the graphics card from the motherboard is a fairly basic way to prevent it from being used.
The exact way to disconnect or reconnect a card depends on a number of factors. However, in most cases, we’d have to turn off the machine and open it up.
Naturally, this isn’t optimal, as it can lead to different issues:
- downtime
- warranty voiding
- malfunction
- damage
Further, we’d be unable to restore the function of the GPU without physical access to the machine. So, let’s look at software possibilities.
5.2. Configure BIOS
Commonly, the Basic Input/Output System (BIOS) or Unified Extensible Firmware Interface (UEFI) options include ways to control external hardware and peripherals.
Thus, we might be able to enable and disable any GPU from those interfaces.
Again, the exact way to do that depends on the manufacturer. Interface menus usually differentiate between internal (integrated) and external (discrete) graphics cards, different adapters, and PCI slots, as well as priority.
Let’s see some example categories:
- Graphics Configuration
- Graphics Device: Integrated Graphics, Discrete Graphics, NVIDIA Optimus
- Integrated Graphics: Auto, Forced, Disabled
- Internal Graphics: Auto, Disabled, Enabled
- Primary Graphics Adapter: Internal, PCI, PCI-E
- Onboard VGA: Onboard, Offboard
- VGA priority: Auto, Onboard, Offboard
Importantly, some systems also have a switchable graphics setting, which employs the external adapter only when heavy graphics processing is necessary, while the integrated chip is used for most other needs.
Although the BIOS or UEFI method is fairly convenient, needing a reboot to disable, reenable, and prioritize a video card isn’t usually optimal.
5.3. Use Management Tools
After the system is fully started and the kernel takes over, only specialized management tools with privileged access can provide ways to enable or disable the GPU.
In practice, NVIDIA provides the aforementioned nvidia-smi tool as a wrapper for its system management interface (SMI).
So, to fully disable or enable a given NVIDIA graphics adapter via nvidia-smi, we follow three steps:
- check current adapters
- note down slot numbers
- disable or enable slot by toggling modes
Let’s see this process in practice.
First, we check the currently available devices via one of the two means we already discussed:
$ lspci -k | grep -A 2 -E '(3D|VGA)'
08:00.0 VGA compatible controller: NVIDIA Corporation GR666GL [GeForce GX 666] (rev a0)
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia
In this case, the slot of interest is 08:00.0, which is 0000:08:00.0 with the PCI domain included. So, we use nvidia-smi to disable the GPU on that slot:
$ nvidia-smi --id 0000:xx:00.0 --persistence-mode 0
$ nvidia-smi drain --pciid 0000:xx:00.0 --modify 1
$ nvidia-smi --persistence-mode 1
In the first two commands, we replace xx with 08.
Essentially, this sequence performs several actions:
- disables --persistence-mode (-pm) for our specific GPU as identified by --id (-i) (index, UUID, PCI bus ID, or serial number)
- enables drain via --modify 1 (possible only when persistence mode is off) for the same GPU as identified by --pciid (-p), which only accepts the domain:bus:device.function format
- ensures all other graphics controllers are in --persistence-mode
Thus, our drained card doesn’t show up or get activated. In general, persistence mode keeps the NVIDIA driver loaded all the time, not only when requested. On the other hand, drain prevents the GPU from accepting new client applications and is usually employed before turning off the card. Importantly, both of these options are only available on Linux.
At this point, the target device should only be visible via tools like lspci.
To restore visibility and functionality, we just disable drain mode:
$ nvidia-smi drain --pciid 0000:xx:00.0 --modify 0
Critically, root or sudo privileges are usually required. Further, this may crash processes that are using the GPU.
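Because a typo in the PCI address could affect the wrong card, it may be safer to generate the commands first and review them before running anything. The sketch below only prints the nvidia-smi commands from this section and doesn’t execute them. The function name nvidia_gpu_toggle_cmds and its argument order are assumptions for illustration:

```shell
#!/usr/bin/env bash
# Print (without running) the nvidia-smi commands from this section.
#   $1: "disable" or "enable"
#   $2: PCI address in domain:bus:device.function form, e.g. 0000:08:00.0
nvidia_gpu_toggle_cmds() {
  # Reject anything that doesn't look like domain:bus:device.function
  printf '%s\n' "$2" |
    grep -Eq '^[0-9A-Fa-f]{4,8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-7]$' || return 2
  case $1 in
    disable)
      printf 'nvidia-smi --id %s --persistence-mode 0\n' "$2"
      printf 'nvidia-smi drain --pciid %s --modify 1\n' "$2"
      printf 'nvidia-smi --persistence-mode 1\n'
      ;;
    enable)
      printf 'nvidia-smi drain --pciid %s --modify 0\n' "$2"
      ;;
    *) return 2 ;;                     # unknown action
  esac
}

# Usage: nvidia_gpu_toggle_cmds disable 0000:08:00.0
```

After checking the printed lines, we can run them one by one with root privileges.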
6. Summary
In this article, we explored the management of NVIDIA graphics controllers in a Linux system. Specifically, we checked out different drivers and ways to disable and reenable a particular GPU.
In conclusion, although we can enumerate devices in a system via different means, the best way to configure them often involves the included toolset.