Things to know when installing NVIDIA drivers

Xin (Cindy) edited this page Mar 8, 2017 · 1 revision

Driver incompatibility issues

Currently, the server Blacksword-yoru has two NVIDIA graphics cards, a GeForce GTX TITAN X and a GeForce GTX 980. Both need the NVIDIA graphics driver before they can be used by software such as CUDA 8, TensorFlow, and Theano.

Package compatibility is the main concern when deciding which NVIDIA driver to install. If the graphics driver is not installed properly, you might encounter problems like:

  • Infinite Ubuntu login loop,
  • Low-graphics mode, e.g., Your system is running in low-graphics mode, ...
  • nvidia-* commands not working, e.g., NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
  • And possibly others.

Things to know/do in order to solve this problem:

  1. Command-line login:

    When stuck at the GUI login page, you can switch to a command-line login by pressing and holding:

    Ctrl + Alt + F1

    and then log in using your username and password to access the command line.

  2. Several things to check:

    CPU info: lscpu

    Ubuntu version, e.g. 16.04.2: lsb_release -a

    Kernel version, e.g. 4.4.0-64-generic: uname -r

    Note: If necessary, reinstall the kernel with: sudo apt-get install linux-image-4.4.0-64-generic --reinstall

    Check Nvidia driver info to determine which version to install: http://www.geforce.com/drivers
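
The checks in this step can be gathered into one small helper. This is only a sketch: the function name report_system is hypothetical, and it simply wraps the commands listed above, skipping any that are not installed on the machine.

```shell
# Hypothetical helper: print the facts needed to pick a driver version.
report_system() {
    # Kernel version, e.g. 4.4.0-64-generic
    echo "Kernel: $(uname -r)"

    # Ubuntu release description, if lsb_release is available
    if command -v lsb_release >/dev/null 2>&1; then
        echo "Release: $(lsb_release -ds)"
    fi

    # A short CPU summary, if lscpu is available
    if command -v lscpu >/dev/null 2>&1; then
        lscpu | grep -E '^(Architecture|Model name)' || true
    fi
    return 0
}
```

Calling report_system from the Ctrl + Alt + F1 console prints the kernel line first, followed by whatever else could be detected.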

  3. Important: check the gcc version before continuing!

    The NVIDIA drivers and CUDA both rely on the gcc compiler, so it is VERY important to pick a compiler version that is compatible with both of them.

    Currently, gcc 4.8 doesn’t support a dependency package needed by the NVIDIA installer, and CUDA 8 only supports gcc 5.3.1 and earlier. So we can install gcc 4.9 and create symbolic links to it at /usr/bin/gcc and /etc/alternatives/gcc.
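
The version window described in this step (newer than the 4.8 series, no newer than 5.3.1) can be checked mechanically. The sketch below assumes GNU sort with the -V option; the function names version_le and gcc_ok are hypothetical, and the bounds are taken from the paragraph above rather than from NVIDIA's documentation.

```shell
# True when version $1 is less than or equal to version $2 (GNU sort -V).
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# True when gcc version $1 is at least 4.9 and at most 5.3.1,
# the window stated on this page for the NVIDIA installer and CUDA 8.
gcc_ok() {
    version_le "4.9" "$1" && version_le "$1" "5.3.1"
}

# Example use (not run here):
#   gcc_ok "$(gcc -dumpversion)" || echo "gcc version is outside the window"
#
# One way to create the links, assuming the gcc-4.9 package installed
# /usr/bin/gcc-4.9 (an assumption, check the path first):
#   sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 50
```

With these bounds, gcc_ok 4.9.3 succeeds, while gcc_ok 4.8.5 and gcc_ok 6.2.0 both fail.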

  4. Next, completely remove the existing NVIDIA drivers.

    1). List the NVIDIA packages you have installed.

    dpkg -l | grep -i nvidia

    All packages other than nvidia-common should be purged.

    If you want to be sure that you purge everything related to NVIDIA, you can run:

    sudo apt-get remove --purge 'nvidia*' && sudo apt-get autoremove

    The trailing asterisk is a wildcard: it purges every package whose name begins with nvidia. The quotes keep the shell from expanding the asterisk against file names in the current directory.
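
If you prefer to purge only the packages other than nvidia-common, as suggested in sub-step 1), the dpkg listing can be filtered first. This is a sketch, assuming the usual dpkg -l column layout (status, then package name); the function name nvidia_packages_to_purge is hypothetical.

```shell
# Read `dpkg -l` output on stdin and print the names of NVIDIA-related
# packages, leaving out nvidia-common (with or without an :arch suffix).
nvidia_packages_to_purge() {
    awk 'tolower($2) ~ /nvidia/ && $2 !~ /^nvidia-common(:|$)/ { print $2 }'
}

# Example use (not run here):
#   dpkg -l | nvidia_packages_to_purge
#   sudo apt-get purge $(dpkg -l | nvidia_packages_to_purge)
```

Review the printed list before passing it to apt-get purge, since the filter matches any package with nvidia in its name.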

    2). BUT this command also removes the nvidia-common package, and with it the ubuntu-desktop package, which depends on nvidia-common.

    So you should also reinstall the ubuntu-desktop package:

    sudo apt-get install ubuntu-desktop

    Also, the NVIDIA driver sometimes blacklists the nouveau driver. Purging should remove the blacklist entry. If you want to be sure that nouveau is loaded at boot, you can force-load it by adding it to /etc/modules:

    echo 'nouveau' | sudo tee -a /etc/modules

    3). Finally, look for the xorg.conf file and, if it exists, remove it as well:

    sudo rm /etc/X11/xorg.conf
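
If you would rather keep a copy of the old X configuration than delete it outright, sub-step 3) can be done as a move instead. This is a variation on the rm command above, not the original procedure; the helper name backup_xorg_conf and the .bak suffix are hypothetical choices.

```shell
# Move an X config file aside (default: /etc/X11/xorg.conf) if it exists.
# Needs root privileges when used on the real file.
backup_xorg_conf() {
    xorg_conf="${1:-/etc/X11/xorg.conf}"
    if [ -f "$xorg_conf" ]; then
        mv "$xorg_conf" "$xorg_conf.bak" && echo "moved $xorg_conf to $xorg_conf.bak"
    else
        echo "no $xorg_conf found, nothing to do"
    fi
}
```

X only reads the file under its original name, so moving it aside has the same effect as deleting it while leaving the old settings available for reference.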

  5. Install the latest NVIDIA graphics drivers in Ubuntu or Linux Mint via PPA

    There is a detailed discussion about which source to use to install NVIDIA drivers HERE. In short, the PPA is the most convenient and reliable way to install the drivers.

    1). Add the PPA

    To add the Proprietary GPU Drivers PPA in Ubuntu or Linux Mint and update the software sources, use the following commands:

      sudo add-apt-repository ppa:graphics-drivers/ppa   
      sudo apt-get update   
      sudo apt-get upgrade  
    

    2). Install (and activate) the latest NVIDIA graphics drivers. First, list the available packages (either command works):

      apt-cache search nvidia
      apt search nvidia
    

    You should see a list of NVIDIA driver versions. Choose the latest version that is compatible with both GPUs, for example 370, and install it.

      sudo apt-get install nvidia-370
    

    3). Then reboot the computer. This should solve the infinite login loop problem.
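
The search in sub-step 2) also returns many packages that are not drivers. The sketch below narrows the output to driver packages and sorts them by version. It assumes the drivers are named nvidia-NNN (as with nvidia-370 above); the function name list_driver_versions is hypothetical.

```shell
# Read `apt-cache search nvidia` output on stdin and print only the
# packages named nvidia-<number>, sorted by that number.
list_driver_versions() {
    awk '$1 ~ /^nvidia-[0-9]+$/ { print $1 }' | sort -t- -k2,2n -u
}

# Example use (not run here):
#   apt-cache search nvidia | list_driver_versions
```

The last line printed is the newest driver in the configured sources. Whether it supports both GPUs still has to be checked against the NVIDIA driver page mentioned in step 2.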

  6. If, by any chance, reinstalling the graphics driver doesn't solve the infinite login loop, you can try the methods discussed HERE. Basically, there are three possible causes:

    • Wrong ownership or permissions on ~/.Xauthority
    • Incorrect permissions on /tmp
    • Misconfigured lightdm
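
The first two causes can be checked from the console before changing anything. This is a sketch, assuming GNU stat (as shipped with Ubuntu) and the conventional mode 1777 for /tmp; the function names tmp_mode_ok and owned_by_me are hypothetical, and the lightdm configuration is not covered.

```shell
# True when the directory (default: /tmp) has mode 1777,
# i.e. world-writable with the sticky bit set.
tmp_mode_ok() {
    [ "$(stat -c '%a' "${1:-/tmp}")" = "1777" ]
}

# True when the given file is owned by the user running the check.
# Run this as the affected user, not through sudo.
owned_by_me() {
    [ "$(stat -c '%u' "$1")" = "$(id -u)" ]
}

# Example use (not run here):
#   tmp_mode_ok /tmp || echo "/tmp does not have mode 1777"
#   owned_by_me "$HOME/.Xauthority" || echo ".Xauthority is owned by another user"
```

If either check fails, the commonly cited remedies are to restore the mode of /tmp with chmod 1777 and to return ownership of .Xauthority to the user with chown. Confirm against the linked discussion before applying them.
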

  7. Notes and references:

    A post summarizing problems you might encounter with improperly installed graphics drivers: HERE.

    Steps to purge NVIDIA drivers: original discussion HERE.