Install NVIDIA Graphics Driver and CUDA 9.0:
To follow this tutorial, create and start an Ubuntu 16.04 instance with 4 vCPUs, 22 GB of RAM, one Tesla K80 GPU and a 50 GB SSD persistent disk. You can connect to the instance using either the GCP console or the gcloud command-line tool. Compute Engine generates an SSH key and uses it to connect to your instance via SSH. The generated key is stored in one of the following locations:
- By default, Compute Engine adds the generated key to project or instance metadata.
- If your account is configured to use OS-Login, Compute Engine stores the generated key with your user account.
Last time I showed you how to connect to your instance using the project console. This time I am going to use the gcloud command-line tool to connect to the instance. Before you use the gcloud command-line tool, enable the Compute Engine API under “APIs and Services” and follow these instructions to install the tool on your machine: https://cloud.google.com/sdk/docs/quickstarts Once the gcloud command-line tool is installed and added to your PATH, open a terminal window on your computer and run the following commands:
$ gcloud auth login
This logs you in to your account. You can list all your projects using:
$ gcloud projects list
And you can see your instances using:
$ gcloud compute instances list
Finally, to connect to your instance, use:
$ gcloud compute ssh [INSTANCE_NAME] --zone [INSTANCE_ZONE]
You can find the Instance name and instance zone by listing all your instances using the above command. Once you are connected to your instance you can check the configurations of your system. You can check the type of CPU your VM has by running:
$ cat /proc/cpuinfo
Run the following command to see GPU information:
$ lspci | grep -i nvidia
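The checks above can be combined into one small script. The following is a minimal sketch, not part of the original tutorial: the helper name count_cpus is hypothetical, and the script assumes a Linux system where /proc/cpuinfo and /proc/meminfo exist. Each check is skipped if its data source is missing.

```shell
#!/bin/sh
# Summarize the VM's hardware; every check is guarded so the script
# still runs on machines without /proc or lspci.

# Count logical CPUs listed in a cpuinfo-style file (default: /proc/cpuinfo).
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    echo "Logical CPUs: $(count_cpus)"
fi

if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB; convert it to GiB.
    awk '/^MemTotal:/ {printf "Total RAM: %.1f GiB\n", $2 / 1024 / 1024}' /proc/meminfo
fi

if command -v lspci >/dev/null 2>&1; then
    lspci | grep -i nvidia || echo "No NVIDIA device found by lspci"
else
    echo "lspci not available (on Ubuntu it is provided by the pciutils package)"
fi
```

On the instance described in this tutorial, the output should be consistent with the 4 vCPUs and the Tesla K80 chosen when the VM was created.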
Next, we want to install TensorFlow with GPU support but before that, we need the following NVIDIA software installed on our machine:
NVIDIA Graphics Driver:
To find the correct version of the driver you need to install, head to NVIDIA’s website and enter the specification of your system:
For this tutorial, I am using a Tesla K80 GPU (Tesla K-Series) on Ubuntu 16.04, and the CUDA toolkit I will be using is CUDA 9.0.
So the correct version of the driver I need to install is 384.
The easiest way to install this driver is to add the graphics-drivers PPA to your apt sources and then install the correct version with apt-get. So go ahead and execute the following commands.
$ sudo add-apt-repository ppa:graphics-drivers
$ sudo apt-get update
$ sudo apt-get install nvidia-384
Now, reboot your instance by running:
$ sudo reboot
You will notice that your SSH session ends. Wait a minute for your machine to boot up, then connect to it again using the gcloud compute ssh command shown earlier. Confirm that the correct version of the NVIDIA driver is installed by running:
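No command is shown at this point in the tutorial. One common check, offered here as an assumption rather than the author's original command, is the nvidia-smi utility that ships with the driver. The sketch below queries the driver version and compares its major part with 384; the helper name driver_major is hypothetical.

```shell
#!/bin/sh
# Print the major part of a driver version string, e.g. "384.130" -> "384".
driver_major() {
    echo "${1%%.*}"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask for the driver version only, one line per GPU; keep the first line.
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    echo "Installed driver: $version"
    if [ "$(driver_major "$version")" = "384" ]; then
        echo "A 384-series driver is active"
    else
        echo "Expected a 384-series driver, found $version"
    fi
else
    echo "nvidia-smi not found; the driver may not be installed or loaded"
fi
```

Running plain nvidia-smi with no arguments also works; it prints a table whose header includes the driver version.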
Install CUDA 9.0:
CUDA is a parallel computing platform developed by NVIDIA that lets you run general-purpose computations on your GPU. If your graphics card is from NVIDIA and is listed at http://developer.nvidia.com/cuda-gpus, your GPU is CUDA-capable. (The Tesla K80 is!) First, install the build-essential package:
$ sudo apt-get install build-essential
To verify gcc is installed try:
$ gcc --version
Ensure the correct version of the kernel headers and development packages are installed prior to installing the CUDA Drivers, as well as whenever you change the kernel version.
The version of the kernel your system is running can be found by running the following command:
$ uname -r
The kernel headers and development packages for the currently running kernel can be installed with:
$ sudo apt-get install linux-headers-$(uname -r)
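To confirm that the headers for the running kernel are actually present, a check along the following lines may help. It is a sketch that assumes a Debian-based system with dpkg-query available; headers_pkg is a hypothetical helper name.

```shell
#!/bin/sh
# Build the header package name for a kernel release
# (default: the currently running kernel).
headers_pkg() {
    echo "linux-headers-${1:-$(uname -r)}"
}

pkg=$(headers_pkg)
if command -v dpkg-query >/dev/null 2>&1; then
    # dpkg-query prints a status such as "install ok installed".
    status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null)
    case "$status" in
        *"install ok installed"*) echo "$pkg is installed" ;;
        *) echo "$pkg does not appear to be installed" ;;
    esac
else
    echo "dpkg-query not available; cannot check $pkg"
fi
```

Re-run the check after any kernel upgrade, since the package name changes with the kernel release.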
Make sure an NVIDIA driver suitable for your GPU and CUDA 9.0 is installed (version 384, as described above). We will use CUDA 9.0 because the officially released version of TensorFlow was compiled with CUDA 9.0 and cuDNN SDK 7. Download the runfile for CUDA 9.0 for Ubuntu from this page: https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1704&target_type=runfilelocal
$ wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run
$ sudo chmod +x cuda_9.0.176_384.81_linux-run
$ sudo ./cuda_9.0.176_384.81_linux-run --override --no-opengl-lib
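After the installation is complete, open your .bashrc file and append the two export lines for PATH and LD_LIBRARY_PATH (the same exports that are run for the current session in the next step):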
$ vi ~/.bashrc
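The two lines themselves are not reproduced in the text; presumably they are the same exports that appear in the next step. Under that assumption, the sketch below appends them without opening an editor, and only if they are not already present. The helper names append_once and add_cuda_paths are hypothetical.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already there.
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add the CUDA 9.0 binary and library directories to the given shell rc file.
# Single quotes keep $PATH and $LD_LIBRARY_PATH unexpanded in the file.
add_cuda_paths() {
    append_once 'export PATH=/usr/local/cuda-9.0/bin:$PATH' "$1"
    append_once 'export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH' "$1"
}

# Only modify a file when one is passed explicitly,
# e.g.: sh add_cuda_paths.sh ~/.bashrc
if [ -n "${1:-}" ]; then
    add_cuda_paths "$1"
fi
```

Because each line is added only once, the script can safely be run again after a failed or repeated installation.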
Also, execute the following commands to make them effective in the current session:
$ export PATH=/usr/local/cuda-9.0/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
And finally, verify the CUDA toolkit installation by running:
$ nvcc -V
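If you want the check to be scriptable, the release number can be pulled out of the nvcc output. This is a minimal sketch that assumes nvcc prints a line of the form "Cuda compilation tools, release 9.0, V9.0.176"; cuda_release is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the release number (e.g. "9.0") from `nvcc -V` output on stdin.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    release=$(nvcc -V | cuda_release)
    if [ "$release" = "9.0" ]; then
        echo "CUDA toolkit 9.0 is on PATH"
    else
        echo "Found CUDA release '$release', expected 9.0"
    fi
else
    echo "nvcc not found; check that /usr/local/cuda-9.0/bin is on PATH"
fi
```

A "not found" result usually means the PATH export from the previous step has not taken effect in the current shell.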