Tensorflow-GPU in multi-user environment
This post explains how to set up tensorflow-gpu in a multi-user environment. It is written as a guide for GPU users at the WAVES research group, Ghent University, Belgium, but the steps apply to any Linux multi-user environment running GPU-based jobs.
- Installing Tensorflow in conda
- Admin only
Installing Tensorflow in conda
Anaconda is a popular Python environment in the AI/ML community. The Anaconda distribution can be downloaded from here. Follow the instructions here to properly install it to your user account.
Once you have installed anaconda into your user account, you can create a conda environment using
$ conda create -n <name-of-your-environment>
Then you can activate that environment using:
$ conda activate <name-of-your-environment>
Once you are in the environment, you can install whatever Python packages you want. Anaconda already comes with numpy, scipy and many other useful Python libraries. If you need a specific library, search for conda install <the-library-you-need>.
When you are done, deactivate the environment:
$ conda deactivate
Anaconda also offers tensorflow and keras installations among many many other libraries. In order to install it to your environment, follow the steps below:
- Activate your conda environment
- Install keras
$ conda install -c conda-forge keras
- Install tensorflow GPU version
$ conda install tensorflow-gpu
This should install the other libraries required by keras and tensorflow. I found that it is better to install keras before installing tensorflow, since keras may otherwise pull in a tensorflow build that is not compatible with the GPU (I am not 100% sure about this).
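Putting the steps above together, a typical session might look like this (the environment name tf-gpu is just an example):
$ conda create -n tf-gpu
$ conda activate tf-gpu
$ conda install -c conda-forge keras
$ conda install tensorflow-gpu
$ conda deactivate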
Instead of conda install, you can also use pip install <the-library-you-need> in the same environment to install libraries, but I recommend using conda install whenever the package is available through conda.
You can use conda list to see all the installed libraries in your environment, and conda env list to list all the conda environments on your system.
Testing Tensorflow Installation
You can test whether the tensorflow installation is using the GPU using the following options.
$ python -c "import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"
$ python -c "import tensorflow as tf; tf.test.is_gpu_available()"
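What both checks boil down to is scanning tensorflow's reported device list for a GPU entry. As a minimal sketch of that logic (using stand-in device names instead of importing tensorflow, so it runs anywhere):

```python
def has_gpu(device_names):
    """Return True if any reported device name refers to a GPU."""
    return any("GPU" in name.upper() for name in device_names)

# Stand-in device lists, in the format tensorflow reports them
print(has_gpu(["/device:CPU:0", "/device:GPU:0"]))  # True
print(has_gpu(["/device:CPU:0"]))                   # False
```

If only CPU devices show up in the real tensorflow output, the installation is not using the GPU.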
If it prints a message like Adding visible gpu devices:, then tensorflow indeed uses the GPU. If it only mentions the CPU, then you will need to correct the installation. Often it helps to install keras first and then install tensorflow, or to use conda install tensorflow-gpu instead of conda install -c conda-forge tensorflow-gpu.
If the nvidia-smi command does not list any GPUs (meaning the system does not see the GPUs anymore), the nvidia drivers need to be reinstalled as described in the next section.
Installing the cuda compiler and nvidia drivers
These steps are adapted from here. Ignore the $ sign at the beginning of the commands.
Install the kernel headers for the current Ubuntu installation.
$ sudo apt-get install linux-headers-$(uname -r)
For other Linux flavors, this step is different (refer here for other Linux distributions).
Download the runfile from the cuda downloads page.
Disable the nouveau driver. The instructions are given in this page. For Ubuntu, create a new file /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
blacklist nouveau
options nouveau modeset=0
Regenerate the kernel initramfs:
$ sudo update-initramfs -u
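To verify that the nouveau driver is no longer loaded after this step, you can check the kernel module list; if the following command prints nothing, nouveau is disabled:
$ lsmod | grep nouveau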
Disable the lightdm service to kill the X server from running.
$ sudo service lightdm stop
Also kill any vncserver sessions, if they exist (e.g., vncserver -kill :1 kills the first vncserver, and so on), and remove the .lock files these sessions leave behind.
Go to the Downloads folder where the downloaded runfile is stored. Make the file executable:
$ chmod +x cuda<version>.linux.run
Install the driver and compiler
$ sudo ./cuda<version>.linux.run --no-opengl-libs
The --no-opengl-libs flag is important to avoid login problems. You will then be asked the following questions; the required responses are provided in bold font.
- Accept license agreement? yes
You can press Ctrl+C to skip to the end of the license.
- Install NVIDIA driver? yes
- Should NVIDIA modify the X config? no
- Install CUDA? yes
- Path where cuda installations should be put: choose default or provide a path of your choice
- Install symbolic link? yes
- Install samples? yes
- Choose samples location: choose default or enter your choice
This should install both the cuda compiler and nvidia drivers to the machine.
Perform the post-installation actions, such as adding the cuda installation to your LD_LIBRARY_PATH. Follow the instructions here. You can also edit the ~/.bashrc file to modify these variables.
Use nvcc -V to check the nvidia compiler version and nvidia-smi to see the GPUs' status on your machine.
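As a sketch, the lines to add to ~/.bashrc might look like the following (assuming CUDA is installed at the default location /usr/local/cuda-10.0; adjust the version number to match your installation):

```shell
# Assumed default install location; change 10.0 to your CUDA version.
export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Open a new shell (or run source ~/.bashrc) for the changes to take effect.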
Finally, restart the lightdm service:
$ sudo service lightdm restart
The machine will have the GUI back after the lightdm service is restarted. You will need to launch new vnc sessions in order to use the remote desktop again.