Installing CUDA 8 + cuDNN 5.1 + tensorflow-gpu + Keras


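0. I am not familiar with CentOS, and the CentOS release at the supercomputing center was too old to install some of the required libraries, so I first switched the machine to Ubuntu 16.04 Server.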

#1 Create users
1.1 Create a user
adduser dluser01
passwd dluser01    # enter the new password when prompted
Repeat for dluser01 through dluser10.
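
The ten accounts dluser01 through dluser10 can be generated in a loop instead of typed one at a time. The sketch below is a dry run that only prints the commands. It assumes GNU seq is available (it is on Ubuntu), and the variable name user_cmds is made up for illustration.

```shell
# Dry run: build the list of commands for dluser01 .. dluser10 without running them.
# seq -w pads the numbers to equal width (01, 02, ..., 10).
user_cmds=$(
  for i in $(seq -w 1 10); do
    echo "adduser dluser$i"
  done
)
printf '%s\n' "$user_cmds"
```

Once the printed list looks right, run each command as root; adduser on Ubuntu asks for the password interactively.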

1.2 Grant sudo privileges
sudo visudo    # opens /etc/sudoers and checks the syntax on save

# Allow root to run any commands anywhere

root ALL=(ALL) ALL
ubuntu ALL=(ALL) ALL
dluser01 ALL=(ALL) ALL
dluser02 ALL=(ALL) ALL
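
Rather than typing one sudoers line per account, the entries can be generated into a separate file and syntax-checked before use. This is a sketch under assumptions: the scratch file from mktemp and the variable name sudoers_tmp are illustrative, and the check is skipped when visudo is not on the PATH.

```shell
# Write one sudoers entry per account into a scratch file (nothing under /etc is touched).
sudoers_tmp=$(mktemp)
for i in $(seq -w 1 10); do
  printf 'dluser%s ALL=(ALL) ALL\n' "$i"
done > "$sudoers_tmp"

# Check the syntax with visudo when it is available; -c checks only, -f names the file.
if command -v visudo >/dev/null 2>&1; then
  visudo -c -f "$sudoers_tmp" || echo "syntax check failed: $sudoers_tmp"
fi
cat "$sudoers_tmp"
```

On Ubuntu, a file that passes the check would normally be installed by root under /etc/sudoers.d/ with mode 0440, which leaves the main /etc/sudoers untouched.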

#2 Change the APT mirror
See http://mirrors.ustc.edu.cn/help/ubuntu.html

#3 Install Python 2.7
Ubuntu 16.04 ships with it, so nothing needs to be installed after the switch.

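#4 Install pip

4.0 If the network is fast enough, on Ubuntu 16.04 simply run: sudo apt-get install python-pip python-dev
Otherwise, install it manually as described in 4.1 and 4.2.

4.1 Install easy_install

wget -q http://peak.telecommunity.com/dist/ez_setup.py
python ez_setup.py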

4.2 Install pip from source
Download a release from https://github.com/pypa/pip/releases

tar zvxf pip-9.0.1.tar.gz    # extract the archive
cd pip-9.0.1/
python setup.py install

4.3 Change the pip index (Aliyun mirror)

cd ~
mkdir .pip
vim ~/.pip/pip.conf

[global]
index-url = http://mirrors.aliyun.com/pypi/simple/
[install]
trusted-host=mirrors.aliyun.com
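
Because every user needs this file, it can be convenient to write it without opening an editor. The following is a minimal sketch: write_pip_conf and demo_dir are names made up for illustration, and the demo writes to a scratch directory so an existing ~/.pip/pip.conf is not overwritten.

```shell
# write_pip_conf DIR: create DIR/pip.conf pointing pip at the Aliyun mirror.
# (write_pip_conf is a helper name made up for this sketch.)
write_pip_conf() {
  mkdir -p "$1" &&
  cat > "$1/pip.conf" <<'EOF'
[global]
index-url = http://mirrors.aliyun.com/pypi/simple/
[install]
trusted-host=mirrors.aliyun.com
EOF
}

# Demonstrate on a scratch directory instead of the real home directory.
demo_dir=$(mktemp -d)
write_pip_conf "$demo_dir/.pip"
cat "$demo_dir/.pip/pip.conf"
```

For a real account, the call would be write_pip_conf "$HOME/.pip", run once as each user.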

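#5 Install the NVIDIA driver

5.1 Find the matching driver
Download the driver, upload it to the server, and switch to root.

sudo init 3    # switch to text mode so no X server is running
sudo sh NVIDIA-Linux-x86_64-375.39.run
sudo reboot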

Once it is installed, run nvidia-smi to check:
[Screenshot: nvidia-smi output]
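
The same check can be scripted, which helps when verifying several machines. This is a sketch, not part of the original procedure: driver_status and gpu_info are illustrative variable names, and the query uses nvidia-smi's --query-gpu and --format options.

```shell
# Report whether the driver is usable; driver_status is a variable name chosen for this sketch.
if ! command -v nvidia-smi >/dev/null 2>&1; then
  driver_status="missing"      # nvidia-smi is not on PATH: driver not installed
elif gpu_info=$(nvidia-smi --query-gpu=name,driver_version --format=csv,noheader 2>/dev/null); then
  driver_status="ok"
  printf '%s\n' "$gpu_info"    # one line per GPU: name, driver version
else
  driver_status="error"        # installed, but it cannot talk to the driver
fi
echo "driver status: $driver_status"
```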

#6 Install CUDA

6.1 Download
On the CUDA download page, choose the runfile installer.
[Screenshot: CUDA download options]

6.2 Install

sudo sh xxxxx.run
After the lengthy EULA scrolls past, answer the prompts as follows:
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.48?
(y)es/(n)o/(q)uit: y

Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: y

Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/ubuntu ]:

Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Installing the CUDA Samples in /home/ubuntu ...
Copying samples to /home/ubuntu/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver: Installed
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/ubuntu, but missing recommended libraries

Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

6.3 Configure environment variables (for the current user)

vim ~/.bashrc

Append at the end:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda-8.0

source ~/.bashrc    # reload the file
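
Every time ~/.bashrc is sourced again, a plain export of this kind appends the same directories once more. A hedged alternative is to add each directory only when it is missing. In the sketch below, path_contains is a helper name made up for illustration; the paths are the ones used above.

```shell
# path_contains LIST DIR: succeed if DIR is one of the colon-separated entries in LIST.
path_contains() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

CUDA_HOME=/usr/local/cuda-8.0
for dir in "$CUDA_HOME/lib64" "$CUDA_HOME/extras/CUPTI/lib64"; do
  # Append only when the directory is not already listed.
  if ! path_contains "${LD_LIBRARY_PATH:-}" "$dir"; then
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir"
  fi
done
export CUDA_HOME LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

The same helper also works for PATH, which the installer summary asks to include /usr/local/cuda-8.0/bin.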

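#7 Install cuDNN

cuDNN 5.1 is the version that matches CUDA 8; get it from the cuDNN download page.
You first need to register and fill in a questionnaire.
Then download "cuDNN v5.1 Runtime Library for Ubuntu14.04 (Deb)".
The Ubuntu 16.04 package is not built for the amd64 platform, so download the 14.04 one.

sudo dpkg -i libcudnn5_5.1.10-1+cuda8.0_amd64.deb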
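
Debian package file names follow the pattern name_version_architecture.deb, so the architecture and the CUDA version a cuDNN package targets can be read from the file name before installing it. A small sketch using only shell parameter expansion; the variable names are illustrative.

```shell
deb=libcudnn5_5.1.10-1+cuda8.0_amd64.deb

base=${deb%.deb}          # strip the extension
arch=${base##*_}          # text after the last "_": the architecture
rest=${base%_*}           # everything before the last "_"
version=${rest#*_}        # text after the first "_": the package version
cuda=${version#*+cuda}    # text after "+cuda": the CUDA version it targets

echo "arch=$arch version=$version cuda=$cuda"
```

If arch is anything other than amd64 on an x86_64 server, the package is for a different platform.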

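#8 Install tensorflow-gpu

For stability, TensorFlow is not set up under root but under each individual user, so every user needs the pip index configured (see above). Once that is done, run:

pip install tensorflow-gpu

Note that environment variables are per user as well, so each newly added user needs them configured again. Open python to test:
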
dluser02@ubuntu:~$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
>>>

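Note: the version installed this way is rather old. It is better to follow the method recommended on the official site and pick the package URL that contains "gpu":

sudo pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp27-none-linux_x86_64.whl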
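
The wheel file name itself says what it is built for: its fields are distribution, version, Python tag, ABI tag and platform tag, separated by "-". The sketch below splits the name from the URL above so the tags can be compared with the local interpreter; the variable names are illustrative, and the final check runs only if a python command exists.

```shell
whl=tensorflow_gpu-1.0.1-cp27-none-linux_x86_64.whl

# Split the name (without .whl) on "-" into its five fields.
IFS=- read -r dist version pytag abitag platform <<EOF
${whl%.whl}
EOF
echo "distribution=$dist version=$version python=$pytag abi=$abitag platform=$platform"

# cp27 means CPython 2.7; print the tag of the local interpreter for comparison.
if command -v python >/dev/null 2>&1; then
  python -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
fi
```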

#9 Install Keras

With everything above in place, simply run pip install keras and wait for it to finish. Test as follows:

dluser02@ubuntu:~$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
>>>
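
Since each user has a separate installation, the import test can be wrapped in a small helper and run once per account. This is a sketch: check_import is a name made up for illustration, and it reports only whether the import succeeds, not whether the GPU libraries were loaded.

```shell
# check_import INTERPRETER MODULE: report whether MODULE can be imported.
check_import() {
  if "$1" -c "import $2" >/dev/null 2>&1; then
    echo "$2: ok"
  else
    echo "$2: NOT importable with $1"
  fi
}

check_import python tensorflow
check_import python keras
```

To confirm that the CUDA libraries are actually opened, still look for the "successfully opened CUDA library" lines in an interactive session as shown above.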

Note: if the installed version is too old, download the source from GitHub and install from it.