How to Install TensorFlow on an Ubuntu 18.04 Server (Nvidia GPU)

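This article explains how to install TensorFlow with Nvidia GPU support on an Ubuntu 18.04 server. The installation requires a server with an Nvidia graphics card; such dedicated servers can be used for many purposes, including gaming. To spare your own hardware, it is advisable not to run heavy, long-running jobs on a local machine. The graphics card must support at least Nvidia compute capability 3.0 to be usable for a TensorFlow GPU build.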

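We assume a 64-bit operating system and a GeForce 740M graphics card. Log in to the server over SSH, then update and upgrade the packages:

sudo apt update -y
sudo apt upgrade -y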

Run this command to install the Python libraries and build dependencies:

sudo apt install openjdk-8-jdk git python-dev python3-dev python-numpy python3-numpy python-six python3-six build-essential python-pip python3-pip python-virtualenv swig python-wheel python3-wheel libcurl3-dev libcupti-dev

Then run:

sudo apt install libcurl4-openssl-dev

The following command shows the installed graphics hardware:

sudo lshw -C display | grep product

We need to install the Nvidia driver. To check over SSH whether a graphics driver is already installed and working, run:

nvidia-smi
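When this check is repeated from a provisioning script, it helps to tell "not installed" apart from "installed but not working". The following is a minimal sketch, assuming only that the driver utilities put nvidia-smi on PATH; the function name driver_status and its three return values are hypothetical and not part of any Nvidia tool.

```python
import shutil
import subprocess


def driver_status():
    """Return 'missing', 'broken', or 'ok' depending on how nvidia-smi behaves."""
    path = shutil.which("nvidia-smi")
    if path is None:
        # The binary is not on PATH, so the driver utilities are not installed.
        return "missing"
    try:
        result = subprocess.run(
            [path],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The binary exists but could not be executed or did not finish in time.
        return "broken"
    # A non-zero exit code usually means the tool cannot talk to the driver.
    return "ok" if result.returncode == 0 else "broken"


if __name__ == "__main__":
    print("nvidia-smi status:", driver_status())
```

A result of "missing" or "broken" suggests continuing with the driver installation described next.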

The drivers are distributed through this Ubuntu PPA, which is worth browsing:

https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa

nvidia-graphics-drivers-396 is the newest release, but it may not have seen much testing. We can add the PPA and install the 390 driver instead:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt upgrade
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall

If autoinstall unexpectedly fails, install the driver package directly:

sudo apt install nvidia-driver-390

Now run the command again:

nvidia-smi

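This time it should print useful output describing the GPU and the driver. Hold the driver at this version so that it is not upgraded: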

sudo apt-mark hold nvidia-driver-390

Install the Linux headers:

sudo apt install linux-headers-$(uname -r)

The later steps need gcc, g++, and some older compiler versions:

sudo apt install gcc g++ gcc-6 g++-6 gcc-4.8 g++-4.8
# if gcc-4.8 package not found run
# sudo add-apt-repository ppa:ubuntu-toolchain-r/test
# sudo apt update
# sudo apt install gcc-4.8 g++-4.8

Now install the CUDA toolkit packages from the Ubuntu repositories:

sudo apt install nvidia-cuda-toolkit libcupti-dev

Reboot:

sudo reboot

Next, download the CUDA Toolkit runfile installer from:

https://developer.nvidia.com/cuda-toolkit

Run it:

cd Downloads/
sudo sh cuda_9.0.176_384.81_linux.run --override --silent --toolkit

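Next, install cuDNN and NCCL. As with the older PyTorch setup procedure, you need to log in with an Nvidia account, which is straightforward. You will then get a download link labeled: cuDNN v7.1.x Library for Linux. Download the archive (the commands below assume the .tgz tarball) and upload it to the server, for example over FTP. The URLs are: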

https://developer.nvidia.com/rdp/cudnn-download

https://developer.nvidia.com/nccl

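Locate the directory where CUDA is installed; the installer copies its files to /usr/local/cuda/. Extract the downloaded archive and copy its contents into the CUDA installation directory with the commands below (watch for version-numbered directories; the file names here only illustrate the format):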

tar -xzvf cudnn-9.0-linux-x64-v7.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
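After copying, it can be useful to confirm which cuDNN version actually landed in /usr/local/cuda, since the configure step later asks for it. The sketch below assumes the cuDNN 7.x header layout, in which cudnn.h defines the macros CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL; the helper name parse_cudnn_version is hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7".
        match = re.search(
            r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


if __name__ == "__main__":
    header = "/usr/local/cuda/include/cudnn.h"  # path used in the copy step above
    try:
        with open(header, errors="replace") as handle:
            print("cuDNN version:", parse_cudnn_version(handle.read()))
    except OSError as error:
        print("Could not read", header, "-", error)
```

If the function returns None, the header is probably from a release that stores its version macros elsewhere, and this sketch does not apply.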

Copying the files by hand like this saves space and avoids apt warnings. Next, open a shell configuration file such as .bashrc:

nano ~/.bashrc

Add these lines:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda

Reload the configuration and verify the variable:

source ~/.bashrc
sudo ldconfig
echo $CUDA_HOME
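The echo above only checks one variable. As a slightly broader check, here is a minimal sketch that verifies both settings added to .bashrc; the expected paths are the ones from this article, and the function name missing_cuda_settings is hypothetical.

```python
import os

# Library directories that the .bashrc lines above add to LD_LIBRARY_PATH.
EXPECTED_LIB_DIRS = (
    "/usr/local/cuda/lib64",
    "/usr/local/cuda/extras/CUPTI/lib64",
)


def missing_cuda_settings(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    if env.get("CUDA_HOME") != "/usr/local/cuda":
        problems.append("CUDA_HOME is not /usr/local/cuda")
    # Split the search path and ignore empty entries and trailing slashes.
    entries = [
        part.rstrip("/")
        for part in env.get("LD_LIBRARY_PATH", "").split(":")
        if part
    ]
    for lib_dir in EXPECTED_LIB_DIRS:
        if lib_dir not in entries:
            problems.append("LD_LIBRARY_PATH lacks " + lib_dir)
    return problems


if __name__ == "__main__":
    found = missing_cuda_settings(os.environ)
    print("environment looks fine" if not found else "\n".join(found))
```

Run it in a new shell after sourcing .bashrc; an empty result means both variables match what the article sets.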

Install Bazel:

sudo apt install curl
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt update -y
sudo apt upgrade -y
sudo apt install bazel
sudo apt upgrade bazel
pip install keras

Clone the TensorFlow source and check out a release branch:

cd ~
git clone https://github.com/tensorflow/tensorflow
cd ~/tensorflow
# check current revision number from browser
git checkout r1.11
cd ~/tensorflow

Create the build configuration by running:

./configure

You will be prompted with questions like the following (shown here with example answers):

Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: N
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: N
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: N
Do you wish to build TensorFlow with Apache Kafka Platform support? [y/N]: N
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.0
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.0] 3.0
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /usr/bin/gcc-4.8
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:N

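Build TensorFlow. For a CUDA-enabled build, the usual command from the TensorFlow source-build instructions is:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Final steps, which create the pip package and install it: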

bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg
cd tensorflow_pkg/
sudo pip3 install tensorflow-<name_of_generated_file>.whl
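The wheel's file name depends on the TensorFlow version and the Python ABI, which is why the command above uses a placeholder. A minimal sketch for locating the generated file, assuming it was written to the tensorflow_pkg directory used above; the helper name newest_wheel is hypothetical.

```python
import glob
import os


def newest_wheel(directory):
    """Return the most recently modified tensorflow wheel in directory, or None."""
    pattern = os.path.join(directory, "tensorflow-*.whl")
    wheels = glob.glob(pattern)
    if not wheels:
        return None
    # If several builds left wheels behind, prefer the latest one.
    return max(wheels, key=os.path.getmtime)


if __name__ == "__main__":
    wheel = newest_wheel("tensorflow_pkg")
    if wheel is None:
        print("No wheel found; check that build_pip_package finished.")
    else:
        print("Install with: sudo pip3 install", wheel)
```

Run it from the TensorFlow source directory so that the relative path tensorflow_pkg resolves.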

Check that the build works by changing to another directory, starting python3, and entering:

import tensorflow as tf
hello = tf.constant("Hello World!")
sess = tf.Session()
print(sess.run(hello))

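This prints Hello World! (under Python 3 it appears as b'Hello World!', because the session returns a byte string). TensorFlow's example models are available at: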

https://github.com/tensorflow/models

You can try one of them by running:

git clone https://github.com/tensorflow/models.git
cd models/tutorials/image/imagenet
python3 classify_image.py

That covers the basic setup and a few tests.
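The Hello World test succeeds even on a CPU-only build, so it may be worth confirming that TensorFlow actually sees the GPU. The sketch below uses device_lib.list_local_devices() from tensorflow.python.client, which is available in TensorFlow 1.x; the wrapper name visible_gpu_names is hypothetical, and it returns None when TensorFlow cannot be imported.

```python
def visible_gpu_names():
    """Return names of GPU devices TensorFlow can see, or None without TensorFlow."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        # TensorFlow is not installed in this Python environment.
        return None
    devices = device_lib.list_local_devices()
    # Each entry describes one device; keep only those of type "GPU".
    return [device.name for device in devices if device.device_type == "GPU"]


if __name__ == "__main__":
    names = visible_gpu_names()
    if names is None:
        print("TensorFlow is not importable here.")
    elif names:
        print("GPU devices:", ", ".join(names))
    else:
        print("TensorFlow works, but no GPU device is visible.")
```

An empty list after a CUDA build usually points back at the driver, the CUDA paths, or the compute capability chosen during configuration.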