Reference Deployment Guide for TensorFlow with an NVIDIA GPU Card over Mellanox 100 GbE Network

Version 11

    In this document, we demonstrate a distributed deployment of TensorFlow over a Mellanox end-to-end 100 Gb/s Ethernet solution.

    This document describes how to build the most recent stable TensorFlow 1.0 branch from source with Bazel 0.4.4 on Ubuntu 16.04.1 LTS, across five physical servers (four Worker nodes and one Parameter Server).

    We will show how to install the NVIDIA drivers, Bazel, and the Mellanox software and hardware components.

     


     

    Overview

    What is TensorFlow?

    TensorFlow is an open source software library developed by the Google Brain team for conducting machine learning and deep neural network research. The library performs numerical computation using data flow graphs, where the nodes in the graph represent mathematical operations and the graph edges represent the multidimensional data arrays (tensors) communicated between them. It offers an application programming interface (API) in Python, as well as a lower-level set of functions implemented in C++, and provides features that enable faster prototyping and implementation of machine learning models and applications on highly heterogeneous computing platforms. TensorFlow supports CUDA 8.0 and cuDNN 5.1 (downloading cuDNN requires registration); in this guide we build and install TensorFlow from source. To use TensorFlow with GPU support, you must have an NVIDIA GPU with a minimum compute capability of 3.0.
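    The data-flow idea can be illustrated with a toy sketch in plain Python. This is only an analogy to how TensorFlow separates graph construction from evaluation, not the TensorFlow API:

```python
# Toy illustration (plain Python, NOT the TensorFlow API): nodes are operations,
# edges carry values, and nothing is computed until the graph is evaluated.

def constant(value):
    # a node with no inputs that always produces the same value
    return lambda: value

def add(x, y):
    # a node whose output is the sum of its two input nodes
    return lambda: x() + y()

# Build the graph first...
a = constant(10)
b = constant(32)
c = add(a, b)

# ...then evaluate it, mirroring TensorFlow's deferred-execution model.
print(c())  # 42
```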

    Mellanox’s Machine Learning

    Mellanox Solutions accelerate many of the world’s leading artificial intelligence and machine learning platforms. Data analytics has become an essential function within many high-performance clusters, enterprise data centers, clouds and Hyperscale platforms. Machine learning is a pillar of today’s technological world, offering solutions that enable better and more accurate decision making based on the great amounts of data being collected. Machine learning encompasses a wide range of applications, ranging from security, finance, and image and voice recognition, to self-driving cars and smart cities. Mellanox solutions enable companies and organizations such as Baidu, NVIDIA, JD.com, Facebook, PayPal and more to leverage machine learning platforms to enhance their competitive advantage.  Training a Deep Neural Network (DNN) requires complex computations. Mellanox solutions implement In-Network Computing capabilities, including data aggregation and reduction functions, which perform integer and floating point operations on data flows at wire speed, enabling efficient data-parallel reduction operations. In-Network Computing dramatically improves DNN training performance.

     

    Setup Overview

    Before you start, make sure you are familiar with the distributed TensorFlow architecture; see the Glossary in the Distributed TensorFlow documentation for more information.
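    As a sketch of how the five servers map onto a distributed TensorFlow job, the cluster definition below uses the internal 11.11.11.x addresses assigned later in this guide; the port number 2222 is a hypothetical choice:

```python
# Sketch: the 1 parameter server + 4 workers used in this guide, expressed as a
# cluster definition. The 11.11.11.x addresses are the internal 100GbE network
# from the configuration table; port 2222 is a hypothetical choice.
cluster = {
    "ps": ["11.11.11.100:2222"],
    "worker": [
        "11.11.11.1:2222",
        "11.11.11.2:2222",
        "11.11.11.3:2222",
        "11.11.11.4:2222",
    ],
}

# In a TensorFlow script this dict would be passed to tf.train.ClusterSpec(cluster),
# and each process would start a tf.train.Server with its job name and task index.
print(len(cluster["worker"]))  # 4
```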

     

    In the distributed TensorFlow configuration described in this guide, we are using the following hardware specification.

     

    Equipment

     

    Parameter server:

    • Intel Xeon E5-2650 v4, 12 cores @ 2.2 GHz, 30 MB cache, 9.6 GT/s QPI
    • 256GB RAM: 16 x 16 GB DDR4 dual rank
    • One Mellanox ConnectX-4 VPI 100GbE adapter
    • Ubuntu 16.04 x86_64 used as OS on all servers

     

    Four Worker servers, each containing:

    • Intel Xeon E5-2650 v4, 12 cores @ 2.2 GHz, 30 MB cache, 9.6 GT/s QPI
    • 256GB RAM: 16 x 16 GB DDR4 dual rank
    • One or more CUDA-Capable GPU Cards with Compute Capability 3.0 or higher (http://developer.nvidia.com/cuda-gpus)
    • One Mellanox ConnectX-4 VPI 100GbE adapter
    • Ubuntu 16.04 x86_64 used as OS on all servers

     

    Networking

    • SN2700 32 ports 100GbE QSFP28
    • Mellanox LinkX MCP1600-Cxxx series cables to connect the servers to the 100GbE network


    Note: This document does not cover the servers' storage aspect. You should configure the servers with storage components appropriate to your use case (data set size).

     

    (Figure: setup topology diagram)

     

     

    Network Configuration

    Each server is connected to the SN2700 switch by a 100GbE copper cable.

    The switch port connectivity in our case is as follows:

    • Ports 1-4 – connected to the Worker servers
    • Port 5 – connected to the Parameter server

     

    The server names and network configuration are provided below.

     

    Server type         Server name     Internal network (ens13f0)   External network (eno1)
    Parameter Server    clx-mld-ps01    11.11.11.100                 From DHCP (reserved)
    Worker Server 01    clx-mld-01      11.11.11.1                   From DHCP (reserved)
    Worker Server 02    clx-mld-02      11.11.11.2                   From DHCP (reserved)
    Worker Server 03    clx-mld-03      11.11.11.3                   From DHCP (reserved)
    Worker Server 04    clx-mld-04      11.11.11.4                   From DHCP (reserved)

     

     

    Deployment Guide

     

    Prerequisites

    Disable a Nouveau kernel Driver

     

    Note: Skip this procedure if you are installing the driver without GPU support (for the Parameter Server).

     

    Prior to installing CUDA on Ubuntu 16.04 by executing cuda_8.0.44_linux.run, the Nouveau kernel driver must be disabled. To disable it, follow the procedure below.

    1. Check that the Nouveau kernel driver is loaded.

     

    $ lsmod |grep nouv

     

     

    (Screenshot: lsmod output showing the nouveau module loaded)

    2. Remove all NVIDIA packages.

     

    Note: Skip this step if your system is freshly installed.

    $ sudo apt-get remove nvidia* && sudo apt autoremove

     

    3. Install the packages required to build kernel modules.

     

    $ sudo apt-get install dkms build-essential linux-headers-generic

     

     

    4. Blacklist the Nouveau kernel driver by editing the blacklist file.

    $ sudo vim /etc/modprobe.d/blacklist.conf

     

    5. Insert the following lines into the blacklist.conf file.

    blacklist nouveau
    blacklist lbm-nouveau
    options nouveau modeset=0
    alias nouveau off
    alias lbm-nouveau off

     

    6. Disable the Nouveau kernel module and update the initramfs image. (Although the nouveau-kms.conf file may not exist, this does not affect this step.)

    $ echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf

    $ sudo update-initramfs -u

     

    7. Reboot the server.

     

    $ sudo reboot

     

     

    8. Check that the Nouveau kernel driver is not loaded; the command should print nothing.

    $ lsmod |grep nouv

     

    Install General Dependencies

    1. To install the general dependencies, run the command below.

    $ sudo apt-get install openjdk-8-jdk git build-essential python-virtualenv swig python-wheel libcupti-dev

     

    2. To install TensorFlow, you must install the following packages:

    • python-numpy: a numerical processing package that TensorFlow requires
    • python-dev: enables adding extensions to Python
    • python-pip: enables installing and managing certain Python packages
    • python-wheel: enables management of Python compressed packages in the wheel (.whl) format

     

    3. To install these packages for Python 2.7:

     

    $ sudo apt-get install python-numpy python-dev python-pip python-wheel
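    After installing, you can sketch a quick availability check for these modules (the helper name missing_modules is ours, for illustration only):

```python
import importlib.util

# Sketch: report which of the required Python modules are missing.
def missing_modules(names):
    return [n for n in names if importlib.util.find_spec(n) is None]

# The modules provided by the apt-get command above:
print(missing_modules(["numpy", "pip", "wheel"]))  # [] when everything is installed
```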

     

    Update Ubuntu Software Packages

    1. To update/upgrade Ubuntu software packages, run the commands below.

    $ sudo apt-get update            # Fetches the list of available updates

    $ sudo apt-get upgrade -y        # Strictly upgrades the current packages

     

     

    Install the NVIDIA Drivers

    Note: Skip this procedure if you are installing the driver without GPU support (for the Parameter Server).

     

    The 367 (or later) NVIDIA drivers must be installed. You can use Ubuntu's built-in Additional Drivers tool after updating the driver packages, or install the driver manually as described below.

    1. Go to the NVIDIA’s website (http://www.nvidia.fr/Download/index.aspx).

     

    2. Download the latest version of the driver. The example below uses a Linux 64-bit driver (NVIDIA-Linux-x86_64-375.20.run).

     

    3. Exit the GUI (the drivers for graphics devices run at a low level).

    $ sudo service lightdm stop

     

    4. Set the RunLevel to 3 with the program init.

    $ sudo init 3

     

    5. Run the downloaded file.

    $ sudo sh NVIDIA-Linux-x86_64-375.20.run

    $ sudo apt-get update

    During the run, you will be asked to confirm several prompts, such as continuing despite a pre-install script failure, proceeding without 32-bit compatibility libraries, and more.

     

    6. Once the driver is installed, restart your computer.

    $ sudo reboot

     

    Verify the Installation

    1. Make sure the NVIDIA driver works correctly with the installed GPU card.

    $ lsmod |grep nvidia

    (Screenshot: lsmod output listing the nvidia modules)

       

     

    2. Run the nvidia-debugdump utility to collect internal GPU information.

    $ nvidia-debugdump -l

    (Screenshot: nvidia-debugdump -l output)

     

    3.    Run the nvidia-smi utility to check the NVIDIA System Management Interface.

    $ nvidia-smi

     

    (Screenshot: nvidia-smi output)

     

    Install Mellanox OFED

    See HowTo Install MLNX_OFED for ConnectX-4/ConnectX-5 via ISO Image

     

     

    Install NVIDIA Toolkit 8.0 (CUDA) & cuDNN

     

    Note: Skip this procedure if you are installing the driver without GPU support (for the Parameter Server).

    Pre-installation Actions

     

    The following actions must be taken before installing the CUDA Toolkit and Driver on Linux:

    • Verify the system has a CUDA-capable GPU
    • Verify the system is running a supported version of Linux
    • Verify the system has gcc installed
    • Verify the system has the correct kernel headers and development packages installed
    • Download the NVIDIA CUDA Toolkit
    • Handle conflicting installation methods

     

    Note: You can override the install-time prerequisite checks by running the installer with the "--override" flag. Remember that the prerequisites will still be required to use the NVIDIA CUDA Toolkit.

     

     

    1. Verify You Have a CUDA-Capable GPU. To verify that your GPU is CUDA-capable, go to your distribution's equivalent of System Properties, or, from the command line, enter:

     

    $ lspci | grep -i nvidia

     

    If you do not see any output, update the PCI hardware database that Linux maintains by entering "update-pciids" (generally found in /sbin) at the command line, and rerun the previous lspci command.

    If your graphics card is from NVIDIA, and it is listed in http://developer.nvidia.com/cuda-gpus, your GPU is CUDA-capable.

    The Release Notes for the CUDA Toolkit also contain a list of supported products.

     

    2. Verify You Have a Supported Linux Version

    The CUDA Development Tools are only supported on some specific distributions of Linux. These are listed in the CUDA Toolkit release notes.

    To determine which distribution and release number you are running, type the following at the command line:

    $ uname -m && cat /etc/*release

     

    You should see output similar to the following, modified for your particular system:

    • x86_64
    • Ubuntu 16.04.2 LTS

    The x86_64 line indicates you are running on a 64-bit system. The remainder gives information about your distribution.

     

    3. Verify the System Has a gcc Compiler Installed

    The gcc compiler is required for development using the CUDA Toolkit. It is not required for running CUDA applications. It is generally installed as part of the Linux installation, and in most cases the version of gcc installed with a supported version of Linux will work correctly.

    To verify the version of gcc installed on your system, type the following on the command line:

    $ gcc --version

     

    You should see output similar to the following, modified for your particular system:

    gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

    If an error message is displayed, you need to install the “development tools” from your Linux distribution or obtain a version of gcc and its accompanying toolchain from the Web.

     

    4. Verify the System Has the Correct Kernel Headers and Development Packages Installed. The CUDA Driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 3.17.4-301, the 3.17.4-301 kernel headers and development packages must also be installed.

    While the Runfile installation performs no package validation, the RPM and DEB installations of the driver will make an attempt to install the kernel header and development packages if no version of these packages is currently installed. However, it will install the latest version of these packages, which may or may not match the version of the kernel your system is using. Therefore, it is best to manually ensure the correct version of the kernel headers and development packages are installed prior to installing the CUDA Drivers, as well as whenever you change the kernel version.

    The version of the kernel your system is running can be found by running the following command:

    $ uname -r

     

    This is the version of the kernel headers and development packages that must be installed prior to installing the CUDA Drivers. This command will be used multiple times below to specify the version of the packages to install. Note that below are the common-case scenarios for kernel usage. More advanced cases, such as custom kernel branches, should ensure that their kernel headers and sources match the kernel build they are running.

    The kernel headers and development packages for the currently running kernel can be installed with:

     

    $ sudo apt-get install linux-headers-$(uname -r)

     

    Installation Process

    1. Download the base installation .run file from NVIDIA CUDA website.

     

    2. Create an account if you do not already have one, and log in (an account is also required to download cuDNN).

     

    3. Choose Linux > x86_64 > Ubuntu > 16.04 > runfile (local) and download the base installer and the patch.

    Note: MAKE SURE YOU SAY NO TO INSTALLING NVIDIA DRIVERS!

    Note: Make sure you select yes to creating a symbolic link to your CUDA directory.

    $ cd ~    # or the directory where you downloaded the file

    $ sudo sh cuda_8.0.44_linux.run --override    # hold 's' to skip reading the EULA

     

    4. Install CUDA into /usr/local/cuda, answering the installer prompts as follows:

    Do you accept the previously read EULA?

    accept/decline/quit: accept

    Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.48?

    (y)es/(n)o/(q)uit: N

    Install the CUDA 8.0 Toolkit?

    (y)es/(n)o/(q)uit: Y

    Enter Toolkit Location[ default is /usr/local/cuda-8.0 ]:

    Do you want to install a symbolic link at /usr/local/cuda?

    (y)es/(n)o/(q)uit: Y

    Install the CUDA 8.0 Samples?

    (y)es/(n)o/(q)uit: Y

    Enter CUDA Samples Location

    [ default is /root ]:

    Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...

     

    To install cuDNN, download cuDNN v5.1 for CUDA 8.0 from the NVIDIA website and extract it into /usr/local/cuda:

    $ tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz

    $ sudo cp cuda/include/cudnn.h /usr/local/cuda/include

    $ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64

    $ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

             

    Post-installation Action

     

    The post-installation actions must be manually performed. These actions are split into mandatory, recommended, and optional sections.

     

    Mandatory Actions

    1. Some actions must be taken after the installation before the CUDA Toolkit and Driver can be used.

    Environment Setup (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#environment-setup)

    • The “PATH” variable needs to include /usr/local/cuda-8.0/bin
    • In addition, when using the .run file installation method, the “LD_LIBRARY_PATH” variable needs to contain /usr/local/cuda-8.0/lib64.
    • Update your bash file.
    $ vim ~/.bashrc

     

    This opens your .bashrc file in a text editor; scroll to the bottom and add these lines:

    export CUDA_HOME=/usr/local/cuda-8.0

    export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}

    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

     

    Once you save and close the file, return to your original terminal and reload your .bashrc with this command.

    $ source ~/.bashrc

     

    2. Check that the paths have been properly modified.

    $ echo $CUDA_HOME

    $ echo $PATH

    $ echo $LD_LIBRARY_PATH
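    As a quick sanity sketch (not part of NVIDIA's official instructions), the check in step 2 can be scripted. The helper below inspects an environment mapping such as os.environ; the function name cuda_paths_ok is ours:

```python
import os

# Sketch: check that the CUDA directories are present on PATH and
# LD_LIBRARY_PATH. Pass os.environ to check the live shell environment.
def cuda_paths_ok(env, cuda_home="/usr/local/cuda-8.0"):
    on_path = cuda_home + "/bin" in env.get("PATH", "").split(":")
    on_ld_path = cuda_home + "/lib64" in env.get("LD_LIBRARY_PATH", "").split(":")
    return on_path and on_ld_path

# Illustration with the variables set as in the exports above:
sample = {
    "PATH": "/usr/local/cuda-8.0/bin:/usr/bin:/bin",
    "LD_LIBRARY_PATH": "/usr/local/cuda-8.0/lib64",
}
print(cuda_paths_ok(sample))  # True
```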

     

    3. Set the “LD_LIBRARY_PATH” and “CUDA_HOME” environment variables. Consider adding the commands below to your ~/.bash_profile. These assume your CUDA installation is in /usr/local/cuda-8.0.

    $ vim ~/.bash_profile

    This will open your file in a text editor which you will scroll to the bottom and add these lines:

    export CUDA_HOME=/usr/local/cuda-8.0

    export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}

    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

     

    Recommended Actions

    Other actions are recommended to verify the integrity of the installation.

    1. Install Writable Samples (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#install-samples)

    In order to modify, compile, and run the samples, the samples must be installed with “write” permissions. A convenience installation script is provided:

    $ cuda-install-samples-8.0.sh <dir>

    This script is installed with the cuda-samples-8-0 package. The cuda-samples-8-0 package installs only a read-only copy in /usr/local/cuda-8.0/samples.

     

    2. Verify the Installation (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#verify-installation)

    Before continuing, it is important to verify that the CUDA toolkit can find and communicate correctly with the CUDA-capable hardware. To do this, you need to compile and run some of the included sample programs.

    Note: Ensure the PATH variable and, if using the .run file installation method, the LD_LIBRARY_PATH variable are set correctly. See the Mandatory Actions section.

     

    3. Verify the Driver Version (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#verify-driver)

    If you installed the driver, verify that the correct version of it is loaded.

    If you did not install the driver, or are using an operating system where the driver is not loaded via a kernel module, such as L4T, skip this step.

    When the driver is loaded, the driver version can be found by executing the following command.

    $ cat /proc/driver/nvidia/version
    You should see output similar to the following:
    NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.20  Tue Nov 15 16:49:10 PST 2016
    GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
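    If you want to extract the version number programmatically, a small sketch (using the sample output above) might look like:

```python
import re

# Sketch: extract the driver version number from /proc/driver/nvidia/version
# output, using the sample line shown above.
sample = ("NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.20  "
          "Tue Nov 15 16:49:10 PST 2016")
match = re.search(r"Kernel Module\s+(\d+\.\d+)", sample)
print(match.group(1))  # 375.20
```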

     

    4. Compiling the Examples (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#compiling-examples)

    The version of the CUDA Toolkit can be checked by running "nvcc -V" in a terminal window. The "nvcc" command runs the compiler driver that compiles CUDA programs. It calls the "gcc" compiler for C code and the NVIDIA PTX compiler for CUDA code.

    $ nvcc -V

    You should see output similar to the following:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Sun_Sep__4_22:14:01_CDT_2016
    Cuda compilation tools, release 8.0, V8.0.44

     

    5. The NVIDIA CUDA Toolkit includes sample programs in source form. To compile them, change to ~/NVIDIA_CUDA-8.0_Samples and type make. The resulting binaries will be placed under ~/NVIDIA_CUDA-8.0_Samples/bin.

    $ cd ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/

    $ make

    $ cd ~/NVIDIA_CUDA-8.0_Samples

    $ ./bin/x86_64/linux/release/deviceQuery

     

    Running the Binaries (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#running-binaries)

    After the compilation, find and run deviceQuery under ~/NVIDIA_CUDA-8.0_Samples. If the CUDA software is installed and configured correctly, the output should look similar to the below.

    (Screenshot: deviceQuery output)

    The exact appearance and the output lines might be different on your system. The important outcomes are that a device was found (the first highlighted line), that the device matches the one on your system (the second highlighted line), and that the test passed (the final highlighted line).

    If a CUDA-capable device and the CUDA Driver are installed but the deviceQuery reports that no CUDA-capable devices are present, this likely means that the /dev/nvidia* files are missing or have the wrong permissions.
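    A script can detect success by looking for the final status line of the report; this sketch assumes deviceQuery's usual "Result = PASS" marker, and the helper name devicequery_passed is ours:

```python
# Sketch: deviceQuery ends its report with a status line; checking for the
# "Result = PASS" marker is a simple way to script the verification.
def devicequery_passed(output):
    return "Result = PASS" in output

print(devicequery_passed("deviceQuery ... Result = PASS"))  # True
print(devicequery_passed("deviceQuery ... Result = FAIL"))  # False
```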

    Running the bandwidthTest program ensures that the system and the CUDA-capable device are able to communicate correctly. Its output is shown below.

    $ cd ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/bandwidthTest/

    $ make

    $ cd ~/NVIDIA_CUDA-8.0_Samples

    $ ./bin/x86_64/linux/release/bandwidthTest

    (Screenshot: bandwidthTest output)

    Note: The measurements for your CUDA-capable device description will vary from system to system. The important point is that you obtain measurements, and that the second-to-last line confirms that all necessary tests passed.

    Should the tests not pass, make sure you have a CUDA-capable NVIDIA GPU on your system and make sure it is properly installed.

    If you run into difficulties with the link step (such as libraries not being found), consult the Linux Release Notes found in the doc folder in the CUDA Samples directory.

     

    Installing TensorFlow

     

    Clone TensorFlow Repository

    To clone the latest TensorFlow repository, issue the following command:

    $ cd ~

    $ git clone https://github.com/tensorflow/tensorflow

     

     

    The preceding git clone command creates a subdirectory called “tensorflow”. After cloning, you may optionally build a specific branch (such as a release branch) by invoking the following commands:

    $ cd tensorflow

    $ git checkout r1.0 # where r1.0 is the desired branch

          

    Install the Bazel Tool

    Bazel is a build tool from Google. For further information, please see http://bazel.io/docs/install.html

    1. Download and install JDK 8, which Bazel requires.

    $ sudo add-apt-repository ppa:webupd8team/java

    $ sudo apt-get update

    $ sudo apt-get install oracle-java8-installer

     

    2.    Install the Bazel tool.

    $ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list

    $ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

    $ sudo apt-get update

    $ sudo apt-get install bazel

    $ sudo apt-get upgrade bazel

     

    3. Fix the CROSSTOOL file.

    • Edit the text file tensorflow/third_party/gpus/crosstool/CROSSTOOL.tpl.

     

    $ vim ~/tensorflow/third_party/gpus/crosstool/CROSSTOOL.tpl

     

     

    • Add the line cxx_builtin_include_directory: "/usr/local/cuda-8.0/include" as shown below.

    cxx_builtin_include_directory: "/usr/lib/gcc/"

    cxx_builtin_include_directory: "/usr/local/include"

    cxx_builtin_include_directory: "/usr/include"

    cxx_builtin_include_directory: "/usr/local/cuda-8.0/include"

    tool_path { name: "gcov" path: "/usr/bin/gcov" }

     

    If you do not do this, you will get an error that looks like this:

    ERROR: /home/marcnu/Documents/tensorflow/tensorflow/contrib/rnn/BUILD:46:1: undeclared inclusion(s) in rule '//tensorflow/contrib/rnn:python/ops/_lstm_ops_gpu':
    this rule is missing dependency declarations for the following files included by 'tensorflow/contrib/rnn/kernels/lstm_ops_gpu.cu.cc':
    '/usr/local/cuda-8.0/include/cuda_runtime.h'
    '/usr/local/cuda-8.0/include/host_config.h'
    ...
    '/usr/local/cuda-8.0/include/curand_discrete2.h'
    nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
    nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
    Target //tensorflow/tools/pip_package:build_pip_package failed to build
    Use --verbose_failures to see the command lines of failed build steps.
    INFO: Elapsed time: 203.657s, Critical Path: 162.10s

     

    Install TensorFlow using the configure Script

    The root of the source tree contains a bash script named “configure”. This script asks you to identify the pathname of all relevant TensorFlow dependencies and specify other build configuration options such as compiler flags.

    Note: Run this script prior to creating the pip package and installing TensorFlow

     

    To build TensorFlow with GPU, the “configure” script needs to know the version numbers of CUDA and cuDNN. If several versions of CUDA or cuDNN are installed on your system, explicitly select the desired version instead of relying on the system default.

    $ cd tensorflow    # cd to the top-level directory created

    $ ./configure

     

    If you receive the following error message:

     

    locale.Error: unsupported locale setting

     

     

    1. Run the following commands:

    export LANGUAGE=en_US.UTF-8

    export LANG=en_US.UTF-8

    export LC_ALL=en_US.UTF-8

     

    For further information see: http://askubuntu.com/questions/205378/unsupported-locale-setting-fault-by-command-not-found

     

    2. Or, edit the locale file /etc/default/locale to contain:

    LANGUAGE=en_US.UTF-8

    LANG=en_US.UTF-8

    LC_ALL=en_US.UTF-8

     

     

    3. Restart the server.

     

    Here is an example execution of the "configure" script. In this example, we configure the installation with GPU and CUDA library support.

    Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7

     

    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

     

    Do you wish to use jemalloc as the malloc implementation? [Y/n] Y

    jemalloc enabled

     

    Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N

    No Google Cloud Platform support will be enabled for TensorFlow

     

    Do you wish to build TensorFlow with Hadoop File System support? [y/N] N

    No Hadoop File System support will be enabled for TensorFlow

     

    Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] N

    No XLA JIT support will be enabled for TensorFlow

     

    Found possible Python library paths:

      /usr/local/lib/python2.7/dist-packages

    /usr/lib/python2.7/dist-packages

    Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

    Using python library path: /usr/local/lib/python2.7/dist-packages

     

    Do you wish to build TensorFlow with OpenCL support? [y/N] N

    No OpenCL support will be enabled for TensorFlow

     

    Do you wish to build TensorFlow with CUDA support? [y/N] Y (For a Worker, N for a Parameter Server)

    CUDA support will be enabled for TensorFlow

     

    Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

     

    Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0

     

    Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

     

    Please specify the cuDNN version you want to use. [Leave empty to use system default]: 5

     

    Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

     

    Please specify a list of comma-separated Cuda compute capabilities you want to build with.

    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.

    Please note that each additional compute capability significantly increases your build time and binary size.

    [Default is: "3.5,5.2"]: 3.5 (Tesla K40 from https://developer.nvidia.com/cuda-gpus)

    Setting up Cuda include

    Setting up Cuda lib

    Setting up Cuda bin

    Setting up Cuda nvvm

    Setting up CUPTI include

    Setting up CUPTI lib64

    Configuration finished

     

    Pip Installation

    Pip installation installs TensorFlow on your machine and possibly upgrades previously installed Python packages. Note that this may impact existing Python programs on your machine.

    Pip is a package management system used to install and manage software packages written in Python.

    This installation requires the code from Github. You can either take the most recent master branch (lots of new commits) or the latest release branch (should be more stable, but still updated every few days). Here, we use the branch r1.0.

     

    Build the Pip Package

    1. To build a pip package for TensorFlow with CPU-only support (For a Parameter Server), invoke the following command:

     

    $ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

     

     

    2. To build a pip package for TensorFlow with GPU support (For a Worker Server), invoke the following command:

    $ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

     

    3. The bazel build command builds a script named build_pip_package. Running this script as follows will build a .whl file within the /tmp/tensorflow_pkg directory:

    $ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

     

    Install the Pip Package

    Note: You will need Pip version 8.1 or later for the following commands to work on Linux.

     

    1. Install the pip package that was built in the previous step. The filename of the .whl file depends on your platform. For example, the following commands install the pip package for TensorFlow 1.0.0 on Linux:

    For a Parameter Server (sample)

     

    $ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.0.0-py2-none-any.whl

     

     

    For a Worker Server (sample)

    $ sudo pip install /tmp/tensorflow_pkg/tensorflow_gpu-1.0.0-py2-none-any.whl
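    The two builds produce differently named wheels; a small sketch to tell them apart by filename prefix (the helper wheel_flavor is hypothetical, for illustration only):

```python
# Sketch: tell the CPU-only and GPU wheels apart by their filename prefix,
# as in the two sample install commands above.
def wheel_flavor(path):
    name = path.rsplit("/", 1)[-1]
    return "gpu" if name.startswith("tensorflow_gpu-") else "cpu"

print(wheel_flavor("/tmp/tensorflow_pkg/tensorflow_gpu-1.0.0-py2-none-any.whl"))  # gpu
print(wheel_flavor("/tmp/tensorflow_pkg/tensorflow-1.0.0-py2-none-any.whl"))      # cpu
```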


    Validate TensorFlow Installation

    To validate the TensorFlow installation:

    1. Close all your terminals and open a new terminal to test.

     

    2. Change directory (cd) to any directory on your system other than the tensorflow subdirectory from which you invoked the configure command.

     

    3.    Invoke python:

    $ cd /

    $ python

    ...
    >>> import tensorflow as tf
    >>> hello = tf.constant('Hello, TensorFlow!')
    >>> sess = tf.Session()
    >>> print(sess.run(hello))
    Hello, TensorFlow!
    >>> a = tf.constant(10)
    >>> b = tf.constant(32)
    >>> print(sess.run(a + b))
    42
    >>>

    Press CTRL-D to exit.

     
    Troubleshooting

    The installation issues that might occur typically depend on the installed operating system. For further information, see the "Common installation problems" section of the Installing TensorFlow on Linux guide.

    Beyond the errors documented in the guide above, the following table specifies additional errors specific to building TensorFlow. Note that we are relying on Stack Overflow as the repository for build and installation problems. If you encounter an error message not listed in the preceding guide or in the following table, search for it on Stack Overflow. If Stack Overflow does not show the error message, ask a new question on Stack Overflow and specify the tensorflow tag.

    Stack Overflow issue: 42013316
    Error message:
    ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
    ImportError: libcudnn.5: cannot open shared object file: No such file or directory

    Stack Overflow issue: 35953210
    Error message: Invoking python or ipython generates the following error:
    ImportError: cannot import name pywrap_tensorflow