TensorFlow on Windows (Native)

On 11/29/2016, the Google Developers Blog (https://developers.googleblog.com/2016/11/tensorflow-0-12-adds-support-for-windows.html) announced a native TensorFlow package for Windows 7, 10, and Server 2016. TensorFlow r0.12 aims to provide a complete experience on Windows platforms, including CUDA support.

System Requirements

In order to run TensorFlow with GPU support natively on Windows 10, you need an NVIDIA GPU with Compute Capability >= 3.0. To check the compute capability of your NVIDIA GPU, visit https://developer.nvidia.com/cuda-gpus.

I have tested these steps on an NVIDIA GPU with Compute Capability 5.0 (GeForce GTX 960M).
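
If you would rather check the compute capability programmatically, the following is a minimal sketch that queries it through the CUDA driver API (nvcuda.dll, which is installed with the NVIDIA display driver, so this works even before the CUDA® Toolkit is installed). The driver API functions used here (cuInit, cuDeviceGetCount, cuDeviceGet, cuDeviceGetName, cuDeviceComputeCapability) are available with CUDA 8.0-era drivers; treat it as a convenience check rather than part of the official instructions.

    import ctypes

    # The CUDA driver API ships with the NVIDIA display driver as nvcuda.dll.
    cuda = ctypes.WinDLL('nvcuda.dll')
    cuda.cuInit(0)

    count = ctypes.c_int()
    cuda.cuDeviceGetCount(ctypes.byref(count))

    for ordinal in range(count.value):
        device = ctypes.c_int()
        cuda.cuDeviceGet(ctypes.byref(device), ordinal)

        name = ctypes.create_string_buffer(100)
        cuda.cuDeviceGetName(name, len(name), device)

        # cuDeviceComputeCapability is deprecated but still present in CUDA 8.0-era drivers.
        major, minor = ctypes.c_int(), ctypes.c_int()
        cuda.cuDeviceComputeCapability(ctypes.byref(major), ctypes.byref(minor), device)

        print('%s: Compute Capability %d.%d'
              % (name.value.decode(), major.value, minor.value))

A GeForce GTX 960M, for example, should report 5.0 here.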

Software Requirements

In order to complete the required steps, you should be a member of the NVIDIA Accelerated Computing Developer Program. You can create a free account at https://developer.nvidia.com/accelerated-computing-developer. Once you have created the account, download the required software listed below:

    • Microsoft Visual Studio Community Edition 2015
    • NVIDIA® CUDA® Toolkit 8.0
    • NVIDIA cuDNN v5.1 for CUDA® Toolkit 8.0
    • Anaconda 4.2.0 (64-bit, Python 3.5)

Once you have downloaded these bundles, you are all set to start the installation process.

Installation Steps

You should follow these steps in sequence to successfully complete the installation:

  1. Install Microsoft Visual Studio Community Edition 2015. Restart your computer once the installation is finished.
  2. Run the NVIDIA® CUDA® Toolkit 8.0 installer.
  3. Unzip the cuDNN bundle.
  4. If you accepted the defaults during the CUDA® Toolkit installation, the toolkit is installed under "C:\Program Files\NVIDIA GPU Computing Toolkit". Locate the following folders in the unzipped cuDNN bundle:
    • cuda\bin
    • cuda\include
    • cuda\lib\x64
  5. Copy the contents of the above three folders to the following directories, respectively (a scripted alternative is sketched after these steps):
    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin
    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include
    • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
  6. Now install Anaconda 4.2.0 (64-bit, Python 3.5). Complete the Anaconda installation in a directory of your choice, referred to below as ANACONDA_HOME.
  7. Open a Windows command prompt (cmd) using the "Run as administrator" option.
  8. Execute the command pip install --ignore-installed --upgrade pip setuptools (see the note on corporate firewall proxy settings at the end of these instructions). If you skip this step, the next step will fail with an error similar to:

    Cannot remove entries from nonexistent...

  9. After updating setuptools, execute the command pip install --upgrade https://storage.googleapis.com/tensorflow/windows/gpu/tensorflow_gpu-0.12.0rc0-cp35-cp35m-win_amd64.whl
  10. Upon completion of step 9, locate the Spyder executable "ANACONDA_HOME\Scripts\spyder.exe" and run Spyder.
  11. Copy and paste the following lines into the Spyder editor:
    import tensorflow as tf
    from tensorflow.python.client import device_lib

    print('TensorFlow Version: %s' % tf.__version__)

    # List the devices TensorFlow can see and keep only the GPUs.
    local_devices = device_lib.list_local_devices()
    gpus = [x.name for x in local_devices if x.device_type == 'GPU']
    print('Available GPUs: %s' % gpus)

    if len(gpus) > 0:
      # Pin a small matrix multiplication to the first GPU.
      with tf.device(gpus[0]):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
        c = tf.matmul(a, b)
      # log_device_placement=True makes TensorFlow report which device runs each op.
      sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
      print('Output:')
      print(sess.run(c))
    
  12. Upon running the above code, you should see the following output in the IPython console:
    TensorFlow Version: 0.12.0-rc0
    Available GPUs: ['/gpu:0']
    Output:
    [[ 22.  28.]
    [ 49.  64.]]
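
As mentioned in step 5, the manual copy of the cuDNN files can also be scripted. The following is a minimal sketch; the source path C:\cudnn is only an assumption about where you unzipped the cuDNN bundle, and it must be run from an administrator prompt because the destination is under Program Files.

    import os
    import shutil

    CUDNN_DIR = r'C:\cudnn\cuda'  # assumption: where the cuDNN bundle was unzipped
    CUDA_DIR = r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0'

    # Copy bin, include and lib\x64 into the matching CUDA Toolkit directories.
    for sub in ('bin', 'include', r'lib\x64'):
        src_dir = os.path.join(CUDNN_DIR, sub)
        dst_dir = os.path.join(CUDA_DIR, sub)
        for name in os.listdir(src_dir):
            shutil.copy2(os.path.join(src_dir, name), dst_dir)
            print('Copied %s -> %s' % (name, dst_dir))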
    

The output in step 12 does not show the device placement log. To see the device placement log, run the above program in a dedicated Python console or from the cmd command line. The following is the detailed output, with the device placement log showing the GPU being used:

C:\Users\Yogendra Pandey\.spyder-py3>python temp.py
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cublas64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cudnn64_5.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cufft64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library nvcuda.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library curand64_80.dll locally
TensorFlow Version: 0.12.0-rc0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 960M
major: 5 minor: 0 memoryClockRate (GHz) 1.176
pciBusID 0000:02:00.0
Total memory: 4.00GiB
Free memory: 3.35GiB
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:906] DMA: 0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:916] 0:   Y
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0)
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:586] Could not identify NUMA node of /gpu:0, defaulting to 0.  Your kernel may not have been built with NUMA support.
Available GPUs: ['/gpu:0']
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0)
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:586] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0.  Your kernel may not have been built with NUMA support.
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\direct_session.cc:255] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0
Output:
MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\simple_placer.cc:827] MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
b: (Const): /job:localhost/replica:0/task:0/gpu:0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\simple_placer.cc:827] b: (Const)/job:localhost/replica:0/task:0/gpu:0
a: (Const): /job:localhost/replica:0/task:0/gpu:0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\simple_placer.cc:827] a: (Const)/job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
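
If, instead of output like the above, the import fails because one of the CUDA or cuDNN libraries cannot be found, a quick diagnostic is to try loading the DLLs named in the log directly. This is only a sketch; the DLL names are taken from the log above.

    import ctypes

    # Try to load each CUDA/cuDNN DLL that TensorFlow opens in the log above.
    for dll in ('nvcuda.dll', 'cublas64_80.dll', 'cudnn64_5.dll',
                'cufft64_80.dll', 'curand64_80.dll'):
        try:
            ctypes.WinDLL(dll)
            print('%s: OK' % dll)
        except OSError as exc:
            print('%s: NOT FOUND (%s)' % (dll, exc))

A missing cudnn64_5.dll usually means the copy in step 5 was skipped; the other libraries should already be on the PATH once the CUDA® Toolkit installer and the NVIDIA display driver are installed.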

You are now ready to run TensorFlow on Windows 10 natively with GPU support.

Note

If you are behind a corporate firewall, the pip install commands in steps 8 and 9 will fail with a Failed to establish a new connection or similar error. To go through the proxy, append --proxy http://<your.corp.proxy.com>:<port> or --proxy http://<username>:<password>@<your.corp.proxy.com>:<port> to the pip install commands, for example:

pip install --ignore-installed --upgrade pip setuptools --proxy http://<your.corp.proxy.com>:<port>
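
Alternatively, pip honors the standard proxy environment variables, so setting them once in the command prompt avoids repeating the --proxy option on every command:

set HTTP_PROXY=http://<your.corp.proxy.com>:<port>
set HTTPS_PROXY=http://<your.corp.proxy.com>:<port>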

