TensorFlow on Windows (Native)
On 11/29/2016 the Google Developers Blog (https://developers.googleblog.com/2016/11/tensorflow-0-12-adds-support-for-windows.html) announced a native TensorFlow package for Windows 7, 10, and Server 2016. TensorFlow r0.12 aims to provide a complete experience on Windows platforms, including CUDA support.
System Requirements
In order to run TensorFlow with GPU support natively on Windows 10, you need an NVIDIA GPU with Compute Capability >= 3.0. To check the compute capability of your NVIDIA GPU, please visit https://developer.nvidia.com/cuda-gpus.
I have tested these steps on an NVIDIA GPU with Compute Capability 5.0 (GeForce GTX 960M).
Software Requirements
In order to complete the required steps, you need to be a member of the NVIDIA Accelerated Computing Developer Program. You can create a free account at https://developer.nvidia.com/accelerated-computing-developer. Once you have created the account, download the required software as listed below:
- Microsoft Visual Studio Community Edition 2015 (https://www.visualstudio.com/vs/community/)
- NVIDIA® CUDA® Toolkit 8.0 (https://developer.nvidia.com/cuda-toolkit)
- NVIDIA CUDA® Deep Neural Network library (cuDNN) 5.1 (https://developer.nvidia.com/cudnn)
- Anaconda 4.2.0 For Windows Python 3.5 version: 64-bit installer (https://www.continuum.io/downloads)
Once you have downloaded these bundles, you are all set to start the installation process.
Installation Steps
You should follow these steps in sequence to successfully complete the installation:
- Install Microsoft Visual Studio Community Edition 2015. Restart your computer once the installation is finished.
- Run NVIDIA® CUDA® Toolkit 8.0 installer.
- Unzip the cuDNN bundle.
- If you have selected defaults for CUDA® Toolkit installation, you should have CUDA® Toolkit installed at "C:\Program Files\NVIDIA GPU Computing Toolkit". Locate following folders in the unzipped cuDNN bundle:
- cuda\bin
- cuda\include
- cuda\lib\x64
- Copy the contents of the above three folders to the following directories (respectively):
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include
- C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
- Now, proceed with the installation of Anaconda 4.2.0. Install Anaconda in a directory of your choice; this location is referred to below as ANACONDA_HOME.
- Open a Windows command prompt (cmd) with the option "Run as administrator".
- Execute the following command (see the note on corporate firewall proxy settings at the end of these instructions):
pip install --ignore-installed --upgrade pip setuptools
If you skip this step, the next step will fail with an error beginning Cannot remove entries from nonexistent...
- After updating setuptools, execute the command:
pip install --upgrade https://storage.googleapis.com/tensorflow/windows/gpu/tensorflow_gpu-0.12.0rc0-cp35-cp35m-win_amd64.whl
- Once the TensorFlow package is installed, locate the Spyder executable "ANACONDA_HOME\Scripts\spyder.exe" and run it.
- Copy and paste the following lines into the Spyder editor:
import tensorflow as tf
from tensorflow.python.client import device_lib

print('TensorFlow Version: %s' % tf.__version__)

local_devices = device_lib.list_local_devices()
gpus = [x.name for x in local_devices if x.device_type == 'GPU']
print('Available GPUs: %s' % gpus)

if len(gpus) > 0:
    with tf.device(gpus[0]):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
        c = tf.matmul(a, b)
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
    print('Output:')
    print(sess.run(c))
- Upon running the above code, you should see the following output in an IPython console:
TensorFlow Version: 0.12.0-rc0
Available GPUs: ['/gpu:0']
Output:
[[ 22.  28.]
 [ 49.  64.]]
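As a sanity check, the numbers in this output can be reproduced without TensorFlow. The sketch below multiplies the same 2x3 and 3x2 matrices in plain Python:

```python
# The same matrices as in the TensorFlow example above.
a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]   # shape [2, 3]
b = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]        # shape [3, 2]

# Plain-Python matrix multiplication: c[i][j] = sum over k of a[i][k] * b[k][j]
c = [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(2)]
     for i in range(2)]
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

The result matches the [[ 22. 28.] [ 49. 64.]] printed by the GPU session.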
This output does not show any device placement log. To see the device placement log, run the above program in a dedicated Python console or from the cmd command line. The following is the detailed output, with the device placement log showing the GPU being used:
C:\Users\Yogendra Pandey\.spyder-py3>python temp.py
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cublas64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cudnn64_5.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cufft64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library nvcuda.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library curand64_80.dll locally
TensorFlow Version: 0.12.0-rc0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 960M
major: 5 minor: 0 memoryClockRate (GHz) 1.176
pciBusID 0000:02:00.0
Total memory: 4.00GiB
Free memory: 3.35GiB
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:906] DMA: 0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:916] 0: Y
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0)
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:586] Could not identify NUMA node of /gpu:0, defaulting to 0. Your kernel may not have been built with NUMA support.
Available GPUs: ['/gpu:0']
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0)
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:586] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0. Your kernel may not have been built with NUMA support.
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\direct_session.cc:255] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0
Output:
MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\simple_placer.cc:827] MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
b: (Const): /job:localhost/replica:0/task:0/gpu:0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\simple_placer.cc:827] b: (Const)/job:localhost/replica:0/task:0/gpu:0
a: (Const): /job:localhost/replica:0/task:0/gpu:0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\simple_placer.cc:827] a: (Const)/job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
You are now ready to run TensorFlow on Windows 10 natively with GPU support.
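If the import instead fails to load cudnn64_5.dll, the most common cause is that the cuDNN files were not copied into the CUDA Toolkit tree. The sketch below is an illustrative helper, not part of the official setup: the file names are taken from the cuDNN 5.1 Windows bundle, and the path assumes the default CUDA Toolkit install location.

```python
import os

# Files the cuDNN 5.1 bundle should have contributed to the CUDA Toolkit
# tree (names taken from the cuDNN 5.1 Windows archive).
CUDNN_FILES = [
    os.path.join("bin", "cudnn64_5.dll"),
    os.path.join("include", "cudnn.h"),
    os.path.join("lib", "x64", "cudnn.lib"),
]

def missing_cudnn_files(toolkit_root):
    """Return the expected cuDNN files not found under the CUDA Toolkit root."""
    return [f for f in CUDNN_FILES
            if not os.path.isfile(os.path.join(toolkit_root, f))]

if __name__ == "__main__":
    root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0"
    missing = missing_cudnn_files(root)
    if missing:
        print("Missing cuDNN files:", missing)
    else:
        print("All cuDNN files are in place.")
```

If any file is reported missing, repeat the copy step from the unzipped cuDNN bundle before retrying the import.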
Note
If you are behind a corporate firewall, the pip install commands above will fail with "Failed to establish a new connection" or a similar error. To bypass the proxy, append
--proxy http://<your.corp.proxy.com>:<port>
or
--proxy http://<username>:<password>@<your.corp.proxy.com>:<port>
to the end of the pip install commands, e.g.
pip install --ignore-installed --upgrade pip setuptools --proxy http://<your.corp.proxy.com>:<port>
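As an alternative to the --proxy flag, pip also honors the standard HTTP_PROXY and HTTPS_PROXY environment variables. The helper below is a hypothetical illustration of setting them for a single pip invocation; the proxy URL is a placeholder that you would replace with your corporate proxy.

```python
import os
import subprocess
import sys

def proxied_env(proxy_url, base=None):
    """Return a copy of the environment with pip's proxy variables set."""
    env = dict(os.environ if base is None else base)
    env["HTTP_PROXY"] = proxy_url
    env["HTTPS_PROXY"] = proxy_url
    return env

if __name__ == "__main__":
    # Placeholder proxy URL; substitute your corporate proxy host and port.
    env = proxied_env("http://your.corp.proxy.com:8080")
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install",
         "--ignore-installed", "--upgrade", "pip", "setuptools"],
        env=env)
```

Setting the variables per-invocation, rather than globally, avoids routing unrelated network traffic through the proxy.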