The official documentation currently omits some painful details, so here is a quick checklist for installing CUDA, CUDA-powered TensorFlow, and Keras on Windows 10.

Procedure

  1. Install the CUDA 8.0 toolkit from Nvidia. The installer automatically adds CUDA's bin directory to the Windows PATH variable.
  2. Download cuDNN 5.1 from Nvidia. Be sure to use 5.1, as 6.0 is quite new and not yet supported by TensorFlow.
  3. Extract the cuDNN DLL from the cuDNN zip file, and put it in CUDA's bin directory, which normally is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin
  4. Download and install Python 3.5.x from Python.org. TensorFlow only supports 3.5.x on Windows.
  5. In an elevated (administrator) cmd prompt, run
    python -m pip install --upgrade pip
    to upgrade to the latest pip version. Without the upgrade, pip will not be able to find the TensorFlow packages.
  6. In the same cmd, run pip install --upgrade tensorflow-gpu
  7. Download the latest scipy wheel file from Christoph Gohlke's homepage and install it by running pip install followed by the path to the downloaded .whl file. This is the least painful way (apart from Anaconda) to get scipy with LAPACK, etc.
  8. Install Keras by running pip install --upgrade keras. If you would like the latest version from the GitHub repository, run pip install git+https://github.com/fchollet/keras.git instead.
  9. Log out and back in, to make sure all environment variables set during installation are picked up correctly.

With that, the installation should be done.
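
Since step 4 depends on a specific interpreter, a small check like the following can rule out a wrong Python before debugging anything else. It is only a sketch: the function name is made up for illustration, and the 64-bit requirement is not stated in the steps above but is added here on the assumption that the TensorFlow Windows packages are built for 64-bit Python only.

```python
import struct
import sys


def is_supported_interpreter(version_info=sys.version_info, pointer_bits=None):
    """Return True for a 64-bit Python 3.5.x interpreter (hypothetical helper)."""
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit Python build.
        pointer_bits = struct.calcsize("P") * 8
    # Assumption: only the 3.5 series on a 64-bit build is usable.
    return tuple(version_info[:2]) == (3, 5) and pointer_bits == 64


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    if is_supported_interpreter():
        print("Interpreter looks suitable for tensorflow-gpu.")
    else:
        print("Expected 64-bit Python 3.5.x; pip may not find a matching wheel.")
```

Run it with the same python that pip belongs to (for example via python -m pip), so that the interpreter being checked is the one the packages are installed into.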

To verify both TensorFlow and Keras were installed successfully, run

import tensorflow as tf
import keras
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
b'Hello, TensorFlow!'

from a Python prompt. Neither import should produce an error, and the last line above is the expected output of the print call. If TensorFlow has trouble finding CUDA or cuDNN, check that CUDA's bin directory is really in the Windows PATH (see Settings > System > Advanced System Settings > Advanced > Environment Variables).
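
The PATH check can also be done from Python instead of clicking through the settings dialog. The sketch below searches every PATH entry for a given file. The helper name is hypothetical, and the two DLL names are assumptions about what CUDA 8.0 and cuDNN 5.1 ship; compare them against the files actually present in your CUDA bin directory.

```python
import os


def dirs_on_path_containing(filename, path_value=None):
    """Return the PATH directories that contain `filename` (hypothetical helper)."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        # PATH entries are sometimes quoted or padded with spaces.
        entry = entry.strip().strip('"')
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # Assumed DLL names for CUDA 8.0 and cuDNN 5.1; adjust if yours differ.
    for dll in ("cudart64_80.dll", "cudnn64_5.dll"):
        found = dirs_on_path_containing(dll)
        print(dll, "->", found if found else "not found on PATH")
```

If a DLL is reported as not found even though it sits in the bin directory, the prompt was probably opened before PATH changed; logging out and back in (step 9) refreshes it.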

I'll try to keep this updated as new TensorFlow or Keras versions arrive.

Known issues

With the most recent TensorFlow release, running a session logs some errors:

E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "BestSplits" device_type: "CPU"') for unknown op: BestSplits
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "CountExtremelyRandomStats" device_type: "CPU"') for unknown op: CountExtremelyRandomStats
...

These seem to relate only to the CPU backend and have been fixed in the latest nightly builds. Results from the GPU backend seem unaffected.
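
To confirm that a GPU device is visible at all, and that work is not silently falling back to the CPU, one option is to list the devices TensorFlow reports. This is a sketch rather than an official check: the function name is made up, and device_lib lives in an internal TensorFlow module, so treat its availability in your version as an assumption.

```python
def visible_gpu_names():
    """Return names of GPU devices TensorFlow reports, or None if TensorFlow
    cannot be imported (hypothetical helper)."""
    try:
        # Internal module; assumed to be present in the installed release.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = visible_gpu_names()
    if gpus is None:
        print("TensorFlow could not be imported.")
    elif gpus:
        print("GPU devices:", gpus)
    else:
        print("No GPU device visible; check CUDA, cuDNN, and PATH.")
```

An empty list with a working import usually points back at the CUDA or cuDNN setup rather than at TensorFlow itself.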

Sources