There may be a way to solve your problem.
Just look here:
sudo nvidia-smi
Mon Nov 14 16:14:48 2016
+------------------------------------------------------+
| NVIDIA-SMI 358.16     Driver Version: 358.16         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 0000:01:00.0     Off |                  N/A |
| 35%   76C    P2   111W / 250W |    475MiB / 12287MiB |     60%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     12235    C   /home/tteikhua/torch/install/bin/luajit        215MiB |
|    0     27771    C   python                                         233MiB |
+-----------------------------------------------------------------------------+
Using the "nvidia-smi" command, you can see that a program written to use the GPU in Torch is also sharing the GPU memory (just total up the individual GPU memory and you can see that it is less than the total 12G memory which Titan X have) with the python (which is running Tensorflow).
What is the underlying mechanism that makes this sharing possible? Both Torch and Tensorflow are built on CUDA, and each process opens the same GPU device node, so access to the GPU itself is shared:
ls -al /proc/12235/fd|grep nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 12 07:54 10 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 12 07:54 11 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 12 07:54 12 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 12 07:54 4 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 12 07:54 5 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 12 07:54 6 -> /dev/nvidia0
ls -al /proc/27771/fd|grep nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 14 15:51 10 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 14 15:51 11 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 14 15:51 15 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 14 15:51 16 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 14 15:51 17 -> /dev/nvidia0
lrwx------ 1 tteikhua tteikhua 64 Nov 14 15:51 9 -> /dev/nvidia0
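You can reproduce this with any CUDA program, not just Torch or Tensorflow. Here is a minimal sketch (the file name minimal_ctx.cu and the 60-second sleep are arbitrary choices of mine): it forces the CUDA runtime to create a context on the GPU and then sleeps, so while it runs you can list /proc/<pid>/fd and see the same /dev/nvidia0 descriptors.

// minimal_ctx.cu - create a CUDA context and keep it alive for inspection
// build with: nvcc minimal_ctx.cu -o minimal_ctx
#include <cstdio>
#include <unistd.h>
#include <cuda_runtime.h>

int main() {
    // cudaFree(0) is a common idiom to force context creation on the
    // current device without allocating anything.
    cudaFree(0);
    printf("CUDA context created, pid = %d\n", (int)getpid());
    printf("try: ls -al /proc/%d/fd | grep nvidia\n", (int)getpid());
    sleep(60);  // keep the process (and its /dev/nvidia0 fds) alive
    return 0;
}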
So how is this achieved? Have a look at the picture here:
http://cuda-programming.blogspot.sg/2013/01/shared-memory-and-synchronization-in.html

and this:
https://www.bu.edu/pasi/files/2011/07/Lecture31.pdf
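For illustration, one mechanism in that family is mapped, page-locked host memory; the sketch below (the file name mapped_host.cu, the kernel name, and the buffer size are arbitrary choices of mine) lets the CPU and the GPU read and write the very same buffer:

// mapped_host.cu - one process letting CPU and GPU touch the same buffer
// build with: nvcc mapped_host.cu -o mapped_host
#include <cstdio>
#include <cuda_runtime.h>

__global__ void add_one(int *p, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 1;
}

int main() {
    const int n = 1024;
    int *h_buf = nullptr;

    // Allow mapping of page-locked host memory into the GPU address space,
    // then allocate a buffer that both the CPU and the GPU can access.
    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaHostAlloc((void **)&h_buf, n * sizeof(int), cudaHostAllocMapped);
    for (int i = 0; i < n; ++i) h_buf[i] = i;

    // Get the device-side pointer to the same physical memory.
    int *d_buf = nullptr;
    cudaHostGetDevicePointer((void **)&d_buf, h_buf, 0);

    add_one<<<(n + 255) / 256, 256>>>(d_buf, n);
    cudaDeviceSynchronize();

    printf("h_buf[0] = %d (incremented by the GPU)\n", h_buf[0]);
    cudaFreeHost(h_buf);
    return 0;
}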
That, however, is sharing between the GPU and the CPU within a single process. What you are asking about is different: two separate processes using the same GPU's memory at the same time. That is also possible, as shown below.
Modifying simpleMultiCopy from the CUDA samples and launching it several times in parallel gives this:
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     12235    C   /home/tteikhua/torch/install/bin/luajit        215MiB |
|    0     27771    C   python                                         233MiB |
|    0     31014    C   ./simpleMultiCopy                              238MiB |
|    0     31021    C   ./simpleMultiCopy                              238MiB |
|    0     31024    C   ./simpleMultiCopy                              238MiB |
+-----------------------------------------------------------------------------+
You can see that running multiple copies of the same program results in concurrent sharing of the GPU's memory: each process gets its own allocation, and the per-process figures add up to the total memory in use on the GPU.
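If you want to try this without touching the CUDA sample, a few lines of your own are enough. The sketch below (the file name hold_mem.cu and the 256 MiB size are my own arbitrary choices) just allocates a buffer on the GPU and holds it for a minute:

// hold_mem.cu - each instance allocates 256 MiB on the GPU and holds it
// build with: nvcc hold_mem.cu -o hold_mem
#include <cstdio>
#include <unistd.h>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256UL * 1024 * 1024;   // 256 MiB per process
    void *d_buf = nullptr;

    cudaError_t err = cudaMalloc(&d_buf, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("pid %d holding %zu MiB on the GPU\n", (int)getpid(), bytes >> 20);
    sleep(60);      // keep the allocation alive so nvidia-smi can see it
    cudaFree(d_buf);
    return 0;
}

Launch it a few times in the background (./hold_mem & ./hold_mem & ./hold_mem &) and run nvidia-smi while they sleep; you should see one row per process, with the per-process usage adding up just as above.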
For Chainer, I did a "git clone https://github.com/pfnet/chainer", then in the examples/mnist directory ran "python train_mnist.py --gpu=0" twice, and got this:
|   0  GeForce GTX 850M    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   64C    P0    N/A /  N/A |    322MiB /  4043MiB |     81%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     17406    C   python                                         160MiB |
|    0     17414    C   python                                         160MiB |
+-----------------------------------------------------------------------------+
which means the two separate processes are again sharing the GPU's memory, each with its own allocation.