Showing posts with label optirun. Show all posts

Sunday, June 4, 2017

First learning into pycuda part 2

So I continue where I left off and explore the pycuda library further. There is a good tutorial that explains in more detail what the code does; you can read more here.

 user@localhost:~$ optirun python3 test_cumath.py   
 Traceback (most recent call last):  
  File "test_cumath.py", line 245, in <module>  
   from py.test.cmdline import main  
 ImportError: No module named 'py'  

The py module comes from the python3-py package, which pytest depends on, so installing pytest pulls it in. So install away.
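Before reaching for apt, it can help to confirm which modules are actually missing. Here is a minimal sketch using only the standard library; `missing_modules` is a helper name I made up for illustration, and nothing like it exists in test_cumath.py.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec() looks a module up without importing it and returns None
    # when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "py" is the module named in the ImportError; "pytest" is the test runner.
    print(missing_modules(["py", "pytest"]))
```

An empty list means both modules are importable by the interpreter that ran the check, so run it with the same python3 that runs the tests.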

 user@localhost:~$ sudo apt-get install python3-pytest  
 Reading package lists... Done  
 Building dependency tree      
 Reading state information... Done  
 The following packages were automatically installed and are no longer required:  
  libgl1-nvidia-glx:i386 libgl1-nvidia-glx-i386:i386 libllvm3.5v5 libnvidia-glcore:i386 linux-image-4.1.0-2-amd64 linux-image-4.2.0-1-amd64 linux-source-4.3 python3-ecdsa syslinux  
  unetbootin-translations  
 Use 'sudo apt autoremove' to remove them.  
 The following additional packages will be installed:  
  python3-py  
 The following NEW packages will be installed:  
  python3-py python3-pytest  
 0 upgraded, 2 newly installed, 0 to remove and 508 not upgraded.  
 Need to get 249 kB of archives.  
 After this operation, 962 kB of additional disk space will be used.  
 Do you want to continue? [Y/n] Y  
 Get:1 http://ftp.us.debian.org/debian testing/main amd64 python3-py all 1.4.31-1 [81.9 kB]  
 Get:2 http://ftp.us.debian.org/debian testing/main amd64 python3-pytest all 2.9.2-3 [167 kB]                                                   
 Fetched 249 kB in 6s (36.0 kB/s)                                                                                 
 Selecting previously unselected package python3-py.  
 (Reading database ... 285403 files and directories currently installed.)  
 Preparing to unpack .../python3-py_1.4.31-1_all.deb ...  
 Unpacking python3-py (1.4.31-1) ...  
 Selecting previously unselected package python3-pytest.  
 Preparing to unpack .../python3-pytest_2.9.2-3_all.deb ...  
 Unpacking python3-pytest (2.9.2-3) ...  
 Processing triggers for man-db (2.7.5-1) ...  
 Setting up python3-py (1.4.31-1) ...  
 Setting up python3-pytest (2.9.2-3) ...  

 user@localhost:~$ optirun python3 -mpytest test_cumath.py   
 ==================================================================================== test session starts =====================================================================================  
 platform linux -- Python 3.5.2+, pytest-2.9.2, py-1.4.31, pluggy-0.3.1  
 rootdir: /home/user/, inifile:   
 collected 28 items   
   
 test_cumath.py ............................  
   
 ================================================================================ 28 passed in 106.94 seconds =================================================================================  

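The first run was very slow, about 107 seconds. I am not sure why; one likely cause is that PyCUDA compiles its CUDA kernels on first use and caches the result, so later runs skip that step. Then I ran the tests again.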

 user@localhost:~$ optirun python3 -s -mpytest test_cumath.py   
 ==================================================================================== test session starts =====================================================================================  
 platform linux -- Python 3.5.2+, pytest-2.9.2, py-1.4.31, pluggy-0.3.1  
 rootdir: /home/user/, inifile:   
 collected 28 items   
   
 test_cumath.py ............................  
   
 ================================================================================= 28 passed in 8.43 seconds ==================================================================================  
 user@localhost:~/oss/asus-rt-n14uhp-mrtg/src/pycuda$ optirun python3 -s -mpytest test_cumath.py   
 ==================================================================================== test session starts =====================================================================================  
 platform linux -- Python 3.5.2+, pytest-2.9.2, py-1.4.31, pluggy-0.3.1  
 rootdir: /home/user/, inifile:   
 collected 28 items   
   
 test_cumath.py ............................  
   
 ================================================================================= 28 passed in 8.44 seconds ==================================================================================  

The tests now average about 8.5 seconds, which is pretty good. Then I added some code to print the CPU and GPU time for each test. It looks to me like the CPU performs much better on these maths tests.
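For anyone who wants to take similar measurements, here is a rough sketch of a timing helper. It is not the code I added to the test file; `best_time` and `cpu_work` are hypothetical names, and the workload is a plain Python stand-in so the sketch runs without a GPU.

```python
import math
import time


def best_time(fn, repeats=5):
    """Return the shortest wall-clock time, in seconds, over `repeats` calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def cpu_work():
    # Stand-in CPU workload, chosen only so that the sketch is self-contained.
    return [math.sin(x) for x in range(1000)]


if __name__ == "__main__":
    print('cpu time="%.9f"' % best_time(cpu_work))
    # For a GPU callable, wait for the device before the clock stops, for
    # example by calling pycuda.driver.Context.synchronize() at the end of fn.
    # Otherwise the timer may only capture the kernel launch.
```

Taking the best of several runs reduces noise from one-off costs such as caching. The output that follows comes from my own modified test file, not from this sketch.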

 name="<class 'numpy.float32'>" cpu time="0.00000230599835049360990524291992187500000000000000" gpu time="0.00008139800047501921653747558593750000000000000000"  
 name="<class 'numpy.float64'>" cpu time="0.00000265298876911401748657226562500000000000000000" gpu time="0.00008218799484893679618835449218750000000000000000"  
 name="<class 'numpy.float32'>" cpu time="0.00000343500869348645210266113281250000000000000000" gpu time="0.00007703299343120306730270385742187500000000000000"  
 name="<class 'numpy.float64'>" cpu time="0.00000536799780093133449554443359375000000000000000" gpu time="0.00008030000026337802410125732421875000000000000000"  
 name="<class 'numpy.float32'>" cpu time="0.00001179199898615479469299316406250000000000000000" gpu time="0.00007762899622321128845214843750000000000000000000"  
 name="<class 'numpy.float64'>" cpu time="0.00002014498750213533639907836914062500000000000000" gpu time="0.00010288099292665719985961914062500000000000000000"  
 name="<class 'numpy.float32'>" cpu time="0.00001173198688775300979614257812500000000000000000" gpu time="0.00007860400364734232425689697265625000000000000000"  
 name="<class 'numpy.float64'>" cpu time="0.00001884899393189698457717895507812500000000000000" gpu time="0.00007704700692556798458099365234375000000000000000"  
 name="<class 'numpy.float32'>" cpu time="0.00007372199615929275751113891601562500000000000000" gpu time="0.00008238101145252585411071777343750000000000000000"  
 name="<class 'numpy.float64'>" cpu time="0.00014613500388804823160171508789062500000000000000" gpu time="0.00016600399976596236228942871093750000000000000000"  
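To put a number on "much better", the lines above can be parsed and turned into a GPU-to-CPU ratio. This is a small sketch with the first and last rows copied from the output, with the digits truncated for readability; `gpu_to_cpu_ratios` is a made-up helper name.

```python
import re

# First and last rows of the output above, truncated to fewer digits.
LOG = '''
name="<class 'numpy.float32'>" cpu time="0.0000023059983504936" gpu time="0.0000813980004750192"
name="<class 'numpy.float64'>" cpu time="0.0001461350038880482" gpu time="0.0001660039997659623"
'''

PATTERN = re.compile(r'name="([^"]*)" cpu time="([0-9.]+)" gpu time="([0-9.]+)"')


def gpu_to_cpu_ratios(text):
    """Return (name, gpu_time / cpu_time) for every timing line found in text."""
    return [(name, float(gpu) / float(cpu)) for name, cpu, gpu in PATTERN.findall(text)]


if __name__ == "__main__":
    for name, ratio in gpu_to_cpu_ratios(LOG):
        print("%s: GPU took %.1fx the CPU time" % (name, ratio))
```

On these two rows the ratio falls from roughly 35x to roughly 1.1x. In the full output the GPU time stays close to 80 microseconds on most rows while the CPU time grows, which suggests (though I have not verified it) that a fixed per-call overhead dominates the GPU numbers at these sizes.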

Have fun and you can get the code from my repository. 

Saturday, June 3, 2017

First learning into pycuda

In my last post I failed to get a sample working, so today I thought of giving pycuda a try. So what is pycuda?

PyCUDA gives you easy, Pythonic access to Nvidia's CUDA parallel computation API. 

With that said, I'm gonna give the sample code a try. Let's install the python3 pycuda module.

 user@localhost:~/Downloads$ sudo apt-get install python3-pycuda   
 Reading package lists... Done  
 Building dependency tree      
 Reading state information... Done  
 The following packages were automatically installed and are no longer required:  
  libgl1-nvidia-glx:i386 libgl1-nvidia-glx-i386:i386 libllvm3.5v5 libnvidia-glcore:i386 linux-image-4.1.0-2-amd64 linux-image-4.2.0-1-amd64 linux-source-4.3 python3-ecdsa syslinux  
  unetbootin-translations  
 Use 'sudo apt autoremove' to remove them.  
 The following additional packages will be installed:  
  fonts-mathjax libboost-python1.61.0 libboost-system1.61.0 libboost-thread1.61.0 libjs-mathjax python-pycuda-doc python3-appdirs python3-decorator python3-pytools  
 Suggested packages:  
  fonts-mathjax-extras fonts-stix libjs-mathjax-doc python-pycuda python3-pytest python3-opengl python3-pycuda-dbg  
 The following NEW packages will be installed:  
  fonts-mathjax libboost-python1.61.0 libboost-system1.61.0 libboost-thread1.61.0 libjs-mathjax python-pycuda-doc python3-appdirs python3-decorator python3-pycuda python3-pytools  
 0 upgraded, 10 newly installed, 0 to remove and 508 not upgraded.  
 Need to get 7,150 kB of archives.  
 After this operation, 47.8 MB of additional disk space will be used.  
 Do you want to continue? [Y/n] Y  
 Get:1 http://ftp.us.debian.org/debian testing/main amd64 fonts-mathjax all 2.6.1-1 [959 kB]  
 Get:2 http://ftp.us.debian.org/debian testing/main amd64 libboost-python1.61.0 amd64 1.61.0+dfsg-2.1 [137 kB]                                          
 Get:3 http://ftp.us.debian.org/debian testing/main amd64 libboost-system1.61.0 amd64 1.61.0+dfsg-2.1 [32.1 kB]                                          
 Get:4 http://ftp.us.debian.org/debian testing/main amd64 libboost-thread1.61.0 amd64 1.61.0+dfsg-2.1 [71.2 kB]                                          
 Get:5 http://ftp.us.debian.org/debian testing/main amd64 libjs-mathjax all 2.6.1-1 [5,473 kB]                                                  
 Get:6 http://ftp.us.debian.org/debian testing/contrib amd64 python-pycuda-doc all 2016.1-1 [122 kB]                                               
 Get:7 http://ftp.us.debian.org/debian testing/main amd64 python3-appdirs all 1.4.0-2 [11.1 kB]                                                  
 Get:8 http://ftp.us.debian.org/debian testing/main amd64 python3-decorator all 4.0.6-1 [12.8 kB]                                                 
 Get:9 http://ftp.us.debian.org/debian testing/main amd64 python3-pytools all 2016.2.1-1 [33.9 kB]                                                
 Get:10 http://ftp.us.debian.org/debian testing/contrib amd64 python3-pycuda amd64 2016.1-1+b2 [298 kB]                                              
 Fetched 7,150 kB in 1min 23s (85.4 kB/s)                                                                             
 Selecting previously unselected package fonts-mathjax.  
 (Reading database ... 281119 files and directories currently installed.)  
 Preparing to unpack .../fonts-mathjax_2.6.1-1_all.deb ...  
 Unpacking fonts-mathjax (2.6.1-1) ...  
 Selecting previously unselected package libboost-python1.61.0.  
 Preparing to unpack .../libboost-python1.61.0_1.61.0+dfsg-2.1_amd64.deb ...  
 Unpacking libboost-python1.61.0 (1.61.0+dfsg-2.1) ...  
 Selecting previously unselected package libboost-system1.61.0:amd64.  
 Preparing to unpack .../libboost-system1.61.0_1.61.0+dfsg-2.1_amd64.deb ...  
 Unpacking libboost-system1.61.0:amd64 (1.61.0+dfsg-2.1) ...  
 Selecting previously unselected package libboost-thread1.61.0:amd64.  
 Preparing to unpack .../libboost-thread1.61.0_1.61.0+dfsg-2.1_amd64.deb ...  
 Unpacking libboost-thread1.61.0:amd64 (1.61.0+dfsg-2.1) ...  
 Selecting previously unselected package libjs-mathjax.  
 Preparing to unpack .../libjs-mathjax_2.6.1-1_all.deb ...  
 Unpacking libjs-mathjax (2.6.1-1) ...  
 Selecting previously unselected package python-pycuda-doc.  
 Preparing to unpack .../python-pycuda-doc_2016.1-1_all.deb ...  
 Unpacking python-pycuda-doc (2016.1-1) ...  
 Selecting previously unselected package python3-appdirs.  
 Preparing to unpack .../python3-appdirs_1.4.0-2_all.deb ...  
 Unpacking python3-appdirs (1.4.0-2) ...  
 Selecting previously unselected package python3-decorator.  
 Preparing to unpack .../python3-decorator_4.0.6-1_all.deb ...  
 Unpacking python3-decorator (4.0.6-1) ...  
 Selecting previously unselected package python3-pytools.  
 Preparing to unpack .../python3-pytools_2016.2.1-1_all.deb ...  
 Unpacking python3-pytools (2016.2.1-1) ...  
 Selecting previously unselected package python3-pycuda.  
 Preparing to unpack .../python3-pycuda_2016.1-1+b2_amd64.deb ...  
 Unpacking python3-pycuda (2016.1-1+b2) ...  
 Processing triggers for fontconfig (2.11.0-6.5) ...  
 Processing triggers for libc-bin (2.19-22) ...  
 Setting up fonts-mathjax (2.6.1-1) ...  
 Setting up libboost-python1.61.0 (1.61.0+dfsg-2.1) ...  
 Setting up libboost-system1.61.0:amd64 (1.61.0+dfsg-2.1) ...  
 Setting up libboost-thread1.61.0:amd64 (1.61.0+dfsg-2.1) ...  
 Setting up libjs-mathjax (2.6.1-1) ...  
 Setting up python-pycuda-doc (2016.1-1) ...  
 Setting up python3-appdirs (1.4.0-2) ...  
 Setting up python3-decorator (4.0.6-1) ...  
 Setting up python3-pytools (2016.2.1-1) ...  
 Setting up python3-pycuda (2016.1-1+b2) ...  
 Processing triggers for libc-bin (2.19-22) ...  

Okay, we are all good. Let's start the python3 interpreter. By the way, I'm using python3.5.

 user@localhost:~$ python3  
 Python 3.5.2+ (default, Aug 5 2016, 08:07:14)   
 [GCC 6.1.1 20160724] on linux  
 Type "help", "copyright", "credits" or "license" for more information.  
 >>> import pycuda.autoinit  
 Traceback (most recent call last):  
  File "<stdin>", line 1, in <module>  
  File "/usr/lib/python3/dist-packages/pycuda/autoinit.py", line 5, in <module>  
   cuda.init()  
 pycuda._driver.RuntimeError: cuInit failed: no CUDA-capable device is detected  
   

Ah, crap. You would think something is wrong with the library, but it simply did not detect a CUDA-capable GPU. For your information, my workstation has two GPUs, an Intel one and an NVIDIA one. It normally runs on the Intel GPU, which draws less power, and I have to enable the NVIDIA GPU explicitly when I need it. With that said, let's try it again.
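As an aside, the same condition can be checked without a traceback. Below is a minimal sketch, assuming PyCUDA's `pycuda.driver.init()`, `pycuda.driver.Device.count()` and `pycuda.driver.Error`; the wrapper `cuda_device_count` is my own hypothetical helper, not part of PyCUDA.

```python
def cuda_device_count():
    """Return the number of CUDA devices PyCUDA can see, or 0 if none are usable."""
    try:
        import pycuda.driver as drv
    except ImportError:
        # PyCUDA, or the CUDA driver library it links against, is not available.
        return 0
    try:
        drv.init()  # the same call that failed inside pycuda.autoinit above
        return drv.Device.count()
    except drv.Error:
        # Raised, for example, when cuInit finds no CUDA-capable device.
        return 0


if __name__ == "__main__":
    print("CUDA devices visible:", cuda_device_count())
```

On a dual-GPU machine like mine I would expect this to report 0 under plain python3 and at least 1 when the interpreter is started through optirun, as in the session below.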

 user@localhost:~$ optirun python3  
 Python 3.5.2+ (default, Aug 5 2016, 08:07:14)   
 [GCC 6.1.1 20160724] on linux  
 Type "help", "copyright", "credits" or "license" for more information.  
 >>> import pycuda.autoinit  
 >>> import pycuda.driver as drv  
 >>> import numpy  
 >>>   
 >>> from pycuda.compiler import SourceModule  
 >>> mod = SourceModule("""  
 ... __global__ void multiply_them(float *dest, float *a, float *b)  
 ... {  
 ...   const int i = threadIdx.x;  
 ...   dest[i] = a[i] * b[i];  
 ... }  
 ... """)  
 >>> multiple_them = mod.get_function("multiply_them")  
 >>> a = numpy.random.randn(400).astype(numpy.float32)  
 >>> b = numpy.random.randn(400).astype(numpy.float32)  
 >>>   
 >>> dest = numpy.zeros_like(a)  
 >>> multiple_them(drv.Out(dest), drv.In(a), drv.In(b), block=(400,1,1), grid=(1,1))  
 >>> print(dest-a*b)  
 [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.  
  0. 0. 0. 0.]  
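To make it clearer what the kernel does, here is a CPU-only emulation in plain Python. It is only a sketch of the indexing: each loop iteration stands in for one GPU thread, and `multiply_them_cpu` is a name I invented for illustration.

```python
import random


def multiply_them_cpu(a, b, block_size):
    """Emulate the kernel on the CPU: one loop iteration per GPU thread."""
    dest = [0.0] * len(a)
    for thread_idx_x in range(block_size):  # plays the role of threadIdx.x
        i = thread_idx_x
        dest[i] = a[i] * b[i]
    return dest


if __name__ == "__main__":
    a = [random.gauss(0.0, 1.0) for _ in range(400)]
    b = [random.gauss(0.0, 1.0) for _ in range(400)]
    dest = multiply_them_cpu(a, b, block_size=400)
    # Same check as print(dest-a*b) in the session, collapsed to one boolean.
    print(all(d == x * y for d, x, y in zip(dest, a, b)))
```

Because the kernel indexes with threadIdx.x only, a single block of 400 threads covers exactly the 400 elements. With a smaller block, the elements past the block size would be left at zero, which the emulation also shows if you pass a smaller block_size.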

optirun is the Bumblebee command that runs a program on the discrete NVIDIA GPU; that is how I enable it on Debian. With it, the import works and so does the sample. Brilliant. By the way, I'm using an NVIDIA 960M GPU. That's it for today.
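If a script has to work on machines with and without optirun, the launcher can be added only when it is found on PATH. This is a small sketch under that assumption; `gpu_command` is a hypothetical helper, and it only builds the command without running it.

```python
import shutil
import sys


def gpu_command(args, launcher="optirun"):
    """Prefix args with the launcher if it is on PATH, else return args unchanged."""
    if shutil.which(launcher) is None:
        return list(args)
    return [launcher] + list(args)


if __name__ == "__main__":
    # Build (but do not run) a command like the ones used in these posts.
    print(gpu_command([sys.executable, "-m", "pytest", "test_cumath.py"]))
```

The resulting list can be passed to subprocess.run() if you want to launch it from Python.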