## Installing ArrayFire

ArrayFire is now open source!!

So I wanted to install and test it on my computer. Since they don’t yet have an installer for the open-source version, I had to build it from source. Normally, this isn’t an issue for me.

I went through the installation instructions on github/arrayfire for Linux. Since I’ve mostly been using CUDA, I decided to build it with CUDA only and not OpenCL. Everything went all right until about the 73% mark. It turned out that on the particular machine I was on, apparently “ipv6 broken at googlecode.com”, so I temporarily disabled IPv6:

    sysctl -w net.ipv6.conf.all.disable_ipv6=1

After this, I also faced the following error:

    svn: Working copy '/myrepo/repodirectory' locked
    svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for details)

I had to run “svn cleanup” many times before everything finally decided to work!!

The tests all passed. Now I can mess around with ArrayFire on my machine. Time to learn and actually get some work done!!!

## Mathematica on Raspberry Pi

Wolfram has graciously offered Mathematica for free on the Raspberry Pi. For everyone who loves free stuff, this is one of the best gifts to the open-source community. I was elated to hear about it and set out to install and test this gift.

With the appropriate command line (found on the page above), I was able to acquire Mathematica Version 10:

    ~ $ /opt/Wolfram/WolframEngine/10.0/

After installation, Wolfram should be located under the ‘Education’ menu in the app launcher. This brings up the engine in a terminal. According to multiple forums, I should be able to get Mathematica in GUI form, but the GUI wasn’t located where it should be. This is confirmed from within Wolfram:

    ~ $ wolfram
    Wolfram Language (Raspberry Pi Pilot Release)
    Information & help: wolfram.com/raspi

    In[1]:= $Version
    Out[1]= 10.0 for Linux ARM (32-bit) (November 19, 2013)
    In[2]:=

For more information, see Mike Croucher’s post. Of course, I tried to see whether this engine was available on UDOO, but it wasn’t. Oops, wishful thinking:

    ubuntu@imx6-qsdl:~$ sudo apt-get install wolfram-engine
    Building dependency tree
    E: Unable to locate package wolfram-engine


## Coursera is offering Intro to Functional Analysis

I am very excited that Coursera is offering a Functional Analysis course. A few of my classmates and I have been trying to get this analysis course offered at my institution for quite some time now.

## I’m Back! (CUDA Shared Matrix Multiplication)

I have been away from blogging for way too long. I miss it.

I need to make a comeback. For starters, here is a Matrix Multiplication code (Makefile) I wrote when I took the Heterogeneous Parallel Programming course on Coursera.

The code attempts to calculate the performance of my shared matrix multiplication kernel (there might be a bug somewhere 🙂).

## Manabus Report: Trapezoid Rule

(I forgot to publish this draft from a while ago)

Quadrature is the numerical evaluation of definite integrals in the following form

$F(x) = \int_a^b f(x) dx$

The quadrature approximation of the above equation takes the form $F(x) \approx \sum_k w_k f(x_k)$.
There are many forms of quadrature available. All of these methods have the same form and differ only in the choice of the quadrature nodes $x_k$ and the quadrature weights $w_k$. We first approximate $f(x)$ by a polynomial $p_n(x)$ and then integrate this. The integral of a first-degree Lagrange polynomial is the trapezoid formula [reference my previous blog and dafeda.wordpress.com].
We use the trapezoidal rule in this example.
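
To make that last claim concrete, here is a short worked derivation on a single interval $[a,b]$ (my own sketch, using only the two nodes $x_0 = a$ and $x_1 = b$). The first-degree Lagrange polynomial through the end points is

$p_1(x) = f(a)\frac{x-b}{a-b} + f(b)\frac{x-a}{b-a}$

Each basis function integrates to half the interval length,

$\int_a^b \frac{x-b}{a-b} dx = \int_a^b \frac{x-a}{b-a} dx = \frac{b-a}{2}$

so

$\int_a^b p_1(x) dx = \frac{b-a}{2} \left[ f(a) + f(b) \right]$

which is the quadrature form above with weights $w_0 = w_1 = \frac{b-a}{2}$.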

This method splits the interval of integration $[a,b]$ into $N$ small uniform subintervals. In calculus, we use

$\int_a^b f(x) dx \approx \frac{b-a}{2N} \left[ f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{N-1}) + f(x_N) \right]$

$x_k = a + k\frac{b-a}{N}, \quad k = 0, 1, 2, \ldots, N$
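
The composite formula can be sketched in a few lines of Python. This is only a minimal serial illustration, not the code used for the report; the function name `trapezoid` and the test integrand are my own choices.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n uniform subintervals."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n                  # width of one subinterval
    total = f(a) + f(b)              # the two end points carry weight 1
    for k in range(1, n):
        total += 2.0 * f(a + k * h)  # interior nodes carry weight 2
    return 0.5 * h * total           # 0.5 * h equals (b - a) / (2 n)


# Example: the integral of sin(x) over [0, pi] is exactly 2.
approx = trapezoid(math.sin, 0.0, math.pi, 100)
```

Doubling `n` should cut the error by roughly a factor of four for a smooth integrand, at the cost of twice as many function evaluations.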

As shown in the figures, the trapezoidal rule approximates the integral very well as the number of trapezoids increases. To achieve a desired accuracy, the user must use more trapezoids, and unfortunately the computational time increases with the number of trapezoids. In order to be happy, we can parallelize the scheme and use as many trapezoids as needed without having to wait forever.

Since the interval of integration can be divided into small subintervals, the same idea is exploited to parallelize the trapezoid rule. Each processor is simply assigned $lN = \frac{N}{np}$ subintervals, where $np$ is the number of processors available.
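
Here is a rough Python sketch of that partitioning. It is not the MPI code from the report: a plain loop stands in for the processors, the names `local_trapezoid` and `simulated_parallel_trapezoid` are hypothetical, and I assume $np$ divides $N$ evenly.

```python
def local_trapezoid(f, a, b, n, rank, nproc):
    """Trapezoid sum over the block of n // nproc subintervals owned by `rank`."""
    if n % nproc != 0:
        raise ValueError("this sketch assumes nproc divides n evenly")
    h = (b - a) / n                    # global subinterval width
    local_n = n // nproc               # lN = N / np subintervals per processor
    local_a = a + rank * local_n * h   # left end of this processor's block
    local_b = local_a + local_n * h    # right end of this processor's block
    total = 0.5 * (f(local_a) + f(local_b))
    for k in range(1, local_n):
        total += f(local_a + k * h)
    return h * total


def simulated_parallel_trapezoid(f, a, b, n, nproc):
    # A loop stands in for the ranks. With real message passing, each rank
    # would compute only its own partial sum and the partial sums would be
    # combined in one collective reduction.
    return sum(local_trapezoid(f, a, b, n, r, nproc) for r in range(nproc))


# Example: x^2 on [0, 3] has exact integral 9.
result = simulated_parallel_trapezoid(lambda x: x * x, 0.0, 3.0, 120, 4)
```

The point of the layout is that each processor needs only `a`, `b`, `n` and its own rank to find its block, so the only communication required is the final sum.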

This method is rather simple. However, the efficiency of a parallel implementation of this simple scheme depends on how the user implements it. Message passing is very expensive, so it is up to the user to think ahead and consider how the way they execute MPI commands influences the scheme. To show this, we write two codes that execute the trapezoidal rule in parallel and compare their efficiency. Furthermore, the following graphs depict how well the cluster performs the trapezoidal rule.

## RBF Residual subsampling method

In Adaptive residual sub-sampling methods for radial basis function interpolation and collocation method, Dr. Heryudono explains his new method for adapting the RBF centers and parameters based on the interpolation process. This is a recursive method that repeats a “solve-estimate-refine/coarsen” cycle until a certain threshold is achieved.

The algorithm for the method can be summarized as follows:

1. Generate an initial discretization of $N$ equally spaced points on the domain $\Omega$
2. Find an RBF approximation $F$ to $f$ on $\Omega$
3. Compute $error = |f - F|$ at the midpoints between the nodes
4. Midpoints where the error exceeds the threshold $\theta_r$ are made new centers
5. Points where the error is lower than the threshold $\theta_c$ are removed
6. Do nothing to the two end points
7. The shape parameter $\epsilon$ is chosen based on the spacing between two node points
8. If the stopping criterion is not met, repeat from step 2
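
To get a feel for the loop, here is a rough pure-Python sketch of a single solve-estimate-refine/coarsen pass in 1D. It is not Dr. Heryudono's MATLAB script. The multiquadric basis, the rule $\epsilon_j = 1/(\text{nearest-neighbor spacing})$, and the coarsening rule (drop an interior center only when the errors at both neighboring midpoints are below $\theta_c$) are my own assumptions, and all function names are hypothetical.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def shape_parameters(centers):
    """Assumed rule: eps_j = 1 / (distance from center j to its nearest neighbor)."""
    eps = []
    for j in range(len(centers)):
        gaps = []
        if j > 0:
            gaps.append(centers[j] - centers[j - 1])
        if j < len(centers) - 1:
            gaps.append(centers[j + 1] - centers[j])
        eps.append(1.0 / min(gaps))
    return eps


def fit(f, centers):
    """Return a multiquadric interpolant F with F(c) = f(c) at every center c."""
    eps = shape_parameters(centers)
    A = [[math.sqrt(1.0 + (e * (x - c)) ** 2) for c, e in zip(centers, eps)]
         for x in centers]
    coef = solve(A, [f(x) for x in centers])

    def F(x):
        return sum(w * math.sqrt(1.0 + (e * (x - c)) ** 2)
                   for w, c, e in zip(coef, centers, eps))
    return F


def refine_once(f, centers, theta_r, theta_c):
    """One pass: fit, measure midpoint errors, then add and remove centers."""
    F = fit(f, centers)
    mids = [(p + q) / 2.0 for p, q in zip(centers, centers[1:])]
    errs = [abs(f(m) - F(m)) for m in mids]
    new = [centers[0]]                    # the left end point always stays
    for i, (m, e) in enumerate(zip(mids, errs)):
        if e > theta_r:
            new.append(m)                 # refine: the midpoint becomes a center
        is_end = (i + 1 == len(centers) - 1)
        drop = (not is_end) and e < theta_c and errs[i + 1] < theta_c
        if not drop:
            new.append(centers[i + 1])    # keep this center (always the right end)
    return new, max(errs)


def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)


# 13 equally spaced centers on [-1, 1]; thresholds are illustrative values.
start = [-1.0 + k * 2.0 / 12 for k in range(13)]
refined, max_err = refine_once(runge, start, theta_r=10e-5, theta_c=10e-8)
```

Calling `refine_once` repeatedly on its own output until `max_err` drops below `theta_r` gives the full adaptive loop. I have not checked how this toy dense solver behaves once the centers cluster tightly, so treat it as an illustration only.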

Along with the paper, he provides a MATLAB script of his method. It is a very modest code. Had I coded this method, I would have used many more lines 🙂

Using the code, I performed some tests with different functions $f$.

1. Runge function $f(x) = \frac{1}{1+25x^2}$, $\theta_r = 10e-5, \theta_c = 10e-8, N = 13$

## Xgrid

Xgrid is set up so that tasks run as “nobody”: any task whose submitting or receiving agent is NOT using Kerberos authentication has restricted access to the filesystem. Leopard/Snow Leopard run tasks using the new “sandbox” facility in Mac OS X. The solution is to set up Kerberos authentication for everything. However, this is a pain.

A not-so-clever option is to have a directory that Xgrid has read/write permission to. To do this, edit the last line of nobody.sb to the following (the full path is: /usr/share/sandbox/xgridagentd_task_nobody.sb)