Speed of cellfun in MATLAB

When I started programming in MATLAB, I quickly realized that to reduce the running time of my code I had to use MATLAB's handy vectorization features whenever possible. I became highly suspicious of loops, which I had airily been using in VBA and other languages, and tried to avoid them as much as I could. However, as the data sets I analysed grew in size and I had to use cell arrays, for example to display date strings alongside numerical values in a GUI table, it dawned on me that using built-in functions instead of loops does not always result in faster code.

MATLAB comes with the built-in cellfun function, which applies the same operation to each cell of a cell array. To check whether the individual cells are empty, for instance, you can have cellfun run isempty on each cell instead of writing a loop. However, there are two ways of calling cellfun, and they differ greatly in speed. You can pass an anonymous function with the @(x) syntax, which turns out to be slow, or you can pass the function name as a string, which is faster than a loop but only works with a few simple functions. This fact is not adequately pointed out in MATLAB's documentation, in my opinion.

To give you an example, let's stick to isempty and see how cellfun performs on a 1,000,000 x 1 cell array:
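A minimal sketch of such a test array. The variable names n and C and the contents (every other cell filled with a scalar, the rest left empty) are assumptions chosen for illustration, not the data from the original measurement:

```matlab
% Build a 1,000,000 x 1 cell array (illustrative contents).
n = 1e6;
C = cell(n, 1);        % cell() creates cells that are all empty ([])
C(1:2:end) = {1};      % fill the odd-numbered cells with the scalar 1
```

With this layout exactly half of the cells are empty, which makes the result of isempty easy to check.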

Now apply isempty using cellfun with both the @(x) syntax and the string syntax:
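One way to time the two variants is with tic and toc. This is only a sketch: the test array is the same illustrative one assumed above (rebuilt here so the snippet runs on its own), the variable names are made up, and the timings you see will depend on your MATLAB version and machine.

```matlab
% Illustrative test data: 1,000,000 x 1 cell array, every other cell empty
n = 1e6;
C = cell(n, 1);
C(1:2:end) = {1};

% Variant 1: anonymous function, evaluated once per cell
tic;
emptyAnon = cellfun(@(x) isempty(x), C);
tAnon = toc;

% Variant 2: function name passed as a string (legacy syntax)
tic;
emptyStr = cellfun('isempty', C);
tStr = toc;

fprintf('@(x) syntax:   %.3f s\n', tAnon);
fprintf('string syntax: %.3f s\n', tStr);

% Both variants must return the same logical vector
assert(isequal(emptyAnon, emptyStr));
```

For more stable numbers, timeit can be used in place of a single tic/toc pair.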

As you can see, using the string syntax results in a significant reduction in run time. But as already stated above, cellfun only supports this syntax for a handful of simple functions such as isreal, isempty, islogical, length, size and ndims. So what can you do when you have to apply more complex functions? The answer really depends on what you want to achieve. When your cell arrays are not too big, speed might not be much of a problem, and cellfun can be a valid choice because it makes your code shorter. However, when you need to transform larger data sets it is usually better to use a loop instead:
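A sketch of what such a loop might look like. The operation (the range, max minus min, of the vector in each cell) and the data are hypothetical stand-ins for a function the string syntax does not cover. The cellfun call at the end is included only to confirm that both approaches give the same result:

```matlab
% Illustrative data: 100,000 cells, each holding a short numeric vector
n = 1e5;
C = cell(n, 1);
for k = 1:n
    C{k} = [k, 2*k, 3*k];
end

% Explicit loop with a preallocated output vector
result = zeros(n, 1);
tic;
for k = 1:n
    v = C{k};
    result(k) = max(v) - min(v);   % hypothetical "more complex" operation
end
tLoop = toc;

% Same operation via cellfun with an anonymous function, for comparison
tic;
resultCellfun = cellfun(@(v) max(v) - min(v), C);
tCellfun = toc;

fprintf('loop:    %.3f s\n', tLoop);
fprintf('cellfun: %.3f s\n', tCellfun);
```

Preallocating result matters here: growing the output inside the loop would add reallocation cost and blur the comparison.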