#vectorized operations vs loops


true pebble
#

Coming from NumPy, I thought that something like A .+ B would be faster than A + B because it is vectorized. But I just read that in Julia loops are faster. Then what is the point of A .+ B?

true pebble
#

then

uneven maple
#

in your case nothing

#
julia> x = rand(10);

julia> x .= sin.(x)
10-element Vector{Float64}:
 0.14254193236205576
 0.3701762327696455
 0.3623631091392756
 0.8036946167310883
 0.2740764904500096
 0.5152639962240843
 0.5124770784935754
 0.2804176660697805
 0.5793032650627981
 0.6758465850947173
pliant eagle
#

Unlike Python loops, Julia loops are fast, so writing a loop isn't something that you must avoid for performance.
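A minimal sketch of that point, assuming two arrays of the same size: a hand-written loop and the dotted form compute the same thing, so you can pick whichever reads better. The name `add_loop` is made up for illustration.

```julia
# Hypothetical helper: elementwise sum written as an explicit loop.
function add_loop(A, B)
    size(A) == size(B) || throw(DimensionMismatch("A and B must have the same size"))
    C = similar(A, promote_type(eltype(A), eltype(B)))
    for i in eachindex(A, B)
        C[i] = A[i] + B[i]
    end
    return C
end

A = [1.0, 2.0, 3.0]
B = [10.0, 20.0, 30.0]

add_loop(A, B) == A .+ B   # true: loop and broadcast agree
A + B == A .+ B            # true: for same-size arrays + and .+ agree too
```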

true pebble
#

wait

#

so if I define a function

#

and then can I add . in front of it and it automatically works?

pseudo egret
#

it's not explicitly forbidden/disabled for something like A .+ B, even though the + operator already has an overload for arrays of the same size

pliant eagle
#

Yes, that's beautiful, isn't it?
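A small sketch of what that looks like. One detail worth knowing: for named functions the dot goes after the name, as in `f.(x)`, while for operators it goes in front, as in `.+`. The functions `square_plus_one` and `hyp` are made-up examples.

```julia
# Hypothetical scalar function: it knows nothing about arrays.
square_plus_one(x) = x^2 + 1

xs = [1, 2, 3]
square_plus_one.(xs)         # [2, 5, 10]

# The same computation spelled with dotted operators.
xs .^ 2 .+ 1                 # [2, 5, 10]

# Multi-argument functions work too, applied pairwise.
hyp(a, b) = sqrt(a^2 + b^2)
hyp.([3, 5], [4, 12])        # [5.0, 13.0]

# A scalar argument is reused for every element.
max.(xs, 2)                  # [2, 2, 3]
```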

pseudo egret
# true pebble so it's the same as a for loop?

A bit, but it can go further, too. For multidimensional arrays, it also expands along dimensions that one argument lacks (or has with size 1), so you can do stuff like

julia> rand(5,3) .+ rand(5)
5×3 Matrix{Float64}:
 1.84255  1.12342   1.16323
 1.44004  1.49876   1.02867
 1.10553  0.667457  0.583658
 1.1654   0.90351   0.424651
 1.72491  1.13988   1.48848
#

^ adding a 5-vector to each of the columns of a 5x3 matrix

#

and since operators in julia are just functions, the . works for essentially any function

pseudo egret
#

or rand(3,5) .+ rand(5)'
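The same two cases with fixed numbers instead of `rand`, so the result can be checked by eye. This is only a sketch; the variable names `M`, `col` and `row` are made up.

```julia
M = [1 2 3;
     4 5 6]            # 2×3 matrix
col = [10, 20]         # 2-vector, behaves like a 2×1 column
row = [100 200 300]    # 1×3 row matrix

M .+ col     # adds col to every column: [11 12 13; 24 25 26]
M .+ row     # adds row to every row:    [101 202 303; 104 205 306]

# A column and a row broadcast against each other into a full 2×3 table.
col .+ row   # [110 210 310; 120 220 320]

# Only missing or size-1 dimensions are expanded; anything else is an error.
# M .+ [1, 2, 3]   # would throw DimensionMismatch (3 rows vs 2 rows)
```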

true pebble
#

interesting, that's like having the last dimensions as batch dimensions

pliant eagle
#

There is also something called "fused vectorized operations" (see this paragraph of the Julia manual).
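As a rough sketch of what fusion means: several dotted calls in one expression are combined into a single loop, without temporary arrays for the intermediate results, and `.=` writes the result into an existing array. The `@.` macro adds the dots for you. The variable names here are made up.

```julia
x = [0.0, 1.0, 2.0]
y = similar(x)

# One fused loop: no temporary array for 2 .* x or for x .^ 2.
y .= 2 .* x .+ x .^ 2

# @. puts a dot on every call and operator in the expression.
z = @. 2 * x + x^2

# Roughly what the fused version does, written by hand.
w = similar(x)
for i in eachindex(x)
    w[i] = 2 * x[i] + x[i]^2
end

y == z == w   # true, all three are [0.0, 3.0, 8.0]
```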

Sometimes you can't write code that contains loops, for example when doing array programming with CUDA.jl

You can "add BenchmarkTools", then use "@btime" to benchmark the different styles of code, see the performance difference, and compare it to standard Python or NumPy.

Also, people sometimes recommend writing the function so that it works on scalars first, and then leveraging the dot and multiple dispatch to "vectorize" it. I think that in the end the dot operation is lowered to a loop. It's the same for NumPy functions: if I'm not mistaken, NumPy calls are C calls, and loops in C are fast.

true pebble
#

Interesting. After googling, it seems linear algebra doesn't support batched matrices, but I don't yet know if that will matter or if a for loop is fine.

pseudo egret
#

e.g. multiplying big matrices can be made faster by going through the matrices in chunks, rather than just applying the definition of the matrix product directly
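To make "in chunks" concrete, here is a toy sketch, not a tuned implementation. Both functions do the same arithmetic; the blocked one just visits the entries tile by tile. The names `matmul_naive` and `matmul_blocked` and the tile size `bs` are made up, and whether the blocked version is actually faster depends on matrix size and hardware. In practice you would just write `A * B`, which for floating-point matrices calls an optimized BLAS routine.

```julia
# Straight from the definition: C[i, j] = sum over p of A[i, p] * B[p, j].
function matmul_naive(A, B)
    m, k = size(A)
    k2, n = size(B)
    k == k2 || throw(DimensionMismatch("inner dimensions must match"))
    C = zeros(promote_type(eltype(A), eltype(B)), m, n)
    for j in 1:n, p in 1:k, i in 1:m
        C[i, j] += A[i, p] * B[p, j]
    end
    return C
end

# Same arithmetic, done one bs×bs tile at a time.
# `bs` is a hypothetical tuning knob, not a recommended value.
function matmul_blocked(A, B; bs = 64)
    m, k = size(A)
    k2, n = size(B)
    k == k2 || throw(DimensionMismatch("inner dimensions must match"))
    C = zeros(promote_type(eltype(A), eltype(B)), m, n)
    for jj in 1:bs:n, pp in 1:bs:k, ii in 1:bs:m
        for j in jj:min(jj + bs - 1, n)
            for p in pp:min(pp + bs - 1, k)
                for i in ii:min(ii + bs - 1, m)
                    C[i, j] += A[i, p] * B[p, j]
                end
            end
        end
    end
    return C
end

A = [1 2 3; 4 5 6]     # 2×3
B = [1 0; 0 1; 2 2]    # 3×2
matmul_naive(A, B) == matmul_blocked(A, B; bs = 2)   # true
```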

pseudo egret
#

(for working with small matrices/vectors, you might also want to look at StaticArrays.jl)