# What stops this vectorizing?
I recreated a version in a unit test, both as a function pointer and a job just to start testing
```cs
[BurstCompile]
public static void Basic([NoAlias] float* remainings, [NoAlias] bool* buffer, int length, float deltaTime)
{
    for (var i = 0; i < length; i++)
    {
        remainings[i] = math.max(0, remainings[i] - deltaTime);
        buffer[i] = remainings[i] != 0;
    }
}

[BurstCompile]
public struct BasicJob : IJob
{
    public NativeArray<float> Remainings;
    public NativeArray<bool> Buffer;
    public float DeltaTime;

    public void Execute()
    {
        for (var i = 0; i < this.Remainings.Length; i++)
        {
            this.Remainings[i] = math.max(0, this.Remainings[i] - this.DeltaTime);
            this.Buffer[i] = this.Remainings[i] != 0;
        }
    }
}
```
and it seems to generate quite the large chunk of vectorized code
the thing i can't figure out is, why this is not being generated in the original job
I even simplified the original system into a copy test
```cs
var testBuffer = new NativeArray<bool>(chunk.Count, Allocator.Temp);
var remainings = chunk.GetNativeArray(ref this.RemainingHandle).Reinterpret<float>();
for (var i = 0; i < testBuffer.Length; i++)
{
    remainings[i] = math.max(0, remainings[i] - this.DeltaTime);
    testBuffer[i] = remainings[i] != 0;
}
```
it still generates the original non-vectorized code
they're pointers?
if i simply move this
```cs
// {
//     remainings[i] = math.max(0, remainings[i] - this.DeltaTime);
//     this.onBuffer[i] = remainings[i] != 0;
// }
CalculateOn(remainings, this.onBuffer.GetUnsafePtr(), this.onBuffer.Length, this.DeltaTime);
```
into a method
it now works
it's a float*
```cs
float* remainings = (float*)chunk.GetRequiredComponentDataPtrRW(ref this.RemainingHandle);
```
Oh
even copying everything local no longer works
```cs
var deltaTime = this.DeltaTime;
var length = this.onBuffer.Length;
var buffer = this.onBuffer;
for (var i = 0; i < length; i++)
{
    remainings[i] = math.max(0, remainings[i] - deltaTime);
    buffer[i] = remainings[i] != 0;
}
```
Can you then calculate final ptr only once?
but if i do this exact code in a separate method it works fine
So you avoid using [i] multiple times
the question is why it works in 1 situation
but not the other
ok figured it out
this doesn't work
```cs
public static void CalculateOn(float* remainings, NativeArray<bool> buffer, int length, float deltaTime)
{
    for (var i = 0; i < length; i++)
    {
        remainings[i] = math.max(0, remainings[i] - deltaTime);
        buffer[i] = remainings[i] != 0;
    }
}
```
this works
```cs
public static void CalculateOn([NoAlias] float* remainings, [NoAlias] NativeArray<bool> buffer, int length, float deltaTime)
{
    for (var i = 0; i < length; i++)
    {
        remainings[i] = math.max(0, remainings[i] - deltaTime);
        buffer[i] = remainings[i] != 0;
    }
}
```
it thinks the array could alias because of the NativeDisableContainerSafetyRestriction
so the question is, how do i make this pattern work then =S
```cs
[NativeDisableContainerSafetyRestriction] // Only initialized in the job
private NativeList<bool> onBuffer;
```
doesn't seem to help
(the weird thing is, the vectorized loop is like 5x more instructions, is it actually faster 🤔)
```cs
[BurstCompile]
public static void Scalar(float* remainings, bool* buffer, int length, float deltaTime)
{
    for (var i = 0; i < length; i++)
    {
        remainings[i] = math.max(0, remainings[i] - deltaTime);
        buffer[i] = remainings[i] != 0;
    }
}

[BurstCompile]
public static void Vectorized([NoAlias] float* remainings, [NoAlias] bool* buffer, int length, float deltaTime)
{
    for (var i = 0; i < length; i++)
    {
        remainings[i] = math.max(0, remainings[i] - deltaTime);
        buffer[i] = remainings[i] != 0;
    }
}
```
is the simple repro but yeah, not sure how to stop burst thinking this is aliasing in the actual code
obviously i can just have this as the method but it's more about the general principle because this is a pattern I use frequently
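A minimal sketch of the "just have this as the method" workaround from the messages above, assuming the `[NoAlias]` attribute from `Unity.Burst` and a helper class name (`TimerKernels`) that is made up for illustration. It only restates what the thread found to work: the loop lives in a static method whose pointer parameters are both marked `[NoAlias]`, and the job passes raw pointers into it.

```cs
using Unity.Burst;
using Unity.Mathematics;

// Hypothetical helper class; the name is not from the original project.
public static unsafe class TimerKernels
{
    // [NoAlias] on both parameters promises Burst that the two memory
    // ranges never overlap. Breaking that promise is undefined behaviour.
    public static void CalculateOn(
        [NoAlias] float* remainings,
        [NoAlias] bool* buffer,
        int length,
        float deltaTime)
    {
        for (var i = 0; i < length; i++)
        {
            // Count the timer down, clamped at zero.
            remainings[i] = math.max(0, remainings[i] - deltaTime);
            // The flag is true while the timer is still running.
            buffer[i] = remainings[i] != 0;
        }
    }
}
```

Inside the job, the call would look roughly like `TimerKernels.CalculateOn(remainings, (bool*)this.onBuffer.GetUnsafePtr(), this.onBuffer.Length, this.DeltaTime);`. This sidesteps the aliasing assumption on the field rather than fixing it.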
even throwing it on the struct doesn't help
```cs
[BurstCompile]
public unsafe struct UpdateTimeJob : IJobChunk
```
burst is insistent these alias
after all that
it's such a minor bump in performance due to the huge amount of extra instructions anyway =S
questions like this are basically the story of ispc https://pharr.org/matt/blog/2018/04/30/ispc-all
and also why andreas added intrinsics, because he hated when this stuff happened to him with cpp compilers
FWIW I am also pretty firmly in the camp of "if you want vectorized code, write vectorized code". It's great that Burst supports auto-vectorization, but it's very easy to fall off that path without realizing it (or to figure out why it happened even if you do realize it, as this thread solidly demonstrates)
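One way to notice "falling off the path" at compile time rather than in the inspector, sketched under assumptions: Burst ships `Aliasing.ExpectNotAliased` and `Loop.ExpectVectorized` in `Unity.Burst.CompilerServices`, and the loop intrinsics are experimental and only available when `UNITY_BURST_EXPERIMENTAL_LOOP_INTRINSICS` is set in the project's scripting define symbols. The class and method names below are hypothetical, and whether the checks fire as expected depends on the Burst version.

```cs
using Unity.Burst;
using Unity.Burst.CompilerServices;
using Unity.Mathematics;

// Hypothetical helper class; the name is not from the original project.
public static unsafe class CheckedTimerKernels
{
    public static void CalculateOnChecked(
        [NoAlias] float* remainings,
        [NoAlias] bool* buffer,
        int length,
        float deltaTime)
    {
        // Compile-time check: Burst should report an error if it cannot
        // prove the two pointers are distinct. No effect outside Burst.
        Aliasing.ExpectNotAliased(remainings, buffer);

        for (var i = 0; i < length; i++)
        {
#if UNITY_BURST_EXPERIMENTAL_LOOP_INTRINSICS
            // Experimental: Burst should report an error if this loop
            // ends up scalar.
            Loop.ExpectVectorized();
#endif
            remainings[i] = math.max(0, remainings[i] - deltaTime);
            buffer[i] = remainings[i] != 0;
        }
    }
}
```

Neither call changes the generated code; they only turn a silent regression into a build error when the method is compiled by Burst.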
Yep that’s explicitly stated in aliasing section of burst docs
Yeah I know but my issue was I can't tell burst otherwise
Putting [NoAlias] on the field or job didn't do anything
On the field in particular I feel should solve this
I feel like NativeDisableContainerSafetyRestriction has max priority here, don't know if the burst team would change that behaviour unfortunately
personally, if i had put this amount of investigation into a non-vectorized situation i would 100% be using intrinsics for said situation
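For reference, a sketch of what "write vectorized code" could look like for this loop without dropping all the way to platform intrinsics, assuming `Unity.Mathematics` `float4`/`bool4` (which Burst maps to SIMD registers). This is not the hardware-intrinsics route mentioned above, the names are hypothetical, and nothing here has been measured.

```cs
using Unity.Burst;
using Unity.Mathematics;

// Hypothetical helper class; the name is not from the original project.
public static unsafe class ExplicitTimerKernels
{
    public static void CalculateOn4(
        [NoAlias] float* remainings,
        [NoAlias] bool* buffer,
        int length,
        float deltaTime)
    {
        var i = 0;

        // Main loop: four timers per iteration.
        for (; i + 4 <= length; i += 4)
        {
            var r = math.max(float4.zero, *(float4*)(remainings + i) - deltaTime);
            *(float4*)(remainings + i) = r;
            // float4 != float yields a bool4, one byte per lane.
            *(bool4*)(buffer + i) = r != 0f;
        }

        // Scalar tail for the remaining 0..3 elements.
        for (; i < length; i++)
        {
            remainings[i] = math.max(0f, remainings[i] - deltaTime);
            buffer[i] = remainings[i] != 0f;
        }
    }
}
```

The width is fixed at four lanes here, so on wider hardware the auto-vectorized version may still come out ahead; treat this as a starting point for profiling.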
To me this is less about getting this vectorisation to work, more about understanding what's going on and how I can code efficiently by default.
This pattern of allocating per thread containers but having to use disable safety to make it work is something I often do to minimise allocations.
And it's the first time I've stumbled upon a problem with it
just in case it wasn't clear because I've realized I never mentioned this in the post, I'm only using NativeDisableContainerSafetyRestriction because the native container isn't passed in, it's only initialized in the job but safety system doesn't like this.
```cs
[NativeDisableContainerSafetyRestriction] // Only initialized in the job
private NativeList<bool> onBuffer;

/// <inheritdoc/>
public void Execute(in ArchetypeChunk chunk, int unfilteredChunkIndex, bool useEnabledMask, in v128 chunkEnabledMask)
{
    if (!this.onBuffer.IsCreated)
    {
        this.onBuffer = new NativeList<bool>(chunk.Count, Allocator.Temp);
    }

    this.onBuffer.ResizeUninitialized(chunk.Count);
```
doing it with an unsafe list doesn't help either,
```cs
[NoAlias]
private UnsafeList<bool> onBuffer;
```
i mean, maybe they were both incorrect, but matt pharr concluded in that blog post series, and andreas concluded in his oft-cited intrinsics talk, that one should just never assume that you can code in a non-intrinsics, non-ispc way that will reliably be vectorized