#algos-and-data-structs

1 messages Β· Page 71 of 1

stiff quartz
#

lemme upgrade

regal spoke
#

but then no one will be able to recreate the slowdown

#

except for windows

stiff quartz
#

I upgraded and now I have a bunch of errors

#

what have I done

regal spoke
#

You can always trust windows

#

to bug out

stiff quartz
#

how do you upgrade wsl 🀑

regal spoke
#

sudo apt update && sudo apt full-upgrade

stiff quartz
#

0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

regal spoke
#

sudo do-release-upgrade

stiff quartz
#
lhott@JasonTower:~$ sudo do-release-upgrade
Checking for a new Ubuntu release
There is no development version of an LTS available.
To upgrade to the latest non-LTS development release
set Prompt=normal in /etc/update-manager/release-upgrades.
regal spoke
#

yes I had that too

stiff quartz
#

ah you're not on LTS

regal spoke
#

Just change Prompt=normal

#

Its just a text file

regal spoke
#

Btw my desktop computer is not healthy atm. A couple of days ago I had tons of bluescreens. After that chrome didn't want to start, forcing me to reinstall it.

#

I should really get it fixed

#

Seems my ram or drive is fucked

stiff quartz
#

too many benchmarks

#

computer's complaining

#

no more sparse tables please

#

is the gcc version different on your new ubuntu

#

maybe only gcc fixed it

haughty mountain
stiff quartz
#

seems hefty already

regal spoke
stiff quartz
#

how did you screenshot

#

whilst having a bluescreen

regal spoke
#

||I didn't||

stiff quartz
#

Ah yes I hadn't even seen the stop code

#

πŸ™‚

#

my wsl is taking ages to update ubuntu

#

but then I won't be able to reproduce any issues

regal spoke
#

It warns it could take hours

stiff quartz
#

yes but

ripe kayak
#

Where is the best place to practice algos and data structures?

stiff quartz
regal spoke
#

But cses.fi is more for C++ than Python

#

Because of its very tight time limits

stiff quartz
#

But then you learn about obscure xor trees

ripe kayak
regal spoke
#

No one reports a slowdown even close to what we are experiencing

#

They talk about things like 30% slower on AMD, that is what they call "significant"

#

Not 3000% slower =p

#

Could be that they either found the bug themselves, or just accidently fixed it while changing something else

stiff quartz
#
lhott@JasonTower:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.10
Release:        24.10
Codename:       oracular
lhott@JasonTower:~$
#

why are you special enough that you get 25.04

regal spoke
#

have you tried updating again

stiff quartz
#

even if that worked i would be too offended to do it

regal spoke
#

can you try slowdown bug

stiff quartz
regal spoke
#

try the C one too

#

this one

#

Slow should be around 8s

#

Fast should be like 0.3s

stiff quartz
#
lhott@JasonTower:~/tmp$ vim test.c
lhott@JasonTower:~/tmp$ gcc test.c -O3
lhott@JasonTower:~/tmp$ ./a.out
lhott@JasonTower:~/tmp$ time ./a.out

real    0m0.221s
user    0m0.212s
sys     0m0.000s
lhott@JasonTower:~/tmp$ vim test.c
lhott@JasonTower:~/tmp$ gcc test.c -O3
lhott@JasonTower:~/tmp$ time ./a.out

real    0m0.210s
user    0m0.224s
sys     0m0.000s
#

nothingness

regal spoke
#

Maybe it is already fixed

stiff quartz
#

only for gcc though

#

definitely not for MSVC

#

πŸ™‚

regal spoke
#

*glibc

#

Still one thing that doesn't mate sense

#

How did I have string slowdown but not list slowdown

#

in 3.12.3

stiff quartz
#

you cannot even investigate πŸ™‚

#

how frustrating πŸ™‚

regal spoke
#

try me

stiff quartz
#

sudo do-downgrade

#

idk linux

regal spoke
#

Just have to wait for my computer to bluescreen and press recover

#

and bang

#

free downgrade

stiff quartz
#

so

#

with godbolt

#

you can run code

#

and all GCC versions are listed

#

let's find the one

regal spoke
#

You have to understand how things work. All you see on godbolt is that memcpy is called

stiff quartz
#

you can execute the code

regal spoke
#

You don't see what is going on inside of it

stiff quartz
#

and time it

#

you just have to see when the bug starts happening

#

for which version i mean

regal spoke
#

Maybe you could use that

#

But that is not typically what godbolt is used for

stiff quartz
#

i know i just wanted somewhere where I can test all gcc versions I guess

#

although if it's glibc

#

ah yes if it's glibc I won't be able to reproduce

#

yeah even old versions of gcc are fast on godbolt

regal spoke
stiff quartz
#

Yep

#

gcc 6.3 does not have any slowdown for example

regal spoke
#

clang 14.0 seems death

stiff quartz
#

godbolt?

#

or locally?

regal spoke
#

godbolt

#

Clang >= 15.0 is fast

stiff quartz
#

but

#

so

#

GCC fixed it but not clang?.

#

I thought that the problem was "past the compilers"

#

But on windows, I've never been able to reproduce with gcc

#

can you?

regal spoke
#

nope

stiff quartz
#

so the fact that wsl now works

#

where does the fix come from

#

i'm confused

stiff quartz
regal spoke
#

Maybe it just runs things on different servers?

#

For me it seems semi consistent on godbolt

stiff quartz
#

I never got 373 ms

regal spoke
#

Quite big variations

stiff quartz
#

always > 2000 ms

#

I wouldn't trust it

regal spoke
stiff quartz
#

what was the GCC version of Ubutun 24.04.2 argh

regal spoke
#

ConfusedReptile had the slowdown on ubuntu 24.04 with gcc version 13.3.0 and clang version 18.1.8

#

I have no slowdowns with Ubuntu 25.04 with gcc version 14.2.0 and clang version 20.1.2

regal spoke
stiff quartz
#

but he had a real linux

regal spoke
#

That should not matter

stiff quartz
#

on my real linux, gcc 13.3, I have NOTHINGNESS

#

well it's not real it's a VM idk

#

I have a windows update to do

#

Should I do it

#

maybe we'll unlock new behaviours

#

πŸ™‚

regal spoke
#

Are you Romain Lhotte?

stiff quartz
#

Yeah

#

I thought it’d be relevant to add that it was fixed

#

On Ubuntu

#

Put some pressure on the windows devs

regal spoke
#

what if putting computer to sleep makes the bug appear again

stiff quartz
#

It’s not sleep

#

It’s restart

#

This time

#

But we’ve had plenty of restart tests already with your blue screens

#

hum HUM HUM

#
PS C:\Users\lhott> python -m timeit "([0] * 2**18)*14"
50 loops, best of 5: 5.4 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * (2**18+1))*14"
20 loops, best of 5: 6.8 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * (2**18+100))*14"
50 loops, best of 5: 5.46 msec per loop
#

I can't reproduce

#

It's over

#

Not even in C (nevermind I was using gcc for C)

regal spoke
stiff quartz
#

I can reproduce in C

#

but not in Python anymore

regal spoke
#

my python is still going πŸ’ͺ

stiff quartz
#

ah now I can

#
PS C:\Users\lhott> python -m timeit "([0] * (2**18+100))*14"
50 loops, best of 5: 5.39 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * (2**18+100))*14"
50 loops, best of 5: 5.47 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * (2**18+1))*14"
20 loops, best of 5: 13.1 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * (2**18+1))*14"
20 loops, best of 5: 9.36 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * (2**18+1))*14"
20 loops, best of 5: 13.1 msec per loop
PS C:\Users\lhott> python -m timeit "([0] * 2**18)*14"
50 loops, best of 5: 5.2 msec per loop
#

so upgrading windows didn't change anything

#

I honestly think the best course is to downgrade back to the ubuntu that wasn't working

#

to investigate why the memcpy we think happens doesn't happen (otherwise it'd be slow)

regal spoke
#

time to just report confusedreptile's version as not working

stiff quartz
#

we just need to add a printf statement in the C code, I'm downgrading ubuntu

regal spoke
#

what

stiff quartz
# regal spoke what

well in C:

memcpy with a distance of 2**18 + 8 and a count of 2**18 is slow

#

in Python:

not slow

#

so I'll put a printf in CPython source code just before the memcpy

regal spoke
#

You want to modify CPythons source?

stiff quartz
#

I don't think it's that complicated

#

you just have to recompile it

#

it's worth a shot

#

I did compile it "manually" to install Python 3.13.5 anyway

#

ok, downgrading is annoying

regal spoke
#

In a perfect world, microsoft will reply with more info about what the bug is and what is affected

#

But reading through other issues, I'm expecting a "Our engineers will take a look at this" kind of reply

stiff quartz
#

I'm back on Ubuntu 24.04.1 LTS

#
lho.....5$ python3 -m timeit -r 20 "('d' * ((2 ** 14) + 1000))*4"
500000 loops, best of 20: 853 nsec per loop
lho.....5$ python3 -m timeit -r 20 "('d' * ((2 ** 14) + 1))*4"
50000 loops, best of 20: 9.06 usec per loop

as expected, string is bad.

#

And lists are not bad.
Even though, in C:
memcpy(src + (1 << 18) + 8, src, 1 << 18);
is slow

#

So I'm going to go printf size_t((dest + copied) - dest) and (size_t)bytes_to_copy (basically what is called here in CPython source code):

memcpy(dest + copied, dest, (size_t)bytes_to_copy);
#

Running:

a = ([0]*((1<<18) + 1))*4
Prints this to me in my custom-compiled CPython:
Offset: 2097160, Bytes to copy: 2097160
Offset: 4194320, Bytes to copy: 4194320

#

2 097 160 = 2^18*8 + 8

#

It all makes sense, the bug cannot be repeated when the distance is 2^21

#

Fast:

lhott@JasonTower:~/tmp$ vim test.c
lhott@JasonTower:~/tmp$ gcc test.c -o test
lhott@JasonTower:~/tmp$ time ./test

real    0m0.259s
user    0m0.278s
sys     0m0.000s
lhott@JasonTower:~/tmp$ cat test.c
#include <string.h>
#include <stdio.h>

#define REPEATS 8000

char src[10000000];  // Allocate a large buffer

int main() {
    for (int i = 0; i < REPEATS; ++i) {
        // Perform memcpy with specific offset
        memcpy(src + 8*(1 << 18) + 8, src, 8*(1 << 18));
    }

    return 0;
}

Fast as well:

lhott@JasonTower:~/tmp$ vim test.c
lhott@JasonTower:~/tmp$ gcc test.c -o test
lhott@JasonTower:~/tmp$ time ./test

real    0m0.242s
user    0m0.264s
sys     0m0.000s
lhott@JasonTower:~/tmp$ cat test.c
#include <string.h>
#include <stdio.h>

#define REPEATS 8000

char src[10000000];  // Allocate a large buffer

int main() {
    for (int i = 0; i < REPEATS; ++i) {
        // Perform memcpy with specific offset
        memcpy(src + 8*(1 << 18) + 800, src, 8*(1 << 18));
    }

    return 0;
}
stiff quartz
regal spoke
#

uhm what

stiff quartz
#

tl;dr: no bug in C with a distance of 2^21 + 8 which is what is used by CPython for your list example

regal spoke
#

But you said +7 was the highest you could go I think

stiff quartz
#

2^18+8 = slow for me (on WSL)

#

2^21+8 = fast

#

And obviously both 2^18+800 and 2^21+800 are fast

It seems the something small is dependent on the power of 2

regal spoke
#

but << 19 is slow

stiff quartz
#

So the consistency is less nice as we though on the C side

#

But at least on the CPython side, everything makes sense with the C benchmarks

#

It really was frustrating me that we weren't able to link both

#

Also, it's surpisingly easy to add a printf in CPython Source code and compile it, honestly I thought it was going to be a nightmare of bugs and whatnot but it's not

regal spoke
#

This one is fast for me

#include <string.h>
#include <stdio.h>
#define REPEATS 4000000

int main() { 
    char src[10000000];
    for (int i = 0; i < REPEATS; ++i) {
        memcpy(src + (1 << 19) + 1, src, 4*4096 + 1);
    }

    return 0;
}

This one is slow in msvc

#include <string.h>
#include <stdio.h>
#define REPEATS 4000000

int main() { 
    char src[1000000];
    for (int i = 0; i < REPEATS; ++i) {
        memcpy(src + (1 << 19) + 1, src, 4*4096 + 1);
    }

    return 0;
}

Somehow the size being 1e6 vs 1e7 matters here

#

Does it just optimize away the 1e7 one?

#

Moving src to global makes both slow

stiff quartz
#

My benchmarks were with src in global

#

and no optimizations flags anyway

#

I only did things with gcc to explain why we could reproduce the evil string example but not the evil list example

#

The consistency around when memcpy is slow makes no sense for now - I was just making sure that everytime Python is slow, we can pinpoint exactly the corresponding slow memcpy and everytime Python is fast, we can pinpoint exactly the corresponding fast memcpy. That's reassuring

regal spoke
#

So what would be a better example for the list case?

#

Some smaller list?

stiff quartz
#

8 = 2^3

#

so I imagine 2^15

#

since 2^18+8 is slow

regal spoke
#

So ([0] * (2**15 + 1)) * 2?

stiff quartz
#

I think so

regal spoke
#

No wait

stiff quartz
regal spoke
stiff quartz
#

Why?

regal spoke
#

I'm wrong

#

ok thats fine

stiff quartz
#

A Python list of size 2^15+1 has an underlying 2^18+8 bytes-long array

regal spoke
#

but then why * 8?

stiff quartz
regal spoke
#

Shouldn't * 2 be the correct thing

stiff quartz
#
lhott@JasonTower:~$ python3 -m timeit "([0] * (2**15 + 1)) * 2"
5000 loops, best of 5: 89.4 usec per loop
lhott@JasonTower:~$ python3 -m timeit "([0] * (2**15 + 100)) * 2"
5000 loops, best of 5: 46.6 usec per loop
#

Yes.

#

Unless 2^18+16 is also slow but I forgot

#

And at this point everything makes sense so I'm not too worried

regal spoke
#

try **14 and * 4 instead

stiff quartz
#

Did you benchmark 2^17+16? Why those numbers?

#
lhott@JasonTower:~$ python3 -m timeit "([0] * (2**14 + 100)) * 4"
10000 loops, best of 5: 35 usec per loop
lhott@JasonTower:~$ python3 -m timeit "([0] * (2**14 + 1)) * 4"
5000 loops, best of 5: 62.1 usec per loop
regal spoke
#

To get a big slowdown, I would like the x in * x to be as big as possible

#

For list, with * 2, you get a +8

#

For list, with * 4, you get a +16

stiff quartz
#

You need to see which power of 2 the "something small" is as large as possible

#

For * 6 you actually get a +16 still

#

don't you?

#

you might be able to make it bigger a bit like this

regal spoke
#

Oh is that why things like 14 worked

#

no

stiff quartz
#

For * 7 you get a +24 I assume

regal spoke
#

No

#

wait

#

Thats not how it works

#

They double the pointer each time. But the count can be less than double in last iteration

stiff quartz
#

Ah, I think you're right

regal spoke
#

I don't know why 14 would be good

#

probably that just us no benchmarking well

stiff quartz
regal spoke
#

In theory, *16 should be the winner

#

or *8

#

It is still strange that 14 and 20-22 seemed optimal

stiff quartz
regal spoke
#

Knowing what we know now

stiff quartz
regal spoke
#

I did

stiff quartz
#

I think I used 14 to not Memory Limit Exceeded the Discord Python bot

regal spoke
#

I used 14 since my benchmark told me it was a winner

#

in slowdown

#

Oh here is a theory

#

It could be that the time it takes (the "slowdown") doesn't scale linearly in count

#

That could definitely explain why *14 looks better than *16

#

Maybe a slowdown bugged memcpy scales sublinearly

#

Ok I think this is probably it

#

Makes total sense

stiff quartz
#

Possible

#

as long as count >= 8193

#

but sublinearly gets slower

#

so you'd want to hit as many 8194 as possible

regal spoke
#

Ye probably that

stiff quartz
#

But ... do we care

regal spoke
#

well 8193

stiff quartz
#

ah yes

#

I was leaving a lil margin πŸ˜„

regal spoke
#

off by one errors is what probably caused this bug in the first place

stiff quartz
#

I think the only reason I'm now decent at off-by-one errors is doing some CP - I was terrible at debugging these things before but still decent at my job lol

regal spoke
#

Why havent we done a 8193 long string yet. That should be murder

#

Something like 8193 times 16

regal spoke
#

2^13 + 1 string

stiff quartz
#

ain't that disappointing

#

no one said the slowdown was even monotonic in count to be fair

regal spoke
#

The point of 2**13+ 1 is that you can do *16 or *32

stiff quartz
#

16 is / seems optimal

#

!e

import time

start_time = time.time()
for _ in range(50000):
  ("s" * (2**13 + 1000)) * 16
print(time.time() - start_time)

start_time = time.time()
for _ in range(50000):
  ("s" * (2**13 + 1)) * 16
print(time.time() - start_time)
halcyon plankBOT
stiff quartz
#

!e

import time

start_time = time.time()
for _ in range(50000):
  ("s" * (2**12+ 1000)) * 16
print(time.time() - start_time)

start_time = time.time()
for _ in range(50000):
  ("s" * (2**12+ 1)) * 16
print(time.time() - start_time)
halcyon plankBOT
stiff quartz
#

I guess 2**13's slowdown is bigger

#

makes sense since the first copy with 12 is not problematic

regal spoke
#

On my computer, *64 is the clear winner.

0.4272184371948242
3.963329315185547
#

no

#

Something is funky

#
import time

start_time = time.time()
for _ in range(50000):
  ("s" * (2**13 + 100)) * 32
print(time.time() - start_time)

start_time = time.time()
for _ in range(50000):
  ("s" * (2**13 + 1)) * 32
print(time.time() - start_time)

start_time = time.time()
for _ in range(50000):
  ("s" * (2**13 + 100)) * 64
print(time.time() - start_time)

start_time = time.time()
for _ in range(50000):
  ("s" * (2**13 + 1)) * 64
print(time.time() - start_time)
#
0.7776021957397461
3.9934470653533936
0.9463193416595459
4.486739873886108
#

Wtf are these times

#

double of 0.7776021957397461 is not 0.9463193416595459 last time I checked

stiff quartz
#

Wel if the last copy doesn't trigger a bug in memcpy

regal spoke
#

Could it be that +100 isnt enough?

stiff quartz
#

It might be super fast

stiff quartz
regal spoke
#

Lets use +0 instead

stiff quartz
#

Couldn't it just be that the last copy doesn't trigger a bug in memcpy

#

last copy is a copy of size 2^.. + 32

#

you said the bug was working up to 24

#

Where are you getting your 64 from

regal spoke
#

Let me make a plot

#

Its not straight

#

None of them are linear

stiff quartz
#

Why would it be straight if your x-axis is logarithmic

regal spoke
#

oh ye

#

Let me remake the plot

stiff quartz
#

This is the string example?

regal spoke
regal spoke
stiff quartz
#

it'd be linear if it wasn't for that last point lol

#

seems like you maxed out how much of a slowdown you can get

#

they're perfectly parallel

regal spoke
#

Let me do a better plot with every possible *k

stiff quartz
#

More benchmarking more benchmarking!

regal spoke
stiff quartz
#

?

#

I meant like k > 32 doesn't seem to give you additional slowdown over the negative controls

regal spoke
#

ye after 32 the slowdown dissapear

stiff quartz
#

yep

#

but it makes sense

#

you said 24 was the upper bound on your computer

#

k = 64, you do a copy with +32

#

and this isn't slowed down

#

but k = 32, the last copy is +16 which is slowed down

regal spoke
#

So if we had a 2^10 + 1 = 1025 long list

#

Then we would have effectly start with 2^13 + 8

stiff quartz
#

Yep i think so

regal spoke
#

and we would be able to do * 4

stiff quartz
#

I think so

#

Are you on a quest to find the worst possible example?

regal spoke
#

Kind of

#

I still dont understand how *22 could result in such a good example

#

that should be +88 on last copy

#

Clearly there must be other + values that work other than the small ones we've tried

#

Given that *22 was working, which effectly runs +128

#

My github example starts with:
2^21 + 8 bytes

fluid matrix
#

how can i make this code

#

is this the right channel

stiff quartz
#

Yes maybe it stops at +24 but then comes back

stiff quartz
#

With my custom python it tells you all the arguments of memcpy that are hit 😎

regal spoke
stiff quartz
#

Mmmh

#

Or 2^24 + 64

#

Who knows

#

Anyone of them could

regal spoke
#

For 1 << 24 it seems slow happens +x <= 31

#

I'm unnable to reproduce ([0] * 262145) * 14 in C calling memcpy manually

#

It should be doing these 4 calls

memcpy(2^21 + 8,  0, 2^21 + 8)
memcpy(2^22 + 16, 0, 2^22 + 16)
memcpy(2^23 + 32, 0, 2^23 + 32)
memcpy(2^24 + 64, 0, 12582960)
#

memcpy(2^24 + 64, 0, 12582960) does not result in a slowdown for me

#

neither does memcpy(2^23 + 32, 0, 2^23 + 32)

#

only memcpy(2^21 + 8, 0, 2^21 + 8) has a slowdown tourist_mad

#

Idk what is going on here

stiff quartz
stiff quartz
regal spoke
#

Maybe I'm not seeing the slowdowns because I'm not using the exact same version of MSVC as CPython uses

#

Another thing that we do differently

#

We keep calling the same memcpy over and over. It could be that this somehow makes things faster because of caching.
CPython doesn't do that.

opal oriole
#

The specific memcpy implementation would be the difference. Maybe different branch at different sizes.

stiff quartz
#

"Who" is in charge of implementing memcpy?

#

Is it the "os people"?

#

or the "cpu people"?

#

I have no idea how things work "down there"

opal oriole
#

Compiler people.

stiff quartz
#

Ah it's the compiler people directly?

opal oriole
#

Although you can use the OS's.

#

You can also use your own.

stiff quartz
#

Ok I see

opal oriole
#

There is some implementation for the standard library somewhere, and you can swap this out.

stiff quartz
#

What do you mean?

opal oriole
#

There are multiple C standard library implementations.

#

The GNU C Library, commonly known as glibc, is the GNU Project implementation of the C standard library. It provides a wrapper around the system calls of the Linux kernel and other kernels for application use. Despite its name, it now also directly supports C++ (and, indirectly, other programming languages). It was started in the 1980s by the Fr...

#

On Windows this is Microsoft's for MSVC.

stiff quartz
#

Ah yeah ok I see

opal oriole
#

Parts of the C standard library are implemented in assembly, and so it's a bit weird, some C features are tied to the library/runtime.

#

Then there is like math.h, which is technically not part of it, and so on Linux you need to separately link to it.

regal spoke
opal oriole
#

So whatever the memcpy implementation happens to be.

regal spoke
#

It really looked to me like it didn't compute anything in the 1e7 case

opal oriole
regal spoke
#

I'll check again when I'm back home

stiff quartz
opal oriole
#

My first initial guess is that past a certain size it triggers a different branch inside memcpy with a different loop. That loop may be slower for that exact size and a bit above it, but if you go even further it gets faster.

stiff quartz
opal oriole
stiff quartz
#

I was mentioning the "initial" thing we were investigating

opal oriole
stiff quartz
#

I have tried to be as exhaustive as possible in that issue

opal oriole
#

Oh, ok.

stiff quartz
#

The first few lines should sum up everything

opal oriole
#

Hmm, need to dig into how memcpy is implemented to tell what is going on.

stiff quartz
#

I find that fascinating but I'm starting to be very far from my comfort zone unfortunately

opal oriole
stiff quartz
#

Yep everything you are is consistent with the printf

#

So idk

regal spoke
opal oriole
stiff quartz
#

99% of the people getting it have AMD

#

so yeah I'm definitely thinking hardware problem

#

but upgrading our WSL Ubuntu versions to the latest preview versions fix the issues

#

πŸ’€

opal oriole
regal spoke
stiff quartz
#

and has the bug

opal oriole
#

It could be a heuristic triggered internally too, or some other threshold.

stiff quartz
#

Should we run perf

#

To have an idea

#

Like if we see a huge spike in branch misses

#

Or whatever

#

No idea how to run that on Windows though

opal oriole
#

(The specific one that ended up in the binary and the assembly it compiled to specifically)

#

(Before going further check what other compiler versions output on the same hardware to make sure it's not that)

#

(And if they are different, that gives you something to compare too though)

stiff quartz
#

On Windows, MSVC triggers the bug but not GCC

#

On WSL for a non-preview version of Ubuntu, gcc triggers the bug

#

So if we look at the gcc and MSVC binary compiled on windows

#

There should be a diff and that should be insightful I guess

opal oriole
#

Yeah, compare the final binaries, can objdump them, find the memcpy location, compare assembly. Specifically any unique instruction differences (AMD has many fancy instructions, if there is a fancy looking one that is the difference, probably it).

stiff quartz
#

I’ll be honest i have no idea how to do those things

#

Currently googling what objdump mean

opal oriole
#

You just give it the binary and it outputs all the assembly and symbols (you can think of as functions / jump locations).

#

It's just a tool to look at what is in the binary.

stiff quartz
#

Ah ok

#

But that’s a Unix thing

opal oriole
#

The binary has more than the machine code, lots of info.

#

I think it's in mingw too on Windows, not sure, but there are other tools.

#

There is ofc Ghidra.

stiff quartz
#

And how will you know which instructions correspond to the memcpy?

opal oriole
#

You can also go to where memcpy would be called (you know this because its your own code), then follow that address.

#

Try making any little program, like hello world, compile it, then objdump it.

#

(-D to see the assembly (converts machine code to assembly))

#

Ghidra has a GUI and many tutorials for it.

stiff quartz
#

\mingw\bin\objdump.exe

#

I have to log off but I’ll see if I can find the memcpy call and see what instructions (if any) differ between the + 1 (slow) and the +1000 (fast)

#

I feel like comparing the Windows buggy MSVC binary and the Windows non-buggy gcc one will be more complicated but will probably give more insight πŸ€·β€β™‚οΈ

regal spoke
#

I've been "triaged" on the developer community

#

I don't know that word

opal oriole
#

I guess in this case it probably means that you got put to the back of the line.

regal spoke
#

So triaged could mean both down-prioritzed or prioritized

opal oriole
#

(The image of this being a battlefield where there are many injured soldiers that are being sorted by needs is funny though (it's not that serious ("I'm dying! Please you gotta fix my bug!")))

tight cedar
#

shouldn't you look at libc's source instead if you want to look at memcpy

stiff quartz
stiff quartz
#

And the MSVC one is 35163 lines and ends with "..." so I think it couldn't even fit everything in objdump -D .\smallerExampleMSVC.exe > msvcDump.txt

regal spoke
tight cedar
#

-D will disassemble everything

#

if you just want to see main's code use -d

#

But you might just use godbolt for that

stiff quartz
#

But I think the point was to see the internals of memcpy

#

At least on godbolt, I just see call memcpy, so that doesn't really tell me why MSVC has the performance bug and GCC doesn't

stiff quartz
tight cedar
#

if you want to see memcpy's assembly code you would have to dump glibc

#

sth like libc.so.6 on the system

#

navigating the source code might be a better idea but I don't have much experience in that

regal spoke
#

Adding larger stack via /F67108864 fixed it

opal oriole
opal oriole
stiff quartz
#

And the initial .c file is just:

#include <string.h>
#include <stdio.h>

#define REPEATS 4000000

char src[1000000];
int main() {
    for (int i = 0; i < REPEATS; ++i) {
        // Perform memcpy with specific offset
        memcpy(src + (1 << 18) + 1, src, 8193);
    }
    return 0;
}
#

I guess including stdio is useless but it should be optimised away?

opal oriole
#

objdump can disassemble a specific symbol (-d=symbol).

#

It looks like the issue has been found though, but if you want to learn how to do this anyhow, go for it.

#

Btw, 108 KB for something that basically does nothing is kind of wild, you can fit an entire 3D FPS into about 96 KB if you try hard enough (on Windows).

opal oriole
#

It depends on the platform but some constant for each OS in the compiler could be there and adjusted as the OS does.

stiff quartz
stiff quartz
opal oriole
#

We have a lot more memory now, but still does not feel good.

stiff quartz
#

I guess since no one cares about KBs they don't care either

stiff quartz
#

I'm struggling to understand how nm would help

#

I checked in the de-assembled file, there is no memcpy

#

I stand corrected

opal oriole
stiff quartz
#

So wait, what does that mean

#

Why is it in nm's output and not in objdump's output

opal oriole
#

objdump -d=memcpy you can try.

stiff quartz
#

Does it mean that there was some inlining that made the "name"/"calling" of the function disappear from the machine code?

opal oriole
stiff quartz
#

so was my objdump output incomplete?

#

Ah no, my bad

opal oriole
stiff quartz
#

There is memcpy in the objdump of GCC and in the nm of GCC

#

But there is no memcpy in neither of the commands applied on the (huge) MSVC binary

#

When I nm the (huge) MSVC binary I get: "no symbols"

opal oriole
#

For the MSVC one you will need other tools.

stiff quartz
#

Ah.

opal oriole
#

Maybe.

stiff quartz
#

But then, I can't reproduce the (initial issue) with GCC

opal oriole
#

Try doing a debug mode build.

stiff quartz
#

So analysing the GCC binary won't help me that much

opal oriole
#

It's been a while seen a messed around with mingw on WIndows.

opal oriole
stiff quartz
opal oriole
#

Or Ghidra.

stiff quartz
#

should I just do objdump -d memcpy myBinary.exe?

opal oriole
stiff quartz
#

PS C:\Users\lhott\Desktop\temp\c\untitled> objdump -d=memcpy smallerExampleGcc.exe
gets me
objdump.exe: unknown option -- =

opal oriole
#

Try --disassemble (I usually type the whole long form).

stiff quartz
#

ah it works

#

so what I'm doing here is:
tell me how the assembly code corresponding to a memcpy call of that binary?

#

And I'll see the "choice" the compiler made as to "which memcpy implementation" is used?

opal oriole
stiff quartz
opal oriole
#

Does it show something like call memcpy? Or do you see memcpy's implementation assembly?

stiff quartz
#

The latter

opal oriole
#

Then we find that library binary file and same process (dynamic library is also an object file, only difference with an executable is that it has no main function and a flag is set (in the header) that it can't be executed).

stiff quartz
#

I probably can’t use that on Windows

#

Lemme try, yeah i gotta use an alternative

stiff quartz
opal oriole
stiff quartz
#

Ah lol

#

My bad

#

I looked at it too fast

old swift
#

I dont have permission to start voice

opal oriole
#

If you want to use those.

stiff quartz
old swift
#

Hi

opal oriole
#

The Intel 386, originally released as the 80386 and later renamed i386, is the third-generation x86 architecture microprocessor from Intel. It was the first 32-bit processor in the line, making it a significant evolution in the x86 architecture. Pre-production samples of the 386 were released to select developers in 1985, while mass production c...

#

x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new operating modes: 64-bit mode and compatibility mode, along with a new four-level paging mechanism.
In 64-bit mode, x86-64 supports signific...

#

"elf" is Executable and Linkable Format, used by Linux. https://en.wikipedia.org/wiki/Executable_and_Linkable_Format

In computing, the Executable and Linkable Format (ELF, formerly named Extensible Linking Format) is a common standard file format for executable files, object code, shared libraries, and core dumps. First published in the specification for the application binary interface (ABI) of the Unix operating system version named System V Release 4 (SVR4)...

#

The little and big refer to endianness. https://en.wikipedia.org/wiki/Endianness

In computing, endianness is the order in which bytes within a word of digital data are transmitted over a data communication medium or addressed (by rising addresses) in computer memory, counting only byte significance compared to earliness. Endianness is primarily expressed as big-endian (BE) or little-endian (LE), terms introduced by Danny Coh...

#

"pe" is Portable Executable, used by Windows. https://en.wikipedia.org/wiki/Portable_Executable

Portable Executable (PE) is a file format for executables, object code, dynamic-link-libraries (DLLs), and binary files used on 32-bit and 64-bit Windows operating systems, as well as in UEFI environments. It is the standard format for executables on Windows NT-based systems, including files such as .exe, .dll, .sys (for system drivers), and .mu...

regal spoke
#

@stiff quartz About the github issue. Is the guy that closed the issue part of DevDiv? He is using "we" in a way that confuses me

The good news is that VCRuntime is owned by DevDiv (maintained by the libraries and compiler back-end teams), so we can ship fixes much more easily than getting UCRT changes (which is owned by Windows and is much more difficult to get changes into).
stiff quartz
#

No idea but if he says we I’d guess he works for them

haughty mountain
#

Stephan works for microsoft and maintains the STL

regal spoke
#

microsoft?

stiff quartz
#

Idk, I thought he was from DevDiv

haughty mountain
#

DevDiv is a team inside of microsoft afaict

#

STL might be part of it, idk

#

(STL the person, not STL the lib)

plain portal
zealous loom
#

hi

#

can someone help me

burnt ermine
deft spruce
#

I wonder how people come up with solutions like this in 45 min interview

Given an integer array nums sorted in increasing order, return an array result of the same length, where result[i] should be the sum of the absolute differences between nums[i] and every other element in nums.

def getSumAbsoluteDifferences(self, nums):
    n = len(nums)
    result = [0] * n
    prefix_sum = [0] * (n + 1)

    # Calculate prefix sums
    for i in range(n):
        prefix_sum[i + 1] = prefix_sum[i] + nums[i]

    # Calculate result array
    for i in range(n):
        left_sum = prefix_sum[i]
        right_sum = prefix_sum[n] - prefix_sum[i + 1]
        result[i] = i * nums[i] - left_sum + right_sum - (n - i - 1) * nums[i]

    return result

I could never on my own come with formula result[i] = i * nums[i] - left_sum + right_sum - (n - i - 1) * nums[i] Because of this I lose motivation to practice

burnt ermine
#

also depending on the company you're interviewing for, doing a simple working solution first and then talking through the process of optimizing it might impress them more than just being able to go directly to the clever solution

deft spruce
deft spruce
#

maybe I could benefit from thinking more about naive approach

burnt ermine
#

yeah comparing them and gradually moving from a naive approach to a clever approach is a skill in itself

burnt ermine
agile stratus
#

Hello. I am new in the programming world. I have just started learning Python but before I have been learning MakeCode and Scratch. I have base but not so fully because of my teachers at school. Do you have some tips what to do, learn? I am having classes about Python with group. We have just finished algorithms and some things that we use in Python such as int, if else... Please someone help me I want to be persuade a career in programming.

analog hamlet
#

can anyone help get me going in the right direction on visualizing an algorithm which packs boxes of known dimension into a shipping container? it's basically the 3D bin packing problem where n of bins = 1. so far GPT has really failed to help me in any meaningful way, therefore I'd like to begin to implement known algorithms from scratch, with the end goal of visualizing the algorithm as it runs. setting aside the animation, im struggling to implement the logic and space filling code, for example be it a voxel based approach, handling rotations of boxes, et cetera

haughty mountain
#

you start with this, the most naive thing

result[i] = sum(abs(num[i] - num[j]) for j in range(n))
```you can split this in two cases, where the number is smaller and when it is bigger, to get rid of the `abs`
#

in this case we have a sorted array, so this just becomes

sum(num[i] - num[j]) for j in range(i))
+ sum(num[j] - num[i]) for j in range(i+1, n))
haughty mountain
#

this is still a slow solution, but now you can notice that the sums that remain are prefix/suffix sums

#

so we can optimize that

#

and that's the whole problem

deft spruce
#

although at least I feel that I made improvement and that I got better at solving problems

haughty mountain
#

it's a nice case of starting with a naive and correct formulation and transforming it into something that can be better attacked

analog hamlet
#

Ok so I have a working prototype now of the box placing algo, however I am failing to capture the intermediate steps, eg I want to actually show when a placement tries and fails, show rotations, et cetera

#

But depending on my stack this may now be like a react or nodejs question

crystal moat
analog hamlet
analog hamlet
#

How can you most efficiently pack a shipping container basically given a set of boxes of known dimension

#

Fitting the most boxes possible

regal spoke
#

My solution would then look like this

def getSumAbsoluteDifferences(self, nums):
    n = len(nums)
    result = [0] * n
    prefix_sum = [0] * (n + 1)

    # Calculate prefix sums
    for i in range(n):
        prefix_sum[i + 1] = prefix_sum[i] + nums[i]

    # Calculate result array
    for i in range(n):
        left_sum = prefix_sum[i + 1]
        right_sum = prefix_sum[n] - left_sum
        # The +2 is because of 0-indexing
        result[i] = nums[i] * (2 * i + 2 - n) - left_sum + right_sum

    return result
wicked obsidian
#

can someone check if 2nd option is correct or na ?

agile sundial
#

G is a directed graph

haughty mountain
#

Without giving away too much I think there should be ||2|| true statements total

vocal gorge
#

hmm, I got ||1|| true statements

haughty mountain
#

actually you are probably right

#

there could be ||a negative cycle in a separate component||

#

wait no

#

it says "in p" pithink

#

not in G

#

the other statement would be ||the subpath one||

agile sundial
#

||#3, yeah||

regal spoke
#

But I'm not 100% what they mean by p containing a cycle

#

Only case where a cycle appears in a shortest path is if its weights is 0
By that logic, there are ||3|| true statements in total

fast gazelle
#

Is anyone aware of how i can progrsm a bot to recognize wicks on a candlestick in the money market?

haughty mountain
#

talking about cycles in a shortest path is kinda sketchy in general

regal spoke
#

Talking about a shortest path as a list of nodes is sketchy too

#

I see it as a list of edges

#

It's the edges that are weighted

#

It's the subgraph containing those edges that potentially could contain a cycle

vagrant perch
#

Rate my solution to Reducing Dishes

#

Someone explain how people are doing this in 10 lines with 1 loop and 1 item

vocal gorge
#

huh, this problem really shouldn't be marked Hard

vocal gorge
#

This problem is solvable in O(n) by doing this

vagrant perch
#

does my approach logically make sense

vagrant perch
#

damnnn

#

i just shove on the largest negative numbers infront of the positive list

#

to shunt it along one at a time and check if its improved

fiery cosmos
vocal gorge
# vagrant perch i just shove on the largest negative numbers infront of the positive list

That's what I do too (my reasoning was: it's always optimal to have your chosen list in sorted order, and use the largest numbers possible, so the only question is how many of the biggest negatives you prepend to your list of positives).
I wrote mine like this (and then realised it's overcomplicated because I could handle positive and negatives together instead of having separate neg_score and cur_score variables, but was too lazy to rewrite it):

#

how tf wud anyone know that intuitivley
I would say it relies on having done a few problems solved by computing cumulative sums. When I looked at this scoring mechanism I went "oh that's a cumsum for sure, no wait it isn't, oh no wait it still is"

vagrant perch
#

The code I posted is after looking at it and thinking for
Half an hour because I don’t do leetcode so I just had to brute force it in my brain

haughty mountain
vocal gorge
#

Oh right, I forgot that sorting takes time πŸ˜›

haughty mountain
#

and you can make it a tad shorter

def solve(satisfaction):
  sum_so_far = 0
  best = 0
  score = 0
  for x in sorted(satisfaction, reverse=True):
    sum_so_far += x
    score += sum_so_far
    best = max(best, score)
  return best
deft spruce
#

For people that practiced a lot of competitive programming, in what ways your thinking improved and what kind of techniques have you learnt that are useful for getting better at problem solving? I am asking about things related to problem solving other than knowing some specific tricks like slow and fast pointer, so I am asking, what you have realized about how you can generally improve problem solving skills?

These are things that I have learnt:

  • I solved like 170 problems and I have realised that it is useful to think about different characteristics/parts of solution and try to understand how different characteristics of solution map to solution, what kind of problems different parts of solution solve, e.g. that hashing could be used for checking number of occurences of particular element. After understanding how different parts of solution solve problem, then it is easier to solve new problems, because solution for new problems sometimes consist of different parts of solution from old problems
  • I have learnt to be patient and if I don't understand something at first, to ask specific question, then try to get an answer to that question so I come up with solution (before I was too impatient).
  • I have learnt that it is important to "think in examples" and not wait for solution so that it comes on it's own, so really put effort and "think in examples"
  • Other than that, I have learnt that sometimes it is useful to think about how to solve problem without using code
stiff quartz
deft spruce
marsh timber
#

where to start data structure and algo!?

regal spoke
#

The answer is just the sum of all sum_so_far that are positive since if sum_so_far turns negative, then it will never become positive again

warm aurora
#

Im doing leetcode and even though i was learning for a while and completly finished the base learn python course on boot.dev I cant seem to solve any leetcode problems does anyone know a fix for this, do i just need to code more because i cant solve anything without AI help and then i feel like im not learning its a shitty spot lol any ideas

rancid sonnet
blissful cypress
blissful cypress
#

Probably too early to focus on Leetcode then

#

I was steered towards leetcode when I started and it wasn’t a good use of time. Even the β€œeasy” problems assume a lot of baseline knowledge

#

but that's just my two cents. i'm by no means an expert

languid remnant
#

not sure if this is the right place to ask, but, just wanted to know if anyone could suggest a good place to learn algos and data structures from.

hollow monolith
languid remnant
past flare
#

Bruh why leetcode is freaking hard. I feel very dumbπŸ™ƒπŸ™ƒ

worn orbit
#

https://www.skartner.com/dsa/two-pointers/remove-occurences

Hi, I am creating a course on DSA in Python, it's free, please check out.

I plan to cover 300 important questions. Blog contains problem-statement, different approaches, code and video explanation at end.

My background - SDE with 4 yoe, now a startup founder.

Here's my leetcode - https://leetcode.com/u/RahulGupta9877/

fiery cosmos
#

I will leave yall with a question

#

go write some code to find the saddlepoint of a m x n matrix

real sequoia
#

looking for a coding parter for DSA in Python

haughty mountain
#

and I don't think this is possible in better than O(n m) unless we have some guarantees about the values

#

e.g. you can have a sea of say constant 5 then a

5 5 5 5 5
5 5 6 5 5
5 4 5 4 5
5 5 6 5 5
5 5 5 5 5
```in the middle
#

or is a constant like that all saddle points?

#

even if so you can make some evil patterns if repeated values are allowed

midnight lion
#

Chat i need a help!!! I am working for a competition where i get to finish 400 tasks. I am stuck on this task where I need to write a python program to convert inputs like this to output. They can give any input and i may have to convert it. here is some input and output data :

halcyon plankBOT
fiery cosmos
fiery cosmos
haughty mountain
#

it's trivial

haughty mountain
#

e.g. why can't the solution for the yellow part of the first one be

#

similarly for the one with the pluses, how do we know it's not more tightly spaced pluses?

umbral mountain
#

Actually may be fun to golf

umbral mountain
narrow mica
haughty mountain
midnight lion
#

lol

#

thats what im in

mystic fulcrum
haughty mountain
woven peak
#

Guys any ideas about where I can find the link to sign up for IBM's open source quantum?

jade abyss
#

does anyone know how in order dfs traversal works?

dreamy tulip
dreamy tulip
woven peak
fiery cosmos
#

Hi Everyone, here's a small task if anyone interested.
You have to write a algorithm which will visit every url on the site once, save the url and screenshot the page using x,y coordinates. you can generate the x,y coordainates and tagtype using the pupeteer script i am providing. It will generate the x,y,h,w and tagtype for all the visible elements on the webpage. I am also attaching a script which you can use to control the mouse and keyboard using python. So you have to move the cursor to the center of the rectangle. and save. visit next url, screenshot and save, repeat for all the urls, as you visit a new url, wait for page to load, run the run script to get all the new elements, and filter only the urls, use a list to track all the urls visited. and restrict the algorithm from visiting any url outside of domain.
And yes, I am willing to pay
Looking forward to hear from interested minds

halcyon plankBOT
fiery cosmos
#

you can use python selenium, or any other library you want.

limber pelican
#

i was looking for a fellow python mate to grind DSA in , anyone who is serious for getting into FAANG , hardcore level of DSA grinding , it would be convenient if you are a newbie in DSA and using python mainly for consistency

Note: Mostly focussing on leetcode grinding , not doing competitive programming or codeforces

abstract ruin
#

does any one know how to to fix found numbers in python with formulas relating to data sets

stiff quartz
#

@regal spoke I didn't see but look what the bot has said on your issue "Our teams prioritize action on product issues with broad customer impact."

#

Does that mean they put the issue in "We basically don't care"

regal spoke
#

There has basically been no commucation. That bot just post the same stuff on every issue.

#

That doesn't mean they arent working on it

stiff quartz
#

Ah I thought only some issues had that

regal spoke
#

I wouldnt read too much into it

stiff quartz
#

@regal spoke

#

I don't know what they did but hacking CPython dictionaries is like so much worse (or good, depends on how you see it) in Python 3.14

stiff quartz
#

Someone tried on all versions from 3.12 to 3.15 preview

haughty mountain
#

I've seen odd reports of 3.14 having random performance issues, I wonder if it might be a messed up build setup or something

#

like missing some optimizations that otherwise would happen

#

lto maybe?

stiff quartz
#

could be, but the "non-hacked" version is consistent across versions

#

if it wasn't compiled properly, the "non-hacked" version would be slower too?

stiff quartz
haughty mountain
stiff quartz
#

I don’t understand what you mean sorry

#

If it was compiled without optimisation, the peak in time for 3.14 would occur for both the bad path

#

And good path

#

Ah yeah I didn’t show the graph for good path

#

But good path is the same time for all versions

haughty mountain
haughty mountain
# stiff quartz And good path

if the path gets hit that part would indeed be slower, but if you have something like

total_cost = lookup_cost + resolve_conflict*n
```assuming the overall lookup cost is a largish constant, then you might not see the contribution of the conflict resolution for small n
#

so even if it got slower you might not see it unless you hit that path a lot of times per query

#

in normal usage you expect very few collisions (typically 0), while with the hack input you expect a ton of them

stiff quartz
#

Ah so you mean that optimization would only affect resolve_conflict cost?

haughty mountain
#

the lookup cost itself might be dominated by just a fetch from main memory

stiff quartz
#

ah ok makes sense

haughty mountain
#

i.e. nothing you can just trivially optimize

stiff quartz
#

I'll try to compile the source code myself from the latest point in the branch

haughty mountain
#

of course I'm speculating πŸ™‚

stiff quartz
#

Yeah of course, but it's still worth it to check it I guess

#

Ah I think I found how to enable release mode

#

@haughty mountain damn

#

maybe you were right, those are my results with python locally built with optimizations

fiery cosmos
#

when debugging a function clicking step over it sometimes says running and i wait for 5 minutes and nothing happens still waiting?

stiff quartz
fiery cosmos
#

jasonn what makes you sucha good enginerer?
\

stiff quartz
#

Could be different build setups that aren't mistakes

#

Just some have side effects on obscure hash collisions that barely anyone cares about

haughty mountain
#

it might be some bad release setup though

stiff quartz
#

That would be ironic

#

Python 3.13.6 is literally on the homepage of the main website

haughty mountain
#

idk why it would be ironic

#

maybe at some point something changed in the release process (scripts maybe) changed and caused a regression, and the setup is then re-used for new releases

stiff quartz
fiery cosmos
#

hay guys im planning to start dsa in python, could you please shear some resources

vague maple
#

@robust lark

robust lark
#

Yall discuss dsa or cp questions?

ripe kayak
#

What is the best way to start learning algorithms?

#

What should I start with first?

indigo wave
#

Hi guys, i am taking dsa in uni, and our teacher let us pick a language to use for the class, java, c, or python. I know pretty decent in all three. I just dont know which is more effecient to use with, or just enough for the job, here are our syllabus

Course Outline
The following modules are covered in CMSC 122.
I. Introduction to Data Structures and Algorithms
A. Basic Concepts
B. A Philosophy of Data Structures
C. From Problems to Programmed Solutions
II. Fundamental Data Structures
A. Arrays
B. Linked Lists
III. Search and Sorting Algorithms
A. Arrays
B. Sorting Algorithms
IV. Algorithm Complexity and Analysis
A. A Deep Dive to Algorithm Complexity
B. Worst and Average Cases
C. Asymptotic Analysis
V. Stacks and Queues
A. Stacks
B. Queues
VI. Binary Search Trees
A. Introduction to Non-linear Data Structures
B. Binary Trees
C. Array-based Representation of Binary Trees
D. Tree Traversals
VII. Self-balancing Trees
A. Introduction to Self-balancing Trees
B. Red-Black Trees
C. Complexity of Red Black Trees and related algorithms
VIII. Heaps
A. Min- and Max Priority Queues
B. Binary Heaps
C. Min- and Max Heaps
D. Complexity of Heaps and related algorithms
IX. Hash Tables
A. Hash Tables
B. Hash Functions
C. Managing collisions
X. Graphs
A. Introduction to Graphs
B. Data Structures for Graphs
C. Traversal Algorithms for Graphs
D. Problem-solving with Graphs
E. Complexity of Graphs and related algorithms

Thanks in advance guys

eager narwhal
# indigo wave Hi guys, i am taking dsa in uni, and our teacher let us pick a language to use f...

Having to deal with memory management yourself makes it more tedious; but if you think you can handle learning to go about learning a bit more difficult algorithms, it's worth it imo to learn DSA in C because less is hidden from you (which makes complexity analysis easier because you know the complexity of the operations you're using generally)

What is "for the job" depends on the job; programming jobs is a large category, can't say much on that

indigo wave
ripe kayak
#

Where should I start with first?
What is the best way to start learning algorithms?

blissful cypress
#

I have some formal algo studying on the horizon and am debating between Python and Java

umbral mountain
analog hamlet
#

An advance in shortest paths compute.

haughty mountain
#

For sparse graphs in particular

analog hamlet
rotund mist
#

Can anyone help me fix my greedy search? This is a broken old version but I need it fixed soon:

dreamy night
#

i may or may not have thrown away all my algo knowledge, but i'll see if i can help with my single brain cell

#

code block pls?

#

also input and expected output

rotund mist
rotund mist
umbral mountain
#

You should keep track of whether you've enqueued them.

dreamy night
#

it appears that city_coordinates is a dict

#

im not really sure how city_coordinates is structured

rotund mist
dreamy night
#

this is technically not even related to the greedy algorithm, it's just a syntax error

#

anyway

umbral mountain
#

Hint: to use automatic tuple sorting, you could put the cost at the front of the tuple.
(cost, node_info)

dreamy night
#

@umbral mountain am I missing something here?

umbral mountain
#

nope what you said should be right.

dreamy night
#

city_coordinates(start, None, None, 0) is just wrong syntax

rotund mist
#

Do I need to post more screenshots to provide more clarity?

umbral mountain
#

python error more than algo error

umbral mountain
rotund mist
dreamy night
#

thing is, i dont really know what city you are trying to add to the queue

#

can you go through with me how you determine which city you are trying to add?

umbral mountain
rotund mist
dreamy night
#

ok, maybe I rephrase

rotund mist
#

I'm not good at python and again I'm really tired this corruption is making things difficult

#

Just need someone to get me over the line

dreamy night
#

what are you trying to achieve on the line
frontier.put((graph[start], city_coordinates(start, None, None, 0)))

#

you need to tell me what you are trying to do in english

rotund mist
dreamy night
#

i dont mean to sound mean, but did you write it originally?

#

or did you ask chatGPT, or someone else to do it for you

rotund mist
dreamy night
#

how long do you have till you have to submit

rotund mist
#

4 hours, the others didn't take long but christ I am knackered, came here for a quick solution but I probably made myself look dumb

dreamy night
#

and a more technical question,

#

what does graph look like

#

if i understand the assignment correctly, you are trying to get from Ipswich to Newcastle with greedy

#

thing is

#

I can't tell how their connections are represented in the variable graph

rotund mist
umbral mountain
#

There are also many variants of "greedy" pathfinding. Which one are you attempting to do

dreamy night
#

thats true

rotund mist
dreamy night
#

i kinda assume that you are just picking the lowest cost at every node to follow

#

tepig

#

mmmm

#

it's clear to me that you have absolutely no idea what you are doing

#

i

#

i could bail you out

umbral mountain
#

You might need to hit the books lol

dreamy night
#

but you cant just

#

get through life with people bailing you out

umbral mountain
#

So you know how to code an A*?

dreamy night
#

do you know what is A*?

rotund mist
rotund mist
umbral mountain
#

Then you need to define "greedy".

#

I have no idea what is "uniform".

rotund mist
#

I should probably just figure it out myself, I don't think I'm getting a quick solution here.

umbral mountain
#

Ok

rotund mist
#

This greedy search bit is the last bit if I need to make it clear, the performance comparison code works.

dreamy night
#

from what i can see

#

you are supposed to add every city connected to Ipswich

#

to frontier

#

right?

#

well, the starting city, which you hard coded as Ipswich

umbral mountain
rotund mist
rotund mist
#

I'm going to go back to the original templates and guides, sorry for wasting both of your time. I'll definitely get this done, thanks for trying to help me.

dreamy night
#

do you know what does frontier mean in the context of this graph problem?

#

it means the possible next nodes that have not been explored yet

#

well, edges

#

right now your code is failing at the part where it's trying to populate frontier for the first time

#

i.e. you need to find a way to get all the cities connected to the starting cities

#

and add them to the frontier

rotund mist
#

dw about it a friend is gonna help my sleep deprived state

dreamy night
#

your code is failing because the expression that you used in frontier.put((graph[start], city_coordinates(start, None, None, 0))) is not the correct way to find a list of connected cities to add to frontier

haughty mountain
rotund mist
#

@dreamy night I am a fool I wasn't supposed to do greedy, I was allowed to do a simple one like depth first search. Sorry for wasting your time and thanks for trying to help, I really appreciate it!

dreamy night
#

whether you use greedy or depth first search

#

you are still going to have to

#

populate your frontier queue

orchid pebble
#

In the Quicksort example, why did the first partitioning step produce the exact sequence [8, 5, 9, 6] instead of another random-looking order given the initial unordered list [9, -3, 5, 2, 6, 8, -6, 1, 3]? Note that the last element is the pivot element.

raven dagger
#

it doesn't necessarily need to be that

#

i think the last element is a[0] which is strange

orchid pebble
#

But I also found another example with the same exact case

raven dagger
orchid pebble
raven dagger
#

oh

#

quicksort works by repeatedly taking a pivot and taking the sublist of elements smaller and larger than the pivot

#

and does quicksort on those sublists

orchid pebble
#

I just found why this behaves in such a way

umbral mountain
#

It all depends on the partition scheme.

#

As long as "stuff on the left less than pivot, and stuff on the right more than pivot", it's all good.

#

Quicksort is unstable anyway.

haughty mountain
#

but in the case you're looking at for C++ the partition seems to be done with swapping like in std::partition

orchid pebble
#

Do you know what a full binary tree is? Is it a binary tree, whose nodes have 0 or 2 children?

outer meteor
#

a bit

orchid pebble
#

Guys can you help me further?

outer meteor
orchid pebble
# outer meteor just ask your question dont ask to ask

In a complete binary tree, the nodes can have either 0, 1 or 2 children and every level is filled, except possibly the last level and all nodes in the last level are as far left as possible. With that being said, a complete binary tree can also be a full tree or a perfect tree, right?

haughty mountain
#

some

#

e.g. a perfect tree is also full and complete

umbral mountain
orchid pebble
#

I have another question: There is a general concept about balanced and unbalanced binary trees, where balanced simply means that it does support O(log n) effectively, while an unbalanced tree means that it doesn't support O(log n) effectively?

orchid pebble
umbral mountain
orchid pebble
umbral mountain
#

Balanced means depth of left and right subtrees differ by <= k

#

If k == 1, and child nodes are aligned left, then balanced => complete

orchid pebble
umbral mountain
#

A balanced tree, particularly a balanced binary tree, is a type of tree data structure where the height of the left and right subtrees of any node differ by at most a certain value, typically 1. This property ensures that the tree remains relatively "flat" or compact, preventing the worst-case scenarios of highly unbalanced trees that can degenerate into a linked list.

orchid pebble
umbral mountain
#

Not necessarily. There are many types of balanced tree.

orchid pebble
umbral mountain
orchid pebble
# umbral mountain 9. It's the same as the number of nodes.

Yes, I would also think about this. The rule is that every node in a tree is the root of its own subtree. Most people would assume that the child nodes (also known as leaves) are not part of a subtree. However, a single node counts as a tree as well

umbral mountain
orchid pebble
umbral mountain
#

Have you written your own tree class yet? That is an excellent way to improve your understanding of trees.

haughty mountain
#

or wait, I'm thinking of a different thing than subtrees

limber pelican
#

hello everyone , i am looking for a partner to hardcore grind DSA in python for FAANG mostly , beginners and intermediete both are welcome

limber pelican
#

Yeah sure

grave jackal
#

hiiii new here so I hope this isn't the wrong place to ask. Does anyone know of a way to make a large collection of objects and efficiently pick a (pseudo-) random one? (in fact, pick all of them, in a random order, doing a pop() on each as I go until the collection is empty)
I'm currently just using something like list[random.randint()] but with the number of objects I have it's sloooooow.

#

Not a python expert but I at least figured out that dicts/hashmaps are faster than lists; I implemented dicts in a few other places in my program to speed things up but I can't figure out a way to pick randomly from a dict without first converting the keys to a list, which ruins the efficiency advantage

tight cedar
#

why not just shuffle them first then pop it

#

I assume it is the removing from list part that is slow because you are removing from a specific index in the list

grave jackal
#

yeah

#

oh I forgot popping from the end would not have to do that

#

Ah no right the problem I had with that was that when I pick a random one, sometimes I use it, and other times I put it back in the list and pick a new one

#

which if I just shuffled them popped from the end would give me the same one again

#

Honestly there's probably a better way to do that part, so that I don't have to put any back, but I am ignoring that for now

umbral mountain
grave jackal
#

after reading the documentation on that function for a while I am struggling to figure out what to do with that

narrow mica
grave jackal
# narrow mica then could you fully and specifically describe what you're trying to do? would b...

ok so I'm trying to create a 2d grid of objects ("tiles"), with each holding a "type" that is randomly selected, but biased by the other objects that surround it in the grid. (i.e., make a fun little randomly generated terrain map) (there are four "types" (water, grass, beach, tree) with whether it can be each "type" stored in an attribute as a dict with boolean values)

Function:

  • create empty dict, create empty list
  • Populate dict with same set of "tiles", keyed "i_j" (The coordinates [i,j] are also stored as an attribute by each object). At same time, append() each object to the list (just a 1d list holding each one sequentially)

~While there are still "tiles" in list --

  • Pick a random one ( list[random.randint(0, len(list)-1)] )
  • Check if it's on the wall of the grid, i.e. i or j is 0 or the size of the grid in that dimension. If so, tell it it's a wall, remove it from the list, and move on.
  • Elif the tile does not have True for all of its possible "types:"
    • or rand.randint(0,1000)==5 (to break the starting case where all have True for all their cases, and randomly choose some other ones too so it doesn't just spread out from a single tile)
    • While tile type is None: rand.choice() the possible "types," and check if this "tile" is allowed to be that "type." If so, set tile to that type.
    • (outside while loop, still in elif) create list of surrounding tiles, indexed from the dict (dict.get(str(x-1)+"_"+str(y)), etc.)
    • For each "tile" in the surrounding tiles: modify its list of allowed "types" based on this tile's type.
    • Remove this "tile" from the list.
  • No else statement -- (if neither if condition is true, keep the "tile" in the list and pick a new random one)
#

My code is terribly messy and not well commented at the moment πŸ™ƒ because of course it would be and I did not come here prepared

#

It works well enough though, for what I want it to create. However, the step of picking a random one from the list like that is very slow

ivory yacht
#

Does it matter whether the pick step is uniformly random? What you could do is have two lists of tiles. When you need to re-insert, instead put it in the second list. Once the main list is empty, shuffle the second list, then swap the two lists.

grave jackal
#

That sounds like it could work? I'll try it -- for reference this process takes about 5.8-6 seconds at the moment (per time.perf_counter())

#

(that's for a small area, 200x200 pixels; if I scale it up to reasonable size for my screen it takes minutes, which is why I want to speed it up lol)

grave jackal
ivory yacht
#

Probably worth running with a real profiler, that'll help tell you where your actual slow spots are.

#

Just noticed something - this could be a slowdown: dict.get(str(x-1)+"_"+str(y)). Instead it'd probably be better to just use (x+1, y) tuples.

grave jackal
#

I originally had the dict as a 2d list if that's what you mean? It was much slower

ivory yacht
#

No, dicts don't have to use string keys, they can use any immutable object, like tuples.

grave jackal
#

Oh, huh. Did not realize. I had tried keying it with a 2-item list and it yelled at me so I just wrongly assumed tuples would be the same

grave jackal
grave jackal
umbral mountain
#

I.e. List of lists.

stiff quartz
austere sparrow
#

(A mutable object can be hashable -- for example functions, or instances of a user-defined class that didn't override __eq__ explicitly)

grave jackal
naive oak
grave jackal
#

I am indexing random elements and then removing them

#

So yeah structure changes frequently

naive oak
#

yeah that'd explain things
not too sure what you're doing, but dict is probably fine
if you wanted to stick with 2d lists, it'd be better to just set stuff to something like None instead of removing it from the list

umbral mountain
grave jackal
#

I am choosing random ones but need to ascertain that eventually all of them have been chosen. If I set them to "None" then by the end it would be just choosing random numbers repeatedly until eventually happening to land on the one single one that hasn't been chosen yet, no? I can't imagine that would be fast

#

Would also need to copy them into a second list at the start since I need to keep the information about the ones that have already been chosen but that would be fine presumably

haughty mountain
#

can you explain what you actually trying to do?

#

list does not sound like the data structure you want

pearl ravine
#

How can I start learning py

orchid pebble
#

Why wouldn't it choose the other edge that costs 7 as well?

haughty mountain
#

it can, but it doesn't matter for that input

orchid pebble
haughty mountain
# grave jackal ^^ ?

can you just shuffle your list and go through it, rather than doing the slow removal at each step?

grave jackal
#

No, because I often need to repeat points if they did not meet conditions for removal the first time, and if I just used a single shuffled list it would just give me back the same point again when I need to try a different one instead

Suppose I could shuffle, go through, then make a new list of the ones that haven't been edited yet and repeat? But id be making a new list for often every few points so I'm not certain that would be faster...? I can try it at least (later)

haughty mountain
#

(though the duplicates part isn't interesting)

#

actually nvm your problem should be easier

#

instead of removing by .remove or whatever, swap the element with the last element and .pop

grave jackal
#

Hmmm, that might work, though if I'm understanding correctly I worry about some undesirable patterning, as items earlier in the list are less likely to be removed, so the end of the list would be skewed towards points that initially did not meet conditions. Might not make a noticeable effect though...? Worth trying

grave jackal
#

if I've implemented it correctly it did indeed produce an undesirable output (before is on the left, after is on the right)

#

Additionally, took... exactly the same amount of time. Which made me suspicious that this process wasn't what was eating time, but when I skipped it entirely it went down to near-zero time, so... ?? I could also just be misunderstanding though

bronze bobcat
haughty mountain
grave jackal
vocal sonnet
#

hi guys i need help

grave jackal
#

I guess I'm removing items from large lists like that forever now πŸ’€

umbral mountain
grave jackal
#

Ah, suppose so

#

Program as a whole is still quite slow and I'll have to come back to this, I think, but I think a two-fold reduction is enough to move on for now as long as it's still just a personal project

haughty mountain
#

a lot of algorithmic design comes down to avoiding having to do the expensive naive thing

fiery cosmos
#

whats your fav ds

#

and algo

naive oak
#

dsu

#

favourite algorithm is tough pithink

#

probably the polynomial time 2 sat solver

#

I quite like tarjan's scc alg as well

#

the suite of algorithms surrounding computation of the matrix exponential is also rather interesting

#

putzer's, the column method, jordan chevalley, Jordan normal form, etc.

#

tho that might be more mathy than cs

regal spoke
#

Segment tree and segment tree

haughty mountain
#

Big surprise

naive oak
regal spoke
#

But like, I'm used enough to coding these that I code them on the spot everytime

#

So I don't really have one implementation that I use

subtle garden
#

F

#

YT

#

HELLO

#

I some one can give me some recommend how to leran paython from 0

native violet
royal kestrel
#

I swear this was initially revealed like a month and people back then went like "oh it's actually only faster in a very very very specific niche" and somehow it's exploded in the news again with no context

silver cloud
#

how can i post block of code here??

blissful cypress
halcyon plankBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

silver cloud
#
def is_balanced(input:str)->bool:
    #((()))
    #(()
    if len(input) == 1:
        return False
    
    stack = Stack()
    for paren in input:
        stack.push(paren)
    for i in range(len(stack.items)):
        if stack.items[i] == stack.items[len(stack.items) - 1 - i]:
            return False
    return True
print(is_balanced("(()(()))"))

#

this returns false

#

but other cases returns true

#

like ()() or ((()))

umbral mountain
#

(like the one in your print)