#algos-and-data-structs

1 messages Β· Page 37 of 1

spring jasper
#

For a nums=[0]

#

Why does this pass

regal spoke
#

Oh I see

spring jasper
#

This has to be an issue with the testing

regal spoke
#

I think the simple answer is that leetcode is not all that high quality of a website

#

Either they dont have a test case for len(nums)=1, or their validator is buggy

spring jasper
#

It should give index out of bounds error, is it worth reporting?

narrow mica
spring jasper
#

mfw cheated the fuck out of this problem

narrow mica
#

if you have range(a, b) and a > b then it's just empty

spring jasper
#

Yes I know

regal spoke
#

and it still gets AC

narrow mica
#

Ah I see

regal spoke
spring jasper
#

fuck inplace solutions man

#
def removeDuplicates(self, nums: List[int]) -> int: 
    from itertools import groupby, chain
    nums[:] = list(chain.from_iterable([[k] * min(len(list(g)), 2) for k, g in groupby(nums)]))
    return len(nums)
#

accepted πŸ’€

regal spoke
#

But like, the point of this problem is clearly to solve it in place

#

Had you been asked a question like this in a programming interview, they you wouldnt be able to cheese it like this.

spring jasper
#

yea i know but also i dont really do python cause i like worrying about allocations

#

if theyre so worried maybe theyr should tighten their constraints

regal spoke
spring jasper
#

just use a profiler

regal spoke
spring jasper
#

it doesnt matter because it only cares about the length of the array up to k

regal spoke
#

The entire idea behind the problem is that you are given an array of a fixed size and have to work around that

spring jasper
#

you're supposed to return the length on the array that holds the valid elements

#

whether you shrunk the array or grew it is irrelevant

#

More formally, if there are k elements after removing the duplicates, then the first k elements of nums should hold the final result. It does not matter what you leave beyond the first k elements.

#

besides, in-place doesnt mean keep the array at the same size, it means dont allocate a new array

#

(i do anyway but thats besides the point)

regal spoke
#

and I think that is kinda the point of the problem

spring jasper
#

yea but im not doing c

#

i know my solution doesnt satisfy the criteria but honestly 🀷

waxen onyx
#

`def two_oldest_ages(ages):
oldest = float('-inf')
second_oldest = float('-inf')

for age in ages:
    if age > oldest:
        second_oldest = oldest
        oldest = age
    elif age > second_oldest:
        second_oldest = age

return [second_oldest, oldest]`

how does this make sure that the second_oldest doesn't equal oldest? I asked chatgpt but it gives me a solution that contain != oldest

haughty mountain
#

I think this would work (insert into function as needed)

ins = 1
for cur in range(1, len(data)):
  if data[ins-1] != data[cur]:
    data[ins] = data[cur]
    ins += 1
return ins
regal spoke
#

It is not just removing duplicates

haughty mountain
#

oh, that's just a minor correction though

#

the check changes a tad, you loop from 2

regal spoke
#

So it is a little more tricky than just changing the 1s to a 2

haughty mountain
#

I guess a general thing would be

def keep_k(data, k):
  ins = 0
  for cur in data:
    if ins < k or any(cur != data[ins-i-1] for i in range(k)):
      data[ins] = cur
      ins += 1
haughty mountain
#

right

#

just checking ins-k is enough

#
def keep_k(data, k):
  ins = 0
  for cur in data:
    if ins < k or data[ins-k] != cur:
      data[ins] = cur
      ins += 1
#

I guess a nicer more general implementation would do swaps rather than assignments

#

but whatever

regal spoke
haughty mountain
#

yeah

#

is allows you to run the algo even on non-copyable types

thick hazel
#

for path planning through a dense maze/obstacle field, is it true that BIT* is the best algorithm we have?

#

is that like THE SOTA?

robust needle
#

Okay I need some help on improving my pathfinding algorithm for a project

#

The problem is basically trying to find a path for tourist that goes through as many attractions as possible, while minimising time and cost taken

#

So I have a graph, G1, which contains two sets of nodes, attractions and bus stops, which are linked with various weighted edges

#

my initial algorihm uses Floyd-Warshall on G1 to get all the minimum distances, and uses these to make a new graph, G2, which consists only of attractions (the paths are stored as edge-attributes)

#

then i just pathfind through G2 using best-first-search or something similiar

#

now as to improve it, my first idea was to use dijkstra's multiple times on attractions

#

but then i was like, wait if I run dijkstra multiple times, i could save some paths so i don't have to recalculate them!

#

and then one hour later i realised i have a shitty clone of floyd-warshall

arctic ginkgo
#

Lets say we have a unsorted list with 7 digits

And I call mergesort on it

[1, 2, 3, 4, 5, 6, 7]

how many recursive calls would it take?

it would be 14 right

narrow mica
regal spoke
haughty mountain
haughty mountain
#

do you not count the first call?

regal spoke
#

Ye

haughty mountain
#

down to definitions I guess

#

13 calls to the recursive function

glass epoch
#

how to know the shape of a variable before running and debugging it?
I found this needful especially when Im using numpy

#

I mean I can trace the code but it is too much work

vocal gorge
#

there's no shape typing in numpy, if that's what you mean

#

what I usually do is leave comments on the shapes of all arrays

#

then it's easy to compute new shapes - if I do

c = b.reshape(1,-1) + a

and b is (N,) and a is (N,1), then c will be (N,N).

summer badge
#

hey guys hope yall doing well here. I was wondering if someone understands a lil bit trading, because i'm developing a trading AI that is nearly finished: I have a float issue because i want to float my broker balance and that is perturbing me a lot and i am struggling to fix. If you want to help. Please ping me. PS: I'm 15 y/o I don't have as many experience as you here. Thanks for reading

summer fossil
#
class Solution:
    def fill(self, n, m, mat):
        vis = [len(row)*[0] for row in mat]
        delrow = [-1, 0, 1, 0]
        delcol = [0, 1, 0, -1]
        
        def dfs(row, col):
            vis[row][col] = 1

            for i in range(4):
                nrow = row + delrow[i]
                ncol = col + delcol[i]

                if nrow >= 0 and ncol >= 0 and ncol < m and nrow < n and not vis[nrow][ncol] and mat[nrow][ncol] == 'O':
                    dfs(nrow, ncol)

        # Traverse First Row and Last Row
        for j in range(m):
            if not vis[0][j] and mat[0][j] == 'O':
                dfs(0, j)

            if not vis[n - 1][j] and mat[n - 1][j] == 'O':
                dfs(n - 1, j)

        # Traverse First Col and last Col
        for i in range(n):
            if not vis[i][0] and mat[i][0] == 'O':
                dfs(i, 0)
            if not vis[i][m - 1] and mat[i][m - 1] == 'O':
                dfs(i, m - 1)

        for i in range(n):
            for j in range(m):
                if not vis[i][j] and mat[i][j] == 'O':
                    mat[i][j] = 'X'

        return mat```
summer fossil
whole aurora
#

ok what is it currently outputting

#

well thats above my pay grade!

shadow glen
#

!e
message = "Hello\nWorld"

second_message = message[1].split('\n', 1)

second_message = second_message[1]

print(f"1st: {message}")
print(f"2nd: {second_message}")

halcyon plankBOT
#

@shadow glen :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 5, in <module>
003 |     second_message = second_message[1]
004 |                      ~~~~~~~~~~~~~~^^^
005 | IndexError: list index out of range
shadow glen
#

!e
message = "Hello\nWorld"

second_message = message.split('\n', 1)

second_message = second_message[1]

print(f"1st: {message}")
print(f"2nd: {second_message}")

halcyon plankBOT
#

@shadow glen :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 1st: Hello
002 | World
003 | 2nd: World
shadow glen
#

perfect

modern gulch
summer fossil
#

wrong o.p

modern gulch
#

Ok, but you're not giving us any information. You should probably give us a minimal test case, and try to isolate where it's not working?

summer fossil
#

It;s fails 1st test case

#

it passed sample test case

summer fossil
modern gulch
#

I'm happy to look if you can narrow it down, but someone else would have to analyze the entire code & look at the original problem, which is a lot of work.

summer fossil
#

and it;s working but got tle

#

Shitty mistake i am just copying entire matrix i need to create zero matrix so i change vis part and it works but gives tle

#

if i change vis[0][j] == 0 it works instead of not vis[0][j]
Why In not vis[0][j] i got tle ?

modern gulch
#

oh, right, yah so if the matrix isn't zeroed, it assumes it's visited?

summer fossil
#

if edge is zero then we can't replace it into x

regal spoke
summer fossil
#

but can u explain me why not vis is slower than vis = 0?

regal spoke
#

and not 0s and 1s

summer fossil
regal spoke
#

No it is not

#

Your code starts by copying it from mat

#

oh wait

summer fossil
regal spoke
#

Ah ok

#

Btw I dont see the reason to do dfs in the first place

#

Wouldnt just two for loops suffice to solve this problem?

#

ah nvm

#

Missread the problem

#

ah I understand your solution now

#

You could have also done a bfs solution

regal spoke
#
class Solution:
  def fill(self, n, m, mat):
    bfs  = [(i,j) for i in range(n) for j in (0, m-1)]
    bfs += [(i,j) for i in (0, n-1) for j in range(m)]
    for i,j in bfs:
      if 0 <= i < n and 0 <= j < m and mat[i][j] == 'O':
        mat[i][j] = '#'
        bfs += (i+1,j), (i-1,j), (i,j+1), (i,j-1)
    return [list(''.join(row).replace('O','X').replace('#','O')) for row in mat]

Here is a kinda codegolfed solution

rustic coral
#

So a question came up in my head while learning about linear approximation in calculus:
Would the time complexity of an algorithm made to evaluate a function across a given range, f(x), overall benefit from using linear approximation?

#

This is assuming the function provided is not predefined, the derivative and tangent line(s) must be calculated prior to using linear appx

#

Or is the time complexity of the algorithm independent of how complex a function is, say sqrt(x) vs sqrt(x^3+2)?

vocal gorge
#

What do you mean by "evaluate a function accross a given range"?

rustic coral
#

get all values of f(x) from [-5,5]

#

for example

vocal gorge
#

well, there's an uncountable number of points in [-5,5], so that's not possible to do in finite time and memory

#

or do you mean only at integer points?

rustic coral
#

yeah only at integer points

#

sorry

haughty mountain
vocal gorge
#

well, if the function isn't linear you can't accurately determine its values via linear approximation

#

you can, well, approximate it

rustic coral
#

yeah thats what I mean

vocal gorge
#

e.g. only compute values on every second point and linearly interpolate inbetween. but that's a speed-accuracy tradeoff

#

that said, asymptotic complexity wouldn't change, it's just O(n) either way

haughty mountain
#

computing the function need not be O(1)

marsh topaz
#

Hello!

#

This is really dumb but how do I represent a char array in c types?

#

Here's what I mean:

#
struct cmd {
    uint16_t timeout;
    uint16_t len;
    uint16_t next;
    uint16_t type;
    size_t id;
    char buf[];
};
#

That last member

#

I don't need a pointer to a string, I need to include it in the struct inline whatever the size is

#

Oh, it doesn't handle flexible array members super well πŸ˜›

robust needle
#

I'm having some trouble understanding the difference between optimsiation and decision problems

#

I understand that every optimisation problem can be transformed into a decision problem

#

and for like TSP, it makes sense to me that the decision version is NP-Complete, while the optimisation is NP-Hard

#

what I don't quite get is the difference between the optimisation and decsion versions of the 0-1 Knapsack problem

#

so far what i've gathered from the internet is that the decisions version (Given a multiset of items, S, is there a subset which has value, V, while having a weight of less than W), is NP-Complete

#

so what is the complexity class for optimisation version?

regal spoke
#

The theory about P, NP, NP-hard, NP-complete is only about decision problems (YES/NO problems)

regal spoke
#

A decision problem is said to be in NP if the YES answer has a polynomial time verifiable proof.

#

The <= k version of TSP is in NP since to prove YES all you need to do is to give an example of a tour of length <= k

regal spoke
#

My guess is that it is an open problem whether or not the =k version is in NP.

normal gull
# marsh topaz This is really dumb but how do I represent a char array in c types?

You can do zero-length arrays in ctypes, just as you can in C. Example:

import ctypes as ct
import struct

class cmd(ct.Structure) :
    _fields_ = \
        [
            ("timeout", ct.c_uint16),
            ("len", ct.c_uint16),
            ("next", ct.c_uint16),
            ("type", ct.c_uint16),
            ("id", ct.c_size_t),
            ("buf", 0 * ct.c_char),
        ]
#end cmd

s = b"hi there"
val = struct.pack("@HHHHN", 10, len(s), 8, 7, 6) + s
cptr = ct.cast(val, ct.POINTER(cmd))
print("timeout = ", cptr.contents.timeout)
print("type = ", cptr.contents.type)
bufptr = ct.cast(cptr, ct.c_void_p).value + cmd.buf.offset
s2 = ct.cast(bufptr, ct.POINTER(cptr.contents.len * ct.c_ubyte)).contents
print("buf = ", bytes(s2))
haughty mountain
marsh topaz
#

It's not a zero length array, it's a flexible array member, or are those terms equivilent to some people?

haughty mountain
#

I was responding to the mention of zero length arrays, not your struct πŸ™‚

outer rain
marsh topaz
#

wdym? I'm trying to represent the struct I showed above in c-types.

#

I'm not sure if ("buf", 0 * ct.c_char) will allow me to allocate an arbitrary length char buff like you would in C.

#

When you put a zero length array at the end of a struct in C it's actually a flexible array member that can have any length.

outer rain
#

You don't allocate a struct, you just allocate the memory for the struct and use it as a struct

haughty mountain
#

it allows you to do stuff like allocating some amount of memory, cast it to your struct, and use the available memory for the array contents

outer rain
#

(And it's not zero length array, it is flexible array with no size specified)

#

Considering that you will probably wrap this struct into python class or some other abstraction anyway, you can just not mention that array in the struct at all

#

Simply access the memory outside of the struct, nothing can stop you

#

Or, alternatively, have an array of size 1, and do "out-of-bounds" accesses on it (the ansi C way)

haughty mountain
#

right, size 1 array would be the old-school way to do it

#

(which is still UB afaik)

raven willow
#

I don't know which chat to put this question into exactly but, has anyone heard of the python package "dpss". (pip install dpss) I want to use it at work and nervous about introducing malware. I'm the only python programmer and don't want to screw it up.

#

Does anyone here have any experience with it and is it legit?

turbid granite
#

I'm sure this is silly but is there a way to loop through a dict over again until all of a list has been added to each dict evenly (as possible)?
ie:
dict a, b, c
list 1,2,3,4,5,6,7,8
result
a: 1,4,7
b: 2,5,8
c: 3,6

raven willow
#

How many sets you want in the results?

#

Always 3?

turbid granite
#

no it'll be between 1-6
but the first will be a number to setup
Think of a sports league. one season could be 3 divisions and next year 6 divisions.
prob the best example I can think of

#

Can be random
was thinking of doing an every Nth based on number of divisions

faint sluice
#

I'm trying to use an implementation of FIt-SNE, but it runs out of memory on an A10 at this line:

dY = torch.sum(
            (PQ * num.to(device)).unsqueeze(1).repeat(1, no_dims, 1).transpose(2, 1) * (Y.unsqueeze(1) - Y),
            dim=1
        )

Unfortunately, these variables are poorly named and undocumented. However, the function preceeding the line where the function runs out of memory is

sum_Y = torch.sum(Y * Y, dim=1)
num = -2. * torch.mm(Y, Y.t())  # (N, N)
num = 1. / (1. + (num.to('cpu') + sum_Y.to('cpu')).t().to('cpu') + sum_Y.to('cpu'))
num.fill_diagonal_(0)
Q = num / torch.sum(num)
Q = torch.max(Q, torch.tensor(1e-12, device=Q.device))
Q = Q.to(device)

# Compute gradient
PQ = P - Q

Q.to('cpu')
P.to('cpu')

Y is initialized to a tensor of zeros equal to the length of the dataset and the code given as well as the line that breaks is within a loop that runs for a given number of times. P is some tensor based on the dataset (I think it's the dataset after undergoing some dimension reduction).

Essentially: How can I change this expression to calcualte dY with less memory?

turbid granite
#

hmmm

faint sluice
#

The problem seems to be the .repeat(1, no_dims, 1) segment. It is what tries to allocate 13 GB.

num is a tensor with dims (42000, 42000), so not exactly small. PQ is the same size, and Y is (42000, 2).

If I could do the transpose in batches, then I could split this up and wouldn't have any issues. Maybe. Essentially I just need to avoid storing the output of the repeat() operation. Are there any batched transpose algorithms that work with tensors?

turbid granite
#

I like to break things down before cleaning it up. I got to this point before asking.
`
division = 3
teams = ['alpha','beta','charie','donky','elephant','fish']
teams2 = []
teams3 = {}
print (division)
print (teams)

teams2 = teams
a = 1

def every_nth(nums, nth):
return nums[nth -1::nth]

if division > 1:
div = division
a = 1
for i in range(division):
b = "division" + str(a)
al = []
al.append(every_nth(teams2,div))
div -= 1
a = a + 1
print("Al : ", al)
teams3.update({str(b):al})
k = lambda teams2, al: list(filter(lambda element: element not in al, teams2))
print ("teams 2 ", teams2)
print ("temas 3 ", teams3)
`

#

seems to work except it's not removing the "duplicate" found iteams on each loop from the temp list I'm using.

#

I tried replacing
k = lambda teams2, al: list(filter(lambda element:
with
for k in teams2: print (k) if k in al: print ("yo") teams2.pop(k)
but it doesn't do the if, even though it sees each k.

shadow glen
#

a question

#

why B is being re-writed with ":"?

#

!e

message_disc = "cB: Hello World"

instances = message_disc.split(':', 1)
print(instances)
prefix = message_disc[0]
prefix = prefix+':'
instances[0] = prefix

print(instances)

halcyon plankBOT
#

@shadow glen :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | ['cB', ' Hello World']
002 | ['c:', ' Hello World']
shadow glen
#

it's suppose to add a ":" at final

#

and not rewrite it

regal spoke
#

Btw next time put your code in discord inside tags like this;
```py
print('Hello, World!')
```
That way it will get syntax highlighed and a lot nicer to read

#
print('Hello, World!')
shadow glen
halcyon plankBOT
#

@shadow glen :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | ['cB', ' Hello World']
002 | c
003 | c:
004 | ['c:', ' Hello World']
shadow glen
#

yep it's a bug

#

why the hell cB turns into c?

regal spoke
#

which is c

#

Thats what your code is setting prefix equal to

shadow glen
#

it meant to grab the first element from the list

#

not the first string

#

*first character

regal spoke
#

As I said, you did message_disc[0] (first character of the message), but you should have done instances[0] (the prefix up to the first :)

#

instances[0] is cB

shadow glen
#

!e

message_disc = "cB: Hello World"

instances = message_disc.split(':', 1)
print(instances)
prefix = instances[0]
print(prefix)
prefix = prefix+':'
print(prefix)
instances[0] = prefix

print(instances)
halcyon plankBOT
#

@shadow glen :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | ['cB', ' Hello World']
002 | cB
003 | cB:
004 | ['cB:', ' Hello World']
shadow glen
#

finally

turbid granite
#

Whoot

#

I figured mine out.

division = 3
teams = ['alpha','beta','charie','donky','elephant','fish','giant']
teams2 = []
teams3 = {}
print (division)
print (teams)

teams2 = teams
a = 1

def every_nth(nums, nth):
    return nums[nth -1::nth]

if division > 1:
    div = division
    a = 1
    for i in range(division):
        b = "division" + str(a)
        al = []
        al.append(every_nth(teams2,div))
        al2 = al[0]
        div -= 1
        a = a + 1
        teams3.update({str(b):al})

        for k in al2:
            if k in teams2:
                l = teams2.index(k)
                teams2.pop(l)
        print ("teams 3 ", teams3)

Now to clean it up. lol

haughty mountain
normal gull
#

Yes, you can do zero-length arrays in C:

ldo@theon:c_try> e zero-length-array.c

#include <stdio.h>

typedef int zerolength[0];

int main(int argc, char ** argv)
  {
    fprintf(stdout, "sizeof(zerolength) = %d\n", sizeof(zerolength));
    return
        0;
  } /*main*/


ldo@theon:c_try> gcc -o zero-length-array zero-length-array.c
ldo@theon:c_try> ./zero-length-array
sizeof(zerolength) = 0
haughty mountain
normal gull
haughty mountain
#

that doesn't matter

#

it's not valid standard C

normal gull
haughty mountain
#

if you code ignoring the standard your code is likely to break across compilers (or even within the same compiler)

#

this is probably one of the least likely things to ever break, but still

#

C even has a standard construct for this since C99

#

use that

#

rely on stuff like undefined behavior extensively and it will end up biting you in the ass

#

I've seen it all too often

normal gull
haughty mountain
#

None of the common compilers will break on this because basically all support the extension, but that doesn't mean it's standard C

#

and I did find a compiler that refuses to compile it, but it's an obscure one

#

"it works on my compiler" β‰  "this is correct and guaranteed to work"

regal spoke
#

Just because something is technically a supported extension doesn't necessarily mean it is a good idea to use it

#

Also, even supported extensions can behave buggy

#

For example, I recently experienced some really weird errors trying to use VLAs together with lambda functions:

#
int main()
{
    int N = 1;
    bool a[N];

    auto f = [&](auto &&self) {
      a[0] = true;
    };

    f(f);
}
#

Try out compiling this with GCC and Clang ^

fiery cosmos
#

VLA ooo

#

Lambdas ah dam it. It's not c

regal spoke
#

Here is a fun example in C C++ with 0 sized arrays. (Found it on stackover https://stackoverflow.com/a/65422119)

#include <cstdio>

int main() {
    struct A {
        A() {
            printf("A()\n");
        }
        ~A() {
            printf("~A()\n");
        }
        int empty[0];
    };
    A vals[3];
}
#

The output of this is probably not what you'd expect

regal spoke
#

ah ye

stray fractal
#

whm

regal spoke
waxen fern
#

i have a really dumb problem

def getSignatures():
    #sortedDict = {}
    with open("file-signatures.json", "r") as file:
        jsonObject = json.load(file)
    for item in jsonObject["filesigs"]:
        item = item["Header (hex)"].replace(" ", "")
        print(item)
    jsonObject = json.dumps(jsonObject, indent=4)
    with open("file-signatures.json", "w") as file:
        file.write(jsonObject)```

this is my function, I am just trying to remove spaces from the JSON file(attached as image)
it is not working. I know it is something dumb, what could it be?
#

the print statement is working correctly...

sonic dagger
waxen fern
#

i think you misunderstood my problem. I want to eliminate the spaces from the Header (hex) key only

#

I want to keep the indents otherwise.

sonic dagger
#

Gotcha, then that line with the replace looks weird to me

waxen fern
#

it does work

sonic dagger
#

Your item variable is just a copy of the value

#

You're not reassigning it to the list

waxen fern
#

Oh

sonic dagger
#

If it was working as the item itself, you would have a list of just the hexcodes btw, is that what you want?

waxen fern
#

All I want to do is eliminate the spaces between the hexcodes. I want the entire JSON file otherwise.

#

As when I calculate the hexcodes of a file, it does not have spaces.

sonic dagger
#

You probably have to iterate manually through your list:
For (i, item) in enumerate(jsonObject["filesigs"]) :
jsonObject["filesigs"][i]["Header (hex)"] = item["Header (hex)"].replace(...

#

Or do list comprehension with a function designed to update the value of each dictionary

waxen fern
#

thank you, I will try this in a moment

shadow glen
#

okay soo

#

what this error means?

TypeError: unsupported parameter kind in callback: VAR_POSITIONAL
shadow glen
#

so nvm

summer fossil
stiff quartz
#

Hi, is there a reason why there is typically more stack overflow problems in Python compared to C++?
I was solving this problem (https://codeforces.com/contest/1843/problem/D), and my solution (recursive dfs, https://codeforces.com/contest/1843/submission/220724973) seemed similar to the official solution except it's in Python rather than in c++ (cf editorial of the associated contest https://codeforces.com/blog/entry/117468). My solution raises a STACKOVERFLOW error even though I did sys.setrecursionlimit(2 * 10**9). To pass it through I did my dfs iteratively (https://codeforces.com/contest/1843/submission/220728763) and it was quite a good exercise but I wondered if there was a reason why the recursionlimit in c++ can be set higher? I can't set my limit higher than 10^9 ish and it seems you can in c++? Does it have to do with memory management being less ideal in Python? I know there are some magic decorator functions @regal spoke invented to transform a recursive function into an iterative one but I just wanted to understand why there is typically more stack overflow problems in Python compared to C++. Writing my dfs iteratively is fun too so I'm just curious rather than stuck on a problem.

hexed aurora
hexed aurora
summer fossil
hexed aurora
#

Try running it locally.

summer fossil
# hexed aurora Idk. G4G is a pretty terrible site, and apparently they thought it was appropria...
#User function Template for python3

from typing import List

class Solution:    
    def eventualSafeNodes(self, v : int, adj : List[List[int]]) -> List[int]:
        vis = [0]*v
        pathvis = [0]*v
        check = [0]*v
        q = []
        def dfs(node,adj,vis,pathvis,check):
            vis[node]=1
            pathvis[node]=1
            check[node]=0
            
            for neb in adj[node]:
                if not vis[neb]:
                    if dfs(neb,adj,vis,pathvis,check):
                        check[node]=0
                        return True
                elif  pathvis[neb]:
                    check[node]=0
                    return True
            check[node] = 1
            pathvis[node] = 0
            return False
                
            
        for i in range(v):
            if not vis[i]:
                dfs(i,adj,vis,pathvis,check)
        
        for i in range(v):
            if check[i] == 1:
                q.append(i)
        return q```
#

My code

hexed aurora
# summer fossil I also don't like it!

They spread disinformation. I would be very careful taking information from that site. The people who write their articles often do not know what they're talking about.

#

Or, I guess it's "misinformation" since the mistakes are due to incompetence instead of out of malice.

stiff quartz
summer fossil
hexed aurora
summer fossil
#

I am learning about graphs from utube and that guy solves problem from gfg so i also solving problems in gfg and gfg is not great like leetcode

hexed aurora
summer fossil
#

i think i need to download input then i need to try

#

I will try tommorow

#

My friend! , U advertise on worng place

haughty mountain
#

the bigger issue is python stack frames being huge

regal spoke
# stiff quartz Hi, is there a reason why there is typically more stack overflow problems in Pyt...

As @haughty mountain said, Python's stack frames are huge, so recursion in Python is really memory hungry.

Another thing to note is that sys.setrecursionlimit(2 * 10**9) doesn't actually make the stack bigger, it just removes the check for the recursion limit.
If you want more stack memory on codeforces you need to abuse the threading module like this https://codeforces.com/blog/entry/118668#comment-1051846

#

But even then, the typical memory limit 256mb only works for about 1e5 recursive calls

#

In comparison, C++ can probably get to like 1e7

stiff quartz
regal spoke
loud agate
#

Hello! I have a question. I have an algorithm that gathers data on financial performance, keeps it and yields it to an AI. However, im facing problems feeding the ai bc the amount of data is too large. To solve this, I am trying with langchain models, but i dont know which one and how to use it. Does anyone know about it ? If not, do you know any alternatives?
langchain agents*

regal spoke
loud agate
#

no one awnsering😒

unkempt linden
#

Hello people. I hope this is the correct place to ask. How do i remove specific duplicate elements from a list? ex: I wanna remove 6 [1 , 5, 6 ,6 ,2 ,6 ] --> [1, 5 , 2]

regal spoke
unkempt linden
#

Yes. Nevermind just used list comprehension

regal spoke
#

Ye to remove all occurences I would use a list comprehension

#

But to just remove the first occurence, I would use .remove(x)

unkempt linden
fervent saddle
#

You want to remove a specific value if there's more than one of it? Or you just want to remove it no matter what?

unkempt linden
#

Yup and i exceeded the time limit lmao

summer fossil
unkempt linden
#

Top K frequent elements. That's the problem

summer fossil
#
return [ans[0] for ans in Counter(nums).most_common(k)]```
unkempt linden
#

Yeah didn't really understand how this solution works. Was gonna check youtube

#
class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        result = []
        for i in range(0,k):
            top = max(nums, key=nums.count)
            result.append(top)
            nums = [x for x in nums if x != top]
        return result
#

that's my code. Time limit exceeded on test case 19

regal spoke
summer fossil
#

and constraints is too big!

unkempt linden
#

Yeah i gotta work on my efficiency xD. I'll also watch a youtube video explaining how to understand my time complexity. Will update you guys when i'm all done!

regal spoke
#

Suppose that k = n and that all numbers in nums are unique. In that case your code practically runs two nested for loops of length n, resulting in around n^2 operations.

summer fossil
unkempt linden
regal spoke
#

The easiest trick to making an efficent solution is to use hashtables (aka dict in Python)

unkempt linden
#

I've finished a "Learn Python 3 from scratch" course last week and did around 6 leetcode problems to better understand what I learned. I'll be tackling the "Data Structures an Algorithms" course this week so I'll probably understand everything a lot better after it

regal spoke
#

The most basic solution to that type of problem starts with

counter = {}

# Initialize counter as 0
for x in nums:
  counter[x] = 0

# Count how many numbers you've got of each value
for x in nums:
  counter[x] += 1

Then to figure out the k most frequent values you need to somehow sort the counter. The way I would do this is ||sorted(counter, key = lambda x: -counter[x])[:k]||

summer fossil
#

we can also do only last part

regal spoke
#

!e

counter = {}
for x in [1,2,3]:
  counter[x] += 1
halcyon plankBOT
#

@regal spoke :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 3, in <module>
003 |     counter[x] += 1
004 |     ~~~~~~~^^^
005 | KeyError: 1
regal spoke
#

You need to somehow initialize counter[x] to 0 before you can do += to it

summer fossil
regal spoke
summer fossil
#

?

regal spoke
summer fossil
#

zero as a default?

unkempt linden
#

So basically create a dict, count and add the values, sort them and return the last k values?

regal spoke
summer fossil
#
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        a = defaultdict(int)
        for i in nums:
            a[i]+=1
        # print(a)
        
        sorted_freq = sorted(a.items(), key=lambda x: x[1], reverse=True)
        # print(sorted_freq)
        result = [item[0] for item in sorted_freq[:k]]
        return result
        ```
unkempt linden
#

sorted(counter, key = lambda x: -counter[x])[:k] do you mind explaining this? I'm like one step away from understanding it

summer fossil
#
return [ans[0] for ans in Counter(nums).most_common(k)]```
summer fossil
regal spoke
#

!e

A = [-10, 3, 2, 5]
B = sorted(A, key = lambda x: x*x)
print(B)
halcyon plankBOT
#

@regal spoke :white_check_mark: Your 3.11 eval job has completed with return code 0.

[2, 3, 5, -10]
regal spoke
#

Note how -10 is last in the result since (-10)^2 is larger than 2^2, 3^2 or 5^2.

#

@unkempt linden Did you understand my example?

unkempt linden
#

Yes understood the example. Now just wrapping my head around the first one

regal spoke
summer fossil
regal spoke
summer fossil
unkempt linden
regal spoke
#

So do you see the error in the code?

#

There is one small error

#

It has to do with using nums in the sorted expression

unkempt linden
#

Doesn't it have to be counter?

#

Wait im lost

regal spoke
#

!e

nums = [2, 3, 2, 1, 4]
k = 3

counter = {}

# Initialize counter as 0
for x in nums:
  counter[x] = 0

# Count how many numbers you've got of each value
for x in nums:
  counter[x] += 1

B = sorted(nums, key = lambda x: -counter[x])[:k]
print(B)
halcyon plankBOT
#

@regal spoke :white_check_mark: Your 3.11 eval job has completed with return code 0.

[2, 2, 3]
regal spoke
regal spoke
#

Just so you know, if you iterate over a dictionary in python then you iterate over the keys of the dictionary

unkempt linden
#

and when you sort it, you sort by keys too?

regal spoke
#

!e

nums = [2, 3, 2, 1, 4]
k = 3

counter = {}

# Initialize counter as 0
for x in nums:
  counter[x] = 0

# Count how many numbers you've got of each value
for x in nums:
  counter[x] += 1

for x in counter:
  print(x)
halcyon plankBOT
#

@regal spoke :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 2
002 | 3
003 | 1
004 | 4
regal spoke
unkempt linden
regal spoke
#

Each key is paired with a value. counter[2] is 2, counter[3] is 1, counter[1] is 1 and counter[4] is 1

#

when you iterate over a dictionary like this:

for x in counter:

then you iterate over the keys of the dictionary

#

Thats why it prints 2, 3, 1, 4

unkempt linden
#

Okay

#

But shouldnt there be 3 keys = to 1 and 1 key = to 2?

#

As there is only 2 2, and 1 1 3 4

regal spoke
unkempt linden
#

Yes but the output is different. Unless im not reading it correctly

regal spoke
#

the thing you index counter with is called the key

#

This is not to be confused with the sorted argument that also happens to be called key for a completely unrelated reason

unkempt linden
#

so technically you add 1 to the value of that specific key "x" with x being the number

regal spoke
#

Not sure I understand what you meant

unkempt linden
#

Let me go back to some dictionary lesson and I'll be back with you. I feel like I'm wasting your time 😭

regal spoke
#

@unkempt linden Maybe it is easier to understand what is going on if I use strings for the keys

#

!e

nums = ["banana", "fish", "banana", "pear", "berry"]
k = 3

counter = {}

# Initialize counter as 0
for x in nums:
  counter[x] = 0

# Count how many copies you've got of each string
for x in nums:
  counter[x] += 1

for x in counter:
  print(x)
halcyon plankBOT
#

@regal spoke :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | banana
002 | fish
003 | pear
004 | berry
regal spoke
#

!e

nums = ["banana", "fish", "banana", "pear", "berry"]
k = 3

counter = {}

# Initialize counter as 0
for x in nums:
  counter[x] = 0

# Count how many copies you've got of each string
for x in nums:
  counter[x] += 1

print(counter)
halcyon plankBOT
#

@regal spoke :white_check_mark: Your 3.11 eval job has completed with return code 0.

{'banana': 2, 'fish': 1, 'pear': 1, 'berry': 1}
unkempt linden
#

Yes understood everything now

#

I'm familiar with the second output

unkempt linden
regal spoke
#

The reason it is not correct is because the same number can appear multiple times in the output

unkempt linden
#

can we just transform it into a set?

regal spoke
#

But we could also just reuse counter

unkempt linden
#

But what is the original cause tho

regal spoke
unkempt linden
#

What makes it give the same number twice

regal spoke
#

Ahhh

#

When I said to use a set, I meant to use a set like this

#
sorted(set(nums), key = lambda x: -counter[x])
unkempt linden
#

It worked! Understood it all. Thank you so much and sorry for wasting your time lmao

fiery cosmos
#

might buy a giant whiteboard anyone actually practice like that

#

ive only done zoom interviews lol

dark aurora
#
CONST = 10       
def generator(x):
    #generator like this but the number of nested loops is controlled!
    num = 0
    for i in range(CONST):
        for j in range(i+1, CONST):
            for k in range(j+1, CONST):
                num = 2**i+2**j+2**k
                yield num

#

Is there any way to create a generator like this but I can type how many nested loops I want(x)?

regal spoke
#

So I don't really understand what you are trying to do

dark aurora
#

I don't really understand generators so sorry if I understood something incorrectly

regal spoke
#

One thing you could do is a recursive generator

CONST = 10       
def generator(x, base = 0, start = 0):
    if x == 0:
      yield base
    
    for i in range(start, CONST):
      yield from generator(x - 1, base + 2**i, i + 1)        
#

There is probably some way to do this using the itertools module too

dark aurora
#

Thank you so much

lilac cradle
#

anybody have an implementation of perlin noise without using numpy?

#

i have something but it's kinda.. meh

#

i 'translated' the C code from wikipedia and that worked, ig, but it looks kinda 'off'

#

something like this would be ideal, but idk how you'd add it into itself, i tried averaging but that just made it laggy and didn't make it look better

#

really, i think i just need a good 'gradient noise' function, that takes in coordinates as input

#

lemme try something dumb

#

i could do something with the x and y coords to make them a single number (bitshift, maybe?) and then get the hash of that

#

the number would have to be a float

#

since int hash is (usually) itself

#

does this look good?

#

i did some bullshit bitmath to do it

#
def random_gradient(ix, iy):
    a = ix
    b = iy
    n = (a*111000777) ^ (b*123123123)
    r = hash(n/16.123124123)
    v = vec2(math.cos(r),math.sin(r))
    return v

the numbers have no greater meaning, i just typed them out randomly lol

outer rain
lilac cradle
#

i translated most of the code from wikipedia's implementation of it in C

outer rain
#

I see

lilac cradle
#

the code is wacky rn, i know, i just wanna get it working properly first before i polish

shadow glen
#

!e

search_result = None
prompt_result = "Not found" if search_result is None else prompt_result
print(prompt_result)

search_result = "There is Data"
prompt_result = "Not found" if search_result is None else prompt_result
print(prompt_result)
halcyon plankBOT
#

@shadow glen :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | Not found
002 | Not found
shadow glen
#

okay why this algorithm is not working the way i entended to?

lilac cradle
#

you messed up your ternary

#

you're trying to give it the value of itself, while declaring it

#

what youd want is ```py
search_result = None
prompt_result = "Not found" if search_result is None else search_result
print(prompt_result)

search_result = "There is Data"
prompt_result = "Not found" if search_result is None else search_result
print(prompt_result)

shadow glen
#

oh i see

#

XD, my stupid misstake XD

lilac cradle
#

yeah, happens to the best of us

#

usually though, you'd get an error when doing that, unless you previously declared prompt_result

lilac cradle
outer rain
#

it certainly looks like noise

lilac cradle
#

well it seems fine now, i just would like a 'proper' perlin noise and idk if this is it

#

i would try simplex noise as well but simplex seems complex as all hell

#

and i think perlin should be fine since i'm doing 2d noise mainly anyway

#

lemme make a black/white version

#

the interpolation doesnt seem all that great but eh

outer rain
#

Can you also inrease the scale a little bit? it looks like white noise atm

outer rain
lilac cradle
#

i should do that

#

i was wondering why it was so dark, i forgot

outer rain
#

looks good to me

#

btw adding octaves is rather simple

#

you can just take like 4 instances of your noise with different seeds (which you might have to either implement properly or just hack in some way, like by shifting the noise my a large value)

#

then simply scale each subsequent noise up, and divide its value by something

#

like

noise(x, y) + noise(2 * x, 2 * y) * 0.5 + noise(4 * x, 4 * y) * 0.25 + noise(8 * x, 8 * y) * 0.125
lilac cradle
#

can i just change the scale of the input coordinates

#

ok yeah

#

ill try that

#

ok i used smootherstep, it looks a lot less jaggy

#

i literally just copied the equation from wikipedia

lilac cradle
outer rain
#

gorgeous

lilac cradle
#

there i did it iteratively instead of manually stacking them

#

i did it in terminal, it a bit laggy tho

lilac cradle
outer rain
#

that's cool

#

I wonder why it is so slow...

#

maybe just a lot of escapes?..

lilac cradle
#

it's the math being slow

#

i have realtime terminal rendering (at least for some smaller resolutions) for some things

tawny prairie
misty cargo
#

If I do a for-loop with a function, like for letter in "h e l l o".split():, the function (split in this case) only gets calls once, right? I want to make sure that this would run just as fast as making a variable x = "h e l l o".split() and then looping on x.

fervent saddle
#

Yeah, that part is only evaluated once.

iron turret
#

Hey, is slicing always a copy? For example I have a list of floats, the sliced data will be a copy?

someList = (ctypes.c_double * len(someListPtr))(*someListPtr)
newList = someList[:len(someList)]
#

I'm taking array from C-extension and I trying to sort out memory management (python manages own and C manages own)

lean walrus
#

For builting classes (list, tuple, str, bytes, ...) - always copy (except for memoryview, it creates new views)
For custom classes (including ctypes) - everything can happen, read corresponding documentation

lean walrus
spiral forum
#

If anyone is good with python selenium here and you have a moment please check my post on python help page, just posted it there ❀️

stuck quest
#

Yo guys REALLY need your help. I have this massive json response from an API Call and I need just 1 value "full_text". I converted the response into a python dictionary and used pprint but its just too much to use the conventional method for extracting a single value. PLEASE HELP!

#

Its so nested

stuck quest
#

sorry WDYM w bottleneck?

#

like whats the issue?

outer rain
#

Yeah

#

You say it is slow

#

What part is slow, exactly?

#

Or, maybe not slow, sorry, I assumed

#

Yeah, whatever, what's the problem?

stuck quest
#

I want to get the value β€žfull_textβ€œ so I can use the text of the tweet it pulled. The problem is that I cant extract the value from the response. Do you know a way of getting just the value β€žfull_textβ€œ from the respone?

stuck quest
#

I dont know how.

outer rain
#

Well, you say you can't "extract" it, whatever that is

#

Have you tried? @stuck quest

stuck quest
#

I want to for example create a function print("full_text")

outer rain
#

I don't know what you mean

stuck quest
#

I send a request to an API and get a json response which is unfiltered data. From this json response I want to print the value "full_text".

outer rain
#

print(json.loads(response)["full_text"])

#

I don't know which http client you are using, no which json library

outer rain
stuck quest
#

doesnt work

lean walrus
# stuck quest

You cant pass response to json functions πŸ—Ώ
Error message literally says that

lean walrus
stuck quest
#

im new to python as u can tell but trying my best. Sorry if it seems dumb. If I cant pass "response" to json functions, couldn't I create a json dictionary by json.loads and then pass that?

stuck quest
lean walrus
#

Please post full error traceback

stuck quest
#

found that online. In my case this is the whole problem ->the correct order. I cant order the whole json response because its so long.

stuck quest
lean walrus
#

Why dumps

#

That doesn't make sense

stuck quest
#

bcs json.dumps converts the json dictionary into strings, as the error told me was the reason why it couldnt read it.

stuck quest
stuck quest
lean walrus
halcyon escarp
#

Is best case time complexity in terms of Big-Oh and worst case is in terms of Big-Theta?

haughty mountain
#

no

#

O, Θ and Ω are three different measures of asymptotic growth

#

O is an upper bound, think ≀ but for asymptotics

#

Ξ© is a lower bound, think β‰₯ but for asymptotics

#

Θ is a tight bound, think = but for asymptotics

#

if f(x) is in both O(g(x)) and Ω(g(x)) then f(x) is also in Θ(g(x))

#

similar to how x β‰₯ y and x ≀ y means that x = y

#

these can be applied on anything

#

you can talk about both a lower and upper bound of the worst case if you want

#

or on the best case

#

or on the average case

#

or ...

lilac cradle
#

you guys know the collatz conjecture?

#

here's a little render of it i did, where each pixel's index is input to the function, and the output color is how many steps it took to go back down to 1

#
#====#
w,h = 2048,2048
MAX_STEP_COUNT = 2048
#====#
from PIL import Image
im = Image.new("RGB",(w,h))
px = im.load()

def collatz(x):
    n = 0
    while x>1 and n<MAX_STEP_COUNT:
        x = 3*x+1 if x&1 else x//2
        n+=1
    return n

def render():
    idx = 0
    for y in range(h):
        for x in range(w):
            n = collatz(idx)
            idx +=1
            px[x,y] = (n,n,n)
    im.save("collatz_render_test.png")

render()

MAX_STEP_COUNT is just a fallback to prevent infinite loops (though those shouldn't be an issue)

lean walrus
#

Beautiful

outer rain
#

you might want to try arranging the values in hilbert curve pattern

lilac cradle
outer rain
#

because right now the differences between rows are significatly greater than by columns, while this is not an issue with a hilber curves

#

it kind of turns the plane into unlerated squares, but locally, within each square, the numbers are close to each other

#

like here is how IP maps are layouted

lean walrus
#

differences between rows are significatly greater than by columns

Is that a problem? I think it is good, because it emphasizes some patterns

lean walrus
outer rain
#

like this gray square in the middle is 127.something, which is reseved for a local host

#

without hilber curve it would be like 4 pixel row

outer rain
lilac cradle
#

well it's a power of two either way but yeah

outer rain
#

rescaling the hilber curve simply takes one of the quadrants of the bigger map

lilac cradle
#

tbh im not sure how to do a hilbert curve other than the 'paste sections of it onto itself' which seems like a bitch to code

#

also if you guys like renders, i have plenty

lean walrus
#

x^y

lilac cradle
#

ye

#

but i did it on floats

#

the actual code for doing so is a little wacky but it works

#

i split the operands into float and int vars, converted the floats to an integer by multiplying them by 2**16 (could be some other number, i just chose 2**16 'just because', ig) and then casting to int, then doing the operation on the int and float sections of both, then recombining them

#
def xor_float(a,b):
    n = 10**16
    a_int,a_float = int(a),a%1
    b_int,b_float = int(b),b%1
    a_float,b_float = int(a_float*n),int(b_float*n)
    xor_int = a_int^b_int
    xor_float = (a_float^b_float)/n
    return xor_int+xor_float

the code could def be optimized but here

#

lemme generalize it to use an operator

jolly mortar
lilac cradle
#

oh, math.modf is a thing

#

however it returns two floats so i'd have to cast one to int

#

divmod would also work

halcyon escarp
#

How would I justify the in-correctness that n^3 is in O(n^2) ?

haughty mountain
vocal gorge
#

so, disprove it? well, start by writing out what would it mean (by definition) that n^3 is O(n^2).

haughty mountain
#

oh

#

yeah, just write out the definition

#

simplify a tad

#

let nβ†’βˆž

lilac cradle
haughty mountain
lilac cradle
#

oh also

haughty mountain
lilac cradle
#

can you even get those in python, or would you have to do that manually

lilac cradle
# lilac cradle oh also

doing this for rightshift/leftshift wont work, simply because of the sheer scale that would be involved, unless you have like 1.000000000001

halcyon escarp
lilac cradle
#
Traceback (most recent call last):
  File "c:\Users\β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ\Downloads\float_op.py", line 11, in <module>
    print(op_float(operator.lshift,2.12312,3.12355))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ\Downloads\float_op.py", line 8, in op_float
    op_float = op(a_float,b_float)/n
               ^^^^^^^^^^^^^^^^^^^
MemoryError
haughty mountain
# halcyon escarp wdym ->

the statement is true as n tends to 0, but what you care about is of course n tends to ∞ in this case

haughty mountain
haughty mountain
#

!d math.frexp @lilac cradle

halcyon plankBOT
#

math.frexp(x)```
Return the mantissa and exponent of *x* as the pair `(m, e)`. *m* is a float and *e* is an integer such that `x == m * 2**e` exactly. If *x* is zero, returns `(0.0, 0)`, otherwise `0.5 <= abs(m) < 1`. This is used to β€œpick apart” the internal representation of a float in a portable way.
lilac cradle
#

alos the issue with those would be that they aren't even nearly equivalent so it'd look funky, you could still do it but it'd be weird

lilac cradle
lean walrus
lilac cradle
#

why doesn't py have a proper sign function,btw

#

its pretty trivial to implement

#

it has 'copysign' but that's weird

halcyon escarp
lean walrus
#

Read big-O definition, and answer to your question will become clear

lilac cradle
#

clamp is another one that seems like it'd be trivial to implement, i usually copy it over as boilerplate in my scripts

vocal gorge
#

numpy has it :p

lilac cradle
#
def clamp(minimum, n, maximum):
    return max(minimum, min(maximum, n))

you could also do this with a ternary, but this is what i've always used

haughty mountain
#

yeah that

lilac cradle
#

ye

#

honestly the ternary is longer so having it be min and max seems the best way

haughty mountain
#

less fiddling with constants you usually don't care about πŸ™‚

lilac cradle
#
def clamp(mn,n,mx):
    return mn if n<mn else n if n<mx else mx

even with shortening 'minimum' and 'maximum' to 'mn' and 'mx' (which feels stinky anyway), it's longer

lilac cradle
#

n->inf means 'as n goes to infinity' afaik

outer rain
#

I have been forced into installing python on my windows machine 😑😑😑😑😑

halcyon escarp
lilac cradle
#

neat

lean walrus
#

Wow, thats cool

lilac cradle
#

hm, im gonna redo voronoi i think

haughty mountain
lilac cradle
#

oh god this code is stinky :(

lean walrus
#

I think it is impossible to write non-stinky voronoi code

halcyon escarp
lean walrus
#

Reread

halcyon escarp
#

done

lilac cradle
#

well its old, and for me that means its using camelcase which i originally used when doing python

#

for some reason

lean walrus
#

Some parts of python stdlib are stinky too

#

unittest uses camelCase and CamelCase everywhere

lilac cradle
#

☹️

lean walrus
#

tkinter, iirc, also uses camelCase

#

No, that is not true

#

tkinter stinks for another reason

halcyon escarp
#

So I understand big o, how are y’all saying i would justify it tho?

lilac cradle
#

aight lemme render and see if it works

vocal gorge
#

well, write down what it'd mean for that statement to be true (there exists a constant C>0 such that...), then show formally that it's not the case.

lilac cradle
#

took a bit to render (like a minute :(, i might multiprocess it and see if that helps), but here

#

the actual important function is this

def voronoi(points,pos):
    distances = [pos.distance(i) for i in points]
    return distances.index(min(distances))
#

i used my vector class for it which is where distance() comes from

lean walrus
#

I think everyone at some moment implements their own vector class

lilac cradle
#

real

#

ok one thing i find a bit odd is how multiprocessing requires you to have a main function (at least in some cases)

#

oh wow yeah that's way faster, that might just be because of how i'm doing the pixel stuff tho

#

for when i need speed, i process each line, then use putdata instead of just setting each pixel with px[x,y]

vocal gorge
lilac cradle
#

easier for me to work with my own code rather than do numpy, also pride

vocal gorge
#

scipy.spatial.distance.cdist: am I a joke to you? πŸ˜”

lilac cradle
#

ok it fucked up and is out of order, however it kinda looks cool

#

also iirc scipy makes a script take a damn while to start up

vocal gorge
#

sure, numpy+scipy is like a second

lilac cradle
#

im not sure why its fucked up honestly

#

the obvious answer is 'it's out of order' but it shouldn't be

lean walrus
#

Hwo many processes are you using? 6?

lilac cradle
#

i was yeah, tho the amount doesnt change it

#

ok yeah i made each line print what y value it got, it's out of order

#

im using imap, which i was under the impression that that would be ordered

#

maybe i should just use map()

#

and yet it seems to be fine for my other scripts

#

which is hte odd part

#

this is also using multiprocessing.Pool().imap()

#

lemme check through some stuff maybe i fucked up something else

#

very odd

#

lemme just try it without multiprocessing

#

yeah i think the speed improvement was using putdata and not doing multiprocessing

#

...maybe? im not sure

opal oriole
lilac cradle
#

neat

#

im gonna do a 4k render now and its gonna take a damn while so i'll just let it run in the background

opal oriole
lilac cradle
#

ill check it out sometime then

modern gulch
lilac cradle
#

renders

#

that one is the julia set, or at least a slice of it (it's a 4d fractal, afaik)

lean walrus
#

Imagining point in 4d is not hard, but i cant imagine 2d of 4d space

#

Even 1d is very hard to imagine

lilac cradle
#

the set has the complex plane, and then another one as a parameter

#

so you give it two sets of 2d coords

halcyon escarp
#

On the left is proving the function in the top left is in O(n^2) and on the right it’s proving it’s in omega(n^2) would you say these are valid justifications? Sorry if a lil messy, ask if any confusion

regal spoke
#

So like 7.6 n^2 + 6.2 n - 3.2 = n^2 (7.6 + 6.2/n - 3.2/n^2)

#

Note that 6.2/n -> 0 and 3.2/n^2 -> 0 when n -> inf

halcyon escarp
regal spoke
#

ehm yes, read the hand writing wrong

halcyon escarp
#

no worries

#

just wanted to confirm

regal spoke
#

Once you have n^2 (7.6 + 6.2/n - 3.2/n^2).
You can easily see that there exists some n_0 such that

#

7.5 n^2 <= n^2 (7.6 + 6.2/n - 3.2/n^2) <= 7.7 n^2, for all n >= n_0

halcyon escarp
#

is the 1.5 and 1.7 representing C

regal spoke
#

The lower bounding and upper bounding c's, ye

halcyon escarp
#

1.5 for omega

regal spoke
#

Yes

#

I mean picking any constant below 7.6 works for Omega

halcyon escarp
#

should it be 7.5 and 7.7

regal spoke
halcyon escarp
#

Haha its okay I just didn't wanna get confused haha

regal spoke
#

Okey now I've changed all the 1.6 to 7.6

halcyon escarp
regal spoke
#

Yes

regal spoke
#

You know that 6.2/n -> 0 and 3.2/n^2 -> 0 when n -> inf

#

This means that by picking n_0 large enough you can make 6.2/n and 3.2/n^2 become arbitrarily small for all n >= n_0

halcyon escarp
#

I see that makes it more understandable thank you so much

regal spoke
#

Btw one last thing. You wrote that 7.6 n^2 + 6.2 n - 3.2 is in O(n^2). While this is what makes sense, it is not how people normally write it.

halcyon escarp
#

Yeah I'm not sure how to do the symbol

regal spoke
#

The standard is to write
7.6 n^2 + 6.2 n - 3.2 = O(n^2)

#

using =

halcyon escarp
#

oh

regal spoke
#

It doesn't really make sense to use = here, but everyone writes it with =

halcyon escarp
#

Yeah, so for Big-Oh we can't satisfy it with C = 7.7 and n_0 = 1

regal spoke
#

Thing is, you dont need to find the smallest C. Any C works

halcyon escarp
#

I don't think it works cause you get 10.8 <= 7.7

regal spoke
halcyon escarp
#

n_0 for sure does tho = 2

regal spoke
#

My point is just that you can pick C and n_0 to whatever you want

halcyon escarp
#

Yeah, it makes sense. Just seems like such a weird way to prove it

regal spoke
#

So pick something like C = 7.8, or maybe C = 1000000000000

#

Picking C = 7.7 is asking for trouble

halcyon escarp
#

Haha okay thank ya

#

Wanna help me with one other thing?

#

f (n) + g(n) ∈ O(max(f (n), g(n))) I need to justify the correctness of this statement

#

But, I'm not sure what max means

#

f(n), g(n) are both asymptotically positive functions

regal spoke
#

max is just maximum, as in pick the largest of 2 numbers

halcyon escarp
#

so the largest of f(n) or g(n)

regal spoke
#

If f(n) and g(n) are both positive, then 1/2 * (f(n) + g(n)) <= max(f(n), g(n)) <= f(n) + g(n)

halcyon escarp
#

So it should be false?

regal spoke
regal spoke
#

which clearly is <= max(f(n), g(n))

halcyon escarp
#

Why are we taking the mean value πŸ˜†

regal spoke
#

Look here

#

1/2 * (f(n) + g(n)) <= max(f(n), g(n))

#

is the same as
f(n) + g(n) <= 2 * max(f(n), g(n))

#

This means you can take C = 2

halcyon escarp
#

Oh I see I didn't know you were using C = 2 I see

regal spoke
#

The point of the big Oh notation is to compare differently growing functions.

halcyon escarp
#

I see

#

@regal spoke i’m looking back what was the point of factoring out the n^2

regal spoke
#

Like when n becomes large, what in the expression is it that dominates

#

By factorizing out n^2 from 7.6 n^2 + 6.2 n - 3.2

#

you see that 7.6 n^2 + 6.2 n - 3.2 = n^2 (7.6 + 6.2/n - 3.2/n^2)

#

from this it is very clear what will happen when n is large

#

6.2/n will tend to 0, and 3.2/n^2 will tend to 0

halcyon escarp
#

Does it do anything besides give us a better understanding of what is dominating?

regal spoke
#

Once you've factorized out the n^2 you can do lots of stuff

#

For example, note that:

6.2/n <= 6.2 if n >= 1
-3.2/n^2 <= 0

#

So when n >= 1
n^2 (7.6 + 6.2/n - 3.2/n^2) <= n^2 (7.6 + 6.2) = 13.8 n^2

#

So you could have used C = 13.8 and n_0 = 1

halcyon escarp
#

Okay

sterile magnet
#

hi ppl

iron turret
#

how could I simplify this: x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4, x can only be +0.5 or -0.5, y are known. I need to solve x variables

#

I could try all the x1, x2, x3, x4 with different +0.5 or -0.5, so it becomes a 2^4 problem, but if there are many many variables that needs to be solved

#

and I only need an approximation

#

calculating this with 24 cores takes already like 13s for 2^30 problem

haughty mountain
#

@regal spoke this reads like it's close to your area of expertise

outer rain
#

If you substitute x_i with z_i = x_i + 0.5 you get z_i which can either be 0 or 1, which is similar to how you can either take or not take the item of weight y_i

regal spoke
iron turret
#

what do you mean by substitute x_i with z_i

#

I'm familiar with 0/1 knapsack problem, but I think on my case it's minimization problem

iron turret
#

Obviously there's multiple solutions

regal spoke
iron turret
#

I need to guess plausible values for x1...xn

#

so it can be x1=0.5, x2=-0.5

regal spoke
#

I'm still really confused

iron turret
#

let me rephrase:

x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4 = z

y and z is known numbers. I need to solve x variables. and x can only be either +0.5 or -0.5

regal spoke
#

Okey, you didnt have z before which was confusing me

regal spoke
iron turret
#

ahh nice, thanks for the tip

outer rain
#

GOD FKING DAMMIT HOW I MISS THE MOST OBVIOUS REDUCTIONS AGAIN 😑

regal spoke
# iron turret ahh nice, thanks for the tip

A simple approach to solving it is to split up
x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4
into two halves
x1 * y1 + x2 * y2 and x3 * y3 + x4 * y4
Then compute all of the 2^(n/2) assignments of the first half and second half seperately.

#

Using a hash table you can check in O(2^(n/2)) if it is possible to hit z.

iron turret
#

hmm hash table to check remainder is quite nice idea

#

but I don't know if it will become O^n again

regal spoke
trim lake
#

Hello

I'm trying to figure out how to make a calculator to figure out the following
I have mixed items for ex:
item1 = 10 A, 50 B, 4 C; item2 = 400 B, 120C; ... Qty1= 16000 C; Qty2 = 50000 B;
what kind of method I can use to figure out which items how many of them I need to satisfy each quantity;
Thanks

vocal gorge
trim lake
iron turret
#

I have competition!

sterile oak
#

why is [2,0,1,3] a possibility

sterile oak
vocal gorge
#

the adjacency list of vertex 2 is just [0,1,3]

#

why they draw it like this, no idea.

blazing flume
#

Question,

#

How many of you have graphics tablets?

tawny prairie
#

in imagination

blazing flume
#

I'm considering buying one to help me study linear algebra and graph theory but I'm not sure if its worth it

#

Since I've got through like 200 sheets of paper in the last day or so

tawny prairie
#

so you asking people who may or may not have one.. thats research.

#

sorry ill take you seriously

#

umm well

blazing flume
#

no worries

#

If you didn't have one that would still be a significant finding to me because you have managed to learn ML without using a graphics tablet

#

so I'd be more inclined to think that I don't need one

tawny prairie
#

this is an achievement? i didn't know that graphic tablets where necessary to do ML?

#

hold on, let me ask ChatGpt if this is true.

blazing flume
#

Sorry I must step out for a moment I'll be back later.

tawny prairie
#

im kidding

iron turret
#

Whats a graphic tablet

#

iPad?

fiery cosmos
#

I don't know if this is the right channel...

Is there a best way to understand the logical structure of an algorithm? I have been trying to grasp the logic behind insertion sort, but I am very frustrated that it is taking me so long to understand. Plus, I might even forget how to construct it in the next day. I've been used to bubble sort but I think using insertion sort is much better?

Do you know of any methods to help me understand the logic behind complex algorithms? I feel like I am learning the wrong way.

regal spoke
fiery cosmos
#

I just wanna understand the logical structure of ANY algorithm (like the insertion sort). It's like you don't need to memorize cause it just makes sense. You know what i mean? It helps me remember how to write the code and there's no need for memorization cause u already understand its logic.

regal spoke
sterile oak
vocal gorge
#

yup

#

it's meant to be a linked list, hence the arrow, but why the first node is included, who knows

sterile oak
#

oh alright

vocal gorge
#

also, nobody uses linked lists :p

sterile oak
#

hahaha

#

thanks for the help tho

regal spoke
#

Each pass in bubble sort is one bubble traversing through the list

#

Insert sort on the other hand is sorting elements by inserting them one at a time in sorted order

fiery cosmos
regal spoke
#

Most of that is just a matter of experience

fiery cosmos
vocal gorge
#

insertion sort involves insertions - each time select the first element that's not included yet, find the location where it should be in the sorted part, insert it there.
bubble sort involves swaps of adjacent elements - you "move" the element until it's at the right position. I'd say it's somewhat easier to mess up.

#

selection sort is even easier than insertion: each time, you append the smallest of the unsorted elements to the list.

haughty mountain
#

bubble sort
go through all adjacent pairs and swap if not in the correct order, n times

#

merge sort and quick have similarly simple ideas

#

the kernel of truth of merge sort
"I don't know how to sort, but I know how to merge two sorted lists"

#

!e silly insertion sort

def insertion_sort(seq):
  return [seq.pop(seq.index(min(seq))) for _ in range(len(seq))]

print(insertion_sort([4, 5, 1, 2, 3, 2]))
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.

[1, 2, 2, 3, 4, 5]
haughty mountain
#

neat, it even works first try

wheat gorge
#

Hey guys! Could anyone help me with this please? I have a list of tuples where my_list = [(2,40),(6,20),...]. I am plotting those points on a graph. There are a lot of points on the graph and I am trying to figure out a way to make a horizontal line where these points are clumped.

I tried doing this with rounding the values, but I want it to be fully automatic where I wont have to manually put a value in. Any ideas?

narrow mica
wheat gorge
#

Lets say there are these points: [(3,45), (6, 46), (9, 20), (18, 45), (19, 66)]. I want to put a horizontal line around y=45

#

I want to include (6,46) in the "clump" because it is very close to 45, the line would be y=45.33 ((45+45+46)/3)

narrow mica
#

Well for starters, you may want to iron out your definition of points that "are close"; For instance, is 46 close to 45? what about 47? 48? at which point does some number a stop being close to another number b?

wheat gorge
#

I wanted to make a formula for that but was unsure how.

#

with the range of the data I can figure out how close the numbers are, but other than that I am clueless

#

But as I add numbers, the area it would search would get smaller if that makes sense

#

So lets say I have two 45s, I would search just slightly above and below (+- 1) to see if there are anymore points based off the range

#

then when it finds 46, it searches above and below again but much smaller (+- .5)

narrow mica
#

That's a bit complicated imo, why not use a static range?

wheat gorge
#

Because I am going to use more than one data set. some data will range from 40-120 while others range from 38.16830- 39.439691

#

Using a static range would not work in that case

narrow mica
#

Then, what if you calculated the "static" range using the known minimum and maximum value of your data?
Say for instance without particular reason, range = (max - min) / 10
Then for your 40-120 dataset, range = 8
And for your 38.16830-39.439691 dataset, range = 0.1271

wheat gorge
#

I see

narrow mica
#

Maybe also apply some techniques like excluding the outliers to get a better range

wheat gorge
#

Ok, I tried that but it wasnt as accurate as I wanted.

data = [(2, 159), (3, 169), (5, 203)]
zone_storage = []

# Find Range
max_ds = max([x[1] for x in data])
min_ds = min([x[1] for x in data])
rng = max_ds - min_ds

number_of_areas = 50
search_area = rng/number_of_areas

top_value = 0
for i in range(number_of_areas):
    top_value += i
    bottom_value = top_value - i
    for d in data:
        if d[1] > bottom_value and d[1] <= top_value:
            zone_storage.append((i, top_value, bottom_value))
narrow mica
#

If you decide to go with this, I'll describe an O(nlogn) algorithm to create the lines

wheat gorge
wheat gorge
narrow mica
#

First sort the data by y, then iterate over the points
For each point (u, v), check how many points are within y = (v, v + 2*range); if the amount exceeds some threshhold, then make a line at v + range
This will make multiple close lines sometimes, so you need to do some extra work to prevent that

narrow mica
#

Try numpy.arange which can be floats

#

Maybe something like this can be more accurate/intuitive?

import numpy as np
data = [(2, 159), (3, 169), (5, 203)]
zone_storage = []

# Find Range
max_ds = max([x[1] for x in data])
min_ds = min([x[1] for x in data])

steps = 100
for line_y in np.arange(min_ds, max_ds, (max_ds - min_ds) / steps):
  # test if points are clustered around y = line_y
wheat gorge
#

ok so that passes the max, min, and range/steps right?

narrow mica
#

You can think of numpy.arange as just range but you can have floats

wheat gorge
#

oh, so why pass 3 values to range (np.arrange)?

narrow mica
#

There's also numpy.linspace which... is probably the better function to use in this case now that I think about it

for line_y in np.linspace(min_ds, max_ds, steps):
  ...
narrow mica
#

Higher steps values would mean better accuracy in this case, though you may also get multiple line_y values that are very close to each other but all pass the "if points are clustered" test, so you have to deal with that

wheat gorge
#

Hmm, I see

#

How would I do it if I wanted to round the values?

narrow mica
#

Round what values? line_y?
One way to do it is to first calculate the valid line_ys, then apply the round after all other calculations are done

wheat gorge
#

I want to round the initial data set so some of them are the same, then I could just use a simple .count to find them

narrow mica
#

If your data looks like what you wrote

data = [(2, 159), (3, 169), (5, 203)]
``` Then a simple list comprehension will do the trick
```py
data = [(2, 159), (3, 169), (5, 203)]
data = [(x, round(y)) for x, y in data]
wheat gorge
#

I did it like this:

data_round = [round(x[1],y) for x in data]

Could you explain why you did it like that?

narrow mica
#

Well first of all yours will throw an error as y isn't defined
Second, you could do this instead:

data = [(v[0], round(v[1])) for v in data]
``` But I find the former code more readable
wheat gorge
#

Oh I see because of the tuple

#

I would set y to a value based off the range of the data

narrow mica
#

I see... I suggest then to change y to something else to reflect its actual meaning

wheat gorge
#

Ok

narrow mica
#

And I guess you don't care about the x values, so omitting it is fine

wheat gorge
#
round_place = 1
data_round = [(x, round(y,round_place)) for x,y in data]
#

ok sweet

#

So now finding round_place...

#

I can take the range and just do some formula I am sure

#

Ok thank you very much with your help

#

Also do you have any tips for me?

#

Anything I can learn

lapis hill
#

anyone have a good course they can recommend for algos and data structures?

fiery cosmos
# lapis hill anyone have a good course they can recommend for algos and data structures?
clear eagle
#

The instruction is convert roman to integer

class Solution:
    def romanToInt(self, s: str) -> int:
        # s = "MCMXCIV"
        rom_int_val = {
            "I" : 1,
            "V" : 5,
            "X" : 10,
            "L" : 50,
            "C" : 100,
            "D" : 500,
            "M" : 1000,
        }

        if len(s) == 1:
            return rom_int_val.get(s)
        else:
            result = 0
            index = 0
            while index < len(s):
                # if value of current roman < val of preceding roman
                # subtract the value then add to result and increment index by 2
                if rom_int_val.get(s[index]) < rom_int_val.get(s[index+1]):
                    result = rom_int_val.get(s[index+1]) - rom_int_val.get(s[index]) + result
                    index += 2
                else:
                    result += rom_int_val.get(s[index])
                    index += 1

        return result

the error is string index out of range in

 if rom_int_val.get(s[index]) < rom_int_val.get(s[index+1]):
#

in what case will s[index] be out of range? my While loop makes sure that index < length of string

#

this is a problem i found from leetcode easy category

ripe anvil
#

Probably in yhe last case where s[index+1]

ripe anvil
#

Say len(s) == 2
while index < 2
Index==1
Then s[index+1] will be [2]
then that will be out of range since [1] is the last index.

#

@clear eagle

clear eagle
#

rip me thank you

#

i've fixed it but my loop looks like this now

            while index < len(s):
                if index < len(s) -1: 
                    if rom_int_val.get(s[index]) < rom_int_val.get(s[index+1]):
                        result = rom_int_val.get(s[index+1]) - rom_int_val.get(s[index]) + result
                        index += 2
                    else:
                        result += rom_int_val.get(s[index])
                        index += 1
                else:
                    result += rom_int_val.get(s[index])
                    index += 1

should i change it to

            while index < len(s):
                if index < len(s) -1: 
                    if rom_int_val.get(s[index]) < rom_int_val.get(s[index+1]):
                        result = rom_int_val.get(s[index+1]) - rom_int_val.get(s[index]) + result
                        index += 2
                        continue

                result += rom_int_val.get(s[index])
                index += 1
#

since im having two else block which basically does the same thing

#

both works im just curious which one is preferable

lilac cradle
#

can you get the angle of a vector in any dimension?

#

like atan2

haughty mountain
lilac cradle
#

like its phase

#

angle to the origin

haughty mountain
#

you mean the angle to the (1, 0, 0, ...) vector?

lilac cradle
#

i think so, yes
like the phase of (5,5) would be 45 degrees afaik

haughty mountain
#

you can get an angle through a dot product

robust needle
#

Does the n-queens have an optimisation version

#

or is it simply limited to the decision version: Given an n * n board is it possible to place n queens on it

haughty mountain
#

it's trivial that we can't do more than n

#

and we have constructions for n >= 4

granite venture
#

how to solve this questions ?

ornate tendon
#

I think this is the appropriate channel to ask thisβ€”

I’m having trouble understanding uses for generator and how prevalent it is used, especially in the analytics space? I understand it’s a low memory and higher latency method, but I cannot understand it’s use case if the output needs to be called (which I understand is slow when generator is used)

hot shell
#

Question: a student is studying for an exam. In his book, each chapter contains pages number of pages. On each day, he can study at most X pages or till the end of the chapter, whichever is minimum. Given the number of days in which he has to complete studying, find minimum value of X.

(Given pages in each chapter and number of days, find minimum value of X)

#

@plush owl ^

plush owl
#

Find the minimum value of X?

hot shell
#

Yes, minimum value of X

plush owl
#

Given an X, it's easy to find whether he can complete the book or not. This can be done in O(n)

#

The value of X can range from 1 to maximum number of pages in a chapter

#

With these limits (1 to max_pages), we can binary search for X

hot shell
#

Oh pithink

plush owl
#

Overall time: n log(n)

Edit: nlog(pages)

covert thorn
#

is either the number of chapters or the total number of pages given?

hot shell
hot shell
plush owl
#

You're right

hot shell
#

Constraints: (iirc)
max 10^4 chapters
Max 10^9 pages per chapter

#

I forgot the constraints, let me check if it's available online

haughty mountain
#

binary search for the answer is a good approach yeah

haughty mountain
#

O(n_chapters log(answer))

#

where answer is for sure ≀ max(pages)

vocal gorge
#

chapters have different page counts? if so, yeah, binary search would be straightforward

haughty mountain
#

if they have the same count the problem is trivial

#

and can be done in O(1)

strange stone
#

Hey I am learning machine learning in python, using tensor flow and scikit learn. How to get a job as a beginner?

midnight bobcat
#

Any good books to learn data structures and algorithms?

ornate tendon
soft apex
#

even read books in java or other languages

#

and find whats universl amognst htem and then create your own notes

#

some peoples writing styles will not fit with you some will

midnight bobcat
#

And then learn Python?

#

Seeing that Python hides a lot especially when it comes to arrays?

#

Sounds like I should learn the subject first than the language

little plaza
#

I'm new to this and I'm learning the beginning yet

regal spoke
#

It is better to post your code in discord like this instead of using screenshots:

```py
print('Hello, World!')
```

#

It will look like this

print('Hello, World!')
#

As for your question. If you want to do a finite loop then use

for i in range(nota):
  print('executing')
soft apex
#

learning the subject first is good

#

but you should cross-refrence that wtih codeexmaples

#

also geeksforgeeks is really really good

olive socket
regal spoke
olive socket
#

18 mb for the set 15 for the trie

regal spoke
olive socket
#

Yes, ive done that. Implemented suffix sharing to avoid recreating identical nodes, chain compression to reduce the length of nodes

#

Im curious what other optimizations i could make

#

All mine are based in mathematical theory rather than computation

regal spoke
#

Are you actually sure that you are calculating the memory used correctly?

olive socket
#

Ive attached the code you should be able to check, but yea i think so

regal spoke
#

Python code in general is pretty memory inefficient.

olive socket
#

Im aware; however im more familiar with python's AI toolkit and cant be bothered to learn how to remake this in C++

regal spoke
#

A typical word should be like 10 characters long, meaning roughly 10 bytes of data

#

I don't know how you could possibly hope compressing that

#

with oob in Python

olive socket
#

Theres 387,871 unique characters across 172,820 words

#

Able to reduce that to 213,653 nodes using graph theory.

regal spoke
#

Each one of those characters should cost roughly 1 byte, but each node should cost signlificantly more than that

olive socket
#

Correction: 1,570508 characters across 172k words.

#

So abt 9 letters per word

#

And Ive validated it, every word is in the trie and words that arent supposed to be in it arent in it

#

No loss

regal spoke
#

My guess is that you are not calculating the memory usage correctly

olive socket
#
def main():
    trie = Trie()
    words = load_words("word_list.txt")
    mega_words = load_words("words_alpha.txt")

    print(f"Size of native Python word set: {asizeof(words) / 1024 / 1024} MB")

    for word in words:
        trie.insert(word)

    trie.validate(mega_words, words)

    old_size = display_trie_info(trie, "before")
    trie.compress()
    trie.validate(mega_words, words)
    new_size = display_trie_info(trie, "after")

    print(f"Size reduction compared to old trie: {100 - 100 * new_size / old_size}%")
    print(f"Size reduction compared to native Python set: {100 - 100 * new_size / (asizeof(words) / 1024 / 1024)}%")


def display_trie_info(trie, when):
    print(f"Total nodes {when} compression: {trie.get_total_nodes()}")
    print(f"Total words {when} compression: {trie.get_total_words()}")
    print(f"Total end nodes {when} compression: {trie.get_end_nodes()}")
    size = asizeof(trie) / 1024 / 1024  
    print(f"Size of trie {when} compression: {size} MB")
    return size``` This is how im computing it
#

I know the built in way of doing it doesnt do a deep search but this should

regal spoke
#

What is "mega_words"?

olive socket
#

A word list based off the dictionary

#

More comprehensive so i use it to test that words that doesnt exist arent being included