#algos-and-data-structs | Python | Page 128

winged plover Sep 1, 2021, 4:41 AM

#

you can do like
lineas = [linea.strip() for linea in f1.readlines()[-n:]]
and then write all the lines from lineas

bright halo Sep 1, 2021, 4:55 AM

#

winged plover you can do like `lineas = [linea.strip() for linea in f1.readlines()[-n:]]` and...

I tried to implement that in this code but I get error

n = 5

with open("file_text_chat.txt","r+") as f1:
    #lineas = [linea.strip() for linea in f1.readlines()]
    lineas = [linea.strip for linea in f1.readlines[-n:]]

    with open("file_only_last_lines.txt","r+") as f2:
        if palabra not in lineas:
            num_linea = lineas.index(palabra)
            f2.write(f"{palabra}\n")

I get this error message

Traceback (most recent call last):
  File "n_ultimas_lineas.py", line 5, in <module>
    lineas = [linea.strip for linea in f1.readlines[-n:]]
TypeError: 'builtin_function_or_method' object is not subscriptable

#

What should I modify?

daring delta Sep 1, 2021, 5:04 AM

#

anyone here can help me with coding problems tomorrow 8-10 am??
[9:29 AM]
coding questions like dynamic programming ,graph and trees

winged plover Sep 1, 2021, 5:15 AM

#

bright halo I tried to implement that in this code but I get error ```python n = 5 with op...

Oh dangit.. I forgot the parenthesis there
f1.readlines()[-n:]
My bad

bright halo Sep 1, 2021, 5:31 AM

#

winged plover Oh dangit.. I forgot the parenthesis there `f1.readlines()[-n:]` My bad

I try with that code but it doesn't give me the correct lines

n = 5

with open("file_text_chat.txt","r+") as f1:
    #lineas = [linea.strip() for linea in f1.readlines()]
    lineas = [linea.strip for linea in f1.readlines()[-n:]]
    print(lineas)

    with open("file_only_last_lines.txt","r+") as f2:
        f2.write(f"{lineas}\n")

This code write that:

[<built-in method strip of str object at 0x0000024EEED9ECB0>, <built-in method strip of str object at 0x0000024EEED59D30>, <built-in method strip of str object at 0x0000024EEEDE69F0>, <built-in method strip of str object at 0x0000024EEECB7E90>, <built-in method strip of str object at 0x0000024EEEE03480>]

rancid hound Sep 1, 2021, 5:34 AM

#

strip is a method, so it has to be called
Use strip with parentheses: "strip()"

#

Hope this helps:
https://www.google.com/amp/s/www.geeksforgeeks.org/python-reading-last-n-lines-of-a-file/amp/

GeeksforGeeks

Python - Reading last N lines of a file - GeeksforGeeks

A Computer Science portal for geeks. It contains well written, well thought and well explained computer science and programming articles, quizzes and practice/competitive programming/company interview Questions.

winged plover Sep 1, 2021, 5:51 AM

#

bright halo I try with that code but it doesn't give me the correct lines ```python n = 5 ...

oops- i forgot the parenthesis around strip too
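For reference, the fix discussed above (calling both `readlines()` and `strip()`) can be sketched as follows. This is a minimal sketch, not the code anyone in the thread ran: the helper names `last_lines` and `copy_last_lines` are made up for illustration, `collections.deque` is swapped in for `readlines()[-n:]` so the whole file is not held in memory, only the trailing newline is removed rather than all surrounding whitespace, and the output file is opened with `"w"` instead of `"r+"` on the assumption that it should be overwritten.

```python
from collections import deque


def last_lines(path, n):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, "r", encoding="utf-8") as f:
        # deque(maxlen=n) keeps only the n most recent lines while iterating.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


def copy_last_lines(src, dst, n):
    """Write the last n lines of src to dst, one per line (hypothetical helper)."""
    lines = last_lines(src, n)
    # "w" creates or truncates dst; this is an assumption about the intent.
    with open(dst, "w", encoding="utf-8") as out:
        out.writelines(f"{line}\n" for line in lines)
    return lines
```

With the file names from the question, usage would be `copy_last_lines("file_text_chat.txt", "file_only_last_lines.txt", 5)`.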

lean dome Sep 1, 2021, 5:55 AM

#

daring delta anyone here can help me with coding problems tomorrow 8-10 am?? [9:29 AM] coding...

what timezone?

daring delta Sep 1, 2021, 5:55 AM

#

India Standard Time
Time zone in India (GMT+5:30)

lean dome Sep 1, 2021, 5:56 AM

#

I won't be able to help then, but I can help you with them now

#

What are the problems?

bright halo Sep 1, 2021, 5:58 AM

#

@winged plover thank you very much it worked for me 😄

#

@rancid hound Thank you very much, I tried the methods on that page and they also worked for me

fluid prism Sep 1, 2021, 5:58 AM

#

!decorator

eager cliff Sep 1, 2021, 6:15 AM

#

lean dome What are the problems?

sounds like someone looking for help in an exam...

daring delta Sep 1, 2021, 6:17 AM

#

lean dome I won't be able to help then, but I can help you with them now

will let u know soon tysm

lean dome Sep 1, 2021, 6:41 AM

#

daring delta will let u know soon tysm

What's it for? As I've already said, if it's for an exam or some other thing where you'll be compared against other people, we will not help with it here.

daring delta Sep 1, 2021, 6:56 AM

#

yeah i know

daring delta Sep 1, 2021, 6:56 AM

#

lean dome What's it for? As I've already said, if it's for an exam or some other thing whe...

it is related to web scraping

#

now good?

brave oak Sep 1, 2021, 7:14 AM

#

daring delta anyone here can help me with coding problems tomorrow 8-10 am?? [9:29 AM] coding...

indeed, it sounds like you want help with some sort of graded assignment

daring delta Sep 1, 2021, 7:28 AM

#

hmm

lean dome Sep 1, 2021, 7:34 AM

#

daring delta it is related to web scraping

why is the time window so short, if it isn't some sort of graded assignment?

daring delta Sep 1, 2021, 7:42 AM

#

u can take whole day if u want

#

i was available at that time na

stable pecan Sep 1, 2021, 8:24 AM

#

@floral lintel let's not spam #internals-and-peps which is a channel about python and not about algorithms

#

i can show you a solver i wrote with numpy --- maybe the structure of it will make sense anyway:

def possible(y, x):
    """
    Return array of possible digits for location (y, x).
    """
    row = grid[y]
    column = grid[:, x]

    gy = y - y % 3
    gx = x - x % 3
    subgrid = grid[gy: gy + 3, gx: gx + 3]

    return np.argwhere(
        ~np.isin(N, row)
        & ~np.isin(N, column)
        & ~np.isin(N, subgrid)
    )

def solve(where_empty=None):
    """
    Sudoku solver with backtracking.
    """
    if where_empty is None:
        # Find coordinates of cells that are empty.
        where_empty = np.argwhere(grid == 0)

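    if len(where_empty) == 0:
        print(grid)
        return True  # Indicate grid is solved.

    y, x = where_empty[0]

    for n in possible(y, x):
        grid[y, x] = n

        if solve(where_empty=where_empty[1:]):
            return True  # Propagate success so callers stop backtracking.

        grid[y, x] = 0  # Undo the guess before trying the next digit.

    return False  # No digit fits here; the caller must backtrack.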

#

This doesn't keep backtracking once the grid is solved, which i think is the issue you're having.

#

It returns a sentinel value to indicate it has a solved grid (True, in this case)

floral lintel Sep 1, 2021, 8:30 AM

#

oh, wow. That is a very different way of solving my issue.

#

I gave up trying to get it to return a value, but this could work very well

mint jewel Sep 1, 2021, 8:47 AM

#

given an infinite DAG where each edge has a weight 1, is there a better algorithm than just breadth-first search to find the shortest path to a point?

#

paths can be assumed finite

#

hmm, in my problem I can find an upper bound on the path

#

so IG it is not an infinite DAG and therefore I can just use normal graph search algorithms

stable pecan Sep 1, 2021, 8:52 AM

#

you can have an infinite dag where a path between any two points is always finite

#

kinda trivially

#

but i don't think there's a fast algorithm for finding paths unless there's some nice property of your dag
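To make the baseline concrete, breadth-first search on an implicitly defined graph (one where neighbours are generated on demand, so the graph may be infinite) can be sketched as below. This is only an illustration: `shortest_path_length`, the `max_depth` parameter, and the example graph on the integers are assumptions, not anything from the question. The `max_depth` bound corresponds to the upper bound on path length mentioned above; without it, the search never terminates on an infinite graph if the goal is unreachable.

```python
from collections import deque


def shortest_path_length(start, goal, neighbors, max_depth=None):
    """BFS over a graph given by a neighbors(node) callable; all edges weigh 1.

    Returns the number of edges on a shortest path, or None if the goal is
    not found within max_depth edges.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if max_depth is not None and dist >= max_depth:
            continue  # Do not expand nodes at the depth limit.
        for nxt in neighbors(node):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def example_neighbors(n):
    """Hypothetical infinite DAG on the positive integers: n -> n + 1, n -> 2 * n."""
    return (n + 1, 2 * n)
```

For example, `shortest_path_length(1, 10, example_neighbors)` follows 1 → 2 → 4 → 5 → 10. If the graph has extra structure, such as a usable distance heuristic, a guided search like A* may visit fewer nodes, but that depends on the property being available.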

clever gust Sep 1, 2021, 11:50 AM

#

A college maintains academic information about students in three separate lists

Course details: A list of pairs of form (coursecode,coursename), where both entries are strings. For instance,
[ ("MA101","Calculus"),("PH101","Mechanics"),("HU101","English") ]

Student details: A list of pairs of form (rollnumber,name), where both entries are strings. For instance,
[ ("UGM2018001","Rohit Grewal"),("UGP2018132","Neha Talwar") ]

A list of triples of the form (rollnumber,coursecode,grade), where all entries are strings. For instance,
[ ("UGM2018001", "MA101", "AB"), ("UGP2018132", "PH101", "B"), ("UGM2018001", "PH101", "B") ]. You may assume that each roll number and course code in the grade list appears in the student details and course details, respectively.

Your task is to write a function transcript (coursedetails,studentdetails,grades) that takes these three lists as input and produces consolidated grades for each student. Each of the input lists may have its entries listed in arbitrary order. Each entry in the returned list should be a tuple of the form

(rollnumber, name,[(coursecode_1,coursename_1,grade_1),...,(coursecode_k,coursename_k,grade_k)])

where the student has grades for k >= 1 courses reported in the input list grades.

The output list should be organized as follows.

1 The tuples should be sorted in ascending order by rollnumber

2 Each student's grades should be sorted in ascending order by coursecode

#

#

Can someone kindly help me with this question?
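One possible reading of the specification above, as a sketch rather than a reference solution: build lookup dictionaries for course and student names, group the grade triples by roll number, then sort. The local variable names are made up; only the function name and argument order come from the problem statement.

```python
def transcript(coursedetails, studentdetails, grades):
    """Consolidate grades per student, sorted by roll number and course code."""
    course_name = dict(coursedetails)    # coursecode -> coursename
    student_name = dict(studentdetails)  # rollnumber -> name

    per_student = {}
    for roll, code, grade in grades:
        per_student.setdefault(roll, []).append((code, course_name[code], grade))

    # Only students with at least one grade appear in per_student.
    # Sorting the triples orders them by course code first.
    return [
        (roll, student_name[roll], sorted(per_student[roll]))
        for roll in sorted(per_student)
    ]
```

The dictionary lookups keep the whole thing close to O(g log g) for g grade entries, instead of rescanning the course and student lists for every triple.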

winged plover Sep 1, 2021, 1:34 PM

#

guys what would be a good way to get all possible combinations of natural numbers whose sum = a given number
like

3 = {(1,1,1), (1,2), (3,)}
i mean i tried a method and it works.. but i am not sure if its best because above 50-60 ish i think it starts to slow up 👀

my method being iterate and reach the number by addition 
like start from [(1,)]
iterate to give [(1,1),(2,)] and so on

so uhm well wanted to ask if there is anything better 👀
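What is being described here is usually called the integer partitions of a number. A common alternative to growing a set of tuples step by step is a recursive generator that only ever emits non-decreasing tuples, so no duplicates are produced and no set is needed. The sketch below is an illustration under that assumption; the name `partitions` and the `min_part` parameter are not from the thread.

```python
def partitions(n, min_part=1):
    """Yield every way to write n as a sum of natural numbers.

    Each result is a non-decreasing tuple, so (1, 2) is produced but
    (2, 1) is not.
    """
    if n == 0:
        yield ()
        return
    for first in range(min_part, n + 1):
        # The remaining parts must be at least as large as `first`.
        for rest in partitions(n - first, first):
            yield (first,) + rest
```

`list(partitions(3))` gives `[(1, 1, 1), (1, 2), (3,)]`, matching the example above. The number of partitions grows quickly (there are 204226 of them for 50), so any method that lists them all will slow down as the target grows.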

dusk gust Sep 1, 2021, 2:17 PM

#

idk, but you can use numba, that will speed it up.

winged plover Sep 1, 2021, 2:19 PM

#

hmm yeah i think.. that would speed it up a bit

dusk gust Sep 1, 2021, 2:19 PM

#

#

super basic example using pointless code

#

like ~14x faster in that specific example (I haven't slept in a while so my math is prob wrong)

winged plover Sep 1, 2021, 2:24 PM

#

ah hmm yeah

#

but the thing is i am using tuples and sets

#

for a functionality in it

dusk gust Sep 1, 2021, 2:29 PM

#

pretty sure you can use tuples and list with numba

#

https://numba.pydata.org/numba-doc/dev/reference/pysupported.html

winged plover Sep 1, 2021, 2:29 PM

#

dusk gust pretty sure you can use tuples and list with numba

ahh ohkay hmm nice

dusk gust Sep 1, 2021, 2:30 PM

#

pretty much any base python and a few libraries like numpy

#

Only downside to numba is that the first time you call a function it has to take a few seconds to compile, which will lead to an initial slow down of the program. But after the function compiles, it will be faster.

winged plover Sep 1, 2021, 2:33 PM

#

dusk gust Only downside to numba is that the first time you call a function it has to take...

ahh ohkay

winged plover Sep 1, 2021, 2:34 PM

#

winged plover > guys what would be a good way to get all possible combinations of natural numb...

26.527116537094116
time to find it for 50

dusk gust Sep 1, 2021, 2:34 PM

#

It is good if you call a function many times or if the function has a large loop or while statement.

winged plover Sep 1, 2021, 3:35 PM

#

Ahh it does have large loops yes

lament quiver Sep 1, 2021, 4:35 PM

#

heyy all, I implemented the algo to find the longest path in the maze, but as the size of the grid increases, the time of computation increases, and yes the maze have multiple solutions, so I am approaching the longest path, is there any way to reduce the time of computation, I mean using VMs or something else, that I am not aware of...

#

or maybe GPUs, but I suppose they are mostly used for parallel computation rather than recursion one...

tropic glacier Sep 1, 2021, 6:24 PM

#

How are you defining the "longest" path? Can't I just keep retracing my steps to make a path longer?

vocal gorge Sep 1, 2021, 10:08 PM

#

Just have this node only have one child, yes. An AST is generally not binary, nodes often have many or only 1 child.

vocal gorge Sep 1, 2021, 10:24 PM

#

that's a pretty cursed way to do unary minus

#

just don't have it be a binary tree, yeah. It's fine to have different kinds of nodes

stable pecan Sep 1, 2021, 10:33 PM

#

binary ast seems cursed

#

no unary functions if ast is required to be binary

vocal gorge Sep 1, 2021, 10:36 PM

#

stable pecan no unary functions if ast is required to be binary

just use linked lists when 2 is too few children 😛

#

[1,2,3] -> List(ListStart, List(2, List(3, ListEnd)))

#

aren't binary trees beautiful

knotty magnet Sep 1, 2021, 10:38 PM

#

😔 ListStart and ListEnd

stable pecan Sep 1, 2021, 10:38 PM

#

vocal gorge aren't binary trees beautiful

where's the unary function

vocal gorge Sep 1, 2021, 10:38 PM

#

okay okay, UnaryMinus(None, 5)

stable pecan Sep 1, 2021, 10:39 PM

#

there's no reason for a binary ast if you are including any ary nodes

#

just complicates things

#

python's ast is just a general tree:

In [39]: ppast("[1, 2, 3, 4, 5]")
Expr
╰──List
   ├──Constant
   │  ╰──1
   ├──Constant
   │  ╰──2
   ├──Constant
   │  ╰──3
   ├──Constant
   │  ╰──4
   ├──Constant
   │  ╰──5
   ╰──Load
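`ppast` above appears to be a personal pretty-printer, so here is a rough equivalent using only the standard library `ast` module. It is a small sketch showing the same point: a list literal is a single node with five children, and unary minus is a node with exactly one operand.

```python
import ast

# A list literal parses to one List node holding all of its elements.
tree = ast.parse("[1, 2, 3, 4, 5]", mode="eval")
list_node = tree.body
print(type(list_node).__name__, len(list_node.elts))

# Unary minus is its own node type with a single operand, no dummy left child.
neg = ast.parse("-5", mode="eval").body
print(ast.dump(neg))
```

The exact text printed by `ast.dump` varies between Python versions, so it is not reproduced here.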

vocal gorge Sep 1, 2021, 11:10 PM

#

yup, ASTs in general really aren't binary

fervent saddle Sep 1, 2021, 11:30 PM

#

If you’re reading a leetcode type of problem, are there different things that give hints on what kind of techniques or data structures you’ll probably need to use? Is there anything that talks about that?

#

Like can you figure out that you’ll probably need to use a hash table, or a graph, or dynamic programming, just by reading the problem?

knotty magnet Sep 1, 2021, 11:33 PM

#

sometimes, yeah

fervent saddle Sep 1, 2021, 11:34 PM

#

Is there like a cheat sheet or something for doing that?

knotty magnet Sep 1, 2021, 11:34 PM

#

i think it's just practice, a lot of questions are worded similarly

fiery cosmos Sep 2, 2021, 12:05 AM

#

can a binary tree keep an object alive?

knotty magnet Sep 2, 2021, 12:05 AM

#

if it has a reference to it yes

fiery cosmos Sep 2, 2021, 12:08 AM

#

knotty magnet if it has a reference to it yes

can this be referenced?

node._left = Cacheable(...)

knotty magnet Sep 2, 2021, 12:09 AM

#

yes, through node._left
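A quick way to check this is with a weak reference, which observes an object without keeping it alive. In the sketch below, `Node` and `Cacheable` are hypothetical stand-ins for the classes in the question, and the immediate cleanup relies on CPython's reference counting (the `gc.collect()` calls are only there as a precaution).

```python
import gc
import weakref


class Cacheable:
    """Hypothetical stand-in for the object stored in the tree."""


class Node:
    """Minimal binary tree node holding strong references to its children."""

    def __init__(self):
        self._left = None
        self._right = None


node = Node()
node._left = Cacheable()
probe = weakref.ref(node._left)  # Weak: does not keep the object alive.

gc.collect()
alive_while_referenced = probe() is not None  # The node still holds it.

node._left = None  # Drop the only strong reference.
gc.collect()
alive_after_drop = probe() is not None
```

As long as the tree node is reachable and `_left` points at the object, the object stays alive; once that reference is removed, nothing else holds it.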

fiery cosmos Sep 2, 2021, 12:11 AM

#

Thanks

idle pier Sep 2, 2021, 7:18 PM

#

Hello folks, I have a question.
from what I know nested for loops are usually O(n^2) but not all the time, my question is when you have 2 separate for loops, thats O(n) correct?

for i in somethingHere:
  return something

for i in somethingHere:
  return something

fervent saddle Sep 2, 2021, 7:22 PM

#

Yeah, O(n)

#

for i in somethingHere:
  # do something with somethingHere

for i in somethingHere:
  # do something with somethingHere

#

O(n) with the size of somethingHere

#

Assuming somethingHere is the same thing for both loops

#

Otherwise, if it’s two independent things, it’s O(n + m)
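Counting loop iterations directly makes the difference visible. The two helpers below are made up for illustration. Note also that a loop which returns on its first iteration, as in the original question, does a constant amount of work regardless of input size.

```python
def sequential_steps(xs, ys):
    """Two loops one after another: len(xs) + len(ys) iterations."""
    steps = 0
    for _ in xs:
        steps += 1
    for _ in ys:
        steps += 1
    return steps


def nested_steps(xs, ys):
    """One loop inside another: len(xs) * len(ys) iterations."""
    steps = 0
    for _ in xs:
        for _ in ys:
            steps += 1
    return steps
```

With two inputs of size 100, the sequential version takes 200 steps and the nested version takes 10000.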

ivory quest Sep 2, 2021, 9:08 PM

#

Are these function both O(n) https://github.com/georgepittock/py-scripts/blob/main/scripts/numbers_in_jumbled_string.py? If I run the code in the comments at the top it looks like submitted is O(n) but new is O(log n) but I don’t understand how it could be?

GitHub

py-scripts/numbers_in_jumbled_string.py at main · georgepittock/py-...

A collection of scripts I have used/made and wanted to make publicly available. Most of these will be algorithms etc. - py-scripts/numbers_in_jumbled_string.py at main · georgepittock/py-scripts

#

If it is O(log n) could someone please explain how it is, I’m fairly new to big o notation and understand log n as “you don’t check every element just halving and halving again”

heady jewel Sep 2, 2021, 9:09 PM

#

Counter iterates over each character in string so it's O(n)

ivory quest Sep 2, 2021, 9:10 PM

#

I thought so, so what would explain why execution time appears to show it as O(log n)?

#

Or is that just an example of it’s up to the order of n but not necessarily order of n

vocal gorge Sep 2, 2021, 9:11 PM

#

Why do you believe it to be O(log n)?

heady jewel Sep 2, 2021, 9:12 PM

#

could be many things, noise, internal optimization for short strings, etc.

ivory quest Sep 2, 2021, 9:12 PM

#

If I run the execution time based on input size n it goes up logarithmically

heady jewel Sep 2, 2021, 9:12 PM

#

are you sure it's actually logarithmic instead of just a small coefficient

vocal gorge Sep 2, 2021, 9:12 PM

#

That shouldn't be possible, as Counter(string) is O(n) already.

heady jewel Sep 2, 2021, 9:13 PM

#

what range of input lengths are we talking about

ivory quest Sep 2, 2021, 9:13 PM

#

heady jewel are you sure it's actually logarithmic instead of just a small coefficient

Well up to n = 4 the new function is slower than the submitted function but above that it gets faster, by the time n is 50-60 it’s 1/4 of the speed

vocal gorge Sep 2, 2021, 9:14 PM

#

but what makes you think it's logarithmic?

#

lemme make some plots, I guess

ivory quest Sep 2, 2021, 9:16 PM

#

Just from reading the numbers it seemed logarithmic so yeah not very sure :/

heady jewel Sep 2, 2021, 9:19 PM

#

Consider the graphs of 1 + 0.2n and 0.5n

vocal gorge Sep 2, 2021, 9:19 PM

#

ivory quest Just from reading the numbers it seemed logarithmic so yeah not very sure :/

import perfplot
numbers = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
perfplot.show(
    setup = lambda n: "".join(random.choices(numbers, k=n)),
    kernels = [submitted, new],
    n_range = list(range(10)) + list(2**i for i in range(4,10)),
    xlabel = "len(string)",
    target_time_per_measurement=1,
    equality_check=lambda a,b:a==b
)

Looks extremely linear to me.

heady jewel Sep 2, 2021, 9:20 PM

#

both grow at the order of n but one of them has this pesky 1 + at the beginning

#

in big-o notation we don't care about it because we only care about the rate of growth, it's also the same reason coefficients are discarded
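The two example cost functions can be tabulated to see where they cross. They are hypothetical costs taken from the messages above, not measurements of the actual `submitted` and `new` functions.

```python
def cost_a(n):
    """Fixed overhead of 1 plus a small per-item cost."""
    return 1 + 0.2 * n


def cost_b(n):
    """No overhead, but a larger per-item cost."""
    return 0.5 * n


# Smallest n at which the function with overhead becomes the cheaper one.
crossover = next(n for n in range(1, 1000) if cost_a(n) < cost_b(n))

# For large n the ratio settles near 0.2 / 0.5; both are still linear.
ratio_at_large_n = cost_a(10**6) / cost_b(10**6)
```

Below the crossover the fixed overhead dominates; above it the smaller coefficient wins. Neither curve is logarithmic, even though the gap between them keeps widening.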

ivory quest Sep 2, 2021, 9:21 PM

#

Ah that makes it much clearer, thanks

vocal gorge Sep 2, 2021, 9:22 PM

#

and here's a loglog plot

heady jewel Sep 2, 2021, 9:22 PM

#

An algorithm described as O(N^2) could in reality be something like 3 N ^ 2 + 15 N + 10 000 but since as N grows, the term N ^ 2 will have the most drastic effects the others are discarded.

vocal gorge Sep 2, 2021, 9:22 PM

#

only at below 2 numbers is the submitted version faster

heady jewel Sep 2, 2021, 9:23 PM

#

You also have to remember that runtime complexity does not automatically imply performance

austere sparrow Sep 2, 2021, 9:25 PM

#

vocal gorge ```py import perfplot numbers = ["zero", "one", "two", "three", "four", "five", ...

👀 that's a nice library

austere sparrow Sep 2, 2021, 9:25 PM

#

heady jewel An algorithm described as `O(N^2)` could in reality be something like `3 N ^ 2 +...

sleepsort is O(n) 🙂

ivory quest Sep 2, 2021, 9:26 PM

#

heady jewel You also have to remember that runtime complexity does not _automatically_ imply...

Are you effectively saying here that, for most reasonable N, some functions that are O(N) will perform better than a function that is O(log N)?

vocal gorge Sep 2, 2021, 9:26 PM

#

austere sparrow 👀 that's a nice library

I hate its interface (every time I use it, I have to go look at the github page, because the author didn't bother putting the same thing in the docstring or, god forbid, the type hints), but yeah, it is

austere sparrow Sep 2, 2021, 9:26 PM

#

types don't exist

#

yolo

heady jewel Sep 2, 2021, 9:26 PM

#

No I'm saying that results between algorithms can be surprising and if you want to gauge performance you need to measure it.

vocal gorge Sep 2, 2021, 9:27 PM

#

>>> ?perfplot.show
Signature:
perfplot.show(
    *args,
    time_unit='s',
    relative_to=None,
    logx='auto',
    logy='auto',
    **kwargs,
)
Docstring: <no docstring>

you might notice that these are not all the arguments this function accepts, but you wouldn't be able to tell without looking at github 😩

ivory quest Sep 2, 2021, 9:28 PM

#

Got it thank you both @vocal gorge @heady jewel really helpful, @vocal gorge am I ok to use one of those photos on my GitHub?

vocal gorge Sep 2, 2021, 9:29 PM

#

sure; you can also generate them yourself - the snippet I posted is what does it

#

@ivory quest here's a better one with more points and correctly labeled x axis

numbers = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
perfplot.show(
    setup = lambda n: "".join(random.choices(numbers, k=n)),
    kernels = [submitted, new],
    n_range = list(range(1,10)) + list(2**i for i in range(4,15)),
    xlabel = "number count",
    target_time_per_measurement=1,
    equality_check=lambda a,b:a==b,
    logx=True,
    logy=True
)

heady jewel Sep 2, 2021, 9:31 PM

#

As a simple example, in something like C a simple O(n) linear scan for membership on an array is often faster than a O(log n) binary search until fairly significant sizes as reading sequential data is one of the things that CPUs love, and there is not really branching (no if statements) so the branch predictor doesn't fail as often. Runtime analysis does not give you the full picture, you always have to measure.
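The "always measure" advice can be tried with the standard library. The sketch below compares a linear scan with a `bisect`-based binary search on a sorted list. It is only a measurement harness with made-up helper names: the claim above is about C and CPU behaviour, and in Python interpreter overhead dominates, so the crossover point seen here may look quite different.

```python
import bisect
import timeit


def linear_contains(sorted_items, target):
    """O(n) membership test by scanning every item."""
    for item in sorted_items:
        if item == target:
            return True
    return False


def binary_contains(sorted_items, target):
    """O(log n) membership test on a sorted list using bisect."""
    i = bisect.bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


def compare(size, repeats=2000):
    """Return (linear_seconds, binary_seconds) for a worst-case lookup."""
    data = list(range(size))
    target = size - 1  # Last element: the worst case for the linear scan.
    t_lin = timeit.timeit(lambda: linear_contains(data, target), number=repeats)
    t_bin = timeit.timeit(lambda: binary_contains(data, target), number=repeats)
    return t_lin, t_bin
```

Calling `compare(size)` for a range of sizes, for example 4, 16, 64 and 1024, shows where one approach overtakes the other on a given machine.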

fiery cosmos Sep 2, 2021, 10:27 PM

#

Are there any practical instances of sin(n) being part of the big O?

stable pecan Sep 2, 2021, 10:28 PM

#

fiery cosmos Are there any practical instances of `sin(n)` being part of the big O? <:lemon_g...

sin(n) is bounded by 1 on either side

fiery cosmos Sep 2, 2021, 10:29 PM

#

I know but like x*sin(x)

#

Though I guess that's bounded under x

austere sparrow Sep 2, 2021, 10:41 PM

#

fiery cosmos Are there any practical instances of `sin(n)` being part of the big O? <:lemon_g...

Of course

import math
import time

def f(n):
    time.sleep(10 * math.sin(n / 1000))

fiery cosmos Sep 2, 2021, 10:45 PM

#

By practical I mean part of an actual, useful algorithm

wise fulcrum Sep 2, 2021, 11:05 PM

#

https://www.online-python.com/0iB2dDrTWR
dang that was harder to figure out than i expected

Online Python - IDE, Editor, Compiler, Interpreter

Build and Run your Python code instantly. Online-Python is a quick and easy tool that helps you to build, compile, test your python programs.

short mesa Sep 3, 2021, 1:56 AM

#

anyone have xp with a*

#

my code runs in to an infinite ish loop in some particular mazes

#

id like to talk it over with someone

gritty marsh Sep 3, 2021, 2:13 AM

#

share your code

wheat flare Sep 3, 2021, 2:30 AM

#

hello all, I have a bunch of partial call graphs that I created per script in the form of json files.

#

#

for example, it looks like this

#

I assume its a dictionary of dictionaries of lists?

#

does networkx accept formats like this? If it does, can I just concatenate the json files together? or will I need to perform cleanup on the files

halcyon plankBOT Sep 3, 2021, 2:32 AM

#

Hey @wheat flare!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .json attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

wheat flare Sep 3, 2021, 2:33 AM

#

oh

halcyon plankBOT Sep 3, 2021, 2:33 AM

#

Hey @short mesa!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

wheat flare Sep 3, 2021, 2:34 AM

#

https://paste.pythondiscord.com/suqidisave.json

short mesa Sep 3, 2021, 2:34 AM

#

gritty marsh share your code

https://paste.pythondiscord.com/avogulohex.yaml

#

it takes a while to get out of a boxed in start

#

#

something like this gets solved but after like 5 minutes

wheat flare Sep 3, 2021, 3:46 AM

#

@stable pecan sorry for the ping but i got a few questions on networkx

#

do you prefer if we do here or in a help channel?

stable pecan Sep 3, 2021, 3:47 AM

#

you can ask wherever

wheat flare Sep 3, 2021, 3:47 AM

#

ok let me just repeat earlier from above

#

I have a bunch of partial call graphs that I created per script in the form of json files. https://cdn.discordapp.com/attachments/650401909852864553/883177261635874857/unknown.png
for example, it looks like this
I assume its a dictionary of dictionaries of lists?
does networkx accept formats like this? If it does, can I just concatenate the json files together? or will I need to perform cleanup on the files

#

https://paste.pythondiscord.com/suqidisave.json

#

heres an example sorry if its messy

#

i assume this is what i want then? https://networkx.org/documentation/stable/reference/generated/networkx.convert.from_dict_of_lists.html

#

please tell me if you need me to be clearer

stable pecan Sep 3, 2021, 3:58 AM

#

you need to convert the json files to dictionary yes

wheat flare Sep 3, 2021, 3:58 AM

#

ah i know when i read them they're already in dict format

#

can i just concatenate them? or i have to combine them into one big dictionary

#

i assume its the second one

stable pecan Sep 3, 2021, 3:59 AM

#

either way, it's probably easier to do the second

wheat flare Sep 3, 2021, 3:59 AM

#

i see

#

they're in the form of a dictionary of dictionaries of lists right?

wheat flare Sep 3, 2021, 3:59 AM

#

wheat flare https://paste.pythondiscord.com/suqidisave.json

from the example i showed here

#

so after combining, i'll use this to read it? https://networkx.org/documentation/stable/reference/generated/networkx.convert.from_dict_of_lists.html

stable pecan Sep 3, 2021, 4:00 AM

#

the jsons are basically going to convert to dict of lists

#

one sec

wheat flare Sep 3, 2021, 4:00 AM

#

yes just want to combine i have like 2k json files

#

i have another question but after this one

#

thanks for answering

stable pecan Sep 3, 2021, 4:01 AM

#

do the dicts share any keys?

#

because that will change how you combine them

wheat flare Sep 3, 2021, 4:02 AM

#

yes theres a lot of them sharing keys

#

its hard to determine which though

stable pecan Sep 3, 2021, 4:02 AM

#

are the key, value pairs the same or different between dictionaries?

#

for dictionaries that share keys

wheat flare Sep 3, 2021, 4:03 AM

#

different keys but they have the same value

#

its a gigantic callgraph so a lot of functions may call one function

#

oh most format is like this i guess

#

#

should have said earlier sorry

#

{
    "node1": ["node2", "node3"],
    "node2": ["node3"],
    "node3": []
}

#

there can also be another json file thats like

stable pecan Sep 3, 2021, 4:08 AM

#

!e

from collections import defaultdict

my_dict_1 = {
  "a": [1, 2, 3],
  "b": [4],
  "c": [],
}

my_dict_2 = {
  "a": [2, 3, 4],
  "c": [4],
  "d": [2, 4],
}

master_dict = defaultdict(set)

def combine(graph):
    """
    Add a dict to the master.
    """

    for key, value in graph.items():
        master_dict[key].update(value)

combine(my_dict_1)
combine(my_dict_2)
print(master_dict)

halcyon plankBOT Sep 3, 2021, 4:08 AM

#

@stable pecan :white_check_mark: Your eval job has completed with return code 0.

defaultdict(<class 'set'>, {'a': {1, 2, 3, 4}, 'b': {4}, 'c': {4}, 'd': {2, 4}})

wheat flare Sep 3, 2021, 4:08 AM

#

{
"node4": ["node1", "node3"],
"node5": ["node2"],
"node6": []
}

stable pecan Sep 3, 2021, 4:08 AM

#

this is how i would combine them

wheat flare Sep 3, 2021, 4:08 AM

#

oh sorry, you can assume the keys in every dict to be unique

#

but different keys from different dicts can share the same value

#

im sorry i should have clarified

stable pecan Sep 3, 2021, 4:09 AM

#

every key is unique across all the dicts?

wheat flare Sep 3, 2021, 4:09 AM

#

i would say so

#

if not i can use that method above

stable pecan Sep 3, 2021, 4:09 AM

#

ok, then that's simpler because you can just use |=

wheat flare Sep 3, 2021, 4:10 AM

#

{
    "node1": ["node2", "node3"],
    "node2": ["node3"],
    "node3": []
}

{
    "node4": ["node1", "node3"],
    "node5": ["node2"],
    "node6": []
}

#

it can be like this

#

and other dicts can have a key in one dict be a value in theirs

stable pecan Sep 3, 2021, 4:10 AM

#

!e

my_dict_1 = {
  "a": [1, 2, 3],
  "b": [4],
  "c": [],
}

my_dict_2 = {
  "d": [2, 4],
}

master_dict = { }
master_dict |= my_dict_1
master_dict |= my_dict_2

print(master_dict)

halcyon plankBOT Sep 3, 2021, 4:10 AM

#

@stable pecan :white_check_mark: Your eval job has completed with return code 0.

{'a': [1, 2, 3], 'b': [4], 'c': [], 'd': [2, 4]}

stable pecan Sep 3, 2021, 4:10 AM

#

the |= operator will join them all

wheat flare Sep 3, 2021, 4:11 AM

#

oh i see

#

and other dicts can have a key in one dict be a value in theirs

#

this covers this situation too

#

i would say

stable pecan Sep 3, 2021, 4:11 AM

#

yep, that doesn't matter at all

#

so if you have a list of dictionaries

wheat flare Sep 3, 2021, 4:11 AM

#

i see i see

stable pecan Sep 3, 2021, 4:11 AM

#

you can just do:

for a_dict in iterable_of_dicts:
    master_dict |= a_dict

wheat flare Sep 3, 2021, 4:12 AM

#

yeah i just can loop through all the json files i have in a folder

stable pecan Sep 3, 2021, 4:12 AM

#

and then you can convert master_dict into a graph
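Putting the steps above together, a folder of JSON call graphs could be merged roughly as follows. This is a sketch under assumptions: the function name `merge_call_graphs` and the `*.json` glob are illustrative, and each file is assumed to hold a single dict of lists. It merges with sets, as in the earlier `defaultdict` example, because `|=` on plain dicts replaces the whole list when two files share a key, which would silently drop edges.

```python
import json
from collections import defaultdict
from pathlib import Path


def merge_call_graphs(folder):
    """Merge every *.json dict-of-lists in folder into one dict of lists."""
    merged = defaultdict(set)
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            graph = json.load(f)
        for caller, callees in graph.items():
            # Union the callees so a key shared by two files keeps all edges.
            merged[caller].update(callees)
    # Sorted lists give a stable, JSON-friendly result.
    return {caller: sorted(callees) for caller, callees in merged.items()}


# With networkx installed, the result could then be passed to
# networkx.from_dict_of_lists(merged, create_using=networkx.DiGraph)
# to get a directed call graph.
```

If the keys really are unique across all files, this gives the same result as the `|=` loop apart from the sorted, de-duplicated callee lists.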

wheat flare Sep 3, 2021, 4:12 AM

#

regarding the issue of duplicate pairs

#

networkx will just ignore it right

#

i dont really know if theres duplicates but theres potential to have it

stable pecan Sep 3, 2021, 4:12 AM

#

keys are unique in a dictionary, so if you have the same key it will overwrite a previous one

wheat flare Sep 3, 2021, 4:13 AM

#

oh i see

#

okay

#

ok i'll give it a shot then

#

one last thing

#

as you can see these 2 are the same functions

#

but the cursed json file outputs it differently for different dicts

#

i assume networkx is unable to find that they are the same

#

and i have to perform clean up on the json files

stable pecan Sep 3, 2021, 4:15 AM

#

yeah, you can clean it up with regex or maybe modify how you're generating the json to begin with

wheat flare Sep 3, 2021, 4:15 AM

#

yeah i have to clean up then alright

#

is it fair to ask about dictionary clean up then here?

stable pecan Sep 3, 2021, 4:16 AM

#

well i would just comb through the raw json files and make sure everything is in the same format with some regex

wheat flare Sep 3, 2021, 4:16 AM

#

    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask._apply_mask_1d",
    "tensorflow.python.ops.gen_math_ops.prod",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.concat",
    "tensorflow.python.framework.ops.convert_to_tensor",
    "tensorflow.python.framework.tensor_shape.as_shape",
    "tensorflow.python.framework.tensor_util.constant_value",
    "<builtin>.ValueError",
    "tensorflow.python.framework.ops.name_scope",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.shape"
  ],

#

unfortunately its a bit harder

#

sorry if its a mess

#

this is one of the dictionaries in the json

#

"Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
"tensorflow.python.framework.ops.convert_to_tensor",

#

you can see these two so i need to do some editing to the values

stable pecan Sep 3, 2021, 4:17 AM

#

yes, but you can automate that with regex

wheat flare Sep 3, 2021, 4:17 AM

#

to get array_ops.reshape
and ops.convert_to_tensor

#

oh can i?

#

i dont know like every value to look for though

#

i also need to change the key itself

stable pecan Sep 3, 2021, 4:18 AM

#

that doesn't matter

wheat flare Sep 3, 2021, 4:18 AM

#

"Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask": to array_ops.boolean_mask

#

really?

stable pecan Sep 3, 2021, 4:18 AM

#

let's see if i can figure out a regex pattern for it

wheat flare Sep 3, 2021, 4:18 AM

#

thank you so much

stable pecan Sep 3, 2021, 4:18 AM

#

feel like i have to relearn regex everytime

wheat flare Sep 3, 2021, 4:18 AM

#

i'll try to look around too

#

yeah i feel you

#

https://paste.pythondiscord.com/suqidisave.json

#

heres one if you want to test

#

my previous idea was

      see if tensorflow exists in the value name, 
        extract depending if theres \\ or . in the name

#

but if regex works then its much easier

#

theres also the issue of changing the key pair i saw i had to make a new key which is a pain too

stable pecan Sep 3, 2021, 4:24 AM

#

can i just use the last part of all the strings --- or do you need the preceding modules

wheat flare Sep 3, 2021, 4:24 AM

#

give me a sec let me compare

stable pecan Sep 3, 2021, 4:24 AM

#

"tensorflow.python.framework.fast_tensor_util.AppendObjectArrayToTensorProto" -> "AppendObjectArrayToTensorProto" or "fast_tensor_util.AppendObjectArrayToTensorProto"

wheat flare Sep 3, 2021, 4:25 AM

#

the latter

#

"tensorflow.python.framework.tensor_util.constant_value"
"Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.constant_value",

#

from these two i want tensor_util.constant_value

stable pecan Sep 3, 2021, 4:25 AM

#

there's no way to tell a module from a larger directory unfortunately

wheat flare Sep 3, 2021, 4:26 AM

#

wheat flare my previous idea was ``` go through each key for their values see if tens...

yeah so is this idea all i have?

#

very messy splitting

#

cause i noticed the names either have \\ or not

stable pecan Sep 3, 2021, 4:36 AM

#

ok

#

!e

import re

pattern = re.compile(r'(?:\S*\\)*(.+)')
m = re.match(pattern, "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.SlowAppendBFloat16ArrayToTensorProto")
n = re.match(pattern, "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.ExtractBitsFromBFloat16")
print(m.group(1), n.group(1))

halcyon plankBOT Sep 3, 2021, 4:36 AM

#

@stable pecan :white_check_mark: Your eval job has completed with return code 0.

tensor_util.SlowAppendBFloat16ArrayToTensorProto tensor_util.ExtractBitsFromBFloat16

wheat flare Sep 3, 2021, 4:37 AM

#

ooo

#

it seems to work

#

!e


pattern = re.compile(r'(?:\S*\\)*(.+)')
m = re.match(pattern, "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.expand_dims_v2")
n = re.match(pattern, "tensorflow.python.ops.gen_array_ops.list_diff")
print(m.group(1), n.group(1))

halcyon plankBOT Sep 3, 2021, 4:38 AM

#

@wheat flare :white_check_mark: Your eval job has completed with return code 0.

array_ops.expand_dims_v2 tensorflow.python.ops.gen_array_ops.list_diff

wheat flare Sep 3, 2021, 4:39 AM

#

for every value of a key i can run this regex then and just rewrite

stable pecan Sep 3, 2021, 4:45 AM

#

only works for the \\ patterns
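
The pattern leaves dotted names untouched because the `(?:\S*\\)*` group can match zero times, so `(.+)` captures the whole string when there is no backslash. A minimal sketch comparing it with a plain `rsplit` (the helper name `strip_directories` is only illustrative):

```python
import re

pattern = re.compile(r'(?:\S*\\)*(.+)')

def strip_directories(name: str) -> str:
    """Drop everything up to and including the last backslash, if there is one."""
    return name.rsplit("\\", 1)[-1]

windows_style = "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape"
dotted_style = "tensorflow.python.framework.ops.convert_to_tensor"

for name in (windows_style, dotted_style):
    # Both approaches agree; the dotted name has no backslash, so nothing is removed.
    assert pattern.match(name).group(1) == strip_directories(name)
    print(strip_directories(name))
```

So the regex and `rsplit` both solve only the directory half of the problem; the dotted module prefix needs separate handling.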

#

i was trying to run it on the json text, but i'm failing

#

might be easier to use on keys to the dictionary and just create a new dict

#

or find someone better at regex

wheat flare Sep 3, 2021, 4:46 AM

#

i dont mind rewriting

#

so i just perform your regex as i iterate through it?

#

all the keys and their values

stable pecan Sep 3, 2021, 4:49 AM

#

yeah, but still need some way to turn "tensorflow.python.ops.gen_array_ops.list_diff" into gen_array_ops.list_diff it's not obvious to me how you'd do --- perhaps you need a list of directory prefixes to ignore

#

possibly generated from your filesystem

wheat flare Sep 3, 2021, 4:50 AM

#

i was thinking a horrible way

#

search the list for tensorflow

#

"tensorflow.python.ops.gen_array_ops.list_diff"

#

then just split it by . and append last 2

#

its a very big assumption that they're all represented like that

#

well i need a condition if your regex fails now

#

god this method is horribly messy..

#

i need to go through every list and make a new list then assign a new key

#

Rough pseudocode

1. for every key, go through the value (a list)  
    2. for every value, go through each of its element
      3. perform filtering and append to a list (regex and whatever method i said up there) for the values
      4. perform filtering on the key too
  5. assign them together in a new dictionary and output a new json file
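
The steps above can be sketched roughly as follows. This assumes every name should be reduced to its last two dotted parts (or to whatever follows the last backslash), which is the big assumption mentioned earlier; `short_name` and `clean_callgraph` are made-up helper names, and unlike a "tensorflow"-only filter this shortens every dotted name.

```python
import json

def short_name(name: str) -> str:
    """Reduce a call-graph name to a short form such as 'module.function'."""
    if "\\" in name:
        # Windows-style path: keep what follows the last backslash.
        return name.rsplit("\\", 1)[-1]
    # Dotted module path: keep the last two components.
    return ".".join(name.split(".")[-2:])

def clean_callgraph(raw: dict) -> dict:
    """Build a new dict with cleaned keys and cleaned, de-duplicated values."""
    cleaned = {}
    for key, callees in raw.items():
        # setdefault merges lists when two raw keys collapse to the same name.
        bucket = cleaned.setdefault(short_name(key), [])
        for callee in callees:
            name = short_name(callee)
            if name not in bucket:
                bucket.append(name)
    return cleaned

raw = {
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util": [
        "tensorflow.python.util.tf_export.tf_export",
        "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util._generate_isinstance_check",
        "<builtin>.frozenset",
    ],
}
cleaned = clean_callgraph(raw)
print(json.dumps(cleaned, indent=2))
```

Writing the result out would then be a single `json.dump(cleaned, file_object)` call.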

wheat flare Sep 3, 2021, 5:01 AM

#

stable pecan !e ```py from collections import defaultdict my_dict_1 = { "a": [1, 2, 3], ...

then i will obtain a gigantic dictionary of dictionaries of lists which will be converted to a graph using this: https://networkx.org/documentation/stable/reference/generated/networkx.convert.from_dict_of_lists.html

#

Please validate and thanks a lot for helping

#

how does networkx handle this btw? i guess its fine

stable pecan Sep 3, 2021, 5:17 AM

#

it's all just strings to networkx

#

if the strings are different then they're different nodes

wheat flare Sep 3, 2021, 5:19 AM

#

Oh i meant empty lists

#

Guess it does nothing
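
A quick way to check the empty-list case, assuming networkx is installed: a key with an empty list should still become a node, just one without edges. The node names here are placeholders.

```python
import networkx as nx

graph = {"a": ["b"], "b": [], "c": []}
G = nx.from_dict_of_lists(graph)

print(sorted(G.nodes))   # every key becomes a node, even with an empty list
print(list(G.edges))     # only the a-b edge exists
print(G.degree("c"))     # an isolated node has degree 0
```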

wheat flare Sep 3, 2021, 5:20 AM

#

wheat flare ```after filtering, connect them together using this then i will obtain a gigant...

@stable pecan very sorry but after im done with whatever cursed thing i come up with, can you validate this step?

#

Sorry if you answered it already my head is a bit cluttered

stable pecan Sep 3, 2021, 5:21 AM

#

yes you would use from_dict_of_lists

wheat flare Sep 3, 2021, 5:22 AM

#

Thank you

#

Ill go figure out something then, ill post here if any issues

wheat flare Sep 3, 2021, 7:46 AM

#

@stable pecan im so sorry for pinging again, but i have been testing a graph example on network x

#

f = open(r"C:\Users\User\Desktop\Work\SMU\Capstone\Datasets\callgraph\TensorflowPythonCallgraphs\tensor_util.json")
graph = json.load(f)

G = nx.from_dict_of_lists(graph)
for i in list(G.edges):
    print(i)

#

https://paste.pythondiscord.com/suqidisave.py

#

said json

#

why does it have an edge from a "leaf" to a node

stable pecan Sep 3, 2021, 7:48 AM

#

your graph isn't directed

#

use a directed graph instead
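
A small sketch of the difference, assuming networkx is installed and using a made-up two-function call graph: in an undirected graph the edge can be read in either direction, which is why a "leaf" appears to point back at its caller.

```python
import networkx as nx

calls = {"main": ["helper"], "helper": []}

undirected = nx.from_dict_of_lists(calls)
directed = nx.from_dict_of_lists(calls, create_using=nx.DiGraph)

print(undirected.has_edge("helper", "main"))  # True: direction is not tracked
print(directed.has_edge("helper", "main"))    # False: helper does not call main
print(directed.has_edge("main", "helper"))    # True: the caller -> callee edge
```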

wheat flare Sep 3, 2021, 7:48 AM

#

oh dxgraph func or something?

#

sorry for pinging

stable pecan Sep 3, 2021, 7:48 AM

#

i think from_dict_of_lists has an optional 2nd argument

#

lemme check

wheat flare Sep 3, 2021, 7:48 AM

#

here

stable pecan Sep 3, 2021, 7:49 AM

#

there you go, create_using= just pass nx.DiGraph

wheat flare Sep 3, 2021, 7:49 AM

#

ah!

#

ok thank you thank you

#

i didnt know you could do that

stable pecan Sep 3, 2021, 7:50 AM

#

f = open(r"C:\Users\User\Desktop\Work\SMU\Capstone\Datasets\callgraph\TensorflowPythonCallgraphs\tensor_util.json")
graph = json.load(f)

G = nx.from_dict_of_lists(graph, create_using=nx.DiGraph)
for i in G.edges:  # edges is iterable already, don't need to make a list first
    print(i)

wheat flare Sep 3, 2021, 7:50 AM

#

sorry for pinging im working on that filtering we made

#

thanks again

#

seemed to work

stable pecan Sep 3, 2021, 7:50 AM

#

i have an idea to merge the different keys that are actually the same

wheat flare Sep 3, 2021, 7:51 AM

#

wouldnt it just be overwritten when i combine them?

#

heres the code if you're curious its very messy

#

import json
import re

pattern = re.compile(r'(?:\S*\\)*(.+)')
with open("some_file.json") as f:  # placeholder path; json.load needs a file object, not a string
    old_dict = json.load(f)

new_dict = {}
for key in old_dict:
    value_list = []
    new_key = key  # keys without "tensorflow" are kept as they are
    #Check if the key contains tensorflow, if it does we want to filter it
    if "tensorflow" in key:
        #Assumption: the tensorflow keys have either \ or not, filter accordingly
        #Example: Desktop\Work\tensorflow-master\tensorflow\python\framework\tensor_util._is_array_like -> tensor_util._is_array_like
        if "\\" in key:
            #Use that regex filtering to perform it
            cleaned_key = re.match(pattern, key)
            new_key = cleaned_key.group(1)
        #Else split it with the dots and get the last 2 parts
        #Example: tensorflow.python.util.nest.flatten -> nest.flatten
        else: 
            parts = key.split(".")
            new_key = ".".join(parts[-2:])

    #Do the same on the values
    for uncleaned_value in old_dict[key]:
        if "tensorflow" in uncleaned_value:
            if "\\" in uncleaned_value:
                cleaned_value = re.match(pattern, uncleaned_value)
                new_value = cleaned_value.group(1)
            else:
                parts = uncleaned_value.split(".")
                new_value = ".".join(parts[-2:])
            value_list.append(new_value)
        else:
            #Values without tensorflow are kept as they are
            value_list.append(uncleaned_value)

    #Store the cleaned list under the cleaned key
    new_dict[new_key] = value_list

stable pecan Sep 3, 2021, 8:00 AM

#

wheat flare heres the code if you're curious its very messy

i have something similar but no regex, and doesn't look for "tensorflow" but instead just splits the keys, labeling them by the suffix, and preferring the . ones over the \\ ones:

In [6]: keys = [
   ...:     "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask._apply_mask_1d",
   ...:     "tensorflow.python.ops.gen_math_ops.prod",
   ...:     "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
   ...:     "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.concat",
   ...:     "tensorflow.python.framework.ops.convert_to_tensor",
   ...:     "tensorflow.python.framework.tensor_shape.as_shape",
   ...:     "tensorflow.python.framework.tensor_util.constant_value",
   ...:     "<builtin>.ValueError",
   ...:     "tensorflow.python.framework.ops.name_scope",
   ...: ]

In [7]: key_suffix = { }
   ...: for key in keys:
   ...:     if "\\" in key:
   ...:         split_key = key.split("\\")
   ...:         split_key[-1:] = split_key[-1].split(".")
   ...:         if split_key[-1] in key_suffix:  # Key already in dict
   ...:             continue
   ...:         key_suffix[split_key[-1]] = split_key
   ...:     else:
   ...:         split_key = key.split(".")
   ...:         key_suffix[split_key[-1]] = split_key
   ...:

#


In [8]: key_suffix
Out[8]: 
{'_apply_mask_1d': ['Desktop',
  'Work',
  'tensorflow-master',
  'tensorflow',
  'python',
  'ops',
  'array_ops',
  'boolean_mask',
  '_apply_mask_1d'],
 'prod': ['tensorflow', 'python', 'ops', 'gen_math_ops', 'prod'],
 'reshape': ['Desktop',
  'Work',
  'tensorflow-master',
  'tensorflow',
  'python',
  'ops',
  'array_ops',
  'reshape'],
 'concat': ['Desktop',
  'Work',
  'tensorflow-master',
  'tensorflow',
  'python',
  'ops',
  'array_ops',
  'concat'],
 'convert_to_tensor': ['tensorflow',
  'python',
  'framework',
  'ops',
  'convert_to_tensor'],
 'as_shape': ['tensorflow', 'python', 'framework', 'tensor_shape', 'as_shape'],
 'constant_value': ['tensorflow',
  'python',
  'framework',
  'tensor_util',
  'constant_value'],
 'ValueError': ['<builtin>', 'ValueError'],
 'name_scope': ['tensorflow', 'python', 'framework', 'ops', 'name_scope']}

wheat flare Sep 3, 2021, 8:01 AM

#

oh wow

stable pecan Sep 3, 2021, 8:01 AM

#

after they are filtered you can rejoin them

#

this does assume there are no function names that are the same, hopefully that's true
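
One way to test that assumption before relying on it is to group names by bare suffix and report any suffix that has more than one owner. This is only a sketch: `find_suffix_collisions` is a made-up helper, and the `gen_array_ops.reshape` entry is included purely to illustrate a clash.

```python
from collections import defaultdict

def find_suffix_collisions(names):
    """Return {suffix: [owners]} for suffixes that belong to more than one owner."""
    owners = defaultdict(set)
    for name in names:
        # Treat backslashes like dots so both naming styles split the same way.
        parts = name.replace("\\", ".").split(".")
        suffix = parts[-1]
        owner = parts[-2] if len(parts) > 1 else ""
        owners[suffix].add(owner)
    return {s: sorted(o) for s, o in owners.items() if len(o) > 1}

names = [
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.constant_value",
    "tensorflow.python.framework.tensor_util.constant_value",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
    "tensorflow.python.ops.gen_array_ops.reshape",  # illustrative clash
]
collisions = find_suffix_collisions(names)
print(collisions)
```

The same function written in both styles (`tensor_util.constant_value` here) is not reported, because its owner is identical in both forms.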

wheat flare Sep 3, 2021, 8:01 AM

#

interesting

#

let me try to understand them

#

mine kinda works already let me show output

#


```
New key: tensor_util

Old list is:
['tensorflow.python.util.tf_export.tf_export', 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util._generate_isinstance_check', '<builtin>.frozenset']

New list is:
['tf_export.tf_export', 'tensor_util._generate_isinstance_check', '<builtin>.frozenset']
```

#

for "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util": [ "tensorflow.python.util.tf_export.tf_export", "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util._generate_isinstance_check", "<builtin>.frozenset" ],

stable pecan Sep 3, 2021, 8:04 AM

#

another interesting idea is just to have the nodes in your graph be the suffix of the keys (the bare function or whatever) and have a dictionary that provides the modules the functions come from elsewhere

wheat flare Sep 3, 2021, 8:04 AM

#

thats another alternative

stable pecan Sep 3, 2021, 8:05 AM

#

you can save these properties on the nodes themselves in networkx

wheat flare Sep 3, 2021, 8:05 AM

#

i'll visit that if my current one fails

#

it seems to work well i hope?

#

well its rather hard to understand, always hard to read other people's code

#

im going off the assumption that all the keys are unique

stable pecan Sep 3, 2021, 8:07 AM

#

the last idea would be simplest to implement, gimme a sec for an example

wheat flare Sep 3, 2021, 8:11 AM

#

https://paste.pythondiscord.com/nayozexuve.json

#

https://paste.pythondiscord.com/suqidisave.json

#

my method seems to work nicely, im curious about your method, always good to learn

stable pecan Sep 3, 2021, 8:13 AM

#

In [9]: from collections import defaultdict
   ...: filtered_keys = defaultdict(dict)
   ...: 
   ...: def clean_key(key):
   ...:     if "\\" in key:
   ...:         *_, last = key.split("\\")
   ...:         *_, last = last.split(".")
   ...:         filtered_keys[last]["directory path"] = key
   ...:     else:
   ...:         *_, last = key.split(".")
   ...:         filtered_keys[last]["moduled path"] = key
   ...: 
   ...: for key in keys:
   ...:     clean_key(key)
   ...: 

In [10]: filtered_keys
Out[10]: 
defaultdict(dict,
            {'_apply_mask_1d': {'directory path': 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask._apply_mask_1d'},
             'prod': {'moduled path': 'tensorflow.python.ops.gen_math_ops.prod'},
             'reshape': {'directory path': 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape'},
             'concat': {'directory path': 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.concat'},
             'convert_to_tensor': {'moduled path': 'tensorflow.python.framework.ops.convert_to_tensor'},
             'as_shape': {'moduled path': 'tensorflow.python.framework.tensor_shape.as_shape'},
             'constant_value': {'moduled path': 'tensorflow.python.framework.tensor_util.constant_value'},
             'ValueError': {'moduled path': '<builtin>.ValueError'},
             'name_scope': {'moduled path': 'tensorflow.python.framework.ops.name_scope'}})

wheat flare Sep 3, 2021, 8:14 AM

#

oo

#

networkx accepts a module path too? or i will be using these to combine stuff

stable pecan Sep 3, 2021, 8:15 AM

#

networkx doesn't know the difference, anything that's hashable can be a node

#

but you can just feed networkx these suffixes

#

and keep this table to look up the paths

#

or attach the paths to the nodes after you build the graph
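
A minimal sketch of the "attach the paths afterwards" option, assuming networkx is installed. The attribute name `path` and the single edge are illustrative choices, not taken from the real data.

```python
import networkx as nx

# Graph built from bare suffixes only.
calls = {"reshape": ["convert_to_tensor"], "convert_to_tensor": []}

# Lookup table from suffix to the original full name.
paths = {
    "reshape": "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
    "convert_to_tensor": "tensorflow.python.framework.ops.convert_to_tensor",
}

G = nx.from_dict_of_lists(calls, create_using=nx.DiGraph)
nx.set_node_attributes(G, paths, name="path")  # stores paths[node] on each node

print(G.nodes["reshape"]["path"])
```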

wheat flare Sep 3, 2021, 8:15 AM

#

i see thats interesting

#

thanks a lot

stable pecan Sep 3, 2021, 8:21 AM

#

np

copper agate Sep 3, 2021, 9:48 AM

#

Hi I'm new to Python. I'm a freshman college student hoping to learn AI/Data Science. I'm looking for a laptop, and would a RTX 3050 GPU do me good?

fiery cosmos Sep 3, 2021, 9:51 AM

#

copper agate Hi I'm new to Python. I'm a freshman college student hoping to learn AI/Data Sci...

idk, sometimes they need 3 gpu for the AI.

keen hearth Sep 3, 2021, 12:04 PM

#

copper agate Hi I'm new to Python. I'm a freshman college student hoping to learn AI/Data Sci...

Probably a better fit for the #data-science-and-ml channel.

wise fulcrum Sep 3, 2021, 12:15 PM

#

stable pecan you can just do: ```py for a_dict in iterable_of_dicts: master_dict |= a_dic...

man i feel like a newb for doing {**dict1,**dict2} to join dictionaries lol

#

{**dict1,dict2}

#

i mean

#

arg it keeps killing the asterisks

wheat flare Sep 3, 2021, 12:25 PM

#

i lost some values using |= and im not sure why

#

The asterisks one also makes me lose values

#

Hm

#

like some keys no longer have the values associated with it
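
One likely cause, though the thread does not confirm it: both `|=` and `{**a, **b}` keep only the right-hand value when a key appears in both dicts, so lists under a shared key are replaced rather than combined. A sketch with made-up call-graph fragments (the `|` operator needs Python 3.9+):

```python
part_one = {"tensor_util.constant_value": ["ops.convert_to_tensor"]}
part_two = {"tensor_util.constant_value": ["tensor_shape.as_shape"]}

# Plain merge: the value from part_two replaces the one from part_one.
overwritten = part_one | part_two
print(overwritten)

# Merging the lists instead keeps every callee.
merged = {}
for part in (part_one, part_two):
    for key, callees in part.items():
        bucket = merged.setdefault(key, [])
        for callee in callees:
            if callee not in bucket:
                bucket.append(callee)
print(merged)
```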

wheat flare Sep 3, 2021, 12:52 PM

#

nvm

lunar shoal Sep 3, 2021, 4:27 PM

#

|= means what?

knotty magnet Sep 3, 2021, 4:39 PM

#

calls __ior__

chrome folio Sep 3, 2021, 4:41 PM

#

🖐️

trim fiber Sep 3, 2021, 5:23 PM

#

lunar shoal `|=` means what?

For a |= b it means a = a | b

#

!e

a = 0b101
b = 0b011
print(f"a = {a}")
print(f"b = {b}")
a |= b
print("a |= b")
print(f"a = {a}")
print(f"b = {b}")

halcyon plankBOT Sep 3, 2021, 5:24 PM

#

@trim fiber :white_check_mark: Your eval job has completed with return code 0.

001 | a = 5
002 | b = 3
003 | a |= b
004 | a = 7
005 | b = 3
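
For immutable values like the ints above, `a |= b` and `a = a | b` behave the same. For mutable types that define `__ior__`, such as sets, the in-place form changes the existing object, which matters when another name refers to it. A short sketch:

```python
a = {1, 2}
alias = a
a |= {3}              # set.__ior__ updates the existing set in place
print(alias)          # the alias sees the new element
print(a is alias)     # still the same object

b = {1, 2}
alias_b = b
b = b | {3}           # builds a new set and rebinds b
print(alias_b)        # the old set is unchanged
print(b is alias_b)   # b now names a different object
```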

lunar shoal Sep 3, 2021, 5:28 PM

#

Things are getting real here

woven trench Sep 3, 2021, 6:28 PM

#

i want to improve my dsa is there any free good source for that?

keen hearth Sep 3, 2021, 6:48 PM

#

woven trench i want to improve my dsa is there any free good source for that?

Hello 🙂 There are some resources linked in the pins of this channel.

woven trench Sep 3, 2021, 6:50 PM

#

ok thankyou

buoyant crescent Sep 3, 2021, 8:52 PM

#

it seems nice

fervent saddle Sep 3, 2021, 9:21 PM

#

```py
x = some_dict.pop(k)
some_dict[k] = None
```
Is this amortized O(1), or does needing to resize the insertion ordered array make it amortized O(n)?

fervent saddle Sep 3, 2021, 9:41 PM

#

```py
for _ in range(n):
    x = some_dict.pop(k)
    some_dict[k] = x
```
Would that just be O(n), or would it be O(n * size_of_dict)?

vocal gorge Sep 3, 2021, 9:49 PM

#

fervent saddle ```py x = some_dict.pop(k) some_dict[k] = None``` Is this amortized O(1), or doe...

Repeatedly popping and setting the same element?

#

I believe it's amortized O(1), because dicts simply don't shrink their tables

#

maybe at all, maybe unless they get really empty

stable pecan Sep 3, 2021, 9:57 PM

#

i don't think they ever shrink

trim fiber Sep 3, 2021, 9:58 PM

#

fervent saddle ```py for _ in range(n): x = some_dict.pop(k) some_dict[k] = x``` Would ...

!e

import sys
d = {}
print("pure", sys.getsizeof(d))
d[0] = None
print("one element", sys.getsizeof(d))
n = 8
for i in range(n):
  d[i] = i
print(f"{n} elements", sys.getsizeof(d))
for i in range(n):
  value = d.pop(i)
  d[i] = value
print(f"after loop", sys.getsizeof(d))

halcyon plankBOT Sep 3, 2021, 9:58 PM

#

@trim fiber :white_check_mark: Your eval job has completed with return code 0.

001 | pure 64
002 | one element 232
003 | 8 elements 360
004 | after loop 640

trim fiber Sep 3, 2021, 10:00 PM

#

pithink

stable pecan Sep 3, 2021, 10:04 PM

#

dict never shrinks only grows more powerful

fervent saddle Sep 3, 2021, 10:07 PM

#

!e

import sys
d = {}
print("pure", sys.getsizeof(d))
d[0] = None
print("one element", sys.getsizeof(d))
n = 8
for i in range(n):
  d[i] = i
print(f"{n} elements", sys.getsizeof(d))
for _ in range(1000000):
  value = d.pop(0)
  d[0] = value
print(f"after loop", sys.getsizeof(d))

halcyon plankBOT Sep 3, 2021, 10:07 PM

#

@fervent saddle :white_check_mark: Your eval job has completed with return code 0.

001 | pure 64
002 | one element 232
003 | 8 elements 360
004 | after loop 640

fervent saddle Sep 3, 2021, 10:08 PM

#

~~It’s decreasing in size~~

trim fiber Sep 3, 2021, 10:40 PM

#

https://tenor.com/uk2C.gif

fervent saddle Sep 3, 2021, 10:56 PM

#

So it’s O(n * size_of_dict)?

fervent saddle Sep 3, 2021, 11:15 PM

#

No wait, it’s just never increasing in size

#

!e ```py
import sys

d = {}

for n in range(10000):
    d[n] = None

print(sys.getsizeof(d))

for n in range(10000):
    del d[n]

print(sys.getsizeof(d))
```

halcyon plankBOT Sep 3, 2021, 11:15 PM

#

@fervent saddle :white_check_mark: Your eval job has completed with return code 0.

001 | 295000
002 | 295000

lament totem Sep 3, 2021, 11:15 PM

#

what are you asking is O(n) or O(n*dictsize)?

fervent saddle Sep 3, 2021, 11:15 PM

#

Wow

lament totem Sep 3, 2021, 11:16 PM

#

the search time of a dict item?

fervent saddle Sep 3, 2021, 11:16 PM

#

fervent saddle ```py for _ in range(n): x = some_dict.pop(k) some_dict[k] = x``` Would ...

This

lament totem Sep 3, 2021, 11:17 PM

#

pretty sure its O(1) no?

#

getting an item is avg O(1) and setting an item too iirc when using hashes

#

python by default uses hashes for dicts, not red-black-trees or anything

fervent saddle Sep 3, 2021, 11:18 PM

#

But I was thinking about it resetting the iterated array

#

Because it fills with empty space

lament totem Sep 3, 2021, 11:18 PM

#

space or time complexity?

fervent saddle Sep 3, 2021, 11:19 PM

#

Time complexity

lament totem Sep 3, 2021, 11:19 PM

#

well yh thats bound to n for sure

fervent saddle Sep 3, 2021, 11:20 PM

#

!e

import sys
d = {}
print("pure", sys.getsizeof(d))
d[0] = None
print("one element", sys.getsizeof(d))
n = 8
for i in range(n):
  d[i] = i
print(f"{n} elements", sys.getsizeof(d))
for n in range(1000000):
  value = d.pop(n % 2)
  d[n % 2] = value
print(f"after loop", sys.getsizeof(d))

halcyon plankBOT Sep 3, 2021, 11:20 PM

#

@fervent saddle :white_check_mark: Your eval job has completed with return code 0.

001 | pure 64
002 | one element 232
003 | 8 elements 360
004 | after loop 640

fervent saddle Sep 3, 2021, 11:21 PM

#

I don’t get how is this not increasing the dict’s memory linearly. The insertion ordered array should be building up empty space

fervent saddle Sep 3, 2021, 11:21 PM

#

fervent saddle !e ```py import sys d = {} for n in range(10000): d[n] = None print(sys.g...

If it never decreases size, like this suggests

#

I don’t get it

fervent saddle Sep 3, 2021, 11:52 PM

#

Does it only account for the size of the hashed into array, not the consecutively filled array?
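
A possible explanation, assuming CPython's compact dict layout: deleted entries do leave dead slots in the insertion-ordered array, but when that array runs out of room the next insert rebuilds the table from the live entries only. The dead slots are reclaimed at that point, so memory stays bounded and the occasional rebuild is spread over many cheap operations (amortized O(1) each). The sketch below records every size seen during a long pop/insert loop; exact byte counts vary by Python version.

```python
import sys

d = {i: i for i in range(8)}
sizes = {sys.getsizeof(d)}

for _ in range(100_000):
    value = d.pop(0)
    d[0] = value                 # reinsert; occasionally triggers a rebuild
    sizes.add(sys.getsizeof(d))

# Only a handful of distinct sizes show up, however long the loop runs.
print(sorted(sizes))
print(len(d))
```

Since rebuilds are triggered on insert, this also fits the earlier result where deleting 10000 keys without inserting anything left the reported size unchanged.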

fiery cosmos Sep 4, 2021, 1:09 AM

#

I think you have to know the arity of the operators

#

How would you know what args to give the - here 4 3 1 - +
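
A minimal sketch of a postfix evaluator that stores the arity next to each operator, so it knows how many operands to pop. The `neg` operator is a hypothetical unary example added to show a non-binary arity.

```python
import operator

# Each operator maps to (arity, function).
OPERATORS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "neg": (1, operator.neg),  # hypothetical unary operator
}

def eval_postfix(expression: str) -> float:
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            arity, func = OPERATORS[token]
            if len(stack) < arity:
                raise ValueError(f"not enough operands for {token!r}")
            # The most recent `arity` values are the arguments, in order.
            args = stack[-arity:]
            del stack[-arity:]
            stack.append(func(*args))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]

print(eval_postfix("4 3 1 - +"))  # '-' takes 3 and 1, then '+' takes 4 and 2
```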

bright halo Sep 4, 2021, 1:25 AM

#

How to separate a string of input characters into 2 substrings and save each of them in a variable.

input_text_to_check = "ElectrikVocal95#9525: what are we talking about?"

I need to create a condition regex that if what are we talking about? appears in the sentence(patron) then enter the condition...

Something like this:

    regex_patron = r"\s*\¿?(?:what we were talking about | what we talked about before | what we talked about)\s*(about|)\s*((?:\w+\s*)+)?"

    l = re.search(regex_patron, input_text_to_check, re.IGNORECASE)

    if l:

Inside that if statement I would need to divide the sentence between the username and what the sentence really is, and to save each substring in a variable

Like this:


user_name = "ElectrikVocal95#9525" #without the ":"

text = "what are we talking about?"

I hope you can help me with this.
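
One possible approach is to split on the first `": "` with `str.partition` and run the regex only on the message part. The pattern below is a simplified stand-in, not the one from the question, and it assumes usernames never contain `": "`.

```python
import re

input_text_to_check = "ElectrikVocal95#9525: what are we talking about?"

# Split once at the first ": "; the separator is empty if it was not found.
user_name, separator, text = input_text_to_check.partition(": ")

# Simplified, hypothetical pattern; add more alternatives as needed.
topic_question = re.compile(r"what (?:are|were) we talking about", re.IGNORECASE)

if separator and topic_question.search(text):
    print(user_name)
    print(text)
```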

wheat flare Sep 4, 2021, 2:33 AM

#

https://networkx.org/documentation/stable/reference/generated/networkx.convert.from_dict_of_lists.html

#

does igraph have a similar version of this?

#

my computer froze when i tried using networkx on a 20k node graph

#

networkx can handle 24k nodes and 52k edges right?

wheat flare Sep 4, 2021, 3:00 AM

#

please ping if you want to discuss or something

fiery cosmos Sep 4, 2021, 4:48 AM

#

⇔((Σa/((a))*1.15)/(Σ∀(b, c, d)/((∀(b, c, d) m)))) < 1) ⊃ ∃⊤

#

does that look right?

broken sapphire Sep 4, 2021, 5:05 AM

#

Hello. Is this channel about algorithms?

#

Answer me if anyone is here...

#

:(

#

python

fiery cosmos Sep 4, 2021, 5:13 AM

#

broken sapphire Hello. Is this channel about algorithms?

Yes. About algorithms and data structures.

broken sapphire Sep 4, 2021, 5:14 AM

#

fiery cosmos Yes. About algorithms and data structures.

Thanks!

wheat flare Sep 4, 2021, 8:31 AM

#

hey @stable pecan i managed to fit my graph into networkx but it froze my entire computer

#

just want to ask if these pings are annoying

#

also wanted to ask about this question because i think you're one of the very few people here who know about these

wheat flare Sep 4, 2021, 8:32 AM

#

wheat flare does igraph have a similar version of this?

up here

#

from what i understand they seem to look the same

#

networkx vs igraph

wheat flare Sep 4, 2021, 8:50 AM

#

neat it has a from_networkx thing

#

i hate going through 2 things but i cant figure out how to make it load a json file

#

it says unknown format

wheat flare Sep 4, 2021, 9:26 AM

#

please ping if you see this and would like to discuss


```
1. Not sure why all my...
2. making it accept json or a dictionary of lists, cant seem to find it in the python documentation
```

wheat flare Sep 4, 2021, 9:55 AM

#

this is a pair thats in the json i fed to networkx -> converted to igraph

#

so im not sure whats wrong

stable pecan Sep 4, 2021, 10:13 AM

#

wheat flare please ping if you see this and would like to discuss ```1. Not sure why all my...

yes nodes in igraph are always integers

#

same with graph-tool

#

you can store node names as properties of the node though

#

these are c-libraries wrapped in python so they are very good for large graphs though

wheat flare Sep 4, 2021, 10:14 AM

#

Oh it doesnt do that automatically?

stable pecan Sep 4, 2021, 10:14 AM

#

no

wheat flare Sep 4, 2021, 10:14 AM

#

Oh boy

stable pecan Sep 4, 2021, 10:14 AM

#

what you can do, if you want

wheat flare Sep 4, 2021, 10:14 AM

#

I couldnt find a way to directly use the json file

#

Had to do from networkx

stable pecan Sep 4, 2021, 10:15 AM

#

is create your networkx graph, and use nx.relabel to get all integer nodes

#

but you'll want some list of node labels by index

wheat flare Sep 4, 2021, 10:16 AM

#

https://networkx.org/documentation/stable/reference/generated/networkx.relabel.convert_node_labels_to_integers.html

stable pecan Sep 4, 2021, 10:18 AM

#

int_to_labels = dict(enumerate(dict_of_lists))
labels_to_int = {v: k for k, v in int_to_labels.items()}
new_graph = {labels_to_int[k]: [labels_to_int[node] for node in v] for k, v in dict_of_lists.items()}

can also do this to not go through networkx

wheat flare Sep 4, 2021, 10:19 AM

#

Oh if i want to feed the json directly to igraph ?

#

Either way it ends at the same result right

#

Im fine with using libraries

stable pecan Sep 4, 2021, 10:20 AM

#

it will be faster not going through nx for this many nodes

wheat flare Sep 4, 2021, 10:20 AM

#

Its better to confirm accuracy as i may mess stuff up

#

Oh

stable pecan Sep 4, 2021, 10:20 AM

#

networkx is dict-of-dict-of-dict structure for graphs, so it's a huge memory hog for large graphs

wheat flare Sep 4, 2021, 10:20 AM

#

Can it reliably handle 20k nodes

stable pecan Sep 4, 2021, 10:20 AM

#

it can reliably handle any nodes, if you have infinite memory

wheat flare Sep 4, 2021, 10:21 AM

#

I can try it on a better computer but with like 32gb memory but its hard to say if it'll handle huh

stable pecan Sep 4, 2021, 10:21 AM

#

sounds like plenty

wheat flare Sep 4, 2021, 10:22 AM

#

Really

#

My laptop has like 8 gb memory so it just froze

#

Lmao

wheat flare Sep 4, 2021, 10:22 AM

#

stable pecan ```py int_to_labels = dict(enumerate(dict_of_lists)) labels_to_int = {v, k for k...

This performs the steps you mention here?

wheat flare Sep 4, 2021, 10:22 AM

#

stable pecan is create your networkx graph, and use nx.relabel to get all integer nodes

This

stable pecan Sep 4, 2021, 10:23 AM

#

this takes your dict of lists and creates two intermediate dictionaries that convert integers to labels and vice versa

wheat flare Sep 4, 2021, 10:23 AM

#

Oops sorry didnt turn off ping

stable pecan Sep 4, 2021, 10:23 AM

#

then another dictionary that's the relabeled graph

#

technically, labels_to_int could just be a list

#

would be more efficient

wheat flare Sep 4, 2021, 10:24 AM

#

Then ill feed both to igraph or something?

stable pecan Sep 4, 2021, 10:24 AM

#

errr, int_to_labels

#

just feed the new_graph to igraph

wheat flare Sep 4, 2021, 10:25 AM

#

stable pecan then another dictionary that's the relabeled graph

Which would be this

stable pecan Sep 4, 2021, 10:25 AM

#

i'm not sure if it has a dict_of_lists constructor, but you can create an iterable of tuples from the dict in any case

wheat flare Sep 4, 2021, 10:25 AM

#

https://igraph.org/python/doc/tutorial/generation.html#from-nodes-and-edges

Graph generation

#

Assume its this i guess

#

So steps are

#

1. Use the code up there to generate new_graph and feed it to igraph.
2. If that doesnt work, create an iterable of tuples from the dict to feed

wheat flare Sep 4, 2021, 10:28 AM

#

stable pecan you can store node names as properties of the node though

New graph should have this already?

#

Maybe ill just look for a better pc >< this is a lot of work

stable pecan Sep 4, 2021, 10:29 AM

#

no the new graph doesn't have this already

#

In [3]: dict_of_lists = {
   ...:     "zero": [],
   ...:     "one": ["one", "two", "three"],
   ...:     "two": ["three"],
   ...:     "three": ["zero"],
   ...: }
   ...: int_to_labels = list(dict_of_lists)
   ...: labels_to_int = {label: i for i, label in enumerate(int_to_labels)}
   ...: new_graph = {labels_to_int[key]: [labels_to_int[node] for node in value] for key, value in dict_of_lists.items()}

In [4]: print(int_to_labels, labels_to_int, new_graph)
['zero', 'one', 'two', 'three'] {'zero': 0, 'one': 1, 'two': 2, 'three': 3} {0: [], 1: [1, 2, 3], 2: [3], 3: [0]}

#

here is an example

#

you have string labels in dict_of_lists

#

you make a list of the labels called int_to_labels

#

you make a dictionary of labels to integers named labels_to_int, and then you can use this last dictionary to relabel the entire graph

#

the last dictionary is all integers

#

if you need to know what label an integer node is referring to, you can look it up in int_to_labels

#

In [5]: int_to_labels[0]
Out[5]: 'zero'

wheat flare Sep 4, 2021, 10:31 AM

#

I see, from there though int to labels only gets the keys right of the dict

stable pecan Sep 4, 2021, 10:32 AM

#

yep, this assumes that all the labels will appear as keys in the dict

wheat flare Sep 4, 2021, 10:32 AM

#

Ah there are unique labels in the values too

stable pecan Sep 4, 2021, 10:33 AM

#

then should add them as keys
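
A sketch of one way to do that without editing the JSON by hand: collect labels from the values as well as the keys before numbering them. The sample graph is made up so that "three" appears only as a value.

```python
dict_of_lists = {
    "one": ["two", "three"],
    "two": ["three"],
}

# dict.fromkeys keeps first-seen order and drops duplicates.
all_labels = dict.fromkeys(dict_of_lists)
for neighbours in dict_of_lists.values():
    all_labels.update(dict.fromkeys(neighbours))

int_to_labels = list(all_labels)
labels_to_int = {label: i for i, label in enumerate(int_to_labels)}

new_graph = {
    labels_to_int[key]: [labels_to_int[node] for node in value]
    for key, value in dict_of_lists.items()
}
# Labels that only appeared as values get an empty adjacency list.
for index in range(len(int_to_labels)):
    new_graph.setdefault(index, [])

print(int_to_labels, new_graph)
```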

wheat flare Sep 4, 2021, 10:33 AM

#

Ohh

#

Well i need to look at my json to figure that out it may be already like that

#

Ok i mostly understand what you do, how would that work for igraph now?

#

I guess we're just preprocessing the data

stable pecan Sep 4, 2021, 10:34 AM

#

once you've converted everything to integer, you can make an igraph graph with it

wheat flare Sep 4, 2021, 10:36 AM

#

Yes then i store node names as properties?

stable pecan Sep 4, 2021, 10:37 AM

#

yes

wheat flare Sep 4, 2021, 10:38 AM

#

https://igraph.org/r/doc/vertex_attr.html

igraph R manual pages

#

The python version of this

#

Hm does this mean i have to iterate the whole dict and manually construct the graph then

#

Since it doesnt seem to accept the whole input

wheat flare Sep 4, 2021, 10:39 AM

#

wheat flare https://igraph.org/python/doc/tutorial/generation.html#from-nodes-and-edges

Well i cant find it from here

#

Sorry the documentation is rather confusing

#

Sounds like a lot of iteration

#

Going through every node and creating it then assigning after

stable pecan Sep 4, 2021, 10:45 AM

#

wheat flare Going through every node and creating it then assigning after

In [39]: dict_of_lists = {
    ...:     "zero": [],
    ...:     "one": ["one", "two", "three"],
    ...:     "two": ["three"],
    ...:     "three": ["zero"],
    ...: }
    ...: int_to_labels = list(dict_of_lists)
    ...: labels_to_int = {label: i for i, label in enumerate(int_to_labels)}
    ...: new_graph = {labels_to_int[key]: [labels_to_int[node] for node in value] for key, value in dict_of_lists.items()}
    ...: g = ig.Graph(); g.add_vertices(len(int_to_labels)); g.add_edges(((u, v) for u, out in new_graph.items() for v in out))
    ...: g.vs["label"] = int_to_labels

In [40]: print(g)
IGRAPH U--- 4 5 --
+ attr: label (v)
+ edges:
1--1 1--2 1--3 2--3 0--3

In [41]: g.vs[0]["label"]
Out[41]: 'zero'

wheat flare Sep 4, 2021, 10:45 AM

#

Woa

#

Just curious shouldnt the last edge be 3--0

#

Unless your graph is undirected

stable pecan Sep 4, 2021, 10:46 AM

#

it's undirected, use directed graph

#

g = ig.Graph(directed=True)

#

is all you have to do

wheat flare Sep 4, 2021, 10:47 AM

#

Ah i see

#

G.vs label just saves node properties

#

Interesting

#

Well saves it to the node under label

stable pecan Sep 4, 2021, 10:48 AM

#

yeah, you can attach arbitrary properties to nodes or edges

wheat flare Sep 4, 2021, 10:48 AM

#

Ill try to work it on a small file first

#

Thanks a lot

#

May i ask if im bothering with the pings

stable pecan Sep 4, 2021, 10:49 AM

#

it's fine

wheat flare Sep 4, 2021, 10:49 AM

#

I try to not do it too much but its confusing

stable pecan Sep 4, 2021, 10:49 AM

#

there's not a lot of people that know all the graph libraries here

#

i've worked with them all at some point

wheat flare Sep 4, 2021, 10:49 AM

#

When i search in the search bar its like

#

You and 2 others

stable pecan Sep 4, 2021, 10:49 AM

#

well, "all" -- i've worked with nx, igraph, and graph-tool

wheat flare Sep 4, 2021, 10:50 AM

#

Thanksss

#

Anything i should look out for or any extra tips?

#

Probably gonna have issues cause i need to learn it

stable pecan Sep 4, 2021, 10:50 AM

#

a lot of it is just playing around in an interpreter and making sure the graph is doing what you want

wheat flare Sep 4, 2021, 10:51 AM

#

After this is labeled i can try

stable pecan Sep 4, 2021, 10:52 AM

#

i have a file somewhere that unifies the graph interface for all three libraries

wheat flare Sep 4, 2021, 10:52 AM

#

Well the get shortest path stuff

#

Using names

#

print(g.get_shortest_paths("one", to="zero") )

#

I assume

#

Then itll be [one, three, zero]

stable pecan Sep 4, 2021, 10:54 AM

#

oh yeah, https://gist.github.com/salt-die/ac6b7e75df258bdd4cff6e259ee50909 -- this was a bit of an experiment to create a single Graph that could use any library underneath, i didn't finish it

wheat flare Sep 4, 2021, 10:55 AM

#

Wewww that looks neat

south gorge Sep 4, 2021, 10:55 AM

#

Salt die
Pepper
Then its againn
Salt die and then pepper
You guyz are not gonna give chance to anyone else to write : 😅 😂

wheat flare Sep 4, 2021, 10:55 AM

#

Well im almost done

#

Sorry for hogging the channel

wheat flare Sep 4, 2021, 10:56 AM

#

wheat flare print(g.get_shortest_paths("one", to="zero") )

One final thing before i go mess around, would this be correct then

#

Ill do this in a help channel next time

stable pecan Sep 4, 2021, 10:57 AM

#

i think what will happen is ig will give you a list of Vertex objects and you can get the label from each of these

wheat flare Sep 4, 2021, 10:58 AM

#

Hmm so i need to know the number associated to the label then

#

Guess i can just get it that way

stable pecan Sep 4, 2021, 10:58 AM

#

more like, labels_to_int dictionary was created for this

#

g.get_shortest_paths(labels_to_int["one"], to=labels_to_int["zero"])
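A sketch of the same label round trip without igraph, for anyone who wants to see the idea in isolation. It reuses `dict_of_lists`, `int_to_labels` and `labels_to_int` from the earlier message; `adjacency` and `shortest_path_by_label` are made-up names, and the plain breadth-first search stands in for whatever igraph does internally. Edges are treated as directed.

```python
from collections import deque

dict_of_lists = {
    "zero": [],
    "one": ["one", "two", "three"],
    "two": ["three"],
    "three": ["zero"],
}
int_to_labels = list(dict_of_lists)
labels_to_int = {label: i for i, label in enumerate(int_to_labels)}
# Integer adjacency lists, indexed by vertex id.
adjacency = [[labels_to_int[n] for n in dict_of_lists[label]] for label in int_to_labels]

def shortest_path_by_label(source, target):
    """Hypothetical helper: BFS over integer ids, translated back to labels."""
    start, goal = labels_to_int[source], labels_to_int[target]
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while u is not None:
                path.append(int_to_labels[u])
                u = parent[u]
            return path[::-1]
        for v in adjacency[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # target not reachable from source

path = shortest_path_by_label("one", "zero")
```

With the example data, `path` comes out as `['one', 'three', 'zero']`, which is the route guessed above.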

wheat flare Sep 4, 2021, 10:59 AM

#

Ahh

#

Alright

#

Thanks a lot, sorry for hogging the channel

#

Ill do it in a help channel next time

#

@south gorge sorry for taking the whole channel

south gorge Sep 4, 2021, 11:01 AM

#

Hey no problem i was just joking
I mean this had a lot of serious stuff in it

wheat flare Sep 4, 2021, 11:01 AM

#

yert

#

This will be useful if someone searches igraph

dawn geyser Sep 4, 2021, 12:59 PM

#

from itertools import product as prod
lst2=['d','e','f']
lst1=['a','b','c']
lst=[lst1,lst2]

list(prod(lst1,lst2))

#

[('a', 'd'),
 ('a', 'e'),
 ('a', 'f'),
 ('b', 'd'),
 ('b', 'e'),
 ('b', 'f'),
 ('c', 'd'),
 ('c', 'e'),
 ('c', 'f')]

#

How do I correct the code if I want to just pass 'lst' in list(prod(lst)) and get the same output instead of whats written

#

Like, pass a single list that contains all of the individual lists within itself.

#

Btw, the output is just the list itself if I try to pass just 'lst'
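One way to get this, sketched under the assumption that `lst` always holds the individual lists: unpack it with `*` so each inner list becomes a separate argument to `product`. The names `single` and `pairs` are only for illustration.

```python
from itertools import product

lst1 = ['a', 'b', 'c']
lst2 = ['d', 'e', 'f']
lst = [lst1, lst2]

# product(lst) sees a single iterable (the outer list), so it yields
# 1-tuples that each wrap one of the inner lists.
single = list(product(lst))

# Unpacking with * passes each inner list as its own argument,
# which is the same call as product(lst1, lst2).
pairs = list(product(*lst))
```

This keeps working if `lst` grows to three or more inner lists, since `*lst` expands to however many arguments there are.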

feral hound Sep 4, 2021, 1:18 PM

#

@stable pecan Hi sorry to bother you but I saw the conversation earlier about graphs and I've never used a graphing library before could you explain to me if it would be better for storing/creating a simple graph with node - connections than a dict would be

graph = {
    "node_name": [children]
}

stable pecan Sep 4, 2021, 1:19 PM

#

typically graph libraries already have algorithms made for things like shortest path or largest connected component

feral hound Sep 4, 2021, 1:19 PM

#

and other than that wouldn't it be even more efficient to create a node class that has reference to its connections?

stable pecan Sep 4, 2021, 1:19 PM

#

so if you need to do any algorithmic thing with your graph, probably better off using the library

feral hound Sep 4, 2021, 1:19 PM

#

what about in terms of the graph itself?

#

would it take up more memory?

stable pecan Sep 4, 2021, 1:20 PM

#

depends -- some libraries are actually written in c, so the graphs are smaller than anything you could make in python

#

networkx has large graphs though, because it's a pure python library

feral hound Sep 4, 2021, 1:20 PM

#

really?

stable pecan Sep 4, 2021, 1:21 PM

#

yes, graph-tool and igraph both wrap c

feral hound Sep 4, 2021, 1:21 PM

#

I know it would probably be faster but dont understand why it would take less memory

stable pecan Sep 4, 2021, 1:21 PM

#

because c objects are often contiguous memory

feral hound Sep 4, 2021, 1:22 PM

#

I dont think that would work for creating a graph though no?

#

you would need pointers either way

stable pecan Sep 4, 2021, 1:23 PM

#

you don't need, you can just resize when adding nodes, but i don't know how they're implemented - just that it's possible

feral hound Sep 4, 2021, 1:24 PM

#

hmm but then I guess the maximum size would be smaller since it would require a fully empty block in memory?

#

whereas pointers as long as there is memory available it can keep expanding the graph and would also be faster to add connections dynamically?

#

so I guess its more about use case then if I know the graph wont ever change I could create it as a contiguous block otherwise use pointers to add nodes dynamically

stable pecan Sep 4, 2021, 1:26 PM

#

both graph-tool and igraph don't allow arbitrary objects for nodes though, nodes can only be ints, so you can specialize your data structure here

feral hound Sep 4, 2021, 1:26 PM

#

ahh

stable pecan Sep 4, 2021, 1:27 PM

#

and the nodes are always contiguous

#

so you cant have like 0, 5, 19 as your only nodes

feral hound Sep 4, 2021, 1:27 PM

#

do you know what kind of structure they use to create the graphs?

stable pecan Sep 4, 2021, 1:27 PM

#

no idea, graph-tool wraps boost library, and igraph wraps something else

feral hound Sep 4, 2021, 1:27 PM

#

fair

#

what resources would you recommend to look into this a bit more btw?

stable pecan Sep 4, 2021, 1:28 PM

#

can just read their documentation

feral hound Sep 4, 2021, 1:28 PM

#

aight thx for the info appreciate it 🙂

stable pecan Sep 4, 2021, 1:30 PM

#

https://igraph.org/ igraph docs

igraph – Network analysis software

cloud delta Sep 4, 2021, 1:31 PM

#

Hi guys, I'm looking for a good library for Trees(RBTree, BTree, AVLTree)

stable pecan Sep 4, 2021, 1:39 PM

#

i implemented avl and binary trees in a library, i dunno a general tree library though

stable pecan Sep 4, 2021, 1:40 PM

#

cloud delta Hi guys, I'm looking for a good library for Trees(RBTree, BTree, AVLTree)

https://github.com/salt-die/sacks/blob/main/sacks/sets/avl_tree.py#L55-L63

halcyon plankBOT Sep 4, 2021, 1:40 PM

#

sacks/sets/avl_tree.py lines 55 to 63

```python
class AVLTree(BinarySearchTree):
    """
    A self-balancing binary search tree.

    Notes
    -----
    This version of an AVL tree allows multiple of the same item to be inserted.

    """
```

cloud delta Sep 4, 2021, 1:41 PM

#

Thanks

meager slate Sep 4, 2021, 2:04 PM

#

is there a better way of finding the "length" of a number in binary other than len(bin(n)) - 2 for n > -1? by length i mean this ```
length(0b1010) == 4
length(0b000000010) == 2
         ^^^^^^^
         Leading 0s ignored
```

stable pecan Sep 4, 2021, 2:14 PM

#

can always take the log base 2 of the number

stable pecan Sep 4, 2021, 2:17 PM

#

meager slate is there a better way of finding the "length" of a number in binary other than `...

In [6]: from random import randrange
   ...: from math import log, ceil
   ...: 
   ...: for _ in range(10):
   ...:     c = randrange(100000)
   ...:     print(c, len(bin(c)) - 2, ceil(log(c, 2)))
   ...: 
19575 15 15
66502 17 17
13819 14 14
72240 17 17
17658 15 15
63511 16 16
49062 16 16
97095 17 17
43991 16 16
81209 17 17

keen hearth Sep 4, 2021, 2:18 PM

#

meager slate is there a better way of finding the "length" of a number in binary other than `...

Also, int.bit_length

#

!eval ```py
for i in range(10):
    print(i, bin(i), i.bit_length())
```

halcyon plankBOT Sep 4, 2021, 2:18 PM

#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

001 | 0 0b0 0
002 | 1 0b1 1
003 | 2 0b10 2
004 | 3 0b11 2
005 | 4 0b100 3
006 | 5 0b101 3
007 | 6 0b110 3
008 | 7 0b111 3
009 | 8 0b1000 4
010 | 9 0b1001 4

keen hearth Sep 4, 2021, 2:19 PM

#

This comes in handy if you want to get an approximate log base 2 without importing math (for some reason).

keen hearth Sep 4, 2021, 2:27 PM

#

stable pecan ```py In [6]: from random import randrange ...: from math import log, ceil ...

It would probably be better to test this with eg c = int(10 ** random.uniform(0, 5)) rather than c = random.randrange(100000), so that you get a nice distribution of bit-lengths 🤔

knotty magnet Sep 4, 2021, 2:27 PM

#

kinda unlucky he didn't get any that weren't 5 digits lol

keen hearth Sep 4, 2021, 2:30 PM

#

knotty magnet kinda unlucky he didn't get any that weren't 5 digits lol

Ah, well it's because 9/10ths of the numbers in range(100_000) are 5 digits long.

knotty magnet Sep 4, 2021, 2:30 PM

#

at least a little bit unlucky then ¯\_(ツ)_/¯

keen hearth Sep 4, 2021, 2:32 PM

#

knotty magnet at least a little bit unlucky then ¯\_(ツ)_/¯

A little bit 😄 It has a 35% chance of happening.
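The 35% figure can be checked with a little arithmetic, assuming the ten draws are independent: 90,000 of the 100,000 values that `randrange(100000)` can return have five digits, so the chance that all ten do is 0.9 to the power 10. The variable names below are just for illustration.

```python
five_digit = len(range(10_000, 100_000))   # 90_000 values with exactly five digits
p_single = five_digit / 100_000            # chance that one draw has five digits: 0.9
p_all_ten = p_single ** 10                 # chance that all ten draws do: about 0.349
```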

meager slate Sep 4, 2021, 2:36 PM

#

ah i found it, its int(log2(x)) + 1
source: https://math.stackexchange.com/questions/1508902/given-a-number-how-to-find-the-length-of-its-binary-representation

Mathematics Stack Exchange

Given a number, how to find the length of its binary representation?

I think of $\text{log}_2$. But it does not work. For $8 = 2^3$, but the binary representation of 8 is $1000$. The length of it is 4. Any suggestion or help? Thanks.

stable pecan Sep 4, 2021, 2:40 PM

#

meager slate ah i found it, its `int(log2(x)) + 1` source: https://math.stackexchange.com/que...

this isn't exactly right

#

if log2 is an integer, you'll be one too high

knotty magnet Sep 4, 2021, 2:41 PM

#

isn't it just ceil? not floor + 1?

stable pecan Sep 4, 2021, 2:42 PM

#

or maybe this works

#

yeah, it's fine

keen hearth Sep 4, 2021, 2:44 PM

#

!eval ```py
import random, math

def f(x):
    return len(bin(x)) - 1

def g(x):
    return int(math.log2(x)) + 1

def test(x):
    assert f(x) == g(x), f"{x}: {f(x)} != {g(x)}"

for _ in range(10**4):
    x = int(2**random.uniform(0, 5))
    test(x)

for x in range(1, 10):
    test(x)

print('Pass!')
```

knotty magnet Sep 4, 2021, 2:45 PM

#

you did log 0

stable pecan Sep 4, 2021, 2:45 PM

#

log(0)

halcyon plankBOT Sep 4, 2021, 2:45 PM

#

@keen hearth :x: Your eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 14, in <module>
003 |   File "<string>", line 10, in test
004 | AssertionError: 9: 5 != 4

keen hearth Sep 4, 2021, 2:45 PM

#

knotty magnet you did log 0

Ah yeah lemon_sweat

stable pecan Sep 4, 2021, 2:45 PM

#

2

keen hearth Sep 4, 2021, 2:45 PM

#

!eval ```py
import random, math

def f(x):
    return len(bin(x)) - 2

def g(x):
    return int(math.log2(x)) + 1

def test(x):
    assert f(x) == g(x), f"{x}: {f(x)} != {g(x)}"

for _ in range(10**4):
    x = int(2**random.uniform(0, 5))
    test(x)

for x in range(1, 10):
    test(x)

print('Pass!')
```

halcyon plankBOT Sep 4, 2021, 2:45 PM

#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

Pass!

knotty magnet Sep 4, 2021, 2:46 PM

#

huh, interesting

keen hearth Sep 4, 2021, 2:46 PM

#

Might not work for large numbers though.

stable pecan Sep 4, 2021, 2:46 PM

#

this isn't a great random test because x is always a power of 2

knotty magnet Sep 4, 2021, 2:46 PM

#

random.uniform gives a float though

stable pecan Sep 4, 2021, 2:46 PM

#

ok

#

nevermind then

knotty magnet Sep 4, 2021, 2:47 PM

#

although it didn't necessarily test the case salt-die mentioned, what if it is a power of two

keen hearth Sep 4, 2021, 2:48 PM

#

stable pecan this isn't a great random test because x is always a power of 2

Ah yeah, the int is done after the exponentiation.

keen hearth Sep 4, 2021, 2:50 PM

#

knotty magnet although it didn't necessarily test the case salt-die mentioned, what if it is ...

Some more thorough testing: #bot-commands message

#

How many bits of precision do python floats have?

knotty magnet Sep 4, 2021, 2:51 PM

#

they're doubles, so...53?

keen hearth Sep 4, 2021, 2:52 PM

#

Ah right, so maybe we should test 2**54 and numbers around this.

knotty magnet Sep 4, 2021, 2:52 PM

#

that wouldn't work for any method involving math.log though, since it converts it to a float, right?

keen hearth Sep 4, 2021, 2:53 PM

#

knotty magnet that wouldn't work for any method involving math.log though, since it converts i...

Yeah, found a bug! 😄 #bot-commands message

knotty magnet Sep 4, 2021, 2:54 PM

#

actually, what happens when you convert a really big int into a float. does it just round down?

#

oh, overflow

#

duh

keen hearth Sep 4, 2021, 2:56 PM

#

Just realised we haven't tested negative numbers 👀

#

For which I think len(bin(x)) - 2 is wrong anyway, because there's a - sign.

knotty magnet Sep 4, 2021, 2:57 PM

#

yeah

stable pecan Sep 4, 2021, 2:57 PM

#

log solution is wrong too

#

In [20]: cmath.log(-1234, 2)
Out[20]: (10.26912667914942+4.532360141827194j)

you have to discard the imaginary part

#

and you can't even choose which branch cut to use

knotty magnet Sep 4, 2021, 2:59 PM

#

that would give 11 right? which is correct?

stable pecan Sep 4, 2021, 3:00 PM

#

yes, after you discard the imaginary part

#

but this doesn't work:

In [22]: int(cmath.log(-1234, 2)) + 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-92bdd2d35a8e> in <module>
----> 1 int(cmath.log(-1234, 2)) + 1

TypeError: can't convert complex to int

keen hearth Sep 4, 2021, 3:06 PM

#

So, all in all, I think it's safest to go with int.bit_length 😄
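A small sketch of the edge cases discussed above, comparing the log-based formula with `int.bit_length`. `bit_length_via_log` is a made-up name for the formula from the thread; the other variable names are for illustration only.

```python
import math

def bit_length_via_log(n: int) -> int:
    """The log-based formula from the thread; only meaningful for n >= 1."""
    return int(math.log2(n)) + 1

# Exact powers of two are fine for the floor-based formula...
small_ok = all(bit_length_via_log(2 ** k) == (2 ** k).bit_length() for k in range(50))

# ...but the int is converted to a float first, and a float only has
# 53 bits of precision, so values just below a large power of two round up.
n = 2 ** 54 - 1
log_answer = bit_length_via_log(n)   # off by one because of the rounding
exact_answer = n.bit_length()        # 54

# bit_length works on the absolute value, so negatives and zero
# need no special casing (math.log2 raises ValueError for both).
negative_answer = (-1234).bit_length()   # 11
zero_answer = (0).bit_length()           # 0
```

`int.bit_length` also works on integers far too large to convert to a float, where `math.log2` would have to fall back to other tricks.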

keen hearth Sep 4, 2021, 3:07 PM

#

knotty magnet that wouldn't work for any method involving math.log though, since it converts i...

Goes to show, always test at and around the edge cases.

keen hearth Sep 4, 2021, 3:08 PM

#

stable pecan ```py In [20]: cmath.log(-1234, 2) Out[20]: (10.26912667914942+4.532360141827194...

I'll have to review complex numbers to understand how this works 🤔

steady maple Sep 4, 2021, 3:09 PM

#

Guys what Is the use of data structs

#

And algorithms

keen hearth Sep 4, 2021, 3:10 PM

#

Well data structures are just how data is represented in computer programs. The way that you represent data can have a dramatic effect on how efficiently you can process that data.

#

For example, if you represent a list of numbers as an un-sorted array, to find out if a number is in that list, you have to check all of the numbers one-by-one.

steady maple Sep 4, 2021, 3:12 PM

#

Yes

keen hearth Sep 4, 2021, 3:12 PM

#

But if you stored those numbers in Eg a sorted list, or a binary-search tree, you could narrow in on where the number should be by repeatedly ruling out half of the numbers.

#

Even better, if you store them in a hash-table, you can go straight to where the number should be, and check if it is there.
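The three options above, sketched side by side. The function names and sample numbers are invented for illustration; `bisect_left` is from the standard library and `set` is Python's built-in hash-based container.

```python
from bisect import bisect_left

numbers = [42, 7, 19, 3, 88, 61]

def in_unsorted(items, target):
    """Linear scan: checks items one by one, O(n)."""
    for item in items:
        if item == target:
            return True
    return False

def in_sorted(sorted_items, target):
    """Binary search via bisect: halves the search range each step, O(log n)."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target

sorted_numbers = sorted(numbers)
number_set = set(numbers)   # hash-based: average O(1) membership test

found = (in_unsorted(numbers, 19), in_sorted(sorted_numbers, 19), 19 in number_set)
missing = (in_unsorted(numbers, 20), in_sorted(sorted_numbers, 20), 20 in number_set)
```

All three give the same answers; they differ only in how much work it takes to get there as the collection grows.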

steady maple Sep 4, 2021, 3:12 PM

#

U mean this will just reduce some time?

keen hearth Sep 4, 2021, 3:13 PM

#

Yep

#

Think about how you would go about finding a particular page of a book.

steady maple Sep 4, 2021, 3:13 PM

#

I got it

#

Thanks for the perfect example

#

Well is it helpful in case of db too ?

keen hearth Sep 4, 2021, 3:14 PM

#

You could start at the first page and go page-by-page until you find the page number you're looking for. Or you could go to the middle of the book, see whether the page number is less than or greater than the one you're looking for, then either look in the left-half or the right half, then repeat.
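The halving strategy can be sketched as a function that counts how many pages get opened. `find_page` and the 1000-page book are invented for illustration.

```python
def find_page(num_pages: int, target: int) -> int:
    """Return how many pages are opened before `target` is found (pages are 1-based)."""
    low, high = 1, num_pages
    opened = 0
    while low <= high:
        middle = (low + high) // 2
        opened += 1
        if middle == target:
            return opened
        if middle < target:
            low = middle + 1     # target is in the right half
        else:
            high = middle - 1    # target is in the left half
    raise ValueError("page not in book")

# Worst case over every page of a 1000-page book.
worst_case = max(find_page(1000, page) for page in range(1, 1001))
```

Going page by page could take up to 1000 looks in the same book, while the halving approach never needs more than 10.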

#

Sorry, I started typing that out so had to finish 😄

keen hearth Sep 4, 2021, 3:14 PM

#

steady maple Well is it helpful in case of db too ?

Yeah, so have you used a relational database before?

steady maple Sep 4, 2021, 3:14 PM

#

keen hearth Yeah, so have you used a relational database before?

I have used

keen hearth Sep 4, 2021, 3:15 PM

#

Have you heard of "indexing a column"?

steady maple Sep 4, 2021, 3:15 PM

#

I have experience with both SQL and non SQL db

steady maple Sep 4, 2021, 3:16 PM

#

keen hearth Have you heard of "indexing a column"?

Yep I have heard about it before but not sure as I used SQL many days ago or months

keen hearth Sep 4, 2021, 3:17 PM

#

If you index a column of a table, the database builds an auxiliary data-structure (generally a tree or a hash-table), which allows you to quickly look up rows of the table based on the value of that column.
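A rough way to see this with the standard-library `sqlite3` module, using SQLite as a stand-in for whichever database you use. The `users` table, the index name and the `query_plan` helper are all made up for this sketch. SQLite's indexes are B-trees rather than hash tables, but the effect on lookups is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [(f"user{i}", 20 + i % 50) for i in range(1000)],
)

def query_plan(sql: str) -> str:
    """Return SQLite's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)   # last column is the detail text

lookup = "SELECT * FROM users WHERE name = 'user500'"
plan_before = query_plan(lookup)    # without an index: a scan over the table
conn.execute("CREATE INDEX idx_users_name ON users (name)")
plan_after = query_plan(lookup)     # with the index: a search that names idx_users_name
```

The exact wording of the plan text varies between SQLite versions, so it is best read as a hint rather than parsed.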

steady maple Sep 4, 2021, 3:17 PM

#

keen hearth If you index a column of a table, the database builds an auxiliary data-structur...

Like primary key does?

keen hearth Sep 4, 2021, 3:18 PM

#

Yep, I think when you specify a column as the primary key most databases automatically index that column.

steady maple Sep 4, 2021, 3:19 PM

#

Well maybe but as much as I know primary key is the special value of a cell which cannot be repeated in the table once more and thats why it helps the db to find it faster

#

As it is the only one with that value in the table

#

Or I should say a row

#

I always get confused in row and column lol

keen hearth Sep 4, 2021, 3:22 PM

#

steady maple Well maybe but as much as I know primary key is the special value of a cell whic...

Ah yeah, but under the surface, the thing that actually makes it faster is that the database builds an index for the primary key.

steady maple Sep 4, 2021, 3:23 PM

#

keen hearth Ah yeah, but under the surface, the thing that actually makes it faster is that ...

Ah I get it

#

So how does data structs help to make it faster

keen hearth Sep 4, 2021, 3:24 PM

#

steady maple So how does data structs help to make it faster

Well if for example the index is a hash-table, the database can go straight to that row of the table. It doesn't have to scan through the rows of the table to find it.

#

Dictionaries in python are also hash tables.

steady maple Sep 4, 2021, 3:25 PM

#

So we need a hash table so that we can stop scanning all the rows

steady maple Sep 4, 2021, 3:25 PM

#

keen hearth Dictionaries in python are also hash tables.

Can u plz explain me what are hash tables

#

Sry if I am irritating u by pinging and asking again and again

keen hearth Sep 4, 2021, 3:26 PM

#

steady maple Sry if I am irritating u by pinging and asking again and again

Erm, no problem, I'm in this channel anyway so I don't get the pings.

#

And I get like 100 pings a day anyway as a mod 😄

steady maple Sep 4, 2021, 3:27 PM

#

Lol

stable pecan Sep 4, 2021, 3:27 PM

#

hash tables are tables made out of shredded potatoes

steady maple Sep 4, 2021, 3:28 PM

#

stable pecan hash tables are tables made out of shredded potatoes

What!

#

Wth was that!

stable pecan Sep 4, 2021, 3:28 PM

#

if you fry them they become hash brown tables

steady maple Sep 4, 2021, 3:28 PM

#

What r u trying to do?

#

LX

keen hearth Sep 4, 2021, 3:30 PM

#

steady maple Can u plz explain me what are hash tables

Yeah. So a hash-table stores key-value pairs. You can use a key to look up its associated value. The hash table uses the key itself to decide where to store it.

stable pecan Sep 4, 2021, 3:30 PM

#

hash tables are really a reserved chunk of memory -- then keys are hashed which gives you which chunk in memory the key goes

knotty magnet Sep 4, 2021, 3:30 PM

#

and hamt is just bacon

stable pecan Sep 4, 2021, 3:30 PM

#

if you have too many keys and a small table, your hash table will have poor performance!

steady maple Sep 4, 2021, 3:31 PM

#

keen hearth Yeah. So a hash-table stores key-value pairs. You can use a key to look up its a...

I am still confused maybe a example will be helpful

knotty magnet Sep 4, 2021, 3:31 PM

#

have you used a python dictionary before?

steady maple Sep 4, 2021, 3:34 PM

#

knotty magnet have you used a python dictionary before?

Well yes but not too much

knotty magnet Sep 4, 2021, 3:34 PM

#

python dicts are hashmaps

steady maple Sep 4, 2021, 3:34 PM

#

knotty magnet python dicts are hashmaps

Can u plz just code a simple example and tell me which one is the hash or whatever u call it

keen hearth Sep 4, 2021, 3:35 PM

#

Like, we could implement our own hash-table in python: ```py
buckets = []
NUM_BUCKETS = 10

for _ in range(NUM_BUCKETS):
    bucket = []
    buckets.append(bucket)

def hash(key: str) -> int:
    """Tells you which bucket a given key should be stored in."""
    # We just add up the ordinals of the characters of the key
    # then take the remainder when you divide that by the number
    # of buckets.
    result = 0
    for char in key:
        result += ord(char)
    return result % NUM_BUCKETS

def add(key: str):
    """Adds a key to the hash table."""
    if contains(key):
        return
    # We want to add the key to a specific bucket.
    bucket = buckets[hash(key)]
    bucket.append(key)

def contains(key: str) -> bool:
    """Tells you whether the hash table contains the given key."""
    # We only need to look in one bucket for the key (not all of them).
    bucket = buckets[hash(key)]
    return key in bucket
``` This example of a hash-table doesn't have values, just keys. (In practice, you would just use dict or set in python, you wouldn't implement your own hash table. If you were writing a program in C, you might implement your own.)
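For comparison, a sketch of how the same idea might be extended to store values as well. `ToyHashTable` is a hypothetical teaching class using the same sum-of-ordinals hash as the example above; it is not how `dict` is actually implemented.

```python
class ToyHashTable:
    """Hypothetical teaching class; use dict in real code."""

    def __init__(self, num_buckets: int = 10):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key: str) -> list:
        # Same toy hash as above: sum of character ordinals modulo bucket count.
        return self.buckets[sum(ord(char) for char in key) % len(self.buckets)]

    def set(self, key: str, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite the existing entry
                return
        bucket.append((key, value))

    def get(self, key: str):
        # Only one bucket is searched, however many keys the table holds.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

table = ToyHashTable()
table.set("Steve", 20)
table.set("Amanda", 25)
table.set("Steve", 21)       # replaces the earlier value for "Steve"
```

Keys such as `"ab"` and `"ba"` land in the same bucket with this hash, which is why each bucket stores the key next to its value and compares keys on lookup.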

#

@steady maple this

steady maple Sep 4, 2021, 3:35 PM

#

I am reading it

keen hearth Sep 4, 2021, 3:35 PM

#

Remember that algorithms and data structures are abstract concepts.

#

You can implement the same data-structure in any language.

#

@steady maple #bot-commands message

#

My implementation is not a particularly good implementation of a hash table, but is just to illustrate the concept.

steady maple Sep 4, 2021, 3:39 PM

#

keen hearth Like, we could implement our own hash-table in python: ```py buckets = [] NUM_BU...

Well its pretty confusing to me what it does as I am new to this data struct and algo

#

I saw it in bot cmds and it was printing something

keen hearth Sep 4, 2021, 3:40 PM

#

steady maple Well its pretty confusing to me what it does as I am new to this data struct and...

Tbh, it would probably be best to learn this from a book or lecture. These concepts need to be learned carefully to understand them.

steady maple Sep 4, 2021, 3:41 PM

#

Well yea this is really confusing

stable pecan Sep 4, 2021, 3:43 PM

#

the buckets represent different slots in memory and the hashing is a way to pick which slot a key goes in

steady maple Sep 4, 2021, 3:46 PM

#

Well isn't hashing the way websites store user data and passwords in their db, encrypted

keen hearth Sep 4, 2021, 3:47 PM

#

steady maple Well isn't hashing the way by which websites store user data and password in th...

Yep, that's a related concept!

knotty magnet Sep 4, 2021, 3:47 PM

#

yes, that's one use of hashing

keen hearth Sep 4, 2021, 3:48 PM

#

In that case they want the hashes to be evenly distributed for security reasons (to make it hard to guess the password from the hash).

steady maple Sep 4, 2021, 3:48 PM

#

Yes true

#

Maybe I got It

keen hearth Sep 4, 2021, 3:48 PM

#

In this case we want the hashes to be evenly distributed so that the data is evenly distributed in the table.

#

The key idea is that the hash function tells us where to look for a given key.

#

Sorry for mixing different meanings of the word "key" in one sentence 😄

steady maple Sep 4, 2021, 3:50 PM

#

U mean hash just updates the data and make it something like a primary key ?@keen hearth

#

So that the db gets it faster

keen hearth Sep 4, 2021, 3:51 PM

#

This image from Wikipedia kind of illustrates it:

#

Hash-tables are a data structure. Databases are one place where they are implemented, to make looking up rows by primary key faster.

steady maple Sep 4, 2021, 3:52 PM

#

I think the hash func just encrypted and changed the data

#

To make a primary key

keen hearth Sep 4, 2021, 3:52 PM

#

Yep, but notice that the hash function tells us exactly where to look for certain data.

#

Otherwise, we would have to go through the table row-by-row to find it.

#

The hash function just points us straight to it.

steady maple Sep 4, 2021, 3:53 PM

#

Ah I got ur point

#

Well what in case if the key is a int

keen hearth Sep 4, 2021, 3:54 PM

#

Any data can be converted into a string of bytes, then you just hash that string of bytes.

steady maple Sep 4, 2021, 3:55 PM

#

keen hearth Any data can be converted into a string of bytes, then you just hash that string...

Can u tell me how to convert something into bytes am I supposed to use ioBytes or any other method

keen hearth Sep 4, 2021, 3:55 PM

#

I just mean in general, that's one way you could hash any kind of data.
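As a rough illustration of the "convert to bytes, then hash the bytes" idea. `to_bytes` and `toy_hash` are invented helpers, the byte layout chosen for ints is just one possible convention, and the hash is the same toy sum-and-remainder scheme as the bucket example earlier, not what Python's built-in `hash` does.

```python
NUM_BUCKETS = 10

def to_bytes(value) -> bytes:
    """Hypothetical helper: turn a few common types into bytes."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, int):
        # Enough bytes for the magnitude, plus room for the sign bit.
        length = value.bit_length() // 8 + 1
        return value.to_bytes(length, "big", signed=True)
    raise TypeError(f"unsupported type: {type(value).__name__}")

def toy_hash(value) -> int:
    """Sum the byte values and wrap the total into the bucket range."""
    return sum(to_bytes(value)) % NUM_BUCKETS

string_bucket = toy_hash("hello")
int_bucket = toy_hash(1234)
```

The point is only that strings and ints go through the same hashing step once they are both bytes.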

steady maple Sep 4, 2021, 3:55 PM

#

Ok

knotty magnet Sep 4, 2021, 3:55 PM

#

all data is just bytes afterall

keen hearth Sep 4, 2021, 3:56 PM

#

We don't talk about implementation details in this channel lemon_sweat

steady maple Sep 4, 2021, 3:56 PM

#

That's enough to know all this

#

Well how can I hash something?

stable pecan Sep 4, 2021, 3:57 PM

#

i think cpython has a very simple hash function

#

something something mod 5

steady maple Sep 4, 2021, 3:57 PM

#

Well I am using python in my bot

#

So I need to use python in it

keen hearth Sep 4, 2021, 3:57 PM

#

Python has a built-in hash function.

steady maple Sep 4, 2021, 3:57 PM

#

keen hearth Python has a built-in `hash` function.

What!

keen hearth Sep 4, 2021, 3:57 PM

#

But python also has a built-in hash-table called dict.

#

You would generally use dict rather than using hash directly.

steady maple Sep 4, 2021, 3:58 PM

#

Where and how to get that

keen hearth Sep 4, 2021, 3:58 PM

#

steady maple Where and how to get that

dict?

steady maple Sep 4, 2021, 3:58 PM

#

Sry for my slow internet

#

Lol

#

My msgs are sent after 5-7 secs

keen hearth Sep 4, 2021, 3:59 PM

#

ah right 😄

steady maple Sep 4, 2021, 3:59 PM

#

So I can use hash instead of dict

#

?

#

Oops I mean

#

Dict instead of hash

keen hearth Sep 4, 2021, 4:01 PM

#

hash just takes an object and turns it into a number. It is what dict, a key-value hash-table, uses to decide where to store things.

#

!eval print(hash('hello'))

halcyon plankBOT Sep 4, 2021, 4:01 PM

#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

5560446419581122223

steady maple Sep 4, 2021, 4:03 PM

#

Ah I got it

keen hearth Sep 4, 2021, 4:03 PM

#

Example of a dict: ```py
my_dict = {
    'Steve': 20,
    'Amanda': 25,
}
```

knotty magnet Sep 4, 2021, 4:04 PM

#

stable pecan i think cpython has a very simple hash function

yeah, most ints just hash to themselves
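This can be poked at from the interpreter. The sketch below describes CPython specifically (other implementations may differ), and the variable names are for illustration only.

```python
import sys

small_int_hash = hash(42)            # small non-negative ints hash to themselves

# CPython reduces int hashes modulo a fixed prime, exposed in sys.hash_info.
modulus = sys.hash_info.modulus
wrapped_hash = hash(modulus + 5)     # wraps around to 5

# -1 is used as an error code inside CPython's C code, so hash(-1) is -2.
minus_one_hash = hash(-1)
```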

keen hearth Sep 4, 2021, 4:04 PM

#

Then you could look up values by doing Eg my_dict['Steve']

#

Which would return 20.

steady maple Sep 4, 2021, 4:05 PM

#

But what in case I want to get the hashed value back @keen hearth

glossy breach Sep 4, 2021, 4:05 PM

#

I'm sure hash is meant to be one way

keen hearth Sep 4, 2021, 4:12 PM

#

Anyway, I'd recommend picking up a book or online-course in algorithms to learn more.

#

There are some recommended resources in the pinned messages of this channel, and on our website:

#

!resources

halcyon plankBOT Sep 4, 2021, 4:14 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

trim fiber Sep 4, 2021, 5:04 PM

#

Neo4j client has builtin option for that as far as I remember pithink

shut breach Sep 4, 2021, 5:43 PM

#

yeah it has a pretty good renderer too iirc

keen hearth Sep 4, 2021, 6:21 PM

#

A commonly used tool is Graphviz: https://graphviz.org

#

Graphs are specified in its DOT language: https://graphviz.org/doc/info/lang.html

#

What format is the graph in to begin with?

main flower Sep 4, 2021, 6:25 PM

#

graphviz is great but if you need something easy and fast check https://csacademy.com/app/graph_editor/

CS Academy

CSAcademy is a next generation educational platform. Discover computer science
with interactive lessons and a seamless online code editor.

#

it's really helpful in cp

keen hearth Sep 4, 2021, 6:26 PM

#

Ooo nice!

steady maple Sep 4, 2021, 6:51 PM

#

!eval print(hash('hello'))

halcyon plankBOT Sep 4, 2021, 6:51 PM

#

@steady maple :white_check_mark: Your eval job has completed with return code 0.

8854791851944303749

steady maple Sep 4, 2021, 6:51 PM

#

!eval print(hash('hello'))

halcyon plankBOT Sep 4, 2021, 6:51 PM

#

@steady maple :white_check_mark: Your eval job has completed with return code 0.

7179474724399652171

steady maple Sep 4, 2021, 6:52 PM

#

keen hearth !eval `print(hash('hello'))`

LX help

#

Is it possible to get the hashed value back?

keen hearth Sep 4, 2021, 6:53 PM

#

Ah, I think hashes aren't necessarily the same between runs of Python.
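For background: since Python 3.3, `str` and `bytes` hashes are salted with a per-process random value by default (controlled by the `PYTHONHASHSEED` environment variable), which is why the two eval runs above print different numbers. If a value has to be the same on every run, a digest from `hashlib` is one option. `stable_hash` is a made-up helper name.

```python
import hashlib

def stable_hash(text: str) -> str:
    """Hypothetical helper: a digest that is the same on every run and machine."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

digest = stable_hash("hello")   # 64 hex characters, identical across runs
```

The built-in `hash` is still the right tool for what `dict` and `set` do internally; it just is not meant to be stored or compared across processes.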

steady maple Sep 4, 2021, 6:55 PM

#

So how will I compare it with the value in the db

#

How will it match
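For the password case specifically, the usual pattern is to never get the original value back: store a salted hash, then hash each login attempt the same way and compare the two hashes. A minimal sketch with the standard library, where the variables `salt` and `stored_hash` stand in for database columns and the iteration count is only illustrative:

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash to store instead of the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

# At sign-up: keep the salt and the derived hash (hypothetical "db" fields).
salt = os.urandom(16)
stored_hash = hash_password("correct horse", salt)

def verify(password: str) -> bool:
    """At login: re-hash the attempt with the stored salt and compare."""
    attempt = hash_password(password, salt)
    return hmac.compare_digest(attempt, stored_hash)   # constant-time comparison
```

For a real bot or website, a maintained password-hashing library is probably a safer choice than hand-rolled code like this.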

knotty magnet Sep 4, 2021, 6:57 PM

#

the db doesn't use the same hash function as python

#

python's hash function was mostly meant for speed

steady maple Sep 4, 2021, 6:59 PM

#

@knotty magnet Can u suggest me a hashing algo that can do it?

knotty magnet Sep 4, 2021, 7:02 PM

#

i think most databases will use BTrees actually

#

rust uses siphash for their hashmaps so that's probably fine

#

wonder what python's hashing algorithm is named 🤔

elfin river Sep 4, 2021, 7:29 PM

#

!eval 1+1

#

!eval print(1+1)

cosmic ruin Sep 4, 2021, 7:56 PM

#

Hello! I have a question regarding matplotlib. As someone suggested, I might as well ask here. This is the message link with the specifics, feel free to ping if you have any solution!
#help-peanut message

vocal gorge Sep 4, 2021, 8:02 PM

#

cosmic ruin Hello! I have a question regarding matplotlib. As someone suggested, I might as ...

plt.plot(x, y)
fig, ax = plt.subplots()

This plots a plot on a new figure, then creates a second figure with one subplot, which you apply all the formatting to. I'm not sure why you need subplots at all for only one plot, but either way, you should be calling it before plotting, and plot on the Axes object it provides you, like ax.plot(x, y).

keen hearth Sep 4, 2021, 8:05 PM

#

!eval print(1 + 1)

halcyon plankBOT Sep 4, 2021, 8:06 PM

#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

cosmic ruin Sep 4, 2021, 8:06 PM

#

Ohhh, makes sense

#

thank you!

keen hearth Sep 4, 2021, 8:06 PM

#

Wait what @elfin river

keen hearth Sep 4, 2021, 8:29 PM

#

Just kidding lemon_pleased

elfin river Sep 4, 2021, 9:07 PM

#

nothing to see here.. just adding bits together.

mental parcel Sep 5, 2021, 1:59 AM

#

Looking to see if anyone has any ideas for approaches to this as I'd be interested in hearing things.

So my problem is that I have a file that's too big to load into memory and I need to randomise the lines in the file. How would you approach it?

I've got a couple ideas related to splitting the file and then merging files, a little like splitting a deck of cards in half and then merging.
One of my mates also suggested I could do passes over the file, noting a line and then doing an in place swap. Then repeating this process.

But I'm curious if there's a better way / alternatives

#

Since Im sure there's a good way to do it. I just don't know what it is

fervent saddle Sep 5, 2021, 2:09 AM

#

One of my mates also suggested I could do passes over the file, noting a line and then doing an in place swap. Then repeating this process.
The lines are all the same length?

#

If they are then you could do something like this in "r+" mode. You can kind of index it using seek
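A rough sketch of what that could look like, assuming every line (including its newline) occupies exactly record_len bytes. The function name and parameters are made up for illustration; the swap order is a plain Fisher-Yates shuffle, which is my choice rather than something stated above.

```python
import os
import random

def shuffle_fixed_width_in_place(path, record_len, seed=None):
    """Shuffle equal-length records of a file in place (hypothetical helper)."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    if size % record_len != 0:
        raise ValueError("file size is not a multiple of record_len")
    n = size // record_len
    with open(path, "r+b") as f:  # binary mode so seek offsets are plain byte counts
        # Fisher-Yates: swap record i with a random record j <= i.
        for i in range(n - 1, 0, -1):
            j = rng.randint(0, i)
            if i == j:
                continue
            f.seek(i * record_len)
            rec_i = f.read(record_len)
            f.seek(j * record_len)
            rec_j = f.read(record_len)
            f.seek(i * record_len)
            f.write(rec_j)
            f.seek(j * record_len)
            f.write(rec_i)
```

Only two records are held in memory at a time, but this does one random seek per swap, so it may be slow on very large files.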

keen hearth Sep 5, 2021, 2:27 AM

#

mental parcel Looking to see if anyone has any ideas for approaches to this as I'd be interest...

Seems like splitting it up into files small enough to fit in memory, shuffling each individually, then randomly merging them back together would be a reasonable way to do it. I don't think you can do an in-place swap, because, as mesolikey pointed out, the lines would have to be the same length.
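One way to sketch the split-and-shuffle idea, with a twist that is not what was described above: instead of cutting contiguous chunks and merging them randomly, each line is scattered into a randomly chosen bucket file, each bucket is shuffled in memory, and the buckets are concatenated. The function name, n_buckets, and the assumption that every bucket fits in memory are all mine.

```python
import os
import random
import tempfile

def shuffle_big_file(src, dst, n_buckets=16, seed=None):
    """Shuffle the lines of src into dst without loading src at once (sketch)."""
    rng = random.Random(seed)
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"bucket_{i}.txt") for i in range(n_buckets)]
        buckets = [open(p, "w", encoding="utf-8") for p in paths]
        try:
            # Pass 1: stream the input and scatter each line into a random bucket.
            with open(src, "r", encoding="utf-8") as f:
                for line in f:
                    if not line.endswith("\n"):
                        line += "\n"  # last line may lack a newline
                    rng.choice(buckets).write(line)
        finally:
            for b in buckets:
                b.close()
        # Pass 2: shuffle each bucket in memory and append it to the output.
        with open(dst, "w", encoding="utf-8") as out:
            for p in paths:
                with open(p, "r", encoding="utf-8") as b:
                    lines = b.readlines()
                rng.shuffle(lines)
                out.writelines(lines)
```

Pick n_buckets so that file size divided by n_buckets comfortably fits in RAM. Variable line lengths are not a problem here since nothing is swapped in place.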

mental parcel Sep 5, 2021, 2:32 AM

#

Yeah, they're not the same length unfortunately

brisk aurora Sep 5, 2021, 8:14 AM

#

numpy question:
I have a viewing frustum that is represented by 6 planes (a plane is 4 floats: a,b,c for the normal, and d for the distance: ax+by+cz+d = 0). I have N elements that I want to render. each element has an axis-aligned bounding box represented by a bounding_box_min vector3 (x,y,z floats) and a bounding_box_max.
I'm trying to figure out which elements are completely outside the viewing frustum by checking where their bounding box lies on each plane.
This is my regular, procedural python code:

def is_bbox_contained_or_intersects(self, bbox_min: Vector3, bbox_max: Vector3) -> bool:
    for plane in self._planes:
        bbox_x = bbox_max[0] if plane[0] > 0 else bbox_min[0]
        bbox_y = bbox_max[1] if plane[1] > 0 else bbox_min[1]
        bbox_z = bbox_max[2] if plane[2] > 0 else bbox_min[2]

        dot = plane[0] * bbox_x + plane[1] * bbox_y + plane[2] * bbox_z

        if dot < -plane[3]:
            return False

    return True

self._planes is a list of the 6 planes (a plane is a Tuple[float, float, float, float]). Vector3 is Tuple[float, float, float]
I'm running this function per element.
How can I vectorize this with numpy?

brave oak Sep 5, 2021, 8:15 AM

#

brisk aurora numpy question: I have a viewing frustum that is represented by 6 planes (a plan...

np.where

#

for the first 3 lines

#

np.dot for the dot product

#

then just vectorised comparison

brisk aurora Sep 5, 2021, 8:16 AM

#

what would the np arrays look like though?

#

(I'm a complete numpy novice, apologies for the silly questions)

brave oak Sep 5, 2021, 8:16 AM

#

brisk aurora (I'm a complete numpy novice, apologies for the silly questions)

nah they're not silly

#

this is an interesting question

#

so you wanna make self._planes an array

#

a 6x4 array

brisk aurora Sep 5, 2021, 8:17 AM

#

so it would be a 6x4?

brave oak Sep 5, 2021, 8:17 AM

#

okay

#

and I'm going to assume

#

you want to stack the bounding boxes

#

which will make them Nx3 arrays?

#

where N is the number of elements to be rendered

brisk aurora Sep 5, 2021, 8:18 AM

#

so bbox_min and bbox_max will be Nx3, sure makes sense

brave oak Sep 5, 2021, 8:18 AM

#

that seems like the main axis of vectorisation to me

#

yeah

#

that's the sketched outline

#

of the solution

#

if you don't get it in a few hours ping me again here or in #data-science-and-ml

#

I'm not on my coding computer

brisk aurora Sep 5, 2021, 8:18 AM

#

thanks!

brave oak Sep 5, 2021, 8:18 AM

#

so I can only speak in abstractions

#

yw 👋

brisk aurora Sep 5, 2021, 8:18 AM

#

much appreciated

#

another numpy question - are there good type hints for numpy arrays and methods?

stable pecan Sep 5, 2021, 8:30 AM

#

if you call:

seven(times(five()))

the definition for seven is:

def seven(f=None): 
    return 7 if not f else f(7)

let's substitute in times(five()) for f:

seven(times(five())) = 7 if not times(five()) else times(five())(7)

but what's times(five())?

times(five()) = lambda x: x * five()

what's five()?

five() = 5 if not f else f(5) = 5

going back up the chain:

times(five()) = lambda x: x * five()
times(5) = lambda x: x * 5

and finally:

seven(lambda x: x * 5) = 7 if not (lambda x: x * 5) else (lambda x: x * 5)(7) = (lambda x: x * 5)(7) = 7 * 5 = 35
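The substitution above can be checked by running it. Only seven is defined in the walkthrough; five and times here are reconstructed from how they are expanded, so treat them as assumptions.

```python
def five(f=None):
    # Same pattern as seven: return the number, or apply f to it.
    return 5 if not f else f(5)

def seven(f=None):
    return 7 if not f else f(7)

def times(y):
    # Assumed definition, matching "times(5) = lambda x: x * 5".
    return lambda x: x * y

result = seven(times(five()))
print(result)  # 35
```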

brave oak Sep 5, 2021, 8:32 AM

#

brisk aurora another numpy question - are there good type hints for numpy arrays and methods?

not really.

#

AFAIK?

#

I could be wrong about this

#

haven't done much with numpy for at least a year

#

what kind of type hints

brisk aurora Sep 5, 2021, 8:33 AM

#

it could be nice to type hint the shape of ndarrays

brave oak Sep 5, 2021, 8:33 AM

#

are you looking for

#

oh

#

that would require dependent types

#

which we don't have in Python

brisk aurora Sep 5, 2021, 8:42 AM

#

@brave oak I mean the python type hinting: https://docs.python.org/3/library/typing.html

#

@brave oak regarding np.where, I'm not sure what my condition is for the 6x4 planes array vs. the bbox_min + bbox_max arrays?

brave oak Sep 5, 2021, 8:45 AM

#

brisk aurora @brave oak I mean the python type hinting: https://docs.python.org/3...

yeah, so

#

two arrays of different shapes

#

are of type np.ndarray

#

how do you represent the shape?

#

keep in mind that shapes differ based on the values in the array

#

in other words, we have types depending on values

#

this is called dependent typing

#

not many languages have it

brisk aurora Sep 5, 2021, 8:46 AM

#

I see

brave oak Sep 5, 2021, 8:46 AM

#

Python's typing system is kinda anaemic tbh

#

but well that's not on topic
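On the type-hint question: as far as I know, NumPy 1.21 and later ship numpy.typing.NDArray, which lets a hint carry the dtype but not the shape. A small sketch, assuming such a NumPy version, with the shape recorded in the docstring as a plain comment-style note. The function name is made up.

```python
import numpy as np
import numpy.typing as npt

def plane_normals(planes: npt.NDArray[np.float64]) -> npt.NDArray[np.float64]:
    """Return the (a, b, c) part of each plane.

    planes: shape (P, 4)
    result: shape (P, 3)
    """
    return planes[:, :3]

normals = plane_normals(np.zeros((6, 4)))
print(normals.shape)  # (6, 3)
```

The shapes in the docstring are not checked by anything; they are documentation only.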

brave oak Sep 5, 2021, 8:46 AM

#

brisk aurora @brave oak regarding np.where, I'm not sure what my condition is for...

okay

#

so

#

planes is 6x4, right

brisk aurora Sep 5, 2021, 8:47 AM

#

I really wanted it just as a comment for "this ndarray is of shape X at the moment"

#

ok, yeah

brave oak Sep 5, 2021, 8:47 AM

#

planes > 0 will also be 6x4

#

you don't need the distance, though

#

so you can take planes[:, :3] > 0

#

that gives you 6x3

brisk aurora Sep 5, 2021, 8:47 AM

#

that's "free"? (planes[:, :3]) ?

#

it's not copying anything?

brave oak Sep 5, 2021, 8:48 AM

#

now, you want a coordinate from bbox_max or bbox_min depending on whether the corresponding value in planes > 0 is True or False, respectively

brave oak Sep 5, 2021, 8:48 AM

#

brisk aurora that's "free"? (planes[:, :3]) ?

free meaning?

brisk aurora Sep 5, 2021, 8:48 AM

#

like not taking time based on the amount of values in it

brave oak Sep 5, 2021, 8:48 AM

#

you mean

#

a view

#

as opposed to a copy?

brisk aurora Sep 5, 2021, 8:48 AM

#

yeah

#

exactly

brave oak Sep 5, 2021, 8:49 AM

#

I don't remember the exact rules

#

but I believe that should create a view

brisk aurora Sep 5, 2021, 8:49 AM

#

brave oak now, you want a coordinate from `bbox_max` or `bbox_min` depending on whether th...

ok sorry, I was distracting your main point

brave oak Sep 5, 2021, 8:49 AM

#

https://numpy.org/doc/stable/reference/arrays.indexing.html

#

you can read this

brave oak Sep 5, 2021, 8:49 AM

#

brave oak now, you want a coordinate from `bbox_max` or `bbox_min` depending on whether th...

this is where

#

you use np.where

#

because np.where is basically

#

np.where(c, x, y) -> x if c else y

#

but broadcasted
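A tiny illustration of that elementwise "x if c else y" behaviour, using made-up values:

```python
import numpy as np

c = np.array([True, False, True])
x = np.array([1, 2, 3])
y = np.array([10, 20, 30])

# Takes x where c is True and y where c is False, element by element.
picked = np.where(c, x, y)
print(picked)  # [ 1 20  3]
```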

brisk aurora Sep 5, 2021, 8:50 AM

#

np.where(planes[:, :3] > 0, bbox_max, bbox_min)
that's literally it?

brave oak Sep 5, 2021, 8:50 AM

#

brisk aurora `np.where(planes[:, :3] > 0, bbox_max, bbox_min)` that's literally it?

well, no

#

because you want to broadcast it across all the bounding boxes

#

so the dimensions need to line up

#

basically, what that means is that each axis needs to "mean" the same thing

#

for example

#

right now a bbox is (3,)

#

and planes is (6, 3)

#

so the first axis of planes means "number of planes"

#

but the first axis of the bbox means "coordinate" (x, y, or z)

brave oak Sep 5, 2021, 8:51 AM

#

brave oak but the first axis of the bbox means "coordinate" (x, y, or z)

and this in fact corresponds to the second axis of planes

#

you see what I mean?

#

consider further

brisk aurora Sep 5, 2021, 8:51 AM

#

yes

brave oak Sep 5, 2021, 8:51 AM

#

that if you want to vectorise across bboxes

#

you need a third axis

#

because you can't "mix" that with either axis in planes

#

since, again, that third axis "means" something different

#

got all that?

brisk aurora Sep 5, 2021, 8:52 AM

#

I don't get the third axis

brave oak Sep 5, 2021, 8:52 AM

#

brisk aurora I don't get the third axis

okay

#

one bbox is 3 coordinates

#

N bboxes

#

are Nx3

#

let's say

#

so in general

#

ugh

#

man I should have stuck with M

#

you have P planes, each of which is 3 numbers (the normal's a, b, c; ignore distance)

#

therefore Px3

stable pecan Sep 5, 2021, 8:53 AM

#

planes[:, :3] is a view, but planes[:, :3] > 0 is not a view --- numpy will create a new boolean matrix

brave oak Sep 5, 2021, 8:53 AM

#

stable pecan `planes[:, :3]` is a view, but `planes[:, :3] > 0` is not a view --- numpy will cr...

yup, that's true

#

such manipulations always create new arrays (basically)
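The view-versus-copy point can be checked directly. A small sketch with a dummy planes array; np.shares_memory reports whether two arrays overlap in memory.

```python
import numpy as np

planes = np.zeros((6, 4))

normals = planes[:, :3]   # basic slicing: a view onto planes
mask = normals > 0        # comparison: a brand new boolean array

slice_is_view = np.shares_memory(normals, planes)
mask_is_view = np.shares_memory(mask, planes)

# Writing through the view changes the original array too.
normals[0, 0] = 1.0
print(slice_is_view, mask_is_view, planes[0, 0])
```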

#

now (focus on the mins), you have B bounding boxes, each of which is also 3 coordinates

#

so Bx3

stable pecan Sep 5, 2021, 8:54 AM

#

you can manually call the functions with the out= parameter to reuse arrays if you want

#

i'm derailing though

brave oak Sep 5, 2021, 8:55 AM

#

yeah, simplest example is np.add

#

I don't know what the equivalent is for comparison actually
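For comparisons, the ufunc behind > is np.greater, and it accepts out= like np.add does. A minimal sketch with made-up values, reusing a preallocated boolean buffer:

```python
import numpy as np

normals = np.array([[1.0, -2.0, 0.0],
                    [-1.0, 3.0, 4.0]])

# Preallocate once; later calls can write into the same buffer.
mask = np.empty(normals.shape, dtype=bool)
returned = np.greater(normals, 0, out=mask)

print(mask)
```

Whether this saves anything noticeable for a 6x3 array is doubtful; it mostly matters for large arrays in a hot loop.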

brave oak Sep 5, 2021, 8:55 AM

#

brave oak now (focus on the mins), you have B bounding boxes, each of which is also 3 poin...

the "3" in both cases means "number of coordinates"

#

so when using np.where between two arrays

#

one of which represents the planes

#

and the other of which represents the bounding boxes

#

that needs to be the same axis

#

so we have one axis accounted for

#

next, we have the P axis (number of planes) and the B axis (number of bounding boxes)

#

these two mean different things

#

therefore they should not line up

brisk aurora Sep 5, 2021, 8:56 AM

#

ok so for that I reshape planes[:, :3] so the coordinate axis will be first

brave oak Sep 5, 2021, 8:56 AM

#

which means in total

#

you have 3 axes

#

planes
coordinates
bounding boxes

#

got it?

brisk aurora Sep 5, 2021, 8:56 AM

#

yes
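One way to see the three axes is to give the planes mask a length-1 axis where the bounding-box axis should go and let broadcasting fill it in. A sketch with random stand-in data, using P = 6 planes and B = 5 boxes; the variable names are illustrative.

```python
import numpy as np

P, B = 6, 5
rng = np.random.default_rng(0)
normals = rng.normal(size=(P, 3))                        # (planes, coordinates)
bbox_min = rng.uniform(-1.0, 0.0, size=(B, 3))           # (boxes, coordinates)
bbox_max = bbox_min + rng.uniform(0.1, 1.0, size=(B, 3))

# Insert a length-1 "boxes" axis into the mask: (P, 3) -> (P, 1, 3).
mask = (normals > 0)[:, None, :]

# (P, 1, 3) against (B, 3) broadcasts to (P, B, 3):
# axis 0 = planes, axis 1 = bounding boxes, axis 2 = coordinates.
corner = np.where(mask, bbox_max, bbox_min)
print(mask.shape, corner.shape)  # (6, 1, 3) (6, 5, 3)
```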

brave oak Sep 5, 2021, 8:56 AM

#

yeah

#

that's about it

#

I need to go

#

but

#

play around with it

#

it's a good experience

#

🙂

brisk aurora Sep 5, 2021, 8:56 AM

#

OK, no idea how to combine the different number of planes with the different number of bboxes but I'll try 🙂

#

thanks!

brisk aurora Sep 5, 2021, 10:24 AM

#

@brave oak OK I'm kinda lost with shaping the planes array to support the np.where call so the condition, bbox_min and bbox_max are broadcastable. still trying though

brisk aurora Sep 5, 2021, 12:33 PM

#

@brave oak
If I'm not mistaken, if bbox_min and bbox_max are each 1000x3, and planes is 6x4, then:

planes_normal = planes[:, :3]  # (6, 3): a, b, c of each plane
planes_greater_than_zero = planes_normal > 0  # (6, 3) boolean mask
bbox = np.where(planes_greater_than_zero[:, None, :], bbox_max, bbox_min)  # (6, 1000, 3)

Now bbox is 6x1000x3 which means, for each of the 6 planes, for each element, what is the x,y,z that I want to check for the bbox.
Now I'm having trouble with doing the dot product between planes_normal and bbox 🙂

brave oak Sep 5, 2021, 12:34 PM

#

brisk aurora @brave oak If I'm not mistaken, if `bbox_min` and `bbox_max` are ea...

my brain is currently fried, sorry

#

but

#

that doesn't seem so right..

#

the resultant shape should be 6, 1000, 3, right

brisk aurora Sep 5, 2021, 12:35 PM

#

the bbox array is correct, I think. I checked some values.

#

but I can't figure out yet how to do a dot product between the 3 in bbox (which is 6x1000x3), and the 3 in planes_normal (which is 6x3) while "Sharing" the 6 at the beginning.
The result of the dot product should be 6x1000, a dot product per plane, per element

#

np.tensordot(planes_normal, bbox, axes=(1,2)).shape
Out[120]: (6, 6, 1000)

I don't get the extra 6 here
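About the extra 6: np.tensordot only sums over the axes named in axes= and keeps every other axis of both inputs, so the plane axis of planes_normal and the plane axis of bbox both survive, giving (6, 6, 1000). To pair the two plane axes instead of crossing them, np.einsum can name them with the same letter. Below is a sketch of the whole check under that approach, with random stand-in data; the function name is made up and this is one possible vectorisation, not necessarily the one intended earlier in the thread.

```python
import numpy as np

def bboxes_contained_or_intersecting(planes, bbox_min, bbox_max):
    """planes: (P, 4); bbox_min, bbox_max: (B, 3). Returns a (B,) bool array."""
    normals = planes[:, :3]                                  # (P, 3)
    # Per plane and box, pick max or min per coordinate by the normal's sign.
    corner = np.where((normals > 0)[:, None, :], bbox_max, bbox_min)  # (P, B, 3)
    # Sum over c only; p is shared between both inputs, so no extra plane axis.
    dot = np.einsum("pc,pbc->pb", normals, corner)           # (P, B)
    outside = dot < -planes[:, 3:4]                          # (P, 1) broadcast to (P, B)
    # A box is rejected if it is fully outside any one plane.
    return ~outside.any(axis=0)                              # (B,)

rng = np.random.default_rng(0)
planes = rng.normal(size=(6, 4))
bbox_min = rng.uniform(-2.0, 2.0, size=(1000, 3))
bbox_max = bbox_min + rng.uniform(0.0, 1.0, size=(1000, 3))

visible = bboxes_contained_or_intersecting(planes, bbox_min, bbox_max)

# The tensordot call from above, reproduced on dummy data of the same shapes.
extra = np.tensordot(planes[:, :3], np.zeros((6, 1000, 3)), axes=(1, 2)).shape
print(visible.shape, extra)  # (1000,) (6, 6, 1000)
```

An equivalent without einsum would be (normals[:, None, :] * corner).sum(axis=-1), at the cost of one more temporary array.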