#algos-and-data-structs

1 messages · Page 128 of 1

winged plover
#

you can do like
lineas = [linea.strip() for linea in f1.readlines()[-n:]]
and write all lines from linea

bright halo
# winged plover you can do like `lineas = [linea.strip() for linea in f1.readlines()[-n:]]` and...

I tried to implement that in this code but I get error

n = 5

with open("file_text_chat.txt","r+") as f1:
    #lineas = [linea.strip() for linea in f1.readlines()]
    lineas = [linea.strip for linea in f1.readlines[-n:]]

    with open("file_only_last_lines.txt","r+") as f2:
        if palabra not in lineas:
            num_linea = lineas.index(palabra)
            f2.write(f"{palabra}\n")

I get this error message

Traceback (most recent call last):
  File "n_ultimas_lineas.py", line 5, in <module>
    lineas = [linea.strip for linea in f1.readlines[-n:]]
TypeError: 'builtin_function_or_method' object is not subscriptable
#

What should I modify?

daring delta
#

anyone here can help me with coding problems tomorrow 8-10 am??
[9:29 AM]
coding questions like dynamic programming ,graph and trees

winged plover
bright halo
# winged plover Oh dangit.. I forgot the parenthesis there `f1.readlines()[-n:]` My bad

I try with that code but it doesn't give me the correct lines

n = 5

with open("file_text_chat.txt","r+") as f1:
    #lineas = [linea.strip() for linea in f1.readlines()]
    lineas = [linea.strip for linea in f1.readlines()[-n:]]
    print(lineas)

    with open("file_only_last_lines.txt","r+") as f2:
        f2.write(f"{lineas}\n")

This code write that:

[<built-in method strip of str object at 0x0000024EEED9ECB0>, <built-in method strip of str object at 0x0000024EEED59D30>, <built-in method strip of str object at 0x0000024EEEDE69F0>, <built-in method strip of str object at 0x0000024EEECB7E90>, <built-in method strip of str object at 0x0000024EEEE03480>]

rancid hound
#

strip is a function call
Use strip with paranthesis: "strip()"

winged plover
daring delta
#

India Standard Time
Time zone in India (GMT+5:30)

lean dome
#

I won't be able to help then, but I can help you with them now

#

What are the problems?

bright halo
#

@winged ploverthank you very much it worked for me 😄

#

@rancid houndThank you very much, I tried the methods on that page and they also worked for me

fluid prism
#

!decorator

eager cliff
daring delta
lean dome
daring delta
#

yeah i know

daring delta
#

now good?

brave oak
daring delta
#

hmm

lean dome
daring delta
#

u can take whole day if u want

#

i was avialable at that time na

stable pecan
#

@floral lintel let's not spam #internals-and-peps which is a channel about python and not about algorithms

#

i can show you a solver i wrote with numpy --- maybe the structure of it will make sense anyway:

def possible(y, x):
    """
    Return array of possible digits for location (y, x).
    """
    row = grid[y]
    column = grid[:, x]

    gy = y - y % 3
    gx = x - x % 3
    subgrid = grid[gy: gy + 3, gx: gx + 3]

    return np.argwhere(
        ~np.isin(N, row)
        & ~np.isin(N, column)
        & ~np.isin(N, subgrid)
    )

def solve(where_empty=None):
    """
    Sudoku solver with backtracking.
    """
    if where_empty is None:
        # Find coordinates of cells that are empty.
        where_empty = np.argwhere(grid == 0)

    if len(where_empty) == 0:
        return True  # Indicate grid is solved.

    y, x = where_empty[0]

    for n in possible(y, x):
        grid[y, x] = n

        if solve(where_empty=where_empty[1:]):  # Done!
            print(grid)
            return

        grid[y, x] = 0
#

This doesn't keep backtracking once the grid is solved, which i think is the issue you're having.

#

It returns a sentinel value to indicate it has a solved grid (True, in this case)

floral lintel
#

oh, wow. That is a very different way of solving my issue.

#

I gave up trying to get it to return a value, but this could work very well

mint jewel
#

given an infinite DAG where each edge has a weight 1, is there a better algorithm than just breath first search to find the shortest path to a point.

#

paths can be assumed finite

#

hmm, in my problem I can find an upper bound on the path

#

so IG it is not an infinite DAG and therefore I can just use normal graph search algorithms

stable pecan
#

you can have an infinite dag where a path between any two points is always finite

#

kinda trivially

#

but i don't think there's a fast algorithm for finding paths unless there's some nice property of your dag

clever gust
#

A college maintains academic information about students in three separate lists

Course details: A list of pairs of form (coursecode,coursename), where both entries are strings. For instance,
[ ("MA101","Calculus"),("PH101","Mechanics"),("HU101","English") ]

Student details: A list of pairs of form (rollnumber,name), where both entries are strings. For instance,
[ ("UGM2018001","Rohit Grewal"),("UGP2018132","Neha Talwar") ]

A list of triples of the form (rollnumber,coursecode,grade), where all entries are strings. For instance,
[ ("UGM2018001", "MA101", "AB"), ("UGP2018132", "PH101", "B"), ("UGM2018001", "PH101", "B") ]. You may assume that each roll number and course code in the grade list appears in the student details and course details, respectively.

Your task is to write a function transcript (coursedetails,studentdetails,grades) that takes these three lists as input and produces consolidated grades for each student. Each of the input lists may have its entries listed in arbitrary order. Each entry in the returned list should be a tuple of the form

(rollnumber, name,[(coursecode_1,coursename_1,grade_1),...,(coursecode_k,coursename_k,grade_k)])

where the student has grades for k >= 1 courses reported in the input list grades.

The output list should be organized as follows.

1 The tuples shold sorted in ascending order by rollnumber

2 Each student's grades should sorted in ascending order by coursecode

#

Can someone kindly help me with this question?

winged plover
#

guys what would be a good way to get all possible combinations of natural numbers whose sum = a given number
like

3 = {(1,1,1), (1,2), (3,)}```i mean i tried a method and it works.. but i am not sure if its best because above 50-60 ish i think it starts to slow up 👀

my method being iterate and reach the number by addition 
like start from [(1,)]
iterate to give [(1,1),(2,)] and so on

so uhm well wanted to ask if there is anything better 👀

dusk gust
#

idk, but you can use numba, that will speed it up.

winged plover
#

hmm yeah i think.. that would speed it up a bit

dusk gust
#

super basic example using pointless code

#

like ~14x faster in that specific example (I haven't slept in a while so my math is prob wrong)

winged plover
#

ah hmm yeah

#

but the thing is i am using tuples and sets

#

for a functionality in it

dusk gust
#

pretty sure you can use tuples and list with numba

winged plover
dusk gust
#

pretty much any base python and a few libraries like numpy

#

Only downside to numba is that the first time you call a function it has to take a few seconds to compile, which will lead to an initial slow down of the program. But after the function compiles, it will be faster.

winged plover
dusk gust
#

It is good if you call a function many times or if the function has a large loop or while statement.

winged plover
#

Ahh it does have large loops yes

lament quiver
#

heyy all, I implemented the algo to find the longest path in the maze, but as the size of the grid increases, the time of computation increases, and yes the maze have multiple solutions, so I am approaching the longest path, is there any way to reduce the time of computation, I mean using VMs or something else, that I am not aware of...

#

or maybe GPUs, but I suppose they are mostly used for parallel computation rather than recursion one...

tropic glacier
#

How are you defining the "longest" path? Can't I just keep retracing my steps to make a path longer?

vocal gorge
#

Just have this node only have one child, yes. An AST is generally not binary, nodes often have many or only 1 child.

vocal gorge
#

that's a pretty cursed way to do unary minus

#

just don't have it be a binary tree, yeah. It's fine to have different kinds of nodes

stable pecan
#

binary ast seems cursed

#

no unary functions if ast is required to be binary

vocal gorge
#

[1,2,3] -> List(ListStart, List(2, List(3, ListEnd)))

#

aren't binary trees beautiful

knotty magnet
#

😔 ListStart and ListEnd

stable pecan
vocal gorge
#

okay okay, UnaryMinus(None, 5)

stable pecan
#

there's no reason for a binary ast if you are including any ary nodes

#

just complicates things

#

python's ast is just a general tree:

In [39]: ppast("[1, 2, 3, 4, 5]")
Expr
╰──List
   ├──Constant
   │  ╰──1
   ├──Constant
   │  ╰──2
   ├──Constant
   │  ╰──3
   ├──Constant
   │  ╰──4
   ├──Constant
   │  ╰──5
   ╰──Load
vocal gorge
#

yup, ASTs in general really aren't binary

fervent saddle
#

If you’re reading a leetcode type of problem, are there different things that give hints on what kind of techniques or data structures you’ll probably need to use? Is there anything that talks about that?

#

Like can you figure out that you’ll probably need to use a hash table, or a graph, or dynamic programming, just by reading the problem?

knotty magnet
#

sometimes, yeah

fervent saddle
#

Is there like a cheat sheet or something for doing that?

knotty magnet
#

i think it's just practice, a lot of questions are worded similarly

fiery cosmos
#

can a binary tree keep an object alive?

knotty magnet
#

if it has a reference to it yes

fiery cosmos
knotty magnet
#

yes, through node._left

fiery cosmos
#

Thanks

idle pier
#

Hello folks, I have a question.
from what I know nested for loops are usually O(n^2) but not all the time, my question is when you have 2 separate for loops, thats O(n) correct?

for i in somethingHere:
  return something

for i in somethingHere:
  return something```
fervent saddle
#

Yeah, O(n)

#
for i in somethingHere:
  # do something with somethingHere

for i in somethingHere:
  # do something with somethingHere ```
#

O(n) with the size of somethingHere

#

Assuming somethingHere is the same thing for both loops

#

Otherwise, if it’s two independent things, it’s O(n + m)

ivory quest
#

Are these function both O(n) https://github.com/georgepittock/py-scripts/blob/main/scripts/numbers_in_jumbled_string.py? If I run the code in the comments at the top it looks like submitted is O(n) but new is O(log n) but I don’t understand how it could be?

GitHub

A collection of scripts I have used/made and wanted to make publicly available. Most of these will be algorithms etc. - py-scripts/numbers_in_jumbled_string.py at main · georgepittock/py-scripts

#

If it is O(log n) could someone please explain how it is, I’m fairly new to big o notation and understand log n as “you don’t check every element just halving and halving again”

heady jewel
#

Counter iterates over each character in string so it's O(n)

ivory quest
#

I thought so, so what would explain why execution time appears to show it as O(log n)?

#

Or is that just an example of it’s up to the order of n but not necessarily order of n

vocal gorge
#

Why do you believe it to be O(log n)?

heady jewel
#

could be many things, noise, internal optimization for short strings, etc.

ivory quest
#

If I run the execution time based on input size n it goes up logarithmically

heady jewel
#

are you sure it's actually logarithmic instead of just a small coefficient

vocal gorge
#

That shouldn't be possible, as Counter(string) is O(n) already.

heady jewel
#

what range of input lengths are we talking about

ivory quest
vocal gorge
#

but what makes you think it's logarithmic?

#

lemme make some plots, I guess

ivory quest
#

Just from reading the numbers it seemed logarithmic so yeah not very sure :/

heady jewel
#

Consider the graphs of 1 + 0.2n and 0.5n

vocal gorge
# ivory quest Just from reading the numbers it seemed logarithmic so yeah not very sure :/
import perfplot
numbers = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
perfplot.show(
    setup = lambda n: "".join(random.choices(numbers, k=n)),
    kernels = [submitted, new],
    n_range = list(range(10)) + list(2**i for i in range(4,10)),
    xlabel = "len(string)",
    target_time_per_measurement=1,
    equality_check=lambda a,b:a==b
)

Looks extremely linear to me.

heady jewel
#

both grow at the order of n but one of them has this pesky 1 + at the beginning

#

in big-o notation we don't care about it because we only care about the rate of growth, it's also the same reason coefficients are discarded

ivory quest
#

Ah that makes it much clearer, thanks

vocal gorge
#

and here's a loglog plot

heady jewel
#

An algorithm described as O(N^2) could in reality be something like 3 N ^ 2 + 15 N + 10 000 but since as N grows, the term N ^ 2 will have the most drastic effects the others are discarded.

vocal gorge
#

only at below 2 numbers is the submitted version faster

heady jewel
#

You also have to remember that runtime complexity does not automatically imply performance

ivory quest
vocal gorge
# austere sparrow 👀 that's a nice library

I hate its interface (every time I use it, I have to go look at the github page, because the author didn't bother putting the same thing in the docstring or, god forbid, the type hints), but yeah, it is

austere sparrow
#

types don't exist

#

yolo

heady jewel
#

No I'm saying that results between algorithms can be surprising and if you want to gauge performance you need to measure it.

vocal gorge
#
>>> ?perfplot.show
Signature:
perfplot.show(
    *args,
    time_unit='s',
    relative_to=None,
    logx='auto',
    logy='auto',
    **kwargs,
)
Docstring: <no docstring>

you might notice that these are not all the arguments this function accepts, but you wouldn't be able to tell without looking at github 😩

ivory quest
#

Got it thank you both @vocal gorge @heady jewel really helpful, @vocal gorge am I ok to use one of those photos on my GitHub?

vocal gorge
#

sure; you can also generate them yourself - the snippet I posted is what does it

#

@ivory questhere's a better one with more points and correctly labeled x axis

numbers = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
perfplot.show(
    setup = lambda n: "".join(random.choices(numbers, k=n)),
    kernels = [submitted, new],
    n_range = list(range(1,10)) + list(2**i for i in range(4,15)),
    xlabel = "number count",
    target_time_per_measurement=1,
    equality_check=lambda a,b:a==b,
    logx=True,
    logy=True
)
heady jewel
#

As a simple example, in something like C a simple O(n) linear scan for membership on an array is often faster than a O(log n) binary search until fairly significant sizes as reading sequential data is one of the things that CPUs love, and there is not really branching (no if statements) so the branch predictor doesn't fail as often. Runtime analysis does not give you the full picture, you always have to measure.

fiery cosmos
#

Are there any practical instances of sin(n) being part of the big O? lemon_glass

stable pecan
fiery cosmos
#

I know but like x*sin(x)

#

Though I guess that's bounded under x

austere sparrow
fiery cosmos
#

By practical I mean part of an actual, useful algorithm pithink

wise fulcrum
short mesa
#

anyone have xp with a*

#

my code runs in to an infinite ish loop in some particular mazes

#

id like to talk it over with someone

gritty marsh
#

share your code

wheat flare
#

hello all, I have a bunch of partial call graphs that I created per script in the form of json files.

#

for example, it looks like this

#

I assume its a dictionary of dictionaries of lists?

#

does networkx accept formats like this? If it does, can I just concatenate the json files together? or will I need to perform cleanup on the files

halcyon plankBOT
#

Hey @wheat flare!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .json attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

wheat flare
#

oh

halcyon plankBOT
wheat flare
short mesa
#

it takes a while to get out of a boxed in start

#

something like this gets solved but after like 5 minutes

wheat flare
#

@stable pecan sorry for the ping but i got a few questions on networkx

#

do you prefer if we do here or in a help channel?

stable pecan
#

you can ask wherever

wheat flare
#

ok let me just repeat earlier from above

#

heres an example sorry if its messy

#

please tell me if you need me to be clearer

stable pecan
#

you need to convert the json files to dictionary yes

wheat flare
#

ah i know when i read them they're already in dict format

#

can i just concatenate them? or i have to combine them into one big dictionary

#

i assume its the second one

stable pecan
#

either way, it's probably easier to do the second

wheat flare
#

i see

#

they're in the form of a dictionary of dictionaries of lists right?

wheat flare
stable pecan
#

the jsons are basically going to convert to dict of lists

#

one sec

wheat flare
#

yes just want to combine i have like 2k json files

#

i have another question but after this one

#

thanks for answering

stable pecan
#

do the dicts share any keys?

#

because that will change how you combine them

wheat flare
#

yes theres a lot of them sharing keys

#

its hard to determine which though

stable pecan
#

are the key, value pairs the same or different between dictionaries?

#

for dictionaries that share keys

wheat flare
#

different keys but they have the same value

#

its a gigantic callgraph so a lot of functions may call one function

#

oh most format is like this i guess

#

should have said earlier sorry

#
{
    "node1": ["node2", "node3"],
    "node2": ["node3"],
    "node3": []
}
#

there can also be another json file thats like

stable pecan
#

!e

from collections import defaultdict

my_dict_1 = {
  "a": [1, 2, 3],
  "b": [4],
  "c": [],
}

my_dict_2 = {
  "a": [2, 3, 4],
  "c": [4],
  "d": [2, 4],
}

master_dict = defaultdict(set)

def combine(graph):
    """
    Add a dict to the master.
    """

    for key, value in graph.items():
        master_dict[key].update(value)

combine(my_dict_1)
combine(my_dict_2)
print(master_dict)
halcyon plankBOT
#

@stable pecan :white_check_mark: Your eval job has completed with return code 0.

defaultdict(<class 'set'>, {'a': {1, 2, 3, 4}, 'b': {4}, 'c': {4}, 'd': {2, 4}})
wheat flare
#

{
"node4": ["node1", "node3"],
"node5": ["node2"],
"node6": []
}

stable pecan
#

this is how i would combine them

wheat flare
#

oh sorry, you can assume the keys in every dict to be unique

#

but different keys from different dicts can share the same value

#

im sorry i should have clarified

stable pecan
#

every key is unique across all the dicts?

wheat flare
#

i would say so

#

if not i can use that method above

stable pecan
#

ok, then that's simpler because you can just use |=

wheat flare
#
{
    "node1": ["node2", "node3"],
    "node2": ["node3"],
    "node3": []
}
{
    "node4": ["node1", "node3"],
    "node5": ["node2"],
    "node6": []
}
#

it can be like this

#

and other dicts can have a key in one dict be a value in theirs

stable pecan
#

!e

my_dict_1 = {
  "a": [1, 2, 3],
  "b": [4],
  "c": [],
}

my_dict_2 = {
  "d": [2, 4],
}

master_dict = { }
master_dict |= my_dict_1
master_dict |= my_dict_2

print(master_dict)
halcyon plankBOT
#

@stable pecan :white_check_mark: Your eval job has completed with return code 0.

{'a': [1, 2, 3], 'b': [4], 'c': [], 'd': [2, 4]}
stable pecan
#

the |= operator will join them all

wheat flare
#

oh i see

#

and other dicts can have a key in one dict be a value in theirs

#

this covers this situation too

#

i would say

stable pecan
#

yep, that doesn't matter at all

#

so if you have a list of dictionaries

wheat flare
#

i see i see

stable pecan
#

you can just do:

for a_dict in iterable_of_dicts:
    master_dict |= a_dict
wheat flare
#

yeah i just can loop through all the json files i have in a folder

stable pecan
#

and then you can convert master_dict into a graph

wheat flare
#

regarding the issue of duplicate pairs

#

networkx will just ignore it right

#

i dont really know if theres duplicates but theres potential to have it

stable pecan
#

keys are unique in a dictionary, so if you have the same key it will overwrite a previous one

wheat flare
#

oh i see

#

okay

#

ok i'll give it a shot then

#

one last thing

#

as you can see these 2 are the same functions

#

but the cursed json file outputs it differently for different dicts

#

i assume networkx is unable to find that they are the same

#

and i have to perform clean up on the json files

stable pecan
#

yeah, you can clean it up with regex or maybe modify how you're generating the json to begin with

wheat flare
#

yeah i have to clean up then alright

#

is it fair to ask about dictionary clean up then here?

stable pecan
#

well i would just comb through the raw json files and make sure everything is in the same format with some regex

wheat flare
#
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask._apply_mask_1d",
    "tensorflow.python.ops.gen_math_ops.prod",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.concat",
    "tensorflow.python.framework.ops.convert_to_tensor",
    "tensorflow.python.framework.tensor_shape.as_shape",
    "tensorflow.python.framework.tensor_util.constant_value",
    "<builtin>.ValueError",
    "tensorflow.python.framework.ops.name_scope",
    "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.shape"
  ],```
#

unfortunately its a bit harder

#

sorry if its a mess

#

this is one of the dictionaries in the json

#

"Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
"tensorflow.python.framework.ops.convert_to_tensor",

#

you can see these two so i need to do some editing to the values

stable pecan
#

yes, but you can automate that with regex

wheat flare
#

to get array_ops.reshape
and ops.convert_to_tensor

#

oh can i?

#

i dont know like every value to look for though

#

i also need to change the key itself

stable pecan
#

that doesn't matter

wheat flare
#

"Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask": to array_ops.boolean_mask

#

really?

stable pecan
#

let's see if i can figure out a regex pattern for it

wheat flare
#

thank you so much

stable pecan
#

feel like i have to relearn regex everytime

wheat flare
#

i'll try to look around too

#

yeah i feel you

#

heres one if you want to test

#

my previous idea was

      see if tensorflow exists in the value name, 
        extract depending if theres \\ or . in the name
#

but if regex works then its much easier

#

theres also the issue of changing the key pair i saw i had to make a new key which is a pain too

stable pecan
#

can i just use the last part of all the strings --- or do you need the preceding modules

wheat flare
#

give me a sec let me compare

stable pecan
#

"tensorflow.python.framework.fast_tensor_util.AppendObjectArrayToTensorProto" -> "AppendObjectArrayToTensorProto" or "fast_tensor_util.AppendObjectArrayToTensorProto"

wheat flare
#

the latter

#

"tensorflow.python.framework.tensor_util.constant_value"
"Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.constant_value",

#

from these two i want tensor_util.constant_value

stable pecan
#

there's no way to tell a module from a larger directory unfortunately

wheat flare
#

very messy splitting

#

cause i noticed the names either have \\ or not

stable pecan
#

ok

#

!e

import re

pattern = re.compile(r'(?:\S*\\)*(.+)')
m = re.match(pattern, "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.SlowAppendBFloat16ArrayToTensorProto")
n = re.match(pattern, "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util.ExtractBitsFromBFloat16")
print(m.group(1), n.group(1))
halcyon plankBOT
#

@stable pecan :white_check_mark: Your eval job has completed with return code 0.

tensor_util.SlowAppendBFloat16ArrayToTensorProto tensor_util.ExtractBitsFromBFloat16
wheat flare
#

ooo

#

it seems to work

#

!e


pattern = re.compile(r'(?:\S*\\)*(.+)')
m = re.match(pattern, "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.expand_dims_v2")
n = re.match(pattern, "tensorflow.python.ops.gen_array_ops.list_diff")
print(m.group(1), n.group(1))
halcyon plankBOT
#

@wheat flare :white_check_mark: Your eval job has completed with return code 0.

array_ops.expand_dims_v2 tensorflow.python.ops.gen_array_ops.list_diff
wheat flare
#

for every value of a key i can run this regex then and just rewrite

stable pecan
#

on works for the \\ patterns

#

i was trying to run it on the json text, but i'm failing

#

might be easier to use on keys to the dictionary and just create a new dict

#

or find someone better at regex

wheat flare
#

i dont mind rewriting

#

so i just perform your regex as i iterate through it?

#

all the keys and their values

stable pecan
#

yeah, but still need some way to turn "tensorflow.python.ops.gen_array_ops.list_diff" into gen_array_ops.list_diff it's not obvious to me how you'd do --- perhaps you need a list of directory prefixes to ignore

#

possibly generated from your filesystem

wheat flare
#

i was thinking a horrible way

#

search the list for tensorflow

#

"tensorflow.python.ops.gen_array_ops.list_diff"

#

then just split it by . and append last 2

#

its a very big assumption that they're all represented like that

#

well i need a condition if your regex fails now

#

god this method is horribly messy..

#

i need to go through every list and make a new list then assign a new key

#

Rough psuedocode

1. for every key, go through the value (a list)  
    2. for every value, go through each of its element
      3. perform filtering and append to a list (regex and whatever method i said up there) for the values
      4. perform filtering on the key too
  5. assign them together in a new dictionary and output a new json file
wheat flare
#

Please validate and thanks a lot for helping

#

how does networkx handle this btw? i guess its fine

stable pecan
#

it's all just strings to networkx

#

if the strings are different then they're different nodes

wheat flare
#

Oh i meant empty lists

#

Guess it does nothing

wheat flare
#

Sorry if you answered it already my head is a bit cluttered

stable pecan
#

yes you would use from_dict_of_lists

wheat flare
#

Thank you

#

Ill go figure out something then, ill post here if any issues

wheat flare
#

@stable pecan im so sorry for pinging again, but i have been testing a graph example on network x

#
f = open(r"C:\Users\User\Desktop\Work\SMU\Capstone\Datasets\callgraph\TensorflowPythonCallgraphs\tensor_util.json")
graph = json.load(f)

G = nx.from_dict_of_lists(graph)
for i in list(G.edges):
    print(i)
#

said json

#

why does it have an edge from a "leaf" to a node

stable pecan
#

your graph isn't directed

#

use a directed graph instead

wheat flare
#

oh dxgraph func or something?

#

sorry for pinging

stable pecan
#

i think from_dict_of_lists has an optional 2nd argument

#

lemme check

wheat flare
#

here

stable pecan
#

there you go, create_using= just pass nx.DiGraph

wheat flare
#

ah!

#

ok thank you thank you

#

i didnt know you could do that

stable pecan
#
f = open(r"C:\Users\User\Desktop\Work\SMU\Capstone\Datasets\callgraph\TensorflowPythonCallgraphs\tensor_util.json")
graph = json.load(f)

G = nx.from_dict_of_lists(graph, create_using=nx.DiGraph)
for i in G.edges:  # edges is iterable already, don't need to make a list first
    print(i)
wheat flare
#

sorry for pinging im working on that filtering we made

#

thanks again

#

seemed to work

stable pecan
#

i have an idea to merge the different keys that are actually the same

wheat flare
#

wouldnt it just be overwritten when i combine them?

#

heres the code if you're curious its very messy

#
pattern = re.compile(r'(?:\S*\\)*(.+)')
old_dict = json.load("Some json file")
for key in old_dict:
    value_list = []
    #Check if the key contains tensorflow, if it does we want to filter it
    if "tensorflow" in key:
        #Assumption: the tensorflow keys have either \ or not, filter accordingly
        #Example: Desktop\Work\tensorflow-master\tensorflow\python\framework\tensor_util._is_array_like -> tensor_util._is_array_like
        if "\\" in key:
            #Use that regex filtering to perform it
            cleaned_key = re.match(pattern, key)
            new_key = cleaned_key.group(1)
        #Else split it with the dots and get the last 2 parts
        #Example: tensorflow.python.util.nest.flatten -> nest.flatten
        else: 
            parts = key.split(".")
            new_key = ".".join(parts[-2:])

    #Do the same on the values
    for uncleaned_value in old_dict[key]:
        if "tensorflow" in uncleaned_value:
            if "\\" in uncleaned_value:
                cleaned_value = re.match(pattern, uncleaned_value)
                new_value = cleaned_value.group(1)
            else:
                parts = uncleaned_value.split(".")
                new_value = ".".join(parts[-2:])
            value_list.append(new_value)
        elif "tensorflow" not in uncleaned_value:  
            value_list.append(uncleaned_value)
stable pecan
# wheat flare heres the code if you're curious its very messy

i have something similar but no regex, and doesn't look for "tensorflow" but instead just splits the keys, labeling them by the suffix, and preferring the . ones over the // ones:

In [6]: keys = [
   ...:     "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask._apply_mask_1d",
   ...:     "tensorflow.python.ops.gen_math_ops.prod",
   ...:     "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape",
   ...:     "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.concat",
   ...:     "tensorflow.python.framework.ops.convert_to_tensor",
   ...:     "tensorflow.python.framework.tensor_shape.as_shape",
   ...:     "tensorflow.python.framework.tensor_util.constant_value",
   ...:     "<builtin>.ValueError",
   ...:     "tensorflow.python.framework.ops.name_scope",
   ...: ]

In [7]: key_suffix = { }
   ...: for key in keys:
   ...:     if "\\" in key:
   ...:         split_key = key.split("\\")
   ...:         split_key[-1:] = split_key[-1].split(".")
   ...:         if split_key[-1] in key_suffix:  # Key already in dict
   ...:             continue
   ...:         key_suffix[split_key[-1]] = split_key
   ...:     else:
   ...:         split_key = key.split(".")
   ...:         key_suffix[split_key[-1]] = split_key
   ...: 
#

In [8]: key_suffix
Out[8]: 
{'_apply_mask_1d': ['Desktop',
  'Work',
  'tensorflow-master',
  'tensorflow',
  'python',
  'ops',
  'array_ops',
  'boolean_mask',
  '_apply_mask_1d'],
 'prod': ['tensorflow', 'python', 'ops', 'gen_math_ops', 'prod'],
 'reshape': ['Desktop',
  'Work',
  'tensorflow-master',
  'tensorflow',
  'python',
  'ops',
  'array_ops',
  'reshape'],
 'concat': ['Desktop',
  'Work',
  'tensorflow-master',
  'tensorflow',
  'python',
  'ops',
  'array_ops',
  'concat'],
 'convert_to_tensor': ['tensorflow',
  'python',
  'framework',
  'ops',
  'convert_to_tensor'],
 'as_shape': ['tensorflow', 'python', 'framework', 'tensor_shape', 'as_shape'],
 'constant_value': ['tensorflow',
  'python',
  'framework',
  'tensor_util',
  'constant_value'],
 'ValueError': ['<builtin>', 'ValueError'],
 'name_scope': ['tensorflow', 'python', 'framework', 'ops', 'name_scope']}
wheat flare
#

oh wow

stable pecan
#

after they are filtered you can rejoin them

#

this does assume there are no function names that are the same, hopefully that's true

wheat flare
#

interesting

#

let me try to understand them

#

mine kinda works already let me show output

#

New key: tensor_util

Old list is:
['tensorflow.python.util.tf_export.tf_export', 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util._generate_isinstance_check', '<builtin>.frozenset']

New list is:
['tf_export.tf_export', 'tensor_util._generate_isinstance_check', '<builtin>.frozenset']```
#

for "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util": [ "tensorflow.python.util.tf_export.tf_export", "Desktop\\Work\\tensorflow-master\\tensorflow\\python\\framework\\tensor_util._generate_isinstance_check", "<builtin>.frozenset" ],

stable pecan
#

another interesting idea is just to have the nodes in your graph be the suffix of the keys (the bare function or whatever) and have a dictionary that provides the modules the functions come from elsewhere

wheat flare
#

thats another alternative

stable pecan
#

you can save these properties on the nodes themselves in networkx

wheat flare
#

i'll visit that if my current one fails

#

it seems to work well i hope?

#

well its rather hard to understand, always hard to read toher people's code

#

im going off the assumption that all the keys are unique

stable pecan
#

the last idea would be simplest to implement, gimme a sec for an example

wheat flare
#

my method seems to work nicely, im curious about your method, always good to learn

stable pecan
#
In [9]: from collections import defaultdict
   ...: filtered_keys = defaultdict(dict)
   ...: 
   ...: def clean_key(key):
   ...:     if "\\" in key:
   ...:         *_, last = key.split("\\")
   ...:         *_, last = last.split(".")
   ...:         filtered_keys[last]["directory path"] = key
   ...:     else:
   ...:         *_, last = key.split(".")
   ...:         filtered_keys[last]["moduled path"] = key
   ...: 
   ...: for key in keys:
   ...:     clean_key(key)
   ...: 

In [10]: filtered_keys
Out[10]: 
defaultdict(dict,
            {'_apply_mask_1d': {'directory path': 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.boolean_mask._apply_mask_1d'},
             'prod': {'moduled path': 'tensorflow.python.ops.gen_math_ops.prod'},
             'reshape': {'directory path': 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.reshape'},
             'concat': {'directory path': 'Desktop\\Work\\tensorflow-master\\tensorflow\\python\\ops\\array_ops.concat'},
             'convert_to_tensor': {'moduled path': 'tensorflow.python.framework.ops.convert_to_tensor'},
             'as_shape': {'moduled path': 'tensorflow.python.framework.tensor_shape.as_shape'},
             'constant_value': {'moduled path': 'tensorflow.python.framework.tensor_util.constant_value'},
             'ValueError': {'moduled path': '<builtin>.ValueError'},
             'name_scope': {'moduled path': 'tensorflow.python.framework.ops.name_scope'}})
wheat flare
#

oo

#

networkx accepts a module path too? or i will be using these to combine stuff

stable pecan
#

networkx doesn't know the difference, anything that's hashable can be a node

#

but you can just feed networkx these suffixes

#

and keep this table to look up the paths

#

or attach the paths to the nodes after you build the graph

wheat flare
#

i see thats interesting

#

thanks a lot

stable pecan
#

np

copper agate
#

Hi I'm new to Python. I'm a freshman college student hoping to learn AI/Data Science. I'm looking for a laptop, and would a RTX 3050 GPU do me good?

fiery cosmos
wise fulcrum
#
{**dict1,dict2}
#

i m ean

#

arg it keeps killing the asteriks

wheat flare
#

i lost some values using |= and im not sure why

#

The asteriks one also makes me lose values

#

Hm

#

like some keys no longer have the values associated with it

wheat flare
#

nwm

lunar shoal
#

|= means what?

knotty magnet
#

calls __ior__

chrome folio
#

🖐️

trim fiber
#

!e

a = 0b101
b = 0b011
print(f"a = {a}")
print(f"b = {b}")
a |= b
print("a |= b")
print(f"a = {a}")
print(f"b = {b}")
halcyon plankBOT
#

@trim fiber :white_check_mark: Your eval job has completed with return code 0.

001 | a = 5
002 | b = 3
003 | a |= b
004 | a = 7
005 | b = 3
lunar shoal
#

Things are getting real here

woven trench
#

i want to imporve my dsa is there any free good source for that?

keen hearth
woven trench
#

ok thankyou

buoyant crescent
#

it seems nice

fervent saddle
#
x = some_dict.pop(k)
some_dict[k] = None```
Is this amortized O(1), or does needing to resize the insertion ordered array make it amortized O(n)?
fervent saddle
#
for _ in range(n):
    x = some_dict.pop(k)
    some_dict[k] = x```
Would that just be O(n), or would it be O(n * size_of_dict)?
vocal gorge
#

I believe it's amortized O(1), because dicts simply don't shrink their tables

#

maybe at all, maybe unless they get really empty

stable pecan
#

i don't think they ever shrink

trim fiber
halcyon plankBOT
#

@trim fiber :white_check_mark: Your eval job has completed with return code 0.

001 | pure 64
002 | one element 232
003 | 8 elements 360
004 | after loop 640
trim fiber
stable pecan
#

dict never shrinks only grows more powerful

fervent saddle
#

!e

import sys
d = {}
print("pure", sys.getsizeof(d))
d[0] = None
print("one element", sys.getsizeof(d))
n = 8
for i in range(n):
  d[i] = i
print(f"{n} elements", sys.getsizeof(d))
for _ in range(1000000):
  value = d.pop(0)
  d[0] = value
print(f"after loop", sys.getsizeof(d))
halcyon plankBOT
#

@fervent saddle :white_check_mark: Your eval job has completed with return code 0.

001 | pure 64
002 | one element 232
003 | 8 elements 360
004 | after loop 640
fervent saddle
#

It’s decreasing in size

fervent saddle
#

So it’s O(n * size_of_dict)?

fervent saddle
#

No wait, it’s just never increasing in size

#

!e ```py
import sys

d = {}

for n in range(10000):
d[n] = None

print(sys.getsizeof(d))

for n in range(10000):
del d[n]

print(sys.getsizeof(d))```

halcyon plankBOT
#

@fervent saddle :white_check_mark: Your eval job has completed with return code 0.

001 | 295000
002 | 295000
lament totem
#

what are you asking is O(n) or O(n*dictsize)?

fervent saddle
#

Wow

lament totem
#

the search time of a dict item?

lament totem
#

pretty sure its O(1) no?

#

getting an item is avg O(1) and setting an item too iirc when using hashes

#

python by default uses hashes for dicts, not red-black-trees or anything

fervent saddle
#

But I was thinking about it resetting the iterated array

#

Because it fills with empty space

lament totem
#

space or time complexity?

fervent saddle
#

Time complexity

lament totem
#

well yh thats bound to n for sure

fervent saddle
#

!e py import sys d = {} print("pure", sys.getsizeof(d)) d[0] = None print("one element", sys.getsizeof(d)) n = 8 for i in range(n): d[i] = i print(f"{n} elements", sys.getsizeof(d)) for n in range(1000000): value = d.pop(n % 2) d[n % 2] = value print(f"after loop", sys.getsizeof(d))

halcyon plankBOT
#

@fervent saddle :white_check_mark: Your eval job has completed with return code 0.

001 | pure 64
002 | one element 232
003 | 8 elements 360
004 | after loop 640
fervent saddle
#

I don’t get how is this not increasing the dict’s memory linearly. The insertion ordered array should be building up empty space

fervent saddle
#

I don’t get it

fervent saddle
#

Does it only account for the size of the hashed into array, not the consecutively filled array?

fiery cosmos
#

I think you have to know the arity of the operators

#

How would you know what args to give the - here 4 3 1 - +

bright halo
#

How to separate a string of input characters into 2 substrings and save each of them in a variable.

input_text_to_check = "ElectrikVocal95#9525: what are we talking about?"

I need to create a condition regex that if what are we talking about? appears in the sentence(patron) then enter the condition...

Something like this:

    regex_patron = r"\s*\¿?(?:what we were talking about | what we talked about before | what we talked about)\s*(about|)\s*((?:\w+\s*)+)?"

    l = re.search(regex_patron, input_text_to_check, re.IGNORECASE)

    if l:

Inside that if statement I would need to divide the sentence between the username and what the sentence really is, and to save each substring in a variable

Like this:


user_name = "ElectrikVocal95#9525" #without the ":"

text = "what are we talking about?"

I hope you can help me with this.

wheat flare
#

does igraph have a similar version of this?

#

my computer froze when i tried using networkx on a 20k node graph

#

networkx can handle 24k nodes and 52k edges right?

wheat flare
#

please ping if you want to discuss or soemthing

fiery cosmos
#

⇔((Σa/((a))*1.15)/(Σ∀(b, c, d)/((∀(b, c, d) m)))) < 1) ⊃ ∃⊤

#

does that look right?

broken sapphire
#

Hello. Is this channel about algorytm channel?

#

Awnser me if anyone is here...

#

:(

fiery cosmos
broken sapphire
wheat flare
#

hey @stable pecan i managed to fit my graph into networkx but it froze my entire computer

#

just want to ask if these pings are annoying

#

also wanted to ask about this question because i think you're one of the very few people here who know about these

wheat flare
#

from what i understand they seem to look the same

#

networkx vs igraph

wheat flare
#

neat it has a from_networkx thing

#

i hate going through 2 things but i cant figure out how to make it load a json file

#

it says unknown format

wheat flare
#

please ping if you see this and would like to discuss


2. making it accept json or a dictionary of lists, cant seem to find it in the python documentation```
wheat flare
#

this is a pair thats in the json i fed to networkx -> converted to igraph

#

so im not sure whats wrong

stable pecan
#

same with graph-tool

#

you can store node names as properties of the node though

#

these are c-libraries wrapped in python so they are very good for large graphs though

wheat flare
#

Oh it doesnt do that automatically?

stable pecan
#

no

wheat flare
#

Oh boy

stable pecan
#

what you can do, if you want

wheat flare
#

I couldnt find a way to directly use the json file

#

Had to do from networkx

stable pecan
#

is create your networkx graph, and use nx.relabel to get all integer nodes

#

but you'll want some list of nodes labels by index

stable pecan
#
int_to_labels = dict(enumerate(dict_of_lists))
labels_to_int = {v, k for k, v in int_to_labels}
new_graph = {labels_to_int[k]: [labels_to_int[node] for node in v] for k, v in dict_of_lists}

can also do this to not go through networkx

wheat flare
#

Oh if i want to feed the json directly to igraph ?

#

Either way it ends at the same result right

#

Im fine with using libraries

stable pecan
#

it will be faster not going through nx for this many nodes

wheat flare
#

Its better to confirm accuracy as i may mess stuff up

#

Oh

stable pecan
#

networkx is dict-of-dict-of-dict structure for graphs, so it's a huge memory hog for large graphs

wheat flare
#

Can it reliably handle 20k nodes

stable pecan
#

it can reliably handle any nodes, if you have infinite memory

wheat flare
#

I can try it on a better computer but with like 32gb memory but its hard to say if it'll handle huh

stable pecan
#

sounds like plenty

wheat flare
#

Really

#

My laptop has like 8 gb memory so it just froze

#

Lmao

wheat flare
stable pecan
#

this takes your dict of lists and creates two intermediate dictionaries that convert integers to labels and vice versa

wheat flare
#

Oops sorry didnt turn off ping

stable pecan
#

then another dictionary that's the relabeled graph

#

technically, labels_to_int could just be a list

#

would be more efficient

wheat flare
#

Then ill feed both to igraph or something?

stable pecan
#

errr, int_to_labels

#

just feed the new_graph to igraph

wheat flare
stable pecan
#

i'm not sure if it has a dict_of_lists constructor, but you can create an iterable of tuples from the dict in any case

wheat flare
#

Assume its this i guess

#

So steps are

#
1. Use the code up there to generate new_graph and feed to networkx.
2. If it doesnt, create an iterate of tuples from the dict to feed
wheat flare
#

Maybe ill just look for a better pc >< this is a lot of work

stable pecan
#

no the new graph doesn't have this already

#
In [3]: dict_of_lists = {
   ...:     "zero": [],
   ...:     "one": ["one", "two", "three"],
   ...:     "two": ["three"],
   ...:     "three": ["zero"],
   ...: }
   ...: int_to_labels = list(dict_of_lists)
   ...: labels_to_int = {label: i for i, label in enumerate(int_to_labels)}
   ...: new_graph = {labels_to_int[key]: [labels_to_int[node] for node in value] for key, value in dict_of_lists.items()}

In [4]: print(int_to_labels, labels_to_int, new_graph)
['zero', 'one', 'two', 'three'] {'zero': 0, 'one': 1, 'two': 2, 'three': 3} {0: [], 1: [1, 2, 3], 2: [3], 3: [0]}
#

here is an example

#

you have string labels in dict_of_lists

#

you make a list of the labels called int_to_labels

#

you make a dictionary of labels to integers named labels_to_int, and then you can use this last dictionary to relabel the entire graph

#

the last dictionary is all integers

#

if you need to know what label an integer node is referring to, you can look it up in int_to_labels

#
In [5]: int_to_labels[0]
Out[5]: 'zero'
wheat flare
#

I see, from there though int to labels only gets the keys right of the dict

stable pecan
#

yep, this assumes that all the labels will appear as keys in the dict

wheat flare
#

Ah there are unique labels in the values too

stable pecan
#

then should add them as keys

wheat flare
#

Ohh

#

Well i need to look at my json to figure that out it may be already like that

#

Ok i mostly understand what you do, how would that work for igraph now?

#

I guess we're just preprocessing the data

stable pecan
#

once you've converted everything to integer, you can make an igraph graph with it

wheat flare
#

Yes then i store node names as properties?

stable pecan
#

yes

wheat flare
#

The python version of this

#

Hm does this mean i have to iterate the whole dict and manually construct the graph then

#

Since it doesnt seem to accept the whole input

wheat flare
#

Sorry the documentation is rather confusing

#

Sounds like a lot of iteration

#

Going through every node and creating it then assigning after

stable pecan
# wheat flare Going through every node and creating it then assigning after
In [39]: dict_of_lists = {
    ...:     "zero": [],
    ...:     "one": ["one", "two", "three"],
    ...:     "two": ["three"],
    ...:     "three": ["zero"],
    ...: }
    ...: int_to_labels = list(dict_of_lists)
    ...: labels_to_int = {label: i for i, label in enumerate(int_to_labels)}
    ...: new_graph = {labels_to_int[key]: [labels_to_int[node] for node in value] for key, value in dict_of_lists.items()}
    ...: g = ig.Graph(); g.add_vertices(len(int_to_labels)); g.add_edges(((u, v) for u, out in new_graph.items() for v in out))
    ...: g.vs["label"] = int_to_labels

In [40]: print(g)
IGRAPH U--- 4 5 --
+ attr: label (v)
+ edges:
1--1 1--2 1--3 2--3 0--3

In [41]: g.vs[0]["label"]
Out[41]: 'zero'
wheat flare
#

Woa

#

Just cirious shouldnt the last edge be 3--0

#

Unless your graph is undirected

stable pecan
#

it's undirected, use directed graph

#
g = ig.Graph(directed=True)
#

is all you have to do

wheat flare
#

Ah i see

#

G.vs label just saves node properties

#

Interesting

#

Well saves it to the node under label

stable pecan
#

yeah, you can attach arbitrary properties to nodes or edges

wheat flare
#

Ill try to work it on a small file first

#

Thanks a lot

#

May i ask if im bothering with the pings

stable pecan
#

it's fine

wheat flare
#

I try to not do it too much but its confusing

stable pecan
#

there's not a lot of people that know all the graph libraries here

#

i've worked with them all at some point

wheat flare
#

When i search in the search bar its like

#

You and 2 others

stable pecan
#

well, "all" -- i've worked with nx, igraph, and graph-tool

wheat flare
#

Thanksss

#

Anything i should look out for or any extra tips?

#

Probably gonna have issues cause i need to learn it

stable pecan
#

a lot of it is just playing around in an interpreter and making sure the graph is doing what you want

wheat flare
#

After this is labeled i can try

stable pecan
#

i have a file somewhere that unifies the graph interface for all three libraries

wheat flare
#

Well the get shortest path stuff

#

Using names

#

print(g.get_shortest_paths("one", to="zero") )

#

I assume

#

Then itll be [one, three, zero]

stable pecan
wheat flare
#

Wewww that looks neat

south gorge
#

Salt die
Pepper
Then its againn
Salt die and then pepper
You guyz are not gonna give chance to anyone else to write : 😅 😂

wheat flare
#

Well im almost done

#

Sorry for hogging the channep

wheat flare
#

Ill do this in a help channel next time

stable pecan
#

i think what will happen is ig will give you a list of Vertex objects and you can get the label from each of these

wheat flare
#

Hmm so i need to know the number associated to the label then

#

Guess i can just get it that way

stable pecan
#

more like, labels_to_int dictionary was created for this

#

g.get_shortest_paths(labels_to_int["one"], to=labels_to_int["zero"])

wheat flare
#

Ahh

#

Alright

#

Thabks a lot sorry for hogging the channel

#

Ill do it in a help channel next time

#

@south gorge sorry for taking the whole channel

south gorge
#

Hey no problem i was just joking
I mean this had alot of serious in it

wheat flare
#

This will be useful if someone searches igraph

dawn geyser
#
from itertools import product as prod
lst2=['d','e','f']
lst1=['a','b','c']
lst=[lst1,lst2]

list(prod(lst1,lst2))
#
 ('a', 'e'),
 ('a', 'f'),
 ('b', 'd'),
 ('b', 'e'),
 ('b', 'f'),
 ('c', 'd'),
 ('c', 'e'),
 ('c', 'f')]
#

How do I correct the code if I want to just pass 'lst' in list(prod(lst)) and get the same output instead of whats written

#

Like, pass a single list that contains all of the individual lists within itself.

#

Btw, the output is just the list itself if I try to pass just 'lst'

feral hound
#

@stable pecan Hi sorry to bother you but I saw the conversation earlier about graphs and I've never used a graphing library before could you explain to me if it would be better for storing/creating a simple graph with node - connections than a dict would be

graph = {
    "node_name": [children]
}
stable pecan
#

typically graph libraries already have algorithms made for things like shortest path or largest connected component

feral hound
#

and other than that wouldn't it be even more efficient to create a node class that has reference to its connections?

stable pecan
#

so if you need to do any algorithmic thing with your graph, probably better off using the library

feral hound
#

what about in terms of the graph itself?

#

would it take up more memory?

stable pecan
#

depends -- some libraries are actually written in c, so the graphs are smaller than anything you could make in python

#

networkx has large graphs though, because it's a pure python library

feral hound
#

really?

stable pecan
#

yes, graph-tool and igraph both wrap c

feral hound
#

I know it would probably be faster but dont understand why it would take less memory

stable pecan
#

because c objects are often contiguous memory

feral hound
#

I dont think that would work for creating a graph though no?

#

you would need pointers either way

stable pecan
#

you don't need, you can just resize when adding nodes, but i don't know how they're implemented - just that it's possible

feral hound
#

hmm but then I guess the maximum size would be smaller since it would require a fully empty block in memory?

#

whereas pointers as long as there is memory available it can keep expanding the graph and would also be faster to add connections dynamically?

#

so I guess its more about use case then if I know the graph wont ever change I could create it as a contiguous block otherwise use pointers to add nodes dynamically

stable pecan
#

both graph-tool and igraph don't allow arbitrary objects for nodes though, nodes can only be ints, so you can specialize your data structure here

feral hound
#

ahh

stable pecan
#

and the nodes are always contiguous

#

so you cant have like 0, 5, 19 as your only nodes

feral hound
#

do you know what kind of structure they use to create the graphs?

stable pecan
#

no idea, graph-tool wraps boost library, and igraph wraps something else

feral hound
#

fair

#

what resources would you recommend to look into this a bit more btw?

stable pecan
#

can just read their documentation

feral hound
#

aight thx for the info appreciate it 🙂

stable pecan
cloud delta
#

Hi guys, I'm looking for a good library for Trees(RBTree, BTree, AVLTree)

stable pecan
#

i implemented avl and binary trees in a library, i dunno a general tree library though

halcyon plankBOT
#

sacks/sets/avl_tree.py lines 55 to 63

class AVLTree(BinarySearchTree):
    """
    A self-balancing binary search tree.

    Notes
    -----
    This version of an AVL tree allows multiple of the same item to be inserted.

    """```
cloud delta
#

Thanks

meager slate
#

is there a better way of finding the "length" of a number in binary other than len(bin(n)) - 2 for n > -1? by length i mean this ```
length(0b1010) == 4
length(0b000000010) == 2
^^^^^^^
Leading 0s ignored

stable pecan
#

can always take the log base 2 of the number

stable pecan
keen hearth
#

!eval ```py
for i in range(10):
print(i, bin(i), i.bit_length())

halcyon plankBOT
#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

001 | 0 0b0 0
002 | 1 0b1 1
003 | 2 0b10 2
004 | 3 0b11 2
005 | 4 0b100 3
006 | 5 0b101 3
007 | 6 0b110 3
008 | 7 0b111 3
009 | 8 0b1000 4
010 | 9 0b1001 4
keen hearth
#

This comes in handy if you want to get an approximate log base 2 without importing math (for some reason).

keen hearth
knotty magnet
#

kinda unlucky he didn't get any that weren't 5 digits lol

keen hearth
knotty magnet
#

at least a little bit unlucky then ¯_(ツ)_/¯

keen hearth
meager slate
stable pecan
#

if log2 is an integer, you'll be one too high

knotty magnet
#

isn't it just ceil? not floor + 1?

stable pecan
#

or maybe this works

#

yeah, it's fine

keen hearth
#

!eval ```py
import random, math

def f(x):
return len(bin(x)) - 1

def g(x):
return int(math.log2(x)) + 1

def test(x):
assert f(x) == g(x), f"{x}: {f(x)} != {g(x)}"

for _ in range(104):
x = int(2
random.uniform(0, 5))
test(x)

for x in range(1, 10):
test(x)

print('Pass!')

knotty magnet
#

you did log 0

stable pecan
#

log(0)

halcyon plankBOT
#

@keen hearth :x: Your eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 14, in <module>
003 |   File "<string>", line 10, in test
004 | AssertionError: 9: 5 != 4
keen hearth
stable pecan
#
  • 2
keen hearth
#

!eval ```py
import random, math

def f(x):
return len(bin(x)) - 2

def g(x):
return int(math.log2(x)) + 1

def test(x):
assert f(x) == g(x), f"{x}: {f(x)} != {g(x)}"

for _ in range(104):
x = int(2
random.uniform(0, 5))
test(x)

for x in range(1, 10):
test(x)

print('Pass!')

halcyon plankBOT
#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

Pass!
knotty magnet
#

huh, interesting

keen hearth
#

Might not work for large numbers though.

stable pecan
#

this isn't a great random test because x is always a power of 2

knotty magnet
#

random.uniform gives a float though

stable pecan
#

ok

#

nevermind then

knotty magnet
#

although it didn't tnecessarily test the case salt-die mentioned, what if it is a power of two

keen hearth
keen hearth
#

How many bits of precision do python floats have?

knotty magnet
#

they're doubles, so...53?

keen hearth
#

Ah right, so maybe we should test 2**54 and numbers around this.

knotty magnet
#

that wouldn't work for any method involving math.log though, since it converts it to a float, right?

keen hearth
knotty magnet
#

actually, what happens when you convert a really big int into a float. does it just round down?

#

oh, overflow

#

duh

keen hearth
#

Just realised we haven't tested negative numbers 👀

#

For which I think len(bin(x)) - 2 is wrong anyway, because theres a - sign.

knotty magnet
#

yeah

stable pecan
#

log solution is wrong too

#
In [20]: cmath.log(-1234, 2)
Out[20]: (10.26912667914942+4.532360141827194j)

you have to discard the imaginary part

#

and you can't even choose which branch cut to use

knotty magnet
#

that would give 11 right? which is correct?

stable pecan
#

yes, after you discard the imaginary part

#

but this doesn't work:

In [22]: int(cmath.log(-1234, 2)) + 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-92bdd2d35a8e> in <module>
----> 1 int(cmath.log(-1234, 2)) + 1

TypeError: can't convert complex to int
keen hearth
#

So, all in all, I think it's safest to go with int.bit_length 😄

keen hearth
keen hearth
steady maple
#

Guys what Is the use of data structs

#

And algorithms

keen hearth
#

Well data structures are just how data is represented in computer programs. The way that you represent data can have a dramatic effect on how efficiently you can process that data.

#

For example, if you represent a list of numbers as an un-sorted array, to find out if a number is in that list, you have to check all of the numbers one-by-one.

steady maple
#

Yes

keen hearth
#

But if you stored those numbers in Eg a sorted list, or a binary-search tree, you could narrow in on where the number should be by repeatedly ruling out half of the numbers.

#

Even better, if you store them in a hash-table, you can to straight to where the number should be, and check if it is there.

steady maple
#

U mean this will just reduce some time?

keen hearth
#

Yep

#

Think about how you would go about finding a particular page of a book.

steady maple
#

I got it

#

Thanks for the perfect example

#

Well is it helpfull in case of db too ?

keen hearth
#

You could start at the first page and go page-by-page until you find the page number you're looking for. Or you could go to the middle of the book, see whether the page number is less than or greater than the one you're looking for, then either look in the left-half or the right half, then repeat.

#

Sorry, I started typing that out so had to finish 😄

keen hearth
steady maple
keen hearth
#

Have you heard of "indexing a column"?

steady maple
#

I have experience with both SQL and non SQL db

steady maple
keen hearth
#

If you index a column of a table, the database builds an auxiliary data-structure (generally a tree or a hash-table), which allows you to quickly look up rows of the table based on the value of that column.

keen hearth
#

Yep, I think when you specify a column as the primary key most databases automatically index that column.

steady maple
#

Well maybe but as much as I know primary key is the special value of a cell which cannot be repeated in the table once more and thats why it helps the db to find it faster

#

As it is the only one with that value in the table

#

Or I should say a row

#

I always get confused in row and column lol

keen hearth
steady maple
#

So how does data structs help to make it faster

keen hearth
#

Dictionaries in python are also hash tables.

steady maple
#

So we need a hash table so thaat we can stop scanning all the rows

steady maple
#

Sry if I am irritating u by pinging and asking again and again

keen hearth
#

And I get like 100 pings a day anyway as a mod 😄

steady maple
#

Lol

stable pecan
#

hash tables are tables made out of shredded potatoes

steady maple
#

Wth was that!

stable pecan
#

if you fry them they become hash brown tables

steady maple
#

What r u trying to do?

#

LX

keen hearth
stable pecan
#

hash tables are really a reserved chunk of memory -- then keys are hashed which gives you which chunk in memory the key goes

knotty magnet
#

and hamt is just bacon

stable pecan
#

if you have too many keys and a small table, your hash table will have poor performance!

steady maple
knotty magnet
#

have you used a python dictionary before?

steady maple
knotty magnet
#

python dicts are hashmaps

steady maple
keen hearth
#

Like, we could implement our own hash-table in python: ```py
buckets = []
NUM_BUCKETS = 10

for _ in range(NUM_BUCKETS):
bucket = []
buckets.append(bucket)

def hash(key: str) -> int:
"""Tells you which bucket a given key should be stored in."""
# We just add up the ordinals of the characters of the key
# then take the remainder when you divide that by the number
# of buckets.
result = 0
for char in key:
result += ord(char)
return result % NUM_BUCKETS

def add(key: str):
"""Adds a key to the hash table."""
if contains(key):
return
# We want to add the key to a specific bucket.
bucket = buckets[hash(key)]
bucket.append(key)

def contains(key: str) -> bool:
"""Tells you whether the hash table contains the given key."""
# We only need to look in one bucket for the key (not all of them).
bucket = buckets[hash(key)]
return key in bucket
``` This example of a hash-table doesn't have values, just keys. (In practice, you would just use dict or set in python, you wouldn't implement your own hash table. If you were writing a program in C, you might implement your own.)

#

@steady maple this

steady maple
#

I am reading it

keen hearth
#

Remember that algorithms and data structures are abstract concepts.

#

You can implement the same data-structure in any language.

#

@steady maple #bot-commands message

#

My implementation is not a particularly good implementation of a hash table, but is just to illustrate the concept.

steady maple
#

I saw it in bot cmds and it was printing something

keen hearth
steady maple
#

Well yea this is really confusing

stable pecan
#

the buckets represent different slots in memory and the hashing is a way to pick which slot a key goes in

steady maple
#

Well isin't hashing the way by which websites store user data and password in their db encrypted

knotty magnet
#

yes, that's one use of hashing

keen hearth
#

In that case they want the hashes to be evenly distributed for security reasons (to make it hard to guess the password from the hash).

steady maple
#

Yes true

#

Maybe I got It

keen hearth
#

In this case we want the hashes to be evenly distributed so that the data is evenly distributed in the table.

#

The key idea is that the hash function tells us where to look for a given key.

#

Sorry for mixing different meanings of the word "key" in one sentence 😄

steady maple
#

U mean hash just updates the data and make it something like a primary key ?@keen hearth

#

So that the db gets it faster

keen hearth
#

This image from Wikipedia kind of illustrates it:

#

Hash-tables are an algorithm. Databases are one place where they are implemented, to make looking up rows by primary key faster.

steady maple
#

I think the hash func just encrypted and changed the data

#

To makke a primary key

keen hearth
#

Yep, but notice that the hash function tells us exactly where to look for certain data.

#

Otherwise, we would have to go through the table row-by-row to find it.

#

The hash function just points us straight to it.

steady maple
#

Ah I got ur point

#

Well what in case if the key is a int

keen hearth
#

Any data can be converted into a string of bytes, then you just hash that string of bytes.

steady maple
keen hearth
#

I just mean in general, that's one way you could hash any kind of data.

steady maple
#

Ok

knotty magnet
#

all data is just bytes afterall

keen hearth
#

We don't talk about implementation details in this channel lemon_sweat

steady maple
#

That's enough to know all this

#

Well how can I hash something?

stable pecan
#

i think cpython has a very simple hash function

#

something something mod 5

steady maple
#

Well I am using python in my bot

#

So I need to use python in it

keen hearth
#

Python has a built-in hash function.

steady maple
keen hearth
#

But python also has a built-in hash-table called dict.

#

You would generally use dict rather than using hash directly.

steady maple
#

Where and how to get that

keen hearth
steady maple
#

Sry for my slow internet

#

Lol

#

My msgs are sent after 5-7 secs

keen hearth
#

ah right 😄

steady maple
#

So I can use hash instead of dict

#

?

#

Oops I mean

#

Dict instead of hash

keen hearth
#

hash just takes an object and turns it into a number. It is what dict, a key-value hash-table, uses to decide where to store things.

#

!eval print(hash('hello'))

halcyon plankBOT
#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

5560446419581122223
steady maple
#

Ah I got it

keen hearth
#

Example of a dict: ```py
my_dict = {
'Steve': 20,
'Amanda': 25,
}

knotty magnet
keen hearth
#

Then you could look up values by doing Eg my_dict['Steve']

#

Which would return 20.

steady maple
#

But what in case I want to get the hashed value back @keen hearth

glossy breach
#

I'm sure hash is meant to be one way

keen hearth
#

Anyway, I'd recommend picking up a book or online-course in algorithms to learn more.

#

There are some recommended resources in the pinned messages of this channel, and on our website:

#

!resources

halcyon plankBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

trim fiber
#

Neo4j client has builtin option for that as far as I remember pithink

shut breach
#

yeah it has a pretty good renderer too iirc

keen hearth
#

What format is the graph in to begin with?

main flower
#

graphviz is great but if you need something easy and fast check https://csacademy.com/app/graph_editor/

#

it's really helpful in cp

keen hearth
#

Ooo nice!

steady maple
#

!eval print(hash('hello'))

halcyon plankBOT
#

@steady maple :white_check_mark: Your eval job has completed with return code 0.

8854791851944303749
steady maple
#

!eval print(hash('hello'))

halcyon plankBOT
#

@steady maple :white_check_mark: Your eval job has completed with return code 0.

7179474724399652171
steady maple
#

Is it possible to get the hashed value back?

keen hearth
#

Ah, I think hashes aren't necessarily the same between runs of Python.

steady maple
#

So how will I compare it with the value in the db

#

How will it match

knotty magnet
#

the db doesn't use the same hash function as python

#

python's hash function was mostly meant for speed

steady maple
#

@knotty magnet Can u suggest me a hashing algo that can do it?

knotty magnet
#

i think most databases will use BTrees actually

#

rust uses siphash for their hashmaps so that's probably fine

#

wonder what python's hashing algorithm is named 🤔

elfin river
#

!eval 1+1

#

!eval print(1+1)

cosmic ruin
#

Hello! I have a question regarding matplotlib. As someone suggested, I might as well ask here. This is the message link with the specifics, feel free to ping if you have any solution!
#help-peanut message

vocal gorge
keen hearth
#

!eval print(1 + 1)

halcyon plankBOT
#

@keen hearth :white_check_mark: Your eval job has completed with return code 0.

5
cosmic ruin
#

Ohhh, makes sense

#

thank you!

keen hearth
#

Wait what @elfin river

keen hearth
#

Just kidding lemon_pleased

elfin river
#

nothing to see here.. just adding bits together.

mental parcel
#

Looking to see if anyone has any ideas for approaches to this as I'd be interested in hearing things.

So my problem is that I have a file that's too big to load into memory and I need to randomise the lines in the file. How would you approach it?

I've got a couple ideas related to splitting the file and then merging files, a little like splitting a deck of cards in half and then merging.
One of my mates also suggested I could do passes over the file, noting a line and then doing an in place swap. Then repeating this process.

But I'm curious if there's a better way / alternatives

#

Since Im sure there's a good way to do it. I just don't know what it is

fervent saddle
#

One of my mates also suggested I could do passes over the file, noting a line and then doing an in place swap. Then repeating this process.
The lines are all the same length?

#

If they are then you could do something like this in "r+" mode. You can kind of index it using seek

keen hearth
mental parcel
#

Yea there not same length unfortunately

brisk aurora
#

numpy question:
I have a viewing frustum that is represented by 6 planes (a plane is 4 floats: a,b,c for the normal, and d for the distance: ax+by+cz+d = 0). I have N elements that I want to render. each element has an axis-aligned bounding box represented by a bounding_box_min vector3 (x,y,z floats) and a bounding_box_max.
I'm trying to figure out which elements are completely outside the viewing frustum by checking where their bounding box lies on each plane.
This is my regular, procedural python code:

def is_bbox_contained_or_intersects(self, bbox_min: Vector3, bbox_max: Vector3) -> bool:
    for plane in self._planes:
        bbox_x = bbox_max[0] if plane[0] > 0 else bbox_min[0]
        bbox_y = bbox_max[1] if plane[1] > 0 else bbox_min[1]
        bbox_z = bbox_max[2] if plane[2] > 0 else bbox_min[2]

        dot = plane[0] * bbox_x + plane[1] * bbox_y + plane[2] * bbox_z

        if dot < -plane[3]:
            return False

    return True

self._planes is a list of the 6 planes (a plane is a Tuple[float, float, float, float]). Vector3 is Tuple[float, float, float]
I'm running this function per element.
How can I vectorize this with numpy?

brave oak
#

for the first 3 lines

#

np.dot for the dot product

#

then just vectorised comparison

brisk aurora
#

how would the np arrays look like though?

#

(I'm a complete numpy novice, apologies for the silly questions)

brave oak
#

this is an interesting question

#

so you wanna make self._planes an array

#

a 6x4 array

brisk aurora
#

so it would be a 6x4?

brave oak
#

okay

#

and I'm going to assume

#

you want to stack the bounding boxes

#

which will make them Nx3 arrays?

#

where N is the number of elements to be rendered

brisk aurora
#

so bbox_min and bbox_max will be Nx3, sure makes sense

brave oak
#

that seems like the main axis of vectorisation to me

#

yeah

#

that's the sketched outline

#

of the solution

#

I'm not on my coding computer

brisk aurora
#

thanks!

brave oak
#

so I can only speak in abstractions

#

yw 👋

brisk aurora
#

much appreciated

#

another numpy question - are there good type hints for numpy arrays and methods?

stable pecan
#

if you call:

seven(times(five()))

the definition for seven is:

def seven(f=None): 
    return 7 if not f else f(7)

let's substitute in times(five()) for f:

seven(times(five())) = 7 if not times(five()) else times(five())(7)

but what's times(five())?

times(five()) = lambda x: x * five()

what's five()?

five() = 5 if not f else f(5) = 5

going back up the chain:

times(five()) = lambda x: x * five()
times(5) = lambda x: x * 5

and finally:

seven(lambda x: x * 5) = 7 if not (lambda x: x * 5) else (lambda x: x * 5)(7) = (lambda x: x * 5)(7) = 7 * 5 = 35
brave oak
#

AFAIK?

#

I could be wrong about this

#

haven't done much with numpy for at least a year

#

what kind of type hints

brisk aurora
#

it could be nice to type hint the shape of ndarrays

brave oak
#

are you looking for

#

oh

#

that would require dependent types

#

which we don't have in Python

brisk aurora
#

@brave oak regarding np.where, I'm not sure what my condition is for the 6x4 planes array vs. the bbox_min + bbox_max arrays?

brave oak
#

two arrays of different shapes

#

are of type np.ndarray

#

how do you represent the shape?

#

keep in mind that shapes differ based on the values in the array

#

in other words, we have types depending on values

#

this is called dependent typing

#

not many languages have it

brisk aurora
#

I see

brave oak
#

Python's typing system is kinda anaemic tbh

#

but well that's not on topic

brisk aurora
#

I really wanted it just as a comment for "this ndarray is of shape X at the moment"

#

ok, yeah

brave oak
#

planes > 0 will also be 6x4

#

you don't need the distance, though

#

so you can take planes[:, :3] > 0

#

that gives you 6x3

brisk aurora
#

that's "free"? (planes[:, :3]) ?

#

it's not copying anything?

brave oak
#

now, you want a coordinate from bbox_max or bbox_min depending on whether the corresponding vavalue in plane is True or False, respectively

brave oak
brisk aurora
#

like not taking time based on the amount of values in it

brave oak
#

you mean

#

a view

#

as opposed to a copy?

brisk aurora
#

yeah

#

exactly

brave oak
#

I don't remember the exact rules

#

but I believe that should create a view

brisk aurora
brave oak
#

you can read this

brave oak
#

you use np.where

#

because np.where is basically

#

np.where(c, x, y) -> x if c else y

#

but broadcasted

brisk aurora
#

np.where(planes[:, :3] > 0), bbox_max, bbox_min)
that's literally it?

brave oak
#

because you want to broadcast it across all the bounding boxes

#

so the dimensions need to line up

#

basically, what that means is that each axis needs to "mean" the same thing

#

for example

#

right now a bbox is (3,)

#

and planes is (6, 3)

#

so the first axis of planes means "number of planes"

#

but the first axis of the bbox means "coordinate" (x, y, or z)

brave oak
#

you see what I mean?

#

consider further

brisk aurora
#

yes

brave oak
#

that if you want to vectorise across bboxes

#

you need a third axis

#

because you can't "mix" that with either axis in planes

#

since, again, that third axis "means" something different

#

got all that?

brisk aurora
#

I don't get the third axis

brave oak
#

one bbox is 3 coordinates

#

N bboxes

#

are Nx3

#

let's say

#

so in general

#

ugh

#

man I should have stuck with M

#

you have P planes, each of which is 3 points (ignore distance)

#

therefore Px3

stable pecan
#

planes[:, :3] is a view, but planes[:, :3] > 3 is not view --- numpy will create a new boolean matrix

brave oak
#

such manipulations always create new arrays (basically)

#

now (focus on the mins), you have B bounding boxes, each of which is also 3 points

#

so Bx3

stable pecan
#

you can manually call the functions with the out= parameter to reuse arrays if you want

#

i'm derailing though

brave oak
#

yeah, simplest example is np.add

#

I don't know what the equivalent is for comparison actually

brave oak
#

so when using np.where between two arrays

#

one of which represents the planes

#

and the other of which represents the bounding boxes

#

that needs to be the same axis

#

so we have one axis accounted for

#

next, we have the P axis (number of planes) and the B axis (number of bounding boxes)

#

these two mean different things

#

therefore they should not line up

brisk aurora
#

ok so for that I reshape planes[:, :3] so the coordinate axis will be first

brave oak
#

which means in total

#

you have 3 axes

#
  1. planes
  2. coordinates
  3. bounding boxes
#

got it?

brisk aurora
#

yes

brave oak
#

yeah

#

that's about it

#

I need to go

#

but

#

play around with it

#

it's a good experience

#

🙂

brisk aurora
#

OK, no idea how to combine the different number of planes with the different number of bboxes but I'll try 🙂

#

thanks!

brisk aurora
#

@brave oak OK I'm kinda lost with shaping the planes array to support the np.where call so the condition, bbox_min and bbox_max are broadcastable. still trying though

brisk aurora
#

@brave oak
If I'm not mistaken, if bbox_min and bbox_max are each 1000x3, and planes is 6x4, then:

planes_normal = all_planes[:, :3]
planes_greater_than_zero = planes_normal > 0
bbox = np.where(planes_greater_than_zero[:, None, :3], bbox_max, bbox_min)

Now bbox is 6x1000x3 which means, for each of the 6 planes, for each element, what is the x,y,z that I want to check for the bbox.
Now I'm having trouble with doing the dot product between planes_normal and bbox 🙂

brave oak
#

but

#

that doesn't seem so right..

#

the resultant shape should be 6, 1000, 3, right

brisk aurora
#

the bbox array is correct, I think. I checked some values.

#

but I can't figure out yet how to do a dot product between the 3 in bbox (which is 6x1000x3), and the 3 in planes_normal (which is 6x3) while "Sharing" the 6 at the beginning.
The result of the dot product should be 6x1000, a dot product per plane, per element

#
np.tensordot(planes_normal, bbox, axes=(1,2)).shape
Out[120]: (6, 6, 1000)

I don't get the extra 6 here