#algos-and-data-structs

1 messages Β· Page 49 of 1

glossy lintel
#

!e

import numpy, sys
l1 = numpy.zeros((10000 + 31) //32, dtype=numpy.uint32)
l2 = [0] * ((10000 + 31) // 32)
print(f"Numpy Array: {sys.getsizeof(l1)}")
print(f"Normal List: {sys.getsizeof(l2)}")
halcyon plankBOT
#

@glossy lintel :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | Numpy Array: 1364
002 | Normal List: 2560
glossy lintel
#

oh nvm

#

forget that

haughty mountain
#

int sizes on typical CPUs are 8, 16, 32, 64 bits

glossy lintel
haughty mountain
#

wdym?

#

this thing stores bits in "blocks" of 8/16/32/64 bits

#

it's not really restricting anything

#

if blocks are 8 bits and you want to get the 10th bit, you would look at the second block

glossy lintel
#

im quite confused

#

hold on

#

!e

import numpy, sys
l1 = numpy.zeros((10000 + 31) //32, dtype=numpy.uint32)
l2 = [0] * ((10000 + 31) // 32)
l3 = "0" * ((10000 + 31) // 32)
print(f"Numpy Array: {sys.getsizeof(l1)}")
print(f"Normal List: {sys.getsizeof(l2)}")
print(f"String: {sys.getsizeof(l3)}")
halcyon plankBOT
#

@glossy lintel :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | Numpy Array: 1364
002 | Normal List: 2560
003 | String: 354
glossy lintel
#

uh...

#

string does like 354 bytes against numpy array doing 1364

haughty mountain
#

in the string you are storing one bit per character

#

in the the uint32 you are storing 32 bits per uint32

#

!e assuming you're storing just 0/1 in l2

import numpy, sys
l1 = numpy.zeros((10000 + 31) //32, dtype=numpy.uint32)
l2 = [0] * 10000
l3 = "0" * 10000
print(f"Numpy Array: {sys.getsizeof(l1)}")
print(f"Normal List: {sys.getsizeof(l2)}")
print(f"String: {sys.getsizeof(l3)}")
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | Numpy Array: 1364
002 | Normal List: 80056
003 | String: 10041
haughty mountain
#

so the numpy array wins by a huge margin

exotic bolt
#
chars = [some charset list]
ngrams = []
for i in range(n):
    if i == 0:
        ngrams = chars
    else:
        new_ngrams=[]
        for elm1 in ngrams:
            for elm2 in chars:
                new_ngrams.append(elm1+elm2)
        ngrams = new_ngrams

the above is disgusting, what's a neat implementation

exotic bolt
soft holly
#

I need a video course to learn and get good grades for my "Design And Analysis of Algorithm" lesson at university

glossy lintel
#

tbh im kinda confused how to handle the numpy array in this scenario

haughty mountain
#

I gave an example here of how to make a bitset based on a numpy array

glossy lintel
#

ok

glossy lintel
exotic bolt
#

is the only way to get the FLOPS of a piece of code to manually analyze the code and add counters?

haughty mountain
glossy lintel
#

kind of

haughty mountain
#

for simplicity, let's say we worked in blocks of 4 instead

n//4  0000111122223333
n%4   0123012301230123
#

n//4 picks out the block

#

n%4 is the position in the block

#

and the logic is just

block_index = n//4
pos = n%4
block = data[block_index]
the_bit = (block >> pos) & 1
#

e.g.

bits 0b1101
pos    3210
```getting the bit at pos 2

1101
11 # >> 2
1 # & 1 (removes anything but the lowest bit)

#

@glossy lintel makes sense?

glossy lintel
#

hold on

#

yup makes sense

#

thanks

haughty mountain
#

the bit math looks scarier than it is

glossy lintel
#

ig

glossy lintel
#

also one quick question

#

now this has to do with linked lists

#

i have 2 arrows that are where the linked list splits

#

so far im thinking this algorithm

#

the first is doing iterativeley by first making a linked list then appending all of the node's data values til it comes across the breakpoint. When it does then it append the node that the breakpoint is in but then creates a other linked list and switches to that instead while having the first value be the breakpoint node value. After which repeat

#

(as shown in the image above)

#

is there a faster alternative that will get the job done quicker?

#

(on a side note tho still appreciate the effort you put to helping me)

glossy lintel
#

Got something like this

    def split(self, positions: set[int]) -> list["SinglyLinkedList"]:
        target = self.starting_node
        l = []
        prevTarget = None
        lists = []
        index = -1
        while target:
            index += 1
            if index - 1 in positions:
                linkedList = SinglyLinkedList().from_iterable(l)
                lists.append(linkedList)
                l.clear()
                l.append(prevTarget.data)
            prevTarget = target
            l.append(target.data)
            target = target.point
        return lists
haughty mountain
#

what if you keep track of the head of the current list and the previous node?

#

if you encounter a cut point you can just unlink the previous node from the current, and head is now the first list

#

the current node becomes the new head

#

repeat

wide marlin
#

Is there a book about design patterns that uses ts/js or python for their code examples

candid anvil
# glossy lintel now this has to do with linked lists

Do you need to maintain the original list? Is singly-linked list a requirement, or can it be doubly-linked? (doubly linked means you can just remove the link from before the red/blue pointers). If it MUST be singly linked, you can use a while loop to iterate through the nodes "searching" for the red/blue nodes, and keep an extra variable for the prev node. The you can create the 3 linked lists.

harsh mason
#

hey why isnt my code working and giving the right answer in this question https://leetcode.com/problems/merge-sorted-array/?envType=study-plan-v2&envId=top-interview-150 ```python
class Solution:
def merge(self, nums1: List[int], m: int, nums2: List[int], n: int) -> None:
temp = nums1 + nums2
temp.sort()

    def checkFirstElement():
        if temp[0] == 0:
            return True
        
    while True:
        if checkFirstElement():
            temp.pop(0)
        else:
            break
    print(temp)
    ```
LeetCode

Can you solve this real interview question? Merge Sorted Array - You are given two integer arrays nums1 and nums2, sorted in non-decreasing order, and two integers m and n, representing the number of elements in nums1 and nums2 respectively.

Merge nums1 and nums2 into a single array sorted in non-decreasing order.

The final sorted array should...

#

i did do this nums1 = temp

glossy lintel
solemn moss
harsh mason
#

This is what I don't understand

#

I made logic

#

How should I go on about practicing coding questions ?

keen hearth
harsh mason
#

Wtf

keen hearth
#

So nums1 is something like [1, 2, 3, 0, 0, 0].

solemn moss
# harsh mason But why ?

well, because that's how python works
the first one just creates a new local variable
i don't have any good resources with explanations

keen hearth
#

So in your code, the line temp = nums1 + nums2 would need to be changed to temp = nums1[:m] + nums2.

solemn moss
keen hearth
#

But yeah, like I said, this doesn't make sense as a python problem. You generally wouldn't supply the lengths of lists along with the lists in Python.

#

Nor would it make any sense to "pre-allocate" space in a python list.

harsh mason
#

But them influencers say leetcode is good

keen hearth
#

It's alright πŸ˜„ It has a lot of programming problems to work on. I've always disliked how it makes you put your solution in a method in a class for some reason, and how the method name is in camel case πŸ˜”

harsh mason
keen hearth
#

Are you specifically preparing for coding interviews?

#

That's really the audience Leetcode is aimed at.

harsh mason
#

I actually am doing my second internship but the thing is I wanna get into better companies

keen hearth
harsh mason
#

Why not leetcode I was thinking of solving leetcode 150

keen hearth
harsh mason
keen hearth
#

The easy/medium/hard ratings aren't always very accurate either.

#

Sometimes the "easy" problems are only easy if you already know the solution.

harsh mason
#

Some dude said to learn solutions what does that even mean

keen hearth
keen hearth
#

Every problem has a "solutions" tab where people can post their code and explain their approach.

harsh mason
#

U know what I am gonna watch a Playlist of solutions

keen hearth
#

If you've been stuck on a problem for a long time, it may be a better use of your time to look at people's solutions to that problem, as it may be a problem with a "trick" solution, i.e. one that you either see immediately or not at all.

harsh mason
#

What if them interviewer gives me a trick ass question

keen hearth
#

They generally shouldn't do that. Those kinds of questions don't make good interview questions.

glossy lintel
#
    def split(self, positions: set[int]) -> list["SinglyLinkedList"]:
        """Splits the linked list into seperate linked list \"chunks\" given the positions 
        which the chunks are returned as lists of linked lists. it will perform always O(n) time"""
        target = self.starting_node
        prevTarget = None
        lists = []
        header = SinglyNode(target.data)
        currHeader = header
        prevHeader = header
        index = -1
        while target:
            index += 1
            if index in positions:
                linkedList = SinglyLinkedList()
                linkedList.__nodes_length = index + 1
                linkedList.starting_node = header
                linkedList.__ending_node = currHeader
                linkedList.__pre_end_node = prevHeader
                header = SinglyNode(target.data)
                currHeader = header
                prevHeader = header
                lists.append(linkedList)
            prevTarget = target
            target = target.point
            prevHeader = currHeader
            currHeader.point = SinglyNode(target.data) if target else None
            currHeader = currHeader.point
        return lists
#

can it be improved more?(has some flaws i noticed but im fixing them)

keen hearth
# harsh mason U know what I am gonna watch a Playlist of solutions

Yeah, that sounds like a good idea. This guy is a competitive programmer and has some nice instructional videos on his channel: https://www.youtube.com/@Errichto/playlists

#

To solve these kinds of problems, you need some understanding of basic data-structures like hash tables, linked lists, graphs, stacks, and queues. You should also understand how to analyse the runtime complexity of an algorithm (i.e. "big oh" notation).

glossy lintel
willow vapor
#

yo guys

#

anyone here knows where can i learn how to convert recursive calls (top to bottom) into iterative

keen hearth
willow vapor
#

for example:
I have this fibo

def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n - 1) + fib(n - 2)```
willow vapor
#

can you show how to do it on this fibo

keen hearth
#

So the most general way is to think about how the recursive code runs, and then replicate that with a stack.

willow vapor
#

stack as in push pop data type?

keen hearth
#

Yeah. When you call a function, a frame is added to the call stack, which contains the local variables of that function call.

#

When you make a recursive call, the stack grows, and when you return from a call the stack shrinks.

willow vapor
#

so every time i call a function it gets pushed to the stack?

#

how does that help convert the recursive method into iterative?

keen hearth
#

You can re-create recursive calls with a stack of "continuations". You can think of a continuation as representing a function call that's paused part of the way through executing.

#

Your Fibonacci function, when it makes a recursive call, has essentially two states it could be in:

  • waiting for the call fib(n-1) to finish
  • waiting for the call fib(n-1) to finish
willow vapor
#

can we do waiting = True if waiting on fib(n-1) and waiting = false if waiting on fib(n-2)

keen hearth
#

Actually designing a good continuation representation can be a little bit tricky sorry lemon_sweat Because it needs to contain all the information necessary to continue the function call after a recursive call has completed.

narrow mica
willow vapor
#

i am trying my best to understand

willow vapor
keen hearth
#

The Fibonacci function is kind of a special case, as fib(n) only depends on fib(n-1) and fib(n-2), you can perform the calculation iteratively without a stack, by just starting with [fib(0), fib(1)] and working up to n calculating each fibonacci number based on the previous two.

#

This approach is called bottom-up dynamic programming.

willow vapor
#

i have seen this approach on youtube(dont remember the video) but i am trying to learn how to convert recursive calls into iterative

#

are there better examples to learn how to convert recursive to iterative?

#

instead of fib

keen hearth
#

Are you familiar with depth-first search?

willow vapor
#

not really, i can try understand it

#

as long as its not very complex

keen hearth
#

Are you familiar with binary trees (the data structure)?

willow vapor
#

yes

keen hearth
#

Alright, so say you are given a binary tree, and you need to determine whether it contains a particular value. It's not a binary search tree, ie the nodes are not sorted in any particular order, so you have to search the whole tree in the worst case.

#

You could perform a depth-first search on the tree to find the value.

#

Let's say a node in the tree looks like this: ```py
class Node:
def init(self, value, left=None, right=None):
self.value = value
self.left = left
self.right = right

willow vapor
#

so it is an algorithms that finds whether there is a value within a binary tree

narrow mica
#

Not necessarily a binary tree, any tree
It's just that here you can use a binary tree for fib

keen hearth
#

A depth-first recursive search of the tree might look like this: ```py
def contains(tree, value):
if tree is None:
return False
return (
tree.value == value
or contains(tree.left, value)
or contains(tree.right, value)
)

willow vapor
#

oo

#

i see

#

recursive calls until it finds the node

#

by going to the far left side of the tree then the right side

#

how do you convert the contains method into iterative?

keen hearth
#

To implement this recursively, you would need a stack.

willow vapor
#

what does the implementation of a stack look like?

keen hearth
#

I guess al element of the stack could consist of a node, and whether we've visited that node before.

#

I.e. when we next visit that node, are we visiting it on the way down or on the way up.

#

So something like this (sorry if my explanations haven't been very clear, I probably need more coffee lemon_sweat)

keen hearth
#
def contains(root, value):
    stack = [(root, False)]
    while stack:
        node, visited = stack.pop()
        if visited:
            continue
        if node.value == value:
            return True
        for child in node.left, node.right:
            stack.append((child, False))
    return False
keen hearth
keen hearth
willow vapor
#

i get it

#

the stack is kind of like a queue

willow vapor
#
def contains(tree, value):
    if tree is None:
        return False
    return (
        tree.value == value
        or contains(tree.left, value)
        or contains(tree.right, value)
    )```
keen hearth
#

I.e. you can only add and remove from the top of the stack.

#

The last element in is the next element out.

willow vapor
#

i meant by a queue as in sequence not data type lemon_sweat

keen hearth
#

Oh right

willow vapor
#

the iterative code looks very different from the recursive one

keen hearth
#

Btw, the reason you might want to convert a recursive function to an iterative one is to avoid hitting the maximum recursion depth in python (or causing a "stack overflow" in other languages).

#

The call stack size is fairly limited. I think in python by default you can have up to 1000 frames on the call stack.

#

So if you recursively searched a tree with nodes more than 1000 edges deep, you would hit the maximum recursion depth.

willow vapor
#

i thought that converting to iterative is straight forward, it really gets tricky on harder problems

keen hearth
#

It can be. Coming up with a good representation for the continuations is the tricky part I think. You have to find a good way to represent the state of a partially completed function.

#

In python you could maybe do this using a stack of generator functions, if you're familiar with those.

willow vapor
keen hearth
#

It's a concept that comes from functional programming.

willow vapor
#

i looked up generator function, i understand it, it pauses the function after yield until it gets called again

#

i assume similar to continuations

#

anyways i wont trouble you any further, ill try dig up the remaining online, thanks for helping me

keen hearth
#

Yeah, sorry if I've just added more confusion by introducing a bunch of new terms! lemon_sweat

#

But the gist of converting a recursive function to an iterative one is to try to replicate what Python is doing when you make recursive calls (i.e. implement your own call stack equivalent).

willow vapor
#

no, i think its good to get familiar with new terms early on

keen hearth
keen hearth
willow vapor
#

at least i know some related keywords

keen hearth
#

If the tree of recursive calls is organisable into some kind of table, you can apply a bottom-up dynamic programming approach.

willow vapor
#

i liked the bottom top approach more, it is easier and without recursion however i still wanted to learn about top to bottom

#

anyways thanks for the help πŸ‘

keen hearth
keen hearth
solemn moss
#

An example of implementing decorator to replace recursion with stack from pajenegod

from types import GeneratorType
def bootstrap(f, stack=[]):
    def wrappedfunc(*args, **kwargs):
        if stack:
            return f(*args, **kwargs)
        else:
            to = f(*args, **kwargs)
            while True:
                if type(to) is GeneratorType:
                    stack.append(to)
                    to = next(to)
                else:
                    stack.pop()
                    if not stack:
                        break
                    to = stack[-1].send(to)
            return to
    return wrappedfunc

What is cool that you don't actually need to change your code, only replace returns (and function calls) with yield

For example fib function would look like

@bootstrap
def fib(n):
    if n <= 1:
        yield n
    else:
        yield (yield fib(n - 1)) + (yield fib(n - 2))

https://codeforces.com/blog/entry/77752?#comment-627450

keen hearth
#

Yeah I thought you might be able to do something using a stack of generators.

#

In this case I guess the generator iterator is the continuation (am I using that word right btw?) πŸ˜„

solemn moss
#

Ye (i am not sure how to name it too pixels_snek_2 )

lean walrus
#

now when im reading the code again, im not sure if i understood this mechanism correctly

haughty mountain
#

with top to bottom things just happen in the correct order naturally, but you end up having to deal with recursion

sour abyss
#

Hello, Unsure if this is a data structure question. But I have an api http://baseurl/a/b/c/d
I am writing a python client for the api and want to call the api with client.a.b.c.d()

    class a():
      class b():
         class c():
             def d():
            #implemention code here```
This looks like a really bad way of implementing the function~ Also breaks class naming standards .
Are there any patterns around this issue I have?
soft holly
#

I need a video course to learn and get good grades for my "Design And Analysis of Algorithm" lesson at university

fiery cosmos
#

What would be implementation for that algorithm in Python? (this is algorithm for fastest route)

wraith dragon
#

real talk, how do you guys generally benchmark your algorithms against one another? Is it based on min-time, mean-time, median-time, worst-time, etc?

candid anvil
near charm
#

Hello

#

I am new and really want to get into python proggramming.

#

Any tips you can give to either get out of tutorial hell or just an aproach in general to proggramming with the end goal of a job as a data engineer.

haughty mountain
#

big O is just an upper bound of something

#

it could be an upper bound of the average runtime, or whatever

#

O Θ and Ξ© are the ≀ = and β‰₯ of asymptotics

candid anvil
# haughty mountain big O isn't necessarily about worst time

Big O is the upper bound of the worst case in theory. I’m sharing what I do in practice. Unless your median & mean runtime are super far apart (skewed data), I would use worst case in theory and the average in practice to compare (cuz the big O doesn’t always align with reality).

If your runtimes have a big variance, it would depend on your particular optimization challenge what numbers you use & what your goals are. There is rarely a one-size-fits all solution to profiling & run time metrics. πŸ˜€

candid anvil
haughty mountain
#

you can upper/lower bound the best case run time if you want to

#

that would be denoted with O and Ξ© as well

#

because it's just upper/lower bounds of something

summer fossil
hollow stratus
#

pretty basic but why are linked lists better than a normal list? online it says dynamic sizing which means its easy to change size but I dont see how vs changing the size of a regular list by removing elements

summer fossil
#

but in python list are dynamic

#

list/array has generally fixed size

hollow stratus
#

ohhh ok I see im used to python thats prob why I was confused

#

I have an interview and I have to remember Data Structures and Algorithms and Big-O, anything in specific that anyone would recommend stressing on? (Its been sometime Im starting to remember the concepts again now)

narrow mica
haughty mountain
#

the insert in front can often be solved better by other structures

outer rain
# wraith dragon real talk, how do you guys generally benchmark your algorithms against one anoth...

The answer, as always, is "it depends".

Let me start with non-programming related example first. Imagine you have a company you are about to apply to. Of course, you wonder what your salary is going to be. Let's also assume you know the salaries of everyone in the team you are going to be the part of. Statistically, there are multiple ways to estimate it.

You can consider the average of all salaries, but that's basically a useless estimation. Mean is basically the same thing as sum, what does sum mean to you? It does mean something for a finance department: that's how much money they are spending on the team, but this does not mean anything to you.

What you are looking for is a "typical" salary. That's what median is. Because that's the best you can estimate: you are probably about to be a typical person in that team, so you are likely to get typical salary. What if there is one super cool guy which gets a million a year, and the rest get 100k? Then you are likely to get the same 100k, not a million. On the other hand, the finance department does not care about median, they only consider the team as a whole.

#

For the same reason, it makes sense to estimate different characteristics of your benchmarks:

  • If your code must run multiple times until completion (i.e. an important business logic algorithm), then you should probably look at the mean time (or total time)
  • If your code estimates something and is not necessarily required to finish (i.e. you have a routing network and you need to find the semi-optimal path for a packet to take: if you spend too much time calculating the optimal path you will end up having worse results than if you just send it in some path, so it makes sense to terminate the estimation after some time), then you should probably look at the median time
  • If your code sometimes works slowly and it is noticeable to the user (i.e. you have a button in the UI which usually works in 0.1s, but sometimes lags and and takes 2s), then you should probably look at the 90% percentile (that's almost max) and try to minimize it (user won't notice the 0.1s -> 0.05s speedup, but they will notice 2s -> 1s speedup)
  • If your code runs on concurrent machines and you are only interested in the first result (i.e. you analyze signals from the trading platform in high frequency trading and you have 64 cores which run the same algorithm with different random values, but you can use any result and terminate all the other workers once one has finished), then you should probably look at the 10% percentile (that's almost min).

Of course, there is also a question of "what are you measuring, exactly?". You can benchmark the algorithm on the same inputs, or you can profile in production on real data, and you will have to work with multiple different metrics. You also need to make sure you are optimizing the right one. I will not go into that, though.

wraith dragon
#

I see I see

#

thanks for the detailed response, super helpful brother! @outer rain

#

just to bring the discussion back to what I had in one of the offtopic channel@hushed mist, I talked with my prof about this and she had similar arguments about "it depends"

#

for example, if I have the same input size but different possible configurations and a way to define a "random" configuration, then taking the median across those made sense
But she pretty much agreed that if I'm running multiple times over the same exact input, min makes more sense

#

That being said, she also mentioned mentioning fluctuations and stuff

#

is that really important to talk about? Like I can't imagine somebody using a box plot to visualize the growth of an algorithm it would look too cluttered and you can't put more than one graph

hushed mist
#

You should always try to look at the data unconsed if possible, before you use one number to represent a whole distribution.

#

But of course, it depends on how accurate you need to be.

gentle bough
#

I think it's my for loop that isn't properly doing what i want it to do, i want it to start at the last index and create a whole at the given index, shift all the elements to the right then place the new item at the given index

    def insert(self, indx, newItem):

        if indx < 0 or indx > self.size:
            raise KeyError("ERROR: Invalid index accessed.")
        
        if self.size == len(self.items):
            temp = Array(2 * len(self.items))

            for i in range(self.size):
                temp[i] = self.items[i]

            self.items = temp

        for i in range(self.size - 1, indx - 1, - 1):
            self.items[i + 1] = self.items[i]

        self.items[indx] = newItem
        self.size += 1
normal bison
#

I'm not sure what this Array() is

normal bison
#

do you have a negative test case?

fleet flare
#

Im new to python.. is it possible to make a tool integrated with chat GPT where it has a pre loaded question and just simple input is required? For example:

The tool would say β€œEnter city:”

You’d enter the city name and press enter, and it would feed back the temperate in that city.

Example:
Enter city: New York
Response: The tempature in new york is **

This is just an example, i’d use it for multiple purposes.

keen hearth
hushed mist
#

Any experience with pathlib and in particular it being slower than os. methods?

keen hearth
#

At one point it was a bit slower, but that may have since been fixed.

#

I think pathlib is nicer to work with than os.path.

fiery cosmos
#

Uh

hushed mist
#

Thanks @keen hearth my opinion as well, I just a have a lot of legacy calls with strings. Debating myself if I should make an effort to port it.

Sorry for posting this in the wrong channel, I just realized.

cloud epoch
#

@little scaffold

wide marlin
#

so im in the middle of learning sorting algorithms, in the part where we're talking about efficiency, let's take for example bubble sort, they talk about the two significant steps in bubble sort:

  1. Comparisons
  2. Swaps

my question is, how do we determine which part of our code that does something 'significant'?

#

I initially thought all we need to pay attention to most of the time are loops, be they for or while loops

haughty mountain
#

for the former you just count how many "primitive operations" you make, and see how that scales

#

this is likely not too relevant, but to model real world stuff you can get quite intricate if you really want to, e.g, considering swaps and comparisons having different costs and counting exactly

I've even seen operations being scaled by distance to mimic cache behavior

wide marlin
#

that said, I learned that selection sort is also quadratic like bubble sort, but i also learned that not all quadratic time are equal

haughty mountain
wide marlin
haughty mountain
#

decide what operations you say take constant time

wide marlin
haughty mountain
#

and then start counting

narrow mica
# wide marlin that said, I learned that selection sort is also quadratic like bubble sort, but...

Do note that time complexity measures how well an algorithm scales, not how well an algorithm runs in practice
E.g. For large enough inputs, an O(n) algorithm will run better than an O(n^2) one, but that doesn't mean that the O(n^2) one will be slower for smaller inputs
In fact, galactic algorithms are algorithms with very good complexity, but the underlying constants are so big, that they're not practical at all

A galactic algorithm is one with world-beating theoretical (asymptotic) performance, but which is never used in practice. Typical reasons are that the performance gains only appear for problems that are so large they never occur, or the algorithm's complexity outweighs a relatively small gain in performance. Galactic algorithms were so named by...

haughty mountain
wide marlin
haughty mountain
wide marlin
# haughty mountain wdym?

i was gonna say:

since im deciding which operations in an algorithm takes O(1), and im assuming algorithms would have different things that would be, O(1)

now that i read what i've written it doesn't make sense

haughty mountain
#

what is considered O(1) can be quite context dependent

wide marlin
#

wait, so it does make sense?

haughty mountain
#

what's the time complexity of multiplying two numbers?

wide marlin
#

i mean we take one number and multiply it with the other
so im assuming it takes one step

#

but the numbers can be single digits or 50 digits long

haughty mountain
#

what if the numbers are large? say d digits?

wide marlin
#

yea that

#

if they are n digits

#

bigO ignores constants, so im assuming they are still O(*1)

wide marlin
haughty mountain
#

my point is just that it's for sure Ο‰(1), i.e. it does scale with the number of digits

wide marlin
#

oh so that what the scaling part means

#

i was imagining the plotted O(N) graph

#

they will stay diagonal

haughty mountain
wide marlin
#

oooh

haughty mountain
#

if you can assume some fixed sized integers, you can just treat integer multiplication (and other similar operations) as constant time

wide marlin
#

that make sense

#

but in algos we rarely are doing just simple pemdas right

#

what about things like comparison operators?

haughty mountain
#

comparing what?

wide marlin
#

something like

if my_list[idx] > my_list[idx + 1]:
  #do something here
#

list slicing(?) is O(1)
so both would take a single step

haughty mountain
#

what are your list elements?

wide marlin
haughty mountain
#

if you assume they are not arbitrarily large, sure

#

and yeah, memory lookups being O(1) is what you typically assume

wide marlin
#

and so, back to the question which is an excerpt from the book im following:

The Bubble Sort algorithm contains two significant kinds of steps:
β€’ Comparisons: two numbers are compared with one another to determine
which is greater.
β€’ Swaps: two numbers are swapped with one another in order to sort them.

so even if they're O(1), we still do it X number of times?

haughty mountain
#

right

wide marlin
#

Okok
and then we go into the swapping part, which is, im not really sure how arrays are treated in memory. i just know they populate a contiguous area in memory

haughty mountain
#

talking about combining a bunch of O(1) is a bit iffy, it might be a better idea to just think of a primitive operations taking some constant time, or even just 1 time

wide marlin
#

so guess that's why linear time is O(N),
if there's 10 elements, we're gonna do 10 steps, if there are 100, we do 100 steps

haughty mountain
haughty mountain
#

(and for large enough N)

wide marlin
#

so constant time?

haughty mountain
#

yeah

wide marlin
#

and i need to remember that these operations are for every iteration where the if statement evaluates to true

#

so the significant steps are basically, the list of steps that the algorithm has lined out

haughty mountain
wide marlin
#

for bubble sort, we iterate through the list of elements, we compare one element to the one after it, and swap only if the current element is bigger than the element after it

#

and how many iteration happened depends on the length of the list (aka the input size) and the unsortedness of the list in the first place. but the latter is something not within our control

wide marlin
haughty mountain
#

there is the strict thing involving n0 and other constant, but there is some more helpful phrasings that is a tad more sloppy

#

asymptotics is looking at what happens with f(n)/g(n) as nβ†’βˆž

#

if it goes to 0 then f(n) is in o(g(n))

#

if it goes to ∞ then f(n) is in Ο‰(g(n))

#

if it goes to some constant it's in Θ(g(n))

#

these are basically <, > and =

#

O is like ≀ (so either o or Θ)

#

Ξ© is like β‰₯ (so either Ο‰ or Θ)

#

I suspect o and Ο‰ are not usually taught

wide marlin
#

Nope

haughty mountain
#

but you will see thing like O and Ω implies Θ

wide marlin
#

Yea those ones i have no idea

haughty mountain
#

as mentioned earlier O and Ξ© are like ≀ and β‰₯

#

β‰₯ and ≀ implies... =

#

so Θ

haughty mountain
wide marlin
#

fiery i'll brb i gotta drop off my sister lmao i'll be back after

haughty mountain
#

which doesn't matter asymptotically

glossy lintel
#

should i include lazy loading to linked lists?

#

such that when u generate a linked list from a iterable. You can specify if you want it to lazily load

#

if you do specify, the linked list expands only when needed such as passing a index higher than the current linked list length

#

operations can also eagerly load the linked list for what they need

#

you can also eagerly load it manually by specifying the limit, and if you want dynamic allocation if its set to false, it stops the lazy loading even if the iterator is not completed

#

although one of the drawbacks of lazy loading is if you want to get the length of the linked list, it may not be accurate since iterators can be infinite. You get only the length of the linked list that is loaded, same goes if you wanna get the tail of the linked list

glossy lintel
#

Made this new batch insert function. But there is a problem, it doesn't perform faster than normal inserting in fact worse, i want the batch insert to perform way faster than insert since its meant for inserting many nodes

    def batch_insert(self, entries: dict[int, SinglyNode]) -> None:
        if not entries: raise ValueError("Entries Dict Argument Is Empty")
        positions = entries.keys()
        
        prevLength = self.__nodes_length
        positionsLength = len(positions)
        self.__nodes_length += positionsLength

        if entries.get(-1) or entries.get(prevLength - 1):
            node = entries.get(-1) or entries.get(prevLength - 1)
            self.__ending_node.point = node
            self.__ending_node = self.__ending_node.point
        if entries.get(0):
            node = entries[0]
            prevStartNode = self.starting_node
            self.starting_node = node
            node.point = prevStartNode

        counter = positionsLength
        target = self.starting_node.point
        for i in range(1, self.__nodes_length):
            if i >= self.__nodes_length or i < -1: raise ValueError("Position Entry Pair Out Of Bounds")
            elif counter <= 0 or i == self.__nodes_length or target is None: break
            node = entries.get(i)
            if node:
                node.point = target.point
                target.point = node
                counter -= 1
            target = target.point
#

the bottleneck is specifically the loop

glossy lintel
#

and here is the performance

Non-Batched Insert 0.4138023115374381
Batched Insert 9.738792831907631
#

(its the sum of many timestamps)

modern gulch
#

First thing that comes to mind is trying cython here, given the tight loop

#

I might also try moving nodes length to a local, to cut out a few loopups

#

But also, when iterating over nodes indexed by I? Can you can this to an enumerate over nodes.items?

#

(I’m only skimming the code, not actually looking at the logic)

haughty mountain
#

I wouldn't even provide indexing for a linked list

glossy lintel
#

ok

glossy lintel
lean walrus
#

wtf is this ```py
for i in range(1, self.__nodes_length):
if i >= self.__nodes_length or i < -1: raise ValueError("Position Entry Pair Out Of Bounds")

lean walrus
#

i guess you are right, so this code does make sense

glossy lintel
#

good question

#

i have no idea what my brain thought of

#

removed so no worries

zinc narwhal
#

def prime(n):
    if n <= 1:
        return False
    if n == 2:
        return True
    for i in range(3, int(n**0.5) + 1):  

        if n % i == 0: 
            return False
    return True

def checkprime(x):
    p = []
    for num in x:
        if prime(num):
            p.append(num)
    return p

``` i want this to print this : [2, 3, 5, 7] when i give : ```py
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
prime_numbers = checkprime(numbers)
print(prime_numbers)
``` but instead its giving: [2, 3, 4, 5, 6, 7, 8, 10]
#

plsssss help!

#

anyone

lean walrus
#

you are not checking divisibility by 2

glossy lintel
#

i've ran a test for inserting, deletion, searching and traversing. These are the results, im testing their normal counterpart versus their batch counterpart. There were 200 iterations and each iteration has a sub 2000 iterations and all of them operate on different linked lists but on the same length, same main iteration and same sub-iterations. Does this look normal?

#

they are meausred in seconds

#

(i kinda smoothed out the curve so it looks more readable)

#

bc to me it looks too good to be true

lean walrus
#

how big are the batches?

#

blue lines are suspiciously close to zero, i guess that is because your batches are pretty big

glossy lintel
#

yeh

glossy lintel
#

give me a moment

#

2k in length are the batches

#

per iteration

#

Here is the code if ur wondering

def nonBatchInsert(l: SinglyLinkedList, iters, index, coordsIndex):
    s = 0
    for i in range(iters):
        start = time.perf_counter()
        l.insert(SinglyNode(i / 4), (i * 2) % len(l))
        t = time.perf_counter() - start
        s += t
    coordsY[coordsIndex][index] = s

def batchInsert(l: SinglyLinkedList, iters, index, coordsIndex):
    length = len(l)
    start = time.perf_counter()
    entries = {(i * 2) % length: SinglyNode(i / 4) for i in range(iters)}
    l.batch_insert(entries)
    t = time.perf_counter() - start
    coordsY[coordsIndex][index] = t
#

don't ask about coordsY, ik its a bad practice especially when multi threading

#

for smaller batches, the blue lines aren't that close to 0

#

as shown

#

its only when i increase the sub-iterations, batching performs very "good" compared to non-batching

#

(i yet don't know if its actually good or its doing its own shenanigans)

wide marlin
#

man sorting algorithms are fascinating

#

@haughty mountain also sorry i pretty much got occupied all of yesterday, and was completely beat lol

summer fossil
#

In computer science, the longest increasing subsequence problem aims to find a subsequence of a given sequence in which the subsequence's elements are sorted in an ascending order and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous or unique. The longest increasing subsequences are studied in the c...

mint jewel
#

I would probably reach for (N choose K) - all combinations with duplicates , creating a inclusion/exclusion formula.

runic birch
#

I think there is a simpler solution.

haughty mountain
runic birch
haughty mountain
#

I would look at variations of LIS

#

your answer is basically the number of increasing subsequences plus the number of decreasing subsequences

haughty mountain
#

how is it not? pithink

#

oh wait

runic birch
#

I just need number of strictly sorted combinations. So in a way you are right.

haughty mountain
#

only of length K?

runic birch
#

Yes

haughty mountain
#

the thing I mentioned would be any length

runic birch
#

I suspect the solution is simple. But I cant find it

#

It may be inclusion exclustion principle but ive found it hard to implement as I dont really understand the theory behind it

haughty mountain
#

what's the range of values?

runic birch
haughty mountain
#

not the sizes

#

the range of values

runic birch
#

Which values do you mean? Like the actual numbers?

haughty mountain
#

the numbers in the array

runic birch
#

completely random. Here is a sample: 7556456 6290139 1141998 5749754 8122327 6166082 2081186 3607828 157756 9024717 9890673

#

But they tend to repeat a lot

haughty mountain
#

so <= 10^7 or something

runic birch
#

Entire int range and fully random

#

Never negative though

#

1 <= N <= INT_MAX

haughty mountain
#

which with the relevant data structures is O(n k log(something))

runic birch
#

Not really though

#

Because in their case they do not consider duplicates as unique

#

If you look at the output explanation notice how there is 3a and 3b

#

Still can be helpful

outer rain
#

it says "combinations", not subsequences tho πŸ€”

#

right?

haughty mountain
runic birch
#

"Example
Suppose I have an array 1,2,2,10

The increasing sub-sequences of length 3 are 1,2,4 and 1,3,4

So, the answer is 2."

But in my case each 2 would be unique and this the answer in my case should be 3 and not 2

#

Cited from the reference the user links in their response

haughty mountain
#

how would the answer be 3?

#

for k=3

runic birch
#

But technically the combinations are sequences in a way. I suppose

haughty mountain
#

the answer is 2, no?

#

1, 2, 10 and 1, 2, 10

runic birch
#

Wait you are correct and im dumb

#

I gtg. I will be back in 20 mins or tommorow. But thanks for your help. WIll try this approach and let you know

fiery cosmos
haughty mountain
#

no, I don't see any reason to do stuff over dms

outer rain
#

I can solve that problem in O(n log^2 n) using FFT 😭 I am too spoiled

#

nevermind, no, I cannot, because 1e9+7 is dumb number

ornate rune
#
    for i in range(n): 
        print(i)
        for j in range(i):
            for k in range(n):
                print(k)
        #This loop
        for j in range(i):
            print(j)```
Assume input size is number n itself, will the loop "for j in range(i)" be O(n) or O( (n^2 + n) / 2 ) = O(n^2)?
haughty mountain
#

neither?

#

this is O(n^3)

ornate rune
#

oh wait

#

what if its the loop below

haughty mountain
#
for i in range(n):
    for j in range(i):
        print(j)
```n*(n-1)/2 = O(n^2)
#

it just boils down to an arithmetic sum
0 + 1 + 2 + ... + n-1

ornate rune
#

okay yeah thankss

outer rain
# outer rain I can solve that problem in O(n log^2 n) using FFT 😭 I am too spoiled

best solution I've got:
||Sort the array, group equal numbers together and count the numbers in each group. Call that array of counts a . Then we must select k groups, and choose one number in each to use in the sequence. That's sum a[i[0]] * a[i[i]] * ... * a[i[k - 1]] over all combinations i of k elements in 0..len(a) range. That's the same thing as the k-th coefficient in polynomial (x + a[0])(x + a[1])...(x + a[-1]). To find it, recursively multiply the polynomials using divide and conquer, and discard the coefficients after k-th (since the rest is irrelevant). D&C will require log n recursion layers, and d-th does n/2**d * min(2**d, k)**2 work, which evaluates to approximately 296_086_621 operations in total. Surely it's not that many multiplications πŸ’€ ||
my god I can't english today

edit: ||actually you can group and count elements of a again, find all (x + a[i])^b[j], and multiply just O(sqrt(n)) polynomials with D&C. Then it will probably be fast enough (O(log n * sqrt n * k + n * log n)?). Still, I don't believe this is an intended solution||

runic birch
outer rain
#

you probably shouldn't, it's stupidly overcomplicated, I just don't know the easier way

#

and it's most likely too slow anyway

runic birch
#

The solutions for these tasks tend to be easier. So I suppose there is a better one.

haughty mountain
#

oh wait, I misinterpreted the problem, it's not subsequences, and you can wlog decide to look exclusively at increasing/decreasing

haughty mountain
#

!e what about this?

from collections import Counter

def f(seq, k):
  # Counts in order.
  counts = [c for _, c in sorted(Counter(seq).items())]

  # The real dp is smth like
  # dp(i, j) = dp(i, j-1) + dp(i-1, j-1)*counts[j]
  # where dp(i, j) is the number of sequences
  # with length i using numbers ≀ j.
  # But we don't need a 2d array, we can keep only one row.
  dp = [1] + [0]*k
  for c in counts:
    for i in reversed(range(k)):
      dp[i + 1] += dp[i] * c
  return dp[-1]

seq = [1, 2, 2, 10]
print(f(seq, 3))
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.

2
haughty mountain
#

O(n k)

#

ignoring the sorting

keen hearth
#

Here's a hint as to how I would go about solving this: you can use the relationship ```py
math.comb(n, k) == math.comb(n-1, k) + math.comb(n-1, k-1)

normal bison
keen hearth
#

Because the numbers are not unique, the combinations that contain a particular number need to be multiplied by the number of times that particular number appears.

haughty mountain
#

I guess that's essentially the O(nk) thing I proposed

keen hearth
#

Intuitively, the equation above says, the number of ways of choosing k elements from n is the sum of:

  • the number of ways that don't contain the n'th number
  • the number of ways that do contain the n'th number
mint jewel
#

Wouldn't you run into the principle of inclusion/exclusion again, since each of those two branches needs to split per duplicate element?

keen hearth
#

I don't understand sorry lemon_sweat

mint jewel
#

If you want to do this, you need to split each of those subcomputations of combinations into 2 cases for n-1th element, recursively. This is the principle of inclusion/exclusion more or less.

#

If it fits into DP nicely, it'll work though

haughty mountain
#

I just wonder whether O(n k) is intended

#

in python 1e8 sounds like a lot

#

in C++ it would feel intended

keen hearth
keen hearth
#

Anyway, this is the code for the solution I was hinting at:
||```py
from collections import Counter
from itertools import pairwise

def solve(xs, k):
counts = Counter(xs)
table = [1] + [0] * k
for x, count in counts.items():
table = [1] + [a*count + b for a, b in pairwise(table)]
return table[-1]

#

Not sure if that's the same as yours fenix. My brain's not really in code reading mode.

haughty mountain
#

that's my solution, just phrased differently πŸ˜›

keen hearth
#

Ah right

#

Oh, I forgot about calculating the remainder

haughty mountain
#

eh, small detail at the end of the day

haughty mountain
#

this is just the product of a bunch of terms (x*count + 1)

#

so you could totally divide and conquer polynomial multiplications

#

so I guess the total cost could be n/2 * M(2) + n/4 * M(4) + n/8 * M(8) + ... where M(n) is the cost of multiplying a polynomial with N terms

normal bison
#

I think we might speed up because if a bunch of numbers have the same count we might be able to treat them in the same way, and there can only be up to sqrt(n) distinct counts.

haughty mountain
#

I was about to say things are order dependent, but I guess they aren't

#

so yeah, you could optimize this a bunch using that insight

normal bison
#

and it's basically asking for a coefficient of a polynomial once you throw generating functions at it

#

which is why there are FFT solutions

keen hearth
haughty mountain
#

I mean, that's what the polynomial description is

#

just the polynomial description should end up much more efficient

#

the polynomial description lends itself well to FFT based approaches

keen hearth
#

Oh right yeah I see

#

I need to pick up an algorithms book again I think

naive oak
#

is there an easy way to save memory for dp tables that only have values in a triangular portion of the array? more specifically, saving memory on an NxN dp table dp[i][j] that satisfies dp[i][k] = 0 for all k >= N - i

narrow mica
naive oak
#

i do use dicts for "sparse" dp tables but this is more like sparse in a specific area while dense in another

#

i was thinking of folding it into a Nx(N+1)/2 rectangle (N is odd in my use case), but that seems like a lot of overhead to keep converting between indexing systems

wraith dragon
#

an interesting question I got for ya'll that I'm facing rn

#

given a finite grid of [0, 256], what's the maximum convex hull one can generate?

#

where convex hull doesn't include colinear points

dreamy valley
#

wouldn't that be the whole region (I.e. all the points) by definition?

wraith dragon
#

How can all points be part of the convex hull

#

This is a convex hull

#

By maximum I mean number of vertices in the hull

dreamy valley
#

Well, if the points in the input set are arranged in a circle, the answer is "as many vertices as there are points in the original set"

#

I know what a convex hull is, I'm poking at the word "maximum"

wraith dragon
#

"as many vertices as there are points in the original set"

#

there's the rub

#

actually nvm, clarify what you mean

dreamy valley
#

In informal language, the convex hull of a set is the smallest set of points surrounded by the set

#

Maximum on the other hand means largest

#

So what is the largest, smallest set of points surrounded by a set?

wraith dragon
dreamy valley
#

I thought you meant the maximum convex hull of any set of points but that just leads to "all the points"

wraith dragon
#

so there are multiple other restrictions

dreamy valley
wraith dragon
#

just to clarify then, find a shape in [0, 256]^2 such that it maximumizes the number of points of the convex hull

wraith dragon
#

my addition here would be that no colinear points are allowed

dreamy valley
#

Ah OK I see the problem now

#

Do the coordinates have to be integers?

wraith dragon
#

yup, it's a finite grid of integer values

#

I'm kind of having a debate about this with my professor that's why I'm asking here

dreamy valley
#

Hmm

#

What's the debate over?

wraith dragon
#

not really a debate actually

#

we were given an assignment but one of the requirements for our input is that it has to be within a finite range of integers the prof specified

#

I'm trying to prove that this fundementally limits how far we can test our convex hull algorithms

#

for example for [0, 256]^2 I have a hunch that the maximum is like 86

#

and for the range that our prof gave us it would be around 2000

#

too low for any meaningful comparisons

#

but all of this is a hunch

#

my prof said "I suggest you try to find a way"

#

so I want to show that there is no way

fiery remnant
#

Is it a good idea to learn web dev skills first and then prepare for dsa or the other way around?

HELP

glossy lintel
#

i have an idea for a data structure thats basically a sort of "superset" of Hashmap(Hash Grid), This data structure instead of having only key-value pairs, it is a grid containing key pairs. You can specify the dimension of the data structure which is how many keys will be on each pair

#

im still thinking of how it should go

#

maybe there will be a hash version & a normal version

#

does it sound useful?

dreamy valley
#

@wraith dragon My first non stupid idea would be to start with a set containing just the 4 central points (I think I can argue they have to be included) and add arbitrary points to the set one at a time until adding any other point would decrease the number of vertices. But I don't know how to prove that a set found that way is maximum.
There's got to be some backtracking algorithm there.
But brute force only works because the number of points is small (perhaps that's the point you're trying to make)

wraith dragon
#

damn

#

yah very interesting approach ngl

#

but yah you kinda have to prove that this does indeed produce the maximum

#

lowkey I feel like that's a mathematicians job or a phd in cs guy

dreamy valley
#

I suspect you can cut the work by a factor of 8 using symmetry

narrow mica
dreamy valley
#

Maybe going the other way would be better. Start with all the points included and remove vertices from the CH until removing any additional vertex reduces the total number of vertices?

narrow mica
# narrow mica Rotate some lines?

E.g. start with the smallest x point p1 and a vertical line, rotate it around p1 until it hits another point p2, then rotate the line around p2 until it hits p3, ..., until it loops back around and hit p1
Maybe not actually simulate rotation but you get the idea

glossy lintel
# glossy lintel i have an idea for a data structure thats basically a sort of "superset" of Hash...

one structure i've thought is the chain

Its basically a cluster of types but each type is unique, there can be no sub-types. When adding a type to the chain it first looks if there is a type that supports addition with the other type, if it finds then it adds and doesn't insert a new type. But if not then it inserts that new type. Adding chains is a similar process, every type is checked if they have a a additive counterpart

lilac cradle
#

Is there an algorithm for arranging a β€˜line’ of N elements into a square (or square-ish, at least) by splitting it into rows?

#

for instance 16 elements would be split into 4 rows, and either A: have it be as close to a clean rectangle, i.e without a 'partial row' (when possible, since there's shit like prime numbers that wont work) or B: have it get as close to an aspect ratio of 1 as possible

solemn moss
#

Just iterate over all divisors of n and pick the best one

If d is a divisor of n, you want choose such d where abs(n // d - d) is as small as possible

lilac cradle
#

i was considering that but figured it'd be suboptimal but fair

#

lemme write a function to try it

#

actually, lemme see, is there something for the divisors of a number?

vocal gorge
lilac cradle
#

ideally if i split N elements into a rectangle with size A,B, A and B should be as close together as possible

vocal gorge
#

all divisors of a number are <=sqrt(N), and also fullfill a*(N//a)==N (duh), so I think you always want to pick A to be the largest nontrivial divisor of N

fiery remnant
#

Do web developers need to learn dsa?

fiery remnant
#

Im preparing for front end dev ...
Just curious if I need to learn dsa?
If yes i will do it in python for sure

narrow mica
#

Uh... not sure what you mean? What are you doing exactly?

narrow mica
solemn moss
#

the largest divisor of n is n πŸ˜„

agile sundial
#

checking a hash against every key is the opposite of what you should be doing. you need a way to turn the hash into an index that you check

fiery remnant
agile sundial
#

you guess incorrectly

slender sandal
#

What are you trying to do

agile sundial
#

are you trying to implement your own dictionary?

agile sundial
#

i'm not sure where you question of

When hashing, are there more performant approaches to checking every new hash against the keys of the entire dictionary? Maybe blindly insert and catch the exception? Or are there other trees I could try climb?
comes up then

#

you don't handle any hashing yourself with a dictionary

glossy lintel
#

its a question

#

if the mentioned data structure is useful

#

more nicely compact is

#

A chain is a cluster of unique types. For operations Like

Type Insertion

The program checks if there is a type that is able to add up with the type to insert the chain prioritises adding the same type(like string + string) instead of different ones(like int + float). If there is then we add that, otherwise we add that new type to the collection

Chain Addition

The program checks if there are same types on both chains, If there are prioritise them first and then it checks if there other types that support the chain's types from the other chain's types, if there are then add them toggether

Chain Multiplication With Int Value

Instead of expanding, every value inside the chain that can be multiplied with the integer will be multiplied and the others will be left untouched

vocal gorge
#

What'd you use that for? That's so weirdly specific.

mint jewel
#

I only ever heard of chain as this sort of thing -

type Chain[T] = None | T | tuple[Chain[T], Chain[T]]
#

I could see the one you are talking about for some sort of garbage collector IG.

winter helm
#

Hellooo im currently going through the backend on roadmap.sh, and one of the nodes is learning about data structures and algorithms but honestly... I have no clue what Im looking at and how its relevant

#

Any chacne someone may be able to help break it down a little to understand why its important and perhaps that will help me learn from there

kind trout
#

Not sure if this is on the right chat, but how do i call a function from a different file that I defined? or should i just re-write it to be a local function instead

haughty mountain
#

My go to analogy is data structures, algorithms being tools in your tool box, which you can use to break problems apart with problem solving ability. But same idea

fossil ice
#

Alright, I got what feels like an easy one (and in C I'd have it) but I'm struggling with a way to do this in python.

#

I have a big pile of bytes.

#

I know that each set of 11 bytes has a particular mapping.

#

Is there an effective way to map the array of bytes to a set of identical dicts (probably 50 or so dicts) based on offset, without having to cycle through the whole array one item at a time?

#

Trying to maybe use Map here? I'm not familiar with it though

fossil ice
#

For now a whole stack of maps gets it to a list that's at least parsable, but I'm still having to address the offsets the hard way.

haughty mountain
#

if the 11 bytes are structs then the struct module can help

#

!d struct.unpack iirc

halcyon plankBOT
#

struct.unpack(format, buffer)```
Unpack from the buffer *buffer* (presumably packed by `pack(format, ...)`) according to the format string *format*. The result is a tuple even if it contains exactly one item. The buffer’s size in bytes must match the size required by the format, as reflected by [`calcsize()`](https://docs.python.org/3/library/struct.html#struct.calcsize).
haughty mountain
#

if it's some custom encoding, just go through every 11 bytes and do your conversion

#
[your_conversion(chunk) for chunk in itertools.batched(your_bytes, 11)]
haughty mountain
halcyon plankBOT
#

struct.iter_unpack(format, buffer)```
Iteratively unpack from the buffer *buffer* according to the format string *format*. This function returns an iterator which will read equally sized chunks from the buffer until all its contents have been consumed. The buffer’s size in bytes must be a multiple of the size required by the format, as reflected by [`calcsize()`](https://docs.python.org/3/library/struct.html#struct.calcsize).

Each iteration yields a tuple as specified by the format string.

New in version 3.4.
fossil ice
#

Hey that's fantastic, exactly what I was looking for

#

I'm a low level guy still thinking in pointers so stuff like this is really frustrating to me.

fossil ice
#

Excellent, that worked great!

#

Now.. can I go back the other way?

#

Can I take the dict, modify it, and then pack it back into a struct?

#

The brute force way is, of course, simple enough. But there has always been a smarter, better way to do it.

haughty mountain
old heath
#

I have a question: I have 3 years of experience in programming and yet i still forget the difference between all sorting algorithms... Why is that

narrow mica
#

Though...

difference between all sorting algorithms
What does that mean exactly? You forget the implementation? Time comlexity?

old heath
#

No i know about time complexity, i mean about how to execute them

#

such that for instance Quick Sort is so different from Merge Sort, Bubble Sort is differnt from selection sort etc..

#

I know that sorting low input wont make difference and ebtter use for instance bubble sort and when having large inputs better use Quick Sort or Merge Sort etc..

narrow mica
old heath
#

yea that is what i actually always forget about

narrow mica
#

E.g. quicksort

- pick a partition value
- split list into two halves
- sort the two halves
```merge sort
  • sort the left and right halves
  • merge these 2 in O(n)
old heath
#

Okay like right it down some where

#

so when i implement sorting algorithm i can just look up the concept then implement

narrow mica
#

So one way to implement "quicksort" (not really) is

def qsort(a):
    if len(a) == 1:
        return a
    part = a[-1]
    l = [x for x in a if x <= part]
    r = [x for x in a if x > part]
    qsort(l)
    qsort(r)
    return l + r
old heath
#

pivote

#

ah

#

okay

narrow mica
old heath
#

yes yes okay

glossy lintel
#

maybe

#

im theorising

#

idk the use either

#

thats why i asked this question

glossy lintel
#

mhm

#

should i draw it?

glossy lintel
#

here

#

for addition

#

quite a sh*tty one

#

(hold up)

#

the list with the 6 numbers is a matrices(ignore it, its just there as a placeholder)

#

and object is just a random object

#

that doesn't support addition

#

red notes the change

#

for the first case, it picked 34. So 34 + 3 = 37

#

but second case, it adds obj to the chain since obj cannot addup with string, list, matrices neither int

haughty mountain
#

just a bit less efficient one

#

(and with O(nΒ²) worst case on a very common case)

narrow mica
agile sundial
#

sorta depends how you define "quicksort"

haughty mountain
#

idk if O(1) is even possible

#

but sure O(n) is the naive approach

agile sundial
#

O(1) auxiliary memory, probably

haughty mountain
#

yeah, I don't know if that's doable

agile sundial
#

you mean with the stack?

haughty mountain
#

the recursion depth alone is at least log n

#

yeah

#

I would still call the O(n) space one quick sort

narrow mica
#

Just pretty inefficient I guess
And the pivot selection is bad but that's another point

haughty mountain
#

yeah, beginning/end is probably the worst choice

frigid token
#

Hi! I don't get what datatype I can use to make this solution pass time req. The ones I know req. That I convert the arrays to some standardized form that's easier to compare with each other and I don't get how to do that. https://dmoj.ca/problem/cco07p2

keen hearth
# frigid token Hi! I don't get what datatype I can use to make this solution pass time req. The...

Hey πŸ‘‹ A suitable normal form might be something like rotating (and potentially reversing) the snowflake so that it's lexicographically as small as it can be. I.e. there are 12 ways to represent the same snowflake (given there are 6 possible rotations and the arms can be listed clockwise or anticlockwise), so you could pick the smallest of these as the canonical representation (hint: ||list the 12 representations and select the smallest with min||).

#

Combine that with a ||hash table|| and it should be good enough.

frigid token
#

Hi Alex! Thank you! I will read up and attempt implementing a solution this way.

fiery cosmos
#

nice to see @haughty mountain and @agile sundial are still holding things down

#

i used to come round these parts

fossil ice
#

Ahh, struct is gonna drive me crazy today.

#

My variable is 8 bytes.

#

struct.unpack('4H',val) works

#

struct.unpack("b3Hb",val) tells me it's expecting a buffer of 9 bytes?

#

For some reason, the first b is being interpreted as being 2 bytes, not 1

#

b3H is valid.

#

3Hbb is valid

#

Something is wrong?

mint jewel
# fossil ice Something is wrong?

if you take a look at this table, under the default native byte order @, you get alignment. That is, a short must be on an even byte. To avoid this, use = or one of the other formats.

#

bH is 4 bytes. 1 for the byte, 1 as padding, 2 for the short.

fossil ice
#

padding

#

I figured I'd have to add padding manually. Interesting. Thank you.

#

Any trick to unpacking 3 bytes as a 4 byte integer?

fossil ice
#

If I only have one 3byte variable I can just append \x00 just fine, but if I have, say, 4 3byte integers in one pack I don't see a way to do it without some very awkward slicing

quasi ember
#

Hey i have multiple monitor and i try to take a screen shot with pyauto gui or many otherlibrary butthe thing is , i am able to take a screenshot of the main monitor but not able to take the content of other monitor . It always show a black screen shot . Does someone can help me please .

old heath
#
def recursiveSum(number: int) -> int:
    # Memorize value in order to avoid unessary recursive call
    cache: set[int] = set()
    # Additional Note during memorization or memoization it will reduce calls for reduntant process.
    # Using Set instead of list
        # Because set has time complexty of O(1) when checking value
        # While lists has time complexity of O(n)
    # Just return sum of 0
    if number <= 0:
        return 0
    # Create a helper function for N Sum
    def helper(number: int):

        if number <= 0:
            return 0
        # Declare variables
        first: int = 0
        second: int = 0
        third: int = 0
        # Check  if result already calculated before
        if number - 1 not in cache:
            first = helper(number - 1)
            cache.add(number - 1)
        # Check  if result already calculated before
        if number - 2 not in cache:
            second = helper(number - 2)
            cache.add(number - 2)
        # Check  if result already calculated before
        if number - 3 not in cache:
            third = helper(number - 3)
            cache.add(number - 3)
        # Return the sum
        return number + first + second + third
    # Call helper function Time complexity should sit in between O(n) & O(n^2)
    return helper(number)

what is the time complexity of that code?

narrow mica
old heath
#

Thank you just trying to grab the concept

#

for loading speeds etc..

narrow mica
#

The idea is if you memoized correctly then you only need O(N * [cost of calculating answer once]), the [ ... ] part in this case is just a constant cause it's always only 3 numbers back

old heath
#

ah

#

I'm trying to improve my self in dynamic programming and Algorithms overall since i have a fly's memory in real life.. πŸ˜„ have not gotten job until now just freelance 😦

haughty mountain
old heath
#

Done that pithink

haughty mountain
#

and yeah, I'm guessing something should return 1?

old heath
#

trying to make my self strong at concepts

#

return 1 why?

haughty mountain
#

all you're doing currently is adding a bunch of zeroes

#

err, how is your caching even supposed to work?

#

you're not caching the values of the function

old heath
#

i works

#

you mean i should cache values before calling recursive function?

#

but it still works

haughty mountain
#

oh wait, number + first + second + third

#

this will just return number

#

no?

#

at least if values are already computed

#

if cached, 0 is used

old heath
#

yes

haughty mountain
#

why?

#

caching is about remembering the values pithink

#

like, e.g.

def recursiveSum(number: int) -> int:
    cache: dict[int, int] = {}

    def helper(number: int):
        if number <= 0:
            return 0
        if number not in cache:
            cache[number] = number + helper(n - 1) + helper(n - 2) + helper(n - 3)
        return cache[number]

    return helper(number)
#

or you're computing something quite weird pithink

#

is your code just computing the overall sum with extra steps?

#

!e

def recursiveSum(number: int) -> int:
    # Memorize value in order to avoid unessary recursive call
    cache: set[int] = set()
    # Additional Note during memorization or memoization it will reduce calls for reduntant process.
    # Using Set instead of list
        # Because set has time complexty of O(1) when checking value
        # While lists has time complexity of O(n)
    # Just return sum of 0
    if number <= 0:
        return 0
    # Create a helper function for N Sum
    def helper(number: int):

        if number <= 0:
            return 0
        # Declare variables
        first: int = 0
        second: int = 0
        third: int = 0
        # Check  if result already calculated before
        if number - 1 not in cache:
            first = helper(number - 1)
            cache.add(number - 1)
        # Check  if result already calculated before
        if number - 2 not in cache:
            second = helper(number - 2)
            cache.add(number - 2)
        # Check  if result already calculated before
        if number - 3 not in cache:
            third = helper(number - 3)
            cache.add(number - 3)
        # Return the sum
        return number + first + second + third
    # Call helper function Time complexity should sit in between O(n) & O(n^2)
    return helper(number)

print(recursiveSum(100))
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.

5050
haughty mountain
#

it is

old heath
#

nice

#

your code made my pc snoooz -_-

haughty mountain
#

well, my code actually does a bunch of work since I'm computing something different πŸ₯΄

#

something related to tribonacci, whose value grows exponentially

#

!e

def recursiveSum(number: int) -> int:
    cache: dict[int, int] = {}

    def helper(number: int):
        if number <= 0:
            return 0
        if number not in cache:
            cache[number] = number + helper(number - 1) + helper(number - 2) + helper(number - 3)
        return cache[number]

    return helper(number)

for n in [5, 10, 15, 20]:
    print(n, recursiveSum(n))
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | 5 30
002 | 10 709
003 | 15 15052
004 | 20 317019
haughty mountain
#

ok, I did accidentally use n rather than number in helper

#

still

#

!e

def recursiveSum(number: int) -> int:
    cache: dict[int, int] = {}

    def helper(number: int):
        if number <= 0:
            return 0
        if number not in cache:
            cache[number] = number + helper(number - 1) + helper(number - 2) + helper(number - 3)
        return cache[number]

    return helper(number)

for n in [20, 40, 80, 160]:
    print(n, recursiveSum(n))
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | 20 317019
002 | 40 62242913830
003 | 80 2399206641506934910812
004 | 160 3564701211352961547755527111227541932616440
old heath
#

going to practice more Algorithms and DataStructures since i have weakness to it

glossy lintel
#

something is wrong with my eager load and delete function and idk what is

#

Eager load

    def eager_load(self, limit: int, exhuastion: bool = True) -> None:
        isLimitInt = isinstance(limit, int)
        condChain = {
            not isLimitInt: TypeError("Limit argument has to be int type"),
            not isinstance(exhuastion, bool): TypeError("Exhuastion argument has to be bool type"),
            not self.__sequence: RuntimeError("Eager loading can only be done on lazy loaded linked lists"),
            isLimitInt and limit < 1: ValueError("Limit argument has to be greater than or equal to 1")
        }
        if condChain.get(True): raise condChain[True]
        prevEnd = None
        self.__nodes_length += limit + 1
        for i in range(limit):
            try: element = next(self.__sequence)
            except StopIteration: 
                self.__nodes_length -= i
                self.__sequence = None
                break
            self.__ending_node.point = SinglyNode(element)
            prevEnd = self.__ending_node
            self.__ending_node = self.__ending_node.point
            if prevEnd is self.starting_node: continue
            self.__pre_end_node = self.__pre_end_node.point
        if exhuastion: self.__sequence = None
        self.__iter_length = self.__nodes_length
#

Delete

    def delete(self, position: int = -1) -> None:
        isPositionCorrect = isinstance(position, int)
        selectedPos = position if isPositionCorrect else 0
        outOfBounds = selectedPos >= self.__nodes_length or selectedPos <= -self.__nodes_length
        condChain = {
            not isinstance(position, int): TypeError("Position Argument Has To Be Int Type"),
            (not self.__sequence) and isPositionCorrect and outOfBounds: IndexError("Position argument out of bounds")
        }
        if condChain.get(True): raise condChain[True]
        position = position if self.__sequence else position % self.__nodes_length
        self.__nodes_length -= 1

        
        if self.__sequence and position >= self.__nodes_length + 1:
            self.eager_load(position - 1, False)
            if not self.__sequence:
                self.__pre_end_node.point = None
                self.__ending_node = self.__pre_end_node
                return
            try: next(self.__sequence)
            except StopIteration: self.__sequence = None
            self.eager_load(2, False)
            return
#

doing this

#

for delete(4)

#

results in

#
1 -> E -> [1] -> (3,) -> Sentinel  |  6
#

which length of 6 is not correct

#

this is the generator btw

#
def simpleGeneratorFunc(): 
    yield 1 # 0
    yield "E" # 1
    yield [1,] # 2
    yield (3,) # 3
    yield {4,} # 4
    yield "Sentinel" # 5
#

only i and god knew how this junk worked

#

now only god knows

#

(this is very bad code im aware)

glossy lintel
#

the thing thats very annoying is that there are way too many conditions

glossy lintel
#

pithink tbh idk if i should make it so that the user must insert a eager_load before executing methods that require loading next elements or have the methods themselves eager load the parts they need

stable pecan
#

this is really wierd

        condChain = {
            not isLimitInt: TypeError("Limit argument has to be int type"),
            not isinstance(exhuastion, bool): TypeError("Exhuastion argument has to be bool type"),
            not self.__sequence: RuntimeError("Eager loading can only be done on lazy loaded linked lists"),
            isLimitInt and limit < 1: ValueError("Limit argument has to be greater than or equal to 1")
        }
        if condChain.get(True): raise condChain[True]
haughty mountain
#

probably irrelevant, but one issue:
-n is a valid index in a sequence of length n

glossy lintel
#

got confused

glossy lintel
# stable pecan this is really wierd ```py condChain = { not isLimitInt: Typ...

i tried to go with a dict, since personally im not a fan of putting everything in like

if not isLimitInt:
  raise TypeError("Limit argument has to be int type")
elif not isinstance(exhuastion, bool):
  raise TypeError("Exhuastion argument has to be bool type")
elif not self.__sequence: 
  raise RuntimeError("Eager loading can only be done on lazy loaded linked lists")
elif isLimitInt and limit < 1:
  raise ValueError("Limit argument has to be greater than or equal to 1")
#

its just makes things a bit clumsy

glossy lintel
#

its not a important part tbh

#

now that i think of it

#

the best thing to do is maybe give the user the control, instead of managing it myself the lazy loading for each method i could tell the user to eagerly load it manually and then do the operation

haughty mountain
#

I just used n as a placeholder because typing the whole thing was too much πŸ˜…

glossy lintel
#

oh

glossy lintel
quasi ember
#

hi does somebody know how to take a screen shot of 2 monitor with python . this is the code but never take the second monitor
import pyautogui

left = -2000
top = -2000
width = 14096
height = 12160

# Take a screenshot of the specified region
screenshot = pyautogui.screenshot(region=(left, top, width, height))

# Save the screenshot
screenshot.save(r"C:\Users\Danny\Desktop\imagepush\screenshot3.png")```
glossy lintel
#

nvm found the solution

#
        if self.__sequence and position >= self.__nodes_length + 1:
            diff = position - self.__nodes_length - 1
            if diff == 0: 
                try: next(self.__sequence)
                except StopIteration: pass
                self.eager_load(1, False)
            else:
                self.eager_load(diff, False)
                try: next(self.__sequence)
                except StopIteration: pass
            self.__nodes_length += 1
            if self.__sequence: self.eager_load(1, False)
            return
#

this took a while

#

(only for delete method ._.)

#

got a long way to go

haughty mountain
glossy lintel
#

tbh if possible

glossy lintel
#

for modulo part, i basically do it so i can get the positive index

haughty mountain
#

this is not about any modulo

    def delete(self, position: int = -1) -> None:
        ...
        outOfBounds = selectedPos >= self.__nodes_length or selectedPos <= -self.__nodes_length
glossy lintel
#

mhm

haughty mountain
#

this check is incorrect compared how python does indexing

glossy lintel
haughty mountain
#

-len is a valid index

#

this check says it isn't

glossy lintel
#

oh so then

outOfBounds = selectedPos >= self.__nodes_length or selectedPos < -self.__nodes_length
haughty mountain
#

yes

glossy lintel
#

pretty minor change

#

but ok

glossy lintel
haughty mountain
#

just use a generator function?

glossy lintel
#

thats what im doing

#

im using a generator function(or iterable) and basically add support to the methods like delete, traverse, search.... etc

#

the problem(quite intentional) is the length being dynamic

haughty mountain
#

I would just give the user a generator

#

and they can use whatever existing stuff in itertools

#

I guess the real answer to your question is "I wouldn't do this"

fiery cosmos
#

What is perimeter of island?

Why is output for that island 14?

keen hearth
#

There are 14 of them

fiery cosmos
#

Ah, okay

#

Ty

keen hearth
#

Essentially, it's 4 times the number of 1s, minus 2 for any pair of 1s that are adjacent.

haughty mountain
# fiery cosmos

more interesting variant:
I also give you a point somewhere on the island, same task: find the perimeter of the island

#

what's the best time complexity solution you can come up with?

haughty mountain
#

and a different variant still (a harder one), allow lakes on the island, given a point on the island, count the number of lakes

glossy lintel
#

would u make it so methods like delete, search... etc would load the parts they need

for example if the iterable is length 10 and i have loaded 2 nodes, and i wanna delete the 5th node, then i load til the 6th node

haughty mountain
#

I think it's exceedingly rare to have a use case where you do lazy generation that can't be easily expressed in terms of just manipulating iterators

glossy lintel
#

i think i won't make it so that delete uses the part it needs to delete the node

#

i think i should give the user more control and not abstract it away

glossy lintel
fiery cosmos
upbeat wedge
#
s_class=pd.Series(['XIIA','XIIB','XIIC','XIID'])
s_marks=pd.Series([60,70,80,90])

d1=pd.DataFrame({'name':s_name,'class':s_class,'marks':s_marks}
                ,index=['A','B','C','D'])```
#

hey i keep getting nan as values

#

can someone help

modern gulch
#

try d1=pd.DataFrame({'name':s_name,'class':s_class,'marks':s_marks})

#

Or, if you want that index: d1=pd.DataFrame({'name':s_name,'class':s_class,'marks':s_marks,'myindex':['A','B','C','D']}).set_index("myindex")

#

The problem is that each of the series have an implicit index, so the index doesn't align with it

upbeat wedge
#

oh yeah that helped thank you

modern gulch
#

You could've also done:

#

d1=pd.DataFrame({'name':[yourlist],'class':[yourlist],'marks':[yourlist]}
,index=['A','B','C','D'])

#

And skipped the pd.series.

upbeat wedge
modern gulch
idle zephyr
#

whats wrong here

#
a = [print(arr[x]) for x in range(len(arr)) if x == 0 or x%2==0]```
#

ok endline spacing

#

what a stupid website

glossy lintel
#

now it supports changing the iterable / generator to a other one

#

you can also specify a cut index. Which tells the code that when changing to a new iterable, to cut a specified amount of elements(as index point)

#

for example if i have range(1, 10) and i specify cut index to 4 it will be
Before

1 2 3 4 5 6 7 8 9
#

After

3 4 5 6 7 8 9
#

it raises a value error when the index is out of bounds

#

and also returns the previous sequence(as iterable form)

haughty mountain
#

!e

import itertools

it = iter(range(1, 10))
print(list(itertools.islice(it, 2, None)))
halcyon plankBOT
#

@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.

[3, 4, 5, 6, 7, 8, 9]
glossy lintel
#

btw how fast is itertools.islice(it, 2, None)?

haughty mountain
agile sundial
#

does it drop the first elements right away?

glossy lintel
#

when switching to a new sequence

haughty mountain
#

it's when you get the next element

glossy juniper
#

anyone interested in practicing leetcode questions with me according to EST ?

true elbow
#

Got this for HW and i have no clue how to do it pithink

In Exercise 1, we will group the dataframe by birdname and then find the average speed_2d for each bird. pandas makes it easy to perform basic operations on groups within a dataframe without needing to loop through each value in the dataframe.

Instructions
Fill in the code to find the mean altitudes of each bird using the pre-loaded birddata dataframe.

Here is the code:

      # First, use `groupby()` to group the data by "bird_name".
grouped_birds =

# Now calculate the mean of `speed_2d` using the `mean()` function.
mean_speeds = 

# Find the mean `altitude` for each bird.
mean_altitudes = 
    
What is the mean speed for Sanne?
true elbow
#

tysm

#

sry

haughty mountain
#

also, is this exercise a Monty Python reference?

true elbow
#

idk im doing a harvard course and i genuinly got no clue even after looking through the videos given

haughty mountain
#

"what's the average airspeed velocity of an unladen swallow"

true elbow
#

i dont know

#

sry

haughty mountain
true elbow
#

yeah

#

but like when i try its not working

narrow mica
#

I mean it looks simple enough

# First, use `groupby()` to group the data by "bird_name".
grouped_birds = df.groupby(['bird_name'])

# Now calculate the mean of `speed_2d` using the `mean()` function.
mean_speeds = grouped_birds['speed_2d'].mean()

# Find the mean `altitude` for each bird.
mean_altitudes = grouped_birds['altitude'].mean()
    
What is the mean speed for Sanne?
true elbow
narrow mica
true elbow
#

πŸ™‚

haughty mountain
haughty mountain
#

that way people can see if you have the right idea

#

and also "it's not working" isn't saying much to us

#

"I expected X and got Y, why?" would be more helpful

true elbow
haughty mountain
#

no worries, just saying some things that can make things easier when asking for help in general, worthwhile stuff to pick up

#

providing context and whatnot

true elbow
narrow mica
#

Like how you usually use np for numpy and pd for pandas

true elbow
#
Traceback (most recent call last):
  File "C:\Users\idkla\Documents\Python Resources\Harvard\Practicals\YAY.py", line 9, in <module>
    grouped_birds = df.groupby(['sanne'])
NameError: name 'df' is not defined

this is the error code i get when i run that code

haughty mountain
#

df was just used as a placeholder for the dataframe the task is talking about

true elbow
#

ah sry thx

#

i got confused sry

#

thank you so much both of yall, code finnaly runs!

glossy lintel
#

what optimisations i could add to the linked list?

#

i added batching and lazy loading

#

but what other optimisations should i consider implementing?

haughty mountain
#

you know my opinions πŸ₯΄

outer rain
#

actually, what even is "batching" and "lazy loading"?

glossy lintel
glossy lintel
glossy lintel
#

batching deals with optimising the time it takes for non-sequential ordered operations(operations where doing something to a node doesn't influnce the next node). Examples are

Sequential Ordered Operations

> Create Linked List
> Access n-node
> Check if n-node is bigger than 10, if not then stop and print "FAILED 0"
> Access i-node
> Check if i-node is bigger than 20, if not then stop and print "FAILED 1"
> print "SUCCESS"
...

Non-Sequential Ordered Operations

> Create Linked List
> Access n-node
> Access i-node
> Access k-node
> Check if i-node equals n-node equals k-node, if not then stop and print "FAILED"
> print "SUCCESS"
...
#

whereas, lazy loading optimises the memory & initialisation time. Suppose a big linked list with about 5 million elements, loading every node will cost u a ton of time & memory. So lazy loading only loads the parts it truly needs. The only way to load more parts is to use eager_load, any linked list can be transformed into a lazy loaded one

haughty mountain
#

for the lazy loading I'm still confused what this does that can't just be done with iterators and itertools

haughty mountain
#

the benefits over just using iterators

glossy lintel
#

you don't load the entire thing

haughty mountain
#

neither does iterators

glossy lintel
#

you can also switch the iterators to use at any time

haughty mountain
#

that you could easily do with iterators as well

glossy lintel
#

and idk what else

glossy lintel
haughty mountain
#

none of those will be very applicable in python

glossy lintel
#

mhm

glossy lintel
outer rain
# glossy lintel can u describe them in more detail?

Yeah, none of these are applicable

Arena allocator is a type of allocator which works on one memory buffer of either fixed size, or of dynamically sized chunks, where to allocate an object instead of calling the system allocator every time you simply use the memory from the buffer instead. This also improves locality of your data and cache utilization.

Dense pointers is the idea of storing pointers in smaller objects. Usually pointers are 64-bit integers, and only 48 of those are utilized. But your nodes are likely to be aligned by 16 byte boundary, meaning only 44 bites are actually important. More over, they are most likely to be all allocated in one memory region, and it is probably ran on a desktop machine, in a userspace app which does not require more than 16GBs of ram, meaning only 32 bits truly matter. Then you only store these 32 bits, making double-connected linked list node 8 bytes smaller, improving cache usage, the data layout, and minimizing memory fetches.

Manual prefetching refers to CPU caches and it's a way of fetching memory before CPU does it by itself speculatively. It is a good optimization if you have assumptions about the inter-node data layout (i.e. know your allocator better than the CPU)

glossy lintel
outer rain
#

The first one requires fine memory management and unsafe conversions. Dense pointers require casting things into pointers and dereferencing them. Prefetching requires special CPU instructions. There is no way of doing anything even remotely close to than in cython. The abstractions are too high up.

glossy lintel
#

mhm

#

so what do i do?

outer rain
#

you don't

#

just don't do linked lists

#

it's that simple

#

it's the worst data structure imaginable, invented by a software engineers to suffer people in algorithmic interviews

haughty mountain
#

There is one interesting optimization that can be made like not allocating for each node, but instead working with bigger chunks that hold several nodes. So you have a linked list of chunks

#

deque in python does this

haughty mountain
#

but even then, doing this on the python layer feels questionable

#

so much overhead for everything

glossy lintel
outer rain
#

there are very rare cases where you actually need to use linked list, but these use cases are so intricate, you basically won't have alternatives anyway

haughty mountain
#

stacks and queues are actually useful

glossy lintel
#

stacks need arrays

outer rain
#

these are basically lists anyways, so whatever

glossy lintel
outer rain
#

(vectors I mean)

#

(not linked lists)

glossy lintel
#

speaking of stacks and queues, any way i can implement without having to use arrays and all of that while still being optimised

haughty mountain
#

stacks can be implemented in several ways, but yeah a vector support the stack operations

#

why?

glossy lintel
haughty mountain
#

I mean yes there are a zillion ways to implement them

#

some worse than others

glossy lintel
haughty mountain
#

but some things are pretty actively bad

#

there are some actual use cases for linked lists, but they are very rare

#

most of the time you have vastly superior options

#

fwiw, stacks and queues to me are more interfaces

glossy lintel
#

mhm

haughty mountain
#

they can be based on a lot of things internally

glossy lintel
#

i got ur point

#

but

haughty mountain
#

in fact in C++ you can slot in the type you want to use for storage

glossy lintel
#

im not trying to be the guy who uses the linked lists

haughty mountain
#

stack is just a generic thing there

glossy lintel
#

im the guy who provides them to you, already implemented

outer rain
#

Honestly, I see no reason for me to use preexisting linked list implementation ever, except for if I need it in Rust for some reason

#

because if I need performance, I would either not use linked list, or write my own for finer tuning

haughty mountain
outer rain
#

and if I don't need performance, I would still just writing it myself, maybe very badly, but whatever

glossy lintel
#

ok

outer rain
#

because it's not like linked lists are used as a plain data structures

#

they are usually either intrusive, or an essential part of something else

#

(i.e. a list of memory chunks for something)

outer rain
#

that said, linked lists are nice in functional programming, and I do use them all the time... but they are also so freaking essential, they are built-in anyway

#

and by "built-in" I don't mean in a standard library, I mean they are actually built-in and always available even without imports

glossy lintel
#

mhm

outer rain
#

(and very easy to reimplement anyway)

#

linked lists in python are just... weird..

#

And I am not trying to deminish your work in any way. If you want to write a fastest pure python linked list -- go ahard

haughty mountain
#

linked lists are nice in functional programming
until you start asking yourself, why are my strings so freakin slow?

glossy lintel
#

python is weird anyway

outer rain
#

it's just that it is very unlikely this is going to be used at all

#

linked lists just suck, by design

outer rain
outer rain
#

it's a bit better in other languages

glossy lintel
#

ok

haughty mountain
glossy lintel
#

got it

outer rain
#

but then you also need to consider the domain

#

like if your linked list is targeted for concurrency, then you need to think about atomicity

#

actually, nevermind

#

I admit, I would use a pre-made linked list for that one use case

#

because I don't want to be the one responsible for messing up the ordering of atomics

haughty mountain
#

proper concurrency is either slow or hard

#

(or both)

outer rain
#

it is :sad:

#

how is :sad: not a thing?...

#

anyways

#

I actually can't wait for the parallel algorithms chapter on algorithmica, I am very hyped for that one

#

because concurrency is probably the hottest topic nowadays in the field of algorithms

solar plume
#

where can I ask about Syntaxerrors?

keen hearth
mystic grail
#

is there any good playlist on utube for dsa in python?

keen hearth
mystic grail
keen hearth
#

People sometimes recommend against it here, but I think GeeksForGeeks has some reasonably good algorithms and data-structures content for beginners. It has code examples in different languages which is nice. Although apparently the Python examples look as if they were written by a Java programmer, or so I've been told πŸ˜„

#

Also, if you need to understand a new algorithm, reading the Wikipedia article isn't a bad way to start!

#

Sorry for all the edits; my brain can't language today.

mystic grail
#

no problem

#

tysm for the suggestions

keen hearth
#

No probs!