#algos-and-data-structs
1 messages · Page 25 of 1
base 2: 1 bit
base 4: 2 bits
base 8: 3 bits
base 16: 4 bits
etc
so each digit in that
needs that many bits and must be padded with zeros
Let's convert an example octal number
base 8
167
as you know
1 = 1 binary
6 = 110 in binary
7 = 111 in binary
so we combine those
to form the number in binary
1 + 110 + 111 = 1110111
which is correct
however
this misses
a step that is important, but might not be necessary for all numbers
Let's convert 111 instead
1 = 1
1 = 1
1 = 1
but 111 is not the correct answer
that's 7!
we need to pad it to be 3 bits each
oh damn I get that now
because 2 ^ 3 = 8 which is octal
so the number must be
001001001
which if we convert it
equals 111
- in octal
what if the number were to be like 16+
like base 32?
but it is in base 8
so assuming it can only fit into xxxx
casue together it only can equal 15
if you were to do 1111
i meant the number 16
or we can do like 18 in base 8
gotcha so that would be like invalid ?
recall that hex's range is 0123456789ABCDEF
so doing G
would be 17 in decimal
which is outside the limit
which rolls over to
10 in hex
just like 9 + 1 = 10, F + 1 = 10
intresting
theirs also this question on my homework which the professor went over in class but idk if theese rules apply
Further exploration:
Suppose a computer uses one byte (8 bits) to represent each (unsigned) integer.
What happens if you add 91 to 167?
so would the 8 bits be like in the same xxxx
and if so does that make it invalid ?
So what he's trying to show
is an integer overflow I think
Yeah
So remember
that the max number
that can be stored
in n bytes
is also 2 ^ n
just like that base
and those two are actually closely correlated but it's not important for now
so in decimal 91 + 167 = 258
BUT
the max number
is 2 ^ 8 = 256
so
258 > 256 and it cannot store that number
what ends up happening typically is a wrap-around
so that would equal 2 in the end
in decimal
11111111 in binary equals 256 decimal
and if you only have 8 bits
you cannot store more than that
what ends up happening
is
=11111111
+00000001
goes through the process:
11111111
1 + 1 = roll over and go up one
11111110
1 + 1 = roll over and go up one
11111100
1 + 1 = roll over and go up one
11111000
1 + 1 = roll over and go up one
11110000
1 + 1 = roll over and go up one
11100000
1 + 1 = roll over and go up one
11000000
1 + 1 = roll over and go up one
10000000
1 + 1 = roll over and go up one
00000000
we have 1 left over but no more bits! So the computer just rolls over to
00000001
next we +1
1 + 1 = roll over and go up one
00000010
done:
=00000010
it rolls over a bunch of times
until it runs out of bits
and then goes back to the first bit
it's very similar to
carrying numbers up while adding in base 10
so that causes a overflow then
Like that left over 10
yeah
when it overflows it just wraps around
gotcha
So it ends up equaling n % 256
(the remainder when dividing n by 256)
!e example ```py
print((91 + 167) % 256)
@floral pumice :white_check_mark: Your 3.11 eval job has completed with return code 0.
2
not really that helps out plenty tho thank you for your time
no problem
I was just super confused XD
yeah it can be a little complicated
Where can i start learning dsa as a beginner
Just pick a language of your choice and you are good to go
People usually choose C++/Java/Python
The language doesn't matter at all but the concepts and logic do
Guys, I have a question about @dataclass. Example:
Here is the type class:
class CarType(Enum):
bike = 0
van = 1
truck = 2
Than I have basic class, where are all the attributes that are common for all cars:
@dataclass
class CarBaseData:
car_type: CarType
consumption: float
color: str
And now, I have car subclasses, with attributes specific for each car:
@dataclass
class BikeCar(CarBaseData):
car_type = CarType.bike
@dataclass
class VanCar(CarBaseData):
car_type = CarType.van
doors: int
seats: int
@dataclass
class TruckCar(CarBaseData):
car_type = CarType.truck
load: float
equipement: list
Now, my question is, how do I make car_type attribute static for each sublass. I want it to be set on class instance creation, that can be done by giving it default value: car_type: CarType = CarType.bike, but how do I make it not changeable? So once it is set (on init), it cannot be changed.
I can make getter / setter methods into each subclass like:
@property
def car_type(self):
return self.car_type
@car_type.setter
def car_type(self, value):
raise Exception('car_type cannot be changed!')
But that is nightmare once the number of subclasses get high and I need to modify these.
I have been trying to put getter / setter into CarBaseData class, but that does not work. It raises exception, CarBaseData does not know about attributes in subclasses (which is logical).
@dataclass
class CarBaseData:
car_type = CarType
I don't get this part, should be car_type: CarType or car_type: ClassVar[CarType]
why would you store the enum itself as an attribute
reading again, I believe it is the second one you're after: car_type: ClassVar[CarType]
because I need it later, when storing data into DB
That's not a reason to store the enum type itself as an attribute
BTW, yes, you are right, it should be car_type: CarType, I have fixed it in original mesage
ah
well, it does not matter much in my opinion, the point is, I want to set it once, one init and than not able to be changed, could be int or str does not change the issue I have
that makes sense
no, car_type: CarType fixes it
my issue was with car_type = CarType, which would store the type itself
oh, okay
making it car_type: Type[CarType] which is just Enum's metaclass
I have been trying it like this:
@dataclass(kw_only=True)
class CarBaseData:
__car_type: CarType
consumption: float
color: CarColor
def __post_init__(self):
"""
car_type is created and SET in subclasses. Here we are checking, if its present and correctly filled
"""
if isinstance(self, BikeCar) and self.car_type == CarType.bike:
return
elif isinstance(self, VanCar) and self.car_type == CarType.van:
return
else:
raise AttributeError(f'car_type does NOT match instance class: {self.__class__.__name__}!')
@property
def car_type(self):
return self.__car_type
@dataclass(kw_only=True)
class BikeCar(CarBaseData):
__car_type: CarType = CarType.bike
@dataclass(kw_only=True)
class VanCar(CarBaseData):
doors: int
seats: int
__car_type: CarType = CarType.van
# Test run it
# car_object = BikeCar(car_type=CarType.van, consumption=1.23, color=CarColor.red) # This will raise exception
car_object = BikeCar(consumption=1.23, color=CarColor.red)
print(car_object)
car_object.color = CarColor.blue # Test that other attribute can be changed
print(car_object)
car_object.car_type = CarType.truck # This should fail!
print(car_object)
But it failed:
/TEST_APP # python3 class_unchangeable_test.py
Traceback (most recent call last):
File "/TEST_APP/class_unchangeable_test.py", line 60, in <module>
car_object = BikeCar(consumption=1.23, color=CarColor.red)
TypeError: BikeCar.__init__() missing 1 required keyword-only argument: '_CarBaseData__car_type'
/TEST_APP #
This version works:
@dataclass(kw_only=True)
class CarBaseData:
_car_type: CarType
consumption: float
color: CarColor
def __post_init__(self):
"""
car_type is created and SET in subclasses. Here we are checking, if its present and correctly filled
"""
if isinstance(self, BikeCar) and self.car_type == CarType.bike:
return
elif isinstance(self, VanCar) and self.car_type == CarType.van:
return
else:
raise AttributeError(f'car_type does NOT match instance class: {self.__class__.__name__}!')
@property
def car_type(self):
return self._car_type
@dataclass(kw_only=True)
class BikeCar(CarBaseData):
_car_type: CarType = CarType.bike
@dataclass(kw_only=True)
class VanCar(CarBaseData):
doors: int
seats: int
_car_type: CarType = CarType.van
# Test run it
# car_object = BikeCar(car_type=CarType.van, consumption=1.23, color=CarColor.red) # This will raise exception
car_object = BikeCar(consumption=1.23, color=CarColor.red)
print(car_object)
car_object.color = CarColor.blue # Test that other attribute can be changed
print(car_object)
car_object.car_type = CarType.truck # This should fail!
print(car_object)
/TEST_APP # python3 class_unchangeable_test.py
BikeCar(_car_type=<CarType.bike: 0>, consumption=1.23, color=<CarColor.red: 'red'>)
BikeCar(_car_type=<CarType.bike: 0>, consumption=1.23, color=<CarColor.blue: 'blue'>)
Traceback (most recent call last):
File "/TEST_APP/class_unchangeable_test.py", line 64, in <module>
car_object.car_type = CarType.truck # This should fail!
AttributeError: can't set attribute 'car_type'
/TEST_APP #
but the attribute there is "only" protected, not private.
a single underscore is how you make private attribute in python
two underscores invokes name mangling, not what you want
Hi, i'm searching information about some search algorithms for 2D arrays (Matrix's)
you can do linear search and binary search
binary search in a matrix is pretty iffy
unless it has a very specific structure
yeah dfs and bfs are better than binary search
when doing a calculation for pagerank how do we deal with special cases such as nodes that have no backlinks, nodes that do not have outward links and nodes that don't have either?
it's not necessarily about averages
you can talk about best case and worst case complexities as well
I'm reading Fluent Python and the author explains why array is more compact than tuples or lists, which makes sense, but the results are confusing when i try to get their memory size
!e
from sys import getsizeof
from array import array
x = [1, 2, 3]
y = array('i', [1, 2, 3])
print(getsizeof(x), getsizeof(y))
@manic kite :white_check_mark: Your 3.11 eval job has completed with return code 0.
88 92
apparently array has a bunch of overhead for small sizes
for some reason, idk why
!e
from sys import getsizeof
from array import array
x = list(range(100))
y = array('i', range(100))
print(getsizeof(x), getsizeof(y))
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
856 488
yeah looking at the source it contains a ton of runtime type info which really shouldn't be needed
if you had C++ templates rather than working in C you could make most of it go away
Why do we need type info? Isn't that the purpose of passing 'i' ?!
getsizeof is not calculating size of elements
!e
from sys import getsizeof
from array import array
x = list(range(100))
y = array('i', range(100))
print('list:', getsizeof(x) + sum(map(getsizeof, x)))
print('array:', getsizeof(y))
@lean walrus :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | list: 3652
002 | array: 488
list of integers uses 7 times more memory than array
I actually tried it with pympler asizeof's (don't know how accurate it is). The results were bigger than those of getsizeof, but still the list size was smaller than the array's, at least for my small example of 1,2,3
oh right, getsizeof isn't recursive
I am so confused
!e
from sys import getsizeof
from array import array
x = [1, 2, 3]
print(getsizeof(x[0]), getsizeof(1))
y = array('i', [1, 2, 3])
print(getsizeof(y[0]))
Unless python is evaluating y[0] and then returning its size
so the ~2x difference is that the list itself is because the list holds 8 byte pointers
and the array holds 4 byte ints
that makes a lot of sense
How does it make sense? Even if we don't count recursively, 3 * 8 should be greater than 3 * 4
@manic kite when you get to larger sizes list is ~2x larger, see 
!e and for small sizes arrays keep a bunch of extra metadata around (for questionable reasons)
from sys import getsizeof
from array import array
x = list(range(0))
y = array('i', range(0))
print(getsizeof(x), getsizeof(y))
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
56 80
And apparently when done recursively it can even be 7× larger
right
the size of the python ints aren't included otherwise
but just the size of the list itself
in the array case you get the actual size of the whole structure
since it stores the values inside it, rather than pointers to values like list does
Does anyone know any algorithm to calculate the inverse function value?
Like I have the function f(x) = k and I want to know the value of its inverse f^-1(x)
very much depends on f
a lot of the time a function f^-1 doesn't exist
if you have the function f symbolically you're better off just doing some math
I can not generalize this?
I think the question is ill-posed unless f has some specific properties. And it makes a huge difference if f is known or if f is a black box
if f is invertible and a black box you could binary search
if f is invertible and not a black box then throwing math at it is a better idea
if f is not invertible you're kinda screwed
getsizeof() calls obj.__sizeof__ which lists implement
if the object doesn't implement obj.__sizeof__, then only the size of the reference is returned
!e
import sys
a = 10**1000000
b = [a]
print(sys.getsizeof(a))
print(sys.getsizeof(b))
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 442948
002 | 64
the point was just that the size of the elements aren't included
that's because list.__sizeof__ is not recursively calling its elements __sizeof__ methods
yes, that was the point being made? 
to me it was left under the impression that recursively calculating the size would be the default, but in fact very few objects are implemented that way
The discussion was:
"And apparently when done
recursively it can even be 7× larger"
"right, the size of the python ints aren't included otherwise" (i.e. if you don't do it recursively)
and then I say stuff about why the array case is different
I guess the formulation might have been a bit awkward
doing it recursively has other implications like circular references
What app do you use to check this?
It's all LeetCode
Ok. I will try it
Thanks
Hard to say without seeing the alternative code. In your post, the fact that it uses the Python data structure of set means that it's using very efficient standard code. Using Set means that it's using a bunch of code "hidden" away in a library. So by using a library data structure, the author can abstract away all the functionality. It's still there though, under the hood.
TLDR; using the Set object compartmentalizes functionality.
Ah, I see, there's two jpgs. Yeah, hashmap can be faster than a set because a set uses a dictionary nlog(n) time, while a hashmap works in linear time.
right here is my alternative soln
oh I dind't take htat into consideration
The hashmap is faster right?
hashmap searching is linera time
so I believe it is faster yea
I believe insertion and deletion is also linear time?
and those are the only operations im using for the hashmap
Yeah, hasmap can be faster, provided that you don't have too much key collision. So ideally, the programmer would tweak the amount of keyspace according to his needs.
but my for loop also dependso n the length of the array, so the time complexity of the whole algo would be O(n) right?
Yes. Insertion and deletion is in linear time, provided no key collisions.
The computer hashes the key, and that hash process is in linear time. Then it just puts that value in there.
yeaa
Hashmap can be wasteful in terms of RAM usage, but that's fine because as my computer science teacher taught me, "Memory is cheap."
oh as in its space complexity is sorta high?
Precisely.
I say "as long as no key collisions" because in the case that it does happen, all the values that share the same key go inside a dictionary for the value.
can someone tell me what is the time complexity of this solution? https://www.youtube.com/watch?v=riBWq1DvVb8
Specifically, writer claims that it is an O(n) solution but if right[ : : -1] itself takes O(n) then I feel like the solution should be O(n^2). Please help to understand. Thanks!
Don't leave your software engineering career path to chance. Make sure you're interview-ready with Exponent's software developer interview prep course: https://bit.ly/3AX0DmM
Want interview coaching for FAANG companies like Google? Browse our mock interview coaches: https://www.tryexponent.com/coaching?role=swe
In this video, a Google Software...
So potentially, you can have like, linear times n*log(n) for the dictionary, so like, theoretically, it'll be back to nlogn.
Oh that's interesting. Yeah, I'm surprised to hear that there's something better than n^2.
Hahaha. I think I figured it out.
You just multiply ALL the numbers one time. Then just divide the product by the ith element.
LINEAR TIME! TADA!
haha. i am sorry. but we cannot do that. because input elements can contain 0 and negative values. so 0 will be trouble for that i guess.
So check for zero. 🤷♂️
because we will not get the answer at the index where 0 is placed
lol. that is another thought. great.
Strictly, if there are two zeros, than all elements would be zero.
So therefore there's three cases, And we can code all three.
so basically, if there are 2 or more 0s then eventually, the whole answer would be [0]*N; where N is the number of elements.
Yes, that's that case.
And for the case of one zero, you ignore it when you're doing your rolling product and use that value for that element that is zero.
and for all other elements, you put zero.
Still linear time.
I guess I didn't get you. "use that value for that element that is zero."? But now I understood the three cases you mentioned. and definitely we get O(n) solution.
Oh got it.
in the case of one zero. we will need to just calculate one place for whole result after getting the rolling product.
I understood. Thank you.
I am just surprised that the interviewer didn't catch this LOL.
Hello everyone
Is there a good video or guide that explains all the different programming paradigms in a simple to understand way?
thank you @dusky trellis , i will check this out
here's an examination from a different perspective (features instead of language taxonomy) => https://ars.els-cdn.com/content/image/1-s2.0-S0167642314000501-gr011.jpg
linear in terms of the key size, but not in terms of the hashmap size
it's linear if there are collisions, but the other way around
python sets use a hashmap
mb mb I meant hashmap searching is constnat time
either the latter, or maybe better actually do a few practice contests which will start out easier and get progressively harder
Hi i got an issue, i wanted to install 'ccxt' package for crypto trading but i made a mistake and i downloaded 'ccTX'... I immediately uninstall this fake package of ccxt but i am scared that this package install dependencies
*i used pip command
why do people not read the channel name and realize the thing they are asking has no relation to this channel? 
reading
y'know, maybe we should put a dummy channel at the top of topical help
so that this channel isn't on top
Rename to data-structure-and-algo so that async will be on top instead?
make the python bot pick a random permutation of the channels everyday
hey, new on python and i wonder if i can do somthing like that..
int *ptr; Person* p = *(Person**)(&ptr);
No, not really.
You can usually reassign __class__, which seems to be close to what you're describing.
Also you can do all kinds of dangerous things with the ctypes module.
i can do some like that with ctypes?
*p = &m
lets say somthing like that more simple
Keep in mind that you almost never will want to do this.
If you need to interface with C code, you should probably use https://cffi.readthedocs.io/en/latest/
And if you need to interface with C++, you should look at https://pybind11.readthedocs.io/en/stable/
ty ill read about that
just to verify, its mean ill write in c++ program with syntax of python?
cause i dont want it.
No, you'll write a C++ library and call it from Python.
You write the C++ as usual. Then you write an additional C++ part that is the Python interface, and you rely on pybind11 to turn that additional part into something you can call from Python.
got you. ok ill read about that ty very much!
I suggest to sacrifice #esoteric-python as being top channel. Pretty much flood channel anyway
and esoteric python people get to mess with people by giving esoteric code as responses to people not reading the names
it's...not a bad idea
Guys i did a good leedcode problem but with O(Log N) time complexity passed all tests but one says Stack Over Flow, is there a technique to counter stack over flow or do i need to understand well how it works first, for instance do i need to make like a recursive function inside another recursive function to counter it some how?
How do I get a stable hash from a frozen dataclass?
Hi, i use NEAT to create AI that can play games i created with pygame, but i am looking to "draw" the neurol network of my genome like on the left of this photo
Do you know if this is draw just for the video or is there any module that can draw that for me ?
I tried to spin my own... any comments already?
def stable_hash(d):
t = bytes(type(d).__name__, 'utf-8')
t += bytes([2])
if isinstance(d, str):
t += bytes(d, 'utf-8')
elif isinstance(d, float):
t += bytes(d.hex(), 'ascii')
elif isinstance(d, list) or isinstance(d, tuple):
t += b''.join(map(stable_hash, d))
elif isinstance(d, set) or isinstance(d, frozenset):
t += b''.join(sorted(map(stable_hash, d)))
elif isinstance(d, dict):
for k, v in d.items():
t += stable_hash(k)
t += bytes([1])
t += stable_hash(v)
elif is_dataclass(d):
for f in fields(d):
t += bytes(f.name, 'utf-8')
t += bytes([1])
t += stable_hash(getattr(d, f.name))
else: t += bytes(d)
return hashlib.md5(t, usedforsecurity=False).digest()
LeetCode gave me these diagnostics
Found the answer in the history, thanks anyway!
Lines of code has nothing to do with time. Time complexity doesn't mean time to run, it means how fast the time to run increases as you increase the size of the input. Sets and dicts are known to have very similar performance since they are both based on an underlying hash table. The memory usage difference is trivial, totally indistinguishable. I believe the second algorithm is faster because it may early return at any time, which means assuming each list has a duplicate somewhere, the runtime should be half on average, since the average index to return at would be half the list size. Since they probably don't all have duplicates, it makes sense that the second algorithm isn't twice as fast. In general, early return that is going to happen at one point going through a list that is randomly distributed will cut your runtime in half.
Think about it like this, in case 1, you're always building a set of all of the elements, in case 2, you're rarely putting ever value into the dict, you're often returning before it is finished.
Oh that makes a lot of sense, thanks!
Updated to be a bit more general.
Are Arrays from Numpy the same as Data Structure Arrays? From the following list.
Lists
Tuples
Dictionaries
Sets
Strings
Arrays
Queues
Stacks
Linked Lists
Trees (Binary Trees, AVL Trees, Binary Search Trees)
Graphs
Hash Tables
Heaps
define "data structure arrays"
this is a weird-ass list tbh. it has "lists" but also "arrays" and "linked lists" - even though list is an abstract data structure that can be implemented as an array, a linked list, or perhaps other ways.
It's from ChatGPT that's why I guess
I don't know? All the lists of dsa have Arrays. So I am asking if they are the same
well, they have about the same semantics as what people usually think of when they say "array", sure
now there's your problem
They are of type "ndarray." They are multidimensional arrays (n-dimensional (nd)) of homogeneous items. Internally they store those items in an array, but they contain some additional information that is used to access those items in a certain way making them a different data structure.
inb4 this is the real dummy channel
you just decided to chat in it!
Morning gentlemen/gentlelady
I'd like your opinion here.
I am trying to figure out what could be the best approach to temporarily store in-memory data for further computation.
In principle, I have the following data:
- set of URLs
- set of metric like visit, exist and so on
What I want is to group these data by URL patterns, providing metrics accordingly,
I can see two options:
- create a dictionary whose key is a "part" of the URL, and the relevant metrics. The URL part will be processed, therefore for a URL like /folder1/folder2/page-name I will be having three entries, all of them with the metrics attached
- create a nested list in a list, with one of these nesting being all the possible permutations a given complete URL could lead me to
Which do you think it could be more efficient on paper?
dict
flat is better than nested
how could this be?
@wraith nymph
report_dict doesn't have a 'data' key
I think we need more information to answer your question. You don't say what kind of further computation you want to do, just that it involves grouping by patterns and providing metrics. But are you concerned with time efficiency or memory efficiency? How do you need to group the URLs (by matching regexes? matching prefixes?), and what kinds of metrics do you need to compute?
@robust dagger metrics will be visit at the time the URL was retrieved, the same for the exit rate.
The computation is likely to be an average across all patterns matching.
E.g.
Assume the following
/Path1/path2/page1
/Path1/path2/page2
/Path1/path2
/Path1/path1/page1
Assuming the flat file representation as we said before, if I'll check for /Path1/path2 metrics I'll average the three URLs.
But if I'll be asking /path1 it will be four of them.
Concerns are in execution time as these computations have to be done real time
do you have any strong reasons to do IDS over just doing a BFS?
ok, then just do the DFS over and over for increasing depths
queues are not what you want for DFS
DFS would use a stack
This Hamming Example doesn't make sense. Because atleast how I think, hamming doesn't require a specific amount of bits right?
0110
0001
0110
0010
0110
0100
If you do the logic, you will see the error must be on index 15, but on the last parity bit you also find out an oddness in the last two lines
My suggestion would be a tree of dataclasses. Each instance would represent the prefix of a URL and would have a dictionary of children representing all ways you can follow the current prefix. For example, the root of the tree would represent the URL /, there would be a child for Path1, that child would have two children path2 and path1, and the path2 child would have children page1 and page2. Each instance would have attributes that record the metrics you need for itself and all descendent URLs. For example, there could be a descendent_visits attribute; the node for / would store the number of visits to all URLs in descendent_visits, the node for Path1 would store the number of visits to URLs under /Path1, and so on. To update, you split the URL on slashes and walk the tree. At each step, you update whatever counters you're tracking. You may need to create new nodes (e.g., if /Path1/path3 has never been seen before, you need a new child of Path1). If you need to update some display in real-time (e.g., a dashboard), then you can add callbacks.
there's some courses in pins
Thanks, sounds like an interesting approach.
Would you nest objects in a parent child relationship to traverse and find the things, or leave them still flat?
hey guys, simple question. When I execute my code the .read_excel() function execute ans i can't do anything after, I'd like to : .head(), remove lines ... but i can't because the program stop directly after. thx
Yes, I would nest everything. For the example you gave, I'm imagining that the tree of dataclass instances would be:
Path1 -+-> path1 ---> page1
|
+-> path2 -+-> page1
|
+-> page2
Here is cool GitHub repo about algo. https://github.com/hemangjoshi37a/hjAlgos_notebooks
Hey guys im new to python and was having trouble with one of my tic tac toe programs, I was wondering if I could get some help, that would be amazing!
I see. So the class will contain an ancestor property containing a pointer to its parent class.
Wouldn't traversing all these objects be slower?
Consider, I'll have users (me), asking for patterns relatively faster
Or when you said "at each step you update the counter" you meant to consolidate the data for each node to avoid reprocessing in realtime?
Yes a dashboard will be there, but here I lost you when you mentioned the callback. What will that be useful for?
I wasn't imagining ancestor pointers. I was only imagining lists of children.
It might be (depending on what you want to do) that ancestor pointers are useful. It depends.
Yes, by "at each step you update the counter," I did mean that you would consolidate the data for each node. So the node for path2, for example, would have the count for itself and all URLs underneath it.
The reason for mentioning the callback is that you might want some way to update the dashboard without traversing the tree again. Whether or not this is a good idea depends on a lot of factors, but if it is, then one way to do it is to update the dashboard as you walk the tree. This could be done using a series of callbacks. Each time you visit a tree node, you can call back into the dashboard and update it.
An alternative to callbacks (which for some uses cases might be better) is that, as you walk the tree, you create a list of what needs to be updated on the dashboard. When you're done walking the tree, you give the list to your dashboard object, and it uses the list to update the dashboard.
@oblique panther about our conversation yesterday, i made this and it works and doesn't feel like cheating. I have to do it for each method but it's much better than actually rewriting it or copying it, which idk how i'd go about doing, because they're in C ```py
import pygame
class Hitbox:
def init(self, mask: pygame.mask.Mask):
self.mask = mask
# store the mask surface, set its colorkey and transparency
self.image = self.mask.to_surface()
self.image.set_alpha(136)
self.image.set_colorkey(0)
# store the center y, useful for Ysort
self.center = Hitbox.get_center(self.mask)
def overlap(self, hitbox, offset):
return self.mask.overlap(hitbox.mask, offset)
@classmethod
def from_file(cls, name):
# load the hitbox image and set the colorkey for the mask
try:
hb_image = pygame.image.load(f"../gamedata/hitbox/{name}").convert_alpha()
except FileNotFoundError:
try:
hb_image = pygame.image.load(f"../poke_assets/sprites/{name}").convert_alpha()
except FileNotFoundError:
hb_image = pygame.image.load(f"../poke_assets/search/{name}").convert_alpha()
hb_image.set_colorkey(-1)
mask = pygame.mask.from_surface(hb_image)
return Hitbox(mask)
@classmethod
def get_center(cls, mask):
start = -1
end = 0
for x in range(mask.get_size()[0]):
for y in range(mask.get_size()[1]):
if mask.get_at((x, y)) == 1:
if start == -1:
start = y
else:
end = y
break
return (start + end) / 2
mesk = pygame.mask.Mask((10, 10), True)
musk = pygame.mask.Mask((5, 5), False)
hbx1 = Hitbox(mesk)
hbx2 = Hitbox(musk)
if hbx1.overlap(hbx2, (0, 0)):
print("OL")
else:
print("NOL")
i get to use the overlap method between hitbox objects as if they were mask objects
and any other method i might need can be added easily
How would I go about converting a tree-dictionary to a list of lists (that each contains the path to its given node)?
i.e. this:
d = {
'a': {
'b': {
'b1': {},
'b2': {}
},
'c': {
'c1': {}
}
}
}
to this:
x = [
['a'],
['a', 'b'],
['a', 'b', 'b1'],
['a', 'b', 'b2'],
['a', 'c'],
['a', 'c', 'c1'],
]
do a dfs traversal, yield paths as you go
!e let's see if I write ok code first second third try
d = {
'a': {
'b': {
'b1': {},
'b2': {}
},
'c': {
'c1': {}
}
}
}
def paths(cur, path=[]):
for child, dct in cur.items():
path.append(child)
yield path[:]
yield from paths(dct, path)
path.pop()
for path in paths(d):
print(path)
@haughty mountain :x: Your 3.11 eval job has completed with return code 1.
001 | ['a']
002 | ['a', 'b']
003 | ['a', 'b', 'b1']
004 | Traceback (most recent call last):
005 | File "<string>", line 20, in <module>
006 | File "<string>", line 17, in paths
007 | File "<string>", line 17, in paths
008 | File "<string>", line 18, in paths
009 | TypeError: 'str' object cannot be interpreted as an integer
let me re-run :/
!e
d = {
'a': {
'b': {
'b1': {},
'b2': {}
},
'c': {
'c1': {}
}
}
}
def paths(cur, path=[]):
for child, dct in cur.items():
path.append(child)
yield path[:]
yield from paths(dct, path)
path.pop()
for path in paths(d):
print(path)
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | ['a']
002 | ['a', 'b']
003 | ['a', 'b', 'b1']
004 | ['a', 'b', 'b2']
005 | ['a', 'c']
006 | ['a', 'c', 'c1']
there we go
this is pretty cool thanks
out of curiosity, why do you use 2 spaces for indentation?
discord does by default lol
does it?
I'm typing code on my phone and can't use tab to indent, so doing 4 spaces manually would be annoying 😛
(and I'm used to 2 spaces for most code at work)
on PC yeah. it's kinda bad
😬
hello guys
@median dawn For the lulz, its turns out to be a generalisation of an NP-Complete problem called the Closest String problem - I think linear is currently state of the art when it comes to this stuff, I got trolled 
Sometimes throwing an "unsolvable" problem at some unsuspecting undergrad produces surprising results - that it's not in fact unsolvable
well, if the problem boils down to whether P=NP then I feel very sorry for undergrads
Some problems are only assumed to be NP-complete, iirc... ;)
Yeah, the log time algorithm speculation comes from a beard scratching from someone else, hardly what I call an authority on the subject
Either way, why it's importance arises elsewhere, I can try and explain in #data-science-and-ml since that's where the story starts.
I would hope people would actually do the reduction before stating that something is NP-complete 
I can picture the idea of the callback, which might make the anchestor redundant, but I can't think to lines of code as of now.
Do you mind sharing a prototype?
Anyone have recommendations on some good resources to learn data structures and algorithms for coding interviews?
So far, I’ve read Cracking the Coding Interview, almost finished Grokking the Coding Interview, and am solving Leetcode problems
so im going thru a list using for item in data and removing all the items that dont meet certain criteria but it only works properly if i run it over multiple times
can someone explain why it doesnt work the first time and how i could do it efficently
can give example if needed
if you remove an item from the list, the elements to the right shift to the left one spot. then when you go to the next iteration, you skip an item
Sure, this was kinda fun.
from __future__ import annotations
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional
@dataclass
class PathTree:
path: str
part: str
count: int
children: dict[str, PathTree]
@classmethod
def create_root(cls) -> PathTree: # Not gonna bother with subclass types
return cls('', '', 0, {})
def _add_to_count(
self,
parts_queue: deque[str],
paths_to_update: list[tuple[str, int]],
pre_callback: Optional[Callable[[PathTree], None]],
post_callback: Optional[Callable[[PathTree], None]],
) -> None:
self.count += 1
paths_to_update.append((self.path, self.count))
if pre_callback:
pre_callback(self)
if parts_queue:
next_part = parts_queue.popleft()
next_node = self.children.get(next_part)
if not next_node:
next_path = f'{self.path}/{next_part}'
next_node = PathTree(next_path, next_part, 0, {})
self.children[next_part] = next_node
next_node._add_to_count(
parts_queue, paths_to_update, pre_callback, post_callback
)
if post_callback:
post_callback(self)
def visit(
self,
path: str,
pre_callback: Optional[Callable[[PathTree], None]] = None,
post_callback: Optional[Callable[[PathTree], None]] = None,
) -> list[tuple[str, int]]:
# Ignore leading slash
if path and path[0] == '/':
path = path[1:]
parts_queue = deque(path.split('/'))
paths_to_update: list[tuple[str, int]] = []
self._add_to_count(
parts_queue, paths_to_update, pre_callback, post_callback
)
return paths_to_update
root = PathTree.create_root()
print(root.visit('/ham'))
print(root.visit('/ham/eggs'))
print(root.visit('/ham/spam'))
print(root.visit('/ham/eggs/spam', pre_callback=print))
print(root.visit('/ham/spam/spam'))
print(root.visit('/ham/spam/spam/spam', post_callback=print))
print(root.visit('/spanish_inquisition'))
stack = [root]
while stack:
node = stack.pop()
print(f"{node.count:2} {node.path}")
stack.extend(node.children.values())
Generally the right idiom is to use filter.
or list comp
Yeah, that too.
It's a builtin function.
alr
can i be sure that there is no hamiltonian path if ore's and dirac's theorem dont return true?
I don't think those are necessary conditions
they are not
there for sure are hamiltonian paths where neither is true
both of the theorems are "in these special conditions such a path exists"
a counterexample for both is a boring line graph
A - B - C
every degree isn't >=n/2
the sum of degrees of non-adjacent vertices isn't >=n (A and C is 2 < 3)
Hey @crude granite!
It looks like you tried to attach file type(s) that we do not allow (.rtf). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.
Feel free to ask in #community-meta if you think this is a mistake.
Hey, could someone help me with Path of rook ? Breadth first search algorithm
On a chessboard of 8 x 8 squares, a rook is positioned on a certain square and needs to move to a certain position without crossing any occupied square. Determine the minimum number of moves necessary for the rook to reach its destination position.
Standard input will contain a description of the chessboard in the form of 8 lines, each containing 8 symbols. Each symbol represents one chessboard square:
'.' - a free square
'x' - an occupied square
'v' - the starting position of the rook
'c' - the ending position of the rook
The program should write a single integer to standard output - the minimum number of moves the rook must make to reach the destination square. If the rook cannot reach its destination, write the number -1.
Sample input:
........
...xx...
..vx....
...xc...
...x....
...xx...
........
....x.x.
Output:
4
Hi, I have this structure
if a > 5:
break
else :
pass```
How can I do it in one line of code ?
That'll error without a surrounding loop, so I assume there's a loop, too. py if a > 5: break is all you need, then. You can put that on one line, although it's still technically two either way
https://www.cs.ubc.ca/~rbridson/docs/bridson-siggraph07-poissondisk.pdf so i have a grid 1000x1000 any ideas how to implement step 2, too much spam in mainchannel
so i have a grid lets say 1000x1000 and random points gets chosen. and from the grid ( at the random point and random pii2 to another point. need to calculate what point is closest to apex of vector.
using with noise and maybe alone to generate some intresting landscapes lol and for fun,
so i have a point in 300,200 in grid it gets modified to 0 and randomly makes vector lets say 62 degrees. lenght is randomly calculated between min and max x and y. whats the closest variable it hits.
give every point a lenght of one ? then somekidn of sopichaceted calculation whats closest one number or something ? im lost here am i even in right track lol
@stable skiff The idea is as follows. Say your grid is 1000⨉1000. First pick a random point and call it (x, y). It does not have to be a grid point, so it could be (158.29, 773.56). Now pick a random point that's near but not too near. This requires picking a random angle and a random radius. You want to do this so that you end up uniform in the annulus of inner radius R and outer radius 2R. When you change from rectangular to polar coordinates, your density element changes from dx dy to r dr d𝜃; so you should pick the angle uniformly between 0 and 2𝜋, but because of the factor of r, you should not pick the radius uniformly. Instead, you should pick a uniform random s between ½R² and 2R², and you should set r = √(2s); this is equivalent to ½r² = s, i.e., r dr = ds, so picking uniformly in s gets you the right distribution for the radius. When you've done this, you get a new point, (xʹ, yʹ). This point lands in one of the grid boxes. You need to test whether it's too close to one of the other points. To do this, you test whether the grid box it lands in contains a point and whether any nearby grid boxes contain a point. If so, then this new point is too close. If not, then you just found a new point.
You should also keep in mind that the paper is slightly wrong. Without a lower bound on the cell size, you may use superlinear work when searching for nearby points. Also, the paper does not make any claim about the dependence of the complexity on the dimension; in fact the complexity is exponential. (Of course, exponential dependence on the dimension is no problem if the dimension is only two!)
yo stupid question, how do i convert a string such as "-14" to an int without actually making a function?
int("-14")
wait what i swear that didnt work like 5 minutes ago, ty
Can someone provide me DS with Python resource ?
Hi. I hope you are doing well. Are python lists and dictionaries abstract data structures?
I'd say they are very concrete data structures, since they are actual implementations.
But python lists are an implementation of the List abstract DS, and dicts are an implementation of the Map abstract DS, if that's what you mean.
Thank you so much
i would like help with this question, it seems to me that its 64n
i think it's n+6
why ?
this is a semi-okay explanation i think:
t/64 = 3 * 2^n
t = 3 * 2^n * 64
t = 3 * 2^n + 2^6
t = 3 * 2^(n+6)
okay
that makes sense
there's a few (5?) more of the same style of problem here: https://opendsa-server.cs.vt.edu/embed/FasterCorASumm
ty
Do you guys know any way of predicting deriv smart trader next values? My hope is to use machine learning to predict new numbers
hey
problem : https://leetcode.com/problems/merge-two-sorted-lists/description/
my solution :
# Definition for singly-linked list.
# class ListNode:
# def __init__(self, val=0, next=None):
# self.val = val
# self.next = next
class Solution:
def mergeTwoLists(self, list1: Optional[ListNode], list2: Optional[ListNode]) -> Optional[ListNode]:
head_1 = list1
head_2 = list2
prev_1 = head_1
if list1.val < list2.val:
x = list1.val
else:
x = list2.val
while head_1 or head_2:
if head_1.val < head_2.val:
prev_li.next = head_2
temp = head_2.next
head_2.next = head_1
prev_1 = head_2
head_2 = temp
else:
prev_1 = head_1
head_1 = head_1.next
return x
my code doesnt even submit i checked the logic its well quite correct
error : AttributeError: 'NoneType' object has no attribute 'val'
if head_1.val < head_2.val:
Line 16 in mergeTwoLists (Solution.py)
ret = Solution().mergeTwoLists(param_1, param_2)
Line 50 in _driver (Solution.py)
_driver()
Line 61 in <module> (Solution.py)
please tell me what is wrong and how it can be fixed
Merge Two Sorted Lists - You are given the heads of two sorted linked lists list1 and list2.
Merge the two lists in a one sorted list. The list should be made by splicing together the nodes of the first two lists.
Return the head of the merged linked list.
Example 1:
[https://assets.leetcode.com/uploads/2020/10/03/merge_ex1.jpg]
Input: ...
You are only setting x at the top and are never returning anything other than x. X can be None which is the problem and this also means that the code is incorrect also
You probably meant to return something other than x
Anyone here proficient in pandas?
#data-science-and-ml. also, it's more useful to ask your question than asking if people know how to answer
Hey, anyone know if I can print multidimensional lists with the * operator? I don't wanna use loops. I could use joins but its an integer array and I dont wanna have that extra processing. Thanks!
This is what I have for now```
print(*[' '.join(map(str, x)) for x in arr], sep='\n')
But if I go beyond 2D lists, I would have to keep adding more joins and maps etc.
Yeah, in general you'd need a recursive function formatting it in whatever format you want.
I dont wanna have that extra processing
whatever you do, it takes processing to turn list of integers into a string :p
print just stringifies the arguments for you, but it's not really much faster than doing it yourself.
Can cast to numpy array and print that
But after 2d it becomes messy (because how do you print a 3d array, or even 4d)
numpy does it just fine
why do you not want to use a loop?
for thing in things:
print(*thing)
```is short and simple
I've learned that there are different runtimes depending on if it's a quadratic or cubic runtime algorithm. That sense but why wouldn't you want to always just always use the fastest algorithm? Time complexity?
Well yeah, you wanted the lowest (quickest) runtime.
But lower time complexity (with the big-O notation) is not always faster in practice. If you have to choose between 10,000,000*N operations or 5*N^2, then for small N the quadratic time complexity one would be quicker.
That sense but why wouldn't you want to always just always use the fastest algorithm?
well yeah, but it's usually harder to come up with better algorithms
there are also hard lower limits depending on the class of problem you're solving, e.g. you can't do standard sorting in less than nlog(n)
to illustrate PcCamel's answer: Fuhrer's multiplication algorithm.
https://en.wikipedia.org/wiki/Multiplication_algorithm#Further_improvements
it has a better asymptotic complexity, but in practice is only faster for unreasonably large numbers
I could but again if the dimensionality increases I would have to keep adding more for loops.
Yeah I tried NumPy and itertools too, but the same question about greater than 2 dimensionality stumped me
how do you want a 3d array to be printed?
Well printing more than 2 dimensions is most of the time not necessary/a good idea anyways
3d can be done by printing every slice/matrix
But after that it's just messy..
i believe numpy just keeps stacking
!e
import numpy as np
print(np.arange(32).reshape(2, 2, 2, 2, 2))
@naive oak :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | [[[[[ 0 1]
002 | [ 2 3]]
003 |
004 | [[ 4 5]
005 | [ 6 7]]]
006 |
007 |
008 | [[[ 8 9]
009 | [10 11]]
010 |
011 | [[12 13]
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/ihuxumiyut.txt?noredirect
yep just keeps stacking and adding more spaces
van der Hoeven's one is also fun
totally unusable
but O(n log n)
I know, right? Ho even thought it was a good idea?!
it's funny, when I saw the article I recognized the guy's name because he wrote software I used
texmacs
(i am genuinely sorry for the F-tier pun)

like... hoeven = who even
in any case it's interesting that O(n log n) is actually possible, it was conjectured for a long time
it's likely it's optimal, but we don't know
it's hard to find a lower bound better than Ω(n)
Hey, I want to store grocery items I bought to get an overview in the future. Just a small project for myself.
Thinking how I should build my data structure, probably a data class? For example
@dataclass
class item
name: str
price: float
store: str
date: str #better option?
category: str
Would something like this be useful to later filter stuff?
Also how would I best store the data? I could write to excel, .txt, .json (?)
I suspect knowing the rough pronunciation blinded me to it 😅
studying data structure, is it safe to assume simple single for loop iteration is always O(n), and nested for loops are always O(n^2)?
assuming both loops have very simple mathematic function like count+=1
no
e.g. this is n log n
for i in range(1, n):
for j in range(0, n, i):
...
thanks
thats a great example
yar
Hello,
short qustion. I have 180. 000 dicts. Would it be better to store them in one dict (dict of dict) or in one list (list of dicts). Dict of dict would be json format or am i wrong? What should i prefer ? Thanks 🙂
depends on your needs
if you just want a sequence of dicts use a list
if you have a mapping from something to the dicts, use a dict
I need to write it to a csv and modify the data from the dicts later.
But yes, I know what you mean
if you have the same keys everywhere then you can use DictWriter/DictReader
!d csv.DictWriter
class csv.DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)```
why isn't it showing any of the summary? 
Propably not, because I have nested dicts and they are different size
What is the time complexity of pow and is that included in the overall time complexity of an algorithm?
idk and yes the time complexity of any algorithm can be thought of as a composition of its parts
basically it depends
pow with two floats will be O(1)
pow with two ints pow(something, n) will be O(log n) if we can assume integer multiplication is O(1)
which isn't generally the case in python because big ints
For large integers all the multiplications except the last one are negligible (in big-O sense) which means n ** k works in something like O((k * log n) ** log2(3)), at least in CPython. log(7) log2(3) is a power from Karatsuba multiplication algorithm.
isn't the power from the Karatsuba algorithm log2(3)?
yes it is... (facepalm)
log2(7) appears in Strassen's matmult algo
so you probably remember it from there
hey guys I was trying to solve this leap year problem and this is my code:
def is_leap(year):
leap = False
if not year % 100 == 0 and year % 4 == 0 and year % 400 == 0:
leap = True
return leap
year = int(input())
print(is_leap(year))
the task or the problem is this:
In the Gregorian calendar, three conditions are used to identify leap years:
The year can be evenly divided by 4, is a leap year, unless:
The year can be evenly divided by 100, it is NOT a leap year, unless:
The year is also evenly divisible by 400. Then it is a leap year.
but I mean.. Is this a leap year or not?
your logic is wrong
you require all criteria at the same time, so e.g. both divisible by 4 and 400
?
Oh so I have to use or instead of and?
I would probably write it as some nested if statements
actually you could avoid nesting
if n%4 != 0:
return False
if n%100 == 0 and n % 400 != 0:
return False
return True
you could squeeze this into one expression, but it wouldn't be that readable
not (n%4 != 0 or (n%100 == 0 and n % 400 != 0))
```or applying De Morgan's laws a bit
```py
n%4 == 0 and (n%100 != 0 or n % 400 == 0)
how about this?
def is_leap(year):
leap = False
if year % 4 == 0 or year % 400 == 0 and year % 100 != 0:
leap = True
return leap
year = int(input())
print(is_leap(year))
it worked(for some) except the year 2100
still wrong
what you wrote is this
year % 4 == 0 or (year % 400 == 0 and year % 100 != 0)
which is true whenever year % 4 == 0
which is wrong
But that means it can divide evenly?
the correct expression if you want it is the bottom one here
1900 is not a leap year since it's divisible by 4 and 100 (but not 400)
def is_leap(year: int) -> bool:
leap = False
if (year % 4 == 0 or year % 400 == 0) and year % 100 != 0:
leap = True
return leap
year = int(input())
that breaks as well
that would say 2000 is not a leap year
as I said, try writing it not as a single expression, it makes things easier to reason about
e.g. if year % 4 != 0 it's for sure not a leap year
and then there is the second case where it's not a leap year
all other cases are leap years
This is getting difficult
def is_leap(year: int) -> bool:
leap = False
if (year % 4 == 0 or year % 400 == 0) and year % 100 != 0:
leap = True
elif year % 4 == 0 and year % 100 == 0:
leap = True
return leap
year = int(input())
the dumbest translation of the requirements in the task is
if n%4 == 0:
if n%100 == 0:
if n%400 == 0:
return True
else:
return False
else:
return True
else:
return False
but it's neater if you instead try to consider when there isn't a leap year
it allows you to exit early
i.e.
if n%4 != 0:
return False
if n%100 == 0 and n%400 != 0:
return False
return True
this?
def is_leap(year: int) -> bool:
leap = True
if year % 4 != 0 and year % 100 == 0:
leap = False
elif year % 400 != 0 and year % 100 == 0:
leap = False
return leap
year = int(input())
print(is_leap(year))
read the code I posted
i have a question about huffman trees
how do i encode(?) the huffman tree into a sort of header to put in the file im compressing (since it'll be used to decompress it)
i looked it up but i could not find single easy to understand solution
A fuzzy match is performed to determine matching street instances, [LFA1_Street] -> [EA_Street and House Number (emp)], with a 90% threshold to produce a table with 2 columns, [Test3_RecordID] with their corresponding matching [Test3_RecordID2] called FM
how can this be done using pandas?
sorry for the long question
Can anyone please help me with this task I have over here. I have been stuck on it for a while now and im not sure what to do.
Hello,
short question about recursion.
I have those two methods for my example. If my check_valid_data method returns main() because the data is invalid, than the body in e.g register where the function was called is still execuded completely. How can i avoid that ?
Is a quick sort system the best way of finding out the top percentile of objects with a speculative/undefined value? I have a set of 30,000 pictures that I want to wittle down to about 1,000. I was thinking of doing something similar to quicksort, where a user chooses between individual artworks from a set on the right and a "sample" or "control" artwork on the left. After that, either the chosen or unchosen pictures repeat the process depending on if the ratio is under or over 97/3. Would that be the most efficient way to do this?
It's not exactly clear to me what you mean by "speculative/undefined value". If the values are known, but you do not know the threshold which cuts 30,000 pictures down to 1,000, then you can use a priority queue (see the heapq module) to find the top 1,000 pictures. If you mean that each picture has a value, but those values are unknown, then you have a statistical problem which has been studied a lot but has no definitive answer. You could try something like a https://en.wikipedia.org/wiki/Bradley–Terry_model or the https://en.wikipedia.org/wiki/Elo_rating_system.
Hi all,
I need your help.
I would like to replace '2019'/'12'/'25' in this function by a column "DATE" of my dataframe :
JoursFeries.is_bank_holiday(datetime.date(2019, 12, 25), zone="Métropole")
I tried :
JoursFeries.is_bank_holiday(datetime.date(pd.to_datetime(data_modif['DATE'])),zone="Métropole")
But I have the following error:
TypeError: cannot convert the series to <class 'int'>
I also tried with my others 3 columns YEAR/MONTH/DAY: same error.
Could you please explain what is the problem ? Many thanks.
Just found the problem. Sorry. 🙂
I mean, it's true. Practice and read up on topics you don't know as you encounter them
yeah, I've been in AC basically from the start of it
I didn't realize purgatory saw the ioi channel 
I should fix that
lol
just get CM and you can see the whole thing 😛
Python includes a standard function that solves this for you. It is defined like this:
def isleap(year):
'Return True for leap years, False for non-leap years.'
return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
Hi
🙏soon
where is that defined
i thought it's chatgpt-brand bullshit, but apparently the snippet's real and from calendar
https://github.com/python/cpython/blob/3.11/Lib/calendar.py#L102-L104
Lib/calendar.py lines 102 to 104
def isleap(year):
"""Return True for leap years, False for non-leap years."""
return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)```
though this function has never had a non-triple-quoted docstring, even 23 years ago when it first got a docstring. so chatgpt involvement is probable.
Hey guys hope you are okay 🙂
I've a question, I've lots of tuples, each tuple has the same length. Now each tuple has a linked value.
So for example my database is like this:
Tuple | Value
(1, 2, 3) | 5
(2, 1, 3) | 2.5
(3, 3, 2) | 9
(1, 1, 10) | 8
and so on | and so on
For me, a "good" tuple is the one that has the highest number for the column value.
So yeah it's easy to find the best tuple, you just sort the values. But my goal is to see which number repeats the most in certain index and links to a high value. So to avoid coding the part that links to a high value, I can just delete negatives for example, but still I would need to find an algorithm (I don't really want to code it because it'll take me a lot of time) that tells me which number is repeated the most in certain index. So for example with the table I gave above, the algo would tell me that in index two, 3 repeats twice. And another thing I'd like to have is something that tells me based on the numbers that are repeated the most (for each index), a range. Not pretty clear, I know, but I'll give an example, in index 2, the range is (3-2). Why not 10? Well 10 is really far away from 2 or 3. Not how the difference between 2 and 3 is 1 and 3 and 10 is 7. Sorry for the poor vocabulary I did my best. I'll reformulate whatever you guys don't understand. Thanks 🙂
You can count repetitions in an index using collections.Counter. For example,
from collections import Counter
from operator import itemgetter
c = Counter(map(itemgetter(2), data))
Then c.most_common will let you find the most common values.
Finding the range is a more subtle statistical question. It sounds like you're asking for some kind of outlier detection. There is no simple answer to this that works in all cases.
It seems to me that what you actually want to do is, like:
- explode that column of tuples into 3 integer columns
- look at the value_counts of each column
- look at the mean and standard deviation of each column - that seems to me to be the closest coherent idea to the "range" thing you're describing. Or, say, 10th and 90th percentiles of each column.
For range-type questions, I generally recommend looking at quantiles. For example, look at the 1% and 99% percentiles of the data.
(You can do this quickly with numpy.percentile.)
The reason for doing this instead of looking at mean and standard deviation is because those are extremely sensitive to outliers.
For example, suppose your data comes from some kind of sensor. Most of the time it gives reasonable results, but sometimes it goes nuts and gives you weird giant values. Statistics like the median and percentiles don't care! As long as you collected enough good data, they're completely unchanged by corrupt data.
Whereas the mean and standard deviation can both be changed arbitrarily far by a single corrupt measurement.
(This can be understood precisely in terms of the influence function of an estimator.)
There are plenty of times where the mean and standard deviation are appropriate. If they work for your data, that's fine. But there are also times when they'll be wildly wrong. The median and interquartile range are good in different situations and for different reasons; but at least they don't have the problem of being corruptible by a single measurement.
let me read this, too much text 😄 thanks btw 💙
fair
data has to be a list? a tuple? what exactly?
thats also a good one I'll look into that, stdev can be interesting. thanks !
that's definitely chatgpt imo. no human would just copy the source instead of the doc link or just the name of the function lol
i was thinking that no human would edit the source in such a way when doing so
that too
can someone help me understand what Big O notation is in layman terms? Thanks!
"how much slower will my code be when my input size gets bigger"
For collections.Counter, it just needs to be an iterable. So it can be lots of things: list, tuple, array.array, NumPy array, custom class with __iter__, etc.
amazing!!! thanks a lot you solved my question!
def is_prime(n):
m = int(n**0.5)
if n % 2 == 0 or n == 1:
return False
if any(n % i == 0 for i in range(3, m+1, 2)):
return False
return True
What would the time complexity of this be?
Its iterating through every odd number up to n**0.5
Would that be O(n**0.5/2) which just simplifies to O(n**0.5)?
whatever the time complexity of n**0.5 is or O(sqrt(n)), probably the latter
Oh right yeah, thanks
It's defined in the standard module "calendar".
Python is open source. Look up the code on GitHub.
!e
from math import factorial
n=10
print(factorial(n)*n**2)
@urban kiln :white_check_mark: Your 3.11 eval job has completed with return code 0.
362880000
that's...what he did
anyone know how to fix this?
What is plot_accuracy?
That's how I know the source never looked exactly like your message, yes.
is it still O(n log(n)) if I sort a sorted array to check if it is sorted
like
if arr == sorted(arr): ...
and the arr is sorted
or should I just iterate through the array and do
arr = [...]
def is_sorted():
for index, i in enumerate(arr[:-1]):
if i > arr[index+1]:
return False
return True
in python it should be linear
ok
but to just check, sorting would be bad right?
if the array isnt sorted
and i want to check if it is sorted
yes that would be not the best
doing this would be better right?
that would be a way. a bit clunky but yeah it works
Can anyone tell me the approach for vertical order traversal in binary tree
Only answering the question that was directly asked of me.
It works but you're not providing an optimal space complexity
Python produces a new, reversed version of the previous list when you do [::-1]
Enumerate seems fine because it produces more of a stream, but I'm not sure
Your solution's space complexity is O(N) when it should be O(1) (if you didn't reverse)
If you were to sort the array it would result in a time complexity of O(nlogn) when checking if an array is sorted should only result in a time complexity of O(N)
to avoid the O(n) space complexity due to slicing, one can use itertools.islice instead.
good to know, but he/she shouldn't be slicing in this case at all
I was going to say that, you could instead use a function that translates the indexes to the reverse, or set up an iterable, but that's not necessary here
Why not? Skipping the first last element is a very common use for slicing/islice.
there is no need to skip the first element
I think you might have misread - it's arr[:-1], not arr[::-1]. Or we're looking at different code.
the last one (I also can't read)
because if they don't, they'll get an indexerror on the last index
oh yeah
could i just do an if condition which breaks the loop when the index is the last to avoid an error
like
if index == len(arr)-1: break
the slice is better imo if you use itertools
tbh the way I'd write it is
def is_sorted(arr):
# use islice instead of slicing if you care about space complexity
return all(a<=b for a,b in zip(arr, arr[1:]))
I think all might have some space complexity issues
because it evaluates a list of booleans
it doesn't
oh fair
it consumes any iterable, and here I'm passing it a generator expression
oh ok
arr[1:] does have O(n) space complexity, but can be replaced by islice(arr, 1, None) if it matters.
is it making a copy of the array
isnt it similar to just popping the element
oh nvm
bruh
yes
funnily enough, in numpy it's O(1), because numpy has the concept of views.
are numpy arrays better than normal arrays when only having nums in your array?
depends on what you mean by "better"
faster
is the speed increase big?
or is the reason why people use numpy something else
like numerical operations
speed increase of what? 😛 indexing a numpy array is actually slower than a list afaik. It's vectorized operations that are orders of magnitudes faster.
they are also more compact - a python int/float is around 24 bytes at least, whereas numpy's are, well, primitive types. So several times less memory usage when the number of elements is large.
but speed comparisons are somewhat hard to do because you simply don't work with numpy arrays the same way as lists.
In js maybe...
const is_sorted = arr => arr.every((x, i, a) => i < a.length - 1 ? x <= a[i+1] : true);
somewhat hacky to do the check every time
oh ok
ill probably just use lists
meanwhile in Rust: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.is_sorted
Js just doesn't have a ton of built in functions methods or data structures
for instance no heap
oh yeah, its std is horrible
I wouldn't say horrible because it ends up being more paradigm agnostic
you need to import, like, lodash, to get stuff I'm used to having in python like map on generators, etc.
the issue with python imo is that they gatekeep some FP stuff like reduce into functools
Not having iterable interfaces is the biggest issue
not sure why it's there but it's still standard library
oh yeah, true
and so the not having reduce outside functools has lead to a lot of people having a preference for mutation
which is
like for the level that python operates at not acceptable imo
it's super annoying I can't chain iterator methods like lst.iter().map(...).sum() in Python like I can in Rust. In python's standard library, that is.
chaining like that is bad
it's better to use functions like lisp
using instance level methods is actually less composable
so python does that better with map/filter/reduce
but the lambda syntax in js is leagues better
list comprehension chaining
list comprehensions are non-composable
which is part of why it's bad
list comprehensions aren't first class functions I mean
I dislike the lisp-like way because:
- you have to read them right to left to understand what happens.
sum(map(filter(chain(...))))- that'schain().filter().map().sum()in fluent syntax, way clearer IMO. - so many nested parentheses, oh my god
right
however it's more composable
if you want to make a pipe function then pipe the list transformations
at my job we actually have a weird stack based language that you might like then
because
{5,6,3}@+*2@Print // 28
it ends up looking like that
the previous value is passed implicitly to the next thing
stack based language
A lot of non-purely functional languages struggle because of the lack of composition operator, I believe
It's minor because most people refuse to learn FP anyway but I feel like list comprehensions are just not intelligent language design
I feel like all the composition and syntax problems are solved with dot: fold sum filter ... . \x x * x . $ {1, 2, 3}...
you trade composability and simpler syntax for something that's slightly easier to write
yeah
they are nice, but there are better alternatives
what's wrong with list comps
list comprehensions are not composable unlike map filter reduce
they are not a function
so you can't manipulate the way a list comprehension creates a new list as data itself
each list comprehension might be map, filter, or reduce, but chaining them together you end up using extra variables and a lot of memory
so you can't use pipe with it or the theoretical dot operator
because it's a list comprehension
🤔 you can pass it to other functions no problem
well, yes, what else would you do
you can't pass the process the list comprehension uses to produce the list to other functions
why would you?
why do you need the process the list comprehension uses to produce the list for composability?
tbh if python lambda syntax is so verbose to a degree that it requires 6 extra characters, simple composability won't help...
python lambda syntax is also terrible
say you want to process like 100 lists at a time which is common in CRUD
map the json to a list or dictionary, map the dictionaries to a scalar, format the scalars, output the scalars
you can create a pipe function
const pipe = (...fs) => arg => fs.reduce((a,x) => x(a), arg);
const processJSON = pipe(jsonToDict, dictToScalar, formatScalar, outputScalar);
then define those 4 functions separately, and have tests for them
How is that not a generator expression?
¯_(ツ)_/¯
what json? in any case, if you don't want a list, you can just make an iterator with a generator expression
i don't see why that makes list comps a bad idea. if you don't want a list as a result, don't use them?
because map/filter do the same thing without introducing new syntax and can be used for more things
list comprehensions are a bandaid on bad lambda syntax
that's an interesting take
I realize this doesn't matter a ton irl it's just the principle
idk about the "can be used for more things", though. as i've said before, you can just use genexps if you want to pipe them
of creating your own new syntax which is worse than what languages from the 1960s use
If the six characters makes a difference, you should use def instead.
Do you object to Python's use of whitespace, too?
That's a serious question.
pretty much everything has better lambda syntax
yes
but... python's listcomps are based on haskell's, which is, well, the functional language.
^
I'm only familiar with lisp but
It's the principle of giving up composability in exchange for slightly more terse syntax
you say you give up composability, but you can literally just use a genexp if you want an iterator
You seem very keen on composability. And that's a great property, but it's not everything. How would you implement an FFT by composition? I think that the result is likely to be illegible.
Right, it's not always the answer
But the thing is you lose nothing by using map/filter instead of a list comp
But if you want to use that genexpr you need to give it a NAME
you do lose something when you use a list comp instead of map/filter
but the thing is python language devs seem to not really care
because reduce is in functools and lambda syntax is bad
no? you can literally just replace the [] on a list comp with ()
functools jail
still the same thing. you can use () instead of [] and boom, you have an iterator
from functools import reduce
Actually I think they cared, but the programming was not mature enough to understand that
^
and as a result of functools gulag holding reduce we see accumulator loops everywhere mutating willy nilly for no reason
it's an imperative language, get over it
don't get me wrong I will write mutating code when it makes sense
people write accumulator loops because they aren't used to FP, not because it's hidden in functools
because all of the docs don't show it either
It's is imperative, but you can do better. And that bothers
js is imperative too
tbh I also very rarely use reduce, because, like, sum and math.prod exist, so you only need it for reducing other operators on custom objects
js #freedthereduce
(let me introduce you to C++ lambdas...)
I agree on that, to use reduce normally you need a a lot of other functions to not die while typing "reduce"
I just macrosed it 😉
#define $(...) {return __VA_ARGS__;}
#define $_(...) [&]<class _ = void, class _1 = void, class _2 = void, class _3 = void, class _4 = void>(__VA_ARGS__)
if you want reduce outside functools so badly, switch to python 2 :^)
Really? It reminds of the time I once saw C code with the macros
#define BEGIN {
#define END }
If you're going to use C++, at least use standard C++ syntax.
But it's better: $_(x) $(x * x)
No, it really isn't.
I would never use it outside of CP, of course
So you prefer [&](auto x) {return x * x;} ???
Yes, because then people who aren't me can read it.
(no need for the &)
But if you were to design a language, and you had to choose between current C++ syntax and that, would you choose C++ one?
tbf, no-one would design C++ in its current state, but you can't fix syntax without breaking every code ever
Actually, I think I might. Parentheses aren't used for introducing block statements in C-like languages. You should use braces. And since you may need to specify types, you're kinda stuck. C++ is stuck with a lot of historical design decisions; I think the syntax they adopted is about as good as they could do, even though it's not very good.
alright, fair
actually now that I think about it, it would be nice if reduce had support for operators, i.e.
functools.reduce(+, lis)
The operator module.
oh I wish operators where just functions...
That would be a lot of parsing issues, of course
I think the parsing issues are why it's not possible. You need very different syntax to accommodate that.
Idk what the exact implications would be though
haskel's (+)
Also in some other languages most operators are not binary but any_ary
which + is +? unary or binary?
You need both. Unary + is not always a no-op.
But you can figure out ary-ty from type system, if there was one
And you run into the same issue with unary versus binary -.
why not just have a single any_ary + and the reduce calls it with two args then rename the unary +
well I guess you'd have to remove operators entirely and treat everything as a function call
which wouldn't be so bad
but I don't think the world is ready for that level of based
Sounds like every functional language ever
in a mainstream lang
Floating-point + cannot be made any-arity.
It's non-associative.
You need the parentheses.
It's fine, world is slowly embracing functional
~~just don't use floating-point numbers, roll your own software ones
~~
I think on a lot of cool projects functional isn't the way to go but at my crud job I rarely see a reason to mutate anything
nah. world is slowly leaching more and more functional features into imperative langs
purely functional programming feels overly constrained to me, but I'm fine with stealing the good parts 😛
well no one said that your lang must be purely-purely functional
the guarantees result in speed/space bonuses
and ease of testing
you can have mutables in funcational, just in a local scope
like you can't have the benefit of trees that store transformations of a data structure that cost less than a clone if you actually mutate
but I agree at some point it becomes annoying
but you can have a vector which has unsafe code underneath it, and then use it as functional
I think if you don't go too deep and don't replace every semicolon with monad, while keeping other good principles of functional, it's still functional
for instance if we're in an FP context we can lazy eval
map(lambda x: x + 1, [5,2,3])
we don't need to actually evaluate the result
we can just store it as a transformation on that array
python does that 😔
that's why rust borrow checker exists
so we have to immediately allocate a new arr with the result
rust does sound cool I haven't checked it out
you totally should
I like langs where mutating functions/methods are clearly defined by a character
I forgot which lang it was maybe ada?
it's a good enough compromise between functional and imperative, mostly still imperative
like any mutating function is prefixed by !
I think that's beneficial in any paradigm
in rust it's mut keyword
true'
in python they already started adding the evil
walrus operator
aka I'm gonna be an expression and a side effect because
why not
why is the walrus operator unique? literally any function could have a side effect in python
which is the problem
it's a problem if you can only cope with pure functions
it would be nice to know which ones do and which ones don't via a mandatory character prefix to the function name
It sounds to me like you just have different language preferences from the rest of us.
like you could force a ! prefix on any function that has a side effect and void return value, and a # prefix to any function with a return value and side effect
yeah
well you can have impure functions
in rust some functions don't implement traits like Fn and it marks that the function is not referentially transparent. Bot quite the prefix though. Prefixes are too intrusive IMO.
the only things I've said is we get certain guarantees that result in better time/space complexities and deferred evaluation, and if we label them correctly it would help programmers
I didn't say go full functional
at some point you can still stop using that immutable structure and step out back into imperative land
It's more so when reduce is in functools gulag and that indicates an overall disdain for even the most minor aspects of fp
Every time you say this, you make me think you're a troll.
I am about 60% sure that reduce being in functools leads to people using it less along with the docs not showing it
There is a programming language called typst (which is actually a typesetting software which has a programming language inside) and all functions must be pure, all arguments are passed by value, but mutability inside of function's body is allowed. I find that compromise quite nice.
Like if I have to type functools.reduce vs reduce I think it does actually result in people using reduce less
You can write examples for the documentation, if you really think that would help.
yeah that's how I write a lot of things
I just avoid it where possible
I don't think it's inherently bad there's just a lot of work where the user of a high level lang doesn't need to interact with mutation
Where as on more interesting projects with higher throughput, lower level, more time intensive whatever it becomes a necessity obviously
That's just not the majority of what people tend to be doing I feel
And even there immutability helps because data-oriented
yeah
and even then I will say I don't think teaching FP immediately is necessarily the right call
Because people don't get a good math education
and the other thing is people tend to not understand what FP is doing under the hood if they learn it first
I just think at some point you should start picking aspects of it up
and most people are just happy to be imperative out of complacency and normalcy
I would let people suffer from OOP, then slowly introduce FP concepts, and then let them decide which FP concepts they will employ in, hopefully, imperative langs
well when we say OOP we mean imperative OOP but yeah
I find it really strange most people don't learn FP because it's required in most of the unis I've seen
and usually it's haskell or ocaml or something
Maybe it's a generational thing
10 years ago maybe there was no FP class at a lot of state unis? Now they all have one at least, and back in the 70s/80s they did too
maybe java killed it for a bit, not sure
Not all people in the industry have a degree, and it's cheaper to hire a bunch of java programmers to ruin the whole project codebase... Those who have a degree are paid a lot to suffer, those who don't are paid a little but work.
wdym by the second sentence
I don't have a degree actually, but I was pursuing a CS degree and had friends who had taken FP courses
I stopped after dsa, machine arch, calc 3,
A lot of people I know have a degree, have taken FP courses, but write in C++ in a well-paid job because the market is bigger. It's a self-sustaining paradox, kind of
Right yeah
I've met a lot of older people at my job like that
and it's a bit rude to expect them to totally retool 20 years of knowledge
which would probably result in some attrition
I'd rather write a loop statement than get someone fired because they didn't learn FP fast enough
it's no paradox. FP paradigms are not, IMO, well suited for programming larger, complex systems. once you get beyond a certain size, the work of software development is dominated by managing architectural complexity. FP is not well suited for that, IMO, because it lacks tools to structure code hierarchically.
I believe code structuring is done with different mechanisms independent of a core language paradigm. You can have modules and libraries with nice encapsulation in both FP and imperative. But to have modules, library management systems, build systems, static analyzers capable of working with a large system, incremental compilation, and other structural utilities you first have to prove that the language is worth implementing all of it, which is always difficult. The imperative won the battle at the time because it way more versatile, and because the complexity of the projects at the time was not as insane as it is now. The linux kernel is the clearest example of that. But now we see that functional languages often provide better solutions for complexity. The most obvious example of that is elixir IMO.
Just to be clear: I don't think functional is a total replacement of imperative, but I believe there are way too many applications where functional would be a better option, but it's not used because the languages and current programming cultures are not mature enough.
the best sauce, IMO, is a rich imperative language with first class support for common functional features
A thought that came to me... there exist ways to estimate the value of numerical integrals over infinite intervals - scipy.integrate.quad does it automatically, say. But is there simular routines for estimating the value of infinite sums? I'm guessing you can convert the sum into an integral manually and then integrate that, but are there automatic methods?
Depends on the function, obviously. It's generally more difficult to evaluate the sum than the integral for a lot of "simple" functions. But I know there is an insanely op method for calculating the exact (!) parametric (!!) value of a finite (!!!) sum. There was a book about that, I believe it's called A=B or something.
this looks interesting, maybe if I read that book I'll understand how the hell wolfram mathematica proved one of the identities in my bachelor's thesis 🥴
there's wilf-zeilberger and gosper's alg i guess
The basic tool is to just accumulate terms. This works pretty well in a lot of circumstances. When it doesn't work well, you can use a variety of https://en.wikipedia.org/wiki/Series_acceleration techniques. A common theme in these methods is to interpret the original series as the value of a function (for example, Euler's transform uses the terms as the coefficients of a formal power series), and transforming this function gives you a different series which nevertheless has the same value.
what is total cost and big O cost?
count = 0
for p = 0 to p = n
for w = 0 to w = p^2
If each inner loop iteration is O(1), it works in O(1^2 + 2^2 + 3^2 + ... + n^2) = O(n^3).
Just so you know, the total cost is 1^2 + 2^2 + 3^2 + ... + n^2 = n * (n + 1) * (2n + 1) / 6. The exact value is rarely used though, it only worth remembering that 1^2 + 2^2 + 3^2 + ... + n^2 = n^3 / 3 + O(n^2)
The sum 1^k + 2^k + 3^k + ... + n^k is always equal to O(n^(k + 1)) by the way. More over, 1^k + 2^k + 3^k + ... + n^k = n^(k + 1) / (k + 1) + O(n^k).
Numbers like these (1^2 + 2^2 + ... + n^2) are called square pyramidal numbers (https://oeis.org/A000330).
I think I see it now. if the inner loop were instead running to w = p, the entire doubly nested thing would be closer to n^2, however, because w must run to p^2, the inner loop itself is n^2? and then you multiply by the outer loop to get n^3?
Kinda. But your approach shows that that is the upper bound only. For example
for i in range(1, n):
for j in range(1, n, i): # stepping by i
...
The inner loop does at most n iterations, the outer always does n - 1. If you multiply you get the upper bound - O(n^2). Technically, this is the big-O of the time complexity - the upper bound. But you can improve the estimate by knowing that n + n / 2 + n / 3 + ... + n / n = O(n * log n). During the first iteration of the outer loop, the inner loop does n steps, during the second n / 2, and so on. Summing up, we get n * log n because... math. This is called harmonic series, there are a lot of explanations why it's O(n log n), you can read about them yourself.
When analyzing time complexity you usually only need the upper bound, but how precise it is depends on your use case. You can say that this:
for i in range(n): ...
Is O(n^2). I mean, why not? Big O is just an upper bound.
What a lot of times meant as big-O is an exact time complexity, or at least good enough to mean something (O(n log n) is the first example, O(n) in the second). And to compute it you have to do some actual analysis. Usually it's very simple. In your example the obvious upper bound is also the exact bound. You can read more about why the formula is as above here: https://en.wikipedia.org/wiki/Square_pyramidal_number. And since the fastest growing term has big-O of O(n^3), the entire sum is also O(n^3).
Guys what is the use of using for instance LinkedLists , Arrays, Sets in Python or javascript
Does anyone have a good source for explaining complexity and do/don'ts in regards to this?
I am building a python turtle application as a personal project, It can take in a map as a grid and use the left hand rule or a* to walk the turtle from a start point to an end point. Do you have any ideas for extra features i should implement to make it more unique?
add support for saving/loading maps, add animations, add more pathfinding algorithms, maybe make a race where each turtle uses a different algorithm, 3d grid
Hello,
short question about dataclasses in Python. I have a dataclass e.g. Player with attribute cards and names. I call that Player in an other class. If I want to change the value of cards, I can do it like this: Player.cards = new_value. Considererd of OOP do I have to implement an method e.g. addCard to Player? Or is it pythonic to do it like before (Player.cards = new_value) ? Thanks 🙂
This depends on the language. In Python, it is generally considered idiomatic to set attributes directly. If, when this happens, the instance needs to execute some code to update its internal state, then it can use a property. On the other hand, in C++, it is idiomatic to use getters and setters for everything.
it seems you're modifying the class directly instead of creating an instance?
I am creating an object from that class player = Player() and later on I do the following player.card = 10.
Now I am not sure, if this i pythonic or if i should do something like this player.set_card_value() So I would define a method in the dataclass to change the card value an call it with the instance/object
Note: I am using a dataclass / not a normal class
So, I should use a method to update the value from a Dataclass / Class in Python ?
that seems reasonable then
if you don't need a method, there's no reason to define one that will just set the attribute
methods are useful if you have a class invariant to maintain
property is useful, too
yes but why should I use a property when i can access it easily like this player.card = 10
again, maintaining class invariants
is one reason
as in, you might have members that are related and must be updated together
hiding things behind a function means you can enforce that the state of the class remains valid
if you let the user interact with the members directly they can create invalid states
fwiw, this is a case for properties/functions over direct access in general
with direct access you expose users directly to your implementation
if you operate through a function call you could do whatever inside, so you're not exposing your implementation through your api
that literally could be a property
Do you guys prefer lists or classes to implement segment trees?
Someone who knows pandas well - please, please help: https://discord.com/channels/267624335836053506/1070542237227823174
I almost always wrap data structures in a class. It's kind of a time/convenience trade off, and I choose convenience. You can spend some time to make interface look more natural, or just make functions which take an array and mutate it, or you can just inline all the operations right in the logic (with bottom-up segment trees only, of course, you would need recursion otherwise), the choice is yours. But if you are planning to reuse or generalize your tree to different operations, I highly suggest creating a wrapper. It won't take too much time and you will find it easier to remember what the structure does, and how to modify it to your needs. I've recently for the first time wrapped treap into a set-like structure and it is way more elegant than just plain split and merge functions.
👋 Hello all! I have a program I am trying to write - to convert my many notes on movies on my phone into reviews on letterboxd.com
It involves writing to a csv - if anyone wants to help, I'd really appreciate it 😄 check out #1070706066523951156 cheers! 👍
Anyone have dsa python all notes?
Is it posible to run google teams script on your own pc?
guys i need some help, i think almost every one here that one day work with maps wonder how to make a circle that's fit in the screen boundaries
so we have this formula right, the "Haversine" formula to calculate the great-circle distance between two points
u set lat 1 as your current location(center) and the lat 2 as the boundaries from your screen u get from the Maps API
a = sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)
c = 2 ⋅ atan2( √a, √(1−a) )
d = R ⋅ c
where φ is latitude, λ is longitude, R is earth’s radius
and it's work fine, but have a catch, when u start to rotate the maps this formula do not work right anymore, the rotation is called bearing
i need to find some formula to apply the bearing to the calculation so the circle still fitting in the screen
Hello gents, I'm trying to understand whether I'm doing the correct thing implementing a iterator for a custom class. I want to take advantage of generators to implement the iterator to keep code simple.
`
class Iterator:
def init(self, data):
self._data = data
self._generator = generate()
def generate(self):
for i in data:
yield i
def __next__(self):
return next(self._generator)
class A:
def init(self):
self._data = []
def __iter__(self):
return Iterator(self._data)
`
Is this the correct pattern to use? I don't want to keep track of indexes in my Iterator class and feel that the generator(yield) pattern fits, but I'm not sure(newbie).
why not just return iter(self._data)?
the internal structure is more complicated than a list
could you give a more specific example then?
a list objects containing lists
you could make a generator function inside __iter__
that way you don't need the Iterator class
would that work if someone writes
for i in object: for j in object: ?
how would that look like? When I made iter a generator(yielded from it), the objects iterated over in the for loop are not A::_data items anymore, but generator objects. This is where I`m a bit confused
this kind of thing, but probably with less trivial logic
def __iter__(self):
for x in self._data:
yield x
indeed that is simpler. I think using an Iterator class somehow made it more complicated. Still need to understand how. @haughty mountain thanks for the help
is just using dict and set plenty for most cases instead of importing and creating custom data structure like binary search tree or binary heap?
Which 3rd party library did you find most helpful for using additional data structure such as tree, graphs, etc?
hmm, to be honest yeah, I almost never had to use a non-builtin DS in python
in Rust I sometimes play with DSes for better performance, e.g. a BTreeMap instead of a HashMap (the latter is basically a python dict). But in python, it's likely a microoptimization anyway.
The only time I can recall I had to use a library for a data structure is when I needed a trie (prefix tree), for which I used marisa-trie and it was of course much faster than a manual implementation (since marisa-trie binds to a C library).
the only data structure python is missing that is somewhat common is a bbst
Hi
hey guys I wanted to get your opinion on this function, anything I can improve?
def get_all_char_positions(char: str, up_to_idx: int, string: str) -> list[int]:
start_idx = 0
indexes: list[int] = []
while (idx := string.find(char, start_idx)) != -1 and idx < up_to_idx:
start_idx = idx + 1
indexes.append(idx)
return indexes
string = 'hey. how are you. im going for a coffee.'
>>> get_all_char_positions(char='.', up_to_idx=50, string=string)
[3, 16, 39]
>>> get_all_char_positions(char='.', up_to_idx=35, string=string)
[3, 16]
depends how you implement it, it's not unique
I think the algo always prints the child elements first
you're saying depending on the order of the child elements?
yes
so if it were implemented as R, 1, 2, 3, it should print out (z, y, 1, x, 2, 3, r), or (z, y, x, 1, 2, 3, r)
oh, no it visits the current node before the children
depends
post-order, pre-order, ...
Best book for DSA in Python??
Not sure about "in python", but I've taken both undergraduate and graduate algos courses, and they both used the same textbook, namely Introduction to Algorithms from MIT
There's a git repo on our resources page with idiomatic python implementations of tons of DSAs, if you want to refer to that as well.
If you want unpythonic implementations that suck, there's also geeks4geeks
Do you need to know 6.042J for this course? Or can I directly go for 6.006
speaking of this, I recently had to learn what digit dp was for a problem, and I spent ages trying to understand what the geeks4geeks code was doing
eventually I understood it, but it was a pretty bad implementation with confusing stuff that you really don't want when you're already struggling to understand a concept
a codeforces post gave a much better explanation
I once found an absolutely incorrect solution to a problem there (https://www.geeksforgeeks.org/check-if-a-string-can-be-split-into-even-length-palindromic-substrings/). There are also a lot of weird articles about very specific "simple" algorithms or topics which require very complex background theory out of nowhere. I faced that a lot when researching formal grammars.
big surprise
Hey all, I have a question, given a grid like this; grid = [
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 4, 0],
[0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0],
[0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 3, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]
How would i go about finding every single shortest path from the start point 3 to end point 4? (1 indicates open path, 0 indicates closed path)
How can I learn data structures and algorithms
Build a graph on that grid. Perform BFS, but instead of just storing the shortest path also store a list of all vertices which are one edge closer to starting point. Call them parents. Now if you imagine that nodes of the graph are connected only by those "parent-edges" (in the direction of decreased depth) then the formed graph is a DAG. And every path from end point to start point in it corresponds to a reversed path in an original graph. So just walk that DAG, every path you find is the answer. This works in O(max(n, ans)). Where n is an amount of vertices (area of the grid), ans is the total length of paths in the answer. In case of grid graphs I believe the ans is always not greater than n^2, but I might be wrong.
If you are confused google "grid bfs" and "bfs find all paths". This will help :).
you can do courses like coursera or EdX or just simply start coding with various data structures and algorithms to compare your solutions with other ones
is DPV good enough book to learn enough about algorithms for a data science career? or is it too much / too little (requiring more graduate stuff)? I'm asking because I already have some basic knowledge about DP and graphs but looking at something like DPV with a lot of exercises feels like a lot of work that maybe I'd rather spend studying data science instead
I already have somewhat decent knowledge within Python, but in desparate need of doing some data science projects. but I wonder if it is worth it to take a break to study algorithms before fully commiting to data science or is the knowledge I have enough
Hello, this is the worst-case for insertion sort:
If for example, an array has 7 integers, would the worst case be 7(6)/2 = 21 comparisons or 7^2 = 49 comparisons?
And why?
I did read that it should be 7(6)/2, but I can't find any source that mentions why we can't use the one with the notation
how many comparisons would be needed if the array was in reverse sorted order at the start?
Um, is it n^2?
Oh
So that n^2 is just the representation since it's the dominant term?
And the rest wouldn't matter since they won't cause the growth to vary much?
And does that mean the number of comparisons for worst case is n(n-1)/2?
yes
The above reasoning is also correct?
which above reasoning
^
And the one after that
kinda
big o notation has a more formal definition but you can think of it that way sure
Why is it done differently for merge sort?
Is it due to the presence of recursion?
