#algos-and-data-structs
1 messages · Page 59 of 1
love u all
yo can anyone give me hints for this problem
Given `N` and `A`, an array of `N` elements, find the highest possible value for the greatest common divisor of 2 elements in array `A`.
Constraints:
2 <= N <= 2 * 10^5
1 <= a[i] <= 10^6```
at first I though I have to maximize lcm and I was extremely confused.. if anyone wants to solve that instead, I would like to hear your approaches
as for original problem with gcd, I would do it like this, but I have to say in advance, I don't know if this solution is fast enough for python, because it's O(M log M) where M = 1e6. The constant is very good though, so it's definitely solvable in C++.
- hint 1: ||All you need is to find largest number
gcdsuch that at least two numbers inaare divisible bygcd|| - hint 2: ||
gcdcan be any number between 1 and 1e6 - which is small enough to try every single one|| - hint 3 (main one, i'd say): ||n + n/2 + n/3 + n/4 + ... + n/n = n * ln(n) + O(n) = O(n log n) - this follows from harmonic series||
- hint 4: ||Iterate over all possible
gcd, then over all numbers from 1 to M divisible bygcd- that's O(n log n) as established above|| - Solution: ||Let
cnt[x]be number of valuesxina. Find maxgcdfrom 1 to 1e6 such thatsum(cnt[i * gcd]) for i in range(M // gcd)is at least 2.|| - Optimizing constant: ||You can optimize it further by find finding all duplicates in
a, and replacingcntwith bitset, only having to check if at least 2 bits are set. Another optimization is to only consider values fromaas possiblegcd's, from largest to smallest||
||js n + n/2 + n/3 + n/4 + ... + n/n = n * ln(n) + O(n) = O(n log n) isnt it +o(n) in n * ln(n) + O(n) ?
harmonic series differs from log only by constant, and constant is o(n)||
nvm, i cant read đ
There is a technique that I call divisor sieve, which can be used to solve tons of problems like this.
# Returns a list divisors of len n, where
# divisors[i] = list of divisors of i
def divisor_sieve(n):
divisors = [[] for _ in range(n)]
for d in range(1, n):
for multiple in range(d, n, d):
divisors[multiple].append(d)
return divisors
It runs in time n/1 + n/2 + n/3 + ... + n/n which is approximately n log n.
Going back to the problem, in order to find the largest GCD between two numbers in A, simply find the largest divisor that appears twice
This could easily be generalized to solve k-tuple with largest GCD for any k
not really on-topic for this channel, try #python-discussion
its a stack data structure and #python-discussion doesnt allow me to upload ss
sometimes I might not need to upload code, for example the question that Im solving right now
if this is meant to be a python exercise whoever gave that exercise is teaching confusing stuff
not an exercise, its from my A2 question paper
python doesn't really do arrays, python doesn't really use pointers
well Ig you could call it an exercise
list and indices yes, but then the thing could just have said that instead of being incorrect
unfortunately I have to learn all the abstract data types like stack, queue, linked list without object oriented, binary tree
as for this, I'm pretty sure the answer is that it does nothing
yes thats what I thought
forcing people to do globals when stuffing things into a class is the exactly correct thing to do is...bad
clearly the person who wrote this is more used to a different programming language
class Stack:
def __init__(self):
self.data = ...
self.pointer = ... # not actually a pointer
yes
its funny cuz for a linked list im taught to use class for the node but for the linked list, the syllabus is teaching me to use pointers
Hi All Here is my python visualization program. please provide me some feedback in my code or any suggestion
https://github.com/codemouli/Sorting-Visualization
def merge_sort(data, left, right):
if left>=right:
return
mid = (left+right)//2
yield from merge_sort(data, left, mid)
yield from merge_sort(data, mid+1, right)
yield from merge(data, left, right, mid)
yield data
``` I'm confused about this code. What is it trying to do
here check if I reached the single element (if condition ) and if not then I get the middle value and recursivly calling merge_sort to divide the array until it reach single element
I get that, but what is all the yielding for
Hello
the yielding are used to make this function as generator because for visualization I use animation which will use the generator and refresh the figure at certian interval
so in my code for every 20 milliseconds the generator is called and get the newly positioned value and update it in figure
Mouli what is it an output in python?
it will display the barchart
Ok thanks
ok please provide me any sugestions
Mouli You are an expert with python?
Code wise, one thing that is sticking out is
while(r<len(right_data)):
pajenegod what is it a for if and while?
In Python, there is no reason to put the (...). Just do while r<len(right_data):
Another thing I was thinking of is that if you want to compare sorting algorithms, it is probably better to start with a random permutation than random numbers input_data = np.random.randint(1, 50, input_length)
Not expert but have some experiance in python
like while l<len(left_data) and r<len(right_data) right?
Ok Sure let me try that
Missing a :, but ye
Sure I will correct that
See the style guide: https://peps.python.org/pep-0008/#other-recommendations
Stuff like += and < should have a single space on both sides (usually).
Thanks I will correct that
i keep getting error when i try to use unicode in this grammar:
https://pastes.dev/tHi00e9D3o
why is this happening?
why my grammar is not getting my unicode?
Enroll in 'AI Python for Beginners' by DeepLearning.AI and learn Python programming with AI assistance. Gain skills writing, testing, and debugging code efficiently, and create real-world AI applications.
what error are you getting?
this one: https://pastes.dev/J6adLByBZT
Hi guys, i'm new
So what do you do now?
guys is it possible to get 2 for loops to run at the same time on the same line by separating them with the and conditional?
nevermind
no, but you can use threads
so denbal, i sent you the error i'm getting
this website pissed me off so I don't care anymore
what did i did wrong?
sorry if i offended you buddy
no, you are fine
I just can't read the code there
what paste site i can use?
!paste this works a lot better on mobile
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
so u can see?
@lean walrus here:
https://paste.pythondiscord.com/SXJA
the grammar and the error
what text are you parsing?
you can try removing stuff from the text making sure that error still happens
ah, ok, single line
/macro create +>attack !echo [Sucess] !if !roll 1d20 >10 !else !echo [Failure]
the string i'm parsing is the one i'm highlighting with `
yeah
it's failing after i added the unicode for every different language grammar
before that it was working fine
now that i added them
the grammar stopped being created
i have no idea why
i tried simplifying it but it didn't worked also
i have no idea, i was taught to parse like that
cus unicode is a string
stuff between // is a regex
stuff inside [] is a set of characters to match
yeah, and it's a set of characters described by unicode
!e print(0x980, 0x9fe)
:white_check_mark: Your 3.12 eval job has completed with return code 0.
2432 2558
once again, what are these characters?
are they Bengali characters?
yes they are
oh god what are you doing
why are you mentioning every single language that is supposed by Unicode
CHAR: /./ or something like that should be enough
instead of everything on lines 70-98
where did you get all of that
yeah, but i don't want to include some characters
like brackets, keys, parentesis and white space
/[^()]/
include everything, exclude some
instead of mentioning every character you do allow
I have no idea why original error occurs
can i see how do i exclude characters?
also, are the characters i wanna exclude goes inside parentesis?
i see
and how do i exclude the brackets, cus it's gonna be hard excluding the brackets
^\]
and white space?
i just type space okay
thasnks denball
it worked
also, now i'm having problem with this:
my math code was suppose to show 7
not <Response [400]>
requests.get returns a response, not a string
.content is a content of a response
400 is not ok iirc
.http 400
-http 400
!http 400
oh so it's request.content?
also, using requests in async app will freeze your entire bot
what would be the right way then?
how do i fix that problem that can occour
use aiohttp or other library to make async requests
import random, math
chars = [chr(i) for i in range(40, 65)] + [chr(i) for i in range(91, 127)]
print(chars)
exec_string = ""
count = 1
while True:
temp_count = int(count)
string = [chars[0]] * (int(math.log(temp_count, len(chars))) + 1)
while temp_count != 0:
number_place = int(math.log(temp_count, len(chars)))
place_value = (len(chars) ** number_place)
digit = temp_count // place_value
temp_count -= place_value * digit
string[number_place] = chars[digit]
string = ''.join(string)[::-1]
print("Would execute: ", string)
count += 1
how long till this prints a string that destroys your pc lol
why the hell is is my logic is slow is it because of pthon ? ```python
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
arr = [] # This will store the longest substring without repeating characters
subarray = [] # This stores the current substring we are building
for char in s:
try:
# Check if the character is already in the current substring
exists = subarray.index(char)
# Update arr if subarray is longer
if len(arr) < len(subarray):
arr = subarray[:]
# Reset subarray to start after the previous occurrence of the duplicate character
subarray = subarray[exists + 1:]
except ValueError:
# If the character is not in the current substring, just add it
pass
subarray.append(char)
# Check the last subarray in case it is the longest
if len(arr) < len(subarray):
arr = subarray.copy()
return len(arr)
not because of python
because of you - you chose non-optimal algorithm
is the code really that bad ?
it's possibly quadratic in the size of the string
shit man I must be a noob then
subarray.index(char) is the thing that's slower than needed
I think the intended solution is some sliding window that keeps track of which characters are present
with a set or whatever
what are good resources/tutorials to learn DSA in python?
I did see the MIT course linked in pinned, but I wanted to ask if there was anything more coding-focused
I was wondering if I should learn C++/Java considering thats more widely used for DSA, but i didn't really want to learn a whole new language when I could just do DSA in python
Is it possible to use python's existing constant folding for my own AST tree work? I see that it does constant folding inside of ast_opt.c ==> astfold_expr. Could I access this from python to do folding on stuff that doesn't go further through its pipeline?
I'd say two pointer is the natural method to solve this problem
Move the left pointer forward while there is a character that appears twice in the "marked area"
yeah, that's what I was calling a sliding window
but two pointer might be a better thing to call it
more accordion than window
# Compute length of longest substring without repeating characters
def lengthOfLongestSubstring(S):
best = 0
i = -1
prev = {}
for j in range(len(S)):
c = S[j]
i = max(i, prev.get(c, -1))
prev[c] = j
best = max(best, j - i)
return best
Here is a nice short two pointer solution, i and j are the two pointers
When j is incremented, i is moved forward to max(i, prev[S[j]])
That way, the interval (i, j] will only contain unique characters
Can someone tell me how to get started with a problem in leetcode or anywhere in general
Like when I see the question even though I can state the problem in plain English and understand the problem
I can't find out how to start the program
Should I use a for loop
Should I use a while loop
Or should I use double for loops
Or no loops at all
I keep thinking and I get stuck
Can someone mention a tip for me how to start a program and solve at least "easy" questions
I dont do much leetcode but in general, if you can solve it with a loop (unless you're using an vectorized module like numpy) you generally should
for the sake of readability if you have some counter going up use a for loop but if its based on a condition that might change or if you want the code to run forever use a while loop
"Vectorized operations using NumPy are significantly quicker and more efficient than using for-loops." Up to avout 10,000x faster...
I've never used it but pandas and numpy are going to help me at work next month... after I learn how to use them... đ€Ł
so for example
when faced with this problem, how do you start thinking .LIke whats the first thing that comes to your mind?
yeah lol python loops are hillariously unoptimized
I feel like for that because you dont know how long the process may take depending on you're algorithm i'd start with a while loop with some condition that tells when it has found the largest container
this may be a solution so I hid it incase you want to find it yourself
||Won't selecting the largest line, then finding both the farthest and tallest line from it yield the largest container||
So when should you use while loop and when for loop?
When you have a condition you use a while loop
I suggest you start with getting familiar with programming concepts without leetcode first. I don't know any good website for practicing simple python this though except this one cool weird af russian website I found once, but that's besides the point. Leetcode problems require some problem solving skills which are not unlocked for you yet because you don't even have a good intuition for what a loop is. Learn basics of programming, then start with easy leetcode problems. Only once you are confident, proceed to medium ones. The problem you have linked is definitely too difficult for you at the moment. I might also suggest trying simple math problems to get better in general problem solving (I know alcumus is pretty good, https://artofproblemsolving.com/alcumus)
As for your other questions, there is no answer to "when you should use while and where for" because this is an implementation question. To solve the problem, you usually first need to come up with (abstract) algorithm with will have good performance and memory usage (and be correct, duh), and then implement it in code. These are two very distinct steps. You need to practice more coding to get familiar with how these two interact. It feels like magic, or something impossible at first but you will find it trivial after a while.
CPython loops are pretty well optimized for what they are, they are slow because you can't really do some things faster than they are. Due to dynamic typing and lack of JIT compilation, python loops are slow, but this doesn't mean they are unoptimized. People have put months of work into optimizing CPython, and they really did a good job.
what's the cool weird af russian website
bassically if you have a list you want to iterate though or you know excatly how many loops you need use a for loop, otherwise use a while true
A friend of mine rewrote a program that another person wrote. It was taking over an hour for 1,000 records. He took it down to 2.5 minutes for 10,000. And after the boss teased him, he got it under 2 minutes. So now he has quite a reputation.
how can i implement this function using recursion?
knowing size=2^d - 1 might help, d is the input of the function above
yes
just build it
it was a visual
dfs
the number ordering there is an dfs ordering
in-order traversal I think it would be called
so here's what I think you can do
- get the lowest and highest number in the tree (1 and 2^d - 1)
- use the middle value as root
- recursively build the left subtree where the highest number becomes mid - 1
- recursively build the right subtree where the lowest number becomes mid + 1
what I had in mind was
!e
from dataclasses import dataclass
from itertools import count
from pprint import pprint
@dataclass
class Node:
value: int
children: list['Node']
def tree(n, counter):
if n == 0:
return Node(None, [])
lhs = tree(n-1, counter)
mid = next(counter)
rhs = tree(n-1, counter)
return Node(mid, [lhs, rhs])
n = 3
t = tree(n, count(1))
pprint(t)
:white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | Node(value=4,
002 | children=[Node(value=2,
003 | children=[Node(value=1,
004 | children=[Node(value=None, children=[]),
005 | Node(value=None, children=[])]),
006 | Node(value=3,
007 | children=[Node(value=None, children=[]),
008 | Node(value=None, children=[])])]),
009 | Node(value=6,
010 | children=[Node(value=5,
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/JQF4ASELP2LYTERQO352RONS64
yeah I guess you can do that
but yeah, similar approach to yours
just I'm careful to have the iteration be in the correct order for the counter
I thought I would have to use some global variables to update the number of calls outside of the recursive functions, using count is pretty neat
Oh thank you so much
Yeah thats true
can anyone help?
how do I rigorously show the calculation of the time complexity using the level work and:
- depth of the recursion tree
- height of the recursion tree
I mean literally with the sigma notation of 2â°+2Âč+....
start by just setting up the recurrence relation for the time
T(n) = ???
i mean i just drew the graph and did:
i=0, untill i=log(n) SIGMA 2^i = 2^0 + 2^1 + ... + 2^log_2(n) <= 2*2^log_2(n)=2n=O(n)
is this necssary?
it's helpful 
we were only taught to calculate either by chart or by recursion graph
the relevant thing is ||T(n) = 1 + 2*T(n/2)||
as in, not to prove with T(n)
i know, i've tried to use this formula in my homework in the past, i usually used it for an upper bound for obviously exponential functions
but is my way correct?
took i=0 since DEPTH of the tree is logn
the recursion graph is essentially based on that recurrence, no?
yes, but rigirously proving T(n) we didnt see
only graph visualazaion and calculation straight away
is this method incorrect?
idk why it would necessarily be incorrect
this recurrence is easy to just do directly
T(n) = 1 + 2*T(n/2)
= 1 + 2 + 4*T(n/4)
= 1 + 2 + 4 + 8*T(n/8)
= ...
= 1 + 2 + 4 + 8 + ... + 2^log2(n)
which just boils down to O(n)
this doesnt feel as rigirous as the way i did it
how so?
like in my Linear Algebra class i did something similiar to what you did there in the expansion, and they told you can't prove it like that
as in, you have to use induction
or some other method
you can totally prove like this 
idk i just think they might take me points
i mean i did a <= change which is correct algebricly without a doubt
drawing the call graph and counting stuff is equivalent to this, just more verbose
i see your point but still
maybe the complaint in the linear algebra thing was about some other thing you did
as part of it
nope it was a very similiar calculation with sigma notation, as it was pointed out in my homework review
generally in my intro to cs class the TAs also use stuff like the known sums of sequences in the TC calculation, so thats why im asking
I can't really tell if the complaint was correct or not without the actual example
1s
there for sure is stuff you can do wrong with summation
you agree the det() calculation can be expressed recursively right?
so i did, and they say to me in my language "needs to be proven inductively."
yeah on a11 for all minors
did the task ask for induction?
they just asked to calc the det()
what you see is a simplified version of the original det()
idk, I've never changed the proofs I did to accommodate TAs or whatnot if I knew what I was doing was correct
we were free to complain about incorrectness in grading and have things corrected
how do you think about solving a leetcode problem. Like im doing Two sum and I still dont understand the logic behind the solution
how is complement = Target - nums[i]
my friend made me a list of leet code problems on tech interview handbook but im just terrified cause I passed data structures and algos with a B but I have no idea what im doing here
I learned it at a community college so it was really easy
but I see these leetcode problems and I have no idea what im doing
"what would I need to add to nums[i] to get to the target?"
shit its in C++ so they used vectors but
am I better off doing leetcode in python?
I was taught in C++ but I dont think its used in the field
really tho how does your mind know to use a hashmap?
Like if you look at a problem and read it, How would your mind know to use a hashmap?
"I need to do fast existence checks"
Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
You can return the answer in any order.
How would you know to do existence checks tho
What gives it away
when re-phrased in terms of looking for the complement that need becomes obvious
So im trying to find a number and its complement that adds up to target
conceptually
doesnt matter what number or what target is
I gotta make num A and num B = target
oh I see now
What does indices mean?
is it just plural for indexes?
plural of index
I get it now
x + y = z
z - x = y
y is my complement of x
y is the index of a array value

in these x, y, z are values, not indices
no I mean like z is the inputted target
and x is the first pair in the dictionary
x is the first value in the array
y is whatever complement matches it?
to make target
seen = {}
for i, x in enumerate(nums):
compl = target - x
if compl in seen:
return [seen[compl], i]
seen[x] = i
I havent coded in python in ages but im just now tryna readjust
@haughty mountain What about hash code? I found a book that told me to calculate the hash code for the key
whatever that means
either it meant a hash table or hashing function such as sha256
Hi guys. I need book (or other resources) to learn Data Structures and Algorithms.
Please recommend.
@unborn sundial
in python dicts work like hashmap right?
and im assuimg that python doesnt have treemaps
yes
with the bonus that it maintains key insertion order which is not typical to what we refer to when we say hashmap in computer science
i think the library sortedcontainers is almost like a treemap
gotcha
its also crazy that we can search dicts in constant time
the next lesson of my course is on implementation of hashmap
excited
class LRUCache:
def __init__(self, capacity: int):
self.cache = {}
self.capacity = capacity
self.used = capacity
self.order = []
def get(self, key: int) -> int:
if key not in self.cache:
return -1
self.order.append(key)
self.order.remove(key)
return self.cache[key]
def put(self, key: int, value: int) -> None:
if key in self.cache:
self.cache.pop(key)
self.order.remove(key)
self.used += 1
if self.used <= 0:
toRemove = self.order.pop(0)
self.cache.pop(toRemove)
self.used += 1
self.order.append(key)
self.cache[key] = value
self.used -= 1
# Your LRUCache object will be instantiated and called as such:
# obj = LRUCache(capacity)
# param_1 = obj.get(key)
# obj.put(key,value)
im guessing this is not how you are supposed to solve this
atleast it works lol
had this in a programming comp last week
spent 2 and a half hours on it, but couldn't get my python sol to not hit time limit
ignore the gen stuff that's me trying to optimize it
my alg was UCS with nodes (time, data_left, cafe) and i'd only add nodes for when you first arrive at a cafe and when the cafe closes
not sure how i would optimize it further, any thoughts?
It would be nice if that is how the world worked
!e
def dict_ddos_0(n):
pow2 = 1
while pow2 < 2 * n: pow2 *= 2
# Fill up all spots that 0 wants to use
A = [pow2]
i = 1
for _ in range(n - 1):
A.append(i)
i = (i * 5 + 1) % pow2
return A
# This part runs fast
B = {}
for a in dict_ddos_0(10**5):
B[a] = 1
# Make Python check for 0 in the dict which will take ages since
# the first 10^5 spaces where 0 could be at are occupied by non-zeroes
for _ in range(10**5):
0 in B
:warning: Your 3.12 eval job timed out or ran out of memory.
[No output]
Python dictionaries are not safe against "ddos" attacks
its over
well i mean no hashmaps (direct addressing or open addressing) is safe from that if the hashing function is not random right? doesn't have anything to do with Python
Thats not entirely true
"good" hashmaps have worst case O(1) look up
The trick is that if you have a hash collision when creating a new key in the hashtable, then you change your hash function.
That way you can guarantee that look ups always take O(1) time
https://epubs.siam.org/doi/10.1137/1.9781611977936.33 here is a pretty good read about this stuff, if you are into learning about "modern" hash tables
You have to distiguish between two things. The hash function used internally in the dictionary, and the hash function used externally to convert your object into an int
Hi guys. I need books (or other resources) to learn Data Structures and Algorithms.
Please recommend.
I always thought the internal hash function was just a modulo (to not be over bounds when accessing the index within the dictionary's internal array)
But actually it's (unsurprisingly) more complicated than that
I'm reading Grokking Algorithms and I'm liking it a lot so far, for someone having zero or close to zero DSA knowledge it explains things very well
And it also has pictures
cuckoo hash tables go hard fr
Thank you.
The internal "hash function" in a python dictionary is just repeatedly multiplying by 5 and add 1 (mod some power of two)
This is the actual reason why Python dictionaries are so "hackable"
They don't even attempt to do anything randomized
"hackable" in what way?
I can fill up a dictionary B in such a way that
0 in B
takes O(n) time
Also B[0] = 0 takes O(n) time
So "hackable" just means that certain operations are slow?
The word "hack" is used in competitive programming for input data that some program fails on / TLEs on
Oooh I see, thanks. I'm very new to this stuff
People write code expecting every hash table to run in O(1) time. But turns out it is easy to create evil sequences that make them run in O(n) time
Could I get help trying to visualise these
I get if they were on their own
But the c is confusing me here
Would this not be a fully shaded vendiagram because its using U
well there has to be a modulus somewhere right?
nevermind I can't read
you did say there was one
It is U remove C. So everything will be shaded apart from C
No. U remove C is not this
U is everything
Then remove C
i'm confused are you using u for universe or u for or
U is the universe
Which is that entire paper
Btw some else is wrong with your picture
You should have 3 intersecting circles
One for A, one for B and one for C
But you only have two for some reason
ooh that makes way more sense
i was so confused on the two circle model
was going off this so was absoloutly lost
thank you
is there a text based course which teaches DSA in python
what's 'text' based? books? there is a book called Grokking Algorithm - Aditya Y Bhargava if that's what you are looking for
it's very beginner friendly though
I need help with searching partial strings
Say a script is given to find and replace all instances of a specific string and its partial strings (up til a certain base length).
For example, the script will replace all substrings from "Lorem" up til "Lorem ipsum dolor sit amet, consectetur adipiscing elit".
What is the lowest time complexity method to search through all of these substrings?
I asked GPT and it gave me this brick wall approach to generating every possible pattern to search:
def generate_patterns(base_string):
# Generate all possible substrings for the pattern
patterns = []
for i in range(len(base_string)):
for j in range(i + 1, len(base_string) + 1):
substring = base_string[i:j]
# Escape special characters and make whitespace optional
pattern = re.escape(substring).replace(r'\s', r'\s?')
patterns.append(pattern)
return patterns
This is some insane time complexity, what better ways are there?
I'm thinking O(n log m), but you'll have to get your hands pretty dirty
(n is length of original, m is length of substring to replace)
(assuming you build a new string, cause you can't modify a string in python)
for each index i in the original string, binary search how much could be replaced starting from i:
- if no match, add that character to
new_string - if match (can replace length
l), add the replacement string tonew_stringand skip forwardlcharacters
in order to keep each binary search O(log m) and not around O(m log m) (where the extra m comes from string comparisons), you need to use something like a polynomial rolling hash
or I guess heccin AC automaton with all the substrings' desired partials put into it
You can definitely do this stuff in O(n + m) if you just know your string data structures.
Both suffix array, and an accompanying LCP can be computed in O(n + m) time. (my implementation of that can be found here https://github.com/cheran-senthil/PyRival/blob/master/pyrival/strings/suffix_array.py)
All matches of the substrings of interest will lie on an interval in the suffix array. Making it easy to identify them.
if you just know your string data structures.
well you lost me there, I have major skill issues
but TIL
does anyone know where i can learn how a computer knows where to start reading a file or a string on the RAM/disk?
i ask because a lot of algorithms like Lempel-Ziv compression algorithm assume they KNOW where the file starts and stops, meanwhile i don't find it trivial and dont know how it works. i do know how to analyze the binary/hex representation of the file on the disk, but that doesn't explain how the computer/OS knows the 0s/1s before it wasnt also a file before it and this is just a coincidence of sequence that looks like the start of a file.
is there some source/simple explanation/course in uni/college that delves into this topic?
The RAM question and disk questions are two different ones, and the short answers for what you want to learn are: for in-memory - pointers, CPU registers, and types of memory (heap, static, stack, etc.), and for disk - block devices, file systems, virtual file systems. The easiest and most robust way to learn about the first one is to learn C, and the second one is to learn about file systems in operating systems, maybe look and play around with into toy OS like xv6. Or, well, just read about them.
The overview for how memory works is as follows: CPU has memory cells called "registers". Those store a single 32 or 64 bit value, or maybe multiple values at once (so-called SIMD registers). I will only consider necessary and general purpose registers. For developer, there are about 10 to 40 of these, so not a lot. There can be more in the hardware itself, but the machine code only exposes these ~20 hard-coded registers. The rest of the program data is stored in RAM. One of the register is called "stack pointer", and usually denoted as sp/esp/rsp (in RISC-V & some x86 modes, 32 bit x86, 64 bit x86 respectfully). This registers stores the memory location of, well, stack.
Every time you create a variable in compiled language, it is stored either in one of the registers, or in the stack. So for example, if the are two local variables in the function, a and b (both are 64 bit integers), compiler can assign register rax for a, and memory location [rsp - 16] for b - that is 16 bytes to the left of where stack is ending.
If the variable is string, it is usually stored as either so-called C-string - a pointer to a memory location where null-terminated string begins (null terminated meaning that at the end of the string there is a 0 byte, so you know where it ends), or slice (aka fat pointer) - two values: a pointer to a memory location, and length of the string (so it takes two registers, or register and value on stack, or 2 values on stack - as if it's 2 64-bit variables). Nowadays, slices are preferred (any array of fixed-size objects can be a slice too! and strings = arrays of characters).
There are also static variables. Their memory location is known at compile time (one of the parts of the compiler - the linker - figures out the final position of each local variable in the final executable). These are just constants you know. If your program is printf("Hello, world"), then there is an array of characters somewhere in the executable containing characters Hello, world (you can use linux strings command to see that!), and when the program is executed, it loads the constant address of that string into one of the registers, and calls a function printf, which doesn't care about what it's given, but it knows that if it looks into that register, there is a string there which needs to be printed. It then loops over each character, starting at that memory position, and prints characters, until it hits 0 byte, meaning string should end.
So, to answer your question, "how a computer knows where to start reading a string in the RAM?" - all data in your program is accessible through some pointer path - that is a path you can trace if you start with a constant or a register, look at the memory position at that register (maybe offset by some other value), then look at the memory at this position, then look at the memory at the position you just read, then at the position you just read again, etc, until you eventually arrive at the actual value.
And I am not going to talk about the disks do that đ. The mechanism is extremely similar though. If you understand how RAM works, you will get an intuition for how disks may work too. The main exception is that "roots" of pointer paths are not registers and constants, but a "superblock" values, stoted at the beginning of the hard drive.
Here is full printf("Hello, world") example in C: https://godbolt.org/z/zqTe8hG5e
Although I use puts instead of printf for simplicity - doesn't matter - it just prints the string to the standard output.
#include <stdio.h>
int test() {
puts("Hello, world!");
}
it compiles into the following assembly
.LC0:
.string "Hello, world!" ; the string itself, at memory location labeled .LC0
test:
; ignore this line - it just ensured that stack pointer is divisible by 16 for boring reasons
sub rsp, 8 ; subtracts 8 from rsp
; load the memory location of .LC0 into edi register - so now edi has the position of the string
mov edi, OFFSET FLAT:.LC0
; call function "puts"
call puts
; and this line un-does the first one
add rsp, 8
; and return to exit the function
ret
thank you very much for the explanation, I know the basics of C, but our intro to cs course + DSA course are on python (and OOP one is in Java). are what we are looking at are "System calls"? I have a course in 2nd year called OSs, which I think I saw from its tests talk about similar stuff such as rsp (I think I saw those key words on the the tests)
I don't really understand your explanation, and it leads me to believe that the compiler/interpreter always does at least O(m) lookup time, when m is the size of the registers of the programs
is there a formal course for this? like in MIT OpenCourseware?
Data on RAM is kept track by programs that use that data. Data on the disk is kept track by the file system. The programs keep track of what they use on the RAM, but need the operating system's help which in turn turns to the file system interface to use files
It's ok if you don't understand, I just wanted to give you a base you can google for. System calls is an unrelated OS concept, but you are right that most of what I am talking about is taught in OS course.
Also, I don't think that's what you meant, but compiler indeed does very complex register lookup which is at least O(number of registers), in reality it's usually even worse because distributing values between registers properly is a very difficult problem.
I'm just a silly
I'm only supposed to return the first two numbers that sum to the target_sum from the array. I do not need the values. I don't know why it's giving me (2,3) when I use [2, 1, 4, 4, 6, 4, 0, 4, 4, 5, 5, 1, 0, 6, 0, 3, 6, 3, 2, 1] and a target_sum of 5... I've spent too long on this problem and I have finals today. This is making me feel like I fried my brain in butter for some reason.
def hashtable(array,target_sum):
complements={}
for number in array:
complement=target_sum-number
if complement in complements:
return number,target_sum-number
else:
complements[number]=3 # the value is arbitrary
return -1
complements[number]=3 # the value is arbitrary
what's with this line?
you will never find a pair other than (3, target - 3) if you do that
def dutch_national_flag_partition(array, start, end):
mid = (start+end) // 2
pivot = array[mid]
red = start
blue = end
j = start
while j <= blue:
if array[j] < pivot:
swap(array, j, red)
red += 1
j += 1
elif array[j] > pivot:
swap(array, j, blue)
blue -= 1
else:
j += 1
return red, blue
My lecturer say if I were to use the dutch national flag to partition in my quick sort algorithm, it will fail at a specific case, something like a multi swapping???
I tried a lot of test cases but still couln't find it, do u guys have any idea
what's an algorithm
not sure how this would fail
I don't see anything wrong with this. The choice of pivot being (start+end)//2 is weird, but it is not wrong
Maybe doing swap(array, i, i) is seen as a bad thing that should be avoided? (Swapping an index with itself). Your code is currently doing this.
You'd avoid that by picking pivot = array[start]
on the other hand pivot being the start means trivial worst cases
Yes, picking pivot in middle works well for already sorted data
However, I've seen implementations of quick sort that start by swapping the pivot element to array[start]
yeah, that's sensible
st = LazySegmentTree([1,2,3,4,5], func= lambda x,y : x + y)
st.add(0,3,1)
print(st.query(0,5))
why does this give 17
it should be 18
similarly there's off by one index discrepancy in some queries idk why
this is the LazySegmentTree on pyrival
Lazy segment trees with addition are tricky
i was trying to do this with lazy segtreehttps://codeforces.com/contest/1872/problem/E
i am not sure but we can probably make a pair class, combining both value and binary string value at the index, then do some operator overload before sending this to segtree , is this overkill lol
but well
range update is also tricky
should do something else
ah okay
I had this problem in a comp today
You have an array of nonnegative integers and an integer k. You can repeatedly pick contiguous subarrays of length k and replace that subarray with its bitwise or (so if the subarray was 3 4 2 you can replace that with 0b111 = 7)
if you repeat this process until you get an array of length < k, what's the maximal sum of this array?
I briefly attempted to use a segment tree but couldn't figure out how that'd work with the array changing shape
limits on length of array and k?
400k for both, integers are bounded by 1 billion
oof ok
i wrote some code to do like a sliding window Counter of bits, tried mapping out a segment tree sol, and then looked at the constraints and decided that those'd probably be too slow to do anything so i just moved onto another problem
Here is the trick,
Suppose I give you an sequence of intervals [l, r_1], [l, r_2], [l, r_3], ... (where r_1 < r_2 < r_3 < ... and l is fixed).
Now think about what happens with the corresponding sequence of OR-values of the intervals.
- This sequence is increasing
- This sequence can increase at most ||log(max(A)) times||
You can use this to speed up the natural O(n^2) DP to ||O(n log max A)||
The fact that the problem uses OR is kind of arbitrary. You'd have pretty much the exact same problem by switching out OR with for example GCD.
any python competition programmers able to help
can't youd o this in O(n) time
What I wrote is a hint to a problem posted earlier
What is the best approach to learn algorithms and data structures?
By solving questions
hey i am new whats and where to start coding language
Yo there is a DSA problem I am not able to solve can anyone help me with it
Sure. I can help you.
Give me a sec
paste it here
Description:
You are given ( n ) regions where some servers are hosted. The number of machines in the ( i )-th region is given by machineCount[i], where ( 0 \leq i < n ). Managing all these different regions can be challenging, so the team decided to move machines to exactly 3 regions. The desired number of machines in these 3 regions is given by finalMachineCount[j], where ( 0 \leq j < 3 ).
There are two types of operations that can be performed:
- Add or remove a machine from any existing region. The number of machines in that region should be non-zero before and after the operation. This operation costs 1 unit per machine.
- Move all machines from one region to another existing region at a cost of shiftingCost units. The now-empty region is destroyed in this operation.
Your task is to find the minimum cost to redistribute the machines such that exactly 3 regions have the counts specified in finalMachineCount.
It is possible that there are additional machines left over after placing the specified number of machines in the final 3 regions.
Example:
python
n = 4
machineCount = [2, 3, 5, 7]
finalMachineCount = [5, 10, 5]
shiftingCost = 2
Output: 5
Explanation:
-
Increase the number of machines in the 1st and 2nd regions by 1 and 2 machines, respectively:
- New machineCount: [3, 5, 5, 7]
- Cost: 1 + 2 = 3
-
Move all machines from the 4th region to the 1st region:
- New machineCount: [10, 5, 5]
- Cost: 2
Therefore, the total cost is 3 + 2 = 5.
Function Signature:
python
def getMinimumCost(machineCount: List[int], finalMachineCount: List[int], shiftingCost: int) -> int:
# Write your code here
Constraints:
- ( 1 \leq n \leq 100 )
- ( 1 \leq machineCount[i], finalMachineCount[j] \leq 100 )
- ( 1 \leq shiftingCost \leq 100 )
Note: The solution should aim to efficiently compute the minimum cost to achieve the desired configuration.
Explain the example of this question in a simpler way
This was from an OA
I GAVE
Yeah whats your code that your having issues with
I mean like I donât know how to approach this question
I have solved it using brute force and DP
but both the solution
Didnât pass all the cases
All the necessary cases failed
Itâs been a week and I have been trying to solve it
If I DM you later for this can you help me solve it I am in class rn
Alr
Thank you
So you have regions with different numbers of machines
You need to move machines around so that exactly three of these regions end up with specific numbers of machines. Moving machines can be done in two ways:
adding or removing machines are a cost of 1 unit/machine
or moving all machines from one region to another region that already exisrs
exists
which costs a fixed amount (shiftingCost)
You can
Increase the machine counts in some regions
Or move all machines from one region to another
Bear with me it might take me a while to type this
the first and second desired machine counts are 5 and 10, respectively and the current regions have 2, 3, 5, and 7 machines. To math the desired counts: Increase the number of machines in the first region from 2 to 5 (costs 3 units) ok so now increase the number of machines in the second region from 3 to 5 (costs 2 units), after this adjustment, the machine counts are [5, 5, 5, 7], and the total cost so far is 3 + 2 = 5
For this u do
the third desired count is 5, which we already have in some regions, and the fourth region has 7 machines. move the machines from the fourth region (7 machines) to one of the regions that already has 5 machines. This operation costs 2 units (shiftingCost) and results in the final machine counts being [10, 5, 5].
Ok so the total cost calculations is
Adjusting machine counts : 5 units
Moving machines : 2 units
Total cost 7 units
So the minimum cost to achieve the desired machine count in exactly 3 regions is 7
7 units
@fiery cosmos lmk if anything else
can someone explain these using venn diagrams please
really bad but i was just using paint to get an idea are these correct for one and two?
lol i do this in india
Let me see
assuming that c just shades in everything that c is or overlaps with?
in the second example
do u want me to explain everything
if possible please
Ok
Lets start with
a.
AUB represents all elements that are in either set A or set B, or in bith.
Both.
(AUB) UC that includes all elements that are in A,B, or C
In the venn diagram
All regions corresponding to sets A,B and C would need shaded
ok now
b.
anb: represents the intersection of A and B
In other words elements common to both A and B
And then the
(Anb) UC includes all elements that are either in the intersection of A and B or in the set C.
In the venn diagram the region where A and B overlap, and the entire region of C would be shaded
so i was right with those two diagrams above i believe correct?
or would these two be incorrectly shaded
Yes
Correct
ah alright thank you
The left diagram correctly represents
(Aub) UC
And the right diagram is
(AnB) UC
Yes this would be .a
A
But not the entire set of A
But specifically the parts of A that do not overlap with the intersectionof B and C
alright thank you so much man that has made it a lot easier to understand
No problem
no
but the total cost is coming out is 5 in the example and you are using the same example
you removed union of B and C from A, not intersection of B and C
and also where if we are moving the machines from one region to another
why are we able to use all 7 from the last region in the example
i might be wrong on this
if possible we can hop on a discord call
What would that look like?
Sorry can't visualise that
Yeah isnt the goal to explain the example in a more simpler way
Like this
ooh
You want to remove everything that is both in B and in C from A
Which is just the very middle section of the diagram (for the purpose of this)
so what i drew was just A?
or do you also shad in a union if you are just asked to shade in A
Just A as in everything that is only in A
oh wait bro i got it
damm i read your explanation wrong
mb
np
i will write up the solution
The diagram splits A into 4 sections. If you were asked to shade A all 4 sections would be shaded
Which would be the entire circle of A
is what course/year of the CS degree do we learn about AVL trees/B trees/Segment trees? in my Intro to CS classwe just learned binary trees as is for now
I don't think segtrees are taught tbh
Solve this question in python programming, expert level : very hard
Suppose we need to adjust the matrix to a new element distribution đ 2 =
(đ1, đ2, đ3, đ4, đ5). When making this adjustment, the optimal matrix should adhere to the following rules:
- The number of subgroups for each integer value should be minimized.
- The optimal matrix should have the smallest value in the top left corner and the largest value in the bottom right corner.
Definition of subgroups for an integer value: If groups of the same integer value are not connected to each other (i.e., there is no direct path connecting one group to another), they are considered
distinct subgroups for that integer value.
Example.
Given a current 5 Ă 5 matrix with đ 1 = {1: 3,2: 7,3: 5,4: 7,5: 3}, this indicates there are 3 elements
with value 1, 7 elements with value 2, 5 elements with value 3, 7 elements with value 4, and 3
elements with value 5. The new target distribution is đ 2 = {1: 5,2: 8,3: 4,4: 6,5: 2}
Current matrix
[1 1 1 2 2]
[2 2 2 2 2]
[3 3 3 3 3]
[4 4 4 4 4]
[4 4 5 5 5]
Matrix after adjustment
[1. 1. 1. 2. 2.]
[1. 1 .2. 2. 2.]
[3. 3. 2. 2. 2.]
[3. 3. 4. 4. 4.]
[4. 4. 4. 5. 5.]
Please outline a strategy to transform the current element distribution from đ 1to đ 2 while minimizing the number of subgroups for each integer value.
Is this AI generated or an actual human written problem? I can't make sense of this
It actually human problem
I have solved it recently i found it very difficult please try to solve it
people here are more likely to help you solve it rather than solve it for you
sorry coming back semi to this
i have these
sets
P = 3, 5,7,9,11,13,15,17,19,21 â infinity (all odd numbers)
Q = 1,3,5,7,9,11,13,15
R = 3, 5,7,9,11,13,15
N = 3,6,9
Draw a Venn diagram that visually represents the relationships between those four sets. (Do not show the elements of the sets, just regions within
labelled by the names of the sets. Do not show intersecting regions with no elements in them.)
this is the question given
is there a specific part you're stuck on?
have you drawn like an incomplete one?
yeah sorry was just quickly drawing it up
so is this correct so far?
if so where would i draw n
or have i done this wrong
correct so far
I don't see any problems (well, you didin't include1in yourPthat you typed, but I assume that's just an oversight)
where would I drawN
Nhas3and9which are shared withP Q R, but also a6
soNwould have a part of it contained inP Q R
oh sorry here is the og sets
This might be easier to work with
ah, then P doesn't include 1, meaning that some of Q would be outside of P
edit: actually it depends on if they include 0 in â or not
this is the slide our teacher gave us
so i assume so
which probably means i missed one of these up
no you're fine then
that means P = {1, 3, 5, ...}, so your original diagram is still right
so where would i place S would it intersecting with p but not fully in it?
yes, but S also needs to intersect Q and R (all 4 sets contain 3, and P Q S contain 9)
bad drawing but imagine that little red "circle" was s would that be correct?
a bit more down
so it intersects with r
somthing like that?
ehh kinda, I realize that rectangles might be better
cause there's nothing in the section where only P and S intersect
can you do rectangles for venn diagrams? we've only been taught circle ones so far so not fully sure how that would look
I don't think there's a rule that venn diagrams must only use circles and ellipses
this might be a bit unholy but from my understanding
P is the biggest thing there which has R completely inside of it
Q is not a proper subset since 1 is not in P so it intersects it and R
with S intersecting all 3 of them as well cause 3 and 9
is this completly wrong?
1 is in P since 0 in N, P = { 2k + 1 | k in N }
oooh
and the slight issue with using circular shapes is you'll have a section of P and S which is actually empty
so p would actually be 0, 2, 4, 6 onwards correct? cause the +1 is apperently only applied once?
P would be all odd numbers
P = { 2k + 1 | k in N }, so imagine
k=0 -> 2k+1=1 -> 1 in P
k=1 -> 2k+1=3 -> 3 in P
...
oh so the +1 is applied every time
yeah i got confused earlier and asked chatgpt and it told me the +1 was only applied the first time which seemed a bit off to me
the math is basically saying [2k+1 for k in range(inf)] in python list comprehension (assuming you have infinite memory)
LLMs are good at spewing out correct-looking answers that are wrong, so I'd always double check what they say
so how would i intersect S with Q and R without intersecting with P
well... that's why I said circular shapes are a bit weird here
i feel like the 2 am brain is hitting cause this is giving me the same issue
so i add S here?
cause i assume it can't be inside p since its not a proper subset
you can always sanity check
6 would be in S
3 would be in S and P and Q and R
9 would be in S and P and Q
so looks alright?
uhh i believe so
i'll probably have to fully look at it in the morning but that looks right for now
thank you for the help btw
Just checking my logic for this question
A: the range is 0 to +infinity because a number squared can't be a negative
b the function is not onto because Z has negative values which are not being mapped to the set
i'm a bit lost on c
the domain of d would be something like (1 to 9) or d in natural numbers d<10 correct?
and codomain would just be n in natural numbers i think?
does that seem right?
Deleted original message because I misread
d represents all possible digits, so you are missing something
oh sorry 0-9 right?
Yeah
I do not know exactly what "range" means (I don't think it is standard terminology), however I doubt the answer is {x in Z: x>=0}
But maybe it is, i don't know
The image of the function are all square numbers (starting from 0)
That is what I would assume "range" to be
I took range to mean image
And yeah I didnât read that answer that closely. I agree
Altho question b may imply the definition of ârangeâ is not necessarily the image here
If the range is the image the function is automatically onto but maybe that is the point of the question
definitely depends on how they define it
schools like to do it differently than one would in actual maths, since usually a function needs the domain and codomain to be specified to be defined in the first place
they like to tell you to "find the range of this function", but then that makes surjectivity confusing to talk about because surjectivity is inherently about the difference between the codomain and the image, but since they don't define the codomain it doesn't make much sense
Math books started using "range" for a while and then stopped because it's confusing. This confusion is from a mistake in pedagogy.
(It used to mean codomain in older math books, then newer ones started using it to mean image (even newer books have stopped using the term at all))
what is the confusing part?
whether it is min - max or the set of all possible outputs?
e.g. given f : R -> R and f(x) = xÂČ is it R or is it {x in R : x â„ 0}?
I've been mulling over this for a while now and I have no clue what the n^2 DP is, could I have a hint
The term range is sometimes ambiguously used to refer to either the codomain or the image of a function.
https://en.wikipedia.org/wiki/Codomain
In mathematics, the range of a function may refer to either of two closely related concepts:
the codomain of the function, or
the image of the function.
In some cases the codomain and the image of a function are the same set; such a function is called surjective or onto. For any non-surjective function
f
:
...
the issue is ambiguity of what is actually meant by range
who is familiar with adaboost
Have you heard of aliens DP / lagrange multiplier?
The tricky thing is how to incorporate the k constraint
The idea is that instead of keeping track of k (the number of intervals), you add an additional cost of lambda for each interval you create
If you pick lambda too small, then this will result in the optimal solution using > k intervals
Picking lambda too big will result in < k intervals
But if you pick lambda correctly, the optimal solution will have exactly k intervals
If you pick lambda too small, then this will result in the optimal solution using < k intervals
k, you mean?
i know of lagrange multipliers but not this
Yes exactly
The technique got famous in competitive programming because of a problem called aliens
But it is really just lagrange multiplier
In fact, I think this gives a much better intuition of lagrange multiplier than the typical math course
Ah yes
Fixed it now
speaking of a function
i also got coincidentally a function problem as well
so suppose i got f(x), g(x), h(x).... etc. And a class called composite which stores the functions in a list but computes them in a composition way from first shallow to last deepest. For example
[f(x), g(x), h(x)] => f(g(h(x)))
the question is Say i have the same class but has 2 of the same class type under the functions for example suppose p(x) and q(x) where
p(x) = [f(x), g(x), h(x)]
q(x) = [g(x), g(x), f(x), h(x), f(x)]
If i combine the functions such as the
result = p(x) + q(x) = [f(x),g(x),h(x),g(x),g(x),f(x),h(x),f(x)
is it the same as basically navigating the composition tree and reaching from bottom to top?
your notation is a bit confusing; I think when you say f(x) you actually mean f
tried to do this more mathematical
in which case it'd indeed be valid to consider Compose([f,g,h]) to be a function itself, and you'd be able to e.g. add them up
but when adding to compose functions. Will the result be the same?
(p+q)(x) would be the same as p(q(x)), yeah
oh
will this also speedup execution. Say that to compute this there is a method called compute
so i give it a x value and it returns a y value for the function
this function will execute a lot of times
with different x values ofc
Hey, I'm trying to understand the editorial of this problem
problem: https://codeforces.com/contest/920/problem/C
editorial: https://codeforces.com/blog/entry/57516
they say "It means that all the swaps from i to jâ-â1 should be allowed. Then it's easy to notice that it's enough to check only i and iâ+â1 as any other pair can be deducted from this."
I used a prefix sum to make sure all the swaps from i to j-1 are allowed but they say "it's enough to check only i and i+1" so I'm not sure I understand, am I missing something? They also say they use a prefix sum so I assume they suggested to do the same but this sentence is really confusing me and there might be something I'm missing?
is writing in pypy any different from normal python code?
you don't have access to all the libraries
typically you can't use pytorch
and there probably are some other differences but i'm no expert
well so if just for cf and stuff it would be no difference?
it'll be faster usually
i think if there are strings it could be a bit slower
but otherwise it'll be 99% of the time be faster
There are some things that are different.
sys.setrecursionlimit(10**9) works differently (can give MLE using PyPy)
This is probably the main reason why an AC CPython solution does not get AC using PyPy
AC?
AC = Accept, it is what happens if a solution passes all test cases for a problem
Another significant difference is that PyPy is usually faster. But that is less of a fundamental difference.
well what I care is do I just write python, select pypy and submit and it works fine. And it is prob yes
Yes
I would think too deep about this. The actual solution is the last sentence:
We can precompute pos[a_i] for each i and use prefix sums over the string of allowed swaps for quick verification.
People are questening the quality of the editorial in general
I think of it like this: element a_i needs to be moved to pos[a_i], check that this is possible for all i?
Acutally thinking about it, there is a "better" O(n) solution (that doesn't use that fact that the sequence given is a permutation). Suppose that the input contains numbers from 1 to 1e9, and that it can contain duplicates.
Here is the algorithm:
Split up the input array into intervals (where inside each interval, all elements can be freely swapped)
Then check the the max of an interval <= min of the next interval, for all intervals
Doing this, you wont ever need to do any kind of sorting. So it runs in true O(n)
Who's familiar with ada boost
Can someone look at this and tell me where im potentially going wrong
isnt working correctly when it comes to boosting
Ah damn, thatâs pretty smart
If you're skilled in algorithmic programming with Python, I would like to hear from you. I'm currently looking for talented individuals to join my hedge fund. Please reach out to me privately if you're interested.
Im kinda lost
Which is the correct channel for Machine Learning stuff?
this is the wrong channel for this, open a help thread or send a message in #python-discussion
import math
import time
import random
import numpy as np
nodes = []
numnodes = 10
nodes = np.random.randint(0, 100, (numnodes, numnodes)).tolist()
print("loaded")
ais = []
for x in range(0,100):
l = [x for x in range(0,numnodes)]
random.shuffle(l)
ga = l.copy()
ais.append(ga)
mutationchance = 0.01
print("--")
overallbest = [-1,[]]
z= 0
def loop(z,inp=0):
global overallbest
global mutationchance
best = -1
bestai = []
average = 0
for x in ais:
tdist = 0
summer = [ nodes[x[y-1]][x[y]] for y in range(numnodes) ]
tdist = sum(summer)
average+=tdist
if best==-1 or tdist < best:
best = tdist
bestai = x
bestai=bestai.copy()
average/=len(ais)
if overallbest[0] == -1 or best<overallbest[0]:
if best!=-1:
ov = [best,bestai]
overallbest = ov.copy()
for x in range(0,len(ais)):
s = random.randint(0,numnodes-2)
e = s+random.randint(0,numnodes-1-s)
piece = overallbest[1][s:e]
if random.randint(0,int(1/mutationchance))==0:
if len(piece)!=0:
tp = piece[0]
piece[0] = piece[-1]
piece[-1] = tp
b = ais[x]
for p in piece:
ais[x].remove(p)
rpos = random.randint(0,len(ais[x]))
ais[x][rpos:rpos] = piece
print(f"@@ generation{z} @@ avg {average} - best {best}")
print(f"@@ over all best {overallbest[0]}")
from line_profiler import LineProfiler
while True: #main loop
print(f"LOOP {z}")
inp = input("continue..")
#loop(z,inp)
lp = LineProfiler()
lp_wrapper = lp(loop)
lp_wrapper(z,inp)
lp.print_stats()
z+=1
this is the full code for the algorithm im working on @agile sundial
@agile sundial can you help me with something
everyone needs their help đ€Ł
no
tf should i do if i get these errors
sorry I can't help rn. if you post the details of the problem and condense it down to the part you need help with, that will help other people help
i need help đ
I dont know if this is the rigth place
I was wondering, for programming interviews, should I memorize or be using anything outside of the built in data structures and the collections library?
No, you shouldn't memorize them, you should understand them. At coding interview level there is very little memorization involved if you are doing it right. If you just remember the algorithm without understanding it properly interview will notice. Yes, you should be using data structures where applicable.
Also note that if interviews are done orally, then you are likely not expected to bring up how to implement a data structure
The more likely scenario is that they want you to show how to use a data structure to solve a problem
You can usually assume that someone else magically has implemented the data structure for you
Then you can use it as a black box
knowing the existence and properties of a ds/algo is the key part
Is there a best practice for finding an object on a list from a user input? I usually do it with a list comprehension, but is there a method for finding it or something like that?
For example:
class Car:
def init(self, model, price):
self.model = model
self.price = price
cars = [Car("Fiesta", 5000), Car("Etios", 3000), Car("Clio", 2000)]
car_wanted = input("Car model you want: ")
selected_car = [c for c in cars if c.model == car_wanted][0] #This right here
I solved https://codeforces.com/problemset/problem/869/C using https://github.com/cheran-senthil/PyRival/blob/master/pyrival/combinatorics/nCr_mod.py (Lucas theorem)
Any idea what the official solution used to compute n chooses k mod M? https://codeforces.com/blog/entry/55009 They don't really detail it and it looks like the precompute a bunch of stuff but I'm a bit confused
Actually:
p[i][j]=(p[i-1][j-1]+p[i-1][j])
it seems they just use pascal's rule to precompute all the n chooses k?
I feel like it is slower than using lucas theorem but I've only ever used nCr_mod from pyrival twice so I'm not sure
so the question is what is the complexity of nCr from pyrival? it should be log_p(n) where p is that prime number?
not with a list
might make more sense to use a dict
@narrow mica that sounds more reasonable đ
It is lucas theorem indeed, and it is O(log_p (n)), with O(M + log p) precalc where M is the largest digit in base-p representations of n and r. For big primes it's just O(1) per call since all it does is computes n!*(1/r!)*(1/(n-r)!) mod p with factorials and inverse factorials precomputed.
So forget about Lucas theorem for big primes, it's just plugging numbers in the formula
Computing pascal triangle is a faster if you need many combinations for small n and r where O(nÂČ) is acceptable
That precalc is way cheaper since it's just additions, while computing factorials requires multiplications and modulos, plus additional math to actually get the answer
Yeah I was just gonna say for most of the problems n and r are smaller than the big prime
So I could just precompute factorials and inverse factorials mod p for that problem
Yes
And not use pyrival function
Wait inverse factorials are just floats?
What do you mean exactly by inverse factorials
Inverse modulo p of n is m such that n * m = 1 mod p
Yes, exactly
Like how comes it « all works out »
n chooses k mod p can be computed using the notion of inverse mod p
I donât know if my question makes sense
it's just how it would work with "regular" division, only you have a different notation of what division means in the modulo world
Thatâs what bothering me
Itâs not exactly regular division
Like 2/4 in the modulo world is 2 right?
In the modulo world 2
That's 0/0
Wait no itâs not
My bad
2/4 in the mod 4 world
Wait that doesnât work
I canât get to 5
đ€
Ah so in my example it doesnât work because some donât have inverses
doesn't exist
Ah modulo a prime
Sorry I didnât see that part in your first message Iâm stupid
yeah, for composite numbers they don't exist for all numbers
Is there an easy proof of that?
Yeah Wikipedia says that
there's fermat's little theorem, which this is a consequence of
I donât know fermatâs little theorem damn
Wikipedia says itâs just a consequence of bĂ©zout identity
Maybe I can understand that proof
Ah yeah the proof using bézout identity is decently easy
For existence at least (if coprimes)
a**p = a (mod p) => a**(p - 1) = 1 (mod p) => a**(-1) == a**(p - 2) (mod p)
using fermat's
for prime p
So coming back to n choose k mod p
Iâm not entirely sure I understand why
Using 1/k! mod p and 1/(n-k)! mod p is âokâ
I know @haughty mountain said itâs just like normal division
division modulo p has all the same properties as division in integers
What property are we using I guess is my question
(a*b)mod p times 1/b mod p = a mod p
Probably?
Ah well
Thatâs trivial to prove
1/n (mod p) is just the multiplicative inverse of n (mod p)
1/3 (mod 5) == 2 (mod 5) for instance
Yeah I got that
I think with this property
Iâm now more convinced
Why I can just do
a/b = k <==> a = b*k
a/b = a*inv(b) = k (mod p) <==> a = b*k (mod p)
it's the usual definition of division
Yep
Ok I see
I see
Weâre only talking about when a is divisible by b though? But thatâs ok n choose k because we know the result will be integer
though you get some niceness like fractions being integers
The only thing left to grasp is how to compute those inverse factorials I guess (so that I can precompute the arrays)
Wikipedia says extended Euclidean algorithm for the inverse
Is that the standard way of doing it?
If I wanna precompute all of them until 5000 for example (my initial problem had this bound)
there is this nice trick for computing all inverses, and the inverse factorial is juse the cumulative product of those
wasn't here at the start, but python pow can do modular inverses with 3-arguments
i dunno if that's useful to you
it can, but it gets expensive if you need all of them
https://mirror.codeforces.com/blog/entry/83075
I imagine thatâs what I should look at
Damn that looks complex
this identity, yes
Yeah Iâm trying to understand
If you just need inverse factorials only just compute 1/n!, and then multiply it by n, n-1, ... to get the rest đ
Ah
Also I guess
But now that Iâve started reading that blogpost
Im curious
What do they mean in the last paragraph after big brain mode
(Yes the only part I understood is the Euclidean divisionâŠ)
What is this 1-(i-1)
Where does it come from
oh 1 - (i - 1) is a very bad way of writing what they mean
Ah from 1
To i-1
Omg
I was like this is negative most of the time what is going on
we have the inverses from 1 through i-1
Ah well
That was pretty easy then
I was getting scared ngl every time I open a blog like that it takes me hours to understand it
Well thank you so much for helping me
I can now finally say I kinda understand the pyrival nCr_mod function
(possibly) a stupid question: It is known that the k-clique problem is NP-hard, so as k-anticlique. How can I prove that problem "given a graph, return either k-clique, or k-anticlique" is NP-complete too? (No other context btw - it's not reduced from anything, it's the problem as is)
The normal k-clique NP-completeness is proved by reducing 3-SAT to it, but I can't quite grasp how to do this here
Here is an explanation that I like for existence/how to compute modular inverse.
Can you guess what 2^-1 (mod 10^9+7) is?
There is a very simple way of figuring out what it is by hand
1/2 is not an integer
but
x=||(10^9+7+1)||/2 is an integer
So the modular inverse of 2 mod 10^9+7 is simply x (see definition above)
You can use this same trick for computing modular inverse in general. Just keep adding 10^9+7 to 1 until you get something that is divisible.
def mod_inverse(a, p):
x = 1
while x % a != 0:
x += p
return x // a
This is not a particularly fast algorithm, but it gives good intuition for when/why modular inverse exists
(it exists iff this while loop terminates)
Iâm not sure I understand
Why is that true
If we do the multiplication ok
But in general
Like if I want 4^-1
Mod 10^9+7
Note that I only use integer division in my explanation
x is an integer
2*x is an integer
And 2*x happens to be one
Mod p
That i understand
But if I want 4^-1
And I do
1
1+p
1+2p
etc
Why would that be useful
Im missing something
This about the reminder of 1+k*p when you divide it by 4
When k=0, it's 1, duh
When k=1, it's, well, something
But you know that p must have remainder of either 1 or 3
1 + 10^9+7 is divisible by 4
So 4^-1 is (1+ 10^9+7)/4
(oh wait, you need euler theorem to show it works đ)
Ah yes
But what about 7 then how do you know it will eventually be divisible
Maybe I am missing something too, but I don't think it's that simple đ€
Yes
Im not sure my brain sees why this is intuitive
But to prove this, you need to use gcd
I've seen mathematicians struggle with computing 2^-1 mod 10^9+7 by hand
That should be within the realm of my understanding then
Well it so happens that 10^9+7+1 is ALSO divisible by 7 ..

This
pajenegod, I'm afraid gcd is not enough, and you need full blown Euler's theorem to show that, which is a harder problem đ
One definition of gcd is
gcd(a,b) = smallest integer that can be written as a linear combination of a and b
From this definition, it follows directly that gcd(a,p)=1 <=> algorithm terminates for a,p
And p is a prime
No p can be anything
I hate this
ÂŽIt follows directlyâ
But then I get it
And it does follow directly
Yes
gcd(1, anything) is 1
Yes so 1+p!=1
Yep
Where does the mod come from
y is the number of iterations that the algorithm will take
a*x = 1 mod p
Yes
Yes exactly
Yes
There is a very well known transform, called Montgomery transform, used for speeding up modular multiplication that essentially uses this algorithm
You know them all itâs frightening
Here is the gist of it.
Suppose x is a really big number, and we want to compute x%p for some kind of small number p
Here is something that almost solves this problem: repeatedly add p to x until the number becomes divisible by 2^64
Then divide x by 2^64
Now x is about 64 bits smaller (and it is also off by a factor of 2^64)
Then repeat this entire process until x has same size as p
Why is just doing x%p not good
Well checking divisibility by 2^64
Canât be done just with bit shift
You have to check 64 positions
Or are we talking about really really really big numbers?
Wait 2^64 is freaking huge
Already
You do this step fast using modular inverse (of p^-1 mod 2^64)
So no need for any "checking"
Btw something important to note, for most of this, only the smallest 64 bits matter
So it doesn't matter that x is huge
This is why this method is so useful in practice
If you have p^-1 mod 2^64
Why do you know how many times you add this to x
Is x + p^-1 mod 2^64 0? Mod 2^64
Wouldnât that change the mod p answer?
Like 100%8 surely isnât the same as 100/2 %8
Yes. That is why this doesn't exactly solve the problem
Ah okok
It was just a glimpse into where the algo fits
You need to compensate afterwards I assume
Yes. People typically compensate before with a transform. The "Montgomery transform of x mod p" is just x*2^32%p
If you multiply two transformed numbers and run the algorithm 1 iteration
You get x*y%p since the 2^32s cancels with the 2^-64 from the "fast mod"
Ah so first you bitshift 32 your x and your y then you find out the modulo of both
But you have two 2^-64
No?
Almost
You instead compute x*2^32%p and y*2^32%p to keep the numbers small
But
When computing the first youâre « off 2^64 »
And the second youâre « off 2^64 » too?
Uhm I'm a bit confused
transform(x) = x*2^32%p
You use the algorithm we described
And youâre off by 2^64?
Yes fast_mod(x)= x*2^-64%p
So when you do fast_mod(x*2^32)
And fast_mod(y*2^32)
You get two of those 2^-64?
Thatâs the part that got me confused
And you only compensated for one
With your 2 2^32s
fast_mod(transform(x) * transform(y)) = x*y%p
Ah ok
transform(x) and transform(y) are huge
Ah ok
So itâs really not a stretch when people say mod really really slow
If people go to those lengths to avoid it
đ
Btw here is one final question, how do you think transform(x) is implemented?
Hint:it doesn't use mod
Bitshift?
Using mod would be slow
Nope
Ah no you need to mod after bit shifting