#algos-and-data-structs
1 messages · Page 120 of 1
I must resist the urge to start arguing on why C++ would be a terrible first language.
Great first language. Or just go with Java.
How do you learn C once you've been using Python for so long?
is anyone free to help me out with a mcts algorithm im working on?
Should be easy to pick up the basics if you've been using python for long. Learn how data types are used, how loops are used, etc. Or just create a project and use that to learn.
in queries, what does <> mean?
i assume this should have been in #databases, and i assume you are asking about sql. <> in sql means "not equal", like != in python.
Hello,
How can you convert JSON B to JSON A in encoding in Python?
JSON A
{\"version\":10000,\"archive3dm\":60,\"opennurbs\":-1937213195,\"data\":\"+n8CAMA4AAAAAAAA+/8CABQAAAAAAAAAQ8Jv8CqjCEad2KfSxM4qNij+rDr8/wIAiDgAAAAAAAAzAIAAQEETAAAAAAAAEBgAAAABAAAA+n8CAL0AAAAAAAAA+/8CABQAAAAAAAAAGRGvXlEL1BG//gAQgwEi8EoaeRf8/...
JSON B
{"version": 10000, "archive3dm": 60, "opennurbs": -1910998424, "data": "+n8CAE8qHwAAAAAA+/8CABQAAAAAAAAAxdu1YGDm0xG/5AAQgwEi8JhRycz8/wIAFyofAAAAAAAzAIAAQLFHBwAAAAAAEAwJAAABAAAA+n8CAL4AAAAAAAAA+/8CABQAAAAAAAAA3dTXTkfp0xG/5QAQgwEi8BDyeHD8/wIAhgAAAAAAAAARAgAAAAAAAAACAAAAAgAAAAAAAAAAAAAAAAAAAAAA8D8AAAAAAAAAAAAAAAAAAAAAAAAAAAAA8L8AAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAxcfMiErVEACAAAAAAAAAAAAAAAAAAAAAAAAAAAAHzIhK1RAAAAAAAAA4D0ACWqW+f9/AoAAAAAAAAAAAAEAAAD6fwIAvgAAAAAAAAD7/wIAFAAAAAAAAADd1NdOR+nTEb/lABCDASLwEPJ4cPz/...
the first one looks like a JSON string, while the latter is contents of this string, basically.
anyway, just json.loads it
@spice owl what is the context for this question? i agree that these look like 2 different representations of the same thing, but i've seen enough cursed mishandling of json data in the past to be cautious of making a recommendation without more information.
@loud trail @vocal gorge I am getting Newsoft JSON PARSE on my back-end server when trying to send JSON B
Then found the JSON A example, and figured I needed to encode it.
Error on back-end
How are you sending it?
Why is the len of a python list or tuple in constant time
res = requests.post(compute_url, json=payload)
Got it
i assume it's because the PyListObject itself tracks its own length
Yes. That's why
essentially, because tuples and lists are arrays (a pointer to a place in memory and a length), not something like linked lists
(though I don't believe it's, like, in the language specification?.. so a sufficiently crazy Python implementation could use linked lists for lists)
reimplement python in common lisp
the more i think about it, the less bad of an idea it seems
back python lists and dicts directly with common lisp dicts and hash tables
I think python is sufficiently optimised enough.
if anything it's de-optimizing the python implementation and letting the lisp compiler take care of it
idris 2 i believe is the same way, the chez backend is so damn good that they don't need to do a lot of optimizing in the idris compiler itself
anyway, here:
https://github.com/python/cpython/blob/bb3e0c240bc60fe08d332ff5955d54197f79751c/Objects/listobject.c#L197-L206
https://github.com/python/cpython/blob/304dfec8d3c0763734ea8b5fa2af1d9e1ce69ffa/Include/object.h#L140
ultimately PyList_Size uses the Py_SIZE macro which just gets the ob_size attribute
Objects/listobject.c lines 197 to 206
Py_ssize_t
PyList_Size(PyObject *op)
{
if (!PyList_Check(op)) {
PyErr_BadInternalCall();
return -1;
}
else
return Py_SIZE(op);
}```
`Include/object.h` line 140
```h
#define Py_SIZE(ob) (_PyVarObject_CAST(ob)->ob_size)```
Never even heard of idris 2 before
https://stackoverflow.com/a/54323374/2954547 someone on stackoverflow already figured it out, but i had fun finding it
I was going through https://wiki.python.org/moin/TimeComplexity of Lists in python. I know python list internally works as an Array. But how internally length of list is O(1). As it needs to traverse
like haskell, but nicer (imo) and simpler, and with dependent types built into the language
I dream of the day when I can write my own language with a compiler etc.
I really love looking at internal things
maybe SICP is a good book for you
I'll look at it. Thanks for the recommendation.
huh. What language could they have come from where arrays without a size are common?..
C?
Maybe he only knew of dynamic arrays
C has array size limits
doesn't C strlen() just walk the array until it hits a nul byte?
yeah, C strings are null-terminated, but isn't that only a string thing?
I think so. I know a string in python also does the same
do... they use null-terminated arrays too? that sounds basically impossible, since in non-strings you can have zeros
They use null terminated for the str class
that i'm not sure about
I don't think so, len is constant-time for strings.
Objects/unicodeobject.c lines 132 to 139
#define PyUnicode_WSTR_LENGTH(op) \
(PyUnicode_IS_COMPACT_ASCII(op) ? \
((PyASCIIObject*)op)->length : \
((PyCompactUnicodeObject*)op)->wstr_length)
#define _PyUnicode_WSTR_LENGTH(op) \
(((PyCompactUnicodeObject*)(op))->wstr_length)
#define _PyUnicode_LENGTH(op) \
(((PyASCIIObject *)(op))->length)```
Objects/unicodeobject.c lines 147 to 149
#define _PyUnicode_GET_LENGTH(op) \
(assert(_PyUnicode_CHECK(op)), \
((PyASCIIObject *)(op))->length)```
!e
Also yes, you can have \0 bytes
print(repr("Here's a NUL: \0"))
@austere sparrow :white_check_mark: Your eval job has completed with return code 0.
"Here's a NUL: \x00"
although probably more on topic in #internals-and-peps
As expected, we see that inserting at the beginning of a list is most expensive, requiring linear time per operation. Inserting at the middle requires about half the time as inserting at the beginning, yet is still Ω(n) time. Inserting at the end displays O(1) behaviour, akin to append.
Python knows where the array starts from, so I'm confused as to why it would be linear time to insert at the beginning.
It needs to move all the existing values forward by 1
All the values that come after the inserted value
Yes that makes sense.
So the less it has to move the less time
Yeah
But it’s always O(n) except if you’re inserting at the end, which is amortized O(1)
I mean I guess always inserting within a specific range from the end would technically be O(1)
But yeah you get the idea
Kinda makes me wonder why Python doesn't implement lists like a circular queue array https://www.geeksforgeeks.org/circular-queue-set-1-introduction-array-implementation/ for O(1) insert at start and half the shifts to do in general 
But there are probably drawbacks I don't realize 
do those have constant-time lookups too?
sup
Does anyone here have opinions on what metaheuristics package to use in python?
If it’s done using an array then it should
It’s an interesting question! This is the best answer I could find https://stackoverflow.com/a/56409211
Yeah, I guess you have to add an offset to every index call, but that doesn't seem like that expensive of an operation. 
The trade-off makes sense I suppose since indexing is way way more common than inserting 
If you implement a ring buffer for fast insertion, is that meaning you need pointers for each block? that's extra memory overhead too
i've only implemented constant size ring buffer, so that it was just using a list internally, you could "insert" at either end, but not arbitrarily -- though i guess you could just occasionally shrink or grow it, so that you don't need to implement as a linked list
yeah, you can double (or whatever factor the dynamic array uses) the size when the ring buffer fills up and still have O(1) amortized appends, but from either side
Anyone know any good books on the practical side (CP, practice questions, etc.) of data structures (and maybe algorithms)?
(pls ping me when replying lol)
Hi, I am trying to create a custom Dictionary for blob and in memory object storage
How do I overload .keys() dictionary method?
I don't think you can "overload" funcs in python
You can override them
You just have to define it again, it'll override it by default
ok thanks
class BaseClass:
def some_func(self):
return 0
class Class(BaseClass):
def some_func(self, var):
return var
Yo reptile
yeah, you don't need any special keyword (like Java's @Override) to override a parent function in Python
Do you know any good books?
well, CLRS isn't on the practical side, so not really
Hmmm
Ideally take up a project for practical, and just incorporate everything
Ah, but that's not what they want lol. The person who asked me wanted to practice dsa questions or smth
Hackerrank has topic specific sections
I am overriding keys(), but that return dict_keys object type, how can I do the same?
`
def init(self, tablename="data"):
super().init()
databaseProvider = SqliteDict(self.dbpath, tablename=tablename,
encode=self.my_encode, decode=self.my_decode)
def __getitem__(self, key):
if key in self.keys():
return Dictionary.__getitem__(self, key)
elif key in self.databaseProvider.keys():
return self.databaseProvider[key]
else:
raise KeyError
def __setitem__(self, key, value):
Dictionary.__setitem__(self, key, value)
self.databaseProvider[key] = value
def keys(self):
return Dictionary.keys().extend()`
hmm, I'm not sure there's a builtin way
you can return an object that behaves the same way, though
according to https://docs.python.org/3/library/collections.abc.html, a KeysView is a Set, basically.
so, just a set would be fine to return.
def keys(self):
keys = set(Dictionary.keys())
keys.update(self.databaseProvider.keys())
return keys
Is there a more elegant way to do this?
That seems like a nice way.
Hi, I am using joblib for a parallel loop, but antimalware executable service is not allowing it to run properly
!e
left = {'a': 1, 'b': 2}
right = {'c': 3, 'd': 4}
result = {*left, *right}
print(result)
@brave oak :white_check_mark: Your eval job has completed with return code 0.
{'d', 'c', 'b', 'a'}
...assuming those are dicts, of course
key views have an __or__ method:
In [1]: a = {'a': 1, 'b': 2}; b = {'c': 3, 'd': 4}
In [2]: a.keys() | b.keys()
Out[2]: {'a', 'b', 'c', 'd'}
Super
i just looked at your earlier post and it looks like you're just implementing a ChainMap?
In [2]: from collections import ChainMap
In [3]: a = {'a': 1, 'b': 2}; b = {'c': 3, 'd': 4}
In [4]: c = ChainMap(a, b)
In [5]: c['a']
Out[5]: 1
In [6]: c['d']
Out[6]: 4
In [7]: c['e']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
What is the difference between chainmap(a,b) and a.update(b)?
chainmaps keep the underlying dictionarys separate
any update to a chainmap only affects the first dictionary (a in this case)
it will look for keys in the first dict and then continue to the rest
ohh but my requirement is totally different, a is in memory lru cache and b is sqldict
I will load key from db to lru cache, do concurrent operation and then update in db
'''
'''
'''
class RepositoryDictionary(Dictionary):
dbpath = os.path.join(os.path.abspath(
os.path.dirname(__file__)), 'data', "app.sqlite")
def my_encode(self, obj):
return sqlite3.Binary(zlib.compress(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)))
def my_decode(self, obj):
return pickle.loads(zlib.decompress(bytes(obj)))
def __init__(self, tablename="data"):
super().__init__()
self.databaseProvider = SqliteDict(self.dbpath, tablename=tablename,
encode=self.my_encode, decode=self.my_decode)
```
def getitem(self, key):
if key in self.memkeys():
return Dictionary.getitem(self, key)
elif key in self.dbkeys():
Dictionary.setitem(self, key, self.databaseProvider[key])
return Dictionary.getitem(self, key)
else:
raise KeyError
def __setitem__(self, key, value):
Dictionary.__setitem__(self, key, value)
self.databaseProvider[key] = value
self.databaseProvider.commit()
def memkeys(self):
return self.keys()
def dbkeys(self):
return self.databaseProvider.keys()
def keystore(self):
return self.memkeys() | self.dbkeys()
i see, there's a very nice method you can overload __missing__ that will probably simplify your implementation
I think instead of getitem I can implement missing
but when it is missing I want to load in lru and then return from it, instead of direct return from db
@stable pecan Will missing load it into the lru cache automatically?
if you implement it to
okay, thanks
something like this:
In [17]: class MyDict(ChainMap):
...: def __init__(self, database):
...: super().__init__({}, database)
...:
...: def __missing__(self, key):
...: try:
...: self[key] = self.maps[1][key]
...: return self[key]
...: except: # Whatever error database would raise
...: raise KeyError from None
...:
...: def __setitem__(self, key, value):
...: self.maps[1][key] = value
...: self.maps[1].commit()
...: self.maps[0][key] = value
might work
@stable pecan Hahaha
def __setitem__(self, key, value):
...: self.maps[1][key] = value
...: self.maps[1].commit()
...: self[key] = value
This is a infinite loop
Already fixed, Thanks
Can some one help me on how to tackle problems involving substrings?
I always ends up using loops which results in TLE
what's the most memory efficient data structure for caching in python
...a dict? I mean, you need quick lookup too.
there are some stuff like BTreeMaps which are more memory-efficient, but Python doesn't have a builtin for that, though there's probably a library
in my case yes i need quick lookups but memory is my first priority
looks like there's https://btrees.readthedocs.io/en/latest/overview.html that implements a BTree
it's not a very popular library, though
i have some trees implemented
thanks! lemme look into it @vocal gorge
Make sure you actually measure whether you get a memory decrease compared to dicts.
sure
but they weren't implemented with saving as much memory as possible
trees use a lot of pointers
https://github.com/salt-die/sacks have an AVLTree that can be remade into a mapping pretty easily (just by adding values to the nodes)
@fathom prairie I believe it mostly depends on what type of data you want to store
{string: {int: python object}}
there can be multiple keys not just one
some can be up to 10^5
@fathom prairie Is it a dictionary of dictionary?
yes
nested dict
hey any binary heap (min/max heap ) folks here
I feel like i understand the time complexity for everything here except the "average insert" wouldnt on average the value would be somewhere in the middle of the tree so it requires like log N / 2 time? so wouldnt that round to Log N time?
i understand that worst case is log n, n being the height of the tree, but on average an insert would be less then half the values so would need to go up some part of the tree
maybe even log N /4 or some other factor
which algos should I learn first excep binary search al.
Right but keep applying this logic. At each level, the bottom is half of the total nodes. So you could say each level you go up, the probability of the value being above is halved.
read a data structures book @raven dirge it will explore time/space complexity (big O notation) and give you a preview with examples of many of the commmon ones
@half lotus so its like log(n) over log (n) because each step you take is half of the chance to find a medium value?
which equates to ~ O(1)
ok thanks
any books you recommend
As the level increases you need more operations, but the chance of it being at a higher level decreases
So it converges to a constant value
im not saying you can find it free, but a google search wont hurt. https://www.amazon.com/Structures-Algorithms-Python-Michael-Goodrich/dp/1118290275
Data Structures and Algorithms in Python [Goodrich, Michael T., Tamassia, Roberto, Goldwasser, Michael H.] on Amazon.com. FREE shipping on qualifying offers. Data Structures and Algorithms in Python
thanksss
im not saying you can find it free, but a google search wont hurt.
certainly not saying that it's true for every book, either
can someone help me with this code
Hey @bright cedar!
Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:
• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)
• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:
!e
print("yay") if True else print("Not yay")
print("yay") if False else print("Not yay")
@unborn sundial :white_check_mark: Your eval job has completed with return code 0.
001 | yay
002 | Not yay
i think i may have just made the worst selection sort ever
def minimaxi(n1,n2):
if n1 > n2:
return n2, n1
return n1, n2
def selectionSort(arr):
temp = arr
sortedArr = [0]*len(arr)
for j in range(len(arr)):
for i in range(len(temp)):
temp[i-1], temp[i] = minimaxi(temp[i-1], temp[i])
sortedArr[j] = temp.pop(0)
return sortedArr
two for loops feels inefficient as hell
What's the point of temp = arr?
oh my god lmao
i think i was messing around with ideas before, and that was left over. 10/10 best practice programming
thats better, still not happy about using 2 loops
sortedArr = [0]*len(arr)
for j in range(len(arr)):
for i in range(len(arr)):
arr[i-1], arr[i] = minimaxi(arr[i-1], arr[i])
sortedArr[j] = arr.pop(0)
return sortedArr
wait isnt selection sort O(n^2) anyway? or is that insertion sort, i get the two confused
why did you make a local variable with a list of 0 in it? like you are affecting the space in the memory with that
it is
both and bubble also
oh i can use append cant i...
i really need to brush up on my python, been spending too much time with ts
you can sort it in place without making a list
I have a question asking for a solution to a cube that rotates and the blocks within fall to the base.
It's essentially a matrix, with values inside that are blocks, and barriers that are fixed. The block are supposed to fall to the lowest position when rotated.
how would I approach this problem, I'm very lost
just use 2 loops and with the indexes you are replacing them at the end of the outer loop
i did say it might be the worst selection sort ever
on this site you can find examples and much more
thanks
Hi All, I am looking for a solution for my question: How to create a group by considering the 'identical consecutive groupings in a column (in Pandas DataFrame) : (dont know how to insert here my code properly, if someone will tell me, I'll edit it):
from itertools import groupby
test_list = ['AA', 'AA', 'BB', 'CC', 'DD', 'DD', 'DD', 'AA', 'BB', 'EE', 'CC']
data = pd.DataFrame(test_list)
data['batches'] = ['1','1','2','3','4','4','4','5','6','7','8'] # this is the goal to reach
print(data)
result = [list(y) for x, y in groupby(test_list)]
print(result)
[['AA', 'AA'], ['BB'], ['CC'], ['DD', 'DD', 'DD'], ['AA'], ['BB'], ['EE'], ['CC']] So, I have a DataFrame with two columns: the first is a list of elements that must be kept in order + grouped into batches: identical consecutive grouping. The batch column where the result should be assigned.
I couldn't find a solution or a workaround. As you can see, I've created a list using the itertools groupby function by grouping the same cons. items, but this isn't the final result I'd like to see. (result I'd like to see):
| list | batches |
|---|---|
| AA | 1 |
| AA | 1 |
| BB | 2 |
| CC | 3 |
| DD | 4 |
| DD | 4 |
| DD | 4 |
| AA | 5 |
| BB | 6 |
| EE | 7 |
| CC | 8 |
Which statement is false?
Adjacency list representation is better than adjacency matrix representation for directed graphs
Adjacency list representation is better than adjacency matrix representation for undirected graphs
Adjacency list representation is better than adjacency matrix representation for weighted graphs
Adjacency matrix representation is better than adjacency list representation for graphs
is this B?
@scenic flint
nope B is true
Maybe c cause a list would have to store a tuple of the vertex and weight. A matrix could just use the weight instead of putting a 1.
That being said this question is vague. What does "better" mean? Better on what basis?
It has some advantages. There's a good so post that goes over or, hold on
It is fast to lookup and check for presence or absence of a specific edge
between any two nodes O(1)
ah, that's true
wait, actually no
one could use an adjacency dict and then that's also O(1)
yeah
Oh like a dict that stores a set
though one could also just use a dict of from,to tuples for the entire thing
or a dict mapping from to a set of to
either way, O(1) lookup and addition
only problem is memory, which is asymptotically good (just O(E)) but not very good in absolute terms (dicts aren't the most efficient, they waste what, two thirds their memory or so?)
so im coding in python from last 8 months and ik flask and fastapi and now i wanted to learn dsa (daily 2 hr side by side) with learn django in which lang should i learn dsa , my friends said cpp but i hate it and it's syntax is kinda frustating and hard what should i do
data structures and algorithms apply to all languages
learning them in Python is fine and can be pretty elegant
Java might also be a decent choice if you want to learn a new language too

it depends, really
C could be a good language
if you wanted to get into the low-level details
otherwise, Python is fine
.
So if we have the following:
nums = [1, 2, 3, 4, 5]
And we have two for loops. Is that O(n^2) time complexity?
.
.
.
And if we have list_n and list_m and we traverse through them both 1 by 1 so:
for num in list_n:
...
for num in list_m:
...
then to time complexity for this would be O(n + m)?
.
.
.
And if we have a list and a dictionary so something like:
dictionary = {}
for index, num in enumerate(list_n):
dictionary[num] = index
for num in dictionary:
...
Is the time complexity for this also O(n + m)?
two for loops
You mean, nested loops?
Yes, nested for loop that has to compare 1 element with every element in the list and we do that comparison for each element in the list/array.
Yes, that's quadratic complexity.
What would m be here?
m would be dictionary which I supposed could also be n because we traversed over that list to store elements in the dictionary.
When would the size of the dictionary be different from the size of the list?
because if we traverse over the same list twice, that's still considered O(n).
If/when we have duplicate elements
Well, yes, if the list contains 1000 copies of the same element, they will be different.
But since 0 <= m <= n, m is always on the same order as n, so that would be just O(n)
Oh so this could be O(n + m) if there are 1000 copies of the same element? or this could be O(n) if elements are different?:
dictionary = {}
for index, num in enumerate(list_n):
dictionary[num] = index
for num in dictionary:
...
Should this be m >= 1 instead of m >= 0? Because there will at least be 1 element that will get stored even if it has copies of the same element?
that'd be O(n+m) where n is the length of the list and m is the number of unique elements in the list
since 0 < m <= n: O(n+m) == O(2n) == O(n)
Since m > 0 and n >= m:
so if m is 100, n is 100 supposing each element is unique:
O(100 + 100) => O(2(100)) => O(100).
lol we are probably not supposed to do what I did by using 100, but just trying to get a better picture.
it doesn't make any sense to talk about O(100) like that
O(n) is a class of functions
based on how fast they grow with respect to their input
the behavior of a particular function at a particular point is irrelevant
Yeah I agree it doesnt make sense to use 100. But I actually see how O(n + m) transforms into 0(2n)
So I understand O(n+m) becomes O(2n)
But I am trying to understand how does O(2n) becomes O(n) ?
because O(n) is notation for the class of functions which grow linearly w.r.t. their input
So if we traversed through a list 3 or 4 times, it's still considered O(n). And not O(3n) or O(4n).
yes, or to put it another way, O(n) = O(2n) = O(3n) because they are all the same class of functions
Say your time function is T(n) = 2n, then T(kn) = 2kn
you multiplied your input by k, and the output only scaled up by k
linear!
Now consider the time function T(n) = n^2, so T(kn) = (kn)^2 = (k^2)(n^2)
you multiplied your function by k and the output was scaled up by k^2
not linear!
That's why we care so much about this in CS, if you increase the size of your input a little bit and it leads to a large increase in the time it takes your function to run, that's a problem
I see what you mean if we have 100 or 500 elements in a list and we traverse through it 2 or 3 times, we still call it O(n) because it's linear. It would take 5x time at 500 elements compared 100 elements
But when we have K^2. Time can increase exponentially by each additional element
So O(n), O(2n), O(3n) end up being O(n) because time increase is linear.
That was helpful thanks!!
one nitpick: 2^n is exponential, n^2 is quadratic
exponential is way way worse
oh thanks for highlighting that! Quadratic and exponential are two different time complexities, with O(2^n) being worst.
Is my implementation linear?
def linear_search(list, target_value_to_find):
for index, value in enumerate(list):
if value == target_value_to_find:
return f'Found your target value at index {index}'
return -1
print(linear_search([1, 2, 4, 67, 92, 56], 92))
yes
Thanks
@brave oak
what
Trick question: what is the space completixy of the function?
why?
for loop+ traversing through the list?
Do you understand the difference between time complexity and space complexity?
oh not yet, still learning
nested loops = n*n, loop -> n
two different loop n+m
wrong answer?
oh just now noticed you asked space complexity
my bad
Is it ||O(n) since we only store the list once|| ?
It's O(n)
Every variable uses the same size, so the same space. The loop would make it linear
When computing space complexity, we usually talk about auxillary space occupied.
The time complexity describes for how much longer the program will run when you increase the input.
The space complexity describes how much more memory the program will occupy when you increase the input, or rather, how much additional (auxiliary) memory it will need. For example, here: py def add_squares(numbers): total = 0 for number in numbers: total += number**2 return number if the size of the list is N, the time complexity is O(N), because you're traversing the list once. However, the space complexity here is O(1) -- you're only storing a constant amount of data. However, here: ```py
def add_squares(numbers):
squares = [x**2 for x in numbers]
return sum(squares)
Actually it's Space Complexity = Auxiliary space + Space use by input values
Thank you!!
the function doesn't recieve a full copy of the data though, only a reference so I don't think the size of the input is even O(n) space
I think different sources use different terminology, but I don't think including the input size makes a lot of sense
especially in Python, when passing an argument to a function never copies memory
Join takes an iterable and then joins it to a string. Would join be creating a new string for each element in the iterable? Does this increase the time complexity of join?
The practical answer is that this py def linear_search(list, target_value_to_find): for index, value in enumerate(list): if value == target_value_to_find: return index return None uses O(1) of auxiliary memory, but the pedantic answer is that it uses O(log(n)) of auxiliary memory, because the larger the index gets, the more space it occupies. But this will only matter if you have an unreasonable (~2**64) number of elements
O(1)? Is it because all elements of the list are ints?
It doesn't matter what elements are in the list, because you're not copying them (value will refer to the same object that's stored in the list)
I'll say the truth. I still don't really understand space complexity.

well, it's sometimes tricky 🙂
I mainly didn't really learn it because I heard they don't ask about it
Join creates a giant string in-place, without creating intermediate once, something like this:
def NOT_join(iterable):
result = ""
for s in iterable:
result = result + s
return result
def join(iterable):
result_chars = []
for s in iterable:
result_chars.extend(chars_of(s))
return string_from_chars(result)
That makes more sense. So basically join creates something like a dynamic array and then finally returns it when it finishes going through the iterable.
yeah
You can see the CPython implementation: https://github.com/python/cpython/blob/main/Objects/unicodeobject.c#L10014-L10208
it's, uh... complicated, as always
😂 Thanks for your help as always
I'm trying to draw a triangle using numpy. I found this formula to draw an n-sided polygon, but I don't know how to apply the formula in the code, so how do I apply this formula in numpy to draw a triangle?
https://stackoverflow.com/a/7198179/14243840
c+ python? = cpython?
yes
cpython is the open source python written in c
nice, hearing for first time
"standard" python from python.org is cpython
Hi everyone
Could any one explain me how to set the time limit for user input in using python3.9.
Open a help channel. This channel is for data structures and algorithms.
.
Should I learn insertion sort or I should skip it?
Hello. Someone could help me with a question? Please. I have 3 lists, products category list:[food, beverages, appetizers], for each categories, a products list:[chicken, meats, rice], [wine, beers, juices], [skewers, rolls, tartlets] and products stock list for each one, [12,34,10] [12,66,15],[6,4,5] I need to associate them. I could use: for categoría in categoria_products:
products_granos=categoria[0]?
Or what do you suggest me? Please. I am new.
Excuse my English
Do people read the banner 
Hello. What's the best way of converting a list to a dict and vise versa (depending on the number of occurences)?
Example:
# This list
["apple", "apple", "banana", "mango", "mango", "mango", "banana", "peach"]
# Would be converted into this dict
{
"apple": 2,
"banana": 2,
"mango": 3,
"peach": 1
}
# And vise versa
I know how to do it (using a loop) but I feel like I can do it another way so it's more efficient or uses a more dedicated function
look at the Counter data structure in the collections module of the stdlib
Is next(iter(some_dict)) and next(iter(some_set)) technically O(n), at least in the worst case, since there’s a possibility it will have to look over a bunch of empty space?
O(n) for the set because everything might have been placed in the further half of the array, and O(n) for the dict since maybe all the key-value pairs from the front were deleted without deleting enough to trigger a resize
you mean, like, when you have tons of collisions and you deleted a lot of keys recently?
oh, right, next, just when you deleted lots of keys recently. hmm
I was thinking just about where everything gets hashed to in the case of the set, I wasn’t thinking collisions would affect anything
Yeah
Where everything gets places if you’re using a set, whether you’ve been deleting from the beginning if you’re using a dict
oh, that was easy to check
N =10**5
st = set(range(N))
%timeit next(iter(st))
#183 ns ± 24 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
for i in range(N//10):
st.remove(i)
%timeit next(iter(st))
#12.1 µs ± 189 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
looks like a solid yes
this works very well here because hash of ints is themselves, so sets of ints are sorted
I wonder if that’s still O(n) in the average case too with sets
Or if it’s different in the average case
But anyways, thanks for testing that
But yeah, you have to overallocate more and more extra space as the set increases in size. So it seems like the average amount of empty space you have to iterate over would increase too
Same with dicts, assuming you’re randomly deleting from them a random number of times. It seems like that would be O(n) in relation to the size of the dict after deleting from it
what's %timeit ?
timeit is a python module which helps you time how long python code takes to run
Thanks. Never seen it before. I usually do
start_time = time.time() time.time() - start_time
It basically does that but a lot of times, a million by default, to get an even better measurement
what's the module name?
Thanks
I saw something on an MIT video. Use binary search in the insertion sort algorithm to improve the time complexity. I wonder if that still makes it insertion or really improves the complexity.
You can use a binary search on the sorted portion to find where to insert it faster, yeah.
But then you still need to swap a lot so it's O(n^2) still
or now I'm not sure but this talks about it https://stackoverflow.com/a/18022248/10241514
it would probably be O(n log n) if you were able to insert in constant time
but alas
Ok, I’ve been thinking about this, and now I’m wondering if my ideas are right:
next(set_iterator) is always average O(n) because the bigger the set is, the more empty spaces it will need to look over to find the next non-empty index.
next(dict_iterator)is average O(1) if you aren’t deleting from the dict because it won’t have to look over any empty spaces, but if you are deleting from the dict then it’s O(n) in relation to the current size of the dict.
next(OrderedDict_iterator) is always O(1) because it uses nodes for iteration
Is any of that not true?
I don't think next(set_iterator) is O(n).
if it were, then iterating a whole set would be O(n**2), which it isn't.
@fervent saddle it seems that sets don't shrink, so I guess something like this could happen with a lot of deletes.
Yeah it does happen like that, you’re right: #algos-and-data-structs message
If you specifically remove from the beginning then it makes it take longer. But if you remove randomly then it doesn’t seem to make it take longer. So I guess it makes it O(n) in the worst case and O(1) on average.
But I don’t get why it doesn’t still increase without deletions
It has it set up so there’s the same average amount of empty space between non-empty indexes regardless of the size of the set if you haven’t been deleting from it?
Nevermind, I took a few minutes to think about it, and it seems really obvious now
It tries to keep the set around 50% full or something like that, so that means no matter what size the set is, each individual index should always have the same chance of not being empty
So it will always have to look over the same number of indexes on average
I've been told recently that json is a bad database and you should use something else. Is this true? and if it is, what would you say is the best database?
json is not a database, it's a file format. there is no "best" database, but for a lot of hobby applications sqlite is a good choice.
so there's your answer: json is a bad database because it's not a database.
Hello, I was trying to define circularLinked list data type and wrote insert, insertfromlast, delete, size, rotate functions successfully. I initiated the class as ```class circularNode:
def init(self, v = None):
self.value = v
self.Next = None
class circularLinkedList:
def init(self, l = []):
self.head = None
self.tail = None
for x in reversed(l):
self.insert(x) But there is some trouble defining the reverse and map functionsdef map(self,f):
if self.size() == 1:
return circularLinkedList([f(self.head)])
x = self.head
for i in range(self.size()):
x = circularNode(f(x))
x = x.Next
def reverse(self):
if self.size() == 1:
return self
if self.size() == 2:
return self.rotate()
elif self.size() > 2:
Z = self.head
X = self.delete().reverse()
X.insert_last(Z)
return X``` This is how I defined the two, but am getting error in both cases. Any help is highly appreciated. Thanks!
ghost ping?
https://www.codechef.com/ZCOPRAC/problems/ZCO12002
https://www.codechef.com/viewsolution/47855929 ---> only 2nd subtask 4th case is wrong i am not able to find why it is. Can anybody help?
Is there a way to implement a faster the way ? An alternatives of the loop ? I want to add element in the trackContainer without passing through loops. stream.Part() is an object in which I want to add notes... And if I let as it is adding 1000 notes will take approximately 1 minute
loops are pretty basic to the hardware
the hardware performs loops, not sure how you can bypass this
I mean a cpu doing a quadratic instead of a loop could be created
but it is for a narrow and specialised application, not general purpose
alternative to for loop would be while loop and recursion, and speed for all three is roughly the same.
I dont know if you can use multithreading and break it into chunks of 2 or 4?
so in my research about chess I came across https://www.chessprogramming.org/Transposition_Table, from my understanding it's used for ai/engines to reduce the amount of time required to find the best move in a pre-analysed position but would it be useful to avoid calculating all possible moves (for app generation not engine) as well or would the time/complexity of the lookup not be worth it?
recursion will be significantly slower in python @dapper sapphire , function calls are expensive
also limited recursion stack size
but in this case it probably won't matter if the operation is some slow i/o
I looked up to see in what other languages recursion is expensive, and wow recursion is also expensive in Java and C. Thanks for pointing that out! Especially limited recursion stack..
@minor night I need help in coding a good code.
It’s expensive in C unless it uses tail-end optimization
And then it just uses a jump back to the start of the function instead of calling it again.
guys please recommend me any nice course for dsa in python
Ok, but I am more experienced with #user-interfaces so I can't really help you with data structure but if you need help in anything else then tag me.
So in python even tail recursion is expensive?
Python doesn’t do tail-end optimization
Also, can someone help me with this? I keep getting None.
def recursive_binary_search(array, item_to_find, start=0, end=None):
if end is None:
end = len(array) - 1
if end < start:
return -1
midpoint = (start + end) // 2
if array[midpoint] == item_to_find:
return midpoint
elif array[midpoint] > item_to_find:
recursive_binary_search(array, item_to_find, start, midpoint - 1)
else:
recursive_binary_search(array, item_to_find, midpoint + 1, end)
tail recursion is returning immediately after the recursion call
How is that not optimised?
You are not returning in your recursive calls
It’s when instead of actually calling the function again, it just compiles it with a jump back to the start of the function
I return at midpoint == item to find
Or?
Python doesn’t do anything like that
??
Wait are we talking about the same thing?
In response to this
It doesn't return immediately after the return call?
if I do return recursive function
Immediately after the function gives it's answer it returns.
I just found it. Thanks
No, python doesn’t do anything to optimize recursive functions. It won’t optimize away recursive function calls
So yeah, if a part of your code calls a function recursively, it will always have to go back up the call stack
I totally forgot promise.
Cause the previous functions would need those answers
errmmmm
ok
If I have something like
for i in range(n):
# inner loop running in log(i)
What is the overall time complexity?
i'd say it's log(1) + log(2) + ... + log(n-1) = log((n-1)*(n-2)...2*1) = log(n!)
is O(log(n!)) a thing?
Can I get facts for this? I know I might go into an interview and someone might say do a recursive function. I want to be able to say no 😂 
O(n2). Any inner loop is quadratic.
huh
that's not necessarily true, if you have a log n inner loop, it becomes n log n, not n^2
http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html
http://neopythonic.blogspot.com/2009/04/final-words-on-tail-calls.html
Here’s some blog posts by Guido where he says why he’ll never add tail recursion optimization
I recently posted an entry in my Python History blog on the origins of Python's functional features . A side remark about not supporting ta...
A lot of people remarked that in my post on Tail Recursion Elimination I confused tail self-recursion with other tail calls, which proper T...
Isn’t it n*log(n) ?
a SO post does say that n log n provides an upper bound on log(n!)
but i'm not sure why i'd consider the upper bound of log(n!), not log(n!) itself
I think Θ(nlogn) = Θ(log(n!)). Not sure I can be rigorous but think about the terms of log(1) + log(2) + ... + log(n-2) + log(n-1). The latter half of that list of terms will all be within 1 or each other and they are all at least log(n/2). So you have log(n!) is at least (n/2)*log(n/2) which simplifies to Θ(nlogn). 
Is it saying that if you were graphing how long an algorithm takes if you’re very lucky and it takes the least possible time, and you were also graphing how long it takes if it you’re very unlucky and it takes the most possible time, big-omega is the line of the lucky least long time, and big-O is the line of the unlucky most long time, and big-theta is whatever kind of line stays between those two lines?
Or is that still not right?
That's right mostly I guess
Big-Omega is the lower bound so if you get really lucky you still have to spend at least that much time. Big-O is the upper bound - you might have to spend that much time if you get unlucky.
And when Big-O and Big-Omega have the same function then their Big-Theta is that function - bounded on both sides.
I asked “are they just different ways of saying best case and worst case complexity” in general, and some mods said no
#python-discussion message
Hi guys, I have a quick question, if you're doing an algorithm and structure excercise and they ask you to order an unordered list, would be considered "cheating" if I used method like range, min and max ?
Okay, brillian thanks.
No I misread that somehow
lol I thought you were asking if you could straight up call my_list.sort()
yeah sorry nvm
That would sort a list called a, would that be accepted ?
Is the question just “find any way to sort a list without using sorted or list.sort” ?
If it is, then that seems ok
The question doesn't have any constraints of what not to use. But would that be considered an algorithm and would that be acceptable even if they asked me not to use sorted or list.sort
It must say some restrictions. If it just says “sort a list” then nothing is stopping you from using list.sort
I think "no" is the correct answer to that but I'm not entirely clear on it myself 
If you’re following all the restrictions it says, it should be fine
Here's the question: "write a Python program to insert items into a list in sorted order"
It’s probably fine, but if you’re worried about min and range then you could use your own min function and use a while loop instead
But even if they were to say to not use sorted or list.sort(), would the method I use go against the spirit of the question by using other methods/functions
Yes very good point, I didn't think of that. thanks.
But I think it should be fine
!e
def factorial(n, a = 1):
if n <= 0:
return a
return factorial(n - 1, a * n)
print(factorial(2000))
@austere sparrow :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 6, in <module>
003 | File "<string>", line 4, in factorial
004 | File "<string>", line 4, in factorial
005 | File "<string>", line 4, in factorial
006 | [Previous line repeated 995 more times]
007 | File "<string>", line 2, in factorial
008 | RecursionError: maximum recursion depth exceeded in comparison
Well, "python doesn’t do anything to optimize recursive functions" is not quite the right wording.
CPython doesn’t do anything to optimize recursive functions.
Optimizations aren't really part of the language spec.
There's already a binary search in the stdlib https://github.com/python/cpython/blob/3.9/Lib/bisect.py, you can look at it.
Just because CPython doesn't do tail call recursion optimisation or anything, it doesn't mean that recursion isn't a valid strategy to use for tasks its suited to.
yeah, that as well
well, you can always transform recursion into a queue if you're hitting a limit, but I don't think it's going to be faster
Some day I’ll finally stop forgetting to say “CPython” instead of just “python”
well, PyPy doesn't optimize it either, it seems 🙂
If iteration is faster, why worry yourself over recursion?
I'm mainly saying this because I'm not a fan of recursion 😂
Both approaches are proven to have the same computational power, indeed. But solutions for specific problems in with one or the other technique may be more elegant, and easier to understand .
For example, if you wanted to write a program that goes through a big nested tree, and does something to all the elements - say a JSON or HTML file - doing that recursively would be pretty straightforward.
Doing it with iteration can be done, but you'd have to basically manually implement the call stack in a list, and keep track of appending/popping things.
That sort of busywork distracts from what you're actually trying to do, making it harder to understand the code.
It's the same reason Python's for loop being an iterator is nicer than being forced to manually increment a counter.
@last fulcrum recursion is great for recursive data structures
Yes yes. I just thought for example of pythons os module. Os.walk and os.list dir
I've never seen any of those. Maybe because I'm still learning arrays
Well, one is what you mentioned, the file system.
os.walk is an interesting case: it deals with a recursive structure (nested directories), but presents them to you as a linear iteration.
!e
Another example is when you want polymorphism:
import math
Env = dict[str, int]
class Expression:
def evaluate(self, env: Env) -> int:
raise NotImplementedError
class Val(Expression):
def __init__(self, value: int):
self.value = value
def evaluate(self, _: Env):
return self.value
class Var(Expression):
def __init__(self, name: str):
self.name = name
def evaluate(self, env: Env):
return env[self.name]
class Add(Expression):
def __init__(self, *exprs: Expression):
self.exprs = exprs
def evaluate(self, env: Env):
return sum(expr.evaluate(env) for expr in self.exprs)
class Mul(Expression):
def __init__(self, *exprs: Expression):
self.exprs = exprs
def evaluate(self, env: Env):
return math.prod(expr.evaluate(env) for expr in self.exprs)
expr = Add( # 4 + 2x^2 + (-5)(x + y)
Val(4),
Mul(
Val(2),
Var("x"),
Var("x"),
),
Mul(
Val(-5),
Add(Var("x"), Var("y")),
)
)
print(expr.evaluate({"x": 3, "y": 1}))
This is an example of mutual recursion. If you want to add a new type of expression, you don't need to change any other code.
@austere sparrow :white_check_mark: Your eval job has completed with return code 0.
2
another good example of a recursive structure.
Now, technically you can evaluate expressions without recursion recursion. You can make a clever setup with a queue and treat recursion in a stackless way. In other words, replace the hardware stack with your own stack. (Example: stackless Python (if I understand what it's doing correctly), and a toy language I participated in https://github.com/gurkult/py-gurklang)
but it gets pretty complicated, and it's probably slower
I think the polymorphism is orthogonal to the recursion. The recursion is because expressions are a recursive structure. or maybe I'm missing a subtlety
If this didn't use mutual recursion, you couldn't (easily) add a new type of expression
if you have a fixed list of expressions, you can probably hack together some queue mechanism, or just transform it into a list of stack operations
how can i speed up loops?
are there some rules i should keep in mind while performing iterations to make it faster?
how much processing are you doing on each iteration? It might be worth importing async and converting the iterations to coroutines
10 to the power 6 iterations minimum
and not much processing just performing a lookup
changing things into local variables may speed up loops slightly
Can you show the code @fathom prairie ?
sure
one sec
@austere sparrow
data = [...]
idx = 0
for i in range(len(data)):
if i.id == some_int:
idx = i
break
Well, there's nothing in particular you can do to speed it up. Have you tried running it with PyPy?
also, you mean ```py
data = [...]
idx = 0
for i in data:
if i.id == some_int:
idx = i
break
yes right
If you're doing that loop many times, maybe you can use a dict like {id: value} instead of a list?
memory usage
A million-element dict is literally nothing
@fathom prairie According to my quick benchmark, it's about 8 5 megabytes if the integers are not very big.
wait, no
It will be 5 megabytes (5242968 bytes in my case), because the keys will already be part of your objects.
it's about 40mb
?
If you're searching by the id multiple times, then yes, a dict is going to be way better
tiny queston how do you write the code part on discord?
the formatting i mean..
!code
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
@fathom prairie let python do the looping:
result = next((val for val in data if val.id == some_int), None)
i need help with a if else code i made, is this the right channel?
@distant sable maybe a help channel. see #❓|how-to-get-help
ah i see, thank you 🙂
huh, surprising!
In [3]: %paste
data = list(range(1_000_000))
def loop1(data):
for val in data:
if val == 545:
return val
def loop2(data):
result = next((val for val in data if val == 545), None)
return result
## -- End pasted text --
In [4]: %timeit loop1(data)
15 µs ± 282 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [5]: %timeit loop2(data)
15.3 µs ± 538 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
i guess generators aren't any faster than for loops nowadays
iirc they used to be
@loud trail the generator already does the same for loop
right, i thought for had more overhead
but this is why we benchmark, after all
if anything the plain for loop is slightly faster, less overhead from next
In [2]: %%cython
...: def loop3(data: list):
...: for val in data:
...: if val == 545:
...: return val
In [7]: %timeit loop3(data)
3.06 µs ± 237 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
of course the cython version is 4x faster
In [4]: %timeit loop1(data)
894 ns ± 75.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: %timeit loop2(data)
1.01 µs ± 24.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
and pypy is 3x faster than cython!
can't test easily on graalvm, no ipython support 😦
or maybe it does, let me see
and for completeness, if you do data_np = np.asarray(data) then numba (under cpython) can match pypy performance
In [32]: loop4 = njit(loop1)
In [33]: %timeit loop4(data_np)
893 ns ± 71.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Hey @fiery cosmos!
It looks like you tried to attach file type(s) that we do not allow (.pdf). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.
Feel free to ask in #community-meta if you think this is a mistake.
is implementing linked-lists in python really better than using lists?
considering that list is a primitive data structure and implemented in C, comes built-in with python
Linked lists are almost never better 🙂
Pretty much only for teaching DSA.
there's a rant here about the limited usability of linked lists:
https://rust-unofficial.github.io/too-many-lists/index.html#an-obligatory-public-service-announcement
this is funny
You could also sort the list by each object’s id and do binary search, unless you’re adding to the list after initially creating it because then you’ll have to insert them into the correct position which is O(n) anyways. I think inserting should still be faster than searching through the list since it’s a compiled function, but it still would probably not be very good.
The hash map idea might be better for this situation, but binary search might be a good thing to know about for the future
Also python lists don’t use linked lists, they use arrays. So it wouldn’t be the same kind of data structure
class collections.deque([iterable[, maxlen]])```
Returns a new deque object initialized left-to-right (using [`append()`](https://docs.python.org/3/library/collections.html#collections.deque.append "collections.deque.append")) with data from *iterable*. If *iterable* is not specified, the new deque is empty.
Deques are a generalization of stacks and queues (the name is pronounced “deck” and is short for “double-ended queue”). Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.
Though [`list`](https://docs.python.org/3/library/stdtypes.html#list "list") objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for `pop(0)` and `insert(0, v)` operations which change both the size and position of the underlying data representation.
strangely, access in deques is way faster than I'd expect from linked lists, though linear:
import collections
d = collections.deque(range(100000))
%timeit d[0]
%timeit d[10]
%timeit d[100]
%timeit d[1000]
%timeit d[10000]
62 ns ± 5.44 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
58.6 ns ± 3.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
69 ns ± 10.1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
69.6 ns ± 4.97 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
317 ns ± 2.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
If you put all the nodes of a linked list in an array, is it still considered a real linked list?
I wonder if the deque does that
10000 steps over a linked list in ~250ns..
Also I wonder why that doesn’t start increasing until after 1000
before that, the function call itself is taking the 60ns, probaby.
that's about how long an empty function takes
I think it's not a naive linked list, instead it's a linked list of blocks
typedef struct BLOCK {
struct BLOCK *leftlink;
PyObject *data[BLOCKLEN];
struct BLOCK *rightlink;
} block;
typedef struct {
PyObject_VAR_HEAD
block *leftblock;
block *rightblock;
Py_ssize_t leftindex; /* 0 <= leftindex < BLOCKLEN */
Py_ssize_t rightindex; /* 0 <= rightindex < BLOCKLEN */
size_t state; /* incremented whenever the indices move */
Py_ssize_t maxlen; /* maxlen is -1 for unbounded deques */
Py_ssize_t numfreeblocks;
block *freeblocks[MAXFREEBLOCKS];
PyObject *weakreflist;
} dequeobject;
Is that like a linked list where each node holds an array instead of one single value, so it doesn’t need to go through as many nodes to reach the nth index?
oh, cool.
I keep forgetting this fact and getting told that
I think I'm learning how deques are structured for the third time now.
Also if the block size is 64 then why doesn’t the time it takes to index it increase between index 100 and index 1000?
🤔
it does for me
idk why it doesn't change between 10 and 100 here, though
His test didn’t, that’s weird
I was going to guess that maybe there’s an array with references to some of the blocks at the start and at the end. But maybe it doesn’t do that after all
question i saw somethign and i was just wondering what is did
heres the code snippet
cards = ['1','2','3','4','5','6','7','8','9','10','11']*4```
what does the *4 at the end of the list do to it?
!e
print([1,2,3]*3)
@vocal gorge :white_check_mark: Your eval job has completed with return code 0.
[1, 2, 3, 1, 2, 3, 1, 2, 3]
duplicates the list 4 times.
If there was a version of random.choice for dicts and sets which went to a random place in the array and then went forward or backward (maybe the direction could be chosen randomly) until it reached a non-empty / non-deleted space, would that have a problem of being less random?
Or is the main reason there’s nothing like that available just because they wanted to try to keep things in python and out of the underlying C code as much as possible? Or is there a different reason?
thanks for the help, i used bisect.insort to always keep my list sorted and used binary search for lookups
loops is difficoult
Great article.
I just finished learning arrays. Anyone knows where I can get some problems to solve? Mainly easy and medium.
Also, can someone explain amortization in simple terms with an example for me? I learnt about it but it's not sticking. 
I wonder why they wrote python like that? 1,2,3 all refer to the same elements here. Maybe to save space? Make it faster?
What do you mean
it's more of a syntax convenience so you don't need to write out print([1, 2, 3, 1, 2, 3, 1, 2, 3])
They all refer to the same elements in memory.
Here's the explanation.
To quickly initialize a one-dimensional list, we generally rely on a syntax such as data = [0] * n to create a list of n zeros. On page 189, we emphasized that from a technical perspective, this creates a list of length n with all entries referencing the same integer instance, but that there was no meaningful consequence of such aliasing because of the immutability of the int class in Python.
Isn't the meaningful consequence that you now have a zeroed list of ints to do stuff with? 
I don't know. That was not my original question. I was just asking why python would make everything reference the same ints.
Thanks for the code wars suggestion.
Any performance increase is likely negated by the fact that you're using Python to start with xD It's mainly for convenience.
Besides, Python automatically caches small integers so they would refer to the same object regardless
!e
a = 2
b = 2
print(a is b)
c = 3000
d = 3000
print(c is d)
@flat sorrel :white_check_mark: Your eval job has completed with return code 0.
001 | True
002 | True
hmm
the second one prints False on my PC, the implementation used for the bot probably caches more values than usual
I thought CPython always interned constant ints
!e ```py
def f(n):
return n + 1
x = 3000
y = f(2999)
print(x is y)```
@fervent saddle :white_check_mark: Your eval job has completed with return code 0.
False
!e ```py
def f(n):
return n
x = 3000
y = f(3000)
print(x is y)```
@fervent saddle :white_check_mark: Your eval job has completed with return code 0.
True
It means something looks like it has a certain complexity in the context of a loop or something like that, even though it might not actually have that complexity. I think the most common example, which you might have heard already, is appending to a list. It has linear time complexity because the array needs to resize and copy everything if it runs out of space, but that happens infrequently enough because of overallocation that it seems like it has constant complexity if you’re appending in a loop, so it has constant amortized time complexity.
only in a certain range
What is the best place to learn DSA?
Check out the pinned messages in this channel for some recommendations.
Thanks for your reference.
Thank you
hey i referred to them but can you suggest some video courses with python implementation as i struggle with written content.
MIT has some full (though bit old) courses in DSA all in Python 2 https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
This course provides an introduction to mathematical modeling of computational problems. It covers the common algorithms, algorithmic paradigms, and data structures used to solve these problems. The course emphasizes the relationship between algorithms and programming, and introduces basic performance measures and analysis techniques for these p...
I am looping over files in a directory and would like to check if a file matches a glob specified in .gitignore file. One possible way to do this will be to simply loop over the files and then loop over each glob in .gitignore and check if the file matches one of the globs. But this will have a complexity of n^2. I am wondering if there would be a better way to do this?
It's not quite a n^2 complexity, it's more m*n since there's two different quantities - and you'd expect the number of globs to be far less than the number of total files. One thing you could do is some pre-processing, parse all the globs into some sort of object, and/or first for each folder you check do a filtering step to skip any globs that can't apply to stuff in that folder (because they reference another folder).
But first you want to ensure there's actually a performance issue here, before worrying about elaborate optimisations.
It is fairly likely the IO overhead from checking files might dominate the globbing.
Also a big speedup that the gitignore docs specifically mention - check each directory, if it's ignored, you can skip recursing its whole tree.
Looks like the actual implementation of git's file searching logic is here, though in C so not the easiest to understand. Looks like it does do partial matching with the folder name, to find which globs can apply to any files/subfolders there. That way as you recurse down the directory tree, the globs can be pruned down to the relevant ones.
https://github.com/git/git/blob/master/dir.c
I was thinking about if you were trying to make battleships but with any grid size, ship sizes, and number of ships. Something I was thinking about was how the computer opponent might have a problem placing ships if there isn’t a lot of space.
For example, if you have a 20x20 grid, and you place forty 10-length ships so the entire board is occupied, how will the computer opponent efficiently figure out a way to place the ships? It seems like recursively checking different combinations might quickly start taking a long time with larger grid sizes.
Is there a name for this problem and a good way that people have figured out to solve it?
idk but that's an interesting problem 
Is f string also known as string interpolation?
foo = "Foo"
print(f"My name is {foo}")
So the below is called f-string?
foo = "Foo"
print(f"My name is {foo}")
@rustic helm :white_check_mark: Your eval job has completed with return code 0.
Hello World
Both % and f-strings are considered string interpolation.
Yes that's right. Sorry for the wrong info earlier. Here is an article https://www.programiz.com/python-programming/string-interpolation
In this article we will learn about the python string interpolation. Python supports multiple ways to format text strings and these includes %-formatting, sys.format(), string.Template and f-strings.
ok thanks!
Yeah I found the pep 498
So you are saying that I should not worry about optimization first and come back to it later? Should I go with the m*n complexity algorithm then?
Which goes over different types of string interpolation
You should run a profiler on your code, to time functions and see where any slowdowns occur - there's no point spending all of your time on a function that isn't slow to begin with. cProfile is available in the stdlib, but there's several around to pick from.
Just run python -m cProfile yourcode.py.
I am using rust for this particular project. Is there something similar?
Searching "rust profiler" gave me quite a few hits, though I'm not a Rust programmer so I can't exactly recommend one.
yes but they all seem a bit difficult to use. Anyways thanks @ivory yacht
thanks!
Hey!
I have question about how dictionaries are implemented in py.
How am I able to iterate through the dictionary?
cause in essence, the thing that makes dicts faster is the fact that they keys are not stored right? Or am I getting something wrong here?
I am sorry if this is a nub question, any material that I can read up is also appreciated.
A dict is a hash table. That's what makes dicts faster. Same as a set. Here's the definition of a hash table.
Hash Table is a data structure which stores data in an associative manner. In a hash table, data is stored in an array format, where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data.
Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data. Hash Table uses an array as a storage medium and uses hash technique to generate an index where an element is to be inserted or is to be located from.
!e a normal for loop goes through the keys of a dictionary py dct = {1: "A", 2: "B", 3: "C"} for key in dct: val = dct[key] print(key, val)
@fiery cosmos :white_check_mark: Your eval job has completed with return code 0.
001 | 1 A
002 | 2 B
003 | 3 C
You mean to ask how you’re able to iterate through it in the order you put thing in?
Or you’re just asking how iteration works at all for things that use hashing, like dicts and sets?
I’m pretty sure the way it deals with empty or deleted index positions is that it just ignores them and moves on to the next index
Yea, iteration for hashed data structures, Like a key has to be given into the hash function in order to give out the address at which the value is stored right? So, how does the dict know that these are the keys that are there in the dictionary that I have initialized, since the hash function is not reversible right?
I think it just goes through the array that stores everything, index by index, and ignores all the unused spaces
It doesn’t have to use hashing for iteration, it just goes through the underlying array like it would with any other array
Ohh, so in memory, there is an array which has all the keys stored?
Yeah
Oh sweet, that answers my doubt! Thanks @fervent saddle
@signal bridge It keeps all the entries (key-value pairs) of the dictionary in an array. The actual hash table contains indices of entries in this array.
Here's the source: https://github.com/python/cpython/blob/main/Objects/dictobject.c
And I think Raymond Hettinger must have done a talk on this at some point 🤔 You can probably find that on YouTube.
Dictionaries are an important data-structure in Python, so some thought has been put into their design.
Ah. Here's the talk. Well worth watching: https://www.youtube.com/watch?v=p33CVV29OG8
Abstract
Python's dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6. This fun talk is given by Raymond Hettinger, the Python core developer responsible for the set implementation and who designed the compact-and-ordered dict implemented in CPython for Pyth...
Hello
i had this subject and i dont know how i can do
I don’t even know how i can start
Can somebody help me please ?
how do you go about learning the string data structure in python?
or is it even relevant?
Chapter 4 of Fluent Python has a detailed description, but basically:
- From the Python side, a string acts like an immutable sequence of characters
- On the CPython implementation side, a string is an array of UTF-8 encoded bytes (note that a single character is often more than one byte!)
(this implies some things such that:
s = ""
for i in range(n):
s += "a"
is O(n^2), because each += has to copy the entire string to a new position in memory. This was indeed true until Python2.7 introduced an optimization that, when possible, resizes the array inplace, making the above snippet O(n).
Nevertheless, because of this, many people still prefer appending to a list and then "".joining it, which is always O(n).
)
CPython doesn't use UTF-8, actually (though PyPy does). Depending on the characters in the string, it uses a 1, 2 or 4-byte buffer of code points.
That way indexing is still O(1), but pure ASCII strings don't waste memory.
whether it's relevant depends on why you are asking about it. Almost no one needs to be concerned with that implementation detail.
Has anyone used cracking the coding interview.
Thanks mate!
using a list then joining also has the side effect of looking much nicer
Is it possible to have an indexable linked list type of thing if you’re only ever moving nodes to the end, and never adding new nodes or inserting at intermediate positions?
Or is it still not possible to have indexing for something like that?
what do u mean by moving nodes to the end\
0 -> 1 -> 2 -> 3 -> 4
to this:
0 -> 1 -> 3 -> 4 -> 2
@fervent saddle You could keep a dict from indices to nodes
and then change things in that dict when you modify the list
hm, no, it wouldn't really work
It feels like it’s one of those thing where if there’s a solution that involves a full blown hash map, there’s a better one that involves index mapping
But I can’t think of a solution with either
I think you could but it would emulate an array afaik
when moving a node to the end how would you maintain the indexes without shifting the nodes in O(n)
Actually, I’ve been thinking about it, and isn’t OrderedDict something that does something like this?
But fix error said a hash map wouldn’t work, so maybe this is different in some way I don’t notice
it's not indexable
A linked list allows you to shift nodes around arbitrarily and efficiently, but the problem is you can't constant-time index it - indexing requires going through the nodes one by one.
Something that might work reasonably well is to not actually remove the nodes - leave None there, then append the node. Have a secondary list, which records the start/end indexes for each continuous "run" of non-none nodes. Then when indexing, you go through the list of runs, and use that to skip over the None elements. You'd also want to watch how many holes are present, and once you have more than some threshold compact down the list by removing all the Nones.
Ok I think I see what the difference is
With OrderedDict, if you move a node to the end, you don’t have to update any of the other keys
But with this, like other people have said, you’d need to update all the further along indexes
So that’s why OrderedDict wouldn’t work
I'm practicing leetcode in every day but I find it difficult to set up algorithms as the problems get harder. Can you give me an idea on how to improve my algorithm knowledge. What is your recommendation for book or video etc.
I like The Algorithm Design Manual, I shared a link below. The code is in C but I find it to be pretty generalizable. It does cover a ton of algorithms, so I'd focus on the most important ones for interviews to not get overwhelmed (Graphs & DP are where I found this book most helpful). The second half of the book also just lists common algorithm problems, so you can go through and understand them.
When I'm looking for a video explanation, I like Abdul Bari and Gaurav Sen on YouTube. Those aren't the only good channels though. Best of luck!
Could also do:
for key, val in dct```
that would be dct.items()
What should be my first project about algos and data structs?
I currently know literally nothing about it
hello everyone
I'm having trouble implementing (and maybe even understanding) the interaction between an ordered dictionary and a binary search tree
i must implement a BST that relies on an ordered dictionary, but I'm a little at loss
i need help
for a question
def multiplication_table(start, stop):
for x in ___:
for y in ___:
print(str(x*y), end=" ")
print()
multiplication_table(1, 3)
# Should print the multiplication table shown above
fill in the blanks
This function prints out a multiplication table (where each number is the result of multiplying the first number of its row by the number at the top of its column). Fill in the blanks so that calling multiplication_table(1, 3) will print out:
1 2 3
2 4 6
3 6 9
this is the question
I need Help for a questions If CCC + BBB + AAA = CAAB, what are A, B, C =? The digits are distinct and
whole numbers from 1 to 9. Solve the problem using a python script.
def multiplication_table(start, stop):
for x in range(start, stop+1):
for y in range(start, stop+1):
print(str(x*y), end=" ")
print()
multiplication_table(1, 3)
i just saw wasnt that tuff what questions are these?
i am having a doubt
can u tell me what's the difference between using(start,stop+1) and range(start,stop+1)
not in range ig it just works on those 2 no. right?
The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed). https://docs.python.org/3/library/stdtypes.html#typesseq-range
def main(arr,n,x):
for i in range(0,n):
if(arr[i]==x):
return 1
return -1
n=int(input())
arr=[]
for i in range(n):
arr.append(int(input()))
x = int(input())
main(arr,n,x)
whats wrong in code please help me,below screenshot is giving error
i'm guessing you're supposed to print your answer?
you can view the inputs right?
nope only sample testcase data other two are nothing showing
what is the quesiton prompt?
booleans are True and False, not -1 and 1
i'm guessing it passed the first case because it checked for equality
can you explain where to change the code
you're returning -1 and 1, when you should be returning False and True
already done but not returning
i don't understand
def main(arr,n,x):
for i in range(0,n):
if(arr[i]==x):
return True
return False
n=int(input())
arr=[]
for i in range(n):
arr.append(int(input()))
x = int(input())
main(arr,n,x)
still not working
what doesn't work
output not generating
i'm not sure what's wrong
when i run on jupyter notebook it give answer
but when i upload on website it dont give output
#help-lollipop plz help me guys
Hey guys, someone know how to create and queue with search function that will search if their is a spacific value?
you are given an array of size n count the number of non empty subsets s of a such that the product of numbers in that subset is of the form p1*p2....*pk where p1,p2... pk are prime numbers. since the answer can be huge print it modulo 10^9 +7
Hey guys i had trouble trying to solve this, pls help me out with this
uh, almost any positive number is of that form
does it say anything about what numbers are in the array?
integers?
a subset of all ones would arguably not count
so i would do 2**n - 2**array.count(1) for example
Thanks for your reply🙂
i think the main restriction is that the number's prime factors appear just once in the factorization
45 wouldn't count, because it's of the form 3*3*5, with 3 repeated
I think
so the question is basically... each element of the array can be considered a multiset of primes. The task is to find the number of ways to choose a subset of all these multisets such that the sum of the subset has no elements with count>1
clearly, numbers the prime factorization of which has repeats immediately need to be discarded (they can't appear in any chosen subsets)
as for the others, uhh, not sure how to count this...
if the contents are not given at all, it's just all non empty subsets
not much choice
if you have 2 and 2, they together aren't a valid subset because their product is 2^2
(that's likely what the constraint is)
hello what the deferent between range() and xrange() ?? and thank you so match
In Python 3, there's no xrange anymore.
You should be using Python 3, unless you have a specialized tool that requires you to use Python 2, which has been deprecated for many years.
In Python 2, range makes a list, and xrange makes a small object that produces the numbers when you iterate over it. Python 3's range is Python 2's xrange.
i appreciate your help for me
why do some easy questions feel harder lol
#python-discussion message Need some help with Linked Lists!
Can I create a python script to know the next number in a sequence without having the formula?
What do I need to know first in order to create it? 
What kind of sequence 
of numbers example1: 1, 1, 2, 2, -1, 1 -2, -1
or 10, 37, 82, 145, 226
and know which number is next
Well that's impossible in general. The sequence could be totally random.
There might be some machine learning way to make an educated guess 
Or if you know it's a certain type of sequence e.g. geometric progression you could construct a formula for the next number
If you dont mean [i + 1] that would look at the value at the next index, then you would have to look at the numbers to understand if there is a pattern in the sequence.
But it would only work for lists that follow that pattern.

0, 1, 2, 3, 4, 5, 6 can be a start of these sequences:
- non-negative integers, in order
- palindromes in base 10 https://oeis.org/A002113
- first decimal digit of the number: https://oeis.org/A000030
- "Numbers that cannot be represented as
x*y + x + y, wherex >= y > 1. " https://oeis.org/A254636
and, well, infinitely many others 🙂
What would be the time complexity of a matrix where we are accessing elements diagonally?
Would it still be O(n)?
We are not accessing all elements.
algorithm I wrote only accesses 1, 5, 9 and 3, 5, 7
What do you think?
If you have an N * N matrix, how many diagonal elements will there be?
at 1 element, we access 1
at 4 element, we access 4
at 9 element, we access 5
at 16 elements, we access 8
So as elements increase we are accessing less elements
You have to consider what N is in your question. Is it the side of the matrix? Or the total number of elements?
by side of matrix do you mean 2x2, or 3x3 and so on?
Yes, the number of rows (or columns).
Yeah you made a good point. I was going by total number of elements in the matrix.
(There's no right or wrong answer, you'll just have different complexity functions depending on what you choose as the input size)
so if N is the side of the matrix, how many diagonal elements are there?
1x1, 1 element
2x2, 4 elements
3x3, 5 elements
4x4, 8 elements
5x5, 9 elements
So this also looks like we are accessing less elements as N side of the matrix increases.
Can you derive a formula for the amount of diagonal elements, given the side of the matrix?
I sort of wanted to disagree with using side of the matrix as N, but I can also see why we could use side of matrix as N
I'm thinking if we can derive a formula
How do you derive a formula for something like this, plot it on a graph?
Let's take a look at just one diagonal.
If the matrix size is N x N, how many elements are there on a single diagonal?
Oh I see!! O(n)
# [1, 2, 3, 4, 5, 6]
# [1, 2, 3, 4, 5, 6]
# [1, 2, 3, 4, 5, 6]
# [1, 2, 3, 4, 5, 6]
# [1, 2, 3, 4, 5, 6]
# [1, 2, 3, 4, 5, 6]
# 6x6 => 12(after we traverse through diagonal elements twice - top-left to bottom-right and bottom-left to top-right)
# 6x6 => 6(When we traverse through top-left to bottom-right) - We access 6 elements, so O(n)
Yeah, O(n)
So if the matrix size is even, there will be 2N elements, and if it's odd, then 2N - 1. So yes, it's O(N).
if we use N as side of a matrix.
And if we use, say, M as the number of elements?
wow you are a lot more precise. I missed to identify that pattern. Good catch on odd and even and 2N - 1!!
you could golf it and say 2N - N%2 🙂
This would be O(log m)
Why?
at 1 element, we access 1 elements
at 4 element, we access 4 elements
at 9 element, we access 5 elements
at 16 elements, we access 8 elements
at 25 elements, we access 9 elements
That is we traverse the matrix twice.
If N is the side of the matrix, how are M and N related?
(this is not a trick question)
If N is the side of the matrix, then as N increases M also increases. Now the question would be at what rate does M increase in relation to N?
if a square matrix has side N, how many elements does it have?
N * N
right, so M = N * N
N^2
That's how you can find M, given N. How can you find N, given M?
i.e. if you know the number of elements in a square matrix, how do you find its side?
If we know total number of elements, then we can do the square root to find number of elements
right, so N = sqrt(M)
square root of total number of elements
So the number of diagonal elements is on the order of sqrt(M), right?
Oh so it's O(m^1/2) where m is the total number of elements
yeah
So when I did
at 1 element, we access 1 elements
at 4 element, we access 4 elements
at 9 element, we access 5 elements
at 16 elements, we access 8 elements
at 25 elements, we access 9 elements
This would instead actually be
at 1 element, we access 1 elements
at 4 element, we access 2 elements
at 9 element, we access 3 elements
at 16 elements, we access 4 elements
at 25 elements, we access 5 elements
We just take into account the first traversal and compute the time complexity from there?
because the second traversal is similar to first except with odd where we do - 1.
Sort of like if we traverse through a list twice and access each element, we still say O(n).
O-notation doesn't describe how many operations exactly it takes, but rather the 'order' on which a function grows.
if f(n) is on the order of O(g(n)) (in the colloquial CS meaning. I think this is technically 'big theta' instead of 'big O', I don't remember), then if we let K = f(n)/g(n), as n increases, K will tend to some positive constant. For example, 2n - 1 is on the order of O(n), because (2n - 1) / n will be a better and better approximation of 2 as you increase n.
I think #software-architecture might be a good channel for something like this.
But if you dont have a lot of data(couple of lines), you could put it in the same file. Otherwise, yeah create a separate file something like a json and then access the data.
Yup json.
That way your python file stays clean and you can load the json file.
I'm thinking about the portion where you said, rather the 'order' on which a function grows.
This part makes sense O-notation doesn't describe how many operations exactly it takes So O(n) doesnt exactly matter if we traversed through all elements once, twice, or three times.
Yes. Even if you traverse the diagonal elements 100 times (independent of the size), you still get O(n).
Asymptotic complexity doesn't really determine the 'speed' of an algorithm, because it ultimately depends on the implementation. Rather, it describes how some metric (time/space/bandwidth etc.) will change as the input gets larger.
Did you mean, O(n ^ 1/2)? If you were still referring to a matrix to traverse elements diagonally 100 times.
Yeah I see exactly what you mean!
well, I picked the side of the matrix for n
Oh lol right. I should have pick up on it. Because you referred to number of elements with m. Thanks for all the help and asking questions to help me think through it!
👍
I think they should have very easy difficulty on leetcode because some easy are really easy like you are done in couple of minutes, but some easy require a lot more thinking.
Wondering if people here felt that some Easy questions were as hard as Medium
agreed. Sometimes I opt for other sites, such as codewars, generally speaking their challenges are ranked quite accurately
not pure math, although algorithms often involve some math (for example, when describing big O notation)
If a BST has strings as keys, and you’re looking up a string in it, doesn’t it only need to look at each character of the string once?
Is that an advantage over a hash table? The hash table first needs to look over each character for the hash value, then it needs to compare the string with at least one other string. So it needs to look through all the characters at least twice
So then it depends on how many times you need to compare one character to another in the BST, right?
When I look at solutions for arrays, most of the time, when they start iterating from 1 , then stop at the len(array).
I'm wondering why, because obviously, it's wrong. They should rather start from 1 to len(array) + 1. So I'm here to ask, why do they make the end len(array)? Do they want to leave the last element out?
to loop over an array by the indices, you start from 0, since the first element is arr[0]
I just said they start the range from 1?
without anymore context, it's impossible to know why they did what they did ¯_(ツ)_/¯
looping to len(arr)+1 will give you index out of bounds though, if you're accessing with arr[i]
I think this might be the answer.
if you showed the code and problem maybe i could answer more in depth
but ranges tend to be open ended, because it makes everything easier
I'll tag you when i remember were I saw it.
@knotty magnet I saw it on a Google array question. I forgot where the question was but I recreated something.
!e
s = 'abacabad'
for i in range(1, len(s)):
print(f" {s[i]}, this is {i}")
@last fulcrum :white_check_mark: Your eval job has completed with return code 0.
001 | b, this is 1
002 | a, this is 2
003 | c, this is 3
004 | a, this is 4
005 | b, this is 5
006 | a, this is 6
007 | d, this is 7
Yes len + 1 gives me an index error. But then I wonder, how would they have gotten the first element?
The constraints were that the array starts from 1 to length of the array.
no idea without seeing the code ¯_(ツ)_/¯
Well look at it like this. You have array. The array has to start from one to the length of the array. How would you get the first element?
arr[1]
yes, because python has 0-indexed sequences
if you define the array to be 1-indexed, then arr[1] would be the first element
How are you going to define it. Look at this question from amazon.
Given a string s consisting of small English letters, find and return the first instance of a non-repeating character
in it.
If there is no such character, return '_'
Guaranteed constraints:
1 ≤ s.length ≤ 105.
[output] char
Here, the array must start from one
to the end of the list
where does it say that
Constraints
it just says that the length is from 1 to 105
huh?
the length is just how many elements are in it
in this case a string, so how many characters
what index the first element of a sequence is depends on the language
If the length is 1, the last element is s[0]. If the length is 2, the last element is s[1], and so on.
(in Python)
A length of 0 means the sequence is empty.
Seems my question is being confused.
Let me come again
I know how to get the length of an array. I know they are zero indexed. I'm asking. If your array has to start from the number one to the length of array, how would you get the last or first element. Because if I do range( 1, len(array)), it would stop before the length of the array.
Here is a question.
Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index
I don't think I understand your question. Are you asking how to iterate over an entire list?
if it's zero indexed, range(1, len(array)) will touch the last index
Can you perhaps give a link to the problem statement?
You have to log in. or create an account
then just copy it
What the problem is saying is that if e.g. the list has length 4, it can contain numbers 1, 2, 3, and 4.
So [1, 2, 2, 3], [2, 4, 1, 3] and [4, 4, 4, 4] are valid inputs.
i thought it'd be 1, 2, and 3. 🤔
Wait, no, it's a bit confusing.
it's not clear whether the range is inclusive in the problem.
I've solved the problem though. Here's the full question
Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index. In other words, if there are more than 1 duplicated numbers, return the number for which the second occurrence has a smaller index than the second occurrence of the other number does. If there are no such elements, return -1.
Example
For a = [2, 1, 3, 5, 3, 2], the output should be firstDuplicate(a) = 3.
There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than the second occurrence of 2 does, so the answer is 3.
For a = [2, 2], the output should be firstDuplicate(a) = 2;
For a = [2, 4, 3, 5, 1], the output should be firstDuplicate(a) = -1.
Input/Output
[execution time limit] 4 seconds (py3)
[input] array.integer a
Guaranteed constraints:
1 ≤ a.length ≤ 105,
1 ≤ a[i] ≤ a.length.
[output] integer
The element in a that occurs in the array more than once and has the minimal index for its second occurrence. If there are no such elements, return -1.
I used a set.
But that's not my question
Voice chat?
Maybe I can explain better
There
sorry, i can't
same
Ok. I'll write it out
You have an array that should start from the range 1
So you can't zero index
do you mean it's 1-indexed?
to use python syntax, arr[len(arr)]
sure
not if it's 1-indexed though?
@austere sparrow did you get it
if the array is one indexed, then doing
for i in range(len(arr) + 1):
arr[i]
``` won't give you an error (from indexing at least)
s = 'abacabad'
for i in range(1, len(s)):
print(f" {s[i]}, this is {i}")
What about this?
what about it
Well, s is 0-indexed
For example I want to start my counter from 1. In the real world things start from 1
How would I get the first element?
how does that relate to that code example
!e
s = 'abcde'
for i in range(len(s)):
print(f" {s[i]}, this is {i+1}")
@austere sparrow :white_check_mark: Your eval job has completed with return code 0.
001 | a, this is 1
002 | b, this is 2
003 | c, this is 3
004 | d, this is 4
005 | e, this is 5
oh, you meant that counter
!e
or:
s = 'abcde'
for i, char in enumerate(s, start=1):
print(f"{char}, this is {i}")
@austere sparrow :white_check_mark: Your eval job has completed with return code 0.
001 | a, this is 1
002 | b, this is 2
003 | c, this is 3
004 | d, this is 4
005 | e, this is 5
So I asked this question because of a double for loop
In one code example you had to start from 1
And end at the length of that array
And it was not zero indexed
impossible to know why without the code ¯_(ツ)_/¯
Yes. That's why I tried explaining
And that went badly
Anyway @austere sparrow in C I can do a for loop like this.
for(int i=1;i<=n;i++) // outer loop
{
for(int j=i + 1;j<=10;j++) // inner loop
{
How would I write the same in python?
you can use range, just add 1 to the right bound.
for i in range(1, n+1):
for j in range(i+1, 11):
...
Ok thank you.
yo how would I print consecutive numbers in a list like
[0, 1, 3, 4, 5, 6] and print 01 and 3456
have a variable to store the last number you looked at
are there more constraints on the input?
nah
I went to the route of learning Scheme, so I could learn recursion and then do Trees lmao. So I'm thinking of skipping Trees for now and do Easy, Medium, and Hard in all other categories and then tackle Trees afterwards?
Wondering what strategies people have used?
Well eventually you'd hit graphs and trees are just a simple type of graph 
Yeah that was my understanding that Trees are simple version of Trees. That one needs to know how Trees work to do well at Graphs.
So I'll save mostly Trees and Graphs for last.
hey im trying to run my py file in vscode but it instantly stops when i try to run it
this appears and instantly goes away after a sec
https://github.com/ninjapolski2/sudokuSolver/blob/master/sudoku.py why Python avoid if statements in last function and also... effect isn't good... if statements don't perform their function... help
I'm doing an array question, and would like to use a default dict. I know dicts are not ordered because they are hash tables, but is there anyway to get an ordered dict with defualt dict?
As of Python 3.6 dicts are actually semi-ordered - they remember the order of insertion and provide that when you iterate, but there's no way to re-order elements specifically.
If you do want a OrderedDict subclass, you can still do that - dicts have a __missing__(self, key) method you can override to implement the behaviour.
Defaultdict is basically just this:
class defaultdict(dict):
def __init__(self, factory, *args, **kwargs):
super().__init__(*args, **kwargs)
self.factory = factory
def __missing__(self, key):
self[key] = val = self.factory()
return val
So you could also just inherit from OrderedDict, implement __missing__, and whatever else you need.
Ok great. Would this fly for interviews though?
I don't understand how to do that, but great ideas
It's mentioned here in the lang spec:
https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
As of 3.7 it's official, but it was actually implemented in 3.6.
In your code example, there was self.factory()
Is that also mentioned in the language spec?
I skipped adding an __init__ to setup that attribute, sorry.
If you subclassed from OrderedDict instead, you'd have a OrderedDefaultDict.
Great. So based upon your information I implemented something. It's very horrible optimised, so can you help me improve it?
Also, I forgot how to subclass etc. Long time since I used classes.
Here's the question.
Given a string s consisting of small English letters, find and return the first instance of a non-repeating character
in it.
If there is no such character, return '_'.
Example
For s = "abacabad", the output should be
firstNotRepeatingCharacter(s) = 'c'.
There are 2 non-repeating characters in the string: 'c' and 'd'. Return c since it appears in the string first.
For s = "abacabaabacaba", the output should be
firstNotRepeatingCharacter(s) = '_'.
There are no characters in this string that do not repeat.
Input/Output
[execution time limit] 4 seconds (py3)
[input] string s
A string that contains only lowercase English letters.
Guaranteed constraints:
1 ≤ s.length ≤ 105.
[output] char
The first non-repeating character in s, or '_' if there are no characters that do not repeat.
Here's my bad code.
from collections import defaultdict
def firstNotRepeatingCharacter(string: str) -> str:
count_dict = defaultdict(int)
for char in string:
count_dict[char] += 1
for char in string:
if count_dict[char] == 1:
return char
return '_'
That would work fine, and you don't even actually need the ordered behaviour there.
If you did want to use it, your second loop could just loop over .items() and check if the value is 1. Since it orders by insertion order, it'll produce them in the order they were first added which is what you want.
This is because the recent improvements right?
Because let's say I go to a place that has python 3.5
I can't use this
Python 3.5 no, but the subclass I showed would work there. And 3.5 is officially end of life, so it's no longer supported anymore by the Python team.
Or you could just use setdefault and a plain OrderedDict, you don't need defaultdict's functionality really.
Can you explain more?
Are you familiar with the dict.setdefault() method?
Or actually you could even just use get().
get() returns the key for a specific value, or the default value you specified if it's not present.
I remember get(). Forgot about it totally.
What setdefault does is it takes the key and the default also. If the key is present it returns the value. If not, it sets it to the default you provide, then returns the default.
That way you can do mydict[key] = mydict.get(key, 0) + 1.
Thanks. That's really helpful. Also, first time I've seen that syntax with get. I usually see it used to prevent an exception.
Isn't that exactly what orderedict does? In a way this is just "reinventing the wheel" (even though it's not a very complex problem)
Yep, but Carrickk needed it to be an OrderedDict too.
I meant defaultdict, but I assume you understood that
Wait, doesn't defaultdict preserve order like a normal dict?
Only since 3.6, but yes.