#algos-and-data-structs

1 messages · Page 61 of 1

stiff quartz
#

Then I do dfs from 4 and get 4-2-1 (90) and 4-2-3 (80)

#

Unfortunately the two longest paths were 1-2-3 (90) and 1-2-4 (90)

#

So I guess this "technique" of doing dfs from a random node, and then dfs from the farthest node (farthest defined from that random node) only works for the longest path?

golden ravine
#

can someone explain the red bit please i did this and wanted to double check it and from what i understand this should be True

#

but gpt is saying its false?

#

if not P and not Q = false and they imply R then it should always be True right?

#

this is what i had

#

which is basically identical

#

besides the highlighted row

#

i know its pretty inaccurate a lot of the time but this i wanted to make sure i wasn't missing anything

lean walrus
#

F => x is always T

#

the table seems incorrect

hoary flare
#

Hi, I'm really stuck on this problem, I'm fairly new to data structures and algorithms and I can't seem to debug thus issue. I would really live some help possible

#

Also idk if this is inappropriate to put here, just lemme know and I can take it down.

narrow mica
stable pecan
# golden ravine can someone explain the red bit please i did this and wanted to double check it ...
>>> TruthTable("~p", "~q", "~p & ~q", "~p & ~q -> r", "~r", "~r & ~p & ~q", "~(~r & ~p & ~q)")
TruthTable('~p', '~q', '~p & ~q', '~p & ~q -> r', '~r', '~r & ~p & ~q', '~(~r & ~p & ~q)', binary=False)
>>> print(_)
┌───┬───┬───┬────┬────┬─────────┬──────────────┬────┬──────────────┬─────────────────┐
│ p │ q │ r │ ~p │ ~q │ ~p & ~q │ ~p & ~q -> r │ ~r │ ~r & ~p & ~q │ ~(~r & ~p & ~q) │
├───┼───┼───┼────┼────┼─────────┼──────────────┼────┼──────────────┼─────────────────┤
│ F │ F │ F │ T  │ T  │    T    │      F       │ T  │      T       │        F        │
│ F │ F │ T │ T  │ T  │    T    │      T       │ F  │      F       │        T        │
│ F │ T │ F │ T  │ F  │    F    │      T       │ T  │      F       │        T        │
│ F │ T │ T │ T  │ F  │    F    │      T       │ F  │      F       │        T        │
│ T │ F │ F │ F  │ T  │    F    │      T       │ T  │      F       │        T        │
│ T │ F │ T │ F  │ T  │    F    │      T       │ F  │      F       │        T        │
│ T │ T │ F │ F  │ F  │    F    │      T       │ T  │      F       │        T        │
│ T │ T │ T │ F  │ F  │    F    │      T       │ F  │      F       │        T        │
└───┴───┴───┴────┴────┴─────────┴──────────────┴────┴──────────────┴─────────────────┘

should be more dependable than gpt

golden ravine
#

theres a python truth table method?

stable pecan
#

i made a library for it a long time ago

golden ravine
#

actually while you're here

narrow mica
#

salt-die has really cool libraries

golden ravine
#

do you have any good suggestions for learning simplification of

golden ravine
stable pecan
#

i can stick it on pypi probably, i haven't looked at the code in a long time

golden ravine
#

i'm pretty new never heard of pypi

stable pecan
#

when you do pip install my_library then my_library is usually on pypi

golden ravine
#

ooh alright

golden ravine
#

i feel like i've been looking at these for a couple hours and have nothing

#

are there any good sources for simplfying questions like these

#

i can't find any that partically seem to help

#

especially in relation to imply ones

#

we have a table like this but i just can't seem to get it down

stable pecan
golden ravine
#

its got to end up as (p' and q' and r') apperently so idk if i can do that

#

all i have so far is being able to turn it

into

(p' and Q')' or R using rule of implication

#

but i don't fully know if thats correct

stable pecan
#

might still help to replace with a single variable while you transform

   T -> R === (R' ^ T)'
=> T -> R === (R'' v T')
=> T -> R === (R v T')
=> ...
#

you can re-substitute later

lean walrus
lean walrus
#

precedence is not, and, everything else
brackets do not belong to this list

stiff quartz
#

Is it me or it's a nightmare to get a functionning input reading in Python on SPOJ?

#

This gets me a runtime error

def get_int():
    return int(input())

def invr():
    return map(int, input().split())

def solve():
    n, s = invr()
    graph = [{} for _ in range(n)]
    for _ in range(n-1):
        x, y, z = invr()
        graph[x-1][y-1] = z
        graph[y-1][x-1] = z


if __name__ == "__main__":
    n_tests = get_int()
    for test_nb in range(n_tests):
        solve()
stable pecan
#

expressions can be in parens

lean walrus
#

that doesnt say anything about precedence of brackets

#

brackets do not have any precedence

stable pecan
#

if you say so

lean walrus
#

A @ B $ C can be parsed as (A @ B) $ C or A @ (B $ C), it is determined by @ and $ precedence relative to each other
with brackets there is never an ambiguity

#

brackets is an operator that controls order of operations

stable pecan
#

so they have precedence

lean walrus
#

there is never an ambiguity, so no

stable pecan
#

there is never an ambiguity because i made the grammar file that evaluates parens first, because i gave them precedence

lean walrus
#

you didnt
you can swap lines 11 and 12, it would not change anything

stable pecan
#

yep

lean walrus
#

if you say that not, (), and, ... does make sense, then how is (), not, and, ... or not, and, (), ... different from that?

stable pecan
#

they are in a different order

lean walrus
#

this order is invisible for the user

#

and if there is no observable difference, then they are the same

stable pecan
#

i don't know what you mean

lean walrus
#

i say that your readme should not mention () in precedence list, because it doesnt make any sense

stable pecan
#

i get you, i should ignore parens

stiff quartz
# stiff quartz This gets me a runtime error ```py def get_int(): return int(input()) def i...

There were empty lines in the input, this works:

import sys
input = sys.stdin.readline

def next_non_empty():
    while True:
        line = input()
        if line.strip():
            return line

def get_int():
    return int(next_non_empty())

def invr():
    return map(int, next_non_empty().split())

def solve():
    n, s = invr()
    graph = [{} for _ in range(n)]
    for _ in range(n-1):
        x, y, z = invr()
        graph[x-1][y-1] = z
        graph[y-1][x-1] = z

    # insert logic

if __name__ == "__main__":
    n_tests = get_int()
    for test_nb in range(n_tests):
        solve()
#

But now there were so many empty lines that just reading the input takes 1.9 seconds

#

great

#

I guess I'll have to request the wiseness of @regal spoke to see if there is some magic that makes input reading faster because I'm really not seeing it

regal spoke
#

Whats this about?

#

A quick way to read input is to just do

input = [int(x) for x in sys.stdin.read().split()][::-1].pop
#

split identifies tokens seperated by 1 or more whitespace characters

stiff quartz
#

Has (A LOT) empty lines scattered around (they didn't say we just inferred that from the multiple runtime errors)

stiff quartz
#

but it takes 1.9 seconds of the time (2 seconds total allowed)

regal spoke
stiff quartz
#

there are probably 10^8 empty lines

#

when I was putting sys.stdin.read() in an array I had almost 300 MB of memory used

#
import sys
input = [int(x) for x in sys.stdin.read().split()][::-1].pop

def get_int():
    return input()


def solve():
    n = get_int()
    s = get_int()
    graph = [{} for _ in range(n)]
    for _ in range(n-1):
        x = get_int()
        y = get_int()
        z = get_int()
        graph[x-1][y-1] = z
        graph[y-1][x-1] = z


if __name__ == "__main__":
    n_tests: int = get_int()
    for test_nb in range(n_tests):
        solve()

Your solution is faster: 1.54s (301M space)

#

to read input

#

Hopefully that's enough margin for my logic

#

or maybe i'm doing something wrong?

regal spoke
#

Dont wrap functions like that

stiff quartz
#

ah yes my bad

regal spoke
#
  1. Dont use any dictionaries to store the tree
#

Thats just asking for trouble

stiff quartz
#

I do need the edge weights though

#

I can't really use a list of list right?

regal spoke
#

tuples is one way to do it

stiff quartz
#

ah yes fair

#

but then I'd have to loop over the neighbours

#

to find the right tuple

regal spoke
#

I dont know what algortihm you want to use to begin with

stiff quartz
#

I'm finding the diameter of the tree (https://ideone.com/f4jLVc full code I didn't want to bother you with the details initially which is why i had cut the logic)

#

So I mainly need to be able to do DFS

#

(I do 3 DFS in total)

tardy sable
#

hello, where is the apropriate channel that i can ask for advices to complete my project?

regal spoke
stiff quartz
#

I do dfs to find the longest path from that leaf

#

And then I pick the leaf of that longest path

#

And I do DFS again from that selected leaf

#

To get the global longest path of the tree

#

Which I believe is called a diameter?

#

I know there is a way to do it with a queue when you have unweighted edges in a tree, but the tree has weighted edges, so I think that's the only way?

regal spoke
#

How do you then solve the problem?

stiff quartz
#

Oh, then I have the true longest path of the tree

#

I go to the middle of it (considering weights)

#

And I greedily add vertices as long as I have enough S

#

I keep two variables sum_left and sum_right

#

That represent how much distance from the (growing) middle part there is to the left part (beginning of the true longest path) and the right part (end of the true longest path)

#

I hope that's clear, this is quite a hard problem for me so I might have trouble communicating my ideas clearly

stiff quartz
regal spoke
#

The trick is that for each node, store 2 things

#
  1. The degree of the node
#
  1. The xor of all neighbours of the node
#

no adjecency list needed

stiff quartz
#

Um

#

Where do you store the weights?

#

And isn’t the xor losing us information?

regal spoke
regal spoke
#

Think about the leaves

#

They only have one neighbour

narrow mica
stiff quartz
#

1-2-3
Now xor of 1and3 is 2 and 2 has two neighbors, I’m not sure how I « get back » to 1 and 3

#

9-2-11
xor of 9and11 is also 2 and 2 has two neighbors too

#

🤔

regal spoke
#

The trick is to remove one leaf at a time

stiff quartz
#

Ah and we only ever treat nodes with degree 1

#

Though it seems I can’t do a dfs starting from a non-leaf

#

Those two lines

import sys
input = [int(x) for x in sys.stdin.read().split()][::-1].pop

Take 0.75 seconds

#

So yeah my adjacency list is actually decently taking a lot of time on top of reading the input from stdin

regal spoke
#

On cses, all fastest solutions for tree problems use this technique, including finding diameter

#

I would call it xor-representation of a tree

regal spoke
stiff quartz
#

mmm you're right

#

I'm going to try tuples first

#

It seems easier conceptually and might be faster than my current hashmap approach

#

I gain 0.2s on graph creation

regal spoke
#
deg[root] = -1
for node in range(n):
  while deg[node] == 1:
    deg[node] -= 1
    nei = xor[node]
    xor[nei] ^= node
    deg[nei] -= 1
    node = nei
#

This how to traverse an xor tree

#

No need for any stack/queue

#

You can pick which node becomes the root by doing deg[root] = -1 before running the traversal code

stiff quartz
regal spoke
#

no

#

You simply want deg[root] == 1: to always return False

stiff quartz
#

Ah it'll start at the other leaves

#

mmmh

#

Let's consider the graph 3-1-2

#

let's take root = 3

#

the for loop will go to node 1 first, and not do anything because its degree is 2

#

and then it'll go to node 2 and decrease the degree of 1 but it's too late, no?

regal spoke
#

ops I missed something

#

Fixed

stiff quartz
#

AH ok

#

makes sense

#

spoiler the tuple version timed out too though i'm pretty sure it was faster

regal spoke
#

I'm sure that this xor representation is by far the fastest

stiff quartz
#

Yep, going to work at it

#

Wanted to try the easy improvement first

regal spoke
#

Optimally, you only traverse the tree once

stiff quartz
#

ah yes if you re-traverse you kind of have to reset everything?

regal spoke
#

no

#

xor contains parent

stiff quartz
#

?

#

I mean if you do a second unrelated dfs

#

on the tree

regal spoke
#

After traversing

#

xor will become parent

#

So traversing 2nd time is easier

stiff quartz
#

!e

class XorTree:
    def __init__(self, n):
        self.deg = [0] * n
        self.xor = [0] * n

    def add_edge(self, u, v):
        self.deg[u] += 1
        self.deg[v] += 1
        self.xor[u] ^= v
        self.xor[v] ^= u

    def traverse(self, root):
        self.deg[root] = -1
        for node in range(len(self.deg)):
            while self.deg[node] == 1:
                self.deg[node] -= 1
                neighbor = self.xor[node]
                self.xor[neighbor] ^= node
                self.deg[neighbor] -= 1
                node = neighbor

tree = XorTree(5)
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
for u, v in edges:
    tree.add_edge(u, v)

tree.traverse(0)
print(tree.xor)
print(tree.deg)
halcyon plankBOT
stiff quartz
#

It appears you're right (I'm still to try to understanding the magic behind it)

#

But tree.deg is messed up

#

If I want to do tree.traverse(4), well I can't

regal spoke
#

If you are smart, you just need 1 traversal. Think about how to do it with 1 DFS. It is possible by keeping track of the two deepest descendants (from different subtrees) for each node

regal spoke
#

Python has to do hash table look ups every time you want to use a variable

stiff quartz
#

ah

#

yes I don't need it

#

it was just easier to play around to understand it

stiff quartz
#

the data structure just became twice as cool

patent junco
#

someone know how to get variable from random one of generated/forked processes?

#

like i did an algo to generate n number of processes each with a random number and i want to get from one (or more, like 3 or 5) of these processes variables…

patent junco
#

nvm. possibly did it

lean walrus
#

sounds like you are creating problems for yourself, tbh

acoustic wadi
#

is python rly slower than java ?

golden ravine
#

@stable pecan your library was so helpful thanks again being able to generate truth tables directly makes it much easier even the teacher liked it

golden ravine
# acoustic wadi is python rly slower than java ?

Pretty new so someone should correct me if I'm wrong but python is a interpreted language which are much slower then compiled languages like Java

However I think some ides go ahead and just compile python so not sure if it fully matters

acoustic wadi
#

mm yeah but what about run time?

#

i wrote some code a while back that did the same in thing in java and then in python. python took several times longer but now im wondering if i just wrote it super unoptimised

golden ravine
#

Uhhh again pretty new I would look in things like big o notion I think that's what determines speed and you can check how optimised a program is but overall java is faster

lean walrus
#

you can write slow code in any language

#

pure Python should be generally slower than pure Java

narrow mica
acoustic wadi
#

Yummers

narrow mica
#

stuff like being statically typed also allows Java to more aggressively optimize

acoustic wadi
#

Agh

opal oriole
#

It depends on the implementation. There are several for both. Also statically typed makes it easier, but given how fast V8 is for Javascript which is not statically typed, it seems that it's more about how much effort is put into it (the JIT).

#

Python by design has a bunch of slow things (pointer chasing, memory usage).

#

GraalPy runs on the java virtual machine, so it's somewhat comparable.

#

(And Jython)

narrow mica
#

there's also pypy which is a jitted python (that doesn't rely on jvm)

#

numba's a library that can jit functions

opal oriole
#

As can Taichi (including targeting the GPU) (it can also ahead of time compile it, so not JIT, just full compile).

#

(It's better than numba from my testing)

narrow mica
#

that's a new one I've not heard of

opal oriole
#

CPython also has its own JIT now in 3.13, but it's not enabled and not even close to being ready.

#

(Probably ready by 3.15 earliest)

narrow mica
#

the other angle to make python fast is to not use python
e.g. numpy & friends

lean walrus
#

numba/taichi good only at jitting numerically heavy code
they are awful at optimizing regular python

#

stuff like nuitka/mypyc/cython can transpile python to C, giving you a decent speedup of pure Python code

haughty mountain
opal oriole
#

Most JITs struggle to get close to the JVM or V8, and that is because making a JIT that works that well is something that takes many years. CPython may get there eventually if they keep at it, but it will probably be around 6+ years.

#

Considering how much Python code is out there, worth it. Making all of it faster without any work on the user side is great.

hushed nova
#

Need a suggestion : so im done with python basics and now wants to get started with DSA, is there any good Playlist I can follow?, I tried strivers list but since the Implementation is in java I find it quite hard to understand the conceptsducky_concerned

empty hound
#

hey guys, idk where to put this question, what do you call that thing where you split the paragraph by sentences while you add a newline. you also have to delete characters like ! ? , . - and others. after that, you have to label each sentence if it's positive or not. this is about natural language processing (NLP)

patent junco
haughty mountain
#

(possibly an xy problem)

patent junco
#

okay. now a more proper question (but please - be honest, i'm doing something non-debuggable/non-testable)
someone knows how to properly:

  • import file with "code" making a bunch of "fork" processes with random data
  • get variable from them in main file (not that forker)
  • delete all of these processes (also in main file)
  • process that (idk, some dividing that random num or taking last num+first num… like that)
    ?
#

||yep, i'm writing thing wabbit-based||

#

code is like that currently and idk if proper:


var =0 

def brick:
    import forker
    thing = var
    os.kill(current, 9)

brick()
if thing == …

and second (forker) file:

import os

var = var + random.choice(0,1,2,3,4,5,6,7,8,9)

while 1: #maybe some other number but yk…
    current = os.fork()

lean walrus
#

while 1: ... and while 2: ... do the same thing

patent junco
#

for i in 2 : ?

haughty mountain
#

you probably want for i in range(2):

#

but regardless, this thing sounds odd, maybe you can give some sensible high level overview of what you're actually trying to do

regal spoke
#

!e

for i in 1,...,3:
    print('hi')
halcyon plankBOT
haughty mountain
#

!e

for i in 2, 3, ..., 7:
  print(hash(i) % 9)
patent junco
#

i love …

halcyon plankBOT
patent junco
topaz otter
#

is it normal for insertion sort to take 5 minutes plus sorting data from text files greater than 5000 integers?

haughty mountain
#

sorting 5000 values shouldn't be insanely slow

#

could be a bad impl that's actually O(n³)

regal spoke
#

5000 should take 1s max

narrow mica
topaz otter
topaz otter
regal spoke
haughty mountain
#

!e here's a naive insertion sort that's O(n²)

import random
import time

start = time.perf_counter()

a = list(range(5000))
random.shuffle(a)

res = []
for x in a:
  ins = 0
  for ins, y in enumerate(res):
    if y > x:
      break
  res.insert(ins, x)

print(time.perf_counter() - start, 'seconds')
halcyon plankBOT
haughty mountain
#

does the bot fail to see edits if you have stuff above the code?

regal spoke
haughty mountain
#

!e

import random
import time

start = time.perf_counter()

a = list(range(5000))
random.shuffle(a)

res = []
for x in a:
  ins = 0
  for ins, y in enumerate(res):
    if y > x:
      break
  res.insert(ins, x)

print(time.perf_counter() - start, 'seconds')
halcyon plankBOT
haughty mountain
topaz otter
haughty mountain
#

just share the code?

#

my guess is that your code is cubic

#

cubic sounds about right for several minutes for 5000

topaz otter
haughty mountain
haughty mountain
topaz otter
topaz otter
haughty mountain
#

which is why we would need to see it

topaz otter
haughty mountain
#

if you want to run for 1000000 that will take...a while

topaz otter
haughty mountain
#

assuming it's actually cubic

You have: 10 minutes * (1000000/5000)**3
You want: years
        * 152.10607
#

150 years

#

even a quadratic solution would be very slow

topaz otter
haughty mountain
#

or just share the code and we could point out some probably obvious error

#

!e

def InsertionSort(A):
  for j in range(1, len(A)):
    key = A[j]
    i = j - 1
    while i > 0 and A[i] > key:
      A[i + 1] = A[i]
      i = i - 1
    A[i + 1] = key

import random
import time

start = time.perf_counter()

a = list(range(5000))
random.shuffle(a)

InsertionSort(a)

print(time.perf_counter() - start, 'seconds')
halcyon plankBOT
haughty mountain
#

see, an impl of that pseudocode

#

runs in about the expected time

#

and 1000000 for that might take more like tens of hours

#

so clearly you're doing something bad in your implementation

topaz otter
topaz otter
regal spoke
#

This one takes about 13s for 1e5

#

For 1e6 it would take around 20 min

#

For 5000 it takes just 0.02s

haughty mountain
haughty mountain
#

presumably what you can do is run it for a bunch of datasets but then decide to set some time limit

#

like, if it takes more than x minutes then just stop and say that time means >x minutes

haughty mountain
regal spoke
#

yes

topaz otter
#

I need to go now, thanks everyone for helping 😀

fluid bone
#

Hey I'm working on a Linux package manager (essentially writing my own distro from scratch as a hobby project).

This will be the first time I need proper algorithms, in particular I need to work out how to traverse dependencies.
(packageA requires packageB and it must be >= 1.0.0, packageB also requires packageC and it must be version >= 2.0.0).

From what I've read the preferred method to do this is presents as a graph problem that can be solved either using "BFS" or "DFS".
I'm working on a class to standardise using this algorithm, any tips for implementation would be greatly appreciated. Thanks 🙂

regal spoke
#

What you are looking for is called topological sort

#

I think in Python there is a built in topological sort in the module graphlib. I've never tried using it myself.

#

But I think it might be exactly what you are looking for

solemn moss
#

huh graphlib

#

weird module

#

only cycles check and topsort

hoary flare
#

Not really a data struct question. kinda ig, but does anyone know a good way of having a byte array in python with more than 8 bits per element? and I mean like 17 bits.

#

I could ofc use a list, but seems kinda gross plus, I need it to be compact and short of making my own data struct, I can't come up with a better solution.

#

ig I could store it in a string.

#

idk, anyone have any better ideas.

#

also, I can't import any libraries (short of math and random lol)

regal spoke
#

However, you can store three 17 bit values in one 64 bit int, using some bitshifts and bitands.

#

So you can pretty easily save a factor of 3 in memory compared to storing one element per index in the list

hoary flare
hoary flare
#

so you reckon using a list is the way to go then?

regal spoke
#

Yes

hoary flare
#

aight, sick. appreciate it. now I just gotta fix this dumbass stubborn bug, then hopefully I'll implement using 3 17 bit bytes per int.

haughty mountain
#

a numpy array of uint64 might also be an option

#

if it's worth using kinda depends on how you will actually use them, if you express your usage as vectorized operations that can win you a ton of performance

patent junco
#

someone here knows how to increment a variable if it isn't empty and if is - create it?

patent junco
#

ok. linter shutted :D

patent junco
#

okay. now other problem. how to fix THIS: ???

def importer():
  from file import custom
  global custom

i get very much errors…

#

(mostly about unused variable, i use ruff cause file quite dangerous to be ran…)

lean walrus
#

its job is to complain about all sorts of unnecessary stuff

patent junco
#

but as far i heard this isn't global…

haughty mountain
patent junco
#

i tried to debug it with someone else (due to hardware) and he was getting IndexError: string index out of range on line where variable was used

patent junco
haughty mountain
#

I'm surprised it's even allowed to have the global there

#

typically is needs to go before the usage

patent junco
#

okay fixed. broken was positioning (i'm dumb person)

patent junco
#

nvm. still broken… (decommented sussy lines to be able to run)

haughty mountain
#

thing is not defined there

patent junco
#

deleted whole function. idgaf

patent junco
#

and works. now only fix image generation and whole project done :D

fair marsh
#

why do you have a function with the same name as the module?

stiff quartz
#

I've been trying with
dp[i][j] => [largest arithmetic rectangle ending at (i, j) (included), beginning of that rectangle's row, beginning of that rectangle's column]

#

so size NxMx3

#

But for examples like this:

3 5
0 0 0 0 0
0 0 0 0 0
1 0 0 0 0
#

It always fails because if you only remember the biggest area at i' < i and j' < j you might "miss good opportunities to make an even bigger rectangle"

#

Like for example, when you're here

3 5
0 0 0  0 0
0 0 0  0 0
1 0 0* 0 0
#

You would want the guy to your left to "store" that rectangle:

3 5
0 0* 0 0 0
0 0* 0 0 0
1 0* 0 0 0

But it'll "store" that rectangle:

3 5
0  0  0 0 0
0* 0* 0 0 0
1* 0* 0 0 0
#

So you can't make that rectangle

3 5
0 0* 0* 0 0
0 0* 0* 0 0
1 0* 0* 0 0
narrow mica
fluid bone
shell burrow
#

Why does it show an error can someone help?

#

KeyError: 'M'
~~~~~~^^^
num += romans[i]
Line 12 in romanToInt (Solution.py)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ret = Solution().romanToInt(param_1)
Line 32 in _driver (Solution.py)
_driver()
Line 43 in <module> (Solution.py)

regal spoke
#

instead of 'M':1000

shell burrow
#

Thanks

obtuse merlin
#

Check out my video! | Captured by Outplayed
https://outplayed.tv/media/KEOgKe

Outplayed - The ultimate capture app for gamers

Outplayed - The ultimate capture app for gamers. While playing, it automatically captures your best moments and biggest plays. When the match is over, relive your best (and not so best) moments by watching them in the match timeline.

▶ Play video
regal spoke
#

It very much was possible to AC with Python

#

The XOR tree technique is OP

#

In the end I did two graph traversals. It is probably possible to do it in just one, but that would be far more complicated than doing it with two

#

First traversal is used to find the diameter (a path), and 2nd is used to find longest distance to the path from any node outside of the path

#

Optimized it a bit more

#

They didn't have to make TL be 2s. 1s is plenty pythonk

pulsar pumice
#

how to solve this issue?

stiff quartz
stiff quartz
naive oak
regal spoke
naive oak
#

thx

stiff quartz
naive oak
#

that would make sense

#

thought i was logged in

haughty mountain
regal spoke
#

It really shouldnt matter

stiff quartz
#

That had taken me 0.75 seconds

#

how the hell did you solve the whole problem in like 0.00 seconds

regal spoke
#

CPython or PyPy?

stiff quartz
#

did you improve the stdin as well in your optimised thing?

regal spoke
#

no

stiff quartz
regal spoke
#

I'm using sys.stdin.buffer, which should be slightly faster

#

but thats about it

stiff quartz
#

With merely those two lines

import sys
input = [int(x) for x in sys.stdin.read().split()][::-1].pop

#

not that it matters that much, I'm just very surprised, it means your xor tree is really really damn fast

haughty mountain
#

inb4 the bytes io speed it up a tad, but the actual logic for solving the problem is basically free

stiff quartz
#

Well, still surprised at the basically 0.00 s to do two traversals (n up to 10^5)

regal spoke
#

n = 10^5 is pretty smmall

stiff quartz
#

alright that's fair

haughty mountain
#

this is also code that's kinda written to JIT well

regal spoke
#

I also dont know how much you can trust spoj's timings

haughty mountain
#

very raw

stiff quartz
#

I was mainly wondering if you had a secret sauce to make input significantly faster

#

so if it's just the problem being not expensive to solve

#

i'm happier cos that means no black magic for me to decipher

haughty mountain
#

you could try the buffer on stdin

#

basically lets it work with bytes rather than str

stiff quartz
#

input = [int(x) for x in sys.stdin.buffer.read().decode().split()][::-1].pop like so?

#

well if i use decode

#

i get a string back

#

surely that's not it

regal spoke
#

int() is fine with byte strings

stiff quartz
#

ah so i don't need .decode()

regal spoke
#

Yes, see my code

stiff quartz
#

ah i thought you had only done it on your "optimised version"

#

my bad

#

thanks

#

yeah it helps quite a bit

#

apparently, if we trust spoj timings

regal spoke
#

input = (int(x) for x in sys.stdin.buffer.read().split()).__next__

#

input = iter([int(x) for x in sys.stdin.buffer.read().split()]).__next__

haughty mountain
regal spoke
#

I don't know how much these times can be trusted (my guess is dont trust them at all)

#

But lowest memory usage on the generator one makes sense

haughty mountain
#

it makes sense for 3 to be faster than 2 at least

stiff quartz
#

3 faster than 2 because you read everything beforehands?

#

but more mem usage because you have to store it all?

haughty mountain
#

yeah, you just do the work in one batch

stiff quartz
#

fair fair

regal spoke
#

So ye, these times cannot be trusted at all

stiff quartz
#

at least the memory usage makes sense

#

and seems quite consistent

cosmic glacier
#

Hi everyone, I already tried everywhere but apparently i have no luck. I started with algo trading and crypto a few years ago, now I moved on the stock market. i am using backtrader as core library to build a backtesting system. Has to be said, im not a developer but i have python rudimentals. Now i implemented a basic genetic algorithm for hyperoptimization of parameters and I'm looking for fellow students/practitioners that have a similar goal so we can join forces, exchange ideas, help each other and collaborate. Whoever is interest let me know.

stiff quartz
#

the one with just [int(x) for x in sys.stdin.read().split()][::-1].pop

regal spoke
stiff quartz
#

instead of sys.stdin.buffer.read

regal spoke
#

I wonder how much of this is just because spoj is running ancient versions of pypy and cpython

#

PyPy has changed its io/strings since some while back

stiff quartz
stiff quartz
#

When you do the graph traversal with the xor tree

#

It's neither dfs nor bfs from the root

#

It's like going from leaves all the way up

regal spoke
#

Fun story
Back when we tried to get fastest solve on all cses tree problems using this xor technique (which we were successful at).

#

There are some tree problems that require dfs

#

At first you might think this means it is impossible use the xor traversal

#

But turns out that if you are smart about it, you can compute a dfs ordering using a xor traversal

#

This got to be one of the weirdest ways to do a dfs out there

merry lintel
#
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
   
class LinkedList:
    def __init__(self):
        self.head = None
    
    def insertAtHead(self, data):
        new_Node = Node(data)
        # head is a variable storing the location of head. It does have attributes like data and next
        if self.head is None:
            self.head = new_Node
        else:
            new_Node.next = self.head
            self.head = new_Node

    def insertAtEnd(self, data):
        new_Node = Node(data)
        pointer = self.head
        flag = True
        while flag:
            val = pointer.next is None
            if val:
                pointer.next = new_Node
                flag = False
            pointer = pointer.next

llinkedlist = LinkedList()

llinkedlist.insertAtHead(999999)
print(llinkedlist.head)
llinkedlist.insertAtEnd(23423)

pointer = llinkedlist.head
while pointer.next:
    print(pointer.data)
    pointer = pointer.next

insertAtEnd operation is working as it is supposed. Can you anyone help me out please?

rapid valve
#

OrderedDict
Unlike the regular dict structure, it preserves the insertion order. As of Python 3.7, the standard dict also preserves the insertion order, but OrderedDict provides this feature in older versions.

İs it true? So is there a difference then?

rapid valve
merry lintel
rapid valve
#

pointer.next.value != None Try (Pseudocode)

#

for the while loop

merry lintel
#

the loop for printing the data?

rapid valve
# merry lintel the loop for printing the data?
    def insertAtEnd(self, data):
        new_Node = Node(data)
        if self.head is None:
            self.head = new_Node
            return
        
        pointer = self.head
        while pointer.next:
            pointer = pointer.next

        pointer.next = new_Node
``` 👀
#

There was potential for an infinite loop in your code. You were also not printing the last item

merry lintel
#

@rapid valve I figured it out. The error was in how I printed in the values,

pointer = llinkedlist.head
while pointer:
    print(pointer.data)
    pointer = pointer.next

this works fine

rapid valve
# merry lintel I tried to debug it. But there was no infinite loop.

Only potential situations. Example when you convert a Circular Linked Lists. When you're addicted to circual dependency. This is also valid for the code I sent.

Also I don't know if it is thread safe. If it's not safe, it's still a problem. But these are more advanced topics. You don't need to worry about them so much.

#

Let the end of the Linked list point the head.

#

🤓

merry lintel
#

Let the end of the Linked list point the head
@rapid valve Are you referring to the circular linked list or some kind of cryptic message lemon_glass ?

regal spoke
#

🚤

stiff quartz
#

with a stack and 2 standard dfs i was at 0.55s

regal spoke
#

The xor technique really does wonders

stiff quartz
#

Such a weird thing that no one talks about it online

#

but I guess it doesn't change the complexity so it's not that big of a deal for the folks that do CP in C++ unless they just wanna have the fastest submission on a problem

stiff quartz
#

So I finally made it work on the initial problem

#

but I couldn't understand what you did there:

    # Greedily remove nodes from either end
    s = diameter - s
    
    i = 0
    j = len(weights) - 1
    
    sa = 0
    sb = 0
 
    while i <= j and sa + sb < s:
        if sa + weights[i] < sb + weights[j]:
            sa += weights[i]
            i += 1
        else:
            sb += weights[j]
            j -= 1
#

My code is to find the middle and greedily remove from the middle to both side

    # Find middle
    l: int = 0
    r: int = len(path) - 1
    sum_left: int = 0
    sum_right: int = 0
    while l < r:
        if sum_left < sum_right:
            sum_left += weights[l]
            l += 1
        else:
            sum_right += weights[r-1]
            r -= 1

    # Greedily remove from either side
    while l > 0 or r < len(path) - 1:
        if sum_left == 0 and sum_right == 0:
            break
        if sum_left < sum_right:
            # nullify the right side
            weight = weights[r]
            if weight <= s:
                s -= weight
                sum_right -= weight
                r += 1
            else:
                break
        else:
            # nullify the left side
            weight = weights[l-1]
            if weight <= s:
                s -= weight
                sum_left -= weight
                l -= 1
            else:
                break
#

And yours starts with s = diameter - s which I'm not sure I get

#

does that mean you remove the whole path even if you don't have the money and then if diameter > s then you have to "unremove" some you removed?

regal spoke
#

and weights is a list of the weight of the edges of the path

#

You are told to remove >= diameter - s of edge weight from this path.
I do this greedily.

#

There is no "middle" in the way I think of it

#

There are just two ends of a path

mortal ridge
#

Dumb question, what's the python StringBuilder() equivalent? I'm in one of those rare situations where I'm concatenating hundreds of millions of tiny strings and I actually need it.

#

Just add build a list and l.join("")? Is there anything that actually builds it incrementally so that the tiny strings can be collected?

narrow mica
#

ig if you really wanted a dynamic string you'd have to compromise with a list of singular characters
or maybe there's a library out there

jolly mortar
#

in cpython s += t is optimized to mutate the existing s instead of creating a new one

#

if there's no other references to s

narrow mica
#

o interesting

jolly mortar
#

pypy doesnt do it

tight cedar
#

what is meant by build incrementally?

mortal ridge
#

Ah yeah other order, mb

mortal ridge
mortal ridge
jolly mortar
# tight cedar what is meant by build incrementally?

strings arent mutable, so if you did s += t it's supposed to create a new string in memory where it copies over the characters in s and t
by building incrementally they mean modifying the s string in-place instead of creating a new one

mortal ridge
#

Unless if you already have the string you already have the refcount.

jolly mortar
#

you already have the refcount yeah

#

all objects do

mortal ridge
#

I thought it required a dereference

#

But I suppose you're sort of doing that already so why not grab the object header

#

wack

tight cedar
#

does io.StringIo do what you want? I haven't seen ppl use it tho

mortal ridge
#

To be honest I'm not sure. My assumption was that it was meant to be backed by something.

covert thorn
#

<@&831776746206265384> piracy of some fintech thingy. premium is $600/yr

rigid steeple
#

!cban 312503822914289664 your only message is advertising pirated software

halcyon plankBOT
#

:incoming_envelope: :ok_hand: applied ban to @worthy skiff permanently.

regal spoke
# jolly mortar in cpython `s += t` is optimized to mutate the existing `s` instead of creating ...
regal spoke
#

Thats probably the closest you'll get.

#

Other than using ''.join

#

Fun fact, PyPy has some string builders in the __pypy__ module. __pypy__.builders.BytesBuilder() and __pypy__.builders.StringBuilder().

hoary flare
#

Hey, y’all. I was wondering if anyone knew any good ways to decrease the false positive rate of a bloom filter. Currently, I hash the input with a cyclic shift and the compress it with MAD ((a*k + b) % p) % N where n is the size of the bloom filter, p is the next prime after N and a and b are randomly generated such that they are both less than p (also I keep sampling them until I get a unique a and unique b). Also, I just took the size of the bloom filter and the number of has functions from the wiki.

#

But rn, I’m only getting a false positive of about 3.8%. So other than increasing the number of bits, is there a good way I could try and bring down the false positive rate?

#

Idk, maybe there is a better way to hash it or smth.

regal spoke
# hoary flare Idk, maybe there is a better way to hash it or smth.

If you think you've got a bad hash function, then you could try using a theoretical optimal hash function.

from collections import defaultdict
import random
hasher = defaultdict(lambda: random.randrange(2**61))

If this has the same false positivity rate, then no you cant improve your hash function

hoary flare
#

Oh yeah, would that generate random hash functions in the range of your bloom filter? Sorry, I’m unaware of the collections.defaultdict class.

#

Good idea though.

regal spoke
#

defaultdict is just a dict with a default value

#

I use it a ton. For example defaultdict(lambda: 0) or defaultdict(lambda: [])

#

Very convenient to use

hoary flare
#

Okay, that makes sense. So, how would I use that to get a theoretical optimal hash?

regal spoke
#

It maps integers to a random value in [0, 2**61)

hoary flare
#

Oh true. Couldn’t you just use random.randint?

regal spoke
#

Ofc

#

random.randint(0, 2**61-1)

#

is the exact same thing as random.randrange(2**61)

#

I just find randrange nicer to use

hoary flare
#

Yeah, okay. My bad, I’ve honestly never used randrange before.

#

That’s actually a really sick idea tho. Thanks heaps btw

regal spoke
#

Note that this is only useful to "benchmark" your bloom filter.

hoary flare
#

Ye, ofc

#

Okay, lol. My hash function sucked ass. I got 10^-3% as the false error rate for the “optimal” one and 3.8% for mine. Could the be any reason why mine sucked so much? Or is there a better way to hash it?

lean walrus
#

p is the next prime after N
that doesn't feel good

#

also, using random a and b is not good either

#

try picking extremely large prime a,b,p and testing your hash function again

hoary flare
lean walrus
#

everything should be prime

hoary flare
#

Fr?

lean walrus
#

I would at least recommend so

#

random a and b are definitely bad

hoary flare
#

How come? Also how should I sample a and b then?

lean walrus
#

there are ways to generate big prime numbers

hoary flare
#

Yeah, I know but should I just choose any big prime?

regal spoke
#

((a*k + b) % p) % N to me what sticks out here is that the addition with b does almost nothing

#

I'm sure there are better hash functions out there for this

regal spoke
#

But if you see improvements, then your current hash function is inoptimal

lean walrus
#

if we are generating hashes using this, then I agree that b does almost nothing
it shifts all possible results, and that is it

hoary flare
#

Okay, so implemented it. Choosing a large p helped by .3% but choosing large primes a and b seemed to increase it by .4% from that strangely

#

To be fair, I just generate random a and b between p//2 and p - 1 such that a and b are prime. So it's still random. But I don't really see that mattering, ig I could choose the prev prime before p and then the next prev etc? Would that actually make a difference tho?

hoary flare
#

Actually the only thing that made a difference (+0.2%) was more carefully choosing p. Is it possible that cyclic shifting is making it fairly high?

lean walrus
#

can you show your code?

#

!paste

halcyon plankBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

hoary flare
lean walrus
#

huh? I'm trying to help

#

ok

hoary flare
#

Actually fuck it, doesn't matter just immediately a sec

lean walrus
#

!pypi bitarray

halcyon plankBOT
#

efficient arrays of booleans -- C extension

Released on <t:1704143281:D>.

hoary flare
#

Yeah that's what I was thinking

lean walrus
#

give me an hour, I will get back home and play with your code

hoary flare
#

Thanks heaps dude.

#

No stress tho, I don't wanna waste your time if you don't wanna do it, but I would really love the help

#

Plus, imma go to bed so I won't be able to get back to you for a few hrs.

hoary flare
regal spoke
#

Oh wow

#

really

hoary flare
#

Ya. Pretty crazy

regal spoke
#

Here is something to try

struct custom_hash {
    static uint64_t splitmix64(uint64_t x) {
        // http://xorshift.di.unimi.it/splitmix64.c
        x += 0x9e3779b97f4a7c15;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
        x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
        return x ^ (x >> 31);
    }

    size_t operator()(uint64_t x) const {
        static const uint64_t FIXED_RANDOM = chrono::steady_clock::now().time_since_epoch().count();
        return splitmix64(x + FIXED_RANDOM);
    }
};

Which translated to python would look something like this

import random
FIXED_RANDOM = random.randrange(2**64)
MASK = 2**64 - 1

def splitmix64(x):
    x = (x + FIXED_RANDOM) & MASK
    x = (x + 0x9e3779b97f4a7c15) & MASK;
    x = ((x ^ (x >> 30)) * 0xbf58476d1ce4e5b9) & MASK;
    x = ((x ^ (x >> 27)) * 0x94d049bb133111eb) & MASK;
    return x ^ (x >> 31);
hoary flare
regal spoke
#

yes

hoary flare
regal spoke
#

Pythons built in hashes for strings are pretty good

hoary flare
#

actually, nvm. I think I got it.

regal spoke
#

Try either splitmix64(hash(S)) or hash(S)

hoary flare
#

More or less the same

#

It's probably smth to do with the compression function then.

lean walrus
#

it specifically returns non-prime numbers

#
def __str__(self) -> str:
    self._data.__str__()
``` do you have any typechecker enabled? this contains an error
#

how are you checking your error rates? can i see that code?

#
def find_in_arr(self, arr: list, elem1: Any, elem2: Any = "") -> bool:
    for i in arr:
        if i == elem1 or i == elem2:
            return True
    return False
``` that reduces to `elem1 in arr or elem2 in arr`
#
p = BloomFilter.next_prime(self._num_bits)
``` even if that generated a prime number, it would be comically low
hoary flare
lean walrus
#

also, all your hash functions are the same

#
for i in range(self._num_hashes): # initialise hash functions lol seems kinda goofy but whatever
    a = b = -1
    while self.find_in_arr(used_ab, a, b):
        a = random.randint(1, p - 1)
        b = random.randint(0, p - 1)
    used_ab[2*i] = a
    used_ab[2*i+1] = b
    self._hash_funcs[i] = lambda x: self.compression_function(x, a, b, p, self._num_bits)

all created lambdas will refer to the last value of a and b

hoary flare
#

What, really? But I generate new a and b each time?

lean walrus
#

that is how lambdas work

#

they do not capture all variables individually, they capture a reference to frame and extract variables from it when needed

#

lambda x,a=a,b=b: ... this will fix it

hoary flare
#

Really appreciate the help btw dude.

lean walrus
#
from collections.abc import Callable
import math
import random

from bitarray import bitarray


def myhash(x: object) -> int:
    return hash(x)

def is_prime(n: int) -> bool:
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def next_prime(n: int) -> int:
    while True:
        n += 1
        if is_prime(n):
            return n


class BloomFilter:
    data: bitarray
    hash_funcs: list[Callable[[object], int]]
    size: int

    def __init__(self, max_keys: int, error_rate: float = 0.01) -> None:
        num_bits = round(- 2.08 * max_keys * math.log(error_rate, 2))
        num_hashes = round(- math.log(error_rate, 2))
        self.hash_funcs = []

        p = next_prime(num_bits)
        for _ in range(num_hashes):
            a = random.randint(1, p - 1)
            b = random.randint(0, p - 1)
            self.hash_funcs.append(lambda x,a=a,b=b: ((a * myhash(x) + b) % p) % num_bits)

        self.size = 0
        self.data = bitarray()
        for _ in range(num_bits):
            self.data.append(0)

    def add(self, key: object) -> None:
        for func in self.hash_funcs:
            self.data[func(key)] = 1
        self.size += 1

    def __contains__(self, key: object) -> bool:
        for func in self.hash_funcs:
            if not self.data[func(key)]:
                return False
        return True
``` this is what i am left with
#

works for me

#

i guess your math is just wrong on these lines ```py
num_bits = round(- 2.08 * max_keys * math.log(error_rate, 2))
num_hashes = round(- math.log(error_rate, 2))

#

i can get arbitrary precision by increasing number of bits and hashes

lilac cradle
lean walrus
lilac cradle
#

are they? huh

#

ive tended to use them for some of my stuff since they're deterministic (unlike hash(), grrrrr)

#

wait wdym parametrizable

lean walrus
lilac cradle
#

well i mean
depends on your definition of different

#

cause some of them can spit out different amounts of bytes

lean walrus
#

there is definitely not a thousand of functions in hashlib

#

so it is useless for bloom filter

hoary flare
#

Cos mine works, but I was really hoping for a technique to decrease the false positive rate.

fluid sand
#

hey guys, i'm trying flatten json response
From:
input:```
[
{
"A1": "a",
"A2": "t",
"A3": [
{
"B1": 1,
"B2": "i",
"B3": [
{
"C1": "g",
"C2": [
{"D1": "Sa", "D2": 1},
{"D1": "Ss", "D2": 4}
]
},
{
"C1": "h",
"C2": [
{"D1": "Sb", "D2": 2},
{"D1": "St", "D2": 5}
]
}
]
},
{
"B1": 2,
"B2": "j",
"B3": [
{
"C1": "i",
"C2": [
{"D1": "Sc", "D2": 3},
{"D1": "Sd", "D2": 6}
]
},
{
"C1": "k",
"C2": [
{"D1": "Se", "D2": 7},
{"D1": "Sf", "D2": 8}
]
}
]
}
]
}
]

to this:

A1 A2 B1 B2 C1 D1 D2
0 a t 1 i g Sa 1
1 a t 1 i g Ss 4
2 a t 1 i h Sb 2
3 a t 1 i h St 5
4 a t 2 j i Sc 3
5 a t 2 j i Sd 6
6 a t 2 j k Se 7
7 a t 2 j k Sf 8

#
    records = []
    def process_item(item, parent_info=None):
        if parent_info is None:
            parent_info = {}
        if isinstance(item, dict):
            temp_record = parent_info.copy()
            for key, value in item.items():
                if isinstance(value, dict):
                    process_item(value, temp_record)
                elif isinstance(value, list):
                    temp_records = []
                    for sub_item in value:
                        if isinstance(sub_item, dict):
                            new_record = temp_record.copy()
                            process_item(sub_item, new_record)
                            temp_records.append(new_record)
                    if temp_records:
                        
                        for record in temp_records:
                            combined_record = temp_record.copy()
                            combined_record.update(record)
                            if all(combined_record.get(field) is not None for field in ['A1', 'A2', 'B1', 'B2', 'C1', 'D1', 'D2']):                         records.append(combined_record)
                else:
                    temp_record[key] = value
            if all(temp_record.get(field) is not None for field in ['A1', 'A2', 'B1', 'B2', 'C1', 'D1', 'D2']):
                records.append(temp_record)
        elif isinstance(item, list):
            for sub_item in item:
                if isinstance(sub_item, dict):
                    new_record = parent_info.copy()
                    process_item(sub_item, new_record)
                    if all(new_record.get(field) is not None for field in ['A1', 'A2', 'B1', 'B2', 'C1', 'D1', 'D2']):
                        records.append(new_record)
    for item in data.get('A', []):
        process_item(item)
    df = pd.DataFrame(records)
    df = df.dropna(axis=1, how='all')
    df = df.drop_duplicates()   
    df = df.reset_index(drop=True)
    return df
#

this is the recursive approach i'm following the only problem is I have to hardcode key names

lean walrus
hoary flare
#

Fair lol.

modern gulch
# fluid sand ```def flatten_json(data): records = [] def process_item(item, parent_in...

I don't have a good solution, but I have a ducky_santa solution, because I love my ducky_ninja . This is an iterative approach that unnests the right-most columns... there's more clever ways of doing this, but I aint going to keep going. Example below assumes you assigned data = [...] (your example above) ```py
import duckdb
import pandas as pd
from numpy.dtypes import ObjectDType
def unnest_dataframe(f):
col="data"
df = pd.DataFrame({col: f})
first_iteration = True
while isinstance(df[col].dtype, ObjectDType):
select_cols = f"columns(c -> c not like '{col}'), " if not first_iteration else ""
query = f'SELECT {select_cols} UNNEST("{col}") AS unnest_col FROM df'
df = duckdb.execute(query).df()
col = df.columns[-1]
first_iteration = False
return df
final_df = unnest_dataframe(data)
print(final_df)

haughty mountain
#

of course you would duck this up

haughty mountain
# fluid sand ```def flatten_json(data): records = [] def process_item(item, parent_in...

if you know what key is the final one (D2 in this case), then you can do something much simpler

def kv_stream(data):
  match data:
    case list():
      for x in data:
        yield from kv_stream(x)
    case dict():
      for k, v in data.items():
        match v:
          case list() | dict():
            yield from kv_stream(v)
          case _:
            yield k, v

def rows(leaf_key, kv_pairs):
  flat = {}
  for k, v in kv_pairs:
    flat[k] = v
    if k == leaf_key:
      yield flat.copy()

for row in rows('D2', kv_stream(json_data)):
  print(row)
```output:
```py
{'A1': 'a', 'A2': 't', 'B1': 1, 'B2': 'i', 'C1': 'g', 'D1': 'Sa', 'D2': 1}
{'A1': 'a', 'A2': 't', 'B1': 1, 'B2': 'i', 'C1': 'g', 'D1': 'Ss', 'D2': 4}
{'A1': 'a', 'A2': 't', 'B1': 1, 'B2': 'i', 'C1': 'h', 'D1': 'Sb', 'D2': 2}
{'A1': 'a', 'A2': 't', 'B1': 1, 'B2': 'i', 'C1': 'h', 'D1': 'St', 'D2': 5}
{'A1': 'a', 'A2': 't', 'B1': 2, 'B2': 'j', 'C1': 'i', 'D1': 'Sc', 'D2': 3}
{'A1': 'a', 'A2': 't', 'B1': 2, 'B2': 'j', 'C1': 'i', 'D1': 'Sd', 'D2': 6}
{'A1': 'a', 'A2': 't', 'B1': 2, 'B2': 'j', 'C1': 'k', 'D1': 'Se', 'D2': 7}
{'A1': 'a', 'A2': 't', 'B1': 2, 'B2': 'j', 'C1': 'k', 'D1': 'Sf', 'D2': 8}
simple rivet
#

hi guys

#

can someone give me a link to download jupiter

stiff quartz
pulsar garnet
#

I am looking for a developer who can solve captcha with python selenium.

narrow mica
#

I haven't really implemented it, but use an RMQ data structure to do the max(...) for h in 1..S part, and the S you need at H, J can be calculated quickly from S at H-1, J

stiff quartz
#

By couldn't make it work was more like

stiff quartz
#

but message got lost sorry

#

because when you mean ending at column J, it could end at column J row 5 or at column J row 200

#

So when I want to check if I can make a valid rectangle at column J+1

#

I don't know which row to look at

#

If that makes sense

narrow mica
#

or maybe the dp definition is just bad

stiff quartz
#

If at column J the biggest height that makes a valid rectangle starts at row 1000 and ends at row 1050 and you have 3000 rows

#

I'm not sure I get how you store that information in dp[50][J]

narrow mica
#

ehhh maybe the dp is bad

narrow mica
stiff quartz
#

maybe the comment threw me off and we have to do the histogram thing

narrow mica
stiff quartz
#

I don't know, I only know the monotonic stack solution

#

If there's a DP solution with a RMQ then obviously I guess we can use it here

narrow mica
#

or maybe I'm misremembering

stiff quartz
#

I'm going to check

narrow mica
uncut void
#

@narrow mica can I run something by you?

narrow mica
stiff quartz
uncut void
narrow mica
stiff quartz
#

thanks for digging that up

narrow mica
uncut void
kind scroll
#

damnn im just lookin at chats rn and im noticing how many smart people are in this serfver

marsh quartz
#

My question is about complexity analysis. There is a function that accesses 5 different hash tables, where each hash table has M keys (Cannot be treated as a constant) and each key has N characters (Cannot be treated as a constant). If a single hash operation is considered O(K) where K is the number of characters of the key, what should be the worst case complexity of this function?
How do I write it in the most concise way possible

haughty mountain
#

worst case? something like K + 5 M min(K, N)

marsh quartz
haughty mountain
#

you need to hash the thing

marsh quartz
#

M min(K, N)

haughty mountain
#

not in general

#

though it's kinda weird that K and N are even different variables

#

if K=N then it would dominate

merry igloo
#

looking for DSA guidance

oblique panther
merry igloo
#

there is too much overwhelming info

#

that is distracting and floating from one place to another

oblique panther
merry igloo
#

to get in FAANG

oblique panther
# merry igloo to get in FAANG

DSA isn't the most important thing to get into FAANG, but even if it were, FAANG isn't the be-all-end-all of software development careers.

merry igloo
#

i know but
It is also the way to get in

I want to know how to progressively learn programming where i can solve problems

oblique panther
merry igloo
#

thnks you ⭐

meager rivet
#

hello guyw

#

Does anyone know how to create a list from a variable

#

for example, lets say im taking an input like x = input("hello enter a number"), lets say, for the sake of example, i enter 20. How do i make a list called 20 out of that?

lean walrus
#

you can create a list of 20 elements
or list with a number 20 inside

meager rivet
#

how

#

i dont think

#

you understood

#

what im trying to do

lean walrus
#

ok

meager rivet
#

nvm

#

i dont get it either

#

my bad bro

spice glade
#

its possible, but you should never do this

meager rivet
#

why not

left mulch
#

There might be other reasons, but for one this code is hard to follow/understand

meager rivet
#

Hm alright

meager rivet
stiff quartz
#

I would like to remove 1 from all first k elements of an array in log(n) or faster where n is the size of the array. I want to be able to insert in this array in log(n) or faster as well. Seg tree + lazy propagation doesn’t work because you can’t really insert anything, right? Is there a way to do both?

#

Because I only ever need to look at the first k elements when I do the updates I feel like I could use SortedList

#

But I’m not sure I see a clear way on how to do it

regal spoke
stiff quartz
#

The thing is that if I allocate extra memory my updates will remove 1 to those unused positions right

#

And I’m not sure how I could discriminate between used and non-used (actually I could just have tuples int + bool instead of integers in my array)

regal spoke
#

You can have a segment tree with 0 1 to keep track which elements are "active"

stiff quartz
#

Yes that makes sense

#

And because I know I won’t insert more than x times with x <10^5

#

I can just initialize a 10^5 sized array with most elements marked as « inactive »

stiff quartz
regal spoke
#

Do you know the final order or not?

#

Of the inserts

stiff quartz
#

Not really I’m figuring them out on the go

regal spoke
#

Then I would a n^(1/3) datastructure

stiff quartz
#

I’d be fine with it tbf I probably wouldn’t see that much of a difference with logn

regal spoke
#

Do you know about the "sorted list" data structure?

stiff quartz
#

Yep

regal spoke
#

That one is basically what you'd want

stiff quartz
#

From the GitHub you usually reference

regal spoke
#

yes

stiff quartz
#

« Fast » range updates

regal spoke
#

Think about how to do n sqrt n first

stiff quartz
#

I « only » want to remove 1 to first k elements

regal spoke
#

"fast" then is sqrt n

#

No segment trees

lean walrus
stiff quartz
#

Not sure I get what you mean with sqrt - ah no I see, sqrt decomposition with plenty of small arrays - thanks!

regal spoke
#

Internally it uses insert sort into O(n^(1/3)) buckets

#

There are O(n^(2/3)) buckets in total

#

Once a bucket fills up, it is split into two

#

The split operation costs O(n^(2/3)), but amortized it is cheap, since you only have to do it after first doing O(n^(1/3)) inserts into the bucket that is split

lean walrus
#

ok, list of lists
dont see why you would use that instead of a heap
heap is a limiting case of the same idea - split lists into lists of lists to make insertion faster, until you get O(1) lists - that is a binary tree

stiff quartz
#

You can’t access the ith fast element in a heap (I think you can with SortedList)

regal spoke
#

The expensive steps, inserting, is done via a .insert call on a list

#

Which has a crazy low constant factor

#

since it is basically just C code

lean walrus
stiff quartz
#

Looks like you can’t do look ups in heap fast (you don’t know if you should go left or right)

#

Even with an additional DSA on the side

regal spoke
#

Iirc all of these sorted datastructures, heaps etc, ... has a way to keep track of "kth order statistics"

#

By maintaining some extra state

#

That said, I really dont think you'll beat the speed of sorted list with any of those

stiff quartz
#

So for my problem I could just do, with sqrt decomposition:

Insert in sqrt(n)
Lookup in log(n) (binary search)
Range update in sqrt(n) by "caching" the value I update to arrays entirely covered

#

And I guess if I know in advance the final size of my array, I should start with blocks of size sqrt(final size of array) instead of doing from time to time linear re-buildings right?

regal spoke
stiff quartz
#

Well if you know that your array is sorted

#

I was writing something like this to find where to insert (which eventually will be O(sqrt(n) because I'll call .insert after)

    def find_index_to_insert(self, value: int) -> tuple[int, int]:
        # Binary search to find the block
        left, right = 0, len(self.blocks) - 1
        while left < right:
            mid = (left + right) // 2
            if self.blocks[mid][-1] < value:
                left = mid + 1
            else:
                # It means self.blocks[mid][-1] >= value
                right = mid
        block_idx = left
        # Binary search within the block
        block = self.blocks[block_idx]
        left, right = 0, len(block)
        while left < right:
            mid = (left + right) // 2
            if block[mid] < value:
                left = mid + 1
            else:
                # It means block[mid] >= value
                right = mid
        return block_idx, left
#

And you can write something similar for lookup right, and because there's no .insert call afterwards, it's a log(n) lookup

#

Oh yes right initially I was only asking for range update and insert; turns out I just didn't talk about lookups because I knew if I had a sorted DSA I could do them fast

vale frost
#

hello i need courses web scraping & automation

past wren
#

does anyone have a good sentiment analysis program?

narrow mica
#

I was also gonna say you could probably do some shenanigans with a range update point query datastruct to keep track of indices, but then I looked and saw that's what SortedList is doing

dusky creek
#

What's a good website for DSA problems

#

preferably very well categorized

#

Leetcode is nice but i do dislike the premium popups

stiff quartz
stiff quartz
regal spoke
# narrow mica if python recursion wasn't so slow, you could use a merge split treap, which cou...

I have tried making iterative treaps in the past. https://github.com/cheran-senthil/PyRival/blob/4a735958bfd3ac130a1d60542a23390e99b756f4/pyrival/data_structures/Treap.py#L218-L232
But they just dont compare to the speed of sorted list. Even when n = 1e7

GitHub

⚡ Competitive Programming Library. Contribute to cheran-senthil/PyRival development by creating an account on GitHub.

narrow mica
# stiff quartz Sounds like a super complicated data structure 👀 never heard of it

it is a bit of a doozy
treap is bst + heap, each node stores (bst_key, heap_key=random()), and if you look at bst_key the structure is a bst, if you look at heap_key it's a heap, and the randomized nature keeps it pretty balanced so it won't degrade; you can implement it using rotate or merge & split, use the latter for our purposes
then instead of actual keys, use "implicit keys" that are calculated on the fly as you descend the tree which represent the array's indices

narrow mica
regal spoke
#

Never seen anyone try implementing sorted list in C++

#

It would be interesting

stiff quartz
#

ah it doesn't allow duplicates

#

but other than that it's the same right?

regal spoke
stiff quartz
#

ah yes fair enough

narrow mica
#

or well g++ ig cause c++

stiff quartz
#

not really a c++ user to be fair anyway haha

meager rivet
#

Hello, need some help with this school homework 😭

#

I'm trying to make a library that stores info about books and then i need to be able to output it via ID

#

what i dont understand is, how do i link the id to the book
how do i make it so that if for example
the input is "19" as in, give me the book with the id 19, how do i show the specific details of that book?

lean walrus
#

dict is perfect for this task

meager rivet
#

yes well

#

ive gotten somewhere

#

but eh

#

hard to explain

digital crane
#

How can I encode a hex string to a 5-6 length integer value without losing data or mapping? Then, I can decode it back to the original hex string again.
For example:

  • Encode: "664fbafb951c8e9cb32c7091" → return 238774
  • Decode: 238774 → return "664fbafb951c8e9cb32c7091"
meager rivet
#

is bro trying to unhash hashed data💀

hazy leaf
#

hey can someone tell me what's wrong in this,I want to filter the strings which only contains 'e' and 'r' in it

narrow mica
#

apply pigeonhole

brave hound
hazy leaf
brave hound
#

in the if condition you are checking if the letter is in er not erf

hazy leaf
#

@brave hound for this code I get the ints which only has 0s and 1s as output

#

but now I have to another thing

#

If I gave 32 as input,it gives me 0,1,10,11 as OP and I want these ints to sum to get the given intput

#

like If I give 32 as IP and the the ints with 1s and 0s from 0 to 32 are 0,1,10 and 11 and if I add any of these numbers they should give me 32

#

if I add 10 and 11 and 11,I get the sum equal to the given input which is 32

#

@brave hound can you help me to do it

brave hound
#

you can do something similar to a greedy coin change solution

#

basically you sort the numbers in reverse order and iterate over it, then check how many of that number can fit into the target like (two 11s = 22 <= 32, but three 11s = 33 > 32) and you add that many of that number into the result and move onto the next number in the sequence

glossy nexus
#

Im using xarray to open a weather grib dataset, there is 2 levels in it, one is isobaricinhpa (vertical weather layers in millibar) and one is for tropopause layer, any solution? I need both and dont want to filter it, just tryna output a grib into a netcdf

solemn moss
# regal spoke It would be interesting

Do you know some tasks to test it?

When doing just random inserts and erases i am getting pretty similar results for set and sortedlist, but sortedlist seems to be a bit faster than multiset (though i don't really trust these tests xd)

#

is faster than pbds tree - 1e6 inserts, erases and order_of_keys (first screen pbds, second sortedlist)

solemn moss
#

yep

regal spoke
#

Have you optimized the block size parameter?

solemn moss
#

nah

#

just used everything as you had

regal spoke
#

Oh did you translate the python code?

solemn moss
#

mostly

regal spoke
#

You might want to play around with the block size

solemn moss
#

also added erase by value

#

to not do remove(lb())

regal spoke
#

I don't know the best way to benchmark it

#

sorted list kind of has a worst case when you do insert at index 0 (i.e. in decreasing order)

solemn moss
regal spoke
#

I was kind of hoping this would beat std::set by a significant margin, since std::set is known to be slow

#

I've heard a lot of rants about how Abseil's set beats std::set by a mile

solemn moss
#

At least it is a nice alternative for pbds ordered set which is even slower

Also I believe it will be kinda 1.5x better than std::set on average

If inserting in increasing order it is >2 times faster

regal spoke
#

The real strength of it is if you need to iterate over the elements in some range [l:r]

#

Since everything is stored into decently sized vectors, it is memory contiguous

regal spoke
#

The cost of inserting should then decrease proportionally to the pointer bit size

#

So this could make the inserts significantly cheaper

stiff quartz
#

@regal spoke I'm not sure I remember well, and I can't find it, but you once said there's loads of stuff Mo's algorithm can do and seg trees can't, and one of them was Range Frequency Queries? Or do I remember poorly? (the latter is likely)

regal spoke
#

Mo is used when queries are complicated enough to not be possible to do with segment trees

stiff quartz
#

(I'm asking because I was looking at this problem which is apparently Mo's 101 (https://www.spoj.com/problems/DQUERY/) and there were some (albeit a tiny bit convoluted) seg tree solutions)

regal spoke
#

There are ways to for example query: count number of (different) elements in [l, r)

#

using a segment tree

#

But you need to work around it

stiff quartz
#

ah yeah that's exactly the one I just did

solemn moss
stiff quartz
#

by setting 1 to the last time you saw a particular value, right?

solemn moss
#

even though the complexity is worse

regal spoke
solemn moss
#

with changes

stiff quartz
#
def solve() -> None:
    n: int = get_int()
    a = line_integers()
    q: int = get_int()
    queries = [[] for _ in range(n)]
    # queries[j] represents a list of queries, each being the queries[j][k][1]th query -
    # that is between queries[j][k][0] and j
    for nb in range(q):
        l, r = line_integers()
        l -= 1
        r -= 1
        queries[r].append((l, nb))

    answers = [0] * q
    st: SegmentTreeSum = SegmentTreeSum([0] * n)
    last = [-1] * (10**6 + 1)

    for j in range(n):
        if last[a[j]] != -1:
            st.update_tree(last[a[j]], 0)
        st.update_tree(j, 1)
        last[a[j]] = j
        for l, nb in queries[j]:
            answers[nb] = st.sum_range(l, j)

    for answer in answers:
        print(answer)

gets me 2.33s

#

for count number of (different) elements in [l, r]

solemn moss
#

O(n^(5/3)) works fine, but O(nlog^2(n)) not

stiff quartz
#

You are not allowed to view the contest

regal spoke
solemn moss
stiff quartz
#

I was just curious as to how people solved it with seg trees when it's presented as a "Mo's 101"

regal spoke
solemn moss
#

i don't remember how to open it so just link xd

regal spoke
stiff quartz
stiff quartz
#

So the conclusion is more like

#

For some problems

#

There might be a seg tree solution

#

but the work to get it is just too much

#

and the Mo's solution is just a ton easier

regal spoke
#

to get something like n sqrt n

#

You rebuild the offline datastructure after every ~ sqrt(n) queries

#

There could be at most sqrt(n) updates that the offline datastructure doesnt know about at any point in time

#

Those updates you can handle in a brute force way. For each query, you simply check all relevent new updates in O(sqrt(n)) time and modify the result from the offline algorithm accordingly

solemn moss
#

xd

#

yeah sounds good

#

that's not fair 😄

#

though

#

how would you handle them

#

ah ok i know

boreal wadi
#

Hi guys, im a gold trader. I have so far written some code which gives me the following output for the example market structure.
The problem lies in how I further interpret my tuples of data in the sequenced output.
Specific combinations and sequences of mitigation outputs have different interpreted conditions for the current price action.
So yeah! if anyone can help point me in the right direction in terms of finding the right community or person to help me I would greatly appreciate it!

#

Here is another example output:

boreal wadi
#

thats hard part. working out what I want to code

#

if 3 "fully mitigates" 5 and closure status = "yes" then when 2 "fully mitigates" 4 thats an indication trend has been broken

#

do I guess someone needs to have experience with market structures already potentially

#

its a very unique and niche area I understand

modern gulch
#

I don't know what fully mitigates means in this context.

boreal wadi
#

purges

#

takes out

modern gulch
#

Takes out what?

boreal wadi
#

its reference poi

#

a poi is a high or a low

modern gulch
#

Is this some commodities terminology? I've never done commodities

boreal wadi
#

Its ICT terminology

#

fully mitigates is the act of something that has been made less severe or painful, or has been made less dangerous, harsh, or damaging

#

so when we fully mitigates previous highs or low in the market

#

its putting in damage for sellers but making damage less painful for buyers

#

in that context

#

but it can always be the opposite way around hence, "fully mitigates" in this context im using to refer to the purge of previous lows or highs

#

so like each poi will have previous points that it interacts with

#

and thats how im analysing it

modern gulch
#

I'm not sure you're going to find anyone who knows much about ICT or commodities trading here. If you can express it in terms of an algorithmic goal, maybe we could map it to something else?

boreal wadi
#

Right spot on

modern gulch
#

If you just want to label points based on their relationship to lag/lead values? That's straightforward

boreal wadi
#

In trading we interpret the market from a technical stand point.
So the process of analysis, if we repeat that and record the output enough times we get results and then can safely place trades
But its about being clever and interpreting it specifically but also having a high enough frequency for results

#

take the simple act of taking out (aka fully mitigating) the previous days high, it happens ALL the time.
Thats one confluence (or specific attribute about the current price)

#

then you combine it with another attribute, its an and statement.
We are taking out the previous days high and we are breaking a 5min trendline

#

for example

modern gulch
#

I don't need the background, I'm in the industry. I don't do much TA tho, not a big fan of trading off charts

#

Whats the algorithm you're trying to implement?

#

Like, if you want to find points that meet conditions X and Y?

boreal wadi
#

im very good at TA, suck at fundamental analysis.
I already got a little data manually recorded and its good profit (but not enough sample size) so I want to make a robot to do the technical analysis for me and give me results for hundreds of thousands of different trades for each specific trade type

boreal wadi
modern gulch
modern gulch
boreal wadi
#

Your right. I used it to write the first couple hundred lines of code but after that ive had to ask the ai specific lines of code and go step by step

modern gulch
#

Like, if you want to calculate an ichimoku cloud and identify points that intersect, this is fairly straightforward in pandas (or similar),

modern gulch
boreal wadi
#

I think in my own logical language and then work out how to make it into its python equivalent

boreal wadi
#

and its always counter intuitive

boreal wadi
modern gulch
#

You can skip the Ml stuff, the pandas part is really important

boreal wadi
#

checking it out now 😄

#

this is what I need, I was using the pdf manuals before

modern gulch
boreal wadi
#

the ones with like 3000 pages of stuff

boreal wadi
modern gulch
hoary flare
#

so, I guess this question is more kinda theoretical, but, let's imagine that we need to generate n hashes for an object. We use the pickle.dumps(obj) to convert our object to a byte array and we stitch the bits together so we get a long string of bits of length. Now we take the number of bits, m, and we have and divide it by the number of hashes that we want. Then, we just take n partitions of length m//n. Would this give a relatively uniform and independent set of hash values?

glossy nexus
#

Im using xarray to open a weather grib dataset, there is 2 levels in it, one is isobaricinhpa (vertical weather layers in millibar) and one is for tropopause layer, any solution? I need both and dont want to filter it, just tryna output a grib into a netcdf

modern gulch
hoary flare
modern gulch
hoary flare
#

cheers. good idea

#

I'll just have to wait for the curr. discussion to finish

fiery cosmos
#

x, y = y, x
print('hello ,world')

#

did i do that right

dusky creek
#

So I'm learning some DSA, and this resource seems pretty mathematical in its explanations

While I do understand it if I try slowly, do I really need to digest the math? Or should I find a simpler resource

shut breach
#

Yeah you should read it slowly. If you find something simpler you're just going to have a worse understanding of it.

#

It's worth noting that this isn't even the mathematical definition of big O, it's talking unrigorously about functions.

cobalt parcel
#

Anyone knows an optimized KMP algo where there's "two" prefix suffix table that starts with -1 ?

#

I saw it in my class but I cant find the source for it

grizzled rock
#

just remember in regards to Big O it is always testing WORST CASE so when something is referring to Big O we can assume it is talking about worst case scenario. So for an example if something was big O(n)^2 -> this example is essentially a loop within a loop and so it is O(n)^2 because no matter what the input is we will have to look over that value twice - i.e.,

for i in something:
   for j in something:
     print(f"something: {j}")```

regardless if its 2-1000 we still need to look over the list twice

If I completely went off the rails here and missread what you've typed and told you something you already know sorry for wasting your time; if not, hope this helped.
#

one shitty example but it's an example kek

lean walrus
#
for i in something:
   for j in something:
     print(f"something: {j}")
``` this code does not "look over that value twice"
it does that `len(something)+1` times, so you get `len(something)*(len(something) + 1) = O(N^2)` (where `N=len(something)`) operations
haughty mountain
#

big O is not generally "worst case"

#

it's an upper bound of something

#

you can upper bound the best case, or average case, or whatever

lean walrus
#

algorithm can be O(logN) in typical case and O(N) in the worst case (some kind of search tree, for example)

#

if you dont want to specify what you are upper-bound'ing, there are other letters for best case, average case, worst case

#

from wikipedia

grizzled rock
grizzled rock
# dusky creek So I'm learning some DSA, and [this](<https://goalkicker.com/AlgorithmsBook/>) r...

If f(x) is a sum of several terms, if there is one with largest growth rate, it can be kept, and all others omitted
If f(x) is a product of several factors, any constants (terms in the product that do not depend on x) can be omitted.
Example:

f(x) = 6x^4 − 2x^3 + 5

Using the 1st rule we can write it as, f(x) = 6x^4

Using the 2nd rule it will give us, O(x^4)

What is Worst Case?
Worst case analysis gives the maximum number of basic operations that have to be performed during execution of the algorithm. It assumes that the input is in the worst possible state and maximum work has to be done to put things right.

For example, for a sorting algorithm which aims to sort an array in ascending order, the worst case occurs when the input array is in descending order. In this case maximum number of basic operations (comparisons and assignments) have to be done to set the array in ascending order.

It depends on a lot of things like:

CPU (time) usage
memory usage
disk usage
network usage
What's the difference?
Big-O is often used to make statements about functions that measure the worst case behavior of an algorithm, but big-O notation doesn’t imply anything of the sort.

The important point here is we're talking in terms of growth, not number of operations. However, with algorithms we do talk about the number of operations relative to the input size.

Big-O is used for making statements about functions. The functions can measure time or space or cache misses or rabbits on an island or anything or nothing. Big-O notation doesn’t care.

In fact, when used for algorithms, big-O is almost never about time. It is about primitive operations.

When someone says that the time complexity of MergeSort is O(nlogn), they usually mean that the number of comparisons that MergeSort makes is O(nlogn). That in itself doesn’t tell us what the time complexity of any particular MergeSort might be because that would depend how much time it takes to make a comparison. In other words, the O(nlogn) refers to comparisons as the primitive operation.

The important point here is that when big-O is applied to algorithms, there is always an underlying model of computation. The claim that the time complexity of MergeSort is O(nlogn), is implicitly referencing an model of computation where a comparison takes constant time and everything else is free.

Example -

If we are sorting strings that are kk bytes long, we might take “read a byte” as a primitive operation that takes constant time with everything else being free.

In this model, MergeSort makes O(nlogn) string comparisons each of which makes O(k) byte comparisons, so the time complexity is O(k⋅nlogn). One common implementation of RadixSort will make k passes over the n strings with each pass reading one byte, and so has time complexity O(nk).``` this might help you, just found this 🙂
dusky creek
lean walrus
dusky creek
#

But I understand that its better to actually understand whats going on

stiff quartz
#

It’s just instead of having upper bound you can have a lower bound

#

Or a tight bound

dusky creek
#

Im using a book from my university's library which is doing a pretty good job of making me understand things so far though

stiff quartz
#

But you have to say what you bounding for (although most people bound worst case)

#

The big omega is quite common when you want to say « the solution is at least that bad »

haughty mountain
#

it's about what kind of bound of the function it is

#

O - upper (≤)
Ω - lower (≥)

#

o and ω for the strict variants which behave like < and >

#

the whole study of asymptotics boils down to looking at f(n)/g(n) and seeing what happens when n grows

does it tend to 0, a positive constant, or ∞?

spring crag
#

Hi, i'm using the global forecast system(gfs) grib data, the model run has 120 hourly steps first, then transitions to 3 hour steps until 384, What I want to do is to linear interpolate the missing values between the 3 hours,
so pseduocode

pygrib open(file)

get data of time 120, 123 for example

interpolate 121,122

im not the best at the interpolation part and it has been quite challenging especially:
outputting the interpolation file the same structure as original dataset, including atmospheric layers for each variable, xarray is quite disasterous when handling layers, can anyone help?

kind jolt
#

someone know how to solve this? is tobozzle lvl 14 (42 school)

lilac cradle
#

i wrote this RLE algorithm yesterday for fun, thoughts?

def byte_RLE(arr, flag=0xff):
    """
    Run Length Encoding, using a flag byte, a count byte, and a value byte. 
    Values without the flag are passed unchanged.
    """
    idx = 0
    while idx < len(arr):
        val = arr[idx]
        if val == flag:
            if idx + 2 >= len(arr): break
            count = arr[idx + 1]
            val = arr[idx + 2]
            for _ in range(count): yield val
            idx += 2
        else: yield val
        idx += 1

instead of having the count before each value it reserves a single value (by default 255) as a flag to denote that there should be multiple of a byte

haughty mountain
#

ff 01 ff?

lilac cradle
#

ye

#

kind of annoying but i mainly make it to see if icould

wraith ridge
# kind jolt

Hello there. I wont give the solution away but I will give you a few pointers. You will need to use recursion - and in such a way that when the arrow moves up by one in a pillar (towards the red square) you need to recall F2 every time you move by one blue square until you reach the red square and at that point you will start popping off the stack and the arrow will move back towards the bottom (by the same amount of squares that it moved up).
That's a bit of a mouthful I know, but hopefully it make a little sense.

stiff quartz
kind jolt
half plume
#

Hello everyone.
Who can explain about suffix array algorithm?

regal spoke
#

Or asking what a suffix array is?

half plume
#

No, I want to know about suffix array algorithm, not concept.

regal spoke
#

There are a couple of algorithms.

The brute O(n^2 log n) is simply

def suffix_array(S):
    return sorted(range(len(S) + 1), key = lambda i: S[i:])
#

The standard algorithm you'll find for it out there uses cyclic shifts, in O(n log n) https://cp-algorithms.com/string/suffix-array.html#on-log-n-approach
Sometimes you'll also see the same cyclic shift algorithm implemented using the built in sort, which makes it run in O(n log^2 n). That version is probably the simplest suffix array algorithm out there.

#

This ^ runs in O(n + |alphabet|), so in true linear time.

patent blade
#

Can you suggest an algorithm that completes the shape within a certain polygon in the picture? I want pixels outside the polygon to be ignored when completing the shape.

glossy nexus
#
import pygrib
import eccodes
import tempfile
import os

ds = pygrib.open('Tools/merged.grib')

with tempfile.NamedTemporaryFile(delete=False) as temp_file:
    temp_file_name = temp_file.name

    for msg in ds:
        with open('Tools/merged.grib', 'rb') as in_grib:
            gid = eccodes.codes_grib_new_from_file(in_grib)
            eccodes.codes_write(gid, temp_file)
            eccodes.codes_release(gid)

    h123 = ds.select(step=120)

    for msg in h123:
        values = msg.values
        new_values = values / 2

        gid = eccodes.codes_new_from_message(msg.tostring())

        eccodes.codes_set(gid, 'step', 121)

        eccodes.codes_set_values(gid, new_values.flatten())

        eccodes.codes_write(gid, temp_file)
        eccodes.codes_release(gid)

with open('Tools/merged_modified.grib', 'wb') as outfile:
    with open(temp_file_name, 'rb') as infile:
        while (gid := eccodes.codes_grib_new_from_file(infile)):
            eccodes.codes_write(gid, outfile)
            eccodes.codes_release(gid)

os.remove(temp_file_name)

ds.close()

in the output, theres a new time1 dimension that is replacing time when time is the dim that takes the new modified values, i have no idea what to do, searched all over internet/Ai and everything and asked but got 0 answers anywhere

restive harbor
#

Hey I am having a problem with binary representation of integers, idk if it's normal or not :
(My version is Python 3.10.9)

When I slice in half a string integer and look at its binary form, it doesn't correspond to the lowest binary part of the original number. I would like to know why and if there's a way to fix the binary form of the sliced part to match the original one ? Surprisingly, only a small part of the binary representation matches.


n = 121332837237832748484297480923048038023787476482465738446464478484842942949922442024044992400310322222323222
halfn = int(str(n)[len(str(n))//2:])

hlength = halfn.bit_length()

for i in range(1, hlength+1):
    if not (bin(halfn)[-i:] == bin(int(n))[-i:]):
        print(i)
        print(f"{i}/{hlength}")
        break
56
56/179
regal spoke
#

"halfn" in base 10 is not the same thing as "halfn" in base 2

#

12 in base 2 is 0b1100

restive harbor
regal spoke
#

Think about the case of n=12

restive harbor
#

I am wondering why after taking half of the string of the number and then convert it back to int, it doesn't have the same binary representation of the same part but in the original number

regal spoke