#data-science-and-ml


slow vigil
#

The program eventually grinds to a halt after the 16th file is created (16 core CPU)

#

One file being created per process

#

Or I guess I should say that something is causing each process to hang

#

I don't know exactly what it is

#

It must be that a race condition is being created in each process due to the multi-threading, though I did attempt to implement thread-safe writes

agile owl
#

are you multithreading within a spawned process

past meteor
#

It's not too late to read the manual and rewrite

agile owl
#

I never had issues doing the pattern you're describing

#

pooling out, writing and concatenating

past meteor
#

Don't treat it as if it were Pandas, it's one of the anti-patterns described in the documentation

desert oar
#

polars is a lot more like spark or sql than pandas

#

the python api is obviously heavily inspired by pyspark

past meteor
#

No secret I much prefer the syntax over Pandas. The indexing etc. of pandas makes it the most awkward DF library I've used across various programming languages 😅

#

It remains a bit hard to use because imo Polars only makes sense if there's no data viz or ML angle to the part of the project you're using it in. It shouldn't be used in all (sub)packages of your project. I made the mistake of writing some of my feature engineering in Polars, which automatically means I need to manage both Polars and pandas in my ML code.

velvet crescent
#

does anyone know of a good strategy to piece together several simulations composed of difference equations in python? Is there a library?

harsh minnow
#

Hey guys, I have been following a video on fine-tuning Mistral 8x7B. It works, but the training process is very slow: it's taking about 1.5 hours for 320 training steps. I am running it on 2x A30.

Code:
```python
import transformers
from datetime import datetime

# `model`, `tokenizer`, `tokenized_train_dataset` and `tokenized_val_dataset`
# are assumed to be defined earlier.
project = "final-finetune-2"
base_model_name = "mixtral"
run_name = base_model_name + "-" + project
output_dir = "./" + run_name

tokenizer.pad_token = tokenizer.eos_token

trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_train_dataset,
    eval_dataset=tokenized_val_dataset,
    args=transformers.TrainingArguments(
        output_dir=output_dir,
        warmup_steps=5,
        per_device_train_batch_size=2,
        gradient_checkpointing=True,
        gradient_accumulation_steps=4,
        max_steps=321,
        learning_rate=2.5e-5,
        logging_steps=25,
        fp16=True,
        optim="paged_adamw_8bit",
        logging_dir="./logs",         # Directory for storing logs
        save_strategy="steps",        # Save checkpoints on a step interval
        save_steps=10,                # Save a checkpoint every 10 steps
        evaluation_strategy="steps",  # Evaluate on a step interval
        eval_steps=10,                # Evaluate every 10 steps
        do_eval=True,                 # Run evaluation on the validation set
        report_to="wandb",            # Comment this out if you don't want to use Weights & Biases
        run_name=f"{run_name}-{datetime.now().strftime('%Y-%m-%d-%H-%M')}",  # Name of the W&B run (optional)
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()
```

woeful wren
#

i have this

#

those are all the x values for the maximums and minimums of these functions

#
fig    = plt.figure(figsize=(6, 6))
ax     = fig.add_subplot()

xx     = np.linspace(-5., 10., 1000)
# ------- Complete the code below -------
functiewaardes_extrema1 = []
vgl = opl.subs(a, 1/2)
vgl1 = vgl.subs(b, 2)
vgl2 = vgl.subs(b,4)
display(vgl1)
display(vgl2)

t = sp.lambdify(x,vgl1.rhs)(xx)
q = sp.lambdify(x,vgl2.rhs)(xx)


ax.plot(xx,t)
ax.plot(xx,q)

ax.scatter(extrema, vgl1.rhs.subs(x, extrema))

plt.xlim([0, 10])
plt.ylim([-60, 30])
plt.show()
#

I have this code and I'm trying to get the minimums to show up as dots on the functions using the scatter function, but idk how to do that

woeful wren
#

vgl1 and vgl2 are the functions

#

vgl is the function that still has a and b in it

#

'Union' object has no attribute 'as_base_exp' this is the error i get

woeful wren
#

i give up

outer tapir
#

does anyone have experience with web scraping, especially using Goose3?

feral kernel
#

Hi, can anyone look at the raw tensor values of a 256*256 image and tell the features and objects in the image without looking at the image or using any tools? And look at an image and convert it into tensors or sinusoids using only their own mind? It would probably take an insane amount of practice and training to do that?

stark bay
#

Yeah, I figured that will solve the problem... but what if I went into micro-scaling to make my model better? I'm talking about precision: going more and more micro to make my model as accurate as possible. Other than scaling, is there any other option? I tried moving averages on a noisy linear regression model and it helped, but not by a large percentage...

harsh minnow
toxic mortar
#
from sklearn import datasets

data = datasets.load_breast_cancer()

ulaz = data.data
izlaz = data.target

ep = 100
bs = 32
from sklearn.model_selection import train_test_split

ulaz_trening, ulaz_test, izlaz_trening, izlaz_test = train_test_split(ulaz, izlaz, shuffle=True, test_size=0.2,
                                                                      random_state=42)

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler().fit(ulaz_trening)
ulaz_trening_norm = scaler.transform(ulaz_trening)
ulaz_test_norm = scaler.transform(ulaz_test)

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers.legacy import Adam

n_in = ulaz_trening_norm.shape[1]
n_out = 1


def make_model(hp):
    model = Sequential()

    no_units = hp.Int('units', min_value=3, max_value=15, step=2)
    act = hp.Choice('activation', values=['sigmoid', 'relu', 'tanh'])
    model.add(Dense(units=no_units, input_dim=n_in, activation=act))

    model.add(Dense(n_out, activation='sigmoid'))

    lr = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    opt = Adam(learning_rate=lr)

    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])

    return model


import keras_tuner as kt
from keras.callbacks import EarlyStopping

stop_early = EarlyStopping(monitor='val_accuracy', patience=5)

tuner = kt.RandomSearch(make_model, objective='val_accuracy', overwrite=True, max_trials=10)

tuner.search(ulaz_trening_norm, izlaz_trening, epochs=ep, batch_size=bs, validation_data=(ulaz_test_norm, izlaz_test),
             callbacks=[stop_early], verbose=1)

#

Why do I get this error? I've seen this SO post, but I don't call it from the terminal; I run it from my PyCharm IDE ( https://stackoverflow.com/questions/55675199/tensorflow-python-framework-errors-impl-failedpreconditionerror-is-a-direct)

indigo wing
#

2 lineplots are looking kinda similar but they are showing moderate negative correlation of -0.6 why?

desert oar
#

even if the overall trends are both positive, it's possible that there are enough opposite movements to cause negative correlation. that, or you made a mistake in the code

#

if you did something like a moving average you would hopefully expect a positive correlation
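
A toy illustration of both points, using synthetic numbers rather than the NASDAQ/S&P data: two series share the same upward trend but move in opposite directions from step to step, so the raw correlation comes out negative, while a moving average removes the opposite movements and the correlation turns strongly positive. The `pearson` and `moving_average` helpers and the wiggle size of 50 are made up for this sketch.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def moving_average(xs, window):
    """Trailing moving average; the output is shorter by window - 1."""
    return [sum(xs[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(xs))]

# Same upward trend, opposite step-to-step movements (synthetic data).
trend = list(range(100))
wiggle = [50 if t % 2 == 0 else -50 for t in trend]
a = [t + w for t, w in zip(trend, wiggle)]
b = [t - w for t, w in zip(trend, wiggle)]

raw_corr = pearson(a, b)
smooth_corr = pearson(moving_average(a, 4), moving_average(b, 4))
print(f"raw: {raw_corr:.2f}, smoothed: {smooth_corr:.2f}")
```

This only shows that the effect is possible; it says nothing about whether it explains the -0.6 on the real data, where a bug or misaligned rows are just as plausible.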

desert oar
indigo wing
# desert oar put the numbers on a scatterplot
corr_USD = NAS_U['Close'].corr(S_AND_P['Close'])
print(corr_USD, "<---  :. Moderate negative linear relationship b/w NASDAQ_500 and S&P500\n=> As Close price of one increases, other decreases")

I used this code for the correlation, and

# Plot NAS_U
sns.lineplot(x='Date', y='Close', data=NAS_U, ax=axes[0, 0])
axes[0, 0].set_title('NASDAQ')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Closing Stock Price')

# Plot S&P500
sns.lineplot(x='Date', y='Close', data=S_AND_P, ax=axes[0, 1])
axes[0, 1].set_title('S&P500')
axes[0, 1].set_xlabel('Date')
axes[0, 1].set_ylabel('Closing Stock Price') 

for the plot

desert oar
#

Try the scatterplot

half mountain
#

Hi,
I am doing a Kaggle project. I was able to run and submit the file. Even though I didn't analyze results myself. The issue I am having is I want to show a viewer the accuracy of my model through visualizations. or metrics (Maybe MAE?). How do I do this? How would you recommend visualizing your models for business users?

indigo wing
mint palm
#

I have 2 encoders in my model.
I give each one input.
The outputs are A and B.
Both are embeddings.
If I call B.detach() and then calculate the loss, what will happen?

rugged zinc
#

Hi everyone,

can someone please recommend a well-curated and properly outlined roadmap for Data Analytics? As of last year, I'm 90% through the basics and fundamentals of Python programming, and I'd love to push forward into the data analysis part. I understand roadmaps are subjective, but a well-outlined one with touches of mathematical and statistical content would be very useful...

just like this one for Datascience and AI on roadmap.sh which points to the recommended materials...

I would also find it very useful if most of the materials are pointed to Coursera where i can easily apply for a F.A...

Thank you

final kiln
#

start with simple stuff, ask people and/or gpt when you don't know something, as you immerse yourself into the subject it will become clear to you what to learn next

rugged zinc
final kiln
#

yes, just come up with one, the rule is that it must be something you find to be cool, the fun of doing something you like gets you through the painful process of forcing your brain to make new synapses

#

I usually choose stuff that's challenging but technically possible, so like, I wouldn't choose training GPT4 because that's not within my budget, but I'd choose something I don't know how to do and then come up with a plan on how I'll build up the knowledge to get there

rugged zinc
fossil forge
#

hi

#

I'm trying to learn how to use MobileBERT and I found this code

#

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
preprocessor = hub.KerasLayer(
    "https://kaggle.com/models/tensorflow/bert/frameworks/TensorFlow2/variations/multi-cased-preprocess/versions/3")
encoder_inputs = preprocessor(text_input)
encoder = hub.KerasLayer(
    "https://www.kaggle.com/models/tensorflow/mobilebert/frameworks/TensorFlow2/variations/multi-cased-l-24-h-128-b-512-a-4-f-4-opt/versions/1",
    trainable=True)
outputs = encoder(encoder_inputs)
pooled_output = outputs["pooled_output"]      # [batch_size, 512].
sequence_output = outputs["sequence_output"]  # [batch_size, seq_length, 512].

#

embedding_model = tf.keras.Model(text_input, pooled_output)
sentences = tf.constant(["(your text here)"])
print(embedding_model(sentences))

#

i already downloaded the necessary modules but the error is this

#

ValueError: Exception encountered when calling layer "keras_layer" (type KerasLayer).

any fix on this?

agile owl
#

should I trust a model whose learning curve looks like this

slow vigil
#

Ok so after some testing today I found that I can't write out from a process directly when using multiprocessing before the process has finished. I'm not sure why, but the file will be created, the empty dataframe that I initialize as a variable at the start of each process will be written, but none of the data gathered during the process is written.

I removed threading from the program(though now it is too slow so I will have to add it back in) and by returning the dataframe from each process as the result in the process pool I am able to aggregate the data at the end and write that out to a csv successfully.

#

I've been looking into process Queues and I think I have to implement something like that to get the data out before the process has finished

past meteor
#

Are you writing Polars as if it were pandas? (Not using expressions)?

#

If you could show me some of your code (the one where you removed threading) I could instantly tell you what's up

serene scaffold
slow vigil
#

hmm ok gimme one second I'll show a minimal example

#
import concurrent.futures
import math
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import cpu_count

import polars as pl


def extract(chunk_ids):
    main_df = pl.DataFrame()
    num = chunk_ids[0]
    csv_name = f'data_{num}.csv'

    for chunk_id in chunk_ids:
        # Process data here and add to main_df.
        # If I try to write to csv from within here,
        # or even after this for loop, the process hangs.
        pass
    return main_df


def main():
    executor = ProcessPoolExecutor(max_workers=cores)
    tasks = [executor.submit(extract, chunk['column_name'])
             for chunk in id_chunks]

    done_tasks, _ = concurrent.futures.wait(tasks)

    results = [item.result() for item in done_tasks]

    final = pl.concat(results)
    final.write_csv('final.csv')


if __name__ == "__main__":

    cores = cpu_count()

    df = pl.read_csv("test.csv")
    total = df.height
    id_chunks = df.iter_slices(n_rows=math.floor(total/cores))

    main()
#

This allows me to write final.csv successfully, but I want to intermittently write out to csv from within each process in case any errors in the process cause it to hang then I lose all the data

past meteor
#

I'm just about 90 % sure you really really should remove the multiprocessing and you'd be in a wayyyyy better place

slow vigil
#

I can't remove it

past meteor
#

Why

#

Polars already does that, that's entirely the use case of the library 😩

slow vigil
#

Polars will utilize all my cpu cores?

past meteor
#

Yes 😦

slow vigil
#

well

#

Ok no, I can't

#

Part of the program is I/O bound

#

Part of it is CPU bound

past meteor
#

Yeah, I had the same thing but since Polars uses dramatically less memory than Python I just loaded all of it in at once

#

And then it was just 100 % CPU bound

slow vigil
#

Mine requires making some web requests

#

so it will always be I/O bound

past meteor
#

Can't you make all those requests before doing anything else

slow vigil
#

Nope

#

There's almost a million web requests

past meteor
#

Any reason why you can't have 1 job that does the requests, writes them to say 1 big parquet without doing any processing and then a second that then reads the data using Polars

slow vigil
#

wouldn't that run into the same issue?

past meteor
#

My data pipeline also does millions of requests and that's how I structured it

slow vigil
#

Hmmm

past meteor
#

No it wouldn't because then you wouldn't have to "infect" the polars stuff with multiprocessing

slow vigil
#

lol

past meteor
#

For the requests if you're using non-blocking client you could also just use async
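
A minimal sketch of what the async route could look like, with a cap on how many requests are in flight at once. `fetch` is a hypothetical stand-in that sleeps instead of calling a real non-blocking HTTP client, and the limit of 10 is an arbitrary assumption; both would need to be replaced with whatever the real job uses.

```python
import asyncio

async def fetch(item_id: int) -> dict:
    """Hypothetical stand-in for a real non-blocking HTTP request."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"id": item_id, "payload": f"raw-{item_id}"}

async def fetch_all(item_ids, max_concurrency: int = 10) -> list:
    """Fetch every id while keeping at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(item_id):
        async with semaphore:
            return await fetch(item_id)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded_fetch(i) for i in item_ids))

results = asyncio.run(fetch_all(range(50)))
print(len(results), results[0])
```

The semaphore is what keeps the request rate bounded without threads; lowering `max_concurrency` is one simple lever if the server starts rate-limiting.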

slow vigil
#

The way it was set up was that each of the 16 processes had 10 worker threads that were making the requests and as the requests finished it would take them one at a time and process the data and append to the "main_df" for that process

#

Or I guess that's not exactly right

#

There were a max of 10 tasks allowed

#

per process at any given time

past meteor
#

Splitting is easier, the data pipeline I wrote basically does this, the first stage polls an API and just dumps the raw data into a data lake (JSON), the second stage runs every couple of hours and transforms JSON into SQL and writes it into a DB, the third stage is where I'm using polars to read hundreds of millions of rows from the DB and do a bunch of aggregations and then write to a diff DB

slow vigil
#

I think I was getting timed-out for requesting too much too fast

#

But that was right when I started, so maybe I was doing something else wrong and thought I was getting timed out

past meteor
#

Yeah, you can make your life so much easier by doing it in steps and focusing on one problem at a time 😄 Maybe you are getting rate-limited or so. In that case a basic cron job can save you by doing it in steps and writing somewhere else.

slow vigil
#

Yeah I see what you're saying

past meteor
#

Aside from that, having the raw data at hand is very important. I'd almost never recommend going straight from source => processed

#

A common task I do is verifying my processed data by comparing it to the raw data that came from the APIs etc.

#

Basically, if you set it up like this you only have the processed data and not the raw ones

slow vigil
#

Right. This is a pretty specific use case. I'm writing a program for someone else and they just want to run it and get a CSV from it so they won't be doing anything with the raw data really, but I do see what you mean

past meteor
#

Well, you can still use it to check if your program is correct

slow vigil
#

Oh, it's correct 😉 lol

#

I will see if I can do it that way though. Make all the requests, add all the raw data to a pool, then process it all afterward

#

That decouples the CPU bound part from the I/O bound part, so really the requests will likely finish even faster
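
A rough sketch of that split, assuming the raw responses can be stored as one JSON object per line. `fetch`, `download_stage`, `process_stage` and the file name are all hypothetical; `fetch` fakes a response instead of making a web request, and the "processing" is just a sum standing in for the real CPU-bound work.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

def fetch(item_id: int) -> dict:
    """Hypothetical stand-in for one web request (the I/O-bound part)."""
    return {"id": item_id, "value": item_id * 2}

def download_stage(item_ids, raw_path: Path, workers: int = 10) -> None:
    """Stage 1: make the requests and dump raw responses, one JSON object per line."""
    with ThreadPoolExecutor(max_workers=workers) as pool, raw_path.open("a") as out:
        futures = [pool.submit(fetch, i) for i in item_ids]
        # Only this thread writes, so no locking is needed around the file.
        for future in as_completed(futures):
            out.write(json.dumps(future.result()) + "\n")
            out.flush()  # finished requests survive a later crash

def process_stage(raw_path: Path) -> int:
    """Stage 2: CPU-bound work on the saved raw data (here just a sum)."""
    with raw_path.open() as f:
        return sum(json.loads(line)["value"] for line in f)

raw_file = Path(tempfile.mkdtemp()) / "raw.jsonl"
download_stage(range(100), raw_file)
total = process_stage(raw_file)
print(total)
```

Because stage 1 appends and flushes as each request finishes, a hang or crash loses at most the requests still in flight. In the real pipeline, stage 2 could load the file with Polars (for example via `pl.read_ndjson`) and let Polars handle the parallelism.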

stark bay
#

Has anyone worked with this kind of library, or know which library can do this visualization and has good documentation?

surreal canyon
#

Made this "AI agent" or whatever you wanna call it myself from scratch in python

#

And running the model locally

#

Rn it just has access to powershell commands, but it is able to chain them together if it wants to

#

Smort

#

Bruh

#

This thing is actually like smart

#

Like really smart

#

Smarter than me probably, I don't think I'd be able to pull that off if someone asked me to do it

#

I'm not really a powershell wizard

potent sky
#

nice

mint palm
#

I do not understand all the use cases of detach(), no_grad, requires_grad, etc.
On the surface I know detach prevents gradient computation by excluding that tensor, but I don't understand their behaviour in cases like:
there are two backbones and we only want to update one
How do gradients accumulate with requires_grad = True?
Why should I not set requires_grad = True in the last loss.backward()?

odd meteor
# mint palm I do not understand all use cases of ``detach()`` and ``no_grad``, ``required_gr...

detach() is a tensor method and torch.no_grad() is a context manager in PyTorch, while requires_grad=True is an argument you pass when creating a tensor (and an attribute of every tensor).

Any tensor that has requires_grad=True can have gradients computed with respect to it. We can perform backpropagation through it because, for any computation that involves that tensor, PyTorch builds a dynamic computational graph of the operations in the background (you can roughly think of this computational graph as a footprint. Just like how humans can trace back their family origin with DNA, we can use the computational graph to trace a result back to the tensors it came from, so long as those tensors require gradients)

Now, in some situations, graph building becomes a burden: every computation involving a tensor with requires_grad=True is recorded, so the more operations you chain, the deeper and more complex the computational graph becomes, and the more memory it holds on to.

There are some situations where we legit wouldn’t want creation of computational graph because it's redundant in that situation; one of such cases is during inference time. So to avoid that situation, to speed up things, to save memory, we can then decide to inform PyTorch that we don't want it to stress itself in building any computational graph 'cos we don't need it.

Now, you could instruct PyTorch to do that using either detach() or no_grad()

tensor.detach() detaches the output from the computational graph. So no gradient will be backpropagated along this variable.

On the other hand, torch.no_grad() temporarily disables gradient tracking: every result computed inside the block has requires_grad=False, so no operation inside it builds the graph.

The difference is that one refers to only a given variable on which it is called. The other affects all operations taking place within the context manager.

Also, torch.no_grad() uses less memory because it knows from onset that no gradients are needed so it doesn’t need to keep intermediary results.

https://discuss.pytorch.org/t/detach-no-grad-and-requires-grad/16915/6
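
A small sketch of the difference, including a two-branch case similar to the "two backbones, update only one" question. The tensor `w` and the multipliers are made up for illustration; the two products stand in for the two branches.

```python
import torch

w = torch.tensor([2.0, 3.0], requires_grad=True)

# Normal tracked computation: gradients flow back to w.
y = (w * w).sum()
y.backward()
first_grad = w.grad.clone()  # d/dw sum(w^2) = 2w

# detach(): same values, but cut off from the graph.
d = (w * w).detach()
print(d.requires_grad)

# no_grad(): nothing computed inside the block is tracked.
with torch.no_grad():
    z = w * w
print(z.requires_grad)

# Two-branch case: only the non-detached branch sends gradients back.
w.grad = None  # clear the gradient stored by the first backward()
a = w * 3.0             # tracked branch
b = (w * 5.0).detach()  # detached branch, treated as a constant
loss = (a + b).sum()
loss.backward()
print(w.grad)  # only the factor 3 from the tracked branch shows up
```

Note the `w.grad = None` line: without it, the second `backward()` would add to the gradient left over from the first one, which is the accumulation behaviour that `optimizer.zero_grad()` normally takes care of in a training loop.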

odd meteor
# mint palm I do not understand all use cases of ``detach()`` and ``no_grad``, ``required_gr...

Calling loss.backward() performs backpropagation in PyTorch.

You can't backpropagate to a tensor that doesn't track gradients (i.e. a tensor with requires_grad=False).

And if you can't perform backpropagation, you also won't have gradients to update your model parameters with when the optimizer takes its gradient descent step.

So, you see why we need requires_grad=True on the tensors we want to train in PyTorch (model parameters have it by default). The situation where graph building becomes a pain in the ass is when we've finished training our model and we now want to make predictions.

In this situation we have to inform PyTorch that it shouldn't build a computational graph, since we simply want to make predictions.

The deeper and more complex the branches of a computational graph, the more time and memory it takes, so that's why we straight up turn it off at inference / prediction time.

mint palm
# odd meteor `detach()` and `no_grad()` are methods in PyTorch while `require_grad=True` is a...

Thanks for the detailed response.
So, for the case where:

  1. I have two modalities: text and vision
  2. I only want to backpropagate through the vision backbone and keep the text backbone fixed.
    Can I put detach on the output of the text backbone? I actually want to conduct a projected gradient descent attack on images, so I only want to modify the images.

One more question: when we do newtensor = oldtensor.clone() or newtensor = Variable(oldtensor.data), and use detach() on newtensor, does the gradient of oldtensor still update if oldtensor is being used elsewhere too?
I mean, if oldtensor was used to get embedding A, and newtensor was cloned from oldtensor and detached, and then newtensor was used to get embedding B, and we call loss.backward(), oldtensor will still be updated, right? And oldtensor will be updated independently of newtensor.
And even though newtensor is detached, it will change because oldtensor is changing due to its own pipeline.

grizzled locust
#

Good evening guys. So I tried to install ydata-profiling in Colab but it resulted in an error. Can anyone tell me what I did wrong?

jolly horizon
#

Hey guys, does anyone have experience with Triangulation of 3d computer vision and linear algebra ? Please please help me..

steel hull
#

Guys I got done with ML specialization by Andrew NG and Mathematics for Machine Learning Specialization by deeplearning.ai on coursera as well. What should I do next?

I went on reddit to search and I was confused by the various suggestions, ranging from CS229 on youtube for a deep dive into the course, the fast.ai DL course, ISL and ESL, cs231n by stanford, or the Deep Learning specialization by deeplearning.ai. I would like to stick to a learning path which makes sure that I have no learning gaps and makes me job ready

jolly horizon
#

Kinda typical of a big community . More people w questions than ppl who can answer

desert oar
desert oar
steel hull
desert oar
#

in that case it might be a good idea to pause here and start a personal project to reinforce the things you already learned. doing projects is a great way to learn and get hands-on experience that will be essential when you eventually want to find a job in this field.

#

it's likely that in the course of any project you'll end up having to learn various things anyway along the way

#

but yeah, once you've done a project, i think any of those resources you named are good options. there isn't a single learning track to follow. DS/ML is probably one of the most open-ended fields in that respect.

#

personally i think it will serve you very well to go back and learn some statistics fundamentals, but that material tends to be a little less exciting than deep learning.

#

so I'd suggest pausing to work on a project, then just pick any of those deep learning courses and do one of them

steel hull
#

cs229? DL spec? or anything else

desert oar
past meteor
#

Like spamming because this is all such good advice

#

I read d2ai and it's great 👍

desert oar
#

isn't cs229 just more of Ng's stuff?

desert oar
#

i would go for something a little more advanced given that you already covered the intro stuff

steel hull
desert oar
#

i see. that could be interesting, but if it's just a more rigorous redo of the course you already took, I'd say it's better to go elsewhere. Plus I think it's valuable to learn from different instructors with different styles

steel hull
desert oar
desert oar
#

But yes there are plenty of project ideas and datasets on Kaggle and elsewhere

steel hull
#

Also do you have any experience with either ISL or ESL?

desert oar
#

the books?

steel hull
desert oar
#

i have a physical copy of ESL that i browsed through in grad school, it's probably less valuable than it used to be, but it's still kind of interesting as a menagerie of various things people have used for model fitting over the years. i'm sure newer editions are somewhat more relevant. probably the only really valuable chapter is the one that describes gradient boosting, since people still use that.

steel hull
#

oh ok

desert oar
#

i haven't spent much time with ISL, but it's very popular and might be worth your time. the "statistical" part is valuable for your learning and might be a good entry point into statistics more generally.

steel hull
#

when people say that coursera spec is watered down version of cs229, I really felt that it is true

#

In addition to that so many resources did leave me confused as hell

desert oar
#

that's valid. i'm just concerned about you spending weeks or months re-learning material you've already covered

#

it can't hurt, it's just a question of whether it will be hard to force yourself through it

#

frankly i don't love Ng's teaching style

steel hull
#

I mean people also recommend Aurélien Géron's book, and see? so many resources 😅 but there's never enough time

#

I am not too inclined towards cs229, I just want to pick one sequence of learning and then stick entirely to it

steel hull
past meteor
#

All of the ones that have been listed here are nice so any of them will do

steel hull
#

any resource which is the most hands-on out of all?

#

I mean which would get me ready to be able to take on any of the projects from kaggle and build portfolio?

past meteor
#

Kaggle itself has resources you can use to learn

#

Personally I'm a fan of contrast, take a book or resource like the ones listed and then take that to apply it on Kaggle. On their platform you can take their courses afterwards

steel hull
#

ok gotcha

#

I will just get started then

#

along with ISL as suggested by @desert oar

past meteor
#

ISL is my favourite as well

steel hull
spark nimbus
#

How do I deal with pandas-on-spark giving an error on DirectByteBuffer.<init>? I tried setting the properties mentioned on JIRA but it didn't seem to do anything

drowsy jacinth
#

how do i declare a variable?

pale hemlock
#

@drowsy jacinth depends on the language

drowsy jacinth
#

prolly python idk

pale hemlock
#

In Python you just assign it: variable_name = value (there's no Def keyword for variables)

drowsy jacinth
#

do you know?

drowsy jacinth
pale hemlock
#

your welcome

#

you,re

desert oar
desert oar
drowsy jacinth
#

boo hoo

regal vault
#

I coded up a k-clustering algorithm

#

after the user draws an image on a grid, the algorithm tries to identify clusters/individual shapes

#

I want to show the algorithm "learning" by drawing a matplotlib 2D grid and updating it

#

how would I update the grid without having to close the current window to open the next one?

#

i have it in a display function

#


    
```python
import matplotlib.pyplot as plt


# Signature assumed: the original `def` line was cut off in the paste.
def display(grid, crosshair_coordinates=None, fig=None, ax=None):
    GRID_SIZE = 1
    COLOR_MAP = {0: 'white', 1: 'red', 2: 'green', 3: 'blue'}

    # Create the figure and axis if the caller did not pass them in
    if fig is None or ax is None:
        fig, ax = plt.subplots()

    ax.set_aspect('equal', adjustable='box')

    # Plot the grid with colors
    for x in range(len(grid[0])):
        for y in range(len(grid)):
            color = COLOR_MAP[grid[y][x]]
            ax.add_patch(plt.Rectangle((x, y), 1, 1, fill=True, color=color))

            # Draw crosshairs inside the squares
            if crosshair_coordinates and (x, y) in crosshair_coordinates:
                ax.plot(x + 0.5, y + 0.5, marker='x', color='yellow', markersize=10)

    # Set axis limits and labels
    ax.set_xlim(0, len(grid[0]))
    ax.set_ylim(0, len(grid))
    ax.set_xticks(range(len(grid[0])))
    ax.set_yticks(range(len(grid)))
    ax.set_xticklabels([])
    ax.set_yticklabels([])

    # Show the plot
    plt.show()
```
agile owl
#

One of the problems I'm running into using model outputs as features for my RL algorithms is that I think I should be in theory retraining all the algorithms at each step but that's way too costly so I just do the best I can and fit the model for the training periods and leave test out-of-sample but I'm not sure how else to do it besides just not using model-based features

#

wonder if anyone else has run into a similar issue and what they did about it

agile owl
#

any reccs for an online clustering algorithm? Not sure you can do anything with HMMLearn

past meteor
#

let me let you in on a secret: if a sklearn estimator has the partial_fit method it can be used in an online fashion (edit: corrected)

agile owl
#

you mean online

#

yeah I saw that

#

never used it before with sklearn tbh

past meteor
#

tip: be sure to update your preprocessing as well

#

You can call partial_fit on StandardScaler and so on as well, and ime it makes a massive difference if your signal is drifting
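
A short sketch of that pattern with scikit-learn, updating the scaler and the clusterer together as batches arrive. The data is synthetic (two blobs with a slow artificial drift) and the choice of `MiniBatchKMeans` with two clusters is only an assumption for the example; any estimator exposing `partial_fit` follows the same loop.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
scaler = StandardScaler()
model = MiniBatchKMeans(n_clusters=2, random_state=0, n_init=3)

# Simulate a stream: each batch updates the scaler first, then the model.
for step in range(20):
    drift = 0.1 * step  # synthetic slow drift in the signal
    batch = np.vstack([
        rng.normal(loc=0.0 + drift, scale=0.5, size=(25, 2)),
        rng.normal(loc=5.0 + drift, scale=0.5, size=(25, 2)),
    ])
    scaler.partial_fit(batch)                    # running mean/variance update
    model.partial_fit(scaler.transform(batch))   # incremental centroid update

new_points = np.array([[0.5, 0.5], [6.0, 6.0]])
labels = model.predict(scaler.transform(new_points))
print(labels)
```

Whether the scaler should see a batch before or after that batch is transformed is a design choice; updating it first, as here, is just one option.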

agile owl
#

I guess this is the problem in going from hmm to kmeans though

#

CluStream is adapted for online usage

#

the first figure is what kmeans does on time series data, the second is hmm

#

if you believe things happen in markov sequences you probably want to use an hmm

#

and I've been struggling to find an online implementation in a library, I see a lot of people's projects but I don't really want to be spending my time on that

past meteor
#

HMM as in hidden markov model?

agile owl
#

yeah

desert oar
agile owl
#

right now I'm using hmm learn and cheating on the training data with the clustering part

desert oar
#

i wouldn't use k-means for just about anything nowadays except very quick and dirty EDA

agile owl
#

but I want to fix that leak

#

by using an online algo

#

hard to find online hmm though

past meteor
#

My go-to clustering method, if I'd cluster, is DBSCAN

desert oar
#

i don't think that works online either

past meteor
#

It doesn't

#

But I think clustering is something you learn in uni together with association rules and has way less use cases than touted

desert oar
#

i actually don't know any online clustering algorithms. seems like an odd gap in my knowledge now that i think about it.

#

yeah, it's mostly an EDA and reporting technique

past meteor
#

You can make many of them online quite easily

iron portal
#

hi guys, my pytorch can't load cublas64_11.dll, could you help me?

desert oar
# past meteor You can make many of them online quite easily

can you? i suppose you can with hierarchical clustering, incrementally building up the distance matrix. i suppose you can online-ify dbscan that way too. and maybe hdbscan depending on the details of the graph pruning algo it uses (i don't know them)

agile owl
#

I'm using it for dimensionality reduction

#

I don't want my observation size for the reinforcement learner to be too big

#

so I'm summarizing some data before I put it into the feature extractor using clustering

past meteor
#

The ones based on the EM algorithm (k-means, GMMs, ...) can trivially be made online
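
One way to read "trivially online" for k-means is the sequential update: assign each incoming point to its nearest centroid, then move that centroid toward the point by a running-mean step. The class below is a hand-rolled sketch of that idea, not a library API; `OnlineKMeans`, `learn_one` and `predict_one` are names invented here, and seeding the centroids with the first k points is a simplifying assumption.

```python
import random

class OnlineKMeans:
    """Sequential k-means: one centroid update per incoming sample."""

    def __init__(self, k: int):
        self.k = k
        self.centroids = []  # list of coordinate lists
        self.counts = []     # samples assigned to each centroid so far

    def predict_one(self, x):
        """Index of the nearest centroid by squared Euclidean distance."""
        dists = [sum((xi - c) ** 2 for xi, c in zip(x, cen))
                 for cen in self.centroids]
        return dists.index(min(dists))

    def learn_one(self, x):
        """Update the model with a single point and return its cluster index."""
        if len(self.centroids) < self.k:
            # Use the first k points as initial centroids (simple assumption).
            self.centroids.append(list(x))
            self.counts.append(1)
            return len(self.centroids) - 1
        j = self.predict_one(x)
        self.counts[j] += 1
        lr = 1.0 / self.counts[j]  # running-mean step size
        self.centroids[j] = [c + lr * (xi - c)
                             for c, xi in zip(self.centroids[j], x)]
        return j

random.seed(0)
stream = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(200)]
stream += [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(200)]
random.shuffle(stream)

model = OnlineKMeans(k=2)
for point in stream:
    model.learn_one(point)
print(model.centroids)
```

The `1 / count` step makes each centroid the mean of everything assigned to it so far; a constant step size would instead forget old points, which may suit a drifting signal better. The number of clusters still has to be fixed up front.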

desert oar
#

anything that requires you to choose a # of clusters in advance is a non-starter for me unless it's for EDA or i have strong domain knowledge

#

as we see in the output above, k-means isn't smart enough to handle that in online use

past meteor
#

For other ones, it depends. We covered this in uni I'd have to check my notes I made for hierarchical clustering etc.

desert oar
#

you covered online clustering? nice

#

i definitely never did

past meteor
#

Maybe not explicitly but we had to cluster by hand with pen and paper, doing that shows you which is and isn't online as a side-effect

desert oar
#

hah that's a great assignment

agile owl
#

also there was some paper from NYU a few years ago about combining RL with HMM for financial time series

past meteor
agile owl
#

that's why I chose to use an HMM

#

yes

#

i was inspired by remembering that NYU paper though about people applying RL with HMM as input

past meteor
#

A bespoke dimensionality reduction method that always works is some sort of autoencoder

desert oar
#

would you fine tune it as data comes in?

past meteor
#

Maybe that's not the smartest thing to do but it's what I came up with off the top of my head. I'd have to read papers.

The basic idea is that you just have an autoregressive encoder and decoder. At inference time you just take the encoder and roll with that.

#

Maybe I'd store the last N data points and finetune it in some batch job so the thing doing inferencing doesn't need to hold both the encoder and decoder but that's an implementation detail 😄

desert oar
#

by "autoregressive encoder" do you mean like an RNN? or do you mean that the input is something like "the previous output" and "the current input" ?

#

i'm actually curious @agile owl how you were planning on using the clustering for dimension reduction

#

online PCA is a thing for example

past meteor
#

Maybe taking the lags and using a standard dimensionality reduction method could also work. It depends on the downstream task.

#

I'd actually start doing that and using PCA or similar. Occam's razor and all.

desert oar
#

all that said, this river library looks interesting and i'm going to bookmark it. pretty much everything i do now at work is "online" to some extent and i have very little to go on, i'm making a lot of it up as i go

#

Why doesn't river do any input validation?
wat

arctic wedgeBOT
#

river/drift/kswin.py lines 88 to 96

```py
super().__init__()
if alpha < 0 or alpha > 1:
    raise ValueError("Alpha must be between 0 and 1.")

if window_size < 0:
    raise ValueError("window_size must be greater than 0.")

if window_size < stat_size:
    raise ValueError("stat_size must be smaller than window_size.")
```
desert oar
#

the input validation in sklearn isn't exactly intrusive or excessive... it's pretty reasonable to expect a library to check array shapes and dtypes for you

#

otherwise you end up with impossible to debug 50-line tracebacks

#

sincerely, someone who survived using pandas before v1.0

past meteor
#

My thesis was on online-ML, specifically concept drift, and it was done fully with sklearn. partial_fit is definitely your friend here.

#

You can still use the common Pipeline interface and pass in fit_params to specify which parts need to use partial_fit, basically StandardScaler and the model
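
A minimal sketch of incremental fitting with scikit-learn, assuming synthetic data and a batch size of 50 chosen only for illustration. It calls `partial_fit` on the scaler and the model directly for each mini-batch rather than going through `Pipeline`, so it shows the idea and is not the thesis setup described above.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stream (made up for illustration): 1000 rows, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)

scaler = StandardScaler()
model = SGDRegressor(random_state=0)

for start in range(0, len(X), 50):
    X_batch, y_batch = X[start:start + 50], y[start:start + 50]
    scaler.partial_fit(X_batch)                            # update running mean and variance
    model.partial_fit(scaler.transform(X_batch), y_batch)  # one SGD pass over this batch

print(scaler.n_samples_seen_, model.coef_)
```

Note that early batches are scaled with statistics estimated from very little data, which is one of the things that makes online pipelines noisier than batch ones.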

agile owl
#

thanks for the input I'm eating right now gonna decide what to do after this meatball parm

desert oar
#

true. i've used scikit-learn a lot but never the partial fitting stuff

agile owl
#

same

agile owl
#

so i was using HMM states, which I consider to be a form of clustering

#

so each observation frame consists of a vector of information at that time step, including the hmm latent state prediction (i.e., the cluster) where that state prediction gives conditional information about the variability in the time series

#

e.g., an hmm conditions on a length 5 window of rolling differences of the time series value and conditional variance estimates across the time axis

#

it summarizes that information that would be a 5x(2n) matrix into a scalar

#

so my observation for the RL learner just takes that scalar instead of having another 5x(2n) elements in it

#

also I'm not sure the Mlp policy would actually do a good job of being able to learn that type of information itself

agile owl
magic dune
#

Just wondering, why would someone use an algorithm such as NEAT over other machine learning algorithms? What situation would only apply to an algorithm such as NEAT, but no other machine learning algorithm? (I want to use NEAT for something but want to know its advantages.)

#

????

small wedge
small wedge
#

sorry I think i misread your question

agile owl
#
```
1. Implement online learning for temporal clustering regime state algorithm
2. Implement online learning for variance model
3. Embed 1 + 2 as components of Reinforcement Learning environment to be updated on each timestep
4. Create type representing the model configuration to reduce overall args being passed around
5. Create declarative config for main script run that will iterate over configurations for experiments and save experiment results
6. Create (co)variance-weighting mechanism to create composite strategy and analytical tools for model combinations
```

laziness is powerful
small wedge
#

like avoiding another hyperparameter optimization algorithm, or like avoiding fancy optimizers, etc?

#

NEAT just tunes your model's topology, there's no reason to not use it along with other optimization methods unless you have a competing topological optimization algorithm you want to try.

agile owl
#

I haven't used it but my understanding is it's not just the topology it's solving for but also the weights

#

EA is an optimization strategy

#

it's as opposed to something like Adam

desert oar
honest charm
#

Hey guys, I am making a face emotion detection system for my final year project. Does anyone have any ideas about similar systems?

agile owl
#

I agree wasn't the best terminology

#

I need to figure out how to use light gbm online in a reasonable way too

#

I kind of want to use an online version of HMM for the reasons discussed but I'm having a hard time finding one and I don't trust myself to implement it from scratch

lapis sequoia
#

Hi everyone, I'm learning AI as my specialization. I want to ask if I have to learn competitive programming too to be good in my field?

orchid lintel
#

So, I took the plunge and tried Polars instead of Dask to do a Big Giant Sparse GroupBy (basically doing a GroupBy on 10k or so sparse columns with like 1m rows, and maxing them), and it seemed to do it in 2 minutes? (vs Dask taking like 90 minutes for half the rows, and the original Pandas code taking like 4 hours)
Which is awesome but...why would that happen, I wonder? Like why would it be so much faster than Dask?

agile owl
#

I also realized that by converting my feature models into online versions and updating them along with the reinforcement learner policy that they are actually embedded in the learning environment rather than the model itself when it gets saved so what I need to do is implement saving them separately from the reinforcement model on the filesystem and make my gym env accept loaded model objects from kwargs and branch the logic on whether it instantiates a new one based on those kwargs
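
A rough sketch of that plan, under stated assumptions: `RunningVariance` is a hypothetical stand-in for an online feature model, `ToyEnv` is a hypothetical stand-in for the gym environment, and the state is written as JSON only so the example stays self-contained (a real model object would more likely be pickled).

```python
import json
import os
import tempfile


class RunningVariance:
    """Stand-in for an online feature model (hypothetical; Welford's algorithm)."""

    def __init__(self, n=0, mean=0.0, m2=0.0):
        self.n, self.mean, self.m2 = n, mean, m2

    def learn_one(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def state(self):
        return {"n": self.n, "mean": self.mean, "m2": self.m2}


class ToyEnv:
    """Stand-in for the environment (hypothetical)."""

    def __init__(self, variance_model=None):
        # Reuse a loaded model when one is passed in; otherwise start fresh.
        self.variance_model = variance_model if variance_model is not None else RunningVariance()

    def step(self, x):
        self.variance_model.learn_one(x)


env = ToyEnv()
for x in [1.0, 2.0, 3.0]:
    env.step(x)

# Save the feature model separately from the RL policy.
path = os.path.join(tempfile.mkdtemp(), "variance_model.json")
with open(path, "w") as f:
    json.dump(env.variance_model.state(), f)

# Later: rebuild the environment around the saved feature model.
with open(path) as f:
    resumed = ToyEnv(variance_model=RunningVariance(**json.load(f)))

print(resumed.variance_model.state())
```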

#

everything gets so much... messier when you start to do online learning

scenic parcel
#

What is the best sentiment analysis library

#

I think I'm going to try to use BERT

agile owl
#

I'm really enjoying the river API so far

#

very natural to use it while iterating over my sequence for reinforcement learning

agile owl
#

I love river already it's great

past meteor
#

I think you could further simplify stuff by using an RNN instead of a vanilla MLP

agile owl
#

the SB3 implementation of SAC doesn't support a recurrent layer

#

you can get it for PPO though

past meteor
#

SAC being soft actor critic?

agile owl
#

yeah

#

I am already stacking the observations with a buffer

#

I read that in most applications that gives the maximum benefit you'd see from using a recurrent network anyway

#

for RL applications I mean
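
For reference, a minimal sketch of stacking observations with a buffer, assuming fixed-size frames and zero padding at the start of an episode. `ObservationStack` is a hypothetical name, not an SB3 class.

```python
from collections import deque


class ObservationStack:
    """Keeps the last n_frames frames and exposes them as one flat observation."""

    def __init__(self, n_frames, frame_size):
        # Pre-fill with zeros so the observation has a fixed length from step one.
        self.frames = deque([[0.0] * frame_size for _ in range(n_frames)], maxlen=n_frames)

    def push(self, frame):
        self.frames.append(list(frame))  # the oldest frame drops out automatically
        return [value for f in self.frames for value in f]


stack = ObservationStack(n_frames=3, frame_size=2)
stack.push([1.0, 0.1])
stack.push([2.0, 0.2])
obs = stack.push([3.0, 0.3])
obs_next = stack.push([4.0, 0.4])
print(obs)
print(obs_next)
```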

past meteor
#

I'm too rusty on RL but representation learning was always a big part of it

#

Which you'd get by using an RNN or similar

agile owl
#

the LSTM states you mean

#

that would be a possible way to go with PPO

#

if I felt like implementing recurrent SAC I could do that too

#

I already have a need for a specially crafted online model as an input anyway

#

it's a regressor for the time series variance

#

that can't be left to the policy network, it's "intelligently designed"

#

I'm using CluStream for the latent state model and the AMF regressor for the variance regressor

#

both of them the river implementation

#

still doing testing, the training time has basically doubled because I did the variance model and latent state using batch methods and leaked information across the training dataset before

#

that's the main impetus to switch to online learning

#

because information was leaked over the training period although it didn't affect the test results it presumably led to suboptimal out-of-sample performance

past meteor
#

Let me take abstraction from the details for a sec. All RL has a similar problem. You move from tabular RL to using function approximation which allows you to "group" related states. You're no longer updating a single Q[S,A] or V[S] entry.

Defining relatedness is the key question now. You can do it manually with feature construction, which is what you're arguably doing, and then pass it on to a function approximator. The second thing you could do is pick an MLP with the right inductive biases and it'll sort this out on its own.
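
A toy sketch of that difference, assuming a made-up two-feature representation and a learning rate of 0.5. With a table, an update touches one entry; with a linear approximator, states that share features move together.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (arbitrary for the example)

# Tabular: each state has its own entry, so an update touches only that state.
V_table = defaultdict(float)


def tabular_update(state, target):
    V_table[state] += ALPHA * (target - V_table[state])


# Linear function approximation: states that share features share weights.
weights = [0.0, 0.0]


def features(state):
    # Hypothetical hand-built features: (bias, scaled level).
    return [1.0, state / 10.0]


def approx_value(state):
    return sum(w * f for w, f in zip(weights, features(state)))


def approx_update(state, target):
    error = target - approx_value(state)
    for i, f in enumerate(features(state)):
        weights[i] += ALPHA * error * f


tabular_update(5, target=1.0)
approx_update(5, target=1.0)

print(V_table[5], V_table[6])            # only state 5 moved
print(approx_value(5), approx_value(6))  # state 6 moved as well
```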

agile owl
#

designing MLP networks to achieve certain goals like that is a skill issue for me

past meteor
#

For time series it's exactly why you'd pick a recurrent neural net

#

Or a CNN with dilated convolutions

agile owl
#

I'd have to break free of sb3 to do that which is the ultimate goal but not the time yet

#

I guess I could do the recurrent PPO

#

but the PPO results just aren't compelling

#

compared to what I was getting with SAC

#

maybe with the changes it will be different

#

things are coalescing quite nicely though the ultimate goal is to roll my own policy network

past meteor
#

I never got to covering PPO/SAC when I was doing RL so I wouldn't have any pointers there in all honesty

agile owl
#

PPO is on-policy and SAC is off-policy and for financial data I think it's clear that off-policy is better because you can't just turn on a data spigot

#

so maximum exploitation is important

#

although the PPO results weren't terribly bad in most cases

#

I might as well give it another shot I have it modularly set up to just switch out the models and the policy networks

#

need to see where the SAC model ends up on the test data though

past meteor
#

Both of them are policy-gradient methods I presume

agile owl
#

yes

past meteor
#

Off-policy stuff can have unintended consequences though as you probably know already

#

More risky moves until it has converged to the optimal policy

agile owl
#

I have risk aversion built into the reward itself

#

to try to defeat that

#

bias

past meteor
#

solid

agile owl
#

if the expected variance of the reward goes above a certain value

#

it clips the reward on a linear gradient down to zero after a certain value

#

or if the reward is negative it is unclipped

neat hawk
#

Hey guys I need some help

Is there anyway to take linux system calls and turn them into stack traces

agile owl
#
```py
reward = ret / prev_var if prev_var else 0
if np.isnan(reward):
    reward = 0
if prev_var * self.var_sigma > self.soft_var_limit:
    reward = min(
        reward,
        max(0, (1 - prev_var * 1 / (self.hard_var_limit - self.soft_var_limit)))
        * reward,
    )
```

Something like this where prev_var is directly related to the variance of the time series and the weight the agent is putting on it (absolute value of positive or negative weight * the conditional stdev * some multiplier)
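
One way to read the rule described in words earlier (full reward below a soft limit, a linear ramp down to zero at a hard limit, negative rewards left alone) is sketched below. This is an interpretation, not a transcription of the snippet above, which scales differently; `risk_scaled_reward` and its parameter names are hypothetical.

```python
def risk_scaled_reward(reward, variance, soft_limit, hard_limit):
    """Scale positive rewards linearly to zero between soft_limit and hard_limit.

    Hypothetical helper: negative rewards pass through unchanged.
    """
    if reward <= 0 or variance <= soft_limit:
        return reward
    if variance >= hard_limit:
        return 0.0
    scale = (hard_limit - variance) / (hard_limit - soft_limit)
    return reward * scale


print(risk_scaled_reward(2.0, 0.5, soft_limit=1.0, hard_limit=3.0))   # below the soft limit
print(risk_scaled_reward(2.0, 2.0, soft_limit=1.0, hard_limit=3.0))   # halfway up the ramp
print(risk_scaled_reward(2.0, 4.0, soft_limit=1.0, hard_limit=3.0))   # past the hard limit
print(risk_scaled_reward(-2.0, 4.0, soft_limit=1.0, hard_limit=3.0))  # losses are not clipped
```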

#

this is why that engineered variance model is so important

#

it controls a lot of logic in the environment

#

so with something like this the environment isn't strictly defined, I can add my own rules as long as they are compatible with reality

#

I guess that could be applied to robotics too but it's a lot more obvious here that I don't have a strictly defined environment and I can add my own rules to it

#

and a lot of them depend on a variance estimate

#

it's like PPO is the guy who is doing the least possible effort to get a good grade and SAC is the gunner in the class

#

PPO is like "Hey, I passed right?" SAC is like "I want to have the highest average in the class ever"

#

so I figured out I could get better test results by constraining SAC than using PPO

#

I'm not sure how to make PPO more ambitious but I know how to rein in SAC somewhat

hazy wedge
#

which YOLOV model should i use for my first cv detection of common everyday objects?

random veldt
#

Hi, I'm trying to use Python to create a program that classifies input text (expenditure items) into a list of categories. I'm looking for a library that can help me with this, any recommendations?

#

I've heard you talk about Naive Bayes but I'm a bit lost here.

random veldt
#

I will take a look at it, thank you

desert oar
#

Whereas if you're using Polars with lazy and maybe also streaming, you'll be running an optimized query, using a faster execution engine, distributing the work among threads rather than processes, which has much less overhead

past meteor
random veldt
past meteor
# random veldt how a can send the code like this?

you need to surround your code with 3 backticks (`) if you want to have a code block or between 1 backtick if you want to have it inline.

Example: you do inline `like this`

A code block like this

```python

print("hello world")
```

(note: it's not showing because I escaped all the backticks)

random veldt
#

''' print("Hola Mundo") '''

#
```py
import re

import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Download the stopwords and punkt data if you don't have them yet
# import nltk
# nltk.download('stopwords')
# nltk.download('punkt')

rutaDataSet = 'datos_gastos.csv'

# Load the data from a CSV file
data = pd.read_csv(rutaDataSet)

# Build the stopword set once instead of reloading the list for every token
SPANISH_STOPWORDS = set(stopwords.words('spanish'))

# Clean and preprocess text
def preprocess_text(text):
    text = text.lower()
    # Keep letters and whitespace; accented Spanish letters are kept too
    text = re.sub(r'[^a-záéíóúüñ\s]', '', text)
    tokens = word_tokenize(text)
    tokens = [word for word in tokens if word not in SPANISH_STOPWORDS]
    text = ' '.join(tokens)
    return text

# Apply the preprocessing function to the dataset
data['Concepto_de_gasto'] = data['Concepto_de_gasto'].apply(preprocess_text)

# Initialize the vectorizer
vectorizer = CountVectorizer()

# Vectorize the 'Concepto_de_gasto' column
X = vectorizer.fit_transform(data['Concepto_de_gasto'])
y = data['Categoria']

# Initialize and train the Naive Bayes classifier
classifier = MultinomialNB()
classifier.fit(X, y)

# Predict the category for a text string
def predecir_categoria(texto):
    texto_preprocesado = preprocess_text(texto)
    texto_vectorizado = vectorizer.transform([texto_preprocesado])
    probabilidad_prediccion = classifier.predict_proba(texto_vectorizado)
    max_probabilidad = max(probabilidad_prediccion[0])
    if max_probabilidad < 0.5:  # If the top probability is below 50%, return the "unknown" category
        return 11
    else:
        prediccion = classifier.predict(texto_vectorizado)
        return prediccion[0]
```
#

that's the code I have so far, I have another file that calls the function "predecir_categoria"

#

some parts are in spanish , sorry

#
```py
import NaiveBayesModel

# Now you can use the predecir_categoria function
texto_gasto = "Estanco"
categoria_predicha = NaiveBayesModel.predecir_categoria(texto_gasto)

print("The predicted category for the expense text is:", categoria_predicha)
```
#

Thats my main

hazy wedge
#

im really confused about how to like download and install YOLOv5

#

wth am i suppose to download

desert oar
arctic wedgeBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

thorn oxide
#

I tried to run yolov8 on ultralytics on ubuntu 22.04 but it showed some errors here like wayland and i don’t know how to fix them

orchid lintel
# desert oar Can you show the code? Dask might have a lot of overhead from passing data aroun...

I actually don't have Discord on my work laptop, but it's something like:

*make DaskML OneHot Encoder, fit with a different DF*
*encode a column and get like 10k sparse columns, GroupBy another column & max*

vs

polars get_dummies
GroupBy & max

I tried the threaded version of Dask too, and I was using the Eager version of Polars. I think Polars might just be a little better about parallelizing high-cardinality GroupBys for some reason?

desert oar
#

were you able to get a sense of the bottleneck? it's just the groupby?

thorn oxide
#

Green button “code”

#

And download then extract it

hazy wedge
#

managed to download the folder

hazy wedge
thorn oxide
#

Open terminal and type “pip install -r requirements.txt” to download libraries

hazy wedge
#

what gpt suggested doesn't seem to work

thorn oxide
hazy wedge
#

yes

thorn oxide
#

I think video on youtube is outdated

hazy wedge
hazy wedge
thorn oxide
#

You Search for videos that have been posted recently

#

Welcome to my comprehensive guide on building a custom YOLO (You Only Look Once) object detection model tailored to your specific dataset! Whether you're in computer vision research or developing practical applications, this tutorial will walk you through the entire process.

In this video, I'll cover:

Data Collection and Annotation: Learn to g...

#

Maybe this vid

#

I regret buying a laptop with RX5500M

#

It’s uncomfortable

orchid lintel
#

One other possibility is that Dask doesn't fully play nicely with Pandas' Sparse Columns atm? Polars doesn't support them either, but I can up my memory (and at least it's fast). Maybe just for thoroughness I'll try Dask on a dense representation and give the old girl one last shot lol

desert oar
#

interesting results. yeah i wonder if dask is moving the data around too much, or otherwise not maximizing efficiency

#

my intuition in spark is always that it's designed to make operations on big data possible, rather than making operations fast

#

i wouldn't be surprised if dask followed a similar philosophy: make bigger data processing possible via distributed computing, rather than maximizing throughput

#

whereas polars is designed to work all very tightly in a single process with a thread pool on local arrow arrays. it's a very different execution model with much lower overhead and many more opportunities to optimize

#

and the core operations are written in a very fast runtime, whereas afaik dask is largely python aside from whatever numpy and pandas do internally in compiled extensions

orchid lintel
#

But I've got my ingrained habits, dasknabbit

desert oar
#

makes sense. i've only used dask for parallelizing parameter search when i was hitting weird pickling problems with multiprocessing.Pool and joblib

#

i liked the nice web UI so i could see progress

orchid lintel
#

But Dask also has their nice sklearn analogs and other stuff that Modin doesn't.

desert oar
#

interesting, i think dask-ml was very immature when i last used it

#

i've never actually done anything with dask for "big data" processing, at that time i had access to databricks so i used spark for big stuff

#

and i like to brag about how in R data.table i was able to handily work on a 1 billion row time series dataset just on my 2015 MBP with 16 GB RAM, with several browser windows etc. all open while doing so. just one core and highly optimized code.

orchid lintel
desert oar
#

lol

#

and imagine, joblib is still better than the default...

#

i think at some point i gave up on joblib's caching and just loaded the data from disk in each worker process

#

i had a whole lot of cool ML framework helper code i lost when i left that job, i was heavily burnt out at the time and didn't do a good job of archiving my work

orchid lintel
desert oar
#

on the flip side, stay at your job for a while and you won't need to worry about it 😛

iron portal
#

Can someone please help me with pytorch, it throws an error when i import it

agile owl
#

the rightmost column is time per episode for my reinforcement learner with incremental feature learners. Will these incremental learning algos eventually reach a stable performance or will the performance just keep decaying forever

#

good news is it seems to have found a deep and steady gradient at least

final kiln
agile owl
#

it's always time to cook

final kiln
#

Jesse, we need to cook AI

agile owl
#

we need more gpus

#

a bigger lab

desert oar
agile owl
#

don't criticize something until you see more of what's being made

desert oar
#

Is cooking good? Like a musician is really cooking if they're playing well. Or is cooking bad? Like they're burning up because they fucked up and they're suffering the consequences

#

Ahhh i see

#

A neutral third path

#

Honestly that's kind of an interesting concept to package up in a saying

#

Ty for the education

agile owl
#

I only know that from watching meme code review videos and people in twitch chat going wild and the host is like let him cook let him cook as he's reading through it

final kiln
#

Yeah, no one likes to eat raw food. Gotta let it cook first

past meteor
#

Re Dask: I used it a bit and I can't say I enjoyed the developer experience. When a thread failed the entire thing just failed without showing me an error / stack trace

agile owl
#

anyone know of library that has an incremental booster that can be trained starting from one obs?

#

I don't know what happens if you try to start LightGBM with one observation, but the way they've done it really makes it seem like it's intended to be instantiated with the majority of the data, and if you update it then you do that after the model is already trained

#

kind of annoying how they designed it

#

you can't just instantiate the model directly and iterate over a dataset calling update which is how I was hoping it would work

#

the mondrian forest is okay but I think a boosting algorithm would be better than a forest for this data

desert oar
#

unless it's specifically an online version of boosting, which does exist

agile owl
#

that's what I suspected but I saw people do some stuff

magic dune
desert oar
#

so you'd need something that specifically supports online training

agile owl
#

but you mean it has to be a totally online model

#

not just training the existing versions in an online fashion

#

although lightgbm does KIND of support updates it's a really awkward api

desert oar
agile owl
#

I realized I forgot something that was making the AMF perform a lot worse than lightgbm did in the batch version though and I fixed it

#

I'm actually surprised how easy it was to just replace that stuff

#

I guess I should stop getting surprised it's python after all

#

river is clutch

random fox
#

Please ping me if you reply

agile owl
#

Strangely enough while I had a smooth train using SAC on one time series, switching to another it wasn't able to actually find any gradient at all it seemed, but when I switched to DDPG it did, which I did not expect because I thought SAC was overall the more robust at finding the gradient

bold timber
#

Currently, I would like to visualize the predictions generated by the model I created. However, the output provided for each row and column appears as the same image.

I made the code as follows:

```py
import pandas as pd
pred_df = pd.DataFrame({'y_true': y_labels,
                       'y_pred': pred_classes,
                       'pred_conf': pred_probs.max(axis=1),
                       'y_true_classname': [class_names[i] for i in y_labels],
                       'y_pred_classname': [class_names[i] for i in pred_classes]})

pred_df['pred_correct'] = pred_df['y_true'] == pred_df['y_pred']
top_100_wrong = pred_df[pred_df['pred_correct'] == False].sort_values('pred_conf', ascending=False)[:100]

images_to_view = 9
start_index = 0 
plt.figure(figsize=(15, 10))

for i, row in enumerate(top_100_wrong[start_index : images_to_view].itertuples()):
  ax = plt.subplot(3, 3, i+1)
  _, _, _, pred_prob, y_true_classname, y_pred_classname, _ = row
  ax.imshow(images[i].numpy().astype("uint8"))
  plt.imshow(images/255)
  plt.title(f'actual: {y_true_classname}, pred: {y_pred_classname} \nprob: {pred_prob:.2f}')
  plt.axis(False)
```
#

Can you tell me what's wrong with the code?

agile owl
#

oh man the captions are getting me hungry

#

well what is images

bold timber
#

I want to visualize the most wrong predictions

agile owl
#

right but it's not defined in what you posted

#

and the error could be there for all we know
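
Without seeing `images` this is only a guess, but two things in the posted loop stand out: `images[i]` uses the loop position rather than the row's original position in `pred_df`, and the extra `plt.imshow(images/255)` draws the same array over every subplot. A library-free sketch of the indexing idea, using made-up values and placeholder strings for images:

```python
# Toy stand-ins for the model outputs (hypothetical values).
y_true = [0, 1, 1, 0, 2]
y_pred = [0, 2, 1, 1, 2]
pred_conf = [0.90, 0.80, 0.70, 0.95, 0.60]
images = ["img0", "img1", "img2", "img3", "img4"]  # placeholder for an indexable image collection

# Keep the original position of every wrong prediction alongside its confidence.
wrong = [(i, pred_conf[i]) for i in range(len(y_true)) if y_true[i] != y_pred[i]]
wrong.sort(key=lambda pair: pair[1], reverse=True)

for plot_slot, (original_index, conf) in enumerate(wrong):
    # plot_slot picks the subplot; original_index picks the image.
    print(plot_slot, original_index, images[original_index], conf)
```

In the DataFrame version, the original position is the first field of each tuple from `itertuples()`, available as `row.Index`, assuming `images` is ordered the same way as `y_labels`.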

bold timber
past meteor
bold timber
past meteor
#

I don't understand what you mean by that

#

You can define "most wrong" in several ways, most true positives, false negatives, in terms of proportion and so on

bold timber
#

the image above is taken from another source

past meteor
#

I still don't know what you want but that's okay, maybe someone else will get it 👍

lapis sequoia
#

you will need some way of marking images in each batch

#

fastai can find the worst predictions, idk how though

bold timber
lapis sequoia
#

maybe limit list size if you get too many of them

feral sand
#

hi!
i just want to put the x in a step of 1 instead of 5, is there any way to do that?

desert oar
#

the key word is "ticks". each marker on the axis is a "tick"
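
A minimal sketch of setting the ticks explicitly, assuming a plain line plot with x running from 0 to 20 (the data is made up):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = list(range(0, 21))
ax.plot(x, [v ** 2 for v in x])

# Put a tick at every integer instead of letting matplotlib choose the spacing.
ax.set_xticks(list(range(0, 21)))

print(ax.get_xticks())
```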

feral sand
#

thank you!

feral sand
arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

hazy wedge
#

how do i train YOLOv5 with this dataset i found?

lapis sequoia
#

Hello

#

How can I switch a live face recognition system to recognize faces in photos?

#

Someone can help me?

#

Please

pulsar elk
#

hello

#

now the question is what approach should i consider while performing statistical analysis

#

like anova ftest or t-test like i know how to do them but don't know to apply on dataset

patent minnow
#

what is the best way to train a gpt model off of api docs? I created a json file from redocs for an api and I wanted to know the best way to train a gpt model or something else in langchain. I tried using gpt assistants but tbh it hallucinates alot. Anyone have any suggestions?

hazy wedge
#

i got a dataset here, these folders are full of images of each category by the name of the folders, all i want to know is how do i train YOLOv5 on it 😬

final kiln
#

it is subtle

#

in the metric attention the number of parameters goes down as you increase the number of heads

#

I forgot my calculations, but in the limit it goes down to 0.5 or 0.75, something like that

#

n_param_metric/n_param_transf = 3/4 + 1/(4n)

#

where n is the number of heads

#

they both scale as O(c**2), where c is the dimension of the embeddings

#

transformer scales with 4c**2
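
Taking the two quoted figures at face value (4c**2 for the transformer and a ratio of 3/4 + 1/(4n)), the metric variant would have 3c**2 + c**2/n parameters, so the ratio tends to 0.75 as the number of heads grows. A quick check, with hypothetical function names:

```python
def transformer_attention_params(c):
    # Figure quoted in the chat: 4 * c**2.
    return 4 * c ** 2


def metric_attention_params(c, n_heads):
    # Implied by the quoted ratio 3/4 + 1/(4n): 3 * c**2 + c**2 / n.
    return 3 * c ** 2 + c ** 2 / n_heads


def ratio(c, n_heads):
    return metric_attention_params(c, n_heads) / transformer_attention_params(c)


for n in (1, 2, 4, 8, 64):
    print(n, ratio(512, n))
```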

daring fern
#

Hello world,

I am using PyTorch for generating pixel art by using GAN, my train model works however I don't know how to increase the quality. Maybe you have some ideas how to do or what methods/algorithms I need to use?

I am attaching two images (left one of the image from my dataset where the quality is pretty good and on the right is mine generated by my model)

daring fern
#

3706 files

agile owl
#
```py
    def step(self, action) -> tuple:
        assert self.action_space.contains(action)
        self.prev_portfolio_value = self.current_portfolio_value
        self._update_px()
        self._update_regime_model()
        self._update_regime()
        self._calc_rolling()
        self._accrete_carry()
        self._update_variance_model()
        self._update_variance()
        self._update_vm()
        self._update_im()
        if self.balance < 0:
            self._manage_liq_deficit()
        amount = round(action[1], 0)
        if action[0] > 0:  # Buy
            self._buy(amount)
        elif action[0] < 0:  # Sell
            self._sell(amount)
        self._update_return()
        self._update_risk()
        self._update_sharpe()
        reward = self._get_reward()
        self.render(action, amount)
        state_frame = self._get_state_frame()
        self.state_buffer.append(state_frame)
        obs = self._get_observation()
        self.current_step += 1
        if self.current_portfolio_value < 0.6 * self.max_portfolio_value:
            done = True
            self.reset()
        elif self.current_step == len(self.periods[self.period]):
            done = True
            self.reset()
        else:
            done = False
        self.done = done
        info = {}
        return obs, reward, done, False, info
```

How I learned to stop worrying and love the state

grizzled locust
#

```py
import pandas as pd
import os
!pip install ydata-profiling
from ydata_profiling import ProfileReport
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.metrics import silhouette_score, silhouette_samples
import sklearn.metrics
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score, log_loss, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

import scipy

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
```

#

i wanted to install ydata profiling but it ended in error

#

```
/usr/local/lib/python3.10/dist-packages/typeguard/_importhook.py in <module>
20 from collections.abc import Buffer
21 else:
---> 22 from typing_extensions import Buffer
23
24 if sys.version_info >= (3, 11):

ImportError: cannot import name 'Buffer' from 'typing_extensions' (/usr/local/lib/python3.10/dist-packages/typing_extensions.py)
```

#

what did I do wrong?

#

can you guys say something? because tbh i just wanted to finish my portofolio so i can look for a job

small wedge
#

if you haven't already

#

rotate, color, add noise, etc to the images to increase the size of your dataset
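
A small library-free sketch of those augmentations on a tiny grayscale image stored as nested lists. The helper names are hypothetical, and whether rotation or noise is appropriate for pixel art depends on the dataset.

```python
import random


def rotate90(img):
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    return [row[::-1] for row in img]


def add_noise(img, amount, rng):
    """Add uniform integer noise in [-amount, amount], clamped to 0..255."""
    return [[min(255, max(0, p + rng.randint(-amount, amount))) for p in row] for row in img]


img = [[0, 50],
       [100, 150]]
rng = random.Random(0)  # fixed seed so runs are repeatable

rotated = rotate90(img)
flipped = flip_horizontal(img)
noisy = add_noise(img, 10, rng)
print(rotated, flipped, noisy)
```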

slender kestrel
#

@past meteor hey, I have a question: when and why do we do differencing before finding the correlation? I found the answer to why but not when, so I want to confirm with you

past meteor
#

You do differencing to remove the trend

slender kestrel
slender kestrel
slender kestrel
#

anyone ?

wooden sail
#

the general ACF depends on the two time lags instead of their difference, though

agile owl
#

the differences and the levels are two different things

#

people look at correlation of levels and correlation of differences

#

it depends on what you're trying to do
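
To make the levels-versus-differences point concrete, here is a sketch with two made-up series that share nothing except an upward trend. Their levels look almost perfectly correlated, while their first differences do not.

```python
import math


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def diff(series):
    """First difference: series[t + 1] - series[t]."""
    return [b - a for a, b in zip(series, series[1:])]


n = 100
# Deterministic "noise" patterns with different periods (made up for the example).
noise_x = [1 if t % 2 == 0 else -1 for t in range(n)]
noise_y = [1 if t % 4 < 2 else -1 for t in range(n)]
x = [1.0 * t + noise_x[t] for t in range(n)]
y = [2.0 * t + noise_y[t] for t in range(n)]

level_corr = pearson(x, y)
diff_corr = pearson(diff(x), diff(y))
print(level_corr, diff_corr)
```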

slender kestrel
slender kestrel
slender kestrel
solemn silo
#

what should i learn first for ai and data science - numpy, pandas or matplotlib

serene scaffold
#

but you also need to learn DS/AI theory itself, as an entirely separate thing from "learning libraries". DS/AI is a scientific field that uses programming. it's not a programming field.

desert oar
covert finch
#

I started my first job working with an actual dev team. Transitioned from a data analyst, but python was always my go-to analytics tool. Although I was never a python developer in the traditional sense, I was taught the proper conventions that [attempts] to make code readable to others (of course, no one gets them all). I now work out of databricks notebooks with pyspark. I took a peek at the prod utils repo, and there were some things that didnt sit well. The biggest being one line importing a few functions from a module, and in the same cell, importing the whole module as a qualifier. EG:

'''fake_example.py'''
from pandas import DataFrame, read_csv, Series
import numpy as np
import pandas as pd

---some custom functions here---

and in another notebook they create even more redundancy, best way to show is with a fake example:

%run fake_example.py
from pandas import read_json

%run some_other_helper.py

something = custom_func(var)

---some other logic/flow control---

hard_to_follow = some_other_custom_func(read_json("sloppy.json"))

so its not exactly clear which function comes from where, although I think databricks can tell you the source somehow. I know notebooks are a completely different animal, but I didn't expect this.

The funniest thing is that they tell us not to use the pyspark-pandas API because it is too slow. And I believe it. But then it seems to run incredibly slow regardless.

is it possible that these redundant imports could be slowing things down?

serene scaffold
#

@covert finch "is it possible that these redundant imports could be slowing things down?"
the work is only done the first time you import something. for all subsequent imports, python just sees that you already imported it and does nothing.

#

that's globally. not per module
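
A minimal sketch of the caching described above. It uses the standard-library json module as a stand-in for pandas (an assumption, only so the snippet runs without extra packages); the variable names are made up for illustration.

```python
import sys

import json                    # first import: the module is loaded and executed
first = sys.modules["json"]    # ...and cached in sys.modules for the whole process

import json as json_again      # later import: just a lookup in sys.modules
from json import dumps         # pulls one name out of the same cached module

same_module = json_again is first        # both names point at one module object
same_function = dumps is first.dumps     # the imported name is the same function
print(same_module, same_function)
```

So mixing `import pandas as pd` with `from pandas import DataFrame` costs close to nothing at runtime; whatever is slow in those notebooks is more likely somewhere else.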

covert finch
#

I see, that makes sense

#

Either way, it is so out of my lane to question that codebase.

#

I will keep my head down and do my job

#

Theres also a lot about spark and databricks that I havent fully grasped may explain different conventions

mild dirge
#
from pandas import DataFrame, read_csv, Series
import numpy as np
import pandas as pd

This allows you to use DataFrame without having to write pd.DataFrame f.e. So it is not redundant.

#

Although normally people just write pd.DataFrame

#

@covert finch

covert finch
#

redundant is the wrong word. I understand now its not like its using more resources. Its just... qualifiers exist for a reason, ya know?

Although I definitely see them a lot less in pyspark

#

I actually think its because im so used to pandas and numpy

#

A lot of people only import a few functions of course, I just dont understand why you would do both.

#

Its easy to judge someone elses code. I shouldnt have that mindset.

scenic parcel
#

Is duckdb faster than polars

desert oar
#

If anything, sometimes the perspective of a newbie is valuable

#

But if somebody is stubborn enough to avoid writing code the way everybody else writes code, they might not be interested in other peoples perspectives

desert oar
#

I wouldn't be surprised if duckdb was slightly faster overall, it doesn't need to worry about interoperation with python as much

scenic parcel
#

Its just pretty much appending and resampling on a many large dataframe

covert finch
covert finch
desert oar
#

if you need to insert rows incrementally, use a traditional row-oriented storage system like sqlite

#

or a list, or plain files on disk, etc.

#

(at least, if you care about throughput)

scenic parcel
desert oar
scenic parcel
#

Idk why I said appending

desert oar
desert oar
scenic parcel
#

It’s just going to take a while because it’s like 70gbs

#

It’s just counting mentions of a specific word in each row

desert oar
#

don't force people to interview you to help you

trail wren
#

hello I have a question! I have an np array with the following variables, X = [x_1, x_2, x_3, x_4, x_5 ... x_(2n-1)] with length (2n-1), I would like to transform this into an (n, n) matrix Y, such that the variables within X are the diagonals of matrix and that they are present within every element on the same the cross-diagonal axis. Like so,

#

How may I accomplish this? I am not deft enough in numpy yet so I really can't think of a better way than a for loop. but I only want to compute the values x_1 x_2 and so on once.

desert oar
trail wren
#

YEP!

#

I was just thinking about a rolling window

desert oar
#

!e ```python
import numpy as np

x_flat = np.array([11, 12, 13, 14, 15, 16, 17, 18, 19])
n = len(x_flat) // 2 + 1

x = np.empty(shape=(n, n))
for i in range(n):
    x[i, :] = x_flat[i : i + n]

print(x)
```

arctic wedgeBOT
#

@desert oar :white_check_mark: Your 3.12 eval job has completed with return code 0.

[[11. 12. 13. 14. 15.]
 [12. 13. 14. 15. 16.]
 [13. 14. 15. 16. 17.]
 [14. 15. 16. 17. 18.]
 [15. 16. 17. 18. 19.]]
trail wren
#

thanks a lot for confirming and coming up with this glorious snippet

desert oar
#

sometimes the best approach is to go full caveman and write a loop over indexes

#

i'm sure APL has a fancy combinator for this, but i'm too stupid for that

#

the important thing is to pre-allocate the output and operate with slices, instead of doing something like building up a list of lists and converting it to an array at the end
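
For the rolling window idea mentioned earlier, a hedged sketch: NumPy (1.20 or newer) ships `sliding_window_view`, which builds the same matrix without a Python loop. Whether it is actually faster for a given n is untested here; the `expected` array only exists to compare against the pre-allocated loop.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x_flat = np.array([11, 12, 13, 14, 15, 16, 17, 18, 19])
n = len(x_flat) // 2 + 1

# Each row is a length-n window over x_flat. The result is a read-only view
# onto x_flat, so call .copy() before writing into it.
x = sliding_window_view(x_flat, n)

# Same matrix as the explicit loop with a pre-allocated output.
expected = np.empty((n, n), dtype=x_flat.dtype)
for i in range(n):
    expected[i, :] = x_flat[i : i + n]

print(np.array_equal(x, expected))
```

Because the view shares memory with x_flat, no values are copied or recomputed, which matches the "only compute x_1, x_2, ... once" requirement.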

trail wren
#

I see, especially that I am working with large lists, it is important to allocate enough memory at first to avoid large computational and allocational costs

#

I was quite reluctant of a for loop because of the costs it might've incurred but this algorithm should be as good as it gets

#

for Python methinks

#

I am sure some crazy man could use Unions in C to make this work but we do not talk about C

desert oar
#

numpy is all C internally 😉

#

you could do it this way too, but i suspect it will be slower:

x = np.vstack([x_flat[i : i + n] for i in range(n)])
#

actually that would be interesting... let me benchmark

trail wren
#

does it use CUDA automatically or is it multithreaded for CPU only

serene scaffold
#

numpy can't do cuda

trail wren
#

I shall hope for when the CUDA library for Python comes out

desert oar
#

other libraries can do it, e.g. torch and jax. just not numpy

trail wren
#

and C bindings will not be too necessary for people like me

serene scaffold
#

cupy is supposed to be a drop-in replacement for numpy that can do cuda, but most of the cases where you'd want cuda are covered by torch et al.

desert oar
#

numpy is raw C + blas/lapack bindings

#

you might be interested in Julia, which has one unified native array type that can do both cpu and gpu more or less transparently. compared to python which now has like 5 different array libraries, not including any custom bindings to eigen, armadillo, etc. that someone might be tempted to write

trail wren
#

I have been acquainted with Julia through Grant Sanderson's tutorials before

serene scaffold
#

the 3b1b guy?

trail wren
#

aye

#

though I lost interest quickly

#

I guess I couldn't learn on my own back then, I needed a structured program

desert oar
#

he has julia tutorials? i know he did all his animations in python

trail wren
#

the sterilization of self-ventures you encounter in high school education is sad

#

Oh he has, just not in his channel

#

they are within Julia programming language's own channel I think? He talks about kernels, convolutions and some cool image processing algorithms with his usual intuitive explanations.

desert oar
#

nice, i'll watch those. i'm a perpetual julia newbie

trail wren
#

I am a perpetual everything newbie, I still can't get over the fact that I have to watch 'beginner tutorial' videos every now and then

#

The immediate settling of imposter syndrome is sooooo real

desert oar
#

that's fair. it might help to try to practice learning from other kinds of resources. videos tend to encourage passive information consumption, which discourages retention and understanding.

#

i've seen some things written about how students tend to rate videos highly for feeling like they're learning, but actually they perform the worst when you assess how much they learned

trail wren
#

YES, that is wonderfully descriptive of the problems I had since I arrived at higher education

#

had I actually gotten textbooks and read them, done practice instead of watching lectures and learning through animated neat videos, I would've had a much keener comprehension and skill with most of the subjects I've delved into so far.

#

It is to do that matters, so shall I, have a nice day.

covert finch
desert oar
#

i was the kind of student in public school who could learn everything by just listening attentively in lecture. basically the equivalent of watching a video.

ruby grail
#

Hello. I'm having trouble with some simple functionality of Pandas. I read a df from a json file and when I do df.iloc[0]['someprop']['someotherprop'] it all works. When I do df.iloc[0:1]['someprop']['someotherprop'] I expect it to work same as before but instead I get KeyError: 'someotherprop'

#

can someone please tell me why this works in this strange way?

serene scaffold
#

and the code you're using to load it?

trim saddle
# ruby grail Hello. I'm having trouble with some simple functionality of Pandas. I read a df ...

Have you checked what gets returned if you leave off some of the ['someprop'] parts?
If you try:

  • df.iloc[0]
  • df.iloc[0]['someprop']
  • df.iloc[0:1]
  • df.iloc[0:1]['someprop']
  • df.iloc[0:1]['someprop']['someotherprop']

it should be clear why you get the error.

From the docs: df.iloc with a slice returns a DataFrame, while df.iloc with a single integer returns a Series holding that one row.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html

But like @serene scaffold said, without knowing the filestructure its hard to say, what exactly went wrong
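
A small reproduction of the difference, assuming the JSON looks roughly like the made-up `records` below (the real file structure was never posted, so the column names are just the placeholders from the question):

```python
import pandas as pd

# Hypothetical data: one column whose cells are nested dicts.
records = [
    {"id": 1, "someprop": {"someotherprop": "a"}},
    {"id": 2, "someprop": {"someotherprop": "b"}},
]
df = pd.DataFrame(records)

# Single integer: a Series for that row, so ['someprop'] is the dict itself.
row = df.iloc[0]
value = row["someprop"]["someotherprop"]

# Slice: still a DataFrame, so ['someprop'] is a column (a Series of dicts),
# and ['someotherprop'] is looked up in its index, which fails.
column = df.iloc[0:1]["someprop"]
try:
    column["someotherprop"]
    raised = False
except KeyError:
    raised = True

# Flattening the nested dicts first gives real columns to select from.
flat = pd.json_normalize(df["someprop"].tolist())
values = flat["someotherprop"].tolist()
print(value, raised, values)
```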

ruby grail
#

Hey @trim saddle I simply did pd.json_normalize(df['someprop'])['someotherprop'] The important part is the json_normalize 😉

covert finch
#

according to the docs, json_normalize returns a dataframe object

#

so use df.loc , not iloc

#

oh I see, you probably tried read_json first but normalized fixed it

desert oar
wooden sail
# slender kestrel if i just want to find correlation b/w 2 different time series then in what scen...

idk if this reasoning helps you out any, but you can think of integration as a "low pass filter" and differentiation as a "high pass filter". that means taking the finite differences of a time series is helpful whenever you want to high pass. practically: when you want to remove any constant offset that shifts the whole series up or down, when there's a slow upward or downward trend, or any slow variations you don't care about (including seasonality over a large time scale). these things are not wrong to include in the ACF, but you may need to compute it differently, since they can indeed affect stationarity. as discussed before, stationarity is not required, but the ACF of a non stationary signal has 2 parameters (it's a 2d or matrix quantity derived from a 1d or vector quantity)
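
A toy illustration of why differencing matters before correlating two series. Everything here is synthetic and assumed: two series that share nothing except that both trend upward, with independent Gaussian noise.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def diff(series):
    """First differences: series[t] - series[t - 1]."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

rng = random.Random(0)
steps = range(200)
a = [0.5 * t + rng.gauss(0, 1) for t in steps]   # trend + independent noise
b = [0.3 * t + rng.gauss(0, 1) for t in steps]   # different trend, different noise

corr_levels = pearson(a, b)              # dominated by the shared upward trend
corr_diffs = pearson(diff(a), diff(b))   # trend removed, only the noise is left
print(round(corr_levels, 3), round(corr_diffs, 3))
```

The levels look almost perfectly correlated purely because of the trends, while the differences show that the fluctuations are unrelated. Which of the two numbers you want depends on the question, as said above.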

warm copper
#

OMG

#

@wooden sail

#

you are alive

wooden sail
#

it appears so

warm copper
#

MS CS with AI/ML specialization

wooden sail
#

congrats!

warm copper
#

thanks

wooden sail
#

could've sworn it was in old books

wooden sail
#

*edit: oops, had posted a meme in the wrong place

scenic parcel
slender kestrel
odd meteor
odd meteor
covert finch
#

havent tried cuDF pandas

#

keep in mind I only calculated the mean of a row reading large files into memory

#

I'll test it out and share the results!

#

oh it uses CUDA?

#

I think my gpu has 16g of vram so keep that in mind

atomic tide
#

Hey, I have a stats question. Given a set of data points, is there a straightforward/standard way to estimate the mutual information of a pair of variables?

covert finch
covert finch
atomic tide
#

I mean like H(X) - H(X|Y), where H is entropy.
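
For discrete (or already binned) data, one standard starting point is the plug-in estimate, which replaces the true probabilities with observed frequencies. A sketch under that assumption; the function name is made up, the result is in nats, and for continuous variables you would have to bin first or use a nearest-neighbour estimator instead.

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = sum p(x,y) * log(p(x,y) / (p(x) p(y)))."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    return sum(
        (count / n) * log(count * n / (px[x] * py[y]))
        for (x, y), count in joint.items()
    )

identical = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])      # Y determines X
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])    # no relationship
print(identical, independent)
```

This equals H(X) - H(X|Y) computed from the same frequency tables. The estimate is biased upward on small samples, so treat small positive values with suspicion.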

covert finch
#

hahahahhaha

#

okay yeah im not your guy

covert finch
#

I can't explain the maths behind it, but this may be what you are looking for in terms of python implementation

covert finch
past meteor
covert finch
#

I think cuda stuff is more important for super heavy computations, like deep learning. I don't think there will be a huge difference in data transformation, reading large files, etc

#

although their documentation claims otherwise haha

odd meteor
past meteor
covert finch
past meteor
#

But honestly, Polars reminds me a bit of writing dplyr and spark in terms of ergonomics

covert finch
#

the syntax is really similar to pyspark

odd meteor
# covert finch cuDF looks really interesting. But its a pain to install. Ill look at it more to...

I had some bottlenecks too while trying to install cuDF. But I'm glad I didn't relent in getting it in my machine.

This video might be useful https://youtu.be/9KsJRyZJ0vo?si=lXZWX7QSCRC18tHs

What if I told you that all this time we've been using Pandas wrong? 🐼 🐼 🐼
We keep running it on our CPU and wondering why it's slow - but what happens when we switch to GPU processing? 🤔
In this tutorial we will explore the brand new technology behind cuDF Pandas Accelerator Mode that allows us to use our graphic cards to make Pandas MUCH fast...

▶ Play video
past meteor
#

Last but not least, if you're in to that, the type hints are also a lot better so the LSP/IDE experience is fantastic as well

covert finch
#

im curious how it will perform on my local gpu haha

odd meteor
covert finch
#

oh i read your message wrong

#

yeah, im not worried about bottlenecks per se, I just dont like using conda

#

but I'll do it on WSL I guess, but why only ubuntu?

atomic tide
covert finch
#

or prop density functions

#

now you have me reading the wikipedia on entropy

#

hahaha

atomic tide
#

I'm trying to get into data science properly, finally.

covert finch
#

oh man I thought you were like a statistician or something

atomic tide
#

I do know a fair amount about probability.

#

But more on the theoretical side.

#

So... is this actually a reasonable way to evaluate the pairwise relationships between variables, or is this just... stupid lol? 😄 ```py
import pandas as pd
import seaborn as sns
from sklearn.feature_selection import mutual_info_regression

data = pd.read_csv('data.csv')

def mutual_info(x, y):
    return mutual_info_regression(x.reshape(-1, 1), y)[0]

sns.heatmap(data.corr(method=mutual_info), cmap='coolwarm')
```

#

It did make a pretty picture ¯_(ツ)_/¯

covert finch
#

lol

atomic tide
#

Although I would have thought the diagonals should be red pithink

covert finch
#

wheres data.csv

#

I can look too

atomic tide
#

It's not actually a csv file, I just added that for illustrative purposes.

covert finch
#

okay, let me look

atomic tide
#

After installing the ISLP pypi package, you load the data by doing: ```py
from ISLP import load_data
data = load_data('Boston')
```

covert finch
#

sns.heatmap(data.corr(method=mutual_info), cmap='coolwarm')

atomic tide
# atomic tide

The red squares seem reasonable. I mean, you would expect there to be a strong relationship between the amount of industry in an area and air pollution.

covert finch
#

this is pearsons r correlations, which is not super comprehensive

covert finch
atomic tide
#

It should be using the mutual_info function. I used corr because I wasn't sure how else to apply a function to all pairs of columns.

covert finch
#

oops ur right

atomic tide
#

Maybe there's a special pandas way to do that?

covert finch
#

no mutual_info i think is a better way to do that.

covert finch
#

okay

#

this is basically k-nearest neighbors

atomic tide
# covert finch but this still is true

Right yeah. I was thinking, if I were to train a predictive model, I could exclude predictors where there is another predictor that has a high degree of mutual information?

covert finch
#

agreed

atomic tide
#

I haven't learned about multicollinearity, but that seems to be a similar concept, but specifically regarding linear relationships.

covert finch
#

you can do that with pearsons r too actually. I used ordinary least squares regression models a lot (majored in econometrics) and its always good to check relationships

covert finch
#

waiting for a phd to enter the chat haha

atomic tide
covert finch
#

Distrust all models

#

but yes, you are correct linear models are meh. but its all just applied probability theory. I think for your use case clustering or logistic regression would be the way to go tho yeah

#

hold on

#

nvm I was right, phew

#

you can also plot a joint probability distribution

#

did you look at the github lol

#

"""
Helper functions for clustering

This module contains functions used for clustering in the unsupervised
lab of ISLP. Currently it contains just a simple function to construct
a linkage matrix to assist plotting a dendrogram of a hierarchical
clustering.

"""

import numpy as np

def compute_linkage(hclust):
    """

    Create linkage matrix used to plot a dendrogram

    Follows [sklearn example](https://scikit-learn.org/stable/auto_examples/cluster/plot_agglomerative_dendrogram.html)

    Parameters
    ----------

    hclust : `sklearn.cluster.AgglomerativeClustering`
        Fitted hierarchical clustering object.

    Returns
    -------

    linkage_matrix : np.ndarray
        Array to be passed to `dendrogram` from `scipy.cluster.hierarchy`.

    """

    counts = np.zeros(hclust.children_.shape[0])
    n_samples = len(hclust.labels_)
    for i, merge in enumerate(hclust.children_):
        current_count = 0
        for child_idx in merge:
            if child_idx < n_samples:
                current_count += 1  # leaf node
            else:
                current_count += counts[child_idx - n_samples]
        counts[i] = current_count

    linkage_matrix = np.column_stack([hclust.children_, hclust.distances_,
                                      counts]).astype(float)
    return linkage_matrix
#

its even on their website

muted crypt
#

Hello! I have a question, which may be quite complex but I'm stuck with it.

I have two datasets containing captured messages from two different antennas. Most of the received messages are the same, since the antennas are close to each other, but some messages are caught by only one of the antennas. Their timestamps do not always match, because there is a variable delay that is not constant throughout the dataset.

I have to remove the duplicates and merge the dataset into a unique one. I tried correlation in small time windows to find the delay at that given time and then try to do some dynamic time warping to match the replies and remove the ones closer than a threshold. However the results are very random I feel.

Any insights?

covert finch
#

I would have to see the data set

muted crypt
covert finch
#

and you want to merge the datasets on the timestamps, but some of them are slightly off, correct?

muted crypt
#

Yes!

#

Something like this by plotting the received messages from the two antennas

#

The dotted lines means that I have removed a common point in the two datasets (received same message at same or almost same time)

covert finch
#

its hard to give good insight without the raw data, but i mean one possible solution would be rounding the timestamps to the nth decimal place so that they match up, unless you are talking about signals being like 1/2 a nanosecond off, in which case does it really matter? haha

#

sorry, your graph means nothing to me

muted crypt
#

I just plotted the received messages vs time

covert finch
#

i dont see time

#

just lines

#

theres no other unique identifier? like a ping id or something?

#

ah but you are most concerned about the time

muted crypt
#

Red line and blue lines are the data from the two antennas. I just put a dot on the x axis being time when a message is received

muted crypt
#

But yes, the problem is this unpredictable delay

covert finch
#

right got it

muted crypt
#

Just a quick sample of the data

covert finch
#

If you are comfortable sending me the data, I will help you

muted crypt
#

But there is another message identifier which is more unique for each message and other properties, like the strength, format...
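
If that identifier is reliable, one simple baseline to try before correlation or time warping is to treat two records as the same message only when the identifier matches and the timestamps are within a tolerance. A sketch under those assumptions: the (timestamp, message_id) layout, the function name and the tolerance value are all made up, and the tolerance has to be larger than the worst delay between the antennas.

```python
def merge_antennas(first, second, tolerance):
    """Merge two lists of (timestamp, message_id), each sorted by timestamp.

    A record in `second` is dropped as a duplicate when `first` has a record
    with the same message_id at most `tolerance` away in time.
    """
    merged = []
    used = [False] * len(second)
    start = 0
    for t1, id1 in first:
        # Records in `second` older than t1 - tolerance can never match again.
        while start < len(second) and second[start][0] < t1 - tolerance:
            start += 1
        j = start
        while j < len(second) and second[j][0] <= t1 + tolerance:
            if not used[j] and second[j][1] == id1:
                used[j] = True   # duplicate of the current record
                break
            j += 1
        merged.append((t1, id1))
    # Whatever was only heard by the second antenna is kept as-is.
    merged.extend(msg for msg, taken in zip(second, used) if not taken)
    merged.sort()
    return merged

antenna_a = [(1.00, "A"), (2.00, "B"), (3.00, "C")]
antenna_b = [(1.02, "A"), (2.50, "D"), (3.01, "C")]
result = merge_antennas(antenna_a, antenna_b, tolerance=0.05)
print(result)
```

Unlike rounding timestamps, this does not break when two copies of a message fall on opposite sides of a rounding boundary.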

#

I can send a part of it yeah, but not all as it is confidential

covert finch
#

totally understand

atomic tide
# covert finch you can also plot a joint probability distribution

Yeah. Looking at the pair plots there are some weird things, like variables that are clearly very correlated but apparently have low mutual information. I think I need to just read the book before getting ahead of myself and trying to come up with my own ideas tbh 😄

covert finch
#

dont send it here

muted crypt
#

okay give me a second

covert finch
#

ahhahaa

#

did you use the pytorch script

atomic tide
muted crypt
covert finch
covert finch
past meteor
#

What alternatives do you guys use to Google Colab (pro +)? GPUs at work have an issue and I need some soonish. I could just use Azure but I'm interested in things with a transparent pricing regimen which traditional cloud providers aren't imo.

versed gulch
#

say I have non-integer coordinates (50.345, 60.789). How would I get the value of the pixel in an image array at this coordinate?

trail wren
#

Well depends on your coordinate notation, if you want the top-left corner of your screen to have float coordinates (0.0, 0.0) then you could just do int() on each of these float values to get their floor integers which should be indices to the corresponding pixel.

#
x = (50.345, 60.789)
x = (int(x[0]), int(x[1]))
#

this should return the required indices, (50, 60)

versed gulch
wooden sail
#

ideally you would use an interpolation function that is well motivated. otherwise, you have to pick one by eye and see what gives you good results
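
As one concrete choice of interpolation function, here is a sketch of bilinear interpolation, which blends the four pixels surrounding the point. It assumes a single-channel image stored as a list of rows, coordinates given as (row, col), and a point inside the image; none of that was specified in the question.

```python
from math import floor

def bilinear(image, row, col):
    """Value at fractional (row, col), blended from the four nearest pixels."""
    r0, c0 = floor(row), floor(col)
    r1 = min(r0 + 1, len(image) - 1)      # clamp at the bottom edge
    c1 = min(c0 + 1, len(image[0]) - 1)   # clamp at the right edge
    fr, fc = row - r0, col - c0           # fractional offsets in [0, 1)
    top = image[r0][c0] * (1 - fc) + image[r0][c1] * fc
    bottom = image[r1][c0] * (1 - fc) + image[r1][c1] * fc
    return top * (1 - fr) + bottom * fr

image = [[0, 10],
         [20, 30]]
center = bilinear(image, 0.5, 0.5)   # equal mix of all four pixels
corner = bilinear(image, 0, 0)       # integer coordinates return the pixel itself
print(center, corner)
```

Truncating with int() corresponds to nearest-pixel (floor) lookup, which is fine when you only need to know which pixel contains the point.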

trail wren
#

So anything between 0.0 and 1.0 is literally in the area spanned by that pixel

#

hence if you transform the coordinates into indices, you will get the pixel that includes that specific point no matter what.

#

The accuracy in regards to picking the right pixel that encompasses that point is therefore absolute. You will have digitization however, as multiple float inputs will map into the same integer tuple output, which is just how images work in the first place so there is no error.

final kiln
#

rip no gpu capacity on aws spot

solemn silo
serene scaffold
odd meteor
past meteor
final kiln
final kiln
#

this is the LR used in the attention is all you need paper

final kiln
#
model_factory = ModelFactory(
    coordinates = 300,
    tokens = gpt2_encoder.max_token_value,
    words = 250,
    number_of_blocks = 10,
    number_of_heads = 20,
    bias = False,
    attention = "metric"# "scaled_dot_product", # or "metric"
)


training_loop_factory = TrainingLoopFactory(
    number_of_epochs = 1000,
    number_of_batches = 100,
    warmup_steps=100,
    loss_function = "CrossEntropyLoss",
    batch_size = 32,
    input_text_file = "train/static/raw_data.txt",
    split_ratio = 0.9,
)
#

gonna let it be until colab takes away the machine, need to do something about the spot instances

covert finch
covert finch
frosty lance
#

Can anyone help out with a NEAT-Python game that i just, for the life of me, cant get to come to a solution?

elfin viper
covert finch
bronze robin
#

What value do complex numbers hold in the field of data analysis? Say I have a logarithmic equation (ln(y) = c + ln(x)) and my input data also contains negative entries. Does that mean those entries are invalid for my analysis and should be discarded, or can I carry out the analysis by applying the absolute value, which technically means considering just the real part of the ln(x) solution?

mortal shore
#

Need to find out how many complex words are present in a set/list of tokenized words

#

Any clue how I can achieve this

mortal shore
#

Yelp

desert oar
#

Partly it's a problem of interpretation. In a lot of cases, complex numbers act more like a pair of numbers than a single number. How do you actually interpret linear regression with a complex output?

#

Complex logarithms in particular, as far as I know, are not straightforward

#

That is, I don't think it's like a square root where you put a negative number in and a complex number comes out

#

But I could be wrong, I never studied complex analysis

#

I think the only time it's likely to come up in applied data science or machine learning is in some kind of signal processing context with Fourier decomposition, which I believe people use sometimes for feature engineering in deep learning, but I have no experience with that

#

One more thing to note is that in the case of the logarithm, if you're just trying to transform highly skewed data that might contain zeros or negative numbers, you can use the inverse hyperbolic sine function instead, although it's not as easy to interpret as a logarithm, and you don't have a library worth of tidy well understood results for the distribution of random variables transformed with the inverse hyperbolic sine function
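
A quick look at how the inverse hyperbolic sine behaves, using only the standard library. The sample values are arbitrary.

```python
from math import asinh, log

values = [-1000.0, -1.0, 0.0, 1.0, 1000.0]

# Defined for every real number, zero stays zero, and the sign is preserved.
transformed = [asinh(v) for v in values]

# For large positive x, asinh(x) is very close to log(2 * x),
# so it compresses big values much like a logarithm does.
gap = asinh(1000.0) - log(2 * 1000.0)
print(transformed, gap)
```

Since asinh(x) = ln(x + sqrt(x^2 + 1)), the transform is roughly linear near zero and logarithmic in the tails.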

#

So I would say it's only relevant in data analysis if it's relevant to your problem domain. Otherwise, you're not likely to encounter it at all

final kiln
#

im a bit confused, am I supposed to feed the encoder output to every decoder block ?

desert oar
final kiln
#

how do I choose where to connect it ? in the middle ?

#

and it only shows 3 attention heads, two coming from the encoder, how do I decide on that split ?

#

and are the positional encoding modules the same ? or should I create one for each branch ?

desert oar
#

i think you just do positional encoding once at the beginning but don't quote me on that. maybe look at an implementation for these details

final kiln
#

yeah I think I need to do that, they have the code in a repo I think

#

I think I narrowed it down

#
  with tf.variable_scope(name):
    for layer_idx in range(hparams.num_decoder_layers or
                           hparams.num_hidden_layers):
      x = transformer_decoder_layer(
          x,
          decoder_self_attention_bias,
          layer_idx,
          hparams,
          encoder_decoder_attention_bias=encoder_decoder_attention_bias,
          encoder_output=encoder_output,
          cache=cache,
          decode_loop_step=decode_loop_step,
          nonpadding=nonpadding,
          save_weights_to=save_weights_to,
          make_image_summary=make_image_summary,
          losses=losses,
          layer_collection=layer_collection,
          recurrent_memory_by_layer=recurrent_memory_by_layer,
          chunk_number=chunk_number
          )
final kiln
final kiln
arctic wedgeBOT
#

tensor2tensor/layers/vqa_layers.py lines 232 to 240

```python
if memory_antecedent is not None:
  # Encoder-Decoder Attention Cache
  q = common_attention.compute_attention_component(
      query_antecedent, total_key_depth,
      q_filter_width, q_padding, "q",
      vars_3d_num_heads=vars_3d_num_heads)
  k = cache["k_encdec"]
  v = cache["v_encdec"]
else:
```
final kiln
#

two come from cache, and the other is being computed somehow from the compute_attention_component

#

aaah, I think im gonna need to rework my layers

#

this is quite interesting, I don't think I have a parallel for this setup in the metric tensor thing

#

I mean, ofc I have, I just dot the output from the encoder with the output from the prev decoder

#

in both cases it is a simple mod, on the forward method I add a new argument, forward(sequence1_bwc, sequence2_bcw), first one is used to compute queries, the second to compute keys and values. When I need to use decoder only I just feed the same sequence twice. My attention mechanism is even simpler since I'm just doing xMx.T, I can do xMy.T
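
A single-head NumPy sketch of the two-argument forward described above, for the scaled dot product case. This is not the tensor2tensor code or the metric variant; the function names, shapes and random weights are assumptions for illustration only.

```python
import numpy as np

def softmax(scores):
    # Subtract the row maximum for numerical stability.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

def attention(query_seq, memory_seq, w_q, w_k, w_v):
    """Queries come from query_seq; keys and values come from memory_seq."""
    q = query_seq @ w_q
    k = memory_seq @ w_k
    v = memory_seq @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return weights @ v, weights

rng = np.random.default_rng(0)
decoder_states = rng.normal(size=(3, 8))   # 3 target positions, width 8
encoder_output = rng.normal(size=(5, 8))   # 5 source positions, width 8
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))

# Cross-attention: decoder positions attend over the encoder output.
cross, cross_weights = attention(decoder_states, encoder_output, w_q, w_k, w_v)

# Self-attention: feed the same sequence twice.
self_out, self_weights = attention(decoder_states, decoder_states, w_q, w_k, w_v)
print(cross.shape, self_out.shape)
```

The output always has one row per query position, regardless of how long the memory sequence is, which is what lets every decoder block consume the same encoder output.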

feral sand
#

how do i prepare the data and LSTM model in keras for suporting multiple features as input?

desert oar
final kiln
# desert oar Yes that's definitely what they are

im gonna train the networks to do summarization, it's easier for me to do a qualitative assessment of the output and it is a useful application that I can easily deploy for portfolio value

so far I've only seen one metric for evaluating the output, it's called perplexity

for training data, I'll likely query wikipedia articles on-demand and feed them through lamma to get an output

leaden narwhal
#

Hey guys any input that might help me with this

slow vigil
#

Thought I'd share. Interesting little niche problem for Linux users using multiprocessing

#

Short answer: Explicitly setting the context for creating new processes as 'spawn' rather than the default 'fork' fixes the issue and allows me to write out to disk from inside a child process
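
A sketch of what setting the context explicitly can look like. The worker function, file names and pool size are made up. The usual explanation for the hang, which was not confirmed for this particular case, is that fork copies locks held by other threads in their locked state, so the child can wait on them forever.

```python
import multiprocessing as mp
from pathlib import Path

def write_part(args):
    """Worker: write one output file and return its path."""
    index, out_dir = args
    path = Path(out_dir) / f"part_{index}.txt"
    path.write_text(f"result from worker {index}\n")
    return str(path)

def run(out_dir, n_parts=4):
    # Ask for "spawn" explicitly instead of relying on the platform default.
    # Each child starts a fresh interpreter rather than a forked copy.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(write_part, [(i, out_dir) for i in range(n_parts)])
```

With spawn, the call to run(...) needs to sit under an `if __name__ == "__main__":` guard in a real script, because every child re-imports the main module.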

desert oar
#

I would've expected that you need to create a new thread in each worker process, or use a cross-process-capable queue for logging

slow vigil
#

I'm not really sharing data across processes. I'm just writing out from each process separately into separate files

#

But each write would cause the process that called it to hang

desert oar
#

unless you're trying to share an open file object across processes or something weird

jade sinew
#

Anyone want to practice pandas or matplotlib with me?

mint palm
#

i am using torch.nn.functional.interpolate() to convert 3,16,16 to 3,224,224
but i am getting the following error during backpropagation:
RuntimeError: The size of tensor a (224) must match the size of tensor b (16) at non-singleton dimension 3

#

what to do?

pearl barn
#

Boys of data analysis do I need to take all maven analytics courses for python or till visualization into seaborn and matplotlib will be enough??

river cape
#

Are there any good source for refering machine learning and ai available on the net?

past meteor
#

!rule 6

arctic wedgeBOT
#

6. Do not post unapproved advertising.

past meteor
#

Seems like you made this, which would mean it's unapproved advertising which means it sadly has to go.

past meteor
# jade sinew Anyone want to practice pandas or matplotlib with me?

A massive tip I can give you for learning matplotlib is reading the quick start guide https://matplotlib.org/stable/users/index.html. It's really important you understand the anatomy of a figure intuitively (in the link). it explains the "philosophy" quite well.

For Pandas reading this would also be a help https://pandas.pydata.org/docs/user_guide/10min.html#min

Both Pandas and matplotlib have cheatsheets I'd recommend you reference when using them in the beginning (after reading both links on top):

https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
https://matplotlib.org/cheatsheets/

dull ermine
# past meteor Seems like you made this, which would mean it's unnapproved advertising which me...

I have deleted the post. To clarify I had not made that post. I only discovered it, I enjoyed reading so I thought to share it in most relevant community. @past meteor Definition of advertisement : "a piece of information that persuades people to buy something". That article was not persuading anyone to buy anything. No membership was required to even read the article. For anyone still wanting link to article just tell me.

hazy wedge
#

why is the training of YOLOv5 stopping for me after it has done 1 epoch 😭

#

im training it on a dataset of about 8k images from roboflow

#

i used these parameters

#

and yes i have re-run the code 3-4 times already but same issue

real goblet
#

I have dataframe named df and I need to add a dict to it. How?

atomic tide
loud elbow
#

Hi everyone,
I want do some regression DS projects. Do you have any suggestion for choosing projects?

pale sandal
#

I would like to teach a NN to recognize if wires are inserted into the connector in the right order. I have a hundred images of connector side of a cable with wires attached. I've annotated them with polygons denoting where in the image the connector housing is and where each of the wires is. I have also annotated if the whole thing is assembled correctly. All tutorials are focusing mostly on recognizing different classes between images but I have all the same images with the same n classes on there (connector, x amount of wires and if it's "CORRECT"). Only some training images have some wires removed and made incorrectly. I have trouble extrapolating from the tutorials of how to go about this.

sharp zenith
#

someone knows any free AI API?

desert oar
desert oar
#

however if you run object detection on the image first, you might be able to pre-process the images in some way that might make the classifier more effective

#

otherwise, if you just take the images and put them directly into a classifier, the model basically has to learn how to detect objects, before it can learn to distinguish connected and disconnected

pearl barn
#

All these courses I mean do I need all of them??

desert oar
# pearl barn Matplotlib

i suggest spending a small amount of time every week reading the matplotlib documentation and practicing

pearl barn
desert oar
#

I guess those prices aren't that bad? They might get you started more quickly

#

I don't think any of that is necessary, it won't help you on a job application for example

#

But it might help get you started much more quickly than you might be able to learn it on your own, in which case it's money well spent depending on how you value your time

#

But it's also hard to say what the quality of the course material is

#

There are a lot of junk courses out there taught by people who are essentially beginners themselves and have no business teaching

#

I certainly do not think you should take all of them

#

Particularly when it comes to learning how to actually build models and reason about data at a professional level, lots of small bite-size courses will not help you make progress in my experience. You will need to spend some time and effort on larger hands-on projects, and at least some deeper study of the underlying math and statistics, if only so that you can share a common language with other practitioners and learn from them, read books, etc.

pearl barn
slate kraken
#

I need some help figuring out how to access the GRIT dataset used by the Ferret model, which was released by Apple. Here is the repo link: "https://github.com/apple/ml-ferret/"

pale sandal
lapis sequoia
#

Why is Python outdated and JavaScript up to date?

peak hamlet
#

wat

lapis sequoia
#

Yes

#

I mean most things aren't supported in Python

desert oar
#

In theory yes, the model can learn what black wires are, what connectors are, etc. but you might need a lot more data for that. The more information the model needs to learn from the data, the more data you need

final kiln
#
from torch import nn


class EncoderDecoder(nn.Module):
    def __init__(self, params: "ModelFactory"):
        super().__init__()

        # Assumption: ModelFactory exposes the block count under this name.
        self.number_of_blocks = params.number_of_blocks

        self.sequence_encoder = SequenceEncoder(params)

        self.encoder = nn.Sequential()
        for i in range(self.number_of_blocks):
            block = TransformerBlock(params, is_decoder=False)
            self.encoder.add_module(f"encoder_block_{i}", block)

        # Keep both lists on self so their parameters are registered and
        # forward() can reach them; append() keeps integer indexing valid.
        self.decoder_blocks = nn.ModuleList()
        self.junction_blocks = nn.ModuleList()
        for _ in range(self.number_of_blocks):
            self.decoder_blocks.append(TransformerBlock(params, is_decoder=True))
            self.junction_blocks.append(TransformerJunction(params))

        self.output_layer = nn.Sequential(
            nn.LayerNorm(params.coordinates),
            nn.Linear(params.coordinates, params.tokens, bias=params.bias)
        )

    def forward(self, sequence_bw: TensorInt) -> TensorFloat:
        sequence_bwc = self.sequence_encoder(sequence_bw)
        encoder_output_bwc = self.encoder(sequence_bwc)

        # Each decoder block is followed by a junction with the encoder output.
        for i in range(self.number_of_blocks):
            sequence_bwc = self.decoder_blocks[i](sequence_bwc)
            sequence_bwc = self.junction_blocks[i](sequence_bwc, encoder_output_bwc)

        return self.output_layer(sequence_bwc)
pale sandal
final kiln
desert oar
#

but you can definitely try it

final kiln
#

UNETs converge surprisingly well with very little data

#

Might be worth a shot

pale sandal
desert oar
# final kiln UNETs converge surprisingly well with very little data

i'm just thinking about the conceptual application. I assume they're interested in doing something like taking a photo of an electrical box and determining if it's been wired correctly. There are going to be so many variations of configurations, lighting, cameras, etc. that the model is going to have to be robust to

desert oar
desert oar
final kiln
#

But, it worked because the background was always the same

desert oar
#

yeah that's what I was saying, just try it and see. but don't be disappointed if it doesn't work well either

final kiln
#

So it was memorizing the background

desert oar
#

if it's something like one camera mounted in a fixed location taking a photo of a specific jig in a production facility, 100 might be fine

desert oar
final kiln
#

It was, but I didn't predict it at the time, we had over 1k images and I wanted to determine how many images we actually needed to fit one network. Getting the bg removed from the images was expensive, so I was pleasantly surprised by the unet

desert oar
#

there should also be ways to combine a small number of labeled instances with a large number of unlabeled instances to improve results, but i'd have to do some digging to see how that works with images and NNs specifically. that's called "semi-supervised" learning
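
A minimal sketch of one semi-supervised recipe, pseudo-labeling (also called self-training), and not necessarily what would work best for images: fit a model on the labeled instances, predict on the unlabeled ones, and keep only the confident predictions as extra training labels. The nearest-centroid classifier, the `min_margin` threshold and the 2-D toy features are all assumptions for illustration. With images, the features would more likely be embeddings from a network.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Return the class labels and one mean feature vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(X, classes, centroids):
    """Predict the closest class and a confidence margin for each row."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    ordered = np.sort(dists, axis=1)
    margin = ordered[:, 1] - ordered[:, 0]  # gap between best and second best
    return classes[dists.argmin(axis=1)], margin

def pseudo_label(X_lab, y_lab, X_unlab, min_margin=1.0):
    """Add confidently predicted unlabeled rows to the labeled set."""
    classes, centroids = nearest_centroid_fit(X_lab, y_lab)
    pred, margin = nearest_centroid_predict(X_unlab, classes, centroids)
    keep = margin >= min_margin
    X_new = np.concatenate([X_lab, X_unlab[keep]])
    y_new = np.concatenate([y_lab, pred[keep]])
    return X_new, y_new

rng = np.random.default_rng(0)
X_lab = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [3.9, 4.2]])
y_lab = np.array([0, 0, 1, 1])
X_unlab = np.concatenate([
    rng.normal(0, 0.3, (20, 2)),   # unlabeled points near class 0
    rng.normal(4, 0.3, (20, 2)),   # unlabeled points near class 1
    [[2.0, 2.0]],                  # ambiguous point between the clusters
])

X_new, y_new = pseudo_label(X_lab, y_lab, X_unlab)
print(len(y_lab), "->", len(y_new))
```

The ambiguous point is left out because its margin is below the threshold. Wrong pseudo-labels reinforce themselves, so the threshold matters.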

desert oar
# final kiln It was, but I didn't predict it at the time, we had over 1k images and I wanted ...

U-Net is a convolutional neural network that was developed for biomedical image segmentation at the Computer Science Department of the University of Freiburg. The network is based on a fully convolutional neural network whose architecture was modified and extended to work with fewer training images and to yield more precise segmentation. Segment...

final kiln
final kiln
desert oar
final kiln
#

Still haven't checked if the positional encoding is shared by the two branches

#

But I reckon it is

desert oar
final kiln
final kiln
desert oar
#

i'll take a look

pale sandal
#

I'm working with Keras and I have issues with logits and label shapes during model.fit(). My images are (N, 240, 320, 3) and my labels are a (N, 1) array of 0 or 1 (for OK and NOK). Do I just say my labels are (N, 240, 320, 1) and fill them with either 0 or 1 or do I have to segment my images to make masks of where the desired wires and connectors are?

desert oar
#

if you segment the images i'm not sure what the "correct" approach is, but you can maybe include each mask as another "channel" along with r g and b

#

so instead of (N, 240, 320, 3) you have (N, 240, 320, 3 + M) where M is the number of different mask layers
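
A quick sketch of that stacking in NumPy, assuming the masks have already been rasterized from the polygons into binary arrays. The sizes and the fake connector region are illustrative only.

```python
import numpy as np

# M = number of mask layers (e.g. connector, wires); values are illustrative.
N, H, W, M = 4, 240, 320, 2

images = np.random.rand(N, H, W, 3).astype(np.float32)  # RGB scaled to [0, 1]
masks = np.zeros((N, H, W, M), dtype=np.float32)        # one binary mask per layer
masks[:, 60:150, 100:220, 0] = 1.0                      # pretend connector region

# Append the masks after the colour channels along the last axis.
stacked = np.concatenate([images, masks], axis=-1)
print(stacked.shape)  # (4, 240, 320, 5)
```

The model's input shape then has to be (240, 320, 3 + M) instead of (240, 320, 3).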

#

there might be a better way to do it, that's just what i came up with off the top of my head

#

these are 240x320 images right? if your labels were N,240,320,1 then you'd be labeling each pixel, rather than each image

pale sandal
desert oar
#

show your code. based on your messages before you might be missing a softmax layer

#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

pale sandal
#

For lack of a solution I'm just throwing stuff at the wall and seeing what sticks. I put a flatten at the end of the decoder and it now goes through the epochs but sadly with 0.0 accuracy

desert oar
#

softmax is how you get from raw "scores" to a probability distribution over classes

#

in the case of a single output you actually just need a sigmoid activation

#

but you need to get from your inner layers to that one output

pale sandal
desert oar
final kiln
#

With two classes, isn't it the same as with N classes ?

desert oar
#

it simplifies further, but yes it's the same

#

sorry it is the logistic function

#

the inverse logistic function is the other way, going from 0,1 to the real line
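
To make the relationship concrete, here is a small NumPy check with made-up scores. For two classes with scores a and b, softmax gives e^b / (e^a + e^b) = 1 / (1 + e^(a - b)), which is the logistic (sigmoid) function applied to b - a. The inverse, usually called the logit, maps a probability back to that score difference.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into a probability distribution over classes."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(x):
    """Logistic function: real line -> (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    """Inverse logistic function: (0, 1) -> real line."""
    return np.log(p / (1.0 - p))

scores = np.array([0.3, 1.7])  # made-up raw scores for [class 0, class 1]

p_softmax = softmax(scores)[1]
p_sigmoid = sigmoid(scores[1] - scores[0])

print(np.isclose(p_softmax, p_sigmoid))                      # True
print(np.isclose(logit(p_sigmoid), scores[1] - scores[0]))   # True
```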

pale sandal
#

So if I have sigmoid activation on the last conv2d, do I still need a softmax layer then?

desert oar
#

but you need sigmoid activation on one output, if you have multiple outputs you need to condense that down to a single output somehow

#

i'm not really sure what people do for that typically, I suppose you could do a fully connected layer
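
One common way to condense things, sketched in plain NumPy so the shapes are visible: average each feature map over height and width, then apply one fully connected layer with a sigmoid. The feature sizes and random weights are placeholders. In Keras the rough equivalent would be a `GlobalAveragePooling2D()` layer followed by `Dense(1, activation="sigmoid")`, which is an assumption about the model here since the actual code wasn't posted.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_head(feature_maps, weights, bias):
    """Map (N, H, W, C) feature maps to (N, 1) probabilities."""
    pooled = feature_maps.mean(axis=(1, 2))   # global average pooling -> (N, C)
    logits = pooled @ weights + bias          # fully connected layer  -> (N, 1)
    return 1.0 / (1.0 + np.exp(-logits))      # sigmoid                -> (0, 1)

features = rng.normal(size=(8, 30, 40, 16))   # stand-in for the last conv output
weights = rng.normal(size=(16, 1))            # untrained placeholder weights
bias = np.zeros(1)

probs = binary_head(features, weights, bias)
print(probs.shape)  # (8, 1)
```

The output shape (N, 1) then lines up with (N, 1) labels of 0 or 1.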

final kiln
#

So it's the same interpretation, a probability assigned to each of the two classes

pale sandal
#

No errors but the results are a bit weird. For all 10 epochs I get

loss: 9.2921 - accuracy: 0.3976 - val_loss: 8.0797 - val_accuracy: 0.4762

#

Could this be because the dataset is too small?

desert oar
pale sandal
#

60-40

#

I'll label more images tomorrow. Try to get at least 500. Sadly I don't have that many NOK pics so I need to get more when I go back to work.

desert oar
#

yeah, if you have 60% OK then you should be beating 60% accuracy as minimum baseline
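
The baseline in question is just "always predict the most common class". A tiny sketch with a made-up 60/40 label list:

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of a model that always predicts the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

labels = [1] * 60 + [0] * 40  # 1 = OK, 0 = NOK (illustrative split)
print(majority_baseline_accuracy(labels))  # 0.6
```

An accuracy of 0.40 to 0.48 is below that, so the model is doing worse than a constant guess.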

#

more labeled data is good

pale sandal
#

Thanks for all the help. I'm off to bed now. Will probably be back with more questions in the coming days.

desert oar
#

you probably should also do a train/test split. more data will help for that
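
A sketch of a stratified split in NumPy, so train and test keep roughly the same OK/NOK ratio. The function name and the 20% test fraction are arbitrary choices here. scikit-learn's `train_test_split` with the `stratify` argument does the same job.

```python
import numpy as np

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = max(1, int(round(test_fraction * len(idx))))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

labels = np.array([1] * 60 + [0] * 40)  # illustrative 60/40 labels
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))  # 80 20
print(labels[test_idx].mean())        # 0.6
```

The indices can then be used to slice both the image array and the label array.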

undone dust
#

Hey does anyone have a good video or place I can start learning pytorch and Machine Learning?

final kiln
# undone dust Hey does anyone have a good video or place I can start learning pytorch and Machi...

This is the best intro I've seen: https://youtu.be/0QczhVg5HaI?si=RgPV8UhO64tJXYu4

A video about neural networks, how they work, and why they're useful.

My twitter: https://twitter.com/max_romana

SOURCES
Neural network playground: https://playground.tensorflow.org/

Universal Function Approximation:
Proof: https://cognitivemedium.com/magic_paper/assets/Hornik.pdf
Covering ReLUs: https://proceedings.neurips.cc/paper/2017/hash...

#

And then the resources on the pinned messages + just getting your hands dirty with it

final kiln
jade sinew
#

anyone want to practice data analysis in python (pandas, numpy, matplotlib, seaborn)?

undone dust
undone dust
serene scaffold
agile owl
#

awesome let me check it out