#data-science-and-ml | Python | Page 119

craggy agate May 7, 2024, 5:47 PM

#

I see, but still it depends on your use case.

#

Like if you want to predict financial data and I compare a CNN-LSTM to a regular vanilla LSTM, of course I would say vanilla is better.

#

But for processing videos or photos, the CNN LSTM would be better and I would say it's better than the regular or vanilla LSTM.

sturdy kiln May 7, 2024, 5:50 PM

#

well yeah pretty obvious any convolutional model will always be the best bet for any 2d image data

craggy agate May 7, 2024, 5:51 PM

#

sturdy kiln well yeah pretty obvious any convolutional model will always be the best bet for...

I know but my point is that each model is great at something and not so good at others.

sturdy kiln May 7, 2024, 5:54 PM

#

22.64 RMSE on a single LSTM layer, not great not terrible

craggy agate May 7, 2024, 5:55 PM

#

sturdy kiln 22.64 RMSE on a single LSTM layer, not great not terrible

That's a pretty good prediction.

sturdy kiln May 7, 2024, 5:55 PM

#

CNN and MLP scored better tho lol

#

but its still pretty great

craggy agate May 7, 2024, 5:56 PM

#

Yeah

sturdy kiln May 7, 2024, 5:56 PM

#

im assuming since you can do stacked LSTM, and bidirectional lstm, you can do stacked bidirectional?

craggy agate May 7, 2024, 5:56 PM

#

sturdy kiln CNN and MLP scored better tho lol

CNN is good at finding patterns 🤷

sturdy kiln May 7, 2024, 5:57 PM

#

true but a basic ml perceptron is beating all of it lol

craggy agate May 7, 2024, 5:57 PM

#

sturdy kiln im assuming since you can do stacked LSTM, and bidirectional lstm, you can do st...

Not something I have specifically tried but in theory I don't see a problem with it.

craggy agate May 7, 2024, 5:57 PM

#

sturdy kiln true but a basic ml perceptron is beating all of it lol

💀

sturdy kiln May 7, 2024, 6:06 PM

#

those are some very erratic metrics if i see one

sturdy kiln May 7, 2024, 6:21 PM

#

wow bidirectional sucked ass

clever karma May 7, 2024, 6:22 PM

#

yo i wanna check how many players a minecraft server has, and then if it reaches a certain amount it pings anyone with a certain role. how can i go about checking the players?

craggy agate May 7, 2024, 6:24 PM

#

sturdy kiln wow bidirectional sucked ass

Bidirectional LSTMs are typically used on text, where the whole sequence is already available when you predict.

sturdy kiln May 7, 2024, 6:24 PM

#

hmm TIL

#

so the only thing i can take out of here, a simple MLP is the best performing model for any financial time series data lol

craggy agate May 7, 2024, 6:25 PM

#

sturdy kiln those are some very erratic metrics if i see one

If it's going down, try to see how far you can get the loss to drop before it plateaus.

craggy agate May 7, 2024, 6:25 PM

#

sturdy kiln so the only thing i can take out of here, a simple MLP is the best performing mo...

Yeah, what's weird is that it's better than an LSTM and CNN-LSTM at stock prediction.
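
One way to put numbers like the 22.64 RMSE above in context is to compare every model against a naive "persistence" forecast (tomorrow equals today). The sketch below is only an illustration under stated assumptions: the prices and the model predictions are made up, and `rmse` / `persistence_forecast` are hypothetical helper names, not anything from the homework code.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def persistence_forecast(series):
    """Naive one-step forecast: the next value equals the current one."""
    return series[:-1]

# Hypothetical closing prices, made up for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
targets = prices[1:]

baseline_rmse = rmse(targets, persistence_forecast(prices))

# Stand-in for a model's one-step predictions on the same targets (made up).
model_predictions = [101.0, 101.5, 103.0, 106.0, 107.5]
model_rmse = rmse(targets, model_predictions)

print(f"baseline RMSE: {baseline_rmse:.3f}")
print(f"model RMSE:    {model_rmse:.3f}")
print("model beats baseline" if model_rmse < baseline_rmse else "model does not beat baseline")
```

If an MLP, CNN and LSTM all land close to the persistence baseline, the ranking between them probably says more about noise than about the architectures.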

past meteor May 7, 2024, 6:26 PM

#

sturdy kiln wow bidirectional sucked ass

I don't even know if bidirectional LSTMs make sense for time related modelling

craggy agate May 7, 2024, 6:26 PM

#

past meteor I don't even know if bidirectional LSTMs make sense for time related modelling

Exactly

sturdy kiln May 7, 2024, 6:26 PM

#

it doesnt, and it isnt supposed to be lol

#

its just multi model analysis

craggy agate May 7, 2024, 6:26 PM

#

You just experimenting?

sturdy kiln May 7, 2024, 6:27 PM

#

homework actually lol

past meteor May 7, 2024, 6:27 PM

#

I mean you can build the model but conceptually you're conditioning on the past and the future

craggy agate May 7, 2024, 6:27 PM

#

sturdy kiln homework actually lol

Ohk lol

past meteor May 7, 2024, 6:27 PM

#

That makes sense for text but not for time series

craggy agate May 7, 2024, 6:27 PM

#

Yep and stock prices cannot be predicted using the past data unless you are predicting a short period.

sturdy kiln May 7, 2024, 6:28 PM

#

i am actually doing that

#

im using samples instead of the whole thing

#

as a training set

past meteor May 7, 2024, 6:28 PM

#

I don't know financial data well enough. Is there no seasonality whatsoever?

craggy agate May 7, 2024, 6:29 PM

#

past meteor I don't know financial data well enough. Is there no seasonality whatsoever?

There could be but it's not enough to accurately predict share prices.

#

There are hundreds of factors influencing stock prices.

past meteor May 7, 2024, 6:29 PM

#

Agreed

#

But the point isn't necessarily predicting exactly what the stock will be

#

You "only" need to do better than making safe bets

craggy agate May 7, 2024, 6:30 PM

#

past meteor But the point isn't necessarily predicting exactly what the stock will be

Yes but even a rough idea can be wrong a lot of the time.

sturdy kiln May 7, 2024, 6:31 PM

#

craggy agate There are hundreds of factors influencing stock prices.

ive read a paper that made use of LSTM and paragraph vectors to forecast stock markets based on current news articles, tweets, posts about that specific company

#

so thats one way to get that lol

craggy agate May 7, 2024, 6:31 PM

#

Like for example TSLA shot up 30% in like 1 day, if we used an LSTM to predict its price for the day using past data, it would have probably predicted a negative 1-2% change.

craggy agate May 7, 2024, 6:32 PM

#

sturdy kiln ive read a paper that made use of LSTM and paragraph vectors to forecast stock m...

News articles and financial data for the company might be a better approach than solely relying on past prices.

sturdy kiln May 7, 2024, 6:32 PM

#

yeah i didnt say my approach is very practical, its quite the opposite lol

#

the market is extremely volatile, and forecast based on past data is barely enough

#

im just doing model analysis just on this data

craggy agate May 7, 2024, 6:33 PM

#

sturdy kiln the market is extremely volatile, and forecast based on past data is barely enou...

Exactly.

craggy agate May 7, 2024, 6:33 PM

#

sturdy kiln im just doing model analysis just on this data

Alrighty

#

If you were analyzing the news articles, the bidirectional LSTM and transformer models would be great.

toxic mortar May 7, 2024, 6:50 PM

#

@sturdy kiln I am currently doing something similar for PIPE deals

#

Just with sequential neural nets

#

In neural networks, how do you approach feature engineering? For example, should I include:
- ParamA
- ParamB
- Ratio ParamA/ParamB

as the input parameters of my neural network?

Are ParamA and ParamB redundant if I have some function that is a composition of ParamA and ParamB?

#

ParamA and ParamB arent highly correlated, in fact not correlated at all

bleak pier May 7, 2024, 7:02 PM

#

hi people!

I have a question. I'm using a Jupyter Notebook and want to execute a command directly on the terminal with ! ... . This specific command sources a script: source ./script.sh. After running it, I get new command prefixes to use, and they are only visible after this source, right?

The problem is that I have to do it in many parts of my notebook, but I want a way to set it up once for all the cells in which I run the commands from script.sh. Is there a way to do it?

teal lance May 7, 2024, 10:25 PM

#

What’s the fastest way to connect to MT4 using Oanda with Python? I can’t seem to find much info anymore

desert oar May 8, 2024, 12:11 AM

#

bleak pier hi people! I have a question. What should I do if I use a Jupyter Notebook that...

Are you trying to activate a venv from inside a notebook?

unkempt apex May 8, 2024, 6:39 AM

#

I'm confused about polynomial regression

#

suppose I have 2 columns
salary and years

I considered salary as y and years as x
so now we have only one feature which is years

#

so after training the model, when I pass the value "5" it says

ValueError: X has 1 features, but LinearRegression is expecting 2 features as input.

#

https://paste.pythondiscord.com/VZAA

this is full code
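
The pasted code is not visible here, so the following is only a guess at the cause: if the model was fitted on polynomial-expanded features, the value passed at prediction time has to go through the same expansion first. The toy sketch below reproduces the mismatch without scikit-learn; `poly_features`, `TinyLinearModel` and the weights are all hypothetical, and the two-column expansion `[x, x**2]` is an assumption chosen to match the "expecting 2 features" message.

```python
def poly_features(x, degree=2):
    """Expand a scalar into [x, x**2, ..., x**degree] (no bias column)."""
    return [x ** d for d in range(1, degree + 1)]

class TinyLinearModel:
    """Toy stand-in for a fitted linear model; the weights are made up."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, features):
        # Same kind of check that produces the error in the message above.
        if len(features) != len(self.weights):
            raise ValueError(
                f"X has {len(features)} features, but model is expecting "
                f"{len(self.weights)} features as input."
            )
        return self.bias + sum(w * f for w, f in zip(self.weights, features))

# Pretend this model was trained on [years, years**2].
model = TinyLinearModel(weights=[2000.0, 150.0], bias=30000.0)

years = 5
try:
    model.predict([years])  # raw input: one feature, model expects two
except ValueError as exc:
    print("raw input fails:", exc)

salary = model.predict(poly_features(years))  # expanded input: two features
print("expanded input works:", salary)
```

In scikit-learn terms this would mean calling `transform` on the already fitted `PolynomialFeatures` object for the new value (as a 2D array such as `[[5]]`) and passing the result to `predict`; the actual variable names in the paste are unknown.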

jaunty helm May 8, 2024, 8:13 AM

#

how do you guys deal with ordinal features?
e.g. for a feature with 3 possible values, Very high, High, Medium, up until now I'd just encode them to 1, 2, 3 respectively (or 3, 2, 1 but that doesn't really matter)
but I just had a thought, what if say Very high actually meant the value of 100, while high and medium mean 5 and 0; that'd be pretty bad for linear regression techniques right?
obviously one hot encoding is always an option, but then I waste the ordering info
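
A possible middle ground between integer codes and one-hot is a cumulative ("thermometer") encoding, which keeps the ordering but lets a linear model learn a separate step size between neighbouring levels. This is a minimal sketch, assuming the three levels from the message above; the function names are made up for illustration.

```python
LEVELS = ["Medium", "High", "Very high"]  # ordered from lowest to highest

def integer_encode(value):
    """1, 2, 3: keeps the order but assumes equal spacing between levels."""
    return LEVELS.index(value) + 1

def one_hot_encode(value):
    """One flag per level: no spacing assumption, but the order is lost."""
    return [1 if level == value else 0 for level in LEVELS]

def thermometer_encode(value):
    """One flag per threshold reached: order kept, spacing left to the model."""
    rank = LEVELS.index(value)
    return [1 if rank >= threshold else 0 for threshold in range(1, len(LEVELS))]

for level in LEVELS:
    print(level, integer_encode(level), one_hot_encode(level), thermometer_encode(level))
```

With the thermometer columns, a linear model's first coefficient is the jump from Medium to High and the second is the jump from High to Very high, so a 0 / 5 / 100 situation can be fitted without hand-picking the numbers.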

#

uhh, I'm still in traditional ML land, haven't looked into neural nets yet 😅 but ty for your info

spring field May 8, 2024, 8:51 AM

#

what's traditional ML land?
anyway, I find using increasing scalar values for encoding sth like that a bit weird, although I can't shake the feeling that it just might work... usually when you have labels and these pretty much seem just like ordinary labels, you'd use an embedding as Kwisatz mentioned
you can take a bit of a step back and use only one-hot encoded vectors as well

wooden sail May 8, 2024, 8:52 AM

#

spring field what's traditional ML land? anyway, I find using increasing scalar values for en...

they probably mean classical optimization approaches with no networks

#

(the line is thin, you can unfold many iterative algorithms into something identical to a network)

spring field May 8, 2024, 8:53 AM

#

alright, does classical optimization at any point use one-hot encoding for labels?

wooden sail May 8, 2024, 8:53 AM

#

sure

#

before you can do anything, you need to choose how to represent your data reasonably

spring field May 8, 2024, 8:54 AM

#

oh, ok, ahh, actually, ig one can classify simple stuff as well...

#

bit of a silly question and currently can't provide even the graphs, but suppose I decided to use an RNN for some sequential data and it happens to fit test data exceptionally well, despite how little of it is available (I got 6 batches for training and 1 batch for testing, each batch has 8 sequences, each sequence has 48 data points, and each data point has 3 values as input and 2 values as output). now, for the silly question: do the fantastic results mean I chose the right approach? 😁

wooden sail May 8, 2024, 9:02 AM

#

that, massive overfitting, or data leakage

#

you'd wanna run several sanity checks to be sure
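
One such sanity check for windowed time series is to make sure no test window shares raw data points with a training window. The sketch below is an illustration only, with hypothetical helper names and a toy series: it holds out the latest windows and drops any training window that overlaps them.

```python
def make_windows(series, seq_length, stride=1):
    """Return (start_index, window) pairs for every full-length window."""
    return [
        (start, series[start:start + seq_length])
        for start in range(0, len(series) - seq_length + 1, stride)
    ]

def chronological_split(windows, seq_length, test_fraction=0.2):
    """Hold out the latest windows; drop training windows that overlap them."""
    n_test = max(1, int(len(windows) * test_fraction))
    test = windows[-n_test:]
    first_test_start = test[0][0]
    train = [
        (start, window)
        for start, window in windows[:-n_test]
        if start + seq_length <= first_test_start
    ]
    return train, test

series = list(range(20))  # toy stand-in for the recorded measurements
seq_length = 4
windows = make_windows(series, seq_length)
train, test = chronological_split(windows, seq_length)

# Indices of raw data points touched by each split; they should not intersect.
train_points = {i for start, _ in train for i in range(start, start + seq_length)}
test_points = {i for start, _ in test for i in range(start, start + seq_length)}
print(len(train), len(test), train_points & test_points)
```

If windows are shuffled first and split afterwards, overlapping windows can end up on both sides, which tends to make test results look better than they are.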

spring field May 8, 2024, 9:04 AM

#

I mean, they're not super duper perfect results, but I was surprised nonetheless
I'm more happy about finally understanding RNNs a bit deeper

#

alright, to elaborate a bit more, it's data spanning over a couple weeks and it's been recorded in rather short intervals, a couple minutes between each measurement, so what I did was split it up every couple hours to get those sequences and then just trained on those, I was wondering what could be improved here
for one, it seems using LSTM might be beneficial as it could train on longer sequences while retaining context
another idea that came to me was implementing something similar to a denseblock or a resnet type thing where I could supply different lengths of sequences and the shorter ones could retain some features from the longer ones

another approach would have been using simple linear regression, I just had my doubts about LR being able to predict future outcomes, as it would likely need to regress against time and while it may have a certain pattern, it's not exactly a simple model probably, and then after regressing against time and getting future outcomes from that, use those predictions against some other value and then regress that

#

speaking of overfitting, I had no variation in sequence lengths at any point

#

does it matter much for RNNs? the sequences themselves seemed quite diverse tbf

#

there might be some continuity issues at some point, but those are unlikely to affect many sequences...

warm trellis May 8, 2024, 9:40 AM

#

Hey folks,
channels in Conv1d would be the columns in tabular dataset?

severe inlet May 8, 2024, 10:17 AM

#

im studying for a test tmr, how would i go about trying to find the weights vector by hand? especially when the weights and inputs are of different lengths

wooden sail May 8, 2024, 10:38 AM

#

warm trellis Hey folks, channels in Conv1d would be the columns in tabular dataset?

sure thing

lapis sequoia May 8, 2024, 10:41 AM

#

hi

#

can i leave a file here for a gpt bot im tryna code but it hangs up and doesnt record and paste my audio. Its getting stuck at line 52.

audio = recognizer.listen(source, timeout=None)

#

import openai
import pyttsx3
import speech_recognition as sr
from gtts import gTTS


def transcribe_audio_to_text(filename):
    recognizer = sr.Recognizer()
    with sr.AudioFile(filename) as source:
        audio = recognizer.record(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
    except Exception as e:
        print(f"An error occurred in transcribe_audio_to_text: {e}")


def generate_response(prompt, client):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )

    return response.choices[0].message.content


def speak_text(text, engine):
    engine.say(text)
    engine.runAndWait()


def main():
    client = openai.OpenAI(
        api_key=""
    )

    # Initialize the text-to-speech engine
    engine = pyttsx3.init()

    while True:
        # Wait for the user to say "discord"
        with sr.Microphone() as source:
            recognizer = sr.Recognizer()
            try:
                print("Say 'discord' to start recording your question...")
                audio = recognizer.listen(source, timeout=None)
                print("finished listening")
                transcription = recognizer.recognize_google(audio)
                print(f"Transcription: {transcription}")
                if "discord" in transcription.lower():
                    # Record audio
                    filename = "input.wav"
                    print("Say your question...")
                    with sr.Microphone() as source:
                        recognizer = sr.Recognizer()
                        # pause_threshold is an attribute of Recognizer, not of Microphone
                        recognizer.pause_threshold = 1
                        audio = recognizer.listen(
                            source, phrase_time_limit=None, timeout=None
                            )
                        with open(filename, "wb") as f:
                            f.write(audio.get_wav_data())

                    # Transcribe audio to text
                    text = transcribe_audio_to_text(filename)
                    if text:
                        print(f"You said: {text}")

                        # Generate Response using GPT-3
                        response = generate_response(text, client)
                        print(f"GPT-3 says: {response}")

                        # Record audio with gTTS for video
                        tts = gTTS(text=response, lang="en")
                        tts.save("sample.mp3")

                        # Read response using text-to-speech
                        speak_text(response, engine)
            except sr.UnknownValueError:
                print("Google Speech Recognition could not understand audio")
            except sr.RequestError as e:
                print(
                    f"Could not request results from Google Speech Recognition service; {e}"
                )
            except Exception as e:
                print(f"An error has occurred: {e}")


if __name__ == "__main__":
    main()

#

import openai
from speechtotextbot import generate_response

client = openai.OpenAI(
    api_key="<REDACTED>"  # real key removed; never paste API keys in public chat
)
response = generate_response("what does a yoyo do", client)
print(response)

orchid forge May 8, 2024, 10:58 AM

#

how to do web scraping with unstructured data using python?

hollow escarp May 8, 2024, 12:19 PM

#

@past meteor hi i read everything which you sent me about onnx and im having trouble finding the detection boxes in the output data from onnxruntime. i converted my pt model to onnx with the following command: yolo export model=<my_model> format=onnx imgsz=640,640 and i cant find the boxes and confidences of the predictions. Im getting predictions with the following code:

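  import cv2
  import numpy as np
  import onnxruntime

  model_url = "./models/license_plate_detector.onnx"
  session = onnxruntime.InferenceSession(model_url)

  image = cv2.imread('./test_photos/test1.jpg')

  input_size = (640, 640)
  resized_image = cv2.resize(image, input_size)
  # YOLOv8 exports expect RGB input scaled to [0, 1]; OpenCV loads BGR in [0, 255]
  rgb_image = cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB)
  image_bchw = np.transpose(np.expand_dims(rgb_image, 0), (0, 3, 1, 2)).astype(np.float32) / 255.0
  pred = session.run(None, {"images": image_bchw })
  .....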

craggy agate May 8, 2024, 12:27 PM

#

Hey, I am working on a project to give my Ryze tello drone a track me feature, I want to use object detection or object tracking for this but there are just soo many methods, I have tried a full body haarcascade but it failed to accurately track me, I am trying to use a SIFT feature Detector but I doubt it will work just cause of all the distractions in the background and a possibility of more than one person being in the original capture frame. What type of object tracking would you guys recommend?

orchid forge May 8, 2024, 12:34 PM

#

orchid forge how to do web scraping with unstructured data using python?

Anyone?

#

The thing is that it can be a shopping website, so how can I scrape the data I want from it?

deep veldt May 8, 2024, 12:38 PM

#

should i use pytorch or tensorflow for a convolutional neural network?

buoyant vine May 8, 2024, 12:40 PM

#

Personally I think PyTorch is just the standard nowadays

#

unless you're following some Keras tutorial to learn some basics

deep veldt May 8, 2024, 12:53 PM

#

should i use a convolutional neural network or a siamese neural network for images?

charred compass May 8, 2024, 1:21 PM

#

orchid forge The thing is that it can be a shopping website, so how can I scrape the data...

I didn't understand the question but i guess you can manually look at the html structure and then scrape. If you are planning to scrape the same page then u can just run that script

orchid forge May 8, 2024, 1:22 PM

#

charred compass I didn't understand the question but i guess you can manually look at the html s...

If it's just a normal website, how am I supposed to gather the data if it's not in table form?

hollow escarp May 8, 2024, 1:36 PM

#

hollow escarp @past meteor hi i read everything which you send me about onnx and im h...

Okay i found it, https://dev.to/andreygermanov/how-to-create-yolov8-based-object-detection-web-service-using-python-julia-nodejs-javascript-go-and-rust-4o8e this article was really helpful
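
For readers without the article at hand, here is a rough sketch of the decoding step. It assumes the export follows the usual Ultralytics YOLOv8 layout, where the first output has shape (1, 4 + num_classes, N) and the rows are cx, cy, w, h followed by per-class scores in input-pixel units; that layout is an assumption and is worth confirming by printing the shape of `pred[0]`. `decode_detections` and the fake output are made up for illustration, and non-maximum suppression would still be needed afterwards.

```python
def decode_detections(output, conf_threshold=0.5, input_size=640, image_size=(640, 640)):
    """Turn a (4 + num_classes, N) output into [x1, y1, x2, y2, score, class_id] rows."""
    img_w, img_h = image_size
    num_candidates = len(output[0])
    detections = []
    for i in range(num_candidates):
        cx, cy, w, h = (output[row][i] for row in range(4))
        class_scores = [output[row][i] for row in range(4, len(output))]
        score = max(class_scores)
        if score < conf_threshold:
            continue
        class_id = class_scores.index(score)
        # Centre/size in input pixels -> corner coordinates in original image pixels
        # (valid for a plain resize as in the question, not for letterboxing).
        x1 = (cx - w / 2) / input_size * img_w
        y1 = (cy - h / 2) / input_size * img_h
        x2 = (cx + w / 2) / input_size * img_w
        y2 = (cy + h / 2) / input_size * img_h
        detections.append([x1, y1, x2, y2, score, class_id])
    return detections

# Made-up output with 3 candidates and one class, standing in for pred[0][0].
fake_output = [
    [320.0, 100.0, 500.0],  # cx
    [320.0, 100.0, 400.0],  # cy
    [200.0, 50.0, 80.0],    # w
    [100.0, 50.0, 40.0],    # h
    [0.90, 0.10, 0.65],     # class-0 score
]
boxes = decode_detections(fake_output, image_size=(1280, 1280))
for box in boxes:
    print(box)
```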

warm trellis May 8, 2024, 1:53 PM

#

Let's say I have data of shape (32, 28, 8), where 8 is the number of columns and 28 is the length of each time window (28 values per window). What should my in_channels be in a Conv1d network: 8 or 28?

tidal bough May 8, 2024, 1:57 PM

#

typically when doing timeseries analysis one convolves along time, I believe. so 28 is your "input length", and 8 is the number of input channels. Which means you'll probably need to transpose your data to (32,8,28) first, because Conv1d expects the axis to be convolved over to be the last one.
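
The axis swap described above can be sketched with plain nested lists, so that it runs without PyTorch installed. The batch is shrunk to 4 for brevity and the `shape` helper is hypothetical; with a real tensor the same move would be `x.permute(0, 2, 1)`.

```python
batch, length, channels = 4, 28, 8  # batch of 4 instead of 32, for brevity

# x[b][t][c]: (batch, time, columns), the layout from the question.
x = [
    [[float(t * channels + c) for c in range(channels)] for t in range(length)]
    for _ in range(batch)
]

def shape(nested):
    """Shape of a regular nested list, e.g. (4, 28, 8)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

# Swap the last two axes: (batch, time, columns) -> (batch, columns, time).
x_ncl = [[list(channel) for channel in zip(*sample)] for sample in x]

print(shape(x))      # (4, 28, 8)
print(shape(x_ncl))  # (4, 8, 28)
```

With that layout, `in_channels` would be 8 and the kernel slides along the 28 time steps.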

clear oriole May 8, 2024, 2:02 PM

#

What could I be doing wrong that my neural network doesn't get trained at all? Just outputs straight values.

#

https://paste.pythondiscord.com/CO2Q

#

this is the model code that I have, I am doing something wrong but I can't really pin point that bug

placid sentinel May 8, 2024, 3:03 PM

#

Good day, I am new to this channel!
I have a "Collaborative-filtering proof of concept" task that I coded using Python Flask and need help with some additional requirements in the task. Can someone help me with that? I will put the task description below to view it easily.

#

I need web-based software written in Python Flask to visualise how user-based collaborative filtering works. It should be a table type where there are other users ratings and then I can interact with items to get recommendations.

I have to make a user-item table with 5 users ( u1, u2, u3, u4, u5) vertically and 5 items (i1, i2, i3, i4, i5) horizontally. Here, as a user, I can give points (ratings) to every item for all users between 1-5 or give them a "?" value (using dropdown options). After completing the user-item ratings and clicking the submit button, you will display another table below filling the empty cells (the cells with the value "?"). This time, you will predict the rating for the user-item. For instance, we have been given a table below:

User1: 5, 3, 4, 4, "?"
User2: 3, 1, 2, 3, 3
User3: 4, 3, 4, 3, 5
User4: 3, 3, 1, 5, 4
User5: 1, 5, 5, 2, 1

Considering these given ratings, in the next table, you will fill in that cell which was previously indicated as "?" with a predicted rating value that you will calculate using Pearson correlation. You can use any Python libraries, such as Spark or any relevant ones to solve this collaborative filtering task.
The user can change the rating values (between 1-5) or leave the table cell empty (“?”) as they wish. For example:

User1: 5, 3, 4, 4, "?"
User2: 3, 1, "?", 3, 3
User3: 4, "?", 4, 3, 5
User4: 3, 3, "?", 5, 4
User5: "?", 5, 5, 2, 1

Based on the edited table above, the program should list the predicted values indicated as “?”.
Generate app.py and index.html codes.
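
The prediction step of the task can be sketched without Flask. This is one common formulation of user-based collaborative filtering, not necessarily the one the task setter has in mind: Pearson similarity over co-rated items, then a mean-centred weighted average over the k most similar users. Using only positively correlated neighbours and k=2 are assumptions, and the function names are made up.

```python
import math

ratings = {
    "User1": [5, 3, 4, 4, None],  # None stands for "?"
    "User2": [3, 1, 2, 3, 3],
    "User3": [4, 3, 4, 3, 5],
    "User4": [3, 3, 1, 5, 4],
    "User5": [1, 5, 5, 2, 1],
}

def mean(values):
    return sum(values) / len(values)

def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(common) < 2:
        return 0.0
    mean_a = mean([x for x, _ in common])
    mean_b = mean([y for _, y in common])
    num = sum((x - mean_a) * (y - mean_b) for x, y in common)
    den = math.sqrt(sum((x - mean_a) ** 2 for x, _ in common)) * math.sqrt(
        sum((y - mean_b) ** 2 for _, y in common)
    )
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    """Mean-centred weighted average over the k most similar users who rated the item."""
    target = ratings[user]
    user_mean = mean([r for r in target if r is not None])
    neighbours = []
    for other, row in ratings.items():
        if other == user or row[item] is None:
            continue
        sim = pearson(target, row)
        if sim > 0:  # assumption: ignore uncorrelated and negatively correlated users
            other_mean = mean([r for r in row if r is not None])
            neighbours.append((sim, row[item] - other_mean))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return user_mean  # fall back to the user's own average
    return user_mean + sum(s * d for s, d in top) / sum(s for s, _ in top)

prediction = predict(ratings, "User1", item=4)
print(round(prediction, 2))
```

For the first table this gives roughly 4.87 for User1's missing rating; predictions outside the 1 to 5 range would need clipping before being shown in the table.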

past meteor May 8, 2024, 3:23 PM

#

hollow escarp Okay i found it, https://dev.to/andreygermanov/how-to-create-yolov8-based-object...

Glad you ended up finding it

thorn cairn May 8, 2024, 4:57 PM

#

hey, my word2vec model is overfitting and i dont have a clue about NLP cause my prof just told us to use an algorithm that hasnt been taught yet, so is there anything i can do to stop the overfitting?

buoyant vine May 8, 2024, 4:59 PM

#

Dropout, doing less epochs, not such an aggressive LR, etc... More data/better data...

#

hard to tell you if anything is the cause without the code
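
"Fewer epochs" is often automated as early stopping on the validation loss. The sketch below shows only the stopping rule, on made-up loss values; `early_stopping_epoch` is a hypothetical helper, not a Keras function. Keras ships its own version as the `keras.callbacks.EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the index of the best epoch, stopping once the validation loss
    has failed to improve for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # training would stop here
    return best_epoch

# Made-up validation losses: improving at first, then overfitting sets in.
val_losses = [0.69, 0.55, 0.48, 0.47, 0.50, 0.56, 0.63]
best = early_stopping_epoch(val_losses)
print(f"keep the weights from epoch {best}")
```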

thorn cairn May 8, 2024, 5:00 PM

#

can i post my ipynb on the forum?

thorn cairn May 8, 2024, 5:01 PM

#

buoyant vine Dropout, doing less epochs, not such an aggressive LR, etc... More data/better d...

let me try decreasing the LR

#

#Defining Neural Network
import keras
from keras.models import Sequential
from keras.layers import Dense,Embedding,LSTM,Dropout,Bidirectional,GRU
import tensorflow as tf

model = Sequential()
#Embedding layer initialised from the pretrained vectors (trainable=True, so it is fine-tuned, not frozen)
model.add(Embedding(vocab_size, output_dim=EMBEDDING_DIM, weights=[embedding_vectors], input_length=20, trainable=True))
#LSTM 
model.add(Bidirectional(LSTM(units=128 , recurrent_dropout = 0.3 , dropout = 0.3,return_sequences = True)))
model.add(Bidirectional(GRU(units=32 , recurrent_dropout = 0.1 , dropout = 0.1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=keras.optimizers.Adam(learning_rate= 0.01), loss='binary_crossentropy', metrics=['acc'])

del embedding_vectors

note: im just copying off people on kaggle, idk what im doing.
So, what's the difference between this and using word2vec from gensim.models?

thorn cairn May 8, 2024, 5:17 PM

#

im trying to build a model that detects sarcasm from news headlines

#

but they're both word2vec? im confused... maybe just the same algorithm, but a different library?

thorn cairn May 8, 2024, 5:58 PM

#

does anyone know why the x output shape is always none when i add dropouts?

tidal bough May 8, 2024, 7:18 PM

#

doesn't that just indicate a variable number of samples?

spring field May 8, 2024, 10:40 PM

#

given a somewhat traditional RNN, approximately how far can it reasonably predict the future? in terms of how much data has been given, suppose the sequence is 12 data points long, can it predict the next 12, 24, 36 data points, what would it depend on?

#

the way I decided to roll out future predictions was for the network to predict the next input values alongside the output values that I want to actually predict and then use these predicted future inputs to do the next prediction on the next future inputs and the next outputs and then I only take the very last output of the predicted sequence

#

visually speaking it's something like this...

#

so say I have the initial sequence of say 12 data points and I get back 12 data points where part of the data is the predicted inputs and the other part is the output that is of interest, so basically it moves by one data point forward, then uses that to predict the next inputs and the outputs that are of interest and then uses those inputs for the next prediction and so forth

#

x = [1, 3, 2, 5, 1]  # if continued, the next value would be 4
y = [[3, ...], [2, ...], [5, ...], [1, ...], [4, ...]]  # [next input, ...]

so, it should learn to predict the next input and also the other values that are of interest
but then it uses those predicted inputs for the next prediction of the next inputs and next values and so on
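
The rollout described above can be sketched as a small loop. `toy_step` is a hypothetical stand-in for the trained RNN (it just averages the window), so the numbers mean nothing; only the feed-the-prediction-back-in structure is the point.

```python
def rollout(step_fn, seed, n_steps):
    """Autoregressive rollout: each predicted input becomes the newest element."""
    window = list(seed)
    predicted_inputs, outputs = [], []
    for _ in range(n_steps):
        next_input, output = step_fn(window)
        predicted_inputs.append(next_input)
        outputs.append(output)
        # Slide the window forward by one step, keeping its length fixed.
        window = window[1:] + [next_input]
    return predicted_inputs, outputs

def toy_step(window):
    """Hypothetical model: next input is the window mean, output is twice that."""
    next_input = sum(window) / len(window)
    output = 2.0 * next_input
    return next_input, output

seed = [1.0, 3.0, 2.0, 5.0, 1.0]
predicted_inputs, outputs = rollout(toy_step, seed, n_steps=12)
print([round(v, 3) for v in predicted_inputs])
```

In this toy the rollout settles towards a constant after a few steps, because every new input is a smoothed version of earlier ones and no fresh information enters the loop. Whether the same mechanism explains a real network's behaviour is a separate question that the sketch cannot answer.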

#

here's a concrete example from training, the graph on the right, shows a randomly selected period from the dataset that is 10 times the length of a sequence used in training, it is given a 10th of this sequence as a base (the yellow line) and then, as described above, it tries to predict the next elements in the sequence one by one, but as you can see, it quite quickly converges to a constant value

spring field May 8, 2024, 10:58 PM

#

spring field given a somewhat traditional RNN, approximately how far can it reasonably predic...

hence this question

#

and then of course I guess there's potential running into the network forgetting most of the stuff if longer sequences are used (in training and as a base) to predict longer time periods

iron basalt May 8, 2024, 11:20 PM

#

spring field given a somewhat traditional RNN, approximately how far can it reasonably predic...

Depends on the problem, potentially forever.

#

If it's something really simple, like a straight line, then easily forever.

spring field May 8, 2024, 11:34 PM

#

spring field here's a concrete example from training, the graph on the right, shows a randoml...

well, it is not a straight line

#

I mean, clearly the network thinks otherwise

iron basalt May 8, 2024, 11:37 PM

#

spring field well, it is not a straight line

Right now we mostly just have empirical evidence. This depends on the type of "traditional RNN" (there are multiple). Some have some more math to explain them. But it's still mostly just trying them out and comparing (for anything sufficiently complex).

spring field May 8, 2024, 11:38 PM

#

basically I have this

#

3 inputs, a hidden layer with a ton of features, 2 outputs

iron basalt May 8, 2024, 11:39 PM

#

spring field basically I have this

Yeah i'm assuming what most mean by RNN, the deep learning kind trained with backpropagation (through time).

#

RNNs have been around prior to backpropagation taking off.

spring field May 8, 2024, 11:39 PM

#

oh

iron basalt May 8, 2024, 11:40 PM

#

It's still being figured out, you can more or less just measure it. See how much it can remember.

#

Improvements come from trying to solve issues with backpropagation, intuitions that make a new design, then backed up by results.

#

Like with LSTM.

#

Meanwhile some people are trying to analyze the math more, but it will take some time.

#

Math kind of happens in that way, where it slowly corners the problem from all sides.

spring field May 8, 2024, 11:43 PM

#

I'm also wondering whether my sort of general approach is in the right direction
basically I just have a bunch of continuous (mostly) sequential data that I split up in smaller sequences for training, I'm about to try to have more splits because currently it just takes out the sequence length, steps by the sequence length to the next chunk and takes that out and so on, I'm gonna try stepping by 1 element and taking out a sequence length that way

spring field May 8, 2024, 11:44 PM

#

iron basalt Improvements come from trying to solve issues with backpropagation, intuitions t...

yeah, I start noticing more and more that there's a lot of intuition involved as well

iron basalt May 8, 2024, 11:58 PM

#

spring field I'm also wondering whether my sort of general approach is in the right direction...

Not sure what is meant by the 1 element part at the end there, but random chunks, yeah.

spring field May 9, 2024, 12:01 AM

#

iron basalt Not sure what is meant by the 1 element part at the end there, but random chunks...

hope this helps :pg_rofl:

iron basalt May 9, 2024, 12:02 AM

#

spring field hope this helps :pg_rofl:

Ah, stride 1. So you want random, not in-order.

#

Shuffled data.

spring field May 9, 2024, 12:03 AM

#

no no, so, I have a bunch of continuous data, that I split up in sequences, so each sequence is like its own thing and the sequences are then ofc shuffled and randomly distributed across batches

iron basalt May 9, 2024, 12:04 AM

#

Ok but in what order do you pull out the sequences?

#

Out of the whole.

spring field May 9, 2024, 12:04 AM

#

sequentially

iron basalt May 9, 2024, 12:04 AM

#

Try random.

#

Or is that what you mean by shuffling the sequences?

spring field May 9, 2024, 12:05 AM

#

yeah

#

but like, gimme a sec

iron basalt May 9, 2024, 12:06 AM

#

In the image given before, are you training in a random order of those blocks?

#

Like 3, 1, 2.

#

They can overlap, but it's just important that the blocks you pick are random all over.

#

And ideally not clustered then if they overlap.

#

So, uniform.

spring field May 9, 2024, 12:09 AM

#

!e

# continuous, sequential data
data = list(range(10))

seq_length = 4
# stride the same as seq length
for idx in range(0, len(data), seq_length):
    print(data[idx:idx + seq_length], end=" ")
print()
# stride = 1
for idx in range(len(data)):
    print(data[idx:idx + seq_length], end=" ")

all of these sequences are then shuffled during training

arctic wedgeBOT May 9, 2024, 12:10 AM

#

@spring field :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | [0, 1, 2, 3] [4, 5, 6, 7] [8, 9] 
002 | [0, 1, 2, 3] [1, 2, 3, 4] [2, 3, 4, 5] [3, 4, 5, 6] [4, 5, 6, 7] [5, 6, 7, 8] [6, 7, 8, 9] [7, 8, 9] [8, 9] [9]

iron basalt May 9, 2024, 12:10 AM

#

spring field !e ```py # continuous, sequential data data = list(range(10)) seq_length = 4 # ...

Ok, that should be fine then. Important thing is that deep learning does not like in-order.

#

This is the same as just random picking a start index and some fixed length and just picking over and over.
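
A small sketch of that equivalence, on the same toy data as the eval above: stride-1 windows restricted to full length, next to windows drawn from uniformly random start indices. Dropping the short tail windows such as [8, 9] is an assumption about what is wanted for fixed-length training, and the seed is there only for reproducibility.

```python
import random

data = list(range(10))
seq_length = 4

# Stride-1 windows, keeping only full-length ones (no short tail like [8, 9]).
n_starts = len(data) - seq_length + 1
all_windows = [data[i:i + seq_length] for i in range(n_starts)]

# Random-start sampling: draw a start index uniformly, slice a fixed length.
rng = random.Random(0)  # seeded only so the sketch is reproducible
sampled = [
    data[start:start + seq_length]
    for start in (rng.randrange(n_starts) for _ in range(5))
]

print(all_windows)
print(all(window in all_windows for window in sampled))  # True
```

Shuffling `all_windows` visits every start exactly once per epoch, while random-start sampling may repeat or skip some; over many epochs both cover the same set of windows.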

spring field May 9, 2024, 12:11 AM

#

arctic wedge @spring field :white_check_mark: Your 3.12 eval job has completed with r...

it's just this way I get to get more sequences per sequence 😁

#

idk if it will actually help

iron basalt May 9, 2024, 12:12 AM

#

It will, especially if you can start at any point in time. It will have seen examples of starting at those points.

spring field May 9, 2024, 12:13 AM

#

spring field here's a concrete example from training, the graph on the right, shows a randoml...

yep, that was what I was thinking, cuz the rollout (on the right) here does start from a random point

iron basalt May 9, 2024, 12:27 AM

#

spring field yeah, I start noticing more and more that there's a lot of intuition involved as...

Btw, in the current state of things, RNNs' abilities are even empirically unclear. There is a bunch of evidence being tossed around that many have not reproduced, and some even claim to have some new RNNs that beat out transformers for things like LLMs. One big issue with RNNs is that even slight tweaks to them have a huge impact, and they don't train as well (in terms of parallelization), so it's a bit unclear if they are actually better or worse, or if it's just because other things work better with existing hardware so we have more evidence for them (this hardware bias issue applies to a lot of ML: we happen to currently have GPUs, which happen to work well at certain things).

spring field May 9, 2024, 12:30 AM

#

I see, yeah, that's kinda crazy tbf, that we don't even know what's happening, it just happens 😄

iron basalt May 9, 2024, 12:31 AM

#

spring field I see, yeah, that's kinda crazy tbf, that we don't even know what's happening, i...

We are relying on some general ideas that work well, but especially at smaller scales, the details matter.

#

(If you make the network big enough and throw enough compute at the problem, it can work, which is the current approach: more cloud hardware in a race. That requires the approach to work well with the hardware. I just wish we flipped this around and made more diverse hardware to try out different things at scale, but that is really expensive.)

craggy agate May 9, 2024, 12:55 AM

#

Hey, I am working on a project to give my Ryze Tello drone a "track me" feature. I want to use object detection or object tracking for this, but there are just so many methods. I have tried a full-body Haar cascade, but it failed to track me accurately. I am now trying a SIFT feature detector, but I doubt it will work because of all the distractions in the background and the possibility of more than one person being in the original capture frame. What type of object tracking would you guys recommend?

#

Can someone please help me? ^

spring field May 9, 2024, 2:30 AM

#

spring field here's a concrete example from training, the graph on the right, shows a randoml...

I mean, I managed to get something more interesting with some configuration adjustments and some other hyperparameter changes and whatnot, still feels like I'm missing something and it becomes "less creative" the lower the loss gets (the top small graphs are test fits from the test dataset, so like, it's what it doesn't train on), but the bottom graphs, the two bigger ones are basically rollouts and they are not as exciting as I would hope, it's basically using one day as a base input to then predict future inputs and outputs and so the next 4 days, but it's just not doing something I guess?

#

idk, maybe I should've taken a different approach

teal lance May 9, 2024, 2:35 AM

#

Best person for tkk bootstrap customization 🔥🔥

teal lance May 9, 2024, 2:35 AM

#

spring field I mean, I managed to get something more interesting with some configuration adju...

This is cool what’s it for🔥

spring field May 9, 2024, 2:40 AM

#

it's mainly for practice

analog bolt May 9, 2024, 2:41 AM

#

MACHINE LEARNING

#

HOW DOES ONE MAKE THE MACHINE LEARN?

serene scaffold May 9, 2024, 2:52 AM

#

analog bolt HOW DOES ONE MAKE THE MACHINE LEARN?

math

analog bolt May 9, 2024, 2:52 AM

#

oh okay

#

do I actually have to remember how to get a standard deviation or can computer do it for me

tacit basin May 9, 2024, 3:16 AM

#

analog bolt do I actually have to remember how to get a standard deviation or can computer d...

It depends

pliant heron May 9, 2024, 6:39 AM

#

hi!
good day to you guys!
I need some advice wrt continuing to learn data science/analytics.
I finished learning Python from a book, and learned some libraries (matplotlib, pygal). Can someone suggest where I can find an easy project related to the libraries pandas, seaborn, and scikit-learn? My friend works in the stock market and she is helping me break into data analytics, so a related project would be a great help.
thank you

#

my goal is to be able to do backtesting related to stock markets/trading

#

please tag me if someone suggests a link or source

feral wind May 9, 2024, 6:44 AM

#

hi guys i wanna ask, so i have 3 labels as my y variable: dropout, graduate, and enrolled. i want to classify the data into these 3 classes and i am using svm, but i dont know what kind of svm i should use because i have 3 classes of labels

jaunty helm May 9, 2024, 7:19 AM

#

feral wind hi guys i wanna ask, so i have 3 labels as my y variable and that is dropout, gr...

can you elaborate? sklearn.svm.SVC works out of the box

> The multiclass support is handled according to a one-vs-one scheme.

#

also this seems like a nice read

scikit-learn

1.12. Multiclass and multioutput algorithms

This section of the user guide covers functionality related to multi-learning problems, including multiclass, multilabel, and multioutput classification and regression. The modules in this section ...

past meteor May 9, 2024, 7:57 AM

#

feral wind hi guys i wanna ask, so i have 3 labels as my y variable and that is dropout, gr...

You mean, what kind of kernel function?

#

In my opinion, never use the Gaussian kernel SVM. It's between the linear kernel and the RBF SVM

#

The biggest consideration for picking an SVM is your dataset size. If your dataset is too large you can't use (RBF) SVMs at all. You can still use linear SVMs that are solved in the primal form

jaunty helm May 9, 2024, 8:22 AM

#

past meteor In my opinion, never use the Gaussian kernel SVM. It's between the linear kernel...

may I ask for your reason on this?

past meteor May 9, 2024, 8:33 AM

#

jaunty helm may I ask for your reason on this?

Because it requires forming the kernel matrix and the size is NxN

#

You can easily figure out how much memory this takes by taking the number of rows in your dataset, squaring it, and checking how many GB that takes with float32 (4 bytes per entry)
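
That arithmetic can be sketched as below. It assumes a dense N x N matrix of float32 values and ignores any caching tricks a particular solver might use. The function name `kernel_matrix_gib` is made up for illustration:

```python
def kernel_matrix_gib(n_samples, bytes_per_value=4):
    """Memory for a dense n x n kernel matrix, in GiB (float32 = 4 bytes)."""
    return n_samples ** 2 * bytes_per_value / 1024 ** 3

for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9,} rows -> {kernel_matrix_gib(n):,.1f} GiB")
```

The quadratic growth is the point: ten times more rows means a hundred times more memory.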

jaunty helm May 9, 2024, 8:43 AM

#

past meteor Because it requires forming the kernel matrix and the size is NxN

ah, too memory intensive 🙂
ty for your reply

feral wind May 9, 2024, 9:21 AM

#

jaunty helm can you elaborate? `sklearn.svm.SVC` works out of the box > The multiclass suppo...

i cant elaborate that, im just starting to learn, do you have tutorials on youtube that might be of help

jaunty helm May 9, 2024, 9:22 AM

#

feral wind i cant elaborate that, im just starting to learn, do you have tutorials on youtu...

what I meant was you can directly use SVC even if you have multiple target classes (dropout, graduate, enroll) because it's baked in already

feral wind May 9, 2024, 9:22 AM

#

past meteor You mean, what kind of kernel function?

my dataset is not big i dont think, and i dont really know what kernel is and what i should use

feral wind May 9, 2024, 9:22 AM

#

jaunty helm what I meant was you can directly use `SVC` even if you have multiple target cla...

how can i do that

jaunty helm May 9, 2024, 9:23 AM

#

feral wind how can i do that

use it like any other estimator? I'm a bit confused by what you mean

> i am using svm
are you not using sklearn or something?

feral wind May 9, 2024, 9:23 AM

#

i am using sklearn

jaunty helm May 9, 2024, 9:25 AM

#

feral wind i am using sklearn

well yeah, just use it like any other estimator then

from sklearn.svm import SVC

svc = SVC()  # multiclass targets are handled internally (one-vs-one)
svc.fit(train_X, train_y)
svc.predict(test_X)

feral wind May 9, 2024, 9:25 AM

#

this is what it showed, idk if i did this correctly or not but i think not, because why it shows 0.0 on the 1 value

jaunty helm May 9, 2024, 9:30 AM

#

feral wind this is what it showed, idk if i did this correctly or not but i think not, beca...

I think that just means it never correctly predicts label 1
maybe something was messed up before fitting

feral wind May 9, 2024, 9:33 AM

#

jaunty helm I think that just means it never correctly predicts label 1 maybe something was ...

these are my labels, so theres 3 attributes, does 0 refer to the Target_Dropout, and so on?

jaunty helm May 9, 2024, 9:37 AM

#

feral wind these are my labels, so theres 3 attributes, does 0 refer to the Target_Dropout,...

probably

feral wind May 9, 2024, 9:37 AM

#

but what could go wrong then on my model

jaunty helm May 9, 2024, 9:38 AM

#

feral wind but what could go wrong then on my model

idk, depends on your processing steps I guess?

feral wind May 9, 2024, 9:40 AM

#

this is what i use

spring field May 9, 2024, 10:45 AM

#

spring field I mean, I managed to get something more interesting with some configuration adju...

alright, I decided to change my approach a bit: give the network some days as input, and as the target use the outputs for the last few of those input days plus the outputs for the days right after them

x: 1 2 3 4 5
y:     3 4 5 6 7 
p:     ? ? ? ? ?

inputs and outputs are of the same dimensions though, which ig could be changed by reconfiguring the model a bit for that
and also I was thinking that maybe a (proper, not two chained RNN cells like I have now) DRNN would also help, but yeah... this is the best I got, I suppose this approach at least can reasonably well predict those few days after the given ones, but yeah, maybe it's a lack of data and/or network depth, I don't know, I just know that an RNN-type network is probably the way to go, I think
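
A rough sketch of how such overlapping input/target pairs could be built, matching the `x: 1 2 3 4 5` / `y: 3 4 5 6 7` layout above. It assumes a single list of day indices stands in for both the inputs and the outputs, which in the real data are different variables. `make_pairs` is a hypothetical helper:

```python
def make_pairs(series, input_len=5, shift=2):
    """Pair each input window with a same-length target window that
    starts `shift` steps later, so the two overlap by input_len - shift."""
    pairs = []
    for start in range(len(series) - input_len - shift + 1):
        x = series[start:start + input_len]
        y = series[start + shift:start + shift + input_len]
        pairs.append((x, y))
    return pairs

days = list(range(1, 11))  # day indices 1..10 stand in for real measurements
for x, y in make_pairs(days)[:2]:
    print("x:", x, "y:", y)
```

The last `shift` entries of each target are the actual forecast; the overlapping part only asks the network to reproduce outputs for days it has already seen as input.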

past meteor May 9, 2024, 10:46 AM

#

feral wind my dataset is not big i dont think, and i dont really know what kernel is and wh...

Try both. Split off a bit of data, cross validate on the larger part, select the best model and then that's the winner.

Support vector machines are also quite sensitive to their hyperparameters so you may want to tune them.
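
One way this could look with scikit-learn, as a sketch only: the synthetic data, the grid values, and the 80/20 split are all assumptions, not a recommendation for the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class data standing in for the real dropout/graduate/enrolled set.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Split off a bit of data; cross-validate only on the larger part.
X_dev, X_holdout, y_dev, y_holdout = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Scaling lives inside the pipeline so each CV fold is scaled on its own.
pipeline = make_pipeline(StandardScaler(), SVC())
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10],
     "svc__gamma": ["scale", 0.01, 0.1]},
]
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_dev, y_dev)

print(search.best_params_)
print("hold-out accuracy:", search.score(X_holdout, y_holdout))
```

The hold-out score is only looked at once, after the winner is chosen, so it stays an honest estimate.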

past meteor May 9, 2024, 10:47 AM

#

spring field alright, I decided to change my approach a bit, basically give the network some ...

What are you doing? I'm quite curious, can you give me a TL;DR?

deep veldt May 9, 2024, 10:53 AM

#

What are the differences between a Convolution network and Siamese network? i really need the answer

spring field May 9, 2024, 10:56 AM

#

past meteor What are you doing? I'm quite curious, can you give me a TL;DR?

sure, I can't share many details about the dataset, but basically there are 3 environmental factors as inputs and 2 outputs that depend on them (the outputs are linearly correlated, but I still use both; y depends on x, the usual). The more relevant bit, ig, is that the input data is sequential: say, for example, every 10 minutes there's a measurement of air temperature, wind speed, and water surface temperature, and I want to predict the water surface temperature over the next couple of days given, say, today's air temp and wind speeds over the day
I think that's an accurate representation of the actual data... it's just a bunch of continuous data of such measurements

past meteor May 9, 2024, 10:56 AM

#

deep veldt What are the differences between a Convolution network and Siamese network? i re...

A siamese network is one where you give the same network 2 or 3 inputs and then you have it either say if they belong to the same class or not or you use triplet loss and the net needs to specify which 2 are from the same category. Siamese networks do something called metric learning.

Now, the idea of Siamese networks is way more general than CNNs. You can have a siamese network that is a CNN. A good example is unlocking phones with face recognition. They typically use some sort of triplet loss. The network used is a Siamese net.
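
A toy forward pass to make "the same network, several inputs" concrete. This is a sketch under loud assumptions: the embedding is a single random linear layer with `tanh` instead of a trained CNN, the inputs are random vectors instead of images, and nothing is trained here.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))  # one shared weight matrix: the "same network"

def embed(x):
    """Toy embedding network; a CNN could sit here instead for images."""
    return np.tanh(x @ W)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Push the anchor closer to the positive than to the negative by `margin`."""
    d_pos = np.sum((embed(anchor) - embed(positive)) ** 2)
    d_neg = np.sum((embed(anchor) - embed(negative)) ** 2)
    return max(d_pos - d_neg + margin, 0.0)

anchor = rng.normal(size=16)
positive = anchor + 0.01 * rng.normal(size=16)   # near-duplicate of the anchor
negative = rng.normal(size=16)                   # unrelated input
print(triplet_loss(anchor, positive, negative))
```

All three inputs go through the same `embed` with the same weights, which is what makes it Siamese. Swapping `embed` for a convolutional model changes the architecture but not the training scheme.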

All clear?

deep veldt May 9, 2024, 10:57 AM

#

past meteor A siamese network is one where you give *the same* network 2 or 3 inputs and the...

Should i use CNN or siamese?

#

for image similarity

past meteor May 9, 2024, 10:57 AM

#

Reread my answer again please 😄

deep veldt May 9, 2024, 10:58 AM

#

I still dont get which one should i use

#

I'm dumb

wooden sail May 9, 2024, 10:58 AM

#

the question doesn't make sense because the two things are not mutually exclusive

past meteor May 9, 2024, 10:58 AM

#

Now, the idea of Siamese networks is way more general than CNNs. You can have a siamese network that is a CNN.

wooden sail May 9, 2024, 10:58 AM

#

"siamese" has to do with what you do with the network and how you train it

#

CNN has to do with architecture

past meteor May 9, 2024, 10:59 AM

#

deep veldt Should i use CNN or siamese?

You can make a Siamese network that uses convolutional layers

deep veldt May 9, 2024, 10:59 AM

#

oh

past meteor May 9, 2024, 11:01 AM

#

spring field sure, I can't provide much details on the dataset, but basically it's a couple (...

What I can say is that a lot of time series docs / methods are really about univariate cases where you use lags to predict future values. Multivariate time series is (currently) my line of work 😄

Have you tried "basic" methods so far or did you jump to neural stuff immediately? Traditional methods are quite competitive.

#

I would 100 % start off by just using ARIMA, exponential smoothing etc. on just your 2 outputs. Another benchmark I always do is basically copying the last available datapoint and computing the error based on that
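
The "copy the last datapoint" benchmark fits in a few lines. A sketch with made-up readings, where `persistence_rmse` is a hypothetical helper name:

```python
import numpy as np

def persistence_rmse(series, horizon=1):
    """RMSE of forecasting each point with the value seen `horizon` steps earlier."""
    series = np.asarray(series, dtype=float)
    forecast = series[:-horizon]   # last available value at forecast time
    actual = series[horizon:]      # what actually happened
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

temps = [14.0, 14.5, 15.1, 15.0, 14.2, 13.8]  # made-up readings
print(persistence_rmse(temps))
```

Any model that cannot beat this number at the same horizon is not adding anything yet.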

#

The next thing I'd do is VAR (vector auto regression) since you mention your 2 outputs are correlated

#

Afterwards I'd start looking into just making lags of my inputs and giving that to a gradient boosted tree and so on
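
Making lags by hand could look like the sketch below (a single series only; `make_lag_features` is a hypothetical helper, and libraries such as sktime automate this):

```python
import numpy as np

def make_lag_features(series, n_lags):
    """Row t holds [y[t-n_lags], ..., y[t-1]]; the target is y[t]."""
    series = np.asarray(series, dtype=float)
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

X, y = make_lag_features([1, 2, 3, 4, 5, 6], n_lags=3)
print(X)
print(y)
```

The resulting `X` and `y` can be passed to any tabular regressor, a gradient boosted tree included.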

#

You kind of need benchmarks to make sense of the performance of neural nets and these are quite low hanging fruit imo

warm trellis May 9, 2024, 11:07 AM

#

BUG

#

.

deep veldt May 9, 2024, 11:40 AM

#

past meteor You can make a Siamese network that uses convolutional layers

so siamese network is basically comparing if both output is the same?

spring field May 9, 2024, 11:44 AM

#

past meteor What I can say is that a lot of time series docs / methods are really about univ...

mmm, I jumped to neural stuff immediately (when all you've got is a hammer...), so yeah... good one (on my part 😁)
though I did consider something simpler like linear regression that seemed a tad inadequate for the issue at hand, another idea that came to mind was to use simpler neural nets and I considered a couple approaches, but I saw a couple flaws with data generation using those and overall they don't care about the order anyway, so I had recently learned about RNNs, thus they seemed as the appropriate solution (the hammer analogy), so I tried to fit it as best as I could... using different variants of lag and such, I think the current last approach I took fared the best overall
but alright, I'll keep the more basic methods in mind next time (this was short practice and other than it being interesting to work with RNNs, I don't have particular interest in the data...), now that you have mentioned them I'll probably do a couple practice rounds to at least remember about their existence 😄
I did consider some polynomial fitting, sort of what I assume exponential smoothing does in a way, since the data is quite periodic, so it probably can be approximated using some sin and cos combinations as well, I suppose also that this goes into the territory of weather forecasting a bit which is a whole another topic I guess

gritty vessel May 9, 2024, 11:52 AM

#

Hey everyone I have a doubt

#

I am performing eda

#

But when I dropped the na values only 22 rows are remaining

#

So it doesn't make sense to perform further analysis, right? The original size was some 1 lakh rows × 22 columns

#

I dropped the rows with na and it came down to 22 rows only

past meteor May 9, 2024, 11:54 AM

#

spring field mmm, I jumped to neural stuff immediately (when all you've got is a hammer...), ...

If you lag your data, models like linear regression could work. If it's periodic it gets trickier, as you'll have to make a kind of time variable and compute its cos and sin. If you have interaction effects you'll have to multiply them with this variable or use a kernel like this https://scikit-learn.org/stable/modules/generated/sklearn.kernel_approximation.Nystroem.html. I don't recommend this approach 😅.
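
For reference, the cos/sin time variable could be built like this. A sketch assuming a daily cycle measured in minutes since midnight; `cyclic_features` is a made-up name:

```python
import numpy as np

def cyclic_features(minutes_since_midnight, period=24 * 60):
    """Map a time-of-day value onto the unit circle so 23:59 sits next to 00:00."""
    angle = 2 * np.pi * np.asarray(minutes_since_midnight, dtype=float) / period
    return np.column_stack([np.sin(angle), np.cos(angle)])

feats = cyclic_features([0, 360, 720, 1439])
print(feats.round(3))
```

Both columns are needed: with only the sine, 06:00 and 18:00 would be distinguishable but 00:00 and 12:00 would collapse to the same value.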

OTOH, tree-based models naturally can deal with periodicity without any preprocessing (just a time column is enough, you don't need to make a cos/sin or interactions) but they cannot fit trend/extrapolate unless you do some ✨ fancy ✨ stuff. If you don't have trend, feel free to just make lags and use xgboost or similar as a benchmark. https://www.sktime.net/en/stable/ implements all the lagging and so on.

There are also exponential smoothing algorithms that can fit seasonality, Holt-Winters comes to mind. https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html so you're covered if it's periodic. The same goes for ARIMA: there is SARIMA (the S stands for seasonal) and even SARIMAX (the X stands for exogenous, i.e. extra variables). https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html

But yeah, I do get the fact that it's fun to learn how to work with RNNs and so on so if that's the goal then it's no problems at all 🙂 I was like this but training too many neural nets gave me an aversion to them if there's other methods that can work. Mostly because they take way too long to train and have too many hyperparameters. I'm always stuck thinking "is my net bad because I chose the wrong hyperparameters or is this the lower bound on the error?" and there's no way of conclusively answering this question.

#

If you're going to be trying out many different configurations, I recommend using a stack similar to what I use at work btw:

https://optuna.org/ for hyperparameter tuning
https://mlflow.org/ to save your runs/hyperparameters.

They integrate nicely, you need 3 lines of code to have optuna register its runs in mlflow. It's a more "scalable" way to try out different hyperparameters.

Optuna

Optuna - A hyperparameter optimization framework

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API.


warm trellis May 9, 2024, 11:59 AM

#

Hey, how can I understand the importance of the features in my dataset? I‘m feeding 8 columns into model and in the end do prediction only for one column

#

It‘s based on hybrid conv-gru nn model

gritty vessel May 9, 2024, 12:11 PM

#

warm trellis Hey, how can I understand the importance of the features in my dataset? I‘m fee...

Explainable AI tools like LIME, SHAP, or ELI5
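
For intuition about what such tools measure, here is a sketch of permutation importance, a simpler model-agnostic technique (not LIME or SHAP themselves). It assumes tabular inputs and a `predict` callable; the stand-in "model" below is just the true function, and all names are illustrative:

```python
import numpy as np

def permutation_importance(predict, X, y, rng):
    """Increase in MSE when each column is shuffled; bigger = more important."""
    base = np.mean((predict(X) - y) ** 2)
    scores = []
    for col in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, col] = rng.permutation(X_perm[:, col])
        scores.append(np.mean((predict(X_perm) - y) ** 2) - base)
    return np.array(scores)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]          # column 2 is pure noise

def predict(features):
    """Stand-in for a trained model; here it is simply the true function."""
    return 3.0 * features[:, 0] + 0.5 * features[:, 1]

print(permutation_importance(predict, X, y, rng).round(2))
```

For a sequence model, the same idea would mean shuffling one input channel across samples while keeping the rest intact, then watching how much the validation error grows.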

spring field May 9, 2024, 12:21 PM

#

past meteor If you lag your data models like linear regression could work. If it's periodic ...

I see, thanks a lot for the information and resources, I'll check them out, those tools look wonderful as well

strange elbowBOT May 9, 2024, 12:22 PM

#

Noooooo!!

@spring field, please enable your DMs to receive the bookmark.

spring field May 9, 2024, 12:23 PM

#

past meteor If you're going to be trying out many different configurations I recommend using...

.bm

#

I should use this feature more often

spark bane May 9, 2024, 2:14 PM

#

Hello, I have 2+ years of experience with Python and wanna learn data science. I need some advice from people who have experience in this field. I remember some math from school but need a refresher, so my question is: where should I start learning from, and how? I need the best way to learn data science when you already know Python and don't need to waste time learning list, tuple and bla bla. Maybe you all understand what I need. Thank you!

karmic zealot May 9, 2024, 2:35 PM

#

how can I encrypt data if I want to store it

thorn cairn May 9, 2024, 4:20 PM

#

i think there is something wrong with my embedding, because when im trying to fit it detects nothing?

from sklearn.model_selection import train_test_split,  StratifiedKFold, StratifiedShuffleSplit, KFold

kfold = StratifiedKFold(n_splits=5,shuffle=True,random_state=11)
splits = kfold.split(df,df['headline'])
x_train, x_test, y_train, y_test = train_test_split(df['headline'], df['is_sarcastic'],  test_size=0.30,random_state=3)
# x_train.info()
# Encoding here
encoder = tf.keras.layers.TextVectorization(max_tokens=10000)
encoder.adapt(x_train.map(lambda text: text))

vocabulary = np.array(encoder.get_vocabulary())
# Creating the model
model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(
        len(encoder.get_vocabulary()), 64, mask_zero=True),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)
])

model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(),
    metrics=['accuracy']
)
history = model.fit(
    x_train, 
    epochs=5,
    validation_data=x_test
)

it returns this error,

AttributeError: 'NoneType' object has no attribute 'items'

#

the original code that i copied has something like this on their encoder. But it returns an error too,

encoder.adapt(train_dataset.map(lambda text, _: text))

TypeError: <lambda>() missing 1 required positional argument: '_'

desert oar May 9, 2024, 5:47 PM

#

@thorn cairn the error AttributeError: 'NoneType' object has no attribute 'items' means what it says: somewhere in your code you tried to access the items attribute of something, but the something was None (an instance of NoneType) and of course there is no None.items attribute. your task now is to figure out where that happened, and what caused something to be None that you expected to be other than None

#

you need to look at the traceback part of the error output. that should identify precisely where the error happened.

#

the other error message says that the function lambda text, _: ... was given 1 argument, but expected 2 arguments, so the _ argument is considered missing. "positional" means they were provided like f(x, y) as opposed to "keyword" which would be provided like f(x=x, y=y)
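
The arity mismatch can be reproduced without TensorFlow. A sketch using plain lists in place of the dataset, with made-up headlines:

```python
pairs = [("some headline", 1), ("another headline", 0)]   # (text, label) items
texts_only = ["some headline", "another headline"]        # text items only

take_text_from_pair = lambda text, _: text   # expects two positional arguments
take_text = lambda text: text                # expects one

# Works: each (text, label) tuple is unpacked into the two parameters.
print([take_text_from_pair(*item) for item in pairs])

# Works: each item is already a single value.
print([take_text(item) for item in texts_only])

# Fails the same way as the traceback: one argument given, two expected.
try:
    take_text_from_pair(texts_only[0])
except TypeError as exc:
    print("TypeError:", exc)
```

So the two-argument lambda only fits a dataset whose elements are (text, label) pairs. Separately, the posted `model.fit(x_train, ...)` call passes no labels and `validation_data` has no labels either, which may be worth checking, though the traceback remains the authority on where the `None` comes from.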

odd meteor May 9, 2024, 6:52 PM

#

spring field sure, I can't provide much details on the dataset, but basically it's a couple (...

I find this kind of timeseries analysis that's modelled to predict multiple response variables quite interesting. I don't know why it's not as popular as the conventional timeseries analysis with a single response variable. I worked on this kind of task once where y = 36 columns and X was around 663 columns. I tried RNN, ARIMA, SARIMA, XGBoost, LightGBM, and GAM (used pyGAM), and LightGBM produced the best result.

odd meteor May 9, 2024, 6:57 PM

#

karmic zealot how can I encrypt data if I want to store it

There's cryptography and then there's differential privacy (it doesn't perform encryption, but it's currently one of the gold standards in privacy-preserving ML)

odd meteor May 9, 2024, 7:02 PM

#

spark bane Hello, I have 2+ years experience with Python and wanna learn data science, I ne...

Perhaps https://kaggle.com/learn , https://course.fast.ai/ or purchasing a course on Coursera/Udacity/DataCamp/Udemy.

If you prefer books, check the pinned messages on this channel for some nice recommendations

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

Practical Deep Learning for Coders

Practical Deep Learning for Coders - Practical Deep Learning

A free course designed for people with some coding experience, who want to learn how to apply deep learning and machine learning to practical problems.

quartz lotus May 9, 2024, 7:32 PM

#

just a quick question for anyone who has used OpenCV before: has color conversion to grayscale changed in the past year or so?
grayimg=cv.cvtColor(img,cv.COLOR_BGR2BGRA) vs
grayimg=cv.cvtColor(img,cv.COLOR_BGR2GRAY)

#

the bottom is how i'm seeing how it's done from a tutorial from a year ago but that option doesn't exist for me. it's just grayed out in my editor

agile cobalt May 9, 2024, 7:33 PM

#

COLOR_BGR2BGRA sounds like Blue Green Red -> Blue Green Red Alpha?

quartz lotus May 9, 2024, 7:34 PM

#

ah, that might be it then.

agile cobalt May 9, 2024, 7:34 PM

#

https://docs.opencv.org/3.4/d8/d01/group__imgproc__color__conversions.html

COLOR_BGR2BGRA
Python: cv.COLOR_BGR2BGRA
add alpha channel to RGB or BGR image

quartz lotus May 9, 2024, 7:35 PM

#

so it is then

agile cobalt May 9, 2024, 7:37 PM

#

Are you using 3.x or 4.x?

quartz lotus May 9, 2024, 7:37 PM

#

3.x

agile cobalt May 9, 2024, 7:38 PM

#

which version exactly

quartz lotus May 9, 2024, 7:38 PM

#

i think my computer was just throwing me a weird issue i get the option for bgr2gray now
also, i'm on 3.9.6

agile cobalt May 9, 2024, 7:39 PM

#

OpenCV version, not python version

#

Python 4 does not even exist

quartz lotus May 9, 2024, 7:40 PM

#

4.9.0.80

agile cobalt May 9, 2024, 7:40 PM

#

Check which version the tutorial you're following uses

#

major versions (major.minor.patch) frequently contain breaking changes, which means code written for 3.x.y will frequently not work for 4.x.y (same for 0.x.y -> 1.x.y -> 2.x.y -> 3.x.y -> ...)

quartz lotus May 9, 2024, 7:42 PM

#

i'm following a GeeksforGeeks article that doesn't include the version, but it was from over a year ago.
I'll keep an eye out for more weird issues like that and report them

agile cobalt May 9, 2024, 7:43 PM

#

GeeksforGeeks? I wouldn't be surprised if it never worked then, some of their content is pretty low quality

quartz lotus May 9, 2024, 7:44 PM

#

do you know of a better place I could learn about OpenCV?

#

I'd appreciate it if you did

agile cobalt May 9, 2024, 7:44 PM

#

.rp opencv

strange elbowBOT May 9, 2024, 7:44 PM

#

Search results - Real Python

Here are the top 5 results:

Image Segmentation Using Color Spaces in OpenCV + Python

https://realpython.com/python-opencv-color-spaces/

PySimpleGUI: The Simple Way to Create a GUI With Python

https://realpython.com/pysimplegui-python/

Face Detection in Python Using a Webcam

https://realpython.com/face-detection-in-python-using-a-webcam/

Traditional Face Detection With Python

https://realpython.com/traditional-face-detection-python/

Fingerprinting Images for Near-Duplicate Detection

https://realpython.com/fingerprinting-images-for-near-duplicate-detection/

agile cobalt May 9, 2024, 7:45 PM

#

Real Python usually is really good

agile cobalt May 9, 2024, 7:46 PM

#

agile cobalt https://docs.opencv.org/3.4/d8/d01/group__imgproc__color__conversions.html > COL...

looks like 4.9.0 should still have that though, maybe check if you installed and imported things correctly https://docs.opencv.org/4.9.0/d8/d01/group__imgproc__color__conversions.html

frosty socket May 9, 2024, 7:47 PM

#

Hi guys, I have this Rubik's cube, and I figure that the best way to select subcubes would be to put them in a numpy array of shape (3, 3, 3), so I find it easy to find a horizontal slice like that, but is there a way to easily select a whole row, column, or stage using some kind of numpy syntax?

Capture_decran_2024-05-08_a_13.41.16.png

#

like selecting a vertical slice is arr[0]

#

but how do I simply select a horizontal slice

quartz lotus May 9, 2024, 7:49 PM

#

agile cobalt looks like 4.9.0 should still have that though, maybe check if you installed and...

the issue went away after a restart. IDK what caused it but i'm gonna keep a lookout for that and any other issues. Also, thank you for the recommendation

wooden sail May 9, 2024, 7:51 PM

#

frosty socket like selecting a vertical slice is arr[0]

arr[0] is the same as arr[0, :, :]. if you wanna pick a whole column you could do arr[:, col, :]

#

similarly for the other dimension
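
Spelled out for all three axes, as a sketch in which the cube is filled with the numbers 0..26 as placeholders for the subcubes (which axis counts as "row", "column", or "stage" depends on how the array was filled):

```python
import numpy as np

cube = np.arange(27).reshape(3, 3, 3)   # stand-in for the 27 subcubes

first_axis_slice = cube[0]          # same as cube[0, :, :]
second_axis_slice = cube[:, 1, :]   # fix the middle index
third_axis_slice = cube[:, :, 2]    # fix the last index
single_line = cube[0, 1, :]         # fix two indices to get one line of 3

print(second_axis_slice)
```

Each slice is a view into `cube`, so writing to a slice changes the cube itself; call `.copy()` first when an independent array is needed.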

frosty socket May 9, 2024, 7:51 PM

#

nvm, chatgpt got me the answer, I suppose the magic thing I was looking for is stage = cube[:, :, stage_index]

#

ty

#

yes

#

I guess the hard part was formulating the question

warm trellis May 9, 2024, 9:09 PM

#

after adding an LSTM layer the model no longer learns, what could be the reason?

desert oar May 9, 2024, 9:34 PM

#

odd meteor I find this kind of timeseries analysis that's modelled to predict multiple resp...

I suspect "difficulty" correlates inversely with "popularity in blog posts"

#

I think building a useful multivariate time series model in a real-world project is on the harder end of things

spring field May 9, 2024, 9:53 PM

#

warm trellis after adding lstm layer model learns no more, what can be reason?

there could be tons of reasons, like not normalizing the data

odd meteor May 9, 2024, 9:54 PM

#

desert oar I suspect "difficulty" correlates inversely with "popularity in blog posts"

Ikr. Even in my undergrad Stats program, multivariate time series wasn't taught. The course was, however, reserved for the Stats Master's program.

ashen echo May 9, 2024, 10:15 PM

#

anybody ever have issue trying to install pandasAI via command prompt, i keep getting this error where its not recognizing MS visual C++, and I have version 14.38 of it installed already

feral wind May 9, 2024, 10:36 PM

#

guys how do you do hyperparameter tuning in svm

desert oar May 10, 2024, 12:26 AM

#

odd meteor Ikr. Even in my undergrad Stats program, multivariate time series wasn't taught....

it's popular in econometrics, but the use case is more inferential/statistical than predictive

lapis sequoia May 10, 2024, 2:09 AM

#

All this talk about C/C++, why?

serene scaffold May 10, 2024, 2:52 AM

#

lapis sequoia All this talk about C/C++, why?

in this channel, or in the AI/ML discourse in general?

tacit basin May 10, 2024, 3:14 AM

#

lapis sequoia All this talk about C/C++, why?

bcs python is slow maybe?

tacit basin May 10, 2024, 3:16 AM

#

feral wind guys how do you do hyperparameter tuning in svm

Never tuned svm, but would imagine it is similar to tuning any other supervised ml models?

sterile heath May 10, 2024, 8:31 AM

#

Why only one?

past meteor May 10, 2024, 8:33 AM

#

desert oar I think building a useful multivariate time series model in a real-world project...

I find multivariate series hard because it could mean anything

#

Some people mean it to be N univariate series that are correlated, the typical use case for vector auto regression (the stock market etc.). All variables in this case are endogenous. There are also the ARX/ARIMAX type models that explicitly have exogenous variables.

Both of them are called multivariate but I feel like they should be "split" into endog multivariate and exog multivariate (and the mix) explicitly.

Finally, there's the whole domain of hierarchical time series and reconciliation, hierarchical Bayesian models, shrinkage, pooling, mixed effects modelling 🥴 . Odds are if the time series is multivariate you ought to be looking at a mixed effect model yeah, at least in "typical" use cases like demand forecasting and medical related stuff.

abstract rune May 10, 2024, 8:40 AM

#

For instance, consider a company that is interested in conducting a direct-marketing campaign. The goal is to identify individuals who are likely to respond positively to a mailing, based on observations of demographic variables measured on each individual. In this case, the demographic variables serve as predictors, and response to the marketing campaign (either positive or negative) serves as the outcome. The company is not interested in obtaining a deep understanding of the relationships between each individual predictor and the response; instead, the company simply wants to accurately predict the response using the predictors. This is an example of modeling for prediction

This is quoted from ISLP, chapter 2.1, page 19 (29 of pdf)

What do they mean by "not interested in obtaining a deep understanding of relationships between each individual predictor and response"?
Because if we want to find an estimator, which will be a vector of weights with one weight per predictor, then aren't we doing the same thing?

tender summit May 10, 2024, 10:18 AM

#

hi my friend!

#

can you help me with this?

#

hasty grail May 10, 2024, 10:49 AM

#

tender summit

The first line in your screenshot tells you what you need to do

tender summit May 10, 2024, 11:48 AM

#

hasty grail The first line in your screenshot tells you what you need to do

i ran pip install chatterbot

#

on my terminal

#

but this err

#

help please

#

🆘🆘🆘🆘

warm trellis May 10, 2024, 11:53 AM

#

spring field there could be tons of reasons, like not normalizing the data

It's really interesting, but it was because after the conv1d layer's output I needed to change the shape of the values, like from b*m*n to b*n*m.
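
A small sketch of that axis swap, written with plain nested lists so it runs without any framework. It assumes the layout is (batch, channels, length) and that the next layer wants (batch, length, channels); in PyTorch the equivalent would be `x.permute(0, 2, 1)` or `x.transpose(1, 2)`. The helper names are made up for illustration.

```python
def swap_last_two_axes(batch):
    """Turn a nested list shaped (b, m, n) into one shaped (b, n, m)."""
    return [[list(column) for column in zip(*sample)] for sample in batch]


def shape(nested):
    """Return (b, m, n) for a rectangular 3-level nested list."""
    return (len(nested), len(nested[0]), len(nested[0][0]))


# One sample (b=1) with m=2 channels and n=3 time steps.
conv_output = [[[1, 2, 3],
                [4, 5, 6]]]
swapped = swap_last_two_axes(conv_output)
print(shape(conv_output), "->", shape(swapped))
```

Note that this moves values between axes; a plain reshape to the same target shape would keep the memory order and scramble which value belongs to which time step.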

autumn ruin May 10, 2024, 11:53 AM

#

tender summit

spring field May 10, 2024, 12:41 PM

#

a statistical summary in what regard?
I apologise for the shallow response as ML has been my focus for quite some time and when all you've got is a hammer...
this seems like something that can be solved by a standard feed-forward neural network, essentially just a bunch of linear layers. Since you mentioned categorical data, you'll probably also want an embedding space, though I guess at first you can just try plain one-hot encoding

trim saddle May 10, 2024, 12:45 PM

#

tender summit i ran pip install chatterbot

You might need an older python version. The last version is 4 years old

drowsy sleet May 10, 2024, 3:13 PM

#

Hi everyone, I'm currently signed up for a Kaggle challenge. As you might know, we have to read the CSV files provided by Kaggle on their website. My concern is: when submitting the notebook, will there be an issue reading the files, since the path will be a local path on my device? I know it's a silly question. I ask because the other notebooks I've looked at all use a similar kind of path, like the one shown below:
tr = pd.read_csv('/kaggle/input/widsdatathon2023/train_data.csv', parse_dates = ['startdate'])

what's the right way to go about it?

cedar tusk May 10, 2024, 3:39 PM

#

anyone here can help me on building my own tokenizer? this will be the first step for me to build my own llm algo

scarlet owl May 10, 2024, 3:39 PM

#

what libraries I need to know for machine learning?

buoyant vine May 10, 2024, 3:40 PM

#

cedar tusk anyone here can help me on building my own tokenizer? this will be the first ste...

Tokenizers are very difficult to build, is there a reason why you want to create your own vs. using a pre-trained one?

scarlet owl May 10, 2024, 3:41 PM

#

scarlet owl what libraries I need to know for machine learning?

??

sly isle May 10, 2024, 3:41 PM

#

scarlet owl what libraries I need to know for machine learning?

PyTorch

scarlet owl May 10, 2024, 3:41 PM

#

only?

cedar tusk May 10, 2024, 3:41 PM

#

buoyant vine Tokenizers are very difficult to build, is there a reason why you want to create...

i want to learn

#

i dont like using blackbox stuff

sly isle May 10, 2024, 3:41 PM

#

scarlet owl only?

TensorFlow

buoyant vine May 10, 2024, 3:42 PM

#

it is not really that they are complicated, it is just a nightmare building the vocab

scarlet owl May 10, 2024, 3:42 PM

#

can you write all at once?

buoyant vine May 10, 2024, 3:42 PM

#

my advice is to use Hugging Face's tokenizers and you can use your own vocab if you want, or mutate an existing one

#

at least that way as well it gives a common format since HF tokenizers are basically the standard

cedar tusk May 10, 2024, 3:42 PM

#

buoyant vine my advise is to use Hugging face's tokenizers and you can use your own vocab if ...

i am trying to build my own llm, i wont train it

#

i just want to see how its done
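
For "seeing how it's done", a toy sketch of the two pieces discussed above: building a vocabulary from a corpus, then mapping text to ids and back. This is a whitespace word-level tokenizer, not what production LLMs use (those typically rely on subword schemes such as BPE); the class name and the `<unk>` convention are illustrative assumptions.

```python
class ToyTokenizer:
    """Whitespace tokenizer with a vocabulary learned from a corpus."""

    UNK = "<unk>"

    def __init__(self, corpus):
        words = sorted({word for line in corpus for word in line.lower().split()})
        # Id 0 is reserved for words that never appeared in the corpus.
        self.id_to_token = [self.UNK] + words
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}

    def encode(self, text):
        """Map each whitespace-separated word to its id (0 if unknown)."""
        return [self.token_to_id.get(word, 0) for word in text.lower().split()]

    def decode(self, ids):
        """Map ids back to tokens and join them with spaces."""
        return " ".join(self.id_to_token[i] for i in ids)


tokenizer = ToyTokenizer(["the cat sat", "the dog sat down"])
ids = tokenizer.encode("the cat sat down quickly")
print(ids, "->", tokenizer.decode(ids))
```

The "nightmare" part mentioned above shows up as soon as the corpus is real: casing, punctuation, rare words and vocabulary size all need decisions that this sketch sidesteps.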

buoyant vine May 10, 2024, 3:43 PM

#

scarlet owl only?

Pytorch realistically is enough

cedar tusk May 10, 2024, 3:43 PM

#

scarlet owl ??

what kind of ml?

scarlet owl May 10, 2024, 3:43 PM

#

buoyant vine Pytorch realistically is enough

Ok

agile cobalt May 10, 2024, 3:43 PM

#

scarlet owl what libraries I need to know for machine learning?

depends on what exactly you are trying to do

sly isle May 10, 2024, 3:44 PM

#

You should also become comfortable with Pandas and Numpy as general libraries

scarlet owl May 10, 2024, 3:45 PM

#

to make algo

cedar tusk May 10, 2024, 3:45 PM

#

numpy is nice but i would use polars instead of pandas, overall works better. But has less integration since its newer. ur choice

true stratus May 10, 2024, 3:47 PM

#

i need a simple easy-to-train image recognition library for training a simple model to detect flags but tensorflow seems complex af

sly isle May 10, 2024, 3:47 PM

#

cedar tusk numpy is nice but i would use polars instead of pandas, overall works better. Bu...

So is Polars the new hit?

cedar tusk May 10, 2024, 3:47 PM

#

sly isle So is `Polars` the new hit?

its faster and has better syntax, has less integration to other packages such as scikit

#

as i said

true stratus May 10, 2024, 3:47 PM

#

true stratus i need a simple easy-to-train image recognition library for training a simple mo...

anyone knows any libraries for that?

cedar tusk May 10, 2024, 3:48 PM

#

true stratus i need a simple easy-to-train image recognition library for training a simple mo...

just use a pretrained huggingface model for "getting shit done"

true stratus May 10, 2024, 3:48 PM

#

nah i wanna train my own

cedar tusk May 10, 2024, 3:48 PM

#

no need to get fancy

true stratus May 10, 2024, 3:49 PM

#

i need accuracy

cedar tusk May 10, 2024, 3:49 PM

#

u can fine tune the model with ur own dataset

true stratus May 10, 2024, 3:49 PM

#

cedar tusk u can fine tune the model with ur own dataset

how? is it complex?

cedar tusk May 10, 2024, 3:49 PM

#

it will give better accuracy than any model ull build on ur own

#

https://huggingface.co/docs/transformers/en/training

Fine-tune a pretrained model

true stratus May 10, 2024, 3:50 PM

#

ok thanks

arctic silo May 10, 2024, 3:51 PM

#

what are your opinion about datacamp

cedar tusk May 10, 2024, 3:51 PM

#

arctic silo what are your opinion about datacamp

not worth

#

it holds your hand too much

arctic silo May 10, 2024, 3:52 PM

#

why??

cedar tusk May 10, 2024, 3:52 PM

#

u wont learn anything

#

u will just memorize stuff

#

which is not good

spring field May 10, 2024, 3:52 PM

#

cedar tusk https://huggingface.co/docs/transformers/en/training

can people stop using "state-of-the-art" for describing every single advancement in every single direction 😐

arctic silo May 10, 2024, 3:52 PM

#

so what do you recommend

cedar tusk May 10, 2024, 3:52 PM

#

spring field can people stop using "state-of-the-art" for describing every single advancement...

what

cedar tusk May 10, 2024, 3:53 PM

#

arctic silo so what do you recommend

just learn from doing

#

writing 1 line of code is better than writing 100 lines of code that u got from youtube or anywhere else

spring field May 10, 2024, 3:53 PM

#

cedar tusk what

arctic silo May 10, 2024, 3:53 PM

#

but you need courses to learn the principle

cedar tusk May 10, 2024, 3:53 PM

#

spring field

hahaha, yea

cedar tusk May 10, 2024, 3:53 PM

#

arctic silo but you need courses to learn the principle

no, you need research to learn the principle

#

what do you want to learn?

cedar tusk May 10, 2024, 3:54 PM

#

spring field can people stop using "state-of-the-art" for describing every single advancement...

i mean its "state of the art" if its the latest advancement made : P

arctic silo May 10, 2024, 3:54 PM

#

data science

cedar tusk May 10, 2024, 3:55 PM

#

arctic silo data science

lol, data science is too broad of a term to just "learn"

#

do you want to learn basic analysis, databases, probability, statistics, basic computer knowledge, servers

#

u gotta select a topic first to learn

arctic silo May 10, 2024, 3:56 PM

#

no, I mean machine learning, because I already know statistics, probability and data analysis since I'm a CS student

cedar tusk May 10, 2024, 3:56 PM

#

data science is not a topic, its a whole fucking science branch

arctic silo May 10, 2024, 3:57 PM

#

I have some experience with Python and its packages like numpy, seaborn, pandas

cedar tusk May 10, 2024, 3:57 PM

#

ok, let me tell u what u need to do.

#

implement every big machine learning model by hand in python

#

such as linear/logistic regression, support vector machines, DBSCAN, k-nearest neighbours

#

then, learn their intricacies

#

such as R^2 for regression, L1 and L2 regularization, hypothesis testing to see if a variable is important, etc.
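
As a rough illustration of "by hand", here is a sketch of simple (one-feature) linear regression via the closed-form least-squares solution, plus R^2 computed as 1 - SS_res / SS_tot. The data points are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up, slightly noisy data roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

Comparing the result against scikit-learn's `LinearRegression` on the same data is a quick way to check a hand-written implementation.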

arctic silo May 10, 2024, 4:00 PM

#

I've only implemented linear regression.
After learning this stuff, what should I do?

cedar tusk May 10, 2024, 4:00 PM

#

do this stuff first, then you can start with deep learning mathematics

#

after that comes the implementation

arctic silo May 10, 2024, 4:01 PM

#

ok thank you, so what do you do?

cedar tusk May 10, 2024, 4:01 PM

#

im a data science masters student

#

statistics bachelor

prime hill May 10, 2024, 4:01 PM

#

Hi

cedar tusk May 10, 2024, 4:01 PM

#

wassup

arctic silo May 10, 2024, 4:02 PM

#

in which university ?

prime hill May 10, 2024, 4:02 PM

#

Hello friends
I want useful tools to use in the Python language for information

I am a beginner 😬

cedar tusk May 10, 2024, 4:03 PM

#

arctic silo in which university ?

tu dortmund

cedar tusk May 10, 2024, 4:03 PM

#

prime hill Hello friends I want useful tools to use in the Python language for information ...

define information

#

data handling and visualization?

prime hill May 10, 2024, 4:04 PM

#

cedar tusk define information

How do I learn it or how do I use it?

jaunty helm May 10, 2024, 4:04 PM

#

sly isle So is `Polars` the new hit?

takes a bit to get used to, but imo feels pretty nice
integration is definitely worse than pandas currently
e.g. right now there's a bug with scikit-learn where

```py
X: pl.DataFrame
y: pl.Series
train_X, test_X, train_y, test_y = train_test_split(X, y)  # not just train_test_split, but that's the one I can remember right now
```
will error
in this case it's easy to get around though (use `y.to_numpy()`)

cedar tusk May 10, 2024, 4:04 PM

#

prime hill How do I learn it or how do I use it?

u want to learn python?

prime hill May 10, 2024, 4:05 PM

#

I'm such a beginner that I don't understand what you're saying

prime hill May 10, 2024, 4:05 PM

#

cedar tusk u want to learn python?

Yes

cedar tusk May 10, 2024, 4:05 PM

#

no prob, just go watch a 30min tutorial on basic python

#

then try coding stuff, i myself used this page for questions to work on https://www.practicepython.org/

prime hill May 10, 2024, 4:06 PM

#

cedar tusk no prob, just go watch a 30min tutorial on basic python

Fine, thank you 🤍

cedar tusk May 10, 2024, 4:06 PM

#

after watching that 30 min video dont ever watch another youtube video, just google stuff

jaunty helm May 10, 2024, 4:07 PM

#

!res

arctic wedgeBOT May 10, 2024, 4:07 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

cedar tusk May 10, 2024, 4:07 PM

#

if u search for the information urself it stays with you

cedar tusk May 10, 2024, 4:08 PM

#

arctic wedge

this page is too cluttered, need to be more simple and easy to access

#

is there a way we can work on the page ourselves?

prime hill May 10, 2024, 4:10 PM

#

cedar tusk after watching that 30 min video dont ever watch another youtube video, just goo...

I will watch the video and then try what I learned

spring field May 10, 2024, 4:10 PM

#

cedar tusk is there a way we can work on the page ourselves?

you can probably open an issue or submit a PR if you wish https://github.com/python-discord/site

#

can probably also ask in #dev-contrib ig

cedar tusk May 10, 2024, 4:11 PM

#

this way i can learn git as well, its an area im very lacking

#

if ur doing statistical analysis u gotta do it properly

#

i dont think its exaggerated

#

u just want to see the means and medians?

spring field May 10, 2024, 4:21 PM

#

I would disagree with that analogy, because usually you'd hunt rabbits for their meat, so using an explosive weapon would have greatly diminishing returns. However, using a powerful solution to a simple problem, like in this case, is not unlikely to have fantastic results, and in the end the issue would get tackled either way. That's unlike hunting a rabbit with an explosive weapon vs. a regular hunting weapon: in the latter case you at least get a rabbit; in the former, well, you get no results... except for an explosion, I guess. Now, if we use an analogy like swatting flies, then it makes much more sense: you can swat a fly using a flyswatter, or you can swat one using a nuclear weapon. If we focus specifically on eliminating the flies, then both approaches achieve the same result, though the nuclear weapon likely has greater range than a flyswatter

this can also be described in a single word: overkill

cedar tusk May 10, 2024, 4:21 PM

#

this needs hypothesis testing

#

xD

#

if not overly obvious

#

well, what u can do is go to this website and follow the same steps https://www.sthda.com/english/wiki/comparing-means-in-r

#

its in r tho

spring field May 10, 2024, 4:27 PM

#

I don't know about "basic" here, if you have tons of parameters, it'll get increasingly difficult to manually correlate them all as opposed to letting a neural network figure it out on its own

cedar tusk May 10, 2024, 4:27 PM

#

u can use https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.normaltest.html for normality test in python

cedar tusk May 10, 2024, 4:28 PM

#

spring field I don't know about "basic" here, if you have tons of parameters, it'll get incre...

cut him some slack, he is trying to get shit done xd

#

he could just go to chatgpt and do something very wrong in the process and call it good

#

but he came here for counselling

#

btw read the documentation for every function u are using, there may be cases where the function may not work

#

for ex the sample size is a very important factor for normal tests

jaunty helm May 10, 2024, 4:30 PM

#

if you're "just looking" then there are indeed many stats you can look at like correlation, anova(for numerical-categorical), chi2 (categorical-categorical), PCA, etc.
though do be careful when deciding how you interpret these numbers
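
A small sketch of why the interpretation warning matters, using Pearson correlation written out by hand. The data is made up: one series depends linearly on `xs`, the other depends on it perfectly but non-linearly.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # straight-line relationship
parabola = [x * x for x in xs]     # fully determined by x, but not linear

print(pearson(xs, linear), pearson(xs, parabola))
```

The parabola is entirely determined by `xs`, yet its correlation with `xs` comes out at zero here, so a low correlation on its own does not show that two variables are unrelated.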

random torrent May 10, 2024, 4:30 PM

#

Hello, I have a series of pyplots of 2d arrays, how do I combine them into an mp4? I only see tutorials on function plot mp4, but not 2d arrays

cedar tusk May 10, 2024, 4:31 PM

#

random torrent Hello, I have a series of pyplots of 2d arrays, how do I combine them into an mp...

mp4 as in a video?

random torrent May 10, 2024, 4:31 PM

#

yes

cedar tusk May 10, 2024, 4:31 PM

#

install davinci resolve and put the pngs into the video

#

then render

random torrent May 10, 2024, 4:31 PM

#

i have like a million of them

spring field May 10, 2024, 4:32 PM

#

better get going then pg_rofl

cedar tusk May 10, 2024, 4:32 PM

#

u are trying to animate?

random torrent May 10, 2024, 4:32 PM

#

yea, I am trying to use ArtistAnimation but not sure how

cedar tusk May 10, 2024, 4:32 PM

#

are the plots matplotlib or plotly

random torrent May 10, 2024, 4:32 PM

#

matplotlib

spring field May 10, 2024, 4:33 PM

#

do all of those images even fit in your ram?

cedar tusk May 10, 2024, 4:33 PM

#

https://www.youtube.com/watch?v=bNbN9yoEOdU

YouTube

CodingLikeMad

Animating Plots In Python Using MatplotLib [Python Tutorial]

This video shows how to make mp4 and gif (movie) files out of figures in python using matplotlib. Maximize your data visualization impact using matplotlib animation when doing data science.

FFMpeg can be downloaded here if you need it:
https://www.ffmpeg.org/download.html

The github repo can be found here for all the examples mentioned:
https:...

▶ Play video

#

just watch this and apply

spring field May 10, 2024, 4:34 PM

#

now, this might not be the most efficient approach, but if you have those image files, you could use sth like ffmpeg to do this: https://stackoverflow.com/a/37478183/14531062

#

I mean, ffmpeg is efficient, but if you first need to convert from arrays to images on disk, then that's gonna take a bit

random torrent May 10, 2024, 4:36 PM

#

cedar tusk https://www.youtube.com/watch?v=bNbN9yoEOdU

Interesting that he uses FFMpegWriter, I wonder what's the difference between it and ArtistAnimation

cedar tusk May 10, 2024, 4:40 PM

#

random torrent Interesing that he uses ffmpegwriter, I wonder what's the difference between it ...

u can try finding a video with that too, i just looked at the end result and matplotlib to come up with a vid

spring field May 10, 2024, 4:42 PM

#

random torrent Interesing that he uses ffmpegwriter, I wonder what's the difference between it ...

ArtistAnimation deals with matplotlib objects during runtime, it's probably most useful for like interactive plots and such, FFMpegWriter would be used if you wanted to create an mp4 file as the guy in the video explains

random torrent May 10, 2024, 5:04 PM

#

This is confusing

#

I don't know how to get it work for 2d arrays

#

I cannot find a suitable function to update the plot

#

nevermind

#

I got it working

cedar tusk May 10, 2024, 5:07 PM

#

yea was about to say just iterate over the array

#

😁

random torrent May 10, 2024, 5:11 PM

#

nono, the writer.saving takes an updating Figure object, so I created a subplot inside the figure and keep plotting a new subplot in a for loop
It should also work by calling writer.saving in a for loop and give it a new figure every loop, but it feels like a bad idea to keep locking and unlocking the file frequently
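
A hedged sketch of one way this can look for 2D arrays, not necessarily what was done above: open the writer once with `writer.saving(...)`, keep a single image artist, and swap its data for each frame before calling `grab_frame()`. The function names, the output file name and the tiny frames are made up for illustration, and `write_video` needs matplotlib plus an ffmpeg binary on the PATH, so its call is left commented out.

```python
def color_limits(frames):
    """Global (min, max) over all frames, so the color scale stays fixed."""
    lows = [min(min(row) for row in frame) for frame in frames]
    highs = [max(max(row) for row in frame) for frame in frames]
    return min(lows), max(highs)


def write_video(frames, path, fps=30):
    """Write a sequence of 2D arrays to a video file, one frame per array."""
    # Imported here so color_limits stays usable without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, no window needed
    import matplotlib.pyplot as plt
    from matplotlib.animation import FFMpegWriter

    vmin, vmax = color_limits(frames)
    fig, ax = plt.subplots()
    image = ax.imshow(frames[0], vmin=vmin, vmax=vmax)
    writer = FFMpegWriter(fps=fps)
    with writer.saving(fig, path, dpi=100):
        for frame in frames:
            image.set_data(frame)  # reuse one artist instead of re-plotting
            writer.grab_frame()
    plt.close(fig)


# Hypothetical data: three tiny 2x2 frames.
frames = [[[0, 1], [2, 3]], [[1, 2], [3, 4]], [[2, 3], [4, 5]]]
print(color_limits(frames))
# write_video(frames, "frames.mp4")  # needs matplotlib and ffmpeg
```

With something like a million frames, holding them all in a list may not fit in memory; in that case the frames could be produced lazily and `vmin`/`vmax` fixed up front instead of computed from the data.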

#

Thanks anyways

pale thunder May 10, 2024, 6:30 PM

#

Does anyone know a working tutorial for CUDA+tensorflow on Ubuntu given a Quadro T1000 mobile? Trying to help a friend.

cedar tusk May 10, 2024, 7:04 PM

#

pale thunder Does anyone know a working tutorial for CUDA+tensorflow on Ubuntu given a Quadro...

last time i tried doing it it was a nightmare

#

good luck

formal mauve May 10, 2024, 7:17 PM

#

What's trending in this field atm? Anyone have any good ideas?

cedar tusk May 10, 2024, 7:20 PM

#

formal mauve What's trending in this field atm? Anyone have any good ideas?

build your own llm

formal mauve May 10, 2024, 7:20 PM

#

cedar tusk build your own llm

llm?

cedar tusk May 10, 2024, 7:20 PM

#

its a good challenge im doing it now

formal mauve May 10, 2024, 7:21 PM

#

Sorry, I am a bit naive with the acronyms sometimes.

#

too many out there 😄

cedar tusk May 10, 2024, 7:21 PM

#

large language model

#

as in chatgpt

formal mauve May 10, 2024, 7:21 PM

#

Gotcha. Yeah that would be interesting.

#

Im trying to figure out what I could build that could be a beneficial service/tool for people. But theres just so much out there, I dont know what I want to focus on.

cedar tusk May 10, 2024, 7:23 PM

#

well thats a hard question

formal mauve May 10, 2024, 7:23 PM

#

cedar tusk well thats a hard question

Yes it is indeed.

brisk vapor May 10, 2024, 7:43 PM

#

We do not permit job seeking posts in this server.

kind loom May 10, 2024, 7:43 PM

#

brisk vapor We do not permit job seeking posts in this server.

My bad
I’m sorry

livid sphinx May 10, 2024, 10:21 PM

#

Hey there, I was wondering, where should I start to learn python for Data analytics

left tartan May 10, 2024, 10:46 PM

#

livid sphinx Hey there, I was wondering, where should I start to learn python for Data analyt...

Do you already know Python? If so, kaggle.com/learn is a good starter

livid sphinx May 10, 2024, 11:57 PM

#

left tartan Do you already know Python? If so, kaggle.com/learn is a good starter

I know the basics basics, thank you

abstract wasp May 11, 2024, 1:04 AM

#

Can anyone explain to me their interview process after getting a job in data science/ml? Plsss, I need help ;-;

left tartan May 11, 2024, 1:19 AM

#

abstract wasp Can anyone explain to me their interview process after getting a job in data sci...

#career-advice might be a better place for this q

lapis sequoia May 11, 2024, 2:39 AM

#

Hello guys,

I'm a full-stack web developer and I want to enhance my skills, so I'm thinking of getting into data science. As a web developer, is it really beneficial for me to get into data science? If yes, then how (please elaborate)?

As I still have 1.5 years to complete my degree, is it beneficial to spend this time learning data science?

With "data scientist + web developer", do I provide more value to the marketplace than "only a web developer" (also in the future)?

If learning data science with web development is bad idea then you can also suggest me some other thing to learn instead of data science with web dev.

Any suggestions would be appreciated.

Thank You

lapis sequoia May 11, 2024, 3:32 AM

#

What “certifications” needed? Have a degree in Data Science, don’t feel a masters is necessary. What else do I need?

serene scaffold May 11, 2024, 3:44 AM

#

lapis sequoia What “certifications” needed? Have a degree in Data Science, don’t feel a master...

There are no certificates that matter for careers in data science. Have you been applying to jobs and not getting any response?

serene scaffold May 11, 2024, 3:50 AM

#

lapis sequoia Hello guys, I'm a full stack web developer and i want to enhance my skills so I...

There's a lot of data science hype, so unless you can do an internship where you're primarily doing data science (and not primarily web development) or take all the AI/data science courses that your degree program offers, I don't think prospective employers will believe that your purported data science skills will be valuable to them.

gritty vessel May 11, 2024, 4:36 AM

#

Hey guys does anyone know about data papers?

#

I have created an AI-ready dataset

#

And I want to publish it

#

I found resources on what a data paper is

#

But I can't find any data papers that I can refer to

solemn verge May 11, 2024, 4:50 AM

#

hey I am just wondering can I write programs with pip after I installed conda and activated the environment? I do have pip installed (on Linux, came out the box)

wooden sail May 11, 2024, 5:04 AM

#

solemn verge hey I am just wondering can I write programs with pip after I installed conda an...

what do you mean by "write programs with pip"

deep veldt May 11, 2024, 5:10 AM

#

what is the difference between a matrix and a tensor?

wooden sail May 11, 2024, 5:12 AM

#

if you wanna be precise, a matrix is simply a rectangular table of numbers

#

if that table of numbers represents a linear or multilinear transformation, you can also call it a tensor

#

if not, but you have some binary operation for it with other matrices and a scalar operation, matrices can also be vectors

#

the difference depends on what you do with the matrix. if you use it as a function that transforms other vectors, the matrix is also a tensor

solemn verge May 11, 2024, 5:23 AM

#

wooden sail what do you mean by "write programs with pip"

well, after activating conda my terminal now says (base) name@pop-os. I want to write an automation script that basically takes 2 arguments and does just a basic task. I wanted to know how I'd tell whether pip or conda is being used to write this. Meaning, let's say I upload this to GitHub and then clone the repo to use on a computer that does not have conda. If I open VS Code now and begin writing this, would it use packages from pip?

wooden sail May 11, 2024, 5:24 AM

#

you want to automate the installation of requirements/setup of a project?

#

you could do it either with conda or with pip. conda requires that the user has installed conda. pip already comes with python (though you never want to use the system python directly, you can ruin your OS)
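
A script is not really "written with" pip or conda; those tools only install packages into an environment, and what matters at run time is which interpreter executes the script. A minimal sketch for checking that from inside Python follows. The function name is made up, and the `CONDA_PREFIX` check relies on conda setting that environment variable when an environment (including `base`) is activated.

```python
import os
import sys


def describe_environment():
    """Report which interpreter is running and whether an environment is active."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # True inside a venv/virtualenv (prefix differs from the base install).
        "in_venv": sys.prefix != sys.base_prefix,
        # None when no conda environment is activated in this shell.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

For the "clone it on a machine without conda" case, listing the script's third-party dependencies in a `requirements.txt` lets the other machine install them with pip into its own environment.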

deep veldt May 11, 2024, 5:28 AM

#

wooden sail the difference depends on what you do with the matrix. if you use it as a functi...

What do you mean by the "transforms other vectors" part? can i get an example

wooden sail May 11, 2024, 5:28 AM

#

deep veldt What do you mean by the "transforms other vectors" part? can i get an example

matrix multiplication

#

if you multiply a vector with a matrix, you get a new vector. this vector is a "transformed" version of the original vector

#

e.g. if you multiply a vector by a rotation matrix, you get a rotated version of the original vector

#

the matrix is transforming the vector

#

(matrices represent linear transformations in this context)
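
The rotation example can be sketched in a few lines of plain Python. The helper names are made up, and the angle is chosen so that the vector along the x-axis should end up along the y-axis, up to floating-point rounding.

```python
from math import cos, pi, sin


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def rotation(theta):
    """2x2 matrix that rotates a vector counter-clockwise by theta radians."""
    return [[cos(theta), -sin(theta)],
            [sin(theta), cos(theta)]]


v = [1.0, 0.0]
rotated = matvec(rotation(pi / 2), v)
print([round(component, 6) for component in rotated])
```

The same `matvec` works for any matrix, which is the sense in which every matrix acts as a linear transformation on vectors.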

deep veldt May 11, 2024, 5:35 AM

#

wooden sail if not, but you have some binary operation for it with other matrices and a scal...

but matrices are 2d array meanwhile vectors are 1d, how can matrices be vectors tho?

wooden sail May 11, 2024, 5:35 AM

#

vectors are not 1d arrays

#

vectors are any element of a set that satisfies the 8 axioms of a vector space

#

for finite dimensional vector spaces you can choose a basis and represent vectors as 1d arrays, but this is secondary because in many cases there are infinitely many suitable bases, and so the 1d array representation is not unique
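
A small sketch of the "representation is not unique" point: the same 2D vector gets different coordinate arrays depending on the basis chosen. The function name and the second basis are made up for illustration, and the coordinates are found with Cramer's rule.

```python
def coordinates(vector, b1, b2):
    """Coordinates (a, b) such that a*b1 + b*b2 equals vector."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    a = (vector[0] * b2[1] - b2[0] * vector[1]) / det
    b = (b1[0] * vector[1] - vector[0] * b1[1]) / det
    return a, b


v = (3, 1)
standard = coordinates(v, (1, 0), (0, 1))   # the usual x/y basis
diagonal = coordinates(v, (1, 1), (1, -1))  # another valid basis of the plane
print(standard, diagonal)
```

Both coordinate pairs describe the same vector; only the 1D array used to write it down changes with the basis.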

orchid forge May 11, 2024, 9:30 AM

#

Does anyone know how to use selenium in python?

#

I've been watching videos for it, but if anyone has some kinda developers documents for it, please do recommend

deep veldt May 11, 2024, 9:37 AM

#

orchid forge I've been watching videos for it, but if anyone has some kinda developers docume...

https://www.selenium.dev/documentation/

Selenium

The Selenium Browser Automation Project

Selenium is an umbrella project for a range of tools and libraries that enable and support the automation of web browsers.
It provides extensions to emulate user interaction with browsers, a distribution server for scaling browser allocation, and the infrastructure for implementations of the W3C WebDriver specification that lets you write interc...

orchid forge May 11, 2024, 9:39 AM

#

@deep veldt
Do you use selenium?

deep veldt May 11, 2024, 9:40 AM

#

orchid forge @deep veldt Do you use selenium?

yes

orchid forge May 11, 2024, 9:40 AM

#

You scrape web with that ?

#

So you use XML too?

dense smelt May 11, 2024, 10:32 AM

#

Hey Everybody! Hope y'all Doing Good
I've been creating dash apps through Plotly
a python interactive visualisation tool
and having some errors to solve
need real help

The data is being loaded from aws and is not the desired data
callback error updating ( SchemaTypeValidationError )

if somebody likes to solve it please DM / ask for it
I will post you the full traceback

orchid forge May 11, 2024, 10:45 AM

#

orchid forge You scrape web with that ?

@deep veldt

trim saddle May 11, 2024, 10:46 AM

#

dense smelt Hey Everybody! Hope y'all Doing Good I've been creating dash apps through Plotly...

1.: If it's not the desired data, then I guess your data source or way of loading the data is not correct?
2.: That is not enough information to help debugging it, do you have the full traceback

In general, just post the info here, I doubt ppl want to go more into DM stuff...

dense smelt May 11, 2024, 10:51 AM

#

Nope! we are retrieving data through a defined function, based on Id and port
the query is retrieved
the main issue here is the callback! not the data basically

silver linden May 11, 2024, 11:19 AM

#

Hello guys can anybody help for my university project?

atomic shore May 11, 2024, 11:37 AM

#

silver linden Hello guys can anybody help for my university project?

Ask away, you'll have higher chance of getting answer by just asking it. Anyone who knows/is willing to, will help. (Also check out #❓｜how-to-get-help)

silver linden May 11, 2024, 11:38 AM

#

I see okay thanks

deep veldt May 11, 2024, 11:41 AM

#

orchid forge @deep veldt

yes

dull flare May 11, 2024, 12:41 PM

#

anyone familiar with tensorflow? I'm trying to run a model on my local machine, which has a CUDA GPU, but it's automatically being trained on the CPU and that's freaking slow. Can someone help me select GPU mode to train my model?

cedar tusk May 11, 2024, 3:31 PM

#

dull flare anyone familiar with tensorflow? Im trying to run a model on my local machine co...

tf.config.list_physical_devices('GPU')
run this and see what pops up

cedar tusk May 11, 2024, 3:33 PM

#

dull flare anyone familiar with tensorflow? Im trying to run a model on my local machine co...

btw what is ur gpu?

hasty grail May 11, 2024, 3:43 PM

#

dull flare anyone familiar with tensorflow? Im trying to run a model on my local machine co...

if you're on windows, you'll need to use WSL2 for newer versions of tensorflow: https://www.tensorflow.org/install/pip#windows-native

TensorFlow

Install TensorFlow with pip

dull flare May 11, 2024, 3:57 PM

#

cedar tusk `tf.config.list_physical_devices('GPU')` run this and see what pops up

im on windows so GPU is not supported unfortunately

dull flare May 11, 2024, 3:57 PM

#

cedar tusk btw what is ur gpu?

very old one rtx 3050ti

cedar tusk May 11, 2024, 3:58 PM

#

dull flare im on windows so GPU is not supported unfortunately

that is wrong

#

i installed tf for windows on gpu before

#

it was a big hassle but its possible

dull flare May 11, 2024, 3:58 PM

#

after tf 2.10 its not supported

cedar tusk May 11, 2024, 3:59 PM

#

whaa

#

that makes no sense

dull flare May 11, 2024, 3:59 PM

#

i have to install WSL2

#

if i want to use it

cedar tusk May 11, 2024, 3:59 PM

#

bro just use torch

dull flare May 11, 2024, 3:59 PM

#

hasty grail if you're on windows, you'll need to use WSL2 for newer versions of tensorflow: ...

@cedar tusk check this out

cedar tusk May 11, 2024, 3:59 PM

#

its better anyways

dull flare May 11, 2024, 3:59 PM

#

hm im learning actually

#

currently understanding the arc of CNNs

cedar tusk May 11, 2024, 4:00 PM

#

oh the arc of cnns xD

#

good luck

dull flare May 11, 2024, 4:00 PM

#

whats with that smile 💀

cedar tusk May 11, 2024, 4:00 PM

#

it sounded funny is all

dull flare May 11, 2024, 4:00 PM

#

really how so :0

cedar tusk May 11, 2024, 4:01 PM

#

arc as in chapter

dull flare May 11, 2024, 4:01 PM

#

ohohohohhoh hahahaha

#

🤣

cedar tusk May 11, 2024, 4:01 PM

#

so you have begun a new chapter which is like a boss of some sorts

dull flare May 11, 2024, 4:01 PM

#

i get ya lol

odd meteor May 11, 2024, 6:01 PM

#

gritty vessel Hey guys does anyone know about data papers?

Our 2020 EMNLP paper is a data paper. https://aclanthology.org/2020.findings-emnlp.195/
I'll also add Prof. Ignatius' brilliant work in creating a dataset for benchmarking the Igbo-English Machine Translation task. https://arxiv.org/abs/2004.00648

ACL Anthology

Participatory Research for Low-resourced Machine Translation: A Cas...

Wilhelmina Nekoto, Vukosi Marivate, Tshinondiwa Matsila, Timi Fasubaa, Taiwo Fagbohungbe, Solomon Oluwole Akinola, Shamsuddeen Muhammad, Salomon Kabongo Kabenamualu, Salomey Osei, Freshia Sackey, Rubungo Andre Niyongabo, Ricky Macharm, Perez Ogayo, Orevaoghene Ahia, Musie Meressa Berhe, Mofetoluwa Adeyemi, Masabata Mokgesi-Selinga, Lawrence Okeg...

arXiv.org

Igbo-English Machine Translation: An Evaluation Benchmark

Although researchers and practitioners are pushing the boundaries and enhancing the capacities of NLP tools and methods, works on African languages are lagging. A lot of focus on well resourced languages such as English, Japanese, German, French, Russian, Mandarin Chinese etc. Over 97% of the world's 7000 languages, including African languages, ...

gritty vessel May 11, 2024, 6:07 PM

#

odd meteor Our 2020 EMNLP paper is a data paper. https://aclanthology.org/2020.findings-emn...

Thank you @odd meteor. I found a few data paper templates, but reading a published work makes it easier to proceed.

lapis sequoia May 11, 2024, 7:58 PM

#

serene scaffold There are no certificates that matter for careers in data science. Have you been...

No, I have a job. I just remember some dude said something and I was like “who cares”?

lapis sequoia May 11, 2024, 8:32 PM

#

Rough take: this 'Data Science trend' is starting to feel like 2016 crypto. Most people do not need to ever use anything beyond matplotlib, pandas, sklearn, statsmodels, etc. Most people, especially if they are not engineers, do not need to know any form of deep learning at all. I remember one 'Data Science' discord server was talking about linear algebra, like it was so important. I am not saying it is not important, but it is an undergrad math class and they are acting so incredibly pretentious about something I took when I was 19, and I would bet all of the money in the world that they never even took that class. Like, I do not know, a lot of people do this all of the time and are not good, do it for money, or do it because they think it will make them an insane amount of money. IT WILL NOT MAKE YOU AN INSANE AMOUNT OF MONEY. There are people who are terrible who get paid well due to LinkedIn connections and do nothing. People need to apply 'Data Science' to things that are not total nonsense and serve some sort of purpose.

silver linden May 12, 2024, 12:04 AM

#

guys can anybody check #1035199133436354600 channel

craggy agate May 12, 2024, 12:44 AM

#

Switching from Tensorflow to Pytorch, any advice?

left tartan May 12, 2024, 1:49 AM

#

lapis sequoia Rough take: this 'Data Science trend' is starting to feel like 2016 crypto. Most...

Classic hype cycle, perhaps, https://en.m.wikipedia.org/wiki/Gartner_hype_cycle

lapis sequoia May 12, 2024, 5:15 AM

#

@left tartan is this becoming a hype train? I see people who follow 120,000 people and act like GitHub is Instagram and post like a ridiculous amount of stuff to their repositories. It seems like this is indeed a hype train for whatever reason.

#

Feels like the 2000s hip hop era when rappers made 10,000,000 mixtapes filled with just garbage. This is starting to look like a hustle, not like the gold mine, but the Data Mine. Oh lord.

lapis sequoia May 12, 2024, 5:19 AM

#

craggy agate Switching from Tensorflow to Pytorch, any advice?

Yeah, just stop. Quit. Go get a job.

ivory quarry May 12, 2024, 7:06 AM

#

lapis sequoia @left tartan is this becoming a hype train? I see people who follow 120...

people just wanna attach ai to anything nowadays in most cases and hope they have found a gold mine

neon island May 12, 2024, 8:01 AM

#

lapis sequoia Rough take: this 'Data Science trend' is starting to feel like 2016 crypto. Most...

I agree with your final statement, "People need to apply 'Data Science' to things that are not total nonsense and serve some sort of purpose."

Yet observe that linear algebra is first taught to youngsters as the Number Line and Subtraction in primary school. The introductory course in linear algebra given to most undergrads is about as rudimentary and as far removed from most applied linear algebra as a grade-schooler who can add and subtract on a number line is from that introductory undergrad math course. 😉

#

I've taken that course, numerical linear algebra, and I'd add a year of abstract algebra, but I am only a beginner tbh. It may be less pretentiousness, and more just a sign that math underpins challenging work that serves some sort of purpose.

left tartan May 12, 2024, 9:08 AM

#

lapis sequoia Yeah, just stop. Quit. Go get a job.

That doesn't seem very helpful or constructive. Regardless of your feelings on the topic, this channel and server should be a positive place.

left tartan May 12, 2024, 9:12 AM

#

lapis sequoia @left tartan is this becoming a hype train? I see people who follow 120...

Sticking to just the ML part; sure, ML is certainly in the hype phase, but real problems are being solved with it and it's made great advances in a short period of time. The shape of the hype curve feels very different than blockchain, where it was unclear whether it would be useful for broader applications besides crypto

hollow escarp May 12, 2024, 9:40 AM

#

Hi, I'm currently working on deploying my license plate recognition system for production use, and I'm using onnxruntime for that. I'm wondering whether using Docker to deploy the application to a Raspberry Pi is a good solution, or whether it's not necessarily needed. I know that images with the CUDA runtime take a lot of space (about 3 GB for just the NVIDIA CUDA image), and with my requirements I end up with a Docker image of about 6 GB. So I'm trying to figure out whether I should deploy Docker images, or just ship my code and do the necessary installation on the device side.

hollow escarp May 12, 2024, 12:02 PM

#

hollow escarp Hi, im currently working on deploying my license plate recognition system to for...

Also if not docker what alternatives do you suggest?

versed cobalt May 12, 2024, 1:20 PM

#

.cmds

hollow escarp May 12, 2024, 1:44 PM

#

versed cobalt .cmds

That's to me?

#

I don't think your message makes any sense

left tartan May 12, 2024, 3:04 PM

#

hollow escarp Hi, im currently working on deploying my license plate recognition system to for...

I think a lot of this probably has to do with your bottlenecks/performance profile. Can a pi actually run this workload?

hollow escarp May 12, 2024, 3:16 PM

#

left tartan I think a lot of this probably has to do with your bottlenecks/performance profi...

I think ye, there is even a guide in onnxruntime dedicated to raspberry pi

#

But my bigger issue is the size of docker imgs

#

Because im using mender as my OTA Updates service

#

And handling such big files results in errors indicating that I need more RAM to perform such tasks. Of course I can buy an 8 GB controller instead of the 2 GB one, but I'm thinking about other ways to do that

past meteor May 12, 2024, 3:28 PM

#

hollow escarp But my bigger issue is the size of docker imgs

If you're worried about the size of your images you should look into using alpine linux as your base image. On top of that, you should likely use multi-stage builds so you only have what is strictly necessary and nothing more in your final image.

That said, they'll probably be pretty beefy either way.

toxic mortar May 12, 2024, 3:43 PM

#

i got ~5,6 % worse results with XGB than my neural net

#

XGB took me 20mins to set up, neural nets 2 weeks

past meteor May 12, 2024, 3:44 PM

#

did you hyperparam tune xgb?

#

Maybe if you tune it to the extent you did your network you'll have the same results

toxic mortar May 12, 2024, 3:44 PM

#

past meteor did you hyperparam tune xgb?

gridsearch

past meteor May 12, 2024, 3:44 PM

#

hmmmm

#

the most important thing for xgb is honestly the number of estimators

toxic mortar May 12, 2024, 3:45 PM

#

how sparse are u aiming for to be

past meteor May 12, 2024, 3:45 PM

#

You might as well run that in a "line" and only do that hyperparameter

#

And not considering other ones

toxic mortar May 12, 2024, 3:46 PM

#

param_grid = {
    'xgbclassifier__n_estimators': [50, 100, 200, 300, 400],
    'xgbclassifier__max_depth': [1, 3, 5, 7, 9],
    'xgbclassifier__learning_rate': [0.01, 0.1, 0.2],
    'xgbclassifier__subsample': [0.5, 0.7, 0.8, 0.9, 1.0],
    'xgbclassifier__colsample_bytree': [0.5, 0.7, 0.8, 0.9, 1.0]
}

#

Okay Imma try more fine-grain n_est, with 25 step [100,400] and remove min max values for rest

past meteor May 12, 2024, 3:47 PM

#

yeah imo you should do random search as well

#

grid isn't great

toxic mortar May 12, 2024, 3:48 PM

#

why not? I always prefer grid if I have enough computation resources

past meteor May 12, 2024, 3:48 PM

#

Because some hyperparameters are 100 % uninformative

#

If you grid search you waste compute by spending time on them

#

I think random -> grid is a good one
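A rough sketch of the grid versus random trade-off discussed above, using only the standard library. It reuses the search space posted earlier, assuming the last `subsample` value was meant to be `1.0`; the helper names `grid_candidates` and `random_candidates` and the budget of 60 draws are made up for illustration and are not sklearn's API.

```python
import itertools
import random

# Search space from the grid posted in the chat (assumption: subsample ends in 1.0).
param_grid = {
    "n_estimators": [50, 100, 200, 300, 400],
    "max_depth": [1, 3, 5, 7, 9],
    "learning_rate": [0.01, 0.1, 0.2],
    "subsample": [0.5, 0.7, 0.8, 0.9, 1.0],
    "colsample_bytree": [0.5, 0.7, 0.8, 0.9, 1.0],
}


def grid_candidates(grid):
    """Every combination, which is what an exhaustive grid search evaluates."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]


def random_candidates(grid, n_iter, seed=0):
    """n_iter independent draws, one value per hyperparameter per draw."""
    rng = random.Random(seed)
    return [{key: rng.choice(values) for key, values in grid.items()}
            for _ in range(n_iter)]


full_grid = grid_candidates(param_grid)
sampled = random_candidates(param_grid, n_iter=60)

print(len(full_grid))  # 5 * 5 * 3 * 5 * 5 combinations, each one a full model fit
print(len(sampled))    # fixed budget, regardless of how many parameters are listed
```

Each random draw picks a fresh value for every hyperparameter, so a fixed budget still covers the range of the parameters that matter even when some of the others turn out to be uninformative. In sklearn the corresponding tools are `GridSearchCV` and `RandomizedSearchCV`.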

toxic mortar May 12, 2024, 3:49 PM

#

Ye that's an interesting take. I always thought gs is the more thorough one tbh, cuz u get to try different solutions throughout the whole range

#

but i get what u mean

#

ill try

cedar tusk May 12, 2024, 3:52 PM

#

yea ml models being a black box is hardly good for the industry but what can you do? they work most of the time

past meteor May 12, 2024, 3:53 PM

#

toxic mortar Ye thats intresting take I always thought gs is more thorough one tbh cuz u get ...

Idk if sklearn gives you this output already but if you tune with optuna the dashboard gives you hyperparameter importance

toxic mortar May 12, 2024, 3:54 PM

#

past meteor Idk if sklearn gives you this output allready but if you tune with optuna the da...

I dont know about hyperparam but I know about shap and input features influence, which is cool

thorn cairn May 12, 2024, 4:13 PM

#

what alternative is there for comparing paired data? i tried stats.ttest_rel but i have a different length of data

cedar tusk May 12, 2024, 4:17 PM

#

thorn cairn what alternative is there for comparing paired data? i tried stats.ttest_rel but...

different length paired data?

#

they are not paired then

thorn cairn May 12, 2024, 4:18 PM

#

hmm how do i explain this

cedar tusk May 12, 2024, 4:18 PM

#

paired ttest looks at the differences between paired occurrences

#

if they are of different length then that cannot be done
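To make the point about pairing concrete, here is a minimal standard-library sketch of the paired t statistic. The function name `paired_t_statistic` and the sample numbers are hypothetical; the only thing it is meant to show is that the test works on per-pair differences, so both samples must line up one to one.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t statistic of the per-pair differences (after - before)."""
    if len(before) != len(after):
        raise ValueError("paired samples need one 'after' value for every 'before' value")
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n))


# Made-up data: books borrowed by the same five students in two periods.
first_year = [2, 3, 1, 4, 2]
later_years = [3, 5, 2, 4, 4]

t_value = paired_t_statistic(first_year, later_years)
print(round(t_value, 3))
```

If the two lists describe different people, or have different lengths, there are no pairs to subtract, and an independent-samples test is the usual fallback.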

thorn cairn May 12, 2024, 4:19 PM

#

i gave out a survey that asks in what semester did they borrow a specific genre of book, and the table looks a bit like this

#

i just split the records so that they have an atomic value

#

so the hypothesis is, is there an increase in book borrowers after their first year?

#

doesnt that qualify as a ttest_rel?

cedar tusk May 12, 2024, 4:22 PM

#

what is the 3rd column?

#

all the semesters they borrowed the book?

hollow escarp May 12, 2024, 4:24 PM

#

past meteor If you're worried about the size of your images you should look into using alpin...

Im doing that already, but onnx , opencv etc also require a bit of additional space and extra tools

#

Which are like 2gb

#

nvidia gpu thats also 2gb

#

So thats why im wondering whats the best way to deploy apps using some trained models to "field devices"

#

Or how do you deploy generally apps which uses object detection

#

https://onnxruntime.ai/docs/tutorials/iot-edge/rasp-pi-cv.html#prerequisites and also onnxruntime supports raspberry pi so im thinking about deploying source code and downloading necessary stuff on device side and than just running the main script

onnxruntime

IoT Deployment on Raspberry Pi

thorn cairn May 12, 2024, 4:38 PM

#

cedar tusk all the semesters they borrowed the book?

yes

#

or should i split it like this,
translation: Semester borrowed, semester not borrowed

#

honestly idk what im doing 🔥

cedar tusk May 12, 2024, 5:10 PM

#

thorn cairn honestly idk what im doing 🔥

yes i can see that xD

thorn cairn May 12, 2024, 5:10 PM

#

honestly im cooked for tmr, its alright

#

i changed to stats.ttest_ind

cedar tusk May 12, 2024, 5:11 PM

#

what i would do is do the test for each semester on its own

#

to see if there were semesters that were different from each other

vernal phoenix May 12, 2024, 5:11 PM

#

Hello everyone, I'm currently working on a python q-learning project in the context of pac man. I'm still a bit weak in programming and wondering if someone could help out

cedar tusk May 12, 2024, 5:12 PM

#

thorn cairn honestly im cooked for tmr, its alright

begin with making dummy variables to check when a book is taken on a semester

#

then do anova to see if the means differ
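A small sketch of the ANOVA step suggested above, written with the standard library only. The per-semester numbers are invented, and `one_way_anova_f` is a hypothetical helper, not a library function; `scipy.stats.f_oneway` is the usual ready-made version.

```python
import statistics


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of samples (one list per group)."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Made-up counts of books borrowed per student, grouped by semester.
semester_1 = [1, 2, 3]
semester_2 = [2, 3, 4]
semester_3 = [5, 6, 7]

f_value = one_way_anova_f([semester_1, semester_2, semester_3])
print(f_value)
```

Unlike the paired test, the groups here do not need to have the same length.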

cedar tusk May 12, 2024, 5:14 PM

#

vernal phoenix Hello everyone, I'm currently working on a python q-learning project in the cont...

ask away

vernal phoenix May 12, 2024, 5:14 PM

#

Thanks

#

I don't want to dump in a bunch of code in here

#

So I'll give the context for you

#

I've recreated the pacman game on python using the pygame module

#

But my project is more centered around developing an A.I which learns through reinforcement learning (Q-learning)

cedar tusk May 12, 2024, 5:15 PM

#

and now you are trying to code in the behaviour of the ghosts?

vernal phoenix May 12, 2024, 5:15 PM

#

No the ghosts are fine

#

It's actually coding in an A.I pacman where I'm struggling

cedar tusk May 12, 2024, 5:15 PM

#

oh u are trying to build an ai for the pacman itself

#

ok i see

vernal phoenix May 12, 2024, 5:15 PM

#

Yea

#

Well whenever I've implemented my q-learning the pacman takes the inputs and moves around just fine

#

However, the A.I doesn't seem to actually improve at the game at all

cedar tusk May 12, 2024, 5:16 PM

#

have u given it enough iterations?

vernal phoenix May 12, 2024, 5:16 PM

#

I'm questioning myself about it honestly

#

I can't tell if it needs more iterations or if my implementation is weak

cedar tusk May 12, 2024, 5:16 PM

#

for how long u let it run?

#

and is the algo implemented properly?

vernal phoenix May 12, 2024, 5:17 PM

#

The one currently 3 hours

#

About 310 iterations

cedar tusk May 12, 2024, 5:17 PM

#

u should have seen some improvement then

#

u tried changing ur parameters?

vernal phoenix May 12, 2024, 5:17 PM

#

Nothing apparent, which makes me believe something is wrong

#

Slightly for parameters. My epsilon starts at 0.9 and decreases to 0.1 after about 200-something iterations

#

My alpha is at 0.1 and gamma is 0.9, I've tried using alpha 0.15 and gamma 0.85
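For reference, a minimal tabular Q-learning sketch using the alpha, gamma, and epsilon values mentioned above. The action set, the state encoding as grid coordinates, the linear epsilon schedule, and all function names are assumptions for illustration; the actual project code was not posted in the channel.

```python
import random
from collections import defaultdict

ALPHA = 0.1   # learning rate mentioned in the chat
GAMMA = 0.9   # discount factor mentioned in the chat
ACTIONS = ["up", "down", "left", "right"]  # hypothetical action set

# Q-table: state -> {action: estimated value}, defaulting to zeros.
q_table = defaultdict(lambda: {action: 0.0 for action in ACTIONS})


def epsilon_for(episode, start=0.9, end=0.1, decay_episodes=200):
    """Linear decay from start to end over decay_episodes, then constant."""
    fraction = min(episode / decay_episodes, 1.0)
    return start + fraction * (end - start)


def choose_action(state, episode, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon_for(episode):
        return rng.choice(ACTIONS)
    values = q_table[state]
    return max(values, key=values.get)


def update(state, action, reward, next_state, done):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', .)."""
    best_next = 0.0 if done else max(q_table[next_state].values())
    target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (target - q_table[state][action])


update(state=(1, 1), action="right", reward=10.0, next_state=(1, 2), done=False)
print(q_table[(1, 1)]["right"])
```

Two things worth checking against a sketch like this when an agent does not improve: whether the update uses the next state's maximum (not the current state's), and whether the state representation is small enough that states are actually revisited within a few hundred episodes.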

cedar tusk May 12, 2024, 5:18 PM

#

https://paste.pythondiscord.com/
paste ur code here and send it to me

#

i want to take a look

vernal phoenix May 12, 2024, 5:19 PM

#

Alright, some of the code is a bit long so there might be some fluff here and there

#

Send it via dm's correct?

cedar tusk May 12, 2024, 5:19 PM

#

yea

craggy agate May 12, 2024, 6:08 PM

#

lapis sequoia Yeah, just stop. Quit. Go get a job.

Wdym quit?

#

Switching from Tensorflow to Pytorch, any advice?

serene scaffold May 12, 2024, 6:20 PM

#

craggy agate Switching from Tensorflow to Pytorch, any advice?

welcome to the winning team

#

soon you'll be tired of winning.

craggy agate May 12, 2024, 6:21 PM

#

serene scaffold welcome to the winning team

Lmao what are these troll answers haha. Another guy said quit and get a job and you are welcoming me to the winning team?

#

Lol

spring field May 12, 2024, 6:56 PM

#

I mean, torch do be way cooler than tf

#

that other guy clearly doesn't like the hype 😁

craggy agate May 12, 2024, 6:56 PM

#

Anything that I should be aware about?

craggy agate May 12, 2024, 6:57 PM

#

spring field that other guy clearly doesn't like the hype 😁

🗣️🔥🔥They hate us cause they can't be us 😤

#

😂

iron basalt May 12, 2024, 6:57 PM

#

craggy agate Anything that I should be aware about?

In theory TF is better, but it's a Google project...

craggy agate May 12, 2024, 6:58 PM

#

iron basalt In theory TF is better, but it's a Google project...

Yes, but the industry is shifting to Pytorch tho.

spring field May 12, 2024, 6:58 PM

#

craggy agate Anything that I should be aware about?

ummm, activation functions are separate from layers, loss functions are separate from optimizers, stuff like that would be one difference that I have noticed I guess

craggy agate May 12, 2024, 6:58 PM

#

+I can use my Mac gpu with Pytorch

iron basalt May 12, 2024, 6:58 PM

#

craggy agate Yes, but the industry is shifting to Pytorch tho.

Yes, because Google projects are all sinking ships.

craggy agate May 12, 2024, 6:59 PM

#

iron basalt Yes, because Google projects are all sinking ships.

So in theory that makes TF worse?

iron basalt May 12, 2024, 6:59 PM

#

They have high upfront effort, and then are abandoned (due to internal company reward system).

craggy agate May 12, 2024, 6:59 PM

#

spring field ummm, activation functions are separate from layers, loss functions are separate...

So mainly syntax difference?

spring field May 12, 2024, 6:59 PM

#

sort of I guess, pytorch also allows for greater freedom afaik

craggy agate May 12, 2024, 7:00 PM

#

spring field sort of I guess, pytorch also allows for greater freedom afaik

I see

iron basalt May 12, 2024, 7:00 PM

#

craggy agate So in theory that makes TF worse?

TF can in theory compile all the shaders together into a nice fast one via its compute graph, but in practice it does not do that, and TF kept breaking a lot of stuff with new versions. Pretty much all old papers that used it are dead and can't be reproduced without large amounts of painful work.

#

Torch on the other hand, still works.

spring field May 12, 2024, 7:01 PM

#

spring field sort of I guess, pytorch also allows for greater freedom afaik

in that you can like implement whatever you want, it's basically numpy, but can run on the GPU (so, sort of like cupy), but it also handles the differentiation for you and such

#

and ofc, they're much better with backwards compat

craggy agate May 12, 2024, 7:01 PM

#

spring field in that you can like implement whatever you want, it's basically numpy, but can ...

GPU is one of the main reasons I am switching.

craggy agate May 12, 2024, 7:02 PM

#

iron basalt TF can in theory compile all the shaders together into a nice fast one via its c...

Yeah ig that's true

iron basalt May 12, 2024, 7:02 PM

#

Torch does have a bunch of extra tools for when you need even more speed though, so after experimentation you can lock it in, and optimize it further.

#

TF was just supposed to do that by default / be built in / all work on the compute graph, but never really got there.

#

Putting in a ton of effort into that when all the models were changing so quickly was a waste. Better to optimize after and instead have better iteration speed.

#

(But really the main issue is that Google projects are all sinking ships, so not a great idea to build everything on)

wooden sail May 12, 2024, 7:15 PM

#

spring field in that you can like implement whatever you want, it's basically numpy, but can ...

lemme introduce you to jax real quick https://jax.readthedocs.io/en/latest/index.html literally numpy with jit and autodiff

spring field May 12, 2024, 7:17 PM

#

I think zestar already recommended me to take a look at it, but thanks anyway 😄 I will check it out eventually

odd meteor May 12, 2024, 7:30 PM

#

craggy agate Switching from Tensorflow to Pytorch, any advice?

Just Do It 🚀

craggy agate May 12, 2024, 7:48 PM

#

Agreed

hollow escarp May 12, 2024, 8:12 PM

#

hollow escarp https://onnxruntime.ai/docs/tutorials/iot-edge/rasp-pi-cv.html#prerequisites and...

If anyone has some experience with deploying an object detection system to devices like a Raspberry Pi (including OTA updates for software), it would be cool if they could share details on how they solved the problem of deploying such software for production use.

warped ocean May 12, 2024, 9:22 PM

#

am i allowed to post youtube links when asking for help? im trying to follow along in a tutorial that's about a year old and am trying to copy their environment setup, but the newer python version setup has some visual differences that are a bit confusing

cedar tusk May 12, 2024, 9:25 PM

#

warped ocean am i allowed to post youtube links when asking for help? im trying to follow alo...

dm me

craggy agate May 12, 2024, 9:44 PM

#

cedar tusk dm me

ok thanks, will do

austere hemlock May 12, 2024, 9:45 PM

#

does anybody here do t levels course

#

cus i really need help asap

#

and the esp is soo close

#

starts tomorrow

desert oar May 12, 2024, 9:48 PM

#

warped ocean am i allowed to post youtube links when asking for help? im trying to follow alo...

Yes you can. But note that this server has a low opinion of YT tutorials and you might get non-YT suggestions for alternative resources

wary vortex May 12, 2024, 10:25 PM

#

Guys, what are the possible reasons that my lstm model is not learning (the prediction is always 0). It is a binary text classification model.

spring field May 12, 2024, 11:04 PM

#

wary vortex Guys, what are the possible reasons that my lstm model is not learning (the pred...

maybe the text is not encoded in an embedding space and you're training on indices instead
maybe you haven't accounted for a dataset imbalance or have over/underestimated it and thus put a ton more weight on one category compared to the other
idk, could be a lot of reasons, we'd need more information from you, such as the code if possible, what library you're using for this, idk, some diagrams if relevant and such

wary vortex May 12, 2024, 11:07 PM

#

spring field maybe the text is not encoded in an embedding space and you're training on indic...

They are encoded and embedded. It is 6 to 4(classes) and it is shuffled. this is the code for model and training:
https://pastebin.com/neQeEyZK

Pastebin

class LSTM(nn.Module): def __init__(self, num_emb, output_size, ...

Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.

spring field May 12, 2024, 11:10 PM

#

I might be mistaken but 6 to 4 categories don't seem to quite classify (pun intended?) as binary (2) classification, maybe I'm just misunderstanding something

wary vortex May 12, 2024, 11:12 PM

#

spring field I might be mistaken but 6 to 4 categories don't seem to quite classify (pun inte...

6 to 4 is not the category number, it is the ratio of the data for classes

spring field May 12, 2024, 11:13 PM

#

ah, that

wary vortex May 12, 2024, 11:13 PM

#

sorry for not explaining it very well.

spring field May 12, 2024, 11:16 PM

#

wary vortex They are encoded and embedded. It is 6 to 4(classes) and it is shuffled. this is...

alright, another thing, line 16 causes it to always go the else branch, is that intended?

wary vortex May 12, 2024, 11:19 PM

#

spring field alright, another thing, line 16 causes it to always go the `else` branch, is tha...

Yes

#

it gives an error otherwise

surreal hemlock May 12, 2024, 11:19 PM

#

tada

wary vortex May 12, 2024, 11:19 PM

#

The entire code, in all its glory, depends on line 16

spring field May 12, 2024, 11:21 PM

#

alright, well, I'm not particularly familiar with LSTMs yet, but I can't imagine that resetting its memory and hidden state before every epoch is the right thing to do

wary vortex May 12, 2024, 11:21 PM

#

when I move it to init part so it doesnt always go to else, it gives error "RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward."

wary vortex May 12, 2024, 11:22 PM

#

spring field alright, well, I'm not yet particularly familiar with LSTMs yet, but I can't ima...

I know but it gives an error for no reason

spring field May 12, 2024, 11:22 PM

#

spring field alright, well, I'm not yet particularly familiar with LSTMs yet, but I can't ima...

I say that because I don't know what part the nn.LSTM object handles internally itself, so it might be the case that it's saving some of that state, but then you probably wouldn't need to pass it in yourself...

spring field May 12, 2024, 11:23 PM

#

wary vortex when I move it to init part so it doesnt always go to else, it gives error "Runt...

I think you may need to detach those tensors first

wary vortex May 12, 2024, 11:23 PM

#

spring field I think you may need to detach those tensors first

I tried detaching which made the model still incapable of learning.

spring field May 12, 2024, 11:24 PM

#

well, did you get rid of line 16 when you tried to detach?

wary vortex May 12, 2024, 11:24 PM

#

spring field well, did you get rid of line 16 when you tried to detach?

yes

#

I also tried working with hidden and memory outside the model but that did not work either

spring field May 12, 2024, 11:26 PM

#

wait wait

#

on line 20

#

or rather, in that if branch, you don't reassign self.hidden and self.memory

wary vortex May 12, 2024, 11:27 PM

#

So, u would be right. However due to 16, it never comes to 20

spring field May 12, 2024, 11:28 PM

#

try to detach and use output, (self.hidden, self.memory) on line 20 and get rid of line 16

wary vortex May 12, 2024, 11:29 PM

#

It is training, It takes 3 mins for an epoch so see ya in 3 mins

wary vortex May 12, 2024, 11:32 PM

#

spring field try to detach and use `output, (self.hidden, self.memory)` on line 20 and get ri...

I am back, and unfortunately, it did not work 😦

spring field May 12, 2024, 11:44 PM

#

mmm

spring field May 13, 2024, 12:08 AM

#

wary vortex I am back, and unfortunately, it did not work 😦

alright, I think I know what the actual issue might be, you're not applying an activation function after your fully connected layer

wary vortex May 13, 2024, 12:17 AM

#

spring field alright, I think I know what the actual issue might be, you're not applying an a...

I see

#

It did not work either though

spring field May 13, 2024, 12:33 AM

#

can you show your current code?

wary vortex May 13, 2024, 12:54 AM

#

spring field can you show your current code?

https://pastebin.com/j7r1gMjF

Pastebin

class LSTM(nn.Module): def __init__(self, num_emb, output_size, ...

Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.

spring field May 13, 2024, 1:16 AM

#

wary vortex https://pastebin.com/j7r1gMjF

alright, new idea... 😁

return self.act(self.fc_out(output.view(output.size(0) * output.size(1), -1)))

#

oh and you'll need to then just pass pred to the loss function instead of pred[:, -1, :]

#

although pithink

#

it'll still probably error out in the loss function

wary vortex May 13, 2024, 1:19 AM

#

Trying it rn

#

thx

spring field May 13, 2024, 1:21 AM

#

spring field alright, new idea... 😁 ```py return self.act(self.fc_out(output.view(output.siz...

mmm, I don't think this is the right architecture for this, because that's for this

#

but you're doing this

#

😩

wary vortex May 13, 2024, 1:22 AM

#

spring field alright, new idea... 😁 ```py return self.act(self.fc_out(output.view(output.siz...

this line gave an error "RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead."

spring field May 13, 2024, 1:23 AM

#

oh... well, try the same but use .reshape instead of .view

#

it'll still definitely fail on the loss function

wary vortex May 13, 2024, 1:24 AM

#

spring field oh... well, try the same but use `.reshape` instead of `.view`

there is a problem due to tensor size " raise ValueError(f"Target size ({target.size()}) must be the same as input size ({input.size()})")
ValueError: Target size (torch.Size([512, 1])) must be the same as input size (torch.Size([256000, 1]))"
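The sizes in that error can be unpacked with plain arithmetic. The sketch below uses nested lists with tiny stand-in dimensions instead of tensors; reading 512 as the batch size and 500 as the sequence length is an inference from the error message, not something stated in the chat.

```python
# 256000 rows of predictions against 512 targets: one row per timestep per sequence.
implied_seq_len = 256000 // 512
print(implied_seq_len)

batch_size, seq_len, hidden = 4, 3, 2  # tiny stand-ins for the real sizes

# One hidden vector per timestep per sequence: shape (batch_size, seq_len, hidden).
output = [[[float(b * 10 + t)] * hidden for t in range(seq_len)]
          for b in range(batch_size)]

# Flattening batch and time together gives one row per timestep.
flattened = [step for sequence in output for step in sequence]

# Keeping only the final timestep gives one row per sequence.
last_step = [sequence[-1] for sequence in output]

print(len(flattened))  # batch_size * seq_len
print(len(last_step))  # batch_size, which matches one label per sequence
```

With one label per text, only the second form lines up with the targets, which is what indexing the final timestep does.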

wary vortex May 13, 2024, 1:28 AM

#

spring field but you're doing this

so is it not receiving memory and history?

spring field May 13, 2024, 1:28 AM

#

no no, it is what you should be doing

spring field May 13, 2024, 1:28 AM

#

spring field alright, new idea... 😁 ```py return self.act(self.fc_out(output.view(output.siz...

it's just that what I wanted you to do would do the first thing

#

which is not what you want I don't think

#

so anyway, scratch that idea

wary vortex May 13, 2024, 1:30 AM

#

So should I switch back to original?

spring field May 13, 2024, 1:31 AM

#

I frankly am not sure how to help at this point, I'd have to come back after having practiced this myself, though I still think I have a rough idea where the problem may exist, though there are some things I don't know about your code, but yeah, I'll let someone else chime in with their ideas, sorry, it's the best i could do for now bread_pensive

wary vortex May 13, 2024, 1:31 AM

#

Thanks a lot

#

I am new to lstms and I have no idea what I am doing

spring field May 13, 2024, 1:33 AM

#

yeah, me too, I literally just a couple days ago managed to implement a "standard" RNN (as in, not using the built-in thing) but yeah, still learning 😁

wary vortex May 13, 2024, 1:35 AM

#

spring field yeah, me too, I literally just a couple days ago managed to implement a "standar...

well, I wish good luck to both of us

dawn light May 13, 2024, 2:01 AM

#

I'm currently following a prediction/regression problem (https://www.youtube.com/watch?v=Wqmtf9SA_kk)

I'm having trouble understanding exactly what happened at around 14:00 where he applied a log transform in order to deal with skewed data
Can someone explain (or point me to some resources) to me why exactly this works/why it's valid because afaik it's a nonlinear transformation right? won't it affect the models if we change the distribution of the features?

wary vortex May 13, 2024, 3:22 AM

#

dawn light I'm currently following a prediction/regression problem (https://www.youtube.com...

The logarithmic function is really useful for compressing extreme values (not really getting rid of them), since its slope is decreasing. So it simply reduces the difference between extreme values and the rest, which would be considered normal. The reason it is valid is that it stabilizes variance and reduces the impact of extreme values through that compression.
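A quick standard-library illustration of that compression. The values are made up, and `max_to_median` is just a hypothetical helper for comparing how far the largest value sits from the typical one before and after the transform.

```python
import math
import statistics

# Made-up right-skewed feature: most values are small, two are extreme.
incomes = [20, 25, 30, 35, 40, 45, 60, 80, 400, 2500]

# log1p(x) = log(1 + x), which also behaves well for values at or near zero.
logged = [math.log1p(v) for v in incomes]


def max_to_median(values):
    """How many times larger the biggest value is than the median."""
    return max(values) / statistics.median(values)


print(max_to_median(incomes))
print(max_to_median(logged))

# The transform is monotonic, so the ordering of samples is unchanged,
# and it is invertible with expm1 when predictions need the original scale.
restored = [math.expm1(v) for v in logged]
```

The transform does change the distribution the model sees, which is the point: the ordering is preserved, while the gaps between extreme and typical values shrink.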

merry ridge May 13, 2024, 6:54 AM

#

dawn light I'm currently following a prediction/regression problem (https://www.youtube.com...

The usual path you would need to derive this is to write down a first order Taylor approximation of the unknown function which stabilizes the variance, and solve the resulting differential equation. In most cases, it just happens that a log is going to be close enough, but for something like electricity markets, you will generally need to optimize it to get a good result.
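The derivation described above can be sketched as follows, assuming the standard deviation of $X$ is a known function $\sigma(\mu)$ of its mean $\mu$ and that $c$ and $k$ are positive constants (these symbols are introduced here for illustration):

```latex
\begin{aligned}
g(X) &\approx g(\mu) + g'(\mu)\,(X - \mu)
  && \text{first-order Taylor expansion around } \mu \\
\operatorname{Var}[g(X)] &\approx g'(\mu)^2\,\sigma^2(\mu)
  && \text{variance of the linear approximation} \\
g'(\mu)\,\sigma(\mu) = c \;&\Longrightarrow\; g(\mu) = \int \frac{c}{\sigma(\mu)}\,d\mu
  && \text{require constant variance, solve for } g \\
\sigma(\mu) = k\,\mu \;&\Longrightarrow\; g(\mu) = \frac{c}{k}\,\log\mu + \text{const}
  && \text{spread proportional to the mean gives a log}
\end{aligned}
```

So the log is the variance-stabilizing choice when the spread grows in proportion to the level. Other relationships between spread and mean lead to other transforms, for example a square root when the variance (rather than the standard deviation) is proportional to the mean.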

dawn light May 13, 2024, 8:28 AM

#

merry ridge The usual path you would need to derive this is to write down a first order Tayl...

I see, any chance you can point me to a resource on that taylor series?

#

sounds interesting enough

earnest rose May 13, 2024, 9:57 AM

#

Hi all. Who works with Speech Recognition tools, do you know if Faster Whisper has a function to get Confidence Score?

desert oar May 13, 2024, 10:54 AM

#

dawn light I'm currently following a prediction/regression problem (https://www.youtube.com...

It affects the model, of course! It usually helps

desert oar May 13, 2024, 10:56 AM

#

dawn light I see, any chance you can point me to a resource on that taylor series?

they are talking about this https://en.m.wikipedia.org/wiki/Variance-stabilizing_transformation

Variance-stabilizing transformation

In applied statistics, a variance-stabilizing transformation is a data transformation that is specifically chosen either to simplify considerations in graphical exploratory data analysis or to allow the application of simple regression-based or analysis of variance techniques.

#

the Taylor series thing shows up in the section "relation to delta method" but read the other section first to motivate the reasoning

jaunty helm May 13, 2024, 11:15 AM

#

slight tangent, but srsly how do you use PowerTransformer effectively
I feel like every time I tried it it was (sometimes significantly) worse than just taking a log or a sqrt etc

wild sluice May 13, 2024, 12:09 PM

#

bro how tf do i store my weights for multiple linear regression?
i have a 4x2 matrix of features, each row is a different sample

desert oar May 13, 2024, 12:15 PM

#

jaunty helm slight tangent, but srsly how do you use `PowerTransformer` effectively I feel l...

is that something in sklearn?

odd meteor May 13, 2024, 12:15 PM

#

Hi all,

I wanted to share openings at ICLR 2024 with you all in case you're interested
https://whova.com/event-job/iclr-2024-the-twelfth-international-conference-on-learning-representations-job-opportunities/
Good luck!


desert oar May 13, 2024, 12:16 PM

#

wild sluice bro how tf do i store my weights for multiple linear regression? i have a 4x2 ma...

how did you fit the regression? usually the weights are just an array that can be stored in the numpy file format

jaunty helm May 13, 2024, 12:17 PM

#

desert oar is that something in sklearn?

yea, it implements box-cox and yeo-johnson which you can use
problem is I feel they're usually way off and sticking to something static like an np.log usually worked out better for me

wild sluice May 13, 2024, 12:18 PM

#

desert oar how did you fit the regression? usually the weights are just an array that can b...

thats the thing. i don't know what the dimensionality of the weights vector would be

past meteor May 13, 2024, 12:26 PM

#

jaunty helm slight tangent, but srsly how do you use `PowerTransformer` effectively I feel l...

I'm not a fan of it

#

Oh, I misread and thought it was Polynomial transformer

#

Honestly, for all of these you need to plot the residuals versus the target and go from there

#

I had this discussion with a colleague today. You plot the errors and if you notice some sort of heteroscedasticity you act from there. In his case, the data were gamma distributed so actually just taking a log or actually using a gamma posterior (exists for a bunch of models) can make a non-negligible difference
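
A rough numeric stand-in for that residual plot, in case it helps. This is a sketch on simulated data (the gamma noise, the exponential trend and the helper name `residual_spread_ratio` are all assumptions for illustration): fit a line, then compare how spread out the residuals are for large versus small fitted values. A ratio far from 1 is the fan shape you would see in the plot.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, 5_000)
# Hypothetical target with multiplicative gamma noise (mean 1), so spread grows with the mean.
y = np.exp(0.3 * x) * rng.gamma(shape=20.0, scale=1 / 20.0, size=x.size)

def residual_spread_ratio(x, target):
    """Fit a straight line, then compare residual spread for high vs. low fitted values."""
    slope, intercept = np.polyfit(x, target, 1)
    fitted = slope * x + intercept
    resid = target - fitted
    cut = np.median(fitted)
    return resid[fitted >= cut].std() / resid[fitted < cut].std()

ratio_raw = residual_spread_ratio(x, y)          # well above 1: heteroscedastic
ratio_log = residual_spread_ratio(x, np.log(y))  # close to 1 after taking the log
print(f"raw target: {ratio_raw:.2f}   log target: {ratio_log:.2f}")
```

In practice you would still look at the actual scatter plot; the ratio only summarizes one aspect of it.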

desert oar May 13, 2024, 12:47 PM

#

@dawn light it's less about specifically detaching variance from mean and more about exposing differences to the model that are useful on a linear scale

#

that is: if one of your variables spans several orders of magnitude, you might want to log-transform it, because you probably aren't as interested in "plain" differences as you are in differences in order of magnitude

dawn light May 13, 2024, 12:51 PM

#

gotcha, thanks! I think i get it a bit more now

wild sluice May 13, 2024, 12:55 PM

#

@desert oar could u please explain what the dimensionality of my weight vector would be? like does every individual feature have its own weight or does each column of the feature matrix get its own weight?
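
On the shape question: with a feature matrix of shape (samples, features), each column (feature) gets one weight, shared across all rows. A minimal sketch, assuming made-up data with 4 samples and 2 features as in the question; the closed-form `np.linalg.lstsq` fit and the file name `weights.npy` are just for illustration.

```python
import os
import tempfile
import numpy as np

# Hypothetical data: 4 samples (rows), 2 features (columns).
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0]])
y = X @ np.array([2.0, -1.0]) + 4.0          # fixed weights plus a bias of 4

# Append a column of ones so the bias is learned as one extra weight.
X1 = np.hstack([X, np.ones((X.shape[0], 1))])
w, *_ = np.linalg.lstsq(X1, y, rcond=None)   # shape (3,): one weight per column, plus bias

# Store and reload the weights in NumPy's .npy format.
path = os.path.join(tempfile.mkdtemp(), "weights.npy")
np.save(path, w)
w_loaded = np.load(path)
print(w.shape, w_loaded)
```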

past meteor May 13, 2024, 1:03 PM

#

dawn light gotcha, thanks! I think i get it a bit more now

I actually think my answer applies to you too. Consider just plotting the residuals versus your variables and deciding on what transforms you need on the basis of that

#

Modelling is iterative, nothing wrong with doing that 🙂

wild sluice May 13, 2024, 1:23 PM

#

nvm i got the answer

warm trellis May 13, 2024, 1:29 PM

#

Does it make sense to go linear layer from conv1d and then from linear to gru?

hushed kindle May 13, 2024, 1:36 PM

#

Hi guys, I need your help. I am trying to create a churn prediction model in Python using Logical Regression, but the result is not even good: accuracy is 67, recall is 0.57 for Churned and 0.75 for Not Churned, and the ROC-AUC score is 0.72.
What am I doing wrong? I tried a lot of things but I'm not getting a better result.
Do you have any suggestions? A tutorial, a YouTube video, or something like that?

desert oar May 13, 2024, 1:55 PM

#

warm trellis Does it make sense to go linear layer from conv1d and then from linear to gru?

for what purpose?

desert oar May 13, 2024, 1:56 PM

#

hushed kindle Hi guys, I need your help. I am trying to create a churn prediction model in Python ...

what is Logical Regression? did you mean Logistic Regression?

if your model doesn't fit very well, the first thing you need to consider is: do my features actually make sense for predicting the outcome? if you're looking at shoe size and hair color, there's very little chance that any machine learning technique, no matter how advanced, can improve your churn model.

you might indeed benefit from trying different kinds of models, etc. but you have to think about your data first. i don't have any "tutorials" for this because none exist. you're essentially asking for a tutorial to become a mid-level data scientist. as much as i wish it was easy to learn, unfortunately there is a huge amount of material to cover. too many different things to be considered and decided without knowing what you're doing. it's like asking for a YT tutorial on becoming a software engineer.

i think the only other "short-form" advice that anyone can give is that, if your data is "tabular" like an excel spreadsheet, to try a model like xgboost that can usually combine features effectively without a lot of manual adjustment. but you will still want to learn how to evaluate a model correctly using cross-validation, and you'll need to at least pay some attention to hyperparameter tuning. you might find useful info on those topics specifically, but beware the 100s of junky video and blog tutorials.
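
To make the cross-validation point concrete, here is a minimal sketch with scikit-learn. Everything about the data is an assumption: `make_classification` generates a synthetic stand-in for a churn table, and plain logistic regression is used only because it was the model in the question (swapping in a gradient-boosted model would follow the same pattern).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a churn table: 2000 rows, roughly 25% positives (churned).
X, y = make_classification(n_samples=2000, n_features=10, n_informative=4,
                           weights=[0.75, 0.25], random_state=0)

# Scaling lives inside the pipeline so it is re-fit on each training fold only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Each fold is scored on held-out rows; report the mean and spread, not one split.
auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
recall = cross_val_score(model, X, y, cv=cv, scoring="recall")
print(f"ROC-AUC: {auc.mean():.3f} +/- {auc.std():.3f}")
print(f"Recall (churned): {recall.mean():.3f} +/- {recall.std():.3f}")
```

The spread across folds tells you whether a change in the score is a real improvement or just noise from the split.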

warm trellis May 13, 2024, 1:57 PM

#

desert oar for what purpose?

There is a paper suggesting an architecture called TCN-ECANet-GRU, but the flow they formulated does not work for my dataset, and the model does not work. That's why I am asking: is this even logical?
source of picture: "A short‑term forecasting method for photovoltaic power generation based on the TCN‑ECANet‑GRU hybrid model", Xiuli Xiang, Xingyu Li, Yaoli Zhang & Jiang Hu

past meteor May 13, 2024, 2:00 PM

#

@wooden sail Can I poke your brain again. I keep forgetting what the gotcha is when training neural networks to estimate the parameters of a distribution and simply sampling from that to get your final prediction. In that way you naturally have probabilistic outputs. Now I'm sure something is wrong with this because I don't see it very often in the literature.

#

There's deep evidential regression but it's not that popular

desert oar May 13, 2024, 2:02 PM

#

warm trellis There is a paper suggesting an architecture called TCN-ECANet-GRU, ...

on the output side of some other complicated thing, sure why not? on the input side, i'd question its value if you don't at least have something like positional encoding and/or your sequences are reliably fixed size (don't need a lot of padding). depends heavily on how the data is encoded imo.

desert oar May 13, 2024, 2:02 PM

#

past meteor @wooden sail Can I poke your brain again. I keep forgetting what the go...

isn't this like asking why we don't use xgboost or linear regression to estimate distribution parameters?

past meteor May 13, 2024, 2:03 PM

#

warm trellis There is a paper suggesting an architecture called TCN-ECANet-GRU, ...

I think they have the GRU so they can use it for multistep forecasting with an arbitrary horizon

desert oar May 13, 2024, 2:03 PM

#

i'd love to know Edd's answer of course

past meteor May 13, 2024, 2:03 PM

#

desert oar isn't this like asking why we don't use xgboost or linear regression to estimate...

It is

#

DeepAR does this and DeepAR is popular

desert oar May 13, 2024, 2:05 PM

#

past meteor DeepAR does this and DeepAR is popular

let me read the paper to see what they do

#

it's possible that a lot of ML users don't care about (or don't think they care about) distributions (even though they should)

past meteor May 13, 2024, 2:06 PM

#

I mean, I suspect you'll just have uncalibrated probabilities that don't mean anything

desert oar May 13, 2024, 2:06 PM

#

fwiw in general estimating anything other than "conditional central tendency" is hard

#

there are specialized models for estimating conditional variance along with conditional mean in time series modeling, called "GARCH" models

past meteor May 13, 2024, 2:07 PM

#

Time series literature is very disappointing

#

Even more so the packages

desert oar May 13, 2024, 2:07 PM

#

time series is hard

#

in traditional statistics usually either you assume a distributional form (which is often in the exponential family and has a small number of parameters) or you do something nonparametric and don't have probability estimates anyway

past meteor May 13, 2024, 2:07 PM

#

I handroll everything

desert oar May 13, 2024, 2:08 PM

#

and from there you follow some kind of optimization procedure like maximum likelihood, maximum a posteriori, etc. where you have a theoretically-derived objective function and a closed-form likelihood or a posterior that can be estimated with MCMC

past meteor May 13, 2024, 2:08 PM

#

The same applies for time series though

#

You can make them Markovian by including lags inside of your covariates
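
A small sketch of what "including lags inside of your covariates" can look like in code. The helper name `make_lagged` and the simulated AR(2) series are assumptions for illustration; once the lags are columns, any ordinary regression method applies.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Return (X, y) where each row of X holds the n_lags values preceding y."""
    series = np.asarray(series, dtype=float)
    X = np.column_stack(
        [series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)]
    )
    y = series[n_lags:]
    return X, y

# Simulated AR(2) process: s[t] = 0.6 * s[t-1] - 0.3 * s[t-2] + noise.
rng = np.random.default_rng(0)
s = np.zeros(20_000)
for t in range(2, len(s)):
    s[t] = 0.6 * s[t - 1] - 0.3 * s[t - 2] + rng.standard_normal()

X, y = make_lagged(s, n_lags=2)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # column 0 is lag 1, column 1 is lag 2
print(coef)
```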

desert oar May 13, 2024, 2:09 PM

#

right. so you're wondering why do this in time series and not in other kinds of problems?

past meteor May 13, 2024, 2:09 PM

#

And then the same techniques apply, roughly speaking

desert oar May 13, 2024, 2:09 PM

#

it kind of looks like the "deep learning" part is being trained to generate samples that match the conditional distribution

past meteor May 13, 2024, 2:09 PM

#

Very very roughly speaking

desert oar May 13, 2024, 2:10 PM

#

that's... interesting. my initial instinct is that they have a relatively small number of variables to condition on (time + covariates), so they aren't trying to condition on a tiny slice of some massive high-dimensional space

#

i have to go to a meeting but i'll read through this paper, i didn't know they had a probabilistic forecasting thing

past meteor May 13, 2024, 2:10 PM

#

I'll just reread the paper later

desert oar May 13, 2024, 2:11 PM

#

i still want to know Edd's answer

past meteor May 13, 2024, 2:11 PM

#

There's the whole gluonTS library as well you can look at

desert oar May 13, 2024, 2:11 PM

#

👍 i knew about the library but never tried using it / figuring out what it does

honest sorrel May 13, 2024, 2:22 PM

#

If anyone has knowledge and experience with machine learning(ML)and its algorithms, I need some guidance to work on machine learning for my personal work. So, if anyone out there, please ping me.

wooden sail May 13, 2024, 2:25 PM

#

past meteor @wooden sail Can I poke your brain again. I keep forgetting what the go...

i'm not sure what you mean

#

or maybe i do

#

in continuous, random settings, learning the underlying distribution and sampling from it means you get the prediction wrong with probability 1. is that what you're referring to?

#

the optimality targets are met "in expectation", but each realization of the random process is wrong
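
A quick simulation of that point, as a sketch under a simplifying assumption: the model has learned the true distribution perfectly, and that distribution is Gaussian with made-up parameters. Predicting the mean gives an MSE of about sigma^2; predicting with an independent draw from the same distribution gives about 2 * sigma^2, because the draw's own variance adds to the target's.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 3.0, 2.0, 200_000

y = rng.normal(mu, sigma, n)             # what actually happens
pred_mean = np.full(n, mu)               # predict the distribution's mean
pred_sample = rng.normal(mu, sigma, n)   # predict by drawing an independent sample

mse_mean = np.mean((y - pred_mean) ** 2)      # approximately sigma**2
mse_sample = np.mean((y - pred_sample) ** 2)  # approximately 2 * sigma**2
print(f"MSE using the mean: {mse_mean:.2f}   MSE using a sample: {mse_sample:.2f}")
```

So sampling is useful for building intervals or scenarios, but a single draw is a worse point prediction than the mean under squared error.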

past meteor May 13, 2024, 2:30 PM

#

wooden sail the optimality targets are met "in expectation", but each realization of the ran...

Yes

#

But what some are doing is instead of making point predictions they for instance use MSE loss to estimate the parameters of a gamma distribution

#

And at prediction time they then send the data through the network which produces, in this case, 2 parameters that are then used to sample from a distribution to produce a point prediction

wooden sail May 13, 2024, 2:32 PM

#

i'd somehow put that under some flavor of bayesian or posterior probability estimation

past meteor May 13, 2024, 2:33 PM

#

It is yes, it's the poor man's version

#

But something should be flawed with this approach, otherwise it'd be more popular

#

Because as I remember the real deal has each parameter be a distribution

wooden sail May 13, 2024, 2:34 PM

#

this is doing that

#

there are several levels you can do this at

desert oar May 13, 2024, 2:35 PM

#

past meteor But what some are doing is instead of making point predictions they for instance...

MSE loss between the data and the sampled data from the learned distribution?

wooden sail May 13, 2024, 2:35 PM

#

the starting point is a deterministic model f(x) with parameters x, and noise is added. so you have f(x) + n which is now a random variable, and you are interested in finding x from the noisy observations. this is the same as saying the data is a random process and you want the parameters that describe that random process (e.g. if the noise is 0 mean, then we are interested in the x that describes the mean f(x) of the random process)

past meteor May 13, 2024, 2:36 PM

#

I think my actual question is, what are the pros and the cons of just estimating the parameters of a distribution or having your entire neural network's parameters be probability distribution and using bayes' rule for inference

#

The con of the latter is obviously compute/poor scaling

wooden sail May 13, 2024, 2:36 PM

#

this deepAR is already letting f(x) be random itself, meaning there's a prior distribution describing f(x) and it estimates its parameters. that'd be a bayesian setting. f(x) then has additional parameters aside from x which describe its statistical properties

past meteor May 13, 2024, 2:38 PM

#

desert oar MSE loss between the data and the sampled data from the learned distribution?

No, estimate the parameters, sample and then calculate the MSE as normal
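
One thing worth seeing side by side with the MSE setup: a likelihood-based loss is a common alternative when a network outputs distribution parameters (DeepAR, for one, is trained by maximizing likelihood). The sketch below is an assumption-laden toy, not anyone's actual model: there is no network, just a Gaussian with the mean held at its true value and a handful of candidate sigmas, to show that the negative log-likelihood reacts to the claimed spread while MSE against the mean does not.

```python
import numpy as np

def gaussian_nll(y, mu, log_sigma):
    """Mean negative log-likelihood of y under N(mu, exp(log_sigma)**2)."""
    sigma2 = np.exp(2.0 * log_sigma)
    return np.mean(0.5 * np.log(2.0 * np.pi * sigma2) + 0.5 * (y - mu) ** 2 / sigma2)

rng = np.random.default_rng(0)
y = rng.normal(1.0, 3.0, 50_000)   # hypothetical targets with true sigma = 3

# Scan candidate sigmas with the mean fixed at its true value.
candidates = np.array([0.5, 1.0, 2.0, 3.0, 4.0, 6.0])
nll = np.array([gaussian_nll(y, 1.0, np.log(s)) for s in candidates])
best_sigma = candidates[np.argmin(nll)]   # the likelihood picks out the spread

# MSE against the mean is the same number whatever sigma the model claims.
mse = np.mean((y - 1.0) ** 2)
print(f"sigma with lowest NLL: {best_sigma}   MSE (independent of sigma): {mse:.2f}")
```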

wooden sail May 13, 2024, 2:38 PM

#

past meteor I think my actual question is, what are the pros and the cons of just estimating...

these are the same thing to me, just using a different optimizer and model to find the same thing

#

unless i'm misunderstanding you

past meteor May 13, 2024, 2:38 PM

#

Interesting, I didn't view them as the same

#

I'll reflect on this

wooden sail May 13, 2024, 2:39 PM

#

you can explicitly say "this is my distribution, plz find the params" or say "this network is a latent representation of the pdf with arbitrary structure. learn your params so that you are the pdf now"

desert oar May 13, 2024, 2:39 PM

#

@past meteor was DeepAR the only example of this you had in mind? clearly this is picking up from an existing conversation that I very much want to follow, but I feel like I'm lacking context

past meteor May 13, 2024, 2:39 PM

#

desert oar @past meteor was DeepAR the only example of this you had in mind? clear...

No there's also deep evidential regression that does something similar

wooden sail May 13, 2024, 2:39 PM

#

the network essentially becomes f(x), with x now a mess of trainable parameters. needs more data, but it's more flexible than fixing f explicitly

past meteor May 13, 2024, 2:41 PM

#

When I learnt of Bayesian neural nets it was always having each parameter be a prob distribution + use variational inference or something. Never something as simple as just estimating posterior

#

So I think my issue is: "if it's too good to be true, it probably is."

desert oar May 13, 2024, 2:41 PM

#

wooden sail the network essentially becomes f(x), with x now a mess of trainable parameters....

is this not just all ML modeling?

sturdy kiln May 13, 2024, 2:41 PM

#

wow it just keeps going up lmao

desert oar May 13, 2024, 2:42 PM

#

some fixed f(x) plus additive random noise with some distributional resemblance to E(resid | x) = 0

sturdy kiln May 13, 2024, 2:42 PM

#

cant tell if keras' image_dataset_from_directory is making this worse or not

wooden sail May 13, 2024, 2:42 PM

#

desert oar is this not just _all_ ML modeling?

yes, but the difference here is whether you want f to be deterministic, a deterministic parameter of a random process, or a random parameter of a random process, or a det/rand hyper param of a random process, etc

#

those all change what you do with the output of the network and how you measure its accuracy (which loss)

desert oar May 13, 2024, 2:43 PM

#

ah, i see. i think i'm also lost because i don't know how that's typically accomplished outside of a traditional bayesian parametric model. i'll read the deep evidential regression paper & see if i can follow their technique

past meteor May 13, 2024, 2:44 PM

#

wooden sail yes, but the difference here is whether you want f to be deterministic, a determ...

It's a nice summary yes. I'll leave it at that for now haha

#

But still, I just wonder what each of them buys you from the practical pov

#

If they have calibrated probabilities, if it just ends up being the same as a regularisation scheme, ...

wooden sail May 13, 2024, 2:46 PM

#

i usually (naively) think of each layer of randomness as regularization

#

yes

#

each extra model and prior constrains where the possible solutions can lie

#

not using any probabilities explicitly is equivalent to assuming your parameters are random with uniform distribution

past meteor May 13, 2024, 2:47 PM

#

But I suppose if you pick a different distribution than Gaussian you end up with a different regularisation scheme than MAP with a Gaussian prior

wooden sail May 13, 2024, 2:47 PM

#

because of this, bayesian estimation bounds are usually lower than deterministic ones

past meteor May 13, 2024, 2:47 PM

#

Which in and of itself is helpful, depending on your problem

wooden sail May 13, 2024, 2:48 PM

#

gaussian prior yields L2 iirc, laplace prior yields L1 reg

past meteor May 13, 2024, 2:48 PM

#

Yes

wooden sail May 13, 2024, 2:49 PM

#

MAP also yields the mode of the posterior, whereas a general bayesian setting cares about the whole distribution
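
For reference, the prior-to-penalty correspondence mentioned above can be written out in a few lines. The symbols here are assumptions for the sketch: D is the data, theta the parameters, tau the standard deviation of a zero-mean Gaussian prior, and b the scale of a zero-mean Laplace prior on each component.

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta}\, p(\theta \mid D)
  = \arg\min_{\theta}\, \bigl[ -\log p(D \mid \theta) - \log p(\theta) \bigr]

% Gaussian prior \theta \sim \mathcal{N}(0, \tau^2 I) gives an L2 penalty:
-\log p(\theta) = \frac{1}{2\tau^2}\,\lVert\theta\rVert_2^2 + \text{const}

% Laplace prior with scale b on each component gives an L1 penalty:
-\log p(\theta) = \frac{1}{b}\,\lVert\theta\rVert_1 + \text{const}
```

A tighter prior (smaller tau or b) means a larger penalty weight. A flat prior makes the second term constant, which recovers plain maximum likelihood.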

past meteor May 13, 2024, 2:50 PM

#

Exactly but I suppose if you do this naively the network may converge to something where the parameters it estimated have a very small tail (close to deterministic)

#

I'd just have to try this out on my data if I have spare time

dense smelt May 13, 2024, 3:03 PM

#

Exception : List indices must be integers or slices, not str

I've been creating Plotly Dash apps (Dash is a Python interactive visualisation tool). The backend data for the app comes from AWS, and the app has two tabs, so two views/queries have to be retrieved. To begin with, when does this error occur? I'm unable to find where it is raised. The views are extracted from AWS based on the port number, and after extraction the data is loaded into a dataframe (filtered_df). I need help. I can explain more about the Dash app and its structure, but I have to clear this exception and load the data first.

serene scaffold May 13, 2024, 3:08 PM

#

dense smelt Exception : List indices must be integers or slices, not str I've been creatin...

to help with debugging an error message, one needs to see the whole error message and the code that caused it.

chrome salmon May 13, 2024, 3:09 PM

#

youre trying to index a list using a string

serene scaffold May 13, 2024, 3:09 PM

#

chrome salmon youre trying to index a list using a string

yeah, this. without seeing the code and the whole error message, all one can say is "stop trying to index a list with a string", which of course is not helpful.

#

@dense smelt I don't help over DMs. If you want help, post the whole error message in this chat. Don't give any additional explanation of what you're trying to do until you've posted the whole error message.

wild sluice May 13, 2024, 3:30 PM

#

costs=[]
iteration = []
np.random.seed(0)

for i in range(100):
    iteration.append(i)
    weights = np.random.randn(2,1)
    bias = 4
    prediction = np.dot(X,weights)
    n = y.size
    learning_rate = 0.00000005
    

    residual = y - prediction
    cost = (1/n) * np.sum(np.square(y-prediction),axis=0)
    costs.append(cost)
    d_weight = (2/n) * np.dot(X.T,residual)
    weights -= learning_rate * d_weight

r = sns.lineplot(x=iteration,y=np.hstack(costs))
plt.title('Costs')
plt.show()

can someone tell me what I'm doing wrong

#

i'm new to this

dense smelt May 13, 2024, 3:34 PM

#

serene scaffold @dense smelt I don't help over DMs. If you want help, post the whole er...

Hey There you go this is the whole error message ( this is in the terminal )

layout start
Dash is running on http://0.0.0.0:9999/

Serving Flask app 'throughput_time'
Debug mode: on
layout start
within callback: []
pathname: http://0.0.0.0:9999/615eb010-2vvv-42d1-b6ba-50e0394cc5a5
views: ['aws_tpt_line', 'aws_tpt_cell']
within callback: []
views: ['aws_tpt_line', 'aws_tpt_cell']
factory_id: 615eb010-2vvv-42d1-b6ba-50e0394cc5a5
Exception : list indices must be integers or slices, not str

#

@serene scaffold Data should be loaded after extracting the view name from aws
but have to clear this exception I guess

serene scaffold May 13, 2024, 3:38 PM

#

dense smelt Hey There you go this is the whole error message ( this is in the terminal ) la...

I see. turns out that this isn't a data science question. try opening a thread in #1035199133436354600 about getting the full traceback when an error is raised in a Dash app.
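
For context on why the log above is not enough: a line like "Exception : ..." usually comes from catching the exception and printing only its message, which drops the file name, line number and call chain. A minimal sketch, assuming that is what the app does. The function `load_view` is hypothetical; only the view names are taken from the log.

```python
import traceback

def load_view(views):
    # `views` is a list of view names, but it is indexed like a dict here.
    return views["aws_tpt_line"]

views = ["aws_tpt_line", "aws_tpt_cell"]

try:
    load_view(views)
except Exception as exc:
    short = f"Exception : {exc}"    # the kind of line shown in the log
    full = traceback.format_exc()   # adds the file, line number and call chain

print(short)
print(full)
```

Printing `traceback.format_exc()` (or simply not catching the exception while debugging) shows exactly which line indexes a list with a string.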

wild sluice May 13, 2024, 3:39 PM

#

serene scaffold I see. turns out that this isn't a data science question. try opening a thread i...

mine is

#

plz help

serene scaffold May 13, 2024, 3:42 PM

#

wild sluice ```py costs=[] iteration = [] np.random.seed(0) for i in range(100): iterat...

what is the problem? how is that plot different from what you wanted?

wild sluice May 13, 2024, 3:43 PM

#

serene scaffold what is the problem? how is that plot different from what you wanted?

well is it ok for the cost function to look like that? i wanted a smooth line to approx 0

serene scaffold May 13, 2024, 3:43 PM

#

wild sluice well is it ok for the cost function to look like that? i wanted a smooth line to...

do you know what an epoch is?

wild sluice May 13, 2024, 3:43 PM

#

yes its the training iteration

#

the x is epoch, y is cost of that epoch

serene scaffold May 13, 2024, 3:44 PM

#

wild sluice yes its the training iteration

between each epoch, one would expect the average cost for that whole epoch to decrease. but it won't necessarily decrease between adjacent instances.

#

it looks like you're trying to implement backpropagation by hand?

wild sluice May 13, 2024, 3:45 PM

#

its just multiple linear regression

wild sluice May 13, 2024, 3:47 PM

#

serene scaffold between each epoch, one would expect the average cost for that whole epoch to de...

are epochs and instances different?

#

i got this graph a while back for simple linear regression

#

i was expecting sth like this (ignore the increasing cost)

dense smelt May 13, 2024, 3:51 PM

#

serene scaffold I see. turns out that this isn't a data science question. try opening a thread i...

Okay Thanks

serene scaffold May 13, 2024, 3:55 PM

#

wild sluice are epochs and instances different?

an epoch is where you train on each training instance once

wild sluice May 13, 2024, 4:00 PM

#

ok

wild sluice May 13, 2024, 4:09 PM

#

serene scaffold an epoch is where you train on each training instance once

is it because my data is randomly generated

#

but i did use fixed weights to calculate the output. its only the features that were randomly generated

past meteor May 13, 2024, 4:23 PM

#

wild sluice ```py costs=[] iteration = [] np.random.seed(0) for i in range(100): iterat...

Why is your bias fixed to 4?

#

The dimension of your weights should be 1 larger than the size of the input. You should also just add a 1 to the front or the back of your input

#

Yes, a literal 1. That makes it easier, you don't need to add the bias then, you can just dot product and you're there

#

Consider doing stochastic gradient descent. It's very easy to implement, just add an inner loop
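
A minimal sketch of those suggestions put together, on made-up data (the features, true weights and learning rate are assumptions): a literal 1 appended to each row so the bias becomes one more weight, and an inner loop that updates on one instance at a time. One pass of the inner loop over all instances is one epoch.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))               # hypothetical features
y = X @ np.array([3.0, -2.0]) + 4.0         # fixed weights, bias of 4, no noise

X1 = np.hstack([X, np.ones((len(X), 1))])   # literal 1 appended to every row
w = np.zeros(X1.shape[1])                   # one weight per input column, plus one for the bias
lr = 0.005
epoch_costs = []

for epoch in range(30):                     # one epoch = one pass over every instance
    for i in rng.permutation(len(X1)):      # inner loop: one instance at a time
        error = X1[i] @ w - y[i]
        w -= lr * 2.0 * error * X1[i]       # gradient of the squared error for this instance
    epoch_costs.append(np.mean((X1 @ w - y) ** 2))

print(w)   # last entry plays the role of the bias
```

The cost recorded once per epoch should fall steadily, even though the error on individual instances jumps around inside an epoch.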

spring field May 13, 2024, 5:10 PM

#

wild sluice ```py costs=[] iteration = [] np.random.seed(0) for i in range(100): iterat...

why are you picking new random weights every iteration?

wild sluice May 13, 2024, 5:11 PM

#

spring field why are you picking new random weights every iteration?

💀

#

i did not see that

#

that wud explain the graph
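
For completeness, a corrected sketch of the batch version. The data and learning rate are made up since the original `X` and `y` were not posted. Two changes from the posted loop: the weights are initialised once, before the loop, and the residual is taken as prediction minus target. With `residual = y - prediction`, `X.T @ residual` points along the negative gradient, so subtracting it moves uphill, which may be related to the increasing cost mentioned earlier. The bias is also learned here rather than fixed.

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(100, 2)                 # hypothetical stand-in for the original X
y = X @ np.array([[2.0], [-3.0]]) + 4.0     # fixed weights and bias, shape (100, 1)

weights = np.random.randn(2, 1)             # initialised once, before the loop
bias = 0.0
learning_rate = 0.1
n = y.size
costs = []

for i in range(100):
    prediction = X @ weights + bias
    residual = prediction - y               # note the order: prediction minus target
    costs.append(np.mean(residual ** 2))
    weights -= learning_rate * (2 / n) * (X.T @ residual)
    bias -= learning_rate * (2 / n) * residual.sum()

print(weights.ravel(), bias, costs[-1])
```

Plotting `costs` against the iteration number should now give the smooth curve towards 0.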

agile cobalt May 13, 2024, 6:43 PM

#

I didn't see the demos on the website, but it sounded very robot during the live demo overall

desert oar May 13, 2024, 6:46 PM

#

sounded like a person who i might insult as being a robot

#

"HR manager" vibes

#

but yeah pretty impressive. going to be very useful for running phone scams

past meteor May 13, 2024, 6:53 PM

#

did you shout "mahdi" when you heard the bot speak /s

#

such a good movie

iron basalt May 13, 2024, 7:37 PM

#

past meteor Which in and of itself is helpful, depending on your problem

Priors, if chosen correctly, make the problem way easier. And if you don't choose one, it gets chosen implicitly (so it's important to be aware of what your choice is even if you didn't make it explicitly).

#

(Note that trying to have no bias is itself a kind of bias (and so you can put it all under the same mathematical framework))

#

(The scientific method tries to have this explicitly in its design)

cedar tusk May 13, 2024, 9:15 PM

#

they say gpt 5 will be able to do these

#

what u guys think?

#

yea

#

i wonder how powerful it will be

#

and will openai be able to run this kind of real time model on the huge scale they are promising

#

i gues we are witnessing yet another history xd

boreal crescent May 13, 2024, 9:21 PM

#

I need help with my learning machine and neural networks

cedar tusk May 13, 2024, 9:21 PM

#

will they name this period the postcovid ai boom xD

cedar tusk May 13, 2024, 9:22 PM

#

boreal crescent I need help with my learning machine and neural networks

u can ask any question here

boreal crescent May 13, 2024, 9:22 PM

#

I have a situation: my learning machine gives me a result of NaN. Why did this happen?

cedar tusk May 13, 2024, 9:23 PM

#

boreal crescent I have a situation: my learning machine gives me a result of NaN. Why did this happen?

your linear algebra code has a fault in it

boreal crescent May 13, 2024, 9:23 PM

#

cedar tusk May 13, 2024, 9:24 PM

#

so the multiplication returns nans

boreal crescent May 13, 2024, 9:24 PM

#

#

boreal crescent May 13, 2024, 9:25 PM

#

cedar tusk so the multiplication returns nans

Ok but I try several times to fix it and have same results

spring field May 13, 2024, 9:25 PM

#

if you could not send pictures taken with mobile of your laptop's screen here, that'd be fantastic, like at the very least send screenshots, but best if you just send the code and outputs and ofc if you have plots, screenshots of those

#

!paste

#

!paste

#

good one

#

https://paste.pythondiscord.com

boreal crescent May 13, 2024, 9:26 PM

#

I changed the logical math and have same results

cedar tusk May 13, 2024, 9:26 PM

#

boreal crescent I changed the logical math and have same results

paste ur code to the above website and send the link here so we can look at it

boreal crescent May 13, 2024, 9:26 PM

#

Yes I try but same results

cedar tusk May 13, 2024, 9:26 PM

#

xD

boreal crescent May 13, 2024, 9:27 PM

#

Yes give a moment to send you the script

#

This is another problem: the graph 📈 is not displayed in the pop-up windows from tkinter

#

Blank screen without graphics from learning machine

cedar tusk May 13, 2024, 9:30 PM

#

xD

boreal crescent May 13, 2024, 9:30 PM

#

Let me go back home and I show you the script

#

Ok I’ll do let me back home to send you the script, almost 20 min

cedar tusk May 13, 2024, 9:33 PM

#

The plot THICKENS

spring field May 13, 2024, 9:33 PM

#

no no, they're not

#

yep

boreal crescent May 13, 2024, 9:34 PM

#

I did inside the main loop the call of the instance model

cedar tusk May 13, 2024, 9:35 PM

#

im loving this conversation 🍿

boreal crescent May 13, 2024, 9:36 PM

#

First I calculated the rsi and ma5 with Cci next I created the model and set up the call inside the main loop before make a decision to trade buy or sell

pale lantern May 13, 2024, 10:25 PM

#

anyone here who could help me with converting my sklearn model to onnx? using skl2onnx?

https://discord.com/channels/267624335836053506/1239704162682277998

boreal crescent May 13, 2024, 11:32 PM

#

model learning and neural

📎 message.txt

spring field May 13, 2024, 11:43 PM

#

are the inputs normalized?

#

I'm afraid that's how TF be, they're integrated into the layers 😬

#

also what's this about?

#

keras? what's the diff between keras and tf?

#

oh wait, keras is part of tf?

boreal crescent May 13, 2024, 11:45 PM

#

https://paste.pythondiscord.com/MLHQ

spring field May 13, 2024, 11:45 PM

#

yeah, activation is a default argument

#

https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense

#

oh wait

#

boreal crescent May 13, 2024, 11:46 PM

#

yes

spring field May 13, 2024, 11:46 PM

#

yeah, an RNN type thingy

#

an improved RNN basically

boreal crescent May 13, 2024, 11:47 PM

#

i can show you the script without learning and neural

spring field May 13, 2024, 11:48 PM

#

you'd also need to actually make sure that the layers use some activation functions

spring field May 13, 2024, 11:48 PM

#

spring field also what's this about?

but I'm kinda concerned about this and whether you are normalizing inputs as well
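
To show why normalization is the first thing to check when a model returns NaN, here is a sketch with plain NumPy gradient descent. It is not the Keras model from the paste, and the price-like inputs are an assumption. With raw inputs around 30,000 the updates overshoot, the weight overflows to infinity, and inf minus inf gives NaN. With standardized inputs the same code stays finite.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical price-like inputs around 30,000 with a linear target.
X = rng.normal(30_000.0, 500.0, size=(256, 1))
y = 0.001 * X[:, 0] + rng.normal(0.0, 0.1, 256)

def train(X, y, lr=0.01, steps=200):
    """Plain gradient descent on mean squared error; returns the final weight."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Silence the overflow warnings so the NaN result is easy to see.
with np.errstate(over="ignore", invalid="ignore"):
    w_raw = train(X, y)                    # diverges: overflow, then NaN

X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
w_scaled = train(X_scaled, y - y.mean())   # stays finite
print("raw inputs:", w_raw, "  scaled inputs:", w_scaled)
```

Indicators like RSI and moving averages live on very different scales from each other, so scaling each input column separately is a reasonable first step.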

boreal crescent May 13, 2024, 11:48 PM

#

and you can tell me what i have to do, not every step but the most important ones

#

def trading_loop(self):
    previous_trend = None
    while self.trading:
        symbol = self.symbol
        if not self.connected:
            print("Not connected. Attempting to reconnect...")
            self.connect_to_mt5()
            if not self.connected:
                time.sleep(10)  # Wait before retrying
                continue
        # Call the functions to train the neural network and load the trained model
        self.train_and_plot()
        self.load_model_and_predict()

        utc_from = datetime(2024, 1, 1)
        utc_to = datetime.now()
        rates = mt5.copy_rates_range(symbol, mt5.TIMEFRAME_M5, utc_from, utc_to)

        if rates is not None and len(rates) > 0:
            print(f"OHLC {symbol}")
            # self.text_widget.insert(tk.END, f"OHLC {symbol}\n")
            for rate in rates:
                direction = "buy" if rate[4] > rate[1] else "sell"
                print(f"Time: {datetime.utcfromtimestamp(rate[0])}, Open: {rate[1]}, High: {rate[2]}, Low: {rate[3]}, Close: {rate[4]}, Direction: {direction}")
                # self.text_widget.insert(tk.END, f"Time: {datetime.utcfromtimestamp(rate[0])}, Open: {rate[1]}, High: {rate[2]}, Low: {rate[3]}, Close: {rate[4]}, Direction: {direction}\n")

########################
(i called it before the trading strategy)

#

do i need to create a class and single functions?

clever inlet May 14, 2024, 2:37 AM

#

I'm looking at data scraped from upwork with job descriptions and skills required and want to use NLP to see if they match a few different roles. Anyone know where I can start?

serene scaffold May 14, 2024, 2:49 AM

#

clever inlet I'm looking at data scraped from upwork with job descriptions and skills require...

You probably don't need anything very sophisticated for this. You just need to identify which words in the job descriptions and which words in the other thing are most important, and then measure the overlap.
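
A minimal sketch of that keyword-overlap idea using only the standard library. The role names and keyword sets are made up for illustration. In practice they would come from the skills column of the scraped data.

```python
import re

# Hypothetical role keyword sets.
ROLE_KEYWORDS = {
    "data engineer": {"sql", "etl", "airflow", "spark", "pipeline"},
    "ml engineer": {"pytorch", "tensorflow", "model", "training", "deployment"},
}

def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def score_roles(description):
    """Fraction of each role's keywords that appear in the description."""
    words = tokenize(description)
    return {role: len(words & kws) / len(kws) for role, kws in ROLE_KEYWORDS.items()}

scores = score_roles("Build an ETL pipeline in Spark and tune SQL queries.")
best_role = max(scores, key=scores.get)
print(scores, best_role)
```

If plain overlap turns out too crude, weighting words by how rare they are across all descriptions (TF-IDF) is the usual next step.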

clever inlet May 14, 2024, 2:56 AM

#

yea I had a feeling I just needed to just look for a bunch of keywords... thanks

deep veldt May 14, 2024, 3:19 AM

#

Should i use vgg16 or vgg19?

lapis sequoia May 14, 2024, 3:33 AM

#

I need to get better at deep learning. Where can I learn that is not cancer?

spring field May 14, 2024, 3:56 AM

#

lapis sequoia I need to get better at deep learning. Where can I learn that is not cancer?

pinned messages would be a good start