#data-science-and-ml | Python | Page 115

elder hemlock Apr 7, 2024, 10:17 AM

#

Ah! Right.

#

I tend to call that "static analysis".

#

In that the AI doesn't really run away from you.

#

Which is a property I would attribute to "deductive" thought patterns.

#

You could compare this solution to something like dijkstra's shortest path in that way.

#

I'm a fan of static analysis, and I believe it is underrated as a solution, and offers different properties that we can utilize.

#

I also believe a human thinks in a mix of "learning ai" and "static analysis".

#

See, my project to create this game is a very good example of where there are shortcomings to learning AI.

#

Since the AI cannot afford to have a delay on its rulings, and it shouldn't be biased.

#

I was in this chat yesterday, and I brought up what I was mainly using this for,

#

Did you see those messages

#

I'll link them

#

#data-science-and-ml message

#

Yeah, I was pretty brief.

#

So, I think it turns out that Bayesian inference is very compatible with "static analysis", so I have done experiments on paper relating to this hypothetical game that lets the player create new things.

#

In that, I use Bayesian probability maths to establish a number quantifier for the "risk" that a design imposes on things around it.

#

Which is my basis for a measurement of fairness.

#

So for example, if there was a strong player, and a weak player, this algorithm in theory, can identify that this is not fair. In a consistent way.

#

And their fairness and riskiness can be expressed as a decimal number.

#

Well, if I can turn fairness into a number, that means I can create automated balancing for a game.
I can give players many freedoms and I can keep their abuse in check.

#

Uh, no.

#

This system design also paints an ideal test for character designs to pass.

#

This also connects with character writing, and the idea that ideal characters have flaws AND abilities.

#

Do you think it would list a discovery like that somewhere?

#

I would continue to work on this project, but I'm not sure people would understand it, even when finished.

#

I guess I mean people who aren't in the field.

#

The limit of explanation is the familiarity with the audience. And from alien minds come alien ideas.

#

I argue their ideas were better understood in retrospect.

#

And maybe mine might, given that I'm actually doing something new, which is unlikely.

#

Yes.

#

Exactly.

#

In communication, it seems biased to put the burden on only the one explaining, and not some intersection of me trying to explain to you, and you trying to understand me.

#

But I'm willing to try to explain in different ways, like the one you described.

cinder jay Apr 7, 2024, 11:14 AM

#

hi guys

#

someone has ever used nnU-Net?

#

i have some warning that are a bit annoying

orchid forge Apr 7, 2024, 11:35 AM

#

guys, i want to improve my data analysis skills can anyone recommend me some sorta statistics, probability, etc. specially for data analysis....kinda free couse or website?

gleaming osprey Apr 7, 2024, 11:36 AM

#

orchid forge guys, i want to improve my data analysis skills can anyone recommend me some sor...

Kaggle has some awesome free courses with certification

#

Ig

orchid forge Apr 7, 2024, 11:36 AM

#

gleaming osprey Kaggle has some awesome free courses with certification

can u send it to me plzz

gleaming osprey Apr 7, 2024, 11:37 AM

#

orchid forge can u send it to me plzz

Search up kaggle.com, create an account and look on the right for the learn tab and pick a course! It's really good from there. I think at least

orchid forge Apr 7, 2024, 11:41 AM

#

also is it true that there is statistics and probability for specifically data analysis too?

#

oh so can you tell me some good websites to learn stats and prob for data analysis ?

#

i just sometimes dont understand that everything i do is not enough

#

and umm i end up with something more to learn

#

becuz i didnt knew it at the first place

#

i wish there was a detailed roadmap for data analysis

toxic mortar Apr 7, 2024, 12:02 PM

#

I want to evaluate an unsupervised NLP model (topic modelling). Currently I am performing hyperparameter tuning with GridSearch and RandomSearch. I set up a pipeline so the output is a 2D HTML Plotly graph and a list of labeled cluster groups with their counts (including outliers). After many iterations I would like to perform an evaluation. How would you approach this? So far the goal is to minimize the number of outliers, but also not to have big but dense clusters. Something in between. What evaluation metrics should I use? Something like std from numpy??

#

I mean realistically the outliers may be just noise cuz specific dataset is trash
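
A minimal sketch of how the two stated goals (few outliers, no single dominant cluster) could be turned into numbers per tuning run. It assumes each run yields one integer label per document and that outliers are labeled `-1`, which is a common convention but is not stated in the message; `cluster_summary` and the toy labels are hypothetical.

```python
import numpy as np


def cluster_summary(labels, outlier_label=-1):
    """Summarize one clustering run from its per-document cluster labels."""
    labels = np.asarray(labels)
    is_outlier = labels == outlier_label
    # Size of each real cluster (outliers excluded).
    _, sizes = np.unique(labels[~is_outlier], return_counts=True)
    has_clusters = sizes.size > 0
    return {
        "outlier_fraction": float(is_outlier.mean()),
        "n_clusters": int(sizes.size),
        "largest_cluster_fraction": float(sizes.max() / labels.size) if has_clusters else 0.0,
        # Coefficient of variation: std of cluster sizes relative to their mean.
        "size_cv": float(sizes.std() / sizes.mean()) if has_clusters else 0.0,
    }


# Toy labels: three clusters of sizes 3, 2, 1 plus two outliers.
labels = [0, 0, 0, 1, 1, 2, -1, -1]
summary = cluster_summary(labels)
print(summary)
```

Comparing these values across runs is only a size-based heuristic; it says nothing about whether the topics themselves are coherent.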

buoyant vine Apr 7, 2024, 1:32 PM

#

honey, 3blue1brown just uploaded a new video https://www.youtube.com/watch?v=eMlx5fFNoYc

YouTube

3Blue1Brown

Visualizing Attention, a Transformer's Heart | Chapter 6, Deep Lear...

Demystifying attention, the key mechanism inside transformers and LLMs.
Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support
An equally valuable form of support is to simply share the videos.


serene scaffold Apr 7, 2024, 1:47 PM

#

Hello, remember what I told you about screenshots of text. Are you still having an issue?

#

What fire? Are we doing arson again? lemon_hyperpleased

odd meteor Apr 7, 2024, 1:51 PM

#

I once shared this MLOps Roadmap I found pretty solid. Perhaps you might find it useful
#data-science-and-ml message

terse frigate Apr 7, 2024, 2:23 PM

#

@serene scaffold hey bro can you provide me with the link to register for free credits?

serene scaffold Apr 7, 2024, 2:23 PM

#

terse frigate @serene scaffold hey bro can you provide me with the link to register for f...

idk where it is

terse frigate Apr 7, 2024, 2:24 PM

#

i signed up on aws educate

#

https://www.awseducate.com/registration/s/registration-detail?language=en_US

#

but i am not sure if i am on the right one as you asked me if i was still a student and this did not require my student email or verification

#

hence i thought id just ask you

serene scaffold Apr 7, 2024, 2:25 PM

#

I don't know. I haven't looked at it in at least five years.

terse frigate Apr 7, 2024, 2:25 PM

#

hahah okay okay

#

thank you though thank you 🙏

sage sparrow Apr 7, 2024, 3:03 PM

#

Hi, my data specifically isn't showing in a bar chart with Plotly. I tried visualizing a mock pd.DataFrame and that worked. Any idea why and what to do to fix it?
Link to the other place I asked: #1226543867864682646 message

#

All I get is an empty graph

serene scaffold Apr 7, 2024, 3:06 PM

#

sage sparrow Hi, my data specifically isn't showing in a bar chart with Plotly. I tried visua...

to avoid duplication of effort, please only ask your question in one place.

sage sparrow Apr 7, 2024, 3:07 PM

#

Oh got it, sorry

#

Should I delete this one then?

serene scaffold Apr 7, 2024, 3:08 PM

#

sage sparrow Should I delete this one then?

no, just link to the other place that you asked.

sage sparrow Apr 7, 2024, 3:08 PM

#

Okay, how do I do that?

serene scaffold Apr 7, 2024, 3:08 PM

#

sage sparrow Okay, how do I do that?

go to the other place and right click on the message

sage sparrow Apr 7, 2024, 3:10 PM

#

Like that?...

serene scaffold Apr 7, 2024, 3:10 PM

#

sage sparrow Like that?...

you can copy and paste links to messages; you copy the link by right clicking on the message.

sage sparrow Apr 7, 2024, 3:11 PM

#

sage sparrow Hi, my data specifically isn't showing in a bar chart with Plotly. I tried visua...

What should I change here then?

serene scaffold Apr 7, 2024, 3:11 PM

#

nothing. it's fine.

sage sparrow Apr 7, 2024, 3:11 PM

#

I'm causing way too much trouble lol

serene scaffold Apr 7, 2024, 3:11 PM

#

just remember for next time.

sage sparrow Apr 7, 2024, 3:12 PM

#

Got it, thanks

tranquil mist Apr 7, 2024, 4:08 PM

#

Do you guys know of any insightful papers on upsampling low-passed signals into broadband ones?

orchid forge Apr 7, 2024, 5:30 PM

#

serene scaffold Hello, remember what I told you about screenshots of text. Are you still having ...

Have I not taken it properly?

serene scaffold Apr 7, 2024, 5:32 PM

#

orchid forge Have I not taken it properly?

The point is to not even take screenshots of text. Copy and paste the actual text into the chat. Be sure to always share text that way, unless you literally cannot

#

I'm not home right now, but I might be able to help when I get home. In the meantime, you can share the code and error message as text.

languid glade Apr 7, 2024, 6:01 PM

#

I'm trying to code a Python script that locks onto the color green. It works in general, but inside the game it doesn't move the character to make him look at the green.

orchid forge Apr 7, 2024, 6:11 PM

#

serene scaffold I'm not home right now, but I might be able to help when I get home. In the mean...

Okay

#

Be safe while reaching home

arctic geyser Apr 7, 2024, 6:11 PM

#

orchid forge i wish there was a detailed roadmap for data analysis

School, homie

serene scaffold Apr 7, 2024, 6:12 PM

#

orchid forge Be safe while reaching home

I walked

#

But in general, I always live on the edge

orchid forge Apr 7, 2024, 6:23 PM

#

Oh

half remnant Apr 7, 2024, 9:11 PM

#

I am trying to get the data out of a .mat file which has acceleration data from the MATLAB Mobile app, but it does not seem to work. I can get it out in MATLAB super easily, but when I try to do it in Python it won't work. The data in the file is the time and the x, y, z accelerations; here is the code and the result for it:

from scipy.io import loadmat

# Load the .mat file
acceleration_data = loadmat("drop.mat")

print(acceleration_data)

result:

{'__header__': b'MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Sat Apr  6 11:28:20 2024', '__version__': '1.0', '__globals__': [], 'None': MatlabOpaque([(b'Acceleration', b'MCOS', b'timetable', array([[3707764736],
                     [         2],
                     [         1],
                     [         1],
                     [         1],
                     [         2]], dtype=uint32))                         ],
             dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')]), '__function_workspace__': array([[ 0,  1, 73, ...,  0,  0,  0]], dtype=uint8)}
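
The `MatlabOpaque`, `MCOS` and `timetable` entries in that output suggest the file stores a MATLAB `timetable` object, which `scipy.io.loadmat` does not decode into usable arrays. A common workaround is to convert the timetable to plain numeric arrays in MATLAB before saving (for example with `timetable2table` and `table2array`, or by saving each column as its own variable). The sketch below shows what loading such a re-saved file could look like. The variable names `t`, `x`, `y`, `z` and all values are assumptions, and the file is simulated in memory with `savemat`.

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

# Stand-in for a file re-saved from MATLAB with plain numeric arrays
# (hypothetical variable names t, x, y, z; the values are made up).
buf = io.BytesIO()
savemat(buf, {
    "t": np.array([0.0, 0.1, 0.2]),
    "x": np.array([0.01, 0.02, 0.03]),
    "y": np.array([0.00, -0.01, 0.01]),
    "z": np.array([9.81, 9.79, 9.80]),
})
buf.seek(0)

# squeeze_me=True drops the singleton dimension that MATLAB vectors carry.
data = loadmat(buf, squeeze_me=True)
t, x, y, z = (data[name] for name in ("t", "x", "y", "z"))
print(t.shape, z.mean())
```

With a real file, the `BytesIO` buffer would be replaced by the path of the re-exported `.mat` file.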

vital ocean Apr 7, 2024, 11:46 PM

#

buoyant vine honey, 3blue1brown just uploaded a new video https://www.youtube.com/watch?v=eMl...

2 videos in a week 👀

gritty vessel Apr 7, 2024, 11:48 PM

#

Hello Everyone

#

I have a question: is a CNN's architecture also responsible for the output shape?

#

I am feeding an array of shape (1,1536,1392,6) as input and in output it's an array of (1,1536,1392,2)

#

But whenever I fit it I always gives shape mismatch error

#

And also I can't find any tutorial where they use arrays instead of images in cnn

#

I would really appreciate if you guys can suggest me some resources or projects which uses Multidim arrays in cnn

serene scaffold Apr 8, 2024, 12:35 AM

#

gritty vessel But whenever I fit it I always gives shape mismatch error

whenever you need help related to an error, please remember to always always show the whole error message

serene scaffold Apr 8, 2024, 12:36 AM

#

gritty vessel And also I can't find any tutorial where they use arrays instead of images in cn...

every neural network that deals with images will represent that image as an array, even if they don't say that explicitly. they might also refer to them as tensors instead of arrays.

gritty vessel Apr 8, 2024, 12:43 AM

#

serene scaffold every neural network that deals with images will represent that image as an arra...

Yeah I know they are array only but I don't know why using arrays in code feels different from working with images

serene scaffold Apr 8, 2024, 12:44 AM

#

gritty vessel Yeah I know they are array only but I don't know why using arrays in code feels ...

Well, every tutorial regarding images and CNN will involve images as arrays. that's the only way to do it.

gritty vessel Apr 8, 2024, 12:44 AM

#

You are right

#

For now can I explain the error ? I couldn't get the error from my internship lab they don't allow internet or electronic items in there 😅

serene scaffold Apr 8, 2024, 12:45 AM

#

gritty vessel For now can I explain the error ? I couldn't get the error from my internship la...

Not without seeing the whole error message and the code that produced it.

gritty vessel Apr 8, 2024, 12:45 AM

#

Alright just a min I will show a similar error

#

ValueError: Input 0 of layer sequential is incompatible with the layer: expected shape=(None, 1392, 6), but got shape=(68232320, 64)

#

I will try to note down the error today frm my lab computer and will show the proper in today evening

serene scaffold Apr 8, 2024, 1:21 AM

#

gritty vessel ValueError: Input 0 of layer sequential is incompatible with the layer: expected...

if 68232320 were evenly divisible by 1392, my guess would be that you concatenated a bunch of instances incorrectly.

#

when it says that the expected shape is (None, 1392, 6), the None means "however many instances you have" or "batch size". so every instance, if viewed stand-alone, would be of shape (1392, 6)

#

and if you had three of those together, then the shape would be (3, 1392, 6). and if you instead had (4176, 6), which is 1392 * 3, you'd know you messed that step up somehow.
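
The difference between the two shapes described above can be reproduced with NumPy. This is a small sketch with made-up zero-filled instances; only the shapes matter.

```python
import numpy as np

# Three made-up instances, each of shape (1392, 6).
instances = [np.zeros((1392, 6)) for _ in range(3)]

# np.stack adds a new leftmost axis: one slot per instance.
batch = np.stack(instances, axis=0)
print(batch.shape)  # (3, 1392, 6)

# np.concatenate joins along an existing axis and loses the instance boundary.
merged = np.concatenate(instances, axis=0)
print(merged.shape)  # (4176, 6)
```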

gritty vessel Apr 8, 2024, 1:23 AM

#

I tried with one array also

serene scaffold Apr 8, 2024, 1:23 AM

#

with one array?

gritty vessel Apr 8, 2024, 1:23 AM

#

I mean stacked array

#

Of shape (1,1536,1392,6) as X

serene scaffold Apr 8, 2024, 1:24 AM

#

you should be stacking instances along a new, leftmost dimension

gritty vessel Apr 8, 2024, 1:24 AM

#

And for y array of shape (1,1536,1392,2)

serene scaffold Apr 8, 2024, 1:24 AM

#

so each instance, viewed on its own, is an array of shape (1536,1392,6)?

gritty vessel Apr 8, 2024, 1:24 AM

#

Yup

serene scaffold Apr 8, 2024, 1:25 AM

#

what is the model designed to do?

gritty vessel Apr 8, 2024, 1:25 AM

#

To predict where lightning is going to happen so I had pin point coordinates of lightning

#

What I did created an image where there is no lightning it will have zero and places where lightning is happening is represented by 1

serene scaffold Apr 8, 2024, 1:26 AM

#

gritty vessel To predict where lightning is going to happen so I had pin point coordinates of ...

makes sense.

gritty vessel Apr 8, 2024, 1:26 AM

#

And another one is count wherever the lightning is 1 what's the count of it

serene scaffold Apr 8, 2024, 1:26 AM

#

can you show the whole error message, starting from traceback?

gritty vessel Apr 8, 2024, 1:26 AM

#

At that place so we can measure the Intensity

gritty vessel Apr 8, 2024, 1:27 AM

#

serene scaffold can you show the whole error message, starting from traceback?

Is it ok if I send it in evening? I will write the error msg in my book when I return frm my lab

serene scaffold Apr 8, 2024, 1:27 AM

#

gritty vessel Is it ok if I send it in evening? I will write the error msg in my book when I ...

idk what time it is for you, but I check this channel pretty frequently.

gritty vessel Apr 8, 2024, 1:27 AM

#

I tried it on colab and my system but both crashed

serene scaffold Apr 8, 2024, 1:28 AM

#

gritty vessel I tried it on colab and my system but both crashed

okay... was there an error message?

gritty vessel Apr 8, 2024, 1:28 AM

#

Not enough resources error on my system

#

And Colab crashed and restarted

serene scaffold Apr 8, 2024, 1:29 AM

#

gritty vessel And Colab crashed and restarted

how many instances did you have loaded in memory?

gritty vessel Apr 8, 2024, 1:29 AM

#

Single

serene scaffold Apr 8, 2024, 1:29 AM

#

what kind of architecture is this?

gritty vessel Apr 8, 2024, 1:29 AM

#

It was basic I just wanted to check whether it runs or not

#

But still it crashed

serene scaffold Apr 8, 2024, 1:30 AM

#

can you copy and paste the whole thing into the paste bin?

#

!paste

arctic wedgeBOT Apr 8, 2024, 1:30 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

gritty vessel Apr 8, 2024, 1:31 AM

#

I can't

serene scaffold Apr 8, 2024, 1:31 AM

#

why not

#

because if you can't show me the code you were planning to work on in the evening, I won't be able to help with that

gritty vessel Apr 8, 2024, 1:32 AM

#

My workplace does not allow electronic items so I didn't took laptop

gritty vessel Apr 8, 2024, 1:33 AM

#

serene scaffold because if you can't show me the code you were planning to work on in the evenin...

I will ask them to mail me the code today at work and I will share my work procedure in evening if that's OK?

serene scaffold Apr 8, 2024, 1:33 AM

#

gritty vessel I will ask them to mail me the code today at work and I will share my work proce...

if you can share the whole code and the entire error message that you are trying to resolve, and it's something I think I can help debug, I will do so.

gritty vessel Apr 8, 2024, 1:34 AM

#

It's 7am here what about at yours?

serene scaffold Apr 8, 2024, 1:34 AM

#

I'm east coast us

#

9:34 pm

#

anyway, problems like this are either because instances are being joined incorrectly, or the layers of the network are set up incorrectly (the output of one layer can't be fed to the next)

gritty vessel Apr 8, 2024, 1:35 AM

#

Yeah that can be the problem

#

Alright I will ping you when I come back home is that ok?

serene scaffold Apr 8, 2024, 1:36 AM

#

For this specific question, yes. But not in general.

gritty vessel Apr 8, 2024, 1:36 AM

#

Yeah for this problem only not for anything general

slender kestrel Apr 8, 2024, 5:21 AM

#

Hey so i am registring for a 36 hours live hackathon and and we are open to decide our own problem statement and the theme is build with ai to solve real world problems so what can be some realworld problems which i can solve using ai in 36 hours the problem statement should be novel since its is for a hackathon

tired lodge Apr 8, 2024, 11:10 AM

#

slender kestrel Hey so i am registring for a 36 hours live hackathon and and we are open to deci...

!rule homework | i have a feeling this applies but to answer your question go for smth like the medical field

arctic wedgeBOT Apr 8, 2024, 11:10 AM

#

Rules

8. Do not help with ongoing exams. When helping with homework, help people learn how to do the assignment without doing it for them.

foggy heath Apr 8, 2024, 11:11 AM

#

How can I implement AI to my discord bot? I want it to behave just like other AI models but I want to also give it a whole lot of information to answer certain questions and talk in certain ways. Is there like some website where I can create/edit a model then use APIs to make my bot send the model's messages.

serene scaffold Apr 8, 2024, 1:16 PM

#

foggy heath How can I implement AI to my discord bot? I want it to behave just like other AI...

do you have a specific idea in mind for what "AI" functionality you want in the bot? because the answer changes depending on what it is.

foggy heath Apr 8, 2024, 1:17 PM

#

Language generation, I have no intention for it to generate any other media (right now.)

serene scaffold Apr 8, 2024, 1:17 PM

#

foggy heath Language generation, I have no intention for it to generate any other media (rig...

so you can just make API calls to whatever LLM service you want to use. you wouldn't need any special knowledge about generative AI to do this.

foggy heath Apr 8, 2024, 1:18 PM

#

I'm currently testing out hugging face's transformers and a pretrained model of gpt2 but things aren't going as planned

serene scaffold Apr 8, 2024, 1:18 PM

#

foggy heath I'm currently testing out hugging face's transformers and a pretrained model of ...

GPT-2 would be shit as a chat bot, just so you know

foggy heath Apr 8, 2024, 1:18 PM

#

serene scaffold so you can just make API calls to whatever LLM service you want to use. you woul...

Well I want to make an original model or at least edit a pre existing one

foggy heath Apr 8, 2024, 1:19 PM

#

serene scaffold GPT-2 would be shit as a chat bot, just so you know

really?

#

right now I can't get it to even generate simple english

serene scaffold Apr 8, 2024, 1:19 PM

#

foggy heath right now I can't get it to even generate simple english

that problem is solveable. but it probably won't generate a coherent conversation.

foggy heath Apr 8, 2024, 1:19 PM

#

foggy heath Apr 8, 2024, 1:20 PM

#

serene scaffold that problem is solveable. but it probably won't generate a coherent conversatio...

is there a model you recommend for me to use to test, and i can edit after?

serene scaffold Apr 8, 2024, 1:20 PM

#

foggy heath is there a model you recommend for me to use to test, and i can edit after?

how big is your GPU?

foggy heath Apr 8, 2024, 1:20 PM

#

uhh

#

gtx 1650 i believe laptop gpu

serene scaffold Apr 8, 2024, 1:20 PM

#

foggy heath is there a model you recommend for me to use to test, and i can edit after?

None

foggy heath Apr 8, 2024, 1:20 PM

#

whaaaaaaaaaaaat

serene scaffold Apr 8, 2024, 1:21 PM

#

you'll never be able to fine-tune ("edit") an LLM with that little compute power.

serene scaffold Apr 8, 2024, 1:23 PM

#

foggy heath whaaaaaaaaaaaat

the state-of-the-art in AI pretty much always depends on the best hardware that currently exists anywhere. so if you want to do something in a home lab, you need to be content with staying a few generations behind, or paying for compute resources.

#

with a GTX 1650, you're probably still behind the "fine tuning language models" stage. and definitely behind for fine-tuning any 7 billion+ parameter models.

muted hollow Apr 8, 2024, 2:45 PM

#

guys, which ide should i use. Im currently using google colab but keep exceeding usage limit

serene scaffold Apr 8, 2024, 2:45 PM

#

muted hollow guys, which ide should i use. Im currently using google colab but keep exceeding...

colab isn't an IDE. but if you're looking for alternatives to colab, and you're maxing out what colab is willing to give you for free, you'll have to be willing to pay.

misty mist Apr 8, 2024, 2:52 PM

#

muted hollow guys, which ide should i use. Im currently using google colab but keep exceeding...

yeah, colab is not IDE, i recommend you to use VS code, i have been using it for years now.

#

If you are using colab to train NN model, i think colab provides tesla t4 gpu. I recommend you to use gpu of kaggle, it will provide you p100

#

but i think there is 12 hour limit, and 30 hours limit per week

toxic mortar Apr 8, 2024, 4:00 PM

#

serene scaffold with a GTX 1650, you're probably still behind the "fine tuning language models"...

My room is like 10 degrees hotter than the rest of my appartment cause I am performing 24h+ RandomSearch for hyperparam tuning

#

with i9 13thgen

midnight harbor Apr 8, 2024, 5:10 PM

#

Guys, can anyone recommend, from their own experience, the best language detection (classification) library or free API for Python?
Tag me when anyone replies.

frozen tundra Apr 8, 2024, 7:19 PM

#

hi, i made my own neural network, and it works great using one input and one output only, but when i tried teaching it using the mnist dataset (more than one input and output) it doesn't work anymore. there is a mismatch of matrices and i don't understand how it works for one input and output but not for more https://paste.pythondiscord.com/JE2Q

#

the error is:
Traceback (most recent call last):
  File "C:\Users\iddob\PycharmProjects\Neural2\main.py", line 24, in <module>
    nn.back_prop(x_train[i], y)
  File "C:\Users\iddob\PycharmProjects\Neural2\NeuralNetwork.py", line 112, in back_prop
    d_predicted = (np.dot(self.weights[i].T, d_predicted) *
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: shapes (10,20) and (10,) not aligned: 20 (dim 1) != 10 (dim 0)

serene scaffold Apr 8, 2024, 7:34 PM

#

frozen tundra hi, i made my own neural network, and it works great using one input and one ou...

sounds like your model doesn't properly handle batching. if each instance is a (10, 20)-shape array, and you want to pass exactly one instance through the model, you need to do it as a (1, 10, 20)-shape array

#

great job getting it to work on individual instances, though! the hard part is done.
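
Another thing worth checking, as a guess from the traceback alone (the pasted code is not reproduced here): the shapes `(10,20)` and `(10,)` would also appear if `self.weights[i]` is stored as `(n_in, n_out) = (20, 10)` and the transpose in `back_prop` is not needed for that layout. The sketch below uses made-up arrays to show why such a mistake stays hidden with one input and one output. It leaves out the activation-derivative factor that the real `back_prop` line goes on to multiply by.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layout: weights[i] has shape (n_in, n_out) = (20, 10),
# so weights[i].T is the (10, 20) array named in the traceback.
W = rng.normal(size=(20, 10))
delta_out = rng.normal(size=10)  # error signal at a 10-unit output layer

try:
    np.dot(W.T, delta_out)  # (10, 20) . (10,): inner dimensions 20 != 10
    failed = False
except ValueError as err:
    failed = True
    print("mismatch:", err)

# Without the transpose the inner dimensions agree: (20, 10) . (10,) -> (20,)
delta_hidden = np.dot(W, delta_out)
print(delta_hidden.shape)  # (20,)

# With one input and one output every array is 1x1, so both forms run fine.
w_small, d_small = np.ones((1, 1)), np.ones(1)
print(np.dot(w_small.T, d_small), np.dot(w_small, d_small))
```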

frozen tundra Apr 8, 2024, 7:44 PM

#

Thanks, I will try to implement it. Do you have a website or an article you can direct me to?

serene scaffold Apr 8, 2024, 7:49 PM

#

frozen tundra Thanks, I will try to implement it. Do you have a website or an article you can ...

not off the top of my head. look into array broadcasting in numpy
it's the same with pytorch tensors.
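
As a rough illustration of the batching and broadcasting idea mentioned above, here is a sketch of one dense layer applied to a single instance and to the same instance wrapped as a batch of one. The layer sizes (784 inputs for a flattened 28x28 MNIST image, 10 outputs) and the random weights are assumptions for the example, not taken from the pasted code.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(784, 10))  # made-up layer: 784 inputs, 10 outputs
b = np.zeros(10)

single = rng.normal(size=784)   # one instance, shape (784,)
batch = single[np.newaxis, :]   # same instance as a batch of one, shape (1, 784)

out_single = single @ W + b     # shape (10,)
out_batch = batch @ W + b       # b is broadcast across the batch axis -> (1, 10)
print(out_single.shape, out_batch.shape)
```

Writing the layer so that the leftmost axis is always the batch axis lets the same code handle one instance or many.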

frozen tundra Apr 8, 2024, 7:50 PM

#

Thanks I will look into it

serene scaffold Apr 8, 2024, 8:17 PM

#

frozen tundra hi, i made my own neural network, and it works great using one input and one ou...

#1226974209645744178 message: to avoid duplication of effort, please ask your question in one place, and link to it elsewhere

frozen tundra Apr 8, 2024, 8:18 PM

#

serene scaffold https://discord.com/channels/267624335836053506/1226974209645744178/122697420964...

Sorry I didn't know it was possible

serene scaffold Apr 8, 2024, 8:18 PM

#

frozen tundra Sorry I didn't know it was possible

you can right click a message to get a link

frozen tundra Apr 8, 2024, 8:18 PM

#

Thank you

robust stratus Apr 8, 2024, 8:54 PM

#

I am using data grip and I can’t see the page where it shows my query results

#

Idk why

serene scaffold Apr 8, 2024, 9:18 PM

#

robust stratus I am using data grip and I can’t see the page where it shows my query results

That would be a question for #editors-ides

orchid sky Apr 9, 2024, 1:01 AM

#

@serene scaffold I was able to figure out the issue as it was indexing

serene scaffold Apr 9, 2024, 1:01 AM

#

YAY

orchid sky Apr 9, 2024, 1:01 AM

#

I did review python once more

#

Thank you again for the help

mild grotto Apr 9, 2024, 1:20 AM

#

My brain is having trouble with my project

warm flame Apr 9, 2024, 7:55 AM

#

hi, everybody. does anyone know if there is a way to extract tweets without the X API?

fallow leaf Apr 9, 2024, 9:01 AM

#

Hello guys, what is the best way to learn machine learning with python? I tried learning about linear regression but I feel overwhelmed by all the math and visualization and coding that are behind every concept.
have you tried self studying ML? if you have what is your suggestion

subtle remnant Apr 9, 2024, 9:09 AM

#

fallow leaf Hello guys, what is the best way to learn machine learning with python? I tried ...

Python is a must bruh to apply your knowledge

jaunty helm Apr 9, 2024, 9:19 AM

#

fallow leaf Hello guys, what is the best way to learn machine learning with python? I tried ...

instead of trying to learn ML & python at the same time, learn one first then incorporate the other

fallow leaf Apr 9, 2024, 9:22 AM

#

i know about python and some basic stuff about needed libraries

jaunty helm Apr 9, 2024, 9:24 AM

#

ideally you can read the code and know at least like 80% of what's going on in the code when learning a new ML technique
what you don't want is to often have to look up what the functions / syntaxes do in a specific step

fallow leaf Apr 9, 2024, 9:30 AM

#

I like reading about stuff, any book suggestions?

jaunty helm Apr 9, 2024, 9:31 AM

#

fallow leaf I like reading about stuff, any book suggestions?

see pinned

fallow leaf Apr 9, 2024, 9:31 AM

#

thanks

toxic mortar Apr 9, 2024, 10:15 AM

#

in the DL would you normalize every input ( non-categorical ) data?

stiff urchin Apr 9, 2024, 1:25 PM

#

Before learning concepts like linear algebra, probability statistics, calculus for machine learning, what topics do i need to have a solid grasp to understand these concepts clearly? could anyone help with this?

spark nimbus Apr 9, 2024, 1:29 PM

#

basic algebra would be a good start, after that shift to the topics you mentioned (I'd recommend linalg -> calculus -> probstat)

past meteor Apr 9, 2024, 1:36 PM

#

That's how it was done in my bachelors. lin alg -> (multivariate) calculus -> probability -> statistics (1 course) -> econometrics -> ML

plucky sedge Apr 9, 2024, 3:17 PM

#

someone please help with my question, I've been waiting over 1 hour for someone to reply (I had to repost) #1227266024231932015 message
🙏🙏

slender kestrel Apr 9, 2024, 4:27 PM

#

tired lodge !rule homework | i have a feeling this applies but to answer your question go fo...

It wasnt related to homework it was just something i was confused and needed a bit of advice :)

tired lodge Apr 9, 2024, 4:57 PM

#

slender kestrel It wasnt related to homework it was just something i was confused and needed a b...

ohh advice. sorry i thought you meant we should come up with the ideas instead of you, which is why i mentioned that rule would probably apply to that scenario. my apologies 👍🏽

rancid sorrel Apr 9, 2024, 6:37 PM

#

anyone know how to deal with pyspark and "dirty" text, when it comes to nlp imports

#

https://pastebin.com/pt9wd5WK

Pastebin

text,generated"Cars. Cars have been around since they became famous...


rancid sorrel Apr 9, 2024, 7:07 PM

#

got it to work with

```
      .option("header", "true")
      .option("quote", "\"")
      .option("escape", "\"")
      .option("multiline", "true")
      .load(datafilenew))
```

flint aurora Apr 9, 2024, 7:38 PM

#

anyone wanted an easier way to work with dates in python? checkout https://dateroll.disent.com, just released today

full furnace Apr 9, 2024, 9:56 PM

#

Are data science jobs still in demand? What's the essential stuff I need to learn to land an entry-level job?

serene scaffold Apr 9, 2024, 11:13 PM

#

full furnace Data science job still in demand? What's the essential stuff I need to learn to ...

You need at least a bachelors degree in computer science or similar, with an emphasis in data science/AI. Usually you need a masters.

I think the "data science" hype is pretty much over, and that people have turned their attention to hiring ML engineers. And it requires a lot of knowledge to be able to perform that work.

hollow sentinel Apr 9, 2024, 11:51 PM

#

📎 Analysis_G1_Value.csv

#

hi everyone, i'm trying to visualize this data

#

so i want a line graph in seaborn

#

where each ID is its specific line

#

i want the IQ on the y axis and the POS on the x axis

#

and i want to use the month and year column on the x axis

#

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
df = pd.read_excel("/Users/rahuldas/Desktop/BAN 112 Final Project/Analysis_G1_Value.xlsx")
print(df.columns)
df['Date'] = pd.to_datetime(df['Year'].astype(str) + df['Month'], format='%Y%B').dt.strftime('%Y-%m')
print(df.columns)

#

this is what i have so far

#

am i making sense?

#

i'm just so stuck

hollow sentinel Apr 10, 2024, 12:39 AM

#

i'm confused too

#

is what i'm suggesting not possible?

serene scaffold Apr 10, 2024, 1:01 AM

#

> IQ on the y-axis, POS on the x-axis, and I want to use the month and year column on the x-axis
if you have a date and two features, then the date needs to be the x axis. You'll probably need to do it as two lines, one for IQ and one for POS.

hollow sentinel Apr 10, 2024, 1:02 AM

#

serene scaffold > IQ on the y-axis, POS on the x-axis, and I want to use the month nd year colum...

ah so what i was saying wasn’t possible

#

it’s gonna be chaos with that many lines

serene scaffold Apr 10, 2024, 1:03 AM

#

It would just be two.

hollow sentinel Apr 10, 2024, 1:03 AM

#

huh

#

i think my brain is dead

serene scaffold Apr 10, 2024, 1:04 AM

#

hollow sentinel Apr 10, 2024, 1:05 AM

#

serene scaffold

hmmmm

serene scaffold Apr 10, 2024, 1:05 AM

#

the dates are out of order. that's a problem.

hollow sentinel Apr 10, 2024, 1:06 AM

#

oh shit, i didn't realize that

serene scaffold Apr 10, 2024, 1:07 AM

#

here we go

#

looks like dates aren't unique
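A rough sketch of one way to deal with both problems (dates out of order, dates not unique) before plotting. The column names `ID`, `Year`, `Month`, `IQ` and `POS` are taken from the thread, the sample rows are invented, and summing duplicate months is only an assumption about what makes sense for this data:

```python
import pandas as pd

# Invented sample rows; the real file has its own values.
df = pd.DataFrame({
    "ID": ["a", "a", "b", "b"],
    "Year": [2023, 2022, 2023, 2023],
    "Month": ["March", "December", "March", "January"],
    "IQ": [10, 5, 7, 3],
    "POS": [8, 4, 6, 2],
})

# Keep Date as a real datetime (not a string) so it sorts chronologically.
df["Date"] = pd.to_datetime(df["Year"].astype(str) + df["Month"], format="%Y%B")

# Collapse duplicate dates into one row each, then order by date.
# Summing is an assumption; a mean may suit the data better.
monthly = (
    df.groupby("Date", as_index=False)[["IQ", "POS"]]
    .sum()
    .sort_values("Date")
)
print(monthly)
```

With one row per date, `monthly` can be drawn as two lines (IQ and POS against Date) without the zig-zag that unsorted, repeated dates produce.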

hollow sentinel Apr 10, 2024, 1:08 AM

#

that's a wacky looking graph

serene scaffold Apr 10, 2024, 1:08 AM

#

it's a plot. not a graph. if you're going to be a data scientist, you can't use those words interchangeably.

hollow sentinel Apr 10, 2024, 1:08 AM

#

...plot

#

i've been stuck carrying this group project on my shoulders for the past 3 weeks

#

it's an eight person group project

#

eight people stel

#

the whole goal is to figure out the relationship between invoiced quantity (IQ) and sales (POS)

#

ugh whatever i guess i'll go to office hours again

#

https://tenor.com/view/stitch-life-lilo-and-stitch-headbang-stressed-gif-11704444

hollow sentinel Apr 10, 2024, 1:14 AM

#

serene scaffold it's a plot. not a graph. if you're going to be a data scientist, you can't use ...

where a graph is infinite and a plot is finite?

serene scaffold Apr 10, 2024, 1:14 AM

#

hollow sentinel where a graph is infinite and a plot is finite?

a graph is nodes and edges, and a plot is a data visualization

hollow sentinel Apr 10, 2024, 1:14 AM

#

oh oh oh, ok

serene scaffold Apr 10, 2024, 1:19 AM

#

hollow sentinel the whole goal is to figure out the relationship between invoiced quantity (IQ) ...

why didn't you say so? if you're trying to figure out the relationship between two features, and you don't think the time that they were recorded influences that relationship, then you use a scatter plot.

hollow sentinel Apr 10, 2024, 1:20 AM

#

serene scaffold why didn't you say so? if you're trying to figure out the relationship between t...

but a scatterplot wouldn't have time on one of its axes

serene scaffold Apr 10, 2024, 1:21 AM

#

hollow sentinel but a scatterplot wouldn't have time on one of its axes

so?

hollow sentinel Apr 10, 2024, 1:23 AM

#

serene scaffold so?

oh wait you're right

desert oar Apr 10, 2024, 2:02 AM

#

hollow sentinel but a scatterplot wouldn't have time on one of its axes

if you're interested in the evolution of the relationship over time, color the points by time

#

it infuriates me to no end that Axes.scatter doesn't accept markersize=, only the magic s=

#

i like the magic s=, but sometimes i don't want it! argh

#

@hollow sentinel example of what i was talking about above:

import matplotlib.pyplot as plt
import numpy as np

seed = 2610494104
T = 1000
rng = np.random.default_rng(seed)
t = np.arange(T)
x = t / 1000 + rng.normal(scale=0.25, size=T)
y = x * 2.5 - 0.5 + rng.normal(scale=0.5, size=T)

plt.scatter(x, y, c=t, s=5)
plt.colorbar(label="Time")
plt.xlabel("X")
plt.ylabel("Y")
plt.title(f"Simulated data\n{seed=}")
plt.show()

(edited to use a better example)

hollow sentinel Apr 10, 2024, 2:24 AM

#

desert oar @hollow sentinel example of what i was talking about above: ```python impo...

Gotcha ty

shut girder Apr 10, 2024, 4:46 AM

#

Hello everyone, I am currently studying Linear Algebra for data science. What are the key concepts that I should get a good understanding of? I don't know if studying concepts arbitrarily would be a good use of time

wooden sail Apr 10, 2024, 6:15 AM

#

shut girder Hello everyone, I am currently studying Linear Algebra for data science, what ar...

vector spaces and subspaces, linear (and affine) transformations, dimension and rank (and the related geometry like hyperplanes), invertibility (and pseudo invertibility), matrix decompositions (especially eigenvalue decomp and singular value decomp)

these are some of the basics that help you understand concepts that come up later on in e.g. ML and other math topics
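To make a few of those concepts concrete (rank, singular value decomposition, pseudo-inverse), here is a small NumPy sketch. The matrix and vector are arbitrary illustration values, not anything from the discussion:

```python
import numpy as np

# A 3x2 matrix: it maps 2-D vectors into a 2-D subspace of 3-D space.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])

# Singular value decomposition: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank = number of singular values that are not (numerically) zero.
rank = int(np.sum(s > 1e-10))

# A is not square, so it has no inverse, but it has a pseudo-inverse.
A_pinv = np.linalg.pinv(A)

# The pseudo-inverse gives the least-squares solution of A x = b.
b = np.array([1.0, 2.0, 1.0])
x = A_pinv @ b

print("rank:", rank)
print("singular values:", s)
print("least-squares x:", x)
```

The same decomposition shows up later in things like PCA and low-rank approximation, which is part of why it is worth learning early.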

river mural Apr 10, 2024, 8:18 AM

#

is there a `numpy`/`scipy` alternative that supports very high precision like `float128`, `float256`, `float512` or even exact math (i do not care much about performance)

jaunty helm Apr 10, 2024, 10:43 AM

#

river mural is there a `numpy`/`scipy` alternative that supports very high precision like `f...

sympy (written in pure python) or symengine (written in c++) ?

long canopy Apr 10, 2024, 11:27 AM

#

new mixtral + griffin HYPE

desert oar Apr 10, 2024, 1:05 PM

#

wooden sail vector spaces and subspaces, linear (and affine) transformations, dimension and ...

@shut girder i would add projections and norms to this list. basically most of an intro linalg curriculum. it all comes up at some point

desert oar Apr 10, 2024, 1:08 PM

#

river mural is there a `numpy`/`scipy` alternative that supports very high precision like `f...

arrow maybe has float128? kind of interesting that there isn't a library for it. if all else fails, try pandas with "object" dtype containing decimal.Decimal elements

#

wait, numpy has float128 @river mural
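A caveat worth checking before relying on it: `np.float128` is an alias for `np.longdouble` and only exists on platforms where the C `long double` is 16 bytes wide. On typical x86-64 Linux that is 80-bit extended precision padded to 128 bits rather than true quad precision, and on some platforms `long double` is just a 64-bit double. A quick way to see what a given machine provides:

```python
import numpy as np

# Compare the regular double with whatever "long double" is on this machine.
for dtype in (np.float64, np.longdouble):
    info = np.finfo(dtype)
    print(dtype.__name__, "| bits:", info.bits,
          "| eps:", info.eps, "| decimal digits:", info.precision)
```

If the two rows print the same precision, `longdouble` gives no extra accuracy on that machine.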

agile cobalt Apr 10, 2024, 2:05 PM

#

river mural is there a `numpy`/`scipy` alternative that supports very high precision like `f...

for "exact math" you'll probably want to use something like Decimal iirc?

agile cobalt Apr 10, 2024, 2:07 PM

#

desert oar arrow maybe has float128? kind of interesting that there isn't a library for it....

looks like arrow has https://arrow.apache.org/docs/python/generated/pyarrow.decimal128.html#pyarrow.decimal128

but not "float128"

#

weirdly enough there is a is_decimal256 function but I do not see documentation about the decimal256 type itself

leaden narwhal Apr 10, 2024, 2:12 PM

#

can anyone link me to a webpage talking about LSTMs and RNNs for grids, number of people and datetime

#

aka i want to predict future number of people in a certain grid in future times

serene scaffold Apr 10, 2024, 2:47 PM

#

if you ask your question in more than one place, please link to the help thread, so that there's no duplication of effort

iron ruin Apr 10, 2024, 2:49 PM

#

serene scaffold if you ask your question in more than one place, please link to the help thread,...

ohh yeah , sorry

#

ill delete the message since i posted in help desk

#

dammet why scipy gotta go remove those funcs all of a sudden yert

iron ruin Apr 10, 2024, 3:25 PM

#

serene scaffold if you ask your question in more than one place, please link to the help thread,...

wait ... how do I get the link to my help thread ?

serene scaffold Apr 10, 2024, 3:25 PM

#

iron ruin wait ... how do I get the link to my help thread ?

you can right click a message and pick copy message link

iron ruin Apr 10, 2024, 3:25 PM

#

ohh alright thanks

serene scaffold Apr 10, 2024, 3:25 PM

#

yw

iron ruin Apr 10, 2024, 3:26 PM

#

now to wait while I try to somehow solve this dayum triu issue

iron ruin Apr 10, 2024, 3:49 PM

#

serene scaffold yw

if I want to ask for help regarding import issues instead , which channel should I ask ?

serene scaffold Apr 10, 2024, 3:49 PM

#

iron ruin if I want to ask for help regarding import issues instead , which channel should...

#1035199133436354600
remember to always show the code and the whole error message as text.

hollow escarp Apr 10, 2024, 5:56 PM

#

Hi, I've been told that I should ask here about OpenCV. I'm starting work on plate recognition software that will run on a Raspberry Pi with an additional camera. I'm wondering what methods would give me the highest precision in an outdoor environment. Also, do you know any good cameras that are cheap and would work well even at night and in bad weather? I'm also wondering whether Python would perform much worse than C++ (I don't need super efficient software that works in less than 2 seconds), I just need to keep it as effective as possible

iron basalt Apr 10, 2024, 6:25 PM

#

river mural is there a `numpy`/`scipy` alternative that supports very high precision like `f...

Python has decimal built in.
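A minimal sketch of what the standard library offers here: `decimal` for a chosen number of significant digits and `fractions` for exact rational arithmetic. The precision of 50 digits is an arbitrary choice for illustration:

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Arbitrary-precision decimal arithmetic: choose the number of significant digits.
getcontext().prec = 50
root = Decimal(2).sqrt()
print(root)

# Exact rational arithmetic: no rounding at all.
exact = Fraction(1, 10) + Fraction(2, 10)
print(exact == Fraction(3, 10))   # exact comparison
print(0.1 + 0.2 == 0.3)           # same sum with binary floats
```

Neither type plugs into NumPy's fast array routines (they end up as `object` arrays), so this fits the "don't care about performance" case rather than heavy numerical work.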

hollow escarp Apr 10, 2024, 6:31 PM

#

hollow escarp Hi, i've been told that i should ask here about opencv. Im currently starting wo...

Also, I found some datasets with images for training YOLOv8, and my question is: how accurate are these sets? https://universe.roboflow.com/roboflow-universe-projects/license-plate-recognition-rxg4e/dataset/4 - 22174 images

Roboflow

License Plate Recognition Object Detection Dataset (v4, resized640_...

10126 open source license-plates images and annotations in multiple formats for training computer vision models. License Plate Recognition (v4, resized640_aug3x-ACCURATE), created by Roboflow Universe Projects

odd meteor Apr 10, 2024, 8:15 PM

#

leaden narwhal anyone can link me to a webpage talking about lstms and rnn's for grids, number ...

I presume you're working on a time series project. If that's the case, you can use Uber's H3 library to add hexagonal grids on your spatial features.

If you check online I'm sure you'll find resources on geospatial or even spatio-temporal timeseries projects that utilized H3 to group spatial data into bins, and also trained the model using LSTM.

jade bloom Apr 10, 2024, 8:29 PM

#

Hello, I have a question. I've trained an RL model to pathfind to a target. It's all just a square (the robot) and a point (the target). Every frame the model uses the distance from the robot to the target, the target's coordinate point, and the robot's coordinate point to estimate the optimal angle to drive at to reach the target. After the angle is found, in the same frame the robot moves at that angle.
However, I am running into an issue where the model returns angles, but they are noisy. The drive angles vary quite a bit, +- 30 degrees. The robot is still able to drive to the point, it just jitters a lot. Is there any way to smooth out the robot's path and/or filter the noise?
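One common way to damp jitter like this is a low-pass filter on the model's output, for example an exponential moving average. The sketch below is only an illustration under assumptions: angles are in degrees, `alpha` is a hypothetical smoothing factor to tune by hand, and the averaging is done on unit vectors so that 350° and 10° average to roughly 0° instead of 180°:

```python
import math

def smooth_angles(angles_deg, alpha=0.2):
    """Exponentially smooth a sequence of angles given in degrees.

    alpha near 0 means heavy smoothing (slow to react),
    alpha near 1 means almost no smoothing.
    """
    smoothed = []
    sx = sy = None
    for angle in angles_deg:
        x = math.cos(math.radians(angle))
        y = math.sin(math.radians(angle))
        if sx is None:
            sx, sy = x, y                      # first sample: nothing to blend with
        else:
            sx = (1 - alpha) * sx + alpha * x  # blend new direction into the old one
            sy = (1 - alpha) * sy + alpha * y
        smoothed.append(math.degrees(math.atan2(sy, sx)) % 360)
    return smoothed

noisy = [30, 60, 0, 55, 5, 45, 15]
print(smooth_angles(noisy))
```

The trade-off is lag: the smaller `alpha` is, the smoother the path and the slower the robot reacts when the target direction really changes.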

sweet zealot Apr 10, 2024, 8:42 PM

#

It's my first day learning AI(RL) I'm watching a video about reinforcement learning in python. I'm not sure I understand these spaces correctly.

Let's say you want to create a bot for the video game Flappy Bird. I thought by space they meant the environment of the game (so the game window in X,Y coordinates). Since the bird can move from the bottom border to the top border, the space of the environment would be:
box(bottom_border, top_border, 2,) (2 because Flappy Bird is a 2D game)?
Is my thinking correct here?

Also, for some reason I saw people put infinite numbers for the lowest and highest values in box(bottom_border, top_border, 2,) and they explained it would be fine. Which makes me realise I do not understand wtf a space actually is used for

mild grotto Apr 10, 2024, 10:39 PM

#

sweet zealot It's my first day learning AI(RL) I'm watching a video about reinforcement learn...

you can't move forward or backward

#

I recommend thinking about it like this:
The INPUT should describe your current state when you pause the game

#

For flappy birds, my thinking is that it would be:
[ your Y coordinate, your Y velocity]
Because your bird is affected by gravity. You have no X coordinate, you can't move forward or backward, or anything like that.

#

Also as part of the input, you would need the locations of all obstacles (how far away/how tall they are)

#

Then it's simple: If you think this fully describes all the information about the current board state, then the output is just "Do I jump, or not jump?"

#

So the output is simply a true/false value (or 0,1)

#

And you can see even in the example under `Dict` he uses `Height: ... Speed:Box(0,100, shape(1,))`, which is exactly what I'm describing for your bird
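The "input describes the paused game, output is jump or not" idea above can be sketched in plain Python. Everything here is hypothetical: the field names, the convention that y grows upward, and the hand-written `naive_policy`, which only stands in for whatever a trained model would output:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    bird_y: float      # vertical position of the bird (assumed: larger = higher)
    bird_vy: float     # vertical velocity; gravity changes it every frame
    pipe_dx: float     # horizontal distance to the next pipe
    gap_top: float     # top edge of the gap in that pipe
    gap_bottom: float  # bottom edge of the gap in that pipe

    def as_vector(self):
        """Flatten into the list of numbers a model would receive as input."""
        return [self.bird_y, self.bird_vy, self.pipe_dx, self.gap_top, self.gap_bottom]

def naive_policy(obs):
    """Hand-written stand-in for a model: 1 = flap, 0 = do nothing."""
    gap_centre = (obs.gap_top + obs.gap_bottom) / 2
    return 1 if obs.bird_y < gap_centre else 0

obs = Observation(bird_y=100, bird_vy=-3, pipe_dx=50, gap_top=220, gap_bottom=140)
print(obs.as_vector(), "->", naive_policy(obs))
```

An RL agent replaces `naive_policy` with something learned, but the shape of the problem stays the same: a vector of numbers in, one of two actions out.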

hallow sphinx Apr 11, 2024, 12:30 AM

#

sweet zealot It's my first day learning AI(RL) I'm watching a video about reinforcement learn...

Why are you directly learning about reinforcement learning on your first day? If it's just an introduction, it's fine, but if you are learning it, you should make sure you know the basic things first. Like about notebooks, maths, libraries etc.

desert oar Apr 11, 2024, 1:10 AM

#

leaden narwhal aka i want to predict future number of people in a certain grid in future times

look up "spatial data analysis" and "spatial statistics"

thin palm Apr 11, 2024, 3:44 AM

#

Hey, I am looking for a way to label data via web services. Is there an existing database that would allow me to deploy a web-based labeling tool?

#

using flask

urban cipher Apr 11, 2024, 4:56 AM

#

so i pickled my keras model and tried loading it into my flask app for use, but an error occurred:

TypeError: unpack_keras_model() missing 1 required positional argument: 'optimizer_weights'

any idea to fix this?

trim saddle Apr 11, 2024, 5:28 AM

#

urban cipher so i pickled my keras model and tried loading it into my flask app for use, but ...

The function requires an argument. Make sure you provide the optimizer_weights

urban cipher Apr 11, 2024, 5:31 AM

#

trim saddle The function requires an argument. Make sure you provide the optimizer_weights

hmmm how though? i already pickled the model before and it should load normally via pickle.load(). were there more steps to it?

#

my Machine Learning instructor even demonstrated how to load a pickled model from his example, and i have tried pickling before and loaded it normally (on google colab), but it seems it won't on my machine. could it be an incorrect version of keras and tensorflow installed?
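A version mismatch is a plausible suspect, since a pickle records which functions to call on load and those can change signature between library versions. A quick, hedged way to check is to print the installed versions in both environments (Colab and the local machine) and compare. The helper name `installed_versions` is made up for this sketch:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result

# Run this in both environments and compare the output.
print(installed_versions(["tensorflow", "keras"]))
```

If the versions differ, matching them (or re-saving the model with Keras's own save/load functions instead of pickle) is the usual next thing to try.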

leaden narwhal Apr 11, 2024, 6:30 AM

#

odd meteor I presume you're working on a time series project. If that's the case, you can u...

Thanks !

leaden narwhal Apr 11, 2024, 6:30 AM

#

desert oar look up "spatial data analysis" and "spatial statistics"

Thanks !

finite fjord Apr 11, 2024, 6:30 AM

#

#

!python /content/Licence-Plate-Detection-using-YOLO-V8/ultralytics/yolo/v8/detect/train.py model=yolov8n.pt data=/content/Licence-Plate-Detection-using-YOLO-V8/helmet-detection-1/data.yaml epochs=100

#

Hello everyone, I'm working on helmet detection using YOLOv8 and I'm facing this error. I have loaded the dataset from Roboflow

lofty thorn Apr 11, 2024, 7:46 AM

#

does anyone know where i can find assignments on excel

gritty vessel Apr 11, 2024, 8:16 AM

#

@serene scaffold hey, I asked a question here regarding a shape error in a CNN. I just wanted to update that I solved it on my own

#

The problem was in my architecture. In my output the image has to be the same shape as the input, so I had to upscale it back to the original size after it gets small

leaden narwhal Apr 11, 2024, 10:14 AM

#

odd meteor I presume you're working on a time series project. If that's the case, you can u...

also my grids are already created, they are square

#

will this still work?

orchid forge Apr 11, 2024, 10:28 AM

#

i need help

sweet zealot Apr 11, 2024, 10:41 AM

#

mild grotto you can't move forward or backward

What exactly are these spaces used for? Are these 'space values' the values the model uses to check differences between actions, and to make its choices based on? That would mean the space values you define (which you define as 'input') are like the foundation for your model, if I'm understanding this correctly.

So what if, instead of just the Y coordinate and Y velocity like you mentioned, I used more values? Would it make the model better?

For example:

- The absolute difference between the bird and the obstacle.
- X,Y coordinates of all obstacles on screen, so not just the upcoming obstacle, allowing the bird to anticipate the obstacles after the upcoming one?
- Y coordinate of the bird.
- Etc...

I'm still a bit confused about the actual spaces the guy in the video defined. but I think I do know what spaces are used for now

sweet zealot Apr 11, 2024, 10:49 AM

#

mild grotto And you can see even in the example under `Dict` he uses `Height: ... Speed:Box(...

You mentioned `Dict(('height':Discrete(2), "Speed":Box(0,100,shape=(1,))))` would be a good fit for a flappy bird model.

But if I break it down I don't understand it. Does height refer to the Y axis? That should be the absolute difference between the bottom border and the top border. But then why does it say Discrete(2)? Does this mean there are 2 values?

I'm assuming speed box just means speed value from 1-100 so that's clear.

Also don't understand what shape means.
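A rough way to see what those pieces describe, using plain-Python stand-ins rather than the real library classes. The `Discrete` and `Box` classes below are hypothetical simplifications written for this sketch (the real `Box` takes a `shape` tuple and works with arrays; here `size=1` plays the role of `shape=(1,)`):

```python
import random

class Discrete:
    """Stand-in for a discrete space: one of the integers 0 .. n-1."""
    def __init__(self, n):
        self.n = n

    def sample(self, rng):
        return rng.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n

class Box:
    """Stand-in for a box space: `size` floats, each within [low, high]."""
    def __init__(self, low, high, size):
        self.low, self.high, self.size = low, high, size

    def sample(self, rng):
        return [rng.uniform(self.low, self.high) for _ in range(self.size)]

    def contains(self, x):
        return len(x) == self.size and all(self.low <= v <= self.high for v in x)

rng = random.Random(0)
# A dict space is just several named spaces bundled together.
space = {"height": Discrete(2), "speed": Box(0, 100, size=1)}
obs = {name: s.sample(rng) for name, s in space.items()}
print(obs)
```

Read this way, `Discrete(2)` means "one value that is either 0 or 1", and `shape=(1,)` means "an array holding a single number". A space only declares which values are legal, which is also why very wide or infinite bounds on a box can still be fine: they just say no tighter limit is known.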

craggy bough Apr 11, 2024, 12:51 PM

#

i am trying to build a sequence to sequence model with 5 features but i need help, as i don't know how to set the outputs for each and the shape. i am using lstm

#

is there anyone who wanna get on voice chat and i can stream the code and you can see and help me

serene scaffold Apr 11, 2024, 1:36 PM

#

gritty vessel @serene scaffold hey I asked a que here regarding shape error in cnn .I jus...

Glad to hear you figured it out!

mild grotto Apr 11, 2024, 2:03 PM

#

sweet zealot You mentioned `Dict(('height':Discrete(2), "Speed":Box(0,100,shape=(1,))))` woul...

Keep in mind, I haven't watched that video and I don't know what that guy is doing. But remember in flappy birds that every pipe comes in pairs

#

They come from both the top and bottom. So you can't just use one "height" value. One value does not tell you where the pipe is. You need two

#

If a pipe is 50 pixels from the top of the screen, does that mean you can go under it or over it? Clearly you need both the start and end of the height

#

So height needs 2 values per pipe

mild grotto Apr 11, 2024, 2:09 PM

#

sweet zealot What exactly are these spaces used for? Are these 'space values' the values the...

You can add extra inputs to make the model better, yes. That's what we call "black magic", or "feature selection". There is no single correct method for selecting the right features. I suggest starting with a simple input and seeing if it works well enough. If not, try adding other features

elfin robin Apr 11, 2024, 5:25 PM

#

are subset selection and feature selection the same?

arctic wedgeBOT Apr 11, 2024, 6:19 PM

#

:incoming_envelope: :ok_hand: applied timeout to @vivid magnet until <t:1712860170:f> (10 minutes) (reason: newlines spam - sent 12 consecutive newlines).

The <@&831776746206265384> have been alerted for review.

vivid magnet Apr 11, 2024, 6:30 PM

#

Hello guys.
I'm trying to install the delta-spark package to work with Delta Lake in Python, but when I follow the tutorial steps, it raises 'PySparkRuntimeError : [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.'
I already installed Java from the Oracle site (version 17), and JAVA_HOME is set in the environment variables (C:\Program Files\Common Files\Oracle\Java\javapath\java.exe). But the problem persists. I'm on Windows and idk how to set PYSPARK_SUBMIT_ARGS
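One thing that stands out in the value quoted above: `JAVA_HOME` conventionally points at the JDK install directory (the folder that contains `bin`), not at `java.exe` itself. A small diagnostic sketch, where `check_java_home` is a hypothetical helper name and the messages are just illustrative:

```python
import os
from pathlib import Path

def check_java_home(value):
    """Return a short diagnosis for a JAVA_HOME value."""
    if not value:
        return "JAVA_HOME is not set"
    home = Path(value)
    if home.is_file():
        return "JAVA_HOME points at a file; it should be the JDK install directory"
    if not home.is_dir():
        return "JAVA_HOME does not exist"
    exe = "java.exe" if os.name == "nt" else "java"
    if not (home / "bin" / exe).exists():
        return f"no bin/{exe} under JAVA_HOME"
    return "JAVA_HOME looks plausible"

print(check_java_home(os.environ.get("JAVA_HOME")))
```

If this reports a problem, pointing `JAVA_HOME` at the JDK folder itself and restarting the terminal or IDE is worth trying before touching `PYSPARK_SUBMIT_ARGS`.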

odd meteor Apr 11, 2024, 7:41 PM

#

leaden narwhal also my grids are already created they are squared

Yeah, but the question is how well? If you want better model performance, go for a hexagonal grid instead of a square grid (H3 creates hexagonal grids)

There are some good reasons to use hexagons instead of square grids. Some of them are:

- You can project them really well onto any round surface (for example, picture fitting a square grid vs a hexagonal grid on something round like a globe, yeah, that round spinnable globe high school geography teachers always seem to have 😀)
- The distance from the centre of one hexagon to the centre of a neighbouring hexagon is the same for every neighbour, but that is not the case if you're using a square or triangle grid.
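The neighbour-distance point can be checked with a little geometry. This sketch uses an idealised flat grid with unit spacing, which is a simplification (real H3 cells on a globe are only approximately regular):

```python
import math

# Square grid: the 8 neighbours of the cell centred at (0, 0), unit cell width.
square = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
square_d = sorted({round(math.hypot(dx, dy), 6) for dx, dy in square})

# Hexagonal grid: the 6 neighbours sit at 60-degree steps, unit centre spacing.
hexagon = [(math.cos(math.radians(60 * k)), math.sin(math.radians(60 * k)))
           for k in range(6)]
hex_d = sorted({round(math.hypot(x, y), 6) for x, y in hexagon})

print("distinct neighbour distances, square grid:", square_d)
print("distinct neighbour distances, hex grid:   ", hex_d)
```

The square grid yields two different distances (edge neighbours and diagonal neighbours), while the hexagonal grid yields one, which is why "how far is the next cell" is a simpler question on hexagons.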

You'll also get better projections with Geohashes.

If you wanna have more fun with your project, you can compare & contrast model performance on spatial features with square grid vs hexagonal grid.

Long video, but you'll understand better what I'm tryna say when you watch this https://youtu.be/TqRGLtbAHHw?si=Kw4abjH9mKgTaYZ_

YouTube

Foursquare

The power of hexagons how h3 & foursquare are transforming spatial...

desert oar Apr 11, 2024, 10:46 PM

#

odd meteor Yeah, but the question is how well? If you want better model performance, go for...

strong +1 for H3, there's no reason to use any other grid system IMO

#

among many other practical reasons, there's already a fast implementation in C and a solid vectorized implementation in the h3ron library. just import and go.

viral zinc Apr 12, 2024, 2:23 AM

#

Hi, I am new to data science and was hoping if someone can share a roadmap with me to learn data science with python and R. Currently I am comfortable writing code in python, but don't know the data science packages (numpy, pandas).

#

also, should I only be focusing on data science with python right now and then later learn it with R? or learn data science both with R and Python at the same time?

serene scaffold Apr 12, 2024, 2:33 AM

#

viral zinc also, should I only be focusing on data science with python right now and then l...

Forget about R.

#

They're not complementary tools. Anything you do with R, you can do with python.

hollow sentinel Apr 12, 2024, 2:35 AM

#

https://tenor.com/view/ron-swanson-parks-and-recreation-parks-and-rec-computer-dumpster-gif-17038619

#

my R prof reading that

dense oar Apr 12, 2024, 3:54 AM

#

Are data science courses necessary components of an AI degree at a university?

fallen steppe Apr 12, 2024, 5:47 AM

#

How do i run the process using the gpu. This is Kaggle

#

It's enabled but it still runs on cpu and ram

rn_image_picker_lib_temp_515a4274-f2e1-4b57-966a-ff9e409ffcd5.jpg

#

Do i have to make changes in the code to use the gpu or something?

wooden sail Apr 12, 2024, 6:13 AM

#

yep

#

code is written a little differently for gpu

#

in pytorch you have to manually move stuff to the gpu
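A minimal reference sketch of what "moving stuff to the gpu" looks like in PyTorch. The tiny `nn.Linear` model and the random batch are placeholders; the pattern (pick a device, move both the model and the data with `.to(device)`) is the part that carries over:

```python
import torch
from torch import nn

# Use the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)      # move the model's parameters
batch = torch.randn(8, 4).to(device)    # move the input data as well

with torch.no_grad():
    out = model(batch)                  # runs on whichever device both live on

print("running on:", out.device)
```

Model and data have to be on the same device, so forgetting one of the two `.to(device)` calls is the usual cause of device mismatch errors.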

fallen steppe Apr 12, 2024, 6:16 AM

#

Ohh alright. Could you just send me any such code for reference

fallen steppe Apr 12, 2024, 6:35 AM

#

Got it 👍

rn_image_picker_lib_temp_bf559dd4-b8ae-4a4b-ba6d-aeb20ba14c7b.jpg

#

Now the gpu is running

leaden narwhal Apr 12, 2024, 7:40 AM

#

odd meteor Yeah, but the question is how well? If you want better model performance, go for...

Ill have to suggest this to the guy in my group

#

Thing is he is a bit proud cause he is an assistant teacher

#

But nonetheless regarding my Machine learning process this whole ordeal has been a nightmare

#

Every time I think I’m going forward

#

I take 20 steps back

#

I feel like dying cause this learning curve is so big

odd meteor Apr 12, 2024, 9:09 AM

#

leaden narwhal Ill have to suggest this to the guy in my group

It always appears difficult until you solve it. The good thing is, you're on the right track. Almost everyone felt this way at some point.

If this is a group project, then by all means feel free to ask for help from your colleagues when you're stuck.

spring field Apr 12, 2024, 11:27 AM

#

Mmm, so, I have finally gone out and made my own ML model for a thing just for sort of practicing... anyway, it's very good at overfitting the data. sobbing
Mostly I just wanna hear your opinions on what part of the process has likely caused this issue.

I'll start with what is my end goal, there's this site https://http.cat/ that provides an API for some images (see first attachment for an example), I want to train a model that can "predict" (?) the coordinates of the top left and bottom right corners of a rectangle that would cover the number. So basically object localization is what it seems to be called (object detection but for a single object?). The images are of different resolutions and I want to devise a somewhat general approach for this and I kinda don't want to involve these images in the training because there are only a few of them, so I want to train the model on a larger dataset and then pretty much just put it to test on the actual images.

Alright, so, the dataset, I picked a rather naive approach for this and basically my idea was/is to generate a bunch of images with a resolution of [500, 800] by [500, 800] (those are ranges, size is picked at random in steps of 1 (so, any integer in the range, this is true for other random ranges mentioned here as well)) (the size is similar to what the end goal image sizes are). The base image/background is black. Then in those images I put a [300, 500] by [300, 500] area of random RGB noise (to sort of simulate the cat picture) (using np.random.randint for each channel) at a random location within the base image. Then I put a randomly generated string from string.ascii_letters of length [10, 30] (randomly chosen length) with a font size of 30 pt in a random place on the base image. Lastly I put a number in range [100, 999] (not randomly generated, these were just made in a sequence, one after another) in a random place on the base image (font size 50 pt). After crafting this random image, I downscaled it to a fixed size of 200 by 200, calculated where the top left and bottom right points should be for the rect and saved that all in a dictionary for use in the model. See the second image for an idea of how the final image looks like. I mean, it's really sort of very random, so I'm guessing that might be throwing the model off as well. And the dataset contains 4 such random images for each number in range [100, 999].

The model is also quite random I think, it's 4 layers of 2d convolution, batch norm, and relu, then it goes through a linear layer and finally a sigmoid. Using SGD as the optimizer and L2 normalization as the loss. See image 3 for the model implementation in pytorch. All that stuff is sort of kind of chosen randomly. Learning rate is 0.001. Batch size is 16. Dataset should be randomly shuffled (using the same seed value).
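For the "calculated where the top left and bottom right points should be" step, here is a hedged sketch of rescaling a box when an image is resized to 200 by 200. The function name `scale_box` is invented, and dividing by the new size to get values in [0, 1] is an assumption based on the model ending in a sigmoid, whose outputs can only fall in that range:

```python
def scale_box(box, orig_size, new_size=(200, 200), normalise=True):
    """Map a box (x1, y1, x2, y2) from an image of orig_size=(w, h) to new_size.

    With normalise=True the result is divided by the new width/height, so every
    coordinate lies in [0, 1] and can be compared with a sigmoid output.
    """
    x1, y1, x2, y2 = box
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    sx, sy = new_w / orig_w, new_h / orig_h          # per-axis scale factors
    scaled = (x1 * sx, y1 * sy, x2 * sx, y2 * sy)
    if normalise:
        scaled = (scaled[0] / new_w, scaled[1] / new_h,
                  scaled[2] / new_w, scaled[3] / new_h)
    return scaled

# Example: a number drawn at (100, 50)-(300, 150) in an 800x500 image.
print(scale_box((100, 50, 300, 150), (800, 500)))
print(scale_box((100, 50, 300, 150), (800, 500), normalise=False))
```

If the targets were left in pixel units (0 to 200) while the network ends in a sigmoid, the loss could never get small, so this is worth ruling out early.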

Now onto how it trains... well, it's overfitting the training data really badly, see image 4.

So, thoughts on what could be the major culprit or maybe everything is horrible. 😅
I suspect the dataset could be more structured to avoid all that stuff overlapping with each other, I can imagine the model itself can be improved as well. Ideally I'd ofc want the model to recognize the numbers and understand that that's where it needs to focus on basically... (way easier said than done, lol)

#

oh also, the testing data is like 20% of the dataset (that is, I take an 80:20 split from the dataset for training and testing respectively), it's not the actual images from that website, it's testing against similar images from the same (randomly generated) dataset

spring field Apr 12, 2024, 11:58 AM

#

where did you go? sobbing
so, sth about making the input less random, like applying the noise to the whole image instead of adding those random areas, and some other things
what about GAN though? that's sth about generative AI, which is certainly sth I'll probably check out some time, so thanks for that bit, but I guess it doesn't fit this case?

#

alright, what if I increase the dataset like 25 times? generating 100 random images for each number? (sounds like it would take forever to compute... that's not good either)
the other idea that just came to mind regarding object localization is creating a varied dataset where some images don't even have a number and have it detect its presence alongside its location

#

how do people even research this stuff? cuz sure I can look up existing solutions to this probably, but I kinda wanted to tackle it myself a bit (though clearly I'm here asking for advice, lol... but anyway, probably better for my learning still and whatnot)
do researchers just put random stuff together and see how well it works out?

wooden sail Apr 12, 2024, 1:51 PM

#

spring field how do people even research this stuff? cuz sure I can look up existing solution...

almost, but more guided

#

you run into a "practical" problem. then you read about that problem and see if there are good solutions. if the solutions aren't satisfactory or you can identify a way of making them better, you try out your ideas. if it works, great! if not (which is like 99% of the time), you dust yourself off and decide whether to try a different approach or move to a different problem

#

doing stuff randomly and/or blindly is a great way to increase those 99% odds to like 99.9999% that you'll fail

lofty thorn Apr 12, 2024, 3:03 PM

#

what is the reward of the agent in reinforcement learning??

#

or penalty

serene scaffold Apr 12, 2024, 4:13 PM

#

lofty thorn what is the reward of the agent in reinforcement learning??

A value that the agent is trying to maximize

#

You have to set up the training procedure in such a way that the agent's actions can cause the reward to go up, if it does the right thing

#

And the agent needs to "know" that that happened
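A toy illustration of those three points, using a two-action "bandit" instead of a full environment. All numbers (win probabilities, `epsilon`, step count) are made up for the sketch; the agent never sees `win_prob`, only the rewards it receives:

```python
import random

def run_bandit(steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    win_prob = [0.2, 0.8]   # hidden from the agent: action 1 is the better one
    value = [0.0, 0.0]      # agent's running estimate of each action's reward
    pulls = [0, 0]
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)          # explore: try something at random
        else:
            action = value.index(max(value))   # exploit: best estimate so far
        # The environment hands back a reward; this is how the agent "knows".
        reward = 1.0 if rng.random() < win_prob[action] else 0.0
        pulls[action] += 1
        value[action] += (reward - value[action]) / pulls[action]  # running mean
        total += reward
    return value, total

value, total = run_bandit()
print("estimated value per action:", value)
print("total reward collected:", total)
```

The reward is the only signal tying actions to outcomes: the agent's estimates drift toward the true payoffs, and maximizing reward ends up meaning "pick action 1 most of the time". A penalty is simply a negative reward.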

signal holly Apr 12, 2024, 7:50 PM

#

guys how do I learn ai/ data science
I've been holding off on this for months now because idk how to exactly go about this

#

I searched up guides but there are so many and I'm still confused on what to do even after I try applying them

#

it's a mess

serene scaffold Apr 12, 2024, 8:17 PM

#

signal holly guys how do I learn ai/ data science I've been holding off on this for months n...

Here are some suggestions/things to keep in mind:

- You will not get a job in DS/AI without a degree.
- Guides on websites like Medium and Towards Data Science aren't actually intended to be helpful. They're just portfolio fodder for the authors.
- DS/AI is applied math, so expect to learn lots of math as a separate thing from learning programming.
- Don't try to learn DS/AI in terms of Python libraries. Python libraries like scikit-learn and pytorch are very helpful for creating models, but they aren't designed so that you learn more about DS/AI as you use them.
- Pick a textbook or video course and stick with it, and make sure that you're actively engaging with it in some way. If it has practice problems or "homework assignments", do them.

iron basalt Apr 12, 2024, 8:17 PM

#

spring field how do people even research this stuff? cuz sure I can look up existing solution...

Research starts with a random guess, but you have a dataset of existing solutions, so your guess does not need to be completely random, you can guess somewhere near the cluster(s) of existing solutions. Then you wander from that initial guess. You can either go wide (a bit in every direction) or narrow (along a single direction).

#

Wide to break out of local minimum. So let yourself go wide every once in a while, especially if it's easy / cheap / quick to do so.

signal holly Apr 12, 2024, 8:31 PM

#

serene scaffold Here are some suggestions/things to keep in mind: - You will not get a job in DS...

ppl say tutorial videos and tutorials in general are a waste

#

what do you think?

serene scaffold Apr 12, 2024, 8:33 PM

#

signal holly ppl say tutorial videos and tutorials in general are a waste

tutorials are fine if you have a general sense for what you're doing, and you're trying to make something similar to what the tutorial is about. But it sounds like you're trying to learn about DS/AI in general. not how to make some hyperspecific thing.

signal holly Apr 12, 2024, 8:33 PM

#

serene scaffold Here are some suggestions/things to keep in mind: - You will not get a job in DS...

also I'd like to add that point 4 is where I really messed up
I tried learning pandas, then matplotlib, but I just couldn't finish because I was demotivated
what I was doing didn't feel like it aligned with my goal of learning ai

serene scaffold Apr 12, 2024, 8:33 PM

#

signal holly also I like to add that point 4 is where I really messed up on I tried learning ...

yeah, because you shouldn't try to learn in terms of libraries.

signal holly Apr 12, 2024, 8:34 PM

#

what would I do after that course tho
would it just be one course

serene scaffold Apr 12, 2024, 8:34 PM

#

signal holly what would I do after that course tho would it just be one course

do a more advanced one

signal holly Apr 12, 2024, 8:35 PM

#

hmm when would I do projects tho
also since ai/ data science is a huge field
how would I know which direction to take

serene scaffold Apr 12, 2024, 8:35 PM

#

signal holly hmm when would I do projects tho also since ai/ data science is a huge field ho...

up to you

signal holly Apr 12, 2024, 8:36 PM

#

serene scaffold Here are some suggestions/things to keep in mind: - You will not get a job in DS...

also would I have to do some 6 hour course
I honestly hate those
because it's very time consuming
and I would barely retain that much info

serene scaffold Apr 12, 2024, 8:37 PM

#

signal holly also would I have to do some 6 hour course I honestly hate those because it's ve...

remember, I told you initially that you need to actively engage with the content. You won't passively learn from watching 6 hours of content.

#

but also, if you're serious about wanting to learn DS/AI, six hours is not that much. if you were in a degree program related to AI, you'd be doing 40+ hours a week.

spring field Apr 12, 2024, 8:39 PM

#

iron basalt Wide to break out of local minimum. So let yourself go wide every once in a whil...

hmm, I have heard of this idea before, from this video about game dev 😁 https://www.youtube.com/watch?v=o5K0uqhxgsE
thanks

spring field Apr 12, 2024, 8:40 PM

#

wooden sail doing stuff randomly and/or blindly is a great way to increase those 99% odds to...

I see, makes sense, guess I'll refine my methods then to not be completely random then

signal holly Apr 12, 2024, 8:40 PM

#

serene scaffold remember, I told you initially that you need to actively engage with the content...

but in between, should I be doing projects within my level to reinforce my skills

serene scaffold Apr 12, 2024, 8:40 PM

#

signal holly but in between, should I be doing projects within my level to reinforce my skill...

yes, exactly

spring field Apr 12, 2024, 8:42 PM

#

I mean, at this point the idea of making an AI Player for a 3D game feels lightyears away (as, of course, expected)
I'll take that idea of writing articles for a portfolio though, I suppose that would help with understanding the concepts better as well

iron basalt Apr 12, 2024, 8:47 PM

#

signal holly ppl say tutorial videos and tutorials in general are a waste

A book has the entire thing mapped out for you and usually has had way more time and effort put into it. If you search random videos, you are picking a bunch of random points all over the place. With enough of those points you could figure it out, but it's just way less efficient.

#

Once you know enough you can pick random points much better, so it can act as something extra or when you want to go more in a specific direction.

#

This also applies to learning libraries by just watching videos versus reading the documentation (if it has any).

signal holly Apr 12, 2024, 8:55 PM

#

iron basalt This also applies to learning libraries by just watching videos versus reading t...

wdym about this
are you saying documentation is better

iron basalt Apr 12, 2024, 8:57 PM

#

Videos are useful at making you aware of things though. A book requires a lot of time to get through and you may not know yet if you actually care about the topics it covers. A book on linear algebra will usually not make it immediately apparent why you should care about it / if it's applicable to your problems, but a short video can quickly demonstrate its use without many details.

iron basalt Apr 12, 2024, 8:58 PM

#

signal holly wdym about this are you saying documentation is better

Documentation is like a book, if you really want to know the library, then you can read the documentation. Videos and other short forms can only give you small parts that may be enough, but in the case of something more complex like AI it can't cover enough (unless the video starts getting so long as to effectively be an audio book).

iron basalt Apr 12, 2024, 9:03 PM

#

spring field hmm, I have heard of this idea before, from this video about game dev 😁 <https:...

Game design is research (unless you do an exact copy).

#

It's also why it's often the hardest part about game dev. Open ended, high dimensional. Not as comfy as just writing a rendering engine using very well established methods (or other more solved parts).

shut girder Apr 12, 2024, 10:50 PM

#

Hello everyone, I am a beginner in machine learning. As I gradually build up my understanding of the mathematics used in machine learning, what would be some good ways to apply what I learn?

grim fog Apr 13, 2024, 3:46 AM

#

shut girder Hello everyone, I am a beginner in machine learning. As I gradually build up my ...

go on kaggle, get some datasets, better if the dataset is not clean so you can do some data pre processing. kaggle has many datasets so you can explore ml nlp deep learning etc. u can also try to deploy your models on for example an android app or web app 👍

cursive folio Apr 13, 2024, 3:56 AM

#

Where can I get info on how to make Survival Predictability (months) on oncology (analysis), I have been researching online a lot but haven't found any info. Any help appreciated!

abstract rune Apr 13, 2024, 7:18 AM

#

signal holly guys how do I learn ai/ data science I've been holding off on this for months n...

I would recommend you do a maths course
try to write functions for matrix multiplication, row echelon form, determinant calculator, gradient descent by yourself
you will learn invaluable stuff

abstract rune Apr 13, 2024, 7:19 AM

#

abstract rune I would recommend you do a maths course try to write functions for matrix mult...

@shut girder maybe something similar
I am thinking of doing this too

odd meteor Apr 13, 2024, 7:54 AM

#

shut girder Hello everyone, I am a beginner in machine learning. As I gradually build up my ...

I'd say project-based learning

lapis sequoia Apr 13, 2024, 8:45 AM

#

Hey everyone! I'm offering $100 to anyone who can help me install two local instances of voice cloning software like TortoiseTTS, X-TTS, etc. I'll pay $50 after each successful install, which means it should consistently clone the voice of a chosen person.

I've run into some snags trying to do it myself, so be prepared for a few challenges along the way.

Specs wise, I'm working with an NVIDIA GeForce RTX 4070 Ti, 13th Gen Intel(R) Core(TM) i5-13400F, and 32 GB of RAM, so hardware shouldn’t be an issue.

I’ll be checking my DMs tomorrow at 18:00 BST and will give everyone a fair shot—first come, first serve. Looking forward to your messages!

peak ridge Apr 13, 2024, 9:16 AM

#

hey chat

orchid forge Apr 13, 2024, 10:21 AM

#

hello i want to make a cluster using a data

#

i really need help here

#

im not getting what data i should use to make a clustering map

#

anyone here?

#

this my dataset

#

how to solve this "Identify any patterns or clusters of restaurants in specific areas"

#

@serene scaffold

#

guys please

orchid forge Apr 13, 2024, 11:05 AM

#

fuck it....i did it myself

#

hahahahaha

#

shove it

#

Rap helps to solve fucked up shit

agile owl Apr 13, 2024, 12:31 PM

#

my spark job failed in the middle due to some random IO Error and java tracebacks suck

#

https://paste.pythondiscord.com/W56Q

#

any suggestions on how to debug this

sweet zealot Apr 13, 2024, 12:55 PM

#

mild grotto Keep in mind, I haven't watched that video and I don't know what that guy is doi...

You're right about the pipes. I forgot to mention that I made the flappy bird game myself. Instead of pipes I used obstacles like small square hitboxes the bird should avoid. There could be 2/3/4/5 obstacles on the same X-axis on top of eachother, but never on the same Y axis. I did this so I can easily add a higher difficulty to the game and see how far I can push the eventual RL model. So it looks a bit like this

Currently I just take a screenshot of the game and return

every obstacle with X,Y coordinates in a list.
the bird X,Y Value.

#

I asked chatGPT and he gave me something like this

```python
self.action_space = spaces.Discrete(2)  # Example: Jump or Don't Jump
self.observation_space = spaces.Box(low=0, high=800, shape=(4,), dtype=np.float32)  # Example: Flappy Y, Obstacle1 Y, Obstacle1 X, Obstacle2 Y, Obstacle2 X
```

#

Action space seems logical,

low=0 high-800 is logical as well.
But then again shape 4? What if there are more obstacle

I think shape 4 should be a value that takes my list with every obstacle. I'm pretty sure the model needs to anticipate on up comming blocks as well so flappy does need to be aware of all obstacles on screen

orchid forge Apr 13, 2024, 1:20 PM

#

orchid forge

This

hollow otter Apr 13, 2024, 2:13 PM

#

can someone explain if there's a way to integrate apps in python

#

like two apps merged by a link or something

serene scaffold Apr 13, 2024, 3:51 PM

#

hollow otter can someone explain if there's a way to integrate apps in python

doesn't sound like a data science question

serene scaffold Apr 13, 2024, 3:51 PM

#

orchid forge fuck it....i did it myself

good job

orchid forge Apr 13, 2024, 4:04 PM

#

serene scaffold good job

Thanks
You have kinda doe eyes tho
They're cute
If it's you in the pfp

serene scaffold Apr 13, 2024, 4:04 PM

#

orchid forge Thanks You have kinda doe eyes tho They're cute If it's you in the pfp

it is

#

thank you

orchid forge Apr 13, 2024, 4:05 PM

#

😊

amber badger Apr 13, 2024, 4:14 PM

#

Hello! Right now I am working on making a KNN model for a discrete outcome (readmission for healthcare patients, either 0 or 1), but since there are so few records with readmitted as 1 compared to the number with 0, the model is having a tough time accurately predicting readmission, instead only predicting everything as non-readmitted. How can I take a sample from my dataframe with more of an emphasis on readmitted patients so they aren't so underrepresented when being put into the model?

#

(Doing this with Pandas DataFrames and SKlearn kNN)

dense atlas Apr 13, 2024, 5:09 PM

#

Hey,I wanna learn about data science
Not dive into algorithm field
But more of making sense of data set and getting conclusions out of it.
Background : coming from economics and stats background,I would like to enter data analyst kind of field ,I still am extremely vague I know,But that is what I wanna learn more about

grim fog Apr 13, 2024, 6:45 PM

#

amber badger Hello! Right now I am working on making a KNN model for a discrete outcome (read...

if your dataset is big enough u can try to remove some rows randomly that contain 0 in that column

#

but if it is a high percentage of rows u have to remove to make the dataset balanced u might have to look into upsampling
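A minimal sketch of both options (dropping majority rows vs. upsampling the minority) using plain pandas. The toy data and the column names `age` and `readmitted` are made up for illustration, not taken from the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'readmitted' is the 0/1 target, heavily skewed toward 0.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(20, 90, size=1000),
    "readmitted": (rng.random(1000) < 0.05).astype(int),
})

majority = df[df["readmitted"] == 0]
minority = df[df["readmitted"] == 1]

# Downsampling: randomly keep only as many majority rows as there are minority rows.
down = pd.concat([
    majority.sample(n=len(minority), random_state=0),
    minority,
])

# Upsampling: draw minority rows with replacement until they match the majority.
up = pd.concat([
    majority,
    minority.sample(n=len(majority), replace=True, random_state=0),
])

print(down["readmitted"].value_counts())
print(up["readmitted"].value_counts())
```

If this feeds a kNN model, it is usually safer to resample only the training split, so the test set keeps the real class ratio.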

agile owl Apr 13, 2024, 7:08 PM

#

this is the rate of mortgage origination in florida

#

Here's the average credit scores for new mortgages corresponding to the previous time series

amber badger Apr 13, 2024, 9:23 PM

#

grim fog but if it is a high percentage of rows u have to remove to make the dataset bal...

Thank you! I've tried sampling with oversampling the minority class but the model isn't getting much better - this is real world data I'm working with for healthcare so I think I may be reaching the limit with what I can do to create a truly accurate model

#

I'm using ChatGPT to learn other methods of going about making the model better but I'm slowly but surely hitting a wall with it

neon lintel Apr 13, 2024, 9:57 PM

#

Has anyone tried to do the KiTS (kidney and kidney tumor segmentation) challenge for learning purposes? If yes, could you please point me to a guide on how to approach the problem? I know there's a lot of proposed solutions as well as general guides to specialized CNNs, and gpt4 has been helpful, but I figured I should ask here as well.

desert oar Apr 13, 2024, 10:48 PM

#

amber badger Hello! Right now I am working on making a KNN model for a discrete outcome (read...

Why KNN specifically?

amber badger Apr 14, 2024, 3:18 AM

#

Its the model we decided to go with, no true reason why outside of experience with it in an undergrad course

gritty vessel Apr 14, 2024, 4:23 AM

#

Is there any way we can use class_weights for 3+ dimensional data?

#

Whenever I feed my data of shape (28,1536,1392,6) and try to use class weights it throws 3+ dimensions data not supported by class_weights error

hollow otter Apr 14, 2024, 4:37 AM

#

serene scaffold doesn't sound like a data science question

My bad what is the right channel to ask?

royal crest Apr 14, 2024, 5:13 AM

#

#python-discussion maybe

the question is way too broad to gain any meaningful response

buoyant shoal Apr 14, 2024, 7:00 AM

#

hi, dumb question, but does anyone see any glaring mistakes on my "curve fit"?

#

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.optimize as opt


df = pd.read_csv("directory")

def func(x, a1,b1,a2,b2, a3, b3, a4, b4):
    return a1*np.exp(-1*b1*x**2) + a2*np.exp(-1*b2*x**2) + a3*np.exp(-1*b3*x**2) + a4*np.exp(-1*b4*x**2)


x_values = df['x']  # Array of x-values
y_values = df['y']  # Array of y-values
y_error = df['y_err']

popt, pcov = opt.curve_fit(func, x_values, y_values, p0 = [1,1]*4)

a1, b1, a2, b2, a3, b3, a4, b4 = popt

fit_y = func(x_values, a1, b1, a2, b2, a3, b3, a4, b4)

plt.plot(x_values, y_values, 'o', label='data')
plt.plot(x_values, fit_y, '-', label='fit')
plt.show()
```

#

I get this lmao

buoyant shoal Apr 14, 2024, 7:42 AM

#

I reworked it by first finding a good mathematical function that would fit this "sloping gaussian"

#

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.optimize as opt


df = pd.read_csv("dest")

def gaus(x,a,x0,sigma, m, c):
    return a*np.exp(-(x-x0)**2/(2*sigma**2)) + (m*x + c)

x_values = df['x']  # Array of x-values
y_values = df['y']  # Array of y-values
y_error = df['y_err']

popt, pcov = opt.curve_fit(gaus, x_values, y_values, absolute_sigma=True)

a,x0,sigma,m, c = popt

perr = np.sqrt(np.diag(pcov))
print(type(perr))

fit_y = gaus(x_values, a,x0, sigma, m, c)

plt.errorbar(x_values, y_values, yerr=y_error) 
plt.errorbar(x_values, fit_y) 
plt.show()

#

what does this mean?

fervent seal Apr 14, 2024, 7:51 AM

#

```python
imu.dropna(inplace=True, subset = ["LONGITUDE", "SBG_ECAN_MSG_EKF_POS(LATITUDE"])
imu = imu.resample(rule="0.05s", on="Timestamp").mean()
#imu.dropna(inplace=True, subset = ["LONGITUDE", "SBG_ECAN_MSG_EKF_POS(LATITUDE"])
#imu.drop_duplicates(inplace=True, subset = ["LONGITUDE", "SBG_ECAN_MSG_EKF_POS(LATITUDE"])
print(imu["LONGITUDE"].head(10))
```
Why do I need to call the second `dropna` after downsampling using resample? It ends up filling some of the rows with NaNs

#

$ python pathplot.py
Timestamp
19665 days 23:55:38.342858076   -81.569231
19665 days 23:55:38.392858076   -81.569230
19665 days 23:55:38.442858076   -81.569231
19665 days 23:55:38.492858076          NaN
19665 days 23:55:39.892858076   -81.569231
19665 days 23:55:39.942858076   -81.569231
19665 days 23:55:39.992858076   -81.569231
19665 days 23:55:40.042858076   -81.569231
19665 days 23:55:40.092858076   -81.569231
19665 days 23:55:40.142858076   -81.569231
Name: LONGITUDE, dtype: float64

#

Is this cause it can't find an appropriate value or something to stick in? Are samples for that entire 0.05s period just missing?

#

I can just backfill all the rows so it isn't really a problem but idk why it's doing it

abstract rune Apr 14, 2024, 8:40 AM

#

can we say that
two vectors of R3 are a basis for R2?
I am a bit confused

wooden sail Apr 14, 2024, 9:53 AM

#

abstract rune can we say that two vectors of R3 are a basis for R2? I am a bit confused

no. two linearly independent vectors in R^3 span a 2 dimensional vector subspace. that subspace is isomorphic to R^2, but not equal
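A quick numerical illustration with numpy, using two arbitrary example vectors (`u` and `v` are made up here): the span is 2-dimensional, and every vector in it can be described by 2 coordinates, but the vectors themselves still have 3 components.

```python
import numpy as np

# Two linearly independent vectors in R^3 (arbitrary example values).
u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 1.0, -1.0])

M = np.column_stack([u, v])          # shape (3, 2)
rank = np.linalg.matrix_rank(M)      # dimension of span{u, v}

# A vector in the span: 2 coordinates (a, b) describe it as a*u + b*v ...
w = 2.0 * u + 3.0 * v
coords, *_ = np.linalg.lstsq(M, w, rcond=None)

# ... but w itself is still an element of R^3, not of R^2.
print(rank, w.shape, coords)
```

The map from `w` to its coordinates `(a, b)` is the isomorphism to R^2 mentioned above.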

supple linden Apr 14, 2024, 9:54 AM

#

Hi, does anyone know why ImageDataGenerator fails to be imported

#

from keras.preprocessing.image import ImageDataGenerator

wooden sail Apr 14, 2024, 9:56 AM

#

buoyant shoal

i also don't have a chance to debug your code rn, but i would note a few things:

you probably want to add an extra parameter that offsets your gaussians, e.g. a1*np.exp(-b1 * (x - c1)**2) (edit: i just noticed you did this in the second iteration)
the problem is anyway nonlinear and nonconvex, meaning there is no guarantee you'll find a good solution with gradient methods unless you already start close to the solution. you'll have to pick a better initial guess of the parameters

abstract rune Apr 14, 2024, 10:06 AM

#

wooden sail no. two linearly independent vectors in R^3 span a 2 dimensional vector subspace...

oh okay thanks a lot !

#

numpy is so overwhelming at start
so many functions ughh

odd meteor Apr 14, 2024, 10:25 AM

#

supple linden `from keras.preprocessing.image import ImageDataGenerator`

What's the exact error message you're seeing?

runic parcel Apr 14, 2024, 11:33 AM

#

how to make an ai, like i will provide lots of data and it will learn, so if i ask it anything it will be able to answer me. how to make something like this?

stray salmon Apr 14, 2024, 11:40 AM

#

Hello, I want to write a program which detects German traffic speed signs, could anyone help me with that?

tidal bough Apr 14, 2024, 11:52 AM

#

buoyant shoal ```python import numpy as np import pandas as pd import matplotlib.pyplot as plt...

for complicated curve fitting i typically have better results with a global search like differential_evolution
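A rough sketch of that approach with `scipy.optimize.differential_evolution`, assuming a Gaussian-plus-line model and synthetic stand-in data (the real CSV is not shown, so the data, the `model` function, and the bounds are all made up). Instead of an initial guess, it needs a search range per parameter:

```python
import numpy as np
from scipy.optimize import differential_evolution

def model(x, a, x0, sigma, m, c):
    """Gaussian peak on top of a straight-line baseline."""
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2)) + m * x + c

# Synthetic stand-in for the measured data.
rng = np.random.default_rng(0)
x = np.linspace(0, 20, 200)
y = model(x, 3.0, 10.0, 2.5, -0.25, 5.0) + rng.normal(0, 0.05, x.size)

def sse(params):
    """Sum of squared residuals: the quantity the global search minimizes."""
    return np.sum((y - model(x, *params)) ** 2)

# One (low, high) pair per parameter: a, x0, sigma, m, c.
bounds = [(0, 10), (0, 20), (0.1, 10), (-2, 2), (-10, 10)]
result = differential_evolution(sse, bounds, seed=0)
print(result.x, result.fun)
```

The result can also be handed to `curve_fit` as `p0` afterwards to get a covariance estimate.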

wooden sail Apr 14, 2024, 11:57 AM

#

tidal bough for complicated curve fitting i typically have better results with a global sear...

how does this compare to e.g. simulated annealing?

tidal bough Apr 14, 2024, 11:59 AM

#

last time I tried both i didn't see a huge difference between the different global searches scipy provides

wooden sail Apr 14, 2024, 12:01 PM

#

aight, the performance probably depends on properties that are anyway too difficult to check on nasty functions

tidal bough Apr 14, 2024, 12:05 PM

#

i did have a fun time in the past using simulated annealing to find optimal setups for reactors in minecraft :p

left tartan Apr 14, 2024, 12:40 PM

#

stray salmon Hello, I want to write a program which detects German traffic speed signs, cou...

Not my space, but if you're searching for information, search for "opencv traffic signs". There's a lot of hits out there. For example (random hit): https://github.com/Patric/Speed-limit-detector

jade jay Apr 14, 2024, 12:42 PM

#

If anyone here is into quant trading lmk. I just started learning python and I want to be on track to be ready for the workforce

stray salmon Apr 14, 2024, 12:46 PM

#

left tartan Not my space, but if you're searching for information, search for "opencv traffi...

thank you very much

sick eagle Apr 14, 2024, 1:03 PM

#

guys i am a beginner and i want to be like you, guys please how can i start learning AI, what should i start from?

#

sorry because my english is too weak, i still learn english

jaunty helm Apr 14, 2024, 1:04 PM

#

sick eagle guys i am a beginner and i want to be like you, guys please how can i start learning A...

there are resources in Pinned Messages

little arrow Apr 14, 2024, 1:13 PM

#

what would be the best model to use when working with fourier analysis

#

specifically detecting common frequencies present within signals

#

would it be a neural network of some form

buoyant shoal Apr 14, 2024, 1:14 PM

#

wooden sail i also don't have a chance to debug your code rn, but i would note a few things:...

I see, so how would i guess a better initial parameter?

buoyant shoal Apr 14, 2024, 1:15 PM

#

tidal bough for complicated curve fitting i typically have better results with a global sear...

Could you show me how that’d work? I don’t think we delve deep into complex curve fitting cuz we just glossed over it

#

Is there a possibility that there’s a more approachable way/solution?

wooden sail Apr 14, 2024, 1:19 PM

#

buoyant shoal I see, so how would i guess a better initial parameter?

through trial and error doing it yourself by hand, or by using one of the methods reptile suggested

#

the words i used earlier (nonconvex and nonlinear) are math lingo for "the problem is difficult and there is no general method that will always work"

buoyant shoal Apr 14, 2024, 1:23 PM

#

wooden sail the words i used earlier (nonconvex and nonlinear) are math lingo for "the probl...

fair enough, but just cuz i haven’t really learnt complex curve fitting, can you confirm whether that’s exactly what i should be doing?

#

here’s the actual question

wooden sail Apr 14, 2024, 1:24 PM

#

sure

#

they probably expect something simple like a gaussian (since they explicitly mention sigma) and a straight line to correct the "baseline"

#

just looking at the plot you can make some rough calculations of what the mean of the gaussian and the slope and offset of the line should be

#

try making up an initial value of sigma and then let scipy do the rest for you
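As a sketch of what that looks like in code, here is `curve_fit` with rough starting values passed via `p0`. The data is synthetic (the real file is not shown), the name `gaus_line` is made up, and the errors are assumed to be 1-sigma measurement errors passed through `sigma=`:

```python
import numpy as np
from scipy.optimize import curve_fit

def gaus_line(x, a, x0, sigma, m, c):
    """Gaussian peak plus a straight line for the baseline."""
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2)) + m * x + c

# Synthetic stand-in for the measured data.
rng = np.random.default_rng(1)
x = np.linspace(0, 20, 100)
y_err = np.full_like(x, 0.1)
y = gaus_line(x, 3.0, 10.0, 2.5, -0.25, 5.0) + rng.normal(0, 0.1, x.size)

# Rough values read off a plot: amplitude, peak position, width, slope, offset.
p0 = [3, 10, 2, -0.3, 6]
popt, pcov = curve_fit(gaus_line, x, y, p0=p0, sigma=y_err, absolute_sigma=True)

perr = np.sqrt(np.diag(pcov))   # 1-sigma uncertainty of each fitted parameter
print(popt)
print(perr)
```

The guesses only need to land in the right neighbourhood; the optimizer refines them from there.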

buoyant shoal Apr 14, 2024, 1:26 PM

#

wooden sail try making up an initial value of sigma and then let scipy do the rest for you

so my initial bunch of gaussian + line is good right?

#

hunch*

#

now i have to guess the value of the mean and try to “help” scipy?

wooden sail Apr 14, 2024, 1:27 PM

#

buoyant shoal so my initial bunch of gaussian + line is good right?

would've probably been my first guess

buoyant shoal Apr 14, 2024, 1:27 PM

#

honestly i don’t understand the last bit of the question lol, what does “width sigma of the peak” mean

wooden sail Apr 14, 2024, 1:28 PM

#

you know how a gaussian distribution is parameterized by the mean and standard deviation?

buoyant shoal Apr 14, 2024, 1:28 PM

#

yes

wooden sail Apr 14, 2024, 1:29 PM

#

any "bell curve" of the form a exp(-b (x - c)^2) has an amplitude a, an offset c, and a "width" b

buoyant shoal Apr 14, 2024, 1:29 PM

#

the mean looks like it’s 10 and std maybe like 2.5, right?

buoyant shoal Apr 14, 2024, 1:29 PM

#

wooden sail any "bell curve" of the form a exp(-b (x - c)^2) has an amplitude a, an offset c...

correct

wooden sail Apr 14, 2024, 1:29 PM

#

in statistics you would instead write -(x - c)^2 / (2 b^2), where this b is the "standard deviation". that's what they're asking you for

#

std dev is commonly denoted with a sigma

wooden sail Apr 14, 2024, 1:30 PM

#

buoyant shoal the mean looks like it’s 10 and std maybe like 2.5, right?

sure, give it a shot and see

#

the line seems to have a slope of -5/20 and offset of 5

#

how would you initially remove the peak?

buoyant shoal Apr 14, 2024, 1:33 PM

#

wooden sail in statistics you would instead write -(x - c)^2 / b^2, where this b is the "sta...

wait can you give me the actual math terminology? what is this actually called in math?

wooden sail Apr 14, 2024, 1:33 PM

#

at any rate, what you've described are the first one and a half iterations of "expectation maximization", which you could also do ofc

buoyant shoal Apr 14, 2024, 1:33 PM

#

oh nvm you wrote the functional for a normal distribution

wooden sail Apr 14, 2024, 1:34 PM

#

if you remove the peak by fitting a gaussian, yes

#

alternating optimization of independent components of an observation, updating the expected value, and then maximizing the likelihood

#

wdym by "just eyeballing" though

#

you'd probably make up a gaussian with arbitrary parameters and subtract it, yeah?

#

just remove it?

#

more than a bit, but there's always heuristics involved in picking a good initial guess

#

that works and it's still the same as 1 iteration of expectation maximization, just with an unconventional initial guess

buoyant shoal Apr 14, 2024, 1:38 PM

#

p0 = [3, 10, 2.5, -0.3, 6]

#

this looks good?

wooden sail Apr 14, 2024, 1:39 PM

#

sure

#

depending on whether the sigma is squared in your model, you might be missing a square root afterwards

buoyant shoal Apr 14, 2024, 1:39 PM

#

oh i just eyeballed it lol

#

maybe it is idk

wooden sail Apr 14, 2024, 1:40 PM

#

you know this, you wrote the code yourself

buoyant shoal Apr 14, 2024, 1:40 PM

#

wooden sail you know this, you wrote the code yourself

wow

wooden sail Apr 14, 2024, 1:40 PM

#

did you write exp(... / sigma) or exp(... / sigma^2)?

buoyant shoal Apr 14, 2024, 1:40 PM

#

wooden sail did you write exp(... / sigma) or exp(... / sigma^2)?

oh yeah sigma^2

#

i just copied the functional equation of a normal

wooden sail Apr 14, 2024, 1:40 PM

#

ok

wooden sail Apr 14, 2024, 1:41 PM

#

buoyant shoal wow

local minima are a bitch, huh?

buoyant shoal Apr 14, 2024, 1:41 PM

#

😭 so why did it not work previously? I thought it's a code after all

#

it should work to search everything?

#

Dumb take but yeah that's the gist

wooden sail Apr 14, 2024, 1:41 PM

#

nope

#

that is not the case

#

as i said earlier, for nonconvex problems, that is simply not true

#

and there is no general way of solving nonconvex minimization problems

buoyant shoal Apr 14, 2024, 1:42 PM

#

nonconvex means a function that isn't convex throughout the domain?

buoyant shoal Apr 14, 2024, 1:42 PM

#

wooden sail and there is no general way of solving nonconvex minimization problems

i see, i haven't gotten this far in math yet either so no actual context

#

but i understand yeah

sick eagle Apr 14, 2024, 1:42 PM

#

2 months ago programming was so exciting, but now it's boriiiing, i think because i studied too much

wooden sail Apr 14, 2024, 1:42 PM

#

the gist of it is, some problems cannot be solved in closed form and also no algorithm exists that is guaranteed to find a solution

#

this is one of them, even though it looks so simple

buoyant shoal Apr 14, 2024, 1:43 PM

#

so it's kinda like newton's root finding algorithm? bad first estimate gives u (maybe no convergence)?

wooden sail Apr 14, 2024, 1:43 PM

#

gradient-based methods like scipy's curve_fit (which is a newton-type method) will only work in special conditions

wooden sail Apr 14, 2024, 1:43 PM

#

buoyant shoal so it's kinda like newton's root finding algorithm? bad first estimate gives u (...

just so

#

iirc scipy uses levenberg-marquardt by default (when no bounds are given), a damped gauss-newton method that approximates the hessian from the jacobian

buoyant shoal Apr 14, 2024, 1:44 PM

#

😭 no idea at all

#

i'm just doing lin alg now unfortunately

wooden sail Apr 14, 2024, 1:44 PM

#

if you have no good idea of a good initial guess and/or your function is nondifferentiable (once or twice, depending on your alg), you'll need to use a method like the one confusedreptile suggested

wooden sail Apr 14, 2024, 1:45 PM

#

buoyant shoal i'm just doing lin alg now unfortunately

it's directly connected

#

the linear least squares problem is an example of a convex problem (i.e. it's "easy" to solve)

#

and the rank of the system matrix involved in the problem determines whether you have strict convexity or just convexity, which determines the number of solutions
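A tiny numpy sketch of that rank statement, with made-up matrices: a full-column-rank system has exactly one least-squares solution, while a rank-deficient one has a whole line of solutions that all give the same residual.

```python
import numpy as np

# Full column rank: strictly convex problem, exactly one minimizer.
A_full = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Rank deficient (second column = 2 * first): convex, but flat along one direction.
A_def = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

x_full, _, rank_full, _ = np.linalg.lstsq(A_full, b, rcond=None)
x_def, _, rank_def, _ = np.linalg.lstsq(A_def, b, rcond=None)   # minimum-norm solution

# Moving along the null space of A_def gives a different minimizer, same residual.
null_dir = np.array([2.0, -1.0])            # A_def @ null_dir == 0
x_other = x_def + null_dir
res_def = np.linalg.norm(A_def @ x_def - b)
res_other = np.linalg.norm(A_def @ x_other - b)

print(rank_full, x_full)
print(rank_def, x_def, x_other, res_def, res_other)
```

In the rank-deficient case `lstsq` just picks the solution with the smallest norm out of infinitely many.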

buoyant shoal Apr 14, 2024, 1:46 PM

#

wow okay least square problem is like at the end of my linear algebra text

#

I'm still doing subspaces and the like

#

Thanks btw, i think the answer to my question is 0.75

#

if others are curious

buoyant shoal Apr 14, 2024, 1:47 PM

#

wooden sail if you have no good idea of a good initial guess and/or your function is nondiff...

is that an advanced method per se?

#

Like why is that method not mainstream then? Why use curve fit at all? Just default to that?

wooden sail Apr 14, 2024, 1:48 PM

#

because if you have a good initial guess, the gradient-based ones have guarantees

#

you can predict how far you'll be from the true solution after N iterations

#

these other methods have no such guarantees, they're heuristics

#

they often work pretty ok, but you can never give guarantees

#

they probably require more function evaluations

#

assuming you can analytically compute the derivatives, at least

buoyant shoal Apr 14, 2024, 1:50 PM

#

yeah but i guess the "slow part" isn't the main issue maybe

buoyant shoal Apr 14, 2024, 1:50 PM

#

wooden sail you can predict how far you'll be from the true solution after N iterations

this makes sense yeah cool thanks

#

Lol crazy

wooden sail Apr 14, 2024, 3:57 PM

#

part of it is that a lot of the operations are slicewise, so that separate blocks don't interact with each other

#

if you were to matricise it you'd have a bunch of kronecker products

#

you might consider defining the operations for a single slice and then just saying "and this is repeated x times for all ..."

honest grotto Apr 14, 2024, 4:06 PM

#

Good day everyone

#

Please I need materials to learn Deep Learning

mild grotto Apr 14, 2024, 10:01 PM

#

Hey I'm having trouble with Matrix math.
I have a matrix A which has shape (N,N) and a matrix B with shape (N,)

I want to scale A by B... so I tried A.dot(B) but this gives me shape (N,) instead of (N,N). What am I doing wrong?

desert oar Apr 14, 2024, 10:03 PM

#

mild grotto Hey I'm having trouble with Matrix math. I have a matrix A which has shape `(N,N...

this is less about matrix math and more about numpy shape compatibility. https://numpy.org/doc/stable/user/basics.broadcasting.html this doc explains it all in detail with examples and diagrams, but you will need to spend some time working through it because it might take a while to understand it

#

that said I don't understand how you expect to get any kind of scaling here. are you looking for element-wise multiplication? that would be .multiply/* not .dot/@

mild grotto Apr 14, 2024, 10:04 PM

#

I want element wise multiplication yeah

#

I want every element in the 0th row of A multiplied by the 0th value in B

desert oar Apr 14, 2024, 10:05 PM

#

mild grotto I want element wise multiplication yeah

.dot is matrix multiplication, you want .multiply

#

but you will want to understand the broadcasting behavior in either case

mild grotto Apr 14, 2024, 10:06 PM

#

desert oar .dot is matrix multiplication, you want .multiply

Awesome!

#

I was trying * and it wasn't doing what I thought it would do

desert oar Apr 14, 2024, 10:07 PM

#

mild grotto I was trying `*` and it wasn't doing what I thought it would do

yes, that's because of broadcasting. it will give you the same result as * or as .multiply. if the result isn't what you expect, it's probably because the shapes interact in a way that you didn't expect
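A small sketch of the shape interaction, with made-up values for `A` and `B`: a `(N,)` array lines up with the last axis of a `(N, N)` array, so plain `A * B` scales columns. To scale row i by `B[i]`, `B` needs shape `(N, 1)`.

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)    # shape (N, N)
B = np.array([1.0, 10.0, 100.0])    # shape (N,)

matvec = A.dot(B)         # matrix-vector product, shape (N,)
by_col = A * B            # B broadcasts along the last axis: column j scaled by B[j]
by_row = A * B[:, None]   # B reshaped to (N, 1): row i scaled by B[i]

print(matvec.shape, by_col.shape, by_row.shape)
print(by_row)
```

`np.multiply(A, B[:, None])` gives the same result as the last line.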

mild grotto Apr 14, 2024, 10:12 PM

#

I think it's getting closer to what I'm going for, but still some bugs

#

I have a xFlow and yFlow value at each point, i'm trying to update each point according to the flow

#

I assume the issue is because my flow variable isn't normalized or something

#

I'll try again later

winter drift Apr 14, 2024, 11:20 PM

#

Can anyone explain to me how illusion generative ai works, it's just not clicking for me

agile cobalt Apr 14, 2024, 11:27 PM

#

winter drift Can anyone explain to me how illusion generative ai works, it's just not clickin...

what do you mean by "illusion"?

winter drift Apr 14, 2024, 11:28 PM

#

agile cobalt what do you mean by "illusion"?

Have you seen those generative ais that create an image of something in such a way that like if you squint you'll see Jesus or something

agile cobalt Apr 14, 2024, 11:28 PM

#

the QR Code and alike things?

#

do you understand the overall idea of ControlNet guidance

winter drift Apr 14, 2024, 11:31 PM

#

agile cobalt the QR Code and alike things?

royal crest Apr 14, 2024, 11:31 PM

#

oddly gross

agile cobalt Apr 14, 2024, 11:33 PM

#

without any more specific questions, the only thing I can recommend is reading up on ControlNet

serene scaffold Apr 14, 2024, 11:41 PM

#

winter drift

What is this supposed to be

dreamy isle Apr 14, 2024, 11:46 PM

#

i thought "this almost feels like AI"

#

well because it is, apparently

winter drift Apr 14, 2024, 11:46 PM

#

agile cobalt without any more specific questions, the only thing I can recommend is reading u...

Tyty I'll start there

shut girder Apr 15, 2024, 2:07 AM

#

winter drift

If you look at it from afar, it kind of looks like a man's face

desert oar Apr 15, 2024, 2:44 AM

#

I assume this is user error, but do I need to do anything other than hvplot.extension('matplotlib') to use the matplotlib backend for hvplot?

I tried doing this in an ipython console:

import hvplot as hv
import hvplot.pandas
hv.extension("matplotlib")

df = ...

df.hvplot.line(x="Date", y=["X", "Y"])

but I only got some output like this:

:NdOverlay   [Variable]
   :Curve   [Date]   (value)

and plt.show() did nothing.

#

the holoviews ecosystem has really set a new low bar in terms of bad documentation. reminds me of matplotlib a decade ago

#

good examples, but very hard to figure out what they're doing or how to generalize them, beyond guessing and checking

#

ah, some progress. the hvplot command returns a holoviews.core.overlay.NdOverlay for which that output is the display() result

#

also I stand corrected: holoviews itself has great docs

#

it's geoviews that's kind of bare, but I guess the idea is that you read the holoviews docs first

#

aha, this might just be a problem with my editor setup. I got this when trying to plt.show() a plain mpl figure:

FigureCanvasAgg is non-interactive, and thus cannot be shown

#

and that happens even after I explicitly run %matplotlib tkagg

desert oar Apr 15, 2024, 6:39 AM

#

huh, it looks like it's maybe caused by holoviews?? that's so weird

#

hv.extension("matplotlib") causes that FigureCanvasAgg problem to arise even when only using matplotlib

dawn light Apr 15, 2024, 6:41 AM

#

I'm currently trying to create a neural net from scratch

I was just wondering how the calculation of errors is done when testing against a validation set
For example, if I have training data consisting of 5000 rows and 1000 rows of validation data (and I'm doing SGD with a mini-batch size of approx. 8-16), do I check the accuracy against the entire validation set (i.e. all 1000 rows), or do I do something similar to SGD, where I select a mini-batch, test only against that, and then calculate the mean?

desert oar Apr 15, 2024, 6:45 AM

#

dawn light I'm currently trying to create a neural net from scratch I was just wondering h...

just compute loss over the entire validation set every epoch
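As a rough illustration of that advice: mini-batches are only used for the weight updates, and the validation loss is one pass over all validation rows at the end of each epoch. The toy linear model, the synthetic data and the learning rate below are all assumptions made for the sketch, not dawn light's network:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical data: 5000 training rows, 1000 validation rows, one feature
x_train = rng.uniform(-1.0, 1.0, size=5000)
y_train = 3.0 * x_train + 0.5
x_val = rng.uniform(-1.0, 1.0, size=1000)
y_val = 3.0 * x_val + 0.5

w, b = 0.0, 0.0          # parameters of the toy model y = w * x + b
lr, batch_size = 0.1, 16
val_losses = []

for epoch in range(5):
    order = rng.permutation(len(x_train))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        err = w * x_train[idx] + b - y_train[idx]
        # gradient of the mean squared error on this mini-batch
        w -= lr * 2.0 * np.mean(err * x_train[idx])
        b -= lr * 2.0 * np.mean(err)
    # evaluation: one pass over all 1000 validation rows, no sampling
    val_losses.append(np.mean((w * x_val + b - y_val) ** 2))
```

If the validation set is too large to evaluate in one go, it can still be processed in chunks, as long as every row is counted once per evaluation.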

dawn light Apr 15, 2024, 6:48 AM

#

desert oar just compute loss over the entire validation set every epoch

aight thanks

spring field Apr 15, 2024, 7:32 AM

#

does it make sense to use a noisy background for objects you want to identify for the model to sort of learn to ignore the background in the general case and only look for the thing it's trained for?

spring field Apr 15, 2024, 7:42 AM

#

spring field Mmm, so, I have finally gone out and made my own ML model for a thing just for s...

it's still in the context of this, I changed my approach a couple times and changed networks. I realized that I should probably not ask it to also predict which number is displayed, otherwise I think it not only tries to predict the location but also tries to learn the locations as if they were specific to each class (a class being a number in range [100, 999]). so I removed that from the equation and tried different noise levels for the background. it seemed to actually work really well in testing on similar randomly generated images, but it completely failed when it came to these completely different cat images with numbers in them. I also switched to a DenseNet, which seemed to help more than using some random convolutional networks
so anyway, I decided to generate a massive dataset with 5k samples for each number in range [100, 999] (4.5 million images in total, compared to the 9 thousand image dataset I was using before). the background is basically random noise (random size of [500, 800] by [500, 800]), then a randomly picked font (approx. 40 font families, some of which have variations), font size (anywhere in range [40, 130]), and foreground colour are used to draw the number in a random place on the image, and then the whole image is resized to 200 by 200
the idea being that, given the randomness in the background, it would learn to sort of ignore it and pretty much just learn to recognize a 3-digit number in any image and approximate its bounding box

#

for reference this is what an image sample might look like

[image attachment]

#

also, a bit tangential to this topic, but any resources for those generative networks that embed a word in an environment/image?

odd meteor Apr 15, 2024, 8:44 AM

#

honest grotto Please I need materials to learn Deep Learning

Hi Lawal, check the pinned message here you'll see some resources.

honest grotto Apr 15, 2024, 10:14 AM

#

Thank you 🙏🏿

signal whale Apr 15, 2024, 10:54 AM

#

has anyone here used chartjs for webdev?

leaden narwhal Apr 15, 2024, 12:47 PM

#

anyone can help me out knowing why 2 days before hand my actuals and train prediction values were closer than they are right now

lofty thorn Apr 15, 2024, 12:59 PM

#

what does 'building a model' mean?

sick eagle Apr 15, 2024, 12:59 PM

#

leaden narwhal anyone can help me out knowing why 2 days before hand my actuals and train predi...

that's so advanced

leaden narwhal Apr 15, 2024, 1:07 PM

#

lofty thorn Apr 15, 2024, 1:13 PM

#

what is this in theta ..i don't get it

rn_image_picker_lib_temp_96d03c44-8c6f-4dd9-a9a2-8e057ebf94c9.jpg

tawny sand Apr 15, 2024, 1:22 PM

#

lofty thorn what is this in theta ..i don't get it

theta_0: the value at f(0)
theta_1: the increase to the total value per step upwards (f(1) - f(0))

lofty thorn Apr 15, 2024, 1:38 PM

#

tawny sand theta_0: the value at f(0) theta_1: the increase to the total value per step upw...

i don't get it

long locust Apr 15, 2024, 1:42 PM

#

lofty thorn i don't get it

Looks like a linear function in slope-intercept form, like y = mx + b

Theta_0 is b, and theta_1 is m
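To make the two readings line up, here is a tiny sketch. It assumes the formula in the picture has the form f(x) = theta_0 + theta_1 * x (the picture itself isn't shown here), and the numbers are made up:

```python
def f(x, theta_0, theta_1):
    """Linear function: theta_0 is the intercept (b), theta_1 is the slope (m)."""
    return theta_0 + theta_1 * x

theta_0, theta_1 = 2.0, 0.5  # hypothetical values

value_at_zero = f(0, theta_0, theta_1)                  # equals theta_0
step = f(1, theta_0, theta_1) - f(0, theta_0, theta_1)  # equals theta_1
```

So "the value at f(0)" and "b" describe the same number, and "the increase per step" and "m" describe the same number.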

lofty thorn Apr 15, 2024, 1:44 PM

#

long locust Looks like a linear function in slope-intercept form, like `y = mx + b` Theta_0 is `b`, an...

i didn't know that things will get this complicated in the beginning

#

when do i Start learning ML

long locust Apr 15, 2024, 1:45 PM

#

I'm not sure, I don't know what course you are taking

lofty thorn Apr 15, 2024, 1:46 PM

#

I am trying to learn for an 'emotion detection' project

#

I thought I'd start by learning opencv and ML in the beginning

#

but now stuck at the beginning

wooden sail Apr 15, 2024, 1:47 PM

#

neural networks are built out of generalizations of the simple y = mx + b formula, so it's in your best interest to spend some time there until you grasp it
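A minimal sketch of what "generalization of y = mx + b" can mean: m becomes a weight matrix, b becomes a bias vector, and a nonlinearity is applied afterwards. The layer size, the numbers and the choice of ReLU below are assumptions for illustration only:

```python
import numpy as np

def dense_layer(x, weights, bias):
    """y = relu(W x + b): the vector version of y = m * x + b, plus a nonlinearity."""
    return np.maximum(0.0, weights @ x + bias)

# hypothetical numbers: 3 inputs, 2 output units
x = np.array([1.0, 2.0, 3.0])
weights = np.array([[0.5, -1.0, 0.0],
                    [1.0, 1.0, 1.0]])
bias = np.array([0.5, -1.0])

out = dense_layer(x, weights, bias)
```

A network is then several of these layers applied one after another.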

lofty thorn Apr 15, 2024, 1:50 PM

#

wooden sail neural networks are built out of generalizations of the simple y = mx + b formul...

what do i have to search for this?

wooden sail Apr 15, 2024, 1:50 PM

#

lofty thorn what do i have to search for this?

the picture you sent has the name

#

linear functions, linear models

agile owl Apr 15, 2024, 1:51 PM

#

should I apply a smoothing filter to this data before fitting a time series model

tawny sand Apr 15, 2024, 2:48 PM

#

lofty thorn i didn't know that things will get this complicated in the beginning

Dear god.
If you don't know linear functions, I heavily advise you to reconsider starting ML.

lofty thorn Apr 15, 2024, 2:49 PM

#

i am learning...on the go...i don't have any fear regarding this

tawny sand Apr 15, 2024, 2:49 PM

#

lofty thorn i am learning...on the go...i don't have any fear regarding this

Linear functions are, what, 9th grade maths?

lofty thorn Apr 15, 2024, 2:50 PM

#

i mean i know linear functions.. x = y + 1

#

a little bit

tawny sand Apr 15, 2024, 2:51 PM

#

agile owl should I apply a smoothing filter to this data before fitting a time series mode...

Yeah, e.g. a '1 year average', and a second model for '1 year standard deviation'
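One way to read the "1 year average" and "1 year standard deviation" suggestion is as rolling statistics over a 12-month window. A rough numpy sketch on made-up monthly data follows; the series and the window length of 12 are assumptions, since the actual data isn't shown here:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# hypothetical monthly series: a linear trend plus a repeating yearly pattern
months = np.arange(48)
yearly_pattern = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0, 0.0, 1.0, 0.0, -1.0]
series = 0.1 * months + np.tile(yearly_pattern, 4)

windows = sliding_window_view(series, window_shape=12)  # one row per 12-month window
rolling_mean = windows.mean(axis=1)   # the "1 year average"
rolling_std = windows.std(axis=1)     # the "1 year standard deviation"
```

Because each window covers one full period, the yearly pattern averages out and `rolling_mean` follows only the trend.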

tawny sand Apr 15, 2024, 2:53 PM

#

lofty thorn i mean i know linear functions.. x = y + 1

I'd advise having decent formal education before attempting ML

#

It will get more difficult, to the point where even those with formal education have problems grasping certain concepts

lofty thorn Apr 15, 2024, 2:56 PM

#

tawny sand It will get more difficult, to the point where even those with formal education ...

what i know is that i need to learn stats, python, numpy, pandas, matplotlib, calculus, linear algebra and many more..
what if i complete this along with ML?

tawny sand Apr 15, 2024, 2:59 PM

#

Difficult to do in parallel, to say the least

#

I'd finish school first (for algebra and decent calculus and statistics), go a little in depth on multivariable calculus, and then start ML

lofty thorn Apr 15, 2024, 3:02 PM

#

let's see what happens...i really wanna do this.

boreal gale Apr 15, 2024, 3:06 PM

#

if you landed an hallucinate, that probably means someone saw the potential in you. keep going! you got this 🙂 (edit: wait did i just hallucinated the internship bit??)

also don't be intimidated by math notations, it's just a language to convey concepts (sometimes more precise than just words - hence necessary), also it seems this book you are reading is also trying to help you decipher it in case you aren't already familiar with it (see the green arrow i added).

lofty thorn Apr 15, 2024, 3:07 PM

#

boreal gale if you landed an hallucinate, that probably means someone saw the potential in y...

lol

lofty thorn Apr 15, 2024, 3:08 PM

#

boreal gale if you landed an hallucinate, that probably means someone saw the potential in y...

thank you..i will be regular in my studies..and coming here..hope to see you again

agile owl Apr 15, 2024, 3:44 PM

#

good news everybody, my deep VAR model is optimistic about unemployment

lapis sequoia Apr 15, 2024, 3:46 PM

#

oh

desert oar Apr 15, 2024, 4:08 PM

#

@final kiln I can't remember, did you look into flash attention at all?

#

nice, just didn't remember if it was on your radar or not

teal abyss Apr 15, 2024, 4:59 PM

#

can someone help me install code LLaMA on my pc

desert oar Apr 15, 2024, 5:02 PM

#

that sounds like a great plan

teal abyss Apr 15, 2024, 5:02 PM

#

desert oar that sounds like a great plan

can you help me?

desert oar Apr 15, 2024, 5:40 PM

#

understandable. gotta space out all the work, follow the top priorities first

potent sky Apr 15, 2024, 6:03 PM

#

teal abyss can you help me?

You'll have more luck describing your problem and someone here can pick it up and try to help you with it if they have the time

#

Dealing with vague questions is a lot of work and people generally (rightly so) don't want to do so much digging just to get to your problem after which they can begin solving it :)

warm trellis Apr 15, 2024, 7:49 PM

#

Hey, what can be the reason for having really high mape on training set more than 117%, and 0.512% on validation dataset?

lapis inlet Apr 15, 2024, 7:50 PM

#

Are there any good free platforms for deploying a tensorflow based flask api? I tried pythonanywhere but it seems to have a limit of 512 mb but the requirements itself cost more than that :/

agile cobalt Apr 15, 2024, 7:56 PM

#

lapis inlet Are there any good free platforms for deploying a tensorflow based flask api? I ...

for reference for those who stumble upon this later: cross-posted and answered in #web-development message

agile cobalt Apr 15, 2024, 7:58 PM

#

warm trellis Hey, what can be the reason for having really high mape on training set more tha...

you are saying that you measured over 100% accuracy?.......
almost definitely a bug in your metrics measuring code, double check your math
if it is not a bug there, then it is a bug somewhere else, because that is just not possible - it even has 'percentage' in the name

warm trellis Apr 15, 2024, 8:01 PM

#

True. I am using darts and torchmetrics for the metrics. But setting aside the part where the training error is more than 100%, what could be the reason for having a really high mape on the training set and a really low one on the validation dataset? @agile cobalt

agile cobalt Apr 15, 2024, 8:01 PM

#

never mind, seems like it is possible with that metric
(I just assumed it was accuracy/classification without reading it properly, my bad)

warm trellis Apr 15, 2024, 8:02 PM

#

agile cobalt never mind, seems like it is possible with that metric (I just assumed it was ac...

possible but does not make sense, I agree on that.

agile cobalt Apr 15, 2024, 8:03 PM

#

Although the concept of MAPE sounds very simple and convincing, it has major drawbacks in practical application, and there are many studies on shortcomings and misleading results from MAPE.

It cannot be used if there are zero or close-to-zero values (which sometimes happens, for example in demand data) because there would be a division by zero or values of MAPE tending to infinity.

For forecasts which are too low the percentage error cannot exceed 100%, but for forecasts which are too high there is no upper limit to the percentage error.
(from Wikipedia) but yeah, that is a weird metric...
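The near-zero problem from the quote is easy to reproduce. This is a plain numpy version of the textbook MAPE formula with made-up numbers; darts and torchmetrics may handle zeros differently, so it is a sketch of the idea and not of either library's implementation:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# hypothetical values: the absolute error is 0.5 in every case
ordinary = mape([100.0, 200.0], [100.5, 199.5])   # targets far from zero
near_zero = mape([0.01, 0.02], [0.51, 0.52])      # targets close to zero
```

The same absolute error gives well under 1% in the first case and several thousand percent in the second, so a split that happens to contain more near-zero targets can dominate the score.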

warm trellis Apr 15, 2024, 8:04 PM

#

yes, that's exactly what I have after scaling: values really close to zero and a lot of zero values as well..

agile cobalt Apr 15, 2024, 8:05 PM

#

my first guess would really be just: try a different metric

warm trellis Apr 15, 2024, 8:12 PM

#

Actually, I tried rmse and mae as well..
But they are really small, such as 0.00289. I'm having difficulty assessing the model with these metrics.
In general, when would the training error be higher than the validation error?

desert oar Apr 15, 2024, 8:48 PM

#

warm trellis Actually, I tried rmse and mae as well.. But they are really small, such as 0.00289. ...

there's nothing necessarily wrong if the numbers are really small. but if they're unrealistically small for your data, then maybe you're leaking data somehow

warm trellis Apr 15, 2024, 9:05 PM

#

desert oar there's nothing necessarily wrong if the numbers are really small. but if they'r...

Thanks, more importantly, why would the error on the validation dataset be less than on the training set?

desert oar Apr 15, 2024, 9:06 PM

#

warm trellis Thanks, more importantly, why would the error on the validation dataset be less tha...

some combination of bad luck with the data split, data leakage, or a bug in your code

#

start by checking for bugs, then data leakage

agile owl Apr 16, 2024, 12:31 AM

#

does anyone know how to set up a simple baseline model for comparison with GluonTS models

#

I'm trying to match the outputs of statsmodels.tsa.VAR forecasts to the model I'm using in GluonTS's forecasting but it's a huge pain in the neck because I don't think they do it in the same way

craggy agate Apr 16, 2024, 1:30 AM

#

Help! ANN model loss values in negative and accuracy of 0.000e+00
I am working on this ANN model and with each epoch, my loss value decreases by 300k on average. I tried to reduce the learning rate but it's not helping either, has anyone else faced this issue? Can someone tell me how I fix this

desert oar Apr 16, 2024, 1:55 AM

#

craggy agate Help! ANN model loss values in negative and accuracy of 0.000e+00 I am working o...

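negative loss can be perfectly fine for some loss functions, but not for cross-entropy: you're taking logarithms of numbers between 0 and 1, which results in negative numbers, and then negating them, so the loss should never drop below 0. a cross-entropy loss that keeps getting more negative usually means the targets aren't in [0, 1]

#

0 accuracy is a problem too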

#

hard to say what the actual problem is though, without knowing more about the data. did you check that X_train and Y_train are constructed correctly?
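One specific thing to look for in Y_train, sketched with the textbook binary cross-entropy formula in numpy (an assumption about the loss in use; the labels below are made up). With targets in {0, 1} this formula cannot go negative, but with un-encoded targets such as raw class ids or unscaled values it can, and it grows more negative as the predictions saturate:

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], the usual binary cross-entropy."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

p = np.array([0.9, 0.8, 0.99])                 # sigmoid outputs, strictly between 0 and 1

valid = binary_cross_entropy([1, 0, 1], p)     # labels in {0, 1}
broken = binary_cross_entropy([3, 7, 120], p)  # hypothetical un-encoded labels
```

Checking `Y_train.min()` and `Y_train.max()` is a quick way to rule this out.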

craggy agate Apr 16, 2024, 2:05 AM

#

desert oar hard to say what the actual problem is though, without knowing more about the da...

Yes they are, I have never had negative loss so got a little confused

#

The x_train and y_train seem fine though

desert oar Apr 16, 2024, 2:12 AM

#

craggy agate The x_train and y_train seem fine though

when something is catastrophically wrong, start stripping away complexity until it's impossible to go wrong, and then add in pieces until it goes wrong again

spring field Apr 16, 2024, 2:15 AM

#

spring field it's still in the context of this, I changed my approach a couple times, changed...

well, I guess my hypothesis was wrong, this is epoch 33, going over a 45k image dataset applying 5 layers of dense blocks and transition layers and it still can't find the box around the number in the cat images... though I gotta say that at least it's drawing the box around the cat, because previous attempts couldn't even get that to happen
it surely has at least learned similar data really well

[image attachment]

desert oar Apr 16, 2024, 2:22 AM

#

for example @craggy agate you should get ~99% accuracy with this:

```python
import numpy as np
import tensorflow as tf

ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=16, input_dim=1, activation="relu"))
ann.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))
ann.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

n = 1000
x = np.linspace(0.0, 1.0, n)
y = np.where(x <= 0.5, 0.0, 1.0)

ann.fit(x, y, batch_size=1, epochs=10)
```

#

so start there, and then start adding in pieces from your actual code/data until it falls over

lapis sequoia Apr 16, 2024, 4:27 AM

#

Does anyone know a good way to make segmentation masks quickly? Something like magic lasso in photoshop but free

lapis sequoia Apr 16, 2024, 4:32 AM

#

cinder jay someone has ever used nnU-Net?

Yes can I message you?

dawn light Apr 16, 2024, 6:39 AM

#

for the calculations of precision and recall for multilabel classification, how does one get the values for FN and FP?
it's pretty easy to understand what they mean for binary classification (spam or not spam for example), but how does one determine what a false positive and a false negative are for multilabel classification?

For example, for digit recognition, isn't it the case that if the NN classifies a digit as wrong, it's just wrong? It's neither FP nor FN?

desert oar Apr 16, 2024, 9:27 AM

#

dawn light for the calculations of precision and recall for multilabel classification, how ...

A lot of the terminology comes from the case of binary classification specifically. You have to think of FP and FN as "with respect to class K"

#

then you construct precision and recall either by "micro-averaging" or "macro averaging" across classes https://datascience.stackexchange.com/q/15989

Data Science Stack Exchange

Micro Average vs Macro average Performance in a Multiclass classifi...

I am trying out a multiclass classification setting with 3 classes. The class distribution is skewed with most of the data falling in 1 of the 3 classes. (class labels being 1,2,3, with 67.28% of the
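A short numpy sketch of the "with respect to class K" idea and of the two averaging schemes. The labels are made up, and the helper name `per_class_counts` is hypothetical. Note how a single wrong prediction is a false negative for the true class and a false positive for the predicted class:

```python
import numpy as np

def per_class_counts(y_true, y_pred, n_classes):
    """Return TP, FP, FN arrays, one entry per class (one-vs-rest)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.array([np.sum((y_pred == k) & (y_true == k)) for k in range(n_classes)])
    fp = np.array([np.sum((y_pred == k) & (y_true != k)) for k in range(n_classes)])
    fn = np.array([np.sum((y_pred != k) & (y_true == k)) for k in range(n_classes)])
    return tp, fp, fn

# hypothetical labels for a 3-class problem
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2]

tp, fp, fn = per_class_counts(y_true, y_pred, n_classes=3)

# macro: compute the metric per class, then average the per-class values
macro_precision = np.mean(tp / (tp + fp))
macro_recall = np.mean(tp / (tp + fn))

# micro: pool the counts over all classes, then compute the metric once
micro_precision = tp.sum() / (tp.sum() + fp.sum())
micro_recall = tp.sum() / (tp.sum() + fn.sum())
```

When every sample has exactly one label, micro precision and micro recall both come out equal to plain accuracy, which is why macro averaging is often the more informative one for skewed classes.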

craggy agate Apr 16, 2024, 10:55 AM

#

desert oar for example @craggy agate you should get ~99% accuracy with this: ```pyt...

Thanks let me check if it works

crystal geyser Apr 16, 2024, 11:03 AM

#

Hello guys,
This is a file of data contains viral posts, I scraped these using Instagram Viral Content Finder, This is a crawler I developed completely from scratch, it search all posts related to specific niche and find those posts which are viral on Instagram, after scraping it stores the data in data.json file...

craggy agate Apr 16, 2024, 12:30 PM

#

crystal geyser Hello guys, This is a file of data contains viral posts, I scraped these using I...

Nice, what you gonna do with it?

earnest anchor Apr 16, 2024, 3:10 PM

#

crystal geyser Hello guys, This is a file of data contains viral posts, I scraped these using I...

How is it gonna help you??

Will this database help you analyse certain things??

craggy agate Apr 16, 2024, 4:16 PM

#

earnest anchor How is it gonna help you?? Will this database help you analyse certain things?...

Exactly what I am thinking

craggy agate Apr 16, 2024, 4:17 PM

#

desert oar for example @craggy agate you should get ~99% accuracy with this: ```pyt...

The issue is not in the data preprocessing part neither is it in creating the layers.

#

I checked

#

Probably in compile or train

earnest anchor Apr 16, 2024, 4:19 PM

#

My question is

What are you training for??

Like what analysis do you want to perform from this data base??

lapis sequoia Apr 16, 2024, 5:02 PM

#

Whats the best way to store a very big dataset of torch tensors for the fastest access

#

so far fastest one i found is zarr but still is there anything faster

#

Because I want to save a giant dataset into a zip file but I don't understand if zarr will try to load the whole zip into RAM
the dataset is 150 gb of mri images that I use 2d slices of, and even those are painfully slow to load

left tartan Apr 16, 2024, 5:03 PM

#

Have you tried parquet? Have you benchmarked the use cases?

lapis sequoia Apr 16, 2024, 5:05 PM

#

left tartan Have you tried parquet? Have you benchmarked the use cases?

i tried numpy save, numpy savez compressed, pickle, dill, joblib with all compressions, hickle, h5py, bloscpack, zarr

#

and benchmarked and zarr is the fasters

#

fastest

#

savez compressed takes 228 seconds for all slices of 10 images and zarr zipobject takes 32 seconds which is weird so i think it loads it into ram so it wont work on whole dataset

#

and DirectoryStore takes 82 seconds

#

and also it needs to be compressed like zarr and numpy savez compressed and bloscpack because otherwise it doesnt fit into my hdd

agile cobalt Apr 16, 2024, 5:11 PM

#

"fastest access" is not quite clear. Access in which way?
position based? based on comparing with a value in a certain column? you could even read that as loading the entire data into memory

lapis sequoia Apr 16, 2024, 5:12 PM

#

agile cobalt "fastest access" is not quite clear. Access in which way? position based? based ...

just loading the tensors from disk into memory and its too big to be loaded into RAM and I preload 50% but its still way too slow and if I discard other 50% so that 100% is preloaded its like 10 times faster

craggy agate Apr 16, 2024, 5:13 PM

#

lapis sequoia just loading the tensors from disk into memory and its too big to be loaded into...

Try using the SSD for it? VRAM

#

Is that possible?

lapis sequoia Apr 16, 2024, 5:14 PM

#

craggy agate Try using the SSD for it? VRAM

i tried using nvme ssd it makes no difference for some reason but i cant fit entire dataset on ssd anyway

craggy agate Apr 16, 2024, 5:14 PM

#

lapis sequoia i tried using nvme ssd it makes no difference for some reason but i cant fit enti...

I see

#

What about cloud based

#

AWS hosted

#

You could also use a couple of pi 5s

agile cobalt Apr 16, 2024, 5:16 PM

#

you can try using parquet, which is a really good format for most tabular data + supports a bunch of other types as long as you use libraries that support them well (e.g. pyarrow or polars, not pandas), but zarr is probably amongst the best

just make sure you are using the smallest data types you can afford to
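To put rough numbers on the data type point: halving the width of the dtype halves the bytes that have to be read from disk, at the cost of some precision. The 256 by 256 random array below is a hypothetical stand-in for one MRI slice, and whether float16 is precise enough depends on the real data:

```python
import numpy as np

# hypothetical stand-in for one MRI slice; real slices will differ
slice_f64 = np.random.default_rng(0).random((256, 256))   # float64 by default

slice_f32 = slice_f64.astype(np.float32)
slice_f16 = slice_f64.astype(np.float16)

sizes = {arr.dtype.name: arr.nbytes for arr in (slice_f64, slice_f32, slice_f16)}
# largest rounding error introduced by the float16 copy
max_err_f16 = float(np.max(np.abs(slice_f64 - slice_f16.astype(np.float64))))
```

For values in [0, 1) the float16 rounding error stays below 1e-3, which may or may not be acceptable for a given model.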

lapis sequoia Apr 16, 2024, 5:32 PM

#

agile cobalt you can try using parquet, which is a really good format for most tabular data +...

it says its bad for 2d arrays

#

i tried h5py and it takes 38 seconds

boreal gale Apr 16, 2024, 5:52 PM

#

curious if you have looked into https://lancedb.github.io/lance/index.html ?
(we considered it at work, but threw the idea out because it takes too long to fully validate and we already have something that works, i.e. i don't know how good it is - we don't deal with images btw, just tabular data.)

potent garnet Apr 16, 2024, 5:55 PM

#

Hi everyone, I'm working on a time series project and I have one question: has anyone dealt with irregular time series data before?

boreal gale Apr 16, 2024, 5:58 PM

#

potent garnet Hi everyone, I'm working on a time series project and I have one question: has an...

irregular in what way?
regardless of the answer, the answer to your question is probably a yes. (though maybe not me personally)
and i assume that's not your only question, what's your actual question? (it's well worth to just ask that question right from the get go)

potent garnet Apr 16, 2024, 6:00 PM

#

boreal gale irregular in what way? regardless of the answer, the answer to your question is...

My dates are irregular in the dataset I am working on. Moreover, the numbers in the historical data of some of my users are very low. At this point, how can I make a healthy prediction with irregular dates and little data? Even if you just suggest a title, it would be very important to me.

desert oar Apr 16, 2024, 6:01 PM

#

boreal gale curious if you have looked into https://lancedb.github.io/lance/index.html ? (w...

Seems like a bold set of claims considering it's not parquet and they aren't making a case for using it instead of parquet

#

It does look arrow-based so that's interesting, makes it a direct competitor to parquet (and feather), so it's even weirder and more suspicious that they don't mention either

boreal gale Apr 16, 2024, 6:02 PM

#

potent garnet My dates are irregular in the dataset I am working on. Moreover, the numbers in ...

it would be helpful to provide more context here, for example:

- what is the dataset you are working with?
- why are the dates irregular in the first place?
- why are some users different?
- is the underlying data generation process the same across users?

etc etc. more context would allow people to chime in more easily!

desert oar Apr 16, 2024, 6:02 PM

#

I'm actually confused, lance looks more like a directory format analogous to iceberg or hive

#

In which case, again, suspicious that they don't mention either

potent garnet Apr 16, 2024, 6:04 PM

#

boreal gale it would be helpful to provide more context here, for example:- - what is the da...

There is no reason why users enter data irregularly. There is data from many different sectors, so there is variability. There is no specific area that I can say, such as energy, health, etc. It's just that some users used the system more often, some unfortunately used it less.

boreal gale Apr 16, 2024, 6:06 PM

#

potent garnet There is no reason why users enter data irregularly. There is data from many dif...

could you demonstrate what you are working with with some example (or real preferably) data?

potent garnet Apr 16, 2024, 6:07 PM

#

here is example train and prediction

sturdy kiln Apr 16, 2024, 6:38 PM

#

hello so this is the first time im dealing with regression type data, and im getting absurdly high values in my MSE and loss values with the use of KerasRegressor

#

```python
def model_2():
  model = Sequential()
  model.add(Dense(60, input_shape=(len(columns),), kernel_initializer='normal', activation='relu'))
  model.add(Dense(200, kernel_initializer='normal', activation='relu'))
  model.add(Dense(300, kernel_initializer='normal', activation='relu'))
  model.add(Dense(1, kernel_initializer='normal'))
  model.compile(loss='mean_squared_error', optimizer='adam')
  return model

# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=model_2, epochs=10, batch_size=10, verbose=1)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, house_x, house_y, cv=kfold, scoring='neg_mean_squared_error')
print("Baseline: %.2f (%.2f) MSE" % (results.mean(), results.std()))
```

#

dataset looks sumn like this

```python
columns = ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors',
           'waterfront', 'view', 'condition', 'grade', 'sqft_above',
           'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat',
           'long', 'sqft_living15', 'sqft_lot15']
house_x = housingDF[list(columns)].astype('object').values
house_y = housingDF["price"].astype('object').values
```

```
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21613 entries, 0 to 21612
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   id             21613 non-null  int64  
 1   date           21613 non-null  object 
 2   price          21613 non-null  float64
 3   bedrooms       21613 non-null  int64  
 4   bathrooms      21613 non-null  float64
 5   sqft_living    21613 non-null  int64  
 6   sqft_lot       21613 non-null  int64  
 7   floors         21613 non-null  float64
 8   waterfront     21613 non-null  int64  
 9   view           21613 non-null  int64  
 10  condition      21613 non-null  int64  
 11  grade          21613 non-null  int64  
 12  sqft_above     21613 non-null  int64  
 13  sqft_basement  21613 non-null  int64  
 14  yr_built       21613 non-null  int64  
 15  yr_renovated   21613 non-null  int64  
 16  zipcode        21613 non-null  int64  
 17  lat            21613 non-null  float64
 18  long           21613 non-null  float64
 19  sqft_living15  21613 non-null  int64  
 20  sqft_lot15     21613 non-null  int64  
dtypes: float64(5), int64(15), object(1)
memory usage: 3.5+ MB
```

#

ive been tweaking and modifying the model with different node amounts and added layers but it seems that my MSE is still very large, the difference between a non standard and standardized model also isnt very much

#

i dont know what im doing wrong or any thing i have to do to make this model perform better, so any help is appreciated

#

i have no baseline at all if this is good or bad because from what im seeing its normal to get thousands in their MSEs but billions?

fair warren Apr 16, 2024, 6:55 PM

#

Is there any movement in the industry towards using polars (or any other df solution) instead of pandas?

agile cobalt Apr 16, 2024, 6:58 PM

#

sort of - pandas is still dominant by far, but there are a few places moving to polars for efficiency/performance gains, at least on new projects

pyspark is common in some contexts though (and has been for a while)

#

not sure if I would call it a movement in the industry yet though

#

https://pola.rs/posts/case-check-technology/

Polars — Check Technologies saves 25% of cloud expenses with Polars

DataFrames for the new era

lapis sequoia Apr 16, 2024, 7:29 PM

#

what are the basics of data science pls

#

anyone explain to me in private chat?

sturdy kiln Apr 16, 2024, 7:31 PM

#

```python
# Normalize our X values
from sklearn import preprocessing as prep
min_max_scaler = prep.MinMaxScaler()
house_x = min_max_scaler.fit_transform(house_x)

def model_3():
  model = Sequential()
  model.add(Dense(60, input_shape=(len(columns),), kernel_initializer='normal', activation='relu'))
  model.add(Dense(512, kernel_initializer='normal', activation='relu'))
  model.add(Dense(1024, kernel_initializer='normal', activation='relu'))
  model.add(Dense(1024, kernel_initializer='normal', activation='relu'))
  model.add(Dense(1024, kernel_initializer='normal', activation='relu'))
  model.add(Dense(512, kernel_initializer='normal', activation='relu'))
  model.add(Dense(1, kernel_initializer='normal'))
  model.compile(loss='mean_squared_error', optimizer='adam')
  return model

# evaluate model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=model_3, epochs=10, batch_size=10, verbose=1)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10)
results = cross_val_score(pipeline, house_x, house_y, cv=kfold, scoring='neg_mean_squared_error')
print("Model 3: %.2f (%.2f) MSE" % (results.mean(), results.std()))
```
I've normalized the data, added a bunch more layers, and this loss value is still absurdly high. I have no idea if the problem is my evaluation function itself, the data, the model or god itself. i am at a loss (heh)

untold bloom Apr 16, 2024, 7:32 PM

#

sturdy kiln i have no baseline at all if this is good or bad because from what im seeing its...

hi,

its normal to get thousands in their MSEs but billions?
the unit of MSE is not the same as the data, right? it's the mean of (true - prediction) ^ 2, so the units are squared, and that's why people often use RMSE as a more interpretable metric. Indeed, if you take the square root of your MSE there, you'd end up in the hundreds of thousands. From a bit of a search (because you didn't share :p), I assume your data is "kc_house_data", and it seems the mean of the target is around 500 thousand, so your result (~200 thousand) is not off. Also I found this where they obtained an RMSE of ~100 thousand, so that's another thing you can look at and rest assured. Even though you report a validation score and they report a test score, the underlying message is independent of that

sturdy kiln Apr 16, 2024, 7:37 PM

#

is RMSE just the square root of MSE?

untold bloom Apr 16, 2024, 7:37 PM

#

yes

#

root mean squared error

sturdy kiln Apr 16, 2024, 7:38 PM

#

so i could just get the RMSE from the previous eval results by sqrt the mean

untold bloom Apr 16, 2024, 7:38 PM

#

yes

#

need to negate first though

#

sklearn reports negative MSE

sturdy kiln Apr 16, 2024, 7:38 PM

#

i am using NMSE

untold bloom Apr 16, 2024, 7:38 PM

#

yes

sturdy kiln Apr 16, 2024, 7:38 PM

#

ah so i negate it first

untold bloom Apr 16, 2024, 7:38 PM

#

yes

#

(it is negative so as to unify it's maximizing stuff)

#

stuff being the validation score
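
A minimal sketch of the negate-then-square-root step from the messages above. The fold scores are made-up numbers standing in for what `cross_val_score(..., scoring='neg_mean_squared_error')` returns, and only NumPy is used.

```python
import numpy as np

# Hypothetical per-fold scores, shaped like the output of
# cross_val_score(..., scoring='neg_mean_squared_error').
# They are negative because sklearn treats every score as "higher is better".
neg_mse_scores = np.array([-4.1e10, -3.8e10, -4.4e10, -3.9e10, -4.0e10])

mse_per_fold = -neg_mse_scores          # negate first to get ordinary MSE
rmse_per_fold = np.sqrt(mse_per_fold)   # back in the units of the target

print("mean MSE:  %.3g" % mse_per_fold.mean())
print("mean RMSE: %.0f" % rmse_per_fold.mean())
```

scikit-learn also accepts `scoring='neg_root_mean_squared_error'`, which skips the manual square root (the result still needs negating).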

sturdy kiln Apr 16, 2024, 7:44 PM

#

hmm newer model shows 200k RMSE, whats the interpretation from this

#

that the predictions are on average about 200k off from the true values?

untold bloom Apr 16, 2024, 7:49 PM

#

yes
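
One caveat to "on average 200k off", as a small sketch with made-up prices: RMSE is the root of the mean squared error, so it weights large misses more heavily than the plain average miss (MAE) and is never smaller than it.

```python
import numpy as np

# Made-up house prices and predictions, purely for illustration.
y_true = np.array([300_000, 450_000, 500_000, 750_000, 1_200_000], dtype=float)
y_pred = np.array([320_000, 430_000, 510_000, 700_000, 800_000], dtype=float)

errors = y_pred - y_true
mae = np.abs(errors).mean()             # literal "average distance" from the truth
rmse = np.sqrt((errors ** 2).mean())    # the single 400k miss dominates this one

print("MAE:  %.0f" % mae)
print("RMSE: %.0f" % rmse)
```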

sturdy kiln Apr 16, 2024, 7:50 PM

#

hmm honestly it doesnt sound so bad when put it that way

versed pilot Apr 16, 2024, 9:12 PM

#

agile cobalt sort of - pandas is still dominant by far, but there are a few places moving to ...

Not sure you can compare Pandas with Spark. Pandas makes sense on a single machine with "not so Big" data, Spark makes sense on a cluster with Big Data

prisma briar Apr 16, 2024, 9:43 PM

#

Should I take an ML course or a GenAI course first?

desert oar Apr 16, 2024, 10:30 PM

#

untold bloom (it is negative so as to unify it's maximizing stuff)

which is such a weird choice because most of the time we express optimization as minimization, not maximization

left tartan Apr 16, 2024, 11:06 PM

#

versed pilot Not sure you can compare Pandas with Spark. Pandas makes sense on a single machi...

I think it's reasonable to point out: technologies like polars (and my fav shill duckdb) make it possible to do more in memory / on-server before reaching for distributed solutions

vestal spruce Apr 16, 2024, 11:20 PM

#

does anyone know which python library/package is used often for audio processing (e.g. splitting audio into segments, changing sample rate, etc.) ?

royal crest Apr 16, 2024, 11:22 PM

#

librosa, pyaudioanalysis, tensorflow io, ffmpeg
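
For a feel of what the two operations in the question amount to, here is a rough sketch that deliberately uses plain NumPy and SciPy instead of the libraries listed above. The 440 Hz tone, the sample rates and the 1-second segment length are all arbitrary choices for illustration. With librosa, `librosa.load(path, sr=16000)` resamples while loading.

```python
import numpy as np
from scipy.signal import resample_poly

orig_sr, target_sr = 44_100, 16_000
t = np.arange(orig_sr * 3) / orig_sr            # 3 seconds of sample times
audio = np.sin(2 * np.pi * 440.0 * t)           # synthetic 440 Hz tone

# Change the sample rate: 16000/44100 reduces to 160/441.
resampled = resample_poly(audio, up=160, down=441)

# Split into fixed-length 1-second segments (a trailing partial segment is dropped).
seg_len = target_sr
n_segments = len(resampled) // seg_len
segments = resampled[: n_segments * seg_len].reshape(n_segments, seg_len)

print(segments.shape)
```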

sweet stag Apr 17, 2024, 4:49 AM

#

guys where do i learn about sci kit learn?

worldly dawn Apr 17, 2024, 4:52 AM

#

sweet stag guys where do i learn about sci kit learn?

I learned about it through their website. They have excellent documentation

sick eagle Apr 17, 2024, 9:10 AM

#

guys, i have a dumb question

#

why are you using Github

#

for what

#

why is it important

wooden sail Apr 17, 2024, 9:11 AM

#

i'd say there are roughly 3 big reasons

#

the first is separate from github. git on its own is a great tool for version control. it helps you track any changes you (or anyone else) make to the code, revert those changes, etc.

#

next, it's great if you can both back up the code and all of the versioning history somewhere remote, and also be able to access it from anywhere. this is what github does: it's a hub for your git repositories. one of many, might i add. there are alternatives

#

the third reason would be that the particular choice of using github comes with the advantage of them offering nice things like github actions, github pages, etc, while also being free. (in exchange, microsoft uses your repos without permission to train AI)

#

that means you can use git just locally, you can pair it up with github or any other remote hub for git repos, and if you do choose github as your remote hub, it comes with goodies

sick eagle Apr 17, 2024, 9:15 AM

#

@wooden sail thanks, bro, thanks so much

midnight harbor Apr 17, 2024, 11:47 AM

#

Guys help me out here

Why can't i find any yolo model that has face and person classes in a single model

Yolo default by ultralytics has person class but not face, yoloface has only face class but not person

The only choice i've got is to train a new model with these two classes, but if there is a model with both of these classes please tag me here

stone oyster Apr 17, 2024, 12:48 PM

#

Could someone tell me the steps to uninstall micromamba from Windows? The official docs, Google, ChatGPT3.5 and Gemini are all useless here.

craggy agate Apr 17, 2024, 1:50 PM

#

sweet stag guys where do i learn about sci kit learn?

Documentation, udemy courses, YouTube videos(find a good one)

lofty thorn Apr 17, 2024, 1:51 PM

#

hi

stone oyster Apr 17, 2024, 1:57 PM

#

sweet stag guys where do i learn about sci kit learn?

From the creators/maintainers of the library: https://www.fun-mooc.fr/en/courses/machine-learning-python-scikit-learn/

FUN MOOC

Machine learning in Python with scikit-learn

Build predictive models with scikit-learn and gain a practical understanding of the strengths and limitations of machine learning!

crisp raptor Apr 17, 2024, 2:16 PM

#

you know you have been learning machine learning correctly when the hardest part is thinking of an input medium

lofty thorn Apr 17, 2024, 2:22 PM

#

can anyone explain this plot to me

rn_image_picker_lib_temp_1f1d804e-61c9-4dc4-ad09-8f232fcec216.jpg

crisp raptor Apr 17, 2024, 2:23 PM

#

lofty thorn can anyone explain this plot to me

what is this for

lofty thorn Apr 17, 2024, 2:23 PM

#

self studying for the project

crisp raptor Apr 17, 2024, 2:23 PM

#

yeah what is the project

wooden sail Apr 17, 2024, 2:23 PM

#

lofty thorn can anyone explain this plot to me

what about it?

crisp raptor Apr 17, 2024, 2:23 PM

#

context

lofty thorn Apr 17, 2024, 2:24 PM

#

it is an example from 'model based learning'

crisp raptor Apr 17, 2024, 2:25 PM

#

well what exactly is it supposed to represent? It looks like it's just a simple regression for points

tidal bough Apr 17, 2024, 2:25 PM

#

lofty thorn can anyone explain this plot to me

well, I see a bunch of dots, and 3 different lines trying to fit this data, labeled by their parameters.

lofty thorn Apr 17, 2024, 2:26 PM

#

is this a linear regression plot..idk i am just guessing rn

crisp raptor Apr 17, 2024, 2:26 PM

#

yes it is

lofty thorn Apr 17, 2024, 2:26 PM

#

ok

tidal bough Apr 17, 2024, 2:27 PM

#

sure, I guess? though it doesn't show the best fit, just three arbitrary ones, to show what the parameters mean.

lofty thorn Apr 17, 2024, 2:27 PM

#

what is theta representing here

crisp raptor Apr 17, 2024, 2:27 PM

#

parameters

tidal bough Apr 17, 2024, 2:27 PM

#

see the equation

crisp raptor Apr 17, 2024, 2:27 PM

#

look at the equation above

wooden sail Apr 17, 2024, 2:28 PM

#

theta are just parameters of a linear equation

#

what the text says is basically the whole story

lofty thorn Apr 17, 2024, 2:28 PM

#

what is meant by parameter

#

models?

wooden sail Apr 17, 2024, 2:28 PM

#

if you have the equation of a line y = mx + b, if you change m and b, you can generate any line you like at all

crisp raptor Apr 17, 2024, 2:28 PM

#

an adjustable variable of sorts that changes the outcome

wooden sail Apr 17, 2024, 2:29 PM

#

and there is a particular choice of m and b that best explains the data, that's the blue line

#

finding that best m and b is called regression

#

"linear regression", at that, because m and b are the parameters of a linear equation

crisp raptor Apr 17, 2024, 2:29 PM

#

statistics is fun

wooden sail Apr 17, 2024, 2:29 PM

#

that linear equation is your "model". you assume the data follows a straight line, and find the parameters m and b of the straight line

#

a "model" is just how you decide you want to explain a phenomenon

#

here, we say "the data should be a straight line". that means our model is that the data is of the form y = mx + b

#

y = mx + b is the equation of absolutely any (non-vertical) straight line that can ever exist. the parameters are m and b. this is what defines what line we get

#

regression is the process of finding the model parameters based on data

#

the black and red lines are just examples of arbitrary lines that have nothing to do with the data

#

the blue line has a good choice of m and b
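
A small sketch of "regression is finding the m and b that best explain the data", on synthetic points. The true slope 2 and intercept 1, the noise level and the "arbitrary" line are made-up values, and `np.polyfit` is used as one ordinary least-squares fitter among many.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)   # noisy line, true m=2, b=1

def mse(m, b):
    """Mean squared error of the line y = m*x + b against the data."""
    return np.mean((y - (m * x + b)) ** 2)

# Least-squares fit of a degree-1 polynomial returns (slope, intercept).
m_fit, b_fit = np.polyfit(x, y, deg=1)

print("fitted line:    m=%.2f b=%.2f mse=%.2f" % (m_fit, b_fit, mse(m_fit, b_fit)))
print("arbitrary line: m=0.50 b=4.00 mse=%.2f" % mse(0.5, 4.0))
```

Any other choice of m and b, like the arbitrary line here, gives a larger mean squared error on this data than the fitted one.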

crisp raptor Apr 17, 2024, 2:31 PM

#

@lofty thorn is this for a class?

lofty thorn Apr 17, 2024, 2:33 PM

#

ohk.....

tidal bough Apr 17, 2024, 2:34 PM

#

wooden sail and there is a particular choice of m and b that best explains the data, that's ...

(i don't see the text saying blue line is a best fit, and eyeballing it doesn't look to me like it is. i think all 3 lines are arbitrary.)

crisp raptor Apr 17, 2024, 2:34 PM

#

tidal bough (i don't see the text saying blue line is a best fit, and eyeballing it doesn't ...

It certainly isn't lol

wooden sail Apr 17, 2024, 2:34 PM

#

tidal bough (i don't see the text saying blue line is a best fit, and eyeballing it doesn't ...

fair enough, i also thought that was the case so toward the end i just called it a "good choice" 😛

lofty thorn Apr 17, 2024, 2:34 PM

#

its little up

#

right?

#

i mean down

wooden sail Apr 17, 2024, 2:35 PM

#

probably

crisp raptor Apr 17, 2024, 2:35 PM

#

and a slightly decreased angle

lofty thorn Apr 17, 2024, 2:36 PM

#

so every straight line is y = mx + b
m and b decide where the line is going to be

#

?

wooden sail Apr 17, 2024, 2:37 PM

#

indeed

#

you tell me 😛

potent sky Apr 17, 2024, 2:38 PM

#

Anyone read through the infini-attention paper yet?

crisp raptor Apr 17, 2024, 2:43 PM

#

I'm stuck in the 2010's with ML, so no

lofty thorn Apr 17, 2024, 3:37 PM

#

can anyone tell me what should be basic knowledge needed to start learning ML

wooden sail Apr 17, 2024, 3:39 PM

#

linear algebra, multivar calculus and statistics are widely regarded as the basics

lofty thorn Apr 17, 2024, 3:43 PM

#

ok

thin lotus Apr 17, 2024, 4:04 PM

#

wooden sail linear algebra, multivar calculus and statistics are widely regarded as the basi...

If you wanna fully understand it but you can get a lot done without

wooden sail Apr 17, 2024, 4:05 PM

#

that's certainly the case

#

but even very basic questions like "what layers and activation functions make sense for my problem?" can only be properly answered this way

thin lotus Apr 17, 2024, 4:09 PM

#

Ofc , if you wanna do deep learning I 100% agree it will be hard, but for more basic "ML" like clustering it becomes a lot simpler

crisp raptor Apr 17, 2024, 5:32 PM

#

wooden sail linear algebra, multivar calculus and statistics are widely regarded as the basi...

calculus is really only needed for a couple of things here, you want a really solid foundation of any form of stats and any form of algebra

#

if you are looking for resources, wikipedia is an excellent source for math related things

#

also, does anybody have a good idea on how I could embed FEN notation for a NN chess experiment
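
One common starting point, offered only as a sketch and not as the answer to the question: turn the piece-placement field of the FEN into 12 one-hot 8x8 planes (one per piece type and colour). The plane order `PNBRQKpnbrqk` and the function name `fen_to_planes` are arbitrary choices here, and side to move, castling rights and en passant are ignored.

```python
import numpy as np

PIECES = "PNBRQKpnbrqk"   # plane order is an arbitrary convention

def fen_to_planes(fen: str) -> np.ndarray:
    """Encode the board part of a FEN string as a (12, 8, 8) one-hot array."""
    placement = fen.split()[0]                 # first field: piece placement
    planes = np.zeros((12, 8, 8), dtype=np.float32)
    for row, rank in enumerate(placement.split("/")):   # row 0 is rank 8
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)                 # a digit means that many empty squares
            else:
                planes[PIECES.index(ch), row, col] = 1.0
                col += 1
    return planes

start_fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
planes = fen_to_planes(start_fen)
print(planes.shape, int(planes.sum()))
```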

wooden sail Apr 17, 2024, 5:36 PM

#

crisp raptor calculus is really only needed for a couple of things here, you want a really so...

i would disagree with the calculus part. it's necessary for most of statistics, for one. you also need it to explain how things like exploding and vanishing gradients happen and affect the whole network, and many cost functions depend on computing integrals and derivatives as well as expectations

#

already the most basic clustering methods require all 3 of linalg, calculus and stats, since you're always computing ratios of expectations of vector-valued functions for those

crisp raptor Apr 17, 2024, 5:37 PM

#

wooden sail i would disagree with the calculus part. it's necessary for most of statistics, ...

I see your point

#

doesn't mean you have to take classes for it, you could just teach yourself

craggy agate Apr 17, 2024, 5:46 PM

#

lofty thorn can anyone tell me what should be basic knowledge needed to start learning ML

Calculus isn't that important for ML so you should rather focus on linear algebra, statistics and functions.

left tartan Apr 17, 2024, 5:47 PM

#

Calculus is indeed quite important, I'm with this: #data-science-and-ml message

#

Perhaps there's an argument about how much you actually need to -do- vs -understand-.

#

But calculus (at least the undergrad 1-3) isn't a particularly hard subject: the reason it's hard is students have terrible algebra fundamentals.

wooden sail Apr 17, 2024, 5:50 PM

#

right, for practical purposes you hardly need engineering level knowledge of the 3, which is already very basic

crisp raptor Apr 17, 2024, 5:50 PM

#

left tartan Perhaps there's an argument about how much you actually need to -do- vs -underst...

Exactly.

crisp raptor Apr 17, 2024, 5:51 PM

#

left tartan But calculus (at least the undergrad 1-3) isn't a particularly hard subject: the...

I'm 14, and I have a rather solid hold on calculus.

#

I should point out basic knowledge, like basic DE, PD, and integration and derivatives

left tartan Apr 17, 2024, 5:52 PM

#

crisp raptor I'm 14, and I have a rather solid hold on calculus.

Yup, -but- if you're anything like most high school students, your weak algebra skills will be why calculus -class- will be hard.

crisp raptor Apr 17, 2024, 5:53 PM

#

left tartan Yup, -but- if you're anything like most high school students, your weak algebra ...

I took algebra 1 and 2 last year, I think I'll be fine

#

well, the last 2 years

crisp raptor Apr 17, 2024, 5:53 PM

#

left tartan Yup, -but- if you're anything like most high school students, your weak algebra ...

so many jokes and questions stem from this idea though

wooden sail Apr 17, 2024, 5:53 PM

#

funny tidbits here include things like MAE not being differentiable at 0 and pytorch/tf/jax making an arbitrary (and different from each other) choice of what to use as a subgradient at that point, or that the log doesn't expect you to evaluate at negative values (which normally returns a complex number), and the derivative assumes you won't either, so it's often just defined as f'(x)/f(x) even for negative values of f(x), which doesn't make sense

#

but you'll never find out if you don't know calc 😛
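
Both tidbits can be poked at in plain Python, without any ML framework. This is only a numerical illustration using finite differences. It says nothing about which subgradient a given library actually picks.

```python
import math

def one_sided_slopes(f, x, h=1e-6):
    """Left and right finite-difference slopes of f at x."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right

# |x| is the per-sample absolute error behind MAE.
left, right = one_sided_slopes(abs, 0.0)
print(left, right)   # the two slopes disagree at 0, so no single derivative exists there

# log(x) is undefined over the reals for x < 0 ...
try:
    math.log(-2.0)
    log_defined = True
except ValueError:
    log_defined = False

# ... yet the formula 1/x for its derivative still returns a number there.
formula_value = 1.0 / -2.0
print(log_defined, formula_value)
```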

crisp raptor Apr 17, 2024, 5:54 PM

#

wooden sail funny tidbits here include things like MAE not being differentiable at 0 and pyt...

haha I don't use pytorch

wooden sail Apr 17, 2024, 5:54 PM

#

these will directly affect and possibly ruin what you're doing

wooden sail Apr 17, 2024, 5:55 PM

#

crisp raptor haha I don't use pytorch

me neither, but many of the ML modules are opinionated on these things instead of giving errors as one would expect

#

jax gives errors for some, but not all of these, and makes arbitrary decisions in others

#

you need to know enough math to even realize this happens

#

same with the dimensions of things like CNNs when you apply them to multi-layer inputs

#

tf and pytorch do different things by default (broadcast vs implicitly make extra CNN layers)

left tartan Apr 17, 2024, 5:55 PM

#

crisp raptor I took algebra 1 and 2 last year, I think I'll be fine

Perhaps you're right, but I'd just suggest a bit of humility. You're at step 1 of many. https://en.m.wikipedia.org/wiki/Mathematical_maturity#Progression

crisp raptor Apr 17, 2024, 5:56 PM

#

theres the benefit of building your toolchains from the ground up; you know how to fix your errors

wooden sail Apr 17, 2024, 5:56 PM

#

crisp raptor theres the benefit of building your toolchains from the ground up; you know how ...

you'll never do this for a realistic problem

#

i don't wanna sit down and make a general computational graph tool myself

crisp raptor Apr 17, 2024, 5:56 PM

#

wooden sail you'll never do this for a realistic problem

well yeah, but for practice it would be good

#

it's not bad knowledge to know if something fails you

wooden sail Apr 17, 2024, 5:57 PM

#

sadly these fall under the category of being just as difficult, if not more, than the original problem

#

so you immediately have at least 2x the work to do

crisp raptor Apr 17, 2024, 5:58 PM

#

yay for engineering 👍

wooden sail Apr 17, 2024, 5:58 PM

#

it does, but not if you have a job to do and get paid for it

#

on your free time it's fine, but under time and money constraints not really

crisp raptor Apr 17, 2024, 5:58 PM

#

R

#

the language?

#

do you know what R is

wooden sail Apr 17, 2024, 6:00 PM

#

R is fairly high level, kinda the opposite direction of what we're discussing now

crisp raptor Apr 17, 2024, 6:00 PM

#

I would say C, but that's just because I'm an old man at heart

#

Fuck the government 🙂

#

That would essentially stab GNU in the back multiple times

#

because they are fucking trying to kill off their brainchild

iron basalt Apr 17, 2024, 6:27 PM

#

wooden sail it does, but not if you have a job to do and get paid for it

Depends on the job, usually not. Might even be better off just doing that in your free time if you are into that. Make an open source library. Unless you can really convince people that it gives a competitive advantage, not just "it feels better than the existing tools."

wooden sail Apr 17, 2024, 6:27 PM

#

indeed

craggy agate Apr 17, 2024, 6:27 PM

#

left tartan Perhaps there's an argument about how much you actually need to -do- vs -underst...

Yeah maybe

iron basalt Apr 17, 2024, 6:28 PM

#

iron basalt Depends on the job, usually not. Might even be better off just doing that in you...

Example would be Pytorch, etc not running at all or fast enough on small devices, so you make a custom framework for those low end devices.

craggy agate Apr 17, 2024, 6:28 PM

#

But I would say functions are pretty important compared to limits.

wooden sail Apr 17, 2024, 6:29 PM

#

i'm often in the position where i do have to implement some stuff that normally one wouldn't. i was discussing this with zestar just the other day. if you can, you'd wanna use built-in stuff like scipy's solver. i just happened to be solving a problem requiring a quasi-newton method, but with a massive, highly structured matrix

craggy agate Apr 17, 2024, 6:30 PM

#

Calculus will obviously help a lot but it's not imperative

wooden sail Apr 17, 2024, 6:30 PM

#

once you're at the point of having to replace numpy's matrix multiplication with matrix-free stuff anyway, and explicitly compute the actions of jacobians and hessians instead of storing the matrices themselves, you're already one step away from writing any gradient-based method from scratch

#

i guess the point being that having more knowledge lets you solve problems in different ways and can even make problems that would otherwise be practically impossible, possible. those are kinda edge cases though

#

or just help you debug your pipeline better
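
To make "compute the hessian's action instead of storing the matrix" concrete, here is a toy sketch. The objective f(x) = sum(x^4)/4 and the helper name `hessian_vector_product` are invented for illustration, and the central finite difference is just one simple way to get a Hessian-vector product from a gradient function.

```python
import numpy as np

def grad(x):
    """Gradient of the toy objective f(x) = sum(x**4) / 4."""
    return x ** 3

def hessian_vector_product(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) @ v with a central difference of the gradient.

    The n-by-n Hessian is never formed, only two gradient evaluations are needed.
    """
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

x = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 0.0, -1.0])

hv = hessian_vector_product(grad, x, v)
expected = 3 * x ** 2 * v        # exact answer: the Hessian here is diag(3 * x**2)
print(hv, expected)
```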

iron basalt Apr 17, 2024, 6:36 PM

#

wooden sail i guess the point being that having more knowledge lets you solve problems in di...

Knowledge for solving edge cases comes up as being able to provide a competitive advantage, it's really good to just know a lot of stuff. Especially the weird details such as those you described earlier.

#

It's about change, and most interesting things involve change.

#

(That's why its the language of physics)

#

I don't think they can really be compared in terms of importance.

#

It's also kind of like saying 1 is more important than 2. Like what does that mean? They both are used heavily, and come up whenever they do.

#

"Important" is not a universal rank-able thing.

wooden sail Apr 17, 2024, 6:42 PM

#

inb4 well-ordering of maths

iron basalt Apr 17, 2024, 6:44 PM

#

It's actually the opposite, you just don't usually see it and can use more well known alternatives that may or may not be as elegant.

#

Oh my bad, I read common, not uncommon.

craggy agate Apr 17, 2024, 6:46 PM

#

It's about limits as well

#

😭

desert oar Apr 17, 2024, 7:55 PM

#

yeah you can think about it as being all about relating levels and rates

#

limits are more of a general real analysis thing but they turn out to be foundational for calculus

#

(or "the" calculus as some would have it)

iron basalt Apr 17, 2024, 8:17 PM

#

desert oar limits are more of a general real analysis thing but they turn out to be foundat...

They originate from Euclid's Elements, Book (scroll) X, Proposition 1.

#

Which then became the method of exhaustion for a while. And then more modern forms built on that.

desert oar Apr 17, 2024, 8:20 PM

#

wild how they used to do math by just writing out words and drawing diagrams

iron basalt Apr 17, 2024, 8:20 PM

#

Although there may have been earlier understandings, it's not super clear, seems to show up globally at earlier times, history is still ongoing (being discovered).

iron basalt Apr 17, 2024, 8:22 PM

#

desert oar wild how they used to do math by just writing out words and drawing diagrams

Algebraic notation was a huge invention, plain words make things really long and hard to follow.

warm trellis Apr 17, 2024, 10:48 PM

#

When I was doing an internship at a company, I was given a problem: predict the future cost of building something, though there were no historical price changes. It was a cost building at a time, in different places, and they were in completely different settings. so not identical at all, therefore different costs. What does one have to do in that setting to predict what the cost will be in the future? I couldn’t build what they asked me to build.. it’s kind of mysterious to me whether it’s even possible in the first place, or whether I was given an impossible task and was naive to accept it. I also blame myself that I couldn’t do it

desert oar Apr 18, 2024, 12:04 AM

#

warm trellis When I was doing an internship in one company, I was given a problem to predict ...

It was a cost building at a time, in different places, and they were in completely different settings. so not identical at all, therefore different costs.
can you clarify what you mean by this?

#

if the prices of inputs don't change historically, then i would estimate the cost of the new thing by looking at the historical costs of the components or processes that are required to build the new thing

outer estuary Apr 18, 2024, 12:45 AM

#

Hello Im luckythespacecat I am a game developer. I need help getting Character.ai to work with my game, if you can help DM me. Below is a link to the repository that I was trying to use. if you can get this to work please let me know and DM me. I am trying to set it up for a game im making (not in python) if you help I will also credit you properly
https://github.com/kramcat/CharacterAI

left tartan Apr 18, 2024, 1:01 AM

#

outer estuary Hello Im luckythespacecat I am a game developer. I need help getting Character.a...

You could open a help thread, but it is unlikely anyone will dm you. #❓｜how-to-get-help

coral lotus Apr 18, 2024, 2:08 AM

#

https://github.com/broadinstitute/keras-rcnn - i forked the repo and put all the code from the readme into a file called main.py

#

i also added import keras, import keras_rcnn, import numpy

#

im getting an error
```py
line 5, in <module>
    training_dictionary, test_dictionary = keras_rcnn.datasets.shape.load_data()
                                           ^^^^^^^^^^^^^^^^^^^
AttributeError: module 'keras_rcnn' has no attribute 'datasets'
```

#

#

i think its trying to access the module keras_rcnn instead of going into the folder keras_rcnn

#

how would i fix this?

desert oar Apr 18, 2024, 2:20 AM

#

coral lotus im getting an error ```py line 5, in <module> training_dictionary, test_dict...

how are you running the code?

coral lotus Apr 18, 2024, 2:21 AM

#

in python in vscode

#

desert oar Apr 18, 2024, 2:22 AM

#

coral lotus in python in vscode

specifically what command are you running

coral lotus Apr 18, 2024, 2:22 AM

#

not sure, i just ran the main.py file

#

the main.py file contains all the code from the readme

desert oar Apr 18, 2024, 2:22 AM

#

coral lotus i think its trying to access the module keras_rcnn instead of going into the fol...

python imports don't correspond directly to folders. python only looks for importable things in specific places. specifically, if you run import a.b, python looks for a/__init__.py and a/b.py or a/b/__init__.py relative to any folder in that search path.

the reason the command matters is that python will adjust its import search path based on what you ran.
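
A quick way to see both points with only the standard library. The stdlib `json` package stands in for `a` and `json.decoder` for `a.b`. The printed paths will differ from machine to machine.

```python
import importlib.util
import os
import sys

# The first search-path entry depends on how Python was started
# (for `python main.py` it is the folder containing main.py).
print(sys.path[0])

pkg_spec = importlib.util.find_spec("json")            # package -> json/__init__.py
sub_spec = importlib.util.find_spec("json.decoder")    # submodule -> json/decoder.py

pkg_file = os.path.basename(pkg_spec.origin)
sub_file = os.path.basename(sub_spec.origin)
print(pkg_spec.origin)
print(sub_spec.origin)
```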

coral lotus Apr 18, 2024, 2:23 AM

#

oh hm i see

desert oar Apr 18, 2024, 2:23 AM

#

coral lotus im getting an error ```py line 5, in <module> training_dictionary, test_dict...

is there also a keras_rcnn.py file somewhere in the same directory?

coral lotus Apr 18, 2024, 2:23 AM

#

how would i fix this then, because i still need to do the imports

#

let me check one sec

coral lotus Apr 18, 2024, 2:23 AM

#

desert oar is there _also_ a `keras_rcnn.py` file somewhere in the same directory?

no

#

keras_rcnn\datasets\shape.py - this is the file the code is meant to access

#

but its looking in keras_rcnn library

#

idk how to fix it

desert oar Apr 18, 2024, 2:26 AM

#

coral lotus but its looking in keras_rcnn library

what are the import statements in keras_rcnn\datasets\shape.py?

#

and is there anything in keras_rcnn/__init__.py or is it just a blank file?

coral lotus Apr 18, 2024, 2:27 AM

#

coral lotus Apr 18, 2024, 2:27 AM

#

desert oar and is there anything in `keras_rcnn/__init__.py` or is it just a blank file?

no it has stuff

desert oar Apr 18, 2024, 2:27 AM

#

coral lotus

🤔 where is keras_rcnn imported?

coral lotus Apr 18, 2024, 2:27 AM

#

its not imported in that file

desert oar Apr 18, 2024, 2:27 AM

#

coral lotus im getting an error ```py line 5, in <module> training_dictionary, test_dict...

which file is this?

coral lotus Apr 18, 2024, 2:27 AM

#

desert oar which file is this?

main.py

desert oar Apr 18, 2024, 2:28 AM

#

~~i thought you said it was shape.py~~ i see, i was confused

#

what are the import statements in main.py?

coral lotus Apr 18, 2024, 2:28 AM

#

no but when i ran main.py, the error is coming from the program not being able to access the shape.py file

#

desert oar Apr 18, 2024, 2:28 AM

#

coral lotus

add import keras_rcnn.datasets

coral lotus Apr 18, 2024, 2:28 AM

#

ohh alright one sec

#

then im guessing i would also have to import keras_rcnn.preprocessing

desert oar Apr 18, 2024, 2:29 AM

#

the program not being able to access the shape.py file
that's not what's happening. python found the keras_rcnn module but you also need to explicitly import the submodule that you're using

desert oar Apr 18, 2024, 2:29 AM

#

coral lotus then im guessing i would also have to import keras_rcnn.preprocessing

yeah, and probably keras_rcnn.datasets.shape as well

#

(you probably don't need keras_rcnn.datasets by itself)

coral lotus Apr 18, 2024, 2:29 AM

#

desert oar yeah, and probably `keras_rcnn.datasets.shape` as well

wait how come?

#

wouldnt that be included by importing keras_rcnn.datasets

desert oar Apr 18, 2024, 2:30 AM

#

coral lotus wouldnt that be included by importing keras_rcnn.datasets

no, imports are not "recursive" by default. a lot of libraries set them up to be recursive for convenience, but python doesn't do it for you
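
A self-contained sketch of that behaviour, using a throwaway package written to a temp folder. The names `demo_pkg` and `shape` are made up to mirror the `keras_rcnn.datasets.shape` situation, and the `__init__.py` is left empty on purpose.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk: demo_pkg/__init__.py (empty) and demo_pkg/shape.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "shape.py").write_text("def load_data():\n    return 'loaded'\n")

sys.path.insert(0, str(root))     # make the temp folder importable
importlib.invalidate_caches()

import demo_pkg
before = hasattr(demo_pkg, "shape")   # submodule not loaded by importing the package

import demo_pkg.shape                 # explicit submodule import
after = hasattr(demo_pkg, "shape")
result = demo_pkg.shape.load_data()

print(before, after, result)
```

If `demo_pkg/__init__.py` contained `from . import shape`, the first check would already succeed. That is the "recursive for convenience" setup some libraries choose.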

coral lotus Apr 18, 2024, 2:30 AM

#

ohh

#

wait why is this an error

desert oar Apr 18, 2024, 2:30 AM

#

read it!

#

(please post code as text in the future, not as screenshots)