boreal gale Sep 15, 2023, 3:10 PM

#

i think that's ~~exactly~~ chebyshev's inequality just represented slightly differently

agile cobalt Sep 15, 2023, 3:10 PM

#

this may or may not be useful: https://en.wikipedia.org/wiki/Chebyshev's_inequality#Proof

serene scaffold Sep 15, 2023, 3:12 PM

#

always ask an actual question that someone who knows the answer can start answering--don't ask to ask.

#

@boreal gale @agile cobalt thanks!

somber prism Sep 15, 2023, 3:23 PM

#

model : resnet 152 initially loaded with imagenet weights
dataset : freihand and multiview hand pose dataset (approx total of 180k images)
optimizer : Adam with .01 lr
task : keypoints detection
hnet : trained for 10 epochs
hnet2 : +5 epochs (15 ep)
hnet3 : +5 epochs (20 ep)
hnet4 : +5 epochs (25 ep)

i've been training my resnet model for days and i've found there's a gradual decrease in MSE loss. my goal is for the model to work well during testing (let's say with my own hand, outside of this dataset). can someone suggest a way to improve this model? it would be really helpful. should i just train the model a bit more, like up to 50 epochs? so far i've cropped the images using the bounding box to focus the whole image on the hand and remove noise.

serene scaffold Sep 15, 2023, 3:23 PM

#

this insight was the key

#

@boreal gale @agile cobalt any ideas for what to make of this? I'm not sure what to do with u (without a subscript; u has up to this point been the random variable) now that we have a sequence of u values.

boreal gale Sep 15, 2023, 3:26 PM

#

serene scaffold @boreal gale @agile cobalt any ideas for what to make of this? ...

yeah weird, it just looks like u is undefined here..

agile cobalt Sep 15, 2023, 3:27 PM

#

what I thought that could be useful from the proof was mainly the y = (X - u) ** 2

serene scaffold Sep 15, 2023, 3:28 PM

#

agile cobalt what I thought that could be useful from the proof was mainly the `y = (X - u) *...

we were already asked to prove this, and one can just substitute that for t.

modest mauve Sep 15, 2023, 3:29 PM

#

https://discord.com/channels/267624335836053506/1152264347863744603

pale hemlock Sep 15, 2023, 3:30 PM

#

omg

#

I think i found my room

modest mauve Sep 15, 2023, 3:31 PM

#

modest mauve https://discord.com/channels/267624335836053506/1152264347863744603

please help🙏

somber prism Sep 15, 2023, 3:32 PM

#

somber prism model : resnet 152 initially loaded with imagenet weights dataset : freihand and...

please help🙏

pale hemlock Sep 15, 2023, 3:35 PM

#

jaabir

#

ok honestly im novice but i work with it so icould help?

serene scaffold Sep 15, 2023, 3:36 PM

#

serene scaffold we were already asked to prove this, and one can just substitute that for `t`.

please help 🙏

#

sorry I just wanted to be included.

somber prism Sep 15, 2023, 3:36 PM

#

pale hemlock ok honestly im novice but i work with it so icould help?

shoot, im ok with any suggestions

pale hemlock Sep 15, 2023, 3:37 PM

#

ok

#

what exactly is the issue?

wooden sail Sep 15, 2023, 3:37 PM

#

yeah you can take (u - mu)^2 as a new random variable, substitute into chebyshev's, and compute its expectation

pale hemlock Sep 15, 2023, 3:39 PM

#

@somber prism >?

#

OH

#

Jaabir

#

would you mind phone call, i may interest you in a new way to work with this?

somber prism Sep 15, 2023, 3:42 PM

#

you can dm

#

@pale hemlock

wooden sail Sep 15, 2023, 3:44 PM

#

serene scaffold @boreal gale @agile cobalt any ideas for what to make of this? ...

what is u here?

somber prism Sep 15, 2023, 3:44 PM

#

pale hemlock what exactly is the issue?

nothing really, i just wanted to know how others would take approach in this

serene scaffold Sep 15, 2023, 3:44 PM

#

wooden sail what is u here?

that's the question 😛

#

I just asked the prof, so we'll see when he replies.

#

I guess it's an arbitrary element from the sequence of u values, but I don't think that goes without saying.

boreal gale Sep 15, 2023, 3:46 PM

#

is the question to figure out what is u 😛

serene scaffold Sep 15, 2023, 3:46 PM

#

the question is "what is u intended to represent in the context of the homework question"?

#

and the homework question is to prove that inequality.

wooden sail Sep 15, 2023, 3:48 PM

#

u is probably the mean of all the u_i, since that reduces the variance by a factor of n

#

you can already go ahead and try that out

boreal gale Sep 15, 2023, 3:48 PM

#

my guess is it's probably supposed to be \bar{u}

wooden sail Sep 15, 2023, 3:48 PM

#

but yeah, it's not written clearly

#

you can use the linearity of the expectation operation to show this one, as well as once again plugging into chebyshev's ineq
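
A sketch of the argument described here, assuming u stands for the sample mean \bar{u} of n i.i.d. values u_1, ..., u_n with mean \mu and finite variance \sigma^2. The homework inequality itself is not shown in the thread, so the exact form below is an assumption:

```latex
% Assumption: u_1, ..., u_n are i.i.d. with mean \mu and variance \sigma^2
\bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i

% Linearity of expectation
\mathbb{E}[\bar{u}] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[u_i] = \mu

% Independence makes the variances add
\operatorname{Var}(\bar{u}) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(u_i) = \frac{\sigma^2}{n}

% Chebyshev's inequality applied to \bar{u}
P\bigl(|\bar{u} - \mu| \ge \epsilon\bigr) \le \frac{\operatorname{Var}(\bar{u})}{\epsilon^2} = \frac{\sigma^2}{n \epsilon^2}
```

The factor of n in the denominator is the variance reduction mentioned above.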

#

as a general recommendation, it's a good idea to keep an eye out for the (central) moments of random variables

#

they tend to have nice properties and give you intuition about their distribution's properties

serene scaffold Sep 15, 2023, 3:51 PM

#

what is u-bar, and if it's the mean of all u_i, how is that different from mu?

wooden sail Sep 15, 2023, 3:51 PM

#

the bar is a common notation for the mean

#

and i guess i should say "sample mean" for clarity

#

.latex the sample mean is \[ \bar{u} = \frac{1}{N} \sum_{n=1}^{N} u_n \]

strange elbowBOT Sep 15, 2023, 3:52 PM

#

$latex.png$

wooden sail Sep 15, 2023, 3:53 PM

#

this is only equal to the true mean if N goes to infinity

#

people usually call the "true mean" the "expected value"

#

.latex that'd be
\[
\mathbb{E}(U) = \int_{-\infty}^{\infty} u f(u) \, du
\]

strange elbowBOT Sep 15, 2023, 3:54 PM

#

$latex.png$

wooden sail Sep 15, 2023, 3:54 PM

#

oops

#

where f(u) is the pdf of U

#

these two being equivalent as N goes to infinity is the "law of large numbers", which btw doesn't hold in general

#

but the takeaway for you is that the average and the expected value are not the same thing

#

what averaging DOES do is reduce the variance

#

this is what the problem is asking you to show

#

in signal processing terms, averaging is the same as applying a rolling average window, which is the same as lowpass filtering

#

another way to think of it is that the sample mean or average is a function of random variables, and so its output is also another random variable with a new distribution: one with the same mean as the original variables, but a lower variance. the expected value is a constant
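
A small simulation can make this concrete. It is only a sketch: the uniform distribution, the sample size `N`, and the trial count `TRIALS` are arbitrary choices for illustration, not anything from the homework.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N = 25          # hypothetical sample size
TRIALS = 2000   # how many sample means to draw

# Individual draws: uniform on [0, 1), so the true mean is 0.5 and the variance is 1/12.
singles = [random.random() for _ in range(TRIALS)]

# Each entry is the sample mean of N independent draws, i.e. a new random variable.
means = [statistics.fmean(random.random() for _ in range(N)) for _ in range(TRIALS)]

var_single = statistics.variance(singles)
var_mean = statistics.variance(means)

print(f"variance of one draw:       {var_single:.5f}  (theory {1 / 12:.5f})")
print(f"variance of the mean of {N}: {var_mean:.5f}  (theory {1 / 12 / N:.5f})")
```

Both lists are centred near 0.5, but the sample means are spread roughly N times less, which matches Var(\bar{u}) = \sigma^2 / N.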

serene scaffold Sep 15, 2023, 4:09 PM

#

@wooden sail

Oh wow, sorry, I just left that out of the question, and no one asked yet! Yes, u in part c is the sample mean of all the us.

#

it's due in 10 hours

#

Pepega

wooden sail Sep 15, 2023, 4:09 PM

#

😌

#

mystery solved. i left you 10 pages of lore explaining the background in the meantime

serene scaffold Sep 15, 2023, 4:11 PM

#

I like lore

past meteor Sep 15, 2023, 4:26 PM

#

serene scaffold I guess it's an arbitrary element from the sequence of `u` values, but I don't t...

I think they meant u_{i} no?

serene scaffold Sep 15, 2023, 4:27 PM

#

past meteor I think they meant `u_{i}` no?

the prof since clarified that it's the sample mean for all u_i

past meteor Sep 15, 2023, 4:27 PM

#

ooof

#

In this context \bar{u} would've been appropriate like Ry said then

#

Typos can be made ofc 🙂

serene scaffold Sep 15, 2023, 4:29 PM

#

serene scaffold @wooden sail > Oh wow, sorry, I just left that out of the question, and...

yes

past meteor Sep 15, 2023, 4:31 PM

#

I always feel "guilty" when I realize that I used to know these things but I forgot 🤣 . Guess that's what mostly doing applied stuff does to you

boreal gale Sep 15, 2023, 4:33 PM

#

i have 0 guilt 🤣 because i know i wouldn't even know how to do 20% of the things i easily whip out today back then

modest mauve Sep 15, 2023, 4:51 PM

#

I'm making a python script that extracts the sudoku grid from an image. all numbers should be extracted into a 2d array matching the sudoku image

this is my colab notebook link:
https://colab.research.google.com/drive/1ykMxMtiPX0SVph6bQpzQTLCZGkPJFJuT?usp=sharing

there's some error in the text extraction, or it might be in the perspective correction; i'm not sure what to do exactly.
kindly have a look at the code🙏

error:
AttributeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/PIL/ImageFile.py in _save(im, fp, tile, bufsize)
517 try:
--> 518 fh = fp.fileno()
519 fp.flush()

AttributeError: '_idat' object has no attribute 'fileno'

During handling of the above exception, another exception occurred:

SystemError Traceback (most recent call last)
10 frames
/usr/local/lib/python3.10/dist-packages/PIL/ImageFile.py in _encode_tile(im, fp, tile, bufsize, fh, exc)
531 encoder = Image._getencoder(im.mode, e, a, im.encoderconfig)
532 try:
--> 533 encoder.setimage(im.im, b)
534 if encoder.pushes_fd:
535 encoder.setfd(fp)

SystemError: tile cannot extend outside image

i put exception handling code there and now the whole extracted grid is 0000...

please help!!!!

white crest Sep 15, 2023, 4:53 PM

#

Can someone help me with pandas in python? I have some issues in implementing some of the functions of pandas and numpy. I also need some help in creating a script to get some insights from a dataset.

serene scaffold Sep 15, 2023, 5:01 PM

#

white crest Can someone help me with pandas in python? I have some issues in implementing so...

be sure to always give enough information that someone can start helping--don't wait for a commitment.

whatever dataframes you need help with, show them as text (not a screenshot) with print(df.head().to_dict('list'))

left tartan Sep 15, 2023, 5:10 PM

#

modest mauve I'm making a python script that extracts the sudoku grid from the image. all num...

Based on that message about `_idat` having no `fileno` attribute, I’m guessing your fp parameter is not a file handle.

#

Check your params carefully. I didn’t look at your code tho.

modest mauve Sep 15, 2023, 5:11 PM

#

left tartan Check your params carefully. I didn’t look at your code tho.

if possible can u look ?😓

left tartan Sep 15, 2023, 5:12 PM

#

I can’t right now, I’m half debugging something, just doing light discord 🙂

white crest Sep 15, 2023, 5:21 PM

#

serene scaffold be sure to always give enough information that someone can start helping--don't ...

hi @serene scaffold thanks for your reply, actually i want help in generating a scripts from specific condition to implement in dataset,
though the datasets are

power data of equipments -
{'Time stamp': ['2023-06-01 00:00', '2023-06-01 01:00', '2023-06-01 02:00', '2023-06-01 03:00', '2023-06-01 04:00'], 'HVAC 1 (kW)': [0.0, 0.0, 0.0, 0.0, 0.0], 'HVAC 2 (kW)': [0.0, 0.0, 0.0, 0.0, 0.0], 'HVAC 3 (kW)': [0.0, 0.0, 0.0, 0.0, 0.0], 'HVAC 4 (kW)': [0.6772, 0.5796, 0.4976, 0.6235, 0.5637], 'Kitchen Bar lights (kW)': [0.0, 0.0, 0.0, 0.0, 0.0], 'LCC Oxford Circus - Total (kW)': [5.39, 5.25, 5.17, 5.0, 4.42], 'Main 1 (kW)': [5.39, 5.25, 5.17, 5.0, 4.42], 'Main 1 L1 (kW)': [3.81, 3.85, 3.77, 3.72, 3.22], 'Main 1 L2 (kW)': [0.9682, 0.9715, 0.9935, 0.919, 0.9206], 'Main 1 L3 (kW)': [0.611, 0.4309, 0.4057, 0.3594, 0.2783]}
Working hours of the site -
{'WeekDay': ['Monday', 'Monday', 'Monday', 'Monday', 'Monday'], 'Type': ['Non Trading', 'Non Trading', 'Non Trading', 'Non Trading', 'Non Trading'], 'Hour': [0, 1, 2, 3, 4]}

halcyon hedge Sep 15, 2023, 5:31 PM

#

df_temp = df.query('1970<Year<1981')
plt.pyplot.subplot(1, 5, 1)
df_temp.value_counts("Method").plot(kind = 'bar', title="1970-1980", figsize=(24,4))

df_temp = df.query('1980<Year<1991')
plt.pyplot.subplot(1, 5, 2)
df_temp.value_counts("Method").plot(kind = 'bar', title="1980-1990")

df_temp = df.query('1990<Year<2001')
plt.pyplot.subplot(1, 5, 3)
df_temp.value_counts("Method").plot(kind = 'bar', title="1990-2000")

df_temp = df.query('2000<Year<2010')
plt.pyplot.subplot(1, 5, 4)
df_temp.value_counts("Method").plot(kind = 'bar', title="2000-2010")

df_temp = df.query('2010<Year<2020')
plt.pyplot.subplot(1, 5, 5)
df_temp.value_counts("Method").plot(kind = 'bar', title="2010-2020");

plt.pyplot.suptitle("Main Title", fontsize=15)
plt.pyplot.subplots_adjust(hspace=4, top=4)
plt.pyplot.subplots_adjust(left=0.1,
                           bottom=0.1,
                           right=0.9,
                           top=0.9,
                           wspace=0.4,
                           hspace=0.4)

plt.pyplot.show()

#

Spacing and padding are not working for the heading (`plt.pyplot.suptitle`).

#

The words "Main Title" (`plt.pyplot.suptitle`) just overlap with the titles of the individual graphs, despite adding `plt.pyplot.subplots_adjust(left=0.1, bottom=0.1, right=0.9, top=0.9, wspace=0.4, hspace=0.4)`.

#

How to fix this
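
One likely cause, offered as a sketch rather than a confirmed diagnosis: the second `subplots_adjust` call overrides the first, and `top=0.9` on a 4-inch-tall figure leaves very little room for both the per-axes titles and the `suptitle`. Lowering `top` reserves more space. The counts below are made-up stand-ins for `df.query(...).value_counts("Method")`, and the `import matplotlib.pyplot as plt` alias is an assumption (the original code appears to use `import matplotlib as plt`).

```python
import matplotlib.pyplot as plt

# Hypothetical counts per method for each period (not real data).
periods = {
    "1970-1980": {"A": 3, "B": 5},
    "1980-1990": {"A": 4, "B": 2},
    "1990-2000": {"A": 6, "B": 1},
    "2000-2010": {"A": 2, "B": 7},
    "2010-2020": {"A": 5, "B": 5},
}

fig, axes = plt.subplots(1, len(periods), figsize=(24, 4))
for ax, (title, counts) in zip(axes, periods.items()):
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_title(title)

fig.suptitle("Main Title", fontsize=15)

# Call subplots_adjust once; top=0.8 keeps the axes (and their titles)
# clear of the figure-level title drawn near the top edge.
fig.subplots_adjust(left=0.1, right=0.9, bottom=0.1, top=0.8, wspace=0.4)

# plt.show()  # uncomment when running interactively
```

Passing `layout="constrained"` to `plt.subplots` is another option in recent matplotlib versions; it lets matplotlib place the titles automatically instead of tuning `top` by hand.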

serene scaffold Sep 15, 2023, 5:59 PM

#

white crest hi @serene scaffold thanks for your reply, actually i want help in generati...

to be clear, I wasn't making a commitment to help. I was just telling you that you need to ask your question if you want to get help.

> generating a scripts from specific condition to implement in dataset,
I don't know what this means. it sounds like you don't have the vocabulary to convey what you're trying to do.

Try showing what result you want given the two dataframes that you have shown.

minor cloak Sep 15, 2023, 6:39 PM

#

[PyGraft is looking for open-source contributors]

Hi there,

I recently open-sourced PyGraft, a configurable Python tool to generate synthetic knowledge graphs easily!
It can be used in any AI tasks (Machine Learning, Deep Learning, Reasoning, etc.) provided that you work with graphs.

The repo is gaining a lot of visibility, and I am looking for motivated contributors to support me in implementing new features and unit tests. Ideally, you should (or would like to) have a general understanding of knowledge graphs, semantic web, RDF/RDFS, and OWL vocabularies. In addition, strong Python programming skills are required. Experience in Software Engineering is a plus 🙂

DM me if you would like to contribute!

Otherwise, you can still take a look and star and fork the repo if you find the project interesting!

https://github.com/nicolas-hbt/pygraft

lunar wadi Sep 15, 2023, 6:54 PM

#

Can someone help me find a good article or blog that discusses static evaluation of non-terminal game states?

left tartan Sep 15, 2023, 7:19 PM

#

lunar wadi Can someone help me find a good article or blog that discusses stat...

If we’re talking deep learning and games, are you familiar with two minute papers? I’d probably go through the papers he cites on his yt channel

lunar wadi Sep 15, 2023, 7:20 PM

#

Yeah, I am subscribed to that channel

left tartan Sep 15, 2023, 7:21 PM

#

(That’s about all I know about games tho)

lunar wadi Sep 15, 2023, 7:22 PM

#

I'm just using a simple approach in my game, which uses minimax up to a certain depth, and the channel talks about deep learning

atomic hamlet Sep 15, 2023, 11:50 PM

#

(cross post cause wrong channel) Does anyone know how to use matplotlib or another library to make a graph something like the one below? The issue is that I want it to show the count of events over a period of time (edits on a wiki via the MediaWiki API, to be exact), which don't have a 'y' value, just a timestamp. I guess I'd use the count of X in a range as y. https://cdn.discordapp.com/attachments/421605166198816769/1152370948582932541/channel_chart.png

mighty patio Sep 16, 2023, 12:02 AM

#

That kind of graph is called a histogram.
There is a plt.hist() function that will give you a bar plot, but there is also a histogram function in numpy which you can use in combination with matplotlib to make a line plot like shown there

atomic hamlet Sep 16, 2023, 12:04 AM

#

mighty patio That kind of graph is called a histogram. There is a `plt.hist()` function that ...

Oh? Thank you for that information, I'll look into documentation and such for that!

weak mortar Sep 16, 2023, 12:13 AM

#

while a histogram also has x and y axes and can display the same data, i'd say it's a line chart with smoothing applied to the line

#

wouldnt cost too much and see here if you like that type of plot. thats just one out of many other libs. https://plotly.com/python/line-charts/

#

to make it smooth you want to look for what they call "spline" .. somewhere . the documentation is rather ok while not amazing

atomic hamlet Sep 16, 2023, 12:18 AM

#

weak mortar wouldnt cost too much and see here if you like that type of plot. thats just one...

That's what I thought to use at first as well. But I only have 1D points on the X axis and am not entirely sure how I would go about calculating Y data from them, so I was looking for a solution that already does that.

weak mortar Sep 16, 2023, 12:18 AM

#

okay. you have time on X and what you plan for Y?

#

or i dont know if you have time on X, thats how i interpreted your msg

atomic hamlet Sep 16, 2023, 12:21 AM

#

weak mortar or i dont know if you have time on X, thats how i interpreted your msg

I have a collection of objects with a timestamp, I’m looking to make a graph like this(https://cdn.discordapp.com/attachments/421605166198816769/1152370948582932541/channel_chart.png) where the concentration of points forms the Y. This image shows discord message activity, my data will be logs of edit to a wiki, so you could see trends in increases and decreases in editor activity.

left tartan Sep 16, 2023, 12:21 AM

#

atomic hamlet I have a collection of objects with a timestamp, I’m looking to make a graph lik...

Show what your data looks like plz

atomic hamlet Sep 16, 2023, 12:25 AM

#

left tartan Show what your data looks like plz

Aight. When I get back home I’ll generate some sample data.

#

(srry didnt think to make any yet)

left tartan Sep 16, 2023, 12:27 AM

#

atomic hamlet Aight. When I get back home I’ll generate some sample data.

As I’ve read; you have an event stream where each event has a timestamp. You want to show events per day. Right?

atomic hamlet Sep 16, 2023, 12:29 AM

#

left tartan As I’ve read; you have an event stream where each event has a timestamp. You wan...

Yeah, although I may tweak it to show over a year instead, depending on the timescale I want to observe

#

Its an object with a UNIX EPOCH Timestamp iirc

mighty patio Sep 16, 2023, 12:33 AM

#

atomic hamlet Thats what I thought at first as well to use. But I only have 1D points on the X...

The function that gets you the y values from a list of single events is numpy.histogram()
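
A minimal sketch of that idea, assuming the events are Unix epoch timestamps in seconds and that one bin per UTC day is wanted. The timestamps are invented for illustration.

```python
import numpy as np

DAY = 86_400            # seconds per day
base = 1_694_736_000    # midnight UTC on 2023-09-15

# Hypothetical edit timestamps (Unix epoch seconds), not real wiki data.
timestamps = np.array([
    base + 100, base + 200,                                   # two edits on day 0
    base + DAY + 5,                                           # one edit on day 1
    base + 3 * DAY + 10, base + 3 * DAY + 20, base + 3 * DAY + 30,  # three on day 3
])

# Bin edges aligned to day boundaries, covering the whole range of the data.
start = timestamps.min() // DAY * DAY
stop = (timestamps.max() // DAY + 1) * DAY
edges = np.arange(start, stop + DAY, DAY)

# counts[i] is the number of events with edges[i] <= t < edges[i + 1]
# (the last bin also includes its right edge).
counts, _ = np.histogram(timestamps, bins=edges)
print(counts)  # [2 1 0 3]
```

The `counts` array is the missing y data: plotting `edges[:-1]` against `counts` as a line gives a chart of activity over time, and days with no events show up as zeros.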

left tartan Sep 16, 2023, 12:33 AM

#

atomic hamlet Yeah, although I may tweak it to show over a year instead, depending on the time...

Yah, so you just want to group the data by date, take a sum per group, and plot that.

#

You -can- do some of this stuff in the charting library, but the normal process is just to calculate what you want, then plot it.

#

(I don’t think of this as a histogram case, but if you added a date column to each object, I guess it’d work)

#

All that grouping stuff is really Pandas or SQL language, by the way
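
The group-then-count step can also be sketched with only the standard library, as below. The timestamps are invented, and `events_per_day` is a hypothetical helper name; a pandas group-by or a SQL `GROUP BY` would play the same role.

```python
from collections import Counter
from datetime import datetime, timezone

DAY = 86_400
base = 1_694_736_000  # midnight UTC on 2023-09-15

# Hypothetical event timestamps (Unix epoch seconds).
events = [base + 100, base + 200, base + DAY + 5,
          base + 3 * DAY + 10, base + 3 * DAY + 20, base + 3 * DAY + 30]


def events_per_day(timestamps):
    """Group timestamps by UTC calendar date and count each group."""
    days = (datetime.fromtimestamp(ts, tz=timezone.utc).date() for ts in timestamps)
    return Counter(days)


per_day = events_per_day(events)
for day, count in sorted(per_day.items()):
    print(day.isoformat(), count)
```

Unlike fixed bins, this only produces entries for days that actually have events, so empty days would need to be filled with zeros before drawing a continuous line.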

weak mortar Sep 16, 2023, 12:36 AM

#

left tartan Yah, so you just want to group the data by date, take a sum per group, and plot ...

is this technique you are describing also referred to as "bins" ?

left tartan Sep 16, 2023, 12:38 AM

#

Bins is what you’d call it in a histogram… but grouping is what you’d call the data transformation in data libraries and dbs.

weak mortar Sep 16, 2023, 12:39 AM

#

ah cool. yea i see the histogram 2d and other 2d maps have nbins functions where scatter only has a few different named something with group

pulsar elk Sep 16, 2023, 11:29 AM

#

In a two-way ANOVA, how would you calculate the main effects and interaction effects using Python?

#

hello, i am not even able to understand how the ols function works

prime token Sep 16, 2023, 11:47 AM

#

Is data science just about grabbing data and making sense of it? Like if I wanted to create some analytics for rugby, I just grab the important data make sense of it via libraries and whatnot

mighty patio Sep 16, 2023, 12:01 PM

#

prime token Is data science just about grabbing data and making sense of it? Like if I wante...

Yes, although the added value arises by finding correlations that are not immediately apparent.
So if I was to make the same statement you just did I would not include the word just

prime token Sep 16, 2023, 12:03 PM

#

Do you have an example of that if you dont mind me asking

mighty patio Sep 16, 2023, 12:22 PM

#

I am an X-ray scientist so probably not the best person to ask for examples, but you can look into past kaggle competitions
I remember there was one for predicting supermarket sales (to predict how much produce should be stocked by the supermarket), and the sales of ice cream would increase on public holidays but only if the weather was good.

red latch Sep 16, 2023, 12:26 PM

#

whats the best stats course i can take for data science? your recommendations?

left tartan Sep 16, 2023, 12:45 PM

#

red latch whats the best stats course i can take for data science? your recommendations?

What stats have you taken? It’s all relative to you, right?

red latch Sep 16, 2023, 12:49 PM

#

uhm, i have an over the top understanding of some concepts but ive taken high school math and engineering math aswell so yeah

potent sky Sep 16, 2023, 12:52 PM

#

mighty patio Yes, although the added value arises by finding correlations that are not immedi...

I think this also functions as a sort of meta-example. Amusing

red latch Sep 16, 2023, 12:52 PM

#

red latch uhm, i have an over the top understanding of some concepts but ive taken high sc...

i have an over the topic very shallow understanding of some stats concepts

#

but what would be a good course to take teaches both stats and implementing that in python

left tartan Sep 16, 2023, 1:02 PM

#

red latch uhm, i have an over the top understanding of some concepts but ive taken high sc...

I made the mistake in grad school of taking a stats course that was primarily for math majors (but they allowed CS). I thought I knew stats. Boy was I wrong.

left tartan Sep 16, 2023, 1:03 PM

#

red latch but what would be a good course to take teaches both stats and implementing that...

I don’t know of one but would love if someone had one!

prime token Sep 16, 2023, 1:03 PM

#

I think you can make bank with a maths and stats degree. Quants require like some msc from what I see and you can make millions if you're good enough

left tartan Sep 16, 2023, 1:06 PM

#

prime token I think you can make bank with a maths and stats degree. Quants require like som...

Perhaps, but I don’t think you can ‘will yourself into’ loving math and a quant role if you don’t have a passion (and ability) for the subject.

red latch Sep 16, 2023, 1:07 PM

#

left tartan I made the mistake in grad school of taking a stats course that was primarily fo...

oh yeah i do look at some stats things and im so lost but im confident i could pick it up

#

since im not really super idk

#

whats the word im looking for?

#

isolated from the math concepts?

left tartan Sep 16, 2023, 1:13 PM

#

red latch oh yeah i do look at some stats things and im so lost but im confident i could p...

I dunno, but three things you could/should work on: 1. Data engineering stuff- do projects that work with data. Kaggle has lots of ideas/projects. 2. stats: there’s so many things to learn but just pick one thing at a time that you know little about and dig deep. 3. CS50 for AI: not stats, but it’s a survey of ML and Python, the hands on stuff you were talking about

left tartan Sep 16, 2023, 1:15 PM

#

red latch oh yeah i do look at some stats things and im so lost but im confident i could p...

And three books zestar75 recommended: #data-science-and-ml message

red latch Sep 16, 2023, 1:15 PM

#

thanks for the pointers

#

how do you guys follow a book when its not prescribed as part of a course

#

thats dedication

#

pls

left tartan Sep 16, 2023, 1:16 PM

#

red latch how do you guys follow a book when its not prescribed as part of a course

I don’t. I pick a chapter or topic I’m interested in.

red latch Sep 16, 2023, 1:17 PM

#

ahh

#

thats far more convenient and freeing tbh

#

my brain is so "all or nothing" tho

#

i cant just pick a chapter and not like perfectly do the whole book seems so wrong

left tartan Sep 16, 2023, 1:23 PM

#

Yah, I have a few books I’ll never finish, but I can’t put them away

pale hemlock Sep 16, 2023, 1:24 PM

#

there's always finishing them with audio

wooden sail Sep 16, 2023, 1:28 PM

#

that doesn't work if it's technical content

left tartan Sep 16, 2023, 1:28 PM

#

Lol, I’d love to hear a stochastic calc text in audio

past meteor Sep 16, 2023, 3:04 PM

#

red latch oh yeah i do look at some stats things and im so lost but im confident i could p...

You have to be more specific in what you do and don't know from stats imo

gaunt sorrel Sep 16, 2023, 3:08 PM

#

i know

simple tapir Sep 16, 2023, 5:45 PM

#

hey

#

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def logistic_regression(data, max_iter=100, random_state=42):
    x = data.drop(["happiness_rank"], axis=1)
    y = data["happiness_rank"]
    x_train, x_test, y_train, y_test = train_test_split(x,y,random_state=random_state)
    model = LogisticRegression(max_iter=max_iter, random_state=random_state)
    model.fit(x_train,y_train)  
    preds = model.predict(x_test)
    score = accuracy_score(y_test, preds)
    return score

#

it shows that accuracy is 0.0

past meteor Sep 16, 2023, 6:08 PM

#

simple tapir ```py from sklearn.linear_model import LogisticRegression from sklearn.model_sel...

What is happiness rank exactly? How does the data look like?

#

How many levels? (E.g., is it 0/1 or from 1 to 10 or so?)
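
Why the number of levels matters, as a sketch under one assumption the thread never confirms: that `happiness_rank` is a distinct rank for every row. In that case no label in the test split ever appears in the training split, so a classifier cannot score above zero. The 150-row size is made up.

```python
import random

random.seed(42)

# Assumption: one row per country, each with its own rank 1..150.
ranks = list(range(1, 151))
random.shuffle(ranks)

# train_test_split keeps 75% of the rows for training by default.
split = int(len(ranks) * 0.75)
y_train, y_test = ranks[:split], ranks[split:]

# A classifier can only ever output labels it saw during fit().
seen = set(y_train)
best_possible_accuracy = sum(label in seen for label in y_test) / len(y_test)
print(best_possible_accuracy)  # 0.0
```

If the data does look like this, treating the rank as a regression target, or binning it into a few classes first, would be more appropriate than `LogisticRegression` with `accuracy_score`.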

abstract wasp Sep 16, 2023, 11:22 PM

#

Hello there, I want to train YOLO with my own dataset but I’d have to do the annotations manually and it would take forever to do so 😭 I was thinking of using edge detection to automatically create the bounding boxes. Do you guys agree or do you guys have a better idea?

dire iron Sep 17, 2023, 4:55 AM

#

Anyone interested in optimization of LMs?

civic elm Sep 17, 2023, 6:31 AM

#

Greetings, I am working on the kaggle medical transcript dataset and I am trying to re-balance the dataset because the mean word count is at 18 words, max words is 76. I am wondering what is the best move here? should I cut the top 25% highest word count?

lunar wadi Sep 17, 2023, 7:18 AM

#

Hi, does anyone know of any resource on designing an evaluation function which takes in a state (mostly non-terminal) and the player and returns a score (how close that player is to winning)?
An example of such a function for any game would also be fine. I just want to get the gist of the core concept of extracting features and evaluating them

I am making a connect4 game which is using some algorithmic approach for AI player. I'm using minimax under the hood.
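
One common shape for such a function in Connect 4 is to score every possible line of four cells. The following is only a sketch: the board encoding (0 empty, 1 and 2 for the players), the `WEIGHTS` values, and the function names are all assumptions for illustration, not a tuned evaluation.

```python
ROWS, COLS = 6, 7
EMPTY = 0

# Hypothetical weights: how much an unblocked line with k own pieces is worth.
WEIGHTS = {2: 1, 3: 10, 4: 1000}


def windows(board):
    """Yield every horizontal, vertical, and diagonal run of four cells."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS:
                    yield [board[r + i * dr][c + i * dc] for i in range(4)]


def evaluate(board, player):
    """Heuristic score of a (possibly non-terminal) board from player's view."""
    opponent = 3 - player  # players are encoded as 1 and 2
    score = 0
    for window in windows(board):
        mine, theirs = window.count(player), window.count(opponent)
        if mine and theirs:
            continue  # a mixed line can never be completed by either side
        if mine:
            score += WEIGHTS.get(mine, 0)
        elif theirs:
            score -= WEIGHTS.get(theirs, 0)
    return score


# Usage: player 1 has three in a row along one edge of an otherwise empty board.
board = [[EMPTY] * COLS for _ in range(ROWS)]
board[5][0] = board[5][1] = board[5][2] = 1
print(evaluate(board, 1), evaluate(board, 2))
```

The features here are simply "unblocked lines with k of my pieces"; minimax would call `evaluate` at its depth limit instead of searching to the end of the game. The score is symmetric, so what is good for one player is equally bad for the other.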

echo lance Sep 17, 2023, 7:42 AM

#

I was doing chapters of Elements of Statistical Learning completed 3 chapters, but then started Deep L. For coders 2022 course. Watched 4 lec.
Now i am just not able to figure out that should I first learn ML in depth or do DL. I know ml basics .

#

I have 2 years left of my clg (undergrad)

past meteor Sep 17, 2023, 7:43 AM

#

echo lance I was doing chapters of Elements of Statistical Learning completed 3 chapters, b...

ESL is aimed at masters (graduate) students or advanced bachelors students that have had several math/stats courses imo.

#

Unless you specifically want to go into computer vision or NLP, you should know ML before DL because deep learning offers solutions where "regular" ML fails.

echo lance Sep 17, 2023, 7:50 AM

#

i was thinking that this is the best time when i can read this book with no pressure... but i also think that is it really required? because this time i can invest in any other thing like more on implementing than on theory

past meteor Sep 17, 2023, 7:51 AM

#

These 3 books are what I always recommend

past meteor Sep 17, 2023, 7:54 AM

#

echo lance i was thinking that this is the best time when i can read this book with no pres...

You need a mix. In interviews for ML/AI positions they ask about what you've done but "theoretical" questions are also present.

#

ML is a very very leaky abstraction. Every model nowadays has a .fit() and .predict() method but they're all subtly different and for me at least knowing what most of them do gives me confidence that my work is correct.

drowsy dove Sep 17, 2023, 7:58 AM

#

Hi all, does anyone know of any good online resources where I can find data science sample projects (and preferably solutions)? . I've gone through McKinney's book for Pandas but I feel like without some actual hands-on projects none of it will stick

potent sky Sep 17, 2023, 8:33 AM

#

drowsy dove Hi all, does anyone know of any good online resources where I can find data scie...

I'm assuming you're aware of kaggle and looking for something else?

echo lance Sep 17, 2023, 8:34 AM

#

Someone 📍 pin this message... really helpful

potent sky Sep 17, 2023, 8:34 AM

#

Agree, though I would also add https://www.deeplearningbook.org to that.

drowsy dove Sep 17, 2023, 9:06 AM

#

potent sky I'm assuming you're aware of kaggle and looking for something else?

I wasn't actually. Do you recommend just going through the exercises here https://www.kaggle.com/learn/pandas?

short heart Sep 17, 2023, 9:10 AM

#

Is it ok to tune CatBoost parameters on GPU to then use these parameters for CPU training?

past meteor Sep 17, 2023, 9:12 AM

#

short heart Is it ok to tune CatBoost parameters on GPU to then use these parameters for CPU...

yes

drowsy dove Sep 17, 2023, 9:21 AM

#

drowsy dove I wasn't actually. Do you recommend just going through the excercises here https...

Thanks! This looks good

potent sky Sep 17, 2023, 9:53 AM

#

drowsy dove I wasn't actually. Do you recommend just going through the excercises here https...

Yep that might be useful.
But I was also referring to projects part, since you asked about projects.
Kaggle has tons and tons of projects/notebooks by other people.
You can find the right projects as per your interest.
Though do be careful to not pick up bad practices as these projects are not curated. Maybe keep an eye on the upvotes, user-level and discussion section to get some sort of idea.

delicate gyro Sep 17, 2023, 10:59 AM

#

past meteor These 3 books are what I always recommend

worth pursuing ai/ml if I am trash at these specific mathematics? (nearly failed half these courses in uni, and have to retake them for improvement)

jovial elm Sep 17, 2023, 11:05 AM

#

So I have this project I'm working on, and I need to capture the text on a screen and identify what number the text is. However, the area surrounding the text is transparent for the most part, therefore the background is subject to change, which may directly affect how the text is captured.
To capture the text, I'm using adaptive thresholding. I've attached a short video on the text being captured in different background environments.

My question is, what's the best approach to identify if the text equals "0%" or "1%" or "2%" and etc.
I need the solution to be relatively efficient. So far, the capturing the text and adaptive thresholding is done in 0.03s roughly.
As of right now, I'm thinking of template matching.

serene scaffold Sep 17, 2023, 12:51 PM

#

delicate gyro worth pursuing ai/ml if I am trash at these specific mathematics? (nearly failed...

You should get more comfortable with those branches of math, then. If that isn't something you're able or willing to do, then you'd want to look elsewhere, yeah

echo lance Sep 17, 2023, 1:12 PM

#

Is data mining possible with LLMs? Like passing a whole bunch of PDFs and telling it to generate a CSV from them.
I searched for some papers but couldn't find anything interesting.

#

I want to generate a medical dataset for medicines and symptoms

#

So I'm thinking of this instead of doing text scraping

#

from medical books

serene scaffold Sep 17, 2023, 1:27 PM

#

echo lance Is data mining possible with LLMs ? Like passing whole bunch of pdfs and say it ...

it might be, but you shouldn't trust it unless you've demonstrated that it produces the expected output on a given document set

#

and you can't just pass PDFs--you'd need to convert them to text first.

echo lance Sep 17, 2023, 1:29 PM

#

serene scaffold and you can't just pass PDFs--you'd need to convert them to text first.

Yeah, I will surely do that if I go ahead with it. I tried it on a few paragraphs and it works fine, but I can't trust it for the whole text

serene scaffold Sep 17, 2023, 1:30 PM

#

echo lance Yeah..that i will surely do if im doing it .... i tried for few paragraph.. it w...

if this is something that matters, you'd need to be pretty rigorous in how you measure its accuracy

#

but it might be that it doesn't matter.
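One way to make "rigorous" concrete is to hand-label a small sample of documents and score the extracted pairs against it. This is a minimal sketch, assuming the extraction output has already been normalized to (symptom, medicine) pairs; the function name and the sample pairs are made up for illustration.

```python
def precision_recall(predicted, gold):
    """Score extracted (symptom, medicine) pairs against hand-labelled ones."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical hand-labelled pairs for a small sample of documents.
gold_pairs = [("cold", "paracetamol"), ("headache", "ibuprofen")]
# Hypothetical pairs produced by the extraction step for the same documents.
llm_pairs = [("cold", "paracetamol"), ("cold", "ibuprofen")]

precision, recall = precision_recall(llm_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "how much of what it extracted is right", recall answers "how much of what is there did it find"; for a medical dataset both probably matter.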

echo lance Sep 17, 2023, 1:32 PM

#

serene scaffold if this is something that matters, you'd need to be pretty rigorous in how you m...

But I think it is just impossible to extract the data the other way, by scraping the whole text

left tartan Sep 17, 2023, 1:47 PM

#

Analyzing PDFs is a difficult problem, here’s a talk / project about it that combines text and ocr: https://youtu.be/vB-C7dBoxc8?list=PL8uoeex94UhEGxPOetT3bpg8ibcxflh44&t=9122

echo lance Sep 17, 2023, 1:58 PM

#

left tartan Analyzing PDFs is a difficult problem, here’s a talk / project about it that com...

Even after I get the text, like "use paracetamol tablets if suffering from cold", I then want to generate a symptoms-medicine dataset from it...

So for this the only option I can see is using an LLM

left tartan Sep 17, 2023, 2:02 PM

#

echo lance Even after i got a text...like "use paracetamol tablets if suffering from cold" ...

This isn’t my space, but I recall a lot of work around medical AI and expert systems. There are certainly many papers on this, e.g.: https://dl.acm.org/subject/ai?AllField=Medical&sortBy=relevancy

bronze vessel Sep 17, 2023, 5:58 PM

#

hey guys i want to learn opencv

#

Can someone suggest a tutorial?

past meteor Sep 17, 2023, 6:10 PM

#

bronze vessel Are someone suggest tuto

If you want to learn a specific package, using its official docs is always a good idea imo: https://docs.opencv.org/4.x/index.html
If a package doesn't have good docs, I'm personally always hesitant to use it.

odd meteor Sep 17, 2023, 7:00 PM

#

bronze vessel hey guys i want to learn opencv

YouTube + Documentation = 🔥

silk drum Sep 17, 2023, 8:08 PM

#

Good afternoon everybody! Does anyone here use seaborn?

serene scaffold Sep 17, 2023, 8:10 PM

#

silk drum Good afternoon everybody! Does anyone here use seaborn?

Why do you want to know if someone uses seaborn? Because you should always ask your actual question. Not if someone knows about the topic of a question you haven't asked.

solar pagoda Sep 18, 2023, 4:14 AM

#

Hi guys, I'm working with a dataset of cars in CSV format, and some of the values have the word 'Turbo' after something like a line break.
So what I want to do is delete the 'CC' and 'Turbo'. I tried this code:

data['Power in cubic capacity (CC)'] = data['Power in cubic capacity (CC)'].str.replace('CC\nTurbo', '')

But Turbo isn't deleted, can somebody help me on how to delete it?
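The fix was not posted, but one common cause is that the whitespace between "CC" and "Turbo" is not exactly `\n` (for example `\r\n` or a trailing space), so the literal string never matches. A minimal sketch using a regular expression instead; the sample values are made up, and only the column name comes from the question.

```python
import pandas as pd

col = "Power in cubic capacity (CC)"
# Made-up sample rows with different kinds of whitespace before "Turbo".
data = pd.DataFrame({col: ["1998 CC\nTurbo", "1497 CC", "2993 CC\r\n Turbo"]})

# \s* absorbs any mix of spaces and line breaks; "Turbo" is optional.
data[col] = data[col].str.replace(r"\s*CC(\s*Turbo)?", "", regex=True).str.strip()
print(data[col].tolist())
```

Passing `regex=True` explicitly matters here, because `str.replace` treats the pattern as a literal string by default in pandas 2.x.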

solar pagoda Sep 18, 2023, 4:19 AM

#

silk drum Good afternoon everybody! Does anyone here use seaborn?

me bro

solar pagoda Sep 18, 2023, 5:27 AM

#

solar pagoda Hi guys, im working with a dataset of cars in csv and a some data has the word '...

i've fixed it hehe

dusk tide Sep 18, 2023, 5:37 AM

#

Hi guys, I need help finding a regex pattern. I have a list of strings, e.g. ['TR ITA14TRK010 FF BROOKLYN STRAIGHT Beige 44','MB ITA14BLT016 35MM NA Olive Green 32','MB ITA15BLT004 40MM NA Reddish Brown 38 / 112CM'], and I want to split each string at the token that starts with IT (like ITA14TRK010, ITA14BLT016, ITA15BLT004) so that I can grab the brand name at the start of the string (TR, MB, etc.). How can I write a good regex for this? I have written one: r'IT[A-Z0-9]+'. Can someone validate it?
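The pattern works on the three sample strings. A small hedged refinement is to add word boundaries so it cannot match "IT" in the middle of another word. The sketch below assumes every product code is a standalone token; the variable names are illustrative.

```python
import re

items = [
    "TR ITA14TRK010 FF BROOKLYN STRAIGHT Beige 44",
    "MB ITA14BLT016 35MM NA Olive Green 32",
    "MB ITA15BLT004 40MM NA Reddish Brown 38 / 112CM",
]

# \b keeps the match from starting in the middle of another word.
code_pattern = re.compile(r"\bIT[A-Z0-9]+\b")

parsed = []
for item in items:
    match = code_pattern.search(item)
    if match is None:
        continue  # no product code found; skip rather than guess
    brand = item[: match.start()].strip()
    rest = item[match.end():].strip()
    parsed.append((brand, match.group(), rest))

for row in parsed:
    print(row)
```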

lavish lily Sep 18, 2023, 6:04 AM

#

Is BertGeneration or GPT-3 better suited for text generation tasks?

simple tapir Sep 18, 2023, 8:31 AM

#

past meteor What is happiness rank exactly? How does the data look like?

Sorry for the late response, I was travelling to another city

#

past meteor Sep 18, 2023, 8:32 AM

#

simple tapir Sorry for the late response, I was travelling to another city

Can you do a .unique() on happiness rank?

simple tapir Sep 18, 2023, 8:32 AM

#

let me try

simple tapir Sep 18, 2023, 8:36 AM

#

past meteor Can you do a .unique() on happiness rank?

past meteor Sep 18, 2023, 8:40 AM

#

simple tapir

Okay, now this makes sense. You need linear regression for this. There are too many levels in your target.

#

There are models that are (slightly) more adequate than linear regression for your problem (because you have integers) but I don't want to send you down a rabbit hole 🙂

#

What I can say is that your code isn't doing any feature scaling, and it actually should. Logistic regression by default (and the new model you should use, which is RidgeCV) automatically applies regularisation, and regularisation does not work properly if your features are on different scales. I can explain why if you want, but for now my advice to you is to always rescale your data. Doing it where it's not necessary isn't as bad as not doing it where it is.
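A minimal sketch of that advice, assuming scikit-learn, where the cross-validated ridge model is `RidgeCV`. The data is synthetic and the alpha grid is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two made-up features on very different scales.
X = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 1000, 200)])
y = 3.0 * X[:, 0] + 0.002 * X[:, 1] + rng.normal(0, 0.1, 200)

# Scaling lives inside the pipeline, so it is re-fit on each training split
# and the penalty treats both features comparably.
model = make_pipeline(StandardScaler(), RidgeCV(alphas=[0.1, 1.0, 10.0]))
model.fit(X, y)
print(model.score(X, y))
```

Putting the scaler in the pipeline rather than scaling the whole dataset up front also avoids leaking test-set statistics into training.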

simple tapir Sep 18, 2023, 9:00 AM

#

I see

#

Thanks a lot for your help!

mint palm Sep 18, 2023, 9:18 AM

#

how to do hyperparameter tuning?
my supervisor mentioned something that sounded like "doit search" (it sounded like **oit search, where * means I'm not sure about the start lol). I didn't catch the pronunciation correctly.

past meteor Sep 18, 2023, 9:41 AM

#

mint palm how to do hyperparameter tuning? my supervisor was mentioning something similar...

What ML package are you using?

mint palm Sep 18, 2023, 9:45 AM

#

past meteor What ML package are you using?

pytorch

past meteor Sep 18, 2023, 9:46 AM

#

That's relatively important to mention because it influences what kind of hyperparameter tuning you can do 🙂

#

https://optuna.org/ is a good place to start if you're using Pytorch

Optuna

Optuna - A hyperparameter optimization framework

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API.

#

Typically for neural nets you'd run a hyperparameter tuning algorithm that does the search "sequentially" because you may not have enough VRAM to do it in parallel

#

https://optuna.readthedocs.io/en/stable/reference/samplers/generated/optuna.samplers.TPESampler.html#optuna.samplers.TPESampler

low orbit Sep 18, 2023, 1:12 PM

#

Hello.
I'm new to this field. #langchain
Can anyone give advice on how to ask ChatGPT for a structured response with a validation model?
The official docs show how to extract from documents for that, but I'm trying to build something like:

Question: Hello chat, what is the State of the Union?
Structured answer:
['Question',
'Alternative rephrased question',
'Second alternative rephrased question']

It's kind of hard to build. Please give some advice.
I know it's somehow possible to get structured answers from ChatGPT from just one-line questions.

spare briar Sep 18, 2023, 4:54 PM

#

past meteor Typically for neural nets you'd run a hyperparameter tuning algorithm that does ...

If you have multiple nodes optuna supports syncing hyperparameter search on multiple nodes w a database

past meteor Sep 18, 2023, 4:55 PM

#

spare briar If you have multiple nodes optuna supports syncing hyperparameter search on mult...

personally I use mlflow for this but same idea 🙂

#

I mostly mentioned this because in the past I never had the ability to train multiple NNs concurrently so instead of doing my default strategy (random search) I'd go with a variant of bayes opt

inland rivet Sep 18, 2023, 5:04 PM

#

is there any function to get the index value of a given element from any column?

agile cobalt Sep 18, 2023, 5:05 PM

#

can you give an example of what the input/output would look like?
edit; brb in like half an hour

inland rivet Sep 18, 2023, 5:10 PM

#

I have 2 dataframes, A and B. A's columns are vegetable and type. B's columns are vegetable and amount. I want to iterate through B's vegetable column, check if that vegetable is in A, and assign the amount in B as a new column in A.

#

A might have vegetables that B doesn't, so I'll put a 0 in those cases.

#

I tried using concat inner.

tidal bough Sep 18, 2023, 5:26 PM

#

I think it'd be something like A.merge(B, on="vegetable", how="left"), yeah.

agile cobalt Sep 18, 2023, 5:44 PM

#

you could do it a bit more "manually" like
```py
# either of
veg_amount = B.set_index("vegetable")["amount"]
veg_amount = pd.Series(index=B["vegetable"], data=B["amount"].array)

# then
A["amount"] = A["vegetable"].map(veg_amount).fillna(0)
```
but you probably should prefer df.join or pd.merge
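The `pd.merge` route mentioned above can be sketched as follows. The sample rows are made up; only the column names come from the question. Note that `DataFrame.join` with `on=` matches against the other frame's index, so `merge` is the more direct fit when "vegetable" is a regular column in both frames.

```python
import pandas as pd

A = pd.DataFrame({"vegetable": ["carrot", "leek", "kale"], "type": ["root", "allium", "leaf"]})
B = pd.DataFrame({"vegetable": ["carrot", "kale", "beet"], "amount": [4, 7, 2]})

# A left merge keeps every row of A; vegetables missing from B get NaN, then 0.
result = A.merge(B, on="vegetable", how="left")
result["amount"] = result["amount"].fillna(0).astype(int)
print(result)
```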

mint palm Sep 18, 2023, 7:19 PM

#

i need to tune

3 weights of 3 losses
one temperature
learning rate

what hyperparameter tuning method should i use and why? when to prefer one over other? i am using torch
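As a baseline for a search space like this one, here is a minimal random-search sketch in plain Python. The ranges and the `validation_loss` stand-in are assumptions made for illustration; in practice that function would train the torch model with the sampled values and return its validation loss.

```python
import math
import random


def sample_config(rng):
    """Draw one candidate; the learning rate is sampled on a log scale."""
    return {
        "lr": 10 ** rng.uniform(-5, -1),
        "temperature": rng.uniform(0.1, 5.0),
        "loss_weights": [rng.uniform(0.0, 1.0) for _ in range(3)],
    }


def validation_loss(config):
    """Stand-in objective. Replace with: train the model, return validation loss."""
    return (math.log10(config["lr"]) + 3) ** 2 + (config["temperature"] - 1.0) ** 2


def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):  # one trial at a time, so only one model is in memory
        config = sample_config(rng)
        loss = validation_loss(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(n_trials=50)
print(best_config, best_loss)
```

Random search is easy to run in parallel because trials are independent; a sequential Bayesian-style method such as Optuna's TPE sampler tends to be the better trade when each trial is expensive and only one can run at a time.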

topaz night Sep 18, 2023, 8:39 PM

#

is there any difference between the 4060 LP and the regular one?

serene scaffold Sep 19, 2023, 12:11 AM

#

topaz night is theres any diff between 4060 lp and reguler one ??

You'll want to compare attributes like its memory and FLOPS

verbal venture Sep 19, 2023, 12:20 AM

#

do you have any thoughts on this @serene scaffold? I want to make a web application that uses different LLMs from different organizations. I'm worried earlier inputs won't be used in the attention mechanism, as they are different LLMs. Is it possible to retain the different attention histories across each LLM? In other words, to use the previous attention from one LLM in another?

#

I was thinking concatenating attention might work but not sure if the math behind that does what I want it to

serene scaffold Sep 19, 2023, 12:22 AM

#

I don't think it's guaranteed that every LLM uses attention (though it might be unusual for one not to), let alone in a way that can be used as-is in a different LLM

boreal blaze Sep 19, 2023, 12:24 AM

#

Hi everyone, this question isn't particularly related to AI, more so to the philosophy of AI, so if there is a more suitable channel, feel free to point it out and I will move it there.

#

I want to know what exactly the robot argument and system argument to the Chinese Room Argument mean.

serene scaffold Sep 19, 2023, 12:25 AM

#

boreal blaze Hi everyone, this question isn't particularly related to AI, more so to the phil...

ChatGPT won't replace developers within ten years, if it's that

boreal blaze Sep 19, 2023, 12:25 AM

#

So far, my understanding of the Chinese room argument is as follows.

#

haha dw i want nothing to do with AI or datascience

#

this is for some content i'm learning for an algorithmics class

#

#

with the system argument, the general gist of it is that even though the human doesn't understand Chinese, the room as a whole does. But what does "understand Chinese" mean in that context?

serene scaffold Sep 19, 2023, 12:27 AM

#

"What does it mean to understand" that's the crux of the question.

left tartan Sep 19, 2023, 12:28 AM

#

That summary also seems terribly written, no offense to the author.

boreal blaze Sep 19, 2023, 12:28 AM

#

... the author is me ;-;

#

how can i improve it?

left tartan Sep 19, 2023, 12:29 AM

#

It's the sentence about "Searle claims that if a human-like computer..."

serene scaffold Sep 19, 2023, 12:29 AM

#

It sounds like you might not fully understand what the Chinese room analogy is intended to convey.

boreal blaze Sep 19, 2023, 12:30 AM

#

to me it seems like the analogy is there to argue that there is no way for strong AI to exist, to act as a human mind does.

#

is that wrong?

#

or is it to argue the Turing Test?

serene scaffold Sep 19, 2023, 12:32 AM

#

I don't think the analogy is intended to argue for one position or another, but to provide a basis for discussion

left tartan Sep 19, 2023, 12:32 AM

#

boreal blaze to me it seems like the analogy is there to argue that there is no way for stron...

fwiw, my take on it is that it's a very narrow statement: that merely imitating intelligence != intelligence. Perhaps as a rebuttal of the turing test.

serene scaffold Sep 19, 2023, 12:33 AM

#

"if a robot participates in a conversation only by looking up responses from a table of inputs and outputs, can it be considered to understand what is being said?"

boreal blaze Sep 19, 2023, 12:34 AM

#

isn't his basis that it can't, though? because he does rebut a lot of arguments that try to say that it can be intelligent?

#

actually wait basically, i just want to confirm this statement:

serene scaffold Sep 19, 2023, 12:35 AM

#

I haven't read the Searle paper that you're referring to

left tartan Sep 19, 2023, 12:35 AM

#

I'm looking at https://plato.stanford.edu/entries/chinese-room/ now

boreal blaze Sep 19, 2023, 12:36 AM

#

Alan Turing believes that if a computer can simulate a human being well enough, it is intelligent. Searle argues that no matter how well a computer is programmed, it is still only simulating understanding, and is not intelligent

#

this is where i found some info: https://iep.utm.edu/chinese-room-argument/#SH2b

Internet Encyclopedia of Philosophy

James Fieser

Chinese Room Argument

boreal blaze Sep 19, 2023, 12:36 AM

#

serene scaffold I haven't read the Searle paper that you're referring to

i haven't read any papers, i just want to understand enough to get what the argument argues

left tartan Sep 19, 2023, 12:42 AM

#

What's interesting as I read through this is not the argument, but the replies to the argument

boreal blaze Sep 19, 2023, 12:43 AM

#

i like the brain simulator reply the most

#

i meant brain simulator

#

just cause the analogy is really well thought out,

left tartan Sep 19, 2023, 12:45 AM

#

just got into it. Interesting read, enjoyed it.

iron basalt Sep 19, 2023, 1:19 AM

#

serene scaffold I don't think the analogy is intended to argue for one position or another, but ...

Searle does make a conclusion, and it's a non sequitur. It's another form of Vitalism, but for the modern day. In addition, whether it can understand the tasks given depends on the tasks, specifically how much they rely on knowing about the real world. If the task is, for example, math (the symbol manipulation / algebra parts of it), then sure, it can do those and "understand" them, just like a human would in that situation. Fundamentally it's arguing that computers lack symbol grounding, but they can have symbol grounding; this is an arbitrary assumption / premise made by Searle, and the argument only works at all because of it.

#

This argument was also probably made in the context of how AI was understood back when everyone did symbolic ("good old fashioned") AI, which is why it has the heavy focus on symbols and symbol grounding.

iron basalt Sep 19, 2023, 2:03 AM

#

boreal blaze to me it seems like the analogy is there to argue that there is no way for stron...

From what I have read, it seems Searle does actually admit that it could be possible, but that it has not been done yet by machines, only biology. So basically "not yet."

#

Seems like a bit of a walk back.

#

Something like "brains have the magic sauce" (again, Vitalism).

#

(Vitalism used to be huge when cells / humans being made of cells first started becoming accepted)

#

(we like to feel special)

boreal blaze Sep 19, 2023, 2:06 AM

#

i feel like (based on his reply on the other minds reply) that he wants to say the machine can't ever do it, because a machine can never become biological.

#

well not quite.

#

The Many Mansions Reply suggests that even if Searle is right in his suggestion that programming cannot suffice to cause computers to have intentionality and cognitive states, other means besides programming might be devised such that computers may be imbued with whatever does suffice for intentionality by these other means.

This too, Searle says, misses the point: it “trivializes the project of Strong AI by redefining it as whatever artificially produces and explains cognition” abandoning “the original claim made on behalf of artificial intelligence” that “mental processes are computational processes over formally defined elements.” If AI is not identified with that “precise, well defined thesis,” Searle says, “my objections no longer apply because there is no longer a testable hypothesis for them to apply to” (1980a, p. 422).

#

he wants to say that a machine is defined as something programmed, with instructions. If it doesn't work like that, the argument fails, because the argument is not meant to target biological machines

iron basalt Sep 19, 2023, 2:07 AM

#

That is what I read, but then I read that Searle basically said that brains can do it, machines can't, and that it would need to be demonstrated. Which is not the same as "not possible."

#

I think the opinion and message has changed over time.

iron basalt Sep 19, 2023, 2:09 AM

#

iron basalt This argument was also probably made in the context of understanding AI back how...

Basically this. Someone probably said something else that did not make sense and this was in response.

#

The messaging is not really clear enough, so i'm just going to leave it at what I wrote.

somber panther Sep 19, 2023, 2:31 AM

#

so someone posted this over in the excel discord, what is your impression?

#

https://www.solvermax.com/blog/python-embedded-in-excel-first-impressions

Solver Max - Python embedded in Excel: First impressions

We explore the recently announced Excel feature: Python embedded within Excel

#

excel and python are kind of my thing, wonder if i should milk this

#

🐄

boreal blaze Sep 19, 2023, 2:39 AM

#

it looks nice, but the subscription at the end does seem like something Microsoft will milk.

latent ibex Sep 19, 2023, 3:07 AM

#

Not sure if this is the right chat, but I'm at a stage where I need to extend the built-in stats file of an open source library that I'm currently using. When I asked ChatGPT for help, it suggested that it's best to subclass rather than modify the original file.
In theory, if I'm subclassing would I just need to add the pertinent code to a whole new file and include this in the same directory as the rest of the library files? Or is there something else I'm missing? Thanks.

left tartan Sep 19, 2023, 3:21 AM

#

latent ibex Not sure if this is the right chat but I'm at a stage in which I need to extend ...

What’s the open source library?

latent ibex Sep 19, 2023, 3:23 AM

#

backtesting.py

left tartan Sep 19, 2023, 3:25 AM

#

And it sounds like you’re not very familiar with subclassing, right?

latent ibex Sep 19, 2023, 3:27 AM

#

left tartan And it sounds like you’re not very familiar with subclassing, right?

That's right. This is my first time doing this.

left tartan Sep 19, 2023, 3:28 AM

#

Generally, subclassing lets you override or add functionality from the base class. It requires a few examples to explain, but perhaps you should start with the inheritance section here: https://python.swaroopch.com/oop.html

#

In terms of file placement: you’d usually put your code in your own directory. It doesn’t matter where the library files are. You’d import the library, then any modules you wrote
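A minimal sketch of what that looks like. `BaseStats` is a hypothetical stand-in for a class the library would provide, inlined here so the example runs on its own; it is not the real API of backtesting.py or any other package.

```python
# In a real project BaseStats would be imported from the library instead.
class BaseStats:
    """Stand-in for a class provided by a library (hypothetical, not a real API)."""

    def __init__(self, returns):
        self.returns = list(returns)

    def summary(self):
        return {"total_return": sum(self.returns)}


class ExtendedStats(BaseStats):
    """Lives in your own project; the library files stay untouched."""

    def summary(self):
        result = super().summary()  # reuse the base behaviour
        result["best_trade"] = max(self.returns)  # then add to it
        return result


print(ExtendedStats([0.02, -0.01, 0.05]).summary())
```

Because the subclass sits in your own module, upgrading the library later does not overwrite your changes.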

latent ibex Sep 19, 2023, 3:32 AM

#

left tartan Generally, subclassing let’s you override or add functionality from the base cla...

Thank you

serene scaffold Sep 19, 2023, 3:47 AM

#

@latent ibex @left tartan this would go in #software-architecture btw

manic tangle Sep 19, 2023, 5:24 AM

#

dipping my toes in ai / ml, hoping someone else can suggest some reading for what im trying to accomplish

#

my goal is to first determine whether a piece of text is code, and if it is classify which language it was written in. i understand this is very accomplishable without any ML but i just want a lil project 🤠

#

I just dont even know where to start cause ive never rly touched this

rigid oxide Sep 19, 2023, 5:57 AM

#

is there a way to check if a variable is truthy and equals a value at the same time? I have to use this and it's annoying. I have a js background:

if meta_tag_property != None:
        if  meta_tag_property.startswith('og:'):

manic tangle Sep 19, 2023, 6:07 AM

#

rigid oxide is there a way to check if a variable is truthy and equals a value at the same t...

wrong channel im assuming haha but I'd just do if meta_tag_property and meta_tag_property.startswith("og:"):
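A small sketch of the one-line check. It uses `is not None` instead of plain truthiness, which is an assumption about intent: the two only differ for an empty string, and neither version calls `startswith` on `None`. The function name is made up for the example.

```python
def is_open_graph(meta_tag_property):
    # `and` short-circuits: startswith is only called when the value is not None.
    return meta_tag_property is not None and meta_tag_property.startswith("og:")


print(is_open_graph("og:title"))
print(is_open_graph("twitter:card"))
print(is_open_graph(None))
```

This is the Python counterpart of JavaScript's `value && value.startsWith(...)` pattern.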

past meteor Sep 19, 2023, 6:10 AM

#

manic tangle my goal is to first determine whether a piece of text is code, and if it is clas...

Looks like a fun project. Off the top of my head, I know a few ways I would solve this. I'd say what matters most for you is which part of AI you want to delve into:

1) Do you want to do traditional ML and generate "features" (input variables) and then train a model?
2) Do you want to pass it off to something like a neural network?
3) Do you want to use an API to generate your "features" (input variables) and then train a model?

rigid oxide Sep 19, 2023, 6:12 AM

#

manic tangle wrong channel im assuming haha but I'd just do `if meta_tag_property and meta_ta...

thanks. Which channel is better?

manic tangle Sep 19, 2023, 6:12 AM

#

past meteor Looks like a fun project. On the top of my head I know a few ways how I would so...

i was leaning more towards processing the text myself to generate features and then train a model

manic tangle Sep 19, 2023, 6:13 AM

#

rigid oxide thanks. Which channel is better?

I think general python stuff just goes in #1035199133436354600

past meteor Sep 19, 2023, 6:14 AM

#

That's a fun one to do! I would suggest you pick a few programming languages but make it sufficiently hard for yourself (have C# and Java be in there together) and then for each language you read a bunch of code and ask yourself "what makes Python code Python"

#

... and then you'll need a lot of regex to make rules

#

That's just what I would do off the top of my head. I'm also not sure how far it will get you 🤔. You can always move on to 2) and 3) if it's not working well
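The regex-rule idea could look something like this. The patterns are illustrative guesses, not a vetted feature set, and the two snippets are made up; the resulting counts would be the input variables for a traditional model.

```python
import re

# Hand-written rules; each one is a guess at "what makes language X look like X".
FEATURE_PATTERNS = {
    "py_def": re.compile(r"^\s*def \w+\(.*\):", re.MULTILINE),
    "py_import": re.compile(r"^\s*(from \w[\w.]* )?import \w", re.MULTILINE),
    "semicolon_eol": re.compile(r";\s*$", re.MULTILINE),
    "java_println": re.compile(r"System\.out\.println"),
    "csharp_using": re.compile(r"^\s*using [\w.]+;", re.MULTILINE),
    "curly_brace": re.compile(r"[{}]"),
}


def extract_features(text):
    """Count how often each pattern occurs; the counts become model inputs."""
    return {name: len(pattern.findall(text)) for name, pattern in FEATURE_PATTERNS.items()}


python_snippet = "import os\n\ndef main():\n    print(os.getcwd())\n"
java_snippet = 'class A {\n  void f() {\n    System.out.println("hi");\n  }\n}\n'

print(extract_features(python_snippet))
print(extract_features(java_snippet))
```

Features like `semicolon_eol` and `curly_brace` will not separate C# from Java on their own, which is exactly why including both languages makes the exercise harder.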

manic tangle Sep 19, 2023, 6:16 AM

#

so a neural network solution would look like what?

#

I know basically nothing about AI so most of these concepts r very foreign haha

past meteor Sep 19, 2023, 6:18 AM

#

A neural network solution would process the raw strings into some sort of vector and then it would use that to make predictions. The difference is that you are no longer generating features yourself

manic tangle Sep 19, 2023, 6:21 AM

#

mm okay, I assume for training that I would feed it the vector + label of each document?

past meteor Sep 19, 2023, 6:26 AM

#

manic tangle mm okay, I assume for training that I would feed it the vector + label of each d...

Exactly, you feed it the vector and then you compare with the label to determine the error/calculate the loss during training
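To make "vector in, compare with label, get a loss" concrete, here is a stripped-down sketch in plain Python. A single linear layer with all-zero weights stands in for the network, and character frequencies stand in for a learned text representation; both are simplifying assumptions, and every name is made up for the example.

```python
import math

LABELS = ["python", "java"]


def vectorize(text, size=128):
    """Turn raw text into a fixed-length vector of ASCII character frequencies."""
    counts = [0.0] * size
    for ch in text:
        if ord(ch) < size:
            counts[ord(ch)] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    return [e / sum(exps) for e in exps]


def cross_entropy(probabilities, label_index):
    """Loss is small when the model puts high probability on the true label."""
    return -math.log(probabilities[label_index])


# Untrained weights: one row per label, all zeros, so every label is equally likely.
weights = [[0.0] * 128 for _ in LABELS]
x = vectorize("def main():\n    pass\n")
scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
loss = cross_entropy(softmax(scores), LABELS.index("python"))
print(loss)
```

Training then means nudging the weights so this loss goes down across all labelled documents, which is what a framework's optimizer does for you.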

manic tangle Sep 19, 2023, 6:30 AM

#

well that doesn't sound incredibly difficult thinkingsmirk

manic tangle Sep 19, 2023, 6:31 AM

#

past meteor Exactly, you feed it the vector and then you compare with the label to determine...

You're very handsome and I appreciate you

#

thank you!

verbal venture Sep 19, 2023, 6:42 AM

#

serene scaffold I don't think it's guaranteed that every LLM uses attention (though it might be ...

you're basically saying can't/unlikely to be done yeah

timid kestrel Sep 19, 2023, 9:14 AM

#

hey hey yall, anyone got any good data sources to practice machine learning models with python? Ive tried searching in kaggle but i dont think i have a good trained eye to select good data sets. im tryna practice my xgboost, random forest parameter setting and optimization skills. also am pretty new to python coding. i was told to browse thru the pins but i cant find anything too specific

cunning crystal Sep 19, 2023, 10:12 AM

#

timid kestrel hey hey yall, anyone got any good data sources to practice machine learning mode...

It does not really matter what you start with, as long as it is more complex than Iris... Titanic is interesting, as it benefits from cleanup in the same way many real-world datasets do

serene scaffold Sep 19, 2023, 12:33 PM

#

verbal venture you're basically saying can't/unlikely to be done yeah

I don't really have any idea. haven't tried it.

past meteor Sep 19, 2023, 1:28 PM

#

timid kestrel hey hey yall, anyone got any good data sources to practice machine learning mode...

Kaggle's tabular playground series

#

Specifically for xgboost just tune the number of trees, for random forest the cost-complexity pruning parameter (ccp_alpha in scikit-learn)

delicate lodge Sep 19, 2023, 5:04 PM

#

https://medium.com/@chaudashubham/k-nearest-neighbors-knn-91749aff0445

Medium

K-Nearest Neighbors (KNN)

Implementation and evaluation of KNN model in python

quaint spade Sep 19, 2023, 6:05 PM

#

does anyone know of site or video that could teach me how I can build something like this for options , data supplied by yfinance and CBOE

left tartan Sep 19, 2023, 6:19 PM

#

quaint spade does anyone know of site or video that could teach me how I can build something ...

Do you know Black-Scholes?

#

Like, are you looking to write this yourself? GEX curves are a little annoying to calculate/draw out

#

but hang on a sec... I know a blog that posted this...

#

https://perfiliev.co.uk/market-commentary/how-to-calculate-gamma-exposure-and-zero-gamma-level/

#

The method is solid, although slow. It can be vectorized, if you're up to the task
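The building block behind a gamma exposure calculation is the Black-Scholes gamma of each option. Below is a minimal vectorized NumPy sketch, assuming no dividends and a flat interest rate. The function name, strikes, and volatility are made up for illustration, and the blog's own exposure scaling (open interest, contract size) is deliberately not reproduced here.

```python
import numpy as np


def bs_gamma(spot, strikes, vol, t_years, rate=0.0):
    """Black-Scholes gamma for an array of strikes at once (no dividends)."""
    strikes = np.asarray(strikes, dtype=float)
    vol = np.asarray(vol, dtype=float)
    sqrt_t = np.sqrt(t_years)
    d1 = (np.log(spot / strikes) + (rate + 0.5 * vol**2) * t_years) / (vol * sqrt_t)
    pdf_d1 = np.exp(-0.5 * d1**2) / np.sqrt(2 * np.pi)  # standard normal density
    return pdf_d1 / (spot * vol * sqrt_t)


# Hypothetical option chain: three strikes, same expiry and volatility.
gammas = bs_gamma(spot=100.0, strikes=[90.0, 100.0, 110.0], vol=0.2, t_years=1.0)
print(gammas)
```

Calls and puts with the same strike and expiry share the same gamma, so one array per expiry is enough before applying the exposure weights.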

quaint spade Sep 19, 2023, 8:06 PM

#

left tartan Do you know black scholes?

nope can't say I'm familiar and yes I do want to write this myself if I have to, I just want my personal app website or whatever that can give me those levels , for now I'm getting them from another server by the name of investors haven , I don't fully understand how they do it but up for the challenge and definitely willing to learn more

quaint spade Sep 19, 2023, 8:06 PM

#

left tartan but hang on a sec... I know a blog that posted this...

thanks for this bruv

verbal venture Sep 19, 2023, 9:44 PM

#

boreal blaze Alan Turing believes that if a computer can simulate a human being well enough, ...

Ya that last sentence of your first paragraph is garbage

#

Whoever authored that thought is an idiot (the chinese room author in this case)

#

If you simulate understanding, you have understanding, which is intelligence

kind herald Sep 20, 2023, 12:28 AM

#

hey can someone whos done machine learning with both pytorch and tensorflow help me out. I can't decide which i wanna learn first.

lapis sequoia Sep 20, 2023, 12:29 AM

#

verbal venture If you simulate understanding, you have understanding, which is intelligence

i thought intelligence was coming up with things by your own self

verbal venture Sep 20, 2023, 12:30 AM

#

kind herald hey can someone whos done machine learning with both pytorch and tensorflow help...

Pytorch

kind herald Sep 20, 2023, 12:30 AM

#

verbal venture Pytorch

why pytorch over tensorflow?

verbal venture Sep 20, 2023, 12:31 AM

#

kind herald why pytorch over tensorflow?

Simpler to use

verbal venture Sep 20, 2023, 12:32 AM

#

lapis sequoia i thought intelligence was coming up with things by your own self

No I think intelligence can be reduced to problem solving in x domain

#

Basically if you can manipulate data to solve your problem

lapis sequoia Sep 20, 2023, 12:33 AM

#

a calculator can do that... is it intelligent?

#

i think im too dumb to talk abt this

sonic knoll Sep 20, 2023, 3:09 AM

#

Hello everyone!

#

I have to do a project using machine learning but I would like to know if anyone has an interesting dataset to share with me?

timid kestrel Sep 20, 2023, 3:42 AM

#

cunning crystal It does not really matter what you start with, as long as it is more complex tha...

thank you 😄

timid kestrel Sep 20, 2023, 3:44 AM

#

past meteor Kaggle's tabular playground series

oh thats super interesting ima check that out thx

abstract wasp Sep 20, 2023, 6:14 AM

#

Hi, which library do you guys think is better for building decision trees?

magic dune Sep 20, 2023, 6:19 AM

#

abstract wasp Hi, which library do you guys think is better for building decision trees?

Which libraries are you talking abt

#

?

#

Decision trees is a pretty simple algorithm tho

abstract wasp Sep 20, 2023, 6:36 AM

#

magic dune Which libraries are you talking abt

Tensorflow or Scikit-Learn?
My idea for this is that once I get this other model working, I will use the output of that model to help me decide an approximate date, like month, of when an image was taken. I think I’ll include a CNN to help me identify the season and then do the decision tree portion. Do you think this is a good idea?

magic dune Sep 20, 2023, 6:37 AM

#

abstract wasp Tensorflow or Scikit-Learn? My idea for this is that once I get this other model...

scikit-learn is easier

#

but tensorflow is more made for nn

minor mesa Sep 20, 2023, 7:06 AM

#

quaint spade does anyone know of site or video that could teach me how I can build something ...

Try openbb terminal

past meteor Sep 20, 2023, 7:13 AM

#

abstract wasp Tensorflow or Scikit-Learn? My idea for this is that once I get this other model...

scikit-learn and Tensorflow/Pytorch are different use cases imo.

#

I would only use TensorFlow's decision trees if for some reason I had to do the predictions on edge with TF Lite.

lapis sequoia Sep 20, 2023, 8:21 AM

#

Pyspark

#

peeposalute

spare briar Sep 20, 2023, 1:41 PM

#

abstract wasp Hi, which library do you guys think is better for building decision trees?

xgboost

lapis sequoia Sep 20, 2023, 1:45 PM

#

Hi guys, do any of you know how to display a neural network? It is a simple Neural Network (3 in, 2 hid, 3 out). I am using neat-python for if it helps, but you can use any package if you want. If you can help me out, that would be great!

red elk Sep 20, 2023, 2:20 PM

#

Does anyone know a good deep speech YouTube tutorial?

halcyon hedge Sep 20, 2023, 3:22 PM

#

results = all_months_data.groupby('Month').sum()
months = range(1,13)

plt.bar(months, results['Sales'])
plt.xticks(months)
plt.ylabel("Sales in USD($)")
plt.xlabel("Month")
plt.show();

#

Month contains datetime objects. The code runs perfectly fine on Jupyter, but I get a "Cannot sum datetime object" error on Kaggle. How can I fix this?

agile cobalt Sep 20, 2023, 3:34 PM

#

halcyon hedge results = all_months_data.groupby('Month').sum() months = range(1,13) plt.bar(m...

check which version of pandas you have running locally and which version you have running on Kaggle

#

you can specify which columns you want to operate on after using groupby like df.groupby(groupby_col)[target_col].function()

halcyon hedge Sep 20, 2023, 3:36 PM

#

agile cobalt you can specify which columns you want to operate on after using groupby like `d...

Will the new data set contain the month column? I need the month column in the new dataset as well

agile cobalt Sep 20, 2023, 3:37 PM

#

the month will become the index

#

if you need it as a column, you could reset the index after running the aggregation
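Put together, the two suggestions look roughly like this. The sample frame is made up, and it assumes "Month" holds month numbers while a separate datetime column (here called "Order Date", a hypothetical name) is what trips up the sum. In pandas 2.x, groupby aggregations no longer drop non-numeric columns silently, which is the likely reason the same code behaves differently on 1.4.2 and 2.0.3.

```python
import pandas as pd

all_months_data = pd.DataFrame({
    "Month": [1, 1, 2],
    "Sales": [10.0, 5.0, 7.5],
    "Order Date": pd.to_datetime(["2019-01-03", "2019-01-20", "2019-02-11"]),
})

# Select the numeric column before aggregating so datetime columns are never summed,
# then turn the Month index back into a regular column.
results = all_months_data.groupby("Month")["Sales"].sum().reset_index()
print(results)
```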

halcyon hedge Sep 20, 2023, 3:40 PM

#

I have 1.4.2 running locally and 2.0.3 on kaggle

halcyon hedge Sep 20, 2023, 3:40 PM

#

agile cobalt if you need of it as a column you could reset index after running the aggregatio...

Okay I will try that

halcyon hedge Sep 20, 2023, 3:40 PM

#

agile cobalt if you need of it as a column you could reset index after running the aggregatio...

Thanks

agile cobalt Sep 20, 2023, 3:40 PM

#

halcyon hedge I have 1.4.2 running locally and 2.0.3 on kaggle

definitely update your local version

halcyon hedge Sep 20, 2023, 4:35 PM

#

agile cobalt definitely update your local version

Alright

flat silo Sep 20, 2023, 4:51 PM

#

Hello, has anyone here used TurboODBC (although it could just be a typical ODBC driver issue as well...) and dealing with a converted datatype from Pandas for BIT, MONEY, and TEXT to SQL Server (in Azure)? Getting various issues about cannot convert: Numeric Error.

wooden sail Sep 20, 2023, 5:35 PM

#

anyone here very familiar with stochastic matrices?

small wedge Sep 20, 2023, 5:44 PM

#

they're used in some RL right? like a matrix of probabilities for each state an agent can have?

potent sky Sep 20, 2023, 6:11 PM

#

transition probability matrices yes, with markov chains

#

unless Edd is referring to a different type of Stochastic matrices ;-;

desert bobcat Sep 20, 2023, 6:42 PM

#

heyy

#

did you work on AR projects.!?

tropic niche Sep 20, 2023, 6:58 PM

#

I have a question about annotating data for training. I would like to train LayoutLM with my own dataset of scanned forms. I plan on annotating the data using the same method used for the Funsd dataset. I have used pyTesseract to extract the data from the images. Unfortunately pyTesseract, isn't perfect! even after pre-processing the images (removing lines, noise, and binarizing).

#

Does the annotation need be based on extracted data from pyTesseract or the data as it should appear?

#

For example do the bounding box coordinate need to match those in pytesseract data. If there is a missing word do I add into the annotated data?

wooden sail Sep 20, 2023, 7:09 PM

#

oops, i disappeared all of a sudden. yeah, stochastic matrices like in markov chains. say right stochastic matrices, more concretely (rows adding up to 1). do you know of any interesting properties of the product A^T A? maybe some bounds on the off diagonal elements 👀

verbal oar Sep 20, 2023, 8:27 PM

#

if I want to try make some 3d model with AI what should I use instead of NeRf and pytorch3d?

#

and also rasterization based not rt

#

assuming I have dataset of images

languid prairie Sep 21, 2023, 1:02 AM

#

Hi, looking for help: how do I fine-tune a model (for example FinBERT) with LlamaIndex?
I am currently working on a project that consists of fine-tuning our FinBERT model with the LlamaIndex method (https://gpt-index.readthedocs.io/en/latest/examples/finetuning/embeddings/finetune_embedding_adapter.html) in order to get better results for sentiment analysis. I am a beginner, so I would appreciate any kind of help in better understanding this process.

Looking forward to hearing from you 🙂

glossy adder Sep 21, 2023, 11:56 AM

#

Does anyone have some good resources on feature selection? I have a dataset with several combinations of features. One way is to make a model for each combination and test these models against each other; another way (per a blog I read) is to train the model with the full features then test it on the different combinations of features (when it performs worse it means the missing feature was important). Is there any authoritative source on this?

past meteor Sep 21, 2023, 12:00 PM

#

glossy adder Does anyone have some good reasources on feature selection? I have a dataset wit...

Specifically feature selection and not creation? (Just to be sure)

desert oar Sep 21, 2023, 1:03 PM

#

glossy adder Does anyone have some good reasources on feature selection? I have a dataset wit...

the idea of even doing feature selection is controversial. for example in predictive modeling, you often just apply regularization and don't bother trying to remove features

#

i don't know of a single authoritative source for feature selection, there might be something about it in elements of statistical learning

#

there is a lot of old bad advice out there about "stepwise" regression, but there are a lot of problems with that

glossy adder Sep 21, 2023, 1:06 PM

#

@past meteor Yes, as in selecting a set of existing features not inferring new ones.

#

aha, interesting @desert oar

past meteor Sep 21, 2023, 1:06 PM

#

desert oar the idea of even _doing_ feature selection is controversial. for example in pred...

This is the complete answer, I have nothing to add! 😄

glossy adder Sep 21, 2023, 1:07 PM

#

I might be overthinking my use case, I am making plots, plotting several dimensions of time series data (line showing the position, colour showing speed etc etc) so the thought is to find what features actually help the model

past meteor Sep 21, 2023, 1:08 PM

#

Even if your model is capable of finding non-linear patterns explicitly making them can help

#

But using plots to remove features? idk, I would probably not do that.

#

Regularization is the answer

desert oar Sep 21, 2023, 1:08 PM

#

glossy adder I might be overthinking my use case, I am making plots, plotting several dimensi...

yeah, it's useful to know which features are important, but using that importance to actually remove features from the model is what's questionable

past meteor Sep 21, 2023, 1:09 PM

#

Knowing what features are important is so dangerous that I wouldn't touch it unless you know what you're doing

desert oar Sep 21, 2023, 1:09 PM

#

the business people will want to know 🙂

glossy adder Sep 21, 2023, 1:09 PM

#

@desert oar good point. As in - there is no reason to actually remove features. Thx @past meteor - I guess that might move me into overfitting landscape and such

desert oar Sep 21, 2023, 1:09 PM

#

that said, i have run into cases where i did really want to remove "irrelevant" features, but it's a case-by-case situation. if you can explain what you're actually doing maybe we can provide more detailed advice

past meteor Sep 21, 2023, 1:09 PM

#

I'd nuance it to the very very very maximum that it means nothing

#

"Under this particular instance of the model and our data the most important features appear to be ..., different instances may drastically find different importances."

#

Business cannot expect me to give them more unless they give me the € to do an experiment 🤣. I "fight" this every other day.

#

Their statistical literacy can be low, if you give them what they think they want they'll make decisions that hurt the business

glossy adder Sep 21, 2023, 1:12 PM

#

haha, good point 🙂 kind of a mvp product, just minimally budgeted product

true scaffold Sep 21, 2023, 1:12 PM

#

Hi guys, need some help, i have created n clusters, a cluster contains m docs which are embeddings of 2048 dims (1 doc = 2048 dim of vector, 1 cluster = m docs), now i have a query string, i want to get the most relevant/similar cluster that it can fall under, so i'm thinking of calculating an average embedding of a cluster, and finding cosine sim b/w the query embedding and the cluster embeddings to find the most relevant cluster it can belong to? Any other efficient approach?

past meteor Sep 21, 2023, 1:12 PM

#

If you have 2 highly correlated features your regularizer will kill 1, that doesn't mean the feature is irrelevant to the problem etc etc
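
A small sketch of that point, assuming an L1 (lasso) regularizer and synthetic data; the features `x1` and `x2` and the `alpha` value are made up for illustration. Both features contribute equally to `y` by construction, yet with near-duplicate inputs the penalty typically concentrates most of the weight on one of them.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.01, size=500)      # near-duplicate of x1
y = x1 + x2 + rng.normal(scale=0.1, size=500)   # both features matter equally

X = np.column_stack([x1, x2])
model = Lasso(alpha=0.1).fit(X, y)

# The combined weight is roughly preserved, but how it is split between
# the two columns is largely arbitrary.
print(model.coef_, model.coef_.sum())
```

How the weight is split can change with the solver, the data sample, or the column order, which is why a small coefficient here says little about whether the feature is relevant to the problem.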

true scaffold Sep 21, 2023, 1:46 PM

#

past meteor If you have 2 highly correlated features your regularizer will kill 1, that does...

then what do you suggest?

past meteor Sep 21, 2023, 1:47 PM

#

Just use regularisation and call it day imo

past meteor Sep 21, 2023, 1:47 PM

#

past meteor "Under this particular instance of the model and our data the most important fea...

And if someone asks you "what is important and what isn't" you can answer a variation of

true scaffold Sep 21, 2023, 1:48 PM

#

past meteor And if someone asks you "what is important and what isn't" you can answer a vari...

lol, thanks

past meteor Sep 21, 2023, 1:51 PM

#

true scaffold lol, thanks

Oh you were a different person

true scaffold Sep 21, 2023, 1:51 PM

#

past meteor Oh you were a different person

Indeed

#

Thought u were referring to my question …

past meteor Sep 21, 2023, 1:54 PM

#

Is this latent semantic analysis you did? (LSA or LSI)

past meteor Sep 21, 2023, 1:57 PM

#

true scaffold Thought u were referring to my question …

Maybe explain what you want to achieve in non technical terms

#

Because it might be just a standard case of LSA

tropic niche Sep 21, 2023, 2:10 PM

#

I have a question about annotating data for training that I asked yesterday but has not been answered. I'm hoping someone can provide some insight. #data-science-and-ml message

true scaffold Sep 21, 2023, 2:37 PM

#

past meteor Maybe explain what you want to achieve in non technical terms

let's say i've created 7 clusters, a cluster basically have n # of docs, now a user can input his/her query, now i want to recommend him/her the most relevant cluster of docs based on the query inputed by him/her...

past meteor Sep 21, 2023, 2:38 PM

#

true scaffold let's say i've created 7 clusters, a cluster basically have n # of docs, now a u...

Before that, without using clusters etc

#

It's a retrieval problem? Someone gives an input, what do you want to give them, the most relevant document?

#

or the most relevant topic?

true scaffold Sep 21, 2023, 2:38 PM

#

document

#

basically, a user uploads a csv file

#

each row is a doc let's say

#

now i take this csv and create clusters out of it, now the user also enters a query, now based on this query, i wanna recommend him a cluster

#

the most "similar/relevant" cluster

past meteor Sep 21, 2023, 2:40 PM

#

Your case is exactly latent semantic indexing

#

From my old slides, that's what you want to do right?

true scaffold Sep 21, 2023, 2:41 PM

#

yea i wanna give him the index of the cluster which contains relevant docs based on his query

past meteor Sep 21, 2023, 2:42 PM

#

You actually don't need to cluster

true scaffold Sep 21, 2023, 2:42 PM

#

i know, i can just show him most similar docs

#

i've done that

#

but the cluster part is a different feature

#

basically with clusters, the user can look at other options...

#

the ability to explore more...

past meteor Sep 21, 2023, 2:44 PM

#

You can just take the cluster centres and do the same as before

#

That's indeed what you proposed originally

true scaffold Sep 21, 2023, 2:44 PM

#

yea i guess its a KNN problem...?

past meteor Sep 21, 2023, 2:45 PM

#

(I think you're making this a lot harder than it should but...) take the mean of all the docs in the cluster, compute the cosine sim, take the most similar one, show the top N in that cluster
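
A minimal sketch of that recipe, assuming the document embeddings are rows of a NumPy array and each row has a cluster label; the function name `most_similar_cluster` and the 2-d toy vectors (standing in for the 2048-d embeddings) are hypothetical.

```python
import numpy as np

def most_similar_cluster(query, doc_embeddings, labels):
    """Return (best_cluster_id, {cluster_id: cosine similarity to centroid})."""
    query = query / np.linalg.norm(query)
    sims = {}
    for cluster_id in np.unique(labels):
        # Centroid = mean of all document embeddings in this cluster.
        centroid = doc_embeddings[labels == cluster_id].mean(axis=0)
        sims[int(cluster_id)] = float(centroid @ query / np.linalg.norm(centroid))
    best = max(sims, key=sims.get)
    return best, sims

# Toy data: two clusters of two documents each.
docs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])

best, sims = most_similar_cluster(np.array([0.2, 1.0]), docs, labels)
print(best, sims)
```

Once the best cluster is known, the top N documents inside it can be ranked with the same cosine similarity against the query. The sketch does not guard against a centroid of exactly zero.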

true scaffold Sep 21, 2023, 2:46 PM

#

past meteor You can just take the cluster centres and do the same as before

yea but lets say in a cluster i have 2 docs, 1 doc has high similarity but the other one not so much, but they still have some similarity that is why they r in the same cluster, now mean operation will average out the embeddings, so info may get lost...?

past meteor Sep 21, 2023, 2:46 PM

#

Yes it will but c'est la vie

true scaffold Sep 21, 2023, 2:47 PM

#

wait let me translate that...

past meteor Sep 21, 2023, 2:47 PM

#

that's life*

true scaffold Sep 21, 2023, 2:47 PM

#

yea...

#

any better approach?

past meteor Sep 21, 2023, 2:48 PM

#

This is what my course had to say about it:

#

NLP from 1979 though 👀

#

Their suggestion is just to take the mean of the documents and return all in the cluster, which has drawbacks ofc as you mentioned.

true scaffold Sep 21, 2023, 2:50 PM

#

hmm... yeah

#

anyway, thanks

#

ill implement what we discussed and will keep researching to find a better approach to solve this...

shy kraken Sep 21, 2023, 4:38 PM

#

hi What tools would be helpful to make a visualization like this: https://www.linkedin.com/search/results/content/?fromMember=["ACoAAAJIlxABE10sM8zEE7MSsBmy06lUQODDj_U"]&heroEntityKey=urn%3Ali%3Afsd_profile%3AACoAAAJIlxABE10sM8zEE7MSsBmy06lUQODDj_U&keywords=james eagle&position=0&searchId=24e66560-e6f4-40ff-817b-64cced149b69&sid=9(F&update=urn%3Ali%3Afs_updateV2%3A(urn%3Ali%3Aactivity%3A7109400669119242241%2CBLENDED_SEARCH_FEED%2CEMPTY%2CDEFAULT%2Cfalse)

serene scaffold Sep 21, 2023, 5:34 PM

#

Are these two expressions equivalent?

wooden sail Sep 21, 2023, 5:36 PM

#

what's that fancy 1

serene scaffold Sep 21, 2023, 5:36 PM

#

1 if the underset equality is true, else 0

#

I think (because the assignment doesn't say)
(the first expression is given in the assignment and the second is my attempt at rewriting it to be easier to reason about)

wooden sail Sep 21, 2023, 5:37 PM

#

yeah, you factored out the -1 exponent from the log. looks equivalent

serene scaffold Sep 21, 2023, 5:38 PM

#

ty math wizard

half herald Sep 21, 2023, 8:13 PM

#

Why am I getting such an error? I can't use cv2.imshow

agile cobalt Sep 21, 2023, 8:16 PM

#

how did you install opencv? just pip install?

half herald Sep 21, 2023, 8:17 PM

#

Yes, but after I got this error, I deleted it and reinstalled it, it said so on the internet, but it didn't work.

agile cobalt Sep 21, 2023, 8:33 PM

#

half herald Yes, but after I got this error, I deleted it and reinstalled it, it said so on ...

from https://github.com/opencv/opencv-python/issues/18 it sounds like opencv-python-headless may be causing the problem, do you have that installed?

half herald Sep 21, 2023, 8:37 PM

#

agile cobalt from https://github.com/opencv/opencv-python/issues/18 it sounds like `opencv-py...

yes

agile cobalt Sep 21, 2023, 8:37 PM

#

maybe try uninstalling it then reinstalling opencv

#

or just nuke your current venv and create a new one

half herald Sep 21, 2023, 8:39 PM

#

agile cobalt or just nuke your current venv and create a new one

I deleted and reinstalled opencv many times but it didn't work. I will delete my project and open a new one

#

@agile cobalt It didn't work, I still get the same error, it's ridiculous.

agile cobalt Sep 21, 2023, 8:45 PM

#

try reinstalling python, this time from python.org instead of the windows store, then create a virtual environment before using pip

half herald Sep 21, 2023, 8:47 PM

#

agile cobalt try reinstalling python, this time from python.org instead of the windows store,...

I already installed Python from Python org, but I couldn't understand what you mean by virtual environment.

agile cobalt Sep 21, 2023, 8:48 PM

#

https://realpython.com/python-virtual-environments-a-primer/

Python Virtual Environments: A Primer – Real Python

In this tutorial, you'll learn how to use a Python virtual environment to manage your Python projects. You'll also dive deep into the structure of virtual environments built using the venv module, as well as the reasoning behind using virtual environments.

#

tl;dr keep things tidy instead of ending up with messes that can cause all sorts of problems like what you just had

half herald Sep 21, 2023, 9:18 PM

#

@agile cobalt What I am about to say may seem strange to you, but when I run the code using a different IDE, there is no problem, but I get this error in Visual Studio Code. I really couldn't understand

agile cobalt Sep 21, 2023, 9:19 PM

#

that other IDE is PyCharm?

#

or something inside of Anaconda

#

both of these manage virtual environments for you to some extent

half herald Sep 21, 2023, 9:19 PM

#

No, normal Python Idle

agile cobalt Sep 21, 2023, 9:20 PM

#

it might be just pointing to a different python interpreter then

#

in VSCode, do you see the python version in the bottom right corner? Click it and select a different interpreter
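
One way to confirm that two editors are using different interpreters is to run the same short check in each. This is a sketch using only the standard library; the helper name `describe_environment` is hypothetical, and `cv2` is the import name of the OpenCV package.

```python
import importlib.util
import sys

def describe_environment(package="cv2"):
    """Report which interpreter is running and where `package` would load from."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "package": package,
        # None means this interpreter cannot import the package at all.
        "location": spec.origin if spec is not None else None,
    }

print(describe_environment())
```

If the `interpreter` path differs between IDLE and VS Code, the packages installed with pip for one are not necessarily visible to the other.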

half herald Sep 21, 2023, 9:21 PM

#

Wowwww I chose anaconda and it worked, very strange

#

So the problem is that simple

#

I've been trying to solve this problem for 2 hours

agile cobalt Sep 21, 2023, 9:22 PM

#

managing dependencies in python can be a pain in the ass sometimes

half herald Sep 21, 2023, 9:22 PM

#

Thank you so much bro

candid spruce Sep 21, 2023, 10:24 PM

#

hi I was wondering if anyone wanted to work on a ai project with me 😄 here is a blue print for the ai

abstract wasp Sep 22, 2023, 1:59 AM

#

I am building a CNN with data regarding most popular cities. If I train my cnn with the cities I have right now and gather more data of other cities later, will it remember the previous cities or will it forget and just remember the new ones?
Should I just wait until I have all my data and train it all together?

serene scaffold Sep 22, 2023, 2:13 AM

#

@abstract wasp training a CNN to do what?

abstract wasp Sep 22, 2023, 2:15 AM

#

serene scaffold @abstract wasp training a CNN to do what?

To give an estimated location of an image.

serene scaffold Sep 22, 2023, 2:17 AM

#

abstract wasp To give an estimated location of an image.

so it's classifying images of locations within cities according to their city?
would we expect it to classify an image of the empire state building as "NEW YORK"?

abstract wasp Sep 22, 2023, 2:17 AM

#

serene scaffold so it's classifying images of locations within cities according to their city? w...

Yes

serene scaffold Sep 22, 2023, 2:19 AM

#

abstract wasp Yes

if you trained your classifier on cities {a, b, c}, and then continued training it on {d, e}, you would run the risk of the classifier forgetting {a, b, c}. and there are strategies for mitigating this, but unless the training you did on {a, b, c} can't wait and would be expensive to replicate, you'll get better results if you train once on {a, b, c, d, e}.

abstract wasp Sep 22, 2023, 2:19 AM

#

serene scaffold if you trained your classifier on cities {a, b, c}, and then continued training ...

Ok, thank you!!

abstract wasp Sep 22, 2023, 2:21 AM

#

serene scaffold if you trained your classifier on cities {a, b, c}, and then continued training ...

Wait, I have another question. If I train a, b, c, d, etc., and I later gather additional data for each of those same classes, is the risk still the same or will it be okay?

serene scaffold Sep 22, 2023, 2:24 AM

#

abstract wasp Wait, I have another question. If I train a, b, c, d, etc., and I later gather a...

suppose you have two training sets A and B that both contain instances of the same classes. I suspect that if the distribution of the classes is the same in both sets, then training on A, and then training on B, wouldn't be much different from training once on the union of A and B.

#

but that's probably a question that dissertations are written about.

abstract wasp Sep 22, 2023, 2:32 AM

#

serene scaffold suppose you have two training sets A and B that both contain instances of the sa...

Ok thanks!!

serene scaffold Sep 22, 2023, 2:32 AM

#

@abstract wasp why do you ask? do you already have a trained model that was expensive to train?

abstract wasp Sep 22, 2023, 2:36 AM

#

serene scaffold @abstract wasp why do you ask? do you already have a trained model that w...

No, I haven’t trained it yet. I’m still gathering data but rn, I’m just gathering data for the 100 most popular cities. The overall goal of this is to have data for most of the world—I just wanted to see if I should build diff. CNNs and fuse them together or if I should just train the CNN with what I have, save it, and later retrain it with additional data.

#

But yeah, training a CNN with this amount of data would be very expensive, that’s also another factor.

magic dune Sep 22, 2023, 2:39 AM

#

can I have a simple decision tree code review?

serene scaffold Sep 22, 2023, 2:40 AM

#

magic dune can I have a simple decision tree code review?

whenever you ask for something online, give people everything they would need to fulfill your request.

magic dune Sep 22, 2023, 2:42 AM

#

serene scaffold whenever you ask for something online, give people everything they would need to...

k

#

sorry abt that

#

!paste

serene scaffold Sep 22, 2023, 2:43 AM

#

abstract wasp No, I haven’t trained it yet. I’m still gathering data but rn, I’m just gatherin...

something to keep in mind as you approach this is: if someone with internet access looked at one of the images in the dataset, would they be able to figure out what city they're from? if the images are so non-descript (no famous buildings, no city-specific architecture, etc.) that there's nothing about them that could be tied to a particular city, a neural network won't be able to magically solve that for you.

magic dune Sep 22, 2023, 2:44 AM

#

https://paste.pythondiscord.com/UHHA
My decision tree code I challenged myself by not using entropy or information gain

#

I think I did an ok job but might be able to improve

#

if anyone can review the code and tell me what I should improve on I would be happy to hear

abstract wasp Sep 22, 2023, 3:05 AM

#

serene scaffold something to keep in mind as you approach this is: if someone with internet acce...

Yeah, that makes sense. Thank you for the support 😄

echo lance Sep 22, 2023, 4:10 AM

#

Is there a specific book or vid series for ML that focuses on how to achieve accuracy for different data behaviours, and on techniques to win competitions?
As most of the books focus on teaching algos...

tacit basin Sep 22, 2023, 4:59 AM

#

echo lance Is there a specific book or vid series for ml that focus on how to achive accura...

Like kaggle competitions?

echo lance Sep 22, 2023, 4:59 AM

#

Yeah

tacit basin Sep 22, 2023, 5:01 AM

#

Maybe the kaggle book?

#

https://github.com/PacktPublishing/The-Kaggle-Book

GitHub

GitHub - PacktPublishing/The-Kaggle-Book: Code Repository for The K...

Code Repository for The Kaggle Book, Published by Packt Publishing - GitHub - PacktPublishing/The-Kaggle-Book: Code Repository for The Kaggle Book, Published by Packt Publishing

scenic parcel Sep 22, 2023, 5:19 AM

#

How is ^VIX being calculated as having only a 0.0272 correlation with VIXCLSx

#

Should I be normalizing first? Using pandas corr function, pearson correlation
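
On the normalization question: Pearson correlation does not change when either series is shifted or rescaled, so normalizing first should not move the number. The sketch below uses synthetic arrays as stand-ins for the two series (the real data is not shown); it also shows that the same values paired with the wrong rows can give a correlation near zero, so date alignment is one thing worth checking. That is an assumption about the cause, not a diagnosis.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=250)                      # stand-in for one series
b = a + rng.normal(scale=0.1, size=250)       # near-copy of the same series

r_raw = np.corrcoef(a, b)[0, 1]

# Standardizing one series and rescaling the other leaves r unchanged.
r_scaled = np.corrcoef((a - a.mean()) / a.std(), 100 * b + 5)[0, 1]

# Same values, but rows paired 30 steps apart.
r_misaligned = np.corrcoef(a[:-30], b[30:])[0, 1]

print(r_raw, r_scaled, r_misaligned)
```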

lost plinth Sep 22, 2023, 6:41 AM

#

Hi Folks,
During my free time, I was working on a personal project: a chatbot that can answer your questions from documents. I used Langchain (framework), ChromaDB (vector database), and Streamlit (UI), with either a local LLM (Llama 2 based model) or the OpenAI API as the LLM. You can use PDF, TXT, CSV, and DOCX files for question answering. Any contributions to this project will be highly welcome. Thanks!
Github link: https://github.com/himanshu662000/InfoGPT

GitHub

himanshu662000 - Overview

himanshu662000 has 14 repositories available. Follow their code on GitHub.

wispy junco Sep 22, 2023, 7:19 AM

#

Hi I'm a complete beginner to ml and need to train a model to automatically find coordinates in an image, can someone please point me to some resources and libraries that can help me accomplish this, thanks.

scenic parcel Sep 22, 2023, 8:28 AM

#

wispy junco Hi I'm a complete beginner to ml and need to train a model to automatically find...

chat

#

gpt

scenic parcel Sep 22, 2023, 8:30 AM

#

lost plinth Hi Folks, During my free time, I was doing personal project basically created a ...

Dude this is actually fucking sick I've been meaning to build the exact same thing but haven't had the time lmao. Literally everything the same I wanted to use chroma and use my own local llm and use langchain to make it talk to itself

#

Does it get slow with large amounts of pdf, essentially if you gave it an entire bookshelf to search through? I'm definitely downloading this and trying it out though. How long did it take?

#

I already have a correction, it is supposed to be requirements.txt not requirement.txt just a tiny thing lol

lost plinth Sep 22, 2023, 8:37 AM

#

scenic parcel Does it get slow with large amounts of pdf, essentially if you gave it an entire...

Thanks! No actually it will not be slow irrespective of size of pdf but while ingesting data to chromadb(which u will do definitely before querying) it will take little more time for large pdf. But while querying and getting your answer u will not notice any difference irrespective of pdf size.

lost plinth Sep 22, 2023, 8:37 AM

#

scenic parcel I already have a correction, it is supposed to be requirements.txt not requireme...

Thanks I will change it

scenic parcel Sep 22, 2023, 8:37 AM

#

lost plinth Thanks! No actually it will not be slow irrespective of size of pdf but while in...

Cool!

scenic parcel Sep 22, 2023, 8:45 AM

#

lost plinth Thanks I will change it

So do you have to use GGUF llms? Or can you use any llm like GPTQ?

long canopy Sep 22, 2023, 10:07 AM

#

any AI tools available for CLI prompt validation? i.e. to check whether a string answer to a command line prompt has an appropriate format

lapis sequoia Sep 22, 2023, 1:11 PM

#

Do any of you guys know some high quality libraries for making maps in python

#

I have worked with plotly and folium

#

Plotly is pretty good but runs into some limitations from time to time

fallow frost Sep 22, 2023, 2:11 PM

#

does anybody know the maximum length a SQL query can be with Athena DB ?

#

I basically need to do: SELECT ... WHERE col IN <very-long-list-of-values> the list/tuple can have upwards of 100k strings with at least 50 chars each

#

not sure if I pass it as query parameter if it will still matter or not...
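
As far as I know, Athena documents a maximum query string length of 262,144 bytes, so 100k values of 50+ characters (about 5 MB) would not fit in one `IN (...)` list; treat that limit as something to confirm against the current AWS quotas. A sketch of splitting the values into batches under a byte budget follows. The helper `chunk_values`, the 200,000-byte budget and the generated ids are all hypothetical, and the size estimate assumes the values contain no quotes that need escaping.

```python
def chunk_values(values, max_bytes=200_000):
    """Yield lists of values whose quoted, comma-joined form stays under max_bytes."""
    batch, size = [], 0
    for value in values:
        cost = len(value.encode("utf-8")) + 3   # two quotes and a comma
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(value)
        size += cost
    if batch:
        yield batch

# 100k made-up strings of exactly 50 characters each.
values = [f"id-{i:047d}" for i in range(100_000)]
batches = list(chunk_values(values))
print(len(batches), "queries needed")
```

An alternative that avoids long queries altogether is to load the values into their own table and join against it.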

wintry cloud Sep 22, 2023, 2:41 PM

#

lapis sequoia Do any of you guys know some high quality libraries for making maps in python

worked with geopandas before seems pretty clean

serene scaffold Sep 22, 2023, 2:42 PM

#

I'm at a loss for how to proceed with this question. It appears that we have function K as R^d x R^d -> R, and Phi as R^d -> R^d, but I don't understand the relationship between K and Phi.

past meteor Sep 22, 2023, 2:42 PM

#

serene scaffold I'm at a loss for how to proceed with this question. It appears that we have fun...

This is my zone, sec! 😄

past meteor Sep 22, 2023, 2:48 PM

#

serene scaffold I'm at a loss for how to proceed with this question. It appears that we have fun...

This one does have me thinking tho pithink . The relationship between K and phi is called the kernel trick. Phi maps x to an infinite dimensional space for an RBF kernel (you can't compute this). Basically K allows you to do a dot product of two vectors that were mapped in a (possibly) infinite dimensional space without explicitly going there.

#

You definitely have to look at it in terms of K and not phi, that's the property they want you to exploit

#

if x_i = x_j then the term is exp(0) = 1, and if they are different, ||x_i - x_j||² is a positive number which you multiply by -1/2, giving a negative exponent, so the exponential lies strictly between 0 and 1.

#

I don't understand the <= 2
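
One possible reading of the "<= 2", assuming the question asks for a bound on the squared distance in feature space (the assignment itself is not reproduced in the chat) and that K(x, x') = exp(-½||x - x'||²) = ⟨φ(x), φ(x')⟩: expand the squared norm and replace every inner product with K.

```latex
\begin{aligned}
\|\phi(x_i) - \phi(x_j)\|^2
  &= \langle \phi(x_i), \phi(x_i) \rangle
     - 2\,\langle \phi(x_i), \phi(x_j) \rangle
     + \langle \phi(x_j), \phi(x_j) \rangle \\
  &= K(x_i, x_i) - 2\,K(x_i, x_j) + K(x_j, x_j) \\
  &= 2 - 2\exp\!\left(-\tfrac{1}{2}\,\|x_i - x_j\|^2\right) \;\le\; 2 .
\end{aligned}
```

The last step uses K(x, x) = 1 and 0 < K(x_i, x_j) ≤ 1, which are the two facts noted above.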

small wedge Sep 22, 2023, 2:53 PM

#

wispy junco Hi I'm a complete beginner to ml and need to train a model to automatically find...

Can you expand on what "automatically find coordinates in an image" means? Like latitude longitude coordinates? Coordinates of some object you're detecting? Either way this doesn't really sound like a beginner project.

past meteor Sep 22, 2023, 2:54 PM

#

past meteor You definitely have to look at it in terms of K and not phi, that's the property...

small wedge Sep 22, 2023, 2:54 PM

#

scenic parcel gpt

Please never recommend chatGPT as a source of information, especially to beginners.

wispy junco Sep 22, 2023, 4:06 PM

#

small wedge Can you expand on what "automatically find coordinates in an image" means? Like ...

I have an image and I'd have to find out what part of the image should stay and what should be removed...
I need to remove the unnecessary parts (in my current task, I have to remove all the parts that are white & black, ie. texts, etc in a picture which is otherwise full of color?)...

#

anyway, not even gonna try to sugarcoat this, I got the code from chatGPT

from PIL import Image
import numpy as np
from sklearn.cluster import KMeans

def get_dominant_colors(image, num_colors):
    # Convert the image to a numpy array
    img_array = np.array(image)

    # Reshape the image array to a list of pixels
    pixels = img_array.reshape(-1, 3)

    # Initialize K-Means with the desired number of clusters (colors)
    kmeans = KMeans(n_clusters=num_colors, random_state=0).fit(pixels)

    # Get the RGB values of the cluster centers (dominant colors)
    dominant_colors = kmeans.cluster_centers_.astype(int)

    return dominant_colors

# Open an example image (replace with your box)
box = Image.open('./doggo.jpeg')

# Specify the number of dominant colors you want to extract (5 in this case)
num_colors = 5

# Get the 5 dominant colors within the box
dominant_colors = get_dominant_colors(box, num_colors)

# Print the dominant colors (RGB values)
print("Dominant Colors:")
for color in dominant_colors:
    print(f"RGB: {color[0]}, {color[1]}, {color[2]}")

can someone tell me what's going wrong, I'm trying to get the 5 dominant colors in a image

wispy junco Sep 22, 2023, 4:29 PM

#

I'm getting this error

#

lapis sequoia Sep 22, 2023, 5:23 PM

#

wintry cloud worked with geopandas before seems pretty clean

I do use geopandas regularly. But rather limited in making high quality maps I think.

small wedge Sep 22, 2023, 5:25 PM

#

wispy junco I'm getting this error

We need the full traceback in order to know what went wrong. Also please send it as text instead of a screenshot 🙂

quaint spade Sep 22, 2023, 5:28 PM

#

hey everyone, i need a favor, can someone run some code for me? it doesn't seem to work on my laptop and i want to see if the problem is me (maybe i didn't install all the right packages) or the code. it has to do with webscraping from cboe and gamma exposure for options https://github.com/Matteo-Ferrara/gex-tracker/tree/e4a5cd508268673004e7dcd2f73ce7f74bf251c5

GitHub

GitHub - Matteo-Ferrara/gex-tracker at e4a5cd508268673004e7dcd2f73c...

Dealers' gamma exposure (GEX) tracker. Contribute to Matteo-Ferrara/gex-tracker development by creating an account on GitHub.

abstract wasp Sep 22, 2023, 5:28 PM

#

Hi, help, I get an data_iterator = data.as_numpy_iterator() AttributeError: 'DirectoryIterator' object has no attribute 'as_numpy_iterator'
This is my code:
```python
import pandas as pd
import numpy as np
import os

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

#GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

#DATA LOADING
train_images = '/Users/avatarvaleria/Projects/colabs/Lys/time/data/images'
classes = os.listdir(train_images)
print(classes)

#DATA AUGMENTATION
data_aug = ImageDataGenerator(
    rotation_range=25,
    fill_mode='nearest',
    horizontal_flip=True,
    brightness_range=(.5, .5),
    zoom_range=.5
)

#APPLYING AUG
batch_size = 32

data = data_aug.flow_from_directory(
    train_images,
    target_size=(256, 256),
    batch_size=batch_size,
    class_mode='sparse'
)

data_iterator = data.as_numpy_iterator()
batch = data_iterator.next()

fig, ax = plt.subplots(ncols=5, figsize=(20,20))
for idx, img in enumerate(batch[0][:5]):
    ax[idx].imshow(img.astype(int))
    ax[idx].title.set_text(batch[1][idx])

#DATA PREPROCESSING
data = data.map(lambda x, y: (x/255, y))
data.as_numpy_iterator().next()

scaled = batch[0]/255
scaled.max()

scaled_iterator = data.as_numpy_iterator()
batch = scaled_iterator.next()

fig, ax = plt.subplots(ncols=4, figsize=(20,20))
for idx, img in enumerate(batch[0][:4]):
    ax[idx].imshow(img)
    ax[idx].title.set_text(batch[1][idx])

data.as_numpy_iterator().next()[0].max()

#SPLITTING
len(data)

train_size = int(len(data)*.7)
val_size = int(len(data)*.2)
test_size = int(len(data)*.1)

print(f"Train size: {train_size}")
print(f"Validation size: {val_size}")
print(f"Test size: {test_size}")

train = data.take(train_size)
val = data.skip(train_size).take(val_size)
test = data.skip(train_size+val_size).take(test_size)
```
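
A possible explanation, offered as a sketch rather than a confirmed fix: `flow_from_directory` returns a Keras `DirectoryIterator`, which is a Python batch generator and not a `tf.data.Dataset`, so `as_numpy_iterator`, `map`, `take` and `skip` are not available on it. `tf.keras.utils.image_dataset_from_directory` (in recent TensorFlow 2.x releases) does return a `tf.data.Dataset`. The helper names `load_dataset` and `split_sizes` below are made up for illustration, and the TensorFlow import sits inside the function so the size arithmetic can be tried on its own. The posted code also calls `plt.subplots` without `import matplotlib.pyplot as plt`.

```python
def load_dataset(image_dir, batch_size=32, image_size=(256, 256)):
    """Load a class-per-folder image directory as a batched tf.data.Dataset."""
    import tensorflow as tf  # imported lazily so split_sizes works without TensorFlow

    data = tf.keras.utils.image_dataset_from_directory(
        image_dir,
        label_mode="int",        # integer labels, similar to class_mode='sparse'
        image_size=image_size,
        batch_size=batch_size,
    )
    # Scale pixels to [0, 1]; Dataset.map exists, unlike on DirectoryIterator.
    return data.map(lambda x, y: (x / 255.0, y))


def split_sizes(n_batches, train_frac=0.7, val_frac=0.2):
    """Return (train, val, test) batch counts; test takes whatever is left."""
    train = int(n_batches * train_frac)
    val = int(n_batches * val_frac)
    return train, val, n_batches - train - val


# Intended use (not run here, needs TensorFlow and an image folder):
# data = load_dataset("path/to/images")
# n_train, n_val, n_test = split_sizes(len(data))
# train = data.take(n_train)
# val = data.skip(n_train).take(n_val)
# test = data.skip(n_train + n_val).take(n_test)
```

Giving the test split the remainder means no batch is silently dropped by the three `int(...)` roundings. The augmentations from `ImageDataGenerator` are not reproduced here; they would need to be added separately, for example as Keras preprocessing layers.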

left tartan Sep 22, 2023, 5:36 PM

#

quaint spade hey everyone , i need a favor , can someone run code for me , doesnt seem to wor...

What error do you get?

quaint spade Sep 22, 2023, 5:41 PM

#

left tartan What error do you get?

#

also im a beginner , i just saw source code and thought that it would work

quaint spade Sep 22, 2023, 5:41 PM

#

left tartan What error do you get?

is it operational on your end ?

left tartan Sep 22, 2023, 5:42 PM

#

All that’s saying is the request is returning no data. Presumably there’s no error handling, so it’s probably just hiding an error

quaint spade Sep 22, 2023, 5:44 PM

#

why would it do that lol

left tartan Sep 22, 2023, 5:44 PM

#

This code is a year old. Websites change.

#

Scraping is very fragile. I’ve worked with cboe to do this exact thing before (gex), but manually downloaded the data.

quaint spade Sep 22, 2023, 5:46 PM

#

left tartan Scraping is very fragile. I’ve worked with cboe to do this exact thing before (g...

thats the tedious task i am trying to avoid , id like the data to update itself 😔

quaint spade Sep 22, 2023, 5:47 PM

#

left tartan Scraping is very fragile. I’ve worked with cboe to do this exact thing before (g...

was it difficult? this is the only reason i am learning python. it really helped with my trading, but my license expired with the site where i was getting the levels from

left tartan Sep 22, 2023, 5:48 PM

#

I started from here: https://perfiliev.co.uk/market-commentary/how-to-calculate-gamma-exposure-and-zero-gamma-level/

quaint spade Sep 22, 2023, 5:50 PM

#

left tartan I started from here: <https://perfiliev.co.uk/market-commentary/how-to-calculate...

i think you have sent me this once before , i skimmed through but i will read it now , mind if i add you in case i run into some problems i may need help with ?

boreal gale Sep 22, 2023, 5:59 PM

#

lapis sequoia I do use geopandas regularly. But rather limited in making high quality maps I t...

what kind of maps do you require?

lapis sequoia Sep 22, 2023, 5:59 PM

#

boreal gale what kind of maps do you require?

Like choropleth, or polygon plots. With annotations

boreal gale Sep 22, 2023, 6:00 PM

#

is folium/plotly lacking in any way for these?

lapis sequoia Sep 22, 2023, 6:00 PM

#

Sort of like this

lapis sequoia Sep 22, 2023, 6:01 PM

#

boreal gale is folium/plotly lacking in any way for these?

Plotly does quite well, but sometimes lacks. So was wondering if there's other tools available that are good in other dimensions

boreal gale Sep 22, 2023, 6:01 PM

#

give kepler.gl a look as well, i love it

lapis sequoia Sep 22, 2023, 6:02 PM

#

Like for example, plotly wasn't handling overlap of text very well. And I had to custom code a clustering logic which avoided the overlap

#

gpt did mention that

#

I mainly wish to make static maps btw

desert oar Sep 22, 2023, 7:02 PM

#

lapis sequoia I do use geopandas regularly. But rather limited in making high quality maps I t...

i use cartopy which is what geopandas uses internally

#

no interactivity but relatively detailed control over output

#

i get map tiles using contextily

#

it's good enough for the static images in presentations and docs that i need

scenic parcel Sep 22, 2023, 7:43 PM

#

small wedge Please never recommend chatGPT as a source of information, *especially* to begin...

I definitely am gonna continue recommending chatgpt to beginners

abstract wasp Sep 22, 2023, 7:48 PM

#

abstract wasp Hi, help, I get an `data_iterator = data.as_numpy_iterator() AttributeError: 'Di...

Sos 😭

small wedge Sep 22, 2023, 8:00 PM

#

scenic parcel I definitely am gonna continue recommending chatgpt to beginners

It's your life. As long as you are aware that it is a less useful recommendation than "Google it" because of the unreliable nature of modern LLMs.

scenic parcel Sep 22, 2023, 8:01 PM

#

gpt4 is highly reliable and most beginner programmers just want to do something simple that gpt 3.5 won't be hallucinating anything up for

#

Its instant responses chatting with an industry expert when the alternative is maybe getting a response every few hours from some people on discord/reddit or poring over documentation. Its how I got my start a few months ago and I found it invaluable

small wedge Sep 22, 2023, 8:04 PM

#

It's absolutely not reliable as it can and will give you contradictory answers to the same logical question when worded differently. Not to mention hallucination is still a problem for gpt4 even if it's not as much of a problem as it was for 3.5. Again, not to mention you didn't recommend it for a simple task, you recommended it to someone who had a fairly complex task. Talking to gpt4 is absolutely not akin to talking to an industry expert.

iron basalt Sep 22, 2023, 8:05 PM

#

serene scaffold I'm at a loss for how to proceed with this question. It appears that we have fun...

What is the relationship between K and the inner product? How does || ... ||^2 relate to that?

left tartan Sep 22, 2023, 8:09 PM

#

scenic parcel gpt4 is highly reliable and most beginner programmers just want to do something ...

It’s highly unreliable, and more importantly: prevents new programmers from developing the problem solving skills they need.

#

Even worse is when it gives a working answer that’s a bad practice

scenic parcel Sep 22, 2023, 8:10 PM

#

small wedge It's absolutely not reliable as it can and will give you contradictory answers t...

Is this not a helpful response? (3.5 btw)

Hi I'm a complete beginner to ml and need to train a model to automatically find coordinates in an image, can someone please point me to some resources and libraries that can help me accomplish this, thanks.

ChatGPT
Certainly! If you're a beginner in machine learning and want to train a model to automatically find coordinates in an image, you'll likely be working on an object detection task. Object detection involves identifying and locating objects in an image, which can be thought of as finding the coordinates of objects within the image. Here are some resources and libraries to get you started:

Python: Most machine learning and computer vision tasks in the context of object detection are done in Python.
Libraries/Frameworks:

TensorFlow Object Detection API: This is a popular framework for object detection. It provides pre-trained models and tools to train your own models. Here's the official GitHub repository.

PyTorch: PyTorch is another popular deep learning framework that can be used for object detection. You can find tutorials and pre-trained models in the PyTorch Hub.

OpenCV: OpenCV is a computer vision library that can be used for various tasks, including object detection. It has pre-trained models and tutorials for object detection. Check the OpenCV documentation.

YOLO (You Only Look Once): YOLO is a popular real-time object detection framework. You can find implementations and pre-trained models like YOLOv3 and YOLOv4 in various repositories, such as YOLO GitHub.

Datasets: You'll need a dataset of images with labeled coordinates to train your model. Some popular object detection datasets include COCO (Common Objects in Context), Pascal VOC, and custom datasets you can create.
Tutorials and Courses:

Coursera and Udacity offer machine learning and computer vision courses that cover object detection.

YouTube has numerous tutorials on object detection using different frameworks.

#

Blogs and tutorials on Medium and Towards Data Science often provide step-by-step guides for object detection tasks.

Books: Books like "Deep Learning" by Goodfellow, Bengio, and Courville or "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron can provide a solid foundation in machine learning and deep learning concepts.
Forums and Communities: Websites like Stack Overflow and Reddit (e.g., r/MachineLearning) are great places to ask questions and seek guidance from the machine learning community.
Online Coding Platforms: Platforms like Kaggle provide datasets, kernels (code notebooks), and competitions related to object detection. It's a great way to learn and practice.

Remember that object detection can be a complex task, especially for a beginner, but with dedication and practice, you can make progress. Start with the basics of machine learning and gradually delve into object detection techniques as you become more comfortable with the concepts and tools.

left tartan Sep 22, 2023, 8:10 PM

#

Without even reading it; that content is no better than what the same google search would yield.

small wedge Sep 22, 2023, 8:10 PM

#

Note I didn't say it can't provide helpful responses. I said it is unreliable

left tartan Sep 22, 2023, 8:11 PM

#

That’s incredibly generic, overly verbose, and not particularly helpful advice to someone just starting.

#

In fact, I’d argue that is worse advice than googling and reading a few articles that explain -why- and put things in context

past meteor Sep 22, 2023, 8:17 PM

#

left tartan It’s highly unreliable, and more importantly: prevents new programmers from deve...

Did people have the same opinion about Google and stackoverflow in the past?

shadow viper Sep 22, 2023, 8:21 PM

#

lapis sequoia Sort of like this

is this GIS?

#

good day everyone,
please does anyone know how i can edit a particular cell in powerBI?

left tartan Sep 22, 2023, 8:30 PM

#

past meteor Did people have the same opinion about Google and stackoverflow in the past?

I think it’s the difference between spoon feeding and researching. It’s one thing to develop the skills to research and answer questions

shadow viper Sep 22, 2023, 8:51 PM

#

left tartan I think it’s the difference between spoon feeding and researching. It’s one thin...

Hey Billy, how are you doing?

#

I'm making use of power bi and I have a column filled with null values and I want to edit one particular cell in the column to something else.
The replace function keeps replacing the whole column filled with null instead of the particular cell I want to replace.
How do I do this please?

left tartan Sep 22, 2023, 8:53 PM

#

I don’t know powerbi, sorry

shadow viper Sep 22, 2023, 8:54 PM

#

Alright

#

Thanks

scenic parcel Sep 22, 2023, 10:52 PM

#

Does anybody else get this "UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged." its like unfixable

small wedge Sep 22, 2023, 10:56 PM

#

!paste Could you show the code that creates the warning?

arctic wedgeBOT Sep 22, 2023, 10:56 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

scenic parcel Sep 22, 2023, 10:58 PM

#

Seems to be any time I import pandas, or if a library that I import uses pandas

#

```
from config import DATA_DIR
import gdown

import zipfile
import os

def download_data_from_drive(zip_url, output_path):
    # Download the zip file from Google Drive
    gdown.download(zip_url, output_path, quiet=False)

    # Extract the zip file
    with zipfile.ZipFile(output_path, 'r') as zip_ref:
        zip_ref.extractall(os.path.dirname(output_path))
    os.remove(output_path)  # Remove the zip file after extraction

# Using the direct download link format provided by gdown's warning
url = r'https://drive.google.com/example_link'

output = str(DATA_DIR / 'Stock_data/dl_folder/example_data-2.zip')
download_data_from_drive(url, output)
```

#

Most recent time I've run into it, has happened a lot of other times. This is one of the smaller scripts that causes it

#

Stackoverflow said uninstall and reinstall pandas/numpy, have tried that. It happened with miniconda, tried using full anaconda instead, still happens. Uninstalled everything on conda and reinstalled everything with pip only, still happens

#

At this point I think it's a problem with python 3.10.12 and have left it because it hasn't caused any noticeable effects, but the warning is annoying

steep shadow Sep 22, 2023, 11:22 PM

#

Does anyone have any advice on how I can begin learning AI. I already know intermediate python and data structures.

frosty gale Sep 22, 2023, 11:39 PM

#

Hey, I am having trouble installing OpenCV CUDA. I am done with all the steps in CMake-gui, but when I try to build the files, it just throws an error:

MSBUILD : error MSB1009: Project file does not exist.
Switch: INSTALL.vcxproj

vale swallow Sep 23, 2023, 1:03 AM

#

Hi, can someone pls help me. Can someone give me an example code of how to split my data. For example, I have a directory named “main_dir” and in this directory I have three directories, each for the three classes I have named “1”, “2”, “3” (with just images of each class). How can I split my data into train, val, test?
I’m seeing different ways using Tensorflow, Sklearn, and other ways so I’m confused on how I should do it.

frosty gale Sep 23, 2023, 1:11 AM

#

frosty gale Hey i am having a trouble installing OpenCV CUDA, I am done with all the steps i...

SOLVED <3

frosty gale Sep 23, 2023, 1:12 AM

#

vale swallow Hi, can someone pls help me. Can someone give me an example code of how to split...

hey I'd suggest sklearn if you're going for complete basics! sklearn has a direct train test split function where you can mention the ratio in which it needs to be split into! check out some sklearn tutorials!
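
To make the folder layout from the question concrete, here is a minimal sketch using only the standard library. The function names `split_items` and `split_image_dir` and the 70/20/10 fractions are assumptions for illustration. The sklearn route mentioned above would be `train_test_split` called twice, once to carve out the test set and once more for validation.

```python
import random
from pathlib import Path


def split_items(items, val_frac=0.2, test_frac=0.1, seed=0):
    """Shuffle a list and cut it into (train, val, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test


def split_image_dir(main_dir, **kwargs):
    """Split each class folder separately so every split contains every class."""
    splits = {"train": [], "val": [], "test": []}
    for class_dir in sorted(Path(main_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        train, val, test = split_items(files, **kwargs)
        label = class_dir.name  # folder name ("1", "2", "3") is the class label
        splits["train"] += [(p, label) for p in train]
        splits["val"] += [(p, label) for p in val]
        splits["test"] += [(p, label) for p in test]
    return splits
```

Calling `split_image_dir("main_dir")` would return three lists of `(path, label)` pairs. Splitting per class keeps the class proportions roughly equal across train, val and test.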

vale swallow Sep 23, 2023, 1:17 AM

#

frosty gale hey I'd suggest sklearn if you're going for complete basics! sklearn has a direc...

Ok, thanks!
Also, do you know how to implement data augmentation? I saw you can apply the augmentation in the actual model, but then there's another option to apply that augmentation to the actual dataset. Which one do you think is best?

wispy junco Sep 23, 2023, 1:20 AM

#

small wedge We need the full traceback in order to know what went wrong. Also please send i...

yep it worked thankss

frosty gale Sep 23, 2023, 1:22 AM

#

vale swallow Ok, thanks! Also, do you know how to implement data augmentation? I saw you can ...

hi yeah, you should apply scaling or any kind of augmentation to the training split, after splitting, instead of to the original dataset, because:
if you augment the original dataset, all data entries will be changed, and upon splitting into test and train sets, your test set will also be affected.
on the other hand, splitting the data first and then augmenting/scaling/changing only the train set preserves the original test set, which gives a more trustworthy test result
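
A toy sketch of that ordering, with images as plain lists of pixel rows and a horizontal flip as the only augmentation. `hflip` and `augment_train_only` are hypothetical helpers, not part of any library.

```python
def hflip(image):
    """Mirror an image given as a list of pixel rows."""
    return [row[::-1] for row in image]


def augment_train_only(train, test):
    """Add flipped copies to the training set; hand the test set back untouched."""
    augmented = train + [(hflip(img), label) for img, label in train]
    return augmented, test


train = [([[1, 2], [3, 4]], "a")]
test = [([[5, 6]], "b")]
train_aug, test_out = augment_train_only(train, test)
```

The point is only the order of operations: the split happens first, and nothing derived from an augmented image ever reaches the test list.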

#

am new to this too, so correct me if am wrong, anyone

small wedge Sep 23, 2023, 1:26 AM

#

scenic parcel ```from config import DATA_DIR import gdown import zipfile import os def downl...

Hm I can't reproduce the issue so it must be something with your package manager/environment
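
One cheap environment check, as a guess at a cause and not a diagnosis: see whether more than one copy of numpy is reachable on the import path. `find_package_dirs` is a made-up helper for this; it only looks for folders with the package's name directly under each `sys.path` entry.

```python
import sys
from pathlib import Path


def find_package_dirs(name, search_paths=None):
    """Return every directory called `name` that sits directly on the import path."""
    search_paths = sys.path if search_paths is None else search_paths
    found = []
    for entry in search_paths:
        candidate = Path(entry or ".") / name  # an empty entry means the current directory
        if candidate.is_dir() and candidate not in found:
            found.append(candidate)
    return found


# Example: print every numpy folder Python could import from.
# for path in find_package_dirs("numpy"):
#     print(path)
```

If this prints more than one location (say one from conda and one from a user-level pip install), that would be worth cleaning up before looking further.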

vale swallow Sep 23, 2023, 1:31 AM

#

frosty gale hi yeah, you need to apply scaling or any kind of augmentation to your data set ...

Ok thank you!

small wedge Sep 23, 2023, 1:47 AM

#

steep shadow Does anyone have any advice on how I can begin learning AI. I already know inter...

AI is a pretty broad term, what kinda AI are you interested in making?

scenic parcel Sep 23, 2023, 1:50 AM

#

small wedge Hm I can't reproduce the issue so it must be something with your package manager...

Can you give me a requirements.txt file of the packages that you're using

oak panther Sep 23, 2023, 1:52 AM

#

what are the best algos to try for stock trading futures/indices?

small wedge Sep 23, 2023, 1:53 AM

#

scenic parcel Can you give me a requirements.txt file of the packages that you're using

I'm not using a venv and I do the thing you're not supposed to do with global installs :)) I just installed the modules in your code on top of my existing environment with pip

#

so if I made a requirements.txt it'd be like 100 lines long

#

if you just want my versions of these packages I could send that

scenic parcel Sep 23, 2023, 3:24 AM

#

small wedge if you just want my versions of these packages I could send that

Yes I'd appreciate that thank you

#

Also python version

small wedge Sep 23, 2023, 3:35 AM

#

gdown==4.7.1
├── beautifulsoup4 [required: Any, installed: 4.10.0]
├── filelock [required: Any, installed: 3.12.4]
├── requests [required: Any, installed: 2.25.1]
├── six [required: Any, installed: 1.16.0]
└── tqdm [required: Any, installed: 4.66.1]
config==0.5.1

did any of these even use numpy pithink

tona@albedo:~$ pip --version
pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

scenic parcel Sep 23, 2023, 3:36 AM

#

zipfile ?

small wedge Sep 23, 2023, 3:36 AM

#

isn't that a built-in module

scenic parcel Sep 23, 2023, 3:37 AM

#

Yeah I think so nvm

#

So is that python 3.10.0

small wedge Sep 23, 2023, 4:07 AM

#

mhm

lapis sequoia Sep 23, 2023, 6:14 AM

#

shadow viper is this GIS?

Gis as in? It is plotly in python.

quaint loom Sep 23, 2023, 7:21 AM

#

I'm currently developing a module to detect small bubble events (Ebullition), calculating the CH4 ebullition flux (eFCH4) by assuming a constant diffusion rate. To mitigate diffusion flux inhibition due to high CH4 concentration in a floating chamber, I select previous data within the observation period to calculate the diffusion flux, represented by the U-M line (prototype: https://paste.pythondiscord.com/K6RA). I employ the least squares method to fit the slope of the U-M line, obtaining the CH4 diffusion rate within the period. Additionally, I calculate the CH4 diffusion concentration at the observation's end (point E) based on the U-M line's slope. The change in CH4 ebullition concentration (Δc) results from subtracting the concentration at point E from point T during the observation period.

I want a module that can extract relevant time periods from raw data (an xls file) for analysis (e.g., 10:02:59 - 10:12:59, 11:23:59 - 11:26:59). This targeted approach eliminates the need to analyze the entire raw data range. Ebullition events occur when CH4 bubbles disrupt the linear increase in CH4 concentration.

While I've created a prototype for significant bubble events (https://paste.pythondiscord.com/K6RA), I'm seeking guidance on developing one for small bubbles. Additionally, I'm working on determining an appropriate threshold value (https://paste.pythondiscord.com/H7RQ). Any assistance or advice to enhance the module would be greatly appreciated.
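
The time-window extraction described above could look roughly like this. It assumes the rows are already loaded as `(time, value)` pairs; reading the xls itself (for example with `pandas.read_excel`) is left out, and `parse_window` and `select_rows` are hypothetical names.

```python
from datetime import time


def parse_window(text):
    """Turn '10:02:59 - 10:12:59' into a (start, end) pair of datetime.time."""
    start, end = (time.fromisoformat(part.strip()) for part in text.split("-"))
    return start, end


def select_rows(rows, windows):
    """Keep (timestamp, value) rows whose time falls inside any window, inclusive."""
    parsed = [parse_window(w) for w in windows]
    return [
        (t, v) for t, v in rows
        if any(start <= t <= end for start, end in parsed)
    ]


windows = ["10:02:59 - 10:12:59", "11:23:59 - 11:26:59"]
rows = [
    (time(10, 0, 0), 1.0),
    (time(10, 2, 59), 2.0),
    (time(10, 5, 0), 3.0),
    (time(11, 24, 0), 4.0),
    (time(12, 0, 0), 5.0),
]
selected = select_rows(rows, windows)
```

Keeping the windows as plain strings means they can be typed in the same form as in the field notes.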

quaint loom Sep 23, 2023, 7:24 AM

#

quaint loom I'm currently developing a module to detect small bubble events (Ebullition), ca...

For the threshold value: I have heard that, no matter what, I should never convert it to a str, but I am not familiar with how I should do it

night forge Sep 23, 2023, 8:31 AM

#

Hi, I had a question with pytorch. Below is my model

#

```python
from torch import nn
# create a two layer FCNN, avoid ValueError: optimizer got an empty parameter list
class img2latent(nn.Module):
    def __int__(self):
        super(img2latent,self).__init__()
        self.neuralDim=len(X_train[0])
        self.latentDim=len(Y_train[0])
        self.hiddenDim=self.neuralDim
        self.fc1=nn.Linear(self.neuralDim,self.hiddenDim)
        self.fc2=nn.Linear(self.hiddenDim,self.latentDim)
        # INTITIALISE THE WEIGHTS, FC1 WITH ONES, FC2 WITH PARAMETERS OF RIDGE
        self.fc1.weight.data.fill_(1)
        self.fc1.bias.data.fill_(0)
        self.fc2.weight.data=ridge.coef_
        self.fc2.bias.data=ridge.intercept_

    def forward(self,x):
        x=self.fc1(x)
        # add reLU
        x=torch.relu(x)
        x=self.fc2(x)
        return x


def train_loop(model,loss_fn, optimizer):
    model.train()
    # do full batch gradient descent
    pred=model(X_train)
    loss=loss_fn(pred,Y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


fullmodel=img2latent()
# SEND TO GPU
fullmodel=fullmodel.to(device)
# take mse loss + l2 regularaisation  on the weights of the second layer
loss=nn.MSELoss()
optimizer=torch.optim.Adam(fullmodel.parameters(),lr=0.01,weight_decay=0.01)
for t in range(1000):
    loss=train_loop(fullmodel,loss,optimizer)
    if t%100==0:
        print(t,f"{loss:0.2f}",end='\t')
```

#

However I get

ValueError: optimizer got an empty parameter list

How do I solve this issue

small wedge Sep 23, 2023, 8:36 AM

#

night forge ```python from torch import nn # create a two layer FCNN, avoid ValueError: opti...

You wrote __int__ instead of __init__
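
Why the typo produces that exact error can be shown without PyTorch. The `Module` class below is a toy stand-in (an assumption for illustration, not the real `nn.Module`): like the real one, it only knows about parameters that were registered while `__init__` ran.

```python
class Module:
    """Toy base class that only tracks what __init__ registers."""

    def __init__(self):
        self.params = []

    def parameters(self):
        return list(self.params)


class Typo(Module):
    def __int__(self):  # misspelled: Python never calls this on construction
        super().__init__()
        self.params.append("fc1.weight")


class Fixed(Module):
    def __init__(self):  # correct spelling: runs when Fixed() is created
        super().__init__()
        self.params.append("fc1.weight")


empty = Typo().parameters()     # nothing registered, so an optimizer gets an empty list
filled = Fixed().parameters()   # the layer's parameter is there
```

Two more things in the posted code look likely to bite once the typo is fixed: only `nn` is imported while `torch.relu` and `torch.optim` are used, and the loop rebinds `loss` from the `nn.MSELoss()` object to the float returned by `train_loop`, so the second iteration would pass a float as `loss_fn`. A separate name such as `loss_fn` avoids that.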

quaint loom Sep 23, 2023, 10:34 AM

#

quaint loom I'm currently developing a module to detect small bubble events (Ebullition), ca...

👆🏽

desert oar Sep 23, 2023, 11:32 AM

#

quaint loom I'm currently developing a module to detect small bubble events (Ebullition), ca...

when moving into predicting events that are hard to distinguish from normal variation, you usually end up having to trade off between probability of true detection and probability of false detection. but ultimately just about every statistical technique revolves around doing something very similar to what you are already doing: proposing a baseline model, and then looking for deviations from that baseline model. where it gets more complicated is when you want to analyze the situation probabilistically, but that's often necessary in cases where it's not straightforward to distinguish the baseline and deviated scenarios. a probability model gives you a principled framework for distinguishing the two, and allows you to make trade-offs in terms of the probabilities of false positives, true positives, etc.

#

One thing that did not really occur to me when I was helping you previously is that, because you assume a constant rate of increase in the baseline scenario, when you remove the trend by taking first differences, you should get a flat line

#

That is, you can transform your data in terms of deviations from the expected trend line

#

Then instead of modeling the slope directly, you can just look for unexpectedly large positive deviations from trend

#

This is convenient because you can think in terms of level values rather than rates, which i think makes it all a little bit easier

#

It also simplifies the problem I think, because it reduces now to figuring out what is the normal baseline distribution of CH4 increases at any time step

#

In your case, it seems like maybe the variation in the data is constant over time? If that's true, then you can get pretty far using standard statistical hypothesis testing
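
A rough sketch of the first-difference idea from the messages above, assuming roughly constant noise and one reading per time step. The median step stands in for the baseline rate and the median absolute deviation (MAD) for the normal spread; the function name `flag_jumps` and the threshold of 4 are arbitrary choices for illustration, not a recommended setting.

```python
import statistics


def flag_jumps(values, z_threshold=4.0):
    """Return indices where the step from the previous sample is unusually large."""
    diffs = [b - a for a, b in zip(values, values[1:])]  # first differences remove the linear trend
    baseline = statistics.median(diffs)                  # typical step = baseline rate
    mad = statistics.median([abs(d - baseline) for d in diffs])
    scale = 1.4826 * mad  # MAD rescaled to be comparable to a standard deviation for Gaussian noise
    flagged = []
    for i, d in enumerate(diffs):
        excess = d - baseline
        # With zero spread, any step above the baseline counts as a deviation.
        is_jump = excess > z_threshold * scale if scale > 0 else excess > 0
        if is_jump:
            flagged.append(i + 1)  # index of the sample where the jump lands
    return flagged
```

Only positive deviations are flagged, matching the "unexpectedly large positive deviations" framing. Median and MAD are used instead of mean and standard deviation so that the bubble events themselves do not inflate the baseline estimate.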

quaint loom Sep 23, 2023, 11:45 AM

#

desert oar when moving into predicting events that are hard to distinguish from normal vari...

Thanks for explaining that, it makes a lot more sense now! I agree that focusing on deviations from the expected trend seems like a crucial first step in developing the module. It's true that many published reports on methane ebullition events are based on predictions rather than actual observations, which can introduce uncertainty.

Understanding the composition of bubbles and how other substances inside them change over time can significantly reduce uncertainties and improve predictions. there is ongoing research into the content of these bubbles. It seems like we're working towards making our predictions about methane bubbles more accurate and reliable, moving away from just educated guesses. I also want to mention that I think this is the first step for me to develop the module, so the module itself will have to be improved of course.

desert oar Sep 23, 2023, 11:48 AM

#

quaint loom Thanks for explaining that, it makes a lot more sense now! I agree that focusing...

my understanding was that the expectation of a linear trend was derived from some theoretical knowledge about the underlying chemical process involved in whatever you're working on, is that not true?

#

it gets fuzzier and harder to distinguish events from non-events when you need to estimate the baseline distribution directly from the data without a theoretical model

quaint loom Sep 23, 2023, 11:50 AM

#

desert oar In your case, it seems like maybe the variation in the data is constant over tim...

I am familiar with this, and yes, it may. But it doesn't lead to the development of a module that can be improved over time. Both things we're talking about, the module I am trying to create now and standard statistical hypothesis testing, will have huge uncertainties.

quaint loom Sep 23, 2023, 11:55 AM

#

desert oar my understanding was that the expectation of a linear trend was derived from som...

To be honest, not always, but in some cases, the expectation of a linear trend is based on theoretical knowledge of the underlying chemical processes. This theoretical foundation provides us with a starting point for making predictions. However, remember that while theory guides us, real-world data can sometimes behave differently due to various factors. So, while we start with a theoretical basis, we also need to be prepared to adapt our models when necessary to account for deviations from the expected linear trend.

desert oar Sep 23, 2023, 11:57 AM

#

quaint loom To be honest, not always, but in some cases, the expectation of a linear trend i...

right, you're thinking along the right lines then

#

remind me again: are you able to analyze the whole time series at once? or do you need to be able to detect events when they occur, using only past data?

quaint loom Sep 23, 2023, 11:59 AM

#

desert oar right, you're thinking along the right lines then

That is also why I think this module can be helpful for me and the people who are working with GHG, by giving more accurate data.

quaint loom Sep 23, 2023, 12:00 PM

#

desert oar remind me again: are you able to analyze the whole time series at once? or do yo...

I am using a GHG instrument that measures real-time data every second at the water-air interface.

desert oar Sep 23, 2023, 12:01 PM

#

quaint loom I am using a GHG instrument that measures real-time data every second at the wat...

right, but is this something that is going to be running continuously and sending out an alert when an event happens? or are you running it for a set period of time and then analyzing the entire sequence later?

quaint loom Sep 23, 2023, 12:01 PM

#

desert oar remind me again: are you able to analyze the whole time series at once? or do yo...

So no, always fresh data that is being analyzed here

desert oar Sep 23, 2023, 12:03 PM

#

okay, so if you are looking for methods that only use past data without seeing the full sequence, your keyword is "online" (although it's not very useful on its own given its other meanings)

#

i think last time you asked about this, changepoint detection was brought up, and ultimately i think that does describe what you are trying to do

quaint loom Sep 23, 2023, 12:05 PM

#

desert oar right, but is this something that is going to be running continuously and sendin...

Well, this is where the time-consuming part comes in. The instrument logs data every second. So, in the field where I use the instrument for 10 minutes at each site, I have to physically analyze the data afterward in Excel, empty the distributed dataset before placing it in the water, etc. And this is just raw data; there are a few other modules that have to be used afterward to obtain the actual Flux data.

desert oar Sep 23, 2023, 12:06 PM

#

quaint loom Well, this is where the time-consuming part comes in. The instrument logs data e...

right, but it sounds like this data is intended for a post-hoc analysis rather than continuous monitoring, correct?

quaint loom Sep 23, 2023, 12:06 PM

#

desert oar okay, so if you are looking for methods that only use past data without seeing t...

Well, I'm not particularly interested in determining whether the bubbles are occurring on the spot, as you can often spot them with your eyes, but you can't make that determination when you're analyzing the data afterward.

desert oar Sep 23, 2023, 12:07 PM

#

makes sense

quaint loom Sep 23, 2023, 12:10 PM

#

desert oar i think last time you asked about this, changepoint detection was brought up, an...

you're right. the concept of changepoint detection aligns well with what I'm trying to achieve.

past meteor Sep 23, 2023, 12:18 PM

#

quaint loom you're right. the concept of changepoint detection aligns well with what I'm try...

Typically changepoint detection, especially Bayesian online changepoint detection (BOCD) assumes you have 2 distinct distributions and there's a specific point you go from P1(y|X) to P2(y|X). Is that the case for you or do you just have anomalous points?

#

If you can mathematically define what you want to do finding a method that does it comes out the other end sometimes 😄

quaint loom Sep 23, 2023, 12:25 PM

#

past meteor Typically changepoint detection, especially Bayesian online changepoint detectio...

In my data, small and significant bubble events don't always fit the traditional definition of clear anomalous data points. They occur at somewhat random rates, making it a bit more challenging to pinpoint them as distinct anomalies.

past meteor Sep 23, 2023, 12:26 PM

#

Do they change the trajectory of your overall curve?

quaint loom Sep 23, 2023, 12:26 PM

#

past meteor If you can mathematically define what you want to do finding a method that does ...

Well, I am not that good with math or programming so that is also why I seek guidance and help here,

past meteor Sep 23, 2023, 12:26 PM

#

I'd say the size of the bubble doesn't determine if it's an anomaly or not, you can choose what an anomaly is as the practitioner

#

If the bubble moves your line permanently it's a changepoint, if it doesn't I'd say it's an anomaly

quaint loom Sep 23, 2023, 12:27 PM

#

past meteor Do they change the trajectory of your overall curve?

Yes, these bubble events do indeed affect the overall trajectory of the data curve. When they occur, they introduce deviations from the expected pattern, causing temporary fluctuations in the data.

past meteor Sep 23, 2023, 12:27 PM

#

When you say temporary, does it mean it shifts back?

quaint loom Sep 23, 2023, 12:28 PM

#

past meteor If the bubble moves your line permanently it's a changepoint, if it doesn't I'd ...

As the instrument I am using flushes the system at a constant time, the line will after X time smoothly decrease

past meteor Sep 23, 2023, 12:28 PM

#

Is the entire "lifespan" of your data impacted by an "event"?

#

Or just a "zone" around the "event"?

quaint loom Sep 23, 2023, 12:29 PM

#

past meteor When you say temporary, does it mean it shifts back?

I meant that the data exhibits deviations from the expected pattern for a certain duration or period of time while the bubble event is happening. These fluctuations are not permanent shifts; they are only observed during the occurrence of the bubble event.

past meteor Sep 23, 2023, 12:30 PM

#

Okay I'd say they're anomalies then. How fast do you need to spot them? You can be vague about this like "very fast", "medium" etc

#

And do you know the expected pattern before starting your process?

#

Additionally, do you have series that don't have any "bubble events"?

quaint loom Sep 23, 2023, 12:32 PM

#

past meteor Is the entire "lifespan" of your data impacted by an "event"?

It all depends on how long I measure. Often I sample for 10-15 minutes, so when I see a significant increase while analysing the data, it lasts for the rest of the "lifespan" of the time window I am looking at. But when I look at the real-time data out in the field, it will of course drop down as the system flushes

past meteor Sep 23, 2023, 12:32 PM

#

I don't know your domain so it's in both our best interest if you abstract away some of the details 🤣

quaint loom Sep 23, 2023, 12:33 PM

#

past meteor Okay I'd say they're anomalies then. How fast do you need to spot them? You can ...

I don't understand your question. I am not trying to catch them. When they occur, I would like to "catch them" when I analyse the data.

past meteor Sep 23, 2023, 12:34 PM

#

So not in real-time? After the fact?

quaint loom Sep 23, 2023, 12:34 PM

#

past meteor I don't know your domain so it's in both our best interest if you abstract away ...

Sorry, I am just afraid I am too limited at explaining and will cause confusion

past meteor Sep 23, 2023, 12:34 PM

#

I'd look at the variance of N points that don't have any "bubble events" and then take "N" points that contain a bubble event in the beginning and the end

#

There should be some sort of difference in variance

quaint loom Sep 23, 2023, 12:35 PM

#

past meteor So not in real-time? After the fact?

Well, when looking at the data when I am in the field, the graph itself drops if I continuously keep measuring. But once I have decided that this time span is the data I will use, you may not always see the decrease unless it is a small bubble event.

#

Well, there will always be a constant change as the measurement is being done every second. That is also why I need to have a threshold value.

lofty star Sep 23, 2023, 1:07 PM

#

Data

sudden umbra Sep 23, 2023, 1:26 PM

#

quaint loom Well, when looking at the data when I am on the field, the graph itself drops if...

Data Variability

desert oar Sep 23, 2023, 1:41 PM

#

past meteor Typically changepoint detection, especially Bayesian online changepoint detectio...

as i understand, they are specifically looking for step detection, but i was thinking you could reduce that to just looking for large positive deviations from trend

#

that is, they are assuming there is some steady state constant rate of increase, and these bubbles lead to large "steps", effectively positive y-intercept shifts

past meteor Sep 23, 2023, 1:42 PM

#

I agree with all you said hence why I think we're going in circles 😄

desert oar Sep 23, 2023, 1:42 PM

#

yeah that's why I was trying to get out whether this was online or not

#

The best I can gather is that this is not really an online problem, which allows you to produce a decent estimate of baseline variation around trend, so you can retroactively look and find large deviations that might be bubbles

#

basically what I'm proposing is a shortcut to find change points, using the specific assumptions of the problem, rather than a fully general algorithm

past meteor Sep 23, 2023, 1:44 PM

#

No, what you're suggesting is enough

desert oar Sep 23, 2023, 1:44 PM

#

however I suspect that most of the off-line change point detection algorithms that work by recursively partitioning the time series would also work very well to detect large mean shifts

#

where things get tricky is detecting smaller mean shifts, and that's where I got hung up before I had to go do something else

past meteor Sep 23, 2023, 1:44 PM

#

There's a few cases where it will fail that I can foresee but they should start here and solving those will be easy

desert oar Sep 23, 2023, 1:45 PM

#

if you just look at average deviation from trend, E.g. estimating sample standard deviation of first differences, that standard deviation estimate will include all of the shifts

#

and this is where I really regret dropping that nonparametric statistics class in grad school

#

because my hunch is that some kind of robust estimation would be appropriate here

#

essentially you have a mixture distribution of mean shifts: baseline shifts, and shifts caused by bubbles

past meteor Sep 23, 2023, 1:46 PM

#

Something simpler can work, I'd only reach for those if the diff-in-variance method fails

desert oar Sep 23, 2023, 1:46 PM

#

so either you do something nonparametric to try to eliminate the bubbles from the baseline estimate, or you do something like a bayesian mixture model where you are being really meticulous about accounting for all sources of variation, but that might be harder to design

#

True, maybe it just works

#

@quaint loom if you have the opportunity to sit there and observe bubbles as they come up, I think that would help your analysis substantially

past meteor Sep 23, 2023, 1:47 PM

#

Most summary statistics have robust counterparts incl. variance if that's an issue

desert oar Sep 23, 2023, 1:47 PM

#

Literally just mark the time that a bubble occurs, then all of a sudden you have labeled data points and you can be much more confident about model/technique selection

desert oar Sep 23, 2023, 1:48 PM

#

past meteor Most summary statistics have robust counterparts incl. variance if that's an iss...

I know they exist and that's about it

past meteor Sep 23, 2023, 1:48 PM

#

You probably also use Huber loss?

quaint loom Sep 23, 2023, 1:52 PM

#

desert oar <@950847230422712420> if you have the opportunity to sit there and observe bubbl...

Yes, I know. I am sitting there and observing, but sometimes you cannot detect them, as some of the bubbles are very small but the sensor still detects them. One of the reasons I want to develop this module is that going through all this data would take ages.

quaint loom Sep 23, 2023, 1:53 PM

#

desert oar Literally just mark the time that a bubble occurs, then all of a sudden you have...

Yea, that is the simple way out.

desert oar Sep 23, 2023, 1:54 PM

#

quaint loom Yea, that is the simple way out.

It's not the simple way out, it's the correct solution. This is the "shoe leather" part of "statistics and shoe leather" that goes back to the earliest days of statistical analysis. Actually collecting good useful data almost always involves toil and manual effort

quaint loom Sep 23, 2023, 1:54 PM

#

As the chamber is covering the water surface, you cannot always see with your eye that a bubble is appearing inside the chamber.

desert oar Sep 23, 2023, 1:55 PM

#

I see, you said before that they were observable and I didn't realize that was limited

quaint loom Sep 23, 2023, 1:56 PM

#

desert oar I see, you said before that they were observable and I didn't realize that was l...

you're right, it's not always at a constant rate.

#

I would still like to develop this module. I just have to figure out what the best way would be! I also missed so much in school about this...

lapis sequoia Sep 23, 2023, 2:12 PM

#

anyone can help me in openpyxl?

desert oar Sep 23, 2023, 2:23 PM

#

quaint loom I would still like to develop this module. Just have to figure out how the best ...

I would start by focusing on the algorithm you intend to implement rather than the code

quaint loom Sep 23, 2023, 2:27 PM

#

desert oar I would start by focusing on the algorithm you intend to implement rather than t...

If you have some suggestions, please list them out to me when you have some time.

quaint loom Sep 23, 2023, 2:30 PM

#

lapis sequoia anyone can help me in openpyxl?

Please be more specific about your question. Maybe even i can help you if you can just ask directly

abstract wasp Sep 23, 2023, 2:37 PM

#

I was going to run my model but I got this error:
(DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_4' with dtype int32 and shape [5923] [[{{node Placeholder/_4}}]] 2023-09-23 07:34:00.087774: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_4' with dtype int32 and shape [5923] [[{{node Placeholder/_4}}]] Epoch 1/60 Assertion failed: (f == nullptr || dynamic_cast<To>(f) != nullptr), function down_cast, file ./tensorflow/tsl/platform/default/casts.h, line 58. Assertion failed: (f == nullptr || dynamic_cast<To>(f) != nullptr), function down_cast, file ./tensorflow/tsl/platform/default/casts.h, line 58.

desert oar Sep 23, 2023, 2:40 PM

#

quaint loom If you have some suggestions, please list them out to me when you have some time...

Well we were talking about looking for outliers in the distribution of differences right? So I would start by computing the sample standard deviation, or maybe the median absolute deviation, of the first differences, and then messing around setting thresholds on that quantity. For example, if a deviation is more than two median absolute deviations away from the median, it might be an outlier, i.e. a bubble event

#

The median absolute deviation should be more robust to the effects of including outliers in the estimation process, compared to the standard deviation

#

I would probably start by doing this on each time series individually, preferably one where you were able to actually observe and mark known bubbles

#

and by the way you might have a somewhat more enjoyable data analysis experience if you work with pandas or numpy. the trade-off is that they are fairly large libraries and might take a while to get comfortable with, but maybe you can set that as an intermediate goal before moving onto more sophisticated algorithms or scaling up to some kind of automated process
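
#

a minimal sketch of the median-absolute-deviation idea above, assuming the measurements are already loaded as a plain sequence of numbers sampled at a fixed interval. the function name, the `n_mads` parameter and the sample readings are all made up for illustration, and two MADs is only a starting threshold to tune against known bubbles:

```python
import numpy as np


def flag_bubble_candidates(values, n_mads=2.0):
    """Return indices i where the step values[i] -> values[i + 1] is unusually large."""
    values = np.asarray(values, dtype=float)
    diffs = np.diff(values)                    # first differences
    med = np.median(diffs)                     # typical step size (robust)
    mad = np.median(np.abs(diffs - med))       # median absolute deviation
    # Only large *positive* deviations are treated as bubble candidates.
    # Note: if mad is 0, every step strictly above the median gets flagged.
    return np.flatnonzero(diffs - med > n_mads * mad)


# Hypothetical readings: a steady rise of about 1 per sample with one big step.
increments = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 6.0, 1.0, 0.9, 1.1]
readings = np.concatenate([[0.0], np.cumsum(increments)])
flagged = flag_bubble_candidates(readings)
print(flagged)  # positions of the steps that look like bubble events
```

the median and MAD are used instead of mean and standard deviation so that the bubbles themselves don't inflate the baseline estimate too much.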

quaint loom Sep 23, 2023, 2:51 PM

#

desert oar Well we were talking about looking for outliers in the distribution of differenc...

Thank you for your time. I will return to you when I have delved deeper into this.

neon field Sep 23, 2023, 4:21 PM

#

ANYONE WHO HAS DONE SIGN LANGUAGE DETECTION USING MACHINE LEARNING??!!! ITS URGEENTTT

verbal venture Sep 23, 2023, 4:47 PM

#

can someone explain to me why axis=0 in numpy means "applied to each column individually, but along the rows"

echo mesa Sep 23, 2023, 5:54 PM

#

Hi guys, this might be a dumb question, but do I have to know numpy in order to understand and use pytorch? I'm learning math right now for machine learning and AI, and I want to improve in python as well. Eventually I want to use either pytorch or tensorflow to implement my math knowledge and build models, so I'm wondering in what order I should do them. Should I start by getting familiar with numpy and then move on to pytorch and tensorflow, or first get really good at math and then get into numpy, pytorch and tensorflow?
I hope this makes sense. I know that this is a dumb question, but I'm a beginner in this field.

scenic parcel Sep 23, 2023, 6:10 PM

#

pip makes me want to kill a dolphin

narrow spear Sep 23, 2023, 6:20 PM

#

is there anyone who is available? I would like to ask some easy things 🙂

cursive narwhal Sep 23, 2023, 6:22 PM

#

I can but i'm also a beginner

narrow spear Sep 23, 2023, 6:22 PM

#

in the first photo I can open the camera easily. in the second photo I would like to open the webcam again, I'm writing 0 as the default. what can I do

#

when I write 0 at the prompt it works, but when I try to write it in code it doesn't work

#

where did I make a mistake

cursive narwhal Sep 23, 2023, 6:24 PM

#

oh

#

i'm noob)

narrow spear Sep 23, 2023, 6:24 PM

#

hah okey 🙂

sterile barn Sep 23, 2023, 8:07 PM

#

Hello. I am doing some preprocessing and have just 2 null values in a single column I want to look at. How can I look at just the two rows of data that have them? I have the typical suite of libraries installed: pandas, numpy, matplotlib and seaborn.

agile cobalt Sep 23, 2023, 8:08 PM

#

something like df.loc[df.isna().any(axis=...)] might work? or select the column first and call isna on that, since isna has no subset= parameter (that one belongs to dropna)
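
#

a small sketch of both variants, assuming a made-up data frame (the column names `age` and `city` are placeholders for whatever the real columns are):

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values in a single column.
df = pd.DataFrame(
    {
        "age": [31, np.nan, 45, np.nan],
        "city": ["Oslo", "Lima", "Pune", "Kyiv"],
    }
)

# Rows where one specific column is null.
rows_missing_age = df.loc[df["age"].isna()]

# Rows where any column is null (axis=1 reduces across the columns of each row).
rows_missing_any = df.loc[df.isna().any(axis=1)]

print(rows_missing_age)
```

the boolean mask is just a Series of True/False per row, so it can be combined with other conditions using `&` and `|` if needed.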

sterile barn Sep 23, 2023, 8:09 PM

#

Yeah, that was just about the first thing that popped up after I rephrased my own question. This always happens 🙃

#

Thank you so much though 😅

left tartan Sep 23, 2023, 8:14 PM

#

echo mesa Hi guys, this might be a dumb question but, Do I have to know numpy in order to ...

You can use the tools without fully understanding them. To be a professional, you’ll need to round out your knowledge, and will be forced to learn more of numpy, but there are plenty of starter projects that require basic understanding

left tartan Sep 23, 2023, 8:15 PM

#

echo mesa Hi guys, this might be a dumb question but, Do I have to know numpy in order to ...

Make sure you first have solid Python skills (finish a tutorial and maybe some small projects), or it might be overwhelming. I’d suggest cs50 for ai or kaggle.com/learn to learn some Ml stuff and basic numpy

untold ginkgo Sep 23, 2023, 8:15 PM

#

does anyone do tracking of certain products of the web

left tartan Sep 23, 2023, 8:16 PM

#

verbal venture can someone explain to me why np.axis=0 means "applied to each column individual...

What do you mean “why”? It just does. Axis=0 says: operate row by row, axis=1 is columnar

#

https://www.sharpsightlabs.com/blog/numpy-axes-explained/

verbal venture Sep 23, 2023, 8:17 PM

#

left tartan What do you mean “why”? It just does. Axis=0 says: operate row by row, axis=1 is...

Right, 0 = row but if you do np.sum(axis=0) then it gets the sum along each column

left tartan Sep 23, 2023, 8:19 PM

#

That link explains it better than I would, could you check that first?

#

I understand your confusion, it’s just one of those things that’s how it works

echo mesa Sep 23, 2023, 8:22 PM

#

left tartan Make sure you first have solid Python skills (finish a tutorial and maybe some s...

Thanks I appreciate your answer

tidal bough Sep 23, 2023, 8:22 PM

#

the way I think of it: for reduction operations, axis=0 means "get rid of axis 0 by reducing along it".

#

so you have an (n,m) array, you do something like .sum(axis=0), you get an (m,) array.

desert oar Sep 23, 2023, 8:31 PM

#

verbal venture can someone explain to me why np.axis=0 means "applied to each column individual...

in general, the axis parameter tells you which axes are "consumed" by the operation

#

oh hah thats what reptile just said

#

so it's not that axis=0 means "operate columnwise", it means "operate everything-other-than-row-wise"

#

consider an array of RGB images, shape (m,n,3). let's say you want to find the average value of each color across all pixels across all images. that would be np.mean(images, axis=(0,1))
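
#

a quick sketch of the "consumed axes" idea. the array contents are made up, and the (m, n, 3) array is read here as m images with n flattened pixels each, which is an assumption about the example above:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (n, m) = (2, 3): [[0, 1, 2], [3, 4, 5]]

col_sums = a.sum(axis=0)         # axis 0 is consumed -> shape (3,), one sum per column
row_sums = a.sum(axis=1)         # axis 1 is consumed -> shape (2,), one sum per row

# Hypothetical batch: 4 images, 10 flattened pixels each, 3 color channels.
images = np.zeros((4, 10, 3))
images[..., 0] = 1.0             # make the first channel all ones

# Axes 0 and 1 (images and pixels) are consumed; only the channel axis survives.
channel_means = images.mean(axis=(0, 1))

print(col_sums.shape, row_sums.shape, channel_means.shape)
```

the shape of the result is always the input shape with the listed axes removed, which is usually the easiest way to check you picked the right one.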

left tartan Sep 23, 2023, 10:40 PM

#

What do you expect x_train[1] to do?

verbal venture Sep 23, 2023, 11:02 PM

#

tidal bough so you have an (n,m) array, you do something like `.sum(axis=0)`, you get an (m,...

this makes sense

golden ridge Sep 24, 2023, 12:52 AM

#

anyone know a good machine learning algorithm for a tyre degradation prediction model? the idea is you input the compound and the tyre lap and it should give the expected time

desert oar Sep 24, 2023, 1:12 AM

#

golden ridge anyone knos a good machine learning algorithm for a tyre degradation prediction ...

F1 tires? linear regression for sure, at least as a starting point. if every lap is driven identically, you would expect the same amount of tire wear each lap. if some laps randomly result in more wear and some laps randomly result in less wear, with a roughly bell curve shaped distribution of wear centered around average per lap tire wear, that's the classic linear regression model

#

you could imagine that maybe tires do not wear consistently. maybe they exhibit a lot of wear in the first couple of laps, then wear rate flattens, and then the tire deteriorates rapidly at the end of its life. or maybe something completely different. but i would always advocate for the simpler model first

#

assuming you are actually interested in making predictions about F1 race outcomes, you have the problem where you are not physically observing and measuring the condition of the tire, so any proposed model is more like a guess or theory and there's no real principled way to fit that to any data, because you have no data

#

so in that case you would almost definitely want to go with the linear model, always go with the simpler model in the absence of other information

prisma hinge Sep 24, 2023, 2:16 AM

#

Hello, I am currently trying to run a pretrained model that classifies the mnist number data set both from huggingface. I am having issues with the dimensions and format of the images. I have attached my code below along with the error raised and would appreciate any help regarding this. Thanks in advance.

prime galleon Sep 24, 2023, 6:43 AM

#

Hello, I have dataset which have 1400 rows and 1800 rows. I am trying to recognize letters. My model can currently recognize every letter but A, B, D and H. I use randomforest algorithm. Do I need more data or is there some other way to solve this problem. At training it has accuracy of 97% but when trying in practice it doesn't recognize those above mentioned letters

golden ridge Sep 24, 2023, 6:49 AM

#

desert oar F1 tires? linear regression for sure, at least as a starting point. if every lap...

that's what i used lol, but would linear regression work for a nonlinear model like the tyres??

plucky breach Sep 24, 2023, 7:13 AM

#

Guys, why doesn't my spacy code return the correct similarity? these sentences are even similar

import spacy

nlp = spacy.load('en_core_web_lg')

sentence1 = nlp("subjective test 3 _ Test paper (Biology) __ PDF ONLY __ (Neev 2024)")
sentence2 = nlp("biology test")
print(sentence1.similarity(sentence2))

0.1846538...

oblique quarry Sep 24, 2023, 9:01 AM

#

I've been implementing LDA, following a guide, making some tweaks occasionally. But I can't stop asking myself why I have to use the within-class scatter matrix. When I look at the formula I'm really tempted to just go for the covariance matrix... it would be so much easier on the computer in terms of computation.

midnight harbor Sep 24, 2023, 9:33 AM

#

Hello Everyone👋🏼

I am thrilled to share that I have participated in a Datacamp competition to show my analytical and machine-learning skills. Just like as you've supported me in past competitions, I am reaching out to you again.🙌🏼

Your support means the world to me. To increase my chance of winning, I kindly ask for a moment of your time to visit my DataCamp workspace and upvote it from the link 👇🏼

https://app.datacamp.com/workspace/w/83209d5b-2341-46d3-88c3-113ebb8d587b

Your upvote could make all the difference. Your encouragement and support have always been a driving force and I am immensely grateful for it. ☺️

Thanks for taking the time to upvote my work ♥️

By
Umar and Faizan


silk sun Sep 24, 2023, 10:10 AM

#

Can anyone help me in making a Sign Language Recognition

golden ridge Sep 24, 2023, 10:27 AM

#

hey guys, which algorithm do you think suits this model best? it is a model which predicts the lap time on a specific circuit given tyre degradation, with inputs: compound, laps on the tyre, laps in the race.

I have used linear regression but now want to try something else

past meteor Sep 24, 2023, 10:54 AM

#

golden ridge hey guys, which algorithm do you think it suits the most this model. it is a mod...

What is your output? It's a specific time?

#

so for instance '50 hours'?

golden ridge Sep 24, 2023, 10:55 AM

#

past meteor What is your output? It's a specific time?

its laptime

earnest wren Sep 24, 2023, 10:55 AM

#

prisma hinge Hello, I am currently trying to run a pretrained model that classifies the mnist...

Looks like it might be an issue of black and white vs. colour images. You can import the image dataset using an ImageFolder helper from torch instead as this automatically helps with this.

Otherwise you will have to convert the images in place somehow.

past meteor Sep 24, 2023, 10:55 AM

#

This is likely a case where linear regression is not great 🙂

#

Probably look at a gamma regression

past meteor Sep 24, 2023, 10:56 AM

#

golden ridge its laptime

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.GammaRegressor.html


golden ridge Sep 24, 2023, 10:56 AM

#

past meteor This is likely a case where linear regression is not great 🙂

i mean it gave me an error of like 0,9s

past meteor Sep 24, 2023, 10:56 AM

#

Can you plot the distribution of your target variable

golden ridge Sep 24, 2023, 10:59 AM

#

yess

#

but ts bugged

#

i cant send it

#

illl dm you

past meteor Sep 24, 2023, 11:00 AM

#

Don't DM me please

#

Could you make it a histogram, something like this?

golden ridge Sep 24, 2023, 11:00 AM

#

i dont rlly know how to use matplotlib, could you tell me how to do it

past meteor Sep 24, 2023, 11:02 AM

#

I can give you a few pointers:

Seaborn is a great high level plotting library you could use to make a histogram or in this case density plot: https://seaborn.pydata.org/generated/seaborn.displot.html#seaborn.displot

golden ridge Sep 24, 2023, 11:02 AM

#

but do you know why my plot is bugged??

past meteor Sep 24, 2023, 11:02 AM

#

The link has clear examples on how to, you'd do something like: sns.displot(data=penguins, x="flipper_length_mm", kind="kde")

golden ridge Sep 24, 2023, 11:03 AM

#

this??

past meteor Sep 24, 2023, 11:03 AM

#

Yes

past meteor Sep 24, 2023, 11:03 AM

#

golden ridge but do you know why my plot is bugged??

No, personally I would have a hard time knowing exactly why your plot is bugged without reading through your whole script, and if I may be honest I don't have the time for that right now 🙂

#

Seaborn is built on top of Matplotlib and you should most likely read this page: https://matplotlib.org/stable/users/explain/quick_start.html#a-simple-example to understand how it works (the relationship between Figures and Axes)

golden ridge Sep 24, 2023, 11:15 AM

#

past meteor No, personally I would have a hard time knowing why exactly your plot without re...

so what is penguins here?? sns.displot(data=penguins, x="flipper_length_mm", hue="species", kind="kde")

past meteor Sep 24, 2023, 11:15 AM

#

it's just an example, you'd put your own data there

golden ridge Sep 24, 2023, 11:15 AM

#

but what does penguin stand for

#

like what does it do

past meteor Sep 24, 2023, 11:16 AM

#

Your data frame and x is your column

golden ridge Sep 24, 2023, 11:16 AM

#

past meteor Your data frame and x is your column

like the csv??

past meteor Sep 24, 2023, 11:16 AM

#

I think you need to read through the documentation (both links I sent you)

golden ridge Sep 24, 2023, 11:17 AM

#

and what would be the hue

past meteor Sep 24, 2023, 11:17 AM

#

I won't tell you and that's in your best interest 🙂 Learning to read documentation is probably top 3 most important things in programming.

golden ridge Sep 24, 2023, 11:18 AM

#

the thing is i dnt rlly understand what histogram are you askin for

golden ridge Sep 24, 2023, 11:18 AM

#

past meteor I won't tell you and that's in your best interest 🙂 Learning to read documentat...

if you tell me what the histogram should be

past meteor Sep 24, 2023, 11:19 AM

#

past meteor Could you make it a histogram, something like this?

If your "y" column (laptime) is shaped like this, you're better off using a gamma regression.

#

To find out you make a histogram or kde plot

golden ridge Sep 24, 2023, 11:20 AM

#

past meteor To find out you make a histogram or kde plot

but what is the graph comparing

past meteor Sep 24, 2023, 11:21 AM

#

If you're having trouble with that then https://seaborn.pydata.org/tutorial/distributions.html <- is a good read to understand the reasoning behind histograms etc

golden ridge Sep 24, 2023, 11:22 AM

#

nono, im saying whats the graph u want me to plot

past meteor Sep 24, 2023, 11:22 AM

#

past meteor Is your "y" column (laptime) is shaped like this you're better off using a gamma...

laptime

golden ridge Sep 24, 2023, 11:22 AM

#

so y is laptime

#

and x is...

past meteor Sep 24, 2023, 11:23 AM

#

I'm not going to answer that 😄 Time for you to do the work.

golden ridge Sep 24, 2023, 11:23 AM

#

past meteor I'm not going to answer that 😄 Time for you to do the work.

if ur not going to tell me what x is how am i suposed to do the graph u want me to do

#

this is not about an implementation is about something u want me do to

#

i dont know why gamma regressor would be better

past meteor Sep 24, 2023, 11:24 AM

#

Read the stuff I linked and you'll know what X is. I won't always be here

golden ridge Sep 24, 2023, 11:24 AM

#

flipper length mm??

golden ridge Sep 24, 2023, 11:25 AM

#

past meteor Could you make it a histogram, something like this?

i can see 0 2 4 6 8 10 12

past meteor Sep 24, 2023, 11:26 AM

#

I'll leave you on your own for now with your homework, good luck! (I'm not trying to be annoying, it's for your best interest)

golden ridge Sep 24, 2023, 11:27 AM

#

past meteor I'll leave you on your own for now with your homework, good luck! (I'm not tryin...

bro this is not good for me

shut wren Sep 24, 2023, 1:55 PM

#

hello can anyone help me with a bug with my project

#

its a pre-trained image classifier to identify dog breeds

#

i just hv one small problem i can't fix

#

if interested dm asap please i really need to complete this project

nimble hawk Sep 24, 2023, 2:08 PM

#

Hello everyone,
I share data science tutorials regularly every week on YouTube and I wanted to share the playlists I've created. If you are learning about data analysis, data science and machine learning, I have plenty of videos that can help you on this journey.

Data science projects playlist ->https://youtube.com/playlist?list=PLTsu3dft3CWg69zbIVUQtFSRx_UV80OOg&si=cLljLTBYA9c48Bys
This playlist contains my end-to-end data science projects which I provide with the datasets i use.

I share courses on my channel too, the PySpark course that I share on my channel -> https://www.youtube.com/watch?v=jWZ9K1agm5Y&t=2628s

My channel for more -> https://www.youtube.com/@onurbltc

Thanks for reading, have a great day!


left tartan Sep 24, 2023, 2:10 PM

#

shut wren hello can anyone help me with a bug with my project

Maybe post a help thread, and paste a link here? It’s hard to know who can help without knowing the problem. #❓｜how-to-get-help . Threads can be hit or miss if it’s a very advanced topic.

shut wren Sep 24, 2023, 2:29 PM

#

oh alright

shut wren Sep 24, 2023, 3:10 PM

#

i dont actually have a link for it

#

i was looking forward to getting on like a voice call to share my screen or smt

#

its from a course online thats why

verbal oar Sep 24, 2023, 4:46 PM

#

can someone recommend course/book for deep learning for computer vision?
I'm interested mainly in CNN

prisma hinge Sep 24, 2023, 5:18 PM

#

earnest wren Looks like it might be an issue of black and white vs. colour images. You can im...

Thanks for the advice. I have experienced another issue upon resizing my data to fit the model.

median fulcrum Sep 24, 2023, 5:19 PM

#

What would you guys do to recognize the position in an online chess board image and return a FEN? Would you use train, models or coordinates in the board?

desert oar Sep 24, 2023, 5:58 PM

#

golden ridge thats what i used lol, but would linear regression work for non linean model as...

if you do a nonlinear transformation of the data, you can fit the transformed data with linear regression, eg. y = az + b where z = 1/x
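
#

a tiny sketch of that trick with synthetic data, assuming the true relationship really is y = a/x + b (the numbers are made up, and real lap times would of course be noisy):

```python
import numpy as np

# Synthetic data generated from y = a / x + b with a = 3, b = 2.
x = np.array([1.0, 2.0, 4.0, 5.0, 10.0])
y = 3.0 / x + 2.0

# Nonlinear transformation of the input; the model is linear in z.
z = 1.0 / x

# Ordinary least-squares straight line y = a * z + b.
# np.polyfit returns coefficients from highest degree to lowest.
a, b = np.polyfit(z, y, deg=1)

print(a, b)
```

the model is still "linear regression" because it is linear in the coefficients a and b, even though the fitted curve is not a straight line in x.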

desert oar Sep 24, 2023, 6:06 PM

#

median fulcrum What would you guys do to recognize the position In a online chess board image a...

can you clarify the question? what do you mean by "train" and "models" in this case?

#

is it an image of a physical chessboard, or a computer generated 2d image or something else?

desert oar Sep 24, 2023, 6:08 PM

#

prisma hinge Thanks for the advice. I have experienced another issue upon resizing my data to...

please post code and errors as text and not screenshots. discord has support for formatting text as code with syntax highlighting, and we also have a separate paste site for posting longer sections: https://paste.pythondiscord.com

#

but the error is that collections.Iterable was removed in recent versions of python, you need collections.abc.Iterable. it's essential to practice understanding and working through error messages on your own, it is a critical skill
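
#

for reference, a minimal sketch of the fixed import (the helper name `is_iterable` is just for illustration). if the failing line lives inside a third-party package rather than your own code, upgrading that package is usually the cleaner fix than editing it:

```python
# The ABC aliases in `collections` were removed in Python 3.10;
# `collections.abc` works on both old and new Python 3 versions.
from collections.abc import Iterable


def is_iterable(obj):
    """Return True if obj supports iteration according to the Iterable ABC."""
    return isinstance(obj, Iterable)


print(is_iterable([1, 2, 3]), is_iterable(3))
```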

median fulcrum Sep 24, 2023, 11:15 PM

#

desert oar can you clarify the question? what do you mean by "train" and "models" in this c...

I don't know if I translated In my head correctly. But I mean using a machine learning model to be trained and then recognize chess pieces and the position In a chess board

#

Would you have an idea?

#

I've already done image recognition

#

individually

#

how could I do in the chess board

#

with many pieces

#

and

#

I have to identify which square each piece is on

#

already done

left tartan Sep 24, 2023, 11:27 PM

#

median fulcrum how could I do in the chess board

Is this a picture of a chess board? Like a photo?

#

Or a screenshot/whatever of a chess board where the positions are fixed?

median fulcrum Sep 25, 2023, 12:42 AM

#

left tartan Or a screenshot/whatever of a chess board where the positions are fixed?

screenshot

#

online chess board

left tartan Sep 25, 2023, 12:44 AM

#

median fulcrum screenshot

So can you decompose the problem to square level recognition? Instead of recognizing the board, just recognizing each square?
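
#

a rough sketch of what the square-level approach could look like once each square has been classified. it assumes a perfectly cropped square screenshot with white at the bottom, and a per-square classifier that is not shown here and is purely hypothetical. both function names are made up, and the result is only the piece placement field of a FEN, since side to move, castling rights etc. can't be read off a screenshot:

```python
def square_boxes(board_size_px):
    """Pixel boxes (left, top, right, bottom) for all 64 squares, rank 8 first."""
    side = board_size_px // 8
    return [
        [(col * side, row * side, (col + 1) * side, (row + 1) * side) for col in range(8)]
        for row in range(8)
    ]


def grid_to_fen_placement(grid):
    """Build the FEN piece placement field from an 8x8 grid.

    grid[0] is rank 8 and grid[7] is rank 1; each cell is a FEN piece
    letter such as "K" or "p", or None for an empty square.
    """
    ranks = []
    for row in grid:
        text, empty = "", 0
        for cell in row:
            if cell is None:
                empty += 1            # count consecutive empty squares
                continue
            if empty:
                text += str(empty)    # flush the empty run before a piece
                empty = 0
            text += cell
        if empty:
            text += str(empty)        # flush a trailing empty run
        ranks.append(text)
    return "/".join(ranks)


# Starting position as the (hypothetical) classifier output.
start = (
    [list("rnbqkbnr"), list("pppppppp")]
    + [[None] * 8 for _ in range(4)]
    + [list("PPPPPPPP"), list("RNBQKBNR")]
)
print(grid_to_fen_placement(start))
```

for arbitrary images (not a clean crop) you would first need a separate step that finds the board and warps it to a square, and only then cut it into squares like this.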

median fulcrum Sep 25, 2023, 12:45 AM

#

left tartan So can you decompose the problem to square level recognition? Instead of recogni...

Wdym? I think the only way to find where each piece is, and the position, is by dividing the board into squares

#

But I want opinions on which is the best way to do it

#

for example, find an online chess board in any image and try to use it

#

not just this perfect cropped screenshot

sharp sierra Sep 25, 2023, 2:18 AM

#

import math

import neat

# Fitness algorithm
def get_fitness(area):
    return (1000 * ((GRAY_AREA_TOTAL - area) ** 4)) / (GRAY_AREA_TOTAL ** 4)

# Generate segments
def generate_segments(neural_network):
    segments = []
    total_length = 0
    
    # Use the neural network to produce a flat list of segment parameters
    output = neural_network.activate([1])
    
    # Interpret the output in groups of four: (x_start, y_start, theta, length)
    for i in range(0, len(output), 4):
        x_start, y_start, theta, length = output[i:i+4]
        x_start = (x_start + 1) * X_SCALING_FACTOR + BORDER[0]
        y_start = (y_start + 1) * Y_SCALING_FACTOR + BORDER[2]
        length += 1
        total_length += length
        if total_length >= MAX_LENGTH:
            total_length -= length
            length = MAX_LENGTH - total_length
            x_end = x_start + length * math.cos(theta * math.pi)
            y_end = y_start + length * math.sin(theta * math.pi)

            segments.append(((x_start, y_start), (x_end, y_end)))
            return segments
        x_end = x_start + length * math.cos(theta * math.pi)
        y_end = y_start + length * math.sin(theta * math.pi)

        segments.append(((x_start, y_start), (x_end, y_end)))
    return segments

#

def evaluate_genome(genomes, config):
    nets = []
    sets = []

    # Create a neural network from the genome
    for id, g in genomes:
        net = neat.nn.FeedForwardNetwork.create(g, config)
        nets.append(net)
        g.fitness = 0

    # Get the segments NEAT generates
    for net in nets:
        set = generate_segments(net)
        sets.append(set)

    # Implement fitness
    for i, segments in enumerate(sets):
        # Calculate the remaining areas
        area_original = calculate_valid_area(segments)
        area_flipped = calculate_valid_area(flip(segments))
        area_total = area_original + area_flipped

        # Get the fitness of the segments
        fitness = get_fitness(area_total)

        genomes[i][1].fitness = fitness

def run_neat(config, gen_count):
    # Create the NEAT population
    population = neat.Population(config)
    
    # Add a reporter to monitor progress (optional)
    reporter = neat.StdOutReporter(True)
    population.add_reporter(reporter)
    stats = neat.StatisticsReporter()
    population.add_reporter(stats)
    
    # Run the NEAT algorithm
    winner = population.run(evaluate_genome, gen_count)  # Specify the number of generations

    # Retrieve the best genome (neural network)
    best_genome = winner

    return best_genome

###
# RUN
###

# Set configuration file
config_path = "./config-feedforward.txt"
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation, config_path)

run_neat(config, 10)

#

I want to get the output generated from generate_segments() that performed the best once the simulation is finished. How can i do this?

(I am using NEAT-python)

sharp sierra Sep 25, 2023, 3:03 AM

#

https://paste.pythondiscord.com/KSXA

#

^full script

prisma hinge Sep 25, 2023, 3:21 AM

#

desert oar please post code and errors as text and not screenshots. discord has support for...

Apologies for the formatting. I have tried troubleshooting and ended up trying to tackle this another way. Here is my paste: https://paste.pythondiscord.com/ZZWA

past meteor Sep 25, 2023, 4:54 AM

#

sharp sierra ```py def evaluate_genome(genomes, config): nets = [] sets = [] # C...

You'd have to see if population.run allows you to return more than the best genome. Otherwise you can have evaluate_genome update some mutable global (e.g., a dict that stores the best fitness per generation together with its segments), but that's messy imo.
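
One option that avoids a global, sketched under assumptions: in the posted script `population.run` returns the winning genome, and `generate_segments` always activates the network with the same input `[1]`, so rebuilding the network from the winner and calling `generate_segments` again should reproduce its segments. The helper name `best_segments` is hypothetical, and the two callables are passed in so the sketch stands on its own:

```python
def best_segments(winner, config, create_network, generate_segments):
    """Rebuild the winning network and regenerate its segments.

    create_network:    callable (genome, config) -> network
    generate_segments: callable (network) -> list of segments
    """
    net = create_network(winner, config)
    return generate_segments(net)


# Hypothetical usage inside the script above (not run here):
#   winner = run_neat(config, 10)
#   segments = best_segments(winner, config,
#                            neat.nn.FeedForwardNetwork.create,
#                            generate_segments)
```

This only works because the network's output does not depend on any per-evaluation state; if the input to `activate` ever changes between evaluations, the segments would have to be stored during evaluation instead.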

lapis sequoia Sep 25, 2023, 5:55 AM

#

sharp sierra ```py # Fitness algorithm def get_fitness(area): return (1000 * ((GRAY_AREA_...

Looks like an interesting project what r u doing

sharp sierra Sep 25, 2023, 6:04 AM

#

lapis sequoia Looks like an interesting project what r u doing

Trying to build a program around the opaque set problem. An opaque set is a set (black lines in the right plot) that every line drawn through the area (gray square in the right plot) has to cross at least once

#

the first 2 plots show the fitness. it's the area in grey that isn't covered by either red or blue. The NN is attempting to draw lines, with a strict limit on their combined length, that meet the conditions

#

m and b represent the values in y = mx + b for which, in red or blue, the line goes through a segment in the set, or in gray, the line goes through the target area

#

https://en.wikipedia.org/wiki/Opaque_set

Opaque set

In discrete geometry, an opaque set is a system of curves or other set in the plane that blocks all lines of sight across a polygon, circle, or other shape. Opaque sets have also been called barriers, beam detectors, opaque covers, or (in cases where they have the form of a forest of line segments or other curves) opaque forests. Opaque sets wer...

#

im super new to working with neural networks tho so im iffy on whether its not doing anything other than random selection at this point lmaooo

rose agate Sep 25, 2023, 9:03 AM

#

does anyone know how to work with spatial autocorrelation? I have data spaced evenly every 10 meters measuring the change in a condition, and I want to know how each point relates to the surrounding points. Most of what I find when searching for partial autocorrelation is about time series, which seems a little different, so I'm not sure how to start

#

Partial-Autocorrelation-Plot-of-the-Minimum-Daily-Temperatures-Dataset.png

#

e.g. a partial autocorrelation for a time-series

lapis sequoia Sep 25, 2023, 2:17 PM

#

sharp sierra Trying to build a program around the opaque set problem. The set (black lines in...

Have u ever tried arithmetic operations with neural nets

wheat fox Sep 25, 2023, 2:21 PM

#

Anyone familiar with gephi

versed gulch Sep 25, 2023, 3:44 PM

#

Hi I have a 3D array which is a 3D image composed of 2D slices, is there a way I can rotate my 3D array in the x-z plane and the y-z plane?

serene scaffold Sep 25, 2023, 3:57 PM

#

versed gulch Hi I have a 3D array which is a 3D image composed of 2D slices, is there a way I...

I don't know exactly which function would be right for you, but there's this one for rotating arrays https://numpy.org/doc/stable/reference/generated/numpy.rot90.html

#

there's also flip, if that one isn't right.
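
A small sketch of how `np.rot90` could be pointed at a specific plane via its `axes` argument. It assumes the volume is indexed as (z, y, x), i.e. the 2D slices are stacked along axis 0; if your layout differs, the axis pairs change accordingly:

```python
import numpy as np

# Toy volume: 2 slices (z), each 3 rows (y) by 4 columns (x).
volume = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Rotate by 90 degrees in the x-z plane: axes 0 (z) and 2 (x) are swapped.
rot_xz = np.rot90(volume, k=1, axes=(0, 2))

# Rotate by 90 degrees in the y-z plane: axes 0 (z) and 1 (y) are swapped.
rot_yz = np.rot90(volume, k=1, axes=(0, 1))

print(rot_xz.shape, rot_yz.shape)
```

`np.rot90` only handles multiples of 90 degrees. For arbitrary angles, `scipy.ndimage.rotate` also takes an `axes` pair, but it interpolates, so the voxel values change slightly.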

ember pawn Sep 25, 2023, 3:57 PM

#

hi , i wanted to ask about maths part of ai

#

is there any reference book ?

serene scaffold Sep 25, 2023, 3:58 PM

#

ember pawn is there any reference book ?

https://mml-book.github.io/

Mathematics for Machine Learning

ember pawn Sep 25, 2023, 4:01 PM

#

thanks

unborn saddle Sep 25, 2023, 4:47 PM

#

Hey anyone interested in developing AI
Can join me I'm going to develop AI

We can build and learn together 😀

serene scaffold Sep 25, 2023, 5:31 PM

#

lol @past meteor I linked that because you link it

past meteor Sep 25, 2023, 5:32 PM

#

Hahaha so I'm just giving myself a pat on the back 🤣

serene scaffold Sep 25, 2023, 5:32 PM

#

you deserve it 👍

left tartan Sep 25, 2023, 5:50 PM

#

past meteor Hahaha so I'm just giving myself a pat on the back 🤣

I have your message bookmarked to give to new folks, I think we should pin it or something similar: #data-science-and-ml message

#

So pat yourself on the back with both hands 🙂

serene scaffold Sep 25, 2023, 6:05 PM

#

left tartan I have your message bookmarked to give to new folks, I think we should pin it or...

by the power vested in me by lemon, so shall it be.

sharp sierra Sep 25, 2023, 6:08 PM

#

lapis sequoia Have u ever tried arithmetic operations with neural nets

No I haven’t, what would that do?

tacit basin Sep 25, 2023, 7:57 PM

#

do you have any recs for libs / tools for data anonymization?

tacit basin Sep 25, 2023, 8:27 PM

#

also anyone could recommend books / resources for learning AI with C++? (asking for a friend 🙂 )

lapis sequoia Sep 26, 2023, 12:13 AM

#

sharp sierra No I haven’t, what would that do?

So that it can Develop reasoning skills

steep shadow Sep 26, 2023, 12:23 AM

#

does anyone have any ideas about how i can begin learning ai in python (neural networks and like maybe image classification). I already have a good basis in python. I also don't want to spend money on resources and am looking for good free resources.

left tartan Sep 26, 2023, 12:33 AM

#

steep shadow does anyone have any ideas about how i can begin learning ai in python (neural n...

CS50 for AI is a pretty good intro, if you already know Python.

#

Or kaggle.com/learn

lapis sequoia Sep 26, 2023, 12:39 AM

#

steep shadow does anyone have any ideas about how i can begin learning ai in python (neural n...

I have a subreddit..r/mlscholar.. go to it's wiki page for all the resources

median fulcrum Sep 26, 2023, 1:52 AM

#

median fulcrum not just this perfect cropped screenshot

someone have an idea?

tacit basin Sep 26, 2023, 2:51 AM

#

steep shadow does anyone have any ideas about how i can begin learning ai in python (neural n...

course.fast.ai

coral bridge Sep 26, 2023, 8:38 AM

#

guys i need advice on how to segment English words by rules

golden canyon Sep 26, 2023, 8:47 AM

#

hey guys, how do you prepare for the programming part of the interview? Leetcode?

#

I am applying for a graduate position

left tartan Sep 26, 2023, 11:00 AM

#

golden canyon hey guys, how do you prepare for the programming part of the interview? Leetcode...

Probably a better question for #career-advice, please share a little more information about your background and the position you're interviewing for.

stark bay Sep 26, 2023, 2:27 PM

#

Hi... i want to ask AI and DS engineers a question... what is ur view on how AI art is created and whether it violates artists' copyright... some artists go as far as to claim that bots built on such models use stolen images and art taken from artists without their consent, so not just the bot but also the dev behind it is responsible... and monetizing them is even worse... i wanna ask where i can find an appropriate library/dataset to build such a model that doesn't violate such rights...
Previously i have been using kaggle for such datasets... is there a dataset available that avoids this problem?

desert oar Sep 26, 2023, 2:51 PM

#

stark bay Hi... i want to ask a question from AI and DS engineers... what is ur view on ho...

you might have to look at the details of how each dataset was constructed/gathered

stark bay Sep 26, 2023, 3:05 PM

#

So are their claims legitimate? And if yes, then why have devs chosen to build bots/generators like that even if they can be illegal?

weak mortar Sep 26, 2023, 4:41 PM

#

im gonna play around with falcon today, now i cant really find any good info comparing 180b with 40b. which would you recommend me to go with at first? is 180b significantly more power hungry or harder to use?

#

can i even run it on a normal computer? it says somewhere 40b requires 90 gigabyte of gpu memory.....

abstract wasp Sep 26, 2023, 5:04 PM

#

Has anyone here heard of Abacus.ai before?

past meteor Sep 26, 2023, 5:09 PM

#

stark bay Hi... i want to ask a question from AI and DS engineers... what is ur view on ho...

These questions are above the paygrade of 9/10 engineers and make more sense to ask a legal / arts / philosophy person 😄

stark bay Sep 26, 2023, 5:14 PM

#

past meteor These questions are above the paygrade of 9/10 *engineers* and make more sense t...

Hahah... surely.. the point is, if they're right, is there any data available that isn't subject to such copyright claims... and if their claim is not correct then how dare they insult us... either they didn't foresee this future and are now crying over spilled milk, or we misunderstood what such a thing could bring into our lives... usually when i asked other fellows, they said they don't care... since art can be subjective lemon_sweat

past meteor Sep 26, 2023, 5:16 PM

#

I don't really know what you mean. At work we have a legal dept. if I want to know something I ask them

#

The AI art debate has so much nuance that most engineers (maybe I'm just projecting) don't have, it's a legal/philosophy matter I think. I can voice an opinion but it's likely going to be bad 😄

stark bay Sep 26, 2023, 5:20 PM

#

It is fine for me... ur bad can be mine good and vice versa as well... but after having such a debate with such fellows i am now in a shock to even start my own ML model on such a thing... or not...

#

So for me...it is kinda a guiding light now

agile cobalt Sep 26, 2023, 5:34 PM

#

stark bay It is fine for me... ur bad can be mine good and vice versa as well... but after...

...uh, sorry but do you have the slightest idea of how much it costs to train a model like Stable Diffusion?

stark bay Sep 26, 2023, 5:35 PM

#

Not at all

agile cobalt Sep 26, 2023, 5:35 PM

#

over half a million dollars

stark bay Sep 26, 2023, 5:36 PM

#

I see...

#

Thats really high

agile cobalt Sep 26, 2023, 5:37 PM

#

you can train a mini scale diffusion based image generation model that generates something like 32x32 grayscale images for 10 types of objects, but training something that generates high quality images for almost anything you can imagine requires an absurd amount of compute

#

which is why you pretty much only ever see giant corporations training their own models

#

there are a lot of different ways to customize these models though, for example a bunch of people fine tune Stable Diffusion to work better on generating specific kinds of images

stark bay Sep 26, 2023, 5:39 PM

#

Yeah i figured... i dont wanna create a model like stable diffusion or Dall E... i just wanna create something smaller as a proof of concept... which wont require data from such artists... i really like the concept of qr code art so i wanna create a small model like that

agile cobalt Sep 26, 2023, 5:41 PM

#

the "qr code" part aside, getting a model good enough to generate something people might recognise as "art" is already insanely difficult as-is

#

iirc the current methods to generate it are mostly using control net to guide stable diffusion, you might want to look into these two in detail first if you haven't yet?
(control net and stable diffusion)

stark bay Sep 26, 2023, 5:44 PM

#

Oka... thanks for the help

#

I am just starting this field again so i thought to check this out as well... the imaging part...

weak mortar Sep 26, 2023, 7:11 PM

#

weak mortar im gonna play around with falcon today, now i cant really find any good info com...

lol okay. you would need like 360 gigabyte of vram for it to run optimally. think i'll just go ahead with the 7B which has a lot lower system requirements. eager to hear if anyone tried to run 40b on their system and how the performance was
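
A back-of-the-envelope sketch of where numbers like these come from, assuming the figure is dominated by the weights stored at 2 bytes per parameter (fp16/bf16). It ignores activations, the KV cache, and framework overhead, so real requirements are somewhat higher; the function name is made up for illustration:

```python
def weight_memory_gb(n_params_billion, bytes_per_param=2):
    """Approximate memory for the weights alone, in GB.

    1e9 parameters * bytes_per_param bytes / 1e9 bytes per GB
    simplifies to n_params_billion * bytes_per_param.
    """
    return n_params_billion * bytes_per_param


for name, billions in [("falcon-7b", 7), ("falcon-40b", 40), ("falcon-180b", 180)]:
    fp16 = weight_memory_gb(billions)                        # 16-bit weights
    int4 = weight_memory_gb(billions, bytes_per_param=0.5)   # 4-bit quantized
    print(f"{name}: ~{fp16} GB at 16-bit, ~{int4} GB at 4-bit")
```

Under this estimate the 180B model needs about 360 GB and the 40B model about 80 GB at 16-bit, which is in line with the roughly 90 GB quoted for 40B once overhead is added. Quantized variants shrink these numbers considerably.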

languid prairie Sep 26, 2023, 8:00 PM

#

Hi guys 👋🏻

For generating a synthetic dataset from financial PDFs:
I want to do query generation.
For that, I'm thinking about using a pre-trained language model (such as GPT-3, GPT-4, or other LLMs) to generate queries based on the content of each text chunk.

But the problem with using OpenAI's GPT models is that I would need access to the OpenAI API and would have to set up an API key.

Do u think I can use LlamaIndex instead to generate these queries?

desert oar Sep 26, 2023, 9:03 PM

#

languid prairie Hi guys 👋🏻 For generating a synthetic dataset from financial PDFs : I' want...

what kind of synthetic dataset? i don't have an answer, but i'm trying to stay somewhat up-to-date on all the LLM hype and i'm curious what task you have in mind

languid prairie Sep 26, 2023, 9:11 PM

#

It’s just for the questions and answers generated by GPT

#

From a pdf file

#

Like u can ask questions about the content of the PDF

#

but I want to generate those questions through another method in order to avoid using OPEN AI

#

‘Cause I don’t have API KEY / no budget for that

#

So I was wondering if I can do through Llamaindex model

desert oar Sep 26, 2023, 9:19 PM

#

what do you mean by generate queries though?

#

or are you talking about fine-tuning a model using some text data that you have?

left tartan Sep 26, 2023, 10:26 PM

#

This seems more of an NLP question; you want to query meaning from (presumably) 10-k’s and q’s?

desert oar Sep 26, 2023, 11:07 PM

#

i mean, querying from a corpus of documents seems to be one of the really strong use cases for fine tuning one of these open models

#

ive seen it mentioned a handful of times now, that llama performs pretty well when fine tuned

#

i have zero personal experience with it though

left tartan Sep 26, 2023, 11:11 PM

#

I downloaded llama, just need to finally get around to trying it. Halfway there, I guess 🙂

past meteor Sep 27, 2023, 5:35 AM

#

desert oar what kind of synthetic dataset? i don't have an answer, but i'm trying to stay s...

Are you? For me it feels like a wild goose chase 🤣

#

I took many NN based courses in uni and ultimately a lot of my projects have been time series so there's overlap in methods but LLMs specifically are a very specific niche. I have 0 FOMO at the "next big thing" coming out every other week, once the hype settles down a little bit I'll catch up.

willow quest Sep 27, 2023, 12:31 PM

#

for Pandas: does anyone know how to get a datetime64[ns] to work with pd.cut()? it apparently supports datetime64 but not [ns]. I'm just trying to bin the datetime into months

jaunty helm Sep 27, 2023, 1:00 PM

#

How should I deal with related features?

Example: I want to predict housing prices, and two features I have are distance to nearest school and nearest school type(as in, say elementary, middle, high)

I could keep them separate, but intuition tells me that I could "combine" them somehow, or there was some way to inform the model that these two are related, which could yield better results

serene scaffold Sep 27, 2023, 1:12 PM

#

willow quest for Pandas: does anyone know how to get a datetime64[ns] to work with pd.cut()? ...

do you mean that you want to round timestamps down to the start of their month? so 2023-9-27 T 09:11:25 becomes 2023-09-01 T 00:00:00?

#

or are you trying to group the dataframe by year/month for some subsequent operation?

#

(ie, group by year/month so that days in March 2020 aren't in the same group as March 2023)

willow quest Sep 27, 2023, 1:16 PM

#

it's just data from the past few months, trying to bin the entries by month. By now we've found a workaround by making the .index the datetime and then do .index.month. but I was kinda expecting pd.cut() to be able to do this type of binning, considering I saw the following:

df['day_bin'] = pd.cut(df['date'], bins='1D')

#

so my assumption was it would be able to also handle bins='1M' 🤷‍♂️

serene scaffold Sep 27, 2023, 1:41 PM

#

willow quest it's just data from the past few months, trying to bin the entries by month. By ...

it looks like pd.cut returns a tuple of two values

#

are you sure that what you're trying to do is neither of the two options I gave? I've never used pd.cut before, and I'm trying to understand what you are doing.

#

I don't know precisely what it means to "bin the entries by month", and that's what I'm trying to understand.

willow quest Sep 27, 2023, 1:46 PM

#

every entry in my df has a column with a datetime64[ns] dtype. I'm just trying to group the entries of the same month together, so all entries with the month 'May' get in a bin, 'June' their own bin, etc.

#

just binning

#

https://pandas.pydata.org/pandas-docs/stable/whatsnew/v0.20.0.html#other-enhancements

pd.cut and pd.qcut now support datetime64 and timedelta64 dtypes (GH 14714, GH 14798)

#

i technically got what i want now, such that all entries are grouped by the respective month in the datetime column. I was just thinking that pd.cut could make it easy

serene scaffold Sep 27, 2023, 2:01 PM

#

willow quest i technically got what i want now, such that all entries are grouped by the resp...

Sounds like you solved it anyway. For your awareness, dataframes have a groupby method for doing operations on groups, so that's what people will think you mean if you talk about creating groups. It sounds like you wanted to create a new column where the value represents a group that rows belong to.
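
For anyone landing here later, a sketch of two ways to get month bins, using a made-up toy dataframe with the `date` column name from the message above. As far as I know, `pd.cut` expects `bins` to be an integer, a sequence of edges, or an `IntervalIndex` rather than a frequency string, so the month edges are built explicitly here:

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime([
        "2023-05-03 10:00", "2023-05-20 08:30",
        "2023-06-01 00:00", "2023-07-15 12:45",
    ]),
    "value": [1, 2, 3, 4],
})

# Option 1: label every row with its calendar month (year is kept too).
df["month_bin"] = df["date"].dt.to_period("M")

# Option 2: pd.cut with explicit month-start edges; right=False makes each
# interval include its first instant and exclude the start of the next month.
edges = pd.date_range("2023-05-01", "2023-08-01", freq="MS")
df["month_interval"] = pd.cut(df["date"], bins=edges, right=False)

# Either column can then drive a groupby.
totals = df.groupby("month_bin")["value"].sum()
print(totals)
```

Unlike `.index.month`, both options keep the year, so entries from the same month in different years do not end up in one bin.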

small wedge Sep 27, 2023, 3:10 PM

#

jaunty helm How should I deal with related features? Example: I want to predict housing pr...

Hmm I'm not sure how much the model will gain from combining these features but there are definitely ways you could try. Say those are your only two features and you are onehot encoding the nearest school type. Instead of using one's in your encoding you could use a distance from the nearest school; i.e. [2.5,0,0] is an elementary school that is 2.5 miles away where [.5, 0, 0] would be one that's .5 miles away.

jaunty helm Sep 27, 2023, 3:13 PM

#

Ah, so something like

```
elementary  |  middle  |  high
------------------------------
5.11           0          0
0              3.2        0
0              0          0
```
Instead of only `1` and `0` the value is now the distance

small wedge Sep 27, 2023, 3:14 PM

#

Yeah, although you might have to fiddle with the values there, normalizing them and/or making the distance inversely proportional to the size of the number if being close would raise housing prices, etc.
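
A minimal sketch of both variants, assuming three school types and distances in miles. The function names and the `1 / (1 + distance)` closeness transform are illustrative choices, not the only reasonable ones:

```python
SCHOOL_TYPES = ["elementary", "middle", "high"]


def encode_distance(school_type, distance):
    """One-hot slot for the nearest school's type, holding the raw distance."""
    return [distance if t == school_type else 0.0 for t in SCHOOL_TYPES]


def encode_closeness(school_type, distance):
    """Same layout, but larger values mean closer: 1 / (1 + distance).

    A school at distance 0 gives 1.0, and the value shrinks toward 0 as the
    school gets farther away, so 0 reads as "nothing of this type nearby".
    """
    closeness = 1.0 / (1.0 + distance)
    return [closeness if t == school_type else 0.0 for t in SCHOOL_TYPES]


print(encode_distance("middle", 3.2))
print(encode_closeness("high", 1.0))
```

With the closeness version, the zeros in the other columns are consistent with "no such school nearby", which avoids having to pick an arbitrary huge distance for missing types.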

desert oar Sep 27, 2023, 3:15 PM

#

past meteor Are you? For me it feels like a wild goose chase 🤣

completely a wild goose chase, i'm just trying to stay loosely aware of new use cases, meanwhile i educate myself on the fundamentals of LLMs and the transformer/attention architectures

jaunty helm Sep 27, 2023, 3:16 PM

#

small wedge Yeah, although you might have to fiddle with the values there, normalizing them ...

Right
I think I'll just make the 0s (which should indicate no schools nearby) some huge distance
TY for your help!

past meteor Sep 27, 2023, 3:27 PM

#

desert oar completely a wild goose chase, i'm just trying to stay loosely aware of new use ...

Small soapbox but I think only time will tell which new use cases were viable or not. There's a lot of strange stuff going on. For instance, we had a rejected research proposal and my boss was like "Okay we'll go again, let's just make sure it contains AI, VR or XR" 💀 .

desert oar Sep 27, 2023, 3:36 PM

#

past meteor Small soapbox but I think only time will tell what use cases will tell what new ...

lol

#

that's the way of the world

past meteor Sep 27, 2023, 3:38 PM

#

sad but very true

weak mortar Sep 27, 2023, 4:00 PM

#

Which of the popular open sauce gpt LLMs would you suggest for a system with 2x3060Ti (24gb vram + 128gb system ram)?

#

I came to realize falcon 180b that i was eyeing is way out of league, maybe even 40b is. 7b just sounds so low in comparison

desert oar Sep 27, 2023, 4:13 PM

#

weak mortar Which of the popular open sauce gpt LLMs would you suggest for a system with 2x3...

are you training or just running inference? for inference, maybe there are distilled versions that run on consumer hardware

desert oar Sep 27, 2023, 4:36 PM

#

jaunty helm How should I deal with related features? Example: I want to predict housing pr...

i don't see why you'd want to combine these. it's OK if they're "related", as long as they're not identical or nearly-identical