#data-science-and-ml


twilit tundra
#

So problem solved?

umbral charm
#

Im still new to this library

#

numpy so much easier

twilit tundra
#

You're welcome, np

umbral charm
twilit tundra
#

df[col] is the way to select a subseries inside of the dataframe
df[[cols]] is the way to select a subdataframe
In the first case, the argument is a string, in the second case it's a list
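
A minimal sketch of that distinction, using a made-up three-column frame (the column names `a`, `b`, `c` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

col = df["a"]          # string key -> a single column as a Series
sub = df[["a", "b"]]   # list key -> a sub-DataFrame with those columns
one = df[["a"]]        # one-element list -> still a DataFrame, not a Series

print(type(col).__name__, type(sub).__name__, type(one).__name__)
```

The one-element list case is the one that tends to surprise people: the type of the result follows the type of the key, not the number of columns selected.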

void veldt
# umbral charm numpy so much easier

same. I hate pandas with a passion. It appears to be the standard people use, but I try to avoid it at all costs and just use numpy or just straight python if possible

left tartan
umbral charm
lapis sequoia
tidal bough
void veldt
# lapis sequoia interesting, i find pandas to be the only thing I enjoy doing in python.

it just comes across to me as unintuitive and I don't really see it offering any advantage over say numpy. Everything pandas does in terms of data organization and modification can easily be done using just python (no dependencies required), and while various libraries like numpy, scipy, matplotlib, etc. work well together, the same cannot be said for pandas.

tidal bough
#

i mean, you can replicate pandas's capabilities in numpy, but with quite a lot of effort - either structured arrays, or an array per column. Also, the moment you want a groupby, problems will start.
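
To make the groupby point concrete, here is a rough comparison on toy data (the `keys`/`vals` arrays are invented for the example). The numpy version works, but it is the kind of bookkeeping pandas does for you:

```python
import numpy as np
import pandas as pd

keys = np.array(["a", "b", "a", "c", "b", "a"])
vals = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# pandas: one call
pd_means = pd.DataFrame({"key": keys, "val": vals}).groupby("key")["val"].mean()

# numpy: map each key to an integer code, then sum and count per code
uniq, inverse = np.unique(keys, return_inverse=True)
sums = np.bincount(inverse, weights=vals)
counts = np.bincount(inverse)
np_means = sums / counts

print(pd_means)
print(dict(zip(uniq, np_means)))
```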

#

(people who dislike pandas might want to take a look at polars, though - it is similar but makes some different choices, like not having indexes)

umbral charm
void veldt
#

I personally prefer how matlab organizes and treats data. Love its syntax and usage with linear algebra

umbral charm
twilit tundra
#

Having worked almost exclusively with pandas for more than a year, I can say that it's very natural to me and it has so many capabilities

#

It's not made for linear algebra, just data analysis and transformation

fleet granite
#

I am facing an error in my code. can anyone help me to remove this error?

umbral charm
fleet granite
left tartan
#

I like numpy for numpy stuff and sql for everything else: pandas angers me because it is less capable than both. (I’m conflating sql and dbs but you get the idea)

desert oar
#

I literally can't ask for help with my code when smth doesn't work, cus it's such a mess
are you sure? if your code is messy but functional, then it should be fine

desert oar
# left tartan I like numpy for numpy stuff and sql for everything else: pandas angers me becau...

pandas is literally numpy internally, the things that you can do with a typical pandas dataframe are mostly a superset of what you can do with the underlying numpy array (including accessing the underlying array if needed). if the "data frame" concept isn't useful for you then you don't need to use it. but it has a long successful history among statisticians and other data analysts in python, as well as R for many years before pandas came out

desert oar
#

pandas does have a few big design flaws however, e.g. you can write df[thing] for boolean masks as well

#

that is, sometimes df[thing] is df.loc[:, thing] and sometimes it's df.loc[thing, :] depending on what thing is. imo that's bad design, and df[thing] should be reserved for only one of those cases. personally i always use it for the former and flatly reject any code that uses it for the latter.
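
A small illustration of the two meanings, on a made-up frame. The same bracket syntax selects a column when given a label and selects rows when given a boolean mask; the `.loc` spellings make the intent explicit:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

by_label = df["a"]            # column selection, same as df.loc[:, "a"]
mask = df["a"] > 1
by_mask = df[mask]            # row selection, same as df.loc[mask, :]

explicit_cols = df.loc[:, "a"]
explicit_rows = df.loc[mask, :]

print(by_label.equals(explicit_cols), by_mask.equals(explicit_rows))
```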

umbral charm
desert oar
#

the docs honestly aren't great. that's the worst part.

#

the reference docs are good, but the "howto" material is kind of chaotic.

#

it's somewhat easier if you already know data frames from R

umbral charm
#

Aight well Corey Schafer it is

desert oar
#

who?

fresh harbor
#

ort.InferenceSession.get_inputs() equivalent in cv2.dnn.Net?

umbral charm
#

??

#

That youtuber who teaches python

#

Hes really good

desert oar
#

i don't recommend learning programming from youtube in general

umbral charm
#

Does matplotlib numpy django and pandas

desert oar
#

he might be good, but my overall trust in "youtube programming educators" is very low

umbral charm
#

Eh if you think that, i learnt all my file handling and matplotlib from him

desert oar
#

in any case we have some very experienced pandas users here. feel free to ask or search stackoverflow

umbral charm
#

my teachers hate stack overflow

desert oar
#

why?

twilit tundra
umbral charm
#

Idk i think they think of it kind of like the wikipedia for programming

#

it just blurts out the answer with no explanation

#

but i learn it myself if i really have to

desert oar
# twilit tundra I've started reading that part recently, it's indeed a mess

imagine learning from that and that only, back in 2015. what's a little unnerving is that the core tutorial and howto material is largely unchanged since then. i once tried to start revising it but i felt a little out of touch with what the core devs wanted for it and gave up. i'd need to be in closer contact w/ someone on the core team to make good progress on it.

#

i spent a while on it though. i should have saved what i did.

#

i think there are some good books on using pandas

desert oar
#

at least, i always try to provide an explanation with mine

umbral charm
#

props to you

desert oar
#

if i didn't have R background knowledge i don't know how i'd have learned pandas tbh

desert oar
umbral charm
#

Well i probably should start using R, but no clue where to start

desert oar
#

nah, don't spend time on it. learn one thing at a time

twilit tundra
#

I didn't have any R experience but I could probably have been way more efficient

desert oar
#

learning another programming language is low value among the many other things out there to be learned

left tartan
#

And thus, they get caught in a vicious cycle of solution seeking, rather than understanding.

umbral charm
twilit tundra
left tartan
void veldt
umbral charm
#

i emailed Github but they havent responded :(

twilit tundra
#

You don't even need to write a proper google search or prompt

#

It just fills your code

umbral charm
#

Yea

twilit tundra
#

Can't you just buy the subscription?

umbral charm
#

All you have to do is give a function the actual name of the function and it fills the function

umbral charm
twilit tundra
#

Oh, right

umbral charm
#

In my uni it's only available for the Engineering students and Comp Sci students or Stats students

#

im neither so i have to send a personal request

twilit tundra
#

I thought they would give out licenses to every student, that's surprising

umbral charm
#

Honestly with the amount of python on my course i expected it too

#

But GOOD NEWS I got pycharm pro

#

so thats good at least

twilit tundra
#

Like I'm pretty sure the free azure credits is available for students regardless of major

past meteor
twilit tundra
#

What does pycharm provide that VSCode doesn't?

past meteor
#

Pandas' Syntax is godawful though. It's the only data frame library I've used that feels wrong and I've used several in multiple languages.

twilit tundra
umbral charm
#

so i just grew up with pycharm

past meteor
#

Polars is a better option, both performance and syntax wise, but it definitely doesn't tie in as well with the data ecosystem

umbral charm
#

used it for all my coding career so far

left tartan
past meteor
#

Pandas will remain king because of sunk cost

twilit tundra
#

I didn't expect pandas to be such a controversial topic lol

past meteor
#

But personally I'm not writing any new code in Pandas

#

Unless I really have to for compatibility reasons

bronze flint
#

Quick question, i had an issue now where my training on Google colab stopped after idling with T4 graphics card
Was training a massive CNN but at the point it wasn't using T4

Do you guys know what the cooldown is?

#

I am using free Google colab

void veldt
left tartan
twilit tundra
desert oar
desert oar
twilit tundra
#

Polar bears are more deadly than pandas

worn stratus
void veldt
desert oar
worn stratus
past meteor
#

They're horrible

worn stratus
twilit tundra
#

Is there something equivalent to query in polars?

bronze flint
worn stratus
past meteor
#

I've also used spark and Dataframes in R and MATLAB

#

Only pandas is "different"

twilit tundra
#

I've been wondering: does anyone know the technical reason xlsx files are so slow to load on python/pandas?

left tartan
civic elm
#

is RNN a viable model for anything? or Transformers have been the norm?

#

Am I wasting time and effort in deep diving RNNs?

twilit tundra
#

afaik, RNNs are still relevant for time series data

#

But other than that, it's mostly still a hot topic for research because there is a lot of untapped potential

tepid tartan
#

@civic elm should I do khan linear and stats or do Coursera on the math 🤔

desert oar
desert oar
fresh harbor
#

Do model inferences get slowed down when you do A, B, C, D in a cycle? Do I instead use multiprocessing and queues to speed this up, or will it just cause resource starvation?

tepid tartan
desert oar
desert oar
fresh harbor
#

Model inferences

#

A is face detector
B is face recognition
C is magik
D is more magik

desert oar
#

that seems like a giant waste of effort to get it working right, and it doesn't seem like you actually get anything out of it. if anything it's worse because you can't adjust the training processes individually

fresh harbor
#

Not train, its for inference only

#

@desert oar btw i wanted to ask whether yunet would be a decent enough replacement for retinaface.

#

Its built right into opencv so no extra deps

desert oar
serene scaffold
desert oar
desert oar
#

so i try to stay away from those problems at work. no good intuition for it

serene scaffold
#

Fascinating

desert oar
fresh harbor
#

Its a pipeline

desert oar
#

so you need the output from each step as the input to the next step?

fresh harbor
#

The question is whether I should operate it concurrently, in batches or serially right now?

desert oar
#

i see, you are asking about running the pipeline on multiple images

#

that's actually a very good question

fresh harbor
#

Its actually a video

#

But video no good for my model

#

Need to break it down to frames

desert oar
#

i see, is the model pipeline sequential across frames? or can you batch/chunk/re-order arbitrarily and it won't change the results?

fresh harbor
#

Uhh i don't think it depends on previous results, if that's what u mean

desert oar
#

okay. instinctively i would imagine that if i need to analyze something in a video i would want some window of past frames as input to my current inference. that would limit how you can run this pipeline

fresh harbor
#

Fyi its YUNet > SFace > INSwapper > (some ONNX upscaler that opencv can run, not decided)

desert oar
#

i think in general the answer to this question is why ML engineers get paid the big bucks. but i think the short answer is that it depends on what hardware you have available and how the models are implemented.

fresh harbor
desert oar
#

if the underlying implementation is already multithreaded/parallel, you can probably do ok by running it serially. you wouldn't want to combine that with parallelism in your application because everything will get gunked up

fresh harbor
#

I am super new (3 days old) to this stuff

desert oar
fresh harbor
#

Yea

desert oar
#

you might want to check the opencv docs to see if it says anything about threading or parallelism

#

it's very easy to run into situations where parallelization actually slows things down because your program is spending too much time sending data between processes

fresh harbor
#

The bigger concern here is that I don't run into resource starvation

#

Most likely RAM / VRAM

#

Serial pipeline won't cause it

desert oar
#

so if opencv has a way to run each inference in threads or processes, i recommend starting there and benchmarking

fresh harbor
#

But if I parallelize it, what little control do I have over how much RAM the model chooses to eat?

desert oar
#

e.g. numpy's linear algebra already runs multithreaded through the BLAS library it's linked against (OpenBLAS, MKL)

desert oar
#

i would start by just running everything serially and profiling + looking into threads/processes within opencv
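
A minimal sketch of "run it serially and profile first", using only the standard library. The stage functions here are hypothetical stand-ins (simple arithmetic), not the real YuNet/SFace calls; the point is only the per-stage timing loop:

```python
import time
from collections import defaultdict

def run_serial(frames, stages):
    """Run each frame through the stages in order and accumulate time per stage."""
    totals = defaultdict(float)
    outputs = []
    for frame in frames:
        data = frame
        for name, fn in stages:
            start = time.perf_counter()
            data = fn(data)
            totals[name] += time.perf_counter() - start
        outputs.append(data)
    return outputs, dict(totals)

# Hypothetical placeholder stages; swap in the real model calls.
stages = [("detect", lambda f: f + 1), ("recognize", lambda f: f * 2)]
outputs, totals = run_serial(range(3), stages)

for name, seconds in totals.items():
    print(f"{name}: {seconds:.6f}s")
```

If one stage dominates the total, that is the one worth batching or moving to a worker; if the time is spread evenly, parallelizing across stages is less likely to pay off.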

fresh harbor
#

Yea I'd need to run the serial pipeline with multiprocessing + queue anyways because it shouldn't block the gui

#

which makes me wonder, how does gradio achieve non blocking UI when everything is happening in the main thread?

desert oar
fresh harbor
#

Or maybe they do use something after all?

#

I have definitely not seen any code using queues

empty furnace
#

is plotly optimal for large datasets?

#

idk if consumes much memory just to make a chart

left tartan
#

matplotlib is still the baseline everything is compared against. Generally, for large datasets, the first step should be reducing the complexity of the plot, whether from quantizing/sampling/aggregation/smoothing/whatever

simple mirage
#

How do u mathematically determine whether a distribution is skewed or not

#

I’ve tried plotting the histogram, and it looks skewed but I heard that median is better than mean for skewed graphs and so I tried comparing the mean and median results and the median result actually gave a result that leaned more towards the skew

#

Does that mean my data isnt actually skewed?

cold osprey
#

skew and kurtosis
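
A rough sketch of what "skew" means numerically, computed by hand with numpy so nothing is hidden. The sample data is synthetic (an exponential and a normal sample), chosen only to show a clearly skewed and a roughly symmetric case:

```python
import numpy as np

def sample_skewness(x):
    """Third standardized moment: about 0 for symmetric data, > 0 for a long right tail."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    return m3 / m2 ** 1.5

rng = np.random.default_rng(0)
right_skewed = rng.exponential(scale=1.0, size=10_000)
symmetric = rng.normal(size=10_000)

print("skewness (exponential):", sample_skewness(right_skewed))
print("skewness (normal):     ", sample_skewness(symmetric))
print("mean vs median (exponential):", right_skewed.mean(), np.median(right_skewed))
```

For right-skewed data the mean gets pulled toward the long tail, so it usually ends up above the median.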

simple mirage
#

Oh I get it now. Imputing with median is not meant to address the skew. It's just supposed to make the imputed value more robust to the skew

ashen axle
#

Does anyone know if there is a straight-forward method of adding hover-text to a seaborn generated line plot? The plot is busy enough that a legend is not useful.

I've experimented with all of the major plotting libraries, and Id rather stick with the seaborn/matplotlib ecosystem if possible

rare quest
#

Hi, this may be the wrong channel but, generating an image with pytorch takes a very long time, with CUDA enabled:

with autocast("cuda"):
    image = model("An image of a hand with a ball of ice levitating above it.")

This takes about 4 minutes with my RTX 2070S

#

Is there something I need to enable in windows11?

supple plover
#

hello everyone, I'm looking to try and make an app for cameras in vehicles so that it can immediately detect and count the amount of passengers inside. Are there known examples of this that I can study from? I'm very new to AI/ML, I only managed to make a custom YOLOv4 model to do palm oil fruit classification deployed in an android app (I put the model as a .tflite inside the app itself) recently. I'm thinking can I use YOLO models for this passenger detection & counting? And if I want to have the AI model to be in a web API/cloud to be consumed through a website, are there examples on how to do that?

desert oar
supple plover
#

and another question. How's everyone's opinions on ML.NET?

desert oar
#

sagemaker can do it for example. or mlflow

past meteor
#

MLflow gives you many features (and complexity!) that you may (or may not) need

#

For the easiest case I'd start out with a simple container running your model with sanic / fastAPI, maybe CI/CD to easily update the model

supple plover
#

I'll try that. thanks

supple plover
#

since there's a lot of YOLO models now, which one is the best one for detecting and counting just one type of object?

heavy bay
#

I made a simple neural network to predict y = 2x + 1
but the output is off by 0.002 lemon_thinking
what is the reason for this?

lapis sequoia
lapis sequoia
fleet granite
heavy bay
lapis sequoia
supple plover
serene scaffold
lapis sequoia
heavy bay
# lapis sequoia and what is the architecture of the model?
model = tf.keras.Sequential([
    keras.layers.Dense(units=1, input_shape=[1]),
    keras.layers.Dense(units=5),
    keras.layers.Dense(units=10),
    keras.layers.Dense(units=5),
    keras.layers.Dense(units=1),
])
I didn't really think much about it, just messing around
heavy bay
#

i just realized that np.append returns a copy
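
For anyone else who trips over this, a tiny demonstration:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)   # builds and returns a new array

print(a)  # the original is untouched
print(b)
```

Because every call copies the whole array, appending in a loop gets slow; collecting values in a Python list and converting once with `np.array(...)` is usually the easier route.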

supple plover
lapis sequoia
lapis sequoia
heavy bay
serene scaffold
slim bone
#

So lets say I've made some basic neural network - and now I wish to download the model. Am I essentially downloading the (now adjusted) weights and biases of* the currently trained model?

potent sky
#

download it? where from?

slim bone
#

"The thing that will enable me to continue training it later" maybe?

serene scaffold
#

if you save the model, the information that gets written to your hard drive is some representation of the weights and biases, yes.

potent sky
#

depending on your requirement you may also choose to save the state of the optimizer (to pick up where you left off) as well as model configuration

hasty mountain
#

I think I've never seen a "downloadable model", only the weights and biases. Then you have to rebuild its architecture in your code pithink

potent sky
#

if you save the model configuration as well then you don't have to, for example .h5 models

#

saving just a state dict for the parameters only is quite useful tho, so it's popular

slim bone
slim bone
#

I'm just trying to figure out how I should construct my program - I'm just building a simple neural network with NumPy as a starting project
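
For a plain NumPy network, "saving the model" can be as simple as writing the parameter arrays to disk and reading them back. A minimal sketch, assuming the parameters live in a dict (the names `W1`, `b1`, etc. and the layer sizes are made up for the example):

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(size=(16, 784)),
    "b1": np.zeros(16),
    "W2": rng.normal(size=(10, 16)),
    "b2": np.zeros(10),
}

path = os.path.join(tempfile.mkdtemp(), "model.npz")
np.savez(path, **params)            # one file, one named array per parameter

with np.load(path) as data:
    restored = {name: data[name] for name in data.files}

print(sorted(restored))
```

The architecture itself (layer sizes, activations) is not stored here, so the loading code has to rebuild it. Optimizer state, if there is any, can be saved the same way as extra arrays.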

fallow frost
#

is there a way I can avoid doing a full scan on a pandas dataframe when filtering?
I want to get the first 5k filtered rows (when there are 500k), I dont want pandas to keep filtering the dataframe once it found 5k rows, is there a way to do this? and are there any alternatives?

boreal gale
fallow frost
boreal gale
#

what is the data that you are filtering on?
what kind of "filter" is it?
what is your data's cardinality?
why first 5k?
how often do you need this?
does data change?
what performance do you have now and what do you expect?

fallow frost
#

hmmm thats alot of questions, my question is fairly simple, can I do this:

from more_itertools import take

big_data: Iterable[...]
filterd_generator = (x for x in big_data if predicate(x))
print(take(5000, filterd_generator))

instead of:

filterd_list = [x for x in big_data if predicate(x)]
print(take(5000, filterd_list))
#

you see the difference?

#

pandas does the latter.

#

I want to do the former

boreal gale
#

hmmm thats alot of questions, my question is fairly simple, can I do this:
i don't ask question just for the sake of asking question, it's all for the ultimate goal of helping you.
if you simply require an answer to that, then no, you can't do that as far as i know.

left tartan
# fallow frost I want to do the former

I don't know of a way to do that in pandas (but doesn't mean it cant be done) without some sort of iteration, and iteration is generally an antipattern with pandas.

fallow frost
#

if you simply require an answer to that, then no, you can't do that as far as i know.
thanks, thats what I wanted to know 🙂

left tartan
#

You could map, for instance, accumulate and perhaps throw an exception when bucket is "full"

#

Or, perhaps do the filtered list over smaller windows of the data

fallow frost
#

(which is just an example, but it could be any number)

left tartan
#

you'd just keep moving the window until you fill.

#

ie: check first 1million, then next 1 mill, etc
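
A sketch of that windowing idea. `first_n_matches` is a hypothetical helper written for this example, not a pandas function; it filters one slice of rows at a time and stops as soon as enough matches have been collected:

```python
import pandas as pd

def first_n_matches(df, predicate, n, chunk_size=100_000):
    """Scan df in row windows and stop once n matching rows are found."""
    parts = []
    found = 0
    for start in range(0, len(df), chunk_size):
        chunk = df.iloc[start:start + chunk_size]
        hits = chunk[predicate(chunk)]   # predicate returns a boolean mask for the chunk
        parts.append(hits)
        found += len(hits)
        if found >= n:
            break
    if not parts:
        return df.iloc[0:0]
    return pd.concat(parts).head(n)

df = pd.DataFrame({"x": range(1000)})
result = first_n_matches(df, lambda c: c["x"] % 2 == 0, n=5, chunk_size=10)
print(result)
```

Whether this beats a full `df[condition].head(n)` depends on how expensive the predicate is and how early the matches appear, so it is worth timing both.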

fallow frost
#

I see

#

I'm thinking of using Polars with their lazy API theyre supposed to have this functionality

#

but I dont want to add that dependency to my project

left tartan
#

but, what kind of condition do you have where this is important? Like, df['col'] == something is not an expensive operation, and df[condition].head(5000) is only returning the first 5000 rows

#

(to be specific: I don't know where you could do df['col'] == something but only return the first 5000 indices that match the condition in a single operation)

fallow frost
#

the latter is for a filter, not creating a new column

left tartan
#

but, I'm a sql guy.

fallow frost
#

duckdb has got to have a query optimizer

left tartan
fallow frost
#

which wont keep scanning after it found 5k rows

#

I will try that

left tartan
#

tag me if you have any query questions, this is my jam.

fallow frost
#

I do actually

fallow frost
boreal gale
left tartan
fallow frost
# left tartan Can you give an example? I don't follow.

yeah its a bit confusing:

df = pd.DataFrame({'col': [(1, 2, 3), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 5)]})
# I want to find all the rows that contain: 1, 2, 3, and 4.
to_keep = {1, 2, 3, 4}

>>> df['col'].apply(lambda x: to_keep.issubset(x))
Out[150]: 
0    False
1    False
2    False
3     True
4    False
Name: col, dtype: bool
#
mask = df['col'].apply(lambda x: to_keep.issubset(x))
df = df[mask]

>>> df
Out[154]: 
            col
3  (1, 2, 3, 4)
left tartan
#

This is my preferred approach (given your statement that pandas is too slow for your filter/limit):

import duckdb
import pandas as pd
df = pd.DataFrame({'col': [(1, 2, 3), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 5)]})
duckdb.execute("select * from df where col = ? limit 1000", [(1,2,3,4)]).df()

if looking for set intersection, need to get a little cleverer (function added to make this simpler):

CREATE OR REPLACE FUNCTION "@>"(haystack, needle) AS (select c == len(needle) from (select count(*) c from (SELECT UNNEST(haystack) INTERSECT SELECT UNNEST(needle)))) ;
select col, col @> [1,2,5] b from df where b = True

desert oar
# fallow frost pandas does the latter.

note that pandas does the latter because it doesn't have a lazy query engine or a query optimizer. note also that you sometimes need/want to just use plain python instead of doing everything inside pandas. python for loops can be reasonably fast if you build them carefully.

#

i am gathering that you have an array/list-valued column, and you want to find the first row where the array/list contains some certain values?

#

the right solution definitely depends on how much data you have, memory vs. cpu constraints, etc. but that duckdb unnest operation above looks very elegant

#

you could also consider re-encoding your data as an integer bitfield and using binary &. that's a good leetcode trick for lookups on fixed-size sets

fallow frost
fallow frost
left tartan
fallow frost
left tartan
#

yup

fallow frost
#

got it thanks

fallow frost
fallow frost
fallow frost
agile cobalt
#

sounds like the issue is more about your data modelling than pandas itself?

  • you should not store tuples, lists and other arbitrary objects in pandas dataframes
  • you should avoid using apply as much as possible
#

seriously, don't go around complaining about pandas performance if you're using apply(). That alone kills any benefits you might hope to get from using pandas.

fallow frost
#

if I only had a cent for each time somebody said that, I would be very wealthy

#

but seriously, whats the alternative?

agile cobalt
#

and about the previous thing you mentioned about not doing a full lookup: Yeah, pandas is not the right tool for that job

fallow frost
agile cobalt
#

pandas is good for medium sized datasets
if it's large enough to justify stopping early on the example case you gave, you might as well use an actual database instead

left tartan
fallow frost
left tartan
#

Then, (array & mask) == mask means that all mask entries are in the input array (the parentheses matter, since == binds tighter than & in python)

#

And this is a very efficient vectorizable operation
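
A small sketch of the bitfield idea, reusing the tuples from the earlier example. The `encode` helper is made up for illustration, and it assumes the values are small non-negative integers (below 64 here, so they fit in a `uint64`):

```python
import numpy as np

def encode(values):
    """Pack a collection of small non-negative ints into one integer, one bit per value."""
    bits = 0
    for v in values:
        bits |= 1 << v
    return bits

rows = [(1, 2, 3), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 5)]
encoded = np.array([encode(r) for r in rows], dtype=np.uint64)

mask = np.uint64(encode({1, 2, 3, 4}))
contains_all = (encoded & mask) == mask   # True where every bit of mask is set

print(contains_all)
```

The encoding step is still a Python loop, but it only has to run once; after that every lookup is a single vectorized bitwise operation.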

fallow frost
#

@left tartan I'm not following you at all, can you link some tutorial or something that I can read up on how this stuff works?

left tartan
#

(Altho I hate that site for requiring account)

twilit tundra
#

I miss the days bypass paywall actually worked on that site

umbral charm
#

in Pandas, if i've got a column with numbers, 0's and NaN's, and i wanna replace all the nonzero numbers with 1's, leave the 0's as 0 and leave the NaN's as NaN's, can i just replace it like df['Boo'] = [1 if df.loc[i, 'Boo'] > 0 else 0 for i in df.index]

#

or would that also change the NaN's

whole rock
#

hello

#

i need assitance asap

#

anyone experienced with hacking please shoot me a dm

desert oar
#

storing this data as sets might also help

#
df = pd.DataFrame({'things': [{'a', 'b'}, {'b', 'c', 'e'}]})

important_things = {'q', 'c'}
df['has_important_things'] = df['things'].map(lambda s: bool(s & important_things))
#

there are a couple of things going on here. yes, pandas has no support for "partial" or "lazy" filtering, and yes i suspect that a plain python loop might be faster (which you can implement in a single pass).

desert oar
#

@umbral charm

boo_is_notna_notzero = df['Boo'].notna() & (df['Boo'] != 0)
df.loc[boo_is_notna_notzero, 'Boo'] = 1
umbral charm
#

But i found this solution df['Boo'] = pd.notna(TSLA['Boo']).astype(int)

#

doesnt change the NaN's

#

took me a good 4 mins to realise not to use the & symbol but and instead

desert oar
#

notna returns True if the value is not null, and False if null

#

so that will set both 0 and 1 to True, then astype(int) converts True to 1
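
To see the difference side by side, here is a small sketch on made-up data. The masked assignment keeps the NaNs and the zeros; the `notna().astype(int)` version flags "has a value" instead, so zeros become 1 and NaNs become 0:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Boo": [0.0, 3.5, np.nan, 12.0, 0.0, np.nan]})

# Masked assignment: only non-null, non-zero entries are overwritten.
nonzero = df["Boo"].notna() & (df["Boo"] != 0)
masked = df["Boo"].copy()
masked.loc[nonzero] = 1

# notna + astype(int): 1 wherever there is any value, 0 for NaN.
flagged = df["Boo"].notna().astype(int)

print(pd.DataFrame({"original": df["Boo"], "masked": masked, "flagged": flagged}))
```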

night kernel
twilit tundra
umbral charm
#

Suppose we have a dataframe with a column of Trues and Falses. i want to know the index of all the True values. However, if there is more than 1 True together, i only want the index of the first one. how would i do this? I'm so lost on iterating through columns

left tartan
umbral charm
# left tartan Can you share an example of what you mean?

i have a column called 'Boo' in a dataframe called df. this column is full of True and False values (Boolean), but mostly Falses (90%). i want to find out at what index the True values occur, but if there is more than 1 True value together (so like True on index 95 and True on index 96) i only want to return the first True index (in this case 95)

twilit tundra
#

Drop_duplicates(Boo)

#

🙂

night kernel
twilit tundra
#

My DMs are not open sorry

umbral charm
#

or would it drop them too

twilit tundra
#

It would just leave 2 rows, one with true and one with false

#

And it keeps the first instance of both

#

Alternatively if you really just want that one true index, you can do df.query('Boo == True').index[0]

left tartan
#

I was going to suggest something like: a cumsum() of boo == False, then grouping and computing cumcount, and then keeping only those where cumcount()==1 (eliminating any runs)

umbral charm
#

my dataframe only consists of 1 column and 1 index

night kernel
#

from recommenders.datasets import movielens

umbral charm
#

i want another column where the True values are copied over at their index, but if there is more than 1 True value together, i want it to be True only at the first index where they were seen together

twilit tundra
#

I'm not sure I understand, it would be something like None,... Until index 95 where it's equal to 95 and then 95 at index 96 because it's still true?

twilit tundra
#

It seems to be a dataset that is part of the library

umbral charm
#

Example:

1 False False
2 True  True
3 False False
4 True True
5 True False
6 False False
7 True True
8 False False
9 True True
10 True False
11 True False
12 False False
night kernel
#

its not from the same project. this is the example from the project:

from recommenders.datasets import movielen

twilit tundra
#

Referencing the notebook, it looks like the format is userid itemid rating

umbral charm
#

You see how it's basically the same column until there is a run of consecutive Trues in the first column, in which case i only need the 2nd column to produce one True for the start of that consecutive run

twilit tundra
#

Oh gotcha

#

Probably something with a rolling window of size 2

umbral charm
#

That could work

left tartan
#

So my solution is; calculate cumcount() over False. Then cumcount over Trues for each group from step 1. Then eliminate any count > 1

umbral charm
#

if there were 4 together

#

Idk what my max True's are consecutively

twilit tundra
#

True true = false
True false = false
False true = true
False false = false

#

Not A and B

left tartan
#

Oh easier; just drop where lag() == True and Val == true

umbral charm
left tartan
#

(Lag=shift)

#

I just mean; compare boo to previous boo, using shift. Like df[boo]==df[boo].shift()

twilit tundra
#

I have to say, I appreciate boo over foo

left tartan
#

And maybe & df[boo] otherwise you’d drop consecutive falses

agile cobalt
#

I would just use series.diff() == 1

umbral charm
#

Boo is just my go to name for throwaway columns

#

it used to be boob

left tartan
agile cobalt
#

.diff() would be
False => False : 0
True => True : 0
True => False : -1
False => True : 1
their (original) question was identifying where it goes from False => True

umbral charm
#

That could work

twilit tundra
#

It definitely does

agile cobalt
#

sounds like at some point you tried to remove entire runs of 2+ consecutive Trues?

night kernel
#

i think the dataset i have has the user and item id's, however

left tartan
twilit tundra
#

Does it represent clicks? Or buying?

night kernel
#

tweets dataset

agile cobalt
#

did you do == 1 or != 0

umbral charm
#

== 1

twilit tundra
agile cobalt
#

show what exactly you did? (code)

night kernel
#

it looks like weights are calculated

#

and i think thats the rating

agile cobalt
#

derp oh wait .diff() with bools seems to be just XOR

#

sorry, you'll have to .astype(int).diff() instead of just .diff()
you can specify np.int8 instead of int if you want
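
Putting the shift and diff suggestions together on the 12-row example from earlier (the series name `boo` and the 1-based index just mirror that example):

```python
import pandas as pd

boo = pd.Series(
    [False, True, False, True, True, False, True, False, True, True, True, False],
    index=range(1, 13),
)

# True only where the current value is True and the previous one is not.
# fill_value=False treats the row before the first one as False.
starts = boo & ~boo.shift(fill_value=False)

# Same idea through integer differences: False -> True shows up as +1.
starts_via_diff = boo.astype(int).diff() == 1

start_index = list(starts[starts].index)
print(start_index)
```

One difference to keep in mind: if the very first row is True, the shift version counts it as the start of a run, while the diff version does not, because the first difference is NaN.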

twilit tundra
#

Or you can add an and with the same column if you want to stay in full boolean for some reason

umbral charm
#

How do you guys think of these

agile cobalt
#

seen similar problems a few times in the past

umbral charm
#

all you people seem like proper smart

agile cobalt
#

if you haven't yet, check out the pandas User Guides and take a look over all the different functions in the documentation, or at least ones that catch your eye

umbral charm
#

be working for apple or some shi

twilit tundra
agile cobalt
#

StackOverflow is also pretty useful if you know how to search effectively

umbral charm
twilit tundra
#

Surprisingly enough, they often edit their answer to correct it if a new version breaks it

slim bone
#

Hey fellas, quick question about gradients:
I'm using MSE as my cost function
Now I'm trying to calculate the gradient, but I'm at a bit of an intuitive crossroad:
On one hand, the gradient should consist of all of my weights, each one being its own variable (So in my case of a 28x28 image, 784 variables)
On the other hand, the gradient of MSE is just:
2/n * (prediction_vector - target_vector)
And my prediction has 10 variables.

What am I missing?

agile cobalt
#

you should just about never calculate the gradient yourself, but rather leave it up for the library you're using to determine it for you (pytorch, tensorflow, jax etc)

slim bone
#

This is on purpose

agile cobalt
slim bone
#

And also not entirely the point of the question - there's clearly a knowledge gap here

agile cobalt
#

backpropagation takes the gradient of the loss with respect to the output of an operation and propagates it back to that operation's inputs

slim bone
#

I know...

desert oar
slim bone
#

Gradient descent is what I'm after

desert oar
#

yeah, you want the vector of partial derivatives with respect to each parameter in your model

slim bone
#

Well, just a vector of shape (1,784) or whatever.

desert oar
#

yes, if you're treating them as 784 individual features

slim bone
#

Right. But how do I compute said gradient?

#

If I'm using MSE as my loss function

desert oar
#

using the chain rule + rearranging terms to get the usual backpropagation formula

agile cobalt
#

which loss function you are using does not influence this part at all btw

desert oar
#

+1

slim bone
#

Hmm? I thought we calculate the gradient of the loss function?

#

at a certain point*

desert oar
#

yes, but specifically the gradient with respect to the parameters of the model

slim bone
#

I'm not sure what "with respect" means in this context

desert oar
#

the loss function is usually something like loss(prediction(parameters), data)

#

so you need the chain rule to get at the gradient with respect to the parameters

slim bone
#

def MSE(prediction, ideal): is what I have

desert oar
desert oar
slim bone
#

Oh, apologies

desert oar
#

if you have an expression like f(x,y,z) = ax + by + cz then you're implying that x,y,z are the "inputs" to the function

slim bone
#

Right

desert oar
#

so the gradient of the function with respect to x,y,z would be the vector of partial derivatives with respect to each input

slim bone
#

Right

desert oar
#

but you could also talk about the gradient with respect to a,b,c, reversing the roles of x,y,z and a,b,c

slim bone
#

Huh, the scalers?

desert oar
#

it's just math jargon to specify which variables are "inputs" and which aren't

slim bone
#

I mean if you calculate the gradient relative to the scalers you'd just get nonsense no?

desert oar
#

a and x are identical here except that one is treated as an "input" and the other is treated as given

#

but you can just swap the symbols

#

when you say "with respect to", it's telling me which symbols represent "inputs"

slim bone
#

Oh, so if you "calculate a gradient relative to a,b,c" are you assuming those are the inputs now?

desert oar
#

right. so if i have an expression like loss(prediction(parameters), data) the gradient with respect to prediction(parameters) is different from the gradient with respect to parameters

#

i think you're thinking you need the former, but you need the latter
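
To make the "with respect to" distinction concrete, here is a minimal sketch in plain Python. It reuses the f(x,y,z) = ax + by + cz example from above and estimates both gradients by finite differences. The helper names (`f`, `numerical_gradient`) and the numbers are made up for illustration.

```python
def f(coeffs, inputs):
    """f = a*x + b*y + c*z, with coeffs = (a, b, c) and inputs = (x, y, z)."""
    return sum(c * v for c, v in zip(coeffs, inputs))


def numerical_gradient(func, point, eps=1e-6):
    """Central finite-difference gradient of func at point (a list of floats)."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        up[i] += eps
        down = list(point)
        down[i] -= eps
        grad.append((func(up) - func(down)) / (2 * eps))
    return grad


coeffs = [2.0, -1.0, 0.5]  # a, b, c
inputs = [3.0, 4.0, 5.0]   # x, y, z

# Treat x, y, z as the inputs and hold a, b, c fixed: the gradient is (a, b, c).
grad_wrt_inputs = numerical_gradient(lambda p: f(coeffs, p), inputs)

# Treat a, b, c as the inputs and hold x, y, z fixed: the gradient is (x, y, z).
grad_wrt_coeffs = numerical_gradient(lambda p: f(p, inputs), coeffs)
```

Same expression, two different gradients. For training a model it is the second kind that matters: the data plays the role of x, y, z and stays fixed, and the weights play the role of a, b, c.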

slim bone
#

Can you maybe explain what prediction(parameters) is exactly? I might be missing the point here

desert oar
#

thinking about this properly kind of requires you to flip around what the "inputs" are

#

the prediction that you produce is literally a function of the parameters of the model + the data

slim bone
#

Right

desert oar
#

i guess i should write it loss(prediction(parameters, x), y)

slim bone
#

loss being our loss function I assume?

desert oar
#

yes

slim bone
#

Just trying to make sure I understand you

desert oar
#

thinking about this as an optimization problem requires you to flip around what you understand the inputs to be

#

the optimization problem treats the data as fixed. x and y are handed to us as-is and do not change.

slim bone
#

Right

desert oar
#

we are now interested in finding the parameters (model weights, coefficients, whatever) that minimize loss

#

that is, we are minimizing the loss as a function of the parameters

#

so when we talk about the gradient of the loss function, we are talking about the gradient of the loss function with respect to the model parameters

slim bone
#

Again, "model parameters" = weights/biases?

desert oar
#

yes

#

also i'd suggest maybe start with something simpler than images. imagine linear regression on a dataset like iris. 4 features, 1 response variable, all continuous numeric data.

#

forget even the notion that linear regression is a special case of a "neural network". just think about minimizing loss as a smooth differentiable function of some parameters

slim bone
#

Yeah I do somewhat regret not going with something simpler, but this has been a learning experience all-in-all
It'd be a rather shame to stop now as I've sunk a few hours into this now

desert oar
#

you'll get back to it

#

you'll be happy you spent time on the basics

slim bone
#

Ah, this is the basics as far as I'm concerned

desert oar
#

it's something i never had the discipline to do when i was younger and i'm still paying for it 10+ years later

slim bone
#

Fair enough. I'll keep this in mind

#

Can I perhaps type out a concrete example (relating to what I'm trying to program) to see if I got the memo? I'll make it brief.

desert oar
#

maybe? but it seems more like a matter of understanding the math than writing out code

slim bone
#

Oh no, I'm not talking about the code

#

But I'm not entirely sure this is a math issue either. Intuitively this does make sense to a certain degree, and I do understand what you're saying

#

I'll try to type it out

iron basalt
#

Try a single input and output (no hidden layers). Can code that without any libraries. Then 2 inputs, 1 output.

#

See if it can learn some logic gates.

desert oar
#

coming from a (social) science background i find the idea of learning logic gates abstract and very unlike anything i'd expect to encounter in real work

slim bone
desert oar
slim bone
slim bone
desert oar
#

ah, mnist

slim bone
#

Right

desert oar
#

why are you using MSE on this?

slim bone
#

I'm simply following 3blue1brown's video

#

I used RMSE but I was too lazy to calculate that gradient

slim bone
desert oar
#

it's just one step in the chain rule. i'd suggest working through it. that's essential and worth drilling until it's natural.

slim bone
#

Just input, weights, and output
I'm not even sure what the bias does here

desert oar
#

yeah i think this needs to be dialed back

#

from what i remember that 3b1b video is meant to be illustrative and relatively nontechnical

slim bone
#

Right

#

Nevertheless, I wanted to see if I understood what was being said - and generally speaking, considering the fact that I've gotten rid of the layers, I figured this should be fairly simple

And yet, there's a lot of gaps in my knowledge

desert oar
#

there are still some complicating factors here that i'd like to strip out

slim bone
desert oar
#

so let's dial it back to something simpler. imagine a single continuous output like body mass, and 3 inputs: height, waist size, and chest size.

slim bone
#

Err, okay

#

Something that approximates body mass, got it

desert oar
#

yeah. let's say we are interested in whether we can determine body mass from those 3 measurements

slim bone
#

Alright

desert oar
#

so we propose a simple model of the form y = b1*x1 + b2*x2 + b3*x3 + b0, where the xs are the 3 measurements and y is body mass

slim bone
#

So far so good

#

What's b0 in this context btw?

desert oar
#

this is the standard linear regression model. among many many other things, we can interpret it as 1 input layer and 1 output layer.

slim bone
desert oar
slim bone
#

Like a +C with antiderivatives?

desert oar
#

that's what the machine learning people call "bias" because it kind of resembles bias in an electrical circuit. it's unrelated to the statistical term bias. statisticians call it an "intercept" to avoid the confusion & because it's literally the y intercept.

#

imagine setting all x1, ..., x3 to 0. then what's y?

slim bone
#

just b0

desert oar
#

right

#

if you didn't have b0, that forces y to be 0 as well when all xs are 0, which forces the entire line/plane you fit to pass through the origin, which is restrictive and makes your model worse for no benefit
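
A rough illustration of that point, as a sketch with one input instead of three. The data is invented and comes from a line that does not pass through the origin, and `fit_line` and `mse` are hypothetical helper names. It compares a least-squares fit with b0 against one forced through the origin.

```python
def fit_line(xs, ys, with_intercept=True):
    """Least-squares fit of y = b1*x (+ b0). Returns (b1, b0)."""
    n = len(xs)
    if with_intercept:
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        b1 = sxy / sxx
        b0 = mean_y - b1 * mean_x
    else:
        # Line forced through the origin: b1 = sum(x*y) / sum(x^2), b0 = 0.
        b1 = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        b0 = 0.0
    return b1, b0


def mse(xs, ys, b1, b0):
    """Mean squared error of the line y = b1*x + b0 on the data."""
    return sum((b1 * x + b0 - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 5.0 for x in xs]  # the true line has intercept 5

with_b0 = fit_line(xs, ys, with_intercept=True)
without_b0 = fit_line(xs, ys, with_intercept=False)
mse_with = mse(xs, ys, *with_b0)
mse_without = mse(xs, ys, *without_b0)
```

With b0 the fit recovers the line exactly. Without it, the slope has to tilt to compensate for the missing offset, and the error stays above zero.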

slim bone
#

Not sure I understand why, but feel free to skip this if it's not crucial

desert oar
#

it's worth thinking about. having good geometric intuition for the math can help a lot

slim bone
#

I do agree, I'm just not sure what this does in the context of ML

desert oar
#

let me draw a picture

slim bone
#

Sure thing

desert oar
slim bone
#

You probably couldn't figure but I do have some academic mathematical background ^^; It's just hard for me to process math in English for whatever reason

#

So you might be able to skim on some explanations

desert oar
#

ok, i'll keep going and hopefully you can work up to understanding why the b0 is useful

#

let's assume for now that it's useful and that we usually want it

slim bone
#

Sure thing

desert oar
#

so we have our simple model y = b1*x1 + b2*x2 + b3*x3 + b0

#

now we want to find b0, ..., b3 that produce the best line/plane to describe this relationship

#

the relationship could be totally wrong, but we want to produce the best possible estimate among all relationships of this shape

#

we do so by coming up with a loss function and minimizing that

slim bone
#

Makes sense

#

I mean, in theory at least

desert oar
#

just to avoid messy notation, let's call our model prediction p, so we have the following task:

minimize l(p, y) with respect to b0,...,b3 where p = b1*x1 + b2*x2 + b3*x3 + b0

slim bone
#

l is the cost function
y is the body mass
Mostly typing this for myself

desert oar
#

so how do we do that? we note that p is differentiable with respect to the bs, so as long as l is differentiable and convex, we have the whole wide world of convex differentiable optimization techniques available to us

slim bone
#

Sorry, convex?

desert oar
#

In mathematics, a real-valued function is called convex if the line segment between any two distinct points on the graph of the function lies above the graph between the two points. Equivalently, a function is convex if its epigraph (the set of points on or above the graph of the function) is a convex set. A twice-differentiable function of a si...

#

basically, it's a bowl, and there is a bottom of the bowl. we need to find the bottom of the bowl.

slim bone
#

Oh, sure lol

#

But only a part of it is uh... "convex", no?

#

Or rather parts* of it

desert oar
#

well in this case the whole thing is, but yeah the real life loss surfaces are enormously complicated

slim bone
#

Right

desert oar
#

we aren't always guaranteed to have a global minimum. gradient descent only finds a global minimum under certain nice conditions, otherwise it finds a local minimum and we hope it's a good one

slim bone
#

Right

desert oar
#

in this particular case there happens to be an exact analytical solution (which you'll spend quite a lot of time reasoning about in a statistics class, it turns out to be just an orthogonal projection), but you can also use gradient descent, so that's what we'll use because it's what neural networks use

slim bone
#

orthogonal projection
Oh god, those are relevant to statistics? :(

#

Nevermind, sidetracked

desert oar
#

yes, linear algebra is essential in stats and machine learning

warm sage
#

Hi, I was hoping to get some advice here.

I am working on a digital text sentiment analysis tool in python. I was hoping to achieve this using machine learning and an amazon review dataset.

First of all, I'm not sure what type of model i will need to create (eg Linear Regression Model) so I could use some help deciding that.
Second of all, I have a 100gb file full of reviews and im not sure of the best way to go about importing and training on this data.

Thanks in advance

slim bone
#

To machine learning - sure. I was just hoping to be done with it in my remaining math-oriented academic courses
Nevermind though. Gradient descent. Sure

desert oar
#

for gradient descent, we need the gradient. but be careful: we specifically want the gradient of l with respect to the bs

slim bone
#

Yes

desert oar
#

remember, we are trying to minimize l over all bs

#

so we treat l as a function of the bs

#

does that make sense?

slim bone
#

So you get the partial derivative of (for example) b_1 * x_1 where b_1 is the variable, so its just... b_1?

#

I'm probably jumping the gun

#

Better to just understand by example, perhaps

desert oar
slim bone
#

The gradient are the partial derivatives with respect(?) to the variable you're looking for

desert oar
slim bone
#

Oh, right

desert oar
#

use the chain rule

slim bone
#

3x + 2y -> 3 partial derivative with x

#

Yeah I forgot

#

All clear

#

So the gradient would be (x_1, x_2, x_3) since there are no duplicate b's or whatever

desert oar
#

that's the gradient of p with respect to b1, b2, b3, yes (the entry for b0 is just 1)

slim bone
#

Right

desert oar
#

let's assume l is (p - y)^2. and p = b1*x1 + b2*x2 + b3*x3 + b0 as before. what's the gradient of l with respect to the bs?

slim bone
#

Err just a moment, I need to go back to the original equation

#

Err... isn't that 0

#

Because p = y?

#

Or am I misreading

#

Probably misreading, you probably want me to use the chain rule with f(x) = x^2

desert oar
# slim bone Because p = y?

i adjusted the notation. p is our prediction, y is the true body mass in the dataset. x and y are given to us and we treat them as fixed

slim bone
#

Ah apologies

desert oar
#

i need to go make dinner. ponder this for now, because i think it's the core of what you were struggling with originally

slim bone
#

Oh and, many thanks for your patience and help of course*

desert oar
#

i strongly suggest working through the actual calculation here to get an analytical closed-form expression for the gradient

#

it's a drill that should feel easy

slim bone
#

Indeed, but its crazy how quickly the human mind forgets things - I finished calc2 less than a month ago haha

#

I'll work through this

#

so:
l = (p - y)^2
l = (b1*x1 + b2*x2 + b3*x3 + b0 - y)^2
the partial derivative of b1 would be, err...
f(g(x))' = f'(g(x)) * g'(x) ->
f(x) = x^2, g(x) = (p - y) ->
2*(b1*x1 + b2*x2 + b3*x3 + b0 - y) * x1 =
2*x1(b1*x1 + b2*x2 + b3*x3 + b0 - y)? (partial derivative of b1)

Will pop something similar into wolfram real quick just to sanity check
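
The same sanity check can be done numerically. This is a minimal sketch, assuming the single-sample loss l = (p - y)^2 from above: it compares the chain-rule result 2 * (p - y) * x_j (and 2 * (p - y) for b0) against a finite-difference estimate. The numbers and function names are arbitrary.

```python
def loss(b, x, y):
    """l = (p - y)^2 with p = b1*x1 + b2*x2 + b3*x3 + b0, and b = [b0, b1, b2, b3]."""
    p = b[0] + b[1] * x[0] + b[2] * x[1] + b[3] * x[2]
    return (p - y) ** 2


def analytic_gradient(b, x, y):
    """Chain rule: dl/db_j = 2 * (p - y) * dp/db_j, with dp/db0 = 1 and dp/db_j = x_j."""
    p = b[0] + b[1] * x[0] + b[2] * x[1] + b[3] * x[2]
    return [2 * (p - y) * d for d in [1.0] + list(x)]


def finite_difference_gradient(b, x, y, eps=1e-6):
    """Central differences in each b_j, holding x and y fixed."""
    grad = []
    for j in range(len(b)):
        up = list(b)
        up[j] += eps
        down = list(b)
        down[j] -= eps
        grad.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grad


b = [0.5, 1.0, -2.0, 0.25]  # b0, b1, b2, b3
x = [1.7, 0.8, 0.9]         # x1, x2, x3 (treated as fixed)
y = 3.0                     # true value (treated as fixed)

exact = analytic_gradient(b, x, y)
approx = finite_difference_gradient(b, x, y)
```

If the two lists agree to several decimal places, the hand derivation is very likely right.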

#

Looks about right. I'll ponder on this for a little while longer
Thanks again, on the offchance you're reading this
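
With that gradient in hand, the gradient descent loop itself is short. Below is a sketch in plain Python, assuming mean squared error over a small made-up dataset. The learning rate, step count, data and function names are all illustrative choices, not anything prescribed above.

```python
def predict(b, x):
    """p = b0 + b1*x1 + b2*x2 + b3*x3, with b = [b0, b1, b2, b3]."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


def mse_gradient(b, X, y):
    """Gradient of mean((p - y)^2) with respect to b0..b3."""
    n = len(X)
    grad = [0.0] * len(b)
    for x_row, y_true in zip(X, y):
        err = predict(b, x_row) - y_true
        grad[0] += 2 * err / n           # dp/db0 = 1
        for j, xj in enumerate(x_row, start=1):
            grad[j] += 2 * err * xj / n  # dp/db_j = x_j
    return grad


def gradient_descent(X, y, lr=0.1, steps=2000):
    """Start from all-zero parameters and repeatedly step against the gradient."""
    b = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        g = mse_gradient(b, X, y)
        b = [bj - lr * gj for bj, gj in zip(b, g)]
    return b


# Toy data generated from known coefficients, so the answer can be checked.
true_b = [1.0, 2.0, -3.0, 0.5]
X = [[float(u), float(v), float(w)] for u in (0, 1) for v in (0, 1) for w in (0, 1)]
y = [predict(true_b, row) for row in X]

fitted_b = gradient_descent(X, y)
```

On this noise-free toy problem the fitted parameters should end up very close to `true_b`. On real data the learning rate usually needs tuning, and the inputs usually need scaling first.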

tepid hazel
#

hi there, I trained a model on a good bit of text-based data and the model seems to give really odd results. Even when I copy and paste a sample of the training data into the model to be predicted, it will return 1 despite that piece of data having been labeled as 0 when the model was trained. Is this a symptom of overfitting? I didn't add any sort of dropout or regularization in tensorflow, so I suspect it may be, but would that cause the model to not even be able to identify data which was inside its training data?

twilit tundra
#

Do you have a loss curve to check or something?

void veldt
#

I'm looking for a way to determine the probability (something akin to a p value) of getting a particular set of residuals (i.e. chi2) from a set of fitted solutions to my model (non linear least squares). I've seen a number of tests (e.g. pearsons chi squared test) but don't know which one is correct, and these also don't appear to be %s either

twilit tundra
#

You'd need to define the model you're using, the labels you're trying to classify, and what you'd define as reasonable results

#

I'd suggest using already available datasets such as Fashionpedia

#

And if you want good performance, the best way would probably be to use a pretrained computer vision model and use transfer learning

tepid hazel
desert oar
#

actually this document is very good and i suggest working through it

#

it seems right at your level

desert oar
desert oar
#

which unless i am misunderstanding your intention, is just whatever error distribution is built into your model

#

ah. what kind of model?

#

it still sounds like you want something along the lines of an error distribution, which is pretty much exactly what most statistical models try to estimate

void veldt
#

I've seen sometimes in their fits people report they got a chi2=1.5 with a 0.01% p value. Similar to how in F tests you report a p value (except there it's the probability of increasing adjustable parameters gives you a better fit)

#

here I'm looking more for the probability of my current fit given my data and model

#

and solutions

desert oar
void veldt
desert oar
#

are you interested in the probability of your exact model predictions, among all possible model predictions?

desert oar
#

what kind of model is this?

void veldt
#

unless I'm misunderstanding ur question

desert oar
#

this sounds like a hard task in general unless your model is parametric with a specific data distribution

#

i'd be tempted to solve this by simulation

#

generate realistic data, fit the model, repeat many times
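
A sketch of what that simulation could look like, in plain Python. Everything here is an assumption made for illustration: a straight-line fit stands in for the actual non-linear model, the noise is taken to be Gaussian with a known sigma, and the data points are invented. The idea is to simulate data from the fitted model, refit each time, and see how often the simulated chi2 is at least as large as the observed one.

```python
import random


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def chi2(xs, ys, slope, intercept, sigma):
    """Sum of squared residuals, each scaled by the assumed noise level."""
    return sum(((y - (slope * x + intercept)) / sigma) ** 2 for x, y in zip(xs, ys))


def simulated_p_value(xs, ys, sigma, n_sim=2000, seed=0):
    """Fraction of simulated datasets whose refit chi2 is >= the observed chi2."""
    rng = random.Random(seed)
    slope, intercept = fit_line(xs, ys)
    observed = chi2(xs, ys, slope, intercept, sigma)
    exceed = 0
    for _ in range(n_sim):
        # Generate data from the fitted model plus Gaussian noise, then refit.
        sim_ys = [slope * x + intercept + rng.gauss(0.0, sigma) for x in xs]
        s, i = fit_line(xs, sim_ys)
        if chi2(xs, sim_ys, s, i, sigma) >= observed:
            exceed += 1
    return observed, exceed / n_sim


xs = [float(i) for i in range(10)]
ys = [1.1, 2.9, 5.2, 7.1, 8.8, 11.3, 12.9, 15.2, 16.8, 19.1]
observed_chi2, p_value = simulated_p_value(xs, ys, sigma=0.2)
```

For a non-linear model, `fit_line` would be swapped for the real fitting routine. The result is only as trustworthy as the assumed noise model.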

void veldt
#

I'm more so looking for given this chi2 from my minimized solution, how unique is this chi2? Can I get this by another random set of solutions?

supple plover
#

I wonder how do you sort and refine a large amount of data for image classification/computer vision models? Are there automatic image labelling tools?

agile cobalt
# supple plover I wonder how do you sort and refine a large amount of data for image classificat...

if you rely on a tool to create your dataset automatically, any models trained from that data will perform at best as well as those tools do.

the best 'quality' datasets are typically labelled manually, by a lot of people hired specifically for that (see: human annotators)

for some purposes, you can just use images from Bing's API and alike, but typically you should prefer using curated datasets if any exist for the task you're trying to do

supple plover
#

I see so there's no going around annotating manually for the best/cleanest datasets huh

#

is that why people keep saying AI/ML is like 90-99% spent on the data and only the remaining for the actual model? kekl

agile cobalt
#

if you haven't yet, take a look at ImageNet and all the work that went behind the dataset used by it

#

part of, but not all of it

#

not just collecting/labelling data, but also dealing with issues like missing data, making sure you didn't misunderstand anything, checking some statistical properties sometimes

supple plover
#

this field is really hard....

tepid hazel
#

hello! my model, shown below, seems to be suffering from what I can only assume is overfitting. I retrained it after adding some L2 regularization and a 0.5 dropout (disabling 50% of neurons during training). I'm not sure what I'm doing wrong here: whenever the model is tested on any other text it returns something like 0.998~, yet it performs very well on the training data and gets those samples correct when they are passed in. Here's my model
```py
import tensorflow as tf

# tokenizer and max_seq_length are assumed to be defined elsewhere (not shown)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        input_dim=len(tokenizer.word_index) + 1,
        output_dim=128,
        input_length=max_seq_length),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid',
                          kernel_regularizer=tf.keras.regularizers.l2(0.01))
])
```

and here are the epochs and final loss/accuracy (I'm not sure why the final accuracy is so high)
```
loss: 0.0272 - accuracy: 0.9967
loss: 0.0134 - accuracy: 0.9992
loss: 0.0113 - accuracy: 0.9996
loss: 0.0096 - accuracy: 1.0000
loss: 0.0097 - accuracy: 0.9998
loss: 0.0095 - accuracy: 0.9999
loss: 0.0091 - accuracy: 1.0000
loss: 0.0091 - accuracy: 1.0000
loss: 0.0092 - accuracy: 1.0000
loss: 0.0090 - accuracy: 1.0000

Loss: 0.021111026406288147, Accuracy: 0.996656596660614
```

lapis sequoia
#

Beginner here. Have some experience with using Tensorflow and Keras though at a novice level. What's one thing I cam do to go to the next level?

twilit tundra
twilit tundra
hasty mountain
# supple plover I see so there's no going around annotating manually for the best/cleanest datas...

I haven't seen the process behind ImageNet, but I've been doing some (personal) research on dataset labeling around that (and exactly to make my own datasets)
You should try taking a look at Unsupervised Learning and, especially, Self-Learning (which may provide you with better results).

This blog post may also help you:

https://lilianweng.github.io/posts/2021-12-05-semi-supervised/

#

In a nutshell, there's no escape from having to manually label your dataset, but you can spare some work and anti-inflammatories if you can make a model (and a method, maybe? Like SimCLR?) that's able to properly learn from few labeled samples and generate good quality pseudolabels (or labels automatically generated) for the rest of your dataset.
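
One common variant of the pseudolabel idea is confidence thresholding: only keep the model's own prediction as a label when it is very sure. A minimal sketch in plain Python follows. The 0.95 threshold, the fake model outputs and the function name `select_pseudolabels` are all illustrative assumptions.

```python
def select_pseudolabels(probabilities, threshold=0.95):
    """Keep unlabeled samples whose top predicted class probability is >= threshold.

    probabilities: one list of per-class probabilities per unlabeled sample.
    Returns a list of (sample_index, predicted_class) pairs.
    """
    selected = []
    for idx, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Pretend these came from a model trained on the small hand-labeled subset.
model_outputs = [
    [0.98, 0.01, 0.01],  # confident: class 0
    [0.40, 0.35, 0.25],  # not confident: left unlabeled
    [0.02, 0.97, 0.01],  # confident: class 1
]
pseudolabels = select_pseudolabels(model_outputs)
```

The selected pairs would then be added to the training set and the model retrained. The uncertain samples are the ones still worth labeling by hand.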

supple plover
#

I'll look into it. To be fair, I'm only looking into it bcs I'm just studying all this alone, surely companies have human annotators to do the labeling.

hasty mountain
#

Poor guys...

sleek harbor
#

Anyone a pro in Plotly Dash?

I wanted to know:
1 - is it true that all callbacks get called automatically at the start, when the app is booted? If so, then in what order? Can that be checked/changed somehow?
2 - does that mean that there's no point in setting the value of a parameter/property inside the layout definition, if that parameter/property is the output of a callback, because it'll immediately be replaced by the output of the first automatic callback call?

lusty lotus
#

I have an RL question. In this video https://youtu.be/my207WNoeyA?list=PLZbbT5o_s2xoWNVdDudn51XM8lOuZ_Njv&t=242, i understand previously that:

  • a function taking state s and action a can be mapped as f(s_t, a_t) = r_{t+1}
  • RL is mostly based on a loop State -> Action -> Reward
    However i don't understand this, i don't understand how (and what) the transition probability is, even though i understand that the current action picked from a state determines the reward.
tidal bough
#

There exist environments which aren't deterministic - the same action in the same state may result in varied next states.

lusty lotus
#

right

#

but i don't understand the text and the math notation in the video ive just linked above

#

like i don't know how does the information this page tie into contents discussed previously

tidal bough
#

This slide?

lusty lotus
#

yes this one. i don't get it

tidal bough
#

That's the distribution over possible next states - if you're in state s and take action a of the allowed ones in that state, you may end up in state s' with reward r with probability p(s',r | s,a). The equation on the bottom is just rewriting the same thing - it's defined as the probability that S_t, the state at time t, and R_t, the reward at time t, are s' and r respectively, conditional on the state at time t-1 being s and the action taken at time t-1 being a.

lusty lotus
#

idk what the lower rewrite is

#

i mean the second part of the p(s',r | s,a)

tidal bough
# lusty lotus i have a few questions on this: - what does the | mean - so does p(s',r | s,a) ...

That's the "conditional" notation - e.g. P(A|B) is "probability A happens, conditional on B having happened"

so does p(s',r | s,a) really just mean "i do something at state s', which is a (allowed by the state), which gives me reward r at state s'?
If you do action a at state s, you can, in the general case, get any reward and end up in any state - and that's governed by a probability distribution. Specifically, the probability of getting reward r and ending up in state s' is p(s',r | s,a).

#

E.g. if your environment is fully deterministic, then for each s,a, there'll be just one specific s',r pair the probability of which will be 1, and the probabilities of all other states-rewards pairs will be 0.
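
Here is a small sketch of p(s',r | s,a) as a lookup table in plain Python, with one deterministic and one stochastic state-action pair. The state names, actions and probabilities are invented for illustration.

```python
import random

# p[(s, a)] maps each possible (next_state, reward) pair to its probability.
p = {
    ("s0", "left"): {("s0", 0.0): 1.0},                     # deterministic
    ("s0", "right"): {("s1", 1.0): 0.8, ("s0", 0.0): 0.2},  # stochastic
}


def step(state, action, rng):
    """Sample (next_state, reward) from p(s', r | s, a)."""
    outcomes = p[(state, action)]
    pairs = list(outcomes.keys())
    weights = list(outcomes.values())
    return rng.choices(pairs, weights=weights, k=1)[0]


rng = random.Random(0)
samples = [step("s0", "right", rng) for _ in range(10000)]
frac_s1 = sum(1 for s_next, r in samples if s_next == "s1") / len(samples)
```

For each (s, a) the probabilities over all (s', r) pairs add up to 1. Taking "right" from s0 many times lands in s1 in roughly 80% of the tries, while "left" always gives the same outcome.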

lusty lotus
# tidal bough That's the "conditional" notation - e.g. `P(A|B)` is "probability A happens, con...

im not sure if i understand you correctly:

That's the "conditional" notation - e.g. P(A|B) is "probability A happens, conditional on B having happened"
does it mean "the probability of A happening after B happens"?

and what you mean in your second part here is that after taking an action, the state and reward is like random but the chances of a SPECIFIC state and reward occurring is whatever is on the other side of the equal sign of p(s',r | s,a)? like a "spin a wheel" where the wheel has like sections with different colour?

lusty lotus
tidal bough
lusty lotus
#

wdym same pair

tidal bough
lusty lotus
#

but in other words, should i replicate action a at state s, i'll yield the same rewards every time right?

lusty lotus
#

im learning RL basics since im implementing AlphaZero (i set up the search alg and NN already, just need to implement the training loop since the paper implies i know this alr)

#

can't wait to learn this and implement training loop and train on my new GPU (excited)

tidal bough
lusty lotus
#

right

#

may i ping you when i see something idk?

tidal bough
#

probably just ask here, I am not always online.

lusty lotus
#

sure

#

but like a lot of times my question is ignored or get pointed to an SO link

#

:/

lusty lotus
tidal bough
#

well, sure.

lusty lotus
#

then the thing is

#

i don't understand why discounted reward exists and why it's useful lol

tidal bough
#

because if you don't do any discounting, for many games the expected reward is clearly infinity no matter what you do, so not much to optimize.

lusty lotus
tidal bough
#

discounting will make the reward finite always, even if the game will be infinite.

lusty lotus
#

like is it like infinity*80% or some shit?

#

this concept doesn't make sense

tidal bough
#

no? like the sum of a decaying geometric progression being finite.

boreal gale
# sleek harbor Anyone a pro in Plotly Dash? I wanted to know: 1 - is it true that all callback...

not a pro, used it a couple of times, actually still getting up to speed with it this week after not using it for years.
re. 1)
https://dash.plotly.com/advanced-callbacks#when-a-dash-app-first-loads
yes, it is called automatically
order is determined by a dependency tree (i think of it as a DAG 🤷‍♂️ )
see https://community.plotly.com/t/what-is-the-execution-order-of-callbacks/6858/2 for an answer from the author himself.

no comment on 2)

lusty lotus
tidal bough
#

Well, that's just the equation for discounted reward - it's the expected reward, except the terms are multiplied by 1, γ, γ^2, γ^3, ... - a decaying geometric progression. It's a math fact that if you take most series (those that don't grow unboundedly, or even do grow unboundedly but not exponentially fast) and construct a "discounted" series like that, the sum of it will be finite. So this makes the reward finite even for infinite games.
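
A quick numeric sketch of why discounting keeps the total finite, in plain Python. The reward stream (1 at every step) and γ = 0.9 are just example values.

```python
def discounted_return(rewards, gamma):
    """G = r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    total = 0.0
    for k, r in enumerate(rewards):
        total += (gamma ** k) * r
    return total


gamma = 0.9

# A reward of 1 at every step: the plain sum grows with the horizon,
# but the discounted sum levels off near 1 / (1 - gamma).
for horizon in (10, 100, 1000):
    print(horizon, sum([1.0] * horizon), discounted_return([1.0] * horizon, gamma))

limit = 1.0 / (1.0 - gamma)
```

The undiscounted totals keep growing as the horizon gets longer, while the discounted ones stop changing noticeably once γ^k has become tiny.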

#

wouldn't making the discounted reward make it "less accurate"? like the agent is getting less reward (sad)
kind of, yeah, but this is more of a philosophical point. Note that you can make γ arbitrarily close to 1 if you want the agent to consider the future more - as long as it's below 1, the discounting will work.

lusty lotus
tidal bough
#

what would the agent do with the discounted expected return?
the value itself? nothing, it just should be maximized - so in practice, RL usually consists of getting a good idea of what sets of actions will give what rewards in the long run, then doing them.

#

why is there a expected return in the first place? even for episodic tasks? why (and how) would the AI make the most of all anticipated rewards?
I don't really understand what you mean by these.

jaunty lion
#

hey guys, what are ways i can analyze audio data (mp3files), in such a way that the resulting extracted data would always be in the same shape, so it would be suitable for machine learning purposes. Thanks for any answers.

sleek harbor
# boreal gale not a pro, used it a couple of times, actually still getting up to speed with it...

I have 3 callbacks that depend on each other 1->2->3. The 1st callback creates a global variable, and the 3rd one uses it to create a table. But for some reason, when u load the app - it doesn't work, there's nothing there, no table. The strange part is that if I then relaunch the app, then it does work (p.s. doing this in a jupyter notebook, so variables carry over from one app launch (cell execution) to the next, but when I reset the kernel - it doesn't work again). So all callbacks do work, the one that creates the global variable does work, and so does the one that uses it, but something must be wrong with the order, even tho it should be correct. I just can't get it

lusty lotus
lusty lotus
tidal bough
#

similarly, you can introduce the concept of "discounted reward" to talk about "what actions to take to maximize this", but that doesn't necessarily mean you're actually going to be evaluating that function; maybe you'll just analytically determine the best strategy from analyzing it.

tidal bough
lusty lotus
tidal bough
#

Kind of. It'd be a much more complicated function - a function of your strategy (what actions you take depending on the state), and you'd want to find the optimal strategy. In almost no real games will you be able to just derive the optimal strategy (for instance, because you might not even know the form of the reward). But as you'll probably soon see in the course, that still allows you to derive some important properties the optimal actions must have.

lusty lotus
#

shit im getting confused with the math

#

:/

lusty lotus
wooden sail
#

strategies are often discrete if you choose/are able to represent them numerically. otherwise they're algorithms. neither lends themself to differentiation

tidal bough
# lusty lotus can i do this using grad descent?

Ehh, sure, for some kinds of games you can "just" numerically optimize a very high-dimensional function. But, well, remember how a strategy is basically "what action you take in a state"? Well, the state space isn't necessarily discrete, it may be continuous. So you're finding a function that optimizes a certain value, and that's getting very complicated to represent numerically.

boreal gale
lusty lotus
burnt oxide
#

is this channel not so beginner friendly. 🃏

wooden sail
#

it is if you ask beginner-friendly questions 😛

tidal bough
lusty lotus
#

now i have an existential crisis 🫠

wooden sail
#

that is optimal in some sense, i.e. the worst possible

lusty lotus
#

questioning my purpose as a simple being when i don't understand how decisions and rewards are supposed to be maximised 🫠

tidal bough
#

decision theory does cause that as a side effect 🙂

twilit tundra
#

It's easy to always make the optimal decision when you know the state and position of every particle in existence and you can predict accurately human behavior

tidal bough
#

citation needed; i think there'll be some issues even then :p

left tartan
tidal bough
#

qualifications of Albert Einstein:

  • hero of many anecdotes
  • being famously wrong about quantum mechanics
warm sage
#

Hi, I was hoping to get some advice here.

I am working on a digital text sentiment analysis tool in python. I was hoping to achieve this using machine learning and an amazon review dataset.

First of all, I'm not sure what type of model i will need to create (eg Linear Regression Model) so I could use some help deciding that.
Second of all, I have a 100gb file full of reviews and im not sure of the best way to go about importing and training on this data.

Thanks in advance

simple tapir
#

Can ML engineers also work as data scientist? Do they have to do anything extra to be a data scientist?

serene scaffold
simple tapir
#

uhh

serene scaffold
#

uhh?

simple tapir
#

When I searched for machine learning engineer positions, I couldn't find any for some companies. That's why I wondered whether I could apply for a data scientist position. (I'm not searching for a job rn, I've just finished my freshman year and want to go through this field)

serene scaffold
#

have you started looking for internships for next summer and beyond?

simple tapir
serene scaffold
#

right, you probably wouldn't have gotten one this summer

simple tapir
#

yeah, can't apply for one now

#

So, it's better to look at job descriptions instead of job titles, right?

serene scaffold
#

pretty much

simple tapir
#

Gotcha, thanks a lot

serene scaffold
#

at least in the context of AI/ML/DS positions

simple tapir
#

I see, will keep that in mind

left tartan
#
  • I've seen a lot of posts saying things like: "I don't need to learn XYZ because all I want is AI/ML"
simple tapir
#

may you give an example?

left tartan
# simple tapir may you give an example?

I dunno, yesterday someone was saying something about not needing to learn anything about web development because they didn't want to be a front-end developer.

#

And someone else said they didn't like data analysis but wanted to do AI/ML, which I thought was hilarious.

past meteor
#

100 % agree with Stel

#

Additionally, just give it time tbh. Enjoy life, enjoy school, take courses that you like and do internships

vestal widget
#

I want to create a conversational chatbot that can generate text like GPT-3 or GPT-4 and can be trained with custom data, where should i start?

lapis sequoia
#

what is wrong with my tacotron2 training model this is supposed to be spongebob😭

mild dirge
#

kill it

lapis sequoia
#

😂

serene scaffold
lapis sequoia
#

I need to fix it because I want to stream ai sponge

serene scaffold
#

so you're trying to make a synthetic voice of spongebob. what was the input for that audio? "hahahaha"?

lapis sequoia
#

Hi I am spongebob

#

this is the input text😭

serene scaffold
#

what was the total duration of your training data?

lapis sequoia
#

I am not sure but it's 2000 samples

#

trained for 12 hours on 3090

serene scaffold
#

welp

lapis sequoia
#

it's cursed

serene scaffold
#

you might have to keep training it.

#

but it might also be that you don't have enough data, or that the quality isn't pristine enough

lapis sequoia
#

I don't think so it's just giving results like this no matter the training time

#

12 hours is quite long and it should at least be understandable

#

I just want to understand the issue

serene scaffold
#

I've heard of people running tacotron for weeks

hasty mountain
#

Have you trained it from scratch?

lapis sequoia
#

yes

hasty mountain
#

Then Stelercus is probably right

lapis sequoia
#

I need to train it for longer?

serene scaffold
#

> assuming Stelercus could be partially wrong

hasty mountain
#

Tacotron 2 was originally trained on... I think...around 40.000 audio samples?

#

And for quite some time... I don't remember the details...been a while since I've read the paper pithink

serene scaffold
#

wish they had named it sushitron

hasty mountain
#

Why?

serene scaffold
#

that was the other name they considered

#

but the taco camp won

hasty mountain
#

Oh

lapis sequoia
#

i've seen someone training a model using 10 audio samples and for 1 hour and it's understandable

hasty mountain
#

Wish my Audio GAN would work without killing my GPU grumpchib

lapis sequoia
#

on youtube

hasty mountain
#

They probably used a pre-trained model and applied training on their custom data

lapis sequoia
#

oh

#

he used Google Colab

hasty mountain
#

I've also used a pre-trained model on 150 audio samples and it worked quite fine after 2~3 hours

lapis sequoia
#

omg

#

so how long do you think I need to train it

#

is it possible to train spongebob voice on pre trained model?

hasty mountain
#

On pre-trained model, you may need around 2~3 hours...maybe less, since you got a reasonable dataset size

lapis sequoia
#

I am using this command:
```
python train.py --output_directory=outdir --log_directory=logdir
```

#

how can I use pre trained model

hasty mountain
#

You need to have the model weights already downloaded

lapis sequoia
#

where can I get one

hasty mountain
#

Maybe Tacotron2's GitHub will have the pretrained weights

lapis sequoia
#

can you please tell me what is the prompt to use the pretrained model

hasty mountain
#

I can't give more details, though. I always thought using pre-trained models was boring...specially since those tech companies tend to make their models GitHub a bit confusing...

hasty mountain
lapis sequoia
#

ok

hasty mountain
#

An easy way to train it may be using Uberduck, too

#

That was the way I used it.

lapis sequoia
#

uberduck is very expensive if I need to stream 24/7

#

$120/day

#

anyway thank you for the information

past meteor
#

This sounds like the stuff of nightmares

oblique quarry
#

Good afternoon, Im reading up on layer normalization. And the concept checks out and makes sense, given that it combats issues like gradient explosion. But what bothers me is that I'd have to constantly take the mean and subtract it from my layer and then divide it by its std so im permanently altering my values in the layer so that they center around zero but dont i run into the danger of having too many zeros and subsequently killing the net?

#

Would appreciate if sb could link me a resource to help me understand the concept better
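A minimal NumPy sketch of the layer normalization described in the question, in case it helps. It assumes the usual formulation with a learnable scale and shift; the names `layer_norm`, `gamma`, `beta` and the sample activations are made up for illustration. The point it tries to show is that "centered around zero" means the mean of each sample is zero, not that the individual values become zero: the spread is rescaled to roughly unit variance.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each sample (row) over its features (last axis)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta are learnable, so the network can rescale or shift
    # the normalized values again if that helps training.
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 8))  # hypothetical activations
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))

print(out.mean(axis=-1))  # close to 0 for every sample
print(out.std(axis=-1))   # close to 1 for every sample
```

The normalization is recomputed on every forward pass from the current activations, so nothing is permanently overwritten in the layer's weights.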

simple tapir
left tartan
simple tapir
#

Right, thanks for the suggestions 🙏

gentle horizon
#

hey !

lapis sequoia
#

I made a monster

cold osprey
#

first second or so is passable

lapis sequoia
#

is this the way to use a pretrained model to train a tacotron2 model:
```
python train.py --output_directory=outdir --log_directory=logdir -c tacotron2_statedict.pt --warm_start
```

lapis sequoia
#

there is improvement guys

potent sky
#

anyone here attending IAIM'2023?

humble shore
#

Any one here had an internship

#

If so, what projects do they require

#

Or what good resume

mint palm
#

helpppp

#

i am using ssh
lsof -i :<port_number>
prints nothing

i tried to run this code before it ran "FINE", but now i get this

98 means port is busy, how to change it
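On Linux, errno 98 is `EADDRINUSE`. One way to sidestep it, sketched below under the assumption that the script is free to pick its own port, is to ask the OS for an unused one by binding to port 0. The helper name `find_free_port` is hypothetical and not part of whatever framework raised the error.

```python
import socket

def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]

port = find_free_port()
print(port)  # pass this value to the server instead of the busy port
```

If `lsof` prints nothing, it may also be worth running it with elevated privileges, since it may not show sockets owned by other users.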

tepid hazel
#

I believe there are more 0 entries than there are 1

#

im also not sure why model.evaluate gives it a 0.99, I use model.predict on the testing data and most of it is wrong

twilit tundra
#

What loss are you using?

mild dirge
#

As in you normalized the test data as well f.e.

#

And you didn't accidentally flip the labels at some point

tepid hazel
mild dirge
#

So when does it provide "wrong" results then?

#

Only with .predict but not with .evaluate?

tepid hazel
#

Here's the loss and accuracy of my model during training. The last printout is the loss and accuracy returned by model.evaluate

mild dirge
#

On the test data right?

tepid hazel
#

yes

mild dirge
#

What is the shape of your dataset you pass to predict and to evaluate?

tepid hazel
#

if that's what you're asking

#

hm, the model seems to perform well on the test data, however, when it receives data that isn't really positive nor negative like hey! it will always form a bias to negative and return 0.99 as far as I can tell

oblique quarry
#

Good evening guys is anyone familiar with batchNormalization?

serene scaffold
void veldt
#

When discussing reduced chi2, is the sum of squared residuals normalized against degrees of freedom or number of fitted data points? Because I've seen both used. Or is it only number of fitted data points if data points >>>> number of adjustable parameters
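For reference, the usual definition divides chi-squared by the degrees of freedom, i.e. the number of data points minus the number of fitted parameters; dividing by the number of points is only an approximation that becomes close when there are far more points than parameters. A small sketch on made-up data (the straight-line model, `sigma` and the sample size are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
sigma = 0.5  # assumed known measurement uncertainty
y = 2.0 * x + 1.0 + rng.normal(scale=sigma, size=x.size)

# Fit a straight line: two adjustable parameters (slope, intercept).
coeffs = np.polyfit(x, y, deg=1)
residuals = y - np.polyval(coeffs, x)

chi2 = np.sum((residuals / sigma) ** 2)
n_points = x.size
n_params = coeffs.size
dof = n_points - n_params          # degrees of freedom

reduced_chi2 = chi2 / dof          # standard definition
chi2_per_point = chi2 / n_points   # approximation, close when n_points >> n_params

print(reduced_chi2, chi2_per_point)
```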

umbral charm
#
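```py
import pandas as pd
import matplotlib.pyplot as plt

die = pd.DataFrame([1, 2, 3, 4, 5, 6])
trial = 10000
# one pandas sample() call per trial; this list comprehension is the slow part
sum = [die.sample(2, replace = True).sum().loc[0] for i in range(trial)]
freq = pd.DataFrame(sum)[0].value_counts()
print(freq.sort_index())
Relfreq = freq.sort_index() / trial
Relfreq.plot(kind = 'bar')
plt.show()
```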

is there a way i can make this faster, it takes about 4 seconds to do, but i need to go up to 1 million trials

#

maybe use numba or sometin

left tartan
#

maybe ```py
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6])
trial = 10000
s = np.sum(np.random.choice(a, size=(trial, 2), replace=True), axis=1)
```

umbral charm
#

but i have to use pandas in this task

oblique quarry
#

alright, can somebody take a look at it?
```py
import numpy as np


class BatchNormalization():
    def __init__(self):
        """
        Please note that I'm using weight and bias in place of gamma and beta to make this module
        compatible with the rest of the library.
        """
        self.weight = 1
        self.bias = 0

    def forward(self, inputs):
        """
        Subtracting from the input its mean before dividing by the standard deviation of the input.
        Finally multiplying it by the self.weight parameter and adding self.bias to it.
        """
        self.inputs = inputs
        self.mean = np.mean(inputs, axis=0)
        self.variance = np.var(inputs, axis=0)
        self.stdDev = np.sqrt(self.variance + 1e-8)
        self.normalizedInputs = (inputs - self.mean) / self.stdDev
        return self.weight * self.normalizedInputs + self.bias

    def backward(self, gradient):
        """
        Backpropagation through the layer. We first compute the gradients of the loss with respect to
        the normalized inputs, variance, and mean. Then we apply the chain rule to derive dweight,
        dInput and dbias. As per usual, dbias is just the summed gradient, since the derivative of
        the addition is one.
        """
        N, D = gradient.shape
        dNormalizedInputs = gradient * self.weight
        dVariance = np.sum(dNormalizedInputs * (self.inputs - self.mean) * -0.5 * (self.variance + 1e-8)**(-1.5), axis=0)
        dMean = np.sum(dNormalizedInputs * -1 / self.stdDev, axis=0) + dVariance * np.mean(-2 * (self.inputs - self.mean), axis=0)
        dInput = dNormalizedInputs / self.stdDev + dVariance * 2 * (self.inputs - self.mean) / N + dMean / N
        self.dweight = np.sum(gradient * self.normalizedInputs, axis=0)
        self.dbias = np.sum(gradient, axis=0)
        return dInput
```
left tartan
jovial swift
#

Yes

umbral charm
left tartan
#

die.sample(trial*2, replace=True) gives you 20000 samples, then you reshape it to (trial, 2), then sum each row
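A sketch of that suggestion, staying with pandas for the sampling. The variable names are illustrative and `random_state=0` is only there to make the run repeatable; the plotting step from the original snippet is left out.

```python
import pandas as pd

die = pd.DataFrame([1, 2, 3, 4, 5, 6])
trial = 10_000

# Draw all 2 * trial rolls in a single call instead of one sample() per trial.
# replace=True is required because the die only has six rows.
rolls = die[0].sample(trial * 2, replace=True, random_state=0).to_numpy()

# Each row of the reshaped array is one trial of two dice.
sums = rolls.reshape(trial, 2).sum(axis=1)

freq = pd.Series(sums).value_counts().sort_index()
rel_freq = freq / trial
print(rel_freq)
```

Because there is only one `sample` call, the cost should grow much more gently with `trial` than the list comprehension does.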

sleek harbor
#

Any easy way of sharing a jup notebook, since uploading here isn't allowed?

sleek harbor
# boreal gale i think i had something similiar happened to me before, can't remember how i fix...

https://filetransfer.io/data-package/lPNgCRmp#link

So, not necessarily minimal, but should be reproducible, I think. If u just run the whole notebook, then the last cell will give the error in the first screenshot. If u then open the app by clicking the link, and simply close it again, and rerun the last cell, then you'll get what I expect immediately. When you open the link for the first time, there is no table at the top right. If you then close the link and reopen it, without running any extra cells, the table will suddenly appear (as the variable is now properly initialized and filled, as seen by the last cell of the notebook). I don't understand this behavior.

Ignore the terrible formatting, that's how it's "supposed" to be (haven't worked on it yet). I added some thicc markdown so you could navigate my extremely messy code more easily. Sorry about that.. 😅 I can't seem to figure out why it doesn't work on the first start. I think it might have something to do with order of initial execution of callbacks, but honestly no idea. After you reopen the link, everything works as it should (when you change the sector dropdown, the ticker dropdown options change, and when you change those, the table changes)

gentle creek
#

is anybody here presently working on object detection?

opal pike
#

Yes

hasty spear
#

nope but i'd love to hear some about it

molten onyx
#

Hello, I'm new to the machine learning field. I recently wrote the foundation of a neural network in C++. Currently I'm stuck at implementing the backpropagation method, and I just wanted to ask you guys if you have a good source where I can learn the math behind it.

edit:
I should note that this is a supervised nn

hasty mountain
#

The backpropagation is basically the Chain Rule from calculus.

#

There's one or another trick, like having to transpose the weights matrices when doing the chain rule, but in general it's just the chain rule, beginning at the loss function and going backwards until you get to the first layer.
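To make the chain rule and the transposes concrete, here is a small NumPy sketch for a two-layer network. The layer sizes, the tanh activation and the squared-error loss are all assumptions picked for illustration, not anything from the C++ project in question.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))    # batch of 5 inputs with 3 features
T = rng.normal(size=(5, 2))    # targets
W1 = rng.normal(size=(3, 4))   # first layer weights
W2 = rng.normal(size=(4, 2))   # second layer weights

def forward(W1, W2):
    H = np.tanh(X @ W1)                 # hidden activations
    Y = H @ W2                          # linear output layer
    loss = 0.5 * np.sum((Y - T) ** 2)   # squared-error loss
    return H, Y, loss

H, Y, loss = forward(W1, W2)

# Backward pass: start at the loss and apply the chain rule layer by layer.
dY = Y - T                  # dL/dY
dW2 = H.T @ dY              # dL/dW2 uses the transposed layer input
dH = dY @ W2.T              # dL/dH uses the transposed weights
dZ1 = dH * (1.0 - H ** 2)   # tanh'(z) = 1 - tanh(z)^2
dW1 = X.T @ dZ1             # dL/dW1
```

A finite-difference check on a single weight is a cheap way to confirm that a hand-written backward pass matches the loss.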

molten onyx
#

cool, thanks! ill have a look

hasty mountain
#

A class about chain rule will also be a must. I don't recommend the ones I've used because they're not in english

plush jungle
#

does anyone have any resources that show examples of how RNN hidden states encode patterns?

#

for example, my NLP professor mentioned that RNNs can learn open/close parenthesis and showed how the hidden state could encode that kind of thing, but it was a while ago and I don't remember it as well as I wish I did

gilded kestrel
#

is colab memory usage garbage?

plush jungle
robust cliff
#

hello cool people I need help choosing what to use for sentiment analysis

#

I am currently in a node enviroment and am using natural, but results are trash no matter how much I preprocess the data

#

I can probably get python to run in there, but what do I use?

#

the data comes from what a user has written in an obsidian note

#

so could be pretty much anything

twilit tundra
robust cliff
#

okay many thanks, didn't know about it

fresh harbor
#

OpenCV seems to have a really shitty ONNX reader

vestal spruce
#

I've asked about this before but I didn't follow up on the discussion, so I would like to ask again. Has there been any attempt to make a speech recognition model capable of distinguishing monologue and dialogue? So far I've found that it's possible, and I have a basic step-by-step idea of what I'm trying to achieve: determine how many people are speaking in a given audio > figure out their segments in the audio > lastly convert the speech to text accordingly.

#

Would implementing NLP instead be better though?

vestal spruce
#

I will try this out

left tartan
vestal spruce
#

I don't know the formal term for "deaf" people, but to me this term felt condescending for some reason, so I'll refer it as impaired hearing even though I know it's not accurate (I'm ESL)

twilit tundra
hollow citrus
#

Hi! I am working on a project for work, where I have to use an LLM and fine-tune it based on user input. I have been asked to provide a general system configuration for a multi-server setup where this system would run for me to tinker with it and test it. I would like to get some suggestions for this. It will probably be a CPU-only server cluster, but GPU-based system recommendations are welcome. The systems would run some flavour of Linux, suggestions are welcome for that as well. I would also like to know how I go about using multiple systems to train a single model at the same time.

desert oar
civic elm
#

Hi any datascience substacks you read once in a while?

civic elm
queen vector
#

hi everyone
i was wondering, can we use ai ml in automating API testing?
if yes how ?, i would like to do a small implementation of it

near basin
#

Hello, people, I was recommended to ask here, but the question is not specifically related to python.
I am again working with Reinforcement Learning.
This time I use Neural Networks for the Q-Table.
My Agent is playing a game against another independent Agent. The Reward policy in the middle of the game always produces 0, and in the end of the game the Reward is either -1 for losing, or +1 for winning. This reward gets backpropagated over all the states achieved in the game.
But here is my question:
If the Q-Table were just a Lookup Table - Q-Value adjustment would be as easy as going over every accumulated state and performing the adjustment.
However when the Q-Table is a Neural Network: Adjusting Q-Value for one state changes the whole network. In which order should the adjustments be made? Reversed order from the end of the game to the beginning, or from beginning to the end?

desert oar
queen vector
hollow citrus
#

Also, how to use multiple systems to perform the training, if possible

#

It'd just be an ubuntu system or something and I would just run my code on the servers. I just want the config that can handle that

serene scaffold
mild dirge
left tartan
mild dirge
#

The idea is that you don't just use the sample at a given time, but take multiple random previous samples and train the model on that

#

That way you can also use the same sample multiple times, and the batches are not as "correlated"

near basin
#

This is not relevant in my case, I am asking about adjusting the NN in my mentioned way.

#

Thank you for the reference tho

mild dirge
#

I think it is relevant to your question, you ask in what order you feed the data into your model for training right? @near basin

near basin
#

More on the context (should probably have mentioned it before):
I make a simple RL for playing Tic-Tac-Toe. The "experience replay" sounds promising, except it does not fit the use case here, because the Agent has "no" experience on a made step, and makes this "experience" only when the game is finished

#

So taking random batches won't really help

mild dirge
#

What algorithm are you using to update the model, is it online or offline?

near basin
#

I am not very advanced in this, what is the difference between these two?

mild dirge
#

There are online methods that don't require future states to update the model

#

Such as deep-Q learning (probably most popular method)

mild dirge
#

So the experience replay buffer would be perfectly viable

near basin
mild dirge
#

With deep-Q learning you can also update the model after each step

#

but not necessarily with the trajectory from that step

#

But with random trajectories from the previous 1000 or so steps
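A minimal sketch of such a replay buffer, framework-independent. The class name, the capacity of 1000 and the dummy transitions are assumptions for illustration only.

```python
import random
from collections import deque

class ReplayBuffer:
    """Keeps the most recent transitions and hands out random minibatches."""

    def __init__(self, capacity=1000):
        # deque with maxlen silently drops the oldest transition when full
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up the correlation
        # between consecutive steps of the same game.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

buffer = ReplayBuffer(capacity=1000)
for t in range(1500):  # dummy transitions; states are just step indices here
    buffer.push(t, 0, 0.0, t + 1, False)

batch = buffer.sample(32)
```

Since batches are drawn at random, the order in which the states of one game were visited stops mattering for the update.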

near basin
#

I will report later if I have any questions

mild dirge
#

Sure

#

This was a tutorial I used when I did a RL project

#

But it is with pytorch, not tensorflow

near basin
#

I do not care about implementation, only theory. Because I do not even make this project in python

mild dirge
#

It helped me because code is a very exact way to describe how an algorithm works. So even if you don't care about the implementation, it may be good to look at still 🙂

near basin
#

Hhmmm, my biggest mistake for now was, that I was using only one network, instead of Main+Target pair

mild dirge
#

It helps with stability, but iirc I updated the target model after every 10 steps, and it still worked, so it is not 100% necessary for simpler problems

near basin
#

Wdyt, if I use main+target, then using the target network I will play one round, then update the main network using the saved states and then put the weights to the target network. Sounds like a plan, doesn't it? But will this approach work?

#

Then, I don't have to care about in which direction the backpropagation is going

mild dirge
#

In that case both networks would be the same

#

As you replace the target network with the policy network after updating the policy network every round

#

Typically you use the policy network to make decisions (but also add some random decisions to explore the state space)

#

And the target network is just used to calculate the temporal difference target

#

So this value is predicted with the target network

near basin
#

Yeah

mild dirge
#

In this formula

near basin
mild dirge
#

And the Q(S_t, A_t) is the policy network
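The formula image did not survive in the log. Assuming it is the standard Q-learning temporal difference target, R + gamma * max_a Q_target(S', a), the target computation could be sketched like this. The function name and the example numbers are hypothetical, and `next_q_target` stands for the target network's outputs for the next states.

```python
import numpy as np

def td_targets(rewards, next_q_target, dones, gamma=0.99):
    """TD target: reward plus discounted best next-state value from the target network."""
    max_next = next_q_target.max(axis=1)
    # For terminal transitions (done == 1) there is no next state to bootstrap from.
    return rewards + gamma * max_next * (1.0 - dones)

# Three example transitions; only the last one ends the game (reward +1 for a win).
rewards = np.array([0.0, 0.0, 1.0])
dones = np.array([0.0, 0.0, 1.0])
next_q_target = np.array([[0.2, 0.5],
                          [0.1, -0.3],
                          [0.9, 0.4]])

targets = td_targets(rewards, next_q_target, dones, gamma=0.9)
print(targets)  # [0.45 0.09 1.  ]
```

The policy network's prediction Q(S_t, A_t) is then regressed toward these targets, so the end-of-game reward reaches earlier states through repeated updates rather than through one explicit backward sweep over the game.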

near basin
#

Alright thanks! Is there any '/thank' command for helping points like in [World of Coding] server?

oblique quarry
#

Good afternoon I've been trying to get my convolutional Layer to run but it is performing poorly compared to the mlp implementation. This is my test set up i know the learningrate decay is kinda aggressive but it yields the best result

#
```py
from framework.nn.Dense import DenseLayer, FlattenLayer
from framework.nn.ActivationFunction import ReLU, Softmax
from framework.nn.Loss import CategoricalCrossEntropyLoss
from framework.nn.Metrics import Metrics
from framework.nn.Optimizer import Adam
from framework.visualProcessing.convolution import Convolution
from framework.nn.Utils import sparseToOneHotEncoded, visualize, shuffle
from sklearn import datasets
import matplotlib.pyplot as plt
from tqdm import tqdm

digits = datasets.load_digits()
bilder = digits.images
tar = digits.target
bilder, tar = shuffle(bilder, tar)
bilder, tar = bilder, tar
tar = sparseToOneHotEncoded(tar, 10)
batchSize = 64
conv = Convolution((batchSize, 8, 8))
f = FlattenLayer()
l1, l2, l3 = DenseLayer(64, 128), DenseLayer(128, 128), DenseLayer(128, 10)
relu1, relu2, softmax = ReLU(), ReLU(), Softmax()
loss, optim = CategoricalCrossEntropyLoss(), Adam(lernrate=5e-3, lernRateDecay=1e-2)
acc, l, lr = [], [], []
for i in tqdm(range(1000)):
    for step in range(len(bilder) // batchSize):
        batchX = bilder[step * batchSize:(step + 1) * batchSize]
        batchY = tar[step * batchSize:(step + 1) * batchSize]
        #convOutput = f.forward(conv.forward(batchX))
        l1Output = relu1.forward(l1.forward(batchX.reshape(batchSize, -1)))
        l2Output = relu2.forward(l2.forward(l1Output))
        l3Output = softmax.forward(l3.forward(l2Output))
        if i % 10 == 0:
            acc.append(Metrics.accuracyClassifier(l3Output, batchY))
            l.append(loss.calculate(l3Output, batchY))
            lr.append(optim.getLearningRate)
        l3grad = l3.backward(softmax.backward(loss.backward(l3Output, batchY)))
        l2grad = l2.backward(relu2.backward(l3grad))
        l1grad = l1.backward(relu1.backward(l2grad))
        #conv.backward(f.backward(l1grad))
        optim.learningRateDecay()
        optim.step(l3)
        optim.step(l2)
        optim.step(l1)
        #optim.step(conv)
visualize(acc, l, lr, optim)
```
#

This is the run with the conv layer

#

this is without so it pretty obvious that the net has memorized the data set(i believe 2k images)

#

but this only proves that my backward pass in the conv layer is messed up as i just added the conv layer ontop of the other net

#

would appreciate if sb could check if i implemented the backward pass correctly(cuz there must be an logical error which is messing up my net)

lapis sequoia
#

I am using universal sentence encoder tensorflow, How can I speed it up, its currently only using CPU not GPU for some reason

serene scaffold
lapis sequoia
#

It says that the current version is more cpu bound something like that, one sec let me show you

#
2023-08-03 14:51:19.521843: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
near basin
#

but thank you

serene scaffold
lapis sequoia
serene scaffold
lapis sequoia
# serene scaffold I have to know what the problem is before I can decide that.
recommenders = {}


@shared_task
def process_question(question, instance_id):
    if instance_id in recommenders:
        print("Using existing recommender")
        answer, recommender = use_main(
            question, instance_id, recommender=recommenders[instance_id]
        )
    else:
        print("Creating new recommender")
        answer, recommender = use_main(question, instance_id)
    recommenders[instance_id] = recommender
    return {"answer": answer}

I am using django and it this api route, How can I make a shared object of recommenders dictionary between all the workers, the recommenders dict has key value pairs of the instances of this class:

class SemanticSearch:
    def __init__(self):
        self.use = hub.load("./Universal Sentence Encoder/")
        self.fitted = False

    def fit(self, data, batch=1000, n_neighbors=5):
        self.data = data
        self.embeddings = self.get_text_embedding(data, batch=batch)
        n_neighbors = min(n_neighbors, len(self.embeddings))
        self.nn = NearestNeighbors(n_neighbors=n_neighbors)
        self.nn.fit(self.embeddings)
        self.fitted = True

    def __call__(self, text, return_data=True):
        inp_emb = self.use([text])
        neighbors = self.nn.kneighbors(inp_emb, return_distance=False)[0]

        if return_data:
            return [self.data[i] for i in neighbors]
        else:
            return neighbors

    def get_text_embedding(self, texts, batch=1000):
        print("Generating embeddings...")
        embeddings = []
        for i in range(0, len(texts), batch):
            text_batch = texts[i : (i + batch)]
            emb_batch = self.use(text_batch)
            embeddings.append(emb_batch)
        embeddings = np.vstack(embeddings)
        return embeddings
serene scaffold
lapis sequoia
haughty nest
#

I am using GridSearchCV and for some reason it thinks an accuracy score of 96.47 is better than 96.57???

#

Can someone explain

winter sedge
#

Not sure this is the right place to ask, if not please point me in the right direction. So I have a list with 6911 values, it looks like the attached image. I want to make a new list every time the value drops by x amounts so I can do a regression analysis and calculate the slope on each list. Where do I start? What do I need to learn to do something like this?

haughty nest
#

when I do 'C': [1, 0.2236, 0.1] it gives me that 0.1 is the best with 96.47
when I do 'C': [1, 0.2236] it will tell me that 0.2236 is the best with 96.57.
I tried it multiple times and it gives me the same result

left tartan
winter sedge
#

Currently just a list, but I should absolutely import it to a dataframe to speed up the process.
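One possible starting point, sketched with NumPy. The attached image is not in the log, so this assumes "drops by x" means a fall of at least `drop` from one value to the next; the function names and the sample data are made up.

```python
import numpy as np

def split_on_drops(values, drop):
    """Start a new segment wherever the value falls by at least `drop` in one step."""
    values = np.asarray(values, dtype=float)
    cut_points = np.where(np.diff(values) <= -drop)[0] + 1
    return np.split(values, cut_points)

def segment_slopes(values, drop):
    """Least-squares slope of each segment, using the position in the segment as x."""
    slopes = []
    for seg in split_on_drops(values, drop):
        if seg.size < 2:
            slopes.append(float("nan"))  # a single point has no slope
            continue
        x = np.arange(seg.size)
        slope, _intercept = np.polyfit(x, seg, deg=1)
        slopes.append(slope)
    return slopes

data = [1, 2, 3, 4, 0, 2, 4, 6, 1, 1.5, 2]  # hypothetical sawtooth-like values
segments = split_on_drops(data, drop=3)
slopes = segment_slopes(data, drop=3)
print(slopes)  # approximately [1.0, 2.0, 0.5]
```

The pieces worth reading up on are `np.diff` for finding the drops, `np.split` for cutting the list, and `np.polyfit` for the regression on each piece.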

slim bone
#

So if I'm using:

  • MSE as my loss function l
  • Sigmoid as my activation function o
  • Some input layer a and an output layer p (and nothing else)

in order to find the partial derivative with respect to w_alpha (some random weight), would it be right to do:
(l(p,t))' = 1/m * Sigma(1,m)[(o(w1*a1 + w2*a2 + ... + w_n*a_n) - t)^2]'?
Trying to understand how to reach the gradient but I can't understand it for the life of me.
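For the setup above, the chain rule gives the gradient in three factors. This is a sketch assuming a single output neuron and m training examples indexed by i; the symbols z_i and a_{α,i} are introduced here for illustration, with σ standing for the sigmoid o.

```latex
z_i = \sum_{j=1}^{n} w_j \, a_{j,i}, \qquad
p_i = \sigma(z_i), \qquad
L = \frac{1}{m} \sum_{i=1}^{m} (p_i - t_i)^2

\frac{\partial L}{\partial w_\alpha}
  = \frac{1}{m} \sum_{i=1}^{m}
    \underbrace{2\,(p_i - t_i)}_{\partial L / \partial p_i}\;
    \underbrace{\sigma(z_i)\bigl(1 - \sigma(z_i)\bigr)}_{\partial p_i / \partial z_i}\;
    \underbrace{a_{\alpha,i}}_{\partial z_i / \partial w_\alpha}
```

Only the term containing w_α survives the last factor, because every other weight is a constant when differentiating z_i with respect to w_α.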
twilit tundra
haughty nest
#

always same accuracy

#

gridsearch just thinks 96.47 is better than .57

twilit tundra
#

What if you change the order

#

1,0.1, 0.2236

haughty nest
twilit tundra
#

Maybe try putting verbose =3 so you have the full logs

#

Or 4

haughty nest
#

ok

#

[LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear][LibLinear]

#

@twilit tundra it shows this

twilit tundra
haughty nest
#

i tried both verbose = 3 and 4

twilit tundra
#

Oh you put it in the definition of your model

#

I meant in the gridsearch

haughty nest
#

[CV 1/5] END C=1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.946 total time= 0.0s
[CV 2/5] END C=1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 3/5] END C=1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.949 total time= 0.0s
[CV 4/5] END C=1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.949 total time= 0.0s
[CV 5/5] END C=1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 1/5] END C=0.1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 2/5] END C=0.1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 3/5] END C=0.1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.949 total time= 0.0s
[CV 4/5] END C=0.1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.949 total time= 0.0s
[CV 5/5] END C=0.1, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 1/5] END C=0.2336, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 2/5] END C=0.2336, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s
[CV 3/5] END C=0.2336, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.949 total time= 0.0s
[CV 4/5] END C=0.2336, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.949 total time= 0.0s
[CV 5/5] END C=0.2336, class_weight=None, multi_class=ovr, penalty=l1, solver=liblinear, tol=0.0001;, score=0.947 total time= 0.0s

#

hmm

twilit tundra
#

Did your 96.47 come from evaluating on a set split? Then it makes sense that cv would have a different order

haughty nest
#

i think

#

ye it is

#

20 80 split

#

so should I still take C=1 as the best parameter

#

or C=0.1

twilit tundra
#

According to the cv, it would be 0.1
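A sketch of why the two numbers can disagree, on synthetic data. The dataset and the simplified estimator settings are assumptions; only the `C` grid is taken from the messages above. `GridSearchCV` ranks candidates by their mean cross-validated score on the training folds, which is a different quantity from the accuracy on a separate 20% hold-out.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

grid = GridSearchCV(
    LogisticRegression(solver="liblinear"),
    param_grid={"C": [1, 0.2236, 0.1]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

# Mean CV accuracy per candidate: this is what decides best_params_.
cv_means = {
    params["C"]: mean
    for params, mean in zip(
        grid.cv_results_["params"], grid.cv_results_["mean_test_score"]
    )
}
print(cv_means, grid.best_params_, grid.best_score_)

# Accuracy of the refit best model on the hold-out split: a separate estimate.
held_out = grid.score(X_test, y_test)
print(held_out)
```

When the candidates differ by a tenth of a percent, as in the logs above, the ranking on one hold-out split can easily flip relative to the CV ranking.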

haughty nest
#

ok but the accuracy score is lower when I call.score

twilit tundra
#

On the 20/80 split?

haughty nest
#

ye