#data-science-and-ml | Python | Page 92

maiden arch Dec 12, 2023, 11:10 AM

#

nevermind i made it work i think kinda

marble raft Dec 12, 2023, 11:15 AM

#

Hi

#

Does anyone understand tensorflow i need help

maiden arch Dec 12, 2023, 11:16 AM

#

ok i did it but there are not graphs visible ?

boreal gale Dec 12, 2023, 11:20 AM

#

maiden arch ok i did it but there are not graphs visible ?

let me see if i can point you to the right direction.

ax.bar(dates, volumes, width=1, edgecolor="white", linewidth=0.7)

what is this doing?

and also

why are you using bar chart?
what are linewidth and width in particular doing?

maiden arch Dec 12, 2023, 11:23 AM

#

haha the color is white

#

thats y it is not printing

#

lol

maiden arch Dec 12, 2023, 11:24 AM

#

boreal gale let me see if i can point you to the right direction. ``` ax.bar(dates, volumes,...

is there any way to animate this? some of the javascript libraries used to have an automatic animation by just changing a bool to true

dense cloud Dec 12, 2023, 11:45 AM

#

I'm trying to upgrade some old code from Pandas 1.0 to 1.2 and I'm getting this error: TypeError: Expected unicode, got pandas._libs.properties.CachedProperty. Internet says that I have to set frequency ( df = df.asfreq("1D") ), but the issue is I have multiple places where this happens and frequency is different...is there a generic solution for this one?

edit: it seems that frame.index.freq = frame.index.inferred_freq is what I need
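A small illustration of the `inferred_freq` approach from the edit above, on a made-up daily frame (the column name and dates are placeholders, not from the original code). It likely only helps where the index is regular enough for pandas to infer a frequency; for irregular data `inferred_freq` is `None`.

```python
import pandas as pd

# Made-up frame whose DatetimeIndex carries no declared frequency.
frame = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04"]
    ),
)

index = frame.index
freq_before = index.freq          # None: nothing was declared
inferred = index.inferred_freq    # pandas' guess from the spacing of the dates

# Only assign when pandas could infer something.
if inferred is not None:
    index.freq = inferred

freq_after = index.freqstr
print(freq_before, inferred, freq_after)
```

Because the frequency is inferred per frame, the same two lines can be reused in every place where the frequency differs.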

maiden arch Dec 12, 2023, 1:14 PM

#

boreal gale let me see if i can point you to the right direction. ``` ax.bar(dates, volumes,...

import pandas as pd
import matplotlib.pyplot as plt

# Read the stock data
stockData = pd.read_csv("/home/needjobcoder/devlopment/python/dataSciencePractice/practice/stockMarket/indexProcessed.csv")

# Convert 'Date' column to datetime
dates = pd.to_datetime(stockData['Date'])

# Extract columns
high = stockData['High']
low = stockData['Low']
_open = stockData['Open']
close = stockData['Close']

# Combine columns into a single NumPy array
stock_array = stockData[['High', 'Low', 'Open', 'Close']].values
print(len(stock_array))


dates = dates.to_numpy().flat
print(len(dates))

# Create a boxplot
fig, ax = plt.subplots()
VP = ax.boxplot(stock_array, positions=dates, widths=0.6, patch_artist=True,
                showmeans=False, showfliers=False,
                medianprops={"color": "white", "linewidth": 0.5},
                boxprops={"facecolor": "C0", "edgecolor": "white",
                          "linewidth": 0.5},
                whiskerprops={"color": "C0", "linewidth": 1.5},
                capprops={"color": "C0", "linewidth": 1.5})

ax.set(xlim=(0.5, 4.5),
       ylim=(0, stock_array.max()),
       )

plt.savefig('candlestick.png')

#

ValueError: List of boxplot statistics and positions values must have same the length

#

it is giving this but len of dates and stock_array is same
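One possible reading of the error: `ax.boxplot` draws one box per column of a 2-D array, so the four-column `stock_array` yields four boxes however many rows it has, and `positions` must then have length four. A sketch with made-up prices, assuming the goal is one box per date; integer positions stand in for the real dates here.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up High, Low, Open, Close rows for five trading days.
stock_array = np.array([
    [105.0, 99.0, 100.0, 104.0],
    [108.0, 101.0, 104.0, 102.0],
    [107.0, 100.0, 102.0, 106.0],
    [110.0, 104.0, 106.0, 109.0],
    [111.0, 105.0, 109.0, 106.0],
])
positions = np.arange(stock_array.shape[0])  # one slot per row (date)

fig, (ax_bad, ax_ok) = plt.subplots(1, 2)

# As posted: 4 columns means 4 boxes, but 5 positions are supplied.
mismatch_error = None
try:
    ax_bad.boxplot(stock_array, positions=positions)
except ValueError as exc:
    mismatch_error = str(exc)

# Transposed: each date becomes a column, so boxes and positions line up.
result = ax_ok.boxplot(stock_array.T, positions=positions, widths=0.6)
n_boxes = len(result["boxes"])
plt.close(fig)

print(mismatch_error)
print(n_boxes)
```

With real dates the positions would also need to be numeric (for example via `matplotlib.dates.date2num`), and the `xlim=(0.5, 4.5)` in the posted code would have to be widened to cover them.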

hybrid maple Dec 12, 2023, 1:30 PM

#

I am trying to train a model to predict loan eligibility, and I am getting this error:
ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 607, 11), found shape=(None, 11)

this is my code:

import pandas 
import tensorflow 
from sklearn.model_selection import train_test_split


dataset = pandas.read_csv('/Users/oliverjohnson/loan-eligibility-predictor/loan-train.csv')
x = dataset.drop(columns=['Loan_Status'])
y = dataset['Loan_Status']

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.01)

model = tensorflow.keras.models.Sequential()

model.add(tensorflow.keras.Input(shape=(x_train.shape)))

#input layers
model.add(tensorflow.keras.layers.Dense(256, activation='sigmoid')) 
#hidden layers
model.add(tensorflow.keras.layers.Dense(256, activation='sigmoid')) 
#output layer
model.add(tensorflow.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy',metrics=['accuracy'])
print('test')
model.fit(x_train,y_train, epochs=1000)


would appreciate any help

long canopy Dec 12, 2023, 1:56 PM

#

ty for the answer!

#

ended up using Gephi!

#

I really just meant, the best way to be able to visualize and have a high-level overview of said network

#

turns out i'll need to do some programming work on this, I need to visualize a particular graph in a very specific manner

iron basalt Dec 12, 2023, 3:03 PM

#

You can change the BLAS library selection ordering (preferred order) when building Numpy from source. I don't think there is any proper way to handle libraries in Windows. On Linux this would be searching lib directories / using pkg-config.

#

(You could also move MKL so it does not find it in the expected location and uses the next option)

desert oar Dec 12, 2023, 3:45 PM

#

okay, that makes sense. but what's with the groupings? you're interested in how the relationship between these independent variables and the dependent variable changes across 4 physical test locations?

#

this is a very traditional statistical modeling scenario... do you expect strongly nonlinear effects here? if not, i'd suggest maybe going for a more probabilistic model here

#

but if you want to use the random forest approach, i suggest just fitting the random forest model with all 3 of those dependent variables and categorical features as needed to describe the location

#

however i'm concerned if you only have 4 measurements, a random forest model will be approximately useless

#

and the measurements of a time series don't really count, each time series is effectively one measurement

quaint loom Dec 12, 2023, 4:05 PM

#

Thank you so much for your explanation. And you’re totally right. The validity is of course robust but it gives us a more reasonable direction of thinking. I did figure out the issue: Matplot was cutting the flow, causing it to not do the random forest test on the other areas. @desert oar

quaint loom Dec 12, 2023, 4:10 PM

#

desert oar okay, that makes sense. but what's with the groupings? you're interested in how ...

No. Not exactly. I want to know which parameters (independent variables) are mostly the cause of the dependent variable. I will be using several other methods (such as the Mantel test and SEM, when I am able to build the model) and additional experiments to understand the mechanism behind it. I do have more than 4 measurements. Actually I have sampled weekly for over 1 year, so it is not about not having enough data but more about how to use the data and understand it in modeling and through statistical tests.

long canopy Dec 12, 2023, 4:10 PM

#

are there any measures of data coverage, or data control? 100% coverage meaning, all the scoped data is maximally useful and none of it is wasted

#

or at least, 100% means, all data has been accounted for, or all data has been involved somehow, etc.

#

a measure of data being-accounted-for, data-accountedness?

past meteor Dec 12, 2023, 4:19 PM

#

long canopy are there any measures of data coverage, or data control? 100% coverage meaning,...

Can you give a concrete example, I don't really know what you mean / this isn't standard terminology

long canopy Dec 12, 2023, 4:21 PM

#

past meteor Can you give a concrete example, I don't really know what you mean / this isn't ...

probably we'd define prior objectives, and we could determine whether a subset of data can be successfully transformed into a form that realizes this objective

past meteor Dec 12, 2023, 4:22 PM

#

I don't mean this in a bad way but I have no clue what you mean. Could you give a concrete example / use case?

long canopy Dec 12, 2023, 4:23 PM

#

hm, thanks for the question, definitely this needs more definiteness

wooden sail Dec 12, 2023, 4:39 PM

#

iron basalt (You could also move MKL so it does not find it in the expected location and use...

hmm this is kinda cursed, i was hoping there'd be an easier solution 😛 thanks for the input

#

i guess WSL and fiddling with packages and directories it is

iron basalt Dec 12, 2023, 4:48 PM

#

wooden sail hmm this is kinda cursed, i was hoping there'd be an easier solution 😛 thanks f...

It seems Conda may have something for this: https://conda-forge.org/docs/maintainer/knowledge_base.html#switching-blas-implementation

wooden sail Dec 12, 2023, 4:49 PM

#

this is exactly what i was hoping would exist

iron basalt Dec 12, 2023, 4:50 PM

#

It's doing dynamic library manipulation / injection basically. Similar to what would be done with manual directory fiddling and stuff.

wooden sail Dec 12, 2023, 4:50 PM

#

it does appear that this is linux only though

iron basalt Dec 12, 2023, 4:51 PM

#

Yeah, Windows no idea.

wooden sail Dec 12, 2023, 4:51 PM

#

ok, but doing this in wsl is still good

iron basalt Dec 12, 2023, 4:51 PM

#

Windows does not work like Linux in this way, Linux prefers lots of dynamic libs in standard locations that are meant to be swapped out (e.g. for a security patch).

#

So that programs can be updated without recompiling everything.

wooden sail Dec 12, 2023, 4:52 PM

#

mhm, makes sense

iron basalt Dec 12, 2023, 4:53 PM

#

Windows's style is more that everything ships with its own copy of every lib (often statically linked too), and then "installs" by moving it to anywhere and editing the registry and path and such.

#

This means it's more annoying to develop on since the libs are not all in a standard spot, but the tradeoff is that if multiple libs depend on some DLL, and that gets updated by one of them installing, it does not break the rest.

#

(Which is part of why Windows apps don't break all the time like with Linux (unless you are using stuff like Flatpak specifically meant to avoid this issue))

wooden sail Dec 12, 2023, 4:56 PM

#

right, the newer pre-packaged stuff like flatpak and snap take a similar container-like approach

iron basalt Dec 12, 2023, 4:56 PM

#

iron basalt This means it's more annoying to develop on since the libs are not all in a stan...

(The common offender is glibc)

#

(Which is partially why Musl exists)

wooden sail Dec 12, 2023, 4:58 PM

#

i'll have to go read about musl, had never heard of it before

long canopy Dec 12, 2023, 5:38 PM

#

anyone done Gephi plugin programming before?

shell ruin Dec 12, 2023, 7:07 PM

#

Im working through some EDA and came across this warning. I feel that I am handling the issue that its warning me about. Am I misunderstanding this?

simple snow Dec 12, 2023, 7:26 PM

#

Hey guys! I have a data feature which is positively skewed and I want to use it for linear programming. I used skewness and the Shapiro test, and after applying a logarithmic transformation the skewness decreased, but the Shapiro and K-S tests both still fail for normality. I've got around 400 data points, should I try any more transformations or remove outliers?

simple hound Dec 12, 2023, 7:27 PM

#

Hello, I'm kinda new to machine learning stuff and i wanted to ask if someone knows a good book or free course for starting w it. I want to learn some AI related stuff so... any advice would be great.

serene scaffold Dec 12, 2023, 7:31 PM

#

simple hound Hello, I'm kinda new to machine learning stuff and i wanted to ask if someone kn...

check the pins

desert oar Dec 12, 2023, 8:11 PM

#

shell ruin Im working through some EDA and came across this warning. I feel that I am handl...

it's because matches_raw is itself a "slice" of another data frame. matches_raw = data[...]

if you explicitly meant to make a copy, use matches_raw = data[...].copy(). if you want your changes to apply to data and not just matches_raw, don't slice off any columns.

also in the future it's much easier for people to help you if you share code as text, not a screenshot. use https://paste.pythondiscord.com for sharing
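A short sketch of the `.copy()` suggestion above. The frame and column names are made up, since the original code was only shared as a screenshot.

```python
import pandas as pd

# Hypothetical stand-in for the original `data` frame.
data = pd.DataFrame({"team": ["a", "b", "c"], "goals": [0, 2, 3]})

# Explicit copy: later edits touch only matches_raw, so pandas has no
# reason to warn about writing into a slice of `data`.
matches_raw = data[data["goals"] > 0].copy()
matches_raw["goals_doubled"] = matches_raw["goals"] * 2

print(list(data.columns))
print(matches_raw["goals_doubled"].tolist())
```

The original `data` keeps its two columns; only the copy gains the new one.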

marble raft Dec 12, 2023, 9:04 PM

#

hybrid maple I am trying to train a model to predict loan eligibility, and I am getting this ...

hello i think i know the answer and it is quite complex wanna talk about it on the dms

desert oar Dec 12, 2023, 9:26 PM

#

hybrid maple I am trying to train a model to predict loan eligibility, and I am getting this ...

this has nothing to do with loans as such, but it's worth noting that gradient boosting historically tends to work better on "tabular" data than neural networks

#

so you might want to try xgboost or similar in addition to your NN

weary crown Dec 12, 2023, 9:45 PM

#

hybrid maple I am trying to train a model to predict loan eligibility, and I am getting this ...

1000 epochs ... ??

hybrid maple Dec 12, 2023, 9:46 PM

#

marble raft hello i think i know the answer and it is quite complex wanna talk about it on t...

i sorted it

marble raft Dec 12, 2023, 9:46 PM

#

ok

hybrid maple Dec 12, 2023, 9:46 PM

#

marble raft ok

gotta add [1]
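For context on the `[1]` fix: `keras.Input(shape=...)` expects the shape of a single sample, without the row (batch) count. A sketch using a NumPy stand-in with the 607 rows and 11 columns reported in the error message; TensorFlow itself is left out so the snippet runs anywhere.

```python
import numpy as np

# Stand-in for x_train: 607 rows, 11 feature columns (from the error message).
x_train = np.zeros((607, 11))

whole_shape = x_train.shape            # includes the number of rows
per_sample_shape = x_train.shape[1:]   # shape of one row only

# The fixed layer would then read something like:
#   tensorflow.keras.Input(shape=(x_train.shape[1],))
print(whole_shape, per_sample_shape)
```

Passing `whole_shape` makes Keras expect every sample to be a 607 by 11 matrix, which is what the "expected shape=(None, 607, 11)" message is saying.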

marble raft Dec 12, 2023, 9:47 PM

#

hybrid maple i sorted it

i just thought of a more simple way

hybrid maple Dec 12, 2023, 9:47 PM

#

weary crown 1000 epochs ... ??

is that too many?

#

only took like 30 secs on my m2

marble raft Dec 12, 2023, 9:47 PM

#

hybrid maple only took like 30 secs on my m2

wow ur computer is fast

#

for 100 epochs it took 10 minutes

hybrid maple Dec 12, 2023, 9:48 PM

#

marble raft wow ur computer is fast

yeah m2 is kinda crazy

#

only passively cooled also

marble raft Dec 12, 2023, 9:48 PM

#

hybrid maple yeah m2 is kinda crazy

ik

hybrid maple Dec 12, 2023, 9:48 PM

#

the dataset is only like 600 entries though

marble raft Dec 12, 2023, 9:48 PM

#

do u want to make a project together

hybrid maple Dec 12, 2023, 9:49 PM

#

i know literally nothing about ai and ml i doubt id be a very useful partners lol

marble raft Dec 12, 2023, 9:49 PM

#

i can help u

#

i have 1 year xp

#

but i am still learning

hybrid maple Dec 12, 2023, 9:49 PM

#

i watched this 15 min video on how get started with neural networks and tried to apply that to another dataset

#

so far had 0 luck, 30% accuracy lol

marble raft Dec 12, 2023, 9:50 PM

#

oof

#

i can help

hybrid maple Dec 12, 2023, 9:51 PM

#

that would be much appreciated

#

i am trying to predict loan eligibility

marble raft Dec 12, 2023, 9:51 PM

#

ok

#

lets talk in the dms

hybrid maple Dec 12, 2023, 9:51 PM

#

sure

brazen spire Dec 12, 2023, 11:31 PM

#

What are the options to compute how far an object is from the edge?

shut girder Dec 12, 2023, 11:44 PM

#

Is exploratory data analysis all that is needed to solve a given problem? I hear people say that EDA is just a step in the data analysis process and that insights from EDA can be used for further steps and analysis, is this true?

serene scaffold Dec 12, 2023, 11:52 PM

#

shut girder Is exploratory data analysis all that is needed to solve a given problem? I hear...

to solve what given problem? global warming? no.
exploratory data analysis is just where you look at the data and understand how it's structured, how you could use it, etc.

jaunty geyser Dec 13, 2023, 4:19 AM

#

What is a language model that can run on an intel i3 cpu

frigid creek Dec 13, 2023, 7:26 AM

#

hi, im new with machine learning and stuff, but would like to know in object tracking, with like deepsort, is it possible to count the object tracked from the track id or is it not? why do most still use roi line to count or is there other method to count? thanks

abstract wasp Dec 13, 2023, 8:10 AM

#

Does any one here know about any deep learning programs, schools, or online courses that really teach you everything? Not just CNNs but all of deep learning.

lofty thorn Dec 13, 2023, 8:50 AM

#

From where do i learn data science

trim saddle Dec 13, 2023, 9:39 AM

#

lofty thorn From where do i learn data science

Google

lofty thorn Dec 13, 2023, 9:41 AM

#

any data scientist here?

trim saddle Dec 13, 2023, 9:45 AM

#

I am

#

Kaggle might be a starting point

#

There are also recorded online lectures

#

If you search for it you find plenty stuff.

#

And like karpathy said, its not that important with what you start, more important that you start and put hours in.

lofty thorn Dec 13, 2023, 10:07 AM

#

how old are you

trim saddle Dec 13, 2023, 10:55 AM

#

lofty thorn how old are you

31, whys that important?

shadow viper Dec 13, 2023, 1:02 PM

#

I need a mentor 🙏🏽

serene scaffold Dec 13, 2023, 1:06 PM

#

shadow viper I need a mentor 🙏🏽

it's not very likely that anyone will commit to being your mentor. you should just ask questions in this channel as you have them.

shadow viper Dec 13, 2023, 1:09 PM

#

Sure sure... Thanks

shadow viper Dec 13, 2023, 2:37 PM

#

But it feels like I'm not making any progress

#

Just making use of tensorflow keras applications (pretrained models)
I want more than that

quaint loom Dec 13, 2023, 2:45 PM

#

I am currently working on developing a Random Forest model using a dataset that consists of weekly values for 16 different locations. My analysis focuses on the entire area rather than specific individual locations, which is why I've merged these locations into 4 distinct areas based on spatial considerations.

Regarding the imputation process, I am indeed using it to fill in missing values within the dataset. Specifically, when a location has missing data, I employ a method to calculate the mean value based on the remaining non-missing values from the grouped locations.

The issue I'm encountering is that after applying this imputation method, certain missing values that were initially 0 are now being replaced with unexpected values like 6. In essence, it seems like the imputation is causing non-missing values that were originally 0 to increase to 6.

I'm uncertain about the root cause of this issue and would greatly appreciate any insights or suggestions on how to resolve it. If there are any error messages or specific code segments that would aid in diagnosing the problem, please feel free to ask.

Number of missing values before imputation: 0
Number of missing values after imputation: 6

def fill_missing_values(data, columns_to_fill, area_groups):
    for column in columns_to_fill:
        for area, positions in area_groups.items():
            mask = (data['Position'].isin(positions)) & data[column].isna()
            data[column] = pd.to_numeric(data[column], errors='coerce')
            mean_value = data[mask]['Date'].dt.month.map(data[(data['Position'].isin(positions)) & ~data[column].isna()].groupby(data['Date'].dt.month)[column].mean())
            data.loc[mask, column] = mean_value

I am also dropping NaN rows, as some of the parameters I am sampling are only monthly.

Here is the complete code:
https://paste.pythondiscord.com/PDIQ

radiant dock Dec 13, 2023, 2:50 PM

#

What do you guys think would be the best way to analyze a text and give suggestions to replace phrases from a list? Cosine similarity?

shadow viper Dec 13, 2023, 3:04 PM

#

quaint loom I am currently working on developing a Random Forest model using a dataset that ...

I'm not really familiar with most lines of your code since I'm still a beginner but I've encountered something like this before.

What if you write an if statement that if the data in the columns are 0 it should return back 0 and see if it works.
But thinking about this, other cells that aren't 0 might have issues. So what if you say if the cells are not NaN return cells else return (whatsoever you want it to).

Again, I'm just trying to help, in case I'm not being helpful or anything, still a beginner at this

quaint loom Dec 13, 2023, 3:07 PM

#

shadow viper I'm not really familiar with most lines of your code since I'm still a beginner ...

Coding in a nutshell. I will have this try tomorrow. So simple and plain, yet I didn’t try it… It started confusing me when I noticed some values which was 64 and turned into 78 which shocked me a little. Thank you.☺️

desert oar Dec 13, 2023, 3:08 PM

#

quaint loom I am currently working on developing a Random Forest model using a dataset that ...

errors='coerce' looks suspicious

shadow viper Dec 13, 2023, 3:08 PM

#

quaint loom Coding in a nutshell. I will have this try tomorrow. So simple and plain, yet I ...

Glad I was able to help, hopefully it works.

quaint loom Dec 13, 2023, 3:27 PM

#

desert oar `errors='coerce'` looks suspicious

?
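A possible reason `errors='coerce'` is worth a second look: `pd.to_numeric(..., errors='coerce')` silently turns anything it cannot parse into NaN, which would explain a missing-value count that rises after the conversion. The values below are made up; the real columns may contain different non-numeric entries.

```python
import pandas as pd

# Made-up column read in as text, with two entries that are not plain numbers.
raw = pd.Series(["0", "6", "<1", "n.d.", "64"])

missing_before = int(raw.isna().sum())
numeric = pd.to_numeric(raw, errors="coerce")
missing_after = int(numeric.isna().sum())

# The entries that were present as text but became NaN after coercion.
unparseable = raw[numeric.isna() & raw.notna()].tolist()

print(missing_before, missing_after, unparseable)
```

Listing the unparseable entries like this before imputing shows whether they are genuine gaps or values that need cleaning first.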

past meteor Dec 13, 2023, 5:58 PM

#

quaint loom I am currently working on developing a Random Forest model using a dataset that ...

Not an answer to your question but you're leaking a bit of data

X, y = prepare_data(area_data)
if X.shape[0] > 0 and y.shape[0] > 0:  
    rf_regressor, mae, mse, r2 = apply_random_forest(X, y, area_label)

You're not really supposed to impute and then train your model. You're imputing using the mean of the entire dataset which isn't really allowed. If I were you I would try and encapsulate your entire preprocessing and modeling into an sklearn ColumnTransformer and Pipeline.

scikit-learn's documentation is fantastic, I'd give it a read:

A docs page on leakage, which is happening in your case https://scikit-learn.org/stable/common_pitfalls.html#data-leakage
A docs page on pipelines etc. https://scikit-learn.org/stable/modules/compose.html
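A minimal sketch of the pipeline idea, on synthetic data rather than the original dataset. The imputer sits inside the pipeline, so during cross-validation its means are computed from each training fold only; the data, hyperparameters, and scoring choice here are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic features and target, then roughly 10% of the values knocked out.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=60)
X[rng.random(X.shape) < 0.1] = np.nan

# Imputation is a pipeline step, so it is re-fitted on every training fold
# and never sees the held-out rows.
model = make_pipeline(
    SimpleImputer(strategy="mean"),
    RandomForestRegressor(n_estimators=50, random_state=0),
)
scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
print(scores)
```

A `ColumnTransformer` could replace the single imputer when different columns need different preprocessing.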

umbral charm Dec 13, 2023, 9:35 PM

#

https://gyazo.com/4e71d04730b1c42a7bf167f6f22f7a0b

#

You see at around x = 2.5 and x = 6.1 there are 2 basically straight blue lines

#

this is because my function is like 1/tan(x) and thus goes to the infinites and comes back

#

How do i stop this line from being plotted

#

I dont want them to join up
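A common way to keep matplotlib from joining the two branches is to put NaN where the curve blows up, since line plots leave a gap at NaN points. A minimal sketch, assuming the function really is 1/tan(x) and that a cutoff of 10 (an arbitrary choice) suits the visible y-range:

```python
import numpy as np

x = np.linspace(0.1, 6.2, 1000)
y = 1.0 / np.tan(x)

# Replace the very large values next to each asymptote with NaN.
cutoff = 10.0  # arbitrary; tune to the y-range being shown
y_masked = np.where(np.abs(y) > cutoff, np.nan, y)

# Plotting y_masked instead of y, e.g. plt.plot(x, y_masked), then draws
# separate branches with no connecting line between them.
n_gaps = int(np.isnan(y_masked).sum())
print(n_gaps)
```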

toxic mortar Dec 13, 2023, 9:42 PM

#

Why do I have to have insanely small learning rate in order not to get overflow runtime error?

import numpy as np
import matplotlib.pyplot as plt
import copy

def compute_gradient(w,b,x,y):
    djdw=np.zeros(x.shape[1])
    djdb=0.0
    for i in range (x.shape[0]):
        err=(np.dot(w,x[i])+b)-y[i]
        for j in range(x.shape[1]):
            djdw[j]+=err*x[i,j]
        djdb+=err
    return djdw/x.shape[0],djdb/x.shape[0]
def gradient_descent(alpha,epoch,_w,_b,x,y):
    w=copy.deepcopy(_w)
    b=_b
    for _ in range(epoch):
        djdw,djdb=compute_gradient(w,b,x,y)
        w=w-alpha*djdw
        b=b-alpha*djdb
    return w,b

if __name__ == '__main__':
    x = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
    y = np.array([460, 232, 178])
    w = np.zeros(x.shape[1])
    alpha = 5.0e-7
    epochs = 1000
    w, b = gradient_descent(alpha, epochs, w, 0, x, y)
    predicted_y = np.dot(x, w) + b
    feature_index = 0
    plt.scatter(x[:, feature_index], y, color='blue')
    plt.scatter(x[:, feature_index], predicted_y, color='red')
    plt.xlabel('Feature Value')
    plt.ylabel('Target Value')
    plt.legend(['Actual', 'Predicted'])
    plt.show()

Anything larger than this value is failing

past meteor Dec 13, 2023, 9:55 PM

#

toxic mortar Why do I have to have insanely small learning rate in order not to get overflow ...

Gradient descent has been known to diverge instead of converge if the learning rate is too large.

I'd advise you to step in a debugger if you're unsure, what's happening is that w or b gets either too large or too small based on your learning rate.
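To make the "too large" threshold concrete: for a squared-error cost like the one in `compute_gradient`, plain gradient descent stays stable only while the learning rate is below 2 divided by the largest eigenvalue of the cost's Hessian. A sketch that estimates that limit for the three rows posted above; treating the bias as an extra all-ones column is a choice made here for brevity.

```python
import numpy as np

x = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]], dtype=float)

# Append a column of ones so the bias b is handled like one more weight.
x_aug = np.hstack([x, np.ones((x.shape[0], 1))])

# Hessian of the cost whose gradient is (1/m) * X^T (Xw - y).
hessian = x_aug.T @ x_aug / x.shape[0]
largest_eigenvalue = np.linalg.eigvalsh(hessian).max()

# Step sizes above this value make the weights grow on every iteration.
alpha_limit = 2.0 / largest_eigenvalue
print(f"{alpha_limit:.2e}")
```

The limit comes out a little above 8e-7, which fits the observation that 5e-7 works and anything much larger overflows. It is driven almost entirely by the first column, whose values are in the thousands.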

toxic mortar Dec 13, 2023, 10:01 PM

#

past meteor Gradient descent has been known to diverge instead of converge if the learning r...

Yes, I tried debugging, the weights become large and eventually turn into NaN. Then I tried playing around with learning rate, from 1e-10 to 5e-7. Each time 3x it and limit test it. Turned out that 5e-7 is a sweet spot. I even plotted the J based on number of iterations to see how fast or slow it is converging. Is this a common problem? Does it depend on the format of data (like the actual values or even more trivial things like the number of elements)?
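On whether the data's values matter: they do, mainly through their scale. A sketch of the usual remedy, standardizing each feature column before running gradient descent, after which a far larger learning rate stays stable on the same three rows. The vectorized `gradient_descent` below is a rewrite for brevity, not the posted function, and `alpha=0.1` is an illustrative choice.

```python
import numpy as np

x = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]], dtype=float)
y = np.array([460, 232, 178], dtype=float)

# z-score each column so every feature has mean 0 and standard deviation 1.
x_scaled = (x - x.mean(axis=0)) / x.std(axis=0)

def gradient_descent(x, y, alpha, epochs):
    """Batch gradient descent for linear regression with a bias term."""
    w = np.zeros(x.shape[1])
    b = 0.0
    m = x.shape[0]
    for _ in range(epochs):
        err = x @ w + b - y            # prediction error per sample
        w = w - alpha * (x.T @ err) / m
        b = b - alpha * err.mean()
    return w, b

w, b = gradient_descent(x_scaled, y, alpha=0.1, epochs=1000)
predicted_y = x_scaled @ w + b
print(predicted_y)
```

New inputs would need the same column means and standard deviations applied before prediction.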

past meteor Dec 13, 2023, 10:02 PM

#

toxic mortar Yes, I tried debugging, the weights become large and eventually turn into NaN. T...

It's a common problem yes, I wouldn't be able to explain it better than this link: https://stats.stackexchange.com/questions/315664/gradient-descent-explodes-if-learning-rate-is-too-large?noredirect=1&lq=1

Cross Validated

Gradient descent explodes if learning rate is too large

I've implemented my own gradient descent algorithm for an OLS, code below. It work's, however, when the learning rate is too large (i.e. learn_rate >= .3), my approach is unstable. The coefficien...

toxic mortar Dec 13, 2023, 10:02 PM

#

past meteor It's a common problem yes, I wouldn't be able to explain it better than this lin...

Thanks for the resource

rare ferry Dec 14, 2023, 12:45 AM

#

I love reading books. How can I put my Data science skills to use in analysing a single book. How can I leverage my DS skills to more critically examine let's say Harry Potter and the Prisoner of Azkaban?

left tartan Dec 14, 2023, 1:00 AM

#

rare ferry I love reading books. How can I put my Data science skills to use in analysing a...

Check out spacy, you could do NLP analysis of a text and, perhaps, extract named entities/etc. like, find all the interactions between Harry and Hermione

radiant dock Dec 14, 2023, 1:03 AM

#

how can I analyze a text with spacy, and check for similarities between sets of 2 or 3 words rather than whole sentences or individual words?

serene scaffold Dec 14, 2023, 2:07 AM

#

radiant dock how can I analyze a text with spacy, and check for similarities between sets of ...

Are you the same person as zakomayo?

#

What would make a "set of two or three words" "similar" to another word set?

radiant dock Dec 14, 2023, 2:47 AM

#

why would I be the same guy?

radiant dock Dec 14, 2023, 2:47 AM

#

serene scaffold What would make a "set of two or three words" "similar" to another word set?

I don't know, sentiment I guess. I'm currently using bigrams and trigrams. It works great for certain terms but not so well for others

serene scaffold Dec 14, 2023, 2:49 AM

#

radiant dock I don't know, sentiment I guess. I'm currently using bigrams and trigrams. It wo...

Sentiment analysis is pretty challenging, so you might already be getting the best performance you can get without upgrading to a more sophisticated technique

radiant dock Dec 14, 2023, 2:53 AM

#

gotcha, I'll probably leave it as is, it's working acceptably well so far, it's a coding challenge for a job opportunity

#

thanks

serene scaffold Dec 14, 2023, 2:54 AM

#

radiant dock gotcha, I'll probably leave it as is, it's working acceptably well so far, it's ...

I once had a job interview where they asked me to do sentiment analysis
On 100k tweets

#

Is that what you're doing?

radiant dock Dec 14, 2023, 2:54 AM

#

oh god, thankfully not

serene scaffold Dec 14, 2023, 2:54 AM

#

I got rejected Pepega

radiant dock Dec 14, 2023, 2:54 AM

#

I got a text paragraph and I have to suggest replacement phrases from a list based on a similarity score

#

I can approach it however I want, and decided to use spacy because it's what I'm most familiar with

#

it's working pretty alright, but certain sentences get suggestions that are not so good, and researching I found out about sentiment analysis, but yeah it seems very complex and I'm running out of time

serene scaffold Dec 14, 2023, 2:58 AM

#

You can use my thing
https://github.com/swfarnsworth/madlibert

radiant dock Dec 14, 2023, 3:01 AM

#

thanks man

#

I may use it as inspiration, but I still wanna come up with my own script, since I'm still learning and I want to understand what I'm doing if I end up getting the job

odd meteor Dec 14, 2023, 3:49 AM

#

radiant dock how can I analyze a text with spacy, and check for similarities between sets of ...

spaCy


import spacy
nlp = spacy.load("en_core_web_md")  # make sure to use larger package!
doc1 = nlp("I like salty fries and hamburgers.")
doc2 = nlp("Fast food tastes very good.")

# Similarity of two documents
print(doc1, "<->", doc2, doc1.similarity(doc2))

# Similarity of tokens and spans
french_fries = doc1[2:4]
burgers = doc1[5]
print(french_fries, "<->", burgers, french_fries.similarity(burgers))

https://github.com/UKPLab/sentence-transformers

quaint loom Dec 14, 2023, 5:51 AM

#

past meteor Not an answer to your question but you're leaking a bit of data ```python X, y ...

Thank you sooo much.

mortal rover Dec 14, 2023, 11:52 AM

#

Guys I need help with my project. Plz this is very urgent. I'm not as well gifted as y'all. I need to solve a real life problem using Business intelligence or machine learning. A unique topic, and help me with how I can collect data for it too. Plz guide me programming gods. It could be any problem.

#

For now I just need to give a problem and analysis on how I'd collect data and try to solve it

radiant dock Dec 14, 2023, 12:35 PM

#

odd meteor 1. **spaCy** ```py import spacy nlp = spacy.load("en_core_web_md") # make sur...

thank you!

shadow viper Dec 14, 2023, 2:26 PM

#

past meteor Not an answer to your question but you're leaking a bit of data ```python X, y ...

I like this
I will look into scikit documentation later
Can I use scikit to train images?
Instead of using tensorflow models

past meteor Dec 14, 2023, 2:36 PM

#

shadow viper I like this I will look into scikit documentation later Can I use scikit to trai...

Yes and no. Before neural networks people used to run algorithms like SIFT to make variables and then used models such as those found in sklearn but using a neural network is a lot easier and it has way better performance.

shadow viper Dec 14, 2023, 2:42 PM

#

past meteor Yes and no. Before neural networks people used to run algorithms like SIFT to ma...

Wayyyy easier
Makes me feel I'm not even trying

#

Thanks man

long canopy Dec 14, 2023, 3:28 PM

#

could anyone throw me a couple of keywords to get me started of evaluating whether two paragraphs contain similar content/ideas?

summer crypt Dec 14, 2023, 3:32 PM

#

I'm looking to get started with using AI in python. I want to write a program that will run PowerShell scripts to look for vulnerabilities in a system and notify you about them. I could write an algorithm to do this manually but I wanna incorporate machine learning to automate the process. However, I have no idea where to start when it comes to working with ai. How should I get started?

serene scaffold Dec 14, 2023, 3:32 PM

#

long canopy could anyone throw me a couple of keywords to get me started of evaluating wheth...

Semantic similarity
Cosine distance
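To show what the cosine keyword refers to, here is a bare-bones sketch using word counts as vectors. It only measures word overlap; comparing ideas rather than wording would normally use embeddings from a model (such as the sentence-transformers library linked earlier in this channel) with the same cosine formula. Cosine distance is simply one minus this similarity.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw word counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

same = cosine_similarity("the cat sat", "the cat sat")
partial = cosine_similarity("the cat sat", "the cat ran")
disjoint = cosine_similarity("the cat sat", "dogs bark loudly")
print(same, partial, disjoint)
```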

long canopy Dec 14, 2023, 3:32 PM

#

serene scaffold Semantic similarity Cosine distance

thanks!

serene scaffold Dec 14, 2023, 3:33 PM

#

summer crypt I'm looking to get started with using AI in python. I want to write a program th...

This sounds insurmountably challenging for a first ai project

#

It also wouldn't be "more automated" than the program you have in mind

#

You use AI when the problem to be solved can't easily be expressed as an exact series of steps

summer crypt Dec 14, 2023, 3:36 PM

#

serene scaffold This sounds insurmountably challenging for a first ai project

I love challenging myself but capability to learn is what I'm most interested in since cyber security is ever evolving

serene scaffold Dec 14, 2023, 3:37 PM

#

summer crypt I love challenging myself but capability to learn is what I'm most interested in...

If you want to get started, you should practice data exploration and manipulation with numpy and pandas. So that you understand what "data" is in the context of AI and ML

#

Admittedly, this is pretty removed from what you want to do

#

But if you start with trying to classify code examples as malicious or not, I think you'll be completely lost.

summer crypt Dec 14, 2023, 3:40 PM

#

serene scaffold If you want to get started, you should practice data exploration and manipulatio...

I'll start with your recommendations to familiarize myself with the core concepts before I delve into the more complicated things but thank you for your help

serene scaffold Dec 14, 2023, 3:42 PM

#

While it's on my mind, when I got started with nlp, I had a mentor who insisted that I learn concepts that were above my ability to comprehend at that time, and while I appreciate that he believed in my ability to understand it, I think that stunted my motivation.

shadow viper Dec 14, 2023, 3:46 PM

#

serene scaffold While it's on my mind, when I got started with nlp, I had a mentor who insisted...

I want to work on programming drones(AI oriented) but I have no idea how to start. I don't even have a drone but maybe a simulation can help but how do I start?

summer crypt Dec 14, 2023, 3:47 PM

#

I think good starting point would be to figure out algorithm to do it manually to look for unused open ports/services then try to integrate the ai?

serene scaffold Dec 14, 2023, 3:47 PM

#

shadow viper I want to work on programming drones(AI oriented) but I have no idea how to star...

Drones are often trained in simulations, but I've never done anything with them. I do language technology.

shadow viper Dec 14, 2023, 3:49 PM

#

serene scaffold Drones are often trained in simulations, but I've never done anything with them...

Yh, alright then, thanks

serene scaffold Dec 14, 2023, 3:49 PM

#

summer crypt I think good starting point would be to figure out algorithm to do it manually t...

Everyone is rushing to shoehorn AI into everything, but it doesn't go without saying that a solution involving AI will be better than one that doesn't.

summer crypt Dec 14, 2023, 3:55 PM

#

I get what you're saying and I'll keep it in mind. Vulnerability scanning and detection nowadays works through an algorithmic approach using preset flags that, if tripped, notify the appropriate parties. However, if you can figure out these flags, you can circumvent them. With an AI approach, it's harder to predict what the AI would consider a trigger.

shadow viper Dec 14, 2023, 3:59 PM

#

summer crypt I get what you're saying and I'll keep it in mind. Vulnerability scanning and dete...

You train the AI with appropriate trigger datasets
If the training goes well then you have nothing to worry about. Remember after training, we test our models with a test/new dataset to see the accuracy for ourselves rather than just looking at the accuracy score

summer crypt Dec 14, 2023, 4:01 PM

#

shadow viper You train the AI with appropriate trigger datasets If the training goes well the...

I could incorporate some vulnerability databases to learn new triggers as they are discovered, which would help maintain its accuracy as new methods are developed

shadow viper Dec 14, 2023, 4:03 PM

#

summer crypt I could incorporate some vulnerability databases to learn new triggers as they d...

This works... Nice one honestly

quaint loom Dec 14, 2023, 4:19 PM

#

past meteor Not an answer to your question but you're leaking a bit of data ```python X, y ...

Thank you again for your suggestions. It seems like they helped a lot, and the docs you shared were highly valuable. If you ever have the time, would you have a quick look at the code and see if it actually improved?

Although it seems to be improved, I do have a few questions.

It seems like the MAE, MSE and R^2 are similar for all areas. Is the module just running for one area, although the terminal says it is running for all areas?
The module could not handle the missing data for the parameter that I only have once a month (the other parameters I have sampled/tested weekly). Any suggestions on this?
Another question regarding the R^2: it turned out that the R^2 is as low as 0.25, which is quite low. Does this suggest that I still do not have enough data to run the module?

Again, if you ever have some spare time, please have a look. And also, thanks to you @desert oar
https://paste.pythondiscord.com/YXKA

past meteor Dec 14, 2023, 4:28 PM

#

quaint loom Thank you again for your suggestions. It seems like they helped a lot, and the docs ...

I don't have the time this week to look at the code but you can ping me around this time next week.

I don't understand what you mean with "area", I think it'll be clear after I read your script.
How to handle missing data depends on your domain. You might be able to do a left fill (forward fill) or so. Do not do a right fill (backward fill) or linear interpolation in time series, as both pull in future values and you might leak data.
I never use R^2 for prediction problems personally. It's unlikely but possible that it's low because there is a non-linear relationship while the model is actually good; R^2 doesn't account for this.
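
The fill-from-the-past idea above (often called forward fill) can be sketched without any library. The series below is hypothetical: it stands in for a parameter measured monthly on an otherwise weekly grid. In pandas, the analogous call would be `DataFrame.ffill()`.

```python
from typing import Optional, Sequence


def forward_fill(values: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Replace each missing value (None) with the most recent earlier observation."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical weekly series where one parameter is only measured monthly.
monthly_param = [None, 4.2, None, None, None, 3.9, None]
print(forward_fill(monthly_param))
```

Each filled value depends only on earlier observations, which is why this direction does not leak future information; the leading gap stays missing because nothing has been observed yet.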

quaint loom Dec 14, 2023, 4:50 PM

#

past meteor I don't have the time this week to look at the code but you can ping me around t...

Thank you for your time. I’ll ping you later when I know you have more time.

lean sparrow Dec 14, 2023, 5:48 PM

#

summer crypt I'm looking to get started with using AI in python. I want to write a program th...

As someone in security I'd second skipping the AI for this task.

desert oar Dec 14, 2023, 5:49 PM

#

almost all successful applications of "AI" on specific applications like this turn out to be more like a handcrafted combination of machine learning, heuristics, and statistics. in particular, figuring out how to actually represent your data in some way that you can actually run machine learning models on it is usually the most important thing you can spend your effort on. that process is often known as "feature engineering". as you might imagine, constructing useful features is very often a matter of understanding the problem domain and starting from a position of "how would a human look at this"?

lean sparrow Dec 14, 2023, 5:49 PM

#

If anything data science to help identify how often certain systems or types of systems need attention and how that affects labor costs and why you should maybe charge more/less for different system types when integrated into a vuln management program

shadow viper Dec 14, 2023, 8:08 PM

#

quaint loom Thank you again for your suggestions. It seems like they helped a lot, and the docs ...

You made progress... This is nice

thorn bobcat Dec 15, 2023, 2:10 AM

#

@serene scaffold you on?

#

I just wanted you to check out this https://medium.com/@TheUndergrad/introducing-gemini-chat-app-your-conversational-companion-17d9cdb3eaac

Medium

Introducing Gemini Chat App: Your Conversational Companion

Overview

#

and this https://github.com/SetuBaru/Gemni-Python-API/tree/main

GitHub

GitHub - SetuBaru/Gemni-Python-API: An Unofficial Wrapper for the P...

An Unofficial Wrapper for the Python SDK for the Gemini API - GitHub - SetuBaru/Gemni-Python-API: An Unofficial Wrapper for the Python SDK for the Gemini API

#

:)

delicate rune Dec 15, 2023, 2:13 AM

#

#

what could the problem be?

quaint loom Dec 15, 2023, 2:26 AM

#

That is excellent to hear. Awesome ^^

small wedge Dec 15, 2023, 5:11 AM

#

delicate rune

they are different outputs

#

"Type a number: " should be in the input call

#

and then you should just print even or odd
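
The original screenshot is not part of this log, so the following is only a guess at the shape of the exercise: the prompt text is passed to `input()` and the program prints just "even" or "odd". The function names are made up for illustration.

```python
def parity(number: int) -> str:
    """Return "even" or "odd" for an integer."""
    return "even" if number % 2 == 0 else "odd"


def main() -> None:
    # The prompt goes inside input(), so it is not printed as a separate line.
    number = int(input("Type a number: "))
    print(parity(number))
```

Calling `main()` runs it interactively; keeping the check in `parity` makes it easy to try values without typing them in.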

delicate rune Dec 15, 2023, 5:11 AM

#

o really?

small wedge Dec 15, 2023, 5:11 AM

#

also this is probably not the channel for this lol

delicate rune Dec 15, 2023, 5:12 AM

#

ohh I sent it here because there's more active people in here compared to the other channels

#

just needed some little help and i appreciate that you helped me

small wedge Dec 15, 2023, 5:13 AM

#

no problem 👍

placid cedar Dec 15, 2023, 7:48 AM

#

hey guys, after i performed train test split, and one hot encoded my train and test data, i wanted to put them into the regressor to evaluate the model's performance. but i've been stuck on this error for a long time. do need urgent help with this!

dull flare Dec 15, 2023, 9:31 AM

#

Sup how long do u guys think one should focus on eda+ supervise +unsupervised learning. The problem is its obvious that no one can master it in small amount of time but I can't get stuck over there for long period of time either. So advice me when to move and should deep learning be my next goal.

serene scaffold Dec 15, 2023, 11:57 AM

#

dull flare Sup how long do u guys think one should focus on eda+ supervise +unsupervised le...

A year

edgy pasture Dec 15, 2023, 12:05 PM

#

Can anyone help me do my assignment ?

#

It's all in python but I don't really know what's going on since I'm new to it

#

It won't be complicated to you guys but it's entire another language for me.

serene scaffold Dec 15, 2023, 12:18 PM

#

@edgy pasture this is the data science and AI channel. Is it about that?
In either case, be sure to never ask to ask. Ask your actual question. Not if someone is willing to answer a question that you haven't revealed yet.

edgy pasture Dec 15, 2023, 12:21 PM

#

yea

#

its a mixture of datascience and python

#

could we hop on call, cause itll be easier to say what its about

serene scaffold Dec 15, 2023, 12:49 PM

#

edgy pasture could we hop on call, cause itll be easier to say what its about

We cannot

lapis sequoia Dec 15, 2023, 1:02 PM

#

delicate rune

You new to py?

delicate rune Dec 15, 2023, 1:05 PM

#

lapis sequoia You new to py?

Yeah

lapis sequoia Dec 15, 2023, 1:07 PM

#

delicate rune Yeah

Just fall in love with python and then it’s ez

delicate rune Dec 15, 2023, 1:40 PM

#

lapis sequoia Just fall in love with python and then it’s ez

yeah i think i did fall in love because now i understand it a little and its really fun messing around with it (trial/error)

lapis sequoia Dec 15, 2023, 1:43 PM

#

delicate rune yeah i think i did fall in love because now i understand it a little and its rea...

W

#

I’ve been hanging out c# and html just trying to make stuff, I’ve been studying

#

Kinda fun, I agree with you

#

What initially got you into wanting to study Python?

delicate rune Dec 15, 2023, 1:50 PM

#

lapis sequoia What initially got you into wanting to study Python?

Honestly my friend she’s likes pretty good at it and I was like woah future hacker 😆

delicate rune Dec 15, 2023, 1:50 PM

#

lapis sequoia Kinda fun, I agree with you

onbb !

lapis sequoia Dec 15, 2023, 2:15 PM

#

delicate rune Honestly my friend she’s likes pretty good at it and I was like woah future hack...

What language does she do?

lapis sequoia Dec 15, 2023, 2:16 PM

#

delicate rune onbb !

on my what

delicate rune Dec 15, 2023, 2:42 PM

#

lapis sequoia What language does she do?

python she’s like new as well but she made it more enjoyable

delicate rune Dec 15, 2023, 2:42 PM

#

lapis sequoia on my what

onb = on bro

lapis sequoia Dec 15, 2023, 4:02 PM

#

delicate rune python she’s like new as well but she made it more enjoyable

You a dude right?

lapis sequoia Dec 15, 2023, 4:02 PM

#

delicate rune onb = on bro

💀

#

Dang

lapis sequoia Dec 15, 2023, 4:02 PM

#

delicate rune python she’s like new as well but she made it more enjoyable

You lot tryna learn together

#

I’m a beginner too, but I need a group to grow with

south crypt Dec 15, 2023, 4:24 PM

#

Hello, I'm new writing in this section. I'd like to have some recommendations on a problem I'm trying to solve. As a context, I'm trying to solve it with Deep Reinforcement Learning.

The task is to control the activity of some fans [on / off] (in this case 3, all with the same flow rate of 95 m³/h) connected to a box that has a heater inside (currently it is always at 100% capacity, which is 1 kW)

The current set of actions, with 3 fans, is numbers from 0 to 9, mapped as: 0 -> do nothing; 1 -> {0, 0, 0}; 2 -> {1, 0, 0}; 3 -> {0, 1, 0}; 4 -> {0, 0, 1}; 5 -> {1, 1, 0} .... 9 -> {1, 1, 1}, where {x, x, x} represents the on [1] / off [0] state of fans 1, 2 and 3
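
A minimal sketch of how such an action table could be generated for any number of fans. The function name is hypothetical, and the ordering of the two-fan states after {1, 1, 0} is an assumption, since the message elides them. Under this enumeration three fans give 2^3 + 1 = 9 actions, so the indexes run from 0 to 8 and the all-on state lands on index 8.

```python
from itertools import product


def build_action_table(n_fans: int) -> list:
    """Index 0 is "do nothing" (None); the rest are explicit on/off tuples.

    Tuples are ordered by how many fans are on, then with lower-numbered
    fans switched on first, which matches the examples given in the message.
    """
    states = sorted(
        product((0, 1), repeat=n_fans),
        key=lambda state: (sum(state), [-bit for bit in state]),
    )
    return [None] + states


actions = build_action_table(3)
for index, state in enumerate(actions):
    print(index, "do nothing" if state is None else state)
```

A table like this keeps the network output as a single discrete index while the simulator receives the decoded tuple.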

The ambient temperature may or may not be higher than the target temperature wanted inside the box.

Currently, the simulator I built with my colleague uses the basic heat transfer equations, without considering that faster wind lowers the entrance temperature.

The ambient temperature is obtained every second, and the change in internal temperature is calculated every 0.01 seconds (a simple interpolation is made to obtain the external temperature at each "dt"). The steps are every 2 seconds (might change in the future), which means that the algorithm has to make a decision every 2 seconds. There is no penalty for turning fans on/off consecutively (like a kid with a light switch), yet.

The values available for the NN are: Ti (internal T), Tt (target T), Te (external T), A_t1 (action in last step), Delta Ti (change in temperature in last step) and Dt (step size in seconds)

Here are my questions:

If my intention is to keep the temperature near the target temperature, what would be a good q-function?
Right now, I'm just using a Sequence Neural Network (SNN) with some "relu" activation functions and a linear activation function on the output to obtain the q-function estimate. Any recommendation on how deep or wide the NN should be?

south crypt Dec 15, 2023, 4:28 PM

#

south crypt Hello, I'm new writing in this section. I'd like to have some recommendations on...

Would it be possible / wise to try to use an RNN? I would think that the hidden state would hold some intrinsic information about how the external temperature has been changing over time
If I had to add more fans, the number of states would grow as 2^n + 1. Any advice on tackling this "curse of dimensionality"?
If the fan state were continuous... any idea how to tackle it?

Thanks in advance for any idea, suggestions, questions

delicate rune Dec 15, 2023, 5:31 PM

#

lapis sequoia You lot tryna learn together

Well we not learning together exactly. She taking cs class and I’m learning python my self with some free courses

#

what makes it enjoyable is that whenever she gets like an exercise or project that she’s having trouble with, I can assist and it js makes coding hella fun imo

delicate rune Dec 15, 2023, 5:32 PM

#

lapis sequoia You a dude right?

Yeah

delicate rune Dec 15, 2023, 5:32 PM

#

lapis sequoia I’m a beginner too, but I need a group to grow with

We could be one but really it’s up to her

lapis sequoia Dec 15, 2023, 5:32 PM

#

delicate rune what makes it enjoyable is that whenever she gets like an exercise or project th...

lmao, you like designing?

lapis sequoia Dec 15, 2023, 5:32 PM

#

delicate rune We could be one but really it’s up to her

nvm then g

#

it's cool, you enjoy yourself

delicate rune Dec 15, 2023, 5:33 PM

#

lapis sequoia nvm then g

Nah we could be duos

delicate rune Dec 15, 2023, 5:33 PM

#

lapis sequoia lmao, you like designing?

Designing?

halcyon hedge Dec 15, 2023, 8:54 PM

#

Hey guys I did this EDA project a while back and since I have been kind of stuck about how to improve my EDA skills, it would be helpful if you could point out some specific drawbacks in this project which would help me improve my skills. https://www.kaggle.com/code/omraizada/exploring-global-terrorism

Exploring Global Terrorism

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

lapis sequoia Dec 15, 2023, 8:56 PM

#

delicate rune Nah we could be duos

it's cool, you can stick to your duo innit

lapis sequoia Dec 15, 2023, 8:57 PM

#

delicate rune Designing?

yur

#

html and css

past meteor Dec 15, 2023, 10:14 PM

#

halcyon hedge Hey guys I did this EDA project a while back and since I have been kind of stuck...

Dropping all the NA values in cell 12 without any analysis on them isn't great. Same goes for the duplicates in cell 17 and the outliers and so on

halcyon hedge Dec 15, 2023, 10:16 PM

#

past meteor Dropping all the NA values in cell 12 without any analysis on them isn't great. ...

So would it be appropriate to use something like heatmap to visualize it first?

past meteor Dec 15, 2023, 10:16 PM

#

The actual EDA, at a glance, looks good. You're asking relevant questions, providing detailed answers with context outside of the scope of your dataset and so on

past meteor Dec 15, 2023, 10:17 PM

#

halcyon hedge So would it be appropriate to use something like heatmap to visualize it first?

Just looking at the data, scrolling through it can be really helpful

#

Like getting those duplicates indexes and really looking at the rows and seeing what's up with them. If they're truly duplicates you can throw them away but also show the reader that they are
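
A small library-free sketch of that inspection step: collect the indexes of every repeated row so the rows can be read side by side before anything is dropped. The records and field layout are made up for illustration; with a pandas DataFrame, `df[df.duplicated(keep=False)]` gives a comparable view.

```python
from collections import defaultdict


def duplicate_groups(rows: list) -> dict:
    """Map each repeated row to the list of indexes where it occurs."""
    positions = defaultdict(list)
    for index, row in enumerate(rows):
        positions[row].append(index)
    return {row: idxs for row, idxs in positions.items() if len(idxs) > 1}


# Hypothetical records: (year, country, attack_type)
rows = [
    (1998, "A", "bombing"),
    (1999, "B", "assault"),
    (1998, "A", "bombing"),
    (2001, "C", "bombing"),
]
for row, idxs in duplicate_groups(rows).items():
    print(idxs, row)
```

Showing the groups in the notebook lets the reader see that the rows really are identical before they are removed.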

halcyon hedge Dec 15, 2023, 10:18 PM

#

Okayyy, will make sure to do these things from now on

halcyon hedge Dec 15, 2023, 10:19 PM

#

past meteor The actual EDA, at a glance, looks good. You're asking relevant questions, provi...

Thanks a lot for your time

past meteor Dec 15, 2023, 10:19 PM

#

halcyon hedge Thanks a lot for your time

Removing data is a very drastic decision so always motivating why you're doing it is always great, for the rest you're doing a good job, keep it up!

halcyon hedge Dec 15, 2023, 10:21 PM

#

past meteor Removing data is a very drastic decision so always motivating why you're doing i...

Will keep that in mind, once again thanks

final kiln Dec 15, 2023, 11:19 PM

#

how large do I have to make GPT to get interesting results ?

desert oar Dec 15, 2023, 11:30 PM

#

halcyon hedge Thanks a lot for your time

i'll echo that removing outliers is drastic and needs to be carefully motivated. are they actually anomalous events in some way that might warrant removing them from the analysis? or are they legitimate values that happen to occur at the tail of the distribution?

#

this is good

#

i'd like to see it on log scale as well

#

currently the big spike dominates the graph, which is good: it tells a clear story, there is a huge increase compared to a global baseline

#

but there might be a secondary story which is hidden by the scale

#

one thing i wonder about is measurement methodology. what defines a terrorist attack? who collected this data? has the methodology or definition changed over time in a way that might affect the data?

#

i'll also echo that the "asking questions" section is excellent, both in concept and in execution

verbal venture Dec 16, 2023, 12:06 AM

#

can anyone explain the [index//2] part for the skip connection? the model is unet:
```py
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF  # assumed source of TF.resize

# DoubleConvolution is assumed to be defined elsewhere in the asker's code.


class UNET(nn.Module):
    def __init__(self, in_channels=3, out_channels=1, features=[64, 128, 256, 512]):
        super(UNET, self).__init__()
        self.ups = nn.ModuleList()
        self.downs = nn.ModuleList()
        self.pools = nn.MaxPool2d(kernel_size=2, stride=2)

        # DOWN PART OF UNET
        for feature in features:
            # creating down sampling layers - adding every feature output
            self.downs.append(DoubleConvolution(in_channels, feature))
            in_channels = feature  # becomes input to next Conv

        # UP PART OF UNET
        for feature in reversed(features):
            # double width of image
            self.ups.append(nn.ConvTranspose2d(feature*2, feature, kernel_size=2, stride=2))
            self.ups.append(DoubleConvolution(feature*2, feature))

        # 512, 1024
        self.bottleneck = DoubleConvolution(features[-1], features[-1]*2)
        self.final_conv = nn.Conv2d(features[0], out_channels, kernel_size=1)

    def forward(self, x):
        skip_connections = []
        for down in self.downs:
            x = down(x)  # downsampling tensor
            skip_connections.append(x)
            # pass through max pooling
            x = self.pools(x)

        x = self.bottleneck(x)

        # REVERSING LIST FOR UPSAMPLING
        skip_connections = skip_connections[::-1]

        # up, double conv
        for index in range(0, len(self.ups), 2):
            # upsample with the transposed convolution stored at the even index
            x = self.ups[index](x)
            # skip connection - div due to step 2
            skip_connection = skip_connections[index // 2]

            if x.shape != skip_connection.shape:
                x = TF.resize(x, size=skip_connection.shape[2:])

            concat_skip = torch.concat((skip_connection, x), dim=1)
            # running through double conv
            x = self.ups[index+1](concat_skip)

        return self.final_conv(x)
```

final kiln Dec 16, 2023, 12:27 AM

#

verbal venture can anyone explain the [index//2] part for the skip connection? the model is ...

So it seems that you are computing two up layers for each iteration. And you have one skip connection every two layers.

Layer 1
Layer 2 -> 1 (skip connection index)

Layer 3
Layer 4 -> 2

Layer 5
Layer 6 -> 3
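
The same pairing can be traced with plain lists standing in for the modules. The string labels are placeholders, not real layers, and the indexes are zero-based as in the code rather than one-based as in the sketch above.

```python
features = [64, 128, 256, 512]

# self.ups holds two modules per decoder stage: [upsample, double_conv, ...]
ups = []
for feature in reversed(features):
    ups.append(f"ConvTranspose2d -> {feature}")
    ups.append(f"DoubleConvolution -> {feature}")

# One skip connection per encoder stage, reversed so the deepest comes first.
skip_connections = [f"skip_{feature}" for feature in features][::-1]

pairs = []
for index in range(0, len(ups), 2):
    pairs.append((index, index // 2, skip_connections[index // 2]))
    print(f"ups[{index}] and ups[{index + 1}] use skip_connections[{index // 2}]")
```

Because the loop steps by 2 over a list of 8 modules, `index` takes the values 0, 2, 4, 6, and integer division by 2 maps them onto the 4 skip connections.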

verbal venture Dec 16, 2023, 12:27 AM

#

Yes sir

#

The code I was writing did not have 2 elements in self.ups before. I thought ups was only 4 elements long, so did not know mathematically how that was working

latent dirge Dec 16, 2023, 12:51 AM

#

if I want to ask something pandas-related, is data-science-and-ai the right tag in the help section?

final kiln Dec 16, 2023, 12:51 AM

#

Yes I'd assume so

#

class SelfAttentionHead(nn.Module):
  def __init__(self, params: ModelParameters):
    super(SelfAttentionHead, self).__init__()
    

    self.d_k = params.word_vector_size // 3

    temp = []
    for _ in range(3):
      proj = make_parameter(size_x = params.word_vector_size, size_y = self.d_k)
      bias = make_parameter(size_x = 1, size_y = self.d_k)
      temp.append(proj)
      temp.append(bias)

    self.q, self.q_bias, self.k, self.k_bias, self.v, self.v_bias = temp

  def forward(self, sequence: torch.Tensor):
    q_vectors = self.q_bias + self.q @ sequence
    k_vectors = self.k_bias + self.k @ sequence
    attention_scores =  q_vectors @ k_vectors.T
    attention_scores /= self.d_k ** 0.5  # torch.sqrt expects a tensor; d_k is a plain int here
    attention_scores = torch.nn.functional.softmax(attention_scores, dim=-1)  # normalize over the keys
    v_vectors = self.v_bias + self.v @ sequence
    return attention_scores @ v_vectors

#

I'm implementing nano gpt, and one thing that surprised me is that the Q,K,V matrices end up reducing the dimension of the embedded token

#

Which kinda ruins the intuition I've been reading about how all this is based on a sort of modified dot product for similarity.

#

https://jaykmody.com/blog/attention-intuition/

Jay Mody

An Intuition for Attention | Jay Mody

Deriving the equation for scaled dot product attention.

#

The normalization is also quite strange. The normalization layers go like (v - E(v))/std(v)

#

Which does scale their size so as not to let them explode in value, and also centers them at 0. But I don't see much of an intuition when thinking of word embeddings as living in some dot product space as suggested by a lot of resources online

#

I'd imagine a better normalization would be, actual normalization, v / norm(v)
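
To compare the two options side by side, here is a small sketch using only the standard library. `standardize` mimics the layer-norm formula quoted above without the learned gain and bias, and `eps` is an assumed small constant for numerical stability.

```python
import math
import statistics


def standardize(v: list, eps: float = 1e-5) -> list:
    """Layer-norm style: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(v)
    std = math.sqrt(statistics.pvariance(v) + eps)
    return [(x - mean) / std for x in v]


def unit_norm(v: list) -> list:
    """Plain L2 normalization: rescale so the vector has length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


v = [2.0, 4.0, 4.0, 6.0]
print(standardize(v))
print(unit_norm(v))
```

One thing this makes visible: after standardization the vector has an L2 norm very close to sqrt(d), so the layer-norm formula amounts to centering followed by rescaling to a fixed length, which is not far from `v / norm(v)` applied to the centered vector.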

#

Furthermore, why is positional encoding needed ?

#

Couldn't the network pick up the position of each word via the literal position of the word vector in the matrix that represents the sequence ?
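
One way to probe this question is a toy, unmasked attention layer with identity projections (an assumption made to keep the sketch short). Shuffling the input rows only shuffles the output rows, so the operation itself carries no notion of which row came first. A causal mask, as used in GPT-style decoders, does add some order dependence, so this sketch only covers the unmasked case.

```python
import math


def softmax(row: list) -> list:
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: list) -> list:
    """Unmasked single-head attention with identity Q/K/V projections."""
    d = len(tokens[0])
    outputs = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in tokens]
        weights = softmax(scores)
        outputs.append([sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(d)])
    return outputs


tokens = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
order = [2, 0, 1]  # shuffle the "sentence"
shuffled = [tokens[i] for i in order]

out = self_attention(tokens)
out_shuffled = self_attention(shuffled)

# Row p of the shuffled output should equal row order[p] of the original output.
same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for i, row in zip(order, out_shuffled)
    for a, b in zip(out[i], row)
)
print(same)
```

Learned Q/K/V projections act on each token separately, so they do not change this behaviour; an explicit positional signal is one way to break the symmetry.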

limber creek Dec 16, 2023, 3:24 AM

#

Hey guys, I am a Computer Science student and I want to learn Machine Learning, AI. So right now, I know a bit of Python and 7th grade Maths. I would be really glad if you can provide me with a super detailed roadmap on how to learn these stuff and finally land a job.
I don't wanna invest my time on learning something which is currently not of the most priority.
Thank You~~~

final kiln Dec 16, 2023, 3:38 AM

#

limber creek Hey guys, I am a Computer Science student and I want to learn Machine Learning, ...

I do believe you need more math, just so you know what you are doing.

#

I'd reckon you should have:

- calculus I and II (important to understand gradient descent and why neural nets work at all)
- linear algebra - this is the basis for pretty much anything that is both high dimensional and linear and is like a language that you use to talk about all sorts of things, so it's pretty useful
- multivariate calculus - this is like, joining points one and two, and is where neural nets reside I'd say. This is where the concept of gradient resides, which you need to understand gradient descent and etcs
- stats and probability - neural nets can be thought as statistical models, and you use statistical tools to evaluate their performance etc etc etc
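
As a taste of why the calculus matters, here is a minimal one-dimensional gradient descent sketch. The function being minimized, the learning rate and the step count are arbitrary choices for illustration.

```python
def gradient_descent(grad, start: float, learning_rate: float = 0.1, steps: int = 100) -> float:
    """Repeatedly step against the derivative to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Minimize f(x) = (x - 3)^2, whose derivative is f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))
```

Training a neural net follows the same loop, except that `x` becomes millions of parameters and the derivative becomes a gradient, which is where multivariate calculus and linear algebra come in.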

#

Signal processing concepts are also super useful. So like knowing what is a Fourier transform, knowing about kernels, knowing about DFT, knowing how to understand data, manipulate it, etc

#

I'd say, once you know all this stuff, and you are good with python, picking up the ML frameworks and just start building things is enough to get you going.

limber creek Dec 16, 2023, 3:51 AM

#

Are these stuff covered in grade 12 maths??

final kiln Dec 16, 2023, 3:51 AM

#

limber creek Are these stuff covered in grade 12 maths??

Uhm, I believe I covered all these in my first year college.

#

I'd highly recommend finding time for a college education if possible. If not, at least complementing the math til 12th grade and try to cover these subjects over time.

quaint loom Dec 16, 2023, 8:02 AM

#

Do you guys recommend using pydot and GraphViz for visualization? Not sure if it's relevant, but I am using Python in VS Code

past meteor Dec 16, 2023, 8:11 AM

#

quaint loom Do you guys recommend using pydot and GraphViz for visualization? Not sure if it's r...

I've never used GraphViz directly, only things built on top of it, same for Pydot so I can't comment on how good they are.

Personally I use:

- Matplotlib for straightforward things.
- Seaborn for things that are a bit more work in Matplotlib "natively"
- Plotly if I want interactive plots.

Seaborn is built on top of Matplotlib and honestly, if you want to learn Seaborn you need to know the basics of Matplotlib; it makes your life so much easier. Matplotlib has a very strange API, so this is a must-read. If you go over it, it'll all make sense in like half an hour or less 😄 https://matplotlib.org/stable/users/explain/quick_start.html. The "anatomy of a figure" section is critical to understand. This is also interesting because this is the code that actually makes the figure in question https://matplotlib.org/stable/gallery/showcase/anatomy.html.

Seaborn's documentation is also great, I'd block out an hour or two to read it after you're familiar with Matplotlib.

quaint loom Dec 16, 2023, 8:19 AM

#

past meteor I've never used GraphViz directly, only things built on top of it, same for Pydo...

I am a bit familiar with matplotlib but I guess I still have a lot more to learn about it. I find it a little tricky to make quick modifications so that the visualization is considered "beautiful". Some of my coworkers are pretty good at Origin, but I don't feel like having to use another piece of software like that. I tried Origin once and I felt I was back in SPSS somehow.

Seaborn doesn't fit what I'll be doing, as I am going to create a path diagram for my structural equation model.

#

https://statistics.ohlsen-web.de/tag/graphviz/

past meteor Dec 16, 2023, 8:44 AM

#

quaint loom I am a bit familiar with matplotlib but I guess I still have a lot more to learn ...

Okay, that's good you mentioned this. Personally I have not done SEM, but I know enough of it to know I wouldn't make those plots in Matplotlib. For better or for worse, this is one of the times I'd reach for R because they have better tools in this space, e.g., TidyGraph and SemPlot

#

https://cran.r-project.org/web/packages/tidySEM/vignettes/Plotting_graphs.html

quaint loom Dec 16, 2023, 9:04 AM

#

past meteor Okay, that's good you mentioned this. Personally I have not done SEM, but I know...

My brother will walk me through the basic of R language. I also tried it a couple of times after my jupyter notebook limited my work. Going from jupyter notebook to another python software is a big step. haha

past meteor Dec 16, 2023, 9:10 AM

#

For R I'd really just focus on learning how to do stuff and not necessarily being a competent R programmer. Treat it like a statistics and data visualisation toolkit. I love Python but R is better at both.

#

(Not) using notebooks is a surprisingly long and nuanced debate. I'm on mobile so I'll summarise it by saying that, at the very least, you should be able to code outside of an interactive session (Jupyter, Spyder, IPython, RStudio)

quaint loom Dec 16, 2023, 9:48 AM

#

past meteor For R I'd really just focus on learning how to do stuff and not necessarily bein...

Python's my jam, but I'll give R a shot when I can. It's not that long ago that I started learning to code in general, so I don't want to overload myself on top of my research. It will come smoothly along the way. Too much of each language may just confuse me.

past meteor Dec 16, 2023, 10:10 AM

#

quaint loom Python's my jam, but I'll give R a shot when I can. It's not that long ago that I ...

The things that you're doing are already quite complex - might be a good idea to practice Python in isolation as well

pearl barn Dec 16, 2023, 10:17 AM

#

guys I want to ask how to install conda for Python and how to run Jupyter notebook on it locally on my Windows machine. I'm learning data analysis with Python from a website called Jovian dot com, but I couldn't save my work online. Can anyone explain this to me? Also, is it worth learning Python basics from this course? Another point: the same course is available on freecodecamp

odd meteor Dec 16, 2023, 10:36 AM

#

pearl barn guys I want to ask how to install conda for Python and how to run Jupyter noteb...

It's pretty straight forward. Just download the Anaconda Distribution. https://www.anaconda.com/download

Once you've done that, it brings alongside all its friends like Jupyter Notebook to the party.

Meanwhile, can you add more clarity on the "I couldn't save my work online" part.
Is it that you were using Colab or Binder to run your code?

Anaconda

Anaconda Team

Free Download | Anaconda

Anaconda's open-source Distribution is the easiest way to perform Python/R data science and machine learning on a single machine.

odd meteor Dec 16, 2023, 10:52 AM

#

pearl barn guys I want to ask how to install conda for Python and how to run Jupyter noteb...

Akash is one of the founders of Jovian. His work has been featured in FreeCodeCamp so I believe his python course will be on point!

As you already know, we humans don't always like similar stuff... So I think what you should focus more on is finding out for yourself if that particular python course in Jovian is 'customer-friendly' to YOU.

The only way to find out is to try taking a few chapters of the course with an open mind.

And If it's hard for you to understand what's being taught or you find yourself sleeping off while watching the video (I presume it's a video course), then by all means don't hesitate to drop it and try another course.

pearl barn Dec 16, 2023, 12:45 PM

#

odd meteor Akash is one of the founders of Jovian. His work has been featured in FreeCodeCamp s...

Thank you I appreciate your answer

#

Is it better to use miniconda, and how do I run Jupyter locally for the online course?

left tartan Dec 16, 2023, 1:11 PM

#

quaint loom Do you guys recommend using pydot and GraphViz for visualization? Not sure if it's r...

If you’re interested in visualizations, you should check out some of the great stuff in #1180191057498083418

pseudo pasture Dec 16, 2023, 1:47 PM

#

Hello guys i want to make recommendation model based on the credit card data and one of the column is df['Reward rates']
which have data like this:

row 1: '6X 6x Marriott Bonvoy point dollar eligible purchase hotel participating Marriott program 4X 4x point purchase made restaurant worldwide gas station wireless telephone service purchased directly service provider purchase shipping 2X 2x point eligible purchase'

row 2: '7X Earn 7X Hilton Honors Bonus Points dollar eligible purchase charged directly hotel resort within Hilton portfolio 5X Earn 5X Points per dollar purchase restaurant supermarket gas station 3X Earn 3X Points eligible purchase Card'

row 3: '12X Earn 12X Hilton Honors Bonus Points dollar eligible purchase charged Card directly hotel resort within Hilton portfolio 6X Earn 6X Points dollar purchase Card restaurant supermarket gas station 4X Earn 4X Points dollar Online Retail Purchases 3X Earn 3X Points eligible purchase Card'

'3 3 Cash Back supermarket per year purchase 1 3 3 Cash Back online retail purchase per year 1 3 3 Cash Back gas station per year 1 1 1 Cash Back purchase'

'12X 12X directly hotel resort Hilton portfolio 6X 6X Select Business Travel Purchases 3X 3X eligible purchase Terms Limitations Apply'

Now I'm applying many NLP techniques to extract meaningful data, but either I can't get relevant features to train a model on, or too many columns are created if I use TF-IDF and n-grams. Any help will be appreciated.

odd meteor Dec 16, 2023, 1:48 PM

#

pearl barn Is it better to use Miniconda, and how do I run Jupyter locally for an online course?

I've always used Anaconda, but some people prefer Miniconda. You'll be fine with either one.

Yes, you can run the course code locally in a Jupyter notebook.

https://stackoverflow.com/questions/45421163/anaconda-vs-miniconda

Stack Overflow

Anaconda vs. miniconda

In the Anaconda repository, there are two types of installers:

"Anaconda installers" and "Miniconda installers".

What are their differences?

Besides, for an installer file, Anaconda2-4.4.0.1-L...

trim saddle Dec 16, 2023, 1:53 PM

#

You could also just go with a normal Python install and use VS Code with the Jupyter extension to work with notebooks.

pseudo pasture Dec 16, 2023, 1:54 PM

#

one thing i do is this: for every row, create separate columns based on the ?X values
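A minimal sketch of that idea, assuming the multipliers always appear as a number followed by `X` or `x`. The function names `reward_tiers` and `tier_features` and the three output columns are hypothetical, and the sample string is a shortened version of row 2 above:

```python
import re

def reward_tiers(text):
    """Return the distinct 'NX' multipliers found in text, largest first."""
    found = {int(m) for m in re.findall(r"\b(\d+)x\b", text, flags=re.IGNORECASE)}
    return sorted(found, reverse=True)

def tier_features(text):
    """Turn the multipliers into a few numeric columns (hypothetical choice)."""
    tiers = reward_tiers(text)
    if not tiers:
        # Rows without any 'NX' token, e.g. the cash-back rows.
        return {"max_rate": None, "min_rate": None, "n_tiers": 0}
    return {"max_rate": tiers[0], "min_rate": tiers[-1], "n_tiers": len(tiers)}

# Shortened version of row 2 from the question.
row = ("7X Earn 7X Hilton Honors Bonus Points dollar eligible purchase "
       "5X Earn 5X Points per dollar purchase restaurant supermarket gas station "
       "3X Earn 3X Points eligible purchase Card")

print(reward_tiers(row))
print(tier_features(row))
```

With pandas this could be applied per row via `df['Reward rates'].apply(tier_features)`. That gives a handful of numeric columns instead of dozens of n-gram columns, though it ignores which spending category each multiplier belongs to.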

halcyon hedge Dec 16, 2023, 1:59 PM

#

desert oar one thing i wonder about is measurement methodology. what defines a terrorist at...

So mostly I need to focus more on the nuances of the data and be more careful while processing the data, right? And code wise is it fine?

halcyon hedge Dec 16, 2023, 2:00 PM

#

final kiln I'd reckon you should have: - calculus I and II (important to understand gradie...

Hey I was just curious, are there any widely used models using stochastic calculus?

odd meteor Dec 16, 2023, 2:00 PM

#

pseudo pasture Hello guys i want to make recommendation model based on the credit card data and...

There are several ways to control the number of features extracted by TfidfVectorizer.

Personally, in most of my work I use ngram_range = (1, 2) so that both unigrams and bigrams end up in the final features TF-IDF extracts.

For every other parameter I experiment, experiment, and experiment before settling on the configuration that yields the best result.

The documentation will do better justice than I can in explaining what each parameter in TfidfVectorizer does.

https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html

scikit-learn

sklearn.feature_extraction.text.TfidfVectorizer

Examples using sklearn.feature_extraction.text.TfidfVectorizer: Biclustering documents with the Spectral Co-clustering algorithm Topic extraction with Non-negative Matrix Factorization and Latent D...

pseudo pasture Dec 16, 2023, 2:04 PM

#

odd meteor There are several ways to control the number of extracted features gotten by Tfi...

i see but then so many other columns are created like almost 60-70 based on the data.

#

see using ngram of (2,2) i get this first

#

see

#

#

for single row too many ngrams created

odd meteor Dec 16, 2023, 2:12 PM

#

pseudo pasture i see but then so many other columns are created like almost 60-70 based on the ...

What really determines the final number of features extracted is the configuration used when instantiating your vectorizer.

So if you want tfidf to extract just the top 100 features (100 new columns), it still can do that.

So it depends on how you configure the parameters.

pseudo pasture Dec 16, 2023, 2:14 PM

#

odd meteor What really determines the final number of features extracted is the configurati...

Makes sense, let me try that and also some different approaches.

final kiln Dec 16, 2023, 3:44 PM

#

halcyon hedge Hey I was just curious, are there any widely used models using stochastic calcul...

I don't think I've studied stochastic calculus. But probably related is the use of Monte Carlo methods. I'm quite sure GPT uses a simple one to generate its output. GPT itself approximates a probability function, which is then used to sample tokens. I've also read very briefly about some networks that use probabilistic activation. And then there's quantum learning, which is inherently stochastic because quantum is all about probabilities.

desert oar Dec 16, 2023, 3:55 PM

#

halcyon hedge So mostly I need to focus more on the nuances of the data and be more careful wh...

it's not that you need to focus on those things in this particular project. it's more of something to keep in mind as you progress into professional work. you can't ignore the measurement and data collection procedure as part of the data generating process.

desert oar Dec 16, 2023, 3:59 PM

#

final kiln I don't think I've studied stochastic calculus. But probably related are the usa...

what monte carlo method is used in a GPT model? i always thought the text output from ChatGPT etc was just a next-word prediction given the context

#

usually "monte carlo" methods refer to computational techniques for approximately computing otherwise-intractable quantities, very often integrals. they are frequently used in Bayesian statistical inference and probability modeling to sample from a posterior distribution

#

the underlying theory of monte carlo computing techniques is indeed that of stochastic processes, eg. markov chain monte carlo (markov chain being a particular type of stochastic process)

final kiln Dec 16, 2023, 4:03 PM

#

desert oar what monte carlo method is used in a GPT model? i always thought the text output...

It's an assumption of mine because a lot of the explanations I've read talk about chat GPT as a conditional probability distribution, given the previous tokens it outputs the probability for each token to occur as the next token.

final kiln Dec 16, 2023, 4:03 PM

#

desert oar usually "monte carlo" methods refer to computational techniques for approximatel...

No, that's one of its applications. "Monte Carlo methods" is an umbrella term that encompasses computational techniques based on pseudo-random sampling.

desert oar Dec 16, 2023, 4:04 PM

#

right

#

that's why i am wondering where and how it shows up in the GPT language model

#

i get that the output is a stochastic process, but that doesn't strike me as a monte carlo method except in a very generic sense

final kiln Dec 16, 2023, 4:05 PM

#

desert oar that's why i am wondering where and how it shows up in the GPT language model

Well, I'm only going into detail about this architecture now, so I haven't gotten to the end. But the way I imagine it is: it outputs an array of probabilities, one for each token, and you use that to sample the next token.

desert oar Dec 16, 2023, 4:05 PM

#

yeah, that's a stochastic process

#

the "state" is the current context and the state transition function is the probability distribution over the next token

#

or something like that anyway
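The state/transition picture can be sketched with a toy example. This is not how GPT is implemented: the transition table is made up, and it conditions only on the last token, whereas a real model conditions on the whole context. It only illustrates "output a distribution, sample the next token, repeat":

```python
import random

# Hypothetical next-token distributions, keyed by the previous token only.
TRANSITIONS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sat": {"the": 1.0},
    "ran": {"the": 1.0},
}

def sample_next_token(state, rng):
    """Draw one token from the distribution attached to the current state."""
    probs = TRANSITIONS[state]
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)   # fixed seed so the run is repeatable
context = ["the"]        # the "state" is the context so far
for _ in range(5):
    context.append(sample_next_token(context[-1], rng))

print(" ".join(context))
```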

final kiln Dec 16, 2023, 4:06 PM

#

So the computational method you use to generate the token would be a monte Carlo method, albeit a simple one

desert oar Dec 16, 2023, 4:07 PM

#

that's a broader definition of monte carlo methods than what i'd use and typically see, but i understand what you mean

#

if you did something like repeatedly generating multiple outputs over and over using the same prompt, in order to compute some statistic or distribution over those outputs, i'd say that's more like a monte carlo method

#

but maybe my interpretation is too narrow

final kiln Dec 16, 2023, 4:10 PM

#

Uhmmmmm, yeah I see what you mean. You'll usually be doing that yeah. So ig you'd just call it random sampling.

#

I mean it's not so clear-cut

#

In case of GPT, the underlying statistic/shape of distribution does matter a lot

#

But idk, I don't like to get too caught up in the definitions

lapis sequoia Dec 16, 2023, 5:05 PM

#

Hi, I have a bunch of annotated images and I want to build a Python AI model that trains on those images so that it can detect the image from a given picture.
Can someone point me to a good course or to where I can get started?

final kiln Dec 16, 2023, 5:48 PM

#

desert oar if you did something like repeatedly generating multiple outputs over and over u...

Wikipedia has an interesting passage about the possible definitions, I think this one is what aligns with the way I use it:

"""
Monte Carlo simulation: Drawing a large number of pseudo-random uniform variables from the interval [0,1] at one time, or once at many different times, and assigning values less than or equal to 0.50 as heads and greater than 0.50 as tails, is a Monte Carlo simulation of the behavior of repeatedly tossing a coin.
"""

So like, GPT would take the place of the [0, 1] distribution and the simulation would be the simulation of the behaviour of a person writing some text message.

#

This would mean that it's not just the last step, the whole thing would be a monte Carlo simulation.

iron basalt Dec 16, 2023, 5:52 PM

#

final kiln This would mean that it's not just the last step, the whole thing would be a mon...

The cutoff line is arbitrary. You may have multiple simulations interacting, but each must contain some random or pseudo-random generation that affects the output.

#

Monte Carlo is very broad.

#

Oh and repeat runs too*

final kiln Dec 16, 2023, 5:54 PM

#

I didn't quite understand your point. The passage I'm mentioning makes a distinction between a simulation, a Monte Carlo method, and a Monte Carlo simulation.

#

Simulation: Drawing one pseudo-random uniform variable from the interval [0,1] can be used to simulate the tossing of a coin: If the value is less than or equal to 0.50 designate the outcome as heads, but if the value is greater than 0.50 designate the outcome as tails. This is a simulation, but not a Monte Carlo simulation.

Monte Carlo method: Pouring out a box of coins on a table, and then computing the ratio of coins that land heads versus tails is a Monte Carlo method of determining the behavior of repeated coin tosses, but it is not a simulation.

Monte Carlo simulation: Drawing a large number of pseudo-random uniform variables from the interval [0,1] at one time, or once at many different times, and assigning values less than or equal to 0.50 as heads and greater than 0.50 as tails, is a Monte Carlo simulation of the behavior of repeatedly tossing a coin.
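The first and third definitions in that passage can be put side by side in a few lines. This is a sketch using Python's standard `random` module, with an arbitrary seed and sample size:

```python
import random

rng = random.Random(42)  # arbitrary seed, only for repeatability

def toss(rng):
    """One uniform draw from [0, 1): at most 0.5 is heads, otherwise tails."""
    return "heads" if rng.random() <= 0.5 else "tails"

# Simulation: a single draw stands in for a single coin toss.
single = toss(rng)

# Monte Carlo simulation: many draws, summarised into a statistic.
n = 100_000
heads = sum(toss(rng) == "heads" for _ in range(n))
estimate = heads / n

print(single)
print(estimate)  # should land close to 0.5
```

Under this reading, generating one piece of text from a language model resembles the first case, while generating many completions of the same prompt and summarising them resembles the second.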

#

Uhm, no I think it would fall into the first one

#

Even accounting for the auto regression, the end result is one sample

iron basalt Dec 16, 2023, 5:57 PM

#

Are you not drawing a large number?

final kiln Dec 16, 2023, 5:58 PM

#

It's technically a single sample from one distribution I think.

iron basalt Dec 16, 2023, 5:59 PM

#

final kiln Furthermore, why is positional encoding needed ?

Btw, the answer is no, but it does make it more difficult for the network.

#

It can effectively learn to do what the position encoding does.

#

(But it's a waste, just neat that it can)

final kiln Dec 16, 2023, 6:00 PM

#

Oh okay I see, you give it a hint for how to represent position so it's more efficient to train it

final kiln Dec 16, 2023, 6:01 PM

#

final kiln The normalization is also quite strange. The normalization layers go like (v - E...

What about this stuff tho, aren't we meant to think of embeddings as part of a vector space

iron basalt Dec 16, 2023, 6:02 PM

#

final kiln Oh okay I see, you give it a hint for how to represent position so it' more effi...

Similar to other methods that precompute stuff and feed that as input rather than just the inputs directly, puts less burden on the network.

#

Especially functions that the network would require a lot of neurons to compute itself.

final kiln Dec 16, 2023, 6:03 PM

#

Yeah I didn't think of it that way, I had the impression that positional encoding was obligatory.

iron basalt Dec 16, 2023, 6:03 PM

#

Or the extreme end of that, precompute a ton of random functions on the inputs, then at the end have a simple linear layer.

#

(Which is its own model of computation being researched, pulling answers out of chaos, one of those functions surely has the answer by chance)

lunar ibex Dec 16, 2023, 6:05 PM

#

hi is there a situation where the following is true

(ndarray * scalar // scalar) != ndarray

#

im currently facing this situation and not sure what could have caused it

#

shape of my ndarray is (39584,) single dim array

final kiln Dec 16, 2023, 6:06 PM

#

Probably something like dividing by zero

#

Or very small values

lunar ibex Dec 16, 2023, 6:06 PM

#

my scalar is 8 tho

final kiln Dec 16, 2023, 6:07 PM

#

Uhm is still possible I think, what's the smallest value in the array

lunar ibex Dec 16, 2023, 6:07 PM

#

0

final kiln Dec 16, 2023, 6:07 PM

#

Better yet, you can directly print the ones that are different

#

And try to see a pattern

lunar ibex Dec 16, 2023, 6:07 PM

#

left side is original ndarray, right side is after * and //

#

some of the bytes are short by 96 (hex 60)

final kiln Dec 16, 2023, 6:08 PM

#

Uhm, would be easier to see base 10

iron basalt Dec 16, 2023, 6:09 PM

#

final kiln Yeah I didn't think of it that way, I had the impression that positional encodin...

What is really cool is what it learns is pretty much exactly the positional encoding again, and behaves pretty much exactly the same as grid cells in biology (used for positioning in the brain). It seems that nature has converged to the same thing.

final kiln Dec 16, 2023, 6:10 PM

#

iron basalt What is really cool is what it learns is pretty much exactly the positional enco...

You mean it learns the sine wave stuff described in the 2017 paper ?

iron basalt Dec 16, 2023, 6:10 PM

#

final kiln You mean it learns the sine wave stuff described in the 2017 paper ?

Yeah.
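For reference, the "sine wave stuff" is the fixed sinusoidal encoding from the 2017 "Attention Is All You Need" paper, where even dimensions use sin(pos / 10000^(2i/d_model)) and odd dimensions use the matching cosine. A minimal NumPy sketch, in which the function name and the sizes are arbitrary choices and `d_model` is assumed to be even:

```python
import numpy as np

def sinusoidal_positions(n_positions, d_model):
    """Fixed sinusoidal position encodings; d_model is assumed to be even."""
    positions = np.arange(n_positions)[:, None]    # shape (n_positions, 1)
    pair_index = np.arange(d_model // 2)[None, :]  # shape (1, d_model // 2)
    angles = positions / (10000 ** (2 * pair_index / d_model))
    pe = np.zeros((n_positions, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

pe = sinusoidal_positions(n_positions=50, d_model=16)
print(pe.shape)
print(pe[1, :4])  # position 1, first two sin/cos pairs
```

Each pair of columns is a wave with a different frequency, which is the "several kinds which repeat at different frequencies" picture from the grid-cell comparison.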

final kiln Dec 16, 2023, 6:10 PM

#

Yeah that's pretty cool

iron basalt Dec 16, 2023, 6:11 PM

#

It's also related to grid cells, and the Fourier Transform (grid cells act like one).

lunar ibex Dec 16, 2023, 6:12 PM

#

final kiln Uhm, would be easier to see base 10

woah not sure if theres a way to do that in hxd

final kiln Dec 16, 2023, 6:13 PM

#

Uhm from what I recall grid cells sort of create a map of repeated circular shapes that repeat across space. You'd have several kinds which repeat at different frequencies and that's how it kinda encodes position

iron basalt Dec 16, 2023, 6:13 PM

#

final kiln Uhm from what I recall grid cells sort of create a map of repeated circular shap...

Yes.

final kiln Dec 16, 2023, 6:13 PM

#

So it's kinda a 3D sine wave

#

Didn't think of it that way, pretty cool

iron basalt Dec 16, 2023, 6:14 PM

#

Place fields, which are built with grid cells, also kind of show up (we know less about them, so can't really tell yet) in Transformers, when the context switches you can see the attention remapping.

#

(The context remapping place fields behavior)

final kiln Dec 16, 2023, 6:16 PM

#

Doesn't this at least point to GPT having "understanding" similar to our own ? Since it's using similar ways of representing things

iron basalt Dec 16, 2023, 6:16 PM

#

final kiln Doesn't this at least point to GPT having "understanding" similar to our own ? S...

It's closer than before. It's more like we are reinventing what biology has done, without realizing it, because we are also getting in new information about the biology at the same time.

final kiln Dec 16, 2023, 6:18 PM

#

Do you know how they are doing vision ? Is it a literal part of the input or is it part of a different network ?

#

Like, a transcription from image to text and then that gets fed to gpt ?

left tartan Dec 16, 2023, 6:19 PM

#

final kiln Doesn't this at least point to GPT having "understanding" similar to our own ? S...

Are you familiar with https://en.m.wikipedia.org/wiki/Chinese_room?

#

It’s a philosophical debate over what ‘understanding’ means

iron basalt Dec 16, 2023, 6:19 PM

#

A lot of what happens in the brain revolves around this positioning system / place fields, so it's probably needed for all future networks.

iron basalt Dec 16, 2023, 6:20 PM

#

final kiln Do you know how they are doing vision ? Is it a literal part of the input or is ...

Vision has multiple sections in the brain. That is a lot of just parallel processing to simplify it for the rest of the brain.

final kiln Dec 16, 2023, 6:20 PM

#

left tartan Are you familiar with <https://en.m.wikipedia.org/wiki/Chinese_room>?

Yes, the system as a whole has understanding even though none of the parts does. That's also what we are, no neuron understands what I do even tho I'm neurons.

past meteor Dec 16, 2023, 6:20 PM

#

halcyon hedge Hey I was just curious, are there any widely used models using stochastic calcul...

No, the only people I know using stochastic calculus are actuaries. You don't need this.

left tartan Dec 16, 2023, 6:21 PM

#

final kiln Yes, the system as a whole has understanding even though none of the parts does....

My ELI5: the argument is that such a system cannot be described as ‘understanding’. It’s a good read, but a purely philosophical point.

final kiln Dec 16, 2023, 6:21 PM

#

iron basalt Vision has multiple sections in the brain. That is a lot of just parallel proces...

Oh I mean in the new multimodal LLMs like GPT4V

iron basalt Dec 16, 2023, 6:21 PM

#

Vision, like all other systems in the brain is heavily reliant on top down observer expectations, it's how you see things even if they are noisy, and also things that don't exist, like imaginary edges.

iron basalt Dec 16, 2023, 6:21 PM

#

final kiln Oh I mean in the new multimodal LLMs like GPT4V

They are not accurate, because the whole system needs to be learned together.

#

Each system can affect the top-down effect on another.

left tartan Dec 16, 2023, 6:22 PM

#

past meteor No, the only people I know using stochastic calculus are actuaries. You don't ne...

I come up against the edges of it in fintech but I run away when I see it. But Black Scholes in particular.

iron basalt Dec 16, 2023, 6:22 PM

#

Like priming someone with an auditory cue, which affects what they see.

final kiln Dec 16, 2023, 6:23 PM

#

left tartan My eli5 is. The argument is that a such a system cannot be described as ‘underst...

That would imply that no human has true understanding. I've come to accept that the world is weird and counterintuitive.

past meteor Dec 16, 2023, 6:23 PM

#

final kiln I don't think I've studied stochastic calculus. But probably related are the usa...

GPT uses beam search.

You have a problem like this: maximize the likelihood over a sequence. Naturally you shouldn't always pick the greedy option: you might pick 4 suboptimal tokens first and then the rest in a greedy fashion, and you end up with a higher likelihood at the end.

You can have exact solutions here using BFS/DFS but it's intractable.
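Regardless of which decoding strategy a particular GPT deployment actually uses, the greedy-versus-beam point can be shown on a toy problem. The probability table below is made up so that the locally best first token leads to a worse overall sequence. `beam_search` is a hypothetical helper, and a beam width of 1 is the same as greedy decoding:

```python
import math

# Hypothetical toy model: next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}

def beam_search(model, length, beam_width):
    """Keep the beam_width most likely prefixes at every step."""
    beams = [((), 0.0)]  # (prefix, log-probability)
    for _ in range(length):
        candidates = []
        for prefix, logp in beams:
            for token, p in model[prefix].items():
                candidates.append((prefix + (token,), logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]  # most likely sequence found

greedy_seq, greedy_logp = beam_search(TOY_MODEL, length=2, beam_width=1)
beam_seq, beam_logp = beam_search(TOY_MODEL, length=2, beam_width=2)

print(greedy_seq, round(math.exp(greedy_logp), 2))
print(beam_seq, round(math.exp(beam_logp), 2))
```

Greedy commits to "a" because 0.6 > 0.4 and ends with a sequence probability of 0.30, while a beam of width 2 keeps "b" alive and finds the 0.36 sequence. The cost grows with the beam width, which is why an exhaustive search over all sequences is intractable.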

iron basalt Dec 16, 2023, 6:23 PM

#

If trained together, the language modeling part would actually ground its symbols better to visuals and such, making it have an actual understanding of the world. Just text is too narrow.

left tartan Dec 16, 2023, 6:23 PM

#

final kiln That would imply that no human has true understanding. I've come to accept that ...

Read the wiki page, the argument is not so easily discarded

iron basalt Dec 16, 2023, 6:24 PM

#

A big one is touch, specifically how positioning systems interact with that and model objects / spaces, and link that to stuff like words, sounds, etc.

#

However, this interconnected training problem is really hard, because you need all inputs coming in at the same time; you can't just train each part separately.

past meteor Dec 16, 2023, 6:26 PM

#

final kiln Do you know how they are doing vision ? Is it a literal part of the input or is ...

You can make an arbitrary model multimodal by doing this:

Language model A has an embedding space a.

Vision model B has an embedding space b.

Train a translation "network" c that maps a to b and vice versa.

There's been a large amount of research doing this. They take pretrained vision and language models and just train the translation/mapping network. You would need a training task that accurately allows you to learn this though, for instance image captioning may work to train this translation network.
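A heavily simplified sketch of the frozen-models-plus-mapping idea: the two "embedding spaces" below are random synthetic matrices rather than outputs of real models, and the translation network is reduced to a single linear map fitted by least squares. Real systems typically train a small neural network on a task such as captioning instead:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for paired embeddings from two frozen models.
n_pairs, dim_vision, dim_text = 200, 8, 4
vision_emb = rng.normal(size=(n_pairs, dim_vision))  # space b (vision)

# Assumption for the demo: the text embeddings are a noisy linear
# function of the vision embeddings.
true_map = rng.normal(size=(dim_vision, dim_text))
text_emb = vision_emb @ true_map + 0.01 * rng.normal(size=(n_pairs, dim_text))

# "Train" only the translation: solve vision_emb @ W ~= text_emb.
W, *_ = np.linalg.lstsq(vision_emb, text_emb, rcond=None)

# Map a new vision embedding into the text embedding space.
new_image = rng.normal(size=(1, dim_vision))
projected = new_image @ W

print(W.shape)
print(projected.shape)
```

Only `W` is learned here. Both embedding sources stay untouched, which is the appeal when compute or data is limited.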

iron basalt Dec 16, 2023, 6:26 PM

#

Just plain text is about as convenient as it gets.

#

One thing to also note about training them separately is that you're doing a lot of redundant work; when interconnected during training, they can make the learning process faster for each other.

past meteor Dec 16, 2023, 6:28 PM

#

Agreed

#

But if your compute budget or dataset isn't massive I prefer freezing the language and vision models and just training the translation

iron basalt Dec 16, 2023, 6:29 PM

#

past meteor But if your compute budget or dataset isn't massive I prefer freezing the langua...

Yeah, having separate models right now is the convenient, still-kind-of-works approach.

#

It's not ideal, but works decently well.

past meteor Dec 16, 2023, 6:29 PM

#

But multi-task learning has been shown to improve generalization and data efficiency in theory yes. I typically comment from the "practical" perspective 😄

iron basalt Dec 16, 2023, 6:30 PM

#

But, if your models are online learners, now you can do some cool stuff in post. You can hook them up more directly and have them learn more together without disrupting each other's knowledge.

#

(Biology uses online learning, in part because it really needs to not disrupt existing stuff, especially while still growing)

past meteor Dec 16, 2023, 6:32 PM

#

left tartan I come up against the edges of it in fintech but I run away when I see it. But B...

I don't even know what black scholes is (anymore). I'm an alumnus of the faculty of economics and business (not CS) but after I finished my masters I purged everything 😩

iron basalt Dec 16, 2023, 6:33 PM

#

iron basalt But, if your models are online learners, now you can do some cool stuff in post....

(Interestingly, Transformers become more online-like the more they have trained)

final kiln Dec 16, 2023, 6:33 PM

#

left tartan Read the wiki page, the argument is not so easily discarded

I've read the system reply section for refutation. I don't think this thought experiment proves or disproves any side, it just brings to light how ignorant we are about consciousness.

I don't take a strong side, I just try to err on the side of caution so that I can act in an ethical manner in face of ignorance. We don't know how it works, so we should be careful when something starts acting conscious, otherwise we may inadvertently cause suffering.

past meteor Dec 16, 2023, 6:33 PM

#

I used to know though, we learnt about it in some math or finance class.

left tartan Dec 16, 2023, 6:34 PM

#

final kiln I've read the system reply section for refutation. I don't think this thought ex...

Agree, it’s just a philosophical debate… worthy of thinking through both sides

final kiln Dec 16, 2023, 6:34 PM

#

past meteor You can make an arbitrary model multimodal by doing this: Language model A has ...

Oh that's pretty cool

final kiln Dec 16, 2023, 6:35 PM

#

left tartan Agree, it’s just a philosophical debate… worthy of thinking through both sides

Well yes, but I think about it a lot because it's a dangerous thing for us to be ignorant about.

#

And it seems like we won't have answers for a long time

past meteor Dec 16, 2023, 6:35 PM

#

My default stance to AI safety is that our current approach is bad

#

It's always philosophy 😩

lunar ibex Dec 16, 2023, 6:35 PM

#

lunar ibex left side is original ndarray, right side is after * and //

issue was uint8 overflow
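The effect is easy to reproduce. With a `uint8` array, multiplying by 8 wraps around modulo 256 before the floor division happens, so the round trip returns `x % 32` instead of `x`. The sample values below are made up, and 100 comes back as 4, which is short by 96 as in the hex dump:

```python
import numpy as np

arr = np.array([0, 31, 32, 100, 255], dtype=np.uint8)

# The product is still uint8, so values above 255 wrap around silently.
round_trip = (arr * 8) // 8
print(round_trip)         # differs from arr wherever arr * 8 > 255
print(arr != round_trip)  # mask of the affected entries

# Widening the dtype before multiplying avoids the overflow.
safe = (arr.astype(np.uint16) * 8) // 8
print(safe)
```

Printing `arr[arr != round_trip]` on the real data is a quick way to confirm that every mismatch differs by a multiple of 32.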

left tartan Dec 16, 2023, 6:36 PM

#

past meteor I used to know though, we learnt about it in some math or finance class.

Yah, it’s somewhat interesting for pricing options. The main point is it assumes a random walk, which is amazingly/surprisingly quite useful.

past meteor Dec 16, 2023, 6:36 PM

#

This is the only field where this is the case. When civil engineers are building a bridge they don't call in philosophers to talk about the safety of it nor expect engineers to become philosophers.

iron basalt Dec 16, 2023, 6:36 PM

#

past meteor It's always philosophy 😩

Agree, we need math.

past meteor Dec 16, 2023, 6:37 PM

#

It's become my ultimate pet peeve these days, we should stop this imho

iron basalt Dec 16, 2023, 6:37 PM

#

Yeah.

#

Won't go anywhere, you think what you think at this point.

left tartan Dec 16, 2023, 6:37 PM

#

final kiln And it seems like we won't have answers for a long time

Think of the Chinese room argument (Searle's) as a counter to the Turing test argument. The Turing test says: it’s intelligent if it’s indistinguishable from intelligent. Searle's says: it’s never intelligent.

past meteor Dec 16, 2023, 6:37 PM

#

(I don't mean this in reference to the conversation above btw! It's only tangentially related.)

left tartan Dec 16, 2023, 6:39 PM

#

past meteor It's become my ultimate pet peeve these days, we should stop this imho

Stop what tho?

past meteor Dec 16, 2023, 6:41 PM

#

The top voices of AI safety being dominated by people like Eliezer Yudkowsky

#

I'd say they still have a very important role to play, even when talking about how structural engineers build bridges. The most important thing should not be hypotheticals like "sentience" but things that are grounded in how models are actually trained, so basically grounded in math. There are great papers that take this angle, which Rob Miles frequently summarizes, e.g. "The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment"

iron basalt Dec 16, 2023, 6:46 PM

#

past meteor I'd say they still have a very important role to play, even when talking about h...

Some AI type classification charts, like how algorithms are classified in theory of computation, but in this case AI safety.

#

(Based on math)

past meteor Dec 16, 2023, 6:48 PM

#

left tartan Stop what tho?

Sorry if I came on a bit strong and/or gatekeep-y!

#

Was not my intention

left tartan Dec 16, 2023, 6:48 PM

#

past meteor Sorry if I came on a bit strong and/or gatekeepe-y!

I was just genuinely curious, I don’t really follow the ethics in ai space

odd meteor Dec 16, 2023, 6:50 PM

#

left tartan I was just genuinely curious, I don’t really follow the ethics in ai space

I've always dreaded that field. The active ML researchers in that field are really made of steel!

past meteor Dec 16, 2023, 6:50 PM

#

https://news.ycombinator.com/item?id=15171513 <- this comment is a nice summary of what I mean. It's from 2017 which is even before the current hype 😄

iron basalt Dec 16, 2023, 6:50 PM

#

odd meteor I've always dreaded that field. The active ML researchers in that field are real...

They have to deal both with math and Twitter. Hard job.

odd meteor Dec 16, 2023, 6:54 PM

#

iron basalt They have to deal both with math and Twitter. Hard job.

It's always ballistic and house of flying daggers in that field. I don't think I can deal with that 🤣🤣

iron basalt Dec 16, 2023, 6:55 PM

#

odd meteor It's always ballistic and house of flying daggers in that field. I don't think I...

Robert Miles is the kind of person in this field who isn't panicking but is directly trying to solve issues (not the "there are no issues" type either). There are many such people, but outrage draws attention as usual.

#

(It's similar to those actually trying to solve climate change actively, versus the panickers and deniers, they are too busy solving the problem so you don't hear much from them)

final kiln Dec 16, 2023, 6:58 PM

#

past meteor https://news.ycombinator.com/item?id=15171513 <- this comment is a nice summary ...

"To use it here, Elon Musk, Putin, and Mark Zuckerberg all have something in common"
didn't know putin spoke about ai safety

#

2017 was a different world I suppose

#

anyway, is there anything being done in the direction of making the structure of the neural network part of the optimization ? so like, instead of just adjusting weights, also be able to add layers, increase their size, decrease their size, etc

#

https://en.wikipedia.org/wiki/Neural_architecture_search

Neural architecture search

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy and ...

iron basalt Dec 16, 2023, 7:06 PM

#

final kiln anyway, is there anything being done in the direction of making the structure of...

Yes, it's very common.

past meteor Dec 16, 2023, 7:06 PM

#

NAS is a genetic algorithm

#

You'd typically have 2 optimization routines, one to train a network (a single instance) and then a hyperparameter search that maybe isn't fully random if you're going down this route

#

I haven't read any research where people do it in a single routine, e.g. using the gradient after a batch to change the actual architecture, which I think is what the question is about. Maybe @iron basalt has

final kiln Dec 16, 2023, 7:09 PM

#

Differentiable NAS has been shown to produce competitive results using a fraction of the search time required by RL-based search methods.

iron basalt Dec 16, 2023, 7:09 PM

#

past meteor I haven't read any research of people doing it in one e.g., using the gradient a...

IDR the names right now, but those don't work well yet (and probably won't for deep learning in general due to being offline).

final kiln Dec 16, 2023, 7:11 PM

#

iron basalt IDR the names right now, but those don't work well yet (and probably won't for d...

what do you mean by being offline ?

iron basalt Dec 16, 2023, 7:12 PM

#

final kiln what do you mean by being offline ?

Opposite of online learner.

#

As you might be able to imagine, a learner that is really good at learning new things without disrupting existing knowledge at all is ideal for growing more layers and such.

final kiln Dec 16, 2023, 7:14 PM

#

yeah ig I'm still getting caught up on all the terminology

iron basalt Dec 16, 2023, 7:14 PM

#

(And does not even need batches either)

#

(Which is why biology must do it)

final kiln Dec 16, 2023, 7:17 PM

#

are there any multi sequence GPT models ? like, two input sequences, one which is updated independently (like a user writing to a textbox real time) and another which is the output of the model

#

However, the concept of handling multiple independent input sequences in the context of GPT-like models is more about how you frame the problem and feed data into the model rather than a built-in feature of the model itself. For instance, you can design an application where one input stream is user-generated content (like real-time text input) and another is context or additional information (like a separate conversation or data stream). These inputs can be concatenated or formatted in a way that the model understands as separate but related pieces of information.

#

interesting, I think I'm still gonna do two branches and then kinda mix em up somehow before the output

#

my objective is to be able to talk with it via voice chat in real time

#

so like, it should know that it is interrupting me and not speak

#

instead of the current turn based thing they have on chat gpt mobile

desert oar Dec 16, 2023, 8:40 PM

#

final kiln are there any multi sequence GPT models ? like, two input sequences, one which i...

sounds like what transformers were originally used for: sequence-to-sequence transformation

#

that's the encoder-decoder architecture, as in e.g. the "attention is all you need" paper

#

(unless i'm misunderstanding what you're looking for)

final kiln Dec 16, 2023, 8:42 PM

#

Right, in the original one the sequence doesn't change, but I suppose it's only a minor adjustment

#

It encodes the first sequence and then autoregression is done with the decoder

desert oar Dec 16, 2023, 8:45 PM

#

oh, i think i see what you mean

#

maybe it would work if you applied masking to both input and output...

#

that has to be done somewhere in the literature already. right?

final kiln Dec 16, 2023, 8:47 PM

#

I haven't looked into it yet, I'm still in the part of understanding and implementing the transformer. I'm using a Viz as guide

#

I'm using this as a guide for the implementation https://bbycroft.net/llm

LLM Visualization

A 3D animated visualization of an LLM with a walkthrough.

desert oar Dec 16, 2023, 8:51 PM

#

final kiln I haven't looked into it yet, I'm still in the part of understanding and impleme...

you're in the DS server right? i was asking a bunch of questions on this topic a few months ago while i was doing the same, search for my messages in the machine-learning channel there if you want to see the questions i asked and the answers i got

#

i was focused mostly on the self-attention mechanism specifically, since that was the non-obvious part to me

#

(not that any of it was obvious, but it was the part that i really didn't understand from reading the literature)

final kiln Dec 16, 2023, 9:01 PM

#

I think I got some intuition for self attention tho I do need to work through it.

#

I'm honestly stuck on why z score is used to normalize the vectors

#

Has no direct interpretation except that it keeps values from exploding

craggy patio Dec 16, 2023, 10:07 PM

#

I am creating a MIDI music generative AI but have failed multiple times. I am starting over again and would like some insight on what models I should use

desert oar Dec 16, 2023, 10:27 PM

#

final kiln I'm honestly stuck on why z score is used to normalize the vectors

normalizing the mean to 0 and standard deviation to 1 just tends to work well in general

#

the important thing is to put all numbers on roughly the same numerical precision scale

#

centering at 0 and rescaling by standard deviation just happens to work well for that, it helps ensure that you're "in the middle" of the space of what can be represented by floating-point numbers, allowing lots of room for numbers to be significantly smaller than or significantly larger than 0

#

it also does have direct interpretation in statistical models, so there's some carry-over if you squint

#

oh also, if you center the mean at 0, then scaling down by standard deviation is just normalizing in the linear algebra sense of dividing by the l2 norm (up to a constant factor of sqrt(n), where n is the number of coordinates)
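
A small numerical check of that relationship may help. This is only a sketch with a made-up 8-coordinate vector; it assumes the population standard deviation (NumPy's default `ddof=0`), and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.normal(loc=3.0, scale=2.0, size=8)    # hypothetical 8-coordinate vector

centered = v - v.mean()
z_score = centered / v.std()                  # population std (ddof=0)
unit = centered / np.linalg.norm(centered)    # centered vector scaled to unit L2 norm

# With ddof=0, std = ||centered|| / sqrt(n), so the two results
# point in the same direction and differ only by the factor sqrt(n)
n = v.size
print(np.allclose(z_score, np.sqrt(n) * unit))
```

So after centering, dividing by the standard deviation and dividing by the L2 norm give the same direction and differ only in overall length.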

#

i really like this talk for an explanation of self-attention https://youtu.be/S27pHKBEp30?feature=shared&t=587

YouTube

Seattle Applied Deep Learning

LSTM is dead. Long Live Transformers!

Leo Dirac (@leopd) talks about how LSTM models for Natural Language Processing (NLP) have been practically replaced by transformer-based models. Basic background on NLP, and a brief history of supervised learning techniques on documents, from bag of words, through vanilla RNNs and LSTM. Then there's a technical deep dive into how Transformers ...

▶ Play video

#

the value of the i,j cell of the attention matrix is a relevance score of the j'th token in the input sequence "from the perspective of" the i'th token in the output sequence

#

that's why they mask off the upper triangle of the attention matrix in a decoder-only transformer, to prevent the i'th token in the decoded sequence from "attending to" any subsequent tokens
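
To make the masking concrete, here is a minimal NumPy sketch. The function name `causal_attention_weights` and the random 4x4 scores are made up for illustration; a real model would compute the scores from queries and keys.

```python
import numpy as np

def causal_attention_weights(scores: np.ndarray) -> np.ndarray:
    """Row-wise softmax over raw attention scores, with future positions masked out."""
    n = scores.shape[0]
    future = np.triu(np.ones((n, n), dtype=bool), k=1)    # True strictly above the diagonal
    masked = np.where(future, -np.inf, scores)            # future tokens get -inf
    masked = masked - masked.max(axis=1, keepdims=True)   # subtract row max for stability
    weights = np.exp(masked)                              # exp(-inf) == 0
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))    # hypothetical scores for a 4-token sequence
weights = causal_attention_weights(scores)
print(np.round(weights, 2))
```

Row i of `weights` only has non-zero entries for columns j <= i, so token i never draws on later tokens.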

final kiln Dec 16, 2023, 11:05 PM

#

desert oar oh also, if you center the mean at 0, then scaling down by standard deviation is...

ig my question here would be if it is totally necessary, I'm fine with the division by std since it's the same as L2-normalization, but I'd be very happy to do away with subtraction by the mean if possible, im gonna try to graph this

#

ye makes no sense

#

in 2d is awful

#

maybe im doing something wrong

#

the vector goes from spanning the entire 2d plane to being confined to two points

#

which makes sense, subtraction makes it confined to y = -x, then normalization forces the norm to be 1

#

so each time this is done two dimensions are discarded, ig the network will find some way of accounting for this

#

import math

import torch
from torch import nn

# ModelParameters, TensorFloat and RandParameter are helpers defined elsewhere (not shown)
class SelfAttentionHead(nn.Module):
  def __init__(self, params: ModelParameters):
    super(SelfAttentionHead, self).__init__()
    self.compressed_coordinates = params.word_vector_size // 3
    self.q: TensorFloat["coordinates compressed_coordinates"] = RandParameter(
        params.coordinates, self.compressed_coordinates
    )
    self.k: TensorFloat["coordinates compressed_coordinates"] = RandParameter(
        params.coordinates, self.compressed_coordinates
    )
    self.v: TensorFloat["coordinates compressed_coordinates"] = RandParameter(
        params.coordinates, self.compressed_coordinates
    )

  def forward(self, sequence: TensorFloat["words coordinates"]) -> TensorFloat["words compressed_coordinates"]:
    # TensorFloat["words coordinates"] @ TensorFloat["coordinates compressed_coordinates"]
    q_vectors: TensorFloat["words compressed_coordinates"] = sequence @ self.q
    k_vectors: TensorFloat["words compressed_coordinates"] = sequence @ self.k
    v_vectors: TensorFloat["words compressed_coordinates"] = sequence @ self.v

    # TensorFloat["words compressed_coordinates"] @ TensorFloat["compressed_coordinates words"]
    attention_scores: TensorFloat["words words"] = q_vectors @ k_vectors.T
    # math.sqrt because torch.sqrt only accepts tensors and this is a Python int
    attention_scores = attention_scores / math.sqrt(self.compressed_coordinates)
    # softmax over the last axis so that each row (one query word) sums to 1
    attention_scores = torch.nn.functional.softmax(attention_scores, dim=-1)

    # TensorFloat["words words"] @ TensorFloat["words compressed_coordinates"]
    return attention_scores @ v_vectors

#

class SelfAttention(nn.Module):
  def __init__(self, params: ModelParameters):
    super(SelfAttention, self).__init__()
    self.head_1 = SelfAttentionHead(params)
    self.head_2 = SelfAttentionHead(params)
    self.head_3 = SelfAttentionHead(params)
    self.projection: TensorFloat["words words"] = RandParameter(params.words, params.words)

  # forward has to be a method of the class, not a function nested inside __init__
  def forward(self, sequence: TensorFloat["words coordinates"]):
    att_1: TensorFloat["words compressed_coordinates"] = self.head_1(sequence)
    att_2: TensorFloat["words compressed_coordinates"] = self.head_2(sequence)
    att_3: TensorFloat["words compressed_coordinates"] = self.head_3(sequence)
    # concatenate the heads along the coordinates axis (torch.stack would add a new axis instead)
    output: TensorFloat["words coordinates"] = torch.cat([att_1, att_2, att_3], dim=1)
    return self.projection @ output

#

wait I should do softmax here isnt it

#

no is done on attention_scores = torch.nn.functional.softmax(attention_scores)

golden ridge Dec 16, 2023, 11:33 PM

#

anyone has some resources on how to train neural networks??

final kiln Dec 17, 2023, 12:07 AM

#

golden ridge anyone has some resources on how to train neural networks??

It's hard to make recommendations without knowing your background. But my experience has been that if you know enough math and code, none of this is hard to pick up by just building stuff.

desert oar Dec 17, 2023, 12:57 AM

#

final kiln ig my question here would be if it is totally necessary, I'm fine with the divis...

where have you seen standardization (centering and scaling by 1 standard deviation) rather than normalization (dividing by the norm) in a NN?

#

i'm not an expert in deep learning as i'm sure you know, but i've only ever seen the latter

final kiln Dec 17, 2023, 1:00 AM

#

desert oar where have you seen standardization (centering and scaling by 1 standard deviati...

Well in the transformer. The reference I'm using uses the z-score formula over the coordinates of the tokens.

#

I'm using this as ref: https://bbycroft.net/llm

LLM Visualization

A 3D animated visualization of an LLM with a walkthrough.

#

They do m*z_score + b, where b and m are learnable
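
For reference, a minimal NumPy sketch of that step. The names `m` and `b` follow the message above; the `eps` term and the array sizes are assumptions of this sketch, and in a real model `m` and `b` would be trained rather than fixed.

```python
import numpy as np

def layer_norm(x, m, b, eps=1e-5):
    """Normalize each row (one token) of x over its coordinates, then apply m * z + b."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    z = (x - mean) / np.sqrt(var + eps)   # z-score per token; eps avoids division by zero
    return m * z + b

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 16))   # 4 tokens, 16 coordinates (made-up sizes)
m = np.ones(16)     # learnable scale, shown here at 1
b = np.zeros(16)    # learnable shift, shown here at 0
out = layer_norm(x, m, b)

print(out.mean(axis=-1))   # per-token mean after normalization
print(out.std(axis=-1))    # per-token std after normalization
```

Note that the statistics are taken across the coordinates of each token, not across tokens or across a batch.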

desert oar Dec 17, 2023, 1:05 AM

#

you're talking about the layer norm step?

#

i see, so it is

#

i maintain it makes sense to both center and scale

#

it's the same reason you do it in just about any other machine learning model

#

it's good for numerical behavior

#

the fact that the scaling of centered data coincides with l2 vector normalization is just a bonus

#

The goal is to make the average value in the column equal to 0 and the standard deviation equal to 1. To do this, we find both of these quantities (mean (μ) & std dev (σ)) for the column and then subtract the average and divide by the standard deviation.

i wish they'd say why you do this, because what i said above is not obvious at all unless you already happen to know it

#

great resource overall, but too much focus on what/how and not enough on why

final kiln Dec 17, 2023, 1:08 AM

#

desert oar it's good for numerical behavior

I am aware of this. My issue with it is that it's a step away from interpretability, so if I can do away with it I'd rather do it.

#

1/norm(v) is much more intuitive

#

Uhm, I also wonder if anyone has tried to do "compression" of the attention heads.

So like, train a larger transformer, but then look at the attention heads and see if they can be used to train smaller ones. Effectively compressing them. Or maybe even changing architectures entirely.

desert oar Dec 17, 2023, 1:16 AM

#

final kiln Uhm, I also wonder if anyone has tried to do "compression" of the attention head...

that sounds like distillation maybe https://medium.com/nlplanet/a-model-distillation-survey-7f0e1b56b3cf

Medium

A Model Distillation Survey

Categories of knowledge, distillation schemes, teacher-student architectures, and distillation algorithms

desert oar Dec 17, 2023, 1:16 AM

#

final kiln I am aware of this. My issue with it is that it's a step away from interpretabil...

centering at 0 has numerical benefits and doesn't qualitatively change the data at all

#

it's not a linear transformation, but it doesn't actually change any of the subsequent interpretation

#

it's just shifting the entire space to exist in a more numerically-comfortable region

#

followed by rescaling the norm to 1

final kiln Dec 17, 2023, 1:17 AM

#

desert oar that sounds like distillation maybe https://medium.com/nlplanet/a-model-distilla...

Every idea I have has already been tried ahahah

desert oar Dec 17, 2023, 1:18 AM

#

final kiln Every idea I have has already been tried ahahah

yeah but you keep reinventing successful ideas. it's much less encouraging if you keep having ideas that people have tried and they turned out to be bad ideas

final kiln Dec 17, 2023, 1:18 AM

#

desert oar it's not a linear transformation, but it doesn't actually change any of the subs...

It does because you actually remove two dimensions, normalization constrains the data to an (n-1) hypersphere for ex

desert oar Dec 17, 2023, 1:20 AM

#

final kiln It does because you actually remove two dimensions, normalization constrains th...

shifting the data (centering) isn't linear, 0 isn't preserved

#

but yes, normalization (scaling) is linear

final kiln Dec 17, 2023, 1:22 AM

#

desert oar shifting the data (centering) isn't linear, 0 isn't preserved

The plot I made really made it look like the v - E(v) thing is almost a projection. For 2d it maps the plane to the y=-x line. But maybe I'm doing something wrong in the plot.

desert oar Dec 17, 2023, 1:27 AM

#

that does seem off

#

in this case "2D" means you have 2 possible tokens in the sequence. the idea is that the embedding for each token is centered at 0 mean and scaled to 1 std dev, but that shouldn't involve any nullifying of vector space dimensions. it's just shifting the origin, followed by squeezing/stretching

#

Normalization is an important step in the training of deep neural networks, and it helps improve the stability of the model during training.
i guess this is their explanation

final kiln Dec 17, 2023, 1:33 AM

#

x - .5*( x + y) = .5(x - y)
y - .5 (x + y) = -.5(x - y)

#

As a sanity check

#

I think that's right unless I got fooled again by my eternal enemy, the minus sign

#

So there's one free variable after subtracting the mean
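
The same sanity check can be run numerically. This sketch assumes the mean is taken over the coordinates of a single vector (as in the layer norm step); the matrix name `P` and the example point are made up for illustration.

```python
import numpy as np

n = 2
# Subtracting the mean of a vector's own coordinates equals multiplying by this matrix
P = np.eye(n) - np.ones((n, n)) / n

v = np.array([3.0, 1.0])    # arbitrary example point (x, y)
centered = v - v.mean()

print(np.allclose(P @ v, centered))   # same result as v - mean(v)
print(np.allclose(P @ P, P))          # applying it twice changes nothing, so it is a projection
print(np.linalg.matrix_rank(P))       # rank n - 1: one free variable is left in 2D
print(centered.sum())                 # coordinates sum to 0, so the point lies on y = -x
```

In n dimensions the same matrix has rank n - 1, which matches the idea that centering removes exactly one dimension.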

#

The 2017 paper points here: "Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. Layer normalization. arXiv preprint
arXiv:1607.06450, 2016."

#

https://arxiv.org/abs/1607.06450

arXiv.org

Layer Normalization

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which ...

desert oar Dec 17, 2023, 2:00 AM

#

final kiln x - .5*( x + y) = .5(x - y) y - .5 (x + y) = -.5(x - y)

oh i see what you mean, yes that makes sense

#

in statistics the sample average is just an estimate of a population mean so it makes sense in that context

final kiln Dec 17, 2023, 2:25 AM

#

Yes it makes sense if you look at the coordinates as samples from the same distribution. But that sort of collides with the picture of a vector space where dot products measure similarity.

#

Which is what the best intuitions I've seen are built from.

desert oar Dec 17, 2023, 4:35 AM

#

final kiln Yes it makes sense if you look at the coordinates as samples from the same distr...

that's a good point, especially because each individual vector is being normalized, rather than normalizing each dimension across all vectors. i'll have to think about it to see if there's any way to interpret this meaningfully, or if it's purely mechanical

#

there's also the whole aspect of learning slope and intercept parameters, which kind of throws off my interpretation before

pearl barn Dec 17, 2023, 10:14 AM

#

Guys, how do I start Jupyter Notebook locally on Windows? I already installed Miniconda

odd meteor Dec 17, 2023, 10:55 AM

#

pearl barn Guys, how do I start Jupyter Notebook locally on Windows? I already installed Min...

Open your conda CLI, type jupyter notebook, then hit Enter

pearl barn Dec 17, 2023, 11:29 AM

#

How can I run an external Jupyter notebook from an online source on my Windows machine??

#

Is miniconda enough to have Jupyter or do I need to install whole anaconda??

past meteor Dec 17, 2023, 1:33 PM

#

pearl barn Is miniconda enough to have Jupyter or do I need to install whole anaconda??

The difference between anaconda and miniconda is that anaconda already comes with a ton of packages and tools preinstalled and miniconda doesn't. As a beginner it might be a good idea to start with Anaconda.

past meteor Dec 17, 2023, 1:37 PM

#

pearl barn How can I run an external Jupyter notebook from an online source on my Windows machine??

I'll refer you to an editor (visual studio code): https://code.visualstudio.com/learn/get-started/basics. You can install it and you can be up and running in a minute but there's also a 5 minute video you can watch if you want/need to.

You'll have to install the Python extension https://marketplace.visualstudio.com/items?itemName=ms-python.python

and afterwards follow a third of this guide (you certainly don't need to read all of it) https://code.visualstudio.com/docs/datascience/jupyter-notebooks

#

I'm mostly sending you in the direction of tutorials that you need to read and/or watch because of the old adage: “Give a man a fish, and you feed him for a day. Teach a man to fish, and you feed him for a lifetime.”

hollow flicker Dec 17, 2023, 3:19 PM

#

Hey, I have a 75k-row dataset. I need to use sklearn's MLPClassifier. I have 5 classes. My accuracy score is higher than 0.99 every time. Why might this happen?

#

my dataset distribution

#

past meteor Dec 17, 2023, 3:47 PM

#

hollow flicker Hey, I have a 75k-row dataset. I need to use sklearn's MLPClassifier. I have 5 classes. ...

Can you do 3 things please:

1. When sharing code could you use triple backticks (`) to paste multiple lines instead of screenshots, it's typically preferred here :D
2. Could you use cross_val_score instead of cross_val_predict and casting to integers? It's the more idiomatic way to do this thing.
3. Can you split into train and test, cross_val_score on train, then train the classifier "for real" on train, predict on test and then make a confusion matrix.
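
The workflow described above could look roughly like this. It is a sketch on synthetic data from `make_classification`, standing in for the real 75k-row dataset; the hidden layer size and other settings are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the real 5-class dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=5, random_state=0)

# Hold out a test split that is never touched during cross-validation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=200, random_state=0)

# Cross-validate on the training split only
cv_scores = cross_val_score(clf, X_train, y_train, cv=5)
print("CV accuracy:", cv_scores.mean())

# Fit "for real" on train, then evaluate once on the held-out test split
clf.fit(X_train, y_train)
cm = confusion_matrix(y_test, clf.predict(X_test))
print(cm)
```

If the confusion matrix on the held-out split still looks near-perfect, that points toward something in the data itself (for example a feature that leaks the label) rather than the evaluation procedure.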

carmine ore Dec 17, 2023, 3:47 PM

#

How to optimise linear regression model to produce better predictions?

past meteor Dec 17, 2023, 3:50 PM

#

carmine ore How to optimise linear regression model to produce better predictions?

1. Using RidgeCV or ElasticNetCV instead, as these models already carry out some hyperparameter tuning for you.
2. Feature engineering: add interaction terms, binning, polynomials, splines, feature transforms and so on. The best way to identify if you need additional feature engineering is by doing residual analysis. Plot the error your model is making versus each variable. Normally there should be no structure in the residuals; if there is, you may need feature engineering.
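
A rough sketch of both suggestions, on a made-up dataset with a quadratic term hidden in the target. It uses scikit-learn's `RidgeCV` and `ElasticNetCV`; the alpha grid and `l1_ratio` values are arbitrary, and a correlation stands in for the residual plot so the result can be checked without a figure.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV, RidgeCV

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 2))
# Made-up target with a quadratic term that a purely linear model cannot capture
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

alphas = np.logspace(-3, 3, 13)
ridge = RidgeCV(alphas=alphas).fit(X, y)
enet = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5).fit(X, y)
print("chosen alphas:", ridge.alpha_, enet.alpha_)

# Residual analysis: structure in the residuals hints at a missing feature
residuals = y - ridge.predict(X)
corr_before = np.corrcoef(residuals, X[:, 1] ** 2)[0, 1]
print("residual correlation with x1^2:", corr_before)

# After adding the squared term as a feature, the fit should improve
X_fe = np.column_stack([X, X[:, 1] ** 2])
ridge_fe = RidgeCV(alphas=alphas).fit(X_fe, y)
print("R^2 before/after:", ridge.score(X, y), ridge_fe.score(X_fe, y))
```

In practice you would scatter-plot `residuals` against each variable; any visible curve or trend is a candidate for a new feature.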

carmine ore Dec 17, 2023, 3:52 PM

#

past meteor 1. Using `RidgeCV` or `ElasticNetCV` instead as these models already c...

I am using polynomials. I am also experimenting with ensemble methods

#

I will try the models you mention right away

#

When it comes to feature engineering I am trying to avoid it for today

past meteor Dec 17, 2023, 3:53 PM

#

Does it need to be linear regression? I always try a gradient boosted machine and/or say Random Forest to see what their performance is and then compare that to the linear models. If they're doing significantly better then there's at least a few non-linearities that your linear model is not accounting for.

carmine ore Dec 17, 2023, 3:54 PM

#

past meteor Does it *need* to be linear regression? I always try a gradient boosted machine ...

Yeah it has to be linear regression.

past meteor Dec 17, 2023, 3:55 PM

#

Then you should definitely do what I suggested, the gbm / rf model will at least give you a lower bound of performance your linear model should be able to obtain

carmine ore Dec 17, 2023, 3:55 PM

#

Btw RidgeCV performed horribly on my test data. I am currently using ElasticNet as it was the best one so far

#

ElasticNetCV performed the same as my regular ElasticNet. That’s because I already hand-picked the best hyperparameters.

past meteor Dec 17, 2023, 4:05 PM

#

Then residual analysis and feature engineering are what you should probably be doing. I'd stay with RidgeCV and ElasticNetCV, as adding features will change the hyperparameter values you need

grizzled locust Dec 17, 2023, 4:06 PM

#

hi guys, sorry for interrupting, but does anyone know where i went wrong?

versed pilot Dec 17, 2023, 4:08 PM

#

grizzled locust hi guys, sorry for interrupting, but does anyone know where i went wrong?

download those csv files and look at them in excel and it should be obvious, or use Google sheets. Is there a google Drive API you could use to fetch the files?

carmine ore Dec 17, 2023, 4:08 PM

#

past meteor Then residual analysis and feature engineering are what you should probably be d...

I will take a break and start with the feature engineering. Any tips?

past meteor Dec 17, 2023, 4:10 PM

#

carmine ore I will take a break and start with the feature engineering. Any tips?

Just the stuff I mentioned, it's very case specific. Do the residual analysis and set yourself a "target number to beat" by running gradient boosting.
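
One way to get such a "target number to beat" is sketched below on made-up data with an interaction term. `GradientBoostingRegressor` is used with its default settings and R^2 is an arbitrary choice of metric; swap in your own data and scoring.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(400, 3))
# Made-up target with an interaction the plain linear model does not see
y = X[:, 0] + X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=400)

# Same cross-validation setup for both models so the scores are comparable
linear_r2 = cross_val_score(ElasticNetCV(cv=5), X, y, cv=5, scoring="r2").mean()
gbm_r2 = cross_val_score(GradientBoostingRegressor(random_state=0), X, y,
                         cv=5, scoring="r2").mean()

print(f"linear R^2: {linear_r2:.3f}  |  GBM R^2 (number to beat): {gbm_r2:.3f}")
```

A large gap between the two scores suggests non-linearities or interactions that feature engineering could hand to the linear model.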

serene scaffold Dec 17, 2023, 4:22 PM

#

grizzled locust hi guys, sorry for interrupting, but does anyone know where i went wrong?

You're not interrupting. But what about it is wrong? We can't always tell how it's different from what you expect, unless you say what you expect.

You might need to change the axis for pd.concat
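
For illustration, here is what the `axis` argument changes, using two small made-up frames in place of the CSV files (the column names are hypothetical):

```python
import pandas as pd

# Two small made-up frames standing in for the CSV files
a = pd.DataFrame({"id": [1, 2], "score": [10, 20]})
b = pd.DataFrame({"id": [3, 4], "score": [30, 40]})

stacked = pd.concat([a, b], axis=0, ignore_index=True)   # rows appended underneath
side_by_side = pd.concat([a, b], axis=1)                 # columns appended to the right

print(stacked.shape, side_by_side.shape)
```

`axis=0` (the default) stacks the frames vertically and `axis=1` places them next to each other, so the right choice depends on which layout you expect.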

grizzled locust Dec 17, 2023, 4:23 PM

#

serene scaffold You're not interrupting. But what about it is wrong? We can't always tell how it...

i wanted it to look like this

#

but instead, it looks like this

serene scaffold Dec 17, 2023, 4:24 PM

#

Can you download those CSV files and open them locally?

grizzled locust Dec 17, 2023, 4:27 PM

#

serene scaffold Can you download those CSV files and open them locally?

i could open it with excel i guess

versed pilot Dec 17, 2023, 5:26 PM

#

grizzled locust but instead, it looks like this

you might have to use this https://developers.google.com/drive/api/reference/rest/v3

Google for Developers

Google Drive API | Google for Developers

#

Or an easier solution might be to run the code in Colab, it might have direct access to Google Drive that you don't otherwise get

onyx widget Dec 17, 2023, 6:23 PM

#

hey i'm interested in getting into ML, what is the best way of starting this?

final kiln Dec 17, 2023, 6:47 PM

#

desert oar that's a good point, especially because each individual vector is being normaliz...

I've asked gpt to make these visualizations, haven't checked the code but they seem to align well with the 2d case

#

My intuition here is that you start with a 512 dimensional space, and you only use a slice of a slice of it: a 510-dimensional surface (subtracting the mean drops one dimension, fixing the norm drops another). Still plenty of room to work with, but you excluded points with values that might cause numerical instability.

versed pilot Dec 17, 2023, 7:01 PM

#

onyx widget hey i'm interested in getting into ML, what is the best way of starting this?

I'm in a similar situation and moving from basic statistics to linear regression. I think it's the easiest to understand, and once you learn how to do linear regression with scikit learn, the approach is similar for other algorithms

desert oar Dec 17, 2023, 7:23 PM

#

final kiln I've asked gpt to make these visualizations, haven't checked the code but they s...

great idea, and this actually makes a lot of sense now that i think about it more

#

the whole point is that the original data could be anywhere in space

#

and you want to bring it all back into the middle

#

but can you share the code? i want to make sure it's actually doing the right thing

#

that is, each "instance" should be normalized within itself, rather than what we normally do, which is each dimension being normalized across all "instances"

final kiln Dec 17, 2023, 7:35 PM

#

desert oar but can you share the code? i want to make sure it's actually doing the right th...

Sure:

Here's the full Python code used for the visualization with shifted and normalized grid points:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Generate a uniform grid in 3D space
x = np.linspace(-4, 4, 10)
y = np.linspace(-4, 4, 10)
z = np.linspace(-4, 4, 10)
xx, yy, zz = np.meshgrid(x, y, z)
grid_points = np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T

# Subtract the mean of each point's coordinates from the point itself and then normalize
shifted_normalized_grid_points = np.array([(p - np.mean(p)) / np.linalg.norm(p - np.mean(p)) for p in grid_points])

# Plotting
fig = plt.figure(figsize=(12, 6))

# Original grid
ax1 = fig.add_subplot(121, projection='3d')
ax1.scatter(grid_points[:,0], grid_points[:,1], grid_points[:,2], color='b')
ax1.set_title("Original Grid")
ax1.set_xlabel('X axis')
ax1.set_ylabel('Y axis')
ax1.set_zlabel('Z axis')

# Shifted and normalized grid
ax2 = fig.add_subplot(122, projection='3d')
ax2.scatter(shifted_normalized_grid_points[:,0], shifted_normalized_grid_points[:,1], shifted_normalized_grid_points[:,2], color='r')
ax2.set_title("Shifted and Normalized Grid")
ax2.set_xlabel('X axis')
ax2.set_ylabel('Y axis')
ax2.set_zlabel('Z axis')

plt.show()

This code generates a 3D grid of points, processes each point by subtracting the mean of its coordinates and then normalizing it, and finally visualizes the original and processed grids.

final kiln Dec 17, 2023, 7:39 PM

#

final kiln I've asked gpt to make these visualizations, haven't checked the code but they s...

Generalizing this to 4d, I reckon that the subtraction by mean gets you to a 3d space and then norm gets you to the familiar spherical surface of radius 1.

#

In that case the dot product is just the cosine similarity. For higher dimensions, you'd still always be working on a hypersphere.
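
A quick numerical illustration with two arbitrary 4D vectors. The helper name `center_and_normalize` is made up for this sketch, and the cosine in question is the one between the centered vectors.

```python
import numpy as np

def center_and_normalize(v):
    """Subtract the mean of v's own coordinates, then scale to unit L2 norm."""
    c = v - v.mean()
    return c / np.linalg.norm(c)

a = np.array([1.0, 2.0, 4.0, 8.0])    # arbitrary 4D example vectors
b = np.array([0.5, 3.0, 3.5, 9.0])

ua, ub = center_and_normalize(a), center_and_normalize(b)

# Cosine similarity of the centered (but not yet normalized) vectors
ca, cb = a - a.mean(), b - b.mean()
cosine = ca @ cb / (np.linalg.norm(ca) * np.linalg.norm(cb))

print(np.isclose(ua @ ub, cosine))    # dot product of the processed vectors equals that cosine
print(np.linalg.norm(ua), ua.sum())   # unit length, coordinates summing to (about) zero
```

Because the vectors are centered first, this cosine is also the Pearson correlation between the coordinates of `a` and `b`.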

#

It's also neat that it always ends up being a manifold. And that the dot product can still preserve that similarity interpretation.

#

How do latent spaces end up working ? Do the networks always partition the vector space into chunks ? Or do they make use of dimensions to represent things ?
(Like in QM, where a dimension will correspond to a possible state )

#

And has anyone tried doing these things with complex numbers ? Or checked if the network ends up learning them somehow, and possibly even forming a Hilbert space instead of the usual thing ?

timid trench Dec 17, 2023, 8:36 PM

#

Guys, has anyone worked on chatgpt's api for python?

desert oar Dec 18, 2023, 12:07 AM

#

final kiln Sure: --- Here's the full Python code used for the visualization with shifted a...

pretty good job by the bot there. i'm consistently impressed with its understanding of tasks like this, even if it doesn't know how to write particularly efficient or idiomatic numpy code

desert oar Dec 18, 2023, 12:07 AM

#

final kiln How do latent spaces end up working ? Do the networks always partition the vect...

what do you mean by chunks?

desert oar Dec 18, 2023, 12:09 AM

#

final kiln And has anyone tried doing these things with complex numbers ? Or check if the n...

iirc there has been work on using fourier transforms inside neural networks, e.g. transforming the data to fourier domain and training the model on that. so that will pull in complex numbers. otherwise i don't know if there's an advantage to complex numbers vs. 2 dimensional real vectors

final kiln Dec 18, 2023, 12:09 AM

#

desert oar pretty good job by the bot there. i'm consistently impressed with its understand...

It actually took me 15min to get it to do it right. At some point I just wanted it to succeed so I didn't give up. I explained it from so many angles ahah

desert oar Dec 18, 2023, 12:09 AM

#

final kiln It actually took me 15min to get it to do it right. At some point I just wanted ...

lol fair enough

#

off the top of my head, you could rewrite shifted_normalized_grid_points this way:

shifted_normalized_grid_points = (p - np.mean(p, axis=1)) / np.linalg.norm(p - np.mean(p, axis=1), axis=1)

you might need to scatter in some [: , np.newaxis]
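
A runnable take on that one-liner, as a sketch: it uses `keepdims=True` in place of the `np.newaxis` indexing and operates on `grid_points` from the earlier script. Leaving degenerate points at zero is a choice made here, not something from the original code.

```python
import numpy as np

# Same grid as in the visualization script
x = np.linspace(-4, 4, 10)
xx, yy, zz = np.meshgrid(x, x, x)
grid_points = np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T   # shape (1000, 3)

# keepdims=True keeps a trailing axis of size 1 so broadcasting works per row
centered = grid_points - grid_points.mean(axis=1, keepdims=True)
norms = np.linalg.norm(centered, axis=1, keepdims=True)

# Points whose coordinates are all equal center to the zero vector; leave them
# at zero instead of dividing by 0
safe_norms = np.where(norms > 1e-12, norms, 1.0)
shifted_normalized_grid_points = centered / safe_norms

print(shifted_normalized_grid_points.shape)
```

The loop version divides by zero for the points on the x = y = z diagonal and yields NaN there, which matplotlib then silently skips.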

final kiln Dec 18, 2023, 12:10 AM

#

desert oar what do you mean by chunks?

In partitions, from what I saw when I trained a small GAN, it looked like there were regions dedicated to each face and then a continuous morphing from one region to the other.

desert oar Dec 18, 2023, 12:11 AM

#

final kiln In partitions, from what I saw when I trained a small GAN, it looked like there ...

ah, i don't know what GAN architectures are like

#

i'd say that in general no, NNs don't always partition the data like that. but maybe modern architectures for image GANs are designed to do that, or do it naturally

final kiln Dec 18, 2023, 12:14 AM

#

desert oar ah, i don't know what GAN architectures are like

I used a decoder made of convolution layers. So it took a "small" vector of 512 dimensions and expanded it into an image. I trained it on random 512 vectors so it turned the vector space R512 into a latent space. A path in that space would originate an image of a face morphing continuously. I have the gif somewhere

#

Aaah I lost the gif

glass fiber Dec 18, 2023, 1:27 AM

#

Can someone help me

serene scaffold Dec 18, 2023, 2:14 AM

#

glass fiber Can someone help me

you have to ask the actual question that you want help with.

signal holly Dec 18, 2023, 2:56 AM

#

Hi, how do I build a proper framework/schedule to learn coding?
I'm asking here because I'm trying to learn ml
and heard that you start with the python libraries, like pandas or mat
but doing tutorials and exercises is so boring to me
I usually have to slug through to make any progress
idk do I just push through

winter delta Dec 18, 2023, 2:58 AM

#

How can i change a list to an array guys

final kiln Dec 18, 2023, 3:04 AM

#

winter delta How can i change a list to an array guys

x = np.array(y)

Where y is a python list

final kiln Dec 18, 2023, 3:06 AM

#

signal holly Hi, how do I build a proper framework/schedule to learn coding? I'm asking here ...

I usually just come up with a cool project and kinda do it. The struggle generates learning and then at the end of it I have a project to list on my CV.

#

It's also a lot more bearable to work on something I love than to watch someone talk about random stuff I don't need yet.

#

And the nail in the coffin is that studying theory is 10% (or less) of the learning process. You can read an entire book and then not be able to apply any of it.

signal holly Dec 18, 2023, 3:08 AM

#

final kiln It's also a lot more bearable to work on something I love than to watch someone ...

Hence why many software engineers seem to regret majoring in cs 💀

#

No hate towards anyone who does

final kiln Dec 18, 2023, 3:09 AM

#

signal holly Hence why many software engineers seem to regret majoring in cs 💀

I can't comment on it because I haven't majored in it. But if you love the subject and you enjoyed studying it, it wasn't a waste of time.

signal holly Dec 18, 2023, 3:09 AM

#

Were you able to get a job out of it tho? (If you dont mind sharing)

final kiln Dec 18, 2023, 3:10 AM

#

signal holly Were you able to get a job out of it tho? (If you dont mind sharing)

I don't have a formal education in CS.

#

But I was able to get a job because I've done interesting stuff

signal holly Dec 18, 2023, 3:11 AM

#

Good to know cause I'm going to college next year and dont know if I want to transfer to cs

final kiln Dec 18, 2023, 3:11 AM

#

And I like to think that I keep doing interesting stuff ofc ahah

signal holly Dec 18, 2023, 3:11 AM

#

True lol

final kiln Dec 18, 2023, 3:11 AM

#

signal holly Good to know cause I'm going to college next year and dont know if I want to tra...

Yeah it's a very personal choice, I'd take into account both personal circumstance and passion.

#

Passion will get you through the worst of it, but you should be careful with the particulars of your life.

#

So like, some people are very passionate about arts, but on avg that won't get you positioned in the job market.

signal holly Dec 18, 2023, 3:14 AM

#

Yea I currently have compE because resources for learning electronics are usually pricier

#

Compared to cs

final kiln Dec 18, 2023, 3:15 AM

#

Uhm yeah that's true

#

ML is also very much like that tho, it's very expensive

signal holly Dec 18, 2023, 3:15 AM

#

Really? Ml is expensive?

#

In what way

final kiln Dec 18, 2023, 3:16 AM

#

From my experience, even smaller scale stuff will be expensive because you need GPU just to experiment with stuff

#

When you get into the good stuff

#

It's prohibitive

#

Like gpt4 for example, you need the backing of Microsoft

#

Anthropic is backed by Google

#

And Meta is literally one of the Maang/FAANG or wtv the current name is ahha

#

Ah and data, data is expensive. One of the lessons that really stuck with me was about the data quality

#

Your model is as good as the data you give it. And the best data you can get is expensive. You'll usually pay a bunch of people to do the manual work of annotating things.

#

Data quality really really does make a huge difference, it's actually insane.

desert oar Dec 18, 2023, 3:25 AM

#

signal holly Hi, how do I build a proper framework/schedule to learn coding? I'm asking here ...

ML is a field where programming is a means to a specific end. I suggest starting with just the basics, and then you can learn more about programming as you feel the need

#

https://python3.info/ just today i came across this book, it seems like a pretty good place to get started if you're interested in ML

winter delta Dec 18, 2023, 4:06 AM

#

Between 2 vectors, how can I find exactly which numbers appear in both, and which one shows up the most in these 2 vectors

#

Like looking the duplicate number in 2 vector

final kiln Dec 18, 2023, 4:14 AM

#

I'm confused. You can find out if there's a two by doing any(vector == 2)

#

vector == 2 will apply the == 2 operation to each coordinate

#

Which will result in a boolean vector

#

And any will return True if any of the values is True
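A minimal NumPy sketch of the check described above, assuming the vectors are NumPy arrays. The names `a` and `b` and their contents are made up for illustration. The last two steps are one possible reading of the original question (values shared by both vectors, and the most frequent value overall), not something stated in the chat.

```python
import numpy as np

# Hypothetical example vectors.
a = np.array([1, 2, 2, 3, 5, 2])
b = np.array([2, 3, 3, 7, 2])

# `a == 2` compares every element and gives a boolean vector.
mask = a == 2
has_two = mask.any()

# Values that occur in both vectors (sorted, without repeats).
common = np.intersect1d(a, b)

# Most frequent value across both vectors combined.
values, counts = np.unique(np.concatenate([a, b]), return_counts=True)
most_common = values[counts.argmax()]
```

`np.intersect1d` only reports which values are shared; `np.unique(..., return_counts=True)` is what gives the frequencies.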

grizzled locust Dec 18, 2023, 4:18 AM

#

anyone understand why it's like this,

#

and not like this?

desert oar Dec 18, 2023, 4:52 AM

#

grizzled locust anyone understand why it's like this,

i'm surprised that worked at all. google sheets URLs do not return CSV data.

#

you'll need to export/download each document as CSV first

#

(google might have some API to do this programmatically, but i'm not aware of it)

grizzled locust Dec 18, 2023, 4:53 AM

#

desert oar i'm surprised that worked at all. google sheets URLs do not return CSV data.

yeah i understand it now. you need to replace the URL with read csv format

#

i thought i could use an oversimplified version
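A small sketch of the URL rewrite mentioned above, assuming the commonly used `/export?format=csv` endpoint and a sheet that is readable by anyone with the link. The helper name `sheet_csv_url` and the document id are hypothetical.

```python
import re

def sheet_csv_url(share_url: str, gid: str = "0") -> str:
    """Turn a Google Sheets share/edit URL into a CSV export URL (assumed URL layout)."""
    match = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", share_url)
    if match is None:
        raise ValueError("not a Google Sheets URL")
    doc_id = match.group(1)
    return f"https://docs.google.com/spreadsheets/d/{doc_id}/export?format=csv&gid={gid}"

# Hypothetical document id, for illustration only.
url = sheet_csv_url("https://docs.google.com/spreadsheets/d/abc123_XYZ/edit#gid=0")
```

The resulting `url` could then be passed to `pandas.read_csv(url)`. The `gid` parameter selects the tab, and `0` is only the usual default for the first one.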

quaint loom Dec 18, 2023, 6:25 AM

#

Is there anyone who has faced the issue that "semTools" is not an exported object?

I have downloaded the semTools package, upgraded it and using library (semTools) in the beginning of my code.

Error: 'semTools' is not an exported object from 'namespace:semTools'
In addition: Warning messages:
1: In lav_data_full(data = data, group = group, cluster = cluster, :
lavaan WARNING: some observed variances are (at least) a factor 1000 times larger than others; use varTable(fit) to investigate
2: In lavaan::lavaan(model = model_description, data = data_processed, :
lavaan WARNING:
the optimizer warns that a solution has NOT been found!

trim saddle Dec 18, 2023, 7:01 AM

#

Thats not python related right? Its R?

trim saddle Dec 18, 2023, 7:05 AM

#

quaint loom Is there anyone who have face the issue that "semTools" is not an exported objec...

https://stackoverflow.com/questions/60506879/how-should-i-deal-with-somefunction-is-not-an-exported-object-from-namespace

Try the steps mentioned here

Stack Overflow

How should I deal with "'someFunction' is not an exported object fr...

I have this error:
'someFunction' is not an exported object from 'namespace:somePackage'
Does anyone know how to solve it?

quaint loom Dec 18, 2023, 7:24 AM

#

trim saddle Thats not python related right? Its R?

Is R, right

quaint loom Dec 18, 2023, 7:26 AM

#

trim saddle https://stackoverflow.com/questions/60506879/how-should-i-deal-with-somefunction...

Maybe is my knowledge gap but this doesn’t seem to be much related to the issues I am facing. If so, please elaborate.

versed pilot Dec 18, 2023, 8:05 AM

#

grizzled locust anyone understand why it's like this,

I'm having a dejavu, didn't we discuss this in #1035199133436354600 ? Or is there a glitch in the Matrix? 😉

whole zephyr Dec 18, 2023, 8:46 AM

#

hey, does anyone know how I can use seaborn or other viz tools to create a grid plot to display multiple dataframes in it?

I don't mean the "classic" pair plot that takes all the numeric columns in a single dataframe, but rather I want to display, in multiple subplots, the regplot of same 2 columns that are shared across multiple dataframes.

grizzled locust Dec 18, 2023, 8:52 AM

#

versed pilot I'm having a dejavu, didn't we discuss this in <#1035199133436354600> ? Or is th...

oh yeah, i mean. i just understand that recently

#

sorry for being slow i guess?

versed pilot Dec 18, 2023, 9:03 AM

#

sorry, not trying to tell you off, but if you link to the previous discussion then we can move on from there. Instead of starting from the same original question

trim saddle Dec 18, 2023, 9:51 AM

#

quaint loom Maybe is my knowledge gap but this doesn’t seem to be much related to the issues...

this is a python discord idk if its the right place for R stuff here, especially since its some R related package error and nothing datascience related.
It seems like the package is missing that function based on that error. In the post it mentions some options to check/mitigate that error.

desert oar Dec 18, 2023, 2:09 PM

#

quaint loom Is there anyone who have face the issue that "semTools" is not an exported objec...

as usual, show your code

desert oar Dec 18, 2023, 2:09 PM

#

whole zephyr hey, does anyone know how I can use seaborn or other viz tools to create a grid ...

you can do it with matplotlib, but you'll need to do the regression and plot its output yourself

#

!d matplotlib.pyplot.subplots

arctic wedgeBOT Dec 18, 2023, 2:10 PM

#

matplotlib.pyplot.subplots


matplotlib.pyplot.subplots(nrows=1, ncols=1, sharex=False, sharey=False, squeeze=True, subplot_kw=None, gridspec_kw=None, **fig_kw)
Create a figure and a set of subplots.

This utility wrapper makes it convenient to create common layouts of subplots, including the enclosing figure object, in a single call.

desert oar Dec 18, 2023, 2:10 PM

#

use that to set up a grid of axes, then you can plot whatever you want on each axes
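A rough sketch of that approach, assuming each dataframe shares two columns (called `x` and `y` here) and that a simple least-squares line from `np.polyfit` is an acceptable stand-in for the regression. The dataframes, their names, and the data are all made up.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical dataframes that share the columns "x" and "y".
frames = {}
for name, true_slope in [("df_a", 1.0), ("df_b", -0.5), ("df_c", 2.0)]:
    x = rng.uniform(0, 10, size=50)
    frames[name] = pd.DataFrame({"x": x, "y": true_slope * x + rng.normal(0, 1, size=50)})

# One axes per dataframe; squeeze=False keeps `axes` a 2D array in every case.
fig, axes = plt.subplots(nrows=1, ncols=len(frames), sharex=True, figsize=(12, 4), squeeze=False)

fits = {}
for ax, (name, df) in zip(axes.ravel(), frames.items()):
    # Least-squares line y = slope * x + intercept.
    slope, intercept = np.polyfit(df["x"], df["y"], deg=1)
    fits[name] = (slope, intercept)
    xs = np.linspace(df["x"].min(), df["x"].max(), 100)
    ax.scatter(df["x"], df["y"], s=10)
    ax.plot(xs, slope * xs + intercept, color="red")
    ax.set_title(f"{name}: slope={slope:.2f}")

fig.tight_layout()
```

seaborn's `regplot` also takes an `ax` argument, so `sns.regplot(data=df, x="x", y="y", ax=ax)` inside the same loop may be a shorter alternative to the manual fit.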

forest bolt Dec 18, 2023, 3:21 PM

#

Hello guys I'm working on a SOMA implementation for MEALPY and also enhancing algorithms about mirror boundaries. Currently I'm struggling with convergence problems, is there somebody who could help me? I'll share everything and my last step is to commit these improvements to the public package, but unfortunately I'm stuck.

grizzled locust Dec 18, 2023, 4:42 PM

#

guys, anyone understand where i did wrong?

past meteor Dec 18, 2023, 4:44 PM

#

grizzled locust guys, anyone understand where i did wrong?

Can you show where you set that variable

#

Right now that variable isn't a data frame, it's a function you haven't executed yet

left tartan Dec 18, 2023, 4:45 PM

#

grizzled locust guys, anyone understand where i did wrong?

What zestar said. You probably forgot to call a function in an earlier line, by adding parens (), or something similar

grizzled locust Dec 18, 2023, 4:55 PM

#

past meteor Can you show where you set that variable

is it because of this?

past meteor Dec 18, 2023, 4:55 PM

#

grizzled locust is it because of this?

Yes, like BillyBobby says, you're missing parentheses () behind copy
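A tiny illustration of the difference, using a made-up dataframe (the original code was only shared as a screenshot, so the variable names here are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

not_called = df.copy   # a bound method object, nothing was copied
called = df.copy()     # an actual DataFrame

print(type(not_called).__name__)  # prints "method"
print(type(called).__name__)      # prints "DataFrame"
```

Anything that later treats `not_called` as a dataframe (indexing, `.head()`, and so on) fails, because it is still just the uncalled method.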

grizzled locust Dec 18, 2023, 4:56 PM

#

aight, i understand. thanks.

short heart Dec 18, 2023, 5:08 PM

#

I'm trying to export my custom bert model to ONNX, but for some reason after loading the exported model it has empty input array, what could be the reason?

late wraith Dec 18, 2023, 7:50 PM

#

hi

final kiln Dec 18, 2023, 8:11 PM

#

Maybe my math is off, but I think self attention can be simplified to:

softmax(xMx^T / sqrt(d_k))Vx^T

Where M and V are the learnable parameters.

Went over it a couple times now.

#

This would kinda simplify the interpretation too, since M is kind of acting like a metric tensor

desert oar Dec 18, 2023, 10:05 PM

#

final kiln Maybe my math is off, but I think self attention can be simplified to: ``` soft...

are you talking about "condensing" Q and K to a single matrix?

jagged nebula Dec 18, 2023, 10:06 PM

#

hello could anybody help me with uml class diagrams?

desert oar Dec 18, 2023, 10:07 PM

#

jagged nebula hello could anybody help me with uml class diagrams?

"don't ask to ask" - you need to describe your question enough so that someone can actually help you
you might want #software-architecture instead of this channel

final kiln Dec 18, 2023, 10:08 PM

#

desert oar are you talking about "condensing" Q and K to a single matrix?

Yes

desert oar Dec 18, 2023, 10:08 PM

#

final kiln Yes

i assume you'd want it to be the same shape as Q K'? how would you construct it?

final kiln Dec 18, 2023, 10:10 PM

#

desert oar i assume you'd want it to be the same shape as `Q K'`? how would you construct i...

It follows directly from the definition.

You just substitute :

K = Wk x
Q = Wq x

~~I'm using ' to distinguish between the K, Q in the paper and the transformations that produces them~~

desert oar Dec 18, 2023, 10:11 PM

#

final kiln It follows directly from the definition. You just substitute : K = Wk x Q = Wq...

i'm using ' for transpose.. can we call them Wk and Wq like in the attention-is-all-you-need paper?

final kiln Dec 18, 2023, 10:11 PM

#

desert oar i'm using `'` for transpose.. can we call them Wk and Wq like in the attention-i...

Oh, sure

desert oar Dec 18, 2023, 10:12 PM

#

so you have the decoder-side tokens Y, and the encoder-side tokens X. how do you construct this M matrix differently from (Wq Y) (Wk X)'?

final kiln Dec 18, 2023, 10:12 PM

#

You distribute the transpose

#

Wait

desert oar Dec 18, 2023, 10:12 PM

#

err, i think i swapped q and k. same idea though

final kiln Dec 18, 2023, 10:13 PM

#

No, it's X on both sides isn't it ?

#

Doesn't matter

final kiln Dec 18, 2023, 10:14 PM

#

desert oar so you have the decoder-side tokens `Y`, and the encoder-side tokens `X`. how do...

Let me check this

desert oar Dec 18, 2023, 10:14 PM

#

yeah, GPT is decoder-only and BERT is encoder-only, but this is the most general case

#

in the case of the nanogpt model you were working through in https://bbycroft.net/llm, they already simplified this operation somewhat

#

in general, you project queries, keys, and values into 3 separate spaces. even if they come from the same input sequence

final kiln Dec 18, 2023, 10:20 PM

#

No I found an error in my calculation

desert oar Dec 18, 2023, 10:20 PM

#

and even if you enforce that those 3 spaces are the same, you still have this "cartesian product" operation, multiplying all pairs of tokens together (at least looking backwards in the sequence, if you're in a decoder unit)

#

ah, okay then

final kiln Dec 18, 2023, 10:22 PM

#

No this is too suss wait

#

Qx (Kx)' = Q x x' K'

#

= (x x') Q K '

#

It's probly gonna be the other way around

#

x Q ( x K ) ' = x Q K ' x' = x M x '

#

The second way makes more sense

#

And is how I produce the matrix

#

I mean both ways produce a single matrix. But the second way makes it so that it's not a scalar mul

#

Looking at the paper, Wq and Wk (which I'm calling Q and K above), have dimension d_model x d_k, since d_model is the size of the embedding vector, it must come from the left as a 1xd_model

cold dawn Dec 18, 2023, 11:02 PM

#

https://github.com/mistralai/mistral-src/tree/main/mistral
you guys most definitely know about mistral right

GitHub

mistral-src/mistral at main · mistralai/mistral-src

Reference implementation of Mistral AI 7B v0.1 model. - mistralai/mistral-src

cold dawn Dec 18, 2023, 11:19 PM

#

i have no background into machine learning, im a self taught python 'developer' (im not professional, though i am proficient)

#

How hard is the road of learning to work with ML in python

#

from 0 to being able to expand on open source frameworks

#

if you were to give me advice, please dont focus on required python skills (mentioning some important frameworks to learn is nice though) but instead maybe give some sort of guidance on what subjects to tackle first

#

thx 🙏

left tartan Dec 18, 2023, 11:24 PM

#

cold dawn if you were to give me advice, please dont focus on required python skills (ment...

Perhaps start with https://cs50.harvard.edu/ai/2023/. That'll give you a taste and some ideas of what you might want to learn more of.

iron basalt Dec 18, 2023, 11:24 PM

#

cold dawn from 0 to being able to expand on open source frameworks

Expand in what way? Are you trying to make use of ML to solve some problem as a framework, in which you don't really touch the ML part directly, but build around it (like making use of a physics engine in a video game, but not touching the physics engine internals)? Or do you want to make new kinds of ML models (research)? Or the functions required to make those models, etc (e.g. GPU kernels)?

desert oar Dec 19, 2023, 1:32 AM

#

final kiln Looking at the paper, Wq and Wk (which I'm calling Q and K above), have dimensio...

i think you're conflating the purpose of Q and K with the purpose of what i was calling Wq and Wk

#

ah, i see what you're doing here

#

Q = X @ Wq
K = X @ Wk
V = X @ Wv

Q @ K.T = (X @ Wq) @ (X @ Wk).T
        = (X @ Wq) @ (Wk.T @ X.T)
        = X @ (Wq @ Wk.T) @ X.T

~~i think you had it right the first time, but matrix multiplication doesn't commute so you can't pull out the (X @ X.T) like you did~~

#

that is a pretty interesting interpretation of what's going on though
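A quick numerical check of the identity above, using random matrices with made-up sizes and the row-per-token convention (`X` has shape `n_tokens x d_model`). It only verifies the algebra. It says nothing about whether training a merged matrix works as well in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 5, 8, 3   # hypothetical sizes

X = rng.normal(size=(n_tokens, d_model))   # one row per token
Wq = rng.normal(size=(d_model, d_k))
Wk = rng.normal(size=(d_model, d_k))

# Usual form: project, then take all pairwise dot products.
scores_two_step = (X @ Wq) @ (X @ Wk).T

# Merged form: a single d_model x d_model matrix between X and X.T.
M = Wq @ Wk.T
scores_merged = X @ M @ X.T

same = np.allclose(scores_two_step, scores_merged)

# M is a product through a d_k-wide factor, so its rank is at most d_k.
rank_M = np.linalg.matrix_rank(M)
```

Two caveats on the "metric" reading: `M` is generally not symmetric, and because its rank is capped at `d_k`, a freely learned full `d_model x d_model` matrix would be a strictly larger family than the factored form.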

final kiln Dec 19, 2023, 1:45 AM

#

desert oar ``` Q = X @ Wq K = X @ Wk V = X @ Wv Q @ K.T = (X @ Wq) @ (X @ Wk).T = ...

The first three equations don't make sense dimensionally if you check. And the equation you're solving results in scalar multiplication of a single matrix, which doesn't really do anything , gotta be the other way I think.

desert oar Dec 19, 2023, 1:45 AM

#

final kiln The first three equations don't make sense dimensionally if you check. And the e...

you're right, the first 3 lines are swapped

final kiln Dec 19, 2023, 1:46 AM

#

desert oar you're right, the first 3 lines are swapped

Which means that the true interpretation of self attention is that the network learns custom dot product metrics, which is super elegant.

desert oar Dec 19, 2023, 1:46 AM

#

precisely

#

that's actually the whole point!

#

it's basically a "soft lookup" , hence the names "query", "key", and "value"

final kiln Dec 19, 2023, 1:48 AM

#

But did they try to do two matrices instead of three and didn't work out as well ? Even tho they're equivalent descriptions ? Or did they not realize what they were doing ? A single learnable metric tensor oughta be more efficient

desert oar Dec 19, 2023, 1:48 AM

#

is there a way to reduce this to a single linear transformation of (X @ X.T) or X? Wq @ (X @ X.T) @ Wk.T

#

oh, i forgot to swap the other lines

#

lol, hang on

final kiln Dec 19, 2023, 1:49 AM

#

desert oar is there a way to reduce this to a single linear transformation of `(X @ X.T)` o...

X @ X.T ends up being dot product

#

Assuming X is only one vector, that's a single number

desert oar Dec 19, 2023, 1:50 AM

#

ah yeah

final kiln Dec 19, 2023, 1:50 AM

#

(which you can assume without loss of generality)

desert oar Dec 19, 2023, 1:50 AM

#

final kiln Assuming X is only one vector, that's a single number

well yeah, that's the point

#

if X is one vector, that means the input sequence had one token

final kiln Dec 19, 2023, 1:50 AM

#

Yes

desert oar Dec 19, 2023, 1:50 AM

#

but anyway you were right after all, you get X @ <something> @ X.T

#

so let me think that through, why you wouldn't want to just have "something" there

final kiln Dec 19, 2023, 1:51 AM

#

It can be further expanded

#

x @ C @ M @ C.T x.T

desert oar Dec 19, 2023, 1:51 AM

#

right, that's what i got to above

#

ah, ok

final kiln Dec 19, 2023, 1:51 AM

#

Something like that, where C is a compression transformation and M is a metric tensor

desert oar Dec 19, 2023, 1:52 AM

#

it's entirely possible that models which work on a single sequence (not on a pair of encoded and decoded sequences) already do this as an optimization

#

that or it actually doesn't work as well, that i would not know

#

i'm also not sure it allows for masking

final kiln Dec 19, 2023, 1:54 AM

#

You can include it outside of all of it, when you get a square matrix

#

Like uhm

#

(mask) @ (custom dot product thing)

And then apply softmax, etc

desert oar Dec 19, 2023, 1:54 AM

#

ah nvm, im looking at the attention is all you need paper now to confirm, and they do the masking after QK anyway
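A minimal sketch of that masking step, assuming a causal (look-backwards-only) mask and a random stand-in for the scaled score matrix. In the paper's formulation the mask is applied elementwise to the scores, with blocked entries set to negative infinity before the softmax, rather than as a matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_k = 4, 3   # hypothetical sizes

# Stand-in for Q @ K.T / sqrt(d_k); any square score matrix works here.
scores = rng.normal(size=(n_tokens, n_tokens)) / np.sqrt(d_k)

# Causal mask: position i may only attend to positions j <= i.
allowed = np.tril(np.ones((n_tokens, n_tokens), dtype=bool))
masked = np.where(allowed, scores, -np.inf)

# Row-wise softmax; exp(-inf) is 0, so blocked positions get zero weight.
shifted = masked - masked.max(axis=1, keepdims=True)
weights = np.exp(shifted)
weights /= weights.sum(axis=1, keepdims=True)
```

Since the mask only touches the square score matrix, it applies the same way whether the scores came from separate `Wq`/`Wk` projections or from a single merged matrix.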

final kiln Dec 19, 2023, 1:55 AM

#

It's a pretty cool idea this whole thing, but am super curious if the Wq and Wk are really needed and why, and if not why didn't they know it

desert oar Dec 19, 2023, 1:56 AM

#

again i think in the most general case it allows for two different sequences, encoder and decoder

#

it's probably how they arrived at the concept

#

why they didn't simplify after, i'm not sure

final kiln Dec 19, 2023, 1:56 AM

#

What do you mean by two different sequences ?

desert oar Dec 19, 2023, 1:57 AM

#

like in a machine translation scenario, you train it on pairs of e.g. english and spanish sentences

final kiln Dec 19, 2023, 1:58 AM

#

I haven't gotten to that part yet.

desert oar Dec 19, 2023, 1:58 AM

#

but GPT doesn't do that

#

as far as i understand, that was one of the earlier use cases of transformers, although one of our local NLP experts would know better than i would

#

GPT and BERT came out later than Attn Is All You Need

final kiln Dec 19, 2023, 2:00 AM

#

They use a single branch isn't it. Instead of encoder decoder thing

desert oar Dec 19, 2023, 2:01 AM

#

right

#

interestingly nanogpt (the one you were looking at in the visualization tool) also doesn't do this

#

https://github.com/karpathy/nanoGPT/blob/eba36e84649f3c6d840a93092cb779a260544d08/model.py#L29-L76 should be relatively easy to modify this code to use your idea

GitHub

nanoGPT/model.py at eba36e84649f3c6d840a93092cb779a260544d08 · karp...

The simplest, fastest repository for training/finetuning medium-sized GPTs. - karpathy/nanoGPT

#

there's quite a lot of research now on making self-attention fast, eg. https://arxiv.org/pdf/2205.14135.pdf but no mention of this particular transformation

final kiln Dec 19, 2023, 2:05 AM

#

Ah I can't look at it, I'm implementing from scratch

desert oar Dec 19, 2023, 2:05 AM

#

well if you're implementing your own, you should be able to get the same or similar results doing it your way vs. the usual way

#

that'd be an interesting experiment, to compare training times and results

final kiln Dec 19, 2023, 2:05 AM

#

Yeah if no one's doing it, kinda sounds like a paper cuz it's one less operation per head right

desert oar Dec 19, 2023, 2:06 AM

#

the fact that it's not being done even in extremely optimized implementations makes me think we're just missing something

final kiln Dec 19, 2023, 2:06 AM

#

I mean is a super simple mod, so I doubt anyone hasn't tried it yet

final kiln Dec 19, 2023, 2:06 AM

#

desert oar the fact that it's not being done even in extremely optimized implementations ma...

Exactly

desert oar Dec 19, 2023, 2:10 AM

#

hm.. is it actually one less operation?

#

i know it's one less matrix multiply, but the dimensions involved are bigger

#

originally you have (d_batch,d_model x d_model,d_key) x (d_batch,d_model x d_model,d_key).T so the inner multiplication is between matrices of relatively small dimension d_key

#

it's the same number of dot products, but the dot products are between smaller vectors

#

hm... no, that doesn't matter. because you're kind of proposing that the dot products themselves are essentially pre-computable

#

keras uses some kind of einsum magic, not willing to muddle through it right now 😆 https://github.com/keras-team/keras/blob/v3.0.1/keras/layers/attention/multi_head_attention.py#L626

arctic wedgeBOT Dec 19, 2023, 2:18 AM

#

keras/layers/attention/multi_head_attention.py line 626

def _build_proj_equation(free_dims, bound_dims, output_dims):

quaint loom Dec 19, 2023, 8:27 AM

#

Is there anyone who know why one would use Bootstraps in a structural equation model?

past meteor Dec 19, 2023, 8:33 AM

#

quaint loom Is there anyone who know why one would use Bootstraps in a structural equation m...

Disclaimer: I don't know SEM at all but you'd bootstrap to get a confidence interval on methods that don't give it to you "out of the box"
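A generic percentile-bootstrap sketch of that idea, using the median of made-up data as the statistic. Nothing here is specific to SEM. For a structural equation model the statistic would presumably be a parameter estimate obtained by refitting the model on each resample, which is an assumption and not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=200)   # made-up data

def bootstrap_ci(data, statistic, n_resamples=2000, alpha=0.05, rng=rng):
    """Percentile bootstrap confidence interval for `statistic`."""
    n = len(data)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        # Resample the observations with replacement and recompute the statistic.
        resample = rng.choice(data, size=n, replace=True)
        estimates[i] = statistic(resample)
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return lower, upper

low, high = bootstrap_ci(sample, np.median)
```

The interval comes purely from the spread of the statistic across resamples, which is why it is usable when no closed-form standard error is available.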

quaint loom Dec 19, 2023, 9:04 AM

#

past meteor Disclaimer: I don't know SEM at all but you'd bootstrap to get a confidence inte...

Alright! Thank you. : )

#

As you're not familiar with SEM, you may not know why semopy is not able to calculate the r-square.

torpid violet Dec 19, 2023, 12:08 PM

#

Hi

#

Is there any one having good knowledge of opencv and ml
I want to build a project for that I need some navigation I can make possible that If any one is here who will help me then please reach me

I have very good project and we can build it together

#

Then DM me

velvet thorn Dec 19, 2023, 12:19 PM

#

Hi all, any resource for learning generative ai using python?

last ivy Dec 19, 2023, 1:06 PM

#

Guys imma planning to develop a ml model so can u guys suggest some fresh and new ideas with some complexity involved ?

supple osprey Dec 19, 2023, 2:28 PM

#

@last ivy yes I have idea

final kiln Dec 19, 2023, 4:03 PM

#

desert oar originally you have `(d_batch,d_model x d_model,d_key) x (d_batch,d_model x d_mo...

During training you're going from 2*d*d_k parameters to d*d. So at least during training the condition for the first being more efficient is that 2*d_k < d.

#

in the case of nano gpt, 2*1/3*d < d -> 2/3 < 1

#

so the way it's done makes training more efficient

#

and if the other way around is more efficient for inference, it should be possible to reduce one form to the other
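The parameter-count comparison above can be written out as a small sketch. The sizes are hypothetical (an embedding width of 768 split over 12 heads), and biases are ignored.

```python
def qk_params(d_model: int, d_k: int) -> int:
    """Parameters in separate Wq and Wk, each d_model x d_k (biases ignored)."""
    return 2 * d_model * d_k

def merged_params(d_model: int) -> int:
    """Parameters in one unconstrained d_model x d_model matrix M."""
    return d_model * d_model

# Hypothetical sizes: d_model = 768 split over 12 heads gives d_k = 64 per head.
d_model, n_heads = 768, 12
d_k = d_model // n_heads

per_head_separate = qk_params(d_model, d_k)   # 2 * 768 * 64
per_head_merged = merged_params(d_model)      # 768 * 768

# The factored form is smaller exactly when 2 * d_k < d_model.
factored_is_smaller = 2 * d_k < d_model
```

With these numbers the factored form needs 98,304 parameters per head against 589,824 for a full merged matrix, which matches the `2*d_k < d` condition.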

oblique quarry Dec 19, 2023, 4:54 PM

#

Could someone smarter than me tell me why the resulting matrix doesnt have ones along its diagonal? Even though the paper explicitly states that the sqrt of a matrix has to be the original matrix when taking the dot product with itself

```
m.pearsonsCoefficient(covMatrix)
array([[ 0.60948941, -0.06662308, -0.59805044],
       [-0.06662308,  0.00828873,  0.03770686],
       [-0.59805044,  0.03770686,  1.34752355]])
```

https://paste.pythondiscord.com/OV5A

#

Yes I tested this method and the sqrt method works just fine as youd expect. Sigma is in this case the covariance matrix

desert oar Dec 19, 2023, 5:57 PM

#

final kiln During training you're going from `2*d*d_k` parameters to `d*d`. So at least dur...

right. the idea of using one for training and the other for inference is interesting. might be worth experimenting with

lapis sequoia Dec 19, 2023, 5:59 PM

#

I am a newbie. can anybody give a road map for AI.

desert oar Dec 19, 2023, 6:03 PM

#

oblique quarry Could someone smarter than me tell me why the resulting matrix doesnt have ones ...

Sigma here is just the covariance matrix, it's not the eigendecomposition

#

that expression is just dividing each element by the square root of the product of its corresponding variances

oblique quarry Dec 19, 2023, 6:08 PM

#

Im not a native speaker, so in simple english; are you just supposed to divide the cov Matrix element wise?

desert oar Dec 19, 2023, 6:08 PM

#

i'm not sure what eigenVectors**0.5 * np.linalg.inv(eigenVectors) @ eigenVectors is supposed to do, but maybe i'm just too rusty with the math here

oblique quarry Dec 19, 2023, 6:11 PM

#

desert oar i'm not sure what `eigenVectors**0.5 * np.linalg.inv(eigenVectors) @ eigenVector...

sorry it's supposed to be eigenValues**0.5 * np.linalg.inv(eigenVectors) @ eigenVectors but i still dont get the 1 along the main diagonal

desert oar Dec 19, 2023, 6:14 PM

#

oblique quarry Im not a native speaker, so in simple english; are you just supposed to divide t...

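import numpy as np

def cov(x):
    # x has one row per observation and one column per variable
    m = x.mean(axis=0)
    c = (x - m)
    # c.T @ c is (variables x variables); c @ c.T would be (observations x observations)
    return (c.T @ c) / (x.shape[0] - 1)

def corr(x_cov):
    # diagonal matrix holding 1 / sqrt(variance) for each variable
    x_vars_sqrt = np.diag(np.diag(x_cov) ** -0.5)
    return x_vars_sqrt @ x_cov @ x_vars_sqrt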

it should just be this

#

there's probably a way to rewrite x_vars_sqrt @ x_cov @ x_vars_sqrt using numpy broadcasting instead of constructing x_vars_sqrt explicitly. but the code above is the typical formula. it's also what's shown in your screenshot
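One way the broadcasting version might look, as a sketch: divide each covariance entry by the product of the two corresponding standard deviations. The function name `corr_from_cov` and the random test data are made up, and `np.cov` and `np.corrcoef` are used only as a cross-check.

```python
import numpy as np

def corr_from_cov(x_cov: np.ndarray) -> np.ndarray:
    """Correlation matrix from a covariance matrix, using broadcasting."""
    std = np.sqrt(np.diag(x_cov))
    # std[:, None] * std[None, :] is the matrix of products std_i * std_j.
    return x_cov / (std[:, None] * std[None, :])

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))        # made-up data: 100 observations, 3 variables
x_cov = np.cov(x, rowvar=False)      # rows are observations, columns are variables
r = corr_from_cov(x_cov)
```

Each diagonal entry becomes `var_i / (std_i * std_i)`, which is why the diagonal is all ones.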

oblique quarry Dec 19, 2023, 6:17 PM

#

But wouldnt this contradict the assumption A^0.5A^0.5 is A?

desert oar Dec 19, 2023, 6:18 PM

#

ah, you're trying to solve for the square root of the diag matrix that way

oblique quarry Dec 19, 2023, 6:19 PM

#

Yeah im honestly kinda confused as well but I just went along as the author said and here I am 😉

desert oar Dec 19, 2023, 6:21 PM

#

https://en.wikipedia.org/wiki/Square_root_of_a_matrix#Diagonal_and_triangular_matrices

Square root of a matrix

In mathematics, the square root of a matrix extends the notion of square root from numbers to matrices. A matrix B is said to be a square root of A if the matrix product BB is equal to A. Some authors use the name square root or the notation A^(1/2) only for the specific case when A is positive semidefinite, to denote the unique matrix B that is po...

#

in this case a = np.diag(np.diag(x_cov)). the formula says that you want the square root of that thing. but we know by construction that a is diagonal, so we can use the special case formula where we just take the square roots of the elements

#

it should make sense intuitively... what is the result, in general, when you multiply two diagonal matrices?

#

i believe all the more-general matrix square root techniques depend on that result for diagonal matrices

#

in any case, you shouldn't need to compute the eigendecomposition here
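For reference, a sketch of both cases discussed above: the diagonal shortcut (elementwise square roots) and the general symmetric positive semidefinite case via an eigendecomposition. The function name `sqrtm_psd` and the example matrices are made up. The general form rebuilds the matrix as `V @ diag(sqrt(w)) @ V.T`.

```python
import numpy as np

def sqrtm_psd(a: np.ndarray) -> np.ndarray:
    """Square root of a symmetric positive semidefinite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    eigenvalues = np.clip(eigenvalues, 0.0, None)   # guard against tiny negative round-off
    # a = V diag(w) V.T, so sqrt(a) = V diag(sqrt(w)) V.T
    return (eigenvectors * np.sqrt(eigenvalues)) @ eigenvectors.T

# General symmetric case.
a = np.array([[4.0, 1.0], [1.0, 3.0]])
b = sqrtm_psd(a)

# Diagonal case: just take the square roots of the diagonal entries.
d = np.diag([4.0, 9.0])
d_root = np.diag(np.sqrt(np.diag(d)))
```

In both cases multiplying the result by itself gives back the original matrix. For the correlation formula only the diagonal case is needed.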

oblique quarry Dec 19, 2023, 6:25 PM

#

Yeah now that you mention it. It does make sense

#

Im just not good at math lmao

desert oar Dec 19, 2023, 6:26 PM

#

again, Σ in your text is just the covariance matrix, it's not related to eigenvalues

oblique quarry Dec 19, 2023, 6:28 PM

#

Yep, thank you i now get along the diagonal only ones, which makes sense since they correlate to each other in a 1 to 1 ratio

oblique quarry Dec 19, 2023, 6:29 PM

#

lapis sequoia I am a newbie. can anybody give a road map for AI.

Honestly I started with implementing a MLP, it's really up to you and how you feel. But I think it is always a good idea to begin somewhere simple so that you get a hands down experience with the fundamentals, again its just an opinion

whole zephyr Dec 19, 2023, 6:59 PM

#

hello, is there anyone who's more familiar with time series? more specifically price data

serene scaffold Dec 19, 2023, 7:04 PM

#

whole zephyr hello, is there anyone who's more familiar with time series? more specifically p...

Always ask your actual question. Don't ask to ask. Even if someone does know about time series, they need to know the actual question to start helping.

whole zephyr Dec 19, 2023, 7:15 PM

#

serene scaffold Always ask your actual question. Don't ask to ask. Even if someone does know abo...

not really sure how to word it, but I want to represent multiple chart patterns as features and I don't really know where to start

for example, I want to represent a wedge pattern as something I could feed into a model

winged sigil Dec 19, 2023, 10:47 PM

#

By the help of an AI model i want to assess mental health of a kid using a survey/questionnare. But the problem is i dont have appropriate data set to train my model for this. What should I do in this case. can the concept of coldstart help in this case. If yes then how ? Also if i use 10-20 questions then is there a way to make the model learn from itself, like can i apply reinforced learning in this. If yes then how ?

serene scaffold Dec 19, 2023, 11:49 PM

#

winged sigil By the help of an AI model i want to assess mental health of a kid using a surve...

you wouldn't use reinforcement learning for this. (it's reinforcement learning, not reinforced learning.)

what is the format of the questionnaire? Are they open-answer, or is it things like "I often feel like something bad is going to happen. Strongly agree, agree, neutral, disagree, strongly disagree."

winged sigil Dec 20, 2023, 12:42 AM

#

yes something like this only, no open-answers. But we've to remember that questions will be specifically for children. So whatever you think might be used, please elaborate on it. I am in middle of a competition and I need some clarity on it. @serene scaffold

#

requesting anyone to please help. I need some information immediately. Kindly understand

serene scaffold Dec 20, 2023, 2:13 AM

#

winged sigil yes something like this only, no open-answers. But we've to remember that questi...

what is the competition asking you to do?

winged sigil Dec 20, 2023, 2:48 AM

#

we just have to assess the responses of the questions and based on that we have to show a rating. Thats it.

serene scaffold Dec 20, 2023, 2:48 AM

#

winged sigil we just have to assess the responses of the questions and based on that we have ...

are you required to use ML?

winged sigil Dec 20, 2023, 2:50 AM

#

yes

#

because we want to train our model to our specific needs

supple osprey Dec 20, 2023, 4:52 AM

#

@serene scaffold
Is there any one having good knowledge of opencv and ml
I want to build a project for that I need some navigation I can make possible that If any one is here who will help me then please reach me

I have a very good idea and we can build it together, and if not can you guide me through it

small wedge Dec 20, 2023, 4:53 AM

#

supple osprey @serene scaffold Is there any one having good knowledge of opencv and ml ...

What are you looking to build?

supple osprey Dec 20, 2023, 4:55 AM

#

I want to build a logging tracking system with opencv with addition of ml algo

#

I just need of some guidance if someone having clear vision of ml opencv

#

We can make a little conversion

#

Can I DM you

obtuse bane Dec 20, 2023, 6:05 AM

#

Does anyone here have experience with the Bureau of Labor Statistics series IDs? I'm working toward collecting more targeted data for analysis but the process of ascertaining data from their IDs is cumbersome.

past meteor Dec 20, 2023, 8:50 AM

#

winged sigil we just have to assess the responses of the questions and based on that we have ...

Doesn't seem like an AI use case at all. There are standardized questionnaires and ways to score them in psychology

past meteor Dec 20, 2023, 8:52 AM

#

supple osprey We can make a little conversion

People don't typically engage with such requests here, it's best that you ask specific questions e.g., "how do I do this in opencv" or "this is how I'm approaching it, is there anything wrong with it" compared to "I need guidance on task X, can anyone help?"

knotty flume Dec 20, 2023, 2:43 PM

#

anyone here interested in joining a hackathon, need people with decent data science and ai skills. Drop a dm, making a team

|| The previous team I joined all quit so I'm making my own team ||

whole zephyr Dec 20, 2023, 4:36 PM

#

any ideas on how I could represent chart patterns on time series as features?

for example, I want to represent wedges or triangles on portions of the graph as some parameters that define them (as I would represent a trendline for the last i-X days with a slope at point i)

blazing vale Dec 20, 2023, 5:20 PM

#

if i1==3:
    a1=df['Year'].value_counts()
    print(pd.DataFrame(a1,index=['A','B','C','D','E','F','G']))

#

getting all indices as nan

#

any clue why?

serene scaffold Dec 20, 2023, 5:54 PM

#

@blazing vale df['Year'].value_counts() returns a Series where the indices are values from df['Year'] and the actual values are integers for how many times that index appeared in df['Year']
so, why are you trying to change the index to letters? what would that even mean?
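A small reproduction of what is happening, with a made-up `Year` column. Passing `index=` to the `DataFrame` constructor aligns on the existing labels (the years) instead of renaming them, so labels that are not present come back as NaN. The positional relabelling at the end is just one possible fix, assuming letters really are wanted.

```python
import pandas as pd

df = pd.DataFrame({"Year": [2020, 2021, 2020, 2022, 2020, 2021]})  # made-up data

counts = df["Year"].value_counts()
# Index: the distinct years; values: how often each year appears.

# Asking for labels that do not exist in the index yields NaN in every row.
relabelled = pd.DataFrame(counts, index=["A", "B", "C"])

# To show letters instead, assign the new labels positionally.
renamed = counts.copy()
renamed.index = ["A", "B", "C"]
```

`renamed` keeps the counts in their original order (most frequent first), just with different labels.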

woven sluice Dec 20, 2023, 7:31 PM

#

What is the idea behind this transformation?

# Here we map each temporal variable onto a circle such that the lowest value for that variable appears right next to the largest value. We compute the x- and y- component of that point using the sin and cos trigonometric functions.
df['day_sin'] = np.sin(df.day*(2.*np.pi/31))
df['day_cos'] = np.cos(df.day*(2.*np.pi/31))
df['month_sin'] = np.sin((df.month-1)*(2.*np.pi/12))
df['month_cos'] = np.cos((df.month-1)*(2.*np.pi/12))

desert oar Dec 20, 2023, 7:48 PM

#

woven sluice What is the idea behind this transformation? \# Here we map each temporal varia...

it's so that "31" ends up spatially close to "1"

#

mathematically it's like turning the number line into a circle

#

i suggest actually drawing it out on a unit circle
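A short numerical version of the unit-circle picture, as a sketch: the helper `encode_cyclic` is a made-up name, and the three days are chosen only to show that day 31 lands next to day 1 while day 16 lands on roughly the opposite side.

```python
import numpy as np

def encode_cyclic(values, period):
    """Map values with the given period onto the unit circle as (sin, cos) pairs."""
    angle = 2.0 * np.pi * np.asarray(values, dtype=float) / period
    return np.column_stack([np.sin(angle), np.cos(angle)])

days = np.array([1, 16, 31])
points = encode_cyclic(days, period=31)

def dist(i, j):
    """Straight-line distance between two encoded points."""
    return float(np.linalg.norm(points[i] - points[j]))

d_1_31 = dist(0, 2)   # day 1 vs day 31: neighbours on the circle
d_1_16 = dist(0, 1)   # day 1 vs day 16: roughly opposite sides
```

On the raw number line, 1 and 31 are as far apart as two days can be. After the encoding their distance is small, and 1 and 16 are nearly a full diameter apart.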

#

btw you probably want to use code formatting for this:

df['day_sin'] = np.sin(df.day*(2.*np.pi/31))
df['day_cos'] = np.cos(df.day*(2.*np.pi/31))
df['month_sin'] = np.sin((df.month-1)*(2.*np.pi/12))
df['month_cos'] = np.cos((df.month-1)*(2.*np.pi/12))

#

!code

arctic wedgeBOT Dec 20, 2023, 7:49 PM

#

Formatting code on discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

desert oar Dec 20, 2023, 7:50 PM

#

that said i don't know if day-of-month is all that useful

#

maybe for things like paydays and stuff that's on a bimonthly cycle

hollow flicker Dec 20, 2023, 7:50 PM

#

what's the difference between sklearn iterations and epochs?

desert oar Dec 20, 2023, 7:50 PM

#

but months are inconsistent in size and occur somewhat arbitrarily at various weekdays, so i think you'll have relatively low signal from day-of-month

desert oar Dec 20, 2023, 7:51 PM

#

hollow flicker what's the difference between sklearn iterations and epochs?

that's going to depend on the particular model you're looking at. scikit-learn has dozens. can you be more specific?

hollow flicker Dec 20, 2023, 7:51 PM

#

for example MLPClassifier doesn't have epoch value

#

It only accepts max_iter

#

when i plot my loss_curve, iteration is just 12

#

but if i check internet other models return as a epoch

#

this is my loss curve

desert oar Dec 20, 2023, 8:00 PM

#

hollow flicker but if i check internet other models return as a epoch

they're probably just calling it something different. you'll have to check the docs to see what each parameter means

woven sluice Dec 20, 2023, 8:17 PM

#

desert oar but months are inconsistent in size and occur somewhat arbitrarily at various we...

Ty this helped alot!

silent gull Dec 20, 2023, 9:34 PM

#

what's a good dataset to practice training an ai on?

serene scaffold Dec 20, 2023, 11:04 PM

#

silent gull what's a good dataset to practice training an ai on?

For what

crisp shuttle Dec 21, 2023, 1:18 AM

#

Greetings everyone, I have a question: who has actually managed to fully train a functional Multiple Linear Regression model (with more than 6 features) using their off-the-shelf pc/laptop or at least Google Colab?

desert oar Dec 21, 2023, 2:13 AM

#

crisp shuttle Greetings everyone, I have a question, who has actually managed to fully train a...

linear regression? i've trained linear regression models with thousands of features and millions of rows on my laptop in a few minutes

#

the only thing you really can't do on off-the-shelf general-purpose hardware nowadays is deep learning for massive models

#

linear regression on moderately large datasets has been doable on off-the-shelf general-purpose hardware since the 90s

outer widget Dec 21, 2023, 5:31 AM

#

hollow flicker but if i check internet other models return as a epoch

These are common tensorflow models plots. If you are using skelarn MLPClassifier, iterations are same as number of epochs. Usually in DL frameworks, one epoch means N iterations where N is basically (total samples / batch size). Maybe thats why its a bit confusing when we set max_iter parameter in MLPClassifier.

#

sklearn*
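
A small sketch of that relationship, assuming the `adam` solver (for the stochastic solvers `sgd` and `adam`, `max_iter` counts epochs) and a synthetic dataset from `make_classification`. The sizes and the batch size of 32 are arbitrary choices for illustration:

```python
import math

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic data, only here to have something to fit.
X, y = make_classification(n_samples=320, n_features=10, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(16,),
    solver="adam",      # stochastic solver: one iteration = one pass over X
    batch_size=32,
    max_iter=50,        # upper bound on epochs, not a guarantee
    random_state=0,
)
clf.fit(X, y)           # may emit a ConvergenceWarning if 50 epochs is not enough

epochs_run = clf.n_iter_
steps_per_epoch = math.ceil(len(X) / 32)   # what DL frameworks often call iterations

print("epochs run:", epochs_run)
print("points on loss curve:", len(clf.loss_curve_))
print("gradient steps in total:", epochs_run * steps_per_epoch)
```

Training can also stop before `max_iter` when the loss stops improving by at least `tol` for `n_iter_no_change` epochs in a row, which may be why a loss curve ends up with only 12 points.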

hollow flicker Dec 21, 2023, 5:36 AM

#

outer widget These are common tensorflow models plots. If you are using skelarn MLPClassifier...

Thanks sir

humble cobalt Dec 21, 2023, 6:23 AM

#

knotty flume anyone here interested in joining a hackathon, need people with decent data scie...

can you tell me which python libraries skills are needed for this hackathon ?

grizzled locust Dec 21, 2023, 6:53 AM

#

Hi Guys, i wanted to add value to the bar chart but it ended with error. where i did wrong?

outer widget Dec 21, 2023, 7:25 AM

#

grizzled locust Hi Guys, i wanted to add value to the bar chart but it ended with error. where i...

I think x, y are not defined. In this case, maybe the arguments passed to sns.barplot are the right arguments for addlabels:
addlabels(class_distribution.index, class_distribution)

blazing vale Dec 21, 2023, 9:31 AM

#

serene scaffold @blazing vale `df['Year'].value_counts()` returns a Series where the ind...

Yeah i was just being dumb here lol.

#

btw anyone knows how to get years along with this in output

#

if i1==4:
    a1=pd.DataFrame(df['Year'].value_counts())
    print('Year in which most number of games were released',a1.max(),'\n Year in which least number of games were released ',a1.min())
    space()

#

so i have a csv dataset of 7 years which consists all the info of games released on the ps4 console

#

2013-2020

#

however this piece of code is giving me only max number of games

#

and min number of games

#

it isnt giving me the years along with it

small wedge Dec 21, 2023, 9:46 AM

#

blazing vale ```py if i1==4: a1=pd.DataFrame(df['Year'].value_counts()) ...

so value_counts returns a series that is automatically sorted for you, there's no need to turn it into a dataframe for this. Here's an example of how you can use it for this:

>>> a = pd.DataFrame({'Year': [1997, 1998, 1997, 2005, 2005, 2005]})
>>> a1 = a['Year'].value_counts()
>>> print(f'Year where the most games were released: {a1.index[0]}, with {a1.iloc[0]} games\nYear where the fewest games were released: {a1.index[-1]}, with {a1.iloc[-1]} games')
Year where the most games were released: 2005, with 3 games
Year where the fewest games were released: 1998, with 1 games
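
An alternative sketch that does not depend on the sort order of `value_counts`: `idxmax()` and `idxmin()` return the index label (here the year) of the largest and smallest count. The data is the same toy example as above:

```python
import pandas as pd

a = pd.DataFrame({"Year": [1997, 1998, 1997, 2005, 2005, 2005]})
counts = a["Year"].value_counts()

# Label of the biggest / smallest count, regardless of how counts is sorted.
most_year = counts.idxmax()
least_year = counts.idxmin()

print(f"Most games: {most_year} ({counts.max()}), fewest: {least_year} ({counts.min()})")
```

If several years tie, both methods return the first matching label.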

blazing vale Dec 21, 2023, 9:47 AM

#

ohhh thankks mann

blazing vale Dec 21, 2023, 9:56 AM

#

small wedge so value_counts returns a series that is automatically sorted for you, there's n...

if i may ask what is a1.index[0] needed for here. Cant i also just write its iloc and print it? @small wedge

#

i am getting the same output but its giving me the name and dtype too. anyway to remove that from output?

knotty flume Dec 21, 2023, 9:57 AM

#

humble cobalt can you tell me which python libraries skills are needed for this hackathon ?

Is it ok I tell you on dms?

humble cobalt Dec 21, 2023, 10:02 AM

#

knotty flume Is it ok I tell you on dms?

Yes

small wedge Dec 21, 2023, 10:04 AM

#

blazing vale i am getting the same output but its giving me the name and dtype too. anyway to...

hm can you show what you're running? unless we have very different versions of pandas installed, iloc should only give you the counts of each year; these counts are indexed via the years that they represent, which is why I used a1.index

blazing vale Dec 21, 2023, 10:06 AM

#

ohh how do i check it?

#

what version i have

#

i can share my output

#

this is for value_counts

#

Year
2017 254
2016 222
2015 172
2014 98
2018 39
2013 20
2019 12
2020 8

small wedge Dec 21, 2023, 10:07 AM

#

blazing vale what version i have

`pip show pandas`, I'm using 2.1.1

blazing vale Dec 21, 2023, 10:07 AM

#

ok

#

i am using 2.0.3

#

Year in which most games were releasedcount 254
Name: 2017, dtype: int64,Year in which most games were releasedcount 12
Name: 2019, dtype: int64

small wedge Dec 21, 2023, 10:08 AM

#

interesting

blazing vale Dec 21, 2023, 10:08 AM

#

if i use only iloc i get this output

#

i am using a dataset of 826 rows

#

and 9 columns

#

if i1==4:
    a1=pd.DataFrame(df['Year'].value_counts())
    print(f'Year in which most games were released{a1.iloc[0]},Year in which most games were released{a1.iloc[6]}')
    space()

small wedge Dec 21, 2023, 10:09 AM

#

oh

#

you're still converting it to a dataframe

#

that's why the output is different

blazing vale Dec 21, 2023, 10:10 AM

#

ohh waittt i forgot to do that

#

lol

#

lemme make changes

#

working properly now 🫡

#

thankssss

small wedge Dec 21, 2023, 10:11 AM

#

np

blazing vale Dec 21, 2023, 10:11 AM

#

but i still have a question

#

if i use it with a df why it returns name and dtype as well

#

but when i use the same func with series it doesnt do so

#

thats strange lol

#

and cool at the same time

small wedge Dec 21, 2023, 10:13 AM

#

because a dataframe returns a series when you index via iloc

#

but a series returns the value that's at the index

blazing vale Dec 21, 2023, 10:14 AM

#

ohhh

#

so series just returns the value

#

whereas df returns a series: the value along with name and dtype
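
A short sketch of that difference on made-up data (the `Year` values are illustrative). The same `.iloc[0]` gives a bare number on a Series but a whole row, itself a Series, on a DataFrame:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Year": [2016, 2017, 2017, 2017, 2016, 2015]})

counts = df["Year"].value_counts()   # Series: year -> count
as_frame = pd.DataFrame(counts)      # one-column DataFrame with the same index

from_series = counts.iloc[0]         # a single number
from_frame = as_frame.iloc[0]        # the whole first row, as a one-element Series

print(type(from_series).__name__, type(from_frame).__name__)
print(from_frame.name)               # the row label, i.e. the year
print(as_frame.iloc[0, 0])           # row 0, column 0: a single number again
```

Printing a Series always shows its `Name` and `dtype` footer, which is where the extra text in the f-string output comes from.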

blazing vale Dec 21, 2023, 11:33 AM

#

if c==5:
      while True:
            print('''Enter 1 to get Total Sales of all games\n
Enter 2 to get Total sales in each genre and by each publisher\n
Enter 3 to get game info about the games with Maximum and Minimum Sales across each Region and ROW Sales\n
Enter 4 to get Maximum and Minimum Sales made by each publisher across each Region and ROW Sales\n
Enter 5 to get Maximum and Minimum Sales made in each genre across each Region and ROW Sales\n
Enter 6 to return to previous Menu''')
            space()
            i1=eval(input('Enter your choice: '))
            space()
            if i1==1:
               print(df[['Game','Year','Genre','Publisher','Global']])
               space()
            if i1==2:
               if df['Global']=='Action':
                  a1=df['Global'].sum()

#

@small wedge

#

Here

small wedge Dec 21, 2023, 11:34 AM

#

mhm

blazing vale Dec 21, 2023, 11:34 AM

#

lemme show the error

#

Traceback (most recent call last):
  File "C:\Users\LENOVO\Desktop\IP Project\Ip project101.py", line 130, in <module>
    if df['Global']=='Action':
  File "E:\lib\site-packages\pandas\core\generic.py", line 1466, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

#

error

#

data set

small wedge Dec 21, 2023, 11:36 AM

#

what are you trying to do with these lines?

if df['Global']=='Action':
  a1=df['Global'].sum()

blazing vale Dec 21, 2023, 11:36 AM

#

trying to find total global sales of each genre

small wedge Dec 21, 2023, 11:37 AM

#

ah ok

blazing vale Dec 21, 2023, 11:37 AM

#

here are all the genres if needed 'Action','Shooter','Action-Adventure','Sports','Role-Playing','Platform',
'Racing','Fighting','Adventure','MMO','Simulation','Music','Party','Strategy','Puzzle','Visual','Novel','Misc'

small wedge Dec 21, 2023, 11:37 AM

#

so when you do df['Global']=='Action' this creates a boolean series, a mask of True/False values

#

you can use this mask to index your dataframe, then take the sum from the result instead

#

df[df['Global']=='Action'].sum()
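
A hedged sketch of the masking idea on toy data. It assumes the genre labels live in a `Genre` column and the sales figures in `Global`, as the column list printed in the menu code suggests; comparing the numeric `Global` column itself to `'Action'` would match no rows. The rows below are invented:

```python
import pandas as pd

# Hypothetical rows; column names follow the ones printed earlier in the chat.
df = pd.DataFrame({
    "Game": ["A", "B", "C", "D"],
    "Genre": ["Action", "Shooter", "Action", "Sports"],
    "Global": [1.5, 2.0, 0.5, 3.0],
})

mask = df["Genre"] == "Action"   # boolean Series, one True/False per row
# `if mask:` would raise the ValueError from the traceback,
# because a whole Series has no single truth value.

# Keep only the masked rows, then sum just the sales column.
action_total = df.loc[mask, "Global"].sum()
print(action_total)
```

Selecting the `Global` column before calling `.sum()` avoids summing (or concatenating) the text columns as well.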

blazing vale Dec 21, 2023, 11:38 AM

#

Ohhhhhhhhhhhhhhhhhhhhh 💀

#

I am so dumbbbh

#

I could have done this lol

#

😭

small wedge Dec 21, 2023, 11:38 AM

#

s'all good, sometimes you get lost in the sauce

blazing vale Dec 21, 2023, 11:38 AM

#

Yeah

#

Hey if i wanna do it all for once for all genres

#

Then should i pass a list of everything there

#

Ohh wait then that would do sum of everything too 😭

small wedge Dec 21, 2023, 11:40 AM

#

yeah and use df['Global'].isin(['Action','Other stuff', ...])

blazing vale Dec 21, 2023, 11:40 AM

#

Should i define this function?

#

And use it again and again just by giving the name

#

Of the genre

#

This would reduce the typing and copy pasting part alot lol

small wedge Dec 21, 2023, 11:41 AM

#

you could, you could also use groupby to split them all up for you

#

then select the groups you want and take their sums
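
Roughly what that could look like, again on invented rows and assuming `Genre` and `Global` columns. `groupby` splits the rows by genre and `.sum()` totals each group in one go:

```python
import pandas as pd

# Hypothetical rows for illustration.
df = pd.DataFrame({
    "Genre": ["Action", "Shooter", "Action", "Sports", "Shooter"],
    "Publisher": ["X", "Y", "Y", "X", "X"],
    "Global": [1.5, 2.0, 0.5, 3.0, 1.0],
})

# One total per genre; the genre names become the index.
sales_by_genre = df.groupby("Genre")["Global"].sum()
print(sales_by_genre)

# Pick out only the groups of interest afterwards.
print(sales_by_genre.loc[["Action", "Sports"]])
```

Swapping `"Genre"` for `"Publisher"` would give the per-publisher totals the menu also asks for.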

blazing vale Dec 21, 2023, 11:41 AM

#

Isn’t groupby a sql function 💀

#

Never knew its in pandas too

#

Damnnnn

blazing vale Dec 21, 2023, 11:42 AM

#

small wedge yeah and use `df['Global'].isin(['Action','Other stuff', ...])`

can you tell me what isin is?

small wedge Dec 21, 2023, 11:43 AM

#

a function for creating masks that match more than one category

blazing vale Dec 21, 2023, 11:43 AM

#

Ohh

small wedge Dec 21, 2023, 11:43 AM

#

just like `a == 'a' or a == 'b'` is cleaner to write as `a in [*'ab']`, in pandas you use isin
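
A small sketch comparing the two spellings on toy data (the rows and the `Genre`/`Global` column names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical rows for illustration.
df = pd.DataFrame({
    "Genre": ["Action", "Shooter", "Racing", "Sports"],
    "Global": [1.5, 2.0, 0.5, 3.0],
})

wanted = ["Action", "Sports"]
mask = df["Genre"].isin(wanted)   # True where Genre is any of the wanted values

# The same mask written out with == and | for comparison.
mask_long = (df["Genre"] == "Action") | (df["Genre"] == "Sports")

combined_total = df.loc[mask, "Global"].sum()
print(mask.equals(mask_long), combined_total)
```

Note that this sums all the wanted genres together; for a separate total per genre, grouping is the better fit.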

blazing vale Dec 21, 2023, 11:43 AM

#

I dunno but i think we call it boolean indexing here(the masks)

#

I dunno if its the same thing lol

small wedge Dec 21, 2023, 11:44 AM

#

blazing vale I dunno but i think we call it boolean indexing here(the masks)

I've never heard it called that but it would make sense to call it that

blazing vale Dec 21, 2023, 11:44 AM

#

Yeah

#

Cause it returns true and false when checking condition in df

#

And series too

#

Is it the same thing?

#

I am almost done with my project lol thanks to you

small wedge Dec 21, 2023, 11:46 AM

#

is what the same thing?

small wedge Dec 21, 2023, 11:46 AM

#

blazing vale I am almost done with my project lol thanks to you

nice lol, happy to help

blazing vale Dec 21, 2023, 11:46 AM

#

like boolean indexing and masking

small wedge Dec 21, 2023, 11:47 AM

#

probably but idrk shrug

blazing vale Dec 21, 2023, 11:47 AM

#

lool

#

thanks again

#

imma continue my work further

blazing vale Dec 21, 2023, 12:13 PM

#

@small wedge hey 🥲

#

Game 0
Year 0
Genre 0
Publisher 0
North America 0.0
Europe 0.0
Japan 0.0
Rest of World 0.0
Global 0.0

#

getting this output

buoyant vine Dec 21, 2023, 12:13 PM

#

In Pytorch how do you create a zero(?) dim tensor with a single value...
A couple of metrics return a tensor(0.4) but I have no idea how that is created

blazing vale Dec 21, 2023, 12:13 PM

#

used the code u gave me

#

it isnt working

small wedge Dec 21, 2023, 12:14 PM

#

blazing vale used the code u gave me

show what you wrote

blazing vale Dec 21, 2023, 12:14 PM

#

ok

#

if i1==2:
    print(df[df['Global']=='Action'].sum())