#data-science-and-ml
which one ?
I'm using this rn: https://github.com/LaurentMazare/tch-rs
has been a good experience
nice
This is like a cpp ffi
yeah it's a fairly thin wrapper around torch
I managed to properly install pytorch with cuda into VS Code using pipenv: pipenv install --index https://download.pytorch.org/whl/ "torch==2.2.1+cu121" "torchvision==0.17.1+cu121" "torchaudio==2.2.1+cu121". Was much harder than I expected.
P.S. python version "3.11.0", pipenv version "2023.12.1", GPU: NVIDIA GeForce RTX 3080Ti
Never stops being hard. The rest of it is fun, so it's worth
Tbh @final kiln You might find the Burn optimization stuff quite interesting if you like the idea of the compiler benefits.
They are doing quite a bit of work atm around automatic kernel fusion and using Rust's RAII to optimize tensor memory patterns.
It's not the optimization that I'm looking for tho, it's the error correction mechanism
.
hey anyone worked in R studio before ?
question: if my classification task involves an addition of a feature/column (e.g. confidence rating) in my training set vs my new dataset to predict on, will this throw my model off (inputs)?
whatsup
a model usually expects a fixed set of inputs, you can't just add an input without refitting the model to at least some extent
or are you saying that your training set has an extra feature that won't be present in the test set and future data? in that case you can't use it because you won't have it. it's not a matter of throwing the model off, it's a matter of not having the data to use
"Don't ask to ask" -- if you have a question, state it clearly, with enough details that someone can answer it without interviewing you
any of ya'll know how to enable autocomplete for markdown latex (or markdown latex in python notebooks) in vscode?
This (https://marketplace.visualstudio.com/items?itemName=yzhang.markdown-all-in-one#math) worked but only for markdown files (not markdown in notebooks)
I was unaware. Do you have something on topic to contribute?
!mute 1216171349786234963 1d chill, my friend.
:incoming_envelope: :ok_hand: applied timeout to @lapis sequoia until <t:1710553523:f> (1 day).
this server is ass im out
I’ve managed to reshape Table A into Table B, first with pd.melt and then pd.pivot. My follow-up question is: how do I reorder the values in the Studies column from
['adoption', 'eligible studies', 'studies not activated', 'total studies'] to
['adoption', 'eligible studies', 'total studies', 'studies not activated']
I tried using
studiesDtype = pd.CategoricalDtype(['bls adoption','eligible studies','total studies','studies not activated'], ordered=True)
df1["studies"] = df1["studies"].astype(studiesDtype)
but this messed up the table: countries now show up under regions they don’t belong to. See the table in the Error screenshot.
@desert oar
today I'm gonna try to get some tooling going for this salad of languages and then study rust's type system in detail, I can't really do this without at least a good understanding how rust and cuda work, rn im very much a newbie in both
I think that building the binary automatically will have to happen over an aws instance, so there's that issue too
this is gonna force me to build and hold an image which is not for production, which is something I would like to avoid
I'm gonna try to map nvcc inside the container
installing it is a bit of nightmare because the prod image is using debian and it just is what it is
I'm gonna change the base image to pytorch's instead of prefect, the reason for using the prefect base image actually no longer exists, it was easier to maintain compatibility with their deploy system by using their image, but now I'm doing the deployments via actions only
so the hypothesis is that taking pytorch base image and installing the rest of the nvidia toolkit is not gonna be hard
Running into an issue with pandas-on-pyspark where shifting a column with type timestamp will cause a mismatch error, saying it expected float but got timestamp. Repro:
```py
import pandas as pd
import pyspark.pandas as ps
df = ps.from_pandas(pd.DataFrame({"date": pd.date_range('01-01-2000', '01-10-2000')}))
df['date_prev'] = df['date'].shift()
```
Anyone know if Keras3 can work with older Tensorflow versions?
Or should I just try and get wsl2?
I'm doing so many cursed things in rust and it's making me happy
Oh no here comes malloc time
LETS GOOOOOOOO
I added two vectors on the GPU
With only 2 dimensions
I consider this an absolute W
Could someone help me understand the inner workings of a 3-layer neural network during backpropagation if we used cost function MAE and a sigmoid activation (1/(1+e^-x))? 🙏
like how to calculate the gradient 🙏
grad error = (derror/dw1, derror/dw2, ...)
derror/dw1 = derror/dlayer1 * dlayer1/dw1
Expand the last term as much as needed til you get to the last layer
how would you get the first term tho 👀
You have the analytic expression for any given layer
So you compute that, it will depend on the value of the current weights and on the input coming from the previous layer
yea, but how to find the derivative of cost with respect to any layer
because cost is only defined with respect to last layer
Cost is defined with respect to the entire function, which is a composition of layers
So you apply the chain rule
error = layer1(layer2(x))
Or maybe
E
Error(layer1(layer2(x)))
if we say C = 1/n * sum((outputlayerActivation_i - expectedValue_i)^2), isn't that only the last layer 👀
what am i missing 😭
No, because the output activation i oughta depend on output activation i - 1
yea
You can compute the derivative using that formula already, but you'll need to expand it when you apply the chain rule
but it depends on all previous neurons in the previous layer, how to find derror/d(some particular weight)
This
If you go through the calculations for a simple composition of two linear functions
You'll see that you can calculate it as you have all the terms
would you mind walking me through an example 🥲 🙏
😭 ty so much
something like this I believe
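A worked sketch of the chain rule discussed above, for the smallest possible case: one input, one hidden sigmoid neuron, one output sigmoid neuron, and the squared-error cost from the earlier message, taken for a single sample. The scalar parameters `w1, b1, w2, b2` and all function names are assumptions made for illustration. The "first term" (derror/dlayer) of a layer is just the delta handed back by the layer after it, and the analytic result is checked against finite differences.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    a1 = sigmoid(w1 * x + b1)   # hidden layer activation
    a2 = sigmoid(w2 * a1 + b2)  # output layer activation
    return a1, a2


def cost(x, y, w1, b1, w2, b2):
    _, a2 = forward(x, w1, b1, w2, b2)
    return (a2 - y) ** 2


def gradients(x, y, w1, b1, w2, b2):
    a1, a2 = forward(x, w1, b1, w2, b2)
    # dC/dz2 = dC/da2 * da2/dz2, using sigmoid'(z) = a * (1 - a)
    delta2 = 2.0 * (a2 - y) * a2 * (1.0 - a2)
    dw2 = delta2 * a1                       # dz2/dw2 = a1
    # Push the error one layer back: dC/dz1 = dC/dz2 * dz2/da1 * da1/dz1
    delta1 = delta2 * w2 * a1 * (1.0 - a1)
    dw1 = delta1 * x                        # dz1/dw1 = x
    return dw1, dw2


def numeric_gradient(x, y, params, index, eps=1e-6):
    # Central finite difference on one parameter, used only as a check.
    plus, minus = list(params), list(params)
    plus[index] += eps
    minus[index] -= eps
    return (cost(x, y, *plus) - cost(x, y, *minus)) / (2.0 * eps)


params = [0.5, -0.1, 1.5, 0.2]  # w1, b1, w2, b2 (arbitrary values)
x, y = 0.8, 1.0
dw1, dw2 = gradients(x, y, *params)
num_dw1 = numeric_gradient(x, y, params, 0)
num_dw2 = numeric_gradient(x, y, params, 2)
print("dC/dw1:", dw1, "numeric:", num_dw1)
print("dC/dw2:", dw2, "numeric:", num_dw2)
```

With more neurons per layer the same steps apply, except that `delta1` sums the contributions of every neuron in the next layer.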
Good thing you ask, I actually need to start thinking about these for my new custom layer
the code you wrote should work if you really are just re-ordering categories. can you provide some actual sample data that reproduces the error?
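A small sketch of that point, on made-up data (the `country` and `value` columns are assumptions, not the poster's table). It also shows one thing worth ruling out: a label that is missing from the category list, such as 'adoption' when the dtype lists 'bls adoption', is silently turned into NaN by `astype`.

```python
import pandas as pd

# Hypothetical toy data standing in for the reshaped table.
df1 = pd.DataFrame({
    "country": ["A", "A", "A", "A"],
    "studies": ["adoption", "eligible studies",
                "studies not activated", "total studies"],
    "value": [1, 2, 3, 4],
})

order = ["adoption", "eligible studies", "total studies", "studies not activated"]
studies_dtype = pd.CategoricalDtype(order, ordered=True)
df1["studies"] = df1["studies"].astype(studies_dtype)

# Sorting now follows the declared category order, not alphabetical order.
sorted_df = df1.sort_values(["country", "studies"]).reset_index(drop=True)
print(sorted_df)

# A value that is not among the categories becomes NaN after the cast.
mismatched = pd.Series(["adoption"]).astype(pd.CategoricalDtype(["bls adoption"]))
print(mismatched.isna().tolist())
```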
Got tensors being shared over to the cpp code
After this I got to send them to CUDA and calculate stuff there
And after this I can get the cpp boiler plate for the forward and backwards pass
And then all I have to do is to code the calculations in cuda
bro pls sum1 help me man i got no clue how to do this and i need the grade 
Define positive_revenue_df as the subset of movies in df with revenue greater than zero.
Code is provided below that creates new instances of model objects. Replace all instances of df with positive_revenue_df, and run the given code.
Use this code to get started:
positive_revenue_df =
# Replace the dataframe in the following code, and run.
regression_outcome = df[regression_target]
classification_outcome = df[classification_target]
covariates = df[all_covariates]
# Reinstantiate all regression models and classifiers.
linear_regression = LinearRegression()
logistic_regression = LogisticRegression()
forest_regression = RandomForestRegressor(max_depth=4, random_state=0)
forest_classifier = RandomForestClassifier(max_depth=4, random_state=0)
linear_regression_scores = cross_val_score(linear_regression, covariates, regression_outcome, cv=10, scoring=correlation)
forest_regression_scores = cross_val_score(forest_regression, covariates, regression_outcome, cv=10, scoring=correlation)
logistic_regression_scores = cross_val_score(logistic_regression, covariates, classification_outcome, cv=10, scoring=accuracy)
forest_classification_scores = cross_val_score(forest_classifier, covariates, classification_outcome, cv=10, scoring=accuracy)
What is the mean of the 10 cross validation scores for random forest regression?
It appears that the variables budget, popularity, runtime, vote_count, and revenue are all right-skewed. In Exercise 6, we will transform these variables to eliminate this skewness. Specifically, we will use the np.log10() method. Because some of these variable values are exactly 0, we will add a small positive value to each to ensure it is defined; this is necessary because log(0) is negative infinity.
Instructions
For each above-mentioned variable in df, transform value x into np.log10(1+x).
What is the new value of skew() for the covariate runtime? Please provide the answer to 3 decimal points.
this too :'(
Filtering a dataframe is somewhat straight forward: there’s some examples here (look for the shield > 6 examples) : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
But I suggest opening a help thread, as you have many questions here #❓|how-to-get-help
thanks!
Aight, first torch tensor has been funnelled to my kernel
Now it's time to figure out how to make a custom layer out of this
Can't believe how smooth this is going
the buildup to this moment has been insanely long
guys and also another question. What version of keras and tensorflow does GOOGLE TEACHABLE MACHINE USE? I HAVE CREATED A CODE THAT WORKS PERFECTLY BUT THE PRE-TRAINED GOOGLE MODEL ISN'T COMPATIBLE with the installed keras how can I fix this problem? I have downgraded keras and idk how to check the model info. I asked AI and it said that I could use h5py to manipulate the model .h5 file but it says it is not highly recommended because it can cause more problems. How can I solve this issue?
sorry for caps I'm pissed 😤
No one is going to DM you. If they answer, it will be in this chat.
Can you give a link to what "Google teachable machine" is? Is it a tutorial or something?
ok 🙃
https://teachablemachine.withgoogle.com/
Basically you can train an image model here and export it.
not only image model
Can you get to this menu and click the Tensorflow tab?
yes
She shows only the tensorflow.js tab
I suspect we'll see something informative on the tensorflow (not js) tab
Also I have to walk home from the coffee shop that I'm sitting in.
I have tried downgrading keras below 2.20 as I said because the model has a module named groups and keras >=2.20 doesn't use it anymore
Hi,is it worthy of learning the ML algorithms from scratch except just for knowing how they work and having an actual idea what they do ?
It depends on what your goals are. Platforms like pytorch and huggingface enable one to do a lot without needing to know very much about what's going on under the hood. But if you want to work in the space professionally, and you don't understand how anything actually works, you won't be able to make informed decisions. Let alone communicate them to management and clients.
So its been a while that I started learning ML through some courses I first learned to implement algorithms from sklearn etc now Im following a course in coursera and they implement the algorithms from scratch
because I know they pretty much do the same thing if i do it from scratch or import it
ill still be messing with the parameters
hey everyone, pandas-profiling is throwing attribute error to me
what am I supposed to do now? I have already checked the versions
So I decided to instead use the fact that my strategy generates signals , use the signals to find where they happen the most and least with historical data 📊 this helps find patterns or days where I might have no edge and can avoid it before hand
When there was no selling signals on MSFT there was buying signals the blank clearly shows wasn’t the time to be short 
Can you show the code too?
You didn't show the full traceback nor the code that produced the error.
You need to update your numba library.
pip install -U numba
Wasn't generated_jit removed recently? Likely they'd have to roll back to an earlier version
I have an OOPs related bug I believe,
My model is this
class PadPrompter(nn.Module):
    def __init__(self, train_conf):
        super(PadPrompter, self).__init__()
        self.pad_up = nn.Parameter(torch.randn([3, pad_size, image_size]))
        self.pad_down = nn.Parameter(torch.randn([3, pad_size, image_size]))
        self.pad_left = nn.Parameter(torch.randn([3, image_size - pad_size*2, pad_size]))
        self.pad_right = nn.Parameter(torch.randn([3, image_size - pad_size*2, pad_size]))
        self.prompt = torch.cat([above 4 parameters], dim = ... )

    def forward(self, x):
        print("#1 CHECK inside forward", self.prompt)
        return x+self.prompt
Elsewhere in the code, I am updating the parameters of PadPrompter's instance, prompter:
torch.nn.utils.vector_to_parameters(w_l, prompter.parameters())
for param in prompter.parameters():
    print(f" #2 CHECK outside forward: {param[0][0]}")
After torch.nn.utils.vector_to_parameters, in check #2 I am able to verify that the parameters have indeed changed. But later on, in forward, I again see the old parameters
What is the best AI I can self host on a Ubuntu with an AMD graphics card? It seems like all the AI models only work on Nvidia graphics cards.
which tensorflow version is stable? i reinstalled tf and reran an old code (that was working fine b4) but now its giving me some error: 2024-03-15 23:57:05.807478: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
I get 0 trainable parameters too
Yeah I've just noticed. Rolling back to the older version of numba or updating Pandas-Profiling itself will perhaps fix the problem.
That's the funny part 🤪
Deprecated 'pandas-profiling' package, use 'ydata-profiling' instead
https://pypi.org/project/pandas-profiling/
what range do I use for segmentation mask
if I have 2 values do I say 0 and 1 or -1 and 1
it just creates Black image
because i have a tiny thing that it needs to segment the rest is black so it just learns to make everything black
does anybody know how to fix that
i figured it out by saying loss(preds, targets) + loss(preds*targets, targets) * scale so that white stuff affects it more and it fixed it
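A plain-Python sketch of that weighting trick, assuming masks with values 0 and 1, sigmoid-style predictions in (0, 1), and binary cross-entropy as the `loss` (the post does not say which loss was used). Multiplying the predictions by the targets zeroes the background, so the second term only measures error on the white pixels.

```python
import math


def bce(preds, targets, eps=1e-7):
    # Mean binary cross-entropy over flat lists of pixels.
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(preds)


def weighted_loss(preds, targets, scale=10.0):
    # loss(preds, targets) + loss(preds * targets, targets) * scale
    masked = [p * t for p, t in zip(preds, targets)]
    return bce(preds, targets) + scale * bce(masked, targets)


# 100 pixels, only 2 of them foreground (made-up numbers).
targets = [0.0] * 98 + [1.0] * 2
all_black = [0.01] * 100                 # model predicts background everywhere
finds_object = [0.01] * 98 + [0.9] * 2   # model finds the small object

plain_black = bce(all_black, targets)
plain_found = bce(finds_object, targets)
weighted_black = weighted_loss(all_black, targets)
weighted_found = weighted_loss(finds_object, targets)

print("plain gap:   ", plain_black - plain_found)
print("weighted gap:", weighted_black - weighted_found)
```

The gap between the "all black" prediction and the correct one grows with `scale`, which is why the model stops getting away with predicting only background.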
Hello!
I'm posting this to see if anyone is knowledgeable in algorithms and artificial intelligence. Currently working on an unbeatable AI project for the game of Gomoku, I have constraints that require the Minimax (or variant) algorithm to have a depth of 10, as well as a calculation time of less than 0.5 seconds. The project is in Python with the possibility of developing in C.
I'm asking for help to figure out how to optimize my code to achieve a depth of 10 in less than 0.5 seconds each time.
If anyone thinks they can help me and is knowledgeable on the subject, it would be great to exchange ideas! Thank you!
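For reference, the usual minimax variant for searching deeper in the same time is alpha-beta pruning; whether the project already uses it is not stated here. A minimal sketch on a hand-built tree follows (nested lists are inner nodes, integers are leaf evaluations; the tree and the `stats` counter are illustrative only, not Gomoku code).

```python
import math


def minimax(node, maximizing, stats):
    stats["visited"] += 1
    if not isinstance(node, list):      # leaf: static evaluation
        return node
    values = [minimax(child, not maximizing, stats) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, stats, alpha=-math.inf, beta=math.inf):
    stats["visited"] += 1
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, stats, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:           # opponent will never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, stats, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:               # we already have a better option
            break
    return best


tree = [[3, 5], [2, 9], [0, 7]]
plain_stats = {"visited": 0}
pruned_stats = {"visited": 0}
plain_value = minimax(tree, True, plain_stats)
pruned_value = alphabeta(tree, True, pruned_stats)
print(plain_value, plain_stats["visited"])
print(pruned_value, pruned_stats["visited"])
```

Both searches return the same value; pruning only skips branches that cannot change it, and it skips more of them when strong moves are tried first.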
Hello, does anyone know a good way of creating an AI chatbot with some pre-trained conversational capability as to not train it from scratch?
I did
its still not working
Pandas-Profiling has been deprecated. You need to install ydata-profiling instead.
from pandas_profiling import ProfileReport
prof = ProfileReport(df)
prof.to_file(output_file='output.html')
I see
this package seem to be deprecated, you either want to move to the suggested package they have
or downgrade numba to something less than 0.59.0
as im finishing up this project, I've settled on what my next one is gonna be, a dedicated ML language
not like mojo, I'm gonna introduce new concepts and ideas
Like a programming language?
yeah
What features are you thinking of?
actually, not a lot yet, this is an evolution of my idea of tricking the compiler into doing math checks
but I like the challenge and will be thinking where I can innovate
Ah I thought you'd have gone down the differential programming route
well yes autograd will be built in
https://en.wikipedia.org/wiki/Dependent_type is interesting for math checks
but that's a deep rabbit hole 🐰
interesting
for now im just gonna collect the problems that exist, and see how a new language could potentially solve them
one could be standardization
like what onnx tries
It could also be something that isn't particularly innovative but is fast and ergonomic
that would already fill a gap imo
yeah im going for ml dev experience
imagine the compiler pointing out mistakes in the model
I don't want to scare you but you should definitely consider that there are many fantastic langs with garbage tooling
if I do it, I'll do it like how gleam is doing
they released the tooling with the language
So if you're doing this I'd say consider from day 0 how you'll integrate with LSP, how you'll do testing, ...
exactly
that's the way
Looking forward to your ideas! Always feel free to ask me to bounce ideas, it's something I find really cool
awesome I appreciate it, I'll definitely need it because I'm not very experienced overall, I'm even thinking of starting the project very slow so I give myself time to gain experience in the industry
im probably gonna have to build up my knowledge, do an interpreter first, then start getting into simple compilers
There's one thing that mojo got super right, which is that no one is gonna leave py so interoperability is a must
Way I'm thinking, is that modules get compiled to python compatible binaries, which you can import directly into py
Like how Cython does it
I don't like the optional superset route tho, I think the type system has to be obligatory
```py
from ydata_profiling import ProfileReport
prof = ProfileReport(df)
prof.to_file(output_file='output.html')
```
I used this code
and it is throwing me an error
[Open Browser Console for more detailed log - Double click to close this message]
Failed to load model class 'HBoxModel' from module '@jupyter-widgets/controls'
Error: Module @jupyter-widgets/controls, version ^1.5.0 is not registered, however, 2.0.0 is
at f.loadClass (http://localhost:8888/lab/extensions/@jupyter-widgets/jupyterlab-manager/static/134.a63a8d293fb35a52dc25.js?v=a63a8d293fb35a52dc25:1:75057)
this is what I am getting in return
it says, it is a javascript error
eh i haven't really used this profile tool, where do you get this error btw? when saving the report to html?
C:\Users\Abhi\anaconda3\Lib\site-packages\ydata_profiling\model\correlations.py:66: UserWarning: There was an attempt to calculate the auto correlation, but this failed.
To hide this warning, disable the calculation
(using df.profile_report(correlations={"auto": {"calculate": False}})
If this is problematic for your use case, please report this as an issue:
https://github.com/ydataai/ydata-profiling/issues
(include the error message: 'could not convert string to float: 'S'')
warnings.warn(
C:\Users\Abhi\anaconda3\Lib\site-packages\seaborn\matrix.py:260: FutureWarning: Format strings passed to MaskedConstant are ignored, but in future may error or produce different behavior
annotation = ("{:" + self.fmt + "}").format(val)
C:\Users\Abhi\anaconda3\Lib\site-packages\ydata_profiling\model\missing.py:78: UserWarning: There was an attempt to generate the Heatmap missing values diagrams, but this failed.
To hide this warning, disable the calculation
(using df.profile_report(missing_diagrams={"Heatmap": False})
If this is problematic for your use case, please report this as an issue:
https://github.com/ydataai/ydata-profiling/issues
(include the error message: 'could not convert string to float: '--'')
warnings.warn(
yes
try opening the html file in the browser directly
this seems to be warning
yes
got it
I mistakenly passed the wrong variable
mom was right
I am an idiot
mom haha
thanks for helping me tho
no worries
looking for some pandas advice
I am looking to generate a comparison of 2 xlsx files, where sheets correspond to 2d arrays. I'm looking to identify the differences across these two files. I'm using a for loop to create a dictionary of dfs for each file, then comparing. Am I overcomplicating this? I run into tuple issues at times
the arrays, generally, are the same shape across file A and B
TIA
wondering if it makes more sense for me to define a function that creates a class and subclass for each sheet, column, etc. to do this comparison
case sensitivity and sorting causes me some issues in my previous attempts to build this program
im no expert, just studying pandas atm, what are you comparing exactly? I know pandas has some methods to look at multiple dfs
each sheet is a state, each df is month (columns) by year (rows) with gasoline price in the values of the matrix
file A is gasoline, file B is diesel
a for loop is probably fine but why move the data to dictionaries?
my core python is rusty but i would probably start by looping through df.iloc using a range(len())
if shapes are comparable, I figure i could just A/B test each key, subkey, and value within dictionary A and B
probably why I run into tuple issues
hmm im definitely not the best person to give advice but i'd be shocked if there wasn't a simple method using pandas to conduct a/b arithmetic
A feature I'll want to include is CUDA blocks
cuda {
/// Your CUDA code here
}
Something like that. Seamless integration for GPU code and getting my ecosystem to swallow Nvidia's, so I can find ways to control the chaos surrounding it rn
A dataset in csv file.
30970 numeric value fields each line, comma delimited.
21259 lines in csv
Gets imported to memory (device local memory 32GB for everything - all processes and data) using pandas read_csv()
Shape of resulting data frame: (1523, 30970).
Is this where pandas reaches its own limits? Otherwise, where do the mere 1523 rows in the in-memory data frame come from?
why not make month and year a multiindex thing, or combine them and make some sort of date into an index, then join the two dataframes, then do a diff along the columns instead of a diff along the index?
wouldnt importing the excel data into classes and subclasses be a more effective way to do this, I likely would like to reference this data in the future
for calculations, visualization, etc
maybe im way off base
To do a master's degree in artificial intelligence, is it better to do 3 years of computer science before or 3 years of electronics ?
Personally, I hadn’t yet come to the idea of doing electronics prior to AI.
However I can be wrong with my opinion.
and you work in ia ?
no, I step in into AI, since June ‘23
i still wonder how ya'll can keep the date in mind for so long
lol
day of month forgotten
dtypes in resulting data frame is int64
A dataset in csv file.
30970 numeric value fields each line, comma delimited.
21259 lines in csv
Gets imported to memory (device local memory 32GB for everything - all processes and data) using pandas read_csv()
Shape of resulting data frame: (1523, 30970), while it’s dtypes int64
Is this where pandas reaches its own limits? Otherwise, where do the mere 1523 rows in the in-memory data frame come from?
No improvement on change of dtypes for resulting data frame from int64 to int32.
if there are only 1500 lines it's probably because it thought it hit the end of the file
check the settings. and do you have any blank lines?
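One way to act on that suggestion without pandas is to compare the number of physical lines with the number of records a CSV parser actually sees. The helper name `summarize_csv` and the sample text are made up; the unbalanced quote in the sample is just one assumed way for several lines to collapse into a single record, not a diagnosis of the file in question.

```python
import csv
import io
from collections import Counter


def summarize_csv(text):
    physical_lines = text.splitlines()
    blank_lines = sum(1 for line in physical_lines if not line.strip())
    # csv.reader yields [] for blank lines and merges lines inside quotes.
    records = [row for row in csv.reader(io.StringIO(text)) if row]
    field_counts = Counter(len(row) for row in records)
    return {
        "physical_lines": len(physical_lines),
        "blank_lines": blank_lines,
        "records": len(records),
        "field_counts": dict(field_counts),
    }


# Six physical lines: one is blank, and a quoted field spans three of them.
sample = '1,2,3\n\n4,5,6\n7,"8,9\n10,11,12\n13",14\n'
summary = summarize_csv(sample)
print(summary)
```

For a real file, the same counts can be taken from `open(path, newline="")` instead of a string; a large gap between physical lines and records points at quoting or blank-line issues rather than a pandas size limit.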
also why on earth do you have such a "wide" csv? is that raw count-encoded text data?
pandas i don't think has any limit on data size other than RAM available
if that's sparse data you might want to consider loading it into a sparse array instead of a pandas data frame
i think pandas used to have a sparse frame but i think they got rid of it a long time ago
https://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html never mind, i was wrong!
Dataframe compare is what you're looking for?
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.compare.html
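A short sketch of how `DataFrame.compare` could be applied sheet by sheet. The two dictionaries stand in for what `pd.read_excel(path, sheet_name=None)` returns (a dict of DataFrames keyed by sheet name); the sheet name and prices are invented. `compare` needs both frames to have identical row and column labels, which matches the "same shape across file A and B" assumption above.

```python
import pandas as pd

# Stand-ins for pd.read_excel("file_a.xlsx", sheet_name=None) and the same
# call for file B; the data here is made up.
file_a = {
    "Texas": pd.DataFrame({"Jan": [2.10, 2.30], "Feb": [2.20, 2.40]},
                          index=[2022, 2023]),
}
file_b = {
    "Texas": pd.DataFrame({"Jan": [2.10, 2.35], "Feb": [2.20, 2.40]},
                          index=[2022, 2023]),
}

differences = {}
for sheet, df_a in file_a.items():
    df_b = file_b[sheet]
    diff = df_a.compare(df_b)   # keeps only the cells that differ
    if not diff.empty:
        differences[sheet] = diff

for sheet, diff in differences.items():
    print(sheet)
    print(diff)
```

The result has a "self" and an "other" sub-column for each column that differs, so there is no need to walk nested dictionaries by hand.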
it's not just a matter of reducing memory usage, operations on sparse arrays can sometimes be faster due to not having to loop over a ton of zeros
ill give this a go
CS for sure
how can i measure the accuracy of a computer vision project
That question is sooo general. A good starting point to determine how to measure accuracy is to understand what the output is in your specific model.
This is a posture estimation model
The output is the landmarks' x,y coordinates with visibility
So a good measure could be the Euclidean distance between estimated x,y and true x,y
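A minimal sketch of that measure, assuming each landmark is an `(x, y, visibility)` tuple and that landmarks whose ground-truth visibility is below a threshold are skipped. The function name and the 0.5 threshold are assumptions.

```python
import math


def mean_keypoint_error(predicted, truth, min_visibility=0.5):
    """Average Euclidean distance over landmarks that are visible in the truth."""
    distances = []
    for (px, py, _), (tx, ty, visibility) in zip(predicted, truth):
        if visibility < min_visibility:
            continue  # do not score landmarks that are not actually visible
        distances.append(math.hypot(px - tx, py - ty))
    if not distances:
        raise ValueError("no visible landmarks to score")
    return sum(distances) / len(distances)


predicted = [(0.0, 0.0, 1.0), (3.0, 4.0, 1.0), (10.0, 10.0, 0.9)]
truth = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]
error = mean_keypoint_error(predicted, truth)
print(error)
```

In practice the distance is often divided by a reference length such as the bounding-box size, so that the number is comparable across image scales.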
Okayy thankyouu
hey I need advice , I want to rotate a 2D plot as if it were in 3D : along any axis - x,y,z , like we can do with 3D plots .How can I do that ? any suggestions ??
What tools are you using ?
for plot - matplotlib , for widgets - Ipywidgets
Do you want a static image showing the rotated plot or do you want an interactive thing that you can rotate with your mouse ?
latter one
Uhm
As always the most flexible tool for this kind of job is pyvista
You can probably have a rectangle mesh and overlay the plot
But there's a simpler approach even
Can probably be repurposed, you just give it a flat topological map and overlay the plot image
okay, thanks a lot!! I'll look into it and try to understand
is there a meaningful difference between df.column.copy() and df["column"].copy()?
i'd google it but i wouldn't know how to word it
Hello I'm searching for tensorflow for my raspberry pi 5 with python 3.11. I found a GitHub page which has the .whl files and stuff but it's not matching for my pi. It says cp311 but the .whl file is of cp310. Can someone help me find the tf version for me please
Here's the mismatch
if you have python on that system it should be a matter of copying over the correct wheels and install it with pip
Yee I can't find the wheels for python 3.11
The image shows it to be for 3.11 but the wheel file reads as cp3.10
I can see them here: https://pypi.org/project/tensorflow/2.16.1/#files
can you link me the github page ?
i dont really know whats wrong with my model architecture but on the first epoch im at 10% accuracy
with negative losses
90% of the time data is the culprit
5% is weight initialization, and the rest is model arch
for some reason
the system thinks its single class
so it forces me to use BinaryCrossEntropy
im using tfds's Colorectal Histology
be careful with these, always download from pypi and be sure to verify the checksums
the loss is at -300 rn
im gonna see if I cross check their checksum later today, the wheels might be altered
uhm the system ?
when i used categoricalcrossentropy
it said that the y_pred's shape is [None, 0]
so i can't use it
my guess is that the preprocessing went wrong
Heyy thank you soo much. It helps a lot 👍
that's odd, I think the output layer has not been properly setup then, or you're using the loss function wrong
dis is the architecture
InputLayer(input_shape=(IM_SIZE, IM_SIZE, 3)),
Conv2D(filters=6, kernel_size=3, strides=1, padding="valid", activation="relu"),
BatchNormalization(),
MaxPool2D(pool_size=2, strides=2),
Conv2D(filters=16, kernel_size=3, strides=1, padding="valid", activation="relu"),
BatchNormalization(),
MaxPool2D(pool_size=2, strides=2),
Flatten(),
Dense(100, activation="relu"),
BatchNormalization(),
Dense(10, activation="relu"),
BatchNormalization(),
Dense(1, activation="sigmoid")
im kinda sure the output layer is right
and theres a mistake in preprocessing
there's only one category in your output layer tho
InputLayer(input_shape=(IM_SIZE, IM_SIZE, 3)),
Conv2D(filters=6, kernel_size=3, strides=1, padding="valid", activation="relu"),
BatchNormalization(),
MaxPool2D(pool_size=2, strides=2),
Conv2D(filters=16, kernel_size=3, strides=1, padding="valid", activation="relu"),
BatchNormalization(),
MaxPool2D(pool_size=2, strides=2),
Flatten(),
Dense(100, activation="relu"),
BatchNormalization(),
Dense(10, activation="relu"),
BatchNormalization(),
Dense(num_of_classes, activation="that exp() activation that I totally forget the name ._.")
nope
Dense(8) didn't work, it's incompatible apparently
ValueError: Shapes (None, 1) and (None, 8) are incompatible
who's complaining ?
you might still be using binary cross entropy
softmax that's the name, omgosh
you might not need it tho and let it be the raw outputs, at least in pytorch the crossentropy loss accepts logits instead of probabilities
in that case im back to square one
because if i try to use BinaryCrossEntropy with Dense(8) it gives a similar error
df.colname is the same as df["colname"]. (of course, not all column names are valid identifiers, in which case only the latter works)
there's also a minor difference that df["colname"] = ... works for creating a new column but df.colname = ... doesn't and shows a warning, but that's for assignment.
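A quick sketch of both points on a throwaway frame (column names are made up). Reading works either way when the name is a valid identifier; creating a new column only works with brackets.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"price": [1.0, 2.0], "unit price": [3.0, 4.0]})

# Reading: dot access and bracket access return the same column.
same = df.price.equals(df["price"])

# "unit price" is not a valid identifier, so only brackets can reach it.
spaced = df["unit price"]

# Writing: attribute assignment does not create a column, it only sets a
# Python attribute on the object and emits a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df.total = df["price"] * 2

df["total_col"] = df["price"] * 2  # this one really adds a column

print(same, list(df.columns), len(caught))
```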
ty i was going through a quiz, the instructor used dot notation and it annoyed me but i couldn't remember why
seems like it might not be a great habit
no, don't use binary, use the usual crossentropy
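A plain-Python sketch of why the loss choice matters here, assuming the labels are integer class ids 0..7 (the post does not show the labels, so this is only one plausible reading of the negative losses). Binary cross-entropy is only non-negative for targets in [0, 1]; with a target like 7 it can go arbitrarily negative, while softmax cross-entropy over 8 outputs stays positive.

```python
import math


def softmax(logits):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(logits, label):
    # label is an integer class id; the loss is -log(probability of that class)
    return -math.log(softmax(logits)[label])


def binary_crossentropy(p, y):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


logits = [0.1, 0.2, 2.5, 0.0, -1.0, 0.3, 0.0, 0.4]   # 8 raw outputs, made up
multi_class_loss = sparse_categorical_crossentropy(logits, 2)

# Feeding a class id of 7 into binary cross-entropy with a confident sigmoid
# output gives a negative number, and it keeps falling as p approaches 1.
misused_loss = binary_crossentropy(0.9, 7)
more_confident = binary_crossentropy(0.99, 7)

print(multi_class_loss, misused_loss, more_confident)
```

In Keras terms this corresponds to an 8-unit output with a categorical loss; with integer labels the sparse variant avoids the `(None, 1)` versus `(None, 8)` shape mismatch, whereas the plain categorical loss expects one-hot targets.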
It is not my own data set. Just checking it for possible use in my current task.
I'm doing a final test today before diving deep into the CUDA implementation of the forwards and backwards pass.
I'm gonna code a linear layer in cuda and apply backwards propagation to it from rust. It is not clear to me how the torch Function API works, so I need to poke at it til I get it right
Depending on this step tho, I reckon this is all done in a couple days or so, thus marking the beginning of the end of the project
There was a lot more that I wanted to do. I'm still weighing if it's worth it to advance to the second phase I had planned. I'll think about it when I get there tho.
Hello ! I am currently working on a school project that requires me to create an AI for the game of Gomoku which can beat me with as few moves as possible. We are required to use the Minimax algorithm or a variant. For performance reasons, I want to use GPU computing. i am doing the project in Python and would like to know which lib i could use for GPU calculations that would be suitable for my project. thanks!
Minimax can't be optimized with GPU computation.
How could I optimize it? Are there any ways to do it?
Don't worry about optimizing it. First worry about implementing it.
it's implemented, but I need a depth of 10 in less than 0.5 seconds of computation time. I'm currently at a depth of 5.. Can I send you the project repository to see what would be good to change?
You can post it here, and anyone who's available can take a look. I'm about to head out for a while.
hi everyone
Thanks !
https://github.com/farinaleo/gomoku/tree/dev/ here is the repository for the project. The code isn't very clean yet. The important functions are located in ft_gomoku/grid, ft_gomoku/AI, and ft_gomoku/rules. If someone could advise me on how to improve the algorithm and performance
does it have to be in python? do you have an estimate how much faster position evaluation should be to reach your goal?
we can do it in another programming language. We wanted to use C or C++ for the AI and Python for the GUI, etc. we need to be able to compute to a depth of 10 in less than 0.5 seconds each time
What I'm trying to say is that if your program only needs to be ~10-100x faster, then it's about time you should rewrite the slow parts in a compiled language, and that should give you enough of a speedup. Whereas if it needs to be a million times faster, then the problem is with your algorithm and you should focus on that. So it's a good idea to make an estimate of that - e.g. measure how long a turn takes depending on depth for depths 1-7 or so, and extrapolate to depth 10.
Here is a test with 3 depths: 5, 7, and 9. The time it takes for the function to find the best move is displayed at the top, it's : Player 1 reflection time. I tried to make the same moves for each test. I'm not sure if this is what you were referring to. What do you think? ( first is 5, top right is 7 and bottom right is 9 )
I was thinking of evaluating random positions and not the actual gameplay (so that it can be done automatically), but from this... that first long evaluation takes ~1s for depth 7 and ~10s for depth 9, so we should expect about ~30x the time for depth 10 compared to depth 7. And in your video for depth7 some positions take ~15s to evaluate. So we're looking at maybe 450s for a bad enough position for depth 10, whereas you need 0.5s always. So there might need to be some algorithm improvements.
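The extrapolation above can be reproduced in a few lines. The timings (about 1 s at depth 7, about 10 s at depth 9, about 15 s for a bad depth-7 position, 0.5 s budget) come from the messages; the function names and the assumption of a constant growth factor per extra ply are mine.

```python
def growth_per_ply(t_low, d_low, t_high, d_high):
    # Assume time grows by a constant factor for every extra ply of depth.
    return (t_high / t_low) ** (1.0 / (d_high - d_low))


def extrapolate(t_known, d_known, d_target, growth):
    return t_known * growth ** (d_target - d_known)


growth = growth_per_ply(1.0, 7, 10.0, 9)                 # per-ply factor
ratio_10_vs_7 = extrapolate(1.0, 7, 10, growth)          # the "~30x" figure
worst_case_depth_10 = extrapolate(15.0, 7, 10, growth)   # bad position, seconds
required_speedup = worst_case_depth_10 / 0.5             # versus the 0.5 s budget

print(round(growth, 2), round(ratio_10_vs_7, 1))
print(round(worst_case_depth_10), round(required_speedup))
```

A required speedup in the hundreds to a thousand sits beyond what a straight rewrite in a compiled language is expected to deliver on its own, which is why the search itself needs attention too.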
Thank you! I'll work on improving it and try to optimize certain parts. Could you take a look at the project repository and tell me if there are any areas for improvement? For example, whether my method of storing the game state is good or if it can be improved
https://github.com/farinaleo/gomoku/blob/dev/ft_gomoku/data_structure/GameStruct.py#L83-L84
trivial getters/setters like that are not typically used in python (instead people use fields, and if they ever need to turn those getters/setters nontrivial, @property comes into play - unlike languages like Java, where there's no equivalent and so Java people are forced to always use them)
ft_gomoku/data_structure/GameStruct.py lines 83 to 84
```python
def get_player_turn(self):
    return self.player_turn
```
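A minimal sketch of the attribute-then-@property pattern. Both classes are hypothetical, trimmed-down stand-ins, not the actual GameStruct from the repo, and the 1/2 validation rule is only an assumption for illustration:

```python
class GameStruct:
    """Plain attribute: callers read and write game.player_turn directly."""

    def __init__(self, player_turn):
        self.player_turn = player_turn


class ValidatedGameStruct:
    """Same public attribute, but access became nontrivial later on."""

    def __init__(self, player_turn):
        self.player_turn = player_turn  # goes through the setter below

    @property
    def player_turn(self):
        return self._player_turn

    @player_turn.setter
    def player_turn(self, value):
        # Assumed rule, for illustration only: players are numbered 1 and 2.
        if value not in (1, 2):
            raise ValueError(f"invalid player: {value!r}")
        self._player_turn = value


game = GameStruct(1)
game.player_turn = 2           # no get_/set_ methods needed

checked = ValidatedGameStruct(1)
checked.player_turn = 2        # calling code looks exactly the same
```

The point is that switching from the first class to the second doesn't change any calling code, so there's no reason to write the getters up front.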
ft_gomoku/data_structure/GameStruct.py lines 108 to 109
```python
random_player = random.randint(1, 2)
self.player_turn = self.player_1 if random_player == 1 else self.player_2
```
in the actual minimax or the grid impl, I don't see anything obviously bad, though.
Thank you! I'll consider your advice and optimize my code. thanks
Does anyone have code to partition a singly linked list in Python?
What do you mean by partitioning code ?
Input
3,5,8,10,2,1. X = 5
values < 5 are 3,2,1
values >= 5 are 5,8,10
Output
3,2,1,5,8,10
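Going by that input/output, this looks like a stable partition of a singly linked list around X. A minimal sketch under that reading (the `Node` class and the helper names are made up here, not from any library):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_list(values):
    """Build a singly linked list from a Python list."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(head):
    """Collect the values of a linked list into a Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def partition(head, x):
    """Move nodes with value < x before the others, keeping relative order."""
    less_head = less_tail = Node(None)   # dummy heads simplify the appends
    rest_head = rest_tail = Node(None)
    node = head
    while node is not None:
        if node.value < x:
            less_tail.next = node
            less_tail = node
        else:
            rest_tail.next = node
            rest_tail = node
        node = node.next
    rest_tail.next = None                # terminate the second chain
    less_tail.next = rest_head.next      # splice the two chains together
    return less_head.next


result = to_list(partition(from_list([3, 5, 8, 10, 2, 1]), 5))
print(result)
```

It runs in one pass and reuses the existing nodes, so no sorting is involved.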
merge sort ?
No
Did u learn DSA with Python
didn't really learn it in any particular language
your descriptions are not very clear, so I'm throwing guesses
do you mean to be asking this in #algos-and-data-structs ?
Bro what do u do
also @lapis sequoia you might want to clarify your question more. provide enough information up-front that someone can answer the question without interviewing you.
No one's replying there
I program, particularly I do ML
Nice bro
if you asked the question the way you asked it here, you can't reasonably expect a helpful reply. chatrooms are somewhat asynchronous. especially in a big server like this one, you need to approach asking questions like on a forum: state your full question, with sufficient supporting detail for someone to answer it intelligently.
actually that holds true on small servers as well
Ok bro
Hi guys, im gonna choose my major in my college tomorrow. I want to continue my path as ai engineer but i can only choose between software programming and information system. Which suit best for my career guys
which parts of the university are those programs part of? at my university, information systems was part of the business school. if that's the case, you don't want to pick that one
Hi everyone,
I am in search of a course or content through which i can learn Statistics for implementing it in Python afterwards.
can anyone please suggest me the same??
Hi there, personally I've learned basic statistics from that guy: https://www.youtube.com/watch?v=zRUliXuwJCQ&list=PLZoTAELRMXVMhVyr3Ri9IQ-t5QPBtxzJO
Statistical methods are mainly useful to ensure that your data are interpreted correctly. And that apparent relationships are really “significant” or meaningful and it is not simply happen by chance. Actually, the statistical analysis helps to find meaning to the meaningless numbers.
Python AI Tech news : https://www.youtube.com/playlist?list=...
Josh Starmer arguably makes better videos tbh
Statistics, Machine Learning and Data Science can sometimes seem like very scary topics, but since each technique is really just a combination of small and simple steps, they are actually quite simple. My goal with StatQuest is to break down the major methodologies into easy to understand pieces. That said, I don't dumb down the material. Instea...
hi guys, I'm looking for a solution / guide for deep Q-learning on the CartPole problem. I have looked at some examples but still don't really know how to implement it in code. any suggestions would be great, thx
You can get a textbook, something like this https://www.amazon.co.uk/Essentials-Business-Statistics-Bowerman-Professor/dp/0077641205
In terms of a course focused on Python, I found "Python Statistics Essential Training" by Matt Harrison on Linked In Learning to cover the basics pretty well, with some useful python tips and pandas 2 updates. https://www.linkedin.com/learning/python-statistics-essential-training-19258005/being-a-python-statistics-mvp
hi, sorry if this is kind of a dupe but i'd be glad if you guys could help with a problem i stated here https://discord.com/channels/267624335836053506/1219215713827553332
i ask here because i think it will reach more of the people that could help me, thanks
Hey, how do I do the data preprocessing? I have my book but I'm still a little dizzy about EDA and data preprocessing. And the one I'm very confused about is manual feature selection. How do I select features?
I think a lot of people just recommend not doing manual feature selection at all and leave it to regularization
manual feature selection is only a way to understand how feature selection works, but in applied AI, you will want to use tools that help you with feature selection
you can read about some of it here for instance : https://scikit-learn.org/stable/modules/feature_selection.html
(hm, might be a bit too technical, idk if someone has a link to a good tutorial ?)
Can anyone help here?
#1219227611293548634 message
Hey guys, what are the best performing open-source LLMs that can be fine-tuned on specific domain data? I also need it to have better reasoning. Will any open source LLM with good fine-tuning be able to reason as well as GPT-4 or Claude 2?
fine-tuning doesn't really help "reasoning" in a general sense, at least not unless you have an absurdly large amount of data
Why would fine tuning make it better than the heavy weights tho ?
I have no idea whatsoever about what your data is like, so it could fall anywhere from "non-tuned gpt4 fails terribly but fine tuning gpt2 works perfectly" to "gpt4 works perfectly as is, open source (even fine tuned) fails terribly"
I'm trying to think of something that would fall on the first case
Math is the most obvious one, but math would be hard regardless right
ridiculously simple things on languages not covered by gpt4's training data
Which ones ?
Oh you mean languages it doesn't know
Yeah that would be an example
I am trying to train the model for a specific domain of data. Let's say I need to train the model on medical data and pass all the medical exams. I am sure GPT-4 cannot be great at this right?
And will the benchmarks of the fine-tuned model exceed GPT-4? At least on the medical exam benchmarks?
if you are going to work with something as serious as medical data: Don't ask for advice in a random discord server, do proper research based on published papers
Could be a school project
I recall having a conversation about this a couple months back
And they told me it can be hard to do that
Look into Lora
No idea, haven't tried it myself
Is it a training data issue (the size of the data needed), or is the underlying model itself not as capable as the closed-source models?
Uhm, I think it's just easy to destroy the information since the model no longer has a signal coming from the huge dataset it was trained on
But this is just a guess
There is this technique where you freeze the entire model and instead train extra weights that get added on top
hi
I have a lab evaluation this week in which I have to perform EDA on a database and provide some plots of it. Can anyone tell me where I can learn data visualization? It would be a great help
thank you
Understood I will try this out. But do you have any idea of the minimum size of the datasets needed to make a huge difference in reasoning and output?
Also, I've previously tried QLoRA; I need to dive deeper and see how it does with much bigger datasets.
I have no idea tbh, never tried tuning an LLM. Til now I've been training transformers on simple tasks like text classification.
Understood, but still thanks for your thoughts and information.
BTW is there anyone in this group who might have an answer for this?
Maybe you can tag them on my question.
My instinct tho, would be to focus on the MLP layers, adding capacity to them somehow. The MLP has been found to be where they store factual information.
Adding information sounds plausible to me.
Improving reasoning, I'm not so sure because it's something more abstract and not particular to any domain.
I think there's at least one NLP expert around
This makes sense, I guess I need to try it out to see the way it performs before coming into a conclusion. The reasoning part is where I am kind of stuck, but lets see.
My advice really, is to check the literature first. I've lost so much time just trying out stuff that I could've just looked up.
And if no one has tried it yet, well it's a potential paper
Yeah this is so true, probably will do more research.
i am a student and recently i learn this in uni
data -> linear regression -> y [Rd -> R]
data -> principal components [Rd -> Wk]
now i wonder can we do this ?
data -> principal components -> linear regression
I am wondering can these concepts be combined 😅
Thnxx
But I think he has advanced concepts covered
I need a starter program for statistics in Data Science
iirc it should be there, you have to look for it
Try going to the playlists
linear regression can be interpreted as an orthogonal projection onto the row space of a linear operator. that projection is done with the singular value decomposition, same as PCA, just without subtracting the mean
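Chaining the two is usually called principal component regression. A minimal NumPy sketch, assuming centered data and made-up synthetic inputs (the name `pcr_fit` is hypothetical, not a library function):

```python
import numpy as np


def pcr_fit(X, y, k):
    """Project X onto its top-k principal components, then fit least squares."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                        # PCA works on centered data

    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                           # (d, k) projection matrix
    Z = Xc @ W                             # (n, k) component scores

    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    beta = W @ gamma                       # coefficients back in the original space
    intercept = y_mean - x_mean @ beta
    return beta, intercept


# Synthetic, noise-free data just to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_beta + 4.0

beta_full, b0_full = pcr_fit(X, y, k=5)    # k = d: same as ordinary least squares
beta_2, b0_2 = pcr_fit(X, y, k=2)          # k < d: regression on a 2-d subspace
```

With k equal to the number of features this reduces to ordinary least squares; smaller k trades some fit for fewer, decorrelated inputs.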
Hi, I have a data set with 45 variables. A few of them are categorical. One categorical variable comprises 133 categories. The majority of the 133 categories qualify to be handled by an ordinal encoder. However, a few of the 133 categories have a good chance of carrying special meaning: zero, any, a/n. One more could fall on either of the two sides: special meaning or ordinal category.
A question regarding 3 categories of special meaning - any approach patterns exist to apply in such a case?
500 records among 175 thousands have one of those 3 special intent.
Eventually a source which could help to make progress in this matter?
what is "a/n"?
No idea, no hints in data set documentation, unclear if data set authors will respond. My thoughts run towards n/a for “not available”, alternatively “not applicable”.
Can someone please help with this question - https://stackoverflow.com/questions/78181604/how-to-subtract-pandas-columns-for-specific-groups-in-a-multi-index-dataframe
lets goooooo
aight, final steps
that's not a full impl yet, it's just a test layer, I gotta replace the core cuda kernels with the actual attention mechanism
hello fellas, I need some help with my program, where I get the problem CANNOT RETRIEVE DATA FROM THE WEB: https://pastebin.com/0aK5hGjA
help me where the problem is , whatever is missing in your opinion
is there a library that abstracts model and API interaction, so that I can freely switch between OpenAI, Claude or local models with minimal programming?
uhm not that I know of, how different are they ?
Ollama seems to use a very similar API to OpenAI
not that different, but it would be nice if someone already did part of the abstracting
Might be one already, I'm unsure how locked in people will get to their APIs.
I think it's super risky to build a product on top of open ai or anthropic
Your dataset looks imbalanced
im doing it for a school project
it's about stroke and there are a lot without stroke and few with
What I'm saying is that you have waaay more true labels than false labels
So your model likely figures that it's just easier to guess all trues and it will still be right most of the time
Wait no the other way around, it's the false label that has more stuff
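A small sketch of why that matters, using made-up labels with a 95/5 split (not the actual stroke dataset). The weights follow the common "balanced" heuristic, n_samples / (n_classes * count):

```python
import numpy as np

# Made-up labels: 95 negatives (no stroke), 5 positives (stroke).
y = np.array([0] * 95 + [1] * 5)

# A "model" that always predicts the majority class.
always_negative = np.zeros_like(y)
accuracy = (always_negative == y).mean()          # high accuracy...
recall = (always_negative[y == 1] == 1).mean()    # ...but it finds no positives

# Balanced class weights: rarer classes get proportionally larger weights.
counts = np.bincount(y)
class_weight = {c: len(y) / (len(counts) * n) for c, n in enumerate(counts)}

print(accuracy, recall, class_weight)
```

A dict like this can be passed as `class_weight` to Keras `model.fit`, and looking at recall or a confusion matrix instead of plain accuracy makes the problem visible.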
Is it an image dataset ?
But it says 0 there, am I reading wrong ?
Uhm I guess knn could split these
I've really never used keras API to do knn
How are you encoding the data
Uhm yeah, you need a better dataset and idk if it's reasonable to expect a generalization from a simple table dataset like that
i cant find a good data base
I'm sure things like age are predictors of someone having a stroke, but it won't be a perfect thing in the end right

Just cuz you're 80 don't mean you'll have a stroke
yeah
Theres a ton other factors like how healthy the person was, were they smokers, did they quit if they were
Etc
I think the best you'll get is potentially an overfit
can you maybe recomend me a database?
Or a fit to a prob dist
i will just do a new one
Uhm, not really, search through papers with code
You might also find useful information about how people are trying to solve this problem
I don't want to disturb you, but are there LLMs who think, that is, think properly about an action that they want to do?
Comes down to how you personally define "thinking"
Which might not be easy to define in the first place
Well, they should think about it properly, for example with arguments etc.
i just want to know if i need a new data base or no
An LLM can produce text that when you read it, it looks like it has a reasoning behind it.
You do
🦶
If you adopt a pragmatic way of thinking, and define these concepts operationally, then LLMs can think
I don't want to sound like a professional (I'm not one) but I think I've managed to modify a T5 model in a way that makes it think...
Hello guys I am newbie and just starting out with python and I want to create my own AI model and train it on different things so that it can perform productive and time taking tasks
I just wanna know what's the first step and is there any preferred ytuber from u guys
Any help is appreciated
Thank you
Uhm, try pytorch or keras
They have good documentation, so does hugging face
Sure I will look into it
Define what it means to think. I reckon gpt4 thinks according to your definition.
I am also planning on making a discord chatbot which can answer some questions and all
And also a bot game like dank memer bot if you know but a little bit simpler
I just wanna know if python will help me do it or do I have to learn another programming language
Like it will use some AI I believe as it's a Chatbot but I'm not sure tho
For me, thinking in relation to an AI means things that it generates independently. So he wasn't told how to think, he created his own consciousness and logic.
And i made it with a T5 model....
Ig I personally believe something "thinks" when they are somehow able to have the experience of language. Like when you think, you are experiencing it, like when you look at the sky and experience the color blue. Idk at which point a bunch of electrons moving about in a particular way "wake up", so ig I just don't know if LLMs think
I suppose gpt4 thinks according to that definition
So my T5 model has just created its own language and logic in which it thinks.
At its best, It very clearly joins concepts it has never seen before and produces something novel
But it's not thinking in the way that I think unless an experience accompanies it
So, in addition to what it thinks, it generates an output.
Can it argue for its own consciousness?
I don't understand exactly
Can you get it to try to convince you that it is awake
I tried that w/ gpt but it either says no or it is not very convincing
It is based on a very small data set, and in German, you can be happy if he explains why he lied.
*you can read it in its thoughts
LLMs (of the giving type I assume you are talking about), have feedback loops, however, they often lack an internal loop and so to get them to """think""" step by step, you can often explicitly prompt them to do so ("please give me all the step by step"). Since each step is now smaller and part of a feedback loop they can do some limited reasoning. This falls under Cybernetics if you want to learn more on this aspect of AI.
It is expensive, there are also tree versions (basically tree search like in a chess bot).
Which are really expensive.
The trick is to have cheap internal loops.
These loops can even have hard coded logic, for example, maybe one module calculates/simulates a specific differential equation / control system.
No, my model can think by thinking about what it wants to generate as the final output. He does this in a language he has made himself, over which he has no influence during training.
I think things will get super crazy when these loops get cheaper to do, especially with multi modal
Yes, that is still the same thing, see Neural GPU or others like it (based on the Neural Turing Machine (NTM) branch of neural networks).
(They do arbitrary symbol manipulation for algorithms)
There are also Neural Proof Machines, which are like a simplified NTM.
Just forget it, my English is too bad to translate and you won't understand it anyway...
This is a part of what I work on.
Now I don't feel like working on it anymore ._.
Historically the winning method in AI/ML has always been to have some basic unit that you can parallelize and spam a bunch of, individual neurons are kind of like level 0, this is like level 1 (a bunch of recurrent networks combined).
Pretty cool, do you do research or is it industry work ?
Brute force search has been working out pretty well, isn't that what alpha go does
Both.
The current methods are taking level 0 to the extreme. But it's scaling up the wrong thing.
It's why we get something that feels kind of smart, but not really.
Either that or it just needs more GPU
Well, it's a scaling problem, in the same way of using tick marks vs a modern numeric system.
I can just add more ticks (GPUs), but that is linear.
Much better is a better numeric system.
(Except it's worse than linear because communication overhead between GPUs)
That does make sense, but ticks have a winning streak rn and the hardware will keep on getting better right
Yes, but if you plan for today, you will find yourself stranded in the future.
I've started cuda programming btw, so Im biased now
Well, you will still need that.
There is a base level of compute needed, even with way better scaling.
I do strongly believe that hardware dictates where stuff is gonna go. Like, if we want better models we need better hardware, even if that means something totally different like wetware or quantum chips
Yes, hardware is the most important factor.
Software is about making the least worst use of the hardware.
I don't think we need quantum or anything like that though. It's just the wrong architecture.
Von Neumann is not it for what we are trying to do.
(Brains are not Von Neumann)
GPUs just so happen by chance to work well with Deep Learning, kind of lucky (that they already exist / work well for it / mass produced for lower cost).
MCTS
*Search is good though, yes, and will probably make a huge comeback now that learning/generation is decent. It's the other half of it all (search / learning).
(It's also a major thing that is missing causing stuff like hallucination, lack of explanation / reliable reasoning (which is what a lot of people really want out of the LLMs, but they can't provide (well/cheaply) in their current form))
hello everybody, does anyone know if there is an ai that takes a code and score that based on the accuracy, a code corrector ai
Hello guys, I am a first year computer eng. student trying to progress on the path of data scientist.
Do you have any websites or courses that will help me on this journey?
What do you think about the roadmaps on Roadmap.sh?
Quantum is promising tho, if you can realistically construct universal function approximators with qbits
Would probably beat the human brains computational power
But the pattern I observe is, the thing that prevails and will work, tends to be the thing that either packs more flops in a unit of volume, or the thing that somehow makes better use of the flops per volume
Regardless of architecture or any detail really, if you had asked me b4 getting into this stuff a couple years ago, I wouldn't have guessed that the GPU was the current holy grail of AI, it's easy to see in hindsight, but it's not obvious
Does anyone here have experience with pygam and patsy? I'm new to using them and the documentation is breaking my head
what about it specifically?
did you guys see the new devin ai 💀
i doubt it’ll replace software engineers but seeing how much ai has evolved it just doesn’t promise anything good for us
I work in the language technology department of my company, and we're swimming in it
meaning what?
We have lots of funding coming in and ideas for what to do.
oh, is that a counterpoint for not promising anything good?
If AI is going to reduce the number of programming-related jobs, the ones that involve developing those technologies will, if nothing else, be the last to be completely eliminated.
This happens to be somewhat related to my previous discussion here. Devin effectively uses another external loop (making the LLM the inner loop).
@final kiln Related to previous: https://arxiv.org/pdf/2403.09629.pdf
(Inner thought loop concept (not a new idea, but we have better tools for it now))
Even just 1 inner/outer loop pair seems to make it way more effective, so the question is what happens when you have N of them and in a hierarchy... #data-science-and-ml message
it's just not as comprehensive as I would like. TensorFlow is my personal favorite
Hi everyone, currently dealing with the OCR with google vision. I found that the text recognition doesn't have a fixed pattern. Let's say, the document A, I got the "First Name" right above the user input in a form, and the OCR can detect it correctly. However, when I tried to use a similar format document B, it failed to detect the first name.
Is there any other way to detect a form with OCR? Or am I using it wrongly? btw, I'm actually using nodejs for this, but I don't think it really matters tbh.
What I currently do is trying to extract the content I want from the detected text. For example, I will get the value of first name from a string - "First Name: ${firstname value here} Last Name"
L40 spot GPUs are 70 cents an hour
why do people even buy GPUs for AI
federate your training and do one of these for inference, smh
That's extremely expensive
Wait what are the specs on L40
48gb, that's extremely cheap
Yeah looks interesting. I'm of the opinion that language models are a very small piece of the puzzle. They emulate the language center in the brain, I'd argue that you need one network per major area of the brain and seamless, preferably async, integration between them
I bet if I perform a Fermi estimate for the number of parameters in the broca, I'd get more or less the number of parameters in gpt4
They are just one part, but they are very convenient to train / collect data for.
yeah and when you're doing inference you can pause the instance whenever you're not using it
at which point you only get charged for disk space
then kill the instance when you're off the computer
add a hook to your computer's boot so that it starts the instance paused
so it's actually way cheaper than 70 cents an hour
this whole thing is pure llmops
I'm on like 12cents an hour for 16gb of GPU, the price changes from time to time
AWS?
AWS Spot, Milan
nice!
I think Ohio and some of the Asian regions also have cheap GPU
for training or inference? or both?
hm right I would also go with something cheaper if it is for training
One thing I really wanna include in my language ecosystem is solutions for increasing accessibility to AI, like tools to provision cheap or even free GPU, but idk how feasible that could be
Most people don't have a GPU, or the means to buy one, and in most cases even if they do, it's not the most cost effective way to go about it
I do recall an idea I had a year back when LLMs started getting big
Which was to construct a block chain type of thing that would somehow use ML compute as proof of work
Perhaps I could have a feature where you could opt in to share a small percentage of your compute to other users of the language while yours is idle
Each person would get an equal percentage of the available compute
Don't know
i think this is the idea behind tulip and kobold swarm
For a categorical variable, at what cardinality threshold should one stop considering one-hot encoding a good choice?
If such a threshold exists, is it fixed, or does it depend on other conditions, e.g. the total number of variables or the number of categorical variables?
I'd imagine that the limit will be related to the available compute and memory. Each class is an additional dimension and you can always have a feed forward projecting it down to a lower dimensional space, once that becomes hard, that would be the limit
Never heard of them
I think the closest I've seen is vast ai, where you can rent your GPU
Depending on the commission, it might actually make it worth it to buy one now that I think about it
Hello everyone, where's the best place to find help with CUDA libraries and NVIDIA drivers? I've got a Tesla P40 on Ubuntu and can't seem to get pytorch to recognise my gpu. I think I have a mismatch in drivers. nvidia-smi works and picks up my gpu
run `nvcc --version`
check that the version of cuda matches the version the pytorch you installed was built for
Ah right ill check that
honestly I think im gonna write a small utility that checks peoples installation and tells them what to do, cuz these driver issues are so common
I vaguely remember doing this but ill double check. I've definitely done gcc --version. Ive been following nvidia docs and alexa gordic yt vid
pytorch should totally do it tho, idk why they dont give feedback for it
not gcc, nvcc
tho nvcc is not strictly needed
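A minimal sketch of the PyTorch side of that check. The helper name `cuda_report` is made up; the attributes it reads (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name`) are regular PyTorch ones. It only reports what the installed build sees and doesn't diagnose driver problems by itself:

```python
def cuda_report():
    """Collect what the installed PyTorch build knows about CUDA."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this build was compiled for; None for CPU-only builds.
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_0"] = torch.cuda.get_device_name(0)
    return report


report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `built_for_cuda` is None, a CPU-only build got installed; if it has a version but `cuda_available` is False, compare that version with what `nvidia-smi` says the driver supports.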
Sounds awesome
Is there an elegant way to select from a pd.DataFrame all rows that contains np.nans in a specific subset of columns?
E.g. I have a DataFrame with the columns a b c d e, I want to select all rows that contain nans in b c or e
this is the output coming from my layer on the forward pass, I gotta figure out how to get it through the gpu in an efficient manner
c runs over the sequence of embeddings, k runs over the embedding dimension
when the dot product is occurring between the same embedding, it's possible to simplify it cuz the p's commute and M is symmetric
I should be able to just place them in memory and calculate all the coefficients at the same time right ?
there's only gonna be one copy of p and M, and M will be a 1d array, so I'll just need to setup clever indexing and the right combination of blocks and threads
df: pd.DataFrame = ...
cols = ["b", "c", "e"]
has_null = df[cols].isna().any(axis="columns")
df_has_null = df.loc[has_null]
note that np.nan is kind of just a hack placeholder for "null" values. you can have non-numeric data with other forms of "null" that are not np.nan. if you specifically want to check for floating-point NaN ("not a number") you can use np.isnan instead of pd.isna
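A small self-contained version of the above on a made-up frame, also showing where `np.isnan` and `pd.isna` differ:

```python
import numpy as np
import pandas as pd

# Made-up data with columns a..e; "c" holds strings with a None in it.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [np.nan, 5.0, 6.0],
    "c": ["x", None, "z"],
    "d": [1, 2, 3],
    "e": [7.0, 8.0, 9.0],
})

cols = ["b", "c", "e"]
has_null = df[cols].isna().any(axis="columns")
rows = df.loc[has_null]                 # rows with a missing value in b, c or e

# pd.isna treats None as missing; np.isnan only accepts numeric input.
none_is_missing = pd.isna(None)
try:
    np.isnan(None)
    isnan_accepts_none = True
except TypeError:
    isnan_accepts_none = False

print(rows)
```

So `pd.isna` is the safer default for mixed-type frames, and `np.isnan` is for when you really mean floating-point NaN.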
!d pandas.isna
```
pandas.isna(obj)
```
Detect missing values for an array-like object.
This function takes a scalar or array-like object and indicates whether values are missing (`NaN` in numeric arrays, `None` or `NaN` in object arrays, `NaT` in datetimelike).
!d numpy.isnan
```
numpy.isnan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature]) = <ufunc 'isnan'>
```
Test element-wise for NaN and return result as a boolean array.
!d pandas.DataFrame.any
```
DataFrame.any(*, axis=0, bool_only=False, skipna=True, **kwargs)
```
Return whether any element is True, potentially over an axis.
Returns False unless there is at least one element within a series or along a Dataframe axis that is True or equivalent (e.g. non-zero or non-empty).
ty!
I just wanted any value that represents missing, so I'll stick to .isna
good, that's what i would recommend in general
On a tangent, I want to make sure that this isn't leaking right(Kaggle Titanic)? I pass this to a FunctionTransformer
```python
def get_titles(df: pd.DataFrame):
    df = df.copy()
    df['Title'] = df['Name'].str.extract(r".*?, (?P<Title>.*?)\.")
    df.loc[df['Title'].isin(['Mlle', 'Ms']), 'Title'] = 'Miss'
    majority = pd.Series(['Mr', 'Mrs', 'Miss', 'Master'])
    sub = ~df['Title'].isin(majority)
    df.loc[sub, 'Title'] = np.where(df.loc[sub, 'Sex'] == 'male', 'Mr', 'Mrs')
    return df
```
If it is I can't tell how it's leaking
I ask because if I include this Title feature, then the models get overly optimistic cv_scores
There's another feature that's just the Title frequency encoded, but that one doesn't mess stuff up for some reason
guys why does this happen?
```
ValueError: logits and labels must have the same shape, received ((None, 23, 23, 1) vs (None,)).
```
the training dataset shape is (110300, 50, 50, 3) (110300,) so why exactly does this happen?
The last layer in your model outputs the incorrect shape.
Same answer, post the code if you'd like more detail
def create_model():
    model = Sequential()
    model.add(Conv2D(64, kernel_size=3, input_shape=(50, 50, 3), activation="relu"))
    model.add(BatchNormalization())
    model.add(Conv2D(64, kernel_size=3, activation="relu"))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Dense layers
    model.add(Dense(512, activation="relu"))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(256, activation="relu"))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(1, activation="sigmoid"))
    # Compiling the model
    adam = Adam(learning_rate=0.0001)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=["accuracy"])
    return model
You forgot to shape it into a line before feeding it into the dense layers
So the dense layers are acting on the last channel, which is why it's turning out 1
No, you need to shape the output of the first part of your model into shape (batch, wtv number needed here)
The number needed will be 46
Uhm, potentially, idk what's the value of the last dim, since it has been projected to 1
Not necessarily because the conv layers can add more channels
in that case a Flatten() should work
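A quick sanity check of the shapes in plain Python, assuming the Keras defaults for the model above (conv padding "valid" with stride 1, pooling stride equal to the pool size). The helper names are made up:

```python
def conv_valid(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool(size, window):
    """Spatial size after non-overlapping pooling."""
    return size // window


side = 50
side = conv_valid(side, 3)   # first Conv2D:  50 -> 48
side = conv_valid(side, 3)   # second Conv2D: 48 -> 46
side = pool(side, 2)         # MaxPooling2D:  46 -> 23
channels = 64                # filters of the last Conv2D

feature_map_shape = (side, side, channels)
flat_features = side * side * channels   # what Flatten() would hand to Dense

print(feature_map_shape, flat_features)
```

Without a Flatten() the Dense layers only act on the last axis, so the 23x23 grid survives and the final Dense(1) produces (None, 23, 23, 1), which is the shape in the error. Adding `model.add(Flatten())` right after the MaxPooling2D layer gives Dense a (None, 33856) input instead.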
you also never wanna have an explicit kronecker product (if that's what your \otimes means)
It's the tensor product of two unit vectors
unit as in unit norm or canonical basis vectors?
Orthonormal, they're just there to make it a matrix
but generic orthonormal, not sparse orthonormal, right?
I'm not brushed up on the terminology, ei dot ej is the Kronecker delta
i ask because you can consider multiplication by a rank 1 matrix as an orthogonal projection, so you should be able to replace the whole thing with a handful of clever dot products
then the matrix there has only one 1 in it and you can perform the product by indexing
so, canonical basis vectors
Not sure I follow
I can leave it implicit by hiding the sum and the product
the tensor product of two canonical basis vectors is a matrix that is zero everywhere except for 1 element
and multiplying a matrix by another one from the right takes a linear combination of its columns
so any matrix times (ej tensor ei) yields a matrix that is zero everywhere except for one column
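A small NumPy check of that statement, with an arbitrary made-up matrix and indices, comparing the explicit product against plain indexing:

```python
import numpy as np

n = 4
A = np.arange(n * n, dtype=float).reshape(n, n)   # arbitrary example matrix
i, j = 1, 3

e = np.eye(n)                      # rows are the canonical basis vectors
E_ji = np.outer(e[j], e[i])        # ej tensor ei: a single 1 at (row j, column i)

explicit = A @ E_ji                # full matrix product, mostly multiplying zeros

indexed = np.zeros_like(A)         # same result with no multiplication at all
indexed[:, i] = A[:, j]            # column j of A lands in column i

print(np.array_equal(explicit, indexed))
```

So in a kernel the outer products never need to be materialized; each term is just a write to one known location.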
Yeah makes sense, I'm summing over the basis vectors of a 2d matrix
all of the math you have there falls under the category of "nice on paper, terrible if you code it like that" :p
How so ? I'm gonna compute it in parallel
I'm still gonna give it some more thought cuz there's symmetry going on
even before symmetry, almost everything is immediately zero
without any matrix multiplication at all, you can build the result by just indexing the matrix on the left
Im very confused tbh, maybe I have some misconception, but the factor multiplying the tensor product is the (c, c') element of a 2d matrix
Stuff you learn in physics class tends to be very hand wavy so I wouldn't be surprised I've done a couple crimes against maths to get to the answer
i mean, none of it is wrong
just that almost everything is zero
no need to explicitly compute a bunch of zeros
I'm not gonna compute the tensor product, it's for visual guidance only
ok, that's what i meant. you'd wanna directly index into another matrix and just assign the value there
Yeah that's what will happen, the tensor products help me keep track of the indexes, ideally I wouldn't need them but I'm rusty
Neither the sums
I never did understand covariance and contravariance, so I don't do upper indexes >.>
😛
I'm almost 90% sure no one does and everyone just memorizes the index rules
the easy rule is "row vector covector, column vector vector"
you can do einstein notation only with subindices if you're very explicit though
Uhm, I do try that, but then I see sometimes matrices are upper down, other times down down, and then I'm just confused. I know it has something to do with the dual space of vector spaces, where linear functions live, but then I'm studying that stuff instead of getting stuff done, so I kinda move on
if you stick to the golden rule of "all vectors are column vectors", then row vectors form the dual space
then the row vector covector idea works 😛
But then dual space is also a vector space
that's exactly the case
It's confusing
why? that's a true statement
Because, who's the dual of the dual ?
in finite dimensional vector spaces, that's isomorphic to the original vector space
and the map between them is the transpose
So, like, theres no difference between a vector and a linear map, so the distinction doesn't look useful to me
in finite dimensional spaces, exactly so
except that if you arbitrarily pick one of them to be "vectors", the other is always covectors
this is super handy
But in physics there's a huge difference between them
for example, it means gradients are vectors and jacobians are linear functionals, which is exactly how one treats it in differential geometry
and it makes taylor expansions easy to interpret: we take differential forms of a function
One represents a real quantity, and thus transforms against the coordinate system, the other is a feature of the coordinate system and thus transforms in accordance to it
same here
and in both cases, the vectors and covectors form separate vector spaces
the dual space is exactly the vector space of linear functionals
in physics too
it's a separate vector space and you can assign it the meaning you like
Okay, but here what decided what is a co vector was where I decided to put a coordinate system
Cuz if I put a coordinate system in the space of functions, then the other one has the other meaning
you can always apply a change of basis, what's wrong with that?
yeah, it just depends which one you want to study
that's just as true in physics as elsewhere
In physics it does not
certainly so
Some things are fundamental
You can't make them transform with the coordinate system
Whereas here is up to our choice
That's a fair take actually
you usually don't do it because it's not immediately useful, not because it's not possible
"hey, check out this ???? i got? lmao"
So in the case of the metric tensor, you get a down down indices right
Cuz it's a bilinear function
No, up up
I forget
yeah they're doing down down in gr
oh and then it comes this stuff
it's a complete rabbit hole once you click on the "one-form"
that's where i was going when i said to treat the jacobian as a linear functional
we call dx a little change in x, and you guys took it and built a giant castle of abstraction
I still get triggered cuz one of the professors in the math department decided that differential forms, manifolds and topology were appropriate for real analysis of 2nd year physics and engineering students, decided to invent his own notations, use PT instead of english nomenclature and allow nothing other than his own notes to be studied
Hey there! I'm fine-tuning and training both a YOLO model and an SSD model for object detection. I'm training them on a high-end GPU. However, I need to use these models on a CPU-only device. So, my question is: Is it okay to train on a high-end GPU, export the weights, and later use them with a CPU?
Hey all, I've got a cube laid out with each of its 6 faces. Each step I'm adding some color and doing a gaussian blur (see gif). Because I'm doing each face separately, it causes discontinuities along the cube edges. I'm looking for speed/correctness suggestions I can try
for the edges it's easy: you can consider copies of the sides of the cube
the corners are tricky and there's no unique way of treating them
I saw an example saying you can use an Adjacency matrix (sparse). But it strikes me as probably going to be slow
should be fast if you implement it as a sparse matrix
defining the adjacency matrix for the pixels will let you do a graph convolution, which should be equivalent to considering copies of the faces as i suggested. i'm not sure what it'll do about the corners though
Ok I will try
seems like I can get it to blur using adjacency matrix on this test image at about 26 Blurs Per Second
My other image is bigger and can do ~53 blurs per second. Any optimization tips?
Maybe define dtype?
i'm a little surprised, the laplacian should be both symmetric and sparse
I can share the sample code I'm working from
you can also factor out the laplacian polynomial into a single matrix and multiply that by x
I'm not sure what this means.
This is my blurring code:
blurred_image_sparse = adjacency_matrix_sparse.dot(blurred_image_sparse.flatten()).reshape(image.shape)
And the adjacency matrix generated like this:
adjacency = lil_matrix((num_pixels, num_pixels))
# Define adjacency based on spatial proximity (e.g., 4-connectivity)
for i in range(height):
    for j in range(width):
        node = i * width + j
        adjacency[node, (node - width)%num_pixels] = 1 # Connect to pixel above
        adjacency[node, (node + width)%num_pixels] = 1 # Connect to pixel below
        adjacency[node, i*width+ (j - 1)%width] = 1 # Connect to pixel on the left
        adjacency[node, i*width+ (j + 1)%width] = 1 # Connect to pixel on the right
this is correct. and how did you make it sparse? did you use scipy sparse coo matrix?
from scipy.sparse import lil_matrix
ah that's what that is. lemme read how it works, one second
Hey man just a heads up your advice worked. Thank you!
here we go
"slow matrix vector product"
Aha, what should I use instead?
try one of the ones it recommends there, csr or csc. i usually use coo. try them out and see what's easy to work with
I guess COO? (not sure what it is)
coo is coordinate array. it stores the matrix as tuples (row, column, value)
Ok I'll try a few things
the other flavors do things like storing indices of rows or columns with nonzero elements, then store those vectors completely. more efficient if the nonzero vectors are dense
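A minimal sketch of the suggested switch, assuming a small made-up image size: keep `lil_matrix` for filling in entries one by one, then convert once to CSR before the repeated matrix-vector products. The divide by 4 is an assumption here that turns the neighbour sum into an average.

```python
import numpy as np
from scipy.sparse import lil_matrix

height, width = 8, 8              # small hypothetical image
num_pixels = height * width

# LIL is convenient for assigning entries one at a time
adjacency = lil_matrix((num_pixels, num_pixels))
for i in range(height):
    for j in range(width):
        node = i * width + j
        adjacency[node, (node - width) % num_pixels] = 1   # pixel above
        adjacency[node, (node + width) % num_pixels] = 1   # pixel below
        adjacency[node, i * width + (j - 1) % width] = 1   # pixel on the left
        adjacency[node, i * width + (j + 1) % width] = 1   # pixel on the right

# Convert once; CSR is the format meant for fast matrix-vector products
adjacency_csr = adjacency.tocsr()

image = np.random.default_rng(0).random((height, width))
# Average of the four neighbours (the /4 is an assumption of this sketch)
blurred = adjacency_csr.dot(image.flatten()).reshape(image.shape) / 4
```

The conversion only happens once, so its cost is paid back after the first few blur steps.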
i think your adjacency matrix might be banded and/or toeplitz too, so another option is to use fourier transforms. maybe try plotting the matrix as an image (i'm not super sure tbh, those corners are nasty)
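For the fully periodic (torus) case only, the Fourier idea can be sketched like this: the 4-neighbour average is a circular convolution, so it can be evaluated with FFTs. The image size and kernel layout below are assumptions for the demo, and the cube boundaries would break this structure.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 16, 16
image = rng.random((h, w))

# 4-neighbour average with wrap-around, written as shifts of the image
shifted = (np.roll(image, 1, axis=0) + np.roll(image, -1, axis=0)
           + np.roll(image, 1, axis=1) + np.roll(image, -1, axis=1)) / 4

# Same operation as a circular convolution evaluated with FFTs
kernel = np.zeros((h, w))
kernel[1, 0] = kernel[-1, 0] = kernel[0, 1] = kernel[0, -1] = 0.25
via_fft = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))
```

Whether this beats a sparse product depends on the image size, so it is worth timing both.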
Yeah the purpose of the cube mapping is to make a representation of a sphere that is optimized for matrix math (it's called "Cubed Sphere" and is used in weather simulations, afaik).
but it has disadvantage of corners
i like what you're working on 😌
the literature I read said that the corners should cause around only a 1% error rate 🙂
yeah probably not a huge issue, but there's no unique, perfect way of treating them
the adjacencies there are inherently 3d and you're applying a 2d map, any projection method you use is kinda wrong
graph methods are well motivated and understood though, so that's as good a method as any
yup, I think you just do a tiebreaker because yeah exactly, when crossing a corner boundary, there are 2 viable overlapping positions
Nice, it's quite common that question
Is anyone attending ICLR or NAACL conference this year? We could plan a mini Python Discord dinner 😃
The Vienna one I might depending on the date, not sure if I'll be travelling outside the continent this year tho, so the mexico one is unlikely for me
I think if I do visit the Americas I'll go for some conference in San Francisco
But don't know if this year will be possible
heck yeah.
Blur Per Second: 353.03360478084295
so it's like 12x faster
thanks a lot
Hey guys did anyone work on parsing SEC filings i need some help please
One thing I think people might be missing with the Gemma Open Models release is the section announcing grants given to academics to enable them to push the boundaries of LLM research.
If you're interested in this, try to apply before April 17! All academics (universities + non-profits / independent labs) are eligible!
https://docs.google.com/forms/d/e/1FAIpQLSe0grG6mRFW6dNF3Rb1h_YvKqUp2GaXiglZBgA2Os5iTLWlcg/viewform
If you're an independent researcher and in need of an "affiliation" to apply, you can get in touch with me or just DM ML Collective on twitter.
Google is providing academic access to Gemma, a family of lightweight, state-of-the art open models built from the same research and technology that we used to create the Gemini models. Up to USD $500k in Google cloud credits (USD $5K/award) will be collectively awarded through Google cloud credits to selected applicants for novel academic resea...
good evening, any idea what libraries I should use for some data visualization? I'm working on a project that takes input from the user and then displays data on graphs and all sorts
Acumen Pharmaceuticals, Inc. is a clinical-stage biopharmaceutical company developing a novel therapeutic that targets toxic soluble amyloid beta oligomers for the treatment of Alzheimer’s disease. Management will participate in a fireside chat as part of the Stifel 2024 CNS Days on Tuesday, March 19, 2024, at 12:00 p.m. ET. The live webcast may be accessed under the Investors tab on www.acumenpharm.com.
Acumen Pharmaceuticals Presents Sabirnetug (ACU193) Fluid Biomarker and Target Engagement Analyses from Phase 1 INTERCEPT-AD Study in Early Alzheimer’s at the AD/PD™ 2024 Annual Meeting. Key findings include improvements in biomarkers associated with AD pathology, including significantly lowered CSF neurogranin and pTau levels. Company on track to initiate Phase 2 trial evaluating sabirnetUG in the first half of 2024 and Phase 1 subcutaneous study in mid-2024.
Acumen Pharmaceuticals to present Sabirnetug (ACU193) Fluid Biomarker and Target Engagement Analyses from Phase 1 INTERCEPT-AD Study in Early Alzheimer’s at the AD/PD™ 2024 Annual Meeting. ACU193 is the first humanized monoclonal antibody to demonstrate selective target engagement of AβOs, a soluble and highly toxic form of A β that accumulates early in AD and triggers neurodegeneration.
I am actually pleasantly surprised with the quality of the text summarizations I got using an abstractive model
it isn't "leaking" in the same sense as using the mean of both test and train would be leaking.
however it might be overfitted by way of your manual feature engineering. maybe some assumption you've made about the relationship between title and survival is only valid in the training data.
@scenic parcel you asked about assembly? No.
meant for the off topic channel lol
doing data science in assembly.. that would be something
it would ruin your fucking life.
I'm getting a weird effect. This is using a dot product on an adjacency matrix while a particle moves adding white color. The dot product acts as a blur filter (or in this case, just spreading the color out). But when it wraps around to the original starting location, it no longer shows up.
I'm still adding color, but it doesn't seem to continue drawing
Oh I bet it's actually just a really high value
Nope, it's 0
that would be a multiple scattering problem
try a series expansion
rn you do Ax (matrix vector product) per time step
try (A + A^2) x
Mmm, I'm not sure I understand
you get each time step with a convolution, right? or how are you doing it
self.data = self.adjacency.dot(self.data.flatten()).reshape(self.data.shape)
yeah, that's a convolution
Ok so how would I change it to A+A^2?
self.adjacency is A
there's people who do data science in C after all what's a little assembly
so... (adjacency*adjacency+adjacency).dot(self.data.flatten())?
dot, not *
ok
this is just a rectangle, not a cube, or?
it's a cube, but i'm just testing in the horizontal direction to make sure I have everything set up right
which I'm pretty sure I don't
looks like the particle wraps to the wrong location, so maybe the adjacency matrix is wrong
how are you moving the particle
the particle wraps correctly actually, but it uses different code
it tracks which face it's on, and when it leaves a face, I apply a rotation and translation matrix
self.sides[1].east_transform=[[1,0,-res],
                              [0,1, 0],
                              [0,0, 1]]
and
if(self.position[0]>self.face.resolution):
    matrix=self.face.east_transform
    self.velocity = np.dot(matrix,self.velocity)
    self.face=self.face.east
I'll show another example so you can see the particles
It would be pretty boring. Assembly is not hard, it's just tediously repetitious. Which is why many assemblers come with macros systems (basically turning it into a worse C).
The bug is that it reaches a point where the particles are still moving but everything is zero
this is too complex for me to see if the particles move correctly
Ok I'll turn off the blurring
and just put 1 or 2 particles
this should explain
You may need to click "open in browser" to see it clearly
I think about it like you are wrapping a ribbon around a present. The ribbon just goes straight, and when it turns a corner, it maintains its direction on the next surface.
Here I try filling the entire matrix with the value 0.005. You can see it quickly fills with a really high value, and then all goes to zero
can you do this for a horizontally moving particle
all right
yup. As you can see it lines up perfectly
and how do you generate here the self.data on the left? you take the one from the previous iteration and add in a new particle at the new location based on the speed and position calculation?
yup
for i in range(len(self.particles)):
    p=self.particles[i]
    p.step()
    p_2 = (int(p.position[1]),int(p.position[0]))
    self.data[p.face.ID,p_2[0],p_2[1]]=255
that should all be correct. (btw that's equivalent to the series expansion i was talking about. at each iter, you do A dot (A dot (old data) + new data), which if you evaluate recursively is a neumann series)
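The Neumann series remark can be checked numerically. This is a minimal sketch under stated assumptions: `A` is a small random matrix scaled so its powers decay, `s` is a made-up fixed source, and the source is added after the blur rather than overwritten as in the real code.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.random((n, n))
A /= 2 * A.sum(axis=1, keepdims=True)   # row sums 0.5, so powers of A decay

s = np.zeros(n)
s[2] = 1.0                              # hypothetical fixed source location

# Recursive update: blur the old data, then add the source back in
x = np.zeros(n)
steps = 10
for _ in range(steps):
    x = A @ x + s

# Closed form after `steps` iterations: (I + A + A^2 + ... + A^(steps-1)) s
partial_sum = sum(np.linalg.matrix_power(A, k) @ s for k in range(steps))

# Limit of the series when it converges: (I - A)^(-1) s
limit = np.linalg.solve(np.eye(n) - A, s)
```

The loop and the partial sum agree exactly, and both approach `limit` as `steps` grows.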
then i would wonder if your adjacency matrix is correct
so, here is the basic part of it:
self.adjacency = lil_matrix((num_pixels,num_pixels)) # sparse, only few connections
for i in range(res): #width
    for j in range(res): #height
        location = i*res+j
        for g in range(6):
            pixel = g*grid_size+location
            mods = [(0,1),(0,-1)]#(1,0),(-1,0)] #4 directions
            for mod in mods:
                new_i=(i+mod[0])
                new_j=(j+mod[1])
                new_loc=new_i*res+new_j
                if(new_i>=0 and new_i<res and new_j>=0 and new_j<res):
                    new_p=g*grid_size+new_loc
                    self.adjacency[pixel, new_p] = 1
I didn't finish the cross-boundary parts because I assume there is still a bug
i'm not able to check that atm cuz i just woke up and am not lucid, but the cross-boundary parts are exactly the issue 😛
Isn't the whole thing the problem though? Even if the particle stays still, the whole picture turns black
it creates 1 wave, and then stops
meaning the boundaries are wrong
it's working correctly across some boundaries, but it's also not wrapping up-down for example
it seems to be wrapping only across a few of the squares
Yes, that's true. But I'll isolate the problem more specifically
Thanks for your help so far
the easiest scenario is to properly turn a rectangle into either a torus or a cylinder
to completely wrap around one axis (or both)
rn it seems there's one horizontal boundary where this does not happen correctly for the cylinder case
I've made the central face wrap with itself.
The particle is stationary.
You can see the problem is that the wave starts for ~10 frames and then no longer propagates
yeah, the left and up boundaries are off
so the problem really seems to be: why is the particle no longer able to propagate after about 10 frames? It's as if it's consumed the food, and there is no food left
i know that's the case, but your matrix is applying a convolution based on your adjacency matrix for the cube
your matrix is wrong
it's missing connections at the boundaries
Nope, it's a torus now
certainly doesn't seem to be the case 😛
are you putting the particle back in?
The particle is stationary in the center of the wave
like so
every frame, it adds 255 to that location
all right
because for me, the same procedure works
i'd have to see more of the code tbh
what's self.particles here
for i in range(1):
    x,y=np.random.randint(0, res, size=(2))#TODO: sample sphere
    self.particles.append(Particle(self.sides[1],x,y))
ok but you're not using that here
i'd have to see the code around where you do this
def step(self):
    self.data = self.adjacency.dot(self.data.flatten()).reshape(self.data.shape)
    for i in range(len(self.particles)):
        p=self.particles[i]
        p.step()
        p_2 = (int(p.position[1]),int(p.position[0]))
        self.data[p.face.ID,p_2[0],p_2[1]]=1#np.random.randint(0,2550)
then for this case it's just the one particle fixed in one place?
what is the step called inside this one?
(and we can ignore the possibility that x/y are swapped, because x==y)
p.step moves the particle
but we set the speed to 0, so it does nothing
i'd go into this function and check that len(self.particles) stays at 1 and print the position of the particle. since all you do is call this function several times, the issue is in one of the things used in the function
the particle list, the update of the data, or the adjacency matrix
yup, both the number of particles, and location of the particles are constant (just checked)
it's not the custom code, it's something about the numpy specific code
are you using int as your dtype?
idk, this all screams to me that the point is not being placed correctly, only at the first iteration
I am testing by checking the max value, and I can see (after the wave ends) that the max value is exactly at 384x384 (middle of the center box as each box is 256x256)
it's just not... spreading for some reason
that'd mean you're not multiplying by the matrix
i think you have two similarly named vectors somewhere and are updating the wrong one, maybe?
I have it set up to do dot product first, then add the new data (so that I can see the unblurred point)
Yeah I'll just have to read my code very carefully. Maybe it's something like that.
Need some coding help with a data analysis related code
Is anyone interested to help
You should ask the question or open a help thread #❓|how-to-get-help
Someone available might help you out.
I figured out the problem. It was in fact an overflow
I forgot to normalize the dot product
self.data = self.adjacency.dot(self.data.flatten()).reshape(self.data.shape)/4
that divide by 4 is kind of important
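Why the divide by 4 matters can be seen on a toy torus. This is a sketch with a made-up grid size `res`: every pixel has four neighbours, so the unnormalized product multiplies the total intensity by 4 per step, while the normalized one keeps it constant.

```python
import numpy as np
from scipy.sparse import coo_matrix

res = 16                      # hypothetical grid size
n = res * res
rows, cols = [], []
for p in range(n):
    r, c = divmod(p, res)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rows.append(p)
        cols.append(((r + dr) % res) * res + (c + dc) % res)
adjacency = coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n)).tocsr()

raw = np.zeros(n)
raw[n // 2] = 1.0             # single point of intensity 1
normalized = raw.copy()
for _ in range(10):
    raw = adjacency.dot(raw)                     # total grows 4x per step
    normalized = adjacency.dot(normalized) / 4   # total stays at 1
```

After 10 steps the unnormalized total is already 4**10, about a million times the starting value, which is consistent with things breaking down after a handful of frames.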
aha, there we go
i was also just in the middle of replicating your code, too
with a 1/4 factor and on a torus, we get this
that's no fun, the animation isn't showing
I can open in browser to see it
i guess you can kinda see that the rings start getting more intense
im gonna throw this message out there in hopes it falls on the suitable reader: im working with a SVM classifier model and im having trouble reaching past 80%+ accuracy score for my classification reports. could i consult with someone knowledgeable for advice?
here's a way of making the kernel larger. you probably noticed that the current one blurs in a sort of diamond pattern, since it only considers the immediately adjacent pixels. you could consider the diagonal pixels, or you can square the adjacency matrix to see pixels 2 hops away. i also scaled it by a little bit more than 1 so the animations turn out nicer (they still blow up, just slower)
import numpy as np
N_0 = 100
N = N_0*N_0 #after flattening
torus_adj = np.zeros((N, N))
for p in range(N):
    col, row = divmod(p, N_0)
    indices = [
        (row - 1)%N_0 + N_0*col,
        row + ((col - 1)%N_0)*N_0,
        row + ((col + 1)%N_0)*N_0,
        (row + 1)%N_0 + N_0*col
    ]
    torus_adj[indices, p] = 1/4
filter = np.eye(N) + torus_adj + torus_adj@torus_adj
scale = np.sum(filter[0,:])
filter /= 0.97*scale
yeah I don't want to deal with modulo, because of the discontinuities. Squaring the adjacency matrix sounds a lot easier
you need both
without modulo you get the boundaries wrong
(and squaring you also get the boundaries wrong btw)
just a little less wrong because the first step is computed with modulo
What would cause the boundaries to be wrong? Is it because it will double-count across the corner boundary?
because you won't get the wrap-around effect at all
what we need is that the bottom of each column is connected to the top, and similarly for the edges of the rows
actually, maybe if the adjacency is computed correctly, we do get the correct result after squaring. smh
i'm super rusty with graphs
looks like it's working
Some of the medium gray blur appears to "vibrate" so I might need to do some more tests, but I'm going to call this a success
awesome
close up looks like it's doing a moire pattern
i get those too if i use the proper graph laplacian, but not the way i wrote it above
this way?
right, that way i don't get a pattern, it's just flat
the proper graph laplacian would be 4 I - A, where I is an identity mat and A is the adjacency matrix with just 1s (you'd have to normalize after)
this should work
and it looks like squaring the adj matrix works as intended. once we reach the edges, the pattern grows faster as we'd expect and the shape is no longer circular
(the identity mat is also only needed for stationary sources)
Not sure the meaning of 4I - A
4* (identity matrix - adjacency)?
without parentheses
4 I is the degree matrix (number of edges per vertex) for this graph
Wouldn't that make mostly high values where I don't have adjacency?
oh wait, identity matrix, not ones matrix
this is already exactly what you're doing for the stationary particle
you apply the adj matrix scaled by 1/4, then put the particle back where it was with amplitude 1
so you're doing (4 I + A)/4
Ok so 4*np.eye(res*res*6)-A
the graph laplacian canonically has the opposite sign for the adjacency matrix, giving you a checkered pattern
oh, interesting
Oh, so this requires that I allocate 1 terabyte of memory
can I just multiply by -1?
self.data = -1*self.adjacency.dot(self.data.flatten()).reshape(self.data.shape)/4
wait wait, i'm still playing around with it
looks like multiplying by -1 isn't good enough, but creates interesting patterns
yeah
i think the issue is really just considering the 4 immediately adjacent pixels
using adj + adj^2 should do, without the negative sign
also, i used np.eye because i'm doing a small scale simulation, but you should use the sparse version of the identity or keep inserting the value by index manually as you were doing before
otherwise you get a dense matrix
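A sketch of the sparse route, assuming a small torus of made-up size `res` in place of the cube: `scipy.sparse.identity` keeps `4 I - A` sparse, whereas a dense float64 matrix for six 256x256 faces would need over a terabyte.

```python
import numpy as np
from scipy.sparse import coo_matrix, identity, issparse

res = 8                                   # small hypothetical torus for the demo
n = res * res
rows, cols = [], []
for p in range(n):
    r, c = divmod(p, res)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rows.append(p)
        cols.append(((r + dr) % res) * res + (c + dc) % res)
A = coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n)).tocsr()

# Sparse identity keeps 4 I - A sparse: 5 stored values per row
laplacian = 4 * identity(n, format="csr") - A

# Rough cost of a dense float64 matrix for six 256x256 faces, in bytes
dense_bytes = (256 * 256 * 6) ** 2 * 8    # about 1.2e12, i.e. over a terabyte
```

Each row of the Laplacian sums to zero, which is a handy sanity check on the adjacency.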
Do I normalize this by 16? or 4
by whatever the sum of one row of adj + adj^2 is
here i used adj + adj^2 with the normalization i just mentioned, and instead of an identity matrix i added in the source scaled by a cosine to get some ripples
filter = torus_adj + torus_adj@torus_adj
scale = np.sum(filter[0,:])
filter /= 1*scale
?
self.adjacency =4*identity(res*res*6,format='csr')-self.adjacency
trippy lmao
can I just include the point itself and divide by 5
I have to complete this project..
Can i know what are the prerequisites to complete this project
" 1.Analyze facial expressions in video sequences as time-series data using recurrent neural networks (RNNs) or long short-term memory (LSTM) networks. "
you don't need the identity if you place the points manually as you're doing
It fixes the checker pattern though. And I can't figure out how to make the A.A+A thing work
Hii I kinda need help,
How can I extract words from audio to text without auto-correction?
like if someone is pronouncing 'klass', speech recognition converts it to 'class'
I want to stop that
Ok gonna call it done for the night, thx
@wooden sail 🙂 can you please help
I don't understand your question. Can you pronounce "klass" for me
as everyone has their own accent, for example
Indians usually pronounce 'Dha' not 'The' (I'm not stereotyping anyone)
so if I convert their speech to text, I want it as it is.
https://en.wikipedia.org/wiki/Phoneme
You want this
In phonology and linguistics, a phoneme () is a set of phones that can distinguish one word from another in a particular language.
For example, in most dialects of English, with the notable exception of the West Midlands and the north-west of England, the sound patterns (sin) and (sing) are two separate words that are distinguished by the subs...
You are trying to write phonemes using roman letters. You should use linguistic notation instead. That's what speech recognition should do
then, it's up to you if you convert the phonemes to words or not
And then you'll need training data. Probably for whatever dialect you're trying to record. For example, Mandarin needs different training examples for [má] [mǎ] [mà] [mâ]
i don't know anything about speech processing of this kind
and this is about as far as I can help as well. I've never done it
basically I'm trying to make a pronunciation checker program, and my logic is to convert someone's speech to text and check whether they said it right or not.
thank you for helping
I'll check
import numpy as np
from scipy.sparse import coo_matrix as coo
N_0 = 50
N = N_0*N_0 #after flattening
rows = []
cols = []
values = []
for p in range(N):
    col, row = divmod(p, N_0)
    indices = [
        (row - 1)%N_0 + N_0*col,
        row + ((col - 1)%N_0)*N_0,
        row + ((col + 1)%N_0)*N_0,
        (row + 1)%N_0 + N_0*col
    ]
    cols.extend([p]*4)
    rows.extend(indices)
    values.extend([0.25]*4)
torus_adj = coo((values, (rows, cols)), shape=(N, N))
filter = torus_adj + torus_adj@torus_adj
scale = np.sum(filter[0,:])
filter /= 1*scale
# the sum/product come back in another sparse format, so convert back to COO
if not isinstance(filter, coo):
    filter = filter.tocoo()
the overall scale of the values doesn't matter much, since the "filter" is scaled afterwards by normalizing the rows (though using 1 instead of 0.25 does change how much weight the adj^2 term gets relative to adj)
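A small check of what the normalization does and does not absorb. The helper `normalized_filter` and the grid size are hypothetical, written just for this sketch. Rows sum to 1 for any edge weight, but the balance between the 1-hop and 2-hop terms depends on that weight.

```python
import numpy as np
from scipy.sparse import coo_matrix

def normalized_filter(weight, n0=10):
    """Torus adjacency with the given edge weight, then adj + adj^2 scaled so rows sum to 1."""
    n = n0 * n0
    rows, cols = [], []
    for p in range(n):
        r, c = divmod(p, n0)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rows.append(p)
            cols.append(((r + dr) % n0) * n0 + (c + dc) % n0)
    adj = coo_matrix((np.full(len(rows), weight), (rows, cols)), shape=(n, n)).tocsr()
    filt = adj + adj @ adj
    return filt / filt[0, :].sum()

f_one = normalized_filter(1.0)       # edge weight 1
f_quarter = normalized_filter(0.25)  # edge weight 1/4, as in the snippet above
```

Both filters conserve total intensity; they just blur with slightly different shapes.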
this is fun, what a huge nerdsnipe
i drank the kool aid on index notation, it's pretty handy
M for metric, E for expansion, C for contraction, same letters sum up to same value (c, c', c'' sum up to Nc)
im gonna calculate the gradients for this to see if there's optimization that can be done there, but my current plan is to focus on the term inside the softmax
no advantage, I'd just be doing what pytorch does already, which is to propagate the gradients, I'm gonna focus on the Mpp thing
Guys
I had 2 uni courses this semester
- MLF (Machine Learning Foundations) - it was focused on maths [linear algebra etc]
- MLT (Machine Learning Techniques) - it was focused on application [algorithms like PCA, Regression etc]
I feel like I have some holes in my understanding of them, I was thinking to read a book and make some notes so that I understand the whole thing clearly and better
Can you suggest something for it?
if you wanna look at the math part with some small amount of applications, gilbert strang's book on linear algebra is great
I found a symmetric pairing function online and used it to get to here, but the computation of the indices seem to be expensive
it's a general form tho, so I might be able to find something more efficient to compute and use that as my F
(I’m a huge Strang fan, I always recommend his calculus video lectures)
like, might as well just have more parameters in the layer than to spend time calculating this
i dont really need F tho, I need f and g, so maybe I can construct them
thanks !
is it Excalidraw?
what book are you reading ?
I think so, I'm using an extension on obsidian
Im not reading anything, Im preparing a cuda kernel
oka oka !!
no idea what is this 😐
oh, it's a function you run on the gpu in parallel
i guess they were asking about the image with the corollary you shared
I see, oka oka !!
just a random arxiv paper I found online
a bit advanced stuff for me
always double-check what you find on arxiv, a lot of the content has never been published nor reviewed
I know, but I won't be using it anyway
served as a good stepping stone to get to what I wanted
Hey guys , so for SVR,
We can predict like right,
y_pred = sc2.inverse_transform(regressor.predict(sc1.transform([[6.5]])).reshape(-1,1))
Whats the use of reshape(-1,1)?
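A minimal sketch of what the reshape does, using a made-up NumPy array in place of the real `regressor.predict(...)` output. scikit-learn regressors return a 1D array, while scalers such as `StandardScaler` expect a 2D array of shape (n_samples, n_features) in `inverse_transform`, so the predictions are turned into a single column first.

```python
import numpy as np

# Stand-in for regressor.predict(...): a 1D array of scaled predictions
predictions = np.array([0.12, -0.40, 1.05])   # hypothetical values
print(predictions.shape)                       # (3,)

# -1 lets NumPy infer the number of rows; 1 forces a single column
as_column = predictions.reshape(-1, 1)
print(as_column.shape)                         # (3, 1)
```

The values are unchanged; only the shape goes from (n,) to (n, 1).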