past meteor Dec 3, 2023, 8:53 AM

#

If for some reason you can't use a neural network you can treat, as ML practitioners call it, dimensionality reduction as a hyperparameter you search over

desert oar Dec 3, 2023, 8:53 AM

#

great point. that's the "people do it anyway" part 😛

past meteor Dec 3, 2023, 8:54 AM

#

past meteor If for some reason you can't use a neural network you can treat, as ML practitio...

This can include kernel PCA or if you want to get fancy Nystrom approximation and then regular PCA
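
A minimal sketch of what "dimensionality reduction as a hyperparameter" could look like with scikit-learn, assuming a supervised setup with a simple downstream classifier. The synthetic data, the step names `reduce` and `clf`, and the candidate component counts are illustrative assumptions, not something from the discussion above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption); replace with your own features and labels.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)

pipe = Pipeline([
    ("reduce", PCA()),  # placeholder step, swapped out by the grid below
    ("clf", LogisticRegression(max_iter=1000)),
])

# Each dict is one family of reducers; "passthrough" means no reduction at all.
param_grid = [
    {"reduce": ["passthrough"]},
    {"reduce": [PCA()], "reduce__n_components": [2, 5, 10]},
    {"reduce": [KernelPCA(kernel="rbf")], "reduce__n_components": [2, 5, 10]},
    # Nystroem kernel approximation followed by regular PCA
    {
        "reduce": [Pipeline([("nys", Nystroem(n_components=50, random_state=0)),
                             ("pca", PCA())])],
        "reduce__pca__n_components": [2, 5, 10],
    },
]

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
print(round(search.best_score_, 3))
```

Including `"passthrough"` in the grid keeps "no reduction" as a candidate, so the search can also conclude that reducing dimensions does not help.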

desert oar Dec 3, 2023, 8:56 AM

#

i actually have a project coming up where i might be able to play around with this. see if similarity in a "dumb" space like PCA is semantically useful

past meteor Dec 3, 2023, 8:56 AM

#

In my experience it's likely not

desert oar Dec 3, 2023, 8:57 AM

#

this would be more like "tabular" data, not images or text

wooden sail Dec 3, 2023, 8:57 AM

#

the problem is that, with a good enough cost function, a large enough network, and if you don't reduce the rank of the covariance too much, this WILL work anyway

#

because you can't force your classifier to not learn another representation on top of the PCA you do yourself

#

this becomes very interesting though if you examine how many layers and params are needed to get a good result

past meteor Dec 3, 2023, 8:57 AM

#

But yeah, I think the issue here is that imo it's unfair to talk about PCA in cases where you do have labels

desert oar Dec 3, 2023, 8:57 AM

#

the project is actually an unsupervised learning project

past meteor Dec 3, 2023, 8:57 AM

#

Because then yes it's self evident there are possibly better things out there

desert oar Dec 3, 2023, 8:58 AM

#

any labels would be me or my colleagues manually combing through examples and labeling them "yes", "no", or "meh"

wooden sail Dec 3, 2023, 8:58 AM

#

you should need more layers if PCA is all you do at first, but with enough data and time, any network will be fine as long as you didn't PCA too hard

past meteor Dec 3, 2023, 8:58 AM

#

It's a lot more interesting to talk about the unsupervised case indeed, where it's totally exploratory and you have no labels

#

Like in the original question actually I think

desert oar Dec 3, 2023, 8:59 AM

#

the original question was totally lacking in context as far as i saw

wooden sail Dec 3, 2023, 9:00 AM

#

reminds me of a task we had a student try once. some sort of sparse recovery. the input data was considered in the original domain, but also after doing PCA and wavelet decomp. they were using a fairly large network and essentially unlimited synthetic training data, and there was no difference among the results 😛 as is to be expected

#

it becomes interesting when thinking about shrinking networks or trying to get them real-time capable for large inputs

past meteor Dec 3, 2023, 9:02 AM

#

Well tbh my summary is that it's been a while since I heard anything about dimensionality reduction for classifiers and that for good reason

wooden sail Dec 3, 2023, 9:03 AM

#

i do hear a lot about it, but mostly through judicious choice of cost functions to promote special behavior in the latent space

past meteor Dec 3, 2023, 9:03 AM

#

At least in the case of PCA we did a million toy examples in uni where it destroys your data.

wooden sail Dec 3, 2023, 9:03 AM

#

yeah, just pca straight up, no

past meteor Dec 3, 2023, 9:04 AM

#

If it's about auto encoders and beta-VAEs etc. yes there's a lot of material there

#

But typically they're truly unsupervised which is, again, what makes them interesting. Typically I don't see people taking the latent vectors and using them for classification downstream

wooden sail Dec 3, 2023, 9:05 AM

#

not explicitly, at any rate

spice mountain Dec 3, 2023, 9:29 AM

#

Do you guys know if the GPU acceleration is model agnostic in Lightning?

#

Like, is there some universal parameter to the pl.LightningModule (pl = Pytorch Lightning) class, that I can set so I utilize all my GPU cores?

#

It seems this exists: `trainer = Trainer(accelerator="gpu", devices=2)`

buoyant vine Dec 3, 2023, 12:16 PM

#

spice mountain It seems this exists: ```trainer = Trainer(accelerator="gpu", devices=2) ```

yes you specify the device using the Trainer

#

it will default to auto though, meaning it will automatically use your GPU if it can

#

Also I would probably avoid using multiple GPUs to begin with until you are a bit more familiar with lightning's behaviour, the multi-gpu, multi-device stuff is a bit more annoying than they let on 😅 with a couple of unfortunate bugs scattered in the mix
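
A small sketch of the single-device setup suggested above, assuming the `pytorch_lightning` package is installed. `TRAINER_KWARGS` and `make_trainer` are hypothetical names for illustration; the device selection lives on the `Trainer`, not on the `LightningModule`, so it is the same for any model.

```python
# Settings discussed above: let Lightning pick the hardware, but use one device only.
TRAINER_KWARGS = {
    "accelerator": "auto",  # GPU when one is available, otherwise CPU
    "devices": 1,           # a single device for a first run
}


def make_trainer(**overrides):
    """Build a Trainer from TRAINER_KWARGS, optionally overriding some settings."""
    # Imported lazily so the sketch can be loaded without Lightning installed.
    import pytorch_lightning as pl

    return pl.Trainer(**{**TRAINER_KWARGS, **overrides})
```

Usage would be along the lines of `make_trainer(max_epochs=10).fit(model)`, where `model` is whatever `LightningModule` you already have.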

spice mountain Dec 3, 2023, 12:20 PM

#

buoyant vine Also I would probably avoid using multiple GPUs to begin with until you are a bi...

Rip

#

I have a report for Thursday, still haven't trained pithink

buoyant vine Dec 3, 2023, 12:21 PM

#

how big is your model thonk

spice mountain Dec 3, 2023, 12:21 PM

#

But it is on an HPC, so I have a sht ton of cores available

spice mountain Dec 3, 2023, 12:21 PM

#

buoyant vine how big is your model thonk

Don't know tbh. It is just slow asf

buoyant vine Dec 3, 2023, 12:21 PM

#

Lightning will normally say when you start running

#

also, just a personal preference, but i'd also setup something like MLFlow or neptune to monitor the training process

#

helps to also give a bit of an indication of how long it is going to be before learning tapers off

thorn flame Dec 3, 2023, 12:57 PM

#

Hey guys! Please who has used chatterbot recently?

toxic mortar Dec 3, 2023, 1:39 PM

#

What do these parameters p, T and m stand for?

#

I understand why we want to penalize greater errors by squaring the difference. However, I do not get why there is a 1/2 in front of it. Is that some math convention?

#

Another question. To find the minimal error we use this gradient descent method. When taking the partial derivative dE/dwj, why are we doing it on w^T and not on Sum(wjxj)?

#

They are not the same dimensions, how can we compare apples to oranges?

mild dirge Dec 3, 2023, 1:44 PM

#

Whenever you see 0.5 in a cost function, it is probably because there is a square there, and the derivative will cancel those two out

echo mesa Dec 3, 2023, 2:17 PM

#

Guys, does anyone know any books, resources about vectorisation and its implementation in programming? so far ive just seen people using numpy but i didnt actually get to see how it works and implementing it from scratch

mild dirge Dec 3, 2023, 2:20 PM

#

Vectorization is just the ability to apply the same/similar computation on multiple elements. The way this is implemented is often still sequentially, but in a faster language like C. @echo mesa

#

But sometimes parallel computing, and even a GPU can be used for performing the actual computation.

mild dirge Dec 3, 2023, 2:22 PM

#

echo mesa Guys, does anyone know any books, resources about vectorisation and its implemen...

Do you want to know about the actual parallel computing part, because the vectorization is often more of a way to write down what kind of computation you want to do to a bunch of elements.

echo mesa Dec 3, 2023, 2:31 PM

#

mild dirge Vectorization is just the ability to apply the same/similar computation on multi...

I wanna understand it deeply and implement it in C from scratch so i can use it for many things.

mild dirge Dec 3, 2023, 2:31 PM

#

numpy is open Source, so you could take a look at that

#

Though it would probably be simplest if you already have some experience with numpy, and know how to use it

echo mesa Dec 3, 2023, 2:34 PM

#

mild dirge Though it would probably be simplest if you already have some experience with nu...

I think the way its being implemented is when you have two arrays, you identify them as two vectors and you take the dot product of them in a way which implements multiple chunks at the same time which is where the question of how is. I'm also very interested in understanding the underlying numpy logic and its array structure because i have experience with c and low-level languages.

mild dirge Dec 3, 2023, 2:35 PM

#

"in a way which implements multiple chunks at the same time which is where the question of how is" Do you mean that multiple chunks are processed at the same time?

#

Because that would be parallel processing, you can use OpenMP in C(++) for that

echo mesa Dec 3, 2023, 2:36 PM

#

mild dirge "in a way which implements multiple chunks at the same time which is where the q...

well yeah, i mean i might not be clear, but the whole idea is that, for example, if you wanna add two arrays' items together, then since all of them are independent of each other you can evaluate them at the same time.

echo mesa Dec 3, 2023, 2:36 PM

#

mild dirge Because that would be parallel processing, you can use OpenMP in C(++) for that

whats the difference between vectorisation and parallel processing?

mild dirge Dec 3, 2023, 2:37 PM

#

Yeah that's the parallel part. vectorization (or array programming) lends itself well to parallel computing.

#

Because you write the instructions such that "multiple elements can be processed at once"

#

Which can then be implemented with parallel computing

#

Or sequentially in a lower-level language
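
To make the distinction above concrete, here is a small sketch in Python with NumPy (the function names are made up for illustration). Both functions compute the same element-wise sum; the vectorized one only states *what* to compute for all elements, and NumPy decides how to run it in compiled code.

```python
import numpy as np


def add_loop(a, b):
    """Element-wise add with an explicit Python loop, one element at a time."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def add_vectorized(a, b):
    """Same computation as a single array expression; the loop runs inside NumPy."""
    return np.asarray(a) + np.asarray(b)


a = np.arange(5, dtype=float)
b = np.full(5, 10.0)
print(add_loop(a, b))
print(add_vectorized(a, b))
```

Because every output element depends only on the matching input elements, an implementation is free to process them sequentially, with SIMD instructions, or in parallel.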

echo mesa Dec 3, 2023, 2:38 PM

#

Got it, i wanna implement it in c, i found this paper which might be really good. https://www.jsums.edu/robotics/files/2016/12/FECS17_Proceedings-FEC3555.pdf

mild dirge Dec 3, 2023, 2:39 PM

#

Yeah, so SIMD is especially connected to vectorization, because you apply the same operation to multiple elements. And the GPU lends itself to these types of operations.

#

But numpy does not really use the GPU. A lot of its syntax, though, is used to instruct the computer to apply the same operation to many elements.

mild dirge Dec 3, 2023, 2:40 PM

#

mild dirge "in a way which implements multiple chunks at the same time which is where the q...

The paper also mentions OpenMP

#

That is the one I use sometimes for my code

#

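If I were you, I would look into OpenMP as well. It is very simple to set up and use (it is not part of the C/C++ standard library, but it ships with most major compilers such as GCC, where you enable it with the -fopenmp flag)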

echo mesa Dec 3, 2023, 2:51 PM

#

Gotcha thanks

dusty cloud Dec 3, 2023, 2:53 PM

#

hello does anyone know any breadboard or circuit simulators? something that has sensors in them like temp sensors, water sensors, etc.? I want to simulate something with rasp pi

tidal bough Dec 3, 2023, 3:28 PM

#

toxic mortar What these parameters p,T and m stand for?

The index T here isn't a variable, it means transposition.
The 1/2, as wccamel mentioned, is just so that the derivative looks nicer. (It doesn't matter what positive constant we write, since minimizing E(w) and c·E(w) leads to the same w).
p, it seems, is the size of y, so the number of samples. m meanwhile is the size of each x^(k), which would make it the number of input features in each sample.
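
The formula from the original question is not in the log, so the following is a sketch under the assumption that the error is the usual sum of squared errors over the p samples, with m features per sample as described above. It shows where the 1/2 goes, and that w^T x^(k) and Sum(wjxj) are the same scalar written two ways.

```latex
E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(y^{(k)} - \mathbf{w}^{T}\mathbf{x}^{(k)}\right)^{2},
\qquad
\mathbf{w}^{T}\mathbf{x}^{(k)} = \sum_{j=1}^{m} w_j x_j^{(k)}

\frac{\partial E}{\partial w_j}
= \frac{1}{2}\sum_{k=1}^{p} 2\left(y^{(k)} - \mathbf{w}^{T}\mathbf{x}^{(k)}\right)\left(-x_j^{(k)}\right)
= -\sum_{k=1}^{p}\left(y^{(k)} - \mathbf{w}^{T}\mathbf{x}^{(k)}\right)x_j^{(k)}
```

The factor 2 from differentiating the square cancels the 1/2, and only the j-th term of the inner sum depends on w_j, which is why a single x_j^(k) is left over.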

tidal bough Dec 3, 2023, 3:32 PM

#

toxic mortar Another questions. To find a minimal error we use this descending gradient metho...

"When taking the partial derivative dE/dwj, why are we doing it on w^T and not on Sum(wjxj)?"
Not sure what you mean by this one. The derivative here doesn't look wrong to me.

spice mountain Dec 3, 2023, 3:32 PM

#

buoyant vine also, just a personal preference, but i'd also setup something like MLFlow or ne...

Yeah, but this is hard on the HPC.

#

But can anyone help me understand this;

I wish to train multiple models with this same script. The file will be in the same folder, but basically on the HPC I can set a script to run and then start the same script on another set of GPU cores; But they will share the same memory folders. I.e they can overwrite each other.

I am not so strong with Lightning, but it appears that if I do it this way, they will all write their checkpoint to last.ckpt? Is this correct? Can any of you guys suggest a quick fix for this, so I can give it a number or something and it will use that in the ckpt name?

https://pastecode.io/s/1aquwr9n

#

Like, my instincts tell me I can just change this for every run, but is it that simple?

```
# run all checkpoint hooks
if trainer.global_rank == 0:
    print("Summoning checkpoint.")
    ckpt_path = os.path.join(ckptdir, "last.ckpt")
    trainer.save_checkpoint(ckpt_path)
```
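
One way the "give it a number" idea could look, as a sketch only. `checkpoint_path` is a hypothetical helper, and reading `SLURM_JOB_ID` is an assumption about the cluster's scheduler; any value that is unique per run would do.

```python
import os


def checkpoint_path(ckptdir, run_id=None):
    """Return a per-run checkpoint path such as <ckptdir>/last-<run_id>.ckpt."""
    if run_id is None:
        # Assumption: the scheduler exports a job id; otherwise fall back to the process id.
        run_id = os.environ.get("SLURM_JOB_ID", str(os.getpid()))
    return os.path.join(ckptdir, f"last-{run_id}.ckpt")


print(checkpoint_path("logs/checkpoints", run_id="3"))
```

In the snippet above, `ckpt_path = checkpoint_path(ckptdir)` would then replace the hard-coded `"last.ckpt"`, so runs sharing a folder no longer overwrite each other. Giving each run its own `ckptdir` would work just as well.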

spice mountain Dec 3, 2023, 3:35 PM

#

buoyant vine Lightning will normally say when you start running

It actually does in a sense. I just can't interpret it, lol

```
        n_embed: 1024
        ddconfig:
          double_z: false
          z_channels: 256
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 1
          - 2
          - 2
          - 4
          num_res_blocks: 2
          attn_resolutions:
          - 16
          dropout: 0.0
```

tender niche Dec 3, 2023, 4:58 PM

#

Hello guys!

#

Posting from main thread to this chat:

I need help correcting a small piece of logic in a pyspark query which I have been unable to solve for 2 days :(....would really appreciate any help....
To give some context :
So I am trying to write a query to identify pairs of airlines that operate on the same date by reading a flights.txt file. In other words I need to find the pairs of airlines that share the same origin and the same dates, and determine the count of each pair....the final result must be sorted so that the airline pairs are in alphabetical order (for both airline names in the pair), with the counts in descending order.

My query returns wrong counts.

This is my code with a simple expected example at the end as well
https://pastebin.com/TLEfWjnN

In the example input it should return 3 2 2
but my prog returns 4 4 4


wild sable Dec 3, 2023, 5:50 PM

#

Hey can someone rate my writing

river cape Dec 3, 2023, 5:54 PM

#

imputer = SimpleImputer(missing_values = np.nan , strategy = 'mean')
imputer.fit(X[ : , 1:3])
X[ : , 1:3] = imputer.transform(X[ : , 1:3])
Could anyone tell me why we use fit and transform?

quiet seal Dec 3, 2023, 6:01 PM

#

Is there a library that's basically desmos but in the discrete domain? [is signal processing data science?]

#

I spent a whole day trying to model a method for alias-suppressed waveforms on desmos and it didn't work, then I realized I'm trying to model digital integration and differencing in a discrete space with an evenly-spaced sample rate and desmos is in the continuous domain 😐

#

I need mathing libraries that are digital that I can feed to like plotly or matplotlib

umbral charm Dec 3, 2023, 6:13 PM

#

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import odeint
from matplotlib.animation import FuncAnimation, PillowWriter
#  Parameters and model from earlier
# ...
def model(u, t, sigma, rho, beta):
   x, y, z = u
   dxdt = sigma * (y - x)
   dydt = x * (rho - z) - y
   dzdt = x * y - beta * z
   return [dxdt, dydt, dzdt]

sigma = 10
beta = 8/3
rho = 28
# Solve first half
v0 = [1,0,0]
t = np.linspace(0, 25, 5000)
v = odeint(model, v0, t, args=(sigma, rho, beta))

# Marker points to be plotted for these rows
vskip = v[::20]

# Figure object for the animation
fig = plt.figure(figsize=(6, 4), dpi=150)


def animate(i):
   """ This function runs for each frame """

   # Clear the plot
   plt.cla()

   # Plot the solution up to this point
   plt.plot(v[:(i * 20), 0], v[:(i * 20), 2], lw=0.3, color="steelblue")

   # Add a big marker
   plt.plot(vskip[i, 0], vskip[i, 2], "o", color="steelblue", markersize=8)

   #  Make pretty
   plt.ylim([0, 50])
   plt.xlim([-30, 30])
   plt.title(f"t = {round(t[i * 20], 1)}")
   plt.xlabel("x(t)")
   plt.ylabel("z(t)")

# Make the animation
anim = FuncAnimation(fig, func=animate, frames=len(vskip))
anim.save("animation.gif", writer=PillowWriter(fps=20))

#

is there any way i could make this animate faster

#

takes a good minute

odd meteor Dec 3, 2023, 7:40 PM

#

river cape imputer = SimpleImputer(missing_values = np.nan , strategy = 'mean') imputer.fit...

fit(): This method is used to compute the necessary parameters from the training data needed to perform the missing value imputation. For SimpleImputer, fit() calculates the value that will replace the missing values. In this case, the mean. The fit() method essentially "learns" from the data.

transform(): After the fit() method has computed the mean, the transform() method applies the transformation to the data. So here, the SimpleImputer will replace all missing values with the mean computed by the fit() method.

In essence, you can think of this duo, fit() & transform(), as "learn from this data" vs "apply what you've learnt to the same data and/or to new data (usually the val or test set)".

Since fit() is used to learn from data, it's called on the train data only. On the other hand, transform() is used on all of the train, validation, and test sets.
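
A tiny sketch of the "learn on train, apply everywhere" pattern described above, using made-up arrays (`X_train` and `X_test` are illustrative, not the asker's data).

```python
import numpy as np
from sklearn.impute import SimpleImputer

X_train = np.array([[1.0, 10.0],
                    [np.nan, 20.0],
                    [3.0, np.nan]])
X_test = np.array([[np.nan, np.nan]])

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
imputer.fit(X_train)                         # learns one mean per column from the train data
X_train_filled = imputer.transform(X_train)  # fills the gaps in the train data
X_test_filled = imputer.transform(X_test)    # reuses the train means, learns nothing new

print(imputer.statistics_)  # the per-column means stored by fit()
print(X_test_filled)
```

The test row is filled with the training means, which is the point of keeping the two steps separate: the test set never influences what was learned.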

verbal venture Dec 3, 2023, 9:41 PM

#

Does anyone know how unet goes from 572 -> 570 in the first conv layer for feature size

iron basalt Dec 3, 2023, 10:22 PM

#

echo mesa I wanna understand it deeply and implement it in C from scratch so i can use it ...

In C vectorization refers mostly to using SIMD instructions, avoiding branching, and loop unrolling.

#

C has some automated vectorization by the compiler, but manual SIMD can often be a lot faster.

#

Numpy's implementation makes heavy use of this and it's why it's fast.

#

GPUs have this too and in the case of AMD it's all SIMD vector hardware, even when doing single floats for example.

buoyant vine Dec 4, 2023, 12:45 AM

#

Has anyone tried to use a GRU (with GloVe) and BCE loss? For some reason, if I try make the classifier multi-label, rather than multi-class, the model learns nothing and its F1 score, Recall, etc... all drop to 0, but if I make it multi-class and then CrossEntropyLoss, it seems to learn fine?

magic dune Dec 4, 2023, 5:13 AM

#

is the chain rule that helpful?

#

💀

shadow viper Dec 4, 2023, 7:36 AM

#

magic dune is the chain rule that helpful?

And I just started learning calculus 😭😭😭
Velocity and distance isn't bad 😭

devout oak Dec 4, 2023, 9:10 AM

#

I dont know if this is the correct topic to ask but I am trying to do a image_to_text with tesseract OCR. I get an empty string to one of the images I use. I tried thresholding but it didnt change anything.

#

I dont really know whats the reason that makes it return empty either. Like what kind of error it gives?

mild dirge Dec 4, 2023, 9:40 AM

#

devout oak I dont really know whats the reason that makes it return return empty either. Li...

Doesn't have to be an error. It can just not recognize any text in the image you give it.

#

It's AI after all.

#

But maybe you have processed the image incorrectly, or the settings are incorrect.

devout oak Dec 4, 2023, 9:45 AM

#

yeah true, its a very clear text with high resolution white letters on colored bg. Which I process as gray and threshold. I will crop the image to single lines to see whats wrong I guess

#

well I found the issue, even though I run it on Turkish. It cant recognize "ş" and returns an error

#

I will try to train it myself to make it recognize

obsidian sand Dec 4, 2023, 10:31 AM

#

Hi, Does anyone know how to apply SMOTE to BERT?

sick drift Dec 4, 2023, 10:33 AM

#

Hi, im using python seaborn lib to display data. I got the data in the following format:

array approach (a,a,a,a,b,b,b,b)
array clients(7000,7000,700,700,7000,7000,700,700)
array latency (......)
array throughput(......)

Supposed to be a throughput/latency graph but its not displaying the mean values for me but the individual points

#

https://imgur.com/oUFlqrH.png


#

Its supposed to be a mean per client setting

#

I can obviously manually calculate the mean but then I dont have error bars either

river cape Dec 4, 2023, 1:11 PM

#

X_train, X_test, Y_train, Y_test = train_test_split(X , Y , test_size = 0.2 , random_state = 1)

#

Does this split the dataset into train set and test set?

cold osprey Dec 4, 2023, 1:38 PM

#

river cape Does this split the dataset into train set and test set?

try n find out?

river cape Dec 4, 2023, 1:39 PM

#

cold osprey try n find out?

Got it

#

btw while installing ipykernel for jupyter , does it also install the pandas libraries?

earnest ridge Dec 4, 2023, 2:06 PM

#

can anyone help , here i am trying to map the species column with numerical values but after mapping it is showing nan values only in species column

river cape Dec 4, 2023, 2:28 PM

#

I have the ipykernel installed in the myenv environment. How do I solve this error?

left tartan Dec 4, 2023, 2:32 PM

#

earnest ridge can anyone help , here i am trying to map the species column with numerical valu...

Try printing the df before you map. I think you are making a capitalization mistake in the dictionary: the map values don't match the species exactly.
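
A sketch of the failure mode suggested above. The screenshot is not in the log, so the species names and both dictionaries are made-up stand-ins; the point is only that `Series.map` returns NaN for every value that has no exactly matching key.

```python
import pandas as pd

# Hypothetical column values, standing in for the asker's data.
df = pd.DataFrame({"species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]})

# Keys that differ from the column values only in capitalization: every lookup misses.
bad_mapping = {"iris-setosa": 0, "iris-versicolor": 1, "iris-virginica": 2}
print(df["species"].map(bad_mapping).isna().all())

# Building the mapping from the values actually present avoids the mismatch.
good_mapping = {name: code for code, name in enumerate(sorted(df["species"].unique()))}
print(df["species"].map(good_mapping).tolist())
```

Printing `df["species"].unique()` before mapping is a quick way to see the exact strings, including stray spaces or different capitalization.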

left tartan Dec 4, 2023, 2:33 PM

#

river cape I have the ipykernel installed in the myenv environment. How do I solve this err...

What's the name of the environment you're running? See top right of vscode, ie:

left tartan Dec 4, 2023, 2:33 PM

#

sick drift Hi, im using python seaborn lib to display data. I got the data in the following...

Share code?

buoyant vine Dec 4, 2023, 3:05 PM

#

Am I losing my mind, or does no one use GRU with GloVe and PyTorch for multi-label classification 😅

I have been trying to find some resources on it because for some reason when using BCE loss my model decides it shall learn absolutely nothing, but every tutorial, documentation, existing code seems to at best use multi-class classification and in their "Keras VS PyTorch" type blog posts, they don't even compare BCE PyTorch with BCE Keras, they have Keras using BCE and working, but Torch using CE.

sadge Has anyone got any good resources for using GRU and Glove together with PyTorch?

past meteor Dec 4, 2023, 3:14 PM

#

buoyant vine AM I loosing my mind, or does no one use GRU with GloVe and PyTorch for multi-la...

How would you combine them?

buoyant vine Dec 4, 2023, 3:15 PM

#

wdym?

past meteor Dec 4, 2023, 3:16 PM

#

Glove is a way to obtain embeddings you can use for downstream tasks and GRU does both, the embeddings and the downstream task.

#

I guess you could combine them by running your text through GloVe to obtain embeddings that are used as input for the GRU.

#

Tell me if I've failed to answer your question 🙂

buoyant vine Dec 4, 2023, 3:26 PM

#

I don't think the GRU creates the embedding?

With my setup at least, we have Text -> GloVe (N_tokens * 300) -> GRU -> hidden -> output classifier layer

hallow cargo Dec 4, 2023, 3:27 PM

#

Thank you, it seems I figured the problem with inputting the data, you indirectly got me to understand what the point of tf.data is. Although, I realized from a stackoverflow post that csv files are really unoptimized so I switched to a more specialized file format for tensorflow, .tfrecord. I currently got it working, but it is taking 6 seconds per step, or batch of (2048, 512, 17) just like it did with csv, which was the whole point of switching, to optimize it. From looking at my task manager, I am only getting a load on my gpu for just under a second at the start of each step similar to the csv, so that does not seem to be an issue. I know you specialize in pytorch, but would you have any idea what could cause this? I understand (2048, 512, 17) is quite a big tensor, yet it should load my gpu throughout no?

This is currently what my generator is yielding:
yield tf.transpose(tf.convert_to_tensor(np.array(list(features.values()))), perm=[1, 2, 0]), labels
Which granted is quite long, although in isolated testing, its practically instant.
What would you think the cause of such long steps would be?

mild dirge Dec 4, 2023, 3:28 PM

#

If your gpu is only busy for a part of a second every few seconds, then your cpu is probably the bottleneck

buoyant vine Dec 4, 2023, 3:29 PM

#

buoyant vine I don't GRU creates the embedding? With my setup at least, we have Text -> GloV...

To be specific, we are using the GRU over the LSTM, mostly just for efficiency but they do the same job IIRC, which is effectively to provide some 'memory' to the model.

It works fine if it is using CE loss and multi-class, but if you change it to be BCE loss and multi-label, it just dies for some reason
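
Not a diagnosis of the problem above, just a NumPy sketch of how the two setups differ in what the loss expects. All names and numbers here are illustrative assumptions. In PyTorch the corresponding losses would be `nn.CrossEntropyLoss` (class-index targets) and `nn.BCEWithLogitsLoss` (float multi-hot targets with the same shape as the logits).

```python
import numpy as np


def cross_entropy(logits, class_index):
    """Multi-class: softmax over classes, exactly one correct class per example."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(class_index)), class_index].mean()


def bce_with_logits(logits, multi_hot):
    """Multi-label: an independent sigmoid per label, targets are 0/1 floats per label."""
    # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(logits)
    loss = np.maximum(logits, 0) - logits * multi_hot + np.log1p(np.exp(-np.abs(logits)))
    return loss.mean()


logits = np.zeros((2, 3))                  # 2 examples, 3 labels, raw scores from the last layer
class_index = np.array([0, 2])             # multi-class targets: one class index per example
multi_hot = np.array([[1.0, 0.0, 1.0],     # multi-label targets: any subset of labels
                      [0.0, 0.0, 1.0]])

print(cross_entropy(logits, class_index))   # log(3) for all-zero logits
print(bce_with_logits(logits, multi_hot))   # log(2) for all-zero logits
```

When switching between the two, the things worth checking are the target shape and dtype, whether a sigmoid or softmax is applied before a loss that already includes it, and the threshold used to turn per-label scores into predictions for F1 and recall.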

hallow cargo Dec 4, 2023, 3:30 PM

#

mild dirge If your gpu is only busy for a part of a second every few seconds, then your cpu...

Its currently running on single digit percentages if not decimals on the program, and the rest on my computer is only occupying about 10%

mild dirge Dec 4, 2023, 3:31 PM

#

How long did it take to load a single batch then?

#

And do you use 1 or multiple workers for data loading?

hallow cargo Dec 4, 2023, 3:32 PM

#

I think we've got the issue, I haven't defined that

#

Thank you

past meteor Dec 4, 2023, 3:37 PM

#

buoyant vine I don't GRU creates the embedding? With my setup at least, we have Text -> GloV...

So what comes out of glove are frequently also called embeddings, hence why I was curious to see what setup you were using

#

I'd have to think about your BCE Multiclass issue

#

If I remember correctly you are truly trying to do multi label classification

buoyant vine Dec 4, 2023, 3:39 PM

#

yeah

buoyant vine Dec 4, 2023, 3:40 PM

#

past meteor So what comes out of glove are frequently also called embeddings, hence why I wa...

Yes, but it is embedding per word, so normally you need to run it through at least a couple linear layers, but the GRU or LSTM allows it to better interpret words next to one another / the sentence itself rather than just the words

past meteor Dec 4, 2023, 3:40 PM

#

Have you made confusion matrices to see what is up at a basic level?

past meteor Dec 4, 2023, 3:41 PM

#

buoyant vine Yes, but it is embedding per word, so normally you need to run it through at lea...

Correct

buoyant vine Dec 4, 2023, 3:44 PM

#

past meteor Have you made confusion matrices to see what is up at a basic level?

it is complete nonsense 😅

#

it is not really even in a state where it is useful to look at the confusion matrixes

#

you can see it just kinda nukes itself

past meteor Dec 4, 2023, 4:17 PM

#

Debugging other people's ML models, heck even my own models is such a hassle

#

There's nothing off the top of my head that I can recommend, sorry!

river cape Dec 4, 2023, 4:23 PM

#

left tartan What's the name of the envronment you're running? See top right of vscode, ie:

cold osprey Dec 4, 2023, 4:24 PM

#

river cape

change to the environment u have ipykernel installed

river cape Dec 4, 2023, 4:26 PM

#

left tartan What's the name of the envronment you're running? See top right of vscode, ie:

See, what I have done is create a virtual environment for notebooks, and using pip install ipykernel, I have installed the ipykernel packages

river cape Dec 4, 2023, 4:26 PM

#

cold osprey change to the environment u have ipykernel installed

Yes I have

cold osprey Dec 4, 2023, 4:27 PM

#

after creating the venv, u have to activate that venv before installing any packages

river cape Dec 4, 2023, 4:27 PM

#

my venv is activated and I have installed the packages

cold osprey Dec 4, 2023, 4:28 PM

#

pip list and see that it's there

river cape Dec 4, 2023, 4:28 PM

#

#

Yep it is there in that

river cape Dec 4, 2023, 4:29 PM

#

cold osprey pip list and see that it's there

I have used this command

cold osprey Dec 4, 2023, 4:29 PM

#

river cape

top right should say myenv

#

my venv's name is venv and its running python 3.9.13

cold osprey Dec 4, 2023, 4:31 PM

#

river cape I have used this command

im not familiar with this command. i just activate the env and pip install what i need

#

can u pip list with the venv activated?

#

should be a short list since u only have ipykernel installed

spring scarab Dec 4, 2023, 4:32 PM

#

Has anyone successfully taken a Keras trained model and converted it to Onnx?

river cape Dec 4, 2023, 4:37 PM

#

cold osprey can u pip list with the venv activated?

See the ipykernel version

cold osprey Dec 4, 2023, 4:38 PM

#

river cape See the ipykernel version

yep, and python isnt a python package so it doesnt get listed. theres ways to specify the python version it should be running

river cape Dec 4, 2023, 4:38 PM

#

cold osprey yep, and python isnt a python package so it doesnt get listed. theres ways to sp...

Wait a sec

#

cold osprey Dec 4, 2023, 4:38 PM

#

cold osprey my venv's name is venv and its running python 3.9.13

but ye, in vscode, it should show the venv name which yours isnt

#

it just shows the python version

river cape Dec 4, 2023, 4:39 PM

#

cold osprey it just shows the python version

#

it shows jupyter kernel

cold osprey Dec 4, 2023, 4:39 PM

#

#

click python environments

#

and u shud see a list

river cape Dec 4, 2023, 4:40 PM

#

Seee these both

cold osprey Dec 4, 2023, 4:41 PM

#

yes select myenv

#

should work then

river cape Dec 4, 2023, 4:41 PM

#

for which one?

cold osprey Dec 4, 2023, 4:41 PM

#

python environment

river cape Dec 4, 2023, 4:41 PM

#

jupyter kernel or

cold osprey Dec 4, 2023, 4:41 PM

#

im not sure what the jupyter kernel one is for actually

river cape Dec 4, 2023, 4:42 PM

#

cold osprey im not sure what the jupyter kernel one is for actually

Thank you so muchhhhh it works

wanton merlin Dec 4, 2023, 11:12 PM

#

Yo people I want to learn AI/ML any suggestions of tutorials and resources. A dumbed down version ?

left tartan Dec 5, 2023, 12:24 AM

#

wanton merlin Yo people I want to learn AI/ML any suggestions of tutorials and resources. A du...

Depends what you mean... like, want to just learn how to write some code that uses AI/ML libraries... or want to learn the concepts/intuition behind it? If the former, start with some kaggle.com/learn and "CS50 for AI" and https://www.3blue1brown.com/topics/neural-networks, perhaps. If concepts/science, see the pins which has some good reading tips.

faint ingot Dec 5, 2023, 7:55 AM

#

https://discord.com/channels/267624335836053506/1181491135550586920

feral kernel Dec 5, 2023, 8:28 AM

#

How much vram can you solder onto a rtx 4090? 64 or 128gb?

fiery bane Dec 5, 2023, 10:29 AM

#

wanton merlin Yo people I want to learn AI/ML any suggestions of tutorials and resources. A du...

Here's a list: Just know stuff. (Or, how to achieve success in a machine learning PhD.) https://kidger.site/thoughts/just-know-stuff/


#

I feel like all these concepts are related, and I'm wondering if there's a taxonomy to organize these ideas?

Continual learning (lifelong learning, incremental learning).
Meta-learning (learning-to-learn), few shots learning.
Transfer learning, domain adaptation.

brittle storm Dec 5, 2023, 11:02 AM

#

hi

#

can someone help me?

cold osprey Dec 5, 2023, 11:21 AM

#

brittle storm can someone help me?

Hello, please don't ask to ask, as this makes it take longer for people to help you. Please ask your actual question.

brittle storm Dec 5, 2023, 11:23 AM

#

cold osprey Hello, please don't ask to ask, as this makes it take longer for people to help ...

i am building a desktop assistant that controls my PC.. i am trying to build a function and below are the requirements:

its a reminder function where i will ask it to set a reminder on a date and then it takes the date and then sets a reminder in google calendar...

spice mountain Dec 5, 2023, 11:43 AM

#

Hey, so I have an issue;

I have downloaded a .ckpt file from the internet for VQGAN (https://github.com/CompVis/taming-transformers/tree/master). Whenever I try to extract it from .zip it turns into a folder, which I can't pass into my torch-Lightning program. Anybody got a clue what to do? I am running on Scientific Linux (an HPC cluster)

wanton merlin Dec 5, 2023, 11:57 AM

#

left tartan Depends what you mean... like, want to just learn how to write some code that us...

I'm sorry for my vague message , I want to understand the concepts based on live projects. How to tune parameters , feature extraction all these concepts I need to understand. So what's your suggestion ?

feral kernel Dec 5, 2023, 12:36 PM

#

Yo why is pytorch so buggy on mac, literally almost every time i run it, it says error?

serene scaffold Dec 5, 2023, 1:57 PM

#

feral kernel Yo why is pytorch so buggy on mac, literally almost every time i run it, it says...

if you want help with an error message, don't say that you got an error. just show the error message

feral kernel Dec 5, 2023, 1:58 PM

#

serene scaffold if you want help with an error message, don't say that you got an error. just sh...

I know that, i usually will post the error, but is there anyway to improve the performances on a mac ?

serene scaffold Dec 5, 2023, 1:59 PM

#

feral kernel I know that, i usually will post the error, but is there anyway to improve the p...

you won't get bugs for no reason other than that you're on a mac.

feral kernel Dec 5, 2023, 2:01 PM

#

serene scaffold you won't get bugs for no reason other than that you're on a mac.

Yes, im still learning and probably chatgpt doesn’t give the most reliable code even after some modifications. Is there a way to improve performance?

serene scaffold Dec 5, 2023, 2:07 PM

#

feral kernel Yes, im still learning and probably chatgpt doesn’t give the most reliable code ...

there's no way to answer "how do I improve performance" unless you say specifically what you're trying to do.

#

and by performance, do you mean "make it faster" or "get rid of errors"?

feral kernel Dec 5, 2023, 2:08 PM

#

serene scaffold there's no way to answer "how do I improve performance" unless you say specifica...

Cool, i mean both, i will show you the code later when i get home. Im writing a fourier Convolutional NN now

quaint loom Dec 5, 2023, 2:42 PM

#

Hi guys. I am currently trying to do the Random forest test on my data. I want the random forest to do the test on 4 different areas. The "Position" column is numeric and I have filtered them out like this:
'Restored Area 1': [1, 2, 3, 4],
'Restored Area 2': [9, 10, 11, 12],
'Unrestored Area 1': [5, 6, 7, 8],
'Unrestored Area 2': [13, 14, 15, 16]

Is there anyone who can see what mistake I make? I end up with everything in A.

Here is the code: https://paste.pythondiscord.com/UJOA

long canopy Dec 5, 2023, 3:33 PM

#

is there a term to designate old-school AIs, i.e., AI before the likes of ChatGPT?

#

e.g., AI in Halo 1's enemy NPCs

agile cobalt Dec 5, 2023, 3:36 PM

#

I wouldn't even call these "AIs", just NPC at best
if I had to guess, it probably just uses some path-finding algorithm like A* to find the shortest path between the NPC and the player, then takes that path

long canopy Dec 5, 2023, 3:39 PM

#

found it: GOFAI is the term

serene scaffold Dec 5, 2023, 3:48 PM

#

long canopy is there a term to designate old-school AIs, i.e., AI before the likes of ChatGP...

what counts as "AI" within the field shifts over time. But ChatGPT is an example of deep learning, which is a subset of machine learning, which is a subset of AI

#

though these days, I can't really think of an example of not-machine learning that's still considered "AI"

neon field Dec 5, 2023, 4:08 PM

#

does anyone have convolution dataset for satellites signals in FEC (forward error correction)

jagged pulsar Dec 5, 2023, 4:16 PM

#

hi, so I'm trying to play a wave file with pyaudio and dynamically plot it with matplotlib. I already have a script for plotting it in a static way and it looks like this

import wave
import numpy as np
import matplotlib.pyplot as plt

# Read the sample rate and all frames before the file is closed
with wave.open("test.wav", "rb") as wav:
    sample = wav.getframerate()
    raw = wav.readframes(-1)

# Interpret the bytes as 16-bit samples (assumes 16-bit mono PCM)
raw = np.frombuffer(raw, "int16")

# Time axis in seconds, one point per sample
time = np.linspace(0, len(raw) / sample, num=len(raw))

plt.plot(time, raw, color="green")
plt.show()

but I have no idea how to do this, can you help me?

quaint gorge Dec 5, 2023, 5:37 PM

#

I want to evaluate the effectiveness of a text classification algorithm in terms of how many mistakes it makes. Does anyone with experience know where to look? I want to know how many mistakes it will make on average in a sample of size n

past meteor Dec 5, 2023, 5:46 PM

#

long canopy is there a term to designate old-school AIs, i.e., AI before the likes of ChatGP...

I agree with Stelercus and Etrotta's answers.

Just to add, pathfinding algorithms (breadth-first, depth-first, A*, ...) are typically considered "AI" and they aren't machine learning at all. They all fall under the broader category of Knowledge representation and reasoning https://en.wikipedia.org/wiki/Knowledge_representation_and_reasoning.

Personally, I think this stuff still matters to an extent because, unlike with ML, with reasoning you typically get exact, whitebox answers, whereas ML typically gives you an approximation that is also pretty opaque.
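
To make the "exact, whitebox" point concrete, here is a minimal sketch of A* on a small grid. It is only an illustration, not how any particular game implements its NPCs: the `astar` function, the grid encoding (`#` for walls) and the Manhattan-distance heuristic are all assumptions chosen for the example.

```py
import heapq


def astar(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    grid is a list of equal-length strings; '#' marks a wall.
    Cells are (row, col) tuples and every move costs 1.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected unit grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (estimate, cost so far, cell)
    came_from = {start: None}
    best_cost = {start: 0}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found already
        row, col = cell
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
                continue
            if grid[r][c] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                came_from[nxt] = cell
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


grid = [
    "S.#.",
    "..#.",
    "...G",
]
path = astar(grid, (0, 0), (2, 3))
print(len(path) - 1)  # 5 moves
```

Every step of the search can be inspected (the open heap, the cost table, the parent links), and the returned path is provably shortest for this heuristic, which is the contrast with an opaque learned approximation.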

long canopy Dec 5, 2023, 5:47 PM

#

past meteor I agree with Stelercus and Etrotta's answers. Just to add, pathfinding algorith...

thanks a lot for the keywords and the references!

past meteor Dec 5, 2023, 5:49 PM

#

long canopy thanks a lot for the keywords and the references!

https://aima.cs.berkeley.edu/ Russell and Norvig's "Artificial Intelligence: A Modern Approach" covers many of these things btw. The first four parts (600 pgs) are mostly non-ML. If you want more keywords I recommend just looking at the table of contents 😄

long canopy Dec 5, 2023, 5:51 PM

#

past meteor <https://aima.cs.berkeley.edu/> Russell and Norvig's "Artificial Intelligence: A ...

NICE, this is exactly what I need, I sort of have an academic bent and I need to begin getting a high level overview of the subject, thank you very much!

river cape Dec 5, 2023, 5:54 PM

#

Hey guys could anyone suggest any ml models that I can use for repair and service website?

past meteor Dec 5, 2023, 7:32 PM

#

Can you format this, this is unreadable. I don't think people will bother to read this sorry 😅

feral kernel Dec 5, 2023, 7:34 PM

#

past meteor Can you format this, this is unreadable. I don't think people will bother to rea...

Here is the new code and new error, i tried to reduce the batch size and increase dimension and decrease the indices. `import torch.nn as nn
import torch.optim as optim
import torch
import time

# Define the FCNN with Bessel activation

class FCNN(nn.Module):
    def __init__(self):
        super(FCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 64, kernel_size=3)
        self.fc1 = nn.Linear(64 * 62 * 62, 256)
        self.fc2 = nn.Linear(256, 10)
        self.bessel = torch.special.bessel_j0  # Bessel function as activation

    def forward(self, x):
        x = self.conv1(x)
        x = x.view(-1, 64 * 62 * 62)  # Reshape for fully connected layer
        x = self.bessel(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model, loss function, and optimizer

model = FCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.LBFGS(model.parameters(), lr=0.01, max_iter=20)  # Quasi-Newtonian optimizer

# Break the input matrix into 8x8 matrices

batch_size = 8
num_batches = fft_result_tensor.shape[1] // batch_size

# Training loop

def closure():
    optimizer.zero_grad()
    total_loss = 0.0

    for i in range(num_batches):
        start_idx = i * batch_size
        end_idx = start_idx + batch_size

        # Extract an 8x8 tensor from the input
        input_batch = fft_result_tensor[:, start_idx:end_idx].unsqueeze(0).unsqueeze(1)

        # Forward pass
        outputs = model(input_batch)

        # Calculate loss
        target_labels = torch.tensor([0])  # Replace with your target labels
        loss = criterion(outputs, target_labels)

        # Accumulate loss
        total_loss += loss

    # Backward pass
    total_loss.backward()
    return total_loss

# Perform optimization

start_time = time.time()
for epoch in range(10):  # Adjust the number of epochs as needed
    optimizer.step(closure)

end_time = time.time()
training_time = end_time - start_time
print(f"Training time: {training_time} seconds")`

#

`1510 else:
-> 1511 return self._call_impl(*args, **kwargs)

File /opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
1515 # If we don't have any hooks, we want to skip the rest of the logic in
1516 # this function, and just call forward.
1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1518 or _global_backward_pre_hooks or _global_backward_hooks
1519 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520 return forward_call(*args, **kwargs)
1522 try:
1523 result = None

Cell In[27], line 17, in FCNN.forward(self, x)
15 def forward(self, x):
16 x = self.conv1(x)
---> 17 x = x.view(-1, 64 * 62 * 62) # Reshape for fully connected layer
18 x = self.bessel(self.fc1(x))
19 x = self.fc2(x)

RuntimeError: shape '[-1, 246016]' is invalid for input of size 78720`

past meteor Dec 5, 2023, 7:36 PM

#

feral kernel Here is the new code and new error, i tried to reduce the batch size and increas...

Turned out to be longer than I expected, can you paste it all in here? https://paste.pythondiscord.com/

feral kernel Dec 5, 2023, 7:36 PM

#

I will reduce the batch size even lower

feral kernel Dec 5, 2023, 7:38 PM

#

past meteor Turned out to be longer than I expected, can you paste it all in here? https://p...

it doesn't let me send a file

past meteor Dec 5, 2023, 7:38 PM

#

Ah, you should paste the code in there

feral kernel Dec 5, 2023, 7:38 PM

#

past meteor Ah, you should paste the code in there

I did

past meteor Dec 5, 2023, 7:39 PM

#

And then send us the link here

feral kernel Dec 5, 2023, 7:39 PM

#

https://paste.pythondiscord.com/GDQQ

#

I tried to download the text and sent it to the chat but it won't accept it

past meteor Dec 5, 2023, 7:40 PM

#

Alright, that's better. (I also don't answer to DMs, sorry)

#

Where does fft_result_tensor come from?

feral kernel Dec 5, 2023, 7:47 PM

#

past meteor Where does `fft_result_tensor` come from?

IT comes from the fourier compression of the images from before. I converted all the images into a fourier csv file.

#

Also I changed the elements to 26240, but it still doesnt work

past meteor Dec 5, 2023, 7:57 PM

#

feral kernel IT comes from the fourier compression of the images from before. I converted all...

fyi: Being rude or pinging and then deleting your message reduces your chance of getting help.

Your issue is basically that the shapes don't line up. You can compute the size of your output image if you're not using padding. It's going to be that * your number of output channels, being 64. This shape needs to be "resizeable" to the input of your FC layer, which it's not.

#

This can be annoying to do, but what I did in the past is use pen and paper and literally compute this. Maybe add it in comments and track it.
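
The pen-and-paper computation can also be sketched in a few lines. This is a minimal sketch rather than anyone's code from the thread: `conv_out` and `flattened_size` are hypothetical helper names, the formula is the standard output-size formula for a convolution, and the 207 x 8 input assumes the tensor was 207 x 256 (as reported later in the conversation) and sliced 8 columns at a time.

```py
def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def flattened_size(height, width, out_channels, kernel, stride=1, padding=0):
    """Elements per sample once the conv output is flattened for an FC layer."""
    out_h = conv_out(height, kernel, stride, padding)
    out_w = conv_out(width, kernel, stride, padding)
    return out_channels * out_h * out_w


# A 207 x 8 slice through a 3x3 conv with 64 output channels and no padding
print(flattened_size(207, 8, 64, kernel=3))  # 78720, the size in the error

# What nn.Linear(64 * 62 * 62, 256) implicitly assumes: a 64 x 64 input
print(flattened_size(64, 64, 64, kernel=3))  # 246016
```

Under these assumptions the two numbers match the ones in the `RuntimeError`: the layer expects 246016 features per sample, but the convolution only produces 78720.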

feral kernel Dec 5, 2023, 7:58 PM

#

past meteor fyi: Being rude or pinging and then deleting your message reduces your chance of...

THanks, I deleted the old messages since it wasn't formatted right.

#

So i added the stride and padding size to the code and changed the code to have 3 channels and 256 as , so (256*3 channels +2 -1 )/1 +1=770. Nvm i need to check the size of the csv file first

past meteor Dec 5, 2023, 8:16 PM

#

feral kernel So i added the stride and padding size to the code and changed the code to have ...

Are you using a jupyter notebook? If not, I really recommend you do

feral kernel Dec 5, 2023, 8:16 PM

#

past meteor Are you using a jupyter notebook? If not, I really recommend you do

Yes

past meteor Dec 5, 2023, 8:17 PM

#

Then you should be able to call .shape on the fft_result_tensor

feral kernel Dec 5, 2023, 8:21 PM

#

past meteor Then you should be able to call `.shape` on the `fft_result_tensor`

Size of fft_result_matrices: torch.Size([207, 256])

past meteor Dec 5, 2023, 8:22 PM

#

feral kernel Size of fft_result_matrices: torch.Size([207, 256])

Can I assume you have 207 images?

feral kernel Dec 5, 2023, 8:23 PM

#

past meteor Can I assume you have 207 images?

less than that, 69 images (2 docs) but 3 RGB channels, so my input size is 52992?

past meteor Dec 5, 2023, 8:24 PM

#

I'd make sure your shapes are truly 69 x 256 x 3 so you don't have an oopsie

#

You can use the equation above to calculate how large your output will be, it'll be X * Y * 64

feral kernel Dec 5, 2023, 8:26 PM

#

past meteor I'd make sure your shapes are truly 69 x 256 x 3 so you don't have an oopsie

that makes sense, i used chatgpt to write the size, I should've done it myself

past meteor Dec 5, 2023, 8:27 PM

#

Indeed

#

Don't be afraid to use a debugger to track the shapes throughout

feral kernel Dec 5, 2023, 8:30 PM

#

past meteor Don't be afraid to use a debugger to track the shapes throughout

Thanks a lot, yeah i need to slow down a little bit

feral kernel Dec 5, 2023, 8:34 PM

#

past meteor Indeed

ValueError: Expected input batch_size (207) to match target batch_size (1). I changed the output size to match the input size but still this error, so i changed x = x.view(1, 64 * 36 * 23)

past meteor Dec 5, 2023, 8:34 PM

#

How familiar are you with PyTorch?

feral kernel Dec 5, 2023, 8:35 PM

#

past meteor How familiar are you with PyTorch?

not much, i know some linear algebra, but my tensor knowledge is somewhat less but i know some.

#

so i need to change batch size to 207?

past meteor Dec 5, 2023, 8:36 PM

#

I'd really consider going through the docs https://pytorch.org/tutorials/beginner/basics/intro.html. The shape mismatch errors are frustrating, but they'll keep happening if you don't get the basics

feral kernel Dec 5, 2023, 8:44 PM

#

past meteor I'd really consider going through the docs <https://pytorch.org/tutorials/beginn...

Weird i got the shape right, but still another error. Yeah, i need to learn how to read the code better. If it was written like matrices in math books , it would be easier for me to understand.

feral kernel Dec 5, 2023, 8:51 PM

#

feral kernel Weird i got the shape right, but still another error. Yeah, i need to learn how ...

RuntimeError: shape '[-1, 52992]' is invalid for input of size 13248 I changed to [-1, 13248] but still the same error

past meteor Dec 5, 2023, 8:52 PM

#

I think at this point it's really just best you read those docs, especially if you're in it for the long haul

#

I also got to go so I won't be able to help

feral kernel Dec 5, 2023, 8:55 PM

#

Yep, i skimmed and read some of it earlier today and before , i will read more and try to get used to code formatting and syntax. Also why does jupyter sometimes run a code and says it is successful but it doesnt show any progress and printing even though i wrote print?

desert onyx Dec 5, 2023, 9:10 PM

#

Hi, is there anyone who can help me with something simple?

small wedge Dec 5, 2023, 9:23 PM

#

desert onyx Hi, is there anyone who can help me with something simple?

it's better to go ahead and ask your question directly rather than asking if anyone can help

long canopy Dec 6, 2023, 2:48 AM

#

anything I should be following if I'm interested in attempts to model, in the sense of mathematical-ish modeling, emergent abilities of LLMs?

left tartan Dec 6, 2023, 3:26 AM

#

feral kernel Yep, i skimmed and read some of it earlier today and before , i will read more ...

Can you share the code that you’re wondering about?

lucid tide Dec 6, 2023, 5:03 AM

#

Who has knowledge on tensorflow, QNN, PNN and transformer architectures?

feral kernel Dec 6, 2023, 5:26 AM

#

left tartan Can you share the code that you’re wondering about?

Thanks! `import torch.nn as nn
import torch.optim as optim
import torch
import time

# Define the FCNN with Bessel activation

class FCNN(nn.Module):
    def __init__(self):
        super(FCNN, self).__init__()
        # Set padding and stride for the convolutional layer
        self.conv1 = nn.Conv2d(1, 64, kernel_size=3, padding=1, stride=1)

        # Modify the size of the fully connected layer to match the input tensor dimensions
        self.fc1 = nn.Linear(69 * 256 * 3, 256)  # Adjusted size for 4x4 matrices
        self.bessel = torch.special.bessel_j0  # Bessel function as activation
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = x.view(-1, 64 * 36 * 23)  # Reshape for the fully connected layer

        # Apply Bessel activation to the reshaped input tensor
        x = self.bessel(x)
        x = x.view(0, 13428)  # Reshape back to original dimensions
        x = self.fc2(x)
        return x

# Instantiate the model, loss function, and optimizer

model = FCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.LBFGS(model.parameters(), lr=0.01, max_iter=20)  # Quasi-Newtonian optimizer

# Break the input matrix into 4x4 matrices

batch_size = 1
num_batches = fft_result_tensor.shape[1] // batch_size

# Training loop

def closure():
    optimizer.zero_grad()
    total_loss = 0.0

    for i in range(num_batches):
        start_idx = i * batch_size
        end_idx = start_idx + batch_size

        # Extract a 4x4 tensor from the input
        input_batch = fft_result_tensor[:, start_idx:end_idx].unsqueeze(0).unsqueeze(1)

        # Forward pass
        outputs = model(input_batch)

        # Calculate loss
        target_labels = torch.tensor([0])  # Replace with your target labels
        loss = criterion(outputs, target_labels)

        # Accumulate loss
        total_loss += loss

    # Backward pass
    total_loss.backward()
    return total_loss`

winter drift Dec 6, 2023, 7:11 AM

#

is anyone familiar with finetuning gpt, i built a dataset and would like to test it out but am a bit lost

split compass Dec 6, 2023, 8:50 AM

#

Greetings everyone
I have started to work on RLHF recently. So I'm thinking is there anyone who has any kind of experience in it.

rigid cape Dec 6, 2023, 1:09 PM

#

hey guys , is there any recommended curriculum for learning machine learning using python ?

lapis sequoia Dec 6, 2023, 1:12 PM

#

rigid cape hey guys , is there any recommended curriculum for learning machine learning usi...

Do you already know Python?

rigid cape Dec 6, 2023, 1:12 PM

#

lapis sequoia Do you already know Python?

yeah

#

I know basic Data analysis using pandas and numpy but thats it .

lapis sequoia Dec 6, 2023, 1:16 PM

#

so check the ML with python from IBM on coursera

#

there's an audit version

lapis sequoia Dec 6, 2023, 2:28 PM

#

hi im trying to implement face recognition in python. for that i created a venv and use vscode with a jupyter notebook. i am wondering why i get no output from this code:
for directory in os.listdir("lfw"):
    for file in os.listdir(os.path.join("lfw", directory)):
        os.path.join("lfw", directory, file)
        os.path.join(NEG_PATH, file)
        print(file)
        print("hello")
the folder lfw exists, the NEG_PATH exists, and not even the print("hello") statement works. any idea why?

cold osprey Dec 6, 2023, 2:29 PM

#

!code

arctic wedgeBOT Dec 6, 2023, 2:29 PM

#

Formatting code on discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

lapis sequoia Dec 6, 2023, 3:00 PM

#

hi im trying to implement face recognition in python. for that i created a venv and use vscode with a jupyter notebook. i am wondering why i get no output from this code:
for directory in os.listdir("lfw"):
    for file in os.listdir(os.path.join("lfw", directory)):
        os.path.join("lfw", directory, file)
        os.path.join(NEG_PATH, file)
        print(file)
        print("hello")
the folder lfw exists, the NEG_PATH exists, and not even the print("hello") statement works. any idea why?

devout python Dec 6, 2023, 3:00 PM

#

Okay this is a super dumb question but ^c mean, I get it for some reason when I run my script on colab, but nowhere else...

#

what does ^c mean*

tidal bough Dec 6, 2023, 3:03 PM

#

that's what ctrl-C (interrupt) tends to write

devout python Dec 6, 2023, 3:04 PM

#

Any idea why colab writes that - it runs well on my desktop?

#

can it be because the files are too big to hold in memory?

river cape Dec 6, 2023, 3:12 PM

#

So I have a service - based website in which we undertake repairs of any electronic gadgets. So I was thinking of implementing a chatbot which could answer the most basic queries and if the user's problem is still not solved , then on the existing data received by the chat bot , I would like to run a price prediction model which computes the approx. cost for the damage or repair incurred. Is there any way to implement this? Please I need help

cold osprey Dec 6, 2023, 3:24 PM

#


for directory in os.listdir("lfw"):
    for file in os.listdir(os.path.join("lfw", directory)):
        os.path.join("lfw", directory, file)
        os.path.join(NEG_PATH, file)
        print(file)
        print("hello")

#

cant rly tell but r there indent problems?

#

cant help much without seeing the actual file directory too
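
One way to see the actual directory layout is to list it explicitly before looping. The helper below is only a sketch under assumptions: `files_by_subdir` is a hypothetical name, and it assumes the layout implied by the loop above (one level of subfolders under `lfw`, each holding image files).

```py
from pathlib import Path


def files_by_subdir(root):
    """Map each subdirectory name under root to its sorted file names."""
    root = Path(root)
    if not root.is_dir():
        # Show the absolute path so a wrong working directory is obvious
        raise FileNotFoundError(f"{root.resolve()} is not a directory")
    return {
        sub.name: sorted(p.name for p in sub.iterdir() if p.is_file())
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }
```

Calling `files_by_subdir("lfw")` in the notebook, together with `Path.cwd()`, shows which folder the kernel is really looking at. An empty dictionary, or empty lists for every subfolder, would explain a loop that runs without printing anything.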

quick sapphire Dec 6, 2023, 3:35 PM

#

rigid cape hey guys , is there any recommended curriculum for learning machine learning usi...

Microsoft Azure has a new Learn program .

#

It will set you up with an advisor and you can look through the different roles at your convenience on the way to getting your certifications

fallow frost Dec 6, 2023, 3:49 PM

#

How do you guys generate graphs like these? there was a website or smth?

tidal bough Dec 6, 2023, 3:52 PM

#

Well, manually I'd do it via something like https://app.diagrams.net/

#

if you mean programmatically, there is pygraphviz

river cape Dec 6, 2023, 4:02 PM

#

river cape So I have a service - based website in which we undertake repairs of any electro...

Anything for this?

lavish kraken Dec 6, 2023, 4:08 PM

#

Does anyone know why Hyperparameter Grid Search with XGBoost takes like so much time to run? it's frustrating...running for hours endless wtf

#

i have even reduced the size of the data from 200k rows to 2,000

cold osprey Dec 6, 2023, 4:12 PM

#

how big is ur param_grid

lavish kraken Dec 6, 2023, 4:19 PM

#

cold osprey how big is ur param_grid

here are my parameters defined

#

i don't know how to quantify how big it is

cold osprey Dec 6, 2023, 4:19 PM

#

u can try for only 1 set of parameters and see how long that takes

lavish kraken Dec 6, 2023, 4:19 PM

#

how do you guys calculate or know how big a parameter is

cold osprey Dec 6, 2023, 4:19 PM

#

and multiply to get total

lavish kraken Dec 6, 2023, 4:20 PM

#

If i were to share the code snippet, could you help me edit or make some changes to make it run faster

#

can i share my code snippet, respectfully

odd meteor Dec 6, 2023, 4:44 PM

#

lavish kraken Does anyone know why Hyperparameter Grid Search with XGBoost takes like so much ...

Here are two things to do to speed up the training time.

1. Since you're using XGBoost, if you have access to a GPU on your machine, utilize that to speed up your hyperparameter tuning. Add the parameter tree_method='gpu_hist' while instantiating your XGBClassifier (in XGBoost 2.0 and later, the equivalent is tree_method='hist' together with device='cuda').
2. Reduce your search space. The larger your search space, the longer it'll take to finish running. So, you might wanna reduce the number of hyperparameters you're trying to tune and their respective search space.

lavish kraken Dec 6, 2023, 4:51 PM

#

odd meteor Here are two things to do to speed up the training time. 1. Since you're using...

Okay! will try to edit and see what to do

untold bloom Dec 6, 2023, 5:18 PM

#

lavish kraken how do you guys calculate or know how big a parameter is

in Grid search all the combinations are exhaustively tried, so if you get the length of each parameter's candidate values, and multiply them you get the total number of trials in the grid; in code you can do it with ||from math import prod; prod(map(len, your_grid.values()))||

#

or the verbose mode of the GridSearchCV tells you how big it is in the first line ℓoℓ

#

so maybe don't need the manual coding but still

odd meteor Dec 6, 2023, 5:28 PM

#

lavish kraken i don't know how to quantify how big it is

Hopefully this long ass post you're about to read will clarify things for you.

Imagine you're using GridSearchCV for hyperparameter tuning. You're interested in tuning 3 hyperparameters (let's call them A, B, and C for now), and each one of them has 4 candidate values in its search space.

A = [4, 40, 65, 100]
B = ['Hey', 'Hoo', 'Haa', 'Santa']
C = [0.4, 0.25, 1.5, 6.5]

Now, to determine the total number of hyperparameter combinations your GridSearchCV will try, you simply need to multiply the number of unique values in each hyperparameter's search space. In our small example (remember we have 3 hyperparameters and each has 4 candidate values), this will be:

4 (for hyperparameter 1) × 4 (for hyperparameter 2) × 4 (for hyperparameter 3) = 64 fits

So your GridSearchCV will perform a total of 64 fits.

Now, if you're performing cross-validation and the number of folds = 5, then GridSearchCV will perform a total of 320 fits; 64 (total fits from hyperparameter tuning) × 5 (number of cross-validation folds).

Again, now factor in the sample size of your observations, that is, the total number of rows in your data (both train and test). The bigger your train data, the longer it'll take to perform those 320 fits. And of course, the bigger the size of the data you're using for batch prediction (your test data / validation data; whichever one you're calling .predict() on), the longer it takes to make predictions as well (this part doesn't really take much time compared to when the model is being fit to the training data).

So, you see how with just 3 hyperparameters, 4 candidate values each, and 5-fold cross-validation, you're calling .fit() 320 times on your train data. Now imagine what happens when you increase this param_space.

Once you understand the scenario above, you can easily compute the same thing with your current setup.
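
The arithmetic above can be checked directly. This is a small sketch using the toy A, B, C values from the post; `param_grid` and `cv_folds` are just illustrative names, not anyone's real setup.

```py
from itertools import product
from math import prod

# Toy search space from the example: 3 hyperparameters, 4 candidates each
param_grid = {
    "A": [4, 40, 65, 100],
    "B": ["Hey", "Hoo", "Haa", "Santa"],
    "C": [0.4, 0.25, 1.5, 6.5],
}
cv_folds = 5

# Number of combinations is the product of the list lengths
n_candidates = prod(len(values) for values in param_grid.values())
total_fits = n_candidates * cv_folds

# Enumerate the combinations explicitly to confirm the count
combos = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]
assert len(combos) == n_candidates

print(n_candidates, total_fits)  # 64 320
```

Swapping in your own grid dictionary gives the number of fits before you start the search. Note that scikit-learn's GridSearchCV also does one final refit on the whole training set when `refit=True`, which is its default.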

past meteor Dec 6, 2023, 5:44 PM

#

lavish kraken Does anyone know why Hyperparameter Grid Search with XGBoost takes like so much ...

You're training thousands of models

#

Two tips:

1. Focus on tuning the number of estimators hyperparameter (only). It's the highest-value one.
2. Use random search instead of grid search
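
To illustrate the second tip, here is a minimal sketch of random search over a grid using only the standard library. The `sample_configs` helper and the candidate values are assumptions made for the example; the parameter names are common XGBoost ones. In practice scikit-learn's `RandomizedSearchCV` does this sampling for you. Unlike that class, this sketch samples with replacement, so the same configuration can come up twice.

```py
import random


def sample_configs(param_grid, n_iter, seed=0):
    """Draw n_iter random configurations from a dict of candidate lists."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [
        {name: rng.choice(values) for name, values in param_grid.items()}
        for _ in range(n_iter)
    ]


# Hypothetical search space: 3 x 3 x 3 = 27 combinations in a full grid
param_grid = {
    "n_estimators": [100, 200, 400],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
}

configs = sample_configs(param_grid, n_iter=10, seed=42)
print(len(configs))  # 10 configurations to evaluate instead of 27
```

With 5-fold cross-validation, 10 sampled configurations cost 50 fits instead of the 135 a full 3 x 3 x 3 grid would need, and the budget stays fixed no matter how many hyperparameters you add.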

exotic epoch Dec 6, 2023, 6:30 PM

#

hey guys

#

I have to fill a code according to some comments

past meteor Dec 6, 2023, 6:48 PM

#

exotic epoch I have to fill a code according to some comments

Can you ask the actual question please? Then people can jump in directly.

exotic epoch Dec 6, 2023, 6:49 PM

#

okay so here is the google colab link wich contains the code https://colab.research.google.com/drive/14mVqgJjhQWsxq0sddOOGxnpgwmuCa00B#scrollTo=FbZAW_DTLWbp and these are the instructions guys :: Download the iris.csv dataset from kaggle, using this link https://www.kaggle.com/datasets/saurabh00007/iriscsv
Put the data in the same directory of your notebook.
Start following this notebook https://colab.research.google.com/drive/14mVqgJjhQWsxq0sddOOGxnpgwmuCa00B#scrollTo=iHnErH5UAzTY
Read the comments carefully and complete the code whenever it is needed

Iris.csv

Google Colaboratory

#

please help me guys

past meteor Dec 6, 2023, 6:53 PM

#

exotic epoch okay so here is the google colab link wich contains the code https://colab.rese...

It seems like all the cells are already filled in, what do you need help for exactly?
It also seems like homework, what is this?

exotic epoch Dec 6, 2023, 6:53 PM

#

it is a home work yep

#

For example here according to the green instruction (comment) u have to import the confusion matrix

#

and i don't know to do that yet but i am really working on it

past meteor Dec 6, 2023, 6:58 PM

#

So, I think you'll learn more if you do this yourself. I checked the notebook and it's basically exclusively things from scikit-learn

exotic epoch Dec 6, 2023, 6:58 PM

#

yes it is

#

i can't do it myself unfortunately

past meteor Dec 6, 2023, 6:59 PM

#

You can basically CTRL-F here and find everything. You'll get a lot more value by trying: https://scikit-learn.org/stable/modules/classes.html

exotic epoch Dec 6, 2023, 6:59 PM

#

thanks anyways for the help

past meteor Dec 6, 2023, 6:59 PM

#

exotic epoch i can't do it myself unfortunately

Any reason why you can't do it yourself?

exotic epoch Dec 6, 2023, 6:59 PM

#

don't have the necessary knowledge

past meteor Dec 6, 2023, 7:00 PM

#

You're in week 5, if I bail you out right now you'll be stuck later on. I want to help but the best way to help is letting you figure it out 🙂

#

If you have specific questions like "how does model X work" or "why is this method like this and not like that", I and most people here will still be happy to help though

exotic epoch Dec 6, 2023, 7:02 PM

#

okay i see

#

Thanks dude

long canopy Dec 6, 2023, 7:05 PM

#

anything I should be following if I'm interested in attempts to model, in the sense of mathematical-ish modeling, the possibility of the emergence of emergent abilities in LLMs?

serene scaffold Dec 6, 2023, 7:36 PM

#

Anyone going to NeurIPS? I'll be there for the workshops only.

feral kernel Dec 6, 2023, 9:00 PM

#

Hi, im still getting another error, ValueError: not enough values to unpack (expected 3, got 2). even though i changed the height and width and channels, the shape is Shape: torch.

feral kernel Dec 6, 2023, 9:00 PM

#

feral kernel Hi, im still getting another error, ValueError: not enough values to unpack (exp...

`import torch.nn as nn
import torch.optim as optim
import torch
import time

# Define the FCNN with Bessel activation

class FCNN(nn.Module):
    def __init__(self, input_dim):
        super(FCNN, self).__init__()
        # Adjust convolution based on input dimensions
        self.conv1 = nn.Conv2d(1, 64, kernel_size=3)
        # Unpack input dimensions
        channels, height, width = input_dim
        # Hidden layer size based on input and output dimensions
        hidden_size = 64 * height * width
        self.fc1 = nn.Linear(hidden_size, hidden_size)  # Modified size for dynamic input
        self.bessel = torch.special.bessel_j0  # Bessel function as activation
        self.fc2 = nn.Linear(hidden_size, channels * height * width)  # Output same as input

    def forward(self, x):
        x = self.conv1(x)
        x = x.view(-1, x.shape[1] * x.shape[2] * x.shape[3])  # Reshape based on input dimensions
        x = self.bessel(x)
        x = self.fc1(x)
        x = self.fc2(x)
        x = x.view(-1, channels, height, width)  # Reshape to match input
        return x

# Initialize model with actual input dimension

model = FCNN(fft_result_tensor.shape)

# Adjust loss function based on desired output type (e.g., reconstruction)

criterion = nn.MSELoss()

# Use a more suitable optimizer for large datasets

optimizer = optim.Adam(model.parameters())

# Break input matrix into batches

batch_size = 4
num_batches = fft_result_tensor.shape[1] // batch_size

`

digital marsh Dec 6, 2023, 9:11 PM

#

Hello there, I've been looking into AI for 3 days. I watched a video about Flappy Bird where the AI worked well, then tried to do the same process with Snake, but the results aren't that good. I think my problem is the output, and maybe how I add and remove fitness. For the fitness, I just add fitness when the snake eats food, gets a new high score, or when the snakes get a better average score,
and I just remove a snake when it dies and touches itself.
The most important part is the output. I do it like this:
for x, snake in enumerate(snakos.copy()):
    pposition = ["right", "left", "up", "down"]
    output = nets[x].activate(
        (int(snake.body_pos[0][0]), int(snake.body_pos[0][1]),
         closest_apple(snake.body_pos, apples[x])[0],
         closest_apple(snake.body_pos, apples[x])[1]))
    snake.direction = get_optimal_direction(snake.body_pos[0], apples[x], all_position[-1], snake.body_pos) if output[0] > 0.5 else pposition[randint(0, len(pposition) - 1)]
    all_position.append(snake.direction)
    snake.update_position()
    snake.display()
    ge[x].fitness += 0.1

#

config file:https://pastebin.com/FXS8aQZM


#

if you want to see the whole code:https://pastebin.com/J7uXqaQD


#

thanks for the reading and the help

odd meteor Dec 6, 2023, 9:46 PM

#

serene scaffold Anyone going to NeurIPS? I'll be there for the workshops only.

I've had my own fair share of visa rejections to attend NeurIPS even with an accepted workshop paper at BAI. I don't stress it anymore, it'll happen when it will happen.

If you can, send us JPEGs and a memoir of all the hot takes that are definitely gonna be flying around about the OpenAI drama and Q* at the next meeting 😂😂

magic bloom Dec 7, 2023, 1:25 AM

#

I just built an AI data scientist. Any suggestions on how to improve? https://youtu.be/ZjpNx8qNnaA

YouTube

PIPS AI

How to Use Assistants API - Python Tutorial

🚀Explore OpenAI Assistants API🚀

🔗 Github Repo: https://github.com/calapsss/assistants-api-easy

🔗 STAY IN THE LOOP:
Medium: https://pipsworld.medium.com/
Twitter: https://twitter.com/pips_ai

✨What is Assistants API
The Assistants API allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage ...


buoyant vine Dec 7, 2023, 2:10 AM

#

"AnalAssit" is truly an unfortunate name to give it 😅

desert oar Dec 7, 2023, 2:32 AM

#

magic bloom I just built an AI data scientist. Any suggestions on how to improve? https://yo...

is there a TLDR for this?

#

personally, the editing style and clickbait title puts me off

#

you built something and i didn't, so i don't want to be dismissive. maybe i'm just not the target audience for this kind of content.

spare briar Dec 7, 2023, 2:34 AM

#

buoyant vine "AnalAssit" is truly an unfortunate name to give it 😅

can't tell if parody

magic bloom Dec 7, 2023, 2:45 AM

#

desert oar is there a TLDR for this?

This is a Streamlit demo. It conducts data analysis and uses the OpenAI Assistants API https://assistantsapi.streamlit.app/

Streamlit

app

This application is a Streamlit interface for interacting with OpenAI's AI assistants. It allows ...

magic bloom Dec 7, 2023, 2:46 AM

#

desert oar you built something and i didn't, so i don't want to be dismissive. maybe i'm ju...

noted, thanks for the feedback. I had to put the intro in for the YouTube algorithm, based on what the top influencers in the space do, but I did put timestamps to skip to the coding part, which has no editing at all.

magic bloom Dec 7, 2023, 2:46 AM

#

buoyant vine "AnalAssit" is truly an unfortunate name to give it 😅

my bad I forgot to change it. It's analysis + assistant

magic bloom Dec 7, 2023, 2:48 AM

#

desert oar you built something and i didn't, so i don't want to be dismissive. maybe i'm ju...

honestly the title is not clickbait doe, its a full tutorial with all aspects on the assistants api. I integrated the code interpreter and file retrieval unlike most youtube tutorials I saw that only ran the assistant in python. but i get what you mean the thumbnail is clickbaity

serene scaffold Dec 7, 2023, 3:35 AM

#

@magic bloom keep in mind that we don't allow self-promotion when it falls under advertising

magic bloom Dec 7, 2023, 3:39 AM

#

serene scaffold @magic bloom keep in mind that we don't allow self-promotion when it fa...

noted i can remove the youtube video thing and just send my github repo. Im looking for suggestions. Would that be alright?

serene scaffold Dec 7, 2023, 3:39 AM

#

magic bloom noted i can remove the youtube video thing and just send my github repo. Im look...

You an request code reviews, yes

#

Can*

magic bloom Dec 7, 2023, 3:40 AM

#

serene scaffold You an request code reviews, yes

Got it.. Sending it right now

#

Would appreciate a code review. Thanks! 🙏 https://github.com/calapsss/assistants-api-easy


fading scaffold Dec 7, 2023, 5:57 AM

#

how to count std using pd?

odd meteor Dec 7, 2023, 7:53 AM

#

fading scaffold how to count std using pd?

Can you add more clarity to your question? Do you mean how to compute the standard deviation in a pandas dataframe?

fading scaffold Dec 7, 2023, 7:53 AM

#

odd meteor Can you add more clarity to your question. Do you mean how to compute standard d...

yes that's what i mean, pardon me 😅

odd meteor Dec 7, 2023, 7:59 AM

#

fading scaffold yes that's what i mean, pardon me 😅

I presume you've already read your data into pandas and `df` is the name of your dataframe. You can compute the standard deviation of any column using this:

`df['column_name'].std()`

If you want to see the standard deviation of all numeric columns in your data, you can use the `describe()` method to get the descriptive stats: `df.describe()`
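
A minimal sketch of those two calls, assuming a small made-up dataframe (the column names `height` and `weight` are placeholders, not from the asker's data). One thing worth knowing: pandas computes the sample standard deviation (`ddof=1`) by default, so pass `ddof=0` if you want the population version.

```python
import pandas as pd

# Hypothetical data; replace with your own dataframe.
df = pd.DataFrame(
    {"height": [1.0, 2.0, 3.0, 4.0], "weight": [10.0, 10.0, 10.0, 10.0]}
)

height_std = df["height"].std()            # sample std (ddof=1) of one column
height_pop_std = df["height"].std(ddof=0)  # population std of the same column
all_stds = df.std(numeric_only=True)       # std of every numeric column
summary = df.describe()                    # count, mean, std, min, quartiles, max

print(height_std, height_pop_std)
print(summary)
```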

river cape Dec 7, 2023, 12:34 PM

#

Any good sites to find datasets other than kaggle?

#

And how do I increase the accuracy of the model?

past meteor Dec 7, 2023, 12:48 PM

#

river cape And how do I increase the accuracy of the model?

This is an extremely broad question.

In general you need to loop between modelling, inspecting where your model went wrong, and feature engineering/changing the architecture to improve on that.

#

That's why kaggle is a great platform, it's not just about the data but also about what people used it for

toxic mortar Dec 7, 2023, 1:29 PM

#

I don't get why saturation is problematic. Does it mean all of the weights won't have the same sign? Why would that be a bad thing? Some features are worse than others

#

Topic: [Activation functions CNN]

quartz hawk Dec 7, 2023, 2:18 PM

#

Hey guys, so I'm working on a project that requires extracting some data from images in text format (key-value pairs).

The images are scanned PYQs (previous year question papers), and I want to extract the name of the course, course ID, year of examination, type of examination (like major or minor), and department of the course.

So my initial approach is to use some kind of OCR (pytesseract) and use LangChain to extract the key-value pairs from the text.

Is there any better approach to this problem than this?

past meteor Dec 7, 2023, 2:20 PM

#

toxic mortar I dont get why saturation is problematic? That means all of weights won't be sam...

The saturation is bad because it causes the gradient to tail off. As you can see in the image the derivative of the tanh in the extremities is really small

toxic mortar Dec 7, 2023, 2:43 PM

#

past meteor The saturation is bad because it causes the gradient to tail off. As you can see...

Okay, the opposite of saturation is some function that has lim(x->inf) = inf. Let's say the ReLU function. You are telling me that the possibility of the output being a very large number is better than something predictable? Or am I missing the point?

#

Just to confirm, when the gradient is tailing off, that means there are little to no updates?

past meteor Dec 7, 2023, 2:58 PM

#

toxic mortar Okay, the opposite of the saturation is some function that has lim(x->inf) = inf...

Basically, with large input you get close to -1 and 1 which means you get small gradients indeed. It's typically better to have something that allows for the gradient to flow nicely through the network, like relu.

You mentioned "something predictable", relu can cause the numbers internally to be more "unpredictable" or cause something called internal covariate shift but other tricks like batch normalisation account for this.

#

Predictable output in hidden layers is not a goal in and of itself though. With gradient descent you want covariates to be on the same scale. Tanh gives you somewhat of a guarantee here, but not fully. Relu even less so.

past meteor Dec 7, 2023, 3:02 PM

#

toxic mortar Just to confirm, when gradient is tailing off that means there are little to non...

Yes, saturation => low gradient => smaller updates. (Vanishing gradient problem)
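
A rough numeric sketch of that chain, using only the standard library. It is a simplification: it looks at the activation's derivative alone and ignores the weight matrices, and the pre-activation value 3.0 and the depth of 10 layers are arbitrary choices for illustration.

```python
import math


def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


# The tanh derivative collapses as |x| grows; the ReLU derivative stays at 1.
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  tanh'={tanh_grad(x):.2e}  relu'={relu_grad(x):.0f}")

# Backprop multiplies one such factor per layer, so saturated units shrink
# the gradient geometrically with depth.
chained_tanh = tanh_grad(3.0) ** 10
chained_relu = relu_grad(3.0) ** 10
print(chained_tanh, chained_relu)
```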

lavish kraken Dec 7, 2023, 3:04 PM

#

odd meteor Hopefully this long ass post you're about to read will clarify things for you. ...

make sense

toxic mortar Dec 7, 2023, 3:08 PM

#

past meteor Basically, with large input you get close to -1 and 1 which means you get small ...

Yeah, I get what you mean. Thanks zestar

wild sluice Dec 7, 2023, 3:24 PM

#

how would i check a pandas dataframe column for a certain phrase like checking whether a row in the name's column has a certain word

lapis sequoia Dec 7, 2023, 3:25 PM

#

load() missing 1 required positional argument: 'loader' in
```
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

'''
This is an example showing how to create an export file from
an existing chat bot that can then be used to train other bots.
'''

chatbot = ChatBot('Export Example Bot')

# First, lets train our bot with some data
trainer = ChatterBotCorpusTrainer(chatbot)

trainer.train('chatterbot.corpus.english')

# Now we can export the data to a file
trainer.export_for_training('./my_export.json')
```

#

pls help me

mental bane Dec 7, 2023, 3:32 PM

#

i installed tensorflow, but when i run it in a Jupyter notebook it throws the error
"SymbolAlreadyExposedError: Symbol Zeros is already exposed as ()." i can't find a solution, can someone help me with it

long canopy Dec 7, 2023, 4:08 PM

#

past meteor <https://aima.cs.berkeley.edu/> Russel and Norvig's "Artificial Intelligence: A ...

this is an amazing book, and the bibliography is absolutely huge. thanks a lot for this recommendation

feral kernel Dec 7, 2023, 4:15 PM

#

hey, why does jupyter keep dying when i train a neural net? it says not enough memory, but it is a small neural net?

odd meteor Dec 7, 2023, 4:17 PM

#

wild sluice how would i check a pandas dataframe column for a certain phrase like checking w...

Given that the datatype of the column in question is string/object, you can use the pandas `.str` string methods.

`df['column_name'].str.contains('word', na=False, case=False)`
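
A small sketch of how that call behaves, assuming a made-up `name` column (the data and column name are placeholders). `na=False` makes missing values count as "no match", `case=False` ignores case, and the resulting boolean mask can be used to filter rows. By default the pattern is treated as a regular expression; `regex=False` matches it literally.

```python
import pandas as pd

# Hypothetical data; the None entry stands in for a missing name.
df = pd.DataFrame({"name": ["Alice Smith", "bob SMITH", None, "Carol Jones"]})

# Boolean mask: True where the name contains "smith" in any casing.
mask = df["name"].str.contains("smith", case=False, na=False)
matches = df[mask]

# regex=False treats "a." as the literal two characters, not "a + any char".
literal = df["name"].str.contains("a.", regex=False, na=False)

print(matches)
```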

fading scaffold Dec 7, 2023, 4:18 PM

#

odd meteor I presume you've already read your data into pandas and `df` is the name of your...

yes i have defined it with df, thank you

odd meteor Dec 7, 2023, 4:21 PM

#

feral kernel hey, why does jupyter keeps dying when i train a neural net? not enough memory, ...

Does this only happen when you try to train a NN? You need to figure out if it's due to the configuration of your NN architecture or just a JNB issue.

feral kernel Dec 7, 2023, 4:22 PM

#

odd meteor Does this only happen when you try to train a NN? You need to figure out if it's...

i will reduce my batch size, it was using over 40gb of ram to train 69 images, i thought i had switch gpu written but i didn't

warm shard Dec 7, 2023, 4:45 PM

#

Why is no one talking about Gemini?

#

that model is doing insane things

agile cobalt Dec 7, 2023, 4:53 PM

#

most of the multi-modal capabilities will only be released next year, iirc right now the publicly accessible version should be more or less on the same level as GPT-3.5, maybe just a bit better at non-English languages?

one way or the other, it's not that much more impressive than AWS's, Anthropic's and other closed source models

stone glacier Dec 7, 2023, 4:57 PM

#

hey all,
is the python mlx module exclusive to MacOS devices?

#

or can a windows setup or github workspace run it?

feral kernel Dec 7, 2023, 5:05 PM

#

odd meteor Does this only happen when you try to train a NN? You need to figure out if it's...

```py
import torch
import torch.nn as nn
import time
from torch.cuda.amp import autocast, GradScaler
from torch.utils.checkpoint import checkpoint
from torch.optim.lbfgs import LBFGS

class FCNN(nn.Module):
    def __init__(self, input_dim):
        super(FCNN, self).__init__()

        # Adjust convolution based on input dimensions
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3)  # Further reduced filter count

        # Unpack input dimensions
        channels, height, width = input_dim

        # Hidden layer size based on input dimensions
        hidden_size = 16 * height * width

        # Define network layers
        self.fc1 = nn.Linear(hidden_size, hidden_size)
        self.bessel = torch.special.bessel_j0
        self.fc2 = nn.Linear(hidden_size, channels * height * width)

    def forward(self, x):
        # Forward pass
        with torch.inference_mode():
            if is_available():
                x = x.to("mps")
                model = model.to("mps")

            x = self.conv1(x)
            x = x.view(-1, x.shape[1] * x.shape[2] * x.shape[3])
            x = checkpoint.checkpoint(self.fc1, x)
            x = self.bessel(x)
            x = checkpoint.checkpoint(self.fc2, x)
            x = x.view(-1, channels, height, width)

        return x

# Initialize model with actual input dimension
model = FCNN(fft_result_tensor.shape)

# Adjust loss function for MPS
criterion = nn.functional.mse_loss

# Move model to MPS device if available
if is_available():
    model = model.to("mps")

# Break input matrix into batches
batch_size = 4
num_batches = fft_result_tensor.shape[1] // batch_size

# L-BFGS optimizer
optimizer = LBFGS(model.parameters())
```

#

```py
def closure():
    optimizer.zero_grad()
    total_loss = 0.0

    for i in range(num_batches):
        start_idx = i * batch_size
        end_idx = start_idx + batch_size

        # Extract a batch of input
        input_batch = fft_result_tensor[:, start_idx:end_idx].unsqueeze(0).unsqueeze(1)

        # Move input to MPS device if available
        if is_available():
            input_batch = input_batch.to("mps")

        # Forward pass
        with autocast():
            outputs = model(input_batch)

            # Calculate loss
            target_labels = input_batch
            loss = criterion(outputs, target_labels)
            total_loss += loss

        # Print progress
        if i % 10 == 0:  # Adjust print frequency
            print(f"Batch {i+1}/{num_batches}, Loss: {loss.item():.4f}")

    return total_loss

# Perform optimization
start_time = time.time()
for epoch in range(10):
    optimizer.step(closure)

    # Save the model at the end of each epoch
    torch.save(model.state_dict(), f"model_lbfgs_mps_epoch_{epoch+1}.pth")

end_time = time.time()
total_training_time = end_time - start_time
print(f"Total training time: {total_training_time:.2f} seconds")
```

#

How do i fix this, it says not enough values to unpack

desert oar Dec 7, 2023, 5:08 PM

#

@feral kernel 1) i suggest posting longer code section at https://paste.pythondiscord.com/ 2) post the full error message on that same site, including the "traceback" which should point to the exact line of code where the error occurs; you can use that to figure out why the error occurred
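
For the "not enough values to unpack" message specifically, one likely cause (an assumption, since the traceback isn't shown) is that `fft_result_tensor` is 2-D: the batching code slices it as `[:, start:end]` and then adds two dimensions, while `FCNN.__init__` unpacks `input_dim` into three names. A sketch with a plain tuple standing in for `tensor.shape`; the shape `(64, 128)` and the helper `as_chw` are hypothetical.

```python
def as_chw(shape):
    """Return (channels, height, width), treating a 2-D shape as one channel."""
    if len(shape) == 2:
        return (1, *shape)
    if len(shape) == 3:
        return tuple(shape)
    raise ValueError(f"expected a 2-D or 3-D shape, got {tuple(shape)}")


shape_2d = (64, 128)  # placeholder for fft_result_tensor.shape

# Unpacking two values into three names reproduces the reported error.
try:
    channels, height, width = shape_2d
except ValueError as err:
    message = str(err)
    print(message)

# Normalising the shape first gives the constructor the three values it expects.
channels, height, width = as_chw(shape_2d)
print(channels, height, width)
```

The full traceback would confirm whether this is the line that actually fails.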

feral kernel Dec 7, 2023, 5:10 PM

#

desert oar @feral kernel 1) i suggest posting longer code section at https://paste....

https://paste.pythondiscord.com/U6RQ Thanks, i tried to compress the 4d tensor to 3d convolution, but it didn't work

slow totem Dec 7, 2023, 7:27 PM

#

hello! I have an urgent question, and would really, extremely appreciate any help. Thanks in advance

So, im trying to use Llama-7b model to generate a few answers to simple questions. But the issue is, I have a 1gb ram server, and it can absolutely not handle llama, or mistral, unless I want to give up all the intelligence (better just use DialoGPT at that point) I aim to generate an answer to the question using specific documents (but these are general knowledge questions, so the docs are not required, only hopeful that they might reduce computation)

Coming to what I need help with, are there any alternatives of LLMs that arent completely stupid, but work on my 1 gb ram server? Alternatively, is there a hosted/inferencing api for LLAMA or Mistral or any such LLMs that has a free tier that I can work with?

Thanks, have a cookie for reading through 🍪

serene scaffold Dec 7, 2023, 7:34 PM

#

slow totem hello! I have an urgent question, and would really, extremely appreciate any hel...

there is nothing you can do to solve your problem on a 1 GB RAM server. Nothing at all.

slow totem Dec 7, 2023, 7:34 PM

#

serene scaffold there is nothing you can do to solve your problem on a 1 GB RAM server. Nothing ...

ah thanks. Any free hosted inferencing options?

serene scaffold Dec 7, 2023, 7:34 PM

#

slow totem ah thanks. Any free hosted inferencing options?

maybe google colab

slow totem Dec 7, 2023, 7:35 PM

#

serene scaffold maybe google colab

would it allow for a restful API?

serene scaffold Dec 7, 2023, 7:35 PM

#

also, you need access to GPU compute. If you don't have a GPU, you should immediately give up completely on trying to run any instruction-tuned LLMs

serene scaffold Dec 7, 2023, 7:35 PM

#

slow totem would it allow for a restful API?

It would not

slow totem Dec 7, 2023, 7:36 PM

#

serene scaffold also, you need access to GPU compute. If you don't have a GPU, you should immedi...

honestly, even a general purpose LLM suits my needs at the moment. Its more of POC than actual production

slow totem Dec 7, 2023, 7:36 PM

#

serene scaffold It would not

ah, that sucks. Thank you so much for your response

serene scaffold Dec 7, 2023, 7:36 PM

#

slow totem honestly, even a general purpose LLM suits my needs at the moment. Its more of P...

all the LLMs you mentioned are instruction-tuned LLMs

#

LLMs existed before "ChatGPT-like" LLMs became popular. And now everyone thinks "LLM" refers only to ChatGPT-like models

#

(ChatGPT is also an instruction-tuned LLM)

slow totem Dec 7, 2023, 7:40 PM

#

serene scaffold LLMs existed before "ChatGPT-like" LLMs became popular. And now everyone thinks ...

As in DialoGPT and the like? I do not know what the distinction is, I will try to look it up. Are there any other LLMs which would be able to respond to general user queries, or is an instruction-tuned LLM one that is built to respond to prompts?

serene scaffold Dec 7, 2023, 7:41 PM

#

slow totem As in DialoGPT and the likes? Im do not know what the distinction is, I will tr...

Instruction-tuned LLMs are the kind that respond to user input. The other kinds of models that are LLMs are not what you envision an LLM to be.

#

But I can't think of anything worthwhile you could potentially do with AI on a server with only 1GB RAM.

tidal bough Dec 7, 2023, 8:50 PM

#

(An LLM is initially trained on massive datasets to complete text - that is, predict next token repeatedly. That's what some people call a "base model" these days. If you asked this model a question, it might give some answer if you make it look sufficiently like a Q-A dataset, or it may complete the prompt with some questions of its own. There's many tasks such models are useful for, like creative writing, but they aren't assistants of any kind - to turn it into something like ChatGPT, you need to tune it with something like RLHF to alter its utility function from "emit tokens that are like what would follow this text in my training data" to "emit tokens that wouldn't get me punished during RLHF".)

desert hawk Dec 8, 2023, 12:42 AM

#

Howdy, I am a self-taught web dev, which led to me securing a web dev role. Prior to getting into web I was studying and absolutely loved Python (but was given advice to pick up web to secure a job, which ultimately worked out). But now that I'm working in web I feel like I can spend the time to pick up Python again. I am going to go through a 100 days of Python course as well as a TensorFlow course since that seems like a lot of fun!

hollow sentinel Dec 8, 2023, 12:45 AM

#

#

```py
pd.read_excel("/content/Gross Collections, by Type of Tax and State - IRS Data Book Table 5 2022.xlsx", header=3)
```

#

so the problem here is that the header is merged and centered

#

should i just unmerge and uncenter it and see what happens?

#

i don't really know how to read this data

#

```
Unnamed: 0    Unnamed: 1    Unnamed: 2    Total    Individual income\ntax withheld\nand FICA tax [3]    Individual income\ntax payments and \nSECA tax [3]    Unemployment\ninsurance tax    Railroad\nretirement tax    Estate and \ntrust income \ntax [4]    Unnamed: 9    Unnamed: 10    Unnamed: 11
0    NaN    -1.000000e+00    -2.0    -3.000000e+00    -4.000000e+00    -5.000000e+00    -6.0    -7.0    -8.0    -9.0    -10.0    -11.0
1    United States, total    4.901514e+09    475871099.0    4.321609e+09    3.089258e+09    1.133996e+09    7046465.0    6148312.0    85160093.0    28909393.0    4445883.0    70679117.0
2    Alabama    3.605756e+07    1936430.0    3.356064e+07    2.368409e+07    9.255441e+06    73015.0    3525.0    544565.0    267069.0    30921.0    262500.0
```

#

this is what i'm getting so far

#

i'm not sure why i'm getting these unnamed things

#

if anyone knows, feel free to ping me

#

i have no idea what i'm doing

serene scaffold Dec 8, 2023, 1:25 AM

#

hollow sentinel i have no idea what i'm doing

pandas doesn't like all that excel text formatting shit. and having cells that span multiple rows or columns throws it off

#

and it looks like the first row doesn't actually have names for each column

hollow sentinel Dec 8, 2023, 1:26 AM

#

hmmm

hollow sentinel Dec 8, 2023, 1:26 AM

#

serene scaffold pandas doesn't like all that excel text formatting shit. and having cells that s...

so i tried using skiprows too

serene scaffold Dec 8, 2023, 1:27 AM

#

hollow sentinel so i tried using skiprows too

tell it to ignore the first five rows and put the names of each column into the python code manually

hollow sentinel Dec 8, 2023, 1:27 AM

#

serene scaffold tell it to ignore the first five rows and put the names of each column into the ...

gotcha, will try that now.

#

how do i provide the column names manually?

serene scaffold Dec 8, 2023, 1:28 AM

#

idk how well that will work as I've never tried to open an excel sheet with row or column spanning cells

hollow sentinel Dec 8, 2023, 1:28 AM

#

provide a list?

hollow sentinel Dec 8, 2023, 1:28 AM

#

serene scaffold idk how well that will work as I've never tried to open an excel sheet with row ...

yeah, major pain

serene scaffold Dec 8, 2023, 1:28 AM

#

hollow sentinel provide a list?

header=['col1', 'col2', 'col3']

hollow sentinel Dec 8, 2023, 1:28 AM

#

ah, thanks.

serene scaffold Dec 8, 2023, 1:29 AM

#

serene scaffold idk how well that will work as I've never tried to open an excel sheet with row ...

programmers never make excel sheets like this, fortunately

hollow sentinel Dec 8, 2023, 1:30 AM

#

serene scaffold programmers never make excel sheets like this, fortunately

right, because it's a nightmare to process

serene scaffold Dec 8, 2023, 1:30 AM

#

hollow sentinel right, because it's a nightmare to process

if someone ever sends you an excel book with merged cells, send them that sonic kid

#

https://tenor.com/view/actions-consequences-serious-when-will-you-learn-gif-13811783

Tenor

hollow sentinel Dec 8, 2023, 1:32 AM

#

serene scaffold if someone ever sends you an excel book with merged cells, send them that sonic ...

pd.read_excel("/content/Gross Collections, by Type of Tax and State - IRS Data Book Table 5 2022.xlsx", skiprows=5, 
              header = ["Total Internal Revenue collections", "Business Income Taxes", "Total",
                        "Individual income tax withheld and FICA tax", "Individual income tax payments and SECA tax [3]",
                        "Unemployment insurance tax", "Railroad retirement tax", "Estate and trust income  tax [4]", 
                        "Estate tax", "Gift Tax", "Excise Tax"])

#

ValueError Traceback (most recent call last)
<ipython-input-57-d484ef631d33> in <cell line: 1>()
----> 1 pd.read_excel("/content/Gross Collections, by Type of Tax and State - IRS Data Book Table 5 2022.xlsx", skiprows=5,
2 header = ["Total Internal Revenue collections", "Business Income Taxes", "Total",
3 "Individual income tax withheld and FICA tax", "Individual income tax payments and SECA tax [3]",
4 "Unemployment insurance tax"])

5 frames
/usr/local/lib/python3.10/dist-packages/pandas/io/common.py in validate_header_arg(header)
196 header = cast(Sequence, header)
197 if not all(map(is_integer, header)):
--> 198 raise ValueError("header must be integer or list of integers")
199 if any(i < 0 for i in header):
200 raise ValueError("cannot specify multi-index header with negative integers")

ValueError: header must be integer or list of integers

serene scaffold Dec 8, 2023, 1:32 AM

#

oh fuck

#

should be names=. sorry about that
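
A minimal sketch of the `skiprows` + `names` combination. To keep it runnable without the .xlsx file, `pd.read_csv` on an in-memory string stands in for `pd.read_excel` here (both accept `skiprows`, `header`, and `names`); the rows and column names are made up. One detail from the pandas docs: `read_excel` defaults to `header=0`, so when the file has no usable header row it is worth passing `header=None` explicitly, otherwise the first row after the skipped ones is consumed as a header.

```python
import io

import pandas as pd

# Hypothetical stand-in for the spreadsheet: two title rows, then plain data.
raw = io.StringIO(
    "Table 5. Gross Collections\n"
    "(thousands of dollars)\n"
    "Alabama,100,20\n"
    "Alaska,50,5\n"
)

df = pd.read_csv(
    raw,
    skiprows=2,    # ignore the title rows above the data
    header=None,   # no row in the file holds usable column names
    names=["state", "total", "business_income_tax"],  # supply names manually
)

print(df)
```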

hollow sentinel Dec 8, 2023, 1:33 AM

#

well, this will be a cool story to tell on an interview

serene scaffold Dec 8, 2023, 1:33 AM

#

I'm just reading this btw https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html

hollow sentinel Dec 8, 2023, 1:33 AM

#

ah yeah i was reading the same thing, i just couldn't wrap my head around it

hollow sentinel Dec 8, 2023, 1:34 AM

#

serene scaffold I'm just reading this btw https://pandas.pydata.org/docs/reference/api/pandas.re...

ayy it looks like it did something??

serene scaffold Dec 8, 2023, 1:34 AM

#

YAY

hollow sentinel Dec 8, 2023, 1:34 AM

#

serene scaffold YAY

```
     Total Internal Revenue collections  Business Income Taxes        Total  Individual income tax withheld and FICA tax  Individual income tax payments and SECA tax [3]  Unemployment insurance tax  Railroad retirement tax  Estate and trust income  tax [4]  Estate tax  Gift Tax  Excise Tax
Alaska                               6572445.0               150882.0    6323953.0                                    4423914.0                                        1689140.0                     11933.0                   2008.0                          196959.0     35788.0      98.0     61723.0
Arizona                             71814870.0              5116779.0   64739720.0                                   45071814.0                                       18865670.0                    129407.0                   2013.0                          670817.0    209065.0   30814.0   1718491.0
Arkansas                            40231970.0              4846558.0   34464074.0                                   26995802.0                                        6983808.0                    141622.0                   2969.0                          339873.0    157175.0  133501.0    630661.0
California                         696826462.0             77361863.0  608660632.0                                  427216972.0                                      174510813.0                    849338.0                   8073.0                         6075435.0   5778261.0  669690.0   4356015.0
Colorado                            88448670.0              7523650.0   80022210.0                                   55735321.0                                       23393304.0                    111294.0                  20460.0                          761831.0    192972.0   46954.0    662883.0
```

#

that's probably a nightmare to look at

hollow sentinel Dec 8, 2023, 1:35 AM

#

serene scaffold YAY

my idea is to create a project where i compare how much states get paid by the federal government and how much states have to pay the federal government in taxes

#

idk if that's a good idea, but i wanted to do it bc i'm interviewing for a gov org and i figured i'd make it domain specific

serene scaffold Dec 8, 2023, 1:36 AM

#

hollow sentinel my idea is to create a project where i compare how much states get paid by the f...

I think you'll be interested to see which states are net contributors and which are net receivers

hollow sentinel Dec 8, 2023, 1:36 AM

#

serene scaffold I think you'll be interested to see which states are net contributors and which ...

oh like seeing how much they contribute over and how much they contribute under?

#

also, would it be a bad idea to put this in sql?

serene scaffold Dec 8, 2023, 1:40 AM

#

hollow sentinel also, would it be a bad idea to put this in sql?

if it all fits in memory, it's just as well that you do it with pandas

hollow sentinel Dec 8, 2023, 1:40 AM

#

serene scaffold if it all fits in memory, it's just as well that you do it with pandas

i see

#

yeah this isn't like a massive dataset

hollow sentinel Dec 8, 2023, 1:41 AM

#

serene scaffold if it all fits in memory, it's just as well that you do it with pandas

what would be some good visualizations for net contributors and net receivers?

#

like a stacked bar chart?

#

idk what else to do with the data

desert oar Dec 8, 2023, 3:21 AM

#

hollow sentinel what would be some good visualizations for net contributors and net receivers?

Scatterplot of amount received versus amount paid. And then do the same but per capita

#

Could also do some interesting comparisons with state GDP, again both total and per capita

#

Usually it's also a good idea to just look at the distribution of each variable individually
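
A sketch of the bookkeeping behind those comparisons. The numbers are placeholders purely for illustration (not real IRS or federal spending figures), and the column names are assumptions about how the two datasets might be merged.

```python
import pandas as pd

# Placeholder values only; swap in the merged tax and spending tables.
states = pd.DataFrame(
    {
        "state": ["A", "B", "C"],
        "taxes_paid": [100.0, 50.0, 80.0],
        "federal_funds_received": [70.0, 90.0, 80.0],
        "population": [10.0, 5.0, 8.0],
    }
).set_index("state")

# Positive = net contributor, negative = net receiver.
states["net_contribution"] = states["taxes_paid"] - states["federal_funds_received"]
states["net_per_capita"] = states["net_contribution"] / states["population"]

ranked = states.sort_values("net_per_capita", ascending=False)
print(ranked)

# Distribution of each variable individually.
print(states.describe())

# With matplotlib installed, the received-versus-paid scatterplot is one call:
# states.plot.scatter(x="taxes_paid", y="federal_funds_received")
```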

desert oar Dec 8, 2023, 3:22 AM

#

hollow sentinel idk what else to do with the data

in my opinion the best data analysis projects start with hypotheses or questions to be answered

hollow sentinel Dec 8, 2023, 4:17 AM

#

desert oar in my opinion the best data analysis projects start with hypotheses or questions...

i totally agree

serene scaffold Dec 8, 2023, 4:21 AM

#

desert oar in my opinion the best data analysis projects start with hypotheses or questions...

How do magnets work?

hollow sentinel Dec 8, 2023, 4:25 AM

#

i just don’t know what hypothesis i have

graceful anvil Dec 8, 2023, 7:40 AM

#

is anyone here familiar with keras 3?

feral kernel Dec 8, 2023, 8:05 AM

#

Hey, what is the most portable high performance (24gb of vram or greater ) desktop or laptop for machine learning that i can bring as a carry-on that weighs less than 10 pounds. Lol maybe a mac studio?

mild dirge Dec 8, 2023, 9:49 AM

#

feral kernel Hey, what is the most portable high performance (24gb of vram or greater ) deskt...

Maybe better to just hire a server

#

I wouldn't really want to buy a really expensive laptop for good performance, it's pretty bad value for your money compared to a desktop.

glad moth Dec 8, 2023, 9:58 AM

#

It is better to buy a desktop for data analysing or computational operations, if you have a good office.

#

I tried a laptop for my work, and I always suffered from overheating even though I bought a high-spec laptop

past meteor Dec 8, 2023, 10:31 AM

#

glad moth It is better to buy a desktop for data analysing or computational operations, if...

Everyone has a different opinion on this. There's no one correct answer 🙂

#

I prefer a laptop because I'm on the go a lot. I take the train semi frequently and usually work there.

#

Desktops are a lot more cost efficient / value for money and have better longevity. It depends what you're after 🙂

odd meteor Dec 8, 2023, 10:36 AM

#

glad moth I tried a laptop for my work, and I always suffered from overheating even ...

We all have individual preference. For me, nothing beats the feeling of being able to set up my GPU cluster and train my model locally. I can't afford the kind of setup I want at the moment, so for now, I always rent / use Kaggle GPU.

So weigh your options and go with what rocks your boat. If you're someone like me who prefers investing in personal pc, then you might be interested in TensorBook and other machines that have similar spec. https://lambdalabs.com/deep-learning/laptops/tensorbook

Deep Learning Laptop - RTX 3080 Max-Q | Razer x Lambda Tensorbook

Intel i7-11800H (8 cores, 2.30 GHz), 64 GB Memory, 2 x 1 TB, NVMe SSD, Data Science & Machine Learning Optimized. TensorFlow, PyTorch, Keras Pre-Installed. Fast shipping.

wooden sail Dec 8, 2023, 10:56 AM

#

i would advise that no laptop will ever be "good" at ml atm

glad moth Dec 8, 2023, 11:01 AM

#

@past meteor
Yes, working conditions decide which option fits your situation better.

feral kernel Dec 8, 2023, 12:27 PM

#

odd meteor We all have individual preference. For me, nothing beats the feeling of being ab...

Msi with rtx 4090 mobile(4080 desktop) looks pretty fast.

feral kernel Dec 8, 2023, 1:21 PM

#

Where can u rent an H100 or A100 for really cheap? 1.6/hr is expensive if u train for a while

buoyant vine Dec 8, 2023, 1:55 PM

#

you can't

#

GPU machines on the cloud are always very expensive

#

1.6/hr is super cheap though if you have access to a H100 or A100

#

We spent close to $12-16/hr for each one of our training machines

buoyant vine Dec 8, 2023, 1:59 PM

#

odd meteor We all have individual preference. For me, nothing beats the feeling of being ab...

I find this marketing a bit crap tbh, they are advertising it like it is comparable to an actual ML machine or ML GPU, but what they're doing is just putting a regular RTX card which is specialized for Graphics rather than AI/ML type compute.

You'd be better off buying a second-hand older generation of Tensor GPUs or similar than buying a laptop specifically for ML/AI.

#

I love my 3070ti, but it is shit for anything other than fairly small models

#

Trying to do anything productive with it on heavy compute is a nightmare because you have nowhere near enough VRAM

hollow sentinel Dec 8, 2023, 2:06 PM

#

desert oar in my opinion the best data analysis projects start with hypotheses or questions...

States that receive higher financial support from the government tend to contribute a smaller portion of tax revenue to the central government, while states receiving lower financial aid may contribute a higher proportion of tax revenue.

#

thoughts? that's a hypothesis, right?

#

i feel like i get discouraged from my projects because they don't really do anything

hollow sentinel Dec 8, 2023, 2:36 PM

#

i want to derive something interesting from the data and actually answer something

past meteor Dec 8, 2023, 3:12 PM

#

Imo a very important side note to the GPU discussion is that this only applies to LLMs

#

You can do a lot with a laptop if it isn't LLMs. Computer vision for instance doesn't need that much vram. We do it on edge devices for instance.

hollow sentinel Dec 8, 2023, 3:15 PM

#

i can't really think of anything

#

any ideas would be helpful

#

is it too complicated?

#

like a tax v aid analysis essentially

#

if that makes sense

#

this isn't for a school project, i'm doing this for fun

left tartan Dec 8, 2023, 3:32 PM

#

It's a fine question... perhaps start with a scatter plot of [tax revenue] to [financial aid]. Starting with simple graphs is a nice way to get started

hollow sentinel Dec 8, 2023, 3:32 PM

#

left tartan It's a fine question... perhaps start with a scatter plot of [tax revenue] to [f...

i have a follow up question to that

#

what column am i supposed to be using here?

hollow sentinel Dec 8, 2023, 3:33 PM

#

left tartan It's a fine question... perhaps start with a scatter plot of [tax revenue] to [f...

total internal revenue collections?

left tartan Dec 8, 2023, 3:35 PM

#

I dunno what you’re trying to do, but column b appears to be c+d

#

Oh, c+d+j+k+l

hollow sentinel Dec 8, 2023, 3:36 PM

#

left tartan I dunno what you’re trying to do, but column b appears to be c+d

right... so idk exactly what data to use

left tartan Dec 8, 2023, 3:36 PM

#

You have another data source for financial aid from gov?

hollow sentinel Dec 8, 2023, 3:37 PM

#

left tartan You have another data source for financial aid from gov?

https://www.usaspending.gov/state i do, i basically intercepted their json package off their network with inspect element. (don't worry, they literally covered that in their video tutorial so nothing unethical)

left tartan Dec 8, 2023, 3:38 PM

#

First step is usually exploring the data. I would probably do a few scatter plots to see how gov aid relates to corporate and or individual tax.

#

Then, I’d look at change over time; is there some relationship between previous aid and future income

#

Basic exploratory stuff, without getting into anything complex.

#

Other variables might include weather, neighboring states, economic factors like unemployment rates, etc

hollow sentinel Dec 8, 2023, 3:42 PM

#

left tartan Other variables might include weather, neighboring states, economic factors like...

do you want to see the data i pulled from the GET request?

left tartan Dec 8, 2023, 3:42 PM

#

Nah, just giving you pointers

hollow sentinel Dec 8, 2023, 3:48 PM

#

left tartan Nah, just giving you pointers

#

here it is

hollow sentinel Dec 8, 2023, 3:49 PM

#

left tartan Nah, just giving you pointers

i think total_prime_amount is what matters here

#

import requests

# API endpoint URL (the 'year' query parameter is passed via `params` below)
url = 'https://api.usaspending.gov/api/v2/recipient/state/'

headers = {
    'Content-Type': 'application/json',
}

params = {
    'year': 'latest'
}

response = requests.get(url, headers=headers, params=params)

if response.status_code == 200:
    data = response.json()
    parsed_data = {}


    for item in data:
        state_code = item['code']  # Get the state code
        state_name = item['name']  # Get the state name


        state_info = {
            'Type': item['type'],
            'Amount': item['amount'],
            'Count': item['count']
        }


        parsed_data[state_code] = state_info


    print(parsed_data)
else:
    print(f"Error: {response.status_code} - {response.text}")

#

this is what i coded up

#

#

the values match, total awarded amount is 391.2 bil

left tartan Dec 8, 2023, 3:57 PM

#

Might be interesting to analyze at the district level, if you can get the tax data in the same granularity.

hollow sentinel Dec 8, 2023, 3:57 PM

#

left tartan Might be interesting to analyze at the district level, if you can get the tax da...

i was thinking i'd analyze in terms of totality first

#

and then dive deeper

left tartan Dec 8, 2023, 3:59 PM

#

Makes sense. One layer at a time.

hollow sentinel Dec 8, 2023, 4:07 PM

#

left tartan Makes sense. One layer at a time.

would you say it's a good idea to merge the two dataframes i have?

#

so i can plot the scatterplot?

desert oar Dec 8, 2023, 4:09 PM

#

left tartan Might be interesting to analyze at the district level, if you can get the tax da...

figuring out funding outlaid by district would be extremely difficult though

desert oar Dec 8, 2023, 4:11 PM

#

hollow sentinel would you say it's a good idea to merge the two dataframes i have?

it's essential to start with your goal in mind first. then work backwards to figure out what code you need to write

hollow sentinel Dec 8, 2023, 4:12 PM

#

desert oar it's essential to start with your goal in mind first. then work backwards to fig...

well, my idea is to take the "total" column from the first dataframe which represents how much money is given to a state and graph it against the business income taxes column

#

merged_data = pd.concat([df, data])
print(merged_data.columns)

import seaborn as sns
sns.scatterplot(data = merged_data, x= "Amount", y="Business Income Taxes")

#

[image attachment]

#

well that's not good.

#

it seems like in my haste to create a merged dataframe, everything turned into NaNs

#

seems like it's a common problem with .concat

left tartan Dec 8, 2023, 4:21 PM

#

you should print the merged_data and look at the data. You'll see it didn't do what you want. You probably want .merge(), not .concat()

hollow sentinel Dec 8, 2023, 4:22 PM

#

left tartan you should print the merged_data and look at the data. You'll see it didn't do w...

what's the diff, if you don't mind me asking

#

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html

#

looking at this doc rn

left tartan Dec 8, 2023, 4:22 PM

#

concat adds to the end (top and bottom), merge is side by side (left/right)

hollow sentinel Dec 8, 2023, 4:23 PM

#

left tartan concat adds to the end (top and bottom), merge is side by side (left/right)

o shit. that would do it, yeah

hollow sentinel Dec 8, 2023, 4:23 PM

#

left tartan concat adds to the end (top and bottom), merge is side by side (left/right)

can you use .merge if you don't have keys in common?

tidal bough Dec 8, 2023, 4:24 PM

#

left tartan concat adds to the end (top and bottom), merge is side by side (left/right)

concat can be horizontal too - axis=1.

#

the real difference i'd say is that concat is "basic" whereas merge can do an arbitrary sql-like join
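
The difference described above can be seen on two tiny made-up frames. This is only a sketch: the names `left`, `right`, `aid`, `tax` and all values are invented for illustration and are not the data from this thread.

```python
import pandas as pd

# Toy frames; column names and values are made up for illustration.
left = pd.DataFrame({"state": ["AK", "AL", "AZ"], "aid": [1.0, 2.0, 3.0]})
right = pd.DataFrame({"state": ["AZ", "AK", "AL"], "tax": [30.0, 10.0, 20.0]})

# Default concat stacks rows (axis=0); columns missing from one frame become NaN.
stacked = pd.concat([left, right])

# concat with axis=1 places the frames side by side, pairing rows by index label only,
# so row 0 ends up pairing AK with AZ here.
side_by_side = pd.concat([left, right], axis=1)

# merge pairs rows by the values in a key column, like a SQL join.
joined = left.merge(right, on="state", how="inner")

print(stacked.shape)       # (6, 3)
print(side_by_side.shape)  # (3, 4)
print(joined)
```

Note that `side_by_side` is only correct when both frames happen to be in the same row order, which is the caveat raised just below.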

hollow sentinel Dec 8, 2023, 4:25 PM

#

so i have to use merge, but idk what arguments to use. currently looking at the doc.

#

does merge work if you don't have identical keys?

left tartan Dec 8, 2023, 4:26 PM

#

tidal bough concat can be horizontal too - axis=1.

yah, true, assuming both tables are ordered the same (ie: massachusetts on row 1).

hollow sentinel Dec 8, 2023, 4:28 PM

#

>>> df1 = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'],
...                     'value': [1, 2, 3, 5]})
>>> df2 = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'],
...                     'value': [5, 6, 7, 8]})
>>> df1
    lkey value
0   foo      1
1   bar      2
2   baz      3
3   foo      5
>>> df2
    rkey value
0   foo      5
1   bar      6
2   baz      7
3   foo      8
>>> df1.merge(df2, left_on='lkey', right_on='rkey')
  lkey  value_x rkey  value_y
0  foo        1  foo        5
1  foo        1  foo        8
2  foo        5  foo        5
3  foo        5  foo        8
4  bar        2  bar        6
5  baz        3  baz        7

#

these have identical values tho

#

so would that work in my case?

tidal bough Dec 8, 2023, 4:30 PM

#

what do you mean you don't have identical keys? how do you know which rows of your two dataframes correspond to each other, then?

hollow sentinel Dec 8, 2023, 4:30 PM

#

tidal bough what do you mean you don't have identical keys? how do you know which rows of yo...

print(df.columns)
Index(['State', 'Amount', 'Count'], dtype='object')

#

print(data.columns)
Index(['Total Internal Revenue collections', 'Business Income Taxes', 'Total', 'Individual income tax withheld and FICA tax', 'Individual income tax payments and SECA tax [3]', 'Unemployment insurance tax', 'Railroad retirement tax', 'Estate and trust income  tax [4]', 'Estate tax', 'Gift Tax', 'Excise Tax'], dtype='object')

tidal bough Dec 8, 2023, 4:31 PM

#

so how do you know which row in data corresponds to, say, first row in df?

hollow sentinel Dec 8, 2023, 4:31 PM

#

i don't 😦

#

that's the problem

tidal bough Dec 8, 2023, 4:32 PM

#

...so they aren't related? why do you want to merge them, then?

hollow sentinel Dec 8, 2023, 4:33 PM

#

tidal bough ...so they aren't related? why do you want to merge them, then?

so i can plot them in a scatterplot and see the relationship between amount and Business Income Taxes, where amount is the total amount given from the fed government to each state

#

i'm gonna try something hang on

tidal bough Dec 8, 2023, 4:36 PM

#

hollow sentinel so i can plot them in a scatterplot and see the relationship between amount and ...

...well, that plot would depend on which value of amount corresponds to which value of BIT.

hollow sentinel Dec 8, 2023, 4:36 PM

#

hmmm

tidal bough Dec 8, 2023, 4:37 PM

#

(I suspect that the actual answer is that they are just in the same order - that is, the first row of df should be matched with the first row of data and so on. if that's the case, that's just a pd.merge by the index, which is the default, or equivalently a pd.concat with axis=1.)

hollow sentinel Dec 8, 2023, 4:38 PM

#

tidal bough (I suspect that the actual answer is that they are just in the same order - that...

how do i do a default merge then?

tidal bough Dec 8, 2023, 4:38 PM

#

pd.merge(df, data)

hollow sentinel Dec 8, 2023, 4:38 PM

#

tidal bough `pd.merge(df, data)`

Index(['Total Internal Revenue collections', 'Business Income Taxes', 'Total', 'Individual income tax withheld and FICA tax', 'Individual income tax payments and SECA tax [3]', 'Unemployment insurance tax', 'Railroad retirement tax', 'Estate and trust income  tax [4]', 'Estate tax', 'Gift Tax', 'Excise Tax'], dtype='object')
---------------------------------------------------------------------------
MergeError                                Traceback (most recent call last)
<ipython-input-36-a91edcbf3e7a> in <cell line: 7>()
      5 sns.scatterplot(data)
      6 
----> 7 merged_dataframe = pd.merge(df, data)

2 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/reshape/merge.py in _validate_left_right_on(self, left_on, right_on)
   1432                 common_cols = left_cols.intersection(right_cols)
   1433                 if len(common_cols) == 0:
-> 1434                     raise MergeError(
   1435                         "No common columns to perform merge on. "
   1436                         f"Merge options: left_on={left_on}, "

MergeError: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False

#

i need to somehow use these arguments: left_on=None, right_on=None, left_index=False, right_index=False

tidal bough Dec 8, 2023, 4:39 PM

#

ah, I was thinking of join I think. For merge you want left_index=True, right_index=True to merge by index.

hollow sentinel Dec 8, 2023, 4:39 PM

#

i see, i'll try that

hollow sentinel Dec 8, 2023, 4:40 PM

#

tidal bough ah, I was thinking of `join` I think. For merge you want `left_index=True, right...

merged_dataframe = pd.merge(df, data, left_index = True, right_index = True)
print(merged_dataframe.head(5))
Empty DataFrame
Columns: [State, Amount, Count, Total Internal Revenue collections, Business Income Taxes, Total, Individual income tax withheld and FICA tax, Individual income tax payments and SECA tax [3], Unemployment insurance tax, Railroad retirement tax, Estate and trust income  tax [4], Estate tax, Gift Tax, Excise Tax]
Index: []

#

hmmmm

#

that's not good either

tidal bough Dec 8, 2023, 4:40 PM

#

what's df.index?

hollow sentinel Dec 8, 2023, 4:41 PM

#

tidal bough what's `df.index`?

RangeIndex(start=0, stop=56, step=1)

tidal bough Dec 8, 2023, 4:41 PM

#

and data.index?

hollow sentinel Dec 8, 2023, 4:41 PM

#

!pastebin

arctic wedgeBOT Dec 8, 2023, 4:41 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

hollow sentinel Dec 8, 2023, 4:42 PM

#

tidal bough and data.index?

https://paste.pythondiscord.com/4W6A

#

i just want all the columns together side by side

#

from both dataframes

#

idk exactly what's going on

tidal bough Dec 8, 2023, 4:46 PM

#

since one of the indexes is strings and the other is ints, there's no shared indices.
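
A minimal sketch of why the index merge came back empty, assuming one frame has the default integer index and the other is indexed by state name. The frame names `aid` and `tax` and all values are made up.

```python
import pandas as pd

# Toy stand-ins: integer RangeIndex on one side, name labels on the other.
aid = pd.DataFrame({"State": ["AK", "AL"], "Amount": [1.0, 2.0]})
tax = pd.DataFrame({"Business Income Taxes": [10.0, 20.0]},
                   index=["Alabama", "Alaska"])

# No label appears in both indexes...
shared = aid.index.intersection(tax.index)
print(len(shared))  # 0

# ...so an inner join on the index keeps the columns but no rows.
inner = pd.merge(aid, tax, left_index=True, right_index=True)
print(inner.empty)  # True
```

Checking the index intersection before merging is a cheap way to catch this kind of mismatch early.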

hollow sentinel Dec 8, 2023, 4:47 PM

#

can i cast?

#

nah casting wouldn't work

tidal bough Dec 8, 2023, 4:48 PM

#

> i just want all the columns together side by side
then do something like pd.concat(df,data.reset_index(), axis=1)

hollow sentinel Dec 8, 2023, 4:49 PM

#

tidal bough > i just want all the columns together side by side then do something like `pd.c...

<ipython-input-45-ec38f56b9666>:1: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  pd.concat(df,data.reset_index(), axis=1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-45-ec38f56b9666> in <cell line: 1>()
----> 1 pd.concat(df,data.reset_index(), axis=1)
      2 

/usr/local/lib/python3.10/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    329                     stacklevel=find_stack_level(),
    330                 )
--> 331             return func(*args, **kwargs)
    332 
    333         # error: "Callable[[VarArg(Any), KwArg(Any)], Any]" has no

TypeError: concat() got multiple values for argument 'axis'

#

hmmm

tidal bough Dec 8, 2023, 4:50 PM

#

argh, right, pd.concat also takes a list, so pd.concat([df, data.reset_index()], axis=1)

hollow sentinel Dec 8, 2023, 4:50 PM

#

AY

hollow sentinel Dec 8, 2023, 4:50 PM

#

tidal bough argh, right, pd.concat also takes a list, so `pd.concat([df, data.reset_index()]...

IT WORKED!!

hollow sentinel Dec 8, 2023, 4:52 PM

#

tidal bough argh, right, pd.concat also takes a list, so `pd.concat([df, data.reset_index()]...

data = pd.concat([df, data.reset_index()], axis=1)
print(data.columns)
sns.scatterplot(data = data, x = "Amount", y ="Business Income Taxes")
ValueError                                Traceback (most recent call last)
<ipython-input-51-f11a6ce936f0> in <cell line: 1>()
----> 1 data = pd.concat([df, data.reset_index()], axis=1)
      2 print(data.columns)
      3 sns.scatterplot(data = data, x = "Amount", y ="Business Income Taxes")

2 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
   4815         if not allow_duplicates and column in self.columns:
   4816             # Should this be a different kind of error??
-> 4817             raise ValueError(f"cannot insert {column}, already exists")
   4818         if not isinstance(loc, int):
   4819             raise TypeError("loc must be int")

ValueError: cannot insert level_0, already exists

#

whack :(((

#

This error happens when you try to reset index on a pandas data frame, but the index name conflicts with existing column names.

tidal bough Dec 8, 2023, 4:53 PM

#

well, don't do it twice on the same dataframe.

hollow sentinel Dec 8, 2023, 4:53 PM

#

oh yeah

#

wait but how do i fix it

#

hmmm

desert oar Dec 8, 2023, 4:54 PM

#

you can also change the name of the index so it doesn't clash with a column. or just drop it entirely with reset_index(drop=True) if you don't actually need the index

#

however keep in mind that concat will still align rows by index. it's like a "full outer join", if you're familiar with database terminology. so you need to do whatever you need to do in order to make sure that both data frames have the same index
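
A small sketch of the alignment behaviour described above, using invented frames `aid` and `tax`. It assumes that pairing rows by position is actually what is wanted, which only holds if both frames are already in the same row order.

```python
import pandas as pd

aid = pd.DataFrame({"Amount": [1.0, 2.0]})  # index 0, 1
tax = pd.DataFrame({"Tax": [10.0, 20.0]}, index=["Alabama", "Alaska"])

# The labels differ, so concat acts like a full outer join: 4 rows, half of them NaN.
misaligned = pd.concat([aid, tax], axis=1)
print(misaligned.shape)  # (4, 2)

# Dropping the name index gives both frames labels 0..n-1, so rows pair by position.
aligned = pd.concat([aid, tax.reset_index(drop=True)], axis=1)
print(aligned.shape)     # (2, 2)
```

If the row order of the two sources differs, positional pairing silently produces wrong pairs, so a key-based merge is the safer route in that case.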

hollow sentinel Dec 8, 2023, 4:55 PM

#

data = pd.concat([df, data.reset_index(drop=True)], axis=1)
print(data.columns)
sns.scatterplot(data = data, x = "Amount", y ="Business Income Taxes")
Index(['State', 'Amount', 'Count', 'State', 'Amount', 'Count', 'State', 'Amount', 'Count', 'level_0', 'State', 'Amount', 'Count', 'index', 'Total Internal Revenue collections', 'Business Income Taxes', 'Total', 'Individual income tax withheld and FICA tax', 'Individual income tax payments and SECA tax [3]', 'Unemployment insurance tax', 'Railroad retirement tax', 'Estate and trust income  tax [4]', 'Estate tax', 'Gift Tax', 'Excise Tax'], dtype='object')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-55-880ecaf4dcf2> in <cell line: 3>()
      1 data = pd.concat([df, data.reset_index(drop=True)], axis=1)
      2 print(data.columns)
----> 3 sns.scatterplot(data = data, x = "Amount", y ="Business Income Taxes")

10 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/construction.py in _sanitize_ndim(result, data, dtype, index, allow_2d)
    696             if allow_2d:
    697                 return result
--> 698             raise ValueError("Data must be 1-dimensional")
    699         if is_object_dtype(dtype) and isinstance(dtype, ExtensionDtype):
    700             # i.e. PandasDtype("O")

ValueError: Data must be 1-dimensional

tidal bough Dec 8, 2023, 4:56 PM

#

if you don't need data's index one can just do pd.concat([df, data], ignore_index=True, axis=1)

desert oar Dec 8, 2023, 4:56 PM

#

i see several duplicate columns. duplicate columns are always going to cause a problem, i really wish pandas would just prohibit it

hollow sentinel Dec 8, 2023, 4:56 PM

#

that's probably bc i ran the code multiple times

#

shit

tidal bough Dec 8, 2023, 4:56 PM

#

ah, good eye. I suspect that's a result of replacing data with concat(df, data) several times

#

don't reuse your variable names! if you chain a lot of operations, use pandas's method chaining instead; that's cleaner anyway.

hollow sentinel Dec 8, 2023, 4:57 PM

#

omg i didn't realize

#

fuck

#

ugh

#

how do i fix this?

desert oar Dec 8, 2023, 4:58 PM

#

i don't entirely agree about the method chaining, as soon as you make a typo or introduce a bug, you end up needing to create intermediate variables anyway to figure out where the bug was. and overall yes, descriptive names are not only useful for your own comprehension, but also because you don't end up re-running the same code over and over causing problems like this

desert oar Dec 8, 2023, 4:58 PM

#

hollow sentinel how do i fix this?

reload from scratch and start again. if your notebook does not run cleanly from top to bottom, stop whatever you are doing and fix it so that it does
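
To illustrate why re-running the cell caused trouble, here is a sketch with made-up frames. The loop stands in for running the same notebook cell twice; the names `raw` and `merged` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Amount": [1.0, 2.0]})
data = pd.DataFrame({"Tax": [10.0, 20.0]})

# Anti-pattern: the cell overwrites its own input, so every re-run widens `data`.
for _ in range(2):  # simulates running the same cell twice
    data = pd.concat([df, data], axis=1)
print(list(data.columns))  # ['Amount', 'Amount', 'Tax']

# Selecting a duplicated label returns a DataFrame instead of a Series, which is
# consistent with the "Data must be 1-dimensional" error seen above.
print(type(data["Amount"]).__name__)  # DataFrame

# Safer: leave the inputs untouched and bind the result to a new name.
raw = pd.DataFrame({"Tax": [10.0, 20.0]})
merged = pd.concat([df, raw], axis=1)
print(list(merged.columns))  # ['Amount', 'Tax']
```

With the second pattern the cell gives the same result no matter how many times it is run.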

hollow sentinel Dec 8, 2023, 4:59 PM

#

reload from scratch?

untold bloom Dec 8, 2023, 4:59 PM

#

tidal bough if you don't need `data`'s index one can just do `pd.concat([df, data], ignore_i...

this doesn't do what you think it does

desert oar Dec 8, 2023, 4:59 PM

#

yes, read csv all over again @hollow sentinel

desert oar Dec 8, 2023, 4:59 PM

#

untold bloom this doesn't do what you think it does

i was going to say, that actually drops column names in this case, right?

tidal bough Dec 8, 2023, 5:00 PM

#

untold bloom this doesn't do what you think it does

hmm, doesn't it? did i fuck up the axis?

desert oar Dec 8, 2023, 5:00 PM

#

it ignores the index along the axis that you are concatenating, not "the index"

untold bloom Dec 8, 2023, 5:00 PM

#

RangeIndex-es the concatenation axis

#

after concatenating
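
A quick sketch of what `ignore_index=True` does when combined with `axis=1`, on invented frames:

```python
import pandas as pd

df = pd.DataFrame({"State": ["AK", "AL"], "Amount": [1.0, 2.0]})
data = pd.DataFrame({"Business Income Taxes": [10.0, 20.0]})

# axis=1 concatenates along the columns, so ignore_index relabels the *columns* 0..n-1.
relabeled = pd.concat([df, data], axis=1, ignore_index=True)
print(list(relabeled.columns))  # [0, 1, 2]

# Without ignore_index the column names survive.
named = pd.concat([df, data], axis=1)
print(list(named.columns))  # ['State', 'Amount', 'Business Income Taxes']
```

The row index is untouched in both cases; only the labels along the concatenation axis are replaced.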

desert oar Dec 8, 2023, 5:00 PM

#

pandas terminology is an absolute shit storm

tidal bough Dec 8, 2023, 5:00 PM

#

aaah, right

hollow sentinel Dec 8, 2023, 5:00 PM

#

ok i restarted and ran all cells

#

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-27da461b5c0f> in <cell line: 3>()
      1 merged_data = pd.concat([df, data], ignore_index=True, axis=1)
      2 print(merged_data.columns)
----> 3 sns.scatterplot(data = merged_data, x = "Amount", y ="Business Income Taxes")

4 frames
/usr/local/lib/python3.10/dist-packages/seaborn/_oldcore.py in _assign_variables_longform(self, data, **kwargs)
    936 
    937                 err = f"Could not interpret value `{val}` for parameter `{key}`"
--> 938                 raise ValueError(err)
    939 
    940             else:

ValueError: Could not interpret value `Amount` for parameter `x`

#

sns.scatterplot(data = merged_data, x = "Amount", y ="Business Income Taxes")

desert oar Dec 8, 2023, 5:01 PM

#

hollow sentinel ```python ----------------------------------------------------------------------...

that's seaborn telling you it can't find that column

hollow sentinel Dec 8, 2023, 5:02 PM

#

meaning the column doesn't exist in the merged dataframe

#

why does it not exist?

hollow sentinel Dec 8, 2023, 5:03 PM

#

desert oar that's seaborn telling you it can't find that column

RangeIndex(start=0, stop=14, step=1)
this is what happens when i print (merged_data.columns)

desert oar Dec 8, 2023, 5:03 PM

#

hollow sentinel RangeIndex(start=0, stop=14, step=1) this is what happens when i print (merged_d...

did you use the ignore index thing? if so, that's why. confusedreptile made a mistake, and should not have suggested it

hollow sentinel Dec 8, 2023, 5:05 PM

#

desert oar did you use the ignore index thing? if so, that's why. confusedreptile made a mi...

that's why, but now i have all these NaN values still

desert oar Dec 8, 2023, 5:05 PM

#

hollow sentinel that's why, but now i have all these NaN values still

did you verify that the indexes are identical before concat-ing?

agile cobalt Dec 8, 2023, 5:05 PM

#

Can you show what df and data look like before merging/concatenating them?

hollow sentinel Dec 8, 2023, 5:05 PM

#

agile cobalt Can you show what `df` and `data` look like before merging/concatenating them?

sure

hollow sentinel Dec 8, 2023, 5:06 PM

#

agile cobalt Can you show what `df` and `data` look like before merging/concatenating them?

                      Total Internal Revenue collections  Business Income Taxes         Total  Individual income tax withheld and FICA tax  Individual income tax payments and SECA tax [3]  Unemployment insurance tax  Railroad retirement tax  Estate and trust income  tax [4]  Estate tax   Gift Tax  Excise Tax
United States, total                        4.901514e+09            475871099.0  4.321609e+09                                 3.089258e+09                                     1.133996e+09                   7046465.0                6148312.0                        85160093.0  28909393.0  4445883.0  70679117.0
Alabama                                     3.605756e+07              1936430.0  3.356064e+07                                 2.368409e+07                                     9.255441e+06                     73015.0                   3525.0                          544565.0    267069.0    30921.0    262500.0
Alaska                                      6.572445e+06               150882.0  6.323953e+06                                 4.423914e+06                                     1.689140e+06                     11933.0                   2008.0                          196959.0     35788.0       98.0     61723.0
Arizona                                     7.181487e+07              5116779.0  6.473972e+07                                 4.507181e+07                                     1.886567e+07                    129407.0                   2013.0                          670817.0    209065.0    30814.0   1718491.0
Arkansas                                    4.023197e+07              4846558.0  3.446407e+07                                 2.699580e+07                                     6.983808e+06                    141622.0                   2969.0                          339873.0    157175.0   133501.0    630661.0
this is data

#

 State        Amount   Count
0    AK  1.493428e+10   24395
1    AL  5.712495e+10   83522
2    AR  2.984787e+10  114712
3    AS  4.678064e+08     909
4    AZ  1.015892e+11   61249
this is df

agile cobalt Dec 8, 2023, 5:08 PM

#

hollow sentinel ```python Total Internal Revenue collections Business Inc...

not gonna lie I don't get what is what at all (which are the columns, index, values) - use data.to_list(), to_csv() or to_records()

#

opening on notepad without word wrap, it looks like the first row's index is United States, total?

hollow sentinel Dec 8, 2023, 5:09 PM

#

.to_list doesn't seem to work

#

AttributeError                            Traceback (most recent call last)
<ipython-input-15-e0a86854e54e> in <cell line: 1>()
----> 1 print(data.to_list())
      2 print(df.to_list())

/usr/local/lib/python3.10/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   5900         ):
   5901             return self[name]
-> 5902         return object.__getattribute__(self, name)
   5903 
   5904     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'to_list'

#

print(data.to_list())

agile cobalt Dec 8, 2023, 5:10 PM

#

ah, that only exists for series/indexes, you can use to_dict though
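
For reference, a short sketch of which objects have `to_list` and how `to_dict` can be used to share a frame's contents in a readable form. The frame is made up.

```python
import pandas as pd

df = pd.DataFrame({"State": ["AK", "AL"], "Amount": [1.0, 2.0]})

print(df["State"].to_list())   # Series.to_list works: ['AK', 'AL']
print(df.columns.to_list())    # Index.to_list works: ['State', 'Amount']
print(hasattr(df, "to_list"))  # False: DataFrame has no such method

# One dict per row, which is easy to paste into a chat.
records = df.head(2).to_dict(orient="records")
print(records)
```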

hollow sentinel Dec 8, 2023, 5:10 PM

#

oh, ok.

agile cobalt Dec 8, 2023, 5:10 PM

#

anyway, are you 1000% sure that the data is properly aligned the exact way you want?

hollow sentinel Dec 8, 2023, 5:10 PM

#

not really

agile cobalt Dec 8, 2023, 5:11 PM

#

How do you want to align it then? join on the State name/code?

hollow sentinel Dec 8, 2023, 5:11 PM

#

agile cobalt How do you want to align it then? join on the State name/code?

that's an idea i was thinking about too

#

but the values aren't the same iirc

#

one has abbreviations, the other has full names

agile cobalt Dec 8, 2023, 5:12 PM

#

you can just map name -> code or vice-versa (though you'll need to get a list of every state's name and code)

hollow sentinel Dec 8, 2023, 5:13 PM

#

hmmm

hollow sentinel Dec 8, 2023, 5:14 PM

#

agile cobalt you can just map `name -> code` or vice-versa (though you'll need to get a list ...

do you mean dictionary inside a list?

agile cobalt Dec 8, 2023, 5:15 PM

#

state_names = {
    "AL": "Alabama",  # this is what AL means right?
    # ...
}
data["state_name"] = data["state"].map(state_names)

pd.merge(left=df, right=data, left_index=True, right_on="state_name")
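
A self-contained variant of the same idea, assuming (as the printed frames earlier in the thread suggest) that `df` holds the state codes and `data` is indexed by full state names. All values, the partial `state_names` lookup, and the names `aid` and `merged` are made up for illustration.

```python
import pandas as pd

# Toy stand-ins shaped like the two frames in this thread; values are invented.
df = pd.DataFrame({"State": ["AK", "AL", "AS"], "Amount": [1.0, 2.0, 3.0]})
data = pd.DataFrame(
    {"Business Income Taxes": [100.0, 20.0, 10.0]},
    index=["United States, total", "Alabama", "Alaska"],
)

# Deliberately partial lookup; a real one needs every state and territory.
state_names = {"AK": "Alaska", "AL": "Alabama"}

aid = df.assign(state_name=df["State"].map(state_names))

# Left key is a column, right key is the index. The default inner join drops rows
# without a partner: the unmapped "AS" row and the "United States, total" row.
merged = aid.merge(data, left_on="state_name", right_index=True)
print(merged[["State", "Amount", "Business Income Taxes"]])
```

Because the join is keyed on names, the result no longer depends on the two sources being in the same row order.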

hollow sentinel Dec 8, 2023, 5:15 PM

#

oh a dictionary. ok, yeah AL is alabama

agile cobalt Dec 8, 2023, 5:16 PM

#

you'll probably want to get the dictionary from Google or something instead of writing it yourself, though you might have to parse it from a list into a dictionary yourself

hollow sentinel Dec 8, 2023, 5:16 PM

#

agile cobalt you'll probably want to get the dictionary from Google or something instead of w...

chat gpt?

agile cobalt Dec 8, 2023, 5:16 PM

#

might work but I wouldn't trust it for this lol

#

maybe the parsing part, but accurately listing all states, not really

hollow sentinel Dec 8, 2023, 5:17 PM

#

agile cobalt maybe the parsing part, but accurately listing all states, not really

looks like it did it

agile cobalt Dec 8, 2023, 5:18 PM

#

you will also have to decide wtf to do about the United States, total

#

it should contain the sum of all states in data?

hollow sentinel Dec 8, 2023, 5:18 PM

#

you're right, idk how to handle it. yep.

#

all the states summed

agile cobalt Dec 8, 2023, 5:18 PM

#

~~should've checked earlier but~~ just to check: data contains exactly one row per state right?

hollow sentinel Dec 8, 2023, 5:20 PM

#

huh?

agile cobalt Dec 8, 2023, 5:21 PM

#

is data missing any states?
does data have multiple lines with the same state?
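
Both questions can be checked in a couple of lines. This is a sketch on a made-up frame; `table` and the `expected` set of codes are hypothetical stand-ins.

```python
import pandas as pd

table = pd.DataFrame({"State": ["AK", "AL", "AL", "AZ"],
                      "Amount": [1.0, 2.0, 2.5, 3.0]})
expected = {"AK", "AL", "AR", "AZ"}  # hypothetical list of codes that should appear

# States that occur on more than one row.
duplicated = table.loc[table["State"].duplicated(keep=False), "State"].unique().tolist()

# States that should be present but are not.
missing = sorted(expected - set(table["State"]))

print(duplicated)  # ['AL']
print(missing)     # ['AR']
```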

hollow sentinel Dec 8, 2023, 5:21 PM

#

from what i can see, it has all states from alabama to wyoming

#

lemme see

agile cobalt Dec 8, 2023, 5:25 PM

#

try something like
```py
totals = data[["Amount", "Count"]].sum()
totals["state"] = None
totals["state_name"] = "United States, total"
totals = pd.DataFrame(totals).transpose()
data = pd.concat([data, totals], axis='rows')
```

hollow sentinel Dec 8, 2023, 5:29 PM

#

agile cobalt try something like ```py totals = data[["Amount", "Count"]].sum() totals["state"...

i'll try that now

hollow sentinel Dec 8, 2023, 5:29 PM

#

agile cobalt try something like ```py totals = data[["Amount", "Count"]].sum() totals["state"...

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-22-d444038c6275> in <cell line: 4>()
      2 print(df.head(5))
      3 
----> 4 totals = data[["Amount", "Count"]].sum()
      5 totals["state"] = None
      6 totals["state_name"] = "United States, total"

2 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in _raise_if_missing(self, key, indexer, axis_name)
   6128                 if use_interval_msg:
   6129                     key = list(key)
-> 6130                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6131 
   6132             not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())

KeyError: "None of [Index(['Amount', 'Count'], dtype='object')] are in the [columns]"

agile cobalt Dec 8, 2023, 5:35 PM

#

hollow sentinel ```python State Amount Count 0 AK 1.493428e+10 24395 1 AL 5....

what are the actual column names then?
is there some whitespace in them?

river cape Dec 8, 2023, 5:50 PM

#

Can anyone give me a roadmap of building a chatbot

hollow sentinel Dec 8, 2023, 6:07 PM

#

agile cobalt what are the actual column names then? is there some whitespace in them?

i'm not sure how to see if there's whitespace

agile cobalt Dec 8, 2023, 6:08 PM

#

print(repr(dataframe.columns))

hollow sentinel Dec 8, 2023, 6:10 PM

#

agile cobalt print(repr(dataframe.columns))

Index(['Total Internal Revenue collections', 'Business Income Taxes', 'Total', 'Individual income tax withheld and FICA tax', 'Individual income tax payments and SECA tax [3]', 'Unemployment insurance tax', 'Railroad retirement tax', 'Estate and trust income  tax [4]', 'Estate tax', 'Gift Tax', 'Excise Tax'], dtype='object')

agile cobalt Dec 8, 2023, 6:11 PM

#

in this case we care about data, not df

hollow sentinel Dec 8, 2023, 6:11 PM

#

agile cobalt in this case we care about data, not df

Index(['State', 'Amount', 'Count'], dtype='object')
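With the column names reported here (`State`, `Amount`, `Count`), appending a totals row could be sketched roughly as below. This is a minimal illustration on a toy frame: the numbers are made up, the "United States, total" label is reused from the earlier suggestion, and building the row from a dict is a swapped-in variant rather than the code that was actually run.

```python
import pandas as pd

# Toy stand-in for `data`; the values are invented for illustration.
data = pd.DataFrame(
    {
        "State": ["AK", "AL", "WY"],
        "Amount": [100.0, 250.0, 50.0],
        "Count": [10, 20, 5],
    }
)

# Build a one-row frame holding the column sums and a label for the row.
totals_row = pd.DataFrame(
    [
        {
            "State": "United States, total",
            "Amount": data["Amount"].sum(),
            "Count": data["Count"].sum(),
        }
    ]
)

# Append the totals row below the per-state rows.
with_total = pd.concat([data, totals_row], ignore_index=True)

print(with_total)
```

Building the row from a dict keeps the numeric dtypes of `Amount` and `Count` intact, which the Series-then-transpose route does not.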

agile cobalt Dec 8, 2023, 6:12 PM

#

hmm weird

#

and it gave an error when you did data[["Amount", "Count"]]?

hollow sentinel Dec 8, 2023, 6:12 PM

#

yes

agile cobalt Dec 8, 2023, 6:12 PM

#

try again, just that on its own, and see if it gives the same error

hollow sentinel Dec 8, 2023, 6:13 PM

#

agile cobalt try again, just that on its own, and see if it gives the same error

try data[["Amount", "Count"]]?

agile cobalt Dec 8, 2023, 6:13 PM

#

I'm willing to bet that the global state got messed up and data was not containing what it should when you tried it the last time

hollow sentinel Dec 8, 2023, 6:14 PM

#

agile cobalt I'm willing to bet that the global state got messed up and `data` was not contai...

i'll try data[["Amount", "Count"]] rn

hollow sentinel Dec 8, 2023, 6:17 PM

#

agile cobalt I'm willing to bet that the global state got messed up and `data` was not contai...

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-27-fb57bb0e3443> in <cell line: 5>()
      3 print(repr(df.columns))
      4 
----> 5 print(data[["Amount", "Columns"]])

2 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in _raise_if_missing(self, key, indexer, axis_name)
   6128                 if use_interval_msg:
   6129                     key = list(key)
-> 6130                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6131 
   6132             not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())

KeyError: "None of [Index(['Amount', 'Columns'], dtype='object')] are in the [columns]"
```
same thing

agile cobalt Dec 8, 2023, 6:20 PM

#

yeah no clue, try debugging on your own for a bit (as in, try to get the code I suggested working, or do something akin to it) or try a different approach (as in, do something completely different from my suggestion)

hollow sentinel Dec 8, 2023, 6:24 PM

#

agile cobalt yeah no clue, try debugging on your own for a bit (as in, try to get the code I ...

word, i’ll do some research.

feral kernel Dec 8, 2023, 8:37 PM

#

Hey, why can't I convert all my images into one massive tensor, train on that one massive tensor by breaking it down into tiny matrices and train one matrix one by one. I tried that, it keeps me giving me.I guess i will train one by one.

desert oar Dec 8, 2023, 8:56 PM

#

hollow sentinel word, i’ll do some research.

in the case of errors like this, you're going to find much less benefit from googling error messages than you are from just inspecting your own data

#

it says those columns are missing; well, are they? check the data. what columns are there, if not those? how does it differ from what you expected? where could the difference have arisen? and so on
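That kind of inspection can be sketched in a few lines. The helper name `missing_columns` and the toy frame (with a deliberate trailing space in one header) are assumptions for illustration, not the actual data from this thread.

```python
import pandas as pd


def missing_columns(frame, wanted):
    """Return the names in `wanted` that are not columns of `frame`."""
    return [name for name in wanted if name not in frame.columns]


# Toy frame with a trailing space in one header, one possible cause of a KeyError.
data = pd.DataFrame({"State": ["AK"], "Amount ": [1.0], "Count": [3]})

# repr() makes stray whitespace in the header names visible.
print(repr(list(data.columns)))
print(missing_columns(data, ["Amount", "Count"]))

# Stripping whitespace from the headers removes that particular mismatch.
cleaned = data.rename(columns=str.strip)
print(missing_columns(cleaned, ["Amount", "Count"]))
```

Running the same check right before the failing line also shows whether the variable still holds the frame you think it does.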

smoky dome Dec 8, 2023, 9:35 PM

#

Hey
I'm currently working on an AI for an ultimate tic-tac-toe game. I am currently developing a learning method for the neural network (below). I am working with a policy (probabilities of the individual moves) and a value. I have tested it for manually created boards. This worked with the optimiser SGD. However, with Adam the loss of the policy got worse and it gave 1 move the probability 1 and otherwise 0 instead of my distribution. Can anyone help me with the reason for this?

def train_neural_net(self, dataset, epoch_start=0, epoch_stop=20, cpu=0):
        torch.manual_seed(cpu)
        self.model.train()
        criterion = AlphaLoss()
        optimizer = torch.optim.SGD(self.model.parameters(), lr=0.003)
     
        train_loader = DataLoader(dataset, batch_size=1, shuffle=True, num_workers=0, pin_memory=False)

        for epoch in range(epoch_start, epoch_stop):
            total_loss = 0.0

            for i, data in enumerate(train_loader, 0):
                state, policy, value = data

                optimizer.zero_grad()
                policy_pred, value_pred = self.model(state)

                loss = criterion(value_pred, value, policy_pred, policy)
                loss.backward()
                optimizer.step()

queen junco Dec 9, 2023, 12:28 AM

#

I'm making a learning chat that works like copilot ai and saves everything the user asks as it learns words, but rn I need code that will check if something's an English word or not

serene scaffold Dec 9, 2023, 1:05 AM

#

queen junco I'm making a learning chat so that works like copilot ai that saves everything t...

is croissant an English word?

queen junco Dec 9, 2023, 1:05 AM

#

serene scaffold is croissant an English word?

No

serene scaffold Dec 9, 2023, 1:05 AM

#

queen junco No

am I not speaking English when I tell you that I'm eating a croissant?

queen junco Dec 9, 2023, 1:06 AM

#

serene scaffold am I not speaking English when I tell you that I'm eating a croissant?

No your speaking English while naming a French thing

serene scaffold Dec 9, 2023, 1:07 AM

#

queen junco No your speaking English while naming a French thing

so you're only interested in English words that are of germanic origin?

queen junco Dec 9, 2023, 1:07 AM

#

No I'm interested in English words

serene scaffold Dec 9, 2023, 1:07 AM

#

What is an English word?

queen junco Dec 9, 2023, 1:07 AM

#

I already fixed it

#

Word originated by English people

serene scaffold Dec 9, 2023, 1:08 AM

#

serene scaffold so you're only interested in English words that are of germanic origin?

that's what this means

queen junco Dec 9, 2023, 1:08 AM

#

If you search up fehe it will search your address to on pc

#

Itle also show healthcare

serene scaffold Dec 9, 2023, 1:10 AM

#

#

@queen junco you shouldn't drop discord gifts in this server. selfbots will snipe them

queen junco Dec 9, 2023, 1:12 AM

#

Nuh uh

serene scaffold Dec 9, 2023, 1:12 AM

#

nuh uh?

queen junco Dec 9, 2023, 1:16 AM

#

discord.gift/Udzwm3hrQECQBnEEFFCEwdSq

verbal sand Dec 9, 2023, 1:28 AM

#

I've this python dataframe and I want to get the percentage gain or loss for a particular company
%gain or loss = 100 * {(Avg Traded Price for Sell - Avg Traded Price for Buy)/Avg Traded Price for Buy}

#

Can anyone help me with this one please?...there's a groupby and apply function that I'm struggling to code for this.

serene scaffold Dec 9, 2023, 1:33 AM

#

verbal sand I've this python datafram and I want to get the percentage gain or loss for a pa...

is there always one buy and one sell per company?

verbal sand Dec 9, 2023, 1:34 AM

#

serene scaffold is there always one buy and one sell per company?

Yes

serene scaffold Dec 9, 2023, 1:35 AM

#

verbal sand Yes

then you need to pivot it so that "buy" and "sell" are the two rows, and each company has its own column, and then use this method https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pct_change.html

#

also, in scientific contexts, you do percentages between 0 and 1. not 0 and 100

#

so don't multiply it by 100.

verbal sand Dec 9, 2023, 1:37 AM

#

Doesn't look like a good approach.

serene scaffold Dec 9, 2023, 1:37 AM

#

it's a great approach.

#

if you don't want to do it like that, your alternative is to pivot it so that buy and sell are the two columns

#

and then you can do {(Avg Traded Price for Sell - Avg Traded Price for Buy)/Avg Traded Price for Buy}

verbal sand Dec 9, 2023, 1:37 AM

#

ya pivoting on buy and sell seems the right way

#

df.groupby('Company').apply(lambda x: )

serene scaffold Dec 9, 2023, 1:38 AM

#

no.

#

!docs pandas.DataFrame.pivot_table

arctic wedgeBOT Dec 9, 2023, 1:38 AM

#

pandas.DataFrame.pivot_table

```py
DataFrame.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All', observed=False, sort=True)
```
Create a spreadsheet-style pivot table as a DataFrame.

The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame.

verbal sand Dec 9, 2023, 1:38 AM

#

I'm having a hard time forming the right lambda function for this

serene scaffold Dec 9, 2023, 1:38 AM

#

the solution does not involve a lambda

#

so if you write one, the solution is automatically wrong.

verbal sand Dec 9, 2023, 1:39 AM

#

arctic wedge

look good to me

serene scaffold Dec 9, 2023, 1:39 AM

#

serene scaffold so if you write one, the solution is automatically wrong.

this is the case .8 of the time in pandas

#

and for for-loops, it's closer to .95

verbal sand Dec 9, 2023, 1:40 AM

#

what are the decimal values you're specifying. i don't get that

serene scaffold Dec 9, 2023, 1:41 AM

#

80 percent of the time, if you think the solution to a pandas problem involves a lambda, that's wrong

#

and 95 percent of the time, if you think the solution to a pandas problem involves a for loop, that's wrong.

#

did you come up with code to do the pivoting?

verbal sand Dec 9, 2023, 1:42 AM

#

Oh nice! I know using loops on dataframes is a terrible idea...but didn't know about lambda

#

i'm writing it

serene scaffold Dec 9, 2023, 1:43 AM

#

.apply circumvents pandas optimizations almost as egregiously as for loops do

#

that is, you only get optimizations if you're using pandas' native methods, and if you're looping or applying non-pandas functions/methods, you're not.
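A small side-by-side may make the point concrete. The `buy` and `sell` columns and their values are invented for illustration; both expressions compute the same fractional change, but only the second stays inside pandas' column-level arithmetic.

```python
import pandas as pd

df = pd.DataFrame({"buy": [100.0, 50.0, 80.0], "sell": [110.0, 45.0, 80.0]})

# Row-wise apply: a Python function is called once per row.
via_apply = df.apply(lambda row: (row["sell"] - row["buy"]) / row["buy"], axis=1)

# Vectorized: whole-column arithmetic handled by pandas/NumPy.
vectorized = (df["sell"] - df["buy"]) / df["buy"]

print(vectorized.tolist())
```

On three rows the difference is invisible; on large frames the per-row Python call is usually what dominates the runtime.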

verbal sand Dec 9, 2023, 1:44 AM

#

Got it!

#

table = pd.pivot_table(df, index=['Company'], columns=['Side'], aggfunc="sum")

#

oh wait how do we highlight the code 😅

#

This is what I get. Not sure how can I subtract the buy and the sell column above

agile cobalt Dec 9, 2023, 1:54 AM

#

verbal sand oh wait how do we highlight the code 😅

include the newline immediately after the py, without any spaces in between

#

MultiIndexes are powerful, but can be way more of a pain than useful most of the time
I'd recommend for you to just do table.columns = ['buy', 'sell']

verbal sand Dec 9, 2023, 1:57 AM

#

agile cobalt MultiIndexes are powerful, but can be way more of a pain than useful most of the...

Thanks! worked for me 🙂

serene scaffold Dec 9, 2023, 2:12 AM

#

verbal sand This is what I get. Not sure how can I subtract the `buy` and the `sell` column ...

just normal pandas usage

In [9]: df
Out[9]:
   a  b
0  6  7
1  2  5
2  4  7
3  3  3
4  9  9

In [10]: df['b'] + df['a']
Out[10]:
0    13
1     7
2    11
3     6
4    18
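Putting the pieces of this thread together, the pivot-then-subtract approach might look like the sketch below. The column names `Company` and `Side` come from the pivot call above; `Avg Traded Price`, the `Buy`/`Sell` labels, and all values are assumptions, since the actual dataframe was not shown as text.

```python
import pandas as pd

# Toy trades: one Buy and one Sell per company (names and values are assumed).
df = pd.DataFrame(
    {
        "Company": ["ACME", "ACME", "Globex", "Globex"],
        "Side": ["Buy", "Sell", "Buy", "Sell"],
        "Avg Traded Price": [100.0, 120.0, 50.0, 40.0],
    }
)

# Passing `values=` as a single name keeps the result columns a plain
# Index (Buy, Sell) instead of a MultiIndex.
table = df.pivot_table(
    index="Company", columns="Side", values="Avg Traded Price", aggfunc="mean"
)

# Fractional gain or loss per company; multiply by 100 for a percentage.
table["gain"] = (table["Sell"] - table["Buy"]) / table["Buy"]

print(table)
```

No `groupby`, `apply`, or lambda is needed once buy and sell sit in separate columns.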

serene scaffold Dec 9, 2023, 3:04 AM

#

@verbal sand did you get it?

shy rock Dec 9, 2023, 3:21 AM

#

Does anyone have a script to find (extract) all the tables in a SQL query, including tables in nested subqueries?

frosty goblet Dec 9, 2023, 8:05 AM

#

hey guys how does one plot an x axis and y axis on matplotlib?

silent patio Dec 9, 2023, 9:22 AM

#

frosty goblet hey guys how does one plot an x axis and y axis on matplotlib?

#

@frosty goblet you can also use pandas which uses matplotlib as its core. pandas should be easier to use with a lot of examples online (w3schools and such)

frosty goblet Dec 9, 2023, 10:27 AM

#

silent patio

ooo thank you for this!

hoary jay Dec 9, 2023, 1:18 PM

#

If i have two n-dimensional matrices (say embeddings of words) what's the best way to calculate correlation between them? Something better than cosine similarity?

desert oar Dec 9, 2023, 2:10 PM

#

hoary jay If i have two n-dimensional matrices (say embeddings of words) what's the best w...

https://math.stackexchange.com/q/507742

Mathematics Stack Exchange

Distance/Similarity between two matrices

I'm in the process of writing an application which identifies the closest matrix from a set of square matrices $M$ to a given square matrix $A$. The closest can be defined as the most similar.

I t...
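As a rough starting point, two common baselines for comparing equally shaped matrices are the Frobenius distance and cosine similarity on the flattened entries. The function names below are made up for this sketch, and neither measure is claimed to be "the best"; which one is appropriate depends on what the matrices represent.

```python
import numpy as np


def frobenius_distance(a, b):
    """Euclidean (Frobenius) distance between two equally shaped matrices."""
    return float(np.linalg.norm(a - b))


def flattened_cosine(a, b):
    """Cosine similarity after flattening each matrix into one long vector."""
    a_flat, b_flat = a.ravel(), b.ravel()
    return float(a_flat @ b_flat / (np.linalg.norm(a_flat) * np.linalg.norm(b_flat)))


a = np.eye(2)
b = 2.0 * np.eye(2)

print(frobenius_distance(a, b), flattened_cosine(a, b))
```

Note that the cosine version ignores overall scale while the Frobenius distance does not, so the two can disagree about which matrix is "closest".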

weary crown Dec 9, 2023, 6:06 PM

#

https://hastebin.com/share/ufujobacah.python

running a vgg19 on google quickdraw dataset - when i train my code in model.fit, it says that it has shape 999,224,224, 3 and expected 224,224,3
i understand what that means - that i have to separate my individual images
but why it cant train on the whole dataset
after all, x_train and y_train include all 999 training instances ...
why do i have to separate them each and how
isnt the whole model trained at once :9
:(

Hastebin

Hastebin is a free web-based pastebin service for storing and sharing text and code snippets with anyone. Get started now.

#

model.input_shape is (224 ,224, 3) which makes sense since that's the dimensions of my image - but when im fitting/training the model, shouldnt that input be (999, 224, 224, 3) since there are 999 images to be trained??

#

its weird that during training it wants 1 image ... instead of all of them ... and im sure model.fit doesnt go in a loop lol

final spruce Dec 9, 2023, 6:32 PM

#

What is meant with ∆ in the neural network backwards phase

agile cobalt Dec 9, 2023, 6:42 PM

#

weary crown https://hastebin.com/share/ufujobacah.python running a vgg19 on google quickdra...

maybe check if the Input you're using has a specified batch size

weary crown Dec 9, 2023, 6:44 PM

#

agile cobalt maybe check if the [Input](<https://keras.io/api/layers/core_layers/input/>) you...

batch size im using is 32 but i dont think thats the problem

agile cobalt Dec 9, 2023, 6:44 PM

#

what is that 999 then, if not the batch size?

weary crown Dec 9, 2023, 6:45 PM

#

agile cobalt what is that `999` then, if not the batch size?

999 is how many images im training on in total

#

batch size rn is 32 - the thing i dont understand is why the model seems to want to train on 1 image at a time instead of the whole x_train

agile cobalt Dec 9, 2023, 6:46 PM

#

what do you think that the batch size is in first place?

#

and what's its purpose

weary crown Dec 9, 2023, 6:47 PM

#

agile cobalt and what's its purpose

number of examples in 1 forward and back pass right

agile cobalt Dec 9, 2023, 6:48 PM

#

and how many examples are you trying to fit into 1 pass?

weary crown Dec 9, 2023, 6:48 PM

#

agile cobalt and how many examples are you trying to fit into 1 pass?

well all 999 but with a batch_size of 32 so that it isnt all at once

#

```py
# Train the model
batch_size = 32
epochs = 20

model.fit(
    x=x_train,
    y=y_train,
    batch_size=batch_size,
    epochs=epochs,
    validation_data=(x_test, y_test)
)
```

#

x_train shape is (999, 224, 224, 3) bc it has all 999 training things in it

#

but model.fit wants (224, 224, 3) so i do train in a for loop from 0 -> 999 or what bc im pretty sure u dont do that

agile cobalt Dec 9, 2023, 6:51 PM

#

never mind, I thought that you had to break it down into batches yourself but looks like keras does that for you
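To illustrate what batching means here: the array passed to `fit` keeps all samples on its first axis, and each training step sees a slice of that axis. The sketch below is a plain NumPy illustration of the idea, not Keras's actual implementation; `iter_batches` is a hypothetical helper and the tiny 4x4x3 "images" are only there to keep memory use small.

```python
import math

import numpy as np


def iter_batches(x, batch_size):
    """Yield consecutive slices of `x` along the first (sample) axis."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size]


# Small stand-in for x_train: 999 samples of shape (4, 4, 3).
x_train = np.zeros((999, 4, 4, 3), dtype=np.float32)

batches = list(iter_batches(x_train, batch_size=32))
steps_per_epoch = math.ceil(len(x_train) / 32)

print(len(batches), batches[0].shape, batches[-1].shape)
```

So a per-sample input shape of `(224, 224, 3)` and a dataset of shape `(999, 224, 224, 3)` are compatible: the leading axis is the one that gets cut into batches.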

#

yeah no clue, paste the actual traceback?

weary crown Dec 9, 2023, 6:55 PM

#

```
Epoch 1/20
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-086b6a8fd937> in <cell line: 11>()
      9 # Make sure the input shape matches the model's expected input shapeTd
     10 
---> 11 model.fit(
     12     x=x_train,
     13     y=y_train,

1 frames
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py in tf__train_function(iterator)
     13                 try:
     14                     do_return = True
---> 15                     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16                 except:
     17                     do_return = False

ValueError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1377, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1360, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1349, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1126, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/input_spec.py", line 298, in assert_input_compatibility
        raise ValueError(

    ValueError: Input 0 of layer "model_1" is incompatible with the layer: expected shape=(None, 224, 224, 3), found shape=(None, 999, 224, 224, 3)
```

agile cobalt Dec 9, 2023, 6:58 PM

#

you should have mentioned those extra Nones

#

mind checking what exactly is x_train.shape again?

weary crown Dec 9, 2023, 7:00 PM

#

(9, 999, 224, 224, 3)

#

tbh idk what the 9 is doing there

agile cobalt Dec 9, 2023, 7:00 PM

#

yeah where tf did that 9 come from x-x

weary crown Dec 9, 2023, 7:01 PM

#

so js get rid of it?

agile cobalt Dec 9, 2023, 7:01 PM

#

right now it thinks that you have 9 items, each containing the (999, 224, 224, 3) shape

weary crown Dec 9, 2023, 7:01 PM

#

#

train test split somehow fucked up my stuff

agile cobalt Dec 9, 2023, 7:01 PM

#

did you stack the different images instead of concatenating?

#

what was the shape before that split

weary crown Dec 9, 2023, 7:02 PM

#

(12, 1000)

#

js get rid of the 9 somehow or

agile cobalt Dec 9, 2023, 7:03 PM

#

right now you have something like cat [1, 2, 3] dog [4, 5, 6] bird [7, 8, 9] instead of cat 1 cat 2 cat 3 dog 4 dog 5 dog 6 bird 7 bird 8 bird 9 and it's splitting like
```
Train
cat [1, 2, 3]
dog [4, 5, 6]

Test
bird [7, 8, 9]
```

weary crown Dec 9, 2023, 7:05 PM

#

ohh

#

so removing the 9 will fix that ok lemme try that

agile cobalt Dec 9, 2023, 7:06 PM

#

not just removing in any arbitrary way, but you have to reshape it at the right place, or change the way you're creating it in first place

weary crown Dec 9, 2023, 7:06 PM

#

agile cobalt not just removing in any arbitrary way, but you have to reshape it at the right ...

tbh idk where the 9 even came from

#

bc like i had the data originaly correct

#

img_data was (12, 1000) - 12 classes each with 1k iamges

#

but after i split, a 9 popped up from nowhere

agile cobalt Dec 9, 2023, 7:07 PM

#

it probably split 9 classes into train, 3 classes into test

weary crown Dec 9, 2023, 7:07 PM

#

ohh yeah it did do taht

agile cobalt Dec 9, 2023, 7:07 PM

#

it shouldn't have a "classes" layer though

#

it should be just (12000, 224, 224, 3) pre-split

weary crown Dec 9, 2023, 7:08 PM

#

ohh

#

oh wait

#

@agile cobalt img_data is (999, 224, 224) pre split

agile cobalt Dec 9, 2023, 7:09 PM

#

did you do something like thing = thing[0, ...]

weary crown Dec 9, 2023, 7:09 PM

#

no

#

@agile cobalt img data shape works actually

#

12 classes, 999 instances, 224x224 per intsance

#

img_classes pre split is 12x1000 which works

#

so after trani test split its fucking up somehow

#

```py
# split data into training and testing
x_train, x_test, y_train, y_test = train_test_split(
    img_data,
    img_classes,
    test_size=0.2,
    random_state=42,
    shuffle=True
)
```
this is how i split @agile cobalt

agile cobalt Dec 9, 2023, 7:12 PM

#

Again, for the last time, you should NOT have the 12, in img_data

weary crown Dec 9, 2023, 7:13 PM

#

pre split:
img data (12, 999, 224, 224)
img classes (12, 1000)

post split:
x_train (9, 999, 224, 224, 3, 3)
x_test (3, 999, 224, 224)
y_train(9, 1000, 3, 3)
y_test (3, 1000)

#

9 + 3 = 12 so i get that ig

agile cobalt Dec 9, 2023, 7:14 PM

#

The 999 and 1000 should be the same number (it does not matter whether it's 999, 1000, or any other number); right now you have 999 images per category but 1000 labels per category

weary crown Dec 9, 2023, 7:15 PM

#

idk why its off by 1

agile cobalt Dec 9, 2023, 7:16 PM

#

It should be
```
pre split:
img data (12000, 224, 224)
img classes (12000)

post split (test_size=0.2):
x_train (9600, 224, 224, 3)
x_test (2400, 224, 224, 3)
y_train (9600)
y_test (2400)
```
and make sure that it is shuffled properly (primarily, make sure that y_test contains all 12 values at least once)

#

other than that, good luck getting it to the right format
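One way to get from a per-class layout to that flat layout is sketched below with NumPy. The shapes are scaled down (12 classes, 5 images each, 8x8 pixels) so it runs instantly; the variable names follow the chat, while the channel-repeat step is an assumption about how grayscale drawings could be fed to a model expecting 3 channels.

```python
import numpy as np

# Tiny stand-in: 12 classes, 5 images per class, 8x8 pixels.
n_classes, per_class, height, width = 12, 5, 8, 8
img_data = np.zeros((n_classes, per_class, height, width), dtype=np.float32)
for class_id in range(n_classes):
    img_data[class_id] = class_id  # mark every pixel with its class id

# Merge the class axis into the sample axis: (12, 5, 8, 8) -> (60, 8, 8).
x = img_data.reshape(-1, height, width)

# One integer label per image, in the same class-major order as `x`.
y = np.repeat(np.arange(n_classes), per_class)

# Images and labels must line up one-to-one before any split.
assert len(x) == len(y)
assert np.array_equal(x[:, 0, 0], y)

# Assumed step: repeat the grayscale values across 3 channels for an RGB model.
x_rgb = np.repeat(x[..., np.newaxis], 3, axis=-1)

print(x.shape, y.shape, x_rgb.shape)
```

With `x` and `y` in this form, a sample-wise split shuffles individual images rather than whole classes.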

weary crown Dec 9, 2023, 7:19 PM

#

agile cobalt It should be``` pre split: img data (12000, 224, 224) img classes (12000) po...

idk how my pre split data is wrong tho

#

oh wait i think i might have it

#

but why is there a 999 instead of 1k lol @agile cobalt

#

i flattened it so that its like (12000, 224, 224)

#

but its (11988, 224, 224)

#

999 * 12

#

idk why its 999

#

shti is dumb asl

#

literally 0 clue how its 999 data but 1000 targets

#

i literally imported the data to be 1000 how tf are there 999???

#

now when i ask for 1001 drawings i get 1001 labels and 1000 data

#

wtf thas so weird now the other one is wrong?? what is going on ...

#

anyone know how this happens lol weirdest error ive ever faced

#

@agile cobalt ran drawing_count on it and it seems that every category l oads in 1000 images like expected

#

so idk where the 999 data is from

feral kernel Dec 9, 2023, 8:31 PM

#

Yo, is mojo any good for ML?

agile cobalt Dec 9, 2023, 8:48 PM

#

I would wait until they release a 1.0 version before really trying to use it - it is still missing a lot of core, essential features right now

left tartan Dec 9, 2023, 9:28 PM

#

agile cobalt I would wait until they release a 1.0 version before really trying to use it - i...

I'm waiting till they open source it 🙂

lapis sequoia Dec 9, 2023, 11:56 PM

#

I need some ML ideas

serene scaffold Dec 10, 2023, 12:43 AM

#

lapis sequoia I need some ML ideas

Make a chat bot that gives ml project suggestions

lapis sequoia Dec 10, 2023, 12:49 AM

#

no

serene scaffold Dec 10, 2023, 12:52 AM

#

lapis sequoia no

But it would solve a problem that you have

lapis sequoia Dec 10, 2023, 1:06 AM

#

No

#

I do not want to keep doing finance bro stuff

#

maybe optimization

hard pebble Dec 10, 2023, 1:27 AM

#

I'm having so much trouble getting Tensorflow to work on my Macbook Air M1. Despite installing it on the terminal with 'pip install tensorflow' when I import tensorflow to my program, it always says it cannot find module tensorflow

#

Is there a different or more specific way I have to install tensorflow?

serene scaffold Dec 10, 2023, 1:31 AM

#

hard pebble I'm having so much trouble getting Tensorflow to work on my Macbook Air M1. Desp...

what editor are you using? chances are, the editor is using a different python environment than the one you installed tensorflow to.
by the way, depending on what you're trying to do, your macbook air might not be able to handle it. you might have to switch to something like google colab.

hard pebble Dec 10, 2023, 1:32 AM

#

serene scaffold what editor are you using? chances are, the editor is using a different python e...

I tried it both on PyCharm and VSC. I did my research, my macbook should be able to handle it from what I saw but what I saw was very vague. I have a 2020 Macbook Air M1

serene scaffold Dec 10, 2023, 1:33 AM

#

hard pebble I tried it both on PyCharm and VSC. I did my research, my macbook should be able...

Please follow these steps exactly, and report the output of all actions, even if it's an error message.
at the TERMINAL, run `which python` and put the result in this chat.
in the CODE EDITOR, change the first line to be `import sys; print(sys.executable); exit()`, run the program, and put the result in this chat.
No screenshots.

hard pebble Dec 10, 2023, 1:36 AM

#

For the code editor - /Users/'name'/Desktop/pythonProject1/venv/bin/python
For the terminal - /Users/'name'/ENTER/bin/python

serene scaffold Dec 10, 2023, 1:38 AM

#

hard pebble For the code editor - /Users/'name'/Desktop/pythonProject1/venv/bin/python For t...

this means that the editor is using a different environment than the one that your terminal is using.

#

so you need to install tensorflow in the environment that is being used by the editor
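A small check along these lines can be run from inside the editor to confirm which interpreter is active and to print an install command that targets it. The helper names are made up for this sketch; running pip as `<interpreter> -m pip` is a common way to make sure the package lands in the same environment.

```python
import importlib.util
import sys


def install_hint(package):
    """Return a pip command that targets the interpreter running this code."""
    return f"{sys.executable} -m pip install {package}"


def is_importable(module_name):
    """True if `module_name` can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


print("interpreter:", sys.executable)
if not is_importable("tensorflow"):
    print("tensorflow not found here; try:", install_hint("tensorflow"))
```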

hard pebble Dec 10, 2023, 1:39 AM

#

Ahhh okay. Thanks for your help!

serene scaffold Dec 10, 2023, 1:39 AM

#

you can delete `import sys; print(sys.executable); exit()`

#

thank you for following the instructions exactly (a lot of people do not)

hard pebble Dec 10, 2023, 1:39 AM

#

I'll definitely be back if I still run into issues lol

#

Seems google collab is pretty useful too. It's working fine on google collab

quaint loom Dec 10, 2023, 2:28 AM

#

quaint loom Hi guys. I am currently trying to do the Random forest test on my data. I want t...

Is there anyone who can come with a helping hand on this one?

desert oar Dec 10, 2023, 2:44 AM

#

long canopy is there a term to designate old-school AIs, i.e., AI before the likes of ChatGP...

the term "good old fashioned AI" is sometimes used, often abbreviated GOFAI

weary crown Dec 10, 2023, 2:48 AM

#

ok guys
finished writing code and flask bcakend but it shows this

googling shows its some error bc different dependencies but when i update tensorflow on pycharm vs google colab, former updates to 2.15.1 while latter upgrades only to 2.14.1
and im not sure thats the issue
but its prob some discrepancy? idk
how do i fix the deps diff

#

maybe smth is going on tho

#

i can link the other code if yall want

#

https://github.com/Necl0/HTNDoodleProject

GitHub

GitHub - Necl0/HTNDoodleProject

Contribute to Necl0/HTNDoodleProject development by creating an account on GitHub.

pine prawn Dec 10, 2023, 11:04 AM

#

Hello there. I'm facing compatibility issues.

I'm expected to run two models on the same Conda environment. However:

Model A requires tensorflow 1.x, so my CUDA version must not be newer than 10.0 to be compatible (tf 1.15 - CUDA 10.0)
(according to https://www.tensorflow.org/install/source#gpu)

Model B, requires pytorch and the latest version of PyTorch that supports 10.0 is PyTorch v.1.2.0
(according to https://pytorch.org/get-started/previous-versions/)

However, there are a lot of Pytorch versions that support CUDA 10.2

My question is: Is it okay for me to use CUDA 10.2 instead? Will tensorflow 1.15 support it despite the website seemingly suggesting otherwise?

Downloading CUDA and its cudnn takes a lot of data so I want to be a bit more careful before downloading them (and they may corrupt my other, currently working environment, I think).

sage bolt Dec 10, 2023, 11:11 AM

#

pine prawn Hello there. I'm facing compatibility issues. I'm expected to run two models o...

i recommend to use CUDA 10.0

pine prawn Dec 10, 2023, 11:12 AM

#

Are you sure? PyTorch 1.2.0 sounds very outdated

sage bolt Dec 10, 2023, 11:13 AM

#

CUDA 10.0 would be the best option for compatibility with both TensorFlow 1.15 and PyTorch 1.2.0.

pine prawn Dec 10, 2023, 11:13 AM

#

In fact, model B's official documentation suggests that PyTorch needs to be >= 1.6.0

sage bolt Dec 10, 2023, 11:17 AM

#

pine prawn In fact, model B's official documentation suggests that PyTorch needs to be >= 1...

then use it

pine prawn Dec 10, 2023, 11:17 AM

#

But PyTorch 1.6.0+ needs CUDA 10.2+

sage bolt Dec 10, 2023, 11:22 AM

#

i d k you have to choose one of these options

PyTorch 1.2.0 (for CUDA 10.0)
or
TensorFlow 2.3 (for CUDA 10.2+)
or
TensorFlow 2.4 (for CUDA 10.2+)
or
TensorFlow 2.5 (for CUDA 10.2+)
or
TensorFlow 2.6 (for CUDA 10.2+)
or
TensorFlow 2.7 (for CUDA 10.2+)

wheat thicket Dec 10, 2023, 11:37 AM

#

hlo i am new here

odd meteor Dec 10, 2023, 11:41 AM

#

wheat thicket hlo i am new here

https://tenor.com/view/welcome-michael-scott-dunder-mifflin-the-office-welcome-aboard-gif-27005393

Tenor

wheat thicket Dec 10, 2023, 11:41 AM

#

odd meteor https://tenor.com/view/welcome-michael-scott-dunder-mifflin-the-office-welcome-a...

thanks 🤝

long canopy Dec 10, 2023, 1:17 PM

#

what do you guys follow for "serious" AI-related news? By serious I mean, stuff that gets into the details, aimed at programmers

long canopy Dec 10, 2023, 1:43 PM

#

there's probably new journals popping up nowadays and that sort of thing

cosmic willow Dec 10, 2023, 2:30 PM

#

i just want to play around with ai. is there some module or something where i just give a bunch of inputs and outputs with a value describing how good that step was, and after some time it returns a network?

frail mauve Dec 10, 2023, 4:50 PM

#

I have to build 3 endpoints for work:

1. Ring try-on
   - Input is a hand image
   - I know how to identify hand landmarks with Google's Mediapipe
   - I just don't know how I will size the image according to the hand and stick that image to a particular finger
2. Bracelet try-on
   - Input is a hand image
   - How do I identify the wrist?
   - How do I size the image according to the wrist size and stick the image on the wrist?
3. Earrings try-on
   - Input is a face image
   - How do I identify the ear?
   - How do I size the image according to the ear size and stick the image on the ear?

Please help me with libraries etc...

frail mauve Dec 10, 2023, 7:31 PM

#

Please tag me

golden ridge Dec 10, 2023, 9:07 PM

#

hello guys, i have a linear regression model and i want a graph to plot the accuracy, but the problem is that my dataframe has several columns. how would i plot them into a graph and then draw the linear regression line

serene scaffold Dec 10, 2023, 10:24 PM

#

golden ridge hello guys, i have a linear regresion model and i want a graph to plot the accur...

What column represents what you want to plot?

golden ridge Dec 10, 2023, 10:27 PM

#

serene scaffold What column represents what you want to plot?

idk, thats hwy im asking here, does it matter?? should i try to use an algorithm to make them 2 columns or what

serene scaffold Dec 10, 2023, 10:27 PM

#

golden ridge idk, thats hwy im asking here, does it matter?? should i try to use an algorithm...

Can you show the dataframe?

#

For future reference, any time you need help in connection to a dataframe, you should show it. Because a dataframe could have unlimited possible schemas.

golden ridge Dec 10, 2023, 11:03 PM

#

serene scaffold Can you show the dataframe?

```
0    2    4.0    4.0    1.0    83.391    3    17.8    34.4    52.0    0
1    2    5.0    5.0    1.0    83.104    3    17.8    34.0    52.0    0
2    2    6.0    6.0    1.0    82.843    3    17.7    33.7    52.0    0
3    3    3.0    11.0    2.0    83.460    3    18.0    33.1    52.0    0
4    3    4.0    12.0    2.0    81.994    3    18.0    33.0    51.0    0
...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...
931    2    6.0    6.0    1.0    80.312    3    16.4    26.1    72.0    0
932    1    2.0    9.0    2.0    83.470    3    16.4    25.4    73.0    1
933    2    2.0    2.0    1.0    81.678    3    16.4    26.3    75.0    0
934    2    4.0    4.0    1.0    80.535    3    16.4    26.5    76.0    0
935    2    6.0    6.0    1.0    80.380    3    16.3    26.0    73.0    0
936 rows × 10 columns
```

pine void Dec 10, 2023, 11:21 PM

#

hello everyone. I just ran a TensorFlow model that took an hour on Google Colab. How can I save it so I do not have to re-run my code? Normally Google Colab makes me re-run all my code.

serene scaffold Dec 11, 2023, 12:04 AM

#

pine void hello everyone. I just ran a tensor flow model that took an hour on a google col...

https://www.tensorflow.org/tutorials/keras/save_and_load

TensorFlow

Save and load models | TensorFlow Core

pine void Dec 11, 2023, 12:05 AM

#

serene scaffold https://www.tensorflow.org/tutorials/keras/save_and_load

It’s fine, I was done with the project anyway and I saved all the charts I made. Thanks tho

serene scaffold Dec 11, 2023, 12:05 AM

#

golden ridge ``` Compound TyreLife LapNumber Stint LapTimeSeconds Track ...

What does your model do? Predict one of these columns?

dense crane Dec 11, 2023, 12:13 AM

#

i have written a Q-learning algorithm but it takes too long to learn, here is the code: https://codeshare.io/X8Rv9z

serene scaffold Dec 11, 2023, 1:42 AM

#

dense crane i have written q-learnnig algorithm but it takes too long him to learn, here is ...

you're running it for 100,000 epochs. how long does each epoch take, and are you sure you need that many?

lapis sequoia Dec 11, 2023, 5:42 AM

#

Hello

dense crane Dec 11, 2023, 7:29 AM

#

serene scaffold you're running it for 100,000 epochs. how long does each epoch take, and are you...

I can interrupt it whenever I want

#

And that is not the issue

craggy crescent Dec 11, 2023, 2:03 PM

#

Hey everyone! Im new here
I need to know whether data engineering is considered a good career or not. I am interested in building databases and cleaning data. Not sure if there are enough job offers for a DE. Could someone please let me know more details?

golden ridge Dec 11, 2023, 3:10 PM

#

serene scaffold What does your model do? Predict one of these columns?

predicts laptimes

long canopy Dec 11, 2023, 3:58 PM

#

what are some AI-related news sources targeted at comp sci people and programmers?

agile cobalt Dec 11, 2023, 4:07 PM

#

I personally follow The Batch newsletter and these two YouTube channels: bycloudai and AI Explained

The Batch | DeepLearning.AI | AI News & Insights

Weekly AI news for engineers, executives, and enthusiasts.

YouTube

bycloud

I cover the latest AI tech/research papers for fun

YouTube

AI Explained

Covering the biggest news of the century - the arrival of smarter-than-human AI. What is happening, what might soon happen and what it means for all of us.

Business Enquiries: aiexplained@outlook.com

https://www.patreon.com/AIExplained

jaunty mural Dec 11, 2023, 5:51 PM

#

Hello, I haven't looked at the server in a while, especially this thread. Are there any recent trends or huge improvements in data science analysis tools in Python?

agile cobalt Dec 11, 2023, 6:12 PM

#

I guess that pola.rs has been gaining traction (but is still not anywhere near as popular as pandas)
other than that, all the generative AI stuff I guess

pearl barn Dec 11, 2023, 6:16 PM

#

For people who use Jupyter: how do you save your progress? Can I save the same project under different names? And what should I do when the kernel is dead?

left tartan Dec 11, 2023, 6:17 PM

#

pearl barn For people who uses Jupyter how do you save your progress can I save same projec...

The basic approach is just to save the same file under different names, but that doesn’t scale well. Better: strip the notebook files with nbstripout and commit them (stripped of data) to git.

#

Just restart the kernel.

frail mauve Dec 11, 2023, 6:23 PM

#

frail mauve I have to build 3 endpoints for work: 1. Ring tryon Input is hand image I know h...

Please help me guys

lapis sequoia Dec 11, 2023, 6:51 PM

#

Do any of you mess with optimization?

feral kernel Dec 11, 2023, 6:52 PM

#

Hi,
```
def forward(self, fft_result_tensors):
    # Forward pass through the network

    # Use MPS if available (optional)
    try:
        import torch.backends.mps
        is_available = True
    except ImportError:
        is_available = False

    if is_available:
        device = "mps"
        fft_result_tensors = [tensor.to(device) for tensor in fft_result_tensors]
        self.to(device)
```
```
 78     self.to(device)
 80 # Compress the input before feeding it to the model

AttributeError: 'list' object has no attribute 'to'
```
Why is there this error? I enumerated each tensor in the list already

proven ruin Dec 11, 2023, 6:58 PM

#

hey guys

#

i've been working on a personal project for a while now

#

it's a simple weight tracker app

#

I would like my program to have a built-in graph inside of it; its UI is written in tkinter

#

what Library do you recommend to plot dates and weight ?

hard pebble Dec 11, 2023, 7:04 PM

#

Is there any benefits to using Tensorflow over PyTorch or is it all preference?

versed pilot Dec 11, 2023, 8:15 PM

#

left tartan Basic is just save the same file with diff names, but that doesn’t scale well. B...

or commit the notebook as is to git. Github does an ok job rendering notebooks.

long canopy Dec 11, 2023, 8:55 PM

#

agile cobalt I personally follow [The Batch newsletter](https://www.deeplearning.ai/the-batch...

thanks a lot!!

trim saddle Dec 11, 2023, 9:48 PM

#

versed pilot or commit the notebook as is to git. Github does an ok job rendering notebooks.

It's not about the rendering, it's that you get lots of stuff in your diff that has to do with cell execution/output rather than with the code you've written.
For that you can use e.g. a pre-commit hook with nbstripout to clear all outputs automatically before committing any changes.
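
For a sense of what clearing outputs amounts to, here is a minimal stdlib-only sketch. It assumes the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). `strip_outputs` is a hypothetical helper written for illustration, not the API of any existing tool; a real pre-commit setup would normally rely on a maintained tool instead.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    stripped = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Tiny in-memory notebook standing in for a real .ipynb file (made-up content).
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 3,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                    "execution_count": 3,
                }
            ],
        },
    ],
}

clean = strip_outputs(nb)
print(json.dumps(clean["cells"][1], indent=1))
```

With a file on disk, the same function would sit between `json.load` and `json.dump`. Only the source of each cell then shows up in the diff.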

versed pilot Dec 11, 2023, 9:50 PM

#

trim saddle Its not about the rendering or not, its about that you get lots of stuff in your...

Github is also doing rich diffs if you enable them https://github.blog/changelog/2023-03-01-feature-preview-rich-jupyter-notebook-diffs/ . Apologies if I sound like a spammer, but I do find these features useful

The GitHub Blog

Kevin Duck

Feature Preview: Rich Jupyter Notebook Diffs

trim saddle Dec 11, 2023, 9:52 PM

#

A cleaner way would be to move functionality into a .py file once it is tested/developed in the notebook, and import it from there.
Then your logic lives in .py files, which don't carry all the notebook metadata that clutters diffs

#

https://florianwilhelm.info/2018/11/working_efficiently_with_jupyter_lab/ This blogpost describes it nicely

Florian Wilhelm's blog

Working efficiently with JupyterLab Notebooks

Being in the data science domain for quite some years, I have seen good Jupyter notebooks but also a lot of ugly. Notebooks can have the perfect balance between text, code and visualisations but how often do your notebooks rather get messy and incomprehensible after a while? Follow some simple best practices to work more efficiently with your no...

#

i can also recommend pyscaffold + the data-science extension, which sets up a whole project template for that workflow

#

https://pyscaffold.org/en/stable/

#

(also developed by florian wilhelm)

umbral charm Dec 11, 2023, 10:12 PM

#

```
def secant(f, a, b, tol):
    x0 = a
    x1 = b
    n = 0
    while abs(x0 - x1) > tol:
        n = n + 1
        x2 = x1 - f(x1)*(x1 - x0)/(f(x1) - f(x0))
        x0 = x1
        x1 = x2
        print(x0)
    return x2, n
```

#

this is my secant method. when i call it with a function and some tolerance it prints the root twice, and thus does one extra iteration, but i can't seem to find out why

#

1.0319286204529856
0.9929544004363596
1.0004259265358206
1.0000054550936772
0.9999999957379389
1.0000000000000426
1.0
1.0
(1.0, 8)

trim saddle Dec 11, 2023, 10:19 PM

#

Maybe print the difference too. I guess when x0 is 1, x2 still has a bigger value, which causes the while condition to evaluate to true one extra time?

umbral charm Dec 11, 2023, 10:20 PM

#

trim saddle Print the difference maybe too, i guess when x0 is 1, x2 still has a bigger valu...

Yes

#

that seems to be the problem

#

1.1 1.0319286204529856 1.0319286204529856
1.0319286204529856 0.9929544004363596 0.9929544004363596
0.9929544004363596 1.0004259265358206 1.0004259265358206
1.0004259265358206 1.0000054550936772 1.0000054550936772
1.0000054550936772 0.9999999957379389 0.9999999957379389
0.9999999957379389 1.0000000000000426 1.0000000000000426
1.0000000000000426 1.0 1.0
1.0 1.0 1.0

#

this is x0 x1 and x2 printed out

#

So is my root found when x1 and x2 are 1 or when all 3 are one

trim saddle Dec 11, 2023, 10:23 PM

#

x0 and x1 both have to be 1 for your while loop to stop, and then x2 is 1 as well, because x1 = x2
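
One way to avoid the extra pass, as a minimal sketch rather than a patch to the exact function above: compare the new estimate with the previous one right after computing it, so the loop stops as soon as the step is within `tol`. The `max_iter` cap and the zero-denominator guard are assumptions added here for safety; they are not part of the original code.

```python
def secant(f, a, b, tol, max_iter=100):
    """Secant method that tests convergence right after each update."""
    x0, x1 = a, b
    for n in range(1, max_iter + 1):
        f0, f1 = f(x0), f(x1)
        if f1 == 0:
            # x1 is already an exact root; no further update needed.
            return x1, n - 1
        if f1 == f0:
            # Flat secant line: the update would divide by zero.
            raise ZeroDivisionError("f(x1) == f(x0), cannot take a secant step")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) <= tol:
            return x2, n
        x0, x1 = x1, x2
    raise RuntimeError("secant method did not converge")


# Example function chosen for illustration; its positive root is 1.
root, steps = secant(lambda x: x**2 - 1, 0.5, 1.1, 1e-12)
print(root, steps)
```

The original loop tests `abs(x0 - x1)` only at the top of the next pass, after both values have been shifted, so it needs one more update before the two stored values agree.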

south gull Dec 11, 2023, 10:25 PM

#

hard pebble Is there any benefits to using Tensorflow over PyTorch or is it all preference?

PyTorch for anything you want to implement yourself, no contest. Not even google believes in TensorFlow. In typical fashion, they're building a new framework. I generally wouldn't bother with anything from google, unless there is no viable alternative

#

If you just want to serve models, whatever is available and most convenient on your target platform

versed pilot Dec 11, 2023, 10:43 PM

#

south gull PyTorch for anything you want to implement yourself, no contest. Not even google...

But isn't their hardware TPU optimised for Tensorflow? There was a chart about GPU usage by various Big Tech companies, with a footnote explaining that Google is using TPU and GPU where others only use GPU? This means they do believe in it to a certain extent.

lapis sequoia Dec 11, 2023, 10:46 PM

#

trim saddle X0 and x1 have to be 1 and for your while loop to not continue, then also x2, be...

Is that like youngs theorem?

south gull Dec 11, 2023, 10:46 PM

#

versed pilot But isn't their hardware TPU optimised for Tensorflow? There was a chart about G...

There are a couple of distinctions you need to make. Importantly, whether you care about training or serving models. For the former, you care about the autograd engine. PyTorch does support a wide range of accelerators, and TPU is one of them if memory serves right (easy to check yourself, don't take my word for it). If you care about serving models, then the platform is much more important. What you want to do there is compile your model with a graph compiler. For example, if you deploy on nvidia jetson devices, you'll want to use TensorRT

#

You can convert both PyTorch and TensorFlow models with loaded weights to various formats, such as ONNX

left tartan Dec 11, 2023, 11:13 PM

#

versed pilot Github is also doing rich diffs if you enable them https://github.blog/changelog...

Oh, that's interesting. I never commit data tho.

long canopy Dec 11, 2023, 11:31 PM

#

what graph modeling options do I have if I need to:

Name nodes,
Name edges/links
Set edge directions
Navigate a graph of more than 1000 nodes

#

i'm most of all interested in navigation, i.e., if there exists a ready-made interface to move about such huge graphs

safe ermine Dec 11, 2023, 11:38 PM

#

what is the best way to learn ai programming in python from scratch?

left tartan Dec 11, 2023, 11:56 PM

#

safe ermine what is the best way to learn ai programming in python from scratch?

Do you know how to program in Python?

safe ermine Dec 12, 2023, 12:24 AM

#

left tartan Do you know how to program in Python?

welllll the very basics bc i only recently started

#

the most advanced thing ik rn is like the basics of arrays of records

left tartan Dec 12, 2023, 12:27 AM

#

safe ermine welllll the very basics bc i only recently started

Oh, then don't worry about learning AI yet. Learn the basics of python, do some simple projects, and then do some more complex projects. You don't need to be an expert, but you do have to know the fundamentals first. If you need help, resources and ideas for projects, #python-discussion is a good place to start.

safe ermine Dec 12, 2023, 12:28 AM

#

what kinda things come under the fundamentals

#

bc i’m not sure where to go from where i am now

#

bc currently i’m just following the stuff i need for school

left tartan Dec 12, 2023, 12:31 AM

#

safe ermine what kinda things come under the fundementals

Could you ask over in #python-discussion how to get started learning python and what fundamentals you need to learn? I'm headed to dinner right now. PyDis is just general Python talk, and there are many students and new python programmers there who can share their experience.

shell ruin Dec 12, 2023, 12:54 AM

#

Im doing an ML refresher and starting off with the basic housing price predictor. I was playing around with seaborn for some EDA and got an empty-ish heatmap. Does anyone have any thoughts as to why?

#

long canopy Dec 12, 2023, 12:56 AM

#

information information information, yeah yeah. informatiooooooooooooooooooooooooooooooooooooon (and data), yeah

#

turns out Gephi is great for graph visualization

left tartan Dec 12, 2023, 12:58 AM

#

shell ruin Im doing an ML refresher and starting off with the basic housing price predictor...

Well what data points (columns) are you trying to build a heat map from?

shell ruin Dec 12, 2023, 12:59 AM

#

left tartan Well what data points (columns) are you trying to build a heat map from?

All numeric fields, in this case I have built a list: ['Id', 'MSSubClass', 'LotArea', 'OverallCond', 'YearBuilt', 'YearRemodAdd']

left tartan Dec 12, 2023, 12:59 AM

#

shell ruin All numeric fields, in this case I have built a list: ['Id', 'MSSubClass', 'LotA...

What does that mean? How do you expect a heat map to look with 6 columns?

shell ruin Dec 12, 2023, 1:00 AM

#

left tartan What does that mean? How do you expect a heat map to look with 6 columns?

left tartan Dec 12, 2023, 1:01 AM

#

That last one is a correlation matrix. The .corr is the important piece missing.

shell ruin Dec 12, 2023, 1:02 AM

#

So I did try it using .corr(), assuming that I would only get numeric columns, but using .corr() I was getting an error as it was trying to convert a string field to a float. Which of course would not work.

#

Huh, okay, I got it. So by not relying on .corr() to select the numeric columns and doing it manually, I was able to recreate the matrix

#

Just in case someone else ever needs an example

left tartan Dec 12, 2023, 1:04 AM

#

There are other methods for correlation of categorical fields, but yah, great!

shell ruin Dec 12, 2023, 1:08 AM

#

left tartan There are other methods for correlation of categorical fields, but yah, great!

Would you have an example of how you would handle this? I havent done any EDA in like a year and a half, so I'm open to trying other approaches.

left tartan Dec 12, 2023, 1:10 AM

#

This is not my specialty, but start with the chi square test: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html
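
As a rough illustration of the idea, the Pearson chi-square statistic for two categorical columns can be computed from their contingency table. This is a stdlib-only sketch with made-up category values; it returns only the statistic and the degrees of freedom, not a p-value, and `chi_square_statistic` is a name chosen here for illustration. SciPy also offers `scipy.stats.chi2_contingency` for the two-column case.

```python
from collections import Counter


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic and degrees of freedom for two categorical sequences."""
    n = len(xs)
    observed = Counter(zip(xs, ys))   # counts per (row, column) category pair
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n   # count expected under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Made-up example data: two categorical columns of equal length.
zone = ["north", "north", "south", "south", "north", "south", "north", "south"]
grade = ["high", "high", "low", "low", "high", "low", "low", "high"]

stat, dof = chi_square_statistic(zone, grade)
print(stat, dof)
```

A larger statistic for the same degrees of freedom suggests a stronger association between the two columns.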

shell ruin Dec 12, 2023, 1:12 AM

#

Sounds good, thanks!

quaint loom Dec 12, 2023, 2:14 AM

#

quaint loom Hi guys. I am currently trying to do the Random forest test on my data. I want t...

Is there anyone who can have a look on my previous message?

desert oar Dec 12, 2023, 3:07 AM

#

quaint loom Is there anyone who can have a look on my previous message?

it's not clear to me what you're asking... what's a "random forest test"?

#

that's also maybe too much code to expect someone to debug without additional context

long canopy Dec 12, 2023, 3:21 AM

#

Any communities focused on local LLMs?

serene scaffold Dec 12, 2023, 3:26 AM

#

long canopy Any communities focused on local LLMs?

not that I'm aware of

lapis sequoia Dec 12, 2023, 3:37 AM

#

What do you use for big data? And is all of this actually a job?

quaint loom Dec 12, 2023, 3:51 AM

#

desert oar it's not clear to me what you're asking... what's a "random forest test"?

So, basically a “random forest” is a technique in machine learning where a group of decision trees work together to improve predictions. Kinda like a statistical test. So my question is why my code is not separating my locations. I have been sampling in several areas, and each area has 4 sites. So I have grouped them together. But it seems like my code has some mistakes, lumping all areas together instead of grouping them. I hope this makes it more clear ☺️

desert oar Dec 12, 2023, 3:52 AM

#

quaint loom So, basically a “random forest” is a technique in machine learning where a group...

i know what a random forest is, but i don't understand what you're intending to do with it

quaint loom Dec 12, 2023, 3:54 AM

#

desert oar i know what a random forest is, but i don't understand what you're intending to ...

Oooh. My goal is to identify which parameters are most significant in influencing the dependent variable.

desert oar Dec 12, 2023, 3:55 AM

#

i see... i wouldn't really say that's "kinda like a statistical test"

#

methodological issues aside, you should just be able to fit the model in scikit-learn and then extract feature importance scores

#

i don't know how this relates to grouping of locations. i think you might need to explain your actual goal more

quaint loom Dec 12, 2023, 4:04 AM

#

The method itself doesn’t have anything to do with the grouping, but rather with the different locations where samples have been collected. Let's say that at Restored area 1, parameter X1 is significantly influenced by the dependent variable, but at Unrestored area 2, X2 is significantly influenced by the dependent variable. So I want the random forest method to run on each group (Restored area 1, Restored area 2, Unrestored area 1 and Unrestored area 2). Maybe that is where the problem is: the machine learning method isn’t run individually on each group but rather on the entire dataset.

desert oar Dec 12, 2023, 4:08 AM

#

quaint loom The Method itself doesn’t have anything with the grouping bur rather different l...

we usually reserve the term "dependent variable" for the outcome of some experiment/process/procedure. it sounds like you're using it to mean the opposite, as the input to a procedure?

quaint loom Dec 12, 2023, 4:15 AM

#

Maybe I am using the term different. I am using the dependent variable as : CH4 (methane) flux can be a dependent variable, while Total Nitrogen (TN), Total Phosphorus (TP), and chlorophyll-a (chl a) are independent variables. The Random Forest model would help in understanding how well these variables predict CH4 flux and which of them is most important for the prediction.

odd meteor Dec 12, 2023, 4:21 AM

#

long canopy Any communities focused on local LLMs?

LangChain has a discord community

trim saddle Dec 12, 2023, 5:15 AM

#

safe ermine what is the best way to learn ai programming in python from scratch?

Andrej Karpathy's YouTube videos

rapid charm Dec 12, 2023, 5:37 AM

#

Hi, I am very new to using Python through the command prompt and I am having trouble downloading pytorch. I keep getting the error: ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch
How can I resolve this error?

shut girder Dec 12, 2023, 5:38 AM

#

Hello, I'm currently very confused. Why do people say that exploratory data analysis is a step in data analysis? I thought exploratory data analysis is just an approach to data analysis and is the "full picture," meaning that once conclusions are drawn from EDA, those conclusions can be used for decisions. Am I getting the processes of data analyst wrong?

trim saddle Dec 12, 2023, 5:45 AM

#

rapid charm Hi, I am very new to using Python through the command prompt and I am having tro...

Check PyTorch's website for instructions. IIRC you have to define an extra package index for pip to look in

rapid charm Dec 12, 2023, 5:46 AM

#

Ok, will do. Thank you.

trim saddle Dec 12, 2023, 5:48 AM

#

lapis sequoia Is that like youngs theorem?

Its what the requirements of his code are, to not go into the while loop

snow fog Dec 12, 2023, 6:07 AM

#

is there a library which will extract the text from a pdf as it is? let me share an example pdf. in the pdf below the columns are space-separated, but after using an OCR library i am getting some columns combined and some columns separated by newlines. I can use a pdf extraction library, but again it discards the spaces. the first 3 columns and the last 5 are not a problem for me, but the middle 3 are. if there exists a library which will extract the text as it is, then i can extract each column on the basis of the number of chars it takes

trim saddle Dec 12, 2023, 6:13 AM

#

snow fog is there a library which will extract the text from pdf as it is , let me share ...

I know tabula-py for extracting tables from PDFs; most of the time you need some more fiddling with the table you get from it

tidal bough Dec 12, 2023, 6:13 AM

#

long canopy what graph modeling options do I have if I need to: 1. Name nodes, 2. Name edg...

I'm not sure what you mean by "navigate" - like, pathfind? networkx is what I usually use if I don't want to write my own graph algorithms, and it supports arbitrary metadata on nodes I believe (don't know if it supports it on edges).
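
To make the requirements concrete without committing to a library, here is a plain-dict sketch of a directed graph with named nodes, labelled edges, and breadth-first pathfinding as one reading of "navigate". The node and edge names are invented for illustration, and `find_path` is a hypothetical helper; a graph library would replace all of this.

```python
from collections import deque

# Directed graph as an adjacency dict: source -> {target: edge label}.
graph = {
    "load": {"clean": "feeds"},
    "clean": {"train": "feeds", "plot": "feeds"},
    "train": {"evaluate": "produces"},
    "plot": {},
    "evaluate": {},
}


def find_path(graph, start, goal):
    """Breadth-first search; returns a list of (source, label, target) hops, or None."""
    queue = deque([start])
    came_from = {start: None}   # also serves as the visited set
    while queue:
        node = queue.popleft()
        if node == goal:
            hops = []
            while came_from[node] is not None:
                prev = came_from[node]
                hops.append((prev, graph[prev][node], node))
                node = prev
            return hops[::-1]
        for neighbour in graph.get(node, {}):
            if neighbour not in came_from:
                came_from[neighbour] = node
                queue.append(neighbour)
    return None


print(find_path(graph, "load", "evaluate"))
```

Edge direction is encoded by which node owns the entry, so a path from `plot` back to `load` does not exist in this example.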

deft spire Dec 12, 2023, 7:12 AM

#

Can I ask for help for unity ml agents here? Technically backend is in python

past meteor Dec 12, 2023, 7:56 AM

#

quaint loom Maybe I am using the term different. I am using the dependent variable as : CH4 ...

In principle you can get the feature importance out of your model but if your variables are correlated you can't really say anything interesting.

Example:

Most methods have a more or less inbuilt regularisation method. If variable A is perfectly correlated with variable B, and both are in turn perfectly correlated with the dependent variable, you'd split on A first and see that B no longer gives any extra information for predicting the dependent variable.

To make it worse, you run your experiment again with a different seed, now it splits on B first.

Can you see what the problem is? The feature importance method you use will claim A is important and B has 0 importance and vice versa in the second run. This is patently untrue in reality, both of them are highly predictive.

This is what I meant with "you can't say anything interesting". At best you can say that your specific model instance holds these variables as important/unimportant but this type of claim has a very low internal validity let alone external validity.
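
The effect can be reproduced with a toy, stdlib-only sketch. It is not a random forest: it fits a single one-split "stump" on two identical features and, as an assumption standing in for the random feature ordering inside tree learners, breaks ties with a seeded shuffle. All names (`best_split_gain`, `stump_importance`) and the data are made up for illustration.

```python
import random


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split_gain(xs, ys):
    """Largest variance reduction from a single threshold split on one feature."""
    base = variance(ys)
    best = 0.0
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        weighted = (len(left) * variance(left) + len(right) * variance(right)) / len(ys)
        best = max(best, base - weighted)
    return best


def stump_importance(features, ys, seed):
    """Give all importance to the first feature that reaches the best gain."""
    names = list(features)
    random.Random(seed).shuffle(names)   # assumed stand-in for random feature order
    gains = {name: best_split_gain(features[name], ys) for name in names}
    winner = max(names, key=lambda name: gains[name])   # first maximum wins ties
    return {name: (1.0 if name == winner else 0.0) for name in sorted(features)}


a = [0, 0, 0, 1, 1, 1]
features = {"A": a, "B": list(a)}   # B is a perfect copy of A
target = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]

for seed in range(4):
    print(seed, stump_importance(features, target, seed))
```

Both features predict the target equally well, yet each run credits only one of them, and which one depends on the seed.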

pearl barn Dec 12, 2023, 8:03 AM

#

Does anyone know a good discord server for learning data analysis, finding projects to apply what I learn, and asking questions about problems I run into in my projects or code?

past meteor Dec 12, 2023, 8:09 AM

#

pearl barn Do anyone know good discord server for learning data analysis and finding projec...

This server

pearl barn Dec 12, 2023, 8:33 AM

#

I'm studying python for data analysis from freecodecamp. Is that a good source to start from?

wooden sail Dec 12, 2023, 9:02 AM

#

my turn to ask for help: what's the proper way of managing/changing the BLAS backend for numpy and similar packages in windows? ideally with conda

the context: a new AMD cpu for which MKL doesn't perform so well, and the old MKL flag trick was patched around 2021. because of that, i want to use openBLAS as a backend instead, but ideally also be able to switch to MKL when needed (necessary for some tests, since my code will ultimately run on an intel cluster)

past meteor Dec 12, 2023, 9:28 AM

#

pearl barn I'm studying python for data analysis from freecodecamp ia that a good source to...

For us to help you the best it's a good idea to state your background: how much stats you know, how much python you know etc.

maiden arch Dec 12, 2023, 10:52 AM

#

#

```
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd  # needed for pd.read_csv below

# Read CSV file
csv = pd.read_csv('/home/needjobcoder/devlopment/python/dataSciencePractice/practice/stockMarket/archive/ADANIPORTS.csv')

# Extract dates and volumes
dates = csv['Date']
volumes = csv['Volume']

# Create a bar plot
fig, ax = plt.subplots()
ax.bar(dates, volumes, width=1, edgecolor="white", linewidth=0.7)

# Set labels and title
ax.set(xlabel='Date', ylabel='Volume', title='Volume Over Time')

# Rotate x-axis labels for better readability
plt.xticks(rotation=45)

# Save the plot to a file
plt.savefig('output_plot.png')

# plt.show()
```

#

volumes
```
/home/needjobcoder/devlopment/python/dataSciencePractice/venv/bin/python /home/needjobcoder/devlopment/python/dataSciencePractice/practice/main.py
0       27294366
1        4581338
2        5124121
3        4609762
4        2977470
...
3317     9390549
3318    20573107
3319    11156977
3320    13851910
3321    12600934
```

#

dates
```
0       2007-11-27
1       2007-11-28
2       2007-11-29
3       2007-11-30
4       2007-12-03
...
3317    2021-04-26
3318    2021-04-27
3319    2021-04-28
3320    2021-04-29
3321    2021-04-30
```

boreal gale Dec 12, 2023, 11:05 AM

#

maiden arch dates ```0 2007-11-27 1 2007-11-28 2 2007-11-29 3 2007-1...

i think your first problem is that your dates are not parsed as proper datetime objects, so pandas/matplotlib has no choice but to plot every single date as a string, and you get a cluster of black where the labels overlap for lack of space. you need to read the csv with the parse_dates arg, see: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html#:~:text=%3DTrue%2C-,parse_dates,-%3DNone%2C or parse it manually with pd.to_datetime() (and overwrite the column, for example)

maiden arch Dec 12, 2023, 11:07 AM

#

boreal gale i think your first problem is that your dates here are still not parsed as prope...

what do i pass into parse_dates exactly ?

#data-science-and-ml
