#data-science-and-ml


severe inlet
#

im intending to try out a ML project with a commodity prices dataset. im thinking of doing 2 kinds of predictions: next day prediction and 1 week prediction.

may i ask for some advice on how should i proceed? can i start off with simple linear regression model, work towards other variants, maybe LSTM at the end?

#

im not sure how to "start" on the project
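One possible way to get started, sketched under assumptions: turn the price series into (recent window, future price) pairs for each horizon, split chronologically instead of shuffling, and record the error of a naive "last value" baseline. Linear regression and later an LSTM can then be judged against that number. The `prices` list and the helper names below are made up for illustration.

```python
from statistics import mean

def make_supervised(prices, n_lags, horizon):
    """Turn a price series into (lag-window, future-price) pairs."""
    rows = []
    for t in range(n_lags, len(prices) - horizon + 1):
        window = prices[t - n_lags:t]       # the n_lags most recent prices
        target = prices[t + horizon - 1]    # price `horizon` steps ahead
        rows.append((window, target))
    return rows

def naive_mae(rows):
    """MAE of the persistence baseline: predict the last observed price."""
    return mean(abs(window[-1] - target) for window, target in rows)

# Hypothetical daily closing prices (placeholder data, not a real dataset).
prices = [100 + 0.5 * i + (3 if i % 7 == 0 else 0) for i in range(60)]

for horizon in (1, 5):  # next day, and roughly one trading week
    rows = make_supervised(prices, n_lags=5, horizon=horizon)
    split = int(len(rows) * 0.8)            # chronological split, no shuffling
    train, test = rows[:split], rows[split:]
    print(f"horizon={horizon}: {len(train)} train rows, "
          f"naive test MAE={naive_mae(test):.2f}")
```

The same `rows` can be fed to a linear regression (window as features, target as label), so every later model is compared on identical data.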

river cape
#

Isn't Logistic Regression a linear classifier?

#

So do we need feature scaling while dealing with that model?

restive wave
restive wave
river cape
#

Because the co-efficients take care of that ?

lofty thorn
#

is there any effect on the mean when the data is positively or negatively skewed?

lofty thorn
#

nvm

tired lodge
#

how would i go about evaluating the position of a connect 4 board?

#

i have a bunch of test cases and their resulting evaluation but i don't know how to go forward from there

#

._.

#

this is their file contents

nimble stag
#

If it’s oo, you could make a class for every different level of connectedness (like two pieces, three pieces, etc.) then instantiate it whenever it actually occurs and continue checking for pieces to add from the new instantiations

nimble stag
#

Object oriented

tired lodge
nimble stag
#

I think ur trying to evaluate if there are chains of pieces that are the same colour, correct me if I’m wrong

tired lodge
#

yeah i'd say so

marble spindle
#

Anyone know any libraries that would fetch written text and convert it into PDF format?

tired lodge
#

im thinking of a more, manual? approach to this, where i compute each of the 6000 test cases, log their results and thats my tablebase to work off of 😭

tired lodge
#

i remember seeing something online

marble spindle
nimble stag
#

So when ur evaluating, for every piece check its neighbours, and if it has a matching colour neighbour instantiate a class called TwoPieces (for example). Then that object will check its neighbours and, finding a matching colour piece, would instantiate ThreePiece (and so on)

tired lodge
#

how do i get an eval score after that though?

marble spindle
#

Does anyone have a solution for this? I have a paper that was written by hand, and I'm working on something that needs to fetch that written text and check the spelling (e.g. whether a letter is an 'a' or an 'o'). The model should be trained so that it can fetch the text, correct the grammar, and convert the result to PDF format. Which library would be good for all of this? Any suggestions?

jaunty helm
#

I wanted to plot a correlation heatmap in polars, this is what I have:

```py
import polars as pl

df: pl.DataFrame
df = df.corr().select(pl.all().abs())
plt = df.plot.heatmap(height=600, rot=75, yticks=[(i, c) for i, c in enumerate(df.columns)])
display(plt)
```
the above snippet yields the image below.
the problem is that when I hover over a cell, it shows an `index`, where I'd like it to be the actual label (in the case of the image, it should be `GrLivArea`)
any way to modify the code to get the desired effect? (ideally, not turning it into a `pandas.DataFrame`)
tired lodge
#

since with two players, there can be two types of chains

nimble stag
#

Implementation details are up to you

#

I would suggest checking the board once and checking certain positions multiple times (if double or triple piece found)

nimble stag
#

And if nothing more is found you could return the size of the biggest object in that position (assuming oo)

tired lodge
# nimble stag And if nothing more is found you could return the size of the biggest object in ...

so i have this
```py
class OnePiece:  # The One Piece is real!!
    def __init__(self, square_indexes: Tuple[int, int], direction: str) -> None:
        self.y, self.x = square_indexes
        self.change_in_y, self.change_in_x = directions[direction]

        self.next_x = self.x + self.change_in_x
        self.next_y = self.y + self.change_in_y

        try:
            neighbour = array[self.y][self.x]
        except IndexError:
            return False

        if neighbour == array[self.next_x][self.next_y]:
            return TwoPiece()

        return False
```
#

it takes in square_indexes which is like [i][j] ➡️ (i, j)

#

and then a direction as a str from this dict
```py
directions = {
    'up': (-1, 0),
    'down': (1, 0),
    'left': (0, -1),
    'right': (0, 1)
}
```

#

it then checks if you can go in that direction and if you can't, just return False

#

if you can go in that direction, call TwoPiece() and return whatever happens in it

#

oh wait, __init__ by default returns None

#

using __new__ works though
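For the "how do i get an eval score" part, one common alternative to chaining piece objects is to score every window of four cells. The sketch below assumes a 6x7 board stored as a list of rows with `"."` for empty cells; the weights in `WEIGHTS` are arbitrary placeholders that would need tuning, for instance against the test cases mentioned above.

```python
ROWS, COLS = 6, 7
# Right, down, and the two diagonals; the opposite directions would only
# revisit the same windows.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]
# Assumed weights per number of own pieces in a window the opponent has not touched.
WEIGHTS = {2: 1, 3: 10, 4: 1000}

def windows(board):
    """Yield every run of four cells that lies fully on the board."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in DIRECTIONS:
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS:
                    yield [board[r + i * dr][c + i * dc] for i in range(4)]

def evaluate(board, player, opponent):
    """Positive favours `player`, negative favours `opponent`."""
    score = 0
    for cells in windows(board):
        mine, theirs = cells.count(player), cells.count(opponent)
        if mine and not theirs:
            score += WEIGHTS.get(mine, 0)
        elif theirs and not mine:
            score -= WEIGHTS.get(theirs, 0)
    return score

board = [["."] * COLS for _ in range(ROWS)]
board[5][0] = board[5][1] = board[5][2] = "X"   # three in a row on the bottom
board[4][0] = "O"
print(evaluate(board, "X", "O"))
```

Windows that contain pieces of both colours score nothing because neither side can complete them, which handles the "two types of chains" case without separate classes.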

river cape
#

A linear SVR will have its kernel=linear and a non-linear svm will have its kernel = rbf?

warm trellis
#

hey!
I'm trying to learn how to implement models from papers. Is anyone out there willing to mentor me?

tired otter
#

in my half-educated opinion, you should learn how to implement forward method for simple nn models in pytorch/tf and let autograd do its magic

gritty vessel
#

Yes

sick eagle
#

nothing

tired lodge
warm trellis
#

I'm trying to implement paper "DEEP NON-PARAMETRIC TIME SERIES FORECASTER".. This is so far what I've built though, I'm not sure how this model can do the predictions..

    def __init__(self, h: int, input_size: int,
                 num_hidden_layers: int = 4,
                 hidden_size: int = 24
                ):
        super().__init__()
        self.input_size = input_size
        # nn.ModuleList (rather than a plain list) registers the layers' parameters with the module
        self.linear_stack = nn.ModuleList(
            [nn.Linear(in_features=input_size, out_features=hidden_size)]
            + [nn.Linear(in_features=hidden_size, out_features=hidden_size) for _ in range(num_hidden_layers - 1)]
        )
        self.final_layer = nn.Linear(hidden_size, input_size)
        self.softmax = nn.Softmax(dim=-1)  # normalise over the last (context) dimension
    
    def forward(self, z):
        for linear_layer in self.linear_stack:
            # without a nonlinearity the stacked linear layers collapse into a single linear map
            z = nn.functional.relu(linear_layer(z))
        sampling_probabilities = self.softmax(self.final_layer(z))
        return sampling_probabilities
sick eagle
tired lodge
#

i have some table bases, idk if i should calibrate an algorithm using those

sick eagle
lapis sequoia
#

has anyone done LLM evaluation before? any resources to read? I need to do evaluation for an assistant and am not sure how! anything that would help me build an evaluation process!

#

any idea would help

agile cobalt
#

@lapis sequoia which sort of assistant? there are a few common metrics and monitoring tools you can use depending on the task, but they're not perfect nor make sense for all cases

cinder schooner
#

Hello, i'm trying to implement RepeatedAugmentation for a computer vision project i'm working on and every code sample or example i find on the internet is pairing it with distributed learning. So i thought maybe i misunderstood the concept and maybe i need distributed learning to use RepeatedAugmentation. So my question is: can i use RepeatedAugmentation without distribution and if yes how?

lapis sequoia
agile cobalt
#

you can try creating a compilation of a few prompt - dataset - expected final result combinations and just testing if it works, but overall I would strongly recommend against asking models about things you do not understand yourself if you have no intent or means of verifying if its output is correct or not, and even more so against products that explicitly encourage that practice

#

Even if you got 95% accuracy, the damage that those 5% wrong results could cause if your end user is not perfectly aware of the model's limitations is tremendous, and models are not anywhere near reliable enough to expect 100% accuracy yet in an uncontrolled environment

dull radish
#

Hello, so I wanna make an AI based tool which can convert let's say VB6 to VB.net or C#, or in general can convert these older languages into newer ones and add documentation etc. I'm kinda new to AI, so if someone can tell me how exactly I can go about this that'd be great, thank you

warm trellis
#

I am trying to implement the DeepNPTS model, but I'm confused a little bit, especially on how the model will learn part.. Since the model outputs probabilities, but the observation is a real value.. They have described to use Loss: Ranked Probability Score for the loss function, but I'm a little bit lost on this part, how model will learn from probability distribution ?
Here (linked message in #data-science-and-ml) is my draft for the model, I'm not sure if it's correct though.
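One way to see how a probability output can learn from a real-valued observation: the predicted distribution puts probability `p_i` on each past value `z_i`, and a proper scoring rule compares that whole distribution with the observed value. The sketch below uses the CRPS of a discrete distribution in its "energy" form, E|X − y| − ½·E|X − X′|. This is a stand-in chosen for illustration and may not match the exact ranked probability score formulation in the paper. The values and probabilities are made up.

```python
def crps_discrete(values, probs, y):
    """CRPS of a distribution with P(X = values[i]) = probs[i], scored against y."""
    expected_abs_error = sum(p * abs(z - y) for z, p in zip(values, probs))
    expected_spread = sum(
        p_i * p_j * abs(z_i - z_j)
        for z_i, p_i in zip(values, probs)
        for z_j, p_j in zip(values, probs)
    )
    return expected_abs_error - 0.5 * expected_spread

past_values = [10.0, 12.0, 15.0]   # hypothetical context window
observed = 12.0

uniform = [1 / 3, 1 / 3, 1 / 3]
peaked = [0.05, 0.9, 0.05]         # most mass on the value that matches

print(crps_discrete(past_values, uniform, observed))
print(crps_discrete(past_values, peaked, observed))   # lower is better
```

Written with tensors instead of lists, the same expression is differentiable in the probabilities, so autograd can shift mass toward past values that lie close to what was actually observed.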

lapis sequoia
thorny zealot
#

the squeeze parameter was deleted from pandas, what should I do ?

spring field
#

wait, parameter? parameter for what?

thorny zealot
#

read_csv
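The `squeeze` argument of `read_csv` was removed in pandas 2.0; the usual replacement is to read the file normally and call `.squeeze("columns")` on the result. A minimal sketch, using an in-memory CSV as placeholder data:

```python
import io
import pandas as pd

csv_text = "price\n1.5\n2.5\n3.5\n"   # placeholder single-column CSV

# Older code: pd.read_csv(path, squeeze=True)
# Replacement: read normally, then squeeze the one-column frame into a Series.
series = pd.read_csv(io.StringIO(csv_text)).squeeze("columns")

print(type(series).__name__)
```

If the file has more than one column, `.squeeze("columns")` leaves the DataFrame unchanged, which mirrors how the old argument behaved.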

spring field
spring field
river cape
#

Do we need feature scaling for polynomial regression?

river cape
#

Main question when do we use feature scaling?
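As a rough rule of thumb (not a hard law): scaling tends to matter when a model is trained by gradient descent, uses regularization, or relies on distances, and matters much less for plain closed-form least squares. Polynomial features are a case where ranges explode quickly. The sketch below standardizes made-up values to show the effect; `standardize` is a hypothetical helper, not a library function.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]

x = [1.0, 10.0, 50.0, 100.0]          # made-up feature values
features = {"x": x, "x^2": [v ** 2 for v in x], "x^3": [v ** 3 for v in x]}

for name, column in features.items():
    scaled = standardize(column)
    print(f"{name}: raw range {max(column) - min(column):.0f}, "
          f"scaled range {max(scaled) - min(scaled):.2f}")
```

After standardizing, the three columns live on comparable scales, so a regularization penalty or a learning rate treats their coefficients more evenly.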

craggy agate
#

My model gets around 65% val_accuracy. What can I do to increase it? I feel it's not reliable, and when I give it actual images it doesn't even reach the 65% mark of the validation accuracy. It's an expression identifier model btw. I have 5 classes.

#

Here is the code:
```py
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    shear_range=0.3,
    zoom_range=0.3,
    width_shift_range=0.4,
    height_shift_range=0.4,
    horizontal_flip=True,
    brightness_range=[0.8, 1.2],
    fill_mode='nearest')

training_set = train_datagen.flow_from_directory(
    # raw string, so the backslashes in the Windows path are not read as escapes
    r'C:\Users\yatha\OneDrive\Desktop\CNN Expression identifier\Train',
    target_size=(128, 128),
    batch_size=48,
    classes=['Anger', 'Fear', 'Happy', 'Sad', 'Surprise'],
    class_mode='categorical',
    shuffle=True,
)

test_datagen = ImageDataGenerator(rescale=1./255)

test_set = test_datagen.flow_from_directory(
    r'C:\Users\yatha\OneDrive\Desktop\CNN Expression identifier\Test',
    target_size=(128, 128),
    batch_size=48,
    classes=['Anger', 'Fear', 'Happy', 'Sad', 'Surprise'],
    class_mode='categorical',
    shuffle=True,
)
cnn = tf.keras.models.Sequential()
cnn.add(tf.keras.layers.Conv2D(
    filters=16,
    kernel_size=3,
    activation='relu',
    input_shape=[128, 128, 3]
))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
cnn.add(tf.keras.layers.Conv2D(
    filters=16,
    kernel_size=3,
    activation='relu',
    input_shape=[128, 128, 3]
))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
cnn.add(tf.keras.layers.Conv2D(
    filters=16,
    kernel_size=3,
    activation='relu'
))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
cnn.add(tf.keras.layers.Flatten())
cnn.add(tf.keras.layers.Dense(units=512, activation='relu'))
cnn.add(tf.keras.layers.Dense(units=512, activation='relu'))
cnn.add(tf.keras.layers.Dense(units=512, activation='relu'))
```

spring field
# craggy agate Here is the code: ` train_datagen = ImageDataGenerator( rescale=1./255, ...

if this is supposed to be a densenet then I'm slightly confused why you have all the transition layers and then all the denseblocks after all of those? you'd rather have the initial convolution, then repeat this like 3 times: (a denseblock, then a transition layer)
then go through the linear layer and then use softmax (which ig is built-in to the training set?)

vale parcel
#

I'm getting an error. I pip installed keras-rl2 but whatever code I use it on I get this error:

Traceback (most recent call last):
  File "/Users/srikanthvattikuti/Downloads/keras-rl-master/examples/dqn_atari.py", line 13, in <module>
    from rl.agents.dqn import DQNAgent
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/rl/agents/__init__.py", line 1, in <module>
    from .dqn import DQNAgent, NAFAgent, ContinuousDQNAgent
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/rl/agents/dqn.py", line 7, in <module>
    from rl.core import Agent
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/rl/core.py", line 7, in <module>
    from rl.callbacks import (
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/rl/callbacks.py", line 8, in <module>
    from tensorflow.keras import __version__ as KERAS_VERSION
ImportError: cannot import name '__version__' from 'tensorflow.keras' (/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/keras/_tf_keras/keras/__init__.py). Did you mean: 'cxx_version'?

Someone please help!! I'm using macOS Sonoma btw.

craggy agate
craggy agate
spring field
#

oop

#

also I think in the paper they were using an averagepool instead of a maxpool

#

but how do your layers look now?

craggy agate
# spring field but how do your layers look now?

```py
# Initial Convolution Layer
cnn.add(Conv2D(filters=16, kernel_size=3, activation='relu', input_shape=[128, 128, 3]))

# Dense Block 1
for _ in range(3):
    cnn.add(Conv2D(filters=16, kernel_size=3, activation='relu', padding='same'))
cnn.add(MaxPool2D(pool_size=2, strides=2))

# Transition Layer 1
cnn.add(Conv2D(filters=16, kernel_size=1, activation='relu'))

# Dense Block 2
for _ in range(3):
    cnn.add(Conv2D(filters=16, kernel_size=3, activation='relu', padding='same'))
cnn.add(MaxPool2D(pool_size=2, strides=2))

# Transition Layer 2
cnn.add(Conv2D(filters=16, kernel_size=1, activation='relu'))

# Dense Block 3
for _ in range(3):
    cnn.add(Conv2D(filters=16, kernel_size=3, activation='relu', padding='same'))
cnn.add(MaxPool2D(pool_size=2, strides=2))

# Flatten Layer
cnn.add(Flatten())

# Fully Connected Layers
cnn.add(Dense(units=512, activation='relu'))
cnn.add(Dense(units=512, activation='relu'))

# Output Layer
cnn.add(Dense(units=5, activation='softmax'))
```

spring field
#

wait, Dense is not a DenseBlock? sigh

#

it's just a linear layer?

spring field
fast knoll
#

Hey anyone help me for my interview of pandas

past kite
#

which is better for starting out in ML: Jupyter Notebook or Google Colab?

spring field
#

Google Colab uses jupyter notebooks

spring field
#

the question honestly doesn't really make sense to be fair

#

like, you can use Colab to run your code on a GPU if you don't have one yourself

#

but you can write the code wherever you find it more comfortable

past kite
#

k thanks

fast knoll
#

Hey can you tell where can I prepare for my interview of pandas

#

Anyone

#

Help me guys

spring field
#

may I suggest writing a couple of projects (small ones, or something bigger)? the point is to practice, so you get better at the thing you're practicing, in this case pandas

real whale
#

hello, what's a really easy and simple replace function for a column in a data frame? e.g. if missing values had been entered in many different ways (NAN, nan, null, 0, n/a) and I just wanted one value. Thanks

#

I get that I could find this info very easily myself, but if someone is the sort of person to copy and paste something they have open, then it'd make it easier and more memorable

serene scaffold
real whale
serene scaffold
real whale
#

is it something like that?

serene scaffold
#

yes, that should do it.

real whale
#

as step one

serene scaffold
#

I mean that should be the whole step. if there are any nan values that that list doesn't catch, you should add them to that list.

real whale
#

are there any intricacies you could give me a heads up about with regard to it?

serene scaffold
#

no? that just tells the pandas csv parser to treat those as nans when it reads the CSV. One and done.

real whale
#

oh ok, and if I wanted to replace all those null values with one type to make them easier to reference, should I still want to remove them?

serene scaffold
#

because you set na_values as that list, any time those substrings appear in the CSV, they will be replaced with one kind of nan value.
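The snippet under discussion did not survive in this log, so here is a minimal sketch of the `na_values` approach being described, with made-up data. pandas already treats spellings such as `nan`, `null` and `n/a` as missing by default; the list only needs the extra ones. Adding `"0"` to it would also work, but only makes sense if zero is never a real value in that column.

```python
import io
import pandas as pd

csv_text = "id,score\n1,NAN\n2,n/a\n3,null\n4,7\n5,missing\n"   # placeholder data

# Extra spellings to treat as missing, on top of pandas' built-in defaults.
extra_missing = ["NAN", "missing"]

df = pd.read_csv(io.StringIO(csv_text), na_values=extra_missing)
print(df["score"].isna().sum())
print(df["score"].dtype)
```

Because the markers become real NaN at read time, the column is parsed as numeric and the usual `isna()`, `fillna()` and `dropna()` calls work afterwards.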

real whale
#

ok fantastic that rounds everything up

oblique mirage
#

Hi guys, could someone tell me the 5 best libraries for creating graphics in python? I'm creating a machine learning model

serene scaffold
oblique mirage
serene scaffold
little arrow
#

what methods of feature reduction are there, that keep the feature names? ive tried pca but unfortunately i cant use it to find which features impact my results the most

real whale
#

Ok I've run into an error that I think stems from me not understanding the function "duh"

lapis sequoia
#

hi;total ML n00b, very interesting stuff tho. i was just wondering to myself. i wonder how much the computers cost to run claude opus. i was looking at hiring costs the other day and i was staggered. they are buying nvidia a100s and hiring them out for 35k a year each! i understand opus is distributed obviously but i was wondering what a really good model, what it takes to run it or what hardware i should get to start being able to do something decent. ive seen rtx20xx listed here and i quite like this look of this, anyone used it ? https://docs.vllm.ai/en/latest/getting_started/installation.html

i woud like to start learning about machine learning, so imma sit here and try and take in some of the unintelligible lingo you guys talk 🙂 ive had pytorch out and got some models going, been on huggingface and messed about with transformers. i have no clue tho, i got some BERT thing going, i tried ollama and then i tried deepseekcoder and my computer shat the bed. the github one, not the ollama one. i dont know why but ollama has the same thing in it and its fast as hell. anyway i have a terrible GPU 1080(8gig) and im just trying to get started. hi everyone.

lapis sequoia
# dull radish Hello, so I wanna make an AI based tool which can convert let's say VB6 to VB.ne...

i am new too, but i can tell you i just updated a program from python 2 to 3 with claude and it did the whole thing in one go perfectly. was pretty impressed. i literally didnt touch the code to update it 🙂 just pasting. ive been poking these things for a while, including looking at some of the stuff on github like sweep ai and was inspired by autodev to plan a new multiple agent idea. thats one of my projects but ive got another main one for ML, thats just comparatively easy trading stuff. what i can say with confidence is that GPT is absolutely awful at code and claude is insanely good. GPT cant remember anything really and claude remembers everything - the dynamic between them changed overnight almost. ai wars going on...

dull radish
#

I'll definitely give claude a try tho ty for the info

lapis sequoia
# dull radish Ayay I planned on using a pretrained model and fine tune it with the particular ...

yeah its not going to you know do everything but it was a pretty good stab for first attempt. not bad. havent tried to do porting yet, i was going to try python to rust tho 🙂 claude sonnet is ok but opus is the daddy, instead of reading the manuals for stuff i upload the whole program into it and just ask it what i need to know rather than going through every bit. incredibly helpful. i just made a CLI and after i gave it one function it wrote all of the rest of them almost right first time, had to correct few things but yeah it floored me. ive thrown the same things at it and GPT a lot of times and the differences are really interesting, ive been doing gemini as well at the same time. gemini doesnt even know what language you are writing code in 😄 give GPT one or two decent length pastes and its forgotten all code you uploaded. so that requires more focussed stuff. it does take a lot more files easily than mr claude tho 🙂 good luck with it

lapis sequoia
marble spindle
#

Can anyone recommend a site similar to Hugging Face, where I can find pretrained models for testing?

serene scaffold
marble spindle
serene scaffold
shut girder
#

What resources should I use when studying calculus for machine learning? I have not taken a high school calculus class yet, but I want to work my way to being able to understand and apply linear regression as my first project

past meteor
# shut girder What resources should I use when studying calculus for machine learning? I have ...

Sidenote: To use linear regression and interpret the results you don't need to know calculus or linear algebra. Many research focused social science / medicine programs teach all of this but not calculus / lin alg. Even stronger, many of the researchers themselves don't know these but do know how to correctly apply a regression. I'm mentioning this because it shouldn't have to pause your project's progress unless you're really interested in the math.

To answer your question, considering you've not covered them yet in high school I just recommend you pick up a standard textbook and go through that then.

iron basalt
lapis sequoia
#

Hello, I am creating a 2 layer nn that is learning xor, and it's not learning. I don't really know what to do, but I think something is wrong with my teaching process

#
    def teach(self, X, Y, iters, learning_rate):
        for _ in range(iters):
            index = random.randint(0, len(X)-1)
            test_sample = X[index]
            test_target = Y[index]
            Z1, Y1, Z2, Y2 = self.forward(test_sample)

            Y2_error = test_target - Y2
            Y2_delta = np.dot(Y1, Y2_error[0]) * self.sigmoid_derivative(Y2)

            Y1_error = np.dot(Y2_delta, self.W2)
            Y1_delta = np.dot(np.reshape(test_sample, (2, 1)), Y1_error[0]) * \
                self.sigmoid_derivative(Y1)

            self.W2 += Y2_error * learning_rate
            self.W1 += Y1_delta * learning_rate
            self.W2b += Y2_error * learning_rate
            self.W1b += Y1_error * learning_rate

    def predict(self, X):
        Z1, Y1, Z2, Y2 = self.forward(X)
        print(Y2)
#
n.predict([1, 1])
n.predict([0, 0])
n.predict([1, 0])
n.predict([0, 1])
#
[0.49372465]
[0.4838573]
[0.48444335]
[0.49253533]
#

I think it could be related to computing the error on the 1st layer

#

Thanks for any help
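A few things in the posted `teach` stand out: the weights are updated with the raw error (`Y2_error`) instead of the delta multiplied by the layer's input, and the biases are updated with errors rather than deltas. Since `forward` and `sigmoid_derivative` are not shown, the following is only a guess at a working version, written as a standalone full-batch sketch with NumPy. The hidden size, seed, learning rate and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 2 inputs -> 4 hidden units -> 1 output (hidden size is an arbitrary choice).
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

def forward(X):
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    return A1, A2

def loss():
    return float(np.mean((forward(X)[1] - Y) ** 2))

initial_loss = loss()
learning_rate = 0.5
for _ in range(10000):
    A1, A2 = forward(X)
    # Output delta: error times the sigmoid slope at the output.
    delta2 = (A2 - Y) * A2 * (1 - A2)
    # Hidden delta: output delta pushed back through W2, times the hidden slope.
    delta1 = (delta2 @ W2.T) * A1 * (1 - A1)
    # Weight gradient = (layer input)^T @ (layer delta); bias gradient = summed delta.
    W2 -= learning_rate * (A1.T @ delta2)
    b2 -= learning_rate * delta2.sum(axis=0, keepdims=True)
    W1 -= learning_rate * (X.T @ delta1)
    b1 -= learning_rate * delta1.sum(axis=0, keepdims=True)

final_loss = loss()
print(initial_loss, final_loss)
print(forward(X)[1].round(2).ravel())
```

XOR can still stall with an unlucky initialisation, so if the outputs hover around 0.5, trying another seed or a few more hidden units is a reasonable first step.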

tired otter
#

In a simple GPT model (Karpathy's nanoGPT for ref), do i understand correctly, the only reason to aggregate every token's (past) neighbors is to increase number of subsamples + teach model to work with data of shorter lengths? So, in theory, we could have only aggregated last token and made a prediction based on that?

agile jackal
#

has anybody tried llama3-8B-instruct-KS?
is that even the fastest version

I'm trying to use it with gpt4all
but the bot is way off topic

toxic mortar
#

If I have a bunch of these sector categories which I want to feed my neural network with, is it better to categorize them in [0-20] categories, each category with an ID , or I should make binary input fields of every one of the sectors? For example Biotech: Biomedical/Gene : 0|1

tired otter
#

usually for classification data is onehot vector encoded

toxic mortar
#

Ahh, my bad, you meant a vector, not the categorical encoding

#

Sorry I got it

tired otter
#

linear algebra does not care how many dimensions you have. i see it as separating/clustering data in higher dimension. but im not a pro at this

tired otter
#

never used categorical encoding. but i think if its like [1,2,3,4,..], its possible say that some entry is 3.5 = like 3 but also like 4, but you wont find entry thats like 3 and 7 xD

toxic mortar
toxic mortar
tired otter
#

you had a list of categories, ordered by input data order; you represent it in encoded format, where each entry is a row vector. it is a matrix of shape (N_data, N_categories)
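The (N_data, N_categories) matrix described above can be sketched in a few lines of plain Python. The sector names are examples and `one_hot` is a hypothetical helper; in practice a library encoder would do the same job.

```python
def one_hot(labels):
    """Return (categories, matrix) where matrix has one row per label."""
    categories = sorted(set(labels))
    index = {cat: i for i, cat in enumerate(categories)}
    matrix = [[1 if index[label] == j else 0 for j in range(len(categories))]
              for label in labels]
    return categories, matrix

sectors = ["Biotech", "Software", "Energy", "Software"]   # example sector labels
categories, matrix = one_hot(sectors)
print(categories)
for row in matrix:
    print(row)
```

Each row has exactly one 1, so no ordering between sectors is implied, unlike a single ID column in the range 0 to 20.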

boreal gale
#

one hot encoding is definitely one way of making use of this field of categories.
just beware of "curse of dimensionality".
you don't seem to have a lot of data to work with, so i would - on top of just a normal one hot encode - also investigate if there are natural grouping of categories, e.g. all software instead of just one specific software category, to slightly reduce the number of columns you are adding to your dataset.

agile jackal
calm umbra
#

hello, does anyone know how to get or print the C3 node?

bold timber
#

Below are two different techniques for implementing code when running a model using PyTorch in training mode:

'CODE 1'
torch.manual_seed(42)

# Set the number of epochs (how many times the model will pass over the training data)
epochs = 100

# Create empty loss lists to track values
train_loss_values = []
test_loss_values = []
epoch_count = []

# 0.Loop through the data
for epoch in range(epochs):
    
    #TRAINING MODE
    # Put model in training mode (this is the default state of a model)
    model_0.train()

    # 1. Forward pass on train data using the forward() method inside 
    y_pred = model_0(X_train)
    # print(y_pred)

    # 2. Calculate the loss (how different are our models predictions to the ground truth)
    loss = loss_fn(y_pred, y_train)

    # 3. Zero grad of the optimizer
    optimizer.zero_grad()

    # 4. Loss backwards
    loss.backward()

    # 5. Progress the optimizer
    optimizer.step()
            

'CODE 2'
epochs = 100
train_cost = []


for i in range (epochs):
      
    #TRAINING MODE
    model.train()
    cost = 0
    for feature, target in trainloader:
        output = model (feature) #feedforward
        loss = criterion(output, target) # calculate the cost
        loss.backward() #backpropagation
        
        optimizer.step() #update weight
        optimizer.zero_grad()
        
        cost += (loss.item() * feature.shape[0]) #total loss
        
    train_cost.append(cost / len(train_set))
    
    print(f'\rEpoch: {i+1:4} / {epochs:4} | train_cost: {train_cost[-1]:.4f}', end = ' ')

As we can see, there's a difference between Code 1 and Code 2 in the training mode:

Code 1: The sequence during training starts with the feed forward pass, then calculating the cost, zeroing the gradients, backpropagation, and updating the weights.
Code 2: The sequence during training starts with the feed forward pass, then calculating the cost, backpropagation, updating the weights, and zeroing the gradients.

#

Based on both codes above (Code 1 & 2), which one is correct in representing the training mode phase using PyTorch?

calm umbra
bold timber
calm umbra
#

in most standard training loops, you'll want to start with a clean slate and not accumulate gradients across multiple backward passes. Therefore, you call optimizer.zero_grad() before backward() to zero out the gradients from the previous training step, ensuring that you're computing the gradients only for the current training step.

tidal bough
calm umbra
#

en, yes, you are right

tidal bough
#

i guess technically it may cause problems on the very first iteration if the model was used before the loop.

calm umbra
#

it doesn't matter

tidal bough
#

If I had to choose 1 or 2 I'd say 1 is more correct (it's nicer to clean the gradients right before setting them to new ones), but I'm pretty sure the second one in fact works too...

calm umbra
#

yes, just make sure you make a step before the gradient become zero

bold timber
#

Does that mean both of them actually can be used? @tidal bough @calm umbra

bold timber
toxic mortar
#

ValueError: pattern contains no capture groups

#

But if the pattern contains no capture group, doesnt that mean that it will return nan?

calm umbra
# bold timber can you give me more explanation about this?

# Variant A: zero the gradients at the start of each step
for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        optimizer.zero_grad()               # Zero out the gradients
        outputs = model(inputs)             # Forward pass
        loss = criterion(outputs, targets)  # Compute loss
        loss.backward()                     # Compute gradients
        optimizer.step()                    # Update weights

# Variant B: zero the gradients at the end of each step
for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        outputs = model(inputs)             # Forward pass
        loss = criterion(outputs, targets)  # Compute loss
        loss.backward()                     # Compute gradients
        optimizer.step()                    # Update weights
        optimizer.zero_grad()               # Zero out the gradients

#

both types can be used, it doesn't matter
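To make the point concrete without needing PyTorch installed, here is a toy stand-in: a single weight whose gradient accumulates across `backward` calls, which is the behaviour that makes `zero_grad` necessary. `ToyParam` and `train` are invented for this sketch and are not PyTorch APIs.

```python
class ToyParam:
    """Single weight whose gradient accumulates, as PyTorch's .grad does."""
    def __init__(self, value, stale_grad=0.0):
        self.value = value
        self.grad = stale_grad

    def backward(self, x, y):
        # d/dw of (w*x - y)^2, ADDED to whatever is already stored in grad.
        self.grad += 2 * (self.value * x - y) * x

    def step(self, lr=0.1):
        self.value -= lr * self.grad

    def zero_grad(self):
        self.grad = 0.0

def train(zero_first, stale_grad=0.0):
    w = ToyParam(0.0, stale_grad)
    for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
        if zero_first:
            w.zero_grad()
        w.backward(x, y)
        w.step()
        if not zero_first:
            w.zero_grad()
    return w.value

# Clean start: compare the two orderings.
print(train(zero_first=True) == train(zero_first=False))
# Gradient left over from before the loop: compare again.
print(train(zero_first=True, stale_grad=5.0) == train(zero_first=False, stale_grad=5.0))
```

In this toy the two orderings agree when the gradient starts at zero and differ only when something was left in it before the loop, which matches the caveat raised earlier about the very first iteration.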

bold timber
lapis sequoia
long canopy
#

anyone implement streaming inference? real time inference with data sent over a network. if so any suggestions for libraries or examples of implementations?

agile cobalt
#

"real time" is just in a (potentially very small) fixed interval

you can probably find a bunch of examples of classifying things on a camera feed, but as far as libraries go, it should be just the same server you would use for real time non-ML applications + the same ML libraries you would use for normal inference

long canopy
#

hm so kafka + pytorch?

agile cobalt
agile cobalt
#

just be careful not to overcomplicate things

polar zinc
#

Does anyone know an easy way to highlight data anomalies in a matplotlib graph?

marble spindle
#

Does anyone have a model, library, or code for converting handwritten text to text/PDF? I would be so grateful for any assistance.

agile cobalt
#

if you're dealing with a relatively small volume of images, GPT4-Vision could also be an option worth considering, but it is relatively expensive

light osprey
agile cobalt
#

I'm pretty sure that tesseract is free?

#

yeah it's licensed under Apache 2.0

light osprey
#

Oh yeah Tesseract OCR right? Its free but specifically for handwritten text I ve read its not as good as it is for the printed text

#

I meant Transkribus and Google Vision API

agile cobalt
#

well yeah handwritten text is quite a fair bit harder to classify than printed text to say the least

you can try training/fine-tuning tesseract on your data, especially if it follows a certain format or style, but if you are trying to recognize any and all random people's handwriting, good luck

#

it gets even worse with other languages but I am assuming English formal-ish documents for both of you?

marble spindle
light osprey
#

I truly need some luck😂 because I need it for multiple people handwritings. At first I thought ill use Tesseract for that but then read that TensorFlow has some good results. Im just in the beginning of my research cuz i cant run this goddamn code lmao. Yeah and I need it for other languages 😂😂 Im trying to practice at least for English language tho

agile cobalt
#

it may be worth considering forcing these people to type in digital documents instead of going out of the way to recognize their handwriting

(half joking. only half.)

marble spindle
# agile cobalt well yeah handwritten text is quite a fair bit harder to classify than printed t...

Extracting text from printed documents is relatively straightforward due to standardized fonts and formatting. However, handwritten text presents a much greater challenge due to the variability in individual handwriting styles. Training a model to accurately recognize and extract handwritten text would indeed be a substantial task, requiring a large and diverse dataset for training and sophisticated algorithms for recognition. Collaboration and innovation in this area are crucial for advancing the capabilities of such models and making them more reliable and accurate in real-world applications.

light osprey
#

They are my clients so they wont lift a finger to make our job easier…

light osprey
agile cobalt
#

just make sure they recognise there is no way it will get 100% accuracy otherwise things will end up pretty poorly for both you and your clients

marble spindle
#

Is anyone out there interested in collaborating to work on handwritten text recognition? By pooling our efforts together, we could develop a single application that addresses this challenge, potentially leading to significant success in this field.

marble spindle
agile cobalt
#

realistically I'd expect 90% at best for users not in the training data, and that's assuming their handwriting is readable in first place

light osprey
marble spindle
drowsy sleet
#

Hey everyone, I'm currently registered in a Exploratory Data Analysis course in our university and they want us to participate in a Kaggle competition which is based off of EDA (Exploratory Data Analysis) followed with prediction model on the dataset. Can someone provide me good resources which can help me to learn for the same. I don't mind if the resource is a website / video. I'm fine with anything I just need to know what all I must learn to do good in the competition while also learning good Data Analysis

lofty thorn
#

hi

runic parcel
#

what is the meaning of removing non linearity? like when we add an activation function in a convolutional (CNN) model..

agile cobalt
runic parcel
agile cobalt
#

have you seen the curve for the RELU function?

runic parcel
#

yes

#

u mean the graph of relu function?

agile cobalt
#

yep, literally this

runic parcel
#

yes yes i have seen it

#

no curve

agile cobalt
#

it literally means "not a straight line"

runic parcel
#

yes so?

agile cobalt
#

not being linear lets it model more complex curves

runic parcel
#

hm

wooden sail
#

nonlinear here does not refer to "not a straight line"

#

it refers to violating the property T(cx + y) = cT(x) + T(y) for two inputs x and y and a scalar c

runic parcel
wooden sail
#

i need more context

runic parcel
#

in a convolutional layer we apply a feature detector to an image

#

and then we use rectifier function on it to remove the non linearity in the image

#

so it makes it easier to read for the neural network

wooden sail
#

i'd need to see where you're getting this from because none of that makes sense to me as you've written it

#

that all kinda sounds wrong without all the context

#

you don't do feature detection inside a convolutional layer and the rectifier function introduces nonlinearity

#

if someone has written this explicitly anywhere, they mean it in some special sense that you'd have to share with us to understand it

serene scaffold
#

"nonlinearity" is an interesting case of a word that's very to-the-point, but somehow makes it sound like it's more complicated than it is.

runic parcel
wooden sail
#

then they must have given the specific definitions that explain what they mean

runic parcel
wooden sail
#

ok, so you mean they apply a convolution/filter. that's a linear operation

runic parcel
#

yes

#

and after that we apply an activation function, the rectifier, to the convolution layer

wooden sail
#

mhm, and that introduces nonlinearity

runic parcel
#

yes before

runic parcel
#

to remove the non linearity

wooden sail
#

you just said the convolution happens first

#

which one is it?

runic parcel
#

first the convolution happens

#

so add the filter

#

and there is non linearity, so we use the rectifier activation function

#

after filter

#

after rectifier functions

runic parcel
wooden sail
#

right. first you convolve (linear), then you rectify (nonlinear)

runic parcel
#

yes>?

wooden sail
#

which is the opposite of what you had written before

runic parcel
#

How

runic parcel
runic parcel
#

The image has nonlinearity and the rectifier function removes it

wooden sail
#

you understood nothing of what we just discussed 😛

spring field
#

does linear mean that the 1st derivative is a constant and the same at all points??

#

therefore nonlinear would be literally anything else

#

and for relu it's not linear because it includes a condition which changes the derivative when x < 0

runic parcel
#

fuck i saw it all wrong,

#

its used to increase the nonlinearity

wooden sail
#

there we go

spring field
#

is it to do with linear transformations?

runic parcel
#

ahhh understood

wooden sail
#

a transformation is linear if it satisfies the condition i mentioned above

#

T(ax + by) = aT(x) + bT(y) for scalars a and b, and inputs x and y

#

as an example, integration and differentiation with respect to one variable are both linear transformations

spring field
#

is T a matrix?

wooden sail
#

not in general, no

wooden sail
spring field
#

so, if we take integration and differentiation as functions, the T is a function
so, transforms the input and returns the transformation?

wooden sail
#

however, matrices and matrix multiplication are defined precisely so that they represent a linear transformation in a particular input basis and output basis

wooden sail
#

if you integrate a function you get another function

#

this is a linear transformation

spring field
#

I was more thinking of integration itself being a function that takes in a function, but I think I gotcha (on some level 😁 )

wooden sail
#

that's what i mean too

strong nymph
#

I want a code for interactive Box plot for outliers, If someone knows

#

and how do I remove them

wooden sail
#

.latex let $T(x)$ be defined as
\[
T(x) = \int x(u) \, du.
\]
$T$ has the property that
\[
T(ax + by) = \int (ax(u) + by(u)) \, du = a \int x(u) \, du + b \int y(u) \, du = aT(x) + bT(y)
\]

strange elbowBOT
wooden sail
#

@spring field

#

which means integration is linear

spring field
#

and x and y can be any function (that can also supposedly be integrated)?

wooden sail
#

right

#

(more formally this is done with definite integrals, this is very loose but gets the point across)

long canopy
#

why is echo "$(git rev-parse --show-top-level)" returning --show-top-level

spring field
#

so
max(ax + by, 0) != a * max(x, 0) + b * max(y, 0)
and then I suppose in this case it's enough for one case where this is True, like say a = 1, b = 2, x = 5, y = -2 where max(1 * 5 + 2 * -2, 0) = 1, but 1 * max(5, 0) + 2 * (max(-2, 0)) = 5 (idk how else this can be proved, analytically?)
for it to fail the linearity condition
pretty cool

wooden sail
#

the definition of linearity has to be evaluated for all a, b, x, y

spring field
wooden sail
#

that means that if there exists a single counter example, the function is not linear
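
A quick numeric sanity check of the above, as a minimal sketch with NumPy. The kernel and the random test vectors are made up for illustration; the ReLU numbers are the counterexample from a few messages up. Passing the check on a few inputs does not prove linearity, but a single failure does disprove it.

```python
import numpy as np

def relu(v):
    # rectifier: elementwise max(v, 0)
    return np.maximum(v, 0)

def conv(v, kernel=np.array([1.0, -2.0, 1.0])):
    # 1-D convolution with an arbitrary, illustrative kernel
    return np.convolve(v, kernel, mode="valid")

a, b = 1.0, 2.0

# ReLU on the scalar counterexample a=1, b=2, x=5, y=-2
x, y = 5.0, -2.0
relu_lhs = relu(a * x + b * y)        # max(1, 0)
relu_rhs = a * relu(x) + b * relu(y)  # 1*5 + 2*0
print("relu:", relu_lhs, "vs", relu_rhs)

# Convolution on random vectors: both sides agree
rng = np.random.default_rng(0)
xs, ys = rng.normal(size=8), rng.normal(size=8)
conv_lhs = conv(a * xs + b * ys)
conv_rhs = a * conv(xs) + b * conv(ys)
print("conv sides agree:", np.allclose(conv_lhs, conv_rhs))
```

So "convolve, then rectify" is a linear step followed by a nonlinear one.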

runic parcel
#

how much does it reduce when you use pooling to a convolution layer, so like from 5x5 to 3x3
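
For reference, with no padding the output size along each axis is floor((n - k) / stride) + 1, where n is the input size and k the pooling window. A small sketch (the helper name `pooled_size` is made up; the stride defaulting to the window size mirrors the usual default in e.g. Keras pooling layers):

```python
def pooled_size(n, k, stride=None):
    # output length along one axis for "valid" (no padding) pooling
    stride = k if stride is None else stride
    return (n - k) // stride + 1

print(pooled_size(5, 3, stride=1))  # 5x5 input, 3x3 window, stride 1 -> 3
print(pooled_size(5, 2))            # 5x5 input, 2x2 window, stride 2 -> 2
print(pooled_size(64, 2))           # 64x64 input, 2x2 window, stride 2 -> 32
```

So going from 5x5 to 3x3 corresponds to a 3x3 window at stride 1; the more common 2x2 window at stride 2 roughly halves each side.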

spring field
#

makes sense, I just wanted to maybe prove it more neatly than just finding a single counter example 😁

wooden sail
#

showing a single counterexample exists is proper proof

long canopy
wooden sail
spring field
wooden sail
#

contradiction and counterexample are not the same

long canopy
wooden sail
#

which you can do either by constructing a counterexample, or by assuming the condition is satisfied and showing it leads to a contradiction

spring field
#

now that I think about it more, is an average pool over an image a linear convolution, whereas something like max pool would not be linear (is it still a convolution)?

#

in fact, average pool just seems like a box blur/mean blur (given stride = 1)

wooden sail
#

correct on all accounts
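
A minimal NumPy sketch of those claims, assuming stride 1 and no padding. The helper names and test arrays are made up for illustration:

```python
import numpy as np

def windows(img, k):
    # all k x k sliding windows at stride 1: shape (H-k+1, W-k+1, k, k)
    return np.lib.stride_tricks.sliding_window_view(img, (k, k))

def avg_pool(img, k=2):
    return windows(img, k).mean(axis=(-2, -1))

def max_pool(img, k=2):
    return windows(img, k).max(axis=(-2, -1))

def box_blur(img, k=2):
    # explicit filtering with a constant kernel of weight 1/k^2
    kernel = np.full((k, k), 1.0 / k**2)
    return (windows(img, k) * kernel).sum(axis=(-2, -1))

rng = np.random.default_rng(0)
img = rng.normal(size=(5, 5))
same_as_blur = np.allclose(avg_pool(img), box_blur(img))

# linearity test T(x + y) == T(x) + T(y) on a hand-picked pair
x = np.zeros((3, 3)); x[0, 0] = 1.0
y = np.zeros((3, 3)); y[0, 1] = 1.0
avg_linear = np.allclose(avg_pool(x + y), avg_pool(x) + avg_pool(y))
max_linear = np.allclose(max_pool(x + y), max_pool(x) + max_pool(y))

print(same_as_blur, avg_linear, max_linear)
```

The top-left window of `x + y` has max 1, while the separate maxima add up to 2, which is the counterexample for max pooling.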

spring field
#

that's as random a ping as a ping can be random 😄

full leaf
#

Hi guys, I am trying to rename some image files using os for an experiment I am running. It works but something messed up and deleted some files, so I am trying to rewrite the code to check first if the file exists before changing it's name. It works, though it now only runs once and ends.
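
Without the code this is a guess, but two things commonly produce exactly these symptoms: on POSIX systems `os.rename` silently replaces an existing destination file (which looks like files being deleted), and a `return` or `break` inside the existence check ends the loop after the first file. A minimal sketch of a safer loop, using `pathlib` and a hypothetical old-name to new-name mapping:

```python
from pathlib import Path
import tempfile

def rename_images(folder, mapping):
    """Rename files in `folder` according to `mapping` (old name -> new name)."""
    renamed, skipped = [], []
    for old_name, new_name in mapping.items():
        src, dst = folder / old_name, folder / new_name
        if not src.exists() or dst.exists():
            skipped.append(old_name)
            continue  # skip this file but keep looping
        src.rename(dst)
        renamed.append(new_name)
    return renamed, skipped

# demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("a.png", "b.png"):
        (folder / name).touch()
    renamed, skipped = rename_images(
        folder,
        {"a.png": "img_001.png", "missing.png": "img_002.png", "b.png": "img_003.png"},
    )
    remaining = sorted(p.name for p in folder.iterdir())

print(renamed, skipped, remaining)
```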

tired otter
#

In a simple GPT model (Karpathy's nanoGPT for ref), do i understand correctly, the only reason to aggregate every token's (past) neighbors is to increase number of subsamples + teach model to work with data of shorter lengths? So, in theory, we could have only aggregated last token and made a prediction based on that?

tired lodge
#

almighty data science guy

iron basalt
#

In mathematics, the term linear is used in two distinct senses for two different properties:

linearity of a function (or mapping);
linearity of a polynomial.
An example of a linear function is the function defined by

    f(x) = (ax, bx

...

#

There is also the physics idea of linear.

toxic mortar
severe inlet
#

could anyone pleaseplease point me in the direction for using autoencoders into lstm for stock price predictions, or an implementation of autoencoders in python

#

im trying to implement my own encoding, but i dont have access to the papers im finding online

marsh hearth
#

Currently I'm taking calc 1, and I was wondering how much more math do I need to truly understand how machine learning works. Do I need to go to calc 2, and 3 first, and what else do I need to know?

serene scaffold
marsh hearth
serene scaffold
marsh hearth
#

idk, im gonna take calc 1, 2, 3, 4 hopefully in the next few quarters so that hopefully covers it

serene scaffold
marsh hearth
#

high school taking classes at my local college

#

im gonna try and take calc 2 over the summer

serene scaffold
marsh hearth
#

don't know, it's a community college that i'm dual enrolled in with the high school

#

for example, during winter quarter i took precalc, and spring quarter (this one) im taking calc 1

serene scaffold
#

interesting.

#

is your goal to pursue ML academically/professionally?

marsh hearth
#

potentially yes

#

but right now im more focused on graduating with my associate's degree in comp sci before i graduate high school then transfer to a 4 year with that

#

and we'll see where i go from there

serene scaffold
marsh hearth
#

haha, ok thank you for the advice ill note that 📝

severe inlet
hushed quartz
#

Can I ask for some suggestions on preprocessing on here?

lofty thorn
#

can someone explain this code to me

trim saddle
lofty thorn
#

I am a beginner..
and don't know much about this.
is this code complete?

from where can i find the data

#

i know that these are ellipses for different numbers

#

and this is a correlation matrix

trim saddle
lofty thorn
#

yea right

#

wait let me send one more shot

#

in this code it is clearly said to download from a particular site (github)..

but i can't find the data from this link

trim saddle
lofty thorn
#

but it is not same...is it?

trim saddle
#

The path in the script is the raw path for python to ingest the data

lofty thorn
#

please tell me ,at what time helpers are more active?
i will come then on discord

trim saddle
#

Also the data from the lifesat has nothing to do with the etf data.
So it's not the corresponding data source for your initial screenshot

lofty thorn
lofty thorn
#

how do i program this

digital cipher
#

What are people using for virtualenv

#

I was trying to avoid conda.

trim saddle
digital cipher
#

Do you work on windows?

#

Just trying to get tensorflow set up and it's such an uphill battle 😢

wicked slate
#

guys if anyone looking for domains or cloud storage or priavte ip security pls dm me!!

toxic mortar
#

in my df, theres a column offset. How to align it with actual start of the data?

#

df = pd.read_csv(file_path)

raw mortar
trim saddle
toxic mortar
#

is the 50 features a lot for 40k records neural network?

#

should I reduce features?

digital cipher
#

I'll probably just use docker, thank you!

buoyant vine
vocal cove
#

Greetings,

Hope all are well. Would this channel be appropriate for asking regarding jax? I'm trying to parallelize a for loop (non-sequential, meaning it's as if you run a function 100 times, where each iteration is independent), and thought it seemed best to try vmap, but am facing some difficulties.

path_list = [
            self.generate_path(
                self.initial_x,
                self.final_x,
                self.time_steps[0],
                self.time_steps[-1],
                self.bisection_level,
            )
            for _ in range(self.number_paths)
        ]

All parameters are ints.

toxic mortar
vocal cove
# toxic mortar What do you mean quality, like distribution across categories,etc...? Or noise f...

Quality of data means how accurately it captures the system you're trying to imitate. That covers the type of data, the distribution of data, the noise you're getting on the data, the normalization status of the data, etc.

So once you have the dataset, you then take a model to fit a function to this dataset (basically an N+1-dimensional function/distribution), so does your model have enough parameters, does it use the correct activation, etc.

#

Model complexity is a rather broad term, you have the number of layers, the number of neurons per layer, the connectivity between layers, the activation function used for the neurons, and even the type of layers (CNNs, you have convolution and pooling layers, so the pattern and shape which you create the layers would be a measure of complexity).

#

Also, it'd be best if you refer to sheer size as scale instead of complexity.

vivid gust
#

would dual 4060 ti cards (16gb VRAM each, so 32GB VRAM total) be any use, considering the value for money
for model training

#

or would a single 3090 be wiser

lapis sequoia
umbral charm
#

Erm

#

i have 9.7 million data points

#

takes around ~30 secs to plot

#

using matplotlib

#

any GPU accelerated modules i can use or CUDA?

serene scaffold
#

@umbral charm what kind of plot is it?

umbral charm
serene scaffold
umbral charm
#

what is down sampling tho

#

im guessing it just takes points which are kind of plotted on top of eachother

serene scaffold
#

no

umbral charm
#

and removes them

#

oh, what does it do

serene scaffold
#

you can take a uniform random sample of the points. or if there's a way to aggregate points in a way that's meaningful, like taking the average of every point that represents the same day.
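
Both options, as a minimal NumPy sketch. The data here is synthetic (timestamps in days plus a noisy signal) and the sample size is an arbitrary choice; swap in your own arrays:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
t = np.sort(rng.uniform(0, 365, size=n))            # hypothetical timestamps, in days
v = np.sin(t / 30) + rng.normal(scale=0.3, size=n)  # hypothetical measurements

# Option 1: uniform random sample of the points
idx = np.sort(rng.choice(n, size=10_000, replace=False))
t_sample, v_sample = t[idx], v[idx]

# Option 2: aggregate, here the mean of all points falling on the same day
day = t.astype(int)                                 # day index 0..364
counts = np.bincount(day, minlength=365)
sums = np.bincount(day, weights=v, minlength=365)
daily_mean = sums / counts

print(t_sample.shape, daily_mean.shape)
```

Either result is small enough that plotting it with matplotlib should be close to instant.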

umbral charm
#

That is very True

serene scaffold
#

everything I say is very true
I'm a very stable genius

umbral charm
#

Thats what an unstable genius would say

serene scaffold
#

no

umbral charm
#

Erm is there a way, on spyder or pycharm

#

to make it use 100% of cpu

serene scaffold
serene scaffold
left tartan
serene scaffold
left tartan
#

Im looking for the paper, one sec

#

(The technique is highly effective, I've used it for years)

quasi bramble
#

Do most data scientists work on-site or remotely?

craggy coral
#

did AI's like chat gpt use python to machine learn

vagrant root
#

hi

#

can anyone tell me why my model's first run predictions are always 0 or 1

jaunty helm
#

How do you guys incorporate one-hot encoding with train-test splitting? (and also cross validating, etc.)
More specifically, I often get stuck with something like this:

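```py
from sklearn.pipeline import Pipeline

steps = [
  ('transform_step_1', ...), 
  ('fill_nulls', ...),
  ('add_more_columns', ...),
  # ('one hot encode', what_to_do),
  ('estimator', ...)
]
# Pipeline takes (name, step) tuples; make_pipeline takes bare steps as *args
pipeline = Pipeline(steps)
```
and I basically have 2 options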
1. one-hot encode the entire training set before pipeline
```py
# e.g. one of the below
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
X = pd.get_dummies(X)
X = OneHotEncoder().fit_transform(X)
```
the problem is I often need some preprocessing before I want to do one-hot (e.g. fill nulls, maybe add more nominal columns)
2. one-hot as a step in the `sklearn.Pipeline`
```py
steps = [
  ('transform_step_1', ...), 
  ('fill_nulls', ...),
  ('add_more_columns', ...),
  ('one hot encode', OneHotEncoder()),
  ('estimator', ...)
]
```
The problem is that I also use `train_test_split`/cross validation
```py
# manual split
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y)
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)
print(mean_squared_error(y_test, y_pred))

# cv
from sklearn.model_selection import cross_val_score, KFold
cv = KFold(5, shuffle=True)
# `scoring` expects a scorer name or scorer object, not a bare metric function
print(cross_val_score(pipeline, X, y, cv=cv, scoring='neg_mean_squared_error'))
```
and sometimes there will be values present in the test split not in the train split, so OHE just fails
#

There's also option 3, but that's to manually type out all possible values in the nominal columns and give it to OneHotEncoder... and I really do not want to do that

boreal gale
jaunty helm
# boreal gale have you looked into the documentation of one hot encoder? there is a `handle_un...

> there is a `handle_unknown` arg.
ah... don't know how I missed that

> and how come not option 3? ...
emphasis is I don't want to do it manually
but now that you mention it, I think I have an idea

  • make a pipeline with steps up to where I'd want to one-hot encode
  • transform the entire dataset with it
  • extract all the unique values from the categorical cols and put it in a variable
  • use that variable as the `categories=` of the `OneHotEncoder`

ty for your suggestions!
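
For reference, a minimal sketch of the `handle_unknown` route on toy data (the colour values are made up). With `handle_unknown="ignore"`, a category that was not seen during `fit` is encoded as an all-zero row instead of raising:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X_train = np.array([["red"], ["green"], ["red"]])
X_test = np.array([["green"], ["blue"]])  # "blue" never appears in the training split

enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X_train)

# transform returns a sparse matrix by default; densify for display
encoded = enc.transform(X_test).toarray()
print(enc.categories_)
print(encoded)
```

The same encoder can be dropped into the pipeline as the `'one hot encode'` step, so it is refit on each training fold during cross validation.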

boreal gale
# jaunty helm > there is a `handle_unknown` arg. ah... don't know how I missed that > and how...

are you using https://scikit-learn.org/stable/modules/generated/sklearn.compose.ColumnTransformer.html as well?

i was thinking to handle each column to be one hot encoded separately.

(i have no idea if this is how it (the column transformer) works btw, it just reminded me of a collection of tooling i have written for my last job years ago for working with sklearn's pipeline better)

#
# read data
# convert to columns to categorical columns where sensible
steps = [
  ('transform_step_1', ...), 
  ('fill_nulls', ...),
  ('add_more_columns', ...),
  ('one hot encode(s)', ColumnTransformer([
      ("ohe1", OneHotEncoder(categories=[df.i_am_categorical_column1.dtype.categories]), ["i_am_categorical_column1"]),
      ("ohe2", OneHotEncoder(categories=[df.i_am_categorical_column2.dtype.categories]), ["i_am_categorical_column2"]),
  ])),
  ('estimator', ...)
]
...

(not 100% sure it's .dtype.categories but something to that effect

jaunty helm
# boreal gale ```py # read data # convert to columns to categorical columns where sensible ste...

pretty much, this is what I came up with
(I'm also trying out polars and also have other helper stuff going on, so don't mind the syntax too much)

alltf = pipe.fit_transform(ALL)  # up to where I'd want to one-hot encode
categories = [alltf[col].unique() for col in CATEGORICAL_COLUMNS]

step = ('one hot encode', make_column_transformer([
    (
        OneHotEncoder(categories=categories, sparse_output=False), 
        CATEGORICAL_COLUMNS
    )
]))

... # add `step` to the pipeline
sweet kernel
#

Hi, am facing issues in running pyspark in anaconda. I have set all the env. variables correctly but still facing issues. Can someone please help me??

full elbow
#

hey guys can anyone help me

#

im tryna think of what i can use to make this work
basically i need to read a line of text from the output, grab a specific line from that output, and then store that line into a variable to be used later on in the process

#
Confidence Threshold:
31

0%


100%

{
  "predictions": [
    {
      "x": 115.5,
      "y": 227,
      "width": 79,
      "height": 88,
      "confidence": 0.374,
      "class": "rotten",
      "points": [```

for example, this is an output. I need to find "confidence": 0.374 and store it into a variable to be used later
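
A minimal sketch, assuming the response is (or can be obtained as) valid JSON. The payload below is a trimmed, completed version of the snippet above, not the real full response:

```python
import json

# trimmed stand-in for the response text; the real one has more fields
raw = """
{
  "predictions": [
    {
      "x": 115.5,
      "y": 227,
      "width": 79,
      "height": 88,
      "confidence": 0.374,
      "class": "rotten"
    }
  ]
}
"""

result = json.loads(raw)

# first prediction's confidence, stored for later use
confidence = result["predictions"][0]["confidence"]

# or every prediction's confidence, if there can be several
all_confidences = [p["confidence"] for p in result["predictions"]]

print(confidence, all_confidences)
```

Parsing the structure is more robust than searching the text for a line containing `"confidence"`.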
#

the thing i sent is from roboflow

full elbow
#

twitter is kind of a bad place for that

#

in my opinion

lofty thorn
#

can someone please make me understand this code

#

i can't write the missing code

left tartan
left tartan
lofty thorn
#

want to make a correlation matrix..but the annoying thing is that the book does not have the data, only the code

#

i don't know where can i find the data

left tartan
#

What are the columns of your data?

#

Pandas corr() will give you the matrix, provided each series is in a separate column (ie: each column is a member of spx, and each row is a date)
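
A minimal sketch of that layout. The tickers and returns are synthetic stand-ins (there is no real market data here); each column is one series and each row is one date:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=250, freq="B")  # business days

# made-up daily returns sharing a common "market" component
market = rng.normal(0, 0.01, size=len(dates))
returns = pd.DataFrame(
    {
        "AAA": market + rng.normal(0, 0.005, size=len(dates)),
        "BBB": market + rng.normal(0, 0.005, size=len(dates)),
        "CCC": rng.normal(0, 0.01, size=len(dates)),
    },
    index=dates,
)

corr = returns.corr()  # pairwise Pearson correlation between columns
print(corr.round(2))
```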

lofty thorn
#

this is all i have

left tartan
#

What's your data

lofty thorn
#

i don't have the data

left tartan
#

Uh, then how are you going to write code?

lofty thorn
#

that is the issue...🤕is it available online??

#

idk

#

all i have is this code and matrix

left tartan
#

You could try yfinance

lofty thorn
#

what is this

left tartan
#

!pypi yfinance

arctic wedgeBOT
#

Download market data from Yahoo! Finance API

Released on <t:1713302444:D>.

lofty thorn
spring field
spring field
lofty thorn
left tartan
lofty thorn
#

wait .

vagrant root
#

I think it was an overfitting issue

spring field
lofty thorn
#

maybe stock market related..idk..sorry for the misconception

spring field
#

I see a positive correlation between stock market related data and finance data

lofty thorn
#

positive ?? how

left tartan
runic parcel
#

what happens with the units in:
cnn.add(tf.keras.layers.Dense(units=128, activation="relu"))
is it for the number of neurons in the hidden layer?

serene scaffold
#

what kind of team?

runic parcel
lofty thorn
#

I have this book @serene scaffold ..

it has programs as an e.g. in book..but doesn't show data..

do you know where can i find data of codes written in this book

serene scaffold
serene scaffold
# runic parcel

I haven't worked with CNNs or keras before, but the dense layer having 128 "units" probably reflects the output of previous layers. because each input is a (64, 64, 3)-shape tensor, and 128 is 64 * 2.
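
For reference, `units` in a Keras `Dense` layer is the number of neurons, i.e. the width of that layer's output. It is a free design choice and is not derived from the image size. A plain NumPy sketch of what such a layer computes (the batch size and the flattened input width are made up):

```python
import numpy as np

def dense_relu(x, W, b):
    # a dense layer: affine map followed by the activation
    return np.maximum(x @ W + b, 0)

rng = np.random.default_rng(0)
batch, in_features, units = 4, 3136, 128  # in_features is a made-up flattened size

x = rng.normal(size=(batch, in_features))
W = rng.normal(size=(in_features, units)) * 0.01  # one column of weights per neuron
b = np.zeros(units)                               # one bias per neuron

out = dense_relu(x, W, b)
print(out.shape)  # (4, 128): one activation per neuron, per input
```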

lofty thorn
#

are these all data sets?

serene scaffold
#

but it's likely that those CSV files are the ones the code examples refer to.

lofty thorn
past meteor
runic parcel
#

@serene scaffold what is the use of target_size()?

#

what does it do?

runic parcel
#

while training and testing the cnn

#

is it for reducing the image size?

past meteor
#

The height and width of your image

serene scaffold
past meteor
runic parcel
#

reads the images from the directory right?

past meteor
runic parcel
past meteor
#

Or does any other thing you want

runic parcel
past meteor
# runic parcel and do uk what is the use of filters? in layers.conv2d()

Sure, the intuition is that each filter is looking for a feature in your image. In the first few layers the filters are detecting lines, edges and so on. Deeper in the network they're composed into corners, circles and so on. Even deeper they become things that may help for the downstream task. Finally, the dense layers take the extracted features and use them to make a decision.

If you have 64 filters you're looking for 64 features across each position in your image.

The analogy I like using is that the conv layers are about learning how to see and the dense layers are about learning how to decide based on things you've seen.
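
To make "a filter looks for a feature" concrete, a small sketch with a hand-picked vertical-edge kernel (Sobel) on a toy image. In a real CNN the kernel values are learned rather than chosen, and the helper name here is made up:

```python
import numpy as np

def correlate2d_valid(img, kernel):
    # slide the kernel over the image (no padding, stride 1)
    win = np.lib.stride_tricks.sliding_window_view(img, kernel.shape)
    return (win * kernel).sum(axis=(-2, -1))

# toy image: dark left half, bright right half
img = np.zeros((5, 6))
img[:, 3:] = 1.0

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

response = correlate2d_valid(img, sobel_x)
print(response)
```

The response is non-zero only in the columns where the window straddles the dark-to-bright boundary, and zero over the flat regions.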

#

This may not necessarily be true but anthropomorphizing CNNs helps you understand them faster 😄

spring field
#

ngl, but the more I see tf being used, the more I see why it's falling out of usage

past meteor
#

The reason why I no longer use it is simple, they change their API too much

#

Constant breaking changes

runic parcel
spring field
#

that's an interesting development choice

runic parcel
#

like edge detect, blur, and so on?

past meteor
spring field
#

mmm, ig I might also be biased towards pytorch because I started with it, but still, like in tf, like the dense layers, you don't have to specify the input feature size apparently? which I find kinda unreadable, lol, I mean, ig that's what makes it simpler to use, but yeah

past meteor
#

70% of the deep learning I did in my master's was actually with TF/Keras, 20 % with MATLAB (... lmao) and 10 % torch

past meteor
past meteor
spring field
#

yay

past meteor
#

But if TF were consistent...

#

It's better

#

Most people that prefer torch haven't even used TF

#

But at the end of the day, with all the breaking changes they've converged to being pretty much the same library especially if you add lightning into the mix

spring field
#

oh well, guess I might give it a chance at some point
(but at least looking at what I've seen others write with tf, it doesn't seem particularly enticing to me)

past meteor
#

If you want to try something else then I'd actually advise you to try out Jax

#

I'd describe Jax as something you use to make a DL framework and not a DL framework at all (because it's mostly JIT, autodiff and so on)

spring field
#

noted, I've heard of it as well, but only very little, will check it out though, thanks firNotes

past meteor
#

Amazon (sadly) likes MXNet

#

So a lot of the SoTA time series models are done with that

#

That's another contender for "I want to do something different", but it's only if you're doing SoTA time series analysis. Otherwise I'd say the best advice in 2024 is "just stick to torch" 😄

lofty thorn
#

i can't find data sets on github..
can anyone help me..
i am stuck from a while

#

please anyone

past meteor
lofty thorn
#

chatgpt understands me better than humans..🥲

spring field
#

it has no ability to "understand"

#

also, if you can't find datasets that are from the book, well, ig you can practice getting datasets from elsewhere and adapting them to the code in the book

finite sierra
#

I have 2 arrays:
numbers which consists of integers, and I will multiply every one of them by a certain value
mask which will be an array of True/False values, True for values I don't want to multiply (will cause overflow), and False for values I want to multiply

What is a smart way to perform the multiplication for values in numbers that are FALSE in mask?
I tried doing np.where(mask, nan, numbers * scalar) but that still does the multiplication on all numbers which results in overflow warning.

wooden sail
#

that'll only multiply the numbers where the mask is False

#

!e

import numpy as np
scalar = 100
numbers = np.array([[100_000, 0.1], [0.3, 0.2]])
mask = np.array([[True, False], [False, False]])
numbers[np.logical_not(mask)] *= scalar
print(numbers)
arctic wedgeBOT
#

@wooden sail :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | [[1.e+05 1.e+01]
002 |  [3.e+01 2.e+01]]
wooden sail
#

there we go

finite sierra
#

but would it be possible to somehow set all numbers to nan that didn't get multiplied in this statement?

wooden sail
#

should be able to do the opposite indexing

#

numbers[mask] = np.nan

finite sierra
#

alright

wooden sail
#

or whichever other definition of nan you like (e.g. float('nan') is another valid one, i think)

#

you could also mute the warning but i'm not sure that's the best idea

finite sierra
#

and can I do this on a new array instead?

#

i.e. don't want to modify existing numbers

#

I could copy array then do that, but is there a smarter way without taking 3 steps of 1. copy 2. not-mask 3. mask

wooden sail
#

sure, there are other ways. they all require at least as many steps though

#

you could create an array of 0s or an array of arbitrarily initialized values and then assign into that new array

#

!e

import numpy as np
scalar = 100
numbers = np.array([[100_000, 0.1], [0.3, 0.2]])
mask = np.array([[True, False], [False, False]])

receptacle = np.empty(shape=(2,2)) # 2d array full of random trash
receptacle[:] = np.nan  # np.NAN was removed in NumPy 2.0
receptacle[np.logical_not(mask)] = numbers[np.logical_not(mask)]*scalar
print(receptacle)
``` let's see if this works
arctic wedgeBOT
#

@wooden sail :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | [[nan 10.]
002 |  [30. 20.]]
wooden sail
#

looks good

#

idk if receptacle[:] = np.nan or receptacle[mask] = np.nan is better. probably doesn't make a big difference unless your array is huge
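
Another option, as a sketch on the same example values: NumPy ufuncs accept `out=` and `where=`, so the multiplication is only evaluated where the mask is False and every other slot keeps whatever `out` was pre-filled with.

```python
import numpy as np

scalar = 100
numbers = np.array([[100_000, 0.1], [0.3, 0.2]])
mask = np.array([[True, False], [False, False]])

# start from all-NaN, then multiply only where mask is False
result = np.full(numbers.shape, np.nan)
np.multiply(numbers, scalar, out=result, where=~mask)

print(result)
print(numbers)  # the original array is left untouched
```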

finite sierra
#

ah thanks

buoyant kite
#

who knows TTS?

spring field
wooden sail
serene scaffold
buoyant kite
craggy agate
#

Hey there, I am beginning work on an ambitious project: an autonomous RC plane. Here is what I want it to do: take off successfully, make a high-speed overhead pass, and successfully touch down and brake next to me. The plane will have a Raspberry Pi 4 as the onboard computer controlling the plane's movements, will be made of IMC carbon fiber, and will carry multiple sensors: GPS, lidar, barometer, gyroscope, AOA, accelerometer, camera, speed-measuring sensor, etc. I will be coding the project in Python, and it will also have two 400 W brushless fans. Now my question is, **what DL model should I use? I am currently thinking of a hybrid CNN + LSTM architecture. Would that work? Should I use reinforcement learning?** Also, how do I train the model using simulations? It would be impossible to find flight logs covering all the data in .csv format, and honestly I don't think that would work... I probably need to simulate a plane with realistic winds and conditions, with access to all the sensors I would need. Could anyone give me a sense of direction? I'm familiar with DL and ML btw.

serene scaffold
severe inlet
#

ive trained my autoencoder, but i have problems extracting out the hidden layer encoded output cos i need it for LSTM. ive managed to dot the input and weights, but failed at adding the biases. could anyone pleaseplease help me with this. i have a rough idea but i dont know how to go about fixing this.

#

because i have an array of lists of my input, im thinking that i need to iterate thru the array and add the biases?

#

but my array of lists has shape (10865, 8). how can i batch it into (32,8) for successful addition of biases?
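
A minimal sketch, assuming a single dense encoder layer; the hidden size and the weights here are random stand-ins for the trained ones. The bias has one entry per hidden unit, so NumPy broadcasts it across all rows and no batching or looping is needed (32 would be the training batch size, not a dimension of the weights):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10865, 8))  # same shape as the input in the question

hidden = 4                       # hypothetical encoding size
W = rng.normal(size=(8, hidden)) # stand-in for the trained encoder weights
b = rng.normal(size=(hidden,))   # stand-in for the trained encoder biases

# (10865, 8) @ (8, hidden) -> (10865, hidden); b is broadcast over the rows
encoded = X @ W + b
# apply the same activation the encoder layer was trained with, if it has one

# processing a slice of 32 rows gives the same values as the full computation
first_batch = X[:32] @ W + b
print(encoded.shape, np.allclose(encoded[:32], first_batch))
```

If the model is a Keras one, building a second model that outputs the encoder layer and calling `predict` on it avoids redoing this by hand.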

delicate sleet
#

Hey there! I’ve seen your interest for neuro-symbolic, explainability, self organizing.. Maybe our project might interest you as well! Feel free to check it out, leave a star if you find it appealing, and share your feedback with me! 🙂 https://github.com/SynaLinks/HybridAGI

GitHub

The Programmable Neuro-Symbolic AGI that lets you program its behavior using Graph-based Prompt Programming: for people who want AI to behave as expected - SynaLinks/HybridAGI

twilit horizon
#

hey

hard nest
#

I'm training an image classification model. I have the image and the mask: should I use the mask to cover everything but the important area, or use it to highlight that area, leaving the rest of the image normal?

agile owl
#

how can I simplify the process of solving for the intersection so spark can do it quickly without a udf. I was thinking of transforming the curves into histograms of fixed widths to discretize the space but not sure what to do for the intersection

past meteor
verbal musk
#

"data science", "machine learning", "scientific computing", "artificial intelligence"... which discipline encompasses them all?

past meteor
#

there's no formal definitions so nothing "encompasses" all of them

#

maybe computer science, mathematics and statistics

verbal musk
#

okay

agile owl
#

Some interesting emergent properties of my economy simulation. Changing nothing about the distribution of individual wealth and incomes, and simply changing the number of people:

Loan supply and demand at 6 banks and 100 people, 100,000 people and 10,000,000 people, respectively.

flint plover
#

Anyone has experience in Nlp ?

trim saddle
flint plover
serene scaffold
# buoyant kite don't yo know TTS?

I'm going to mute you if you continue to ask to ask. "asking to ask" is when you say things like "can I ask a question?" or "does anyone know about x?" instead of asking the question that you actually want help with.

toxic mortar
spring field
lyric trail
#

hiiii i am working on speech recognition model can somebody help me in this

#

i am getting this error
i tried all these things but not resolving this issue
-Check file extension
-Verify file accessibility
-Test with a different audio file
-Check Whisper installation
-Update Whisper
toxic mortar
spring field
# toxic mortar This is locally right? ```py from transformers import pipeline summarizer = pi...

I assume so, I'm not familiar with that particular API, but I'd assume it downloads the checkpoint only once and only if it can't find it on your system
like, it's a pre-trained model, you're only downloading the weights basically and once that is done, you don't have to even be connected to the internet to run this
also there is no clear indication of using some identifying bit of API authentication that could possibly limit your usage (I mean, seems they might be doing a bit of API throttling, but that's probably not particularly relevant for you)

buoyant kite
spring field
lyric trail
#

can you help me with this

lyric trail
spring field
#

can you send the entire error traceback?

buoyant kite
#

can you show me your project?

lyric trail
spring field
#

your username is USER?
and can you send your error traceback?

lyric trail
#

actually i was unable to send the python file but i send this link is that okay...???

spring field
#

yep

buoyant kite
lyric trail
buoyant kite
#

yep

lyric trail
buoyant kite
lyric trail
#

let me check

lyric trail
left tartan
#

The general idea is interesting, although it's kinda the premise of AWS and cloud computing. I know a company who is trying to make a distributed market of idle manufacturing resources (cnc machines, etc)

lyric trail
buoyant kite
spring field
buoyant kite
spring field
#

!e

"\Users"
arctic wedgeBOT
#

@spring field :x: Your 3.12 eval job has completed with return code 1.

001 |   File "/home/main.py", line 1
002 |     "\Users"
003 |     ^^^^^^^^
004 | SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \UXXXXXXXX escape
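
For reference, the eval fails because `\U` in a normal string literal starts an 8-digit unicode escape. A minimal sketch of the usual ways to write a Windows path so that this does not happen (the path itself is a made-up example, not the one from the question):

```python
from pathlib import PureWindowsPath

# Hypothetical path, used only for illustration.
raw = r"C:\Users\USER\audio.mp3"          # raw string: backslashes are kept literally
escaped = "C:\\Users\\USER\\audio.mp3"    # doubled backslashes give the same string
forward = "C:/Users/USER/audio.mp3"       # forward slashes are also accepted on Windows

print(raw == escaped)
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

All three spellings refer to the same file; whether that file actually exists is a separate question.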
spring field
#

their path is not incorrectly formatted, it's just wrong

#

is it that path? we don't actually know because @lyric trail is still to provide the entire error traceback

#

!traceback

arctic wedgeBOT
#
Traceback

Please provide the full traceback for your exception in order to help us identify your issue.
While the last line of the error message tells us what kind of error you got,
the full traceback will tell us which line, and other critical information to solve your problem.
Please avoid screenshots so we can copy and paste parts of the message.

A full traceback could look like:

Traceback (most recent call last):
  File "my_file.py", line 5, in <module>
    add_three("6")
  File "my_file.py", line 2, in add_three
    a = num + 3
        ~~~~^~~
TypeError: can only concatenate str (not "int") to str

If the traceback is long, use our pastebin.

buoyant kite
spring field
#

I was referring to them

buoyant kite
lyric trail
#

but why is mine not working

#

sry, i am new to coding, what is a traceback?

lyric trail
buoyant kite
craggy agate
#

Hey there, I am beginning work on an ambitious project: an autonomous RC plane. Here is what I want it to do: take off successfully, make a high-speed overhead pass, then touch down and brake next to me. The plane will have a Raspberry Pi 4 on board as the flight computer controlling the plane's movements. It will be made of IMC Carbon Fiber and carry multiple sensors: GPS, Lidar, barometer, gyroscope, AOA, accelerometer, camera, speed sensor, etc. I will be coding the project in Python, and it will also have two 400 W brushless fans. Now my question is, what DL model should I use? I am currently thinking of a hybrid CNN + LSTM architecture. Would that work? Should I use reinforcement learning? Also, how do I train the model using simulations? It would be impossible to find flight logs covering all of this data in .csv format, and honestly I don't think that would work anyway. I probably need to simulate a plane with realistic winds and conditions, with access to all the sensors I would need. Could anyone give me a sense of direction? I'm familiar with DL and ML btw.

spring field
#

I would maybe consider starting with an RC car

magic steppe
#

anyone know if there's something premade that converts from a generic numpy matrix to something that i can pass to scipy.linalg.solve_banded?
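
As far as I know SciPy does not ship a dense-to-banded converter, but the packing rule from the `solve_banded` docs (`ab[u + i - j, j] == a[i, j]`) is short to write by hand. A minimal sketch, where `dense_to_banded` and `bandwidths` are hypothetical helper names and the matrix is a toy example:

```python
import numpy as np
from scipy.linalg import solve_banded


def bandwidths(a):
    """Return (lower, upper) bandwidths of a dense matrix from its nonzeros."""
    rows, cols = np.nonzero(a)
    if rows.size == 0:
        return 0, 0
    return int(max(0, (rows - cols).max())), int(max(0, (cols - rows).max()))


def dense_to_banded(a, lower, upper):
    """Pack a square dense matrix into the (lower + upper + 1, n) layout
    used by solve_banded: ab[upper + i - j, j] == a[i, j]."""
    a = np.asarray(a)
    n = a.shape[0]
    ab = np.zeros((lower + upper + 1, n), dtype=a.dtype)
    for k in range(-lower, upper + 1):     # k > 0: superdiagonal, k < 0: subdiagonal
        diag = np.diagonal(a, offset=k)
        if k >= 0:
            ab[upper - k, k:] = diag       # superdiagonals are padded on the left
        else:
            ab[upper - k, :n + k] = diag   # subdiagonals are padded on the right
    return ab


a = np.array([[4.0, 1.0, 0.0, 0.0],
              [2.0, 5.0, 1.0, 0.0],
              [0.0, 2.0, 6.0, 1.0],
              [0.0, 0.0, 2.0, 7.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

lower, upper = bandwidths(a)
x = solve_banded((lower, upper), dense_to_banded(a, lower, upper), b)
print(np.allclose(a @ x, b))
```

Building the dense matrix first only makes sense for small problems; for large ones it is cheaper to fill the banded array directly.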

tired lodge
craggy agate
craggy agate
#

Will try to implement obstacle avoidance using its front camera

#

Also using YOLO object detection to make it track/follow me

tired lodge
wintry gyro
#

How do I save an ML model locally with PySpark?
I am getting this error.

tacit basin
severe inlet
#

im training a lstm model, shouldnt it be training on my gpu instead of cpu? how do i change this?

cinder jay
#

Hey guys, which pytorch image (in docker) should i use if i just want to inference with CPU?

past meteor
#

#rules, could you remove this? It's in violation of rule 6 🙂

severe inlet
#

tensorflow

lapis sequoia
#

hey guys, I'm lost, how do I generate text data using an LLM API and prompting?

past meteor
severe inlet
severe inlet
past meteor
severe inlet
#

yes

#

i dont have any conda or anaconda environments installed

#

if it matters

past meteor
#

TensorFlow no longer supports GPUs on native Windows (2.10 was the last release that did), you'll have to use WSL2 (Windows Subsystem for Linux)

severe inlet
#

ahhhh okay

past meteor
#

that's the reason why your GPU is not being found
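
A quick way to confirm what TensorFlow can see, as a minimal sketch. `visible_gpus` is a hypothetical helper name, and the fallback to an empty list when TensorFlow is missing is an assumption made so the snippet runs anywhere:

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or an empty list
    if TensorFlow is not installed in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print(gpus if gpus else "no GPU visible to TensorFlow; training will run on the CPU")
```

If the list is empty inside WSL2 as well, the CUDA libraries or the driver are the next thing to check.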

severe inlet
#

i just thought it was a setting i didnt enable or something

#

if thats so then alls g for now

mystic harbor
#

@crystal geyser I've deleted your message due to the following reasons:

  • We do not allow advertisements in this server
  • Scraping facebook is against their ToS and we do not allow discussions around such topics.

Please re-read our #rules

vagrant root
lofty thorn
#
```
2024-04-28 21:44:16.128937: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
WARNING:tensorflow:From C:\Users\ACER\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

Traceback (most recent call last):
  File "C:\Users\ACER\PycharmProjects\Face recognition\main.py", line 11, in <module>
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cv2.error: OpenCV(4.9.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\color.cpp:196: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor'

INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
```

how to fix this
#
```py
import cv2
import mediapipe as mp
import time


cap = cv2.VideoCapture(1)
mpHands = mp.solutions.hands
hands = mpHands.Hands()

while True:
    success, img = cap.read()
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    results = hands.process(imgRGB)

    cv2.imshow("WIZARD", img)
    cv2.waitKey(1)
```

this is the code
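
The `!_src.empty()` assertion means `cvtColor` received an empty image, which is what happens when `cap.read()` fails and returns `(False, None)`; a camera index that does not exist is a common cause. A minimal sketch of guarding against that, assuming camera index 0 is the right one (the original uses 1) and using a hypothetical `frames` helper:

```python
def frames(cap):
    """Yield frames from an object with a cv2.VideoCapture-style read()
    method, stopping as soon as a read fails instead of passing None on."""
    while True:
        success, img = cap.read()
        if not success or img is None:
            break
        yield img


def main(camera_index=0):
    # Imported here so the helper above can be used without OpenCV installed.
    import cv2

    cap = cv2.VideoCapture(camera_index)   # index 0 is an assumption
    if not cap.isOpened():
        raise RuntimeError(f"could not open camera {camera_index}")
    try:
        for img in frames(cap):
            img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # img is a real frame here
            # hands.process(img_rgb) from the original code would go here
            cv2.imshow("WIZARD", img)
            if cv2.waitKey(1) & 0xFF == ord("q"):           # press q to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Calling `main()` starts the loop; if it raises immediately, try another camera index.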
timid kiln
#

I need some help on “where to start”. I am working at a production facility where a piece of equipment is producing a byproduct in volumes greater than expected. (FYI this isn’t a chemical reaction.). No one can figure out why this is happening.

What I was hoping to do was load a bunch of operating data into pandas, process it so there’s no empty fields or obviously erroneous values, and then (and this is where the “where do I start” part comes in) through the magic of some python library(s) it will tell me “parameter X, Y, and Z have a noticeable effect on the byproduct”.

I guess what I’m thinking is I need to practice somehow with a data set where it’s obvious to me what the answer is, then test against whatever program I write. I think?

I would appreciate your advice on how/where to begin. Please tag me if you respond. Thank you!

edit: This is all time based data obtained from the control system historian, if that helps.

devout sail
timid kiln
# devout sail Without writing any code, just conceptually, how would you know what effect some...

If I imagined some pre-built program, each column of data would have a header/name. So I'd pick the header/name and select some option that would calculate what independent variables/parameters appear to have an effect on the selected dependent variable/data.

If I were doing this by programming, after the data munging, I imagine I'd have to write code to name the dependent variable in question. But, as I mentioned, I'm at a "where do I start" type of situation so all I know at this point is "how to load data into pandas from a csv". I apologize in advance for my newbiness. 🙂

devout sail
timid kiln
#

I installed Orange was going to give that a try once I figured out how to bypass Excel and dump data from the historian straight to a CSV. My company locks everything down pretty hard as far as software, connections, permissions, etc.

timid kiln
#

We expect some vapor, but we're getting a lot more than previous, all of a sudden back at the beginning of April, and it's a financial loss if this continues.

devout sail
#

So you have a bunch of parameters about the process, and then a column saying how much byproduct was produced, and you want to be able to predict from the parameters how much byproduct you'll get?

timid kiln
#

No, not at all.

#

I have a bunch of time-based data measured from pressure, temperature, and flow meters throughout the facility. I want to be able to calculate which of these data appear to have an effect on the byproduct. When the byproduct flow increases, which other data did something at the same time? Likewise, when it decreases, which temperatures, pressures and/or flows appeared to contribute to that reduction? Right now, we've plotted everything we think has an effect on this vapor stream, but visually we haven't found a correlation. So I'm trying to determine if there's a way, mathematically, to figure this out with the logged data from the historian.

#

The hope is that we can determine "oh, it was [insert parameter here]. We just need to [reduce/increase] that [parameter] to reduce the vapor volume".
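
One possible starting point is a lagged correlation scan in pandas, practised first on synthetic data where the answer is known. This is a minimal sketch, not a full analysis: the column names, the 3-sample delay and the `lagged_correlations` helper are all made up for illustration, and correlation alone does not prove cause.

```python
import numpy as np
import pandas as pd


def lagged_correlations(df, target, max_lag=5):
    """For each other column, find the lag (in rows) at which its correlation
    with `target` is strongest. A positive lag means the column leads the target."""
    rows = []
    for col in df.columns.drop(target):
        best_lag, best_corr = 0, 0.0
        for lag in range(max_lag + 1):
            corr = df[target].corr(df[col].shift(lag))  # Pearson; NaN rows are skipped
            if pd.notna(corr) and abs(corr) > abs(best_corr):
                best_lag, best_corr = lag, corr
        rows.append({"parameter": col, "lag": best_lag, "corr": best_corr})
    return pd.DataFrame(rows).sort_values(
        "corr", key=abs, ascending=False, ignore_index=True
    )


# Synthetic data: vapor follows temperature with a 3-sample delay, pressure is unrelated.
rng = np.random.default_rng(0)
n = 500
temperature = rng.normal(size=n)
pressure = rng.normal(size=n)
vapor = 2.0 * np.roll(temperature, 3) + 0.1 * rng.normal(size=n)
df = pd.DataFrame({"vapor": vapor, "temperature": temperature, "pressure": pressure})

result = lagged_correlations(df, "vapor")
print(result)
```

With real historian data the tags would first need to be resampled onto a common time grid, and slow drifts can make unrelated signals look correlated, so treat the ranking as a list of things to investigate rather than an answer.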

timid kiln
buoyant flare
#

I was looking through an object detection project and ran into an error that seems to originate from the call method of its custom BatchNormalization class. The error message I'm getting is:

Using a symbolic `tf.Tensor` as a Python `bool` is not allowed. You can attempt the following resolutions to the problem: If you are running in Graph mode, use Eager execution mode or decorate this function with @tf.function. If you are using AutoGraph, you can try decorating this function with @tf.function. If that does not work, then you may be using an unsupported feature or your source code may not be visible to AutoGraph. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/reference/limitations.md#access-to-source-code for more information.

Here's the relevant part of the custom BatchNormalization layer:

import tensorflow as tf

class BatchNormalization(tf.keras.layers.BatchNormalization):
    """
    Make trainable=False freeze BN for real (the og version is sad)
    """

    def call(self, x, training=False):
        if training is None:
            training = False
        training = tf.logical_and(training, self.trainable)
        return super().call(x, training)

I've tried to modify the code to handle the None value for the training argument using tf.cond, like so:

training = tf.cond(tf.equal(training, None), lambda: tf.constant(False), lambda: training)

However, I'm still receiving the same error. Can anyone here help me understand why I'm encountering this error and how to resolve it?

Any help or guidance on resolving this issue would be greatly appreciated! Thanks in advance!

pliant heron
#

hey! idk if this is the right place to ask this question, but i'll ask anyway, sorry😅
i am self-learning data visualization and i want to install matplotlib
i am using python in vs studio (windows 10)
my question is: which .whl file should i download, the pypy one or the cpy one?
thank you for helping me out

ivory quarry
#

that + gazebo simulations

vocal barn
#

Aye, how do I use opencv for face recognition on pydroid?
I have basic python skills tho
And is it free?

spring field
vocal cove
#

Greetings there,

Hope all are well. I am looking for some assistance as to how I can enable jax for the code pasted in the link below for running some of the loops in parallel for wall clock efficiency.
https://paste.pythondiscord.com/LMMQ
The reason for this is that for certain instances, it takes significant time to run, so being able to parallelize would help a great amount, and even more if I can enable GPU usage (I have an RTX 3060, so it would be put to some good use here):

num_paths = [100*i for i in range(1, 15)]
values = []

for path in tqdm(num_paths):
    qpi = QPI(initial=np.array([0, 0]), final=np.array([1, 1]), bisection_level=22, number_paths=path, n_filtrations=0)
    probability_amplitude, paths = qpi.calculate_fpi()
    values.append(probability_amplitude)

plt.plot(values)
#

So, one thing that I want to enable parallelization for is the path_list definition in calculate_fpi() method. You can see it's a non-sequential for loop, which is perfect for parallelization using sth like JAX's vmap.

#

So, I would like to enable two things :

  1. Enabling JAX to parallelize the loops.
  2. Enabling GPU for running the code if possible.
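
A minimal sketch of the vmap pattern, under stated assumptions: the contents of the pasted `QPI` class are not reproduced here, so `path_action` is a hypothetical stand-in for whatever is computed per path, and the sizes are arbitrary.

```python
import jax
import jax.numpy as jnp


def path_action(key, n_steps):
    """Hypothetical stand-in for the per-path work: sample one random path
    and return a scalar computed from it."""
    steps = jax.random.normal(key, (n_steps,))
    path = jnp.cumsum(steps)
    return jnp.sum(jnp.diff(path) ** 2)


@jax.jit
def all_paths(keys):
    # vmap maps path_action over the leading axis of keys, replacing the Python loop.
    return jax.vmap(lambda k: path_action(k, 64))(keys)


keys = jax.random.split(jax.random.PRNGKey(0), 1000)   # one independent key per path
values = all_paths(keys)
print(values.shape)
print(jax.devices())   # shows a GPU device only if a GPU-enabled jaxlib is installed
```

The per-path function has to be written with `jax.numpy` and without Python-side state for this to work, so NumPy-based code usually needs porting first. No GPU-specific code is needed: JAX places arrays on the first available accelerator by default.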
#

I immensely appreciate the assistance in advance!

unique patio
#

Hi everyone,

I am building an AI/ML/data engineering project that is going to help a “user” pick the best choice out of N car models.

For example, the user provides us with 100 Volkswagen models and their specifications as PDF files (unstandardised format).

When the PDFs finish uploading to a server, they can provide a specification they need the car to meet (for example 4x4 and above 200 HP), and they describe it just like they would to an LLM (chat format more or less, or just specific phrases).

What would be the best approach for making such a thing?

The OpenAI API is ofc off the table because of its broad imagination and failure to apply the provided context.

I think NLP might do the job, but I don’t think it’s the best choice here.

A RAG setup based on a vector DB might be a decent choice, but what model/technique could do the trick here?

Thanks in advance everyone 🫶🏻

craggy agate
craggy agate
hasty grail
#

For example, the user provides us with 100 Volkswagen models and their specifications as PDF files (unstandardised format).
Do you plan to persist the data, or does it vary between each conversation? I think RAG is the way to go regardless

#

you can look around and see what best fits your situation

unique patio
unique patio
#

I will take a look at the suggested solutions - thanks a lot for the advice guys.

craggy agate
neat bluff
#

Or basically anything similar but smaller than a Raspberry Pi. Size and weight are crucial here, afaik. Response times are crucial too, so that has to be considered. I would love to be a part of this project - dm me if You don't mind and we can talk some more about this.

craggy agate
#

Lol, RBP won't affect aerodynamics cause it will be inside the hull.

neat bluff
#

Well, technically it shouldn't, but mounting it stably inside might be a bit tricky. Also mind the weight distribution.

daring pumice
#

Hey everyone, just a short question regarding VS Code. I am trying to build an AI for the game Assetto Corsa (that’s not the question, though). I made a virtual environment in VS Code and installed the module for the game there. It installed, but when I tried to run a simple script to test it, I got an error saying that module doesn’t exist. Any help is appreciated. Please keep in mind I have other virtual environments for my other projects too. Thank you in advance!

neat bluff
#

You need to install modules separately in every virtual env, since each one is isolated from the system environment

craggy agate
craggy agate
daring pumice
# craggy agate Did you activate Venv?

Yup, ran the command and everything. It didn’t show in the terminal, but a couple of seconds later I got a pop-up saying the environment was activated, even though it doesn’t show

#

I used pip list to see if it was even in the list and sure it was right there
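
When `pip list` shows a package but the script cannot import it, the script is often being run by a different interpreter than the one pip installed into. A minimal sketch for checking this from inside the failing script; `interpreter_report` is a hypothetical helper and `"json"` is only a placeholder module name:

```python
import importlib.util
import sys


def interpreter_report(module_name):
    """Describe which interpreter is running and whether it can find module_name."""
    return {
        "executable": sys.executable,                 # path of the running Python
        "in_venv": sys.prefix != sys.base_prefix,     # True inside a virtual environment
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


# "json" is a placeholder; substitute the top-level module that fails to import.
print(interpreter_report("json"))
```

If `executable` does not point inside the project's venv folder, picking the venv with the "Python: Select Interpreter" command in VS Code usually fixes it.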

daring pumice
neat bluff
#

Btw, if You are a beginner, using a virtual env might complicate stuff. It doesn't make that much of a difference, and things will be easier for You without it.

craggy agate
#

Not the Venv files, that's different

daring pumice
daring pumice
neat bluff
daring pumice
#

So don’t really know what to do anymore

neat bluff
neat bluff
craggy agate
#

I have but that might drain the battery too fast. Especially due to the load on the pi

#

It's a model 4 b+

daring pumice
neat bluff
daring pumice
#

Gotcha. Hey if you don’t mind can I friend you mate? I could really use some help with these issues. Of course if you are comfortable

#

I will send the error once I get back. I am currently outside

neat bluff
craggy agate
#

Yep

neat bluff
daring pumice
neat bluff
daring pumice
#

Haha true

daring pumice
#

hey guys! i have a question: i have python installed but pip is not installed, so whenever i try to install any library i just get hit with an error. can anyone please help me with this? thank you in advance! i have repaired python multiple times but still no change

serene scaffold
#

show the output as text--not as a screenshot

#

btw, a lot of data science libraries don't support 3.12 yet. you usually want to stay one or two versions behind.

daring pumice
#

C:\Users\imohi>python -m pip --version
pip 24.0 from C:\Users\imohi\AppData\Roaming\Python\Python312\site-packages\pip (python 3.12)

daring pumice
tidal bough
#

on windows you can't easily install python without pip. Your output means you do have it, it's just not in PATH and so you can't access it as pip.
You could just do nothing and use pip as python -m pip, that'd work just fine.

#

(If you want to be able to call pip as just pip, you need to add the Scripts folder of your python installation to PATH. There's an installer option for that I believe - "add to environmental variables" or something. Try rerunning the installer, choosing Modify and selecting that option.)

daring pumice
#

sure mate, will try that. but if i just use python -m pip, i should be able to install packages and libraries, right? for running code in vs code

serene scaffold
#

how much calculus, linear algebra, and stats do you know?

daring pumice
#

hey @serene scaffold sorry for the ping and i dont know if its right for me to ask this but can i send you a friend request mate? so that i can ask a question if i am facing any problem, only if you are comfortable of course!

serene scaffold
daring pumice
#

sure then!

#

thank you for your time!

serene scaffold
#

there are some resources in the pins

past meteor
#

I'd say get your feet wet with something that doesn't really require you to train your own stuff

#

Do that for as long as you can, eventually you'll hit a roadblock and then you can dig into the math and stats

craggy agate
#

Try learning simple linear regression, multiple linear regression, polynomial linear regression, support vector regression

#

Then maybe get your feet wet in deep learning

runic parcel
#

i want to make a cnn model in which the neural network can scan an image or video and extract the phone number and name from it... how should i do it?

runic mountain
#

hi does anyone here know about LLM?

royal harbor
#

Looking to grab audio from a video and change the voice to something better.
Not sure what model to use, been looking on huggingface.

I'm going to extract the audio with ffmpeg -> send it off to change the audio to a different voice then make the video again with ffmpeg.
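
The two ffmpeg steps of that pipeline could look roughly like this. A minimal sketch that only builds the command lines; the function names and file names are hypothetical, and the voice-conversion step in between is left out because no model has been chosen:

```python
def extract_audio_cmd(video, audio_out):
    """ffmpeg command that drops the video stream (-vn) and writes the audio
    in the format implied by audio_out's extension."""
    return ["ffmpeg", "-y", "-i", video, "-vn", audio_out]


def replace_audio_cmd(video, new_audio, video_out):
    """ffmpeg command that copies the original video stream unchanged
    (-c:v copy) and takes the audio from new_audio instead."""
    return [
        "ffmpeg", "-y",
        "-i", video, "-i", new_audio,
        "-map", "0:v:0", "-map", "1:a:0",   # video from input 0, audio from input 1
        "-c:v", "copy", "-shortest",
        video_out,
    ]


# Hypothetical file names, for illustration only.
print(" ".join(extract_audio_cmd("input.mp4", "voice.wav")))
print(" ".join(replace_audio_cmd("input.mp4", "new_voice.wav", "output.mp4")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is on PATH.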

Do you think it would be easier to extract the caption from the video then use text to speech in stead?

trim saddle
fossil walrus
#

Guys, i am thinking of learning automation in python, so can anyone tell me where and how to start? I am a beginner in python

hasty grail
#

!resources

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

odd meteor
# runic mountain hi does anyone here know about LLM?

Don't ask to ask. If you had mentioned specifically what you need help with regarding LLMs, you probably would have gotten a much quicker response.

Now someone has to ask "what do you need help with in LLMs" before they get a full picture of your question.

odd meteor
# runic parcel i want to make a cnn model in which, the neural netwrok can scan the image or vi...

There are some nice tutorial videos online you could use to practice. Then once you're comfortable with the videos, you can easily adjust it to your use-case.

You can check this out

https://www.kaggle.com/code/sarthakvajpayee/license-plate-recognition-using-cnn

hallow sphinx
#

Do people use ipython for anything other than jupyter notebook? Why isn't jupyter notebook written in CPython (w/ tkinter)?
I don't understand ipython, or how it is different from CPython. All I know is that it has some more features, like variable? giving the docs of the variable and variable?? pulling up the source code.
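
For context: IPython is an enhanced interactive shell that runs on top of CPython, not a separate Python implementation, and Jupyter's Python kernel is built on it. The `?` and `??` helpers have rough plain-Python equivalents in the standard `inspect` module. A minimal sketch, using `json.dumps` purely as an example object:

```python
import inspect
import json

# Roughly what `json.dumps?` shows in IPython: the signature and the docstring.
print(inspect.signature(json.dumps))
print(inspect.getdoc(json.dumps))

# Roughly what `json.dumps??` shows: the source code, when it is available.
print(inspect.getsource(json.dumps))
```

IPython adds formatting, tab completion, magics and history on top of this, but the code being run is ordinary Python.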

rough wadi
past meteor
#

@wooden sail Given that the output of a recurrent neural net are 2 tensors X' and Z the default case is always taking Z and using that for the basis for downstream tasks.

I made a mistake and actually passed the last time step of X' which is effectively the feature vector at t+1. Interestingly enough, the results are very encouraging of something that's arguably a bug. Have you come across anyone doing this?

wooden sail
#

i'd have to see how exactly X' is computed tbh

past meteor
#

Standard recurrent neural network

wooden sail
#

give me 5 min to learn how they work 😛

#

or send me a paper that uses the same syntax cuz you didn't give them names and wikipedia uses different letters

past meteor
#

Honestly, this working is most likely a special case of my task

#

I use Z for h

#

And X' for output

wooden sail
#

looking at a single layer, it looks like X' depends on the value of Z from the previous layer. the value of Z from the current layer depends on X'

past meteor
#

Yes

wooden sail
#

in some sense, swapping Z for X' is the same as removing half or one layer from your network, and leaving everything else as is

#

i'd take that to mean you already had too many layers anyway

#

either in the RNN or in however you compute the downstream task

past meteor
#

Aha, that makes sense

wooden sail
#

removed one level of composition in the last layer, yeah

past meteor
#

X' logically contains the temporal context because it depends on Z

#

Yeah okay that makes sense

wooden sail
#

and with that cleared, yes, i run into this all the time

#

i work a lot with an algorithm that uses nesterov acceleration

#

in the final iteration, you can always ask whether you want to keep the "safer" gradient step, or keep the nesterov step

#

when you're close to convergence, it doesn't make a difference

past meteor
#

Okay yes indeed that's very similar

#

I think the analogies used in ML aren't great

#

Because it's quite obvious if you look at the math

#

Thanks!

calm pagoda
#

I want to learn AI/ML/data science, and as far as I know there's no complete free content available.
So can anyone with experience in this field suggest some courses to buy, so that I can learn everything from there?

If you have anything else to tell me, pls do.

tidal bough
fair warren
#

Do we use this group for scientific programming in general, not necessarily related to DS/AI?

hasty grail
latent girder
#

Hi, i recently bought a course for the purpose of shifting to data science (currently a data analyst). But the course covers the whole of python, which feels very slow: a total of 60 hours, not including the time i'll spend practicing, building, etc.

Should i learn the whole of python or just the stuff needed for data science?

For context, i have no programming language experience besides SQL, if you would count it. Tho I have learnt for, while, if/else, and the basic stuff

serene scaffold
latent girder
#

this is the scope of study

tidal bough
#

yeah, that tracks, often courses that teach "data science" are meant for people who are studying for something business-related and have no prior programming experience.

latent girder
#

i was actually interested in learning the whole of it but it feels very slow since i have current work and cannot commit to more than 2 hrs of studying.

serene scaffold
#

the GUI, game, and web development parts look superfluous. but you want to aim to be actually good at python, not just eking out notebooks that no one else can run.

small ore
#

Corey Schafer on YouTube teaches more than enough Python for data science. (A student's opinion)

latent girder
#

xd sry so now im just interested in learning data science

#

What i mean is, can i actually do data science in python with just data science knowledge?

#

or should i also be able to build a website and stuffs in order to get by

#

idk if im asking the right question, feel free to correct me

serene scaffold
small ore
#

A little python (especially data structures, list comprehensions, etc.) will do if you already know conditionals, loops and functions. You just have to familiarize yourself with python syntax

odd meteor
latent girder
small ore
#

You will also need classes when you are doing custom transformers and stuff

latent girder
serene scaffold
latent girder
#

well do i need to learn this whole course or just those that covers data science?

serene scaffold
#

you don't need to learn GUI, game, or web development. but you do need to be capable with Python in general.

odd meteor
latent girder
#

and this

#

ill just youtube for missing stuffs

small ore
#

Basic python, and then switch to data science with a simpler module like scikit-learn (not TensorFlow directly), and then you can learn python as needed

serene scaffold
#

it's hard to know what you'd actually learn from doing whatever they're referring to, without seeing what the actual material/assignments are

small ore
#

Imo ignore those Ad like claims and decide as you learn

odd meteor
small ore
#

Oh, and Pandas, a couple of plotting tools (Matplotlib, seaborn, Plotly, etc.) and some basic numpy should help.

latent girder
latent girder
small ore
#

Meanwhile I have a clustering question

#

I tried kmeans on my data and the Silhouette score keeps dropping and dropping. No peaks to be found. I tried DBSCAN and it labels most of the data as -1. What am I doing wrong?
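
Without seeing the data this is only a guess, but two frequent causes are unscaled features (both methods are distance-based) and a DBSCAN `eps` that is too small, which turns most points into noise (label -1). A minimal sketch on synthetic blobs, which stand in for the real dataset; the 0.90 quantile used to pick `eps` is an arbitrary assumption:

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Synthetic data with three well-separated groups.
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=1.0, random_state=0)
X = StandardScaler().fit_transform(X)   # put all features on the same scale first

# Silhouette score for a range of k; a clear peak suggests a natural cluster count.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)

# Pick eps from the distance to each point's min_samples-th neighbour
# (the point itself counts, matching how DBSCAN counts min_samples).
min_samples = 5
distances, _ = NearestNeighbors(n_neighbors=min_samples).fit(X).kneighbors(X)
eps = float(np.quantile(distances[:, -1], 0.90))
db_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
noise_fraction = float(np.mean(db_labels == -1))

print(best_k, round(noise_fraction, 3))
```

If the silhouette score still has no peak after scaling, the data may simply not have compact, well-separated clusters, which is a valid result in itself.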

latent girder
#

Is this good tho? Ill just buy this if its good

past meteor
# latent girder Is this good tho? Ill just buy this if its good

in my opinion it's good if you want a fast refresher of the content or if you just want to learn how to use the libraries. I think I did this one when I was studying because my labs didn't use Python. I remember being under the impression that it doesn't teach it well enough for people that have 0 background.

latent girder
past meteor
latent girder
#

still looking for good stuffs

past meteor
#

Are you open to picking up a book?

latent girder
#

no xd

#

i like interactive learning

odd meteor
latent girder
buoyant kite
neat bluff
#

Hi everyone :) As usual, my lack of experience with LLMs proved itself once again (it doesn't make building a huge project any easier tbh).
I've built a data parser using LangChain and Claude 3 Opus. One of the issues I've encountered is that the output limit of 4096 tokens is very easy to hit. As far as I've researched, the limit is similar across the top LLM services, and the only exception is GPT4, which is pricier in every respect and also way worse at actually recovering data from text without making it up.

My question is: Which open-source models have a token output limit similar to GPT4 (8k+) or what could be the solution to my problem here? Thanks in advance.

hallow sphinx
#

But I still don't understand what is better in ipython

latent girder
buoyant kite
sturdy kiln
#

can somebody tell me what the fuck am i looking at

#

or a resource that will point me to the right direction on what any of this means

agile cobalt
sturdy kiln
#

its not really a library

#

im using ARIMA as a model but the summary itself isnt

#

its more or less EDA knowledge rather than python modules

#

for context, im trying to compare baseline ARIMA(1,1,1) to ARIMA(20,1,1) which are two different models, yeah the numbers change but i have no idea how to interpret it

fallen osprey
#

Hey, i am going to take aids in college, can anyone recommend specifications for buying a laptop

sturdy kiln
#

honestly anything, since colab exists. unless you want to do some local CNN training and need a beefy GPU (probably just go for a PC at that point), in which case probably one with CUDA support and a hefty CPU along with it

#

(dont take my word for it, might need a second or third opinion on it)

fallen osprey
sturdy kiln
#

also a lot of RAM, if you're going to do some local stuff it will eat up your RAM really quickly

sturdy kiln
#

i really dont have a baseline but probably 32 is a good number nowadays

fallen osprey
#

Ok

#

I will get 16 and upgrade it

#

To 32

sturdy kiln
#

sounds good, but if you're going to start doing any AI/ML/DL/DSA stuff, then most likely the college offering it will make you use cloud computing anyways

fallen osprey
sturdy kiln
#

cloud services like Google Colab, MS Azure and stuff like that, which all run on the cloud so local specs dont matter that much anyways

#

colab isnt technically an ML centric service but more of a py notebook one that can run ML stuff in average

#

but hey its free, and what i use extensively lol

#

but because its free, its got a lot of limitations

fallen osprey
sturdy kiln
sturdy kiln
# fallen osprey Like?

the standard free service for Colab has limited compute units, when you run out, you cant do any computations anymore for the day, you're also only given a limited amount of resources, iirc around 12-15gb of RAM, and 80GB of disk storage

sturdy kiln
#

but honestly, you wont run out if you're not doing that much heavy stuff

#

the only time i ran out of units is when i was doing a 3 day-long CNN session

#

which is extremely computationally heavy

fallen osprey
#

Oh

fallen osprey
sturdy kiln
#

you do know what a cloud compute service is, right?

fallen osprey
#

Am I right?

fallen osprey
small ore
#

The cloud is large server stacks, like the ones used to provide internet services/websites etc., but here a user is assigned a specific space (storage) and processor time/memory on one or several of its many CPUs and GPUs. You either submit your job there, or you connect remotely and operate virtually on that space

#

@fallen osprey

tidal bough
fallen osprey
tidal bough
#

i am still wondering what "take aids" means

sturdy kiln
#

its AI and Data Science

#

a course

#

honestly weird name to call it lol

tidal bough
#

can't wait for their next course, HIV (human informational values)

sturdy kiln
#

honestly just call it DSA (data science and analytics) because it all falls under that anyways lol

#

not to be confused with DSA (data structures and algorithms)

tidal bough
#

depending on the course I don't think you need a powerful computer? even if it includes some ML you can, yeah, probably get by with colab

fallen osprey
tidal bough
#

(also, a laptop which you can do decent ML on is going to be so expensive. you'd need a GPU...)

fallen osprey
sturdy kiln
tidal bough
#

colab would likely be a better idea

small ore
#

VRAM?

fallen osprey
tidal bough
small ore
#

Something like cache or external cards?

tidal bough
sturdy kiln
#

pretty sure if you have google, you have colab

fallen osprey
small ore
sturdy kiln
#

if youd like to know how computationally heavy ML/DL stuff are, people have built server farms and supercomputers just for it lol

tidal bough
# small ore Something like cache or external cards?

not sure what you mean. all GPUs have some integrated memory - which is pretty much exactly like normal RAM, but built into the GPU. For ML it matters a lot since you'd want to fit your model there for optimal performance.

fallen osprey
#

Is it ai digital labs?

tidal bough
#

i have no idea what that is

sturdy kiln
small ore
#

CPUS have cache memory on the chip. So trying to know if it is something like that or a 'card' outside the chip

fallen osprey
#

Yeah it's available

sturdy kiln
#

VRAM is a dedicated memory space for the GPU itself

fallen osprey
#

Ty

small ore
sturdy kiln
#

yeah you dont need a very expensive laptop to run colab on

fallen osprey
tidal bough
sturdy kiln
# small ore I get that

its embedded in the GPU itself if your asking, its nearby the chip itself but its not cache

#

cache and RAM are two different memory types

fallen osprey
#

I have one last question are there jobs for ai and ds?

sturdy kiln
#

although if youd want to excel in that field, youd have to excel in your knowledge of it

#

lots of programming/data science jobs dont rely on degrees

#

but more of what you can do instead

fallen osprey
#

I need to learn lot of maths?

sturdy kiln
#

hence the computer science slander lol

#

lots of people took CS thinking it's a free job because of the degree, not knowing you'd have to get a good internship, good recommendations or a good background to even get something barely good back

tidal bough
#

data science is basically applied statistics. hence, yes, "lot of maths".

small ore
#

Okay. I am trying to find what the VRAM on this laptop is like
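
One way to check this, as a sketch that only applies if the laptop has an NVIDIA GPU with the driver's `nvidia-smi` tool installed (an assumption; an integrated GPU normally has no dedicated VRAM and borrows system RAM instead). The helper names are made up for illustration.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory_in_MiB' lines into (name, MiB) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = line.rsplit(",", 1)
        mem = mem.strip()
        if mem.isdigit():  # skip entries such as "[N/A]"
            gpus.append((name.strip(), int(mem)))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for each GPU's name and total memory; [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for name, mib in query_gpus():
        print(f"{name}: {mib / 1024:.1f} GiB VRAM")
```

On a machine without `nvidia-smi` this prints nothing, which usually means the VRAM figure has to come from the OS display settings or the laptop's spec sheet instead.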

fallen osprey
tidal bough
sturdy kiln
#

or i might be misremembering

tidal bough
#

I think "APU" is an AMD-only term

sturdy kiln
#

ah well its a CPU with an IGPU inside anyways

neat bluff
tidal bough
#

yeah, it's what they call their new CPUs which have an unusually good iGPU

sturdy kiln
neat bluff
#

Back in the day, integrated GPUs sucked ass

#

That's what unusually good probably refers to

sturdy kiln
#

yeah i get that iGPUs sucked ass, but i didn't know they stopped sucking ass now lol

tidal bough
small ore
#

All I can get

tidal bough
#

(but still like 2x worse than a real GPU)

neat bluff
#

The iGPU in the new Ryzens can kick the ass of a 5-year-old GTX desktop card

tidal bough
#

hmm, not sure how to fact-check that

neat bluff
sturdy kiln
#

or anything modern related lol

small ore
#

Learner things I mean

neat bluff
sturdy kiln
#

this chat is probably getting off-topic now but whatevs

#

i have figured out what the top rows do, specifically the Information Criterion metrics, but still have no idea what the parameters or the residuals do

#

like what the fuck is Ljung-Box or heteroskedasticity
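
For reference, a hedged sketch of what those two rows usually mean, assuming this is a statsmodels-style time-series summary table (the chat does not say which library produced it). Ljung-Box checks whether the residuals are still autocorrelated, i.e. whether the model left predictable structure behind. Heteroskedasticity just means the residual variance is not constant over time. The Ljung-Box statistic is Q = n(n+2) * sum over k of r_k^2 / (n-k), where r_k is the lag-k autocorrelation of the residuals. The function names below are made up, and this is a from-scratch illustration, not the library's implementation.

```python
def autocorr(x: list[float], k: int) -> float:
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return num / denom


def ljung_box_q(resid: list[float], lags: int) -> float:
    """Ljung-Box Q statistic over lags 1..lags; large Q suggests autocorrelation."""
    n = len(resid)
    total = sum(autocorr(resid, k) ** 2 / (n - k) for k in range(1, lags + 1))
    return n * (n + 2) * total


if __name__ == "__main__":
    # Strongly alternating "residuals": clearly not white noise.
    resid = [1.0, -1.0] * 10
    print(ljung_box_q(resid, lags=1))
```

Q is compared against a chi-squared distribution; a small p-value means the residuals still contain structure the model missed, while a large p-value is what you want from a well-fitted model.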

neat bluff
#

Heterodeskadiscitiy? WTF

#

XDDDDD

small ore
#

No one has an answer to my original question on clustering though?

neat bluff
#

I haven't seen it. Can You tag it?

neat bluff
small ore
sturdy kiln
#

yeah most modern stuff now runs on GDDR6

#

which is extremely fuckin fast

neat bluff
neat bluff
neat bluff
#

It is actually, 4000 series RTX use them

sturdy kiln
#

commercially available?

#

oh hmm didnt know that

neat bluff
#
Gigabyte GeForce RTX 4080 SUPER WINDFORCE OC V2 16GB GDDR6X (GV-N408SWF3V2-16GD)
#

For the price of $1200

sturdy kiln
#

sheesh thats one expensive GPU