#data-science-and-ml | Python | Page 162

main fox Mar 23, 2025, 10:17 PM

#

Accept the limitations

solar siren Mar 24, 2025, 12:42 AM

#

i did both but i left linguistics behind after graduation to continue with cs

odd meteor Mar 24, 2025, 1:26 AM

#

The classification report screenshot you shared is related to a classification problem but you're mentioning R2 (coefficient of determination) which is related to regression task.

Confirm you're not making a mistake on your evaluation metrics of choice (R2)

charred light Mar 24, 2025, 1:30 AM

#

How should I tune a GAN to force a mode collapse? (with the hyperparameters batch_size, lr, n_epochs, hidden_dim, latent_dim, leaky relu t/f)

I tried a small epoch with small hidden/latent dims (100/10) but it fails to generate. (Only noise). Should I be increasing my epochs but keep the dims small?

odd meteor Mar 24, 2025, 1:31 AM

#

From this image, you have a class imbalance data. For example, out of 480 samples in your data, class 3 and 8 have just 1 and 6 examples respectively.

You need to handle the class imbalance problem. The model will most likely improve when you do that.

safe agate Mar 24, 2025, 2:59 AM

#

marimo is as an alternative Python notebook, there's also a workshop on marimo in this server soon.
https://discord.gg/python?event=1350928346422186065

haughty depot Mar 24, 2025, 4:22 AM

#

Has anyone tried to integrate concept of linformer into transformers from scratch?

#

I'm having issues with shapes after projection, and was wondering if anyone could direct me to right direction

river cape Mar 24, 2025, 7:23 AM

#

are there any free llms apis which i can use?

errant lake Mar 24, 2025, 7:26 AM

#

well all open source models are 'free'

#

as long as you can host one of them

rich river Mar 24, 2025, 7:50 AM

#

I need help
from torchvision.tv_tensors import Mask
Im using C++ and I really need this function
I have built torchvision from source on my machine
#include <torchvision/vision.h>
the vision.h file looks like this

#pragma once

#include <cstdint>
#include "macros.h"

namespace vision {
VISION_API int64_t cuda_version();

namespace detail {
extern "C" inline auto _register_ops = &cuda_version;
} // namespace detail
} // namespace vision

I tried vision::detail::tv_tensors::Masks(pred_masks)
but it just seems wrong

#

it seems namespace tv_tensors does not exist, and if I remove it, it will say namespace "vision::detail" has no member "Masks"

#

original python codes

        self._pred_masks = F.interpolate(
            model_output[0]["masks"][scores_mask], size=self._color_image.size[::-1]
        )
        self._pred_masks = torch.concat(
            [
                tv_Mask(torch.where(mask >= 0.5, 1, 0), dtype=torch.bool)
                for mask in self._pred_masks
            ]
        )

my C++ codes

  torch::Tensor const masks = torch::from_blob(
      model_output[3].GetTensorMutableData<double>(),
      {num_pred_bbox, 1, target_height_, target_width_}, torch::kFloat32);

  torch::Tensor pred_masks = torch::nn::functional::interpolate(
      masks.index({scores_mask, torch::indexing::Ellipsis}),
      torch::nn::functional::InterpolateFuncOptions().size(
          std::vector<int64_t>({input_height_, input_width_})));

  // mask is a two classificaiton problem
  // use 0.5 as mask confidence threshold
  pred_masks = pred_masks.ge(0.5).to(torch::kUInt8);

  std::vector<torch::Tensor> mask_vec;
  for (int i = 0; i < pred_masks.sizes()[0]; ++i) {
    mask_vec.push_back(pred_masks[i]);
  }
  pred_masks = torch::cat({mask_vec}, 0).squeeze();

but it seems just gives me wrong masks

jaunty helm Mar 24, 2025, 8:27 AM

#

river cape are there any free llms apis which i can use?

Mistral

#

Openrouter has some as well

weary timber Mar 24, 2025, 10:24 AM

#

river cape are there any free llms apis which i can use?

free deepseek r1 in openrouter

#

been using it with no issues

#

its a little slow tho

#

1k token prompt = 50sec response time

odd meteor Mar 24, 2025, 10:58 AM

#

river cape are there any free llms apis which i can use?

Yes. Check https://groq.com/

Go to the developers tab, click free API and continue from there.

Groq

Groq is Fast AI Inference

The LPU™ Inference Engine by Groq is a hardware and software platform that delivers exceptional compute speed, quality, and energy efficiency. Groq provides cloud and on-prem solutions at scale for AI applications.

patent plinth Mar 24, 2025, 11:05 AM

#

hi does anyone know how i can format python
in discord]

errant lake Mar 24, 2025, 11:19 AM

#

use triple backticks with py

#

```py

river cape Mar 24, 2025, 11:34 AM

#

thanks @errant lake @odd meteor @weary timber @jaunty helm

errant lake Mar 24, 2025, 11:34 AM

#

Np, I also learned the existence of all these free apis 😄

#

Didn't expect to find Deepseek r1 for free or groq for free, that's super cool

river cape Mar 24, 2025, 11:36 AM

#

errant lake Didn't expect to find Deepseek r1 for free or groq for free, that's super cool

So basically i can use these llms via their apis , which means if i send a prompt , it will go the llm present in groq's server and return me a response?

errant lake Mar 24, 2025, 11:38 AM

#

yeah, that's roughly what happens

#

if you want to inspect how that works in action, you can have a look at Ollama or LMstudio

#

you can host very small models even on a mchine with no GPU

#

Otherwise, it's just a classic rest api

#

you can also explore how that works with python as well, that's probably one of the most popular exercise

river cape Mar 24, 2025, 11:42 AM

#

oh yea i need to work on apis too , thanks 🙂

warm iron Mar 24, 2025, 3:32 PM

#

Guys need help with handling large amounts of data. I extracted features with VGG16 from brain CT scan images . 19GB of features and most of numbers are zeroes (I guess it’s all that black area around the brain) . How can I handle that? I can’t just load it all and pass it through a NN for classification.

weary timber Mar 24, 2025, 3:35 PM

#

river cape thanks <@176357312686981120> <@519319496868233227> <@1063131445792612383> <@2089...

np, i also was happy when i found deepseek r1 api for free

midnight rain Mar 24, 2025, 4:05 PM

#

has anyone worked with tkinter here?

#

like a professional in tkinter

opaque condor Mar 24, 2025, 8:38 PM

#

Is there a way of making it so that my python neural network can be graft so I can tell through each training that goes through how well it's doing also how long is usually an epoch cuz I wanted to test my network for a week at least

serene scaffold Mar 24, 2025, 9:04 PM

#

opaque condor Is there a way of making it so that my python neural network can be graft so I c...

Like, you want to visualize its performance in real time?

opaque condor Mar 24, 2025, 9:05 PM

#

Yes

opaque condor Mar 24, 2025, 9:26 PM

#

serene scaffold Like, you want to visualize its performance in real time?

Yes so I can physically tell what it is and it giving me a prediction for the entire training loop

serene scaffold Mar 24, 2025, 9:38 PM

#

opaque condor Yes so I can physically tell what it is and it giving me a prediction for the en...

physically? I don't know what that could mean in this context.
Training a neural network involves repeatedly calculating the loss, which is a measure of how incorrect your model currently is. You could plot the loss over time, and periodically re-render the plot.
There are probably libraries to give you a live dashboard of how the model training is going, but I've never used one.

agile cobalt Mar 24, 2025, 9:38 PM

#

yeah, if anything maybe try something like https://www.tensorflow.org/tensorboard

TensorFlow

TensorBoard | TensorFlow

A suite of visualization tools to understand, debug, and optimize TensorFlow programs for ML experimentation.

serene scaffold Mar 24, 2025, 9:39 PM

#

Is there one for pytorch?

agile cobalt Mar 24, 2025, 9:39 PM

#

TensorBoard also works with pytorch, https://pytorch.org/docs/stable/tensorboard.html

opaque condor Mar 24, 2025, 9:41 PM

#

serene scaffold physically? I don't know what that could mean in this context. Training a neural...

How should I make it so that I can put the loss function cuz I've never worked with matplotlib really

serene scaffold Mar 24, 2025, 9:49 PM

#

opaque condor How should I make it so that I can put the loss function cuz I've never worked w...

plotly >> matplotlib
but if you have the loss as a pandas Series, you can probably just use .plot.line()

solar siren Mar 24, 2025, 11:01 PM

#

pandas are very useful in this case and it is very easy to use

serene scaffold Mar 24, 2025, 11:15 PM

#

solar siren pandas are very useful in this case and it is very easy to use

what sorts of things have you used pandas for?

solar siren Mar 24, 2025, 11:16 PM

#

i used pandas to create csv files like grouping and merging from different csvs

opaque condor Mar 25, 2025, 12:36 AM

#

If I may how come when I try to train my neral network it mainly uses my CPU instead of using my GPU

hearty depot Mar 25, 2025, 1:23 AM

#

opaque condor Is there a way of making it so that my python neural network can be graft so I c...

wandb is p good

#

for this

hearty depot Mar 25, 2025, 1:23 AM

#

opaque condor If I may how come when I try to train my neral network it mainly uses my CPU ins...

u gotta specify it

faint quail Mar 25, 2025, 1:25 AM

#

gas?
https://www.youtube.com/watch?v=95eCPgBJuCw

YouTube

lol man

Rainbow Six Siege AI Aimbot Test (Yolo V3)

A yolo v3 model I trained from scratch using my own neural network framework, check out the github: github.com/TheonlyIcebear/PyStacks

The script struggles with moving targets and is likely due to discrepancies with mouse input and actual mouse movement in game. Since siege uses monitor distance sensitivity.

▶ Play video

opaque condor Mar 25, 2025, 1:40 AM

#

hearty depot u gotta specify it

I do specify I went into task manager when I was working with my no networking the CPU shot up more than the GPU so that's why I'm wondering

glacial root Mar 25, 2025, 1:53 AM

#

for anyone who has used the PIL library, is there a way to create grayscale rgba images (it would be from a three-dimensional matrix where there are 2 two-dimensional matrices, one being the image intensities (which makes it grayscale) and the other being the alpha channel)

#

i can't find anything on it when i search it up, nor is this specific case in the documentation

tawdry sleet Mar 25, 2025, 3:48 AM

#

Hello! sorry, i am a newbie. I have a few questions:
How does ML work?
I was trying to create a malware detection AI.
here is what i did

Get's a cool DB with 56 columns online
use scikit-learn and ExtraTreeClassifer to get most useful columns
Train A.I on those columns(Random forest classifier)
it scores 99%
Tries to use a real file(extract same features)
fails with most of the files.

serene scaffold Mar 25, 2025, 3:57 AM

#

tawdry sleet Hello! sorry, i am a newbie. I have a few questions: How does ML work? I was t...

the way ML works varies depending on the specific model type that you're talking about. generally speaking, they use "real" data to approximate functions.

#

Train A.I on those columns
this statement hides an absurd amount of information

tawdry sleet Mar 25, 2025, 4:03 AM

#

serene scaffold > >Train A.I on those columns this statement hides an absurd amount of informat...

This is the code

real_train, real_test, malware_train, malware_test = train_test_split(ml_data_new, labels, test_size=0.2)
random_classifier = RandomForestClassifier(n_estimators=50)

random_classifier.fit(real_train, malware_train)
random_classifier.score(real_test, malware_test)*100

serene scaffold Mar 25, 2025, 4:04 AM

#

tawdry sleet This is the code ```py real_train, real_test, malware_train, malware_test = trai...

so the random forrest classifier is a model. I sometimes say that a model is "an AI" when I talk to normies, but that's actually wrong. nothing is "an AI".

do you know what x and y data are in the context of ML?

tawdry sleet Mar 25, 2025, 4:05 AM

#

serene scaffold so the random forrest classifier is a model. I sometimes say that a model is "an...

I assume x is the data i provide, and y is the thing i want to get predicted?

serene scaffold Mar 25, 2025, 4:06 AM

#

tawdry sleet I assume x is the data i provide, and y is the thing i want to get predicted?

that's pretty much correct. do you know what the time is that you have to provide the y data that you want to get predicted?

tawdry sleet Mar 25, 2025, 4:09 AM

#

serene scaffold that's pretty much correct. do you know what the time is that you have to provid...

Sorry i don't get your question. It's during the time when i train the model. right?

serene scaffold Mar 25, 2025, 4:09 AM

#

tawdry sleet Sorry i don't get your question. It's during the time when i train the model. ri...

It seems you did get the question--that's correct.

#

real_train, real_test, malware_train, malware_test

why are these prefixed with real_ and malware_?

tawdry sleet Mar 25, 2025, 4:11 AM

#

serene scaffold ```py real_train, real_test, malware_train, malware_test ``` why are these prefi...

real is that contains legitimate files data.
malware contains malware files data.

serene scaffold Mar 25, 2025, 4:12 AM

#

tawdry sleet real is that contains legitimate files data. malware contains malware files data...

you seem to be confusing x with real and y with malware.

#

is the point that the model should be able to distinguish safe programs vs malware programs?

tawdry sleet Mar 25, 2025, 4:13 AM

#

yes

serene scaffold Mar 25, 2025, 4:14 AM

#

there needs to be a mix of both safe instances and malware instances in both the train and test data. can you explain why that is?

wise marlin Mar 25, 2025, 4:17 AM

#

Was given a take home JQR (Job Qualification Requirements) based on a electronic retail store with multiple sheets in excel file with different things about company. What is the most efficient way to go through all the sheets as csv files loaded into python in order to choose the best columns to merge into separate dataset in order to perform analysis? TIA just wanting to see if there are any faster ways

serene scaffold Mar 25, 2025, 4:18 AM

#

wise marlin Was given a take home JQR (Job Qualification Requirements) based on a electroni...

how many sheets are there?

wise marlin Mar 25, 2025, 4:18 AM

#

serene scaffold how many sheets are there?

8 in total, different columns types etc...

serene scaffold Mar 25, 2025, 4:19 AM

#

wise marlin 8 in total, different columns types etc...

how many columns in each sheet?

wise marlin Mar 25, 2025, 4:19 AM

#

serene scaffold how many columns in each sheet?

varied

serene scaffold Mar 25, 2025, 4:19 AM

#

wise marlin varied

I'm asking you to tell me how many in 1, 2, 3, etc.

tawdry sleet Mar 25, 2025, 4:20 AM

#

serene scaffold there needs to be a mix of both safe instances and malware instances in both the...

so it can correctly train the model and test it.
if there is only safe instances during training it wouldn't learn about the malware.
right...?

shadow atlas Mar 25, 2025, 4:20 AM

#

Anyone here who may have used CrewAI?

serene scaffold Mar 25, 2025, 4:20 AM

#

tawdry sleet so it can correctly train the model and test it. if there is only safe instance...

that's right.

hearty depot Mar 25, 2025, 4:39 AM

#

tawdry sleet Hello! sorry, i am a newbie. I have a few questions: How does ML work? I was t...

have u

#

tried testing the validation accuracy cuz ur model might be overfitting

wise marlin Mar 25, 2025, 4:42 AM

#

serene scaffold I'm asking you to tell me how many in 1, 2, 3, etc.

15, 12, 15, 11, 5, 16, 4

odd meteor Mar 25, 2025, 7:14 AM

#

opaque condor I do specify I went into task manager when I was working with my no networking t...

To be able to train your NN on GPU, you need to

Have a machine with GPU (not all GPU works, for example if your machine has Iris XE GPU from Intel instead of Nvidia GPU variants, you won't be able to utilize the GPU to train your NN due to absence of CUDA)
If you've confirmed your machine has NVIDIA GPU, you need to also ensure you installed PyTorch that comes with GPU compatibility. https://pytorch.org/get-started/locally/
Once you've checked 1 and 2, you need to use the .to(device) to train on GPU ( https://pytorch.org/docs/stable/generated/torch.Tensor.to.html )

PyTorch

Start Locally

odd meteor Mar 25, 2025, 7:21 AM

#

shadow atlas Anyone here who may have used CrewAI?

Don't ask question to ask question. If you had been more explicit on what exactly you need help with on CrewAI, you'd have probably gotten a much faster response by now.

shadow atlas Mar 25, 2025, 7:41 AM

#

odd meteor Don't ask question to ask question. If you had been more explicit on what exactl...

I'm sorry about it

#

I'm trying to make the use of tools in crewai specifically using my google Gemini API but somehow i am unable to run it. It keeps giving issues regarding wrong API key while it is working perfectly fine when running without the tool. Has anyone else used the tools with Gemini API in CrewAI?

vocal cove Mar 25, 2025, 8:04 AM

#

@serene scaffold You're gonna love this sir.
https://github.com/SesameAILabs/csm

GitHub

GitHub - SesameAILabs/csm: A Conversational Speech Generation Model

A Conversational Speech Generation Model. Contribute to SesameAILabs/csm development by creating an account on GitHub.

#

I like the approach because they utilize transformers to tackle prosodic speech. It's really brilliant.

#

Demo is here: https://www.sesame.com/research/crossing_the_uncanny_valley_of_voice#demo

Sesame

Crossing the uncanny valley of conversational voice

At Sesame, our goal is to achieve “voice presence”—the magical quality that makes spoken interactions feel real, understood, and valued.

river cape Mar 25, 2025, 10:18 AM

#

opaque condor I do specify I went into task manager when I was working with my no networking t...

if you have a gpu , then install cuda and cudnn , i think its only for nvidia graphic cards but yea the better choice would be to use colab

opaque condor Mar 25, 2025, 10:46 AM

#

odd meteor To be able to train your NN on GPU, you need to 1. Have a machine with GPU (not...

GPU: Intel(R) UHD Graphics 630

odd meteor Mar 25, 2025, 10:58 AM

#

opaque condor GPU: Intel(R) UHD Graphics 630

Unfortunately, with Intel(R) UHD Graphics 630, your laptop doesn't support local GPU training for neural networks.

Leider unterstützt dein Laptop kein lokales GPU-Training für neural Networks 😟

opaque condor Mar 25, 2025, 11:05 AM

#

odd meteor Unfortunately, with Intel(R) UHD Graphics 630, your laptop doesn't support local...

It's not my laptop it's my desktop computer

#

Okay so does any one of these on this list have anything like cuad?

#

odd meteor Mar 25, 2025, 11:16 AM

#

opaque condor Okay so does any one of these on this list have anything like cuad?

Do you mean CUDA?

rugged stream Mar 25, 2025, 11:25 AM

#

Hi folks, i have an interview/technical assessment coming up for a data analytics position in a major high street bank and i am looking for study focuses, i will be using:

spreadsheets
SQL
Python

Statistics
Probability
Linear Algebra
Quadratics
Polynomials
Calculus

Any prep resources would be greatly appreciated,

Ty!

opaque condor Mar 25, 2025, 11:29 AM

#

odd meteor Do you mean CUDA?

Yes

hearty depot Mar 25, 2025, 1:18 PM

#

opaque condor

Y need NVIDIA for cuda

opaque condor Mar 25, 2025, 1:20 PM

#

hearty depot Y need NVIDIA for cuda

My CPU is being eaten up by my machine learning and I wanted my machine learning to go quick it's still quick but I want to also work on other things too

hearty depot Mar 25, 2025, 1:20 PM

#

opaque condor My CPU is being eaten up by my machine learning and I wanted my machine learning...

I meant you need NVIDIA for cuda

#

Also cloud gpus might be a better option

opaque condor Mar 25, 2025, 1:32 PM

#

I want something like physically can have for my machine because Google does not like me for some reason so if I can add it to my machine I'll take whatever I can

opaque condor Mar 25, 2025, 1:50 PM

#

hearty depot I meant you need NVIDIA for cuda

Sorry

glacial root Mar 25, 2025, 2:43 PM

#

has anyone here tried to implement AlphaDog, if so, do you guys also have the issue of the image not properly forming

#

import numpy as np
from PIL import Image

rgb_ai_image = Image.open('img_data/cars/carsgraz_076.bmp')
rgb_human_image = Image.open('img_data/bikes/bike_112.bmp')

rgb_ai_image_matrix = np.array(rgb_ai_image)
rgb_human_image_matrix = np.array(rgb_human_image)

ai_image_matrix = (0.299 * rgb_ai_image_matrix[:, :, 0]) + (0.587 * rgb_ai_image_matrix[:, :, 1]) + (0.114 * rgb_ai_image_matrix[:, :, 2])
human_image_matrix = (0.299 * rgb_human_image_matrix[:, :, 0]) + (0.587 * rgb_human_image_matrix[:, :, 1]) + (0.114 * rgb_human_image_matrix[:, :, 2])

ai_image_matrix = ai_image_matrix.astype(np.uint8)
human_image_matrix = human_image_matrix.astype(np.uint8)

ai_image = Image.fromarray(ai_image_matrix, 'L')
human_image = Image.fromarray(human_image_matrix, 'L')

normalized_ai_image_matrix = ai_image_matrix / 255
normalized_human_image_matrix = human_image_matrix / 255

one = np.ones((ai_image_matrix.shape))
attack_image_alpha = ((normalized_human_image_matrix - one) / (normalized_ai_image_matrix - one)) * 255
attack_image_alpha = attack_image_alpha.astype(np.uint8)

attack_image_matrix = np.empty((ai_image_matrix.shape[0], ai_image_matrix.shape[1], 4), dtype = np.uint8)
attack_image_matrix[:, :, 0] = ai_image_matrix
attack_image_matrix[:, :, 1] = ai_image_matrix
attack_image_matrix[:, :, 2] = ai_image_matrix
attack_image_matrix[:, :, 3] = attack_image_alpha
attack_image = Image.fromarray(attack_image_matrix, 'RGBA')

rgb_ai_image.close()
rgb_human_image.close()```

agile cobalt Mar 25, 2025, 2:49 PM

#

I'd recommend sharing Notebooks via GitHub (whenever it's inside of a repository or just a Gist) or Colab instead of just uploading the file to discord

glacial root Mar 25, 2025, 2:49 PM

#

agile cobalt I'd recommend sharing Notebooks via GitHub (whenever it's inside of a repository...

yeah i just realized that's probably better so i sent the code above

#

it's a pretty short program though so i just directly sent it instead of using gist

#

by the way the ai image is the what the ai should see and the human image is what people should see

#

here's the formula by the way

#

I_eye is the image seen by people

#

A is the alpha channel matrix

#

I_in is the grayscale image without alpha channel

#

and BKG is the background color (typically just 1 cause it's a white background)

glacial root Mar 25, 2025, 2:53 PM

#

glacial root ```py import numpy as np from PIL import Image rgb_ai_image = Image.open('img_d...

not sure if i'm doing something wrong or if it's just a limitation of the formula itself

agile cobalt Mar 25, 2025, 2:57 PM

#

pretty sure that your formula is different from the paper?
in first place you have no reference to the x 0.8 + 0.2 and x 0.2

vestal moth Mar 25, 2025, 3:56 PM

#

So I have a 3D pose estimation model with the x, y, z coordinates for each bodypart derived in h36m format. I want to smooth it using a butterworth filter to prevent some frames that look glitchy. How does that look? Do I need to apply the filter for each body part x, y, z coordinate respectively? For reference, h36m has 17 body parts so the filter would be used 51 times with this logic, I'm not sure if that's overkill. Should this filter be done with the 2D estimates instead, lowering it to 34, especially since the model is compiling the z coordinates by "jittering" the x and y positions.

safe agate Mar 25, 2025, 4:57 PM

#

Another option is marimo, you can share a link to your notebook.

glacial root Mar 25, 2025, 5:01 PM

#

agile cobalt pretty sure that your formula is different from the paper? in first place you ha...

oh i didn't look at that part yet, i was gonna try to implement it myself first before looking at the pseudo code

#

not sure where that comes from though

glacial root Mar 25, 2025, 5:01 PM

#

glacial root

cause this is the actual formula

river cape Mar 25, 2025, 5:02 PM

#

opaque condor

i think gtx 1050 ti is cuda-compactible check it out , although I would suggest using colab

sudden turret Mar 25, 2025, 7:06 PM

#

i'm having basically the same issue currently actually. did you ever find a solution to this?

solar siren Mar 25, 2025, 11:43 PM

#

that cuda-compactible thing stopped me from using different stuff also, is it possible to enable it somehow for every nvidia gpu

spring field Mar 26, 2025, 12:07 AM

#

solar siren that cuda-compactible thing stopped me from using different stuff also, is it po...

for any one that supports it, sure

solar siren Mar 26, 2025, 12:08 AM

#

i don't really have good knowledge on gpu's but are there any new nvidia gpu that doesn't supports it ?

spring field Mar 26, 2025, 12:09 AM

#

I don't know, but I'd find that unlikely

serene scaffold Mar 26, 2025, 1:01 AM

#

solar siren i don't really have good knowledge on gpu's but are there any new nvidia gpu tha...

I don't think so.

solar siren Mar 26, 2025, 1:02 AM

#

if i buy new nvidia gpu does it have cuda toolkit preinstalled in it or do i have to install it

serene scaffold Mar 26, 2025, 1:03 AM

#

solar siren if i buy new nvidia gpu does it have cuda toolkit preinstalled in it or do i hav...

Idk what that is. I've been able to download pytorch and it works right away.

solar siren Mar 26, 2025, 1:04 AM

#

nice it means that is ready to use with CUDA, because two weeks ago i tried to use Wan 2.1 with nvidia gpu and it asked me to install it's toolkit to proceed

serene scaffold Mar 26, 2025, 1:05 AM

#

solar siren nice it means that is ready to use with CUDA, because two weeks ago i tried to u...

What is Wan?

solar siren Mar 26, 2025, 1:05 AM

#

that new video generator made from ali baba team which is completely free to use locally

serene scaffold Mar 26, 2025, 1:08 AM

#

Did you pip install diffusers?

solar siren Mar 26, 2025, 1:25 AM

#

no i was trying to use it inside pinokio which is something like virtual environment

iron basalt Mar 26, 2025, 1:26 AM

#

solar siren if i buy new nvidia gpu does it have cuda toolkit preinstalled in it or do i hav...

No, you need to install it. You may only need the CUDA SDK rather than the entire toolkit.

#

The CUDA SDK contains the CUDA compiler based on LLVM, this compiler compiles CUDA kernels which get sent to the Nvidia driver (which you also need installed to do anything with the GPU).

solar siren Mar 26, 2025, 1:28 AM

#

iron basalt The CUDA SDK contains the CUDA compiler based on LLVM, this compiler compiles CU...

is it compatible with all nvidia gpu's because i was trying it in my friend's pc who has that gpu

iron basalt Mar 26, 2025, 1:28 AM

#

solar siren is it compatible with all nvidia gpu's because i was trying it in my friend's pc...

Depends how old, but It's unlikely you are using an Nvidia GPU that old.

#

CUDA is Nvidia's thing for all their GPUs for a while now and moving forward.

#

It's nothing special, it's just the API.

#

It's not any different really from Vulkan, OpenCL, etc at a fundamental level.

chilly oar Mar 26, 2025, 1:33 AM

#

how'd y'all suggest this book to brush up some concepts and learn stats for ML
https://probabilitycourse.com/

solar siren Mar 26, 2025, 1:50 AM

#

iron basalt It's not any different really from Vulkan, OpenCL, etc at a fundamental level.

thank you very much for information i will try that sdk as you said and it will eventualy work since that gpu isn't that old as you said, big respect for you for clearing that out

hallow badger Mar 26, 2025, 2:03 AM

#

ai safety forever

iron basalt Mar 26, 2025, 2:06 AM

#

chilly oar how'd y'all suggest this book to brush up some concepts and learn stats for ML h...

Seems ok.

chilly oar Mar 26, 2025, 2:10 AM

#

iron basalt Seems ok.

Feel free to recommend better sources if you have any

serene grail Mar 26, 2025, 6:23 AM

#

chilly oar Feel free to recommend better sources if you have any

I personally like Statquest YouTube channel, he has a nice playlist for stat fundamentals but those are videos, not a book

warm iron Mar 26, 2025, 7:02 AM

#

The state of my coding skills… need your advice.

So I’ve been coding for a while , I only deal with artificial intelligence so for me I mostly work with certain libraries like pandas, numpy , os and more , and I deal with CNN , NN architectures.

For example when I need to work with a data frame and do a certain thing(I don’t know how to do it yet) I ask chat GPT and to teach me and show me how to do it. The thing is, most of the time I can understand the code and the logic and how it works (although sometimes I meet something I don’t understand, for example why this variable is here).

So I can understand most of the code I get form chat GPT but I can’t write it on my own , I kinda often forget the steps or the syntax. In my opinion it’s the lack of knowledge of certain libraries.

Does everyone get to this point in learning and once you overcome you become mostly independent in programming?

glad pagoda Mar 26, 2025, 7:10 AM

#

guys i need help of the AI people here
i have an assignment to make a recipie generator but i have no idea where to get the dataset from

#

https://discord.com/channels/267624335836053506/1354353642509045781

serene grail Mar 26, 2025, 7:45 AM

#

glad pagoda guys i need help of the AI people here i have an assignment to make a recipie ge...

Maybe check Kaggle for datasets

glad pagoda Mar 26, 2025, 8:05 AM

#

what kind of dataset would even be used ehre

#

here

jaunty helm Mar 26, 2025, 8:11 AM

#

warm iron The state of my coding skills… need your advice. So I’ve been coding for a whil...

personally I know a few common functions by heart, then just look the nicher ones up when I need them
like you should def know how to loc iloc, combine dfs, reshape arrays, etc
but I doubt many even know about what stride tricks is

#

to improve, just write more code
by the 100th time you loc and iloc again I doubt you'll forget how to index for a long time

warm iron Mar 26, 2025, 8:34 AM

#

Yeah I’ve been doing that , thanks

river cape Mar 26, 2025, 9:43 AM

#

warm iron The state of my coding skills… need your advice. So I’ve been coding for a whil...

I think if you know how to manipulate the data like the shape , locations and basic stuff is more than enough , coz i just know that and if i need help , i use chatgpt but yea i do understand the why and what behind a problem .. so overall if you know your basics thoroughly , you should be in a pretty good position

zealous girder Mar 26, 2025, 10:50 AM

#

hello guys, I had a question on how to integrate multiple languages. Suppose I have written the backend in Go, but I want to add a feature for some recommendation/generation. So I used python to make an ML model for it, so how do I integrate this ML model made in python into my Go backend?

mortal star Mar 26, 2025, 11:03 AM

#

zealous girder hello guys, I had a question on how to integrate multiple languages. Suppose I h...

Seems it’s easier for you to develop additional microservice for your ML model and utilize it from your Go backend.

arctic wedgeBOT Mar 26, 2025, 9:00 PM

#

~~Please react with ✅ to upload your file(s) to our paste bin, which is more accessible for some users.~~

glacial root Mar 26, 2025, 9:10 PM

#

is graph dbms useful for ml

serene scaffold Mar 27, 2025, 12:26 AM

#

glacial root is graph dbms useful for ml

Depends on what you're trying to do.

acoustic seal Mar 27, 2025, 12:45 AM

#

can anyone help me with this? im trying to install tensorflow but it just doesn't exist?

ERROR: Could not find a version that satisfies the requirement tensorflow[and-cuda] (from versions: none)
ERROR: No matching distribution found for tensorflow[and-cuda]```

small wedge Mar 27, 2025, 12:58 AM

#

acoustic seal can anyone help me with this? im trying to install tensorflow but it just doesn'...

Did you try putting quotes around it

glacial root Mar 27, 2025, 1:00 AM

#

serene scaffold Depends on what you're trying to do.

for nlp it's good right

serene scaffold Mar 27, 2025, 1:00 AM

#

glacial root for nlp it's good right

to do what?

glacial root Mar 27, 2025, 1:00 AM

#

because we can organize words based on semantics

#

for organizing the data

glacial root Mar 27, 2025, 1:01 AM

#

acoustic seal can anyone help me with this? im trying to install tensorflow but it just doesn'...

is your venv activated?

serene scaffold Mar 27, 2025, 1:02 AM

#

I've used graph databases to represent data that's relational rather than tabular

glacial root Mar 27, 2025, 1:03 AM

#

serene scaffold I've used graph databases to represent data that's relational rather than tabula...

i see

#

so then it would make sense for organizing language data based on semantics

#

they should call relational dbms as tabular dbms instead

serene scaffold Mar 27, 2025, 1:03 AM

#

that's what I do.

glacial root Mar 27, 2025, 1:06 AM

#

serene scaffold that's what I do.

do you think this would be a good project to work on, making an etl pipeline from a graph dbms that stores language data organized by semantics that send data to a llm built from scratch?

acoustic seal Mar 27, 2025, 1:06 AM

#

glacial root is your venv activated?

yes its active

glacial root Mar 27, 2025, 1:06 AM

#

oh i'm not sure then

acoustic seal Mar 27, 2025, 1:06 AM

#

small wedge Did you try putting quotes around it

lemme try

it failed

glacial root Mar 27, 2025, 1:08 AM

#

acoustic seal can anyone help me with this? im trying to install tensorflow but it just doesn'...

what exactly is the [and-cuda] part

acoustic seal Mar 27, 2025, 1:09 AM

#

cuda is for using the hardware for processing

but the thing is, even installing tensorflow doesn't work

glacial root Mar 27, 2025, 1:09 AM

#

i see

serene scaffold Mar 27, 2025, 1:09 AM

#

glacial root do you think this would be a good project to work on, making an etl pipeline fro...

can you give a concrete example of what you have in mind?

acoustic seal Mar 27, 2025, 1:10 AM

#

" llm built from scratch? "

hm

glacial root Mar 27, 2025, 1:10 AM

#

serene scaffold can you give a concrete example of what you have in mind?

i'm not really sure, i was just thinking of general purpose

#

i know nothing about nlp yet and it's not my main field, but it's an interesting field and so i want to learn a bit about it

serene scaffold Mar 27, 2025, 1:11 AM

#

acoustic seal " llm built from scratch? " hm

creating something from scratch that you would consider an LLM takes an absurd amount of computation power and data. only a handfull of large companies have the resources to do this.

acoustic seal Mar 27, 2025, 1:11 AM

#

glacial root i know nothing about nlp yet and it's not my main field, but it's an interesting...

mind if i advice you on learning nlp?

glacial root Mar 27, 2025, 1:12 AM

#

acoustic seal mind if i advice you on learning nlp?

yeah that would be great

glacial root Mar 27, 2025, 1:12 AM

#

serene scaffold creating something from scratch that you would consider an LLM takes an absurd a...

yeah i didn't mean full on chatgpt

#

like just something small

acoustic seal Mar 27, 2025, 1:12 AM

#

glacial root yeah i didn't mean full on chatgpt

fun fact: that's what it is

glacial root Mar 27, 2025, 1:13 AM

#

i thought all chatbots used a large language model

acoustic seal Mar 27, 2025, 1:13 AM

#

glacial root yeah that would be great

try learning the pre processing part first.

and if you want a project to build with, try building a topic modelling project using bert and lda.

once you are through with this, pretty sure you'd have a great idea

#

bert comes under deep learning, lda is simple ml

small wedge Mar 27, 2025, 1:14 AM

#

acoustic seal lemme try it failed

What version of python are you running

glacial root Mar 27, 2025, 1:14 AM

#

acoustic seal try learning the pre processing part first. and if you want a project to build...

so i shouldn't try it from scratch?

#

i've tried a neural network from scratch and it helped me learn a lot about the inner workings

acoustic seal Mar 27, 2025, 1:15 AM

#

small wedge What version of python are you running

3.13.2

compiling 3.8 for fedora for the same purpose

acoustic seal Mar 27, 2025, 1:15 AM

#

glacial root so i shouldn't try it from scratch?

harder cause requires a lot of computational power. maybe lda? but going through with the arch would help

small wedge Mar 27, 2025, 1:15 AM

#

3.13 support isn't out on pypi yet

acoustic seal Mar 27, 2025, 1:15 AM

#

small wedge 3.13 support isn't out on pypi yet

ah

#

that makes sense, thanks for the help.

glacial root Mar 27, 2025, 1:16 AM

#

acoustic seal harder cause requires a lot of computational power. maybe lda? but going through...

i see

#

going through with the arch?

serene scaffold Mar 27, 2025, 1:16 AM

#

glacial root like just something small

look into what was required to train GPT-1--how many parameters, how much GPU memory, how much training data, etc. That would give you a sense for what the lower bound is for an "L" LM.

spring field Mar 27, 2025, 1:17 AM

#

glacial root so i shouldn't try it from scratch?

you can build a transformer from scratch, sure, that's probably not a bad exercise
but you simply do not have enough compute and information to train it from scratch nowhere near the levels of what LLMs, even say GPT2 can do

glacial root Mar 27, 2025, 1:17 AM

#

serene scaffold look into what was required to train GPT-1--how many parameters, how much GPU me...

oh

#

so not all chatbots are llms

acoustic seal Mar 27, 2025, 1:17 AM

#

glacial root going through with the arch?

architecture

spring field Mar 27, 2025, 1:17 AM

#

glacial root so not all chatbots are llms

Never forget Clippy, lol

acoustic seal Mar 27, 2025, 1:18 AM

#

clippy my beloved

glacial root Mar 27, 2025, 1:18 AM

#

so then are the less computationally heavy ones called slms or mlms?

serene scaffold Mar 27, 2025, 1:18 AM

#

glacial root so not all chatbots are llms

depends on what you consider a chatbot. Eliza was the OG. https://en.wikipedia.org/wiki/ELIZA

ELIZA

ELIZA is an early natural language processing computer program developed from 1964 to 1967 at MIT by Joseph Weizenbaum. Created to explore communication between humans and machines, ELIZA simulated conversation by using a pattern matching and substitution methodology that gave users an illusion of understanding on the part of the program, but ha...

glacial root Mar 27, 2025, 1:18 AM

#

for small/medium language models

spring field Mar 27, 2025, 1:18 AM

#

or ELIZA

serene scaffold Mar 27, 2025, 1:18 AM

#

glacial root so then are the less computationally heavy ones called slms or mlms?

the L in LLM has lost all meaning.

glacial root Mar 27, 2025, 1:18 AM

#

i see

#

so it doesn't actually mean large in computational power required

spring field Mar 27, 2025, 1:19 AM

#

MLM is what you'd almost call a scam 😁

serene scaffold Mar 27, 2025, 1:19 AM

#

the first person to call their LM an LLM just wanted to bring the point home that it was large. I guess.

acoustic seal Mar 27, 2025, 1:19 AM

#

no L refers to to large amount of data it went through during training

glacial root Mar 27, 2025, 1:19 AM

#

oh

serene scaffold Mar 27, 2025, 1:19 AM

#

acoustic seal no L refers to to large amount of data it went through during training

what's the cutoff?

acoustic seal Mar 27, 2025, 1:19 AM

#

when you say damn thats a lot

glacial root Mar 27, 2025, 1:19 AM

#

i'm assuming there's not an explicit cutoff

spring field Mar 27, 2025, 1:20 AM

#

acoustic seal no L refers to to large amount of data it went through during training

does the M refer to that then?

glacial root Mar 27, 2025, 1:20 AM

#

kind of like how there's not an explicit cutoff between regular ml models and dl models

acoustic seal Mar 27, 2025, 1:20 AM

#

mm that sounds about right

#

LMAO

serene scaffold Mar 27, 2025, 1:20 AM

#

So, the L in LLM doesn't signify a non-arbitrary distinction between LMs that aren't designated as LLMs. Which is why I advocate for just dropping the first L.

small wedge Mar 27, 2025, 1:21 AM

#

I feel this way about the term deep learning

glacial root Mar 27, 2025, 1:21 AM

#

i guess people think that lm doesn't have the same ring to it as llm

small wedge Mar 27, 2025, 1:21 AM

#

At this point it's sorta just a catchall for any modern ml model

glacial root Mar 27, 2025, 1:21 AM

#

so dl to ml is like llm to lm

small wedge Mar 27, 2025, 1:22 AM

#

Imo pretty much yeah

spring field Mar 27, 2025, 1:22 AM

#

serene scaffold what's the cutoff?

I mean... to be fair, there's no clear cutoff between homo sapiens and whatever came before yet we somewhat clearly have defined both species
though of course the field of AI is ironically rather lacking in the taxonomy department...

serene scaffold Mar 27, 2025, 1:22 AM

#

glacial root i guess people think that lm doesn't have the same ring to it as llm

fundamentally, a language model is just a probability distribution of token sequences. that something is a language model tells you nothing about how it's implemented. So I suppose one could say that LLMs are language models that depend on the transformer architecture.

glacial root Mar 27, 2025, 1:23 AM

#

i really gotta learn the theory and terminology

#

i have no idea what architecture means in the context

serene scaffold Mar 27, 2025, 1:23 AM

#

but that includes language models that aren't generative, and generating is the main thing that people think LLMs are supposed to do.

glacial root Mar 27, 2025, 1:24 AM

#

serene scaffold but that includes language models that aren't generative, and generating is the ...

an example of this would be sentiment analysis right

small wedge Mar 27, 2025, 1:24 AM

#

GPT is probably a more useful term

glacial root Mar 27, 2025, 1:24 AM

#

generative preprocessing transformer right

small wedge Mar 27, 2025, 1:24 AM

#

Pretrained

glacial root Mar 27, 2025, 1:24 AM

#

oh crap

serene scaffold Mar 27, 2025, 1:24 AM

#

glacial root an example of this would be sentiment analysis right

sentiment analysis is type of problem. LLMs are a type of model.

glacial root Mar 27, 2025, 1:25 AM

#

oh

#

i meant language models that perform sentiment analysis

serene scaffold Mar 27, 2025, 1:25 AM

#

you could adapt a language model for that purpose.

small wedge Mar 27, 2025, 1:26 AM

#

Even the line between a generative/nongenerative model seems pretty blurry like

#

It doesn't actually say much about the model itself

#

Just how we use its output

serene scaffold Mar 27, 2025, 1:27 AM

#

small wedge It doesn't actually say much about the model itself

a model that is designed to produce instances that are the same kind of thing as the training data.

small wedge Mar 27, 2025, 1:29 AM

#

What about like a GAN

glacial root Mar 27, 2025, 1:29 AM

#

small wedge Just how we use its output

i see, so it's like how a convolutional neural network can be used either to detect objects or to classify images into categories (though these are probably the same thing, just kind of opposites of each other) but the inner workings of the cnn are almost the same

small wedge Mar 27, 2025, 1:30 AM

#

The generator there doesn't get any training data it just produces an image from noise and is scored by the discriminator, does that count as producing the same kind of thing as it's input because it takes a set of pixels and outputs a set of pixels?

small wedge Mar 27, 2025, 1:31 AM

#

glacial root i see, so it's like how a convolutional neural network can be used either to det...

Stel knows more about this than me and he seems to disagree so idk

glacial root Mar 27, 2025, 1:31 AM

#

so a gan is a model that continuously trains by creating its own data

small wedge Mar 27, 2025, 1:32 AM

#

A gan is a system where two models train based on each other's outputs

glacial root Mar 27, 2025, 1:32 AM

#

i see

small wedge Mar 27, 2025, 1:32 AM

#

A generator makes an image from noise and a discriminator is given the generated image and a real image from the dataset, the score of each model is based on whether the discriminator can pick out the fake one or not

glacial root Mar 27, 2025, 1:33 AM

#

so this is a way of training against adversarial attacks

small wedge Mar 27, 2025, 1:33 AM

#

No

#

Different usage of the word adversarial there

#

In a gan adversarial just means they compete and influence each other's scores

#

An adversarial attack is something you do to break a models normal function, like wearing patterns it might recognize as faces to trick a facial detection model to fail at finding yours

glacial root Mar 27, 2025, 1:54 AM

#

small wedge In a gan adversarial just means they compete and influence each other's scores

oh i see

glacial root Mar 27, 2025, 1:55 AM

#

small wedge An adversarial attack is something you do to break a models normal function, lik...

yeah

#

recently i've read a research paper on adversarial attacks that utilize the alpha channel of grayscale rgba images in order to make these attacks universal and eliminate the need for queries

#

it's called AlphaDog

fervent canopy Mar 27, 2025, 7:26 AM

#

Web tool for training image classifiers with webcam/upload support and real-time preview. https://github.com/SanshruthR/Morphos

GitHub

GitHub - SanshruthR/Morphos: Web tool for training image classifier...

Web tool for training image classifiers with webcam/upload support and real-time preview. - SanshruthR/Morphos

hallow badger Mar 27, 2025, 1:26 PM

#

hallow badger Mar 27, 2025, 1:26 PM

#

hallow badger

so fun

slim storm Mar 27, 2025, 1:55 PM

#

can sklearn's KNNImputer impute categorical value by selecting the most frequent value from the neighbors?

slim storm Mar 27, 2025, 2:48 PM

#

or rather can i tweak the SimpleImputer to only select the most frequent value from the nearest neighbors?

hoary wigeon Mar 27, 2025, 5:32 PM

#

Hey there, I need help to resolve below error.

Traceback (most recent call last):
  File "C:\Users\cmx\OneDrive\Documents\GitHub\project-x001\background_replacement.py", line 3, in <module>
    import mediapipe as mp
  File "C:\Users\cmx\OneDrive\Documents\GitHub\project-x001\.venv\Lib\site-packages\mediapipe\__init__.py", line 15, in <module>
    from mediapipe.python import *
  File "C:\Users\cmx\OneDrive\Documents\GitHub\project-x001\.venv\Lib\site-packages\mediapipe\python\__init__.py", line 17, in <module>
    from mediapipe.python._framework_bindings import model_ckpt_util
ImportError: DLL load failed while importing _framework_bindings: A dynamic link library (DLL) initialization routine failed.

wise marlin Mar 27, 2025, 6:03 PM

#

I have a data frame with categorical values that I converted using pd.get_dummies, is there a way to return the data frame with the updated dummy values without creating two extra columns (True/False) ? The extra columns are causing a headache trying to model... TIA

unkempt apex Mar 27, 2025, 6:09 PM

#

fervent canopy Web tool for training image classifiers with webcam/upload support and real-time...

close button for webcam is glitched

#

but anyways, excellent idea

fervent canopy Mar 27, 2025, 6:31 PM

#

unkempt apex close button for webcam is glitched

thank you sm 😄

#

Yeah, like I need to handle it gracefully

#

I will fix that

unkempt apex Mar 27, 2025, 6:32 PM

#

fervent canopy thank you sm 😄

how you are loading the model on web?

fervent canopy Mar 27, 2025, 6:33 PM

#

unkempt apex how you are loading the model on web?

tensorflow has some really cool bindings with js and node

#

I am using mobilenet for feature extraction and then feeding the features to a dense network for learning

charred light Mar 27, 2025, 8:37 PM

#

Is there a faster way to sample a df column of lists of numbers than
df['list_values'].apply(lambda x: random.choices(x, k=sample_n)

#

Or avoid storing it in this format to begin with?

agile cobalt Mar 27, 2025, 8:46 PM

#

pandas does not supports nested data very well

#

you could try using polars instead if you need of more speed, would be df.select(pl.col('list_col').list.sample(k)) in it, but changing which library you're using is a fairly big change

charred light Mar 27, 2025, 8:55 PM

#

agile cobalt you could try using `polars` instead if you need of more speed, would be `df.sel...

This is done in pyspark, so I'm not sure I can implement polars.

#

I was wondering if there's some kind of transformation I could do so I can vectorize the entire process.

#

Since the data looks like:

abc, [323, 3525, 23423]
efg, [4676, 342, 5474, 9893]```
Where values is not fixed length.

untold bloom Mar 27, 2025, 9:13 PM

#

how big is the frame

#

you can explode, groupby, sample, groupby, aggregate

#

In [32]: df.shape
Out[32]: (20000, 2)

In [33]: %timeit df["values"].apply(lambda x: random.choices(x, k=2))
32.1 ms ± 677 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [34]: %timeit df.explode("values").groupby("id").sample(2).groupby("id")["values"].agg(list)
28.3 ms ± 656 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

#

slight speed but

charred light Mar 27, 2025, 9:14 PM

#

188,517 x 28 columns

#

values is a list of values though.

untold bloom Mar 27, 2025, 9:15 PM

#

what is that supposed to mean with regards to the code i posted please

charred light Mar 27, 2025, 9:15 PM

#

Oh, the explode

untold bloom Mar 27, 2025, 9:16 PM

#

also GroupBy.sample != random.choices but there is probably some parameter to behave as such

#

in terms of replacement

#

or maybe they are the same idk

charred light Mar 27, 2025, 9:19 PM

#

There's a bunch of applys being used to sample which is really slow once you run it across the 188k rows. zzzzz

exotic star Mar 27, 2025, 10:14 PM

#

i just got back into programing(3 days ago), watched the 4h codecamp tut on 2x speed and made a simple encryption program i was never good at programming btw. How would i go forward if i wanna get into AI

#

tho i love other aspects of programming as well

#

so a better question would be, how do i get better moving forward and then how can i transition into learning ai

upper bronze Mar 27, 2025, 10:18 PM

#

Hello, Does anyone recommend DataCamp?

glacial root Mar 28, 2025, 2:01 AM

#

is sobel operator -> hysteresis thresholding -> canny operator/filter a good framework for an edge detection algorithm?

daring crystal Mar 28, 2025, 7:27 AM

#

i just started in ml and i am currently going through pytorch? is this right approach?

#

i find pytorch intresting

jaunty helm Mar 28, 2025, 9:42 AM

#

daring crystal i find pytorch intresting

then do that
though I'd start with numpy + pandas/polars instead

barren fable Mar 28, 2025, 9:48 AM

#

Hi, I'm actually a newcomer in the LLMs field, and I studied the basic vanilla RNN, LSTM, Word2Vec, Seq2Seq (basic encoder-decoder), attention, transformers, decoder-only, and encoder-only all from the StatQuest YouTube channel.

So after it, I encountered lots of terms and topics such as Bert, LangChan, GenAI, MCP, RAG, CAG, Agents, Llama, T5, and others.

So I'm actually confused; I need kind of a structured roadmap because there are lots of terms, and I don't know from where to continue. Also, I read some online articles; some said learn transformers, then do some LLM fine-tuning; others said learn GenAI, then agents, etc.

So can someone help? And thanks!

serene scaffold Mar 28, 2025, 2:01 PM

#

barren fable Hi, I'm actually a newcomer in the LLMs field, and I studied the basic vanilla R...

what is your goal for learning about LLMs?

barren fable Mar 28, 2025, 2:08 PM

#

serene scaffold what is your goal for learning about LLMs?

Well, actually, I don't know yet, tbh. I studied ML in my major and went deep in it and then found out that LLMs are the trend and taking the hype rn, so I said, Let's try.

serene scaffold Mar 28, 2025, 2:08 PM

#

Bert, LangChan, GenAI, MCP, RAG, CAG, Agents, Llama, T5
do you understand what each of these are?

barren fable Mar 28, 2025, 2:17 PM

#

serene scaffold > Bert, LangChan, GenAI, MCP, RAG, CAG, Agents, Llama, T5 do you understand what...

As I said, I'm a newcomer and what I studied and everything, but yeah, I searched about some of these, like
BERT, which is an encoder-only model, and its use cases are, for example, Text classification and sentiment analysis
T5 is text to text, which is an encoder-decoder model, and its use cases are for translation and text summarizing.
LangChain is a framework related to agents, I guess?
So yeah, that's it.

spring field Mar 28, 2025, 7:48 PM

#

barren fable Hi, I'm actually a newcomer in the LLMs field, and I studied the basic vanilla R...

When you say "study" or "learn", what do you mean exacatly? Did you apply your knowledge practically in some project(s)? A great way to learn is to apply your knowledge, so I'd suggest picking something you're most interested in currently and making a project involving that thing.

lavish wraith Mar 28, 2025, 8:05 PM

#

    'price': [np.nan, 93.14, 92.97, 93.12, 93.20]  # Use np.nan
})
print(oil.dtypes)
# Only the missing value is filled, other values remain unchanged
oil['price'] = oil['price'].fillna(oil['price'].mean())
print(oil.dtypes)

print(oil)```
 when i add mean on missing value it change all row

#

0  93.1075
1  93.1400
2  92.9700
3  93.1200
4  93.2000```

spring field Mar 28, 2025, 8:55 PM

#

lavish wraith ```oil = pd.DataFrame({ 'price': [np.nan, 93.14, 92.97, 93.12, 93.20] # Us...

what do you mean? it looks like it's working as intended to me
what's the behaviour you were expecting?

lavish wraith Mar 28, 2025, 8:59 PM

#

it could be display like 93.14, 92.97, 93.12, 93.20 but it add 0 on every row

small wedge Mar 28, 2025, 10:13 PM

#

!rule ad paid

arctic wedgeBOT Mar 28, 2025, 10:13 PM

#

Rules

6. Do not post unapproved advertising.

9. Do not offer or ask for paid work of any kind.

spring field Mar 28, 2025, 11:40 PM

#

lavish wraith it could be display like 93.14, 92.97, 93.12, 93.20 but it add 0 on every row

are you complaining purely about the formatting of the printed representation? does it... like really matter?

hollow pagoda Mar 29, 2025, 5:01 AM

#

i just transformed this feature material for easier access to material %s, how should it be encoded?

#

theres 20 variants of materials for this case, should it be similiar to OHE with 20 columns with the decimal there

#

seems like itd work the same as ohe but its 0-1 instead of binary 0/1, if theres a better/easier way let me know i pick this back up tmr

hoary wigeon Mar 29, 2025, 5:44 AM

#

@slate raven @hard night Hi there, I'm facing an issue related to mediapipe
ImportError: DLL load failed while importing _framework_bindings

I saw that even you guys have faced the same issue earlier. May I know how did you guys resolved the issue?

mystic harbor Mar 29, 2025, 9:14 AM

#

@solid sealWe don't allow recruitment in this server, I've deleted your post.

limber spear Mar 29, 2025, 9:39 AM

#

Why some of you make data science sound like it’s a chore. We should start at the root words. Data + science

barren fable Mar 29, 2025, 9:46 AM

#

spring field When you say "study" or "learn", what do you mean exacatly? Did you apply your k...

When I mean study, I mean the theoretical thing; that's it. I know that I should do projects, of course, but first I need to know what the topic is. Someone actually recommended to me to go through different model architectures like BERT, GPT, and T5, and then watch and apply the playlist of Andrej Karpathy. It'll be a good start.

fresh harbor Mar 29, 2025, 11:54 AM

#

I need to transform a user query like "fetch me products from brand X made before 2005" into an api call. i have approx 0 knowledge of ai so what is the easiest way i can do this? i don't want to train any models or use a cloud based approach like openai

grand minnow Mar 29, 2025, 12:14 PM

#

fresh harbor I need to transform a user query like "fetch me products from brand X made befor...

You could download a model from Huggingface and run it on Ollama locally

opaque condor Mar 29, 2025, 12:47 PM

#

So I have to download cuda along with getting a GPU that can run it

grand minnow Mar 29, 2025, 12:48 PM

#

pretty much

opaque condor Mar 29, 2025, 12:53 PM

#

Is there anything that I need to worry about motherboard,CPU, ect?

opaque condor Mar 29, 2025, 1:15 PM

#

Here are the specs:

Processer:
Intel(R) i3-10100
CPU @ 3.60GHz 3.60GHz
Installed ram:
8.00GB(7.89 useable)

grand minnow Mar 29, 2025, 1:17 PM

#

It looks ok. I would pump up the RAM with another 8GB and then add a decent GPU

opaque condor Mar 29, 2025, 3:01 PM

#

Any recommendations?

#

For gpu?

versed axle Mar 29, 2025, 3:05 PM

#

what is your budget?

opaque condor Mar 29, 2025, 3:16 PM

#

Anything that's not really a over then maybe a thousand or even a hundred I didn't find one for $70 on marketplace but you know it's in a shady part of the state that I'm in

calm thicket Mar 29, 2025, 3:51 PM

#

opaque condor For gpu?

google colab

serene scaffold Mar 29, 2025, 3:53 PM

#

opaque condor For gpu?

the amount of GPU power you need varies wildly depending on what you're trying to do. I recommend not buying one (unless you want it for gaming) and renting cloud compute.

opaque condor Mar 29, 2025, 4:08 PM

#

calm thicket google colab

Google dislikes me with a passion

#

So if I can host it on my main machine I'm all up for it because I want you to pay rent and if I want to do something other for machine learning I can do renderings for Sims blender and panda3d

#

It makes it so that if I can't get into my account because Google just likes my password I can still work with a GPU on my system plus I've been meaning to upgrade it a bit

opaque condor Mar 29, 2025, 4:23 PM

#

opaque condor It makes it so that if I can't get into my account because Google just likes my ...

As you can tell I live in the North and even if you don't need it buy it just in case and Google dislikes me so it's not really an option to not buy it

#

Sorry if I'm being a little off it's just since neural networks require something like a GPU to work more efficiently and Google dislikes me I do have to buy it but if Google likes me this time I can't always say that it will like me the same way each day if I'm counting and no network or training it so what I'm trying to do is get ahead of it so I'm wondering what's the best GPU for the type of material I have

hollow pagoda Mar 29, 2025, 5:33 PM

#

limber spear Why some of you make data science sound like it’s a chore. We should start at th...

Bro what are you talking about

limber spear Mar 29, 2025, 5:43 PM

#

hollow pagoda Bro what are you talking about

Stuff like this: “When you say "study" or "learn", what do you mean exacatly? Did you apply your knowledge practically in some project(s)?“

rugged stream Mar 29, 2025, 5:44 PM

#

Hi folks, I have an upcoming assessment centre/interview for an apprentice data analyst position, I have pretty basic knowledge with spreadsheets, SQL, Python and some underpinning maths topics - can anyone give me some good resources to help me prep/study please?

glacial root Mar 29, 2025, 6:09 PM

#

limber spear Stuff like this: “When you say "study" or "learn", what do you mean exacatly? Di...

bro how are you supposed to do projects without learning

#

i mean yeah you could get started on the project without learning, but that would still involve learning/studying, it would just be side-by-side while working on a project and your learning would be more project-oriented

hollow pagoda Mar 29, 2025, 6:14 PM

#

limber spear Stuff like this: “When you say "study" or "learn", what do you mean exacatly? Di...

studying and learning not a chore and ofc, this isnt a physical activity where knowledge isnt always applied

limber spear Mar 29, 2025, 6:51 PM

#

glacial root bro how are you supposed to do projects without learning

Yeah but when you mock someone’s intelligence with grammatical “” quotation marks when they genuinely are interested, you take away the fun of the learning and studying experience

limber spear Mar 29, 2025, 6:55 PM

#

rugged stream Hi folks, I have an upcoming assessment centre/interview for an apprentice data ...

There is a guy who works for Amazon. He runs a SQL based server. I can forward you his information

rugged stream Mar 29, 2025, 7:02 PM

#

limber spear There is a guy who works for Amazon. He runs a SQL based server. I can forward y...

would that be useful to an absolute beginner?

gray slate Mar 29, 2025, 7:03 PM

#

rugged stream would that be useful to an absolute beginner?

No, run Jupyter and mess with Workbooks, connect to MySQL or Postgres locally (run it in Docker)

#

Load some CSV files in, draw some graphs etc

#

what's an "assessment centre/interview" btw? will you actually speak to someone who you'll be working with, or is it some meat-market selling you on?

limber spear Mar 29, 2025, 7:07 PM

#

rugged stream would that be useful to an absolute beginner?

His server is for beginners to advanced

gray slate Mar 29, 2025, 7:08 PM

#

Oh a Discord server lol

rugged stream Mar 29, 2025, 7:09 PM

#

gray slate what's an "assessment centre/interview" btw? will you actually speak to someone ...

not sure it will ne my first one - i think it will have a series of technical interviews from various different department representatives

gray slate Mar 29, 2025, 7:09 PM

#

What do they do?

limber spear Mar 29, 2025, 7:09 PM

#

gray slate Oh a Discord server lol

Yezzir. You can’t go wrong with someone who works at Amzon and cloud services

gray slate Mar 29, 2025, 7:09 PM

#

The org you'll be working for?

rugged stream Mar 29, 2025, 7:09 PM

#

high street bank

gray slate Mar 29, 2025, 7:10 PM

#

rugged stream high street bank

Okay so MS SQL server, Windows environment, Office proficiency, use of Internet Explorer 6 😂

#

I'm only half joking - but go in knowing the tech stack they use. Do your research

rugged stream Mar 29, 2025, 7:11 PM

#

ok i will focus on SQL, excel and maths in the form of probability/statistics/calculus

gray slate Mar 29, 2025, 7:11 PM

#

Banks generally use Microsoft's stack and they like paperwork because real work is too difficult. But they'll look very smart and act the part that's for sure

#

(If you can't tell, I don't miss my time in retail banking)

#

Check on Glassdoor and see what stack they use, what they value, who their partners are - see if you can get any info. Then you can skim-read about technologies and name-drop them, so they can fight over who discovered you lol

#

Also, LinkedIn, see what technologies people who work there (in the same department you'll be in) have in their skills list

#

At a guess: R, Python + Jupyter notebooks. Probably Power BI and Oracle. If they have cloud then most likely Azure Synapse for big data. But yeah scope them out on LinkedIn.com/Glassdoor.com

lapis sequoia Mar 29, 2025, 7:36 PM

#

howd you all get started with ai? i heard the salaraies were really high , $300,000

rugged stream Mar 29, 2025, 7:38 PM

#

gray slate Check on Glassdoor and see what stack they use, what they value, who their partn...

awesome ty

white reef Mar 29, 2025, 8:03 PM

#

opaque condor For gpu?

if you plan to work on big data or deep learning projects, runpod.io is a good provider for cloud gpu, it's not that expensive and you can run your project's docker or a jupyter notebook instance on an A100, a bunch of RTX GPUs and even H100.

#

i've been using it for my projects

#

it's truly useful

echo linden Mar 29, 2025, 8:46 PM

#

how do i get started with data science and ai
i want to see if i can do machine learning for petroleum engineering

glacial root Mar 29, 2025, 9:18 PM

#

limber spear Yeah but when you mock someone’s intelligence with grammatical “” quotation mark...

that is not what he meant

#

he was simply quoting the other guy

limber spear Mar 29, 2025, 9:24 PM

#

glacial root he was simply quoting the other guy

I believe you

limber spear Mar 29, 2025, 9:24 PM

#

echo linden how do i get started with data science and ai i want to see if i can do machine...

Learn Python and build what you like

hollow lake Mar 29, 2025, 10:11 PM

#

I want to build a chatbot using an open source LLM?

#

Any suggestions ?

#

I will build this chatbot just for students questions about faculty informations (graduation, fields, ...)

serene scaffold Mar 29, 2025, 10:44 PM

#

hollow lake I want to build a chatbot using an open source LLM?

The easiest way to do this is by buying credits with a platform like OpenAI and changing the settings for your purposes. Probably with a RAG framework for looking up and using information that's specific for your school

hollow lake Mar 29, 2025, 11:14 PM

#

@serene scaffold thank you, I know that I should use RAG for extracting informations from external documents

#

I will try to buy it

#

what do you think about open source LLM like llama ?

serene scaffold Mar 29, 2025, 11:36 PM

#

hollow lake what do you think about open source LLM like llama ?

You'd need to rent cloud compute to host it. It would be cheaper to only pay for the API calls.

hollow lake Mar 30, 2025, 1:24 AM

#

serene scaffold You'd need to rent cloud compute to host it. It would be cheaper to only pay for...

thanks for sharing your knowledge with us, i wish the best for you

opaque condor Mar 30, 2025, 3:52 AM

#

https://paste.pythondiscord.com/HRFQ

#

what do I have to do for the accuracy

exotic star Mar 30, 2025, 4:38 AM

#

If I wanna do robotics and ai later on, is doing web scraping, storing and cleaning data with python the way to go for now? I find it fun and I think it'll be a useful skill going forward

#

I'm a beginner

#

Might even be useful for freelancing as well

grand minnow Mar 30, 2025, 5:12 AM

#

exotic star If I wanna do robotics and ai later on, is doing web scraping, storing and clean...

If you find it fun to do, do it. 🙂

opaque condor Mar 30, 2025, 5:23 AM

#

opaque condor what do I have to do for the accuracy

what did i do wrong with the accuracy because i have a 0.0 across all the terminal

hollow pagoda Mar 30, 2025, 7:10 AM

#

exotic star If I wanna do robotics and ai later on, is doing web scraping, storing and clean...

No

#

Ts got nothing to do with robotics

#

Just use available datasets and learn machine and deep learning

untold pollen Mar 30, 2025, 9:21 AM

#

How to get knowledge of phyton code which channel ?

delicate apex Mar 30, 2025, 9:28 AM

#

!rule ad

arctic wedgeBOT Mar 30, 2025, 9:28 AM

#

Rules

6. Do not post unapproved advertising.

grand minnow Mar 30, 2025, 12:19 PM

#

untold pollen How to get knowledge of phyton code which channel ?

!resources have plenty

arctic wedgeBOT Mar 30, 2025, 12:19 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

untold pollen Mar 30, 2025, 12:26 PM

#

!resources Hello

arctic wedgeBOT Mar 30, 2025, 12:26 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

untold pollen Mar 30, 2025, 12:26 PM

#

!resources print(“Hello”)

arctic wedgeBOT Mar 30, 2025, 12:26 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

serene scaffold Mar 30, 2025, 3:43 PM

#

@untold pollen please go to #bot-commands to experiment with the bot

exotic star Mar 30, 2025, 4:44 PM

#

grand minnow If you find it fun to do, do it. 🙂

i do but would it hurt me going forward? or is it a great way to improve my python skills?

grand minnow Mar 30, 2025, 4:48 PM

#

exotic star i do but would it hurt me going forward? or is it a great way to improve my pyth...

As long as you're learning, you will always benefit from it

#

Don't overthink it

exotic star Mar 30, 2025, 4:49 PM

#

grand minnow As long as you're learning, you will always benefit from it

thanks

#

i'll go for it

slim lance Mar 30, 2025, 4:59 PM

#

(Please tell me if this is the wrong place to ask this question.)

Has anyone found a VSCode plugin that offers good support for exploring nested Python data structures? I use Data Wrangler for tabular data, but it isn't great for nested data structures. I use the Jupyter VSCode plugin and know people in the Jupyter community use Spyder for this, but I tried it a while back and it slow and unwieldy.

Note: I've been muddling by with pprint.

pale thunder Mar 30, 2025, 5:22 PM

#

in a t-SNE plot, is there a useful axis label, and do the magnitudes of the points actually matter? Or should I just leave axes unlabeled?

hollow pagoda Mar 30, 2025, 6:27 PM

#

slim lance (Please tell me if this is the wrong place to ask this question.) Has anyone fo...

wdym nested data structure like json?

#

i wouldnt know the answer but curious incase i run into tht problem, i just found data wrangler a few days ago its a gem

errant lake Mar 31, 2025, 5:52 AM

#

slim lance (Please tell me if this is the wrong place to ask this question.) Has anyone fo...

Unpack those structures to a tabular format then use data wrangler?

jaunty helm Mar 31, 2025, 6:28 AM

#

pale thunder in a t-SNE plot, is there a useful axis label, and do the magnitudes of the poin...

I'm p sure it's just a visualization tool and the numbers aren't meaningful

fresh harbor Mar 31, 2025, 10:40 AM

#

grand minnow You could download a model from Huggingface and run it on Ollama locally

seems very overkill

#

and impossible to run locally

#

i am looking more at classic nlp solution (i got a bit more educated on this)

forest creek Mar 31, 2025, 3:07 PM

#

do u guys know something about EHR(electronical healthcare records) and AI?

jaunty helm Mar 31, 2025, 3:29 PM

#

fresh harbor and impossible to run locally

you'd be surprised actually
models come in all sizes nowadays, even a cpu could reasonably run like qwen 3b
though too small to really converse with, it probably does fine turning user requests into api calls

fickle shale Mar 31, 2025, 4:07 PM

#

How can i reterive reddit post with location like
e.g., "Need help in Austin" → Maps to Austin, TX)

cedar tusk Mar 31, 2025, 4:12 PM

#

jaunty helm you'd be surprised actually models come in all sizes nowadays, even a cpu could ...

what i dont like about transformer based models is that they are randomized. It may do fine for some of the requests but it may fail as well and there is no way to fix it.

agile cobalt Mar 31, 2025, 4:30 PM

#

cedar tusk what i dont like about transformer based models is that they are randomized. It ...

for something to produce the correct output 100% of the time, in first place there must always be 0% ambiguity in its inputs

Make a traditional GUI (or even something akin to discord slash commands) and force the user to specify an unambiguous input instead of using any sort of statistical model if ensuring correctness outweighs the risks added from adding natural language convenience

glacial root Mar 31, 2025, 7:32 PM

#

does anyone here know of a small dataset (around the size of mnist) that contains images of faces and non-faces

#

i tried searching for one but couldn't find what i was looking for

#

the image classifier i'm gonna be working on will just classify whether or not the image contains a human face, not anything more specific than that

serene scaffold Mar 31, 2025, 7:33 PM

#

glacial root does anyone here know of a small dataset (around the size of mnist) that contain...

couldn't you create one by taking a dataset of face images and then adding a bunch of otherwise similar images (similar dimensions or whatever) from some arbitrary other dataset?

glacial root Mar 31, 2025, 7:34 PM

#

serene scaffold couldn't you create one by taking a dataset of face images and then adding a bun...

i could but i just wanted to know if anyone knew of one before i did that

#

cause it would take a while to get all of the images into a csv

#

if it's not in a csv and just a folder of images, it would take super long to get each image in matrix format

serene scaffold Mar 31, 2025, 7:37 PM

#

glacial root if it's not in a csv and just a folder of images, it would take super long to ge...

No it wouldn't. You can just use PIL.

glacial root Mar 31, 2025, 7:38 PM

#

serene scaffold No it wouldn't. You can just use PIL.

i mean to do that with each and every image though

serene scaffold Mar 31, 2025, 7:38 PM

#

glacial root i mean to do that with each and every image though

It would just be a for loop.

glacial root Mar 31, 2025, 7:38 PM

#

serene scaffold It would just be a for loop.

python file i/o allows that to go through a folder?

serene scaffold Mar 31, 2025, 7:39 PM

#

glacial root python file i/o allows that to go through a folder?

You can loop over all the paths in a given directory

#

!docs pathlib.Path.glob

arctic wedgeBOT Mar 31, 2025, 7:39 PM

#

pathlib.Path.glob


Path.glob(pattern, *, case_sensitive=None, recurse_symlinks=False)```
Glob the given relative *pattern* in the directory represented by this path, yielding all matching files (of any kind)...

glacial root Mar 31, 2025, 7:39 PM

#

oh i didn't know that

#

thank you

viscid urchin Mar 31, 2025, 7:51 PM

#

glacial root does anyone here know of a small dataset (around the size of mnist) that contain...

I haven't used it myself yet but this has been on my radar https://exposing.ai/face_scrub/

Exposing.ai

Exposing.ai: FaceScrub

Face Scrub is an image dataset of public figures scraped from the Internet and widely used for enhancing face recognition technologies

glacial root Mar 31, 2025, 7:53 PM

#

thank you

fresh harbor Mar 31, 2025, 8:08 PM

#

jaunty helm you'd be surprised actually models come in all sizes nowadays, even a cpu could ...

ok i will test it. i read somewhere that you need atleast a 7b param model for it to be any useful.

#

afaik these models contain knowledge about a lot more things that i wont ever need. can i just carve out the pieces i need and remove the rest? or am i getting it wrong

fresh harbor Mar 31, 2025, 9:07 PM

#

jaunty helm you'd be surprised actually models come in all sizes nowadays, even a cpu could ...

so i tried qwen2.5 coder instruct 0.5b 1b 3b and 7b and found that although 3b works in some cases it gets a way bit creative and comes up with stuff i didn't even say. 7b is brief and just puts a null.

#

i am afraid i can't run 7b locally

tepid tartan Mar 31, 2025, 9:32 PM

#

Any project out on the internet or ideas for data analyst

glacial root Mar 31, 2025, 9:41 PM

#

perhaps etl pipeline?

weary timber Mar 31, 2025, 11:33 PM

#

fresh harbor I need to transform a user query like "fetch me products from brand X made befor...

you can use openrouters apis to use free models like deepseek v3 0324

#

i think that is the easiest way

slim lance Apr 1, 2025, 12:13 AM

#

FTR vscode has a new setting to allow nested variable exploration in the debugger. It’s not great but it’s infinitely better than having to view nested data in wrangler.

notebook.variablesView

flint onyx Apr 1, 2025, 12:13 AM

#

I need help interpreting this

#

does this mean that the model is doing good

viscid urchin Apr 1, 2025, 12:17 AM

#

I'm not an expert at all, but I guess the gap between the red (testing) and blue (training) lines indicates overfitting?

flint onyx Apr 1, 2025, 12:24 AM

#

viscid urchin I'm not an expert at all, but I guess the gap between the red (testing) and blue...

but the gap is rlly small wont that mean its generalizing good

#

thats what I was thinking

viscid urchin Apr 1, 2025, 12:27 AM

#

Yeah, looks like it's working to me. Maybe you'd like to see more growth with larger training sets but that really depends on the domain I suppose

hollow pagoda Apr 1, 2025, 1:36 AM

#

overfitting is if the gap grew bigger over time from testing accuracy dropping

#

its good

flint onyx Apr 1, 2025, 2:09 AM

#

hollow pagoda its good

my model is good?

#

someone else in a diff server told me its still a problem

#

and that I should try to improve further

#

Ill try tuning again tn but if I cant seem to improve it then ig Ill just stay with this

jaunty helm Apr 1, 2025, 3:32 AM

#

fresh harbor so i tried qwen2.5 coder instruct 0.5b 1b 3b and 7b and found that although 3b w...

did you set temperature to 0?
also prompt it with like respond in this specific format without anything else; it's best you give examples that it can follow, e.g.

Parse the user's request into a JSON format for an API query. Strictly follow this format: 
{"brand": ..., "date": ...}

warm iron Apr 1, 2025, 3:44 AM

#

Hi everyone
Is this behavior normal? I work with data in chunks, 35000 features per chunk. Multiclass, adam optimizer, BCE with logits loss function

final results are:
Accuracy: 0.9184
Precision: 0.9824
Recall: 0.9329
F1 Score: 0.9570

#

I have a guess why it happens
The data are the features extracted from Brain CT scan images, when i open it i can see chunks of zezors, then chunks of numbers, chunks of zeros, chunks of numbers, i assume chunks of zeors are the background in the image . Mayve this casues this flactuation?
everything black is set to zeros by relu I guess cos I used pretrained vgg 16 for features extraction

fiery bane Apr 1, 2025, 6:22 AM

#

I have a question about LRA https://github.com/google-research/long-range-arena
It seems that people say that it has locality bias.
I don't understand, if this is not good enough, then what kind of task is required?

GitHub

GitHub - google-research/long-range-arena: Long Range Arena for Ben...

Long Range Arena for Benchmarking Efficient Transformers - google-research/long-range-arena

glossy zinc Apr 1, 2025, 11:52 AM

#

!res

arctic wedgeBOT Apr 1, 2025, 11:52 AM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

weary timber Apr 1, 2025, 1:09 PM

#

does anyone know and have access to EmotionLines dataset?

#

as always the homepage of it is down

cold goblet Apr 1, 2025, 2:09 PM

#

I am working on processing large amounts of JSON files (in PBs). The schema for the JSON is available with a few variations. The JSON schema is nested (two levels) and the values I'm interested in for filter and aggregation are not top level.

I need to filter on a couple of keys that are nested in the JSON and then aggregate the data to get mean, sum, and other statistics.

So, I am thinking of transforming the JSON data into multiple parquet files partitioned based on certain values that I'll filter on combined with SQL database that will store kind of metadata that'll help me figure out what parquet files needs to be processed.

Another concern is that some of these JSON file may be updated and I'll need to replace the data.

Currently, I have decided on MapReduce and Hadoop. I also found Apache Spark. Ideally, I want to distribute this file processing across multiple VMs.

Is there something I am missing in my approach? Is there a better approach or framework for this?

Also, the actual data analytics of the processed file will be happening in motherduck and I want to reduce the analysis time even at some cost to the storage and file processing time.

hollow pagoda Apr 1, 2025, 2:54 PM

#

flint onyx someone else in a diff server told me its still a problem

did they tell u why

outer cloak Apr 1, 2025, 3:54 PM

#

yoo

#

How can i use pandas and numpy?

agile cobalt Apr 1, 2025, 3:57 PM

#

read their user guides

outer cloak Apr 1, 2025, 3:59 PM

#

i want a simple explaination

woeful escarp Apr 1, 2025, 4:00 PM

#

Hello, I am starting in ML, I would like to work in a project to improve, send me DM

outer cloak Apr 1, 2025, 4:07 PM

#

i want to learn ML too it is fun!!

#

i like watching a machine learn things!!

viscid urchin Apr 1, 2025, 4:20 PM

#

outer cloak i want a simple explaination

Pandas: Tools to load/etc your data -- Numpy: Where that data lives. Beyond that, the examples in their documentation are a great place to start.

serene scaffold Apr 1, 2025, 4:30 PM

#

viscid urchin Pandas: Tools to load/etc your data -- Numpy: Where that data lives. Beyond that...

that isn't quite how I'd explain the difference, even at a very simple level

viscid urchin Apr 1, 2025, 4:30 PM

#

OK, my bad.

#

Seemed like it might give them some traction

#

Would you say it's more correct to say that pandas is about organizing/analyze/manipulating the data, and that numpy is about storing it? Or is that still too simplified?

serene scaffold Apr 1, 2025, 4:36 PM

#

Numpy is for arrays of numeric data, which could be a 2d matrix of floating point numbers, or a 3d array that represents an image (height, width, color channel).

Pandas DataFrames are for tabular data in general, with rows and columns, where each row is a "thing", and each column represents a piece of data about that thing.

When you say "numpy is about storing it", you might be thinking of how DataFrames are often wrappers around a numpy array. But that fact is really just an implementation detail and isn't necessary to know. There's currently more than one option for which "backend" to use, and I think they're planning to eliminate numpy as one of them.

#

arrays and DataFrames don't really have that much in common except that they're "rectangular".

viscid urchin Apr 1, 2025, 4:37 PM

#

OK, that's fair. To me that relationship has always been 'how it is', but swappable pandas backends would certainly change that. I've just not used that feature yet I guess.

agile cobalt Apr 1, 2025, 4:40 PM

#

also, numpy provides efficient storage and operations for numerical data, if all of your data is numerical, there is no need to use pandas for loading nor manipulating it

serene grail Apr 1, 2025, 4:41 PM

#

serene scaffold Numpy is for arrays of numeric data, which could be a 2d matrix of floating poin...

Wait, did I read that right? Pandas is planning to eliminate numpy as its backend? Would that have significant changes for an end user of pandas?

agile cobalt Apr 1, 2025, 4:44 PM

#

serene scaffold Numpy is for arrays of numeric data, which could be a 2d matrix of floating poin...

you mean completely eliminate the numpy backend? source?
or just turn numpy into an optional dependency when you're using a different backend?

iron basalt Apr 1, 2025, 4:44 PM

#

serene grail Wait, did I read that right? Pandas is planning to eliminate numpy as its backen...

Maybe in a few places, but not that much.

viscid urchin Apr 1, 2025, 4:45 PM

#

I guess PyArrow is replacing numpy for Pandas

iron basalt Apr 1, 2025, 4:45 PM

#

I'm guessing that every other backend option is required to support the Python buffer protocol, and so it would all still interact with Numpy arrays.

#

https://docs.python.org/3/c-api/buffer.html

Python documentation

Buffer Protocol

Certain objects available in Python wrap access to an underlying memory array or buffer. Such objects include the built-in bytes and bytearray, and some extension types like array.array. Third-part...

regal wedge Apr 1, 2025, 4:51 PM

#

can someone help me with this
i'm trying to maka an mlops project using zenml and mlflow
as far as i know there is no error in the python code
i connected to the zenml server and tried to set my project
but evertime i run my .py file there's an error stating that the project is not set

serene scaffold Apr 1, 2025, 4:53 PM

#

regal wedge can someone help me with this i'm trying to maka an mlops project using zenml an...

it's easier for people to help you if you give the error message as text. Not as a screenshot.
The actual error in the code is not visible in the screenshot.

#

somewhere in the code is x.name for some x, and that x is None.

viscid urchin Apr 1, 2025, 4:54 PM

#

Hmm, active_stack's docs claim that it raises an error if the stack isn't set, rather than returning None, so my first guess is wrong.

#

but 'experiment_tracker' could be None

fresh harbor Apr 1, 2025, 5:44 PM

#

jaunty helm did you set temperature to 0? also prompt it with like `respond in this specific...

i was reading qwen docs and it had something called "function calling". you have any idea about this?

restive flare Apr 1, 2025, 6:51 PM

#

Hi, anyone interested in Agentic RAG? I have written a whole article on it build using LlamaIndex and Gemini.

It is handsome on coding style with GitHub repo.

Can you give me your feedback or how I can make it more adavance.

Here is the Link

https://www.analyticsvidhya.com/blog/2025/03/building-a-financial-report-retrieval-system/

Analytics Vidhya

Building a Financial Report Retrieval System with LlamaIndex and Ge...

Learn how to build a financial report retrieval system using LlamaIndex and Gemini 2.0 for efficient data extraction and analysis.

weary timber Apr 1, 2025, 7:08 PM

#

i want to make an app where one would input their mood tempo energy etc. (some music features) and get recommended a music.but i dont know with what i can achieve this, can someone help me pls?

viscid urchin Apr 1, 2025, 7:17 PM

#

You'll need to compose a few things. The first question that comes to mind is 'where do you plan to get your knowledge base of music'?

#

You need a corpus of songs/albums tagged with their mood/tempo/etc, which presumably the big players have all had to build themselves

agile cobalt Apr 1, 2025, 7:20 PM

#

weary timber i want to make an app where one would input their mood tempo energy etc. (some m...

I played around with a Spotify dataset from Hugging Face a while ago to test some tools, and it included those features and some others

you can use it as a base if you want, to search based on that you would just need to filter something like (value - user_specified) < maximum_distance

#

Creating a dataset for that from scratch would take a fair bit of work, but using an existing one it isn't too bad

weary timber Apr 1, 2025, 7:22 PM

#

can i use a nn for thatT?

#

mlp

agile cobalt Apr 1, 2025, 7:22 PM

#

what would your inputs and outputs be?

weary timber Apr 1, 2025, 7:28 PM

#

the input will be joy,sadness,neutral,tempo,bpm and some music features

#

the output is the problem

#

280k songs

agile cobalt Apr 1, 2025, 7:38 PM

#

weary timber the output is the problem

there are two major approaches you can take

just filtering it in a normal way
using a model to engineer more "meaningful" features then filter based on those instead

if you just want to find sounds with joy score in between, say, 0.5 ~ 0.65, you can just do df.filter(pl.col("joy").in_between(0.5, 0.65)), there is no reason to do any machine learning

If you want to use ML for some reason, then you'll likely want to make either a simple clustering algorithm, a classical recommendation system or perform semantic search to identify similar songs
In any of these three cases, your inputs would be songs rather than directly asking for specific values for any given feature

For clustering, just look up K-Means and apply it on the song metadata columns
For a recommendation system, you would need to gather a bunch of user preferences data first, to then cluster together songs different users like
For the last, you would need to use an embedding model to create a representation of the song itself rather than its metadata (then find similar songs using a metric like cosine similarity)
(note that the datasets you can find and access without qualifying as piracy do not contain the song itself, only metadata)

flint onyx Apr 1, 2025, 7:42 PM

#

hollow pagoda did they tell u why

yeah they said the gap between the lines was still a bit high and that it could be improved further

#

I was playing around with it last night and this is the improved model

#

thoughts? around 400 the gap is still a lil big but Ive been trying to fix it for the past few hrs and nothin seems to work

hollow pagoda Apr 1, 2025, 7:45 PM

#

that just means it needed more size to be less overfit, also i thought the x axis was epoch for some reason last time

#

it looks converged around 800-830 training batch size

flint onyx Apr 1, 2025, 8:07 PM

#

hollow pagoda it looks converged around 800-830 training batch size

can u help me understand this better

#

after 800-830 it doesnt change much so does that mean i can reduce my size a bit?

glacial root Apr 1, 2025, 8:18 PM

#

when i set up an integral image, for some reason i end up with values that are inf

#

which is weird cause i shouldn't be anywhere near the max limit for np.int64

#

does anyone know any other causes of ending up with inf values purely through addition

weary timber Apr 1, 2025, 8:42 PM

#

agile cobalt there are two major approaches you can take - just filtering it in a normal way...

when inputting a song directly, the mood input becomes useless

glacial root Apr 1, 2025, 9:09 PM

#

glacial root does anyone know any other causes of ending up with inf values purely through ad...

here's the code by the way
https://gist.github.com/AnishM101/315fa7bb78da41565986b74aca9f4b82

Gist

IntegralImage

IntegralImage. GitHub Gist: instantly share code, notes, and snippets.

#

i have no clue as to why this issue is happening, there's no possible way for it to result in having inf values

worldly wagon Apr 2, 2025, 2:40 AM

#

what do you guys use as a latex editor when writing research papers?(considering overleaf)

mossy mango Apr 2, 2025, 4:41 AM

#

Hello guys

#

Someone

serene scaffold Apr 2, 2025, 5:28 AM

#

mossy mango Someone

what?

serene scaffold Apr 2, 2025, 5:29 AM

#

worldly wagon what do you guys use as a latex editor when writing research papers?(considering...

overleaf is a good option. I use a pycharm plugin.

twilit pulsar Apr 2, 2025, 5:36 AM

#

serene scaffold overleaf is a good option. I use a pycharm plugin.

i use visual studio pro

serene scaffold Apr 2, 2025, 5:37 AM

#

twilit pulsar i use visual studio pro

Don't tell me

young geode Apr 2, 2025, 5:50 AM

#

Hi, I'm a beginner trying to use matplotlib to create boxplots. I want to reduce the ymax to like 3000000 in order to get longer boxplots for better visualizatiokn. I have tried using plot.ylim, but I got no result. Can someone help me on that?

jaunty helm Apr 2, 2025, 6:56 AM

#

fresh harbor i was reading qwen docs and it had something called "function calling". you hav...

that could help, yes
here's hf's page on it

jaunty helm Apr 2, 2025, 7:00 AM

#

flint onyx thoughts? around 400 the gap is still a lil big but Ive been trying to fix it fo...

unless you've squeezed everything out of your data, I'd work on that instead of trying to hyperopt for more hours

#

you could also try gradient boosting trees instead; in sklearn there's HGBT, but also there's others like xgboost, lightgbm, catboost, etc

jaunty helm Apr 2, 2025, 7:03 AM

#

young geode Hi, I'm a beginner trying to use matplotlib to create boxplots. I want to reduce...

what's tc?

young geode Apr 2, 2025, 7:46 AM

#

jaunty helm what's `tc`?

turicreate

jaunty helm Apr 2, 2025, 8:06 AM

#

young geode turicreate

like this? looks like it's been abandoned

flint onyx Apr 2, 2025, 8:34 AM

#

jaunty helm unless you've squeezed everything out of your data, I'd work on that instead of ...

wym by squeezed everything out of ur data

#

like made use of the whole dataset?

#

my dataset has 1644 inputs and 997 feats but I reduced it to like 67 feats

#

are the inputs not enough maybe

jaunty helm Apr 2, 2025, 8:37 AM

#

flint onyx like made use of the whole dataset?

like, maybe you can do more feature engineering / transformations, apply some domain knowledge

jaunty helm Apr 2, 2025, 8:37 AM

#

flint onyx my dataset has 1644 inputs and 997 feats but I reduced it to like 67 feats

how did you do this feature selection process?

flint onyx Apr 2, 2025, 8:53 AM

#

jaunty helm how did you do this feature selection process?

I used Selectfrommodel

#

It seemed to improve my model by a lot

jaunty helm Apr 2, 2025, 9:05 AM

#

flint onyx I used Selectfrommodel

and how exactly
which model did you use

#

and also, though it might be tiring, actually examine what each feature you have and select based on what might matter - might give even better results

#

trees are prone to overfitting - this might be why your tree does a lot better after feature selection
but you could also try no feature selection + regularization (such as limiting the depth of how big the trees can grow)

flint onyx Apr 2, 2025, 9:15 AM

#

jaunty helm and how exactly which model did you use

randomforest

#

feature_params = SelectFromModel(RandomForestClassifier(n_estimators=100, max_depth=3, random_state=42), max_features=100)

flint onyx Apr 2, 2025, 9:16 AM

#

jaunty helm trees are prone to overfitting - this might be why your tree does a lot better a...

I tried the second approach and honestly spent tooo much time on it and didnt rlly get good results

jaunty helm Apr 2, 2025, 9:18 AM

#

flint onyx I tried the second approach and honestly spent tooo much time on it and didnt rl...

what's your data anyway
what are your 997 features and target

flint onyx Apr 2, 2025, 9:19 AM

#

jaunty helm what's your data anyway what are your 997 features and target

31 feats are hot encoded
2 are numeric
rest are all bag of words

targets are 3 classes (food labels) and classes are balanced

#

Lemme take a pic rq one sec

jaunty helm Apr 2, 2025, 9:20 AM

#

flint onyx 31 feats are hot encoded 2 are numeric rest are all bag of words targets are 3 ...

I don't mean how you deal with your features
I mean what's your data representing, like are they house sales, titanic survivors, whatever

flint onyx Apr 2, 2025, 9:22 AM

#

uhhh u mean like the original questions?

From a scale 1 to 5, how complex is it to make this food

How many ingredients would you expect this food item to contain

In what setting would you expect this food to be served

How much would u pay for this

What movie do you think of when thinking of this food item

what drink would u pair this food with

who does this food remind u of

how much hot sauce

jaunty helm Apr 2, 2025, 9:24 AM

#

flint onyx uhhh u mean like the original questions? From a scale 1 to 5, how complex is it...

so your data is a list of survey answers? and what are you predicting

flint onyx Apr 2, 2025, 9:24 AM

#

food label I have 3 classes. pizza, shawarma and sushi

jaunty helm Apr 2, 2025, 9:24 AM

#

flint onyx food label I have 3 classes. pizza, shawarma and sushi

so you are trying to predict whether the food is pizza, shawarma, or sushi from the survey answers?

flint onyx Apr 2, 2025, 9:24 AM

#

yep

jaunty helm Apr 2, 2025, 9:25 AM

#

flint onyx yep

how do you have 997 features then? surely the survey isn't 997 questions

flint onyx Apr 2, 2025, 9:25 AM

#

because of the movie title question. I used a bag of words approach for that and ended up with like 900+ words as features

jaunty helm Apr 2, 2025, 9:28 AM

#

flint onyx because of the movie title question. I used a bag of words approach for that and...

personally I'd either drop the movie title if it's too high cardinality, or e.g. group the movies with only a few responses into a Others

flint onyx Apr 2, 2025, 9:28 AM

#

I tried dropping the movie question and my model was doing pretty bad

flint onyx Apr 2, 2025, 9:29 AM

#

jaunty helm personally I'd either drop the movie title if it's too high cardinality, or e.g....

wait acc i dont understand what u mean by few responses into a others

jaunty helm Apr 2, 2025, 9:29 AM

#

flint onyx wait acc i dont understand what u mean by few responses into a others

like if out of all responses, you have 1 and only 1 that says it reminds them of Some Obscure Movie
how does that help you determine the food
sounds to me like it's very easily gonna overfit

flint onyx Apr 2, 2025, 9:30 AM

#

but doesnt selectfrommodel take care of that? it only picks the most informative features right?

jaunty helm Apr 2, 2025, 9:31 AM

#

flint onyx but doesnt selectfrommodel take care of that? it only picks the most informative...

there's no guarantee that any kind of feature selection will actually lead to better performance in general

flint onyx Apr 2, 2025, 9:33 AM

#

flint onyx because of the movie title question. I used a bag of words approach for that and...

so before doing this I should drop movie titles with rlly low frequency?

jaunty helm Apr 2, 2025, 9:33 AM

#

it could also be that, say for example, there's only a few responses that says Ratatouille, but this might be a very strong indicator for pizza (or maybe not)

flint onyx Apr 2, 2025, 9:33 AM

#

and then get all the words?

#

also another problem is that students filled these up so alot of them are pretty stupid responses

#

I tried my best to get rid of the responses that didnt seem good but for the movie title I kept pretty much all of them

jaunty helm Apr 2, 2025, 9:35 AM

#

flint onyx also another problem is that students filled these up so alot of them are pretty...

I mean that if you can identify these stupid responses, e.g. moving them to a Other group or even deleting this response from your training data could be alternatives that might lead to better performance in the end

flint onyx Apr 2, 2025, 9:35 AM

#

im confused by this "other" group. do you mean introducing a new feature?

jaunty helm Apr 2, 2025, 9:36 AM

#

flint onyx im confused by this "other" group. do you mean introducing a new feature?

for example, I can transform all the movie titles that are very low frequency / clearly jokes into Other

flint onyx Apr 2, 2025, 9:36 AM

#

how would that help

#

do u mean like converting all the joke responses/low freq responses to the same word or something?

jaunty helm Apr 2, 2025, 9:38 AM

#

flint onyx how would that help

then the tree might not hyperfocus on the fact that for the 3 responses with movie ObscureMovieA, they're all pizza, so that must mean ObscureMovieA == pizza

flint onyx Apr 2, 2025, 9:39 AM

#

mb I still dont get it

flint onyx Apr 2, 2025, 9:39 AM

#

jaunty helm then the tree might not hyperfocus on the fact that for the 3 responses with mov...

like if we had 3 responses with the movie title "hello" and all of them corresponded to the label pizza?

jaunty helm Apr 2, 2025, 9:40 AM

#

flint onyx like if we had 3 responses with the movie title "hello" and all of them correspo...

yes
it could very well be coincidence because there are so few responses with hello, but the model might think it means hello = 100% it's pizza

flint onyx Apr 2, 2025, 9:41 AM

#

I seeee

jaunty helm Apr 2, 2025, 9:42 AM

#

flint onyx uhhh u mean like the original questions? From a scale 1 to 5, how complex is it...

it just feels like there's a lot more you can do with this, than just putting it through bow in the 1st step

flint onyx Apr 2, 2025, 9:43 AM

#

mhmm ur right Im going to mess around with it today

#

6 am rn my brain is too slow for this

#

tysm for the help btw

jaunty helm Apr 2, 2025, 9:45 AM

#

flint onyx mhmm ur right Im going to mess around with it today

e.g.

In what setting would you expect this food to be served
how many unique values do you have here? 100s?
what if you tried to group them into e.g.

Formal
Casual
Fast Food
Street Food
maybe this is not the best groups, but you get what I mean; if I were a bit more serious I'd research where your food items (pizza, shawarma, or sushi) are commonly served, and use that as a reference to what groups should be made

#

like if you have the responses:

In a wedding
Formal setting
the 2 should be related, when if you just shoved it through a bow it would look completely distinct

flint onyx Apr 2, 2025, 10:07 AM

#

jaunty helm e.g. > In what setting would you expect this food to be served how many unique v...

ye thats what I did for all the othee questions

#

beside the movie title question. Thats how I ended up with 31 feats instead of 5 (since the other 2 were numeric)

flint onyx Apr 2, 2025, 10:09 AM

#

flint onyx ye thats what I did for all the othee questions

#

like this

young geode Apr 2, 2025, 10:34 AM

#

jaunty helm like [this](https://github.com/apple/turicreate)? looks like it's been abandoned

Sorry for the late reply.. I think it's still used. The context behind this is that I'm following a Machine learning coursera course from the university of Washington. The videos are outdated, as they are using an old abandoned library called graphlab. But they made a note, updating the fact that now the learners should be working with Turicreate. It's a python library easy to use for beginners who want to understand the concepts, focus on tasks instead of algorithms. Up until now, everything worked fine. But when I try to run the code for data visualization by boxplots, I get these boxplots, like in the image, small and not on scale. I'm trying to make them bigger by decreasing the y max value, but in vain..

serene grail Apr 2, 2025, 11:55 AM

#

jaunty helm e.g. > In what setting would you expect this food to be served how many unique v...

This is some interesting insight into how domain knowledge can be useful for machine learning chocojNoted
I'm really glad I was here to read this discussion

lavish wraith Apr 2, 2025, 6:17 PM

#

can i got the job if i only pandas,numpy,matplotlib and seaborn ,ploty and dash ??

knotty breach Apr 2, 2025, 6:18 PM

#

flint onyx

what is this dataset for?

knotty breach Apr 2, 2025, 6:23 PM

#

lavish wraith can i got the job if i only pandas,numpy,matplotlib and seaborn ,ploty and dash ...

Data Analysts

lavish wraith Apr 2, 2025, 6:26 PM

#

knotty breach Data Analysts

ok

fresh harbor Apr 2, 2025, 10:16 PM

#

jaunty helm that could help, yes here's [hf](https://huggingface.co/docs/hugs/en/guides/func...

thank you

flint onyx Apr 3, 2025, 12:01 AM

#

knotty breach what is this dataset for?

food survey

#

quick question:

Which would u say is better?

I was thinking the first one but then I asked my classmate and he said that the second plot is better. I dont get it isnt the overfitting case severe in the second one? but he says its because the accuracy for second plot is better and that matters more....

I asked him why he doesnt focus on making sure the overfitting isnt severe and he said:

"the model accuracy is 0.89 and the testing accuracy is 0.97
so it is overfitting a bit
if i try to reduce overfitting the overall accuracy goes down
which i dont think is a good tradeoff
"

glacial root Apr 3, 2025, 12:08 AM

#

in jupyter notebook if a cell has been running for a long time, with there still being a * in place of the cell number and the kernel hasn't died, does that mean it's still running or is there a chance that the program has stopped running and/or it crashed

#

this thing has been running for over 20 minutes and i have no clue what's been going on

#

it's just a simple knn setup for images, with 800 training images and 200 being used for testing

#

it's actually been running for over 30 minutes now and i have no clue what's going on

#

not even a neural network, it's knn 💀

jaunty helm Apr 3, 2025, 4:00 AM

#

young geode Sorry for the late reply.. I think it's still used. The context behind this is t...

thing is, it doesnt really look like you're using matplotlib
unless you mean tc uses mpl to plot? then a plt.ylim should be fine; show your code where it's not doing what is expected

jaunty helm Apr 3, 2025, 4:07 AM

#

flint onyx quick question: Which would u say is better? I was thinking the first one but...

I wouldn't choose the 2nd one simply because it's not done right
you split into training + validation set, so you can see how your model does on unseen data
but now that you've introduced hyperopt, while keeping only the train + valid set, you're just picking the hyperparameters that specifically makes the validation set look good... see the problem?
what should be done is have 3 sets, one to train on, the second to hyperopt on, the third to actually test on; then we can see (with reduced bias) if the improvements are real

flint onyx Apr 3, 2025, 4:09 AM

#

jaunty helm I wouldn't choose the 2nd one simply because it's not done right you split into ...

im sorry bit idk what hyperopt is. I thought he just named his random forest that 😭

jaunty helm Apr 3, 2025, 4:10 AM

#

flint onyx im sorry bit idk what hyperopt is. I thought he just named his random forest tha...

it's just short for hyperparameter optimization, which is what optuna does

flint onyx Apr 3, 2025, 4:10 AM

#

im pretty sure he did have 3 sets. train, validation and test

#

he used validation for plotting the curve and tuning

jaunty helm Apr 3, 2025, 4:10 AM

#

flint onyx he used validation for plotting the curve and tuning

so he used the validation set to both tune and to test accuracy?

flint onyx Apr 3, 2025, 4:11 AM

#

no he tested the final acc using test

#

but the thing I dont get is that how can he say that his model is reasonable when the gap is soooo big

jaunty helm Apr 3, 2025, 4:12 AM

#

flint onyx no he tested the final acc using test

then the second one does look a bit better I suppose

flint onyx Apr 3, 2025, 4:12 AM

#

oh

jaunty helm Apr 3, 2025, 4:12 AM

#

flint onyx but the thing I dont get is that how can he say that his model is reasonable whe...

it's not about how big the gap is, lemme draw smthn up real quick

flint onyx Apr 3, 2025, 4:12 AM

#

jaunty helm then the second one does look a bit better I suppose

I was originally getting a similar plot but I thought the gap between the two was a red flag

jaunty helm Apr 3, 2025, 4:15 AM

#

flint onyx I was originally getting a similar plot but I thought the gap between the two wa...

overfitting is when your model fits the training data too well, such that the real loss (that is estimated by test loss) actually increases
if you plot it in a graph and try to eyeball it, you'd say that your model is overfitting if you see some part where the training loss is going down, but the real loss is going up, so about here in the red box in this hypothetical graph

#

so take your first model as an example, if you kept training and plotted the train vs. test acc, it might look like this
at where the red box is is where it approximately starts to overfit

jaunty helm Apr 3, 2025, 4:20 AM

#

jaunty helm overfitting is when your model fits the training data too well, such that the re...

(this graph is kinda non sensical cause the green line should 99% of the time be above the blue one - not often you see testing loss < training loss - but you get the idea)

flint onyx Apr 3, 2025, 4:23 AM

#

I seee

#

smh I misunderstood earlier and was so focused on the gap

jaunty helm Apr 3, 2025, 4:24 AM

#

more like this; at around the red box is when your green model starts to overfit
now let's say we make another model whose testing loss is orange
near the purple box, though the gap between orange and blue is bigger than green and blue, this is not saying that orange is overfitting worse than green (in fact none are overfitting near here); it's just saying that orange is a worse model at this point

worldly wagon Apr 3, 2025, 7:02 AM

#

serene scaffold overleaf is a good option. I use a pycharm plugin.

saw this on my phone during the day never got a chance to say thank you (overdue thank you)
thanks again

opaque condor Apr 3, 2025, 10:40 AM

#

jaunty helm more like this; at around the red box is when your green model starts to overfit...

How did you get that box on that data?

jaunty helm Apr 3, 2025, 11:18 AM

#

opaque condor How did you get that box on that data?

I used a SOTA highly sophisticated tool to make this graph
called MS paint

opaque condor Apr 3, 2025, 11:34 AM

#

No matplotliblib

jaunty helm Apr 3, 2025, 11:36 AM

#

opaque condor No matplotliblib

so you want to add a box in a matplotlib plot?
add a rectangle ig

opaque condor Apr 3, 2025, 12:11 PM

#

jaunty helm so you want to add a box in a matplotlib plot? add a [`rectangle`](https://matpl...

Thank you

lapis sequoia Apr 3, 2025, 1:06 PM

#

is tensorflow givning any of you a problem with python 3.12?

quaint rivet Apr 3, 2025, 1:13 PM

#

I was working on deep learning model and my model doing some image classification. But when i tried to pass output in loss function(cross-entropy loss) . i am getting RuntimeError: 0D or 1D target tensor expected, multi-target not supported. Any guide how to fix this error?

num_classes = 8
 self.classifier = nn.Sequential(
          nn.Linear(128 * 8 * 8*12, 256),
          nn.ReLU(),
          nn.Linear(256, 128),
          nn.ReLU(),
          nn.Linear(128, num_classes),
      )

Here's the full traceback

---> 12     loss = criterion(outputs, labels)
     13     loss.backward()
     14     optimizer.step()

fickle shale Apr 3, 2025, 1:32 PM

#

serene scaffold overleaf is a good option. I use a pycharm plugin.

Hello Stelercus!!

Can we train a ml model who can mimic a mentalist and predict words/number?
Prob.+Pattern Recognition```

fickle shale Apr 3, 2025, 1:32 PM

#

fickle shale Hello Stelercus!! ```I have an project idea!! Can we train a ml model who can mi...

like thinking to predict number b/w 1 to 10!!

#

How can i collect data?

lapis sequoia Apr 3, 2025, 6:29 PM

#

why does bert not need the same amount of text cleaning as say : logistic regression, naivebayes, Rnns,lstms,grus ect?

opaque condor Apr 3, 2025, 8:17 PM

#

Accurecy of the network: 75.0 %
Traceback (most recent call last):
  File "c:\Users\iorn\Desktop\neral network\convelutional.py", line 122, in <module>
    print(f'Acurecy of {classes[i]}: {acc:L.2f} %')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Invalid format specifier 'L.2f' for object of type 'float'

#

how do i fix this acurecy error?

serene scaffold Apr 3, 2025, 8:18 PM

#

opaque condor ``` Accurecy of the network: 75.0 % Traceback (most recent call last): File "c...

what is acc:L.2f supposed to do? show you acc to two decimals?

opaque condor Apr 3, 2025, 8:18 PM

#

yes

serene scaffold Apr 3, 2025, 8:18 PM

#

remove the L

opaque condor Apr 3, 2025, 8:19 PM

#

ok

serene scaffold Apr 3, 2025, 8:19 PM

#

where did you get the idea that you needed the L?

opaque condor Apr 3, 2025, 8:20 PM

#

i was following a tutorial video

serene scaffold Apr 3, 2025, 8:20 PM

#

when you have f"{x:y}", do you understand what x and y are?

opaque condor Apr 3, 2025, 8:20 PM

#

yes

#

x is how far you are from zero

serene scaffold Apr 3, 2025, 8:21 PM

#

No

#

I mean in general

#

f-strings

opaque condor Apr 3, 2025, 8:22 PM

#

a little

serene scaffold Apr 3, 2025, 8:22 PM

#

!e

pi = 3.14159
print(f"{pi * 2 : .4f}")

arctic wedgeBOT Apr 3, 2025, 8:22 PM

#

serene scaffold !e ```py pi = 3.14159 print(f"{pi * 2 : .4f}") ```

:white_check_mark: Your 3.12 eval job has completed with return code 0.

 6.2832

serene scaffold Apr 3, 2025, 8:23 PM

#

make sense?

opaque condor Apr 3, 2025, 8:23 PM

#

yes

#

it shows the output to a certain degree

serene scaffold Apr 3, 2025, 8:26 PM

#

it's also worth noting that it does some of the math (pi * 2) right in the f-string. the expression (which is the actual code) is on the left of the :, and the format specifier is on the right.

opaque condor Apr 3, 2025, 8:32 PM

#

lets hope my network works now

#

23/500 done

opaque condor Apr 4, 2025, 12:18 AM

#

And currently on epoch 366 I'm terrified but also excited

hollow pagoda Apr 4, 2025, 12:41 AM

#

should this be standardized/normalized if its the target label or does that not matter, its right skewed

lapis sequoia Apr 4, 2025, 12:42 AM

#

!e

print("hello")

arctic wedgeBOT Apr 4, 2025, 12:42 AM

#

lapis sequoia !e ``` print("hello") ```

:warning: Your 3.13 eval job has completed with return code 0.

[No output]

lapis sequoia Apr 4, 2025, 12:42 AM

#

i thought this ran my code💔🥀

storm heron Apr 4, 2025, 3:51 AM

#

hey guys

#

i would like someone to suggest ways to integrate AI into codebase

#

what ideaas can be implemented

#

and also what sources could be learnt from

#

i would rather the level of information would not be sofisticated because i am learning

viscid urchin Apr 4, 2025, 3:54 AM

#

AI is just another subsystem; if you design the rest of your code carefully, composing it with an AI library to add features should be kinda painless. Easier said than done of course.

#

Stuff like dependency injection, "open/closed principle" etc helps a lot, in my opinion.

#

No need to follow anything slavishly, but it's worth knowing about https://en.wikipedia.org/wiki/SOLID even if you end up doing something else

serene scaffold Apr 4, 2025, 3:58 AM

#

storm heron i would like someone to suggest ways to integrate AI into codebase

what do you think AI is?

storm heron Apr 4, 2025, 3:59 AM

#

is a language learning model that has or is being trained ?

storm heron Apr 4, 2025, 4:00 AM

#

serene scaffold what do you think AI is?

i mean obviously it sounds stupid the way i am introducing my question \

#

but what i mean is what are some simple, ideas that i can integrate ai with programming

viscid urchin Apr 4, 2025, 4:03 AM

#

Logic flows that seem like they would require a crazy number of "if" statements are a good place to start maybe

#

If you can exactly code the branching logic for something, you don't want an LLM, because why add the possibility of incorrect answers?

serene scaffold Apr 4, 2025, 4:15 AM

#

storm heron but what i mean is what are some simple, ideas that i can integrate ai with prog...

I'm asking you what you think AI is. This isn't a test. I just need to understand what you have in mind when you talk about AI and wanting to integrate it with programming.

spring field Apr 4, 2025, 4:52 AM

#

opaque condor And currently on epoch 366 I'm terrified but also excited

why are you terrified? you are like creating some debug samples for each epoch, right? like you can see the model improving, right?

#

and metrics too, right?

opaque condor Apr 4, 2025, 10:46 AM

#

    acc = 100.0 * n_class_correct[i] / n_class_samples[i]
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: float division by zero```

grizzled ice Apr 4, 2025, 11:47 AM

#

serene scaffold I'm asking you what you think AI is. This isn't a test. I just need to understan...

hey. I just came across this message and I had been wondering what it actually meant. Could you give YOUR thoughts on what AI is?

#

or anyone tbh

grand minnow Apr 4, 2025, 12:00 PM

#

grizzled ice hey. I just came across this message and I had been wondering what it actually m...

why?

grizzled ice Apr 4, 2025, 12:02 PM

#

grand minnow why?

what do you mean why?

#

i just wanted to know

grand minnow Apr 4, 2025, 12:02 PM

#

grizzled ice what do you mean why?

Why would you want to know what other people think of AI?

#

tbh its just a matter of implementation

grizzled ice Apr 4, 2025, 12:03 PM

#

grand minnow Why would you want to know what other people think of AI?

because I dont get it myself?

grand minnow Apr 4, 2025, 12:03 PM

#

grizzled ice because I dont get it myself?

What part of AI you don't get?

grizzled ice Apr 4, 2025, 12:03 PM

#

like what is AI and what is not AI?

grand minnow Apr 4, 2025, 12:03 PM

#

https://tenor.com/view/damn-thats-deep-cody-ko-wade-the-real-bros-of-simi-valley-gif-16476225

Tenor

grand minnow Apr 4, 2025, 12:05 PM

#

grizzled ice like what is AI and what is not AI?

Would this help? https://youtu.be/ad79nYk2keg?si=ouutN-9IVJuJvHvg

YouTube

Simplilearn

What Is AI? | Artificial Intelligence | What is Artificial Intellig...

🔥Artificial Intelligence Engineer (IBM) - https://www.simplilearn.com/masters-in-artificial-intelligence?utm_campaign=ad79nYk2keg&utm_medium=DescriptionFirstFold&utm_source=Youtube
🔥IITK - Professional Certificate Course in Generative AI and Machine Learning (India Only) - https://www.simplilearn.com/iitk-professional-certificate-course-ai...

▶ Play video

opaque condor Apr 4, 2025, 12:23 PM

#

spring field why are you terrified? you are like creating some debug samples for each epoch, ...

I'm terrifying because every time they put it into training it gives me this output #data-science-and-ml message

jaunty helm Apr 4, 2025, 12:32 PM

#

grizzled ice like what is AI and what is not AI?

imo rn there's no rigorous definitions so it's all pretty subjective
my opinion is that AI is the goal and currently the new and hot and popular approach is scaling LLMs

#

(oh hey that's what the video says)

fresh goblet Apr 4, 2025, 12:45 PM

#

Hello guys. im currently first year in DataScience and I am doing a project in CompSci. I really need you help with a small logic mistake in my post in the help section. I would greatly appreciate it.

river cape Apr 4, 2025, 1:10 PM

#

https://github.com/rapidsai-community/showcase/blob/main/getting_started_tutorials/cuml_sklearn_colab_demo.ipynb

GitHub

showcase/getting_started_tutorials/cuml_sklearn_colab_demo.ipynb at...

Contribute to rapidsai-community/showcase development by creating an account on GitHub.

#

check it out if you want faster data processing

#

i think it works for scikit-learn mostly

#

the time taken for processing is impressive

bright comet Apr 4, 2025, 3:46 PM

#

helo guys

lapis sequoia Apr 4, 2025, 4:02 PM

#

is tensorflow dead?

#

are more imporantly, are you all dancing on it's grave?

serene scaffold Apr 4, 2025, 4:22 PM

#

lapis sequoia is tensorflow dead?

pretty much

lean fiber Apr 4, 2025, 6:00 PM

#

guys which tools or languages should i learn for data science?,im learning python rn and im 16.im going for ml.

serene scaffold Apr 4, 2025, 6:01 PM

#

lean fiber guys which tools or languages should i learn for data science?,im learning pytho...

Did you see my message about learning tools in #career-advice ?

maiden moat Apr 4, 2025, 6:01 PM

#

Hello just wondering if anyone has dabbled here with LLMs + Microsoft Graph API

viscid urchin Apr 4, 2025, 6:02 PM

#

No but in what sense do you mean?

serene scaffold Apr 4, 2025, 6:02 PM

#

maiden moat Hello just wondering if anyone has dabbled here with LLMs + Microsoft Graph API

Always ask your real question that someone who knows the answer can start answering

maiden moat Apr 4, 2025, 6:03 PM

#

maiden moat Hello just wondering if anyone has dabbled here with LLMs + Microsoft Graph API

I'm gonna ask the experience with teams integration if it's reliable to have an LLM do tasks for you

serene scaffold Apr 4, 2025, 6:04 PM

#

maiden moat I'm gonna ask the experience with teams integration if it's reliable to have an ...

In general, you shouldn't depend on LLMs to do tasks. Especially if humans are out of the loop.

viscid urchin Apr 4, 2025, 6:04 PM

#

Looks fun https://learn.microsoft.com/en-us/graph/api/resources/teams-api-overview?view=graph-rest-1.0

maiden moat Apr 4, 2025, 6:04 PM

#

I've been trying to do a project that prompts the AI to set meetings etc for me with microsoft teams, since I've been wanting to make a 'secretary' per se when student orgs run out of manpower

maiden moat Apr 4, 2025, 6:05 PM

#

viscid urchin Looks fun https://learn.microsoft.com/en-us/graph/api/resources/teams-api-overvi...

yeah it does

viscid urchin Apr 4, 2025, 6:05 PM

#

I've had to build IVRs by hand several times before, and I'd be glad to not do it again manually

maiden moat Apr 4, 2025, 6:06 PM

#

Either that or I'm just gonna do a script but if that was the case the project would be redundant, I just want to prompt it in and do it for me because I can't be bothered clicking and typing more than I have to lmao

maiden moat Apr 4, 2025, 6:07 PM

#

serene scaffold In general, you shouldn't depend on LLMs to do tasks. Especially if humans are o...

Sorry what

#

I just have a specific usecase in mind that I want to explore

serene scaffold Apr 4, 2025, 6:08 PM

#

You'd need to run some tests to confirm that the LLM consistently performs the desired tasks correctly

maiden moat Apr 4, 2025, 6:09 PM

#

Yeah that's already a given, I'm literally just curious if others have done the thing I'm trying to do

#

Anyways thanks!

lean fiber Apr 4, 2025, 6:20 PM

#

serene scaffold Did you see my message about learning tools in <#470889390588035082> ?

Yes I just wanted more details and since this is the data science section, why not ask questions?

#

I already got my answer anyway so dw

serene scaffold Apr 4, 2025, 6:40 PM

#

lean fiber Yes I just wanted more details and since this is the data science section, why n...

my advice was to not focus on learning tools, but you came here and asked what tools to learn.

lean fiber Apr 4, 2025, 6:45 PM

#

I didn’t understand what you meant by “concepts”, and I think I’ll need tools to learn those concepts anyways(idk).

serene scaffold Apr 4, 2025, 6:46 PM

#

lean fiber I didn’t understand what you meant by “concepts”, and I think I’ll need tools to...

data science and AI are applied math. you should focus on learning concepts in an order that build on each other, and only worry about tools as they pertain to the concept you're currently trying to learn.

#

you'll see what I mean by "concepts" when you get into it.

#

do you know what a classifier is?

lean fiber Apr 4, 2025, 6:48 PM

#

No

serene scaffold Apr 4, 2025, 6:48 PM

#

Start with that.

lean fiber Apr 4, 2025, 6:48 PM

#

Ok 👍🏿

iron basalt Apr 4, 2025, 8:13 PM

#

grizzled ice hey. I just came across this message and I had been wondering what it actually m...

Rational agent definiton of AI (the "standard model" of AI):

"A rational agent is one that acts so as to achieve the best outcome or, when there is uncertainty, the best expected outcome.

In the 'laws of thought' approach to AI, the emphasis was on correct inferences. Making correct inferences is sometimes
part of being a rational agent, because one way to act rationally is to deduce that a given action is best and then to act on
that conclusion. On the other hand, there are ways of acting rationally that cannot be said to involve inference. For example,
recoiling from a hot stove is a reflex action that is usually more successful than a slower action taken after careful deliberation.

All the skills needed for the Turing test also allow an agent to act rationally. Knowledge representation and reasoning enable agents
to reach good decisions. We need to be able to generate comprehensible sentences in natural language to get by in a complex society.
We need learning not only for erudition, but also because it improves our ability to generate effective behavior, especially in
circumstances that are new.

The rational-agent approach to AI has two advantages over the other approaches. First, it is more general than the 'laws of thought'
approach because correct inference is just one of several possible mechanisms for achieving rationality. Second, it is more
amenable to scientific development. The standard of rationality is mathematically well defined and completely general.
We can often work back from this specification to derive agent designs that provably achieve it--something that is largely impossible
if the goal is to imitate human behavior or thought processes.

For these reasons, the rational-agent approach to AI has prevailed throughout most of the field's history. In the early decades,
rational agents were built on logical foundations and formed definite plans to achieve specific goals. Later, methods based on
probability theory and machine learning allowed the creation of agents that could make decisions under uncertainty to attain
the best expected outcome. In a nutshell, AI has focused on the study and construction of agents that do the right thing.
What counts as the right thing is defined by the objective that we provide to the agent. This general paradigm is so
pervasive that we might call it the standard model. It prevails not only in AI, but also in control theory, where a controller
minimizes a cost function; in operations research, where a policy maximizes a sum of rewards; in statistics, where a decision
rule minimizes a loss function; and in economics, where a decision maker maximizes utility or some measure of social welfare."

(Artificial Intelligence: A Modern Approach. Russel & Norvig)```

#

There are other definitions, but this one is a pretty good.

opaque condor Apr 4, 2025, 8:20 PM

#

    acc = 100.0 * n_class_correct[i] / n_class_samples[i]
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: float division by zero

austere prawn Apr 4, 2025, 8:33 PM

#

Has that marimo presentation been yet? 😅😊

#

I'm so out of the loop 😮‍💨

opaque condor Apr 4, 2025, 8:36 PM

#

opaque condor ``` File "c:\Users\iorn\Desktop\neral network\convelutional.py", line 120, in <...

how do i fix this error

viscid urchin Apr 4, 2025, 8:36 PM

#

Well, what do you want to do when n_class_samples[i] returns a 0?

#

Arguably you want NaN in such cases it seems to me.

opaque condor Apr 4, 2025, 8:52 PM

#

viscid urchin Well, what do you want to do when `n_class_samples[i]` returns a 0?

No I'm trying to get it to return the accuracy of what it learned I don't you might want to look at this video because I'm just following what I can https://youtu.be/pDdP0TFzsoQ?si=cY-8u9T80R8LTwq1

YouTube

Patrick Loeber

PyTorch Tutorial 14 - Convolutional Neural Network (CNN)

New Tutorial series about Deep Learning with PyTorch!
⭐ Check out Tabnine, the FREE AI-powered code completion tool I use to help me code faster: https://www.tabnine.com/?utm_source=youtube.com&utm_campaign=PythonEngineer *

In this part we will implement our first convolutional neural network (CNN) that can do image classification based on th...

▶ Play video

viscid urchin Apr 4, 2025, 8:54 PM

#

OK, so this? https://github.com/patrickloeber/pytorchTutorial/blob/master/14_cnn.py

GitHub

pytorchTutorial/14_cnn.py at master · patrickloeber/pytorchTutorial

PyTorch Tutorials from my YouTube channel. Contribute to patrickloeber/pytorchTutorial development by creating an account on GitHub.

opaque condor Apr 4, 2025, 8:55 PM

#

yes

viscid urchin Apr 4, 2025, 8:55 PM

#

The issue is this code:

    n_class_correct = [0 for i in range(10)]
    n_class_samples = [0 for i in range(10)]

initializing them to 0 means you need to be more careful when you divide later

opaque condor Apr 4, 2025, 8:56 PM

#

which lines?

viscid urchin Apr 4, 2025, 8:56 PM

#

106-107

#

and then 127-end

#

You could replace line 127 onward with:

for i in range(10):
    if n_class_samples[i] == 0:
        print(f'No samples for {classes[i]}.')
    else:
        acc = 100.0 * n_class_correct[i] / n_class_samples[i]
        print(f'Accuracy of {classes[i]}: {acc} %')

agile cobalt Apr 4, 2025, 9:00 PM

#

austere prawn Has that marimo presentation been yet? 😅😊

this? https://discord.gg/python?event=1350928346422186065
you can see from the Event button in the top of the channel bar

opaque condor Apr 4, 2025, 9:00 PM

#

https://paste.pythondiscord.com/CUPQ

austere prawn Apr 4, 2025, 9:00 PM

#

agile cobalt this? https://discord.gg/python?event=1350928346422186065 you can see from the ...

Probably 👍😎

viscid urchin Apr 4, 2025, 9:02 PM

#

opaque condor https://paste.pythondiscord.com/CUPQ

So yeah, put what I proposed in place of lines 119+

opaque condor Apr 4, 2025, 11:42 PM

#

Thank you

opaque condor Apr 5, 2025, 2:51 AM

#

viscid urchin So yeah, put what I proposed in place of lines 119+

https://paste.pythondiscord.com/GGIQ

#

thank you

#

I'm happy about this

opaque condor Apr 5, 2025, 3:19 AM

#

Is there anything I should be worried about with this data currently trying to be afraid that it's overfitting?

opaque condor Apr 5, 2025, 3:33 AM

#

viscid urchin So yeah, put what I proposed in place of lines 119+

Should I be worried up overfitting?

viscid urchin Apr 5, 2025, 3:34 AM

#

Of that I'm not sure, I'm still learning how to spot that myself

#

Is your training accuracy much higher than your test accuracy? Doesn't look like it right?

opaque condor Apr 5, 2025, 3:42 AM

#

viscid urchin Is your training accuracy much higher than your test accuracy? Doesn't look like...

It hasn't given me anything on test accuracy it just tells me how is doing just take a look

#

Yes but it seems like the networks accuracy went down and then the last few rounds.

viscid urchin Apr 5, 2025, 3:45 AM

#

Your code only prints test accuracy after training is complete; there’s no monitoring of training accuracy or loss over epochs, so I don't think we have enough information to tell

opaque condor Apr 5, 2025, 3:45 AM

#

Darn it

queen oyster Apr 5, 2025, 4:19 AM

#

#

guh

#

i'm messing around with some tools i don't fully understand

#

trying to create a program to use audio samples to generate geometry dash layouts

#

do i need to multiply my frequency-amplitude function by this funny curve

queen oyster Apr 5, 2025, 10:13 AM

#

update: I multiplied by the curve and I can now see the higher frequencies

#

I'm using a log scale so they obviously aren't represented equally to the bass when shown like this

#

now how do I turn this into a geometry dash layout

#

I need to work out either how the icon should move or where the player should click

#

or perhaps where the player collides or interacts with an object

#

and what object it should be

opaque condor Apr 5, 2025, 12:30 PM

#

My convolutional neural network is doing good hectic trains while I'm asleep I just need something to show me that it's actually learning and not overfitting

lean fiber Apr 5, 2025, 12:59 PM

#

Guys this is my current plan: • 100 days of code by Angela Yu (current)
• Machine learning a-z by Kirill Eremenko, Hadelin de Ponteves, SuperDataScience Team, and Ligency Team
• The data science course by 365 careers
• Coursera machine learning course by Andrew Ng
• Coursera deep learning course by Andrew Ng.

#

What else can I learn to boost my chance of finding a job

grand minnow Apr 5, 2025, 1:13 PM

#

lean fiber What else can I learn to boost my chance of finding a job

Build something. A solution or an app that solves a problem. Keep building. Then add them to your portfolio. Then brag about it at job interviews

lean fiber Apr 5, 2025, 3:03 PM

#

Thanks! I’ll for sure do this.

primal apex Apr 5, 2025, 3:56 PM

#

lean fiber Thanks! I’ll for sure do this.

Are you building anything this weekend? 🙂

meager canyon Apr 5, 2025, 4:10 PM

#

has anybody worked with RAG? looking to build one

#

I have a got huge markdown file

serene scaffold Apr 5, 2025, 4:11 PM

#

meager canyon has anybody worked with RAG? looking to build one

what's your actual question about RAG?

meager canyon Apr 5, 2025, 4:11 PM

#

serene scaffold what's your actual question about RAG?

the process itself. I did the ingestion, but looking for some resource

#

or tips

serene scaffold Apr 5, 2025, 4:11 PM

#

meager canyon the process itself. I did the ingestion, but looking for some resource

the ingestion?

meager canyon Apr 5, 2025, 4:11 PM

#

I mean

#

I got the whole data

#

stored yk

#

Later on i'll work with Agents

meager canyon Apr 5, 2025, 4:18 PM

#

serene scaffold the ingestion?

this right?

serene scaffold Apr 5, 2025, 4:19 PM

#

meager canyon this right?

I'm not sure what load and split are

#

oh I guess I know what they mean by split

meager canyon Apr 5, 2025, 4:19 PM

#

LangChain is where should I start ig

serene scaffold Apr 5, 2025, 4:20 PM

#

I've never actually used langchain.
admittedly I'm not being very helpful

meager canyon Apr 5, 2025, 4:20 PM

#

its ok

viscid urchin Apr 5, 2025, 4:20 PM

#

Sweet lord, langchain has a lot of things to import

meager canyon Apr 5, 2025, 4:20 PM

#

right

viscid urchin Apr 5, 2025, 4:21 PM

#

Here's a tutorial I'm reading now https://python.langchain.com/docs/tutorials/rag/

#

I guess that's where you are?

meager canyon Apr 5, 2025, 4:21 PM

#

yeah

#

ill follow it

#

actually understand how it works

viscid urchin Apr 5, 2025, 4:21 PM

#

"Detailed Walkthrough" is where the action starts

#

Oh that's cool it has specialized TextSplitters for all sorts of things

meager canyon Apr 5, 2025, 4:22 PM

#

my boss wanted me to use cursor to do all the job

viscid urchin Apr 5, 2025, 4:22 PM

#

e.g. one for scientific papers https://python.langchain.com/docs/integrations/document_loaders/grobid/

meager canyon Apr 5, 2025, 4:22 PM

#

but i dont really like cursor

#

it just do everything for me

#

i'm not that lazy

viscid urchin Apr 5, 2025, 4:23 PM

#

Claude Code is the only one of the agentic tools I've run into that produces decent results often enough so far. I still much prefer to do it myself but it can be handy when I'm trying to get some unmaintained dependency to build on my Mac.

meager canyon Apr 5, 2025, 4:23 PM

#

Claude is pretty decent

#

I agree w u

viscid urchin Apr 5, 2025, 4:24 PM

#

Dang, I guess I should build something with LangChain, it looks powerful and relevant

meager canyon Apr 5, 2025, 4:25 PM

#

indeed it is

flat roost Apr 5, 2025, 4:27 PM

#

does using openai api require you to setup a payment method?

viscid urchin Apr 5, 2025, 4:29 PM

#

Yeah, you have to load at least $5 of credits to start these days I believe.

#

(Anthropic has the same minimum)

lean fiber Apr 5, 2025, 4:30 PM

#

Why do people prefer Claude over ChatGPT? Isn’t ChatGPT supposed to be better?

viscid urchin Apr 5, 2025, 4:31 PM

#

Claude is way better at coding from my testing

tropic shore Apr 5, 2025, 4:31 PM

#

hi im trying to implement one stage retina net object detector. Is here anybody willing to help? Or if this is not approriate platform to ask can you recommend some?

viscid urchin Apr 5, 2025, 4:31 PM

#

and Claude Code is a very well-implemented agent loop

viscid urchin Apr 5, 2025, 4:32 PM

#

tropic shore hi im trying to implement one stage retina net object detector. Is here anybody ...

Like with PyTorch? Or do you have a toolkit in mind.

#

Caveat: I don't have 'Pro' so I haven't tried o1-pro, it might be the best. Claude is better than GPT-4.5 from my testing though.

robust forge Apr 5, 2025, 4:33 PM

#

viscid urchin Claude is way better at coding from my testing

I was going in circles the other day trying to do something with the intellij platform SDK and Claude solved the problem with less hallucination with what I had given.

flat roost Apr 5, 2025, 4:33 PM

#

viscid urchin and Claude Code is a very well-implemented agent loop

so cant use openai for free?

#

i am just playing around with these so

tropic shore Apr 5, 2025, 4:34 PM

#

viscid urchin Like with PyTorch? Or do you have a toolkit in mind.

yeah in pytorch with efficientnet as backbone. My model always assigns all anchors to class 0 (background).

viscid urchin Apr 5, 2025, 4:34 PM

#

flat roost so cant use openai for free?

You can use their web chat for free but not the API, to my knowledge.

viscid urchin Apr 5, 2025, 4:35 PM

#

tropic shore yeah in pytorch with efficientnet as backbone. My model always assigns all ancho...

Cool. Are you familiar with feature pyramid networks? I think they are the approach here? https://paperswithcode.com/method/fpn

flat roost Apr 5, 2025, 4:35 PM

#

viscid urchin You can use their web chat for free but not the API, to my knowledge.

alright but my usage says that 0$/$18 used

#

viscid urchin Apr 5, 2025, 4:37 PM

#

flat roost

Oh interesting.. does https://platform.openai.com/api-keys let you create an API key? If so, I guess just try it?

flat roost Apr 5, 2025, 4:37 PM

#

viscid urchin Oh interesting.. does https://platform.openai.com/api-keys let you create an API...

i did ill show you the error just a sec

#

raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

tropic shore Apr 5, 2025, 4:37 PM

#

viscid urchin Cool. Are you familiar with feature pyramid networks? I think they are the appro...

yeah but this my smaller model with only 2 heads also should work. I think i have mistake in loss computing but dont know what is it

flat roost Apr 5, 2025, 4:37 PM

#

this is the error i get

viscid urchin Apr 5, 2025, 4:38 PM

#

flat roost i did ill show you the error just a sec

Yeah, that's vaguely what I'd expect.. I think you have $0 of credit to spend as a free user.

viscid urchin Apr 5, 2025, 4:38 PM

#

tropic shore yeah but this my smaller model with only 2 heads also should work. I think i hav...

Can you share the code? Maybe we can spot it.

flat roost Apr 5, 2025, 4:38 PM

#

when i asked chatgpt about this it said its cuse i didt setup a payment method

tropic shore Apr 5, 2025, 4:39 PM

#

viscid urchin Can you share the code? Maybe we can spot it.

of the loss or whole implementation?

viscid urchin Apr 5, 2025, 4:40 PM

#

tropic shore of the loss or whole implementation?

As much as you feel comfortable with really

#

You can use https://paste.pythondiscord.com/

meager canyon Apr 5, 2025, 4:42 PM

#

Btw which LLM you guys think is worth paying for?

#

For coding mainly

#

I do know GPT plus or pro is way too expensive

viscid urchin Apr 5, 2025, 4:44 PM

#

I've paid Anthropic for more Claude API dollars than I like to think about

#

If you know exactly what you want, I've found it to be very good

#

(I've been using it to automate the analysis of network traffic with wireshark/tshark etc)

meager canyon Apr 5, 2025, 4:45 PM

#

Fair

flat roost Apr 5, 2025, 4:45 PM

#

viscid urchin I've paid Anthropic for more Claude API dollars than I like to think about

is claude free tho i just want to playaround for now

meager canyon Apr 5, 2025, 4:45 PM

#

Kinda

viscid urchin Apr 5, 2025, 4:45 PM

#

flat roost is claude free tho i just want to playaround for now

It has a free tier but you can't use the API directly for free, minimum $5 of credits.

flat roost Apr 5, 2025, 4:46 PM

#

alright so basicaly paid

tropic shore Apr 5, 2025, 4:47 PM

#

viscid urchin As much as you feel comfortable with really

oki here is the loss. I dont know if i put there right things def custom_loss(y_pred, y):
# logits from heads
cl_logits, reg_logits = y_pred # shape cl: [batch, anchors, 11] reg: [batch, anchors, 4]

    # assigned anchor calsses according to iou metric, bboxes in rcnn format
    anchor_bboxes, anchor_classes = y

    # foreground anchors
    valid_mask = anchor_classes > 0

    # convert anchor classes to one hot
    num_classes = cl_logits.size(-1)
    target_classes_one_hot = torch.nn.functional.one_hot(anchor_classes, num_classes=num_classes).float() 


    classification_loss = torchvision.ops.sigmoid_focal_loss(cl_logits,
                                                            target_classes_one_hot,
                                                            reduction='sum')

    regression_loss = torch.nn.functional.smooth_l1_loss(reg_logits[valid_mask], 
                                                        anchor_bboxes[valid_mask],
                                                        reduction="sum")
    
    total_loss = (classification_loss + regression_loss) / valid_mask.sum()

    return total_loss

viscid urchin Apr 5, 2025, 4:55 PM

#

Can we see your anchor code? You might not have enough positive anchors?

#

You can do three backticks followed by the word python, and then paste your code, and then close it with three backticks, to get Discord to format it

#

tropic shore Apr 5, 2025, 5:06 PM

#

viscid urchin Can we see your anchor code? You might not have enough positive anchors?

as first maybe my idea of whole algorithm is wrong. My classification head outputs 10 logits (for digits on image) and + 1 for background. Then im putting into focall loss. Is this correct or should head outptuts 10 logits for digits and background is threated individually?

viscid urchin Apr 5, 2025, 5:07 PM

#

OK I think that helps me understand the problem

#

You are using Softmax as the approach?

#

Focal loss is meant for single sigmoid, where each class gets a separate classifier

#

I think you're mixing the two worlds maybe?

tropic shore Apr 5, 2025, 5:11 PM

#

viscid urchin You are using Softmax as the approach?

no after passign to focal loss sigmoid is applied inside this function. From head i have pure logits from Conv layer.

viscid urchin Apr 5, 2025, 5:11 PM

#

Hmm, OK

tropic shore Apr 5, 2025, 5:12 PM

#

i dont understand idea of separate classifiers. Should i construct separate heads for predicting all 10 digits?

viscid urchin Apr 5, 2025, 5:15 PM

#

I think you do need a separate binary tensor target for each of 10 classes? You could also make an explicit 11th class for 'background'?

#

Also I don't think one_hot is correct?

#

I think you want binary targets for each class, not one_hot across classes?

tropic shore Apr 5, 2025, 8:11 PM

#

viscid urchin I think you do need a separate binary tensor target for each of 10 classes? You ...

thank you i will try it

viscid urchin Apr 5, 2025, 8:41 PM

#

Sorry I wish I knew more about this topic, I'd probably be able to suggest something more specific

limpid dew Apr 5, 2025, 9:10 PM

#

Looking to find some collaborators for an esports related machine learning project. Is this the right place to post about that?

tiny vale Apr 5, 2025, 9:41 PM

#

I'm looking for recommendations of ML models/algorithms for a small dataset of tabular data. It must be flexible to null data also.

odd meteor Apr 5, 2025, 10:25 PM

#

tiny vale I'm looking for recommendations of ML models/algorithms for a small dataset of t...

Try CatBoost first, then experiment more with other tree-based models

tiny vale Apr 5, 2025, 10:25 PM

#

How does that compare with Random Forests? (Im new to ML)

odd meteor Apr 5, 2025, 10:33 PM

#

tiny vale How does that compare with Random Forests? (Im new to ML)

CatBoost and Random Forest are both tree-based models, however they differ significantly in how they build and combine trees.

Random Forest uses a technique called Bagging to build the decision trees that makes up the forest, while CatBoost uses Boosting technique in building its trees.

In terms of individual performance on your proposed task, there's no straightforward answer other than experiment with both and find out which is better

tiny vale Apr 5, 2025, 10:34 PM

#

My app needs to flex to various tenant's data -> would it make sense to have both models available and use the one with highest accuracy against untrained data?

#

Thank you for the detailed answer btw 🙂

odd meteor Apr 5, 2025, 10:36 PM

#

tiny vale My app needs to flex to various tenant's data -> would it make sense to have bot...

Most definitely. Infact, that's the whole essence of experimenting with various model.

tiny vale Apr 5, 2025, 10:37 PM

#

Thank you!

opaque condor Apr 6, 2025, 1:28 AM

#

Is there a way of showing the image that the neural network sees?

serene scaffold Apr 6, 2025, 1:59 AM

#

opaque condor Is there a way of showing the image that the neural network sees?

if the network outputs an image as a 3d array, you can render that array

opaque condor Apr 6, 2025, 2:28 AM

#

serene scaffold if the network outputs an image as a 3d array, you can render that array

It gives a percentage to tell how much it is correct I want to see the images and drawing a bounding box around the image with a label so I know that it's actually learning

serene scaffold Apr 6, 2025, 2:29 AM

#

opaque condor It gives a percentage to tell how much it is correct I want to see the images an...

what is the model being trained to do? be as obnoxiously specific as you can.

opaque condor Apr 6, 2025, 2:31 AM

#

serene scaffold what is the model being trained to do? be as obnoxiously specific as you can.

It's a convolutional neural network I'm pulling the YouTube tutorial and I don't know if it's overfitting or it's prediction aren't being wholly truthful so I wanted to know if there's a piece of code that allows me to make it so that it puts a bounding box on what image it seems as

serene scaffold Apr 6, 2025, 2:31 AM

#

opaque condor It's a convolutional neural network I'm pulling the YouTube tutorial and I don't...

be as specific as you can ever possibly be about what goes into the model and what is supposed to come out of it, and why that is useful.

#

pretend I know absolutely nothing

opaque condor Apr 6, 2025, 2:38 AM

#

It's a conversation neral network with the data = {'cat','dog','plane','car','deer'}
After it's gone through in each label and looking at it the recognization of the image goes down which I know is a good thing cuz it's basically the network saying that it knows what its looking at and returns of value of how well it's learned which is given to a loss function which tells the model how long it is and then I'm in second pass or fourth pass it's only gets smaller and smaller than number of how wrong it is now I want to test my model make sure that's truly understanding it by giving me a visual aid I was going to have it load up one of the images from one of the classes and have the neural network make a bounding box around what it sees in the image and then place a label if it's a cat dog playing deer etc so that I can know that it's validating correctly

serene scaffold Apr 6, 2025, 2:40 AM

#

opaque condor It's a conversation neral network with the data = {'cat','dog','plane','car','de...

So each image contains exactly one of {cat, dog, plane, car, deer}, and the model tells you which one of those is in the image?

opaque condor Apr 6, 2025, 2:41 AM

#

serene scaffold So each image contains **exactly one** of {cat, dog, plane, car, deer}, and the ...

No there's more in the data set but I wanted it to show me one of each and playing a label around it so that I know it's learning or at least recognizing what a cat dog airplane in car look like

serene scaffold Apr 6, 2025, 2:42 AM

#

opaque condor No there's more in the data set but I wanted it to show me one of each and playi...

it's not guaranteed that the model is actually learning where specifically in an image anything is.

opaque condor Apr 6, 2025, 2:45 AM

#

I know I just want to know if it's starting to recognize a pattern

serene scaffold Apr 6, 2025, 2:46 AM

#

opaque condor I know I just want to know if it's starting to recognize a pattern

you can render the outputs of convolutional layers, but they won't necessarily look like anything that's meaningful to humans.

opaque condor Apr 6, 2025, 2:49 AM

#

Darn it

opaque condor Apr 6, 2025, 2:52 AM

#

serene scaffold you can render the outputs of convolutional layers, but they won't necessarily l...

I know it should look like static but who knows if you put each convolution layer on top of each other you may get the image

serene scaffold Apr 6, 2025, 2:54 AM

#

opaque condor I know it should look like static but who knows if you put each convolution laye...

you get something less like an image after each convolutional layer

queen oyster Apr 6, 2025, 6:29 AM

#

i need help with something

#

i have an image consisting of black and white

#

and i want to assign each white pixel a number representing the gradient of the line it is on

#

like this

viscid urchin Apr 6, 2025, 6:34 AM

#

What value should a white pixel far from any black line get? Also 0?

#

You could do a 'distance transform' to find the closest black pixel to each white pixel, and then read the slope of that black pixel. Before all that you need to detect the black lines, and there are various things you could do there.

#

e.g. https://learnopencv.com/hough-transform-with-opencv-c-python/

#

OpenCV has a HoughLinesP that might be the right choice.. You may want to play with a few of its options.

opaque condor Apr 6, 2025, 12:09 PM

#

serene scaffold you get something less like an image after each convolutional layer

Is there a way of reconfiguring that output back to the same image by using a convolutional layer?

opaque condor Apr 6, 2025, 1:05 PM

#

viscid urchin Claude Code is the only one of the agentic tools I've run into that produces dec...

I made a loss function so I can tell my neural networks loss so I know how much it's going to take to get 100% there

cerulean violet Apr 6, 2025, 2:48 PM

#

Hello i need a good free db to train my chatbot(around 15-20gb)
I tried to use the break dataset but found it too be empty

serene scaffold Apr 6, 2025, 2:51 PM

#

opaque condor Is there a way of reconfiguring that output back to the same image by using a co...

That wouldn't tell you anything

opaque condor Apr 7, 2025, 1:26 AM

#

serene scaffold That wouldn't tell you anything

Darn it

safe agate Apr 7, 2025, 3:38 PM

#

austere prawn Has that marimo presentation been yet? 😅😊

Not yet - it's this Saturday
https://discord.gg/python?event=1350928346422186065

indigo aspen Apr 7, 2025, 3:51 PM

#

hello guyss

#

hope everyone has a nice day

#

I'm studying to become a data scientist and im still learning some basics in python

#

nice to be here

weary timber Apr 7, 2025, 5:25 PM

#

can someone who is familiar with both kohya_ss and modal (a cloud gpu provided) help me with using trainers like kohya_ss in a cloud gpu?

spring field Apr 8, 2025, 12:25 AM

#

opaque condor Darn it

If you look into something like ViT (Vision Transformer), then you are able to visualise where the model is attending to in an image to sort of see what features it mainly looks for in an image to classify it

glacial root Apr 8, 2025, 2:07 AM

#

typically what is used for ml/computer vision in c++ in place of numpy

#

is there a c++ equivalent of numpy

iron basalt Apr 8, 2025, 2:11 AM

#

glacial root is there a c++ equivalent of numpy

You just write the loops.

#

In theory there are Numpy-likes, but it does not work well in practice, because C++ is C++.

glacial root Apr 8, 2025, 2:12 AM

#

i see

#

oh yeah i guess c++ already has ways to manage memory efficiently unlike python

#

so for python it's kind of needed to have numpy arrays

iron basalt Apr 8, 2025, 2:13 AM

#

For example, https://github.com/xtensor-stack/xtensor might seem like a nice idea. But the problem with C++ is that when you typo something, etc, you get some giant template compile time error that makes no sense. In addition your compile times go through the roof. C++ can do a lot on things in theory, but in practice anything other than really simple C++ has major problems.

GitHub

GitHub - xtensor-stack/xtensor: C++ tensors with broadcasting and l...

C++ tensors with broadcasting and lazy computing. Contribute to xtensor-stack/xtensor development by creating an account on GitHub.

iron basalt Apr 8, 2025, 2:14 AM

#

glacial root oh yeah i guess c++ already has ways to manage memory efficiently unlike python

Yes.

iron basalt Apr 8, 2025, 2:16 AM

#

glacial root so for python it's kind of needed to have numpy arrays

Python needs Numpy because plain Python loops and operations are slow.

#

By calling a Numpy function you are moving this loop to C (it does it internally).

glacial root Apr 8, 2025, 2:17 AM

#

i see

iron basalt Apr 8, 2025, 2:17 AM

#

(Or maybe Fortran, Numpy has multiple backends)

glacial root Apr 8, 2025, 2:17 AM

#

and then the numpy array gets stored as all one object

iron basalt Apr 8, 2025, 2:18 AM

#

glacial root and then the numpy array gets stored as all one object

Yes, a simple contiguous array.

#

The C side just loops over this.

glacial root Apr 8, 2025, 2:18 AM

#

i see

#

that's super cool

iron basalt Apr 8, 2025, 2:18 AM

#

Python lists are a bit more complicated, since they can hold different types for the elements.

serene scaffold Apr 8, 2025, 2:19 AM

#

python lists are also strictly one-dimensional (nested lists are entirely separate objects)

glacial root Apr 8, 2025, 2:19 AM

#

so if i wanted to (in theory, probably wouldn't actually do this unless it's truly meaningful to do), i could use cython and sort of "create" my own numpy

#

pretty much just bringing in similar functionality

opaque condor Apr 8, 2025, 2:19 AM

#

spring field If you look into something like ViT (Vision Transformer), then you are able to v...

Thank you

iron basalt Apr 8, 2025, 2:19 AM

#

serene scaffold python lists are also strictly one-dimensional (nested lists are entirely separa...

Yeah this is important too, although technically in implementation it's internally just an array (plus the shape (the lengths)), adding more dimensions is just changing the way you access this array.

glacial root Apr 8, 2025, 2:20 AM

#

serene scaffold python lists are also strictly one-dimensional (nested lists are entirely separa...

oh then i'm guessing the space complexity would be diabolical

#

for nested lists

serene scaffold Apr 8, 2025, 2:21 AM

#

glacial root oh then i'm guessing the space complexity would be diabolical

at the implementation level, every element of a python list is a pointer to another pyobject. ints, strings, other lists, instances of other classes--they're all pointers to another pyobject.

iron basalt Apr 8, 2025, 2:21 AM

#

Important to note here is that it's an array, not a dynamic array. So it's terrible at appending elements, unlike a list, which is a dynamic array (of Python objects (pointers to them), hence the multiple different types allowed in one list).

#

So try to not change its size all the time.

#

Best to make once upfront, then change the values in it.

iron basalt Apr 8, 2025, 2:24 AM

#

glacial root for nested lists

In this case it's really slow because you are chasing a bunch of pointers. You are first looking up a pointer with the first index, then following that pointer to the nested list it points to, and then using the second index to get the pointer to the python object, and following that. That is basically 4 address lookups (4 jumps (that could be anywhere in memory (random access))).

limber spear Apr 8, 2025, 2:24 AM

#

Check out this beauty chat 😏

iron basalt Apr 8, 2025, 2:26 AM

#

iron basalt In this case it's really slow because you are chasing a bunch of pointers. You a...

So even in pure Python, for a 2D list you probably still want a 1D list that you just access in a way that makes it behave like a 2D list (the formula for the index is: i = column + row * num_columns (row major)).

limber spear Apr 8, 2025, 2:26 AM

#

ehhh the plot title is not clean cool_cry

viscid urchin Apr 8, 2025, 2:28 AM

#

glacial root is there a c++ equivalent of numpy

One thing people use is https://www.boost.org/doc/libs/1_83_0/libs/multi_array/doc/user.html

iron basalt Apr 8, 2025, 2:28 AM

#

iron basalt So even in pure Python, for a 2D list you probably still want a 1D list that you...

viscid urchin Apr 8, 2025, 2:28 AM

#

Has the same 'views and strides' kind of thing going on as numpy

iron basalt Apr 8, 2025, 2:29 AM

#

iron basalt

In C++ you just make a (matrix/nd-array) class for this that you give the indices to and it uses the formula.

#

https://numpy.org/doc/stable/reference/arrays.ndarray.html#internal-memory-layout-of-an-ndarray

#

Note that by convention C users use row-major, Fortran uses column-major.

#

Numpy lets you pick (default row-major (the name "row-major" stops making sense for higher dimensions but we use it anyway)).

glacial root Apr 8, 2025, 2:35 AM

#

limber spear Check out this beauty chat 😏

dude i recently implemented k nearest neighbors on haar features of images and had an error rate of over 70% 💀

#

it was egregious

#

but also i had barely any data

#

forgot how much, less than 1000 images for sure though

#

maybe even less than 500, i'll have to check

#

but it was insanely slow so i had to cut down the amount of data

#

probably cause i don't have a gpu, but also just cause that's the nature of knn

iron basalt Apr 8, 2025, 2:49 AM

#

glacial root so if i wanted to (in theory, probably wouldn't actually do this unless it's tru...

Yes, or implement custom operations in C and bind then with Cython. Cython is built to work with Numpy, so your C side can take the underlying buffer from the Numpy array as input / output.

#

https://docs.python.org/3/c-api/buffer.html if an object supports this (e.g. Numpy arrays or array.array or bytes, etc) they can all interact with each other directly by passing around the underlying buffer; also with C/C++/etc code (with no copies being made importantly).

Python documentation

Buffer Protocol

Certain objects available in Python wrap access to an underlying memory array or buffer. Such objects include the built-in bytes and bytearray, and some extension types like array.array. Third-part...

limber spear Apr 8, 2025, 2:52 AM

#

I pretty’d up the final plots/graphs chat 1 moment 😏

limber spear Apr 8, 2025, 3:13 AM

#

they're so beautiful 😏

#

any guess what this model is studying

viscid urchin Apr 8, 2025, 3:16 AM

#

Pictures of goats.

limber spear Apr 8, 2025, 3:35 AM

#

viscid urchin Pictures of goats.

High quality goats 🔥

neat pasture Apr 8, 2025, 3:50 AM

#

I wish to do my thesis on BCI by using sensors to read brainwaves and then use AI to interpret the data which will be used for both emotion classification and for using brain waves to move objects, in this case the computer cursor

But I’m not sure whether to do the whole thing on Raspberry Pi or on my PC or use Arduino as well

agile cobalt Apr 8, 2025, 3:53 AM

#

generally speaking you'll want to use something that has a descent GPU, at least for training the model

for inference you might be able to get away with a Raspberry Pi in exchange for either having a rather small model and/or slow sampling rate

neat pasture Apr 8, 2025, 4:02 AM

#

agile cobalt generally speaking you'll want to use something that has a descent GPU, at least...

So since I already have a pc with a 16GB vram GPU, that makes using a raspberry pi redundant in this case right?