#data-science-and-ml

strong cove
#

Ye it sounds hard

lapis sequoia
#

it’s like 20 terabytes all together

verbal oar
#

I have a master's, so what PhD-level math do I need, according to the previous message(s)?

#

and do the people who implemented these scikit-learn things come from applied math? I mean the more advanced ones, not the ones you learn at university

#

so these books like math for machine learning or math for deep learning?

odd stratus
# lapis sequoia it’s like 20 terabytes all together

lmao, yeah I'd get an external hard drive and then use PILGRIM and OS to get the files one at a time when needed during runtime
doing preprocessing on the data might also help, but it would take a while for 9 million images

lapis sequoia
quaint mulch
#

keep the kernel size 3x3, very rarely you need anything else
don't use stride or dilation (unless you are doing wavenet)
just add a padding of 1 (if I remember correctly).
this way, you keep the dimensions the same between layers, so you can add them ResNet-style.

I hate doing this math too
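
The size math above can be checked with a small helper. This is a minimal sketch using the standard convolution output-size formula; the function name `conv_output_size` and its defaults are chosen here for illustration and are not from the conversation.

```py
def conv_output_size(n, kernel=3, stride=1, padding=1, dilation=1):
    """Output size along one spatial dimension of a convolution."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (n + 2 * padding - effective_kernel) // stride + 1


print(conv_output_size(32))                       # 3x3, padding 1: size is preserved
print(conv_output_size(32, padding=0))            # no padding: the size shrinks by 2
print(conv_output_size(32, kernel=5, padding=2))  # 5x5 needs padding 2 to preserve size
```

With stride 1 and no dilation, a kernel of size k keeps the size unchanged when the padding is (k - 1) / 2, which is why 3x3 with padding 1 lets you add layer outputs together.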

odd stratus
verbal oar
#

I mean sth like lars,omp etc

quaint mulch
verbal oar
#

yes these abbr

#

first I dont know what it is but second I see variant of regression

odd stratus
quaint mulch
#

Yes

verbal oar
#

is there a PDF of the user guide, or at least a single-HTML version of it? I mean scikit-learn, I want to skim it

full furnace
past bramble
quaint mulch
past bramble
#

man I don't know if this will work, even Kaggle's P100 GPU ran into a memory error. Reduced the batch size for images from 32 -> 16 -> 8 now. Even this is taking too long.
~3 minutes for each batch

#

I hope the result will be good

#

I accidentally added cmap='gray' to see results every epoch 💀
It's already been through 8 epochs for 25 minutes, I can't change it now

#

i was thinking why it's all black and white

quaint mulch
# past bramble what architectures? If you mean the layers and structure, I want to learn how th...

That's true, but personally I would prefer to get started by

  1. finding a few famous architectures,
  2. going through the source code to make sure I absolutely understand every single line and WHY,
  3. tweaking them to see what works and what fails, usually to answer the why question: why do they do it this way and not some other way? Let's try the other way and see. Sometimes you figure out why they do it that way, sometimes you just become an inventor.

Going from scratch is very useful, but also painful.

past bramble
quaint mulch
verbal oar
#

its just approx 400 pages related to supervised unsupervised learning so its doable to read

#

rest are examples, api reference total 2.5k pages

lapis sequoia
verbal oar
#

oh I see you are in geometric deep learning so to do for example deep render I need scene graph so GNN?

quaint mulch
verbal oar
#

yes, for example if someone do deep rendering then need scene graph for it?

#

I mean when just one object so cnn is enough

lapis sequoia
verbal oar
#

this is interesting

#

but for scene he uses just cnn

#

and he says in readme to make triangles need rnn (so sequences)

odd stratus
lapis sequoia
#

Because the files have to be redownloaded right?

quaint mulch
#

I'm not sure how to answer your questions, but I have a few comments.
Firstly, when I say geometric deep learning, I usually refer to non-euclidean geometry.
Secondly, I am not familiar with neural rendering. I have read some papers and I think they are really interesting, but I have never used it, so I can't make any practical suggestions.
Finally, it seems that the best approach is using a radiance field instead of a CNN https://paperswithcode.com/task/neural-rendering

Given a representation of a 3D scene of some kind (point cloud, mesh, voxels, etc.), the task is to create an algorithm that can produce photorealistic renderings of this scene from an arbitrary viewpoint. Sometimes, the task is accompanied by image/scene appearance manipulation.

somber tulip
#

Hey, I want to evaluate the quality of my document corpus. Quality means that it should provide information, be coherent, etc. My corpus could be in any language. For the moment I tokenize my text and compute the Shannon entropy, but I want to measure it in a better way

#

If someone could help me I would be very grateful
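
For reference, the baseline described above (token-level Shannon entropy) might look like the sketch below. Whitespace tokenization and the name `shannon_entropy` are assumptions made for illustration; the actual tokenizer used was not shown.

```py
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy, in bits per token, of the empirical token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Assumption: plain whitespace tokenization stands in for the real tokenizer.
print(shannon_entropy("a b a b".split()))   # two equally likely tokens
print(shannon_entropy("a b c d".split()))   # four equally likely tokens
```

This only measures how evenly the tokens are distributed. It says nothing about word order, so a shuffled document scores the same as the original, which is one reason it is a weak proxy for coherence.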

odd stratus
# lapis sequoia Because the files have to be redownloaded right?

yeah, that would be an issue
if you want speed you need to have direct access to them
so neither of the deep storage models will work for you
you might be able to use the infrequent access model though, but i think you'd be using standard if you're going to be using the images a lot for training
and if that's the case, a secondary storage device connected to the computer is a lot easier to work with and cheaper over time, as it only incurs an up-front cost
but its up to you what suits your needs

lapis sequoia
odd stratus
#

not sure, i havent worked with that amount of data before

#

but if it's just training data, i think adding a small detector before loading each file would work well enough

#

because if a single file amongst 9 million gets corrupted, as long as you can stop it getting into the network, then you should be fine

lapis sequoia
odd stratus
# lapis sequoia To detect if the file is corrupted?

during the preprocessing stage of loading the file for use in training etc.
when loading and processing it, if it was corrupted it would cause a runtime error
so place some tests to check and stop those types of files, and then continue with a different file
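
A minimal sketch of that idea, assuming the real loader raises an exception on a corrupted file. `iter_valid_samples` and `fake_load` are hypothetical names; in practice `load_fn` would be whatever function opens and decodes an image.

```py
def iter_valid_samples(paths, load_fn):
    """Yield (path, sample) pairs, skipping any file that fails to load."""
    for path in paths:
        try:
            sample = load_fn(path)
        except Exception as exc:  # broad on purpose: any load failure means "skip it"
            print(f"skipping {path}: {exc}")
            continue
        yield path, sample


# Stand-in loader for the demo; a real one would open and decode the image.
def fake_load(path):
    if "bad" in path:
        raise ValueError("corrupted file")
    return path.upper()


for path, sample in iter_valid_samples(["a.png", "bad.png", "c.png"], fake_load):
    print(path, sample)
```

Logging the skipped paths instead of silently dropping them makes it easy to see afterwards how many of the 9 million files were affected.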

lapis sequoia
dusky pagoda
#

that relu function looks weird, usually you would implement it as np.maximum(x, 0)

odd stratus
dusky pagoda
#

x % 1 is doing x mod 1 (remainder when x is divided by 1)

#

which looks like this
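
A few sample values show the shape. In Python, `x % 1` keeps only the fractional part (and wraps negative inputs into the same range), so the output is a sawtooth between 0 and 1. The helper name `frac` is used here only for illustration.

```py
def frac(x):
    # Python's % takes the sign of the divisor, so the result is always in [0, 1).
    return x % 1


for x in (2.75, 0.5, -0.25, 3.0):
    print(x, "->", frac(x))
```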

dusky pagoda
serene grail
odd stratus
#
```py
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    # Element-wise: x where x > 0, else 0
    return np.where(x > 0, x, 0)

def leaky_relu(x):
    # Element-wise: x where x > 0, else 0.5 * x
    return np.where(x > 0, x, x * 0.5)

def activationfunction(x):
    f = 0  # 0 = sigmoid, 1 = relu, 2 = leaky_relu
    if f == 0:
        return sigmoid(x)
    elif f == 1:
        return relu(x)
    elif f == 2:
        return leaky_relu(x)

def sigmoid_derivative(x):
    # Correct only if x is already sigmoid(z), not the pre-activation z
    return x * (1 - x)

def relu_derivative(x):
    return np.where(x > 0, 1, 0)

def leaky_relu_derivative(x):
    return np.where(x > 0, 1, 0.5)
```

this is what i was using before hand

dusky pagoda
#

Ok, that makes a bit more sense

odd stratus
#

and then it just loops, constantly outputting [ 0, 0] with a loss value of 0.5

dusky pagoda
#

can you check the distribution of all the weights during training?

odd stratus
#

not currently

dusky pagoda
#

maybe a boxplot of them using matplotlib

#

or just calculate and print the min/max/mean/stddev

odd stratus
dusky pagoda
#

hmm, what are all those 1's at the end?

odd stratus
#

oh wait i got it to print the data during runtime and everything immediately gets set to NaN for some reason

odd stratus
dusky pagoda
#

oh, is that standard practice?

odd stratus
#

idk, it's just what i do, works fine for sigmoid

dusky pagoda
#

We can check back with it once the NaNs are gone I guess

odd stratus
#

i set the biases to zero
cause i had them as zero before
and the NaNs are gone

#

oh wait i had it running sigmoid nevermind

dusky pagoda
#

(L106) I think it's because it's dividing by zero here
```py
derivativeA = -(target / activations[-1]) + (1 - target) / (1 - activations[-1])
```

#

since it's really common for activation to be 0

#

I'm not sure how one would fix that though
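
One common workaround is to clamp the activation away from exactly 0 and 1 before dividing. The sketch below is a scalar version under that assumption. `bce_derivative` and the tolerance `EPS` are illustrative choices, not part of the original code; for NumPy arrays, `np.clip` does the same clamping.

```py
EPS = 1e-7  # assumed tolerance, not taken from the original code


def bce_derivative(target, activation, eps=EPS):
    """Derivative of binary cross-entropy w.r.t. the output activation."""
    # Keep the activation strictly inside (0, 1) so neither denominator is zero.
    a = min(max(activation, eps), 1.0 - eps)
    return -(target / a) + (1.0 - target) / (1.0 - a)


print(bce_derivative(1.0, 0.0))  # large but finite instead of a division by zero
print(bce_derivative(1.0, 0.5))
```

Another option, when the last layer is a sigmoid, is to combine the two derivatives analytically: the product simplifies to `activation - target`, which has no division at all.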

odd stratus
#

well that's kinda silly

dusky pagoda
#

Let me refresh my memory on backprop real quick

full furnace
dusky pagoda
#
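```py
import numpy as np

# AA = previous layer's activations, A = current layer's activations
# W = weights, B = biases
# dX_dY = del X / del Y
# Single sample: Z, AA and dC_dA are 1-D vectors
def backprop_layer(W, B, Z, AA, dC_dA, activation_derivative):
    # Forward pass was Z = W @ AA + B and A = activation(Z)
    dA_dZ = activation_derivative(Z)
    dC_dZ = dC_dA * dA_dZ        # factor shared by all three gradients

    dC_dW = np.outer(dC_dZ, AA)  # dZ_dW = AA
    dC_dAA = W.T @ dC_dZ         # dZ_dAA = W
    dC_dB = dC_dZ                # dZ_dB = 1
    return dC_dW, dC_dAA, dC_dB
```
I think this was the gist of it?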
#

@odd stratus how did you come up with the formula in your code?

odd stratus
odd stratus
dusky pagoda
#

oh yeah, that is a typo in the formula

#

my bad

dusky pagoda
dusky pagoda
odd stratus
#

the main problem was trying to make it so that it can scale to have any layer sizes and depth like i wanted

dusky pagoda
verbal oar
#

hmm this is just chain rule

dusky pagoda
verbal oar
#

to go from D to A you go to C,B like in graph

#

where D is end A is start

#

so DC, CB, BA is DA

#

there is reference in grokking machine learning about these multiplying of partials etc

#

Appendix B Math behind gradient descent

verbal oar
#

yes this is just calculating partials and substituting and multiplying

past bramble
#

i don't like kaggle notebooks

#

my 3 hours of gpu "memory error"

#

i had saved checkpoints but after reloading they weren't there

tiny bluff
#

hi

tired lodge
#

how would i train an AI to speak like a friend of mine? he gracefully supplied me with some of his writings (hes a literature nerd) and i thought it would be funny to train an AI that could imitate his works

unkempt apex
tired lodge
unkempt apex
#

then only try a simple text model and train for your context

tired lodge
unkempt apex
# tired lodge what does that mean

like for example, suppose I am training a model which will act as my resume chatbot, so if you ask it about me or my skills, it will give you that info

#

this is considered the "context" that makes it personalised

tired lodge
#

how and where do i find one of those?

unkempt apex
#

ahh, search that

#

or if you get more confused share here, so that others can also help you

#

about that particular model

rich moth
unkempt apex
#

you said, we will do something together?, why you were not online these days?

#

@rich moth ???

tribal meteor
#

Learning AI in University, anyone have a good youtube channel for learning fundamentals?

#

Currently learning efficient tree / graph searches. Using pruning and cost eval functions.

#

Working on stuff like game theory, min-max, alpha-beta pruning, etc. So like basic basics

left tartan
tribal meteor
deep sparrow
#

what knowledge is needed to understand this

gaunt wren
#

Is 85-15 class balance in a binary classification problem bad enough for logreg to predict all 0s?

#

if so, how would i solve this?

lapis sequoia
#

Also look at the ROC to help w/ thresholding

gaunt wren
#

so, I should just pick out a balanced sample and use that for training

#

or at least that'd be the easiest way

lapis sequoia
#

There's no need. You could use an ensemble method if you want

gaunt wren
#

such as RFC?

lapis sequoia
#

xgboost and if you want to explore it more, you can look at using different weights for each class

#

that assumes your goal is the highest prediction accuracy, the model will pretty much be a black box
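
To make "different weights for each class" concrete, here is a minimal sketch that computes weights for the 85-15 split discussed above. It uses the `n_samples / (n_classes * class_count)` heuristic that scikit-learn documents for `class_weight="balanced"`; the helper name `balanced_class_weights` is made up for this example.

```py
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


labels = [0] * 85 + [1] * 15  # the 85-15 split from the question
weights = balanced_class_weights(labels)
print(weights)
```

The minority class ends up with a weight several times larger than the majority class, so a misclassified positive costs the model more during training than a misclassified negative.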

gaunt wren
#

Im just trying to explore a few different algorithms

lapis sequoia
gaunt wren
#

and trying to understand why some dont perform that well, such as log reg. My first assumption was the class imbalance

lapis sequoia
#

you could look at something like decision trees as well

gaunt wren
#

would tf be worth trying as well?

lapis sequoia
#

tensorflow doesn't mean anything

rich moth
# unkempt apex @rich moth ???

We took a big vacation and I just focused on some other projects around the house. Oh, I got Baldur's gate 3 with some work buddies, that took over a month of my life. Ya, I did! Always looking to work on something, I recently started tinkering again with that capture the flag game using pygame and ML to train ai agents using q-learning and some other stuff. Also still messing around with the AI model that can learn and generate images using captions

unkempt apex
rich moth
rich moth
storm valve
#

if anyone is familiar with transformer.pipeline, is there a way to natively map a pipeline over multiple inputs?

#

using a threadpool works quite well, but i'm wondering if there isn't already a built in way


```py
with ThreadPoolExecutor() as executor:
    results = executor.map(model_pipeline, list_of_strings)
```
#
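```py
from concurrent.futures import ThreadPoolExecutor

from transformers import pipeline

model_pipeline = pipeline(
    "text-classification", model="model"
)

# list_of_strings is the list of input texts, defined elsewhere
with ThreadPoolExecutor() as executor:
    results = executor.map(model_pipeline, list_of_strings)
```
better than this i mean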
#

oh, i can just pass the list to the pipeline it looks like

scenic parcel
#

anyone use darts for time series forecasting

verbal venture
#

is Q * K * V the final answer to Q?

#

like it's the best possible answer to Q?

tawdry monolith
#

Is it normal to forgot parameter and functions?

quaint rivet
#

has anyone worked with labelbox? I'm trying to export my annotated images. I don't want to export in JSON format, I want mask images

scarlet anchor
#

Hey, how can I use a set of multiple CSV files as my training dataset for feeding into my LSTM network?
Or in other words, I want to use multiple CSV Files as training data for LSTM. How can i do it?

I do not want to concatenate all the CSV Files

rich moth
rich moth
scarlet anchor
quaint rivet
#

which tools should I use to create a multi-class segmentation dataset?

rich moth
scarlet anchor
#

prolly create a custom data loader?
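
A custom loader along those lines could be a plain generator that reads one file at a time and builds sliding windows that never cross a file boundary. This is a sketch under assumptions: every CSV has a header row and purely numeric columns, and `iter_windows` is a hypothetical name.

```py
import csv


def iter_windows(csv_paths, window=3):
    """Yield (inputs, target) pairs; windows never span two files."""
    for path in csv_paths:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # assumption: each file starts with a header row
            rows = [[float(v) for v in row] for row in reader]
        # Slide over this file only: `window` consecutive rows predict the next row.
        for i in range(len(rows) - window):
            yield rows[i:i + window], rows[i + window]
```

Because each file is processed independently, the sequences are never concatenated, and only one file is held in memory at a time. The generator can then be wrapped by whatever batching mechanism the training framework provides.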

rich moth
rich moth
quaint rivet
#

i have tried labelbox,apeer etc

#

but none of them is giving me the desired result

scarlet anchor
rich moth
#

Try labelImg

quaint rivet
rich moth
quaint rivet
# rich moth hmm... vgg image annotator?

yeah, I have tried it. I think I'd have to go through a long process in that case: VGG annotator will give me the images in COCO JSON format, and after that I have to convert them to masks.

odd stratus
ionic valley
#

is Leetcode still relevant for DS/ML/AI or is that mostly asked for SDE roles? I’d like to know if I’m wasting my time grinding LC

rich moth
#

pycocotools? not sure : \

ionic valley
quaint rivet
quaint rivet
scarlet anchor
rich moth
#

Has anyone seen @Lisan Al Gaib

tawdry gyro
#

Does anyone know what that is? It imports my libraries, but I am scared it will crash when I have the astronomy Olympiad with computers in a week.

verbal oar
#

read message

split olive
#

We'll call C the quadratic cost function; it's also sometimes known as the mean squared error or just MSE.

I'm confused. Both of them are MSE but different?

#

nvm i got it

quaint rivet
unkempt apex
scarlet anchor
#

anyone knows a good library or downloadable model that I can use in python for converting speech to text?

serene scaffold
scarlet anchor
#

thanks

agile cobalt
#

(also not sure if I'd consider openai significantly worse than amazon, meta, google etc. - you should leverage open source as much as you can regardless of it source imo)

scarlet anchor
#

okk

scarlet anchor
serene scaffold
agile cobalt
# scarlet anchor Can whisper work offline?

You could use it via an API, in which case you don't need a GPU and don't have to download model weights or run anything resource-intensive yourself, or you can download and run it locally.

If you download and run it yourself, you do not rely on any online services (after downloading everything) at all

scarlet anchor
#

Thanks @agile cobalt @serene scaffold

tiny bluff
#

hi, do you have a roadmap for machine learning

#

?

tiny bluff
#

you don't need anything to understand this

tepid tartan
#

I'm actually a beginner

tiny bluff
#

you can watch this video without anything

#

this video teaches you every common detail

tepid tartan
#

Recommended me something

tiny bluff
#

i learned python basics 2-3 years ago and i would like to revise the python topics and learn machine learning like a professional

#

and i searched and found a roadmap for ml

#

and i follow the steps in the ml roadmap i found

#

i found the roadmap on this channel

tepid tartan
#

@tiny bluff I'm trying the basic understanding with stats and SQL first before touching python

tiny bluff
#

it is okey

#

it is your choice

sour zodiac
#

is there any1 who is familiar with qlearning that could help me in how to pick my alpha, gamma, epsilon and epsilon decay? Im not sure how to determine what values they should be
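
As a reference for where each of those hyperparameters enters, here is a minimal tabular Q-learning sketch. The function names and the default numbers (`decay=0.995`, `minimum=0.05`) are placeholders for illustration, not recommended values; good settings depend on the environment and are usually found by experiment.

```py
import random


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One Q-learning step: alpha is the learning rate, gamma the discount factor."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])


def choose_action(q, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(list(q[state]))
    return max(q[state], key=q[state].get)


def decay_epsilon(epsilon, decay=0.995, minimum=0.05):
    """Shrink epsilon after each episode, but keep a floor so exploration never stops."""
    return max(minimum, epsilon * decay)
```

Roughly: alpha controls how strongly each new observation overwrites the old estimate, gamma controls how much future reward counts relative to immediate reward, and epsilon with its decay controls how quickly the agent shifts from exploring to exploiting.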

granite nymph
#

Hi guys, what topics are typically required for ML interns to be confident with it

agile cobalt
#

look up positions you would apply to and see what they're asking.

you'll probably want at least some statistics, linear algebra and basic numpy syntax/usage though

tepid tartan
tiny bluff
#

i dont know actually i deal with only machine learning however you can search

real whale
#

Hello

This is a very basic question but I am still in the earlier stages of wrapping my head around the relevant details.

I'm a soon to be second year AI and Datasci student engaged in the RSNA 2024 Lumbar Spine Degenerative Classification purely for the learning curves.

https://www.kaggle.com/competitions/rsna-2024-lumbar-spine-degenerative-classification

A peer of mine, perhaps correctly, says that we have to split the images into training, test and validation classifications. He wants to do this using code that randomly selects images and puts them into any one of the 3 categories.

However the competition already presents testing and training datasets with, I'm sure I remember correctly but couldn't find the documentation that details it, a final unseen set of images that it performs the classification on so as to determine the effectiveness of the model.
Also nowhere in the EfficientNet sample can I see anything that does that classification.

https://www.kaggle.com/code/charlesexiaviour/rsna-efficientnet-starter-notebook

I think I am right here in that in terms of testing and validation the images are already classified and it's only through a dictionary that some of the images need the conditions and plains added to them.

Thanks for any and all help, any clarification will help a great deal.

deep sparrow
#

is anyone up to challenge to code some sort of algorithm that analyses students requirements (14 students for now) and creates schedule (monday - friday, time 13:00 - 21:00 with 15 minutes break.) i can send you chart with the information from the students (with false names, only time will be correct nothing else)

odd meteor
agile anvil
agile anvil
#

What if "but what about the poor AIs" is merely a sophisticated metaphor for "but what about the middle class"? https://www.marktechpost.com/2024/08/21/megaagent-a-practical-ai-framework-designed-for-autonomous-cooperation-in-large-scale-llm-agent-systems

Large Language Models (LLMs) have advanced rapidly, becoming powerful tools for complex planning and cognitive tasks. This progress has spurred the development of LLM-powered multi-agent systems (LLM-MA systems), which aim to simulate and solve real-world problems through coordinated agent cooperation. These systems can be applied to various sce...

untold fable
#

What's the difference between scikit-learn and other machine learning libraries?

agile cobalt
#

scikit-learn helps you to train, evaluate and run inference using a bunch of 'traditional' ML models such as linear regression, decision trees, and random forests

pytorch / tensorflow / keras are focused specifically on Neural Networks, though they support a lot of different architectures for them

#

there are a few dozen other somewhat popular libraries you'll see, and hundreds of niche libraries

e.g. numpy can be used for nearly any operation involving multi-dimensional arrays (vectors / matrices / so on), jax is similar to numpy but includes automatic differentiation, transformers & diffusers are focused specifically on running inference for popular models, and there are a lot of libraries that are just wrappers on top of others

#

they also have varying levels of support for running things on the CPU vs GPU, but I'm not gonna go into detail about that

late lichen
#

i want to improve the old code i made its a simplified NEAT (on my bio) and i have no idea how to do it someone please assist me

tepid tartan
#

Find a roadmap with actual videos and lessons, including projects. @tiny bluff @spare forum

jaunty helm
tepid tartan
spare forum
#

Just don't be afraid to start tbh, there is no absolute roadmap, resources, etc. Every time I've spent time searching for roadmaps and stuff, nothing ended up done; every time I applied freestyle learning I did projects etc. and learned the most

muted plume
#

anyone have good sources to learn order precedence?

#

we got given this but i have no clue what this is trying to say

#

i assume down the list = order

#

but is there a reason bitwise not is higher up then the others?

dreamy isle
#

each column tells you what that operation applies on

muted plume
#

so bitwise and, happens before things like logic operaters?
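
Assuming the table being discussed is Python's operator precedence table (the table itself is not shown here), a few expressions show how the ordering plays out. Unary `~` binds tighter than `&`, which binds tighter than comparisons, which bind tighter than `not`, `and`, and `or`.

```py
a = ~5 & 3                    # parsed as (~5) & 3
b = 1 < 3 & 6                 # parsed as 1 < (3 & 6)
c = not 1 & 0                 # parsed as not (1 & 0)
d = True or False and False   # parsed as True or (False and False)

print(a, b, c, d)

# Forcing the other grouping with parentheses gives different results:
print(~(5 & 3), (1 < 3) & 6, (not 1) & 0, (True or False) and False)
```

Unary operators such as `~` apply to a single operand, so they are resolved before any binary operator combines that operand with something else.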

spring field
#

the best way to learn is by doing projects

spare forum
#

Tbh everytime ppl search for roadmaps for weeks and end up doing very little

verbal oar
#

what does variational mean? I relate it to probability and some prior, is that a good way of thinking about it?

versed bough
mystic ruin
mild dirge
#

Did you install G++ or some other compiler? @mystic ruin

mystic ruin
mild dirge
#

do that 😛

verbal oar
#

or just look in some glossary?

strange oriole
#

hi

proven inlet
#

How can i make gpt2 model to generate questions from answers? I have list of text messages and random conversations, I'm trying to convert them to Q-A type

#

Type your answer: The capital of france is paris.
Generated Question: Given the following statement, generate a relevant question: 'The capital of france is paris.'.

"If you use the word paris, you may get a similar answer. The word is a synonym for 'posterior, adverbial, pungent, repugnant, distressing,
objectionable'. But if you use the word adverbial, adverbial, pungent, repugnant, distressing, objectionable, you will get the same

#
prompt = f"Given the following statement, generate a relevant question: '{input_text}'."
#

what am i doing wrong??

#

Expected output: What is the capital of france?

serene scaffold
#

@proven inlet gpt2 isn't instruction-following like ChatGPT is

proven inlet
serene scaffold
#

It just keeps generating text that's probable to follow whatever you pass to it

serene scaffold
proven inlet
#

oh

#

How can i tune a gpt model to chatbot with texts but not Q-A types?

#

chatgpt used text mostly to train afaik

scarlet anchor
#

For a time series prediction which model would be more ideal? other than LSTM

serene scaffold
serene scaffold
serene scaffold
proven inlet
serene scaffold
#

It's still a "large" language model. But the L in LLM is meaningless now.

proven inlet
#

can i finetune gpt2 to become a basic chatbot?

serene scaffold
#

You don't have enough training data or time for that

proven inlet
#

is 5k list of messages enough for that

serene scaffold
#

Not even close.

proven inlet
#

Oh.

serene scaffold
#

The amount of training data and compute time required to create and tune these models is astronomical

#

That's why only large companies like meta are putting out LLMs. Everyone else is innovating by finding creative ways to prompt them.

proven inlet
serene scaffold
#

How many words?

proven inlet
#

over 5B

serene scaffold
#

And what would you be training it to do?

proven inlet
#

Chatbot

serene scaffold
#

So you'd be fine tuning it to produce text that follows a certain structure. Namely dialogue structure

#

Which is what ChatGPT is

#

You might be able to do it with that many words.

proven inlet
#

But they don't have to be Q-A format or do they?

#

Like can i use training data for wikipedia and books

#

But not dialogues

serene scaffold
#

If you train it on Wikipedia, it will generate content that's structured like a Wikipedia article

#

And it probably won't behave naturally if you ask it a question in a conversational way

proven inlet
#

will it continue my sentence?

#

if not, what makes it to not continue the sentence

serene scaffold
#

If you prompt gpt2 with "the capital of France is", it will probably finish the sentence correctly.

proven inlet
#

Yes but chatbots dont do that

#

im wondering how

serene scaffold
#

You have to tune it on text that is structured as the kinds of interactions that you want to have with it

#

But you probably don't have enough data or compute time for that

#

So you should probably use an existing language model that is interactive, like mixtral

proven inlet
#

Okay thanks, I'll use mixtral

river cape
#

Guys I want to know whether aws provides any free services which can be used in ml?

spare forum
#

Free trial with limited access, not like free forever

#

(AWS sagemaker)

#

gcp provide free credits for new accounts which is 300€ equivalent

river cape
spare forum
#

I believe 1 year

river cape
spare forum
#

Yes

river cape
spare forum
#

Nope mainly aws, gcp and databricks (just for learning)

river cape
spare forum
#

You still have to put in a credit card and stuff, so probably not so easy, and the usage is very limited; it's pretty much only okay for side projects and learning

past bramble
#

may I use tensorflow on windows on python 12

spring field
#

you have my permission
also Python 12?

anyway, apparently on Windows the latest TF versions only work through WSL because sth sth they dropped Win support? not entirely sure, but sth along those lines
basically yes, but only through WSL

past bramble
spring field
#

I personally am only on Python 3

past bramble
#

oh i skipped to the future

#

let's go back, python 3.11 and python 3.12

river cape
#

Btw guys

#

Do i need to install tensorrt

#

I already have cuda and cudnn installed

verbal venture
#

@wooden sail @iron basalt just want to confirm my understanding here is correct. The attention model does Q * K to update "words that represent each other". The model has no actual understanding of this. What it's doing is changing the weights so the Q * K (attention between each words) becomes better over time. This is simply a matter of running dot product on all words in the corpus numerous times to find a relationship between them. This relationship can somehow be captured by dot product attention, because that represents cosine similarity, but ultimately the reason the model can converge to this representation is because backprop will adjust the weights of the model to better create Q and K vectors. When the model makes a mistake, it will adjust the weights, do Q * K again, and the newest iteration of Q * K will be a slightly better "relationship" capture between words
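
For reference, a minimal pure-Python sketch of scaled dot-product attention, written as softmax(Q K^T / sqrt(d_k)) V. In a real model Q, K and V come from learned projection matrices that backprop adjusts; here they are passed in directly as lists of vectors, and the function names are chosen for illustration.

```py
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention for lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)  # non-negative, sums to 1
        # Output row = weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out


print(attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[2.0], [4.0]]))
```

Two details worth noting: the result is a weighted average of the rows of V rather than a plain product Q * K * V, and the raw dot product equals cosine similarity only when the vectors have unit length, since otherwise vector magnitudes also affect the score.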

clever sparrow
#

use 3.11 if you dont want to deal with wsl

rich moth
#

This is from my first epoch on the multi-modal learning system I've been working on, where I’m combining a VQ-VAE model for image reconstruction with feature aggregation using CLIP for text-image alignment, and BLIP for generating descriptive captions. So far the results seems promising

rich moth
#

windows subsystem linux

#

thats wsl2 running ubuntu

past bramble
#

linux inside windows?

rich moth
#

yup, its the bees knees

#

You on windows 11 ?

past bramble
#

yup

rich moth
#

Open the microsoft store and search for WSL

tepid tartan
rich moth
# past bramble yup

it's pretty easy to install these days. let me know if you have any questions

past bramble
rich moth
unkempt apex
rich moth
unkempt apex
rich moth
#

After that you can search for the distro and version you want

rich moth
past bramble
#

there's a lot of "Ubuntu" results, which one do I use

rich moth
past bramble
#

wait it says "Ubuntu 22.04.3 LTS" is already installed

rich moth
#

open a terminal and type wsl

past bramble
#

guess it's already installed
```bash
wsl
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
```

#

wonder how

unkempt apex
wooden sail
#

depending on how you installed wsl, it comes with ubuntu by default

rich moth
#

cool beans! ya Im not sure how it got installed, but lemme know if you got any questions. It works great. Its nice having the option to do both in one place.

#

Oh that reminds me I was going to setup a plex server on my laptop.

past bramble
#

I'm sure I was running into errors when I used tensorflow on python 3.12, which is why I installed 3.11

#

weird I tried running tensorflow on 3.12 venv now it isn't raising any errors now

past bramble
#

i wanna show my new GAN I created (based on scenary images)

rich moth
#

looks great.

unkempt apex
past bramble
deep sparrow
untold cliff
#

I was trying to make a c++ implementation of the BM25 information retrieval algorithm and make a wrapper to it using cython, and was comparing my results against those from this library https://github.com/dorianbrown/rank_bm25
Interestingly, for one of the variants, BM25L, the results I got were different, and after quite a bit of debugging it turned out that if I copy the source code of the library and then run the tests, I get the same results. I get different results only when I use it as a pip package, and I was very curious about the reason for such behavior.

GitHub

A Collection of BM25 Algorithms in Python. Contribute to dorianbrown/rank_bm25 development by creating an account on GitHub.

#

It turns out, after comparing the code of the pip-installed package against the source code on GitHub, that there was a small difference in the formula used. I don't know how pip packages are made, so it is still a mystery to me how such an error happened, but this seems to be the reason, unless someone here can shed more light on it.

faint quail
#

I totally know what that means

wanton quiver
#

hey @hot obsidian can tell me about what thing i have to learn in data science or have you source where i can learn it

jaunty helm
rigid timber
#

are there any free inference options?

jaunty helm
# rigid timber are there any free inference options?

as in LLMs? run local; some on openrouter are also free if you want to try that

there are very 'small' models like gemma-2b, phi, minitron-4b, etc. that don't need that good of a GPU (the 3 mentioned above can all be comfortably run on a 4 GB VRAM card with quantization)
CPU inference is also an option if you're desperate; then you're not limited by the GPU, but by CPU clock speed, RAM size & RAM speed
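
A back-of-the-envelope estimate shows why quantization matters here. This sketch counts only the memory for the weights, using a nominal 4 billion parameters as an assumed round number; it ignores the KV cache, activations, and runtime overhead, so real usage is higher.

```py
def weight_memory_gb(n_params, bits_per_weight):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: a nominal 4-billion-parameter model.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(4e9, bits)} GB")
```

At 16 bits per weight such a model would not fit in 4 GB of VRAM, while at 4 bits the weights take roughly half of it, leaving room for the rest.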

past bramble
#

are there any libraries to get text embeddings?

jaunty helm
past bramble
past bramble
#

thanks!

pine escarp
jaunty helm
past bramble
#

damn 800 floats for a single text

#

quite a big vector

#

I was planning to try making a small text model using embeddings and conversations data

jaunty helm
past bramble
#

ohh cool

rigid timber
jaunty helm
#

why lend you compute for free when they can ask for a subscription / pay per token

rigid timber
jaunty helm
rigid timber
lilac lichen
#

is there any recommendations to get a team to work with on any pet project and way to run projects not on PC?

scarlet anchor
#

Where is federated learning actually used?

agile cobalt
past meteor
#

I also saw a use case of a streaming service using it for their recommender system

scarlet anchor
#

@agile cobalt I wanted something an application where hardware is used

scarlet anchor
#

like this

agile cobalt
#

the amount of processing power microcontrollers have is really low compared to GPUs... you'd need thousands of them to match one GPU used in data centers, and the latency & amount of data you'd have to transfer between them makes it pretty impractical

#

even running inference on micro controllers is already hard

#

you might be able to continuously fine-tune a small model in a micro controller, but I wouldn't expect to see anyone using them for federated training

serene grail
# scarlet anchor like this

To be fair, with those specs like 4GB RAM that doesn't really look like a microcontroller, that's a SBC, like a Raspberry Pi, for example

rich moth
#

This is my best run yet just on the first epoch. The colors and shapes actually look decent and there's a steady loss from all the components. This is my best version so far.

worthy oasis
#

someone please help me with some tutorial or a good book to get started with data science

rich moth
graceful niche
faint quail
#

spent months building a object detector neural network library from scratch to finally achieve this holy

sage sparrow
#

Hi, what are the main issues people usually face with data scientists? From the client's side of things

#

I thought I'd do some research since I don't have enough data/experience about it myself

agile cobalt
#

"client's side of things"?

sage sparrow
#

The ones hiring/in need of the data scientists' services

agile cobalt
#

wherever you look you'll find pretty biased views in multiple ways, but maybe try looking at some freelancing offers & some Kaggle competitions

sage sparrow
storm valve
#

any idea on where i can get a corpus of python-related words? for now i've resolved to extracting things from the source code directly like imports, function names, assignments but i would like more general stuff

past bramble
#

for a starter I'm thinking of using single numbers to represent each word instead of vectors (text embeddings)

are there any existing algorithms to convert words to a number? I want to make my own encoder/decoder to go back and forth easily

#

first thing that hit me was using indices and ascii of each character, math operations on it to come up with unique numbers for each word

#

then it hit me there might be cases where it's not unique as well
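To make the uniqueness worry concrete, here's a rough sketch. The names `ascii_sum` and `Vocabulary` are made up for illustration and don't come from any library. Summing character codes gives anagrams the same number, while handing out the next free index per new word can't collide and is trivially reversible:

```python
def ascii_sum(word: str) -> int:
    """Naive word code: add up the character codes."""
    return sum(ord(ch) for ch in word)


class Vocabulary:
    """Gives each new word the next free integer, so codes never collide."""

    def __init__(self):
        self.word_to_id = {}
        self.id_to_word = []

    def encode(self, word: str) -> int:
        # First time we see a word, it gets the next index
        if word not in self.word_to_id:
            self.word_to_id[word] = len(self.id_to_word)
            self.id_to_word.append(word)
        return self.word_to_id[word]

    def decode(self, idx: int) -> str:
        return self.id_to_word[idx]


# Anagrams share the same character codes, so their sums are equal
collision = ascii_sum("listen") == ascii_sum("silent")

vocab = Vocabulary()
ids = [vocab.encode(w) for w in "the cat saw the dog".split()]
words = [vocab.decode(i) for i in ids]
print(collision, ids, words)
```

The indices carry no meaning by themselves (word 3 isn't "bigger" than word 1), which is why they usually get turned into one-hot vectors or embeddings before going into a model.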

storm valve
past meteor
#

You could use AST to parse the stdlib and grab whatever you want?
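A minimal sketch of the AST idea, assuming you only care about identifiers (function, class, variable, attribute, argument, and import names). `python_terms` and the sample source are hypothetical:

```python
import ast


def python_terms(source: str) -> set:
    """Collect identifiers that appear in a piece of Python source."""
    terms = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            terms.add(node.name)
        elif isinstance(node, ast.Name):
            terms.add(node.id)
        elif isinstance(node, ast.Attribute):
            terms.add(node.attr)
        elif isinstance(node, ast.arg):
            terms.add(node.arg)
        elif isinstance(node, ast.alias):
            # "import os.path" contributes both "os" and "path"
            terms.update(node.name.split("."))
    return terms


sample = '''
import os.path

def read_lines(filename):
    with open(filename) as handle:
        return handle.readlines()
'''

terms = python_terms(sample)
print(sorted(terms))
```

To build a bigger corpus you could run the same function over every `.py` file in the standard library directory and union the results.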

#

But I think your question is: does such a corpus already exist

storm valve
storm valve
#

my google fu fails there

past meteor
#

My answer is, not that I know of. Maybe someone else can pitch in 😄

storm valve
#

i've gone so far as processing the source code of programs i'm reading and building small corpuses out of them but still not quite enough sadly

past meteor
#

What are you trying to do?

storm valve
#

removing gibberish from LLM output

#

correct output contains a lot of python terms, so i also use the python corpus to filter out what's not gibberish

rich moth
#

So this is from 10 epochs. Everything seems to be improving gradually. It's learning, but it's slow going. I might need to play with the learning rates a bit more but I think it's gonna take a long time to train

past bramble
verbal oar
#

is vae from scratch hard to do?
I saw for example building in keras but its rather simple and its was not from scratch

#

now I'm reading an introduction to variational autoencoders from Kingma, Welling

odd stratus
#

anyone have a large plain text file for LLM ?

past bramble
odd stratus
past bramble
odd stratus
#

i just copy pasted the lord of the rings lmaoo

verbal oar
#

Project Gutenberg maybe, Alice in Wonderland etc., not sure

quaint mulch
# odd stratus anyone have a large plain text file for LLM ?

These datasets are used in machine learning (ML) research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning algorithms (such as deep learning), computer hardware, and, less-intuitively, the availability of high-...

#

This is publicly available https://pile.eleuther.ai/

#

is 825 GB large enough?

odd stratus
quaint mulch
mystic ruin
#

I am trying to setup pytorch for my A770 GPU, I followed the docs, got this error when importing pytorch:

```
PS C:\kanemoto\vscode\llm> python .\main.py
Traceback (most recent call last):
  File "C:\kanemoto\vscode\llm\main.py", line 1, in <module>
    import torch
  File "C:\Users\kanemoto\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\torch\__init__.py", line 139, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\kanemoto\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\torch\lib\backend_with_compiler.dll" or one of its dependencies.```
The `backend_with_compiler.dll` exists in its path.
I have the latest Microsoft Visual C++ Redistributable installed.

Any idea?
odd stratus
past bramble
#

@odd stratus u building an LLM?

odd stratus
mystic ruin
#

(from the docs)

quaint mulch
#

did you pass the sanity check?

indigo wing
#

hey guys should I buy collab pro and cloud storage for training? Is it worth it?

scarlet anchor
#

why do u need colab pro, in the first place?

past bramble
upbeat prism
#

Hi, so I want to classify if a number between 1 and 100 is even or odd. Now I want to achieve that with the most simple MLP.

class SimpleClassifier(nn.Module):
    def __init__(self):
        super(SimpleClassifier, self).__init__()
        # One input node, two hidden nodes, one output node
        self.hidden = nn.Linear(1, 2)  # From input to two hidden nodes
        self.output = nn.Linear(2, 1)  # From two hidden nodes to output

    def forward(self, x):
        # Forward pass: input -> hidden layer (ReLU activation) -> output (Sigmoid activation)
        x = torch.relu(self.hidden(x))  # Apply ReLU to the hidden layer
        x = torch.sigmoid(self.output(x))  # Sigmoid to get the output between 0 and 1
        return x

I don't get much better than 50% accuracy i.e. guessing. :D

Here's my training loop:

def train_model(model, criterion, optimizer, dataloader, epochs=100):
    for epoch in range(epochs):
        epoch_loss = 0.0
        for inputs, labels in dataloader:
            # Zero the parameter gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = model(inputs)

            # Compute loss
            loss = criterion(outputs, labels)

            # Add L1 regularization: penalize the sum of absolute parameter values
            l1_weight = 0.001
            l1_loss = sum(torch.sum(torch.abs(param)) for param in model.parameters())
            loss = loss + l1_weight * l1_loss

            # Backward pass and optimize
            loss.backward()
            optimizer.step()

            # Accumulate loss
            epoch_loss += loss.item()

What could I improve? I really wanna keep the MLP this simple

#

hmm maybe it's just not possible mathematically? I basically have two linear functions, I wouldn't know how I could do it by hand
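A rough way to see why it can't work with this architecture on a raw scalar input: a 1 -> 2 -> 1 ReLU network is piecewise linear with at most two kinks, so the predicted class can switch at most three times along the number line, while even/odd switches 99 times between 1 and 100. The sketch below (plain Python, random weights, helper names made up) just checks that count empirically:

```python
import random


def relu(z):
    return max(0.0, z)


def tiny_mlp_logit(x, params):
    """Pre-sigmoid output of a 1 -> 2 -> 1 ReLU network for a scalar input."""
    w1a, b1a, w1b, b1b, w2a, w2b, b2 = params
    return w2a * relu(w1a * x + b1a) + w2b * relu(w1b * x + b1b) + b2


def count_class_flips(params, xs):
    """How often the predicted class changes while walking through xs in order."""
    # sigmoid(logit) > 0.5 is the same as logit > 0
    preds = [tiny_mlp_logit(x, params) > 0 for x in xs]
    return sum(a != b for a, b in zip(preds, preds[1:]))


random.seed(0)
xs = list(range(1, 101))

# Try many random weight settings and record the most flips any of them manages
max_flips = max(
    count_class_flips([random.gauss(0, 1) for _ in range(7)], xs)
    for _ in range(2000)
)

# Number of class changes the even/odd target itself needs
parity_flips = sum((a % 2) != (b % 2) for a, b in zip(xs, xs[1:]))
print(max_flips, parity_flips)
```

So no amount of training helps here. Changing the input representation (for example feeding the binary digits, where the last bit already is the answer) is a different problem that a tiny network can solve.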

past meteor
upbeat prism
#

maybe I just write my own using modulo and basically hardcode it ^^

past meteor
#

Aha, I didn't see that. Then you likely need to increase the number of parameters

upbeat prism
#

I just need some model that has very distinct grads

past meteor
#

But not so much so as to memorize which number is even and odd

upbeat prism
#

yeah of course but I wanted to make a minimal example for something. I wanted to find a classification that for a given input has very distinct gradients

verbal oar
#

yes I think the "or one of its dependencies" part is the issue, re: the pytorch error

past meteor
#

Lots of chance it will not work with larger numbers

upbeat prism
#

I couldn't even think of how to do it manually but anyway found something else that might work

unkempt apex
rich moth
rich moth
past bramble
unkempt apex
unkempt apex
rich moth
# rich moth besides embeddings?

nothing I can think of as effective, not really. But what if you stacked embeddings of tokens from a sentence or sequence to form a larger image-like structure

verbal oar
#

ah so you mixed things

rich moth
verbal oar
#

I'm asking because I see the calculus of variations (variational) inspiration, and wonder if it's as difficult in code as in the math formulation

#

there is much of derivation

#

for example I saw in wikipedia derivation of q or p dont remember

rich moth
deep abyss
#

I am having some troubles with tensorflow. I am loading tf_flowers dataset using tensorflow_datasets. The moment I run the jupyter cell and load it, 1.9 GB of 4 GB of my dedicated VRAM gets used which was all free before, the total size of the dataset is just around 233 MB, Also, when I try to train some models with single dense layer only and 128 neurons, I get ResourceExhaustedError saying Out Of Memory while only 2.1 GB of my dedicated VRAM is used and 1.9 GB is still left. How do I deal with this without restarting the kernel each time?

arctic silo
#

Hi talents, I installed the Anaconda 2024 version and I'm using Jupyter Notebook. It's too slow. Does anyone else have this problem?

#

I used the old version and it's not as slow as this

left tartan
arctic silo
#

some data analysis

#

pandas ,numpy and this kind of module

left tartan
arctic silo
#

why ? what do you mean ?

pine escarp
low void
#

I started on kaggle Few days ago what do I need to know before starting the titanic competition
I just finished the introduction to programming course by Alexis Cook

serene scaffold
low void
low void
serene scaffold
#

I don't know that course. you might do the kaggle pandas tutorial.

low void
low void
untold fable
#

Where to learn ai

#

In yt

odd stratus
rich moth
# unkempt apex yeah

You got me thinking of a different type of technique. Instead of passing standard embeddings, I'm stacking them to create an image-like representation. I made a CNN that reshapes the embeddings into a 2D grid and applies convolutions to extract patterns and integrates it with the image data. I integrated it into the project I've been working on and it's training now

small wedge
# rich moth You got me thinking of a different type of technique. Instead of passing stand...

Thats cool but seems a bit counter intuitive to me, since the intuition behind convolution is that it gives you information about the neighbors of an "anchor datum". In other words, it would give you information relating to the position of the embeddings on the grid and the neighbors surrounding your anchor, which doesnt really make sense for embeddings in the same way it would for pixels. But I'll be interested to see if the results you get are good nonetheless.

Is there some specific reason you built it like this like it's used in a paper or are you just throwing stuff at the wall for research?

jaunty helm
#

the attention mechanism should (hopefully) be taking care of the relationships between the words already

rich moth
small wedge
#

Gotcha

desert oar
odd stratus
#

so im new to a.i. what sort of layers and systems should i be implementing and using?

unkempt apex
past bramble
past bramble
#

I hope you know what tokens are in LLMs

#

each token gets converted into vectors of n dimensions

#

basically an array of n dimensions containing floats

#

two tokens with same meaning will have similar vectors, such as boy and male

#

when you perform math operations you will quite often get the same result
example:
distance = King - man

now we can use it this way:

woman + distance
which is equal to Queen
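A toy version of that arithmetic, with hand-made 2-D vectors (one axis loosely "gender", the other "royalty"). Real embeddings are learned and have hundreds of dimensions, so treat the numbers as pure illustration:

```python
import math

# Made-up 2-D "embeddings": axis 0 ~ gender, axis 1 ~ royalty
vectors = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "king": (1.0, 1.0),
    "queen": (-1.0, 1.0),
    "boy": (0.9, -0.2),
}


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def nearest(target, exclude=()):
    """Word whose vector is most similar to target."""
    candidates = (w for w in vectors if w not in exclude)
    return max(candidates, key=lambda w: cosine(vectors[w], target))


distance = sub(vectors["king"], vectors["man"])
result = nearest(add(vectors["woman"], distance), exclude={"woman"})
similar = cosine(vectors["boy"], vectors["man"])
print(result, round(similar, 3))
```

With real models the result is only approximately the target word, which is why you look for the nearest vector rather than an exact match.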

unkempt apex
#

This is original text
priknik horn red electric air horn compressor interior dual tone trumpet loud compatible with sx

and this is tokenized from BERT normal tokenizer

```
 '##k',
 '##nik',
 'horn',
 'red',
 'electric',
 'air',
 'horn',
 'compressor',
 'interior',
 'dual',
 'tone',
 'trumpet',
 'loud',
 'compatible',
 'with',
 's',
 '##x']```
#

is it good?, but why '##' is being added to letters
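On the '##' part: in BERT's WordPiece tokenizer the prefix marks a piece that continues the previous piece of the same word, so 's' followed by '##x' is the single word "sx" split in two because "sx" itself isn't in the vocabulary. A simplified sketch of the greedy longest-match idea, with a tiny made-up vocabulary (the real one has tens of thousands of entries):

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into WordPiece-style tokens."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation of the same word
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # nothing fits: the whole word becomes unknown
        tokens.append(piece)
        start = end
    return tokens


toy_vocab = {"horn", "loud", "s", "##x", "##s"}  # hypothetical vocabulary

split_sx = wordpiece("sx", toy_vocab)
split_horns = wordpiece("horns", toy_vocab)
split_unknown = wordpiece("zzz", toy_vocab)
print(split_sx, split_horns, split_unknown)
```

So the output in the message above is normal: brand names and rare strings get broken into known sub-pieces instead of being dropped.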

odd stratus
past bramble
odd stratus
#

does the a.i. learn the vectors itself through training, or are the vectors premade upon loading into the perceptron?

past bramble
#

I was thinking of using it and then I saw the size of one vector for one of the models was "800", it's huge to me

past bramble
odd stratus
past bramble
#

@odd stratus when you said letter by letter, are you passing in ascii values? How are you going to pass them?

odd stratus
past bramble
past bramble
#

I have a really bad idea

#

I make a list of words, every time I come across a new word I append it

#

and the indices will be the values I pass in to train the model and get the output

deep abyss
unkempt apex
#

but then slowly slowly as you move forward ( run more code ) , it gives you this error right?

unkempt apex
#

I mean, how u are loading dataset and all

#

are u using Dataloader class?

deep abyss
#

It is. giving a Dataset object.

unkempt apex
#

and then?

#

just go line by line , what are u doing

deep abyss
#

Doing some normalisation on image, and training a sequential model with a flatten layer 64 neuron dense layer and softmax output (used Adam optimizer).

unkempt apex
#

only these?

deep abyss
#

Here is the code to load dataset:

BATCH_SIZE = 16 # Later changed to 8 but could not solve the problem
IMG_WIDTH = 128
IMG_HEIGHT = 128

builder = tfds.builder("tf_flowers")
builder.download_and_prepare(download_dir=r"D:\tensorflow_datasets")
train_ds, test_ds = builder.as_dataset(
    split=["train[:80%]", "train[80%:]"],
    shuffle_files=True,
    batch_size=BATCH_SIZE
)
class_names = builder.info.features["label"].names
print(class_names)
def preprocess_images(image_batch):
    # Scale to float32 in [0, 1] first: tf.image.resize returns float32, and
    # convert_image_dtype does not rescale an input that is already float
    image = tf.image.convert_image_dtype(image_batch["image"], tf.float32)
    # Resizing the images
    image = tf.image.resize(image, (IMG_HEIGHT, IMG_WIDTH))
    # Format expected by `fit` method
    return (image, image_batch["label"])


prepared_train_ds = train_ds.map(preprocess_images, num_parallel_calls=tf.data.AUTOTUNE)
prepared_test_ds = test_ds.map(preprocess_images, num_parallel_calls=tf.data.AUTOTUNE)

Model code:

model2 = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(len(class_names), activation="softmax")
])

model2.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=["accuracy"]
)
#

I later changed the dense layer neurons from 64 to 16 to resolve the error, but I couldn't.

unkempt apex
#

share the full traceback also!

#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

deep abyss
# unkempt apex share the full traceback also!

I am currently not able to reproduce the error, but from a previous training, here is the error:

ResourceExhaustedError: {{function_node __wrapped__Mul_device_/job:localhost/replica:0/task:0/device:GPU:0}} failed to allocate memory [Op:Mul]

I copied it from my GPT prompt where I first asked about this problem. I am unable to provide the full traceback.

unkempt apex
deep abyss
unkempt apex
#

have u tried all this?

verbal oar
#

ResourceExhaustedError docs?

unkempt apex
verbal oar
#

so out of memory, as I supposed

unkempt apex
deep abyss
#

So, reducing batch_size didn't work for me.

unkempt apex
verbal oar
#

reduce dimension size of model weights

#

hmm but batch size is not too big 16

unkempt apex
jaunty helm
#

not familiar with tf, but maybe

```py
prepared_train_ds = train_ds.map(preprocess_images, num_parallel_calls=tf.data.AUTOTUNE)
```this part's doing copies and so your gpu can't hold all of the data?
unkempt apex
#

wtfff

deep abyss
#

I also tried: tf.keras.backend.clear_session() but it didn't release the memory.

jaunty helm
verbal oar
#

For example, this error might be raised if a per-user quota is exhausted, or perhaps the entire file system is out of space. If running into ResourceExhaustedError due to out of memory (OOM), try to use smaller batch size or reduce dimension size of model weights.

deep abyss
jaunty helm
#

assuming it's copying, if you did like

```py
train_ds = train_ds.map(preprocess_images, num_parallel_calls=tf.data.AUTOTUNE)
```the unprocessed data could be collected and reduce mem
jaunty helm
#

maybe the data is compressed so when you load it it takes more memory than it might seem

unkempt apex
#

yeah

#

@deep abyss have u checked dataset manually?

#

it's all images right

deep abyss
unkempt apex
unkempt apex
#

no it's only 221 mb

#

another option as I said, try to run the same code on colab now

#

with the GPU they provide

verbal oar
#

profiling would be helpful I think

deep abyss
unkempt apex
verbal oar
#

some memory profile specifically

unkempt apex
#

use pytorch always 🫂

#

I never used tf actually

verbal oar
#

I must try pytorch

deep abyss
#

While loading the dataset in Colab it takes no GPU memory, the usage remains constant to 0.1 GB out of 16 GB but in my system it instantly consumes 1.9 GB of dedicated GPU VRAM (I have RTX 3050 with 4 GB dedicated VRAM). Why that might be...?
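One thing that might explain the jump: by default TensorFlow reserves a large chunk of GPU memory as soon as it first touches the device, regardless of how small the dataset is, and it keeps it until the process exits. A sketch of switching to on-demand allocation follows. `enable_gpu_memory_growth` is a made-up helper name; the `tf.config` calls are TensorFlow's own. It has to run before anything else uses the GPU, so right after a kernel restart, and whether it accounts for the difference from Colab is a guess:

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand instead of reserving it upfront.

    Returns the names of the GPUs it was enabled for (empty if there are none,
    or if TensorFlow is not installed).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []

    enabled = []
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            enabled.append(gpu.name)
        except RuntimeError:
            # Raised if the GPU was already initialized; restart the kernel
            # and call this before loading any data or building a model
            pass
    return enabled


gpus_with_growth = enable_gpu_memory_growth()
print(gpus_with_growth)
```

With growth enabled, the usage shown by the system monitor should track what the model and batches actually need, which makes real out-of-memory problems easier to tell apart from the upfront reservation.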

verbal oar
#

is porting from torch(lua) relatively easy to pytorch?

#

because I see some deep render in torch and want do it in pytorch

past bramble
#

trying to improve my image generation model, looks good from epoch 8 :)

odd stratus
#

im using ordinals for the letter inputs and outputs

verbal oar
#

looks better and better I think

deep abyss
verbal oar
#

to find out what the bottlenecks might be etc., use a profiler

past bramble
verbal oar
#

I assume ordinal numbers but not sure

past bramble
#

ordinal numbers and encoding text doesn't relate

verbal oar
#

hmm but I saw somewhere this term ordinals, forgot where

#

maybe OrdinalEncoder

#

looks like make sense

#

to preserve inherent ordering

past bramble
#

hm

unkempt apex
deep abyss
past bramble
unkempt apex
#

which model? are u using , I have tried U-Net!

past bramble
#

wdym? GAN doesn't explain it?

unkempt apex
#

u using GAN now right?

past bramble
#

yeah

unkempt apex
#

how's your structure of Generator then?

past bramble
#

bunch of CNNs

unkempt apex
#

generally people make similar to CNN

#

yeah that's what, but we can also make it similar to U-Net

past bramble
#

that's a new one

unkempt apex
#

cGAN !

past bramble
#

conditional GAN? I guess I made my number model that way

unkempt apex
#

yup

past bramble
odd stratus
arctic wedgeBOT
past bramble
#

I don't think it'll be effective

#

dunno I haven't tried

#

btw a question
_____________

#

I need outputs from a neural network from a set of numbers, which each represent a word. How can I make it that the network only outputs from the set I have defined?

Example:
I have the set:
```py
[0.1, 0.2, 0.3, 0.4, 0.5]
```

The output:
```py
[0.3, 0.1, 0.4]
```
or
```py
[0.3, 0.1, 0.4, 0, 0]  # padding on the right
```

The output size isn't fixed, since a conversation response can be of any size.

How can I go about making such an output layer?

desert oar
jaunty helm
past bramble
odd stratus
unkempt apex
#

wait, 30k epochs seriously? 😂

odd stratus
#

im basing the concepts off of the image generating a.i's where the a.i. only needs to predict one letter at a time to create a full "image" but the image being the output text
and my input data is the entire movie scene script for The Fellowship of The Ring lmaooo

odd stratus
#

ive restarted a few times and trained my model a bit in each to see how it works
it seems to follow two trends

  1. it repeatedly outputs a single letter after initialising but around epoch 3000 it starts choosing different letters
  2. it either
    a. starts getting everything correct
    b. or it starts averaging results and getting incorrect output
    could just be random initialising data
    but it does get really accurate results when its initialised data is lucky lmao 50/50
past bramble
#

can you show it's responses

#

30k in 30 minutes is fast ngl

odd stratus
#

testing different stuff to get it to do a full generated output

verbal oar
#

do you use git or freestyling with code?

#

or just jupyter notebook or locally

odd stratus
verbal oar
#

for example I think I should not do sth like "added vae from scratch" but rather more modular messages?

#

like added encoder
added decoder

#

or not use git would be faster

#

when I'm not using git I code faster

#

I know how to use it but dont know when, at what messages to have

past bramble
past bramble
regal light
#

how do we utilize tensorflow gpu on pycharm
i tried every possible way, but I can't find the right solution

verbal oar
#

writing git messages is like naming variables 😂

desert oar
# jaunty helm I think they mean the output should only output those in the set

Correct, but there's no way to do that. One hot encoding is how you do that. On the other hand, if the output is a real number within some range, there are things you can do to constrain the range of the output. But you can't put arbitrary constraints on the output beyond that. If you try, you run into some fundamental trickiness of the real numbers, among other problems

desert oar
# past bramble I will be encoding the words into numbers, I want to input numbers for learning ...

Unfortunately, one hot encoding is precisely how you encode a fixed set of numbers in a model. You are mapping words to integers, and then mapping those integers to elements in a vector. There are other ways to do it that are mostly used in research fields like psychology, but for the purposes of machine learning they are equivalent, so one hot encoding is preferred because it's the simplest and easiest to interpret

#

It looks like you're trying to use numbers other than integers, maybe decimal numbers within some range? Consider that 0.1, 0.2, ... 1.0 are identical to 1, 2, ..., 10 -- you just divide everything by 10

#

So without loss of generality, you can always transform a finite set of numbers to natural numbers counting up from 1 or 0 as desired
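A small sketch of that mapping, using the set from the question. The helper names are made up. Each allowed value gets an index, the index becomes a one-hot vector, and on the way out you pick the highest-scoring position and map it back, so the result can only ever be a member of the set:

```python
allowed = [0.1, 0.2, 0.3, 0.4, 0.5]  # the set from the question
value_to_index = {value: i for i, value in enumerate(allowed)}


def one_hot(value):
    """Vector with a 1 at the position of value and 0 elsewhere."""
    vec = [0] * len(allowed)
    vec[value_to_index[value]] = 1
    return vec


def decode(scores):
    """Map a vector of per-position scores back to the best allowed value."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return allowed[best]


encoded = one_hot(0.3)
# Pretend these are the raw scores coming out of the network
decoded = decode([0.05, 0.10, 0.70, 0.10, 0.05])
print(encoded, decoded)
```

The network has one output per allowed value instead of a single number, which is what makes "only outputs from the set" hold by construction.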

#

It turns out that this is true even for the rational numbers. digging into that is the content of a course in real analysis

#

I hate to tell you not to experiment with something, but at least hopefully you understand now why people do what they do (and don't do what you're trying to do)

buoyant vine
#

Sorry to derail Salt's excellent explanation, but a bit of a question around training LLMs or at least, looking for guidance around what approach to take:

I'm currently looking to try to build a model that predicts the next set of relevant tokens, up to N tokens for Y variants, where N and Y are small (think maybe 10 at most), where it is trying to predict the most relevant tokens based on an input training dataset that varies in size.

I guess it technically falls under generative AI but it has some caveats:

  • The aim is not to produce accurate grammar or longer sentences, just tokens.
  • The system does not want to do KNN or other semantic search type of logic to get the most relevant tokens, i.e. RAG is out of the question.

I haven't tried it yet but I wondered if you could take some basic encoder-decoder model and fine-tune it to the new dataset forcing it to generating the tokens related to that dataset only. But not sure if that is the right or most efficient way to do so.

simple tapir
#

What do you guys suggest for mlops? ZenML, MLFlow or something else?

buoyant vine
simple tapir
#

What do you think about ZenML?

buoyant vine
#

Haven't tried it so can't really say

small wedge
past bramble
simple nimbus
#

hey, given a sentence, is there any way to figure out which chapter (textbook) or topic (pre determined) it is from? from research online I was told to use BERT but is there any simpler way? looks like I have to train BERT with quite some data to begin with

past bramble
small wedge
#

the way LLM's choose a word is by having a softmax across their entire vocabulary

#

the token with the highest probability is chosen

serene scaffold
#

or, if you have the whole textbook available, you can just... find the sentence in the textbook.

simple nimbus
#

sometimes I need to identify what the sentences is about

serene scaffold
simple nimbus
#

Its about "I"

serene scaffold
simple nimbus
#

but for sentence analysis that I am doing, towel was more appropriate answer

#

sorry for confusion

past bramble
# small wedge the token with the highest probability is chosen

What about the output size? how do they create text such that their content doesn't exceed the max limit and it's constructed accordingly by stopping with punctuations. picking tokens with highest probability until you reach a stop punctuation before hitting the max length?

small wedge
#

the model outputs 1 token at a time

#

you request from it as many tokens as you want (input -> output1 -> input + output1 -> output2 -> ...)

#

you could use punctuation as a way to stop if you want shrug it doesn't really matter
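The loop described above, as a toy sketch. The lookup table stands in for the model and is entirely made up; a real model would score every token in its vocabulary given the whole context. In practice the stop signal is usually a dedicated end-of-sequence token rather than punctuation, but the control flow is the same:

```python
# Hypothetical next-token scores keyed by the last token only
NEXT_TOKEN_SCORES = {
    "<start>": {"i": 0.9, "you": 0.1},
    "i": {"am": 0.8, "was": 0.2},
    "am": {"fine": 0.7, "here": 0.3},
    "fine": {".": 0.9, "too": 0.1},
}


def predict_next(tokens):
    """Stand-in for the model: return the highest-scoring next token."""
    scores = NEXT_TOKEN_SCORES[tokens[-1]]
    return max(scores, key=scores.get)


def generate(prompt, max_new_tokens=10, stop_token="."):
    """Append one token at a time until the stop token or the length cap."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)
        tokens.append(next_token)  # the output becomes part of the next input
        if next_token == stop_token:
            break
    return tokens


full = generate(["<start>"])
capped = generate(["<start>"], max_new_tokens=2)
print(full, capped)
```

The max length is just the loop bound, so the text is cut off there if no stop token has shown up by then.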

simple nimbus
#

for example given


---

Consider the following pairs:

1. Port of Rotterdam: First major port in Europe registered as a company
2. Port of Shanghai: Largest privately owned port in the world
3. Port of Singapore: Largest container port in the world

How many of the above pairs are correctly matched?

(a) Only one pair  
(b) Only two pairs  
(c) All three pairs  

---

I have to determine if this question is from geography or history

past bramble
small wedge
#

that's how LLMs work yes

past bramble
past bramble
# small wedge that's how LLMs work yes

so I train it with a dataset where the input is a text and the output is the word it's supposed to guess?
that's weird I have to figure out how to do that when the dataset I have is conversation pairs

small wedge
small wedge
past bramble
past bramble
small wedge
#

modern llms use transformers and multihead attention all that

#

but you can make something like this with simple RNN stuff like LSTM or GRU

#

yeah so if you wanted to have a chat bot that can generate novel conversations that don't exist in it's dataset you'd probably wanna go the softmax route and feed it stuff like "I'm fine, how about yourself? " -> "I'm fine, how about yourself? I'm" -> "I'm fine, how about yourself? I'm " etc.
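For conversation pairs, one way to get that kind of training data is to slide through the reply and emit a (context so far, next token) example at every step. A sketch with whitespace tokenization and a made-up `<end>` marker; the function name is hypothetical:

```python
def next_token_examples(prompt, reply, end_token="<end>"):
    """Turn one (prompt, reply) pair into (context, next token) training examples."""
    context = prompt.split()
    examples = []
    for token in reply.split() + [end_token]:
        # The model sees everything so far and must predict this token
        examples.append((list(context), token))
        context.append(token)
    return examples


examples = next_token_examples("how are you", "i'm fine")
for context, target in examples:
    print(context, "->", target)
```

Including the end marker as a target is what lets the model learn when a reply is finished.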

#

the big issue you'll probably run into here if you've never played with this kinda NLP before is probably stop words

#

your dataset is not massive, and there might be a lot of words that appear very often like "i'm" "i've" even spaces that the model can easily find local minima for when just spamming the same word over and over as an output. There are 2 minds to dealing with this which is basically to remove common stopwords altogether from the dataset to avoid having the model break during training (this unfortunately leads to the model not being able to accurately generate those stopwords without further fine tuning) or just leaving the stop words in and praying to any gods that will listen that it doesn't break.

desert oar
# past bramble I have reconsidered with the way you have explained it. I know how to use one ho...

you do in fact use one-hot encoding for outputs as well. that's the standard technique for classification in all cases, not just for text (where you are "classifying" each output token with a word). the difference is that you don't get strict 1 and 0 values -- you get a score in each vector element, and conventionally we treat the highest-scoring element as 1 and all the others as 0. ideally you would use the softmax function to ensure that the scores are all between 0 and 1, and they all add up to 1, which helps ensure that the output is sane, aids interpretation, and allows you to use loss functions that treat the output as a multinomial probability model, which is exactly what we have here
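A plain-Python sketch of the softmax step and the matching cross-entropy loss, with a three-word vocabulary and scores invented for the example:

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Loss against a one-hot target: -log of the probability given to it."""
    return -math.log(probs[target_index])


vocab = ["fine", "good", "tired"]  # made-up vocabulary
scores = [2.0, 1.0, 0.1]           # made-up raw network outputs

probs = softmax(scores)
predicted = vocab[probs.index(max(probs))]
loss_if_fine = cross_entropy(probs, 0)
loss_if_tired = cross_entropy(probs, 2)
print(probs, predicted, loss_if_fine, loss_if_tired)
```

The loss is small when the correct word already has a high probability and large when it doesn't, which is the signal that training pushes on.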

#

i suggest taking a look at the classic word2vec model: it's a good entry point into a lot of these concepts and still forms the conceptual basis for a lot of what we do in ML with text even 10+ years after the model came out

#

(most of the ideas in word2vec are based on older ideas in ML and statistics but at that point you're going very deep into the fundamentals, which is a good thing, but probably unsatisfying if you want to just play around and build some toy projects)

desert oar
#

as far as i understand, that's precisely what "GPT" is/was: a decoder-only model with a huge number of parameters trained on a huge amount of data turns out to be great at generating text

desert oar
#

source: we used BERT vectors at work for text classification shortly after the model came out, and it improved our results compared to other vector embeddings

#

and we didn't fine-tune, we just used the off-the-shelf model weights

#

but as a learning exercise, yeah i think using pre-trained vectors starves you of an opportunity to explore and experiment and practice with building your own things

past bramble
odd stratus
#

when im training my a.i.
it isnt outputting quality answers
im training it to predict the next letter in a sequence
however instead of outputting the next predicted letter
its output vector is just an average of the training data

e.g. if 25% of the output was the letter e and 10% was the letter a
its output isnt accurate and instead constantly outputs e as e is the most correct average

how do i prevent it?
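One possible reading of the symptom: if the network isn't really using the previous letters, the best it can do is always guess the most common letter. The count-based sketch below (no neural network, toy text) shows the gap between guessing without context and conditioning on just one previous letter:

```python
from collections import Counter, defaultdict

text = "hello hello hello"
pairs = list(zip(text, text[1:]))  # (previous letter, next letter)

# Without context: always guess the single most frequent letter
most_common_letter = Counter(text).most_common(1)[0][0]

# With context: count which letter follows each previous letter
following = defaultdict(Counter)
for prev, nxt in pairs:
    following[prev][nxt] += 1


def predict_with_context(prev):
    """Most frequent letter seen after prev in the training text."""
    return following[prev].most_common(1)[0][0]


no_context_acc = sum(nxt == most_common_letter for _, nxt in pairs) / len(pairs)
context_acc = sum(predict_with_context(p) == n for p, n in pairs) / len(pairs)
print(most_common_letter, no_context_acc, context_acc)
```

If a model behaves like the first predictor, it's worth checking that the input sequence actually reaches it in a usable form (for example one-hot letters rather than raw ordinals) and that the output is a softmax over letters trained with cross-entropy rather than a single regressed number.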

buoyant vine
# desert oar aren't all the big LLMs are trained on next-token prediction anyway?

Yep, but the goal of most of the existing models is to predict human text as such, i.e. it has certain things like grammar correctness and formatting sentences, which we don't really want.

The goal is that it needs to be fast and lightweight, so it can't do things like RAG or things which end up involving running both the model and then KNN on top of that.

#

the primary objective is supplementing keyword search queries with keywords & phrases, but most systems like word2vec or GloVe, etc... are trained on general (normally English) text, making them liable to predict words that don't exist in the corpus

strong notch
#

Does anyone here work with AI in healthcare, or is anyone interested?

buoyant vine
desert oar
buoyant vine
#

Hmm possibly, how well does that work with predicting phrases of text though?

desert oar
#

not enough source data?

#

oh, not well because it's cbow and skipgram neither of which is what you want i think

buoyant vine
#

possibly, the source data itself is a black box, because it depends ultimately on who is using the engine

#

different users will have bigger or smaller indexes

desert oar
#

how much text do you have? maybe you can use nanogpt

#

that is: use the basic transformer architecture for its original purpose of sequence modeling, forget all the LLM stuff

#

i haven't seen this embedding replacement technique that waterfall posted though, so maybe that's promising

#

it definitely sounds like it might help you, from the abstract

buoyant vine
#

Yeah need to dig into it, effectively the biggest issue here is the amount of compute required. The goal is that this is a supplemental system which can periodically train on the user's search corpus and then that gets used to help supplement search queries

#

giving you an illusion of hybrid or semantic search

#

but without the ANN/KNN related activities

#

In theory you could use word2vec and Glove on some pre-compiled (small) index, but I'm not sure how well they work when trying to form or predict phrases of 2 or 3 words

serene grail
buoyant vine
#

normally RAG has some sort of database that provides context to the LLM

#

which is normally in some form of vector search

#

doesn't have to be, but it is very common

small wedge
serene grail
#

And KNN is a form of vector search?

buoyant vine
#

it is still realistically very computationally expensive

buoyant vine
serene grail
#

Thank you!

buoyant vine
# small wedge have you looked into any sparse encoding search techniques? or would that still ...

The issue is also the fact that it slows down both search and ingestion.

In the current landscape, if you try to do hybrid search with something like sparse encoding or just ANN/KNN, you end up using 10-100x more compute than a regular keyword-based system would, and often end up scanning a lot more data in the process.

The flip side is that people often don't actually want the full semantic behaviour, they just want some similar keywords, terms or phrases to be included in the results when searching for something like "high heels", for example. Adding vector search often means you need a GPU instance to embed all your data and respond to queries quickly, and you also see a much sharper increase in costs when your dataset grows, while search gets slower because building the indexes takes longer.

small wedge
#

yeahh

solid tangle
#

hello

#

need a guide on how to create a neural network from scratch

small wedge
#

do you have any ML experience or knowledge prior to this?

solid tangle
#

nah

elder pilot
#

Hi guys can I get an AI roadmap recommendation

solid tangle
#

i rlly dont need to create it i just want to write about it

#

but i sorta want to understand it

solid tangle
#

ok lemme give it a read ty

solid tangle
#

ok tyy

rich moth
#

Just finished an evaluation step on my model. I had to make a bunch of changes to get it working, still got some tweaking to do probably. I'll let it run for a bit then we can see some results.

rich moth
#

Honestly, for the first reconstruction this is one of the best ive seen.

shadow viper
#

good day everyone, i'm not familiar with GPUs so i want to ask, since i want to make use of google colab to train a model that's based on a vision transformer from scratch.
using the google colab T4 GPU or the google colab TPU v2-8
which one would you advise to train the vision transformer?

serene scaffold
faint quail
#

Dopamine

shadow viper
# serene scaffold If you don't know why you want to use a tensor processing unit (TPU), just use t...

OMGGG.... I'm currently training with the GPU T4 and I'm not even gonna lie, its so awesome.
i used my laptop CPU (16 GB RAM, Core i7, 3.0 GHz) to train it before, but i would stop every other task because I was scared my system would blow up or crash. but now, omg, it's as if I'm doing nothing. i can't even hear my laptop fan make any sound, i can literally type freely without any lag. and it's fasttttt!!!!!!!!!!!!!!!!!!!!!!!

i'm so saving up for a real time GPU

unkempt wigeon
#

What should I use for a kernel for a converted image matrix my apologies

serene scaffold
shadow viper
serene scaffold
#

your money is probably better spent renting cloud compute.

shadow viper
serene scaffold
shadow viper
shadow viper
serene scaffold
serene scaffold
shadow viper
upper patio
#

Any opinions on groq ? Im trying to use it in my saas but not quite sure if that would be the best

slate raven
#

Computer vision: I cannot open 2 cameras at the same time.
Everything worked fine on my Windows 11 laptop, then I transferred all my code to my Linux / Ubuntu machine. When I only open one camera with cv2.VideoCapture(0) it works fine. All my different cameras work fine with index 0. But when I plug in 2 cameras and try cv2.VideoCapture(0) and cv2.VideoCapture(1) at the same time I get this error message:

[ WARN:0@0.008] global cap_v4l.cpp:999 open VIDEOIO(V4L2:/dev/video1): can't open camera by index [ERROR:0@0.408] global obsensor_uvc_stream_channel.cpp:158 getStreamChannelGroup Camera index out of range Error: Failed to capture image.

I also tried index 2, 3 and 4, and it gives me the same error, while there are 3 cameras plugged in my laptop
Btw google and chatgpt weren't of any help.

Thank you in advance for your help :)

faint quail
quaint rivet
#

i have written a unet model for image segmentation. When i run my model, I'm getting the loss as nan. I don't know why i'm getting nan

#

i even have checked my input value

severe inlet
#

im working on a data science project on colab with some friends. one of our datasets is a 9gb csv file. is there any way to import/load it into colab to work on it as a dataframe? or how should i go about working with this massive file?

severe inlet
#

sorry how do i read it in in chunks if i need to have it uploaded somewhere first..?

wooden sail
#

you can load the file into your google drive and mount the drive in colab

#

though for a file that size, you may or may not need a paid tier of either google drive or colab

#

if the data is obtained from some website/API, you'd have to process it as you obtain it

severe inlet
#

clean it as it imports?

wooden sail
#

yeah, process it in chunks
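
A rough sketch of what "process it in chunks" can look like with pandas: `pd.read_csv(..., chunksize=...)` yields one DataFrame per chunk instead of loading the whole file. The column names (`id`, `price`), the helper `mean_of_column` and the tiny demo file are made up for illustration; with the real 9 GB file you'd point it at wherever the CSV lives after mounting the drive and pick a much larger chunk size.

```python
import os
import tempfile

import pandas as pd


def mean_of_column(csv_path, column, chunksize=100_000):
    """Compute the mean of one column without holding the whole CSV in memory."""
    total = 0.0
    rows = 0
    # Each iteration yields a DataFrame with at most `chunksize` rows.
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        chunk = chunk.dropna(subset=[column])  # clean each chunk as it comes in
        total += chunk[column].sum()
        rows += len(chunk)
    mean = total / rows if rows else float("nan")
    return mean, rows


# Tiny demo file standing in for the real dataset (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    pd.DataFrame(
        {"id": [1, 2, 3, 4, 5], "price": [1.0, 2.0, None, 4.0, 5.0]}
    ).to_csv(demo_path, index=False)
    mean_price, n_rows = mean_of_column(demo_path, "price", chunksize=2)

print(mean_price, n_rows)
```

The point is that only running totals survive between chunks, so memory use depends on `chunksize`, not on the file size. If you need the cleaned rows later, write each cleaned chunk back out (or keep only the columns you need) rather than concatenating everything.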

untold fable
#

Hey guys do have any ideas

#

How to use machine learning for iot project

#

Or real time projects

small wedge
#

That's a cool one

#

I saw an old project that predicted poses through walls using wifi signal data

past bramble
#

I wanna make one now

#

other than VAE and GAN, what do we have for image generation?

small wedge
unkempt apex
quaint rivet
unkempt apex
#

see, then maybe print first 5 rows from your dataset

quaint rivet
#

Strange thing is that I'm getting loss value as nan

quaint rivet
unkempt apex
quaint rivet
#

I'm still not able to figure out what is causing the nan

unkempt apex
#

then share some info, so others can also take a look at that

quaint rivet
#

Ok

quaint rivet
unkempt apex
#

I said already, first 5 rows from dataset, and maybe code on how you are calculating loss

past bramble
quaint rivet
#

mask image:
0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0
1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0
2 0 0 0 0 0 0 0 0 0 0 ... 0 0 0
3 0 0 0 0 0 0 0 0 0 0 ... 0 0 0
4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0

#
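import glob

import numpy as np
from tensorflow.keras.callbacks import EarlyStopping  # assumes the Keras that ships with TensorFlow

root_train_dir = "D:\\feature-extraction\\assets\\train"
root_test_dir = "D:\\feature-extraction\\assets\\test"

# glob returns files in arbitrary order; sort so image i lines up with mask i
# (assumes each image and its mask share the same file name)
train_x = sorted(glob.glob(root_train_dir + "\\images\\" + "*.npy"))
train_y = sorted(glob.glob(root_train_dir + "\\masks\\" + "*.npy"))

test_x = sorted(glob.glob(root_test_dir + "\\images\\" + "*.npy"))
test_y = sorted(glob.glob(root_test_dir + "\\masks\\" + "*.npy"))


def load_data(x, y):
    """Load every .npy image and mask into two stacked arrays."""
    X = np.array([np.load(i) for i in x])
    Y = np.array([np.load(j) for j in y])
    return X, Y


# Stop once val_loss has not improved for 5 epochs and keep the best weights
callbacks = [
    EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True),
]

X_train, Y_train = load_data(train_x, train_y)
X_test, Y_test = load_data(test_x, test_y)
print(X_train.shape, Y_train.shape)

# `model` is the U-Net defined elsewhere
history = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=10, batch_size=8, callbacks=callbacks)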
unkempt apex
#

your dataset are images?

quaint rivet
#

yeah

#

ofc i have unet model

#

That's why it's hard to find error

unkempt wigeon
verbal oar
#

hmm looks like you have somewhere division by zero?

#

dividing by zero gives inf (or NaN for 0/0), and the inf usually turns into NaN further down the line

wooden sail
#

the hyperparams of the model/fitting method might also be set incorrectly, causing the model parameters to blow up
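
To make the two suggestions above concrete, here is a small NumPy sketch (not the actual U-Net or its loss, which weren't posted): it shows how a binary cross-entropy goes NaN as soon as a prediction hits exactly 0 or 1, how clipping avoids that, and a quick check for bad values in the input arrays. `bce` and `has_bad_values` are hypothetical helper names.

```python
import numpy as np


def bce(y_true, y_pred, eps=0.0):
    """Mean binary cross-entropy; eps > 0 clips predictions away from 0 and 1."""
    if eps > 0:
        y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Silence the warnings so the NaN shows up in the result instead of the log.
    with np.errstate(divide="ignore", invalid="ignore"):
        loss = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return loss.mean()


def has_bad_values(arr):
    """True if the array contains NaN or +/-inf."""
    return not np.isfinite(arr).all()


y_true = np.array([0.0, 1.0, 0.0, 1.0])
y_pred = np.array([0.0, 1.0, 0.2, 0.9])  # two predictions saturated at exactly 0 and 1

unclipped = bce(y_true, y_pred)           # 0 * log(0) = 0 * -inf = nan
clipped = bce(y_true, y_pred, eps=1e-7)   # finite

print(unclipped, clipped)
print(has_bad_values(np.array([1.0, np.nan])))
```

Running `has_bad_values` on `X_train` and `Y_train` before `model.fit` rules out bad inputs. If the inputs are clean, the usual next suspects are unnormalised pixel values, mask values outside what the loss expects, or a learning rate high enough that the weights blow up.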

rich moth
unkempt wigeon
#

I have a question do I need to make a lot of pathways for a neuron to teach a neural network sorry because I got pillow installed in my network so I can put any image and turn it into an array my apologies

unkempt wigeon
rich moth
#

Anyone ever play around with OpenAI Gym? I want to test my AI logic in unique environments. I made a CTF game using pygame but I wanted to try something different.

unkempt wigeon
unkempt wigeon
#

@rich moth I'm sorry

#
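#===(imports)===#
from PIL import Image
import numpy as np
from matplotlib.image import imread
#==============#


# Raw string so the backslashes in the Windows path are not read as escape
# sequences ('\U' in a normal string literal is a syntax error in Python 3).
image_array = imread(r'C:\Users\Willo\Desktop\ais\eye0.png')
X = np.asarray(image_array)  # imread already returns a NumPy array

print(X.shape)  # (height, width, channels) for a colour image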
verbal venture
#

what's the difference between RAG and AI search? And is this RAG? if not what should I google to learn how to make this?

rich moth
unkempt wigeon
#

What specifically a 3D because I believe there's a function that you could use to make 2D games although I don't know too much about training I'm just trying to make a neural network at the beginning although I didn't make a training simulation for Paul I'm sorry

rich moth
verbal venture
unkempt wigeon
#

What do I need for the convolution to get it so that the image can be turned into an array I'm sorry

carmine cairn
#

Hey, I would like to get the address, phone number and website data from my saved places in Google Maps and then save it as a .csv file. How can I do that without the Google API? (Ex places list: https://maps.app.goo.gl/bsxbhgW9zvXzSa8n9)

unkempt wigeon
rich moth
verbal venture
#

That’s the vector db?

rich moth
#

haystack is for building the rag pipeline

#

elasticsearch is to store the embedded data

verbal venture
#

is that through the prompt (return me the citations) or is that through indexing metadata (done through code/software engineering)?

rich moth
#

I built one, but the UI is minimal and it looks like crap

verbal venture
#

can you link me the code?

#

and biggest problem with RAG rn is hallucination concerns yeah?

unkempt wigeon
#

Found out why the image wasn't showing its shape forgot 2 back slash

rich moth
#

honestly, mine doesnt hallucinate. believe it or not.

unkempt wigeon
#

So how many areas for weights should I have sorry because it does say three color channels but there's Image size (194,259)

unkempt wigeon
unkempt wigeon
#

What should I use for the kernel?

rich moth
unkempt wigeon
# rich moth Not sure what you're asking.

To slide over the image to help it come to the decision what it is. I'm trying to recreate an experiment I heard of: an AI that was shown images of people who may or may not have heart problems in the future, and it reliably told the biological sex. I'm trying to create that, and for a convolutional neural network to take images and recognize them you have to have a kernel that goes over the image, sliding across the array my apologies

rich moth
#

oh i see are you talking about transforms.Compose ?

#

maybe google that

unkempt wigeon
#

Yes I'm sorry

unkempt wigeon
# rich moth maybe google that

In this video we'll create a Convolutional Neural Network (or CNN), from scratch in Python. We'll go fully through the mathematics of that layer and then implement it. We'll also implement the Reshape Layer, the Binary Cross Entropy Loss, and the Sigmoid Activation. Finally, we'll use all these objects to make a neural network capable of classif...

rich moth
#

its part of torchvision, transforms

unkempt wigeon
#

Any videos that may have any use my apologies

serene scaffold
verbal venture
#

basically asking if there's more methods to local-document AI search than RAG

verbal venture
serene scaffold
#

If someone says "oh we need some way to search for documents", and someone else says "ok let's use RAG", that doesn't solve the problem. you need to already know how you can retrieve documents in order to create a RAG system.

verbal venture
#

yeah you're saying the retrieval is bm25/KNN vector search
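
For reference, the "KNN vector search" half of that can be sketched in a few lines of NumPy: brute-force cosine similarity between a query embedding and every document embedding, then take the top k. The 2-dimensional vectors here are toy stand-ins for real embeddings and `knn_search` is a made-up helper name; real systems swap the brute-force scan for an ANN index.

```python
import numpy as np


def knn_search(query_vec, doc_vecs, k=2):
    """Return indices and cosine scores of the k documents closest to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                    # cosine similarity to every document
    top = np.argsort(-scores)[:k]     # highest scores first
    return top.tolist(), scores[top].tolist()


# Toy "embeddings" for three documents (hypothetical values).
doc_vecs = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.9, 0.1],
])
query_vec = np.array([1.0, 0.0])

top_ids, top_scores = knn_search(query_vec, doc_vecs, k=2)
print(top_ids, top_scores)
```

In a RAG setup the documents behind `top_ids` are what gets pasted into the LLM prompt as context; the retrieval step itself works the same with or without the LLM on top.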

verbal venture
serene scaffold
#

that are conversational? Not that I know of.

#

even if someone claimed that there were, I'd want to understand how it works before I agree that it's not RAG.

verbal venture
#

so you're saying RAG is the only solution to things like perplexity rn

serene scaffold
verbal venture
#

ah the search engine

serene scaffold
#

ah

verbal venture
#

yeah have you ever tried it?

#

I think the way it works is they return the google search API results then summarize the answers through prompt engineering and cite their sources

#

seems kinda easy technically? 2B valuation

unkempt wigeon
#

Does anyone know how to make an efficient kernel using numpy my apologies

rich moth
unkempt wigeon
#

What is the best way of getting Data for the neural network to do its job sorry

serene scaffold
unkempt wigeon
#

What I mean is sliding it all across the image and getting the values to put into a relu function for each individual value sure it will slow it down but it might learn to go to the next layer and then the next layer and then the next layer and it will tell me what it is I know it's an over simplification I'm trying to explain it to not be abstract my apologies
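
A minimal NumPy sketch of that idea, under some assumptions: a single-channel (grayscale) image, no padding, stride 1, and a hand-picked 3x3 vertical-edge kernel rather than a learned one. Strictly speaking this is cross-correlation (the kernel is not flipped), which is what CNN layers usually compute. `conv2d_valid` and `relu` are illustrative names, and `sliding_window_view` needs NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy 1.20+


def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` (no padding, stride 1)."""
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw): one patch per output pixel
    windows = sliding_window_view(image, kernel.shape)
    # Multiply each patch by the kernel element-wise and sum it up
    return np.einsum("ijkl,kl->ij", windows, kernel)


def relu(x):
    """Keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)


image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0, -1.0]] * 3)          # 3x3 vertical-edge kernel

feature_map = conv2d_valid(image, kernel)  # shape (2, 2)
activated = relu(feature_map)

print(feature_map)
print(activated)
```

For a colour image of shape (height, width, 3) you'd run this per channel (or use a kernel with a channel dimension) and sum the results. The windowed view avoids Python loops, so it stays reasonably fast for small experiments.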

rich moth
# unkempt wigeon What I mean is sliding it all across the image and getting the values to put int...

Lecture 1 gives an introduction to the field of computer vision, discussing its history and key challenges. We emphasize that computer vision encompasses a wide variety of different tasks, and that despite the recent successes of deep learning we are still a long way from realizing the goal of human-level visual intelligence.

Keywords: Computer...

unkempt wigeon
#

Thank you

rich moth
#

np

verbal venture
untold fable
#

helllo

past bramble
#
63.3s    12    WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
63.3s    13    I0000 00:00:1726126970.029600      62 service.cc:145] XLA service 0x7e5a04003a40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
63.3s    14    I0000 00:00:1726126970.029656      62 service.cc:153]   StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
63.5s    15    WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
63.5s    16    I0000 00:00:1726126970.029600      62 service.cc:145] XLA service 0x7e5a04003a40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
63.5s    17    I0000 00:00:1726126970.029656      62 service.cc:153]   StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
64.8s    18    I0000 00:00:1726126971.486784      62 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
65.0s    19    I0000 00:00:1726126971.486784      62 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
#
79.4s    20    2024-09-12 07:43:06.118002: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape infunctional_1_1/dropout_1/stateless_dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
79.6s    21    2024-09-12 07:43:06.118002: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape infunctional_1_1/dropout_1/stateless_dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
145.9s    22    2024-09-12 07:44:12.611558: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape infunctional_1_1/dropout_1/stateless_dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
146.1s    23    2024-09-12 07:44:12.611558: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape infunctional_1_1/dropout_1/stateless_dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
#

not the first time I get these warnings/messages. I want to know what the reason is

unkempt apex
#

bruhh why tf ??

odd stratus
#

im pretty happy with the results i got
the a.i. managed to write out this sentence in perfect order
it learnt to write and spell letter by letter

past bramble
past bramble
unkempt apex
#

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

past bramble
unkempt apex
#

anyone know about this??

#
loss.backward(retain_graph=True)

I tried this option also

#

but still error

#

training with batch_size = 32

past bramble
#

any resources to learn about training steps in neural network, forward propagation, backward propagation, loss and gradient in detail?

unkempt apex
#

this is lit.....

past bramble
#

nice i wanted in more detail

unkempt apex
past bramble
odd stratus
unkempt wigeon
#

Anyone here who's create a (CNN) because I can use some help with the creation of colonels my apologies

unkempt apex
#

colonels? | who's create?

#

u helping CNN maker , or u wanna see that

quaint rivet
river cape
#

Hello guys , so I have this project idea of building an ai model which takes the map of building , example lets say a mall, and it should give me the directions for a particular store in the mall

#

Like for example , I want to visit the nike store in the mall

#

It should give the directions to that store from anywhere inside the mall

#

You could say like its a mini version of Google Maps

#

So if someone could give some ideas , as how to proceed?

odd stratus
# river cape So if someone could give some ideas , as how to proceed?

you need to have an a.i. program that you can run
if you want to use preexisting infrastructure theres a lot of libraries e.g. tensorflow etc.
or make one yourself

then you need to design the layers and layer sizes, then you need to turn your task into a well defined set of outputs
then you need to take in data in a well defined way such that it can be mapped onto the output data

e.g. x+10 = y
input x output y

#

then once you have a lot of training data, test and train the a.i. until you get results you want
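
As a toy version of those steps, here is the `x + 10 = y` example trained with plain gradient descent in NumPy. It's a single "neuron" (`w * x + b`), not a real network, and the learning rate and step count are arbitrary choices for this sketch.

```python
import numpy as np

# Training data for the mapping y = x + 10
x = np.linspace(-1.0, 1.0, 50)
y = x + 10.0

w, b = 0.0, 0.0   # model parameters, start from zero
lr = 0.1          # learning rate (assumed value)

for step in range(500):
    pred = w * x + b                 # forward pass
    error = pred - y
    loss = np.mean(error ** 2)       # mean squared error
    grad_w = 2.0 * np.mean(error * x)  # d(loss)/dw
    grad_b = 2.0 * np.mean(error)      # d(loss)/db
    w -= lr * grad_w                 # gradient descent update
    b -= lr * grad_b

print(w, b, loss)
```

After training, `w` ends up close to 1 and `b` close to 10, i.e. the model has recovered `y = x + 10`. Bigger networks follow the same loop of forward pass, loss, gradients, update; libraries like tensorflow just compute the gradients for you.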

desert oar
river cape
wooden sail
#

i think dijkstra is optimal regarding complexity for the most general path finding problem
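
For the mall case that could look roughly like this: model the floor plan as a graph (places as nodes, walkways as weighted edges) and run Dijkstra with a priority queue. The mall layout and distances below are invented for illustration, and `shortest_path` is just a name picked for the sketch.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (path, distance) or (None, inf) if unreachable."""
    dist = {start: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == goal:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_dist = d + weight
            if new_dist < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_dist
                prev[neighbour] = node
                heapq.heappush(heap, (new_dist, neighbour))
    if goal not in dist:
        return None, float("inf")
    # Walk back from the goal to rebuild the route
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


# Hypothetical mall: walking distances in metres between points of interest
mall = {
    "entrance": {"atrium": 30},
    "atrium": {"entrance": 30, "food_court": 40, "escalator": 20},
    "food_court": {"atrium": 40, "nike": 70},
    "escalator": {"atrium": 20, "nike": 50},
    "nike": {"food_court": 70, "escalator": 50},
}

route, metres = shortest_path(mall, "entrance", "nike")
print(route, metres)
```

The hard part of the project is then turning the building map into that graph, not the routing itself, and no training data is needed for this part.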

spare forum
#

Not everything needs so-called "AI"

#

oopsies 🙂

rich moth
#

man making the game is harder than the AI part.

desert oar
small wedge
#

And balancing rewards to actually get your agents playing your game instead of finding a niche and exploiting it is just as hard as making the agent too

unkempt wigeon
unkempt wigeon
serene scaffold
unkempt wigeon
#

Being a point higher than the human player

serene scaffold
#

what game is the CNN playing

unkempt wigeon
#

A pong because there's two simple outputs up and down but it has to know where the ball is sorry

unkempt wigeon
rich moth
rich moth
unkempt wigeon
#

Well I was thinking that but it's probably not what's needed for a CNN so I might try training one on games first because I can build any game that I want and I can have it trained on the data that's found so I can get a better idea a feel for how to train them in the future should be a reward based or shipping just how it figures it out by itself my apologies

unkempt wigeon
rich moth
#

like a CNN-DQN?

unkempt wigeon
#

DQN?

rich moth
#

deep q-network

unkempt wigeon
#

Yes a deep learning network

rich moth
#

What do you want to do with it? Whats your end goal?

unkempt wigeon
unkempt wigeon
#

I only have the training site made, I just need to figure out how to make the network. I don't know if that needs to be a CNN or can it just be a regular network that's been put into deep learning my apologies

rich moth
#

So I made a simple DQN with a CNN . It actually works pretty damn well lol

rich moth
unkempt wigeon
#

Do I need a CNN to run a game sorry

#

@rich moth how do you train a neural network to play a game sorry