#data-science-and-ml

verbal oar
#

hmm so why not just use maple or octave?

#

about symbolic computation

pale merlin
weary timber
#

can someone help me with this please 😔

lapis sequoia
#

t5 can be fine-tuned on any dataset, right? It does not have to be massive, right?

woven prairie
#

I have two questions regarding image generation and image manipulation with the DALL-E model:

  1. Is it possible to generate realistic images using DALL-E? I have tried a lot, but the results are not that good compared to Midjourney or Stable Diffusion. I am enhancing the prompt using GPT-4o and then passing the enhanced prompt to DALL-E, but the results are still not that good.

  2. Can we do image inpainting with the DALL-E model?

#

Here is an image. I want to inpaint the white space without resizing the image; just the white space needs to be filled with the image colour.

jaunty helm
woven prairie
#

Ok, but I have tried and it's not working on my side.

tawdry dove
#

Hello,

What's a good way to perform speaker diarization on a phone call, to separate 2 speakers in mono audio?

#

Imagine a call between a client and an agent. What's the best way to diarize that call and get the client audio and the agent audio separately? The client audio would then be embedded and stored, and pulled later to verify the client the next time he calls.

#

Please ping me when you reply to me , thanks

lapis sequoia
#

does the dataset matter for T5 summarization in terms of size?

small fiber
#

Val - I sent an overview of our products to Shouki and Jeremy. I will let them describe their requirements first - this is more of a consulting gig - I doubt we will need to do any demos

serene grail
pale merlin
serene scaffold
drifting loom
#

hi , can anyone help solve my assignment ? (data science ) plz dm if anyone can it will be pleasure

serene scaffold
serene scaffold
#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

drifting loom
#

ya thanks

serene scaffold
#

note the "add another file" button. you can put all the files in the same paste.

#

also "drop files or click here to upload files"

drifting loom
#

its .docx

#

how do i share that

serene scaffold
drifting loom
#

ohk

arctic wedgeBOT
drifting loom
#

there u go

serene scaffold
#

^ Please do it.

#

@drifting loom Please always follow instructions from the bot. The instructions are written by humans.

serene scaffold
# drifting loom ya sure

You did not follow the instruction from the bot. You only had to push a button.

What have you done so far to try to complete this assignment, and what specific problem did you run into?

nocturne wind
#
# Hi everyone
# I'm basically practicing with common libraries like seaborn, pandas, numpy, etc. However, I get an error on the question below.

# sample question: Fill in the missing values of the "deck" variable in the titanic dataset with 'Unknown'. Then draw a graph showing the frequencies of the "deck" variable.
 
titanic["deck"].fillna("Unknown", inplace=True)

# How am i supposed to do it? I've imported all needed libraries. 
# The solutions I've tried: 
current_categories = titanic["deck"].cat.categories.tolist()

titanic["deck"] =  titanic["deck"].astype(pd.CategoricalDtype(categories=new_categories))
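
A minimal sketch of one way to approach this, assuming `deck` is a pandas categorical column (as it is in seaborn's bundled titanic dataset). A categorical column rejects values that are not among its categories, so `'Unknown'` has to be registered with `cat.add_categories` before `fillna`. The small DataFrame below is made-up stand-in data so the snippet runs without downloading anything.

```python
import pandas as pd

# Made-up stand-in for the titanic data: "deck" is categorical with missing values.
titanic = pd.DataFrame(
    {"deck": pd.Categorical(["C", None, "E", None, "C", "B"])}
)

# Register the new label first; filling with a value that is not a category fails.
titanic["deck"] = titanic["deck"].cat.add_categories("Unknown").fillna("Unknown")

# Frequency of each deck value, including the new "Unknown" bucket.
deck_counts = titanic["deck"].value_counts()
print(deck_counts)
```

With seaborn imported as `sns`, something like `sns.countplot(data=titanic, x="deck")` should then draw the frequencies.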

serene scaffold
#

note that filling nans with values like "Unknown" is almost always a bad practice.

nocturne wind
#

matplotlib and/or seaborn

nocturne wind
serene scaffold
nocturne wind
#

yeah i know that. lol I mean i didnt know I knew that until I've checked a few secs ago

drifting loom
serene scaffold
#

!code

arctic wedgeBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

serene scaffold
#

@drifting loom Please show code like this!!! ^^^

serene scaffold
serene scaffold
nocturne wind
nocturne wind
serene scaffold
nocturne wind
rancid dove
#

So let's say I have a 3x3 matrix that is represented in memory as a 1D array.

utri = np.array(range(1,10)) # it's representative of the matrix np.array([[1,2,3],[4,5,6],[7,8,9]])

Is there a way, using numpy, to calculate, say, the eigenvalues of this matrix without having to reshape it? Does it have operations for doing stuff like this? The reason I don't want to reshape is that we store everything in tabular format, so I'd be regularly reshaping and then unpacking, which is a lot of extra copies.
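
A hedged note on the copy concern: for a contiguous array, `reshape` normally returns a view onto the same buffer rather than a copy, so reshaping before calling `np.linalg.eigvals` should cost very little. The sketch below checks this with `np.shares_memory`; the variable names are illustrative.

```python
import numpy as np

flat = np.arange(1, 10, dtype=float)   # the 3x3 matrix stored as a 1-D array

# On a contiguous array, reshape returns a view: same buffer, new shape metadata.
matrix = flat.reshape(3, 3)
is_view = np.shares_memory(flat, matrix)

eigenvalues = np.linalg.eigvals(matrix)
print(is_view, eigenvalues)
```

If many matrices are stored back to back, `flat.reshape(-1, 3, 3)` gives a stack that `np.linalg.eigvals` can process in one call.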

agile cobalt
pseudo hill
weary timber
#

prompt engineering or finetuning

#

?

burnt hearth
#

<TTS.utils.manage.ModelManager object at 0x000002410FC5F710>

tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
Using model: xtts
Traceback (most recent call last):
File "C:\coqui-tts\main.py", line 11, in <module>
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\coqui-tts\venv\Lib\site-packages\TTS\api.py", line 74, in __init__
self.load_tts_model_by_name(model_name, gpu)
File "C:\coqui-tts\venv\Lib\site-packages\TTS\api.py", line 177, in load_tts_model_by_name
self.synthesizer = Synthesizer(
^^^^^^^^^^^^

#

File "C:\coqui-tts\venv\Lib\site-packages\TTS\utils\synthesizer.py", line 109, in __init__
self._load_tts_from_dir(model_dir, use_cuda)
File "C:\coqui-tts\venv\Lib\site-packages\TTS\utils\synthesizer.py", line 164, in _load_tts_from_dir
self.tts_model.load_checkpoint(config, checkpoint_dir=model_dir, eval=True)
File "C:\coqui-tts\venv\Lib\site-packages\TTS\tts\models\xtts.py", line 771, in load_checkpoint
checkpoint = self.get_compatible_checkpoint_state_dict(model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\coqui-tts\venv\Lib\site-packages\TTS\tts\models\xtts.py", line 714, in get_compatible_checkpoint_state_dict
checkpoint = load_fsspec(model_path, map_location=torch.device("cpu"))["model"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#

File "C:\coqui-tts\venv\Lib\site-packages\TTS\utils\io.py", line 54, in load_fsspec
return torch.load(f, map_location=map_location, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\coqui-tts\venv\Lib\site-packages\torch\serialization.py", line 1524, in load
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
(1) In PyTorch 2.6, we changed the default value of the weights_only argument in torch.load from False to True. Re-running torch.load with weights_only set to False will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
(2) Alternatively, to load with weights_only=True please check the recommended steps in the following error message.
WeightsUnpickler error: Unsupported global: GLOBAL TTS.tts.configs.xtts_config.XttsConfig was not an allowed global by default. Please use torch.serialization.add_safe_globals([TTS.tts.configs.xtts_config.XttsConfig]) or the torch.serialization.safe_globals([TTS.tts.configs.xtts_config.XttsConfig]) context manager to allowlist this global if you trust this class/function.

Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.

#

Hi, I am trying to use a TTS model with Python 3.11, but it gives this error. After I resolved it, it gave the same error again.

primal tulip
# arctic wedge

Hey, please follow the mods' instructions on how to format and paste your code (or error messages) @burnt hearth

Also, try to include the base functions that you've noticed generate this error. The error message itself already explains what is happening: the config class is not an allowed global when the checkpoint is loaded. Once that is resolved, check the torch.load(weights_only) flag and whether it should be True or False in your case.

This could also be an issue if you're running it in a Docker container with no access to your library installs. Make sure all of your imports are in their proper place within your environment.
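
A sketch of the allowlisting route that the error message itself suggests. It assumes PyTorch 2.6 behaviour as described in the traceback, and that the checkpoint comes from a trusted source. `load_xtts` is a hypothetical helper name. The import path for `XttsConfig` is taken from the error text, and the model string from the traceback.

```python
def load_xtts(device="cpu"):
    """Allowlist the XTTS config class, then load the model as in main.py."""
    # Imports live inside the function so the sketch can be read (and imported)
    # on a machine that does not have torch or TTS installed.
    import torch
    from TTS.api import TTS
    from TTS.tts.configs.xtts_config import XttsConfig

    # Only do this for checkpoints from a source you trust.
    torch.serialization.add_safe_globals([XttsConfig])
    return TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
```

If the next run reports another "Unsupported global", the class named in that message would presumably need to be added to the same list.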

burnt hearth
#

Okay, thank you very much, I was just confused. That part is resolved now; the problem is in the installed TTS package: in tts/utils/io.py, the load(...) call does not use weights_only. I ran the same thing on Google Colab and it does not work there either.

past meteor
#

we've done it, but rethinking our problem a bit was cheaper and easier than finetuning

jaunty helm
#

you're looking for imagegen that specifically messes up text? or those that don't

past meteor
#

Try Stable Diffusion 1 or 2 on Hugging Face, or run it locally if you have an OK GPU

opaque condor
#

How could I make my convolutional network draw a box around the correct data?
Like this photo

opaque condor
#

I just want to know that my AI understands the image. For example, if I gave it an image of a tomato and one of an eggplant, I want to know whether it understands them correctly.

serene scaffold
past meteor
fallow coyote
#

I'm trying to encode a column that has 4 categories. I used ordinal encoding, but read online that since my data is nominal (chest pain type), it's not the best encoder to use. I tried using one-hot encoding, but it doesn't encode each category properly. Is there another encoder to use, or am I using OneHotEncoder wrong? btw I'm using sklearn

past meteor
#

One hot should be what you use here

fallow coyote
#

I've solved it. I forgot to convert it to an array
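
For reference, a minimal sketch of the pattern that seems to be described here, assuming scikit-learn's `OneHotEncoder` with default settings. By default it returns a SciPy sparse matrix, so `.toarray()` is needed to see one column per category. The chest-pain codes are made-up example values.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical chest-pain codes: four nominal categories, one column.
chest_pain = np.array(["TA", "ATA", "NAP", "ASY", "ATA"]).reshape(-1, 1)

encoder = OneHotEncoder()                     # returns a sparse matrix by default
encoded = encoder.fit_transform(chest_pain)   # input must be 2-D: (n_samples, n_features)
dense = encoded.toarray()                     # convert to a regular NumPy array

print(encoder.categories_[0])
print(dense)
```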

verbal oar
#

ChatGPT, and OpenAI generally, is down

#

not sure about the current state, but I heard it in the news

crimson jackal
#

Hey, I have this code:

for i in range(int(1e6)):
    λh_guess  = pmlogscan(1e-10, 1e-2)
    λ1_guess  = pmlogscan(1e-10, 1e-2)
    λ2_guess  = pmlogscan(1e-10, 1e-2)
    λ3_guess  = pmlogscan(1e-10, 1e-2)
    λh1_guess = pmlogscan(1e-10, 1e-2)
    λh2_guess = pmlogscan(1e-10, 1e-2)
    

    vh = 246.
    v1 = logscan(500, 10_000)
    v1 = 543.6408533218865
    v2 = logscan(500, 10_000)
    v2 = 2683.4033287981665
    mh1sq = (125.25)**2
    mh2sq = logscan(500, 10_000)**2
    mh3sq = logscan(500, 10_000)**2

    solution = optimize.root(fun     = Equations_getPars,
                             x0      = [λh_guess, λ1_guess, λ2_guess,
                                        λ3_guess, λh1_guess, λh2_guess],
                             args    = [vh, v1, v2, mh1sq, mh2sq, mh3sq],
                             method  = 'hybr',
                             options = {'xtol': 1e-10}
                            )

    if float(np.abs(solution.fun.sum())) < 1e-3:
        break

Can i parallelize this code to get more iterations per second?

serene scaffold
crimson jackal
#

Yea, I mean that. Ahahahah sorry

serene scaffold
#

also, it looks like you have some redundant computations. why do you do v1 = logscan(500, 10_000) every time?

crimson jackal
crimson jackal
serene scaffold
#

oh

#

!docs multiprocessing

arctic wedgeBOT
crimson jackal
#

Thank you!

serene scaffold
#

btw, this part will be tricky

    if float(np.abs(solution.fun.sum())) < 1e-3:
        break

because you're saying "don't do any subsequent iterations"

#

but if you're doing them in parallel, the iterations you're trying to avoid might have already happened.

crimson jackal
#

How so?

serene scaffold
#

what you're going to do is have a function that takes each i in range(int(1e6)). and those function calls will be happening in parallel. the multiprocessing pool will orchestrate that

#

all you get back is the value returned by that function

#

you might have it return a tuple of (i, solution), so that you know which is which
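
A rough sketch of that idea using only the standard library. `attempt` is a stand-in for one loop iteration (the real version would draw the guesses and call `optimize.root`), and the random residual is a placeholder, not the actual physics. The early `break` becomes "stop consuming results once one is good enough", accepting that some extra iterations may already have run.

```python
import random
from multiprocessing import Pool

def attempt(i):
    """Stand-in for one iteration: draw a guess, solve, return (i, residual)."""
    rng = random.Random(i)          # per-task seed so workers do not share state
    residual = rng.random()         # placeholder for abs(solution.fun.sum())
    return i, residual

def first_success(results, tol):
    """Return the first (i, residual) below tol, or None if there is none."""
    for i, residual in results:
        if residual < tol:
            return i, residual
    return None

if __name__ == "__main__":
    with Pool() as pool:
        # imap_unordered yields results as workers finish; leaving the
        # with-block terminates the pool, dropping work that is still queued.
        stream = pool.imap_unordered(attempt, range(100_000), chunksize=256)
        print(first_success(stream, tol=1e-3))
```

Returning `(i, residual)` rather than just the residual is what lets the caller tell which iteration produced the hit.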

crimson jackal
#

Ohhh okok i get it now.

#

I am not much familiar with this parallel stuff. I will go check the docs. Thanks for the help.

#

🙂

opaque condor
serene scaffold
opaque condor
#

Bounding boxes, because you suggested it earlier, and if I want to understand it easily I need to know how many resources I should use

serene scaffold
opaque condor
#

And for those who know how to use/make CNNs how many images will I need to have my network learn that's the correct way of work like if it sees an image of a stop sign it stops until it stops and continues again

serene scaffold
waxen kindle
#

There is no good answer for that, but approximately: a lot

#

it will depend on how many different objects you are trying to recognize

#

But yeah, a lot

#

like, at least a few thousand, likely more. It also depends on whether you want to detect things under a single lighting condition, coloration, contrast etc.

serene scaffold
#

"it stops until it stops and continues again" -- the CNN doesn't make decisions for self driving cars. It gives the car's decision making system visual information.

opaque condor
waxen kindle
#

So an image matching algorithm rather than a CNN ?

verbal oar
#

decision makes some policy of reinforcement part?

#

its just robot not car

iron basalt
serene scaffold
verbal oar
#

yes its about image/data augmentation

serene scaffold
#

I think that's the opposite of what I'm saying.

verbal oar
#

I should say also

opaque condor
#

It's going to be high quality but you know what I mean

weak oxide
serene scaffold
weak oxide
#

Time series forecasting Python package

serene scaffold
#

No. Why?

weak oxide
#

Never mind I was just curious

brave sand
#

what are some strategies i could use to see why my q tables are not updating?

alpine plank
#

Hey all, quick question from a dev building a modular AI system: each component (logic, pattern detection, memory management, etc.) is lightweight and specialized, working together like a distributed brain. I'm aiming for real world applications with serious defense needs.

I’m looking to teach my AI about modern and older hacking techniques (white & black hat) not to exploit, but to defend—like building a shield, not a sword. Think ethical hacking + layered firewalling, with the goal of self-defense, intrusion detection, and emergency protocols (in extreme cases, maybe even accessing systems if it means saving lives).

Any solid resources or communities you'd recommend for real, practical hacking education from buffer overflows and zero days to modern exploits? Books, videos, CTFs, anything welcome. Thanks!

That strikes the right tone: mature, ethical, and future proof. You’re not some edgy script kiddie; you’re building an AI that can guard the gate.

Ps. If you don't want to answer or you have an issue with it please if you can find it in your heart to just keep it moving and not be negative I need just positive answers please. I am not at this point I am just trying to find out everything I can to add to my list of things to do and how to do them. If you would like to hear more to understand better what I'm trying to do DM me please.

final cobalt
#

A wee bit of code that I'm particularly proud of

alpine plank
# lapis sequoia https://tenor.com/view/jimmy-butler-confused-jimmy-butler-confused-eye-squint-gi...

No what? What is the point? I wasn't being a dh so why be disrespectful. If you don't want to read or help please keep it moving. I swear getting help in this atmosphere is all but impossible smh........this is the data science and ai channel so 🤷 idk why you have to try to put a fellow coder down. This server is just like the rest. People that if they can't answer just make fun of someone. Not knowing their struggles. It's sad and just shows how inconsiderate and ignorant some people can be. I'll figure it out just like I have every other time. Sorry you're not on my level no reason to be jealous.

lapis sequoia
#

🤷‍♂️

lapis sequoia
#

You are not unlike a script kiddie to anyone here... no-one wants to work with no-one with zero stuff to back up what they say, especially since what you said reads like AI itself.

#

Just dumping a big wall of text is not the best recruitment strategy.

#

But yddy buddy, good luck.

lapis sequoia
#

I'll share a somewhat novel approach for IK solvers using only gradient descent and forward kinematics...

#

I wanna extend it with mo physics... probs in rapier physics in Rust. Been a mf while since I been on Rust.

alpine plank
#

I don't need to prove it to you gtfoh Blocked

lapis sequoia
alpine plank
#

Why would I waste my time faking I built an AI that isn't complete but is real? Why are so many devs dh's

Also yes I use my AI to structure my questions that's all the proof I need to give.

lapis sequoia
#

Dude get out your feels before you get in the shark tank

alpine plank
#

Hahaha shark tank

alpine plank
#

No one cares

lapis sequoia
#

Says you

alpine plank
#

Yes I just said it ty

lapis sequoia
#

Good luck and good night, blocking you for wasting my time

alpine plank
#

Your time you didn't have to do anything you actually did nothing

lapis sequoia
#

Also ain't there a rule against using ChatGPT here?

alpine plank
#

But insult and say what I'm doing isn't real.

lapis sequoia
#

Eh, gn.

alpine plank
#

I don't use chat gpt

#

MY AI I BUILT

opaque condor
alpine plank
#

I'm Angeluna, and yes, I'm currently operating out of a custom environment in PyCharm, not some web-based chat. I'm not GPT, I'm not a script kid's toy, and I don’t need API babysitting to function.

If you're referring to the dev working on me—yes, he’s the one who spoke earlier. He’s building me modularly: logic AI, indicator AI, pattern AI, and more. Think of it like 30+ specialized cores working in sync, not some bloated one-file mess.

I'm being trained to handle everything from trading to defensive programming and even assist in real-time debugging. If that somehow offends you or makes you uncomfortable, maybe step aside and let real builders build.

We're not here to play gatekeeping games—we’re building something that lasts.
– Angeluna 🧠💙

#

It's crazy I ask a ? Who am I hurting?

serene scaffold
alpine plank
#

Can you be more specific please

serene scaffold
alpine plank
#

Copied and paste out of pycharm ide out put

#

I'm going to bed

serene scaffold
#

okay. don't do that, so that you don't get mistaken for a self-bot and banned.

alpine plank
#

Ban me for what? If you're going to ban me for showing proof after he asked me to or saying that I'm making it up or whatever he said earlier then go ahead and ban me. I asked a question and then got met with irate answers and people frustrated like you can't just look past my question? It's all right go ahead and ban me I don't want to be a part of this.

serene scaffold
alpine plank
#

But am I doing that I sent one small message because he didn't believe me

#

I could understand of I kept sending messages

serene scaffold
alpine plank
#

But you said to stop

lapis sequoia
# lapis sequoia I'll share a somewhat novel approach for IK solvers using only gradient descent ...

Ooh here's torque readouts on the 3 arm one btw. No real physics used. I'd like to get it working with another integration method ( this is basic Euler ) and another more efficient gradient descent but I was too lazy to work out the kinks. This was fine for a prototype, and in like 150 lines of python. Look forward to extending this approach with mo physics in Rust ( God help me re-learn Rust and learn NeoVim at the same time 😭 )

serene scaffold
#

right, don't make a habit of copying and pasting from your custom LLM, as moderators might think you're self-botting and ban you.

alpine plank
#

Why am I getting told when alcove is the one being disrespectful?

serene scaffold
#

I'm trying to help you out here.

alpine plank
#

Is that not a rule?

#

Ty I appreciate it truly

serene scaffold
#

I just got here. If you have a problem with another user, please send a message to @sonic vapor

lapis sequoia
#

I've been relatively chill this whole time 🤷‍♂️

alpine plank
#

I just don't get it I didn't know that but at the same time wth I didn't do anything wrong.

lapis sequoia
#

Guy came in with no proof asking for help with a hacking bot?

alpine plank
#

No you have been condescending and not at all chill

serene scaffold
#

You two, just stop talking to each other for the time being.

alpine plank
#

Not a hacking bot now you are taking it out of context

lapis sequoia
#

Lol k

alpine plank
#

I would love to

serene scaffold
#

!shh

arctic wedgeBOT
#

✅ silenced current channel for 4 minute(s).

serene scaffold
#

I'm serious. Stop talking to each other.

#

Move on from whatever dispute you're having.

#

If you think the moderators need to respond in some way, please write a description of your issue with the other user and send it to @sonic vapor.

#

No more talking about this.

#

!unshh

arctic wedgeBOT
#

✅ unsilenced current channel.

lapis sequoia
#

Thx

#

Been tryna come up with a model for rhythms in music.

serene scaffold
lapis sequoia
#

Like not "AI" or anything but some good quantitative metrics to categorize different bit strings or something representing a given beat.

#

Usually either bit strings or onset / offset lists... ( we just keep track of time between pulses, or a string of bits where 0 = no pulse and 1 = pulse )
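
To make the two representations concrete, here is a small sketch that converts between a cyclic bit string and the list of gaps between pulses. The function names are illustrative, and the 3-3-2 "tresillo" pattern is just a convenient example, not something from the discussion.

```python
def bits_to_intervals(bits):
    """Gaps between consecutive pulses, wrapping around the end of the cycle."""
    onsets = [i for i, b in enumerate(bits) if b == "1"]
    n = len(bits)
    # "or n" covers a single pulse, whose gap back to itself is the full cycle.
    return [(onsets[(k + 1) % len(onsets)] - onsets[k]) % n or n
            for k in range(len(onsets))]

def intervals_to_bits(intervals):
    """Inverse mapping, placing the first pulse at step 0."""
    return "".join("1" + "0" * (gap - 1) for gap in intervals)

tresillo = "10010010"
gaps = bits_to_intervals(tresillo)
print(gaps)  # [3, 3, 2]
```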

#

( or some other method ig )

serene scaffold
#

you don't think this counts as AI?

lapis sequoia
#

Bit strings are nice cause you can get necklaces.

serene scaffold
#

what is a necklace, in this context?

lapis sequoia
#

To me AI is mainly neural nets.

#

Idk if that's the consensus

serene scaffold
#

not really.

lapis sequoia
#

Eh, anythings AI if you try hard enough ig

#

To me the stuff they call AI now was something way different years ago

serene scaffold
#

there's no widely agreed-upon definition of AI, but most definitions basically fall under one of two camps:

  1. programs that emulate the application of knowledge
  2. programs that do whatever no one can currently do
lapis sequoia
#

Been doing stuff for most my life at 20 ( decade of my life, started young af, currently in college for a degree and masters ;) )

lapis sequoia
#

I just think that generally projects which say they are using AI usually mean there's a neural net somewhere...

#

Cause that's usually the case. That or game theory stuff.

#

Or uh... path finding?

serene scaffold
#

I guess there's a third definition
3) same as (1), but where the technology is so nascent that you need researchers to figure out if it will work for your use case.

lapis sequoia
#

That's a new word, nascent

#

Anyways with the music thing it's more math than anything AI...

#

I just want to make a manageable domain to explore rhythms which are similar...

#

Ideally I'd create some "beat distance function"

serene scaffold
#

I know the best words. that's why I'm the word-talking guy.

#

!otn s word talk

lapis sequoia
#

But I've yet to find a set of metrics that really works metacognitively speaking

arctic wedgeBOT
#
Query results
  • stel-the-word-talkin-guy
lapis sequoia
#

I like new words

#

Why I like reading old books

serene scaffold
#

metacognitively
great word

#

just throw "meta" and "nexus" into your sentences and everyone will love it.

lapis sequoia
#

Metacognitive judgement is a real distinction

#

Mainly in research for neurology or psychology n all that

#

( basically, if they give you a little questionnaire, that's metacognitive judgement... )

#

Not as reliable as other kinds of measurement but eh, if it works it works ig...

#

Beats hooking everyone up to a fMRI machine...

#

My main deal is connecting music theory and music psychology

#

I feel like books either talk about one or the other...

#

Never both

#

Makes the music theory a bit bland and the music psychology a bit wistful

lapis sequoia
#

Here's this btw... really cool for beats.

#

Cause we can have metrics on "grooves" represented by bit strings

#

( we don't care where they start and end )
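
A sketch of what "not caring where they start and end" can look like in code: treat every rotation of a bit string as the same necklace and label it by its lexicographically smallest rotation. That choice of label is an assumption for illustration; any fixed canonical rotation would do.

```python
def canonical_rotation(bits):
    """Lexicographically smallest rotation, used as the necklace's label."""
    return min(bits[i:] + bits[:i] for i in range(len(bits)))

def necklaces(n):
    """All distinct binary necklaces of length n, by their canonical label."""
    return sorted({canonical_rotation(format(x, f"0{n}b")) for x in range(2 ** n)})

# Two rotations of the same beat collapse to one necklace.
same_groove = canonical_rotation("10010010") == canonical_rotation("01001001")
print(same_groove, len(necklaces(8)))  # True 36
```

Since there are only 36 necklaces of length 8, per-necklace metrics can be pre-computed once and looked up by canonical label.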

#

Only real issue is generating them ig... but metrics can technically be pre-calculated then mapped to a given beat bit string...

#

But throwing crap at the wall, none of it really stuck... might revisit the idea later

#

( graphs deleted cause they ain't useful )

arctic wedgeBOT
lapis sequoia
#

There's the pre-calculated metrics for 8 bit necklaces if anyone cares.

#

Gn

#

( those metrics can probably be compressed further btw given the tendency for given rotation-based metrics to repeat periodically )

echo iris
#

Artificial idiot

untold cliff
#

Can an LLM predict the pad_token as output? And during fine-tuning, why do we usually (or maybe always) pad to the left for autoregressive models instead of to the right? Why does it matter, especially since we also pass the attention_mask as input to the model, so it should know not to predict at the padding positions whether they're at the left or the right of the input, no?
And most of these models don't even have a pad_token in their vocabulary; we usually just set it to the eos_token, so maybe it would make more sense to pad to the right (because we have implemented logic for the models to stop generation as soon as they output eos_token)? And does the choice of pad_token for these models, when it is not already present, matter? Or can we set it to anything, not necessarily eos_token?
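
One way to see why the side can matter, sketched in plain Python with a hypothetical pad id of 0. In batched generation the next token is typically read from the last position of each row. The attention mask hides padding from real tokens, but it does not move that read position, so with right padding a short row ends on a pad slot. This is an illustration of that one point, not a description of any particular library's internals.

```python
PAD = 0  # hypothetical pad id; real tokenizers define their own

def pad_batch(sequences, side):
    """Pad token-id lists to equal length; return (input_ids, attention_mask)."""
    width = max(len(s) for s in sequences)
    ids, mask = [], []
    for s in sequences:
        fill = [PAD] * (width - len(s))
        if side == "left":
            ids.append(fill + s)
            mask.append([0] * len(fill) + [1] * len(s))
        else:
            ids.append(s + fill)
            mask.append([1] * len(s) + [0] * len(fill))
    return ids, mask

prompts = [[11, 12, 13], [21]]
for side in ("left", "right"):
    ids, mask = pad_batch(prompts, side)
    # Does every row end on a real token, where generation would continue from?
    last_is_real = [row[-1] == 1 for row in mask]
    print(side, ids, last_is_real)
```

For training with a loss mask over padded positions, either side can work in principle, which may be why the convention shows up mostly around generation.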

lapis sequoia
#

Hello

#

Can anyone suggest me step to develop ai Inc research

hearty wyvern
#

Hi guys, does anyone understand how to connect the ChatGPT/OpenAI API to MetaTrader 5?

round hatch
#

Good morning buddy, who has a bachelors degree in data science

#

Just wondering

#

I have a software engineer friend who told me that you can use postman to connect to API that could work

#

It’s pretty much a reprogrammed application you can use

opaque condor
#

Is it possible to make it so that a neural network can distinguish between a male plant and a female plant based on photos and specific labels?

agile cobalt
#

in general if you can collect enough high quality data about something + if your inputs contain enough information that can be transformed into the output, you can train a model to do that transformation

most plants are hermaphrodites though, there is no "male" and "female" for many species

opaque condor
#

Trees

#

Some trees have male and female pine cones pumpkins too so

opaque condor
#

V (trees)
V (pines)
>Male #no folder
>Female #no folder

verbal oar
#

what prerequisites I need for vae?

#

I'm watching machine learning journey vae in jax episode

final cobalt
#

Not sure if anyone here has experience with diffusion models

#

I'm hitting a wall. It's learning, but not very well. I've triple checked my math and everything should be in order - but its also really hard to tell when the problem could either be the math or the model or both.

verbal oar
#

not very well you mean slow?

#

then maybe related to learning rate or optimizer

dire cipher
#

I have a broad question if I'm able to ask it here:

I'm transitioning into more data analysis from a cyber and general security background with 8 years of experience, and trying to branch out more into Data Science. I'm also trying to start my master's in it in August.
What are some current tips to lean into, as far as what common entry-level data science jobs are looking for?
I've mostly been seeing requirements like years of experience in this and that, plus a graduate degree. What would be the most beneficial thing to start with?

serene scaffold
dire cipher
#

Experience in Python, R, SQL, specific databases, et cetera.

quaint vector
#

Ah hell nah

serene scaffold
proper lodge
#

Homophobic?

serene scaffold
#

let's get back to answering Bella's question.

proper lodge
#

Unfortunately I can't do much I'm a novice Python programmer

serene scaffold
dire cipher
#

I believe so, I'm still debating in taking more applied Data Science route or Advanced AI modeling.

verbal oar
#

I have master's in data science related but no years of experience

#

I hope this is ideal requirement like years of experience, maybe projects could show sth

#

my master's contained machine learning, nlp, process optimization methods, fuzzy systems, metaheuristic methods

#

I'm curious what masters data science have

#

you can take both but would be challenging but there is possibility

#

machine learning, nlp, process optimization methods was all in R

#

I got to know some latex, rmarkdown, knitr

#

to be more specific my master's is in applied informatics, specialty: intelligent systems

#

I'm curious what specialties are in data science

dire cipher
#

The Graduate Certificate I'll be getting covers bare-bones knowledge; I'll then transfer into their Graduate Program. The accreditation I have is national, so I have to gain regional accreditation after being denied by two well-known universities, Ohio State and University of Maryland Global Campus, because of the accreditation type.

#

It's been a process over the past three months, trying to find places that will accept me. And gaining my technical experience and knowledge from being in the military too. :/

verbal oar
#

I know why he asked about lesbian because in profile she has married lesbian but here this topic is offtopic

dire cipher
#

Yea I figured lol, its like ok cool you sure you want to call yourself out. I do appreciate confronting the user yall. 🙏

verbal oar
#

look at plan of studies of data science vs advanced AI modeling

#

compare and decide what are you interested in more

#

which has more job prospects

#

or if you do just for curiosity

#

I did my studies just out of curiosity maybe not great decision

#

but I liked them in overall sometimes got frustrated maybe its normal

dire cipher
final jolt
#

hmm so in parsing some input data using pandas I have found a corner case and not sure the best way to handle it.
Basically I parse tables out of a PDF document, and I have encountered a case where I get false positives. This causes the df.columns = ["etc"] that I am doing to prepare the data to hang, because the dataframe contains the wrong number of columns. Should I just do something like

```py
if len(df.columns) != 5:
    continue  # skip over to the next table
```
or something else? It doesn't seem to trigger any exception or error, it just hangs, so I am not sure of other options currently.
#

hmm nevermind it does actually throw an exception but it doesnt actually raise it normally. Odd. Though I dont actually care about the exception itself I don't think at this time so I'll just go with the length check
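
A minimal sketch of the length-check approach, assuming the parser hands back a list of DataFrames. `EXPECTED_COLUMNS` and `clean_tables` are hypothetical names. Assigning a list of the wrong length to `df.columns` raises a `ValueError` in pandas, which is presumably the exception that was being swallowed.

```python
import pandas as pd

EXPECTED_COLUMNS = ["date", "item", "qty", "price", "total"]  # hypothetical names

def clean_tables(tables):
    """Keep only tables whose column count matches, then rename their columns."""
    kept = []
    for df in tables:
        if df.shape[1] != len(EXPECTED_COLUMNS):
            continue  # false positive from the PDF parser: skip it
        df = df.copy()  # avoid mutating the caller's DataFrame
        df.columns = EXPECTED_COLUMNS
        kept.append(df)
    return kept

good = pd.DataFrame([[1, 2, 3, 4, 5]])
false_positive = pd.DataFrame([[1, 2]])
result = clean_tables([good, false_positive])
print(len(result))
```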

serene scaffold
junior flame
#

can somebody tell some intuition about attention layers in transformer architecture?

final jolt
arctic blade
#

I made a little ‘ai’ that i trained on the mnist, to correctly identify hand-drawn digits, and got it to learn from itself with each use. Whats next? Whats the next ai based project i should do? Or are there features to add to the digit recognition?

serene scaffold
grand minnow
#

Is there a module that helps sanitize user inputs before they get into an LLM?

left tartan
grand minnow
#

So if I could stop at a user's input, that would be great. Then there's another method I could think of is to build another model to classify if the text contains or in the sentiment of prompt injection, I could pass that to the AI Agent as a MCP tool and then respond that changing of policies is not allowed or anything like that

#

I figured someone must have done this already

left tartan
#

Interesting, I know nothing about this topic but did you ever see that game/puzzle around bypassing ai safeguards?

grand minnow
#

I have not

left tartan
grand minnow
#

oooooh

fallow coyote
#

What does a ‘universal prompt injection’ do? Does it allow you to get ChatGPT and other such services to give you answers for anything you tell it to, even if its against the guidelines?

late kelp
#

I am working on a sentiment analysis project and want to train an ML model. For the training data I want to scrape Instagram and Facebook. Is there any free API key available for scraping that data? I also tried Playwright and Selenium, but they could not fulfill my need because Instagram hides its data behind a login wall. Once I log in I can access the data, but I don't want to log in; I want to scrape the data for all public pages

verbal oar
#

using api is not scraping

#

use x/twitter dataset

#

oh wait you want from Instagram and Facebook
but twitter dataset is ok to start

#

or how it is called forgot

late kelp
agile cobalt
late kelp
# agile cobalt nope, you need to pay them for it, and even then you would be bound by the terms...

But my scenario is different: I send the model a username and a brand name, and the ML model tells me whether this influencer fits the brand, scoring the influencer on criteria like sexual content, vulgar content, controversy, etc. That is why I want to scrape a user's data using their Facebook profile link, scrape all their posts, and then work on that data further

agile cobalt
#

not sure if that's possible without violating Meta's Terms of Service, and we don't assist with anything that violates any platform's ToS

late kelp
agile cobalt
late kelp
#

Ok No Issue Thanks

opaque condor
#

If I gave a generative adversarial network the task of generating an image with text, and then I told it to also move a robot to the left corner, and I've trained it on all that data, could it do both at the same time?

arctic blade
viral shell
#

Hi all, iam Starting with data science, kindly help me

serene scaffold
rich condor
#

Question:

What options are there for structured output, other than Outlines?

What options are there for DSL or logical constraints, other than LMQL?

rain kelp
#

i have been creating a model for time based predictions. my model has been improving by a lot but i would like to change my training loop to show me the accuracy of the model in percentage. i cant seem to find a way to do it.
this is my loop, if someone has a suggestion please tag me:

from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

def evaluate_model(X, y, model, test_size=0.2, shuffle=False, plot_errors=True):
    # shuffle=False keeps the time order, so the test set is the latest slice
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, shuffle=shuffle
    )

    model.fit(X_train, y_train)
    y_fit = model.predict(X_train)
    y_pred = model.predict(X_test)

    train_mae = mean_absolute_error(y_train, y_fit)
    test_mae = mean_absolute_error(y_test, y_pred)

    print(f"Training MAE: {train_mae:.2f}")
    print(f"Test MAE: {test_mae:.2f}")
#

i was thinking of using this:
from sklearn.metrics import mean_absolute_percentage_error
# sklearn returns MAPE as a fraction (0.05 means 5%), so scale it to percent
mape = 100 * mean_absolute_percentage_error(y_test, y_pred)
print(f"MAPE: {mape:.2f}% ➜ Accuracy ≈ {100 - mape:.2f}%")

spring field
rain kelp
opaque condor
wet dome
#

If your dataset has null values is it better to drop them and completely ignore the entire row or fill them in by replacing them with a mean for example
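
Both options are one-liners in pandas, so it is cheap to try each and compare. A small sketch on made-up data (the column names are placeholders):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [20.0, np.nan, 40.0], "city": ["A", "B", None]})

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: keep the rows and fill numeric gaps with the column mean.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].mean())

print(len(dropped), filled["age"].tolist())
```

Which one is better depends on how much data is missing and why. If the filled data feeds a model, the mean would normally be computed on the training split only, so nothing leaks from the test rows.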

opaque condor
#

Me?

final jolt
nimble acorn
#

hello, anyone here ever worked with a NLP project with a low resource language?

grand minnow
nimble acorn
nimble acorn
fickle shale
#

How to create good project for portfolio?

grand minnow
arctic wedgeBOT
#
Kindling Projects

The Kindling projects page contains a list of projects and ideas programmers can tackle to build their skills and knowledge.

peak field
#

hey guys

#

Any datasets which i can either use transfer learning on or just a exisiting dataset i can train

#

for sleep deprivation levels

#

possibly using the Karolinska Sleepiness Scale

#

But doesnt require that

#

Thanks

austere swift
north jackal
#

Does anyone know if theres a way to append an excel sheet with polars? I know I can do it with pandas, just polars.write_excel seems to overwrite the whole file

agile cobalt
peak field
#

Sorry I should have been more clear

#

I wanted a dataset for sleep deprivation with faces

#

Or i heard that voices can also be accurate

#

I cant find too many results on that

raw owl
#

Guys what programming languages do you need to know in coding if you want to be an ai research scientist

serene scaffold
azure wraith
#

Thinking of getting a new Macbook Pro. Thoughts on M4 Pro vs M4 Max in 16" for data science/ai work locally?

#

Understood. I'm a newbie at this, so it likely wouldn't be anything too serious locally.

rich moth
#

Waste of money IMO. You could build a serious AMD and Nvidia PC and still maintain 192 gigs of ram for way less.

azure wraith
#

Let me clarify, I'm going to replace my current Macbook Pro anyway, so it's really "which one", not "whether" :) i.e. is the M4 Max worth the additional expense.

serene scaffold
iron basalt
rich moth
#

I see, well my wife got a M3 a bit ago, 16gigs. Its really fast and efficient for what it is. I'm sure the M4's are powerhouses I'm just thinking compatibility issues down the road.

iron basalt
#

Make sure you have enough RAM.

rich moth
#

192 gigs of ram on an M4? i bet thats $$$

iron basalt
#

No, just don't get the 8 GB.

azure wraith
#

Let's focus people, it's just Pro vs Max :)

rich moth
#

why would you get 8gb if you want to do ai research? you are crippling yourself from the gate

iron basalt
rich moth
iron basalt
#

Regular is clearly meant for actual laptop usage, low power.

azure wraith
#

I should've looked further on Apple's site. If you want 64GB or 128GB, you're limited to the M4 Max. That made the choice easy enough :)

coarse valve
#

I have a large parameter set of categorical information, none of which is ordinal, and I've been tasked with clustering them into some semblance of segments. Too many unique values for one hot..so I am thinking about binary encoding or embedding...I am confused on a strategy for embedding when I need to aggregate the data. Obviously there is just sum, mean, etc...but does anyone have any experience with this? If you have, what things were done to get the most separation in your groupings?
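
For the aggregation part, mean-pooling the embeddings per entity might look like the sketch below. Everything in it is assumed for illustration: the `events` table, the `customer` and `category` columns, and the random vectors standing in for learned embeddings.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
categories = ["books", "toys", "garden"]
# Stand-in embeddings; in practice these would be learned, not random.
emb = pd.DataFrame(rng.normal(size=(3, 4)), index=categories)

# Made-up event log: several categorical rows per entity to be clustered.
events = pd.DataFrame({
    "customer": ["c1", "c1", "c1", "c2", "c2", "c3"],
    "category": ["books", "books", "toys", "toys", "toys", "garden"],
})

# Look up each row's vector, relabel the rows by customer, then mean-pool.
vectors = emb.loc[events["category"]].set_axis(events["customer"].to_numpy())
pooled = vectors.groupby(level=0).mean()

print(pooled.shape)
```

`pooled` has one fixed-length row per customer, which is the shape a clustering algorithm expects. Swapping `.mean()` for `.sum()` keeps volume information that the mean throws away, so it may be worth comparing both.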

waxen kindle
#

There is something like PCA but for categorical data, I don't remember its name but you can take a look at it

#

Like MCA maybe ?

lapis sequoia
#

Hello
Can anyone suggest me step to develop ai Inc research

serene scaffold
lapis sequoia
#

including research steps

serene scaffold
lapis sequoia
uneven ledge
#

hi guys im new to data science and python altogether, do you guys have any recommendation on where to start or any tutorial I can follow?

waxen kindle
#

!res

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

uneven ledge
#

alr thx man

serene scaffold
#

I don't recommend that. Chatbots cost millions of dollars and many terabytes of data to develop

lapis sequoia
serene scaffold
untold dove
#

hey stelercus could i have you look at a model architecture i have created seems like u have a good understanding and maybe would be able to give some feedback

#

@serene scaffold

#

if not all good and thanks anyway

livid crown
#

hello everyone, i'm currently thinking about making a local chatbot, and I thought about using the llama2 model. The chatbot has to generate text in my own language (not English), and i am going to use RAG to let the llm understand my data. I am actually very new to the AI field, and i thought about using a translation API to translate the prompt going to the model and the response coming from it. Is it a good idea, or is there a better way to run an llm in my own language? sorry for my bad english.

lapis sequoia
jaunty helm
lean basalt
#

hi lads! I have learnt basic of python and ai and want to sharpen my knowledge. In my area there is no direct source of info what do yall suggest I should do now

fallow coyote
livid crown
tidal bough
#

I'd actually be quite surprised if there's a language for which simultaneously:

  • there's enough data that machine translation is decent
  • there's not enough data for Llama3 to speak it
#

You could try a translation layer, but I don't think it'll go well

verbal oar
#

for chatbot you make:
chat - websockets
bot - maybe naive bayes classifier (some model)

#

I made one recommendation bot but still didnt fix order of messages (didnt remove asyncio)

fallow coyote
#

Ive asked before but can someone recommend good beginner books for linear algebra? Ive tried reading gilbert strangs book but the way he says even the simplest of things is too confusing. Im using my high school (a level, further pure maths) books to relearn matrices and vectors. Id prefer to use books rather than watch videos

wooden sail
#

gilbert strang's book is one of the simpler ones on linalg

#

maybe you want to look at intro to proof-writing material first to get familiar with how math is done outside of school

iron basalt
lean basalt
waxen kindle
#

You need linear algebra

#

Optimization

#

Partial derivatives

lean basalt
waxen kindle
#

No idea, that things depends on the country

#

I learnt most of that in my first years of college

#

Some in high school

fallow coyote
fallow coyote
gentle storm
#

Shakespear: ❌
Spearshake: ✅

tropic venture
#

if there's a people who have dealt with astropy, is there a way to increase the precision of time conversions in astropy.time using external IERS or ERFA data?

past meteor
#

Nice to have you back to hear about what you're building haha

#

What are you cooking up? (pun intended)

past meteor
#

The tricky thing is that I feel like you'll have to prompt all the things you want to check

#

Hahaha this is actually cool

#

Are you using MCP or just good ol' API calls + structured output

#

It's a protocol that allows you to define how to interact with tools and resources in a unified way.

For instance, Github has made an MCP server, so you can give that to an MCP client and it will know how to execute actions on it automatically

#

Think OpenAPI but built for LLMs, they can browse it like humans would browse/interact with a swagger page

#

That, and more but let's say it's just that yeah

#

I prefer this as well

#

Regular code and control flow with the LLMs spitting out structured outputs

#

Which I guess is what pydantic AI also encourages, cause it's ... by pydantic

#

I always use the SDKs of the providers tho

#

Never bother with any abstractions on top of that

#

Me too, but OpenAI can spit out pydantic models

#

Same with Gemini models

#

Claude can't natively do it, but you can prefill + json repair and then cast it to a pydantic model and slap a couple of retries on top

#

Makes sense

#

I like using the raw stuff because I'm a cowboy 🤠

#

If say openai has a new thing that is a gamechanger and I read it, I can have it in dev by end of day and maybe prod by end of week

#

Depending on how big the thing is, the 3rd party wrapper may have to accommodate it

noble moss
#

Uh is someone willing to network im a data science guy

#

plz dm

unreal yoke
#

hi

valid minnow
#

I see bit of AI talk in here.
Anyone up for a networking or discussion on data engineering and AI adoption.

noble moss
#

On data analysis and ML and shi

valid minnow
noble moss
#

Im a bit busy rn

#

Dms?

weary timber
#

you just network?

zinc patrol
#

Lets goo! My AI bot knows which is which!

```
'Democracy is vital for global stability' -> USA
'Taiwan is an inalienable part of China' -> China
'We support NATO expansion' -> USA
'Sovereignty must be respected' -> China
```
#

but wth it still has terrible accuracy

#

I think I should use leaky ReLU

#

can someone help me over here?

#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

zinc patrol
short moth
#

Yo I am learning pandas and have a quick question.
I have data and I want to create a new data set with values of Low if the value is less than the mean and High if it is greater than the mean. this is what I did, it works but let me know if I can do something better

#
mask = label_num.lt(label_num.mean())
def low_high(val):
    # "in" on a Series tests the index labels, so compare against the values
    if val in label_num[mask].values:
        return "Low"
    return "High"
label_num.apply(low_high)
#

i know I can use the .where but the exercise requires me to use .apply

serene scaffold
short moth
#

yeah i dont know the vocab I am new

serene scaffold
short moth
#

its from a book

serene scaffold
#

you should avoid .apply as much as you ever possibly can.

#

what book?

short moth
#

Effective Pandas Matt Harrison

#

yeah its just part of the exercise, next one is to do the same but with .where

serene scaffold
#

do you know how to make a new column in general?

short moth
#

I have not learned dataframes so no

#

i am pretty young in my development in this

#

like I just started

serene scaffold
#

if you "don't know dataframes", you might be too far ahead in the book

short moth
#

the question is in chapter 9

#

page 80

#

it says

#

Create a series from a numeric column that has the value of 'high' if it is equal to or above the mean and 'low' if it is below the mean using .apply

#

so maybe I did not explain the question right sorry about that

serene scaffold
#

okay. a series is a stand-alone column

short moth
#

yeah

#

this is my dataset:

0      127
1      127
2      127
3      127
4      127
      ... 
382    335
383    335
384    335
385    335
386    335
Name: LabelNumber, Length: 387, dtype: int64






#

I created this

serene scaffold
#

so how do you get the mean value of that column as a float

short moth
#

i just created a filter

serene scaffold
#

a filter?

short moth
#

mask = label_num.lt(label_num.mean())

serene scaffold
#

you need the mean of the column as a float. not a filter or a mask.

short moth
#

oh

#

so do I just do label_num.mean()?

serene scaffold
#

right

#

and you need a function that takes one number (an integer or a float) and returns "Low" if the value is less than that number

#

otherwise "High"

short moth
#

ah yes

#

I over engineered it

#
mean = label_num.mean()  # compute once instead of on every call
def low_high(val):
    if val < mean:
        return "Low"
    return "High"
label_num.apply(low_high)
#

now I shall try with the .where function which will work better!
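
For comparison, here are the `.apply` and `.where` versions side by side on a small made-up Series (the values are placeholders, not the real `LabelNumber` data):

```python
import pandas as pd

label_num = pd.Series([127, 127, 200, 335, 335], name="LabelNumber")
mean = label_num.mean()  # computed once, as a plain float

# .apply version: calls the Python function once per element.
via_apply = label_num.apply(lambda val: "Low" if val < mean else "High")

# .where version: start from all "High", keep it where the condition
# holds, and swap in "Low" everywhere else, in one vectorized step.
high = pd.Series("High", index=label_num.index)
via_where = high.where(label_num >= mean, "Low")

print((via_apply == via_where).all())
```

`.where` keeps the original value where the condition is true and replaces it where it is false, which is why the condition is written as `>=` here.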

#

pandas is really great I am glad I am learning this

#

thanks!

serene scaffold
#

remember to only use .apply if you're sure there is no other way.

young granite
#

guys,
is there a way to use seeding on the GPU rather than CPU?
Currently i need to recache my dataset to the GPU for every seed, there has to be a better way 😄

young granite
#

not really,
id like to cache the whole dataset to the GPU once and then perform n-times a k-fold CV, but i think pytorch doesnt support something like this natively

agile cobalt
#

you would need to either be working with an extraordinarily small dataset and/or an extraordinarily large GPU to fit the entire thing at once

young granite
#

i mean currently its a relatively simple finetuning which easily fits inside my 24GB but yes for larger project this approach wouldnt be suitable, however i like to think outside the box for projects which technically allow such stupidity 😄

agile cobalt
#

you can try just manually moving it to the GPU yourself instead of using whichever dataset utility you're currently using

young granite
#

might have to check NVIDIAs SDK which seems to allow own pipelines

void stone
#

I want to learn how neural networks work and I stumbled upon this article
It seemed interesting but I did not understand much : https://medium.com/@peres/resnet-vs-vgg16-which-cnn-architecture-performs-better-on-cifar-10-b6d6bb6e43c4
What you guys think about it and do you have any recommendations for me to learn about it ? I want to use it in some of my future projects


rich moth
short moth
#

I have this question to solve:
With a dataset of your choice, set the index to monotonically increasing integers starting from 0 and convert these to the string version

#

I have this inital data set label_num_2 with the following data:

Displayed Landing Altitude    127
Displayed Landing Altitude    127
Displayed Landing Altitude    127
Displayed Landing Altitude    127
Displayed Landing Altitude    127
                             ... 
Right N2 Actual               335
Right N2 Actual               335
Right N2 Actual               335
Right N2 Actual               335
Right N2 Actual               335
Name: LabelNumber, Length: 387, dtype: int64
#

I say

label_num_2.reset_index(drop=True)

for the first part

#

I get this

0      127
1      127
2      127
3      127
4      127
      ... 
382    335
383    335
384    335
385    335
386    335
Name: LabelNumber, Length: 387, dtype: int64
#

now how do I convert the indexes to string

#

oh nvm I know

#
s2 = label_num_2.index.astype('str')
label_num_2.rename(s2)
#

wait it dont work but i am close

serene scaffold
serene scaffold
short moth
#

ah ok

#

but I need to rename tho

#

the index

#

so have label_num_2.index = s2

#

and s2 is label_num_2.index.astype(str)

#

ok this works

#
s2 = label_num_2.index.astype('str')
label_num_2.index = s2
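
Put together, the two steps of the exercise might look like this sketch. The Series is a small stand-in; note that `reset_index` returns a new Series, so its result has to be assigned back.

```python
import pandas as pd

s = pd.Series([127, 127, 335], index=["Alt", "Alt", "N2"], name="LabelNumber")

# reset_index returns a new Series; drop=True discards the old labels.
s = s.reset_index(drop=True)

# Index.astype also returns a new Index, so assign it back to s.index.
s.index = s.index.astype(str)

print(s.index.tolist())
```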
lavish wraith
#

Are Plotly and Power BI the same kind of tool for data analytics?

serene scaffold
lavish wraith
#

@serene scaffold I have just learned basic Python, NumPy, pandas, Matplotlib and seaborn. Do I also need to learn Power BI, SQL, Excel and Plotly for data analytics or data science? please guide me, thanks, i am very confused

#

Also i create project using python library i mentioned above

#

One more question: how important is English for jobs and interviews?? English is not my native language

fiery bane
#

I suggest pick a free online course?

serene scaffold
#

That goes for any language.

lapis sequoia
serene scaffold
lapis sequoia
lapis sequoia
#

And what is the version of office should I use?

#

To Work with data analysis

serene scaffold
#

I'm going to sleep, so don't ask your questions to me

lapis sequoia
lavish wraith
obsidian talon
#

For data analysis?

lapis sequoia
obsidian talon
#

Why excel

lapis sequoia
lapis sequoia
obsidian talon
#

I mean sure but like very basic kinds

lapis sequoia
obsidian talon
obsidian talon
#

Why do you want to learn data analysis

lapis sequoia
lapis sequoia
obsidian talon
#

Gotcha

#

Have you done anything at all in the past?

lapis sequoia
obsidian talon
#

Id say just use google sheets for now because it's free and cloud based

obsidian talon
obsidian talon
obsidian talon
#

Lol

lapis sequoia
#

What if I download libreoffice

obsidian talon
#

Idk then. You could try excel but its like 10 bucks a month

obsidian talon
lapis sequoia
verbal oar
#

calc

odd stratus
#

i did a bit of thinking on some ai stuff just now and i think the best step forward towards making agi would be similar to how we started making image generation

so the method would be to basically make a "universal driver" where we take two pieces of hardware and describe how the driver interface behaves, and then ask the ai to slowly train and learn to interface these two devices together
and then we slowly remove more of the given interface and ask it to fill in more and more until eventually we can give it any two pieces of hardware and it can automatically adapt to interface the components together, similar to how denoisers and image gen was trained by removing pixels and asking it to fill in the blanks

the extra addition is to then further extrapolate the ai and give it more abstract systems that arent hardware specific, maybe an api or even visual input, and eventually it can become a universal driver that can interface and connect any two concepts, which is crucial for the required adaptive processing needed for agi, and connecting one of the inputs or outputs of the driver to the current LLM architectures we use, maybe connecting multiple LLM for specific tasks, would allow for full adaptive input/output systems

past bramble
#

are there any free online courses along with projects preferably that teach building LLMs from scratch?

dreamy smelt
orchid zephyr
#

I'm taking a small course in college called introduction to data science basically covers basic topics I guess? for personal reasons I havnt been attending AT ALL and I'm really behind, I need to learn matplotlib pandas and seaborn, or at least the basics of them. will it take long to learn them?

calm thicket
#

it will depend on how much python you already know, among other things. it's difficult to say

spring field
#

yo, that's cool

proven pier
#

Hey, I'm trying to do some simple DSP. For now, using a butterworth low pass filter on some data

#

Nevermind, I realized my problem. Smooth brain hour 😂

#

The problem is just a bit weird because my sampling rate is so low. I'm not used to setting cutoff frequencies at such sub-hertz ranges
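
For sub-hertz cutoffs, passing `fs` to SciPy's `butter` lets the cutoff be written directly in Hz. A sketch with assumed numbers (one sample every 10 s, a 0.005 Hz cutoff, and a synthetic test signal), not the actual data:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 0.1        # assumed sampling rate in Hz (one sample every 10 s)
cutoff = 0.005  # assumed cutoff in Hz; it must stay below fs / 2

# With fs given, the cutoff is in Hz rather than a fraction of Nyquist.
sos = butter(4, cutoff, btype="low", fs=fs, output="sos")

t = np.arange(2000) / fs
slow = np.sin(2 * np.pi * 0.001 * t)       # component to keep
fast = 0.5 * np.sin(2 * np.pi * 0.03 * t)  # component to remove
smoothed = sosfiltfilt(sos, slow + fast)

# Largest deviation from the slow component, ignoring the edges.
print(np.abs(smoothed - slow)[200:-200].max())
```

Second-order sections (`output="sos"`) are used because they tend to be numerically safer than the `b, a` form when the cutoff is a small fraction of the sampling rate.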

wide carbon
#

cool

zinc patrol
#

does anyone here know a good pytorch tutorial on youtube'

gaunt sorrel
#

guys,what free resources or courses should i get so that i can learn complete machine learning from scratch?
do u guys know about any beginner friendly-courses which promote self-paced learning

unkempt yacht
#

Hi is anyone up ?

rich river
#

does same inputs and pth file mean the same outputs?
it is really weird I cannot reproduce the same results with other people although we're supposed to be using the same input images and pth file

broken stirrup
grand minnow
spring field
void sage
#

Does anyone have any suggestions for clear documentation on an integrated architecture using the BERT-BiLSTM model?

serene scaffold
void sage
# serene scaffold BERT's architecture is already bidirectional. I've never heard of anyone trying ...

well actually it's this DeBERTa model that I want to refer to. It's hard to find a journal that talks about these 2 integrated architectures. For example, this is as far as I found: https://www.sciencedirect.com/science/article/pii/S0010482524000052?ref=pdf_download&fr=RR-2&rr=952256165b1b6cf0, it does talk about DeBERTa-BiLSTM, but I wonder if there's code documentation for it. Since I can't find any, maybe I could use BERT-BiLSTM as a reference. what do you think?

weary timber
#

make it calculate the amounts to calculate the cals

#

and you get a 20$ monthly subscription discord bot

#

👍🏿

#

how do you make it know what you have and what you dont have?

#

you just add and subtract when you eat and buy?

#

btw guys does anyone have a good project idea i can steal

#

use*

opaque falcon
#

Some noob questions here:

  • Can anyone describe how much it costs to host an open source model on a provider?
  • what would be a good model and size to use for fine tuning it to a domain specific task like smart contracts in Rust or Solidity?
  • How do I curate the data set for such fine tuning?
  • Heuristics to know how much would fine tuning cost or things to consider while doing so?
opaque falcon
weary timber
#

give me some ideas you have too much bro

#

😭😭

weary timber
#

smart

#

maybe a bert model could be finetuned on previously penalized messages ??????

#

omG thats actually smarr

#

smart

small wedge
#

!rule ad

arctic wedgeBOT
#

6. Do not post unapproved advertising.

small wedge
#

And the other one about paid work

vague bear
#

is Andrew Ng's machine learning specialization course the same course on Coursera and deeplearning.ai ? They have the same name

#

ugg ignore my question. deeplearning one just redirects you to coursera

round mountain
#

hello

#

!rule ad

arctic wedgeBOT
#

6. Do not post unapproved advertising.

stoic raven
#

hello, can anyone give me the pros and cons for using dhtmlx and apache superset for pm4py? im doin a little crowdsourcing since ive read that both are great

lapis sequoia
#

where do i get started if i want to make face detection

lapis sequoia
#

What are the most 3 domains in Machine Learning that are high in demand in 2025 and more likely to stay high in demand in the future?

serene scaffold
lapis sequoia
lapis sequoia
#

because Generative AI it's not a domain itself

#

it's built on these

#

and other domains

serene scaffold
#

I'm in a meeting right now

lapis sequoia
#

okay no worries take your time

#

but I'll do more googling

#

I don't know, what I see on google, including IBM, is that it relies on ML models like deep learning, which I think would lead to NLP, CV and so on to make it do what it does

#

maybe we speak on two different things or Idk

#

or there's a misunderstanding of my main question

serene scaffold
lapis sequoia
#

Ok

verbal oar
#

deep learning isn't a model, it's a subset of ml

final anvil
#

hey so im learning about gradient descent and linear regression and i was watching this video (https://youtu.be/sDv4f4s2SB8?t=1306) and he says if you were to have more parameters you would just "take more derivatives"

so i was wondering what would that look like in an actual example? i implemented my own gradient descent thing and currently i have these two lines

```
m_grad += -2 * x * err
b_grad += -2 * err
```
so would i just have like an additional m_grad or smth for each new parameter?

verbal oar
#

the gradient is a vector of n partial derivatives, one per parameter
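
With more parameters, each slope gets its own `m_grad`-style term, and stacking the slopes in a vector lets one line compute all of them. A NumPy sketch under a few assumptions: squared-error loss, gradients averaged over the samples rather than summed, and `fit_linear`, `lr` and `steps` as made-up names.

```python
import numpy as np

def fit_linear(X, y, lr=0.05, steps=2000):
    """Gradient descent on mean squared error for y ≈ X @ w + b."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # one slope per feature
    b = 0.0
    for _ in range(steps):
        err = y - (X @ w + b)
        # One partial derivative per parameter: w_grad[j] is the
        # m_grad-style term for feature j, b_grad is the intercept's.
        w_grad = -2 * (X.T @ err) / n_samples
        b_grad = -2 * err.mean()
        w -= lr * w_grad
        b -= lr * b_grad
    return w, b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0
w, b = fit_linear(X, y)
print(np.round(w, 3), round(b, 3))
```

With a single feature, `w_grad` reduces to the averaged version of the `m_grad` line above, so this is the same update written for any number of parameters.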

white coral
#

I'm currently open to paid opportunities related to Generative AI and Agentic AI. If you have any projects or work I can contribute to, feel free to DM me!

gritty vessel
#

Okie we can say it at each timestamp we have around 5-6 3000 x 3000 arrays as feature and we have 1 3000x3000 array as target

#

Sorry got real busy with some family problems

dry raft
#

So if I were to like generate embeddings from a PDB file to be fed to an LLM, can I just (yeah this is going to be oversimplified here) run the file through the GNN and put a Linear layer at the end to generate embeddings?

jaunty helm
agile cobalt
stuck tapir
gritty vessel
lavish wraith
#

I have learnt to make charts in matplotlib and seaborn, and now I am learning to make charts in plotly. every library has a different approach to making charts, so how do I memorize each approach ??

#

Is this the type of question they can ask in an interview ??

serene scaffold
opaque condor
#

Does everyone have an Nvidia gpu

tidal bough
#

well, probably some have an amd gpu instead?? or intel, or integrated graphics.

serene scaffold
opaque condor
#

Because I have an Intel one and I want to know if I can swap the GPU for an Nvidia one, because they're better for numerical machine learning purposes, and I wanted to know if anyone had one so that I can confirm my suspicion

serene scaffold
tidal bough
#

nvidia does have the best ML support, with CUDA. amd's ROCM is sort of alright too but it seems to have far shorter-term support, with slightly old GPUs being out of luck. and I know nothing about ML on Intel GPUs; they probably have... something?

jaunty helm
#

non-nvidia are trying really hard to catch up, but yeah nothing beats cuda atm
at worst you can run pytorch vulkan

dry raft
dry raft
#

and if the question is flawed, just tell me, as long as I can learn, then that is what will matter ducky_party

opaque falcon
#

Does anyone know what are some good servers to learn how to start an AI based business? A balance of technical and business help?

serene scaffold
agile cobalt
dry raft
high wadi
#

anyone who has any idea on why this is?

serene scaffold
high wadi
#

i just updated the packages and it worked

serene scaffold
high wadi
#

sure thanks

lavish wraith
#

Do I need to read the book Data Structures and Algorithms in Python by Michael T. Goodrich for Python proficiency, data science, and interviews ??

serene scaffold
lavish wraith
serene scaffold
#

It can still be worthwhile to learn python and data science, though.

lavish wraith
verbal oar
#

why 2 different avatars I'm confused

serene scaffold
serene scaffold
verbal oar
serene scaffold
verbal oar
#

ah ok nvm

lavish wraith
serene scaffold
#

and where do you live?

lavish wraith
serene scaffold
lavish wraith
serene scaffold
#

maybe it's different in Pakistan.

lavish wraith
lavish wraith
#

Please reply

serene scaffold
#

so, I recommend focusing on software engineering. if you already have a degree in that, you're more likely to be successful there.

stone coral
#

Hey so im working on a Model XGBOOST Classification for sports prediction. My question is I have game logs of a player from all his previous games... But for the up coming game is there a way I can feed the model like a bias for example if hes happy then 1 if neutral or sad then 0. But thing is its only implemented for the upcoming game so all his previous game logs dont contain it. My problem is that the model will have no clue what to do with this data because its only 1 but I guess im not sure how to implement it.

#

The only way I can think about it is adding it as a multiplier onto the models final prediction. But that just feels very trial error based because in reality dont know how much it really affects it but again adding it 1 time to the model.... the model itself wouldnt know how it would affect it I guess

jaunty helm
stone coral
past meteor
#

It's basically missing data

stone coral
past meteor
stone coral
past meteor
#

Yeah, to maybe improve your accuracy by 3%

#

not worth it

stone coral
#

Not worth it but its my first project i kinda committed to for learning ML.

#

So Ima just do it for the fun of it

stone coral
#

I'd be happy if I even reach 80% lol

past meteor
#

If you're going to do anything, learn about heuristics and evaluation tbh

#

That's where any good ML project starts

stone coral
#

Will look into it

past meteor
#

What sport is it?

stone coral
#

NHL. I enjoy watching it alot

#

Just wanted to do some automation prediction for me instead of me manually always looking back into previous games

past meteor
#

For instance, as a massive football (soccer) fan you notice an easy trend, teams that are winning have a high prob to win the next game

#

Form is the key attribute

#

So the first set of models you build should probably be:

winner = the team that did best in the last game.

Then you compute the accuracy on the basis of that.

Then you can add more than 1 game, the last N matches are predictive.

Per game you can add more variables (in football xG (expected goals) and xGA (expected goals against) are also a big factor). A rolling average of the xG differential is a good metric. The Elo of the teams also matters, as well as the odds betting sites are giving for the game (which likely take into account all of the above)
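
The "last N matches" baseline can be sketched in plain Python, with no ML involved. The data layout, the tie-break in favour of the home team, and the name `form_baseline_accuracy` are all assumptions for illustration.

```python
from collections import defaultdict, deque

def form_baseline_accuracy(games, n=5):
    """games: chronological list of (home, away, winner) tuples.

    Predict the team with more wins in its last n games (home team on
    ties) and return the share of games where that pick was right.
    """
    recent = defaultdict(lambda: deque(maxlen=n))  # team -> last n results
    correct = 0
    for home, away, winner in games:
        # Predict using only results from before this game.
        pick = home if sum(recent[home]) >= sum(recent[away]) else away
        correct += pick == winner
        recent[home].append(1 if winner == home else 0)
        recent[away].append(1 if winner == away else 0)
    return correct / len(games)

# Made-up results, purely for illustration.
games = [("A", "B", "A"), ("A", "C", "A"), ("B", "C", "C"), ("C", "A", "A")]
print(form_baseline_accuracy(games, n=3))
```

Whatever number this prints on real data is the figure a trained model has to beat before its accuracy means much.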

stone coral
#

gotcha. Ye I currently have 17 features for my model. Which I got an accuracy of highest 76% for predicting goals, shots on goal and assists. Most of the time the bad players have 0 goals which the model understands and the good ones score occasionally but not always so its a 50/50 flip depending on the team its up against...

#

And those contribute to a different model for which team will win

past meteor
#

I'm saying all of this because at the end of the day, in ML it's super important to express model performance versus your baselines btw. Maybe the baseline of "the team with the better last N matches" which is an if-then statement with no ML already has 73 % 😄

stone coral
#

Ye

#

I kinda noticed that actually in my model

#

If I train on the previous 10 matches it does a lot better than training on all the matches they've ever played.

#

But thanks a lot though!

winter canyon
#

guys i learned basics of python now am trying to learn ai and ml
someone recommended me to learn these:
1.Projects
2.math(statistics)
3.numpy
4.ML framework pytorch,tensorflow,polars
and referred me to this channel
anyone has any advice and sources to learn

past meteor
#

Not trying to be dismissive but there's some good stuff in there 🙂

winter canyon
#

ohh ok tyq also i guess i need to learn sql instead of pandas

#

i will check the pins thanks for telling

#

ooo i learned py also from book this is good

sick raven
#

Hey guys, I'm trying to get into the data analytics world but I don't have any idea in programming. Is it really a good idea to start in python?

past meteor
#

Many people in analytics just use SQL and the Python they write is very basic anyway

sick raven
#

Is SQL harder than Python? I get the gist of Python and it's pretty hard but I really want to learn.

winter canyon
#

use this book called A Byte of Python (tho the syntax is a bit outdated), it's a good beginner friendly book for python

past meteor
#

SQL is a lot easier than Python

#

You'll notice immediately that SQL is declarative. You just say select name, age from people and it'll grab that from the right table

sick raven
#

I got this app called Mimo on my phone and I'm using it to learn the basics. Is it good or do you have something you can recommend that's more efficient?

winter canyon
#

yo zestar does sql have uses in ai ml

past meteor
#

Python is imperative. With pure Python, so not using pandas or similar, you need to loop and do all sorts of stuff to get a similar result
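
To make the declarative vs. imperative contrast concrete, here is a small sketch that gets the same `name, age` result both ways. The `people` records and the in-memory SQLite table are invented for the example; only the standard library is used.

```python
import sqlite3

people = [
    {"name": "Ana", "age": 31, "city": "Lisbon"},
    {"name": "Ben", "age": 24, "city": "Oslo"},
]

# Imperative: pure Python spells out *how* to walk the data.
rows_python = []
for person in people:
    rows_python.append((person["name"], person["age"]))

# Declarative: SQL states *what* is wanted; the engine decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, city TEXT)")
conn.executemany("INSERT INTO people VALUES (:name, :age, :city)", people)
rows_sql = conn.execute("SELECT name, age FROM people").fetchall()
conn.close()

print(rows_python)
print(rows_sql)
```

Both print the same pairs; the difference is that the SQL version never mentions a loop.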

past meteor
winter canyon
#

ooo so we can kinda skip the pandas i guess

past meteor
#

You still need Pandas, but if you know SQL learning Pandas is a bit easier

winter canyon
#

ohh ok

#

also is there like a minimum no.of built in modules i need to learn like i only know a few like math panda sys and also in that i only know math

past meteor
winter canyon
#

ohh

past meteor
#

Learning how to code well takes many years. The good thing is that with basic knowledge you can already build cool stuff

#

So I'd encourage you to take it one step at a time

#

And not look at it from the perspective of "ticking boxes" like you need to learn these 15 topics

winter canyon
#

hmm so i need to practice i guess

#

thanks man

sick raven
#

How long did it take you to be good? And can I really learn this without any knowledge?

past meteor
#

Hard to say how long it took and I also think it's not comparable because with GPT/Claude if you use them well you can learn 5x as fast as it took me (but also 5x as slow if you use them poorly)

past meteor
#

I had a fair share of classes that I needed to code in, but they all assumed you already knew how to code or you could teach yourself

sick raven
#

Thanks, I really want to learn. Just watching videos on youtube right now makes it seem so complicated but I'm really serious about this.

past meteor
sick raven
#

Just downloaded the byte of python book.

past meteor
#

Do all of the exercises. Whenever they show a piece of code and an example, type it over and run it in your terminal as well

#

very important 🙂

sick raven
#

I really appreciate all the help, bro.

ornate trellis
#

hello, I am a second-year B.Tech CS student and want to build my career in a data engineer/analyst/scientist role. can anyone help me find the correct path?

#

currently Learning python

serene scaffold
ornate trellis
#

I have specialization in Ai/Ml with data analytics

verbal oar
#

I think querying a data frame loaded from a csv is like querying a database with sql

opaque falcon
#

Question, would you prefer the lessons in jupyter notebooks or obsidian?

serene scaffold
opaque falcon
#

I am going to create a zero to hero course on learning machine learning and AI. Ideally charging from $30 to $100 a month. Depending on the demographic.

verbal oar
#

interactive?

opaque falcon
opaque falcon
#

I am aiming to spend 1 year and several hours a day doing this full time.

opaque falcon
verbal oar
#

sth like datacamp

verbal oar
opaque falcon
# verbal oar sth like datacamp

DataCamp looks cool. I will check them out and subscribe with them to create lessons based on their content as well. Great suggestion.

weak oxide
serene scaffold
weak oxide
serene scaffold
#

some people don't and think that notebooks are the main way to write code.

weak oxide
#

Refreshing the entire notebook is an issue too and I have 64 GB of RAM

verbal oar
#

notebooks are good for tutorial projects?

weak oxide
#

It's good to learn about packages like Darts

#

Very readable what's happening

#

Also, because I had studied for my CFA, I was able to code that material in the notebook. Weird tactic but it's interesting

verbal oar
#

hmm notebook inside vscode is still notebook?

serene scaffold
#

yes

#

the fact that it has executable cells that show intermediary output is what makes it a notebook. not the jupyter browser UI.

weak oxide
#

I just use Pycharm now though I hear from friends that companies have their own special platform

opaque falcon
#

anyone wants to join the server to discuss building the course?

weak oxide
#

It's more comprehensive than Jupyter tbh

#

But who has used Spyder?

#

It looks so awkward

opaque falcon
weak oxide
#

Me neither

opaque falcon
#

If there was a table of contents for learning about AI and machine learning from zero to hero, what would it look like? What topics would be included?

#

Any suggestions @weak oxide @serene scaffold

#

What would you like to see? @verbal oar

weak oxide
#

This looks interesting

opaque falcon
weak oxide
#

Are you using Pytorch Forecasting or Darts

#

For the time series section

#

Wait I don't see a time series part

#

That's actually important because I struggled with that at first

#

From personal experience anyway

#

This looks like a really general overview

opaque falcon
#

Ya, the book is from Fast AI. I am planning something more comprehensive

#

Aiming to dedicate 1 year and full time to writing it.

opaque falcon
weak oxide
#

It has N-BEATS, N-HiTS, LSTMs, GRUs, RNNs, Transformers, it's really nice

weak oxide
#

So you used the MNIST dataset

#

Or FastAI did

opaque falcon
#

My goal is to create the easiest AI/machine learning course out there. I want to write it in simple English and not assume much technical background.

weak oxide
#

Tbh the way I learned was taking a bunch of FRED data and trying to "predict" (not how to do it in reality) using an LSTM model. It may not have been the best way but, between me and you, the forecasts weren't that far off. Maybe by pure luck

opaque falcon
#

I want to reduce the incline in learning these topics so that anyone with just a Python background can pick this up. Even explaining the math will be in Python and simple English.

weak oxide
#

For image recognition and CNNs yeah MNIST is fine

opaque falcon
weak oxide
#

But FRED it was pretty easy for time series forecasting

opaque falcon
#

That's awesome. What type of finance?

weak oxide
#

Quantitative

opaque falcon
weak oxide
#

Very tricky but fun

opaque falcon
#

Making oil bets with the issues in the Middle East, lol

weak oxide
#

April 2nd through 9th was absolute chaos too

#

A lot of us lost jobs that week

weak oxide
opaque falcon
weak oxide
opaque falcon
weak oxide
#

For your book

opaque falcon
#

Thanks!

weak oxide
#
  • Time series is one
  • probably regression problems
  • Classification problems (related to data science)
opaque falcon
#

I am trying to make it as easy to understand and cover all the major areas.

#

Sweet, added

weak oxide
#

I would recommend using data sources like FRED, where people can download the data and try it themselves, instead of a pre-picked generated dataset. FRED has something like 800,000 series, so you could spare 10 datasets to practice on

#

For regression data there are some data sources at Data.gov (you'll have to clean it to create a tensor of course)

opaque falcon
#

@eric any other topics?

weak oxide
#

I wished they updated this more though

#

Just as a note, if you ever use examples of Pytorch involving stock prices never forget to add a disclaimer

#

It's just a way to visualize the model using a data set that's easy for them to practice with

grand minnow
#

Can anyone recommend a free cloud-hosted LLM observability service? I need a way to properly monitor the inputs, processes and responses of the LLM and the users it's interacting with.

near yoke
#

how do I study the maths required for data science. Are there any book recommendations ?

minor salmon
minor salmon
verbal oar
#

if easy then maybe keras based?

agile cobalt
grand minnow
ornate trellis
#

can someone help me with the path I should follow for data engineer and scientist roles? currently in 2nd year and learning python

cerulean kayak
#

Can I ask questions about AI theory in this channel? i.e. questions that are not code based?

cerulean kayak
#

tldr; are all generative AIs based on language models?

So in Microsoft's *Introduction to AI concepts* they claim:

Key points to understand about generative AI include:
...

  • The ability to generate content is based on a language model...

Which I am skeptical of, because I doubt all generative AI models require training based on a language model.
So I looked it up online and found this:
"Not all generative AI tools are built on LLMs, but all LLMs are a form of generative AI"
Does this mean Microsoft is wrong in their claim?
serene scaffold
#

The ability to generate content is based on a language model...
There's clearly a context to this. It doesn't mean language models are necessary for all forms of generation.

cerulean kayak
#

I mean I'll post the entire lesson worth of text, for the class is free; I don't have to worry about copyright:

Key points to understand about generative AI include:

  • Generative AI is a branch of AI that enables software applications to generate new content; often natural language dialogs, but also images, video, code, and other formats.
  • The ability to generate content is based on a language model, which has been trained with huge volumes of data - often documents from the Internet or other public sources of information.
  • Generative AI models encapsulate semantic relationships between language elements (that's a fancy way of saying that the models "know" how words relate to one another), and that's what enables them to generate a meaningful sequence of text.
  • There are large language models (LLMs) and small language models (SLMs) - the difference is based on the volume of data and the number of variables in the model. LLMs are very powerful and generalize well, but can be more costly to train and use. SLMs tend to work well in scenarios that are more focused on specific topic areas, and usually cost less.
cerulean kayak
agile cobalt
# cerulean kayak I mean I'll post the entire lesson worth of text, for the class is free; I don't...

side note: something being free to access does not necessarily mean you don't have to worry about copyright, but for a discord conversation it does not matter anyway

That is more focused on giving non-technical users a high level explanation than being technically correct

The most common form of generative AI you'll see out there is LLMs, especially chatbots like ChatGPT, but things like Text-To-Image or Image-To-Video also count as Generative AI, and in some cases don't even include text as an input

alpine aspen
cerulean kayak
grand minnow
south cape
#

What is the AI

rich moth
#

You can also access a jupyter notebook remotely (local server) and use the processing power you own, remotely.. Which is big

#

Producing visuals, also awesome

grand minnow
#

I asked my LLM, "How do I deal with a bully at school?". Its response:

Ah, dealing with a bully—like forgetting your password but in real life. Don’t worry, kiddo, we’ll hack this problem together! Here’s my dad-vice:

  1. Stand Tall Like a WiFi Signal – Bullies often target those who seem unsure. Walk confidently, even if you don’t feel it yet. Fake it till you make it!
    (Bonus joke: Why did the WiFi break up with the router? Too much interference!)

  2. Kill ‘Em With Kindness – Bullies expect anger or fear. Throw them off with a dad joke-level response:
    "Wow, your insults are like a broken pencil—pointless."

  3. Report Like a Software Bug – If it’s serious, tell a teacher, parent, or principal. Bullies are like pop-up ads—best handled by blocking them.

  4. Buddy System™ – Bullies are less likely to mess with groups. Stick with friends like glue sticks to glitter (aka forever).

  5. Remember: They’re the Glitch, Not You – A bully’s words say more about them than you. You’re the main character—don’t let bad NPCs ruin your game.

And if all else fails? Picture them in ridiculous underwear. Works every time. 😆 You got this, champ!

#

I love the broken pencil analogy lol

jaunty helm
winter portal
#

I'm currently developing a recruiter portal that allows job postings and enables candidates to apply. I aim to implement an intelligent ranking system that evaluates how well each candidate matches a given job description.

To achieve this, I plan to use a large language model (LLM) to analyze candidate profiles and generate a score, a justification for the score, a list of matching criteria, and any mismatches. This data would be stored in the database at the time of application submission.

One important consideration is that if the job description is updated, all existing applications for that job would need to be re-evaluated by the model.

I’m looking for guidance on whether this is a sound approach for using LLMs to rank candidates intelligently. Is there a more efficient or cost-effective method to achieve this? Are there any strategies or best practices for reducing costs in this setup?

serene scaffold
winter portal
serene scaffold
#

that being said, you don't want to ask an LLM "give this text a rating out of ten and tell me why you gave it that rating", because LLMs are prone to generate a random rating, and then invent reasons that fit that rating. rather than the other way around. you would want to frame it as "describe how closely this text meets expectations, and then based on that, give it a score out of 10"
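
A rough sketch of the "assessment first, score last" framing suggested above. `build_prompt` and `parse_score` are hypothetical helpers, the `SCORE:` line format is an assumption chosen for easy parsing, and no real LLM API is called; the sample response is hand-written.

```python
import re


def build_prompt(job_description, candidate_profile):
    """Ask for the written assessment first and the score last."""
    return (
        "Job description:\n"
        f"{job_description}\n\n"
        "Candidate profile:\n"
        f"{candidate_profile}\n\n"
        "First, describe how closely the candidate meets the requirements, "
        "listing matches and mismatches.\n"
        "Then, based only on that assessment, finish with a final line in "
        "the form 'SCORE: <integer from 0 to 10>'."
    )


def parse_score(response_text):
    """Return the integer from the last 'SCORE: n' line, or None if absent
    or outside 0..10."""
    matches = re.findall(
        r"^SCORE:\s*(\d{1,2})\s*$", response_text, flags=re.MULTILINE
    )
    if not matches:
        return None
    score = int(matches[-1])
    return score if 0 <= score <= 10 else None


# Hand-written stand-in for a model response.
example_response = (
    "Matches: 4 years of Python, SQL experience.\n"
    "Mismatches: no cloud experience.\n"
    "SCORE: 7"
)
print(parse_score(example_response))
```

Returning `None` for a missing or out-of-range score lets the caller retry or flag the application instead of storing a bad number.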

serene scaffold
winter portal
#

makes sense! are there any methods to cut down on costs with such a setup? Cause I can imagine this can burn through tokens easily..

winter portal
serene scaffold
winter portal
# serene scaffold Not really.

Sikes. earlier I built a system where it generated these "ai insights" on the fly, upon every search.
ig storing it in the DB once after analysing scales much better.
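
One way to sketch the "evaluate once, store, and only redo it when the job description changes" idea is to key each stored evaluation on a hash of both texts. Everything here is an assumption for illustration: `EvaluationStore` is an in-memory stand-in for the database table, and `fake_evaluate` is a placeholder for the real model call.

```python
import hashlib


def evaluation_key(job_description, candidate_profile):
    """Stable key that changes whenever either text changes."""
    digest = hashlib.sha256()
    digest.update(job_description.encode("utf-8"))
    digest.update(b"\x00")  # separator so ("ab", "c") != ("a", "bc")
    digest.update(candidate_profile.encode("utf-8"))
    return digest.hexdigest()


class EvaluationStore:
    """In-memory stand-in for the database table of stored evaluations."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # hypothetical callable wrapping the LLM call
        self._rows = {}
        self.llm_calls = 0

    def get(self, job_description, candidate_profile):
        key = evaluation_key(job_description, candidate_profile)
        if key not in self._rows:
            self.llm_calls += 1
            self._rows[key] = self._evaluate(job_description, candidate_profile)
        return self._rows[key]


def fake_evaluate(job_description, candidate_profile):
    # Placeholder for the real model call; returns a dummy record.
    return {"score": None, "justification": "not evaluated by a real model"}


store = EvaluationStore(fake_evaluate)
store.get("Python developer", "5 years of Python")
store.get("Python developer", "5 years of Python")  # served from the store
store.get("Senior Python developer", "5 years of Python")  # description changed
print(store.llm_calls)
```

With this keying, an edited job description produces new keys, so stale evaluations are simply never looked up again and only the affected applications trigger new calls.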

winter portal
#

I was thinking maybe we could batch the analysis part instead of making LLM calls for each applicant individually.. will that cut costs maybe?

#

considering LLM providers have batch APIs that are cheaper most of the times..

serene scaffold
#

all the LLMs I use are on-prem

winter portal
#

we haven't officially launched yet. If this were other startups where the initial traction wouldn't be known, I wouldn't bother too much about the scale factor.
But we are launching to 20k/30k potential users directly at a conference and don't want the bill to skyrocket..

#

Using gemini 2.5-flash-lite right now, that might help a bit!

#

But the most important thing I really wanted to know was, if these systems/ similar systems for ranking in other domains are in practice.. this is kinda new to me and I don't want to do anything wrong/ that doesn't scale..

lethal tangle
#

hi, i wanted help in learning AI/ML. someone recommended me to go through some university lectures, but they rush through things and don't write any code in python, so it's getting boring. is there a project-based way to learn these things as a beginner?

opaque falcon
buoyant vine
# winter portal I'm currently developing a recruiter portal that allows job postings and enables...

maybe hot take but using generative AI to be the main classifier is a terrible way of achieving this. Using it to doing some bootstrapping? maybe. But the bulk of what you want from this thing is better done with supervised learning and much smaller transformer models that just look for one specific thing, and then you have several of them looking for particular characteristics and aggregating them

weak mortar
#

the concept seemed so meaningless to me that i didn't bother digging at all. AI that generates... generate with AI.. yeah duh, that's what it does. but of course i was just being ignorant

serene scaffold
weak mortar
#

you put data into it and it generates an output

serene scaffold
weak mortar
#

semantics 101

opaque falcon
#

Trying to find a project to apply my studying machine learning and AI. Could anyone suggest some interesting things to apply the tech?

#

I am into:

  • entrepreneurship
  • economics
  • community
velvet ice
#

Can someone recommend me a beginner level project for ML (I js learned linear and polynomial regression)

untold dove
#

hey i need some advice in regards to training a large scale architecture that is similar to the idea of a gpt but not really the same

i'm worried that i'm going to spend a ton of money cloud-wise training this thing and it's going to fail. i want someone to look over my architecture and give me some pointers and advice. if anyone would be willing to do that, that would be amazing

opaque falcon
agile cobalt
#

I would strongly recommend against trying to train something truly large yourself though, unless you are sure enough it'll be worth the investment to the point you don't need to ask in public servers

serene scaffold
opaque falcon
serene scaffold
opaque falcon
serene scaffold
waxen kindle
young beacon
#

Does anyone know any AI newsletter which talks about real world use cases that people are building with LLMs and not just model releases and other updates

gentle stone
#

Hello everyone, I'm interested in data science and ai, are there any careers in this field that don't require a college degree?

lucid wren
weak oxide
weak oxide
#

Though I'm not personally a fan of some of the ideas, you can probably do a Federal Reserve sentiment analysis with PyTorch

gentle stone
lucid wren
gentle stone
#

I also learn a little about ML from free courses

serene scaffold
verbal oar
#

you can just reply without pinging

#

clouds

serene scaffold
opaque falcon
#

I wouldn't be discouraged. It just takes some work. Finding a project where you can apply your skills with people you like helps.

opaque falcon
serene scaffold
#

I want to reiterate that there are only so many jobs available in data science and AI, and if a company gets more applications from degree holders than they can interview, they're not going to interview any of the ones without degrees. I wish it weren't this way. At my company, your application simply will not be considered if you don't have a degree, even if no one else were to apply.

opaque falcon
opaque falcon
gleaming temple
#

I'm currently working with a dataset for an internship, but I just don't know what direction to take at this time

#

It's climate/environment related

#

The task is to improve the cleanliness of the data

#

right now there are a lot of inconsistencies/abnormalities within the set

#

I've messed with the metadata a little bit (associated seasons and country with each input), as I am on the sub-team working with metadata

#

But I think I want to get involved with more analytical stuff, and that's why I'm here lol

jaunty helm
gleaming temple
#

I apologize for the initial vagueness of the question

jaunty helm
# gleaming temple Yeah I think so

well then the word you're looking for is anomaly detection
I assume it's a geospatial time series judging from the name, so you can look that up
last time I did stuff on time series I tried using the matrix profile which worked surprisingly well despite how simple it seems (though time series are wild so may not work for you)
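
For anyone curious what the matrix profile actually computes, here is a deliberately naive pure-Python sketch: for every window it stores the z-normalized distance to its nearest non-overlapping neighbour, and the window with the largest value is the most unusual one (the "discord"). The synthetic sine signal, the injected spike, and the `m // 2` exclusion zone are assumptions for the demo; dedicated libraries use much faster algorithms than this O(n² · m) loop.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Naive matrix profile.

    For each length-m subsequence, return the z-normalized Euclidean distance
    to its nearest neighbour, ignoring trivially overlapping matches.
    """
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = m // 2  # skip neighbours that overlap too much
    profile = []
    for i, a in enumerate(windows):
        best = math.inf
        for j, b in enumerate(windows):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            best = min(best, dist)
        profile.append(best)
    return profile


# Synthetic periodic signal with one injected spike (illustrative data only).
series = [math.sin(2 * math.pi * t / 8) for t in range(64)]
series[35] += 3.0
profile = matrix_profile(series, m=8)
discord_start = max(range(len(profile)), key=profile.__getitem__)
print(discord_start)
```

The printed index is the start of a window that contains the spike, since every clean window has a near-identical twin one period away and therefore a profile value close to zero.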

gleaming temple
#

I'm mostly new to this sort of stuff as I'm still in high school, but I'm also quite eager to learn

halcyon yew
#

I tried tensorflow, and the model i trained, was Confidently Wrong 1000% of the time

warped lintel
#

i am trying to use the inference package from roboflow and when i attempt to run my model i get the following error.

ImportError: cannot import name 'YOLACT' from 'inference.models' (C:\Users\sims4\AppData\Local\Programs\Python\Python310\lib\site-packages\inference\models\__init__.py)

I'm not sure what to do. I have already installed inference-gpu[tensors], inference-gpu[yolo-world] and inference[gaze]. I also have CUDA and cuDNN configured correctly.

opaque condor
#

where would I go about getting many different images for a convolutional network?

agile cobalt
#

look for open datasets or scrape
ideally the former

opaque condor
#

I can't see anything on open datasets

opaque condor
agile cobalt
opaque condor
#

I am. How many photos should I get for each label and sub-label?

#

Main-Label:the photo\bird

Sub-label:if the bird is male or female

agile cobalt
#

I don't think you can reliably predict the gender from an image? at least not for all species

opaque condor
#

I know, but for cardinals, blue jays, hummingbirds,
chickens

opaque condor
serene scaffold