#data-science-and-ml

long canopy
#

i see. thanks for the input! btw how do you use mlflow and tensorboard separately? afaik they're both for metrics/logging on models right?

past meteor
#

I use MLFlow to log summaries of runs and use tensorboard to investigate the details if a run looks interesting or strange

long canopy
final kiln
#

And like, if I had to go back, the thing I'd push harder for would be to have fewer moving parts. So instead of using OpenSearch, MySQL and Redis, I'd do Postgres + Redis, or just Redis

long canopy
past meteor
#

I also considered using dagster but I just built a basic svelte + fastAPI web app (<2 hrs work) to monitor some other things

#

Memory spikes seem to kill my pipelines, and connecting to mlflow and/or tensorboard requires a company VPN, so I essentially did a workaround: I made simple API endpoints that my runners send requests to, confirming they're still alive and reporting the system resource state. That way I can monitor runs and intervene where necessary on my phone before going to bed (without needing the VPN) 🤠
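
A minimal sketch of what such a heartbeat could look like, using only the standard library. The function names, the payload fields, and the 120 second timeout are assumptions for illustration, not the setup described above; a real version would probably also report memory usage (for example through a library such as psutil) and POST the JSON to the monitoring endpoint.

```python
import json
import os
import time


def build_heartbeat(run_id: str) -> dict:
    """Collect a small liveness + resource snapshot for one runner (hypothetical fields)."""
    payload = {"run_id": run_id, "timestamp": time.time(), "pid": os.getpid()}
    # os.getloadavg() only exists on Unix-like systems, so fall back to None elsewhere.
    try:
        payload["load_avg_1m"] = os.getloadavg()[0]
    except (AttributeError, OSError):
        payload["load_avg_1m"] = None
    return payload


def is_stale(last_seen: float, now: float, timeout_s: float = 120.0) -> bool:
    """Server-side check: treat a runner as dead if it has been silent for too long."""
    return (now - last_seen) > timeout_s


# The runner would send this JSON body to the monitoring endpoint on a timer.
body = json.dumps(build_heartbeat("run-001"))
```

The server side then only needs to store the latest timestamp per run and flag any run for which `is_stale` returns True.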

long canopy
final kiln
past meteor
long canopy
past meteor
#

If you trust yourself you can implement small subsets of overengineered stacks faster than you can read their documentation

#

I do this ... a lot

past meteor
final kiln
#

That makes sense

past meteor
#

The GPU VM just doesn't have a public IP, which means that this is kind of the only way

#

When I'm on the VPN I run a reverse proxy anyway to have a local.ip/mlflow, local.ip/tensorboard, ...

final kiln
#

Ngl this redis SQL thing, sounds pretty good I'm gonna try it first chance I get, one less service in my app? Mind if I do

final kiln
#

Like http basic auth kinda public

#

I should probly setup a vpn

past meteor
#

On my virtual private server it's the same idea. I use github actions for CI/CD and it's kind of ... yeah

final kiln
past meteor
#

I eventually need to:

  1. Set up a VPN properly
  2. Figure out how to make the GHA runner use it
  3. Put my ssh port behind a firewall
long canopy
final kiln
long canopy
#

yeah you can just use aws' api for VPC to manage the subnets and such

final kiln
#

Right, but sometimes the machine is my laptop

past meteor
#

I managed to draft up a plan to solve my GPU issue with IT btw

long canopy
#

there's probably something to do here

#

like set your machine up within the VPC

#

i don't know the specifics but i need to learn them

#

let me know if you do an implementation

final kiln
#

Uhm, never heard of it but I think something like that could be possible

past meteor
#

I inventoried all the VMs (a lot of work...) we're running on our 3 servers and convinced them to move all of them to nodes 1 and 2, so we can just have a bare-metal Ubuntu install without Proxmox on the one machine with the Quadro

final kiln
#

Altho, it's a lot of work just to get my machine in the network, might as well setup a vpn or open the ports on the cloud to my IP only

long canopy
final kiln
#

I've just not done that out of laziness so I don't think I'll go through the trouble of doing the vpc stuff

past meteor
#

the tl;dr is that I kept begging for a bigger VM and I just managed to convince them to give me the entire server 👍

#

The entire box, not chopped up into bits using proxmox or whatever

final kiln
#

Ah I never had to battle for resources like that

#

Tho I've experienced the lack thereof

long canopy
past meteor
#

Pretty much

#

CPU and RAM were bottlenecking me

#

Couldn't use the 48GB VRAM card to its capacity

long canopy
final kiln
#

What is a weak node ?

long canopy
long canopy
final kiln
#

Yeah sounds interesting

#

I've always wondered how gpt4 even does inference, model is so big and there's so many people using it

long canopy
#

main subject i've been working on lol

final kiln
#

Ah I mostly focus on smaller scale, might bite me in the future idk

long canopy
#

yeah I want to minimize cost of doing inference with unquantized models

final kiln
#

I think this is why the industry seems (at least from what I've observed) to be ahead of academia, the industry is very resource aware and always looking to optimize while the academia is very smart folks doing a subject they like but not necessarily within the same kinds of constraints

#

But idk

long canopy
#

really need those academic types working on those

final kiln
#

wait

#

yeah I think so, except one

#

the first one is actually now working at open ai

long canopy
#

huh! industry eh

#

didn't know

#

well, thanks to those lads in any case lol

final kiln
#

yeah they're from a company that happens to have a lab in the university, I actually worked at a place like that, it works very much like a company, didn't see much difference except that I had to walk through a campus again

vocal sleet
#

What python libraries can I use to make a simple AI chatbot to add to a discord bot I am making? I know the openai library exists but I want a few more recommendations.

vocal sleet
final kiln
vocal sleet
final kiln
winter sluice
#

Should I watch a video on memory/garbage collection for this? Typically we say 'no reusing variable names', but for sequential dataset calculations it feels totally wrong to use up so much memory.

Lines like these happen all the time in my code:

        parsed_dataset = dataset_choice.parse_tfrecord(...
        self.dataset = filtered_dataset.shuffle(...
#

ignore that it has an error haha

mild grotto
#

So, I profiled my app and I see
{built-in method scipy.sparse._sparsetools.coo_matvec} is taking up basically all the processing time.

This is because I have this Gaussian blur filter:
self.L1=adjacency.tocoo()
and then blur like this:

  def blur(self,data):
    return self.L1.dot(data.flatten()).reshape(data.shape)

Is there a more performant way to do this?

#

I thought about doing a larger blur filter (5 pixels instead of 3) and then I could do 2 blur operations in a single pass. However this seems to cause it to actually be slower presumably because it can't utilize co-local variable locations in memory.

ashen axle
#

I am looking for a LOCAL data pipeline framework that encourages intermediate value inspection, preferably through visualisation, throughput validation, and error handling. What is the contemporary framework/approach?

I am familiar with scikit-learn's pipelines but as far as I am aware none of my requirements are built-in.

I've reached a point where I am writing one from scratch, which tells me I'm doing the wrong thing, so I'm curious what the field is using. Web search turns up the usual Medium articles, blogs and advertisements for distributed systems.

wooden sail
#

e.g. keeping the original shape and doing elementwise multiplication plus addition
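
A rough sketch of that idea, assuming the data is a single regular 2D grid and the blur is a separable 3-tap [1, 2, 1] / 4 kernel. Both are assumptions: the adjacency matrix in the question may encode different weights and may connect pixels across face edges, which this sketch does not handle. The function name `blur3` is made up for illustration.

```python
import numpy as np


def blur3(data: np.ndarray) -> np.ndarray:
    """Separable [1, 2, 1] / 4 blur built from shifted slices, with replicated edges."""
    p = np.pad(data.astype(float), 1, mode="edge")           # shape (H + 2, W + 2)
    # Vertical pass: row above + 2 * row + row below. Horizontal padding is kept
    # so the second pass can still reach one column to the left and right.
    v = (p[:-2, :] + 2.0 * p[1:-1, :] + p[2:, :]) / 4.0      # shape (H, W + 2)
    # Horizontal pass: column left + 2 * column + column right.
    return (v[:, :-2] + 2.0 * v[:, 1:-1] + v[:, 2:]) / 4.0   # shape (H, W)
```

Everything here is elementwise multiplication and addition on contiguous slices, so there is no flatten/reshape round trip and no sparse index lookup per nonzero entry.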

mild grotto
#

Yeah I mean, I can keep shape the same and use a function to index in

#

would that help?

#

I'll try it

dusk tide
#

Hello, has anyone ever done the TensorFlow Professional Developer Certificate exam?

mild grotto
wooden sail
mild grotto
#

I allocate everything as a long array, and index using

Face*res*res + Y*res + X

#

Is that what you mean?

#

Now there is no reshaping.
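
For reference, the index formula above and its inverse can be written as a small sketch. The function names are made up, and it assumes `Y` and `X` both run from 0 to `res - 1`.

```python
def flat_index(face: int, y: int, x: int, res: int) -> int:
    """Map (face, y, x) on a cube surface to a position in one long array."""
    return face * res * res + y * res + x


def unflatten(index: int, res: int) -> tuple:
    """Inverse of flat_index: recover (face, y, x) from the flat position."""
    face, rem = divmod(index, res * res)
    y, x = divmod(rem, res)
    return face, y, x
```

This is the same row-major layout a `(6, res, res)` array uses internally, so reshaping between the two forms would not move any data.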

wooden sail
#

all right, though that kinda looks like a quadratic form now

#

what shape is this face variable?

#

and Y and res, i guess

#

originally and now as vectors

orchid forge
#

guys i need help understanding something

unique ivy
#

Pandas

orchid forge
#

yup

past meteor
#

I think what you want is an orchestration tool. In that case you either want airflow or dagster. Airflow is the option with the most traction but dagster is comparatively simple

#

scikit-learn's pipelines are something totally different, that's just encapsulating an ML model with its preprocessing (which is something you should definitely do)

versed flame
#

Hi! Ill post this question here on recommendation:
I recently said something in front of one of my devices, and now I'm getting recommendations for 'trading bots' etc. on YouTube.
While I doubt it's easy to get rich, I've traded with paper accounts before, which was fun, and a bot seems like a fun project.

How 'real' are these, and also what is a good way or direction to start learning when going for this?

I assume I need to use machine learning to some capacity.

shut yoke
versed flame
#

Well, "yes" but also no. If it was easy, I realize it wouldn't work.

shut yoke
#

It's not easy to make the bot yourself

versed flame
#

Based on comments on ALL the videos, trading gurus seems rather overrated.

shut yoke
#

And it doesn't guarantee you profit because after all it's a bot. Not any better than a human being

versed flame
#

Let's rephrase it then: what is a good way to get into machine learning, and what other kind of project could I do? I learn a lot better when doing something rather than following directions (hence why I don't want to watch the youtube videos and just copy)

abstract rune
#

Isn't matrix multiplication also of order n^3 ?

#

how does gradient descent make a better choice than the closed-form solution (X^T X)^-1 X^T y ?
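
A small sketch comparing the two on synthetic data. The data, the learning rate, and the number of steps are arbitrary choices for illustration. With n samples and d features, forming X^T X costs on the order of n·d² and solving the d×d system on the order of d³, while one gradient step costs on the order of n·d, which is why gradient descent becomes attractive when d is large.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))           # n = 200 samples, d = 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                          # noise-free, so both methods should recover true_w

# Closed form: solve (X^T X) w = X^T y instead of inverting X^T X explicitly.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the mean squared error; each step only needs matrix-vector products.
w_gd = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = (2.0 / len(y)) * (X.T @ (X @ w_gd - y))
    w_gd -= lr * grad
```

For a problem this small the closed form is the obvious choice; the sketch only shows that both routes reach the same weights.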

faint galleon
faint galleon
#

Is that about element or variable??

final kiln
abstract rune
#

i have no idea what you are talking about @faint galleon

faint galleon
abstract rune
#

which reduces the size for X, so it makes the computation simpler

final kiln
split drift
#

Hey,
I've written a long script that processes data.
I think that it would be good to break it into modular parts, to improve maintainability and readability.
Can someone send me a guide, or a repo that can serve as an example of how to do it correctly?

mild grotto
#

it's the surface of a cube

#

so face is just [0,5], y is [0,res] and x is [0,res]

ashen axle
# past meteor Sadly the word "pipeline" means 5 different things in data

Yes, you're spot on there. This is a 1 man locally run scientific project running batch data in MB size, signal processing. I'm simply spending too much time chasing errors caused during development. All I'm looking for is error handling and intermediate step data viz. I feel like airflow or otherwise is overkill? I'm not familiar with it.

rocky ridge
#

Please help me data scientists

twin reef
#

Hi guys, I have made a model for a car that drives on a certain track. The point of the project is to get the best model possible for a track, then you race against the car, and at the end it shows where you could have performed better by analysing the car's movement and yours

#

And since I have only made the basic model and the pygame simulation, I am wondering if this is too hard

#

Since I have around 20 days to do it

twin reef
#

No

feral blade
#

hii, im using torchreid library for my custom data... The documentation says it automatically logs the learning curves and i just need to install tensorboard to visualize it... but the visualizations come out to be like this which is very weird imo.... is there anything i could do to maybe extract loss/rank1/map stuff from training myself and plot them, or any way to reconfigure plot?
link to the doc: https://kaiyangzhou.github.io/deep-person-reid/user_guide#visualize-learning-curves-with-tensorboard

graceful ledge
#

has anyone ever analyzed their junk mailbox using python?

#

I just nuked 11k unread emails and am interested into sender distributions, etc. Wondering how I can get this from a folder in an email inbox
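
One possible starting point, assuming the folder can be exported to something that yields the raw `From` headers (for example an mbox file read with the standard library's `mailbox.mbox`, taking `message["From"]` for each message). The function names below are made up for illustration.

```python
from collections import Counter
from email.utils import parseaddr


def sender_counts(from_headers):
    """Count messages per sender address, ignoring display names and letter case."""
    counts = Counter()
    for header in from_headers:
        # A missing From header shows up as None, so treat it as an empty string.
        _name, address = parseaddr(header or "")
        if address:
            counts[address.lower()] += 1
    return counts


def domain_counts(from_headers):
    """Aggregate the per-sender counts by the domain part of the address."""
    counts = Counter()
    for address, n in sender_counts(from_headers).items():
        counts[address.rpartition("@")[2]] += n
    return counts
```

`sender_counts(headers).most_common(20)` then gives the top senders directly.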

final kiln
#

a while back i wanted a bot scraping my emails and wasn't able to do it for gmail

#

only way was actual web scraping

past meteor
final kiln
lapis sequoia
#

hi i wanted to ask where i should start to learn python for AI, since I'm interested in how machines can learn something (especially how they learn from their mistakes), so should i buy specific books, or where should i start?

odd meteor
# split drift Hey, I've written a long script that process data. I think that it would be good...

I'd say, just devise a structure that works best for you. For example, I use the so called "3-design pipeline" to decompose my ML code into manageable components.

  1. Feature Pipeline: A script that transforms raw data into model features, then pushes it to a feature store so the rest of the system can use it (I use Feast for most projects)

  2. Training Pipeline: A script that ingests features from the feature store, trains the model, and pushes the artifacts to the model registry

  3. Inference Pipeline: fetches last batch of features and generates prediction using the model that's already pushed to the model registry.

You can work out something like this where you decompose your long script into small and manageable bits.

More so, if you fancy Poetry, you can as well use it to keep your work well-structured.

odd meteor
# lapis sequoia hi i wanted to ask where should i start to learn python for AI since I'm interes...
  1. Start from https://kaggle.com/learn
  2. Check the pinned post by Zestar. You'll see some book resources he recommended.

If you're interested in making a financial commitment, you can try Udemy, Coursera or Udacity.

hexed dawn
#

hi! i'm getting conflicting info, would you say TF-IDF as a vectorizer is for feature extraction or feature selection?
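
For what it's worth, TF-IDF computes new numeric features from raw text rather than choosing a subset of existing columns, which is why scikit-learn files its vectorizer under `sklearn.feature_extraction`. A plain textbook sketch of the computation, assuming whitespace tokenization and non-empty documents (libraries such as scikit-learn use a smoothed variant of the IDF term, so the exact numbers differ):

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) where each row is one document's TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each token appear?
    df = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    rows = []
    for doc in tokenized:
        tf = Counter(doc)
        rows.append([tf[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, rows
```

Feature selection would be a separate, later step, for example keeping only the columns with the highest scores under some criterion.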

tranquil mist
#

Hey guys, I was wondering if there’s any VERY in depth resources for pandas, preferably with real world (read non ideal) input. I keep hitting a wall where the documentation isn’t very helpful in terms of performance and most YouTube videos / SO questions are very superficial and not geared towards very large datasets.

#

I’d say I’m beginner to intermediate level, meaning I can get anything done with decent but not optimal performance.

long canopy
#

are there python alternatives to kafka?

quaint loom
#

I've been tackling how to predict Macrophytes biomass using data from different locations and environmental factors. Initially, I tried using 'Wet biomass', 'Wet weight', and 'Dry weight' to guess 'Dry biomass', but that just made my model too clingy (overfit). So, I switched gears and decided to first make predictions on those auxiliary bits - 'Wet biomass', 'Wet weight', and 'Dry weight'. Then, I'd use these predictions as inputs to predict 'Dry biomass' more accurately, hoping this roundabout way would trick the model into not overfitting.

After merging and cleaning up the data, I split it up, made sure there weren't any gaps in my target variables, and trained separate models for each auxiliary target. These models' predictions were then used as extra features to help predict 'Dry biomass' with a RandomForestRegressor.

But here's where it got tricky: I ran into a snag with mismatched sample sizes, flagged by an error pointing out I had [294, 368] samples at different stages. I believe I may be off track, so any input would certainly be valuable.
https://paste.pythondiscord.com/M6EQ

odd meteor
odd meteor
karmic void
#

Hello guys, I am quite experienced in python and wanna enroll myself in DATA SCIENCE. I don't have much idea of what type of projects are created in this field. I have learnt about numpy, pandas, matplotlib and seaborn. Is there any idea about what should I do more and how?

final kiln
raw mortar
#

plus1 for dask, it's a distributed task scheduler itself

potent sky
#

Flink is great and growing very well. We recently finished adopting a donation of Change Data Capture Connectors from Alibaba

final kiln
#

I wonder if prefect can be used for similar purposes

past meteor
#

I don't think Kafka is a data processing solution at all. It's a piece of infra for distributed event-driven programming which can be used for data processing but wasn't specifically designed for it.

potent sky
# past meteor Oh do you use Apache flink?

Sometimes
But I like it.
I keep track of the developments, dev list discussions, sometimes vote on releases and FLIPs.
I'd try to contribute more through code but can hardly find the time 😔

past meteor
potent sky
#

Real time data analytics platform for example, in combination with Apache ignite 🔥. Flink supports true stream processing natively.
Unfortunately most of my dealings with flink are hobby projects 😅
Flink is pretty good for stateful unbounded streams and event driven requirements. Exactly-once consistency guarantees in many cases.

past meteor
potent sky
#

pyflink

#

Java for where that's insufficient

past meteor
#

I'll look into Flink some more 👀

My area is p much dominated by Azure and Databricks so that's what I know. I've looked into Flink a tiny bit but not that intensively.

desert mulch
#

we are making game would help us

long canopy
#

@odd meteor @past meteor thanks for the comments!

potent sky
#

https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

Good piece. Putting it here in case someone hasn't come across it yet.
The discussion on Kafka, flink etc reminded me of it

I joined LinkedIn about six years ago at a particularly interesting time. We were just beginning to run up against the limits of our monolithic, centralized database and needed to start the transition to a portfolio of specialized distributed systems. This has been an interesting experience: we buil

past meteor
#

And event sourcing I guess

long canopy
#

7B is now 3 GB

#

it's the friggin future lads

long canopy
#

the apache big data stack is some serious stuff

#

i can clearly see its use for business analytics

potent sky
#

My latest interest is Apache Ignite

past meteor
#

I honestly never heard of that one

#

I'm reasonably deep into this stuff https://fs2.io/ but my guess is that it doesn't scale as Flink and Spark do because it's really aimed at single node concurrency

#

Totally fine for my use cases tho tbh

potent sky
#

It has an interesting architecture and the enterprise use cases mentioned are also intriguing

potent sky
#

Haven't really played much with Scala yet

past meteor
#

It's my guilty pleasure 🤣

long canopy
#

WHAT

#

gpt 3.5 just got released for local

#

dude what the heck is going on this is too much i cannot handle it

final kiln
long canopy
#

nvm i just got rick astley'd

#

i'm a friggin idiot

#

sorry i forgot the day

final kiln
#

Looool

long canopy
#

dude it was so believable, the tweet even said it was an old version of gpt 3.5

final kiln
#

They're definitely working on 4.5, so releasing 3.5 would mean 4.5 and 5 were coming

long canopy
#

keep an eye out for jamba tho

#

some serious stuff going on there

final kiln
#

Let's see. Altman also mentioned Q*, but I'd bet he's playing the hype

versed pilot
# tranquil mist Hey guys, I was wondering if there’s any VERY in depth resources for pandas, pre...

I did a couple of courses by Matt Harrison on LinkedIn Learning, and I bought his Pandas 2 book. He has some good ideas on making pandas code more maintainable, and also more performant. But you can only go so far with Pandas, you should consider other solutions if the dataset is really too big. At work we use BigQuery, but if you want to stick to open source dataframe libraries have a look at what Emyrs suggested.

tranquil mist
tranquil mist
drifting spire
#

Hey guys, has anybody here worked with recsys? I'm creating my first one and would love to hear some advice

final kiln
#

https://youtu.be/wjZofJX0v4M?si=Llqp3kIlSJKM3V8h

I've only watched a chunk of this, but it's top notch as usual

An introduction to transformers and their prerequisites
Early view of the next chapter for patrons: https://3b1b.co/early-attention
Special thanks to these supporters: https://3b1b.co/lessons/gpt#thanks

Other recommended resources on the topic.

Richard Turner's introduction is one of the best starting places:
https://arxiv.org/pdf/2304.10557.p...

long canopy
#

i want one on mamba

fading wigeon
#

Anyone have any suggestions for like... courses on Machine learning and/or AI in Python?

#

I'd like to get more experience in both without having to go get a masters/phd or something

#

r/machinelearning seems to recommend this, but that was like 7 years ago: https://www.coursera.org/learn/machine-learning/home/info

fading wigeon
#

Okay. Any suggestions on resources? I think I do best with academic type courses

#

like on coursera or something

desert oar
#

but check the pinned messages + search up in the channel, there will be lots of suggestions

fading wigeon
#

OH

#

I thought you meant I should dive into deep learning

#

Like,as a description of what I should do

fading wigeon
#

Thank you! I'll look into them 🙂

long canopy
potent sky
final kiln
potent sky
#

Ikr! Felt the same when he posted "But what is a neural network"
Like why wasn't this out when I was getting into this stuff 😭

final kiln
#

it's how it should be, we're here to make it easier for those who come after us - still jelly tho hueshda

tranquil juniper
#

Question on the math behind transformers, if that's fine here: I saw 3blue1brown's latest video on it and he describes that only the final token's hidden state vector is used to generate the next token. Why is that? Is it true? Why would you ignore all the valuable info in the other vectors? https://youtu.be/wjZofJX0v4M?si=xjG1aMGzizelL5B9 21:15 in the video for this specific question.

final kiln
# tranquil juniper Question on math behind transformers if thats fine here, saw 3blue1browns last v...

so the transformer is being trained on next token prediction, imagine you have some text:

  • dataset: "this is some text that is being used to train the transformer on next token prediction"

what you want to do is select a subset of it, for example

  • sample: "text that is being used to train the transformer"

you now turn this into an input and output:

  • input: "text that is being used to train the"
  • output: "that is being used to train the transformer"

note how in the output, the first token was removed and the last token was not present in the input
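
The same shift as a tiny sketch, assuming whole words as tokens (real tokenizers usually split text into subword pieces) and a made-up function name:

```python
def next_token_pairs(tokens):
    """Build (input, target) for next-token prediction: target is input shifted by one."""
    inputs = tokens[:-1]    # everything except the last token
    targets = tokens[1:]    # everything except the first token
    return inputs, targets


sample = "text that is being used to train the transformer".split()
inputs, targets = next_token_pairs(sample)
```

At every position `i`, `targets[i]` is the token that follows `inputs[i]` in the original sample.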

#

and so the reason why you only take the values from the last token, is because the other tokens are just being transcribed, copied from the input, the only token with new information is the last one

this is called a self-supervised method, in which the labels of the dataset are generated from an unlabelled dataset

#

for BERT you do like

  • input: "text that is being <MISSING_WORD_TOKEN> to train the transformer"
  • output: "text that is being used to train the transformer"
#

the reason for the difference has to do with the internals of the attention mechanism, BERT lets every token influence every other token, while GPT only lets tokens influence tokens that have already occurred in the sentence
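
A minimal sketch of that difference as attention masks, where `mask[i][j]` being True means position `i` may look at position `j`. The function names are made up for illustration.

```python
def causal_mask(n):
    """GPT-style: position i may only attend to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """BERT-style: every position may attend to every other position."""
    return [[True] * n for _ in range(n)]
```

With the causal mask, only the last row sees the whole sequence, which is why the last position is the one used to predict the next token.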

tranquil juniper
tranquil juniper
final kiln
tranquil juniper
final kiln
tranquil juniper
final kiln
#

Each sentence is labeled by itself, shifted by one token

final kiln
tranquil juniper
#

Do you have any textbook or colab/jupyter notebook you could recommend that helps understand the fine steps of it? A more low level understanding like yours? @final kiln

final kiln
# tranquil juniper Do you have any textbook or colab/jupyter notebook you could recommend that help...

I went through this step by step with a pen and paper in hand: https://bbycroft.net/llm

When I was satisfied with how much I understood, I went from the top, and implemented everything in pytorch.

Don't try to understand everything at once, it can be okay to start building the parts you do understand and then come back.

I first trained it on simple array sorting, then I trained it on next token prediction. At that point, I was using this repo as reference to get some details right:

sweet prairie
#

👀 rate the model guys

tired lodge
#

image permissions?

#

or just in general how to upload files to discord?

#

hmm, strange. the permission should allow all files of any kind to be uploaded

#

that makes sense. you shouldnt be uploading big py files because pastebins exist

#

just put all your content in there lol

#

idk i might be waffling, im a bit hungry so i should probably eat

tranquil juniper
frozen tundra
#

does someone know if there is a problem with my code or i just didn't use it correctly? i tried to make my own neural network and everything works pretty well except the linear functions in the output layer, they just don't learn, they output the same output for different inputs (i can link the code if someone wants to see it)

frozen tundra
#

sorry lol, ill paste it in a sec

#

this is the code, (the elu is not finished yet it dosent matter)

#

i have tried changing parameters like the learning rate and the number of hidden layers and neurons, but those didn't change much. i found out it's not only with the output layer but in the hidden layers as well

plucky sedge
#

I'm using FuncAnimation to animate a graph of a projectile's position in a simple flight (like throwing a ball) and I'm wondering if there's a way to auto-adjust the axes scale, because the animation just goes off-screen immediately (I can still just move the graph around but I'd rather have it auto adjust)

Any help would be much appreciated.

frozen tundra
#

i tried using just tanh and sigmoid to teach it stuff like cos and sin and it worked pretty good but when i used linear or relu activation functions it outputted the same thing over and over again

#

i apologize if my english is not good, it's not my native language

#

thank you so much

#

no problem. the relu function isn't really a relu, it's a kind of relu that has no derivatives equal to 0, to prevent "killing" neurons
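
This description matches what is usually called a leaky ReLU. A minimal sketch, assuming a slope of 0.01 on the negative side (the value used in the code being discussed is not shown here):

```python
def leaky_relu(x: float, slope: float = 0.01) -> float:
    """Identity for positive inputs, a small linear slope for the rest."""
    return x if x > 0 else slope * x


def leaky_relu_grad(x: float, slope: float = 0.01) -> float:
    """The derivative is never exactly 0, so a neuron always receives some gradient."""
    return 1.0 if x > 0 else slope
```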

#

yeah

#

with the linear activation function?

#

ok ill try

#

btw i didnt teach it a cosine with the linear function, i tried to teach it a parabolic function

#

how many hidden layers should i use? i think one is good?

#

i tried a lot of different numbers and none of them worked (i dont mean to make you stay here you can go eat lol)

wooden sail
#

i wouldn't expect it to work well outside of the training domain regardless of the activation function tbh

final kiln
#

The other two worked, so something is correct

(Brb)

final kiln
#
        self.weights = np.empty(1 + hidden_num, dtype=object)
        for i in range(len(self.weights)):
            self.weights[i] = np.random.uniform(-0.5, 0.5, (self.sizes[i], self.sizes[i + 1]))
        self.weight_update = self.weights

you're actually not making a copy here, both self.weight_update and self.weights point to the same array

frozen tundra
#

it didn't work with the same setup, it outputs very similar outputs. when i tried using only linear activation functions another problem occurred and the outputs were "nan", i have no idea why and i have very little knowledge of what it means. i found a setup that kinda works that has 1 hidden layer and 5 hidden neurons

final kiln
#
    def add_changes(self):
        self.weights = self.weight_update
        self.biases = self.bias_update

so this function has no effect

#

and you dont need two separate arrays, you can just update the weights as you go through backprop

frozen tundra
#

oh i understand, how do i make a copy?

frozen tundra
final kiln
frozen tundra
final kiln
#

so, like, in the forward pass you can already calculate the gradients right

f(x) = x**2

f'(x) = 2*x

even if it's part of a composition of functions, f'(x) can be calculated in isolation as long as you have x

during the backwards pass you apply the chain rule in succession
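
A tiny worked sketch of that idea, using the f(x) = x**2 from above composed with a second, made-up function g(u) = 3*u + 1:

```python
def f(x):
    return x ** 2


def df(x):
    return 2 * x      # local derivative of f, needs only x


def g(u):
    return 3 * u + 1


def dg(u):
    return 3          # local derivative of g, needs only u (constant here)


def forward_backward(x):
    """Compute y = g(f(x)) and dy/dx by multiplying the local derivatives."""
    u = f(x)
    local_f = df(x)   # can be computed during the forward pass
    y = g(u)
    local_g = dg(u)
    dy_dx = local_g * local_f   # chain rule: dy/dx = dy/du * du/dx
    return y, dy_dx
```

For x = 2 this gives y = 3 * 4 + 1 = 13 and dy/dx = 3 * 4 = 12.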

#

as you do that you apply a change to the weights, don't need to store the change in a separate array since you are only operating with the gradients

#

if you have an entire batch, you do the same thing, the difference is that you have multiple x's

#

but you do one step at a time for the whole batch, instead of the entire backprop for each element of the batch

#

but in any case, what I'm trying to say is that "self.weight_update = self.weights" does not perform a copy of the array, it just copies a reference to the same array, so updating one is updating the other
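
A short sketch of the difference, using a small object array shaped like the `self.weights` in the code above. Note that for an array with `dtype=object`, `.copy()` only duplicates the outer array and the inner matrices are still shared, so `copy.deepcopy` is the safer choice if a separate update buffer is really wanted.

```python
import copy

import numpy as np

weights = np.empty(2, dtype=object)
weights[0] = np.zeros((2, 3))
weights[1] = np.zeros((3, 1))

alias = weights                  # same outer array, nothing is copied
shallow = weights.copy()         # new outer array, but the same inner matrices
deep = copy.deepcopy(weights)    # new outer array and new inner matrices

weights[0][0, 0] = 1.0           # in-place change to one inner matrix
# alias and shallow both see the 1.0; deep still holds 0.0 at that position.
```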

frozen tundra
#

ok i think i understand (i will try to explain what you said) so what i should do is go through all of the batch at the same time and save the derivatives as i go so i do it as a batch

final kiln
#

tho the

    def add_changes(self):
        self.weights = self.weight_update
        self.biases = self.bias_update

part doesn't have any effect

frozen tundra
#

thanks i understand. do you think it has to do with the problems i am having with the linear functions?

final kiln
#

layer_before = np.dot(inputs, self.weights[i]) + self.biases[i]

in here you're hardcoding the layer right, if you were to do a general thing you'd do

layer_output = layer(inputs)
grad_output = grad_layer(inputs)

and inputs could be a batch of inputs, then in back prop you'd go back apply the chain rule and then avg out the gradients and apply them
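
A sketch of what that could look like for a single linear layer operating on a whole batch. The function names and the `grad_out` argument (the gradient of the loss with respect to the layer's output, one row per sample) are assumptions for illustration, not the structure of the code being discussed.

```python
import numpy as np


def linear_forward(inputs, W, b):
    """inputs: (batch, n_in), W: (n_in, n_out), b: (n_out,) -> (batch, n_out)."""
    return inputs @ W + b


def linear_backward(inputs, W, grad_out):
    """Apply the chain rule for the whole batch and average over the samples."""
    batch = inputs.shape[0]
    grad_W = inputs.T @ grad_out / batch   # averaged gradient for the weights
    grad_b = grad_out.mean(axis=0)         # averaged gradient for the biases
    grad_inputs = grad_out @ W.T           # passed on to the previous layer
    return grad_W, grad_b, grad_inputs
```

An update step is then `W -= lr * grad_W` and `b -= lr * grad_b`, applied once per batch.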

frozen tundra
#

thanks i realy appreciate your help

final kiln
frozen tundra
#

do you maybe know what the nan is about?

final kiln
#

that's why normalizing the input tends to be a good idea

frozen tundra
#

what is normalizing?

final kiln
# frozen tundra what is normalizing?

uhm, it can have multiple meanings

but it usually means getting the values to be within a certain range, like scaling them all in the same way so that each one lands between 0 and 1 (here, so that they sum to 1)

like if you have

[1, 2, 3]

you can see if I divide by 3+2+1 = 6

[1/6, 1/3, 1/2]

these sum up to one
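
The same example as a small sketch, next to min-max scaling as one of the other common procedures. The function names are made up; `Fraction` is only used so the results match the fractions written above exactly.

```python
from fractions import Fraction


def normalize_to_sum_one(values):
    """Divide every value by the total so the results sum to 1."""
    total = sum(values)
    return [Fraction(v, total) for v in values]


def min_max_scale(values):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```

`normalize_to_sum_one([1, 2, 3])` gives 1/6, 1/3, 1/2, and `min_max_scale([1, 2, 3])` gives 0.0, 0.5, 1.0. Both keep the ordering of the original values.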

#

there's multiple procedures and they can have many meanings

#

in this case I turned it into a probability distribution

#

the important intuition is that both collections of values have the same information

frozen tundra
#

oh i see, but it outputs nan for all inputs when using linear activation function as the first layer

final kiln
#

they both convey the same relative scale

#

but the values are nicer in the second case

frozen tundra
#

?

frozen tundra
#

so i should pass the input values through some kind of function before the linear function

#

the point is i want it to learn infinite range function so i can later use it for q learning

#

yeah i just felt like using pytorch and tensorflow would be "cheating" because i wanted to really understand how everything works, but maybe i will use them after i have some more understanding of the topic

#

again i really appreciate your help! you helped me understand a lot of things and you gave me a different perspective. Thank you so much

potent sky
#

The SAM codebase is so nice it honestly makes me happy.
No unnecessary bloat code.
Very clean, logical, "the right amount of" modular, well commented.
Refreshing to see such a clean research implementation

desert oar
# frozen tundra yeah i just felt like using pytorch and tensorflow will be "cheating" because i ...

it depends on what you want to learn.

deriving the gradient manually, to implement a small fully-connected NN in pure numpy, is a great exercise.

implementing autograd yourself is interesting and useful if you are interested in ML engineering or other computational aspects of machine learning. but it's not an important learning exercise for actually doing DS/ML/AI in practice. just use pytorch for that.

potent sky
#

Most paper implementations are quite messy

potent sky
desert oar
#

Meta seems to put out high quality OSS ML code

#

Fasttext was very nice quality as well when I looked at it

potent sky
desert oar
#

Fasttext is actually what I described above: a pure C++ neural network, no autograd stuff

#

(or it was, when I looked at the code in 2018)

potent sky
#

I go through research implementations regularly and most of them are so messy it's tiring
It's understandable why they're messy, the researchers' primary function is research, they're not software engineers
But it's tiring nonetheless

potent sky
#

Aten src is also well structured
Well as well can be expected anyway from a codebase of that size and complexity

#

But I couldn't find any good reference doc for aten itself

jaunty helm
#

question, how do you choose between pytorch and tensorflow?

cinder jay
#

hi, i have two images, one is the original and the second one is the segmented, how can i overlay the segmented above the original???
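
One common way to overlay a segmentation on the original is alpha blending: wherever the mask is set, mix the original pixel with a highlight color. The sketch below uses only NumPy and assumes the image is an (H, W, 3) `uint8` array and the segmentation is an (H, W) mask where nonzero means "segmented"; the function name `overlay_mask` and the sample arrays are made up for illustration.

```python
import numpy as np

def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend `color` into `image` wherever `mask` is nonzero.

    image: (H, W, 3) uint8 array; mask: (H, W) array, nonzero = segmented.
    """
    out = image.astype(np.float32)
    selected = mask.astype(bool)
    # Weighted average of the original pixel and the overlay color.
    out[selected] = (1 - alpha) * out[selected] + alpha * np.array(color, dtype=np.float32)
    return out.round().astype(np.uint8)

# Tiny hypothetical example: a flat gray image with one segmented pixel.
image = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[1, 0], [0, 0]])
blended = overlay_mask(image, mask, color=(200, 0, 0), alpha=0.5)
```

Pixels outside the mask are left untouched, so the original stays visible everywhere else. If the segmentation is itself a color image of the same size, OpenCV's `cv2.addWeighted` does a similar weighted blend over the whole frame.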

potent sky
serene scaffold
grand geyser
wooden sail
#

fair warning that jax does not fill the same niche as tensorflow

grand geyser
wooden sail
#

it also never will

#

libraries might be built around it that do, like haiku and flax

#

but jax itself is a different thing

grand geyser
wooden sail
#

it's lower level numpy-like access to the XLA backend that tensorflow also uses

grand geyser
wooden sail
#

i'm just saying: jax does not do the same thing tensorflow and pytorch do

#

it's not the same kind of tool

#

you have to do more of the math and design yourself

#

there aren't even any "layers" defined anywhere

#

you have to compose everything by hand

grand geyser
wooden sail
#

yes

#

that is what it's all about

#

it's literally numpy with autograd and jit

grand geyser
#

Thanks for the info
I made up my mind
Never go to Jax 🫡

wooden sail
#

nothing else

#

don't get me wrong, it's fantastic for research

#

but not friendly if you don't like/have to do the math yourself

grand geyser
wooden sail
#

yes, because of the jit and autodiff

grand geyser
wooden sail
#

researchers that have to design novel architectures

grand geyser
#

....

wooden sail
#

sometimes you need to solve problems for which no solution exists yet

#

no out-of-the-box layers or architectures

grand geyser
#

Btw do you have any good video for pytorch?

wooden sail
#

i don't, i use jax 😛

grand geyser
#

🫡

#

Now I see how you know so much about Jax 😆😆😆😂

robust stratus
#

Is this a practical way of inserting data into an Excel spreadsheet?

```py
import json
from openpyxl import Workbook
from pl_bh.gh_resources import pl_functions as pl

# Load API response data from JSON
projects = pl.get_all_projects()

# Create a new workbook
workbook = Workbook()

# Select the active worksheet
sheet = workbook.active

# Write headers
headers = ['Project Number', 'Project Name', 'Manager Name', 'Type Description']
sheet.append(headers)

# Iterate through each project in the API response and write data to the spreadsheet
for project in projects:
    project_number = project.get('id', '')
    project_name = project.get('name', '')
    manager_name = project.get('manager', {}).get('name', '')
    type_description = project.get('type', {}).get('name', '')

    row_data = [project_number, project_name, manager_name, type_description]
    sheet.append(row_data)

# Save the workbook
workbook.save(filename="projects.xlsx")
```
robust stratus
#

A lot of get() functions

past meteor
# jaunty helm question, how do you choose between pytorch and tensorflow?

Keras is easier than using Torch. However ... there are many breaking changes in the TensorFlow/Keras world. As a matter of fact, Keras is now suddenly multi-backend again, which brought a host of breaking changes 😅 .

Personally, I've moved to Torch myself but there are a couple of things I do miss from Tensorflow.

raw mortar
#

Pandas can also output to excel, underneath it uses openpyxl or other excel extensions

robust stratus
raw mortar
robust stratus
#

I need to see an example of how pandas can be incorporated into my current code

desert oar
#

!d pandas.DataFrame.to_excel

arctic wedgeBOT
#
```py
DataFrame.to_excel(excel_writer, *, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, ...)
```
Write object to an Excel sheet.

To write a single object to an Excel .xlsx file it is only necessary to specify a target file name. To write to multiple sheets it is necessary to create an ExcelWriter object with a target file name, and specify a sheet in the file to write to.

Multiple sheets may be written to by specifying unique sheet\_name. With all data written to the file it is necessary to save the changes. Note that creating an ExcelWriter object with a file name that already exists will result in the contents of the existing file being erased.
desert oar
#

Of course, if you aren't already using Pandas, you might want to hold off. It's a big library with a learning curve if you aren't already familiar with the idea of a "data frame" from other contexts.
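
For the openpyxl script earlier in the thread, a pandas version could look roughly like the sketch below. It assumes `pl.get_all_projects()` returns a list of dicts shaped like the ones the original loop reads; the sample data and the helper names `projects_to_frame` and `save_projects` are hypothetical. Writing `.xlsx` through `to_excel` still needs an Excel engine such as openpyxl installed.

```python
import pandas as pd

COLUMNS = ["Project Number", "Project Name", "Manager Name", "Type Description"]

def projects_to_frame(projects):
    """Flatten the nested project dicts into one row per project."""
    rows = [
        {
            "Project Number": p.get("id", ""),
            "Project Name": p.get("name", ""),
            # `or {}` also covers a key that is present but set to None.
            "Manager Name": (p.get("manager") or {}).get("name", ""),
            "Type Description": (p.get("type") or {}).get("name", ""),
        }
        for p in projects
    ]
    return pd.DataFrame(rows, columns=COLUMNS)

def save_projects(projects, path="projects.xlsx"):
    # index=False keeps the DataFrame's row index out of the sheet.
    projects_to_frame(projects).to_excel(path, index=False)

# Hypothetical sample standing in for pl.get_all_projects().
sample = [
    {"id": 1, "name": "Bridge", "manager": {"name": "Ana"}, "type": {"name": "Civil"}},
    {"id": 2, "name": "Depot"},
]
frame = projects_to_frame(sample)
```

The `.get()` calls do not go away, since the nesting still has to be unpacked somewhere, but the header row, the append loop, and the save call collapse into `to_excel`.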

dusty valve
#

How much faster is jax than numpy?

serene scaffold
visual bone
#

hey guys, is there a good youtube video or book to read to learn pyspark?

raw mortar
visual bone
orchid forge
#

guys

#

I'm trying to understand some code
but couldn't

#

```py
price_range = df['Price range']
total_restaurants = len(price_range)
percentage = (price_range.value_counts() / total_restaurants) * 100
percentage
import matplotlib.pyplot as plt

price_range_counts = {}

for price in price_range:
    if price in price_range_counts:
        price_range_counts[price] += 1
    else:
        price_range_counts[price] = 1

sorted_price_range_counts = dict(sorted(price_range_counts.items()))

total_restaurants = len(price_range)

percentage = {price: (count / total_restaurants) * 100 for price, count in sorted_price_range_counts.items()}

plt.figure(figsize=(8, 6))
bars = plt.bar(percentage.keys(), percentage.values(), color='lightgreen')

plt.xlabel('Price Range')
plt.ylabel('Percentage of Restaurants (%)')
plt.title('Distribution of Price Ranges Among Restaurants')

plt.xticks(list(percentage.keys()))

plt.grid(axis='y', linestyle='--', alpha=0.7)

plt.ylim(0, 100)

for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width() / 2, height, f'{height:.2f}%', ha='center', va='bottom')

plt.tight_layout()
plt.show()
```

#

this code

#

anyone here?

orchid forge
# orchid forge

Create a histogram or bar chart to visualize the distribution of price ranges among the restaurants

#

its written at the top

orchid forge
#

for loop
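
The first `for` loop in that snippet is only counting: it walks through the column once and keeps a running tally of how many restaurants fall into each price range. A chart needs one number per category, and the raw column has one entry per restaurant, so something has to do that tally. The sketch below shows the same counting with a plain list standing in for `df['Price range']` (the values are made up), next to `collections.Counter`, which does the tally in one call.

```python
from collections import Counter

# Hypothetical stand-in for df['Price range'].
price_range = [1, 2, 1, 3, 1, 2, 4, 1]

# Manual tally, as in the snippet: one pass, one dictionary update per value.
manual_counts = {}
for price in price_range:
    manual_counts[price] = manual_counts.get(price, 0) + 1

# Counter performs the same tally without an explicit loop.
counts = Counter(price_range)

total = len(price_range)
percentage = {price: count / total * 100 for price, count in sorted(counts.items())}
```

The pandas line at the top of the original snippet, `price_range.value_counts() / total_restaurants * 100`, computes the same percentages, so the hand-written loop there is redundant rather than required.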

lapis sequoia
#

i meant this dataset, is um deprecated....

crisp raptor
#

Excel truly is the pinnacle of data science.

lofty thorn
#

is there a separate library available for matrices?

wooden sail
#

numpy can handle all of your matrix arithmetic for you

lofty thorn
#

ok

#

one more question

#

how to enable autocompletion in jupyter notebook

#

?

umbral charm
#

how does one achieve a diagram like this in matplotlib

#

ive tried doing color = 'None'

#

and edge colour = 'black'

#

but that still draws the lines coming back down, and i only want an outline

#

this is what i have

#

but i don't want those lines coming all the way back down, if you get me

#

i just want the outline

cinder jay
#

hey, how can i put a legend on an image using numpy or opencv?
because i segmented an image and i need to label each segment

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied timeout to @pine oasis until <t:1712177971:f> (10 minutes) (reason: newlines spam - sent 106 newlines).

The <@&831776746206265384> have been alerted for review.

spiral peak
#

!unmute 423503073479098368

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: pardoned infraction timeout for @pine oasis.

spiral peak
#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

spiral peak
#

pls use our pastebin

pine oasis
#

Sorry, whats that?

#

Oh sorry, just seen the embed

#

I am struggling with understanding different loss functions and how should i shape my output accordingly, the provided snippet works but the results seem really weird

#

If anybody is interested, please ping me with a reply

versed pilot
vocal sleet
#

What are the best libraries to make a chatbot from pretrained models?

I already tried HuggingFace but got so confused

verbal venture
#

Can anyone explain why x is being multiplied to find w during gradient descent in linear regression

tidal bough
#

If you write down the formula for the Mean Squared Error loss and calculate the gradient with respect to the weight, that factor will be there, because d((w@x-y)^2)/dw = 2 (w@x-y) x

verbal venture
#

can you just tell me the reason

mellow vector
#
```py
cars.plot(kind = "scatter", x ='horsepower', y = 'mpg', figsize = (12,8), c = "cylinders", marker = "x", colormap = "viridis")
```
verbal venture
#

why is x included in calculating w but not b

mellow vector
#

what is c short for?

tidal bough
#

I'm not sure how I can put it more directly. In the gradient with respect to w, the x comes from the d(w@x + b - y)/dw term, which equals x. Whereas for the gradient with respect to b, d(w@x + b - y)/db is 1.
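
A numerical check can make this concrete. The sketch below takes a single sample with made-up values, computes both gradients from the chain rule, and compares them against finite differences; the function names are only for illustration.

```python
import numpy as np

def loss(w, b, x, y):
    """Squared error for a single sample: (w @ x + b - y) ** 2."""
    return (w @ x + b - y) ** 2

def gradients(w, b, x, y):
    residual = w @ x + b - y
    grad_w = 2 * residual * x   # chain rule: d(w @ x + b - y)/dw = x
    grad_b = 2 * residual       # chain rule: d(w @ x + b - y)/db = 1
    return grad_w, grad_b

# Made-up sample and parameters.
w = np.array([0.5, -1.0])
b = 0.25
x = np.array([2.0, 3.0])
y = 1.0
grad_w, grad_b = gradients(w, b, x, y)

# Central finite differences: nudge one parameter at a time and watch the loss.
eps = 1e-6
numeric_b = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
numeric_w = np.array([
    (loss(w + eps * e, b, x, y) - loss(w - eps * e, b, x, y)) / (2 * eps)
    for e in np.eye(len(w))
])
```

Nudging a weight changes the prediction in proportion to the input it multiplies, which is why x shows up in the gradient for w. Nudging b changes the prediction by the same amount regardless of the input, so no x appears there.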

mellow vector
#

it produces a scatterplot, cylinders is used to determine the point on a spectrum that colors each data point

tidal bough
mellow vector
#

hmm i thought so, guess i didn't realize it would accept a series

tidal bough
#

that argument of pd.DataFrame.plot corresponds to this argument of plt.scatter:

#

(except that the pandas version accepts a column name, too, in which case that column is used.)

mellow vector
#

ty

#

wish this instructor wouldn't use shortcuts

#

dot notation bleh

desert oar
#

it's annoying because that's just the name, it's not an abbreviation for anything

#

there's also s

#

this is one of the most obnoxious areas in matplotlib

white crown
#

I have an Excel workbook with multiple sheets. I am reading it as a dictionary where the key is the sheet name and the value is a DataFrame. I am using pydantic v2 to validate this. I need to check the mode field: its value should be one of the values in the Zone column of the sheet named zone_sheet. I am trying something like this and it doesn't seem to work for me. What is the recommended way to dynamically create a list of valid values?

Zone =[]

Class testSchema(BaseModel):
  mode: Annotated[Zone, Field(description = "some test")]
  
  @model_validator(mode='before')
  def populate_zone(data_dict):
    zone_sheet = data_dict.get('Zone')
    if not zone_sheet:
      ZONE = [zone_sheet['zone'].tolist()]
desert oar
#

you can do it with an "after" validator though

#

"before" validators are too powerful, avoid if possible

#

That said, you might want to use Pandera instead of (or in addition to) Pydantic

white crown
desert oar
#

But it might help if you gave me an example of what you are trying to achieve, without Pydantic

#

That way I could understand better what you want to do

#

This might be better off in a separate help thread. Make a thread following the instructions in #❓|how-to-get-help and @ me so I see it

#

It seems mostly like a Pydantic question which isn't really the topic of this channel

white crown
desert oar
white crown
cinder jay
#

hi, i have a "problem": i have a few classes that i segmented and i'm drawing them on an image, but a few of them overlap the others
how can i fix that??

fading wigeon
#

I need help solving a stupid, stupid argument I'm having at work

#

My argument/stance is that the definition of a peak is a point where at least one of its neighbors is lower and the other is either equal or lower.

The junior coworker says that's something that's entirely made up and actually any point can be a peak if the slope changes

dusty valve
#

He's right

fading wigeon
#

How?!?!

#

That's literally every point that isn't the same

#

Like, by that definition, unless the previous point and the following point is equivalent to the current point, it's a peak

dusty valve
#

I dunno

#

I think he's right though

fading wigeon
#

So he's right, but you don't know how he's right, but you think he is?

#

Based off of information that you don't know?

fading wigeon
#

the point directly preceding or following

#

So the neighbors of the peak in the data [1,2,3,2,1] are both 2

desert oar
#

In a discrete sequence, I guess that's one way to define a peak? But what about the sequence 1,2,1,2,1,2

#

is every 2 a peak?

fading wigeon
#

If we don't take noise tolerance into account, then yes.

#

They are all local maxima

desert oar
#

what about 1.0001, 1.0003, 1.0002, 1.045, 1.592, 1.432

#

is 1.0003 a peak?

fading wigeon
#

From a strictly mathematical standpoint, yes. For any modern peakfinding algorithm, no, that would fall below noise tolerance thresholds and be deemed as spurious.

#

In the above example, he'd be arguing that 1.045 and 1.432 are peaks

#

because the slope is different from 1.592 to 1.045 to 1.0002

#

But literally any points besides [1,1,1] will have slope changes

#

and thus the definition becomes meaningless

#

Depending on the type of signal data you're processing (or if you're using the matlab/scipy peakfinder), you can set different thresholds to combat noise. Generally speaking, the y-axis distance between a local maximum and a local minimum has to be at least 1/4 of the range of the data for the maximum to be classified as a peak.

#

Some people like 1/5th
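
The definition and the threshold above can be sketched in a few lines of plain Python. This is a simplified illustration, not what the matlab/scipy peak finders do internally. It assumes endpoints are never peaks, and it measures a peak's height as the smaller of the two drops to the nearest valley on either side (an assumption, loosely in the spirit of prominence).

```python
def local_maxima(values):
    """Indices of interior points where at least one neighbor is lower
    and the other is lower or equal."""
    peaks = []
    for i in range(1, len(values) - 1):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if left <= mid and right <= mid and (left < mid or right < mid):
            peaks.append(i)
    return peaks

def valley_drop(values, i, step):
    """Walk downhill from index i in direction `step` (-1 or +1) and
    return how far the value falls before it starts rising again."""
    j = i
    while 0 <= j + step < len(values) and values[j + step] <= values[j]:
        j += step
    return values[i] - values[j]

def significant_peaks(values, min_fraction=0.25):
    """Local maxima whose smaller drop is at least min_fraction of the data range."""
    span = max(values) - min(values)
    return [
        i for i in local_maxima(values)
        if min(valley_drop(values, i, -1), valley_drop(values, i, +1)) >= min_fraction * span
    ]

signal = [1.0001, 1.0003, 1.0002, 1.045, 1.592, 1.432]
all_maxima = local_maxima(signal)        # includes the tiny bump at 1.0003
kept = significant_peaks(signal)         # only 1.592 survives the 1/4 threshold
```

Under the plain definition both 1.0003 and 1.592 are local maxima. With the 1/4-of-range threshold only 1.592 remains, and points like 1.045 and 1.432 are never candidates because one neighbor is higher. On a flat-topped plateau this definition flags both ends of the plateau, which a production peak finder would usually merge. In SciPy, `scipy.signal.find_peaks` exposes similar knobs through its `height`, `threshold`, `distance`, and `prominence` arguments.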

desert oar
#

It sounded like you were describing an algorithm though

#

Not a modern peak-finding algorithm, something very simple, which might in fact work in a lot of cases

#

I'm not sure what the heck they mean by "any point can be a peak if the slope changes" though

fading wigeon
#

Well, the junior is arguing that all modern peak finding algorithms are failures, as well as mine, and that only he's good enough to make one to REALLY catch all the peaks, which includes a bunch of points that look like knees or shoulders

desert oar
#

wait, do you want to include the shoulders or no?

#

i'd say anyone who takes an approach like that is probably an arrogant asshole, amplified if a junior is saying it

#

"everyone else but me is wrong" is the creed of a crank

fading wigeon
#

Yeah, he's incredibly arrogant. He tried to set himself up as the director of software engineering at one point

desert oar
#

but whether or not he's right in the context of your particular business case depends a lot on subjectively what do you consider a peak

#

unrelated to the math, people like that are a net drag on productivity and team morale, and are best let go of promptly

#

the longer their tenure without being chastised for their attitude, the more they find validation for their arrogance, and the more arrogant they get, and the more disruptive/counterproductive they get

#

if you give somebody like that too much power, they can sink the entire organization. and even if they don't have power, they can scare away enough talent that you will have issues retaining good people who don't need to put up with it

#

!e import pydantic

arctic wedgeBOT
#

@desert oar :x: Your 3.12 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 1, in <module>
003 |     import pydantic
004 | ModuleNotFoundError: No module named 'pydantic'
desert oar
#

aw

fading wigeon
#

I think the core issue is that he's very charismatic and is trying to climb the corporate ladder

#

So he's brown nosing the right people and he can speak confidently about things

#

These things happen to be total bullshit

#

But I'm an academic/engineer, I am always precise and leave room for myself to be wrong due to incomplete information

#

I know office politics are unavoidable, but...

#

I'm just hoping it's not like this at every company

#

Since I'm looking for the door

#

Maybe I just need to practice being a lying brownnoser, idk.

desert oar
# white crown YES. I am not sure how to do this and have tried to do this in several ways and...
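```py
import pandas as pd
import pydantic

class DataWrapper(pydantic.BaseModel):
    # pandas DataFrames are not pydantic types, so arbitrary types must be allowed
    model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)

    data_frames: dict[str, pd.DataFrame]

    @pydantic.model_validator(mode="after")
    def check_data_modes(self) -> "DataWrapper":
        data_frames = self.data_frames.copy()
        try:
            zone_df = data_frames.pop("Zone")
        except KeyError:
            raise ValueError("Zone sheet is missing from input.")

        valid_zones = zone_df["Zone"].unique()

        for sheetname, df in data_frames.items():
            if not df["mode"].isin(valid_zones).all():
                raise ValueError(f"Sheet {sheetname!r} has invalid 'mode' values!")

        # an "after" validator has to return the validated model instance
        return self
```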

like that?

fading wigeon
#

Can I just say I love that there's a module called "pydantic"

desert oar
fading wigeon
#

I don't even know or care what it does, lol

desert oar
#

there is also a bit of an art to being precise and scientific without being seen as incoherent or inconclusive

fading wigeon
#

Yeah. I have. Office politics are complicated right now. My original boss quit. Temporarily one of the execs was leading the software team but he didn't have any software knowledge. But he is/was really susceptible to yes men and the junior really confidently made the case that he should be promoted to director. I told him that I'm really trying to be a team player, but that I don't think the junior had any idea what he was doing. I got ignored until everything crashed and burned and then I got listened to.

But unfortunately the junior is telling everyone I'm just out to get him and he's well liked, people are buying it. Well, anyone who doesn't know anything about software is buying it.

#

But in the past I've gotten blamed for his mistakes so I'm trying to make any future incidents be crystal clear that I am in strong opposition because that logic was used to justify denying me a raise

desert oar
#

yikes

#

sounds well past the point of "work on your resume and start applying" imo

fading wigeon
#

Yeah

desert oar
#

it sounds like you're doing the right things, but it also sounds like management is toxic and borderline hostile

fading wigeon
#

It's hard though, everyone in my field wants to talk about AI and large language models and I don't have a ton of experience there, so I'm trying to study up as fast as I can

desert oar
#

i'm in DS with relatively minimal AI knowledge as well, i am right there with you

#

the jobs exist, but are rarer than they should be

fading wigeon
#

Yeah

#

I did find an ML course. I know that doesn't cover LLMs or everything about AI, but it's somewhere to start

desert oar
#

most small/mid-size orgs could benefit from an intermediate data scientist and a data analyst, giving the former freedom to do R&D and the latter stays busy with dashboards etc

fading wigeon
#

That's a very good point

desert oar
#

the problem is data quality -- usually it's horrible

fading wigeon
#

Yeah

desert oar
#

so you have like 1 year until the DS becomes productive

fading wigeon
#

Yup

desert oar
#

what is your background if not ML? engineering?

#

the way you're talking about "signals" makes me think of EE

fading wigeon
#

Digital Signal Processing for Physiological signals, degree in biomedical engineering

#

There's a lot of EE background knowledge involved

#

It's rough. This job used to be so much fun and I'm in medtech so I got to see people whom my technology directly helped

#

But I need to accept that ship has sailed

desert oar
#

BME degree in DSP + a few YoE you should be a pretty compelling candidate as long as you interview well and write decent code

fading wigeon
#

Yeah, I just need to be able to speak to AI/LLM/machine learning better

#

I just admit my knowledge on the topics is minimal and the interviews end

#

Well, not knowledge, but my experience is anyway

desert oar
#

there are ways to spin that

fading wigeon
#

You think so?

desert oar
#

plus it's not that hard to dick around with some prompts + run nanogpt locally

fading wigeon
#

Hmm

desert oar
#

it depends on who's interviewing you and what they're looking for

fading wigeon
#

I've definitely like... grilled chat gpt to try to figure out how well it "thinks" and it doesn't.

desert oar
#

unfortunately the fad cycle is at peak hype right now so everyone thinks they NEED it

fading wigeon
#

Yeah

#

If I tell even the screener I think it's a bit overblown that's the end of the conversation, lol

#

I've never heard of nanogpt

desert oar
#

what kinds of jobs are you applying for where the interviews are so AI focused?

fading wigeon
#

What's ironic is.... all of them.

#

Well

desert oar
#

DM me an example?

fading wigeon
#

No, yeah, like all of them. Specifically I focus on more R&D involved positions

#

Sure

desert oar
#

that might be part of the problem. what R&D does everyone want to do now? AI

#

you might need to spend some time with self-study

#

i never took the full courses but i have gone through enough of the material to feel comfortable recommending it

fading wigeon
#

Can't DM you, you have them turned off 😛 But yeah, I think you hit the nail on the head anyway.

desert oar
#

but a big part of interviewing is emphasizing what you are good at. you have a really strong math & engineering & programming background? then you will get ramped up quickly on the AI material and can be very effective at prototyping

fading wigeon
#

I'm excited to go through the material. I just wish I wasn't suffering at my job while doing it.

#

Hmm

#

I think that's part of the problem. It's hard to sell myself on something unless I'm really confident in my knowledge about it

desert oar
#

That's probably a good thing

waxen delta
#

Does anyone know how to fix this error with Labelme? labelme
2024-04-03 18:30:20,756 [INFO ] init:get_config:67- Loading config file from: /home/student/.labelmerc
QObject::moveToThread: Current thread (0x2ef3a970) is not the object's thread (0x30045540).
Cannot move to target thread (0x2ef3a970)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/student/.local/lib/python3.9/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc.

Aborted

final swift
#

My knowledge of python right now is somewhat limited. I have spent a good chunk of time working with lists, arrays and the other more basic things, but I have not taken an opportunity to look at hashmaps, SQL, or really much with databases or neural networks. I have looked at big O notation, and a little at classes. With all that in mind, are there any projects y'all would recommend to help build the necessary skills, or better cement foundations, for data work? I would like something a little more difficult than what I have been working on, which is Flask projects.

#

the other question I guess is, and what skills is it that I'm looking for?

jaunty helm
final swift
jaunty helm
orchid forge
#

i still didn't get why we are even doing it. i understood the code, but what is the reason we are using a loop?

#

im so dumb

#

why do we need to iterate through it

final swift
orchid forge
#

like why?

#

i wanna cry

#

im so fucking dumb

#

i should give up....its too hard for me

final swift
#

Nah man don’t give up. You got this. Just keep pushing through. Too hard just means for now.

pulsar wolf
#

yall know any cool open source data science projects?

abstract rune
#

Is this right guys ??

package matrix

// The constants of linear equations do not determine whether the matrix A
// singular or not!
func Singularity(A [][]float64) bool {
    return NumberOfRows(A) == Rank(A)
}

func Rank(A [][]float64) int {
    var rank int
    matrix, _ := REF(A)
    for i := 0; i < NumberOfRows(A); i++ {
        isNonZeroPresent := false
        for j := 0; j < NumberOfCols(A); j++ {
            if matrix[i][j] != 0 {
                isNonZeroPresent = true
            }
        }
        if isNonZeroPresent {
            rank++
        }
    }
    return rank
}
#

if the matrix (square) is not singular, then it is guaranteed that the determinant is 0, right?
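
For reference, the standard relationship runs the other way: a square matrix is singular exactly when its determinant is 0, which is the same as its rank being smaller than its size. So a non-singular matrix has a nonzero determinant, and a function that returns true when the rank equals the number of rows is testing for non-singularity. The sketch below checks this in Python with exact fractions. It is a standalone illustration and not a port of the `REF` helper from the Go snippet, whose behavior isn't shown here.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination on exact fractions (no rounding error)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_count = 0
    for col in range(n_cols):
        # Look for a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(pivot_count, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_count], rows[pivot] = rows[pivot], rows[pivot_count]
        # Eliminate this column from every row below the pivot row.
        for r in range(pivot_count + 1, n_rows):
            factor = rows[r][col] / rows[pivot_count][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_count])]
        pivot_count += 1
    return pivot_count

def is_singular(matrix):
    """A square matrix is singular exactly when its rank is below its size."""
    return rank(matrix) < len(matrix)

singular = [[1, 2], [2, 4]]   # second row is twice the first, determinant 0
regular = [[1, 2], [3, 4]]    # determinant -2
```

With `float64` entries, as in the Go version, comparing against exactly `0` after elimination can misjudge rows that are zero only up to rounding error, so a small tolerance is usually safer there.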

versed flame
#

If I get a 'good' result when doing machine learning, what are ways to figure out if that result really is correct or not?
I'm basically an idiot, and I'm probably training my model towards the wrong thing or whatever, or testing it incorrectly.

#

Ill try to figure out how to do that

#

I'm doing something probably every moron is doing and trying to predict stocks. I've been getting < 0.4-0.5 accuracy constantly and now suddenly I get 0.75

#

I assume im doing something wrong.

grand geyser
#

nvm it is not javascript
it is golang code

#

Guys is this video good enough to learn pytorch?

https://youtu.be/V_xro1bcAuA?feature=shared

Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.

✏️ Daniel Bourke developed this course. Check out his channel: https://www.youtube.com/channel/UCr8O8l5cCX85Oem1d18EezQ

🔗 Code: https://github.com/mrdbourke/pytorch-deep-learning
🔗 Ask a question: https://githu...

wooden sail
abstract rune
grand geyser
#

I thought it was JavaScript till I saw "func"

#

I don't think you will find many people that know golang in the python community..... 💀

grand geyser
abstract rune
grand geyser
#

People can have different favourite language
Someone's favourite language can be java or JavaScript or c or c++ or python etc

jaunty helm
#

playing around with polars, any way to select the columns where the value isn't 0? (without doing it manually ofc)

jaunty helm
#

shape isn't preserved but works fine for my case

graceful carbon
#

how long does it take to learn to make AI assistant in python

#

??? hello

#

bro is anyone there

serene scaffold
serene scaffold
charred light
agile cobalt
#

maybe a pie, sunburst or icicle chart

charred light
#

There's 4 dimensions, 3 categorical 1 continuous.

molten forge
#

Does anyone have experience using shifted Legendre-Fourier moments for image analysis?

buoyant vine
#

Tbh, as much as that would be cool, I suspect what it is actually going to lead to is, some creative new scams (probably) and a lot of spam of automated systems producing low quality content on all the major platforms, i.e. YT, TK, etc...

#

not that it isn't already at that stage (spam wise)

lapis sequoia
#

merger datasets. Like mergers and acquisitions datasets

mild grotto
#

Ok, I'm having trouble wrapping my head around this problem:
I have a numpy matrix M, and an adjacency matrix A.
I also have a heightmap H. All these are the same dimensions.

I want to calculate a gradient map G using H and A. to figure out the gradient in each adjacent direction.
Then I want to "move" each value in H in the direction of the largest gradient...

quaint loom
#

Has anyone here used the Mantel test and run into this issue:
ecopy Mantel test ValueError: Matrix d1 must be a square, symmetric distance matrix

If so, how did you handle it?

boreal gale
quaint loom
boreal gale
quaint loom
boreal gale
#

which mantel test implementation/library are you using?

quaint loom
boreal gale
#

can you turn on some debugger and see where it fails and step in

quaint loom
#

I will share it once I have finished my dinner. Okay?

boreal gale
#

sure

proper timber
#

hello everyone

#

i'm a python developer

unkempt yoke
#

Hi python developer

quaint loom
# boreal gale okay, are you using a jupyter notebook?

So I think the problem stems from improper construction of the distance matrices, resulting in matrices that are neither square nor symmetric. The debugging seems to indicate that the matrices are not square and symmetric, which leads to errors in later operations such as the Mantel test.

```py
env_distances = np.zeros((len(area_data), len(area_data)))
target_distances = np.zeros((len(area_data), len(area_data)))

print("Constructing distance matrices...")
for i in range(len(area_data)):
    for j in range(len(area_data)):
        # Compute distances for env_distances
        env_distances[i, j] = np.linalg.norm(area_data[env_columns].iloc[i] - area_data[env_columns].iloc[j])

        target_distances[i, j] = np.linalg.norm(area_data[target_columns].iloc[i] - area_data[target_columns].iloc[j])

print("Distance matrix (env_distances):")
print(env_distances)
print("Distance matrix (target_distances):")
print(target_distances)
```
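
A double loop like this yields a square matrix by construction, so it may be worth checking the properties directly before handing the matrices to the Mantel test. The sketch below builds the pairwise Euclidean distances with NumPy broadcasting and tests for squareness, symmetry, a zero diagonal, and NaN. The sample points are a hypothetical stand-in for `area_data[env_columns].to_numpy()`. That missing values are the culprit here is only a guess and is not confirmed anywhere in the thread.

```python
import numpy as np

def pairwise_euclidean(points):
    """Return the (n, n) matrix of Euclidean distances between rows of `points`."""
    points = np.asarray(points, dtype=float)
    # Broadcasting gives every row-vs-row difference in one (n, n, d) array.
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def is_distance_matrix(d):
    """True if d is square, NaN-free, symmetric, and has a zero diagonal."""
    d = np.asarray(d)
    return bool(
        d.ndim == 2
        and d.shape[0] == d.shape[1]
        and not np.isnan(d).any()
        and np.allclose(d, d.T)
        and np.allclose(np.diag(d), 0.0)
    )

# Hypothetical stand-in for area_data[env_columns].to_numpy().
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
distances = pairwise_euclidean(points)

# The same data with one missing value: NaN spreads into a whole row and column.
with_gap = points.copy()
with_gap[1, 0] = np.nan
distances_with_gap = pairwise_euclidean(with_gap)
```

Because NaN never compares equal to itself, a single missing value is enough to make a symmetry check fail even though the matrix looks square when printed. If `is_distance_matrix` returns `False` on the real data, `np.isnan(distances).any()` is a quick way to see whether that is the reason.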

hybrid spruce
#

I have a large dataset shared with me on Dropbox, but it’s too large to download directly, and I can’t copy it to my own Dropbox either to perform CLI operations. Any ideas on how to access it?

quaint loom
uncut beacon
#

Hello, I have a relatively large dataset that I'm working with for DL using Tensorflow.

Is there a recommended way to select features in my dataset; for example, getting the correlation of each feature if it can help me answer a business question, or how is it correlated to a specific feature that I want to use. The model is expected to have 95% accuracy, so I'm worried about my feature selection.

Any tips how to approach this? Thanks!

toxic mortar
#

How is this all one cluster?

#

I use HDBSCAN(min_cluster_size=5, min_samples=2, cluster_selection_epsilon=0.35)

hushed matrix
#

Hi, I've been using Python for a few years now, and recently I started studying artificial intelligence (more specifically, computer vision). Do you usually use Jupyter Notebook or an IDE on your computer?

#

Yeah, I'm currently using Google Colab for small experiments, but do you notice any difference in performance?

toxic mortar
#

yes. my bad

#

I thought it was clustering algo

#

related problem

toxic mortar
#

How can I add more documents to an already trained BERTopic model? I would like to train it further on them.
However, I get this error: ValueError: All arrays must be of the same length
code:```py
if new_model:
    ...
    topic_model = BERTopic(hdbscan_model=hdbscan_model, embedding_model=embedding_model,
                           representation_model=representation_model,
                           vectorizer_model=vectorizer_model, language='english', verbose=True)

    topics, probs = topic_model.fit_transform(documents)

else:
    topics, probs = topic_model.fit_transform(documents)
    topic_model.update_topics(documents, topics)
...
fig = topic_model.visualize_documents(documents, reduced_embeddings=reduced_embeddings, custom_labels=True, width=1400, height=850, title='<b> Sentiment Cluster Map </b>')
```

graceful carbon
#

hey guys 😏

#

pls sm help me learn python

serene scaffold
graceful carbon
#

i wanna learn like everything

#

or like most of the things

serene scaffold
#

!resources

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

spare briar
#

depending on the 2d projection different info is preserved

#

try UMAP if you want nearest neighbor graph to be respected in low dimensions

shadow viper
#

hello, good day everyone
hope everyone is having a good time.

I'm a data scientist, still learning but I try. I wanted to learn Power BI just so I can have an idea of analysis. A friend of mine said there's no need and that I should focus on my data science, but I still want to. Is it a good combo or should I just drop it?

jagged dirge
#

Is model.predict() the function you use in production to run a trained model?

pearl ocean
#

Hi Guys!

#

Is anyone like really good at developing AI with Python?

#

Machine Learning, Deep Learning, Neural Networks, etc..

serene scaffold
slate breach
#

Good day guys,

I am building a project for my school related to house price prediction in the USA.

Where do you guys get a dataset, either free or paid?

Thanks in advance!!

orchid sky
#

I need help with solving this problem here

royal crest
#

doesn't look like a problem

serene scaffold
serene scaffold
orchid sky
#

Yes, there is an ipynb file, but I would need to DM it to do that

#

I tried to send it here but it did not work

serene scaffold
#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

slate breach
serene scaffold
slate breach
#

wym?

serene scaffold
#

why did you ask "Any freelancer that has...?" instead of just "Anyone that has ...?"

slate breach
#

Curious about where they get their data if it is a private project with more accurate results

orchid sky
#

I sent it to you

slate breach
serene scaffold
serene scaffold
orchid sky
#

@serene scaffold there it is

serene scaffold
orchid sky
orchid sky
serene scaffold
orchid sky
#

That is the problem, it does not show it

#

An exception was thrown while running your function: only integer scalar arrays can be converted to a scalar index.
Input matrix:
[[1 0 0 5]
[0 1 0 6]
[0 0 1 7]]

#

for row in reversed(range(num_rows)):
substitution_row = M[num_rows+1]

#

I assume it's this line of code here

slate breach
#

Thank you sir!!! Stelercus

serene scaffold
#

Please don't delete messages in which you ping people, as this causes a ghost ping

orchid sky
#

Okay

serene scaffold
#

anyway, is there more code in the notebook(s) that you aren't showing?

orchid sky
#

Yes as there entire lines of it

#

But the error itself is just at that function call there

serene scaffold
# orchid sky Yes as there entire lines of it

okay, please put all of the notebook(s) in the paste bin. you can easily convert entire notebooks to text with python -m jupyter nbconvert --to script the_notebook.ipynb --stdout

orchid sky
#

@serene scaffold there

serene scaffold
orchid sky
#

Yes

serene scaffold
#

can you show the code for w2_unittest.test_gaussian_elimination?

orchid sky
#

Am doing that now

serene scaffold
orchid sky
#

What part

serene scaffold
#
In [7]: A = np.array([[1,2,3],[0,0,0], [0,0,5]])

In [8]: A
Out[8]:
array([[1, 2, 3],
       [0, 0, 0],
       [0, 0, 5]])

In [11]: B = np.array([[1], [2], [4]])

In [13]: A.max(B)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 A.max(B)

TypeError: only integer scalar arrays can be converted to a scalar index
#

@orchid sky here
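
For context on the traceback above: the first positional parameter of `ndarray.max` is `axis`, so `A.max(B)` asks NumPy to treat the array `B` as an axis number, which is why it complains about a scalar index. A minimal sketch of calls that do work, assuming the goal was either a maximum along an axis or an elementwise maximum (the thread never establishes which one was intended):

```python
import numpy as np

A = np.array([[1, 2, 3], [0, 0, 0], [0, 0, 5]])
B = np.array([[1], [2], [4]])

overall_max = A.max()        # largest value in the whole array
column_max = A.max(axis=0)   # one maximum per column
row_max = A.max(axis=1)      # one maximum per row

# Elementwise maximum of two arrays; B (3x1) is broadcast across A's columns.
elementwise_max = np.maximum(A, B)

print(overall_max)
print(column_max)
print(row_max)
print(elementwise_max)
```

None of these is needed to count rows, though; that only requires the array's shape.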

orchid sky
#

Okay, so how can I fix the error then

serene scaffold
orchid sky
#
    # Iterate from bottom to top
    for row in reversed(range(num_rows)): 
        substitution_row = M[num_rows+1]
        
        # Get the index of the first non-zero element in the substitution row. Remember to pass the correct value to the argument augmented.
        index = M[row-1]
       
        # Iterate over the rows above the substitution_row
        for j in range(i+1, num_rows): 
#

I thought it was this area, since I changed the length for them

#

A = np.array([[1,2,3],[0,1,0], [0,0,5]])
B = np.array([[1], [2], [4]])
row_echelon_form(A,B)

serene scaffold
#

that's not where A.max(B) happens

orchid sky
#

Sorry

#

def back_substitution(M):
    """
    Perform back substitution on an augmented matrix (with unique solution) in reduced row echelon form to find the solution to the linear system.

    Parameters:
    - M (numpy.array): The augmented matrix in row echelon form with unitary pivots (n x n+1).

    Returns:
    numpy.array: The solution vector of the linear system.
    """
    A = np.array([[1,2,3],[0,0,0], [0,0,5]])
    B = np.array([[1], [2], [4]])
    
    # Make a copy of the input matrix to avoid modifying the original
    M = M.copy()

    # Get the number of rows (and columns) in the matrix of coefficients
    num_rows = M.shape[A.max(B)]

#

There

serene scaffold
#

so you have these definitions for A and B

    A = np.array([[1,2,3],[0,0,0], [0,0,5]])
    B = np.array([[1], [2], [4]])

Given those, what is A.max(B) supposed to be?

#

we know A.max(B) does something other than what you wanted. So once we know what you do want, we can figure out how to code it.

orchid sky
#

Something I saw online

#

So what should it be then

serene scaffold
#

did you generate this with ChatGPT?

orchid sky
#

No, it's from Coursera

serene scaffold
orchid sky
#

Me neither

#
    # Make a copy of the input matrix to avoid modifying the original
    M = M.copy()

    # Get the number of rows (and columns) in the matrix of coefficients
    num_rows = M.shape[len(A)]
serene scaffold
#

M.shape is a tuple of integers. if M is an array with 5 rows and 7 columns, then M.shape will be (5, 7)

#

so if num_rows is supposed to be the number of rows, how would you fill in the blank? M.shape[ ]

orchid sky
#

So it still shows the same error

#

An exception was thrown while running your function: tuple indices must be integers or slices, not tuple.
Input matrix:
[[1 0 0 5]
[0 1 0 6]
[0 0 1 7]]

serene scaffold
#

Try answering this question: if num_rows is supposed to be the number of rows, how would you fill in the blank? M.shape[ ]

orchid sky
#

M.shape[num_rows+1]

#

I did try that

serene scaffold
#

but if you're trying to define num_rows for the first time, you can't use num_rows yet

#

so again, M.shape is a tuple of the form (number of rows, number of columns)

#

you get elements of tuples the same ways you get elements of lists.

orchid sky
#

M.shape(5,7) then

serene scaffold
#

5.7?

orchid sky
#

5,7

serene scaffold
#

why

#
my_stuff = ['a', 'b', 'c', 'd']

how would you get c?

orchid sky
#

my_stuff[2]

serene scaffold
#

yes

#

you get elements from tuples the same way you get elements of lists
M.shape is a tuple with two elements. the first element is the number of rows in M

#

The second element is the number of columns in M

orchid sky
#

I do get that part now M.shape[3, 4] then

serene scaffold
#

no.

#

you're trying to get the element that is the number of rows, right?

orchid sky
#

Yes

#

and then the second element is columns

serene scaffold
#

how do you get the first element of a tuple?

orchid sky
#

tup[0]

serene scaffold
#

M.shape is a tuple with two elements

#

how do you get the element that is the number of rows of M?

orchid sky
#

M.shape[0]

serene scaffold
#

Tuple and list indices must be ints. not strings.

orchid sky
#

I do know

serene scaffold
#

Then why did you write '0'

orchid sky
#

Just fixed it then

serene scaffold
#

how would you get the number of columns

orchid sky
#

M.shape[0][len(B)]

#

I would try to do the length but have no idea if it should be B+1 or not

serene scaffold
#

M.shape is a tuple of two elements, that are both ints
so what type is M.shape[0]?

orchid sky
#

dictionary almost about

serene scaffold
#

No, it's an int.

orchid sky
#

Int

serene scaffold
#

so if you do M.shape[0][len(B)], that's like doing 5[len(B)]

orchid sky
#

Yes so would I just need to do M.shape[len(A)][len(B)]

serene scaffold
#

No.

#

I'm asking how you would get the element of M.shape that represents the number of columns in M

orchid sky
#

M.shape(length(M))

serene scaffold
#

What course are you taking, and how far along in the course are you?

orchid sky
#

Linear algebra, and none of the demo code has shown this

#

Week 2

serene scaffold
#

does the course assume that you have prior experience with Python?

orchid sky
#

Yes

#

I have done it before, but not linear algebra

serene scaffold
orchid sky
#

Okay, I do get it. What was the final answer then

serene scaffold
#

if you have M.shape, then M.shape[1] is the number of columns in M
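
To make the tuple indexing concrete, here is a small sketch using the 3 x 4 augmented matrix from the failing test earlier in the thread. The variable names are only illustrative:

```python
import numpy as np

# The augmented matrix printed in the error message above.
M = np.array([[1, 0, 0, 5],
              [0, 1, 0, 6],
              [0, 0, 1, 7]])

print(M.shape)              # a plain tuple: (number of rows, number of columns)

num_rows = M.shape[0]       # first element of the tuple
num_cols = M.shape[1]       # second element of the tuple

# Tuple unpacking gives both values in one step.
rows_unpacked, cols_unpacked = M.shape
```

For an augmented matrix the column count is one larger than the row count, so using `M.shape[1]` as a row count leads to out-of-bounds indices.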

orchid sky
#

Why that then

#

An exception was thrown while running your function: index 3 is out of bounds for axis 0 with size 3.

#

It showed that to me when I tried to change the row+1 for that

#

If you can still help me out

serene scaffold
orchid sky
#
    # Make a copy of the input matrix to avoid modifying the original
    M = M.copy()

    # Get the number of rows (and columns) in the matrix of coefficients
    num_rows = M.shape[1]

    ### START CODE HERE ####
    
    # Iterate from bottom to top
    for row in reversed(range(num_rows)): 
        substitution_row = M[row-1]
        
        # Get the index of the first non-zero element in the substitution row. Remember to pass the correct value to the argument augmented.
        index = M[row-1]
       
        # Iterate over the rows above the substitution_row
        for j in range(row+1, num_rows): 
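
For comparison, a self-contained sketch of back substitution on an augmented matrix in row echelon form with unit pivots. This is not the course's reference solution: the notebook appears to provide its own helper for locating the pivot (the one with an `augmented` argument), which is not shown in the thread, so `np.flatnonzero` stands in for it here as an assumption. It also assumes the system has a unique solution.

```python
import numpy as np


def back_substitution(M):
    """Solve a system given as an n x (n+1) augmented matrix in row echelon
    form with unit pivots. Returns the solution vector."""
    # astype returns a new array, so the caller's matrix is left untouched.
    M = M.astype(float)

    # Number of rows comes from the first element of the shape tuple.
    num_rows = M.shape[0]

    # Iterate from the bottom row to the top row.
    for row in reversed(range(num_rows)):
        substitution_row = M[row]

        # Column of the pivot: first non-zero entry among the coefficients
        # (the last column holds the constants, so it is excluded).
        index = np.flatnonzero(substitution_row[:-1])[0]

        # Eliminate the pivot column from every row above the current one.
        for j in range(row):
            value = M[j, index]
            M[j] = M[j] - value * substitution_row

    # After elimination the last column holds the solution.
    return M[:, -1]


solution = back_substitution(np.array([[1, 2, 3, 14],
                                       [0, 1, 4, 14],
                                       [0, 0, 1, 3]]))
print(solution)
```

Note that the inner loop runs over `range(row)`, the rows above the substitution row, rather than `range(row+1, num_rows)`, which would visit the rows below it.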
#

Do you think I am unteachable then

serene scaffold
#

Definitely not unteachable. But it sounds like you do not have the requisite Python knowledge for this course.

#

If the course assumes prior experience with Python.

serene scaffold
#

But definitely talk to the instructor before making any decisions

orchid sky
#

This is the course name

#

Okay

serene scaffold
#

oh, did you enroll for free?

orchid sky
#

How fast could you get this done then

#

Company sponsored

serene scaffold
orchid sky
serene scaffold
#

right now, I do not think you are prepared.

orchid sky
#

Okay

tough galleon
#

Using pandas, does anyone know how to check if my dataframe has a dtype that is considered a 'date'

serene scaffold
#

remember: strings that are formatted as dates are horrible

tough galleon
#

but how can I check in the code itself, like in an if statement

#

I thought of if "datetime" in df.dtypes.values but not sure if there is a prettier way

serene scaffold
orchid sky
#

Do not expect to get a lot of help here either

serene scaffold
#

@orchid sky I spent almost an hour helping you

orchid sky
#

I do know

serene scaffold
tough galleon
#

alright, thanks

serene scaffold
tough galleon
#

yes but the problem is that I want to check for either datetime or timedelta

serene scaffold
#

you could use an or condition, no?

tough galleon
#

yes I did: if 'm8[ns]' in df1.dtypes.values or 'M8[ns]' in df1.dtypes.values:
but it really does not look pretty

#

there has to be a better way to check for it, right?

serene scaffold
#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

tough galleon
serene scaffold
#

one moment

tough galleon
#

take your time

serene scaffold
#

but I'm not entirely sure what your tastes are

tough galleon
#

Alright, thanks for trying I guess

serene scaffold
#
any(t.kind.lower() == 'm' for t in df.dtypes.tolist())
#

this would work because the .kind for datetime is M, and timedelta is m
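
The one-liner above can be wrapped in a helper, and pandas also ships explicit type-checking functions that read a little closer to the intent. A short sketch of both; the function and column names are made up for illustration:

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_timedelta64_dtype


def has_date_like_column(df):
    """True if any column has dtype kind 'M' (datetime) or 'm' (timedelta)."""
    return any(dtype.kind in ("m", "M") for dtype in df.dtypes)


def has_date_like_column_explicit(df):
    """Same check, spelled out with the pandas.api.types helpers."""
    return any(
        is_datetime64_any_dtype(dtype) or is_timedelta64_dtype(dtype)
        for dtype in df.dtypes
    )


dates = pd.DataFrame({"when": pd.to_datetime(["2024-04-06", "2024-04-07"]), "n": [1, 2]})
deltas = pd.DataFrame({"took": pd.to_timedelta(["1 days", "2 days"])})
plain = pd.DataFrame({"n": [1, 2], "s": ["a", "b"]})

for frame in (dates, deltas, plain):
    print(has_date_like_column(frame), has_date_like_column_explicit(frame))
```

Either version can go straight into an `if` statement. Date strings stored in an object column are not detected by either check, since their dtype is not a datetime dtype.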

tough galleon
#

Ah

#

Makes sense, thanks

orchid sky
#

I am bad at this stuff but have been using pandas for a while

pseudo pasture
#

Hello Folks,

I want to build LLM-powered (ChatGPT-style) sports fantasy picks, but I couldn't find any resources on how to approach this idea. Please help me with how and where I should begin.

pearl ocean
#

My main question I was originally intending to ask is if anyone has a good idea of building AI Models, with either TensorFlow or PyTorch? (Even better if you know both)

#

I'm just starting out with AI Development in python - considering using C++ as well for it.

elder hemlock
#

⚠️ RANT INCOMING ⚠️

So,

#

The way I interpret human logic is similar to what I've heard in classical philosophy, "induction" and "deduction".

#

I see "induction" as finding the conceptual connections based on the observation of an outcome.

Whereas "deduction" is finding the outcome based on the observation of conceptual connections

Current AI seems very inductive, which might explain some of the issues we've seen:

Some examples might include:

  • The algorithm being impressionable, just letting you assert anything to be true.
  • Algorithms appearing to struggle with thought experiments with no real world equivalent.
  • Representation biases.
  • Conceptual contradictions
#

Do you think the possibility of deductive logic algorithms has been explored?

elder hemlock
#

To illustrate, I'll make up two conversations to demonstrate how I think each would work:

#

Induction:
```
(Here, the algorithm has no choice but to search for a real world example, which would either be from fictional material, or in this case, other people's answers.)

A : Based on our media representation and reaction to vampires, this must mean we would be scared, and morbidly curious.
```
Deduction:
```
P : What would humans do if vampires were real?

(At this point, the algorithm would either ask the prompter for clarification, or use induction to establish an understanding.)

(The AI will then assume that all paradoxical or contradictory outcomes are impossible, and remove them from the set.)

A : Assuming that vampires must kill humans to survive, this would limit the set of outcomes to:

1. Vampires depending on humans.
2. Vampires and humans being co-dependent.
3. Humans killing vampires.
4. Vampires killing all humans, and dying.
5. Vampires not killing humans, and dying.
```
#
  • My speculation is that inductive algorithms would use knowledge graphs that store event data and connect it (like existing AI)

    • More memory intensive, less process intensive
    • More perceptive of the real world
    • Struggles with thought experiments and abstract logic
    • Predisposed to popular culture and convention
  • And that deductive algorithms would use a Markov chain to map the abstract concepts.

    • Less memory intensive, more process intensive
    • Limited to theoretical thought
    • Capable of speculating on scenarios it has not encountered yet
    • Detached from perceiving the material world
    • Predisposed to building internally defined principles.
#

If this is true, I argue that they'd both serve as the duality of a human, and the key to its simulation.

rapid hill
#

hello, does anyone know about fbprophet module

jaunty helm
rapid hill
#

I'm asking in general: does anyone have information about this module?

#

And I'm also hiring a Python developer for finance. If anyone is interested, don't hesitate to message me privately

#

Excuse my English, I'm not good at it

elder hemlock
#

Sorry for that rambling!
I get what you mean, and I agree an artificial thinking algorithm would compare fully to us.

#

These ideas keep coming back frequently, and I keep seeing it around me.

#

I'm tempted to believe that the machine can unlock a form of thinking that can be purely enclosed in a think tank. Undisturbed by physical mysteries.

#

And I sometimes dip into this when thinking of ways to design AI for games.

#

I came up with a design concept I'm still trying to implement. Where instead of the AI using detection to perceive the world, they can look at the game's programming, and make decisions based on object oriented relationships.

#

Whereas those machines learn from the outcome of a mutating technique, the theory behind what I'm suggesting,

#

Is that the robot has a preconception of how object types interact, and then it can decide based on that.

#

So for example, if a game npc sees in the code that it's possible for the player to kill them, then it might take measures to avoid or confront the player, depending on which one is more likely to succeed based on how the code looks.

#

It's a little like "static analysis", and I think it's a way to create an adaptive AI with good hindsight.

bleak gate
elder hemlock
#

Like, imagine that you could create your own sword with code.

#

This algorithm might lend itself to measuring and finding efficient courses of action, which can be applied to debugging, difficulty scaling, and generated player advice.

#

In my game project, I intend to use this to give me a metric of a fair design, so I can keep all player creations balanced.

bleak gate
#

dAamnnnnnnnnnnnnnnnnnnnnnnnnn

#

that sounds phenomenal

#

what language are you gonna use to code it

elder hemlock
#

Uh, well I'm using python to make a proof of concept, and if I ever finish it, I might show this to people who are interested, or just take it into a different language.

bleak gate
#

sounds fair

#

so it wont have levels to it

#

right ?

elder hemlock
#

For the prototype, my goals are to make:

  1. A modest scripting system, where you can write an object behavior
  2. An algorithm to apply this analysis technique
  3. A procedure to reject or accept a player design (optionally, a procedure that could edit a design to suit the fairness requirements)
  4. A sandbox enclosure where objects will come into existence
#

(This won't have graphics, I'll use text)

#

If it works, then it might transfer into real games.

bleak gate
#

i see

#

super interesting

elder hemlock
#

I occasionally rant about this idea, because it does seem crazy.

bleak gate
#

hell naw

elder hemlock
#

I get the feeling I'm confusing people though

bleak gate
#

chatgpt wouldve been a good laugh in 2025

#

2015*

#

look at it now

#

apple vision

#

all of it

elder hemlock
#

Since the idea is difficult to explain, I've been working in a vacuum, and I wonder if the idea is actually new or not.

toxic mortar
#

Why do I get this error:
```
Batches: 100%|##########9| 13609/13612 [08:24<00:00, 27.17it/s]
Batches: 100%|##########| 13612/13612 [08:24<00:00, 26.98it/s]
2024-04-06 16:54:38,324 - BERTopic - Dimensionality - Completed ✓
2024-04-06 16:54:38,330 - BERTopic - Cluster - Start clustering the reduced embeddings
2024-04-06 16:55:02,045 - BERTopic - Cluster - Completed ✓
[2024-04-06 16:55:02,534: ERROR/MainProcess] Task econ.api.queries.features.process_new_files_topic[fb957415-f2a8-4f4d-a4b3-71c995615de8] raised unexpected: ValueError('empty vocabulary; perhaps the documents only contain stop words')
app\venv\Lib\site-packages\sklearn\feature_extraction\text.py", line 1295, in _count_vocab
raise ValueError(
ValueError: empty vocabulary; perhaps the documents only contain stop words
```


I am using BERTopic and I preprocess docs with nltk.tokenize.sent_tokenize. What can this be? 

Is my preprocessing the problem, or clustering algo?
This is my topic model
serene scaffold
#

please don't post screenshots of text

#

!code

arctic wedgeBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

toxic mortar
#
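import openai
import tiktoken
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired, MaximalMarginalRelevance, OpenAI
from hdbscan import HDBSCAN
from sklearn.feature_extraction.text import CountVectorizer

# embedding_model and OPENAI_API_KEY are defined elsewhere in the project.


def create_model():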
    hdbscan_model = HDBSCAN(min_cluster_size=20, metric='euclidean', cluster_selection_method='eom', prediction_data=True)
    main_representation = KeyBERTInspired()
    client = openai.OpenAI(api_key=OPENAI_API_KEY)
    aspect_model1 = [KeyBERTInspired(top_n_words=45), MaximalMarginalRelevance(diversity=0.7)]
    tokenizer = (
        tiktoken.encoding_for_model("gpt-3.5-turbo"))
    prompt = """
             You are a helpful, respectful and honest assistant for labeling topics.
             I have a topic that contains the following documents:
             [DOCUMENTS]

             Based on the information above, extract a short but highly descriptive topic label of at most 3 or 4 words. Be precise. Make sure it is in the following format:
             topic: <topic label>
             """
    aspect_model2 = OpenAI(client, model="gpt-3.5-turbo", exponential_backoff=True, chat=True, prompt=prompt,
                           tokenizer=tokenizer, diversity=0.75)
    representation_model = {"Main": main_representation, "Aspect1": aspect_model1, "Aspect2": aspect_model2, }
    vectorizer_model = CountVectorizer(stop_words="english", ngram_range=(1, 2))
    topic_model = BERTopic(hdbscan_model=hdbscan_model, embedding_model=embedding_model,
                           representation_model=representation_model,
                           vectorizer_model=vectorizer_model,
                           language='english', calculate_probabilities=True,verbose=True)
    return topic_model
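
One way to check whether preprocessing plays a part in the error above: scikit-learn's `CountVectorizer` raises exactly this `ValueError` when nothing survives tokenization and stop-word removal, which can happen with very short sentences. The sketch below reproduces that and filters such sentences out before modelling. The sample sentences and the helper name are invented, and whether this is what happens inside the BERTopic pipeline here is an assumption, since the traceback is truncated.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Same vectorizer settings as in create_model above.
vectorizer = CountVectorizer(stop_words="english", ngram_range=(1, 2))
analyzer = vectorizer.build_analyzer()


def has_usable_tokens(doc):
    """True if at least one token survives tokenization and stop-word removal."""
    return len(analyzer(doc)) > 0


# Sentence-level splitting can produce fragments like these.
sentences = ["Inflation rose sharply in March.", "It was.", "The and of.", "1."]
kept = [s for s in sentences if has_usable_tokens(s)]
dropped = [s for s in sentences if not has_usable_tokens(s)]
print(kept)
print(dropped)

# Fitting on only the dropped fragments reproduces the error message.
try:
    CountVectorizer(stop_words="english").fit(dropped)
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

If a noticeable share of the sentences ends up in `dropped`, filtering them out (or working with longer text units than single sentences) is worth trying before changing the clustering settings.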
lapis sequoia
#

@serene scaffold what should i start learning first for Data Analysis?

serene scaffold
arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

hasty furnace
white crown
#

I have this function. data_frame is a dictionary of dataframes which I am reading in from an Excel file with multiple sheets. I am trying to add a few columns to the dataframe in a sheet called "Signaling_Port" ONLY, leaving the rest of the dictionary of dataframes alone. When I attempt to print the entire df I get "ValueError: If using all scalar values, you must pass an index". I have been at this for a while now. I have attempted to reset the index, set the index to 0, and even put the return in a list, "pd.DataFrame([data_dict])", and can't seem to figure out the correct syntax. Please help.

def add_data_Signaling_Port(data_frame):
    updated_data_dict = {"zone_new": ["data1", "data2"], 
                         "Global1_new": ["new_data1", "new_data2"], 
                         "test_new": ["test_data1"]}
    
    for key, values in updated_data_dict.items():
        data_frame["Signaling_Port"].loc[0, key] = values[0] if values else None
    
    return pd.DataFrame(data_frame)
   
data_frame = add_data_Signaling_Port(data_frame)
print(data_frame)
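
A likely cause, stated as an assumption because no traceback was posted: `pd.DataFrame(data_frame)` receives a dict whose values are whole DataFrames, and the constructor cannot line those up as columns. Since the sheets already are DataFrames, one option is to modify the one sheet in place and return the dict unchanged. A minimal sketch with made-up sheet contents:

```python
import pandas as pd


def add_data_signaling_port(sheets):
    """Add new columns to the 'Signaling_Port' sheet only; return the same dict."""
    new_values = {
        "zone_new": "data1",
        "Global1_new": "new_data1",
        "test_new": "test_data1",
    }
    sheet = sheets["Signaling_Port"]
    for column, value in new_values.items():
        # Creates the column if it is missing and fills row 0; other rows stay NaN.
        sheet.loc[0, column] = value
    return sheets  # still a dict of DataFrames, not a single DataFrame


# Stand-in for the dict that pd.read_excel(path, sheet_name=None) returns.
sheets = {
    "Signaling_Port": pd.DataFrame({"port": [5060, 5061]}),
    "Other": pd.DataFrame({"x": [1]}),
}
sheets = add_data_signaling_port(sheets)

# Print each sheet separately instead of wrapping the dict in pd.DataFrame.
for name, frame in sheets.items():
    print(name)
    print(frame)
```

Writing the result back to Excel then means looping over the dict and calling `to_excel` once per sheet.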
elder hemlock
#

I feel like any analogy I could give might complicate things.

#

I'll think.

#

I don't know any real examples of this, unfortunately.

#

:_

desert oar
#

You mean something like, giving the AI access to the actual game state and mechanics? Definitely not a new idea

#

It's not like the OpenAI Dota bot is using computer vision to parse the screen

#

It's a question of what you actually want to achieve

iron basalt
#

But note that if you are comparing it to human performance, it's apples and oranges.

terse frigate
#

Hi guys... I am trying to learn MLOps, but unfortunately I study on a MacBook. I would like to know if there are any cheap clouds out there that I can experiment with and learn hands-on. Open to any suggestions. Cheers! 😄

serene scaffold
terse frigate
#

I am learning docker

#

I am very lost tho

serene scaffold
terse frigate
#

My uni has a cluster yes

#

But I graduated so I don’t think I can use that anymore

serene scaffold
#

so you're not a university student

terse frigate
#

Well

serene scaffold
#

being a university graduate is not the same as being a university student.

terse frigate
#

I finished my course last month haha i haven’t received the degree yet

#

I still have access to uni email and all that but not sure for how long

serene scaffold
#

I see

#

well, I would try enrolling in AWS's student thing ASAP, then

terse frigate
#

Oh

#

Ok ok I’ll do that

#

Also can you suggest some learning materials for docker for MLops on Mac?

serene scaffold
#

That aside, I think you're probably smart to be focusing on MLops.

terse frigate
serene scaffold
terse frigate
#

It’s a niche

#

Isn’t it?

serene scaffold
# terse frigate It’s a niche

Just that I think a lot of students/pre-career people have their heart set on ML, but they don't realize that it's all theoretical math that they won't like.

terse frigate
#

Well, I kinda got interested because I found myself very lost and overwhelmed by MLops at my internship

serene scaffold
#

what were you trying to do during your internship that made you feel lost

terse frigate
#

I was working for a startup that needed me to deploy my code on bare metal

serene scaffold
#

so you had to start with a machine that didn't have an OS installed, or what?

terse frigate
#

Yep

#

I could not do it

#

I had absolutely no clue or direction

serene scaffold
#

weird that they asked you to do that, and not the person who bought the hardware

terse frigate
#

Also they told me I’d be working under someone

#

But it was just me doing everything

serene scaffold
terse frigate
#

From fetching data to training to deploying

terse frigate
iron basalt
#

I would not expect a beginner to be able to do anything in that situation. Sounds like you were in a strange situation.

terse frigate
iron basalt
terse frigate
#

Yeah and didn’t pay me for the last 2 months I worked

#

Because I “didn’t deliver”

serene scaffold
terse frigate
#

They even said they gonna hire another senior to help me with everything but that never happened lol

iron basalt
#

So you never feel completely stuck.

serene scaffold
terse frigate
iron basalt
terse frigate
#

Parse it from JSON to tabular

serene scaffold
terse frigate
#

Then train that table on a QA model

iron basalt
terse frigate
#

Design pipelines for CICD

#

deploy all that on metal

iron basalt
#

But I suppose it kind of is, so you don't sit there frustrated.

serene scaffold
terse frigate
#

Following Andrew Ng on coursera

serene scaffold
raw mortar
#

An intern defining an mlops strategy doesn't sound right to me

#

This is pretty good. It's a big ad for AWS products, which can be replaced with product xyz.
https://youtu.be/UnAN35gu3Rw

Learn how to design an end-to-end machine learning architecture, one step at a time, graduating from a simple model deployment to a complex multi-model strategy. This session aims to help architects working with data scientists and machine learning engineers to implement machine learning use cases. Prior knowledge of and experience with core AWS...

orchid forge
#

@serene scaffold hey

orchid forge
#

what is wrong with my code here?

#

can someone please help me

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied timeout to @scenic pier until <t:1712474799:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).

The <@&831776746206265384> have been alerted for review.

latent parcel
#

keep feeding the fire stelercus, keep feeding it

orchid forge
orchid forge
terse frigate
elder hemlock
#

I'm imagining an algorithm that can read game code, and do a static analysis on it to generate some behaviors for the npc.

#

Since game code is a static concept relationship map.

#

The benefit of this also being that if this game code evolves as the consequence of players creating new features in a sandbox style environment, the NPC should be able to understand this better?

#

Gradient descent?

#

For reference, this is supposed to be my proposal for a zero-learning model. It, uh, just "knows"?

#

At least that's how I think it works.

#

Yeah, that's closer to what I'm thinking.
Imagine you take a game script, turn it into a knowledge graph, and use that as the basis for an NPC's behaviors?

#

This knowledge graph may also include properties relating to a "Markov chain" I think?

#

So, where exactly does this hypothetical algorithm have to do any learning?

thorny lodge
#

Guys, I'm building an app for estimating the effort, time, and cost associated with software development projects. I'm trying to leverage ML for this, and I'm looking for a pre-trained model that I can use for such a task. Where would be a good place to find what I'm looking for? I tried the Hugging Face website and TensorFlow Hub, but I haven't used them before, so I'm not sure where and what to look for exactly. Kinda confused, I'd appreciate a lil guidance ❤️

elder hemlock
#

Maybe I'll put this example into steps:

  1. There is a hypothetical sandbox-style game where players may insert new custom programs to create new entities.
  2. There is a generic enemy, whose goal is to stop the player.
  3. The player designs a new object that has mechanics relevant to the enemy's efficiency.
  4. My proposal for a solution to this is to take the player's custom mechanic and add it to a knowledge graph / Markov chain.
  5. The enemy can then do an analysis on this data structure, to ascertain the best course of action, either generally, or in a specific scenario.
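
The steps above can be sketched very roughly in code. This is only an illustration under heavy assumptions: the relations are written by hand rather than extracted from game scripts, every entity and relation name is hypothetical, and the "analysis" is a plain breadth-first search rather than real static analysis or a Markov chain.

```python
from collections import deque

# Hypothetical relations, as if derived from player-written object scripts:
# (subject, relation, object)
RELATIONS = [
    ("player", "can_wield", "fire_sword"),
    ("fire_sword", "damages", "enemy"),
    ("enemy", "can_use", "water_shield"),
    ("water_shield", "blocks", "fire_sword"),
]


def build_graph(relations):
    """Turn relation triples into an adjacency mapping."""
    graph = {}
    for subject, relation, obj in relations:
        graph.setdefault(subject, []).append((relation, obj))
    return graph


def find_chain(graph, start, goal):
    """Breadth-first search for the shortest chain of relations from start to goal."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None


def choose_behaviour(graph, npc, threat):
    """Pick a behaviour for npc by reading the graph; nothing is learned."""
    chain = find_chain(graph, threat, npc)
    if not chain:
        return "ignore"
    # The last link in the chain is whatever acts on the NPC directly.
    weapon = chain[-1][0]
    counters = [
        obj for _, obj in graph.get(npc, [])
        if ("blocks", weapon) in graph.get(obj, [])
    ]
    return f"use {counters[0]}" if counters else "avoid"


graph = build_graph(RELATIONS)
print(choose_behaviour(graph, "enemy", "player"))
```

Adding a new player-made object only means adding triples to the graph (step 4); the decision procedure itself stays unchanged, which is the "rules instead of learning" property being described.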
#

This solution could be described as "artificial intelligence", but at the same time, does not require a learning process, other than step 4.

#

And is my example for a program that utilizes the "deductive" thinking method, instead of an "inductive" one.

#

I guess, it has rules?