#data-science-and-ml


slim bone
#

Or are the rest of them more uh.. hard-coded so-to-speak

twilit tundra
#

Not really but you don't produce intermediate representations in a boosting model for instance

slim bone
#

I see

twilit tundra
#

For instance in computer vision, your first few layers describe the edges of your picture and you don't need to train it to do that. The classical neuroscience vision model HMAX used a similar structure to a deep learning network but with preset Gabor filters

#

Not sure what other methods exist for CV though

slim bone
#

I'm afraid the conversation has reached technical levels I fail to understand ^^; On that note I suppose I'll take my leave

It was nice learning from you folks, thanks a lot

desert oar
#

read the link i posted above your message. use iloc for positional indexing (by row number) and loc for indexing by row label
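A minimal sketch of the difference, assuming a toy DataFrame (the column name `score` and the labels are made up for illustration). The integer labels are deliberately out of order so that position and label disagree:

```python
import pandas as pd

# Toy frame: labels 2, 0, 1 sit at positions 0, 1, 2.
df = pd.DataFrame({"score": [10, 20, 30]}, index=[2, 0, 1])

first_by_position = df.iloc[0]["score"]  # row at position 0 (its label is 2)
row_labeled_zero = df.loc[0]["score"]    # row whose label is 0 (position 1)

print(first_by_position, row_labeled_zero)
```

With a default `RangeIndex` the two happen to agree, which is why the distinction is easy to miss until the frame has been filtered or sorted.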

desert oar
#

deep learning == neural networks with a lot of layers

#

yes boosting very much does find high order interactions in the data, it's iteratively refitting on residuals
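A rough sketch of the "iteratively refitting on residuals" part, assuming squared-error loss and one-split stumps written by hand in NumPy (this is not how any particular library implements it, and `fit_stump`, `predict_stump` and the toy data are made up). Stumps alone only give an additive model, so this shows the residual loop, not the interaction-finding that deeper trees add:

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares stump for residuals r: returns (threshold, left_value, right_value)."""
    best = None
    for thr in np.unique(x)[:-1]:          # dropping the max keeps the right side non-empty
        left = x <= thr
        lv, rv = r[left].mean(), r[~left].mean()
        sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    return best[1:]

def predict_stump(stump, x):
    thr, lv, rv = stump
    return np.where(x <= thr, lv, rv)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 6, 200))
y = np.sin(x) + 0.1 * rng.normal(size=x.size)

pred = np.full_like(y, y.mean())           # stage 0: predict the mean
mse_start = np.mean((y - pred) ** 2)
learning_rate = 0.3
for _ in range(50):
    residual = y - pred                    # what the current ensemble still gets wrong
    stump = fit_stump(x, residual)         # fit the next weak learner to that
    pred += learning_rate * predict_stump(stump, x)
mse_end = np.mean((y - pred) ** 2)

print(mse_start, mse_end)
```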

twilit tundra
#

What really differentiates the intermediate representations in deep learning models from other models is that they are built off of each other, creating this hierarchical structure (as a result of the multi-layers)

odd meteor
#

That's not always true though but I understand your point. ML Research generally is all about figuring out the "unknown" (More reason I love it. There's room for failure, even. You could literally write a research paper on a failed experiment)

So if you have an interesting research question which borders on using tabular datasets, I'm sure it'll receive a warm reception and accolades at any ML conference where such a paper is presented, so long as it's able to unravel a novel discovery or something unique.

For all we know, we're still pretty much in the "figuring out things" era. It's LLM today who knows what the next trendy topic would be 😅😃

#

DL is simply machine learning on unstructured dataset; which usually involves NN. So DL is a subfield of ML, just like how ML is a subfield of AI.

lilac ingot
#

Hİ

#

I need help guys

past meteor
#

It just is successful on unstructured data because people construct networks with inductive biases that more or less create features

odd meteor
serene scaffold
serene scaffold
arctic wedgeBOT
#

:ok_hand: Added ask-and-perhaps-you-shall-receive to the names list.

desert oar
serene scaffold
misty flint
#

stel, i kid you not. literally last week, one of the nontechnical stakeholders was like 'i have this problem. can you just solve this using ai magic'

serene scaffold
misty flint
#

it was just a clustering problem but with a massive dataset and terribly messy, custom data from multiple organizations

serene scaffold
#

I hope you like data cleaning

misty flint
#

yeah no. i told them i can get like top 20 topics

#

and thats it lol

serene scaffold
#

they wanted you to do topic modeling?

misty flint
#

not spending more time on that

#

essentially

#

anyway ill get them some pretty wordclouds and im sure theyll be satisfied

serene scaffold
misty flint
#

oh that looks neat

serene scaffold
#

some day I wanna refactor it

#

(I didn't write it. coworker did.)

misty flint
#

the requirements were given friday afternoon so i havent done too much work on it yet

#

let me take a look at this

#

and ill let you know if it works for my use case

serene scaffold
#

nice 😄

left tartan
flint grail
#

People

#

I am iron man

misty flint
serene scaffold
#

glad it worked

misty flint
#

instead of nontechnical stakeholders, ill have to be telling product managers that what they want isnt possible

coral field
#

Are there any good, free alternatives to Google colab that are as fast as the gpus from colab pro?

desert oar
#

"i want free compute that's as good as paid compute"

desert oar
misty flint
#

and most likely we'll have to do some sort of subsequent ETL and storage for any read-heavy aka ML use cases

#

the data / dev divide. we love to see it

desert oar
#

ETL is one thing, it's another when you realize that they're only storing the current value of something but you needed it as of last year, and if you're lucky you can scrape it from logs, but more likely you have to shelve the project while they build a history table and then wait to have enough data

misty flint
#

yep yep yep

#

literally 2 weeks ago, we had to think of questions to ask on the UI so that we can start STORING that type of data

#

for a feature that they literally just spent 2 months building

#

smh smh

#

pretty sure at least two dev teams worked on that feature set too

misty flint
#

big teams

desert oar
#

"next time you want to build a feature with ML/AI involved, please at least ask us about it first"

#

how many data scientists have had to say that at how many companies...

misty flint
#

this time it wasnt even ML/AI. just pure analytics

#

and they still didnt have the data

desert oar
#

but they didn't ask you first, did they?

misty flint
#

the design and product peeps got thrashed

#

by the exec

#

LOL

#

"YALL WANNA BUILD A FANCY DASHBOARD WITH NO DATA?! GL"

desert oar
#

hahaha hey at least someone was awake!

misty flint
#

ikr.

#

sometimes execs can be hands off but glad ours wasnt asleep

odd meteor
# coral field Are there any good, free alternatives to Google colab that are as fast as the gp...

IMO, Kaggle is by far better than Colab in terms of what they offer (in their freemium package). You get 30hrs free access to GPU each week on Kaggle. I switched to Kaggle and I never looked back ever since.

Kaggle's P100 GPU is better and faster than what Colab offers in their free plan.

You can also run your experiments offline on Kaggle without worrying about timeouts ( colab has actually dealt with me... Terminating runtime after some minutes of screen inactivity) You can commit on Kaggle.

The frustration of timeout and having to start all over again especially when you're training a Model that takes +4 hours 😭😭

I hate that Colab doesn't actually mention your limit usage, the actual time you have left before cutting someone off the GPU.

umbral ermine
#

Can I train computer vision model without a gpu

daring sphinx
twilit tundra
last ivy
#

Would anybody know of a NLP technique for dynamic reading a book?

lapis sequoia
winged rivet
#

does fine tuning a model (quantizing it) require a gpu?

serene scaffold
coral field
lapis sequoia
#

Are there any extremely cheap nvidia gpu instances somewhere. I really do not need a whole H100/A100, like a fraction of it would be good. It would be nice to have a machine to connect to that can compile cuda, I can even turn it on and off for periods of ~1 minutes

misty flint
lapis sequoia
# lapis sequoia Are there any extremely cheap nvidia gpu instances somewhere. I really do not ne...

There are a few cloud providers. I would say some good ones with reasonable prices and good service are lambda labs, jarvislabs.ai (1 hour of an a6000 is around ~ 0.6$), and ovhcloud. If you can accept some trade-off in network security and want cheaper options, you can go with vast ai; their GPU instances are something like 2-3 times cheaper. Most of these are pretty easy to set up, with single-click instance on/off.

misty flint
# lapis sequoia LLMs?

depends on your definition of LLMs. are you only considering transformers in the GPT family aka decoder-only or do you consider all types of transformers including encoder-only and encoder-decoder

lapis sequoia
misty flint
#

short answer: no.

misty flint
halcyon hedge
#

Hey folks, a newbie coder here. I have a dataset in which each entry shows a terrorist attack and has two values, date and country. I want to make a lineplot in which, x = year, y = number of attacks that year, and I want to plot the data separately for each country so that we can do a comparative study. How should I divide the database for this?

#

I want the graph to look like this, where each colour shows a different country

left tartan
#

You can either organize narrow or wide. Narrow would be: date, country_name, value. Wide would be: date, usa, uk, Jp, etc. You can plot from either approach (and pivot / unpivot between them)
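A small sketch of both layouts with pandas, assuming one row per attack with `date` and `country` columns as described in the question (the rows and country names below are invented toy data):

```python
import pandas as pd

# Toy data: one row per attack.
attacks = pd.DataFrame({
    "date": pd.to_datetime([
        "2001-03-01", "2001-07-15", "2002-01-20",
        "2001-05-05", "2002-02-02", "2002-09-09",
    ]),
    "country": ["A", "A", "A", "B", "B", "B"],
})

# Narrow: one row per (year, country) with a count.
narrow = (
    attacks.assign(year=attacks["date"].dt.year)
    .groupby(["year", "country"])
    .size()
    .reset_index(name="attacks")
)

# Wide: one row per year, one column per country.
wide = narrow.pivot(index="year", columns="country", values="attacks").fillna(0)

# And back to narrow again.
narrow_again = wide.reset_index().melt(
    id_vars="year", var_name="country", value_name="attacks"
)

print(wide)
```

If matplotlib is installed, `wide.plot()` should draw one line per country with the year on the x axis.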

hasty mountain
#

This is quite curious, you know... I'm running some tests on CIFAR10, which is already composed of images within range [0, 1]. In this dataset, I noticed only now that I don't have to multiply the Decoder output by the dataset Standard Deviation nor add the Mean. The VAE output is, indeed, an image, so replacing the Gaussian Likelihood loss by MSE can be more computationally efficient (maybe even provide better results?)

Still, when I use my custom dataset, which has been rescaled to be within range [-1, 1], I really must deNormalize the Decoder outputs. Especially since the Decoder is only able to generate outputs within [0, 1].
Maybe if I replace the Sigmoid by a Tanh things might go smoothly. I remember that I tried this once, but maybe I did something wrong...I don't know...or, since Tanh has the problem of vanishing gradients, maybe I could simply not use an activation in the output layer at all?

Unfortunately I still didn't manage to finish reading and watching the classes about probabilistic models

ashen axle
#

Hi all, I've got a question regarding pandas method chaining best practices.

For a dataframe with three level multiindex columns, I am looking to establish a pipeline that will add new columns at the third level, with level 1 and 2 acting as identifiers.

The first stage is below, which is calculating a baseline:

    b = (
        df.loc[:, pd.IndexSlice[:, :, "mins"]]
        .apply(
            lambda x: Baseline(x).iasls(
                df.loc[:, pd.IndexSlice[x.name[0], x.name[1], "value"]]
            )[0]
        )
        .rename(columns={"mins": "baseline"}, level=2)
    )

    pd.concat([df, b], axis=1).sort_index(axis=1)

First off, I'm wondering whether there is an easier way of achieving this, especially without the concat? There will be X operations to achieve the final product so I dont want to be concating repeatedly if I can avoid it.

BTW, let me know if there is a more appropriate channel to ask this in. Cheers!
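One possible way to keep the chain readable is to push the concat into a small helper and call it through `.pipe`. This is only a sketch under assumptions: `add_derived` is a made-up helper, the toy frame stands in for the real data, and the `cummin` / mean-centering lambdas stand in for the `Baseline(...).iasls(...)` call. It does not remove the concat, it just confines it to one place:

```python
import numpy as np
import pandas as pd

# Toy frame with three column levels; the third level holds "mins" and "value".
cols = pd.MultiIndex.from_product([["s1", "s2"], ["a"], ["mins", "value"]])
df = pd.DataFrame(np.arange(16, dtype=float).reshape(4, 4), columns=cols)

def add_derived(frame, source, new_name, func):
    """Apply func to every (lvl0, lvl1, source) column and add it as (lvl0, lvl1, new_name)."""
    block = frame.xs(source, axis=1, level=2, drop_level=False).apply(func)
    block = block.rename(columns={source: new_name}, level=2)
    return pd.concat([frame, block], axis=1).sort_index(axis=1)

out = (
    df
    .pipe(add_derived, "value", "baseline", lambda s: s.cummin())    # stand-in step 1
    .pipe(add_derived, "value", "centered", lambda s: s - s.mean())  # stand-in step 2
)

print(out.columns.tolist())
```

Each further operation then becomes one more `.pipe(...)` line rather than another hand-written concat.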

fallen dagger
#

I'm trying to learn ML/AI and I'm interested in doing some projects, no courses. Any tips on where to get started or project ideas? I don't want to do the standard cookie-cutter projects like digit recognition because I find them boring and I already know how it'd work.

mellow turret
#

hello there!
just had a quick doubt about implementing an ANN using tensorflow
when we use the Dense function to build a layer, how does tensorflow ensure that each neuron ends up building a different logistic regression function/ different values for parameters w and b, when all the neurons are trained using the same data?
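As far as I know the short answer is random initialization: Keras `Dense` defaults to the `glorot_uniform` kernel initializer, so every neuron starts from different weights and therefore receives different gradients. A NumPy sketch of why that matters, assuming a tiny one-hidden-layer network with sigmoid units (the function name, sizes and data are invented):

```python
import numpy as np

def hidden_weight_grad(W, v, X, y):
    """Gradient of 0.5 * mean squared error w.r.t. the hidden weights W.

    Network: hidden = sigmoid(X @ W), prediction = hidden @ v.
    """
    hidden = 1.0 / (1.0 + np.exp(-(X @ W)))
    error = hidden @ v - y
    d_hidden = np.outer(error, v) * hidden * (1.0 - hidden)
    return X.T @ d_hidden / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
y = rng.normal(size=32)
v = np.ones(2)                                  # same output weight for both neurons

W_same = np.full((3, 2), 0.5)                   # both neurons start identical
W_rand = rng.normal(scale=0.5, size=(3, 2))     # neurons start different

g_same = hidden_weight_grad(W_same, v, X, y)
g_rand = hidden_weight_grad(W_rand, v, X, y)

identical_when_same = np.allclose(g_same[:, 0], g_same[:, 1])
identical_when_random = np.allclose(g_rand[:, 0], g_rand[:, 1])
print(identical_when_same, identical_when_random)
```

With identical starting weights the two neurons get the same update forever, so they never diverge; the random start is what breaks that symmetry.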

autumn dagger
fallen dagger
autumn dagger
fallen dagger
#

Yes, an agent that plays AoE2. SC2 has a learning environment but AoE2 doesn't.

gaunt vault
#

Has anyone played around with Llama2? Trying to set it up completely clean, most resources I find on how to get it set up deal with tokenisers and things like Hugging Face and similar models for integration through those.

I have the models directly downloaded from meta, and want to replace the openai api calls in my current py app with calls to the llama model i have on my machine.

Readme shows me the installation and setup. Example file shows a single example call for the function. But i can't find integration examples to full builds. Anyone done anything similar or have an idea which resources can help?

steady basalt
desert oar
desert oar
# fallen dagger I don't fully understand the capabilities and limitations of AI/ML to know what ...

modern machine learning models are really good at finding complicated hard-to-see patterns in the data and either generating new data as a result, or making predictions based on those patterns. for example, one of the first really big advancements of "deep learning" was image classification using "convolutional neural networks". one of the things these models do internally is finding the common shapes and patterns in the image that are most important for separating different kinds of images. and what's amazing and magical is that nobody has to tell the model what to look for. you just give it a correctness score based on its predictions, and use a particular feedback loop mechanism to update the numbers inside the model. and eventually that feedback loop mechanism converges to a model that works surprisingly well, if you design it right.

#

and what we are now finding with the various generations of text models is that text also tends to have predictable but very complicated patterns spread across thousands of words at a time, which humans can't really see, but these transformer-based models are good at finding them and generating new text based on them

#

if you want to think about "what can i do with ML?" , it's hard to go wrong with looking for a problem that involves recognizing patterns in the data and either separating the data based on those patterns, making predictions based on those patterns, or generating new data

#

another great example is reinforcement learning, the "pattern" being learned in that case is some network of cause and effect

wide cosmos
#

Hey guys, I am stuck at a problem which I need help with. Basically I have to convert an English SRT file into some other language from another word file, like no need to translate text on my own, just have to reference it from that file. I used pandas to create some dataframes and was able to extract the sentences from both the files, I even converted the other language to English to store too so it might be of help later on, now I'm not sure how to replace the SRT file text with the other language so it lines up well....

#

I'm trying to use semantic comparison on the translated text and the original subtitles but it's causing issues

misty flint
slim bone
#

I've been learning about DL for a while now and I remembered I watched 3b1b's series on NN's and they said that the concept in the video is considered "Old technology".

I kind of assumed they meant that "This technology is no longer used, and we have better methods that rely on this theory" but I have yet to be exposed to anything beyond Convolutional Networks, and I don't think they're a replacement for... traditional NN's? I don't know what to call them

Would anyone care to clarify? Timestamp for reference:
https://youtu.be/IHZwWFHWa-w?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&t=977

past meteor
#

NN's aren't just one algorithm, I think it's better to think of them as a "framework" for easily creating problem-specific algorithms. The base case is the feed forward network that the video is about but problem-specific networks (CNNs, RNNs, LSTMs, transformers, ...) also exist

slim bone
#

As in, will I always have a better alternative to* this architecture?

past meteor
#

Well, since it's the easiest neural network you can come up with it should always be part of your baselines/testing because for some problems it might be better than the rest

#

Let me be more specific:

#

Neural networks allow you to encode knowledge of your problem into the architecture. This is highly related to bias-variance, you're trading in some bias for a drastic reduction in variance. Lower variance means you might be more data efficient as well.

The thing is, your (inductive) bias might not always be right. CNNs make strong assumptions on how features are found in images. (unidirectional-)RNNs made strong assumptions on how text is structured.

Feed forward neural nets make very few assumptions about the data. Maybe if you had an infinite amount of data and compute it'd actually work well on any problem. This doesn't even have to be infinite, if you have "enough" data and your inductive bias is "wrong enough", the no assumptions feed forward net might be better.

#

Like, if your problem domain operates on sets that are permutation invariant then designing a neural network that gives some invariance to permutations (graph neural networks) means you need drastically less data.

You can visualise this by thinking of a pokemon game. You see it's a tree structure, 2 players, 6 pokemon, each pokemon can have 4 moves. You can permute this tree in many many ways and have exactly the same match. By hand-crafting a network that treats all the permutations as the same match you need wayyyyyy less data. OTOH if you had an infinite amount of data maybe a FFN might create permutation invariant features. We'll never know! 🙂
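To make the permutation point concrete, here is a sketch that uses shared per-element encoding followed by sum pooling (a simpler technique than a graph neural network, chosen only because it fits in a few lines). All weights, sizes and the "team" array are invented; nothing here is trained:

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(4, 8))   # per-element encoder weights (hypothetical sizes)
w_rho = rng.normal(size=8)        # readout weights applied after pooling
w_flat = rng.normal(size=6 * 4)   # weights of a plain layer on the flattened input

def set_score(items):
    """Order-independent score: encode each element the same way, then sum."""
    encoded = np.tanh(items @ W_phi)
    pooled = encoded.sum(axis=0)          # summing discards the ordering
    return float(pooled @ w_rho)

def flat_score(items):
    """Order-dependent score: each position gets its own weights."""
    return float(items.ravel() @ w_flat)

team = rng.normal(size=(6, 4))            # e.g. 6 team members, 4 features each
reordered = team[::-1]                    # same team, listed in reverse

print(set_score(team), set_score(reordered))
print(flat_score(team), flat_score(reordered))
```

The pooled model gives the same answer for both orderings by construction, while the flat model would have to learn that from data.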

sorry for the long answer!

wooden sail
#

that's a pretty good answer though

slim bone
#

This is slightly too technical for me, but I think I got the gist of it

wooden sail
#

the only thing i would add is that in some communities this receives the name "model-based learning", where the idea is to try to explicitly include all the prior knowledge you have into the network, which usually comes from knowledge of the physical process generating the data and estimators that have strong guarantees

past meteor
slim bone
#

Oh and obviously, thank you

past meteor
#

The fancy neural networks of today take some shortcuts that make it learn faster and the one in 3b1b makes none. If it can learn forever it'll outpace the former especially if the shortcuts aren't optimal

wooden sail
past meteor
wooden sail
#

imagine we know some samples come from a sinusoid. a sinusoid only has 3 parameters: amplitude, phase, and frequency. in an ideal case, we can find all of these parameters from exactly 3 data points. any further points let us get higher accuracy estimates of the parameters. if we don't know it's a sinusoid, we could try to fit several kinds of functions to the data. so say we assume it's a polynomial, and we have 100 points. we could fit several cubic splines, one single high degree polynomial, or something else. these will have many more parameters than the sinusoid and the estimates of the parameters will be worse. the fit to the data will also be worse
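A quick numerical sketch of that comparison, under assumptions of my own: 20 noisy samples instead of 100, a frequency grid search plus linear least squares for the sinusoid, and a single degree-9 polynomial as the "wrong model". The true parameters, noise level and grid are all made up:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
true_amp, true_freq, true_phase = 2.0, 1.5, 0.3

def signal(t):
    return true_amp * np.sin(true_freq * t + true_phase)

t_train = np.linspace(0.0, 4.0, 20)
y_train = signal(t_train) + 0.05 * rng.normal(size=t_train.size)

def fit_sinusoid(t, y, candidate_freqs):
    """Grid-search the frequency; amplitude and phase follow from linear least squares."""
    best = None
    for f in candidate_freqs:
        # A*sin(f*t + p) = a*sin(f*t) + b*cos(f*t), which is linear in (a, b)
        basis = np.column_stack([np.sin(f * t), np.cos(f * t)])
        coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
        sse = float(np.sum((basis @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, f, coef)
    _, f, (a, b) = best
    return np.hypot(a, b), f, np.arctan2(b, a)

amp, freq, phase = fit_sinusoid(t_train, y_train, np.linspace(0.5, 3.0, 251))
poly = Polynomial.fit(t_train, y_train, deg=9)   # 10 coefficients instead of 3 parameters

# Compare both models just outside the training range.
t_test = np.linspace(4.0, 5.0, 50)
sin_rmse = np.sqrt(np.mean((amp * np.sin(freq * t_test + phase) - signal(t_test)) ** 2))
poly_rmse = np.sqrt(np.mean((poly(t_test) - signal(t_test)) ** 2))

print(freq, sin_rmse, poly_rmse)
```

The model that encodes "it is a sinusoid" should recover the frequency closely and keep working past the last sample, while the polynomial has no reason to.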

small wedge
#

I think not only the architecture but also the gradient descent algorithm that he described would be considered old technology

#

nobody uses vanilla GD

wooden sail
#

one of the coolest ones i've been looking at lately is subsampling matrices

#

imagine we have 10 samples and want to pick the "best" 3 samples

past meteor
#

Define best?

wooden sail
#

normally this is a combinatorial problem. but a matrix that does subsampling has exactly 1 nonzero entry per row, and it has value 1. this looks identical to categorical distributions, so we can set up one or more classification networks to build the matrix using gradient methods instead of combinatorics

#

the definition of best isn't important here, as long as it can be written in a way sensible to differentiate for the sake of deep learning-like optimization

wooden sail
#

the "model" here being that subsampling matrices have a special structure for which we already have good solution approaches
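A toy illustration of that structure, assuming 10 scalar samples and 3 picks. The hard matrix has exactly one 1 per row; the soft version replaces each row with a softmax over logits, which is the part a network could output and train by gradient methods. The logits below are random placeholders, not the result of any training:

```python
import numpy as np

samples = np.arange(10.0) * 10           # 10 samples: 0, 10, ..., 90

# Hard subsampling matrix: row k has a single 1 in the column of the chosen sample.
chosen = [7, 2, 5]
S = np.zeros((3, 10))
S[np.arange(3), chosen] = 1.0
picked = S @ samples                     # selects samples 7, 2 and 5

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Soft relaxation: every row is a categorical distribution over the 10 samples.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10))        # placeholder for a network's output
P = softmax(logits)

# Rounding each row to its most likely entry gives back a valid hard matrix.
hard = np.eye(10)[P.argmax(axis=1)]

print(picked)
```

Note that nothing in this sketch stops two rows from picking the same sample; a real formulation would need to handle that.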

past meteor
#

This is interesting

slim bone
slim bone
past meteor
#

And @wooden sail I guess an application of what you're mentioning is the knapsack problem

wooden sail
#

the main issue is that there is finite data (and usually very little)

slim bone
#

Right but are these shortcuts just synonyms to the assumptions you've mentioned earlier?

slim bone
#

I'm aware of how Convolutional Networks function* for example
What is the assumption there? That we're processing a picture?

slim bone
wooden sail
small wedge
past meteor
#

I'd have to think this through compared to classics such as PSO, GA's etc

#

If I'd have to do a large combinatorial problem I'd look at those but that's just availability bias. I had a lot of coursework on operations research 🙂

slim bone
wooden sail
#

gradient-based methods are common enough that their complement has a special name 😛 gradient-free optimization

small wedge
#

does simulated annealing count as GD? :pithink: it is right

past meteor
#

No

small wedge
#

ah

past meteor
#

I'd call that a metaheuristic (a single-solution one; PSO and GAs are the population-based kind)

#

But you can add local search into them. Specifically, you can run some GD while doing simulated annealing

past meteor
#

You assume pixels are only correlated by what's near to them in a spatial sense

slim bone
#

Oh, and sometimes that assumption is completely valid, right?

past meteor
#

probably more often than not

wooden sail
#

the assumption is "spatial invariance"

#

along with spatial correlation

#

not only are pixels related to their neighbors, this relationship is assumed to be valid regardless of where it appears in the image

#

i.e. if you have a cat in the upper left or lower right corner of the image, you should be able to detect it

slim bone
#

Oh yeah, I never thought about that

past meteor
#

It's also said that pooling adds translation invariance
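A 1D sketch of both ideas, assuming wrap-around at the edges so the result is exact (real CNN layers pad or crop instead, so the property only holds approximately near borders). `circular_filter` is a hand-written helper, not a library function:

```python
import numpy as np

def circular_filter(signal, kernel):
    """Slide `kernel` over `signal` with wrap-around (cross-correlation)."""
    n = len(signal)
    return np.array([
        sum(signal[(i + j) % n] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ])

signal = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
kernel = np.array([1.0, 2.0, 1.0])       # one shared filter for every position
shifted = np.roll(signal, 4)             # same pattern, moved 4 steps to the right

response = circular_filter(signal, kernel)
response_shifted = circular_filter(shifted, kernel)

# The filter response moves with the pattern...
moves_with_input = np.allclose(response_shifted, np.roll(response, 4))
# ...and a global max pool over positions no longer cares where it was.
same_after_pooling = response.max() == response_shifted.max()

print(moves_with_input, same_after_pooling)
```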

wooden sail
#

convolutions are solutions to differential equations that are "linear shift invariant" (LSI) or often also called linear time invariant (LTI)

past meteor
#

Maybe true, maybe not. People also do data augmentation to create some invariance to rotation, scale, brightness, ... same applies here.

slim bone
#

I can only explain their role in a hand-wavy fashion, formally though.. That's quite an enigma for me

wooden sail
#

i guess the most important part there is the following: affine transformations are associative

#

which sounds scary, but all it means is that, like regular multiplication, you can put parentheses arbitrarily when doing matrix multiplication

#

so say we have a network with 3 layers. take an input x, and each layer has a matrix of weights, call them A, B and C

#

then ABCx is the same as A(BCx) or A(B(Cx)). and most importantly, the same as (ABC)x

#

but ABC is just one matrix. so there was no need for 3 layers 😛 only one with one matrix

#

this is no longer true if we add activation functions that are nonlinear in between

past meteor
wooden sail
#

neural networks are only useful if they have nonlinear activation funcs

#

otherwise any network is just Ax + b for some choice of A and b. this is rather limited
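The collapse is easy to check numerically. A small sketch with hand-picked 2x2 matrices (the values are arbitrary, biases are left out, and ReLU is just one choice of nonlinearity):

```python
import numpy as np

A = np.array([[1.0, 1.0]])
B = np.array([[2.0, 0.0], [1.0, 1.0]])
C = np.array([[1.0, -2.0], [3.0, 1.0]])
x = np.array([1.0, 1.0])

def relu(z):
    return np.maximum(z, 0.0)

# Three linear "layers" applied one after another...
stacked = A @ (B @ (C @ x))
# ...equal a single layer whose weight matrix is the product ABC.
collapsed = (A @ B @ C) @ x

# With a nonlinearity between layers the shortcut no longer applies.
with_relu = A @ relu(B @ relu(C @ x))

print(stacked, collapsed, with_relu)
```

Here the two linear versions both give 1, while the ReLU network gives 4 because the negative intermediate value is clipped before it can cancel anything.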

slim bone
slim bone
past meteor
#

The hypothesis space is just the set of hypotheses (possible answers) you can provide. I'm glad you spoke about it because this is how I learnt ML in university and it's the best way to think of it.

wooden sail
# slim bone But, like, why do we want that? What does that enable?

ah. the reason we care about neural networks in the first place is that they are so-called "universal approximators". say there is a function f(x). we can instead make a network, call it n, and evaluate n(x). we can make n(x) arbitrarily close to f(x) for any x, if n is a big enough network

past meteor
#

Without activation functions your hypothesis space is limited to all possible answers that model Y as an affine transformation of X

wooden sail
#

but this property is only true if n(x) has nonlinear activation functions. otherwise n(x) is an affine transformation, and it is only valid in very limited circumstances (e.g. when f(x) is also an affine transformation)

#

or within a close neighborhood of a particular value of x (think taylor series)
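One way to see the "big enough network" part without any training: a one-hidden-layer ReLU network with hand-set weights is a piecewise-linear interpolant, and adding units shrinks the error. This is only a constructive sketch for a 1D function on a bounded interval, not a statement of the theorem; `relu_network` and the choice of `sin` are illustrative:

```python
import numpy as np

def relu_network(f, lo, hi, units):
    """Build n(x) = f(lo) + sum_i w_i * relu(x - knot_i) that interpolates f at the knots."""
    knots = np.linspace(lo, hi, units + 1)
    values = f(knots)
    slopes = np.diff(values) / np.diff(knots)
    # Each unit's output weight is the change in slope at its knot.
    weights = np.concatenate([[slopes[0]], np.diff(slopes)])

    def n(x):
        hidden = np.maximum(0.0, x[:, None] - knots[:-1][None, :])  # ReLU layer
        return values[0] + hidden @ weights

    return n

grid = np.linspace(0.0, 2 * np.pi, 1000)

def max_error(units):
    n = relu_network(np.sin, 0.0, 2 * np.pi, units)
    return float(np.max(np.abs(n(grid) - np.sin(grid))))

err_small, err_large = max_error(8), max_error(64)
print(err_small, err_large)
```

Replace `np.maximum(0.0, ...)` with the identity and every choice of weights gives a straight line, which is the limitation described above.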

proper meteor
#

ValueError: Found unexpected losses or metrics that do not correspond to any Model output: dict_keys(['expression_output', 'gender_output', 'age_output']). Valid mode output names: ['race_output']. Received struct is: {'expression_output': 'categorical_crossentropy', 'gender_output': 'categorical_crossentropy', 'age_output': 'categorical_crossentropy'}

#

help!!!!

past meteor
#

But maybe the right hypothesis is one that models it as something different. Using activation functions makes ANN's universal approximators, the hypothesis space is infinite. You can represent any possible answer (but this doesn't mean you'll get the best one in training)

slim bone
#

Is it possible to simplify their use onto a traditional function?

#

Like, just a 2d one

past meteor
#

You mean a line?

slim bone
#

I mean, to me it sounds like "Sometimes you don't want a line, but a polynomial for example. And activation functions let you get those"

#

Or rather, a linear plane

#

I'm just trying not to generalize, and keep it simple for the moment

past meteor
#

That's an OK way to look at it for now imo

slim bone
#

Boston housing prices example comes to mind, except a linear equation works there quite well it seems

past meteor
#

When you need a line or a plane and you use an activation function it can get you that

past meteor
#

Maybe Edd will disagree idk? But for now you can roll with that, maybe in the future you'll revisit and get more of the details down

slim bone
slim bone
past meteor
#

That's the danger with non-linear stuff though. Sometimes it'll extrapolate in bad ways, especially in places of high uncertainty. This isn't a NN, just some school work on SVMs.

#

A line is the most adequate solution here, especially since in this case we know how the data is generated

#

The top left corner being blue is just incorrect. Non-linear models give more flexibility but that can be risky. aka overfitting

wooden sail
#

if you know the relationship is linear i'd argue you'll get better results if you use a linear model

#

but nothing stops you from using a network and getting good results anyway

past meteor
#

Agree

wooden sail
#

if you have enough data, you can learn anything 😛

past meteor
#

That's why we did these (stupid) exercises in uni though 🤣

#

Another big one is that higher dimensions aren't like 2 and 3D

rigid cape
#

Hi there guys, I wanna learn machine learning. I'm looking into youtube videos and books but I can't find a structure to learn. In some books there's scikit-learn, but they use terms and formulas that I know nothing about. In more theoretical books I get the theory, but it either isn't beginner friendly or doesn't have related code. So where do I start?

I know basic data science libraries like numpy, pandas and matplotlib.

desert oar
# slim bone I mean, to me it sounds like "Sometimes you don't want a line, but a polynomial ...

take a look here: https://youtu.be/QhHfo6-Bx8o?t=3331. here you have an example of third order interaction, meaning that the effect of one variable depends on the values of two other variables. let's now say you had 50 variables and no clear scientific model of how they all behave. heck, the interactions might not be linear, or even monotonic. are you going to fit a regression with 50th-order interactions and 5 different functional forms of each? of course not, because we have things like random forest, gradient boosting, and neural networks that can find some parsimonious representation of those interactions.

Lecture 09 of the Dec 2018 through March 2019 edition of Statistical Rethinking: A Bayesian Course with R and Stan. Covers interaction effects.

#

that's precisely what these general "non-statistical" function approximating ML algorithms do and why they're so amazing for predictive modeling. they are able to find very complicated interactions and nonlinear relationships within the data.

#

so why do neural networks work so well on images? because manually constructing the right nonlinear features that optimally separate the data by hand is extremely difficult, and there's like 20 years of literature full of people struggling to do that. now we have neural networks to do it for us. and it turns out that the way to do it in a NN is this sliding filter thing that they call a "convolutional" filter because it looks like a convolution operator in signal processing.

desert oar
#

i think the python version is new as of this year, but the R version is an old favorite by now

#

you will probably also need to learn calculus, linear algebra, and probability if you don't know them already.

past meteor
#

PML is too dense of a book

desert oar
#

yeah i wouldn't use it as a first resource

#

it might be more of a reference than a study book

#

for linear algebra it's hard to beat MIT 18.06 which until recently was taught by an amazing professor Gilbert Strang, there are free lectures online and it seems like he has a new/updated book https://math.mit.edu/~gs/everyone/

#

the 3b1b calc and linalg series are excellent. not sure about calc study materials beyond that

past meteor
#

ISL is what I always recommend here. Math for machine learning if you want a lin alg / calc refresher

desert oar
#

this one @past meteor ? https://mml-book.github.io/

past meteor
#

yes

desert oar
#

i haven't seen it, i'll take a look

#

probability i'm not too sure of either, but i know there is an intro probability book by Ross and i really liked his Probability Models book (which is more of a "second course" type of book)

past meteor
#

Practical Statistics for Data Scientists is also a good book.

#

But it approaches stats from the perspective of someone that took stats in uni, didn't really care too much and now wants to get into data science. Doesn't teach it from scratch.

#

I wouldn't know what books I'd have to recommend for an absolute beginner there. Those things I picked up during my bachelors 🤷

#

(I get a reasonable amount of time from work to read/upskill hence why I know these)

#

If I'm feeling particularly lazy I just read stuff I already know which is great because it solidifies fundamentals.

desert oar
past meteor
#

That one has been on my to-read list for a long time 🙂

desert oar
desert oar
#

i did the same for strang and a few others i don't remember right now. i'm planning to go back and work through a few of the exercises in the book when the weather is colder

small echo
#

Do you know any good tutorial for data science beginners?

bold timber
#

I'm currently learning how the Transformer architecture works for machine translation. Is there anyone here who understands the Transformer architecture code using TensorFlow? I would like to ask a few things 🙏

serene scaffold
bold timber
# serene scaffold Don't ask to ask. Ask your actual question right away.

Ok, thanks for the advice.

Below is the call function code for the Decoder that is used for the Machine Translation model.

def call(self, inputs, encoder_outputs, mask = None):
    # Lower-triangular (causal) mask of shape (batch, seq_len, seq_len)
    causal_mask = tf.linalg.band_part(input = tf.ones([tf.shape(inputs)[0],
                                                       tf.shape(inputs)[1],
                                                       tf.shape(inputs)[1]], dtype=tf.int32),
                                      num_lower = -1,
                                      num_upper = 0)

    # Fall back to the causal mask so combined_mask is defined even when mask is None
    combined_mask = causal_mask
    if mask is not None:
        mask1 = mask[:, :, tf.newaxis]
        mask2 = mask[:, tf.newaxis, :]
        padding_mask = tf.cast(mask1&mask2, dtype = 'int32')
        combined_mask = tf.minimum(x = padding_mask,
                                   y = causal_mask)

    # Self-attention over the decoder inputs
    attention_output_1 = self.attention_1(query = inputs,
                                          value = inputs,
                                          key = inputs,
                                          attention_mask = causal_mask)

    out_1 = self.layernorm_1(inputs + attention_output_1)

    # Cross-attention over the encoder outputs
    attention_output_2= self.attention_2(query = out_1,
                                         value = encoder_outputs,
                                         key = encoder_outputs,
                                         attention_mask = combined_mask)

    out_2 = self.layernorm_2(out_1 + attention_output_2)

    proj_output = self.dense_proj(out_2)
    return self.layernorm_3(out_2 + proj_output)

Based on the code above, is it correct if I set attention_mask = causal_mask in attention_output_1 or should I set attention_mask = combined_mask?

serene scaffold
bold timber
mint palm
#

Can anybody help me run code that uses "Slurm"? I don't have Slurm on my Linux GPU cluster, and I most probably cannot get sudo permission.

serene scaffold
mint palm
# serene scaffold slurm is for scheduling processes on shared systems. if your system doesn't have...

the command is this:

export MASTER_PORT=$((12000 + $RANDOM % 20000))
export OMP_NUM_THREADS=1
echo "PYTHONPATH: ${PYTHONPATH}"
which_python=$(which python)
echo "which python: ${which_python}"
export PYTHONPATH=${PYTHONPATH}:${which_python}
export PYTHONPATH=${PYTHONPATH}:.
echo "PYTHONPATH: ${PYTHONPATH}"

JOB_NAME='l16_25m'
OUTPUT_DIR="$(dirname $0)/$JOB_NAME"
LOG_DIR="$(dirname $0)/logs/${JOB_NAME}"
PARTITION='video'
NNODE=1
NUM_GPUS=8
NUM_CPU=112

srun -p ${PARTITION} \
    --job-name=${JOB_NAME} \
    -n${NNODE} \
    --gres=gpu:${NUM_GPUS} \
    --ntasks-per-node=1 \
    --cpus-per-task=${NUM_CPU} \
    torchrun \
    --nnodes=${NNODE} \
    --nproc_per_node=${NUM_GPUS} \
    --rdzv_backend=c10d \
    tasks/retrieval.py \
    $(dirname $0)/l16.py \
    pretrained_path your_model_path/l16_25m.pth \
    output_dir ${OUTPUT_DIR}

I don't have any scheduler on the cluster, so which parts should I not omit?
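
Since srun only hands a command to Slurm together with resource requests (partition, job name, GPUs, CPUs), one way to read the script above is that everything from torchrun onward is the part that still has to run. A small sketch of that split, assuming the shell variables are already filled in and `$(dirname $0)` is replaced by `.` for illustration:

```python
import shlex

def strip_srun(argv):
    """Return the part of an `srun ... torchrun ...` command that starts at torchrun."""
    if "torchrun" not in argv:
        raise ValueError("no torchrun call found in the command")
    return argv[argv.index("torchrun"):]

# The launch line from the script above, with the shell variables substituted.
# The "./" paths stand in for $(dirname $0) and are an assumption.
full = shlex.split(
    "srun -p video --job-name=l16_25m -n1 --gres=gpu:8 --ntasks-per-node=1 "
    "--cpus-per-task=112 torchrun --nnodes=1 --nproc_per_node=8 "
    "--rdzv_backend=c10d tasks/retrieval.py ./l16.py "
    "pretrained_path your_model_path/l16_25m.pth output_dir ./l16_25m"
)

local_cmd = strip_srun(full)
print(shlex.join(local_cmd))
```

Without a scheduler, `--nproc_per_node` would presumably need to match the number of GPUs actually available on the machine.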

serene scaffold
#

@mint palm do you have torchrun?

mint palm
#
```
python tasks/retrieval.py \
    $(dirname $0)/l16.py \
    pretrained_path your_model_path/l16_25m.pth \
    output_dir ${OUTPUT_DIR}
```
runs but gives some weird errors I am unfamiliar with
mint palm
serene scaffold
mint palm
serene scaffold
#

the python retrieval.py part

#

if that py file has a cli, that's even better.

mint palm
#

yeah, I hope the error isn't due to srun anymore. if it isn't, I'd be good to go

feral blade
#

hi, I'm having trouble working with an Excel file after reading it with read_xml... can anyone point me in a general direction on how to approach this

#

it says the cell is of type object, but it should be string

#

it even happens after i define the dtype argument

#

then i realised that the sheet our professor sent had that thing in a drop down

#

does this change the output somehow, or is there a way to retain the value... when i print it, it still gives me this string

#

i hope it's the right place to ask this, i saw data-science and jumped in since it was using pandas...

feral blade
#
  • solved, (removed sample images)
halcyon hedge
hard coral
serene scaffold
left tartan
#

By inevitable, I’m trying it tomorrow

left tartan
#

Most of them are business analyst types too: they have a little coding skills (you should see their monstrous excel formulas) but their focus is business

misty flint
left tartan
#

I'm trying to figure out the beta channel now, actually

misty flint
#

guess tomorrow changed to today

#

LOL

left tartan
#

It's unclear, but I am actually excited.

#

Oh, I'm a little less excited: With Python in Excel, .... the Python calculations run in the Microsoft Cloud, and your results are returned to the worksheet, including plots and visualizations.

left tartan
#

And "While Python in Excel is in Preview (beta) you will be able to use this feature as part of your subscription. After the Preview, you will need to purchase an additional license to use it."

opaque idol
#

Not sure if I should ask here but anyone know how to teach ai how to play a game?

left tartan
opaque idol
opaque idol
#

Is openAi what I should use?

opaque idol
#

If by someone else like from steam or something could you link me to a documentation to how to set everything up for any game? I'm trying to learn a little

hollow yew
#

Gymnasium is what a lot of people use for reinforcement learning environments, but you don't have to use it. I am a game developer and have been working on using reinforcement learning to drive a character's decisions. I haven't made it far: I've managed to create a plugin that allows transferring information between Python and the game at runtime, so I can now set up an RL script in Python and use it to drive the character, but I'm still learning RL myself so it's a WIP

opaque idol
#

two minute papers mostly talks about ai learning how to play games. Doesn't really show how they make ai learn a game

hollow yew
opaque idol
#

I am a game developer too I know C# but I'm trying to learn python to teach an ai to play a game

hollow yew
#

Care to dm me? I can run the python side of the Unity ML Agents if you can run the C# side of it

opaque idol
#

any game

#

Some use something called pytorch?

#

maybe

left tartan
# opaque idol two minute papers mostly talks about ai learning how to play games. Doesn't real...

Well, 2 minute papers reviews papers that talk about it... go to the source papers for more info, perhaps? For example, from his most recent: https://openai.com/research/emergent-tool-use

We’ve observed agents discovering progressively more complex tool use while playing a simple game of hide-and-seek. Through training in our new simulated hide-and-seek environment, agents build a series of six distinct strategies and counterstrategies, some of which we did not know our environment supported. The self-supervised emergent complexi...

hollow yew
#

there are a few options that I'm aware of and probably a lot more that I'm not, but there has to be a way for the AI to be aware of the game. One is taking screenshots and processing the images. Another is reading specific memory values on your PC, although I wouldn't suggest this. You could also use something like dnSpy to decompile the game if it was made in Unity and possibly mod an agent into the game that way

opaque idol
#

I've seen someone do gameplay and then the ai learns from that gameplay data

#

basically a person plays the game and then the ai learns from that gameplay

hollow yew
#

yeah, but what I'm saying is that the agent has to have some way to know what's happening in the game. It needs input from the game to know what's going on. Once you have that, then you'd worry about getting the algorithm set up

opaque idol
#

yeah I gotcha

hollow yew
#

This was done using a CNN trained on screenshots that were recorded and classified as I played: when I pressed a button to move in the game, it took a screenshot and saved it to a folder named after the button pressed

opaque idol
#

That's what I'm trying to figure out at the moment

unborn pine
#

Is 4 quadrillion too large of an action space for a reinforcement algorithm :)

desert oar
#

of course it's cool to have it built in, but cloud-only is 🤢

left tartan
left tartan
desert oar
#

weird

charred light
misty flint
mystic summit
#

Hello everyone! I'm new to this fantastic channel! Can anyone help with an opinion about DAT Linux distro? I'm new to Data Science (learning Python and some libraries) and want to install it in my laptop (Lenovo T570, Win10, CPU i7, Ram 8 GB).

worn stratus
wooden sail
mystic summit
wooden sail
#

testing it out as a virtual machine is a good idea

lapis sequoia
#

i wanna create a model that discovers the pattern between numbers then generates numbers with the same pattern

#

any help

wooden sail
#

try and see if it has everything you need. otherwise, as i mentioned before, ubuntu and mint have extensive docs, stackoverflow posts, and much more. that makes googling info a lot easier

wooden sail
lapis sequoia
wooden sail
#

for deterministic patterns, that should be more or less straightforward

lapis sequoia
#

it would be a lot more complicated than that, obviously

lapis sequoia
wooden sail
#

not random

#

say if i have a function f, and i evaluate f(1) and it returns 1 always. that's deterministic

#

learning statistical parameters works a bit differently

wooden sail
#

the pattern is that it was generated by the function f

#

a function f maps any input in its domain to a particular output. that's a pattern

#

after all, neural networks are used because they can approximate functions very well

#

if we make a function f(1) = 1, f(2) = 3, f(3) = 5, and so forth, that's a function that generates odd numbers

lapis sequoia
#

yeah

#

continue please sir

wooden sail
#

it sounds like you haven't worked in ML before, so i suggest you start with building a dense neural network that tries to fit the function of a straight line (a function that generates odd numbers is a straight line)
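
To make that suggestion concrete, here is a minimal sketch of the smallest possible "dense network": a single unit with no activation, trained by gradient descent on the odd-number function f(n) = 2n - 1. The learning rate and step count are hand-picked assumptions for this toy dataset, not recommended defaults.

```python
# One "neuron" with no activation: prediction = w * n + b.
# Training data: f(1)=1, f(2)=3, f(3)=5, ... (the odd numbers).
xs = [1, 2, 3, 4, 5]
ys = [2 * n - 1 for n in xs]

w, b = 0.0, 0.0          # start from an arbitrary guess
learning_rate = 0.05     # hand-picked for this tiny dataset

for _ in range(5000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))   # slope and intercept of the fitted line
print(round(w * 10 + b))          # prediction for n = 10
```

The learned slope and intercept are the "pattern"; generating new numbers is then just evaluating the line at new inputs.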

#

this is pretty much exactly the same problem as the common "housing prices problem" you find everywhere online as a first intro to ML

#

it looks like you really need a step by step explanation, which i don't have time to provide you right now

wooden sail
#

start by reviewing systems of equations and matrices, since you'll immediately need those for this type of problem

silent crystal
#

hi, is there any way to convert a categorical column with 30 categories to numeric without creating extra columns (one-hot encoding) or ranking the categories (label encoding)?
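
One option that is sometimes used for this is frequency encoding, where each category is replaced by how often it occurs, so the result stays a single numeric column. A minimal sketch with a made-up column (the data and function name are hypothetical); note that two categories with the same frequency become indistinguishable, so whether it fits depends on the data.

```python
from collections import Counter

def frequency_encode(values):
    """Map each category to the fraction of rows in which it appears."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]

# Hypothetical column with a handful of categories.
city = ["paris", "tokyo", "paris", "lima", "tokyo", "paris"]
encoded = frequency_encode(city)
print(encoded)
```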

past meteor
#

re python in excel: imo Excel is a really dangerous tool. Business people use Excel and VBA because they don't want to pay the upfront cost of building software

#

Afterwards sunk cost fallacy sets in and you're stuck forever.

wooden sail
#

what about cases when the cost is already sunk 😛

#

i think the world would collapse instantly if excel suddenly stops working

past meteor
#

And the world would be supercharged if everyone decided to use Excel for what's intended and not more as well

wooden sail
#

i 100% agree with you

past meteor
#

At best I'd use Python to spit out stuff for non-technical people. This is still so dangerous because they might make calculations and encode knowledge in their .xlsx that does not find its way back into your database.

wooden sail
#

this is kinda like telecom infrastructure though. just because 5g rolls out it doesn't mean the previous network is torn down. it's too expensive and you can't force the users to buy new phones

#

i think it can make sense for places where excel IS the database and they're already shoulder deep

past meteor
#

When I was a student I worked part time at a place that did cutting edge manufacturing but their ERP was essentially a network drive full of Excel files that was read/updated.

#

If they had a relational database they'd be making so much more revenue. For instance, they'd properly be able to answer the simple question of "what production step causes the most defaults"

#

In excel anyone can write anything so it was a mess lmao

wooden sail
#

absolutely

worn stratus
wooden sail
#

spoken like someone with stockholm syndrome after being forced to do agile

past meteor
#

You misunderstood everything I said. Good luck! 🙂

worn stratus
#

if not, what are the problematic cases?

past meteor
#

People use Excel for everything. The company I referred to used it to document all production processes. A lot of manual input was involved that you could skip.

#

Tons of typos happened and production steps changed, which meant the same production process was named 20 different things.

#

Databases have this thing called referential integrity and normalization that prevent exactly this.
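
A small illustration of the referential integrity point, using SQLite from the Python standard library. The table and column names are made up for the example; the idea is that a production step can only be referenced if it was defined once in its own table, so a typo is rejected instead of becoming a 21st spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

# Hypothetical schema: each production step is defined exactly once...
conn.execute("CREATE TABLE step (name TEXT PRIMARY KEY)")
# ...and every log row must point at one of those definitions.
conn.execute(
    "CREATE TABLE production_log ("
    " id INTEGER PRIMARY KEY,"
    " step_name TEXT NOT NULL REFERENCES step(name))"
)

conn.execute("INSERT INTO step (name) VALUES ('anodizing')")
conn.execute("INSERT INTO production_log (step_name) VALUES ('anodizing')")

try:
    # A typo in the step name is rejected instead of silently stored.
    conn.execute("INSERT INTO production_log (step_name) VALUES ('anodising')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT COUNT(*) FROM production_log").fetchone()[0]
print(rejected, rows)
```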

worn stratus
past meteor
#

So long as you're not tackling the referential integrity issue you'll have inconsistent data.

#

I'm OK with a relational database spitting out .xlsx (see above)

worn stratus
past meteor
#

Before you ship the data out of Excel into your DB you can have a trillion mistakes already. Excel sheets don't enforce everything you need to enforce by default. You can but then you're at the point of developing software anyway.

#

For instance, there's a returned product field. Multiple batches were returned. Someone just put all of them with a comma into one row instead of making multiple rows per product. How do you enforce that?

#

That's the thing. You don't. All you can do is tell people "pls don't do X, Y and Z". Guess what? They'll do it anyway.

worn stratus
past meteor
#

If you care about the quality of your system you use tools that are error-proof by design: https://en.wikipedia.org/wiki/Poka-yoke

Poka-yoke (ポカヨケ, [poka joke]) is a Japanese term that means "mistake-proofing" or "error prevention". A poka-yoke is any mechanism in a process that helps an equipment operator avoid (yokeru) mistakes (poka) and defects by preventing, correcting, or drawing attention to human errors as they occur. The concept was formalized, and the term adopte...

#

You can write a fancy Python parser or use something like Pydantic but trust me, the pain will never stop if you enforce quality post-hoc.

Power apps for data input is something business people can use if you give them a week's worth of training.

left tartan
#

Fwiw, this is literally the focus of my career. Excel export isn’t just a crutch or a ‘the engineers suck’ problem: it’s a key requirement that enables end users to do ad hoc analysis on their terms.

#

(I agree with Latte)

worn stratus
#

defining what is and isn't quality data is a huge endeavour for any business process.

by keeping that in excel you let the experts in said data see the intermediate steps and intermediate error checks. doing the same thing in software has a ridiculous cost, probably in the hundreds of thousands or millions of dollars. sometimes that's worth paying, but far less often than developers think

past meteor
#

It's when it becomes a database or more than that

left tartan
#

Hmm, now I’m reading the thread again and I'm not sure what the debate is. Is the debate about using Excel as a primary data source?

past meteor
#

It's not like I would ban it. It has its time and place but it's not a universal hammer you can throw at every problem.

#

If I'm in finance of course I'd give accountants and finance folk Excel outputs to do their analysis. What I would not allow is them doing all of their books with Excel sheets.

worn stratus
#

I think excel is a super powerful tool, and there are plenty of ways to mitigate its faults by building software around it.

I think if you're in an Excel business, almost every process should be incorporating excel or outlook to an extreme extent

#

or at least - every process which excel people are doing

past meteor
#

Tbh, not my problem really 🤷. I've been in 3ish large companies that abused the hell out of Excel. Nowadays I'd just ask how they use the tool in interviews and if I don't like it I'm shaking their hand and I'm going out of the door.

left tartan
worn stratus
worn stratus
#

a third is just operating over xlsx files

#

most of the time it's some programmer setting up an SQL query to e.g get holdings data as at a given date, then a non programmer calling that via Excel

left tartan
rich river
#

when installing Anaconda, do I have to read the license agreement line by line to ensure when to input yes?

wooden sail
#

you can press spacebar to skip several lines at a time until you get the yes|no prompt

#

the most important thing for you there is that anaconda is only free for individuals and small companies. you'll receive threatening emails from them if too many requests from related IPs are detected and no paid license is attached to them

opaque idol
#

Anyone know a way to make the agent know what's happening in a game? I'm trying to train an a.i to play a game on steam

#

Any documentations or videos

#

explanations

mild dirge
#

What game?

#

And if you are not very familiar with AI, making an AI that plays a game is probably a hard starter project @opaque idol

opaque idol
#

just trying to make a character move by ai in a game

left tartan
opaque idol
#

I'm going to do this so that I could learn a little more about ai in python

agile cobalt
#

isn't that a multiplayer game?

#

if so, definitely do not bot it then

opaque idol
#

not going to use it on online

#

Don't really know any other games that are small and have a few mechanics

#

I found brawlhalla cause it has a few simple things. Fighting moving dodging

#

Which is great for an ai to learn

#

of course I believe botting online isn't really allowed, but if I do it offline it should be fine, if I'm correct

agile cobalt
#

I feel like you're greatly underestimating the complexity that "moving" brings to the table

opaque idol
mild dirge
#

If you want to learn you start with something simpler 😛

#

Making a reinforcement learning agent for something like flappy bird will already be a big challenge as a starter project

opaque idol
mild dirge
#

What etrotta just linked, has many different challenges

opaque idol
#

how do I install gymnasium to visual code?

#

Is it an extension?

somber prism
#

guys, is there any way i can make yolo to ignore certain images during training, since we have to specify the image foldername and it takes all the images in it for the training, is there any way we can ignore certain files ?

mild dirge
#

I'm not sure, wouldn't it be easier to make a new folder with the correct images?

digital bough
#

Why does yappi seem to show drastically different values when I profile the same function?

I ran it in one instance and all of the times were 0.000000.

Sometimes ttot is milliseconds. Sometimes ttot is multiple seconds.

Yet the function is doing the same thing and to my slow human brain, appears to take the same amount of time every time (a few seconds). Not a fraction of a second or literally instantly.
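
One possible explanation, offered as a guess since the profiled function isn't shown: as far as I know, yappi measures CPU time by default, and CPU time does not advance while a function sleeps or waits on I/O. If that is the cause, `yappi.set_clock_type("wall")` before starting the profiler would be the setting to try. The standard-library sketch below (no yappi involved) shows how far the two clocks can diverge for a function that mostly waits.

```python
import time

def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent while running fn()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)

def waits():
    # Stand-in for a function that mostly waits (I/O, network, sleep).
    time.sleep(0.3)

wall, cpu = measure(waits)
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```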

somber prism
#

(even if it means automating )

hearty cradle
#

I'm trying to learn machine learning in Python right now
Anyone have any pointers here?
Are there any prerequisites for this?

left tartan
hearty cradle
#

I'd say I'm an intermediate-level programmer as of now

#

Should I still continue?

left tartan
#

Yah, I just asked because if you were a beginner I’d suggest Python first

left tartan
# hearty cradle Ill say Im an intermediate level programmer as of now

I’ve become a fan of the cs50 for ai course, the projects are well designed to give you an intro to ml: https://cs50.harvard.edu/ai/2023/

crude pilot
#

whatever your level is either in Python or ML

#

Question: I am trying to compute clusters on the State of JS survey dataset
I have columns like "interested in library X", "interested in library Y" etc., with values in -1,0,1 (not interested, neutral, interested)
I have many columns so I'd like to reduce the number of features to get a "relevant" clustering (using a basic k-means algorithm)
What approach would you favour here?

#
  • feature selection based on covariance is cool because it's easy to interpret the clustering, but it doesn't feel like it's the most precise (perhaps people can be differentiated on the features I remove)
  • PCA seems great to control the number of features without throwing away data, but it's hard to read later on, I need post-analysis to figure what the cluster means
#

With feature selection for instance I'll care only about response for lib X because lib X and lib Y interest are correlated (say Next.js and React in the JS ecosystem)
With PCA I would blend them (perhaps some people like React but not Next so keeping the info can be good)
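
For the feature-selection side of that trade-off, a minimal sketch in plain Python: keep a column only if it is not strongly correlated with a column that was already kept. The survey answers, column names and the 0.9 threshold are made up for illustration, and this greedy rule is only one of several ways to prune correlated features.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(var_a * var_b)

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if it is not strongly correlated with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

# Made-up answers in {-1, 0, 1}; column names are illustrative.
answers = {
    "react":  [1, 1, 0, -1, 1, -1, 0, 1],
    "nextjs": [1, 1, 0, -1, 1, -1, 0, 0],
    "svelte": [-1, 0, 1, 1, -1, 0, 1, -1],
}
selected = drop_correlated(answers)
print(list(selected))
```

Here the one respondent who likes React but is neutral on Next.js is exactly the information that gets thrown away, which is the cost described above.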

slow dawn
#

don't overwhelm him

unique flame
somber prism
#

no i wanna do object detection

somber prism
#

right, all I can think of is moving those unwanted images to a different folder

crude pilot
somber prism
#

I'd just love to hear an alternative approach that uses code to ignore those images
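
One code-only route, assuming the YOLO implementation in use accepts a text file of image paths in place of a folder (the Ultralytics YOLOv5/YOLOv8 dataset config does, as far as I know; worth checking the docs for the version at hand): generate that list and leave out the unwanted files. The function name, extensions and demo files below are hypothetical.

```python
import tempfile
from pathlib import Path

def write_train_list(image_dir, excluded_names, list_path):
    """Write one image path per line, skipping the files named in excluded_names."""
    image_dir = Path(image_dir)
    keep = sorted(
        p for p in image_dir.iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        and p.name not in excluded_names
    )
    Path(list_path).write_text("".join(f"{p.resolve()}\n" for p in keep))
    return len(keep)

# Tiny demo on a throwaway folder with three empty "images".
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    images.mkdir()
    for name in ["a.jpg", "b.jpg", "c.jpg"]:
        (images / name).touch()
    count = write_train_list(images, {"b.jpg"}, Path(tmp) / "train.txt")
    listed = (Path(tmp) / "train.txt").read_text().splitlines()

print(count, [Path(p).name for p in listed])
```

The dataset stays untouched on disk; only the list handed to training changes.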

unique flame
slow dawn
somber prism
#

i see

#

ok thanks for the suggestions

slow dawn
somber prism
#

yes

slow dawn
#

then get a different dataset?

#

that'll work right

slow dawn
somber prism
#

yeah, but I don't want to give up this dataset, which has 55k images, just to remove 500-1000 images

opaque idol
#

I've installed gymnasium from and followed this but when I launch it doesn't open anything.

#

I'm not sure if doing something wrong

#

this website is hard to understand how it works

slow dawn
# opaque idol this website is hard to understand how it works

when something is hard to understand by text we use videos 🙂 https://youtu.be/cO5g5qLrLSo

Worked with supervised learning?

Maybe you’ve dabbled with unsupervised learning.

But what about reinforcement learning?

It can be a little tricky to get all setup with RL. You need to manage environments, build your DL models and work out how to save your models down so you can reuse them. But that shouldn’t stop you!

Why?

Because they’r...

opaque idol
#

yeah. I couldn't find one. Thanks

gaunt pine
#

Does anyone know about continual learning and benchmark dataset?

polar olive
#

How do I get started with Ai , and should I try learning backend dev first or is it unrelated

mild dirge
#

AI specifically isn't that related to backend (which is mostly a webdev term)

#

And you want to start with the pre-requisites, like calculus, and linear algebra.

polar olive
#

So just jump in?

mild dirge
#

Yeah, a book I liked was Deep Learning with PyTorch

#

This one iirc, might be a newer version

polar olive
#

Thank you so much

mental crescent
#

Hey there... does anyone have knowledge about the Facebook Ads Marketing API? I'm developing a data science project using Facebook Ads data and I need help with some particularities

opaque idol
#

followed this guy's tutorial https://www.youtube.com/watch?v=cO5g5qLrLSo but I'm having issues. I'm using vs code
Here's the code:
```
import gymnasium
import random

env = gymnasium.make("CartPole-v1", render_mode="human")
states = env.observation_space.shape[0]
actions = env.action_space.n

episodes = 10

for episode in range(1, episodes+1):
    state = env.reset()
    done = False
    score = 0

    while not done:
        env.render()
        action = random.choice([0, 1])
        n_state, reward, done, info = env.step(action)
        score += reward
        print('Episode:{} Score:{}'.format(episode, score))
```

Error: Traceback (most recent call last): File "d:\Ai\ai.py", line 18, in <module> n_state, reward, done, info = env.step(action) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ValueError: too many values to unpack (expected 4)

left tartan
#

What does env.step return?

left tartan
#

Oh, and it looks like the api was changed, "The Step API was changed removing done in favor of terminated and truncated to make it clearer to users when the environment had terminated or truncated which is critical for reinforcement learning bootstrapping algorithms.", so you might be looking at outdated code with just 4 results and not the 5 today
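
A sketch of what the loop looks like with the five-value step API: in current Gymnasium, `reset()` returns `(observation, info)` and `step()` returns `(observation, reward, terminated, truncated, info)`. To keep the example runnable without the package installed, `FakeCartPole` is a hypothetical stand-in that only mimics those return shapes; it is not part of Gymnasium.

```python
import random

class FakeCartPole:
    """Hypothetical stand-in that mimics the Gymnasium reset/step return shapes."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0, 0.0, 0.0, 0.0], {}            # (observation, info)

    def step(self, action):
        self.steps += 1
        truncated = self.steps >= self.max_steps   # time limit reached
        # (observation, reward, terminated, truncated, info)
        return [0.0, 0.0, 0.0, 0.0], 1.0, False, truncated, {}

def run_episode(env):
    """Play one episode with random actions and return the total reward."""
    state, info = env.reset()
    done = False
    score = 0.0
    while not done:
        action = random.choice([0, 1])
        state, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
        score += reward
    return score

score = run_episode(FakeCartPole())
print("Score:", score)
```

With the real package, `run_episode(gymnasium.make("CartPole-v1", render_mode="human"))` should work the same way; with `render_mode="human"` the window is drawn during `step()`, so the explicit `env.render()` call is not needed.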

opaque idol
#

it opens the game for a sec and crashes it

left tartan
#

Well, the error was: "ValueError: too many values to unpack (expected 4)". I'm just answering why.

opaque idol
#

ah ok

left tartan
#

Yah, I don't know this API, I just know enough to answer that.

river sapphire
#

So I'm using a machine learning library on roblox and this person says that from the cost values the model is overfitting or has already converged. I'm confused because this doesn't look like overfitting and I don't think it could converge in just 100 iterations. They also said that the cost values are independent from overfitting?

#

20:15:09.788 Epoch: 1 Final Cost: 0.24982647768112837
20:15:09.854 Epoch: 2 Final Cost: 0.24982627484525913
20:15:09.909 Epoch: 3 Final Cost: 0.2498260704402344
20:15:09.965 Epoch: 4 Final Cost: 0.24982586446410893
20:15:10.029 Epoch: 5 Final Cost: 0.2498256569149093
20:15:10.088 Epoch: 6 Final Cost: 0.24982544779063465
20:15:10.153 Epoch: 7 Final Cost: 0.24982523708925267
20:15:10.210 Epoch: 8 Final Cost: 0.24982502480870397
20:15:10.264 Epoch: 9 Final Cost: 0.2498248109468997
20:15:10.322 Epoch: 10 Final Cost: 0.24982459550172004

From the cost it seems to be decreasing just very slowly, and all I'm using is the Adam optimizer with a learning rate of 0.001, 2 input nodes, 4 hidden nodes, 2 hidden layers and 1 output node with sigmoid. All the hidden layers use the LeakyReLU activation function. I've tried switching the hidden layer activation function to Tanh and I get similar results, where the cost just decreases really slowly. Is this normal?

#

also can anyone explain what this means?

river sapphire
#

for context I asked why the training cost was so high now I'm even more confused

#

The training data is very simple:
if both input1 and input2 are < 0 then the correct label is 0.
If both input1 and input2 are >= 0 then the correct label is 1. Does he mean like the model is too complex so that's why it's overfitting?

#

I tried 1 hidden layer and 0 hidden layers and I get the same results
I also tested the library on the most basic problem possible where all the training data was the same and there were only 2 features and 1 possible output but the cost started increasing instead of decreasing

desert oar
#

they might not know what they're talking about either fwiw

#

but you should show us how you generated the data and show us your code, as well as whatever numbers you showed them

#

they're correct in that you can't determine if there is overfitting just by looking at whether the cost is "big" or not. you need to see if it's actually getting smaller or not

river sapphire
#

also wait I remember something about a website I can paste the code in

#

is there something like that

river sapphire
#

oh wait it's only 67 lines lol it's not that big

desert oar
#

that's big enough to put in a paste site imo

river sapphire
#

oh wait sorry that's the uh code that's missing hidden layers

desert oar
#

0 hidden layers == linear regression. not a bad place to start for this model actually

river sapphire
#

wait really

desert oar
#

yes, look at the equations and convince yourself

river sapphire
#

I see

desert oar
#

that's an important insight actually. plain fully connected neural networks are basically stacked linear regressions with the nonlinear activation function in between layers

#

if it weren't for the nonlinear activation, even 10 layers would still be equivalent to linear regression!
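
A quick numerical check of that claim, in plain Python with arbitrary example weights: two layers without an activation compute W2(W1 x + b1) + b2, which is the same as a single layer with weights W2 W1 and bias W2 b1 + b2.

```python
def affine(weights, bias, x):
    """Compute weights @ x + bias for plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def compose(w2, b2, w1, b1):
    """Fold layer 2 applied after layer 1 into a single (weights, bias) pair."""
    cols = len(w1[0])
    weights = [[sum(w2[i][k] * w1[k][j] for k in range(len(w1)))
                for j in range(cols)] for i in range(len(w2))]
    bias = [sum(w2[i][k] * b1[k] for k in range(len(w1))) + b2[i]
            for i in range(len(w2))]
    return weights, bias

# Arbitrary example numbers: 2 inputs -> 3 hidden units -> 1 output, no activation.
w1, b1 = [[1.0, -2.0], [0.5, 0.0], [3.0, 1.0]], [0.1, -0.2, 0.3]
w2, b2 = [[2.0, -1.0, 0.5]], [0.7]

x = [2.5, 1.0]
two_layers = affine(w2, b2, affine(w1, b1, x))
w, b = compose(w2, b2, w1, b1)
one_layer = affine(w, b, x)
print(two_layers, one_layer)
```

The same folding can be repeated for any number of layers, which is why depth adds nothing without a nonlinearity in between.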

river sapphire
#

yeah

#

so I told him that it was strange that the cost was barely changing after each minibatch

#

I can just pull up the posts actually

desert oar
#

that sounds like underfitting, not overfitting

#

yeah, i think that would be helpful

#

are you just looking for a second opinion?

river sapphire
#

yeah I'm trying to not be biased but I'm also trying to learn what he means

desert oar
#

sure

river sapphire
#

this is me:

#

this is uh his response: this is kinda long

#

this is my response:

desert oar
#

no activation function is equivalent to linear regression, yes

river sapphire
desert oar
#

that's not true at all

#

ok, so they only kind of know what they're talking about

#

what cost function are you using?

river sapphire
#

I don't really know they kinda abstracted it

#

I'm guessing it's mean squared error

desert oar
#

they're correct in that trying to model an exactly 0-1 response with a linear regression model won't do so well, especially if you're not using softmax at the end

desert oar
#

but let's assume it's MSE
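
Under that MSE assumption, the reported numbers have a simple reading: with 0/1 labels, a model that outputs 0.5 for every input has a cost of exactly 0.25, because (0.5 - 0)^2 and (0.5 - 1)^2 are both 0.25. A cost sitting just under 0.25 is therefore consistent with a sigmoid output that has barely moved from 0.5. The sketch below regenerates data with the labelling rule described earlier (the seed and sample count are arbitrary); if the library scales its cost differently, the number would differ.

```python
import random

def mse(predictions, labels):
    """Mean squared error between two equal-length lists."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)

random.seed(0)
# Same labelling rule as described above: 1 if both inputs are >= 0, else 0.
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(100)]
labels = [1 if a >= 0 and b >= 0 else 0 for a, b in points]

constant_guess = [0.5] * len(labels)   # a model that has learned nothing
print(mse(constant_guess, labels))
```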

river sapphire
#

my response:

desert oar
#

wait, your cost is increasing? or just staying flat?

river sapphire
#

I told him that I was suspicious that there was a bug after my sigmoid neural network didn't do very well

#

because the cost was flat

desert oar
#

how did you generate the data?

#

i assume you have a script or something like that

river sapphire
#

for the linear regression or the other one

desert oar
#

i would hope that you're using the same data for all these tests

#

otherwise there's no way to compare

river sapphire
#

for the linear regression model I just made it run in a loop 100 times adding the same data on purpose

#

just to test if the model was actually learning

#

basically same input data and label

#

but strangely enough the cost was increasing

#

for the sigmoid neural network I randomly generated the data, looping 100 times

#

it has two features and both are randomly generated numbers from -1 to 1
if both are >= 0 then the correct label is 1 otherwise it's 0

#

oh wait I forgot the linear regression model code

#

@desert oar are you there? do you need more information there's more posts

desert oar
#

what do you mean by "the same data"?

#

like, the same exact row 100 times? or you meant 100 epochs?

river sapphire
#

I mean like the same exact feature vector

desert oar
#

i'm not really sure what you mean

river sapphire
#

uhh like so for the training data it just looks like
[2.5,1] over and over again

desert oar
#

also, you should generate a single dataset, save it, and do all your tests on that one dataset. don't keep making new ones. otherwise you have no point of reference between models.

#

well that's why the model isn't learning anything

river sapphire
#

and the label is the same (5)

river sapphire
#

the whole point was to have it always output 5 for the inputs [2.5,1]

desert oar
#

5?

river sapphire
#

yes

desert oar
#

i thought you had outputs 1 and 0

river sapphire
#

oh sorry if i'm confusing you

desert oar
#

yes, i'm very confused

river sapphire
#

I have two scripts one for the linear regression model, another one for the nonlinear model

desert oar
#

yes, but hopefully you're using the same data for both. right?

river sapphire
#

I wasn't trying to compare them necessarily though

desert oar
#

if you're trying to debug your code and/or compare models for performance, you need to use the same data

#

if you're just trying to experiment then fine

river sapphire
#

see my idea was that if you have the linear regression model train with just [2.5,1] as the input vector and 5 as the target it should just always output 5 or something close to 5 when you input [2.5,1] right?
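
As a sanity check of that expectation, here is plain gradient descent on the single example, written without any library. It is not how the Roblox library trains (that code isn't shown); the learning rate is a hand-picked assumption and the trailing 1 in the input plays the role of the bias.

```python
# One training example, with the extra 1 appended as the bias input.
x = [2.5, 1.0, 1.0]
target = 5.0

weights = [0.0, 0.0, 0.0]
learning_rate = 0.01   # hand-picked; not what any particular library uses

costs = []
for _ in range(100):
    prediction = sum(w * xi for w, xi in zip(weights, x))
    error = prediction - target
    costs.append(error ** 2)
    # Gradient of the squared error with respect to each weight is 2 * error * x_i.
    weights = [w - learning_rate * 2 * error * xi for w, xi in zip(weights, x)]

final_prediction = sum(w * xi for w, xi in zip(weights, x))
print(costs[0], costs[-1], final_prediction)
```

In this setup each step multiplies the error by 1 - 2 * learning_rate * 8.25, so the cost shrinks for small learning rates and grows once the learning rate exceeds roughly 0.12. A step size that is too large is one generic way a cost can rise, though nothing here shows that it is the cause in this case.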

desert oar
#

and what happened instead?

river sapphire
#

the cost kept increasing

desert oar
#

just to be clear: this is with no hidden layers and no activation function, right?

river sapphire
#

should I run it again but print what it outputs

desert oar
#

so just 2 inputs, 1 output? so that's 2 parameters for each input + a bias parameter?

river sapphire
desert oar
#

okay, and can you share your code for that model?

river sapphire
desert oar
#

i see, admittedly i have no idea how this NeuralNet thing works

#

i'm trusting that this is correctly written

river sapphire
#

nono do not trust that

#

I did not understand the little nuances of his library so I spent hours sending my code to him and asking why it didn't work

desert oar
#

what do the parameters of addLayer represent?

river sapphire
#

if you're wondering why it's 1 instead of 2 in the first addLayer function it's because for some reason the bias adds to the neuron count

#

so it will throw an error if I do 2

desert oar
#

are you supposed to add the layers in a particular order?

river sapphire
#

it's like from input to last layer

desert oar
#

i think you have it reversed then. also you need 2 "neurons" on the input layer, no?

river sapphire
desert oar
#

that doc suggests that the optimizer only goes on the last layer

desert oar
#

so that's 3 altogether if you count the bias as a neuron

river sapphire
#

okay look if I put 2 and set it to true for bias it will throw an error

desert oar
#

what about 3? i don't know how this library works

river sapphire
desert oar
#

what is the error?

river sapphire
#

ServerScriptService.DataPredict - Release Version 1.2.Models.NeuralNetwork:840: Input layer has 3 neuron(s), but feature matrix has 2 features! - Server - NeuralNetwork:840

#

(this is if I set it to 2)

desert oar
#

i see

river sapphire
#

if I do what he said and add like an extra 1 at the end it should work though

desert oar
#

oh

river sapphire
#

but I tried that already and the same issue happens

desert oar
#

did it? set the neurons to 2 and then add a 1 in the last element

#

it's weird that they're asking you for a bias neuron but then they force you to put it in manually 🤔

river sapphire
desert oar
#

yes, so NeuralNet:addLayer(2, true, 'None') and then table.insert(featureMatrix, {2.5, 1, 1})

river sapphire
#

yeah

desert oar
#

what happens then?

river sapphire
#

same problem though the cost is increasing

desert oar
#

but no error, right?

river sapphire
#

yeah

#

well error in the predict function cuz I forgot to add an extra 1

#

I just changed that

#

if you want I can show you the output

#

in the equivalent of the console

desert oar
#
-- Input
NeuralNet:addLayer(2, true, 'None')
-- Output
NeuralNet:addLayer(1, false, 'None', Library.Optimizers.AdaptiveMomentEstimation.new())

local x = {}
local y = {}
for i = 1,100 do
  table.insert(x, {2.5, 1, 1})
  table.insert(y, {5})
end

local ModifiedModel = Library.Others.GradientDescentModifier.new(NeuralNet)
ModifiedModel:train(x, y)

try swapping the order so that the input goes first

#

idk if that will help

river sapphire
#

oh the way it works it's weird

#

so you have to put the optimizer object in the first addLayer

#

for some reason it isn't like a separate line of code you put

desert oar
#

the docs say that it goes on the output layer

river sapphire
#

yes

desert oar
#

but you added it first, not last

#

i'm saying to swap the order

#

unless you are supposed to add them in reverse order

river sapphire
#

what exactly are you swapping

#

mine is uh this

NeuralNet:addLayer(2,true,'None',Library.Optimizers.AdaptiveMomentEstimation.new())
NeuralNet:addLayer(1,false,'None')
desert oar
#

right. i'm saying to swap the order of those two. you're adding the last layer first

river sapphire
#

no the first layer is being added first

desert oar
#

actually wait. you're adding the optimizer to the first layer.

#

the doc says to add it to the last layer

river sapphire
#

no it says to be added at the last layer

desert oar
#

it's right there in that screenshot you just sent me, or am i going crazy

river sapphire
#

I can just try both tbh

desert oar
river sapphire
#

ooh I get what you're saying now

desert oar
river sapphire
#

you want to swap which layer i'm adding the optimizer on

#

I thought you meant swap the order of layer creation

desert oar
#

at first i thought you were adding the layers in the wrong order, yes. but then i realized you just had the optimizer on the wrong layer.

river sapphire
#

no it's basically the same thing

#

cost is still increasing

#

so it doesn't really matter which layer you add the optimizer on I think

desert oar
#

i see

#

let me think about this. i never considered mathematically what would happen if you put the same record in 100 times

river sapphire
#

i'm gonna paste the forum posts in a google doc

#

my pc is lagging

desert oar
#

hm, it should still converge

#

right? the weight update is a * 2 * x * (y_actual - y_predicted) where a is the learning rate

river sapphire
#

it will take a bit longer to respond my pc is lagging

desert oar
#

well, drop the 2 because you can put a 1/2 in front of the loss and get the same result

river sapphire
#

i'm not really sure tbh

desert oar
#

i mean, that's the equation

#

it's worth spending the time to derive it yourself, but that's it

#

how are you initializing the weights? before starting training

river sapphire
#

so I remember briefly skimming some of the code in the library, it should just be like a uniform distribution

desert oar
#

or is that also abstracted away here? i don't see it in the code

river sapphire
#

yeah it is abstracted

#

uniform distribution I don't remember if it's -1 to 1 or -0.5 to 0.5

#

i'm gonna see if I can find it

desert oar
#

(oops i had the terms swapped)

river sapphire
#
function NeuralNetwork:RandomizeWeights(min,max)
    Base.Assert(min,"number OPT",max,"number OPT")
    
    local random = Random.new()
    
    min = min or -0.5
    max = max or 0.5
    for _,synapse in pairs(self.Synapses) do
        local num = random:NextNumber(min,max)
        --print(num)
        synapse:SetWeight(num)
    end
    
    random = nil
end

yeah it is just a uniform distribution

desert oar
#

okay. so let's say the weight starts at 0. if your prediction is smaller than actual, and the sign of x is positive, then the update is positive, and it should cause the weight to get bigger

#

that will cause the next prediction to be larger, and so on
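
To make that update rule concrete, here is a minimal Python sketch (not the Roblox library's code). It assumes a single linear neuron, a loss of 1/2 * (y_actual - y_predicted)^2, and a made-up learning rate of 0.01, trained on the same record repeated 100 times:

```python
# Minimal sketch: one linear neuron, full-batch gradient descent.
# The learning rate and epoch count are assumptions chosen for illustration.

def train(x, y_actual, weights, learning_rate=0.01, epochs=200, copies=100):
    """Return (weights, losses) after training on `copies` identical records."""
    losses = []
    for _ in range(epochs):
        y_predicted = sum(w * xi for w, xi in zip(weights, x))
        error = y_actual - y_predicted
        losses.append(0.5 * error ** 2)
        # Every copy of the record gives the same gradient, so the mean over
        # the batch is identical to the gradient of a single record.
        mean_step = [sum(xi * error for _ in range(copies)) / copies for xi in x]
        # Weight update: w <- w + a * x * (y_actual - y_predicted)
        weights = [w + learning_rate * s for w, s in zip(weights, mean_step)]
    return weights, losses


x = [2.5, 1.0, 1.0]  # two features plus the constant 1 used as the bias input
weights, losses = train(x, y_actual=5.0, weights=[0.0, 0.0, 0.0])
prediction = sum(w * xi for w, xi in zip(weights, x))
print(f"first loss={losses[0]:.4f} last loss={losses[-1]:.8f} prediction={prediction:.4f}")
```

In this toy model each step multiplies the error by (1 - a * sum(x_i^2)), so the loss shrinks as long as the learning rate stays below 2 / sum(x_i^2), roughly 0.24 here. A step size that is too large is therefore one possible, unverified reason for a cost that keeps increasing.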

river sapphire
#

yes

#

i'm gonna brb to take a shower

desert oar
#

loss clearly decreases and the output is indeed ~5

river sapphire
#

ok i'm back

river sapphire
#

oh I misread

#

that's a squiggly symbol

#

my brain is running on low power

desert oar
river sapphire
#

I really think there is just a bug in the library

desert oar
#

that's entirely possible. is there example code you can run that's supposed to work?

#

i don't really know what these other posters are talking about

river sapphire
desert oar
#

this library seems very weird

river sapphire
#

oh "other person" is the same person

#

it's the creator of the library

#

and I don't think he's verified if it actually works properly

#

this seems to be missing stuff like the train function

#

maybe I can try the logistic regression code but that seems to be an entire different model

#

okay yeah that's strange so his logistic regression model code seems to work but the neural network model doesn't??

#

I'm pretty certain I followed all of the little nuances of his library with my new code but the cost is still increasing

#

and it's not documented in the API for some reason but setClassesList() is required and should be an array with the same length as the number of neurons in the output layer looking something like this

NeuralNet:setClassesList(1,2)
desert oar
#

it's possible that this neural network model only supports classes

#

the author seems to be under the mistaken impression that neural networks can only perform classification

#

so this library might only support classification

river sapphire
#

well I told them that neural network was listed under classification

#

and he said that it can be regression too and I didn't really understand the last part of his sentence

#

i'm not really sure what he means by design implementation for ease of use

#

also the strange thing is I tried regression it seems like it can do regression the last time I checked

desert oar
#

i assume he meant that you need to do a little extra work to set it to do regression, maybe add extra options. idk

river sapphire
#

it was able to output something over 1 but maybe that updated

river sapphire
desert oar
#

in the last sentence he's saying that he does 2-class classification with two separate neurons, instead of one neuron handling both

desert oar
river sapphire
#

I don't really feel like reading the source code plus* I really think there is just a bug
I can test if it's classification only but last time I tried it could output something over 1 using ReLU

#

yoo pathfinding?

#

nice

twilit tundra
boreal blaze
#

oh i'm sorry i thought this was that

#

i did not pay attention

west grail
#

hello there, has anyone worked with linear regression? i need some help with how to use it for satellite images and the air quality measurements associated with them

rich river
#

do you prefer using conda install or pip install to install packages in conda environments?

lapis sequoia
#

I have a video stream, i'm taking image by image. Is there a way to check if the image is the same as the previous image? (it may vary by some pixels but it would be almost identical, so i can't use some hashing method)

simple tapir
#

Why is this wrong?

serene scaffold
#

@simple tapir try using loc for the first two lines of that code.

simple tapir
#

what did the proportion of mine give though?

serene scaffold
#

Sorry, but I don't understand what you said.

simple tapir
#

what's 0.62 here?

#

I don't understand why my code is wrong

serene scaffold
#

The denominator should be the total number of passengers, should it not?

mild dirge
#

you want to know life / (life + died)

simple tapir
#

oh

#

right omg

#

sorry guys

serene scaffold
#

But also, doing df[ ][ ] might have different semantics than using loc

simple tapir
#

using .loc() is better?

serene scaffold
#

Loc isn't a method

#

But it's better than stacked getitem calls, yes
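
A small sketch of both points, using a made-up five-row table (the column names `Sex` and `Survived` are assumptions, since the original code was only shared as a screenshot):

```python
import pandas as pd

# Hypothetical toy data standing in for the passenger table.
df = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "female"],
    "Survived": [1, 0, 0, 1, 1],
})

# One .loc call: boolean row mask first, column label second.
# This replaces the stacked form df[df["Sex"] == "female"]["Survived"].
women = df.loc[df["Sex"] == "female", "Survived"]

survived = int((women == 1).sum())
died = int((women == 0).sum())

# The denominator is everyone in the group (survived + died),
# not only the ones who died.
proportion = survived / (survived + died)
print(proportion)
```

`.loc` is an indexer used with square brackets rather than a method call, which is why `df.loc[...]` works and `df.loc(...)` does not do the same thing.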

simple tapir
#

I see, thanks guys

tender bramble
#

If anyone has experience building moderate scale dash & plotly web apps please lmk!

#

I just want a little bit of guidance on best practices

north wasp
#

Guys, can someone suggest me a source I can learn how to use Python libraries such as Numpy and pandas from?

manic cobalt
#

Guys i am working on audio signal processing
do anyone have audio recording of
completely fine engine and defected or problematic engine
(engine = automobile engine)

manic cobalt
quiet pebble
unique ether
#

Hello everyone!

leaden warren
#

There is no AI, only ML

unique ether
#

rodger that

#

You are saying that ML is the foundation of AI?

leaden warren
#

I'm saying that nothing I've seen so far has been anything which indicates intelligence, only that LLMs are learning to pass tests

unique ether
#

So its just a bunch of algorithms brute forcing their way towards looking intelligent through trial and error?

leaden warren
#

Yes

unique ether
#

Good to know

#

Do you work in ML?

leaden warren
#

I work in email spam filtering, which is kinda sorta ML

unique ether
#

Do you use linear algebra at all?

leaden warren
#

Yes, but only for video games

manic cobalt
#

Guys i am working on audio signal processing
do anyone have audio recording of
completely fine engine and defected or problematic engine
(engine = automobile engine)

#

anyone?

lapis sequoia
unique ether
leaden warren
#

LLMs and Bayesian networks are both instances of Directed Acyclic Graphs, which are described by graph theory, not linear algebra

unique ether
leaden warren
#

Whatever changes in ML happen in the future, I guarantee that it will be related to graph theory

mild dirge
#

I don't agree with some of those points. Don't know what you mean with "there is no AI, only ML", and linear algebra def is used in ML.

manic cobalt
lapis sequoia
#

cuda aint working for me 😭 im cring now

leaden warren
#

But he asked if I used linear algebra for that, and I don't

lapis sequoia
mild dirge
manic cobalt
manic cobalt
lapis sequoia
#

ngl i need help with cuda i have set it up and everything but it aint working, are there anyservers or sum i can look in 😭

mild dirge
#

Yeah no hurt in asking, but slim chance anyone is working on car sound data here rn 😛

manic cobalt
#

i just want mp3 recording of it (broken one)

mild dirge
#

I'm not going to break my car to give you data no, sorry

lapis sequoia
#

high iq

manic cobalt
lapis sequoia
#

anyhow i still need help with cuda pithink

unique ether
#

Anyone know a good course to learn graph theory? I've got a paid udemy subscription.

manic cobalt
echo vapor
leaden warren
#

@unique ether learn both linear algebra and graph theory!

unique ether
#

I'm loving this

unique ether
unique ether
echo vapor
#

best of luck mate. u didnt cover graph theory or even lin alg prior though?

unique ether
manic cobalt
leaden warren
#

@unique ether I'm still learning graph theory, it seems because it's so much younger, that people are still making new algorithms all the time, but linear algebra is done, I mean if you learn matrix multiplication, inversion, and diagonalization, that's about it

#

@unique ether so I would start with linear algebra, finish it, then move on to graph theory, and spend the rest of your life trying to understand it

manic cobalt
echo vapor
# unique ether Sounds like a plan

Beginning the linear algebra series with the basics.
Help fund future projects: https://www.patreon.com/3blue1brown
An equally valuable form of support is to simply share some of the videos.
Home page: https://www.3blue1brown.com/

Correction: 6:52, the screen should show [x1, y1] + [x2, y2] = [x1+x2, y1+y2]

Full series: http://3b1b.co/eola

Fu...

leaden warren
#

Vectors are so 1 dimensional, lol

unique ether
#

I'm really glad you lot have told me about 3Blue1Brown I never knew about him before. His videos look really informative

echo vapor
#

its a playlist

manic cobalt
#

i remember the time when i just used to search "best algo for ___ model" and then copy paste the code, but when i learnt the algebra, statistics, probability and calculus for machine learning, and then the mathematical formulation of each algo, that made a huge difference in learning

unique ether
#

Right now I'm just doing a 15 hour Algebra course on Udemy just to freshen up my base algebra knowledge. I can finish that in one day and then move on to the more advanced stuff tomorrow I reckon.

mild dirge
manic cobalt
unique ether
tawny fog
#

Hello!

manic cobalt
manic cobalt
tawny fog
#

I'm new here

manic cobalt
unique ether
tawny fog
#

So, How's everybody?

manic cobalt
tawny fog
#

I hope I'm not interrupting in anything important

unique ether
# manic cobalt ??

Sorry, I mean do you reckon that those topics you mentioned will be helpful in learning about AI and ML?

manic cobalt
#

@tawny fog nice name though

tawny fog
tawny fog
unique ether
manic cobalt
#

you cant go any inch further without those

manic cobalt
tawny fog
#

So, I was seeking some help/advice/recommendation for a Capstone Project assigned by my School

tawny fog
#

I just want to become everything 😅

manic cobalt
#

how can i help you

tawny fog
#

But for now I'm focussing more on Full Stack Dev and AI/ML

manic cobalt
#

im new to this term ive never heard about it

tawny fog
manic cobalt
tawny fog
#

MongoDB, ExpressJS, React and NodeJS

manic cobalt
manic cobalt
tawny fog
manic cobalt
#

this "Capstone Project " unfamiliar term

manic cobalt
tawny fog
#

I always had interest in making Websites & App but you know backend gets really messy so I thought why not use AI & ML for the backend

unique ether
lapis sequoia
#

WHY WHY WHY WHY WONT CUDNN WORK FOR ME

#

WHY

manic cobalt
lapis sequoia
manic cobalt
unique ether
manic cobalt
#

but never encountered one

lapis sequoia
manic cobalt
unique ether
#

Fair enough

#

I've got a 7900 XTX. Will using it to train model fry it?

lapis sequoia
#

waiittt did i install vs 2022...

#

💀

manic cobalt
#

@lapis sequoia what are your field of expertise ?

lapis sequoia
#

bruh if that is the probelm i will kms

lapis sequoia
manic cobalt
manic cobalt
#

did i make a spelling mistake?

lapis sequoia
manic cobalt
lapis sequoia
manic cobalt
#

ok

#

if you say so

tawny fog
#

Hi

#

I'm really sorry to go unexpectedly

#

I'm late 😥

vernal dome
#

Hey Python community. My friend and I created a VectorFlow, open source vector embeddings pipeline - https://github.com/dgarnitz/vectorflow built in Python. We want to expand it to handle metadata more robustly. We were wondering how people in the Python AI community are using metadata in their vector DB searches. For example, are you extracting keywords or themes from the text? What capabilities are you missing that you want to see?

GitHub

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice. - GitHub - dgarnitz/vectorflow: VectorFlow is a...

upper flame
#

hey guys hope yall doing well here. I was wondering if someone understands a lil bit trading, because i'm developing a trading AI that is nearly finished: I have a float issue because i want to float my broker balance and that is perturbing me a lot and i am struggling to fix. If you want to help. Please ping me. PS: I'm 15 y/o I don't have as many experience as you here. Thanks for reading

mild dirge
#

Can you show or explain the issue here? @upper flame

unique ether
#

I'm watching an algebra refresher course on udemy and one of the quiz questions is absolutely kicking my ass

#

I know the answer the quiz expects and even google disagrees with it

#

every algebra calculator i've found disagrees with this quiz

opaque idol
#

Hi. I'm getting this error:
Traceback (most recent call last):
  File "d:\Ai\model.py", line 14, in <module>
    while not terminated:
              ^^^^^^^^^^
NameError: name 'terminated' is not defined

Code:
while not terminated:
    env.render()
    action = random.choice([0, 1])
    n_state, reward, terminated, truncated, info = env.step(action)
    score += reward
print('Episode:{} Score:{}'.format(episode, score))

#

Last time I tried to use done false true but it threw me an error on the env.step

umbral charm
#

how come when i import scipy, it doesnt import everything, even when i do from scipy import *

#

i have to do from scipy.stats import norm why is this

#

why is this

dense crane
#

is there something like few-shot pix2pix or just in general few-shot image2image ?

left tartan
opaque idol
#

someone said I can't use done cause of an update changing it to terminated and truncated

#

so I don't really understand how that works

left tartan
#

In this case, I said: terminated = False not terminated = false. Very different.

#

And, I know that if you're not aware of that difference, you're really going to be unable to troubleshoot a lot of this.
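
For reference, a minimal sketch of the loop with both flags defined before the `while` tests them. `StubEnv` is a hypothetical stand-in so the snippet runs without Gymnasium installed; it only mimics the 5-value return of `env.step` seen in the code above:

```python
import random


class StubEnv:
    """Hypothetical stand-in for an environment; not the real Gymnasium class."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.steps += 1
        terminated = self.steps >= self.max_steps
        # (observation, reward, terminated, truncated, info)
        return 0, 1.0, terminated, False, {}


env = StubEnv()
for episode in range(1, 4):
    state, info = env.reset()
    terminated = False  # Python's boolean is False, with a capital F
    truncated = False
    score = 0
    while not (terminated or truncated):
        action = random.choice([0, 1])
        n_state, reward, terminated, truncated, info = env.step(action)
        score += reward
    print('Episode:{} Score:{}'.format(episode, score))
```

Checking both `terminated` and `truncated` matters because an episode can end either way; the old single `done` flag covered both cases.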

opaque idol
left tartan
#

(I mean this constructively, I can point you at some tutorials to start with)

opaque idol
opaque idol
#

but if you can I'll appreciate it

left tartan
arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

desert oar
magic dune
#

hi

upper flame
#
class BalanceApp(EWrapper, EClient, float):
    
    def __init__(self, ip_address, port_id, client_id):
        EClient.__init__(self, self)
        self.ip_address = ip_address
        self.port_id = port_id
        self.client_id = client_id
        self.account_balance = None
    
    def __new__(cls, ip_address, port_id, client_id):
        return float.__new__(cls, 0.0)
    

    def start(self):
        self.connect(self.ip_address, self.port_id, self.client_id)
        self.run()

    def nextValidId(self, orderId: int):
        super().nextValidId(orderId)
        self.nextorderId = orderId
        print('The next valid order id is: ', self.nextorderId)

    def accountSummary(self, reqId: int, account: str, tag: str, value: str, currency: str):
        super().accountSummary(reqId, account, tag, value, currency)
        if tag == 'TotalCashValue':
            self.account_balance = float(value)

    def __float__(self):
        if self.account_balance:
            return float(self.account_balance)
        else:
            return (0)

    def error(self, reqId, errorCode, errorString):
        print(f"Error: {reqId} - {errorCode} - {errorString}")
        if errorCode == 2104:  # Market data farm connection is OK
            return
#

Calls:

balance = BalanceApp(ip_address,port_id,client_id)
balance.start()
balance.accountSummary(reqId=123, account="DU11643091", tag="TotalCashValue", value="12345", currency="EUR")
balance.__float__()
balance.error(reqId=123, errorCode=456, errorString="Some error message")

print('Is balance a float?', isinstance(balance, float))

Error:

Traceback (most recent call last):
  File "c:\Users\lenovo\Documents\ccdi.py", line 563, in <module>
    riskmg = RiskManager(balance=BalanceApp(ip_address, port_id, client_id), max_loss_pct=0.04, stop_loss_pct=0.03, take_profit_pct=0.05)
  File "c:\Users\lenovo\Documents\ccdi.py", line 209, in __init__
    self.take_profit_pct = self.calculate_max_take_profit_pct()
  File "c:\Users\lenovo\Documents\ccdi.py", line 213, in calculate_max_take_profit_pct
    actual_balance = float(self.balance)
TypeError: BalanceApp.__float__ returned non-float (type int)
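
The traceback points at the `else` branch of `__float__`, which returns the int `0`. A stripped-down sketch of the fix, using a hypothetical `Balance` class with no broker connection (not the real `BalanceApp`):

```python
class Balance:
    """Hypothetical, reduced stand-in for BalanceApp."""

    def __init__(self):
        self.account_balance = None

    def __float__(self):
        if self.account_balance is not None:
            return float(self.account_balance)
        # __float__ must return an actual float; returning the int 0
        # raises "TypeError: __float__ returned non-float (type int)".
        return 0.0


balance = Balance()
empty_value = float(balance)       # no balance received yet
balance.account_balance = "12345"  # value as it arrives in accountSummary
filled_value = float(balance)
print(empty_value, filled_value)
```

Comparing against `None` instead of relying on truthiness also keeps a real balance of exactly 0 from being treated as "missing".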
#

@mild dirge please ping me as soon as u have time or maybe dm me it would be more efficient. Well it's up to u and what u prefer

subtle lotus
compact valley
#

Question for on-the-job Data Scientists
What type of tasks do you get in your work day regarding data science/usage of sql/other technologies used per task?
Please I need real life examples to help me envision it 🫠

surreal sphinx
#

Do you guys have recommendations for going through medium amounts of data? I have 20gb of data broken up into 2gb csv files. Would it be better to store it all in a database, or open one file at a time?

#

I need to go through all of the data either way.

left tartan
left tartan
#

You can also read them into dataframes and concat.... but pandas is slow, and I like sql. Depending on organization, I might transcode them to parquet files, or load them to tables, or whatever.
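
If the files are processed one at a time, a rough sketch of streaming them in chunks with pandas could look like this. The two in-memory "files" and the `value` column are made up for illustration; with real data the list would hold the paths of the 2 GB CSVs:

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the CSV files on disk.
files = [
    StringIO("id,value\n1,10\n2,20\n3,30\n"),
    StringIO("id,value\n4,40\n5,50\n"),
]

total = 0
rows = 0
for f in files:
    # With chunksize set, read_csv yields DataFrames of at most that many
    # rows, so only one chunk is held in memory at a time.
    for chunk in pd.read_csv(f, chunksize=2):
        total += int(chunk["value"].sum())
        rows += len(chunk)

print(rows, total)
```

This only suits computations that can be accumulated chunk by chunk; anything that needs all rows at once is where a database or Parquet files become more attractive.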

ruby magnet
edgy falcon
#

has something like this happened to anyone else in the data science problems on LeetCode?

tawny fog
#

I have been assigned with a Capstone Project from my school to build a fully customised chatbot with my School's Branding and Name and all the information related to my school for integration with their own Official Website for which the parents or potential customers willing to admit their wards in my school can get general and day to day information about my school like Fee Structure, Timing, Subjects Offered, culture, etc. Now, I am supposed to build this project in Python but as a Beginner in Python, I really don't know anything about it and I don't know how to achieve it. I have 4 Months to build this project and this project would be evaluated by external invigilators on the basis of which I would be allocated marks and these marks are very important for me. Please explain in each and every detail and aspect of how I can achieve this Target of mine. I want to make something advance but I really don't have any knowledge about how to make Chatbots. I have tried the chatterbot library in python to build my chat but that would take a hell lot of data to train it perfectly and I don't have time to do it along with my studies. And also, I am not being funded from my school so, I'm doing everything from my own pocket. I can't afford to spend any money on this project since I don't have any. So, kindly recommend to me how I can complete this project with all free and open source solutions. And I'm supposed to build up this project from scratch since I need to explain the technical know-how and how this project is working and what I did!

worldly dawn
# tawny fog I have been assigned with a Capstone Project from my school to build a fully cus...

Handing you an architecture with each and every detail and aspect wouldn't be helping you. That would also be considered cheating.
Instead, let's focus on showing how to approach these seemingly impossible problems.

First, think about the requirements:

  • What does it need to do?
  • What about corner cases?
  • How fancy should it be?
  • How structured should the information be? Can I ask any question as free form, or is it more directed (like when you phone your bank and they ask you to press different numbers based on what you want)?
  • How does it integrate with the official website or potential customers?

For that step, it helps a lot to go through concrete examples and to write them down. So that way, you have something concrete to work with, something to use as tests and something to show your teachers if you have specific questions about whether a specific case ought to be supported or not.

Then the next step is to proceed by dichotomy: split that huge problem in smaller chunks until they each become manageable on their own.
So for instance, whether you need to know python, if there are libraries about ml/ai or chatbots, about how to host your service, etc.

past meteor
unique flame
#

I would recommend Huggingface, but it is their own responsibility to understand the technical intricacy of the subject. I mean people get paid to explain this.

past meteor
#

You take a bunch of your school's documents and give them as context and then you chat with GPT as usual. I'm pretty sure Azure cognitive services has templates you can roll with where you just need to fill in the blanks.

worldly dawn
#

It would be a good start for them to make a list of what's out there in the landscape of chatbots and what should a chatbot be able to accomplish in terms of parsing and understanding user queries

#

they can still get help in clarifying some of the points though

past meteor
#

Oh yeah, I fully agree. I think what you wrote is 100 % what I would do

worldly dawn
#

Unfortunately, there is no magic

past meteor
#

I added my points because if they're feeling particularly lost it's a decent fallback plan. It's a bit nebulous because I see consultants default to GPT when easier (and cheaper!) methods could've worked.

#

Shows they haven't done their due diligence which is what you're essentially telling them to do.

worldly dawn
#

definitely

signal dust
#

hey guys

#

I want a help in scrapy

late shell
#

Hello, I'm trying to use the llama-2 ggml model via langchain and ctransformers. I installed CUDA toolkit as per the commands on the official NVIDIA page and set the PATH and LD_LIBRARY_PATH variables in my .bashrc. But when I'm trying to load the model via langchain.llms.CTransformers, it throws an error saying:
libcudart.so.12: Cannot open shared object file: No such directory
Can someone please help me with this. I'm a beginner and have been trying for a day to get this to work. Thanks

finite sky
#

eya everyone. I'm thinking about doing a block clutcher sorta cheat for Minecraft. So basically the task is, I wanna, using CV, recognize all the sides of a block, pick the closest one, and then I'll use it to place another block on it in game itself.
The dilemma is that Idk how to do it. I'm choosing between trying to implement some algorithms, or using a neural network. (theoretically I can make a loooot of screenshots bc it's Minecraft lol and I can automatically generate those).
Currently I'm liking the nn option more but I googled it and these are so complicated--
What should I do?

#

(these are surfaces it should be finding)

mild dirge
#

!rule 5 @finite sky

arctic wedgeBOT
#

5. Do not provide or request help on projects that may violate terms of service, or that may be deemed inappropriate, malicious, or illegal.

finite sky
mild dirge
#

It's not about how it's phrased. Botting in mc is not allowed pretty sure

finite sky
mild dirge
#

At least not in most servers

#

Your screenshot literally shows you're in hypixel

finite sky
latent remnant
#

does anyone here know how to work with csv files in Jupyter Notebook?

latent remnant
left tartan
#

Text is preferred

#

but just paste it here either way

latent remnant
#

the problem is the csv file actually

left tartan
#

Well, explain first plz

latent remnant
#

So, i have created a new csv file from an old one which had unnecessary data in it, as you can see on the left side, it's the new csv file but when i print it the output doesn't look great

#

i want my data to look like this, but in the above created csv file, all the new columns were down

mild dirge
#

You have multiple columns in the header but only 2 columns in each row?

left tartan
#

Your "cleaned.csv" looks terrible.

#

So lets start at the beginning:

latent remnant
left tartan
#

The problem you have is: How do you read a CSV properly?

#

Can you share the first two lines of the csv file as text?

latent remnant
left tartan
#

The header and first two lines, I should say

latent remnant
left tartan
#

The good one (original)

mild dirge
#

You can see the header has multiple columns, but each row has only 2 columns

#

So that is why it isn't reading it correctly probably

latent remnant
#

Team,Player,Tournament,Matches,Batting Innings,Not Out,Runds Scored,Highest Score,Batting Average,Balls Faced,Batting Strike Rate,100,50,0,4s,6s,Bowling Innings,Overs Bowled,Maidens Bowled,Runs Conceded,Wickets Taken,Best Bowling Figures,Bowling Average,Bowling Economy Rate,Bowling Strike Rate,4+ Innings Wickets,5+ Innings Wickets,Catches Taken,Stumpings Made
Delhi Daredevils,CH Morris,IPL 2016,12,7,4,195,82*,65,109,178.89,0,1,1,15,12,12,44,0,308,13,Feb-30,23.69,7,20.3,0,0,8,0

latent remnant
mild dirge
#

And not sure what you use as separator

latent remnant
mild dirge
#

In your header the column names are separated by ,

mild dirge
#

You want the same in all your rows

left tartan
#

Yah, this read fine for me: ```py
from io import StringIO
import pandas as pd

s = StringIO("""Team,Player,Tournament,Matches,Batting Innings,Not Out,Runds Scored,Highest Score,Batting Average,Balls Faced,Batting Strike Rate,100,50,0,4s,6s,Bowling Innings,Overs Bowled,Maidens Bowled,Runs Conceded,Wickets Taken,Best Bowling Figures,Bowling Average,Bowling Economy Rate,Bowling Strike Rate,4+ Innings Wickets,5+ Innings Wickets,Catches Taken,Stumpings Made
Delhi Daredevils,CH Morris,IPL 2016,12,7,4,195,82*,65,109,178.89,0,1,1,15,12,12,44,0,308,13,Feb-30,23.69,7,20.3,0,0,8,0""")

df = pd.read_csv(s)

print(df)
```

left tartan
#

You only partially shared.

tidal bough
#

i don't see anything wrong with the csv you posted? you probably just shouldn't be looking at a printed df, the text representation is confusing when there's tons of columns.

latent remnant
# left tartan Can you share the actual code, not screenshot, in your Jupyter cell?

data = pd.read_csv("IPL 2016-2019.csv")
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

team_2019 = data[data["Tournament"] == "IPL 2019"]
player_2019 = team_2019["Player"]
player_t_2019 = team_2019["Team"]
player_batting_innnings = team_2019["Matches"]
player_batting_avg = team_2019["Batting Average"]
player_strikerate = team_2019["Batting Strike Rate"]
player_bowling_innings = team_2019["Bowling Innings"]
player_bowling_average = team_2019["Bowling Average"]
player_bowling_eco = team_2019["Bowling Economy Rate"]

dict = {
"Player" : [player_2019],
"Team" : [player_2019],
"Batting Innings" : [player_batting_innnings],
"Batting Average" : [player_batting_avg],
"Batting Strike Rate" : [player_strikerate],
"Bowling Innings" : [player_bowling_innings],
"Bowling Averagw" : [player_bowling_average],
"Bowling Economy Rate" : [player_bowling_eco]
}

final_2019_data = pd.DataFrame(dict)
final_2019_data.to_csv("IPL_2019_Cleaned.csv")

mild dirge
#

The cleaned one does not look correct

left tartan
#

a quick repro of the code with data: ```py
from io import StringIO
import pandas as pd

s = StringIO("""Team,Player,Tournament,Matches,Batting Innings,Not Out,Runds Scored,Highest Score,Batting Average,Balls Faced,Batting Strike Rate,100,50,0,4s,6s,Bowling Innings,Overs Bowled,Maidens Bowled,Runs Conceded,Wickets Taken,Best Bowling Figures,Bowling Average,Bowling Economy Rate,Bowling Strike Rate,4+ Innings Wickets,5+ Innings Wickets,Catches Taken,Stumpings Made
Delhi Daredevils,CH Morris,IPL 2016,12,7,4,195,82*,65,109,178.89,0,1,1,15,12,12,44,0,308,13,Feb-30,23.69,7,20.3,0,0,8,0""")

data = pd.read_csv(s)
team_2019 = data[data["Tournament"] == "IPL 2019"]
player_2019 = team_2019["Player"]
player_t_2019 = team_2019["Team"]
player_batting_innnings = team_2019["Matches"]
player_batting_avg = team_2019["Batting Average"]
player_strikerate = team_2019["Batting Strike Rate"]
player_bowling_innings = team_2019["Bowling Innings"]
player_bowling_average = team_2019["Bowling Average"]
player_bowling_eco = team_2019["Bowling Economy Rate"]

d = {
"Player" : [player_2019],
"Team" : [player_2019],
"Batting Innings" : [player_batting_innnings],
"Batting Average" : [player_batting_avg],
"Batting Strike Rate" : [player_strikerate],
"Bowling Innings" : [player_bowling_innings],
"Bowling Averagw" : [player_bowling_average],
"Bowling Economy Rate" : [player_bowling_eco]
}

df2 = pd.DataFrame(d)
print(df2)
```

#

(didn't fix anything, just merged it)
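
For comparison, a sketch of one way to build the cleaned table by selecting a list of column labels, instead of wrapping each Series in a list inside a dict. The two rows and the reduced column set are made up for illustration:

```python
from io import StringIO

import pandas as pd

# Two invented rows with only a few of the original columns.
s = StringIO(
    "Team,Player,Tournament,Batting Innings,Batting Average\n"
    "Team A,Player One,IPL 2019,14,30.5\n"
    "Team B,Player Two,IPL 2016,12,65.0\n"
)
data = pd.read_csv(s)

# Keep only the 2019 rows, then pick the wanted columns in one step.
# Each selected column stays a normal column with one value per player.
wanted = ["Player", "Team", "Batting Innings", "Batting Average"]
cleaned = data.loc[data["Tournament"] == "IPL 2019", wanted]

# index=False leaves the row index out of the written file.
out = StringIO()
cleaned.to_csv(out, index=False)
print(out.getvalue())
```

Selecting columns this way also avoids mix-ups such as filling the "Team" entry with the player column, which the dict version above does.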

tidal bough
latent remnant