#data-science-and-ml


lapis sequoia
#

it should be data science related though, otherwise it's better to pick an available help channel (help-coconut, help-donut etc.)

#

!paste

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

eager verge
#

Can someone solve this ?

#

i did this solution

#

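def solution(logbook):
    # For every entry, take the distance between the character codes
    # of its first two characters, then return the largest distance.
    length = []
    for i in logbook:
        length.append(abs(ord(i[0]) - ord(i[1])))
    return max(length)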

cinder plover
#

Hi

#

I am also a coder and i code Computer Vision and Robotics Programs

#

I have made an advanced app, and I have also made a video demonstrating its features

young granite
#

is here someone with plotly/dash knowledge and willing to help me?

cinder plover
#

I am more into computer vision and robotics

#

Would anyone be interested in seeing my advanced computer vision project?

lapis sequoia
#

Scikit-learn data preprocessing:
Hi, I wonder where the best place to put data preprocessing functions is: before the scikit-learn pipeline, implemented as my own functions, or within the scikit-learn pipeline, with the data cleaning written as my own transformer class?

weak grove
#

Hi, I am getting this issue while installing TensorFlow. Can anyone help me with this?

cinder plover
#

Hi Friends

#

I want to share

#

something with you guys

#

I have made an advanced AI and computer vision project. Would you mind checking it out and rating it?

lapis sequoia
#

I think it would be better if people could look at the code instead of just the video lol. (just my opinion ofc)

cinder plover
#

yes i will give the source code

#

and i have made it as an app,you want to see the GUI ?

#

This is how the GUI looks

limpid cosmos
#

can someone share some books from which i can learn ML/DL
i want to learn its math, not just code

lapis sequoia
#

moreover he also has videos of CNN on youtube(assuming you want to learn cnn too)

#

i like the way he explains things.

tough bolt
#

.

cerulean vapor
#

Hello need help

cerulean vapor
#

How to update files?

lapis sequoia
#

!paste

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

lapis sequoia
#

@cerulean vapor

cerulean vapor
#

First issue I don't get .csv files

#

In referenced directory

#

Result is that just

lapis sequoia
#

I assume it's a tmp file because the download hadn't finished when you closed the driver. I'm not sure this will work, but for now remove all the other links, try it with just one, and don't close the driver. Then find a way to make sure the driver doesn't close before the download has completed.
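
A minimal sketch of that "don't close before it's downloaded" idea, using only the standard library. It polls the download folder until a finished file shows up and no partial file is left. The function name, the `*.csv` pattern and the list of partial-download suffixes are assumptions for illustration (Chrome uses `.crdownload` and Firefox uses `.part` for unfinished downloads), so adjust them to whatever your browser actually writes.

```py
import time
from pathlib import Path

# Suffixes of in-progress downloads. ".crdownload" (Chrome) and ".part" (Firefox)
# are the usual ones; ".tmp" is an assumption based on the tmp file seen above.
PARTIAL_SUFFIXES = {".crdownload", ".part", ".tmp"}


def wait_for_downloads(directory, pattern="*.csv", timeout=60.0, poll=0.5):
    """Block until `directory` holds a file matching `pattern` and no partial downloads.

    Returns the sorted list of matching paths, or raises TimeoutError.
    """
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        files = [p for p in directory.iterdir() if p.is_file()]
        partial = [p for p in files if p.suffix in PARTIAL_SUFFIXES]
        finished = sorted(p for p in files if p.match(pattern))
        if finished and not partial:
            return finished
        if time.monotonic() >= deadline:
            raise TimeoutError(f"download did not finish within {timeout} s")
        time.sleep(poll)
```

The idea would be to call `wait_for_downloads(download_dir)` right before `driver.quit()`, so the browser only closes once the file is complete.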

#

@cerulean vapor

#

Also I'm not sure if this breaks TOS of the site since you are using the information, I would like you to make sure it's not breaking it, since, if it is, then we cannot help you.

#

or I'll just dm @sonic vapor to make sure lol.

cerulean vapor
#

no not breaks

lapis sequoia
#

alright no issues. just confirmed with mod too.

terse frigate
#

if a+b = 2
and a = 0.5
isnt that the same as:

lim (a+b) = 2 as a -> 0.5

??

#

@lapis sequoia help pls

atomic leaf
#

How do you approach recognizing multiple symbols in one sequence with PyTorch? So instead of predicting images with a single digit (e.g. 7), you predict multiple digits (e.g. 8271)?

#

I have a program that can recognize digits on images with only one digit, but I can't get it to work with multiple digits in the images

atomic leaf
#

Is that easier than just using the entire image at once?

#

So like instead of image data being

#

It would be

#

?

devout sail
#

The problem space seems a bit too big to have a class for each number

#

yeah you would break it up like that first

grizzled stirrup
#

Is there a help forum for pandas specifically anywhere?

grizzled stirrup
#

I'll ask here then! Thank y'all so much. I checked Stack Overflow but can't really articulate in Google what I want. It's a simple problem, but I only have foundational pandas and Python experience so I'm a bit stuck

grizzled stirrup
#

!code

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

atomic leaf
#

Like i can't just divide at a specific place, cause each image has a slightly different positioning of the symbols

devout sail
# atomic leaf

You can still separate them relatively well. You might want to look into detection and not just classification.

atomic leaf
#

Hmm

#

Darn

devout sail
#

For this specifically you might still be able to do it with some heuristics like eroding it first and then finding connected components
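
A rough, dependency-free sketch of the connected-components half of that heuristic. It works on a tiny hand-made binary grid and returns one bounding box per blob, ordered left to right, so each box could then be cropped and passed to the single-digit classifier. The erosion step is left out here, and on real images you would normally reach for an image library rather than this toy flood fill; the function name and the sample grid are made up for illustration.

```py
from collections import deque


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of 4-connected blobs of 1s,
    sorted left to right."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            # Flood-fill one blob with breadth-first search, tracking its extent.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


# Toy binary image containing two separate "digits".
image = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
]
boxes = connected_components(image)
print(boxes)  # [(0, 1, 2, 2), (0, 5, 2, 6)]
```

This only separates symbols that don't touch, which is why eroding first can help when strokes of neighbouring digits bleed into each other.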

weak grove
#

can anyone plz help me with this issue

grizzled stirrup
#

I have a dataframe in Pandas that is just 2 columns. I only needed one column so I wrote code for that:

Then, I got help with a regex that removes all PERIODS and NUMBERS from this list of emails. The list is 22 million distinct emails.

for x in test:
    new_email = re.sub(pattern, "", x)
    print(new_email)

So that block of code works and does what it is supposed to do, but now my problems are these:

  1. When I execute the block of code, there are so many emails printed that the text ends up overlapping itself and eventually causes Jupyter Notebook to crash

  2. I don't know how to export my results to a .csv. If the results were in a dataframe I'd know how to do it, but from the for statement -> output -> to .csv I have no clue. I imagine you'd have to update the dataframe somehow but no idea how to do that here

lapis sequoia
#

or df.email

grizzled stirrup
#

Sorry, you're correct. I am writing this manually as it's on my work computer

lapis sequoia
#

okay. lemme read it and see if i can help.

lapis sequoia
#

also I'll show you how you can save them

grizzled stirrup
#

Thanks! I was printing them because the original person helping me said that needed to be in there, and I also needed to ensure the code worked. Luckily, it did. Any help you can give is appreciated

lapis sequoia
#

uhm

#

!d pandas.DataFrame.to_csv

arctic wedgeBOT
#
```py
DataFrame.to_csv(path_or_buf=None, sep=',', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, compression='infer', ...)
```
Write object to a comma-separated values (csv) file.
lapis sequoia
#

@grizzled stirrup check this out

grizzled stirrup
#

Thanks for this, but where do I actually use print.head() or export to .csv after this 'for loop' statement? That is what I am stuck on, because the for loop is outputting results that don't appear to be in a dataframe, if that makes sense

lapis sequoia
#

that kinda didn't make sense

grizzled stirrup
#

That is my last line of code

lapis sequoia
#

first, don't use forloop.

#

forloops are slow and not pandas way. give me a minimal example.

grizzled stirrup
#

Okay thanks! I am new and didn't know that.

Really I am just needing a regex statement that removes periods and numbers from this series of emails. So if the email was mr.prashant111@gmail.com, the result would be mrprashant@gmail.com

lapis sequoia
#

ah beautiful. gimmi a sec.

#

!e

import pandas as pd
df = pd.DataFrame(['mr.prashant111@gmail.com'], columns=['P'])
df['P'] = df['P'].str.replace(r'\d|\.', '', regex=True)
# now you can save it by that function
print(df)
arctic wedgeBOT
#

@lapis sequoia :white_check_mark: Your eval job has completed with return code 0.

001 |                      P
002 | 0  mrprashant@gmailcom
lapis sequoia
#

@grizzled stirrup

grizzled stirrup
#

let me give this a go! Thanks so much for all your help

lapis sequoia
#

no issues. happy to help.

lapis sequoia
#

sorry archer pinged by mistake

grizzled stirrup
#

That is the output I needed. The @ symbol can stay, but I just need any periods or numbers to go away 🙂

lapis sequoia
#

oh alright.

#

give this a go and ping me if you need me.
also df['P'] = this part is needed since you need to reassign it
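
One thing worth knowing about the `\d|\.` pattern: it also strips the dot in the domain, so `gmail.com` becomes `gmailcom`. If the domain should stay intact, a possible variant (a sketch, not the only way) is a lookahead that only matches digits and periods while an `@` still follows them. The names `LOCAL_PART_NOISE` and `clean_email` are made up for this example.

```py
import re

# Remove digits and periods only when an "@" still follows them,
# i.e. only in the part of the address before the "@".
LOCAL_PART_NOISE = re.compile(r"[\d.](?=[^@]*@)")


def clean_email(email):
    return LOCAL_PART_NOISE.sub("", email)


print(clean_email("mr.prashant111@gmail.com"))  # mrprashant@gmail.com
```

The same pattern string should also work in the pandas version, e.g. `df['P'].str.replace(r'[\d.](?=[^@]*@)', '', regex=True)`.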

grizzled stirrup
# lapis sequoia oh alright.

Omg this worked! You rock, my friend. I am still learning lots about Python and Pandas, but people like you really help motivate me to continue. I appreciate your help so much

lapis sequoia
#

haha nice!

orchid kayak
#

Can a model output a shape of (513, 1)?

orchid kayak
#

to a sequential model?

serene scaffold
#

what was the shape of the tensor you passed to it?

orchid kayak
#

my x is (2534, 513, 26, 1) and my y is (2534, 513, 1). My final dense layer has 513 variables and a linear activation
When I passed my y with the third dimension, I received an error message due to dimension mismatch at the last layer. So I reshaped my y to (2534, 513) and the model started to fit, but the results are subpar

#

loss: 0.0032 - accuracy: 0.0020
This is the model result for the last epoch

serene scaffold
#

I'd have to know the architecture of the network to know why you ended up with (513, 1) specifically, but it's unsurprising that you'd end up with (n, 1) for an n that is the length of one of the input's dimensions.

orchid kayak
#

I am following a tutorial and I don't understand myself why I have a shape of (513, 1); I got this far by understanding what I could. Right now I am just experimenting with the concept to see if I can make sense of it

lapis sequoia
#

so ofc output is gonna have that shape

#

ofc you could flatten X and Y to (2534*513, 26, 1) and so on (same for Y), so you'd have a singular Y, but that really depends on what the data is and whether that's what you want.

orchid kayak
#

I am working with spectrograms of audio data. the x is the spectrogram of the mixture, and the y is the spectrogram of the vocals. The goal is for the model to be able to separate vocals from instruments. I have succeeded in making a voice activity detection model using the same dataset, but for the source separation model I am having issues, mainly because the article I am following is not as descriptive about this part.

lapis sequoia
#

Did I understand this right: when there is a lot of data, training takes longer, so it's good to use distributed training. In data parallelism, the model is replicated on different devices and the data is split between them. Then each worker communicates what its model learned to the others and they update their weights accordingly. Is that right?
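
That is roughly the synchronous picture. A toy sketch of it, with plain Python standing in for real devices: every "worker" starts from the same weight, computes a gradient on its own shard, and the averaged gradient is applied to all replicas so they stay identical. What gets exchanged in this scheme is gradients rather than anything higher-level. The model (`y = w * x`), the data and the function names are all invented for illustration.

```py
def shard_gradient(w, xs, ys):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def data_parallel_step(w, shards, lr=0.01):
    """One synchronous step: every worker starts from the same w, computes a
    gradient on its own shard, and the averaged gradient updates all replicas."""
    grads = [shard_gradient(w, xs, ys) for xs, ys in shards]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


# Toy data for y = 3x, split evenly across two hypothetical workers.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
shards = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
print(w)  # converges towards 3.0
```

With equally sized shards the averaged gradient equals the gradient over the whole batch, which is why the result matches single-device training. In asynchronous training the workers would instead push their updates without waiting for each other, so some gradients are computed on slightly stale weights.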

#

Also, can someone explain asynchronous training?

wicked grove
#

hello

#

i have trained my model for 50 epochs and the accuracies keep changing

#

should i choose the final val_acc as the one i get on the last epoch or should i choose the best val_acc for my final model??

pastel herald
#

Hey everyone,

Is there a way to get a specific desktop application "window" that is open? I'm looking to grab by title (with a wildcard flag) as there can be several instances of this application open at one time.

soft viper
#

Guys, real quick. What does k actually mean? I always see it in algorithms such as k-means clustering and k-nearest neighbours

wicked grove
#

and similarly for knn where k is the number of nearest data points, i.e the nearest neighbors
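
To make the kNN meaning of k concrete, here is a small pure-Python sketch: the query point is classified by a majority vote among its k nearest training points. The data, labels and function name are made up for the example, and squared Euclidean distance is just one possible choice.

```py
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((x, y), label) pairs; distance is squared Euclidean.
    """
    by_distance = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((5, 5), "b"), ((5, 6), "b")]
print(knn_predict(train, (0.5, 0.5), k=3))  # a
print(knn_predict(train, (5, 5.5), k=3))    # b
```

Changing k changes how many neighbours get a vote, which is why it is usually treated as a hyperparameter to tune.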

lapis sequoia
#

oh yes as @wicked grove explained.

wicked grove
# lapis sequoia in k means you take 3 means, hence creating 3 clusters.

heyy, i have a doubt... so i am training a vgg19 model and have used k-fold cross validation with 10 splits. except for one of the splits, the accuracies are pretty consistent, tho they kinda keep changing. I was watching a video where they said i can treat the 10 folds as 10 separate models. so can i save the 'best model'?

lapis sequoia
#

you mean average the accuracy you found for each epoch?

#

sure go on

arctic wedgeBOT
#

Hey @wicked grove!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

wicked grove
#

so like you can see these are my val_acc, should i choose the best one or use early stopping... also do i restore the weights?
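
For a feel of what early stopping with "restore the best weights" does, here is a framework-free sketch that replays a list of per-epoch validation accuracies. It reports the epoch whose weights would be kept and the epoch where training would stop. The accuracies are invented, and the function name is hypothetical; in Keras the equivalent behaviour comes from the `EarlyStopping` callback with `patience` and `restore_best_weights=True`.

```py
def early_stopping(val_accs, patience):
    """Simulate early stopping on a list of per-epoch validation accuracies.

    Returns (best_epoch, stop_epoch), both 0-based. Training stops once
    `patience` epochs in a row fail to beat the best accuracy seen so far.
    """
    best_epoch, best_acc, waited = 0, float("-inf"), 0
    for epoch, acc in enumerate(val_accs):
        if acc > best_acc:
            best_epoch, best_acc, waited = epoch, acc, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_accs) - 1


# Made-up accuracies for illustration only.
history = [0.60, 0.68, 0.71, 0.70, 0.69, 0.72, 0.70, 0.71, 0.69]
best, stopped = early_stopping(history, patience=3)
print(best, stopped)  # 5 8
```

Keep in mind that picking the best epoch by validation accuracy makes that number slightly optimistic, so a separate held-out set is the safer place to report the final score.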

lapis sequoia
# wicked grove yess

okay so you got to understand one thing.
what exactly will you get by taking average?

let me ask you this question.
say you are learning how to guess if there is a dog in some images.

you guess it and it has some n1% accuracy
The second time it has n2% accuracy
and it will increase since you will understand

now will the average matter at 50th epoch or will the end result?

lapis sequoia
#

exactly! hence taking average is in a way meaningless.

lapis sequoia
#

uhm, well it would help (again I'm not sure if they store the weights of each epoch)

also taking best may fall into overfitting as you know.

#

but yeah it could help. yes.

#

Hello, I'm stuck on something that has been driving me crazy since yesterday afternoon, and I wonder if you could lend me a hand. Basically I want to change the value of the predictable column to 1 whenever a cityTown/startDate combination from the predictable_incidences dataframe matches a multi-index entry of the original dataframe df.

df['predictable'] = 0
df['startDate_dupli'] = pd.to_datetime(df['startDate']).dt.strftime("%Y-%m-%d")
predictable_incidences = df[df['incidenceType'].isin(['Event', 'Labour'])]
df.set_index(['cityTown', 'startDate_dupli'], inplace=True)
predictable_incidences['startDate_dupli'] = pd.to_datetime(predictable_incidences['startDate']).dt.strftime("%Y-%m-%d")
zipped_list = list(zip(predictable_incidences['cityTown'].to_list(), predictable_incidences['startDate_dupli'].to_list()))
print(zipped_list)
df.loc[zipped_list, 'predictable'] = 1
print(df)
wicked grove
#

This is where i saw it tho

#

I may be wrong

lapis sequoia
#

but yeah that explanation is not incorrect.

lapis sequoia
#

So basically the dataframe that I'm using contains traffic incidences. Depending on the incidenceType, some of them are considered predictable and the rest non-predictable. Basically what I wanna do is remove those predictable incidences from the dataframe after setting the predictable_incidences flag to 1 for those non-predictable incidences that have predictable incidences for that day and city.

#

Does it make any sense?

lapis sequoia
#

According to the docs, it is possible to pass in a list of multi-indexes to df.loc[zipped_list, 'predictable'] = 1 so that this code should change the rows that match the multi-index, but it changes all the possible combinations within that list.

#

zipped_list = [('Zarautz', '2020-01-01'), ('Santurtzi', '2021-02-03')]
It should change two rows if found:

  • 'Zarautz', '2020-01-01'
  • 'Santurtzi', '2021-02-03'

It sets all the possible combinations to 1 instead:

  • 'Zarautz', '2020-01-01'
  • 'Santurtzi', '2020-01-01'
  • 'Zarautz', '2021-02-03'
  • 'Santurtzi', '2021-02-03'
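
It is hard to tell from the snippet alone why `.loc` expanded to every combination, so this is only a possible workaround rather than a diagnosis. A boolean mask built with `MultiIndex.isin` is True only where the full (city, date) pair appears in the list, and it also tolerates pairs that are missing from the index. The four-row frame below is invented to mirror the example above.

```py
import pandas as pd

# Made-up frame with the four (city, date) combinations from the example.
df = pd.DataFrame(
    {
        "cityTown": ["Zarautz", "Santurtzi", "Zarautz", "Santurtzi"],
        "startDate_dupli": ["2020-01-01", "2020-01-01", "2021-02-03", "2021-02-03"],
        "predictable": 0,
    }
).set_index(["cityTown", "startDate_dupli"])

zipped_list = [("Zarautz", "2020-01-01"), ("Santurtzi", "2021-02-03")]

# True only for rows whose complete index tuple is in zipped_list.
mask = df.index.isin(zipped_list)
df.loc[mask, "predictable"] = 1
print(df)
```

Only the two listed pairs end up with `predictable == 1`; the cross combinations stay at 0.
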
spark apex
#

I want to use MoveNet in Unity.
I was thinking of using Barracuda.
I converted the TF MoveNet model to ONNX and tried to use it, but it gave this error:

```
Unsupported default attribute `split` for node sequential/keras_layer/StatefulPartitionedCall/StatefulPartitionedCall/unstack:0 of type Split. Value is required.
```
fast drum
lapis sequoia
#

So this is ETL process

#

I don't understand what exactly extracting is. Is it downloading the dataset and putting it on disk?

serene ridge
#

hey guys, which one do you recommend for data science, Intel iris xe or Nvidia. does even data science demand a specific kind of GPU or it doesn't matter?

wicked grove
#

I tried retraining 4 layers of vgg 19 but the accuracy dropped

#

I added 3 extra dense layers,but that didn't help a lot

rapid pawn
# serene ridge hey guys, which one do you recommend for data science, Intel iris xe or Nvidia. ...

Nvidia is pretty much the default choice for high-performance ML model training, as most if not all mainstream ML frameworks use CUDA to accelerate training. While the CUDA toolkit is free to download (it is proprietary, not open source), you do need an Nvidia GPU with physical CUDA cores to take advantage of it and the libraries built on it, iirc. And if you are just starting out, I would recommend trying something like Google Colab, which provides free GPU and CPU time for training models; all you need to do is go to the Google Colab website and start coding

#

there are ways to bypass the CUDA requirement on AMD GPUs for example but it requires some setup which im not that familiar with

#

also for Nvidia RTX GPUs you get Tensor cores in addition to CUDA cores so those could also help when training models etc

pulsar elk
#

so I have this class. Does anyone know why it would fail if I try to subtract it from a np.ndarray?

class Vector3(np.ndarray):
    @property
    def x(self): return self[0]

    @x.setter
    def x(self, value): self[0] = value

    @property
    def y(self): return self[1]

    @y.setter
    def y(self, value): self[1] = value

    @property
    def z(self): return self[2]

    @z.setter
    def z(self, value): self[2] = value
#

I get this, but they're both 1d arrays with length 3

Traceback (most recent call last):
  File "/persist/safe/home/user/persist/sortme/rasterizer/./magic.py", line 124, in <module>
    c.draw_poly([[100,100,100], [100,200,100], [200,200,100], [200,100,100]], (255,0,0))
  File "/persist/safe/home/user/persist/sortme/rasterizer/./magic.py", line 74, in draw_poly
    points = list(map(self.transform_point, points))
  File "/persist/safe/home/user/persist/sortme/rasterizer/./magic.py", line 65, in transform_point
    x = self._transformation.dot(np.array(point) - self.position)[:-1]
ValueError: operands could not be broadcast together with shapes (3,) (0,0,0)
#

I constructed it as Vector3([0,0,0])

#

oh wait do I have to use something other than __init__ for that

#

nvm, got it.

    def __new__(cls, val):
        return np.array(val).view(cls)
fading wigeon
#

This isn't strictly a python problem, but what sort of comparison/correlation analysis/method should I use when comparing how two different algorithms perform when compared to one another?

For some additional context, on one dataset I expect to see a major difference in performance and in another dataset I expect to see no difference in performance. Here performance means arriving at the right number, it has nothing to do with speed.

fading wigeon
#

No, not a classification problem. Maybe something more like... looking at a person in the distance and guessing how tall they are

mild dirge
#

like mean squared error or something?

fading wigeon
#

That might be a good test

mild dirge
#

and you could use k-fold for validation method

#

and average over all the folds over multiple runs
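
A small pure-Python sketch of that suggestion: two stand-in "algorithms" are scored by mean squared error on each held-out fold, and the fold scores are averaged. Everything here is made up for illustration (the data, the two fitting functions, the fold helper), and the folds are contiguous and unshuffled to keep the example short; for repeated runs you would shuffle with a different seed each time.

```py
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def fit_mean(xs, ys):
    """Baseline 'algorithm': always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Second 'algorithm': least-squares line through the training points."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)


def cross_val_mse(fit, xs, ys, k=5):
    """Average validation MSE of `fit` over k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        scores.append(mse([model(xs[i]) for i in fold], [ys[i] for i in fold]))
    return sum(scores) / len(scores)


# Made-up data: y = 2x + 1 with a small alternating offset.
xs = [float(i) for i in range(20)]
ys = [2 * x + 1 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
print("mean baseline:", cross_val_mse(fit_mean, xs, ys))
print("linear fit:   ", cross_val_mse(fit_line, xs, ys))
```

Running both algorithms over the same folds gives paired scores, which makes the "big difference on one dataset, none on the other" comparison easier to read.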

lapis sequoia
#

So, validation data is used for updating hyperparameters, right? I'm interested in how that works. Let's say we specify a batch size of 32. Will the model then make predictions with new hyperparameters that haven't been used before (?), and if it gets better accuracy than it had before, update the hyperparameters?

mild dirge
#

Validation data in general is used* to see how well a model performs

#

You can validate your model for multiple different hyper parameter values and see which one performs best

#

Can be done with a gridsearch f.e.

deep galleon
#

Anyone here familiar with exporting xarrays as GRIB files?

lapis sequoia
#

@mild dirge I should have mentioned that I am interested in how it works in TensorFlow 2

mild dirge
#

Never used tf2, but I assume you could just:

  1. split the data into training and testing (making sure the data is balanced for both)
  2. train on the training data with a given set of hyper parameters
  3. predict on the test data
  4. compare the test data desired outcomes with your predictions (with like MSE or accuracy)
  5. go to step 2, but choose different hyper parameters and check which parameters give better performance
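
The five steps can be sketched without any ML framework. Below, the "model" is a one-parameter ridge regression and the hyperparameter being tried is its penalty `lam`; the data, the candidate values and the function names are assumptions chosen only to keep the example tiny.

```py
import random


def fit_ridge(xs, ys, lam):
    """1-D ridge regression without intercept: w = sum(x*y) / (sum(x*x) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Step 1: made-up data (y = 2x plus noise), shuffled and split into train and test.
rng = random.Random(0)
data = [(x, 2.0 * x + rng.gauss(0, 0.1)) for x in (i / 10 for i in range(1, 41))]
rng.shuffle(data)
train, test = data[:30], data[30:]
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

# Steps 2-5: train once per candidate value, score on the held-out data, keep the best.
results = {}
for lam in [0.0, 0.1, 1.0, 10.0, 100.0]:
    w = fit_ridge(train_x, train_y, lam)        # step 2
    results[lam] = mse(w, test_x, test_y)       # steps 3 and 4
best_lam = min(results, key=results.get)        # step 5
print(best_lam, results[best_lam])
```

Note that nothing updates the hyperparameter automatically during training here: each candidate gets its own full training run, and the held-out score is only used afterwards to choose between them.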
#

This entire process is pretty much 1 or 2 lines using sklearn btw

lapis sequoia
#

@mild dirge Yeah, I am aware of that. Also, there is a thing called hyperparameter search, but I am interested in how validation works in TF2

mild dirge
#

validation is done by splitting the data into train and test

#

test is your validation set

#

you validate your model on it

brave sand
#

so I'm at an internship which requires me to write a paper and program an agent to play the game "nim with cash"
NIM(a1, ..., ak; n) is a 2-player game where initially there are n stones on the board and the players alternate removing either a1 or ... or ak stones. The first player who cannot move loses. This game has been well studied. For example, it is known that for NIM(1, 2, 3; n) Player II wins if and only if n is divisible by 4. These games are interesting because, despite their simplicity, they lead to interesting win conditions. We investigate an extension of the game where Player I starts out with d1 dollars, Player II starts out with d2 dollars, and a player has to spend a dollars to remove a stones. This game is interesting because a player has to balance out his desire to make a good move with his concern that he may run out of money. This game leads to more complex win conditions than standard NIM. For example, the win condition may depend on both what n is congruent to mod some M1 and on what d1 - d2 is congruent mod some M2. Some of our results are surprising. For example, there are cases where both players are poor, yet the one with less money wins. For several choices of a1, ..., ak we determine for all (n, d1, d2) which player wins.

#

how should I approach this?

#

should I use a monte carlo algorithm? or just something like alpha zero

#

any input is appreciated!

violet kernel
#

does anyone know anything about the olivetti face dataset?

violet kernel
#

im messing with that dataset and trying out unsupervised learning to put faces in different clusters. Well I wanna see how accurate it was by taking the number of clusters that had only the same faces in it, but i'm not sure where to get started. I've looked around online to see if people have tried it and i cant find anything lol

mild dirge
#

Well unsupervised clustering can cluster them on anything

#

Doesn't have to mean it will cluster them based on identity

#

"how accurate it was by taking the number of clusters that had only the same faces" Here you say "same" but that can be based on a lot of stuff, not just identity 😉

violet kernel
#

oh okay. well that might be my first mistake lol

mild dirge
#

So if you want to cluster them based on identity, you want to use some supervised clustering algorithm

#

But you should probably just make a regular classifier, like a convolutional neural network

#

Or use SIFT to extract keypoints from the images and use those to identify the different faces

#

Lots of different ways to go about this

violet kernel
#

okay. yeah maybe i should try it another way haha. thanks

stark zenith
#

Need some thoughts - I have a data frame with names, dates, and hotel codes. I'm using this to look up data using selenium, and save some of the results, but I'm not sure of the best way to iterate through the data frame. Lookup speed will be slow anyway so performance matters little.

grave frost
#

any idea how u fix d colors?

grave frost
#

supposed to be black and white

stark zenith
#

I'm trying itertuples to put the columns into lists but it's hurting my soul. It feels so dumb. 😂

mild dirge
#

plt.imshow(eval(f'img_{i}'), cmap='Greys')

#

@grave frost

stone marlin
#

Note on some of the terms above: test set and validation set sometimes are synonymous to some people, but some people also use it in the following way:

  1. Training set is the set which trains the model(s).
  2. Test set is a set which is held out of the training and which is used to tune [hyper]-parameters for the model(s).
  3. The validation set is a set which is held out of training and which is used to test a model which has specified [hyper]-parameters.

In the case of NNs, for example, you should not be using the test set to determine if fully-specified model 1 or fully-specified model 2 is better, that is the job of the validation set. You should use the test sets to help determine the hyperparameters of each model.

This is just so no one gets confused if they hear validation set being used in either way (as a synonym to test set or as the latter thing.)

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @granite cape until <t:1642556873:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

thin crown
#

I have a question related to GitHub. Following this link: https://github.com/GoogleCloudPlatform/dialogflow-integrations/tree/master/spark#readme, I was trying to integrate my Dialogflow chatbot with Spark. Then I ran into this step: "In your local terminal, change the active directory to the repository’s root directory." Can anyone tell me how to change the active directory to the repository's root directory?


candid atlas
#

I am training a lane-keeping RC car. However, I have a lot of images, and if I try to load them all into memory, I run out of memory.
I need a solution that loads data from a directory instead of from memory.

#

TensorFlow / Keras
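
Keras has its own utilities for this (for example `tf.keras.utils.image_dataset_from_directory`), so the following is only a framework-free sketch of the underlying idea: keep file paths in memory, and decode one small batch of images at a time. The function name, the `*.jpg` pattern and the directory layout are assumptions for illustration.

```py
from pathlib import Path


def iter_batches(directory, batch_size, pattern="*.jpg"):
    """Yield lists of at most `batch_size` file paths, one batch at a time.

    Only paths are held in memory; the caller decodes each batch when needed.
    """
    batch = []
    for path in sorted(Path(directory).rglob(pattern)):
        batch.append(path)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

A training loop would then iterate `for paths in iter_batches("data/train", 32): ...` and load just those 32 images (with whatever image-loading function you already use) before moving on to the next batch.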

marble moth
#

does anyone have issues with OpenCV being locked to screen refresh rate?

#
  • when i am capturing my screen
candid atlas
#

hmm okai ill try colab.. ah the annoying part is i have to keep going on diff detours to fix one problem haha

#

]

wicked grove
#

yes and colab only gives 12 gb ram and the pro allocates 25

lapis sequoia
#

and it stops after 9 hours(i think)

#

so be aware about that.

wicked grove
#

oh okayy

lapis sequoia
#

also I asked a friend of mine about how many layers should still be trainable.
He said it's more of a hyperparameter and you gotta do a bit of trial and error, but he said leaving the last 2 or 3 layers trainable should be good.

lapis sequoia
#

haha yeah, I mean that's the thing, if they are images, we don't much need to change previous layers. And if we do change them, we need A LOT of data to have better results.

#

so its good to let them be frozen since they have already been trained on ALOT of data.

wicked grove
#

ohhh alrightt, got itt:))

lapis sequoia
#

alright!

candid atlas
#

ah okai thanks alot for the heads up!

candid atlas
#

any one train self driving rc car; quick question

lapis sequoia
#

...

#

maybe

candid atlas
#

should i sort my train data as left turn images, straight images, right turn images

#

and get "category" (0, 1, 2)

#

OR just have shuffled imgs

lapis sequoia
#

Your car, it only has input via a camera?

candid atlas
#

yes

#

just one camera, and loss is calculated on the steering angle

#

angle in my case is just -1, 0, 1

#

turn left, stay mid, turn right

lapis sequoia
#

What is the lag?

#

In your current model between image input to vehicle output.

candid atlas
#

i wanna say 10ms

#

not an issue cause im not trying to make my rc car go full speed, maybe half

lapis sequoia
#

The relativity and turn radius matters.

#

That's why I said, maybe 🙂

candid atlas
#

just right or left realy

lapis sequoia
#

So what do you want to track on? A thing or a group of things. Within a border or nothing discernable as such?

#

i.e. How are you currently determining a path forward

candid atlas
#

gonna have paper on the ground, collect data driving on the paper track

lapis sequoia
#

On that paper what, a sharpie line?

candid atlas
#

just paper

lapis sequoia
#

and the goal is stay on?

candid atlas
#

yes

lapis sequoia
#

How wide is the vehicle?

candid atlas
#

about 70%-80% of track width

lapis sequoia
#

10 and 10 best slop

#

fun

#

how fast?

candid atlas
#

1m/ 5sec?

#

that's just under a ruler length every second

lapis sequoia
#

How heavy?

candid atlas
#

the whole contraption?

lapis sequoia
#

Yes

candid atlas
#

if i use a wired webcam then maybe 300g if i stick my laptop on top(dont judge me here) then about pound and half

#

mac book air so not that heavy

lapis sequoia
#

20cm/sec sounds good

#

~>= half pound. great.

#

Got a pi?

#

Also, how are you controlling the engine?

#

motor, whatever

candid atlas
#

I am using my laptop since im still learning; I wanna get to the proof of concept; once i have a super basic model. (one that is even 70% accurate) then ill invest in things

#

So my whole setup is janky but here it goes: I will have a webcam/laptop on top of the car; OpenCV will extract a frame, that frame is preprocessed and sent to the model, the prediction is sent to the Arduino via serial port, and the Arduino then turns a servo that presses the button on the remote of the toy car

#

I was wondering whether it is more effective to sort the training data into left turn, middle and right turn directories, or to just shuffle it all and feed it in while training

#

I believe the second one is better, just tryna get some outside opinions

cerulean vapor
#

Hello

candid atlas
#

hi

cerulean vapor
#

Need hepl

#

help

#

?

wicked grove
#

Is that a glitch due to the internet or colab

#

Or am i doing something wrong

lapis sequoia
#

wait what do you mean by 20?

wicked grove
#

Froze 20

cerulean vapor
#

hello

wicked grove
#

Plus when my friend tried she got an accuracy of 35 so idk

lapis sequoia
#

Oh you did? Well I never did yet, but again I play around on colab for other purposes and not usually deep learning.

night gorge
#

suppose I have a dataset with 500 rows out of which 60 dont have column value ["price"], How can I drop first 20 rows having ["price"] as null?

stone marlin
#

You only want to drop 20? Not all of them?

flint grotto
#

hello.

#

can i ask something?

stone marlin
# night gorge only first 20, not all
import pandas as pd
import numpy as np

col_a = np.random.rand(100)
col_b = np.random.rand(100)

# Every 2nd value in ``col_a`` is NaN.
col_a[::2] = np.nan 

df = pd.DataFrame({"a": col_a, "b": col_b})

# Get the first 20 row indices for the nulls, then drop them.
first_20_nan_idxes = df[df["a"].isnull()].index[:20]
df.drop(first_20_nan_idxes, inplace=True)
#

Just out of curiosity, why do you only care about dropping the first few NaNs, Vetpo?

flint grotto
#

I'm now studying data science, specifically data reprocessing, so I want books that cover the data reprocessing part. Can you recommend some books?

stone marlin
#

Reprocessing or Preprocessing?

flint grotto
#

reprocessing.

stone marlin
#

What type of reprocessing are you doing? As in, getting a new model and re-processing data?

flint grotto
#

yes.

#

just data reprocessing for ML .

stone marlin
#

Can you give me an example, the term "reprocessing" has a few different ways it can be used.

flint grotto
#

before using ML, you reprocess the data values somehow. for example, replacing NaN with another value, or turning text into tokens.

#

sorry, i confused the words. i'm talking about preprocessing.

stone marlin
#

It's okay, that's why I was making sure --- not too many people ask about reprocessing, but preprocessing is very popular!

#

Honestly, I know it's not a whole book, but the sklearn docs are fairly good for this kind of thing. https://scikit-learn.org/stable/modules/preprocessing.html . This also seems good: https://www.kdnuggets.com/2020/07/easy-guide-data-preprocessing-python.html

If you want an actual book, https://www.amazon.com/dp/B01M0LNE8C I remember being fairly good in general. Other than that, some others may have other suggestions.

flint grotto
#

oh, thanks. is that all?

#

aaaaany way. thank you so much.

glossy terrace
#
Traceback (most recent call last):
  File "C:\Users\esben\Desktop\Bob the Chatbot\bot.py", line 6, in <module>
    ['chatterbot.logic.BestMatch'])
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\chatterbot.py", line 28, in __init__
    self.storage = utils.initialize_class(storage_adapter, **kwargs)
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\utils.py", line 33, in initialize_class
    return Class(*args, **kwargs)
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\storage\sql_storage.py", line 20, in __init__
    super().__init__(**kwargs)
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\storage\storage_adapter.py", line 23, in __init__
    'tagger_language', languages.ENG
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\chatterbot\tagging.py", line 26, in __init__
    self.nlp = spacy.load(self.language.ISO_639_1.lower())
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\__init__.py", line 27, in load
    return util.load_model(name, **overrides)
  File "C:\Users\esben\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\util.py", line 139, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

how do i fix this error?
#

i get somewhat the same error no matter what package i use

#

its always error loading

#

ive tried like 9 different chatbot packages

#

they all have problem loading

#

i just wanna make a simple chatbot, can anyone help me fix this error?

serene scaffold
#

@glossy terrace the problem is that spacy is trying to load a model called en, but spacy models usually have names like en_core_web_sm.

glossy terrace
#

so how do i fix this?

serene scaffold
#

the part where you have self.language.ISO_639_1.lower() is wrong because it returns a string that isn't the name of a spacy model.

glossy terrace
#
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

my_bot = ChatBot(name='Bob', read_only=True,
             logic_adapters=
             ['chatterbot.logic.BestMatch'])

small_talk = [
    "Hello",
    "Hi there!",
    "How are you doing?",
    "I'm doing great.",
    "That is good to hear",
    "Thank you.",
    "You're welcome."
]




# ListTrainer.train expects the whole conversation as one list of statements.
list_trainer = ListTrainer(my_bot)
list_trainer.train(small_talk)
#

i dont think i have that part in my script

#

this is just the basic script i got from following tutorial

#

yet it doesnt work for some reason

#
user_input = input()


if user_input == "Hi ":
    print("Hello")
if user_input == "What ":
    print("Im just a nameless test") 
#

i also tried making this test

#

but i cant figure out how to make it recognise replies to only the root of the input

#

ex. user_input 2, if i type Whats your it will just say error

#

and i would also like to know how to log it

#

like how to create new patterns and words in the training

#

e.g i type add.pattern and it will say like

#

Type user root:

#

and then when i type it says

#

Type bot reply:

#

and then it saves the new changes in the code file
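
One way to match on just the start (the "root") of what the user types is to compare lowercased prefixes with `str.startswith`. The following is a minimal sketch, not a chatbot library: the `patterns` dict and the helper names `reply_for` and `add_pattern` are made up for illustration.

```python
# Hypothetical root -> reply table; the entries are only examples.
patterns = {
    "hi": "Hello",
    "what": "Im just a nameless test",
}


def reply_for(user_input, patterns):
    """Return the reply whose root starts the input, ignoring case and outer spaces."""
    text = user_input.strip().lower()
    for root, reply in patterns.items():
        if text.startswith(root):
            return reply
    return "Sorry, I don't know that one yet."


def add_pattern(patterns, root, reply):
    """Register a new root and its reply (in memory only)."""
    patterns[root.strip().lower()] = reply


print(reply_for("Whats your name?", patterns))
add_pattern(patterns, "bye", "See you!")
print(reply_for("Bye for now", patterns))
```

To keep new patterns between runs, it is probably simpler to store the dict in a separate file (for example with `json.dump` and `json.load`) than to rewrite the script itself.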

fervent sapphire
#

Where can I ask for help in python question

serene scaffold
fervent sapphire
#

Ok

serene scaffold
#

@glossy terrace while your stated goal is to build a chatbot, it looks like you're currently struggling with general Python usage. I would ask for help debugging in a general help channel (also see #❓|how-to-get-help)

glossy terrace
#

and user inputs i understand just not how to set a root

serene scaffold
#

it still isn't a data science question.

patent escarp
#

can someone teach me how to code

proper swift
#

I have a pandas related question, I am trying to lookup values held in df1, that are in df2, would the best way of doing this be using the df.merge()?

patent escarp
#

pls can someone teach me how to code i wanna make my own games

serene scaffold
proper swift
gilded jungle
#

are monte carlo tree search and minimax (including alpha-beta pruning) about the only algorithms available for boardgames like othello or are there other algorithms available too?

serene scaffold
#

so you're trying to get just the rows of df2 where the Ref_code is in df1['Code']?

proper swift
#

What I would like to do, is lookup the "Ref_code" column in df2, using df1['code'], and append the appropriate code based on the description

serene scaffold
#

yes, that would be a merge

#

!docs pandas.DataFrame.merge

arctic wedgeBOT
#

DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
Merge DataFrame or named Series objects with a database-style join.

A named Series object is treated as a DataFrame with a single named column.

The join is done on columns or indexes. If joining columns on columns, the DataFrame indexes *will be ignored*. Otherwise if joining indexes on indexes or indexes on a column or columns, the index will be passed on. When performing a cross merge, no column specifications to merge on are allowed.
serene scaffold
#

you will need to use left_on= and right_on= because code and Ref_code are different names.
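
A minimal sketch of that merge, assuming two small made-up frames: the column names `code` and `Ref_code` come from the discussion, while the data and the `description` and `qty` columns are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames discussed above.
df1 = pd.DataFrame({"code": ["A1", "B2", "C3"], "description": ["apple", "banana", "cherry"]})
df2 = pd.DataFrame({"Ref_code": ["B2", "C3", "D4"], "qty": [10, 20, 30]})

# Inner join (the default): keep only rows whose key appears in both frames.
merged = df2.merge(df1, left_on="Ref_code", right_on="code")
print(merged)
```

Both key columns survive the merge, so one of them can be dropped afterwards if it is redundant.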

proper swift
#

Cheers, I thought it might be pd.merge(). Will take a look at the docs now

serene scaffold
#

no problem. you have permission to ping me about this specific question if you get stuck.

proper swift
#

thanks, much appreciated!

#

is there anywhere to do the merge, but retain any missing values not in the lookup list?

serene scaffold
#

the different types of joins are about how to handle missing values, depending on which side they're missing on.
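
As a small illustration of how the join type decides what happens to unmatched rows, here is a sketch with made-up data (only the column names `code` and `Ref_code` come from the conversation). `how="left"` keeps every row of the left frame, and `indicator=True` shows which rows found a match.

```python
import pandas as pd

df1 = pd.DataFrame({"code": ["A1", "B2"], "description": ["apple", "banana"]})
df2 = pd.DataFrame({"Ref_code": ["B2", "Z9"], "qty": [10, 30]})

# how="left" keeps every row of df2; rows without a match get NaN in df1's columns.
# indicator=True adds a "_merge" column saying where each row came from.
kept = df2.merge(df1, how="left", left_on="Ref_code", right_on="code", indicator=True)
print(kept)
```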

proper swift
#

thanks, let me take a shot, at it

cerulean vapor
#

Hello I need ahelo

#

help

hollow sentinel
#

i'm confused i am trying to process a 2.7 GB file and it's taking forever

#

!pastebin

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

hollow sentinel
#

it still won't work and idk how to fix it

lapis sequoia
#

guys

grizzled stirrup
#

!code

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

hollow sentinel
#

idk what i should be using

lapis sequoia
#

Hey, I am looping over a graph data structure in a for loop. I made it using a dictionary, and I want to access its contents in a range: for example, the dictionary's keys go from A to Z and I only want to access the values from A to N. I am really confused about how I should approach it; is there any simple way to do it?
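
Two possible approaches, sketched under the assumption that the keys are single uppercase letters (the `graph` dict below is a made-up stand-in): filter by key, or rely on insertion order and stop once the last wanted key has passed.

```python
from itertools import takewhile

# Hypothetical adjacency dict with single-letter node names, inserted in A..Z order.
graph = {chr(c): [] for c in range(ord("A"), ord("Z") + 1)}

# Option 1: filter by key, independent of insertion order.
a_to_n = {node: edges for node, edges in graph.items() if "A" <= node <= "N"}

# Option 2: rely on insertion order (dicts keep it since Python 3.7) and stop after "N".
first_nodes = list(takewhile(lambda node: node <= "N", graph))

print(list(a_to_n))
print(first_nodes)
```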

cerulean vapor
#

Hi I need a help

arctic wedgeBOT
#

Hey @lapis sequoia!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

hollow sentinel
#

this was dumb

cerulean vapor
#

HIIIIIIIIIIIIIIIIIIII

coarse rock
#

hi

night gorge
#

Why am I getting this error?

hollow sentinel
#

idk what chunksize to use

#

can't tell if 10 or 10,000 is the better option

#

it just takes forever to load and then i can't even see the first 5 rows of the dataframe

lapis sequoia
#

hello everyone! i am currently giving my 12th grade exams (A level). i was thinking of pursuing a career in AI. I just wanted to know from a few people in this field, or who are planning to be a part of this field. lets say im above average at coding, and fairly okay at network theory, can i pursue this field?

#

like, do i have to be amazing at coding from the get-go or do i get to learn better along the way?

jaunty cove
#

Does anyone use gini coefficient for feature selection in classification models?

jaunty cove
hollow sentinel
#

yeah idk how to get this csv file into chunks

#

nothing works

#

idk how to figure out this chunk size ugh

jaunty cove
hollow sentinel
#

2.7 GB

#

pandas can handle up to 5 gb

#

which is why i'm so confused as to why it wouldn't work

jaunty cove
#

whats the error say?

hollow sentinel
#

there is no error

#

"ParserError: Error tokenizing data. C error: Expected 5 fields in line 2351587, saw 20"

jaunty cove
#

Maybe also try clearing up some memory on your machine?

jaunty cove
# severe rover why not?

Dask is usually recommended for datasets that are 100+ GB. When I tried to use it to parallel process a 50GB file it doubled my run time on everything

hollow sentinel
#

i tried using dask before

#

did not work that well

#
b'Skipping line 2351587: expected 5 fields, saw 20\n'
b'Skipping line 4779945: expected 5 fields, saw 20\n'
b'Skipping line 7110934: expected 5 fields, saw 20\n'
b'Skipping line 8319025: expected 5 fields, saw 20\n'
b'Skipping line 9111768: expected 5 fields, saw 20\n'
b'Skipping line 11291243: expected 5 fields, saw 20\n'
b'Skipping line 13551809: expected 5 fields, saw 20\n'
b'Skipping line 15830804: expected 5 fields, saw 20\n'
b'Skipping line 18116907: expected 5 fields, saw 20\n'
b'Skipping line 20293404: expected 5 fields, saw 20\n'
b'Skipping line 21406069: expected 5 fields, saw 20\n'
b'Skipping line 22166634: expected 5 fields, saw 20\n'
b'Skipping line 24241527: expected 5 fields, saw 20\n'
b'Skipping line 26589319: expected 5 fields, saw 20\n'
b'Skipping line 28809780: expected 5 fields, saw 20\n'
#
# chunksize =  10000
# for chunk in pd.read_csv(path, chunksize = chunksize,error_bad_lines=False, warn_bad_lines=False):
#     print(chunk)

data = pd.read_csv(path, chunksize = 10000, error_bad_lines = False)
df = pd.concat(data, ignore_index = True)
df.head(1)
#

idk why it's giving me such an issue
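
If the goal is only to look at the first few rows, one option is to let pandas stop early with `nrows` instead of concatenating every chunk, which ends up loading the whole file anyway. This is a sketch on a tiny in-memory CSV that stands in for the real file; it also assumes pandas 1.3 or newer, where `on_bad_lines` replaces the deprecated `error_bad_lines`.

```python
import io

import pandas as pd

# Small in-memory stand-in for the real file; the third data line has too many fields.
raw = "a,b,c\n1,2,3\n4,5,6\n7,8,9,10,11\n12,13,14\n"

# nrows limits parsing to the requested number of rows.
preview = pd.read_csv(io.StringIO(raw), nrows=2)
print(preview)

# on_bad_lines="skip" (pandas 1.3+) drops malformed lines instead of raising ParserError.
cleaned = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(cleaned)
```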

severe rover
hollow sentinel
#

even when i used like chunk size 10

severe rover
hollow sentinel
#

i can't open the csv on my machine

#

without it crashing

lapis sequoia
#

if it has fewer columns you can use the csv module. the good thing about it is that it will simply let you read line by line.

severe rover
#

can you open it in a text reader?

lapis sequoia
#

I processed a csv of 5/6 GBs earlier(with csv module).

lapis sequoia
hollow sentinel
#

idk what to do here

lapis sequoia
#

how many cols do you have?

#

and what do you wanna do?

hollow sentinel
#

i just wanna see the first 5 rows

#

of the dataframe

#

i can't see the amt of cols

lapis sequoia
#

just wonna see?

#

then switching to csv is my suggestion, it won't be tough. bit long if things are complex but it will not crash.

#

(assuming you're not using readlines ofc)
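
A rough sketch of that idea with the standard `csv` module: the reader pulls lines lazily, so taking the header plus five rows never touches the rest of the file. The in-memory text below is a made-up stand-in for the real CSV.

```python
import csv
import io
from itertools import islice

# In-memory stand-in for the large file; with a real file, use
# open(path, newline="") instead of io.StringIO.
fake_file = io.StringIO("id,price\n1,9.5\n2,\n3,7.25\n4,8.0\n5,1.0\n6,2.0\n")

reader = csv.reader(fake_file)
header = next(reader)                 # first line holds the column names
first_rows = list(islice(reader, 5))  # pulls only five more lines from the file

print(header)
print(first_rows)
```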

hollow sentinel
#

it already is a csv

#

oh

#

you mean not using a dataframe?

#

i mean i want to process the data

#

i was just looking at the first 5 rows for now

severe rover
#

have you tried downloading the file beforehand?

#

and then read it into a csv

hollow sentinel
#

yes

#

it crashes

severe rover
#

what crashes?

hollow sentinel
#

my computer

#

everything freezes

severe rover
#

for downloading a file?

hollow sentinel
#

yes

severe rover
#

ah then i have no clue

hollow sentinel
#

should my computer be able to handle a 2.7 gb csv?

severe rover
#

yes

hollow sentinel
#

hm

#

ok i'm gonna try to do something

severe rover
#

it should download, and if you use dask you can load it without a problem

hollow sentinel
#

i think it might be bc of s3

#

and bc i'm doing it thru AWS

severe rover
#

with dask you don't need to free memory before doing a .compute()

hollow sentinel
#

i'll try doing it locally again

#

ty

severe rover
#

np

hollow sentinel
#

so dask should be able to comfortably read 2.7 gb files

severe rover
#

easily

#

it's only when using the compute method that the memory is allocated

#

fully allocated*

brave sand
#

when should I use a Monte Carlo tree search?

severe rover
#

@hollow sentinel the dask docs say Dask is convenient on a laptop. It installs trivially with conda or pip and extends the size of convenient datasets from “fits in memory” to “fits on disk”.

hollow sentinel
#

i have an idea

#

i will save it locally on my machine

#

take a smaller portion of it

#

w excel

#

and see what's going on

severe rover
#

but then would crash excel no?

hollow sentinel
#

gotta try something

severe rover
#

good luck - you have my 2 cents on how i'd go about it 🙂

hollow sentinel
#

ok so excel is defo not gonna work

#

just uploading the csv into my jupyter notebook

#
/Users/rahuldas/opt/anaconda3/lib/python3.7/site-packages/dask/core.py:118: DtypeWarning: Columns (1) have mixed types.Specify dtype option on import or set low_memory=False.
  args2 = [_execute_task(a, cache) for a in args]
#

!pastebin

brave sand
#

how hard is it to implement a Monte Carlo tree search?

wintry zinc
#

hey guys i am interested in text to speech with a human face speaking the text, can anyone recommend me anything that can help me with this

lapis sequoia
#

what process do you mean exactly? can you be lil bit detailed if possible?

hollow sentinel
#

i figured it out nvm

wintry zinc
#

anyone have any ideas?

brave sand
#

idk how for my specific game

grave frost
#

google it 🤷‍♂️ learn it 🤷‍♂️ copy the code and implement it 🤷‍♂️

#

I don't see what you gain by asking whether its difficult or not thrice. if you have to do it anyways, what difference does it make?

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @lapis sequoia until <t:1642622654:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

urban canopy
#

Does anyone have experience translating numpy code to cupy (https://cupy.dev/) ? Is it mostly just a 1:1 translation or is cupy much less featureful due to GPU limitations?

grave frost
#

any data viz person know how to add labels in subplots (for each one)?

for i in range(1, columns*rows+1):
    fig.add_subplot(rows, columns, i)
    plt.imshow(eval(f'img_{i}'), cmap='binary_r')
plt.show()

eval(img.. is for variable names (don't ask)
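
One way to label each subplot is to keep the `Axes` objects and call `set_title` (or `set_xlabel`) on each. In this sketch the images live in a hypothetical list rather than in `img_1`, `img_2`, ... variables, and random arrays stand in for the real data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rows, columns = 2, 3
# Hypothetical images kept in a list instead of separate img_1, img_2, ... variables.
images = [np.random.rand(8, 8) for _ in range(rows * columns)]

fig, axes = plt.subplots(rows, columns, figsize=(9, 6))
for i, (ax, img) in enumerate(zip(axes.flat, images), start=1):
    ax.imshow(img, cmap="binary_r")
    ax.set_title(f"img_{i}")  # label shown above this subplot
    ax.set_xlabel("x")        # per-subplot axis label
fig.tight_layout()
fig.canvas.draw()
```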

urban canopy
#

@iron basalt: Thanks. It seems to have most of the features I like.

#

It also has some functions that simplify numpy code slightly, such as atleast_3d. I definitely will look into this!

orchid kayak
#

I am running a model in which the accuracy and loss remain relatively constant i.e the model doesn't seem to learn. Could it be because I have selected a wrong loss function? could that drastically affect the models' ability to learn?

dapper dune
#

Hey there! Can someone help me with data representation (I'm new to DS)? I have a class that describes a game and contains an array of players (the number of players is constant; the player data holds additional info about each player, e.g. a game_result flag, the player name, etc.). What is the best way to represent the data so I can see, for example, each player's win rate?

P.S.: the game and player data are dataclasses
It would be nice if you could DM me and explain using my code
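
One possible layout, sketched with made-up dataclasses (the field names `name`, `won`, and `game_id` are assumptions, not the real ones): flatten everything to one row per game and player, then let pandas group by player.

```python
from dataclasses import asdict, dataclass

import pandas as pd


@dataclass
class PlayerData:  # hypothetical fields, standing in for the real dataclass
    name: str
    won: bool


@dataclass
class Game:
    game_id: int
    players: list


games = [
    Game(1, [PlayerData("ann", True), PlayerData("bob", False)]),
    Game(2, [PlayerData("ann", False), PlayerData("bob", True)]),
    Game(3, [PlayerData("ann", True), PlayerData("bob", False)]),
]

# One row per (game, player): the "long" layout that pandas groups easily.
rows = [{"game_id": g.game_id, **asdict(p)} for g in games for p in g.players]
df = pd.DataFrame(rows)

# The mean of a boolean column is the fraction of True values, i.e. the win rate.
win_rate = df.groupby("name")["won"].mean()
print(win_rate)
```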

fleet prism
#

pasting from #career-advice

I'm pretty good at python. I use it a lot for my web projects. Even built a complex desktop app with tkinter. But haven't explored data science much. Except pandas. I know pandas well. I wish to use my 13 years of field experience in oil/gas with my new coding skills to maybe bag some DS projects or even a position.
How long would it take to learn other python DS libraries for someone at my skill level? And which ones should I aim for?

stone marlin
#

I'd recommend starting off with the "non-Neural Network" type things. Sklearn is the usual library that people use for standard classifiers + regression models.

A lot of DS at the beginning is understanding what you can do, what you're looking for, and what math / techniques / whatever can you apply to things. It's also fairly dependent on the data and task at hand.

I'd go through three things in your case, since you've got the basics down:

  1. Read over a Data Science textbook or do an intro to DS course, just to learn about the terms we use in the field.

  2. Go through the tutorial for Sklearn to see roughly how to interact with it. With pandas, it's very easy nowadays.

  3. Get a dataset and mess around with it. This is very general but, honestly, it's the best way to learn. You can get some of these from kaggle, but even taking some standard datasets and trying to do things with them is fine. For example, taking the diabetes dataset and trying to think about how to represent the features, etc. OR, getting a weather dataset and messing around with that a bit.

I'm not exactly sure what type of oil/gas data you'd be working with w/rt your existing skills, so it's hard to give you something exact to focus on. But I'd say the above should take something like a month to get pretty decent at, and then a few months or so to really solidify your understanding of the basics.

tl;dr: learn sklearn.
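
To give a feel for what a first scikit-learn session can look like, here is a minimal sketch on the bundled diabetes dataset. The choice of `LinearRegression`, the 25% hold-out, and the R² metric are only illustrative assumptions, not a recommendation for any particular problem.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# The diabetes dataset ships with scikit-learn, so nothing is downloaded.
X, y = load_diabetes(return_X_y=True, as_frame=True)

# Hold out a quarter of the rows to check the model on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)

score = r2_score(y_test, model.predict(X_test))
print(f"R^2 on held-out data: {score:.2f}")
```

Because `X` is a pandas DataFrame, the usual pandas tools (`X.describe()`, `X.corr()`) work for exploring the features before fitting anything.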

#

Note: The reason I explicitly note above about NNs is that while they are WELL-represented in this channel, they often are "black boxes" which may not teach you DS as well as the classical non-NN methods. Additionally, for "practical" work in the fields, many datasets are still best served using the classical methods due to interpretability of the model. Neural Nets are fun, but I'd make them a "thing to learn later" after you're very comfortable with classical classifiers and regressors.

desert oar
#

even if you don't care about latin square experiments and just want to jump into classifying cat pictures, without at least a basic understanding of those things you'll struggle to be useful in most organizations

#

but you can definitely have fun without them

#

you will also eventually need to learn calculus and linear algebra, but as long as you know how to do basic matrix and vector arithmetic you should be ok at the beginning

#

that said, in python specifically i agree that scikit-learn is high on the priority list, along with matplotlib and maybe seaborn

#

if you know excel, that can be a great "shortcut" to doing things that you otherwise might not know how to do in python

#

even if you are an experienced data scientist or data engineer, sometimes the most valuable skills are the "stupid" skills like being handy with excel and having a basic understanding of experimental design and statistical sampling

#

so it depends a lot on your goals

#

data science is a huge field, imo significantly bigger than programming with respect to the number of things you'd consider "core" competencies

#

in programming you can usually get away with loops and ifs and a basic grasp of OO

#

i don't say all that to be discouraging. but i don't want people to go into it thinking that they'll be a senior data scientist from nothing in 3 years

#

you can of course get started with kaggle stuff, and imo it's a good way to feel like you're "doing something" while you fill in whatever gaps you might have in your math and stats knowledge

stone marlin
#

Excel / Sheets takes care of so many ezpz problems. Parenthetically, there's also a DS book --- I think called Data Smart? --- that goes over basic DS stuff using only Excel. It's pretty nice to not worry about the "language" and just look at the concepts, for those students who are unfamiliar with Python.

lavish rune
#

What university/college program do you guys recommend a high school student to become a data scientist

royal crest
#

don't think the name of the university really matters

#

and as far as the name of the degree/major is concerned, it varies significantly across countries/universities

#

some universities offer DS under CS, some offer it entirely separately

#

e.g. at my university it's under faculty of IT

stark zenith
# plain python That’s great!

yeah, feel good! It's just automating data lookup on an internal web app, but I learned a lot. Gonna apply what I learned to some of the web based part of my job next.

#

I wouldn't have done it if there had been an internal database table with all that info on it.

iron basalt
# fleet prism thanks. sounds like a long road. I did calculus and algebra in engineering so ma...

If by algebra you also mean linear algebra, then you are well set up for DS. You just need statistics, lots of statistics (but don't get too lost in the details of it all, the general ideas of why stats does things the way it does them matters more). As for programming, get comfy with libraries that let you implement/view the stuff from stats, like numpy, scikit (all of its various libraries), matplotlib, pandas, jupyter notebook, etc. Though the path might be something like: do it in excel first -> do it in python with pandas, numpy, etc -> let some library do it for you like scikit stuff.

#

Beyond that, there is stuff like neural networks, and other crazy things.

#

(Actually you can add another step in front of the "do it in excel first" part, do it by hand first)

fleet prism
iron basalt
#

*There is also just the general ability to get data and mess around with it, whether that's from a database or other form (maybe even web scraping).

lapis sequoia
#

haha true. still saying, it's a good jungle to get lost 😄

iron basalt
#

Yeah, all math is, just a warning of not spending all your time going through wikipedia article link jumping hell (what do all these random math words mean?).

#

(Because there is no end to it, and meanings are context specific, unfortunately math is not a context free language (it shows in papers))

lapis sequoia
#

true. there is literally no end lol.

#

as long as we know we are okay getting lost and have enough time to get lost, it's a fun process.

#

but the domain is just never ending.

iron basalt
#

This is also why I recommend getting a book on whatever math topic, the "intro to X" kind. Because even if you don't read it or only use it sometimes, its table of contents will let you know when you have gone too far (the thing you are looking at is not in the table, and not even adjacent to something in the table). But this only applies if you care about time management, and the multi-armed bandit problem of learning new stuff.

lapis sequoia
#

it's a good problem. I like the reference.

iron basalt
#

(*I use machine learning concepts to inform my own learning process)

hexed schooner
#

How to implement FID score and Precision and Recall in DCGAN using tensorflow Keras

glossy terrace
#

is it possible to make an ai that can play minecraft on different servers?

inland zephyr
#

does anyone know how to make a seaborn legend display horizontally and stacked (n entries per row)?

#
g=sns.lmplot(x='comp_1',
           y='comp2',
           data=data,
           fit_reg=False,
           height=10,
           legend_out=False,
           hue='user_name',
           scatter_kws={"s":50,"alpha":0.9})
plt.legend(loc=8,title="Name")
plt.title("title")
plt.xlabel("dimension 1")
plt.ylabel("dimension 2")
final spruce
#

can anyone tell me what im doing wrong here

lapis sequoia
#

but since you are doing plt.legend you are on the way(If it is possible)

lapis sequoia
#

(again not sure, just looking through some similar questions)

final spruce
#

same problem

#

he doesnt notice the ,

lapis sequoia
final spruce
#

same problem😢

lapis sequoia
#

ah jesus

final spruce
#

I mean i get an error then

lapis sequoia
#

oh

#

can you show me the csv on pastebin?

final spruce
#

just a few of the first lines

lapis sequoia
#

hm. alright gimmi a while

final spruce
#

sure(:

stone marlin
#

Huh, that's strange, it works fine for me when I copy from this pastebin.

#
import pandas as pd

df = pd.read_csv("sample.csv", encoding="utf-8")
df.head(5)

What about this? (Where sample.csv is whatever you called yours.)

lapis sequoia
# final spruce

@stone marlin this error kinda suggests to me that something may be off with the data when they are using sep.

#

but if sep is not at all needed then i think pandas will handle ""

stone marlin
#

The default sep is a comma, so that shouldn't be it. I'm also able to literally copy-paste the pastebin into a new file and correctly parse it.

final spruce
#

i got the problem

#

when i open it in notebook

stone marlin
#

So, my thinking is: perhaps this is an encoding issue? I'm not sure.

#

Ahhh, "'s.

final spruce
#

it says there is strings

stone marlin
#

Okay, coolio, so you can just get rid of those double-quotes and it should work out.

final spruce
#

Yes i have it, thx guys

inland zephyr
# lapis sequoia since seaborn uses matplotlib, i suggest you to look at how you do it in that. t...
import seaborn as sns
from matplotlib import pyplot as plt

g = sns.lmplot(x='comp_1',
               y='comp2',
               data=data,
               fit_reg=False,
               height=10,
               legend=False,
               hue='user_name',
               scatter_kws={"s": 50, "alpha": 0.9},
               facet_kws={"legend_out": True})
# loc=8 is "lower center"; ncol=5 spreads the entries over five columns.
plt.legend(loc=8, title="Name", ncol=5)
plt.title("...")
plt.xlabel("dimension 1")
plt.ylabel("dimension 2")

I was able to move it to the bottom and split it into several columns,
but I'm still unable to place it below the chart
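
A common way to push a legend outside the axes is to combine `loc` with `bbox_to_anchor`. The sketch below uses plain matplotlib with made-up scatter groups in place of the lmplot data; the offset `-0.15` is an assumption that may need tuning for a given figure size.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 6))
# Hypothetical stand-in for the hue groups in the lmplot call.
for k in range(10):
    ax.scatter([k, k + 1], [k, k * 0.5], s=50, alpha=0.9, label=f"user_{k}")

# loc names the point of the legend box that gets pinned; bbox_to_anchor says where,
# in axes coordinates. y = -0.15 is below the x-axis, so the legend sits under the plot.
legend = ax.legend(loc="upper center", bbox_to_anchor=(0.5, -0.15), ncol=5, title="Name")
ax.set_xlabel("dimension 1")
ax.set_ylabel("dimension 2")

# bbox_inches="tight" expands the saved area so the outside legend is included.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
```

Since `plt.legend` acts on the current axes, the same `loc` and `bbox_to_anchor` arguments should carry over to the seaborn call above.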

atomic leaf
#

How do you take a dataframe with column ['image data','labels'] and make it to a PyTorch dataset with DataLoader?

inland zephyr
atomic leaf
#

How do you convert a column of integers to tensors?

final field
#

Can I train my object detection model on one machine and run it on another?

vague sun
#

hi i need some help with dataframes in pandas

lapis sequoia
vague sun
#

i have written the question here

lapis sequoia
#

Hi, I am facing some issues with getting the headcount for a month is anyone able to help?

#

The issue is

#

people are leaving and rejoining

#

and the algorithm is recounting the people who have already been counted
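
Without seeing the actual code, one general idea is to count distinct people per month rather than rows, so someone who leaves and rejoins is counted once. This sketch assumes a hypothetical log with an `employee_id` and a `start` date per stint.

```python
import pandas as pd

# Hypothetical attendance log: one row per stint, so a rejoining person appears twice.
log = pd.DataFrame(
    {
        "employee_id": [1, 2, 1, 3, 2, 2],
        "start": pd.to_datetime(
            ["2022-01-03", "2022-01-05", "2022-01-20", "2022-02-01", "2022-02-10", "2022-02-25"]
        ),
    }
)

# Count distinct people per month rather than rows, so rejoiners are counted once.
headcount = log.groupby(log["start"].dt.to_period("M"))["employee_id"].nunique()
print(headcount)
```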

kind rock
#

what is the difference between fig.show() and plt.show()

lapis sequoia
#

It is homework and I just need some guidance and not full help

#

I will post the code up as soon as I get back to my PC

atomic leaf
#

How would you guys optimize a symbol/character recognizer? This is what i have after 100 epochs...

desert oar
# kind rock what is the difference between fig.show() and plt.show()

i didn't know the answer, but i was able to find the docs pages for both functions:
https://matplotlib.org/stable/api/figure_api.html#matplotlib.figure.Figure.show
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.show.html

the differences seem to be:

  • fig.show does not manage the graphical system that displays the plot; you are expected to already have e.g. a GUI window running
  • plt.show does set up and run the graphical system that displays the plot

in my experience, you can use them interchangeably inside a jupyter notebook, but in a command line console you can't use fig.show because the plot window will just close immediately, unless you take other steps to set up and run a GUI for displaying plots.

this would be a great stackoverflow question btw. i can post it if you don't feel comfortable

atomic leaf
#

It's oscillating a lot too

primal shard
#

Hello, I was wondering if anyone knows good resources about word encoding and decoding for neural networks. I'd like to make a basic encoding/decoding system to learn how it works. I mostly want to build those two parts and test them with some sentences; I don't yet want to do the actual neural network that converts the words into a different sentence or anything, just the system that converts words into numbers and back.
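
The simplest version of such a system is a vocabulary that maps each distinct word to an integer and back. This is a toy sketch on a made-up two-sentence corpus, using plain whitespace splitting and a reserved id for unknown words; real tokenizers are considerably more involved.

```python
# Hypothetical toy corpus; any list of sentences works.
sentences = ["the cat sat", "the dog sat down"]

# Build the vocabulary: every distinct word gets an integer id.
# Id 0 is reserved for words that were never seen.
UNKNOWN = "<unk>"
word_to_id = {UNKNOWN: 0}
for sentence in sentences:
    for word in sentence.lower().split():
        word_to_id.setdefault(word, len(word_to_id))
id_to_word = {idx: word for word, idx in word_to_id.items()}


def encode(text):
    """Turn a sentence into a list of integer ids."""
    return [word_to_id.get(word, 0) for word in text.lower().split()]


def decode(ids):
    """Turn a list of ids back into a sentence."""
    return " ".join(id_to_word[i] for i in ids)


ids = encode("The dog sat")
print(ids)
print(decode(ids))
```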

robust jungle
#

Does anyone know of any good resources for getting a better understanding of image recognition / object detection (and by extension machine learning as a whole)? I've followed some tutorials and it has worked, but I want to know why it works

serene scaffold
#

(a lot of people come to this channel excited about what they think AI is and leave disappointed when they realize that it's all math.)

frozen hedge
#

anybody know how to concatenate dimensions within an array? Suppose I have a 2x2 array whose elements are 3x3 matrices. I want to concatenate them so the result is 6x6. reshape doesn't seem to work, since it just flattens. I'm trying to preserve the structure of the subarrays. Think of taking 4 photos and putting them side-by-side. Any help would be much appreciated.

serene scaffold
#

keep in mind that arrays are "one thing".

frozen hedge
#

yeah

#

its 2x2x3x3

serene scaffold
#

so if you do print(array.shape), what you see is (2, 2, 3, 3)? we just need to be super clear on that, or I can't say anything useful.

frozen hedge
#

yes

serene scaffold
#

alright, let me think

frozen hedge
#

I have a 2d array of 2d matrices and want to reshape without losing the structure. cheers

serene scaffold
#

> I have a 2d array of 2d matrices
that's not how it works. the array is one thing

#

you're talking about a single four-dimensional array

frozen hedge
#

yes ik

serene scaffold
#

anyway, the solution probably involves transposing before reshaping. still thinking.

frozen hedge
#
import numpy as np

x = np.array([[1,2,3],[4,5,6],[7,8,9]])
y = np.array([[x,x],[x,x]])  # shape (2, 2, 3, 3)
y.reshape(6,6), y
#

test code

#

ok got it

#

had to move axes to (2,3,2,3)

serene scaffold
#
In [3]: np.arange(36).reshape(2, 2, 3, 3)
Out[3]:
array([[[[ 0,  1,  2],
         [ 3,  4,  5],
         [ 6,  7,  8]],

        [[ 9, 10, 11],
         [12, 13, 14],
         [15, 16, 17]]],


       [[[18, 19, 20],
         [21, 22, 23],
         [24, 25, 26]],

        [[27, 28, 29],
         [30, 31, 32],
         [33, 34, 35]]]])

here's what we have

#
array([[ 0,  1,  2, 18, 19, 20],
       [ 3,  4,  5, 21, 22, 23],
       [ 6,  7,  8, 24, 25, 26],
       [ 9, 10, 11, 27, 28, 29],
       [12, 13, 14, 30, 31, 32],
       [15, 16, 17, 33, 34, 35]])

this is what you want, right?

#

(I made this manually. still working out the code for it.)

frozen hedge
#

I tried this: np.moveaxis(x, 1,2).reshape(6,6)

serene scaffold
#

what is x

frozen hedge
#

your array

serene scaffold
#
In [10]: np.moveaxis(arr, 1, 2).reshape(6, 6)
Out[10]:
array([[ 0,  1,  2,  9, 10, 11],
       [ 3,  4,  5, 12, 13, 14],
       [ 6,  7,  8, 15, 16, 17],
       [18, 19, 20, 27, 28, 29],
       [21, 22, 23, 30, 31, 32],
       [24, 25, 26, 33, 34, 35]])

well, that's not too far off.

frozen hedge
#

yh

serene scaffold
#
In [21]: arr.transpose(1, 2, 0, 3).reshape(6, 6)
Out[21]:
array([[ 0,  1,  2, 18, 19, 20],
       [ 3,  4,  5, 21, 22, 23],
       [ 6,  7,  8, 24, 25, 26],
       [ 9, 10, 11, 27, 28, 29],
       [12, 13, 14, 30, 31, 32],
       [15, 16, 17, 33, 34, 35]])

I have done it

#

looks like the trick is to rotate the first three dimensions, but leave the fourth one in place
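For anyone reading along, here's a minimal sketch of both layouts from this thread, assuming the same (2, 2, 3, 3) `np.arange(36)` array as above. The names `tiled` and `stacked` are made up for illustration. Swapping axes 1 and 2 gives the "four photos side by side" layout (the same thing `np.block` builds), while `transpose(1, 2, 0, 3)` gives the hand-built target shown earlier.

```python
import numpy as np

arr = np.arange(36).reshape(2, 2, 3, 3)

# Layout A: block (i, j) lands at block-row i, block-column j.
# Axes become (block_row, row, block_col, col) before flattening.
tiled = arr.transpose(0, 2, 1, 3).reshape(6, 6)

# The same result spelled out with np.block, as a sanity check.
expected = np.block([[arr[0, 0], arr[0, 1]],
                     [arr[1, 0], arr[1, 1]]])
assert (tiled == expected).all()

# Layout B: the hand-built target (its first row is 0 1 2 18 19 20).
stacked = arr.transpose(1, 2, 0, 3).reshape(6, 6)

print(tiled[0])
print(stacked[0])
```

Which one you want depends on whether the first or the second axis should count block columns; reshape alone can't do either, because it never reorders the data.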

frozen hedge
#

right, rotate not swap

serene scaffold
#

this was an interesting question. Thanks!

frozen hedge
#

thx 2u

stark dune
#

hello, I am working on code that has 5 ranges of 123 points each, and I want to export those points as numerical data to a CSV. The columns come out right, but the data ends up compressed into a single cell, with only about 4 points of each range shown, instead of each point sitting in its own cell as a row. Can someone help me out? The code is below

https://paste.pythondiscord.com/cuxopagidu.py

#

heres what it looks like in print

lapis sequoia
#

I'll mess around in #bot-commands and will let you know

lapis sequoia
#

i think it should work

#

incream?

wicked grove
#

I have 30 GB ram currently but it crashes as the data is large

#

So i wanna increase the ram

#

And i found this

mint palm
#

What are some of the best research areas in deep learning?

desert oar
#

actually wait. yes

#

it does reduce your ram

#

You can allocate some of this memory to create a RAM disk

desert oar
#

maybe your machine learning framework allows you to do that?

#

are you using pytorch?

#

30 gb is a lot of stuff in memory at once

#

maybe you can restructure your training pipeline to use less memory

modern cypress
#

Hey guys could someone take a look at #help-dumpling, I would really appreciate it

lapis sequoia
#

ah i see. I still need to see if it has anything to do with efficiency or not.

modern cypress
#

Hey I'm facing a weird error I'm not sure how to fix or go around: ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
At first the error was a data adapter error and I found a solution online to turn it into a np.array, when that is done this is the new error I receive
Any help is much appreciated

#

Could this be a compatibility issue between sklearn and TensorFlow?

wicked grove
#

Yes I'm loading the entire data into memory at once
Im doing transfer learning using vgg19

desert oar
#

maybe try the tensorflow data api as described in the document i sent

#

only load the data you need when you need it, don't load everything at once

#

that's what these data loader apis are for
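As a rough illustration of the idea (not the actual TensorFlow API), here's a plain-Python sketch of loading one batch at a time. `iter_batches` and `fake_load_image` are hypothetical names, and the fake loader just returns zeros where real code would decode and resize a file. A generator like this can be wrapped with `tf.data.Dataset.from_generator`, or you can skip it and use the framework's own loaders, which also handle shuffling and prefetching.

```python
import numpy as np

def iter_batches(paths, load_image, batch_size):
    """Yield one batch at a time so only batch_size images are in memory."""
    for start in range(0, len(paths), batch_size):
        chunk = paths[start:start + batch_size]
        yield np.stack([load_image(p) for p in chunk])

def fake_load_image(path):
    # Stand-in for real decoding + resizing (hypothetical helper).
    return np.zeros((224, 224, 3), dtype=np.float32)

paths = [f"img_{i}.jpg" for i in range(10)]
batch_shapes = [batch.shape
                for batch in iter_batches(paths, fake_load_image, batch_size=4)]
print(batch_shapes)
```

The point is that peak memory depends on the batch size, not on the size of the whole dataset.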

wicked grove
#

This is where all my memory goes

#

Can you please tell me how the api works here

modern cypress
#

I have a strange (?) question: when giving the model picture data, should all the pictures follow the same format, like resolution, and RGB as opposed to black and white?

#

I'm feeding it some COCO data, and not all pictures are in the same format

modern cypress
#

Hmm maybe that's why I'm getting the error?

#

Going to need to try resize all the data then hmm

lapis sequoia
#

why don't you check the shape of them first.

#

may be they are same size lol.

#

also the error it shows in that case is different. about shapes.

#

(as much ive seen)

modern cypress
#
```
[array([[[ 33,  52,  59],
         [ 44,  60,  73],
         [ 51,  65,  83],
         ...,
         [155, 166, 194],
         [128, 139, 161],
         [ 74,  85, 105]],

        [[ 48,  66,  77],
         [ 49,  65,  81],
         [ 40,  56,  73],
         ...,
         [151, 158, 185],
         [154, 159, 184],
         [110, 114, 139]],```
#

Different

#

Unless I am missing something here

#

I had a project like this with normal Excel data, but this is proving to be much harder

#

What do you think about creating a set resolution bigger than all images and giving the pictures that aren't acceptable black filler borders?
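A minimal sketch of that padding idea, assuming every image is already an H x W x 3 array no larger than the target size (`pad_to` is a made-up helper name). Once all images share a shape, `np.stack` gives one regular 4-D array rather than a list of differently shaped arrays, which is the kind of input that usually triggers the "Unsupported object type numpy.ndarray" error.

```python
import numpy as np

def pad_to(image, height, width):
    """Centre an H x W x C image on a black canvas of the given size."""
    h, w = image.shape[:2]
    if h > height or w > width:
        raise ValueError("target size must be at least as large as the image")
    top = (height - h) // 2
    left = (width - w) // 2
    pad = ((top, height - h - top), (left, width - w - left), (0, 0))
    return np.pad(image, pad, mode="constant", constant_values=0)

# Two dummy "images" with different resolutions.
images = [np.ones((300, 400, 3), dtype=np.uint8),
          np.ones((400, 250, 3), dtype=np.uint8)]

batch = np.stack([pad_to(img, 400, 400) for img in images])
print(batch.shape, batch.dtype)
```

Grayscale images would still need converting to three channels first; resizing everything to one resolution is the other common option.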

kind rock
#

oh, you could do the intro course to nlp on datacamp

#

I'll be doing that. Plus, there's a course by deeplearning.ai on nlp available at Coursera

stone marlin
# primal shard yeah, but that was my question i was wondering if someone knows good resources a...

Specific to the topic of encoding-decoding tokens, you may want to check out the evolution of the idea if you haven't already:

[Note: these resources are from an ex-coworker who does more NLP than I do now, I haven't completely vetted them.]

modern cypress
#

Hey guys, I'm getting the error ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 400, 400, 3), found shape=(None, 400, 3). Even though I'm resizing the images for training and predicting the exact same

#

The training: new_image = image.resize((400, 400))

#

The prediction: im_pil = image.resize((400, 400))

#

and then predicting with prediction = model.predict(np.asarray(im_pil))

#

I'm not sure how it can find a shape of (none, 400, 3)?

desert oar
#

@primal shard i think a lot of machine learning still uses "bag of words", in that each word is converted to a vector embedding before feeding into something like a transformer

#

even if the sequence of the words/tokens is preserved, the word is more likely to be represented as a dense real-valued 100-vector than a sparse binary-valued 10k-vector (or however big your vocabulary is)

#

another common choice is tf-idf, or variants of tf-idf like bm25

#

tfidf doesn't provide any dimension reduction though

modern cypress
#

Could someone explain the error please? What does "shape" mean exactly?
like shape = (1D, 2D, 3D) ?

#

Do I need to reshape in some sort of way?

quiet vault
#

And for some reason the shape is 400 and 3 when inputting it into the model. Would you mind sharing your code so I can get a better idea at whats going on?
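A likely cause, sketched with a dummy array standing in for `np.asarray(im_pil)`: Keras treats the first axis as the batch axis, so a single (400, 400, 3) image looks like 400 samples of shape (400, 3). Adding a leading axis of length 1 makes it a batch containing one image.

```python
import numpy as np

# Stand-in for np.asarray(im_pil) after resizing to 400 x 400 RGB.
image = np.zeros((400, 400, 3), dtype=np.uint8)

# Add a batch axis in front: one sample of shape (400, 400, 3).
batch = np.expand_dims(image, axis=0)

print(image.shape)
print(batch.shape)
```

With the real model the call would then be something like `model.predict(batch)`; that part is an assumption, since the full code wasn't posted.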

distant latch
#

Hi, does anybody know which tool I can use to do something similar to this? :C I checked seaborn, matplotlib and pandas

eternal vector
#

oke thanks

buoyant epoch
#

Has anyone tried to implement TPU support for Tacotron training? I am fighting with waveglow for TPU, but training doesn't seem to work

#

I am getting this error: NotImplementedError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Unknown device for tensorexpr fuser

#

Should I use help channel for it?

stone marlin
#

Pretty much what you'd do, I imagine, is plot x = year, y = ranking, with the thickness being something like popularity. If this was generated, I'm guessing that they did something with the error bars to turn them into the popularity volume or something.

#

Lemme real quick look at mpl tho.

#

Ah, okay, I think I tracked down how they're doing it. https://observablehq.com/@russellgoldenberg/variable-thickness-band It's a d3js thing.

It is quite common to encode the thickness of a line to some variable. In d3, this typically means using d3.area. Here are some examples from NYT, Bloomberg, Axios (all whose work I love dearly), and myself at The Pudding. The image below is some kind of bump chart / streamgraph, but it appears as if Brazil's imports shrink to a tiny amount then...

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @deep ridge until <t:1642714819:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

desert oar
#

we loosely say "2-dimensional", but you have to be careful because this isn't really the same meaning of "dimension" as is used in linear algebra

#

it maybe would be more precise to say "an array with 2 axes", but the term "axes" is a numpy-specific usage here

flint shale
#

Im a little bit confused
whats the difference between yolov5 by ultralytics(pytorch), yolov4 (tensorflow) and yolov4 darknet?

#

what's better

late mesa
#

Hello, I'm using Selenium to automate a purchase on a website. I'm not trying to automate captcha or anything, I just want to be able to input the captcha and then for it to move on.

#

Currently, I'm using ImplicitWait, but after completing the hCaptcha, it leaves me on the same page.

stone marlin
#

This may be better in a webdev room, I'm not sure this is data science or ai.

late mesa
#

ah ok, thank you

desert oar
#

it might still also be against ToS even if you are not bypassing the captcha

#

eg. if this is a sneaker bot or something like that, it might still violate rule 5

jaunty igloo
#

What's a good beginner's level project to start working with machine-learning or AI? Any tips?

wicked grove
#

@lapis sequoia I have a question about transfer learning. The first time I train VGG19, I replace the top layer with a dense layer and softmax, and it learns weights for those final layers. When I fine-tune, i.e. retrain the last two layers of VGG19 on the same dataset, wouldn't it already have learnt the weights, and so end up overfitting?

#

or do i relearn cause i use model.compile again?

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @fierce bronze until <t:1642748880:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

#

:incoming_envelope: :ok_hand: applied mute to @split basin until <t:1642748881:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

charred python
#

Hi, Can someone suggest some tools/libraries to analyze and clean an image dataset specifically for instance segmentation purposes

spare junco
#

Hey guys, So I wanted to use YOLOv4 for object detection, I was following a video where they showed how to install Darknet(using vcpkg). I tried following that but my PC ran into an error (Blue screen after which the computer restarts) in the mid-process of preparation of vcpkg for darknet installation using Powershell after reaching '-- Building ffmpeg for Release' (this line in the powershell). I also followed Medium's steps (https://medium.com/analytics-vidhya/installing-darknet-on-windows-462d84840e5a) but again the same problem occurs. now what do I do

#

Can anyone help me?

#

I also tried installing using CMake

#

still doesnt work

#

shows the same error

terse frigate
#

where can i learn more about this?

spare junco
# terse frigate

The answer is Reply layer or Relu(cuz RELU is an activation dk if counts as a layer)

spare junco
#

Hmm, I didn't learn those concepts in much depth, just a brief idea of the layers. I learnt from CodeBasics

terse frigate
#

i am attempting an entrance exam

#

in 5 days

spare junco
#

oh

terse frigate
#

i just need to know the basics and fundamentals

#

youtube on playback 1.5x should be the best imo

terse frigate
#

what about freecodecamp?

spare junco
#

I tried watching theirs but I didn't understand much. It could be different in your case, so try watching the first 10-20 mins and then decide if you should continue watching the video

sour shoal
#

Hey guys, I made a NN for the MNIST data set. However the accuracy of my NN is really bad and I cant figure out what I am doing wrong. Also the cost function works decent for [64,32,10] NN structure but structures with too many nodes and structures with more than 3 layers do poorly for some reason. Here is my code
https://github.com/MachineLearningEnthusiast/Neural-Network-Project-using-MNIST-data-set cheers


desert bear
#

Hey, I'm doing a presentation on mathematics in machine learning. Can you suggest which basic concepts I should cover? I was thinking about linear algebra (matrix multiplication) and calculus (stochastic gradient descent)

stone marlin
#

What level are you presenting to? High school? College? General Public?

desert bear
#

I just need a few topics that I can cover in less than 7 minutes

hasty kiln
#

I want some numpy projects for beginners 😶

novel elbow
#

for numpy? hmmm

#

try implementing matrix profile

hasty kiln
#

Fortunately for me, I'm starting to learn OpenCV

novel elbow
#

no, matrix profile is a technique for time series

sour shoal
#

that actually sounds really useful

#

what do they mean by helping to find anomalies in data?

#

like noise

#

?

novel elbow
#

imagine you have a time series where a pattern is repeating every 20 steps, but at some point something is different from that pattern, that will be an anomaly

sour shoal
#

like noise

#

so you basically use matrix profile to eliminate anomalies?

#

but it can only spot anomalies in series, that is a specific pattern?

novel elbow
#

with matrix profile you compare windows of data (euclidean distance between 2 windows), if very different than all other windows that will be an anomaly pattern

#

if you compare single data points or very small windows, that will probably capture noise

novel elbow
#

no, in this case you want to spot anomalies in the real data

sour shoal
#

also what is a window of data?

novel elbow
#

window or slide of data: if you have 1000 data points and a window length of 100, the 1st window is [0:100], the 2nd is [1:101], then [2:102], ... up to [n-100:n]
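To make the window idea concrete, here's a small brute-force sketch in NumPy. It is not an optimised matrix profile implementation: it uses raw Euclidean distance (real implementations normally z-normalise each window first), and it builds every pairwise distance, so it only suits short series. `naive_matrix_profile` is a made-up name. The window with the largest nearest-neighbour distance is the discord.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def naive_matrix_profile(series, m):
    """Distance from each length-m window to its nearest non-overlapping neighbour."""
    windows = sliding_window_view(series, m)          # shape (n - m + 1, m)
    diffs = windows[:, None, :] - windows[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))        # all pairwise distances
    # Ignore "trivial matches": a window compared with itself or with a
    # neighbour that overlaps it.
    idx = np.arange(len(windows))
    dists[np.abs(idx[:, None] - idx[None, :]) < m] = np.inf
    return dists.min(axis=1)

# A pattern that repeats every 20 steps, with one disturbed stretch.
pattern = np.sin(2 * np.pi * np.arange(20) / 20)
series = np.tile(pattern, 10)
series[100:105] += 5.0

profile = naive_matrix_profile(series, m=20)
discord_start = int(np.argmax(profile))
print(discord_start)  # start index of the most unusual window
```

Windows that lie entirely in the clean, repeating part have an exact copy 20 steps away, so their profile value is 0; only windows touching the disturbed stretch stand out.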

sour shoal
#

i am interested in application

#

thats all

novel elbow
#

that will not be a good use, because those anomalies are important in the data, they are not noise

sour shoal
#

how can you differentiate anomalies and noise

#

?

novel elbow
#

imagine you have a lot of sales data, and you want to do a quick analysis, you can use matrix profile to find anomalies (the right term is discords in time series literature), and quickly spot points in time where sales went very different (like black friday)

novel elbow
#

It's usually the domain that defines what can be considered noise

sour shoal
#

yeah that would make sense

#

and some times that can be very difficult to tell

#

i would assume

rare ferry
#

Can someone recommend me some techniques and tools to practice and become a pro at python(related to data science only)?

silver pagoda
#

If I want to change specific parts of a csv file every 5 minutes, or every time a update comes through, should I use scheduling clocks and if so does it occupy a thread?(for each 5 min) the reason I ask is because the 5 minutes for each specific one could be off timed and would result in false data if I did it all every 5 minutes

#

and would it be better to use another file type of sorts for this? It’s a settings/ dashboard view program

#

Manages 20 branches which in turn manage a total of 300+ bots

hollow sentinel
#

ugh

#

i can't use value_counts with dask?

soft viper
#

Is there a website where I can read more about classifier models? I need to build 3 yes/no (binary) classification models, but I have no idea which one is which

warm jungle
#

I have a largish 1d array, scores, representing scores for players in a game, so the score for player i is scores[i]. If I want to make an overall leaderboard for the game, then e.g. leaderboard = (-scores).argsort() gives me the players in order of score. But I also have a number of subsets of players, represented as 1d arrays of player ids; say one such is l. To get the leaderboard for l, _, l_lb, _ = np.intersect1d(leaderboard, l, assume_unique=True, return_indices=True) gives what I want (I'm not sure if the order of l_lb is guaranteed, but it seems to work). But if I want just l_lb[:10], it's potentially quite inefficient to compute all of l_lb when l is large. It should be possible to get the first n members directly from leaderboard and l without computing the whole of l_lb. Any suggestions?

desert oar
#

i actually don't understand how intersect1d even gives the correct result... if leaderboard is [5000, 4000, 3000, 2000] and l is an array of player ids [4, 2], then what does the intersection even provide? the intersection would just be empty

#

!e ```python
import numpy as np
leaderboard = np.array([5000, 4000, 3000, 2000])
player_ids = [3, 1]
print(np.sort(leaderboard[player_ids])[::-1])
```

arctic wedgeBOT
#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

[4000 2000]
hollow sentinel
#

i figured it out

hollow sentinel
#

dask needs a .compute()

#

at the end of .value_counts()

desert oar
#

that makes sense, dask operations are "lazy", like spark/pyspark

hollow sentinel
#

yeah i just instantly ran to the doc

#

wow numpy is good for checking lin alg work

desert oar
#

oh, i misunderstood

#

is l already sorted, or no?

warm jungle
#

I think it probably is in practice - it doesn't change so could be sorted anyhow

#

or at least changes infrequently

#

yeah - maybe wrong channel, but it's numpy specific...

desert oar
#

yeah fair enough. numpy questions usually make more sense here, but in this case "how to do it in numpy" seems less important than "how to do this efficiently in general"

#

@warm jungle do i understand correctly? leaderboard are the player ids, sorted by player scores. l is some subset of player ids in unknown/arbitrary order. and you are asking for the best way to sort the contents of l by player scores, and get the top N of those, e.g. top 10.

#

maybe you can do something like check the position of each element of l in the original leaderboard?

#

could be a good stackoverflow question

warm jungle
#

yeah, although like I say, l can easily be in a specific order as it won't change often compared with leaderboard. I'll write it up with some examples. I have a working solution, I'm just not sure it's the best...

desert oar
#

!e ```python
import numpy as np
scores = np.array([5000, 4000, 3000, 2000])
leaderboard = (-scores).argsort()
leaderboard_positions = {
    player_id: position
    for position, player_id
    in enumerate(leaderboard.tolist())
}

players_subset = np.array([3, 2])
players_subset_positions = np.array([
    leaderboard_positions[player_id]
    for player_id
    in players_subset
]).argsort()

print(scores[players_subset[players_subset_positions]])
```

arctic wedgeBOT
#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

[3000 2000]
desert oar
#

@warm jungle ☝️ does that work for you? because you say that players_subset changes infrequently, you can pre-construct players_subset_positions ahead of time.

maybe you can figure out a way to only take the top 10 from that list, but i feel like there is no way to get the top 10 without sorting the whole thing
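One possible way to avoid the full sort, offered as a sketch only: `np.argpartition` does a partial selection, so you can pull out the n highest-scoring players of the subset and then sort just those n. It still has to look at every score in the subset once. `top_n_in_subset` is a hypothetical helper name.

```python
import numpy as np

def top_n_in_subset(scores, subset, n):
    """Return up to n player ids from subset, ordered by descending score."""
    subset = np.asarray(subset)
    n = min(n, len(subset))
    if n == 0:
        return subset[:0]
    sub_scores = scores[subset]
    # After partitioning, the last n positions hold the n largest scores
    # (in no particular order).
    top = np.argpartition(sub_scores, len(subset) - n)[-n:]
    # Sort only those n candidates by descending score.
    top = top[np.argsort(-sub_scores[top])]
    return subset[top]

scores = np.array([5000, 4000, 3000, 2000, 4500])
print(top_n_in_subset(scores, [3, 2, 4], n=2))
```

For a subset of a few thousand players the difference from a full sort is probably small, so it's worth timing both before committing to either.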

soft viper
#
```python
model_mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter = 300, activation='relu', solver='adam', random_state=0)
# fitting on training data
model_mlp.fit(X_train, y_train)
```
I got this error when i ran the code above
```
D:\Anaconda\lib\site-packages\sklearn\neural_network\_multilayer_perceptron.py:582: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (300) reached and the optimization hasn't converged yet.
  warnings.warn(
```
unreal swan
#

you should increase them

desert oar
#

try decreasing learning rate if you can

proper swift
#

Hey all, just wondering, since I can't find a clear answer in the docs: is there a way to use df.query together with .unique() for one column, or with .drop_duplicates, in pandas?

desert oar
#

i don't think .query supports this, nor can i imagine how it would without being overly complicated
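The usual workaround is to chain the calls after the query, since `.query` returns an ordinary DataFrame. A small sketch with a made-up frame (the column names and values are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Lima", "Lima", "Pune"],
    "sales": [10, 25, 30, 5, 40],
})

# Filter first, then de-duplicate the result.
filtered = df.query("sales >= 10")

# Unique values of a single column (returns an array).
unique_cities = filtered["city"].unique()

# Or keep the first matching row per city (returns a DataFrame).
first_per_city = filtered.drop_duplicates(subset="city")

print(unique_cities)
print(first_per_city)
```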

chilly dome
#

is this a good place to talk about trading bots?

desert oar
#

yes, although i don't think we have too many experts on that subject here

thin palm
#

What's up Python gang, I'm trying to take this column of addresses and get the LAT and LONG for each one, but for some reason it's not working on my dataframe column. I've tested it with other columns and it seems to work. Can someone help? Here's the code:

```python
# Turn our new coordinates column into lats and longs
import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="myApp")

df[['location_lat', 'location_long']] = df['coordinates'].apply(
    geolocator.geocode).apply(lambda x: pd.Series(
        [x.latitude, x.longitude], index=['location_lat', 'location_long']))
```
Here's the error I get:
#

AttributeError: 'NoneType' object has no attribute 'latitude'

desert oar
#

i'd recommend writing a standalone function for this, if only to make debugging easier:

import warnings
import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="myApp")

def geolocate(address):
    result = geolocator.geocode(address)
    if result is None:
        warnings.warn(f'Failed to geocode: {address}.')
    return pd.Series({
        'location_lat': result.latitude,
        'location_lon': result.longitude
    })

df = pd.DataFrame({
    'address': ['10528 Duke Ave SW,Albuquerque,NM,87121'],
})
df[['location_lat', 'location_lon']] = df['address'].apply(geolocate)
#

also, here's a tip: whenever you query an API in bulk, save the raw output

#

don't just save the processed output

#

you never know when you'll need to process the data differently. and if you didn't save the raw output, you possibly just wasted a bunch of time and money
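A minimal sketch of what "save the raw output" can look like, using JSON Lines with the standard library only. `save_raw`, `load_raw`, the file name and the record layout are all made up for illustration; with geopy the raw response should be available on the returned location object as `.raw`, but check that against your version.

```python
import json
import tempfile
from pathlib import Path

def save_raw(record, path):
    """Append one raw API response (any JSON-serialisable object) as a line."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

def load_raw(path):
    """Read every saved response back, one per line."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    raw_path = Path(tmp) / "geocode_raw.jsonl"
    save_raw({"query": "some address", "response": {"lat": "35.0", "lon": "-106.7"}}, raw_path)
    save_raw({"query": "bad address", "response": None}, raw_path)
    records = load_raw(raw_path)

print(len(records))
```

Storing failed lookups too (the `None` response here) means you can tell "never queried" apart from "queried and not found" without hitting the API again.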

thin palm
#

it's 128 lines in that column

desert oar
#

don't copy and paste code without understanding it!

desert oar
#

that's why I wrote the code that prints a warning

#

that way you can at least see which address causes the problem

thin palm
#

gotcha, I assumed it was fixing my error haha

#

thanks though salt rock lamp, appreciate it

desert oar
#

well like i said, don't copy and paste without understanding what the code does!

#
import warnings
import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="myApp")

def geolocate(address):
    result = geolocator.geocode(address)
    if result is None:
        warnings.warn(f'Failed to geocode: {address}.')
        return pd.Series({
            'location_lat': None,
            'location_lon': None,
        })
    else:
        return pd.Series({
            'location_lat': result.latitude,
            'location_lon': result.longitude,
        })

df = pd.DataFrame({
    'address': ['10528 Duke Ave SW,Albuquerque,NM,87121'],
})
df[['location_lat', 'location_lon']] = df['address'].apply(geolocate)
thin palm
#

found the error address, thank you sm @desert oar

plush grove
#

Is this the right table to ask about REST API and retrieving data from API?

#

In general I'm looking for suggestions for references, books, sites, or even terms I can research.

I have an API which says things like "Using bla-bla API end point data can be linked to bla-bla API endpoint through the LocationID field"

#

Totally new to this.

and the different syntaxes for making queries through the URL

#

The API is accounting info from R365

copper grotto
#

I'm trying to implement an extension field of the rationals, in a way compatible with Numpy. Specifically, I need to extend the rationals with the golden ratio. After reading a little, I came to the uncertain conclusion that a custom array container would be the way to go (https://numpy.org/doc/stable/user/basics.dispatch.html#basics-dispatch) as opposed to a subclass of numpy.ndarray (https://numpy.org/doc/stable/reference/arrays.classes.html). I've got the code started but I wanted to see examples of how other people have handled extension fields and/or Numpy custom array containers. (Or subclasses of Numpy arrays -- I'm not sure.)

So, googling for this a bit, I instead encountered Sympy. My overall goal is to perform fast, exact calculations involving rationals and the golden ratio. Is Sympy a good option, or should I stay on the Numpy array container track?

I'd expect to be able to find someone at least implementing rational numbers as a custom array container, but I can't seem to find it.
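For reference, the arithmetic itself is small. Here's a pure-Python sketch of exact arithmetic in the rationals extended with the golden ratio, using `fractions.Fraction`. It is not a NumPy array container and won't be fast; the class name `GoldenRational` is made up. It only relies on the identity φ² = φ + 1.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class GoldenRational:
    """The number a + b*phi with rational a and b, where phi**2 == phi + 1."""
    a: Fraction
    b: Fraction = Fraction(0)

    def __add__(self, other):
        return GoldenRational(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*phi)(c + d*phi) = ac + (ad + bc)*phi + bd*phi**2
        #                        = (ac + bd) + (ad + bc + bd)*phi
        a, b, c, d = self.a, self.b, other.a, other.b
        return GoldenRational(a * c + b * d, a * d + b * c + b * d)

    def inverse(self):
        # Multiply by the conjugate (a + b) - b*phi; the product is the
        # rational norm a**2 + a*b - b**2, nonzero unless a == b == 0.
        a, b = self.a, self.b
        norm = a * a + a * b - b * b
        return GoldenRational((a + b) / norm, -b / norm)

phi = GoldenRational(Fraction(0), Fraction(1))
one = GoldenRational(Fraction(1), Fraction(0))

print(phi * phi)             # 1 + 1*phi
print(phi * phi.inverse())   # 1 + 0*phi
```

A vectorised version could keep the `a` and `b` parts as two parallel arrays and apply the same formulas elementwise, though exact rationals in NumPy would still need an integer numerator/denominator representation.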

arctic wedgeBOT
#

Hey @copper grotto!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

copper grotto
#

I've got a partial implementation now but I'm getting an error I'm not sure what to do about. https://paste.pythondiscord.com/akoxuwijas.py

The exception is:
numpy.core._exceptions._UFuncNoLoopError: ufunc 'gcd' did not contain a loop with signature matching types (dtype('float64'), dtype('float64')) -> dtype('float64')

This seems to have something to do with a potential for loopy behavior, where my class would punt taking the gcd to Numpy but Numpy would think it has to pass responsibility back to my class. But, I definitely want Numpy taking the gcd; otherwise most of my operations would end up way too slow.

spare junco
#

any idea on how i can solve this problem?

#

Pls someone help, I have been trying to install darknet for 4 days

royal crest
#

there are no more 2.3.0-rc's

#

as is displayed in the error message

spare junco
#

So now which version should i install

royal crest
#

don't freeze versions i guess

spare junco
#

wdym?

#

just install tensorflow-gpu?

#

not specifying a version

royal crest
#

yes

#

tensorflow-gpu==2.3.0rc0 means you are requesting for version 2.3.0rc0 and nothing else

#

so remove the ==* bit

spare junco
#

actually i am following a video

royal crest
#

or if you need a particular version then remove the rc0

spare junco
#

so if i dont install the same version, then there could be some issues

royal crest
#

well 2.3.0rc0 doesn't exist anymore

#

closest you might get is 2.3.0

spare junco
#

Okay, thanks i will try

#

do you know how to install darknet

#

and what is this error

#

@royal crest

royal crest
#

there is a darknet discord server if you are interested

royal crest
#

zSq8rtW

spare junco
#

its trying to solve something idk what

royal crest
#

probably some dep hell

spare junco
#

hmm, okay

arctic crown
#

M
so i wanna make an ai
that can write an essay for me
example you give it a topic and it just searches google and finds info about that topic and then writes an essay about it
how can i achieve this?

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @jaunty lotus until <t:1642842550:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).