shut slate Mar 9, 2021, 6:39 PM

#

'Index' object is not callable

#

😦

lavish swift Mar 9, 2021, 6:46 PM

#

if I remember right, when I rename columns, I use a dict like

df.rename(columns = {'old_name' : 'new_name'}, inplace = True)

So if you want to use a list perhaps you can create a dict by combining df.columns and newColumns?
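
A minimal sketch of that idea, assuming the list of new names is in the same order as the existing columns. The frame and the names `old a`, `old b`, and `new_columns` are placeholders for illustration, not the asker's actual data.

```python
import pandas as pd

# Hypothetical frame; the column names are placeholders.
df = pd.DataFrame({"old a": [1, 2], "old b": [3, 4]})
new_columns = ["a", "b"]  # assumed to be in the same order as df.columns

# zip pairs each existing name with its replacement; dict turns the pairs into a mapping.
mapping = dict(zip(df.columns, new_columns))

# rename returns a new DataFrame, so assign the result back.
df = df.rename(columns=mapping)
print(list(df.columns))
```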

shut slate Mar 9, 2021, 6:47 PM

#

Hi M00se

#

I remember you

#

You helped me b4

lavish swift Mar 9, 2021, 6:48 PM

#

hey! I thought your name looked familiar 🙂

shut slate Mar 9, 2021, 6:48 PM

#

what is a dict?

#

lol

lavish swift Mar 9, 2021, 6:48 PM

#

dictionary

#

key/value pairs

shut slate Mar 9, 2021, 6:49 PM

#

how do you create a dictionary?

#

lol

lavish swift Mar 9, 2021, 6:50 PM

#

trying to work on a few things right now, maybe someone can help with that in the meantime, if not see if you can do some searching. If you're still stuck I'll try to check back in

shut slate Mar 9, 2021, 6:51 PM

#

ye i am. Thanks

hollow sentinel Mar 9, 2021, 6:53 PM

#

myDict = {"dog":5, "cat":6}

#

that is a dictionary

#

dictionaries have key value pairs

#

myDict["dog"]  # evaluates to 5

#

this is how you access the value of a given key
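
For reference, a short sketch of the basic dictionary operations mentioned here: creating one, reading a value by key, and adding or overwriting an entry. The extra keys (`bird`, `fish`) are made up for illustration.

```python
# Create a dictionary from key/value pairs.
my_dict = {"dog": 5, "cat": 6}

# Read the value stored under a key.
dog_value = my_dict["dog"]

# Assigning to a key adds it if it is new, or overwrites it if it exists.
my_dict["bird"] = 7
my_dict["cat"] = 10

# .get returns a default instead of raising KeyError for a missing key.
fish_value = my_dict.get("fish", 0)

print(dog_value, my_dict, fish_value)
```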

#

https://realpython.com/python-dicts/

Dictionaries in Python – Real Python

In this Python dictionaries tutorial you'll cover the basic characteristics and learn how to access and manage dictionary data. Once you have finished this tutorial, you should have a good sense of when a dictionary is the appropriate data type to use, and how to do so.

#

good resource for learning dictionaries

#

take your time and understand these bc they're important

#

understanding the basic data structures is imo more important than Pandas

trim oar Mar 9, 2021, 6:58 PM

#

shut slate ye i am. Thanks

https://www.geeksforgeeks.org/differences-and-applications-of-list-tuple-set-and-dictionary-in-python/

GeeksforGeeks

Differences and Applications of List, Tuple, Set and Dictionary in ...


#

These are very commonly used data structures

hollow sentinel Mar 9, 2021, 7:00 PM

#

~~maybe learning linked lists/other data structures would be good too~~

#

~~before you just randomly throw yourself into Pandas~~

grave frost Mar 9, 2021, 7:00 PM

#

he is right, you should build a desktop with your own GPU. why? firstly, the cloud is expensive to run (like colab). You won't believe how easy it is to put your credit card on GCP and forget to terminate the instance (or maybe due to a bad connection it didn't terminate) leading to lots of loss of money.

Next up is competitions. I had the same naive mindset but soon I realized that hyperparameter tuning takes a looong time (despite using some pretty advanced stuff and Dask). You are so much better off if you build your own desktop and have it run while you sleep/watch YT. So if your end goal is competitions (Kaggle, etc.) then probably building a desktop is best.

IMO the only reason you should opt for the cloud is when you just can't afford to buy a desktop. AMD Ryzen 7 + RTX Titan seems a pretty solid choice (albeit expensive, you can use it for gaming or start with 3080ti and move on).

stuff like Colab only offers 16 GB GPUs, which run out pretty quick. Since VRAM is a very limiting factor when doing DL, going for a multi-PCIe mobo is good cuz later you can put in 2 3080s or something cheaper than that

#

usually, my recommendation is cloud to everybody to get started, but since you are already making a PC, you just need an Nvidia GPU and you are good to go for both gaming and DL

shut slate Mar 9, 2021, 7:02 PM

#

Ok thank you guys

serene scaffold Mar 9, 2021, 7:04 PM

#

hollow sentinel ~~maybe learning linked lists/other data structures would be good too~~

Pandas is more about understanding the NumPy-style approach to iterative operations and database style operations. Knowing linked lists won't help you with pandas per se.

misty flint Mar 9, 2021, 7:04 PM

#

shut slate Ok thank you guys

i know you got your problem fixed but another way to tell jupyter to find your file is by running this in a separate cell
%cd [copy and paste your filepath here]

#

without the brackets lol

shut slate Mar 9, 2021, 7:05 PM

#

I see

#

Thanks

iron basalt Mar 9, 2021, 7:05 PM

#

serene scaffold Pandas is more about understanding the numpic approach to iterative operations a...

I think the value of knowing it would be more indirect. Simply having more experience with python programming in general. I see a lot of people getting in DS and pandas without even knowing basic programming and the tools provided by python.

misty flint Mar 9, 2021, 7:05 PM

#

ok_handbutflipped

shut slate Mar 9, 2021, 7:06 PM

#

I mean I did some basic python programs before, but if you tackle a real problem you learn more. Even if it's trial and error

misty flint Mar 9, 2021, 7:06 PM

#

ye

serene scaffold Mar 9, 2021, 7:06 PM

#

iron basalt I think the value of knowing it would be more indirect. Simply having more exper...

I agree that implementing a linked list (especially if done with dunder methods) can teach one a lot of general python.

shut slate Mar 9, 2021, 7:06 PM

#

like if statements, loops, lists etc

iron basalt Mar 9, 2021, 7:08 PM

#

Yeah I recommend practice problems that are concrete and may force you to learn a new data structure. But going straight to pandas is like the final boss. Pandas itself is implemented with the knowledge of a bunch of data structures and python concepts (dunder methods, boolean masks, etc).

hollow sentinel Mar 9, 2021, 7:08 PM

#

dunder methods are the underscore methods right?

#

like _iter

#

why is that italicized idk

iron basalt Mar 9, 2021, 7:09 PM

#

(And even makes use of python's dynamic nature by allowing stuff like both strings and integers for keys).

#

dunder = double under, like __init__ is one of them.

#

They let you do things like implement [] for your custom class.

shut slate Mar 9, 2021, 7:11 PM

#

Ye idk, I am doing a big data certificate and they dove straight in. Keep in mind that I did some programming b4, even if it was really basic. The people who literally never saw code b4 just copy paste from professors notes lol

#

Gl to them

grave frost Mar 9, 2021, 7:11 PM

#

iron basalt They let you do things like implement `[]` for your custom class.

but __version__ doesnt do that 😦

iron basalt Mar 9, 2021, 7:12 PM

#

>>> class Foo:
...     def __init__(self):
...             self.x = 10
...     def __getitem__(self, key):
...             print(key, self.x)
... 
>>> foo = Foo()
>>> foo["bar"]
bar 10
>>>

astral path Mar 9, 2021, 7:13 PM

#

is there a way to run R gstat functions in python?

#

i'm trying to make an H-Scatterplot but it appears to only be implemented in R

exotic maple Mar 9, 2021, 7:19 PM

#

iron basalt ```py >>> class Foo: ... def __init__(self): ... self.x = 10 ......

the getitem dunder is the one used to implement brackets on python?

iron basalt Mar 9, 2021, 7:22 PM

#

yes and setitem

#

From that example it should be obvious now why Pandas can do many different things depending on the type of the key.
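
To make that concrete, here is a toy sketch of a class whose `[]` behaves differently depending on the type of the key. The class `Table` and its rules are invented for illustration; this is not how pandas is implemented, and pandas' own rules for each key type differ.

```python
class Table:
    """Toy container: column name -> list of values."""

    def __init__(self):
        self.columns = {"a": [1, 2, 3], "b": [4, 5, 6]}

    def __getitem__(self, key):
        # A string selects one column.
        if isinstance(key, str):
            return self.columns[key]
        # A list of strings selects several columns.
        if isinstance(key, list):
            return {name: self.columns[name] for name in key}
        # An integer selects one row across all columns.
        if isinstance(key, int):
            return {name: values[key] for name, values in self.columns.items()}
        raise TypeError(f"unsupported key type: {type(key).__name__}")

    def __setitem__(self, key, value):
        # table["c"] = ... ends up here.
        self.columns[key] = list(value)


table = Table()
table["c"] = (7, 8, 9)
print(table["a"], table[["a", "b"]], table[1])
```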

#

(btw numpy also does this)

misty flint Mar 9, 2021, 7:39 PM

#

but pandas is like the better sibling over numpy

#

DoggoKek

bitter harbor Mar 9, 2021, 7:39 PM

#

those are fighting words

grave frost Mar 9, 2021, 7:42 PM

#

misty flint but pandas is like the better sibling over numpy

I disagree - both were made with specific core philosophies in mind that served different needs. Oftentimes you would use both (like pandas for manipulating the DF and extracting some data in form of numpy arrays). Plus a lot of libs have direct support for numpy arrays, so can I say that numpy is superior??

bitter harbor Mar 9, 2021, 7:43 PM

#

isn't pandas built on numpy?

grave frost Mar 9, 2021, 7:43 PM

#

pandas is an open-source library built on top of numpy

bitter harbor Mar 9, 2021, 7:44 PM

#

just making sure im not going insane 😄

#

they both have their uses and what not, if you've ever tried dealing with strings with numpy, you'll quickly understand the benefit of pandas

#

but i'd argue pandas is a lot more specific in its use cases

serene scaffold Mar 9, 2021, 7:46 PM

#

I'd actually argue that pandas is more general

grave frost Mar 9, 2021, 7:46 PM

#

bitter harbor they both have their uses and what not, if you've ever tried dealing with string...

ahh, I don't do much with pandas. usually, I extract it as NumPy arrays and do operations on that. then later, back to pandas, some more, and then export

bitter harbor Mar 9, 2021, 7:47 PM

#

haha i've actually been doing the same for my data course work

bitter harbor Mar 9, 2021, 7:47 PM

#

serene scaffold I'd actually argue that pandas is more general

how so

grave frost Mar 9, 2021, 7:47 PM

#

bitter harbor haha i've actually been doing the same for my data course work

ye, I find arrays simpler and easier

serene scaffold Mar 9, 2021, 7:50 PM

#

bitter harbor how so

Pandas is for tabular data in general. You could even use it just to read a csv and write it back to file with a different delimiter. And it has math operations, but it has string operations and database operations, too

#

Numpy is for math.

oak elk Mar 9, 2021, 7:52 PM

#

Got it thank you man

bitter harbor Mar 9, 2021, 7:52 PM

#

serene scaffold Pandas is for tabular data in general. You could even use it just to read a csv ...

doesn't most of that utility come from numpy tho?

misty flint Mar 9, 2021, 7:53 PM

#

grave frost I disagree - both were made in mind with some specific core philosophies that se...

its just a joke blobgrimacing

serene scaffold Mar 9, 2021, 7:55 PM

#

bitter harbor doesn't most of that utility come from numpy tho?

Dataframes use arrays for their data model, but that doesn't mean it's just a wrapper around numpy functionality

grave frost Mar 9, 2021, 7:56 PM

#

misty flint its just a joke blobgrimacing

haha, you could have started WW3

uncut barn Mar 9, 2021, 7:59 PM

#

Are there CNNs where it takes a bunch of images and outputs the probability of each being a dog or another animal, and then after that we take all the images that were predicted as a dog and feed them into another CNN to determine the probabilities of the breed of the dog? If so, can you point me to a link where someone has done it?

bitter harbor Mar 9, 2021, 8:01 PM

#

serene scaffold Dataframes use arrays for their data model, but that doesn't mean it's just a wr...

https://github.com/pandas-dev/pandas/blob/master/pandas/core/computation/ops.py#L619 specifically the self.func = getattr(np, name)
Is that not just mapping their functions to numpy ones?
I'm not saying that it's just a wrapper but from what I can see (which mind you isn't much I don't have any spare brain cells rn), a lot of the functionality comes from numpy

grave frost Mar 9, 2021, 8:02 PM

#

uncut barn Are there CNNs where it takes a bunch of images and outputs the probability of i...

all of them do that 🤷 that's how classification works

#

model outputs the probability and we usually choose the one with the maximum score

uncut barn Mar 9, 2021, 8:03 PM

#

im talking about a CNN to a another CNN

#

so predict the animal, then of those animals that were predicted as a dog feed it into another CNN to predict their breed

bitter harbor Mar 9, 2021, 8:04 PM

#

you can chain models together where ones output is the input of the next

grave frost Mar 9, 2021, 8:04 PM

#

uncut barn so predict the animal, then of those animals that were predicted as a dog feed i...

it would be the same structure as the first one just with different data

bitter harbor Mar 9, 2021, 8:04 PM

#

but you've still got 2 separate models

rough shore Mar 9, 2021, 8:04 PM

#

How is it possible to learn ai?

uncut barn Mar 9, 2021, 8:05 PM

#

yh I know the intermediate step is confusing me

grave frost Mar 9, 2021, 8:05 PM

#

rough shore How is it possible to learn ai?

the wonder of evolution and the human mind

uncut barn Mar 9, 2021, 8:05 PM

#

do i use the argmax to filter out the original dataset to get those images that were predicted as a dog?

serene scaffold Mar 9, 2021, 8:05 PM

#

rough shore How is it possible to learn ai?

This is too broad of a question. AI refers to a lot of things.

grave frost Mar 9, 2021, 8:06 PM

#

uncut barn do i use the argmax to filter out the original dataset to get those images that ...

???

#

what dataset do you have?

uncut barn Mar 9, 2021, 8:07 PM

#

i'm talking hypothetically, i know there is an animals/ dogs and cats data set

grave frost Mar 9, 2021, 8:07 PM

#

you need another one for breeds

#

I don't see how it is complicated: first model tells you it's a cat or dog, second one tells you the breed. (you could compress it to one model too like dog.labrador)

uncut barn Mar 9, 2021, 8:08 PM

#

do i feed the images that were dogs (which was predicted by the first CNN) by filtering out using the indices?

grave frost Mar 9, 2021, 8:09 PM

#

indices of what?

uncut barn Mar 9, 2021, 8:09 PM

#

indices of the images

grave frost Mar 9, 2021, 8:09 PM

#

model outputs a probability distribution, not a dataset

uncut barn Mar 9, 2021, 8:10 PM

#

so if the 6 th image was predicted as a dog due to the probability being the greatest, would i need to get the 6th image in the original dataset and feed that into the 2nd CNN?

grave frost Mar 9, 2021, 8:10 PM

#

uncut barn so if the 6 th image was predicted as a dog due to the probability being the gre...

yeah, ofc

ripe forge Mar 9, 2021, 8:10 PM

#

But yes, argmax on that and take a subset of images that belong to the same dog class for step 2. This would be something you do during inference, at training you know which ones are dogs and so on

#

So during training you take the correct subset based on ground truth.
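
The inference-time filtering step can be sketched with plain NumPy. Everything here is a stand-in: `probs` plays the role of the first model's output (one row of class probabilities per image), `images` is a fake batch, and the class order in `class_names` is an assumption.

```python
import numpy as np

class_names = ["cat", "dog", "bird"]  # assumed class order of the first model

# Stand-in for the first CNN's output: one probability row per image.
probs = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.20, 0.30, 0.50],
    [0.05, 0.90, 0.05],
])

# Fake batch of four tiny 2x2 "images", aligned row by row with probs.
images = np.arange(4 * 2 * 2).reshape(4, 2, 2)

# argmax over the class axis gives the predicted class index per image.
predicted = probs.argmax(axis=1)

# Keep only the images whose predicted class is "dog".
dog_mask = predicted == class_names.index("dog")
dog_indices = np.flatnonzero(dog_mask)
dog_images = images[dog_mask]

print(dog_indices, dog_images.shape)
```

`dog_images` is what would be passed to the second (breed) model. During training the mask would come from the ground-truth labels instead of `predicted`.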

uncut barn Mar 9, 2021, 8:11 PM

#

ripe forge But yes, argmax on that and take a subset of images that belong to the same dog ...

thanks, i needed this confirmation

misty flint Mar 9, 2021, 8:13 PM

#

anybody use lexnlp? cant get package to install

#

probs bc of dependencies

astral path Mar 9, 2021, 8:14 PM

#

How do I take a dataframe like this

#

and add a new column with the mean value of the columns count and avg_plays for each artist?

#

so for the artist Marcioz, it would have a new column with the mean of all count values in a row in which the artist is Marcioz, and a column for the avg_plays equivalent

serene scaffold Mar 9, 2021, 8:18 PM

#

misty flint anybody use lexnlp? cant get package to install

Windows or no?

exotic maple Mar 9, 2021, 8:19 PM

#

astral path How do I take a dataframe like this

you mean aggregating results by artist?

#

you can use pivot_table

#

or df.groupby("artist").agg({"count":sum, "avg_plays": np.average})

astral path Mar 9, 2021, 8:20 PM

#

like aggregating count by artist except it's the mean

#

ok yeah yours makes sense

astral path Mar 9, 2021, 8:21 PM

#

exotic maple you can use pivot_table

it returns all NaN

misty flint Mar 9, 2021, 8:21 PM

#

windows

exotic maple Mar 9, 2021, 8:21 PM

#

your data types are probably wrong then

#

did you check each column dtype?

astral path Mar 9, 2021, 8:21 PM

#

yea

#

im changing it to df.groupby("artist").agg({"count":np.average}) because that's all i need right now

#

that would work right?

#

shut slate Mar 9, 2021, 8:24 PM

#

Hi guys

#

Why does this not rename?

#

serene scaffold Mar 9, 2021, 8:25 PM

#

shut slate Why does this not rename?

That method returns a new dataframe

#

It doesn't change the name of the existing one

shut slate Mar 9, 2021, 8:26 PM

#

it changes other stuff lol

serene scaffold Mar 9, 2021, 8:26 PM

#

Oh you did in place. Hmm

shut slate Mar 9, 2021, 8:27 PM

#

#

this works

astral path Mar 9, 2021, 8:27 PM

#

exotic maple did you check each column dtype?

i got it to work

#

df.groupby("artist")['count'].transform(np.average)
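
A small sketch of the difference between the two approaches discussed above, on made-up rows (only the artist name Marcioz and the column names come from the question). `transform` returns one value per original row, so it can be assigned as a new column; `agg` returns one row per artist.

```python
import pandas as pd

# Made-up sample; "Other" is a placeholder artist.
df = pd.DataFrame({
    "artist": ["Marcioz", "Marcioz", "Other"],
    "count": [10, 20, 5],
    "avg_plays": [1.0, 3.0, 2.0],
})

# transform keeps the original shape: every row gets its group's mean.
df["count_mean"] = df.groupby("artist")["count"].transform("mean")
df["avg_plays_mean"] = df.groupby("artist")["avg_plays"].transform("mean")

# agg collapses to one row per artist instead.
summary = df.groupby("artist").agg({"count": "mean", "avg_plays": "mean"})

print(df)
print(summary)
```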

shut slate Mar 9, 2021, 8:28 PM

#

like i want to do df.claim type.mean()

#

So I wanted to put underscore

#

But wtf lol

iron basalt Mar 9, 2021, 8:29 PM

#

run print(df.columns) and show output

#

(before edit)

shut slate Mar 9, 2021, 8:30 PM

#

Oh I see

#

There is a space

iron basalt Mar 9, 2021, 8:31 PM

#

Ah I guessed correctly

#

I have seen this happen also when people try to select rows by name and the name has a space (like space in front of artist name).

#

Some spaces can sneak in from excel and other places.
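
One way to guard against stray spaces is to clean the column names right after loading. This is a sketch on a made-up frame; the column name `claim type` comes from the question above, while the leading space and the `amount` column are assumptions.

```python
import pandas as pd

# Made-up frame with a leading and a trailing space in the column names.
df = pd.DataFrame({" claim type": [1.0, 3.0], "amount ": [10, 20]})
print(df.columns.tolist())  # the stray spaces are visible in the list

# Strip surrounding whitespace, then replace inner spaces with underscores.
df.columns = df.columns.str.strip().str.replace(" ", "_", regex=False)

# With an identifier-like name, attribute access works.
mean_claim = df.claim_type.mean()
print(df.columns.tolist(), mean_claim)
```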

shut slate Mar 9, 2021, 8:31 PM

#

Thank you dude 🙂

#

it worked

iron basalt Mar 9, 2021, 8:32 PM

#

I prefer df.rename(columns={...}), less confusing than axis=1 or whatever.

shut slate Mar 9, 2021, 8:49 PM

#

Hi guys

#

one last question

#

#

Can I get the mean of a float or do I have to change it to a float?

lapis sequoia Mar 9, 2021, 9:11 PM

#

I want to start using Python for scripting in text-processing

#

Is it possible to change a file contents without opening it with help of python?

exotic maple Mar 9, 2021, 9:17 PM

#

iron basalt I prefer `df.rename(columns={...})`, less confusing than `axis=1` or whatever.

Yeah i prefer this as well

exotic maple Mar 9, 2021, 9:18 PM

#

shut slate Can I get the mean of a float or do I have to change it to a float?

of course you can

#

try using np.mean or np.average

#

also

#

to aggregate

#

use .agg()

distant hedge Mar 9, 2021, 9:18 PM

#

Hey guys, quick question. How can I store df in x and then continue editing x?

x = x.loc[x['ID' != 0] = 'Hello'

exotic maple Mar 9, 2021, 9:20 PM

#

you want...a dataframe as a column of another dataframe?

distant hedge Mar 9, 2021, 9:20 PM

#

I am working with excel that needs a lot of manipulations. For example we have John, Mike, and Angela. I need to edit things in their dataframes before saving for each person.

distant hedge Mar 9, 2021, 9:22 PM

#

exotic maple you want...a dataframe as a column of another dataframe?

Not really. I just want to store Dataframe in a variable so that it doesn't affect the df parameter.

exotic maple Mar 9, 2021, 9:22 PM

#

eh

#

just copy it?

#

x = df.copy()

#

lol

#

that or i'm not understanding
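
One caveat worth spelling out: plain assignment only binds a second name to the same DataFrame, while `.copy()` creates an independent one. A sketch on a made-up frame (the names John, Mike, and Angela come from the question; the `ID` values are invented):

```python
import pandas as pd

df = pd.DataFrame({"ID": [0, 1, 2], "Name": ["John", "Mike", "Angela"]})

alias = df               # same object under a second name
independent = df.copy()  # separate object with its own data

# Editing the copy leaves df untouched.
independent.loc[independent["ID"] != 0, "Name"] = "Hello"
names_after_copy_edit = df["Name"].tolist()

# Editing through the alias changes df, because it is df.
alias.loc[alias["ID"] != 0, "Name"] = "Changed"

print(names_after_copy_edit, df["Name"].tolist())
```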

umbral sierra Mar 9, 2021, 9:22 PM

#

https://stackoverflow.com/questions/66554689/pass-from-a-model-of-type-gensim-models-keyedvectors-word2veckeyedvectors-to-a-m

Stack Overflow

pass from a model of type gensim.models.keyedvectors.Word2VecKeyedV...

I downloaded a word embedding already train in "glove.txt" format
I imported it in as a model of type gensim.models.keyedvectors.Word2VecKeyedVectors thanks to this documentation :
https://

#

Someone to help :p

distant hedge Mar 9, 2021, 9:25 PM

#

exotic maple that or i'm not understanding

Just a second 🙂

distant hedge Mar 9, 2021, 9:29 PM

#

exotic maple that or i'm not understanding

import pandas as pd
import re

df = pd.read_csv('old.csv')

#Step1 - Finding all value in column Name beginning with Ale
x = df.loc[df['Full Name'].str.contains('^Ale[a-z]*', regex=True)]

#Step 2 - Replacing all the rows in column name by Alex
x = x.loc[x['Full Name'] != 0] = 'Alex'

#Step 3 - Getting back to original dataframe + Step 1
y = df.loc[df['Full Name'].str.contains('^Je[a-z]*', regex=True)]

#Step 4 - Step 2 but replacing by Jessica
y = y.loc[x['Full Name'] != 0] = 'Jessica'

#

So, this doesn't work... and spits out an error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-fc63323fe12c> in <module>
      3 
      4 #Step 2 - Replacing all the rows in column name by Alex
----> 5 x = x.loc[x['Full Name'] != 0] = ['Alex']
      6 
      7 # #Step 3 - Getting back to original dataframe + Step 1

AttributeError: 'list' object has no attribute 'loc'

serene scaffold Mar 9, 2021, 9:46 PM

#

distant hedge ``` import pandas as pd import re df = pd.read_csv('old.csv') #Step1 - Finding...

can you provide a sample of the CSV so that I can try to reproduce the error?

distant hedge Mar 9, 2021, 9:47 PM

#

Sure

serene scaffold Mar 9, 2021, 9:48 PM

#

Something just came up so it may be a bit

lavish swift Mar 9, 2021, 9:51 PM

#

distant hedge ``` import pandas as pd import re df = pd.read_csv('old.csv') #Step1 - Finding...

It's strange. Seems x = df.loc[df['Full Name'].str.contains('^Ale[a-z]*', regex=True)] is causing x to be a list and not a df, but I'm not sure why. Have you tried printing x before moving onto the next step to see what it looks like?

hollow sentinel Mar 9, 2021, 9:51 PM

#

what's import re?

#

oh regex

#

cool

arctic wedgeBOT Mar 9, 2021, 9:54 PM

#

Hey @distant hedge!

It looks like you tried to attach file type(s) that we do not allow (.csv). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

distant hedge Mar 9, 2021, 9:54 PM

#

CSV not allowed, let me upload it somewhere

distant hedge Mar 9, 2021, 9:55 PM

#

lavish swift It's strange. Seems `x = df.loc[df['Full Name'].str.contains('^Ale[a-z]*', rege...

I did try that, it was printing it weird, without column names

lusty coral Mar 9, 2021, 9:56 PM

#

distant hedge I did try that, it was printing it weird, without column names

Try apply maybe? Slower but it is more flexible

distant hedge Mar 9, 2021, 9:57 PM

#

https://drive.google.com/file/d/1dD_NHv3iWk5v0IKg3Ku_hWXn7wk_Ypjd/view?usp=sharing

Google Docs

Old.csv

distant hedge Mar 9, 2021, 9:58 PM

#

lusty coral Try apply maybe? Slower but it is more flexible

I am new to pd. I will have to read about the apply method to figure out how it works 😄

lusty coral Mar 9, 2021, 9:59 PM

#

Apply gets the row of the dataframe as a series. Then you can do whatever you want with it

coral cloak Mar 9, 2021, 9:59 PM

#

so I'm splitting a string and adding each word to a list like so:

" ".join(x).split()

is there a way to NOT add a word if it doesn't meet some criteria (specifically checking if the word exists in another list)? having trouble with list comprehensions here

lusty coral Mar 9, 2021, 10:00 PM

#

coral cloak so I'm splitting a string and adding each word to a list like so: ```" ".join(...

Use "in" and list of words maybe?

#

Via if of course

#

.... in ....

coral cloak Mar 9, 2021, 10:02 PM

#

y = " ".join(x).split() if x.split() not in LIST

invalid syntax

tidal bough Mar 9, 2021, 10:02 PM

#

" ".join(word for word in x if ...).split()

#

where ... is your condition.

eager umbra Mar 9, 2021, 10:02 PM

#

Is there someone in here that I can ask a couple pandas questions to? I am trying to figure out how to sum a column based on other column data. Kind of like a Sumifs in excel, but am having a hard time figuring it out.

tidal bough Mar 9, 2021, 10:03 PM

#

word for word in x if ... here is a generator expression.
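
A runnable sketch of the filtering idea, assuming `x` is a list of strings that may each contain several words and `LIST` holds the words to exclude (both are made-up values here). Note that the condition has to be applied to individual words, so the split happens before the filter.

```python
# Made-up inputs for illustration.
x = ["the quick brown fox", "and the lazy dog"]
LIST = ["the", "and"]

# A set makes each membership check cheap.
excluded = set(LIST)

# Split into individual words first, then keep the ones not in LIST.
words = " ".join(x).split()
kept = [word for word in words if word not in excluded]

# The same condition works inside a generator expression passed to join.
joined = " ".join(word for word in words if word not in excluded)

print(kept)
print(joined)
```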

lavish swift Mar 9, 2021, 10:03 PM

#

distant hedge I did try that, it was printing it weird, without column names

I just ran this:

x = northman_df.loc[northman_df['Full Name'].str.contains('^Ale[a-z]*', regex=True)]
print(type(x))
print(x)

And got a dataframe and what I would have expected.

#

Results

#

maybe verify you're reading in the right csv?

#

do a df.head()?

#

just to verify? Weird otherwise

distant hedge Mar 9, 2021, 10:05 PM

#

lavish swift I just ran this: ```py x = northman_df.loc[northman_df['Full Name'].str.contains...

That's what I have as well, however, if you try to further modify output of dataframe of x, it will give you an error.

lavish swift Mar 9, 2021, 10:07 PM

#

distant hedge That's what I have as well, however, if you try to further modify output of data...

You had mentioned when printing x it printed funny for you. I'd try to get that printing right before you move on. It should be a dataframe, and from here it certainly looks like you're doing it right. The only difference I'd mention is the file you uploaded was Old.csv and not old.csv (you have all lowercase in your code). Though that might not be an issue if you simply created a copy to post here?

distant hedge Mar 9, 2021, 10:07 PM

#

distant hedge Mar 9, 2021, 10:08 PM

#

lavish swift You had mentioned when printing `x` it printed funny for you. I'd try to get th...

I actually tried it again and now it prints ok... idk what happened, maybe a bug

lavish swift Mar 9, 2021, 10:08 PM

#

huh...that's different still. The first time it was:
AttributeError: 'list' object has no attribute 'loc'

now it's talking about a 'str' object. Something is funky with x

#

so you're not seeing an error until that line you pointed out, but I think something is happening before that line that is ultimately causing the error

distant hedge Mar 9, 2021, 10:10 PM

#

Odd, because I have no errors if I comment it out

#

and I only have Step 1 and 2 as lines

#

#

coral cloak Mar 9, 2021, 10:14 PM

#

Thanks for the help, guys. Unfortunately, I don't seem to be getting the output I want here (a list of most common words in the data that is not in LIST)

lavish swift Mar 9, 2021, 10:22 PM

#

distant hedge

Ok, I THINK I figured out what was causing your issue? x = x.loc[x['Full Name'] != 0] = 'Alex' This line didn't tell pandas which column you wanted to set the value for, so x became just a string 'Alex'

#

I think what you want is something like this:

x.loc[x['Full Name'] != '', ['Full Name']] = 'Alex'

Where you're setting the full name of each row to "Alex"

#

you can probably even simplify it further since your x dataframe will only be Ale* based on your regex,
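
A sketch of what is going on, on made-up names (only the column name `Full Name` and the two regex patterns come from the code above). In Python, `a = b = value` assigns `value` to `a` first and then to `b`, so `x = x.loc[...] = 'Alex'` rebinds `x` to the string before the `.loc` target is evaluated, which is consistent with the `'list'` and `'str'` errors reported. The two fixes shown are a boolean mask with `.loc` on the original frame and a regex `str.replace`; which one the asker ended up using is not known.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame(
    {"Full Name": ["Alex", "Alexander", "Alejandro", "Jess", "Jennifer", "Maria"]}
)

# Reproduce the problem: chained assignment rebinds x to "Alex" first.
x = df.copy()
try:
    x = x.loc[x["Full Name"] != 0] = "Alex"
except AttributeError as err:
    error_message = str(err)

# Fix 1: build a boolean mask and assign through .loc on the frame itself,
# naming both the rows (mask) and the column.
starts_with_ale = df["Full Name"].str.contains(r"^Ale[a-z]*", regex=True)
df.loc[starts_with_ale, "Full Name"] = "Alex"

# Fix 2: let str.replace substitute the matched part of each value.
df["Full Name"] = df["Full Name"].str.replace(r"^Je[a-z]*", "Jessica", regex=True)

print(error_message)
print(df["Full Name"].tolist())
```

If the column also holds surnames, the `.loc` version overwrites the whole cell while `str.replace` only swaps the matched first name.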

dull path Mar 9, 2021, 10:25 PM

#

distant hedge Mar 9, 2021, 10:27 PM

#

lavish swift I think what you want is something like this: ```py x.loc[x['Full Name'] != '', ...

I have tried this one, but no luck. Getting a different error. I am trying to replace all the names starting with Ale to Alex. I have also found a solution 🙂

distant hedge Mar 9, 2021, 10:28 PM

#

lavish swift I think what you want is something like this: ```py x.loc[x['Full Name'] != '', ...

Error yours gave me

#

Solution

#

lavish swift Mar 9, 2021, 10:29 PM

#

yeah, I had to read in the CSV again, since it was already "Alex"

distant hedge Mar 9, 2021, 10:30 PM

#

I am not sure why our 1st line did not work with .loc but the second with .replace did 😆

distant hedge Mar 9, 2021, 10:30 PM

#

lavish swift yeah, I had to read in the CSV again, since it was already "Alex"

Thank you so much for your help ❤️ It was a challenge for both of us ha ha

lavish swift Mar 9, 2021, 10:31 PM

#

sure thing! Glad ya got it working!

distant hedge Mar 9, 2021, 10:38 PM

#

@lavish swift Thank you, I am trying to automate a report that takes 3-4 hours with pandas. I think this should be my last challenge. This error made me think that we can only use df to call dataframe ha ha.

coral cloak Mar 9, 2021, 10:42 PM

#

coral cloak Thanks for the help, guys. Unfortunately, I don't seem to be getting the output ...

anyone?

serene scaffold Mar 9, 2021, 10:45 PM

#

coral cloak Thanks for the help, guys. Unfortunately, I don't seem to be getting the output ...

so you want to count word frequencies for words that are not in LIST (which is presumably a list of stop words)?
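
If that reading is right, `collections.Counter` covers it. A sketch under that assumption, with made-up text and a made-up `LIST`; lowercasing is an extra choice here, not something from the question.

```python
from collections import Counter

# Made-up inputs for illustration.
texts = ["the cat sat on the mat", "the cat ate the fish"]
LIST = ["the", "on"]

stop_words = set(LIST)

# Split everything into words, lowercased so "The" and "the" count as one.
words = " ".join(texts).lower().split()

# Count only the words that are not in LIST.
counts = Counter(word for word in words if word not in stop_words)

# most_common(n) returns the n highest counts as (word, count) pairs.
top_word = counts.most_common(1)
print(top_word)
```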

#

Ping to reply; I'm going to another channel.

granite wolf Mar 9, 2021, 11:27 PM

#

Anyone got any ideas why my matplotlib graphs are printing like this?

#

bronze skiff Mar 9, 2021, 11:29 PM

#

did you sort by datetime before plotting?

#

looking at it, you didn't

granite wolf Mar 9, 2021, 11:32 PM

#

I did that previously and thought it did it automatically as when i sort manually i get plots like this lemon_thinking

#

#

okay so its sorting dates incorrectly

tidal bough Mar 9, 2021, 11:43 PM

#

granite wolf I did that previously and thought it did it automatically as when i sort manuall...

plt.plot never sorts the inputs, no.

paper lake Mar 9, 2021, 11:44 PM

#

granite wolf

u should sort the dates first with the corresponding values before plotting

granite wolf Mar 9, 2021, 11:45 PM

#

paper lake Mar 9, 2021, 11:46 PM

#

is this covid?

granite wolf Mar 9, 2021, 11:46 PM

#

I think it's the to_datetime part which is converting them wrong

granite wolf Mar 9, 2021, 11:46 PM

#

paper lake is this covid?

yes, trying to learn python and matplotlib

paper lake Mar 9, 2021, 11:47 PM

#

granite wolf I think it's the to_datetime part which is converting them wrong

you need to convert the values as Dates i guess hmm i kinda forgot since i use julia now. lemme check for a bit maybe

granite wolf Mar 9, 2021, 11:47 PM

#

thanks for the help

#

it seems a lot like the to_datetime has converted the dates wrong as clearly the maldives hasnt recorded 17507 cases on the 2nd of december, 9 months from now
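
That symptom would fit day-first date strings being read month-first, which is what `pd.to_datetime` does by default for an ambiguous string like `12/02/2021`. A sketch assuming the raw column really is in `DD/MM/YYYY` form; the rows are made up apart from the 17507 figure mentioned above.

```python
import pandas as pd

# Default parsing treats the first number as the month.
wrong = pd.to_datetime("12/02/2021")

# Made-up rows, assumed to be day/month/year strings.
df = pd.DataFrame({
    "date": ["12/02/2021", "01/03/2021", "25/01/2021"],
    "cases": [17507, 19000, 15000],
})

# An explicit format removes the ambiguity.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# plt.plot draws points in the order given, so sort by date first.
df = df.sort_values("date")

print(wrong)
print(df)
```

After this, `plt.plot(df["date"], df["cases"])` should draw the line left to right in date order.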

paper lake Mar 9, 2021, 11:50 PM

#

@granite wolf just remembered i have a notebook for covid

#

gotta check my drive

#

opening it on google colab now 😄

#

@granite wolf https://colab.research.google.com/drive/1koj6KeoVPBfEUQ_LCXPVsxZZCwcC_Lje?usp=sharing here is mine, i think i referred a tutorial like last time

Google Colaboratory

#

feel free to copy my notebook

granite wolf Mar 9, 2021, 11:54 PM

#

thanks a lot 😀

misty flint Mar 9, 2021, 11:58 PM

#

@paper lake aww so friendly cattohug

#

DoggoKek

paper lake Mar 9, 2021, 11:59 PM

#

misty flint @paper lake aww so friendly cattohug

git rekt rex

misty flint Mar 10, 2021, 12:01 AM

#

ID_BoomKek

lament fiber Mar 10, 2021, 1:44 AM

#

Is this an appropriate channel for asking about pandas, or would that go into a different channel?

misty flint Mar 10, 2021, 1:45 AM

#

yes you can ask here

#

people also ask in #databases

lament fiber Mar 10, 2021, 1:45 AM

#

If this is the correct channel, my question is about the distinction between float64 and Float64. Specifically, looking for any documentation on the latter (and e.g., why .astype("Float64").astype("Int64") works to convert a floating-point value to an integer with no errors or warnings).

#

As you can imagine, it's impossible to Google for Float64, since all the hits are for float64.

#

The only reference I can find in the documentation is here, when it's noted in passing that Int64 will coerce to Float64 if necessary. Except, the text doesn't even acknowledge that this is different from float64, let alone explain the differences. https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html

distant hedge Mar 10, 2021, 1:55 AM

#

Hey guys, I am pulling my hair out. Is there a way to replace all values in a row by a string? All values are unique strings.

velvet thorn Mar 10, 2021, 1:56 AM

#

lament fiber The only reference I can find in the documentation is here, when it's noted in p...

the null value is different

misty flint Mar 10, 2021, 1:56 AM

#

pithink

native patrol Mar 10, 2021, 1:56 AM

#

lament fiber If this is the correct channel, my question is about the distinction between `fl...

the distinction is that series with dtype Float64 is an ExtensionArray

velvet thorn Mar 10, 2021, 1:57 AM

#

nan vs pd.NA

lament fiber Mar 10, 2021, 1:58 AM

#

So an ExtensionArray is an internal pandas thing that's kind of an abstraction of a 1D array, but just for pandas-internal types?

#

Is there a way to do something like .astype("Int64") on a Float64 ExtensionArray and have it fail, instead of returning all the values truncated to an integer? Or do I need to just use a separate thing to test if the values are actually integers vs. floats, and change behavior accordingly?

ripe forge Mar 10, 2021, 2:03 AM

#

Separate thing to test

#

Astype* is an explicit instruction to coerce the values. Expecting it to then break like that would be... Inappropriate.

lament fiber Mar 10, 2021, 2:04 AM

#

From my perspective, the Pythonic idea of duck-typing, "try something and see if it fails rather than testing it" would make it make more sense to fail out if it can't work, rather than changing values.

ripe forge Mar 10, 2021, 2:05 AM

#

!e print(int(3.14))

arctic wedgeBOT Mar 10, 2021, 2:05 AM

#

@ripe forge :white_check_mark: Your eval job has completed with return code 0.

native patrol Mar 10, 2021, 2:05 AM

#

^

ripe forge Mar 10, 2021, 2:05 AM

#

This isn't about something hypothetical or abstract. It's a literal instruction to "give me an int from this".

native patrol Mar 10, 2021, 2:06 AM

#

if you don't care about floating point precision
you can use series.astype('int').eq(series).all()
or use np.isclose
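A minimal sketch of the "separate thing to test" idea from this thread, assuming the goal is a cast that refuses to truncate. The helper name `to_int64_strict` is hypothetical (not a pandas API); it checks that every non-missing value is whole before casting to the nullable `Int64` dtype.

```python
import numpy as np
import pandas as pd


def to_int64_strict(series: pd.Series) -> pd.Series:
    """Cast to nullable Int64, but raise if any non-missing value has a fractional part."""
    values = series.astype("Float64")
    present = values.dropna().to_numpy(dtype="float64")
    # Exact test; swap in np.isclose(present, np.round(present)) to tolerate float noise.
    if not np.all(np.mod(present, 1) == 0):
        raise ValueError("series contains non-integer values")
    return values.astype("Int64")


whole = pd.Series([1.0, 2.0, pd.NA], dtype="Float64")
print(to_int64_strict(whole).dtype)  # Int64

fractional = pd.Series([1.1, 2.1, pd.NA], dtype="Float64")
try:
    to_int64_strict(fractional)
except ValueError as exc:
    print("refused:", exc)
```

The check runs before the cast, so the outcome does not depend on how a particular pandas version handles `Float64` to `Int64` conversion of fractional values.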

misty flint Mar 10, 2021, 2:06 AM

#

pithink

ripe forge Mar 10, 2021, 2:06 AM

#

So if you use a type coercion, the behaviour of just getting the int portion is more practical for normal use*

lament fiber Mar 10, 2021, 2:11 AM

#

Notably, you can't do .astype("float64").astype("Int64") and have it work. So in that case, it doesn't just say "you asked for an Int64, we're giving you one regardless."

#

In that case, you get a TypeError about "cannot safely cast non-equivalent float64 to int64."

#

So if the behavior is supposed to be "astype will always return the requested type, even if that requires unsafe downconversions," it isn't consistent.

#

Some examples (using the Python bot correctly, I hope):

#

!e pd.Series([1.1, 2.1, pd.NA]).astype("Int64")

arctic wedgeBOT Mar 10, 2021, 2:18 AM

#

You are not allowed to use that command here. Please use the #bot-commands channel instead.

lament fiber Mar 10, 2021, 2:20 AM

#

OK, I don't have bot access, and would need to import pandas anyway. Regardless, my examples were going to be that (which fails), that with .astype("float64") in the middle (which fails), and that with .astype("Float64") in the middle (which succeeds). This is a bigger difference between float64 and Float64 than just "one uses np.nan, the other uses pd.NA."

#

And maybe should be documented somewhere, if this is the intended behavior.

distant hedge Mar 10, 2021, 2:22 AM

#

Can someone help me to figure out why when I define variable at df, it converts it into a string?

ripe forge Mar 10, 2021, 2:22 AM

#

It probably is documented somewhere I assume. The only thing I'd add is, this would be pandas specific decisions, and you may find further surprises as you explore the library. Fair heads up.

#

Not all python semantics will be consistently used by pandas. It makes its own set of assumptions

ripe forge Mar 10, 2021, 2:24 AM

#

distant hedge Can someone help me to figure out why when I define variable at df, it converts ...

Huh? You are assigning Alex the string to both the df and to x

#

At that point the variable x has no relation to the df, you assigned Alex to it*

distant hedge Mar 10, 2021, 2:26 AM

#

ripe forge Huh? You are assigning Alex the string to both the df and to x

I am trying to replace all values in column with 'Alex'

ripe forge Mar 10, 2021, 2:29 AM

#

Okay. And did your current code not do it?

#

Just for context, x = blahblah = "Alex" is evaluated as blahblah = "Alex" and then separately x = "Alex" so don't chain assignments like that, that variable x is useless for you.

#

You might as well remove that entire line
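A small sketch of what chained assignment does, using a plain dict as a hypothetical stand-in for the dataframe (the original notebook code is not shown in the log). Python evaluates the right-hand side once and then binds that same object to each target, left to right, so `x` ends up holding the string, never the container.

```python
record = {"Full Name": "Aleksey, Jess"}  # hypothetical stand-in for the dataframe

# The right-hand side is evaluated once, then bound to each target in turn.
x = record["Full Name"] = "Alex"

print(record)   # {'Full Name': 'Alex'}
print(repr(x))  # 'Alex'  (a plain str, not the container)
```

The same holds when the middle target is a dataframe selection: the selection gets updated, and `x` is just `"Alex"`.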

distant hedge Mar 10, 2021, 2:31 AM

#

I am using x because i am working on a large dataframe that has many people and I have to do specific changes to each person's results.

#

In that sense, what I am trying to achieve is to have Alex = filtered data frame by the person that is assigned.
Right now step 1 is to find occurrences when "Ale" is at the beginning of the name and then store it in df.
Second step is to take that df, and replace all column cells by "Alex" instead of "Aleksandu, Michel", "Aleksey, Jess", "Alexxandr", etc.

ripe forge Mar 10, 2021, 2:35 AM

#

All that doesn't matter to me. I'm telling you that the way you wrote the syntax, x is literally the string alex

#

And this is because you explicitly wrote syntax that works that way. Which means x is clearly not what you "hoped" it would be assigned to. There's a mismatch between what you wanted to do and the syntax you wrote.

distant hedge Mar 10, 2021, 2:36 AM

#

But it works without x right below that code

ripe forge Mar 10, 2021, 2:37 AM

#

Yes. It even works with x too. For an explanation of what the syntax did read my earlier message

#

All I'm saying is, you did not put a dataframe in x. I know you wanted to.. But that's not what you wrote for python.

distant hedge Mar 10, 2021, 2:38 AM

#

ahhh I see, had to give it a second look. In that case, is there a way to fix it?

ripe forge Mar 10, 2021, 2:38 AM

#

Yes. Just assign to x separately in a new line

#

Don't use chained assignments.

#

Btw these kinds of errors are called "logical" errors. Very tough to spot because it's essentially valid syntax that doesn't match what you wanted to write. But once you understand that these kinds of issues can happen, it makes it a lot easier to spot one later.

distant hedge Mar 10, 2021, 2:41 AM

#

Thank you Darr. That's a lesson learned. I am still struggling to do what I want. :/

#

I am not sure how else I could overwrite all cells in a column

ripe forge Mar 10, 2021, 2:42 AM

#

Okay. I'll take a guess as to what I think you needed. In cell 3 you're indexing using a boolean array. Save that boolean array separately

#

(ps. I'm on phone so typing code is not really easy. So you'll have to help me out here a bit)

distant hedge Mar 10, 2021, 2:44 AM

#

No worries haha

ripe forge Mar 10, 2021, 2:44 AM

#

So see the part inside the square brackets? Just assign that separately to a variable

#

Maybe give it a name, alex_indexer or whatever name makes sense to you

#

Then cell 4 shouldn't exist, and you can directly use cell 5. That updates the df

#

Afterwards, if you only wanted the rows with df with Alex, you just write x = df[alex_indexer]

#

I assume this is what you wanted.
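The indexer approach described above could look roughly like this. The sample names and the `Score` column are made up for illustration, and `str.startswith("Ale")` is an assumption about how the boolean array in "cell 3" was built.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the log.
df = pd.DataFrame({
    "Full Name": ["Aleksandu, Michel", "Aleksey, Jess", "Alexxandr", "Bob"],
    "Score": [1, 2, 3, 4],
})

# Step 1: save the boolean array on its own.
alex_indexer = df["Full Name"].str.startswith("Ale")

# Step 2: update df directly, in one .loc call (rows, column).
df.loc[alex_indexer, "Full Name"] = "Alex"

# Step 3: only now take the filtered rows, as an independent copy.
x = df[alex_indexer].copy()

print(df)
print(x)
```

Updating through a single `.loc` call on `df` itself avoids the view-versus-copy ambiguity that comes from modifying a filtered subset.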

distant hedge Mar 10, 2021, 2:48 AM

#

I think I have found documentation on what's happening https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

distant hedge Mar 10, 2021, 2:49 AM

#

ripe forge I assume this is what you wanted.

Not quite ha ha. I still appreciate your help.

ripe forge Mar 10, 2021, 2:51 AM

#

Hm. Could you reshare current code and outputs, and then perhaps describe what you needed?

distant hedge Mar 10, 2021, 2:52 AM

#

Sure, just a sec 🙂 I will need to clean it up or I will confuse you.

ripe forge Mar 10, 2021, 2:55 AM

#

Hi jae, we don't allow self promotion or request for jobs on our discord.

misty flint Mar 10, 2021, 2:55 AM

#

!rule6

#

hmm

ripe forge Mar 10, 2021, 2:55 AM

#

Needs a space.

misty flint Mar 10, 2021, 2:55 AM

#

!rule 6

arctic wedgeBOT Mar 10, 2021, 2:55 AM

#

Rules

6. No spamming or unapproved advertising, including requests for paid work. Open-source projects can be shared with others in #python-general and code reviews can be asked for in a help channel.

misty flint Mar 10, 2021, 2:55 AM

#

ah thanks

distant hedge Mar 10, 2021, 2:58 AM

#

@ripe forge

ripe forge Mar 10, 2021, 3:02 AM

#

Ah see, okay. So question, did df get updated so far?

#

Also I don't think your x.index.isin part makes any sense whatsoever.

#

You're doing an isin on x

distant hedge Mar 10, 2021, 3:03 AM

#

Yes, multiple times, since I needed to delete rows and rename some and delete people by age, etc.

ripe forge Mar 10, 2021, 3:03 AM

#

Er no, let me rephrase

#

Does that cell 7 where you update x modify df in the code shown as is? Explicitly the df variable

#

I also think we might need better names for your variables. That can help while talking about it 😅

distant hedge Mar 10, 2021, 3:05 AM

#

ripe forge I also think we might need better names for your variables. That can help while ...

sorry ha ha

#

I've been banging my head against the wall for the past 6hrs with this. My brain is barely alive.

misty flint Mar 10, 2021, 3:07 AM

#

🕯️

distant hedge Mar 10, 2021, 3:07 AM

#

ripe forge Does that cell 7 where you update x modify `df` in the code shown as is? Explici...

correct, but it only works in that file. I have failed to make it run in the main file although they are identical.

ripe forge Mar 10, 2021, 3:08 AM

#

Okay. But yeah this isn't how I'd be updating df

#

I mentioned my approach earlier, using an indexer. But I suppose I'll share the core principle or tip

#

If you want to update a df, you should work on that df directly. Easiest way to avoid headaches

distant hedge Mar 10, 2021, 3:08 AM

#

ripe forge Okay. But yeah this isn't how I'd be updating df

Never mind, I was calling wrong row. It works there too.

ripe forge Mar 10, 2021, 3:09 AM

#

Ah Cool.

#

So you should have all you need for now then, yeah? Your df is being updated properly so I guess you can keep going

distant hedge Mar 10, 2021, 3:11 AM

#

ripe forge So you should have all you need for now then, yeah? Your df is being updated pro...

Yes, you are absolutely right. Everything works. I feel though it's like a house of cards, I am afraid to change anything because it will collapse. I also don't understand how this line works ha ha x.loc[~x.index.isin(x),'Full Name'] = 'Alex' I am not sure where I dug it up.
THANK YOU FOR YOUR HELP :)))))

misty flint Mar 10, 2021, 3:12 AM

#

Praise

ripe forge Mar 10, 2021, 3:13 AM

#

Np! I hate to break it to you, that line is not ideal at all, I'm like 99%sure x.loc['Full Name'] = "Alex" would have worked just fine

#

But yeah 😅

paper lake Mar 10, 2021, 3:13 AM

#

misty flint Praise

~~time to ban u for using a sticka!!~~ hyperlemon

misty flint Mar 10, 2021, 3:13 AM

#

RunFail

#

ah that reminds me

#

i didnt finish working through some pandas exercises from this morning

#

should i do them now or continue to read about stats

#

pithink

#

fun fact: the pandas exercises helped me do well on my dataframes quiz

#

stats it is

#

DoggoKek

distant hedge Mar 10, 2021, 3:17 AM

#

ripe forge Np! I hate to break it to you, that line is not ideal at all, I'm like 99%sure x...

x.loc['Full Name'] = "Alex" was the first thing I tried. It did not work 😦

ripe forge Mar 10, 2021, 3:19 AM

#

If you want, give it a try now for good luck

distant hedge Mar 10, 2021, 3:19 AM

#

When I use x.loc['Full Name'] = "Alex" it renames all the other columns to Alex
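A likely explanation, sketched on made-up data: with a single label, `.loc` treats it as a row label, so `x.loc['Full Name'] = "Alex"` adds a row named `Full Name` and fills every column of it. Selecting the column needs the `[rows, column]` form.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Full Name": ["Aleksey, Jess", "Alexxandr"], "Age": [31, 45]})

by_row = df.copy()
by_row.loc["Full Name"] = "Alex"     # single label = row label: a new row is added

by_col = df.copy()
by_col.loc[:, "Full Name"] = "Alex"  # all rows, one column

print(by_row)
print(by_col)
```

`by_col["Full Name"] = "Alex"` would have the same effect as the second form here.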

ripe forge Mar 10, 2021, 3:19 AM

#

Because I can't take guarantees for the code you had before.

distant hedge Mar 10, 2021, 3:23 AM

#

I will show you with clean code.

misty flint Mar 10, 2021, 3:23 AM

#

Sip

#

good luck

distant hedge Mar 10, 2021, 3:25 AM

#

clean code

ripe forge Mar 10, 2021, 3:32 AM

#

Cool, ty

#

Guess it's just going to still be treated as a chain. Good thing the warning shows up

misty flint Mar 10, 2021, 3:42 AM

#

http://neuralnetworksanddeeplearning.com/

#

anybody seen this before? its an open source book

#

might check out chapter two since it dives more into the math

#

pithink

ripe forge Mar 10, 2021, 3:45 AM

#

Yep it's good

iron basalt Mar 10, 2021, 3:46 AM

#

Or the standard ML book, still very solid: https://www.amazon.com/Pattern-Recognition-Learning-Information-Statistics/dp/0387310738

misty flint Mar 10, 2021, 3:49 AM

#

ValkNaruhodo

iron basalt Mar 10, 2021, 3:49 AM

#

Starts with curve fitting, probability, decision theory, etc.

#

DL should be pretty straight forward after that book.

#

(And the many non-DL things covered too)

misty flint Mar 10, 2021, 3:52 AM

#

i like that it starts with that

#

thanks bud, ill add this one to the list. but i might get the physical copy since it looks pretty promising

misty flint Mar 10, 2021, 4:30 AM

#

oh R has some nice stats functions

#

kannaSus

misty flint Mar 10, 2021, 4:59 AM

#

ive found that the most effective way for me to get through a textbook is the pomodoro technique

#

else i just find my brain doesnt want to keep reading lol

#

DoggoKek

hollow sentinel Mar 10, 2021, 5:00 AM

#

That works very well

misty flint Mar 10, 2021, 5:02 AM

#

BongoCat

#

yeah

#

i used it before

paper lake Mar 10, 2021, 5:02 AM

#

misty flint ive found that the most effective way for me to get through a textbook is the po...

whats pomodoro

#

is it food?

#

lemon_angrysad

misty flint Mar 10, 2021, 5:03 AM

#

but now im doing it more religiously bc of the learning how to learn course

misty flint Mar 10, 2021, 5:03 AM

#

paper lake is it food?

actually, yes

#

DoggoKek

paper lake Mar 10, 2021, 5:03 AM

#

lemon_hyperpleased

misty flint Mar 10, 2021, 5:04 AM

#

anyway

#

you set a timer to work for a certain amount of time

#

completely, no distractions

#

then afterwards in your break

#

you can check your phone, etc.

hollow sentinel Mar 10, 2021, 5:05 AM

#

I like to work on things for 2 hours

#

and then take a break for one hour

misty flint Mar 10, 2021, 5:05 AM

#

@paper lake https://pomofocus.io/

Pomofocus

#

i cant focus for that long

paper lake Mar 10, 2021, 5:05 AM

#

misty flint you can check your phone, etc.

https://red.now.sh

misty flint Mar 10, 2021, 5:05 AM

#

so i do it the traditional way

hollow sentinel Mar 10, 2021, 5:05 AM

#

I might try it that way too

misty flint Mar 10, 2021, 5:05 AM

#

paper lake https://red.now.sh

basically

hollow sentinel Mar 10, 2021, 5:05 AM

#

instead of just endlessly coding

paper lake Mar 10, 2021, 5:07 AM

#

hollow sentinel instead of just endlessly coding

first time coding, i did it every day 7 hours nonstop. and last november, i was burnt out doing that every day. i wish i have found out the pomodoro technique from earlier :((

#

and now i am procrastinating :((

hollow sentinel Mar 10, 2021, 5:07 AM

#

Eh I mean I do take breaks

#

I’m on this server a lot

misty flint Mar 10, 2021, 5:09 AM

#

🕯️

#

im on my break

#

ok im off

#

DoggoKek

paper lake Mar 10, 2021, 5:10 AM

#

same byeeeee

misty flint Mar 10, 2021, 5:36 AM

#

paper lake and now i am procrastinating :((

https://www.coursera.org/learn/learning-how-to-learn

Coursera

Learning How to Learn: Powerful mental tools to help you master tou...

This course gives you easy access to the invaluable learning techniques used by experts in art, music, literature, math, science, sports, ... Enroll for free.

#

check out this course when you have the time. super helpful

paper lake Mar 10, 2021, 5:44 AM

#

@misty flint thanks!! now i can use my freebie lol

autumn veldt Mar 10, 2021, 5:47 AM

#

X = data.iloc[:, 0:-1].values
y = data.iloc[:, 8]
.
.
.
#tree viz and tree text
graph= Source(tree.export_graphviz(clf, feature_names=X.columns,   class_names=True,
                                     filled=True))
display(SVG(graph.pipe(format='svg')))
print('\n')
tree_root = export_text(clf)
print(tree_root)

i got an error when i try to visualize my tree. it says 'numpy.ndarray' object has no attribute 'columns'. im using a dataset with 8 feature columns and 1 target class column.
what should i do for this kind of problem?

lapis sequoia Mar 10, 2021, 6:19 AM

#

autumn veldt ``` X = data.iloc[:, 0:-1].values y = data.iloc[:, 8] . . . #tree viz and tree t...

you are getting values in the first line X = ......(.values)
if you apply values on dataframe it returns numpy nd array.
Numpy arrays don't have columns
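A short sketch of one way around this, on a made-up three-column frame: keep the feature names from the DataFrame before converting, then pass that list wherever names are needed (for example as `feature_names=` in the `export_graphviz` call above).

```python
import pandas as pd

# Hypothetical stand-in for the real dataset (last column is the target).
data = pd.DataFrame({"f1": [1, 2], "f2": [3, 4], "target": [0, 1]})

X = data.iloc[:, 0:-1]             # still a DataFrame, so .columns exists
y = data.iloc[:, -1]

feature_names = list(X.columns)    # save the names first
X_array = X.to_numpy()             # same data as .values: a plain ndarray

print(feature_names)
print(hasattr(X_array, "columns"))
```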

autumn veldt Mar 10, 2021, 6:20 AM

#

im removing .values, but i got this instead

#

actually this problem can be solved if i use `data = data.apply(le.fit_transform)`. the reason why i use iloc instead of apply for encoding my labels is because i want to encode only my predictor variables, not my target variable

eternal fog Mar 10, 2021, 6:32 AM

#

ripe forge Hi jae, we don't allow self promotion or request for jobs on our discord.

Sorry - I deleted.

sand sluice Mar 10, 2021, 6:42 AM

#

is there any way to blur an image using opencv while avoiding specific points?

#

#

i want to blur this without the black bar

#

with a full blur, this happens

misty flint Mar 10, 2021, 6:49 AM

#

pithink

#

interesting

mortal pendant Mar 10, 2021, 7:03 AM

#

Hey! I have a small private Discord bot and I'd like to make an AI (I presume an LSTM RNN based on my research) for generating messages on request based on those from people in my server (to clarify, I'll only be collecting data from people who give consent, likely through a reaction to a message that fully explains it, as the last thing I want to do is run into any legal issues). The problem is, while I've been able to make a simple RNN in the past, the way I've previously done it can't be trained and generate data over time, which I would want for this. I'm struggling to understand LSTM, though, and I find I learn better in practise. So, I'm wondering if any of you's would know any good starting points for this? It's also worth noting I do have a MySQL database, which for live training I assume would be better for saving training data, so if there's anything that would be able to use this, that would be great! So, pretty much, the optimal tutorial or module if one exists that would handle most of this for me would be for training and using a live async-compatible model for text generation that is able to save and use the data to and from a SQL database for efficiency. If anyone knows any good starting points for this, please let me know as anything will help!

lean ledge Mar 10, 2021, 8:17 AM

#

mortal pendant Hey! I have a small private Discord bot and I'd like to make an AI (I presume an...

If you care about getting something working together, you should just import huggingface transformers and fine-tune a GPT2 model

mortal pendant Mar 10, 2021, 8:18 AM

#

I've seen GPT2 before, but didn't think it would work for something like this as I thought that required a starting point, which would be cool as a secondary option but I'm mainly looking for it to be purely based on the data it has

#

Never seen huggingface before so I'll have a look at that later, thanks 👍

lean ledge Mar 10, 2021, 8:19 AM

#

mortal pendant I've seen GPT2 before, but didn't think it would work for something like this as...

You just need to condition the starting point to either the probability distribution of the starting word of whole corpus or condition it on something previously said on discord

mortal pendant Mar 10, 2021, 8:19 AM

#

I've just had a quick look at it since I've got to go in a few minutes, and it does look really good so tysm 😄

lean ledge Mar 10, 2021, 8:20 AM

#

Nw

mortal pendant Mar 10, 2021, 8:23 AM

#

lean ledge You just need to condition the starting point to either the probability distribu...

Oh sorry that's not quite what I'm looking for, if I understand you correctly; would I have to use something like GPT2 if I use huggingface or can huggingface generate text purely based on the input data? I don't want any external sources, and iirc GPT compares the input data to data it has found elsewhere on the internet and adds to the input data based on what it has from the public internet

#

I do have to go now though so I'll have a look later. If you have any more information I should know, please let me know and I'll see it later 👍

lean ledge Mar 10, 2021, 8:24 AM

#

mortal pendant Oh sorry that's not quite what I'm looking for, if I understand you correctly; w...

It would be pretrained on a lot of text data from the internet and then you would fine-tune it on your own data. It's sort of a necessity to do it this way unless you've got massive amounts of data and money to dump on compute.

#

Training it on small datasets like a discord chat would leave it deficient in its ability to generate coherent text that it hasn't already seen before

#

GPT doesn't inherently "compare" anything, it learns relationships between words and the probability distributions of the words that show up

#

To understand language, it needs a lot of data on that language

turbid willow Mar 10, 2021, 8:48 AM

#

#help-peanut pls

grave frost Mar 10, 2021, 8:54 AM

#

agree with raggy, you would have to fine-tune a model @mortal pendant . You could try with a simple Keras model with a transformer block (multi-head attention and a FCN) and judge the output for yourselves. You would immediately notice that the output is not always good (as in the model barely gets even the grammar correct, forget the output).

but all the above points would be invalid if you have a ton of data to train and large GPUs to throw at it

lavish tundra Mar 10, 2021, 10:13 AM

#

someone know if is possible to use pandas to create one column with a list of strings and select one string by the position on the list?

golden turtle Mar 10, 2021, 10:21 AM

#

hi, im having problem with opencv, cuz when i use webcam and make anything with frames the webcam is very laggy
is it because of a pc component? it isnt the worst one

velvet thorn Mar 10, 2021, 10:28 AM

#

sand sluice is there any way to blur an image using opencv while avoiding specific points?

not sure about using openCV

#

but with basic numpy you can mask and apply a blur kernel
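A rough sketch of the mask idea in plain NumPy, assuming a 2-D grayscale image and a boolean mask that marks the pixels to leave alone. `box_blur` and `blur_except` are hypothetical helper names; a call such as `cv2.GaussianBlur` could stand in for `box_blur`.

```python
import numpy as np


def box_blur(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Mean filter over a (2*radius+1)^2 window; edges are padded by replication."""
    size = 2 * radius + 1
    padded = np.pad(image.astype(float), radius, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    # Sum every shifted copy of the image that falls inside the window.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)


def blur_except(image: np.ndarray, protect: np.ndarray, radius: int = 1) -> np.ndarray:
    """Blur everywhere, then restore the original pixels where `protect` is True."""
    blurred = box_blur(image, radius)
    return np.where(protect, image, blurred)


image = np.zeros((5, 5))
image[2, 2] = 9.0
protect = np.zeros((5, 5), dtype=bool)
protect[2, 2] = True

result = blur_except(image, protect)
print(result)
```

This keeps the protected pixels sharp, but their values still bleed into neighbouring blurred pixels. Stopping that as well would need the blur itself to ignore masked pixels, which this sketch does not do.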

arctic wedgeBOT Mar 10, 2021, 10:55 AM

#

Hey @lapis sequoia!

It looks like you tried to attach file type(s) that we do not allow (.pdf). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

gray galleon Mar 10, 2021, 11:03 AM

#

Hey guys, does anyone have a defined dataset/link for Fridge Item Recognition??

iron mango Mar 10, 2021, 11:04 AM

#

Trained my first CNN, how do I figure out its accuracy value? People say " My model has 80% accuracy" how do I find that?

tidal bough Mar 10, 2021, 11:05 AM

#

Well, there's val_accuracy (accuracy on the validation dataset) in your screenshot.

iron mango Mar 10, 2021, 11:05 AM

#

it all has different values for each epoch. How do I find the final accuracy % ?

ripe forge Mar 10, 2021, 11:06 AM

#

The last one is the final one

iron mango Mar 10, 2021, 11:07 AM

#

So I should say that my model has 79% accuracy?
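To make the "last one is the final one" point concrete, here is a sketch using a dict shaped like the `history.history` attribute that Keras `model.fit` returns. The numbers are invented for illustration and are not the values from the screenshot.

```python
# Hypothetical per-epoch metrics, shaped like Keras `history.history`.
history = {
    "accuracy":     [0.55, 0.70, 0.82, 0.90],
    "val_accuracy": [0.58, 0.69, 0.80, 0.79],
}

val_acc = history["val_accuracy"]
final_val = val_acc[-1]                                         # last epoch
best_epoch = max(range(len(val_acc)), key=val_acc.__getitem__)  # 0-based index

print(f"final val_accuracy: {final_val:.0%}")
print(f"best val_accuracy: {val_acc[best_epoch]:.0%} at epoch {best_epoch + 1}")
```

For a number to quote, running `model.evaluate` on a held-out test set that was not used during training is usually the safer choice than a validation figure.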

grave frost Mar 10, 2021, 11:08 AM

#

iron mango Trained my first CNN, how do I figure out its accuracy value? People say " My mo...

its overfittin'

#

if you train it for more epochs, it would get to 100% accuracy

iron mango Mar 10, 2021, 11:10 AM

#

oh...

#

how do I fix that?

#

Feature Extraction part

CNNmodel.add(tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(200,200, 3)))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.BatchNormalization())
CNNmodel.add(tf.keras.layers.Conv2D(16, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Conv2D(32, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Conv2D(64, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Conv2D(64, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))

Neural Network - For classification

CNNmodel.add(tf.keras.layers.Flatten())
CNNmodel.add(tf.keras.layers.Dense(512, activation='relu'))
CNNmodel.add(tf.keras.layers.Dropout(0.7))
CNNmodel.add(tf.keras.layers.Dense(128, activation='relu'))
CNNmodel.add(tf.keras.layers.Dropout(0.5))
CNNmodel.add(tf.keras.layers.Dense(64, activation='relu'))
CNNmodel.add(tf.keras.layers.Dropout(0.3))
CNNmodel.add(tf.keras.layers.Dense(4,activation='softmax'))

#

I am trying for atleast 85%> accuracy

grave frost Mar 10, 2021, 11:36 AM

#

you had to put dropout on the conv layer too 🙂 and reduce the maximum dropout from 0.7 to something more like 0.4 or 0.3 or you would restrict the network and create a bottleneck

deft dawn Mar 10, 2021, 11:38 AM

#

How to build webgis with machine learning? I need source code for extracting unstructured data into a database
#internals-and-peps

grave frost Mar 10, 2021, 11:39 AM

#

deft dawn How to build webgis with machine learning? Iam need source code to extracting un...

just curious, what is a webgis?

untold cove Mar 10, 2021, 11:43 AM

#

lavish tundra someone know if is possible to use pandas to create one column with a list of st...

Yes it’s possible

grave frost Mar 10, 2021, 12:55 PM

#

Forgive me if I am wrong, but isn't AUC a metric to maximize? if that's the case, then you should let your model run some more

#

TBH, your AUC looks kinda rocky. what set is it evaluated on? and what is your train_acc and val_acc? @pure pond

mystic orchid Mar 10, 2021, 1:01 PM

#

Hi guys, please help with some ideas for cv project

grave frost Mar 10, 2021, 1:01 PM

#

mystic orchid Hi guys, please help with some ideas for cv project

project for cv or using cv2?

mystic orchid Mar 10, 2021, 1:01 PM

#

grave frost project for cv or using cv2?

for cv(computer vision)

grave frost Mar 10, 2021, 1:03 PM

#

vehicle detection, vehicle number plate detection, counting people etc.

mystic orchid Mar 10, 2021, 1:08 PM

#

thanks)

serene scaffold Mar 10, 2021, 1:30 PM

#

lavish tundra someone know if is possible to use pandas to create one column with a list of st...

>>> import pandas as pd
>>> df = pd.DataFrame()
# Create a column named 'strings' with these values in each row
>>> df['strings'] = ['hello', 'i', 'like', 'python']
>>> df
  strings
0   hello
1       i
2    like
3  python
# select a location (loc) by index (i), in this case row 2 column 0
>>> df.iloc[2, 0]
'like'
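If the question instead meant that each cell holds its own list of strings, a sketch along these lines might fit. The column names are made up; the `.str[...]` accessor indexes into each row's list.

```python
import pandas as pd

# Hypothetical frame where every cell of 'words' is a list of strings.
df = pd.DataFrame({"words": [["hello", "i"], ["like", "python"]]})

# Pick the element at position 1 from each row's list.
df["second"] = df["words"].str[1]

print(df)
```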

candid sable Mar 10, 2021, 1:32 PM

#

anyone familiar with image labelling for face detection? I want to do a similar style of labelling for the shape of a bone and I'm having trouble finding resources/software

serene scaffold Mar 10, 2021, 1:34 PM

#

I don't have personal experience with that, but I can take a look if you'd like

misty flint Mar 10, 2021, 1:39 PM

#

did someone tag me?

#

pithink

serene scaffold Mar 10, 2021, 1:47 PM

#

misty flint did someone tag me?

are you aware of the inbox feature in the discord client? otherwise you might have been ghost pinged.

misty flint Mar 10, 2021, 2:50 PM

#

think it was a ghost ping

#

kannaSus

safe tapir Mar 10, 2021, 3:06 PM

#

maybe meta for this chat:
can anyone talk to how they use Python with Julia, places where the latter might be a good drop-in replacement, etc?

misty flint Mar 10, 2021, 3:06 PM

#

if foxxy was here, they could probs answer that

#

before sorting

#

after sorting

#

DoggoKek

hollow sentinel Mar 10, 2021, 3:14 PM

#

interesting

misty flint Mar 10, 2021, 3:26 PM

#

the lesson here is if your matplotlib plot looks funky

#

its probably not sorted

#

blobhyperthink

mortal pendant Mar 10, 2021, 3:45 PM

#

lean ledge Training it on small datasets like a discord chat would leave it deficient in it...

Well, I’ve done something similar but it wasn’t live: I simply used the bot to put loads of messages into a JSON file and then trained textgenrnn on the messages, even filtering them down loads too to avoid really short messages and messages containing links or embeds and what not, and still ended up with pretty good data, and this was ages ago when my Discord server was pretty much just created. So I would only imagine it would be better now

#

Though for that I was processing it on a Google Colab notebook, so it might have scored better due to performance, but I wasn’t using the GPUs and the VPS I’m using for the bot isn’t too bad performance wise, though it doesn’t have GPUs

lapis sequoia Mar 10, 2021, 3:53 PM

#

Hey guys anyone has a good formula for number of bins that I should use in a histogram?
Currently doing this:
int(math.sqrt(len(df)))
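For comparison, a sketch of three common rules of thumb side by side: the square-root rule used above, Sturges' rule, and the Rice rule. The function names are just labels for this example. NumPy's `histogram_bin_edges` and matplotlib's `hist` also accept `bins="auto"` if picking by hand is not required.

```python
import math


def sqrt_bins(n: int) -> int:
    """Square-root rule, as in the snippet above."""
    return int(math.sqrt(n))


def sturges_bins(n: int) -> int:
    """Sturges' rule: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def rice_bins(n: int) -> int:
    """Rice rule: ceil(2 * n^(1/3))."""
    return math.ceil(2 * n ** (1 / 3))


n = 500  # hypothetical number of rows
print(sqrt_bins(n), sturges_bins(n), rice_bins(n))
```

None of these looks at the spread of the data, only at the row count, so it is worth eyeballing the result.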

iron mango Mar 10, 2021, 3:55 PM

#

Feature Extraction part

CNNmodel.add(tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(200,200, 3)))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.BatchNormalization())
CNNmodel.add(tf.keras.layers.Dropout(0.4))
CNNmodel.add(tf.keras.layers.Conv2D(16, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Dropout(0.4))
CNNmodel.add(tf.keras.layers.Conv2D(32, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Dropout(0.4))
CNNmodel.add(tf.keras.layers.Conv2D(64, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Dropout(0.3))
CNNmodel.add(tf.keras.layers.Conv2D(64, (3, 3), activation= 'relu'))
CNNmodel.add(tf.keras.layers.MaxPooling2D(2,2))
CNNmodel.add(tf.keras.layers.Dropout(0.3))

Neural Network - For classification

CNNmodel.add(tf.keras.layers.Flatten())
CNNmodel.add(tf.keras.layers.Dense(128, activation='relu'))
CNNmodel.add(tf.keras.layers.Dropout(0.4))
CNNmodel.add(tf.keras.layers.Dense(64, activation='relu'))
CNNmodel.add(tf.keras.layers.Dropout(0.3))
CNNmodel.add(tf.keras.layers.Dense(4,activation='softmax'))

#

I am getting weird spikey loss graphs for this.... any suggestions please??

mortal pendant Mar 10, 2021, 4:00 PM

#

lean ledge Training it on small datasets like a discord chat would leave it deficient in it...

Actually, to get a good idea of how big my dataset is, I'll quickly make my bot count all the messages just from people who have already given permission from when I was using textgenrnn. There will actually end up being much more than that since there's a lot more server members now who might be interested, but should provide a reasonable metric

#

It's taking a while but it's already at 50000 which sounds like a fairly large dataset to me 😅 To clarify, though, that is before filtering so I would likely actually use a lot less of this

hoary wigeon Mar 10, 2021, 4:12 PM

#

hello

#

I fetched tabular data (COVID-19 WORLDOMETER) from html file.
I have created a dataframe using the data
I want to change the index names. How can I do that?

dark lake Mar 10, 2021, 4:18 PM

#

Hi. Is there an inference model in e.g. scikit-learn I can use to classify a single variable into groups?

grave frost Mar 10, 2021, 4:48 PM

#

iron mango I am getting weird spikey loss graphs for this.... any suggestions please??

too much dropout 😅 the model is losing its ability to map your data sufficiently because your dropout is too aggressive.

exotic maple Mar 10, 2021, 5:02 PM

#

hoary wigeon > I fetched tabular data (COVID-19 WORLDOMETER) from html file. > I have created...

what do you mean by changing the name index?

#

you mean setting a column as an index?

hoary wigeon Mar 10, 2021, 5:03 PM

#

exotic maple Mar 10, 2021, 5:03 PM

#

you can use the set_index() method

hoary wigeon Mar 10, 2021, 5:03 PM

#

i meant rename columns

#

before it was having name containing commas and slash

exotic maple Mar 10, 2021, 5:03 PM

#

if you want to rename ALL columns I suggest you just pass a list of names directly to the attribute

#

df.columns = [COLUMNS here]

hoary wigeon Mar 10, 2021, 5:03 PM

#

yeah i passed dict

exotic maple Mar 10, 2021, 5:04 PM

#

if you want to rename a few columns

#

you can do it via rename method

hoary wigeon Mar 10, 2021, 5:04 PM

#

d = pd.read_html(html_filename)
df = pd.DataFrame(d[0])
i=0
col_r_name = {}
for col_name in df.columns:
    col_r_name[col_name] = 'data_'+str(i)
    i+=1
df_new = df.rename(columns=col_r_name)
print(df_new)

exotic maple Mar 10, 2021, 5:04 PM

#

df.rename(columns={"column old name":"column new name"})

hoary wigeon Mar 10, 2021, 5:04 PM

#

yeah

hoary wigeon Mar 10, 2021, 5:04 PM

#

exotic maple df.rename(columns={"column old name":"column new name"})

i used this method

exotic maple Mar 10, 2021, 5:05 PM

#

why a for loop though? remember that pandas is vectorized, you can broadcast all the new names at once

hoary wigeon Mar 10, 2021, 5:05 PM

#

?

#

i didn't understand what u just said

#

i used for loops for storing columns old and new names

exotic maple Mar 10, 2021, 5:06 PM

#

for col_name in df.columns:
    col_r_name[col_name] = 'data_'+str(i)
    i+=1
df_new = df.rename(columns=col_r_name)

hoary wigeon Mar 10, 2021, 5:06 PM

#

oh

#

yeah

exotic maple Mar 10, 2021, 5:06 PM

#

i think thats ok to generate but the last line is inefficient

#

you can just broadcast it at the end

hoary wigeon Mar 10, 2021, 5:07 PM

#

oh lemme try

#

wait

#

i did the same

exotic maple Mar 10, 2021, 5:08 PM

#

You could do this perhaps (I havent tried dictionary comprehension in ages)

newcols = {column:column+"_" for column in list(df.columns)}

df.rename(columns=newcols)
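
For the `data_0`, `data_1`, ... naming from earlier, the loop and counter can be folded into one dict comprehension with `enumerate`. A small sketch on a made-up frame; note that `rename` returns a new DataFrame, so the result has to be assigned.

```python
import pandas as pd

# Made-up sample frame standing in for the scraped table
df = pd.DataFrame({"Country,/Other": ["A"], "Total/Cases": [10]})

# Map each old name to data_<position>
col_r_name = {old: f"data_{i}" for i, old in enumerate(df.columns)}

# rename() does not modify df unless inplace=True is passed
df_new = df.rename(columns=col_r_name)

print(df_new.columns.tolist())
```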

#

fk lol

hoary wigeon Mar 10, 2021, 5:08 PM

#

haha

#

hello @exotic maple

#

can we convert all data to lowercase ?

#

at once

exotic maple Mar 10, 2021, 5:10 PM

#

just throw .lower()

hoary wigeon Mar 10, 2021, 5:10 PM

#

what about numeric ?

#

no errors ?

exotic maple Mar 10, 2021, 5:10 PM

#

...that doesnt make any sense lol

#

ehm

hoary wigeon Mar 10, 2021, 5:10 PM

#

ohk

exotic maple Mar 10, 2021, 5:11 PM

#

let me think

#

you can probably do it via apply

#

but you'd need to define a new function to just get all the "object" columns

#

apply .lower()

#

and then update the values

hoary wigeon Mar 10, 2021, 5:11 PM

#

i will use column name = column.lower()

exotic maple Mar 10, 2021, 5:11 PM

#

you can also be a lazy fuck and do it like this too, by column lmao

#

df[column] = df[column].apply(lambda x: x.lower())

hoary wigeon Mar 10, 2021, 5:12 PM

#

yeah

exotic maple Mar 10, 2021, 5:12 PM

#

ideally you'd define a real function for it though

hoary wigeon Mar 10, 2021, 5:13 PM

#

exotic maple df[column] = df[column].apply(lambda x: x.lower())

this will do
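
To lowercase every text column at once and leave numeric columns alone (the "what about numeric?" worry above), one option is to pick out the string columns first and use the `.str` accessor. A sketch on made-up data:

```python
import pandas as pd

# Made-up sample: two text columns and one numeric column
df = pd.DataFrame({
    "Country": ["USA", "India"],
    "Cases": [100, 200],
    "Region": ["North America", "Asia"],
})

# Keep only the columns that hold strings
text_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]

# .str.lower() works on the whole column, no lambda needed
for c in text_cols:
    df[c] = df[c].str.lower()

print(df)
```

Calling `x.lower()` through `apply` on a numeric column would raise an `AttributeError`, which is why the numeric columns are skipped here.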

sand sluice Mar 10, 2021, 5:23 PM

#

velvet thorn but with basic `numpy` you can mask and apply a blur kernel

Not sure what you mean, if you mask wouldn’t the blur still use the black values for nearby values in the mask

untold viper Mar 10, 2021, 5:47 PM

#

how do i perform canny edge detection on processed_img
the commented code for edge detection giving me error pip-req-build-wvn_it83\opencv\modules\imgproc\src\canny.cpp:829: error: (-215:Assertion failed) _src.depth() == CV_8U in function 'cv::Canny'
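
The assertion `_src.depth() == CV_8U` means `cv::Canny` was given an image that is not 8-bit unsigned. A hedged sketch of one way to convert before calling it, using only NumPy; `to_uint8` is a hypothetical helper, the input array is a stand-in for `processed_img`, and the Canny thresholds in the comment are placeholders.

```python
import numpy as np

def to_uint8(img):
    """Rescale an array to the 0-255 range and cast it to uint8."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:  # flat image: avoid dividing by zero
        return np.zeros(img.shape, dtype=np.uint8)
    return ((img - lo) / (hi - lo) * 255).round().astype(np.uint8)

# Stand-in for processed_img: a float image, which Canny would reject
processed_img = np.linspace(0, 1, 16, dtype=np.float32).reshape(4, 4)
img8 = to_uint8(processed_img)

# With OpenCV installed, the 8-bit image can then be passed on, e.g.:
# edges = cv2.Canny(img8, 100, 200)
print(img8.dtype)
```

If the values are already in the 0-255 range, a plain `processed_img.astype(np.uint8)` avoids the contrast stretch that min-max rescaling introduces.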

iron mango Mar 10, 2021, 5:49 PM

#

InvalidArgumentError: Can not squeeze dim[1], expected a dimension of 1, got 4
[[node Squeeze (defined at <ipython-input-65-bdcef4f4c42e>:1) ]] [Op:__inference_train_function_75365]

Function call stack:
train_function

#

How do I sort this out?? *

serene scaffold Mar 10, 2021, 5:54 PM

#

iron mango InvalidArgumentError: Can not squeeze dim[1], expected a dimension of 1, got 4 ...

are you sure there's not more to the error than that? I would ask you to share more context, preferably using our paste bin

#

!paste

arctic wedgeBOT Mar 10, 2021, 5:54 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

distant hedge Mar 10, 2021, 6:02 PM

#

Guys, I am geeking out on Regular Expressions (`import re`). If you are new to Python like me, I think you will enjoy it sooo much in any code you write. It will become a default import for me for anything I work on. 😄

serene scaffold Mar 10, 2021, 6:07 PM

#

distant hedge Guys, I am geeking out on Regular Expressions ```import re``` . If you are new t...

regular expressions are bae af

#

what are you working on where you're finding them helpful?

iron mango Mar 10, 2021, 6:13 PM

#

#

only this step is causing d error, the rest of the CNN training part went smoothly

#

this was the second last step of data augmentation

serene scaffold Mar 10, 2021, 6:27 PM

#

iron mango

if you can copy and paste the text into the paste bin, I might be able to help.

arctic wedgeBOT Mar 10, 2021, 6:29 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold Mar 10, 2021, 6:30 PM

#

There's something about six frames though? I don't know what that means.

iron mango Mar 10, 2021, 6:33 PM

#

history2 = CNNmodel.fit(train_dataset, validation_data=test_dataset, epochs = 15)
Epoch 1/15
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-96-9df2d3960108> in <module>()
----> 1 history2 = CNNmodel.fit(train_dataset, validation_data=test_dataset, epochs = 15)

6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

InvalidArgumentError:  Can not squeeze dim[1], expected a dimension of 1, got 4
     [[node Squeeze (defined at <ipython-input-65-bdcef4f4c42e>:1) ]] [Op:__inference_train_function_75365]

Function call stack:
train_function

misty flint Mar 10, 2021, 6:56 PM

#

distant hedge Guys, I am geeking out on Regular Expressions ```import re``` . If you are new t...

i wish i was better at regex

#

Kermit_KMS

sonic raft Mar 10, 2021, 7:10 PM

#

Hi guys! Is there any important difference between those two Pytorch tensor mean functions?

tensor.mean()

vs

tensor.mean((-1,-2))

?

tidal bough Mar 10, 2021, 7:13 PM

#

sonic raft Hi guys! Is there any important difference between those two Pytorch tensor mean...

The latter passes a dim tuple - which dimensions to take a mean over (and therefore reduce them away).

#

-1 and -2 means the last 2 dimensions.

grave frost Mar 10, 2021, 7:14 PM

#

iron mango ```py history2 = CNNmodel.fit(train_dataset, validation_data=test_dataset, epoch...

expand the frames to see the full error. The error is pretty clear - a Squeeze op expected dim[1] to have size 1 but got size 4, so some tensor (often the labels) has a shape the model or loss doesn't expect. check your shapes with model.summary() to see what shapes are being passed for layers and check/reshape your data accordingly
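
To illustrate what "Can not squeeze dim[1], expected a dimension of 1, got 4" is complaining about, here is a NumPy stand-in (not TensorFlow itself). It assumes one common cause, which the traceback above does not confirm: one-hot labels with 4 classes being fed to something that expects one integer class id per sample.

```python
import numpy as np

# Assumed situation: 5 samples, 4 classes, labels stored one-hot -> shape (5, 4)
one_hot = np.eye(4)[[0, 2, 1, 3, 3]]

# Squeezing axis 1 only works when that axis has size 1, so this fails
try:
    np.squeeze(one_hot, axis=1)
    squeezed_ok = True
except ValueError:
    squeezed_ok = False

# Turning one-hot rows into integer class ids gives one value per sample
class_ids = one_hot.argmax(axis=1)

print(one_hot.shape, squeezed_ok, class_ids)
```

Under that assumption the fix is either to convert the labels as above or to switch to a loss and metric that accept one-hot labels.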

tidal bough Mar 10, 2021, 7:14 PM

#

If your tensor has more than 2 dimensions, these are very different - as in, the former will always produce a scalar, whereas the latter will produce a tensor with the same dimensions as the input except the last 2 dimensions(which are averaged over).

sonic raft Mar 10, 2021, 7:15 PM

#

tidal bough If your tensor has more than 2 dimensions, these are very different - as in, the...

What if my tensor has 3 dimensions? Both of them (my code examples) are supposed to produce a scalar.

tidal bough Mar 10, 2021, 7:16 PM

#

sonic raft What my tensor has 3 dimensions? Both of them(my code examples) are supposed to ...

The latter would produce a 1d tensor, I'm pretty sure.

#

Like, if your tensor is 5,10,20, the former would average over all 1000 cells and return a scalar, whereas the latter would return a 1d tensor of shape (5,) - each cell in it the mean of 200 cells of the original tensor

#

like, means_tensor[i] would be equal to torch.mean(tensor[i,:,:])
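
A quick check of the shapes described above, on a made-up 5 x 10 x 20 tensor:

```python
import torch

# Made-up tensor: think of it as 5 "images" of 10 x 20 pixels
t = torch.arange(5 * 10 * 20, dtype=torch.float32).reshape(5, 10, 20)

overall = t.mean()            # averages all 1000 cells -> 0-d tensor (scalar)
per_image = t.mean((-1, -2))  # averages the last two dims -> shape (5,)

print(overall.shape, per_image.shape)
# Each entry equals the mean of the matching 10 x 20 slice
print(torch.allclose(per_image[0], t[0].mean()))
```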

grave frost Mar 10, 2021, 7:20 PM

#

tidal bough Like, if your tensor is `5,10,20`, the former would average over all 1000 cells ...

so former 1D and latter 2D?

tidal bough Mar 10, 2021, 7:21 PM

#

WDYM?

#

3d tensor got averaged over 2 last dimensions and became 1d.

sonic raft Mar 10, 2021, 7:23 PM

#

tidal bough Like, if your tensor is `5,10,20`, the former would average over all 1000 cells ...

I think I understand, thank you!
Because in my case: I have a cube tensor, each matrix represents an image, and I need the average of the pixels for each image.
So, if I do it like (-1,-2) It should be fine 😄

tidal bough Mar 10, 2021, 7:33 PM

#

yeah, precisely. You'd go from, say, 50,1920,1080 (50 one-channel (grayscale, presumably) images of 1920x1080 pixels) to 50, - each element being an average of all pixels in that image

lapis sequoia Mar 10, 2021, 8:52 PM

#

hey guys, anyone has a clue how to set up pycharm pro, jupyter notebook, in a way that it is cell based run? so far there is an editor on the left side and preview on the right side

tidal bough Mar 10, 2021, 8:55 PM

#

lapis sequoia hey guys, anyone has a clue how to set up pycharm pro, jupyter notebook, in a wa...

The only confusing part I recall is that you need to poke around in the dropdown menu in the bar at the top to actually launch the Jupyter server

#

then you can run stuff

lapis sequoia Mar 10, 2021, 8:56 PM

#

it isn't about running thing

#

it is about the editor's style

#

#

hate this so much, so confusing

#

@tidal bough

tidal bough Mar 10, 2021, 8:58 PM

#

you can toggle what's shown, IIRC

#

for example, only show the output in the previews

lapis sequoia Mar 10, 2021, 8:59 PM

#

the thing is i want it to look and feel like jupyter, the only reason i don't use jupyter is because of the awful completion recommendation and no helper preview

#

whereas vs code still has that, not that great either, so far nothing beats pycharm, but it is so aesthetically confusing

tidal bough Mar 10, 2021, 9:00 PM

#

jupyter does have some autocompletion and doc viewing, but yeah, not the best

lapis sequoia Mar 10, 2021, 9:00 PM

#

jedi is trash my man hahah

#

so i'm ready to drop some cash if they actually work

#

I have 0 clue how did he make it like this

tidal bough Mar 10, 2021, 9:02 PM

#

Like what, specifically?

#

Like, the colors?

lapis sequoia Mar 10, 2021, 9:02 PM

#

nah cell based editing

#

if you look at mine i have editor on the left side and preview on the right

tidal bough Mar 10, 2021, 9:02 PM

#

I see

lapis sequoia Mar 10, 2021, 9:02 PM

#

i just want it all to be the same thing, but no clue

tidal bough Mar 10, 2021, 9:02 PM

#

lemme open up Pycharm, if I have enough memory for that lol

lapis sequoia Mar 10, 2021, 9:03 PM

#

you have the same as that one or you just don't use pycharm for that?

tidal bough Mar 10, 2021, 9:03 PM

#

nah, I'm going to see if I can figure out how to make it this way

#

it's probably quite a good idea - I was using the two-screens way

lapis sequoia Mar 10, 2021, 9:04 PM

#

like vs is cool, but the amount of bugs and speed is not on par with pycharm

tidal bough Mar 10, 2021, 9:04 PM

#

for me VSCode just doesn't have some of PyCharm's features

#

notably: showing the contents of numpy arrays(Scientific view or whatever) and the profiler which shows the results as a graph

lapis sequoia Mar 10, 2021, 9:05 PM

#

same i really tried to make vsc work, but man it doesn't cut it

tidal bough Mar 10, 2021, 9:07 PM

#

on the other hand, PyCharm just loaded for me.

#

more than 5 minutes of loading time. This is why I don't use it often 🙂

lapis sequoia Mar 10, 2021, 9:08 PM

#

i guess will require another year to index everything? 🤣

#

i just start it in the morning

#

get my coffee

#

go and take a shower

#

then wait for another 10 minutes for it to load

tidal bough Mar 10, 2021, 9:14 PM

#

lapis sequoia I have 0 clue how did he make it like this

I wonder if that's Pycharm

#

maybe it's Jupyter Lab or something, lol

lapis sequoia Mar 10, 2021, 9:15 PM

#

https://www.datacamp.com/community/tutorials/data-science-python-ide

DataCamp Community

Top 5 Python IDEs For Data Science

The best Python IDEs for data science that make data analysis and machine learning easier!

#

yeah seems to be no answer

#

such a shame was looking forward to it

distant hedge Mar 10, 2021, 9:29 PM

#

Do you mean this?

#

That's Jupyter

lapis sequoia Mar 10, 2021, 9:30 PM

#

distant hedge Do you mean this?

nah, try typing np. and then wait for the fill

#

sometimes it works

#

sometimes it doesn't

#

i read that jedi and some other completion helpers in jupyter struggle with indexing

#

@distant hedge

distant hedge Mar 10, 2021, 9:31 PM

#

lapis sequoia Mar 10, 2021, 9:32 PM

#

your works, mine struggles

distant hedge Mar 10, 2021, 9:32 PM

#

@lapis sequoia did you make sure to import it first?

#

Also, are you pressing tab?

lapis sequoia Mar 10, 2021, 9:33 PM

#

yeah did all that, it is just inconsistent, which i hate, even vs code sometimes struggles

#

#

like 0 completion recommendations

distant hedge Mar 10, 2021, 9:33 PM

#

What are you using?

lapis sequoia Mar 10, 2021, 9:33 PM

#

vscode

#

are you on a notebook or labs?

distant hedge Mar 10, 2021, 9:34 PM

#

labs

lapis sequoia Mar 10, 2021, 9:34 PM

#

what's your completion helper?

#

jedi?

distant hedge Mar 10, 2021, 9:34 PM

#

None

#

I didn't install any, but I think by default it's jedi

lapis sequoia Mar 10, 2021, 9:35 PM

#

yeah default is jedi

#

welp be thankful yours works now hahah

distant hedge Mar 10, 2021, 9:36 PM

#

Ha ha ha I haven't had any issues so far. 🙂 Are you using ctrl+space on vscode?

#

Or maybe you are using Kite

lapis sequoia Mar 10, 2021, 9:38 PM

#

yeah it is fine, it works most of the times so i'm happy, usually what i do

#

NO

#

NOW AYYYYYY kite

distant hedge Mar 10, 2021, 9:38 PM

#

I have decided against Kite, I hate the damn thing. So many popups it's unreasonable.

lapis sequoia Mar 10, 2021, 9:38 PM

#

that thing is so over priced

#

10 euros a month for a god damn completion assistant? ffffffff that

#

i mean if it was 20 euros a year sure

#

but 120 euros a year for a

#

wait no it is 140 euros

#

ffff that even harder

grave frost Mar 10, 2021, 10:33 PM

#

kite is so much better IMO: https://www.kite.com/integrations/kite-vs-tabnine/

Code Faster with Kite

Caelan

Kite - Free AI Coding Assistant and Code Auto-Complete Plugin

Code faster with Kite’s AI-powered autocomplete plugin for over 16 programming languages and 16 IDEs, featuring Multi-Line Completions. Works 100% locally.

#

But honestly, once you get used to the lack of autocompletion in Jupyter, it isn't as bad as you think. I just use it to complete long variable names. apart from that, not much is required tbh

#

you could use TabNine tho

#

I would say that - I was involved in a part of the early development of kite (just an external person who had a solution to x problem) and they basically ripped off my solution that I provided on the expectation that I would be compensated adequately. All they gave was a fuckin mug and shirt whose customs+shipping I was supposed to pay. Their whole company is built on mistrust and un-sporting practices. not even a certificate to put on my CV

twilit pilot Mar 10, 2021, 10:41 PM

#

Does anyone have any NLP related project ideas? Idc what level

lavish tundra Mar 10, 2021, 10:44 PM

#

serene scaffold ```py >>> import pandas as pd >>> df = pd.DataFrame() # Create a column named 's...

i mean all in one line like the column string on the index 0 had ('hello', 'i', 'like', 'python')

serene scaffold Mar 10, 2021, 10:45 PM

#

lavish tundra i mean all in one line like the column string on the index 0 had ('hello', 'i', ...

so you want a Python list to be an element that occupies one cell of a dataframe, and you do not want to create a new column?

lavish tundra Mar 10, 2021, 10:46 PM

#

ye, but i want create a new col

#

like column strings, on the index 0 had a list, on the index 1 another list...

serene scaffold Mar 10, 2021, 10:49 PM

#

lavish tundra ye, but i want create a new col

>>> import pandas as pd
>>> df = pd.DataFrame()
>>> df['string_lists'] = [['hello', 'goodbye'], ['python', 'java']]
>>> df
       string_lists
0  [hello, goodbye]
1    [python, java]

lavish tundra Mar 10, 2021, 10:51 PM

#

i was trying something like this

#

but looks like i should sum line by line?

#

the closest to a list i could get was to put , between the strings

serene scaffold Mar 10, 2021, 11:23 PM

#

@lavish tundra can you post an example of the desired output as text, and a sample of the csv as text?

candid sable Mar 10, 2021, 11:24 PM

#

serene scaffold I don't have personal experience with that, but I can take a look if you'd like

Sorry for taking so long to reply and thank you for the timely response. That would be highly appreciated!

serene scaffold Mar 10, 2021, 11:25 PM

#

candid sable Sorry for taking so long to reply and thank you for the timely response. That wo...

Are you familiar with torch vision?

candid sable Mar 10, 2021, 11:26 PM

#

I'm an extreme beginner in this field, only used Keras/TF, so no

serene scaffold Mar 10, 2021, 11:26 PM

#

It's like Wanda vision, in the sense that it's a thing. They don't actually have anything in common

lapis sequoia Mar 10, 2021, 11:27 PM

#

grave frost I would say that - I was involved in a part of the early development of kite (ju...

lol so they went thank you and also fuck you 🤣

grave frost Mar 10, 2021, 11:27 PM

#

lapis sequoia lol so they went thank you and also fuck you 🤣

yep. I was too naive 😦 but at least I got some experience dealing with "adults"

velvet thorn Mar 10, 2021, 11:29 PM

#

sand sluice Not sure what you mean, if you mask wouldn’t the blur still use the black values...

not necessarily?

serene scaffold Mar 10, 2021, 11:29 PM

#

Anyway, pytorch is another library for deep learning, and torch vision supports computer vision. I think facial recognition is a common use case for learning how to use it

velvet thorn Mar 10, 2021, 11:29 PM

#

the point of applying a blurring kernel manually is that you can customise it
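
One way to customise the kernel so black pixels outside the mask do not bleed into the result is a normalised (masked) blur: blur `image * mask` and the mask itself with the same kernel, then divide. This is a sketch of that idea with a plain box filter in NumPy; `box_blur` and `masked_blur` are hypothetical helper names, and this is not necessarily what was meant above.

```python
import numpy as np

def box_blur(img, k=3):
    """k x k box filter with zero padding (k must be odd)."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="constant")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    # Sum the k*k shifted copies of the image, then average
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def masked_blur(img, mask, k=3):
    """Blur using only pixels inside the mask as neighbours."""
    m = mask.astype(float)
    num = box_blur(img * m, k)  # sum of masked neighbours (scaled)
    den = box_blur(m, k)        # how many neighbours were inside the mask (scaled)
    out = np.zeros(img.shape, dtype=float)
    np.divide(num, den, out=out, where=den > 0)
    return out * m              # keep the outside of the mask black

# Constant image with a square mask: a correct masked blur leaves it unchanged
img = np.full((6, 6), 10.0)
mask = np.zeros((6, 6), dtype=bool)
mask[1:5, 1:5] = True

result = masked_blur(img, mask)
naive = box_blur(img * mask)
print(result[1, 1], naive[1, 1])
```

In the naive version the corner pixel of the masked region is darkened by its black neighbours, while the normalised version keeps it at the original value.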

serene scaffold Mar 10, 2021, 11:30 PM

#

For the record, gm knows more than me about anything you might ask in this channel except maybe specific areas of nlp

shy kraken Mar 10, 2021, 11:31 PM

#

Hi I have a pandas dataframe with 5 or so columns and 20 rows. I want to create a new column where I apply a custom function that takes the most recent 5 rows of data from two columns. It's like a rolling function. I've tried something like this, which doesn't work:
```py
data['Rolling_Beta'] = data.rolling(5).apply(beta(data.B,data.C))
```

How might I achieve what I'm trying to achieve?

lapis sequoia Mar 10, 2021, 11:32 PM

#

grave frost yep. I was too naive 😦 but at least I got some experience dealing with "adults"

well as long as you get a good resume points, i'm sure it came in handy?

tidal bough Mar 10, 2021, 11:33 PM

#

shy kraken Hi I have a pandas dataframe with 5 or so columns and 20 rows. I want to create...

You probably want to apply your beta, not the result of applying beta to something.

grave frost Mar 10, 2021, 11:33 PM

#

lapis sequoia well as long as you get a good resume points, i'm sure it came in handy?

I don't see how I can put it on my resume without some formal documentation, but I dont know much about CV's anyways 🤷

candid sable Mar 10, 2021, 11:34 PM

#

serene scaffold For the record, gm knows more than me about anything you might ask in this chann...

I could use some NLP help as well with a uni assignment but hopefully I'll manage to solve that in the following days, when I actually take the time to look at it 😆

re: Pytorch - quick Google and found a bunch of tutorials, although pretty old and lacking instructions on labelling your own. Seems like a good start for now, thanks

lapis sequoia Mar 10, 2021, 11:35 PM

#

grave frost I don't see how I can put it on my resume without some formal documentation, but...

yeah, big plus (ffff that guys? 😄 )

misty flint Mar 10, 2021, 11:37 PM

#

serene scaffold For the record, gm knows more than me about anything you might ask in this chann...

im planning on doing a minor in computational linguistics/nlp, do you have any favorite resources youd recommend

shy kraken Mar 10, 2021, 11:39 PM

#

tidal bough You probably want to apply your `beta`, not the result of applying `beta` to som...

thanks I've tried this:

data['Rolling_Beta'] = data.rolling(5).apply(beta)

and I get this error TypeError: beta() missing 1 required positional argument: 'B'

So I thought of putting the two particular columns I wanted together and apply it on that a la:


data_a = data.A,data.B

data['Rolling_Beta'] = data_a.rolling(5).apply(beta)

This doesn't work either, I get a AttributeError: 'tuple' object has no attribute 'rolling'

tidal bough Mar 10, 2021, 11:39 PM

#

check what rolling gives exactly

#

if it gives tuples, your beta must be single-argument

#

!docs pandas.DataFrame.rolling

arctic wedgeBOT Mar 10, 2021, 11:40 PM

#

`pandas.DataFrame.rolling`

```
DataFrame.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)
```
Provide rolling window calculations.

Parameters

**window**: int, offset, or BaseIndexer subclass. Size of the moving window. This is the number of observations used for calculating the statistic. Each window will be a fixed size.

If it's an offset then this will be the time period of each window. Each window will be a variable size based on the observations included in the time-period. This is only valid for datetimelike indexes.

If a BaseIndexer subclass is passed, calculates the window boundaries based on the defined `get_window_bounds` method. Additional rolling keyword arguments, namely min\_periods, center, and closed will be passed to get\_window\_bounds.

**min\_periods**: int, default None. Minimum number of observations in window required to have a value (otherwise result is NA). For a window that is specified by an offset, min\_periods will default to 1. Otherwise, min\_periods will default to the size of the window.... [read more](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html#pandas.DataFrame.rolling)
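
`rolling(...).apply(func)` hands `func` one column's window at a time, so a two-argument function cannot be passed to it directly. A hedged sketch of two ways around that, assuming `beta` means cov(B, C) / var(C) (the actual `beta` was never shown, so this definition and the random sample data are assumptions):

```python
import numpy as np
import pandas as pd

# Made-up data: 20 rows, two columns
rng = np.random.default_rng(0)
data = pd.DataFrame({"B": rng.normal(size=20), "C": rng.normal(size=20)})
window = 5

# Option 1: built-in pairwise rolling statistics
data["Rolling_Beta"] = (
    data["B"].rolling(window).cov(data["C"]) / data["C"].rolling(window).var()
)

# Option 2: any custom two-column function, applied to each window slice
def beta(b, c):
    """Assumed definition: sample covariance of b and c over sample variance of c."""
    return np.cov(b, c)[0, 1] / np.var(c, ddof=1)

manual = [
    beta(data["B"].iloc[i - window + 1:i + 1], data["C"].iloc[i - window + 1:i + 1])
    if i >= window - 1 else np.nan
    for i in range(len(data))
]

print(data["Rolling_Beta"].tail(3).tolist())
print(manual[-3:])
```

The first four rows are NaN in both versions because a full window of 5 is not available yet. Option 2 is slower but works for any function of the two window slices.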

serene scaffold Mar 10, 2021, 11:40 PM

#

misty flint im planning on doing a minor in computational linguistics/nlp, do you have any f...

Not really. You can check my github to see what projects I've worked on and get a sense of what I might know about.

grave frost Mar 10, 2021, 11:41 PM

#

NLP = Huggingface! 🤗

serene scaffold Mar 10, 2021, 11:41 PM

#

The state of the art depends a lot on different approaches for representing words as vectors

lapis sequoia Mar 10, 2021, 11:42 PM

#

serene scaffold Not really. You can check my github to see what projects I've worked on and get ...

haha i used to live in fairfax and dupont circle

#

totally off-topic

candid sable Mar 10, 2021, 11:42 PM

#

What's some manual object labelling software? I seem to only find corporate solutions.. and Lionsbridge lol

serene scaffold Mar 10, 2021, 11:43 PM

#

lapis sequoia haha i used to live in fairfax and dupont circle

Dupont must have been nice. I'm not so ambitious as to assume I'll ever live there.

shy kraken Mar 10, 2021, 11:43 PM

#

hmmm

grave frost Mar 10, 2021, 11:43 PM

#

candid sable What's some manual object labelling software? I seem to only find corporate solu...

you can make your own 🤷 all you have to do is to divert image to the correct folder?

misty flint Mar 10, 2021, 11:45 PM

#

serene scaffold Not really. You can check my github to see what projects I've worked on and get ...

sounds good. thanks

lapis sequoia Mar 10, 2021, 11:45 PM

#

serene scaffold Dupont must have been nice. I'm not so ambitious as to assume I'll ever live the...

it is quite nice, but was too much, i was underage so what's the point of living near the night life places when I am unable to use them. Fairfax was much better imo, needed a car, but super chill place and you don't have those drunk/drugged homeless people walking around the streets in the evenings.

candid sable Mar 10, 2021, 11:46 PM

#

grave frost you can make your own 🤷 all you have to do is to divert image to the correct fo...

I tried that but my dataset is very limited, I'd like to provide separate .xml files for labelling a feature on a bone.

Bigger picture: Binary classifier for male/female determination of an edge of a bone

grave frost Mar 10, 2021, 11:47 PM

#

candid sable I tried that but my dataset is very limited, I'd like to provide separate .xml f...

no idea, I have never worked with .xml

candid sable Mar 10, 2021, 11:48 PM

#

or csv, or something. something to say that this particular edge is relevant - can't crop the images to only contain that specific edge as there's plenty of surrounding noise

grave frost Mar 10, 2021, 11:49 PM

#

candid sable or csv, or something. something to say that this particular edge is relevant - c...

so you would construct a csv for all paths_to_image and the label?

#

path_to_image, Female
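
A small sketch of such a label file using the standard `csv` module. The image paths and the temporary output location are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical image paths and their labels
rows = [
    ("images/bone_001.png", "Female"),
    ("images/bone_002.png", "Male"),
]

csv_path = Path(tempfile.mkdtemp()) / "labels.csv"

# Write one header row, then one (path, label) row per image
with csv_path.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["path_to_image", "label"])
    writer.writerows(rows)

# Read it back as a list of dicts keyed by the header names
with csv_path.open(newline="") as f:
    loaded = list(csv.DictReader(f))

print(loaded[0])
```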

lavish tundra Mar 10, 2021, 11:50 PM

#

serene scaffold @lavish tundra can you post an example of the desired output as text, and...

the idea is to put a lot of different string on a list, and access the string by the position of the column and the position on the list,
like:
Column : String
0 ('banana', 'apple', 'mango'...)
1 ('potato', 'blueberry', 'milk')
...
so i'll need to access apple by the position, for example: [db.at[0, 'String']][1]
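
A sketch of the layout described above: an ID column plus a column whose cells each hold a list. The second ID is made up. Note that wrapping the cell in an extra pair of brackets, as in `[db.at[0, 'String']][1]`, builds a one-element list and then asks for index 1, which raises an `IndexError`; indexing the cell directly works.

```python
import pandas as pd

db = pd.DataFrame({
    "ID": [6546984654, 6546984655],  # second ID is a made-up value
    "String": [
        ["banana", "apple", "mango"],
        ["potato", "blueberry", "milk"],
    ],
})

# Row 0, column 'String' gives the list; [1] then picks its second item
second = db.at[0, "String"][1]

# The same position from every row's list
second_of_each = db["String"].apply(lambda items: items[1]).tolist()

print(second, second_of_each)
```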

serene scaffold Mar 10, 2021, 11:51 PM

#

Do you need any other types of columns?

candid sable Mar 10, 2021, 11:51 PM

#

grave frost so you would construct a csv for all paths_to_image and the label?

yeah, I get that - but the results are bad if I just specify that image X, Y, Z are female, etc

grave frost Mar 10, 2021, 11:51 PM

#

candid sable yeah, I get that - but the results are bad if I just specify that image X, Y, Z ...

why would the results be bad? that means your model is not correct

lavish tundra Mar 10, 2021, 11:52 PM

#

serene scaffold Do you need any other types of columns?

what u mean? i'll have this column and a ID column, like the id for the ('banana'...) will be 6546984654

candid sable Mar 10, 2021, 11:52 PM

#

grave frost why would the results be bad? that means your model is not correct

because I have a very limited dataset

lavish tundra Mar 10, 2021, 11:52 PM

#

i have all the data i know how to merge columns, but idk how to add a list in one cell

grave frost Mar 10, 2021, 11:52 PM

#

candid sable because I have a very limited dataset

ye, but how would a image labelling software help in that case?

candid sable Mar 10, 2021, 11:53 PM

#

grave frost ye, but how would a image labelling software help in that case?

I could input points on which edge to focus in relation to an adjacent edge and not have it take into account all the other irrelevant irregularities.. or at least I'm thinking I can

grave frost Mar 10, 2021, 11:54 PM

#

candid sable I could input points on which edge to focus in relation to an adjacent edge and ...

wut? what technique are you using?

#

feature extraction is not something manually done in images.

candid sable Mar 10, 2021, 11:55 PM

#

just Keras and label folders for now, k-fold validation and retrained resnet. similar results to k-fold with own model

grave frost Mar 10, 2021, 11:56 PM

#

pre-trained resnet was a bad idea. but I still don't see how a image labelling software can help increase data

candid sable Mar 10, 2021, 11:56 PM

#

I didn't say it helps increase data lol

candid sable Mar 10, 2021, 11:57 PM

#

grave frost pre-trained resnet was a bad idea. but I still don't see how a image labelling s...

same goes with pretrained vgg16 or no pretrained at all

grave frost Mar 10, 2021, 11:58 PM

#

cool, but how does the image labelling software fit into all this?

#

you already have the labelled data

candid sable Mar 10, 2021, 11:58 PM

#

I want to input the points of the shape I want it to look at

grave frost Mar 10, 2021, 11:59 PM

#

could you elaborate what "points of the shape" are?

iron basalt Mar 11, 2021, 12:01 AM

#

You want to label pixels of an image and store those pixel coordinates in a separate file?

grave frost Mar 11, 2021, 12:02 AM

#

iron basalt You want to label pixels of an image and store those pixel coordinates in a sepa...

how does that help the model converge?

#

IMO it would be some sophisticated image preprocessing

iron basalt Mar 11, 2021, 12:02 AM

#

grave frost how does that help the model converge?

Idk and I don't care, the question was about manually labeling things.

grave frost Mar 11, 2021, 12:02 AM

#

iron basalt Idk and I don't care, the question was about manually labeling things.

cool enough

candid sable Mar 11, 2021, 12:03 AM

#

can I link papers?

#

https://www.mdpi.com/2076-3417/10/20/7233

MDPI

Automated Bone Age Assessment with Image Registration Using Hand X-...

One of the methods for identifying growth disorder is by assessing the skeletal bone age. A child with a healthy growth rate will have approximately the same chronological and bone ages. It is important to detect any growth disorder as early as possible, so that mitigation treatment can be administered with less negative consequences. Recently, ...

grave frost Mar 11, 2021, 12:06 AM

#

so you want a different label for each box?

candid sable Mar 11, 2021, 12:06 AM

#

pretty much

grave frost Mar 11, 2021, 12:07 AM

#

The paper is about predicting the bone age 🤷 that's regression

iron basalt Mar 11, 2021, 12:09 AM

#

Idk, but if all you want is to take such an image and have a label for each box, you should be able to code that in python in 30-60 minutes.

#

(manually drawn rectangles)

grave frost Mar 11, 2021, 12:12 AM

#

for each image?

#

Argh, reddit idiots https://www.reddit.com/r/MachineLearning/comments/m2bo1y/d_the_real_problem_program_of_the_next_5_years/

r/MachineLearning - [D] The real problem program of the next 5 year...

0 votes and 7 comments so far on Reddit

#

"Making music to psychologically manipulate people into doing what the lyrics say"

candid sable Mar 11, 2021, 12:15 AM

#

grave frost The paper is about predicting the bone age 🤷 that's regression

yeah but I'm interested in feeding it the keypoints used for that

grave frost Mar 11, 2021, 12:16 AM

#

then squiggle is right - manually drawn rectangles (large enough to cover most images)

candid sable Mar 11, 2021, 12:16 AM

#

grave frost Mar 11, 2021, 12:17 AM

#

it would be pretty boring, but you would have to do that

iron basalt Mar 11, 2021, 12:17 AM

#

It's just a complete beginner's drawing tool, nothing more is needed.

lapis sequoia Mar 11, 2021, 12:24 AM

#

Hello! I'm trying to classify a customer complaint database by topic, but I don't have much experience with ML, and since I don't have an already labeled dataset, I'm not sure how to proceed with unsupervised learning methods. I'm considering classifying the complaints by keywords (e.g., counting the frequency of said words and selecting some relevant keywords manually), but I don't really know how exactly to do it - I've managed to get the frequency and I already have an idea on which keywords to select, but I don't know where to proceed from that. Can anyone help?
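
A minimal sketch of the keyword idea described above, using only the standard library. The topics, keyword sets, and sample complaints are all made up; in practice the keywords would come from the frequency counts already computed.

```python
import re
from collections import Counter

# Hypothetical topics and hand-picked keywords
topic_keywords = {
    "billing": {"charge", "refund", "invoice"},
    "delivery": {"late", "shipping", "package"},
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def classify(text):
    """Pick the topic whose keywords occur most often, or 'other' if none match."""
    counts = Counter(tokenize(text))
    scores = {
        topic: sum(counts[word] for word in words)
        for topic, words in topic_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

complaints = [
    "I was charged twice and want a refund",
    "My package arrived late, shipping took weeks",
    "The app keeps crashing",
]
labels = [classify(c) for c in complaints]
print(labels)
```

Exact word matching misses variants like "charged" vs "charge", so stemming or a longer keyword list would likely be needed on real data. The labels produced this way could also serve as a rough starting set for a supervised model later.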

grave frost Mar 11, 2021, 12:27 AM

#

lapis sequoia Hello! I'm trying to classify a customer complaint database by topic, but I don'...

just use autokeras - one stop solution

candid sable Mar 11, 2021, 12:28 AM

#

I don't want the boxes showing up in the image, or drawing them and training with boxes in image. I want those boxes to be the objects it would look at when trying to classify whether it's M or F. Would that be possible?

grave frost Mar 11, 2021, 12:29 AM

#

just crop it

iron basalt Mar 11, 2021, 12:30 AM

#

candid sable I don't want the boxes showing up in the image, or drawing them and training wit...

Yes, you just store the box coordinates and sizes in a separate file which is associated with the image. Or in the same file (custom file format).
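
A sketch of the "separate file associated with the image" idea, as a JSON sidecar written with the standard library. The file name, field names, and coordinates are made-up assumptions, not an established annotation format.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical annotation for one image: a class label plus one box of interest
annotation = {
    "image": "bone_001.png",
    "label": "Female",
    "boxes": [
        {"name": "edge_of_interest", "x": 120, "y": 45, "width": 60, "height": 30},
    ],
}

# Store it next to the image under the same base name (a temp dir here)
sidecar = Path(tempfile.mkdtemp()) / "bone_001.json"
sidecar.write_text(json.dumps(annotation, indent=2))

# Later: load it and turn the box into slices for a NumPy-style image array,
# e.g. region = img[rows, cols]
loaded = json.loads(sidecar.read_text())
box = loaded["boxes"][0]
rows = slice(box["y"], box["y"] + box["height"])
cols = slice(box["x"], box["x"] + box["width"])

print(rows, cols)
```

The image itself stays untouched; the boxes only exist in the sidecar, so nothing is drawn into the training data.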

candid sable Mar 11, 2021, 12:31 AM

#

grave frost just crop it

cropped still has other features in the background that are irrelevant

grave frost Mar 11, 2021, 12:32 AM

#

crop + preprocess

candid sable Mar 11, 2021, 12:32 AM

#

I already preprocessed enough, I have an edge image

#

but I can't uncurve the irrelevant edges, can I?

candid sable Mar 11, 2021, 12:35 AM

#

iron basalt Yes, you just store the box coordinates and sizes in a separate file which is as...

yes, and what would I use for that? conversely, instead of rectangles, can I use points like used in facial landmark detection?

iron basalt Mar 11, 2021, 12:35 AM

#

What would you use to draw the rectangles?

candid sable Mar 11, 2021, 12:36 AM

#

to generate a label file

iron basalt Mar 11, 2021, 12:36 AM

#

You just do it.

#

With python.

candid sable Mar 11, 2021, 12:37 AM

#

wiiiiith? any library?

iron basalt Mar 11, 2021, 12:37 AM

#

You mean libraries?

#

pygame, pyglet, pyqt, pyside, pywxwidgets, or any other UI framework that lets you draw stuff.

candid sable Mar 11, 2021, 12:38 AM

#

sorry, but as I said previously, I'm very new to this and my only experience re: labels is just using folder names as labels

iron basalt Mar 11, 2021, 12:39 AM

#

You're new to python?

candid sable Mar 11, 2021, 12:39 AM

#

new to ML stuff, only used Python for basic data operations

iron basalt Mar 11, 2021, 12:40 AM

#

This has not really much to do with ML and everything to do with being able to make an app.

lapis sequoia Mar 11, 2021, 12:40 AM

#

grave frost just use autokeras - one stop solution

From what I have quickly read right now, I should define the classes, run 20% of the dataset as training, check the model precision, do some cross-validation and apply it to the general dataset?

iron basalt Mar 11, 2021, 12:41 AM

#

Then you need to learn more python for basic stuff like File IO, making a GUI, etc.

candid sable Mar 11, 2021, 12:41 AM

#

nope, all I want is a model really

iron basalt Mar 11, 2021, 12:42 AM

#

But you also want a tool to label the data...

#

To get that model.

candid sable Mar 11, 2021, 12:42 AM

#

yeah, there's no windows .exe that can do that?

iron basalt Mar 11, 2021, 12:42 AM

#

According to you, no.

candid sable Mar 11, 2021, 12:43 AM

#

well I only found corporate things, that's why I asked in the first place

iron basalt Mar 11, 2021, 12:43 AM

#

Idk if anyone would know of such a specific application.

#

All I know is that it can be made in python in like an hour.

candid sable Mar 11, 2021, 12:43 AM

#

iron basalt Idk if anyone would know of such a specific application.

literally none where I load an image, draw my object box, and it saves the coordinates in an xml or something?

grave frost Mar 11, 2021, 12:43 AM

#

lapis sequoia From what I have quickly read right now, I should define the classes and run and...

yeah, thats the correct way

iron basalt Mar 11, 2021, 12:44 AM

#

You don't need XML, over-complicated file format for something so simple.

grave frost Mar 11, 2021, 12:46 AM

#

iron basalt You don't need XML, over-complicated file format for something so simple.

XML was made to be simple AFAIK

iron basalt Mar 11, 2021, 12:46 AM

#

XML is soooo far from simple.

grave frost Mar 11, 2021, 12:46 AM

#

what, it just has custom tags. that's it

iron basalt Mar 11, 2021, 12:46 AM

#

It's an entire tree structure.

#

Requires a bunch of parsing rules.

#

For storing boxes it's as simple as:

grave frost Mar 11, 2021, 12:47 AM

#

well, ye gotta put the effort - but Im pretty sure there would be a lib for xml

iron basalt Mar 11, 2021, 12:47 AM

#

87,124,54,24
55,200,20,20
...

grave frost Mar 11, 2021, 12:48 AM

#

Image_ID/path?

iron basalt Mar 11, 2021, 12:48 AM

#

Image path first line
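A minimal sketch of the plain-text label file described above: the image path on the first line, then one box per line. It assumes the four numbers are x, y, width and height (the chat only says "coordinates and sizes"), and the file names and the helpers `save_boxes` / `load_boxes` are made up for illustration.

```python
from pathlib import Path
import tempfile


def save_boxes(label_path, image_path, boxes):
    """Write the image path on the first line, then one x,y,w,h box per line."""
    lines = [str(image_path)]
    lines += [",".join(str(value) for value in box) for box in boxes]
    Path(label_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def load_boxes(label_path):
    """Return (image_path, [(x, y, w, h), ...]) read back from a label file."""
    lines = Path(label_path).read_text(encoding="utf-8").splitlines()
    image_path = lines[0]
    boxes = [
        tuple(int(value) for value in line.split(","))
        for line in lines[1:]
        if line.strip()  # skip blank lines
    ]
    return image_path, boxes


# Round trip through a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    label_file = Path(tmp) / "hand_001.boxes.txt"
    save_boxes(label_file, "images/hand_001.png", [(87, 124, 54, 24), (55, 200, 20, 20)])
    loaded_path, loaded_boxes = load_boxes(label_file)

print(loaded_path, loaded_boxes)
```

One label file per image keeps the pixels untouched, so the boxes never show up in the training images themselves.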

grave frost Mar 11, 2021, 12:49 AM

#

well, IMO the OP's approach is too complicated

#

(xml adds to that)

candid sable Mar 11, 2021, 12:49 AM

#

I wasn't dead set on XML lol

#

I just gave it as an example as it's what Imagenet used and I don't know others

iron basalt Mar 11, 2021, 12:49 AM

#

xml is for when your thing is very tree-like (and potentially any number of children per node).

grave frost Mar 11, 2021, 12:50 AM

#

but what is your end-goal (leaving aside the boxes for now)?

grave frost Mar 11, 2021, 12:50 AM

#

iron basalt xml is for when your thing is very tree-like (and potentially any number of chil...

when is some dataset tree-like?

candid sable Mar 11, 2021, 12:50 AM

#

for the life of me I can't explain it in english but I have an analogy

iron basalt Mar 11, 2021, 12:50 AM

#

It's not even just for datasets.

lapis sequoia Mar 11, 2021, 12:51 AM

#

candid sable for the life of me I can't explain it in english but I have an analogy

out of curiosity, what is your native language?

iron basalt Mar 11, 2021, 12:51 AM

#

Think like a robot. One common file format for simulations is URDF.

candid sable Mar 11, 2021, 12:51 AM

#

lapis sequoia out of curiosity, what is your native language?

hungarian

iron basalt Mar 11, 2021, 12:52 AM

#

It's an XML-type file because a robot's parts connect like a tree. For example, the main body might be a node with 4 child nodes, which are the wheels.
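To make the tree idea concrete, here is a rough sketch that parses a body with four wheel children using Python's standard `xml.etree.ElementTree`. The XML layout below is a made-up simplification for illustration and is not real URDF syntax; the element name `part` and the helper `list_parts` are assumptions.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical robot description: one body node with 4 wheel children.
ROBOT_XML = """
<robot name="cart">
  <part name="body">
    <part name="wheel_front_left"/>
    <part name="wheel_front_right"/>
    <part name="wheel_rear_left"/>
    <part name="wheel_rear_right"/>
  </part>
</robot>
"""


def list_parts(element, depth=0):
    """Walk the tree depth-first and return (depth, name) for every part."""
    rows = []
    for child in element.findall("part"):
        rows.append((depth, child.get("name")))
        rows.extend(list_parts(child, depth + 1))
    return rows


root = ET.fromstring(ROBOT_XML)
parts = list_parts(root)
for depth, name in parts:
    print("  " * depth + name)
```

A flat list of boxes has no parent/child relationships like this, which is why a comma-separated text file is enough for it.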

grave frost Mar 11, 2021, 12:52 AM

#

ok, thats a good one

#

@candid sable try

#

@lapis sequoia how's GME?

#

265$ ---> OH frick

candid sable Mar 11, 2021, 12:55 AM

#

you know how men have the Adam’s apple - I’d like to detect a similar bump on a bone

lapis sequoia Mar 11, 2021, 12:55 AM

#

grave frost @lapis sequoia how's GME?

i'm not really checking it, now that you mentioned it lol

candid sable Mar 11, 2021, 12:55 AM

#

but looking back, only regression would be a potential solution so yeah

grave frost Mar 11, 2021, 12:55 AM

#

candid sable you know how men have the Adam’s apple - I’d like to detect a similar bump on a ...

well, then why do you need specific parts of the image cropped out?

lapis sequoia Mar 11, 2021, 12:56 AM

#

i deal more with credit stuff

grave frost Mar 11, 2021, 12:56 AM

#

lapis sequoia i'm not really checking it, now that you mentioned it lol

ye, them r/wsb are prob rich by now

grave frost Mar 11, 2021, 12:56 AM

#

lapis sequoia i deal more with credit stuff

wdym by credit stuff?

candid sable Mar 11, 2021, 12:56 AM

#

grave frost well, then why do you need specific parts of the image cropped out?

other irregularities where the bump could be - the surface isn’t always straight

lapis sequoia Mar 11, 2021, 12:56 AM

#

grave frost wdym by credit stuff?

loans, mostly

grave frost Mar 11, 2021, 12:57 AM

#

lapis sequoia loans, mostly

nice - smart and stable

lapis sequoia Mar 11, 2021, 12:57 AM

#

grave frost nice - smart and stable

i work at a p2p loan company

grave frost Mar 11, 2021, 12:57 AM

#

candid sable other irregularities where the bump could be - the surface isn’t always straight

wait - so all you want to do is to detect a specific feature on an image

candid sable Mar 11, 2021, 12:57 AM

#

yes the size of it

grave frost Mar 11, 2021, 12:57 AM

#

lapis sequoia i work at a p2p loan company

is that even legal? not to be under some financial framework

lapis sequoia Mar 11, 2021, 12:58 AM

#

grave frost is that even legal? not to be under some financial framework

surprisingly, it is

#

at least in the country where i live

grave frost Mar 11, 2021, 12:58 AM

#

candid sable yes the size of it

well well well. You can simplify this into 2 models: one returns the bounding box coords, a small program crops the image to that box, and the cropped image is fed to the 2nd model, which performs image regression
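A rough sketch of the crop step that sits between the two models. The detector and the regressor are passed in as placeholder callables (`detect_box` and `regress` are hypothetical names, and no real model is involved), and the box is assumed to be x, y, width, height on an image stored as a list of pixel rows.

```python
def crop(image, box):
    """Cut the x, y, w, h box out of an image stored as a list of rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def estimate_size(image, detect_box, regress):
    """Stage 1 finds the box, the crop isolates it, stage 2 predicts a number."""
    box = detect_box(image)
    return regress(crop(image, box))


# Toy 5x6 "image" whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(5)]
patch = crop(image, (2, 1, 3, 2))
print(patch)  # [[12, 13, 14], [22, 23, 24]]

# Stand-ins for the two models, only to show how the pieces connect.
size = estimate_size(
    image,
    detect_box=lambda img: (2, 1, 3, 2),
    regress=lambda p: len(p) * len(p[0]),
)
print(size)  # 6
```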

grave frost Mar 11, 2021, 12:59 AM

#

lapis sequoia surprisingly, it is

r u sure?

lapis sequoia Mar 11, 2021, 12:59 AM

#

yes

grave frost Mar 11, 2021, 12:59 AM

#

well, that's intriguing. how does it work exactly?

candid sable Mar 11, 2021, 12:59 AM

#

grave frost well well well. You can simplify the model - 2 models, one for returning boundin...

yeah I need to look more into regression. thanks for the patience man

grave frost Mar 11, 2021, 12:59 AM

#

candid sable yeah I need to look more into regression. thanks for the patience man

cool, no worries

candid sable Mar 11, 2021, 1:00 AM

#

I’m a disaster at math

lapis sequoia Mar 11, 2021, 1:00 AM

#

you can have a max interest rate of approx. double the basic national interest rate

#

having that, you can work in several ways

grave frost Mar 11, 2021, 1:00 AM

#

lapis sequoia you can have a max interest rate of approx. double the basic national inte...

that's....not fair

lapis sequoia Mar 11, 2021, 1:01 AM

#

you can have an investor fully financing a single loan

#

or

#

you can have several investors financing loans in quotas

candid sable Mar 11, 2021, 1:01 AM

#

I saw a similar concept on an Ethereum loan platform

lapis sequoia Mar 11, 2021, 1:01 AM

#

grave frost that's....not fair

it all depends on credit score simulation

grave frost Mar 11, 2021, 1:02 AM

#

what advantage does that give over the traditional banking system?

lapis sequoia Mar 11, 2021, 1:02 AM

#

grave frost what advantage does that give over the traditional banking system?

less bureaucracy for getting a loan

#

also, in the economic setting of my country specifically, it's way more advantageous to invest in private loans instead of federal securities, for instance

#

when the basic national interest rate is at its lowest, the investment yield can reach negative levels

#

and for credit investors, it can be really helpful to diversify through private loans

#

for a borrower, it can be way easier to actually get credit when you don't have to go through the whoooole banking process

#

credit scoring was the whole reason i got interested in data science - most companies in that segment use econometric modeling to determine credit risk

grave frost Mar 11, 2021, 1:06 AM

#

whoa, that's gonna go over my head 🤯

lapis sequoia Mar 11, 2021, 1:06 AM

#

if you live in the US, i'm pretty sure there are companies that work in that business model

grave frost Mar 11, 2021, 1:06 AM

#

lapis sequoia if you live in the US, i'm pretty sure there are companies that work in that bus...

nah, not in US

lapis sequoia Mar 11, 2021, 1:06 AM

#

it's a really good alternative if you're a seasoned investor and want to diversify your portfolio

grave frost Mar 11, 2021, 1:07 AM

#

what is credit risk? the risk that the loanee won't pay the interest?

lapis sequoia Mar 11, 2021, 1:07 AM

#

the risk that the loanee won't pay at all

#

there's several ways to calculate that

#

i'm not sure how because i'm not entirely versed in the econometrics behind it lol

#

but it takes some variables like age, credit bureau score, income commitment, etc

grave frost Mar 11, 2021, 1:09 AM

#

that's interesting. any way we can have a look at that data?

lapis sequoia Mar 11, 2021, 1:11 AM

#

i'm actually using excel files lol

#

do you mind if i just sample one column? i can't really share the whole dataset because it may contain sensitive info

#

the column i'm looking to classify, of course

grave frost Mar 11, 2021, 1:12 AM

#

lapis sequoia do you mind if i just sample one column? i can't really share the whole dataset ...

😦 sample is good too

arctic wedgeBOT Mar 11, 2021, 1:15 AM

#

Hey @lapis sequoia!

It looks like you tried to attach file type(s) that we do not allow (.xlsx). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

lapis sequoia Mar 11, 2021, 1:15 AM

#

oh well

grave frost Mar 11, 2021, 1:15 AM

#

you can export it as a csv

arctic wedgeBOT Mar 11, 2021, 1:16 AM

#

Hey @lapis sequoia!

It looks like you tried to attach file type(s) that we do not allow (.csv). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

lapis sequoia Mar 11, 2021, 1:16 AM

#

jesus

grave frost Mar 11, 2021, 1:17 AM

#

well, I don't think its that hard to convert a csv to gif man 🙂

lapis sequoia Mar 11, 2021, 1:17 AM

#

you can actually do that? lmao

grave frost Mar 11, 2021, 1:18 AM

#

just jokin. can you paste a coupla rows here?

lapis sequoia Mar 11, 2021, 1:18 AM

#

ofc

#

well, that should basically do it

grave frost Mar 11, 2021, 1:19 AM

#

ahh, not english 😦

lapis sequoia Mar 11, 2021, 1:20 AM

#

well, basically i have some complaint comments

#

i know the keywords i want

#

and i want to classify the text based on that

#

i've read a bit on autokeras but i'm not sure on how to proceed with that

#

i've considered using regex to determine the classes

#

in such a way where if the row contains word x, then class = x, for instance

grave frost Mar 11, 2021, 1:22 AM

#

ok, so that's the first step ^^ to build a dataset. one column labels, one column text.

#

next, we would take all the data in each cell and make an array out of it. so text_array would contain [text1, text2, text3, ...], and the same with the labels.

#

lastly, we would just pass both arrays to autokeras and you are done in about 30 lines of code

#

though it will take time for autokeras to find the best model

lapis sequoia Mar 11, 2021, 1:23 AM

#

grave frost ok, so that's the first step ^^ to build a dataset. one column labels, one colum...

so you say, creating a labeled data myself, based on the keywords i select?

grave frost Mar 11, 2021, 1:23 AM

#

lapis sequoia so you say, creating a labeled data myself, based on the keywords i select?

yep, has to be human-made for best accuracy

lapis sequoia Mar 11, 2021, 1:24 AM

#

that can actually work very well with some good regex

grave frost Mar 11, 2021, 1:24 AM

#

you can use regex, but I advise against it. a lot of things are nuanced

#

either way, it depends on the dataset

lapis sequoia Mar 11, 2021, 1:24 AM

#

i've used str.contains too

#

it just feels a little imprecise tho

#

after i normalized the text it doesn't feel like such a huge problem

#

what would you recommend?

grave frost Mar 11, 2021, 1:26 AM

#

I didn't mean the method to identify label, I said that finding labels programmatically is a bad idea. because if you can do it with programming, why are you making a model?

lapis sequoia Mar 11, 2021, 1:26 AM

#

grave frost I didn't mean the method to identify label, I said that finding labels programma...

that is a good point indeed

#

it's kinda pointless to make a model if i can actually label it myself with programming and keyword selecting

grave frost Mar 11, 2021, 1:27 AM

#

ye, you got it. your task seems pretty simple so a large enough if any(word in sentence for word in [word1, word2, word3]) check should do that anyway
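The keyword rule discussed above could look roughly like this. The labels, the keyword lists and the function name `label_complaint` are invented for illustration; the real ones would come from the word frequencies already collected. Text is lowercased and split into whole words so that a keyword does not match inside a longer word.

```python
import re

# Hypothetical labels and keywords; replace with the manually selected ones.
KEYWORDS = {
    "billing": ["invoice", "charge", "refund"],
    "delivery": ["late", "shipping", "package"],
}


def label_complaint(text, keywords=KEYWORDS, default="other"):
    """Return the first label whose keyword list shares a word with the text."""
    words = set(re.findall(r"\w+", text.lower()))
    for label, terms in keywords.items():
        if any(term in words for term in terms):
            return label
    return default


for complaint in [
    "I was charged twice, I want a refund",
    "My package arrived late",
    "The app crashes on login",
]:
    print(label_complaint(complaint), "<-", complaint)
```

The first matching label wins here, so the order of the dictionary matters when a complaint mentions keywords from several topics; returning every matching label is an easy variation.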

lapis sequoia Mar 11, 2021, 1:28 AM

#

and if i want the full context of that specific label, i can just get the bigrams/trigrams for that label

grave frost Mar 11, 2021, 1:28 AM

#

lapis sequoia and if i want the full context of that specific label, i can just get the bigram...

ye, that option is always open
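For the bigram/trigram idea, a small sketch using only the standard library. The sample complaints are invented, and `ngrams` is a hypothetical helper; in practice the list would hold the rows that received a given label.

```python
from collections import Counter
import re


def ngrams(text, n):
    """Return the list of n-word tuples found in the text, in order."""
    tokens = re.findall(r"\w+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented complaints that were all given the same label.
complaints = [
    "the refund never arrived",
    "still waiting for the refund",
    "the refund was rejected",
]

counts = Counter()
for text in complaints:
    counts.update(ngrams(text, 2))  # use 3 for trigrams

top = counts.most_common(1)
print(top)  # [(('the', 'refund'), 3)]
```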

lapis sequoia Mar 11, 2021, 1:28 AM

#

it does seem like a good method

#

thank you very much

grave frost Mar 11, 2021, 1:29 AM

#

lapis sequoia it does seem like a good method

maybe, but you have to explore your data first

lapis sequoia Mar 11, 2021, 1:29 AM

#

grave frost maybe, but you have to explore your data first

i'm sadly very familiar with it lol

grave frost Mar 11, 2021, 1:29 AM

#

lapis sequoia thank you very much

cool, no worries. Good luck! 👍

lapis sequoia Mar 11, 2021, 1:29 AM

#

i'm both customer support and intern data analyst

#

lol

grave frost Mar 11, 2021, 1:29 AM

#

internships. what can you do

lapis sequoia Mar 11, 2021, 1:29 AM

#

ikr?

grave frost Mar 11, 2021, 1:30 AM

#

its basically exploitation

lapis sequoia Mar 11, 2021, 1:30 AM

#

not that i really mind it tbh

#

i had no idea what to do with my major until my internship

grave frost Mar 11, 2021, 1:31 AM

#

lapis sequoia not that i really mind it tbh

gotta do it some day or the other. plus, companies dig that - so hang on!

lapis sequoia Mar 11, 2021, 1:31 AM

#

i'm hoping to get a firmer grasp on statistics and programming so i can fully transition to data

uncut orbit Mar 11, 2021, 1:31 AM

#

data science is awesome

#

try learning some calc

lapis sequoia Mar 11, 2021, 1:32 AM

#

uncut orbit data science is awesome

i was really surprised with what you can do with such skills

#

it's amazing

uncut orbit Mar 11, 2021, 1:32 AM

#

ikr

lapis sequoia Mar 11, 2021, 1:33 AM

#

uncut orbit try learning some calc

sadly my undergrad focuses much more on economic history than economic theory/econometrics

uncut orbit Mar 11, 2021, 1:33 AM

#

ah

lapis sequoia Mar 11, 2021, 1:34 AM

#

but i intend to move to statistics

uncut orbit Mar 11, 2021, 1:34 AM

#

lmao

#

are you on kaggle?

#

its a great place to get started

lapis sequoia Mar 11, 2021, 1:35 AM

#

uncut orbit are you on kaggle?

i had lessons with the data science team of my company

#

basically learned python through it

uncut orbit Mar 11, 2021, 1:36 AM

#

then thats good

lapis sequoia Mar 11, 2021, 1:36 AM

#

i hope so

uncut orbit Mar 11, 2021, 1:36 AM

#

i dont work yet so i can only imagine

lapis sequoia Mar 11, 2021, 1:36 AM

#

it's crazy

#

when you're in a department where you're the only one who can code

#

you're literally god

uncut orbit Mar 11, 2021, 1:36 AM

#

yea

#

thats what my data science teacher was telling me

lapis sequoia Mar 11, 2021, 1:37 AM

#

that's not exactly good though lol

uncut orbit Mar 11, 2021, 1:37 AM

#

the resources right

lapis sequoia Mar 11, 2021, 1:37 AM

#

it's really pressuring, as an intern, to have so much expectation in the analysis you execute

uncut orbit Mar 11, 2021, 1:37 AM

#

oh

#

well dont worry if you're the only one

#

there'll be more

lapis sequoia Mar 11, 2021, 1:38 AM

#

i mean, there's a whole team focused on that

#

but they're busy with other stuff

uncut orbit Mar 11, 2021, 1:38 AM

#

thats crazy

lapis sequoia Mar 11, 2021, 1:38 AM

#

it is lmao

#

it does keep you always on edge to learn more and improvise

uncut orbit Mar 11, 2021, 1:39 AM

#

yea

#

thats what pressure does

#

but i personally cant wait to start professional data science

lapis sequoia Mar 11, 2021, 1:39 AM

#

it's a really good career path

uncut orbit Mar 11, 2021, 1:39 AM

#

it is

#

and for me especially its fun and comforting

lapis sequoia Mar 11, 2021, 1:40 AM

#

that's good

#

having a specific path is even more pleasing

uncut orbit Mar 11, 2021, 1:40 AM

#

yea

timid depot Mar 11, 2021, 1:49 AM

#

Is anyone here good at deep learning, neural networks?
Plz mind DMing me
Plz plz plz

uncut orbit Mar 11, 2021, 1:52 AM

#

i've barely worked with them and i'm confused with weights and biases

timid depot Mar 11, 2021, 1:53 AM

#

😭

uncut orbit Mar 11, 2021, 1:55 AM

#

i also wonder how neural nets can work with robots

timid depot Mar 11, 2021, 1:55 AM

#

they can

uncut orbit Mar 11, 2021, 1:56 AM

#

but how do you implement them in robots

#

is there like some chip?

timid depot Mar 11, 2021, 1:56 AM

#

uncut orbit but how do you implement them in robots

Robots must have a controller board
Like an Arduino or a Raspberry Pi

shy kraken Mar 11, 2021, 1:56 AM

#

so I have a dataframe data and I'm getting the standard deviation of a particular column Beta. So I'm using data.Beta.std(). I'm multiplying that by 2 and adding it to the mean and that number is higher than all the numbers in my dataset....is that possible?
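A quick check suggests this is possible. The sketch below uses made-up numbers and the standard library's `statistics.stdev`, which is the sample standard deviation, the same default (ddof=1) that pandas uses for `.std()`.

```python
import statistics

values = [1, 2, 3, 4, 5]  # made-up data standing in for data.Beta

mean = statistics.mean(values)   # 3
std = statistics.stdev(values)   # sample std (ddof=1), sqrt(2.5)
threshold = mean + 2 * std       # about 6.16

print(threshold, max(values))
print(threshold > max(values))   # True: nothing in the data reaches mean + 2*std
```

For roughly bell-shaped data only a few percent of values are expected above the mean plus two standard deviations, so in a small or tightly clustered column it is quite normal for no value to exceed it.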

timid depot Mar 11, 2021, 1:57 AM

#

uncut orbit is there like some chip?

You cant expect to add a neural net in just a walking robot who just walks

uncut orbit Mar 11, 2021, 1:57 AM

#

yea i get that

#

but how does the whole thing work

#

how do you train it

timid depot Mar 11, 2021, 1:58 AM

#

You can't code directly on the robot, it doesn't have a keyboard
You code on a computer and then transfer the program to your robot

uncut orbit Mar 11, 2021, 1:59 AM

#

and is python a language that is used for it?

timid depot Mar 11, 2021, 2:00 AM

#

Yea it is
Python has libraries like tensorflow, pytorch which can be used for neural nets, deep learning algorithms

uncut orbit Mar 11, 2021, 2:00 AM

#

i mean like for the robots

#

i've worked a little with tensorflow

timid depot Mar 11, 2021, 2:00 AM

#

uncut orbit i mean like for the robots

It also has a library for robots
Ive heard of it

timid depot Mar 11, 2021, 2:01 AM

#

uncut orbit i've worked a little with tensorflow

Can you plz teach me too

uncut orbit Mar 11, 2021, 2:01 AM

#

ok

#

i think i have some code from before

timid depot Mar 11, 2021, 2:01 AM

#

Ok

#

Can I dm you

uncut orbit Mar 11, 2021, 2:01 AM

#

sure

timid depot Mar 11, 2021, 2:01 AM

#

Thanks

misty flint Mar 11, 2021, 3:50 AM

#

uncut orbit how do you train it

reinforcement learning. take a look at some vids on YT. pretty interesting

distant needle Mar 11, 2021, 3:59 AM

#

Is it appropriate to ask plotting questions here?

rugged spire Mar 11, 2021, 4:00 AM

#

plotting as in matplotlib?

distant needle Mar 11, 2021, 4:05 AM

#

@rugged spire yes, sorry, my keyboard decided to have a seizure right when you answered

#

I can't for the life of me figure out how to destroy a figure / canvas completely. I am placing the canvas in a tkinter Frame, but I still can't figure out how to delete the canvas object

#

If you think the error is more of a problem with the UI, I can shift this to UI instead.

rugged spire Mar 11, 2021, 4:08 AM

#

um

misty flint Mar 11, 2021, 4:09 AM

#

tkinter is yikes

rugged spire Mar 11, 2021, 4:09 AM

#

sorry but i have never used tkinter

misty flint Mar 11, 2021, 4:09 AM

#

but yeah UI is for tkinter questions

rugged spire Mar 11, 2021, 4:09 AM

#

so i am not really sure

#data-science-and-ml

Feature Extraction part

Neural Network - For classification
