#data-science-and-ml | Python | Page 295

twilit pilot Mar 13, 2021, 6:04 PM

#

How long does it typically take to train an sklearn.svm model? My dataframe has about 26,000 rows: X is 26,000 1-D np.arrays of length 2,500 each, and y is 26,000 labels. I'm using regularization C=1, gamma=auto, and kernel=rbf.

twin latch Mar 13, 2021, 6:15 PM

#

Hey guys im reading sensor data from my serial ports, ive successfully read sensor data to variables but i dont know like how to append that sensor data to csv file, can anyone help me?

torpid scarab Mar 13, 2021, 6:32 PM

#

hello. anyone knows any good book for logic programming and asp?

exotic lodge Mar 13, 2021, 6:50 PM

#

twin latch Hey guys im reading sensor data from my serial ports, ive successfully read sens...

If you have your data in Python you can organize it and do something like this: https://thispointer.com/python-how-to-append-a-new-row-to-an-existing-csv-file/#:~:text=Open our csv file in,in the associated csv file

thispointer.com

Varun

Python: How to append a new row to an existing csv file?
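
For the sensor case, a minimal sketch of the append approach using only the standard library `csv` module. The file name, the column layout (timestamp, value), and the `append_row` helper are assumptions for illustration, not something from the original question:

```py
import csv
import os
import tempfile


def append_row(path, row):
    """Append one row to a CSV file, creating the file if it does not exist."""
    # Mode "a" appends instead of overwriting; newline="" avoids blank
    # lines between rows on Windows.
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)


# Demo with made-up readings written to a temporary location.
demo_path = os.path.join(tempfile.mkdtemp(), "sensor_log.csv")
append_row(demo_path, ["2021-03-13T18:13:00", 21.5])
append_row(demo_path, ["2021-03-13T18:13:01", 21.7])

with open(demo_path, newline="") as f:
    rows = list(csv.reader(f))
```

In a real serial-reading loop you would call `append_row` once per reading, right after parsing the values you already have in variables.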

twin latch Mar 13, 2021, 6:51 PM

#

Oh okay Thank you

misty flint Mar 13, 2021, 8:51 PM

#

is there a website somewhere that translates all the library abbreviations

#

like a master list

eternal hare Mar 13, 2021, 8:54 PM

#

could anyone help me set up torch xla on google colab, im using pytorch lightning
ive been having tons of issues and theres nothing on stackoverflow

uncut barn Mar 13, 2021, 9:14 PM

#

Anyone know the answer to this?

lapis sequoia Mar 13, 2021, 10:12 PM

#

Hello, I'm new in data science, are there any books you recommend to start with?

serene scaffold Mar 13, 2021, 10:24 PM

#

lapis sequoia Hello, I'm new in data science, are there any books you recommend to start with?

there's "data science from scratch" published by O'Reilly. I would see if they have any recent books on the topic.

lapis sequoia Mar 13, 2021, 10:27 PM

#

serene scaffold there's "data science from scratch" published by O'Reilly. I would see if they ha...

Thank you

misty flint Mar 13, 2021, 10:34 PM

#

what would we do without the o'reilly series

#

pithink

serene scaffold Mar 13, 2021, 10:34 PM

#

misty flint what would we do without the o'reilly series

another publisher would fill the void

misty flint Mar 13, 2021, 10:34 PM

#

true

#

they just have the time advantage

#

just like AWS with the cloud

#

well ig AWS has more features but thats also a function of their age

#

side note: our nlp project on contracts is on github. it would be really easy to create a docker image/container for it right?

#

my first time working with docker

fossil rivet Mar 13, 2021, 10:39 PM

#

Hey I was wondering if anyone could explain a Hough Transform to me. I know it's not syntax, but it's a CV concept. I have a table that we're supposed to understand for an upcoming exam but still don't

#

Screen_Shot_2021-03-13_at_4.29.32_PM.png

grave frost Mar 13, 2021, 11:13 PM

#

I heard that using docker is pretty difficult (mostly people complaining about cryptic errors)

#

Also a somewhat argumentative question - I only saw a very basic overview of capsule networks (from Hinton) but it does seem kind of like the hierarchy theory in HTMs, and there was little credit given to Hawkins. Can someone with deep knowledge explain the core difference b/w Hawkins' and Hinton's approaches?

fleet hare Mar 13, 2021, 11:20 PM

#

fossil rivet

Pretty sure that table represents the Hough parameters for the lines in the image. It's similar to the 2nd example on the wikipedia page. If you want more help I can hop in a voice chat and walk you through it but check out the wikipedia page first and see if you can figure it out https://en.wikipedia.org/wiki/Hough_transform#:~:text=The Hough transform is a,shapes by a voting procedure.

Hough transform

The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are ob...

fossil rivet Mar 13, 2021, 11:21 PM

#

Alright, will do. Thanks!

fossil rivet Mar 13, 2021, 11:39 PM

#

fleet hare Pretty sure that table represents the Hough parameters for the lines in the imag...

Would you be able to hop in VC with me? I'm still kinda lost

empty sable Mar 14, 2021, 12:08 AM

#

everyone here seems rather experienced, I have a question about python capabilities
if I had historical data on a price or level of a number
could I build an algo that could give me a rough estimate of the direction the price would go

serene scaffold Mar 14, 2021, 12:16 AM

#

@empty sable what format is the data in?

exotic maple Mar 14, 2021, 12:17 AM

#

empty sable everyone here seems rather experienced, I have a question about python capabilit...

That sounds like a prediction problem.

Here's the thing, you can technically do something like that with regressions but dont lean too much on it

grave frost Mar 14, 2021, 12:31 AM

#

empty sable everyone here seems rather experienced, I have a question about python capabilit...

yes, you can predict whether it would rise or fall over timesteps (classification), or you could do prediction using RNNs, which would try to predict the exact price for whatever horizon you like. RNNs usually provide decent enough results given a good amount of data, so you should not have data problems with them.

#

Look up stock prediction with LSTM on google. You would find plenty of tutorials to help you out

light stump Mar 14, 2021, 12:54 AM

#

I’m trying to figure out how to use scipy.interpolate.RectBivariateSpline to perform a polynomial image warp between two halves of an image stored as a numpy array. Can someone help me figure out how to accomplish this by any chance?

proven plinth Mar 14, 2021, 12:54 AM

#

You dont need a network, you could just use an autoregressive model to predict the series

#

They usually get decent results with less work than a RNN

#

You need to do all the basic data cleaning and normalization tho but you probably will have to do that anyway

#

Like range scaling, making the series stationary, removing seasonality etc
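
A rough sketch of what that could look like with plain NumPy: difference the series to make it closer to stationary, then fit an autoregressive model by least squares. The helper names `fit_ar` and `predict_next` and the price series are made up for illustration; a real project would more likely use a time-series library and proper validation.

```py
import numpy as np


def fit_ar(series, p=1):
    """Least-squares fit of x[t] = c + a1*x[t-1] + ... + ap*x[t-p]."""
    x = np.asarray(series, dtype=float)
    # Each row holds the p previous values, most recent first.
    lags = np.array([x[t - p:t][::-1] for t in range(p, len(x))])
    X = np.column_stack([np.ones(len(lags)), lags])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [c, a1, ..., ap]


def predict_next(series, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    recent = np.asarray(series, dtype=float)[-p:][::-1]
    return coef[0] + coef[1:] @ recent


# Synthetic prices (made up): day-to-day changes follow d[t] = 1 + 0.5*d[t-1].
changes = [0.0]
for _ in range(19):
    changes.append(1.0 + 0.5 * changes[-1])
prices = 100.0 + np.cumsum(changes)

diffs = np.diff(prices)          # differencing removes the trend
coef = fit_ar(diffs, p=1)        # fit AR(1) on the differenced series
next_price = prices[-1] + predict_next(diffs, coef)
```

Because the synthetic changes are noise-free, the fit recovers the generating coefficients almost exactly; real price data will be far noisier.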

brisk moth Mar 14, 2021, 12:57 AM

#

could someone help me with an RNN language model?

#

for sentiment analysis

empty sable Mar 14, 2021, 12:57 AM

#

serene scaffold @empty sable what format is the data in?

no idea haha, just thinking about things I may want to learn and specialize in. perhaps a stock price or the sugar level of a diabetic?

proven plinth Mar 14, 2021, 12:58 AM

#

Stock prices are easier to get a hold of than sugar levels

serene scaffold Mar 14, 2021, 12:58 AM

#

empty sable no idea haha, just thinking about things I may want to learn and specialize in. ...

are you familiar with the concept of having data about a type of thing, and having a model that can predict some attribute of those things based on known attributes?

empty sable Mar 14, 2021, 12:59 AM

#

proven plinth Stock prices are easier to get a hold of than sugar levels

I actually have access to sugar levels from a relative which is why im interested

proven plinth Mar 14, 2021, 12:59 AM

#

Thats not enough data my guy

#

Im guessing

empty sable Mar 14, 2021, 1:00 AM

#

serene scaffold are you familiar with the concept of having data about a type of thing, and havi...

hmm never made a model before but I assume I could research it

proven plinth Mar 14, 2021, 1:00 AM

#

If you go the RNN route youre gonna need years of data

serene scaffold Mar 14, 2021, 1:00 AM

#

if it's enough, it might be good at predicting that person's BSLs 🤷‍♂️

empty sable Mar 14, 2021, 1:00 AM

#

proven plinth If you go the RNN route youre gonna need years of data

hmm I have roughly 3 months of data, I could probably use stock data

#

and then change it to sugar levels

#

if the stock one is successful

#

or at least adapt it

proven plinth Mar 14, 2021, 1:01 AM

#

Blood sugar spikes in different ways per person btw, also you would probably also need inputs on their diet too

empty sable Mar 14, 2021, 1:01 AM

#

proven plinth Blood sugar spikes in different ways per person btw, also you would probably als...

I live with them

proven plinth Mar 14, 2021, 1:01 AM

#

Stock prices behave better imo

empty sable Mar 14, 2021, 1:01 AM

#

thats why I want to use their data specifically

serene scaffold Mar 14, 2021, 1:01 AM

#

empty sable hmm I have roughly 3 months of data, I could probably use stock data

what exactly is in that data? blood sugar readings and the time of day of that reading? anything else?

empty sable Mar 14, 2021, 1:02 AM

#

time of day, sugar level, day it happened. I could get more starting today

serene scaffold Mar 14, 2021, 1:02 AM

#

if all you have are their blood sugar levels and timestamps, all you can really do is try to fit a curve for their blood sugar levels throughout the day

#

you don't know anything about what causes those blood sugar levels

#

you'd need data about their nutritional intake, I believe

#

though curve fitting is still good.
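
As a sketch of that curve-fitting idea, one option is a least-squares fit of a single 24-hour sinusoid to (hour, reading) pairs. The function names and the numbers below are invented purely for illustration and are not real readings; the choice of a sinusoid is also an assumption, and a real daily pattern may need a more flexible curve.

```py
import numpy as np


def fit_daily_curve(hours, levels):
    """Fit level ~ c0 + c1*sin(2*pi*h/24) + c2*cos(2*pi*h/24) by least squares."""
    h = np.asarray(hours, dtype=float)
    y = np.asarray(levels, dtype=float)
    angle = 2 * np.pi * h / 24
    X = np.column_stack([np.ones_like(h), np.sin(angle), np.cos(angle)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_level(coef, hour):
    """Evaluate the fitted daily curve at a given hour of the day."""
    angle = 2 * np.pi * hour / 24
    return coef[0] + coef[1] * np.sin(angle) + coef[2] * np.cos(angle)


# Synthetic example data: one reading every 3 hours following an exact sinusoid.
hours = np.arange(0, 24, 3)
levels = 110 + 20 * np.sin(2 * np.pi * hours / 24)

coef = fit_daily_curve(hours, levels)
level_at_6 = predict_level(coef, 6)
```

This only describes the average shape of a day; it knows nothing about meals or insulin, which is the limitation pointed out above.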

empty sable Mar 14, 2021, 1:03 AM

#

its the relation of insulin to what they eat I believe

#

nutritional intake, I could track that

fossil rivet Mar 14, 2021, 1:03 AM

#

Is anyone going to help me or no

empty sable Mar 14, 2021, 1:03 AM

#

what do you guys think of building one for stocks as a baseline

#

and adapting it to sugar levels

proven plinth Mar 14, 2021, 1:03 AM

#

The stocks one will be easier

empty sable Mar 14, 2021, 1:03 AM

#

fossil rivet Is anyone going to help me or no

idk how sorry

serene scaffold Mar 14, 2021, 1:03 AM

#

I don't really know how insulin works, as neither me nor any family members need it

proven plinth Mar 14, 2021, 1:04 AM

#

And give you an idea on how to make things like this

serene scaffold Mar 14, 2021, 1:04 AM

#

do you also get to know how much insulin was administered after each reading, or something?

empty sable Mar 14, 2021, 1:04 AM

#

serene scaffold I don't really know how insulin works, as neither me nor any family members need...

your body naturally produces insulin to counter carbs etc you get from eating food. diabetics have to manually enter insulin

proven plinth Mar 14, 2021, 1:05 AM

#

Insulin ratios differ from person to person

grave frost Mar 14, 2021, 1:05 AM

#

what features would you use?

empty sable Mar 14, 2021, 1:05 AM

#

serene scaffold do you also get to know how much insulin was administered after each reading, or...

I have access to that but havent been tracking it

serene scaffold Mar 14, 2021, 1:05 AM

#

do they know what features are?

empty sable Mar 14, 2021, 1:05 AM

#

proven plinth Insulin ratios differ from person to person

thats why I want to use data specifically for this person

grave frost Mar 14, 2021, 1:05 AM

#

empty sable thats why I want to use data specifically for this person

what would be the things you will track?

empty sable Mar 14, 2021, 1:06 AM

#

grave frost what would be the things you will track?

sugar level, time of day, day of week, meals and nutritional intakes, and insulin intake

grave frost Mar 14, 2021, 1:06 AM

#

how would you track nutritional intakes (since you would take data on your own)

empty sable Mar 14, 2021, 1:07 AM

#

just the main things from each meal, how big the meal was, calories, carbs, etc

grave frost Mar 14, 2021, 1:07 AM

#

unless you would pester them every 10 minutes about what they ate and are going to eat

empty sable Mar 14, 2021, 1:07 AM

#

I live them

#

with them

#

I could just note it, I help take care of them anyways

grave frost Mar 14, 2021, 1:07 AM

#

cool enough

#

seems like reasonable data, so your model should converge

empty sable Mar 14, 2021, 1:07 AM

#

the more data the more accurate generally right?

grave frost Mar 14, 2021, 1:08 AM

#

yep

empty sable Mar 14, 2021, 1:08 AM

#

ok I will start planning this out and reading up on this, thanks for letting me bounce some ideas off you. do you mind if I occasionally dm you? it's cool if you're busy

grave frost Mar 14, 2021, 1:08 AM

#

cool, no problem

#

I would think that insulin level on its own would be a pretty strong feature

#

that with the nutritional intake would be good enough. but better collect all data you can

wild dome Mar 14, 2021, 1:10 AM

#

I want to count the eggs in this image using OpenCV

#

what filters do I have to apply before using findContours? right now I just applied grayscale

grave frost Mar 14, 2021, 1:11 AM

#

mix n match

wild dome Mar 14, 2021, 1:11 AM

#

the empty spaces where there are no eggs are giving me trouble when detecting edges

marsh berry Mar 14, 2021, 1:12 AM

#

Hey all, I've got a spreadsheet with lots of data in it and I want to create visuals (charts, graphs, etc) for it. Do you guys know if there is a list of general stuff I can make?

misty flint Mar 14, 2021, 1:12 AM

#

how come when i call my dependencies, some of them look like this

Pillow @ file:///C:/ci/pillow_1615224175364/work

#

Jinja2 @ file:///tmp/build/80754af9/jinja2_1612213139570/work

#

instead of

grave frost Mar 14, 2021, 1:12 AM

#

.....?

misty flint Mar 14, 2021, 1:12 AM

#

PyPDF2==1.26.0

#

a specific version number

grave frost Mar 14, 2021, 1:13 AM

#

what command?

misty flint Mar 14, 2021, 1:13 AM

#

pip freeze

#

in a virtual env

grave frost Mar 14, 2021, 1:13 AM

#

Its giving me the normal versions

misty flint Mar 14, 2021, 1:13 AM

#

ig its just me then

#

pithink

grave frost Mar 14, 2021, 1:14 AM

#

just a sec

#

Imma try on my server with conda

misty flint Mar 14, 2021, 1:14 AM

#

ah

#

it is a conda env

#

so that might be it

empty sable Mar 14, 2021, 1:15 AM

#

marsh berry Hey all, I've got a spreadsheet with lots of data in it and I want to create vis...

pretty sure you can just do that on google sheets lol. unless you mean more complicated things

grave frost Mar 14, 2021, 1:16 AM

#

marsh berry Hey all, I've got a spreadsheet with lots of data in it and I want to create vis...

yea, you can use pandas with python and libs like matplotlib to do that. look up data visualization in python

#

@misty flint yea, conda gives some file paths

misty flint Mar 14, 2021, 1:17 AM

#

thats a problem right?

#

especially when im making requirements.txt files?

#

or no

grave frost Mar 14, 2021, 1:17 AM

#

prob preinstalled

#

I dunno

#

tqdm @ file:///home/conda/feedstock_root/build_artifacts/tqdm_1609612933698/work

misty flint Mar 14, 2021, 1:17 AM

#

hmm

grave frost Mar 14, 2021, 1:17 AM

#

tqdm came preinstalled in my env

#

so I assume thats why there is a path

misty flint Mar 14, 2021, 1:17 AM

#

i wonder how it will play when i try to throw it into a docker container

#

might have to specify a specific version

grave frost Mar 14, 2021, 1:18 AM

#

make a new, clean conda env

#

if you really want the reproducibility

misty flint Mar 14, 2021, 1:18 AM

#

i did with this one tho

#

or do you mean to make it without the paths

grave frost Mar 14, 2021, 1:19 AM

#

no, just make a brand new one

#

@misty flint https://stackoverflow.com/questions/62885911/pip-freeze-creates-some-weird-path-instead-of-the-package-version

Stack Overflow

pip freeze creates some weird path instead of the package version

I am working on developing a python package. I use pip freeze > requirements.txt to add the required package into the requirement.txt file. However, I realized that some of the packages, instead...

fleet hare Mar 14, 2021, 1:22 AM

#

fossil rivet Would you be able to hop in VC with me? I'm still kinda lost

Yeah, send me a PM with a time

misty flint Mar 14, 2021, 1:29 AM

#

ah make a new requirements.txt gotcha

exotic maple Mar 14, 2021, 1:52 AM

#

@misty flint did your model finish training?

simple shadow Mar 14, 2021, 1:54 AM

#

hey i need help with a dataset
so all demographic info is in one column
how do i split the demographic info into different columns, like age group, gender, and etc

#

exotic maple Mar 14, 2021, 1:55 AM

#

You're going to need some regex for that

exotic maple Mar 14, 2021, 1:55 AM

#

simple shadow hey i need help with a dataset so all demographic info is in one column how do i...

you can use df["break_out"].str.extract( YOUR REGEX GOES HERE) and you can create new columns from the capture groups
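
A small sketch of that `str.extract` idea. The actual layout of the column was only shared as a screenshot, so the "age / gender" string format, the sample values, and the regex below are all assumptions to adapt:

```py
import pandas as pd

# Hypothetical data: the real column contents may look quite different.
df = pd.DataFrame({"break_out": ["18-24 / Female", "25-34 / Male", "65+ / Female"]})

# Named capture groups become the new column names.
pattern = r"^(?P<age_group>[\d+\-]+)\s*/\s*(?P<gender>\w+)$"
parts = df["break_out"].str.extract(pattern)

# Attach the extracted columns next to the original one.
df = df.join(parts)
```

Rows that do not match the pattern come back as NaN in the new columns, which is a quick way to spot values the regex does not cover yet.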

serene scaffold Mar 14, 2021, 1:56 AM

#

would be nice to know what an entire row of the table looks like though

simple shadow Mar 14, 2021, 1:56 AM

#

there are lots of rows

#

15840 rows

serene scaffold Mar 14, 2021, 1:56 AM

#

right, but there's presumably a relatively low number of columns?

exotic maple Mar 14, 2021, 1:56 AM

#

good luck cleaning that mess lol

#

but also, are those repeat instances?

#

I mean. does it have ALL attributes in the same row? or just one attribute or something

#

thats weird

simple shadow Mar 14, 2021, 1:57 AM

#

ok i will take a screenshot of the entire row

#

exotic maple Mar 14, 2021, 1:59 AM

#

that's... messy

#

you can keep it as it is

#

or you can create new columns for each type of breakout

#

but what will you do for empty ones?

#

NaN or 0?

#

it matters a lot for ages, for example

simple shadow Mar 14, 2021, 1:59 AM

#

should i do mean of ages?

exotic maple Mar 14, 2021, 2:00 AM

#

that's not what I mean

simple shadow Mar 14, 2021, 2:00 AM

#

i mean for the empty ones, should i put the mean of age

exotic maple Mar 14, 2021, 2:00 AM

#

simple shadow i mean for the empty ones, should i put the mean of age

that depends a lot on the shape of your data. The problem is, you don't have ages, you have age ranges, which are an ordinal data type

simple shadow Mar 14, 2021, 2:00 AM

#

ohh

exotic maple Mar 14, 2021, 2:01 AM

#

you "could" set the mode of age-range as the fill-in value, but that's your call as researcher

#

and that's only for age. What about gender, race, etc.

simple shadow Mar 14, 2021, 2:01 AM

#

race is a tricky situation

exotic maple Mar 14, 2021, 2:01 AM

#

the problem is that break-out column has a lot of mixed info, so you need to decide what to do with all that data

#

and most importantly, what to do with missing data

simple shadow Mar 14, 2021, 2:02 AM

#

should i keep it as is because race is tricky to do missing data for?

exotic maple Mar 14, 2021, 2:03 AM

#

thats your call. you're the researcher

#

I'd keep it, but the most pressing issue for me is "wtf do i do with rows that do not have that info"

simple shadow Mar 14, 2021, 2:04 AM

#

ohh

#

i looked, and those columns dont have missing data

#

i was wondering how do i make statistical stuff with it if it's all mixed info? @exotic maple

exotic maple Mar 14, 2021, 2:10 AM

#

I didnt explain myself properly

#

think it like this

#

you have a single column called "TYPES" that holds data of type: "Age", "gender", "race", etc.

On a normal DB each of those would be a single column in itself. In your DB, this is all in a single column, which means you have mixed signals in a single feature.

Ideally, you want to separate that feature into multiple features that actually make sense (each one in their single column, as they are independent of each other), but if you do that, you will have missing data because 1 row can only have 1 of either type.

#

So if you create an age column, you will have NaN for all the rows where there is no age specified
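
To illustrate, here is a sketch with a made-up table where each row carries exactly one kind of breakout. The column names `category`, `value`, and `count` are assumptions, since the real table was only shown as a screenshot:

```py
import pandas as pd

# Hypothetical long-format table: one breakout type per row.
long_df = pd.DataFrame({
    "category": ["Age", "Gender", "Age", "Race", "Age"],
    "value": ["18-24", "Female", "25-34", "Hispanic", "18-24"],
    "count": [10, 7, 12, 5, 3],
})

# One new column per breakout type; rows of another type get NaN there.
for cat in long_df["category"].unique():
    long_df[cat.lower()] = long_df["value"].where(long_df["category"] == cat)

missing_age = long_df["age"].isna().sum()

# Optional: fill with the most common age range (the mode idea from earlier).
# Whether that is appropriate is a research decision, not a technical one.
age_mode = long_df["age"].mode().iloc[0]
age_filled = long_df["age"].fillna(age_mode)
```

Here two of the five rows have no age, so the new `age` column has two NaN entries before any filling.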

simple shadow Mar 14, 2021, 2:14 AM

#

ohhh

#

so i can't split the features into different column due to the missing info?

exotic maple Mar 14, 2021, 2:16 AM

#

You can, in fact, you should, but you need to deal with the missing data

simple shadow Mar 14, 2021, 2:17 AM

#

first can i split them into different columns and then deal with the missing data?

exotic maple Mar 14, 2021, 2:18 AM

#

yes, that's what I would do

simple shadow Mar 14, 2021, 2:18 AM

#

thank you for your help!! @exotic maple

exotic maple Mar 14, 2021, 2:18 AM

#

no prob.

#

I also used you to procrastinate and not work on this regex so :v

simple shadow Mar 14, 2021, 2:19 AM

#

XDD

exotic maple Mar 14, 2021, 2:19 AM

#

lemon_sentimental

#

man pandas is so goddamn powerful. Even if you never do any data science, pandas itself is worth all the struggle

misty flint Mar 14, 2021, 2:36 AM

#

i will die for pandas

#

PandaFite

undone heron Mar 14, 2021, 2:48 AM

#

hey guys, I'm working on a df in pandas and after a merge one of the columns is coming in with NaN values.

def preprocess(x):
    df = pd.merge(df_gps, x, on=['bus_id', 'date_time'], how='left')
    df.dropna()
    df.to_csv("./mobility-dataset/merge_gps_translated_validation.csv", index=False)

reader = pd.read_csv("./mobility-dataset/translated_validation.csv", chunksize=10000)

futures = []
with  cf.ThreadPoolExecutor(max_workers=6) as exe:

    for r in reader:
        r['date_time'] = pd.to_datetime(r['date_time'], format='%Y-%m-%d %H:%M:%S')
        r['busline_id'] = r['busline_id'].astype('int32')
        r['bus_id'] = r['bus_id'].astype('int32')

        futures.append(exe.submit(preprocess, r))

cf.wait(fs=futures)

df = pd.read_csv('./mobility-dataset/merge_gps_translated_validation.csv', nrows=1000000)
df

#

here is the output and above is the merge code between two df

#

From the two df, busline_id is actually ok, int32 value and what not

#

Appreciate any thoughts on why the NaN is coming

serene scaffold Mar 14, 2021, 3:05 AM

#

@undone heron I'm looking at this, but please add a py to your code sample

#

!code

arctic wedgeBOT Mar 14, 2021, 3:05 AM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

serene scaffold Mar 14, 2021, 3:08 AM

#

undone heron hey guys, I'm working on a df in pandas and after a merge one of the columns is ...

what does preprocess return?

#

Please ping when you reply or I will probably not know that you replied.

undone heron Mar 14, 2021, 3:09 AM

#

sure

#

pre process basically returns

serene scaffold Mar 14, 2021, 3:09 AM

#

It just so happens that I'm still here

undone heron Mar 14, 2021, 3:09 AM

#

Oh ok lul

serene scaffold Mar 14, 2021, 3:10 AM

#

though for future reference, if I'm helping you, always ping when you've completed your response no matter what.

#

so what does it return?

undone heron Mar 14, 2021, 3:10 AM

#

It returns the image below the code, basically that csv I'm writing @serene scaffold

serene scaffold Mar 14, 2021, 3:10 AM

#

preprocess actually does not return anything

undone heron Mar 14, 2021, 3:10 AM

#

Wait u mean ping you at the end everytime?

#

Well, at the end it is processing what I want, so the final result is the DF in the image

serene scaffold Mar 14, 2021, 3:11 AM

#

Once you have a completed thought that you are ready for me to read and there are no more messages that you are going to type until I respond, ping.

#

def preprocess(x):
    df = pd.merge(df_gps, x, on=['bus_id', 'date_time'], how='left')
    df.dropna()
    df.to_csv("./mobility-dataset/merge_gps_translated_validation.csv", index=False)

Nothing is returned

#

(except, well None)

#

Depending on what exe.submit does, it may be that you don't need it to return something.

#

Note that saving something to disk is not the same as returning it.
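
A sketch of a version that returns its result, using tiny made-up frames in place of the real `df_gps` and CSV chunks. Two details are worth noting: `dropna()` returns a new frame rather than modifying in place, so its result has to be kept, and writing every chunk to the same path with `to_csv` overwrites the previous chunk, so this sketch collects the results and combines them once at the end.

```py
import concurrent.futures as cf

import pandas as pd

# Made-up stand-ins for the real data.
stamp = pd.to_datetime("2015-03-11 08:00:00")
df_gps = pd.DataFrame({"bus_id": [1, 2, 3], "date_time": [stamp] * 3,
                       "lat": [-8.05, -8.06, -8.07]})
chunks = [
    pd.DataFrame({"bus_id": [1], "date_time": [stamp], "busline_id": [10]}),
    pd.DataFrame({"bus_id": [2], "date_time": [stamp], "busline_id": [20]}),
]


def preprocess(chunk):
    merged = pd.merge(df_gps, chunk, on=["bus_id", "date_time"], how="left")
    # Keep the result of dropna(); it does not change `merged` in place.
    return merged.dropna()


with cf.ThreadPoolExecutor(max_workers=2) as exe:
    futures = [exe.submit(preprocess, chunk) for chunk in chunks]
    results = [f.result() for f in futures]  # what each call returned

combined = pd.concat(results, ignore_index=True)
# combined.to_csv(output_path, index=False) would then write everything once.
```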

undone heron Mar 14, 2021, 3:13 AM

#

For sure but on that code, writing to csv is what I need at the end. How would u recommend for me to return something with that futures.append()? @serene scaffold

serene scaffold Mar 14, 2021, 3:14 AM

#

undone heron For sure but on that code, writing to csv is what I need at the end. How would u...

let's not worry about that for now. Can you give me a sample of df_gps and x as strings and not as screenshots?

print(df.iloc[:5].to_csv())

^ that will print a CSV that I can use to get a sense of what you are trying to do. Please do that for df_gps and x

#

!paste

arctic wedgeBOT Mar 14, 2021, 3:15 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold Mar 14, 2021, 3:15 AM

#

^ you can paste the CSVs there.

undone heron Mar 14, 2021, 3:15 AM

#

Oh, nice 1 sec

#

Ok, hopefully this is what you asked for @serene scaffold
https://paste.pythondiscord.com/hizomotazi.css

serene scaffold Mar 14, 2021, 3:20 AM

#

I think it is! and the second one is x?

undone heron Mar 14, 2021, 3:21 AM

#

yep, it is the other csv I'm "merging" with @serene scaffold

serene scaffold Mar 14, 2021, 3:22 AM

#

undone heron yep, it is the other csv I'm "merging" with @serene scaffold

alright, let me see

#

@undone heron is the problem that you're getting nans in busline_id?

#

and if so, how much do you know about the different types of joins you can do?

undone heron Mar 14, 2021, 3:27 AM

#

Yep, that is the problem. I'm quite new to this whole thing so I don't know that much about the join types. I know the theory somewhat but maybe you have better insight. @serene scaffold

serene scaffold Mar 14, 2021, 3:28 AM

#

for one thing, are all your datetimes on 2015-03-11, or is that just because of the rows you picked out for me?

#

Also, here are some of the different types of joins

```
left: use only keys from left frame.

right: use only keys from right frame.

outer: use union of keys from both frames.

inner: use intersection of keys from both frames.
```

#

@undone heron do those types of joins make sense, or do you need me to explain them?

undone heron Mar 14, 2021, 3:34 AM

#

That is some test data for just one day to decrease the amount of data I need to process. So it is all on the same date, yes.
Makes sense! @serene scaffold

high badge Mar 14, 2021, 3:34 AM

#

https://www.youtube.com/watch?v=KTvYHEntvn8&ab_channel=Questpond

YouTube

Questpond

SQL Server join :- Inner join,Left join,Right join and full outer join


undone heron Mar 14, 2021, 3:35 AM

#

So... Maybe outer makes more sense here? @serene scaffold

serene scaffold Mar 14, 2021, 3:35 AM

#

undone heron That is some test data for just one day to decrease the amount of data I need to...

I think with union you would get even more NaNs. Do you know why?

#

because "outer" is the union-like join.

undone heron Mar 14, 2021, 3:36 AM

#

I see, but it feels like the busline_id nan thing is because it is not keeping the column after the merge properly, right? @serene scaffold

serene scaffold Mar 14, 2021, 3:37 AM

#

undone heron I see, but it feels like the busline_id nan thing is because it is not keeping t...

what do you mean, not keeping the column properly? pandas isn't making an error.

undone heron Mar 14, 2021, 3:38 AM

#

I know! I just don't understand why the column is not being kept, I'm not using it as a key for the merge... Why would it become NaN? @serene scaffold

serene scaffold Mar 14, 2021, 3:39 AM

#

the way that join operations work. If you use the wrong type of join for what you are trying to do, Pandas might fill in some blanks with NaNs.
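
A tiny made-up example of how the join type changes where NaNs show up. The frames and values are hypothetical; `indicator=True` is a standard `pd.merge` option that adds a `_merge` column saying which side each row came from:

```py
import pandas as pd

gps = pd.DataFrame({"bus_id": [1, 2, 3], "lat": [-8.05, -8.06, -8.07]})
validation = pd.DataFrame({"bus_id": [2, 3, 4], "busline_id": [20, 30, 40]})

# Left join: every gps row is kept; bus 1 has no match, so busline_id is NaN.
left = pd.merge(gps, validation, on="bus_id", how="left", indicator=True)

# Inner join: only keys present in both frames, so no NaNs are introduced.
inner = pd.merge(gps, validation, on="bus_id", how="inner")

# Outer join: all keys from both sides, so NaNs can appear in either column.
outer = pd.merge(gps, validation, on="bus_id", how="outer")

unmatched = (left["_merge"] == "left_only").sum()
```

Counting the `left_only` rows is a quick way to see how many rows failed to find a partner on the key columns.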

#

Think of it this way

#

you join the tables based on date_time and bus_id, right?

#

within each dataframe, could there ever be two rows that have the same values for both date_time and bus_id?

undone heron Mar 14, 2021, 3:45 AM

#

Matching rows on both dataframes based on those 2 keys? Yes, that is what I'm looking for. @serene scaffold

serene scaffold Mar 14, 2021, 3:45 AM

#

undone heron Matching rows on both dataframes based on those 2 keys? Yes, that is what I'm lo...

if they need to match in both dataframes, which of the four types of joins is right for that?

undone heron Mar 14, 2021, 3:46 AM

#

Inner lul

serene scaffold Mar 14, 2021, 3:46 AM

#

let me know if that works

undone heron Mar 14, 2021, 3:47 AM

#

Great stuff, running the script again, should be ready in about 1 hour. Thanks in advance! Really appreciate the way it was explained ok_handbutflipped

serene scaffold Mar 14, 2021, 3:48 AM

#

it will be an hour before we know if it worked?!

serene scaffold Mar 14, 2021, 3:49 AM

#

undone heron Great stuff, running the script again, should be ready in about 1 hour. Thanks i...

training models can often take a long time (I've had some take over a week, even with an insane amount of compute power), but you should be able to know if your dataframe manipulations are correct... quickly lemon_long

exotic maple Mar 14, 2021, 4:02 AM

#

undone heron Great stuff, running the script again, should be ready in about 1 hour. Thanks i...

an hour to process a join? Are you working with a billion rows or something?

nocturne widget Mar 14, 2021, 4:06 AM

#

I want to identify how similar older sections of text are to newer sections of text we'll call "section A". What type of data should I be looking at? I can't just take only "section A" text, but also "non-section A" text. But would I look into taking this "non-section A" text as any other section besides section A? Also, for text similarity models, would you suggest using an LSTM or a siamese NN? Currently I just have an LSTM in tf.

undone heron Mar 14, 2021, 4:09 AM

#

Well, once it reached that script section it was pretty fast, and it failed: zero entries, all dtypes object, non-null @serene scaffold

exotic maple Mar 14, 2021, 4:17 AM

#

I think im going crazy...why i do comment like im talking omg

#

undone heron Mar 14, 2021, 5:03 AM

#

Hey guys, I have another question not exactly related to my problem above earlier.

I have a couple of dataframes such as validation.csv and I need to "translate" the values of a column and in a way map it to other values I have in another csv (links.csv). How would I go about to do that? For example:

on links.csv
230, 555 -> meaning if I find 555 in validation.csv I need to change it to 230 and so on through the whole file.

Any thoughts?

misty flint Mar 14, 2021, 5:26 AM

#

exotic maple

haha it gives character. i would like reading comments like that

lean ledge Mar 14, 2021, 5:59 AM

#

wild dome I want to count the eggs in this image using OpenCV

Thresholding by brightness + find contours should work

#

Luckily for you, your images seem well controlled

#

Keep in mind you don't need to find the whole segment of the egg, you can just find one half of the egg and count those too

#

If that gives you trouble, you can also find contours as you're doing now and then filter by area. The eggs should be bigger than the small holes

harsh trellis Mar 14, 2021, 6:08 AM

#

there is some huge skewness present in it, so would it be good to use a power transform, or should I use a log transformation instead? Box-Cox doesn't seem like it will work on this, since it only works on positive values

tacit stump Mar 14, 2021, 7:05 AM

#

Is it possible to do text classification using linear regression model by converting the strings to their sum of the ascii values of all the characters?

wide oxide Mar 14, 2021, 7:34 AM

#

Anyone up for implementing this research paper? https://stdm.github.io/downloads/papers/ICDAR_2017.pdf

sweet plaza Mar 14, 2021, 7:56 AM

#

I have an assignment, basic machine learning application, but I'm very new in this.

there must be 1 runner and 2 chaser neural networks (each one of them are separate neural networks) and chasers aim to catch the runner and runner moves randomly.

what kind of ML can be applied here, unsupervised or reinforcement ? and which library would be more appropriate to use?

P.S. runner should be different program, and the environment is common for runner and chasers. environment has walls and created randomly

wet cedar Mar 14, 2021, 10:10 AM

#

Does anyone know the path where I can see the list of pre-trained models in pickle?
I tried opening the pickle file, but it was in an unsupported or binary format, so I can't understand the values.

#

^^this returns an error that the file I'm trying to load does not exist. I want to know if its name has been changed due to updates or something like that

serene scaffold Mar 14, 2021, 12:07 PM

#

undone heron Hey guys, I have another question not exactly related to my problem above earlie...

I bet you could do it with dictionaries and the apply method

#

Or better yet, replace

#

!docs pandas.DataFrame.replace

arctic wedgeBOT Mar 14, 2021, 12:08 PM

#

`pandas.DataFrame.replace`

`DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')`

Replace values given in `to_replace` with `value`.

Values of the DataFrame are replaced with other values dynamically. This differs from updating with `.loc` or `.iloc`, which require you to specify a location to update with some value.

Parameters

**to_replace**: str, regex, list, dict, Series, int, float, or None
How to find the values that will be replaced.

• numeric, str or regex:
    • numeric: numeric values equal to `to_replace` will be replaced with `value`
    • str: string exactly matching `to_replace` will be replaced with `value`
    • regex: regexs matching `to_replace` will be replaced with `value`
• list of str, regex, or numeric:... [read more](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html#pandas.DataFrame.replace)
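
A minimal sketch of the `replace` suggestion for the links.csv question, assuming links.csv has two unnamed columns (new value, then old value, as in `230, 555`) and that the column to translate in validation.csv is called `item_id`. The column names and the inline sample data are hypothetical stand-ins for the real files.

```py
from io import StringIO

import pandas as pd

# Stand-ins for links.csv and validation.csv (hypothetical contents).
links_csv = StringIO("230,555\n231,556\n")
validation_csv = StringIO("item_id,score\n555,0.9\n556,0.1\n777,0.5\n")

links = pd.read_csv(links_csv, header=None, names=["new", "old"])
validation = pd.read_csv(validation_csv)

# Build an {old: new} lookup, then replace matching values in one column.
mapping = dict(zip(links["old"], links["new"]))
validation["item_id"] = validation["item_id"].replace(mapping)

print(validation["item_id"].tolist())
```

Values with no entry in the lookup (777 here) are left unchanged, which is probably what you want when links.csv only lists the ids that need translating.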

grave frost Mar 14, 2021, 12:44 PM

#

sweet plaza I have an assignment, basic machine learning application, but I'm very new in th...

reinforcement

wanton laurel Mar 14, 2021, 1:14 PM

#

very quick pandas question - trying to apply a regex expr to my df column like so:
df.desc_copy = df.desc_copy.apply( lambda x: re.sub(r'(([0-9]{2})(Jan|Feb|Mar|Apr|May|Jun|Jul|Sep|Oct|Nov|Dec)([0-9]{2}))' r'(ON \d\d\d\d-\d+-\d+)|(\d+\d+)', '', str(x)))
the expr is supposed to remove each date string from every row in the desc_copy column but no effect is taking place - why?

#

i was printing the wrong column 😑

grave frost Mar 14, 2021, 1:28 PM

#

wet cedar Does anyone know the path where I can see the list of pre-trained models in pick...

the library you would use already has modules that would allow you to load checkpoints. you can google that

grave frost Mar 14, 2021, 1:29 PM

#

wide oxide Anyone up for implementing this research paper? https://stdm.github.io/downloads...

so basically masked CNN?

dim trail Mar 14, 2021, 2:45 PM

#

hola

#

Hey guys I need some help. I am currently doing my thesis and I got stuck in creating a dictionary.
My data comes from an experiment in which they asked ppl what quantity of CO2 they think certain products emit. In total I have 17 products, which means I have a dataframe with a length of N * 17. What I want is to create a nested dictionary that stores all the responses of the individuals, something like: {1: {car:200, beer:500}, 2: {car:5, beer:10}, ..., N: {car:NN, beer:NN}}. How can I do this?

serene scaffold Mar 14, 2021, 2:52 PM

#

wanton laurel very quick pandas question - trying to apply a regex expr to my df column like s...

!code use this format to display your code

arctic wedgeBOT Mar 14, 2021, 2:52 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

serene scaffold Mar 14, 2021, 2:53 PM

#

@wanton laurel is the problem that your regular expression is not matching anything? or is your problem with usage of dataframes? Please ping me if/when you're ready to continue or I may never know that you've replied.

serene scaffold Mar 14, 2021, 2:53 PM

#

dim trail Hey guys I need some help. I am currently doing my thesis and I got stuck in cre...

is there a reason you want to have nested dicts when you already have a dataframe?

#

In either case, please run print(df.iloc[:5].to_csv()) and copy/paste the string into this chat so I can see what your data looks like.

wanton laurel Mar 14, 2021, 2:55 PM

#

serene scaffold @wanton laurel is the problem that your regular expression is not matchi...

It worked, thanks though

serene scaffold Mar 14, 2021, 2:55 PM

#

wanton laurel It worked, thanks though

so you no longer need assistance? Alright, take care!

wanton laurel Mar 14, 2021, 2:56 PM

#

Yeah that's Right, no longer needed, you too

dim trail Mar 14, 2021, 2:57 PM

#

serene scaffold In either case, please run `print(df.iloc[:5].to_csv())` and copy/paste the stri...

,ppnr,subject,mode,product
0,5e6cf1cb28e5a82aed026a8f,1,500.0,car
1,5e6cf1cb28e5a82aed026a8f,1,4.0,carSocialCost
2,5e6cf1cb28e5a82aed026a8f,1,2.0,warming
3,5e6cf1cb28e5a82aed026a8f,1,700.0,heatwave
4,5e6cf1cb28e5a82aed026a8f,1,900.0,seaLevelRise

serene scaffold Mar 14, 2021, 2:57 PM

#

dim trail ,ppnr,subject,mode,product 0,5e6cf1cb28e5a82aed026a8f,1,500.0,car 1,5e6cf1cb28e5...

great, and what is the expected output given 0,5e6cf1cb28e5a82aed026a8f,1,500.0,car?

dim trail Mar 14, 2021, 2:58 PM

#

serene scaffold In either case, please run `print(df.iloc[:5].to_csv())` and copy/paste the stri...

yes, I am trying to test through different methods how they deviate from the real CO2 emissions

dim trail Mar 14, 2021, 2:58 PM

#

serene scaffold great, and what is the expected output given `0,5e6cf1cb28e5a82aed026a8f,1,500.0...

all those rows are from the same respondent (5e6cf1cb28e5a82aed026a8f)

#

I would like to have a row with all his responses instead of 17

#

that's why I am trying to create a nested dict

serene scaffold Mar 14, 2021, 3:01 PM

#

@dim trail so given 0,5e6cf1cb28e5a82aed026a8f,1,500.0,car, what sub-dict do you want?

dim trail Mar 14, 2021, 3:02 PM

#

one like this: {1 (this is the subject, 5e6cf1cb28e5a82aed026a8f): {car:200, beer:500}, 2: {car:5, beer:10}, ..., N (this is subject N): {car:NN, beer:NN}}

serene scaffold Mar 14, 2021, 3:03 PM

#

dim trail one like this : {1 (this is the subject(5e6cf1cb28e5a82aed026a8f): {car:200,beer...

where did car: 200 come from?

#

I am only asking about 0,5e6cf1cb28e5a82aed026a8f,1,500.0,car

dim trail Mar 14, 2021, 3:06 PM

#

,ppnr,subject,mode,product
0(index),5e6cf1cb28e5a82aed026a8f(identifier of subject 1),1 (subject 1),500.0 (his guess),car(product)
1,5e6cf1cb28e5a82aed026a8f,1,4.0,carSocialCost ---> all this is from same subject, I want all responses as dict in a bigger dict
2,5e6cf1cb28e5a82aed026a8f,1,2.0,warming
3,5e6cf1cb28e5a82aed026a8f,1,700.0,heatwave
4,5e6cf1cb28e5a82aed026a8f,1,900.0,seaLevelRise

serene scaffold Mar 14, 2021, 3:06 PM

#

does car: 200 come from a row that is not given in the sample?

dim trail Mar 14, 2021, 3:06 PM

#

no, that's just an example of mine

serene scaffold Mar 14, 2021, 3:06 PM

#

so each sub-dict in your outputted, nested dict will represent data from different rows?

#

or does each row of the dataframe get represented as one sub-dict in your nested dict?

dim trail Mar 14, 2021, 3:07 PM

#

no, each subdict will represent data for the subjects

#

for subject 1 I have 1

#

for subject 1 I have 17 responses

serene scaffold Mar 14, 2021, 3:09 PM

#

Okay, so the data structure you want is Dict[int, Dict[str, float]], and each key-value pair in the inner dicts is a row that has the same subject as the outer dict

#

let me see

dim trail Mar 14, 2021, 3:09 PM

#

yes

serene scaffold Mar 14, 2021, 3:10 PM

#

are you using mode for anything?

dim trail Mar 14, 2021, 3:10 PM

#

mode are their guesses

#

Dict[subject, Dict[product, mode]]

serene scaffold Mar 14, 2021, 3:11 PM

#

great

#

what is ppnr for?

dim trail Mar 14, 2021, 3:13 PM

#

ppnr is the unique identifier of each subject, I have that to identify each person in my others datasets

serene scaffold Mar 14, 2021, 3:14 PM

#

so what is the key for the outer dict? the subject or the ppnr?

dim trail Mar 14, 2021, 3:14 PM

#

the subject,

#

it would be more readable for me

serene scaffold Mar 14, 2021, 3:15 PM

#

dim trail it would be more readable for me

that means that once you transform this to a dict, the ppnr won't be there

dim trail Mar 14, 2021, 3:16 PM

#

yes, no problem. there is only one subject 1 in the whole experiment

serene scaffold Mar 14, 2021, 3:16 PM

#

okay, so we can drop the ppnr column, basically

dim trail Mar 14, 2021, 3:16 PM

#

yes

serene scaffold Mar 14, 2021, 3:19 PM

#

@dim trail I'm still looking into it

dim trail Mar 14, 2021, 3:20 PM

#

thanks

serene scaffold Mar 14, 2021, 3:27 PM

#

@dim trail still looking

dim trail Mar 14, 2021, 3:30 PM

#

is it very complicated? I tried for hours

serene scaffold Mar 14, 2021, 3:33 PM

#

dim trail is it very complicated? I tried for hours

I can't find a "good" solution, so all I can suggest is iterating through every row of the dataframe and adding the data from each row into one dictionary
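
For reference, the row-iteration idea could look roughly like this. It is only a sketch built on the five sample rows posted above; `nested` and the use of `setdefault` are illustrative choices, not anything from the original data pipeline.

```py
from io import StringIO

import pandas as pd

csv_text = """,ppnr,subject,mode,product
0,5e6cf1cb28e5a82aed026a8f,1,500.0,car
1,5e6cf1cb28e5a82aed026a8f,1,4.0,carSocialCost
2,5e6cf1cb28e5a82aed026a8f,1,2.0,warming
3,5e6cf1cb28e5a82aed026a8f,1,700.0,heatwave
4,5e6cf1cb28e5a82aed026a8f,1,900.0,seaLevelRise
"""
df = pd.read_csv(StringIO(csv_text), index_col=0)

nested = {}
for row in df.itertuples(index=False):
    # setdefault creates the inner dict the first time a subject appears.
    nested.setdefault(row.subject, {})[row.product] = row.mode

print(nested)
```

`itertuples` is generally faster than `iterrows`, but either way this is a plain Python loop, so it only makes sense when the end result has to be a dict anyway.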

ripe forge Mar 14, 2021, 3:35 PM

#

. / gasp

dim trail Mar 14, 2021, 3:35 PM

#

serene scaffold I can't find a "good" solution, so all I can suggest is iterating through every ...

thanks

ripe forge Mar 14, 2021, 3:36 PM

#

Dataframe iteration is usually a sin. So there's a good chance I may request you to explain the context again, waste 30 min, and then come to the same conclusion.

#

Though if you're making dictionaries out of it you're anyways leaving dataframes behind

dim trail Mar 14, 2021, 3:42 PM

#

I will try row iteration and if I can't find a solution, I'll abandon the idea and try something else

serene scaffold Mar 14, 2021, 4:09 PM

#

ripe forge Dataframe iteration is usually a sin. So there's a good chance I may request you...

I can explain it for them, since I already got my head into the problem

#

I'd actually like to know if there's a "panda-ic" solution

#

```py
,subject,mode,product
0,1,500.0,car
1,1,4.0,carSocialCost
2,1,2.0,warming
3,2,700.0,heatwave
4,2,900.0,seaLevelRise
```

The desired output is:

```py
{1: {'car': 500.0, 'carSocialCost': 4.0, 'warming': 2.0},
 2: {'heatwave': 700.0, 'seaLevelRise': 900.0}}
```

The problem is that you're basically trying to create new columns based on values in the product column.
@ripe forge I tried to do a pivot table and then do to_dict

dim trail Mar 14, 2021, 4:38 PM

#

serene scaffold ```py ,subject,mode,product 0,1,500.0,car 1,1,4.0,carSocialCost 2,1,2.0,warming ...

Does the pivot table organize the data by subject? Or do I still have the same problem in which I have 17 rows with answers of subject 1
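
To make the pivot question concrete, here is a hedged sketch using the toy data above: `DataFrame.pivot` gives one row per subject with one column per product, and products a subject never answered show up as NaN. The names `wide` and `nested` are just for illustration.

```py
from io import StringIO

import pandas as pd

csv_text = """,subject,mode,product
0,1,500.0,car
1,1,4.0,carSocialCost
2,1,2.0,warming
3,2,700.0,heatwave
4,2,900.0,seaLevelRise"""
df = pd.read_csv(StringIO(csv_text), index_col=0)

# One row per subject, one column per product; missing answers become NaN.
wide = df.pivot(index="subject", columns="product", values="mode")
print(wide)

# If a nested dict is still needed, drop the NaN cells row by row.
nested = {subject: row.dropna().to_dict() for subject, row in wide.iterrows()}
print(nested)
```

Note that `pivot` raises an error if the same subject has two rows for the same product, so this assumes each (subject, product) pair appears once.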

ripe forge Mar 14, 2021, 4:39 PM

#

!e

```py
import pandas as pd
from io import StringIO

string = """,subject,mode,product
0,1,500.0,car
1,1,4.0,carSocialCost
2,1,2.0,warming
3,2,700.0,heatwave
4,2,900.0,seaLevelRise"""

df = pd.read_csv(StringIO(string))

def dict_creator(df):
    return dict(zip(df['product'], df['mode']))
out = df.groupby('subject').apply(dict_creator).to_dict()
print(out)
```

arctic wedgeBOT Mar 14, 2021, 4:39 PM

#

@ripe forge :white_check_mark: Your eval job has completed with return code 0.

{1: {'car': 500.0, 'carSocialCost': 4.0, 'warming': 2.0}, 2: {'heatwave': 700.0, 'seaLevelRise': 900.0}}

ripe forge Mar 14, 2021, 4:39 PM

#

this would be my knee jerk reaction to it, but it's essentially looping via apply

serene scaffold Mar 14, 2021, 4:40 PM

#

ripe forge !e ```py import pandas as pd from io import StringIO string = """,subject,mode,...

that code is beautiful though, implicit looping aside.

ripe forge Mar 14, 2021, 4:41 PM

#

./blushes crimson

#

i dont think you'll have a lot of gains because ultimately vectorization has to be broken to create the dictionaries at the end though. ideally for larger datastructures you want to avoid going back to dictionaries if possible when using pandas. but this probably is enough for OP's needs

dim trail Mar 14, 2021, 4:43 PM

#

ripe forge !e ```py import pandas as pd from io import StringIO string = """,subject,mode,...

nice, but it copies the value 500 for each product

grave frost Mar 14, 2021, 4:44 PM

#

simplest code to remove a specific list of words from another list of words? maybe using sets?

ripe forge Mar 14, 2021, 4:44 PM

#

could you elaborate? does the toy example showcase your original problem adequately?

ripe forge Mar 14, 2021, 4:44 PM

#

grave frost simplest code to remove a specific list of words from another list of words? may...

the words that should be removed can be in a set. then, just iterate on the list of words, and keep those words that arent found in the set

grave frost Mar 14, 2021, 4:45 PM

#

yea, is there any simpler method without iterating?

#

something elegant

ripe forge Mar 14, 2021, 4:45 PM

#

iterating is the simple method. you could use set difference (edit: not intersection) if you really wanted to, but you arent gaining performance there

#

[word for word in words if word not in words_to_remove] # where words_to_remove is a set
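
A small runnable version of the comprehension, next to the set-difference alternative, with made-up example words. The comprehension keeps order and duplicates; the set difference does not.

```py
words = ["the", "quick", "brown", "the", "fox", "fox"]
words_to_remove = {"the", "brown"}  # a set gives fast membership checks

# Keeps the original order and any duplicates that are not removed.
kept = [word for word in words if word not in words_to_remove]

# Set difference also drops the words, but loses order and duplicates.
unordered = set(words) - words_to_remove

print(kept)
print(unordered)
```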

grave frost Mar 14, 2021, 4:46 PM

#

one of the list is nested which complicates it 😅

ripe forge Mar 14, 2021, 4:47 PM

#

ah, the plot thickens

#

can you make a minimal example that showcases the question adequately?

exotic maple Mar 14, 2021, 5:22 PM

#

You could probably iterate over every element but that sounds very efficient lol

#

doesnt sound

#

@grave frost is it possible to transform those lists to numpy arrays? If you can, you could create a 2d array and filter via masking

#

should be more efficient as a vector operation instead of iterating over N elements of nested lists

grave frost Mar 14, 2021, 5:24 PM

#

@exotic maple nah, its done with python

#

thanx anyways

exotic maple Mar 14, 2021, 5:25 PM

#

oh you want it in base python?

spare vine Mar 14, 2021, 5:25 PM

#

[el for sublist in mylist for el in sublist if el not in words_to_remove]

#

nested for loop but in a list comprehension

grave frost Mar 14, 2021, 5:25 PM

#

no, I meant the prob has already been resolved in pure python, so no need to use numpy 🙂

exotic maple Mar 14, 2021, 5:28 PM

#

I mean yes, you can do it in pure python. Let me think...

I would try this to get all unique words:

set_var = set()
for sublist in mylist:  # mylist is the nested list of words
    for element in sublist:
        set_var.add(element)

#

question: Do you want to delete unique words or unique lists (as in, the actual block)?

spare vine Mar 14, 2021, 5:30 PM

#

for the general case of arbitrarily nested lists you get to use recursion
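
A sketch of what the recursive version might look like, assuming the data is made of plain Python lists nested to any depth. `remove_words` is a hypothetical helper name.

```py
def remove_words(items, words_to_remove):
    """Return a copy of a nested list with unwanted words dropped at every depth."""
    cleaned = []
    for item in items:
        if isinstance(item, list):
            # Recurse into sublists and keep the nesting structure.
            cleaned.append(remove_words(item, words_to_remove))
        elif item not in words_to_remove:
            cleaned.append(item)
    return cleaned


nested_words = ["a", ["b", "stop", ["c", "stop"]], "stop"]
result = remove_words(nested_words, {"stop"})
print(result)
```

Unlike the flattening comprehension, this keeps the sublists as sublists, which matters if each sublist is one document or sentence.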

grave frost Mar 14, 2021, 5:32 PM

#

leave it guys 🙂

#

If you want to see the solution, you can check #help-apple

#

(but you will have to scroll up)

spare vine Mar 14, 2021, 5:33 PM

#

or here: [el for sublist in mylist for el in sublist if el not in words_to_remove]

grave frost Mar 14, 2021, 5:33 PM

#

no, it had to be done like this:

out = [[word for word in words if word not in [_.replace(' ','') for _ in translated_stop]] for words in tqdm(data_text)]
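
One thing worth noting about the line above: the inner `[_.replace(' ','') for _ in translated_stop]` list is rebuilt for every single word. A possible variant, sketched here with hypothetical sample data and without the `tqdm` progress bar, builds the cleaned stop words once as a set.

```py
# Hypothetical stand-ins for the real data_text and translated_stop.
data_text = [["the", "cat", "sat"], ["on", "the", "mat"]]
translated_stop = ["t he", "on "]  # stop words with stray spaces

# Strip the spaces once, up front, and store the result in a set.
stop_set = {word.replace(" ", "") for word in translated_stop}

out = [[word for word in words if word not in stop_set] for words in data_text]
print(out)
```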

heady tide Mar 14, 2021, 6:39 PM

#

I have a multilayered perceptron implemented in python, but it spits out probabilities instead of classes, what do I need to do for it to return classes ? Should I change the activation function of the last layer to softmax ?

exotic maple Mar 14, 2021, 6:40 PM

#

heady tide I have a multilayered perceptron implemented in python, but it spits out probabi...

when you do predict(X) you get probability instead of response?

#

are you sure arent using predict_proba?

heady tide Mar 14, 2021, 6:41 PM

#

I am just using the forward propagation value

#

the result from the last layer
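
Softmax would still give probabilities, just normalized ones. The usual step from probabilities to classes is to take the index of the largest output per sample. A minimal NumPy sketch, assuming the last layer's output is a 2D array with one row per sample and one column per class (the arrays below are made up):

```py
import numpy as np

# Hypothetical last-layer output: one row per sample, one column per class.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]])

# The predicted class is the column with the highest probability in each row.
predicted_classes = probs.argmax(axis=1)

# With a single sigmoid output for binary classification, threshold instead.
binary_probs = np.array([0.2, 0.9, 0.5])
binary_classes = (binary_probs >= 0.5).astype(int)

print(predicted_classes, binary_classes)
```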

shy kraken Mar 14, 2021, 6:49 PM

#

I'm trying to improve my documentation reading skills. I'm looking at numpy's linspace and I see code like this:

test = np.linspace(0, 500, 12)

and you get the same result if you do this:

test = np.linspace(0, 500, num=12)

Now when I look at the docs, it looks like this:

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)

My question is how do I know that I can type "num" or not and it will work the same? When I first saw it I was confused what the 12 did...

tidal bough Mar 14, 2021, 6:54 PM

#

shy kraken I'm trying to improve my documentation reading skills. I'm looking at numpy's l...

My question is how do I know that I can type "num" or not and it will work the same?
Because num is the third argument.

#

that's why when you pass 3 arguments, they resolve to start, stop, num

#

if you passed 4, the fourth one would be assumed to be endpoint, the fifth one retstep, and so on

#

Specifying arguments by name allows you to specify them regardless of position. Say, dtype you need to pass by name unless you are also passing num, endpoint and retstep
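
The same point as a quick runnable check, using only the `np.linspace` parameters shown in the signature above:

```py
import numpy as np

a = np.linspace(0, 500, 12)                   # num passed by position (third slot)
b = np.linspace(0, 500, num=12)               # same argument passed by name
c = np.linspace(0, 500, dtype=float, num=12)  # keywords can come in any order

print(np.array_equal(a, b), np.array_equal(a, c))
print(len(a), a[0], a[-1])
```

The only ordering rule is that positional arguments have to come before keyword arguments in the call.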

shy kraken Mar 14, 2021, 6:55 PM

#

understood thanks! @tidal bough

misty flint Mar 14, 2021, 7:24 PM

#

that concept of positional vs. keyword arguments

#

https://stackoverflow.com/questions/9450656/positional-argument-v-s-keyword-argument#:~:text=Positional arguments are arguments that,not passed to the function.

Stack Overflow

Positional argument v.s. keyword argument

Based on this
A positional argument is a name that is not followed by an equal sign
(=) and default value.

A keyword argument is followed by an equal sign and an expression that
gives its

#

Sip

exotic maple Mar 14, 2021, 7:27 PM

#

functions and their definitions (*args, **kwargs) sounds like you're grumbling

sacred gate Mar 14, 2021, 8:23 PM

#

Hi. I would like to ask for some help with the choice of literature for data science (more precisely, the bioinformatics field). I'm reading "Mathematics for machine learning" by M. P. Deisenroth and "Practical Statistics for data scientists" by Peter Bruce. But I'm not sure about my choice, especially about the second one. Could you recommend me some books concerning Statistics for DS (especially with python examples)?

P.S.: I have named just the books I'm reading at the moment. I'm also planning to read "Data science from scratch", "Deep learning for the Life Sciences" from the publisher O'Reilly and some other books.

Thanks in advance

hollow sentinel Mar 14, 2021, 8:32 PM

#

how are you liking practical statistics for DS

#

I have that book stashed somewhere

lean ledge Mar 14, 2021, 8:35 PM

#

Bishop's machine learning book or elements of statistical learning are the classic ML references

sacred gate Mar 14, 2021, 8:36 PM

#

@hollow sentinel I have a sense that it's incomplete; maybe that's explained by the fact that this book doesn't cover the mathematical part of statistics

sacred gate Mar 14, 2021, 8:36 PM

#

lean ledge Bishop's machine learning book or elements of statistical learning are the class...

Thanks

sacred gate Mar 14, 2021, 8:41 PM

#

lean ledge Bishop's machine learning book or elements of statistical learning are the class...

What is better according to you?

lean ledge Mar 14, 2021, 8:42 PM

#

Personal preference. I like bishop more

granite wolf Mar 14, 2021, 8:58 PM

#

Hey please could someone help me with a for loop where I'm trying to read multiple tables from an SQLite database, the current code looks like this:

#

#

so i have a list of tables within the database, and the idea is that for each table in the list a dataframe is produced as df_<table-name>

#

for example the output should be 7 dataframes: df_sqlite_sequence, df_Player_Attributes etc

#

but currently it is just loading the last table (Team) as df_table, overwriting the previous one

exotic maple Mar 14, 2021, 9:27 PM

#

@granite wolf look at your code and think about what it does.

Your loop goes through every query and assigns the resulting dataframe to a variable. This variable does not change in any form in any iteration, so it's the same variable every time.

#

basically, you're overwriting the resulting dataframes with each step of the loop. That's why you only get the last table

#

now, depending on what you want there are many ways to move forward

#

you can create a list and store the dataframes there as elements of the list. This is the simplest solution and they will have the same order as the parent list, but you will not have any linking between them

granite wolf Mar 14, 2021, 9:38 PM

#

exotic maple you can create a list and store the dataframes there as elements of the list. Th...

Thanks for replying, if they are stored as part of a list element, could i then unpack the list to get the desired multiple dataframes?

#

im basically aiming for different dfs for each table in the database
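
Since the original snippet didn't survive in the log, here is only a sketch of one common way to get "one dataframe per table": a dict keyed by table name instead of separate `df_<table-name>` variables. The in-memory database, its columns and its rows are hypothetical; only the table names `Team` and `Player_Attributes` come from the messages above.

```py
import sqlite3

import pandas as pd

# In-memory stand-in for the real database file (schema and rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Team (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE Player_Attributes (id INTEGER, team_id INTEGER)")
conn.execute("INSERT INTO Team VALUES (1, 'A'), (2, 'B')")
conn.execute("INSERT INTO Player_Attributes VALUES (10, 1)")

tables = ["Team", "Player_Attributes"]

# One DataFrame per table; the key plays the role of the variable name.
dfs = {table: pd.read_sql_query(f'SELECT * FROM "{table}"', conn) for table in tables}
conn.close()

print(dfs["Team"].shape, dfs["Player_Attributes"].shape)
```

`dfs["Team"]` then works like `df_Team` would have, and you can still loop over `dfs.items()` when you need to do the same thing to every table.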

sour mango Mar 14, 2021, 10:29 PM

#

hey guys, im new to ml... i wanted to know how you can parse a live video feed from your computer's webcam? (i am trying to make a rock paper scissors game where i play against the computer by showing rock, paper or scissors with my hand to the camera)

shy kraken Mar 14, 2021, 10:33 PM

#

Got a question, let's say you have a computer that can do your machine learning operations reasonably well but not great, meaning it takes a little long to do. Is there any reason why you wouldn't use google colab? Like is there a benefit to just keeping everything on your computer?

uncut orbit Mar 14, 2021, 10:39 PM

#

there isn't really a benefit to keeping everything on your computer

#

but it takes a little long for mostly everyone

#

if you switch everything to another computer....then it makes it harder to get back all of your projects

uncut orbit Mar 14, 2021, 10:41 PM

#

sour mango hey guys, im new to ml... i wanted to know how you can parse a live video feed f...

opencv

sour mango Mar 14, 2021, 10:46 PM

#

uncut orbit opencv

time for some research.. thanks 🙂

uncut orbit Mar 14, 2021, 10:50 PM

#

welcome

#

anytime

shy kraken Mar 14, 2021, 10:54 PM

#

uncut orbit if you switch everything to another computer....then it makes it harder to get b...

Thanks, could you explain this comment a little more? Do you mean if I get a new computer, it's hard to transfer it over from my old computer?

uncut orbit Mar 14, 2021, 10:55 PM

#

its not too hard

#

but it is a headache

#

you need to move files from this computer to that

shy kraken Mar 14, 2021, 10:56 PM

#

ahh ok, so you're advocating for putting it on colab because of the cloud

uncut orbit Mar 14, 2021, 10:56 PM

#

yea

#

use github too

#

private repo

shy kraken Mar 14, 2021, 10:56 PM

#

yeah that makes sense

#

thanks!

uncut orbit Mar 14, 2021, 10:57 PM

#

welcome

iron basalt Mar 14, 2021, 10:58 PM

#

The upside to cloud is that it does not matter which computer you use, all your stuff is always there. The downside is that you need an internet connection (that is stable and decently fast).

grave frost Mar 14, 2021, 10:59 PM

#

iron basalt The upside to cloud is that it does not matter which computer you use, all your ...

hmmm...but for colab, its just text; even if your connection is not that stable, I doubt 15 seconds matter a lot

#

but overall, the recommendation is GCP. Colab is good for newbies

iron basalt Mar 14, 2021, 11:00 PM

#

Idk depends on how quickly you are doing things, the stability and lag can get in the way of a fast feedback loop (assuming the operations themselves don't take very long).

#

(And images can take a long time if you have a very slow internet speed)

grave frost Mar 14, 2021, 11:01 PM

#

The most overhead I personally face is in the preprocessing part, but thats something even my decade-old laptop can do

iron basalt Mar 14, 2021, 11:03 PM

#

Either way, the answer is just use both, local and cloud. Backups are always good.

uncut orbit Mar 14, 2021, 11:04 PM

#

colab is good for sharing code too

iron basalt Mar 14, 2021, 11:04 PM

#

Yup, hence the name.

uncut orbit Mar 14, 2021, 11:06 PM

#

https://colab.research.google.com/drive/1Fk2yMF1vLLNbVhccn-nBeLnk_uNPnL5v?usp=sharing
I made this notebook that automates linear and logistic regression...please make a copy to use it and the data that you put in should be cleaned(I'm still working on that). Feedback would be nice.

Google Colaboratory

#

P.S. don't fear the loss of your job....the notebook makes basic predictions. we'll always need people to fine tune models 😉

grave frost Mar 14, 2021, 11:15 PM

#

don't fear the loss of your job
.....?

bitter harbor Mar 14, 2021, 11:21 PM

#

is there a reason to use seaborn over mpl?

#

my profs exclusively teaching it and i can't see why

uncut orbit Mar 14, 2021, 11:23 PM

#

grave frost > don't fear the loss of your job .....?

data scientists are good at automating jobs...

grave frost Mar 14, 2021, 11:24 PM

#

uncut orbit data scientists are good at automating jobs...

what....?

uncut orbit Mar 14, 2021, 11:24 PM

#

yea

grave frost Mar 14, 2021, 11:24 PM

#

Data science has nothing to do with automating jobs. it is about deriving insight with data

uncut orbit Mar 14, 2021, 11:24 PM

#

it is

#

thats true

grave frost Mar 14, 2021, 11:25 PM

#

any field even remotely close to that is just robotics

uncut orbit Mar 14, 2021, 11:25 PM

#

but i view data science as something that makes life easier

grave frost Mar 14, 2021, 11:25 PM

#

well, that's pretty wrong

uncut orbit Mar 14, 2021, 11:25 PM

#

ok

hollow sentinel Mar 14, 2021, 11:26 PM

#

I think you're thinking of ML engineers

#

they're more focused on automation

uncut orbit Mar 14, 2021, 11:27 PM

#

hmm

#

maybe

#

but please get back to me on the notebook pls feel free to dm me

grave frost Mar 14, 2021, 11:28 PM

#

hollow sentinel I think you're thinking of ML engineers

they do not do automation. They are just implementing pipelines and models 🤷 IMO Robotics is the closest field to automation

uncut orbit Mar 14, 2021, 11:28 PM

#

ok now im confused

sage aurora Mar 14, 2021, 11:32 PM

#

hello

grave frost Mar 14, 2021, 11:32 PM

#

uncut orbit ok now im confused

about what?

uncut orbit Mar 14, 2021, 11:32 PM

#

automation using data science

grave frost Mar 14, 2021, 11:32 PM

#

uncut orbit automation using data science

data science != automation

sage aurora Mar 14, 2021, 11:33 PM

#

i need a very simple help; need to plot a function in a subplot (basic)

uncut orbit Mar 14, 2021, 11:33 PM

#

what about integrating data science with automation

uncut orbit Mar 14, 2021, 11:33 PM

#

grave frost data science != automation

wait yea i get that

grave frost Mar 14, 2021, 11:33 PM

#

data science is just an umbrella term to signify someone very experienced with statistics and other relevant fields to derive insight from data.

grave frost Mar 14, 2021, 11:33 PM

#

uncut orbit what about integrating data science with automation

how can you do that?

grave frost Mar 14, 2021, 11:34 PM

#

uncut orbit https://colab.research.google.com/drive/1Fk2yMF1vLLNbVhccn-nBeLnk_uNPnL5v?usp=sh...

the theory for that was made years ago

uncut orbit Mar 14, 2021, 11:34 PM

#

like self driving cars

grave frost Mar 14, 2021, 11:34 PM

#

uncut orbit like self driving cars

thats AI. not Data science.

uncut orbit Mar 14, 2021, 11:34 PM

#

wait i get that too

#

oh shoot

bitter harbor Mar 14, 2021, 11:34 PM

#

what would you put ai under?

uncut orbit Mar 14, 2021, 11:34 PM

#

i got all my definitions wrong

grave frost Mar 14, 2021, 11:34 PM

#

Data scientists do a little of ML (Machine Learning) but its mostly ML researchers and engineers that do the more complex stuff

uncut orbit Mar 14, 2021, 11:35 PM

#

ok

grave frost Mar 14, 2021, 11:35 PM

#

bitter harbor what would you put ai under?

AI is a parent term on its own

uncut orbit Mar 14, 2021, 11:35 PM

#

never thought of it that way

bitter harbor Mar 14, 2021, 11:35 PM

#

huh I always thought it and nn's were fields of data sci

grave frost Mar 14, 2021, 11:36 PM

#

there's no exact definition for a data scientist. but we can draw some lines

uncut orbit Mar 14, 2021, 11:36 PM

#

ok

#

now if i want to do ai with robotics how would that work

grave frost Mar 14, 2021, 11:36 PM

#

NN's (Neural Networks) are mostly grouped under ML

grave frost Mar 14, 2021, 11:36 PM

#

uncut orbit now if i want to do ai with robotics how would that work

you can do that, but you would need a PhD. that stuff is pretty complex

#

but you can learn right now too 🙂

uncut orbit Mar 14, 2021, 11:37 PM

#

i've been doing data science since i was 12

grave frost Mar 14, 2021, 11:37 PM

#

uncut orbit i've been doing data science since i was 12

good for you

uncut orbit Mar 14, 2021, 11:37 PM

#

grave frost but you can learn right now too 🙂

what do i do to learn

grave frost Mar 14, 2021, 11:37 PM

#

uncut orbit what do i do to learn

what do you specifically want to learn?

uncut orbit Mar 14, 2021, 11:37 PM

#

ai with robotics

#

hmm

hollow sentinel Mar 14, 2021, 11:38 PM

#

you've been learning it since you were 12? your math skills must be good

grave frost Mar 14, 2021, 11:38 PM

#

Thats Reinforcement learning

grave frost Mar 14, 2021, 11:38 PM

#

hollow sentinel you've been learning it since you were 12? your math skills must be good

dont stress.

uncut orbit Mar 14, 2021, 11:38 PM

#

lmao

hollow sentinel Mar 14, 2021, 11:38 PM

#

I'm not

grave frost Mar 14, 2021, 11:38 PM

#

hollow sentinel I'm not

no, its just that your tone sounds not very encouraging for a beginner

uncut orbit Mar 14, 2021, 11:38 PM

#

ok ig i'll learn reinforcement learning

grave frost Mar 14, 2021, 11:38 PM

#

no offense

hollow sentinel Mar 14, 2021, 11:39 PM

#

I don't like the accusation

uncut orbit Mar 14, 2021, 11:39 PM

#

grave frost Mar 14, 2021, 11:39 PM

#

hollow sentinel I don't like the accusation

no, it was not an accusation 🙂

uncut orbit Mar 14, 2021, 11:39 PM

#

using neural nets right?

grave frost Mar 14, 2021, 11:39 PM

#

uncut orbit using neural nets right?

yep

grave frost Mar 14, 2021, 11:40 PM

#

uncut orbit using neural nets right?

RL is a little bit different from other types of ML, but fundamentally they are pretty much the same

#

like both aim to optimize some function

hollow sentinel Mar 14, 2021, 11:40 PM

#

there's a generous amount of math that's important to know

#

that's all

uncut orbit Mar 14, 2021, 11:40 PM

#

grave frost like both aim to optimize some function

ok

#

now what resources do i need?

grave frost Mar 14, 2021, 11:41 PM

#

you can learn about it more by watching 3blue1brown for some basic maths and then prob pick some course

hollow sentinel Mar 14, 2021, 11:41 PM

#

https://mml-book.com/

Mathematics for Machine Learning

#

this is good

uncut orbit Mar 14, 2021, 11:41 PM

#

ok

#

phd takes 12 years right?

grave frost Mar 14, 2021, 11:42 PM

#

uncut orbit phd takes 12 years right?

uhh, no I dont think so

uncut orbit Mar 14, 2021, 11:42 PM

#

how long for reinforcement learning?

grave frost Mar 14, 2021, 11:42 PM

#

BTW what is your prior experience? just curious

grave frost Mar 14, 2021, 11:42 PM

#

uncut orbit how long for reinforcement learning?

if you treat something like a goal, then you would never be productive

uncut orbit Mar 14, 2021, 11:43 PM

#

grave frost BTW what is your prior experience? just curious

data science, opencv, and some neural nets

uncut orbit Mar 14, 2021, 11:43 PM

#

grave frost if you treat something like a goal, then you would never be productive

ok i get you

grave frost Mar 14, 2021, 11:43 PM

#

its much better to have an overall positive attitude towards learning than just doing something in "x" amount of time

grave frost Mar 14, 2021, 11:43 PM

#

uncut orbit data science, opencv, and some neural nets

which types of NN?

uncut orbit Mar 14, 2021, 11:43 PM

#

igs

uncut orbit Mar 14, 2021, 11:43 PM

#

grave frost which types of NN?

CNNs, regression

grave frost Mar 14, 2021, 11:44 PM

#

well, then you have the basics already done. I think you can move on to RL from there on

uncut orbit Mar 14, 2021, 11:44 PM

#

ok

#

thx so much

grave frost Mar 14, 2021, 11:44 PM

#

cool, no worries

iron basalt Mar 15, 2021, 12:14 AM

#

ROCm support in pytorch is so nice. Don't need a Nvidia GPU.

#

Saves $$$.

grave frost Mar 15, 2021, 12:50 AM

#

Is ROCm good now? I had ordered an AMD GPU before cuz I wanted to try it, but I got disappointed with the bugs and performance so returned the card to get an Nvidia one

#

But I read some intellectual discussion where they mentioned weird C stuff to prove that AMD cards won't be able to compete with CUDA's performance.

iron basalt Mar 15, 2021, 12:54 AM

#

AMD vs Nvidia performance tests are all really bad (both ways), just try it yourself.

#

(Even if one could be faster than the other it's also limited by how much effort was put into each by the library being used)

grave frost Mar 15, 2021, 12:55 AM

#

Nvidia poured billions on CUDA

iron basalt Mar 15, 2021, 12:55 AM

#

Newer AMD GPUs align more with Nvidia GPUs too

#

That's because Nvidia wanted an iron grip on the ML community. So all the libraries added CUDA support and ignored AMD.

grave frost Mar 15, 2021, 12:56 AM

#

AMD hasn't made many contributions to computing, and OpenCL sucks

grave frost Mar 15, 2021, 12:56 AM

#

iron basalt That's because Nvidia wanted an iron grip on the ML community. So all the librar...

well, they deserved it then, and we regret it now

iron basalt Mar 15, 2021, 12:57 AM

#

Yeah, they invested.

grave frost Mar 15, 2021, 12:57 AM

#

If it were me, I would have gone with CUDA too. its the most sane decision

iron basalt Mar 15, 2021, 12:57 AM

#

But AMD is cheap, and all that so it's totally worth having support for it even if it's slower.

#

I think the way it works with ROCm is that it somehow runs the CUDA code on AMD gpus.

grave frost Mar 15, 2021, 12:58 AM

#

yeah, AMD is so great. Nvidia just had a monopoly and milked all the money

iron basalt Mar 15, 2021, 12:58 AM

#

so they still are using pycuda

grave frost Mar 15, 2021, 12:58 AM

#

iron basalt I think the way it works with ROCm is that it somehow runs the CUDA code on AMD ...

making a layer is never great for performance

iron basalt Mar 15, 2021, 12:58 AM

#

That way they don't need to recode everything

grave frost Mar 15, 2021, 12:58 AM

#

the aim should be for native

#

not hacky stuff.

iron basalt Mar 15, 2021, 12:58 AM

#

Not sure how hacky it is, if at all

heady tide Mar 15, 2021, 12:59 AM

#

The graph on the right represents the error of the output layer after each epoch, is this normal for a MLP with these hyperparameters ?

Screen_Shot_2021-03-15_at_1.55.37_AM.png

grave frost Mar 15, 2021, 12:59 AM

#

heady tide The graph on the right represents the error of the output layer after each epoch...

just asking, what is that program?

heady tide Mar 15, 2021, 1:00 AM

#

I made it, a live visualization of how a multilayered perceptron works, using PyQt and multiprocessing to avoid GIL

grave frost Mar 15, 2021, 1:00 AM

#

that is pretty cool

heady tide Mar 15, 2021, 1:00 AM

#

thank you

grave frost Mar 15, 2021, 1:00 AM

#

iron basalt Not sure how hacky it is, if at all

if it wasn't, they wouldn't have lost out on performance

grave frost Mar 15, 2021, 1:01 AM

#

heady tide thank you

tbh, its the closest thing to GUI with Machine Learning I have ever seen

heady tide Mar 15, 2021, 1:03 AM

#

well the tricky thing is that you have to run the neural network on a separate process because you're bound to get into GIL if you run it on the main thread, so you have to create pipelines between the GUI and the network to exchange data.

#

A lot of people don't bother making visual representations but for me it's very insightful to see how everything works in real time
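
The setup described above (network in its own process, a pipe carrying data back to the GUI) can be sketched roughly as follows. This is a minimal illustration, not the author's actual program: `train_worker`, `run_demo`, and the halving "error" are made-up placeholders, and a real PyQt front end would check `poll()` from a timer rather than block on `recv()`.

```python
import multiprocessing as mp


def train_worker(conn, epochs):
    """Runs in a child process; sends one (epoch, error) tuple per epoch."""
    error = 1.0
    for epoch in range(1, epochs + 1):
        error *= 0.5  # placeholder for a real training step
        conn.send((epoch, error))
    conn.send(None)  # sentinel: training finished
    conn.close()


def run_demo(epochs=3):
    """Stand-in for the GUI side: start the worker and collect its updates."""
    parent_conn, child_conn = mp.Pipe()
    worker = mp.Process(target=train_worker, args=(child_conn, epochs))
    worker.start()
    updates = []
    while True:
        message = parent_conn.recv()  # blocking here; a GUI would poll instead
        if message is None:
            break
        updates.append(message)
    worker.join()
    return updates
```

Call `run_demo()` from under an `if __name__ == "__main__":` guard, since some platforms start child processes by re-importing the main module.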

iron basalt Mar 15, 2021, 1:18 AM

#

grave frost if it wasn't, they wouldn't have lost out on performance

I'm not exactly sure what you mean. Who lost out on performance on what?

#

ROCm is pretty much native for AMD. Though it's not for all of their cards.

grave frost Mar 15, 2021, 1:19 AM

#

OpenCL loses perf to CUDA

iron basalt Mar 15, 2021, 1:20 AM

#

Yeah, because it's locked off from a bunch of things on Nvidia GPUs.

grave frost Mar 15, 2021, 1:21 AM

#

I dunno 🤷 There is a whole github issue about it with plenty of technical arguments

iron basalt Mar 15, 2021, 1:22 AM

#

It's pretty well known that the OpenCL drivers are intentionally crippled on Nvidia.

grave frost Mar 15, 2021, 1:22 AM

#

....?

#

I mean OpenCL on AMD vs CUDA on Nvidia

iron basalt Mar 15, 2021, 1:22 AM

#

Well AMD is different hardware.

#

It's not really just an OpenCL vs CUDA thing then. And can't really be compared.

#

You can do price per compute

#

Or something like that.

grave frost Mar 15, 2021, 1:26 AM

#

well, its different architecture

#

but from what I barely understood, OpenCL is in general inferior to CUDA according to some C and optimization stuff

lean ledge Mar 15, 2021, 1:32 AM

#

grave frost Data scientists do a little of ML (Machine Learning) but its mostly ML researche...

This is false.

#

Data scientists can be anything from no ML to all ML, because it's a generic buzzword. Many ML engineers do no real ML, just software engineering around ML

lean ledge Mar 15, 2021, 1:33 AM

#

uncut orbit now if i want to do ai with robotics how would that work

Almost no one uses machine learning in robotics other than computer vision

#

If you're looking at stuff like Boston Dynamics, they use no "AI"

#

Good ol mechanical engineering and control theory

iron basalt Mar 15, 2021, 1:34 AM

#

grave frost well, its different architecture

Not really, that's just what Nvidia or some random people on the internet want you to think. OpenCL and CUDA do not really have anything to do with C optimizations. They're just two different APIs; what really matters is the hardware itself.

lean ledge Mar 15, 2021, 1:34 AM

#

CUDA and OpenCL are unrelated to architecture lol

#

OpenCL can work on basically anything

iron basalt Mar 15, 2021, 1:35 AM

#

The only issue with OpenCL is like I stated, on Nvidia hardware it's locked off from some stuff.

lean ledge Mar 15, 2021, 1:35 AM

#

uncut orbit phd takes 12 years right?

5-6 years in the US, 3-4 years outside the US

iron basalt Mar 15, 2021, 1:35 AM

#

And also how much effort is put into supporting it

lean ledge Mar 15, 2021, 1:35 AM

#

The OpenCL api is also not as good as the CUDA one

iron basalt Mar 15, 2021, 1:35 AM

#

^

lean ledge Mar 15, 2021, 1:35 AM

#

CUDA is really really good

iron basalt Mar 15, 2021, 1:36 AM

#

CUDA is really convenient. Closest to it is probably OpenMP on that axis.

#

(Directly in the C/C++ code, not like passing around some string of code which one then compiles manually)

lean ledge Mar 15, 2021, 1:38 AM

#

@uncut orbit In general, take everything anyone says here about machine learning or robotics or anything with a pinch of salt, there's not a lot of people with expertise in the area in this server. There's dedicated servers for AI/ML type stuff and they're generally better, alongside dedicated servers for robotics.

uncut orbit Mar 15, 2021, 1:38 AM

#

ok

#

thank you

lean ledge Mar 15, 2021, 1:38 AM

#

If you're asking about python, not many better servers than this but any theory work has limited talent here

uncut orbit Mar 15, 2021, 1:38 AM

#

ok

iron basalt Mar 15, 2021, 1:39 AM

#

This server is also not really the right place for theory, it's more for practical use with python. There are the off topic channels, but that's it.

lean ledge Mar 15, 2021, 1:40 AM

#

I only exist to call out other people's BS on this server

shy kraken Mar 15, 2021, 1:41 AM

#

nvm figured it out i think

#

its like a tensor specific function i guess

obtuse sable Mar 15, 2021, 2:47 AM

#

How long shd I spend on understanding the theory of neural networks before I can start implementing one using pytorch? I just want to compare to a logistic regression binary classifier. I have the data rdy. Is 8.5k data points enough?

uncut bloom Mar 15, 2021, 2:59 AM

#

there should be a baseline for your problem already... so none

#

just implement and use your baseline as a check that it worked

#

if not try, try , try, again

#

regarding data... it depends on how complex a network you're building and how much information your network needs to encode

#

try training it in batches to see the improvement rate to get a guess at the value of more data

#

e.g. training on 20%, 40% 60%...

#

plot out the curve of improvement on your valid set as a metric to get some kind of feel for the value of more data

#

if you see a jump in value in the last batch it's probably worth getting more data

#

but you should also make a business decision on the metrics you're judging by and the difficulty in acquiring more

#

if you're smart about your weight initialization and optimization less data will be necessary

#

you can also think of clever ways to use your existing data to weakly label more
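
The "train on 20%, 40%, 60%..." suggestion above can be sketched as a small learning-curve helper. Everything here is illustrative: `learning_curve`, `fit_midpoint`, `accuracy`, and the synthetic 1-D data are assumptions standing in for a real model and dataset, and the `fit` and `score` callables are whatever your own training and validation code provides.

```python
import random


def learning_curve(train_rows, valid_rows, fit, score,
                   fractions=(0.2, 0.4, 0.6, 0.8, 1.0), seed=0):
    """Return [(n_train, validation_score), ...] for growing training subsets."""
    rows = list(train_rows)
    random.Random(seed).shuffle(rows)  # shuffle once so the subsets are nested
    curve = []
    for fraction in fractions:
        n = max(1, round(len(rows) * fraction))
        model = fit(rows[:n])
        curve.append((n, score(model, valid_rows)))
    return curve


def fit_midpoint(rows):
    """Toy 1-D classifier: threshold halfway between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    if not pos or not neg:
        return 0.0  # fallback when a subset contains only one class
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows where 'x above threshold' agrees with the label."""
    return sum((x > threshold) == (y == 1) for x, y in rows) / len(rows)


# Synthetic data: class 1 is centred at 2.0, class 0 at 0.0.
rng = random.Random(1)
data = []
for _ in range(500):
    label = rng.randint(0, 1)
    data.append((rng.gauss(2.0 * label, 1.0), label))
train, valid = data[:400], data[400:]

curve = learning_curve(train, valid, fit_midpoint, accuracy)
for n, acc in curve:
    print(n, round(acc, 3))
```

If the validation score is still climbing between the last two points, more data is probably worth collecting; if the curve has flattened, it probably is not.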

modest maple Mar 15, 2021, 4:37 AM

#

Hey guys can anyone help me with an error message that I have been getting while using the scipy library. I am trying to get a numpy array of pearson r correlation similarity from a bulk data and this data is imported in the form of a pandas dataFrame.

obtuse sable Mar 15, 2021, 5:46 AM

#

uncut bloom regarding data... it depends on how complex a network you're building and how mu...

Thanks for the advice! I will just dive right in and see how it goes. Wish me luck :D

misty flint Mar 15, 2021, 7:00 AM

#

youre probably not accessing the dataframe correctly

#

show us your code + error

plucky grotto Mar 15, 2021, 7:44 AM

#

Hi, so I want to clone my base conda install but swap out the python version. I've tried conda create --name testenv2 python=2.7 --clone root but says too many arguments. Is this not possible?

#

I'd be fine with just having two installs of python too, so long as I can reference specifically which one I want

uncut barn Mar 15, 2021, 8:49 AM

#

You need to design a Neural Network that solves the problem of facial attribute
recognition. More specifically the network should receive in the input an image of a face,
and should recognise whether the depicted subject wears glasses or not, has long or
short hair, smiles or not and should recognise its apparent age. Design the first and the
last layers of such a network, detailing your choices. Define the total cost function and
give the format of a training example and the corresponding ground truth associated with
it.
[Hint: You can treat the recognition of the age either as a regression problem, or as a
classification problem – either choice is equally valid.]

#

Can anyone help me with this question?

obtuse sable Mar 15, 2021, 9:21 AM

#

anyone know of a good neural network binary classification problem with solution in Pytorch online to work through so I can familiarize myself with NNs and Pytorch? preferably with at least 10 features and > 5k datapoints

hard yew Mar 15, 2021, 12:10 PM

#

Hi, I recently calculated the number of parameters of conv2d, but how do I calculate the parameters of separableconv2d?
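
For reference, a depthwise-separable convolution is a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution, so its parameter count is the sum of the two. The sketch below assumes the Keras-style layout with a single bias vector on the pointwise output; the function names are made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Weights of a standard convolution, plus one bias per output channel."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def separable_conv2d_params(kernel_h, kernel_w, in_channels, out_channels,
                            depth_multiplier=1, bias=True):
    """Depthwise kernel + 1x1 pointwise kernel (+ bias on the pointwise output)."""
    depthwise = kernel_h * kernel_w * in_channels * depth_multiplier
    pointwise = in_channels * depth_multiplier * out_channels
    return depthwise + pointwise + (out_channels if bias else 0)


# 3x3 kernel, 32 input channels, 64 output channels
print(conv2d_params(3, 3, 32, 64))            # 18496
print(separable_conv2d_params(3, 3, 32, 64))  # 2400
```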

undone vine Mar 15, 2021, 1:57 PM

#

guys how do u paste code in discord

tall trail Mar 15, 2021, 1:58 PM

#

add three backticks

undone vine Mar 15, 2021, 1:58 PM

#

ah k thx

#

does anyone here know how to solve that cause im trying to make it on a hex value

#

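```py
    # Assumes `from PIL import Image, ImageDraw, ImageFont` and `import discord`,
    # with w, h, items and item_color defined earlier in the command.
    shape = [(1, 1), (220, 190)]

    # creating new Image object
    img = Image.new("RGB", (w, h))

    # create a drawing context and draw the filled rectangle
    img1 = ImageDraw.Draw(img)
    img1.rectangle(shape, fill=f"{item_color.get(items[0])}")

    # load the font and set where the label goes
    font = ImageFont.truetype('theboldfont.ttf', 30)
    text_position = 25, 80

    img1.text(text_position, items[0], 'white', font=font)

    img.save('fortnite.jpg', 'JPEG')

    await ctx.send(file=discord.File('fortnite.jpg'))```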

pearl arch Mar 15, 2021, 2:44 PM

#

im looking for algorithms and methods for detecting anomalies in a vibration track. there is a machine and i set up a sensor which senses the temperature and vibration, and im looking for a machine learning algorithm to detect them. does anyone have advice?

merry lintel Mar 15, 2021, 3:02 PM

#

hey im interested in getting into ai machine learning etc... but concerned if i should learn things like calculus and linear algebra first or if it's fine to learn it along the way as well

blazing bridge Mar 15, 2021, 3:05 PM

#

@merry lintel I'm still learning machine learning and AI and what I did was learn everything I needed in terms of math along the way with whatever I needed. It may be better to learn the math before because you won't have to worry about it as much for the resources that are more theoretical. I hope that helps

merry lintel Mar 15, 2021, 3:05 PM

#

@blazing bridge

#

oh thanks

#

but probably will just learn them along the way

#

lazy

#

xd

blazing bridge Mar 15, 2021, 3:06 PM

#

yeah I was the same lol

#

it doesn't really matter as long as you learn it

merry lintel Mar 15, 2021, 3:07 PM

#

i mean algebra and calculus are interesting but still lazy. it is a wide range of concepts isn't it? @blazing bridge

hollow sentinel Mar 15, 2021, 3:08 PM

#

there's a good book for math in ML

#

https://mml-book.com/

Mathematics for Machine Learning

#

it's also pinned

merry lintel Mar 15, 2021, 3:08 PM

#

oh nice

#

thanks a lot

#

didnt notice

#

i guess

lapis sequoia Mar 15, 2021, 3:09 PM

#

pearl arch im looking for algorithms and methods for detecting anomalies in a vibration tra...

Your first step should be labelling the dataset.
You'll need both normal and anomaly data. Most likely it would be skewed.
After that you can try algorithms/models like Xgboost, RandomForest, MLP, etc.

#

There is another way in which you try to find outliers (anomalies). Like 1-Class SVMs.
Here: https://towardsdatascience.com/outlier-detection-with-one-class-svms-5403a1a1878c

Medium

Outlier Detection with One-Class SVMs

Don’t balance that dataset just yet
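
The outlier route above boils down to this: learn what "normal" looks like from normal-only readings, then flag anything far from it. The sketch below uses a plain z-score threshold as a much simpler stand-in for a One-Class SVM (it is not the SVM itself). The readings, the function names, and the `max_z=3.0` cutoff are illustrative assumptions.

```python
import statistics


def fit_normal_profile(normal_readings):
    """Summarise normal-only data by its mean and sample standard deviation."""
    return statistics.fmean(normal_readings), statistics.stdev(normal_readings)


def is_anomaly(reading, profile, max_z=3.0):
    """Flag a reading that lies more than max_z standard deviations from normal."""
    mean, std = profile
    if std == 0:
        return reading != mean
    return abs(reading - mean) / std > max_z


# Made-up vibration amplitudes recorded while the machine was healthy.
normal = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98]
profile = fit_normal_profile(normal)

print(is_anomaly(1.03, profile))  # False
print(is_anomaly(2.5, profile))   # True
```

A One-Class SVM follows the same fit-on-normal, flag-the-rest pattern but can learn a non-linear boundary over several features at once (for example temperature and vibration together).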

uncut barn Mar 15, 2021, 3:12 PM

#

When applying K-Means clustering on unlabelled data, if we use a linear classifier and artificial labels, What type of regularisation would we use?

#

Can anyone help me with this?

serene scaffold Mar 15, 2021, 3:14 PM

#

uncut barn When applying K-Means clustering on unlabelled data, if we use a linear classifi...

can you tell us where this question came from?

lapis sequoia Mar 15, 2021, 3:14 PM

#

lapis sequoia There is another way in which you try to find outliers (anomalies). Like 1-Class...

Also you can try AutoEncoders for Anomaly Detection.
There is a whole Python package (PyOD) with a lot of models for anomaly detection:
https://towardsdatascience.com/anomaly-detection-with-pyod-b523fc47db9

Medium

Anomaly Detection with PyOD!

Have you used this wonderful Python Outlier Detection Module?

obtuse sable Mar 15, 2021, 3:17 PM

#

Hi guys. What's a good metric/score to look at if I want to prioritise minimizing false positives in my binary classifier? And also if I have a lot more "Y=1"s to "Y=0"s? Approximately in the ratio of 4:1

lapis sequoia Mar 15, 2021, 3:17 PM

#

uncut barn When applying K-Means clustering on unlabelled data, if we use a linear classifi...

The main reason for this is to have a labeled dataset. In real world problems you don't know how many classes exist and which data point belongs to which class. So the best cluster size that you get from K-means can be used as a good starting point to create a classifier.

serene scaffold Mar 15, 2021, 3:18 PM

#

lapis sequoia The main reason for this is to have a labeled dataset. In real world problems yo...

it looks like this person may have been asking an exam question, so please be cautious.

uncut barn Mar 15, 2021, 3:18 PM

#

its an exercise

#

on the sheet

serene scaffold Mar 15, 2021, 3:19 PM

#

obtuse sable Hi guys. What's a good metric/score to look at if I want to prioritise minimizin...

so you basically want to know your false positive rate?

#

actually you might want to be looking at the precision score?

#

false positives bring your precision score down, but false negatives don't.
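
A tiny worked example of that point, using a hand-rolled `precision_recall` helper (an illustrative function, not a library call) on made-up labels:

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 1, 1, 1, 0, 0]

print(precision_recall(y_true, [1, 1, 1, 1, 0, 0]))  # (1.0, 1.0)   perfect
print(precision_recall(y_true, [1, 1, 1, 1, 1, 0]))  # (0.8, 1.0)   one false positive
print(precision_recall(y_true, [1, 1, 1, 0, 0, 0]))  # (1.0, 0.75)  one false negative
```

The false positive lowers precision and leaves recall alone; the false negative does the opposite.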

obtuse sable Mar 15, 2021, 3:22 PM

#

FPs would be catastrophic for what I'm doing so I want to minimize that without losing too many TP. Is AUCROC or specificity ok?

lapis sequoia Mar 15, 2021, 3:23 PM

#

serene scaffold actually you might want to be looking at the precision score?

Yes, lowering FPR should give you what you want.
Draw a Precision vs Recall curve and set the threshold that maximizes Precision without losing a lot of Recall.
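
Picking the threshold off a Precision vs Recall curve can be sketched like this: sweep every candidate threshold and keep the one with the best recall among those that meet a precision floor. `pick_threshold`, the `min_precision` value, and the scores are illustrative assumptions; in practice the scores would be your classifier's predicted probabilities on a validation set.

```python
def pick_threshold(y_true, scores, min_precision=0.95):
    """Return (threshold, precision, recall) with the highest recall among
    thresholds whose precision is at least min_precision, or None if none qualify."""
    total_pos = sum(y_true)
    if total_pos == 0:
        return None
    best = None
    for threshold in sorted(set(scores)):
        preds = [s >= threshold for s in scores]
        tp = sum(p and t == 1 for p, t in zip(preds, y_true))
        fp = sum(p and t == 0 for p, t in zip(preds, y_true))
        if tp + fp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (threshold, precision, recall)
    return best


y_true = [0, 0, 1, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.65, 0.7, 0.8, 0.9]

print(pick_threshold(y_true, scores, min_precision=0.95))  # (0.65, 1.0, 0.8)
```

Raising `min_precision` trades recall for fewer false positives, which is the trade-off being discussed here.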

lapis sequoia Mar 15, 2021, 3:27 PM

#

obtuse sable FPs would be catastrophic for what I'm doing so I want to minimize that without...

Yeah you can also plot ROC curve and choose the threshold where FPR is low but TPR is high

obtuse sable Mar 15, 2021, 3:29 PM

#

That should still work for unbalanced data like mine? Like accuracy is kind of bad here because a model that only predicts 1 would still get 75+ percent

lapis sequoia Mar 15, 2021, 3:31 PM

#

obtuse sable That should still work for unbalanced data like mine? Like accuracy is kind of b...

ROC doesn't solve the imbalanced dataset problem. Upsample the minority class and plot the ROC curve or Precision vs Recall
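
Random upsampling of the minority class can be sketched as below. `upsample_minority` and the toy rows are illustrative assumptions. One caveat beyond what is said above: it is usually safer to upsample only the training split and evaluate on untouched data, so the metrics reflect the real class ratio.

```python
import random


def upsample_minority(rows, label_of, seed=0):
    """Randomly duplicate rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # sample with replacement to fill the gap up to the largest class size
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# 4:1 imbalance, as in the question: eight rows with Y=1, two with Y=0.
rows = [(i, 1) for i in range(8)] + [(i, 0) for i in range(2)]
balanced = upsample_minority(rows, label_of=lambda row: row[1])

print(sum(1 for _, y in balanced if y == 1), sum(1 for _, y in balanced if y == 0))  # 8 8
```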

obtuse sable Mar 15, 2021, 3:33 PM

#

Ok. Thanks for the help!

bitter harbor Mar 15, 2021, 3:41 PM

#

is there a reason to use seaborn over mpl?
my prof's exclusively teaching it and i can't see why other than maybe having to do a bit less work?

lapis sequoia Mar 15, 2021, 3:48 PM

#

bitter harbor is there a reason to use seaborn over mpl? my prof's exclusively teaching it and...

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

So seaborn contains a lot of prebuilt and defined plots and visualization that can be directly used.
Whereas matplotlib is a plotting library with limited predefined visualisation methods but greater customisation using its APIs.

In simple words:
If in future you want to create a custom plot/visualization that doesn't exist in seaborn or Matplotlib then you can use matplotlib library to create that using its API.

You can even create your own library like seaborn using Matplotlib.

bitter harbor Mar 15, 2021, 3:49 PM

#

huh ok thanks that's kinda what I was thinking

misty flint Mar 15, 2021, 4:05 PM

#

@paper lake listening to this podcast from 2 biostatisticians from Johns Hopkins talking about data science and thought of you

#

https://open.spotify.com/show/1NJ6li5ZpNVBBQfpd3D6bi?si=agDPudZjRR-GyUBANitTSg

Spotify

Not So Standard Deviations

Listen to Not So Standard Deviations on Spotify. Roger Peng and Hilary Parker talk about the latest in data science and data analysis in academia and industry.

#

DoggoKek

paper lake Mar 15, 2021, 4:07 PM

#

misty flint @paper lake listening to this podcast from 2 biostatisticians from Joh...

noooooooooo

#

oh well still thanks

#

i will follow that

misty flint Mar 15, 2021, 4:14 PM

#

DoggoKek

grave frost Mar 15, 2021, 4:15 PM

#

I was surprised this took a long time. Apparently, Depix is some GH lib that can interpret blurred text. I thought that this was implemented wayy before

#

the amount of information in the 'mosaic' blurred images is incredible. a good model with plenty of data can easily break it

misty flint Mar 15, 2021, 4:20 PM

#

paper lake noooooooooo

lolol one of the biostatisticians now works at etsy as a data scientist

#

E_FeelsEvilMan

rough shore Mar 15, 2021, 4:20 PM

#

What is exactly AI in python?

#

Is there some sort of a special module for it?

misty flint Mar 15, 2021, 4:21 PM

#

its a field of study

misty flint Mar 15, 2021, 4:21 PM

#

rough shore What is exactly AI in python?

you can start here if you are interested https://www.coursera.org/learn/ai-for-everyone

Coursera

AI For Everyone

Offered by DeepLearning.AI. AI is not only for engineers. If you want your organization to become better at using AI, this is the course to ... Enroll for free.

rough shore Mar 15, 2021, 4:21 PM

#

Thank you!

misty flint Mar 15, 2021, 4:22 PM

#

np the course is built for non-technical people but its a nice overview

rough shore Mar 15, 2021, 4:22 PM

#

Is AI machine learning?

misty flint Mar 15, 2021, 4:23 PM

#

ML falls under AI

rough shore Mar 15, 2021, 4:23 PM

#

oh ok

misty flint Mar 15, 2021, 4:23 PM

#

just watch the video

#

it will be more clear

hollow sentinel Mar 15, 2021, 4:27 PM

#

so what is this

#

just the theory of AI?

grave frost Mar 15, 2021, 4:28 PM

#

must be some basic stuff with easy to relate examples

#

why the specific output? just asking

#

Its still not clear what is your end goal. 'motion tracking' - do you want bounding boxes or segmentation (or maybe both?)

#

That looks pretty advanced. I have never done anything like that. sorry

uncut orbit Mar 15, 2021, 4:48 PM

#

how'd you do that

#

thats magic

misty flint Mar 15, 2021, 4:59 PM

#

thats...a ton of data

#

memecringeharold

hollow sentinel Mar 15, 2021, 4:59 PM

#

jam_cavedude

misty flint Mar 15, 2021, 4:59 PM

#

gonna have to use some big data tools

hollow sentinel Mar 15, 2021, 4:59 PM

#

big boy tools

misty flint Mar 15, 2021, 4:59 PM

#

ID_BoomKek

#

dude. that output could be anything. no idea.

#

DoggoKek

grave frost Mar 15, 2021, 5:01 PM

#

the tensor is a float32
the whole of it?

#

why don't you flatten the output and use it like that

twin latch Mar 15, 2021, 5:57 PM

#

dont know how to fix this exception error, can anyone help?

serene scaffold Mar 15, 2021, 6:48 PM

#

twin latch dont know how to fix this exception error, can anyone help?

is this a data science question? in either case, take a look at what the s.run function is expecting

twin latch Mar 15, 2021, 7:00 PM

#

serene scaffold is this a data science question? in either case, take a look at what the `s.run`...

Oh I solved that problem, I was passing wrong arguments and not sure if it was data science question but I was trying to automate my data reading and writing

inner aspen Mar 15, 2021, 7:46 PM

#

I really like working with Neural networks, Especially GANs. I am training Stylegan2-ADA on about 6.7K minecraft images, it's really cool

serene scaffold Mar 15, 2021, 7:57 PM

#

#career-advice is another place where you can ask about that. you might want to give more context. I don't make hiring decisions, though there are data science jobs in my region (mid atlantic US) that accept applicants with bachelors degrees and relevant coursework

misty flint Mar 15, 2021, 8:40 PM

#

same

#

but there are also many positions where theyre looking for a higher degree

#

5_FeelsBongoMan

serene scaffold Mar 15, 2021, 9:30 PM

#

misty flint but there are also many positions where theyre looking for a higher degree

Almost every data scientist listing I've seen has said that a master's degree is preferred, even if a bachelor's is the minimum qualification.

#

some listings say "bachelors and five years experience, or masters and two years". so basically if you don't get a masters but you spent the time you would have spent getting it in industry, the effect is the same.

lean ledge Mar 15, 2021, 9:35 PM

#

Well the problem is you have to get that experience in data science

#

So you have to find at least one job that's willing to take you in with no experience in data science

grave frost Mar 15, 2021, 10:13 PM

#

hmm...also, does anyone have any idea on how to get your first internship?

lusty iron Mar 15, 2021, 10:53 PM

#

serene scaffold Almost every data scientist listing I've seen has said that a master's degree is...

Funny thing is that most of the types that get those jobs are people whose only actual knowledge of ml comes from coursera classes. The truth is that even something as relevant as a phd in "Mathematical Optimization" does not do much for practicing machine learning. The managers for these positions don't know much about ML and believe the hype, thinking it is something from science fiction. There are not as many jobs in ML as people think; most "Data Scientist" positions are really just "Data Analyst" positions. After talking to some "Data Scientists", it is shocking how little they know. I have wondered what will happen to these people once the "Data Science" bubble bursts in a few years......

serene scaffold Mar 15, 2021, 10:54 PM

#

lusty iron Funny thing is that most of the types that get those jobs are people whose only ...

hopefully I will have secured enough marketable experience by then 🤷‍♂️

misty flint Mar 15, 2021, 10:54 PM

#

same

#

CCL_Shrug

#

i think there will always be roles for technical people explaining things to non-technical people. if its less ML than promised, i would still be okay with such a role

lusty iron Mar 15, 2021, 10:56 PM

#

Well there are many analyst roles that require python, if you are also very technical you can move over to data engineering.

misty flint Mar 15, 2021, 10:56 PM

#

maybe instead of neural nets, youre doing linear regression. im still okay with that

lusty iron Mar 15, 2021, 10:59 PM

#

to be fair, I am more worried about the python language once the data bubble bursts. Python web is dying, and there is also a lot of competition in sys-admin languages

misty flint Mar 15, 2021, 10:59 PM

#

lusty iron Funny thing is that most of the types that get those jobs are people whose only ...

the funny thing is one of the 2 hosts of a podcast i listen to did her phd in optimization. shes a principal data scientist or whatever but everything shes said sorta kinda backs up your point but there are exceptions

misty flint Mar 15, 2021, 10:59 PM

#

lusty iron to be fair, I am more worried about the python language once the data bubble bur...

new language time?

#

blobhyperthink

lean ledge Mar 15, 2021, 11:01 PM

#

Not so sure about that, I know multiple data scientists with PhDs who've done pretty complex work in industry, even made their own techniques for multimodal data and stuff

lusty iron Mar 15, 2021, 11:01 PM

#

Python's current data ecosystem looks a lot like Java's big data/hadoop ecosystem 8 years ago....a lot of those projects died, only a handful outlived the bubble (Spark, Presto)

lean ledge Mar 15, 2021, 11:01 PM

#

There might be a lot of data science positions because of hype and a lot of people being hired when they're crap but that's a separate issue entirely that stems from stupid marketing of data science MOOCs

#

The whole "sexiest job" thing has probably hurt the field significantly

misty flint Mar 15, 2021, 11:02 PM

#

yeah

#

def

#

too many rookies

#

now

#

~~im one of them~~

#

RunFail

lean ledge Mar 15, 2021, 11:03 PM

#

lusty iron to be fair, I am more worried about the python language once the data bubble bur...

Data science and web isn't the only thing python is good at lol

#

So much scientific and engineering work is python

lusty iron Mar 15, 2021, 11:03 PM

#

I am hoping people in the python community see this, and try to make sure python can survive the data bubble bursting

#

so python science is prob not going anywhere

#

not much work outside of academia for it tho

grave frost Mar 15, 2021, 11:04 PM

#

tbh, I think the focus should rather be more on research and development than so called 'application of ML'. right now, its more oriented towards "getting 0.5% more in that benchmark"

lean ledge Mar 15, 2021, 11:04 PM

#

lusty iron not much work outside of academia for it tho

There's absolute tons of scientific and engineering python outside academia

lusty iron Mar 15, 2021, 11:06 PM

#

if you look at the pydata ecosystem, there are a lot of packages/projects for different science disciplines.....I don't know if they will translate to out of academia work......

grave frost Mar 15, 2021, 11:06 PM

#

lusty iron if you look at the pydata ecosystem, there are a lot of packages/projects for di...

umm....why not?

lean ledge Mar 15, 2021, 11:07 PM

#

not sure what you mean by pydata ecosystem but the scientific python ecosystem is large and well used within industry

misty flint Mar 15, 2021, 11:07 PM

#

everyone thinks R is going to die but its still used heavily in many places

lean ledge Mar 15, 2021, 11:07 PM

#

just not by software developers because they dont actually know any science

lusty iron Mar 15, 2021, 11:07 PM

#

grave frost umm....why not?

tell me how many of those https://numfocus.org/sponsored-projects can be used outside of academia

NumFOCUS

Sponsored Projects | pandas, NumPy, Matplotlib, Jupyter, + more - N...

Explore NumFOCUS Sponsored Projects, including: pandas, NumPy, Matplotlib, Jupyter, rOpenSci, Julia, Bokeh, PyMC3, Stan, nteract, SymPy, FEniCS, PyTables...

lean ledge Mar 15, 2021, 11:07 PM

#

I know many of these actually being used in industry lol

grave frost Mar 15, 2021, 11:07 PM

#

the top of the list lol numpy, pandas and matplotlib are all used

lean ledge Mar 15, 2021, 11:07 PM

#

Heck I've used some of them

grave frost Mar 15, 2021, 11:08 PM

#

I struggle to understand your arguments for such reasoning

lusty iron Mar 15, 2021, 11:08 PM

#

things like MDAnalysis and ITK are only for work in academia

grave frost Mar 15, 2021, 11:08 PM

#

who said that?

lusty iron Mar 15, 2021, 11:08 PM

#

(I am not talking about Pandas/Matplotlib)

lean ledge Mar 15, 2021, 11:09 PM

#

MDAnalysis is a Python library for the analysis of computer simulations of many-body systems at the molecular scale, spanning use cases from interactions of drugs with proteins to novel materials

#

I assure you this isn't only used in academia

lusty iron Mar 15, 2021, 11:09 PM

#

I might be wrong.....fair enough....

grave frost Mar 15, 2021, 11:09 PM

#

spanning use cases from interactions of drugs with proteins to novel materials
Probably most biotech firms?

misty flint Mar 15, 2021, 11:09 PM

#

sounds like biotech/pharma

lean ledge Mar 15, 2021, 11:10 PM

#

Biotech and pharmaceutical yes, but also general material science

misty flint Mar 15, 2021, 11:10 PM

#

they always seem to have interesting tools

#

pithink

lean ledge Mar 15, 2021, 11:10 PM

#

Molecular dynamics has tons of real world usecases lol

misty flint Mar 15, 2021, 11:10 PM

#

lots of interesting R packages

grave frost Mar 15, 2021, 11:10 PM

#

misty flint lots of interesting R packages

you know R?

misty flint Mar 15, 2021, 11:11 PM

#

barely

#

DoggoKek

#

whats that phrase

#

"R is a glorified calculator"

#

ID_BoomKek

lean ledge Mar 15, 2021, 11:11 PM

#

And ITK is almost certainly used in industry too

#

First time I'm coming across it

misty flint Mar 15, 2021, 11:11 PM

#

its good for stats tho

lean ledge Mar 15, 2021, 11:11 PM

#

But it's darn useful looking

misty flint Mar 15, 2021, 11:12 PM

#

R has many useful functions

grave frost Mar 15, 2021, 11:12 PM

#

I guess there is a place and time for each language

lean ledge Mar 15, 2021, 11:12 PM

#

Any programmer claiming X thing isn't used in industry is probably saying it because they have no domain knowledge about any industry outside of software

grave frost Mar 15, 2021, 11:12 PM

#

I personally think MATLAB is pretty good for stats. It seems pretty simple IMO

#

they have a GUI for regression 🤷

misty flint Mar 15, 2021, 11:13 PM

#

hmm idk Matlab's stats capabilities but R's stats functions are pretty comprehensive

grave frost Mar 15, 2021, 11:13 PM

#

Tho its DL toolkit sucks AF. its so limited

#

it has a self-driving toolkit too. nobody uses it lol

lusty iron Mar 15, 2021, 11:14 PM

#

lean ledge Any programmer claiming X thing isn't used in industry is probably saying it bec...

to be fair, that is True.....But I will argue that a few jobs in that small field will not save python

grave frost Mar 15, 2021, 11:14 PM

#

I dont get your obsession with one language

misty flint Mar 15, 2021, 11:14 PM

#

DoggoKek

lean ledge Mar 15, 2021, 11:14 PM

#

They absolutely will because they're not a "few jobs in that small field", it's "many many many jobs across many fields, just fields programmers don't know enough to work in"

#

It's like how CS majors claim MATLAB is dead and no one uses it

#

Because they don't realise it's used a fuck tonne, they just don't know enough other things to find and get those jobs

grave frost Mar 15, 2021, 11:15 PM

#

even if python dies, we will migrate to another language if need be (there are several alternatives in development) what matters are the core programming fundamentals, not syntax

misty flint Mar 15, 2021, 11:15 PM

#

yeah matlab is very industry specific

lusty iron Mar 15, 2021, 11:16 PM

#

I guess I am the only one here that likes python a lot and wants it to thrive

lean ledge Mar 15, 2021, 11:16 PM

#

Python is going to be used for many decades across robotics, control, signal processing, physics dynamics modelling, data analysis, etc.

grave frost Mar 15, 2021, 11:16 PM

#

yeah, its hard to unravel a lot of effort and work put in it

misty flint Mar 15, 2021, 11:17 PM

#

~~one language to rule them all~~

#

blobhyperthink

#

jk

#

DoggoKek

lean ledge Mar 15, 2021, 11:17 PM

#

Python can't be the one language to rule them all until it becomes a fast lower level language with no garbage collection

#

Rust on the other hand...

misty flint Mar 15, 2021, 11:17 PM

#

omg garbage collection

#

i hate that i have to do it so often

#

on some stuff

#

and then i forget to do it when i need to

#

ID_BoomKek

grave frost Mar 15, 2021, 11:18 PM

#

I am vaguely familiar with garbage collection. is it the clearing of memory for stuff for which the variable does not exist

lean ledge Mar 15, 2021, 11:18 PM

#

Yes

#

Clearing of memory that isn't being referenced so can't be used anymore

grave frost Mar 15, 2021, 11:18 PM

#

if someone deletes a variable, shouldn't it delete the stored contents too

lusty iron Mar 15, 2021, 11:19 PM

#

I vaguely played with gc when I tinkered in c/c++. Java has no gc, right? But it is a lot faster than python

lean ledge Mar 15, 2021, 11:19 PM

#

x = [1, 2, 3] # allocates memory
x = 3
# someone needs to clear the memory in which [1, 2, 3] is stored
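
A minimal sketch of that snippet in action, assuming CPython (where reference counting frees an object as soon as its last reference disappears). `Payload` is a hypothetical stand-in class, used only because plain lists cannot be weakly referenced:

```python
import weakref


class Payload:
    """Hypothetical stand-in for the list [1, 2, 3]."""

    def __init__(self, values):
        self.values = values


x = Payload([1, 2, 3])        # allocates memory for the object
watcher = weakref.ref(x)      # observes the object without keeping it alive
alive_before = watcher() is not None

x = 3                         # the name now points elsewhere
# On CPython the old object's reference count hits zero at this point and it
# is freed right away; other Python implementations may free it later.
alive_after = watcher() is not None

print(alive_before, alive_after)
```

Nobody calls a "free" function here; the runtime does the clearing, which is the job the comment in the snippet refers to.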

exotic maple Mar 15, 2021, 11:19 PM

#

grave frost tbh, I think the focus should rather be more on research and development than so...

You mean you think it should be more theoretical instead of practical?

lean ledge Mar 15, 2021, 11:19 PM

#

Java has a gc

#

C/C++ have no GC

lusty iron Mar 15, 2021, 11:19 PM

#

yeah, I think I meant it the other way around

grave frost Mar 15, 2021, 11:20 PM

#

exotic maple You mean you think it should be more theoretical instead of practical?

no, I am just saying that while its good to make great perf on benchmarks, people are losing the ultimate goal for actually developing AI. not overfitting giant models on all the data available on the net

lean ledge Mar 15, 2021, 11:20 PM

#

ML isn't "AI", fancy regression is always what it's been about

#

ML never started with "AI" in the goal

exotic maple Mar 15, 2021, 11:21 PM

#

grave frost no, I am just saying that while its good to make great perf on benchmarks, peopl...

Personally, and maybe it's because I've always worked in corps, I've always seen ML as an intricate, valuable tool, but just that, a tool

lusty iron Mar 15, 2021, 11:21 PM

#

You will be shocked how many people shy away from python due to its suppressive lack in performance

grave frost Mar 15, 2021, 11:21 PM

#

well, some people did aim for making AGI (Like Turing for instance) its just that not much is there for cutting edge stuff

exotic maple Mar 15, 2021, 11:21 PM

#

I know people who spend DAYs killing themselves over a meager 0.5% accuracy KPI, where its not needed

#

I respect the theoretical building and the fantastic work many researchers do in ML and DL, but personally, it's not my thing, fk research lol

grave frost Mar 15, 2021, 11:22 PM

#

making 0.5% on a benchmark =/= AI progress (note I use AI, not ML)

exotic maple Mar 15, 2021, 11:22 PM

#

I want to be able to apply those things to something useful to me, that's all

lean ledge Mar 15, 2021, 11:23 PM

#

lusty iron You will be shocked how many people shy away from python due to its suppressive ...

Based on what? For what types of task?

exotic maple Mar 15, 2021, 11:23 PM

#

obviously, to apply them properly, i need to understand them, what they do, etc

exotic maple Mar 15, 2021, 11:23 PM

#

lean ledge Based on what? For what types of task?

I'm yet to see anyone shy away from python due to performance. Has to be a very very specific thing lol

grave frost Mar 15, 2021, 11:24 PM

#

well, it depends. for me, I appreciate the theoretical work more than the applied one because the theoretical ones are focusing on making AGI. Practically deploying models doesn't sound very appealing

lusty iron Mar 15, 2021, 11:24 PM

#

lean ledge Based on what? For what types of task?

even simple and non-performance-critical things (like an internal REST API that will get a few requests a week)....people are not rational, they want to learn the fastest thing

exotic maple Mar 15, 2021, 11:24 PM

#

grave frost well, it depends. for me, I appreciate the theoretical work more than the applie...

This is the beauty of life, people with different perspectives. Some people love building tools for the sake of the challenge or achievement, others like me? We just like to hammer shit xd

grave frost Mar 15, 2021, 11:25 PM

#

exotic maple This is the beauty of life, people with different perspectives. Some people love...

😁

lean ledge Mar 15, 2021, 11:25 PM

#

lusty iron even simple and non-performance-critical things (like an internal REST API that will get ...

I have never seen Python missing out on Rest APIs because it's not fast enough, there's other languages that are better at it for other reasons

#

Python gets left for something faster all the time but it has to be in:

business logic not numerics, or
extremely high throughput required, or
needing low latency and real-time

exotic maple Mar 15, 2021, 11:25 PM

#

dont get me wrong though @grave frost I can still get all nerdy and ask about specific shit. There's a reason I studied engineering (even though I only ever worked business)

#

but not in data science, any CS graduate could shut me up with their indepth knowledge xd

grave frost Mar 15, 2021, 11:26 PM

#

why is it that some packages written in pure python still do stuff in like 0.001s (or at least they claim to)

lean ledge Mar 15, 2021, 11:26 PM

#

~~What type of engineering?~~

exotic maple Mar 15, 2021, 11:26 PM

#

Imaginary, I mean

#

Industrial Engineering

lean ledge Mar 15, 2021, 11:26 PM

#

~~Industrial engineering ~= business engineering anyway~~

grave frost Mar 15, 2021, 11:26 PM

#

I doubt low level language can improve performance more than 0.001s

exotic maple Mar 15, 2021, 11:27 PM

#

lean ledge ~~Industrial engineering ~= business engineering anyway~~

lean ledge Mar 15, 2021, 11:27 PM

#

james acaster is great

exotic maple Mar 15, 2021, 11:28 PM

#

I mean, i dont regret my choice truth be told

#

Business degree would be too shallow for me

#

and the other engineering are too close-minded for me

lean ledge Mar 15, 2021, 11:28 PM

#

grave frost I doubt low level language can improve performance more than 0.001s

the bigger thing is if you need to do that 0.001s task 2 million times

#

in which case the lower level language can and will eat Python's cake
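
The arithmetic behind that point, as a rough sketch. The 0.001 s and 2 million figures come from the messages above; the 50x speed-up is purely an illustrative assumption, not a measured number:

```python
def total_seconds(per_call_seconds: float, calls: int) -> float:
    """Total wall time if every call costs the same amount."""
    return per_call_seconds * calls


CALLS = 2_000_000
PER_CALL = 0.001              # seconds per task, from the discussion
ASSUMED_SPEEDUP = 50          # hypothetical factor for a lower level language

python_total = total_seconds(PER_CALL, CALLS)
faster_total = total_seconds(PER_CALL / ASSUMED_SPEEDUP, CALLS)

print(f"0.001 s x 2 million calls: {python_total / 60:.1f} minutes")
print(f"with the assumed speed-up: {faster_total:.0f} seconds")
```

A millisecond is nothing once, but it adds up to over half an hour at this call count, which is where a faster language starts to matter.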

exotic maple Mar 15, 2021, 11:28 PM

#

I guess the only other thing i could have studied was CS, but that wasnt a choice for me at the time

grave frost Mar 15, 2021, 11:28 PM

#

lean ledge in which case the lower level language can and will eat Python's cake

😦 ❎ 🍰

lean ledge Mar 15, 2021, 11:28 PM

#

how are other engineering degrees too close-minded ThinkingJeff

exotic maple Mar 15, 2021, 11:29 PM

#

@lean ledge perhaps i didnt explain myself properly. What I meant was: Their domain is exact -> A single topic. They are extremely in-depth and useful, but also narrow

#

Well, that was my view at the time, and for the most part i think it has held up.

#

Industrial engineering is shallow af. You dont learn much about any topic, but you get a good notion of a lot of things which helps you be versatile

lusty iron Mar 15, 2021, 11:31 PM

#

what I don't get is that java feels like python but with forced classes and forced types......why can't they make a python compiler that takes typed python and gets the performance of java

misty flint Mar 15, 2021, 11:31 PM

#

is industrial eng like operations?

#

pithink

exotic maple Mar 15, 2021, 11:31 PM

#

misty flint is industrial eng like operations?

that was the focus yes

lean ledge Mar 15, 2021, 11:31 PM

#

lusty iron what I don't get is that java feels like python but with forced classes and forc...

Because types and strictness

lusty iron Mar 15, 2021, 11:31 PM

#

lean ledge Because types and strictness

what you mean?

exotic maple Mar 15, 2021, 11:31 PM

#

@misty flint Its origin is basically operations: Factory management, floor management, etc

#

you need to know about production processes, statistical quality control, but also business and admin

lean ledge Mar 15, 2021, 11:33 PM

#

Knowing X is this particular type and the output is supposed to be this, etc, you can avoid extra operations that check the types of the input, that ensure X thing is happening, there's less data to clean up, etc

#

For example, if you have 2 ints in Python and you write (a+b), it checks the type of a. a is an object, so at the low level it's a PyObject struct: you have to access its value, see if it supports the + operation, then see if b can be added to a, etc
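
A simplified sketch of those runtime lookups, written in Python itself. `dynamic_add` is a hypothetical helper for illustration only; the real interpreter does this in C and has extra rules (for example subclass priority) that are left out here:

```python
def dynamic_add(a, b):
    """Roughly mimic the checks behind `a + b` (simplified)."""
    # 1. Look at the type of the left operand and ask if it supports +.
    add = getattr(type(a), "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result
    # 2. Otherwise ask the right operand if it can be added to the left one.
    radd = getattr(type(b), "__radd__", None)
    if radd is not None:
        result = radd(b, a)
        if result is not NotImplemented:
            return result
    # 3. Nobody knows how to add these two types.
    raise TypeError(
        f"unsupported operand types: {type(a).__name__} and {type(b).__name__}"
    )


print(dynamic_add(2, 3))      # both ints
print(dynamic_add(1, 2.5))    # int declines, float's __radd__ handles it
```

All of this happens on every single addition, which is the overhead a statically typed compiler gets to skip.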

lusty iron Mar 15, 2021, 11:35 PM

#

is there syntax difference between reference counting and what java does? I feel like they look the same for the user

grave frost Mar 15, 2021, 11:35 PM

#

dym that if python was explicit, it would be faster?

lean ledge Mar 15, 2021, 11:35 PM

#

In java with strict typing, your compiler knows a and b are ints before hand, there's nothing new to do. You just insert an add operation

#

and that's it

lean ledge Mar 15, 2021, 11:35 PM

#

grave frost dym that if python was explicit, it would be faster?

yes!

#

that's why things like cython etc work

#

they force you to do type annotations etc

iron basalt Mar 15, 2021, 11:35 PM

#

grave frost I doubt low level language can improve performance more than 0.001s

Try making a game engine in C and then the same engine in Python.

lean ledge Mar 15, 2021, 11:35 PM

#

and that lets them optimise

exotic maple Mar 15, 2021, 11:36 PM

#

iron basalt Try making a game engine in C and then the same engine in Python.

gaming is probably the one thing where python will never, ever shine

lean ledge Mar 15, 2021, 11:36 PM

#

lusty iron is there syntax difference between reference counting and what java does? I feel...

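Nope, nothing changes in the syntax. Under the hood Java uses a tracing GC that runs as a separate part of the runtime, rather than reference counting

#

To the user they look like the same thing

#

CPython does reference counting, plus a cycle collector for whatever the counts can't free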

exotic maple Mar 15, 2021, 11:36 PM

#

@lean ledge are you a DS?

lean ledge Mar 15, 2021, 11:36 PM

#

A lot of GC boils down to tracking references. In CPython each object maintains a count; tracing collectors instead have a program go around every so often, checking all the memory it has allocated and clearing whatever has no references left
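
A small sketch of where plain reference counting falls short, assuming CPython. Two objects that point at each other never reach a count of zero, so CPython's separate cycle collector (the `gc` module) has to find them. `Node` is a hypothetical class for the demo:

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


gc.disable()                  # stop the automatic collector during the demo
a, b = Node(), Node()
a.other, b.other = b, a       # reference cycle: a -> b -> a
watcher = weakref.ref(a)

del a, b                      # names are gone, but the counts are still 1
alive_after_del = watcher() is not None

gc.collect()                  # the cycle collector finds the unreachable pair
alive_after_collect = watcher() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```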

#

https://rag.gy @exotic maple

grave frost Mar 15, 2021, 11:37 PM

#

so, if we do everything explicit in python, does it boost it a little bit?

lean ledge Mar 15, 2021, 11:37 PM

#

In CPython, the normal python you do, it doesnt. The explicit type hints are just hints
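
A quick sketch of what "just hints" means in practice. The function `add` is a made-up example; CPython stores the annotations but does not check or use them when the function runs:

```python
def add(a: int, b: int) -> int:
    """Annotated for ints, but nothing enforces that at runtime."""
    return a + b


# Passing strings "violates" the hints, yet CPython runs it happily.
result = add("py", "thon")
print(result)

# The hints are only metadata attached to the function.
print(add.__annotations__)
```

Static checkers such as mypy read the same annotations, but that happens outside the interpreter and does not make the code faster.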

exotic maple Mar 15, 2021, 11:37 PM

#

lean ledge https://rag.gy @exotic maple

bro this is cool. How do you do it lol

lusty iron Mar 15, 2021, 11:37 PM

#

lean ledge Nope, Java is reference counting done by a separate GC program

can a typed compiled python in theory be as fast as java?

lean ledge Mar 15, 2021, 11:37 PM

#

You need to switch to a different python implementation that takes advantage of it

lean ledge Mar 15, 2021, 11:37 PM

#

lusty iron can a typed compiled python in theory be as fast as java?

Cython is real fast. Idk how it compares to Java, but it's getting there

lusty iron Mar 15, 2021, 11:38 PM

#

lean ledge Cython is real fast. Idk how it compares to Java, but it's getting there

cython with only python objects is not very fast

lean ledge Mar 15, 2021, 11:38 PM

#

exotic maple bro this is cool. How do you do it lol

Do what? the website?

exotic maple Mar 15, 2021, 11:38 PM

#

lean ledge Do what? the website?

yeah

misty flint Mar 15, 2021, 11:38 PM

#

exotic maple you need to know about production processes, statistical quality control, but al...

ah i see. still seems pretty practical.

lusty iron Mar 15, 2021, 11:38 PM

#

if I take non-numeric python code, add types....it will not be very fast using cython

lean ledge Mar 15, 2021, 11:38 PM

#

It's just a normal website I made with bootstrap because I was bored. .gy is the Guyana domain

#

I just got lucky I got rag.gy as a domain

misty flint Mar 15, 2021, 11:38 PM

#

like a good generalist skillset

exotic maple Mar 15, 2021, 11:39 PM

#

misty flint ah i see. still seems pretty practical.

It is pretty practical. You can crap on any business bachelor, but any specialized engineer will shit on you

misty flint Mar 15, 2021, 11:39 PM

#

i feel like industrial engineers could be a good product manager maybe

lean ledge Mar 15, 2021, 11:39 PM

#

the only fun part of industrial eng is operations research

#

that's good shit

misty flint Mar 15, 2021, 11:39 PM

#

pithink

exotic maple Mar 15, 2021, 11:39 PM

#

misty flint i feel like industrial engineers could be a good product manager maybe

I am a project manager xd

misty flint Mar 15, 2021, 11:39 PM

#

DoggoKek

iron basalt Mar 15, 2021, 11:39 PM

#

Cython is fast if you add types to the variables and disable a bunch of things python does by default, like bounds checking, etc.

exotic maple Mar 15, 2021, 11:39 PM

#

lean ledge the only fun part of industrial eng is operations research

bro. I loved this shit in university

#

Operations research is amazing

#

Its the one part of math and college i loved

lusty iron Mar 15, 2021, 11:39 PM

#

did not know that.....

exotic maple Mar 15, 2021, 11:39 PM

#

Sadly i never got to use it so i forgot everything

lean ledge Mar 15, 2021, 11:39 PM

#

Oh yep, safety checks like bound checks also add to slowness

misty flint Mar 15, 2021, 11:39 PM

#

optimization?

exotic maple Mar 15, 2021, 11:39 PM

#

misty flint optimization?

yup

iron basalt Mar 15, 2021, 11:40 PM

#

Cython can tell you what code is probably slow: `cython -a`.

misty flint Mar 15, 2021, 11:40 PM

#

Praise

exotic maple Mar 15, 2021, 11:40 PM

#

Optimization, queueing theory, etc

lean ledge Mar 15, 2021, 11:40 PM

#

s i m p l e x

misty flint Mar 15, 2021, 11:40 PM

#

i only know bc of that podcast

#

DoggoKek

exotic maple Mar 15, 2021, 11:40 PM

#

lean ledge s i m p l e x

HAHAHAHA

#

S I M P L E X

#

no wait

#

"GUYS LETS START EXCEL SOLVER"

lean ledge Mar 15, 2021, 11:40 PM

#

I do lots of optimisation as a robotics/control person so operations research is mildly cool. Not as cool as more continuous type optimisation though

exotic maple Mar 15, 2021, 11:41 PM

#

lean ledge It's just a normal website I made with bootstrap because I was bored. .gy is the...

I've never made a website. I envy you 😦

lean ledge Mar 15, 2021, 11:41 PM

#

Convex optimisation is cooler as a subject

lean ledge Mar 15, 2021, 11:41 PM

#

exotic maple I've never made a website. I envy you 😦

Its really easy to make lol

iron basalt Mar 15, 2021, 11:41 PM

#

One big reason to use cython is that it automatically works with numpy and you don't really need to setup a C/C++ project (all those different build tools are a nightmare and one big reason people dislike C/C++).

lean ledge Mar 15, 2021, 11:41 PM

#

Cython is what happens when you realise as a scientist your simulation is slightly slow but it's going to be a bitch to rewrite in C++

#

So you add type hints and a couple of other optimisations and get that last bit of juice

grave frost Mar 15, 2021, 11:42 PM

#

so its basically like a C wrapper for python to make it faster?

exotic maple Mar 15, 2021, 11:42 PM

#

isnt Cython just normal python?

lean ledge Mar 15, 2021, 11:42 PM

#

It's Python compiled into C or C++ through the right typehints and optimisations to make Python significantly faster

lean ledge Mar 15, 2021, 11:42 PM

#

exotic maple isnt Cython just normal python?

That's CPython

iron basalt Mar 15, 2021, 11:42 PM

#

It's python (with some extra stuff) to C, but also with a bunch of stuff added to make it work with python. It compiles to a shared object / dll which python can load.

grave frost Mar 15, 2021, 11:43 PM

#

lean ledge That's CPython

wait those are different things? cython and cpython?

lean ledge Mar 15, 2021, 11:43 PM

#

Yep

exotic maple Mar 15, 2021, 11:43 PM

#

lean ledge That's CPython

-screeches in dyslexia-

lean ledge Mar 15, 2021, 11:43 PM

#

https://cython.org/

#

Cython's used tons by my scientist friends

#

Although they also liked numba last they tried

iron basalt Mar 15, 2021, 11:44 PM

#

I kind of dislike numba, it yells at me too much, I often give up and go to cython.

#

Also Jitting each time you run the program can give annoying startup times.

lean ledge Mar 15, 2021, 11:46 PM

#

numba has some really cool scientific computing stuff

iron basalt Mar 15, 2021, 11:47 PM

#

For threading it can be very useful.

#

(And GPU ofc)

lean ledge Mar 15, 2021, 11:49 PM

#

I dream of a world where scientists don't need to learn the ins and outs of C++ because parallelisation to CUDA or clusters becomes much easier and optimisations are automatic

iron basalt Mar 15, 2021, 11:49 PM

#

It feels a lot like OpenMP, but python.

iron basalt Mar 15, 2021, 11:50 PM

#

lean ledge I dream of a world where scientists don't need to learn the ins and outs of C++ ...

Oh how easy it would be to write simulations then.

#

Right now it's hell trying to get the kernels tweaked just right and dealing with all the API gunk.

#

I also wish FPGAs were easier to get into and use.

#

For making things that don't really work well on CPUs nor GPUs.

#

like python -> FPGA numba or something

exotic maple Mar 15, 2021, 11:55 PM

#

anyone had the issue where jupyter lab/notebook cant import NLTK?

#

I have specific env and everything, installed in there and everything

#

but its not working

#

I have verified all pointers in the env are ok as well

lean ledge Mar 15, 2021, 11:57 PM

#

@iron basalt You can write OpenCL code and export it to FPGAs https://www.electronicdesign.com/technologies/fpgas/article/21794531/how-to-put-opencl-into-an-fpga

#

https://www.intel.com.au/content/www/au/en/software/programmable/sdk-for-opencl/overview.html

Intel

Intel® FPGA SDK for OpenCL™ Software Technology

With the Intel® FPGA SDK for Open Computing Language (OpenCL™), you develop FPGA designs in C using a high-level software flow.

grave frost Mar 15, 2021, 11:58 PM

#

exotic maple anyone had the issue where jupyter lab/notebook cant import NLTK?

yeah, its a struggle for some reason

#

thats why I always prefer pre-defined environments

exotic maple Mar 15, 2021, 11:59 PM

#

@grave frost any clue on how to fix it?

iron basalt Mar 15, 2021, 11:59 PM

#

lean ledge @iron basalt You can write OpenCL code and export it to FPGAs https://...

Yeah that's what I use, but it's not on the level of numba, though some python code could be written to generate opencl kernels.

grave frost Mar 16, 2021, 12:00 AM

#

exotic maple @grave frost any clue on how to fix it?

whats the error anyway?

exotic maple Mar 16, 2021, 12:00 AM

#

no module "nltk"

#

but its already installed in the environment and everything

grave frost Mar 16, 2021, 12:00 AM

#

did you activate it?

exotic maple Mar 16, 2021, 12:00 AM

#

obv lol

grave frost Mar 16, 2021, 12:01 AM

#

well, then you did not install it. try force-reinstalling it

exotic maple Mar 16, 2021, 12:01 AM

#

I can use it via cmd

grave frost Mar 16, 2021, 12:01 AM

#

maybe some error you missed

exotic maple Mar 16, 2021, 12:01 AM

#

I can use it via cmd, but not in jupyter

grave frost Mar 16, 2021, 12:01 AM

#

hmm....whats the output with pip and pip3

exotic maple Mar 16, 2021, 12:02 AM

#

already satisfied

#

you can see im in the env as well

#

its also listed

#

I tried everything i can think of

grave frost Mar 16, 2021, 12:05 AM

#

pip3 install nltk

exotic maple Mar 16, 2021, 12:05 AM

#

already tried. same thing

#

req satisfied

grave frost Mar 16, 2021, 12:06 AM

#

when something doesn't work, there is only one solution left

exotic maple Mar 16, 2021, 12:06 AM

#

ima try restarting my pc. bullshit sometimes works after restart

grave frost Mar 16, 2021, 12:06 AM

#

Reboot

exotic maple Mar 16, 2021, 12:06 AM

#

LOL

grave frost Mar 16, 2021, 12:06 AM

#

yea, haha

exotic maple Mar 16, 2021, 12:18 AM

#

fking crap isnt working

#

zzz

#

what the f

#

now it worked when I installed it outside the env...

#

it might be a PATH issue
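
One way to check the PATH theory, sketched under the assumption that the notebook kernel may be running a different interpreter than the one the env uses on the command line. `describe_module` is a hypothetical helper; run it both in a notebook cell and in the terminal Python and compare the results:

```python
import importlib.util
import sys


def describe_module(name: str) -> dict:
    """Report which interpreter is running and whether it can find `name`."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,   # the Python actually executing this
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


print(describe_module("nltk"))
```

If the two `interpreter` paths differ, the notebook is not using the env. Installing with that exact interpreter (`python -m pip install nltk` run with the path shown) or registering the env as a kernel via `python -m ipykernel install --user --name <env-name>` are the usual ways out, though which one applies depends on the setup.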

loud finch Mar 16, 2021, 12:24 AM

#

Did you run any pip stuff in the environment

#

It clearly states that if you run pip inside conda it will break it 100%

#

Then you can throw that env in the trash

grave frost Mar 16, 2021, 12:27 AM

#

That has to be the weirdest argument I have ever seen `max_features=0.7000000000000001`

loud finch Mar 16, 2021, 12:28 AM

#

Thats like.. Ok bro

serene scaffold Mar 16, 2021, 12:40 AM

#

grave frost That has to be the weirdest argument I have ever seen `max_features=0.7000000000...

without knowing the context, could be one of those people who don't understand how floating point precision works.
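
A guess at how a value like that can appear, since the original context is unknown: a parameter grid built by multiplying or accumulating 0.1 in binary floating point. The grid below is an illustrative assumption, not the code that produced the argument:

```python
import math

# Naive grid of fractions 0.1, 0.2, ..., 0.9 built by multiplication.
grid = [0.1 * k for k in range(1, 10)]

# repr shows every stored digit, so tiny binary rounding errors become visible.
print(grid[6])

# The value is still 0.7 for all practical purposes.
print(math.isclose(grid[6], 0.7))

# Rounding when building the grid keeps the printed arguments tidy.
clean = [round(v, 1) for v in grid]
print(clean[6])
```

0.1 has no exact binary representation, so results can be off in the last digit, and printing a parameter dictionary exposes that.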