#data-science-and-ml | Python | Page 279

earnest forge Jan 13, 2021, 2:10 PM

#

If someone is good with hardware, could they help me get tensorflow-gpu running on my PC? It has one graphics card, a GeForce 1660 Super with 6 GB and CUDA support. I installed the necessary software, such as the CUDA drivers and cuDNN, but it still doesn't work. I'm running Jupyter through Anaconda

mellow pumice Jan 13, 2021, 3:27 PM

#

I use WSL-2 on Windows to run TensorFlow GPU. I figure most of the steps will be similar even if you are using Ubuntu. There's this video by Jeff Heaton -> https://youtu.be/mWd9Ww9gpEM

YouTube

Jeff Heaton

Use PyTorch and TensorFlow with an NVIDIA GPU in the Windows Linux ...

The Windows Subsystem for Linux (WSL-2) allows you to run a complete command-line Linux operating system under Windows. Now that NVIDIA offers a passthrough driver you can access the GPU from the Linux system in Windows. In this video I show how to install a prerelease version of windows that allows this functionality, which allows you to run t...


#

You can use the jupyter notebook to check whether the gpu is available or not

#

You would want to stick to the same versions though @earnest forge
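
A minimal sketch of the notebook check mentioned above, assuming TensorFlow 2.x (the `list_gpus` helper name is made up for illustration). If the list comes back empty, TensorFlow is installed but cannot see the card, which usually points at a CUDA/cuDNN version mismatch rather than at the hardware.

```python
def list_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.config.list_physical_devices("GPU")


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("GPUs visible to TensorFlow:", gpus)
else:
    print("TensorFlow is installed but sees no GPU")
```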

earnest forge Jan 13, 2021, 3:35 PM

#

thanks

hollow scarab Jan 13, 2021, 4:08 PM

#

how do I multiply columns in pandas without column names? because this one did not work

📎 unknown.png

#

so I mean I want to refer to the 2. and 3. column

#

by 1 and 2

#

okay so the issue might be that the first 2 rows have texts in them, is there any way I can make it so it only does that formula for the 3.-x rows?

woeful hamlet Jan 13, 2021, 4:40 PM

#

my colab session resets because i exceed the ram limit. Is there any other platform where i have a bit more ram? or a way i can do what i want within that ram? basically it's because i am appending many images to a list for my train data

#

like... appending only half of all the images, training the model, saving it, loading the other half (overwriting the previous ones), and training again

#

will this work? or something?
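
One way to stay within the RAM limit is to never hold all images at once and load them batch by batch instead. The sketch below is only an illustration: `load_image` is a hypothetical stand-in for a real loader (for example one built on PIL or OpenCV), and the file names and image size are made up.

```python
import numpy as np


def load_image(path):
    # Hypothetical loader: a real one would read and resize the file at `path`.
    return np.zeros((100, 100, 3), dtype=np.float32)


def batch_generator(paths, labels, batch_size):
    """Yield (images, labels) batches so only one batch is in memory at a time."""
    for start in range(0, len(paths), batch_size):
        chunk = paths[start:start + batch_size]
        images = np.stack([load_image(p) for p in chunk])
        yield images, np.asarray(labels[start:start + batch_size])


paths = [f"img_{i}.png" for i in range(10)]  # made-up file names
labels = list(range(10))
batches = list(batch_generator(paths, labels, batch_size=4))
print([b[0].shape for b in batches])
```

Each batch can then be passed to the training loop on its own (in Keras, for example, with `model.train_on_batch`). Training on one half and then the other can also work, but mixing small batches from the whole dataset usually behaves better than two long passes over different halves.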

high lion Jan 13, 2021, 5:41 PM

#

hollow scarab how do I multiply columns in pandas without column names? because this one did n...

Check out the pandas documentation.
But as far as I am concerned the appropriate method is DataFrame.iloc[]
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html?highlight=iloc

#

@hollow scarab

woeful hamlet Jan 13, 2021, 5:49 PM

#

what is the most efficient way to loop over all image pixels with numpy?

high lion Jan 13, 2021, 5:54 PM

#

Depends on what you want to do. Numpy has built-in functions to perform certain operations such as thresholding @woeful hamlet
I guess if you want to perform an action like this it would be most efficient to use those.
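
To illustrate the point about built-in operations, here is a small sketch comparing an explicit pixel loop with a vectorized threshold. The tiny array and the cutoff of 128 are arbitrary example values.

```python
import numpy as np

img = np.array([[10, 200, 130],
                [90, 128, 255]], dtype=np.uint8)

# Slow: explicit Python loop over every pixel.
looped = np.zeros_like(img)
for r in range(img.shape[0]):
    for c in range(img.shape[1]):
        looped[r, c] = 255 if img[r, c] > 128 else 0

# Fast: one vectorized expression over the whole array.
vectorized = np.where(img > 128, 255, 0).astype(np.uint8)

print(vectorized)
```

Both versions give the same result; the vectorized one runs the loop in compiled code, which matters a lot for full-size images.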

woeful hamlet Jan 13, 2021, 5:55 PM

#

i asked on sulfur uwu

#

if u wanna take a look there

sour beacon Jan 13, 2021, 6:10 PM

#

why does my database keep locking

late shell Jan 13, 2021, 6:35 PM

#

hello everyone, I just started learning ML a few days ago, and am confused in the data preprocessing section, especially feature scaling. Can someone clear this up for me: If I'm scaling down/normalizing my data (which I don't clearly understand why), then, while providing unseen/test data to my trained model, wouldn't I have to scale that down as well, and then scale up the predictions back up again in order to make sense out of it?

high badge Jan 13, 2021, 6:49 PM

#

it depends on what models you are working with

#

for decision trees, they just find points to divide the data to minimize an impurity score, meaning they dont rely on scaling

#

however if you look at linear regression, then you can think of wx + wx + wx... + b as a linear combination of sums

#

if an input x_k (k going from 1 to n where your dataset is m instances by n features) is a large number, then it would naturally contribute to a larger output y = wx + wx + wx... + b

#

and a larger output when measured in the loss function would produce a greater loss

#

and because you are minimizing the loss with respect to the weights, you must compensate for the large values of x_k by reducing the weight for x_k to near 0

#

thus your optimization would pay more attention to one feature above another

#

where, ideally, you want your optimization to give equal attention to all features

#

yes, you would have to scale not only your training and validation data but also unseen data

#

but you dont have to scale the predictions back up again

late shell Jan 13, 2021, 7:13 PM

#

oh okay. Thank you very much @high badge

#

I cant say I understood 100% of what u said, but i see some reason now. thanks for the input

high badge Jan 13, 2021, 7:15 PM

#

ah

#

well the simple idea behind feature scaling is just to give equal attention to all features so that when you optimize it with an algorithm, it wont pay more attention to one feature above another

ripe forge Jan 13, 2021, 8:01 PM

#

I'd also like to add, that you "learn" how to scale down the data using training data, and then you "implement" it on the training data, and you also "implement" it at the time of predictions. However, it is important you don't "re-learn" a scale on the test/prediction data

late shell Jan 13, 2021, 8:08 PM

#

yes, I just encountered this problem rn,

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()                # Standardization
X_train[:, 3:] = sc.fit_transform(X_train[:, 3:])  # I want to scale only from column 3 onwards
X_test[:, 3:] = sc.transform(X_test[:, 3:])        # reuse the mean & S.D learnt from the train data

Why do I have to scale the test data according to the parameters (mean & S.D) of my train data. Why am I not calling sc.fit() on the test data as well?

lapis sequoia Jan 13, 2021, 8:10 PM

#

Hi, is there someone who could help me with linear regression? I am still a little bit unsure of which model I should make based on my data

ripe forge Jan 13, 2021, 8:30 PM

#

late shell yes, I just encountered this problem rn, ```py from sklearn.preprocessing impor...

because a scaling operation is like a "Transformation" that you decided on. the specific values were learnt from the train data, but that's just a finer point compared to the broad picture of what it actually is. Now, your model is trained based on inputs that have been scaled a certain way. So the weights this model learnt are tied to the scales at which the inputs were fed to it. Now, if you keep changing the scales for every prediction/test data, then your model's weights would be wrong corresponding to the modified scales.

#

so, logically, changing the scale after locking down a model makes no sense, and would harm your performance

#

The other version of explaining this is much simpler though: at training time you have the information of a dataset, at the time of live predictions you may only have 1 row of data at a time or something, and can't know the properties of the distribution of unseen data.

#

However, i think the first style of explaining is more technically precise
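
The "learn on train, apply everywhere" idea can be written out by hand. The sketch below uses plain NumPy with made-up numbers; it mirrors what `fit_transform` and `transform` do in `StandardScaler`, which also uses the population standard deviation.

```python
import numpy as np

X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

# "fit": learn the scaling parameters from the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# "transform": apply the same learned parameters to every dataset.
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std

print(X_test_scaled)
```

Refitting on the single test row would not even be possible here (its standard deviation is 0), and it would put the test features on a different scale from the one the model's weights were learnt on. Predictions only need to be scaled back if the target was scaled too.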

shell wing Jan 13, 2021, 9:15 PM

#

Anyone have insight into Folium, I have been struggling with this for a while https://gis.stackexchange.com/questions/384248/folium-and-timestamped-geojson-issue-not-reading-the-data-correctly

Geographic Information Systems Stack Exchange

Folium and timestamped GeoJSON issue , not reading the data correctly

I am having issues with Folium and using TimestampedGeoJson. I have the following dataframe structure below. I am trying to display this data with Folium and a time slider to be used on the date fi...

hollow scarab Jan 13, 2021, 10:47 PM

#

@high lion i did check it, didnt find anything:/ doesnt iloc just remove part of the df?

high lion Jan 13, 2021, 10:49 PM

#

hi again @hollow scarab

hollow scarab Jan 13, 2021, 10:50 PM

#

hello, sorry my discord was being weird, only saw the ping now

high lion Jan 13, 2021, 10:50 PM

#

🙂 nvm

#

did you check out my link?

hollow scarab Jan 13, 2021, 10:52 PM

#

yeah, and I used iloc before in the code, but my issue is that if I remove the row with text using iloc, I need that row back later

velvet thorn Jan 13, 2021, 10:56 PM

#

high badge however if you look at linear regression, they you can think of wx + wx + wx... ...

...are you saying that scaling is necessary for linear regression?

high lion Jan 13, 2021, 10:56 PM

#

iloc should not remove anything from your df

import pandas as pd

mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]

df = pd.DataFrame(mydict)
print(df.iloc[1])  # selects the second row, df itself is untouched
print(df)

#

output:
```
a    100
b    200
c    300
d    400
Name: 1, dtype: int64
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
```

loud marlin Jan 13, 2021, 10:59 PM

#

Question about Spark...

I have the impression that Spark is widely used and that it's fast

Today I took the Spark course and learned it's built on RDDs, and that RDDs are much slower than DataFrames

————————————
Then it crossed my mind... does Spark really help us process data faster?

Yes, the data is split into partitions, and being able to cache them definitely helps the speed.

However, with so many modules that optimize DataFrames, is Spark really needed?

Please help me understand it 🙂

velvet thorn Jan 13, 2021, 11:02 PM

#

loud marlin Question about Spark... I have the impression that spark is widely use and it’s...

when you say "dataframe"

#

what kind of dataframe do you mean?

high lion Jan 13, 2021, 11:04 PM

#

hollow scarab yeah, and I used iloc before in the code, but my issue is that if I remove that ...

@hollow scarab maybe your dataloss comes from something else?

hollow scarab Jan 13, 2021, 11:05 PM

#

so I can iloc, do the operation and then 'remove' the iloc to get all the data back? @high lion

velvet thorn Jan 13, 2021, 11:06 PM

#

hollow scarab so I can iloc, do the operation and then 'remove' the iloc to get all the data b...

.iloc is purely a data accessor; it does not modify your dataframe.

hollow scarab Jan 13, 2021, 11:06 PM

#

oh okay

#

but how do I put the df back to its original size

velvet thorn Jan 13, 2021, 11:06 PM

#

in fact, in general, pandas methods return a new object rather than modifying the dataframe in place

velvet thorn Jan 13, 2021, 11:07 PM

#

hollow scarab but how do I put the df back to its original size

can you explain what you want to do + show your code

loud marlin Jan 13, 2021, 11:07 PM

#

velvet thorn what kind of dataframe do you mean?

I assume it's like the DataFrame you create with pandas.

Sorry, I am new to Python and Spark. But the instructor kept mentioning that DataFrames are faster than RDDs, so I asked

hollow scarab Jan 13, 2021, 11:07 PM

#

I need to add a new column by multiplying 2 other columns but the first 2 rows have text so I get an error @velvet thorn

velvet thorn Jan 13, 2021, 11:07 PM

#

loud marlin It’s like the dataframe create from pandas I assume. Sorry that I am new to Py...

okay, so first, Spark has dataframes too

#

but anyway

#

pandas and Spark serve fundamentally different needs

hollow scarab Jan 13, 2021, 11:08 PM

#

so I got suggested to use iloc to remove those text rows so I could add the new column

velvet thorn Jan 13, 2021, 11:08 PM

#

pandas is for data that can fit in memory

hollow scarab Jan 13, 2021, 11:08 PM

#

but I will need those rows with texts in them later

velvet thorn Jan 13, 2021, 11:08 PM

#

hollow scarab so I got suggested to use iloc to remove those text rows so I could add the new ...

what are the columns called

#

so in general, with pandas the biggest dataset you can work with will be a few GB?

hollow scarab Jan 13, 2021, 11:08 PM

#

I can show it better tomorrow, issue is its on work pc

velvet thorn Jan 13, 2021, 11:08 PM

#

on the other hand, through distributed processing, Spark can handle datasets that are much bigger (say, hundreds of GB)

#

however, distribution of work has overhead.

#

so for small datasets, pandas will more or less always be a lot faster.

hollow scarab Jan 13, 2021, 11:09 PM

#

well I transposed the df, so the columns just have the index number as name

#

but if its okay I can tag you tomorrow with screenshots, should be easier to explain that way I think

velvet thorn Jan 13, 2021, 11:10 PM

#

don't post code as images

#

post it as text

#

anyway, this is what I would suggest

hollow scarab Jan 13, 2021, 11:11 PM

#

I meant pics of the df in excel

velvet thorn Jan 13, 2021, 11:11 PM

#

no thanks

#

pics are hard to see

hollow scarab Jan 13, 2021, 11:11 PM

#

I cant send the code sadly, not allowed to send stuff like that to external emails:/

velvet thorn Jan 13, 2021, 11:12 PM

#

!e ```py
import pandas as pd

df = pd.DataFrame([['text', 2], [3, 4]], columns=['a', 'b'])
print(df)

df['result'] = pd.to_numeric(df['a'], errors='coerce') * pd.to_numeric(df['b'], errors='coerce')
print(df)
```

arctic wedgeBOT Jan 13, 2021, 11:12 PM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 |       a  b
002 | 0  text  2
003 | 1     3  4
004 |       a  b  result
005 | 0  text  2     NaN
006 | 1     3  4    12.0

loud marlin Jan 13, 2021, 11:12 PM

#

So if the data is small, it's better to use pandas

If the data is large beyond some point, or the task is highly time-consuming, Spark is the way to go?

velvet thorn Jan 13, 2021, 11:12 PM

#

@hollow scarab I would suggest something like this

#

so you don't need to remove the rows and add them back

velvet thorn Jan 13, 2021, 11:12 PM

#

loud marlin So if the data is small, it’s better to use pandas If data is large to some poi...

more the former

#

large data

loud marlin Jan 13, 2021, 11:12 PM

#

And even if RDDs are slow, it will not significantly impact Spark's overall performance?

hollow scarab Jan 13, 2021, 11:12 PM

#

I will try that tomorrow, thanks a lot! @velvet thorn

velvet thorn Jan 13, 2021, 11:13 PM

#

loud marlin And even RDD is slow, it will not significantly impact spark’s overall performan...

"slow" is relative

hollow scarab Jan 13, 2021, 11:14 PM

#

that works if I just use df[1] and df[2] referring to the 2. and 3. columns right? @velvet thorn

#

instead of their name in string

velvet thorn Jan 13, 2021, 11:14 PM

#

hollow scarab that works if I just use df[1] and df[2] referring to the 2. and 3. colums right...

if those are their names

hollow scarab Jan 13, 2021, 11:15 PM

#

oh so I can only use names?

#

cant use a number like n. column

velvet thorn Jan 13, 2021, 11:16 PM

#

columns can have numbers as names

#

but if you want to refer to a column by position you need iloc

hollow scarab Jan 13, 2021, 11:17 PM

#

pd.to_numeric(df.iloc[2:,:]) so like this?

velvet thorn Jan 13, 2021, 11:17 PM

#

nope

#

I suggest you check out the documentation and experiment a little
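
A sketch of how the two suggestions fit together: select each column by position with `.iloc[:, n]`, convert it with `pd.to_numeric(..., errors='coerce')`, then multiply. The small frame below is invented to resemble a sheet whose first two rows contain text.

```python
import pandas as pd

df = pd.DataFrame([
    ["id", "qty", "price"],    # text row
    ["x", "units", "eur"],     # text row
    ["a", 2, 3.5],
    ["b", 4, 1.5],
])  # no column names given, so they default to 0, 1, 2

# Second and third column by position; text becomes NaN instead of raising.
second = pd.to_numeric(df.iloc[:, 1], errors="coerce")
third = pd.to_numeric(df.iloc[:, 2], errors="coerce")

df["result"] = second * third
print(df)
```

`pd.to_numeric` works on one column (a Series) at a time, which is why passing a whole slice such as `df.iloc[2:, :]` fails. No rows are removed, so the text rows are still there afterwards with `NaN` in the new column.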

hollow scarab Jan 13, 2021, 11:17 PM

#

well I will try to use the name the index has

#

my main problem is that the original excel file I have to work in is garbage

#

so the df is not clean at all

loud marlin Jan 13, 2021, 11:19 PM

#

@velvet thorn thanks for your explanation

I was confused because two contradictory ideas came to me at once...

Spark is the leading way to distribute and process data

Yet it processes RDDs in the background, which is slower than processing with DataFrames

I guess I shouldn’t worry about it too much at this point 🧐

velvet thorn Jan 13, 2021, 11:22 PM

#

loud marlin @velvet thorn thanks for your explanation I was confuse because there a...

think about it this way

#

pandas is like a single sports car

#

Spark is like a fleet of trucks

#

if you just need to transport one box

#

the sports car is faster

#

but if you have 10,000 boxes

#

even if the sports car individually can make a trip quickly

#

you have so much stuff that the fleet of trucks' capacity more than makes up for their lack of speed

loud marlin Jan 13, 2021, 11:25 PM

#

That helps 🙂

woeful snow Jan 13, 2021, 11:36 PM

#

Hi everyone

#

I'm wondering if somebody could help me out on some pandas functionality that I'm sure must be there - I just don't know it. I want to generate a Pandas series that starts with a seed value and then cumulatively adds a value each time for n times.

velvet thorn Jan 13, 2021, 11:38 PM

#

woeful snow I'm wondering if somebody could help me out on some pandas functionality that I'...

show an example so I am sure what you mean

woeful snow Jan 13, 2021, 11:38 PM

#

For example seed=21.35, addval=0.1, length=200 [21.35, 21.45, 21.55 ..... ] till length = 200

velvet thorn Jan 13, 2021, 11:38 PM

#

ah

#

simple

#

!e

import numpy as np
import pandas as pd

seed = 21.35
step = 0.1
count = 200

print(pd.Series(np.arange(seed, seed + step * count, step)))

arctic wedgeBOT Jan 13, 2021, 11:40 PM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 | 0      21.35
002 | 1      21.45
003 | 2      21.55
004 | 3      21.65
005 | 4      21.75
006 |        ...  
007 | 195    40.85
008 | 196    40.95
009 | 197    41.05
010 | 198    41.15
011 | 199    41.25
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/azevojoneq.txt

velvet thorn Jan 13, 2021, 11:41 PM

#

woeful snow For example seed=21.35, addval=0.1, length=200 [21.35, 21.45, 21.55 ..... ] t...

is this what you were thinking

woeful snow Jan 13, 2021, 11:41 PM

#

Just running it now, but it looks like exactly what I want!

#

Thank you that is exactly what I needed

#

I'll go away and read the documentation to figure out how it works
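
A small variation on the snippet above, in case the exact length matters. The NumPy documentation warns that `np.arange` with a non-integer step can produce one element more or fewer than expected because of floating-point rounding, so this sketch builds the index with integers first and scales it afterwards.

```python
import numpy as np
import pandas as pd

seed, step, count = 21.35, 0.1, 200

# np.arange(count) always has exactly `count` elements: 0, 1, ..., count - 1
series = pd.Series(seed + step * np.arange(count))

print(len(series), series.iloc[0], series.iloc[-1])
```

`np.linspace(seed, seed + step * (count - 1), count)` is another way to fix the number of points up front.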

#

I'm trying to learn by converting a basic excel sheet -> python to get the hang of basic data building and conversions 🙂

digital crescent Jan 14, 2021, 12:13 AM

#

I'm thinking about doing a lot of realtime data analysis that will probably involve recalibrating forecasts based on new data points as they arrive. I have no formal machine learning background, but I know a bit about stats and feel like I can build a logical system to classify data and model and analyze and find ways to optimize this process. Am I missing something by not knowing what to do with the "machine learning" topic? Is it perfectly okay to just work on a project like this, do your own stats, program your own logic to reconfigure your models and reevaluate, etc

#

Or am I missing some kind of special "machine learning" sauce?

desert parcel Jan 14, 2021, 1:00 AM

#

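percentages = []

# Ratio of each prediction to its target, as a percentage
for pred, target in zip(preds.t()[0], testTargets.t()[0]):
    percent = pred / target * 100
    percentages.append(percent)

# Average the percentages ("total" avoids shadowing the built-in sum)
total = 0
for percent in percentages:
    total += percent

accu = total / len(percentages)
accu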

#

Is this a good way to calculate average accuracy?

midnight widget Jan 14, 2021, 3:28 AM

#

@desert parcel You can do the *100 at the end somewhere to make it a little more efficient
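
For comparison, a vectorized sketch of the same calculation with made-up NumPy arrays in place of the tensors. It also computes the mean absolute percentage error (MAPE), a swapped-in metric that is not in the original code, because a plain average of `pred / target` lets over- and under-predictions cancel out.

```python
import numpy as np

preds = np.array([9.0, 22.0, 27.0])
targets = np.array([10.0, 20.0, 30.0])

# Same quantity as the loop: average of pred / target, times 100 once at the end.
mean_ratio_pct = np.mean(preds / targets) * 100

# Alternative: average size of the relative error, regardless of direction.
mape = np.mean(np.abs(preds - targets) / np.abs(targets)) * 100

print(mean_ratio_pct, mape)
```

Here every prediction is off by 10%, yet the averaged ratio comes out near 97% because the errors point in different directions.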

misty flint Jan 14, 2021, 4:10 AM

#

digital crescent I'm thinking about doing a lot of realtime data analysis that will probably invo...

i dont understand the question. if the analysis youre doing works just fine for your needs, then that is okay, no?

#

what machine learning is good at is using certain types of algorithms to get really, really good at predicting values

#

sometimes it works well, sometimes it doesn't. the best models use mixed models

digital crescent Jan 14, 2021, 4:12 AM

#

I guess I just need to read more about it. I don't understand why it exists as a kind of separate topic if we are all using the same stats, math, and logic

misty flint Jan 14, 2021, 4:16 AM

#

@digital crescent these are the kind of algorithms im talking about.

📎 unknown.png

digital crescent Jan 14, 2021, 4:20 AM

#

So roughly speaking "machine learning" is kind of a catch-all term for basically everything in your pic and related topics? @misty flint

misty flint Jan 14, 2021, 4:26 AM

#

it technically also includes deep learning and a whole brand of other subfields

#

neural networks, computer vision, natural language processing, etc.

#

let me find a diagram my prof showed us

#

it will make more sense than me

#

DoggoKek

#

📎 unknown.png

#

Data Science shares some techniques with Machine Learning

#

but theres also many standalone techniques

#

different tools you can use depending on your circumstances

#

and then Deep Learning is a subset of Machine Learning that is growing in popularity

#

AI is the umbrella/parent field

midnight widget Jan 14, 2021, 4:44 AM

#

@misty flint Does data science include traditional nonparametric statistics?

#

Cuz I noticed it doesnt overlap AI completely

digital crescent Jan 14, 2021, 4:47 AM

#

Thanks for the explanation and diagrams, @misty flint

misty flint Jan 14, 2021, 4:50 AM

#

midnight widget @misty flint Does data science include traditional nonparametric stati...

i would say that falls under both. here's another venn diagram from the same presentation DoggoKek

#

📎 unknown.png

misty flint Jan 14, 2021, 4:51 AM

#

digital crescent Thanks for the explanation and diagrams, @misty flint

no problem my friend. i would at least take a look at some introductory material. see if their techniques/the algorithms would be better or worse than what youre already doing

digital crescent Jan 14, 2021, 4:56 AM

#

I will do that, thanks. I want to find a balance between not being egotistical and acting like I know everything (which I absolutely do not) but also not acting like ML is some kind of magic technology from the gods

#

And somewhere I want to learn what I need to learn for my use cases and then apply it

midnight widget Jan 14, 2021, 6:09 AM

#

Hi all! What are some super interesting data science projects I can do to learn concepts? I love math and building things from scratch as much as I can.

hollow scarab Jan 14, 2021, 11:24 AM

#

is it possible to change the value of one cell with pandas?

#

because I created a new column, but the name of the column went in the index, and after tranposing the df it got lost..

red briar Jan 14, 2021, 12:51 PM

#

https://datatofish.com/replace-values-pandas-dataframe/

Data to Fish

How to Replace Values in Pandas DataFrame - Data to Fish

In this guide, I'll show you how to replace values in Pandas DataFrame. I'll review several examples for demonstration purposes.
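
For a single cell, `DataFrame.at` (by label) and `DataFrame.iat` (by position) are the direct accessors. A short sketch on an invented frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

df.at["y", "b"] = 40   # row label "y", column label "b"
df.iat[0, 0] = 10      # first row, first column, by position

print(df)
```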

hollow scarab Jan 14, 2021, 1:05 PM

#

thank you

#

now last thing I need is to put this column i created as the 1. column
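
One way to move an existing column to the front is to `pop` it and `insert` it back at position 0. The column name `new` below is just a placeholder.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "new": [5, 6]})

# pop removes the column and returns it; insert puts it back at position 0.
df.insert(0, "new", df.pop("new"))

print(list(df.columns))
```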

wise garden Jan 14, 2021, 3:10 PM

#

Does anyone use nbopen.exe to open their ipynb files? I can't get it to work on Windows 10.0.19041.746

tawny pivot Jan 14, 2021, 3:44 PM

#

Hello friends, I have this data with the columns you can see in the photo. There are duplicates in the ID column.
For each ID (in some places it is repeated 10 or more times) that occurs on different dates, I need to take the difference
of the amounts in the third column, ordered by date. I mean I need the amount at t+1 minus the amount at t.

📎 unknown.png
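
This kind of per-ID difference usually does not need an explicit loop. The sketch below assumes columns named `ID`, `date` and `amount`; the real names are only visible in the screenshot, so these are guesses, and the data is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 1, 2, 1],
    "date": pd.to_datetime(
        ["2021-01-03", "2021-01-01", "2021-01-01", "2021-01-05", "2021-01-02"]
    ),
    "amount": [30, 100, 10, 70, 25],
})

# Order each ID's rows by date, then subtract the previous amount within each ID.
df = df.sort_values(["ID", "date"])
df["diff"] = df.groupby("ID")["amount"].diff()

print(df)
```

The first observation of every ID gets `NaN`, since there is no earlier amount to subtract.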

misty flint Jan 14, 2021, 4:03 PM

#

i found the best dataset to work with

📎 unknown.png

#

DoggoKek

hollow void Jan 14, 2021, 5:23 PM

#

An ML newbie wanting to implement licence plates detection on images in python, any considerations or tips for choosing between pytorch and tensorflow?

warm seal Jan 14, 2021, 7:52 PM

#

https://stackoverflow.com/questions/65539062/xgboost-custom-loss-functions-mismatch

Stack Overflow

XGBoost Custom Loss Functions Mismatch

The script has two losses, the squared loss L_a = (y-F(x))^2 and the same loss but with a 0.5 factor: L_b = 0.5*(y-F(x))^2. Using L_a gets me trees with one splits (even if max_split is set to >...

#

Any help would be appreciated :))

woeful hamlet Jan 14, 2021, 8:23 PM

#

hi

#

i wanna use this

#

https://jacobgil.github.io/deeplearning/class-activation-maps

Jacob Gildenblat

Class activation maps in Keras for visualizing where deep learning ...

Github project for class activation maps Github repo for gradient based class activation maps

#

But keras is saying "better use tf.GradientTape"

#

instead of K.gradients

#

But if i do that, then it says GradientTape cant be indexed

#

Can someone help me?
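
One plausible cause, offered as a guess since the failing line is not shown: `K.gradients(...)` returns a list, so older code indexes it with `[0]`, whereas a `tf.GradientTape` is a context manager and the gradient comes from calling `tape.gradient(target, source)` after recording the forward pass. A minimal sketch, assuming TensorFlow 2.x (`grad_of_square` is a made-up helper):

```python
def grad_of_square(value):
    """Return d(x*x)/dx at `value`, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    x = tf.constant(value)
    with tf.GradientTape() as tape:
        tape.watch(x)          # constants are not tracked automatically
        y = x * x              # forward pass recorded by the tape
    # With a single source, tape.gradient returns a tensor, not a list.
    return float(tape.gradient(y, x))


result = grad_of_square(3.0)
print(result)
```

For a class activation map, the same pattern would apply with the class score as the target and the convolutional layer's output as the source.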

woeful hamlet Jan 14, 2021, 9:10 PM

#

Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 100, 100, 3), dtype=tf.float32, name='input_3'), name='input_3', description="created by layer 'input_3'") at layer "block1_conv1". The following previous layers were accessed without issue: []

#

https://colab.research.google.com/drive/1d-IbgC3JsIbOR9YmfnN3VlD0GdxYQEPm?usp=sharing

Google Colaboratory

stark orchid Jan 14, 2021, 9:25 PM

#

Hey All,

Here's a cool opportunity to contribute to an awesome open source tool, https://github.com/great-expectations/great_expectations, and gain some great experience. The Great Expectations Team is hosting a series of hackathons, there will be three different event times and two of them are for current university students. Expect swag, doordash credit and cash prizes. You will be joined by the core team to help you contribute!

Dates:
Student Hackathon 1/23 5-9pm PST (students only, must be currently attending a university)
Data Professionals Hackathon 1/28 5-9pm PST
Student Hackathon 2/6 2-6pm PST (students only, must be currently attending a university)

Sign up here: https://www.surveymonkey.com/r/great-expectations-hackathon-3
We blogged about it here: https://greatexpectations.io/blog/great-expectations-hackathons/

misty slate Jan 14, 2021, 10:22 PM

#

Hi, I need some help with Tensorflow/TensorBoard

#

I made a chatbot using TensorFlow. I changed it for a discord bot and flask. But for my project I want to somehow show ANY DATA, but in visual form, graphs, pie charts, bars. I don't know how to use TensorBoard to visualize my chatbot data.

This is my code: https://github.com/hootloot/Tensorflow-Question/blob/main/main.py

Thank you

GitHub

hootloot/Tensorflow-Question

Contribute to hootloot/Tensorflow-Question development by creating an account on GitHub.

velvet thorn Jan 14, 2021, 11:09 PM

#

hollow void An ML newbie wanting to implement licence plates detection on images in python, ...

IMO there isn't enough of a difference in that case to matter

hollow void Jan 14, 2021, 11:23 PM

#

velvet thorn IMO there isn't enough of a difference in that case to matter

Thanks @velvet thorn

hasty dagger Jan 14, 2021, 11:35 PM

#

Hi guys, I created a question in one of the help channels but it doesn't seem like anyone there is able to help me. Is it alright for me to ask the question in here?

velvet thorn Jan 14, 2021, 11:38 PM

#

hasty dagger Hi guys, I created a question in one of the help channels but it doesn't seem li...

if it's a DS question, sure

hasty dagger Jan 14, 2021, 11:40 PM

#

I'm working through a project where I'm using matplotlib but I'm having a weird issue where I'm being returned "<Figure size 1440x720 with 0 Axes>" even though it renders the graph correctly, however it is not being scaled with figsize.

velvet thorn Jan 14, 2021, 11:42 PM

#

hasty dagger I'm working through a project where I'm using matplotlib but I'm having a weird ...

huh

#

what are you trying to set figsize to

#

🥴

#

@hasty dagger don't do that

#

don't create your own figure

#

just pass figsize to test.plot

#

if you want to use pandas's plotting interface

hasty dagger Jan 14, 2021, 11:46 PM

#

@velvet thorn cheers, I'll keep that in mind, works perfectly

velvet thorn Jan 14, 2021, 11:47 PM

#

hasty dagger @velvet thorn cheers, I'll keep that in mind, works perfectly

yeah, what's happening is

#

you're creating one Figure manually

#

but pandas is creating its own Figure and Axes and plotting on that

#

which is what you see

velvet thorn Jan 14, 2021, 11:48 PM

#

velvet thorn you're creating one `Figure` manually

but this is what is shown in the text output after plt.show()

#

if you want to do that

#

you can create a Figure and Axes manually with plt.subplots

#

and then pass the Axes to the plot method of the DF
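
The last suggestion could look like the sketch below. The DataFrame is a stand-in for `test` from the question, and `figsize=(20, 10)` is an assumed value. The `Agg` backend line is only there so the sketch runs without a display; it would be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

test = pd.DataFrame({"value": [1, 3, 2, 5]})  # stand-in data

# Create the Figure and Axes once, then tell pandas to draw on that Axes.
fig, ax = plt.subplots(figsize=(20, 10))
test.plot(ax=ax)

print(fig.get_size_inches(), len(fig.axes))
```

Because pandas draws on the Axes it is given, there is no second, empty Figure left over.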

hasty dagger Jan 14, 2021, 11:53 PM

#

Thanks very much! I knew it would've been something silly. I'm still fairly new to python data science type stuff and I'm very grateful you were able to help!

velvet thorn Jan 15, 2021, 12:04 AM

#

hasty dagger Thanks very much! I knew it would've been something silly. I'm still fairly new ...

yup no worries!

rich slate Jan 15, 2021, 4:52 AM

#

Hello I am in need of assistance. I am going to sleep right now so if you can dm me with possible solutions that would be great!

I want to know how to make a script that can solve math equations, like complex ones, not things such as 5 + 2 = c. For example, given 5x + 4y = 200 and another equation such as -5x + 2y = 300, it should use elimination to find the solution.

velvet thorn Jan 15, 2021, 8:23 AM

#

rich slate Hello I am in need of assistance. I am going to sleep right now so if you can dm...

hm.

#

depends on how complicated and flexible you want to make it

#

need to know more
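
For the two example equations from the question, a purely numeric sketch is to write the system as a matrix and let `numpy.linalg.solve` do the elimination. This only covers linear systems with as many equations as unknowns; parsing equations from text or solving symbolically would need something more, for example SymPy.

```python
import numpy as np

# 5x + 4y = 200
# -5x + 2y = 300
A = np.array([[5.0, 4.0],
              [-5.0, 2.0]])
b = np.array([200.0, 300.0])

x, y = np.linalg.solve(A, b)
print(x, y)
```

Adding the two equations by hand gives 6y = 500, so y = 250/3 and x = -80/3, which is what the solver returns up to rounding.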

vivid maple Jan 15, 2021, 8:26 AM

#

Anyone know where i can learn hadoop and its ecosystem

#

any course or mooc that is fast paced?

rapid wraith Jan 15, 2021, 11:45 AM

#

I have a dataframe of stock prices with individual stocks as columns and the index is a datetime index. The data is padded so when there is no longer any observations for a stock the last value keeps repeating until the end. I am trying to remove this padding so that when the price stops changing the remaining values are converted to NaN. I tried to do this by creating a boolean mask by (df == df[::-1].expanding().mean()[::-1]) however something is not right and the returned boolean mask is not correct. Does anyone know what is going wrong or of a better solution?

mortal trout Jan 15, 2021, 1:00 PM

#

is there any trained model for image spam/ham detection if not how cld i make one

rapid wraith Jan 15, 2021, 1:29 PM

#

rapid wraith I have a dataframe of stock prices with individual stocks as columns and the ind...

The issue seems to be that the values don't match because the mean value has more decimals, rounding the figures fixed the issue for now
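
An alternative sketch that avoids comparing against a floating-point mean: mark every row where a price differs from the previous row, then keep everything up to and including the last such change in each column. The tickers and prices are invented.

```python
import pandas as pd

idx = pd.date_range("2021-01-01", periods=5, freq="D")
df = pd.DataFrame({"AAA": [1.0, 2.0, 3.0, 3.0, 3.0],   # padded after day 3
                   "BBB": [5.0, 6.0, 7.0, 8.0, 9.0]},  # still trading
                  index=idx)

# True wherever the price differs from the previous row.
changed = df.ne(df.shift())

# Walking backwards, a row is kept once any later-or-equal change has been seen.
keep = changed.astype(int).iloc[::-1].cummax().iloc[::-1].astype(bool)

unpadded = df.where(keep)
print(unpadded)
```

One caveat: a stock whose price genuinely stays flat through the final days would be treated as padding too.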

lavish tundra Jan 15, 2021, 1:32 PM

#

someone help me pls? i have a json file with a lot of 'prices' per 'timestamp' and i was trying to create an XY plot of it, but idk how to give each 'timestamp' an id so i can put them in order on the X axis

📎 unknown.png

stray ivy Jan 15, 2021, 2:58 PM

#

anyone know how to use a Levenshtein distance matrix to determine how "similar" one word is to another? the algorithm to generate the matrix is easy, but idk how to utilize the metric

shut valve Jan 15, 2021, 3:34 PM

#

Wouldn't it just be the one with a low distance? You can probably set some threshold and filter with that

fading sail Jan 15, 2021, 4:35 PM

#

so i have uploaded a data file using pandas. I want to extract a specific column but the column starts with a number. How do i extract this column?
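
A column whose name starts with a digit cannot be reached with attribute access (`df.2020_sales` is a syntax error), but bracket access with the name as a string works. The column names below are invented.

```python
import pandas as pd

df = pd.DataFrame({"2020_sales": [10, 20], "region": ["north", "south"]})

sales = df["2020_sales"]   # bracket access works for any column name
print(sales)
```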

stray ivy Jan 15, 2021, 4:51 PM

#

shut valve Wouldn’t it just be the one with a low distance you can probably set some thresh...

define "the one with a low distance". it's a matrix with a similarity metric for each char between the 2 words

pure pond Jan 15, 2021, 5:01 PM

#

Anyone know much about pyroot?

shut valve Jan 15, 2021, 5:01 PM

#

Well I'm picturing it as each col being a word and each row holding the distance to every word, so the diagonal is zero because the distance from a word to itself is zero. Then you could find all occurrences where the distance is 1 and look up the corresponding row and col to get your two words with a distance of 1
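
For the per-character matrix built for a single pair of words, the Levenshtein distance is the bottom-right cell. The sketch below uses the standard dynamic-programming construction and turns that distance into a 0-to-1 similarity by dividing by the longer word's length; that normalisation is one common choice, not the only one.

```python
def levenshtein_matrix(a, b):
    """Build the edit-distance matrix for strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete i characters
    for j in range(cols):
        d[0][j] = j                      # insert j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d


def similarity(a, b):
    """1.0 for identical strings, 0.0 when every character must change."""
    if not a and not b:
        return 1.0
    distance = levenshtein_matrix(a, b)[-1][-1]
    return 1 - distance / max(len(a), len(b))


print(similarity("kitten", "sitting"))
```

A threshold on this similarity (or on the raw distance) then decides which pairs count as "similar".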

lapis sequoia Jan 15, 2021, 5:20 PM

#

I'm trying to find a Python implementation of Matlab's ode23tb solver. It is an implementation of the TR-BDF2 algorithm. SciPy doesn't offer a solver for this algorithm. Are there any other Python packages that have this type of solver? https://www.mathworks.com/help/matlab/ref/ode23tb.html

Solve stiff differential equations — trapezoidal rule + backward di...

This MATLAB function, where tspan = [t0 tf], integrates the system of differential equations y'=f(t,y) from t0 to tf with initial conditions y0.

deft harbor Jan 15, 2021, 5:21 PM

#

https://www.schneier.com/blog/archives/2021/01/extracting-personal-information-from-large-language-models-like-gpt-2.html

Schneier on Security

Bruce Schneier

Extracting Personal Information from Large Language Models Like GPT-2

#

So, I guess people are reverse engineering the training data. Something to keep in mind if you are training on private data and releasing the model to the public.

main quest Jan 15, 2021, 5:24 PM

#

i have this groupby which gets every row by date:

```py
by_day = df.groupby(['Date'])
```

how would i go about filling missing days? i'm not really sure how resample works correctly, if i try:

```py
by_day.resample('D')
```

it errors with:

```
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
```

digital timber Jan 15, 2021, 5:34 PM

#

I know this is last minute, but I have a presentation in an hour and I think my binary search tree is not being created correctly. Could someone help me take a look? Thanks in advance

#

Im getting the error

```
AttributeError: 'NoneType' object has no attribute 'balanceFactor'
```
in my code

https://paste.pythondiscord.com/pefakivode.rb
this is my code, it is supposed to be using the binary search tree to search for the title but I think the construction of my Binary search tree is messed up

main quest Jan 15, 2021, 6:05 PM

#

i need to fill the missing date holes but i have no idea how to calculate them without using heavy iterations and checks

📎 unknown.png

#

and this is a result of a groupby('Date')

silver venture Jan 15, 2021, 9:26 PM

#

main quest i need to fill the missing date holes but i have no idea how to calculate them w...

Check for date format, it should automatically format correctly

main quest Jan 15, 2021, 9:28 PM

#

My dataset does not contain all days

#

Are you suggesting me to use pd.to_datetime?
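One way this could work, sketched on made-up data: convert the `Date` column with `pd.to_datetime`, count rows per day, and let `asfreq` insert the days that are missing. The frame below and the choice to fill gaps with 0 are assumptions; the error quoted earlier suggests `resample` was being applied to something indexed by a `RangeIndex` rather than by dates.

```python
import pandas as pd

# Hypothetical data: there are no rows at all for 2021-01-02.
df = pd.DataFrame(
    {"Date": ["2021-01-01", "2021-01-01", "2021-01-03"], "value": [1, 2, 3]}
)
df["Date"] = pd.to_datetime(df["Date"])

# Grouping by a datetime column yields a result indexed by a DatetimeIndex.
rows_per_day = df.groupby("Date").size()

# asfreq("D") inserts every missing calendar day; fill_value sets its count.
daily = rows_per_day.asfreq("D", fill_value=0)

print(daily)
```

Using `rows_per_day.reindex(pd.date_range(start, end, freq="D"), fill_value=0)` is an equivalent option when the range should extend beyond the first and last observed dates.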

rich slate Jan 15, 2021, 10:43 PM

#

@velvet thorn
The thing i'm trying to figure out is like if I got 2 equations

Would there be a way for it to solve for a variable using the 2 equations

rich slate Jan 15, 2021, 11:18 PM

#

true dat

abstract zealot Jan 15, 2021, 11:54 PM

#

Hi guys random question hope you’re all keeping well - it’s about estimating mean and sd from a normal distribution. Right now I’m using maximum likelihood estimations, but was wondering if there are any methods you guys know of that I could also try and then compare the results
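A few estimators that are commonly compared against the maximum likelihood ones, sketched with NumPy on a made-up sample: the `ddof=1` standard deviation (based on the unbiased variance estimate) and a robust pair built from the median and the median absolute deviation (MAD). The constant 1.4826 is the usual scaling that makes the MAD consistent with the standard deviation under normality.

```python
import numpy as np

sample = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical observations

# Maximum likelihood estimates for a normal distribution.
mean_mle = sample.mean()
sd_mle = sample.std(ddof=0)  # divides by n

# Square root of the unbiased variance estimate (divides by n - 1).
sd_ddof1 = sample.std(ddof=1)

# Robust alternatives that are less sensitive to outliers.
median = np.median(sample)
mad = np.median(np.abs(sample - median))
sd_robust = 1.4826 * mad

print(mean_mle, sd_mle, sd_ddof1, median, sd_robust)
```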

velvet thorn Jan 16, 2021, 12:33 AM

#

@velvet thorn
The thing i'm trying to figure out is like if I got 2 equations

Would there be a way for it to solve for a variable using the 2 equations
@rich slate in the special case of two linear equations in two variables, easily
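For the linear two-equation case mentioned here, a minimal sketch with NumPy. The example system (2x + y = 5 and x - y = 1) is made up, and the coefficients are assumed to have already been extracted from the equation strings.

```python
import numpy as np

# Hypothetical system:
#   2x + y = 5
#    x - y = 1
coefficients = np.array([[2.0, 1.0],
                         [1.0, -1.0]])
constants = np.array([5.0, 1.0])

# Solves coefficients @ [x, y] = constants.
x, y = np.linalg.solve(coefficients, constants)

print(x, y)
```

`np.linalg.solve` raises `LinAlgError` when the two equations are not independent, so that case needs separate handling.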

rich slate Jan 16, 2021, 12:41 AM

#

@velvet thorn could you explain?

velvet thorn Jan 16, 2021, 1:07 AM

#

@velvet thorn could you explain?
@rich slate which part

rich slate Jan 16, 2021, 1:11 AM

#

the way it would get the variables by itself to find out the value @velvet thorn

sharp raft Jan 16, 2021, 1:18 AM

#

hello. I have a project due tonight and i was wondering if someone can help me with it. To better understand it

#

you're not necessarily giving me the answer, i'm just very stuck

velvet thorn Jan 16, 2021, 3:07 AM

#

the way it would get the variables by itself to find out the value @velvet thorn
@rich slate you need to parse the string

#

you can look into regular expressions

#

hello. I have a project due tonight and i was wondering if someone can help me with it. To better understand it
@sharp raft you should just ask

#

and anyone who comes by might be able to help

sharp raft Jan 16, 2021, 4:02 AM

#

thanks for that

twin moth Jan 16, 2021, 2:29 PM

#

Do you guys know if SKLearn can work with tuples?

#

We have a Pandas dataframe which contain a ton of points and we want to let SKLearn analyze it all.
When we try to get SKLearn to use it, it tries to convert the tuples to floats.

We can split each tuple to 2 columns -- X and Y but we want them to be correlated.

lapis sequoia Jan 16, 2021, 3:39 PM

#

is there any particular reason for not using numpy arrays?

errant rivet Jan 16, 2021, 4:05 PM

#

@twin moth What model are you trying to apply using sklearn? There's no reason different features (columns) can't still have some correlation or relationship.

solemn helm Jan 16, 2021, 4:10 PM

#

hey guys!
anyone here use TensorFlow?

#

does anyone have some experience with TFX (TensorFlow Extended) or TensorFlow Serving?

I'm working on project and now I need to train my models in my front-end application and until now I did not found some tutorial that helps me to do that.

PS. My front-end application was developed with ReactJS and i'm looking for the better solution to create the back-end.

errant rivet Jan 16, 2021, 4:14 PM

#

@vocal kettle Generally if performance is a big concern, you're better off using numpy over normal lists. One situation that would fail when using numpy arrays that I can think of is if you wanted to mix datatypes in the same array. Consider this example.

import numpy
np_array = numpy.array(['abc', 123, 0.595])  # every element is coerced to a string
print(np_array[1] + 50)  # raises TypeError: np_array[1] is the string '123', not the int 123

Numpy arrays can only hold one datatype, this is one of the keys to their efficiency. You could also cast to the correct type, but it feels like a poor use of np arrays to me.

main quest Jan 16, 2021, 4:49 PM

#

i'm having a few problems plotting a grouped dataframe in an iteration. that orange line seems to go crazy back and forth

code: https://paste.ee/p/CLWtf

expected result: multiple different colored lines, each representing row count by date

what actually happens:

📎 2VqlV6Jwf1MAAAAAElFTkSuQmCC.png

errant rivet Jan 16, 2021, 4:56 PM

#

@main quest on mobile and not able to view your code easily, but one thing that can cause this is if your values aren't sorted on the x axis.

hallow harness Jan 16, 2021, 4:57 PM

#

Hello i have this example json and code here
{"text_sentiment": "positive", "text_probability": [0.33917574607174916, 0.26495590980799744, 0.3958683441202534]}

input_c = pd.DataFrame(columns=['Comments','Result'])
for i in range(input_df.shape[0]):
    url = 'http://classify/?text='+str(input_df.iloc[i])
    r = requests.get(url)
    result = r.json()["text_sentiment"]
    proba = r.json()["text_probability"]
    input_c = input_c.append({'Comments': input_df.loc[i].to_string(index=False),'Result': result, 'Probability': proba}, ignore_index = True)
st.write(input_c)

#

This is what the result look like

https://i.stack.imgur.com/JbBym.png

#

Is there a way to make it like:
If the value in Result is "positive" then I want the proba to index to 2, and if its "neutral" index to 1, "negative" index to 0
Like this:
https://i.stack.imgur.com/aM8SA.png

errant rivet Jan 16, 2021, 5:01 PM

#

@hallow harness, maybe try something like this

df['New Probability'] = np.where(df['Result'] == 'positive', df['Probability'][2], df['Probability'][1])

hallow harness Jan 16, 2021, 5:02 PM

#

ok lemme try it

errant rivet Jan 16, 2021, 5:03 PM

#

Wait, there is a third option, negative. My bad

main quest Jan 16, 2021, 5:03 PM

#

errant rivet @main quest on mobile and not able to view your code easily, but one ...

i'm pretty sure the dataframe is sorted by date even before trying to do anything else

errant rivet Jan 16, 2021, 5:08 PM

#

@hallow harness Not the most efficient but I got this to work

#

df['New Probability'] = np.where(df['Result'] == 'positive', df['Probability'][2], np.where(df['Result'] == 'neutral', df['Probability'][1], df['Probability'][0]))

#

I assumed that the first index of the probability column is for negative values

hallow harness Jan 16, 2021, 5:09 PM

#

yes

errant rivet Jan 16, 2021, 5:09 PM

#

I also assumed that there are only three possible options for result: positive, negative, neutral

hallow harness Jan 16, 2021, 5:09 PM

#

first index = negative, 2nd = neutral , 3rd = positive

errant rivet Jan 16, 2021, 5:12 PM

#

@main quest Really double check because if it's a line graph, it doesn't make sense that it would jump forward ~6 days, then go backwards 3 days.

hallow harness Jan 16, 2021, 5:14 PM

#

errant rivet I also assumed that there are only three possible options for result: positive, ...

got this error

📎 unknown.png

steady wigeon Jan 16, 2021, 5:14 PM

#

Can tweepy capture tweets that are not truncated?

hallow harness Jan 16, 2021, 5:14 PM

#

tried making it [:,2]

errant rivet Jan 16, 2021, 5:14 PM

#

can you actually post the error so that I can read it Cloz?

hallow harness Jan 16, 2021, 5:14 PM

#

ok ok

#

File "c:\users\jetri\appdata\local\programs\python\python37\lib\site-packages\streamlit\script_runner.py", line 332, in _run_script
    exec(code, module.__dict__)
File "C:\Users\Jetri\Documents\StreamLit\senv\iflects\iflectsstreamlit.py", line 112, in <module>
    input_c['new probability'] = np.where(input_c['Result'] == 'positive', input_c['Probability'][2], np.where(input_c['Result'] == 'neutral', input_c['Probability'][1], input_c['Probability'][0]))
File "c:\users\jetri\appdata\local\programs\python\python37\lib\site-packages\pandas\core\series.py", line 882, in __getitem__
    return self._get_value(key)
File "c:\users\jetri\appdata\local\programs\python\python37\lib\site-packages\pandas\core\series.py", line 991, in _get_value
    loc = self.index.get_loc(label)
File "c:\users\jetri\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\range.py", line 357, in get_loc
    raise KeyError(key) from err

errant rivet Jan 16, 2021, 5:17 PM

#

One quick question... The result column is always the value in the array with the highest Probability? correct?

hallow harness Jan 16, 2021, 5:18 PM

#

yes sir

errant rivet Jan 16, 2021, 5:18 PM

#

Ah, didn't realize it earlier

twin moth Jan 16, 2021, 5:19 PM

#

lapis sequoia is there any particular reason for not using numpy arrays?

Sorry, missed your message, we were instructed to use Pandas dataframes - it's for a college course

twin moth Jan 16, 2021, 5:20 PM

#

errant rivet @twin moth What model are you trying to apply using sklearn? There's...

Sorry, missed your message.
I guess that's right but it sounds more reasonable to just have it all in a single column

brittle cedar Jan 16, 2021, 5:20 PM

#

can i ask my question here? actually i m new so dont know ..where to ask

twin moth Jan 16, 2021, 5:21 PM

#

We're trying to find out whether it's possible to guess someones age and gender from an image

#

We have a dataset of about 24K pics with age, gender and nationality

#

We use OpenCV in order to analyze faces and get 68 points from each

#

Then try to find correlations between details, mostly regarding the spacing between eyes etc.

chilly geyser Jan 16, 2021, 5:24 PM

#

errant rivet One quick question... The result column is always the value in the array with th...

This is the most common way of assigning values. Technically you could re-do a simulation on your probabilities

#

I.e. if you get scores that correspond to 90% 10%, you would roll a uniform random variable and assign one value 90% of the time and the other 10% of the time

twin moth Jan 16, 2021, 5:27 PM

#

Never used it but you could try multithreading

errant rivet Jan 16, 2021, 5:30 PM

#

@hallow harness Sorry for the delay, this should do the trick!

input_c['New Probability'] = input_c['Probability'].apply(max)
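The earlier `np.where` attempt fails because `df['Probability'][2]` looks up the row labelled 2 rather than the third element of each list, which explains the `KeyError` in the traceback. If the probability for a specific label is needed rather than the maximum, a row-wise lookup is one option. This is a sketch on made-up rows; the label order (negative, neutral, positive) is the one stated in the chat.

```python
import pandas as pd

# Order of the classifier's probability list, as described in the conversation.
LABEL_TO_INDEX = {"negative": 0, "neutral": 1, "positive": 2}

# Hypothetical results table with one probability list per row.
results = pd.DataFrame(
    {
        "Result": ["positive", "negative", "neutral"],
        "Probability": [
            [0.34, 0.26, 0.40],
            [0.70, 0.20, 0.10],
            [0.30, 0.45, 0.25],
        ],
    }
)

# For each row, pick the list element that belongs to that row's label.
results["New Probability"] = results.apply(
    lambda row: row["Probability"][LABEL_TO_INDEX[row["Result"]]], axis=1
)

print(results["New Probability"].tolist())
```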

hallow harness Jan 16, 2021, 5:31 PM

#

errant rivet @hallow harness Sorry for the delay, this should do the trick! ```python ...

Thank you! It works ❤️

steady wigeon Jan 16, 2021, 5:31 PM

#

Can tweepy capture only tweets that are not truncated? I was assigned to collect tweets that are not truncated, i.e. "truncated"=false.

i try to use
def on_status(self, status):
    with open('truncFalsetweet.txt','a') as tf:
        if not status.truncated:
            tf.write(status)
            print(status)
        else:
            None
    return True
but it returns error: TypeError: write() argument must be str, not Status

so i change to
def on_data(self, data):
    #ques 3.2: only collect data when truncated=false
    with open('truncFalsetweet.txt','a') as tf:
        if not status.truncated:
            tf.write(status)
            print(status)
        else:
            None
            print(data)
    return True
and my prompt send me this error. AttributeError: 'str' object has no attribute 'truncated'.

this is the example of data that i get if there is no condition.

{"created_at":"Fri Jan 15 03:15:54 +0000 2021",......,"truncated":false,.....}

{"created_at":"Fri Jan 15 03:15:54 +0000 2021",.....,"truncated":true, ......}
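Both error messages point at a type mismatch: the raw data arrives as a JSON string (which has no `.truncated` attribute), while a status object cannot be passed to `write()` directly. One possible approach is to parse the raw string and check the `"truncated"` key before writing. The sketch below uses only the standard library; the helper names are hypothetical, and how they are wired into the tweepy listener is left as an assumption.

```python
import json


def is_untruncated(raw_tweet):
    """Return True only when the tweet JSON has "truncated": false."""
    tweet = json.loads(raw_tweet)
    return tweet.get("truncated") is False


def save_if_untruncated(raw_tweet, path):
    """Append the raw JSON line to `path` when the tweet is not truncated."""
    if not is_untruncated(raw_tweet):
        return False
    # utf-8 so that emoji and other non-ASCII characters can be written.
    with open(path, "a", encoding="utf-8") as tf:
        tf.write(raw_tweet.rstrip("\n") + "\n")
    return True


print(is_untruncated('{"created_at": "Fri Jan 15 03:15:54 +0000 2021", "truncated": false}'))
print(is_untruncated('{"created_at": "Fri Jan 15 03:15:54 +0000 2021", "truncated": true}'))
```

Inside an `on_data(self, data)` callback this would be called as `save_if_untruncated(data, 'truncFalsetweet.txt')`.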

errant rivet Jan 16, 2021, 5:32 PM

#

@twin moth I don't understand why it's more reasonable to have it in one column. What are the two values in the tuple representing? The x, y position of the pixel?

twin moth Jan 16, 2021, 5:33 PM

#

errant rivet @twin moth I don't understand why it's more reasonable to have it in...

Exactly

hallow harness Jan 16, 2021, 5:34 PM

#

@errant rivet one more thing... is it possible to do math using the json values?
i tried doing

 proba_test = proba*100

but it only request proba 100 more times

twin moth Jan 16, 2021, 5:35 PM

#

P.S. the dataframe has 68 columns - all are tuples.
If we decide to convert the tuples to xs and ys we'd have 136 columns

#

And we'd need to somehow specify that each of those are a pair

#

So the model will know how to use them both together

chilly geyser Jan 16, 2021, 5:36 PM

#

Not exactly sure how data is stored in pd/np

#

But would not recommend py tuple

twin moth Jan 16, 2021, 5:36 PM

#

chilly geyser But would not recommend py `tuple`

How come?

chilly geyser Jan 16, 2021, 5:36 PM

#

And yes that means I do think it's better to be 136

twin moth Jan 16, 2021, 5:37 PM

#

chilly geyser And yes that means I do think it's better to be 136

Would we need to instruct SKLearn to view them as pairs though?

chilly geyser Jan 16, 2021, 5:37 PM

#

Because py tuples aren't very common, and for others to interact with your thing (especially when it comes to non-python) it'd be hard

chilly geyser Jan 16, 2021, 5:37 PM

#

twin moth Would we need to instruct SKLearn to view them as pairs though?

For this I'm not too sure

#

You know the np general builtin types right? Like float32 and so on

twin moth Jan 16, 2021, 5:38 PM

#

I guess

chilly geyser Jan 16, 2021, 5:38 PM

#

Those would be 'fast' and also easy to work with for other programs, because the datatype is common

twin moth Jan 16, 2021, 5:38 PM

#

I also write C and other langs so sure

twin moth Jan 16, 2021, 5:38 PM

#

chilly geyser Those would be 'fast' and also easy to work with for other programs, because the...

Makes sense

chilly geyser Jan 16, 2021, 5:38 PM

#

Well if you are doing this level of analysis only in py it doesn't matter I guess

#

But I'm sure there's a way for it to be stored fully as np-nice datatypes

#

and have it talk nicely

errant rivet Jan 16, 2021, 5:39 PM

#

@hallow harness
```python
proba_test = [i*100 for i in proba]
```

twin moth Jan 16, 2021, 5:39 PM

#

While I do understand the mindset we don't really need it here since it's just a project that we won't be integrating to anything

twin moth Jan 16, 2021, 5:39 PM

#

chilly geyser But I'm sure there's a way for it to be stored fully as np-nice datatypes

Do you think that there are downsides to using Pandas dataframes though?

chilly geyser Jan 16, 2021, 5:40 PM

#

That's a question too hard for me to answer haha

errant rivet Jan 16, 2021, 5:40 PM

#

@twin moth I don't think the machine learning algorithm will know/care. It will find its own associations. It won't be upset that you didn't represent pixels in a tuple value. It has no idea these numbers represent pixels in the first place.

chilly geyser Jan 16, 2021, 5:40 PM

#

But I have seen online comments where basically if it's not stored as some dtype numpy recognises, it's not advisable

steady wigeon Jan 16, 2021, 5:40 PM

#

i got this error

#

UnicodeEncodeError: 'charmap' codec can't encode character '\u2728' in position 251: character maps to <undefined>
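This error usually means the file was opened with the platform's default encoding (on Windows often a "charmap" code page) and a character such as '\u2728' has no representation in it. Passing `encoding="utf-8"` to `open` is the usual fix. A small self-contained sketch; the temporary file path is only there so the example can run anywhere.

```python
import os
import tempfile

text = "sparkles \u2728"  # contains the character from the error message

# Hypothetical output location; in the original code this would be the tweet file.
path = os.path.join(tempfile.mkdtemp(), "tweets.txt")

# An explicit encoding avoids depending on the platform default.
with open(path, "a", encoding="utf-8") as tf:
    tf.write(text + "\n")

with open(path, encoding="utf-8") as tf:
    round_trip = tf.read()

print(round_trip == text + "\n")
```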

chilly geyser Jan 16, 2021, 5:40 PM

#

Essentially if it's dtype=object then it's a Py-based object

twin moth Jan 16, 2021, 5:41 PM

#

errant rivet @twin moth I don't think the machine learning algorithm will know/ca...

So basically double up the amount of columns and let it rip?

chilly geyser Jan 16, 2021, 5:41 PM

#

errant rivet @twin moth I don't think the machine learning algorithm will know/ca...

Yeah it's a coding thing more than anything

errant rivet Jan 16, 2021, 5:42 PM

#

It really won't matter, you can name the columns whatever human-readable thing you need to keep them straight. It's still the same amount of data whether the numbers are in tuples or exploded into their own columns
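Splitting the tuple columns into numeric x and y columns could look like the sketch below. The frame and the column names `p1`, `p2` are made up; the real data would have 68 such columns and end up with 136 numeric ones.

```python
import pandas as pd

# Hypothetical frame: two landmark columns, each cell is an (x, y) tuple.
landmarks = pd.DataFrame(
    {
        "p1": [(10, 20), (11, 21)],
        "p2": [(30, 40), (31, 41)],
    }
)

parts = []
for name in landmarks.columns:
    # Turn the column of tuples into a two-column numeric frame.
    xy = pd.DataFrame(
        landmarks[name].tolist(),
        index=landmarks.index,
        columns=[f"{name}_x", f"{name}_y"],
    )
    parts.append(xy)

features = pd.concat(parts, axis=1)

print(features)
```

The resulting frame has plain integer columns, which is the kind of input scikit-learn estimators expect.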

chilly geyser Jan 16, 2021, 5:42 PM

#

errant rivet It really won't matter, you can name the columns whatever human-readable thing y...

Computationally unlikely, because I would assume that a py-base object has more bells and whistles than raw numerics

#

As in you would get the same result mathematically/with software designed to read tuples

errant rivet Jan 16, 2021, 5:43 PM

#

Yeah, so why go through all the additional trouble?

chilly geyser Jan 16, 2021, 5:43 PM

#

But I would think it's faster with double the columns, where columns have fixed datatypes

#

Because fixed datatypes are machine friendly

errant rivet Jan 16, 2021, 5:43 PM

#

Yeah, we're in agreement here 😛

chilly geyser Jan 16, 2021, 5:44 PM

#

Well tbh, pointless with <100k rows

#

More important once you handle >100k rows, where every single bit of speed is like anything from minutes to hours

errant rivet Jan 16, 2021, 5:44 PM

#

There might be some sort of feature engineering you could do to condense the x, y tuples into a single value to use in your model. Not sure personally what it would be

chilly geyser Jan 16, 2021, 5:45 PM

#

errant rivet There might be some sort of feature engineering you could do to condense the x, ...

If you're looking at this it's probably some (optimality-unknown) non-linear function of x,y. I wouldn't recommend it

hallow harness Jan 16, 2021, 5:45 PM

#

errant rivet @hallow harness ```python proba_test = [i*100 for i in proba] ```

Got it! Thank you for helping bro! 😄
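The reason `proba*100` did not scale the numbers is that multiplying a Python list by an integer repeats the list. A short illustration with made-up values:

```python
proba = [0.34, 0.26, 0.40]  # hypothetical probabilities

# List * int repeats the list instead of multiplying its elements.
repeated = proba * 2

# A comprehension multiplies each element individually.
scaled = [p * 100 for p in proba]

print(len(repeated))
print(scaled)
```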

errant rivet Jan 16, 2021, 5:47 PM

#

@hallow harness No worries, best of luck!

chilly geyser Jan 16, 2021, 5:47 PM

#

in general I don't recommend anything that isn't simple unless you are prepared to accept black boxes, like dense neural nets with number of layers >2

errant rivet Jan 16, 2021, 5:48 PM

#

I really don't even understand what he's trying to do

#

68 columns of coordinates?

#

Predicting age?

chilly geyser Jan 16, 2021, 5:48 PM

#

the coordinates are probably images

hushed wasp Jan 16, 2021, 5:48 PM

#

Can anyone help me please, telling me why only the last part of the code raises an error and not the top part?

Thanks 🙂

📎 unknown.png

chilly geyser Jan 16, 2021, 5:49 PM

#

wait actually IDK lool

chilly geyser Jan 16, 2021, 5:50 PM

#

hushed wasp Can anyone help me please, telling me why only the last part of the code raises ...

Did you run anything on 122, 123 that erased Segment?

hushed wasp Jan 16, 2021, 5:50 PM

#

nope just the following...

errant rivet Jan 16, 2021, 5:50 PM

#

Could you just rerun all three in order to make sure?

twin moth Jan 16, 2021, 5:51 PM

#

errant rivet 68 columns of coordinates?

Sorry, guess I wasn't clear.
We have a dataset of 24K images faces.
Each of those vary in age, gender, and nationality

#

We have all of those details attached to each image so we can train the algorithm using it

chilly geyser Jan 16, 2021, 5:51 PM

#

Why would datatypes be paired though

#

if you have 68 specific special pixels in 2D space then it'd make sense

twin moth Jan 16, 2021, 5:52 PM

#

Then we go over all of the images with OpenCV, trying to fetch all the face structure and insert all of the details we found (68 points) in to a Pandas dataframe

chilly geyser Jan 16, 2021, 5:52 PM

#

But I wouldn't know how storing just 68 specific special pixels could tell you anything

errant rivet Jan 16, 2021, 5:52 PM

#

Yeah but then the values he'll be representing won't actually be the pixels, it will be whatever is stored at that pixel e.g. color value?

#

So just the column names would be tuples

twin moth Jan 16, 2021, 5:53 PM

#

It might be able to tell us if for example there's a correlation between the age and the distance between the eyes

#

Or maybe the size of the eyes

errant rivet Jan 16, 2021, 5:53 PM

#

Ah, so you want a distance metric between pixels?

twin moth Jan 16, 2021, 5:53 PM

#

errant rivet Ah, so you want a distance metric between pixels?

Basically

errant rivet Jan 16, 2021, 5:53 PM

#

But that's always the same... hmm

twin moth Jan 16, 2021, 5:53 PM

#

That's why we save the coordinates

hushed wasp Jan 16, 2021, 5:54 PM

#

Ok so now, even the first part doesn"t work 😦

chilly geyser Jan 16, 2021, 5:54 PM

#

hushed wasp Ok so now, even the first part doesn"t work 😦

You likely redefined df_rm_segmentation

errant rivet Jan 16, 2021, 5:54 PM

#

@hushed wasp Just try to restart your notebook and run all

#

If there's an error earlier, you can catch where it breaks down 🙂

hushed wasp Jan 16, 2021, 5:55 PM

#

errant rivet @hushed wasp Just try to restart your notebook and run all

just rerun all the cells above and error raises at the top of my code i showed you now

chilly geyser Jan 16, 2021, 5:55 PM

#

twin moth It might be able to tell us if for example there's a correlation between the age...

Ah so the coordinates are coordinates of eye features?

twin moth Jan 16, 2021, 5:56 PM

#

chilly geyser Ah so the coordinates are coordinates of eye features?

Sec, I'll send you the placements of the points

chilly geyser Jan 16, 2021, 5:56 PM

#

TBH it seems odd you would even have this kind of data

twin moth Jan 16, 2021, 5:56 PM

#

📎 iu.png

hushed wasp Jan 16, 2021, 5:56 PM

#

📎 unknown.png

chilly geyser Jan 16, 2021, 5:56 PM

#

I thought image data would be rawer....like jpg-raw

twin moth Jan 16, 2021, 5:56 PM

#

chilly geyser TBH it seems odd you would even have this kind of data

wdym?

errant rivet Jan 16, 2021, 5:57 PM

#

Ahhhh, wow... I* thought you were working with raw images just labeled by age, nationality, etc.

chilly geyser Jan 16, 2021, 5:57 PM

#

twin moth

This is 1 data point yes?

#

Not fixed across all faces?

#

This means you had people label 68 points for 24K images, that's a lot of data, that's amazing

#

Either that or a prior algorithm found those ridges of faces

#

That's amazing * amazing

chilly geyser Jan 16, 2021, 5:58 PM

#

hushed wasp

Earlier what did you do to add Segment as an attribute of the df_rfm_segmentation objects?

twin moth Jan 16, 2021, 5:59 PM

#

chilly geyser This is 1 data point yes?

That's a single face out of 24K, yes

hushed wasp Jan 16, 2021, 5:59 PM

#

chilly geyser Earlier what did you do to add Segment as an attribute of the `df_rfm_segmentati...

nothing, I am trying something i found on the web but you are making me realize it should not work without segment being defined before

errant rivet Jan 16, 2021, 5:59 PM

#

is #22 always the right-most edge of the left eyebrow?

twin moth Jan 16, 2021, 6:00 PM

#

Yup

hushed wasp Jan 16, 2021, 6:00 PM

#

chilly geyser Earlier what did you do to add Segment as an attribute of the `df_rfm_segmentati...

Sorry I am not very good a at coding and not good in explaining as well I guess

chilly geyser Jan 16, 2021, 6:00 PM

#

hushed wasp nothing, I am trying something i found on the web but you are making me realize ...

Well at least now you know the issue

chilly geyser Jan 16, 2021, 6:00 PM

#

hushed wasp Sorry I am not very good a at coding and not good in explaining as well I guess

Everyone is always learning

chilly geyser Jan 16, 2021, 6:01 PM

#

twin moth Yup

How do you handle image scales though... like big face, small face, etc

#

if the dataset is all same-scale a la same-distance-from-camera I wonder if that's a hard engineering constraint that would prevent usefulness in reality

errant rivet Jan 16, 2021, 6:03 PM

#

@twin moth Maybe you could add a new feature that measures the distance between all points, such as 22 and 23 (eyebrows), then perform some sort of dimensionality reduction to discover which features account for the highest variance
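A rough sketch of such a distance feature with NumPy, including one possible answer to the scale question raised above: dividing by a reference distance on the same face. The array layout `(n_faces, 68, 2)`, the random data, and the choice of landmarks 37 and 46 (1-based, assumed to be the outer eye corners in the common 68-point layout) as the reference are all assumptions.

```python
import numpy as np

# Hypothetical landmarks: face 1 is face 0 photographed at twice the scale.
rng = np.random.default_rng(0)
base = rng.uniform(0, 100, size=(68, 2))
faces = np.stack([base, 2 * base])  # shape (n_faces, 68, 2)


def point_distance(faces, i, j):
    """Euclidean distance between landmarks i and j (0-based) for every face."""
    return np.linalg.norm(faces[:, i, :] - faces[:, j, :], axis=1)


# Landmarks 22 and 23 (1-based, as in the chat) are indices 21 and 22.
brow_gap = point_distance(faces, 21, 22)

# Assumed reference distance used to remove the effect of image scale.
face_scale = point_distance(faces, 36, 45)
brow_gap_normalized = brow_gap / face_scale

print(brow_gap, brow_gap_normalized)
```

The normalised values can then be added as extra feature columns before any dimensionality reduction.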

twin moth Jan 16, 2021, 6:03 PM

#

chilly geyser How do you handle image scales though... like big face, small face, etc

Lol, just sent a message about it to my friend who's working on it

#

I guess you could scale them

chilly geyser Jan 16, 2021, 6:03 PM

#

Sounds interesting at least

lapis sequoia Jan 16, 2021, 6:04 PM

#

Is there an R equivalent discord?

twin moth Jan 16, 2021, 6:04 PM

#

lapis sequoia Is there an R equivalent discord?

I'd check if they have a subreddit, and if so check the sidebar

#

Or past messages

chilly geyser Jan 16, 2021, 6:04 PM

#

lapis sequoia Is there an R equivalent discord?

There is a discord for R lang, but it is a lot smaller than this Py discord

lapis sequoia Jan 16, 2021, 6:05 PM

#

I'll have a quick look lads, cheers

twin moth Jan 16, 2021, 6:05 PM

#

chilly geyser There is a discord for R lang, but it is a lot smaller than this Py discord

Are you really surprised? 😛

chilly geyser Jan 16, 2021, 6:05 PM

#

Nope, Py is really popular

lapis sequoia Jan 16, 2021, 6:06 PM

#

I'm baffled, more love for R

chilly geyser Jan 16, 2021, 6:06 PM

#

I don't see how status is defined

twin moth Jan 16, 2021, 6:06 PM

#

lapis sequoia I'm baffled, more love for R

More hate for Python*

#

😆

hushed wasp Jan 16, 2021, 6:06 PM

#

chilly geyser Well at least now you know the issue

I have an other error now...

Thanks for the help I gonna try to figure out why it worked and why it doesn't now..

errant rivet Jan 16, 2021, 6:06 PM

#

Forget them both, I'm going to Julia

#

😛

lapis sequoia Jan 16, 2021, 6:06 PM

#

Do we have any Bioinformatians here

#

I can't spell lol

chilly geyser Jan 16, 2021, 6:07 PM

#

IMO Julia is a lot less friendly to new coders

lapis sequoia Jan 16, 2021, 6:07 PM

#

Bioinformaticians

chilly geyser Jan 16, 2021, 6:07 PM

#

lapis sequoia Do we have any Bioinformatians here

Not sure, when I pop in this discord, there's varying levels of mathematical maturity

#

I think big names don't lurk here though, as usual

#

Discord isn't a 'big name' thing haha

mint palm Jan 16, 2021, 6:10 PM

#

📎 unknown.png

#

what does this sklearn do

chilly geyser Jan 16, 2021, 6:11 PM

#

It tells you

#

it trains the logreg classifier

mint palm Jan 16, 2021, 6:12 PM

#

i am new in the course of deep learning and last week i wrote about 150 lines to implement logistic regression

chilly geyser Jan 16, 2021, 6:12 PM

#

It fits a logreg classifier onto the dataset

mint palm Jan 16, 2021, 6:12 PM

#

how is it doing all that in one line lol

chilly geyser Jan 16, 2021, 6:12 PM

#

Because someone else wrote it

#

To do so

#

Essentially it solves the logreg optimization problem

#

Find some coefficients beta to minimize loss on some P(Y=1) and P(Y=0) as a function of exp(X)

mint palm Jan 16, 2021, 6:13 PM

#

so i am right thinking that "there is actually this function to do logistic regression like problem in one step like this"

chilly geyser Jan 16, 2021, 6:13 PM

#

Yes

#

You might want to ask

#

So what is the point of your 150 lines of code

mint palm Jan 16, 2021, 6:13 PM

#

wow great function right here then

chilly geyser Jan 16, 2021, 6:13 PM

#

Well the thing is someone else wrote it for you, but can you verify it does what it claims to do?

mint palm Jan 16, 2021, 6:13 PM

#

concept building i guess

steady wigeon Jan 16, 2021, 6:14 PM

#

chilly geyser I don't see how `status` is defined

when there is no if- condition, i used this and it successful retrieved
def on_data(self, data):
    with open('data/tweet.txt','a') as tf:
        tf.write(data)
        print(data)
    return True

so for if the tweet is not truncated, i modify them into this
def on_data(self, data):
    #ques 3.2: only collect data when truncated=false
    with open('truncFalsetweet.txt','a') as tf:
        if not data.truncated:
            tf.write(data)
            print(data)
        else:
            None
            print(data)
    return True

chilly geyser Jan 16, 2021, 6:14 PM

#

Yup essentially

#

I don't see how data is an object with a truncated attribute but ok.....

steady wigeon Jan 16, 2021, 6:16 PM

#

chilly geyser I don't see how `data` is an object with a `truncated` attribute but ok.....

i did some research before, if i want to use entities in the tweet, must use def on_status. so i come out with
def on_status(self, status):
    with open('truncFalsetweet.txt','a') as tf:
        if not status.truncated:
            tf.write(status)
            print(status)
        else:
            None
    return True

but still get error

mint palm Jan 16, 2021, 6:17 PM

#

i also wanna ask that like in logistic regression we got optimised variable at last(for prediction), would i be able to see those using this function as well?

chilly geyser Jan 16, 2021, 6:22 PM

#

mint palm i also wanna ask that like in logistic regression we got optimised variable at l...

You mean beta? Yes I'm pretty sure the coefficients can be retrieved

errant rivet Jan 16, 2021, 6:27 PM

#

In sklearn, you can get them for each feature via LogisticRegression().fit(X, y).coef_

mint palm Jan 16, 2021, 6:30 PM

#

chilly geyser You mean beta? Yes I'm pretty sure the coefficients can be retrieved

in my course the goal of logreg(as told by instructor) is to compute a function Y = WX + B so that Y is the chance that X is 0 or 1..................so i wanna retrieve W and B

chilly geyser Jan 16, 2021, 6:30 PM

#

W is the coeffs

#

and B is an intercept

#

Could be part of the coef object as well

errant rivet Jan 16, 2021, 6:31 PM

#

yeah and great enough, sklearn's LogisticRegression class has a .intercept_ variable too
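A small sketch of reading W and B back from a fitted scikit-learn model, on made-up one-feature data. It also checks that the predicted probability is the sigmoid of WX + B rather than WX + B itself, which is the usual formulation of logistic regression.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, class 1 for the larger values.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

W = clf.coef_       # weights, shape (1, n_features)
B = clf.intercept_  # bias, shape (1,)

# Probability of class 1 computed by hand: sigmoid(X @ W.T + B).
manual = 1 / (1 + np.exp(-(X @ W.T + B)))

print(W, B)
print(np.allclose(manual.ravel(), clf.predict_proba(X)[:, 1]))
```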

chilly geyser Jan 16, 2021, 6:31 PM

#

oo ok

mint palm Jan 16, 2021, 6:33 PM

#

w is weight and b is bias.............both are parameter instructor said

errant rivet Jan 16, 2021, 6:35 PM

#

Other names for a coefficient and intercept

mint palm Jan 16, 2021, 6:36 PM

#

yeah your nomenclature(😅 ) seems much more familiar to me

#

frm maths pt of view

weak sentinel Jan 16, 2021, 6:58 PM

#

does anyone remember the name of that app that allows you to track the progress of your ML training remotely

woeful hamlet Jan 16, 2021, 8:01 PM

#

Hi guys. I am using colab to train a model, but my image set exceeds its RAM limit. So do u know any way i can use half of the images on a first run, and after that train on the remaining half?

lapis sequoia Jan 16, 2021, 8:02 PM

#

Hey guys just a question about data science
Does anyone know a good framework for building a site that displays live plots from matplotlib or other GUIs and libraries?
Because I've seen streamlit and it seems to be pretty nice, but I don't know if there are other effective ones; django seems to be too much work rn
Just any recommendations would be nice thanks

hushed wasp Jan 16, 2021, 8:13 PM

#

Can someone help me to be able to execute this part of code please?

📎 unknown.png

thin remnant Jan 16, 2021, 9:18 PM

#

📎 unknown.png

#

Made a multi variable regression model which predicts sale prices

#

the Error functions look redonculous

#

anyone an idea what is going wrong / what im not understanding about these numbers xd

woeful hamlet Jan 17, 2021, 12:43 AM

#

Hi guys. I am using colab to train a model, but my image set exceeds the RAM limit from it. So do u know any way i can use half of the images on a first run, and after that train on the remaining half?

serene scaffold Jan 17, 2021, 4:00 AM

#

woeful hamlet Hi guys. I am using colab to train a model, but my image set exceeds the RAM lim...

what kind of model?

fleet heath Jan 17, 2021, 7:26 AM

#

thin remnant

Since eval is a built-in function in python, try naming your variable something else, so you don't shadow it and cause confusion

lapis sequoia Jan 17, 2021, 7:48 AM

#

My code is so inefficient...

#

:/

lapis sequoia Jan 17, 2021, 9:29 AM

#

Does anybody know why it takes forever to execute?

#

📎 unknown.png

#

it's stuck on executing...

#

without showing any error

#

nvm I got that

main quest Jan 17, 2021, 9:53 AM

#

https://ghostbin.co/paste/yj6q7

my code seems to hang while trying to calculate min and max. the intended result is to create a range of dates to use as xticks but with multiple plots in a loop that do not have necessarily the same size. how could i fix this, or is there a better approach to my desired effect?

lapis sequoia Jan 17, 2021, 9:55 AM

#

Hey I got a csv file which I'm using to plot some data with matplotlib (I'm quite new to it.)

Here is my code :

import matplotlib.pyplot as plt
import csv
import datetime


usercount = []
time = []

with open('stats.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    
    for row in csv_reader:
        usercount.append(int(row[1]))
        time.append(datetime.datetime.utcfromtimestamp(int(row[0])))
    
    print("Finished loading csv file.")

plt.xlabel('time')
plt.ylabel('users')
plt.title('mmobot users over time')
plt.plot(time, usercount)
plt.show()

But the time displays badly, how can I rotate it on the x axis ?

#

📎 unknown.png

#

as you can see we can't see the dates properly and I would like to rotate them vertically or at a nice angle, but I can't figure out how to do it

main quest Jan 17, 2021, 10:00 AM

#

lapis sequoia as you can see we can't see the dates properly and I would like to rotate them v...

you can rotate them with xticks

plt.xticks(rotation=angle)

angle is in degrees

lapis sequoia Jan 17, 2021, 10:01 AM

#

Thanks ! It works fine but I can't see the dates either this way :

📎 unknown.png

#

Is there a way I could expand the size of the window by code ?

main quest Jan 17, 2021, 10:02 AM

#

what are you using to make the window?

lapis sequoia Jan 17, 2021, 10:02 AM

#

plt.show()

#

But I will use plt.savefig() later on

main quest Jan 17, 2021, 10:04 AM

#

does this help?
https://stackoverflow.com/questions/28575192/how-do-i-set-the-matplotlib-window-size-for-the-macosx-backend?rq=1

Stack Overflow

How do I set the matplotlib window size for the MacOSX backend?

I have a python plotting function that creates a grid of matplotlib subplots and I want the size of the window the plots are drawn in to depend on the size of the subplot grid. For example, if the

lapis sequoia Jan 17, 2021, 10:04 AM

#

Oh yeah thanks a lot !
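Putting the two answers together, a minimal sketch that rotates the date labels and sets the figure size in code. The sample data is made up to stand in for stats.csv, and the `make_plot` helper name is hypothetical.

```python
import datetime
import matplotlib.pyplot as plt


def make_plot(times, counts):
    # figsize is in inches (width, height); it also determines the savefig size.
    fig, ax = plt.subplots(figsize=(12, 5))
    ax.plot(times, counts)
    ax.set_xlabel("time")
    ax.set_ylabel("users")
    ax.set_title("mmobot users over time")
    # Rotate the date labels and right-align them so they do not overlap.
    fig.autofmt_xdate(rotation=45, ha="right")
    fig.tight_layout()
    return fig


# Made-up sample data standing in for stats.csv.
start = datetime.datetime(2021, 1, 1)
times = [start + datetime.timedelta(hours=6 * i) for i in range(40)]
counts = [100 + 3 * i for i in range(40)]

fig = make_plot(times, counts)
# fig.savefig("users.png", dpi=150)  # or plt.show()
```

Because the size is set on the figure itself, it applies to both plt.show() and plt.savefig().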

#

I could have found it by myself I guess, sorry for wasting your time

main quest Jan 17, 2021, 10:05 AM

#

np

frank acorn Jan 17, 2021, 10:29 AM

#

How should I approach this problem statement:

#

We have a set of bedroom images with a standard bed and two pillows.
Input: Bedsheet cloth patterns.

The goal

To overlap the bedsheet pattern on the entire standard bed image. Bedsheet should be shown
as neatly wrapped up with the bed with the corners properly tucked in.
To overlap the bedsheet pattern on the two pillows placed on the bed.
Sample images are shown on the next page. Training data can be downloaded by scraping
images from the following URL - https://www.myntra.com/bedsheets

Myntra

Bed Sheets - Buy Single, Double Bed Sheets Online in India | Myntra

Buy bed sheets online at Myntra. Choose from the vast collection of linen, cotton bed sheets in India at best rates. ✯ Free Shipping ✯ COD ✯ 30 Day Returns

quiet patio Jan 17, 2021, 11:05 AM

#

Hey everyone, I have a matrix of distances and I want to generate a list of coordinates (x, y). I tried Mij = (D[1][j]**2 + D[i][1]**2 − D[i][j]**2)/2.

#

and I want to know S and U with M = U S U**T

#

and X = U * sqrt(S)
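The recipe described here is classical multidimensional scaling. A minimal NumPy sketch under that assumption follows; the function name and the test points are made up, and Python indexes from 0 where the formula above uses 1. The recovered coordinates are only defined up to rotation and reflection.

```python
import numpy as np


def coords_from_distances(D, dim=2):
    """Recover coordinates (up to rotation/reflection) from a distance matrix."""
    D2 = np.asarray(D, dtype=float) ** 2
    # Gram matrix with point 0 as the origin:
    # M_ij = (D_0j^2 + D_i0^2 - D_ij^2) / 2
    M = (D2[0, :][None, :] + D2[:, 0][:, None] - D2) / 2.0
    # M is symmetric, so eigh applies; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:dim]
    S = np.clip(eigvals[top], 0.0, None)  # guard against tiny negative values
    return eigvecs[:, top] * np.sqrt(S)   # X = U * sqrt(S)


# Made-up check: distances computed from known 2D points.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

X = coords_from_distances(D)
D_rebuilt = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(np.allclose(D, D_rebuilt))
```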

main quest Jan 17, 2021, 11:14 AM

#

how would i make it plot dots to 0 in dates where there is no data? this is coming from a pandas dataframe, unsure of what to google to get an answer

📎 wHxZU7rH16irQAAAABJRU5ErkJggg.png
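One way this could be done, assuming the data is a Series (or DataFrame) indexed by date: reindex onto the full date range and fill the gaps with 0. The sample values are made up.

```python
import pandas as pd

# Made-up counts with gaps: nothing recorded on Jan 2 and Jan 4.
counts = pd.Series(
    [5, 3, 7],
    index=pd.to_datetime(["2021-01-01", "2021-01-03", "2021-01-05"]),
)

# Build the full daily range and fill the missing dates with 0.
full_range = pd.date_range(counts.index.min(), counts.index.max(), freq="D")
filled = counts.reindex(full_range, fill_value=0)
print(filled)
# filled.plot(marker="o") would then draw a dot at 0 for the empty days.
```

Useful search terms are "pandas reindex date_range" and "pandas asfreq".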

subtle tundra Jan 17, 2021, 12:56 PM

#

https://www.analyticsvidhya.com/blog/2021/01/building-a-cnn-model-with-95-accuracy/

Analytics Vidhya

aromaljosebaby

Introduction to CNN | Build a CNN model with 95% Accuracy

In this article, we will build a CNN (Convolutional Neural Networks) model and aim to achieve 95% accuracy in Python.

twin moth Jan 17, 2021, 1:45 PM

#

Heya guys

#

I noticed that when I append a lot of (~55M) dictionaries into a Pandas DF, each and every append gets slower

#

Any idea how to optimize it?

#

I can try to create 240~ DFs and append each of them to the main DF if it helps

hollow void Jan 17, 2021, 1:49 PM

#

velvet thorn IMO there isn't enough of a difference in that case to matter

Bit of a late followup, but are there any major differences in ease of setup if one is using AMD GPU and ROCm?

#

(Tensorflow vs PyTorch debate)

twin moth Jan 17, 2021, 1:50 PM

#

It started by taking about ~0.0009489059448242188 seconds for each append.
After about 1.5M appends, each append now takes ~0.0035037994384765625 seconds

hard hound Jan 17, 2021, 2:24 PM

#

Hey does anyone know any site like kaggle?

mortal trout Jan 17, 2021, 3:08 PM

#

can someone tell me if the model is overfitting or not? loss: 0.0151 - accuracy: 0.9925 - val_loss: 0.1158 - val_accuracy: 0.9923

lapis sequoia Jan 17, 2021, 4:20 PM

#

Which channel can I use to ask a non-python question?

molten hamlet Jan 17, 2021, 4:41 PM

#

📎 unknown.png

#

can I simply find boxes with equal numbers? 🤔

#

I tried scipy.signal.correlate but...

odd lion Jan 17, 2021, 5:18 PM

#

mortal trout can someone tell me if the model is overfitting or not can someone tell if its o...

What are the proportions of your classes?

steady wigeon Jan 17, 2021, 5:21 PM

#

hello, I am trying to save tweet text, collected using tweepy, from a .txt file into a csv file. Before this, I only took the string from “text” using this code, and it retrieved it successfully.

for i in range(len(tweets_data)):
    tweet_text = tweets_data[i]['text']
    idstr = tweets_data[i]['id_str']
    idarr.append(idstr)
    tweetarr.append(tweet_text)

But when I started preprocessing for sentiment analysis, I realized that some of the “text” values I took are truncated. The full text of a truncated tweet is at {......,"extended_tweet":{"full_text":"...."}..}, so I came up with this code: if there is an extended_tweet, tweet_text takes the string from full_text, else tweet_text takes the string from text.

for i in range(len(tweets_data)):
    # example of data:
    # https://gist.github.com/igorbrigadir/614625e27fe400f86fdf29bdd0c1857f
    if 'extended_tweet' in tweets_data[i]:
        tweet_text = tweets_data[i]['extended_tweet']['full_text']
    else:
        tweet_text = tweets_data[i]['text']

    idstr = tweets_data[i]['id_str']
    idarr.append(idstr)
    tweetarr.append(tweet_text)

but tweet_text still takes the string from “text” even though there is an "extended_tweet".

proven plinth Jan 17, 2021, 5:23 PM

#

Does anyone here do bioinformatics? And if so, do you have any resources apart from rosalind.info

lapis sequoia Jan 17, 2021, 6:39 PM

#

write a test @steady wigeon

#

and you dont have to iterate over indices in python

#

and is it really ideal to save id's and texts in different list?
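A sketch of those three suggestions applied to the tweet code above. One possible reason the check never fires (an assumption, not confirmed in this thread) is that for retweets the extended_tweet object sits inside retweeted_status rather than at the top level. The `get_full_text` helper and the sample dicts are hypothetical.

```python
def get_full_text(tweet):
    """Return the untruncated text of a streaming-API tweet dict.

    Assumption: for retweets the full text sits under
    tweet["retweeted_status"]["extended_tweet"], not at the top level.
    """
    # Look inside the original tweet first when this is a retweet.
    source = tweet.get("retweeted_status", tweet)
    if "extended_tweet" in source:
        return source["extended_tweet"]["full_text"]
    return source.get("full_text", source["text"])


# Made-up sample data in the shape described above.
tweets_data = [
    {"id_str": "1", "text": "short tweet"},
    {"id_str": "2", "text": "long tw…",
     "extended_tweet": {"full_text": "long tweet, in full"}},
    {"id_str": "3", "text": "RT @x: long tw…",
     "retweeted_status": {"text": "long tw…",
                          "extended_tweet": {"full_text": "retweeted, in full"}}},
]

# Iterate over the tweets directly and keep each id next to its text.
rows = [(tweet["id_str"], get_full_text(tweet)) for tweet in tweets_data]
print(rows)
```

Because the helper is a plain function on a dict, it is easy to test with a few hand-written tweets like the ones above. Note that for retweets it returns the original tweet's text, without the "RT @user:" prefix.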

plain parrot Jan 17, 2021, 7:50 PM

#

Hi guys, if anyone has experience with SIR modelling of pandemics, could you please DM me? I just have a couple of simple questions, thank you

woeful hamlet Jan 17, 2021, 10:27 PM

#

serene scaffold what kind of model?

aaaaah does it matter? i mean, it is a convolutional neural network

serene scaffold Jan 17, 2021, 10:27 PM

#

woeful hamlet aaaaah does it matter? i mean, it is a convolutional neural network

It might be that you only have to pass one training instance through the network at a time

woeful hamlet Jan 17, 2021, 10:28 PM

#

wdym?

velvet thorn Jan 17, 2021, 10:36 PM

#

> It started by taking about ~0.0009489059448242188 seconds for each append.
> After about 1.5M appends, each append now takes ~0.0035037994384765625 seconds
@twin moth each append creates a new object. don’t append, concatenate.
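A minimal sketch of that advice. DataFrame.append copied the whole frame on every call (and was removed in pandas 2.0), so the usual pattern is to accumulate cheaply and build or concatenate once. The records below are made up.

```python
import pandas as pd

# Made-up records standing in for the ~55M dictionaries.
records = ({"id": i, "value": i * 0.5} for i in range(1000))

# Option 1: collect plain dicts in a list, then build the DataFrame once.
rows = []
for rec in records:
    rows.append(rec)  # list.append is cheap (amortized O(1))
df = pd.DataFrame(rows)

# Option 2: if the data arrives in chunks, build one DataFrame per chunk
# and concatenate them all in a single call at the end.
chunks = [df.iloc[i:i + 250] for i in range(0, len(df), 250)]
combined = pd.concat(chunks, ignore_index=True)

print(len(df), len(combined))
```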

woeful hamlet Jan 17, 2021, 11:05 PM

#

@serene scaffold

serene scaffold Jan 17, 2021, 11:10 PM

#

@woeful hamlet depending on how the network is designed, it might be that only one training instance has to be in memory at a time. So you could load the training instances into memory in batches. If the network might need to look at multiple instances for one operation, I'd have to know more about why that is.

woeful hamlet Jan 17, 2021, 11:10 PM

#

why? how "why"?

serene scaffold Jan 17, 2021, 11:12 PM

#

@woeful hamlet I'd need to know what kind of neural network you're using and what it's meant to do.

woeful hamlet Jan 17, 2021, 11:12 PM

#

i already told u what kind of nn it is

#

it is xception

#

cnn

serene scaffold Jan 17, 2021, 11:13 PM

#

@woeful hamlet so it's a cnn. And what does it do?

woeful hamlet Jan 17, 2021, 11:13 PM

#

predict classes

serene scaffold Jan 17, 2021, 11:13 PM

#

What classes

woeful hamlet Jan 17, 2021, 11:14 PM

#

why does it matter?

serene scaffold Jan 17, 2021, 11:14 PM

#

Because there might be a reason that it does and I can't rule that out without knowing what it's for.

woeful hamlet Jan 17, 2021, 11:15 PM

#

??? do u know whats my issue?

velvet thorn Jan 17, 2021, 11:15 PM

#

woeful hamlet why does it matter?

...more pertinently, why aren't you willing to divulge that information?

woeful hamlet Jan 17, 2021, 11:15 PM

#

cuz top secret (¿)

#

cuz i dont see the relation between type of classes and RAM usage from colab xd

velvet thorn Jan 17, 2021, 11:16 PM

#

woeful hamlet cuz i dont see the relation between type of classes and RAM usage from colab xd

okay then

#

I mean, if you wanna get help + gatekeep simultaneously...

woeful hamlet Jan 17, 2021, 11:17 PM

#

i mean, u could explain why the classes matter xd

velvet thorn Jan 17, 2021, 11:17 PM

#

ye

#

and you could stop being passive aggressive

#

anyway, the classes in an abstract sense might not matter

#

but, for example, there are networks which train based on some difference metric between input

#

so batch size is, minimally, the number of images being compared

#

in your case, however...

#

...you sound like you're loading all the images eagerly

#

so some sort of lazy loading would be a good start
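A rough sketch of what lazy loading could look like: a generator that only materializes one batch of images at a time. `load_image` is a hypothetical stand-in for real file reading, and the paths and labels are made up.

```python
import numpy as np


def load_image(path):
    # Hypothetical loader: a real one would read and resize the file at `path`.
    return np.zeros((299, 299, 3), dtype=np.float32)


def batches(paths, labels, batch_size):
    """Yield (images, labels) one batch at a time instead of loading everything."""
    for start in range(0, len(paths), batch_size):
        chunk = paths[start:start + batch_size]
        images = np.stack([load_image(p) for p in chunk])
        yield images, np.array(labels[start:start + batch_size])


# Made-up file list: only `batch_size` images are in memory at any moment.
paths = [f"img_{i}.jpg" for i in range(10)]
labels = [i % 2 for i in range(10)]

shapes = [x.shape for x, _ in batches(paths, labels, batch_size=4)]
print(shapes)
```

Each yielded batch could then be fed to one training step (in Keras, for example, `model.train_on_batch(x, y)`), so the full image set never has to sit in a Python list.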

woeful hamlet Jan 17, 2021, 11:18 PM

#

same set as Imagenet from keras.datasets

#

Like, i really dont see how the info u are asking will help. My question was if i could load like half of my data set, train with that, and then load the rest, and train again

velvet thorn Jan 17, 2021, 11:20 PM

#

mortal trout can someone tell me if the model is overfitting or not can someone tell if its o...

show recall/precision/f1 score and confusion matrix

#

but such high accuracy is generally a bit weird
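A small illustration of why those extra metrics matter, using made-up imbalanced labels and a "model" that always predicts the majority class:

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Made-up imbalanced labels: 990 negatives, 10 positives.
y_true = [0] * 990 + [1] * 10
# A useless "model" that always predicts the majority class.
y_pred = [0] * 1000

acc = accuracy_score(y_true, y_pred)   # high despite learning nothing
rec = recall_score(y_true, y_pred)     # share of positives actually found
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted

print(acc, rec)
print(cm)
```

Accuracy alone cannot distinguish this from a good model; recall and the confusion matrix can.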

velvet thorn Jan 17, 2021, 11:21 PM

#

woeful hamlet Like, i really dont see how the info u are asking will help. My question was if ...

yes

velvet thorn Jan 17, 2021, 11:21 PM

#

lapis sequoia Which channel can I use to ask a non-python question?

there are off-topic channels

velvet thorn Jan 17, 2021, 11:21 PM

#

hollow void Bit of a late followup, but are there any major difference in ease of setup if o...

hm. I'm not experienced enough in DL with AMD GPUs to be able to tell

woeful hamlet Jan 17, 2021, 11:22 PM

#

okey. what i am trying is called fine tuning?

velvet thorn Jan 17, 2021, 11:22 PM

#

but I don't think so?

velvet thorn Jan 17, 2021, 11:22 PM

#

woeful hamlet okey. what i am trying is called fine tuning?

no.

#

finetuning has multiple meanings, but the most relevant one, I'd think, is in further training only the upper layers of a pretrained model

serene scaffold Jan 17, 2021, 11:23 PM

#

!code

arctic wedgeBOT Jan 17, 2021, 11:23 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

woeful hamlet Jan 17, 2021, 11:23 PM

#

velvet thorn finetuning has multiple meanings, but the most relevant one, I'd think, is in fu...

ah so like adding 1 or 2 layers to learn from the already trained weights for ur dataset?

velvet thorn Jan 17, 2021, 11:23 PM

#

woeful hamlet ah so like adding 1 ore 2 layers to learn from the already trained weights for u...

no

#

you don't add layers

#

you freeze the lower ones

#

well, you can add layers

#

depends on your problem

woeful hamlet Jan 17, 2021, 11:24 PM

#

the first layers know about edges?

velvet thorn Jan 17, 2021, 11:24 PM

#

minimally you'd probably change the topmost layer if, for example, you were doing classification and wanted to change the number of classes

velvet thorn Jan 17, 2021, 11:24 PM

#

woeful hamlet the first layers _know_ about edges?

loosely speaking, kind of?

woeful hamlet Jan 17, 2021, 11:24 PM

#

ok ok

velvet thorn Jan 17, 2021, 11:24 PM

#

more generally, patterns in the image at a lower level of abstraction
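A minimal Keras sketch of the fine-tuning idea described here: freeze the pretrained base and swap in a new top layer. `NUM_CLASSES` is a hypothetical value, and `weights=None` is used only so the sketch runs without downloading anything.

```python
import tensorflow as tf

NUM_CLASSES = 5  # hypothetical number of classes in the new dataset

# weights=None keeps this sketch offline; in practice weights="imagenet"
# would load the pretrained weights that make freezing worthwhile.
base = tf.keras.applications.Xception(
    include_top=False, weights=None, input_shape=(299, 299, 3), pooling="avg"
)
base.trainable = False  # freeze the lower (feature-extracting) layers

# Put a new topmost layer on the frozen base for NUM_CLASSES classes.
inputs = tf.keras.Input(shape=(299, 299, 3))
features = base(inputs, training=False)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(features)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
print(model.output_shape, len(model.trainable_weights))
```

Only the new Dense layer's kernel and bias are trainable here. Unfreezing some of the upper layers of the base afterwards is a common second step, depending on the problem.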

woeful hamlet Jan 17, 2021, 11:24 PM

#

so u freeze like half and let the other half fit ur dataset?

velvet thorn Jan 17, 2021, 11:25 PM

#

not necessarily half

#

but

#

I don't really see how that's relevant to your problem

#

🥴

woeful hamlet Jan 17, 2021, 11:25 PM

#

just to know terms

velvet thorn Jan 17, 2021, 11:25 PM

#

top secret xd

woeful hamlet Jan 17, 2021, 11:25 PM

#

in case i have to google for them

hushed wasp Jan 17, 2021, 11:25 PM

#

serene scaffold !code

didn't I use the ` correctly?

velvet thorn Jan 17, 2021, 11:25 PM

#

hushed wasp didn't I use the ` correctly?

you were missing syntax highlighting

#

x = 3
print(f'x is {x}')

x = 3
print(f'x is {x}')

#

top is without, bottom is with

serene scaffold Jan 17, 2021, 11:26 PM

#

@velvet thorn they only used one backtick rather than three

velvet thorn Jan 17, 2021, 11:26 PM

#

oh

#

wups mb

hushed wasp Jan 17, 2021, 11:28 PM

#

velvet thorn you were missing syntax highlighting

I don't even find how to highlight... sorry 😦

velvet thorn Jan 17, 2021, 11:30 PM

#

hushed wasp I don't even find how to highlight... sorry 😦

dw about it just use !code

hushed wasp Jan 17, 2021, 11:33 PM

#

velvet thorn dw about it just use !code

ok understood thx

velvet thorn Jan 17, 2021, 11:33 PM

#

hushed wasp ok understood thx

yup that looks good

hushed wasp Jan 17, 2021, 11:39 PM

#

dico = {
    'order_id': 'count',
    'price': 'sum',
    'review_score': 'mean',
    'payment_type': lambda x: pd.Series.mode(x)[0],
    'payment_installments': 'mean',
    'product_category_name_english': lambda x: pd.Series.mode(x)[0],
    'customer_state': lambda x: pd.Series.mode(x)[0],
    'delivery_status': lambda x: pd.Series.mode(x)[0],
    'day_of_week': lambda x: pd.Series.mode(x)[0],
    'period': lambda x: pd.Series.mode(x)[0],
}
customers = df_part_of_day.groupby('customer_unique_id', as_index=False).agg(dico)

Can I make my code shorter by "grouping" my lambda x: pd.Series.mode(x)[0]

velvet thorn Jan 18, 2021, 12:02 AM

#

hm.

#

@hushed wasp how about df_part_of_day[['customer_unique_id', 'payment_type', 'product_category_name_english', 'customer_state', 'delivery_status', 'day_of_week', 'period']].groupby('customer_unique_id').agg(lambda x: x.mode()[0])?

#

then combine it with the rest
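Another option, sketched here as an alternative to splitting the groupby in two: name the mode function once and build the dict entries for the mode columns in a loop. The `most_common` helper, the shortened column list, and the sample rows are made up for illustration.

```python
import pandas as pd


def most_common(series):
    # Series.mode() returns all modes in sorted order; take the first one.
    return series.mode()[0]


mode_cols = ["payment_type", "customer_state"]  # shortened list for the sketch
agg_spec = {"order_id": "count", "price": "sum", "review_score": "mean"}
agg_spec.update({col: most_common for col in mode_cols})

# Made-up rows using a few of the column names from the snippet above.
df = pd.DataFrame({
    "customer_unique_id": ["a", "a", "a", "b"],
    "order_id": [1, 2, 3, 4],
    "price": [10.0, 20.0, 5.0, 7.0],
    "review_score": [5, 4, 3, 2],
    "payment_type": ["card", "card", "cash", "cash"],
    "customer_state": ["SP", "RJ", "SP", "MG"],
})

customers = df.groupby("customer_unique_id", as_index=False).agg(agg_spec)
print(customers)
```

This keeps a single groupby call, so nothing has to be combined afterwards.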

hushed wasp Jan 18, 2021, 12:10 AM

#

ok gonna try thks @velvet thorn

woeful hamlet Jan 18, 2021, 1:28 AM

#

ive read docs but i dont know the difference between image_dataset_from_directory and flow_from_directory

#

is the second one the same as the first one but doing data augmentation as well?

mortal trout Jan 18, 2021, 3:32 AM

#

@velvet thorn it was because the dataset was imbalanced

velvet thorn Jan 18, 2021, 3:38 AM

#

> @velvet thorn it was because the dataset was imbalanced
@mortal trout ye that’s why I asked about other stats

mortal trout Jan 18, 2021, 4:32 AM

#

@velvet thorn will tensorflow work on gif images because they mentioned only a few extensions like jpg,png,bmp etc

lapis sequoia Jan 18, 2021, 5:42 AM

#

guys im getting value error while fitting my randomforestregressor

#

i split it into training and testing data

#

tried both manually and through train_test_split

#

but it shows the error that input variables are not the same

#

how to deal with this?

mellow pumice Jan 18, 2021, 6:16 AM

#

Have you tried reshaping? Make sure the data is of the required shape. If it is, then check the dataset and the way you're assigning it.
Posting the code might help pinpoint what is causing the problem.
@ShadowRanger5#3348
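Since the original code was not posted, this is only a guess at the cause: scikit-learn raises a ValueError about inconsistent numbers of samples when X and y end up with different row counts. A sketch with made-up data that keeps them aligned:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))             # made-up features: 100 rows, 3 columns
y = X[:, 0] * 2.0 + rng.normal(size=100)  # made-up target, one value per row

# Splitting X and y in the same call keeps their rows aligned.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# fit() raises ValueError if these two lengths differ.
print(X_train.shape, y_train.shape)

model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions.shape)
```

Printing the shapes right before fit() is usually the quickest way to spot a mismatch, for example from passing the test labels with the training features.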

mellow pumice Jan 18, 2021, 6:22 AM

#

mortal trout @velvet thorn will tensorflow work on gif images because they mentioned...

Gif are just a collection of images, so decode them after reading and store those shards. If the decoding goes well I can't see why tf will not work on it.
I might be a bit off on the shards part so make sure to check it online.
As for gif being directly used in tf, hmm... , I seriously doubt that it'll work. Encodings are PITA sometimes... 😥

mortal trout Jan 18, 2021, 6:26 AM

#

@mellow pumice thanks for the info

subtle tundra Jan 18, 2021, 6:34 AM

#

https://www.analyticsvidhya.com/blog/2021/01/building-a-cnn-model-with-95-accuracy/

Analytics Vidhya

aromaljosebaby

Introduction to CNN | Build a CNN model with 95% Accuracy

In this article, we will build a CNN (Convolutional Neural Networks) model and aim to achieve 95% accuracy in Python.

lapis sequoia Jan 18, 2021, 6:58 AM

#

i have 2 float columns in pandas holding the same values, but when I subtract one column from the other I get a difference of 1, although both values are the same
how to fix this ?

velvet thorn Jan 18, 2021, 7:56 AM

#

lapis sequoia i have 2 columns with float data in pandas with same value, on doing difference ...

show code

twin moth Jan 18, 2021, 9:36 AM

#

velvet thorn > It started by taking about ~`0.0009489059448242188` seconds for each append. >...

That's exactly what I ended up doing, thanks 🙂

lapis sequoia Jan 18, 2021, 9:54 AM

#

velvet thorn show code

df['diff'] = df['a'].astype(int) - df['b'].astype(int)

fleet heath Jan 18, 2021, 10:36 AM

#

@lapis sequoia can you show some sample values of a and b?

lapis sequoia Jan 18, 2021, 10:48 AM

#

@fleet heath @velvet thorn
a = exchTrade, b =Trade

📎 unknown.png

fleet heath Jan 18, 2021, 10:52 AM

#

lapis sequoia df['diff'] = df['a'].astype(int) - df['b'].astype(int)

And can you show your actual code snippet instead of a and b?

#

Because from this row, it seems like you are taking the difference between the int columns

#

Which is 1

lapis sequoia Jan 18, 2021, 10:56 AM

#

@fleet heath

📎 unknown.png

lapis sequoia Jan 18, 2021, 11:24 AM

#

btw i am using python2 and pandas 0.19

hollow scarab Jan 18, 2021, 11:56 AM

#

is it possible to concat 2 dfs into one, and only select a few columns from each?

#

or would I have to concat and then create a new df with the columns i want?

fleet heath Jan 18, 2021, 12:46 PM

#

lapis sequoia @fleet heath

@hollow scarab see the first line

#

That should give you an idea about how to approach your problem
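The referenced screenshot is not part of the log, so here is a generic sketch (made-up frames and column names) of picking columns from each DataFrame before combining them:

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "extra": [0, 0]})
df2 = pd.DataFrame({"id": [1, 2], "score": [0.5, 0.9], "unused": [1, 1]})

# Select the wanted columns first, then combine side by side (aligned on index).
side_by_side = pd.concat([df1[["id", "name"]], df2[["score"]]], axis=1)

# Or join on a shared key column instead of relying on row order.
merged = df1[["id", "name"]].merge(df2[["id", "score"]], on="id")

print(side_by_side)
print(merged)
```

Selecting before combining avoids building a wide intermediate frame only to drop most of it.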

#

And as far as @lapis sequoia is concerned, i don't see any issue with the code

#

You might wanna check the version specific details for your code

lapis sequoia Jan 18, 2021, 1:03 PM

#

Thanks for checking this out @fleet heath i'll see if it is some bug or version issue

winged jasper Jan 18, 2021, 1:14 PM

#

Hello guys, I have recently started learning RASA, but it's been upgraded to 2.0 and I have found no single course that would cover developing, testing and deploying a chatbot/assistant using that framework. Does anybody have any resources on that? Or should I start with rasa 1.8 and later migrate to 2.2 when I've learned the concepts?

lapis sequoia Jan 18, 2021, 1:43 PM

#

fleet heath And as far as @lapis sequoia is concerned, i don't see any issue with the...

also, that only happens sometimes, with different datasets that have the same column values, not always

fleet heath Jan 18, 2021, 2:11 PM

#

!e

import pandas as pd
df = pd.DataFrame({'a' : [-2100078.0, 2.34], 'b' : [-2100078.0, 2.34]})
df['diff'] = df['a'].astype(int) - df['b'].astype(int)
print(df)

arctic wedgeBOT Jan 18, 2021, 2:11 PM

#

You are not allowed to use that command here. Please use the #bot-commands channel instead.

hollow scarab Jan 18, 2021, 2:12 PM

#

thanks, I will try it with merge @fleet heath

fleet heath Jan 18, 2021, 2:12 PM

#

fleet heath !e ```py import pandas as pd df = pd.DataFrame({'a' : [-2100078.0, 2.34], 'b' : ...

@lapis sequoia what is the output of this?

lapis sequoia Jan 18, 2021, 2:14 PM

#

fleet heath @lapis sequoia what is the output of this?

will try, and let u know👍
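One possible explanation, offered as an assumption since the screenshots are not included in the log: astype(int) truncates toward zero, so two floats that look equal once rounded for display can land on different integers. A sketch with made-up values:

```python
import pandas as pd

# Made-up values: they look equal once rounded for display,
# but the second is slightly below the integer.
df = pd.DataFrame({"a": [2100078.0], "b": [2100077.9999999]})

# astype(int) drops the fractional part: 2100078 - 2100077.
truncated = df["a"].astype(int) - df["b"].astype(int)

# Rounding first removes the off-by-one.
rounded = df["a"].round().astype(int) - df["b"].round().astype(int)

print(truncated.iloc[0], rounded.iloc[0])
```

If this is the cause, it would also explain why the problem shows up only for some rows.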

molten hamlet Jan 18, 2021, 3:07 PM

#

how can I extract numbers from image? 🤔

📎 unknown.png

knotty bay Jan 18, 2021, 3:35 PM

#

Machine Learning

devout zodiac Jan 18, 2021, 3:44 PM

#

Hi, I'm working with a (non-medical) CT dataset and have some fundamental question about it, mostly regarding processing and resolution. If anyone has experience with that, please dm me!

hollow void Jan 18, 2021, 3:55 PM

#

Is deep learning going to be the best approach for licence plate detection, compared to simple image operations, EAST and haar cascades?

woeful hamlet Jan 18, 2021, 4:17 PM

#

I am training a model on colab, but i cant load all my data set at once due to RAM limit. How can i load it on 2 parts to train the model?

hard canopy Jan 18, 2021, 4:31 PM

#

Hi, i am trying to train a NN for the MNIST dataset using pytorch. Basically, I am trying to replicate https://www.tensorflow.org/tutorials/quickstart/beginner with pytorch.
This is what I got: https://gist.github.com/luc-leonard/bf395ed87063941502030ec22e1ead89
It seems to be working, but my output seems weird... with TF I have probabilities between 0 and 1. Here I have negative values, and the result seems to be the 0.0 value in the output.
Did I do something wrong ?

TensorFlow

TensorFlow 2 quickstart for beginners | TensorFlow Core

Gist

test_pytorch.ipynb

GitHub Gist: instantly share code, notes, and snippets.
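A possible explanation, assuming the network in the gist ends in a log-softmax (common in PyTorch MNIST examples; the gist itself is not reproduced here): the outputs are log-probabilities, which are all negative or zero, and the predicted class is the one closest to 0.0. A NumPy sketch with made-up scores:

```python
import numpy as np


def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


logits = np.array([[2.0, -1.0, 9.0, 0.5]])  # made-up raw scores for 4 classes
log_probs = log_softmax(logits)             # all <= 0; the winner is near 0.0
probs = np.exp(log_probs)                   # back to probabilities in [0, 1]

print(log_probs.round(3))
print(probs.round(3), probs.sum())
```

Under that assumption nothing is wrong with the model; applying `torch.exp` to the output (or `torch.softmax` to raw logits) gives values comparable to the TensorFlow tutorial.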

molten hamlet Jan 18, 2021, 5:47 PM

#

knotty bay Machine Learning

😐

hard canopy Jan 18, 2021, 5:58 PM

#

it's not that hard to train a neural network to read numbers

molten hamlet Jan 18, 2021, 6:07 PM

#

I want to train model to play game

#

not to read numbers

#

I just need rewards ;/

molten hamlet Jan 18, 2021, 6:34 PM

#

there should be some simple models for it

abstract zealot Jan 18, 2021, 7:39 PM

#

guys, methods for estimating mean and variance apart from using maximum likelihood?

#

or different methods to calculate mean and variance using max likelihood?

hard canopy Jan 18, 2021, 8:40 PM

#

https://stats.stackexchange.com/questions/3372/is-it-possible-to-accumulate-a-set-of-statistics-that-describes-a-large-number-of/3376#3376 ?

Cross Validated

Is it possible to accumulate a set of statistics that describes a l...

I must clarify immediately that I am a practicing software developer, not a statistician, and that my college stats class was a very long time ago…

That said, I would like to know if there is a me...
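The linked question is about accumulating statistics without keeping all the data. One well-known method for that is Welford's online algorithm, sketched below on made-up numbers; it returns both the maximum-likelihood variance (divide by n) and the unbiased one (divide by n - 1).

```python
def running_mean_variance(values):
    """Welford's online algorithm: one pass over a non-empty iterable."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # second factor uses the updated mean
    variance_mle = m2 / count     # maximum-likelihood (biased) estimate
    variance_unbiased = m2 / (count - 1) if count > 1 else float("nan")
    return mean, variance_mle, variance_unbiased


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample
mean, var_mle, var_unbiased = running_mean_variance(data)
print(mean, var_mle, var_unbiased)
```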

lapis sequoia Jan 18, 2021, 8:55 PM

#

hello, I am new here. does anyone know something about ant colony optimization?

lapis sequoia Jan 18, 2021, 9:10 PM

#

fleet heath @lapis sequoia what is the output of this?

📎 unknown.png

#

Can I loop through rows in a dataframe and split the data into other dataframes based on the value in a specific column?

#

Eg, I have one column of Events, there are 4 types of event and I want to create a new df for each event type

#

Just figured that out actually. I didn't realize you can grab a column and split based on value using df[df['col_name']=='value_i_want']
Cool stuff 😁

abstract zealot Jan 18, 2021, 9:30 PM

#

you can also use .loc i think @lapis sequoia

austere latch Jan 18, 2021, 9:33 PM

#

were you to assign a DataFrame object as a subset using df[df['col_name']=='value_i_want'] would this be a new object or continue referencing the original?

#

dont know if .copy() would be prudent in this case
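A short sketch covering both points, with made-up event data: groupby can build one DataFrame per event type in a single pass, and a boolean-mask selection gives back new data, so changing the subset leaves the original untouched. Calling .copy() makes that explicit.

```python
import pandas as pd

df = pd.DataFrame({
    "event": ["click", "view", "click", "purchase"],  # made-up event types
    "value": [1, 2, 3, 4],
})

# One DataFrame per event type, without writing the filter once per type.
by_event = {name: group.copy() for name, group in df.groupby("event")}

# Boolean-mask selection returns new data; .copy() just makes that explicit
# and avoids SettingWithCopyWarning when the subset is modified later.
clicks = df[df["event"] == "click"].copy()
clicks["value"] = 0

print(sorted(by_event), df["value"].tolist(), clicks["value"].tolist())
```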

arctic wedgeBOT Jan 18, 2021, 9:34 PM

#

Hey @wintry nacelle!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .3gp, .3g2, .avi, .bmp, .gif, .h264, .jpg, .jpeg, .mkv, .mov, .mp4, .mpeg, .mpg, .png, .tiff, .wmv, .psd, .ai, .aep, .xcf, .mp3, .wav, .ogg, .webm, .webp, .flac, .afdesign, .m4a, .csv.

Feel free to ask in #community-meta if you think this is a mistake.

wintry nacelle Jan 18, 2021, 9:34 PM

#

gdmit .ipynb are jupyter notebook files...

#

Anyway

#

I'm trying to learn cGANs and my current implementation is not working. The functions all work without raising errors, including the training loop. However, the generator does not seem to be learning anything. Oddly enough, the loss values do change over time (thanks tf.print), albeit slowly.
I would like to mention that this was hacked together using code from the tensorflow official website and machinelearningmastery. I'm still learning at this stage, so I suppose it's fine.

#

https://paste.pythondiscord.com/sanozoxeqe.properties

#

Also that file is meant to be an .ipynb but because this discord doesn't allow attaching .ipynb files I have to make do

#

Also I'm using tensorflow-gpu 2.4.0. I have put together a VAE and DCGAN before so I know my installation of tensorflow is fine

thick sphinx Jan 19, 2021, 1:39 AM

#

molten hamlet I just need rewards ;/

unless you build the game yourself, extracting the rewards from just reading the screen like this will be a difficult task in itself

dapper hatch Jan 19, 2021, 4:05 AM

#

I'm practicing with Pandas and I need to make a group of 10 cases from a dataset

#

someone who knows Pandas and can help me

gleaming gull Jan 19, 2021, 4:16 AM

#

Do you need a subset of 10 random rows? or rows that fulfill a condition? or manually select 10 rows?

arctic wedgeBOT Jan 19, 2021, 4:26 AM

#

Hey @gleaming gull!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

gleaming gull Jan 19, 2021, 4:27 AM

#

https://paste.pythondiscord.com/cehuvikuva.yaml

#

Hope this helps!
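The paste's contents are not reproduced in this log. As a generic illustration of the three options mentioned above (random rows, rows meeting a condition, hand-picked rows), with made-up data and column names:

```python
import pandas as pd

df = pd.DataFrame({"case_id": range(100), "score": [i % 7 for i in range(100)]})

# 10 random rows (random_state makes the draw repeatable).
random_10 = df.sample(n=10, random_state=0)

# First 10 rows that meet a condition.
first_10_matching = df[df["score"] > 4].head(10)

# 10 hand-picked row positions.
manual_10 = df.iloc[[0, 5, 10, 15, 20, 25, 30, 35, 40, 45]]

print(len(random_10), len(first_10_matching), len(manual_10))
```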

vague vector Jan 19, 2021, 6:11 AM

#

I am from iOS native development background, switching towards Data Science and ML/DL. I've started my Masters and I have to choose 3 optional courses out of 5. The options I have are:
1- Data Visualisation and Dash-boarding,
2- Business Optimisation,
3- Simulation
4- Web Analytics
5- Data Warehousing.
Which one is better for becoming a Data Scientist. I have attached the course contents of these 5 course.
Regards

#

1- Data Visualisation and Dash-boarding

📎 unknown.png

#

2- Business Optimisation

📎 unknown.png

#

3- Simulation Modeling

📎 unknown.png

#

4- Web Analytics

📎 unknown.png

#

5- Data Warehousing

📎 unknown.png

ripe forge Jan 19, 2021, 7:47 AM

#

None of these. I assume you have some other core ds subjects, because these all would be supplementary to it

#

If you do have core ds subjects besides these, then I'd say pick based on which ones interest you most here

#

If I had to pick 3 for ds, I would have picked 1, 3, 4. But this is subjective.

vague vector Jan 19, 2021, 8:12 AM

#

Yes I have core subjects of DS. Im trying to choose the best optional ones for my career in Data Science.

hard hound Jan 19, 2021, 8:42 AM

#

Hey has anyone used logistic regression model here?

#

Should i use it for my classification model i am confused

ripe forge Jan 19, 2021, 9:02 AM

#

Yes, you can use logistic regression for classification

lapis sequoia Jan 19, 2021, 11:29 AM

#

I have a question

#

for learners, as long as the code works as intended, no matter how obfuscated or inefficient it is coded, it doesn't matter

#

for example

#

📎 unknown.png

#

It works as intended but I'm kind of worried

eager heath Jan 19, 2021, 11:34 AM

#

lapis sequoia for learners, as long as the code works as intended, no matter how obfuscated or...

I disagree, writing readable and efficient code is an important skill to learn

granite narwhal Jan 19, 2021, 11:38 AM

#

Hi Donna, I can help you out in what you need. Please let me know if we can discuss. I can help you out in any ML, AI, data science and python development work.

lapis sequoia Jan 19, 2021, 11:42 AM

#

eager heath I disagree, writing readable and efficient code is an important skill to learn

how do I learn to write efficient code

#

and I forgot to put a question mark on that

eager heath Jan 19, 2021, 11:43 AM

#

Usually through practice, and reading code or having your code judged

lapis sequoia Jan 19, 2021, 11:43 AM

#

Could you judge my code

eager heath Jan 19, 2021, 11:43 AM

#

For the last two, participating in open source projects is a good idea

lapis sequoia Jan 19, 2021, 11:43 AM

#

Any input is highly appreciated

eager heath Jan 19, 2021, 11:44 AM

#

Hmm... I don’t know this library (is that pandas?) but I can give you feedback on the rest of the code

lapis sequoia Jan 19, 2021, 11:44 AM

#

yes it is pandas

eager heath Jan 19, 2021, 11:44 AM

#

One of the first things I notice is that your functions are in camelCase, which is against PEP8

lapis sequoia Jan 19, 2021, 11:45 AM

#

uhm I'm literally a beginner for everything, so I don't understand what camelCase or PEP8 is

eager heath Jan 19, 2021, 11:46 AM

#

Alright

#

!pep8

arctic wedgeBOT Jan 19, 2021, 11:46 AM

#

PEP 8 is the official style guide for Python. It includes comprehensive guidelines for code formatting, variable naming, and making your code easy to read. Professional Python developers are usually required to follow the guidelines, and will often use code-linters like flake8 to verify that the code they're writing complies with the style guide.

You can find the PEP 8 document here.

eager heath Jan 19, 2021, 11:46 AM

#

This document defines the code style recommended when coding in Python

#

For example, function names are usually written with underscores in them, like def my_function_name

lapis sequoia Jan 19, 2021, 11:47 AM

#

oh

eager heath Jan 19, 2021, 11:47 AM

#

Also adding some blank lines in your code could help

#

apart from that, it looks pretty good

lapis sequoia Jan 19, 2021, 11:47 AM

#

so like corr_data_frame rather than corrDataFrame

#

nice, thanks man

#

glad it looks at least okay

eager heath Jan 19, 2021, 11:48 AM

#

lapis sequoia so like corr_data_frame rather than corrDataFrame

Yep!

arctic wedgeBOT Jan 19, 2021, 12:57 PM

#

:incoming_envelope: :ok_hand: applied mute to @broken crater until 2021-01-19 13:07 (9 minutes and 58 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

red briar Jan 19, 2021, 1:00 PM

#

lapis sequoia uhm I'm literally a beginner for everything, so I don't understand what camelCas...

hello, do you mind sharing the source you used in studying on visualization/plotting graph. im trying to study this area.

lapis sequoia Jan 19, 2021, 1:05 PM

#

red briar hello, do you mind sharing the source you used in studying on visualization/plot...

the whole source I wrote?

#

ohhh

#

nvm you meant the raw data

#

https://www.kaggle.com/sudalairajkumar/daily-temperature-of-major-cities

Daily Temperature of Major Cities

Daily average temperature values recorded in major cities of the world

red briar Jan 19, 2021, 1:07 PM

#

thanks!

lapis sequoia Jan 19, 2021, 1:07 PM

#

you're welcome 🙂

old meteor Jan 19, 2021, 2:13 PM

#

Any idea why suddenly this plotting code just gets stuck and not showing?

📎 unknown.png

slender notch Jan 19, 2021, 2:14 PM

#

@lapis sequoia ekans?

dapper hatch Jan 19, 2021, 2:14 PM

#

gleaming gull Do you need a subset of 10 random rows? or rows that fulfill a condition? or man...

thanks for the answer

I have to set up those groups for a condition

azure leaf Jan 19, 2021, 2:15 PM

#

does anyone know what this for loop means

#

for tag, topic_df_en in

#

i dont get why there is a comma

#

i only ever seen one word then in

#

for x in y

#

never seen for x,y in z

#

i dont get it

lapis sequoia Jan 19, 2021, 2:18 PM

#

slender notch <@456226577798135808> ekans?

ekans with pepe eyes

#

liking it?

slender notch Jan 19, 2021, 2:18 PM

#

Yea

pulsar jetty Jan 19, 2021, 2:22 PM

#

Can someone help me here with OpenCV?

hollow scarab Jan 19, 2021, 2:26 PM

#

if I created a df combining 2 other dfs

#

and like both dfs have the same column names

#

is there any way i can differentiate between them?

dapper hatch Jan 19, 2021, 2:29 PM

#

Can I upload an example in xlsx directly here?

lapis sequoia Jan 19, 2021, 3:16 PM

#

hollow scarab if I created a df combining 2 other dfs

check the columns while combining to avoid this issue

#

df.columns

#

add a suffix like xyz_df1 and xyz_df2 to avoid the problem

hollow scarab Jan 19, 2021, 3:18 PM

#

is that something the pd.concat function does for the columns? or should i rename the columns before concat? @lapis sequoia

bright meadow Jan 19, 2021, 3:26 PM

#

How would I update a certain int value by 1 in sqlite?

burnt prawn Jan 19, 2021, 3:33 PM

#

Meta Kaggle / Kaggle Global outreach analysis
https://www.kaggle.com/neomatrix369/kaggle-global-outreach-analysis/

Kaggle Global Outreach (analysis)

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

lapis sequoia Jan 19, 2021, 3:34 PM

#

hollow scarab is it a pd.concate function the columns? or should i rename the column names bef...

change the name before concatenating
create a function to take care of those cases.

red briar Jan 19, 2021, 3:34 PM

#

bright meadow How would I update a certain int value by 1 in sqlite?

are you looking for an answer using python or sql query?

bright meadow Jan 19, 2021, 3:35 PM

#

Python

lapis sequoia Jan 19, 2021, 3:37 PM

#

Python

lapis sequoia Jan 19, 2021, 3:38 PM

#

bright meadow How would I update a certain int value by 1 in sqlite?

something like this:
UPDATE Orders SET Quantity = Quantity + 1 WHERE ...

I found this on stackoverflow
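
Since the question was about doing this from Python, here is a minimal sketch using the standard-library `sqlite3` module. The table name `orders`, its columns, and the in-memory database are assumptions made for the example.

```python
import sqlite3

# In-memory database so the sketch is self-contained (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, quantity INTEGER)")
conn.execute("INSERT INTO orders (id, quantity) VALUES (1, 5)")

# Increment the integer in place; the ? placeholder keeps the query
# parameterised instead of building SQL with string formatting.
order_id = 1
conn.execute(
    "UPDATE orders SET quantity = quantity + 1 WHERE id = ?",
    (order_id,),
)
conn.commit()

new_quantity = conn.execute(
    "SELECT quantity FROM orders WHERE id = ?", (order_id,)
).fetchone()[0]
print(new_quantity)
```

Letting the database do the `quantity + 1` avoids reading the value into Python first and writing it back.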

bright meadow Jan 19, 2021, 3:39 PM

#

lapis sequoia something like this: UPDATE Orders SET Quantity = Quantity + 1 WHERE ... I foun...

Ok

hollow scarab Jan 19, 2021, 3:39 PM

#

oh..is that the only way? im doing this for a weekly report and they might not want the names to change @lapis sequoia

lapis sequoia Jan 19, 2021, 3:40 PM

#

hollow scarab oh..is that the only way? im doing this for a weekly report and they might not w...

if those columns have the same name and the same contents (values) then you may want to join them instead of concatenating

#

pd.merge()

#

I don't know what is your use case so you have to figure that out

hollow scarab Jan 19, 2021, 3:41 PM

#

basically the date is the same, and there are like 5 other colums with the same name in both but different values

#

so I could join on the date, but that wouldn't solve the issue of the other columns having the same name

#

I need to do this to make a chart, I guess if it joins the date that would be fine for one of the charts

lapis sequoia Jan 19, 2021, 3:44 PM

#

i don't know what is the final result that you are looking for

hollow scarab Jan 19, 2021, 3:44 PM

#

pd.merge can be done for one column only right?

lapis sequoia Jan 19, 2021, 3:44 PM

#

hollow scarab pd.merge can be done for one column only right?

no it can be done on multiple too
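
One way to keep same-named columns apart after a join is the `suffixes` argument of `pd.merge`. The sketch below assumes two small weekly frames with a shared `date` key; the column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical weekly reports with identical column names.
week1 = pd.DataFrame({"date": ["Mon", "Tue"], "sales": [10, 20], "visits": [100, 200]})
week2 = pd.DataFrame({"date": ["Mon", "Tue"], "sales": [15, 25], "visits": [110, 190]})

# Join on the shared key; every other overlapping column gets a suffix,
# so the source frames themselves keep their original names.
merged = pd.merge(week1, week2, on="date", suffixes=("_week1", "_week2"))
print(merged)
```

`on` also accepts a list of column names when the join key spans several columns.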

#

OH_pepeLaffing

hollow scarab Jan 19, 2021, 3:45 PM

#

basically this is week1 and week2, same column names

📎 unknown.png

#

and I need a chart like that at the end

📎 unknown.png

lapis sequoia Jan 19, 2021, 3:49 PM

#

hollow scarab and I need a chart like that at the end

wait they are not of the same date?

red briar Jan 19, 2021, 3:49 PM

#

hollow scarab basically this is week1 and week2, same column names

You need two different charts for week 1 and 2?

sand hamlet Jan 19, 2021, 3:49 PM

#

Try to slice into two df after reading it

#

With .loc

hollow scarab Jan 19, 2021, 3:49 PM

#

the date column is the same

red briar Jan 19, 2021, 3:50 PM

#

Before concat make new column for each df

sand hamlet Jan 19, 2021, 3:50 PM

#

Use iloc then

hollow scarab Jan 19, 2021, 3:50 PM

#

@sand hamlet the week1 and week2 were 2 different dfs, I got this by a pd.concat

#

well it should be on one chart @red briar

#

the jan13 is week1 and jan14 is week2 on the pic

#

do I need to combine the 2 dfs at all if I want them to be on the same chart?

#

or I can do that if they are separate dfs?

sand hamlet Jan 19, 2021, 3:52 PM

#

Try to append

lapis sequoia Jan 19, 2021, 3:52 PM

#

hollow scarab or I can do that if they are separate dfs?

yeah you could do that without concatenating if they are the same size.

sand hamlet Jan 19, 2021, 3:52 PM

#

Instead of concat

lapis sequoia Jan 19, 2021, 3:52 PM

#

lapis sequoia yeah you could do that without concatenating if they are the same size.

just read each row using .loc for both the df

hollow scarab Jan 19, 2021, 3:53 PM

#

they are the same size

lapis sequoia Jan 19, 2021, 3:53 PM

#

and use columns of df1 and df2 as indexes

hollow scarab Jan 19, 2021, 3:54 PM

#

I will try that and append as well, see which is easier, thank you !

red briar Jan 19, 2021, 4:02 PM

#

sorry i dont have experience with chart but
if i were to compare them via table
df1['week'] ='week1'
df2['week'] ='week2'
df = pd.concat([df1,df2])
then groupby via column week
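
The idea above can be sketched end to end. The `date` and `sales` columns and their values are assumptions; only the `week` label and the `pd.concat` call come from the message.

```python
import pandas as pd

# Hypothetical stand-ins for the two weekly frames.
df1 = pd.DataFrame({"date": ["Mon", "Tue"], "sales": [10, 20]})
df2 = pd.DataFrame({"date": ["Mon", "Tue"], "sales": [15, 25]})

# Label each frame before stacking them, so rows stay distinguishable.
df1["week"] = "week1"
df2["week"] = "week2"
df = pd.concat([df1, df2], ignore_index=True)

# Per-week totals for a table comparison.
totals = df.groupby("week")["sales"].sum()

# One column per week, one row per date: a convenient shape for a
# grouped bar chart, e.g. wide.plot.bar() (needs matplotlib, not run here).
wide = df.pivot(index="date", columns="week", values="sales")
print(totals)
print(wide)
```

The original column names are untouched, which may matter for a report where the names should not change.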

hollow scarab Jan 19, 2021, 4:03 PM

#

hmm, that could work as well

fathom seal Jan 19, 2021, 4:25 PM

#

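from math import sqrt

def square_rooted(x):
    # Euclidean length (L2 norm) of the vector, rounded to 3 decimals
    return round(sqrt(sum([a*a for a in x])), 3)

def cosine_similarity(x, y):
    # dot product of the two vectors divided by the product of their lengths
    numerator = sum(a*b for a, b in zip(x, y))
    denominator = square_rooted(x)*square_rooted(y)
    return round(numerator/float(denominator), 3)

# print is a function in Python 3
print(cosine_similarity([3, 45, 7, 2], [2, 54, 13, 15]))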

#

anyone know how to use this algorithm?
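
As a rough guide to usage: the function takes two equal-length numeric vectors and returns a score between -1 and 1, where values near 1 mean the vectors point in nearly the same direction. Below is a NumPy sketch of the same formula. It is an alternative written for illustration, not the code from the message, and it skips the intermediate rounding.

```python
import numpy as np


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # dot product divided by the product of the Euclidean norms
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


score = cosine_similarity([3, 45, 7, 2], [2, 54, 13, 15])
print(round(score, 3))
```

Typical inputs are feature vectors such as word counts or embeddings. Neither version handles an all-zero vector, which would make the denominator zero.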

gleaming gull Jan 19, 2021, 4:43 PM

#

azure leaf does anyone know what this for loop means

I'm not sure if you got an answer for this, but here's a few use cases for why I use the "for x,y in n:" syntax in my day to day work as a data scientist

#

https://paste.pythondiscord.com/lasapakifa.lua
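
For readers without access to the paste, here is a short sketch of what the comma does. `for x, y in z` works whenever each item of `z` is itself a pair, and the loop unpacks that pair into two names. The data and the column names `tag` and `text` are made up for the example.

```python
import pandas as pd

# Each item of the list is a 2-tuple; "for name, count in ..." unpacks it.
pairs = [("python", 3), ("pandas", 1)]
unpacked = []
for name, count in pairs:
    unpacked.append(f"{name}={count}")

# dict.items() yields (key, value) tuples, so the same syntax applies.
totals = {"python": 3, "pandas": 1}
for name, count in totals.items():
    print(name, count)

# DataFrame.groupby yields (group_key, sub_dataframe) pairs, which is
# probably what a loop like "for tag, topic_df_en in ..." iterates over.
df = pd.DataFrame({"tag": ["a", "b", "a"], "text": ["x", "y", "z"]})
group_sizes = {}
for tag, topic_df in df.groupby("tag"):
    group_sizes[tag] = len(topic_df)
print(group_sizes)
```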

azure leaf Jan 19, 2021, 4:43 PM

#

Thanks

#

Do you happen to have experience with wordcloud?

gleaming gull Jan 19, 2021, 4:46 PM

#

I've used it in the past, do you have a specific question about it?

#

also, there was a slight error in that code, I updated it! my bad!

azure leaf Jan 19, 2021, 4:49 PM

#

yes

#

and ok thanks

#

                        cloud_url = ""

                    else:
                        cloud_words = " ".join(words_ns_en)
                        img = io.BytesIO()
                        wordcloud = WordCloud(background_color='white', max_font_size = 100, width=600, height=300).generate(cloud_words)
                        plt.figure(figsize=(10,5))
                        plt.imshow(wordcloud, interpolation='bilinear')
                        plt.axis("off")
                        plt.show()
                        plt.savefig(img, format='png')
                        plt.close()
                        img.seek(0)

                        cloud_url = base64.b64encode(img.getvalue()).decode()
                        plt.clf()

#

so this is my code to generate a wordcloud

#

but sometimes i get this error on page load: File "/app/by_page.py", line 320, in bypage plt.imshow(wordcloud)

#

ValueError: Argument must be an image, collection, or ContourSet in this Axes

#

It only happens sometimes which is really weird

#

and sometimes if i refresh the page

#

it works

#

i have no idea why

#

    from pandas.plotting import register_matplotlib_converters
    register_matplotlib_converters()
    matplotlib.use('agg')
    import matplotlib.pyplot as plt
    import io
    import base64
    import matplotlib.ticker as plticker
    import datetime as DT
    from wordcloud import WordCloud

#

these are my import statements

#

been trying to debug for the past day, can't find many resources online to this error

gleaming gull Jan 19, 2021, 4:57 PM

#

do you have this deployed on a web page somewhere? It looks like some type of error with the ax parameter in matplotlib

azure leaf Jan 19, 2021, 4:57 PM

#

yes its on my flask container

#

im running it in a docker container

#

I can screenshare and show you, not sure if you're up for that. No worries if not lol I get it

gleaming gull Jan 19, 2021, 4:58 PM

#

I'm not too sure what's going on but it seems like something on the matplotlib side. If that helps lol

azure leaf Jan 19, 2021, 4:59 PM

#

so

#

like an import statement?

#

  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/app.py", line 21, in bypage
    return by_page.bypage()
  File "/app/by_page.py", line 320, in bypage
    plt.imshow(wordcloud)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/pyplot.py", line 2731, in imshow
    sci(__ret)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/pyplot.py", line 3102, in sci
    return gca()._sci(im)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py", line 1856, in _sci
    raise ValueError("Argument must be an image, collection, or "
ValueError: Argument must be an image, collection, or ContourSet in this Axes

#

this is the full traceback

gleaming gull Jan 19, 2021, 5:00 PM

#

My guess is something is going wrong when matplotlib is trying to draw the figure, just a guess though

azure leaf Jan 19, 2021, 5:00 PM

#

yaaa

#

No Idea what to do to debug it

gleaming gull Jan 19, 2021, 5:02 PM

#

what happens if you remove the plt.figure() call and only use the wordcloud parameter to define the size?

azure leaf Jan 19, 2021, 5:02 PM

#

not sure, ill comment out the plt.figure line and see what happens

#

ye, doesn't do anything, still points to this line plt.imshow(wordcloud, interpolation='bilinear')

gleaming gull Jan 19, 2021, 5:05 PM

#

hmmm. I'm not sure.. I have a couple mins, I'm going to see if I can reproduce the error

azure leaf Jan 19, 2021, 5:06 PM

#

Okay, thanks!

#

I don't know if you will get the error though, it only happens sometimes on my end and it could be to do with my flask environment

#

i honestly have no ideas

gleaming gull Jan 19, 2021, 5:09 PM

#

That could be it too. I've deployed some apps to heroku and they're glitchy af. I tell my colleagues to refresh it if it doesn't pop up right away lol

azure leaf Jan 19, 2021, 6:46 PM

#

ya cant resolve why this line causes error plt.imshow(wordcloud, interpolation='bilinear')
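
One possible explanation, offered as a guess rather than a diagnosis: `pyplot` keeps a global "current figure" and "current axes", and a web server can handle several requests at once, so one request's `plt.imshow` may end up talking to axes created by another. That would fit an error that appears only sometimes. Matplotlib's own guidance for web servers is to build a `Figure` directly instead of going through `pyplot`. The sketch below does that with a random array standing in for the word cloud; the helper name is hypothetical, and `wordcloud.to_array()` is mentioned only in a comment as the assumed way to get an array from the real object.

```python
import base64
import io

import numpy as np
from matplotlib.figure import Figure


def image_to_base64_png(image_array):
    """Render an image array to a base64-encoded PNG without using pyplot."""
    # A Figure created directly is not registered in pyplot's global state,
    # so concurrent requests cannot pick up each other's "current" axes.
    fig = Figure(figsize=(10, 5))
    ax = fig.subplots()
    ax.imshow(image_array, interpolation="bilinear")
    ax.axis("off")

    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    return base64.b64encode(buffer.getvalue()).decode()


# Stand-in for the word cloud image; with the real object this would be
# something like wordcloud.to_array() (assumption, not run here).
fake_cloud = np.random.default_rng(0).integers(
    0, 255, size=(300, 600, 3), dtype=np.uint8
)
cloud_url = image_to_base64_png(fake_cloud)
print(cloud_url[:20])
```

Because the figure is a local variable, there is nothing to `plt.close()` or `plt.clf()` afterwards; it is garbage-collected when the function returns.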

buoyant phoenix Jan 19, 2021, 7:02 PM

#

hello guys

#

please any link for tutorials in data science using machine learning

#

using Python

molten hamlet Jan 19, 2021, 7:32 PM

#

@native lark Yo check my progress :D,
Now I got rewards and can proceed to build actual model 😄
https://youtu.be/3_9oRuYH_UE

YouTube

Awesome-Ai

Clickr - Script

Script reading which box is most valuable to click. Purpose was to generate samples for machine learning, but small tuning and this is decent bot.

repositories: https://github.com/GrzegorzKrug/

▶ Play video

native lark Jan 19, 2021, 7:33 PM

#

molten hamlet <@!151347084602245120> Yo check my progress :D, Now I got rewards and can procee...

again, it would be faster to just recreate the game in python

molten hamlet Jan 19, 2021, 7:34 PM

#

but it would be much easier!

#

i can't work with vision if all is in list and matrices 😄

native lark Jan 19, 2021, 7:36 PM

#

yea i know, im just the type of person that would only care about the model if i was doing sth like this

molten hamlet Jan 19, 2021, 7:36 PM

#

I just want to start on this, and then jump to euro truck simulator 😄

native lark Jan 19, 2021, 7:36 PM

#

thats online tho, right?

molten hamlet Jan 19, 2021, 7:36 PM

#

single and multi yes

native lark Jan 19, 2021, 7:36 PM

#

wouldn't that be cheating?

molten hamlet Jan 19, 2021, 7:37 PM

#

probably yes, as I remember I think u can just load singleplayer save to multi 🤔

#

but it does not matter really 😄

native lark Jan 19, 2021, 7:38 PM

#

yes, as per rule 5 it does

#

not gonna lie tho the clickr thing is really mesmerizing to watch

molten hamlet Jan 19, 2021, 7:39 PM

#

I gonna do ai In single obviously, people are sometimes maniac in it 😄

#

haha yes it is 🤔

twilit pilot Jan 19, 2021, 7:42 PM

#

I am trying to do model.fit(X, y) where X is a bunch of 1-d arrays and y is a number 0 or 1. this is what my dataframe looks like

 image    result

0 [177, 177, 177, 177, 177, 177, 177, 177, 177, ... 1
1 [177, 177, 177, 177, 177, 177, 177, 177, 177, ... 1
2 [177, 177, 177, 177, 177, 177, 177, 177, 177, ... 1
3 [177, 177, 177, 177, 177, 177, 177, 177, 177, ... 1
4 [177, 177, 177, 177, 177, 177, 177, 177, 177, ... 1
... ... ...
995 [175, 175, 175, 175, 175, 175, 175, 175, 175, ... 0
996 [173, 173, 173, 173, 173, 173, 173, 173, 173, ... 0
997 [171, 171, 171, 171, 171, 171, 171, 171, 171, ... 0
998 [169, 169, 169, 169, 169, 169, 169, 169, 169, ... 0
999 [168, 168, 168, 168, 168, 168, 168, 168, 168, ... 0
1000 rows × 2 columns
and when i do this
model = sklearn.linear_model.LogisticRegression()
model.fit(df['image'], df['result'])
i get an error that looks like this
TypeError Traceback (most recent call last)
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

ValueError Traceback (most recent call last)
<ipython-input-62-0feffeb8d1c0> in <module>
1 model = LogisticRegression()
----> 2 model.fit(X, y)
ValueError: setting an array element with a sequence.

midnight rain Jan 19, 2021, 7:43 PM

#

if i have a numpy array of numpy arrays of dtype float, shouldnt the dtype of the outer array be float as well?

#

x = np.array(np.array([....., ],dtype=float32), np.array(...), ...)

x.dtype = ? #np.float?

molten hamlet Jan 19, 2021, 7:45 PM

#

midnight rain ```python x = np.array(np.array([....., ],dtype=float32), np.array(...), ...) x...

nope, its object most likely, cause you got arrays not floats inside

midnight rain Jan 19, 2021, 7:46 PM

#

weird

#

i thought the docs showed it otherwise

molten hamlet Jan 19, 2021, 7:46 PM

#

🤔

midnight rain Jan 19, 2021, 7:47 PM

#

📎 unknown.png

#

maybe i need to flatten first?

#

numpy.stack(x, axis=0) or similar?

molten hamlet Jan 19, 2021, 7:48 PM

#

midnight rain ```python numpy.stack(x, axis=0)``` or similar?

what do u want to do? your previous example was more like a list of arrays

midnight rain Jan 19, 2021, 7:48 PM

#

im dumb that worked

molten hamlet Jan 19, 2021, 7:49 PM

#

you can np.concatenate or np.stack

midnight rain Jan 19, 2021, 7:49 PM

#

im using the faiss database

molten hamlet Jan 19, 2021, 7:49 PM

#

stack creates new axis, if needed
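
A small sketch of the difference, using made-up values. Note that when the inner arrays all have the same length, `np.array` on a plain list of them already produces a 2D float array. The `object` dtype tends to show up when the arrays were stored one per cell (for example in a pandas column) or have different lengths, so the object array is built by hand here.

```python
import numpy as np

rows = [
    np.array([1.0, 2.0], dtype=np.float32),
    np.array([3.0, 4.0], dtype=np.float32),
]

# Equal-length arrays in a list: NumPy builds a regular 2D array.
direct = np.array(rows)

# A 1D array whose elements are arrays: dtype is object, shape is (2,).
as_objects = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    as_objects[i] = row

# np.stack joins the inner arrays along a new first axis,
# giving a regular (rows, columns) float array again.
stacked = np.stack(as_objects, axis=0)
print(as_objects.dtype, stacked.dtype, stacked.shape)
```

`np.concatenate` would instead join along an existing axis, producing one long 1D array in this case.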

midnight rain Jan 19, 2021, 7:49 PM

#

and the documentation is lacking

#

so im not even sure what it wants, but it didnt like having an array of objects

#

so im assuming maybe a multi dimensional array of floats is closer to what it wants

#

GWcmeisterPeepoShrug

molten hamlet Jan 19, 2021, 7:50 PM

#

you want to merge two arrays?

midnight rain Jan 19, 2021, 7:51 PM

#

i guess i wanted a multidimensional array

#

a (rows, columns) array

#

the documentation didnt specify though so i thought it wanted an array of arrays for some reason facepalm

vestal mirage Jan 19, 2021, 7:52 PM

#

hello

#

when comparing 2 datas what do u decide to put on the x and y axis

#

like for example i want to compare points vs assists, should points be on the x or it should be on y?

midnight rain Jan 19, 2021, 7:54 PM

#

vestal mirage like for example i want to compare points vs assists, should points be on the x ...

i think it depends on how you are doing your comparison. if you do the comparison over time, then i use x for the time and y for the values

vestal mirage Jan 19, 2021, 7:55 PM

#

its more of like

#

a correlations between these two

#

like im more of trying to figure out the relationship between these 2 points

#

@midnight rain

#

📎 unknown.png

#

currently its sumething liek dis

midnight rain Jan 19, 2021, 7:59 PM

#

oh i see

vestal mirage Jan 19, 2021, 7:59 PM

#

so ye i never know what to put on x or y...

#

dunno if it matters

midnight rain Jan 19, 2021, 8:00 PM

#

id say thats more of a narrative thing

vestal mirage Jan 19, 2021, 8:00 PM

#

in this case what wud u do? for setting x n y

midnight rain Jan 19, 2021, 8:01 PM

#

what are you wanting the graph to show? That players with lots of assists are more/less likely to score points? Or players that score points are more/less likely to assist others?

#

i'd say your x is more likely to be what you want your narrative to be around, but you might be best off asking someone in UI/UX design

vestal mirage Jan 19, 2021, 8:03 PM

#

midnight rain what are you wanting the graph to show? That players with lots of assists are mo...

well its more of just data analysis rn

#

like figuring out those questions

vestal mirage Jan 19, 2021, 8:04 PM

#

midnight rain i'd say your x is more likely to be what you want your narrative to be around, b...

but ye the main narrative is around points

#

how different stats influence points

abstract zealot Jan 19, 2021, 8:22 PM

#

Hey is covariance the same as variance if im only looking at one univariate normal distribution? asking this because im trying to model my normal distribution as a GaussianMixture on sklearn and this class only has the attribute .covariances_. Any help appreciated thank you 😄

lapis sequoia Jan 19, 2021, 8:50 PM

#

Numpy.matrix is deprecated and discouraged. Is there some alternative to implement an api that supports classic syntax like (A * B, A ** 3) for matrices etc

sturdy dune Jan 19, 2021, 8:55 PM

#

10 python hacks you must know : https://datamahadev.com/10-amazing-python-hacks-with-cool-libraries/

datamahadev.com

Samiksha Bhavsar

10 Amazing Python Hacks with Cool Libraries - datamahadev.com

In this article, we will learn some amazing python hacks with some rare yet cool libraries. The main purpose of this article is to learn(or automate) a few basic things with the help of python. So, let us begin.

hard canopy Jan 19, 2021, 9:43 PM

#

I am confused

#

I'm not sure which API I like more

#

between pytorch and tensorflow :/

velvet thorn Jan 19, 2021, 10:16 PM

#

lapis sequoia Numpy.matrix is depracated and discouraged. Is there some alternative to impleme...

is that supposed to be elementwise multiplication

velvet thorn Jan 19, 2021, 10:16 PM

#

abstract zealot Hey is covariance the same as variance if im only looking at one univariate norm...

Var(X) = Cov(X, X), yes
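
A quick numerical check of that identity, using a synthetic sample (the mean, scale, and seed are arbitrary choices for the sketch). For a one-feature `GaussianMixture` with the default `covariance_type="full"`, each entry of `covariances_` is a 1x1 matrix, so its single value plays the role of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1000)

# Sample variance; ddof=1 matches the default normalisation of np.cov.
variance = np.var(x, ddof=1)

# Covariance of x with itself: a 2x2 matrix in which every entry is Var(x).
cov_matrix = np.cov(x, x)
print(variance, cov_matrix)
```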

velvet thorn Jan 19, 2021, 10:17 PM

#

midnight rain the documentation didnt specify though so i thought it wanted an array of arrays...

there is no ML case I can think of which requires an array of arrays

#

it's always a multidimensional array

#

this has to do with efficiency in reading/writing data

velvet thorn Jan 19, 2021, 10:18 PM

#

twilit pilot I am trying to do `model.fit(X, y)` where X is a bunch of 1-d arrays and y is a ...

you need to turn your X into a 2D array

#

right now it's a 1D array of arrays
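
A minimal sketch of that conversion, assuming every array in the `image` column has the same length. The tiny frame below is a stand-in for the real one, and the `model.fit` call is left as a comment because scikit-learn is not needed to show the reshaping.

```python
import numpy as np
import pandas as pd

# Stand-in for the real data: one 1D pixel array per row.
df = pd.DataFrame(
    {
        "image": [np.array([177, 177, 177]), np.array([175, 175, 175])],
        "result": [1, 0],
    }
)

# The column is a 1D object array of arrays; stacking the rows gives
# the (n_samples, n_features) matrix that estimators expect.
X = np.stack(df["image"].to_numpy())
y = df["result"].to_numpy()
print(X.shape, y.shape)

# model.fit(X, y)  # e.g. with sklearn's LogisticRegression (not run here)
```

If the arrays differ in length, `np.stack` raises an error, and the images would need resizing or padding first.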

midnight rain Jan 19, 2021, 10:22 PM

#

velvet thorn there is no ML case I can think of which requires an array of arrays

it is for indexing into a database so both cases seemed like they made sense

#

wasnt sure how it transactionally does indexing.

velvet thorn Jan 19, 2021, 10:26 PM

#

midnight rain it is for indexing into a database so both cases seemed like they made sense

there are few cases in general where one should be using some collection of arrays

#

because you lose vectorisability over the outer collection

midnight rain Jan 19, 2021, 10:27 PM

#

@velvet thorn well all im doing is inserting those arrays into a database for however it wants to keep them stored

velvet thorn Jan 19, 2021, 10:29 PM

#

midnight rain <@!171929073063297024> well all im doing is inserting those arrays into a databa...

hm. that sounds somewhat like an antipattern to me

#

but if it doesn't matter to you we can just leave it

midnight rain Jan 19, 2021, 10:30 PM

#

@velvet thorn well its an embedding database so i wasn't sure if it wanted individual embeddings or a ndarray of embeddings.

#

for a more traditional db you'd probably give a list of tuples like [(col1, col2, col3), ...]

velvet thorn Jan 19, 2021, 10:33 PM

#

midnight rain <@!171929073063297024> well its an embedding database so i wasn't sure if it wan...

depends on the spec I guess

#

but yeah generally in that case you'd want a list of arrays

midnight rain Jan 19, 2021, 10:33 PM

#

right, but this particular library wanted an ndarray

#

the documentation is extremely lacking so i didnt know haha

velvet thorn Jan 19, 2021, 10:34 PM

#

sounds like weird design

#

🥴

ancient galleon Jan 19, 2021, 11:10 PM

#

Hi uh, is anyone familiar with multi-indexing with matplotlib?

velvet thorn Jan 19, 2021, 11:34 PM

#

ancient galleon Hi uh, is anyone familiar with multi-indexing with matplotlib?

just ask your question

ancient galleon Jan 19, 2021, 11:43 PM

#

How would you graph a multi-indexed series into a grouped bar chart?

I thought something like this in seaborn would do the trick:

sns.barplot(x="Age Cohort", y=(?), hue="Ethnicity")

But if you're not familiar with seaborn then a way of creating a grouped bar plot in native matplotlib would be great as well.

📎 unknown.png

velvet thorn Jan 19, 2021, 11:45 PM

#

ancient galleon How would you graph a multi-indexed series into a grouped bar chart? I thought ...

paste the Series

#

as text

ancient galleon Jan 19, 2021, 11:45 PM

#

Sure thing

#

Age Cohort  Ethnicity         
0 to 5      Hispanic               44
            White not Hispanic     20
13 to 17    Hispanic              103
            White not Hispanic     67
18 to 21    Hispanic               78
            White not Hispanic     69
22 to 50    Hispanic               43
            White not Hispanic    133
51+         Hispanic               17
            White not Hispanic     66
6 to 12     Hispanic               91
            White not Hispanic     46
Name: Age Cohort, dtype: int64

velvet thorn Jan 19, 2021, 11:47 PM

#

thx

ancient galleon Jan 19, 2021, 11:48 PM

#

Ideally I would be able to convert this into a chart like this:

📎 unknown.png

velvet thorn Jan 19, 2021, 11:51 PM

#

@ancient galleon df.unstack().plot.bar()

ancient galleon Jan 19, 2021, 11:51 PM

#

I'll give it a shot right now, thanks gm

velvet thorn Jan 19, 2021, 11:52 PM

#

ancient galleon I'll give it a shot right now, thanks gm

actually, apply that to the series

#

not the DF
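
A sketch of what `unstack()` does to a series like the one pasted above, using its first four values. The plotting call is left as a comment so the example runs without matplotlib.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["0 to 5", "13 to 17"], ["Hispanic", "White not Hispanic"]],
    names=["Age Cohort", "Ethnicity"],
)
counts = pd.Series([44, 20, 103, 67], index=index)

# unstack() moves the inner index level (Ethnicity) into columns,
# leaving one row per age cohort and one column per ethnicity.
wide = counts.unstack()
print(wide)

# wide.plot.bar() then draws one group of bars per row (needs matplotlib).
```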

ancient galleon Jan 19, 2021, 11:53 PM

#

Yeah it provides the graph, thank you. I was experimenting with unstack() to change it back to single indexing but I was just translating that directly into seaborn instead of using pyplot

#

This solves the issue, I appreciate the help!

velvet thorn Jan 19, 2021, 11:55 PM

#

yw

shut valve Jan 20, 2021, 1:46 AM

#

Hello i was wondering if any of yall use a real time collaborative notebook with your team? We tried using VS code live share but it didnt work with notebooks. I'm currently just port forwarding my jupyter notebook which works but i don't like having an open port and sending my ip to strangers on the internet. Wondering if others had a solution? I'm currently thinking about trying to host a notebook on elastic beanstalk or something like that as that would be most ideal for when we start training models as I'm on a small laptop. Cheers

sly hinge Jan 20, 2021, 2:00 AM

#

Hi, I'm trying to implement a CNN with an LSTM layer but I don't know how LSTM works very well and I haven't been able to connect the two layers, does anyone know how I can pass the parameters?

#

📎 unknown.png

velvet thorn Jan 20, 2021, 2:28 AM

#

sly hinge Hi, I'm trying to implement a CNN with an LSTM layer but I don't know how LSTM w...

I suggest you read up on what a LSTM does first

#

it's better to understand the nature of the basic layers before trying to do this kind of thing

sly hinge Jan 20, 2021, 3:20 AM

#

I have a general knowledge, I don't think the layer is necessary for the project I want to do, but it is a requirement. I'll keep investigating thanks.

vestal mirage Jan 20, 2021, 4:51 AM

#

can anybody help me speed this code up? its very slow

📎 unknown.png

#

basically what it does is takes in an initial dataframe with metrics column containing json data, then it normalizes that json and returns a new data frame with the json flattened

#

📎 unknown.png

agile wing Jan 20, 2021, 6:19 AM

#

nice

lapis sequoia Jan 20, 2021, 6:33 AM

#

nice

lapis sequoia Jan 20, 2021, 6:33 AM

#

vestal mirage

Your code looks gorgeous sir

velvet thorn Jan 20, 2021, 6:34 AM

#

vestal mirage can anybody help me speed this code up? its verry slow

can you give a short sample

#

of your data

#

also, post code/data as text instead of images please

vestal mirage Jan 20, 2021, 6:47 AM

#

velvet thorn can you give a short sample

def _expand_json(metric: Any) -> pd.DataFrame:
    try:
        return pd.json_normalize(metric)
    except AttributeError:
        return pd.DataFrame(metric)


def _expand_metrics(dframe: pd.DataFrame) -> pd.DataFrame:
    dfs = []
    for _, row in dframe.iterrows():
        df = pd.DataFrame({"entity#id": [row["entity#id"]]})
        expanded = _expand_json(row["metrics"])
        dfs.append(pd.concat([df, expanded], axis=1))

    df = pd.concat(dfs, ignore_index=True)
    return df

velvet thorn Jan 20, 2021, 6:47 AM

#

vestal mirage ```py def _expand_json(metric: Any) -> pd.DataFrame: try: return pd....

of data

vestal mirage Jan 20, 2021, 6:47 AM

#

1 sec

#

pycharm froze -.-

#

csv file big

#

📎 unknown.png

#

parent,metrics,entity#id,latest,lastUpdatedEpoch
game#bos-dal-20190824,"{""playerStats"":{""onePtGoals"":0,""shotsOnGoalPercentage"":null,""penalties"":0,""reboundSaved"":0,""penaltyMins"":0,""twoPtGoals"":0,""causedTurnovers"":0,""interceptions"":0,""points"":0,""runOuts"":0,""shotsOffTarget"":0,""cleanSaved"":0,""assists"":0,""shotPercentage"":null,""shotsOffGoal"":0,""turnovers"":0,""shotsOnGoal"":0,""shotsTotal"":0,""shotsPipe"":0,""shotsDeflected"":0,""groundballs"":{""retain"":0,""rebound"":0,""total"":0,""turnover"":0,""faceoff"":0}},""faceoffStats"":{""total"":0,""faceoffGroundball"":0,""percentage"":null,""wingGroundball"":0,""won"":0,""wingProcedure"":0,""faceoffProcedure"":0,""outOfBounds"":0},""goalieStats"":{""cleanSaves"":0,""onePtGoalsAllowed"":0,""reboundSaves"":0,""savePercentage"":null,""saves"":0,""goalieShotsOnGoal"":0,""goalsAgainstAverage"":null,""minPlayed"":null,""twoPtGoalsAllowed"":0,""totalGoalsAllowed"":0},""jerseyNumber"":99,""link"":""/game/bos-dal-20190824/player/13367"",""name"":""John Daniggelis"",""_id"":13367,""position"":""DM"",""team"":{""name"":""Boston Cannons"",""link"":""/team/bos"",""id"":""bos""}}",game#bos-dal-20190824#player#13367,false,1566678278806

#

@velvet thorn dis is 1 line of the csv file

#

basically metrics col is json

📎 unknown.png

velvet thorn Jan 20, 2021, 6:52 AM

#

so it normalises to

#

a single row

#

with many columns?

vestal mirage Jan 20, 2021, 6:52 AM

#

ye

#

yup

velvet thorn Jan 20, 2021, 6:52 AM

#

then why do you have a try-except

vestal mirage Jan 20, 2021, 6:53 AM

#

oh dat cuz the original dataframe is kinda messed up

#

so liek 99% of metrics col is json but there are a few exceptions

velvet thorn Jan 20, 2021, 6:54 AM

#

df['metrics'].map(pd.json_normalize)?

vestal mirage Jan 20, 2021, 6:57 AM

#

no it dont work

#

📎 unknown.png

#

📎 unknown.png

#

dang pycharm professional has a sciview pretty dope

vestal mirage Jan 20, 2021, 7:20 AM

#

@velvet thorn any ideas?
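
One thing that may help, offered as a sketch under assumptions: the loop builds and concatenates a small DataFrame per row, and that per-row overhead is a likely cost. `pd.json_normalize` also accepts a list of dicts, so the whole column can be flattened in one call. The example assumes the `metrics` cells are JSON strings, as in the CSV line above; if they are already dicts, the `json.loads` step can be dropped. It does not handle the few non-JSON rows mentioned earlier, which would need filtering first.

```python
import json

import pandas as pd

# Tiny stand-in for the real frame (values are made up).
df = pd.DataFrame(
    {
        "entity#id": ["game#1#player#1", "game#1#player#2"],
        "metrics": [
            '{"playerStats": {"points": 0, "assists": 1}, "name": "A"}',
            '{"playerStats": {"points": 2, "assists": 0}, "name": "B"}',
        ],
    }
)

# Parse every cell once, then flatten all records in a single call.
records = df["metrics"].map(json.loads).tolist()
expanded = pd.json_normalize(records)

# Put the id column back next to the flattened columns.
result = pd.concat(
    [df[["entity#id"]].reset_index(drop=True), expanded], axis=1
)
print(result.columns.tolist())
```

Nested keys come out as dotted column names such as `playerStats.points`.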

lapis sequoia Jan 20, 2021, 8:04 AM

#

velvet thorn is that supposed to be elementwise multiplication

No, thats matrix multiplication (product sums of columns by rows)

compact matrix Jan 20, 2021, 8:17 AM

#

Is there a way to reverse one hot encoding and get the original categorical variables.?
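
In the common case where each row has exactly one 1, the position of that 1 identifies the category, so taking the argmax of each row reverses the encoding. This NumPy sketch uses made-up categories and assumes the column order of the encoding is known. With pandas dummies, `idxmax(axis=1)` does the same job, and scikit-learn's `OneHotEncoder` has an `inverse_transform` method.

```python
import numpy as np

# Column order used when the data was encoded (assumed known).
categories = np.array(["cat", "dog", "bird"])

one_hot = np.array(
    [
        [1, 0, 0],
        [0, 0, 1],
        [0, 1, 0],
    ]
)

# argmax gives the column index of the 1 in each row; indexing the
# category array with those positions recovers the labels.
decoded = categories[one_hot.argmax(axis=1)]
print(decoded)
```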

velvet thorn Jan 20, 2021, 8:41 AM

#

No, thats matrix multiplication (product sums of columns by rows)
@lapis sequoia use the @ operator with 2D arrays
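
A short sketch with plain 2D arrays and made-up values. `@` is the matrix product, `np.linalg.matrix_power` covers the `A ** 3` case, and `*` stays elementwise.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

product = A @ B                       # matrix product (rows by columns)
cube = np.linalg.matrix_power(A, 3)   # A @ A @ A
elementwise = A * B                   # entry-by-entry product, not a matrix product

print(product)
print(cube)
```

If the `A * B` and `A ** 3` spellings are a hard requirement for an API, a thin wrapper class that maps `__mul__` and `__pow__` onto these two calls would be one option.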

vestal mirage Jan 20, 2021, 8:46 AM

#

pycharm pro scitools r dope

📎 unknown.png

upbeat cradle Jan 20, 2021, 1:31 PM

#

hey all, would dask be worth using over pandas for large groupby sets?

#

would there be a noticeable difference in speed? bear in mind my current operation takes around 60 minutes and I have ~1mil rows

austere swift Jan 20, 2021, 1:33 PM

#

You could use cudf if you're on linux

#

since cudf is linux only iirc

#

its essentially just gpu-accelerated dataframe operations

#

so it should make it a lot faster since it's on gpu

#

https://github.com/rapidsai/cudf

GitHub

rapidsai/cudf

cuDF - GPU DataFrame Library. Contribute to rapidsai/cudf development by creating an account on GitHub.

#

as for dask I'm not sure how much the performance increase would be

upbeat cradle Jan 20, 2021, 1:36 PM

#

thanks, seems like a really useful tool. Using servers at my work for this though, GPU is a bit pants on it

austere swift Jan 20, 2021, 1:37 PM

#

dask is meant to be really scalable

#

so if you have a lot of servers at your work you can use it to have it run on clusters

pure pond Jan 20, 2021, 3:09 PM

#

Hey, how do I make an empty numpy array? I need an object to store values, then I'll iterate over files and append to the same array, so I need it to start with no values

#

I can only find things about making arrays with 0's or whatever already in. I just want it like how you can make an empty list a = []

#

its just a 1d array

ashen sable Jan 20, 2021, 3:33 PM

#

guys i am getting this error

#

AttributeError: module 'tensorflow' has no attribute 'reset_default_graph'

#

any help ?

iron aspen Jan 20, 2021, 3:34 PM

#

pure pond Hey, how do I make an empty numpy array? I need an object to store values, then ...

just pass the empty list into np.array() like you would do with nonempty ones. But considering that you are to append elements, I think built-in list will be a better choice. Convert it to numpy array afterwards if you want numpy features.

pure pond Jan 20, 2021, 3:36 PM

#

I have to extract numpy arrays from the files I'm working with, theres an i/o package handling that. I found what to do though, I was doing np.empty(1) thinking it would give me a 1d array, but apparently you're supposed to do (0).
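
A sketch of both approaches from this exchange, with made-up chunks standing in for the arrays read from files. `np.empty(0)` has shape `(0,)`, while `np.empty(1)` already holds one uninitialised value.

```python
import numpy as np

chunks = [np.array([1.0, 2.0]), np.array([3.0])]

# Start from a zero-length 1D array and append to it.
# np.append returns a new array each time, so it copies on every call.
values = np.empty(0)
for chunk in chunks:
    values = np.append(values, chunk)

# Alternative: collect the pieces in a list and join them once at the end.
combined = np.concatenate(chunks)

print(values, combined)
```

For many files the second form is usually faster, because the data is copied once rather than on every iteration.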

iron aspen Jan 20, 2021, 3:37 PM

#

cool

hollow scarab Jan 20, 2021, 4:06 PM

#

i have a 260 row data, and I want to plot it into a barchart, but my issue is that I need the number in the 1. row and then from the 2. row to the x. row (variable) grouped up

#

is this possible?

#

so like 1. row and then 2-x. row grouped stacked on each other as a bar chart

#

📎 unknown.png

#

and that for 2 different columns

#

kinda like the excel chart

mint palm Jan 20, 2021, 5:45 PM

#

Has deep learning already hit its peak, or is there still time for it to boom, or is it somewhere in between? What do you guys think?

nova widget Jan 20, 2021, 6:19 PM

#

@mint palm it holds an eternal amount of potential and most is yet to come. But, it's not a "magic solution", and no singularity event, yet. You can solve a lot of problems in a more efficient way than using deep learning.

mint palm Jan 20, 2021, 6:22 PM

#

it's just the beginning i think

lapis sequoia Jan 20, 2021, 6:23 PM

#

Hello guys,