#data-science-and-ml | Python | Page 415

timid kiln Jun 27, 2022, 7:04 PM

#

Much appreciated. I'll take some time to absorb your suggestion. 🙂

wooden sail Jun 27, 2022, 7:05 PM

#

here's a MWE

#

import numpy as np

def find_intersection(a, b, c, d):
    r'''
    function that finds the intersection between two line segments. one segment
    is defined by the points a, b; and the other, by c, d.

    solves a + s*(b - a) = c + t*(d - c) for x = (s, t). returns the
    intersection point, or [inf, inf] if the lines are parallel or the
    crossing lies outside either segment.
    '''

    y = c - a
    A = np.zeros((2,2))
    A[:,0] = b - a #direction of the first segment
    A[:,1] = c - d #negated direction of the second segment
    detA = A[0,0]*A[1,1] - A[0,1]*A[1,0]
    if detA == 0:
        return np.ones((2))*np.inf #intersection at infinity
    else:
        x = np.linalg.inv(A).dot(y)
        if (0 <= x[0] <= 1) and (0 <= x[1] <= 1):
            return a + x[0]*(b-a) #valid intersection
        else:
            return np.ones((2))*np.inf #intersection out of segment

a = np.array([0,0])
b = np.array([1,0])
c = np.array([0,-1])
d = np.array([1,1])

p = find_intersection(a,b,c,d)
print(p)

#

the result is

[0.5 0. ]

as expected

#

i made it so it returns [inf, inf] if the matrix is not invertible (that means the lines are parallel. either they never touch, or they are the same line. it's a degenerate case)

#

i guess i forgot to check whether the entries of x are in the valid interval [0,1]
edit* there we go
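
For reference, the linear system the function above solves can be written out as follows. This is a sketch of the derivation, using s and t as names for the two entries of x; those names are not in the original code.

```latex
a + s\,(b - a) = c + t\,(d - c)
\;\Longleftrightarrow\;
\begin{bmatrix} b - a & c - d \end{bmatrix}
\begin{bmatrix} s \\ t \end{bmatrix} = c - a,
\qquad 0 \le s \le 1,\quad 0 \le t \le 1 .
```

For the example points, the matrix has columns (1, 0) and (-1, -2) and the right-hand side is (0, -1), which gives s = t = 1/2 and the point a + s(b - a) = (0.5, 0).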

misty flint Jun 27, 2022, 7:11 PM

#

here is my daily complaint about aws

steady basalt Jun 27, 2022, 7:12 PM

#

Here’s my daily I don’t use cloud computing

misty flint Jun 27, 2022, 7:12 PM

#

aws grumpchib

wooden sail Jun 27, 2022, 7:12 PM

#

here's my daily day

misty flint Jun 27, 2022, 7:13 PM

#

steady basalt Here’s my daily I don’t use cloud computing

its the only way i can get my model deployed at work

#

otherwise itd be useless

#

kekHands

steady basalt Jun 27, 2022, 7:13 PM

#

Here’s my daily thanks edd for all ur help X

#

Now help me with eigenvectors!

wooden sail Jun 27, 2022, 7:13 PM

#

oof

#

i have like 30 mins

#

do you have any specific questions

misty flint Jun 27, 2022, 7:13 PM

#

i think the only person who hates aws as much as me here is Stel

#

ID_BoomKek

steady basalt Jun 27, 2022, 7:14 PM

#

No not yet

#

Meant later

#

Before that I have to somehow get thru the numpy

misty flint Jun 27, 2022, 7:14 PM

#

edd can you learn aws and then teach me please

wooden sail Jun 27, 2022, 7:14 PM

#

tomorrow might be a good day to help, i have some dead time in between meetings

steady basalt Jun 27, 2022, 7:14 PM

#

I’m too stuck to advance

misty flint Jun 27, 2022, 7:14 PM

#

ID_BoomKek

#

jk

steady basalt Jun 27, 2022, 7:14 PM

#

wooden sail tomorrow might be a good day to help, i have some dead time in between meetings

Would appreciate, it’s more math coding

misty flint Jun 27, 2022, 7:14 PM

#

RunFail

steady basalt Jun 27, 2022, 7:14 PM

#

Pure numpy no functions

wooden sail Jun 27, 2022, 7:14 PM

#

absolutely lovely

steady basalt Jun 27, 2022, 7:14 PM

#

Allowed

#

It’s a date then..

misty flint Jun 27, 2022, 7:15 PM

#

eww

#

i have to run

#

RunFail

steady basalt Jun 27, 2022, 7:15 PM

#

They’re gona make me translate the panda

wooden sail Jun 27, 2022, 7:15 PM

#

check out what i shared above. nice way to find the intersection of line segments by inverting a matrix

steady basalt Jun 27, 2022, 7:15 PM

#

In python

fallen portal Jun 27, 2022, 7:20 PM

#

good morning. i'm wondering if someone can help me with a question regarding dask

#

i turned a csv into many parquet files using pandas, and when i read these parquet files using Dask i can do basic operations such as .head() and .tail(), but when i try to do other things like operations on a column i'm getting a ValueError: Not all divisions are known, can't align partitions. Please use `set_index` to set the index. When I do .index on the dataframe it shows that i do have an index, but known_divisions is False. I'm not really sure how divisions play into Dask or its indexing structure, or why i'm receiving this error. Any thoughts?

#

The index on the files is just a basic 0-N index created by Pandas (I didn't specify a column)

steady basalt Jun 27, 2022, 8:06 PM

#

A server?

serene scaffold Jun 27, 2022, 8:13 PM

#

20 million rows isn't much on the scale that a lot of data science is done these days. are you trying to pick a particular flavor of SQL?

timid kiln Jun 27, 2022, 8:13 PM

#

wooden sail ```py import numpy as np def find_intersection(a, b, c, d): r''' functi...

Thank you! I came back to report that this: https://www.w3schools.com/python/ref_set_intersection.asp won't work. But as I re-read the other person's post, they indicated that I would have to include each and every point. So one, I did it wrong, and two, I won't be calculating each and every point. At least, that's not desirable for this process. 🙂


steady basalt Jun 27, 2022, 8:14 PM

#

Csv

timid kiln Jun 27, 2022, 8:14 PM

#

wooden sail ```py import numpy as np def find_intersection(a, b, c, d): r''' functi...

Works perfectly of course. BTW, what does 'MWE' mean?

misty flint Jun 27, 2022, 8:15 PM

#

what are you doing with said data

#

is it more read heavy or write heavy

#

analytical vs. transactional

serene scaffold Jun 27, 2022, 8:16 PM

#

Rex asking the big questions over here

misty flint Jun 27, 2022, 8:16 PM

#

something something ACID vs. CAP

misty flint Jun 27, 2022, 8:16 PM

#

serene scaffold Rex asking the big questions over here

kekHands

#

ehh you could probably get away with most things as long as it can store that many rows

#

since youre not trying to put anything into prod

#

so whatever youre most familiar with/most interested in learning

#

i recommend mongo but thats me

#

~~do it~~

#

you are a student right? you can get access to mongodb atlas too

#

through the student dev pack

serene scaffold Jun 27, 2022, 8:21 PM

#

if you're planning to do sentiment analysis, and you don't need to do any complicated queries or transactions (ie, the data is just there for you to feed into a model as-is), a text file should be fine

misty flint Jun 27, 2022, 8:21 PM

#

serene scaffold if you're planning to do sentiment analysis, and you don't need to do any compli...

but then you cant do RDD Stel!

#

~~resume-driven development~~

#

RunFail

serene scaffold Jun 27, 2022, 8:22 PM

#

I would be more concerned about how easy it is to feed the data into the model (a heavily nested JSON is not that), and not losing the data.

misty flint Jun 27, 2022, 8:23 PM

#

also mongo has easy ways to query json stuff btw

#

and its aggregation pipeline is pretty powerful

#

but yeah you can go old school with txt files ig

#

Oopsies

#

so it just depends on if you want to learn a new tool or not. up to you

iron basalt Jun 27, 2022, 8:25 PM

#

You probably don't need a database. If you design your system correctly with a separate IO layer, then you can swap that out later to use a database without having to change anything else in the system.

#

And I would then start with a simple IO layer and only make an IO layer for databases when you actually need one.

#

Do you control the file format? JSON / what is in it?

#

Looks pretty straight forward. If you want something more simple / you get to decide the format, then I would recommend a even more simple flat file format.

#

JSON is often overkill.

#

But yeah, if this works, just go with it for now. Databases later when actually needed.

narrow saddle Jun 27, 2022, 8:31 PM

#

What does this mean?

NumPy uses C-order indexing. That means that the last index usually represents the most rapidly changing memory location, unlike Fortran or IDL, where the first index represents the most rapidly changing location in memory. This difference represents a great potential for confusion.
It's from the numpy docs. https://numpy.org/doc/stable/user/basics.indexing.html
what do they mean by 'rapidly changing memory location'
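
A small sketch that may make the phrase concrete: "most rapidly changing" refers to which index moves through memory in the smallest steps. NumPy exposes this as `strides`, the number of bytes to jump when an index increases by one. The array below is made up for illustration.

```python
import numpy as np

# 2 rows x 3 columns of 8-byte integers; C order is NumPy's default layout
c_arr = np.arange(6, dtype=np.int64).reshape(2, 3)
# same values, stored in Fortran (column-major) order
f_arr = np.asfortranarray(c_arr)

# strides = bytes to step in memory when each index increases by one
print(c_arr.strides)  # (24, 8): last index moves 8 bytes, so it changes fastest
print(f_arr.strides)  # (8, 16): first index moves 8 bytes, so it changes fastest

# order in which the values actually sit in memory
print(c_arr.ravel(order="K"))  # [0 1 2 3 4 5]
print(f_arr.ravel(order="K"))  # [0 3 1 4 2 5]
```

Both arrays index the same way, so `c_arr[1, 2]` and `f_arr[1, 2]` are equal; only the layout in memory differs, which mostly matters for performance and for exchanging data with Fortran-ordered code.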

iron basalt Jun 27, 2022, 8:32 PM

#

Well, nothing can be done about how bad Twitter's API gets.

#

Databases are not really the thing for solving overly complex JSON, they can do it, but there is so much else that they do and add as overhead because of it. There are libraries for wrangling more complex JSON on their own.

junior lintel Jun 27, 2022, 8:52 PM

#

Hi guys, I don't know if this is the right section. In pandas I'm trying to figure out how to check with a "relative position" index without iterating through the whole dataframe.
A wrong code solution would be:
`
def foo_func(df):
    index = df.tail(1).index[0]
    dfCheck = df[index-3:index]
    mask1 = dfCheck["A"].head(1) == True
    mask2 = dfCheck["B"] > 0
    if not dfCheck[mask1].empty and dfCheck[mask2].empty:
        df.iloc[index] = True

df.rolling(3).apply(foo_func)
`
So the goal here is to check if there is a True in the -3 relative position and at least one value > 0 in the -3:index portion. Any idea how I can translate this? Thank you all very much for any help
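
One possible vectorized reading of that goal, as a rough sketch rather than a definitive answer: for each row i, flag it when A was True three rows earlier and at least one B in rows i-3 to i-1 is positive. The sample data and the `flag` column name are made up for illustration, and the exact window may need adjusting to the real intent.

```python
import pandas as pd

# hypothetical sample data with the two columns from the question
df = pd.DataFrame({
    "A": [True, False, False, False, True, False, False, False],
    "B": [0, 0, 2, 0, 0, 0, 0, 0],
})

# value of A three rows back (rows without three predecessors count as False)
a_three_back = df["A"].shift(3, fill_value=False)

# is any B > 0 among the three previous rows (i-3, i-2, i-1)?
b_positive = (df["B"] > 0).astype(int)
b_any_prev3 = b_positive.rolling(3).max().shift(1).fillna(0).astype(bool)

# both conditions at once, no Python-level loop over rows
df["flag"] = a_three_back & b_any_prev3
print(df)
```

`shift` handles the fixed relative position and `rolling` handles the window, so no `apply` over the whole frame is needed.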

severe karma Jun 27, 2022, 8:57 PM

#

Hi guys, does anyone have experience with parsing SEC data (XML) using this library/API (https://arelle.org/arelle/documentation/xbrl-database/open-database/) ? I am trying to recover the XML form back into the table format of https://www.sec.gov/ix?doc=/Archives/edgar/data/0000315189/000155837021001774/de-20210131x10q.htm# and store it in my database. Does anyone know how I can find the 'reference' or 'calculation' section using this package? Thx


steady basalt Jun 27, 2022, 9:01 PM

#

severe karma Hi guys, anyone have experience with parsing SEC data (XML) using this library/A...

I parse xml using pythons parser

odd meteor Jun 27, 2022, 10:27 PM

#

severe karma Hi guys, anyone have experience with parsing SEC data (XML) using this library/A...

Hello, helloworld

You can try asking your questions in #databases channel as well.

hollow sentinel Jun 27, 2022, 10:56 PM

#

https://youtu.be/p_tpQSY1aTs

YouTube

Krish Naik

Live- Implementation of End To End Kaggle Machine Learning Project ...

code: https://github.com/krishnaik06/Car-Price-Prediction
Dataset: https://www.kaggle.com/nehalbirla/vehicle-dataset-from-cardekho

#

nice example of an end to end project

vernal sail Jun 28, 2022, 1:07 AM

#

i wanted to learn ai

serene scaffold Jun 28, 2022, 1:12 AM

#

vernal sail i wanted to learn ai

do you know what AI is?

vernal sail Jun 28, 2022, 1:12 AM

#

yeah

serene scaffold Jun 28, 2022, 1:12 AM

#

what

vernal sail Jun 28, 2022, 1:12 AM

#

artificial inelegance

serene scaffold Jun 28, 2022, 1:13 AM

#

yes, but what is that

vernal sail Jun 28, 2022, 1:13 AM

#

basically a human mind inside a computer

#

you can’t learn that stuff

#

because that’s stupid saying you want to learn ai

#

oh wait

#

ohhhhh

#

OHHHHH

#

😔

serene scaffold Jun 28, 2022, 1:14 AM

#

vernal sail basically a human mind inside a computer

that's not what it is at all.

vernal sail Jun 28, 2022, 1:14 AM

#

oh

#

well

#

i’m ignorant in the knowledge of ai

#

please enlighten me

calm thicket Jun 28, 2022, 1:18 AM

#

<@&831776746206265384> spam across multiple channels

delicate apex Jun 28, 2022, 1:19 AM

#

calm thicket <@&831776746206265384> spam across multiple channels

unrepentant and warned, too: #community-meta message

serene scaffold Jun 28, 2022, 1:37 AM

#

vernal sail please enlighten me

in general, AI is when you have programs that solve a knowledge problem. or in other words, they emulate the application of knowledge in some way

#

in practice, it's usually understood to be a body of programming techniques that one would use when an exact sequence of steps can't arrive at an exact solution for a problem

vernal sail Jun 28, 2022, 1:42 AM

#

oh

#

that pretty complex but i get why people learn about it

#

i see it’s interest

serene scaffold Jun 28, 2022, 1:43 AM

#

vernal sail that pretty complex but i get why people learn about it

https://tenor.com/view/unlimited-power-star-wars-gif-10270127


vernal sail Jun 28, 2022, 1:44 AM

#

lmfao

#

also

#

i’m not in trouble for doing this right

misty flint Jun 28, 2022, 4:09 AM

#

CLe_FeelsEvilLurk

wooden sail Jun 28, 2022, 4:14 AM

#

timid kiln Works perfectly of course. BTW, what does 'MWE' mean?

minimum working example

weary ridge Jun 28, 2022, 6:08 AM

#

is there anyone who has used pytesseract

#

and opencv

#

ocr

#

lemme know

stoic viper Jun 28, 2022, 7:13 AM

#

Hey.
We are working on a machine Learning model. we use xgboost and want to try a blended model between xgboost and lightgbm. I have no idea how i should start on this any tips?
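
One common starting point for a blend is a weighted average of the two models' predictions, with the weight chosen on a held-out validation set. The sketch below assumes a binary classification task and uses made-up probability arrays as stand-ins for the xgboost and lightgbm outputs; the function names are hypothetical.

```python
import numpy as np

def log_loss(y_true, p):
    """Binary cross-entropy, clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def best_blend_weight(y_val, p_a, p_b, steps=101):
    """Grid-search w in [0, 1] for the blend w * p_a + (1 - w) * p_b."""
    weights = np.linspace(0.0, 1.0, steps)
    losses = [log_loss(y_val, w * p_a + (1 - w) * p_b) for w in weights]
    best = int(np.argmin(losses))
    return float(weights[best]), losses[best]

# hypothetical validation labels and predicted probabilities from two models
y_val = np.array([1, 0, 1, 1, 0, 0])
p_xgb = np.array([0.9, 0.2, 0.6, 0.7, 0.4, 0.1])   # stand-in for xgboost output
p_lgbm = np.array([0.8, 0.3, 0.7, 0.6, 0.2, 0.2])  # stand-in for lightgbm output

w, loss = best_blend_weight(y_val, p_xgb, p_lgbm)
blended = w * p_xgb + (1 - w) * p_lgbm
print(w, loss)
```

Because w = 0 and w = 1 are both on the grid, the blend can never score worse on the validation set than either model alone. The weight should still be checked on data it was not tuned on.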

steady basalt Jun 28, 2022, 8:03 AM

#

weary ridge and opencv

that

steady basalt Jun 28, 2022, 8:03 AM

#

vernal sail oh

u wana learn how to code a computers brain?

winged jasper Jun 28, 2022, 8:32 AM

#

Hey, already posted in general discussion but I guess I could receive more help here:
Hey guys, I'm an engineer that has some Python knowledge (I can easily follow documentation, find my way through a repository etc) and I need some help finding the best resources to help me build the following small project for a university course:

- User takes a photo with the phone's camera and uploads it (or is automatically uploaded to the pc that the phone is connected to with a cable / wifi)
- The photo goes through a Python API, or is processed by a script that reads the photo, extracts the answers to a quiz and then reports the user's final score
- The file the user will take a photo of looks similar to the one below (it might change just a little bit)

I would really appreciate some help, I want to learn this myself. I know some JS, I know some python, linux, etc. And I understand AI/Machine learning, so I guess I will have to use OpenCV with some other libraries to build this pipeline. Looking forward to some constructive words 🙂

shell panther Jun 28, 2022, 10:48 AM

#

hello, I want to build an open domain chatbot. What's the best python framework for that in your opinion?

amber perch Jun 28, 2022, 11:37 AM

#

not working

#

steady basalt Jun 28, 2022, 11:47 AM

#

shell panther hello, I want to build an open domain chatbot. What's the best python framework ...

What’s open domain

neon imp Jun 28, 2022, 12:23 PM

#

winged jasper Hey, already posted in general discussion but I guess I could receive more help ...

#1 Recommend the Gmail API.

#

Quick, easy way to have users upload pictures is to simply have them email them to you.

#

Lots of ways to do that though.

#

#2 First build your "user uploads picture to you, picture arrives, user gets a score" workload.

#

Do the machine learning last.

winged jasper Jun 28, 2022, 12:30 PM

#

Yeah, thing is those are the least important parts of the code 😄 I can also do it by manually uploading the images to a directory because users will have to pass the sheet manually to me anyway. And the machine part thus becomes the most important and hard part 😦

remote storm Jun 28, 2022, 12:30 PM

#

amber perch

Have you uploaded the xlsx file in your jupyter notebook

#

You can either upload in jupyter's home page or link up the path in your PC for that

mighty condor Jun 28, 2022, 2:13 PM

#

Is this where people might know about pandas?

lapis sequoia Jun 28, 2022, 2:21 PM

#

Hi guys how can I plot a heatmap from a data frame pandas?

#

I have a data frame with 120k rows and 3 columns: customer, expenditure-type and ranking, ranking goes from 1 to 5 and indicates the amount spent from each customer (1=very little etc..) per expenditure type, i want to plot these data into a heatmap how can I do?

serene scaffold Jun 28, 2022, 2:26 PM

#

lapis sequoia Hi guys how can I plot a heatmap from a data frame pandas?

I've never plotted a heatmap before. does this help? https://stackoverflow.com/questions/12286607/making-heatmap-from-pandas-dataframe


wooden sail Jun 28, 2022, 2:37 PM

#

what do you want the heatmap to show?

steady basalt Jun 28, 2022, 2:49 PM

#

Should we create a data analysis channel I see we get daily how to pandas

steady basalt Jun 28, 2022, 2:49 PM

#

lapis sequoia I have a data frame with 120k rows and 3 columns: customer, expenditure-type and...

Seaborn ?

wooden sail Jun 28, 2022, 2:56 PM

#

seaborn was also gonna be my suggestion, but more details can be given based on what info they want the heatmap to show
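
As a sketch of one way this could go, assuming the heatmap should show how many customers fall into each ranking per expenditure type: the long table is first reshaped into a matrix, and the matrix is what gets plotted. The sample rows and the column name `expenditure_type` are made up for illustration.

```python
import pandas as pd

# hypothetical sample with the three columns described in the question
df = pd.DataFrame({
    "customer": ["c1", "c1", "c2", "c2", "c3", "c3"],
    "expenditure_type": ["food", "travel", "food", "travel", "food", "travel"],
    "ranking": [1, 5, 2, 5, 2, 3],
})

# rows = expenditure type, columns = ranking, cell = number of customers
counts = pd.crosstab(df["expenditure_type"], df["ranking"])
# make sure every ranking 1..5 has a column, even if nobody got it
counts = counts.reindex(columns=range(1, 6), fill_value=0)
print(counts)

# to draw it (needs seaborn and matplotlib installed):
# import seaborn as sns
# sns.heatmap(counts, annot=True, fmt="d")
```

With 120k rows the reshaping step matters more than the plotting call, since a heatmap needs a small two-dimensional grid rather than one row per customer.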

mighty condor Jun 28, 2022, 3:04 PM

#

Hi everyone, I am trying to save an output of applying a function to a dataframe to a new column in that dataframe, and when I don't save it as a new column, I am seeing the correct response, but when I go to save it in a new column, it is filling it with NaNs

wooden sail Jun 28, 2022, 3:05 PM

#

can you show a code snippet

mighty condor Jun 28, 2022, 3:05 PM

#

so this is the correct output, all 1's

Screen_Shot_2022-06-28_at_11.05.31_AM.png

#

then I do this: `df['newcol'] = df.loc["Beg-4"].apply(foos.Beg_3)`

#

Screen_Shot_2022-06-28_at_11.06.28_AM.png

#

but it's filled with NaNs instead of the 1's

wooden sail Jun 28, 2022, 3:08 PM

#

Beg-4 is a column?

mighty condor Jun 28, 2022, 3:08 PM

#

a row

wooden sail Jun 28, 2022, 3:08 PM

#

then the issue is that you're not applying the function to all rows, most likely?

mighty condor Jun 28, 2022, 3:08 PM

#

just 1 row, the row named Beg-4

wooden sail Jun 28, 2022, 3:09 PM

#

and what do you want pandas to put into the other rows you didn't specify?

mighty condor Jun 28, 2022, 3:09 PM

#

oh, I just want it to take all the outputs, and put them in a new column...oh I guess I should put them in a new row?

#

I just want to save them somewhere in a df

#

actually ideally a new dataframe

#

that I would then add more stuff to once I apply different functions to the other rows

wooden sail Jun 28, 2022, 3:10 PM

#

that would make more sense, since it seems you're applying the function to the elements of a row. the number of outputs is equal to the length of a row. if you have more columns than rows, putting this into a column will end up with many unspecified values

mighty condor Jun 28, 2022, 3:10 PM

#

each row has its own function

wooden sail Jun 28, 2022, 3:10 PM

#

you could make a new df if you like

#

or put the output as a row

mighty condor Jun 28, 2022, 3:11 PM

#

ah so maybe if I view the whole dataframe some of it would have saved as not NaNs?

#

let me look

#

hmmm, no, they're all Nans

#

so it didn't fill some with nans, but all

wooden sail Jun 28, 2022, 3:12 PM

#

i'm not sure what pandas' default behavior is, but in any case you tried to put a collection of values somewhere it doesn't fit 😛

#

different functions will handle that error in different ways
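
For what it's worth, the all-NaN column is consistent with how pandas aligns labels on assignment. A row taken with `.loc` is a Series labelled by the column names, and assigning it as a new column matches those labels against the row labels. The small frame and the lambda below are stand-ins, since the real data and `foos.Beg_3` are not shown.

```python
import pandas as pd

# hypothetical frame: rows are labelled "Beg-3"/"Beg-4", columns are "x"/"y"/"z"
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["Beg-3", "Beg-4"],
    columns=["x", "y", "z"],
)

# stand-in for foos.Beg_3; applied to every element of the row
row_result = df.loc["Beg-4"].apply(lambda v: 1)
print(row_result.index.tolist())  # labelled by column name, not by row name

# assigning as a column aligns on the row labels; none match, so all NaN
as_column = df.copy()
as_column["newcol"] = row_result
print(as_column)

# keeping the result in its own frame, one row per function, avoids that
results = row_result.to_frame(name="Beg-4").T
print(results)
```

Further rows can then be added to `results` as the other per-row functions are applied.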

mighty condor Jun 28, 2022, 3:14 PM

#

so this is how I create the new dataframe, right?

Screen_Shot_2022-06-28_at_11.14.17_AM.png

wooden sail Jun 28, 2022, 3:15 PM

#

wasn't newdf already a new dataframe? (i'm asking, i've never used pandas)

#

the code looks ok, just making sure the newdff line isn't redundant

mighty condor Jun 28, 2022, 3:16 PM

#

Screen_Shot_2022-06-28_at_11.16.17_AM.png

#

this is what newdf looks like, I think it's not an actual dataframe?

wooden sail Jun 28, 2022, 3:17 PM

#

try type(newdf) and see what it prints

#

just for peace of mind

mighty condor Jun 28, 2022, 3:17 PM

#

Screen_Shot_2022-06-28_at_11.17.26_AM.png

#

Screen_Shot_2022-06-28_at_11.18.05_AM.png

wooden sail Jun 28, 2022, 3:18 PM

#

aha

#

aight, that's your answer

mighty condor Jun 28, 2022, 3:18 PM

#

so it was a series?

wooden sail Jun 28, 2022, 3:18 PM

#

yeah

mighty condor Jun 28, 2022, 3:18 PM

#

and now it's a dataframe?

wooden sail Jun 28, 2022, 3:18 PM

#

whatever that is 😛 yep

mighty condor Jun 28, 2022, 3:19 PM

#

ty so much ❤️

vernal sail Jun 28, 2022, 3:50 PM

#

steady basalt u wana learn how to code a computers brain?

sure

tacit basin Jun 28, 2022, 4:14 PM

#

steady basalt u wana learn how to code a computers brain?

Do computers have brain ? :)

steady basalt Jun 28, 2022, 4:21 PM

#

vernal sail sure

How good are u at coding

#

in c++ 😅

vernal sail Jun 28, 2022, 4:22 PM

#

none

#

not a thing

steady basalt Jun 28, 2022, 4:22 PM

#

Damn that sucks u gona have to learn how

vernal sail Jun 28, 2022, 4:22 PM

#

and pretty bad in normal python

steady basalt Jun 28, 2022, 4:22 PM

#

Get better then

vernal sail Jun 28, 2022, 4:22 PM

#

ez

#

just got better

steady basalt Jun 28, 2022, 4:23 PM

#

Well u asked how to and the answer is first be able to code

#

What can u do with python

misty flint Jun 28, 2022, 5:51 PM

#

question

#

how would you make use of an ontological model in a business setting

#

these tend to be represented with knowledge graphs

#

okay, now what?

#

kekHands

bold timber Jun 28, 2022, 6:25 PM

#

Hi, I have a question: how do I show all the numbers on the x axis in plotly?

mild dirge Jun 28, 2022, 6:27 PM

#

plt.xticks(range(1, 13)) @bold timber ?

wooden sail Jun 28, 2022, 6:28 PM

#

i think when you create the fig, you can give the parameter x = [some_list_of_tick_values]

bold timber Jun 28, 2022, 6:28 PM

#

doesn't work

wooden sail Jun 28, 2022, 6:28 PM

#

so maybe x = range(1,13)

mild dirge Jun 28, 2022, 6:29 PM

#

This is plt.bar() right?

#

@bold timber

wooden sail Jun 28, 2022, 6:30 PM

#

they say plotly there, or do you shorten pyplot and plotly the same way?

mild dirge Jun 28, 2022, 6:30 PM

#

Oh plotly

#

yeah that doesn't work then haha

wooden sail Jun 28, 2022, 6:30 PM

#

should be something like fig = some_plotly_function(some_datafram, x = range(1,13), other_params)

bold timber Jun 28, 2022, 6:36 PM

#

wooden sail should be something like fig = some_plotly_function(some_datafram, x = range(1,1...

ok thank you

brave sand Jun 28, 2022, 6:39 PM

#

how do I find the win rate of my algorithm?

wooden sail Jun 28, 2022, 6:40 PM

#

what's the algorithm

brave sand Jun 28, 2022, 6:40 PM

#

qmix

#

https://arxiv.org/pdf/1803.11485.pdf

steady basalt Jun 28, 2022, 6:41 PM

#

man i fkin love plotly so beautiful

brave sand Jun 28, 2022, 6:41 PM

#

my professor wants me to find the win rate of this algorithm how do I do that?

wooden sail Jun 28, 2022, 6:41 PM

#

ah, some flavor of learning

#

well, it's montecarlo time

steady basalt Jun 28, 2022, 6:42 PM

#

brave sand my professor wants me to find the win rate of this algorithm how do I do that?

jesus that paper was a facefull of math

wooden sail Jun 28, 2022, 6:42 PM

#

generate a huge amount of scenarios with different starting conditions and see how many times it wins
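
A minimal sketch of that idea, assuming the environment can report whether an episode was won. `run_episode` here is a hypothetical placeholder that draws a random outcome; in practice it would run the trained agents in the real environment with the given seed.

```python
import random

def run_episode(seed):
    """Hypothetical stand-in: play one episode and return True on a win.

    In practice this would reset the environment with the given seed,
    run the trained agents to the end, and read the environment's win flag.
    """
    rng = random.Random(seed)
    return rng.random() < 0.7  # placeholder outcome, not a real policy

def estimate_win_rate(n_episodes, base_seed=0):
    """Fraction of won episodes over n_episodes differently seeded runs."""
    wins = sum(run_episode(base_seed + i) for i in range(n_episodes))
    rate = wins / n_episodes
    # standard error of a proportion, a rough gauge of the estimate's noise
    stderr = (rate * (1 - rate) / n_episodes) ** 0.5
    return rate, stderr

rate, stderr = estimate_win_rate(2000)
print(f"win rate ~ {rate:.3f} +/- {stderr:.3f}")
```

The more episodes are run, the smaller the standard error, so the number of runs can be picked from how precise the reported win rate needs to be.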

steady basalt Jun 28, 2022, 6:42 PM

#

i cannot read it

brave sand Jun 28, 2022, 6:42 PM

#

wooden sail generate a huge amount of scenarios with different starting conditions and see h...

could you elaborate?

#

https://github.com/quantumiracle/Popular-RL-Algorithms
here is the qmix code


wooden sail Jun 28, 2022, 6:42 PM

#

not really

brave sand Jun 28, 2022, 6:42 PM

#

steady basalt jesus that paper was a facefull of math

same here, I couldn't understand any of it

steady basalt Jun 28, 2022, 6:43 PM

#

wooden sail not really

skimming thru it its 50% symbols

wooden sail Jun 28, 2022, 6:43 PM

#

you're training this yourself?

steady basalt Jun 28, 2022, 6:43 PM

#

and proof

wooden sail Jun 28, 2022, 6:43 PM

#

steady basalt skimming thru it its 50% symbols

that's how papers should be

steady basalt Jun 28, 2022, 6:43 PM

#

ummm

#

it makes it hard to follow for non mathematicians

brave sand Jun 28, 2022, 6:43 PM

#

wooden sail you're training this yourself?

I trained it on my local machine

steady basalt Jun 28, 2022, 6:43 PM

#

maybe put proof in the appendix?

brave sand Jun 28, 2022, 6:44 PM

#

or at least I ran this algorithm on my local machine and I got an output pickle file

#

how do I interpret this?

wooden sail Jun 28, 2022, 6:44 PM

#

brave sand I trained it on my local machine

the training procedure itself already includes some form of training error and validation error

#

the validation error of the final epoch is what you want

#

what exactly that looks like, i can't say

brave sand Jun 28, 2022, 6:44 PM

#

am I allowed to send files here?

steady basalt Jun 28, 2022, 6:45 PM

#

i wonder if i just pin a massive 20x20 table on my wall with every single math symbol i ll be able to decipher papers

#

after a few weeks

arctic wedgeBOT Jun 28, 2022, 6:45 PM

#

Hey @brave sand!

It looks like you tried to attach file type(s) that we do not allow (). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

steady basalt Jun 28, 2022, 6:46 PM

#

Reading this paper ive noticed their use of the triple equals sign, what is that?

#

is that just equating functions?

brave sand Jun 28, 2022, 6:46 PM

#

wooden sail the training procedure itself already includes some form of training error and v...

so I have the model results, where would the validation error of the final epoch be?

wooden sail Jun 28, 2022, 6:47 PM

#

the usual procedure is that lots of cool, cutting edge, and largely useless results are produced in academia and research. then the results slowly trickle down as either the authors make the code open-source or people make open-source implementations of the results. then people who know nothing about them start using them more widely through APIs

wooden sail Jun 28, 2022, 6:47 PM

#

brave sand so I have the model results, where would the validation error of the final epoch...

i have absolutely no idea 😛 you can read how the output is stored in their repo

steady basalt Jun 28, 2022, 6:47 PM

#

wooden sail the usual procedure is that lots of cool, cutting edge, and largely useless resu...

for my thesis ill be forced to paste the code at the end

#

not exactly deep coding though anyone can do it

wooden sail Jun 28, 2022, 6:48 PM

#

that's rather uncommon

steady basalt Jun 28, 2022, 6:48 PM

#

≡ means identical to

#

umm

#

interesting

#

1/2 doesnt = 2/4

#

but rather is identical to

#

why didnt they teach this in school

wooden sail Jun 28, 2022, 6:49 PM

#

because it's a useless distinction

steady basalt Jun 28, 2022, 6:49 PM

#

We consider a partially observable scenario in which each agent draws individual observations $z \in Z$ according to observation function $O(s, a) : S \times A \to Z$. Each agent has an action-observation history $\tau^a \in T \equiv (Z \times U)^*$, on which it conditions a stochastic policy $\pi^a(u^a \mid \tau^a) : T \times U \to [0, 1]$. The joint policy $\pi$ has a joint action-value function: $Q^\pi(s_t, u_t) = \mathbb{E}_{s_{t+1:\infty},\, u_{t+1:\infty}}[R_t \mid s_t, u_t]$, where $R_t = \sum_{i=0}^{\infty} \gamma^i r_{t+i}$ is the discounted return.

wooden sail Jun 28, 2022, 6:49 PM

#

they represent the same number, so there is no case in standard arithmetic where it matters

steady basalt Jun 28, 2022, 6:50 PM

#

so reading that i saw that symbol and thought

#

wtf?

#

what is the : before infinite for?

#

isnt that usually a arrow?

wooden sail Jun 28, 2022, 6:51 PM

#

where?

steady basalt Jun 28, 2022, 6:51 PM

#

2nd last line

#

they just mean 1 to inf?

wooden sail Jun 28, 2022, 6:51 PM

#

no idea

#

can you paste an image of the original text instead?

steady basalt Jun 28, 2022, 6:52 PM

#

brave sand Jun 28, 2022, 6:52 PM

#

how do you guys usually interpret pickle files?
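
A pickle file is just a serialized Python object, so a usual first step is to load it and look at its type and contents. The sketch below writes a small made-up dictionary first so that it runs on its own; what the real training output contains is not known here.

```python
import os
import pickle
import tempfile

# stand-in for the training output; the real file's contents are unknown here
path = os.path.join(tempfile.mkdtemp(), "results.pkl")
with open(path, "wb") as f:
    pickle.dump({"episode_reward": [1.0, 2.5, 3.0], "won": [False, True, True]}, f)

# only unpickle files you trust: loading can execute arbitrary code
with open(path, "rb") as f:
    obj = pickle.load(f)

print(type(obj))  # first step: find out what kind of object it is
if isinstance(obj, dict):
    for key, value in obj.items():
        print(key, type(value).__name__, len(value))
```

If the object was saved with classes from another library, that library has to be importable when the file is loaded.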

steady basalt Jun 28, 2022, 6:53 PM

#

is there any particular reason why the left side of the equation uses () and the right side uses [] ?

wooden sail Jun 28, 2022, 6:53 PM

#

yeah seems like some sort of interval, but it's difficult to say. unbeknownst to most, symbols don't have universal meanings. you'll have to hope the authors explained their notation near the beginning of the paper, or work your way from the top as you figure out how they use the symbols

steady basalt Jun 28, 2022, 6:53 PM

#

as a non math person this pisses me off

#

i often read things and have no clue what theyre saying

#

hows 1:inf normally written?

#

isnt it like

#

(1,inf]

#

or somehow

iron basalt Jun 28, 2022, 6:54 PM

#

Expected value often uses [].

wooden sail Jun 28, 2022, 6:55 PM

#

(1,inf] doesn't make it clear whether the numbers are taken from N, Z, or I

steady basalt Jun 28, 2022, 6:55 PM

#

can you explain why in that equation they use expected values and what in maths that means? I thought equations didnt do expected

wooden sail Jun 28, 2022, 6:55 PM

#

and yeah, expectation often uses [] or {}

steady basalt Jun 28, 2022, 6:55 PM

#

wooden sail (1,inf] doesn't make it clear whether the numbers are taken from N, Z, or I

What does that mean

iron basalt Jun 28, 2022, 6:55 PM

#

Wikipedia

#

Styled E in this case.

wooden sail Jun 28, 2022, 6:56 PM

#

.latex $\mathbb{N}$ are natural numbers, $\mathbb{Z}$ are the integers, and $\mathbb{I}$ are the reals

strange elbowBOT Jun 28, 2022, 6:56 PM

#

$latex.png$

steady basalt Jun 28, 2022, 6:56 PM

#

how can expected value be a thing in normal equations? ive only ever ran into that in statistics making synthetic datasets

wooden sail Jun 28, 2022, 6:56 PM

#

you're doing statistics there

steady basalt Jun 28, 2022, 6:56 PM

#

i need to read up more on exactly how expected values work on maths

wooden sail Jun 28, 2022, 6:57 PM

#

"stochastic policy" means "random policy"

steady basalt Jun 28, 2022, 6:57 PM

#

strange elbow

Z vs I?

iron basalt Jun 28, 2022, 6:57 PM

#

You can think of it as a weighted sum for simplicity often.

#

(You have probably already seen a sum in an equation)
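
To make the weighted-sum picture concrete, here is a small sketch with made-up numbers: the expected value of a discrete random variable, and the discounted return from the quoted passage truncated to a finite list of rewards.

```python
# expected value of a discrete random variable: sum of value * probability
outcomes = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]
expected = sum(x * p for x, p in zip(outcomes, probs))
print(expected)

def discounted_return(rewards, gamma):
    """R_t = sum_i gamma**i * r_{t+i}, truncated to the rewards given."""
    return sum(gamma ** i * r for i, r in enumerate(rewards))

print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```

The action-value function in the paper is the expected value of that discounted return, taken over the random future states and actions.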

wooden sail Jun 28, 2022, 6:57 PM

#

that was a typo, sorry. Z are whole numbers. I are the reals.

steady basalt Jun 28, 2022, 6:57 PM

#

after i finish my linalg instead of calculus i will take a course on notation and definitions..

wooden sail Jun 28, 2022, 6:57 PM

#

.latex $\mathbb{N}$ are natural numbers, $\mathbb{Z}$ are the integers, and $\mathbb{I}$ are the reals

strange elbowBOT Jun 28, 2022, 6:58 PM

#

$latex.png$

wooden sail Jun 28, 2022, 6:58 PM

#

sadly i can't get the bot to delete nor update the tex, so here it is again

steady basalt Jun 28, 2022, 6:58 PM

#

soooo sick of not knowing definitions

wooden sail Jun 28, 2022, 6:58 PM

#

i'm surprised these things were not mentioned in your linalg course

steady basalt Jun 28, 2022, 6:58 PM

#

these things were certainly not mentioned

wooden sail Jun 28, 2022, 6:59 PM

#

one usually defines vector spaces as "a vector space over a field", so one has to at least briefly mention fields

steady basalt Jun 28, 2022, 6:59 PM

#

is there any specific content that explains all these brackets and meanings?

wooden sail Jun 28, 2022, 6:59 PM

#

no, because they're not universal, as i said

#

you pick up math-reading abilities as part of your mathematical maturity

steady basalt Jun 28, 2022, 6:59 PM

#

wooden sail one usually defines vector spaces as "a vector space over a field", so one has t...

uhh, i mean so far its like /A/ as in length or watever

wooden sail Jun 28, 2022, 6:59 PM

#

by doing maths

steady basalt Jun 28, 2022, 6:59 PM

#

or real value

#

i cannot do maths that advanced i can learn that

iron basalt Jun 28, 2022, 7:00 PM

#

"mathematical maturity" - Huh, so i'm not the only one that uses that term.

wooden sail Jun 28, 2022, 7:00 PM

#

i just realized i still mixed up the symbols, i meant R, not I

steady basalt Jun 28, 2022, 7:00 PM

#

starting from 0 means im years away from that sorta stuff

wooden sail Jun 28, 2022, 7:00 PM

#

sorry, i'm tired

wooden sail Jun 28, 2022, 7:00 PM

#

iron basalt "mathematical maturity" - Huh, so i'm not the only one that uses that term.

isn't this a well accepted term? you can find it on papers and documents on pedagogy

iron basalt Jun 28, 2022, 7:01 PM

#

wooden sail isn't this a well accepted term? you can find it on papers and documents on peda...

I'm not sure. Seems like it would be.

wooden sail Jun 28, 2022, 7:01 PM

#

i'm sure, i was just being polite lol

iron basalt Jun 28, 2022, 7:02 PM

#

I don't think it was not polite. I just have not seen the term used in a chat room before.

steady basalt Jun 28, 2022, 7:02 PM

#

im never gona have the time on my hands to practise enough from 0 to being able to pass highschool or early degree level papers

#

that shit requires spamming it over and over

wooden sail Jun 28, 2022, 7:02 PM

#

that's how everything is learned

steady basalt Jun 28, 2022, 7:02 PM

#

yeah but i not in school anymore

#

i dont have that time

wooden sail Jun 28, 2022, 7:02 PM

#

that's also fair. consider that these people do this stuff for a living

steady basalt Jun 28, 2022, 7:03 PM

#

high school here starts at pretty easy level maths, like straight lines, surds and basic probability

#

sure

#

but after 1 year

#

it gets quite tricky

#

the trig is confusing, the calculus requires spamming and they dont even teach linalg

#

its like 99% trig

brave sand Jun 28, 2022, 7:03 PM

#

I'm in linalg right now

wooden sail Jun 28, 2022, 7:03 PM

#

linalg is good for your soul

#

probably the first bump with proof-heavy courses

steady basalt Jun 28, 2022, 7:04 PM

#

they dont teach it here

wooden sail Jun 28, 2022, 7:04 PM

#

sadness

iron basalt Jun 28, 2022, 7:05 PM

#

Linalg is so widely applicable, especially for many programs (computers compute it really well).

steady basalt Jun 28, 2022, 7:07 PM

#

yo

#

Consecutive terms of a sequence are related by $u_{n+1} = 3 - (u_n)^2$

#

dammit

#

whats the strategy to finding the 50th term?

#

like a one liner?

#

sure i could go thru them one by one but that wud take ages

wooden sail Jun 28, 2022, 7:08 PM

#

you could use a for loop, sure. if you don't want to, though, you have a problem 😛

steady basalt Jun 28, 2022, 7:08 PM

#

its a pen and paper math exam

wooden sail Jun 28, 2022, 7:08 PM

#

the formula is recursive. you have to do the math yourself on paper

#

aha

#

well

#

see if you can find a pattern

steady basalt Jun 28, 2022, 7:08 PM

#

its a small question they expect u do it in 2 lines

wooden sail Jun 28, 2022, 7:08 PM

#

maybe it telescopes nicely

steady basalt Jun 28, 2022, 7:08 PM

#

i thought hey thats easy when they asked to find third term

#

then the next q is 50th

wooden sail Jun 28, 2022, 7:09 PM

#

compute a few terms and look for the pattern

steady basalt Jun 28, 2022, 7:09 PM

#

srlsy?

wooden sail Jun 28, 2022, 7:09 PM

#

that's the whole point

steady basalt Jun 28, 2022, 7:09 PM

#

ur meant to deduce the 50th term by just doing the first 3 or 4 terms and gauging it?

wooden sail Jun 28, 2022, 7:09 PM

#

yes

iron basalt Jun 28, 2022, 7:10 PM

#

What is the first term?

#

u_1

wooden sail Jun 28, 2022, 7:11 PM

#

ngl that thing grows pretty nastily lol (if you start from 0)

steady basalt Jun 28, 2022, 7:11 PM

#

its a high school exam paper

#

i thought id have a look

#

damnnn

#

i cant

#

u1 is 2

wooden sail Jun 28, 2022, 7:12 PM

#

aha, that's the trick

iron basalt Jun 28, 2022, 7:12 PM

#

If you write it out you will see it.

wooden sail Jun 28, 2022, 7:13 PM

#

write out like 4 or 5 terms and you're done

#

you really shouldn't need more than 4

#

you either missed the pattern or forgot the parentheses

iron basalt Jun 28, 2022, 7:14 PM

#

A lesson on how the base case can completely change things.

steady basalt Jun 28, 2022, 7:14 PM

#

ill do it after food

wooden sail Jun 28, 2022, 7:15 PM

#

yeah i was 3 terms in and was like "those highschoolers are dead and buried by now"

iron basalt Jun 28, 2022, 7:15 PM

#

I recommend trying u_1 is 3 or 4 to see what else happens in those cases.
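
A quick way to try different starting values is sketched below. It assumes the recurrence is u_{n+1} = 3 - (u_n)^2, which is a reading of the formula posted earlier; the function name is made up for illustration.

```python
def sequence_terms(u1, count):
    """Return the first `count` terms of u_{n+1} = 3 - (u_n)**2 (assumed form)."""
    terms = [u1]
    for _ in range(count - 1):
        terms.append(3 - terms[-1] ** 2)
    return terms

print(sequence_terms(2, 6))  # [2, -1, 2, -1, 2, -1]
print(sequence_terms(0, 5))  # [0, 3, -6, -33, -1086]
```

Starting from 2 the terms alternate, so any term can be read off from whether its index is odd or even. Starting from 0 the values blow up quickly.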

steady basalt Jun 28, 2022, 7:15 PM

#

u shud have a look at UK A2 maths core3/4 papers or advanced maths

#

its so hard i dropped out

brave sand Jun 28, 2022, 7:19 PM

#

import pandas as pd

file_name = "/Documents/Python Virtual Environments/Popular-RL-Algorithms/model/qmix_agent (1)/archive"
objects = pd.read_pickle(file_name)

#

why does this not work?

#

Traceback (most recent call last):
  File "qmix_pickle_reader.py", line 4, in <module>
    objects = pd.read_pickle(file_name)
  File "/home/ethan/Documents/Python Virtual Environments/marl-test-env/lib/python3.8/site-packages/pandas/io/pickle.py", line 187, in read_pickle
    with get_handle(
  File "/home/ethan/Documents/Python Virtual Environments/marl-test-env/lib/python3.8/site-packages/pandas/io/common.py", line 795, in get_handle
    handle = open(handle, ioargs.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/Documents/Python Virtual Environments/Popular-RL-Algorithms/model/qmix_agent (1)/archive'

wooden sail Jun 28, 2022, 7:20 PM

#

iirc to read pickled files you need to import all of the libraries that were involved in the object that got pickled

#

so try taking all of the imports you used on the file that generated the pickle, and put them also in this one that reads the pickle

#

oh but there it's also telling you you're reading from the wrong directory

brave sand Jun 28, 2022, 7:21 PM

#

same error

#

yeah

#

I didn't think loading in libraries would resolve my file not found error

wooden sail Jun 28, 2022, 7:21 PM

#

i'm pretty sure paths don't like spaces in them

#

try encasing the part of the path with a space in ''

#

'qmix_agent (1)'

#

otherwise, rename the folder 😛
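
For what it's worth, spaces and parentheses inside a Python string path normally need no quoting when the path goes straight to `open()`. The traceback shows the virtual environment under `/home/ethan/Documents/...`, while the file path starts at `/Documents/...`, so a likely cause is the missing home-directory prefix. A minimal sketch for checking that, assuming the file sits under the user's home directory (the path components are copied from the message; whether the file really lives there is not known):

```python
from pathlib import Path

# Build the path from the home directory instead of the filesystem root.
# The trailing components are copied from the message above; their
# location under the home directory is an assumption.
file_path = (
    Path.home()
    / "Documents"
    / "Python Virtual Environments"
    / "Popular-RL-Algorithms"
    / "model"
    / "qmix_agent (1)"
    / "archive"
)

# Spaces and parentheses need no escaping inside a Python string.
print(file_path)
print("exists:", file_path.exists())
```

If `exists` prints `True`, passing `file_path` to `pd.read_pickle` should at least get past the `FileNotFoundError`.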

upper spindle Jun 28, 2022, 7:38 PM

#

best place to learn deep learning with basic python programming?

mild dirge Jun 28, 2022, 7:40 PM

#

What are you currently at?

#

You know about stuff like linear regression, and perceptron, multi-layer perceptron etc?

#

@upper spindle

brave sand Jun 28, 2022, 7:45 PM

#

upper spindle best place to learn deep learning with basic python programming?

deep learning and basic python programming don't really go together lol

pliant perch Jun 28, 2022, 7:48 PM

#

anyone know the best way to get into ai for beginners

agile cobalt Jun 28, 2022, 7:50 PM

#

quite much replying to both delta and yourdad: there's Andrew Ng's Machine Learning Specialisation on Coursera, but I cannot say for sure if it's the best option out there

pliant perch Jun 28, 2022, 7:50 PM

#

agile cobalt quite much replying to both delta and yourdad: there's Andrew Ng's Machine Learn...

i understand the theory behind ai, i just find the code part hard

#

any advice on this?

lost cairn Jun 28, 2022, 7:51 PM

#

Is it possible to practice python with a mobile?

wooden sail Jun 28, 2022, 7:51 PM

#

i would say andrew ng is pretty aight. you need some background knowledge though, and iirc it doesn't go much into code. still it's a great place to start and i encourage learning the math before trying the code

agile cobalt Jun 28, 2022, 7:52 PM

#

wooden sail i would say andrew ng is pretty aight. you need some background knowledge though...

the new version goes a little bit more into code than the previous I think

pliant perch Jun 28, 2022, 7:52 PM

#

wooden sail i would say andrew ng is pretty aight. you need some background knowledge though...

so say i learned the background knowledge where do i go from there and learn the code part

agile cobalt Jun 28, 2022, 7:53 PM

#

pliant perch so say i learned the background knowledge where do i go from there and learn the...

you can try diving head first into the documentation of whichever library you want to use, or look for a course / tutorial series

radiant forum Jun 28, 2022, 7:53 PM

#

Hi people! I'm trying to understand the amount of parameters in a CNN. Well, I am classifying black and white images into four classes. Firstly I processed the images as RGB and later as 'grayscale'. I expected an exponential decrease in the amount of parameters after the Flatten layer, but actually they remained the same. What do the parameters actually depend on?

Captura_de_pantalla_2022-06-28_a_las_21.50.42.png

Captura_de_pantalla_2022-06-28_a_las_21.51.28.png

pliant perch Jun 28, 2022, 7:53 PM

#

agile cobalt you can try diving head first into the documentation of whichever library you wa...

are you able to recommend any good libraries?

agile cobalt Jun 28, 2022, 7:54 PM

#

pliant perch are you able to recommend any good libraries?

depends on what you want to do, which models you want to use etc

pliant perch Jun 28, 2022, 7:54 PM

#

agile cobalt depends on what you want to do, which models you want to use etc

well for example i wanted the ai to tell the difference between two things i.e A dog and A cat

radiant forum Jun 28, 2022, 7:54 PM

#

ey people, I do not mean to be rude. Just in case you didn't know there is a pedagogy channel

agile cobalt Jun 28, 2022, 7:54 PM

#

sklearn is fine for non-deep learning
pytorch or tensorflow are used for deep learning, sometimes via a higher level API / package such as fast.ai, huggingface or keras

radiant forum Jun 28, 2022, 7:54 PM

#

please,

wooden sail Jun 28, 2022, 7:54 PM

#

pliant perch so say i learned the background knowledge where do i go from there and learn the...

at that point i would look at big hitters like pytorch, tensorflow, and lower level stuff like numpy and jax. then i'd decide which one to stick to for a while based on personal interest

agile cobalt Jun 28, 2022, 7:55 PM

#

radiant forum ey people, I do not mean to be rude. Just in case you didn't know there is a ped...

#pedagogy is used for discussions on how to teach, not for asking for resources

wooden sail Jun 28, 2022, 7:56 PM

#

radiant forum Hi people! I'm trying to understand the amount of parameters in a CNN. Well, I a...

why would there be a decrease in parameters after a flatten layer?

#

all it does is change the shape. it doesn't apply any function whatsoever

#

it's akin to the "vectorization" operation you can apply to an m x n matrix in order to obtain a length m*n vector

#

same number of parameters, just reshaped (and generalized to more dimensions)

radiant forum Jun 28, 2022, 8:02 PM

#

wooden sail why would there be a decrease in parameters after a flatten layer?

that was my question. I expected so, because the size of the input is reduced

wooden sail Jun 28, 2022, 8:03 PM

#

lemme make an example for you

#

In [11]: import numpy as np

In [12]: x = np.array([[2,3],[5,6]])

In [13]: print(x)
[[2 3]
 [5 6]]

In [14]: print(x.flatten())
[2 3 5 6]

In [15]:

#

they're exactly the same thing, just in a different shape

radiant forum Jun 28, 2022, 8:06 PM

#

ok, let's change the point of view

#

when adding a convolutional layer it extracts the feature maps accordingly to a number of filters

#

if the size of the input is smaller I expected somehow a decrease on the amount of feature maps and also in the number of parameters

#

but there is something else... that is my question

wooden sail Jun 28, 2022, 8:10 PM

#

oh

#

that's determined by the shape of the convolution layers, the pooling layers, and the dense layers

#

from which layer to which layer did you expect a large change?

#

flatten does nothing, dropout doesn't change the number of parameters, only deactivates them randomly at each iteration. the dense layer is a linear mapping from R^n to R^m, here with m << n, so that's the layer that has a ton of parameters, but the feature vectors are quite small after it

#

in the convolutional layers, you specify an input 2D shape and a number of filters. the output is of size ~ (N - kernel_length) x (N - kernel_length) x num_filters.

#

for all of the layers, the number of parameters is related to the underlying (multi-)linear transformation from something isomorphic to R^N to something isomorphic to R^M, having N*M parameters

radiant forum Jun 28, 2022, 8:24 PM

#

so... what you mean is that an image of size (256,256,3) is isomorphic to another image of size (256,256,1)?

wooden sail Jun 28, 2022, 8:25 PM

#

not at all

#

what i'm saying is that an image of size 256 x 256 x 3 is isomorphic to another of size 196608

#

and you put that into the network and get another size

#

and that output vector of some size is isomorphic to some other n-dimensional array

#

so one easy way to think about the number of parameters at a given layer is to vectorize the input and output

#

then the number of parameters is something like N * M... plus another M, if there are biases

#

since the effect of a layer (before applying the activation function) is that of an affine transformation y = Ax + b, and A and b are the parameters
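
The affine-map view can be sketched in NumPy for a dense layer. The sizes N = 12 and M = 4 are hypothetical, chosen only to keep the arrays small.

```python
import numpy as np

n_inputs, n_outputs = 12, 4          # hypothetical sizes N and M
rng = np.random.default_rng(0)

A = rng.normal(size=(n_outputs, n_inputs))  # weight matrix, M x N
b = rng.normal(size=n_outputs)              # bias vector, length M
x = rng.normal(size=n_inputs)               # vectorized input, length N

y = A @ x + b                               # affine map y = Ax + b
n_params = A.size + b.size                  # N*M + M
print(y.shape, n_params)  # (4,) 52
```

This count holds for dense layers. Convolutional layers share weights across positions, so their parameter count is smaller than N*M.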

radiant forum Jun 28, 2022, 8:37 PM

#

nice. In that case I would have expected a decrease in the output shape of the first convolutional layer as well as it happened to its parameters

wooden sail Jun 28, 2022, 8:39 PM

#

and that happened indeed. 2D convolutional layers shrink the 2D axis of the image by roughly their own size

#

though the number of output slices in the image depends on the number of filters you use

glacial sparrow Jun 28, 2022, 8:41 PM

#

any resources for embedding categorical variables in LSTM?

wheat snow Jun 28, 2022, 8:53 PM

#

Heyo, i want to calculate 2 different average values for the watchtime ("Duration") with pandas (from 2022-05-02 until 2022-06-02), but i want it to be exactly 2 values: one for the first month (or the rest of the watch data available for that month), and the second should be all the watch data available in the second month
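
One possible approach is to filter to the date range and then group by calendar month. This is only a sketch: the column names `Date` and `Duration` and the sample values are assumptions, since the actual data was not posted.

```python
import pandas as pd

# Hypothetical watch-time records; column names are assumed.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2022-05-02", "2022-05-20", "2022-06-01", "2022-06-02"]),
    "Duration": [30.0, 50.0, 20.0, 40.0],
})

# Keep only rows inside the requested window.
in_range = df[(df["Date"] >= "2022-05-02") & (df["Date"] <= "2022-06-02")]

# One mean per calendar month that appears in the window.
monthly_mean = in_range.groupby(in_range["Date"].dt.to_period("M"))["Duration"].mean()
print(monthly_mean)
```

Grouping by `dt.to_period("M")` gives one value per month present in the filtered data, which here is two.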

#

radiant forum Jun 28, 2022, 9:01 PM

#

wooden sail for all of the layers, the number of parameters is related to the underlying (mu...

thanks a lot, you really helped me understand it better. But there is still something that doesn't match... if you take a look at the screenshots: the output shape of the first conv2d layer does not change, but the parameters do. Why doesn't that reduction in parameters affect the first dense layer too?
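
A likely explanation, offered without access to the screenshots: a standard 2D convolution has `kernel_h * kernel_w * in_channels * filters` weights plus one bias per filter, so its parameter count depends on the number of input channels but not on the image size. Its output shape depends on the image size and the number of filters, but not on the input channels. Going from RGB to grayscale therefore shrinks only the first conv layer's parameters; every later layer, including the dense one, sees the same shapes as before. The sketch below assumes 3x3 kernels, 32 filters, stride 1 and no padding, all of which are illustrative guesses.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return kernel_h * kernel_w * in_channels * filters + filters

def conv2d_output_shape(height, width, kernel_h, kernel_w, filters):
    """Output shape for stride 1 and no padding."""
    return (height - kernel_h + 1, width - kernel_w + 1, filters)

# Hypothetical first layer: 3x3 kernels, 32 filters, 256x256 input.
for channels in (3, 1):
    params = conv2d_params(3, 3, channels, 32)
    shape = conv2d_output_shape(256, 256, 3, 3, 32)
    print(channels, params, shape)
# 3 896 (254, 254, 32)
# 1 320 (254, 254, 32)
```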

lapis sequoia Jun 28, 2022, 10:29 PM

#

Hey, any ideas what format this data is in?
https://aoe2.net/api/player/ratinghistory?game=aoe2de&leaderboard_id=3&steam_id=76561199003184910&count=5

serene scaffold Jun 28, 2022, 10:33 PM

#

lapis sequoia Hey, any ideas what format this data is in? https://aoe2.net/api/player/ratinghi...

it's a JSON. while the outermost structure of a JSON is usually dict-like, it can be list-like.

#

!e

import json, pprint as pp
result = json.loads("""[{"rating":2351,"num_wins":2587,"num_losses":1916,"streak":-3,"drops":52,"timestamp":1656402092},{"rating":2357,"num_wins":2587,"num_losses":1915,"streak":-2,"drops":52,"timestamp":1656400992},{"rating":2363,"num_wins":2587,"num_losses":1914,"streak":-1,"drops":52,"timestamp":1656400031},{"rating":2369,"num_wins":2587,"num_losses":1913,"streak":4,"drops":52,"timestamp":1655825614},{"rating":2359,"num_wins":2586,"num_losses":1913,"streak":3,"drops":52,"timestamp":1655825282}]""")
pp.pprint(result)

arctic wedgeBOT Jun 28, 2022, 10:35 PM

#

@serene scaffold :white_check_mark: Your eval job has completed with return code 0.

[{'drops': 52,
  'num_losses': 1916,
  'num_wins': 2587,
  'rating': 2351,
  'streak': -3,
  'timestamp': 1656402092},
 {'drops': 52,
  'num_losses': 1915,
  'num_wins': 2587,
  'rating': 2357,
  'streak': -2,
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/uyipusetir.txt?noredirect

serene scaffold Jun 28, 2022, 10:35 PM

#

@lapis sequoia see?

lapis sequoia Jun 28, 2022, 10:42 PM

#

serene scaffold @lapis sequoia see?

I've been trying to get values out of the dictionary for so long

#

but good to know for sure it's json

#

What's the best way to convert this into a pandas dataframe?

serene scaffold Jun 28, 2022, 10:52 PM

#

lapis sequoia What's the best way to convert this into a pandas dataframe?

!docs pandas.load_json

arctic wedgeBOT Jun 28, 2022, 10:52 PM

#

Certainly not.

No documentation found for the requested symbol.

serene scaffold Jun 28, 2022, 10:52 PM

#

rip

#

!docs pandas.read_json

arctic wedgeBOT Jun 28, 2022, 10:53 PM

#

pandas.read_json

pandas.read_json(path_or_buf=None, orient=None, typ='frame', dtype=None, convert_axes=None, convert_dates=True, keep_default_dates=True, numpy=False, precise_float=False, ...)
Convert a JSON string to pandas object.

serene scaffold Jun 28, 2022, 10:53 PM

#

that one

lapis sequoia Jun 28, 2022, 10:53 PM

#

Ok, I'll give that a try. I'm trying it out by calling the API directly in the meantime.

serene scaffold Jun 28, 2022, 10:55 PM

#

lapis sequoia Ok, I'll give that a try. I'm trying it out by calling the API directly in the m...

if the API gives you that data as a string, then this will work

#

I think

#

otherwise you can do pd.DataFrame(json.loads(...))
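
A minimal sketch of that second route, using two records copied from the API response quoted above. Treating `timestamp` as Unix seconds is an assumption based on the magnitude of the values.

```python
import json
import pandas as pd

# Two records copied from the API response quoted above.
raw = """[{"rating":2351,"num_wins":2587,"num_losses":1916,"streak":-3,"drops":52,"timestamp":1656402092},
{"rating":2357,"num_wins":2587,"num_losses":1915,"streak":-2,"drops":52,"timestamp":1656400992}]"""

records = json.loads(raw)        # list of dicts
df = pd.DataFrame(records)       # one row per dict, one column per key

# Assumption: timestamps are Unix seconds.
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")

print(df)
print(df["rating"].max())  # 2357
```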

lapis sequoia Jun 28, 2022, 10:58 PM

#

I have a list of top 10k players which I've changed into a dataframe, I want to create a new column in the dataframe('max_rating') which has the highest rating each player has ever had. I'm using the following code:

API = aoe.API()
df = pd.read_csv('AoE2_list_of_top_10000_players.csv')

for s in df['profile_id'].head():
    ratings_of_player = API.get_rating_history(profile_id=s)
    a = []
    for i in ratings_of_player:
        a.append(i.get('rating'))
    print(a)
    df['max_rating'] = max(a)

But whenever I check my dataframe after running this, every player seems to have the same exact highest rating(2560) which is obviously not correct. What am I doing wrong here? I want to append the max rating I've found separately to each player.
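
A likely cause: `df['max_rating'] = max(a)` assigns a single number to the whole column on every pass through the loop, so all rows end up with the value from the last player processed. One way around it is to compute the value per player and let `map` fill the column row by row. In this sketch `get_rating_history` and `FAKE_HISTORY` are hypothetical stand-ins for `API.get_rating_history`, whose return value is assumed to be a list of dicts with a `rating` key, as the code above suggests.

```python
import pandas as pd

# Hypothetical stand-in for API.get_rating_history(profile_id=...).
FAKE_HISTORY = {
    101: [{"rating": 2300}, {"rating": 2560}],
    102: [{"rating": 1900}, {"rating": 2100}],
}

def get_rating_history(profile_id):
    return FAKE_HISTORY[profile_id]

def max_rating(profile_id):
    """Highest rating in one player's history."""
    return max(entry["rating"] for entry in get_rating_history(profile_id))

df = pd.DataFrame({"profile_id": [101, 102]})

# map() calls max_rating once per row, so each player gets their own value.
df["max_rating"] = df["profile_id"].map(max_rating)
print(df)
```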

lapis sequoia Jun 28, 2022, 11:00 PM

#

serene scaffold that one

Ok, I tried it and it seems to be working pretty nicely. Probably won't need to use all the code I've written above if it works how I think it works.

#

I need help turning my file into tenserflow
Like I need help installing it, then turning it from python to tiffle
Ive tried all the videos, I just cant figure it out

steady basalt Jun 28, 2022, 11:21 PM

#

lapis sequoia I need help turning my file into tenserflow Like I need help installing it, then...

Csv? Into a tensor?

lapis sequoia Jun 28, 2022, 11:23 PM

#

steady basalt Csv? Into a tensor?

I want to turn

#

I python file

#

into tiffle

#

I dont know how to do it

#

its easier to explain in vc

steady basalt Jun 28, 2022, 11:32 PM

#

U mean TIF?

#

Why would u want to turn a py script into an image

lapis sequoia Jun 29, 2022, 12:02 AM

#

@steady basalt sorry for late response

#

Into a TIFFLE

#

tesnsorflow

undone horizon Jun 29, 2022, 1:55 AM

#

👍

lapis sequoia Jun 29, 2022, 2:24 AM

#

I'm trying to get a list of all AoE2 players(from https://aoe2.net/#api using this wrapper: https://github.com/sixP-NaraKa/aoe2net-api-wrapper/blob/main/docs/docs.md) and their highest/lowest ratings in the previous 2 years. I've tried so many things but I keep failing when I try to convert the Json data to a Pandas dataframe. It converts it into a dataframe where the first few columns have index and some other information which isn't important to me, and then one column inserts a dictionary which has all the important information I need but it's impossible to access because it's all in one column.

Any help would be awesome
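
When one field of each record is itself a dict, `pd.json_normalize` can spread it into separate columns. The records below are hypothetical and only mimic the shape described; the real field names from the wrapper may differ.

```python
import pandas as pd

# Hypothetical records shaped like the problem described: one field holds a dict.
records = [
    {"profile_id": 101, "name": "player_a", "stats": {"rating": 2351, "num_wins": 2587}},
    {"profile_id": 102, "name": "player_b", "stats": {"rating": 2100, "num_wins": 900}},
]

# Nested keys become dotted column names such as "stats.rating".
flat = pd.json_normalize(records)
print(flat.columns.tolist())

# If the dicts already sit in a DataFrame column, expand that column instead.
df = pd.DataFrame(records)
expanded = pd.concat(
    [df.drop(columns="stats"), pd.json_normalize(df["stats"].tolist())], axis=1
)
print(expanded)
```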

AoE2.net

Play Age of Empires II: HD (AoE2:HD) and Age of Empires II: Definitive Edition (AoE2:DE) online! Lobby Browser and Leaderboards

shell panther Jun 29, 2022, 3:37 AM

#

steady basalt What’s open domain

open domain means that you can talk about it for any topic

tropic niche Jun 29, 2022, 4:30 AM

#

I've got a bunch of data that I'm collecting from various sources in a tabular format. The data is all similar but the tables don't always have the same columns. For example, one source may provide a column with a start value, and an end value and nothing else, other sources may provide some interim values. The ordering of the columns may also be different between tables. The rows are almost always in sorted order.

I'm wondering if there is a way to train ML model to determine column headings. Currently I need to manually open the data in excel look at it, and assign the correct heading and then enter the table into my system such that it can be processed, this is really annoying and time consuming work. The tables can have as few as 200 rows, and up to the tens of thousands, and there are typically about 6 to 12 columns. How would I go about structuring the data to train such a model?

wooden sail Jun 29, 2022, 4:41 AM

#

are the columns labelled in the files?

tropic niche Jun 29, 2022, 4:44 AM

#

Yes, mostly but the labels are not consistent. Data from different sources can have different labels for the same data.

wooden sail Jun 29, 2022, 4:54 AM

#

aight you could look at several examples from your data to see if you can learn something about the statistical distribution of the data. the annoying part is that the files have different row sizes. you can either pad the rows or extract statistical params yourself. then train the network on randomly generated examples based on what you observed in the data.

tropic niche Jun 29, 2022, 5:02 AM

#

The statistical distribution of the like columns should be similar regardless of the number of rows.

wooden sail Jun 29, 2022, 5:18 AM

#

that was exactly my point 😛

lapis sequoia Jun 29, 2022, 6:23 AM

#

for i in df.head()['profile_id']:
    all_ratings = []
    max_rating = 0
    min_rating = 0
    list_of_ratings = API.get_rating_history(profile_id=i)
    for i in list_of_ratings:
        all_ratings.append(i.get('rating'))
    max_rating = max(all_ratings)
    print(max_rating)
    df.loc[df['profile_id'] == i, 'max_rating'] = max_rating

#

What mistake am I making here? When I print df.head() I get max_rating values as NaN.

#

I want them to show the max ratings of the players

#

pls help 😦
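
A likely cause: the inner loop reuses the name `i`, so by the time `df.loc[...]` runs, `i` no longer holds the profile id but the last rating entry. The comparison `df['profile_id'] == i` then matches no rows, and the new column stays NaN. Giving the two loop variables different names avoids this. In the sketch below `get_rating_history` is a hypothetical stand-in for `API.get_rating_history`, with made-up data.

```python
import pandas as pd

# Hypothetical stand-in for API.get_rating_history(profile_id=...).
def get_rating_history(profile_id):
    fake = {101: [2300, 2560], 102: [1900, 2100]}
    return [{"rating": r} for r in fake[profile_id]]

df = pd.DataFrame({"profile_id": [101, 102]})

for profile_id in df["profile_id"]:
    history = get_rating_history(profile_id)
    # A separate name for the inner variable keeps profile_id intact.
    ratings = [entry.get("rating") for entry in history]
    mask = df["profile_id"] == profile_id
    df.loc[mask, "max_rating"] = max(ratings)
    df.loc[mask, "min_rating"] = min(ratings)

print(df)
```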

sour tide Jun 29, 2022, 6:53 AM

#

Hi..so i got this cosine similarity matrix output from a python program. may i know how to do data classification on this, like finding accuracy and all, in terms of a specific data classifier which is SVM classifier

#

the output is like this..im putting link hia since i cant copy and paste my own output hia

#

https://ars.els-cdn.com/content/image/1-s2.0-S2666285X22000176-gr4.jpg

runic lantern Jun 29, 2022, 8:07 AM

#

Hi everyone! I am working on my first data science project and i am facing some trouble with identifying and dealing with outliers in my dataset

#

would love to learn how to deal with outliers!!

serene scaffold Jun 29, 2022, 9:02 AM

#

Don't ask for an expert. Ask your actual question.

runic lantern Jun 29, 2022, 9:31 AM

#

https://github.com/Sparsh-mahajan/House-Price-Prediction/blob/main/data_cleaning.ipynb here is the what i have been working with, i have a dataset with around 2.9k rows and 80 columns, I have dealt with missing values in the dataset and have plotted out boxplot, histplot and a scatterplot for each column vs the label ('SalePrice')

GitHub

House-Price-Prediction/data_cleaning.ipynb at main · Sparsh-mahajan...

Predicting house prices using the dataset from https://www.kaggle.com/datasets/prevek18/ames-housing - House-Price-Prediction/data_cleaning.ipynb at main · Sparsh-mahajan/House-Price-Prediction

#

so now do I manually find out outliers in each of the 80 columns and then remove those rows from the dataset? also have i been following the correct method in finding out the outliers?
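
One common way to avoid checking 80 columns by hand is the interquartile-range (IQR) rule applied to all numeric columns at once. The sketch below uses made-up toy data and the conventional multiplier of 1.5. Whether flagged rows should actually be dropped depends on the dataset, so it may be safer to inspect them first.

```python
import pandas as pd

def iqr_outlier_mask(df, k=1.5):
    """True where a numeric value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    return (numeric < q1 - k * iqr) | (numeric > q3 + k * iqr)

# Hypothetical toy data; the real dataset has about 80 columns.
df = pd.DataFrame({
    "GrLivArea": [1000, 1100, 1200, 1300, 9000],
    "SalePrice": [150, 160, 170, 180, 190],
})

mask = iqr_outlier_mask(df)
print(mask.sum())              # number of flagged values per column
print(df[~mask.any(axis=1)])   # rows with no flagged value
```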

steady basalt Jun 29, 2022, 10:04 AM

#

lapis sequoia @steady basalt sorry for late response

What’s a tiffle

#

I’ve google tensorflow tiffle can’t see anything

#

guys am i tripping did i forget that tensorflow has a file type

#

I know this https://pypi.org/project/tifffile/

PyPI

tifffile

Read and write TIFF files

#

We will also build the profile of the analyst profession more broadly across national policymakers and central government. This will include accreditation, training, career opportunities, status and pay to match. no fucking shot (UK, NHS)

#

how exactly would one obtain accreditation

#

https://www.gov.uk/government/publications/data-saves-lives-reshaping-health-and-social-care-with-data

GOV.UK

Data saves lives: reshaping health and social care with data

This final version of the strategy sets out ambitious plans to harness the potential of data in health and care in England, while maintaining the highest standards of privacy and ethics.

leaden nova Jun 29, 2022, 11:27 AM

#

hi

#

is there a way to convert a unicode to utf-8 character for example ’ to '

#

in python

#

if we have in string

wooden sail Jun 29, 2022, 11:32 AM

#

should be possible to use something like my_string.encode('utf8')

leaden nova Jun 29, 2022, 11:41 AM

#

nope
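
For context, `’` (U+2019) is already a Unicode character that UTF-8 can represent, and `.encode('utf8')` only turns the string into bytes. If the goal is to swap the curly apostrophe for a plain ASCII `'`, one option is `str.translate` with a small replacement table. The table below is illustrative and not exhaustive.

```python
# Map common "smart" punctuation to ASCII equivalents (illustrative table).
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})

text = "it\u2019s a \u201ctest\u201d"
print(text.translate(SMART_TO_ASCII))  # it's a "test"
```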

bold timber Jun 29, 2022, 11:47 AM

#

Anyone can explain to me why I get a plot like this?

long locust Jun 29, 2022, 11:57 AM

#

resample will create the bins based on the cyclic data

bold timber Jun 29, 2022, 11:57 AM

#

ok thank youu

lapis sequoia Jun 29, 2022, 1:06 PM

#

steady basalt What’s a tiffle

Im so Fucking sorry

#

I meant TF2 File

#

Stupid me

#

I need help turning my python file into a TF2 file

misty flint Jun 29, 2022, 2:00 PM

#

ah i had a similar problem with dask but it is solvable, but i cant remember how i did it without looking it up and i have work rn - all i can say is it looks like youre close

steady basalt Jun 29, 2022, 2:15 PM

#

lapis sequoia Im so Fucking sorry

Lmfaooo

#

@lapis sequoia can u explain what u mean because afaik tf2 is not a script file type? Do u mean like save model ?

#

You can save a .py file as a text file easily but I’m not sure what a tf2 file is

steady basalt Jun 29, 2022, 2:22 PM

#

lapis sequoia I need help turning my python file into a TF2 file

Tensorflow library file?

hollow sentinel Jun 29, 2022, 2:42 PM

#

!pastebin

arctic wedgeBOT Jun 29, 2022, 2:42 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

hollow sentinel Jun 29, 2022, 2:42 PM

#

https://paste.pythondiscord.com/ozaqedivuw

#

i don't understand why the model isn't pickled

#

unless i have to create the pickle file first and then run the code?

upper spindle Jun 29, 2022, 2:54 PM

#

mild dirge You know about stuff like linear regression, and perceptron, multi-layer percept...

@mild dirge ; i know linear regression from econometrics but all these perceptron stuff, i have no idea

steady basalt Jun 29, 2022, 3:01 PM

#

upper spindle @mild dirge ; i know linear regression from econometrics but all these...

It’s a simple way to calculate output you can use to introduce urself to neural networks

hollow sentinel Jun 29, 2022, 3:05 PM

#

does anyone know?

#

it could be very slow to run

#

but idk

lapis sequoia Jun 29, 2022, 4:07 PM

#

steady basalt Tensorflow library file?

yes

#

im trying to optimise my python file to get more fps

#

so im trying to turn it into a Tensorflow file

#

so that I can put it into deci platform

#

#

steady basalt Jun 29, 2022, 4:14 PM

#

lapis sequoia im trying to optimise my python file to get more fps

FPS?

#

Frame per second?

#

Are you making a game?

#

Are u trolling?

#

I think ur attempting to save a model not convert a python script to a tensorflow “file”

lapis sequoia Jun 29, 2022, 4:19 PM

#

its a script

#

that you run on a game

#

when I run it it doesnt give good FPS

#

so im trying to optimize the script

#

the only way I can figure out how to optimize it is by making it into a different file, one that is compatible with that one website

steady basalt Jun 29, 2022, 4:20 PM

#

Are you doing computer vision in a game?

lapis sequoia Jun 29, 2022, 4:20 PM

#

lapis sequoia

^

#

its easier to explain in voice call

#

can u call?

steady basalt Jun 29, 2022, 4:20 PM

#

No but I can read

lapis sequoia Jun 29, 2022, 4:20 PM

#

ok so basicly

#

I have this script

#

You run it on a game

#

it gives poor FPS

steady basalt Jun 29, 2022, 4:20 PM

#

U need to explain what the script is

lapis sequoia Jun 29, 2022, 4:21 PM

#

it is a FOV hack for my game

steady basalt Jun 29, 2022, 4:21 PM

#

Why do you run it on a game, how does that work?

#

Oh okay

#

And this has what to do with tensorflow?

lapis sequoia Jun 29, 2022, 4:21 PM

#

I figured

#

of I made it until

#

into*

steady basalt Jun 29, 2022, 4:21 PM

#

Tensorflow is a library that creates functions for u to do ML

#

it’s not a file type

lapis sequoia Jun 29, 2022, 4:21 PM

#

lapis sequoia

I just want to make it one of these files, onnx, tensorflow, keras

#

because thats when i can optimize it

steady basalt Jun 29, 2022, 4:21 PM

#

Those are for storing models

lapis sequoia Jun 29, 2022, 4:22 PM

#

well how do I optimize it then

#

to get better fps

steady basalt Jun 29, 2022, 4:22 PM

#

Is ur pc good

#

What game is it

#

Increasing ur fov in games is not the fov script that lags you but the game itself having to render more

lapis sequoia Jun 29, 2022, 4:23 PM

#

my pc is good

#

its a fps game

#

sorta like fornite

#

it has the same applications as it

#

runs on unity engine

#

I just need help optimizing it

steady basalt Jun 29, 2022, 4:24 PM

#

Making your fov script not python u won’t be able to run it

lapis sequoia Jun 29, 2022, 4:24 PM

#

it injects itself

steady basalt Jun 29, 2022, 4:24 PM

#

And what good is converting even language? It’s not the script it’s your game

lapis sequoia Jun 29, 2022, 4:24 PM

#

my friend was able to optimize his

#

steady basalt Jun 29, 2022, 4:25 PM

#

Bro what?

#

Accuracy of what?

#

Your frames per second?

lapis sequoia Jun 29, 2022, 4:26 PM

#

yes

#

nevermind

#

its hard to explain

steady basalt Jun 29, 2022, 4:26 PM

#

Frames per second is not an accuracy

lapis sequoia Jun 29, 2022, 4:26 PM

#

il figure it out on my own

steady basalt Jun 29, 2022, 4:27 PM

#

Just to make it clear this isn’t a ML task right?

lapis sequoia Jun 29, 2022, 4:27 PM

#

bloodcry

steady basalt Jun 29, 2022, 4:27 PM

#

You’re not data science?

#

Game dev?

lapis sequoia Jun 29, 2022, 4:27 PM

#

yes

#

someone told me to come to this channel

#

for help

steady basalt Jun 29, 2022, 4:27 PM

#

None of this will help you ur being trolled

lapis sequoia Jun 29, 2022, 4:27 PM

#

My deadline is fucking today

#

angry

steady basalt Jun 29, 2022, 4:28 PM

#

Dude

#

Ur friend is trying to make u fail

#

Yikes

#

Improve your games efficiency at rendering

#

This task has nothing to do with ai

#

Ur friend scammed u he’s not getting accuracy scores for this lmao

unique quail Jun 29, 2022, 4:44 PM

#

how does pandas or matplotlib help in machine learning

#

just crious

#

curious*

wooden sail Jun 29, 2022, 4:50 PM

#

matplotlib is for plotting, and pandas can help you read files and check out what properties the data has. other than that, not much else. the actual ML is done with other tools and can be done entirely without those 2 libs

grand vapor Jun 29, 2022, 4:54 PM

#

having an issue with pandas read_csv function, I'm trying to use certain columns of a csv, but I get an error that they are expected but not found.
attached is a screenshot of said csv and the columns I want to extract
here is the code I am running to try to accomplish this:

LORD3DM_100128XY = pd.read_csv(str(PATH100128XY) + "3DM.csv", skiprows=15, usecols=['X Accel [x8004]', 'Y Accel [x8004]', 'Z Accel [x8004]'])

I typically don't have any problems with doing this kind of thing, not sure what's happening here. figured this might be a good place to ask
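
The screenshot is not available here, so the actual cause is unknown. Two common suspects for a "columns expected but not found" error are a `skiprows` count that lands on the wrong line and header names that carry extra whitespace. A minimal diagnostic sketch, using an invented sample file (only the column names are taken from the question):

```python
import io
import pandas as pd

# Made-up file: 2 preamble lines, then a header whose names carry stray spaces.
raw = "device log\nfirmware 1.0\nTime, X Accel [x8004], Y Accel [x8004]\n0,0.1,0.2\n1,0.3,0.4\n"

# Read only the header row to see the names pandas actually parsed.
header = pd.read_csv(io.StringIO(raw), skiprows=2, nrows=0)
print([repr(c) for c in header.columns])

# A callable usecols can match on the cleaned-up name instead of the exact string.
wanted = {"X Accel [x8004]", "Y Accel [x8004]"}
df = pd.read_csv(io.StringIO(raw), skiprows=2, usecols=lambda c: c.strip() in wanted)
df.columns = df.columns.str.strip()
print(df.columns.tolist())
```

If the printed header does not look like a header at all, the `skiprows` value is the first thing to adjust.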

sinful surge Jun 29, 2022, 5:17 PM

#

Anyone know what this tensorflow error means? This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2

#

I dont think it has to do with my code

tidal bough Jun 29, 2022, 5:21 PM

#

uhh, is it an error?

#

it's just saying your tensorflow binary is compiled with AVX and AVX2 support

steady basalt Jun 29, 2022, 5:21 PM

#

Would anyone here wana teach me how to binary tree in python? Just the basics such as checking nodes and traversing

sinful surge Jun 29, 2022, 5:22 PM

#

Oh

steady basalt Jun 29, 2022, 5:22 PM

#

No matter how many tutorials I watch and can memorize key inputs i still don’t fully grasp it

#

Same for linked lists I get the theory but not oop
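
For the basics asked about here (checking nodes and traversing), a minimal sketch may help. The class and function names are illustrative choices, not from any library, and the tree is a plain binary tree with no ordering assumed:

```python
class Node:
    """One tree node: a value plus optional left and right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def inorder(node):
    """Visit left subtree, then the node, then right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

def contains(node, target):
    """Check every node for target (no ordering assumed)."""
    if node is None:
        return False
    return node.value == target or contains(node.left, target) or contains(node.right, target)

def invert(node):
    """Swap left and right children at every node, in place."""
    if node is not None:
        node.left, node.right = invert(node.right), invert(node.left)
    return node

#       2
#      / \
#     1   3
root = Node(2, Node(1), Node(3))
print(inorder(root))          # [1, 2, 3]
print(contains(root, 3))      # True
print(inorder(invert(root)))  # [3, 2, 1]
```

Every function has the same shape: handle the empty case (`None`), then recurse into `left` and `right`. That pattern covers most beginner tree exercises.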

sinful surge Jun 29, 2022, 5:24 PM

#

tidal bough uhh, *is* it an error?

Nevermind, my model was in the wrong directory 🤦

wheat snow Jun 29, 2022, 5:28 PM

#

@untold bloom you still here?

untold bloom Jun 29, 2022, 5:30 PM

#

yes, kind of :p

wheat snow Jun 29, 2022, 5:31 PM

#

OHHHHH

untold bloom Jun 29, 2022, 5:31 PM

#

how are you

wheat snow Jun 29, 2022, 5:31 PM

#

wait

#

Im done with life

#

im stuck at something for days

untold bloom Jun 29, 2022, 5:31 PM

#

:\

wheat snow Jun 29, 2022, 5:32 PM

#

here

#

its been rumoring in my mind for days

#

i cant figure out how to do it

untold bloom Jun 29, 2022, 5:32 PM

#

oh, sorry i didn't respond to that...

wheat snow Jun 29, 2022, 5:32 PM

#

i want to calculate e.g. 2 different average values for the watchtime ("Duration") with pandas (from 2022-05-02 until 2022-06-02), but i want it to be exactly 2 values: one for the first month (or the rest of the watch data available for that month), and the second should be all the watch data available in the second month

untold bloom Jun 29, 2022, 5:32 PM

#

i usually hang out in other channels and this has a lot of text in between and i tend to forget and not answer then

#

if possible, can you give me some sample input and expected output?

wheat snow Jun 29, 2022, 5:33 PM

#

hmm yes

#

result=df_vd_E.groupby(df_vd_E["Start Time"].dt.date)["Duration"].sum()
result.index = pd.to_datetime(result.index)
b=(result.loc["2021-04-15": "2021-07-15"].dt.total_seconds()/60/60)

Month= b.mean()
print(Month)
So, rn, it gives me the average of the time from 4-15 until 7-15 (one value). what i want is different: i want it to be the average of one month each, so it should be one average value (duration) for the rest of month 4, then the full average watchtime duration of month 5, and so on

untold bloom Jun 29, 2022, 5:37 PM

#

i see, thanks

wheat snow Jun 29, 2022, 5:37 PM

#

btw

#

thank you so much for the help

untold bloom Jun 29, 2022, 5:37 PM

#

yw

wheat snow Jun 29, 2022, 5:38 PM

#

for my first pandas project, ig i really need some help

untold bloom Jun 29, 2022, 5:38 PM

#

didn't help yet, though :p

wheat snow Jun 29, 2022, 5:38 PM

#

but before you did

untold bloom Jun 29, 2022, 5:38 PM

#

so as you said, the .mean() gives you a single number: the "global" mean

#

but you want it per month

#

whenever "per" shows up, we tend to go for .groupby

#

what will we group the data by in this case?

#

you want it per month so, month of the data

#

then we take action:

wheat snow Jun 29, 2022, 5:39 PM

#

untold bloom whenever "per" shows up, we tend to go for `.groupby`

yesyes

untold bloom Jun 29, 2022, 5:39 PM

#

b.groupby(b.index.month).mean()

#

since the month information is at the index (right?), we reach it from there

wheat snow Jun 29, 2022, 5:40 PM

#

the month thing refers to the datatype of it right?

untold bloom Jun 29, 2022, 5:40 PM

#

uh, not quite

wheat snow Jun 29, 2022, 5:40 PM

#

so .month automatically knows that the -04- is a month?

untold bloom Jun 29, 2022, 5:40 PM

#

yes

#

please observe what print(b.index.month) shows

#

b.index is a DatetimeIndex; it has convenient attributes attached to it

#

.year, .month, .dayofweek, .dayofyear...

untold bloom Jun 29, 2022, 5:42 PM

#

untold bloom please observe what `print(b.index.month)` shows

this will show numbers 1, 2, ..., 12 in general

#

for 12 months
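
A quick way to see what these index attributes return, as a sketch on synthetic data (the `b` here is a made-up daily Series, not the real watch data). pandas numbers months 1 through 12, while `dayofweek` starts at 0 for Monday:

```python
import pandas as pd

# Synthetic stand-in for `b`: one value per day across a full year.
idx = pd.date_range("2021-01-01", "2021-12-31", freq="D")
b = pd.Series(1.0, index=idx)

months = sorted(set(b.index.month))
print(months)                 # month numbers run 1..12
print(b.index.dayofweek[:3])  # Monday=0 ... Sunday=6
```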

wheat snow Jun 29, 2022, 5:42 PM

#

untold bloom `.year`, `month`, `dayofweek`, `dayofyear`...

THIS IS AN AMAZING TOOL

untold bloom Jun 29, 2022, 5:42 PM

#

:p

#

one caveat about the code above though:

#

we grouped by the month information only

#

so the year is ignored: any February day will count toward the mean of February

#

be it year 1998 data or year 2921 data

#

in your case, i guess this is fine

#

because you have only 2021 data in b

#

but in general...

#

you can do b.groupby([b.index.year, b.index.month]).mean()

#

groupby both year and month

#

so 1988's February and 2012's February now fall into different groups.

#

before, they were falling into the same, February, group.
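
The difference between the two groupings can be seen on a few made-up values (the dates and numbers below are invented for illustration):

```python
import pandas as pd

# Synthetic daily durations (in hours) spanning two Februaries.
idx = pd.to_datetime(["2021-02-01", "2021-02-02", "2022-02-01", "2022-03-01"])
b = pd.Series([1.0, 3.0, 10.0, 4.0], index=idx)

by_month = b.groupby(b.index.month).mean()
by_year_month = b.groupby([b.index.year, b.index.month]).mean()

print(by_month)       # February mixes 2021 and 2022
print(by_year_month)  # (2021, 2) and (2022, 2) are separate groups
```

Grouping by month alone averages all three February values together; grouping by year and month keeps the 2021 mean (2.0) apart from the 2022 one (10.0).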

wheat snow Jun 29, 2022, 5:45 PM

#

okay,
"2022-02-01": "2022-04-01"

#

Start Time
2 2.563807
3 2.003324
4 2.275278

#

so, how do i now make sure that the rest of the month is counted in as well

untold bloom Jun 29, 2022, 5:45 PM

#

it is counted in yes

wheat snow Jun 29, 2022, 5:48 PM

#

ah okay

untold bloom Jun 29, 2022, 5:51 PM

#

i mean, however many days are in that month, they will be counted

#

be it 1 day or 30, 31

wheat snow Jun 29, 2022, 5:52 PM

#

okie okie

#

plot looking good so far

bold timber Jun 29, 2022, 6:35 PM

#

Can anyone help me? Why do I get the same color? How to use different colors in that plot?

wooden sail Jun 29, 2022, 6:43 PM

#

i wouldn't say they're the same color, but they're pretty close. how about you remove the color parameter?

steady basalt Jun 29, 2022, 6:45 PM

#

Those are some accurate predictions

#

Model?

bold timber Jun 29, 2022, 6:45 PM

#

wooden sail i wouldn't say they're the same color, but they're pretty close. how about you r...

doesn't work

bold timber Jun 29, 2022, 6:46 PM

#

steady basalt Model?

I just wanted to try a simple model for time series like this

wooden sail Jun 29, 2022, 6:48 PM

#

remove the 'r' and 'g' too? just to see what happens

wheat snow Jun 29, 2022, 6:49 PM

#

@untold bloom
Rapha= df_vd_R['Duration'].dt.total_seconds()/60/60

over here i want to print a single number for the whole watchtime duration of one user, but the output prints every duration for that user... how do i say that i want the durations of some rows added together

bold timber Jun 29, 2022, 6:50 PM

#

wooden sail remove the 'r' and 'g' too? just to see what happens

like this. But it makes me wonder why I can't choose the color myself

wooden sail Jun 29, 2022, 6:51 PM

#

i wonder too tbh lol

lapis sequoia Jun 29, 2022, 6:52 PM

#

Hello,
I have a piece of code but each time I run it it takes a long time. I'm guessing it takes a long time because it calls the API every time it runs and gets information from it.

Is there any way to speed it up so that it doesn't take a minute or two each time I run it?

wooden sail Jun 29, 2022, 6:52 PM

#

bold timber like this. But, it makes me wondering why I can't choose my color itself

https://stackoverflow.com/questions/45175916/why-are-colors-not-working-in-matplotlib-for-this-example apparently there are some color glitches around

Stack Overflow

Why are colors not working in matplotlib for this example?

Why am I not seeing any red colors for the negative values here?

df = pd.DataFrame([1, -2, 3, -4])
df['positive'] = df[[0]]>0
df[[0]].plot(kind='bar', color=df.positive.map({True: 'g', False: '...

lapis sequoia Jun 29, 2022, 6:53 PM

#

Here is my code:

API = aoe.API()

df = pd.read_csv('AoE2_list_of_top_10000_players.csv')

df = df[['profile_id', 'name']]
df['max_rating'] = 0
df['min_rating'] = 0
df['date_of_max_rating'] = 0
df['date_of_min_rating'] = 0
df['difference_in_rating'] = 0


for i in df.head(100)['profile_id']:
    all_ratings = []
    max_rating = 0
    min_rating = 0
    list_of_ratings = API.get_rating_history(profile_id=i, count=5000)
    for j in list_of_ratings:
        if j.get('timestamp') > 1591036131:
            all_ratings.append(j.get('rating'))
        else:
            pass
    max_rating = max(all_ratings)
    min_rating = min(all_ratings)

    df.loc[df['profile_id'] == i, 'max_rating'] = max_rating
    df.loc[df['profile_id'] == i, 'min_rating'] = min_rating
    df.loc[df['profile_id'] == i, 'date_of_max_rating'] = dt.datetime.fromtimestamp(list_of_ratings[all_ratings.index(max_rating)].get('timestamp'))
    df.loc[df['profile_id'] == i, 'date_of_min_rating'] = dt.datetime.fromtimestamp(list_of_ratings[all_ratings.index(min_rating)].get('timestamp'))
    df.loc[df['profile_id'] == i, 'difference_in_rating'] = max_rating - min_rating

print(df.head(100))
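
If the time really does go into the API calls, one common approach is to cache each response on disk so that later runs skip the network. This is a sketch under assumptions: `cached_history`, `fake_fetch`, and the cache folder are hypothetical names, and it assumes the rating history is a JSON-serializable list of dicts (the loop above reads it with `.get`, which suggests dicts).

```python
import json
import tempfile
from pathlib import Path

def cached_history(profile_id, fetch, cache_dir):
    """Return the rating history for profile_id, calling `fetch` only on a cache miss."""
    path = Path(cache_dir) / f"{profile_id}.json"
    if path.exists():
        return json.loads(path.read_text())
    history = fetch(profile_id)          # the slow network call
    path.write_text(json.dumps(history))
    return history

# Stand-in for API.get_rating_history so the sketch runs offline.
calls = []
def fake_fetch(profile_id):
    calls.append(profile_id)
    return [{"rating": 2000, "timestamp": 1600000000}]

cache_dir = tempfile.mkdtemp()
first = cached_history(42, fake_fetch, cache_dir)
second = cached_history(42, fake_fetch, cache_dir)  # served from disk
print(len(calls))
```

In the loop above, the fetch argument could be `lambda pid: API.get_rating_history(profile_id=pid, count=5000)`. Cached files go stale, so the folder needs deleting whenever fresh ratings are wanted.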

bold timber Jun 29, 2022, 6:55 PM

#

wooden sail https://stackoverflow.com/questions/45175916/why-are-colors-not-working-in-matpl...

thank you so much

steady basalt Jun 29, 2022, 7:05 PM

#

Ladies and gentlemen

#

I am proud to announce

#

I have inverted a binary tree

#

Thanks for all ur support

#

I’m ready to face interview

#

cant wait to apply data structures to uhh... dataframes..

untold bloom Jun 29, 2022, 7:30 PM

#

wheat snow @untold bloom ``` Rapha= df_vd_R['Duration'].dt.total_seconds()/60/...

IIUC, again we groupby

#

not sure what the username column is called but let's say it's called "username"

#

let's first convert the durations to hours and then groupby

#

df.Duration.dt.total_seconds.div(3600).groupby(df["username"]).sum()

#

this gives you the total duration per username (in hours)

#

whichever specific username you want, you can index into this to select it, e.g., above_thing.loc["user_1"]

untold bloom Jun 29, 2022, 7:33 PM

#

untold bloom `df.Duration.dt.total_seconds.div(3600).groupby(df["username"]).sum()`

this gives the total durations per user for all users

#

if you only want a specific user, first filter the frame and then sum; no groupby is needed then

#

df.loc[df["username"].eq("user_1"), "Duration"].dt.total_seconds().div(3600).sum()
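
Both variants can be tried on a tiny made-up frame ("username" is the placeholder column name from the messages above, and the durations are invented). Note the parentheses after `total_seconds`: it is a method, and leaving them off raises an AttributeError.

```python
import pandas as pd

# Made-up frame; "username" is the placeholder column name used above.
df = pd.DataFrame({
    "username": ["user_1", "user_2", "user_1"],
    "Duration": pd.to_timedelta(["01:30:00", "00:30:00", "02:00:00"]),
})

hours = df["Duration"].dt.total_seconds().div(3600)  # method call: note the ()
per_user = hours.groupby(df["username"]).sum()       # total hours for every user
one_user = df.loc[df["username"].eq("user_1"), "Duration"].dt.total_seconds().div(3600).sum()

print(per_user)
print(one_user)
```

Without the final `.sum()`, the single-user line returns one value per row instead of one total.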

misty flint Jun 29, 2022, 7:34 PM

#

ahhhhhhhhhh

#

where is stel

#

hes probs busy

#

anyway my model doesnt fit even with aws lambda layers + s3

#

kekHands

#

rip

#

so the alternative will probably have to be putting the model and inference code into a docker container

#

docker

#

then probs deploy using ECS or Fargate or something

#

more AWS services i do not know

#

aws nervous

wheat snow Jun 29, 2022, 7:53 PM

#

@untold bloom it often tells me that 'function' object has no attribute .div

#

Traceback (most recent call last):
23118 1.025278
23118 1.025278
23119 0.719444
23120 0.000556
23121 0.019444
Name: Duration, Length: 10293, dtype: float64

#

this is the output btw

#

somehow it prints more....

#

hmm

#

df_vd.loc[df_vd["Profile Name"].eq("Rapha"), "Duration"].dt.total_seconds()/60/60

wheat snow Jun 29, 2022, 8:00 PM

#

misty flint where is stel

ahh, i see you are also dependent on one dc user for help....

hollow sentinel Jun 29, 2022, 8:03 PM

#

!pastebin

arctic wedgeBOT Jun 29, 2022, 8:03 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

misty flint Jun 29, 2022, 8:04 PM

#

wheat snow ahh, i see you are also dependent on one dc user for help....

not really. if you read my message, i dont expect anything from stel. i just complain to him

wheat snow Jun 29, 2022, 8:04 PM

#

misty flint not really. if you read my message, i dont expect anything from stel. i just com...

lmao

misty flint Jun 29, 2022, 8:04 PM

#

since he is the only one here who feels same way about aws as me

#

kekHands

wheat snow Jun 29, 2022, 8:04 PM

#

misty flint since he is the only one here who feels same way about aws as me

welp, i fully get carried by nahita somehow

#

its my first data project you gotta know

#

never used pandas before

misty flint Jun 29, 2022, 8:05 PM

#

V88_apexm_pepegun aws

hollow sentinel Jun 29, 2022, 8:05 PM

#

https://paste.pythondiscord.com/idejihayer

#

this is a fun error message

#

federal-gov

#

what the fuck is federal-gov

#

oh i'm being stupid

#

i never one hot encoded anything

#

that's why

misty flint Jun 29, 2022, 8:06 PM

#

hollow sentinel what the fuck is federal-gov

yeah you also have 'private'

#

yep that will do it

#

kekHands

wheat snow Jun 29, 2022, 8:06 PM

#

@untold bloom ?

hollow sentinel Jun 29, 2022, 8:06 PM

#

don't ping ppl asking for help

wheat snow Jun 29, 2022, 8:06 PM

#

hollow sentinel don't ping ppl asking for help

ehm

misty flint Jun 29, 2022, 8:07 PM

#

hollow sentinel that's why

good luck bud. you got this

#

praise

wheat snow Jun 29, 2022, 8:07 PM

#

Prayge

misty flint Jun 29, 2022, 8:08 PM

#

wheat snow never used pandas before

try a help channel

wheat snow Jun 29, 2022, 8:08 PM

#

misty flint try a help channel

this IS a help channel

misty flint Jun 29, 2022, 8:08 PM

#

ehh sometimes its topical chat

wheat snow Jun 29, 2022, 8:08 PM

#

misty flint Jun 29, 2022, 8:08 PM

#

we just get too many newbies

wheat snow Jun 29, 2022, 8:08 PM

#

/Help

#

see

misty flint Jun 29, 2022, 8:08 PM

#

Topical Chat/

#

see

#

Oopsies

wheat snow Jun 29, 2022, 8:09 PM

#

ye, so it isnt prohibited to ask for help

last salmon Jun 29, 2022, 8:09 PM

#

um guys I currently want to undertake a project that involves an ai classifying images shown to it and it getting better at doing so over time

#

how would i go about doing that?

#

all ik is show the ai data

#

train it

#

over time

#

and results

misty flint Jun 29, 2022, 8:09 PM

#

wheat snow ye, so it isnt prohibited to ask for help

its not prohibited but you will probably get faster response if youre in a rush with a help channel

#

otherwise you will have to wait till peeps are online

wheat snow Jun 29, 2022, 8:10 PM

#

misty flint its not prohibited but you will probably get faster response if youre in a rush ...

you think? but problem is people have to get familiar with my project in order to understand ig

last salmon Jun 29, 2022, 8:10 PM

#

can you guys help me

#

or do you need help like me lol

misty flint Jun 29, 2022, 8:10 PM

#

hmm you should try to break down your problem into something someone with no context can understand then

hollow sentinel Jun 29, 2022, 8:10 PM

#

sometimes i've asked in help channels and then people who don't know pandas try to hop in and help

misty flint Jun 29, 2022, 8:11 PM

#

last salmon and results

have you tried a machine learning, specifically Computer Vision, course or something similar

last salmon Jun 29, 2022, 8:11 PM

#

misty flint have you tried a machine learning, specifically Computer Vision, course or somet...

nope :D

misty flint Jun 29, 2022, 8:11 PM

#

hollow sentinel sometimes i've asked in help channels and then people who don't know pandas try ...

they are the real cough

#

i have to go now

#

peace

wheat snow Jun 29, 2022, 8:15 PM

#

#

@eternal trench

hollow sentinel Jun 29, 2022, 8:17 PM

#

models = []

models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier())) 
models.append(('NB', GaussianNB())) 
models.append(('SVM', SVC()))
results = []
names = []

validation_size = 0.20
seed = 7
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y,
    test_size=validation_size, random_state=seed)

import category_encoders as ce

encoder = ce.OneHotEncoder(cols=['workclass', 'education', 'marital-status', 'occupation', 'relationship', 
                                 'race', 'sex', 'native-country'])

X_train = encoder.fit_transform(X_train)

X_test = encoder.transform(X_test).reshape(14)

for name, model in models:
  kfold = KFold(n_splits=10, random_state=seed, shuffle = True)
  cv_results = cross_val_score(model, X_train, Y_train, cv=kfold, scoring="accuracy")
  results.append(cv_results)
  names.append(name)
  msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
  print(msg)

#

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-88-bb069de3b56c> in <module>()
     22 X_train = encoder.fit_transform(X_train)
     23 
---> 24 X_test = encoder.transform(X_test).reshape(14)
     25 
     26 for name, model in models:

1 frames
/usr/local/lib/python3.7/dist-packages/category_encoders/utils.py in _check_transform_inputs(self, X)
    321         # then make sure that it is the right size
    322         if X.shape[1] != self._dim:
--> 323             raise ValueError(f'Unexpected input dimension {X.shape[1]}, expected {self._dim}')
    324 
    325     def _drop_invariants(self, X: pd.DataFrame, override_return_df: bool) -> Union[np.ndarray, pd.DataFrame]:

ValueError: Unexpected input dimension 108, expected 14

#

sigh

#

i have no clue what to do here

wheat snow Jun 29, 2022, 8:19 PM

#

untold bloom IIUC, again we groupby

@eternal trench

hollow sentinel Jun 29, 2022, 8:19 PM

#

who is jock

#

yeah idk how to fix this

#

.reshape maybe?

untold bloom Jun 29, 2022, 8:24 PM

#

wheat snow df_vd.loc[df_vd["Profile Name"].eq("Rapha"), "Duration"].dt.total_seconds()/60/6...

so this doesn't work?

#

if not, what's the error?

hollow sentinel Jun 29, 2022, 8:26 PM

#

i'm stumped

#

maybe use .getdummies instead?

#

idk what to do here

untold bloom Jun 29, 2022, 8:29 PM

#

hollow sentinel idk what to do here

can you print X_train.shape and X_test.shape right before X_train = encoder.fit_transform(X_train) line?

hollow sentinel Jun 29, 2022, 8:30 PM

#

(26048, 14)
(9769, 108)

untold bloom Jun 29, 2022, 8:31 PM

#

so something bad happened between train_test_split and that point

#

because train_test_split won't mess with the number of features (i.e., number of columns)

wooden sail Jun 29, 2022, 8:32 PM

#

what's the shape of the original X?

hollow sentinel Jun 29, 2022, 8:32 PM

#

i guess it's the encoder

#

(32561, 14)

untold bloom Jun 29, 2022, 8:33 PM

#

my guess is: you already transformed X_test sometime before; now it's as if you're trying to transform again

#

are you working with JupyterLab/Notebook?

untold bloom Jun 29, 2022, 8:34 PM

#

untold bloom my guess is: you already transformed `X_test` sometime before; now it's as if yo...

like running that cell again would give that error

hollow sentinel Jun 29, 2022, 8:34 PM

#

yeah

#

oh i see

#

yeah i noticed on github people were using .ipynb

#

so i decided to use google colab

wooden sail Jun 29, 2022, 8:34 PM

#

hmm my impression is that x train transform does the transformation implicitly, but does not change the variable in place

#

so that it might not be necessary to encode x test at all

hollow sentinel Jun 29, 2022, 8:35 PM

#

but if i don't do that i get a weirder error

untold bloom Jun 29, 2022, 8:35 PM

#

no, X_train and X_test are different entities

wooden sail Jun 29, 2022, 8:35 PM

#

i'm aware

untold bloom Jun 29, 2022, 8:35 PM

#

if a transformation happened to X_train, that should happen to X_test as well

wooden sail Jun 29, 2022, 8:35 PM

#

what error do you get if you don't transform x test?

hollow sentinel Jun 29, 2022, 8:36 PM

#

!pastebin

arctic wedgeBOT Jun 29, 2022, 8:36 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

hollow sentinel Jun 29, 2022, 8:36 PM

#

https://paste.pythondiscord.com/sabijobiha

untold bloom Jun 29, 2022, 8:36 PM

#

a piece of advice: in these Jupyter-like environments, it helps to make new variables whenever possible (and when it makes sense) instead of assigning to the same name

hollow sentinel Jun 29, 2022, 8:36 PM

#

bc these are categorical features

untold bloom Jun 29, 2022, 8:36 PM

#

e.g., you could do X_test_encoded = encoder.transform(X_test) above and that error wouldn't have happened

#

similar for X_train_encoded = ...
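
A minimal sketch of that habit. To keep it runnable with pandas alone, it swaps `category_encoders` for `pd.get_dummies` plus a `reindex` (a different technique from the encoder used in the notebook), and the two small frames are invented:

```python
import pandas as pd

# Tiny made-up split; the real data has 14 columns.
X_train = pd.DataFrame({"age": [25, 40, 31], "workclass": ["Private", "Federal-gov", "Private"]})
X_validation = pd.DataFrame({"age": [52, 19], "workclass": ["Private", "Self-emp"]})

# Encode into NEW names so re-running the cell never re-encodes encoded data.
X_train_encoded = pd.get_dummies(X_train, columns=["workclass"])
X_validation_encoded = (
    pd.get_dummies(X_validation, columns=["workclass"])
    # Keep exactly the training columns: unseen categories are dropped,
    # categories missing from validation are filled with 0.
    .reindex(columns=X_train_encoded.columns, fill_value=0)
)

print(X_train_encoded.shape, X_validation_encoded.shape)
print(X_train.shape)  # original frame is untouched
```

Because `X_train` and `X_validation` keep their original 14-column form, the cell can be run any number of times and always produces the same encoded frames.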

hollow sentinel Jun 29, 2022, 8:37 PM

#

i see

#

i still get that same error after making X_train_encoded and X_test_encoded

untold bloom Jun 29, 2022, 8:38 PM

#

possible; X_test has already been transformed to have 108 features; trying again to transform it will error

#

perhaps restart the kernel

wooden sail Jun 29, 2022, 8:38 PM

#

i find it weird that the number of examples in x train and x test doesnt add up to the total in x, too

#

yeah, restart the kernel first

#

then show again the original sizes of x, xtest, and x train before any transformation is applied

hollow sentinel Jun 29, 2022, 8:45 PM

#

models = []

models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier())) 
models.append(('NB', GaussianNB())) 
models.append(('SVM', SVC()))
results = []
names = []

validation_size = 0.20
seed = 7
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y,
    test_size=validation_size, random_state=seed)

import category_encoders as ce

encoder = ce.OneHotEncoder(cols=['workclass', 'education', 'marital-status', 'occupation', 'relationship', 
                                 'race', 'sex', 'native-country'])

X_train = encoder.fit_transform(X_train)

X_test = encoder.transform(X_test).reshape(14)

for name, model in models:
  kfold = KFold(n_splits=10, random_state=seed, shuffle = True)
  cv_results = cross_val_score(model, X_train, Y_train, cv=kfold, scoring="accuracy")
  results.append(cv_results)
  names.append(name)
  msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
  print(msg)

#

i don't see an X_test here

wooden sail Jun 29, 2022, 8:45 PM

#

x train and x validation, then

#

where was x test defined, then, btw?

hollow sentinel Jun 29, 2022, 8:46 PM

#

apparently nowhere

wooden sail Jun 29, 2022, 8:46 PM

#

doesn't seem to come from splitting of the data

#

nice

hollow sentinel Jun 29, 2022, 8:46 PM

#

maybe that's why it wasn't working

wooden sail Jun 29, 2022, 8:46 PM

#

my best guess is they meant validation

hollow sentinel Jun 29, 2022, 8:47 PM

#

(26048, 14)

#

i tried to print X_validation but nothing showed up

#

actually hold on

#

[6513 rows x 14 columns]

wooden sail Jun 29, 2022, 8:47 PM

#

that makes sense

#

then the first dim of those two adds up to X's first dim
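
The row counts can be checked by hand. This sketch assumes the validation share is rounded up and the training set takes the remainder, which matches the shapes reported above:

```python
import math

n_samples = 32561      # rows in X
validation_size = 0.20

n_validation = math.ceil(validation_size * n_samples)  # assumption: test share rounded up
n_train = n_samples - n_validation

print(n_train, n_validation)  # 26048 6513
```

The earlier `(9769, 108)` shape does not fit this split at all, which is another hint that `X_test` came from somewhere else.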

hollow sentinel Jun 29, 2022, 8:48 PM

#

so then what's wrong with the one hot encoding

wooden sail Jun 29, 2022, 8:48 PM

#

well, swap out the nonexistent variable with one that exists and see what error we get (i.e. x test <- x validation)

#

however, as nahita suggested, i strongly suggest you don't modify x train and x validation in place, but make another variable instead and use those

#

because as it turns out, jupyter is terrible for debugging

#

if you make changes in place, you might need to rerun the whole code from the beginning

hollow sentinel Jun 29, 2022, 8:50 PM

#

models = []

models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier())) 
models.append(('NB', GaussianNB())) 
models.append(('SVM', SVC()))
results = []
names = []

validation_size = 0.20
seed = 7
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y,
    test_size=validation_size, random_state=seed)

print(X_train.shape)
print(X_validation)

import category_encoders as ce

encoder = ce.OneHotEncoder(cols=['workclass', 'education', 'marital-status', 'occupation', 'relationship', 
                                 'race', 'sex', 'native-country'])

X_train_encoded = encoder.fit_transform(X_train)

X_validation_encoded = encoder.transform(X_validation)


for name, model in models:
  kfold = KFold(n_splits=10, random_state=seed, shuffle = True)
  cv_results = cross_val_score(model, X_train, Y_train, cv=kfold, scoring="accuracy")
  results.append(cv_results)
  names.append(name)
  msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
  print(msg)

#

!pastebin

arctic wedgeBOT Jun 29, 2022, 8:50 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

hollow sentinel Jun 29, 2022, 8:50 PM

#

https://paste.pythondiscord.com/imazusisuk

#

looks like one hot encoding didn't work

wooden sail Jun 29, 2022, 8:51 PM

#

needed to use x_train_encoded below

wheat snow Jun 29, 2022, 8:51 PM

#

untold bloom so this doesn't work?

everything works now

#

i received help

#

btw @untold bloom it worked... your thing ( i was just missing some () somewhere)

hollow sentinel Jun 29, 2022, 8:52 PM

#

omg it's WORKING

wooden sail Jun 29, 2022, 8:52 PM

#

cool

hollow sentinel Jun 29, 2022, 8:52 PM

#

why do people use .ipynbs on github

wooden sail Jun 29, 2022, 8:52 PM

#

then the issue was wrong variable names (test ~ val) and running the code out of order (jupyter)

#

because github can render the notebooks and they might be "pretty to look at"

#

especially when some nice latex typesetting is used

#

but you'd never use jupyter notebooks for real work 😛

#

it's a nice display tool, not good for development nor deployment though

hollow sentinel Jun 29, 2022, 8:54 PM

#

i like thonny

#

but i thought my github looked weird with everything as a .py file

wooden sail Jun 29, 2022, 8:54 PM

#

right

hollow sentinel Jun 29, 2022, 8:54 PM

#

i did see people showcase their eda projects with notebooks

#

jupyter notebook actually broke on my mac and i can't even reopen it anymore

wooden sail Jun 29, 2022, 8:54 PM

#

yeah, so ideally you'd put all your nice modules into .py, and then make a slick demo in a jupyter notebook

hollow sentinel Jun 29, 2022, 8:54 PM

#

it's been like that for months

#

i see

#

welp third project

wooden sail Jun 29, 2022, 8:55 PM

#

congrats

hollow sentinel Jun 29, 2022, 9:01 PM

#

i am slowly learning this stuff

#

project based learning works

#

even if it's just regression and classification

#

idk if it's enough to turn heads for a portfolio yet, but it's a start

wooden sail Jun 29, 2022, 9:06 PM

#

i can't speak about portfolios, but yeah, motivation comes from within. that means if you find a nice thing you're interested in, you'll have the motivation to see it through. that's a big factor in learning: actually practicing what you're learning, and you won't practice it if you're not interested/motivated

hollow sentinel Jun 29, 2022, 9:08 PM

#

i find it hard to come up with portfolio projects

#

but i'll get there

#

one step at a time

hoary breach Jun 29, 2022, 9:11 PM

#

can anyone help me with a problem involving SpaCy?

primal shuttle Jun 29, 2022, 9:24 PM

#

@hoary breach ask your question, don't ask to ask 🙂

serene scaffold Jun 29, 2022, 9:27 PM

#

hoary breach can anyone help me with a problem involving SpaCy?

don't ask to ask, just ask.

primal shuttle Jun 29, 2022, 9:28 PM

#

... 😉

hoary breach Jun 29, 2022, 9:29 PM

#

in spacy you can use similarity for some data

serene scaffold Jun 29, 2022, 9:29 PM

#

primal shuttle ... 😉

by the mouth of two or three shall all be established.

primal shuttle Jun 29, 2022, 9:29 PM

#

dontasktoask . com 😉

#

There is that

hoary breach Jun 29, 2022, 9:29 PM

#

I caught a snag... (AttributeError: 'str' object has no attribute 'similarity')

serene scaffold Jun 29, 2022, 9:30 PM

#

hoary breach I caught a snag... (AttributeError: 'str' object has no attribute 'similarity')

please show the code and the error message as text

#

!code

arctic wedgeBOT Jun 29, 2022, 9:30 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

hoary breach Jun 29, 2022, 9:30 PM

#

My data for SpaCy

📎 panel_discussion.csv

serene scaffold Jun 29, 2022, 9:30 PM

#

what you've shown us is just the last line of a larger error message. the last line isn't very useful in itself.

arctic wedgeBOT Jun 29, 2022, 9:30 PM

#

Hey @hoary breach!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

serene scaffold Jun 29, 2022, 9:31 PM

#

no one is going to download and run your notebook. please just copy and paste the relevant code.

hoary breach Jun 29, 2022, 9:31 PM

#

train_data = nlp(data)

#

print(data['parsed_doc'][0].similarity(data['parsed_doc'][1]))

serene scaffold Jun 29, 2022, 9:31 PM

#

so, data['parsed_doc'][0] is a string

primal shuttle Jun 29, 2022, 9:31 PM

#

train_data = nlp(data)
print(data['parsed_doc'][0].similarity(data['parsed_doc'][1]))

serene scaffold Jun 29, 2022, 9:32 PM

#

what type of object do you expect to have a similarity method?

#

or is it a function from some spacy module?

hoary breach Jun 29, 2022, 9:33 PM

#

it is a spaCy object

#

import pandas as pd
data = pd.read_csv('panel_discussion.csv')

serene scaffold Jun 29, 2022, 9:34 PM

#

do print(type(data['parsed_doc'][0])) and you'll see.
spacy is the name of the library. "spacy" is not a type of object.

hoary breach Jun 29, 2022, 9:35 PM

#

it is a str class

primal shuttle Jun 29, 2022, 9:35 PM

#

Yup

serene scaffold Jun 29, 2022, 9:35 PM

#

what you mean is "it's an instance of str" or "it's a str". it's not a "str class". these distinctions matter.

primal shuttle Jun 29, 2022, 9:35 PM

#

So you cannot compare strings in terms of similarity

hoary breach Jun 29, 2022, 9:35 PM

#

similarity is a built in function in spacy

primal shuttle Jun 29, 2022, 9:36 PM

#

Yes, that operates on vectors, not strings

hoary breach Jun 29, 2022, 9:36 PM

#

parsed doc refers to the column of the data

#

based on this notebook https://www.kaggle.com/code/caractacus/thematic-text-analysis-using-spacy-networkx/notebook

Thematic text analysis using spaCy, networkX.

Explore and run machine learning code with Kaggle Notebooks | Using data from Democrat Vs. Republican Tweets

#

Why is it, then, that they use the columns and can in fact calculate it?

#

Is it that once you write a dataframe out to a CSV and import it again, you can no longer use similarity?

#

relevant:
tokens = []
lemma = []
pos = []
parsed_doc = []
col_to_parse = 'Q1'
col5_to_parse = 'AddQ'

for doc in nlp.pipe(data[col5_to_parse].astype('unicode').values, batch_size=1,
                    n_process=1):
    if doc.has_annotation("DEP"):
        parsed_doc.append(doc)
        tokens.append([n.text for n in doc])
        lemma.append([n.lemma_ for n in doc])
        pos.append([n.pos_ for n in doc])
    else:
        # We want to make sure that the lists of parsed results have the
        # same number of entries of the original Dataframe, so add some blanks in case the parse fails
        parsed_doc.append(None)
        tokens.append(None)
        lemma.append(None)
        pos.append(None)
data['parsed_doc'] = parsed_doc
data['comment_tokens'] = tokens
data['comment_lemma'] = lemma
data['pos_pos'] = pos

primal shuttle Jun 29, 2022, 9:40 PM

#

relevant:
tokens = []
lemma = []
pos = []
parsed_doc = []
col_to_parse = 'Q1'
col2_to_parse = 'Q2'
col3_to_parse = 'Q3'
col4_to_parse = 'Q4'
col5_to_parse = 'AddQ'
col6_to_parse = 'LastQ'

for doc in nlp.pipe(data[col5_to_parse].astype('unicode').values, batch_size=1,
                    n_process=1):
    if doc.has_annotation("DEP"):
        parsed_doc.append(doc)
        tokens.append([n.text for n in doc])
        lemma.append([n.lemma_ for n in doc])
        pos.append([n.pos_ for n in doc])
    else:
        # We want to make sure that the lists of parsed results have the
        # same number of entries of the original Dataframe, so add some blanks in case the parse fails
        parsed_doc.append(None)
        tokens.append(None)
        lemma.append(None)
        pos.append(None)
data['parsed_doc'] = parsed_doc
data['comment_tokens'] = tokens
data['comment_lemma'] = lemma
data['pos_pos'] = pos

#

!code

arctic wedgeBOT Jun 29, 2022, 9:41 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

hoary breach Jun 29, 2022, 9:41 PM

#

sweet

#

i am in voice chat if someone cares to help more

#

so from my understanding the columns (like parsed doc) get appended to a pandas dataframe

#

I printed the data out and imported a csv.

#

is it that pandas dataframe presents data as a vector so that you can use similarity?

#

the example they provide is this

#

!code

arctic wedgeBOT Jun 29, 2022, 9:50 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

hoary breach Jun 29, 2022, 9:55 PM

#

```py
doc2 = nlp("How do I obtain a pet?")
doc1.similarity(doc2)
```

serene scaffold Jun 29, 2022, 10:07 PM

#

what would you like to say about this?

blissful bone Jun 29, 2022, 10:08 PM

#

serene scaffold what would you like to say about this?

It's a recent blog post I wrote about scalable training

serene scaffold Jun 29, 2022, 10:13 PM

#

blissful bone It's a recent blog post I wrote about scalable training

we're not a platform for self-promotion, so if you post your own content, please do so in the context of a conversation about the topic.

blissful bone Jun 29, 2022, 10:13 PM

#

Happy?

serene scaffold Jun 29, 2022, 10:15 PM

#

blissful bone Happy?

I'm never happy Sadge

hoary breach Jun 29, 2022, 10:15 PM

#

when I run the code through the 'doc' class (posted above) I get <class 'spacy.tokens.doc.Doc'>

#

which is good! however my issue is i wanted to get multiple columns involved

#

so I parsed each part individually... but that results in printing a 'str'

hoary breach Jun 29, 2022, 10:19 PM

#

primal shuttle ```py relevant: tokens = [] lemma = [] pos = [] parsed_doc = [] col_to_parse = ...

why does this result in a vector? It clearly appends values to my csv.

#

I made a workaround instead but thanks for the help
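
One likely explanation for the `str` result: a CSV file stores only text, so any object written to it comes back as a string when the file is read again. The sketch below shows this with the standard `csv` module and a made-up `FakeDoc` class standing in for a spaCy `Doc`. The class, its toy `similarity` method, and the sample sentence are assumptions for illustration only.

```python
import csv
import io

class FakeDoc:
    """Hypothetical stand-in for a parsed document object."""
    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def similarity(self, other):
        # Toy measure (word overlap), NOT what spaCy computes.
        a, b = set(self.text.split()), set(other.text.split())
        return len(a & b) / len(a | b) if a | b else 0.0

doc = FakeDoc("how do I get a pet")
print(type(doc).__name__)   # FakeDoc

# Write the object to an in-memory CSV; csv.writer stores str(doc).
buffer = io.StringIO()
csv.writer(buffer).writerow([doc])

# Read it back: only the text survives, the object and its methods are gone.
buffer.seek(0)
restored = next(csv.reader(buffer))[0]
print(type(restored).__name__)          # str
print(hasattr(restored, "similarity"))  # False

# Re-creating the object from the text restores the method.
rebuilt = FakeDoc(restored)
print(rebuilt.similarity(doc))          # 1.0
```

With spaCy, the equivalent of the last step would be calling `nlp(text)` on each string read from the CSV, which returns a `Doc` again.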

hoary breach Jun 29, 2022, 10:54 PM

#

🙂

hollow sentinel Jun 29, 2022, 11:12 PM

#

serene scaffold I'm never happy Sadge

https://tenor.com/view/my-first-girlfriend-turned-into-the-moon-thats-rough-that-rough-buddy-avatar-avatar-the-last-airbender-gif-5710468

Tenor

hollow sentinel Jun 29, 2022, 11:33 PM

#

where can i get help w selenium?

serene scaffold Jun 29, 2022, 11:50 PM

#

hollow sentinel where can i get help w selenium?

#web-development, I guess

hollow sentinel Jun 30, 2022, 12:06 AM

#

shit, i think i just scraped a website i wasn't supposed to

#

and proceeded to get banned

misty flint Jun 30, 2022, 12:21 AM

#

hollow sentinel and proceeded to get banned

which website

#

ZoomEyes

misty flint Jun 30, 2022, 12:21 AM

#

serene scaffold I'm never happy Sadge

same. especially with aws

#

aws CL5_FeelsBongoMan

hollow sentinel Jun 30, 2022, 12:24 AM

#

misty flint which website

whoscored

misty flint Jun 30, 2022, 12:25 AM

#

dang

#

what if you used sleep() / wait()
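
A rough sketch of the sleep idea: pause between requests so the scraper does not hammer the site. `fetch` is a hypothetical callable supplied by the caller (for example a wrapper around a Selenium or HTTP call), and the delay values are arbitrary. Throttling like this is a courtesy to the site and may not be enough against dedicated bot protection.

```python
import random
import time

def fetch_politely(urls, fetch, min_delay=1.0, max_delay=3.0):
    """Call fetch(url) for each URL, sleeping a random interval between calls."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # Wait before every request except the first one.
            time.sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results
```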

hollow sentinel Jun 30, 2022, 12:26 AM

#

no they use something called

#

imperva

#

which sounds like a harry potter spell but it's this thing that blocks scrapers

misty flint Jun 30, 2022, 12:27 AM

#

kekHands

#

PikaThink

#

it really does

hollow sentinel Jun 30, 2022, 12:27 AM

#

yep

#

some companies don't love getting scraped

#

ESPN let me scrape them

misty flint Jun 30, 2022, 12:27 AM

#

im surprised espn doesnt have an api

#

or do they

#

oh hey they do

#

http://www.espn.com/apis/devcenter/docs/

#

you can just grab the data from here

#

the only thing is you need the requests library https://pypi.org/project/requests/

PyPI

requests

Python HTTP for Humans.

#

fun data engineering times

#

kekHands

hollow sentinel Jun 30, 2022, 12:30 AM

#

idk how to do api requests

#

time to learn
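
For anyone in the same spot, a minimal sketch of what an API request looks like. It uses the standard library's `urllib` instead of `requests` so it runs without installing anything. The endpoint URL, the `league` parameter, and the `get_json` helper are hypothetical and do not describe any real ESPN API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def get_json(base_url, params=None, opener=urlopen):
    """Send a GET request and decode the JSON body of the response."""
    url = base_url
    if params:
        # Encode the parameters into a query string, e.g. ?league=nba
        url = f"{base_url}?{urlencode(params)}"
    with opener(url, timeout=10) as response:
        return json.load(response)

# Example call (hypothetical endpoint, not executed here):
# scores = get_json("https://api.example.com/scores", {"league": "nba"})
```

With the `requests` library the same call is roughly `requests.get(url, params=params, timeout=10).json()`.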

misty flint Jun 30, 2022, 12:30 AM

#

its okay you can usually google those and its a good skill to have

misty flint Jun 30, 2022, 12:31 AM

#

hollow sentinel time to learn

py_strong

#

like if you added the ability to work with APIs to your projects, that would def go a long way imo

#

since more and more places require calling APIs for collecting data

#

nowadays

hollow sentinel Jun 30, 2022, 12:32 AM

#

https://www.youtube.com/watch?v=D2APJrBwZBQ

YouTube

Caleb Curry

Consume an API with Python Requests

misty flint Jun 30, 2022, 12:33 AM

#

that looks pretty comprehensive