lapis sequoia Dec 23, 2020, 10:05 PM

#

Is there any possibility to sort data in linked list with time complexity O(n log(n))

odd yoke Dec 23, 2020, 10:34 PM

#

yes, you can use merge sort for example, but that's a question better fit for #algos-and-data-structs
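
To make the merge sort suggestion concrete, here is a minimal sketch for a singly linked list. The `Node` class and the `from_list` / `to_list` helpers are made up for illustration; they are not from the conversation. The list is split with slow/fast pointers and the sorted halves are merged, which gives O(n log n) comparisons.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical singly linked list node used only for this sketch."""
    value: int
    next: Optional["Node"] = None


def merge_sort(head: Optional[Node]) -> Optional[Node]:
    # Lists of length 0 or 1 are already sorted.
    if head is None or head.next is None:
        return head

    # Find the middle with slow/fast pointers and cut the list in two.
    slow, fast = head, head.next
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    right = slow.next
    slow.next = None

    left = merge_sort(head)
    right = merge_sort(right)

    # Merge the two sorted halves behind a dummy head node.
    dummy = tail = Node(0)
    while left is not None and right is not None:
        if left.value <= right.value:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left if left is not None else right
    return dummy.next


def from_list(values: List[int]) -> Optional[Node]:
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head: Optional[Node]) -> List[int]:
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


sorted_values = to_list(merge_sort(from_list([4, 1, 3, 2, 5])))
```

The recursion depth is about log2(n), so this version should be fine for fairly long lists; an iterative bottom-up merge would avoid recursion entirely.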

serene scaffold Dec 23, 2020, 10:55 PM

#

I'd be interested in people's opinions on my reddit thread about pandas: https://www.reddit.com/r/Python/comments/kj36vp/type_hints_for_pandas_what_would_we_need_what_can/

#

huh, markdown didn't work

teal sluice Dec 23, 2020, 11:23 PM

#

I have a pandas dataframe which contains spending and the month of spending, what would be the best way to create another dataframe which holds the total spent for each month??

torpid cave Dec 24, 2020, 12:29 AM

#

@teal sluice group + sum maybe

#

or summarise
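
A minimal sketch of the group + sum idea in pandas. The column names `month` and `spent` and the sample values are assumptions, since the original dataframe was never shown.

```python
import pandas as pd

# Hypothetical spending data; the real column names were not posted.
spending = pd.DataFrame(
    {
        "month": ["Jan", "Jan", "Feb", "Mar", "Feb"],
        "spent": [10.0, 5.0, 7.5, 3.0, 2.5],
    }
)

# One row per month with the summed spending.
# sort=False keeps months in order of first appearance instead of alphabetical.
monthly_total = spending.groupby("month", as_index=False, sort=False)["spent"].sum()
```

`as_index=False` keeps `month` as a regular column, so the result is a new two-column dataframe rather than a Series indexed by month.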

#

Hi all, has anyone configured a Raspberry PI as a server/VM to run your own scripts via http requests?

#

I just need some quick guidance

lapis sequoia Dec 24, 2020, 12:47 AM

#

is it worth dividing images by 255 when training?

#

like, from uint 8 to float 64

serene scaffold Dec 24, 2020, 1:14 AM

#

teal sluice I have a pandas dataframe which contains spending and the month of spending, wha...

can you show some examples?

#

ping me if you do.

velvet thorn Dec 24, 2020, 2:03 AM

#

serene scaffold I'd be interested in people's opinions on my reddit thread about pandas: https:/...

I can give an opinion about the power of static type checking in such cases

serene scaffold Dec 24, 2020, 2:04 AM

#

velvet thorn I can give an opinion about the power of static type checking in such cases

👂

velvet thorn Dec 24, 2020, 2:04 AM

#

there's something called dependent typing

#

which basically means that the type of a value depends on another value

#

this can be used to encode, for example, an array's length in its type

#

which gives you stronger guarantees regarding whether any particular expression is well-formed

#

e.g. elementwise addition of unequally sized arrays

#

however, this is still (relatively) an academic thing

#

(and is also pretty complex)

#

so there are some things we probz can't do right now.

#

and this also raises the question: what do we want to encode in a dataframe's type?

#

at some point, it might be a question of whether this should be delegated to runtime property checking instead

serene scaffold Dec 24, 2020, 2:07 AM

#

velvet thorn at some point, it might be a question of whether this should be delegated to run...

property checking?

velvet thorn Dec 24, 2020, 2:07 AM

#

i.e. not encoded within a type parameter

#

for example

#

in most languages, division by zero is handled as a runtime error

#

not as a compile time type error, because there is no NonzeroNumber type

#

there's also the question of axis alignment

#

if two dataframes have the same column names in a different order

#

are they the same type?

#

different column names, but same data types?
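
One way to picture the axis-alignment question is to treat a dataframe "type" as a plain mapping from column name to dtype and check it at runtime. This is only a sketch of the idea under that assumption; `same_schema` is a hypothetical helper, not a pandas or typing API.

```python
from typing import Mapping


def same_schema(a: Mapping[str, str], b: Mapping[str, str], ordered: bool) -> bool:
    """Compare two column-name -> dtype mappings, optionally caring about order."""
    if ordered:
        return list(a.items()) == list(b.items())
    return dict(a) == dict(b)


left = {"price": "float64", "city": "object"}
right = {"city": "object", "price": "float64"}     # same columns, different order
renamed = {"cost": "float64", "town": "object"}    # same dtypes, different names

order_insensitive = same_schema(left, right, ordered=False)
order_sensitive = same_schema(left, right, ordered=True)
names_differ = same_schema(left, renamed, ordered=False)
```

Whether `left` and `right` count as "the same type" depends entirely on the `ordered` flag, which is the design decision being discussed here.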

serene scaffold Dec 24, 2020, 2:10 AM

#

velvet thorn if two dataframes have the same column names in a different order

that's a point I alluded to in my post. Some use cases may depend on column order, others may not.

#

my current thinking is that there would be power in having a language for documenting what properties a dataframe has, even if the linter can ultimately only assume that if a function returns a SomeDataFrameType, that object is a valid argument for a function that takes a SomeDataFrameType.

lapis sequoia Dec 24, 2020, 2:15 AM

#

I have another question. I have moved to colab since it has gpu acceleration (better than mine for sure) and i uploaded all my images to drive (1 hour it took). Now, i need to append all the images to an array, to give it as input to my nn. But colab takes like tooooooooooo long to append all the images. Any suggestion?

velvet thorn Dec 24, 2020, 2:16 AM

#

serene scaffold that's a point I alluded to in my post. Some use cases may depend on column orde...

yup, precisely

#

but that would still be a type parameter

#

and it would be irrelevant for other functions

#

so if you wanted to spec this out

#

you would need to think about whether there's a practically viable type hierarchy

#

that can encode the necessary information

velvet thorn Dec 24, 2020, 2:17 AM

#

lapis sequoia I have another question. I have moved to colab since it has gpu acceleration (be...

how are you doing it

lapis sequoia Dec 24, 2020, 2:18 AM

#

```py
for pok in pokemons:
    path = os.path.join(datadir, pok)
    images = os.listdir(path)
    amount = len(images)
    for i in range(amount):
        print(f'Doing {pok}. {amount - i} remaining images')
        img_array = cv2.imread(os.path.join(path, images[i]), params['color_mode'])
        new_array = cv2.resize(img_array, params['dimensions'])
        if i < amount * params['percentage']:
            train_data.append(new_array / 255)
            train_label.append(pok)
        else:
            valid_data.append(new_array / 255)
            valid_label.append(pok)
```

#

i open the image with opencv, i resize it, and i append it to an array (input for later)

velvet thorn Dec 24, 2020, 2:20 AM

#

lapis sequoia ```py for pok in pokemons: path = os.path.join(datadir, pok) ima...

wait.

#

what is train_data?

#

where is it defined

lapis sequoia Dec 24, 2020, 2:20 AM

#

an array

#

above

velvet thorn Dec 24, 2020, 2:20 AM

#

show.

lapis sequoia Dec 24, 2020, 2:20 AM

#

https://gyazo.com/44b10b1ff41bf29ee8b27f97892ef887

velvet thorn Dec 24, 2020, 2:20 AM

#

paste code

#

instead of images

#

way too small to see

lapis sequoia Dec 24, 2020, 2:21 AM

#

idk how to copy paste code from colab. it is on different cells. wait

#

```py
os.chdir('/content/drive/MyDrive/Colab Notebooks/Python ML/Pokeguesser')

train_data  = []
train_label = []
valid_data  = []
valid_label = []

data_dir = 'dataset'
pokemons = os.listdir(data_dir)
dimensions = (71, 71, 3)
batch_size = 126
num_epochs = 12
percentage = 0.8
```

#

```py
for pok in pokemons:
    path = os.path.join(data_dir, pok)
    images = os.listdir(path)
    amount = len(images)
    for i in range(amount):
        img_array = cv2.imread(os.path.join(path, images[i]), cv2.IMREAD_COLOR)
        new_array = cv2.resize(img_array, dimensions[:2])
        if i < amount * percentage:
            train_data.append(new_array)
            train_label.append(pok)
        else:
            valid_data.append(new_array)
            valid_label.append(pok)
```

velvet thorn Dec 24, 2020, 2:21 AM

#

lapis sequoia an array

it's not an array

#

it's a list

#

it's important to be clear on this

lapis sequoia Dec 24, 2020, 2:22 AM

#

well, sorry if both are different. For me are the same ^^'

velvet thorn Dec 24, 2020, 2:22 AM

#

no

#

they are different.

#

very different.

lapis sequoia Dec 24, 2020, 2:22 AM

#

an array from java is a list on python

velvet thorn Dec 24, 2020, 2:22 AM

#

yes

#

but

#

normally it doesn't matter that much

#

however, in this case

lapis sequoia Dec 24, 2020, 2:22 AM

#

thats why sometimes i call them array

velvet thorn Dec 24, 2020, 2:22 AM

#

when you are working with numpy

#

numpy.ndarray is what is normally called an "array"

#

and because the semantics are different

lapis sequoia Dec 24, 2020, 2:22 AM

#

okey okey

velvet thorn Dec 24, 2020, 2:22 AM

#

from a native Python list

#

it is important to distinguish the two

#

anyway

velvet thorn Dec 24, 2020, 2:23 AM

#

lapis sequoia I have another question. I have moved to colab since it has gpu acceleration (be...

how many images?

lapis sequoia Dec 24, 2020, 2:23 AM

#

26k

velvet thorn Dec 24, 2020, 2:23 AM

#

how big is each image?

lapis sequoia Dec 24, 2020, 2:23 AM

#

mmm there are different sizes

velvet thorn Dec 24, 2020, 2:23 AM

#

disk size

#

what's the range like

#

few hundred kb?

lapis sequoia Dec 24, 2020, 2:23 AM

#

1.14 gb

velvet thorn Dec 24, 2020, 2:24 AM

#

well

#

then

#

that's why it's taking so long

#

loading images is (relatively) slow

lapis sequoia Dec 24, 2020, 2:24 AM

#

i think i am not explaining well, wait

lapis sequoia Dec 24, 2020, 2:24 AM

#

velvet thorn loading images is (relatively) slow

not rlly, at least on local machine

#

hold on 1 sec

#

sorry for a gif, cant think of a different way to show

#

https://gyazo.com/e751899e09744cbfa51cae70d5681ea1

#

this is on my local computer

velvet thorn Dec 24, 2020, 2:26 AM

#

yeah

#

on your local computer

#

I don't know the specifics of Google hardware

lapis sequoia Dec 24, 2020, 2:26 AM

#

https://gyazo.com/0c4afed7894b40cf7d4748b6deff17d8

#

this is on colab

velvet thorn Dec 24, 2020, 2:27 AM

#

but it's very possible that there needs to be transfer over the wire

#

from Drive to Colab

#

which would make it much slower

#

here you can see

#

that loading is much slower

#

and

#

okay, simple way to show if this is true or not

lapis sequoia Dec 24, 2020, 2:27 AM

#

oh

velvet thorn Dec 24, 2020, 2:27 AM

#

img_array = cv2.imread(os.path.join(path, images[i]), cv2.IMREAD_COLOR)

#

this is the line that loads the images

lapis sequoia Dec 24, 2020, 2:28 AM

#

so colab doesnt actually have my images directly?

velvet thorn Dec 24, 2020, 2:28 AM

#

include a print before and after

#

to see how long it takes to load
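
The print-before-and-after idea can be sketched with `time.perf_counter`. The `timed` helper and the `time.sleep` stand-in are assumptions for illustration; in the real loop the body of the `with` block would be the `cv2.imread` call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Record how long the enclosed block takes, in seconds, under log[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log[label] = time.perf_counter() - start


timings = {}
with timed("load", timings):
    time.sleep(0.01)  # stand-in for cv2.imread(...) on one image

print(f"load took {timings['load']:.4f} s")
```

Timing the read and the resize separately for a handful of images should show whether reading from Drive dominates.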

#

IO should be the primary bottleneck here

#

https://medium.com/datadriveninvestor/speed-up-your-image-training-on-google-colab-dc95ea1491cf

Medium

Speed up your image training on Google Colab

Getting a factor 20 speedup in training a cats-vs-dogs classifier for free!

#

this is what I found after a quick search

#

It takes forever to copy files from Drive to Colab. While this is no problem when dealing with very small datasets, it’s very annoying when facing larger data, for example for image classification.

#

you said your data was in Drive

velvet thorn Dec 24, 2020, 2:29 AM

#

lapis sequoia I have another question. I have moved to colab since it has gpu acceleration (be...

right?

lapis sequoia Dec 24, 2020, 2:29 AM

#

yeah, but idk why i thought linking drive to colab would make like a copy on colab side

#

i will try that, one sec

#

idk if i fcked up but

#

!cp -r "{data_dir}" ~

#

will copy the folder on root?

#

cuz i am trying to avoid zipping the images and uploading them to drive again

#

if this doesnt work i will do it tomorrow

serene scaffold Dec 24, 2020, 2:36 AM

#

not to distract from the help that's happening, but now I'm wondering: is the only runtime optimization for numpy that it does iterative operations in C, or can it also secretly run independent operations in parallel?

velvet thorn Dec 24, 2020, 2:50 AM

#

serene scaffold not to distract from the help that's happening, but now I'm wondering: is the on...

huh

#

they're run with SIMD

#

stuff like elementwise addition is run in parallel

#

with aforesaid SIMD

#

uh

velvet thorn Dec 24, 2020, 2:51 AM

#

lapis sequoia ``!cp -r "{data_dir}" ~``

I'm not

#

very good with shell stuff TBH

lapis sequoia Dec 24, 2020, 2:54 AM

#

nvm. dont do chdir on colab

#

it messes up xD

lapis sequoia Dec 24, 2020, 3:25 AM

#

btw, does colab index files in a different way? my subdirectories are named like 001_name, 002_name, 003_name and so on

#

But when i do os.listdir it returns some weird sorted list

#

the first item is the 083
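
`os.listdir` returns entries in arbitrary order on any platform, so the ordering seen locally was never guaranteed. A small sketch, using a temporary directory and made-up folder names, of sorting the result explicitly:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Made-up folder names in the same zero-padded style as the question.
    for name in ("083_c", "001_a", "002_b"):
        os.mkdir(os.path.join(root, name))

    unordered = os.listdir(root)          # order depends on the filesystem
    ordered = sorted(os.listdir(root))    # zero-padded prefixes sort correctly
```

Because the prefixes are zero-padded, plain string sorting is enough here; without padding a numeric sort key would be needed.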

proper tendon Dec 24, 2020, 9:38 AM

#

{
    "server1":
        [
            "id":
                [
                    "s1",
                ],
            "channel1":
                [
                    "c1",
                ],
        ],
    "server2":
        [
            "id":
                ]
                    "s2"
                ],
            "channel2":
                [
                    "c2",
                ],
        ],
},

#

so i got this json

#

```py
import json

with open(r"D:\Heres\Bots\Messager\Files\saves.json") as f:
    data = json.load(f)

server1 = data["server1"]["id"]
channel1 = data["server1"]["channel1"]
server2 = data["server2"]["id"]
channel2 = ["server2"]["channel2"]

print(server1, channel1, server2,channel2)
```

#

and this py

#

and for some reason its not working

#

may someone help? i am doing lotsa stuff, may ya ping me if u can help 🙂

still verge Dec 24, 2020, 10:50 AM

#

what is the error you're seeing?

#

either way, you can't access id and so on since the json data structure isn't a nested dict, you have a list

#

so you'll have to do data["server1"][0]
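
As an alternative to indexing into lists, the file could be restructured so each server is a JSON object. This is a sketch under that assumption, with an inline string standing in for `saves.json`; the key name `channel` is made up. Note also that the posted script's last lookup omits `data`, and that the posted JSON puts `"key": value` pairs inside `[...]`, which `json.load` would reject.

```python
import json

# Assumed layout: one object per server instead of lists of lists.
raw = """
{
    "server1": {"id": "s1", "channel": "c1"},
    "server2": {"id": "s2", "channel": "c2"}
}
"""

data = json.loads(raw)

server1 = data["server1"]["id"]
channel1 = data["server1"]["channel"]
server2 = data["server2"]["id"]
channel2 = data["server2"]["channel"]

print(server1, channel1, server2, channel2)
```

With objects, the `data["server1"]["id"]` style of lookup from the original script works as written.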

proper tendon Dec 24, 2020, 12:35 PM

#

fixed the issue ty

lapis sequoia Dec 24, 2020, 12:57 PM

#

ValueError: Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'numpy.ndarray'>"}), (<class 'list'> containing values of types {"<class 'int'>"})

#

Can someone help me fixing this error?
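
This error message says Keras received plain Python lists (of arrays and of ints). A likely fix, sketched here with dummy images of the 71x71x3 size mentioned earlier, is to turn both lists into NumPy arrays before calling `model.fit`. The variable names mirror the earlier snippet; the data is fake.

```python
import numpy as np

# Fake stand-ins for the lists built by the loading loop.
train_data = [np.zeros((71, 71, 3), dtype=np.uint8) for _ in range(4)]
train_label = [0, 1, 0, 2]

# Stack the per-image arrays into one (n, height, width, channels) array
# and scale pixel values from 0..255 to 0..1 as float32.
x_train = np.stack(train_data).astype("float32") / 255.0
y_train = np.asarray(train_label)
```

Using `float32` rather than `float64` halves the memory for the image array, which matters with around 26k images.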

uneven monolith Dec 24, 2020, 2:06 PM

#

How much math is needed for Data Science?

trail jacinth Dec 24, 2020, 2:43 PM

#

uneven monolith How much math is needed for Data Science?

calculus, linear algebra and statistics

uneven monolith Dec 24, 2020, 2:52 PM

#

Ty

sweet zenith Dec 24, 2020, 2:54 PM

#

hey guys, I'm looking for a startup idea on AI.. If you have some good ideas do tell me..

lapis sequoia Dec 24, 2020, 2:55 PM

#

u could help me doing a nn that recognizes pokemons 😄

sweet zenith Dec 24, 2020, 2:56 PM

#

is AR a big thing in future?

foggy swift Dec 24, 2020, 3:10 PM

#

sweet zenith is AR a big thing in future?

I think yes it is

sweet zenith Dec 24, 2020, 3:10 PM

#

oh

#

ok

clever raft Dec 24, 2020, 4:26 PM

#

So do i

vestal tiger Dec 24, 2020, 5:44 PM

#

probably a noob ass question but i have a correlation matrix, how do i extract the highest pairs, as well as what each pair is? for example, that the correlation between x and y was .7? most of the methods I am seeing show the correlation number, not what the two variables are

#

basically i want to extract the highest values from a matrix and what the two variables are
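
One possible approach, sketched on made-up data: keep only the upper triangle of the correlation matrix (so each pair appears once and the diagonal is dropped), stack it into a Series indexed by the two variable names, and sort by absolute value.

```python
import numpy as np
import pandas as pd

# Made-up data; x and y are strongly correlated by construction.
df = pd.DataFrame(
    {
        "x": [1, 2, 3, 4, 5],
        "y": [2, 4, 6, 8, 11],
        "z": [5, 3, 4, 1, 2],
    }
)
corr = df.corr()

# True strictly above the diagonal, False elsewhere.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)

# Series indexed by (variable_a, variable_b) holding each pair's correlation.
pairs = corr.where(upper).stack().dropna()

# Strongest correlations first, ignoring sign.
pairs = pairs.reindex(pairs.abs().sort_values(ascending=False).index)
top_pair = pairs.index[0]
```

`pairs.head(n)` then lists the n strongest pairs together with their variable names.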

silver shard Dec 24, 2020, 5:59 PM

#

Hi guys, I don't suppose anyone understands this and can help me get a solution out?

#

https://github.com/google/or-tools/blob/stable/examples/python/arc_flow_cutting_stock_sat.py

GitHub

google/or-tools

Google's Operations Research tools:. Contribute to google/or-tools development by creating an account on GitHub.

#

I've been looking at this for hours, inspecting it with debugging tools trying to find the relationship between the input and output

next moat Dec 24, 2020, 6:28 PM

#

(base) C:\Users\siebe>conda install tensorflow-gpu
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

it keeps doing this. I also tried install only tensorflow (without the gpu)
I have an RTX 2070 Super GPU from Nvidia (MSI)

mellow vapor Dec 24, 2020, 6:32 PM

#

Hey guys, i'm a bit confused with jupyter notebook and anaconda, basically just wanted to know can i use the jupyter notebook as an independent desktop application, or does it only run on the web or with anaconda?

ocean dawn Dec 24, 2020, 6:33 PM

#

Just if someone is interested in Data Distillation https://dasha.ai/en-us/blog/data-distillation

Let's talk about Data Distillation

Let's talk about Data Distillation - a deep learning approach that solves the problem of reducing the sample size and an even more ambitious task - creating synthetic data that stores all the useful information about the sample.

boreal summit Dec 24, 2020, 6:44 PM

#

Mehn, I'm still struggling with my Tensorflow installation. I've uninstalled Python 3.8, I even created a virtual environment and stuff, installed just TF in it but I'm still getting issues.

#

It's saying, DLL runtime error: can't load Tensorflow runtime bla bla bla. But I figured out it has to do with my PC, it doesn't have a GPU, that's why. I successfully installed Tensorflow but it can't run without a GPU.

#

I'd try to get another PC January to resolve this.

long gate Dec 24, 2020, 6:54 PM

#

Can anyone help me with just what algorithm or structure to use on this problem?

https://open.kattis.com/problems/bokforing

My answer is way too slow, although I use python I still know I don't have the right answer. I have one solution to it that would work I guess, but that would be kinda cheating and I want to solve it the right way, any thoughts how to speed up the solution?

violet talon Dec 24, 2020, 10:42 PM

#

sort of a dumb question, but I can't find the answer to this: how do I read an exponent in this format from mpl? 1e-12+9.9995833333 (other than 'really small' 😉 )?

#

read as in, interpret
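
As far as I know, this is matplotlib's scale-and-offset notation for an axis: each tick label is multiplied by `1e-12` and then the offset `9.9995833333` is added. A tiny sketch of that arithmetic, where `actual_value` is a made-up helper name:

```python
scale = 1e-12           # the multiplier shown before the "+"
offset = 9.9995833333   # the constant shown after the "+"


def actual_value(tick_label: float) -> float:
    """Convert a tick label shown on the axis back to the real data value."""
    return tick_label * scale + offset


at_zero = actual_value(0.0)   # exactly the offset
at_two = actual_value(2.0)    # offset plus 2e-12
```

So the data sit extremely close to 9.9995833333 and the ticks only show tiny deviations from it. Calling `ax.ticklabel_format(useOffset=False)` should make matplotlib print full values instead.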

lapis sequoia Dec 24, 2020, 11:16 PM

#

use regex

ember roost Dec 24, 2020, 11:25 PM

#

I want to find out certain metrics in my model. I have binomial distribution of daily pattern and I have average rate of daily metric. How do I find out the rate at particular time ? ( Basically multiplying binomial curve with average should reflect the distribution of data for certain ranges ) Is there any utility to do such kind of analysis ?

lapis sequoia Dec 25, 2020, 1:57 AM

#

How can i use ImageDataGenerator to fit my model after? It is complaining idk why

trim oar Dec 25, 2020, 6:01 AM

#

boreal summit It's saying, DLL runtime error: can't load Tensorflow runtime bla bla bla. But I...

It's more likely you're missing something else. TF shouldn't require a GPU

lapis sequoia Dec 25, 2020, 6:04 AM

#

Feelssadman I’m doing python at home on this christmas day :/

boreal summit Dec 25, 2020, 6:09 AM

#

trim oar It's more likely you're missing something else. TF shouldn't require a GPU

I've been battling this since yesterday. I installed it directly using `pip install tensorflow`. I just learned that there are different installation packages based on your needs. I'll try installing the one that only needs a CPU, not a GPU.

#

Also, I just installed Pytorch and it's working fine.

#

If TF doesn't work on my PC, I'll just go with PyTorch.

earnest herald Dec 25, 2020, 7:39 AM

#

Hello everyone,

I am trying to make a web scraper for Fortune 500. I was thinking of using Scrapy but I can do well with BeautifulSoup.

When I make a get request (and print the soup itself) I end up with useless information named DNS Prefetch and no relevant info about what is on the page. Any idea how I could bypass it?

Thanks a lot!!

lapis sequoia Dec 25, 2020, 8:27 AM

#

guys what's a better way to show lots of graphs in one chart?

#

below is what I did

#

📎 unknown.png

#

this is so messed up

#

Each line graph shows a historical price of certain good for 5 years

#

I want them to be in one graph but is that even viable to make it look better than this :/

#

x is time, y is price btw

gleaming gyro Dec 25, 2020, 8:35 AM

#

why do you want to show that many stuff in one graph

#

is it how it is normally done?

lapis sequoia Dec 25, 2020, 8:38 AM

#

basically that graph is to compare housing price differences between cities

#

I have no other better thought

#

showing some of them makes no sense to me

#

Any advice is highly appreciated!

fleet heath Dec 25, 2020, 11:44 AM

#

@lapis sequoia you're visualizing a lot of data

#

generally, line plot is the best one if you want to compare real estate prices

#

but this is surely not looking good

#

you can try to take the mean values of different cities and then try plotting a bar chart

#

where one bar will represent the mean price of house in that city in a given year

lapis sequoia Dec 25, 2020, 12:20 PM

#

fleet heath @lapis sequoia you're visualizing a lot of data

thanks for the advice, let me try the other way

tight torrent Dec 25, 2020, 12:51 PM

#

Guys will heroku charge me for my add-ons like MySQL, i have my credit card info registered that's why

||sorry if offtopic||

lapis sequoia Dec 25, 2020, 1:11 PM

#

I used LabelEncoder from sklearn to transform my labels into something valid for keras. But once i apply the label encoder, i get 1 list of train_data length

#

And i think keras needs a matrix

#

Yesterday i downloaded cifar10 dataset to see what is has. x_train was a ndarray of 50k of images (ndarrays too). But y_train.shape was (50k, 10) cuz 10 classes. I printed what was y_train[0] and it was a list full of zeros except one

#

On my case, y_train is just 1 list where y_train[i] is the class x_train[i] belongs to

#

But model.fit doesnt accept this
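
A sketch of turning a flat list of class labels into the one-hot matrix described above, using only NumPy. The label names are made up. `np.unique(..., return_inverse=True)` plays the role of the label encoder here.

```python
import numpy as np

# Made-up class labels, one per training image.
labels = ["pikachu", "eevee", "pikachu", "mew"]

# classes: sorted unique names; y_int: integer class index per sample.
classes, y_int = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i.
y_onehot = np.eye(len(classes), dtype="float32")[y_int]
```

Keras also ships `keras.utils.to_categorical` for this step. Alternatively, the integer labels can be kept as they are if the model is compiled with the `sparse_categorical_crossentropy` loss.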

lapis sequoia Dec 25, 2020, 2:58 PM

#

nvm, i fixed it

shadow spruce Dec 25, 2020, 4:21 PM

#

hello i need some help for groupby().get_group() : https://paste.pythondiscord.com/ivofaremos.rb

#

import pandas as pd

# dataframe "ind": returns since 1926, shape (1110, 30)
ind = pd.read_csv("ind30_m_vw_rets.csv", header=0, index_col=0)/100
ind.index = pd.to_datetime(ind.index, format="%Y%m").to_period('M')
ind.columns = ind.columns.str.strip()

# time series correlations over a 36-month rolling window: shape (33300, 30)

ts_corr= ind.rolling(window= 36).corr()

ts_corr.index.names = ["Date","Industry"]

ts_grby =ts_corr.groupby(level = "Date")

ts_grby.get_group("2018-12")

KeyError Traceback (most recent call last)
<ipython-input-8-484dc0e2c324> in <module>
----> 1 ts_grby.get_group("2018-12")

~\anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in get_group(self, name, obj)
808 inds = self._get_index(name)
809 if not len(inds):
--> 810 raise KeyError(name)
811
812 return obj._take_with_is_copy(inds, axis=self.axis)

KeyError: '2018-12'
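
A likely cause of this `KeyError`: after `.to_period('M')` the group keys are `pd.Period` objects, and `get_group` looks the key up exactly, so the string `"2018-12"` does not match. The sketch below reproduces the setup with small random data (the industry names and window size are placeholders) and passes a `Period` instead.

```python
import numpy as np
import pandas as pd

# Small random stand-in for the monthly industry returns.
months = pd.period_range("2018-01", periods=12, freq="M")
rng = np.random.default_rng(0)
returns = pd.DataFrame(
    rng.normal(size=(12, 3)), index=months, columns=["Food", "Beer", "Smoke"]
)

# Rolling pairwise correlations: MultiIndex of (Date, Industry).
ts_corr = returns.rolling(window=3).corr()
ts_corr.index.names = ["Date", "Industry"]
grouped = ts_corr.groupby(level="Date")

# Use a Period key rather than the string "2018-12".
key = pd.Period("2018-12", freq="M")
december = grouped.get_group(key)

# Same rows without groupby, selecting on the Date level directly.
december_xs = ts_corr.xs(key, level="Date")
```

If only one month is needed, the `xs` form avoids building the groupby at all.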

lapis sequoia Dec 25, 2020, 5:27 PM

#

I am using Xception to try transfer learning. But i am doing something wrong cuz val_acc is 0.007. I followed this link https://keras.io/guides/transfer_learning/ but idk. How can i know what is going wrong?

Keras documentation: Transfer learning & fine-tuning

lapis sequoia Dec 25, 2020, 7:29 PM

#

Hello!

I have a dataset with Quora questions and another dataset with individuals' preferences (ranked from 1 to 5) in which each row represents a different individual.

I would like to match each question with a group of individuals whose preferences may match the topics I obtained with lda modelling. The problem is that I don’t know how to exactly do this…

I don't know what answers I need to to look for... I don't know what to google to find out the magic key! Please help!

What do you think? What would you advice me to look for or how do you think I approach this?

Thanks so much in advance! And Merry Christmas!! 🎄

upbeat jetty Dec 25, 2020, 8:52 PM

#

How to optimise the local pyspark so it would run the fastest on the work laptop? I need it to run tests, but they are taking waaay too long.

nimble lotus Dec 25, 2020, 9:56 PM

#

@lapis sequoia what does your quora dataset contain? Was it text responses to various questions?

#

This sounds like a question recommendation system based on individual preferences

timid sand Dec 25, 2020, 10:22 PM

#

Hello everyone

nova smelt Dec 25, 2020, 11:35 PM

#

Yo guys,

So I first created a space invaders game with a friend and then we tried to add a neat ai which kind of worked but it's not really learning anything. If there is someone that might wanna hop into vc and look at the code and maybe help us make it more efficient and learning, that would be awesome :) if there is someone just DM me!
Not sure if I am right here or at #game-development

trim oar Dec 26, 2020, 1:43 AM

#

upbeat jetty How to optimise the local pyspark so it would run the fastest on the work laptop...

Wouldn't it be better if you just run it on AWS free tier?

fiery apex Dec 26, 2020, 2:47 AM

#

Hi Guys, somebody can help me? Somebody knows how to get data from Pi osisoft with R or python?

astral path Dec 26, 2020, 5:52 AM

#

I have a pandas dataframe that looks like this, and I'm trying to explode each list into a new row (so I would have a shape of len(list1) * len(list2) * len(list3) rows x 377 cols. The code I'm using to do this is

for column in df.columns: 
      df[column].explode()

but this does literally nothing. Anyone know how this might be fixed? full code here: https://hastebin.com/pifoseripo.properties

📎 unknown.png
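
A probable reason the loop "does nothing": `explode` returns a new object and the result is discarded. A sketch on a made-up one-row frame, assuming every listed column holds lists, that reassigns the result each time:

```python
import pandas as pd

# Made-up frame: one row whose cells are lists of length 2, 3 and 1.
df = pd.DataFrame({"a": [[1, 2]], "b": [["x", "y", "z"]], "c": [[True]]})

exploded = df
for column in exploded.columns:
    # explode() is not in-place, so keep the returned frame.
    exploded = exploded.explode(column, ignore_index=True)
```

Exploding the columns one after another yields every combination of the list elements per original row, here 2 * 3 * 1 = 6 rows. For columns that do not contain lists, `explode` leaves the values unchanged.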

trim oar Dec 26, 2020, 7:19 AM

#

astral path I have a pandas dataframe that looks like this, and I'm trying to explode each l...

Try this https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.applymap.html

lapis sequoia Dec 26, 2020, 9:39 AM

#

nimble lotus @lapis sequoia what does your quora dataset contain? Was it text response...

Hi Tommy! The Quora dataset contains more than a million non-duplicate questions from Quora. You are totally right! The ultimate goal is indeed a question recommendation system based on individual preferences!

nova smelt Dec 26, 2020, 10:22 AM

#

hey

#

anyone knows a tutorial to learn how to save a neat module? or some docs?

#

cause when i train my ai for a few hours on a game i would like to save it so it doesnt have to start from zero

fleet heath Dec 26, 2020, 11:22 AM

#

https://neat-python.readthedocs.io/en/latest/_modules/checkpoint.html?highlight=save

fleet heath Dec 26, 2020, 11:22 AM

#

fleet heath https://neat-python.readthedocs.io/en/latest/_modules/checkpoint.html?highlight=...

@nova smelt hope this helps

lapis sequoia Dec 26, 2020, 1:04 PM

#

Hey guys, I'd like to increase my knowledge about scientific python and also dangers that come with machine learning. For my university, I am asked to write a paper (it's going to be desk research and I want to state my thoughts on a topic that is controversial, so there is room for critical thinking). Therefore, I was wondering if you guys have any book recommendations? I don't need to get into a hands on how-to right away, but something that takes you by the hand and explains the depth of the scientific data world 🙂

opaque seal Dec 26, 2020, 1:08 PM

#

Nice english.

#

||no sarcasm||

lapis sequoia Dec 26, 2020, 1:10 PM

#

Uh, thanks? I guess

lament nova Dec 26, 2020, 1:26 PM

#

Hi, How can I merge two data frames where
df1 has index "Key0"
df2 has indexes ["Key1", "Key2", "Key3"]

for each row ["Key1", "Key2", "Key3"] might contain "Key0"

I came up with a solution using apply but it is really slow...
My Solution

def matchMerge(x, key, df, keys):
  for key in keys:
    try:
      x.update(df.loc[x[key]])
    except:
      ...
df1.apply(matchMerge, key="Key0", df=df2, keys=["Key1", "Key2", "Key3"], axis=1)

Is there a way to do this with merge?

pd.merge(df1, df2, left_on="Key0", right_on=["Key1", "Key2", "Key3"], how="outer")
# throws indexes must have same length

opaque seal Dec 26, 2020, 1:33 PM

#

hi

#

so

#

uhh just learning it for now and create some projects

#

with it

#

and then prolly might use it for game deving

#

later

#

for making stuff like traffic in cities and stuff

#

oh

#

like not making cars and stuff crash with each other

#

oh

#

uhh

#

idk

#

e

#

idk much about groups and stuff

#

:c

#

uhh yeah

#

you can say that

#

e

#

e damn i was thinking impossible stuff them

#

alr

#

tru

#

yeah yr

#

alr

#

thanks

#

damn thanks a lot for ur time

#

alr

#

kk

#

oh

#

then it must be good

lapis sequoia Dec 26, 2020, 2:08 PM

#

Hey guys. I am using Xception with weights of 'imagenet' as a pretrained model. I am freezing it according to https://keras.io/guides/transfer_learning/ but after training, my model val_acc is 0.007. Any idea of what could be wrong?

Keras documentation: Transfer learning & fine-tuning

gritty obsidian Dec 26, 2020, 3:19 PM

#

Hey, is anybody working here with PySpark ?

exotic bronze Dec 26, 2020, 3:44 PM

#

```py
soup = BeautifulSoup(r.content, 'html.parser')
find = soup.find_all('img')
```
output

<img alt="blablabla" data-src="linkhere.jpg" height="451" src="anotherlink.jpg" width="300"/>,

How i can specifically select "data-src"
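
A sketch of reading the `data-src` attribute with BeautifulSoup, using the posted tag as an inline string instead of a live response. Tags behave like dictionaries for attribute access, and `attrs={"data-src": True}` restricts the search to tags that have the attribute.

```python
from bs4 import BeautifulSoup

# The <img> tag from the question, inlined so the example needs no network.
html = (
    '<img alt="blablabla" data-src="linkhere.jpg" height="451" '
    'src="anotherlink.jpg" width="300"/>'
)
soup = BeautifulSoup(html, "html.parser")

# Only <img> tags that actually carry a data-src attribute.
data_srcs = [img["data-src"] for img in soup.find_all("img", attrs={"data-src": True})]
```

`img.get("data-src")` is the forgiving variant: it returns `None` instead of raising when the attribute is missing.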

hearty token Dec 26, 2020, 4:53 PM

#

How do you create an XPATH expression into a new HTML file that lives inside an iframe?

upbeat jetty Dec 26, 2020, 5:00 PM

#

@trim oar It is meant to be eventually deployed in Azure ecosystem, but it doesn't solve the problem with local tests.

lapis sequoia Dec 26, 2020, 6:45 PM

#

Guy what layers may i add to my pretrained model (Xception) if i wanna do transferlearning?

#

Like, this is what i have

#

```py
base_model = keras.applications.Xception(weights='imagenet',
                                         input_shape=dimensions,
                                         include_top=False)
base_model.trainable = False
inputs = keras.Input(shape=dimensions)
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(len(pokemons))(x)
model = keras.Model(inputs, outputs)
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
```

#

But it seems not to be training at all
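
One possible culprit, offered as a guess: the final `Dense` layer has no activation, so it outputs raw logits, while `keras.losses.categorical_crossentropy` by default expects probabilities. Either add `activation="softmax"` to that layer or use `keras.losses.CategoricalCrossentropy(from_logits=True)`. The NumPy sketch below only illustrates what the softmax step does before the cross-entropy is taken; it is not Keras code and the numbers are made up.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(y_true, probs):
    """Cross-entropy between one-hot targets and predicted probabilities."""
    return -np.sum(y_true * np.log(probs), axis=-1)


logits = np.array([[2.0, -1.0, 0.5]])   # made-up raw outputs of a Dense layer
y_true = np.array([[1.0, 0.0, 0.0]])    # one-hot target for class 0

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
```

Two other things may be worth checking: the Keras transfer-learning guide rescales Xception inputs to the range [-1, 1], and the default Adadelta learning rate is small, so training can look stalled.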

knotty whale Dec 26, 2020, 8:12 PM

#

I want to use a supervised ML model for some grade prediction, so I have my train data
X: All 6 mock results
Y: Final grade
And all my train data has all the fields filled fine

But when a user wants to use the app, they may not have all 6 mock results, so can I predict their final grade on less such as 3 mocks?
(Plan on using scikitlearn)
(Please ping me if you reply!)

vague vector Dec 26, 2020, 9:34 PM

#

Hi guys, Im new to ML.
My question is, that if the model is trained on normalised or standardised data, we also need to normalise or standardise the data when the model is in production?

obtuse mango Dec 26, 2020, 9:59 PM

#

Hello guys, I am trying to improve my oop skills and ml skills

#

So I am trying to write ml algorithms but I kinda need some guidance

#

Do you know any resource that gives you steps for this kind of things

lapis sequoia Dec 27, 2020, 1:25 AM

#

greetings all, as regards NLP what tool do you recommend to create text annotations, other than carving it by hand.

hasty grail Dec 27, 2020, 3:16 AM

#

vague vector Hi guys, Im new to ML. My question is, that if the model is trained on normalise...

Yes.
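
To spell that out with a small sketch: the statistics are computed once on the training data and the same numbers are reused on production inputs, rather than recomputed from whatever arrives later. The values and the `standardise` helper below are made up for illustration.

```python
from statistics import mean, pstdev

# Made-up training values for a single feature.
train = [10.0, 12.0, 14.0, 16.0]

# Fit: compute the statistics on the training data only.
mu = mean(train)
sigma = pstdev(train)


def standardise(values):
    """Apply the training-time mean and standard deviation to any values."""
    return [(v - mu) / sigma for v in values]


train_scaled = standardise(train)
production_scaled = standardise([13.0])  # a new value seen in production
```

In scikit-learn terms this corresponds to calling `fit` on the training set and only `transform` afterwards, and saving the fitted scaler together with the model.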

soft salmon Dec 27, 2020, 5:34 AM

#

how to get started with data-science?

#

any beginner guides?

twilit wind Dec 27, 2020, 5:39 AM

#

Has anybody tried the Faster RCNN implementation

misty rivet Dec 27, 2020, 5:39 AM

#

soft salmon how to get started with data-science?

same question..?

soft salmon Dec 27, 2020, 5:41 AM

#

@misty rivet where is the solution though?

misty rivet Dec 27, 2020, 5:48 AM

#

Bro i don't know..😂, I'm new here...!

#

wait until someone experienced replies to us @soft salmon

desert parcel Dec 27, 2020, 6:31 AM

#

If I get the following loss of 15813.8125 from an mse function do I need to find it's square root to know it's actual loss or is that the loss already

high badge Dec 27, 2020, 7:37 AM

#

the MSE value is already the loss; take its square root if you want the RMSE, which is in the same units as your target
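
For reference, the arithmetic on the number from the question. The MSE itself is the loss being minimised; its square root (RMSE) is just easier to interpret because it is in the same units as the target.

```python
import math

mse = 15813.8125        # value reported in the question
rmse = math.sqrt(mse)   # roughly 125.75, in the units of the target variable

print(f"MSE = {mse}, RMSE = {rmse:.3f}")
```

So on average the predictions are off by something on the order of 125 units of whatever is being predicted.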

sullen crescent Dec 27, 2020, 7:49 AM

#

twilit wind Has anybody tried the Faster RCNN implementation

I have, with detectron2 from facebook

twilit wind Dec 27, 2020, 7:49 AM

#

No I need it with Tensorflow

#

Actually my program is showing some bad outputs

#

also the mAP is about 23%

#

?

sullen crescent Dec 27, 2020, 7:56 AM

#

what kind of dataset?

#

playing with deep learning need at least 5000 images if you're working on image detection

twilit wind Dec 27, 2020, 8:10 AM

#

yea

#

my train split have about 4300 images

#

in total its about 7800 images

#

actually this is the issue

📎 unknown.png

sullen crescent Dec 27, 2020, 10:08 AM

#

wow thats some bounding box issues no wonder your mAP is quite low

#

did you manage to do offline augmentation?

twilit wind Dec 27, 2020, 11:51 AM

#

means ?

sullen crescent Dec 27, 2020, 1:04 PM

#

try to optimize your parameters, double check your ground truth, augment your training dataset so you will have more data

lapis sequoia Dec 27, 2020, 2:24 PM

#

You can find great material on YouTube, Udemy, Coursera, something like https://www.udemy.com/course/datascience/, https://www.coursera.org/browse/data-science, https://m.youtube.com/watch?v=ua-CiDNNj30 the last link is awesome for beginners! @soft salmon @misty rivet

YouTube

freeCodeCamp.org

Learn Data Science Tutorial - Full Course for Beginners

Learn Data Science is this full tutorial course for absolute beginners. Data science is considered the "sexiest job of the 21st century." You'll learn the important elements of data science. You'll be introduced to the principles, practices, and tools that make data science the powerful medium for critical insight in business and research. You'l...


lapis sequoia Dec 27, 2020, 3:49 PM

#

is there a dedicated channel for NLP?

#

or is data-science the channel? 🙂

velvet thorn Dec 27, 2020, 11:59 PM

#

lapis sequoia or is data-science the channel? 🙂

this is it

solid kindle Dec 28, 2020, 12:12 AM

#

i'm trying to add another column to my dataframe

#

this is what i am currently doing

#

if first:
    first = False
    df = pd.DataFrame([stock, tempdf.iloc[:,3]])
else:
    print(stock)
    df[stock] = tempdf.iloc[:,3].tolist()

#

but it adds it as a row

#

how do i get it to add the 3rd collumn to the stock?

velvet thorn Dec 28, 2020, 12:12 AM

#

solid kindle i'm trying to add another column to my dataframe

what is stock?

solid kindle Dec 28, 2020, 12:12 AM

#

its a string

#

sorry should have specified it

#

the 3rd column of tempdf is integers

#

and the rows are indexed by datetimes

#

sorry wrong code, this is the only thing thats working

#

if first:
    first = False
    df = pd.DataFrame([stock, tempdf.iloc[:,3]])
else:
    print(stock)
    tempthing = tempdf.iloc[:,3].tolist()
    df[tempthing] = stock

velvet thorn Dec 28, 2020, 12:13 AM

#

use code blocks

#

!code

arctic wedgeBOT Dec 28, 2020, 12:13 AM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

solid kindle Dec 28, 2020, 12:13 AM

#

sorry not working, but doesn't throw error

#

ok i will

#

'''py

velvet thorn Dec 28, 2020, 12:14 AM

#

okay maybe you can give me a bit more context

#

as to what you're trying to do

#

it's `, not '

solid kindle Dec 28, 2020, 12:14 AM

#

right

velvet thorn Dec 28, 2020, 12:14 AM

#

if you're using a standard English keyboard

#

it'll be below your Esc key

solid kindle Dec 28, 2020, 12:14 AM

#

ya same as tilda

velvet thorn Dec 28, 2020, 12:14 AM

#

yup

solid kindle Dec 28, 2020, 12:15 AM

#

basically, i'm trying to put the closing prices for a bunch of different stocks at different dates into a pandas dataframe

#

i'm doing this by looping through a list of names (stock), and trying to add them one by one to the dataframe

velvet thorn Dec 28, 2020, 12:15 AM

#

probably not a good idea

#

every time you "add" a column

#

you in fact create a new DataFrame

solid kindle Dec 28, 2020, 12:16 AM

#

yes

#

it doesn't have to be fast

#

only has to happen once

velvet thorn Dec 28, 2020, 12:16 AM

#

it's also kind of harder to debug but

#

is your choice

solid kindle Dec 28, 2020, 12:16 AM

#

what would u reccomend?

#

i'm new to pandas and data science in general

velvet thorn Dec 28, 2020, 12:16 AM

#

solid kindle i'm doing this by looping through a list of names (stock), and trying to add the...

what's the source?

#

of the data

solid kindle Dec 28, 2020, 12:16 AM

#

an api

#

yfinance

velvet thorn Dec 28, 2020, 12:16 AM

#

you have other programming experience?

solid kindle Dec 28, 2020, 12:16 AM

#

yes

velvet thorn Dec 28, 2020, 12:16 AM

#

what format does it return the data in

#

okay, good

solid kindle Dec 28, 2020, 12:17 AM

#

it returns it in a pandas dataframe

velvet thorn Dec 28, 2020, 12:17 AM

#

so pandas has this concat function

solid kindle Dec 28, 2020, 12:17 AM

#

yes, ive used it

velvet thorn Dec 28, 2020, 12:17 AM

#

solid kindle it returns it in a pandas dataframe

oh it's a Python library wrapping the API?

solid kindle Dec 28, 2020, 12:17 AM

#

yes

#

sorry should have specified

velvet thorn Dec 28, 2020, 12:17 AM

#

so I'm guessing

#

you get a bunch of DataFrames from the API

#

and you want to combine subsets thereof

#

in a specified manner?

solid kindle Dec 28, 2020, 12:17 AM

#

yes

#

yes

#

just add the same column together

velvet thorn Dec 28, 2020, 12:17 AM

#

same column from each DataFrame?

#

or what

solid kindle Dec 28, 2020, 12:18 AM

#

yes

velvet thorn Dec 28, 2020, 12:18 AM

#

okay so let me just get this right

#

you have, say, 10 DataFrames, and each has a 'output' column, and you want to take that column from each and combine them into one big DataFrame

#

is that right

solid kindle Dec 28, 2020, 12:18 AM

#

yes

#

exactly

velvet thorn Dec 28, 2020, 12:18 AM

#

pd.concat([df['output'] for df in dfs], axis=1)

solid kindle Dec 28, 2020, 12:18 AM

#

thank you!

velvet thorn Dec 28, 2020, 12:18 AM

#

yw

#

tell me if it works

#

(it should if I have understood you correctly)

solid kindle Dec 28, 2020, 12:19 AM

#

ya, i'll need to do a little restructuring of my code real quick

velvet thorn Dec 28, 2020, 12:19 AM

#

so dfs is the iterable of all your source DataFrames
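A minimal sketch of that, assuming each source frame has a 'Close' column and a datetime index. The tickers "AAA"/"BBB" and the numbers are made up; passing `keys=` is an extra that names each resulting column after its ticker so the output does not end up with several columns all called 'Close':

```python
import pandas as pd

idx = pd.date_range("2020-12-21", periods=3, freq="D")

# Hypothetical per-ticker frames; the real ones would come from the API.
frames = {
    "AAA": pd.DataFrame({"Open": [1.0, 2.0, 3.0], "Close": [1.5, 2.5, 3.5]}, index=idx),
    "BBB": pd.DataFrame({"Open": [10.0, 20.0, 30.0], "Close": [15.0, 25.0, 35.0]}, index=idx),
}

# One 'Close' Series per ticker, placed side by side and aligned on the index.
closes = pd.concat(
    [frame["Close"] for frame in frames.values()],
    axis=1,
    keys=list(frames.keys()),
)
```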

solid kindle Dec 28, 2020, 12:22 AM

#

i got a TypeError: Cannot join tz-naive with tz-aware DatetimeIndex

#

i used this instead: pd.concat([tempdf['Close'], df], axis=1)
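About that TypeError: it usually means one index carries a timezone and the other does not, so pandas refuses to line them up. A sketch of one way to make them compatible, assuming the naive timestamps are really UTC (that is a guess, check what the API actually returns); the series here are made up:

```python
import pandas as pd

naive = pd.Series(
    [1.0, 2.0],
    index=pd.date_range("2020-12-21 00:27", periods=2, freq="min"),
)
aware = pd.Series(
    [3.0, 4.0],
    index=pd.date_range("2020-12-21 00:27", periods=2, freq="min", tz="UTC"),
)

# Assumption: the naive timestamps are UTC. If they are in another zone,
# localize to that zone first and then tz_convert("UTC").
naive_utc = naive.tz_localize("UTC")

combined = pd.concat([naive_utc, aware], axis=1, keys=["naive", "aware"])
```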

velvet thorn Dec 28, 2020, 12:22 AM

#

what is df

solid kindle Dec 28, 2020, 12:23 AM

#

the dataframe i am adding everything to

velvet thorn Dec 28, 2020, 12:23 AM

#

ah, okay

#

so

solid kindle Dec 28, 2020, 12:23 AM

#

i initialize it as an empty dataframe

#

and then add something to it every time

velvet thorn Dec 28, 2020, 12:24 AM

#

okay

#

so

#

to use the approach

#

above

#

you need to put all the individual DataFrames in a collection

solid kindle Dec 28, 2020, 12:24 AM

#

ok

#

so i made a list of all the dataframes

#

and then ran your command

#

and it worked except one of the columns has a bunch of NaNs (i forgot what they are called, its not null is it?)

#

also thank you so much for helping me

velvet thorn Dec 28, 2020, 12:27 AM

#

solid kindle also thank you so much for helping me

yw!

velvet thorn Dec 28, 2020, 12:27 AM

#

solid kindle and it worked except one of the columns has a bunch of NaNs (i forgot what they a...

what do you mean

#

one of the columns

#

like

#

check the DataFrame

#

that that column came from

#

most likely

#

the source data is bad

#

or its index is misaligned

solid kindle Dec 28, 2020, 12:28 AM

#

                                  Close  Close       Close
Datetime
2020-12-21 00:27:00+00:00  23526.640625    NaN  641.566772
2020-12-21 00:28:00+00:00  23486.863281    NaN  640.518188
2020-12-21 00:29:00+00:00  23493.597656    NaN  640.609680
2020-12-21 00:30:00+00:00  23497.607422    NaN  640.758362
2020-12-21 00:31:00+00:00  23550.359375    NaN  641.541931
...                                 ...    ...         ...
2020-12-28 00:19:00+00:00  26493.246094    NaN  708.414062
2020-12-28 00:20:00+00:00  26520.251953    NaN  709.166321
2020-12-28 00:21:00+00:00  26509.263672    NaN  708.537170
2020-12-28 00:22:00+00:00  26530.599609    NaN  707.455750
2020-12-28 00:23:02+00:00  26558.570312    NaN  708.471008

#

ok seems to be working

#

just a scattering of NaNs somewhere

velvet thorn Dec 28, 2020, 12:30 AM

#

check dfs[1]

solid kindle Dec 28, 2020, 12:30 AM

#

ok thanks

#

its fine

#

i think its just the api

#

and that one dataset

#

all the other ones are fine

lapis sequoia Dec 28, 2020, 1:42 AM

#

lapis sequoia But it seems not to be training at all

u.u

#

if someone could help

lapis sequoia Dec 28, 2020, 2:48 AM

#

guys

#

am I the only one who don't use tuples that much

desert parcel Dec 28, 2020, 3:04 AM

#

high badge you square root it to know its actual loss

thx

lapis sequoia Dec 28, 2020, 4:25 AM

#

Have any of you worked with the mal api?

#

I’m trying to extract the user ids of the users on mal, I’ve tried mal,jikan but nothing seems to work

#

Is there no other way than to make a crawler and scrape the user ids?

#

Also I need to extract the rating given by each user to the anime

desert parcel Dec 28, 2020, 4:28 AM

#

Could someone explain this paragraph to me? I've been replaying the video, but still don't understand it.

📎 unknown.png

#

📎 unknown.png

lapis sequoia Dec 28, 2020, 5:19 AM

#

desert parcel

not the kind of reply you’re looking for, but may I ask about the guide? Looks cool, and is there any video tutorial for that?

lapis sequoia Dec 28, 2020, 5:21 AM

#

desert parcel

In a very layman language a loss function is a way of telling a model how bad it is doing

#

So the less the loss is

#

It’s better

#

Coz that means it’s doing better

#

won't overfitting be the problem though

#

That’s when you train the model too much on one dataset

desert parcel Dec 28, 2020, 7:30 AM

#

lapis sequoia not the kind of reply you’re looking for, but may I ask about the guide? Looks c...

FreeCodeCamp Zero to GANS by Aakashn

#

It's on youtube and it's free

desert parcel Dec 28, 2020, 7:31 AM

#

lapis sequoia In a very layman language a loss function is a way of telling a model how bad it...

Alright thanks

lapis sequoia Dec 28, 2020, 7:38 AM

#

Thanks m8

lapis sequoia Dec 28, 2020, 7:45 AM

#

desert parcel Alright thanks

Np

lapis sequoia Dec 28, 2020, 7:45 AM

#

lapis sequoia Thanks m8

Ok thank him he’s desperate for a thank you

desert parcel Dec 28, 2020, 10:23 AM

#

lapis sequoia In a very layman language a loss function is a way of telling a model how bad it...

Could you explain it in more detail?

#

I know this is late lol

lapis sequoia Dec 28, 2020, 10:39 AM

#

there are different types of functions which are used to determine the loss

#

they basically see the difference between what your model is predicting

#

versus the prediction that should be

desert parcel Dec 28, 2020, 10:40 AM

#

lapis sequoia versus the prediction that should be

what does this mean

#

English isn't my first language so don't use too advanced words

lapis sequoia Dec 28, 2020, 10:40 AM

#

oh

#

prediction is the ideal thing

#

like the actual answer

desert parcel Dec 28, 2020, 10:41 AM

#

Alright

lapis sequoia Dec 28, 2020, 10:41 AM

#

versus what the model gave

desert parcel Dec 28, 2020, 10:41 AM

#

I thought what the model gave is the prediction

#

From the tutorial it says that the predictions should be close to or equal to the targets

#

📎 unknown.png

#

Looking at the first element in each tensor. The guy says that -4252.4780 is what happens when you differentiate with respect to the 0.2761

#

Correct me if I'm wrong.

#

And the value -4252.4780 is the derivative of the loss with respect to 0.2761?

lapis sequoia Dec 28, 2020, 1:57 PM

#

is anyone here familiar with naive bayes?

flint sierra Dec 28, 2020, 3:34 PM

#

I'm a self-taught programmer. I'm lucky enough to have a job where I get to use python every day as a data analyst. However I feel like I've hit a wall on my professional development. Internet bootcamps can only take me so far, I think what I'm missing is peer interactions and networking. Unfortunately I don't work with anyone else who codes in python. I'm considering taking a more rigorous online course, applying to a university or pouring time into an open source project.

#

Any advice?

arctic wedgeBOT Dec 28, 2020, 5:41 PM

#

Hey @radiant urchin!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

radiant urchin Dec 28, 2020, 5:44 PM

#

Hello, I'm having some issues with curve fitting using scipy.optimize.curve_fit.

I keep getting the following issue:
ValueError: array must not contain infs or NaNs

```py
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def func(x, A, m, hf):
    return A * (x - hf)**m

ff = 'data.txt'
data = pd.read_csv(ff, skiprows=3, delimiter='\t', encoding="ISO-8859-1")
load = np.array(data.iloc[:, 1])
disp = np.array(data.iloc[:, 0])
# Fit only from the point of maximum displacement onwards
istart = np.where(disp == max(disp))[0][0]
p0 = [0.001, 2, 250]
ulfit, pcov = curve_fit(func, disp[istart:], load[istart:], p0,
                        bounds=(0, [0.1, 5, max(disp)]))
```

I have a lot of similar curves, some work fine, and others give me errors depending on how I adjust p0.. (even though all the curves are similar) I can share a raw data file too if that helps

lapis sequoia Dec 28, 2020, 5:52 PM

#

radiant urchin Hello, I'm having some issues with curve fitting using scipy.optimize.curve_fit. ...

Did you try filling or dropping NaN values from data

radiant urchin Dec 28, 2020, 5:52 PM

#

the raw data has no NaN values

lapis sequoia Dec 28, 2020, 5:54 PM

#

I can’t seem to find a problem with the code above

#

Sorry that I couldn’t help you

radiant urchin Dec 28, 2020, 5:56 PM

#

if I drop the initial values, then I get a different error:

RuntimeWarning: invalid value encountered in power

#

But it spits out a reasonable result anyways? not sure if this is an issue?
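One likely cause (a guess without seeing the data file): the bounds let `hf` go as high as `max(disp)`, so `x - hf` becomes negative for some points, and in NumPy a negative base raised to a non-integer `m` is NaN. That would explain both the ValueError and the "invalid value encountered in power" warning. A small sketch of the effect and one possible guard; clamping the base at zero is my assumption about what is sensible here, and the numbers are made up:

```python
import numpy as np

def func(x, A, m, hf):
    return A * (x - hf) ** m

def func_clipped(x, A, m, hf):
    # Clamp the base at 0 so points with x < hf give 0 instead of NaN.
    return A * np.clip(x - hf, 0, None) ** m

x = np.array([240.0, 260.0, 300.0])

with np.errstate(invalid="ignore"):
    raw = func(x, 0.001, 1.5, 250.0)        # first entry is NaN: (-10) ** 1.5
safe = func_clipped(x, 0.001, 1.5, 250.0)   # first entry is 0.0
```

If this is the cause, passing `func_clipped` to `curve_fit`, or tightening the upper bound on `hf` to the smallest `x` in the fitted range, should make the fit independent of how lucky `p0` is.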

solemn oracle Dec 28, 2020, 5:59 PM

#

can someone help me understand how the quantiles are calculated in pandas?

#

Im looking at the documentation on pandas and I see the example:

#

                  columns=['a', 'b'])
df.quantile(.1)
a    1.3
b    3.7
Name: 0.1, dtype: float64
df.quantile([.1, .5])
       a     b
0.1  1.3   3.7
0.5  2.5  55.0

#

q = 0.1 should represent what the bottom ten percent of the data is below

#

so for column a, q=0.5 makes sense, there are 4 data points and an even number of values so you just take the average between them

#

I dont, however, understand how q=0.1 results in 1.3

solemn oracle Dec 28, 2020, 6:18 PM

#

Ahhh, would the equation be

#

q(n +1) ?

lapis sequoia Dec 28, 2020, 6:19 PM

#

I think it’s 1.25 but rounded

#

yes

solemn oracle Dec 28, 2020, 6:19 PM

#

how would I calculate this

#

q = 0.1, n = 4 ?

#

so according to that logic, the 0.1 quantile should be 0.1(4+1) which it isnt

#

or is that the position its found at

#

and I need to do some math to find what the value is

lapis sequoia Dec 28, 2020, 6:35 PM

#

(n+1)*q

#

oh wait

solemn oracle Dec 28, 2020, 6:37 PM

#

so that would get 0.5 is the position where 10% of the values are below

#

but I dont get where position 0.5 is

lapis sequoia Dec 28, 2020, 6:38 PM

#

Have you tried to look up numpy percentile

#

quantile is basically the same as numpy percentile, just on a 0-1 scale instead of 0-100

solemn oracle Dec 28, 2020, 6:44 PM

#

I get how to use it in np, I just cant figure out where the numbers are coming from

#

when the data set is small

upbeat storm Dec 28, 2020, 9:47 PM

#

Anyone who is interested in learning more about AI and ML please join this server!

#

https://discord.gg/x8Yd6j5d

spiral peak Dec 28, 2020, 10:04 PM

#

So I have an odd pandas question about how to best approach this. Essentially I have 3 columns with data and they're indexed on the time values. They do not, however, have data for the same time values. One column might be missing data in the beginning, one might be missing data at the end and the beginning, and the other might be missing data at the end.

What I want to do: Shift the columns so that their end data all occurs at the same time and back fill the values with NaN. I was going to use df.shift() and the number of NaNs to do the shift, but I can't with the column that also has data missing in the beginning. I'll overshift it. Any suggestions besides manually iterating and count through the NaN values from the back until I have a non-NaN for each column?
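One way to avoid overshifting is to count only the NaNs after the last valid value in each column and shift by that amount. A sketch with made-up data, assuming a unique index; `trailing_nans` is a hypothetical helper, and `last_valid_index` is the pandas method doing the real work:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2020-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {
        "a": [np.nan, 1.0, 2.0, 3.0, 4.0, 5.0],        # missing at the start
        "b": [np.nan, 1.0, 2.0, 3.0, np.nan, np.nan],  # missing at both ends
        "c": [1.0, 2.0, 3.0, 4.0, 5.0, np.nan],        # missing at the end
    },
    index=idx,
)

def trailing_nans(col):
    """Number of NaNs after the last valid value (leading NaNs are ignored)."""
    last = col.last_valid_index()
    if last is None:  # column is entirely NaN
        return 0
    return len(col) - 1 - col.index.get_loc(last)

# Shift each column down by its own trailing-NaN count so the ends line up.
aligned = df.apply(lambda col: col.shift(trailing_nans(col)))
```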

vague vector Dec 28, 2020, 10:32 PM

#

Hey guys, apology for a dumb question, Regression, Classification and Clustering can also be done in Deep Learning(like using Keras), or it can only be done in Machine Learning, Deep Learning is only for RNN, CNN etc...?

graceful glacier Dec 28, 2020, 10:38 PM

#

has anyone run into a problem while trying to debug a program running pyspark?

#

im currently using pycharm to debug it and this is the code

#

📎 unknown.png

#

im taking a spark rdd (i believe) called tweets and taking the stopwords out of its "text" column

#

i can place a breakpoint on the last line and the debugger will work fine, but if i place it anywhere inside the remove_stopword function the debugger will disconnect

#

any one have an idea as to why? is it because of how spark works under the hood maybe?

lapis sequoia Dec 29, 2020, 1:05 AM

#

does someone know how to make violin plots?

#

from lists

#

i have seen how to do it with csv file, but i just need to use list and its not working

astral path Dec 29, 2020, 1:08 AM

#

if I have a time series as a feature (e.g. pitch over time for an audio file) while clustering, is it bad practice to use the mean of the time series as a feature instead to simplify it and avoid the curse of dimensionality?

shrewd pewter Dec 29, 2020, 1:29 AM

#

More of a web scraping question but what libs can I use to parse this kind of data?

📎 unknown.png

#

Returned from an HTTP request

obsidian crow Dec 29, 2020, 1:54 AM

#

shrewd pewter More of a web scraping question but what libs can I use to parse this kind of da...

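import urllib3

def helper(url):
    http = urllib3.PoolManager()
    resp = http.request('GET', url)
    # Decode the raw bytes; str(bytes) would keep the b'...' wrapper and escapes.
    text = resp.data.decode('utf-8', errors='replace')

    # Return the first whitespace-separated token that contains an href.
    for token in text.split():
        if 'href' in token:
            return token
    return None

if __name__ == "__main__":
    url = "<enter url>"  # placeholder: put the real URL here
    print(helper(url))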

simple iron Dec 29, 2020, 2:22 AM

#

Hey all, does anyone have good resources for preparing for technical ML interviews? Currently an ML eng at big tech co. I've been using leetcode.com for coding prep for traditional data structures & algorithms, and datascienceprep.com for ML/stats questions, was wondering if anyone knew of others.

old pendant Dec 29, 2020, 4:31 AM

#

Is there a way to select values by an array that defines which column for every row I will select in numpy? (without iterating every row)

Example:

column_indexes = np.array([1, 0, 1, 1, 1])

values = np.array([[1981.5       , 1894.        ],
       [ 489.33333333,  492.        ],
       [1110.        , 1110.        ],
       [ 197.        ,  197.        ],
       [ 301.66666667,  319.        ]])

values_selected = array([1894.        ,  489.33333333, 1110.        ,  197.        ,
        319.        ])

Thanks!

old pendant Dec 29, 2020, 4:45 AM

#

old pendant Is there a way to select values by an array that defines which column for every ...

this code works, but I think there is a better way to do that

result = [row[pseudo_label_col] for row, pseudo_label_col in zip(values, column_indexes)]
np.array(result)

velvet thorn Dec 29, 2020, 5:30 AM

#

old pendant this code works, but I think there is a better way to do that ``` result = [row[...

>>> np.take_along_axis(values, column_indexes[:, None], axis=1)
array([[1894.        ],
       [ 489.33333333],
       [1110.        ],
       [ 197.        ],
       [ 319.        ]])

astral path Dec 29, 2020, 7:12 AM

#

I'm making a feature set where the features are based on an analysis of audio files of differing length. For example, I have audio files A and B and the feature is the loudness over time, but A is 2 times the length of time as B. As a result, the feature for A would be an array of 2x the length of the feature for B. What would the best way to cluster be when I have feature sets of differing length?

old pendant Dec 29, 2020, 7:28 AM

#

@velvet thorn thank you!

velvet thorn Dec 29, 2020, 7:29 AM

#

astral path I'm making a feature set where the features are based on an analysis of audio fi...

aggregate

#

or pad

old pendant Dec 29, 2020, 7:31 AM

#

@velvet thorn if values matrix have n_cols > 3, the method is still valid? the trick with column_indexes[:, None] will need to be rewritten, correct?
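On the question about more columns: as far as I can tell the same call works unchanged, because `column_indexes[:, None]` only needs one index per row, whatever the number of columns. Plain integer indexing is an equivalent that returns a flat array directly. Sketch with a made-up 3x4 array:

```python
import numpy as np

values = np.array([[10, 11, 12, 13],
                   [20, 21, 22, 23],
                   [30, 31, 32, 33]])
column_indexes = np.array([3, 0, 2])

# One (row, column) pair per row.
picked = values[np.arange(len(values)), column_indexes]

# Same result via take_along_axis, flattened back to 1-D.
picked_alt = np.take_along_axis(values, column_indexes[:, None], axis=1).ravel()
```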

astral path Dec 29, 2020, 7:38 AM

#

thanks

desert parcel Dec 29, 2020, 8:05 AM

#

flint sierra I'm a self-taught programmer. I'm lucky enough to have a job where I get to use ...

I think you should ask this in a subreddit search for terms like "Python advice" or "advice", "cs advice"

flint sierra Dec 29, 2020, 8:06 AM

#

Thanks!!

proven sigil Dec 29, 2020, 8:37 AM

#

solemn oracle I dont, however, understand how q=0.1 results in 1.3

[1, 2, 3, 4]. There's a 3 element gap between the first and last element (n - 1).
q=0.1 means it takes the value 3 * 0.1 = 0.3 elements after the first element (sorted),
so the 1.3rd element => 0.7 * first_element + 0.3 * second_element => 1.3
Same for [1, 10, 100, 1000]:
0.7 * 1 + 0.3 * 10 = 3.7
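The same arithmetic as a small function, to make the steps explicit. This is a sketch of linear interpolation between ranks (which, as far as I know, is the default in both pandas and NumPy); `linear_quantile` is just an illustrative name:

```python
import numpy as np

def linear_quantile(data, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    s = np.sort(np.asarray(data, dtype=float))
    pos = (len(s) - 1) * q           # fractional 0-based position
    lo = int(np.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    frac = pos - lo
    return (1 - frac) * s[lo] + frac * s[hi]

a = [1, 2, 3, 4]
b = [1, 10, 100, 1000]

q_a = linear_quantile(a, 0.1)   # position 0.3 -> 0.7 * 1 + 0.3 * 2
q_b = linear_quantile(b, 0.1)   # position 0.3 -> 0.7 * 1 + 0.3 * 10
```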

vague vector Dec 29, 2020, 8:44 AM

#

Please correct me where I'm wrong, I'm trying to clear my basic concepts:

Regression, Classification, Clustering, dimensionality reduction etc are some major algorithms in Machine Learning.

Machine Learning also has another set of special algorithms called Neural Networks.
Deep Learning is when Neural Networks has depth, i.e. with multiple Layers.
Deep Learning specialize in non-linearities, feature engineering is also done automatically.

RNN, CNN, GAN are some popular architectures of Deep Learning.

lapis sequoia Dec 29, 2020, 9:25 AM

#

vague vector Please correct me where I'm wrong, I'm trying to clear my basic concepts: `Regr...

Neural networks are the machine learning type called reinforcement learning.

velvet thorn Dec 29, 2020, 9:42 AM

#

lapis sequoia Neural networks are the machine learning type called reinforcement learning.

neural networks can be used for reinforcement learning

#

but they're not the same

lapis sequoia Dec 29, 2020, 9:50 AM

#

velvet thorn but they're not the same

You are right. But it is AI branch

velvet thorn Dec 29, 2020, 9:53 AM

#

lapis sequoia You are right. But it is AI branch

that's not what you originally said though

lapis sequoia Dec 29, 2020, 9:57 AM

#

velvet thorn that's not what you originally said though

ok

#

Anyone is working on Data Engineering Platform?

lapis sequoia Dec 29, 2020, 12:33 PM

#

lapis sequoia You are right. But it is AI branch

bruh

#

whole ml comes under ai

#

AI>ML>DL in short

lapis sequoia Dec 29, 2020, 1:12 PM

#

What is DL

torpid cave Dec 29, 2020, 1:21 PM

#

Hi all, anyone who works with classes for your data pipes

#

Do you prefer long methods to do all the lifting, or many small methods which you can edit later

lapis sequoia Dec 29, 2020, 1:46 PM

#

lapis sequoia What is DL

Deep learning

lapis sequoia Dec 29, 2020, 4:23 PM

#

POOTERS

fast vector Dec 29, 2020, 4:27 PM

#

Hello, I'm a second year data science major at a state university. I have been disappointed with my curriculum thus far because my courses don't cover python for data science specifically and the Intro to R class was pretty basic. I'd like to become more familiar with both of these and reach a level in which I could comfortably apply for internships. I eventually want to build a good foundation on python to start with ML. My understanding is that projects are incredibly important. Does anyone have a list of resources, specific python and R libraries, projects, books, or websites I could use to reach my goals?

crisp gazelle Dec 29, 2020, 4:27 PM

#

!resources

arctic wedgeBOT Dec 29, 2020, 4:27 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

crisp gazelle Dec 29, 2020, 4:27 PM

#

@fast vector use this

fast vector Dec 29, 2020, 4:28 PM

#

Wow that's awesome! Thanks @crisp gazelle!

crisp gazelle Dec 29, 2020, 4:29 PM

#

No problem!

dense nova Dec 29, 2020, 4:46 PM

#

Hello, I have trained a model in https://teachablemachine.withgoogle.com that has 3 classes (Hand Raised, Thumbs Up and Neutral).
I exported the model as a .h5 Keras model, and I've managed to make some predictions from some testing data that i've gathered.

the predictions output looks like this:

[[9.9910396e-01 1.9197341e-05 8.7688433e-04]]

im not sure what to make of this, any help would be great

vestal magnet Dec 29, 2020, 4:58 PM

#

Hello! I have a question regarding matplotlib. How can I plot different lists in the same scale?
So, ideally, the dark blue line, should be within the boundaries of the green and red line, but it doesn't.
The dark blue line is a list, composed of several altitude values.

📎 Screenshot_from_2020-12-29_15-20-23.png

#

I asked in #help-grapes and they told me to adjust the number of samples of the dark blue line same as the red and green lines, which is 512

#

And I can't get anymore cause those values come from server request to Google Maps API, and the max number of samples is 512
On the other hand, the dark blue line comes from the path provided by A* on a csv file
It's like this:

I have a set of altitudes, imagine: 60, 100, 120

For those values I can only trace a path between 60+ and 120+, summed to those numbers, that is, I can only trace between 120-180, 160-220 and 180-240, cause those are the max limits for the drone, in this case.
But I have like 3000 samples or so

#

But even if I adjust the sample numbers to be the same, I still get the plots above

lament loom Dec 29, 2020, 5:19 PM

#

Is anyone preparing for Google Summer of code, or have experience with the same?
I was planning to participate in GSoC 2021, as I have done a bunch of Machine Learning, NLP and Data Science Projects, also have some entry-level experience with Open-source contribution and Git/GitHub.

sand sluice Dec 29, 2020, 6:10 PM

#

Is there a way to get the underlying numpy array of a matplotlib plot. I want to apply a color map to 1D data points, and then use openCV to threshold the rgb image. The problem is I want to run k-means on the points inside the threshold, so they need to correspond exactly with the original. The way I am currently doing it, by saving the plot to a file, means that the size will depend on the DPI, and the pixels don't match.

limpid oak Dec 29, 2020, 7:24 PM

#

how can i add list of columns to df

#

KILLA_LINE_Col = ['SR_NO', 'DISTRICT_N', 'TEHSIL_NAM', 'VILLAGE_NA',
'HB_NO', 'LAYER_NAM', 'DESCRIPTIO', 'LENGTH_MTR',
'LENGTH_KAR', 'AREA_SQMTR', 'DES_MEASUR']

#

i want to add this colmns to existing dataframe

limpid oak Dec 29, 2020, 7:41 PM

#

KILLA_LINE_file_copy[:,KILLA_LINE_Col] = np.nan

#

TypeError: unhashable type: 'slice'

#

getting this error

#

anybody here for help

fervent flax Dec 29, 2020, 7:55 PM

#

hey guys, im using the dog.ceo api and sometimes it'll give slightly mispelled names (like "Germanshepherd" or "Stbernard" instead of "German Shepherd" or "St. Bernard")

#

any way to return a "correct" dog breed or fix it? not sure if this is the correct channel

pastel glacier Dec 29, 2020, 8:46 PM

#

beautiful soup vs selenium vs scrapy??

#

which

#

is best for web scraping

fleet heath Dec 29, 2020, 9:11 PM

#

limpid oak KILLA_LINE_Col = ['SR_NO', 'DISTRICT_N', 'TEHSIL_NAM', 'VILLAGE_NA', ...

```py
for i in KILLA_LINE_Col:
    df[i] = np.nan
```
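For context on the error: `df[:, cols]` is NumPy-style indexing, which a DataFrame does not accept, hence the "unhashable type: 'slice'" message. Also note that a plain loop overwrites any column that already exists. A sketch that only creates the missing ones, using a made-up frame and a shortened column list:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"SR_NO": [1, 2], "OTHER": ["x", "y"]})  # stand-in data
new_cols = ["SR_NO", "DISTRICT_N", "TEHSIL_NAM"]

# Only create the columns that are missing, so existing data is kept.
for col in new_cols:
    if col not in df.columns:
        df[col] = np.nan
```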

fleet heath Dec 29, 2020, 9:13 PM

#

fervent flax hey guys, im using the dog.ceo api and sometimes it'll give slightly mispelled n...

you can probably look for some specific words based on which you can change all the values to some standard values like:
GS - German Shepherd
SB - St. Bernard

fleet heath Dec 29, 2020, 9:14 PM

#

fleet heath you can probably look for some specific words based on which you can change all ...

since the spelling is right, you can strip the string and make it lowercase and then compare it with the spelling on the basis of which you can classify them as a particular breed
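A sketch of that idea using only the standard library. The breed list is a made-up sample (the real one would be the full list of names you want to display), and `fix_breed` / `normalize` are illustrative names. `difflib.get_close_matches` is added as a fallback for names that are close but not identical after normalising:

```python
import difflib
import re

# Hypothetical canonical list; in practice this would be the full breed list.
CANONICAL = ["German Shepherd", "St. Bernard", "Golden Retriever"]

def normalize(name):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

LOOKUP = {normalize(name): name for name in CANONICAL}

def fix_breed(raw, cutoff=0.8):
    key = normalize(raw)
    if key in LOOKUP:  # exact match once spaces and punctuation are gone
        return LOOKUP[key]
    close = difflib.get_close_matches(key, list(LOOKUP), n=1, cutoff=cutoff)
    return LOOKUP[close[0]] if close else raw  # unknown names pass through
```

Names like "Germanshepherd" and "Stbernard" hit the exact-match branch; the fuzzy branch only matters for real typos, and the `cutoff` value is a guess that may need tuning.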

velvet thorn Dec 29, 2020, 11:24 PM

#

fervent flax hey guys, im using the dog.ceo api and sometimes it'll give slightly mispelled n...

this is a somewhat complex problem

#

is there a finite list of misspellings?

fervent flax Dec 29, 2020, 11:27 PM

#

Kinda, but the list is long soo i didnt wanna go through it, i fixed it by using a different api though

My original idea was to use wikipedias api to search using the mispelled word, and then use the suggested article's name for the correct breed name but it didnt work for edge cases

Or do a google search and use the first suggested wikipedia link's article name (so i did mispelled name + dog for the search query) but that took wayyy too long

#

It's fine now though, thanks

sullen crescent Dec 29, 2020, 11:41 PM

#

dense nova Hello, I have trained a model in https://teachablemachine.withgoogle.com that ha...

you can debug the inference source code you used for prediction

inland iron Dec 29, 2020, 11:42 PM

#

sup guys howre you going ?

sullen crescent Dec 29, 2020, 11:43 PM

#

I think it shows the confidence percentage for each of the 3 classes you made (hand raised, thumbs up and neutral), but i'm not sure tho @dense nova
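If that reading is right, each number is the model's score for one class and the three add up to roughly 1. A sketch of turning the output into a label; the class order below is an assumption (it should match the order in the Teachable Machine project or its labels file):

```python
import numpy as np

# Assumption: same order as the classes in the Teachable Machine project.
class_names = ["Hand Raised", "Thumbs Up", "Neutral"]

predictions = np.array([[9.9910396e-01, 1.9197341e-05, 8.7688433e-04]])

probs = predictions[0]            # one row per input image
best = int(np.argmax(probs))      # index of the highest score
label = class_names[best]
confidence = float(probs[best])
```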

lapis sequoia Dec 30, 2020, 2:35 AM

#

howdy, working on some NLP projects. anyone here can answer a question about annotations? I see this type of annotation framework: https://universaldependencies.org/format.html are there any other type of annotation standards, frameworks you know of? Thanks

austere swift Dec 30, 2020, 3:29 AM

#

I'm having an issue with pytorch

#

so what happens is whenever i try to import it in a python file i get this error

Traceback (most recent call last):
  File ".\script.py", line 7, in <module>
    import torch
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\__init__.py", line 117, in <module>
    raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\lib\cufftw64_10.dll" or one of its dependencies.

#

but when i do it in repl its completely fine

#

i don't understand it lol

#

i tried reinstalling cuda/cudnn and reinstalling pytorch but neither worked

#

and i havent found anything on this error

#

also i verified that the cufftw64_10.dll file is there

#

python 3.8.6 pytorch 1.7.1 cuda 11.0 and cudnn 8.0.5 btw

#

and gpu drivers are the latest

austere swift Dec 30, 2020, 4:40 AM

#

@ me if you have an answer btw

austere swift Dec 30, 2020, 5:26 AM

#

So i just fully deleted the torch folder from site-packages as well as fully deleted the cuda folder and then reinstalled both and it worked now

sullen crescent Dec 30, 2020, 6:22 AM

#

configuring cuda, cudnn, and ML/DL framework on windows is such a pain

lapis sequoia Dec 30, 2020, 7:14 AM

#

lament loom Is anyone preparing for Google Summer of code, or have experience with the same?...

Sounds interesting

#

I might think of participating

#

When is it held?

#

The dates

lament loom Dec 30, 2020, 7:34 AM

#

lapis sequoia Sounds interesting

https://summerofcode.withgoogle.com/

Google Summer of Code

lament loom Dec 30, 2020, 7:34 AM

#

lapis sequoia The dates

March 2021

lapis sequoia Dec 30, 2020, 8:28 AM

#

How can I change multiple labels from a value to another in a pandas dataframe. I have tried train_df[train_df['label']=='humor'].label = 'fake' and doesn't really work. .label is a column or Series of the train_df dataframe.

#

wdym label?

#

columns?

#

It's fine, I figured it out train_df.loc[train_df['label']=='humor', 'label'] = 'fake'
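
A minimal sketch of why the `.loc` form works where the first attempt did not: `train_df[mask]` returns a new object, so assigning to its `.label` never reaches `train_df`. The toy frame below is made up for illustration.

```python
import pandas as pd

# Toy stand-in for train_df; these labels are invented for the example.
train_df = pd.DataFrame({"label": ["humor", "real", "humor", "fake"]})

# The boolean mask picks the rows, 'label' picks the column,
# and the assignment writes into train_df itself.
train_df.loc[train_df["label"] == "humor", "label"] = "fake"

print(train_df["label"].tolist())
```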

#

thanks man

#

aight Pepex3

sturdy wren Dec 30, 2020, 9:12 AM

#

pastel glacier beautiful soup vs selenium vs scrapy??

I mean it totally depends on your use case. Go with scrapy

hollow scarab Dec 30, 2020, 11:25 AM

#

if I have a df like this, is it possible to add a row which is not related to these?

📎 unknown.png

#

It would be the 2nd row, labelled 'Weighted number', and then the next 4 columns would have a formula: the 3rd row * 0.6 + the 4th row * 0.4

molten hamlet Dec 30, 2020, 11:49 AM

#

Hey guys, is there any better lib than matplotlib to plot 3d data? its not very interactive, I can only rotate :(

📎 Screenshot_from_2020-12-30_12-48-46.png

lapis sequoia Dec 30, 2020, 12:31 PM

#

Wow

#

Thats some fancy looking plot there

lapis sequoia Dec 30, 2020, 12:32 PM

#

hollow scarab It would be on the 2. row and it would be: 'Weigthed number', and then for the n...

You mean adding a new column?

hollow scarab Dec 30, 2020, 1:18 PM

#

no, a new row @lapis sequoia

#

this is the end goal basically

📎 unknown.png

#

but instead of the formula I want a number displayed in the 2. row ofc

lapis sequoia Dec 30, 2020, 1:34 PM

#

hollow scarab but instead of the formula I want a number displayed in the 2. row ofc

you can create a new dataframe that calculates each column with the given formula, then concatenate the dataframes.
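
A rough sketch of that suggestion, assuming a frame with one label column and four numeric columns (the column names and values are hypothetical, since the screenshot isn't available):

```python
import pandas as pd

# Hypothetical stand-in for the frame in the screenshot.
df = pd.DataFrame({
    "name": ["A", "B"],
    "c1": [10.0, 20.0],
    "c2": [1.0, 2.0],
    "c3": [5.0, 5.0],
    "c4": [0.0, 100.0],
})
value_cols = ["c1", "c2", "c3", "c4"]

# Weighted combination of the two existing rows: 0.6 * first + 0.4 * second.
weighted = df[value_cols].iloc[0] * 0.6 + df[value_cols].iloc[1] * 0.4

# Build a one-row frame and put it on top of the original rows.
new_row = pd.DataFrame([{"name": "Weighted number", **weighted.to_dict()}])
result = pd.concat([new_row, df], ignore_index=True)

print(result)
```

Putting `new_row` first in the list places the weighted row above the rows it is computed from.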

hollow scarab Dec 30, 2020, 1:40 PM

#

Oh, that could work, thank you! @lapis sequoia

lapis sequoia Dec 30, 2020, 1:41 PM

#

you're welcome (:

gritty wedge Dec 30, 2020, 3:57 PM

#

@lapis sequoia ur name is very nice

lapis sequoia Dec 30, 2020, 4:42 PM

#

hi

#

I am 14 year old and 9th grade student

#

I was interested in learning science

#

unfortunately almost all tutorials I've seen have a lot of complex math

#

they have weird symbols and terms I haven't even heard of

#

I wanted to ask what math is required to learn data science

#

please ping me with reply

#

thank you

#

🤔

#

Statistics

#

Discrete Math

#

oh ok

#

I was doing that rn for my exam

#

but

#

they had those weird symbols

#

one looked like a mirrored e

#

which they call sigma

#

I am a math teacher

#

And math is not easy for me either

#

ok

#

I was thinking of just dropping it due to high level maths, but that would be quitting

#

so I thought I might study a math book or 2

lapis sequoia Dec 30, 2020, 4:47 PM

#

lapis sequoia so I thought I might study a math book or 2

Good idea, I recommend Khan Academy

#

thanks

#

so I need statistics and discrete math right?

#

that's all?

#

I don't think that's all, it depends on how deep you want to go into data science

#

well i need enough for machine learning

#

Some high-level math in data science is calculus and linear algebra too

lapis sequoia Dec 30, 2020, 4:50 PM

#

lapis sequoia Some high-level math in data science is calculus and linear algebra too

yes, linear algebra, i remember one guy saying that

#

thanks for advice, I'll try to get a grip on these topics

lapis sequoia Dec 30, 2020, 5:03 PM

#

gritty wedge @lapis sequoia ur name is very nice

ModuleNotFound

gritty wedge Dec 30, 2020, 5:06 PM

#

lapis sequoia ModuleNotFound

Lol

lapis sequoia Dec 30, 2020, 5:24 PM

#

PU_PepeNotFunny

#

Is there any great DL tutorial video on youtube that you guys would recommend? I’ve been doing data analysis with pandas, and I want to dig into deep learning with tensorflow, but can’t seem to find a good tutorial for total beginners.

sly niche Dec 30, 2020, 5:50 PM

#

Word2vec, what use?

#

Just want to play with nlp a bit. Make a thesaurus, gpt suggested that.

#

Also, are colab tpus really free?

twilit pilot Dec 30, 2020, 5:58 PM

#

Right now I have a pandas series that looks like this:

time
2020-12-24 12:34:00-05:00    222.600
2020-12-24 12:35:00-05:00    222.480
2020-12-24 12:36:00-05:00    222.520
2020-12-24 12:37:00-05:00    222.510
2020-12-24 12:38:00-05:00    222.330
                              ...
2020-12-30 12:51:00-05:00    222.510
2020-12-30 12:52:00-05:00    222.505
2020-12-30 12:53:00-05:00    222.565
2020-12-30 12:54:00-05:00    222.565
2020-12-30 12:55:00-05:00    222.535
Name: close, Length: 1000, dtype: float64

The time column is the index and i want to edit it to be numerical like 1, 2, 3, 4, 5, 6, 7, 8, 9.... Can someone help?
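
One possible way to do this, sketched on a three-row stand-in for the series above (only the first few values are reproduced): `reset_index` either drops the time index or moves it into a column.

```python
import pandas as pd

# Small stand-in for the 'close' series with a tz-aware time index.
idx = pd.to_datetime([
    "2020-12-24 12:34:00-05:00",
    "2020-12-24 12:35:00-05:00",
    "2020-12-24 12:36:00-05:00",
])
close = pd.Series([222.600, 222.480, 222.520], index=idx, name="close")
close.index.name = "time"

# Option 1: throw the timestamps away and number the rows 1, 2, 3, ...
numbered = close.reset_index(drop=True)
numbered.index = numbered.index + 1

# Option 2: keep the timestamps as a regular column next to the values.
as_frame = close.reset_index()

print(numbered)
print(as_frame.columns.tolist())
```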

lapis sequoia Dec 30, 2020, 5:59 PM

#

guys, i wanna use Xception as my model to train
Can i load it somehow and just train it from scratch?

limpid oak Dec 30, 2020, 6:16 PM

#

how can i make polygon from linestring

#

my code is not working

#

`import geopandas as gpd
from shapely.geometry import Polygon, mapping

def linestring_to_polygon(fili_shps):
    gdf = gpd.read_file(fili_shps)  # LINESTRING
    gdf['geometry'] = [Polygon(mapping(x)['coordinates']) for x in gdf.geometry]
    return gdf`

#

LINESTRING Z (528736.796 3513075.750 0.000, 52...)

limpid oak Dec 30, 2020, 6:37 PM

#

need help

lapis sequoia Dec 30, 2020, 7:30 PM

#

Hi guys,
I would like to make a user interface in order to visualize stock data that is being webscraped in real time.
I was wondering what you would recommend as a simple user interface. Would something like HTML and CSS suffice to create a basic real-time UI locally? Or is that not ideal as you have to constantly refresh the page to get new data? Or is it easier to stick to something like tkinter or another python package. I'm new to this so I would appreciate any type of advice!!

soft dock Dec 30, 2020, 7:31 PM

#

Flask and Pusher

#

https://pusher.com/channels

Pusher Channels | Build Realtime Features Anywhere

Easily build scalable realtime graphs, geotracking, multiplayer games, and more in your web and mobile apps with our hosted pub/sub messaging API. #PusherChannels

lapis sequoia Dec 30, 2020, 7:36 PM

#

Nice!! Thanks a lot, @soft dock !!
Is Pusher some kind of online host?

soft dock Dec 30, 2020, 7:43 PM

#

more of an API

nova smelt Dec 30, 2020, 8:50 PM

#

hey
anyone want to hop into vc and explain us how to use the neat Checkpointer class
https://neat-python.readthedocs.io/en/latest/_modules/checkpoint.html
we dont know what the diffrent parameters for save_checkpoint exactly are

serene scaffold Dec 30, 2020, 9:07 PM

#

@gentle wagon to answer your question about numpy: It's used for linear algebra, or just to do large numbers of computations in batches. Suppose you're tracking data about the daily temperature in a given city: the array will have 365 elements. If you have that data for ten years, you can stack all those arrays to get a (10, 365)-shaped matrix. And then if you want to get an array of the daily average, you just have to make an array that's the average of each column. Not linear algebra per se, but numpy makes this kind of math easy to do.
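
The temperature example can be sketched like this; the readings are random numbers generated only to give the array the right shape.

```python
import numpy as np

# Made-up data: 10 years (rows) x 365 days (columns) of temperatures.
rng = np.random.default_rng(0)
temps = rng.normal(loc=15.0, scale=8.0, size=(10, 365))

# Averaging over axis 0 collapses the years and leaves one value per day.
daily_avg = temps.mean(axis=0)

print(temps.shape, daily_avg.shape)
```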

lapis sequoia Dec 30, 2020, 9:28 PM

#

guys, i wanna use Xception as my model to train
Can i load it somehow and just train it from scratch?

vocal bay Dec 30, 2020, 9:44 PM

#

Hi guys. I want to learn data science and ml (including dl, rl and drl) but i don't think i have the necessary mathematical background for me to understand it properly. Which resources would you recommend to get me up to speed? And which resources would you recommend for learning data science and ml?

lapis sequoia Dec 30, 2020, 10:01 PM

#

Pusher seems to be dependent on Visual Basic studio. Is there something I can do to prevent using that? I prefer to stick to PyCharm. But I keep getting this error whenever I try to install pusher:

error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

austere swift Dec 30, 2020, 10:30 PM

#

pip needs the build tools when it needs to build some package from source (which probably means theres no whl file for your version of python)

#

what version of python are you on?

astral path Dec 30, 2020, 10:32 PM

#

is it a bad idea to use feature agglomeration on a time series?

lapis sequoia Dec 30, 2020, 10:42 PM

#

austere swift pip needs the build tools when it needs to build some package from source (which...

So I should do a pip install? OR do I need to go to the website of microsoft to download the installer?

lapis sequoia Dec 30, 2020, 11:23 PM

#

hey guys

soft dock Dec 31, 2020, 12:40 AM

#

@lapis sequoia I did a pip install in an isolated virtual environment, as I normally do

modest mantle Dec 31, 2020, 12:42 AM

#

Uh, I'm sorry but I guess it's a mistake.. I don't remember asking anything here '^'

soft dock Dec 31, 2020, 12:42 AM

#

wrong dean, apologies

modest mantle Dec 31, 2020, 12:43 AM

#

No problem :D

lapis sequoia Dec 31, 2020, 12:44 AM

#

guys, i wanna use Xception as my model to train
Can i load it somehow and just train it from scratch?

sullen crescent Dec 31, 2020, 12:50 AM

#

what do you mean by train it from scratch?

#

if you mean train it with your own dataset, which has totally different categories/classes, you can

hasty grail Dec 31, 2020, 12:55 AM

#

lapis sequoia guys, i wanna use Xception as my model to train Can i load it somehow and just t...

The Keras (TensorFlow) library has a built-in Xception model architecture that can be trained from scratch.
https://keras.io/api/applications/xception/

lapis sequoia Dec 31, 2020, 12:56 AM

#

well, i am trying and acc is 0.007

#

i made my own small model with 3 layers and 0.25

#

so idk

hasty grail Dec 31, 2020, 12:57 AM

#

Can you provide more information on what you're doing?

sullen crescent Dec 31, 2020, 12:59 AM

#

i'm kinda lost here, what do you mean with "3 layers and 0.25"?

hasty grail Dec 31, 2020, 12:59 AM

#

I'm guessing 3x conv2d and a dropout of 0.25

austere swift Dec 31, 2020, 1:38 AM

#

lapis sequoia So I should do a pip install? OR do I need to go to the website of microsoft to ...

So the build tools are only needed if pip has to build the package from source because there is no whl file. If you can find the whl file somewhere for your Python install, you can just use that

#

I’m not completely sure if it’s on there, but you can look up Christoph Gohlke. He made a repository of wheel files that you can download, so check if the one you need is on there

#

Make sure you get the right one, should have cp<Python version> and if you have 64 bit Python then you need the one that says 64

#

By cp I mean like if you have Python 3.8 it would be cp38, 3.9 would be cp39, etc
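
If it helps, here is a small sketch that prints the tag and bitness of whichever interpreter runs it, using only the standard library:

```python
import struct
import sys

# e.g. Python 3.8 -> "cp38", Python 3.9 -> "cp39"
cp_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"

# Pointer size in bits: 64 on a 64-bit Python, 32 on a 32-bit one.
bits = struct.calcsize("P") * 8

print(cp_tag, f"{bits}-bit")
```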

#

https://www.lfd.uci.edu/~gohlke/pythonlibs/ there’s the link

lapis sequoia Dec 31, 2020, 2:17 AM

#

nvm i think i got it

#

https://gyazo.com/06a25216b6c835a948f5789351b5d2ac

Gyazo

#

this is going good, isnt it? @hasty grail

#

Does anyone know if there will be an M1 native version of Miniconda for the new Macs? I don't see anything on the install page https://docs.conda.io/en/latest/miniconda.html

hasty grail Dec 31, 2020, 2:22 AM

#

lapis sequoia https://gyazo.com/06a25216b6c835a948f5789351b5d2ac

Better visualize some examples of the model's predictions to be sure it's not just predicting the most common class or something

lapis sequoia Dec 31, 2020, 2:34 AM

#

i am doing the predictions "manually"

#

```py
def predict(path, dims, color):
    img = cv2.imread(path, color)
    img = cv2.resize(img, dims) / 255
    prediction = model.predict(img[np.newaxis, ...])
    print(np.argmax(prediction))
```

#

But for example, ive already found one image it fails

#

Anyway, i would like to know which is the other class most likely to be

#

like, the top 5 classes

#

idk if i am explaining

velvet thorn Dec 31, 2020, 2:35 AM

#

@fervent flume you want a join

#

if you haven't solved it

fervent flume Dec 31, 2020, 2:36 AM

#

@velvet thorn hmm that makes sense

#

not sure what the join should look like tho

velvet thorn Dec 31, 2020, 2:36 AM

#

join on index

#

the datetime index

fervent flume Dec 31, 2020, 2:38 AM

#

so rather than using the for loop, I have this, but it's just as slow:

temp = df.loc[right,:].set_index(pd.DatetimeIndex(left))
temp = temp.groupby(temp.index).apply(lambda x: x.ffill())
temp = temp[~temp.index.duplicated(keep="last")]
df.update(temp)

but you're saying i should be able to join on this rather than doing the group by?

serene scaffold Dec 31, 2020, 3:15 AM

#

fervent flume so rather than using the for loop, I have this, but it's just as slow: ``` temp ...

did you figure out? what are you trying to do exactly?

#

like what kind of data are in these dataframes and what are you doing with it?

fervent flume Dec 31, 2020, 3:17 AM

#

yeah i have some daily data for a bunch of columns, and there are some dates that are "bad" (holidays and weekends), but sometimes data comes in on those "bad" days. So what I want to do is update the data on the day before the "bad" day with the "bad" days data. So for the most part that's going to look like updating Friday's data with data that came in on Saturday and Sunday, if any

#

But I can't just backfill, because Sunday's data would overwrite Saturday's data

#

(on the off chance that data came in on both saturday and sunday)

#

so I have a function that gives me these date pairs, and the code above is what I have to solve the issue

serene scaffold Dec 31, 2020, 3:18 AM

#

fervent flume yeah i have some daily data for a bunch of columns, and there are some dates tha...

so Friday gets the data for the next two days, even though those days are in the future?

fervent flume Dec 31, 2020, 3:18 AM

#

correct

serene scaffold Dec 31, 2020, 3:19 AM

#

And what does it mean for Friday to "get" that data? Is this addition of numbers or something?

fervent flume Dec 31, 2020, 3:19 AM

#

```
temp = temp.groupby(temp.index).apply(lambda x: x.ffill().iloc[-1])
df.update(temp)
```

#

no just overwrite

#

overwrite if not nan

serene scaffold Dec 31, 2020, 3:19 AM

#

ah

#

can you show me an example of what the dataframe looks like?

#

like if you print it?

tepid pawn Dec 31, 2020, 3:21 AM

#

I have a question on how to impute values given the contents of a different column. Like if colA=1 impute 2 into colB, if colA=2 impute 3 into colB. Anyone have an idea on how to do this?

serene scaffold Dec 31, 2020, 3:21 AM

#

tepid pawn I have a question on how to impute values given the contents of a different colu...

can you show what the dataframe looks like?

tepid pawn Dec 31, 2020, 3:22 AM

#

It's the titanic training set. I want to impute average age of people within the same class/sex rather than the mean of the column.

serene scaffold Dec 31, 2020, 3:23 AM

#

tepid pawn It's the titanic training set. I want to impute average age of people within th...

so, conditional mean imputation?

tepid pawn Dec 31, 2020, 3:23 AM

#

that sounds right

serene scaffold Dec 31, 2020, 3:23 AM

#

Let's see if I still have that code.

tepid pawn Dec 31, 2020, 3:23 AM

#

great, thanks!

serene scaffold Dec 31, 2020, 3:24 AM

#

tepid pawn great, thanks!

are you familiar with how you can do masks with dataframes?

tepid pawn Dec 31, 2020, 3:24 AM

#

I've done it I think, but it's been a while

fervent flume Dec 31, 2020, 3:25 AM

#

@serene scaffold

```
2001-05-04   NaN   NaN   NaN      NaN   NaN         NaN       NaN   NaN   NaN   NaN   NaN  NaN NaN   NaN  NaN  NaN   NaN  NaN  ...        NaN  NaN       NaN   NaN   NaN   NaN    NaN       NaN   NaN   NaN   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN
2001-05-04   NaN   NaN   NaN      NaN   NaN         NaN       NaN   NaN   NaN   NaN   NaN  NaN NaN   NaN  NaN  NaN   NaN  NaN  ...        NaN  NaN       NaN   NaN   NaN   NaN    NaN       NaN   NaN   NaN   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN
2001-05-11   NaN   NaN   NaN      NaN   NaN         NaN       NaN   NaN   NaN   NaN   NaN  NaN NaN   NaN  NaN  NaN   NaN  NaN  ...        NaN  NaN       NaN   NaN   NaN   NaN    NaN       NaN   NaN   NaN   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN
2001-05-11   NaN   NaN   NaN      NaN   NaN         NaN       NaN   NaN   NaN   NaN   NaN  NaN NaN   NaN  NaN  NaN   NaN  NaN  ...        NaN  NaN       NaN   NaN   NaN   NaN    NaN       NaN   NaN   NaN   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN
2001-05-18   NaN   NaN   NaN      NaN   NaN         NaN       NaN   NaN   NaN   NaN   NaN  NaN NaN   NaN  NaN  NaN   NaN  NaN  ...        NaN  NaN       NaN   NaN   NaN   NaN    NaN       NaN   NaN   NaN   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN```

#

basically

#

lol

tepid pawn Dec 31, 2020, 3:25 AM

#

um... @fervent flume

serene scaffold Dec 31, 2020, 3:25 AM

#

tepid pawn I've done it I think, but it's been a while

so you can have a mask like (passengers['age'] == n) & (passengers['class'] == 'first'). And that will give you a series of true or false values.

tepid pawn Dec 31, 2020, 3:25 AM

#

lol

#

ok, following

serene scaffold Dec 31, 2020, 3:25 AM

#

And then you can mask another column with that to only get the rows where those conditions are true in the other columns.

#

And take the mean of that
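
A small sketch of that idea on a made-up passenger table (the column names and values are assumptions, not the real Titanic data):

```python
import numpy as np
import pandas as pd

# Invented rows for illustration only.
passengers = pd.DataFrame({
    "sex": ["male", "female", "male", "male"],
    "class": ["first", "first", "first", "third"],
    "age": [40.0, 30.0, np.nan, 20.0],
})

# Series of True/False values, one per row.
mask = (passengers["sex"] == "male") & (passengers["class"] == "first")

# Mean age over the masked rows; NaN values are skipped by default.
mean_age = passengers.loc[mask, "age"].mean()

print(mask.tolist(), mean_age)
```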

#

💥

tepid pawn Dec 31, 2020, 3:26 AM

#

Ok, thanks. I'll give it a shot

serene scaffold Dec 31, 2020, 3:26 AM

#

fervent flume @serene scaffold ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1...

I can't tell what I'm looking at. Try putting it in our paste bin

#

!paste

arctic wedgeBOT Dec 31, 2020, 3:26 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

fervent flume Dec 31, 2020, 3:27 AM

#

https://paste.pythondiscord.com/owedilaqur.typescript @serene scaffold

#

it's very sparse

#

but you can imagine creating a df with like a 10 year daily frequency, and 1-3000 columns

#

then injecting random data with a 95% chance of being nan, and you'd have the dataframe

serene scaffold Dec 31, 2020, 3:31 AM

#

fervent flume https://paste.pythondiscord.com/owedilaqur.typescript @serene scaffold

so each row is a day, and days that are Sunday or Saturday, you need to copy the non-nan values into the Friday row, and then delete the Sunday and Saturday rows?

serene scaffold Dec 31, 2020, 3:32 AM

#

tepid pawn lol

the numbers you're trying to impute: you're replacing nan values, yes?

tepid pawn Dec 31, 2020, 3:33 AM

#

yes

#

blanks

fluid pike Dec 31, 2020, 3:34 AM

#

rn i'm trying to read a csv file on jupyter notebook

#

but i keep getting this issue

fervent flume Dec 31, 2020, 3:34 AM

#

@serene scaffold yeah basically. I have a list of date pairs that I want to replace, Friday is just the most common example, there're other dates in general that i'd want to do this to. And I want to keep the last value if there's a value on both saturday/sunday
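
One way this could be done without a Python-level loop, sketched on a four-day toy frame. The `pairs` dict (bad date mapped to target date) is a hypothetical stand-in for the output of the date-pair function. `GroupBy.last()` takes the last non-NaN value per column, so Sunday wins over Saturday only where Sunday actually has data.

```python
import numpy as np
import pandas as pd

# Fri, Sat, Sun, Mon with sparse made-up data.
idx = pd.to_datetime(["2001-05-04", "2001-05-05", "2001-05-06", "2001-05-07"])
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, 4.0], "b": [10.0, 20.0, np.nan, np.nan]},
    index=idx,
)

# Hypothetical date pairs: each "bad" date points at the date it should update.
pairs = {
    pd.Timestamp("2001-05-05"): pd.Timestamp("2001-05-04"),
    pd.Timestamp("2001-05-06"): pd.Timestamp("2001-05-04"),
}

bad = df.loc[list(pairs)]
target = bad.index.map(pairs)

# One row per target date, holding the last non-NaN value of each column.
rolled = bad.groupby(target).last()

# update() overwrites in place, but only where 'rolled' is not NaN.
df.update(rolled)
df = df.drop(index=list(pairs))

print(df)
```

Here Friday ends up with `a` from Sunday and `b` from Saturday, and the weekend rows are dropped afterwards.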

fluid pike Dec 31, 2020, 3:34 AM

#

📎 unknown.png

#

this doesn't work

#

wait nvm

#

lemme see again

serene scaffold Dec 31, 2020, 3:36 AM

#

fluid pike

how much programming experience would you say you have?

fluid pike Dec 31, 2020, 3:36 AM

#

nvm I got it to work

#

I put in the wrong filename

serene scaffold Dec 31, 2020, 3:36 AM

#

I was just going to say that jupyter notebooks tend to be confusing for learners.

fluid pike Dec 31, 2020, 3:37 AM

#

serene scaffold how much programming experience would you say you have?

a bit from uni, but i haven't done much other projects

serene scaffold Dec 31, 2020, 3:37 AM

#

Jupyter notebooks can become convoluted since the cells can be executed in any order you'd like.

serene scaffold Dec 31, 2020, 3:38 AM

#

fervent flume @serene scaffold yeah basically. I have a list of date pairs that I want ...

are you specifically only trying to replace NaNs?

serene scaffold Dec 31, 2020, 3:44 AM

#

tepid pawn Ok, thanks. I'll give it a shot

did it work?

#

Someone ping me if they want me to come back.

tepid pawn Dec 31, 2020, 3:48 AM

#

I'm working on it now.

hasty grail Dec 31, 2020, 3:52 AM

#

lapis sequoia idk if i am explaining

Sorry I wasn't looking, anyway you could use tf.math.top_k or take the last k elements of np.argsort
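
The `np.argsort` route might look like this; the probabilities are made up and stand in for the output of `model.predict` on a single image.

```python
import numpy as np

# Stand-in for model.predict(...) on one image: shape (1, n_classes).
prediction = np.array([[0.05, 0.40, 0.10, 0.02, 0.25, 0.06, 0.12]])

k = 5
# argsort is ascending, so take the last k indices and reverse them.
top_k = np.argsort(prediction[0])[-k:][::-1]

for class_id in top_k:
    print(class_id, prediction[0][class_id])
```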

serene scaffold Dec 31, 2020, 3:55 AM

#

hasty grail Sorry I wasn't looking, anyway you could use `tf.math.top_k` or take the last `k...

you do transformers stuff?

#

or is that tensorflow?

hasty grail Dec 31, 2020, 3:55 AM

#

the latter

serene scaffold Dec 31, 2020, 3:56 AM

#

hasty grail the latter

I've used tensorflow but I still haven't got a clear picture of what it "is". Is it basically numpy on the gpu?

hasty grail Dec 31, 2020, 3:57 AM

#

It is so much more

#

I think the most defining aspect of it would be graph execution

tepid pawn Dec 31, 2020, 3:57 AM

#

@serene scaffold I have the numbers (means) that I want to insert into the NaN points. I just don't know how to conditionally impute them. At first I was thinking something like if df['column'] == x & df['column'] == y, but don't know where to go from there.

#

I got the means with grouping

serene scaffold Dec 31, 2020, 3:57 AM

#

but if it's everything then I'll never wrap my head around it.

hasty grail Dec 31, 2020, 3:58 AM

#

Also since it's integrated with Keras you don't have to write your own training loops

#

Makes training ML models so much more convenient

serene scaffold Dec 31, 2020, 3:59 AM

#

tepid pawn @serene scaffold I have the numbers (means) that I want to insert into the...

once you have the average of the non-nan values for that mask, you can use fillna to replace the nans.

serene scaffold Dec 31, 2020, 3:59 AM

#

hasty grail I think the most defining aspect of it would be graph execution

what is graph execution?

hasty grail Dec 31, 2020, 4:00 AM

#

It's too much for me to explain, would be easier to read the docs

#

https://www.tensorflow.org/guide/intro_to_graphs

tepid pawn Dec 31, 2020, 4:00 AM

#

Here I created a new df with fillna, but it was with the mean of the entire column.

train_num2 = train_num.fillna(train_num.mean().round(0))

hasty grail Dec 31, 2020, 4:01 AM

#

ML training is usually done in graphs, while regular computation uses eager execution

#

Eager execution is essentially just Python logic

fervent flume Dec 31, 2020, 4:01 AM

#

@serene scaffold no all values should be updated if there's a new non-nan value later

slender oracle Dec 31, 2020, 4:01 AM

#

I think lazy execution is a little more accurate. e.g. spark

serene scaffold Dec 31, 2020, 4:01 AM

#

mask = (df['age'] == 40) & (df['class'] == 'first')
df['died'].fillna(np.nanmean(df['died', mask]))

I didn't look up the methods or anything for this so this is probably wrong, but I think something along these lines will work.

tepid pawn Dec 31, 2020, 4:02 AM

#

ok, I'll play with it. thanks again

fervent flume Dec 31, 2020, 4:02 AM

#

i think it's mask, died (index, columns)

#

but i could be wrong

serene scaffold Dec 31, 2020, 4:03 AM

#

might even be df['died'][mask]
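
For reference, a sketch of the indexing being discussed, on an invented frame: `df.loc[mask, 'col']` takes the row selector first and the column second, and `fillna` returns a new object, so the result has to be assigned back.

```python
import numpy as np
import pandas as pd

# Made-up data; 'class' and 'age' are placeholder column names.
df = pd.DataFrame({
    "class": ["first", "first", "first", "third", "third"],
    "age": [40.0, np.nan, 50.0, 20.0, np.nan],
})

mask = df["class"] == "first"

# Mean of the non-NaN ages inside the masked rows.
group_mean = df.loc[mask, "age"].mean()

# Fill NaNs only in the masked rows and write the result back.
df.loc[mask, "age"] = df.loc[mask, "age"].fillna(group_mean)

print(df)
```

Rows outside the mask keep their NaN, so the same steps would be repeated per group (or replaced by a groupby/transform).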

hasty grail Dec 31, 2020, 4:03 AM

#

Graph execution is a bit like lazy execution but not really. It involves compiling Python functions through tf.function which run as graphs during runtime, allowing the engine to perform optimizations such as parallelizing and merging operations.

slender oracle Dec 31, 2020, 4:04 AM

#

Gotcha. That is similar to spark as well.

#

Can see it when looking at the "explain" for a given dataframe

hasty grail Dec 31, 2020, 4:06 AM

#

This is also where a lot of the original notoriety of TensorFlow came from though. Originally, everything had to be done via graphs, which made it incredibly difficult to debug because breakpoints don't work in graph execution, as the code that is actually executed is dynamically generated elsewhere when the function is compiled.

#

also you had to write boilerplate code for the compile-run process

slender oracle Dec 31, 2020, 4:08 AM

#

Was it TensorFlow 2.0 that added the ability to do stuff outside of graphs? I haven't really messed around with it for a long, long time (~2015-ish)

hasty grail Dec 31, 2020, 4:09 AM

#

Yup.

#

Also even in graph mode you don't have to mess around with tf.Session anymore. You just use the tf.function decorator around whatever function you want to compile.

#

the first time the function is evaluated, it is automatically compiled

slender oracle Dec 31, 2020, 4:12 AM

#

Have you tried using PyTorch? If so, what are your thoughts on it vs TensorFlow?

#

I used the old Torch package in Lua, but haven't touched the python version yet.

hasty grail Dec 31, 2020, 4:15 AM

#

Only in passing, the thing I don't like is that you still have to define your training/evaluation loop explicitly whereas TF 2.0 already has a default implementation thanks to Keras

#

However, it is easier to debug because it uses eager execution all the way

slender oracle Dec 31, 2020, 4:17 AM

#

I think I'm missing something, but can't you use something like a CrossValidator class (or variant thereof) to abstract away the training/evaluation part?

fervent flume Dec 31, 2020, 4:17 AM

#

tensorflow is so annoying to work with

#

PyTorch is so much easier

hasty grail Dec 31, 2020, 4:18 AM

#

slender oracle I think I'm missing something, but can't you use something like a CrossValidator...

I don't think that's a built-in thing

#

If you wanted to, you could use supplementary libraries like https://github.com/mv1388/aitoolbox but it's extra work

fervent flume Dec 31, 2020, 4:20 AM

#

microsoft's version was the best though. By far the most intuitive to understand and to use imo. the lack of explicit loops and the way recurrence was handled was also super nice.

#

too bad that died

desert parcel Dec 31, 2020, 4:50 AM

#

can someone explain this error. Specifically what does "non-singleton dimension" mean?

#

RuntimeError: The size of tensor a (1338) must match the size of tensor b (5) at non-singleton dimension 1

tepid pawn Dec 31, 2020, 5:03 AM

#

@serene scaffold I couldn't get it to work with masking. I figured it out with a different groupby, and defining a function to impute, then transform. titanic_tr is the training df

#impute age based on sex/pclass

#Create a groupby object: by_sex_class
by_sex_class = titanic_tr.groupby(['Sex', 'Pclass'])

#Write a function that imputes median
def impute_median(series):
    return series.fillna(series.median())

#Impute age and assign to titanic['age']
titanic_tr['Age'] = by_sex_class['Age'].transform(impute_median)

serene scaffold Dec 31, 2020, 5:44 AM

#

tepid pawn @serene scaffold I couldn't get it to work with masking. I figured it out...

looks like this was my solution when I did conditional mean imputation for homework

def conditional_mean_imputation(df: pd.DataFrame) -> pd.DataFrame:
    label_series = df[BIN_LABEL]
    df = df.groupby(BIN_LABEL).transform(lambda x: x.fillna(x.mean()))
    return df.join(label_series)

#

df[BIN_LABEL] was the column that identified the class for that row. Also worth noting that this is doing the imputation for every column, or something

velvet thorn Dec 31, 2020, 5:46 AM

#

desert parcel ```RuntimeError: The size of tensor a (1338) must match the size of tensor b (5)...

say you have a tensor with shape (x, y, 1)

#

the last dimension is a singleton dimension

#

the first two are not
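
To see the rule in action, here is a sketch using NumPy as a stand-in, since it follows the same style of broadcasting; the shapes are chosen to mirror the sizes in the error message and the leading dimension of 3 is arbitrary.

```python
import numpy as np

a = np.zeros((3, 1338))
b = np.zeros((3, 5))

# Dimension 1 is 1338 vs 5: neither is 1, so the shapes cannot broadcast.
try:
    a + b
    failed = False
except ValueError:
    failed = True

# A size-1 (singleton) dimension is stretched to match the other operand.
c = np.zeros((3, 1))
ok_shape = (a + c).shape

print(failed, ok_shape)
```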

sleek fjord Dec 31, 2020, 5:53 AM

#

Hey

#

I'm new to coding

#

and I'm trying to data scrape

#

some nba stats

lapis sequoia Dec 31, 2020, 5:54 AM

#

@sleek fjord, no

velvet thorn Dec 31, 2020, 5:54 AM

#

go on

sleek fjord Dec 31, 2020, 5:54 AM

#

📎 Screen_Shot_2020-12-31_at_4.24.09_pm.png

#

what am i doing wrong?

velvet thorn Dec 31, 2020, 5:54 AM

#

okay in general

#

don't post screenshots please

#

post code as text; it's easier to read and debug.

sleek fjord Dec 31, 2020, 5:54 AM

#

o sorry

velvet thorn Dec 31, 2020, 5:54 AM

#

!code

arctic wedgeBOT Dec 31, 2020, 5:54 AM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

sleek fjord Dec 31, 2020, 5:54 AM

#

Traceback (most recent call last):
  File "/Users/airmac/Documents/NBA Python/Untitled.py", line 1, in <module>
    from basketball_reference_scraper.teams import get_roster, get_team_stats, get_opp_stats, get_roster_stats, get_team_misc
ImportError: No module named basketball_reference_scraper.teams
[Finished in 0.072s]

#

this is the error code

#


a = get_opp_stats('BOS', 1955, data_format='TOTAL')
print(a)

#

thats the code

velvet thorn Dec 31, 2020, 5:55 AM

#

what do you understand by this ImportError: No module named basketball_reference_scraper.teams

sleek fjord Dec 31, 2020, 5:55 AM

#

nothing

velvet thorn Dec 31, 2020, 5:55 AM

#

so you're trying to import something

sleek fjord Dec 31, 2020, 5:55 AM

#

i did the pip install basketball_reference_scraper

velvet thorn Dec 31, 2020, 5:55 AM

#

from a module (Python file) that it can't find

#

presumably either your install failed

#

or you're using the wrong Python installation

sleek fjord Dec 31, 2020, 5:56 AM

#

it didnt fail

sleek fjord Dec 31, 2020, 5:56 AM

#

velvet thorn or you're using the wrong Python installation

how do i see if this is the case
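
One quick check, sketched with the standard library only: run this from the same place the failing script runs (for example the editor's run command) and compare the path with the one `pip` installed into.

```python
import importlib.util
import sys

# Which interpreter is actually running this file, and which version is it?
print(sys.executable)
print(sys.version_info[:3])

# Is the package visible to this particular interpreter?
spec = importlib.util.find_spec("basketball_reference_scraper")
print("found" if spec is not None else "not visible to this interpreter")
```

If the package is not visible, installing with `python3 -m pip install basketball_reference_scraper` ties the install to whichever interpreter `python3` refers to.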

velvet thorn Dec 31, 2020, 5:56 AM

#

well

#

that seems to be the case, considering you can't import it

#

or the module name could be wrong

sleek fjord Dec 31, 2020, 5:56 AM

#

can i post links?

#

to the api thing?

#

https://github.com/vishaalagartha/basketball_reference_scraper

GitHub

vishaalagartha/basketball_reference_scraper

A python module for scraping static and dynamic content from Basketball Reference. - vishaalagartha/basketball_reference_scraper

#

its been updated recently

#

and I have the latest version of python

#

Python 2.7.16

#

this is the version

#

damn left on read

#

okay

serene scaffold Dec 31, 2020, 6:02 AM

#

sleek fjord Python 2.7.16

python2 is deprecated

sleek fjord Dec 31, 2020, 6:02 AM

#

why

serene scaffold Dec 31, 2020, 6:02 AM

#

It was released a long time ago and the python community has moved on to python 3.

sleek fjord Dec 31, 2020, 6:03 AM

#

by updating

#

should my error be resolved?

vital ocean Dec 31, 2020, 6:04 AM

#

i think so

serene scaffold Dec 31, 2020, 6:04 AM

#

it might somehow solve it. The problem is that Python can't see the module you're referring to.

vital ocean Dec 31, 2020, 6:04 AM

#

serene scaffold it might somehow solve it. The problem is that Python can't see the module you'r...

yes

serene scaffold Dec 31, 2020, 6:05 AM

#

sleek fjord should my error be resolved?

you should have python 3 in either case. There's almost no point learning python 2 at this point because anyone who hasn't updated their project to 3 has probably abandoned that project.

vital ocean Dec 31, 2020, 6:05 AM

#

btw @sleek fjord if u are scraping data u can use Parsehub it will help u

sleek fjord Dec 31, 2020, 6:06 AM

#

do i need to relaunch atom after i get the new version

vital ocean Dec 31, 2020, 6:06 AM

#

scrape any website for free

serene scaffold Dec 31, 2020, 6:06 AM

#

sleek fjord do i need to relaunch atom after i get the new version

I am not sure.

sleek fjord Dec 31, 2020, 6:06 AM

#

vital ocean btw @sleek fjord if u are scraping data u can use Parsehub it will hel...

can you web scrape something that is hidden behind a login

vital ocean Dec 31, 2020, 6:06 AM

#

sleek fjord can you web scrape something that is hidden behind a login

hmm

#

i think so

serene scaffold Dec 31, 2020, 6:07 AM

#

sleek fjord can you web scrape something that is hidden behind a login

I would check that you're allowed to scrape that website

vital ocean Dec 31, 2020, 6:07 AM

#

serene scaffold I would check that you're allowed to scrape that website

yes

#

try adding /robots.txt to the end of the url of the website you're scraping

#

that will tell u what u can scrape
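
A minimal sketch of that robots.txt check, using Python's standard-library `urllib.robotparser`. The rules and URLs here are made up for illustration and are parsed from a string so nothing is fetched; for a real site you would point the parser at the live file with `set_url(...)` and `read()`. Note that robots.txt is only advisory, so a site's terms of service still apply.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A page outside the disallowed prefix, then one inside it.
print(is_allowed(EXAMPLE_ROBOTS_TXT, "https://example.com/teams/BOS.html"))
print(is_allowed(EXAMPLE_ROBOTS_TXT, "https://example.com/private/data.html"))
```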

sleek fjord Dec 31, 2020, 6:07 AM

#

i just downloaded the new version of python and its still saying my version is 2.7.16

vital ocean Dec 31, 2020, 6:08 AM

#

sleek fjord i just downloaded the new version of python and its still saying my version is 2...

ah ok

serene scaffold Dec 31, 2020, 6:08 AM

#

sleek fjord i just downloaded the new version of python and its still saying my version is 2...

try using python3 as your command instead of python
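
A small sketch for checking this from inside a script: it reports the version and path of whichever interpreter is actually running the file, which can differ between `python`, `python3`, and an editor's run button. The function name is made up for illustration.

```python
import sys


def interpreter_summary() -> str:
    """Describe the interpreter that is running this code."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return f"Python {version} at {sys.executable}"


print(interpreter_summary())
```

If this prints a 2.7 version when run from the editor but a 3.x version when run with `python3` in the terminal, the two are using different interpreters.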

vital ocean Dec 31, 2020, 6:08 AM

#

hmm

sleek fjord Dec 31, 2020, 6:08 AM

#

yeah its saying 3.9.1

#

now

vital ocean Dec 31, 2020, 6:08 AM

#

cool

sleek fjord Dec 31, 2020, 6:09 AM

#

and its saying that ive installed basketball-reference-scraper

#

if i do pip show

#

and its still coming up with the same error

#

```
Traceback (most recent call last):
  File "/Users/airmac/Documents/NBA Python/Untitled.py", line 1, in <module>
    from basketball_reference_scraper.teams import get_roster, get_team_stats, get_opp_stats, get_roster_stats, get_team_misc
ImportError: No module named basketball_reference_scraper.teams
[Finished in 0.139s]```

vital ocean Dec 31, 2020, 6:11 AM

#

is the module name right?

sleek fjord Dec 31, 2020, 6:11 AM

#

yes

#

just to double check i downloaded the example from the official github

#

and ran it

#

and that doesn't work either

serene scaffold Dec 31, 2020, 6:12 AM

#

sleek fjord just to double check i downloaded the example from the official github

how did you download it?

vital ocean Dec 31, 2020, 6:12 AM

#

just try from cmd

desert parcel Dec 31, 2020, 6:12 AM

#

velvet thorn the first two are not

ah got it thank you

sleek fjord Dec 31, 2020, 6:12 AM

#

serene scaffold how did you download it?

i pressed raw then save as

serene scaffold Dec 31, 2020, 6:12 AM

#

try pip install git+https://github.com/vishaalagartha/basketball_reference_scraper.git

vital ocean Dec 31, 2020, 6:12 AM

#

hmm

#

in cmd

sleek fjord Dec 31, 2020, 6:13 AM

#

im on mac

vital ocean Dec 31, 2020, 6:13 AM

#

ok

serene scaffold Dec 31, 2020, 6:13 AM

#

that's fine

sleek fjord Dec 31, 2020, 6:13 AM

#

its saying

#

'zsh: command not found: pip'

serene scaffold Dec 31, 2020, 6:13 AM

#

try the same command with python3 -m pip instead of just pip

vital ocean Dec 31, 2020, 6:13 AM

#

yes

sleek fjord Dec 31, 2020, 6:14 AM

#

so

#

python3 -m pip https://github.com/vishaalagartha/basketball_reference_scraper.git

#

?

serene scaffold Dec 31, 2020, 6:15 AM

#

python3 -m pip install git+https://github.com/vishaalagartha/basketball_reference_scraper.git
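
The reason `python3 -m pip` helps is that pip installs into whichever interpreter runs it, so the package ends up where `python3` will look. As a rough sketch (the helper name is made up for illustration), you can ask a given interpreter whether it can find a top-level package without importing it:

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Shows which interpreter is answering, then whether it can see the package.
print(sys.executable)
print(is_importable("basketball_reference_scraper"))
```

Running this with `python3` and again with whatever the editor uses would show whether both see the same installed packages.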

vital ocean Dec 31, 2020, 6:15 AM

#

yeah that's what i am sayin'

sleek fjord Dec 31, 2020, 6:15 AM

#

thank you

lapis sequoia Dec 31, 2020, 6:15 AM

#

@gaunt heron no

serene scaffold Dec 31, 2020, 6:15 AM

#

no?

vital ocean Dec 31, 2020, 6:16 AM

#

lapis sequoia @gaunt heron no

??

#

what's this

sleek fjord Dec 31, 2020, 6:16 AM

#

im going to relaunch atom

vital ocean Dec 31, 2020, 6:16 AM

#

sleek fjord im going to relaunch atom

ohk

#

how long have u been using atom.io?

sleek fjord Dec 31, 2020, 6:16 AM

#

?

#

i just downloaded

vital ocean Dec 31, 2020, 6:17 AM

#

sleek fjord i just downloaded

ohk

#

np

sleek fjord Dec 31, 2020, 6:17 AM

#

its coming up with the same problem

#


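from basketball_reference_scraper.teams import get_roster, get_team_stats, get_opp_stats, get_roster_stats, get_team_misc

a = get_opp_stats('BOS', 1955, data_format='TOTAL')
print(a)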

#

something wrong with my code?

desert parcel Dec 31, 2020, 6:18 AM

#

Is there a place I can share a jupyter notebook?

#

Cause I wanna ask a question

serene scaffold Dec 31, 2020, 6:19 AM

#

sleek fjord ```from basketball_reference_scraper.teams import get_roster, get_team_stats, ge...

any time there is something wrong with your code in the sense that there's an error message, please always share the whole error message.

sleek fjord Dec 31, 2020, 6:19 AM

#

'Traceback (most recent call last):
File "/Users/airmac/Documents/NBA Python/Untitled.py", line 1, in <module>
from basketball_reference_scraper.teams import get_roster, get_team_stats, get_opp_stats, get_roster_stats, get_team_misc
ImportError: No module named basketball_reference_scraper.teams
[Finished in 0.12s]'

#

thats the entire error message

serene scaffold Dec 31, 2020, 6:19 AM

#

okay, and what was the terminal output when you ran that command from before?

sleek fjord Dec 31, 2020, 6:20 AM

#

when i downloaded it?

#

what

serene scaffold Dec 31, 2020, 6:20 AM

#

serene scaffold `python3 -m pip install git+https://github.com/vishaalagartha/basketball_referen...

I'm referring to this command. Did you run it?

sleek fjord Dec 31, 2020, 6:20 AM

#

yes

serene scaffold Dec 31, 2020, 6:20 AM

#

What happened?

arctic wedgeBOT Dec 31, 2020, 6:21 AM

#

Hey @sleek fjord!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

serene scaffold Dec 31, 2020, 6:21 AM

#

!paste

arctic wedgeBOT Dec 31, 2020, 6:21 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

sleek fjord Dec 31, 2020, 6:22 AM

#

it basically said

#

done

serene scaffold Dec 31, 2020, 6:22 AM

#

Okay

#

Where is your .py file that contains this code located?

serene scaffold Dec 31, 2020, 6:22 AM

#

sleek fjord ```from basketball_reference_scraper.teams import get_roster, get_team_stats, ge...

This code is in a file. Where is that file?

sleek fjord Dec 31, 2020, 6:22 AM

#

in my documents folder

#

under a folder called

serene scaffold Dec 31, 2020, 6:22 AM

#

Is that where your terminal is operating from?

sleek fjord Dec 31, 2020, 6:22 AM

#

no

#

am i supposed to do that

serene scaffold Dec 31, 2020, 6:23 AM

#

Can you go there in the terminal?

#

Yes, that's the easiest way for us to help you debug

sleek fjord Dec 31, 2020, 6:23 AM

#

how do i do that on mac

serene scaffold Dec 31, 2020, 6:23 AM

#

if you use a UI, we'd have to have extensive knowledge about how that UI works.

#

cd is usually the command to change directories

#

and ls usually tells you what is in your current directory

sleek fjord Dec 31, 2020, 6:25 AM

#

'cd: string not in pwd: /Users/airmac/Documents/NBA'

#

when i do ls

#

'Applications Documents Library Music Public
Desktop Downloads Movies Pictures get-pip.py'

serene scaffold Dec 31, 2020, 6:25 AM

#

do cd Documents

sleek fjord Dec 31, 2020, 6:26 AM

#

when i write ls now it comes up with

#

'Excel NBA Python School'

#

should i do cd nba python

serene scaffold Dec 31, 2020, 6:27 AM

#

yes, but you might need to put "NBA Python" in quotes
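
The earlier `cd: string not in pwd` error is what zsh prints when `cd` gets two arguments, which is what happens when the space in the folder name is left unquoted. A small sketch with Python's standard-library `shlex` shows how a shell splits the two versions of the command (the variable names are made up for illustration):

```python
import shlex

folder = "NBA Python"

# Unquoted, the shell sees three separate words.
print(shlex.split("cd NBA Python"))

# shlex.quote wraps the name so it stays a single argument.
quoted_command = "cd " + shlex.quote(folder)
print(quoted_command)
print(shlex.split(quoted_command))
```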

sleek fjord Dec 31, 2020, 6:27 AM

#

yeah okay

#

thanks

#

do i do the pip install

#

now

serene scaffold Dec 31, 2020, 6:27 AM

#

no, you said that worked

#

can you do ls again?

sleek fjord Dec 31, 2020, 6:28 AM

#

'Untitled.py'

serene scaffold Dec 31, 2020, 6:28 AM

#

is Untitled.py the file that contains the code you referred to earlier?

sleek fjord Dec 31, 2020, 6:28 AM

#

yes

serene scaffold Dec 31, 2020, 6:28 AM

#

sleek fjord ```from basketball_reference_scraper.teams import get_roster, get_team_stats, ge...

alright, do python3 Untitled.py

sleek fjord Dec 31, 2020, 6:28 AM

#

you're a fast typer

serene scaffold Dec 31, 2020, 6:28 AM

#

thxxx

sleek fjord Dec 31, 2020, 6:28 AM

#

ily

#

wow

serene scaffold Dec 31, 2020, 6:28 AM

#

it worked?

sleek fjord Dec 31, 2020, 6:29 AM

#

```
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/basketball_reference_scraper/teams.py", line 6, in <module>
    from constants import TEAM_TO_TEAM_ABBR, TEAM_SETS
ModuleNotFoundError: No module named 'constants'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/airmac/Documents/NBA Python/Untitled.py", line 1, in <module>
    from basketball_reference_scraper.teams import get_roster, get_team_stats, get_opp_stats, get_roster_stats, get_team_misc
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/basketball_reference_scraper/teams.py", line 10, in <module>
    from basketball_reference_scraper.utils import remove_accents
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/basketball_reference_scraper/utils.py", line 4, in <module>
    import unicodedata, unidecode
ModuleNotFoundError: No module named 'unidecode'```

serene scaffold Dec 31, 2020, 6:29 AM

#

so the problem is in their code (not yours): it fails to import the unidecode module it depends on.
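
Reading the traceback, there are two errors chained together. The first (`No module named 'constants'`) looks like an import that was caught and retried another way; the one that actually stops the script is the last, `No module named 'unidecode'`, which points at a missing third-party dependency. Installing it with `python3 -m pip install unidecode` may be enough, though that is an assumption and was not tried in this conversation. The sketch below reproduces the same chained-traceback shape with made-up module names and shows how to pull out the module that is really missing:

```python
def load_with_fallback():
    """Mimic a try/except import fallback; both module names are hypothetical."""
    try:
        import constants_that_do_not_exist_abc  # first attempt fails
    except ImportError:
        # The retry fails too, so Python chains the errors and would print
        # "During handling of the above exception, another exception occurred".
        import dependency_that_does_not_exist_xyz


def missing_modules():
    """Return (module that stopped the script, module from the earlier error)."""
    try:
        load_with_fallback()
    except ModuleNotFoundError as error:
        # error.name is the module named in the final error message;
        # error.__context__ is the earlier exception that was being handled.
        return error.name, error.__context__.name


print(missing_modules())
```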

sleek fjord Dec 31, 2020, 6:29 AM

#

no

#

wow

#

how could they

desert parcel Dec 31, 2020, 6:30 AM

#

@sleek fjord is your problem fixed?

serene scaffold Dec 31, 2020, 6:30 AM

#

desert parcel @sleek fjord is your problem fixed?

no, the library they installed contains broken code.

desert parcel Dec 31, 2020, 6:30 AM

#

Ahh

#

Well I don't think he can fix that then

#data-science-and-ml

time series correlations over time over a 36 month window: shape((33300, 30)