#data-science-and-ml | Python | Page 271

ripe forge Nov 23, 2020, 3:54 PM

#

When your model is finished training, you don't actually know if it learnt to "parrot" the correct answers, or actually understood something

#

It's like training a parrot to say the word yellow. Now, if you show it a yellow card, and ask it what colour it is, the parrot says yellow. You're like yay i taught it colours. Right?

#

"no Darr, that doesn't make sense. Show it a blue card, and it will still say yellow!"

#

Well, removing the test set is like removing all cards except yellow. You just removed the blue card... That's bad.

#

You can't know you achieved your goal in ML without a test set, because you can fool yourself into thinking you achieved it when all you did was build a parrot.
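
A minimal sketch of the parrot idea in plain Python. The "model" here is a hypothetical lookup table that memorises its training cards and says "yellow" for anything it has not seen; the card data is made up for illustration.

```python
# Hypothetical "parrot" model: it memorises the training cards and
# falls back to "yellow" for anything it has never seen.
train = [("card_1", "yellow"), ("card_2", "yellow"), ("card_3", "yellow")]
test = [("card_4", "blue"), ("card_5", "yellow")]

memory = dict(train)

def predict(card):
    return memory.get(card, "yellow")

def accuracy(cards):
    correct = sum(predict(card) == label for card, label in cards)
    return correct / len(cards)

train_acc = accuracy(train)  # perfect, because the answers were memorised
test_acc = accuracy(test)    # lower, because the blue card exposes the parrot
print(train_acc, test_acc)
```

Only the held-out cards reveal that nothing about colour was learnt.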

fleet heath Nov 23, 2020, 4:02 PM

#

yeah

steady harbor Nov 23, 2020, 4:10 PM

#

Can anyone help me with the theory behind Langton's Ant ... 😐

#

https://colab.research.google.com/drive/1SmrGUrBqOP8G8v7MWI4VuTltGTUoAqES?usp=sharing

Google Colaboratory

#

This is the code btw ... it needs to be done with matplotlib, and the code for it is already written. Only the theory part is left, and I don't understand some parts of it ... like the for loops at the bottom:

bold rune Nov 23, 2020, 5:27 PM

#

Does anyone know how to separate a single column vector from the rest in a matrix that has multiple columns? I'm using numpy

wild pine Nov 23, 2020, 5:39 PM

#

do you mean like extracting [1,4,7] from [[1,2,3],[4,5,6],[7,8,9]] ? @bold rune

bold rune Nov 23, 2020, 5:40 PM

#

Yeah pretty much, but someone else already answered. All I needed was data[:, n] to extract the n'th column:)

wild pine Nov 23, 2020, 5:40 PM

#

yeah, exactly ^^
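
For reference, a short sketch of the `data[:, n]` slicing mentioned above, using the 3x3 example from the question. The variable names are only illustrative.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

first_col = data[:, 0]             # 1-D array, shape (3,)
first_col_2d = data[:, [0]]        # keeps the column shape, (3, 1)
rest = np.delete(data, 0, axis=1)  # every column except column 0
```

Indexing with a list (`[0]`) instead of a plain integer is what keeps the result as a column vector.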

proper swift Nov 23, 2020, 6:08 PM

#

does any one know how to move a row in pandas to the end of a dataframe?

remote pond Nov 23, 2020, 6:10 PM

#

maybe you could swap the row with the last row

#

temp = row, row = last row, last row = temp

red briar Nov 23, 2020, 6:14 PM

#

https://stackoverflow.com/questions/47295566/how-to-use-pandas-to-shift-the-last-row-to-the-first

Stack Overflow

How to use pandas to shift the last row to the first

So I have a dataframe that looks like this:

                     #1                     #2

1980-01-01 11.6985 126.0
1980-01-02 43.6431 ...

ivory stag Nov 23, 2020, 6:17 PM

#

heyy, can somebody help me with how to count number of rows in a dataframe, please?

proper swift Nov 23, 2020, 6:18 PM

#

@ivory stag https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.count.html

ivory stag Nov 23, 2020, 6:18 PM

#

thank you so much!!!
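
One caveat worth sketching: `DataFrame.count` counts non-missing cells per column, which is not always the number of rows. The data below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "marks": [80, None, 67]})

n_rows = len(df)          # number of rows, missing values included
n_rows_alt = df.shape[0]  # same number, from the (rows, columns) tuple
non_null = df.count()     # per-column count of non-missing cells
```

If the goal is simply "how many rows", `len(df)` or `df.shape[0]` is probably the safer choice.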

proper swift Nov 23, 2020, 6:19 PM

#

@remote pond dont want to exchange any rows, just move a row to the bottom

#

i was thinking of dropping the row, then reappending it to the dataframe, but dont think thats efficient?

remote pond Nov 23, 2020, 6:20 PM

#

sorry, I misunderstood it

#

the only way I can find is the same as yours

#

if dataframe is a linked list then it's easy, but dataframe is not, I think your idea is efficient enough
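
The drop-and-reappend idea can be sketched as below. This is only one way to do it, and the frame and the row label are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

label = "b"  # hypothetical row to move to the bottom

# Everything except the row, followed by the row itself.
moved = pd.concat([df.drop(index=label), df.loc[[label]]])
```

Using `df.loc[[label]]` (a list) keeps the row as a one-row DataFrame so it can be concatenated directly.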

ivory stag Nov 23, 2020, 6:31 PM

#

i am sorry but i have another doubt as well T-T
so here is a dataframe (named 'class_report') which has columns called marks and name, and i need a statement that outputs the rows where the mark is less than 50 and the name has an 'r' (i tried to do it but then i cant get the names with 'r')

#

print(class_report[class_report.marks<50]) - i got the answer
print(class_report[class_report.name=='r']) - i got no answer
but i need both of them in the same line, i cant use two print commands here T-T

remote pond Nov 23, 2020, 7:05 PM

#

could you show the output of class_report.dtypes

#

one possibility is that there's no "r" in the name col

ivory stag Nov 23, 2020, 7:07 PM

#

lol no there is 'r' in the dataframe but it shows 'empty dataframe' lol

#

but the thing is we need to do both the conditions in a single line

remote pond Nov 23, 2020, 7:10 PM

#

class_report[class_report.marks<50][class_report.name=='r'] this should work

ivory stag Nov 23, 2020, 7:13 PM

#

let me try that one

#

      name  marks grades
0   andrew     80      B
1  jackson     45      D
2   ritika     90      A
3    sarah     67      C
Empty DataFrame
Columns: [name, marks, grades]
Index: []
UserWarning: Boolean Series key will be reindexed to match DataFrame index.
  print(class_report[class_report.marks<50][class_report.name=='r']

#

nope :((

remote pond Nov 23, 2020, 7:17 PM

#

do you want names that start with r, or names that are just 'r'?

#

it seems no 'r'

ivory stag Nov 23, 2020, 7:19 PM

#

wait lets says that we need to get rows of students with grades 'B', what would we do then?

remote pond Nov 23, 2020, 7:19 PM

#

class_report[class_report.marks<50][class_report.grades=='B']

#

the same

ivory stag Nov 23, 2020, 7:21 PM

#

i keep on getting userwarning: boolean series key will be reindexed to match dataframe index lol

remote pond Nov 23, 2020, 7:21 PM

#

Empty DataFrame

#

maybe this is the reason

#

and.... if marks < 50 then should be D

ivory stag Nov 23, 2020, 7:22 PM

#

aah but why is it empty?

remote pond Nov 23, 2020, 7:23 PM

#

class_report[class_report.marks<50][class_report.grades=='D'] try this

ivory stag Nov 23, 2020, 7:25 PM

#

still no

#

aahh is it because something is wrong with my python?

remote pond Nov 23, 2020, 7:27 PM

#

in most cases, no

ivory stag Nov 23, 2020, 7:27 PM

#

if it is the mark alone, i am getting it, and if it is the grade alone without the marks statement, it shows empty dataframe lol

remote pond Nov 23, 2020, 7:30 PM

#

show the output of class_report.dtypes
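
A hedged sketch of how the two conditions can go into one expression, using the four rows shown in the output above. Chaining two `[...]` filters applies the second mask to an already filtered frame, which is what triggers the reindexing warning; combining the masks with `&` avoids that. Whether "names with 'r'" means "contains r" is an assumption here.

```python
import pandas as pd

class_report = pd.DataFrame({
    "name": ["andrew", "jackson", "ritika", "sarah"],
    "marks": [80, 45, 90, 67],
    "grades": ["B", "D", "A", "C"],
})

# == 'r' only matches names that are exactly "r"; the .str accessor checks substrings.
has_r = class_report["name"].str.contains("r")
low_marks = class_report["marks"] < 50

# Combine masks with & (and) or | (or); inline comparisons need their own parentheses.
both = class_report[low_marks & has_r]
either = class_report[low_marks | has_r]
grade_d = class_report[(class_report["marks"] < 50) & (class_report["grades"] == "D")]
```

With this data `both` is empty because the only student under 50 is jackson, whose name has no "r", so an empty result is not necessarily a bug.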

bold rune Nov 23, 2020, 7:37 PM

#

I would like to output a scatter plot given some data I have in two separate numpy column vectors

#

However, when I run my code it doesn't show the plot

#

here's my code for the plot:

def plotdata(true_house_prices, estimates):
    fig, ax = plt.subplots()
    ax.scatter(true_house_prices, estimates)
    ax.set_xlabel('True Price')
    ax.set_ylabel('Estimated Price')
    plt.show()

plotdata(t_train, t_test)

#

what am I doing wrong here?

gentle token Nov 23, 2020, 7:39 PM

#

np.linalg.norm(array) anyone know what this is exactly doing? (numpy)

bold rune Nov 23, 2020, 7:39 PM

#

I'd imagine it's normalizing the given array @gentle token

gentle token Nov 23, 2020, 7:40 PM

#

@bold rune https://i.imgur.com/dUprN1B.png

Imgur

#

so is there a formula for this?

wild pine Nov 23, 2020, 7:40 PM

#

it's been a while since i did anything with pyplot, but does fig.show() instead of plt.show() do the trick?
@bold rune

bold rune Nov 23, 2020, 7:40 PM

#

wild pine it's been a while since i did anything with pyplot, but does `fig.show()` instead...

I'll try that

#

Hmm nope. Still nothing:/

#

@gentle token Formula for normalizing values in a given set?

gentle token Nov 23, 2020, 7:42 PM

#

bold rune @gentle token Formula for normalizing values in a given set?

yes. at first i thought it was just the sum of all the values and divided by the length but i was wrong

#

thought that was "normalizing"

bold rune Nov 23, 2020, 7:43 PM

#

📎 unknown.png

#

This calculation should normalize every value in the array

gentle token Nov 23, 2020, 7:44 PM

#

@bold rune ahh thank you!

bold rune Nov 23, 2020, 7:44 PM

#

No worries:)

remote pond Nov 23, 2020, 9:06 PM

#

so the age values are dummy variables?

proper swift Nov 23, 2020, 9:43 PM

#

its alright, i figured it out

steady elbow Nov 23, 2020, 9:53 PM

#

hey

#

i have a DataFrame and i need to fix index int of a specific value in a row

#

from pandas import DataFrame
from numpy import int8 as np_int8
data_frame = DataFrame(
  [[i+2, i + 4] for i in range(0, 50)],
  columns=["Temperature", "Duty"],
  dtype=np_int8
)

#

i tried finding the answer in the docs. but couldnt find it

red briar Nov 23, 2020, 11:35 PM

#

ivory stag `name marks grades 0 andrew 80 B 1 jackson 45 D 2 ritika...

Did you do groupby?

cobalt jetty Nov 23, 2020, 11:53 PM

#

Hey, @steady elbow
What do you mean by fixing an index int?

#

Throwing your code in JN shows that your columns are well-set and both of type int.

📎 unknown.png

molten hamlet Nov 24, 2020, 12:24 AM

#

Hey, any good book for machine vision? 😄

modest salmon Nov 24, 2020, 1:02 AM

#

can anyone help me with some categorical data stuff in seaborn?

spiral peak Nov 24, 2020, 1:40 AM

#

molten hamlet Hey, any good book for machine vision? 😄

Specifically machine vision or ML/AI in general?

molten hamlet Nov 24, 2020, 1:44 AM

#

more like machine vision algos and image processing stuff then neuralnets @spiral peak

#

but give me for both field 🙂

spiral peak Nov 24, 2020, 1:52 AM

#

hmmmm, so I don't think I have anything machine vision related specifically bookmarked or saved unfortunately. My one professor does recommend this book highly though: https://smile.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow-dp-1492032646/dp/1492032646/ref=dp_ob_title_bk

He says it's pretty well what he covers in his grad class. You can probably take a look at the O'Reilly catalogue for other suggestions.

velvet thorn Nov 24, 2020, 2:02 AM

#

gentle token `np.linalg.norm(array)` anyone know what this is exactly doing? (numpy)

no, that takes the norm

#

which you can think of as a measure of "distance" (it is related to normalisation though)

#

!e import numpy as np; print(np.linalg.norm([3, 4]))

arctic wedgeBOT Nov 24, 2020, 2:03 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

5.0

gentle token Nov 24, 2020, 2:03 AM

#

so its the distance between all those numbers

velvet thorn Nov 24, 2020, 2:04 AM

#

the default case

#

is to treat the array passed to it as a vector representing a point

#

and to calculate the Euclidean norm

#

which is basically (x ** 2 + y ** 2 + ...) ** 0.5
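
The formula above can be checked against `np.linalg.norm` directly. A small sketch with the same `[3, 4]` vector used in the eval:

```python
import numpy as np

v = np.array([3.0, 4.0])

norm_builtin = np.linalg.norm(v)       # default: Euclidean (L2) norm
norm_manual = np.sqrt(np.sum(v ** 2))  # (x ** 2 + y ** 2 + ...) ** 0.5
```

Both give the length of the 3-4-5 triangle's hypotenuse.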

gentle token Nov 24, 2020, 2:04 AM

#

ah i see

velvet thorn Nov 24, 2020, 2:05 AM

#

one form of normalisation is to divide each row by its norm

velvet thorn Nov 24, 2020, 2:05 AM

#

bold rune This calculation should normalize every value in the array

this is specifically min-max normalisation

#

there are other methods
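
To make the distinction concrete, here is a sketch of two of those methods side by side. The input values are made up.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 10.0])

# Min-max normalisation: rescale so the smallest value is 0 and the largest is 1.
min_max = (x - x.min()) / (x.max() - x.min())

# Unit-norm normalisation: divide by the Euclidean norm so the result has length 1.
unit = x / np.linalg.norm(x)
```

Min-max changes the range of the values; dividing by the norm changes the length of the vector.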

somber bane Nov 24, 2020, 2:05 AM

#

Can anyone give me a good article that applies stochastic gradient descent to matrix factorization step by step?

velvet thorn Nov 24, 2020, 2:06 AM

#

velvet thorn and to calculate the Euclidean norm

this can also be thought of as the magnitude of a vector in space

gentle token Nov 24, 2020, 2:06 AM

#

i need to learn more about this 😅

velvet thorn Nov 24, 2020, 2:06 AM

#

in short, the "norm" of a vector is one conceptualisation of its "length"

gentle token Nov 24, 2020, 2:07 AM

#

oh

velvet thorn Nov 24, 2020, 2:07 AM

#

is that too abstract

gentle token Nov 24, 2020, 2:08 AM

#

it makes sense

#

ive heard of this before

#

i had to normalize a vector

#

in the godot game engine

velvet thorn Nov 24, 2020, 2:08 AM

#

yeah, and you need to choose an appropriate norm for that

gentle token Nov 24, 2020, 2:09 AM

#

found a khan academy video on it, time to watch it haha

#

https://www.khanacademy.org/computing/computer-programming/programming-natural-simulations/programming-vectors/a/vector-magnitude-normalization

ivory stag Nov 24, 2020, 3:51 AM

#

red briar Did you do groupby?

nope, but i was able to find the mistake, thank you!!

glad jay Nov 24, 2020, 4:40 AM

#

hey guys, can someone help me implement an add_edges method to my ADT?

#

im a little confused on how to do it, i can paste some code here too

fast steeple Nov 24, 2020, 4:41 AM

#

yup

glad jay Nov 24, 2020, 4:41 AM

#



class Node:

    def __init__(self, data=None, next=None):
        self.data = data
        self.next = next


class LinkedList:

    def __init__(self):
        self.head = None

    def append(self,data):
        if not self.head:
            self.head = Node(data)
            return None
        curr = self.head
        while curr.next:
            curr = curr.next
        curr.next = Node(data)


class Vertex:

    def __init__(self, id):
        self.id = id
        self.neighbors = LinkedList()

    def add_neighbor(self, neighbor, weight=0):
        self.neighbors.append({neighbor: weight})

    def get_connections(self):
        return self.neighbors.keys()

    def get_id(self):
        return self.id

    def __str__(self):
        lst = []
        for x in self.nieghbors:
            lst.append(x.id)
        return "vertex ID: " + str(self.id) + "is neighbor of " + str(lst)


class Graph:

    def __init__(self):
        self.vertList = {}
        self.vertices = 0

    def add_vertex(self, val):
        self.vertices += 1
        newV = Vertex(val)
        self.vertList[val] = newV
        return newV

    def add_edge(self):

#

i think i have class Vertex implemented correctly

#

and some parts of graph implemented correctly
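
One possible shape for `add_edge`, as a rough sketch rather than the assignment's required design. It assumes the graph is undirected and swaps the posted `LinkedList` for a plain dict of neighbours to keep the example short; the names `vert_list` and `neighbor_id` are illustrative.

```python
class Vertex:
    def __init__(self, id):
        self.id = id
        self.neighbors = {}  # neighbor id -> edge weight

    def add_neighbor(self, neighbor_id, weight=0):
        self.neighbors[neighbor_id] = weight

    def get_connections(self):
        return list(self.neighbors)


class Graph:
    def __init__(self):
        self.vert_list = {}

    def add_vertex(self, val):
        # Reuse the vertex if it already exists.
        if val not in self.vert_list:
            self.vert_list[val] = Vertex(val)
        return self.vert_list[val]

    def add_edge(self, a, b, weight=0):
        # Create missing endpoints, then record the edge in both directions
        # because the graph is assumed to be undirected.
        self.add_vertex(a).add_neighbor(b, weight)
        self.add_vertex(b).add_neighbor(a, weight)


g = Graph()
g.add_edge("A", "B", weight=3)
g.add_edge("A", "C")
```

For a directed graph, only the first `add_neighbor` call would be kept.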

fast steeple Nov 24, 2020, 4:42 AM

#

so u want to add a node on a given path or to the deepest end

glad jay Nov 24, 2020, 4:42 AM

#

good question haha

#

could i post some specifications in this chat?

#

i think its given path

#

but i dont want to be wrong and you help me and waste ur time haha

fast steeple Nov 24, 2020, 4:44 AM

#

    def Getvalue(self, config = None, path = ""):
        """
        Recursively Traverses the path and gets the value
        
        :Args:
            config: Dict
                dict to traverse
            
            path: String, List
                Path used to traverse the dict
        """
        if config is None:
            config = self.config_dict
            
        if isinstance(path, str):
            path = path.split("/")
        if len(path) >= 1 and not("" in path):
            index = path.pop(0)
            data = config.get(index)
            if isinstance(data, dict):
                return self.Getvalue(data, path)
            else:
                return data
        else:
            return config

        
    def Setvalue(self, value, config = None, path = ''):
        """
        Recursively Traverses the path and Sets the value
        
        :Args:
            value: Any
                Value to replace or set 
        
            config: Dict
                dict to traverse
            
            path: String, List
                Path used to traverse the dict
        """
        if config is None:
            config = self.config_dict
            
        if isinstance(path, str):
            path = path.split("/")
            
        if len(path) >= 1 and not("" in path):
            index = path.pop(0)
            if not config.get(index):
                config[index] = value
                return None
            
            data = config.get(index)
            if isinstance(data, dict):
                return self.Setvalue(value, data, path)
            else:
                if isinstance(data, list):
                    config[index].append(value)
                else:
                    config[index] = value
        else:
            return None

#

works on trees if u give it path to traverse

glad jay Nov 24, 2020, 4:45 AM

#

sorry, what is this? haha

fast steeple Nov 24, 2020, 4:52 AM

#

it adds and accesses the edges in a tree

heady tide Nov 24, 2020, 10:04 AM

#

Sorry for a non code related question, but I just wanted to ask whether these questions/theorems are applied and used in some cases in DS

#

📎 unknown.png

cobalt jetty Nov 24, 2020, 10:07 AM

#

Those are statistics exercises, specifically on point estimation. They can be used in data science when you do research in econometrics, polling, etc.

livid quartz Nov 24, 2020, 10:10 AM

#

Hey does anyone have any good youtube video / book suggestions for learning linear algebra from scratch?

cobalt jetty Nov 24, 2020, 10:11 AM

#

https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab

YouTube

3Blue1Brown

Vectors, what even are they? | Essence of linear algebra, chapter 1

Home page: https://www.3blue1brown.com/
Kicking off the linear algebra lessons, let's make sure we're all on the same page about how specifically to think about vectors in this context.

Typo correction: At 6:52, the screen shows
[x1, y1] + [x2, y2] = [x1+y1, x2+y2].
Of course, this should actually be
[x1, y1] + [x2, y2] = [x1+x2, y1+y2].

Ful...

▶ Play video

livid quartz Nov 24, 2020, 10:11 AM

#

Thanks will check it out 🙂

lapis sequoia Nov 24, 2020, 11:05 AM

#

!e print("hello world")

arctic wedgeBOT Nov 24, 2020, 11:05 AM

#

You are not allowed to use that command here. Please use the #bot-commands channel instead.

molten hamlet Nov 24, 2020, 11:23 AM

#

spiral peak hmmmm, so I don't think I have anything machine vision related specifically book...

have you read that book ?

lapis sequoia Nov 24, 2020, 11:35 AM

#

livid quartz Hey does anyone have any good youtube video / book suggestions for learning line...

Try this

YouTube

Linear Algebra

Matrices, vectors, vector spaces, transformations. Covers all topics in a first year college linear algebra course. This is an advanced course normally taken...

Machine Learning Mastery

Jason Brownlee

Top Resources for Learning Linear Algebra for Machine Learning

How to Get Help with Linear Algebra for Machine Learning? Linear algebra is a field of mathematics and an important pillar of the field of machine learning. It can be a challenging topic for beginners, or for practitioners who have not looked at the topic in decades. In this post, you will discover how to […]

worthy scarab Nov 24, 2020, 12:52 PM

#

In this plot it takes the 11.169 from somewhere in the code using {:6.3f}, what does this {:6.3f} mean or represent?

📎 unknown.png

steady elbow Nov 24, 2020, 1:19 PM

#

hey i recently asked a question about pandas and my reaction was "what tf was i writing ?" so here is the correctly written question

#

how can i get index of a value in a specific column

#

from pandas import DataFrame
from numpy import int8 as np_int8
data_frame = DataFrame(
  [[i+2, i + 4] for i in range(0, 50)],
  columns=["Temperature", "Duty"],
  dtype=np_int8
)

#

i know how to get a value with a specific index in a row.. data_frame["Temperature"].iloc[0] would return 2

#

but how would i get the index by searching for a value. so if i searched for value 30 i would get 28 back.

chrome barn Nov 24, 2020, 1:27 PM

#

data_frame.index[data_frame['Temperature']==30].to_list()
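
A runnable version of that one-liner against the frame from the question, plus a variant that returns only the first match. The variable names `mask`, `matches` and `first` are illustrative.

```python
import numpy as np
import pandas as pd

data_frame = pd.DataFrame(
    [[i + 2, i + 4] for i in range(50)],
    columns=["Temperature", "Duty"],
    dtype=np.int8,
)

mask = data_frame["Temperature"] == 30
matches = data_frame.index[mask].tolist()      # every index label where the value occurs
first = mask.idxmax() if mask.any() else None  # first matching label, or None if absent
```

The `mask.any()` guard matters because `idxmax` on an all-False mask would still return the first label.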

steady elbow Nov 24, 2020, 1:31 PM

#

nice

#

can you suggest me a tutorial for DataFrame

chrome barn Nov 24, 2020, 1:38 PM

#

https://pandas.pydata.org/docs/getting_started/index.html#getting-started

#

https://pandas.pydata.org/docs/user_guide/index.html#user-guide

torpid cave Nov 24, 2020, 1:46 PM

#

worthy scarab In this plot it takes the 11.169 from somewhere in the code using {:6.3f}, what ...

Can't tell how it gets it

#

But it is the critical value

#

For your hyp-test

chrome barn Nov 24, 2020, 2:05 PM

#

{:6.3f}

is relating to the format of the output of the calculation of 100*prob1,HighX

meaning the output needs to have at least 6 characters with 3 after the decimal point.

Your output 11.169 conforms to this

for more info about it search for: python string format (ting)
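
A quick sketch of what the `6.3f` spec does, using the value from the plot and a second, made-up number that needs padding:

```python
value = 11.169
narrow = 3.14159

a = "{:6.3f}".format(value)   # already 6 characters wide, so no padding
b = "{:6.3f}".format(narrow)  # rounded to 3 decimals, left-padded to width 6
c = f"{narrow:6.3f}"          # same spec inside an f-string
```

The 6 is a minimum total width (digits, decimal point and padding), and the 3 is the number of digits after the decimal point.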

limpid oak Nov 24, 2020, 2:09 PM

#

does anyone have python notebooks for practice, particularly in gis and data science?

cobalt jetty Nov 24, 2020, 2:13 PM

#

Go on github and search for keyword in their searchbar. You'll get many recs. Otherwise !resources

novel field Nov 24, 2020, 2:19 PM

#

Hey guys. I am working with a pandas DataFrame like this and am looking to create a nested dictionary keyed by 'id' with values being dictionaries with keys and values for 'capacity' and 'level' for only the ids where is_customer is True. Anyone know how to do this?

📎 DataFrame.png

umbral sluice Nov 24, 2020, 2:22 PM

#

I have question guys,
I want to make models in tandem

#

📎 image0.png

#

Something like this

#

How to do it?

limpid oak Nov 24, 2020, 2:38 PM

#

@novel field try df.iterrows()

novel field Nov 24, 2020, 2:40 PM

#

Thank you I think that's what I want!

novel field Nov 24, 2020, 4:27 PM

#

@limpid oak do you know how i could iterrows() only where is_customer is True?
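
One way this could be done is to filter first and iterate (or convert) afterwards. The sketch below uses made-up rows; only the column names `id`, `capacity`, `level` and `is_customer` come from the question.

```python
import pandas as pd

# Made-up rows; only the column names come from the question.
df = pd.DataFrame({
    "id": [101, 102, 103],
    "capacity": [50, 80, 120],
    "level": [10, 35, 60],
    "is_customer": [True, False, True],
})

customers = df[df["is_customer"]]  # boolean mask keeps only the True rows

# Nested dict keyed by id, without an explicit loop.
nested = customers.set_index("id")[["capacity", "level"]].to_dict(orient="index")

# The same result with iterrows() over the filtered rows.
nested_loop = {
    row["id"]: {"capacity": row["capacity"], "level": row["level"]}
    for _, row in customers.iterrows()
}
```

The `to_dict(orient="index")` form is usually shorter and avoids row-by-row iteration.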

brisk ravine Nov 24, 2020, 4:54 PM

#

how to get started with machine learning

regal summit Nov 24, 2020, 5:00 PM

#

is machine learning with andrew-ng good to start with machine learning??

tranquil fable Nov 24, 2020, 5:13 PM

#

I read a couple of job advertisements so far, all of them asking for knowledge in python or R. I wonder why none of them said "Julia or Scala would be nice as well". Any1 have an idea why? I thought Julia would be the "next big thing" for data science

sick blaze Nov 24, 2020, 5:14 PM

#

Python and R are much more popular than Julia

#

You might be working on scala or Julia on the job along with python

#

@brisk ravine checkout the pinned msgs

tranquil fable Nov 24, 2020, 5:15 PM

#

I know that, but a phrase like "Cool if you already know Julia" would assure an applicant in my eyes that the advertisers know what they are looking for

sick blaze Nov 24, 2020, 5:17 PM

#

If you think of Golang, it is considered the web server language of the future, yet it is still not very popular compared to other languages like cpp or python or java

#

Same thing goes for Scala and Julia

tranquil fable Nov 24, 2020, 5:18 PM

#

If I remember vaguely, Golang is a lot faster than the typically used languages, which is a main selling point. Is that correct?

sick blaze Nov 24, 2020, 5:19 PM

#

Its selling point is speed and simplicity, correct

#

Imagine cpp and python had a baby

pine burrow Nov 24, 2020, 6:36 PM

#

Anyone know why this errors?



mydb = mysql.connector.connect(
  host="<host>",
  user="<username>",
  password="<my_pass>"
)

print(mydb)
Everything matches up, yet I don't know why it does not connect...

serene scaffold Nov 24, 2020, 7:22 PM

#

@pine burrow looks like a #databases question

pine burrow Nov 24, 2020, 7:53 PM

#

oh, thought I was there lol

steady elbow Nov 24, 2020, 8:31 PM

#

so i have a DataFrame object and i want to update a row

#

import pandas as pd
import numpy as np

df = pd.DataFrame(
    [[i+1, i+4] for i in range(0, 10)],
    columns=["Temperature", "Duty"],
    dtype=np.int16
)
new_df = pd.DataFrame(
    [[2, 50]],
    columns=["Temperature", "Duty"],
    dtype=np.int16
)
df.update(new_df)

#

how would i update x row because this updates the first row to those values

#

📎 unknown.png

#

df.iloc[5] = [10, 80] seems to work

#

is it a good practice ?

livid quartz Nov 24, 2020, 9:26 PM

#

Nothing wrong with using iloc, that's what I always use anyways

hushed sleet Nov 24, 2020, 9:37 PM

#

Has anyone ever coded their own Decision Tree Classifier based on sklearn.tree?
I have problems with this task at uni. We have to write our own decision tree classifier

spiral yew Nov 25, 2020, 1:33 AM

#

not sure if this is the right channel but anyone know why my cuda only works for pytorch but not tf? im on ubuntu 20.04

serene scaffold Nov 25, 2020, 2:02 AM

#

spiral yew not sure if this is the right channel but anyone know why my cuda only works for...

I'm awful at solving environment problems but what error are you getting?

spiral yew Nov 25, 2020, 2:05 AM

#

@serene scaffold its not an error i type python3 in terminal and then check if tensorflow is using the gpu my gpu isnt detected

#

but pytorch works for some reason

serene scaffold Nov 25, 2020, 2:05 AM

#

@spiral yew what are you doing to make sure that pytorch uses cuda? what are you doing to make sure that tf uses cuda?

spiral yew Nov 25, 2020, 2:07 AM

#

@serene scaffold yea lemme pull it up gimme a few min

tall basin Nov 25, 2020, 2:08 AM

#

Hey guys, can you help please? I am seeking free courses of data science I'll do a list and after I'll post this, may help a lot of people, Thank you so much!

high badge Nov 25, 2020, 2:11 AM

#

for sklearn.model_selection: whats the difference between cross_val_score() and cross_val_predict()?

#

i know that both divide the data you pass it into k-folds with i=1 to k iterations (splits), and each iteration it chooses 1 fold as the test set and the rest are training folds

serene scaffold Nov 25, 2020, 2:12 AM

#

high badge for sklearn.model_selection: whats the difference between cross_val_score() ...

those sound like variable names, so it's anyone's guess.

high badge Nov 25, 2020, 2:12 AM

#

oh my bad

serene scaffold Nov 25, 2020, 2:13 AM

#

so those are functions defined in sklearn?

high badge Nov 25, 2020, 2:13 AM

#

yea

spiral yew Nov 25, 2020, 2:14 AM

#

@serene scaffold ok so this is what i ran for pytorch: import torch
print(torch.cuda.is_available())

#

sry im not sure how u add like code blocks

#

wait wtf now it comes true

#

bruhh what

#

lol sry for the pings then, i figured it out by myself thx @serene scaffold

glad jay Nov 25, 2020, 2:27 AM

#

can someone help me test out a ADT code?

#

im just trying to see how it works

#

its an undirected graph

smoky bobcat Nov 25, 2020, 2:32 AM

#

i need help asap, anyone online available?

#

@serene scaffold When I perform Normalization/Standardization (fit_transform) it shows me the transformed values however if after i try to print the data transformed the values are the same

#

📎 WhatsApp_Image_2020-11-25_at_01.06.48.jpeg

#

📎 WhatsApp_Image_2020-11-25_at_01.07.05.jpeg

#

anyone help pls, im stuck here for 1 day or so

velvet thorn Nov 25, 2020, 3:13 AM

#

smoky bobcat anyone help pls, im stuck here for 1 day or so

fit_transform doesn't operate inplace.

velvet thorn Nov 25, 2020, 3:14 AM

#

high badge for sklearn.model_selection: whats the difference between cross_val_score() ...

one gives a score, the other predictions

#

just like model.score vs model.predict
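
A small sketch of that difference, assuming scikit-learn is installed. The iris dataset and logistic regression are stand-ins picked only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)   # one score per fold
preds = cross_val_predict(model, X, y, cv=5)  # one out-of-fold prediction per row
```

`scores` has one entry per fold, while `preds` has one entry per sample, each produced by a model that did not see that sample during training.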

high badge Nov 25, 2020, 3:15 AM

#

excuse me could you come to #help-mango

#

i have a few questions about cross_val

velvet thorn Nov 25, 2020, 3:15 AM

#

steady elbow ``df.iloc[5] = [10, 80]`` seems to work

it does, but I very rarely have a reason to update a DataFrame by position

#

which is what iloc does
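
To illustrate position versus label, here is a sketch with index labels that deliberately differ from row positions. The data is made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"Temperature": [1, 2, 3], "Duty": [4, 5, 6]},
    index=[10, 20, 30],  # labels deliberately differ from positions
)

df.iloc[1] = [7, 70]  # by position: the second row, whatever its label
df.loc[30] = [9, 90]  # by label: the row whose index label is 30

# Update by condition rather than by position.
df.loc[df["Temperature"] == 1, "Duty"] = 40
```

With the default 0..n-1 index the two look identical, which is why the difference is easy to miss.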

quick igloo Nov 25, 2020, 3:51 AM

#

Hello i want to be data scientist where should I start

midnight trench Nov 25, 2020, 4:29 AM

#

so i have 2 numpy array of string

import numpy as np
array1 = np.array(["he","she","ree"])
array2 = np.array(["he", "she", "ree", "me", "aee"])

result = array2-array1 
#i want result to be me and aee and like that i want the result to be components that's only available in array2 and result shouldn't contain any array1 items.
print(result)

velvet thorn Nov 25, 2020, 4:29 AM

#

@midnight trench why are you using numpy arrays for strings?

midnight trench Nov 25, 2020, 4:33 AM

#

Bcz im using a .txt file to import the array/list and i thought numpy is a better way to do it. if there is any way using a normal python list please let me know, i have to manage 300 students' data using this oof.

#

@velvet thorn

velvet thorn Nov 25, 2020, 4:33 AM

#

midnight trench Bcz im using a .txt file to import the array/list and i thought numpy is a better way to...

why wouldn't you just use a list comprehension

#

!e

left = ['a', 'c', 'e', 'g']
right = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

unique_left = set(left)

print([e for e in right if e not in unique_left])

arctic wedgeBOT Nov 25, 2020, 4:34 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

['b', 'd', 'f']
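
If the data does stay in NumPy arrays, the same difference can be sketched with `np.isin` or `np.setdiff1d`, using the arrays from the question:

```python
import numpy as np

array1 = np.array(["he", "she", "ree"])
array2 = np.array(["he", "she", "ree", "me", "aee"])

# Keeps the original order of array2.
only_in_2 = array2[~np.isin(array2, array1)]

# Returns the sorted unique values that are in array2 but not in array1.
only_in_2_sorted = np.setdiff1d(array2, array1)
```

The two differ in ordering and in how duplicates are handled, so which one fits depends on the data.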

midnight trench Nov 25, 2020, 4:38 AM

#

Thx

smoky bobcat Nov 25, 2020, 4:41 AM

#

velvet thorn `fit_transform` doesn't operate inplace.

i dont understand, could you elaborate for noobie words

#

into''

velvet thorn Nov 25, 2020, 4:43 AM

#

smoky bobcat i dont understand, could you elaborate for noobie words

it creates a copy

#

with the changes made

#

it doesn't change the thing you pass into it

smoky bobcat Nov 25, 2020, 4:43 AM

#

so how do i fix it?

velvet thorn Nov 25, 2020, 4:44 AM

#

well

#

do you understand

#

what I just said?

smoky bobcat Nov 25, 2020, 4:44 AM

#

yes but i dont see where i make the copy so dont know how to fix it

velvet thorn Nov 25, 2020, 4:44 AM

#

smoky bobcat yes but i dont see where i make the copy so dont know how to fix it

fit_transform returns a copy.

smoky bobcat Nov 25, 2020, 4:45 AM

#

ok, so should i save it?

#

now that i save it, it doesn't have the table format like before the standardization

velvet thorn Nov 25, 2020, 4:45 AM

#

are you new

#

to Python?

#

and DS?

#

is this your first project

smoky bobcat Nov 25, 2020, 4:46 AM

#

📎 Screenshot_2020-11-25_at_04.46.00.png

#

new to ds not python

velvet thorn Nov 25, 2020, 4:46 AM

#

smoky bobcat now that i save it, it doesn't have the table format like before the standardiza...

check the return value type

#

and the type of your original data

#

and you will understand why

smoky bobcat Nov 25, 2020, 4:46 AM

#

the return type is a numpy array

#

so do you mean that even tho in jupyter the visualization is different from before now the thing is actually the same ?

velvet thorn Nov 25, 2020, 4:49 AM

#

smoky bobcat so do you mean that even tho in jupyter the visualization is different from befo...

well

#

they're different data types

#

but pandas DataFrames are backed by numpy arrays

smoky bobcat Nov 25, 2020, 4:50 AM

#

is there a way to update the pandas with this numpy array?
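
A hedged sketch of one way to get the table format back, assuming scikit-learn's `StandardScaler` is the scaler in use (the screenshots are not visible here) and using made-up data:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up data standing in for the real dataset.
df = pd.DataFrame({"age": [20.0, 30.0, 40.0], "income": [1000.0, 2000.0, 3000.0]})

scaler = StandardScaler()
scaled_array = scaler.fit_transform(df)  # returns a new NumPy array; df is unchanged

# Wrap the array back into a DataFrame, reusing the original labels.
scaled_df = pd.DataFrame(scaled_array, columns=df.columns, index=df.index)
```

Most scikit-learn estimators accept the plain array as well, so wrapping it back is mainly useful for inspection.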

median dove Nov 25, 2020, 4:53 AM

#

Hey guys, when should I scale/standardize my data? I’ve seen a lot of notebooks that scale/standardize/normalize their data but how do I know when should I do any of those?

velvet thorn Nov 25, 2020, 5:33 AM

#

smoky bobcat is there a way to update the pandas with this numpy array?

yes, but why?

velvet thorn Nov 25, 2020, 5:34 AM

#

median dove Hey guys, when should I scale/standardize my data? I’ve seen a lot of notebooks ...

there are things you do that depend on scale.

#

for example, KNN or SVMs

#

because these consider the distance between points (rows)
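
A toy illustration of why distance-based methods care about scale, in plain Python. The points and the "typical spread" values used for rescaling are made up.

```python
import math

# Made-up points: (age in years, income in dollars).
a = (25, 50_000)
b = (60, 50_500)  # very different age, similar income
c = (26, 52_000)  # similar age, somewhat different income

# On the raw scale, income dominates: a looks closer to b than to c.
raw_ab = math.dist(a, b)
raw_ac = math.dist(a, c)

# Rescale each feature by a hypothetical typical spread (10 years, 10,000 dollars).
def rescale(p):
    return (p[0] / 10, p[1] / 10_000)

scaled_ab = math.dist(rescale(a), rescale(b))
scaled_ac = math.dist(rescale(a), rescale(c))
```

After rescaling, the nearest neighbour of `a` flips from `b` to `c`, which is the kind of change that affects KNN or an SVM.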

smoky bobcat Nov 25, 2020, 5:35 AM

#

so as of now I am not understanding anything..

#

i'll go to sleep, rip

glad jay Nov 25, 2020, 5:49 AM

#

anyone have some knowledge on undirected graphs?

zealous holly Nov 25, 2020, 6:49 AM

#

hey everybody

bright spindle Nov 25, 2020, 6:49 AM

#

hello, My program is a simple client server message program, but i cannot figure out the 'send' and 'recv'. How do we know if its client.send or server.send if i want to send from server to client ? the same with recv

zealous holly Nov 25, 2020, 6:49 AM

#

I'm scraping a covid-19 website and need help with xpath selectors

#

https://www.worldometers.info/coronavirus/

Coronavirus Update (Live): 60,108,193 Cases and 1,414,925 Deaths fr...

Live statistics and coronavirus news tracking the number of confirmed cases, recovered patients, tests, and death toll due to the COVID-19 coronavirus from Wuhan, China. Coronavirus counter with new cases, deaths, and number of tests per 1 Million population. Historical data and info. Daily charts, graphs, news and updates

thin quiver Nov 25, 2020, 7:26 AM

#

Hello, i am having a problem, what IDE do you use for data science, Pycharm, Jupyter, Colab, or something else?

#

because i need to pick one and i don't know what to pick

fair wraith Nov 25, 2020, 7:40 AM

#

@thin quiver Atom + Hydrogen

thin quiver Nov 25, 2020, 7:40 AM

#

thank you

#

i will look in to it

gray wigeon Nov 25, 2020, 8:02 AM

#

hi all

#

any idea on how to scrape list of seller data from shopee or lazada?

thin quiver Nov 25, 2020, 8:05 AM

#

idk i found this: https://chrome.google.com/webstore/detail/shopee-data-scraper-produ/abkpijbdlpolpdmjpbbieodlngljodbn?hl=en-GB

Shopee Data Scraper - Product, Sales

Shopee data scraper - easy data extraction tools of product, price, sales rank, review

smoky bobcat Nov 25, 2020, 8:05 AM

#

@velvet thorn how much would you charge me to explain some basic ML content I'm not able to get?

velvet thorn Nov 25, 2020, 8:09 AM

#

smoky bobcat @velvet thorn how much would you charge me to explain some basic ml con...

!rule 5

arctic wedgeBOT Nov 25, 2020, 8:09 AM

#

Rules

5. Do not provide or request help on projects that may break laws, breach terms of services, be considered malicious or inappropriate. Do not help with ongoing exams. Do not provide or request solutions for graded assignments, although general guidance is okay.

velvet thorn Nov 25, 2020, 8:09 AM

#

oh, sorry

#

!rule 6

arctic wedgeBOT Nov 25, 2020, 8:09 AM

#

Rules

6. No spamming or unapproved advertising, including requests for paid work. Open-source projects can be shared with others in #python-general and code reviews can be asked for in a help channel.

velvet thorn Nov 25, 2020, 8:09 AM

#

but really, just ask here

#

I don't do paid stuff any more

smoky bobcat Nov 25, 2020, 8:10 AM

#

@velvet thorn i need to learn how to standardize or normalize a model and then use it

#

i am having trouble with normalization/standardization

#

do you have an example on hand?

velvet thorn Nov 25, 2020, 8:12 AM

#

smoky bobcat do you have an example on hand?

check sklearn documentation

#

they have examples.

#

did you try Google?

velvet thorn Nov 25, 2020, 8:13 AM

#

gray wigeon any idea on how to scrape list of seller data from shopee or lazada?

p sure that's against TOS

smoky bobcat Nov 25, 2020, 8:13 AM

#

i did, but the problem is that after I get that numpy array, I don't know how to proceed with it for the ML part, because I'm used to the pandas dataframe

velvet thorn Nov 25, 2020, 8:13 AM

#

smoky bobcat i did, but problem is after i get that numpy array, and don't know how to procee...

why do you need to work with dataframes?

smoky bobcat Nov 25, 2020, 8:14 AM

#

so i need to predict a disease, i have a dataset, it's imbalanced. I'm trying to balance it but then i get that numpy array and i get stuck

#

and yes ive searched a lot on google and youtube tutorial

velvet thorn Nov 25, 2020, 8:16 AM

#

velvet thorn why do you need to work with dataframes?

doesn't answer this question

smoky bobcat Nov 25, 2020, 8:16 AM

#

they all show how to use the function of the library but not what to do after with the numpy array

smoky bobcat Nov 25, 2020, 8:17 AM

#

velvet thorn doesn't answer this question

hmm because i don't know how to do it otherwise, I have learnt how to convert back a numpy array to a pandas dataframe tho. Point is then I'd have to concatenate columns and it's a mess

velvet thorn Nov 25, 2020, 8:26 AM

#

smoky bobcat hmm because i don't know how to do it otherwise, I have learnt how to convert ba...

...what do you want to do after you perform scaling?

smoky bobcat Nov 25, 2020, 8:27 AM

#

@velvet thorn use Decision tree, logistic regression and different classification models to predict the target

velvet thorn Nov 25, 2020, 8:27 AM

#

you know you can fit those on numpy arrays too, right?

smoky bobcat Nov 25, 2020, 8:27 AM

#

Yes but I’m not sure if it should be done that way, and it becomes really difficult to visualize data like that

velvet thorn Nov 25, 2020, 8:27 AM

#

smoky bobcat Yes but I’m not sure if it should be done that way, and it becomes really diffic...

why do you need to visualise post-scaling?

smoky bobcat Nov 25, 2020, 8:28 AM

#

Also because, I have some features that are not going to be scaled and some that needs to be scaled

velvet thorn Nov 25, 2020, 8:28 AM

#

smoky bobcat Also because, I have some features that are not going to be scaled and some that...

don't use sklearn's transformers

#

complicated for a beginner

#

just do it in pandas

smoky bobcat Nov 25, 2020, 8:28 AM

#

How?

velvet thorn Nov 25, 2020, 8:28 AM

#

what kind of scaling?

smoky bobcat Nov 25, 2020, 8:29 AM

#

Normalization or standardization

velvet thorn Nov 25, 2020, 8:29 AM

#

...

#

which one?

#

🥴

smoky bobcat Nov 25, 2020, 8:29 AM

#

I could do both

velvet thorn Nov 25, 2020, 8:29 AM

#

just choose one.

smoky bobcat Nov 25, 2020, 8:29 AM

#

Normalization

velvet thorn Nov 25, 2020, 8:29 AM

#

okay

#

what does "normalisation" mean to you?

smoky bobcat Nov 25, 2020, 8:29 AM

#

Mean 0 std 1

velvet thorn Nov 25, 2020, 8:31 AM

#

how do you get there?

#

say you have a bunch of numbers

#

how do you normalise them?

smoky bobcat Nov 25, 2020, 8:32 AM

#

Usually I would use MinMaxScaler, make a model, fit it and then transform

velvet thorn Nov 25, 2020, 8:32 AM

#

(by the way, that's standardisation)

#

no.

#

how does the mathematics work?
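
For reference, the arithmetic behind both scalers is short enough to do directly in pandas on just the columns that need it. This is a sketch with a made-up dataframe (the column names `age`, `bmi` and `smoker` are assumptions): standardisation subtracts the mean and divides by the standard deviation, while min-max normalisation maps each column onto [0, 1].

```python
import pandas as pd

# Hypothetical data: two numeric features to scale, one binary flag to leave alone.
df = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "bmi": [18.0, 22.0, 26.0, 30.0],
    "smoker": [0, 1, 0, 1],
})
cols = ["age", "bmi"]  # only these columns get scaled

# Standardisation: (x - mean) / std, giving mean 0 and std 1.
# ddof=0 matches what sklearn's StandardScaler computes; pandas defaults to ddof=1.
standardized = df.copy()
standardized[cols] = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)

# Min-max normalisation: (x - min) / (max - min), giving values in [0, 1].
normalized = df.copy()
normalized[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

print(standardized)
print(normalized)
```

The result is still a dataframe, so the unscaled columns stay where they are. With a train/test split, the mean, std, min and max would normally be computed on the training rows only and then reused for the test rows.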

smoky bobcat Nov 25, 2020, 8:32 AM

#

Ok sus then same thing for standard scaler

smoky bobcat Nov 25, 2020, 8:32 AM

#

velvet thorn how does the mathematics work?

I’m not required to learn the math behind.

velvet thorn Nov 25, 2020, 8:32 AM

#

well

#

and you don't want to?

#

okay then

#

to each their own

smoky bobcat Nov 25, 2020, 8:35 AM

#

I’m sorry for being like this 😪

velvet thorn Nov 25, 2020, 8:36 AM

#

why are you sorry?

#

not everyone likes learning for the sake of learning

#

and you don't owe me anything TBH

#

anyways

#

hope you figure it out! 👋

#

can you elaborate

#

hm

#

for production?

#

or experimentation

gray wigeon Nov 25, 2020, 9:21 AM

#

oh hey i had this kind of distribution too lol

gray wigeon Nov 25, 2020, 9:21 AM

#

velvet thorn p sure that's against TOS

totally lol. that's why i didn't elaborate much

gray wigeon Nov 25, 2020, 9:22 AM

#

thin quiver idk i found this: https://chrome.google.com/webstore/detail/shopee-data-scraper-...

saw this too but this is more product data than seller data

thin quiver Nov 25, 2020, 9:22 AM

#

o, ok

gray wigeon Nov 25, 2020, 9:24 AM

#

any advice on making a regression analysis on a zero inflated dataset?

winter mural Nov 25, 2020, 10:15 AM

#

Hi, I'm planning to learn python data science. where should i start?. I know numpy basics

thin quiver Nov 25, 2020, 11:56 AM

#

@winter mural here is a good course: https://youtube.com/watch?v=tPYj3fFJGjk, I would start with Google Colab or Jupyter Notebook

YouTube

freeCodeCamp.org

TensorFlow 2.0 Complete Course - Python Neural Networks for Beginne...

Learn how to use TensorFlow 2.0 in this full tutorial course for beginners. This course is designed for Python programmers looking to enhance their knowledge and skills in machine learning and artificial intelligence.

Throughout the 8 modules in this course you will learn about fundamental concepts and methods in ML & AI like core learning alg...

▶ Play video

rigid aurora Nov 25, 2020, 12:40 PM

#

📎 unknown.png

#

worth a shot.. dunno who checks this channel

#

i need help plotting two plots side by side, adding dots on them, then connecting the dots

#

assume this is doable.. just cant find anything on tack

#

STACK*

lapis sequoia Nov 25, 2020, 12:48 PM

#

You could plot a function between the points, so the function connects them. Never tried this, but it should work. But I think it's more work than necessary

#

Ah nevermind, subplots. I try something

rigid aurora Nov 25, 2020, 12:49 PM

#

oooo

#

it doesn't have to be subplots! i just couldn't figure out how to plot 2 plots in 1 figure side by side

#

if i put two in one, theyll stack on top of each other

lapis sequoia Nov 25, 2020, 12:52 PM

#

rigid aurora if i put two in one, theyll stack on top of each other

And you don't want them to stack on top of each other? What do you want to show with this? Some learning effects? Or different ML models?

rigid aurora Nov 25, 2020, 12:52 PM

#

nope! just for visual

#

because im writing up a paper right now

#

need to show result

lapis sequoia Nov 25, 2020, 12:57 PM

#

Okay, what if you try three subplots: the first and second each show one function, and the third shows both stacked together? That way you can present your results in two ways. You can refer to each function on its own, and you can refer to the third plot to show their differences and similarities.

rigid aurora Nov 25, 2020, 1:03 PM

#

yes!

#

i can do this

#

buttt

#

i am doing a continuation of this paper

#

📎 unknown.png

#

who already showed their experiment result like this

#

my work is an improvement to this paper, and i would like to compare them

#

i have an idea...

lapis sequoia Nov 25, 2020, 1:06 PM

#

Ahhh okay, so if you don't need concrete values on the x and y axes, you could add an x offset to one of your functions, so when you plot them together they don't stack on top of each other

rigid aurora Nov 25, 2020, 1:06 PM

#

maybe i can shift the x axis + a value

#

yes

#

like what u said haha

lapis sequoia Nov 25, 2020, 1:06 PM

#

Yeah fixed it :d

rigid aurora Nov 25, 2020, 1:06 PM

#

are you a data scientist sir?

lapis sequoia Nov 25, 2020, 1:07 PM

#

Master student Data Science, so close to that

#

Does anyone know why this code outputs the y shape as (3, 4, 3) instead of (3, 4, 4)?

input_shape = (3, 6, 6)
x = tf.random.normal(input_shape)
y = tf.keras.layers.Conv1D(3, 3, input_shape=input_shape[1:])(x)
print(y.shape)

rigid aurora Nov 25, 2020, 1:09 PM

#

same here poopaye

#

mater student

#

well, hopefully to graduate this sem

#

i wonder who will hrie me

#

hire

lapis sequoia Nov 25, 2020, 1:10 PM

#

let's say the input_shape represents an image of size 6x6x3 ( with 3 channels ).. why when I convolve it with a filter 3x3x3 I don't actually get an output of 4x4?

rigid aurora Nov 25, 2020, 1:11 PM

#

conv1d?
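
A rough sketch of the shape arithmetic in plain Python, with no TensorFlow needed (the helper names are made up for illustration). With the Keras defaults (`padding="valid"`, channels-last), `Conv1D` reads a `(3, 6, 6)` tensor as 3 samples, 6 steps and 6 channels. It slides only along the steps axis, and the last output dimension is the number of filters, hence `(3, 4, 3)`. A 4x4 spatial output needs a 2D convolution over a `(batch, 6, 6, 3)` input.

```python
def conv1d_output_shape(input_shape, filters, kernel_size, stride=1):
    """Output shape for 'valid' padding on a (batch, steps, channels) input."""
    batch, steps, _channels = input_shape
    out_steps = (steps - kernel_size) // stride + 1
    return (batch, out_steps, filters)


def conv2d_output_shape(input_shape, filters, kernel_size, stride=1):
    """Output shape for 'valid' padding on a (batch, height, width, channels) input."""
    batch, height, width, _channels = input_shape
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    return (batch, out_h, out_w, filters)


# The snippet from the question: 3 filters, kernel size 3, input (3, 6, 6).
print(conv1d_output_shape((3, 6, 6), filters=3, kernel_size=3))     # (3, 4, 3)

# One 6x6 image with 3 channels, one 3x3 filter spanning all channels.
print(conv2d_output_shape((1, 6, 6, 3), filters=1, kernel_size=3))  # (1, 4, 4, 1)
```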

sand crane Nov 25, 2020, 1:45 PM

#

Hi guys, I am looking for some help. I am trying to sort my dataframe by finding the elements that have the most occurrences and then extracting those (I've done this with value_counts()). I also want to get the number of occurrences for each unique element (I have also done this with value_counts().values). My issue is that I want the unique elements in one column of my dataframe and their counts in another column, but value_counts() doesn't seem to allow this, as it returns each element along with its count. You can see in my dataframe that the aircraft column holds both the element string and the count (using value_counts()), and the counts column holds just the count (using value_counts().values). If I use unique() to get each unique element, the order will be wrong and the aircraft won't match the correct count. How can I keep these two in the same order?

#

https://i.imgur.com/TuBgLHp.png

Imgur

cyan matrix Nov 25, 2020, 2:30 PM

#

Having a difficult problem. Have a pandas dataframe with ~200 rows/columns. Columns are variables of what services are provided by recovery centers. Each center offers some, but not all of the services available (for example, the first row only offers 78 of the ~200 variables). Is there a good way to match my value to the column label so that the 78 of the 200 variables are in their appropriate columns, and the services they do not provide are empty?

untold cedar Nov 25, 2020, 3:00 PM

#

Hello, i am having trouble with matplotlib. My `matplotlib.pyplot.show()` seems to do this

📎 unknown.png

odd tendon Nov 25, 2020, 4:14 PM

#

If somebody could help me I'd be really grateful. I can't get PyCharm SciView to plot anything using geoplot, it simply doesn't show in the tab at all.

lapis sequoia Nov 25, 2020, 5:39 PM

#

untold cedar Hello, i am having trouble with matplotlib. My '''matplotlib.pyplot.show()''' se...

The error type says you are trying to plot a list. So try a solution from this thread:

https://stackoverflow.com/questions/40073322/plotting-list-of-lists-in-a-same-graph-in-python

Stack Overflow

Plotting list of lists in a same graph in Python

I am trying to plot (x,y) where as y = [[1,2,3],[4,5,6],[7,8,9]].

Say, len(x) = len(y[1]) = len(y[2])..
The length of the y is decided by the User input. I want to plot multiple plots of y in the...

untold cedar Nov 25, 2020, 5:49 PM

#

lapis sequoia The Error Type say you try to plot a list. So try a solution from this thread: ...

im trying to calculate my points and have the live graph in the same program. is this not possible? it seems to halt each time it shows the graph

lapis sequoia Nov 25, 2020, 5:52 PM

#

untold cedar im trying to calculate my points and have the live graph in the same program. is...

So you're trying to plot live data on a live graph?

untold cedar Nov 25, 2020, 5:52 PM

#

yes

lapis sequoia Nov 25, 2020, 5:53 PM

#

It is possible, i saw a tutorial once. W8

#

https://pythonprogramming.net/live-graphs-matplotlib-tutorial/

Python Programming Tutorials

Python Programming tutorials from beginner to advanced on a massive variety of topics. All video and text tutorials are free.

#

This will help you

untold cedar Nov 25, 2020, 5:54 PM

#

yep ive seen that

#

the tutorials all seem to have their "live" data passed through a csv within a function

#

but mine is generated in a loop within the program, and i cant seem to continue the iteration once ive called the FuncAnimation

lapis sequoia Nov 25, 2020, 5:55 PM

#

Hmm, I need to see the code, but normally it is possible to do it this way.

#

Comment the code that doesn't work and post it in one of the available help channels. More people will read it there and can help you with the code.

untold cedar Nov 25, 2020, 5:57 PM

#

aight will do ty

cedar sun Nov 25, 2020, 6:04 PM

#

which is a good lib for AI in python? ive used keras, but i heard pytorch is good too

livid cradle Nov 25, 2020, 6:05 PM

#

any idea of what project i could do

#

i just finished a very big project and don't know what to do

lapis sequoia Nov 25, 2020, 6:06 PM

#

Depends on your skills. RNN or LSTM is a good one.

cedar sun Nov 25, 2020, 6:06 PM

#

who are u replying to

#

this guy is just spamming the same question on all channels

#

he already got a reply

livid cradle Nov 25, 2020, 6:07 PM

#

I am good at functional programming could u tell me any project i could work in

lapis sequoia Nov 25, 2020, 6:07 PM

#

cedar sun which is a good lib for AI in python? ive used keras, but i heard pytorch is goo...

Use Keras or TensorFlow. PyTorch is good too, but not my favorite

lapis sequoia Nov 25, 2020, 6:07 PM

#

livid cradle I am good at functional programming could u tell me any project i could work in

Prolog Text Analysis

livid cradle Nov 25, 2020, 6:07 PM

#

??

lapis sequoia Nov 25, 2020, 6:08 PM

#

livid cradle ??

If you like functional programming, try some text analysis

livid cradle Nov 25, 2020, 6:09 PM

#

should i try making a translator from javascript to python?

lapis sequoia Nov 25, 2020, 6:10 PM

#

Oh man, functional programming without a functional language... Try some Lisp if you like functional programming, or SWI-Prolog for logic programming

cedar sun Nov 25, 2020, 6:16 PM

#

lapis sequoia Use Keras or TensorFlow. PyTorch is good too, but not my favorite

isnt tensorflow used on keras?

lapis sequoia Nov 25, 2020, 6:16 PM

#

cedar sun isnt tensorflow used on keras?

https://intellipaat.com/blog/keras-vs-tensorflow-difference/

Intellipaat Blog

Keras vs Tensorflow - Which one should you learn?

In this blog on Keras vs Tensorflow, you will learn about the key differences between Keras and Tensorflow which will provide you a better understanding of what you should learn.

#

You can use some of the APIs, yes

bronze barn Nov 25, 2020, 6:33 PM

#

Does anyone have any good links to understand how to analyse clusters?

hoary jungle Nov 25, 2020, 6:37 PM

#

Hello, anyone can help me to find a json dataset with 1 million+ entries? I am doing a college job where I have to turn that into a relational model

ancient venture Nov 25, 2020, 6:40 PM

#

instrument_response=np.loadtxt(r'C:/Users/Sidharth/Documents/Computing Labs/Project 1/Distance_Mpc.csv', delimiter=',', skiprows=1, usecols=(2)) 
print(instrument_response)
if (instrument_response==1):
#my function is here#

#

Hi guys, I've got a dataset and I've already defined a function

#

I want to make it so that my function is only applied to the data which corresponds to a "instrument response" = 1

#

📎 unknown.png

#

When I run my code, I get this error

#

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

#

any help would be appreciated
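
The error comes from `if (instrument_response==1):`. The comparison yields one boolean per row, and `if` needs a single truth value. A common approach is to use that boolean array as a mask instead. The sketch below uses small made-up arrays in place of the CSV columns, and `my_function` is a hypothetical stand-in for the function mentioned above.

```python
import numpy as np

# Made-up stand-ins for two columns loaded with np.loadtxt.
instrument_response = np.array([1, 0, 1, 1, 0])
distance = np.array([10.0, 20.0, 30.0, 40.0, 50.0])


def my_function(values):
    """Hypothetical placeholder for the real function."""
    return values * 2.0


# Elementwise comparison gives a boolean array, one entry per row.
mask = instrument_response == 1

# Indexing with the mask keeps only the rows where the response is 1.
result = my_function(distance[mask])
print(result)  # [20. 60. 80.]
```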

wintry olive Nov 25, 2020, 6:43 PM

#

I was just researching fine tuning datasets https://huggingface.co/docs/datasets/

cedar sun Nov 25, 2020, 6:44 PM

#

📎 xarvix_graph_cut.png

#

@lapis sequoia look

#

ah nvm i confused pytorch with tensorflow xd almost the same color

wintry olive Nov 25, 2020, 6:58 PM

#

I understand that the $$$ is in the datasets. Not just top-level accumulation of a huge library of datasets. It's more about fine-tuned, custom-built, curated datasets with smart caching. Still over my head tho

cyan matrix Nov 25, 2020, 7:19 PM

#

tried earlier but wanna see if anyone can help here, attempting to extract/clean data set in python for use in R. running into a problem as below

I have a pandas dataframe with approx 200 rows/columns. The rows contain incomplete data, where not every column will be filled (for example the first row has 78 of the 200 possible variables). I have a list of the 78 variables that row 1 has, and I am trying to transpose that list into the pandas dataframe, where each variable is placed into its respective matching column.

For example:

```
   Name     Location   Service A  Service B  Service C  Service D
1  Alpha    Alabama       NaN       NaN       NaN       NaN
2  Beta     Alaska        NaN       NaN       NaN       NaN
3  Gamma    Arizona       NaN       NaN       NaN       NaN
4  Delta    Arkansas      NaN       NaN       NaN       NaN
```

I plan to use a `for` loop to draw out all the services so that I will eventually have a list that would be something such as ['Service B', 'Service D'].

After the first iteration, I would hope it would look something like this:

```
   Name     Location   Service A  Service B  Service C  Service D
1  Alpha    Alabama       NaN     Service B   NaN       Service D
2  Beta     Alaska        NaN       NaN       NaN       NaN
3  Gamma    Arizona       NaN       NaN       NaN       NaN
4  Delta    Arkansas      NaN       NaN       NaN       NaN
```
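
One possible way to do this, sketched on the four-row example above. The `offered` dictionary is a hypothetical stand-in for whatever the `for` loop extracts per row. Each service name is written into the column with the same label, and everything else stays NaN.

```python
import numpy as np
import pandas as pd

service_cols = ["Service A", "Service B", "Service C", "Service D"]

df = pd.DataFrame(
    {
        "Name": ["Alpha", "Beta", "Gamma", "Delta"],
        "Location": ["Alabama", "Alaska", "Arizona", "Arkansas"],
    },
    index=[1, 2, 3, 4],
)

# Hypothetical result of the extraction loop: row label -> services offered.
offered = {1: ["Service B", "Service D"]}

# Start with an all-NaN block of object dtype so strings can be stored in it.
services = pd.DataFrame(np.nan, index=df.index, columns=service_cols, dtype=object)

for row_label, row_services in offered.items():
    for service in row_services:
        # The column label and the stored value are the same string.
        services.loc[row_label, service] = service

df = df.join(services)
print(df)
```

If the end goal is analysis in R, a 0/1 indicator per service may be easier to work with than repeating the service name, but that is a design choice rather than a requirement.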

steady elbow Nov 25, 2020, 7:49 PM

#

is there a way to get iloc if i know loc of a row in pandas.DataFrame ?

#

i guess i can do it with a boolean mask --> converting to a list --> indexing
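
A shorter route than the boolean detour is `Index.get_loc`, which maps a label to its integer position. The sketch below uses a made-up dataframe with string labels.

```python
import pandas as pd

# Hypothetical frame with a labelled index.
df = pd.DataFrame({"price": [10, 20, 30]}, index=["a", "b", "c"])

# Label -> integer position.
pos = df.index.get_loc("b")
print(pos)  # 1

# Both lookups now return the same row.
same_row = df.iloc[pos].equals(df.loc["b"])
print(same_row)  # True
```

This returns a plain integer only when the label is unique. With duplicate labels, `get_loc` returns a slice or a boolean mask instead.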

whole vortex Nov 25, 2020, 9:07 PM

#

How do I convert an image from grayscale to colour

#

It was originally colour then I converted it to grayscale using opencv... Then applied DCT compression then decompression. Now I want to put colour back into it

ripe forge Nov 25, 2020, 9:31 PM

#

Uh. You took away information from the image. That information doesn't exist in the image anymore.

#

You can turn the greyscale image into a 3-channel greyscale image, but you'll need the original image or its information to put the colours back.
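
A small sketch of the "3-channel greyscale" idea using only NumPy, with a made-up 2x2 image. The single channel is copied three times, so the array has the shape of a colour image but still contains no colour. In OpenCV, `cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)` performs the same kind of channel replication.

```python
import numpy as np

# Hypothetical 2x2 greyscale image, one 8-bit value per pixel.
gray = np.array([[0, 128],
                 [255, 64]], dtype=np.uint8)

# Add a channel axis and repeat it three times: shape (2, 2) -> (2, 2, 3).
gray3 = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

print(gray3.shape)  # (2, 2, 3)
# Every channel is identical, so the picture is still grey.
print(np.array_equal(gray3[:, :, 0], gray3[:, :, 2]))  # True
```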

heady tide Nov 25, 2020, 9:54 PM

#

More reasons to use data table

📎 unknown.png

#

R benchmarks

#

Also, fun fact, data.table in R is faster than pandas in Python, I did not know that until today

hollow scarab Nov 25, 2020, 10:01 PM

#

so I have an issue when I use df.groupby

#

I used it on a test file on my pc and it worked

#

but it just doesn't want to group the data on the work pc

#

Like I save it to excel at the end, and the top row and first column are index numbers

#

and the data is a mess, it doesnt group up anything

steady elbow Nov 25, 2020, 10:44 PM

#

@heady tide R is made for things like that

#

python is not

heady tide Nov 25, 2020, 10:48 PM

#

Wrong, Pandas is much faster than R at everything else.

molten hamlet Nov 25, 2020, 11:11 PM

#

is there something that I could use to find closest point in 3 dimensions?

#

without iterating over all points?
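
One common option is a k-d tree, for example `scipy.spatial.cKDTree`, which is built once and then answers nearest-neighbour queries without scanning every point. This is a minimal sketch with made-up coordinates.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical 3D point cloud, one row per point.
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [5.0, 5.0, 5.0],
    [2.0, 0.0, 1.0],
])

# Build the tree once; it can be reused for many queries.
tree = cKDTree(points)

# query returns the distance to the nearest point and its row index.
dist, idx = tree.query([0.9, 1.2, 1.0])
print(idx)          # 1
print(points[idx])  # [1. 1. 1.]
```

For a handful of points a brute-force `np.linalg.norm(points - target, axis=1).argmin()` is simpler. The tree pays off when there are many points or many queries.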

torpid cave Nov 25, 2020, 11:44 PM

#

@heady tide https://datascience.stackexchange.com/questions/24052/is-pandas-now-faster-than-data-table

Data Science Stack Exchange

Is pandas now faster than data.table?

https://github.com/Rdatatable/data.table/wiki/Benchmarks-%3A-Grouping

The data.table benchmarks hasn't been updated since 2014. I heard somewhere that Pandas is now faster than data.table. Is this...

#

I love R... but Python is just used everywhere

heady tide Nov 25, 2020, 11:44 PM

#

at everything else

torpid cave Nov 25, 2020, 11:44 PM

#

Yeah

#

I will always defend R but Python is good for doing almost everything

#

I mean, I used to do web scraping with R, once I started using Python it just changed my mind

#

And results are mixed. I think once you start using it, you start working with dictionaries, tuples and matrices

#

Instead of working with dataframes

#

Python wins

heady tide Nov 25, 2020, 11:50 PM

#

See the thing is, R is made by statisticians for statisticians

high badge Nov 25, 2020, 11:50 PM

#

i have a question about performance measures on classifiers. Are the precision, recall, and f1 scores only for binary classifiers?
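
These metrics are not limited to binary problems. A usual approach is to compute them per class (treating each class as "positive" in turn) and then average. The sketch below does this by hand with NumPy on made-up labels; the helper name is hypothetical. scikit-learn exposes the same idea through the `average` argument of functions such as `f1_score`.

```python
import numpy as np

# Made-up labels for a 3-class problem.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)}, one-vs-rest for each class."""
    scores = {}
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[int(label)] = (float(precision), float(recall), float(f1))
    return scores


scores = per_class_scores(y_true, y_pred)
# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1 in scores.values()) / len(scores)

print(scores)
print(round(macro_f1, 4))  # 0.6556
```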

heady tide Nov 25, 2020, 11:52 PM

#

The thing I don't like about Data Science is that python made it super easy for people to get into the field without much background in mathematics

torpid cave Nov 25, 2020, 11:53 PM

#

Don't they take math courses in undergrad/postgrad DS programs?

tidal violet Nov 25, 2020, 11:54 PM

#

whats the most efficient way to extract the adjacent neighbors of an element in a 2d Numpy array?
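
A possible sketch using slicing. The function name is made up, and "adjacent" is assumed to include diagonals. The slice is clipped at the array edges, so border elements simply get fewer neighbours.

```python
import numpy as np


def neighbors(arr, row, col):
    """Return the values around arr[row, col], diagonals included."""
    r0, r1 = max(row - 1, 0), min(row + 2, arr.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, arr.shape[1])
    block = arr[r0:r1, c0:c1]

    # Mask out the centre element itself.
    mask = np.ones(block.shape, dtype=bool)
    mask[row - r0, col - c0] = False
    return block[mask]


grid = np.arange(9).reshape(3, 3)
print(neighbors(grid, 1, 1))  # [0 1 2 3 5 6 7 8]
print(neighbors(grid, 0, 0))  # [1 3 4]
```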

torpid cave Nov 25, 2020, 11:54 PM

#

I understand your concern. I studied DS after finishing engineering and economics degrees which were math focused... my intention was to learn the tools to make my life easier. When I started doing the modules (in DataCamp) I was surprised how they introduced time-series algos without explaining what was going on

tidal violet Nov 25, 2020, 11:55 PM

#

wrong channel sorry

heady tide Nov 25, 2020, 11:56 PM

#

I am rethinking my choices of majoring in DS

#

@torpid cave when did you finish university ? Was the field as saturated as it is now back then ?

torpid cave Nov 25, 2020, 11:57 PM

#

Australia

#

I finished my MsC last year

#

I feel that in Australia the demand is huge

#

But for people with business experience

#

I get at least 1 message from recruiters every week

#

on LinkedIn

#

But it depends on the niche you are doing

heady tide Nov 25, 2020, 11:59 PM

#

yeah from another side it is inevitable that the amount of data and the use of it will only grow in the following years

torpid cave Nov 25, 2020, 11:59 PM

#

Yeah and most companies rely on excel

#

For 90% of their tasks

#

Once they start shifting to more structured ways of working the demand will keep on increasing

#

What people must understand is that DS is more than doing Neural Networks and Computer Vision

heady tide Nov 26, 2020, 12:01 AM

#

yeah, heck most of the time the hardest part is preprocessing

torpid cave Nov 26, 2020, 12:01 AM

#

yes

#

I haven't applied an ML algorithm in the past 8 months I think

#

We are currently implementing the data pipelines to get the data for the algorithms

heady tide Nov 26, 2020, 12:01 AM

#

Thats super normal imo

torpid cave Nov 26, 2020, 12:02 AM

#

And most firms are in a less developed stage

wintry olive Nov 26, 2020, 12:35 AM

#

Id like to custom build highly experimental abstract adaptive neural structures for the hugging face community platform -- which uses an Apache Arrow Table storing format.

#

The neural structures I have in mind are adaptive monster sized "structures". It's a beast of several preprocessing layers deep so it might require keras API?

#

the adapt() method could take a tf.data.Dataset object but some of the layers are trainable and will use federated learning and auto update and evolve like a scripted out VM...

wintry olive Nov 26, 2020, 2:27 AM

#

so I'm researching data structures and looking into all platforms. I looked into pandas. It's 100% database-like, but I really like the ideas of melt(), pivot(), pivot_table() and the tools to manipulate textual data, especially corpus or virtual corpus data or general data of the dataset. Technically I'm considering a custom class of which datasets is a part.

#

yoo PanelOLS might be what I am looking for.

whole sage Nov 26, 2020, 2:42 AM

#

Hi. Has somebody deployed a keras model using flask?

median dove Nov 26, 2020, 3:12 AM

#

Hey, anybody knows if in example #1 he is only keeping or dropping the correlated features? It is not very clear

#

https://towardsdatascience.com/feature-selection-with-pandas-e3690ad8504b

Medium

Feature Selection with sklearn and Pandas

Introduction to Feature Selection methods and their implementation in Python

twin mantle Nov 26, 2020, 5:07 AM

#

Could somebody explain to me how to plot an aggregated pandas Series in seaborn?

wintry olive Nov 26, 2020, 5:08 AM

#

PanelOLS code is heavy professional. It was nice to see such a high level of math and code mastery expressed in a professional capacity rather than presented as a tutorial or learning experience. Amazing what you can do when you know how to write your own scripts and establish well-defined classes. I will focus on pandas for organizing the data, writing scripts, and establishing well-defined classes just to prototype this idea. No sense in creating some hybrid or hack data structure, although my idea is highly experimental.

twin mantle Nov 26, 2020, 5:10 AM

#

Please?

wintry olive Nov 26, 2020, 5:12 AM

#

https://medium.com/@emredjan/emulating-r-regression-plots-in-python-43741952c034 and diagnostic plots for residuals https://robert-alvarez.github.io/2018-06-04-diagnostic_plots/

Medium

Emulating R regression plots in Python

(Originally posted on my personal blog)

Creating Diagnostic Plots in Python

and how to interpret them

#

idk I saw pandas as a tool for organizing data in datasets and, with textual manipulation, a tool for optimizing merging or origami-morphing the graph orientation.

torpid cave Nov 26, 2020, 6:18 AM

#

@glad mulch mind sharing your fitted equation?

midnight trench Nov 26, 2020, 6:34 AM

#

I have quite a stupid question for an intermediate data scientist, but how does code really work? What does the processor do when it reads bytecode, and how does it understand it? Was the processor itself coded? How does a machine learn?

twilit pilot Nov 26, 2020, 6:55 AM

#

If I have a .csv file like this...
```
Symbol, Price
AMZN, 3000
AAPL, 318
ABC, 26
AMZN, 4000
AAPL, 278
ABC, 32
```

And I load the .csv file like this...
```py
import pandas as pd

df = pd.read_csv('my_file.csv', delimiter=',')
```

How do I filter the data to only get the information about the AAPL stock?
Thanks a lot in advance 🙏 And please ping me

torpid cave Nov 26, 2020, 6:57 AM

#

Filtering once you have the data:

df_appl = df[df['Symbol'] == 'AAPL']

#

@twilit pilot

twilit pilot Nov 26, 2020, 7:12 AM

#

@torpid cave Thanks a lot 🙏 😁

azure locust Nov 26, 2020, 8:17 AM

#

Hi, Does anyone know how the two tasks i.e, (Masked Language Modelling (MLM) and Next Sentence Prediction (NSP) ) in Google's BERT are being trained?
Whether it is trained 50% for MLM and 50% for NSP or in a cyclic manner or first MLM will get trained and then NSP will get trained?

stable mortar Nov 26, 2020, 8:21 AM

#

📎 received_192885882409829.jpg

#

one of my teamates wrote this shit

lunar bough Nov 26, 2020, 8:50 AM

#

bruh

lapis sequoia Nov 26, 2020, 10:04 AM

#

midnight trench I have a quite a stupid question being a intermediate data scientist but, how re...

Oh man, unexpected 🙃

You can learn from some good textbooks, like Introduction to Machine Learning with Python by Müller & Guido, or use some websites about computational intelligence.

For the start please read this and try to study a bit, before you start asking something like this 🙃

https://towardsdatascience.com/how-do-machines-learn-561181ed209a

Medium

How Do Machines Learn?

Machine Learning Concept Explained for the Non Computer Science People

sand crane Nov 26, 2020, 11:01 AM

#

Ok, I solved this myself. The index was the aircraft string and the count of its occurrences was the values. I simply returned index to my aircraft column (value_counts().index) and then returned values to my counts column (value_counts().values). Now I have the occurrences of each unique aircraft (its name and count) in two separate columns

#

https://i.imgur.com/fCyw00Y.png

Imgur
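
The solution described above can be sketched like this, with made-up aircraft names. Because the names and the counts both come from the same `value_counts()` result, they stay aligned.

```python
import pandas as pd

# Hypothetical column of aircraft types with repeats.
df = pd.DataFrame(
    {"aircraft": ["A320", "B737", "A320", "A380", "B737", "A320"]}
)

# value_counts() returns a Series: index = unique values, values = counts,
# sorted from most to least frequent.
counts = df["aircraft"].value_counts()

# Pull the index and the values into two separate columns.
summary = pd.DataFrame({"aircraft": counts.index, "counts": counts.values})
print(summary)

# Same result in one chain.
summary_alt = counts.rename_axis("aircraft").reset_index(name="counts")
```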

midnight trench Nov 26, 2020, 11:03 AM

#

stable mortar one of my teamates wrote this shit

what's so bad about it?

whole vortex Nov 26, 2020, 11:06 AM

#

How do I convert grayscale image back to colour via opencv

halcyon vale Nov 26, 2020, 12:52 PM

#

https://www.linkedin.com/posts/thinam-tamang-3b12831a2_300daysofdata-machinelearning-dl-activity-6737702277294899200-MdZ_

Thinam Tamang on LinkedIn: #300DaysOfData #machinelearning #DL

Day 17 of #300DaysOfData!

Elastic Net :
Elastic Net is a middle ground between Ridge Regression and Lasso Regression. The regularization term "r" is ...

desert oar Nov 26, 2020, 1:18 PM

#

@glad mulch slanted residuals like that usually mean your data is strictly bounded but your model isn't respecting those bounds

#

You should consider a different model, maybe transform the target to [0,1] and use logistic regression

tall basin Nov 26, 2020, 1:20 PM

#

tall basin Hey guys, can you help please? I am seeking free courses of data science I'll do...

.

#

Thanks

fierce shadow Nov 26, 2020, 1:51 PM

#

@whole vortex you can't really convert a grayscale image back to an RGB image, since grayscale contains less information than RGB. However, there are neural networks trained to colorize grayscale images.

median dove Nov 26, 2020, 1:59 PM

#

Hi guys. In a Linear Regression model, is it better practice to keep or drop the variables that are correlated with the target feature?

twin mantle Nov 26, 2020, 2:11 PM

#

Hello

#

How can I map a value in a dictionary of dictionaries to a dataframe?

lapis sequoia Nov 26, 2020, 2:13 PM

#

Can anyone point me to good online machine learning resources and tutorials?

#

I mean OML, not online courses

#

Trying to learn how to update a pickled trained model with new training data

#

but can't find resources anywhere

desert oar Nov 26, 2020, 2:22 PM

#

Pickle can't really be modified in place

#

You have to unpickle, train, and pickle again

stable mortar Nov 26, 2020, 2:24 PM

#

midnight trench what's so about it?

You saw nothing weird about that for loop? Who even passes the i to the function lmao.

hollow gull Nov 26, 2020, 3:52 PM

#

lapis sequoia Trying to learn how to update a pickled trained model with new training data

I don't have resources, but some models have a warm start argument that you can pass to continue training.

https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

lapis sequoia Nov 26, 2020, 3:53 PM

#

I've got an XGB model I need to train every week

#

but we're talking millions of data weekly

#

vs billions if I start from 0 every week

hollow gull Nov 26, 2020, 3:54 PM

#

I don't see a warm start argument in xgboost in python, but I thought there was one.

https://xgboost.readthedocs.io/en/latest/python/python_api.html

lapis sequoia Nov 26, 2020, 3:54 PM

#

I save the OVR in a pickle

#

and unpickle it to predict

#

I saw Online Learning could probably fix the issue of needing to retrain from scratch

hollow gull Nov 26, 2020, 3:55 PM

#

Yeah, that part should be fine as long as you can get the model back into memory, it doesn't matter how you do it. Pickle shouldn't be a problem although I think there was some reason our team wasn't using it. They are using mleap instead I believe.

#

But that doesn't impact the retraining piece.

lapis sequoia Nov 26, 2020, 3:56 PM

#

so is there a way to do old_pickled_model + current_week_model?

hollow gull Nov 26, 2020, 3:57 PM

#

process_type [default= default]

A type of boosting process to run.
Choices: default, update
default: The normal boosting process which creates new trees.

That sounds potentially useful to me.

https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst

GitHub

dmlc/xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow - dmlc/x...

#

This claims to have the answer:
https://stackoverflow.com/questions/47000253/python-xgboost-continue-training-on-existing-model

Stack Overflow

python xgboost continue training on existing model

Lets say I build an xgboost model:

bst = xgb.train(param0, dtrain1, num_round, evals=[(dtrain, "training")])
Where:
param0 is a set of params to xgb,
dtrain1 is a DMatrix ready to be trained

earnest herald Nov 26, 2020, 5:47 PM

#

What Visual Studio build tools do I need for python in general? I am trying to import Annoy right now

#

I am using 3.8.5 if that matters

lapis sequoia Nov 26, 2020, 5:55 PM

#

earnest herald What Visual Studio build tools do I need for python in general? I am trying to i...

https://marketplace.visualstudio.com/items?itemName=stevedower.python

You mean this right?

Python build tools - Visual Studio Marketplace

Extension for Azure DevOps - Visual Studio Team Services extension for Python developers.

earnest herald Nov 26, 2020, 6:01 PM

#

lapis sequoia https://marketplace.visualstudio.com/items?itemName=stevedower.python You mean ...

No

#

https://visualstudio.microsoft.com/visual-cpp-build-tools/

Visual Studio

Microsoft C++ Build Tools - Visual Studio

lapis sequoia Nov 26, 2020, 6:04 PM

#

earnest herald No

Ah okay. Hmm, in general I would use VS Code, but for data science I recommend Anaconda with Jupyter Notebook

earnest herald Nov 26, 2020, 6:04 PM

#

lapis sequoia Ah okay. Hmm, in general I would use VS Code, but for data science I recommend An...

This is my error in case you wanna have a look
https://pastebin.com/N1uRJg6y

Pastebin

(tflow) C:\Users\Me>pip install annoyCollecting annoy Using cached...

Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.

#

It legit says I need https://visualstudio.microsoft.com/visual-cpp-build-tools/ but I have no idea what workload to install

Visual Studio

Microsoft C++ Build Tools - Visual Studio

lapis sequoia Nov 26, 2020, 6:06 PM

#

earnest herald It legit says I need https://visualstudio.microsoft.com/visual-cpp-build-tools/ ...

Okay there is a Build Tool Standalone. Google for it. It should solve this problem.

📎 34850-10261.png

earnest herald Nov 26, 2020, 6:06 PM

#

Awesome! Thanks mate

lapis sequoia Nov 26, 2020, 6:07 PM

#

And update everything. Normally this tool says what is wrong, but tick everything

earnest herald Nov 26, 2020, 6:08 PM

#

Sweet

earnest herald Nov 26, 2020, 6:36 PM

#

lapis sequoia And update everything. Normally this tool says what is wrong, but tick everything

Although I could have used the standalone version, I did a bit more research and realized that I needed to install the C++ workload in Visual Studio. This is for anyone who might face this later (:

#

But I think the standalone version is pretty much the same as what I did

earnest forge Nov 26, 2020, 7:13 PM

#

I have the following dataset, how can I assign values from Location to X and Y respectively, considering that Location is not a tuple, it is just a string.

#

📎 unknown.png
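
One way to split such a string column is sketched below. The screenshot is not reproduced here, so the format "(x, y)" is an assumption, as is the sample data; the regular expression would need adjusting to the real layout.

```python
import pandas as pd

# Assumed layout: Location holds text like "(41.88, -87.62)".
df = pd.DataFrame({"Location": ["(41.88, -87.62)", "(40.71, -74.01)"]})

# Capture the two numbers between the parentheses, then convert them to floats.
coords = df["Location"].str.extract(r"\(\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\)")
df["X"] = coords[0].astype(float)
df["Y"] = coords[1].astype(float)
```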

lapis sequoia Nov 26, 2020, 7:31 PM

#

Hey, can anyone help me with this? https://cdn.discordapp.com/attachments/760445352305754142/781594109826826300/unknown.png

#

@earnest forge How can you publish that information o.o

wintry olive Nov 26, 2020, 7:52 PM

#

I'm here researching NLP/NLU for UX and a dream but you guys are using ML models to solve real world problems or to optimize workflows or gain market insight

#

are you guys comp sci branching out? or already in the field of ops or statistics trying to leverage new tech to gain an edge?

ancient venture Nov 26, 2020, 8:00 PM

#

is there a way to end a for loop after x iterations

#

in my case i need to end a for loop after 25 iterations

#

it seems like my code is doing the 25 iterations i need, then simply repeating it over and over again
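
Two common ways to stop after a fixed number of iterations are sketched below; `count()` is just a stand-in for whatever is being looped over. If the 25 iterations keep repeating, one possible cause is that the loop sits inside another loop or a function that is called more than once.

```python
from itertools import count, islice

# Option 1: break once 25 items have been processed.
results = []
for i, value in enumerate(count()):    # count() stands in for any long iterable
    if i >= 25:
        break
    results.append(value * 2)

# Option 2: islice yields only the first 25 items.
results_sliced = [value * 2 for value in islice(count(), 25)]
```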

earnest forge Nov 26, 2020, 8:03 PM

#

lapis sequoia @earnest forge How can you publish that information o.o

what do you mean?

#

It is open-source csv file that is available for everyone in Kaggle ._.

wintry olive Nov 26, 2020, 8:20 PM

#

i suppose there is opportunity to profit from lowering the barriers of entry and reducing the need to do so much coding. Unless you prefer coding? I'd like to help you guys focus on data and datasets. Someone should design UI tools for you guys to build custom workspaces that you can save share and collaborate on as easily as google docs+notebooks.

#

all your data is pre structured right?

scarlet cloak Nov 26, 2020, 8:26 PM

#

I'm writing a script to parse through a VCF file and create 4 outfiles but I was wondering if I can have the script place those outfiles in a directory of the users choice? I'm just wondering if this is possible? not sure if I need to use the os.mkdir() to do this
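
This is possible. A minimal sketch follows, assuming the directory arrives as a string from the user; the function name, file names, and contents are hypothetical. `Path.mkdir(parents=True, exist_ok=True)` creates the directory only if it is missing, so there is no need to check first.

```python
import tempfile
from pathlib import Path

def write_outfiles(out_dir, contents):
    """Write each name -> text pair in `contents` into `out_dir`, creating it if needed."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)    # no error if it already exists
    written = []
    for name, text in contents.items():
        target = out_path / name
        target.write_text(text)
        written.append(target)
    return written

# Demo with a temporary directory; in a script the directory would come from
# the user, for example via argparse or input().
with tempfile.TemporaryDirectory() as tmp:
    paths = write_outfiles(Path(tmp) / "vcf_results",
                           {"snps.txt": "...", "indels.txt": "..."})
    names = sorted(p.name for p in paths)
    all_exist = all(p.exists() for p in paths)
```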

wintry olive Nov 26, 2020, 8:27 PM

#

probably already have internal IT software tools right? the other issue is data manipulation tho...

bronze barn Nov 26, 2020, 8:43 PM

#

Is there a way to print data in a nice table in Jupyter notebooks? I've tried the following settings but to no avail:
pd.set_option("display.max_columns", 200)
pd.set_option("display.max_rows",3100)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

wintry olive Nov 26, 2020, 8:43 PM

#

the workspaces would be user generated black boxes of models, scripts, transforms, physics engines, computer algebraic expressions, and cache. Metaphor here is some workspaces would be like Mandelbrot Zooms the output would be an expression of that. Other workspaces would be like kitchens to cook information. in that you can chop cook melt() pivot() bake() fuse() consume() data in the abstract

#

@bronze barn can you import pandas? the pivot() call or function is what you need I think, or melt(). pandas is where it's at for data manipulation

#

you can shape tables like origami with pandas

bronze barn Nov 26, 2020, 8:54 PM

#

I'm not sure I follow, sorry. I've got the layout of the df I want but because I have many features around 20, it won't print in a table in notebook but several columns at a time which I have to scrolls through. Reading the documentation on those functions it sounds like they're for changing the layout of the df

molten hamlet Nov 26, 2020, 8:55 PM

#

bronze barn I'm not sure I follow, sorry. I've got the layout of the df I want but because I...

have you tried just pd ?

#

I think it auto captures notprinted 😄

bronze barn Nov 26, 2020, 9:05 PM

#

molten hamlet have you tried just `pd` ?

Silly me 😄 . Sorry to bother you with questions but is it possible to export this table as a PNG or something? Also, is there a way to display categorical data in this table because it only shows numerical features?

molten hamlet Nov 26, 2020, 9:05 PM

#

prnt screen 😂

#

no idea sir

molten hamlet Nov 26, 2020, 9:06 PM

#

bronze barn Silly me 😄 . Sorry to bother you with questions but is it possible to export t...

that your dataframe, its possible

ancient venture Nov 26, 2020, 10:47 PM

#

for this data set

#

📎 unknown.png

#

📎 unknown.png

#

i want y to be columns 1-31

#

i.e. all the columns except "Wavelength"

#

how can I do this

austere swift Nov 26, 2020, 10:56 PM

#

you can do df1 = df.drop('Wavelength', axis=1) and df1 will be a df with all the columns except wavelength

#

df will be the same

ancient venture Nov 26, 2020, 10:57 PM

#

so then would i just set

#

y = df1

austere swift Nov 26, 2020, 10:58 PM

#

yep

#

or just replace df1 with y

#

so like y = df.drop('Wavelength', axis=1)

ancient venture Nov 26, 2020, 10:59 PM

#

ah ok

austere swift Nov 26, 2020, 10:59 PM

#

if you wanna do slicing like you did with your first one though it would be y = spectral_data[:, 1:]

#

both will work fine
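
A small sketch of the options, using a made-up stand-in for the spreadsheet. One caveat: plain `[:, 1:]` slicing works on a NumPy array, while a pandas DataFrame needs `.iloc[:, 1:]` for the same positional slice.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: Wavelength plus three observation columns.
df = pd.DataFrame({
    "Wavelength": [400, 410, 420],
    "obs1": [0.1, 0.2, 0.3],
    "obs2": [0.4, 0.5, 0.6],
    "obs3": [0.7, 0.8, 0.9],
})

y_drop = df.drop("Wavelength", axis=1)   # by name; df itself is unchanged
y_iloc = df.iloc[:, 1:]                  # by position on a DataFrame
y_array = df.to_numpy()[:, 1:]           # plain [:, 1:] needs a NumPy array
```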

ancient venture Nov 26, 2020, 11:00 PM

#

ohh okay yeah that'd be preferred

#

tysm

#

do you know how i could apply a for loop to do this?

#

so what I would have instead is

#

📎 unknown.png

#

like that is originally what i had

#

(there are 30 columns of observation data in the spreadsheet)

#

but this for loop seems to produce the results i need, although

#

instead of results = [a,b,c] i get results = [a,a,a,b,b,b,c,c,c]

#

what i basically need this to do is do some data analysis for x=column 0 and y=column 1, then for x=column 0 and y=column 2, then for x=0 and y=column 3 etc
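
The loop described above could look roughly like this. The original code is only in the screenshot, so the data and the straight-line fit are placeholders for the real analysis. Appending exactly once per column gives one result per column; repeated entries like [a,a,a,b,b,b] often come from appending inside an extra inner loop.

```python
import numpy as np
import pandas as pd

# Made-up data: column 0 is x, every other column is one set of observations.
df = pd.DataFrame({
    "Wavelength": [1.0, 2.0, 3.0, 4.0],
    "obs1": [2.0, 4.0, 6.0, 8.0],
    "obs2": [1.0, 1.0, 1.0, 1.0],
    "obs3": [3.0, 2.0, 1.0, 0.0],
})

x = df.iloc[:, 0]
results = []
for col in df.columns[1:]:                          # one pass per observation column
    slope, intercept = np.polyfit(x, df[col], 1)    # placeholder analysis: line fit
    results.append(slope)                           # append exactly once per column
```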

lapis sequoia Nov 27, 2020, 12:02 AM

#

could someone explain to me transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,)),])
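
In short, `Compose` runs the listed transforms in order: `ToTensor` turns a uint8 image into floats in [0, 1], and `Normalize((0.5,), (0.5,))` then computes (x - mean) / std per channel, which maps [0, 1] to [-1, 1]. The NumPy sketch below mimics only that arithmetic on a made-up image; it is not the torchvision implementation.

```python
import numpy as np

# A fake 2x2 grayscale image with uint8 pixel values, as loaded from disk.
image = np.array([[0, 255], [51, 204]], dtype=np.uint8)

# ToTensor: convert to float and rescale from [0, 255] to [0.0, 1.0]
# (the real transform also moves the channel axis to the front).
tensor = image.astype(np.float32) / 255.0

# Normalize((0.5,), (0.5,)): (x - mean) / std per channel, giving [-1.0, 1.0].
mean, std = 0.5, 0.5
normalized = (tensor - mean) / std
```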

wintry olive Nov 27, 2020, 12:57 AM

#

@ancient venture I'm not an expert on the subject but import this source if applicable https://bashtage.github.io/linearmodels/doc/_modules/linearmodels/panel/model.html#BetweenOLS.fit I'm thinking you need to set x=0 as constant not three separate unique pointers to the same values? @lapis sequoia not sure will it generate a generic tensor or will it just apply the normalize effect?

delicate jackal Nov 27, 2020, 12:57 AM

#

I was gonna ask if anyone here is good at pandas but its seems like thats the only thing you're all talking about

#

I have a basic question

#

📎 Capture.JPG

#

This is the head of my dataframe

lapis sequoia Nov 27, 2020, 12:59 AM

#

holy moley

delicate jackal Nov 27, 2020, 12:59 AM

#

I need to find the lower and upper quartile of Global_Sales but need to do it by Genre

#

sorry for being a noob

#

I know there is a dataframe.quantile command but how would I make Global_Sales go by Genre?
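
One way to get quartiles per group is to combine `groupby` with `quantile`, sketched here on made-up numbers; only the column names Genre and Global_Sales come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Genre": ["Action", "Action", "Action", "Sports", "Sports", "Sports"],
    "Global_Sales": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# quantile() with a list returns one row per (Genre, quantile) pair;
# unstack() turns the quantile level into columns.
quartiles = df.groupby("Genre")["Global_Sales"].quantile([0.25, 0.75]).unstack()
quartiles.columns = ["lower_quartile", "upper_quartile"]
```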

#

I did it!

#

thanks dude

wintry olive Nov 27, 2020, 1:15 AM

#

@delicate jackal hmm that might require a script or a class. not sure which approach would be more efficient. also have to figure out some mechanism to point the command to run in the Genre class? or run the script then invoke the command on the scripts output? idk

delicate jackal Nov 27, 2020, 1:15 AM

#

nah it was easier than I thought

#

📎 Capture.JPG

wintry olive Nov 27, 2020, 1:16 AM

#

I also found this https://www.dcode.fr/numbers-quantile it's a webapp to calculate quantile

Quantile Calculator - Quantile of Numbers - Online Tool

Tool to calculate the quantile value from a list of values. The quantiles/percentiles/fractiles of a list of numbers are statistical values that partially illustrates the distribution of numbers in the list.

delicate jackal Nov 27, 2020, 1:16 AM

#

for upper quartile

#

honestly grouping things in pandas has been a nightmare for me xD

#

thanks mate

wintry olive Nov 27, 2020, 1:19 AM

#

np 👍

wintry olive Nov 27, 2020, 1:47 AM

#

wait hold up... databases are incredibly useful in this field right? I thought SQL was a type of DB but apparently it's a programming language. Is there an advantage for using a programming language with ML transformer models as a tool rather than data entry and utilization of good old fashion databases?

#

are you rebelling against the databases?

#

models and datasets > SQL old fashioned databases

#

Critics argue that SQL should be replaced with a language that returns strictly to the original foundation: for example, see The Third Manifesto. However, no known proof exists that such uniqueness cannot be added to SQL itself,[45] or at least a variation of SQL. In other words, it's quite possible that SQL can be "fixed" or at least improved in this regard such that the industry may not have to switch to a completely different query language to obtain uniqueness. Debate on this remains open.

#

a lot to unpack.... I was just about to ask why not add these ML models to old fashioned databases? I even thought about open source collaboration based virtual databases as a tool for my proposed idea for workspaces or a complete UI abstract.... now I have to read some manifesto and figure out what returning to the original foundation is all about.

#

apparently allowing duplicate rows makes it difficult or tedious to work with in python

#

that makes more sense then manifesto and original foundation...still have to read that tho

midnight trench Nov 27, 2020, 2:16 AM

#

wintry olive wait hold up... databases are incredibly useful in this field right? I thought S...

SQL is a language that controls a database. SQL stands for Structured Query Language; actually you can even call this a database bcz SQL stores the database as structured queries, as the full form says.

wintry olive Nov 27, 2020, 2:16 AM

#

SELECT *
FROM t
WHERE i = NULL;

#

there's a NULL issue too

midnight trench Nov 27, 2020, 2:17 AM

#

Select i
From t
Where I = NULL;

#

try this
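
The NULL issue can be reproduced with Python's built-in sqlite3 module. In this sketch the table `t` and column `i` follow the queries above and the rows are made up: comparing with `= NULL` yields unknown for every row, so nothing is returned, while `IS NULL` finds the missing value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (i INTEGER)")
conn.executemany("INSERT INTO t (i) VALUES (?)", [(1,), (None,), (3,)])

# `i = NULL` evaluates to NULL (unknown) for every row, so nothing matches.
equals_null = conn.execute("SELECT * FROM t WHERE i = NULL").fetchall()

# `IS NULL` is the predicate that actually tests for missing values.
is_null = conn.execute("SELECT * FROM t WHERE i IS NULL").fetchall()
conn.close()
```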

wintry olive Nov 27, 2020, 2:26 AM

#

hmmm the procedural extension to "unknown" doesn't resolve the issue? plus if Oracle has a hand in SQL than this is a G rebellion?

#

IF i = NULL THEN SELECT 'Result is True' ELSEIF NOT(i = NULL) THEN SELECT 'Result is False' ELSE SELECT 'Result is Unknown';

#

I'm switching to laptop...

#

nah oracle is a version title not some corporate feud. hmm figures this is the one area I never paid much attention to considering I am natural generalist who loves to design user experiences

#

wait oracle is involved with db

serene scaffold Nov 27, 2020, 2:45 AM

#

I need to study for my AI final. But that is work.

wintry olive Nov 27, 2020, 2:59 AM

#

4D seems interesting but definitely not open source

#

https://developer.4d.com/docs/

4D Documentation · Documentation for 4D Developers

Documentation for 4D Developers

#

The flexibility and power of the 4D programming language make it the ideal tool for all levels of users and developers to accomplish a complete range of information management tasks.

#

key phrase is: information management tasks.

#

whoa "With ORDA, data is accessed through an abstraction layer, the datastore. A datastore is an object that provides an interface to the database model and data through objects and classes. For example, a table is mapped to a dataclass object, a field is an attribute of a dataclass, and records are accessed through entities and entity selections."

#

well they beat me to that

#

anyone feel like reverse engineering that into python?

#

https://www.quicksort.co.uk/DeeDoc.html

queen crown Nov 27, 2020, 3:23 AM

#

hoping someone can help me in this channel about building a sorted list from scratch

#

I have an empty list that I want to populate with orders that are ordered based on price and timestamp

#

right now, i'm doing this

#

my_list.append(order)
my_list.sort(key=lambda k: (k.price, k.timestamp))

#

is there a way of doing this?

#

 bisect.insort_right(my_list, (order.price, order.timestamp, order))
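
The bisect approach can work. A sketch follows, with a hypothetical `Order` type and an extra counter in the tuple. The counter is an addition not in the original snippet: when price and timestamp tie, tuple comparison would otherwise fall through to comparing the order objects themselves, which fails for classes that define no ordering.

```python
import bisect
from collections import namedtuple
from itertools import count

Order = namedtuple("Order", ["price", "timestamp", "label"])  # hypothetical order type

book = []         # kept sorted by (price, timestamp)
_seq = count()    # tie-breaker so two order objects are never compared directly

def add_order(order):
    """Insert `order` at its sorted position instead of re-sorting the whole list."""
    bisect.insort_right(book, (order.price, order.timestamp, next(_seq), order))

for o in [Order(10.5, 3, "c"), Order(9.0, 2, "a"), Order(10.5, 1, "b")]:
    add_order(o)

labels = [entry[-1].label for entry in book]
```

Each insert still shifts list elements, but it avoids a full sort after every append.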

calm nacelle Nov 27, 2020, 3:49 AM

#

this is probably a better place to ask this dumb question, https://stackoverflow.com/questions/65031767/how-do-i-easily-convert-a-streaming-array-to-variables

Stack Overflow

How do I easily convert a streaming array to variables?

I tried to find this answer, and after several hours, still no results. I want to do something very easy, this doesn't have to be this complicated...
So, we have a pyaudio buffer that pulls a chunk...

#

if anyone can send help i'd appreciate it hahahaha

#

i swear im not 100% incompetent

wintry olive Nov 27, 2020, 4:04 AM

#

hmmm im incompetent but intrigued.

#

https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html

calm nacelle Nov 27, 2020, 4:12 AM

#

i decided to skip it and just have x be the datastream and y be a cusignal minimum phase

#

a bit easier hahahaha

wintry olive Nov 27, 2020, 4:20 AM

#

this might help?

📎 unknown.png

#

Instrumental Variable Estimation seems relevant

calm nacelle Nov 27, 2020, 4:25 AM

#

not sure what that is, ill have to read about that

#

https://scikit-multiflow.github.io/

scikit-multiflow

Machine learning package for streaming data in Python

#

trying to work with this right now

#

seems neat

wintry olive Nov 27, 2020, 4:32 AM

#

yeah i was looking in linear models https://bashtage.github.io/linearmodels/_modules/linearmodels/iv/model.html#IVGMMCUE

#

yoo multi flow does look neat

calm nacelle Nov 27, 2020, 4:55 AM

#

trying to run something custom rn, keeps giving me a none type error

#

no idea why lmao

calm nacelle Nov 27, 2020, 5:47 AM

#

it was intel python being wonky

#

success!

#

📎 Untitled.png

white carbon Nov 27, 2020, 6:17 AM

#

Hey guys, I am trying to make a 5d scatter plot given 4 attributes and their associated classes using the iris data set. My current code is

    pred = k_means.predict(input_train)

    x = input_train.iloc[:,0]
    y = input_train.iloc[:,1]
    z = input_train.iloc[:,2]
    a = input_train.iloc[:,3]

    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(x, y, z, a, c=pred, cmap=plt.hot())
    plt.show()

Where x,y,z,a are the columns for the data of my attributes. I was wondering if anyone could help me out with this

velvet thorn Nov 27, 2020, 6:20 AM

#

white carbon Hey guys, I am trying to make a 5d scatter plot given 4 attributes and their ass...

so what's the problem?

white carbon Nov 27, 2020, 6:22 AM

#

I get this error

Traceback (most recent call last):
  File "...", line 62, in <module>
    prepData()
  File "...", line 58, in prepData
    ax.scatter(x, y, z, a, c=pred, cmap=plt.hot())
  File "...", line 2313, in scatter
    art3d.patch_collection_2d_to_3d(patches, zs=zs, zdir=zdir,
  File "...", line 554, in patch_collection_2d_to_3d
    col.set_3d_properties(zs, zdir)
  File "...", line 481, in set_3d_properties
    self._offsets3d = juggle_axes(xs, ys, np.atleast_1d(zs), zdir)
  File "...", line 763, in juggle_axes
    if zdir == 'x':
  File "...", line 1329, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

velvet thorn Nov 27, 2020, 6:23 AM

#

white carbon I get this error ```py Traceback (most recent call last): File "...", line 62...

what do you expect the result to look like?

#

when you say "5d scatter plot"

white carbon Nov 27, 2020, 6:24 AM

#

Well what I am looking for is, the x,y, and z axis are 3 of the 4 attributes, the colour of the plotted data is their class, and the size of the plotted data is their fourth attribute. I am unsure if this is "5d", but since it had 5 aspects to it I just assumed. I am extremely new to coding so I have no clue as to what I am doing wrong 😦

velvet thorn Nov 27, 2020, 6:25 AM

#

white carbon Well what I am looking for is, the x,y, and z axis are 3 of the 4 attributes, th...

huh.

#

hm.

#

try ax.scatter(x, y, z, s=a, c=pred)?

white carbon Nov 27, 2020, 6:29 AM

#

For reference, what I am doing is using the iris data set for machine learning. The code you sent does make a scatter plot, thank you! I was also wondering however, since I renamed the classes to 1, 2 and 3, the colours that appear are purple, a very light yellow, and I can't see the other one. Is there a way to fix this or do I just have to change the numbers of the classes to something else?

velvet thorn Nov 27, 2020, 6:30 AM

#

white carbon For reference to what I am doing is using the iris data set for machine learning...

that would be due to your colour map (cmap)

#

adjust it accordingly

#

it's not related to the class names

white carbon Nov 27, 2020, 6:32 AM

#

Oh that makes sense thank you! For the size of the points, I noticed that they all look the same, probably because the values of them are quite similar, from the visualization given to me by python. Is there a way to zoom in to take a better picture of this, I am trying the magnifying glass but it does not seem to work.
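
The suggestions above can be combined into one sketch. In the failing call, the fourth positional argument of the 3D `scatter` is `zdir`, which is why passing `a` there produced the "truth value of a Series is ambiguous" error; size goes in `s=` instead. The data below is random stand-in data, and the 20 to 200 size range is an arbitrary choice so that similar raw values still give visibly different markers.

```python
import matplotlib
matplotlib.use("Agg")             # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.uniform(0.1, 7.0, size=(30, 4))    # stand-in for the four iris attributes
pred = rng.integers(0, 3, size=30)            # stand-in for k_means.predict(...)

x, y, z, a = data.T
# Rescale the fourth attribute to marker areas between 20 and 200.
sizes = 20 + 180 * (a - a.min()) / (a.max() - a.min())

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
# An explicit colormap name keeps all three classes visible.
points = ax.scatter(x, y, z, s=sizes, c=pred, cmap="viridis")
```

In an interactive session, `plt.show()` would follow.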

rapid raft Nov 27, 2020, 6:45 AM

#

I want to make a quiz website that will take its questions and choices from a pdf/image. I have tried things like pytesseract but it doesn't detect math equations and stuff like this. So I was thinking of making something that just cuts the questions and choices into short images of their own, then I will use them. But I don't know how I would be able to do that. Can anyone tell me what I need to do/learn to make it happen?

📎 IMG_20201127_112430.jpg

north sand Nov 27, 2020, 7:10 AM

#

Hi everyone!!

Im actually into datascience and I discovered SuperDataScience (that has a udemy course of Machine Learning A-Z) and I am having a hard time deciding if I will buy their discounted subscription.

has anyone here tried their services?

agile wing Nov 27, 2020, 7:51 AM

#

Yeah i bought three courses of udemy ever since their black friday sale happened.

#

I think its worth it...from the bLack friday sales going on.

analog apex Nov 27, 2020, 8:19 AM

#

How to use textract with django?

livid quartz Nov 27, 2020, 9:18 AM

#

Does numpy have a function that normalises column vectors to length 1 ?

velvet thorn Nov 27, 2020, 9:21 AM

#

livid quartz Does numpy have a function that normalises column vectors to length 1 ?

length 1...?

#

that's not normalisation, if I understand you correctly

livid quartz Nov 27, 2020, 9:29 AM

#

Ah sorry, In this case I am talking about Linear Algebra

velvet thorn Nov 27, 2020, 9:31 AM

#

livid quartz Ah sorry, In this case I am talking about Linear Algebra

so what do you want to do exactly

livid quartz Nov 27, 2020, 9:32 AM

#

Normalise A matrix of column vectors so that their length is 1. I've done it manually by dividing each column vector by the square root of their dot product, but I was just wondering if NumPy had a quicker way to do it

velvet thorn Nov 27, 2020, 9:32 AM

#

livid quartz Normalise A matrix of column vectors so that their length is 1. I've done it man...

when you say "length"

#

do you mean in the sense of some sort of norm e.g. Euclidean distance?

livid quartz Nov 27, 2020, 9:33 AM

#

Yes

#

Euclidean in this case

velvet thorn Nov 27, 2020, 9:34 AM

#

divide by np.linalg.norm

#

I would suggest

#

you refrain from saying "length"

#

it's quite misleading

livid quartz Nov 27, 2020, 9:34 AM

#

Ah right ok, It was just the way I have been taught linear algebra

velvet thorn Nov 27, 2020, 9:35 AM

#

"length" is one interpretation of the norm, yes

#

but the more general term is "unit norm"

#

rather than "length 1"

#

(Euclidean norm is implied, but of course other norms exist)

#

(and in this case you are specifically using the Euclidean norm)
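
A sketch of the `np.linalg.norm` suggestion for a matrix of column vectors. The matrix is made up, and a column of all zeros would need special handling to avoid dividing by zero.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [4.0, 1.0]])

# Euclidean norm of each column; keepdims makes the result broadcast over rows.
col_norms = np.linalg.norm(A, axis=0, keepdims=True)
A_unit = A / col_norms
```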

livid quartz Nov 27, 2020, 9:35 AM

#

Perfect, I will keep that in mind

barren meadow Nov 27, 2020, 10:42 AM

#

hey guys

#

anyone of you who uses pyarrow?

frail arch Nov 27, 2020, 12:25 PM

#

Can someone tell me difference between macro and min average in precision score in sklearn?

hasty grail Nov 27, 2020, 2:57 PM

#

Try median filtering

#

@oblique vine

oblique vine Nov 27, 2020, 3:01 PM

#

nope

#

I kinda want the lowest of surrounding pixels

#

not median

#

I mean

#

lowest if neighbours are zero or sth like that

hasty grail Nov 27, 2020, 3:05 PM

#

You want erosion*

#

Then use dilation to restore the original characters

terse dove Nov 27, 2020, 3:17 PM

#

hi

#

i have a question

#

if i have 2 columns of data, one is categorical and the other is numerical. there are some missing values in the columns

what is the code that replaces the missing values with the most frequent value in the column ( both for the categorical variable and the numerical)

i don't know if it helps, but the categorical column has values like 1, 0, 4, 5 .. (not a, b, c)

trim oar Nov 27, 2020, 3:48 PM

#

check out the documentation for fillna()

#

You can fill missing values however you want with that method
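
For the most-frequent-value case, a minimal sketch follows; the column names and values are invented. `mode()` returns a Series because ties are possible, so the first entry is taken.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "category_code": [1, 0, 1, np.nan, 4, 1],
    "measurement": [2.5, np.nan, 2.5, 3.0, 2.5, np.nan],
})

missing_pct = df.isna().mean() * 100    # share of missing values per column

# Most frequent value per column, then fill each column with its own mode.
modes = {col: df[col].mode().iloc[0] for col in df.columns}
filled = df.fillna(modes)
```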

#

===
But to be honest, depending on how many missing values you have and the distribution, you are introducing bias. Imputing with the mode could drastically change your analysis

#

https://jamesrledoux.com/code/imputation#:~:text=Imputation Method 1%3A Most Common,given in Pandas'%20value_counts%20function.

James LeDoux's Blog

Impute Missing Values

#

In the sense of?

terse dove Nov 27, 2020, 4:19 PM

#

sry for the late reply

#

i was eating

#

basically, we were given 2 datasets (training and test), both have missing values. we are allowed to get rid of data that have missing values in the training dataset, but we are not allowed to do that in the test dataset, so i thought i would use the most frequent in a column to fill in the missing values both in the training and the test data

trim oar Nov 27, 2020, 4:23 PM

#

You don't do anything to the test data because that's the whole point of having a test set

#

Uh gonna reply more but in a middle of something

terse dove Nov 27, 2020, 4:24 PM

#

ok sure, i think i need a long discussion regarding that topic

#

just to make sure, we will build our model based on the training data, then use it on the test data to predict the classes of its instances

trim oar Nov 27, 2020, 4:55 PM

#

I'm guessing this is for a course? Step one would always be checking how much data is missing and why they are missing, so (stock.isna().sum() / stock.shape[0])*100 gives you an idea of the missing values %. If it's too much, might as well drop. If it's substantial, it's better to note the assumption when introducing the bias. You'd also want to check for missing value types and distribution in each feature, this tutorial gives a simple explanation about missing value types and for mean/mode/median https://towardsdatascience.com/all-about-missing-data-handling-b94b8b5d2184

Medium

All About Missing Data Handling

Missing data is a every day problem that a data professional need to deal with. Though there are many articles, blogs, videos already…

#

If you have a skewed distribution, for example, imputing mode would make that even more skewed

terse dove Nov 27, 2020, 4:57 PM

#

ok cool thanks, i will check it. another question. I have a column that has numerical values such as 1, 2, 3, .. and so on. it is required that we treat that column as categorical

#

how to do that?

#

do i convert the numbers into, let's say, a, b, c, d, ..?

trim oar Nov 27, 2020, 4:59 PM

#

I'm not sure why you would do that. What was the exact wording? Because that makes it hard to compute, unless you're performing a decisiontreeclassifier?

#

Even then..

terse dove Nov 27, 2020, 4:59 PM

#

we have to use multiple classifiers and decide on the best one based on the balanced error rate for each classifier

#

i mean, should i leave the column as it is and run my training?

#

i don't know the process to be honest, but is there supposed to be a way to specify that these columns are categorical?

trim oar Nov 27, 2020, 5:09 PM

#

Oh I see. You can do col = col.astype(object) to change the datatype to object rather than integer

#

https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#basics-dtypes

#

To change the datatype
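
A sketch of treating a numeric column as categorical; the column names are made up. It uses pandas' `category` dtype, which is an alternative to `object`, and `get_dummies` as one possible way to hand the column to classifiers that expect numeric input.

```python
import pandas as pd

df = pd.DataFrame({"code": [1, 2, 3, 2], "value": [0.5, 1.5, 2.5, 3.5]})

df["code"] = df["code"].astype("category")        # numbers stay, dtype marks them as labels
encoded = pd.get_dummies(df, columns=["code"])    # one indicator column per level
```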

main badger Nov 27, 2020, 6:23 PM

#

I am new to neural networks (and AI/ML for that matter) and I am trying to wrap my head around the Dense layers in Keras. So I have a question.. Does adding multiple Dense layers with same arguments make any difference, is there a point in doing that? I do not understand how having two dense layers with activation='relu' for example make a difference.

sturdy dune Nov 27, 2020, 7:17 PM

#

Machine Learning isn't only about writing few lines of code and getting the output printed. Understanding the concepts of every algorithm plays the important role. Here is the blog where I explained concept behind K-Nearest Neighbours Algorithm in a simplest way.

Hope it helps you!!! Let me know your feedback.

https://datamahadev.com/machine-learning-algorithms-k-nearest-neighbours-detailed-explanation/

datamahadev.com

Samiksha Bhavsar

Machine Learning Algorithms: K-Nearest‌ ‌Neighbours‌ Detailed Expla...

KNN is one of the basic supervised learning algorithm used for classification. By the end of this article, you will easily start applying KNN on various datasets.

whole vortex Nov 27, 2020, 7:31 PM

#

Does anyone here know the process of Discrete Cosine Transform on an image?

#

Struggling to understand how DCT2 works. The part when the 8x8 block from the image has been selected then "calculate the contribution of each block in the image, that when added together, will create this image exactly."

delicate wedge Nov 27, 2020, 7:49 PM

#

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''localhost' IDENTIFIED BY 'password' at line 1

I changed my password to the word `password` but what am I doing wrong?

wintry olive Nov 27, 2020, 7:55 PM

#

whoa this channel has a lot of activity 👍 I'm reading thru it to get caught up and on the same level...

whole vortex Nov 27, 2020, 8:14 PM

#

lol

wintry olive Nov 27, 2020, 8:16 PM

#

mmm 🤔 Ian yu knows what's up.

flint canopy Nov 27, 2020, 8:17 PM

#

hi i just heard that i can ask about the python module pandas and excel things here. is it okay to ask here

frosty flare Nov 27, 2020, 8:21 PM

#

hey guys , does anyone here use the openCV library on Pycharm mac os ??

wintry olive Nov 27, 2020, 8:22 PM

#

I was interested in pandas because of its tools for data manipulation, specifically melt() and pivot(), but that's more of a convenience quality of life tool to help get the job done so to speak...

flint canopy Nov 27, 2020, 8:23 PM

#

wintry olive I was interested in Panda because of its tools for data manipulation specificall...

do u mind to look at my code and have a check

wintry olive Nov 27, 2020, 8:26 PM

#

sure what you got?

flint canopy Nov 27, 2020, 8:26 PM

#

📎 Screenshot_2020-11-28_034922.png

#

📎 Screenshot_2020-11-28_034913.png

#

idk what's problem with this code , i want to update the status but it will come out warning from module

lapis sequoia Nov 27, 2020, 8:27 PM

#

Show the warning

flint canopy Nov 27, 2020, 8:27 PM

#

and the status doesn't update

#

📎 Screenshot_2020-11-28_042819.png

flint canopy Nov 27, 2020, 8:29 PM

#

lapis sequoia Show the warning

this is the warning. it will not crash my program , after that it will continue my program

lapis sequoia Nov 27, 2020, 8:30 PM

#

flint canopy this is the warning it will not clash my program , after that it will continue m...

https://stackoverflow.com/questions/44028898/a-value-is-trying-to-be-set-on-a-copy-of-a-slice-from-a-dataframe-pandas#44029570

Stack Overflow

A value is trying to be set on a copy of a slice from a DataFrame. ...

I'm new to pandas, and, given a data frame, I was trying to drop some columns that don't accomplish an specific requirement. Researching how to do it, I got to this structure:

df = df.loc[df['

flint canopy Nov 27, 2020, 8:31 PM

#

hmm i referred to it before, but idk why it doesn't update my excel

wintry olive Nov 27, 2020, 8:43 PM

#

it's not a tool for updating Excel. It's a tracking number to see it is your Excel or not. It's a built in back propagation warning system that is flawed? A lot of the warnings are false positives and can be completely turned off according to jezrael.

#

my Excel or not my Excel?

whole sage Nov 27, 2020, 8:44 PM

#

Hey. Has somebody used AWS step functions for Neural Networks?

#

For deployment

frosty flare Nov 27, 2020, 8:46 PM

#

hey , i have a problem accessing my mac's camera using opencv on pycharm ... can anyone help?

wintry olive Nov 27, 2020, 8:49 PM

#

I was wondering why there is a need to make a warning for that tho. It makes sense to try to back propagate if possible. redundant data is an issue however I see it in terms of signal to noise not original and copy. Also signal to noise is relative to the observer

#

then you think in terms of interference. not all interference is negative, sometimes it can be resonant and positive, other times a neutral cancellation

#

haven't used pycharm it looks like a very popular python editor console especially for data science too with SQL built in

#

datasets like https://github.com/huggingface/datasets

GitHub

huggingface/datasets

🤗 Fast, efficient, open-access datasets and evaluation metrics in PyTorch, TensorFlow, NumPy and Pandas - huggingface/datasets

#

they are almost like virtual corpora carefully curated and stored in an open library

#

that is hard tedious work too and unless you feel like going to the source and sorting thru haystacks looking for needles to add I suggest we help them out if we can. Google corpus, a lot of them are free with limited use, some are not free, but most of them let you easily make virtual corpora on their webapps using queries and linguistic and semantic filters and collectors

#

and that's just the data not even the datasets. the datasets have additional meta data and are really powerful tools for ML in the right hands. The data is like a bolt... the data set is the socket which actually fits, allowing you to tighten the data into a ML framework model process. There are more tools in the set but you don't want to over fit or introduce bias for the sake of convenience.

#

my idea is built on that same foundation. but from there it will be all about creative diverse expression. broad spectrum artistic not laser focused realistic and very much a product of a stack of meta objects emerging from a highly complex 'complex plane'

wintry olive Nov 27, 2020, 10:12 PM

#

sorry for thinking out loud again. more questions or posts about data science from actual data scientists would be greatly appreciated ☺️

#

bruh that was supposed to be a 😃

#

anyone working on computer vision data and models? I've been focusing on NLP

velvet thorn Nov 27, 2020, 10:21 PM

#

it's not a tool for updating Excel. It's a tracking number to see it is your Excel or not. It's a built in back propagation warning system that is flawed? A lot of the warnings are false positives and can be completely turned off according to jezrael.
@wintry olive it’s also about coding style

#

if you write code in a more FP way it generally doesn’t happen

#

I was wondering why there is a need to make a warning for that tho. It makes sense to try to back propagate if possible. redundant data is an issue however I see it in terms of signal to noise not original and copy. Also signal to noise is relative to the observer
@wintry olive because you can’t tell reliably whether you’re modifying the original or not
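
A minimal sketch of the two patterns the warning is about. The original code was only posted as screenshots, so the frame and the `id` / `status` column names here are made up for illustration:

```python
import pandas as pd

# Hypothetical data; the column names are assumptions, not the poster's.
df = pd.DataFrame({"id": [1, 2, 3], "status": ["new", "new", "done"]})

# Chained indexing such as df[df["id"] == 2]["status"] = "done" assigns into
# an intermediate object that may be a copy, so df itself may stay unchanged.

# A single .loc call picks rows and column in one step and writes to df itself.
df.loc[df["id"] == 2, "status"] = "done"

# If a separate subset is wanted, an explicit copy makes the intent clear.
subset = df[df["status"] == "done"].copy()
subset["status"] = "archived"  # changes subset only, df is untouched
```

Either way the change only lives in memory. The Excel file does not change until the frame is written back out, for example with `df.to_excel(...)`.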

wintry olive Nov 27, 2020, 10:28 PM

#

I gotcha yeah it makes sense to me. the answer was given by @lapis sequoia however I'm thinking the user who asked the question was having difficulties understanding due to auto translate of formal language? I noticed some Asian script text in his code somewhere.

#

good as an FYI but not ideal as a useful means for determining true original vs copy of data frame

velvet thorn Nov 27, 2020, 10:31 PM

#

I am new to neural networks (and AI/ML for that matter) and I am trying to wrap my head around the Dense layers in Keras. So I have a question.. Does adding multiple Dense layers with same arguments make any difference, is there a point in doing that? I do not understand how having two dense layers with activation='relu' for example make a difference.
@main badger yes, it does. the output of the previous dense layer goes into the next

wintry olive Nov 27, 2020, 10:31 PM

#

or data frame slice might be too incomplete to determine origin

velvet thorn Nov 27, 2020, 10:35 PM

#

good as an FYI but not ideal as a useful means for determining true original vs copy of data frame
@wintry olive yeah, it’s a Python limitation

wintry olive Nov 27, 2020, 10:40 PM

#

also one is Google cash driven the other Facebook cash driven. Which is great for open source software developers. However I share this because Google auto generated that bit from an inference. it emerged quite glitchily in my search results. beta feature I suppose because I was definitely looking for the comparison.

📎 Screenshot_20201127-173452.png

#

idk why it superimposed thumbnail graph images over the text tho

#

I use google search way too much

#

I know what it was I used hey google to take screenshot so google assistant must have permission to write over chrome app

#

any computer vision peeps want to run this thru their models?
https://c.tenor.com/8GyjIYDJGXsAAAAM/mandelbrot-zoom-fractal.gif

#

compare stream processing vs batch image and look for holomorphic functions

wintry olive Nov 27, 2020, 11:06 PM

#

i keep stumbling into this graph, holomorphic/meromorphic or something. apparently those spikes have something to do with primes idk take the batch photos at the prime-spikes? or use them to modulate recursive iteration as if to encode/decode data on the infinite fractal expression? 8 layers? or 8 colors? or something? i haven't really looked into computer vision modelling as much as I have NLP tho

📎 unknown.png

#

https://peterjamesthomas.com/glimpses-of-symmetry/chapter-8-simplicity/

Peter James Thomas

8 – Simplicity

“In character, in manner, in style, in all things, the supreme excellence is simplicity.” – Henry Wadsworth Longfellow This Chapter, and the two that follow it, may seem at …

#

seems relevant

lapis sequoia Nov 28, 2020, 12:35 AM

#

guys, how can i install opencv for python with gpu support? (same for tensorflow)

lapis sequoia Nov 28, 2020, 1:00 AM

#

wait...

#

cant i use it if my gpu is amd?

#

wtf? that makes no sense

red briar Nov 28, 2020, 2:16 AM

#

📎 unknown.png

#

how to use multi index?

thorny owl Nov 28, 2020, 2:19 AM

#

Can anyone help me with a numpy question really quick?

velvet thorn Nov 28, 2020, 6:53 AM

#

thorny owl Can anyone help me with a numpy question really quick?

just ask

vital linden Nov 28, 2020, 7:24 AM

#

lapis sequoia cant i use it if my gpu is amd?

tensorflow with AMD GPU is hard and unstable in my personal experience. There's a rocm-tensorflow fork that only works on linux. I spent a lot of time getting it to work, but it's unstable and often freezes while training - I can only get it to work with certain batch sizes.

main badger Nov 28, 2020, 7:30 AM

#

velvet thorn > I am new to neural networks (and AI/ML for that matter) and I am trying to wra...

Thank you for responding. How would having repeated layers help? If I have two dense layers with default parameters, would it improve the accuracy? Or just slow things down?

proper quail Nov 28, 2020, 7:31 AM

#

📎 Thumbnail.png

#

Ready to develop your next AI-enabled application? ACM-VIT is proud to present The Neural Hack to cap off the year 2020. Whether you're well-versed in Machine Learning or a newcomer to the space, we invite developers from all backgrounds to solve real-world problems with innovative AI solutions. Mark your calendars for 4th & 5th December for a thrilling 36-hour hack!
An AI based hackathon to promote women in technology.

Anyone can participate in this hackathon but the team should consist of at least 50 percent women.

For more information visit: https://theneuralhack.devpost.com/

The Neural Hack

Welcome to The Neural Hack - An AI based Hackathon for Women in Tech!

#

Thought it would be helpful in this channel so shared it🙃

velvet thorn Nov 28, 2020, 7:39 AM

#

main badger Thank you for responding. How would having repeated layers help? If I have two d...

it...depends?

#

more layers generally means higher complexity

main badger Nov 28, 2020, 7:45 AM

#

velvet thorn it...depends?

Hmm.. in the scenario I am studying, it doesn't seem to be increasing the accuracy.

#

I realise I am not too clear on what the dense layer is basically doing. I need to go back and reread

velvet thorn Nov 28, 2020, 7:47 AM

#

main badger I realise I am not too clear on what the dense layer is basically doing. I need ...

a dense layer

main badger Nov 28, 2020, 7:47 AM

#

I understand what the relu activation or the sigmoid activation is doing, but with default parameters.. not clear.

velvet thorn Nov 28, 2020, 7:47 AM

#

is basically a lot of parallel linear regressions

#

with an optional activation layer that provides nonlinearity
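
A rough NumPy sketch of that idea, not Keras itself. The shapes, the random weights, and the `dense` / `relu` helper names are assumptions for illustration. It also shows why two stacked layers with no activation add nothing over one layer, while a nonlinearity in between does change things:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b, activation=None):
    """One dense layer: each output unit is a linear regression on x."""
    z = x @ W + b
    return activation(z) if activation is not None else z

def relu(z):
    return np.maximum(z, 0.0)

x = rng.normal(size=(4, 3))                      # 4 samples, 3 features
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

# Two layers with no activation collapse into a single linear layer.
stacked_linear = dense(dense(x, W1, b1), W2, b2)
single_linear = dense(x, W1 @ W2, b1 @ W2 + b2)

# With ReLU in between, the stack is no longer one linear map.
stacked_relu = dense(dense(x, W1, b1, relu), W2, b2)
```

`stacked_linear` and `single_linear` agree up to floating point error, which is the reason repeated layers only add capacity when an activation sits between them.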

main badger Nov 28, 2020, 7:48 AM

#

Ahhh.. this helps!

lapis sequoia Nov 28, 2020, 10:05 AM

#

main badger Ahhh.. this helps!

It depends on the task. Do you need time delayed information, NN are just crap. You need an RNN for short-term or an LSTM for long short-term questions. If you use an NN you only need one or two layers with many neurons. If you have a higher-level task (such as NLP or handwriting recognition or something like movie processing) you need deep learning with many layers and a few neurons. You need to understand the difference between NNs and deep learning RNNs/CNNs.

NNs are not so powerful and I don't use them anymore, because LSTM is more powerful (but learn NN first)

velvet thorn Nov 28, 2020, 10:20 AM

#

lapis sequoia It depends on the task. Do you need time delayed information, NN are just crap. ...

strictly speaking, "neural network" covers all of them.

#

in other words, CNNs and RNNs are types of NN.

#

but yes, just fully connected layers won't get you anywhere

lapis sequoia Nov 28, 2020, 10:21 AM

#

Yeah sure. But better performance

velvet thorn Nov 28, 2020, 10:21 AM

#

the point is that it's not really correct to say "NN means only fully connected layers"

lapis sequoia Nov 28, 2020, 11:08 AM

#

Can anyone point me to data science related to game theory, or know any game theory / data science communities or discords?

lapis sequoia Nov 28, 2020, 1:17 PM

#

vital linden tensorflow with AMD GPU is hard and unstable in my personal experience. There's ...

and any other way to use amd gpu on machine learning?

loud veldt Nov 28, 2020, 2:43 PM

#

Hi everyone. For those who also program in R, can someone recommend a book for Python that is like Hadley's R for Data Science?

fierce shadow Nov 28, 2020, 2:52 PM

#

@lapis sequoia https://medium.com/swlh/how-to-use-amd-gpus-for-machine-learning-on-windows-96ace916e97

Medium

How to Use AMD GPUs for Machine Learning on Windows

Getting past the NVIDIA monopoly

#

I think this is something you're looking for

lapis sequoia Nov 28, 2020, 2:53 PM

#

lapis sequoia and any other way to use amd gpu on machine learning?

Theano dude.

#

Best GPU/CPU Python lib

fierce shadow Nov 28, 2020, 2:54 PM

#

for amd gpus?

lapis sequoia Nov 28, 2020, 2:54 PM

#

Doesn't matter

#

i use it with amd and intel

fierce shadow Nov 28, 2020, 2:54 PM

#

although I believe Nvidia is still the best option as it provides the CUDA API

lapis sequoia Nov 28, 2020, 2:55 PM

#

Try it. But you have the best support with theano

#

It calculates with your hardware the best way of solving a problem with minimal time

lapis sequoia Nov 28, 2020, 3:26 PM

#

fierce shadow @lapis sequoia https://medium.com/swlh/how-to-use-amd-gpus-for-machine-le...

okay seems good so far. But i have never used environments... xD i install packages with pip and import them when needed in pycharm

#

Theanos 🙃

#

```
Please choose a default device:

   1 : llvm_cpu.0
   2 : opencl_amd_ellesmere.0
```

#

i guess i need to select 2?

trim oar Nov 28, 2020, 3:39 PM

#

Environments are good for two reasons. When you use time series or deeplearning packages, or other complex packages, there are times where you could mess up the underlying structure, and you don't wanna mess up your root/base.

#

Then of course, when you share your project, you'd want the other party to be able to replicate the results. Or vice versa, replicate their results. Hence sharing the environment with the exact packages

#

Probably there are some other reasons, but these two are the biggest for me

main badger Nov 28, 2020, 3:47 PM

#

lapis sequoia It depends on the task. Do you need time delayed information, NN are just crap. ...

I am new to all these and do not understand most of the terms you have used. I will note these down and look them up. And as you also suggested, I need to learn and have a solid understanding of the fundamentals first, it is all very surface level as of now. Could you please suggest some good resources for learning for beginners? Thank you very much.

elfin bear Nov 28, 2020, 4:02 PM

#

I have a script that extracts some metadata from a file/directory of files and I try to find a good way to print it both to terminal (less data) and to file (all data). Anyone got any good idea what would be the best way?

molten ridge Nov 28, 2020, 4:14 PM

#

I want to encrypt Some data and Store in SQlite but I cant

#

when i encrypt the Data with cryptography.fernet

past kiln Nov 28, 2020, 4:16 PM

#

just make your own encryption function

#

and encrypt it with that

molten hamlet Nov 28, 2020, 4:20 PM

#

well, this aliasing goes brrrr

#

https://media.discordapp.net/attachments/751636489208856725/782279634536366121/turbo_vim.gif?width=360&height=360

vague portal Nov 28, 2020, 4:53 PM

#

Useful resources for software development/architecture of ETL pipelines?

wintry olive Nov 28, 2020, 8:19 PM

#

i got something for you @vague portal https://optool.co/

Operator Tool Co.

The easist way to use cloud infrastructure.

#

check it out and let me know what you think

#

not specific to transforming adapting data from source archival or warehoused data to cloud workstation containers and then back update (that would be sweet tho) but you might be able to hack it?

#

used as a cheap work around? idk

#

well that statement had a lot of noise in it. meant to say "not specific to sourcing data from warehouse state to a workstation to work transform or adapt it in a manner that back propagates (if reasonable or desirable)"

tropic nest Nov 28, 2020, 8:29 PM

#

Hey all, I've got some code where I was using numpy.trapz to integrate some data and I'm trying to use scipy.integrate.simps for Simpson's quadratic method instead. It keeps giving me a 'ValueError: operands could not be broadcast together with shapes (19,) (9,) ' however, when I check .shape of both 'y_speed' and 'x_interval' arrays it returns '(19,)'.

I'm calling
'''Python
code
distance = integrate.simps(y_speed, x=None, dx=x_interval, even = 'avg')'''

The full traceback:
''' File "C:\Users\ckowalski\AppData\Local\Programs\Python\Python38-32\lib\site-packages\scipy\integrate\_quadrature.py", line 499, in simps
result = _basic_simps(y, 0, N-2, x, dx, axis)
File "C:\Users\ckowalski\AppData\Local\Programs\Python\Python38-32\lib\site-packages\scipy\integrate\_quadrature.py", line 360, in _basic_simps
result = np.sum(dx/3.0 * (y[slice0]+4*y[slice1]+y[slice2]),
ValueError: operands could not be broadcast together with shapes (19,) (9,) '''

#

Damn I just looked up code tags I'm still using the wrong tags.

#

As near as I can tell I'm giving it what the docs say it requires-- two equal-length arrays...

wintry olive Nov 28, 2020, 8:37 PM

#

hmmm idk.... seems like a reasonable place to start troubleshooting. aside from that what would the purpose of having equal shapes vs unequal shapes in your scenario? just a visualization preference?

tropic nest Nov 28, 2020, 8:38 PM

#

Well Simpson's method requires an even number of intervals. Otherwise the function falls back to trapz for one of the intervals.

#

So that's just a matter of cpu usage and precision

wintry olive Nov 28, 2020, 8:39 PM

#

ahh

#

oh yeah i see it now [distance = integrate.simps(y_speed, x=None, dx=x_interval, even = 'avg')]

tropic nest Nov 28, 2020, 8:46 PM

#

yeah my bad, I googled codetags for this discord but I guess they've changed since the post I read

wintry olive Nov 28, 2020, 8:47 PM

#

nah you're good. im learning and you know more than i do 🙂

tropic nest Nov 28, 2020, 8:47 PM

#

Hah maybe IDK 🙂

#

Think I found my mistake, apparently I need one dx per INTERVAL of y. So if I want to pass it a bunch of x,y series I'll need to convert each x,y pair into a x,[last_y, new_y] pair and then make arrays of those... yikes.

#

That seems so weird. I'll have to make a generator to pass it the same values twice to integrate x,y data...

wintry olive Nov 28, 2020, 8:51 PM

#

oh man thats not ideal

#

is that an iteration?

#

or copy/paste?

tropic nest Nov 28, 2020, 8:55 PM

#

Okay I guess I should have just pressed on. Already I've found out why. For numpy.trapz, I would pass the x,y series as trapz(y=y_speed, x=None, dx=x_interval) but for scipy.integrate.simps of the same arrays, it has to be simps(y=y_speed, x=x_interval)
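
A small sketch of the two calling styles, using made-up samples in place of the real speed log. As far as I can tell from the traceback, the shape error comes from `dx` being multiplied against every second element of `y`, so `dx` is meant to be a single scalar spacing and an array of sample positions belongs in `x`. I'm assuming a SciPy release where the function is available as `simpson` (the newer name for `simps`) and where `trapezoid` exists:

```python
import numpy as np
from scipy import integrate

# Hypothetical samples standing in for the speed log: speed = t**2 on [0, 2].
t = np.linspace(0.0, 2.0, 19)          # 19 sample times -> 18 intervals
speed = t ** 2

# Sample positions go in x ...
dist_simpson = integrate.simpson(speed, x=t)

# ... or, for evenly spaced data, one scalar spacing goes in dx.
dist_trapz = integrate.trapezoid(speed, dx=t[1] - t[0])
```

With an even number of intervals Simpson's rule integrates a quadratic exactly, so `dist_simpson` lands on 8/3 while the trapezoid result is slightly above it.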

wintry olive Nov 28, 2020, 8:55 PM

#

ahh

tropic nest Nov 28, 2020, 8:55 PM

#

I guess I expected scipy to use the same syntax there

#

The function is for data read via MODBUS serial i/o from a BAC8000, a 56-kilowatt motor controller. I'm developing a display/controller for other ebike hotrodders 🙂

wintry olive Nov 28, 2020, 8:56 PM

#

niice

terse dove Nov 28, 2020, 8:58 PM

#

guys I have a quick question. if I have about 5% of the data is missing from a column, do I replace the missing with the most frequent or the mean?

tropic nest Nov 28, 2020, 8:59 PM

#

is it time series?

terse dove Nov 28, 2020, 8:59 PM

#

no

tropic nest Nov 28, 2020, 9:00 PM

#

then that's sort of hard to answer without knowing what you want to do. Why do you need to replace them?

terse dove Nov 28, 2020, 9:00 PM

#

10000 data, 500 missing

tropic nest Nov 28, 2020, 9:00 PM

#

replacing with means would not affect the mean, whilst the median could actually shift the distribution and affect normality, etc

terse dove Nov 28, 2020, 9:00 PM

#

i want to replace the categorical variables with the most common

tropic nest Nov 28, 2020, 9:01 PM

#

well you'll have to do most frequent then, right? you can average categorical variables, e.g. 'control', 'treatment_x', etc?

#

can't*
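
A minimal pandas sketch of most-frequent imputation. The `group` column and its values are made up, and I'm assuming the fill value is taken from the training data and then reused on the test data, since test rows can't be dropped:

```python
import pandas as pd

# Hypothetical train/test frames; 'group' and its values are invented.
train = pd.DataFrame({"group": ["control", "treatment_x", None, "control", None]})
test = pd.DataFrame({"group": [None, "treatment_x"]})

# mode() ignores missing values and returns the most frequent value(s);
# take the first one as the fill value.
most_frequent = train["group"].mode()[0]

train["group"] = train["group"].fillna(most_frequent)
test["group"] = test["group"].fillna(most_frequent)   # no test rows are dropped
```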

terse dove Nov 28, 2020, 9:02 PM

#

ye

#

yea

#

the most frequent

#

but isn't it better than deleting an entire row?

tropic nest Nov 28, 2020, 9:02 PM

#

I absolutely don't want this to sound mean, but... how can you not know a categorical variable?

#

I would delete the row

terse dove Nov 28, 2020, 9:03 PM

#

it is a project

tropic nest Nov 28, 2020, 9:03 PM

#

shouldn't categorical variables of your samples all be defined before you even get any data?

terse dove Nov 28, 2020, 9:04 PM

#

i know nothing about the data

#

just a machine learning class

#

get data, get classifier, train and build a model. here is the test data and do the prediction

#

and u can

#

and u can't delete rows in the test data

#

that was what we were told

tropic nest Nov 28, 2020, 9:06 PM

#

Hmm I know nothing about machine learning. I'm a researcher by trade.

#

Are you trying to train the model from data/classifier then assign classifier based on data?

terse dove Nov 28, 2020, 9:07 PM

#

that explains the difference in the perspectives

#

yea

tropic nest Nov 28, 2020, 9:07 PM

#

Okay so then shouldn't you assign classifiers not based on the most frequent classifier, but by the classifiers most frequently associated with data of that value?

#

Or near that value?

#

I couldn't tell you how to implement that, but just picking the most frequent classifiers in all the data is probably not what your professor is looking for.

terse dove Nov 28, 2020, 9:09 PM

#

well, he is not telling us exactly what he is looking for. basically 20 variables, 3 variables have some missing data (5%)

#

so i don't know

#

5% doesn't seem that much of bias or what not

#

i don't know

#

i just want to get rid of this shitty project and sleep

tropic nest Nov 28, 2020, 9:10 PM

#

lollll

terse dove Nov 28, 2020, 9:10 PM

#

i am tired of this semester

#

my back hurts

#

my mind is telling me to escape the shit out of here

#

but i have to endure just few days

#

and i will be free

#

still i need to find a job, but at least i am not going to be stressed from shitty deadlines for a while

tropic nest Nov 28, 2020, 9:11 PM

#

Relatable, I worked all through this holiday. But don't take the shortcut-- I would bet big money that not doing ML and taking the most common values is not what you're supposed to do.

terse dove Nov 28, 2020, 9:11 PM

#

so what should i do for the test data, where we are not allowed to delete the rows that have missing data?

tropic nest Nov 28, 2020, 9:13 PM

#

You need to use ML-- for each value missing a classifier, find other values nearest that value that have classifiers, and look at the frequency of classifiers for those values. Pick the most frequent classifier

terse dove Nov 28, 2020, 9:13 PM

#

u know that i need to build a classifier, a classifier was not given to me

tropic nest Nov 28, 2020, 9:14 PM

#

Right. So to build a classifier from the values, take each value missing a classifier... find other values nearest that value that DO have a classifier, they may be mixed... pick the most common one, and assign that.

terse dove Nov 28, 2020, 9:15 PM

#

nearest neighbor

#

oh, i would love to know how to do that 😄

tropic nest Nov 28, 2020, 9:17 PM

#

Yeah. Probably the ideal way to do it, is to write a loop that for every missing point, calculates an error term (point_without_classifier - point_with_classifier) and then if the error term is below a certain constant that you set by testing the output, check the classifier with if: statements and increment counters to get the frequency of each classifier. Then, pick the classifier with the highest frequency and assign that to the point with the missing classifier

#

So for every single point with missing categoricals, you'll be iterating over the WHOLE dataset with categoricals

#

But only using some of that dataset with categoricals
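
Roughly what that loop could look like in plain Python, as a sketch under simple assumptions: one numeric value per point, a hand-picked distance threshold, and invented data. The `impute_label` helper is hypothetical, and a `Counter` stands in for the if-statements and counters:

```python
from collections import Counter

def impute_label(value, labelled, max_distance):
    """Most common label among labelled points within max_distance of value."""
    counts = Counter(
        label for other, label in labelled
        if abs(value - other) <= max_distance
    )
    if not counts:
        return None  # no labelled point is close enough
    return counts.most_common(1)[0][0]

# Hypothetical (value, label) pairs with known labels.
labelled = [(1.0, "a"), (1.2, "a"), (1.3, "b"), (5.0, "b"), (5.1, "b")]
# Values whose label is missing.
missing = [1.1, 5.05, 9.0]

filled = [impute_label(v, labelled, max_distance=0.5) for v in missing]
```

The last value has no labelled neighbour inside the threshold, so it stays unlabelled rather than getting a guess.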

terse dove Nov 28, 2020, 9:19 PM

#

i am sorry but i will be honest with u, I don't want to get any more headache with this project as i literally have a final exam on 30th of november, i would rather get less marks in the project than getting more and more headaches from it. I need to study for the final, otherwise it is ... well u know the drill.

#

it is just that i made a promise with myself that i will practice python more on my own after i finish this semester, no deadline stresses

tropic nest Nov 28, 2020, 9:20 PM

#

Good luck!

terse dove Nov 28, 2020, 9:21 PM

#

thanks. i am really hating sitting down and staring blankly at my screen 😄

vague portal Nov 28, 2020, 9:24 PM

#

wintry olive i got something for you @vague portal https://optool.co/

Hmm I'm looking for a resource that can teach me how to make my own pipeline from scratch. Right now, I have some data cleaning functions. I want to wrap these into classes so I can add each class into a pipeline. Using classes lets me be flexible in choosing which classes I want to be added into my pipeline

#

Each class will clean and change my dataframe
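
One possible shape for that, as a sketch only. The step classes, the `transform` method name, and the sample data are all assumptions for illustration:

```python
import pandas as pd

class DropDuplicates:
    def transform(self, df):
        return df.drop_duplicates()

class FillMissing:
    def __init__(self, column, value):
        self.column = column
        self.value = value

    def transform(self, df):
        # assign() returns a new frame, so the input is left untouched
        return df.assign(**{self.column: df[self.column].fillna(self.value)})

class Pipeline:
    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, df):
        for step in self.steps:
            df = step.transform(df)   # each step returns a new frame
        return df

# Hypothetical raw data with one duplicate row and one missing city.
raw = pd.DataFrame({"city": ["Oslo", "Oslo", None], "n": [1, 1, 2]})
clean = Pipeline([DropDuplicates(), FillMissing("city", "unknown")]).run(raw)
```

Because every step shares the same `transform` method, steps can be added, removed, or reordered just by changing the list passed to `Pipeline`.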

wintry olive Nov 28, 2020, 10:11 PM

#

I like the sound of that especially regarding the focus on class objects. Try to focus on scripts too while you're at it. I can show you an example of code written by an author with mastery of code and concept when I get back to my 💻, or google PanelOLS and look at the source code. you will need to code at that level for sure. Keep at it and don't give up, or try to go at it from a different angle: start on json and get into Apache projects like druid or meta model if you want to engineer at a comp sci level

obtuse skiff Nov 28, 2020, 11:49 PM

#

Hello, can someone pls explain something to me. I have a basic CNN, similar to the LeNet-5 architecture, and I'm testing various loss functions.
Why is it that when I have softmax at the end of the architecture it converges MUCH faster with MSELoss and slowly with CrossEntropyLoss, but when I remove the final softmax, MSELoss doesn't converge and CrossEntropyLoss converges really fast?

Same with L1 Loss that performs much better without softmax.

trim oar Nov 29, 2020, 12:24 AM

#

5% is a lot of bias you're introducing

#

Although it depends on the subject matter obviously, anything above 3-4% is already much. If I could drop it, I'd drop it. Otherwise, usually .median() is better than .mean() or .mode() for not messing around with the distribution much.

trim oar Nov 29, 2020, 12:27 AM

#

terse dove well, he is not telling us exactly what he is looking for. basically 20 variable...

But did you mean 5% as combined or each have 5%?

wintry olive Nov 29, 2020, 12:45 AM

#

have a learning platform to share. this one looks gamified too. https://leetcode.com/ theres also a job placement service that may or may not work but the learning side with the gamified experience added to it looks top notch

LeetCode - The World's Leading Online Programming Learning Platform

Level up your coding skills and quickly land a job. This is the best place to expand your knowledge and get prepared for your next interview.

lapis sequoia Nov 29, 2020, 1:04 AM

#

guys, could u help me build a cnn for image classification (like 100 classes) with one of these: not too much training data or not too much validation data

#

cuz sadly i haven't found many images for my data set

wintry olive Nov 29, 2020, 1:34 AM

#

i haven't looked into computer vision as much as I have NLP. there was one model that caught my eye considering it was similar to what i was researching at the time. I haven't been able to relocate it or check back in with its progress tho. essentially they were using transformer models, tokenizers, masks, and auto-suggest-like functions on images. somehow they were able to take raster scan lines of pixels as if they were sequences of "pixel words" and run that through a custom made transformer model.

#

"pixel word" should be the keyword to find it easily with a search query

silent current Nov 29, 2020, 1:39 AM

#

anybody able to help me with Pandas stuff in #help-apple ?

tender bison Nov 29, 2020, 2:04 AM

#

anyone know anything about audio visualization?

#

I need to take an audio file, cut into N chunks and return a timeline based on the intensity of the given chunk

#

the timeline size is fixed so I need to be able to get chunks of arbitrary size, the idea I'm working on atm is something like chunk = (len(audio) / len(timeline) )

#

chunk_intensity = chunk.total() / len(chunk)

#

then I draw the timeline out by the intensity
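
A NumPy sketch of that chunking idea, assuming the audio is already loaded as a 1-D array of samples with at least as many samples as timeline slots. The `intensity_timeline` helper and the test signal are invented, and mean absolute amplitude is used as the intensity measure:

```python
import numpy as np

def intensity_timeline(samples, n_bins):
    """Mean absolute amplitude of each of n_bins roughly equal chunks."""
    magnitudes = np.abs(np.asarray(samples, dtype=float))
    chunks = np.array_split(magnitudes, n_bins)   # handles uneven lengths
    return np.array([chunk.mean() for chunk in chunks])

# Hypothetical signal: quiet first half, loud second half.
samples = np.concatenate([0.1 * np.ones(500), -0.8 * np.ones(500)])
timeline = intensity_timeline(samples, n_bins=4)
```

Taking the absolute value first matters because raw audio samples swing around zero, so a plain average of a loud chunk would come out close to zero.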

#

google is not being friendly to me atm

#

nvm found a visualization program that suits my needs

#

https://vane.pl/sampling-audio-files-with-python/

(.)(.) dev thoughts …

Sampling audio files with python

To be quick - for recent idea I need to segment audio file, perfectly after one speaker finish his sentence and second speaker start talking. Also I need to detect number of unique speakers in audio.

#

in case anyone else wants to know

gray sedge Nov 29, 2020, 4:28 AM

#

Hi everyone

#

Can someone tell me what I need to do on this Y axis? (pandas & matplotlib)

📎 unknown.png

velvet thorn Nov 29, 2020, 5:16 AM

#

@gray sedge it's not a number

#

you need to make it a number

gray sedge Nov 29, 2020, 5:16 AM

#

I'm following this much now 🙂 Just need to work one more bug out

#

Those values that are showing up as '-' are actually '-' in the page it's being pulled from, so they're not showing as NaN

#

But, I need to convert that to a number for this to work

#

Which it doesn't like, showing "could not convert string to float"

#

the second df.replace isn't doing anything :\

📎 unknown.png

#

Here's the whole thing, actually

📎 unknown.png

#

& output

📎 unknown.png

gray sedge Nov 29, 2020, 5:54 AM

#

Oo nevermind, I had to go to the webpage, inspect, copy the character, and insert it. Apparently the key wasn't good enough for it.
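
One plausible cause is that the page used a Unicode dash rather than the keyboard hyphen, though the screenshots don't confirm which character it was. A sketch that avoids having to match the exact placeholder, using made-up values:

```python
import pandas as pd

# Hypothetical scraped column; "\u2013" is an en dash, not the keyboard "-".
raw = pd.Series(["12.5", "\u2013", "7", "-"])

# errors="coerce" turns anything that does not parse as a number into NaN,
# whichever dash character the page happened to use.
values = pd.to_numeric(raw, errors="coerce")
```

After this the column is numeric and the placeholders are real NaN values that matplotlib skips when plotting.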

lapis sequoia Nov 29, 2020, 6:04 AM

#

https://cdn.discordapp.com/attachments/291284109232308226/782487050455613460/discord.py

terse dove Nov 29, 2020, 6:42 AM

#

@trim oar each variable has 5% missing. (3 variables have missing data out of 20)

trim oar Nov 29, 2020, 6:43 AM

#

terse dove @trim oar each variable has 5% missing. (3 variables have missing d...

It’s your call and you probably imputed data already. Just the number isn’t small.

terse dove Nov 29, 2020, 6:45 AM

#

i read an article that u can use neural network to impute the missing data

#

i was in the bathroom before going to sleep, so i don't remember much of it