#data-science-and-ml


steady basalt
#

starting from the bottom..

#

how far to work until?

long locust
#

What do you mean?

steady basalt
#

calc 3?

#

dont rly need 4 right if 3 applies to vectors

long locust
#

What?

steady basalt
#

stop there?

long locust
#

I'm so confused

steady basalt
#

im saying is it enough to learn up until vectors in calc 3 and stop

long locust
#

Calc 3 was multivariable calculus at my university

wooden sail
#

keep in mind the names of courses vary from uni to uni, as does their content, so saying "calc 3" doesn't mean much

steady basalt
#

oh here its 1: differential 2: integral 3: 3d or vectors

wooden sail
#

vector and multivariate calc is what you'll need

steady basalt
#

so in my case its calc 1 , 2 and half of 3

#

its actually an insanely big topic

#

and very hard on the brain at some parts

long locust
#

Interesting. Calc 1 for me was differentiation + integration, Calc 2 was integration by parts and trig substitution, Calc 3 was multivariable. Differential Equations and Linear Algebra were their own courses

wooden sail
#

well, saying "vectors" doesn't mean they'll learn linalg in there... hopefully 😛

#

probably just gradients, jacobians, hessians, line integrals and vector fields

steady basalt
#

dammit i got a beginner question wrong

#

you don't get to multiply the integer by the root power, keeping the power inside the root?

wooden sail
#

hmm?

steady basalt
#

basically i tried to do 10/5 x^3

#

if that makes sense

wooden sail
#

if you don't merge the 3 with the 1/5, for example, you need to use the chain rule with the power rule

#

if you do, then you just need the power rule

#

you get the same result

steady basalt
wooden sail
#

the scalar 10 doesn't participate in the derivative, since differentiation is linear

steady basalt
#

its first question of its kind i just used methods prior

#

so i looked at the 10

#

do you mean you only look at the closest transformation of x?

wooden sail
#

what?

#

d(c f(x))/dx = c d(f(x))/dx

steady basalt
#

i saw it was 10* 1/5 x^3

wooden sail
#

what is 10 * 1/5 x^3?

steady basalt
#

10(5rootx^3) = 10(x^3 ^1/5)?

wooden sail
#

use parentheses

steady basalt
#

let me rewrite it lol

wooden sail
#

if you know latex, use that

#

.latex 10 (x^3)^{1/5}

strange elbowBOT
steady basalt
#

so basically, you do the root stuff first and then the 10 later, instead of the 10 first by the root power

wooden sail
#

the 10 is not affected by any power there

#

there's no parenthesis

steady basalt
#

to rephrase then, looking at a 5th root of x something

#

i looked at that and thought well thats 1/5 of that value

#

so i did the power rule with the 1/5 and 10 first

#

instead of 3

wooden sail
#

what???

steady basalt
#

10(x^3)^1/5

wooden sail
#

mhm

steady basalt
#

its NOT 10 * 1/5 im a moron, its 10x^3/5

#

?

wooden sail
#

i have no idea where you're getting multiplication there, it's clearly saying "take the 5th root"

steady basalt
#

power rule

wooden sail
#

and it has nothing to do with the 10

steady basalt
#

for example if we had 4x^5, id say 20x... in this case ^4

#

i tried to apply this logic

wooden sail
#

ah, you want to use the chain rule, you mean? because you're not saying anything nor using symbols, so i have no way of understanding what you're trying to say if you don't use words 😛

steady basalt
#

is this chain rule?

wooden sail
#

yes, the first step of the chain rule there will give that result. but you have g(f(x)), where g is exponentiation to the 1/5

#

and f is raising to the 3rd power

#

so you need the chain rule

steady basalt
#

so its the chain rule even if ur just doing a simple two step power rule on one x

wooden sail
#

it's chain rule when you compose two functions

steady basalt
#

i thought u can just multiply 10 by 1/5

wooden sail
#

that you can do

steady basalt
#

i tried and it was wrong

#

that gave me 2 x ^ 3

wooden sail
#

the parser is probably wrong

steady basalt
#

which then gave me 6 x ^ 2

#

its paper

wooden sail
#

that sounds correct to me

steady basalt
#

for this term, the solution is this

#

so...

wooden sail
steady basalt
#

so i got 6x^2 instead of 6x^2/5

wooden sail
#

yeah the issue was not with the 10/5 😛

#

but rather with differentiating the 1/5 power

steady basalt
#

but i thought we already got rid of that by multiply 1/5 by 10

#

and - 1 off the power i did forget to do

wooden sail
#

that's what you got wrong, yes

steady basalt
#

ohhhhhh

#

lol

#

!

#

basically what went wrong was i was under the impression that you only minus one off the power when its the power closest to x, for some reason

wooden sail
#

that's an odd assumption to make

steady basalt
#

this is not the chain rule though right? it seems simple power rule

#

oh but in a way, i suppose it is

wooden sail
#

chain rule because you do the power rule twice

#

you have composition of power functions

steady basalt
#

its sort of like saying d x^3 / something

#

if you do power rule twice, why is it dy and not d2y

wooden sail
#

because chain rule

#

d g(f(x)) / dx = g'(f(x)) f'(x)

steady basalt
wooden sail
#

10 d (x^3)^(1/5) / dx= 10 [(1/5) (x^3)^(-4/5)] * [3(x^2)]

#

the first [] is g'(f(x)), the second is f'(x)

#

or maybe i did that wrong, i'm in a meeting rn

#

check my arithmetic with the exponents
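
The exponent arithmetic above can be checked numerically. This is a minimal sketch, not part of the original exchange: it compares the chain-rule expression, the merged power-rule form 6 x^(-2/5), and a finite-difference estimate. The function names are made up for illustration.

```python
def original(x):
    """The function from the exercise: 10 * (x^3)^(1/5)."""
    return 10 * (x ** 3) ** (1 / 5)


def chain_rule_derivative(x):
    """10 * g'(f(x)) * f'(x) with g(u) = u^(1/5) and f(x) = x^3."""
    return 10 * ((1 / 5) * (x ** 3) ** (-4 / 5)) * (3 * x ** 2)


def merged_power_derivative(x):
    """Power rule applied once to the merged form 10 * x^(3/5)."""
    return 6 * x ** (-2 / 5)


def central_difference(func, x, h=1e-6):
    """Numerical derivative, used only as an independent check."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (0.5, 1.0, 2.0, 7.0):
    print(x, chain_rule_derivative(x), merged_power_derivative(x),
          central_difference(original, x))
```

For positive x all three columns should agree up to rounding, which matches the point that merging the exponents first or applying the chain rule gives the same result.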

viral garnet
#

Hey, I'm running a yolov3 script for detection but getting this error.. can anyone help out?

Traceback (most recent call last):
  File "number_plate.py", line 34, in <module>
    ln = ln[i[0] - 1]
IndexError: invalid index to scalar variable.

This is the line of code

LABELS = open(LABELS_FILE).read().strip().split("\n")

np.random.seed(4)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3),
    dtype="uint8")

net = cv2.dnn.readNetFromDarknet(CONFIG_FILE, WEIGHTS_FILE)

image = cv2.imread(INPUT_FILE)
(H, W) = image.shape[:2]

ln = net.getLayerNames()
ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]

What am I doing wrong here. If I remove [0], the code runs but doesn't detect anything.
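
One possible reading of the traceback, offered as an assumption: `net.getUnconnectedOutLayers()` returns nested one-element entries in some OpenCV versions and a flat array of integers in others, so `i[0]` fails when `i` is already a scalar. The sketch below uses a hypothetical helper, `output_layer_names`, that accepts either shape. It only addresses the `IndexError`; it does not explain why nothing is detected afterwards.

```python
def output_layer_names(layer_names, out_layer_ids):
    """Map 1-based output-layer ids to layer names.

    Accepts nested one-element entries ([[2], [3]]) as well as
    flat integers ([2, 3]).
    """
    names = []
    for entry in out_layer_ids:
        # A one-element list/array has __len__; a plain or NumPy integer does not.
        idx = entry[0] if hasattr(entry, "__len__") else entry
        names.append(layer_names[int(idx) - 1])
    return names


# Hypothetical stand-ins for net.getLayerNames() and net.getUnconnectedOutLayers()
layer_names = ["conv_0", "yolo_82", "yolo_94", "yolo_106"]
print(output_layer_names(layer_names, [[2], [3], [4]]))
print(output_layer_names(layer_names, [2, 3, 4]))
```

In the script above this would replace the list comprehension, for example `ln = output_layer_names(net.getLayerNames(), net.getUnconnectedOutLayers())`.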

brazen aurora
#

I'm trying to read this machine learning book and am getting the feeling that I need a statistics crash course first ... I'm correct about that right? The code doesn't seem crazy ( in beginning ) but I've already seen quite a few stats terms that I don't fully understand

novel python
#

Statistics will help your understanding of a variety of topics in machine learning. Although not required to apply the methods, you will find yourself more efficient in optimizing or trying to find what to tune or add if you understand the math behind

#

you don't need to understand EVERYTHING, otherwise you'd need a whole bachelors and PhD in Statistics

#

but it's helpful to learn the basics

steady basalt
#

Stats is awesome start with medical statistics imo

#

How different tests work… how logistic regression works

brazen aurora
#

Ok got it - I can definitely understand the super basic stuff of probability/conditional probability and some of the notation is familiar from calc. but when stuff like normal distribution / weighted sums / weighted regression gets introduced I only understand it on a very shallow level. I'm gonna look into a primer on stats for comp sci and go back at it. Thank you !

steady basalt
#

I teach you applied stats and you teach me calc!

#

I’m struggling with calc...

brazen aurora
#

haha sounds good! for real tho which calc are you in? Have you heard of the professor leonard videos? I would come home absolutely lost from what my professor said until watching those videos and they just cracked it open so easily for me.

grizzled bane
#

Guys is there any way to do data entry on a mobile app? (especially if the app does not have an API available)

ocean swallow
#

But basics will benefit you very much and definitely help you debug certain things (no loss, exploding gradients, unstable convergence)

#

For production settings you won't be able to do much other than just implementing ready-made models to your application, with no further improvement if you don't learn a lot about the frameworks and stats.

lapis sequoia
brazen aurora
#

Right on - so the deeper I go into it the more I can leave to abstraction. I'm just confused at the basics so I think I will get to know some of the vocab and ML related stats math before I get back to the ML book. Thank you much!

little bronze
tacit basin
solar yew
#

How do we go about choosing the correct model seeing as with the correct parameters a model can perform so much better?

#

by just picking several models that fit the use case and then performing tuning on them?

barren snow
#

Hey, this is a Gaussian mixture model. But a little strange here.
Is there any idea what kind of occurrence could cause an odd binary between -1 and -0.5 to occur?

exotic pine
#

hey anyone help me on ai proctoring

rich gull
#

I have many rows like this (Player, Year, Points). Like LeBron in the pic there is data for many years corresponding to each player.
How would I go about getting a linear regression coefficient per player for Year/Points? I'm not looking for a plot just something like
LeBron James, 0.88
Kevin Durant, 0.72

serene scaffold
#

@rich gull if you have a function that can calculate that value given a sequence of numbers, you can do a group by player/year and do it.
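
A minimal sketch of the group-then-fit idea, using only the standard library. It assumes "coefficient" means the least-squares slope of Points against Year, and the rows are made up for illustration. With a pandas DataFrame the same grouping would typically be a `groupby` on the player column.

```python
from collections import defaultdict


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_per_player(rows):
    """rows: iterable of (player, year, points) -> {player: slope}."""
    grouped = defaultdict(lambda: ([], []))
    for player, year, points in rows:
        years, pts = grouped[player]
        years.append(year)
        pts.append(points)
    return {player: slope(years, pts) for player, (years, pts) in grouped.items()}


# Made-up rows, purely for illustration (not real statistics)
rows = [
    ("Player A", 2010, 20.0), ("Player A", 2011, 22.0), ("Player A", 2012, 24.0),
    ("Player B", 2010, 30.0), ("Player B", 2011, 29.0), ("Player B", 2012, 28.0),
]
for player, coef in slope_per_player(rows).items():
    print(f"{player}, {coef:.2f}")
```

Each player needs at least two distinct years, otherwise the slope is undefined and the division fails.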

lapis sequoia
#

Am using Gauss-Newton method for data fitting, anybody free to maybe check code? Getting everything I need from it but would love a second set of eyes!

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold
#

I'm not volunteering to do a code review per se, but you're more likely to get one if the code to be reviewed is readily available.

lapis sequoia
#

I have it in a jupyter notebook, is there a way of sharing in that format here? Or will I just paste the code

serene scaffold
lapis sequoia
serene scaffold
#

I can probably only make suggestions about how to make it more pythonic.

lapis sequoia
#

yes please if possible!

#

I would very much appreciate that

serene scaffold
#

for example, your ri function should be one statement.

def ri(x_list,alp,bet,gamma):
    return np.array([np.exp(-1 * gamma * x) - alp * np.exp(-1 * bet * x ** 2) for x in x_list])
#

you can make this code a lot less verbose in general by using list comprehensions, as I've done here.

#

like with your get_multi_jacobian function

#

as for ri_multi

        for i in range(len(alps)):
            expo_sum -= alps[i]*np.exp(-1*bets[i]*(x**2))

are alps and bets lists or (numpy) arrays?

#

!zip

arctic wedgeBOT
#

The zip function allows you to iterate through multiple iterables simultaneously. It joins the iterables together, almost like a zipper, so that each new element is a tuple with one element from each iterable.

letters = 'abc'
numbers = [1, 2, 3]
# list(zip(letters, numbers)) --> [('a', 1), ('b', 2), ('c', 3)]
for letter, number in zip(letters, numbers):
    print(letter, number)

The zip() iterator is exhausted after the length of the shortest iterable is exceeded. If you would like to retain the other values, consider using itertools.zip_longest.

For more information on zip, please refer to the official documentation.
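
Applied to the `ri_multi` loop quoted earlier, `zip` might look like the sketch below. The rest of `ri_multi` is not shown in the chat, so the starting value `exp(-gamma * x)` is an assumption borrowed from the single-term `ri` function.

```python
import math


def ri_multi(x, alps, bets, gamma):
    """Residual at one x: exp(-gamma*x) minus a sum of alp*exp(-bet*x^2) terms."""
    expo_sum = math.exp(-gamma * x)  # assumed starting value, see note above
    # zip pairs alps[i] with bets[i], replacing the range(len(alps)) index loop
    for alp, bet in zip(alps, bets):
        expo_sum -= alp * math.exp(-bet * x ** 2)
    return expo_sum


print(ri_multi(1.0, [0.5, 0.25], [1.0, 2.0], gamma=0.3))
```

If `alps` and `bets` are NumPy arrays, the loop could instead be written as a single vectorised expression, which is why the list-versus-array question matters.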

lapis sequoia
serene scaffold
#

!e

import numpy as np
N = 5
alphas = 1 / np.arange(N)
print(alphas)
betas = 3.0 ** np.arange(N)
print(betas)
arctic wedgeBOT
#

@serene scaffold :white_check_mark: Your eval job has completed with return code 0.

001 | <string>:3: RuntimeWarning: divide by zero encountered in divide
002 | [       inf 1.         0.5        0.33333333 0.25      ]
003 | [ 1.  3.  9. 27. 81.]
lapis sequoia
lapis sequoia
steady basalt
#

😂

#

The real innovation is deep learning tbh

#

This is where all the calculus and algebra is required.. typical ML data science is just stats reborn 2.0

#

Think the only exceptional stand out development was xgboost

#

And other boosting algos

lapis sequoia
#

Can't seem to spot my issue

tacit basin
# lapis sequoia Having a small syntax issue (line 359) now I think after changing some of these ...

Is this while loop ever executed?

def solve_system_via_iteration(x_list,input_alp,input_bet,gamma):
    alp=input_alp
    bet=input_bet
    
    error_alp=1.0
    error_bet=1.0
    
    while error_alp<= 0.001 and error_bet<= 0.001:
        count += 1 
        ri_list=ri(x_list, alp,bet,gamma)
        jacobian=get_jacobian(x_list,alp,bet)
        lu,p=linalg.lu_factor(jacobian)
        b=-ri_list
        x=linalg.lu_solve((lu,p),b)
        error_alp,error_bet=x[0],x[1]
        alp+=error_alp
        bet +=error_bet
        
    return alp,bet
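
With `error_alp` and `error_bet` starting at 1.0, the condition `error_alp<= 0.001 and error_bet<= 0.001` is already false on the first check, so the body never runs. `count` is also incremented without being initialised. A hedged sketch of the loop shape that was probably intended is shown below. `iterate_until_converged` and `newton_step` are hypothetical names, and the one-parameter Newton step only stands in for the Gauss-Newton update.

```python
def iterate_until_converged(step_fn, params, tol=1e-3, max_iter=100):
    """Apply updates from step_fn until every update is smaller than tol."""
    params = list(params)
    count = 0  # initialised before the loop increments it
    # Start with large errors so the loop body runs at least once
    errors = [float("inf")] * len(params)
    # Keep going while ANY update is still large (note > and abs, not <=)
    while max(abs(e) for e in errors) > tol and count < max_iter:
        count += 1
        errors = step_fn(params)
        params = [p + e for p, e in zip(params, errors)]
    return params, count


# Hypothetical one-parameter example: Newton step for x**2 - 2 = 0
def newton_step(params):
    x = params[0]
    return [-(x * x - 2) / (2 * x)]


solution, iterations = iterate_until_converged(newton_step, [1.0])
print(solution, iterations)
```

The `max_iter` cap is an addition, so that a diverging iteration still terminates.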
gloomy glen
#

Can someone please guide me how to extract key value pair from scanned invoices using LayoutLM model from Huggingface.

vestal spruce
#

can anyone help me with this, I tried using background subtraction but some of the image isn't exactly the same so the subtraction process is flawed with imperfect results

#

so what I'm trying to achieve is making a bullet hole detection which I manage to achieve in the 4th image (right-most)

steady basalt
#

You managed to achieve it but you’re trying to achieve it?

#

Have you tried labelling with masks to segment them so you don’t need to do any of that

#

So what you do is take ur images and by hand label where the bullet holes are

#

Make the entire image black except for the holes

#

For a day or two until u have thousands

#

And any cnn should easily manage

wooden sail
#

that's ok, but there's no need to do that by hand

#

you can generate synthetic data sets where the hole number, location, shape, and size all vary randomly within sensible ranges, and where the crosshair also changes, by adding e.g. noise to a template of its basic shape

#

then you immediately know the ground truth and can generate examples on the fly without having to store them
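
A minimal sketch of such a generator, using only the standard library. It draws a random number of circular holes on an empty grid and returns the ground truth alongside the image. The crosshair template and noise mentioned above are left out, and all names and ranges are assumptions.

```python
import random


def make_synthetic_target(size=64, max_holes=5, radius_range=(2, 4), rng=None):
    """Return (image, holes): a size x size grid of 0/1 and the ground truth."""
    rng = rng or random.Random()
    image = [[0] * size for _ in range(size)]
    holes = []
    for _ in range(rng.randint(1, max_holes)):
        r = rng.randint(*radius_range)
        # Keep the whole disc inside the image
        cx = rng.randint(r, size - 1 - r)
        cy = rng.randint(r, size - 1 - r)
        holes.append((cx, cy, r))
        # Paint a filled disc for the hole
        for y in range(cy - r, cy + r + 1):
            for x in range(cx - r, cx + r + 1):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    image[y][x] = 1
    return image, holes


image, holes = make_synthetic_target(rng=random.Random(0))
print(holes)
```

Because each call produces a fresh labelled example, samples can be generated on the fly during training instead of being stored.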

misty flint
#

i would apply some traditional image processing techniques like edge detection methods before putting it into a neural network; that would vastly improve performance

vestal spruce
#

not just a single case

#

sorry for the late response and for not clearly explaining my issue

#

but I think I figured out a way to solve it

wooden sail
#

is the detection done with networks or classical methods?

vestal spruce
wooden sail
#

all righty

#

one thing you can do is make a mask slightly larger than the crosshair and apply it to all images, followed by inpainting

vestal spruce
wooden sail
#

you can generate examples synthetically, the scenario is rather simple

steady basalt
wooden sail
steady basalt
#

Well they should

wooden sail
#

there's no need, this one can be done deterministically

#

looks like a very nice inverse problem practice

steady basalt
#

And in my experience it’s a hand label job I don’t know a method of generating similar images with edges as masks

#

I’m not sure what is better here than cnn

wooden sail
#

define "better"

#

you can make an asymptotically unbiased estimator for which you have performance guarantees

steady basalt
#

Ah hold on, because the targets are symmetrical you can just look at pixels

vestal spruce
#

All right thanks for the suggestion everyone I appreciate it.

steady basalt
#

Where they aren’t at the usual rings

vestal spruce
#

Imma go back to work on this

steady basalt
#

If you know which pixels in every target are rings u can just see what pixels are there that shudnt be

#

But it’s more robust to cnn because you might be dealing with irregular photos or whatever

wooden sail
#

cnns aren't the only image processing method there is. not only are there tons of other things you could do, if you're familiar with other methods, you can also integrate them into your network as hybrid methods to get better performance

steady basalt
#

What would you do here, personally

wooden sail
#

i already explained above

steady basalt
#

Sometimes hand labelling is unavoidable

wooden sail
#

probably two or one step sparse recovery, no deep learning

steady basalt
#

You mean where the pattern of pixels is interrupted

#

You have rings that are technically constant

wooden sail
#

that they aren't constant is the reason their subtraction method doesn't work

steady basalt
#

How does sparse recovery work for this

#

I’m not familiar w it

wooden sail
#

the 2d image is expected to have very few parameters. if you approximately represent the holes as circles of an average radius, each circle is parameterized by its x,y coordinates

#

then there are 2x as many parameters as there are holes

#

but you have however many pixels there are in the image as data inputs

steady basalt
#

So you’d need a target without bullet holes first

wooden sail
#

if the number of parameters is much smaller than the number of inputs and the model is well structured, then you can recover the coordinates with good guarantees even if the number of samples is decreased to close to the number of parameters

spare briar
#

since they have strong landmarks like rings it might be worth considering image registration

wooden sail
#

then what you can do is make a mask of where the crosshair is and ignore those pixels entirely, regardless of their value. registration may or may not be needed, depending on how well the crosshairs match from image to image already

steady basalt
#

cnn would also work

wooden sail
# steady basalt How does sparse recovery work for this

in particular, there exist cases where you can hit the trivial lower bound. say there are 5 circles in the image, each with an x,y coord. then in optimal conditions, you can use exactly 10 pixels from the image to find where all the holes are if you use the correct domain

wooden sail
# steady basalt cnn would also work

yes, ofc, cnn will work if you can generate enough examples, synthetic or otherwise. but also just using a cnn is the same as not using any domain knowledge you might already have

#

that's brute force ML and has no merit

#

if you want to learn something yourself, you wouldn't do that

#

you can solve any problem that way with enough data, though

#

without ever bothering to understand what, why, or how

steady basalt
#

Is sparse recovery hard to code

wooden sail
#

pretty sure scikit learn and scipy have solvers for it already, so no. understanding it well, i guess yeah

#

lemme fish up the original paper

#
steady basalt
#

As someone who doesn’t know how to use this technique and I was given this task I’d simply have to use cnn as there is no time to learn “well” as you say

misty flint
# steady basalt What exaclty do you mean

exactly what i mean. there are textbooks on traditional image processing. and a combination of this and neural networks leads to vastly superior performance rather than one or the other alone.

wooden sail
#

ofc you would 😛

steady basalt
#

Whatever works

wooden sail
#

would you seriously spend months taking pictures and labeling them by hand for this?

steady basalt
#

I could get enough done in two days

#

🫣

wooden sail
#

you could make a synthetic data generator in a couple of hours and then use either ML or classical techniques and solve it in one go

steady basalt
#

True…

wooden sail
#

i don't think you could make thousands of images in just 2 days

steady basalt
#

Could

#

I’d say 5000 is more than enough to perfect it

#

Imagine with 2000 images it’s enough

#

Would suck ass tho

wooden sail
#

anyway, this is beside the point. ML and deep learning are not always the best solution, and black box ml is, well, what i'd expect from someone that doesn't know much stuff

steady basalt
#

Imagine 48 hrs of cropping by edge

#

Edd ur elite… will work for faang

misty flint
#

💀

steady basalt
#

I think u mistook me

wooden sail
#

you could get a phd and make a living out of classical image processing alone

steady basalt
#

Some basics, sure, not entire workflows that take a long time

#

Apache spark, yes

#

Automation yes

misty flint
wooden sail
#

indeed

steady basalt
#

If I were to do a PhD it would probably be on imaging

misty flint
#

but nowadays he combines it with ML

steady basalt
#

But there’s nothing novel I could come up with

misty flint
#

he taught our Feature Engineering class and Pattern Recognition class

wooden sail
#

nice

steady basalt
#

U guys have phds? Or masters

misty flint
#

yeah some solid traditional image processing techniques and then the latter combined these techniques with ML

wooden sail
#

i have a masters and am about half way into a phd

wooden sail
misty flint
#

oh nice!

#

it surprises me how much you can do with fourier transforms + ML

#

def yet another area of potential research

steady basalt
#

I know some guy who’s a bachelors math student but has done a lot of advanced deep learning projects for his age… must have coded since young how to compete with such talent

#

Imagine having such a GitHub portfolio at age 19!

#

Even wrote some impressive papers on Ml for his age

wooden sail
#

the whole github portfolio thing matters depending on what you're aiming for and where you live

#

papers are what talks here

steady basalt
#

I haven't been asked for my portfolio

#

Papers as in published? Or just written work

wooden sail
#

published

steady basalt
#

No one gets published before masters really

#

I may be a part of a publication soon but it’s in my opinion a very weak project

#

So not sure if it would help when it comes to people who care

wooden sail
#

the coding part, you can learn whenever you want. seems you're already on the wagon. as for doing cool stuff with ML, i'll die on the hill that you can do cooler stuff, and do it better, if you understand what you're doing better

#

you always see "domain knowledge" in descriptions of jobs and whatnot involving data analysis, ML, signal processing, etc

#

that means, you need to know math, you need to be familiar with several different techniques and how they work, and you need a base level of technical knowledge regarding the thing you will apply all of this to

steady basalt
#

Yeah I can code but I’m 24 haha

#

The hard part is

#

I have an interview next week at a tech company

#

And it’s gonna be DSA coding

#

Which is hard… not good at it as I only first learnt to code for my msc

#

I hope they don’t ask more than east

#

Easy

#

I’d choke on mediums

wooden sail
#

i guess the immediate question is, why do you apply to a job for which you know you don't meet the requirements 😛

steady basalt
#

Ehhh I do

#

I could do it

#

You don’t use DSA questions as part of the job

#

For example, you code but you’ll never need to do dfs or some shit on a tree

wooden sail
#

they're gonna ask you stuff to see if you understand what it means, i think people often stackoverflow during interviews

steady basalt
#

I can easily explain that but no

#

You have to code it

#

It’s hard

#

Ur not meant to SOF

#

Basically it’s like being given a leetcode question and asked to do it in front of the guy

#

And considering I didn’t rote memorise the logic behind mediums I’ll prob fail on the spot

#

But I’m definitely qualified for the job

#

It’s a grad job

#

Fingers crossed it’s arrays and strings and not dsa

wooden sail
#

best of luck

steady basalt
#

Thanks

short yoke
#

Hey guys, I have some news: I developed an advanced mathematical solver (console) like Maple, Matlab, WolframAlpha (its name is "Mathpath console"). But you know, if you are a student, you don't have money and you don't have support for it. My motivation for sending this message is to get feedback and introduce it to people, because my app is not known by people yet.

steady basalt
#

What language

lapis sequoia
#

I've cut most of the fat off the code but would love any help

serene scaffold
#

By the way, your code would be easier to read if you used spaces.

    Q, R = linalg.qr(jaco, mode='economic')
    b = -1 * np.array(ri_list)
    y = np.matmul(Q.T, b)
    x = np.linalg.solve(R, y)
wooden sail
#

also if you're working with matrices and vectors, Q.T.dot(y) does the same as matmul. that's just flavor though, i kinda prefer it

serene scaffold
#

you could also do y = Q.T @ b. the cpython devs changed the grammar of python specifically for that.

#

(which I don't think was necessary, but yay options?)

#

(also I can't continue complaining about that, or I'll get angry that we don't have function composition.)

wooden sail
#

function composition how?

wooden bobcat
#

How should i go about adding multiple objectives to a bin packing problem with or tools?
I have a list of items each with a weight, volume, vendor, customer and port. I am easily able to constrain the bins on weight and volume but i want to alter the objective from least amount of bins to least amount of cost. Cost for a bin with 1 vendor and 1 customer is lower than a bin with 3 vendors, 3 ports and 3 customers for example.

serene scaffold
wooden bobcat
#

I can paste my code but it is long

serene scaffold
arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold
wooden sail
#

aha, that's what you mean. well... don't forget that multiplication is composition when working with matrices 😛 but i see what you mean. at that point they may as well integrate this in general instead of just for matrix products though

wooden bobcat
#

Okay thanks. I wasn't sure which channel was appropriate.

shadow frigate
#

Hello, I have a very annoying issue while plotting with seaborn using timeseries. I have a series of readings of temperature over different days, and I would like to plot them all together, similarly to what I'm doing here. The problem is that I have one day that starts at 14h, and that forces the plotting at 14h, rather than midnight, with the annoying result that at midnight the line switches to the wrong day.

If I try to use a full datetime object (date and all) as the x-axis, I can't filter by day anymore with hue...

Is there a good way of fixing the axis? I looked up the matplotlib locator/formatters but I wasn't able to find a better way of fixing it while using seaborn.

Dropping the entire set of incomplete measurements also works, but I'd rather not have to do that

exotic pine
#

hey anyone help me in Reactjs

#

??????????

misty flint
#

bruh im dead

#

💀

misty flint
#

+1 for transformer papers. hmm pre-transformer is a wide landscape, especially concerning NLP.

#

youre better served going through stanford's NLP class syllabus i believe

exotic pine
misty flint
#

then you can choose which papers you want afterwards

solar yew
#

Hey guys still stuck on a seemingly basic question... how does one pick the correct model? Seeing as param tuning can result in significantly better results, testing loads of untuned models doesnt seem great. For example on my dataset testing the untuned SVM classifier results in ~0.6 accuracy, while tuned its a top performer at ~0.81

misty flint
#

ah i went through the original jurafsky course but that one should be even better since it includes more up-to-date NLP practices like transformers

steady basalt
#

@wooden sail u familiar w the fastest way to compute Fibonacci n

#

Wondering why it’s faster for computer to split integers

wooden sail
#

hmm?

steady basalt
#

Kara what’s his face

wooden sail
#

not familiar with that, idk what you mean

steady basalt
#

Multiplying integers quicker on a computer

#

U know about the algo quake used?

wooden sail
#

you still didn't use full sentences to explain what you mean, but i'm guessing the question goes in the direction of "why is it faster to use the naive recursion formula than to use the closed form expression"

steady basalt
#

Anyway, dynamic programming isn’t the fastest solver to finding fib n, there’s a linear algebra solution that I don’t understand

wooden sail
#

taking a square root cannot be done in closed form. when you ask your computer for a square root, it does an iterative algorithm that stops after the approximation no longer improves much per iteration

#

the linalg approach you found is probably a matrix-vector representation of the difference equation that defines the fib numbers.

steady basalt
wooden sail
#

please use words instead of just sending links with no comments from your side lol

#

but yeah

steady basalt
#

Matrix and further is beyond me. I only know until second

wooden sail
#

the matrix is diagonalizable and therefore there is a nice way to take arbitrary matrix powers based on the eigenvalues

steady basalt
#

But we don’t have fn so how do we use f n+1

wooden sail
#

this is equivalent to the closed form expression in the end, since the eigenvalues are precisely the pair of golden ratios

steady basalt
#

Oh

#

Ohhh

#

We use the f n we have already and then find the f 2n +1

#

But doesn’t that skip in between

#

In fast doubling

#

Is this just used to compute up until a large value of n and then use a naive approach beyond that?

#

Matrix exponentiation would have to be used to find an exact

#

And that uses f n and fn -1 but I don’t know how it computes faster than the previous method
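
On the "skipping in between" question: fast doubling does not need the intermediate values. The sketch below assumes the standard identities F(2k) = F(k)(2F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2. It halves n recursively and shifts by one when n is odd, so any n is reached exactly in about log2(n) steps.

```python
def fib_pair(n):
    """Return (F(n), F(n+1)) using the fast-doubling identities."""
    if n == 0:
        return (0, 1)
    a, b = fib_pair(n // 2)   # a = F(k), b = F(k+1) with k = n // 2
    c = a * (2 * b - a)       # F(2k)
    d = a * a + b * b         # F(2k+1)
    if n % 2 == 0:
        return (c, d)
    return (d, c + d)         # odd n: shift the pair forward by one


def fib(n):
    return fib_pair(n)[0]


print([fib(i) for i in range(10)])
```

Python integers are arbitrary precision, so the result stays exact for large n, unlike a floating-point closed form.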

wooden sail
#

the gains don't come from using the matrix approach itself, they're from a "fast exponentiation algorithm"

steady basalt
#

Do you know why karatsuba multiplication is so fast

wooden sail
#

you decompose the exponent in base two and observe that you can obtain the result from log n squarings

#

no, i'd never heard of it before

steady basalt
#

Do computers multiply by splitting an integer into its 10^n chunks

#

Oh it’s like

#

Dividing a number up to multiply

#

Is somehow faster
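
A rough sketch of why splitting helps, assuming the textbook Karatsuba scheme. Splitting each number into a high and a low part would normally need four smaller products, and Karatsuba gets by with three. The split here is in base 10 for readability only; real big-integer code works on binary chunks.

```python
def karatsuba(x, y):
    """Multiply non-negative integers with three recursive products instead of four."""
    if x < 10 or y < 10:
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    lo = karatsuba(x_lo, y_lo)
    hi = karatsuba(x_hi, y_hi)
    # (x_hi + x_lo)(y_hi + y_lo) - hi - lo equals the two cross terms combined
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo
    return hi * base * base + mid * base + lo


print(karatsuba(1234, 5678) == 1234 * 5678)
```

Saving one product at every level of the recursion is what lowers the cost below the schoolbook method for very large numbers. For machine-sized integers the hardware multiply is faster.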

steady basalt
wooden sail
#

as long as n is integer

#

you know, base 2 representation

steady basalt
#

Binary?

wooden sail
#

mhm

steady basalt
#

Who ever started numbers on computers is a genius

tidal bough
#

fast exponentiation is a very simple algorithm, really:

def power(x, p):
    if p==0: return 1
    elif p==1: return x
    elif p%2==0: return power(x**2, p//2)
    else:
        return x*power(x**2,p//2)

and for matrices it's exactly the same (just replace 1 with an identity matrix I guess).
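
A sketch of the matrix version for the Fibonacci case, assuming the usual matrix [[1, 1], [1, 0]], whose n-th power holds F(n) off the diagonal. The helper names are made up; the recursion mirrors `power` above with the identity matrix in place of 1.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def mat_power(m, p):
    """Fast exponentiation: square the base and halve the exponent."""
    if p == 0:
        return ((1, 0), (0, 1))  # identity matrix
    half = mat_power(mat_mul(m, m), p // 2)
    return half if p % 2 == 0 else mat_mul(m, half)


def fib_matrix(n):
    """F(n) is the off-diagonal entry of [[1, 1], [1, 0]] ** n."""
    return mat_power(((1, 1), (1, 0)), n)[0][1]


print([fib_matrix(i) for i in range(10)])
```

Only about log2(n) matrix products are needed, which is where the speed-up over stepping through every Fibonacci number comes from.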

steady basalt
#

I think a fraction can be represented in Binary

steady basalt
#

That simplifies it a lot

wooden sail
barren snow
#

Hey, could someone explain the fifth line( data = data * srd_0 + mu_0), what's he doing?

#

This is a process to create a Gaussian

modest onyx
#

the fifth line is just a scaling and a shift

#

by default a gaussian distribution has mean 0 and standard deviation 1

#

so by multiplying it by 2, the standard deviation is now 2

#

and by adding it by 5, the mean is now 5

#

it's quite a good idea to brush up on what mean and standard deviation mean

barren snow
#

Oh, got it! So it would be better than just calculate the raw data without scaling, right?

#

By the way, so is it a common way to shift data?

#

Thanks for the explanation! @modest onyx I appreciate it

arctic wedgeBOT
#

@charred egret :white_check_mark: Your eval job has completed with return code 0.

001 | orig std:  1.0003886578560521
002 | orig mean:  0.005016058783782292
003 | 
004 | new std is:  2.0007773157121043
005 | new mean is:  5.010032117567565
modest onyx
#

but most commonly I've seen it used to normalize a dataset

#

meaning we shift it and scale it in such a way as to make the mean 0 and std 1, rather than the other way around

modest onyx
barren snow
barren snow
modest onyx
#

this isn't a real dataset. In real life datasets never perfectly have mean 0 and std 1

modest onyx
barren snow
#

But there isn't enough explanation

#

BTW, it's weird that my plot look like this

charred light
#

In a binary classification with class imbalance towards class 1, if the logistic regression confusion matrix failed to predict a single 0 class but Random Forest, XGBoost didn't have this issue, what can I say about logistic regression model? Is it because the dataset is not linear?

drifting light
#

I've learnt python basics including data structures and v basics of libraries such as numpy, pandas, matplotlib etc. Can anyone guide me that how can i get started with ML/ AI? Pls

wooden sail
#

some maths would come in handy

#

probability and stats, multivar calc, and linear algebra are the basics

gleaming osprey
#

I keep getting this error: ValueError: Input 0 of layer conv2d_40 is incompatible with the layer: : expected min_ndim=4, found ndim=2. Full shape received: (None, 1)

#

this is my model:```python
inputs = keras.Input(shape=(48, 48, 1,))

x = keras.layers.Conv2D(16, 2, padding="same")(inputs)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Conv2D(32, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)

x = keras.layers.Conv2D(32, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)

x = keras.layers.Conv2D(64, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Conv2D(128, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Flatten()(x)

x = keras.layers.Dense(36, activation='relu')(x)
x = keras.layers.Dense(72, activation='relu')(x)
x = keras.layers.Dense(72, activation='relu')(x)
x = keras.layers.Dense(36, activation='relu')(x)
x = keras.layers.Dense(18, activation='relu')(x)
outputs = keras.layers.Dense(7, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs)```

#

and also, what should be the shape of y_train

#

because rn, its a ndarray with values 0 - 6

ancient pendant
#

Hi, Everyone
I did not understand this question; can someone explain it to me without giving answers?
That weighted sum part went over my head.

ancient pendant
#

weights part

gleaming osprey
#

the weighted sum is basically red * weight + blue * weight + green * weight

#

and that is 1 pixel

#

im pretty sure

#

?

ancient pendant
#

so sum of those weights is 1

gleaming osprey
#

yes

#

no

prime hearth
#

Also you need to check the grayscale value to make sure it's not out of bounds

ancient pendant
#

so red * 1

gleaming osprey
#

no

#

red * weight

ancient pendant
#

to get red color we do this right
[:, :, 1]

gleaming osprey
#

no

#

i dont get you

#

so, for each pixel something like this

#

pixel = (red * weight) + (blue * weight) + (green * weight)
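
A minimal sketch of that per-pixel weighted sum. The weights are an assumption (the ITU-R BT.601 luma coefficients, which sum to 1); the exercise being discussed may specify different ones, and each channel gets its own weight.

```python
# Assumed weights (ITU-R BT.601 luma coefficients); the exercise
# discussed in the chat may specify different values.
WEIGHTS = (0.299, 0.587, 0.114)   # red, green, blue


def to_gray(pixel, weights=WEIGHTS):
    """Return the weighted sum of one (red, green, blue) pixel."""
    r, g, b = pixel
    wr, wg, wb = weights
    gray = r * wr + g * wg + b * wb
    # Clamp so the value can never fall outside 0..255.
    return max(0, min(255, round(gray)))


image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]   # tiny 2x2 RGB image
gray_image = [[to_gray(p) for p in row] for row in image]
print(gray_image)
```

On a NumPy array of shape (height, width, 3) in RGB order, the red channel would be `img[:, :, 0]`, not index 1.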

#

i have a question

ancient pendant
gleaming osprey
#

how do you structure a y_train class

#

cuz what I have is a 1D array

#

i think it needs to be 2D

#

but how

#

oh I get it I think

ancient pendant
#

I am a beginner so I dont know 😅

lapis sequoia
#

depends on ytrain

#

is it like an image ?

#

if so it makes sense for it to be 2d

#

and in this case your model is generating an image, which is common with segmentation models

gleaming osprey
#

no its a classification

#

of emotions

#

there are 7 classes

#

from 0 - 6

#

and there are 28709 images

lapis sequoia
#

ytrain should be 1D in that case

#

each value representing a class

gleaming osprey
#

but im using softmax

#

ok so I have another issue

#

kinda dumb

lapis sequoia
gleaming osprey
#

i cant tell which emotion is which class god darn it

#

is it happy? sad? angry?

#

this is a major problem

#

im dumb enough that even after looking at the face, I still cant tell

lapis sequoia
#

initially you would have a 1D array for you labels (ytrain) which would look like so : ["sad", "angry", "happy" , ...]

gleaming osprey
#

no

#

ok

#

how would an angry person look like

#

can you help?

lapis sequoia
#

lol what is this, no description of dataset or anything xD

#

it looks like the classes are numbered

gleaming osprey
#

ik

lapis sequoia
#

each number representing a feeling

gleaming osprey
#

what does this girl look like?

lapis sequoia
#

you need to manually look into the labels and check which number is assigned to which feeling

gleaming osprey
#

thats what im doing

#

and thats why im stuck

#

i cant tell

lapis sequoia
#

That is a very unprofessional author that posted this on kaggle

gleaming osprey
#

ik

#

but the dataset is good

#

and easy to use

lapis sequoia
#

yeah but he could have at least added a mapping to the numbers, 1 = happy, 2 = sad

#

But I guess you can name them as you perceive them yourself

#

and the model would still work

gleaming osprey
#

hehehehe look what I found: label_map = ['Anger', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

#

on someones code

lapis sequoia
#

lol

#

there you go

gleaming osprey
#

yay

#

help plz: ```python
ValueError: Creating variables on a non-first call to a function decorated with tf.function.```

#

y

lapis sequoia
#

try adding this line

#

tf.config.run_functions_eagerly(True)

gleaming osprey
#

i did nothing: ValueError: Input 0 of layer conv2d_45 is incompatible with the layer: : expected min_ndim=4, found ndim=2. Full shape received: (None, 1)

gleaming osprey
lapis sequoia
#

your input should be 4 dimensions tho

#

(r , g, b, num_of_images)

#

how is it only 2 dimensions?

gleaming osprey
#

it works now

#

tho is accuracy 0 normal?

lapis sequoia
#

what do you mean?

#

its been like 5 minutes

gleaming osprey
#

it says in the epochs

lapis sequoia
#

oh ok

gleaming osprey
#

accuracy: 0.0000e+00

lapis sequoia
#

that is not normal

gleaming osprey
#

shouldn't it be like 64.203

lapis sequoia
#

64 should be the loss

gleaming osprey
#

loss is nan

lapis sequoia
#

accuracy on validation should be small but not 0

#

that means you are doing something wrong

gleaming osprey
#

also, I don't have any validation data

lapis sequoia
#

what are you using on output layer

gleaming osprey
#

a dense layer

#

with softmax

lapis sequoia
#

okay

gleaming osprey
#

my model: ```python
tf.config.run_functions_eagerly(True)

inputs = keras.Input(shape=(48, 48, 1,))

x = keras.layers.Conv2D(16, 2, padding="same")(inputs)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Conv2D(32, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)

x = keras.layers.Conv2D(32, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)

x = keras.layers.Conv2D(64, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Conv2D(128, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Flatten()(x)

x = keras.layers.Dense(36, activation='relu')(x)
x = keras.layers.Dense(72, activation='relu')(x)
x = keras.layers.Dense(72, activation='relu')(x)
x = keras.layers.Dense(36, activation='relu')(x)
x = keras.layers.Dense(18, activation='relu')(x)
outputs = keras.layers.Dense(7, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs)

model.summary()```

lapis sequoia
#

So your input is probably wrong, or the labels

gleaming osprey
#

my y_train is currently shaped (28709, 7)

lapis sequoia
#

print one of them

gleaming osprey
#

so like

#

[0. 0. 1. 0. 0. 0. 0.]

#

is class 2

lapis sequoia
#

okay

gleaming osprey
#

thats y_train[2]

lapis sequoia
#

and one of the inputs?

gleaming osprey
#

the picture I showed you earlier

#

48x48

#

each pixel is between 0 - 255

#

do I need to normalize them?

lapis sequoia
#

wait is it rgb?

gleaming osprey
#

no

#

1 channel

lapis sequoia
#

ah ok ok

#

then your input should be 3d not 4d

#

[48, 48, num_of_images]

gleaming osprey
#

?

#

i thought [num_of_images, 48, 48]

#

so do I reshape it?
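
For reference, a hedged sketch of the layout a Keras `Conv2D` stack with `Input(shape=(48, 48, 1))` expects under the default `channels_last` setting: `(num_images, height, width, channels)`, i.e. four dimensions with the batch axis first. The array below is random stand-in data, not the real dataset, and scaling to 0..1 is a common choice rather than a requirement.

```python
import numpy as np

# Stand-in for the dataset: 5 grayscale images of 48x48 pixels, values 0..255.
images = np.random.randint(0, 256, size=(5, 48, 48), dtype=np.uint8)

# Add a trailing channel axis so each image is (48, 48, 1),
# and scale the pixel values to the 0..1 range.
x_train = images[..., np.newaxis].astype("float32") / 255.0

print(x_train.shape)  # (5, 48, 48, 1)
```

This matches the `(28709, 48, 48, 1)` shape reported for `x_train` later in the log.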

lapis sequoia
#

ehhh it depends on how your model is taking input

#

try

gleaming osprey
#

i want it to take input 1 image at a time

lapis sequoia
#

I don't get

#

is your model already trained>?

gleaming osprey
#

no

lapis sequoia
#

then you cannot train it one image at a time

gleaming osprey
#

?

lapis sequoia
#

I mean you could but why xD

gleaming osprey
#

but after its trained, it can do 1 img at a time right?

lapis sequoia
#

yes

gleaming osprey
#

ok fine

lapis sequoia
#

for test labels

gleaming osprey
#

so what am I missing

lapis sequoia
#

but for train you need to input all your train data

#

and run the model for X epochs

gleaming osprey
#

thats what I'm doing

#

ik

#

im running 10 epochs

#

im on epoch 8

lapis sequoia
#

okay so on the 10th epoch it was giving you 0 accuracy?

gleaming osprey
#

i have no validation data, is that required?

gleaming osprey
#

im on 8th

lapis sequoia
#

would be good for seeing model's effectiveness

#

okay

gleaming osprey
#

this is my compile: model.compile(loss="categorical_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=0.01), metrics=[tf.keras.metrics.Accuracy()])

#

this is my fit: run = model.fit(x_train, y_train, batch_size=16, epochs=10)

lapis sequoia
#

I see, where did you get the architecture of your model

gleaming osprey
#

wdym?

unique flame
#

The layers, like input, hidden layers (conv, maxpooling) and output layer

gleaming osprey
#

well, I made it up?

lapis sequoia
#

I would start with bigger filters in convolutional layers and then shrinking them down,
like 96 then 48 then 24 etc

#

because you're shrinking the image to 16x16 on your first layer, which IMO removes a lot of info from the input

unique flame
#

I usually keep it at 32 and then another model where I double it in the next layers

gleaming osprey
#

can somebody help?

#

I get this error: ```python
ValueError: Shapes (16, 1) and (16, 7) are incompatible```

#

My model:```py

inputs = keras.Input(shape=(48, 48,1,))

x = keras.layers.Conv2D(16, 2, padding="same")(inputs)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Conv2D(32, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)

x = keras.layers.Conv2D(32, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)

x = keras.layers.Conv2D(64, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Conv2D(128, 3, padding="same")(x)
x = keras.layers.LeakyReLU()(x)
x = keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

x = keras.layers.Flatten()(x)

x = keras.layers.Dense(36, activation='relu')(x)
x = keras.layers.Dense(72, activation='relu')(x)
x = keras.layers.Dense(72, activation='relu')(x)
x = keras.layers.Dense(36, activation='relu')(x)
x = keras.layers.Dense(18, activation='relu')(x)
outputs = keras.layers.Dense(7, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs)

model.summary()```

#

Compile: model.compile(loss="categorical_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=0.01), metrics=[tf.keras.metrics.Accuracy()])

#

Fit: run = model.fit(x_train, y_train, batch_size=16, epochs=10)

#

Input: 48x 48 img with 1 channel from values 0 - 255

#

Output eg. [0, 0, 0, 1, 0 ,0, 0]

#

can be 1 of 7 classes

steady basalt
#

reshape the array

#

x train y train

#

u need match with ur input xtrain

#

ur model expecting 1 but got 7?

#

did u encode or something

mild dirge
#

Show the full traceback

steady basalt
#

obviously inputting a 7 instead of 1

#

he just needs a reshape prob

#

why do u have 7 when thats meant to be the output

#

aka y train

mild dirge
steady basalt
#

Where he tried forcing x train into it

#

Something tells me that isn’t the full error usually it tells more

gleaming osprey
arctic wedgeBOT
#

Hey @gleaming osprey!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

gleaming osprey
gleaming osprey
steady basalt
#

What’s x train shape

#

When u check it

gleaming osprey
#

(28709, 48, 48, 1)

#

(28709, 7) is y_train

#

I'm getting a new error

#

anything?

gleaming osprey
#

anybody?

mild dirge
#

Considering your batch size is 16, and the shape mismatch is about something with shape (16, 1) I assume you didn't one hot encode your output

#

You have something like:

y_batch = [4, 2, 0, 1, 6, 2, 3, 5, 3, 6, 4, 2, 1, 4, 5, 5]
whereas it should be:
y_batch = [
  [0, 0, 0, 0, 1, 0, 0],
  [0, 0, 1, 0, 0, 0, 0],
  ... (12 more)
  [0, 0, 0, 0, 0, 1, 0],
  [0, 0, 0, 0, 0, 1, 0],
]
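
A plain-Python sketch of the one-hot step described above, assuming integer labels 0..6. The helper name `one_hot` is made up for illustration; in Keras, `keras.utils.to_categorical(y, num_classes=7)` does the same job.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    rows = []
    for label in labels:
        row = [0] * num_classes
        row[label] = 1          # a single 1 at the label's index
        rows.append(row)
    return rows


y_batch = [4, 2, 0, 1, 6, 2, 3, 5, 3, 6, 4, 2, 1, 4, 5, 5]
encoded = one_hot(y_batch, num_classes=7)
print(encoded[0])                       # row for label 4
print(len(encoded), len(encoded[0]))    # batch of 16 rows, 7 columns each
```
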
#

@gleaming osprey

gleaming osprey
#

i fixed it by changing to categorical_crossentropy

#

now, i'm on epoch 1

#

but

#

I have an issue, it seems that the accuracy is decreasing?

mild dirge
#

That can be due to many many things

gleaming osprey
#

is 2% normal

#

ok even then

#

im pretty sure its not supposed to be 0.0000e+00

mild dirge
#

2% for 7 classes is worse than random guessing

gleaming osprey
#

ik

#

and my loss is 3000+

#

but now the loss is 1?

#

(epoch 2)

timid hollow
#

I’m working with a pandas dataframe; how do I set the indices in this level to rangeindex like so? (Simply taking the last character from each index won’t work for all of the indices I have.)

serene scaffold
#

!paste

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold
#

Please ping me when you have done this. I will not use any screenshots

timid hollow
#

👍

serene scaffold
steady basalt
steady basalt
serene scaffold
#

@timid hollow

# move the three levels of indexing out of the index
reset = df.reset_index()
# extract the numbers from the third level and overwrite
reset['level_2'] = reset['level_2'].str.extract(r'(\d+)', expand=False).astype(int)
# overwrite the original df variable
df = reset.set_index(['level_0', 'level_1', 'level_2'])
#

I got this at the end.

MultiIndex([(         'W',                   'W', 1),
            (         'W',                   'W', 2),
            (         'W',                   'W', 3),
            (        '令',                'Ling', 1),
            (        '令',                'Ling', 2),
            (        '令',                'Ling', 3),
            (  '伊芙利特',               'Ifrit', 1),
            (  '伊芙利特',               'Ifrit', 2),
            (  '伊芙利特',               'Ifrit', 3),
            ('假日威龙陈', 'Ch'en the Holungday', 2),
            ('假日威龙陈', 'Ch'en the Holungday', 2),
            ('假日威龙陈', 'Ch'en the Holungday', 2),
            (      '傀影',             'Phantom', 1),
            (      '傀影',             'Phantom', 2),
            (      '傀影',             'Phantom', 3),
            (    '凯尔希',            'Kal'tsit', 1),
            (    '凯尔希',            'Kal'tsit', 2),
            (    '凯尔希',            'Kal'tsit', 3),
            (    '刻俄柏',               'Ceobe', 1),
            (    '刻俄柏',               'Ceobe', 2),
            (    '刻俄柏',               'Ceobe', 3),
            (  '卡涅利安',           'Carnelian', 1),
            (  '卡涅利安',           'Carnelian', 2),
            (  '卡涅利安',           'Carnelian', 3),
            (  '史尔特尔',               'Surtr', 1),
            (  '史尔特尔',               'Surtr', 2),
            (  '史尔特尔',               'Surtr', 3),
            (  '卡涅利安',           'Carnelian', 3)],
           names=['level_0', 'level_1', 'level_2'])
timid hollow
serene scaffold
timid hollow
#

I did use to_dict() on the entire dataframe, but the dict written to stdout was cut off because it was too long.

serene scaffold
timid hollow
#

Your code example doesn’t help because you’re extracting the numbers, but that doesn’t produce (1, 2, 3), (1, 2, 3), … for all indices. I will ask elsewhere.
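
One possible way to get a clean 1, 2, 3 count per group regardless of what the old labels contain is `groupby(...).cumcount()`. This is a sketch on a made-up frame; the third-level labels and the `value` column are assumptions, since the real data was only shared as a screenshot.

```python
import pandas as pd

# Made-up frame with three index levels; the third level is not a clean 1, 2, 3.
idx = pd.MultiIndex.from_tuples(
    [("W", "W", "skill_a"), ("W", "W", "skill_b"), ("W", "W", "skill_c"),
     ("令", "Ling", "x2"), ("令", "Ling", "y2"), ("令", "Ling", "z2")],
    names=["level_0", "level_1", "level_2"],
)
df = pd.DataFrame({"value": range(6)}, index=idx)

# Number the rows within each (level_0, level_1) group: 1, 2, 3, ...
position = df.groupby(level=["level_0", "level_1"]).cumcount() + 1

# Rebuild the index with the running count as the third level.
df.index = pd.MultiIndex.from_arrays(
    [df.index.get_level_values("level_0"),
     df.index.get_level_values("level_1"),
     position.to_numpy()],
    names=df.index.names,
)
print(df.index.tolist())
```

`cumcount` depends only on row order within each group, so it does not need to parse the old labels at all.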

severe oriole
#

Hi guys I have a question regarding dataframes in python

#

so I was given a Month and Year columns which are both integer and now I want to combine them

serene scaffold
#

!paste

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

proud solstice
#

Hello

severe oriole
#

oh ok let me send the latest one

proud solstice
#

I have trained a roberta model and I want that model to read a document and give the results of the classification in a CSV file...how to do this can anyone help?

severe oriole
#

basically I have a problem with converting the year and month to string before doing month + "_" + year

#

it shows up as 2019\n4 as an example

serene scaffold
#

Anyway, take a look at this

In [7]: df['month']
Out[7]:
0    8
1    8
2    8
3    8
4    8
Name: month, dtype: int64

In [8]: df['year']
Out[8]:
0    2019
1    2019
2    2019
3    2019
4    2019
Name: year, dtype: int64

Does anything stick out to you here?

severe oriole
#

they're both integers right

serene scaffold
severe oriole
#

I did the str(df['month']) + "_" + str(df['year']) in my case

#

df['Month_Year']= str(df['month'])+ "_" + str(df['year'])

serene scaffold
#

ah, that's what the problem was

#

str(df['some_column']) makes one string that represents the whole column

severe oriole
serene scaffold
#

it does not convert each element to a string.

severe oriole
#

oh what

serene scaffold
#

if you want to go from an int Series to a str Series, you need to use .astype(str)

#

because for any object, str( ) will return one string.

severe oriole
#

so i need to do that for both the data

serene scaffold
#

even though pandas objects have special behaviors, they're still python objects as well.
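
A short sketch of the difference on toy data (the values are made up; the column names follow the chat):

```python
import pandas as pd

df = pd.DataFrame({"month": [8, 8, 9], "year": [2019, 2019, 2019]})  # toy data

# str(df["month"]) would give ONE string describing the whole column.
# .astype(str) converts element by element, so "+" concatenates row-wise.
df["Month_Year"] = df["month"].astype(str) + "_" + df["year"].astype(str)

print(df["Month_Year"].tolist())  # ['8_2019', '8_2019', '9_2019']
```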

serene scaffold
severe oriole
#

ok let me try again with that to see first

#

brb after monkey data-ing

serene scaffold
#

let me know how it goes. there's another thing we should go over.

severe oriole
#

oh thanks a lot stelercus

#

the problem's solved

#

now it's displaying correctly now

#

I can finally make a plot out of this thanks to you man

serene scaffold
severe oriole
serene scaffold
#

however, you should keep in mind that pandas has an actual time datatype

severe oriole
#

oh you mean obj and int64

serene scaffold
#

and if you're trying to represent time, you should pretty much always use that

#

no, datetime

severe oriole
#

I'm a bit confused here can you explain a bit more about it

serene scaffold
#
In [14]: time_strs
Out[14]:
0    2019_8
1    2019_8
2    2019_8
3    2019_8
4    2019_8
dtype: object

In [15]: pd.to_datetime(time_strs, format='%Y_%m')
Out[15]:
0   2019-08-01
1   2019-08-01
2   2019-08-01
3   2019-08-01
4   2019-08-01
dtype: datetime64[ns]
#

note the dtype

severe oriole
#

oh i see

#

datetime64

serene scaffold
#

these have methods for dealing with those values as actual moments in time, rather than as strings.

severe oriole
#

well this is new to me

#

I guess this works best when the time variables are more than 2 when you add

#

like not just month_year but hour_minutes_second_day_month_year thing right

serene scaffold
#
In [20]: times = pd.to_datetime(time_strs, format='%Y_%m')

In [21]: times.dt.month_name()
Out[21]:
0    August
1    August
2    August
3    August
4    August
dtype: object

things like this.

severe oriole
#

like more variables and need separation then it's better if I use datetime instead am I correct?

serene scaffold
severe oriole
#

hmm i see

#

noted and first I'll need to know how to use this properly first

severe oriole
eternal hull
#

Can anyone help me calculate year over year growth prev year

#

Q42017 : 0 , Q42018: 152305, year over year growth prev year 0

#

How it is calculated
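
Year-over-year growth is usually computed as (current - previous) / previous. With a previous value of 0 that ratio is undefined, so the 0 in the question may simply be how the source reports that case; that is a guess, not something the chat confirms. A minimal sketch, with a hypothetical helper name:

```python
def yoy_growth(previous, current):
    """Year-over-year growth as a fraction: (current - previous) / previous.

    Returns None when the previous value is 0, because the ratio is
    undefined there (some reports display that case as 0 or N/A).
    """
    if previous == 0:
        return None
    return (current - previous) / previous


print(yoy_growth(0, 152305))        # previous year was 0, so undefined
print(yoy_growth(100000, 152305))   # hypothetical previous value
```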

severe oriole
#

!paste

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

severe oriole
#

add ur data to that and highlight which part's the problem in first

main fox
#

I know this varies from job to job, but what does the modern data scientist "stack" look like?

The obvious ones are SQL, Python or R, classification and regression techniques, model deployment, etc.
And the not-so-obvious ones might be deep learning, Big Data frameworks like Hadoop/PySpark, and cloud computing like AWS or Azure

I'm trying to see if I'm missing something and fill in the gaps...

prime hearth
#

hello, for natural language processing sentiment analysis, what's the best way to remove unfinished responses from users in a dataset? for example
if i have the sentence "a series" or just "series", i know stopwords can help remove "a", but what about single words that don't really count as good or bad?

#

i have one row for example:
"series" and it is considered positive (a good review)

#

but not sure if this adds value

severe oriole
#

aaand I'm back with another question

#

so I want to convert this groupby into violin plot but I can't find a way to do so

#

is there someone with a clue of how to do so?

steady basalt
#

Stack overflow

#

I’m pretty sure if u google pandas groupby violin plot it’s the first one

#

I read it before

severe oriole
#

nope can't find it

#

i do find those with dataframes

#

but I can't find the one with the groupby function that's the problem

steady basalt
#

U can prob apply that method to ur one

#

As it’s functionally the same thing

#

A data frame

severe oriole
#

that'd work if it's not because of the order count I have tbh

#

basically I have a problem because I need to count the number of orders I have in the data

#

which I have to use groupby instead to create that

#

unless I create a completely new data with the groupby then maybe it'll work

gleaming osprey
#

i have an issue with my tensorflow model

#

my accuracy is 0.0000e+00

#

why

#

my loss is about 1

steady basalt
#

it classified all wrong?

inland mango
#

Can anyone help me with leavepgroupsout split?

gleaming osprey
#

why would it

steady basalt
#

cuz u did smtn wrong
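
One likely cause, given the compile call shown earlier in this log: `tf.keras.metrics.Accuracy()` counts how often predictions equal labels exactly, and softmax outputs such as 0.7 almost never equal a one-hot 0 or 1. For one-hot labels, `metrics=["accuracy"]` or `tf.keras.metrics.CategoricalAccuracy()` compares the argmax instead. The sketch below imitates both ideas in plain Python; it is not TensorFlow's implementation.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of individual values where prediction == label exactly."""
    pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred)
             for t, p in zip(row_t, row_p)]
    return sum(t == p for t, p in pairs) / len(pairs)


def categorical_accuracy(y_true, y_pred):
    """Fraction of rows where the largest probability is at the true class."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


y_true = [[0, 0, 1], [1, 0, 0]]                  # one-hot labels
y_pred = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]      # softmax-style outputs

print(exact_match_accuracy(y_true, y_pred))   # 0.0, no float equals 0 or 1
print(categorical_accuracy(y_true, y_pred))   # 1.0, both argmaxes are right
```

Unscaled 0..255 inputs combined with a 0.01 learning rate may also contribute to the NaN and very large losses seen earlier, though that is harder to confirm from the log alone.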

prime hearth
#

hello sorry

#

how does naive bayes classifier work with tfidf?

#

i did tfidf to my dataset and would like implement naive bayes to it but not sure how it works on paper

#

since most data we need to manually compute the probability

#

but tfidf already has the log prob

#

suppose i have like 3 features as simple example

#

and output is 1 or 0

#

what i have is like:
x1 x2. x3
0 0.56 0.57
0.56. 0 0.4
0.2. 9 0.9

#

would i do sum(x1) * sum(x2) * sum(x3)

prime hearth
#

okay after a little research would i sum instead of counting?

#

so i sum all x1 whereever p(x1|y=1)

#

and divide it by total sum?
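
Roughly yes for the multinomial variant: tf-idf values are weights rather than probabilities, so they can stand in for word counts. Per class, sum each feature column, add smoothing, and divide by the class total. The sketch below assumes made-up data and labels, with hypothetical helper names; scikit-learn's `MultinomialNB` follows the same idea and is documented to accept fractional counts such as tf-idf.

```python
import math


def fit_multinomial_nb(X, y, alpha=1.0):
    """Estimate log priors and log feature likelihoods from weighted counts."""
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prior, log_lik = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Sum each tf-idf column over the documents of class c.
        col_sums = [sum(r[j] for r in rows) for j in range(n_features)]
        total = sum(col_sums) + alpha * n_features
        log_lik[c] = [math.log((s + alpha) / total) for s in col_sums]
    return log_prior, log_lik


def predict(x, log_prior, log_lik):
    """Pick the class with the highest log posterior (up to a constant)."""
    scores = {c: log_prior[c] + sum(v * w for v, w in zip(x, log_lik[c]))
              for c in log_prior}
    return max(scores, key=scores.get)


# Toy tf-idf rows (x1, x2, x3) with made-up labels.
X = [[0.0, 0.56, 0.57], [0.56, 0.0, 0.4], [0.2, 0.9, 0.9], [0.7, 0.1, 0.0]]
y = [1, 0, 1, 0]
log_prior, log_lik = fit_multinomial_nb(X, y)
print(predict([0.0, 0.8, 0.6], log_prior, log_lik))
```

Working in log space turns the product of per-feature probabilities into a sum, which avoids numerical underflow.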

serene scaffold
#

this would probably be a better question for #databases.

pulsar hull
#

finally finished my convolutional neural network

#

turns out it's faster and better than my linear network, even with just 1 convolution

serene scaffold
pulsar hull
#

still some code cleanup and optimization to do, but overall turned out great

lapis sequoia
#

Tensorflow or Pytorch?

pulsar hull
#

just numpy

#

i did it so i would actually have to learn how each part works

pulsar hull
serene scaffold
pulsar hull
#

i tried, but with the gpu i tried it on, it actually got ~5x slower

worldly dawn
vast goblet
#

I know this finds number of entries per year, but I want to find missing values for each year
Any idea how to do this?

unique flame
#

I would usually do something like df.loc[f'{something.index[5]}','name']

gleaming osprey
#

can somebody help

#

my accuracy starts at 40 then goes to 0

lapis sequoia
severe oriole
#

Hello there guys I'm back again with more questions here

#

so originally, my goal's to convert the groupby into a violin plot but I couldn't find how on google

#

so I extracted both keys and values into 2 parts and grouped them into a new dataframe instead, before finally making a violin plot out of that

#

somehow the data aren't distributed equally and it looks like this

steady basalt
#

What data is it

mighty lance
#

So I'm familiar with python and know a fair amount of stuff about the pandas library how can I get started with machine learning?

spiral furnace
#

continue with scikit decision trees, random forests and xg boost

mighty lance
spiral furnace
mighty lance
spiral furnace
#

I can help you with that

#

you just apply the algorithms to the predictions you want to make

#

the intermediate one is not as important as the feature engineering course

mighty lance
#

so should I just skip it or learn any specific stuff form it?

spiral furnace
#

the most important thing to be able to get go and start working on projects is feature engineering.... it's the only part for a beginner where they have to use their brains

mighty lance
#

so should I skip the others and focus on this one or the others like deep learning are equally as important?

spiral furnace
#

just give it a fast pace reading through those and focus more on their exercises

#

dude if you go through those courses and get to feature engineering you can team up with me to work on projects

#

im looking for teammates

gleaming osprey
proper salmon
#

Heyo

#

A while ago I advertised an AI chat bot that utilized a modified version of GPT-3, and I'm just letting you guys know I'm still looking for testers

honest shadow
#

Hello,

I am looking for an e-learning platform to learn python applied to data. I'm looking for something that has been proven to work. What would you recommend?

Thanks in advance.

lapis sequoia
wooden sail
#

that's a bunch of code. what do you want help with

serene scaffold
wooden sail
#

note that, at least in coursera, you can apply for financial support. if you're a student, it's almost guaranteed that you'll get free access to courses

serene scaffold
# severe oriole

I can answer most pandas questions, but I won't look at screenshots of text/code.

#

if you decide to provide the code as a code block, ping me, and I'll see if I still have time to go over it by then.

narrow saddle
#

Are there any alternatives to OpenCV because it feels so ancient and is poorly documented

steady basalt
#

quickfire question: eigenvectors can go in reverse of the original, right? on the same span, let's say 1,0 goes to -1,0

#

its still eigen?

#

ahhh i found my answer, its YES

wooden sail
#

-1 is a perfectly valid eigenvalue, sure

steady basalt
#

: )

#

on the final week of linalg before calc

#

id still fail the exam qs : S

#

still feels kinda nice to start to see how it works though

#

but hell nah can i be assed to calculate this myself

wooden sail
#

there's little point in doing that by hand anyway

#

the concept is super important though

#

here's a fun toy task that often shows up

steady basalt
#

well i had a assignment in pure numpy

#

really cudnt be assed so skipped

wooden sail
#

in numpy it makes sense

steady basalt
#

its hard and slow tho

#

to work out

#

idk, we have transform functions so why do it in pure matrices

wooden sail
#

there's no special reason. matrices are a representation of a linear transformation in a given basis for the domain and codomain

#

so they're the same thing

#

however, if you choose a basis super cleverly, then some operations become a lot easier

#

that's the whole point of learning about eigenvalues and eigenvectors

#

say you have a linear transformation and you want to compose it n times with itself

#

f(f(f(...f(x)...)))

#

if f is diagonalizable, then computing this is trivial
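
To make that concrete, a small sketch with an example matrix chosen here for illustration (not from the chat): A = [[2, 1], [1, 2]] has eigenvalues 3 and 1 with eigenvectors (1, 1) and (1, -1), so A = P D P^-1 and A^n = P D^n P^-1.

```python
# Example matrix A = [[2, 1], [1, 2]] (an assumption for illustration).
# Columns of P are the eigenvectors (1, 1) and (1, -1).
P = [[1, 1], [1, -1]]
P_inv = [[0.5, 0.5], [0.5, -0.5]]
eigenvalues = [3, 1]


def apply_n_times(x, n):
    """Compute A^n x as P D^n P^-1 x, without any matrix-matrix products."""
    # 1. Express x in the eigenvector basis.
    coords = [sum(P_inv[i][j] * x[j] for j in range(2)) for i in range(2)]
    # 2. In that basis, applying A n times just scales each coordinate.
    coords = [lam ** n * c for lam, c in zip(eigenvalues, coords)]
    # 3. Convert back to the original basis.
    return [sum(P[i][j] * coords[j] for j in range(2)) for i in range(2)]


print(apply_n_times([1, 0], 5))  # [122.0, 121.0]
```

In the eigenbasis, composing the map n times reduces to raising two numbers to the n-th power.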

wooden sail
#

stuff like PCA is the same: it can be seen as the problem of choosing a clever basis so that there's a super nice low rank representation of data

#

the idea behind linear algebra and using matrices and vectors for it is that "vector" is quite an abstract concept that applies to many objects. if you can show something behaves like a vector/vector space, then you can immediately disengage your brain and fall back on the well understood, mature toolset of matrices, or more generally the toolset of linalg

#

e.g. solving differential equations via diagonalization

severe oriole
#

aand off I'm back to you senpai...

#

and don't worry about the time because this is just me doing for self-improvement

#

the goal's to convert the groupby to DataFrame so that I can use seaborn violinplot to draw one

serene scaffold
#

@severe oriole I'll need a moment to figure it out

severe oriole
#

yup don't worry about the time man

#

I'm available all time

#

that's the problem lmao

#

if i can just make a normal plot then it's no big deal

#

but the question was to filter out data and make a violin plot out of that

serene scaffold
#

so the docs have this figure

#

and the code to create it is sns.violinplot(x="day", y="total_bill", data=tips)

#

so we can infer that the x axis can be labels, and the y axis can be numbers

#

so using the sample data you gave yesterday, I did this: sns.violinplot(x=df['product'], y=df['total'])

#

and I got this terrible thing

#

but I suspect the problem is that I only have five data points. try it with your actual data.

#

@severe oriole

serene scaffold
severe oriole
#

that is the price x quantity of items

#

so therefore it's fixed

#

should I send the entire original csv file also?

serene scaffold
#

no

#

I mean you can drag/drop it into the chat if you'd like

severe oriole
#

because the original data does not include the number of orders tbh

#

oh alright then

#

this is the raw data i got to begin with

serene scaffold
#

what figure did you get for sns.violinplot(x=df['product'], y=df['total']) when you used the whole dataset?

severe oriole
#

fixed with total price in that order per product

#

but I soon deleted that because the question's this

serene scaffold
#

not sure what you mean by "fixed figure". also, you can change which columns you're using to create the figure. I just picked those two arbitrarily.

severe oriole
#

Plot violin plot between above respective 13 categories of unique year month combinations Month_Year ( 1_2019 , 2_2019 …. 1_2020) along side the number of orders being placed.

sick night
#

Please suggest me some projects ideas based on AI/ML
I need it for my final year project

severe oriole
#

i forgot how to adjust the size of the plot

steady basalt
#

My first interview assessment is gona be pandas or sql what shud I choose to use

#

Pandas is easier right

severe oriole
#

idk since it really depends which one you use more

steady basalt
#

I used pandas more

severe oriole
#

then go for pandas

steady basalt
#

Do u think it’s gonna be merges and stuff

#

Merges concat

#

Sorting

severe oriole
#

I can't tell since mine's filtering data out

#

i just have to filter and analyze the information for my case

steady basalt
#

Mine's gonna be on the spot with an interviewer

#

No help from discord haha

#

I may have to do what u are doing

#

I wud filter out just by group by

#

And use of iloc
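
A quick refresher sketch of the operations mentioned here (merge, concat, sorting, groupby, iloc). The tables and column names are made up for illustration, not taken from any real assessment.

```python
import pandas as pd

# hypothetical tables, invented for this example
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "total": [250.0, 40.0, 99.0, 15.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "city": ["Oslo", "Lima", "Oslo"],
})

# merge: SQL-style join on a shared key
merged = orders.merge(customers, on="customer_id", how="left")

# concat: stack rows from another frame with the same columns
more_orders = pd.DataFrame({"order_id": [5], "customer_id": [11], "total": [60.0]})
all_orders = pd.concat([orders, more_orders], ignore_index=True)

# sort, then take the first two rows by position with iloc
by_total = merged.sort_values("total", ascending=False)
top_two = by_total.iloc[:2]

# groupby: aggregate per group
per_city = merged.groupby("city")["total"].sum()
```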

tawdry phoenix
#

hi, i have a linear regression model and a dataset. the thing is i want to see in a graph the dots of G3 (the predicted label) and the line of the model. how do i do it in matplotlib? ```python
import pickle

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv("student-mat.csv", sep=";")
data = data[["G1", "G2", "G3", "studytime", "failures", "traveltime"]]

predict = "G3"  # column to predict
data = data.dropna(axis=0)
X = np.array(data.drop([predict], axis=1))
y = np.array(data[predict])

# best score saved by a previous run
with open("score.txt", "r") as score:
    best = float(score.read())

for i in range(500):
    # re-split on every iteration so each fit sees a different train/test split
    x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

    linear = LinearRegression()
    linear.fit(x_train, y_train)
    acc = linear.score(x_test, y_test)  # R^2 on the held-out split

    # keep the model only if it beats the best score so far
    if acc > best:
        print("acc: ", acc)
        best = acc
        with open("score.txt", "w") as score:
            score.write(str(best))
        with open("reg.pickle", "wb") as f:
            pickle.dump(linear, f)

with open("reg.pickle", "rb") as pickle_in:
    linear = pickle.load(pickle_in)

print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

predictions = linear.predict(x_test)

for x in range(len(predictions)):
    print("Predict: ", predictions[x], "Features: ", x_test[x], "Actual: ", y_test[x])

#plt.scatter(x_test,x_test)
#plt.plot(predictions, predictions)
#plt.show()```
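
One possible approach, as a sketch: with five input features there is no single x axis to draw the fitted line against, so a common substitute is to scatter actual G3 against predicted G3 and add a y = x reference line. The helper name `plot_actual_vs_predicted` and the numbers in the call are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(y_true, y_pred):
    """Scatter actual against predicted values with a y = x reference line."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    fig, ax = plt.subplots()
    ax.scatter(y_true, y_pred, label="test samples")

    # points on this diagonal are predictions that match the actual value
    lo = min(y_true.min(), y_pred.min())
    hi = max(y_true.max(), y_pred.max())
    ax.plot([lo, hi], [lo, hi], color="red", label="perfect prediction")

    ax.set_xlabel("actual G3")
    ax.set_ylabel("predicted G3")
    ax.legend()
    return fig, ax


# made-up numbers standing in for y_test and linear.predict(x_test)
fig, ax = plot_actual_vs_predicted([10, 12, 8, 15], [9.5, 12.4, 8.9, 14.1])
```

With the script above, that would be `plot_actual_vs_predicted(y_test, predictions)` followed by `plt.show()`. To see an actual fitted line, the model would have to be trained on a single feature (for example only G2), which is a different model from the one in the question.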

severe oriole
#

almost like this but y isn't total sadly

#

y = number of orders which is equal to the number of ids recorded instead

#

the problem is that the order count doesn't exist in the original file so i used groupby and I got that order counts in that month

#

x = month and y = order counts in that month

#

number of items per order

#

like in 1 order you can order 2 iphones

#

but that's still just +1 in order count
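
A sketch of that groupby step, assuming the raw file has one row per item line and an order id column. The names `order_id`, `order_date` and `day` are hypothetical, and the per-day count is just one possible reading of the assignment, not something stated in it.

```python
import pandas as pd

# hypothetical raw data: one row per item line, rows can share an order_id
raw = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 4, 4, 5],
    "order_date": pd.to_datetime([
        "2019-01-03", "2019-01-03", "2019-01-03",
        "2019-01-20", "2019-02-05", "2019-02-05", "2019-02-11",
    ]),
    "product": ["iphone", "iphone", "cable", "case", "iphone", "case", "cable"],
})

# build the Month_Year label in the 1_2019 style from the question
raw["Month_Year"] = (
    raw["order_date"].dt.month.astype(str) + "_" + raw["order_date"].dt.year.astype(str)
)
raw["day"] = raw["order_date"].dt.normalize()

# orders per month: two iphones in one order still count as one order
monthly = raw.groupby("Month_Year")["order_id"].nunique()

# orders per day, tagged with the month: several values per Month_Year
daily = (
    raw.groupby(["Month_Year", "day"])["order_id"]
    .nunique()
    .reset_index(name="orders")
)
```

`monthly` holds a single number per month, so a violin plot of it has no distribution to draw. `daily` has several values per month, so `sns.violinplot(x="Month_Year", y="orders", data=daily)` has a spread to show.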

#

man I love violin or catplot

#

basically it only works if I have the necessary columns right

#

@charred egret actually on second thought

#

is it possible to create a new dataframe and use violin plot on that

#

nvm i figured out now

#

yeah cuz the columns don't exist in original data

#

imma just make a new dataframe with the filtered data instead

#

kek why didn't I think of that first

#

nvm it doesn't work

#

but thanks melio

#

the question's wrong it seems

#

I managed to create the violin plot now

#

yeah it's alright

#

I think it's the question that's the problem, not my solution

#

because 3 people helped and it's still like this

#

so I'm pretty sure this violin plot works if the condition's different

steady basalt
#

would imbalanced classes be the reason why i have a recall of 1.0

#

for an uncommon class

#

thus 96% accuracy

#

and a recall of 0.0 for another

#

its just guessing one
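
A small sketch of how that pattern shows up, assuming a 96/4 class split and a model that always predicts the majority class. The numbers are invented to match the 96% figure.

```python
import numpy as np

# assumed toy labels: 96 samples of class 0, 4 samples of class 1
y_true = np.array([0] * 96 + [1] * 4)
# a "model" that always guesses class 0
y_pred = np.zeros_like(y_true)

accuracy = float(np.mean(y_pred == y_true))


def recall(y_true, y_pred, label):
    """Fraction of the samples that truly belong to `label` that were predicted as `label`."""
    actual = y_true == label
    return float(np.sum((y_pred == label) & actual) / np.sum(actual))


recall_majority = recall(y_true, y_pred, 0)
recall_minority = recall(y_true, y_pred, 1)
```

Here accuracy comes out at 0.96 while recall is 1.0 for the class that is always guessed and 0.0 for the other one, which is why per-class recall says more than accuracy on imbalanced data.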

severe oriole
#

yup it should vary tbh

#

like it works if there's a range per stuffs

#

like the default violin plot file given right

steady basalt
#

why would resampling for random forest fixing the single class guessing translate to better irl performance

#

in case you get another balance?

severe oriole
#

yup now I completely understand ur question now

#

i guess I'm better off trying with other questions instead then

#

thanks both Melio and Sterlecus for helping me today

#

you both did your best in helping me

exotic notch
#

It's 3x3 to 80x80, pillow won't be high enough quality

turbid knot
#

i hate google API

gleaming osprey
#

why is my accuracy 0.0000e+00?

scarlet siren
#

I've been trying to find an exact replica of matlab econ svd in numpy/scipy
The one in numpy doesn't seem to support the 'econ' parameter

gleaming osprey
#

this is my model:```py
model = keras.Sequential()

model.add(keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(48, 48, 1)))
model.add(keras.layers.MaxPooling2D((2, 2)))

model.add(keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)))
model.add(keras.layers.MaxPooling2D((2, 2)))

model.add(keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(48, 48, 1)))
model.add(keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(48, 48, 1)))
model.add(keras.layers.MaxPooling2D(2, 2))

model.add(keras.layers.Flatten())

model.add(keras.layers.Dense(576, activation='relu'))
model.add(keras.layers.Dense(576, activation='relu'))
model.add(keras.layers.Dense(256, activation='relu'))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dense(7, activation='softmax'))

model.summary()```

#

this is the compile:```py
model.compile(
    loss = "categorical_crossentropy",
    optimizer = keras.optimizers.Adam(learning_rate=0.001),
    metrics = [tf.keras.metrics.Accuracy()]
)```

this is the fit:```py
run = model.fit(
    x_train,
    y_train,
    batch_size = 16,
    epochs = 10
)```

ripe forge
#

however, google fu i can do

gleaming osprey
#

and this is how the fit output looks like:```py
Epoch 1/10
1795/1795 [==============================] - 45s 25ms/step - loss: 1.8332 - accuracy: 0.0000e+00
Epoch 2/10
1795/1795 [==============================] - 44s 25ms/step - loss: 1.6180 - accuracy: 0.0000e+00
Epoch 3/10
1795/1795 [==============================] - 44s 24ms/step - loss: 1.4600 - accuracy: 0.0000e+00
Epoch 4/10
1795/1795 [==============================] - 45s 25ms/step - loss: 1.3633 - accuracy: 0.0000e+00
Epoch 5/10
 707/1795 [==========>...................] - ETA: 26s - loss: 1.3010 - accuracy: 0.0000e+00```

#

why is the accuracy 0?
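
One likely culprit, assuming the labels are one-hot vectors: `tf.keras.metrics.Accuracy()` counts how often the predicted values are exactly equal to the labels, and softmax outputs are floats that almost never equal 0 or 1 exactly. Passing `metrics=["accuracy"]` or `tf.keras.metrics.CategoricalAccuracy()` compares the argmax instead. The sketch below mimics both comparisons in plain numpy with made-up values; it is not the Keras implementation.

```python
import numpy as np

# made-up one-hot labels and softmax-like outputs for 2 samples, 3 classes
y_true = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.2, 0.7],
                   [0.6, 0.3, 0.1]])

# exact element-wise equality, the kind of check Accuracy() performs
exact_match = float(np.mean(y_pred == y_true))

# compare the most probable class, the kind of check CategoricalAccuracy() performs
categorical = float(np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1)))
```

Both predictions pick the right class here, yet the exact-match score is 0.0 while the argmax-based score is 1.0. That matches a loss that keeps dropping while the reported accuracy stays at 0.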

wooden sail
#

so, economy-size SVD means that, for an m×n matrix, only the first min(m, n) left and right singular vectors are returned along with the singular values, so the parts that would only ever multiply zeros are dropped. you get a square singular value matrix and, in general, rectangular singular vector matrices

gleaming osprey
#

what should the output look like

#

is the output eg. [0, 0, 1, 0, 0, 0, 0]?

#

i don't get it

#

*it is

vernal hull
#

Can anyone teach me AI ml

#

??

wooden sail
#

almost everything they return will be close to zero

vernal hull
#

Yoo edd

wooden sail
#

what size is the input?

gleaming osprey
#

shape is (48, 48, 1)

wooden sail
#

even worse

#

use smaller filters

gleaming osprey
#

ok

vernal hull
#

@wooden sail teach me AI??

gleaming osprey
#

imma try that

wooden sail
#

wait wait, i also misread, give me a second to settle down

gleaming osprey
#

about what number?

#

4? 8?

vernal hull
#

I can’t understand

#

Like 😦

gleaming osprey
vernal hull
#

Ya

gleaming osprey
#

I have a brilliant tutorial

wooden sail
#

i mixed up the (3, 3) with the number of filters, that's my bad, ignore what i said. the network looks mostly ok, the only things i can think of off the top of my head are that the learning rate is either too small or too large. try changing it in both directions and see how that changes

vernal hull
gleaming osprey
#

1 sec

wooden sail
gleaming osprey
#

but i have a problem when I do that

#

when the learning rate is too small, the loss turns NaN

vernal hull
#

So can we start now?

wooden sail
#

if you have specific questions right now, i can give you a hand for a few mins

vernal hull
#

Ok thx

#

Like umm

#

What are the basic math u need for AI??

wooden sail
#

the basics are probability and statistics, linear algebra, and multivariable calculus

gleaming osprey
#

@wooden sail I have tried what you said, and sometimes the accuracy starts at 40 then drops to 1

vernal hull
#

So let’s start with multi variable calculus what ever that is

gleaming osprey
#

other times nothing happens

wooden sail
#

have you tried the rmsprop optimizer instead of adam?

wooden sail
gleaming osprey
#

and mean_squared_error

#

changed the learning rate

#

batch size

wooden sail
#

what if you make the network smaller, for starters?

scarlet siren
#
[U_J,sigma_J,V_J] = svd(temp_J,'econ');
sigma_J = diag(sigma_J);
svp_J = length(find(sigma_J>1/mu));
if svp_J>=1
    sigma_J = sigma_J(1:svp_J)-1/mu;
else
    svp_J = 1;
    sigma_J = 0;
end

I've tried to convert this to numpy, but I can't find an explanation for the line sigma_J = sigma_J(1:svp_J)-1/mu;
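
A possible numpy translation, as a sketch: `np.linalg.svd(..., full_matrices=False)` plays the role of MATLAB's `'econ'` option, and the line in question reads as singular value shrinkage: keep the first `svp_J` singular values (the ones above `1/mu`) and subtract `1/mu` from each. The function name `shrink_singular_values` is made up for this example.

```python
import numpy as np


def shrink_singular_values(temp_J, mu):
    """NumPy sketch of the MATLAB snippet above."""
    # full_matrices=False gives the reduced ("econ") factors; s is already
    # a 1-D array sorted in descending order, so no diag() call is needed
    U, s, Vh = np.linalg.svd(temp_J, full_matrices=False)

    # number of singular values strictly above the threshold 1/mu
    svp = int(np.sum(s > 1.0 / mu))
    if svp >= 1:
        # MATLAB's sigma_J(1:svp_J) is s[:svp] with 0-based slicing
        sigma = s[:svp] - 1.0 / mu
    else:
        svp = 1
        sigma = np.array([0.0])
    return U, sigma, Vh, svp


# small made-up 4x3 matrix with singular values 3, 2 and 0.5
A = np.zeros((4, 3))
A[0, 0], A[1, 1], A[2, 2] = 3.0, 2.0, 0.5
U, sigma, Vh, svp = shrink_singular_values(A, mu=1.0)
```

Note that numpy returns `Vh`, the conjugate transpose of MATLAB's `V_J`, so `Vh.conj().T` is needed wherever the original code uses `V_J` directly.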