#data-science-and-ml

1 messages · Page 408 of 1

wooden sail
#

.latex $\begin{bmatrix} u_1 & u_2 & \dots & u_N \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix} = \sum{n=1}^N u_n v_n$

strange elbowBOT
wooden sail
#

this is not as helpful as the texit bot

#

.latex $\begin{bmatrix} u_1 & u_2 & \dots & u_N \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix} = \sum{n=1}^N u_n v_n$

strange elbowBOT
wooden sail
#

this is terrible, it doesn't update on edit. i'm sorry about the spam

#

.latex $\begin{bmatrix} u_1 & u_2 & \dots & u_N \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix} = \sum_{n=1}^N u_n v_n$

strange elbowBOT
wooden sail
#

@scenic tulip finally, here we go. i just did this several times. after you've seen it enough times, you can intuit the operations in your head for simple transformations

scenic tulip
#

@wooden sail yeah that's sweet. I've never heard of latex but it allows you to post calculations in an image that is somehow formatted?

wooden sail
#

it's for formatting pdf documents in general, but it's famous for allowing you to nicely typeset equations and diagrams

mild dirge
#

It's good for formal stuff like research papers

scenic tulip
#

wow i've never heard of this but yeah....wow that's awesome stuff

serene scaffold
#

like, you can even set variables and stuff.

tidal bough
#

it's like word, except you actually have an idea what's going on with your document

serene scaffold
#

I was using a macro that unexpectedly added exclamation points, and that isn't even what the macro is specified to do.

tidal bough
#

I actually learned just today that you're supposed to, in align, put & right before the alignment point, not after

#

I was aligning a ton of shit by spaces. 😔

serene scaffold
#

I thought the & was the alignment point

tidal bough
#

all I know is that if you do

x =& 5\\
y =& 10

the spaces after = get smaller than they should be

#

&= is the right way
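
A minimal illustration of the point above, assuming the `amsmath` package is loaded. With `&` placed before the relation, `align` keeps the usual spacing on both sides of `=`:

```latex
\begin{align}
  x &= 5 \\
  y &= 10
\end{align}
```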

serene scaffold
plush jungle
#

does anyone know why my tensorflow isn't detecting any gpus on my pc? I've got an rtx 3080

serene scaffold
#

and how do you know it's not detecting your gpu

mild dirge
#

@plush jungle ?

serene scaffold
#

also, when I say "how do you know it's not detecting your gpu", I'm not asking "are you sure that it's not ...".
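
One common way to answer the "how do you know" question is to ask TensorFlow directly which GPUs it can see. This is a sketch, not a diagnosis of the setup above: the helper name `visible_gpus` is made up, and the import is guarded so the snippet also runs where TensorFlow is not installed.

```python
def visible_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices reports the devices visible to the TensorFlow runtime
    return tf.config.list_physical_devices("GPU")


gpus = visible_gpus()
print("TensorFlow not installed" if gpus is None else f"GPUs detected: {gpus}")
```

An empty list usually points at the driver, CUDA, or cuDNN installation rather than at the model code.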

misty flint
lapis sequoia
#

@misty flint

#

Can I befriend you

#

I code in python

misty flint
#

umm

#

i would hope you do

#

since we are in a python server

fierce loom
#

Is there any AI developer community of python

sinful spire
#

hey everybody, is there an app or something which can be able to fix your code while programming, I'm doing my project, I mean is this thing existing before?

tacit basin
edgy agate
#

heyy guys //

#

i am working on a dataset .. but having some problem . please help me out

gray orchid
#

and where is your problem

serene scaffold
edgy agate
# serene scaffold if we click the link, we have to request access. but it's easier if you create a...

You are provided with the leads data of last year containing both direct and indirect leads. Each lead provides information about their activity on the platform, signup information and campaign information. Based on his past activity on the platform, you need to build the predictive model to classify if the user would buy the product in the next 3 months or not. ....... this is what i want to do

edgy agate
serene scaffold
arctic wedgeBOT
#

Hey @edgy agate!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

fiery adder
#

Hello! I am thinking of an idea for research on the topic of parameter optimisation viewed as a language problem. Here is what I mean by that - There are already multiple big pre-trained language models such as CodeBERT which can generate good contextual embeddings for source code. So if they're used as a baseline and built upon, we can create a supervised learning pipeline that predicts code parameters which satisfy desired outcomes. For example if we have the function def f(x): return 2 + 2 * x - x*x we can ask the model to maximise it and to find that the desired x is 1. At the beginning we expect to be able to solve such simple optimisation problems, but with time we may derive methods which are able to solve for more parameters and complicated functions and probably even have such a model to optimise parameters for other ML models in the future. If achieved this approach may replace or work together with traditional hyper-parameter tuning solutions like Bayesian optimisation (which are computationally expensive since they require testing the function itself with multiple parameters).

#

One approach will be to take the problem purely as a language task and replace the desired parameter(s) with a masked token and then train a model (fine-tune pre-trained BERT-like model) to predict such tokens given desired outcomes.

#

Another approach will be to take advantage of the pre-trained NL-PL models to generate embeddings for the source code, but then use these representations in a separate regression model. In this case it might be a good idea to build some meta learning environment to better generalise to different functions and then take a few-shot approach by first providing a few examples of input-result pairs and then asking for predicted parameters given a desired outcome.

#

What do you think about the idea as a whole and the proposed approaches? Do you think they're feasible and if not - why? Do you think such a study will be pointless and if so - do you have better ideas in this direction?

serene scaffold
#

@fiery adder is there a tldr for this?

mild dirge
#

I think this whole setup seems kinda vague, trying to make a language model predict the outcome of some given formula seems like an inefficient and likely bad way to optimize parameters

#

How would the language model even know what good parameters are?

wooden sail
#

generating data to train this seems ghastly. either you need to check out basically all the machine learning everyone has ever done and the learned parameters, or you'd need to somehow make it self supervised and each example will involve solving a whole machine learning problem. or did you have some idea on how to circumvent this?

loud cove
river maple
#

can someone explain how the code for the gradient descent is theta = theta - alpha * (1/m) * (X' * ((X * theta)-y));

#

this is the formula

wooden sail
#

you want an explanation of the math or how to code the math?

river maple
#

how to code the math..

#

why is there no summation in that code?

wooden sail
#

i think you need to escape some asterisks in what you wrote with a \

#

but at any rate, it looks like the expression you wrote is in terms of matrices and vectors

#

matrix-vector multiplication is itself a sum of products, just like the image you showed

#

.latex \boldsymbol{Ax} = \begin{bmatrix} \boldsymbol{A_{1,:} x} \\ \boldsymbol{A_{2,:} x} \\ \vdots \\ \boldsymbol{A_{m,:} x} \end{bmatrix}

strange elbowBOT
wooden sail
#

oof

#

here

#

you can think of a vector as an n x p matrix with p = 1

#

then you see the multiplication is indeed a sum following that definition
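
To make the "matrix product is a sum" point concrete, here is a NumPy sketch of the same update. The Octave expression `X' * ((X * theta) - y)` becomes `X.T @ (X @ theta - y)`. The sizes and random data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3                      # m examples, n features (made-up sizes)
X = rng.normal(size=(m, n))
y = rng.normal(size=m)
theta = np.zeros(n)
alpha = 0.1

# Vectorized step: X.T @ (X @ theta - y) hides the sum over examples
theta_vec = theta - alpha * (1 / m) * (X.T @ (X @ theta - y))

# Same step with the summation written out explicitly
grad = np.zeros(n)
for i in range(m):
    error_i = X[i] @ theta - y[i]        # prediction error for example i
    grad += error_i * X[i]               # accumulate error_i * x_i
theta_loop = theta - alpha * (1 / m) * grad

print(np.allclose(theta_vec, theta_loop))
```

The cost function needs an explicit `sum` because squaring is done element by element there, so no matrix product does the adding for you.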

river maple
#

hmm makes sense

#

but for the cost function i had to use the sum function

#

J = (1/(2 * m)) * sum(((X * theta)-y).^2)

#

oh i get it

#

Thanks for the help

young granite
#

in row/col 3/3 i want to plot 2 x axis but i dont know how i can achieve that in the grid, im able to plot a second y-axis but x doesnt work...

scenic tulip
#

@wooden sail you on rn?

#

Maybe someone else knows this. So I'm writing out arrays of results, containing 20 elements to a file. When it writes the output comes out as this :

#
```
[[  7   8   8 ...   2  -2   1]
 [ -7 -13 -14 ...  -2   5   3]
 ...
 [ -1  -3  -2 ...  -8  -6   2]
 [  2   4  15 ...   8   3   0]
 [ -2  -2  -9 ...  -1  -4  -2]]
```
#

How can I view all of the in between data

mild dirge
wooden sail
#

you could write the contents as a csv if you like
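
A sketch of the CSV suggestion, plus an alternative that keeps the bracketed text form but turns off NumPy's `...` summarization. The array `results` and the file name are stand-ins, not the original data.

```python
import os
import sys
import tempfile

import numpy as np

results = np.arange(1600).reshape(40, 40)   # stand-in for the real results

# Option 1: write every element to a CSV file, one array row per line
csv_path = os.path.join(tempfile.mkdtemp(), "results.csv")
np.savetxt(csv_path, results, fmt="%d", delimiter=",")

# Option 2: keep the text form but disable the "..." summarization
full_text = np.array2string(results, threshold=sys.maxsize, max_line_width=1000)

# Reading the CSV back gives the same array
loaded = np.loadtxt(csv_path, delimiter=",", dtype=int)
```

`np.set_printoptions(threshold=sys.maxsize)` applies the same setting globally if every later `print` should show the full array.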

lapis sequoia
#

Hi, I am going through deep minds RL slides by David Silver, and I have a question on moving mean and how it forgets past data.
in chapter 4, for model free RL, there is a topic on monte-carlo method that uses incrementing mean with running average

V(St) ← V(St) + α (Gt − V(St))

here, α is supposed to be the one thing that represents a moving mean/running average. what I don't understand is how would the formula forget the past values of V(St) when we keep using it iteratively.

wooden sail
#

if you do a couple of iterations, it might become more clear. let's replace this with a simpler nomenclature first. say, y <- y + a(x - y)

#

we can rearrange that into (1 - a)y + ax. and you probably have a condition like 0< a < 1

#

at the next iteration, instead of x, we have some other value. let's call it z.

#

then we get (1-a) [(1-a) y + ax] + az

#

we expand into (1-a)^2 y + a(1-a)x + az

#

as the sequence continues, y will get multiplied by increasingly high powers of (1-a), and the previous values of the updates Gt too (but with a lower exponent than y)

#

since (1-a) is also between 0 and 1, the more you repeat this, the smaller the value of y, and also of the old updates

#

i wrote it that way so that you can kinda see that the algorithm produces a weighted sum at every iteration. the higher the iteration number, the smaller the weights of the older quantities
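
The expansion above can be checked numerically. This sketch runs the update `y <- y + a(x - y)` over a made-up sequence and compares it with the explicit weighted sum. The values of `a`, `y0` and `xs` are arbitrary.

```python
a = 0.3                       # step size, assumed to satisfy 0 < a < 1
y0 = 10.0                     # initial estimate
xs = [4.0, 7.0, 1.0, 5.0]     # made-up sequence of targets (the G_t values)

# Iterative form: y <- y + a * (x - y)
y = y0
for x in xs:
    y = y + a * (x - y)

# Closed form: a weighted sum where older terms carry higher powers of (1 - a)
n = len(xs)
weights = [a * (1 - a) ** (n - 1 - k) for k in range(n)]
closed = (1 - a) ** n * y0 + sum(w * x for w, x in zip(weights, xs))

print(abs(y - closed) < 1e-12)
print((1 - a) ** n, weights)  # weight on y0, then weights on xs from oldest to newest
```

The weights add up to 1 and shrink geometrically with age, which is the sense in which the running average "forgets" old values.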

lapis sequoia
#

thanks for taking the time to answer Edd, just give me a minute to process this

lapis sequoia
#

is it because of the a(x-y)

wooden sail
#

it should be the case, yes

lapis sequoia
#

ohhh, i think im getting it

#

wait, is a less than 1 because of the idea of iterative mean?

#

like the formula before α was 1/N(t), but for non-moving average

wooden sail
#

i would have to see how your book defines this stuff, i would call it either "momentum" from the ML perspective or "convex combination" from the linalg standpoint

lapis sequoia
#

oh, im using the RL slides from deep ai, the 2015 one, should I share the link? i think im getting the idea tho

wooden sail
#

but the idea, if you look at V and G as vectors, is that this operation yields a vector pointing from V to G and passing through V. this is the parametric equation of a line joining two points in N dimensional space. if alpha is equal to 0, you stay exactly at V

#

if alpha becomes 1, you move all the way to G

#

for values in between, you land on the line segment connecting them

#

setting alpha = 0 means "no change", while alpha = 1 means "forget the previous stuff entirely and just move to G"
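
Written out, the update is a convex combination of the old estimate and the new target (same symbols as the slide formula, with the state argument dropped for brevity):

```latex
\[
  V \leftarrow V + \alpha\,(G - V) \;=\; (1 - \alpha)\,V + \alpha\,G,
  \qquad 0 \le \alpha \le 1 .
\]
```

Setting $\alpha = 0$ returns $V$ unchanged and $\alpha = 1$ returns $G$, matching the two endpoints described above.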

lapis sequoia
#

thats a lot of linear algebra words 😄

#

but im getting the idea, Ill have to dig deeper into it

#

thank you @wooden sail , I thought I would have to wait a while to get help

wooden sail
#

glad it helps. i'm not familiar with those slides, so if you could share the link, that'd be cool. it's not like i'm a mathematician or anything either, but i've learned most of the stuff this way thanks to uni

lapis sequoia
#

for linear algebra, did you use the "mathematics for machine learning" ?

#

I have a copy but its just sitting there cause I thought I had just enough linear algebra

wooden sail
#

i've checked some of linear algebra done right by axler and linear algebra done wrong by treil, and also gilbert strang's linear algebra. just straight up math books

#

and then several papers and books on optimization, signal processing, etc

lapis sequoia
#

lots of really great tips, thank you kindly

wooden sail
#

i learned about machine learning as an application of maths, really very late into the game 😛 i don't know most of the pop nomenclature

lapis sequoia
#

you have a very strong foundation tho, coming from math

wooden sail
#

i'm comfortable with mangling indices and wiping my tears while staring at a piece of paper, yes

misty flint
#

my friend who also has a background in math is my go-to when i dont understand a new algorithm

#

hes also very good at solving problems irl too

lapis sequoia
#

i tried getting into linear algebra with 3b1b,

#

i guess that is way below the barrier

wooden sail
#

actually

#

this is my hot take, but 3b1b linalg is not good to learn from

#

it's GREAT to review concepts, but NOT to learn

lapis sequoia
#

what about khan academy?

wooden sail
#

it's presented from the standpoint that you already learned the concepts (somewhat) or have at least heard about them

lapis sequoia
#

i had a hard time learning from there

wooden sail
#

khan academy is usually solid for practicing concrete problems. grinding through a few can build intuition

lapis sequoia
#

really? I had a really hard time there, felt like the talk about determinants was different from 3b1b

#

i thought of trying gilbert strang but 3b1b 16 video playlist looked more enticing.

wooden sail
#

did they hit you with a laplace expansion

lapis sequoia
#

they hit me with a basic 3 equation thingy

wooden sail
#

i think gilbert strang's book is pretty good. it won't go into more abstract stuff though

lapis sequoia
#

hmmm, im motivated now, ill try the video playlist first tho

#

oh, if you dont mind, I have another question on RL

#

about temporal difference

wooden sail
#

mhm?

lapis sequoia
#

V(St) ← V(St) + α (Rt+1 + γV(St+1) − V(St))

do you happen to know this formula?

#

for temporal difference, I have a question on it thats bothering me

wooden sail
#

looks familiar

lapis sequoia
#

its supposed to be used for model free RL, when we can't step into the state of next time step St+1

#

but the formula has the recursive V(St+1) in it,

#

wait, I think I am making the question more complicated

#

so, to restart, if we have the model, we could recursively call V(St+1) from V(St) which in turn calls V(St+2) from V(St+1)

#

thats what I got for a model based RL

#

but temporal difference is an algorithm thats used for model free

#

and it has the V(St+1) being used as a part of the formula to find V(St)

#

im confused on how a model free algorithm can do this
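
One common reading, offered as a sketch and not as what the slides say: in the TD update, V(St+1) is not a recursive call. It is a lookup in the current table of estimates for the next state that was actually observed after taking a step. No transition model is needed because the environment itself supplies St+1 and Rt+1. The five-state random walk and the values of `alpha` and `gamma` below are illustrative assumptions.

```python
import random

random.seed(0)
n_states = 5                      # non-terminal states 0..4, illustrative
V = [0.0] * n_states              # current value estimates (a plain table)
alpha, gamma = 0.1, 1.0


def step(state):
    """Environment: move left or right at random. The agent never reads these rules."""
    nxt = state + random.choice((-1, 1))
    if nxt < 0:
        return None, 0.0          # left terminal, reward 0
    if nxt >= n_states:
        return None, 1.0          # right terminal, reward 1
    return nxt, 0.0


for _ in range(5000):
    s = n_states // 2
    while s is not None:
        s_next, reward = step(s)                        # observe S_{t+1}, R_{t+1}
        v_next = 0.0 if s_next is None else V[s_next]   # table lookup, not recursion
        V[s] += alpha * (reward + gamma * v_next - V[s])
        s = s_next

print([round(v, 2) for v in V])
```

The estimate for the next state may be poor early on. The update still uses it, which is what "bootstrapping" refers to.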

wooden sail
#

i'm not really sure, the nomenclature in the slides is all weird to me 😛

lapis sequoia
#

im looking for the "pain" reaction lol

#

I guess that problem is for the tomorrow me

lapis sequoia
#

what kind of visualisation can I do to show this data

#

I am thinking of a scatterplot with equal distances on x axis for each country. With 2 coloured dots at each x denoting the value of administered vaccines for each date. With legend denoting colour of each date.

glacial sparrow
#

anyone familiar with sklego's RBF here?

scenic tulip
#

@wooden sail writing as csv did it...thank you!!

wooden sail
#

cool

#

if you just need it stored for later but don't need to actually look at the matrix, consider also .npy or npz
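
A short sketch of the `.npy` / `.npz` route. The array and file names are stand-ins.

```python
import os
import tempfile

import numpy as np

results = np.arange(1600).reshape(40, 40)   # stand-in data
folder = tempfile.mkdtemp()

# .npy stores a single array exactly (dtype and shape included)
npy_path = os.path.join(folder, "results.npy")
np.save(npy_path, results)
restored = np.load(npy_path)

# .npz bundles several named arrays into one file
npz_path = os.path.join(folder, "results.npz")
np.savez(npz_path, results=results, doubled=results * 2)
with np.load(npz_path) as bundle:
    doubled = bundle["doubled"]
```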

lapis sequoia
#

what might be the issue?

river maple
#

why is a column of ones added to the data matrix after feature normalization?
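
The question went unanswered in the thread, so this is general background, not a reply from the channel. The column of ones lets `theta[0]` act as the intercept, so that `X @ theta` computes `theta0 + theta1*x1 + ...` in one product. It is added after normalization because a constant column has zero standard deviation and cannot be normalized. The feature values below are made up.

```python
import numpy as np

raw = np.array([[2104.0, 3.0], [1600.0, 3.0], [2400.0, 4.0], [1416.0, 2.0]])  # made-up features

# Normalize first: each feature column gets zero mean and unit standard deviation
mu, sigma = raw.mean(axis=0), raw.std(axis=0)
X_norm = (raw - mu) / sigma

# Then prepend the ones column; normalizing it would divide by a zero std
X = np.column_stack([np.ones(len(X_norm)), X_norm])

theta = np.array([5.0, 1.0, -2.0])         # theta[0] plays the role of the intercept
predictions = X @ theta                    # equals 5.0 + X_norm @ theta[1:]
```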

main fox
nova matrix
#

Guys is standard matplotlib and seaborn enough for visualisations or should we know some advanced visualisation libraries like cufflinks

mild dirge
#

that obviously depends on how complicated you need stuff to be

#

But matplotlib can do a whole lot, have never been limited so far, except for maybe 3d stuff

fiery adder
# wooden sail generating data to train this seems ghastly. either you need to check out basica...

I am also not sure how data can be generated efficiently. But it turns out that HPO has already been tested as a sequence problem with Transformers. https://arxiv.org/abs/2205.13320

fiery adder
wooden sail
#

you see they discuss there usage of vast amounts of HPO data

#

which at google they certainly have. idk how easy it is to get that in the wild, though

#

you rely on people all over the world having solved enough problems to make this trainable

misty flint
#

but matplotlib/seaborn is pretty robust for quick visualizations

#

my personal favorite is plotly

#

theres also specific data viz software like tableau/powerBI/looker/etc.

#

but that tends to be more in the business context where you are creating something for business stakeholders

#

i.e. you need to create a dashboard showing X, Y, Z for someone in a specific business unit/function

#

if that is your world, then i highly recommend "storytelling with data" by cole knaflic

rich merlin
#

I'm relatively new to pycharm and pandas,
does anyone have a minute to help me figure out where to start and how to make assessments on trends?

fleet musk
#

helo friends, i am getting a warning in pandas, did some reading on stack overflow, unable to fully grasp it

#
ticker["candle"] = np.array(range(len(ticker)))%25 + 1
__main__:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
#

how to fix it?
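
The warning usually means `ticker` was itself produced by slicing another DataFrame. Here is a sketch of one common fix, assuming `ticker` came from a filter on a larger frame. The `prices` frame and its columns are made up: take an explicit copy before adding the column.

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame({"symbol": ["A", "B"] * 30, "close": np.arange(60.0)})  # made-up data

# .copy() makes ticker an independent DataFrame rather than a slice of prices
ticker = prices[prices["symbol"] == "A"].copy()

# Number the rows 1..25 and repeat, as in the original line
ticker["candle"] = np.arange(len(ticker)) % 25 + 1
```

If the new column is meant to end up in the parent frame instead, assigning through `prices.loc[mask, "candle"] = ...` is the form the warning text recommends.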

lapis sequoia
main fox
loud cove
lapis sequoia
main fox
lapis sequoia
lapis sequoia
#

I code

#

hbu

misty flint
#

if anyone's interested in RecSys, there's a series by the great chip huyen this month; starting tomorrow at 10a PT!

dreamy phoenix
#

Hello all. I am having a lot of fun messing around with pyplot and I need a bit of some help.

misty flint
#

3.5 sessions, ending with a big RecSys ML System Design Session

runic crystal
#

Hey everyone! Can anyone confirm if we can change color of seaborn catplots based on conditional statements

dreamy phoenix
#

I am trying to draw a line graph with formatted percentages on the y-axis. Currently, these are formatted strings. The formatted strings are not ordered correctly, trying to sort them gives me a squiggle.

#

I think what I would need to do is find a way to format the floating point numbers as they're displayed instead of converting them to a string and formatting that.
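
A sketch of that idea using matplotlib's `PercentFormatter`: the y values stay floats, so ordering is correct, and only the tick labels are formatted. The data are made up, and `xmax=100` assumes the values are already in percent units (use `xmax=1` for fractions).

```python
import matplotlib
matplotlib.use("Agg")             # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

years = [2015, 2016, 2017, 2018]  # made-up data
estimates = [8.2, 7.9, 7.4, 7.8]  # kept as floats, already in percent units

fig, ax = plt.subplots()
ax.plot(years, estimates)

# xmax=100 means the data are already percentages; decimals=1 gives labels like "7.5%"
ax.yaxis.set_major_formatter(PercentFormatter(xmax=100, decimals=1))
fig.canvas.draw()
```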

runic crystal
#

And bar_label doesn't seem to work with catplots either. Any Idea about it ?

dreamy phoenix
#

okay thank you

dreamy phoenix
#
import pandas as pd
from isolation import isolate_total_stub, isolate_age_stub
import matplotlib.pyplot as plt
from matplotlib.ticker import (MultipleLocator,
                               FormatStrFormatter,
                               AutoMinorLocator)

# very simple extraction, drop some columns and check some data
cdc_data = pd.read_csv('CDC_Delay_of_Care_Data.csv')
cdc_data = cdc_data.drop(columns=['INDICATOR','FLAG','UNIT'])


# do you have good data?
data_types_valid = type_check_numeric_columns(cdc_data)
acceptable_null_threshold = compare_nulls_against_threshold(cdc_data)


# separate the categories of delayed care
delay_of_medical_care = cdc_data[cdc_data.PANEL == 'Delay or nonreceipt of needed medical care due to cost']

# isolate the totals stub
total_delay_of_medical_care = isolate_total_stub(delay_of_medical_care)

x_axis = total_delay_of_medical_care.YEAR
y_axis = total_delay_of_medical_care.ESTIMATE
fig, ax = plt.subplots()

ax.plot(x_axis, y_axis)
plt.show()
runic crystal
dreamy phoenix
#

I am not using the ticker library imports at this time

#

oh sorry I thought you were talking to me. excuse me

runic crystal
#

this is what I wrote for the colors

#

the commented lines

runic crystal
#

I gave that a try and it did not work. I am now certain that my data is wacky. I have repeated values that are true for some year and false for other years. And I was plotting year-wise graphs from my data. Those values being true for some and false for others is toasting up the library. I might just break my data into separate files rather than them being in a single file. That should do the job. Thanks anyway!

fleet sleet
#

Hey I am a beginner ,
trying to automate data from MySQL database to spread sheet and I have all the basic libraries required, sheets api is also enabled.. created credentials for the same on GCP
Have given the right path to the credentials.json file and everything still I seem to go nowhere
Can someone please help me out ?

fleet sleet
#

The debug log is

#

PS C:\Users\conta\OneDrive\Desktop\Workspace> & 'C:\Python310\python.exe' 'c:\Users\conta.vscode\extensions\ms-python.python-2022.6.3\pythonFiles\lib\python\debugpy\launcher' '51612' '--' 'c:\Users\conta\OneDrive\Desktop\Workspace\pyautomation\sheetsNew.py'
There is an Exception in credsLogin Function : 'module' object is not callable
Authentication DONE !
C:\Python310\lib\site-packages\pandas\io\sql.py:761: UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
warnings.warn(
MID PID merchant_name locality city
0 b'242307' b'1418703' b'Ruchi Curry Point' b'Manikonda' b'Hyderabad'
1 b'243056' b'1418703' b'Ruchi Curries' b'Madhapur' b'Hyderabad'
2 b'650871' b'1418703' b'Ruchi Curries' b'Nizampet' b'Hyderabad'
3 b'1235155' b'1418703' b'Ruchi Curry Point' b'Nizampet' b'Hyderabad'
4 b'1318633' b'1418703' b'Ruchi Curry Point, Nizampet' b'Nizampet' b'Hyderabad'
Deleting Google Sheet...
There is an Exception in clearGoogleSheet Function : Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Writing Google Sheet...
There is an Exception in writingGoogleSheet Function : Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Part 1 Completed !
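
The last two exceptions in the log name the environment variable `GOOGLE_APPLICATION_CREDENTIALS`. Below is a minimal sketch of setting it from Python before any Google client is created, assuming a service-account key file. The path is a placeholder, and whether this matches the credential type created on GCP is not clear from the log.

```python
import os
from pathlib import Path

# Placeholder path: point this at the actual service-account JSON key
key_path = Path("credentials.json").resolve()

if not key_path.is_file():
    print(f"Key file not found at {key_path}; fix the path before continuing")

# Google client libraries read this variable when they look up default credentials
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_path)
```

Separately, `'module' object is not callable` in `credsLogin` typically means a module name was called like a function, for example `foo(...)` after `import foo` when `foo.foo(...)` was intended.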

cinder matrix
#

Hi guys, i've built a model which takes keywords and generates narratives. However, i find the bleu and rouge evaluation isn't appropriate for my case.

So instead am thinking of evaluating by how much the user input keywords is present in the generated text. Would this be a proper way of evaluating how much keywords permeated in the text? Does such a metric or better exists? If not, how would i proceed? Thanks and please @ so i get notified when replying
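
A minimal sketch of the keyword-presence idea described above, not an established metric. The function name `keyword_coverage` and the example strings are made up, and matching is exact and case-insensitive on single words only.

```python
import re


def keyword_coverage(keywords, text):
    """Fraction of keywords that appear as whole words in text (case-insensitive)."""
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    if not keywords:
        return 0.0
    hits = [kw for kw in keywords if kw.lower() in tokens]
    return len(hits) / len(keywords)


score = keyword_coverage(
    ["dragon", "castle", "storm"],
    "The dragon circled the castle until dawn.",
)
print(score)
```

Inflected forms ("storms"), synonyms and multi-word keywords all count as misses here. Stemming or embedding similarity would be a natural next step if that is too strict.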

mint palm
#

i want to simulate transfer learning

#

how do i do it?

#

i have trained my model

#

now i wanna check how it will fine tune on deployment

loud cove
tacit basin
mint palm
mint palm
spring marsh
#

Can someone please help me on how to setup my GPU for deep learning on tensorflow

arctic wedgeBOT
#

Hey @wooden sail!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

mortal cairn
#

Hi, i'm trying to find the intersection point of two sets of data. Neither line can be defined by a mathematical function and each has about 21450 values of x and y. Any ideas of functions or libraries i can use?

serene scaffold
mortal cairn
#

they're series I read from a csv using pandas

#

Someone mentioned using shapely so I'm trying that now

serene scaffold
mortal cairn
#

Ah yea. That's true. Thanks for the idea

wooden sail
#

it doesn't seem like they have the same domain, so you'll have to do some padding. otherwise, that seems the easiest way (arg min (abs(diff)))
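
A sketch of the arg min (abs(diff)) idea. It does not pad: it restricts both curves to their overlapping x range and interpolates them onto a shared grid with `np.interp`, which is a swapped-in way of getting a common domain. The two curves are made-up straight lines so the result is easy to check.

```python
import numpy as np

# Made-up curves sampled on different x grids
x1 = np.linspace(0.0, 10.0, 500)
y1 = 0.5 * x1 + 1.0
x2 = np.linspace(2.0, 12.0, 400)
y2 = -0.3 * x2 + 6.0

# Restrict to the overlapping x range and resample both curves onto one grid
lo, hi = max(x1.min(), x2.min()), min(x1.max(), x2.max())
grid = np.linspace(lo, hi, 2000)
diff = np.interp(grid, x1, y1) - np.interp(grid, x2, y2)

# The grid point where the curves are closest approximates the crossing
idx = np.argmin(np.abs(diff))
x_cross, y_cross = grid[idx], np.interp(grid[idx], x1, y1)
```

This finds a single closest point. If the curves cross more than once, looking for sign changes in `diff` would locate each crossing.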

pliant pewter
#

No, it's just math

wooden sail
#

no, that's just math

serene scaffold
#

I was making a joke

#

also Aurendil is Edd's alt confirmed

pliant pewter
#

Lisp would have more parentheses

wooden sail
#

aurendil tried to joke with me before and also failed

serene scaffold
#

!otn s lisp

arctic wedgeBOT
#
Query results

• python-is-not-lisp

pliant pewter
#

Are there any successful NLP joke/sarcasm detectors out there?

serene scaffold
pliant pewter
#

It's kind of hard just from a language point of view, yeah. But I've noticed that lots of animals seem to have a concept of play/joking, and you can see it in their facial expression. Probably just need more information than just words.

wooden sail
#

btw, if any of you are interested, i'm preparing this short intro to jax. specifically, looking at jit, vectorization, and automatic differentiation of functions f:C^n -> R^m (cr or wirtinger calc). the final example does something that could be understood as some form of "deep unfolding"/self supervised training/hyper parameter optimization or whatever you wanna call it. the target is undergrad people with knowledge of linalg and optimization https://github.com/3ddP/jax_example/blob/master/examples.ipynb

#

any comments and/or feedback are welcome. analytic solutions are used to corroborate the jax results, but the math isn't explained. it's expected the students will already know it

misty flint
serene scaffold
#

the twitter API gives you their own sentiment scores, if I remember correctly. what are you trying to do?

#

you want to get the sentiment score of individual words? I've never heard of that

#

sentiment scores will reflect the sentiment of the whole tweet

misty flint
#

pretty nifty

misty flint
serene scaffold
#

that sounds good

#

did you get tweepy set up?

hollow sentinel
#

why does printing the head of the dataframe in thonny look like that

#

it looks gross lol

hollow sentinel
#

yeah but is there a cleaner way to look at the dataframe

serene scaffold
#

you can see that it says "bound method of ..."

#

did you try print(df.head()), where you call the method?
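
The difference in a few lines, with a made-up frame: `df.head` without parentheses is the method object, and its repr is the "bound method ..." text followed by a dump of the frame.

```python
import pandas as pd

df = pd.DataFrame({"a": range(10), "b": range(10, 20)})  # made-up frame

without_call = repr(df.head)    # the method object itself: "<bound method ...>"
with_call = df.head()           # calling it returns the first five rows

print(without_call[:40])
print(with_call)
```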

hollow sentinel
#

yep that's why

serene scaffold
#

but you're just using pandas' native printing functionality. I don't know if thonny does anything like pycharm's dataframe viewer thing

hollow sentinel
#

i can't even open anaconda-navigator on my mac anymore

#

soooo no more uploading ipynbs to my github

#

gonna stick out like a sore thumb 💀

bold timber
#

The dropoff_site column has some labels. How do I replace the missing values in load_weight when dropoff_site is 'MRF'?
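
One way to do this is with a boolean mask and `.loc`. The question does not say what the missing values should be replaced with, so this sketch assumes the mean of the non-missing MRF rows. The rows and the second site label are made up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "dropoff_site": ["MRF", "MRF", "TDS", "MRF", "TDS"],   # made-up rows
    "load_weight": [100.0, np.nan, np.nan, 300.0, 50.0],
})

is_mrf = df["dropoff_site"] == "MRF"
fill_value = df.loc[is_mrf, "load_weight"].mean()          # assumption: fill with the MRF mean

# Only rows that are both MRF and missing get the fill value
df.loc[is_mrf & df["load_weight"].isna(), "load_weight"] = fill_value
```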

tacit basin
chilly helm
#

can you freelance as a data scientist?

copper tinsel
mint palm
#

i want to generate that too

elder falcon
fallen crane
#

1-Is it possible to build a new programming language from scratch, as it is called 0101? Is there any knowledge currently available that helps to do that?

misty flint
fallen crane
#

2- When I review some visual and read sources, all I find is a theoretical explanation of 0101's supposed work steps from the beginning, but if I can ask, how was 0101 introduced into the electronic circuit, using any technology and any knowledge?

misty flint
#

also update: chip huyen's RecSys series is off to a great start

fervent vale
#

Hi guys

tacit basin
tacit basin
misty flint
#

3.5 hrs total. 10a PT on sundays

#

also im super interested so im def planning on completing this one

#

and this one is less of a lecturer-student style and more of a self-study group style where peeps share more of their experiences/learnings

#

so i like that format more since its interactive

burnt island
#

anyone with a good knowledge of SARIMAX and ARIMAX models or resources on time series forecasting.

I'm working on a personal project which has to do with crypto price modelling, I want to use SARIMAX or ARIMAX before CNN to model

tacit basin
#

I will quit my job one day and do all these Udemy courses lol

quaint wave
#

Hi guys, I'm currently writing up a project on the use of neural networks in detecting football tactics and stumbled across a paper which I don't understand. Would anyone be willing to help? I'll dm you the pdf

tacit basin
quaint wave
mint palm
misty flint
# tacit basin I will quit my job one day and do all these Udemy courses lol

Head to http://brilliant.org/TinaHuang/ to get started for free with Brilliant's interactive lessons. The first 200 people will also get 20% off an annual membership.

✉️ NEWSLETTER: https://tinahuang.substack.com/
It's about learning, coding, and generally how to get your sh*t together c:

In this video, I talk about why you keep quitting you...

#

she has some good points

#

that i feel is very relevant for people studying the topics in this channel

fleet musk
#

helo, the help channel is very slow to help with problems sometimes

#

can i ask here

misty flint
#

and sometimes people cant help you here either; it just depends on the problem

fleet musk
misty flint
#

/availability

#

oh you are the guy that was stuck with pycharm

fleet musk
#

ok. it is pandas related

misty flint
#

just ask it

fleet musk
tacit basin
misty flint
#

eww financial data RunFail

fleet musk
#

using spyder IDE
i have extracted stock data and put it into a dataframe

fleet musk
mint palm
#

For example for making a model that allocate resources based on parameters, can we simulate those condition??

fleet musk
#

i ask miwojo then

mint palm
misty flint
#

why did you ping me

mint palm
#

Mistake sir, pardon my dust

tacit basin
mint palm
#

Network slicing

fleet musk
# fleet musk

@tacit basin in this pic, last 3 columns are of interest
i have column "candle" and i need to calculate mean of values of Candle 10, 15, 20 etc only if they belong to same date
there are 22 different dates
what do?

misty flint
#

my worst nightmare

mint palm
#

We allocate embb mmtc or urllc based on speed quantity of data etc

fleet musk
#

helo melio. rex is bullying me. halp plez

misty flint
#

was attacked by one the other day

mint palm
#

Everything is cardinal

misty flint
#

yeah melio, you can help stardust; idk how im bullying stardust tho Oopsies

fleet musk
misty flint
#

groupby ftw

#

that

#

and json_normalize

#

pretty up there on my pandas fave functions

mint palm
#

Wait tell me whats the point of transfer learning

#

How do we fine tune

fleet musk
#

is this correct?

#

ticker is df

#

dataframe

tacit basin
mint palm
fleet musk
#

getting this error, this wasnt there before i added the groupby line

mint palm
#

To learn and fit well enough

#

Depends?

tacit basin
#

Yep depends on how similar the pretrained data is to your domain data

tacit basin
fleet musk
#

ok ok

#

why do i need mean?

tacit basin
#

I thought you said mean

fleet musk
#

for example, make group by dates, so 22 groups, then i take candle number from candle column,

for that candle number, i need to take Candle "close" price from another column

tacit basin
#

Reread again. You said mean that's why I guess

fleet musk
#

ok. ill omit mean for now

tacit basin
#

groupby returns groups and if you specify an aggregate method it will calculate that on each group

fleet musk
#

ill need to read on aggregate

#

ticker.groupby("date",axis=1)

#

is this what i do, for date grouping
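
`axis=1` asks pandas to group columns, not rows, which is probably not what is wanted here. Below is a sketch of grouping rows by date, assuming columns named `date`, `candle` and `close` (names guessed from the discussion, data made up):

```python
import numpy as np
import pandas as pd

# Made-up frame: 3 dates with 25 candles each
dates = pd.to_datetime(["2022-06-01", "2022-06-02", "2022-06-03"]).repeat(25)
ticker = pd.DataFrame({"date": dates, "close": np.arange(75.0)})
ticker["candle"] = np.arange(len(ticker)) % 25 + 1

# Keep only the candles of interest, then average their close price per date
wanted = ticker[ticker["candle"].isin([10, 15, 20])]
mean_close = wanted.groupby("date")["close"].mean()

# Or look at the individual close prices: one row per date, one column per candle
close_by_candle = wanted.pivot(index="date", columns="candle", values="close")
```

With the real data, `mean_close` would have one entry per date (22 in this case), and `close_by_candle` gives the per-candle prices if the mean is not needed.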

fleet musk
#

groupby didnt work, not suited here
i used numpy split, now need to find a way to perform operations on split portions of each dataframe

solid urchin
pseudo wren
#

what is a good way to account for date while working on a model

#

the date is relevant to my dataset but it is in a format the python interpreter cannot understand

#

the date is important for me to keep because i need it to record trends in this dataset, but i'm not sure what the best way of separating this data is
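
A sketch of one common approach: parse the strings with `pd.to_datetime`, then derive numeric features or resample by period. The date format is not given in the question, so the `%d/%m/%Y` format, column names and values below are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["03/01/2022", "17/01/2022", "05/02/2022"],   # made-up day/month/year strings
    "value": [10, 12, 9],
})

# An explicit format avoids day/month ambiguity; adjust it to match the real data
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# Derived columns a model can use directly
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["dayofweek"] = df["date"].dt.dayofweek       # Monday = 0

# With a datetime index, trends can be aggregated by period
monthly = df.set_index("date")["value"].resample("MS").mean()
```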

errant onyx
#

Hello people, I am thinking about picking up either "An Introduction to Statistical Learning (with applications in R)" or "Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow"

#

Do you have any experience with these, which one would you recommend?

serene scaffold
#

@errant onyx we're going to be partial to the second one, because those three things are python libraries. R is a separate language, ie not python

errant onyx
#

I know that's the case, it's also the reason I asked the question identically in the R discord

#

But I sort of wanted to know if you guys think it was good

serene scaffold
errant onyx
#

I'm sort of an R person but it seems most ML things are done in Python in industry

#

That one seems good too

#

Saw it being recommended too

serene scaffold
muted vector
#

(srry to interrupt convo but how would u guys recommend starting learn AI with python?)

errant onyx
#

I'm in academia so most things are done in R here

serene scaffold
serene scaffold
errant onyx
#

haha, of course he/she's from academia

serene scaffold
#

Oh, I know another. Also in academia. I could ask him for advice about how he switched to Python

muted vector
#

this? :>

serene scaffold
arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

serene scaffold
#

It's on there.

muted vector
#

ty!

errant onyx
#

I've read Python for data analysis and python crash course, and done some coding in Python in general

#

But there's a world of books/resources, such a difficult choice

serene scaffold
#

Or attend one?

#

They might give you an O'Reilly subscription. In which case you can try any book without fear of commitment

errant onyx
#

I'm doing a phd, should ideally be finishing in 1.5 years

#

so I have that time to learn more data science basically

#

I'm not super worried about getting a book, I'll probably read it cover to cover in any case

#

Just you know, I wanna hit the sweet spot when it comes to a book

#

Not too basic, not too theoretical

#

anyway I'm gonna look up the Data Science from Scratch book

#

thanks

serene scaffold
#

What is your PhD in

errant onyx
#

Medical sciences

#

so yeah, very different

serene scaffold
#

BTW I'm at a wedding so I might disappear if someone makes me dance

#

But I don't wanna

errant onyx
#

Yeah, I know the feeling

#

Been there

serene scaffold
errant onyx
#

quite a few times too

#

I think I would be able to contribute the most there, but I don't feel I should be constrained to only the medical field-

#

What I mean is that that is probably the best goal for me in a perfect world

#

but you never know the opportunities that might pop up

#

I work a bit with bioinformatics right now though

#

But I have poor knowledge of the underlying algorithms I'd say

#

I also am on week 5 of Andrew Ng's machine learning course

#

He's gonna replace it with a new course though, kind of typical

hollow sentinel
#

where do i find apis for data science?

serene scaffold
hollow sentinel
#

or i could do my first project with web scraping

#

oh i didn't mean to ping whoever that was

#

i find it hard to think of projects that are personally applicable to me

#

so i get frustrated when i think of projects

#

it's tough

misty flint
#

ben rogajan introduced an API to me and im using that one for a DE project

hollow sentinel
#

i think fitness might be an idea

misty flint
#
  1. Is there an API call limit?
    Yes, there are two rate limits per API: 4,000 requests per day and 10 requests per minute. You should sleep 6 seconds between calls to avoid hitting the per minute rate limit. If you need a higher rate limit, please contact us at code@nytimes.com.
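A small sketch of the "sleep 6 seconds between calls" advice quoted above. The `throttled` helper is hypothetical, not part of any API client; it only spaces out iteration, and the actual request call is left as a comment.

```python
import time

def throttled(items, min_interval=6.0, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than one every `min_interval` seconds."""
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item

# Usage (hypothetical): one request per query, at most 10 per minute.
# for query in throttled(queries):
#     response = fetch(query)   # fetch() stands in for the real API call
```

Passing `sleep` and `clock` as parameters is only there so the spacing logic can be checked without actually waiting.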
#

up to you. as long as you find the problem interesting, youre more likely to finish it

hollow sentinel
#

ah yes

#

data i cannot access

#

we love

misty flint
#

lots of peeps in the bioinformatics/pharmaceutical space use R. CDC exclusively uses R as well

#

im still biased towards python tho since if youre going to deploy models, its going to be in python

#

sdk's for R are very uncommon

misty flint
#

absolutely tragic

#

🕯️

muted vector
#

is matplotlib a good thing to plot stuff with?

royal crest
#

Yes

#

it's the standard

hollow sentinel
#

that and seaborn

#

seaborn is nice

misty flint
#

p l o t l y

royal crest
#

no one talks about pyCairo

#

;-;

misty flint
serene scaffold
#

@errant onyx as I was going to say earlier, if you get a PhD in something that isn't data science in itself, but you can also do data science, I would say that puts you in a good position. Also my cousin's wife is very angry at me for refusing to dance with her.

royal crest
misty flint
#

weddings make you do mandatory things unless you hide

#

oh yeah?

misty flint
worldly dawn
# misty flint

that's the same for pretty much every type of engineer

#

Being able to get shit done, but being an expert in 1 or 2 areas

lapis sequoia
#

I am a dot

plush jungle
#

i'm trying to run stylegan2 ada
https://github.com/johndpope/stylegan2-ada
but I keep getting this error
RuntimeError: Could not find MSVC/GCC/CLANG installation on this computer. Check compiler_bindir_search_path list in "C:\python\stylegan2-ada-main\stylegan2-ada-main\dnnlib\tflib\custom_ops.py".

#

the file it's talking about has this code

```py
def _prepare_nvcc_cli(opts):
    cmd = 'nvcc ' + opts.strip()
    cmd += ' --disable-warnings'
    cmd += ' --include-path "%s"' % tf.sysconfig.get_include()
    cmd += ' --include-path "%s"' % os.path.join(tf.sysconfig.get_include(), 'external', 'protobuf_archive', 'src')
    cmd += ' --include-path "%s"' % os.path.join(tf.sysconfig.get_include(), 'external', 'com_google_absl')
    cmd += ' --include-path "%s"' % os.path.join(tf.sysconfig.get_include(), 'external', 'eigen_archive')

    compiler_bindir = _find_compiler_bindir()
    if compiler_bindir is None:
        # Require that _find_compiler_bindir succeeds on Windows.  Allow
        # nvcc to use whatever is the default on Linux.
        if os.name == 'nt':
            raise RuntimeError('Could not find MSVC/GCC/CLANG installation on this computer. Check compiler_bindir_search_path list in "%s".' % __file__)
    else:
        cmd += ' --compiler-bindir "%s"' % compiler_bindir
    cmd += ' 2>&1'
    return cmd
```
plush jungle
#

msvc is installed, I don't know about nvcc

#

is that part of cuda?

#

cause I already installed cuda-toolkit

worldly dawn
plush jungle
#

when I google "download nvcc" it just directs me to download cuda-toolkit

worldly dawn
#

is it in your path?

#

I would recommend to dig into the content of _find_compiler_bindir() and see what is it looking for

plush jungle
#

yeah I looked into that actually

#

there are actually two versions from two different github forks

#
```py
patterns = [
    'C:/Program Files (x86)/Microsoft Visual Studio//Professional/VC/Tools/MSVC//bin/Hostx64/x64',
    'C:/Program Files (x86)/Microsoft Visual Studio//BuildTools/VC/Tools/MSVC//bin/Hostx64/x64',
    'C:/Program Files (x86)/Microsoft Visual Studio//Community/VC/Tools/MSVC//bin/Hostx64/x64',
    'C:/Program Files (x86)/Microsoft Visual Studio */vc/bin',
    'C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build/vcvars64.bat',
]
def _find_compiler_bindir():
    for compiler_path in patterns:
        if os.path.isdir(compiler_path):
            return compiler_path
    return None
```
#

this is one

#

this is the other

#
```py
def _find_compiler_bindir():
    hostx64_paths = sorted(glob.glob('C:/Program Files (x86)/Microsoft Visual Studio/*/Professional/VC/Tools/MSVC/*/bin/Hostx64/x64'), reverse=True)
    if hostx64_paths != []:
        return hostx64_paths[0]
    hostx64_paths = sorted(glob.glob('C:/Program Files (x86)/Microsoft Visual Studio/*/BuildTools/VC/Tools/MSVC/*/bin/Hostx64/x64'), reverse=True)
    if hostx64_paths != []:
        return hostx64_paths[0]
    hostx64_paths = sorted(glob.glob('C:/Program Files (x86)/Microsoft Visual Studio/*/Community/VC/Tools/MSVC/*/bin/Hostx64/x64'), reverse=True)
    if hostx64_paths != []:
        return hostx64_paths[0]
    vc_bin_dir = 'C:/Program Files (x86)/Microsoft Visual Studio 14.0/vc/bin'
    if os.path.isdir(vc_bin_dir):
        return vc_bin_dir
    return None
```
worldly dawn
#

oh bo

#

y

plush jungle
#

so I figured out that it's looking for the C compiler in visual studio

worldly dawn
#

do any of these directories exist for you?

plush jungle
#

no. instead my MSVC is located here

```
C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.32.31326/bin/Hostx64\x64
```
#

so i did this

#
```py
def _find_compiler_bindir():
    return 'C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.32.31326/bin/Hostx64\x64'
    for compiler_path in patterns:
        if os.path.isdir(compiler_path):
            return compiler_path
    return None
```
#

and I got this

#
RuntimeError: NVCC returned an error. See below for full command line and output log:

nvcc "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\python\_pywrap_tensorflow_internal.lib" --gpu-architecture=sm_86 --use_fast_math --disable-warnings --include-path "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include" --include-path "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include\external\protobuf_archive\src" --include-path "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include\external\com_google_absl" --include-path "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include\external\eigen_archive" --compiler-bindir "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.32.31326/bin/Hostx64d" 2>&1 "C:\python\stylegan2-ada-main\stylegan2-ada-main\dnnlib\tflib\ops\fused_bias_act.cu" --shared -o "C:\Users\Alex\AppData\Local\Temp\tmp2gk5m51p\fused_bias_act_tmp.dll" --keep --keep-dir "C:\Users\Alex\AppData\Local\Temp\tmp2gk5m51p"

'nvcc' is not recognized as an internal or external command,
operable program or batch file.```
#

so my current working theory is that nvcc is installed with cuda-toolkit but it's not in my path

worldly dawn
#

yeah, sounds like it can't find nvcc

plush jungle
#

these are in my path

```
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\libnvvp
```
#

but nothing about nvcc

worldly dawn
#

is nvcc in either of these directories?

plush jungle
#

let me check

#

nvcc is in the first one

#

it's an exe

worldly dawn
#

ok then that's weird

plush jungle
#

is it possible that by doing

def _find_compiler_bindir():
    return 'C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.32.31326/bin/Hostx64/x64'

i've given it a bad path somehow?

#

like that it can find nvcc but it's thrown off by the path i'm giving it?
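One detail worth checking here, offered as a possibility rather than a diagnosis: the command line logged earlier ends the compiler directory in `Hostx64d`, which is what Python produces when `\x64` appears inside a normal string literal, because `\x64` is the hex escape for the letter `d`. A short sketch with shortened, hypothetical paths:

```python
# Hypothetical shortened paths, only to show the escape behaviour.
bad = 'C:/VS/bin/Hostx64\x64'           # "\x64" is read as the character "d"
good_forward = 'C:/VS/bin/Hostx64/x64'  # forward slashes need no escaping
good_raw = r'C:\VS\bin\Hostx64\x64'     # raw string keeps backslashes literal

print(bad)           # C:/VS/bin/Hostx64d
print(good_forward)  # C:/VS/bin/Hostx64/x64
print(good_raw)      # C:\VS\bin\Hostx64\x64
```

This would not explain the "'nvcc' is not recognized" message, which is a separate problem, but it would matter once nvcc itself runs.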

worldly dawn
#

then your assumption would be that the actual error does not match the error message

plush jungle
#

yeah

worldly dawn
#

which is fair, but would have to be proven

#

you should be able to see how nvcc is called exactly and either see what is being returned or being able to call the same thing manually yourself

plush jungle
#

oh yeah

#

yeah when i type it into terminal it says the same thing

worldly dawn
#

what if you type just "nvcc" ?

plush jungle
#

that's what I did

worldly dawn
#

ok, I haven't used windows in years. But does the .exe matter at the end? like in nvcc vs nvcc.exe ?

plush jungle
#

no, typically you don't put the .exe on the end

worldly dawn
#

ok, then something is wrong with your path or installation

#

you should at the very least get an nvcc error

#

not a system error about the executable

#

and the fact that just calling nvcc without arguments gives you such an error does mean that it's not about your compiler argument
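A quick way to test that from Python, as a sketch: `shutil.which` searches the same `PATH` that the current process sees, so it shows whether `nvcc` is resolvable from the interpreter that runs the training script.

```python
import os
import shutil

def find_on_path(program):
    """Return the full path of `program` if PATH resolves it, else None."""
    return shutil.which(program)

nvcc_path = find_on_path("nvcc")
if nvcc_path is None:
    print("nvcc not found; PATH entries mentioning CUDA:")
    for entry in os.environ.get("PATH", "").split(os.pathsep):
        if "cuda" in entry.lower():
            print("  ", entry)
else:
    print("nvcc resolves to", nvcc_path)
```

Environment variables are copied when a process starts, so a terminal or IDE opened before `PATH` was edited keeps the old value until it is restarted.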

plush jungle
#

it's gotta be the path. someone on stackoverflow had the same issue in 2017

#

/Developer/NVIDIA/CUDA8.0.61/bin
As indicated in the install guide, the correct path is:

/Developer/NVIDIA/CUDA-8.0.61/bin
                      ^```
#

but that's not what my path looks like in the year of our lord 2022

#

mine looks like this

#

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin

worldly dawn
#

then get your path out of the matrix into the current year

cursive walrus
#

hi guys, i am new to python programming. i have this code that detects a plant and i am getting this error: IndexError: tuple index out of range. please help me

worldly dawn
cursive walrus
worldly dawn
cursive walrus
#

ok

#

```py
import cv2
import os
#Cascade
cascade = cv2.CascadeClassifier('./golden_pothos_cascade.xml')
#Reading Image
capture = cv2.VideoCapture(0)
while True:
    success, img =capture.read()
    #Converting to Gray Image
    gray_Image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    #Adding Gaussian Blur
    blur=cv2.GaussianBlur(gray_Image,(13,13),cv2.BORDER_DEFAULT)
    #Detecting Plant
    detection_result, rejectLevels, levelWeights =cascade.detectMultiScale3(blur, scaleFactor=1.0485258, minNeighbors=6, minSize=(30,30),outputRejectLevels = 1)
    greaterweightindex = 0
    currentweight = levelWeights[0]
    #Area with Highest Confidence
    for (weight) in levelWeights:
        if weight > currentweight:
            greaterweightindex = greaterweightindex+1
            currentweight = weight
    #Highest Confidence Area
    x = detection_result[greaterweightindex][0]
    y = detection_result[greaterweightindex][1]
    w = detection_result[greaterweightindex][2]
    h = detection_result[greaterweightindex][3]
    #Modifying Confidence
    confidence= round(currentweight[0], 2)
    finalconfidence= confidence * 100
    #Drawing Rectangle
    cv2.rectangle(img,(x,y), (x+w, y+h), (0,0,255), thickness=2)
    cv2.rectangle(img,(x,y-35), (x+w, y), (0,0,255), thickness=-1)
    #Adding Text
    cv2.putText(img, str(f"Golden Pothos {finalconfidence}%"), (x,y-5), cv2.FONT_HERSHEY_COMPLEX, 0.6, (255,255,255), thickness=2)
    #Displaying Image
    cv2.imshow("Detected Plant",img)
    #Adding Wait
    if cv2.waitKey(1) == 13:
        break
cv2.waitKey(1)
```

#

the error is at currentweight = levelWeights[0]

plush jungle
#

and that's why it's out of range

#

run the code again but before the line that throws the error put

```py
print(levelWeights)
```
#

@cursive walrus

plush jungle
#

yep, it's as I expected, an empty tuple

#

ok try this

#
```py
greaterweightindex = 0
if not levelWeights:
    continue
currentweight = levelWeights[0]
```
cursive walrus
plush jungle
#

do this and tell me what it prints

```py
greaterweightindex = 0
if not levelWeights:
    continue
print(levelWeights)
currentweight = levelWeights[0]
```
plush jungle
#

who wrote this code?

#

cause this looks like a mistake

```py
confidence= round(currentweight[0], 2)
```
cursive walrus
plush jungle
#

currentweight isn't a list or a tuple, so of course this will throw an error

#

what happens if you do this

#
```py
confidence= round(currentweight, 2)
```
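Putting the two fixes together, a sketch of a helper that picks the highest-weight detection and copes with an empty result. `best_detection` is a hypothetical name, and it is written against plain sequences so it runs without OpenCV. It also tracks the index of the best weight directly; the original loop increments `greaterweightindex` by one each time a larger weight is seen, which does not generally give that weight's position.

```python
def best_detection(boxes, weights):
    """Return (box, confidence) for the highest-weight detection, or None."""
    if len(boxes) == 0 or len(weights) == 0:
        return None  # nothing detected in this frame
    # Weights may be plain floats or 1-element sequences; handle both.
    flat = [float(w[0]) if hasattr(w, "__len__") else float(w) for w in weights]
    best = max(range(len(flat)), key=flat.__getitem__)
    return tuple(boxes[best]), flat[best]

# Example with made-up detections in (x, y, w, h) form.
boxes = [(0, 0, 10, 10), (5, 5, 20, 20)]
print(best_detection(boxes, [[0.3], [1.7]]))
print(best_detection(boxes, [0.3, 1.7]))
print(best_detection((), ()))
```

Inside the capture loop, a `None` result would be the cue to `continue` to the next frame.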
cursive walrus
#

thank you

plush jungle
#

I am the duck

#

hey @worldly dawn how did you get to be a helper?

#

do you have to defeat one in single combat?

worldly dawn
plush jungle
#

walk uphill both ways through the snow in the heat of summer while row reducing a matrix?

fierce pine
#

@plush jungle hello

plush jungle
#

yo

fierce pine
# plush jungle yo

I have a data set but it is in a txt file. Idk how to load it and I want to do linear regression on it..

#

Also sorry for pinging you like this

plush jungle
#

the issue is just that it's in a txt file?

fierce pine
plush jungle
#

what type of data is it

#

and how is it supposed to be arranged

errant onyx
arctic wedgeBOT
#

Hey @fierce pine!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

plush jungle
#

!past

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

fierce pine
#

Its half the data

wooden sail
#

looks like you could read the file line by line, split the strings based on the spaces, and pick the columns you're interested in afterwards

#

or make a pandas dataframe and ask it for a column
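The line-by-line approach could look like this. It is a sketch that assumes whitespace-separated numeric columns, since the actual file isn't shown; `read_columns` and the sample lines are made up for illustration.

```python
def read_columns(lines, wanted):
    """Parse whitespace-separated lines, keeping the columns in `wanted`."""
    rows = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # blank line
        try:
            rows.append([float(parts[i]) for i in wanted])
        except (ValueError, IndexError):
            continue  # header or malformed line
    return rows

# Hypothetical file contents; any iterable of lines works, including
# an open file:  with open("data.txt") as f: read_columns(f, [1, 2])
sample = ["time temp humidity", "0 21.5 40", "1 21.7 41", ""]
print(read_columns(sample, wanted=[1, 2]))  # [[21.5, 40.0], [21.7, 41.0]]
```

The pandas route would be `pd.read_csv(path, sep=r"\s+")` followed by selecting a column by name.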

plush jungle
#

yeah or you could use regex

wooden sail
#

also, regression is kind of a loose term. do you mean fit a first order polynomial to a sequence of data? fit a general curve to a sequence of data? have the input be vector-valued?

fierce pine
wooden sail
#

no clue 😛

#

if you've never done any of this before, i'd see if you can predict a future value in one of the columns given the past values

#

plot the data of the column first to see if it has some behavior you can recognize, pick a model function based on that, and fit its parameters

fierce pine
wooden sail
#

you could try to predict a full row

#

lemme get back home and tex something up

#

lol

fierce pine
wooden sail
#

aight

#

we're gonna look at a short time window linear predictor

#

our assumption is that, over a reasonably small time period, the data behaves like a straight line. pretty much a loose form of taylor's theorem

#

so we wanna set up a model that captures this and learn its parameters

#

we first recall that a linear equation looks as follows (gonna write instead of tex in the end)
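The equation was posted as an image that did not survive in this log. A reconstruction consistent with the surrounding messages (two unknowns, m and b, with the matrix form on the right) would be:

```latex
\[
y \;=\; m x + b
  \;=\; \begin{bmatrix} x & 1 \end{bmatrix}
        \begin{bmatrix} m \\ b \end{bmatrix}
\]
```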

#

which we've conveniently written in matrix form on the right. notice we have 2 unknowns, m and b, because we observe y and x in the data

#

we need at least as many observations of y and x as the number of parameters we want to find
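For the two-parameter line, that means at least two (x, y) pairs; with more than two, least squares picks the best compromise. A dependency-free sketch using the closed-form least-squares solution (the function name and data are illustrative, not from the chat):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + b; needs at least 2 distinct x values."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)  # 2.0 1.0
```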

#

now, in your case, we don't just have one value x, but several measurements (temperatures and other stuff). and we want to use old data to predict those quantities, so we also don't have just one value of y

#

which we can all arrange into a single matrix vector equation
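This equation was also an image. A reconstruction under the assumption that x is the previous row of n measurements and y is the next row, which matches the n^2 + n parameter count mentioned right after:

```latex
\[
\mathbf{y} \;=\; A\,\mathbf{x} + \mathbf{b}
  \;=\; \underbrace{\begin{bmatrix} A & \mathbf{b} \end{bmatrix}}_{M}
        \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix},
\qquad
A \in \mathbb{R}^{n \times n},\;
\mathbf{b} \in \mathbb{R}^{n},\;
M \in \mathbb{R}^{n \times (n+1)}
\]
```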

#

that's for a single row of data. but we need several rows to compute all of the parameters in M (n^2 + n of them). that means we need at least n + 1 different columns in x and y

#

and the whole point of this is: those columns are the rows of data in your file
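The stacked form was likewise shown as an image. One way to write it, assuming K consecutive rows of the file and an ordinary least-squares solution (the solution method is an assumption, not confirmed by the chat):

```latex
\[
Y = \begin{bmatrix} \mathbf{y}_1 & \mathbf{y}_2 & \cdots & \mathbf{y}_K \end{bmatrix},
\qquad
X = \begin{bmatrix}
      \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_K \\
      1 & 1 & \cdots & 1
    \end{bmatrix},
\qquad
\mathbf{y}_k = \mathbf{x}_{k+1}
\]
\[
Y = M X
\quad\Longrightarrow\quad
\hat{M} = Y X^{\top} \left( X X^{\top} \right)^{-1}
\quad \text{when } X \text{ has full row rank, which requires } K \ge n + 1
\]
```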

wooden sail
#

the matrix M you get from this is a linear predictor of y

#

in particular, a predictor that only looks at the previous row of data. you can change this by changing the shape of M and giving X a block toeplitz structure

violet gull
#

Import "tensorflow.keras.optimizers" could not be resolved

#

help

#

yes tensorflow is installed

#

ive tried both 2.8 and 2.9

#

2.7 apparently is non existent

#

ping me with response because this chat is so dead id get bored staring at it

hasty grail
violet gull
hasty grail
#

hmm no idea then

#

maybe try importing keras first

violet gull
hasty grail
#

did keras successfully import?

#

if not then you should install keras

violet gull
hasty grail
#

weird

violet gull
#

very sadge

#

tensorflow cringe

hasty grail
#

try reinstalling it maybe

#

are you using conda?

violet gull
violet gull
hasty grail
#

... or venv

#

Chances are, setting up a clean environment would resolve installation issues

violet gull
#

idk how to make a venv right

#

and the tutorials are bad

#

the one i made was on wrong version of python

wooden sail
# violet gull no

one simple workaround is to not import the optimizers like that and call them by the full name when you need them

violet gull
#

i dont need workaround i need the intended way to work like its supposed to

#

and if the normal imports wont work then those wont work either

wooden sail
#

can you at least try? many people on google complain they get the same error you do, but it still works when importing tf and keras, and then calling keras.optimizers

violet gull
#

can u give example of what u mean

wooden sail
#
```py
import tensorflow as tf
optim = tf.keras.optimizers.Adam()
```

like so

violet gull
#

i think that worked

wooden sail
#

other than that, people suggest to use tensorflow.python.keras.etc , with that extra python in the name

#

well if that works, that's good enough. seems to be an IDE problem

violet gull
#

ok ty

violet gull
#

i switched to a jupyter note book and the tensor imports are still broken

#

the devs of tensorflow deserve a cactus up their bum

#

and jupyter deserves cactus up bum for giving useless error messages

arctic wedgeBOT
#

Hey @violet gull!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

violet gull
#

someone save me from this cringeness i just want to do coding and tensor flow makes me want to commit hate crimes its so terrible

somber burrow
#

Hi guys, i have a problem with exporting a .txt file to .csv using pandas, written in columns. can someone help me?

serene scaffold
#

@somber burrow try explaining what the problem is

#

Don't "ask to ask"

somber burrow
#

i have a problem, i tried to read a .txt file and export it to .csv, splitting the lines on a delimiter into columns by category.

file.txt is like this

[groups]
admins = user1,user2,user3
users_network = user4,user5
users_m4s = user6,user7,user8,user9
and the .csv file should be

groups

user1 = admins
user2 = admins
user3 = admins
user4 = users_network
user5 = users_network
user6 = users_m4s ... for the rest of element of category line

#

```py
import pandas as pd
import numpy as np

df = pd.read_table(r"D:\GIT-files\Automate-Stats\SVN_sample_files\sample_svn_input.txt" , sep='=',engine='python')
print(df)

df.to_csv(r"D:\GIT-files\Automate-Stats\SVN_sample_files\sample_svn_input_update.csv" , index=None)

df = pd.read_table (r"D:\GIT-files\Automate-Stats\SVN_sample_files\sample_svn_input_update.csv" , sep='=',engine='python')
print(df)
```

#

but its not displaying and exporting right

#

practically, in the lines from the txt file, the part to the left of " = " is the group and the part after it is the elements of that group

#

i want to display the group for each element separately

serene scaffold
#

@somber burrow

[groups] 
admins = user1,user2,user3 
users_network = user4,user5 
users_m4s = user6,user7,user8,user9 

this is not a csv. csv is strictly comma-separated values on individual lines. you would need a more sophisticated parser for this.

#

you might need to write your own regular expression
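A regular expression isn't strictly required for this layout. As a sketch, assuming every data line has the form `group = user,user,...` and section headers look like `[groups]`, plain string splitting gives one (user, group) row per member, which the `csv` module can then write out. The `user` and `group` column names are made up for the example.

```python
import csv
import io

def groups_to_rows(lines):
    """Turn 'group = a,b,c' lines into (user, group) pairs."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue  # skip blanks and section headers like [groups]
        group, _, members = line.partition("=")
        for user in members.split(","):
            user = user.strip()
            if user:
                rows.append((user, group.strip()))
    return rows

sample = """[groups]
admins = user1,user2,user3
users_network = user4,user5
""".splitlines()

rows = groups_to_rows(sample)

# Write to an in-memory buffer here; a real file opened with
# open(path, "w", newline="") works the same way.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["user", "group"])
writer.writerows(rows)
print(buffer.getvalue())
```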

cinder schooner
#

hello, I'm a software engineer and I have been trying to specialize in AI for a year now. When I was in software, I was used to preparing for big tech interviews by practicing coding interviews and system design interviews. There are plenty of resources about that on the internet. But now that I'm into AI, I've been wondering what I need to prepare in order to do well at interviews for machine learning or AI positions. Are coding problems still relevant? How do I prepare for system design for AI? What do big tech companies ask for this kind of position? Thank you for your answers, I'm really grateful for being part of this discord community.

serene scaffold
cinder schooner
#

I'm sorry if I didn't explain as I should I have a software engineering degree and then got to a masters degree in data science. I'm not currently working in software. @serene scaffold

#

and I did work as a software engineer for like a year and a half but they were all part time jobs

#

I'm also working on a lot of personal projects in AI etc but I'm really trying to know if it make sense to get back at preparing coding interviews and if not what to prepare

serene scaffold
#

I only have experience with interviews for career starters, so I should let someone else comment. but I would at least be prepared to talk about anything you worked on during your masters. did you publish?

gilded flame
#

What caused this to skip to the next column index?

serene scaffold
#

I thought that was where we were lemon_sweat

serene scaffold
gilded flame
serene scaffold
#

!code

arctic wedgeBOT
#

Hey @gilded flame!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

serene scaffold
gilded flame
#

                    
```py
cursor = cnx.cursor()
cursor.execute(QUERY)
df = pd.DataFrame(cursor.fetchall())

if alldf is not None:
    if not df.empty:
        alldf = pd.concat([alldf, df], axis=0)
else:
    alldf = df

print(df)
field_names = [i[0] for i in cursor.description]
print(field_names)

xlswriter = pd.ExcelWriter('{}/{}.xls'.format(type, loc), engine='openpyxl')

if not df.empty:
    df.columns = field_names
    df.to_excel(xlswriter, index=False)  # was index=false, which raises NameError
    xlswriter.save()
else:
    cnx.close()
```
#
```py
def saveToExcel(query,filename):

    xlswriter = pd.ExcelWriter("%s.xls"%(filename),engine='openpyxl')
    queryDatas = executor(query)
    print(queryDatas)
    export = queryDatas
    export.to_excel(xlswriter)
    xlswriter.save()


    print("succes savetoExcel")
```
#

using pandas.concat([],axis=0) to stack the dataframes vertically but won't stack vertically?
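One possible cause, assuming the first snippet runs inside a loop: on the first pass `alldf = df` makes both names point at the same frame, and `df.columns = field_names` then renames its columns. Frames from later passes still have the default integer columns when they reach `pd.concat`, and `concat` aligns on column labels, so their data lands in new columns instead of underneath. A small sketch with made-up data:

```python
import pandas as pd

field_names = ["id", "name"]  # stand-in for the names from cursor.description

first = pd.DataFrame([(1, "a")])   # default columns 0, 1
second = pd.DataFrame([(2, "b")])  # default columns 0, 1

first.columns = field_names        # only the first frame gets real names
misaligned = pd.concat([first, second], axis=0)
print(misaligned.shape)            # (2, 4): columns id, name, 0, 1

second.columns = field_names       # name every frame before concatenating
stacked = pd.concat([first, second], axis=0, ignore_index=True)
print(stacked.shape)               # (2, 2)
```

Building each frame with `pd.DataFrame(cursor.fetchall(), columns=field_names)` before the `concat` would avoid the mismatch.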

fierce pine
wooden sail
#

the way i wrote it, all columns are both dependent and independent 😛 since the idea is to take a full row (data from all columns) and use it to try to predict the next full row of data. anything with numeric values, let's say

fierce pine
wooden sail
#

all of it

#

it will use all the previous rows of data to predict the next row of data, as long as you can convert the data to numerical values in some way

#

i would say that, since the sensor data is gathered at a regular interval, you can ignore the date and time

vital lodge
#

hi

#

does anyone know sort of video classification

#

like using audio and image features for classification

hollow sentinel
#

sounds like deep learning

#

CNN/RNN

serene scaffold
hollow sentinel
#

has anyone used the mysql workbench with a mac

#

i was thinking of doing some kind of exploratory data analysis project

#

with power BI

#

honestly why do that when python exists

normal moth
#

Hello guys, so I am trying to learn Data Science from ground up
I have fairly decent amount of exposure to Python but don't know anything related to Data Science.
Are there any good sources, courses and/or YT channels which I can refer to for learning about Data Science

#

If anyone could help I would be grateful!

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

normal moth
hallow panther
#

adventures in overfitting

wooden sail
#

dunno if i'd call that overfitting

tacit basin
wooden sail
#

that looks like underfitting instead, since it's not close to describing the data, let alone the noisy data

#

the model hasn't been trained enough or cannot represent the data correctly

tacit basin
#

With training, train gets better and valid gets worse. That's the definition of overfitting, isn't it?

wooden sail
#

ah wait, what is the plot showing

#

since the axes are not labelled, i assumed this was data and predictions

#

is it the loss?

#

if so, then yes

arctic quail
#

hello 🙂

#

if i have for instance a function with 3 parameters, whose gradient i want to approximate, how can i pass these 3 parameters into the approx_fprime function?
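`scipy.optimize.approx_fprime` takes the point as its first argument `xk`, a single 1-D array, so the usual answer is to pack the three parameters into one array and unpack them inside the function. The sketch below shows the same forward-difference idea without SciPy; `forward_gradient` is a hypothetical stand-in, not SciPy's implementation.

```python
def forward_gradient(f, xk, eps=1e-6):
    """Forward-difference gradient of f at the point xk (a sequence)."""
    base = f(xk)
    grad = []
    for i in range(len(xk)):
        shifted = list(xk)
        shifted[i] += eps          # nudge one parameter at a time
        grad.append((f(shifted) - base) / eps)
    return grad

def f(p):
    a, b, c = p                    # unpack the three parameters
    return a ** 2 + 3 * b + a * c

# Exact gradient at (1, 2, 3) is (2a + c, 3, a) = (5, 3, 1).
print(forward_gradient(f, [1.0, 2.0, 3.0]))
```

With SciPy the call would have the same shape: `approx_fprime([1.0, 2.0, 3.0], f, 1e-6)`.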

untold smelt
#

anyone would be willing to help me with a basic quiz in AI?

plush jungle
#
```py
torch.cuda.is_available()
```
is always returning false
#

the internet says to upgrade your nvidia drivers, so I did that and it's still happening

subtle grotto
#

Hi I have a question about one of the debugging exercises. In the Arguments, Parameters, and Debugging section, 7. Debugging Functions - 1st screenshot

this is telling me there is a problem on line 21, but the actual problem is up on line 13- 2nd screenshot

Can anyone please explain to me how this debugging error would point me to find the “correct” error? Thanks.

long locust
subtle grotto
#

there were two issues with the code...line 13 is missing the ":", and down in the 'def mean' section...'sm_list/len_list' - supposed to be sum_list.

oak olive
#

Hi!

brave sand
#

does anyone have any experience with shapley values?

oak olive
#

Is this a good enough binarization?

#

Could an OCR recognise the number?

topaz prairie
#

In tensorflow, which metric tracks how confident a categorical CV model is with its predictions while training? Similar to accuracy, but I'm trying to see an average of how confident my model is with its predictions.
I'm basically looking for the mean confidence I guess?
What's the metric called for something like this? I'm using softmax activation on my output layer, if that matters.

weary ridge
#

are there any separate servers for image processing in python?

wooden sail
topaz prairie
#

I'll be blunt. I have no idea what you just said.

wooden sail
#

what i'm saying is "that's a bad metric if you use it alone" and "use mean squared error between the output of the softmax and a vector of zeros" (this second one is why the metric is bad)

topaz prairie
#

"that's a bad metric if you use it alone"
I agree, that's not the intent though. Just learning, to be honest.
use mean squared error
👍

I understand what you said about p-norms also, I took stats ^^ Thanks for the assistance.

wooden sail
#

oh, what was it you didn't understand then?

topaz prairie
#

Which statistic metric to use. You clarified with "use mean squared error."

#

I understand most concepts, but I'm very poor with names (also reflects in human names, and just names in general).

#

So just takes me a bit to remember which thing is which lol

wooden sail
#

all right. MSE is the p norm with p = 2 between two vectors. since all you want is to study the prediction vector, it's the same as MSE between the softmax output and a vector of zeros. you'd wanna maximize it.

topaz prairie
#

Ok got it, thanks.

wooden sail
#

if you don't need it to be differentiable because you won't optimize with respect to this, all you need is to look at the maximum element in the softmax output. the closer this is to 1, the better
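That last suggestion, averaged over a batch, is what "mean confidence" would amount to. A dependency-free sketch (the function names are made up; in a real training loop the probabilities would come from the model's softmax output):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_confidence(batch_probs):
    """Average of the largest class probability across a batch."""
    return sum(max(p) for p in batch_probs) / len(batch_probs)

batch = [
    softmax([2.0, 0.5, 0.1]),  # fairly confident prediction
    softmax([0.0, 0.0, 0.0]),  # maximally unsure: 1/3 for each class
]
print(mean_confidence(batch))
```

For k classes the value lies between 1/k and 1, and, as noted above, it says nothing about whether the confident predictions are correct.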

weary ridge
#

can someone suggest me sources on where i can read about text recognition from an image

#

online sources would be highly useful

plush jungle
#

what are you looking to learn about, just how it works?

plush jungle
weary ridge
#

yes

#

like using pytesseract and opencv

#

i have some use cases but i dont know how to implement them using codes

#

so i wanna learn about it

plush jungle
weary ridge
#

how to find a given sentence in the inputted image?

#

"you are good" in a image

#

is there any approach to solve this problem

plush jungle
#

you can use regex on that string to match with the text you want

weary ridge
#

this will generate string of the text

weary ridge
#

like some random characters installed in between due to foreign languages

plush jungle
#

regex can handle that

weary ridge
#

S) l\infected.html > @) Search, Pr @

Ā„

ka Mail - Knox Portal @iNinfected.htmi

You are infected!

om | O Jype here to search t F g A , AIC O Bl F va 4

weary ridge
plush jungle
#

what are you searching for in this string

weary ridge
#

you are infected

#

wait lemme run with regex

plush jungle
weary ridge
#

what link is this?

plush jungle
#

rubular is a website that lets you test regexes in real time

#

so you don't have to run a python script every single time you want to tweak your regex

weary ridge
#

how the code in python looks like for using this regex?

#

how to comment these selected lines

#

at once

#

should we have to # all the time for each lines?

plush jungle
#

don't post code in images

#

post it like this

#

!code

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

plush jungle
#

also what variable represents the string you just posted?

#

is it b?

weary ridge
#
```py
test = pytesseract.image_to_string(img)
if "You are infected!" in test:
    print("Match Found")
else:
    print("Match not Found")
```


#

i got the code without using regex

#

🥲

weary ridge
plush jungle
#

yeah if you know there won't be any letters in between you don't need regex

weary ridge
#

ohh

#

what do you mean by letters in between

#

can you give any example?

plush jungle
#

but what regex can do is detect strings like this

```
you a8re in4fesecte$d
```
weary ridge
#

ohh

#

thats interesting

plush jungle
#

if you run into that problem, remember that regex is the solution
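
One way the regex idea could look in Python, as a rough sketch: build a pattern from the target phrase that tolerates a couple of stray non-space characters between letters. The helper name `noisy_pattern` and the `max_noise` parameter are invented here, and a loose pattern like this can produce false positives on other text.

```python
import re

def noisy_pattern(phrase, max_noise=2):
    """Compile a regex matching `phrase` with up to `max_noise` stray
    non-space characters allowed between consecutive characters."""
    # Each letter becomes a literal; each space becomes "one or more whitespace".
    tokens = [r"\s+" if ch.isspace() else re.escape(ch) for ch in phrase]
    # Lazy quantifier so the match prefers as little noise as possible.
    noise = rf"\S{{0,{max_noise}}}?"
    return re.compile(noise.join(tokens), re.IGNORECASE)

pattern = noisy_pattern("you are infected")
for text in ["you a8re in4fesecte$d", "You are infected!", "hello world"]:
    print(repr(text), "->", "Match Found" if pattern.search(text) else "Match not Found")
```

The plain `in` / `str.find` check stays the simpler choice when the OCR output is clean.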

weary ridge
#

but while doing image to text, why will some random letters come inbetween

plush jungle
#

ocr, like all machine learning, is probabilistic

#

the computer just makes educated guesses

#

sometimes those guesses are wrong

weary ridge
#

yeahh

#

you are right

#

this link will work right?

plush jungle
#

yeah that's a good source to learn regex

weary ridge
#

@plush jungle I have a followup qn too

#

how to find the coordinates of the box enclosing the sentence You are infected!

plush jungle
#
```py
import pytesseract
from pytesseract import Output
import cv2
img = cv2.imread('image.jpg')

# one entry per detected element (page, block, paragraph, line, word)
d = pytesseract.image_to_data(img, output_type=Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    # draw a green rectangle around every detected element
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow('img', img)
cv2.waitKey(0)
```
#

something like this
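
The snippet above boxes every detected element. To get a single box around one sentence, a possible approach is to look for the sentence's words in a row in the `text` list and merge their boxes. This is only a sketch: `phrase_box` is a made-up helper, the `sample` dict is hand-written to mimic the shape of `image_to_data(..., output_type=Output.DICT)`, and it assumes OCR returned the words exactly and in order.

```python
def phrase_box(data, phrase):
    """Return (x, y, w, h) around the first run of consecutive OCR words
    that equals `phrase` (case-insensitive), or None if not found."""
    target = phrase.lower().split()
    # Skip empty entries (pages, blocks, lines have no text of their own).
    idx = [i for i, t in enumerate(data["text"]) if t.strip()]
    words = [data["text"][i].strip().lower() for i in idx]
    for start in range(len(words) - len(target) + 1):
        if words[start:start + len(target)] == target:
            hits = idx[start:start + len(target)]
            x0 = min(data["left"][i] for i in hits)
            y0 = min(data["top"][i] for i in hits)
            x1 = max(data["left"][i] + data["width"][i] for i in hits)
            y1 = max(data["top"][i] + data["height"][i] for i in hits)
            return (x0, y0, x1 - x0, y1 - y0)
    return None

# Hand-made stand-in for pytesseract output (values are illustrative only).
sample = {
    "text":   ["", "You", "are", "infected!", "Search"],
    "left":   [0, 10, 60, 100, 300],
    "top":    [0, 50, 52, 50, 10],
    "width":  [400, 40, 30, 80, 50],
    "height": [200, 20, 18, 20, 15],
}
print(phrase_box(sample, "You are infected!"))
```

The returned tuple can be passed to `cv2.rectangle` the same way as in the loop above.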

weary ridge
#

ohh i ll try once

worldly dawn
plush jungle
worldly dawn
#

(or copy/pasted from a sample?)

weary ridge
#

some people are pro in searching questions in stackoverflow

worldly dawn
#

They are called senior engineers

weary ridge
#

whereas people like me don't get an answer to a single question

#

😵‍💫

weary ridge
plush jungle
#

btw, recursive, I know this is off topic for this channel but do you have any idea why this is giving me strange values

```py
target = (math.cos(math.radians(self.angle)), math.sin(math.radians(self.angle)))
```
#

if self.angle is 90

#

it should give (0,1)

weary ridge
#

actually pi/2 radian is different from 90degree

#

like pi/2 is irrational

plush jungle
#

but I'm doing math.radians

weary ridge
#

so degree to radian conversion is not exact

#

its approximate

#

am i correct?

plush jungle
#

but it's not even close

#

it's giving me
(6.123233995736766e-17, 1.0)

#

oh wait

#

that is close

worldly dawn
#
```py
In [8]: (math.cos(math.pi / 2), math.sin(math.pi / 2))
Out[8]: (6.123233995736766e-17, 1.0)
```
#

yeah, looks like a float representation issue
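
A small sketch of how to live with this: `math.pi` is only the nearest float to π, so the cosine comes out as a tiny nonzero number. Comparing with a tolerance, or rounding before use, hides the noise. The tolerance `1e-9` and the 12 decimal places are arbitrary choices for illustration.

```python
import math

angle = 90
target = (math.cos(math.radians(angle)), math.sin(math.radians(angle)))

# Compare against the expected value with an absolute tolerance,
# since the relative tolerance is useless when comparing to 0.
print(math.isclose(target[0], 0.0, abs_tol=1e-9))

# Or snap tiny float noise to clean values before using the vector.
snapped = tuple(round(c, 12) for c in target)
print(snapped)
```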

plush jungle
#

oh i'm dumb

#

I had this

worldly dawn
#
```py
In [9]: (math.cos(math.pi), math.sin(math.pi ))
Out[9]: (-1.0, 1.2246467991473532e-16)
```
plush jungle
#
```py
        self.x += self.target_vector[0]/100
        self.x += self.target_vector[1]/100
```
#

I need to lay off the copy pasting

#

anyway, this is tangentially related to machine learning

#

cause I'm making a reinforcement learning bot

worldly dawn
#

numerical precision does matter even in ml

plush jungle
#

by reverse engineering a flappy bird reinforcement learner

#

and retrofitting it for a top down pygame shooter

worldly dawn
plush jungle
worldly dawn
plush jungle
#

amen to that

worldly dawn
#

there is a demotivator about it too

weary ridge
plush jungle
#

marie curie discovered both radium and the fact that being around radium kills you

worldly dawn
plush jungle
#

sorry, I tend to wax philosophical in the late hours of the night

worldly dawn
#

np, it's still an interesting question

fleet musk
#

hi guys, so I am stuck in a problem, and found something that might help me out on stackoverflow

#

.
my question is, instead of integers, can I use a range?
like [100-105, 110-115, 120-125]
.

urban lance
#

for some reason it gives me a future warning when I do on date 🤔

#

var = (df.set_index('date').groupby("user")).rolling('14D')

#

it doesn't throw the warning if I set the index to the date 🤷‍♂️

urban lance
#

and also 😬

#

Does anyone have experience with the df.rolling function?

serene scaffold
#

@urban lance try giving a minimal reproducible example that can be copied and pasted exactly.

haughty pewter
#

are there any usually reasonable ways in general to find a correlation analysis between 2 columns if there's already 100,000 rows to use

pliant pewter
#

If you gave the points like 20% opacity, it would be easier to visualize the density of them

#

But a priori, I do not see any strong correlations in that plot, lol!

serene scaffold
#

it would be difficult to imagine a weaker correlation

pliant pewter
#

Without better density information, it's hard to say. Maybe there's a very dense straight line in the middle, with lots of outliers

serene scaffold
#

or, maybe there isn't :lemon_angrysad:

haughty pewter
#

there's really not a lot i can work with regarding trying to make it dense so would "there is no correlation between Age and the Final Score" suffice
unless i do something like "check if age > insertNumberHere use row", if it's possible, would that be fine too

pliant pewter
#

I mean, you can compute the correlation coefficient easily. It's gonna be small. And then you can say there's no useful (linear) correlation.
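
For illustration, the Pearson coefficient written out by hand. The `ages` and `scores` lists are invented placeholder data, not the asker's dataset. In practice `numpy.corrcoef` or pandas' `DataFrame.corr` computes the same quantity and handles 100,000 rows without trouble.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up example columns.
ages = [18, 25, 32, 40, 51, 60]
scores = [70, 55, 80, 62, 75, 58]
print(f"r = {pearson_r(ages, scores):.3f}")
```

A value near 0 supports "no useful linear correlation"; it says nothing about nonlinear relationships.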

flat hollow
#

This is a top-down view of my surface plot. The z axis shows the speed of fluid flow. The structure is a big vial full of water (big circle) with a tube going through the middle (small circle). I'm trying to figure out how to extract the velocity from the middle and find the peak velocity, ideally without having to manually label which pixels to grab the values from because I have a lot of these images and the position of these things in the x-y plane changes slightly. Any ideas?

haughty anvil
#

Hi, has anyone here used SciPy before? What are some example projects that one can build with SciPy? I'm trying to get a better understanding of it. Also, what is the difference between SciPy and NumPy? Thank you!

serene scaffold
#

numpy is pretty much at the foundation of everything

#

that said, there aren't projects you can do specifically in terms of one data science library

#

it's not like "build a website with django".

haughty anvil
#

Hi @serene scaffold ! Ok, thank you! So with SciPy it sounds like there are things you can do with the data? Like if you did some speech recognition stuff and got a text transcript back. Then are there things one can do with SciPy with that text?

serene scaffold
#

> Then are there things one can do with SciPy with that text?

probably not. try spaCy.

haughty anvil
#

oooh

#

Ok, gotcha! Thank you!

urban lance
hallow panther
#

Does anyone have experience with Google's MT5 text model?

urban lance
#

I have a df with session IDs, I'd like to group information by session to pass through a function but each group has to have the rows of previously processed groups as well. Is groupby able to do this?

serene scaffold
serene scaffold
brave sand
#

how can I use Shapley values to design utility and payoff for multi agent reinforcement learning?

haughty topaz
#

how can I exactly predict tomorrows stock price? Trynna get a bag quick

serene scaffold
#

it sounds like you might have unrealistic expectations. you can't predict the future, let alone exactly. you can only forecast it.

novel elbow
prime finch
#

Hello everyone, may I ask, do you have any references about RecommenderNet algorithm?

hollow sentinel
misty flint
#

bruh

serene scaffold
plush jungle
#

any machine learning tool you have access to, wall street investment bankers also have access to. if there was a way to predict stocks accurately, they'd still be richer than you because they'd use the same tool but with better data and more expertise

#

and more seed capital

plush jungle
#

deep Q learning is short term, right? it only ever looks at which actions have immediate benefits given a current state?

#

so it's not going to be able to learn patterns that have longer delays between the action and the reward?

#

I'm trying to repurpose this deep Q learning code that teaches a bot to play flappy bird and have it learn to play a top down shooter game

#

the blue dot tries to shoot the red dot by deciding to change the angle of its laser sight, do nothing, or shoot

#

it's 186,000 turns in, and it's really not getting noticeably better

#

the code that updates the neural net's weights is as follows:

#
```py
        minibatch = random.sample(replay_memory, min(len(replay_memory), model.minibatch_size))

        # unpack minibatch
        state_batch = torch.cat(tuple(d[0] for d in minibatch))
        action_batch = torch.cat(tuple(d[1] for d in minibatch))
        reward_batch = torch.cat(tuple(d[2] for d in minibatch))
        state_1_batch = torch.cat(tuple(d[3] for d in minibatch))

        # get output for the next state
        output_1_batch = model(state_1_batch)

        # set y_j to r_j for terminal state, otherwise to r_j + gamma*max(Q)
        y_batch = torch.cat(tuple(reward_batch[i] if minibatch[i][4]
                                  else reward_batch[i] + model.gamma * torch.max(output_1_batch[i])
                                  for i in range(len(minibatch))))

        # extract Q-value
        q_value = torch.sum(model(state_batch) * action_batch, dim=1)

        # PyTorch accumulates gradients by default, so they need to be reset in each pass
        optimizer.zero_grad()

        # returns a new Tensor, detached from the current graph, the result will never require gradient
        y_batch = y_batch.detach()

        # calculate loss
        loss = criterion(q_value, y_batch)
```
#

I don't entirely understand what y_batch and q_value are, but as far as I can tell, nothing in this does anything that would track the long term benefits of a move

#

which means if it takes 50 turns for a bullet to reach the target, the model will never learn how to aim
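
A plain-Python restatement of the `y_batch` line from the code above may help here (a sketch with lists instead of tensors; `td_targets` is a made-up name). `q_value` is the network's current estimate for the action that was actually taken, and `y_batch` is the target it gets pulled toward. The long-term part hides in `max(next_q)`: that number is itself the network's estimate of all discounted future reward from the next state, so delayed rewards can flow backwards one step at a time across many updates.

```python
def td_targets(rewards, next_q_values, terminals, gamma):
    """Mirror of the y_batch computation: r for terminal transitions,
    otherwise r + gamma * max over next-state Q estimates."""
    targets = []
    for r, next_q, done in zip(rewards, next_q_values, terminals):
        if done:
            targets.append(r)  # nothing comes after a terminal state
        else:
            targets.append(r + gamma * max(next_q))  # bootstrap from the next state
    return targets

# Two illustrative transitions: one terminal, one not.
print(td_targets([1.0, 0.0], [[0.5, 2.0], [1.0, 3.0]], [True, False], gamma=0.5))
```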

plush jungle
# iron basalt No.

I'm not sure I understand how it makes long term connections between an action (like firing a bullet) and a delayed reward (like the bullet hitting its target 50 moves later)

#

this minibatch code is the only part where it does gradient descent, so somewhere in the code I posted must be the long term learning you're talking about

#

could you give me a hint as to how this works?

unborn inlet
#

how many images are good for an ML database of dogs?

#

also, if im trying to detect something, do i need a database of stuff that is what im trying to detect and a database of stuff im not trying to detect?

#

if that makes any sense

agile cobalt
#

depends on which model you are using, what is the purpose of the model, and which kind of pictures you'll feed it later and probably a few dozen other factors I do not even know

if you want to accurately identify all dog breeds, from any angle, and tell apart not-a-dog as well, that 1.000.000 joke might not even have been all that far-fetched

if you just want to tell if a picture of a front-facing dog is a Shiba Inu or a Chihuahua, a few dozens or hundreds would suffice

unborn inlet
#

i basically want to say if its a dog or not a dog

#

im using MLPClassifier

iron basalt
agile cobalt
#

"not a dog" can be literally anything, or just one specific kind of thing?

unborn inlet
#

yk?

agile cobalt
#
#

disclaimer: I have never personally worked with classifying images

you may be able to make it work using a HuggingFace or fast.ai pre-trained model and potentially fine-tune to which kinds of dogs your data will actually include, but it might be trickier than it sounds

#

that said, if you want to do it with your own dataset, without using a pre-trained model, I don't really have any ideas of how to help other than "good luck"

unborn inlet
#

im gonna try a different approach actually

#

thank you for the help tho

pseudo wren
#

I'm doing a time series model based off the collapse of WireCard

#

the model is based on the stock prices for that time

#

my graphs are looking a little fucked though

#

so i'm not totally sure what to do with it

#

unsure what i'm doing wrong, but the graph is real....wonky looking

proven pier
#

Yall have any good books you recommend for DSP/data science? @ me since I turn off all notifications lol

plush jungle
#

I'm trying to understand the code and concepts behind Q learning, as explained here

#

but I'm stuck on how the Q learning algorithm predicts future payoffs, not just the payoffs that will occur at t+1

#

it uses replay memory, and randomly selects 32 previous examples of turns

#

but in that replay memory the only information is the state, action, reward and image

#

there's nothing linking any given turn to its future reward

iron basalt
plush jungle
#

ok so if we take the maze example you suggested

#

I think I get how it works

iron basalt
#

Using tabular methods for Q-learning (RL in general) makes it really obvious, since they can even be done by hand for very simple toy problems.

plush jungle
#

because the reward is given based off of immediate success or not

#

well actually wait

#

with a bigger maze

iron basalt
#

Follow the Q-learning algorithm for a simple maze and see how the Q-table is updated.
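
A tiny version of that exercise, as a sketch: a one-dimensional "maze" (a corridor) where the only reward is 1 for stepping onto the last tile. The function name and all hyperparameters (`alpha`, `gamma`, `epsilon`, episode count) are arbitrary choices for illustration.

```python
import random

def train_corridor(n_states=6, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a corridor; action 0 = left, action 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # the Q-table: q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # epsilon-greedy choice, picking randomly while the estimates are tied
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0  # reward only at the goal
            future = 0.0 if s_next == goal else max(q[s_next])
            q[s][a] += alpha * (r + gamma * future - q[s][a])
            s = s_next
    return q

for s, (left, right) in enumerate(train_corridor()[:-1]):
    print(f"tile {s}: left={left:.3f} right={right:.3f}")
```

Printing the table after a handful of episodes instead of 200 shows the nonzero values appearing at the tile next to the goal first and then spreading back toward the start.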

plush jungle
#

ok so each state has a value associated with each action

#

and that makes up the table

#

so each square of the maze that the player could occupy is a state

#

and eventually the correct path is produced in the table

#

through rewards updating the table values

iron basalt
#

Yes. Although it may not take it exactly depending on the choice of exploration vs exploitation.

plush jungle
#

right, I think I understand that too

#

but it all falls apart when you go from like 100 states to millions

#

because in my code, the states are vectors representing the image

iron basalt
#

Yup.

#

Tabular only works for simple things. Its space complexity is bad.

plush jungle
#

I want the agent to learn that firing the bullet will yield a powerful reward, but not immediately. the neural network that influences what action the agent chooses is trained on minibatches

iron basalt
#

You know what else is not immediate? The reward at the end of the maze. So how does the agent know, when all the way at the start, where to go?

#

(Not trained vs trained)

plush jungle
#

because each square in the maze receives a reward based on whether it hit a wall or how close it is to the goal, right?

iron basalt
#

No reward is given except at the goal state.

plush jungle
#

oh

#

so it works backwards then? the square before the goal gets a strong update to the weight for choosing the right action

#

and then the square behind that gets a stronger weight for the action that gets you to that state?

#

like because of exploration, eventually the agent will stumble its way to the end

#

the final square's action weight will be updated, but then what about final square - 1

#

if reward is only given at the end, how does final square -1 know to update the weight for the action that gets it to final square

#

since it won't receive a reward for doing so

iron basalt
#

Imagine s_t is the tile before the last tile (the goal).

plush jungle
#

the tile right before the goal makes sense to me, but "estimate of optimal future value" is the part that confuses me. for s_t-1 how does it calculate that future value?

iron basalt
#

s_t becomes s_t-1 when it moves to the goal.

#

They are the same thing.

#

s_t, s_t+1 or s_t-1, s_t
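
For reference, the standard tabular Q-learning update, which is presumably what the image under discussion showed (the image is not preserved in this log). Here $\alpha$ is the learning rate and $\gamma$ the discount factor; the $\max$ term is the "estimate of optimal future value". The two worked lines assume $\alpha = 1$, an all-zero initial table, and a reward of 1 only on reaching the goal.

```latex
% General update
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]

% Step from the last tile into the goal (s_{t+1} is terminal, so its future value is 0):
Q(s_t, a_t) \leftarrow 0 + 1 \cdot \left[ 1 + \gamma \cdot 0 - 0 \right] = 1

% A later step from the tile before that, into the last tile (no reward of its own):
Q(s_{t-1}, a_{t-1}) \leftarrow 0 + 1 \cdot \left[ 0 + \gamma \max_{a} Q(s_t, a) - 0 \right] = \gamma
```

So the tile two steps from the goal never sees a reward itself; it reads the value already stored in the table for the tile after it.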

iron basalt
#

Why is that image so small? Click to enlarge.

plush jungle
#

yeah it's the s+1 I'm struggling with

#

it just got reward at time t

iron basalt
#

So you did the action that takes you to the goal state s_t+1

plush jungle
#

how does it know reward at time t+1

iron basalt
#

But what are you updating according to the equation now?

plush jungle
#

we just got from final square to goal? I guess we'd update Q?

iron basalt
#

Yes, but Q of what?

#

Imagine Q as the Q table.

plush jungle
#

right

iron basalt
#

You look things up in it.

plush jungle
#

it tells you which actions yield what rewards in a given state

#

in this case that state being the final square