#data-science-and-ml


exotic maple
#

i'm thinking U of Michigan's DS courses. Not because I think they're exceptionally good, but because of resume AI readers lol

#

value big universities and crap

lapis sequoia
marsh berry
#

Hey all, I am trying to figure out a way that I can update subsequent dataframes that are created based on an original dataframe. So if I change the original dataframe, I would want the later dataframes to take that change.

import pandas as pd

old = pd.DataFrame({'A' : [4], 'B' : [10], 'C' : [100], 'D' : [30]})

# Function to Add a Row to Old
def add_row(A,B,C,D):
  global old
  data = {'A': A, 'B': B, 'C': C, 'D': D}
  old = pd.concat([old, pd.DataFrame([data])], ignore_index=True)  # DataFrame.append was removed in pandas 2.0

new =  old[['A', 'C', 'D']] # New DataFrame to Grab Columns A, C and D only

add_row(1,2,3,4)
print("Old: \n", old)
print("New: \n", new)

This code will print out the following:

Old: 
    A   B    C   D
0  4  10  100  30
1  1   2    3   4
New: 
    A    C   D
0  4  100  30

As you can see, in line 13 I added a new row using the add_row function to the old dataframe. However, the line did not append to the new dataframe. What can I do to ensure if I make a change to the original dataframe, the child dataframes also get updated?

marsh berry
#

@velvet thorn I have created multiple new dataframes out of the original where I filter it by a specific column, count them and then plot them.

marsh berry
#

But I want the ability to filter by date range

#

So I figured if I filter the original dataframe by date the subsequent ones would update too. But that is not the case.

marsh berry
#

How do I work around this?

velvet thorn
#

pandas DataFrames

#

are backed by numpy arrays

#

which have fixed sizes.

#

so when you "append" to a DataFrame or, in general, change its size in any other way, a new array is allocated

#

and the relevant data copied over.

#

in certain specific cases

#

you can have a view into a DataFrame:

#

!e

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]])
sub = df.iloc[:1]

print(df)
print(sub)

df.iloc[0] = [5, 6]

print()

print(df)
print(sub)
arctic wedgeBOT
#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 |    0  1
002 | 0  1  2
003 | 1  3  4
004 |    0  1
005 | 0  1  2
006 | 
007 |    0  1
008 | 0  5  6
009 | 1  3  4
010 |    0  1
011 | 0  5  6
velvet thorn
#

you can see here that the sub-dataframe is a view into the original

#

and changes propagate.

#

however, in general, you cannot reliably create a view of pandas DataFrames (and if you're performing complex filtering, that'd be more or less impossible because of memory layout)

#

also, once you add or remove rows/columns, a new array is allocated, which means your views are into the old array anyway.

#

I would suggest

#

some sort of observer pattern?

#

build an abstraction around your original DataFrame and have all access flow through it

marsh berry
#

I dont completely follow. Any chance you have an example?

velvet thorn
marsh berry
#

building an abstraction around the original df

velvet thorn
#

how big

#

is your dataframe?

marsh berry
#

Like 10 columns and 250 rows

#

Not super big

#

But the df does grow as im pulling the data from an API

velvet thorn
#

instead of

#

having sub-dataframes that you pass around

#

have, for example

#

methods in a class

#

that you call every time you want a specific subset
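
A minimal sketch of that suggestion, assuming a reasonably recent pandas. The class name `FrameStore` and its method names are made up for illustration. The DataFrame is passed into the constructor, and every subset is recomputed from the current data on each call, so it cannot go stale.

```python
import pandas as pd


class FrameStore:
    """Hypothetical wrapper: all reads and writes go through one object."""

    def __init__(self, df: pd.DataFrame):
        self._df = df

    def add_row(self, **values):
        # Changing the size allocates a new frame, so rebind the single reference.
        new_row = pd.DataFrame([values])
        self._df = pd.concat([self._df, new_row], ignore_index=True)

    def subset(self, columns):
        # Computed on demand from the current data, never cached.
        return self._df[list(columns)]


store = FrameStore(pd.DataFrame({"A": [4], "B": [10], "C": [100], "D": [30]}))
store.add_row(A=1, B=2, C=3, D=4)
print(store.subset(["A", "C", "D"]))  # includes the row added above
```

A date-range filter would fit the same shape: one more method that filters `self._df` and returns the result each time it is called.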

marsh berry
#

Ah I see what you're saying

#

That makes more sense actually

#

@velvet thorn Does the dataframe need to be defined within the constructor or outside of the class in a case like this?

velvet thorn
marsh berry
#

@velvet thorn I think making the class for the dataframe might be the way to go

uncut orbit
#

do you know any good resources for a cycle gan?

somber prism
#

guys i am planning to start machine learning , just ml not data science

#

anyone have a good roadmap on where should i start and continue from it ?

tacit basin
#

But everyone will have different goals and learning styles ...

tacit basin
#

For me for example ML is Data science

lapis sequoia
somber prism
lapis sequoia
somber prism
serene scaffold
#

@mild totem does the hotel earn money the day people check in or check out?

azure lotus
dark sonnet
#

cool

uncut bloom
grizzled sail
#

if anyone could help with some keras in #help-broccoli i'd be super thankful

onyx timber
#

hey guys i have a question about decision tree regression modelling

#

if my dataset is all numerical. In my instance it is

temp | windspeed | humidity | bike_rent_count

where bike_rent_count is my target and the rest are predictors. how would you calculate the STD reduction? My goal is to find the best predictor for the root node in this case.
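
One common way to do this for numeric predictors, sketched under assumptions: for each predictor, try thresholds between consecutive distinct values, split the target into two groups, and measure how far the size-weighted standard deviation of the groups drops below the standard deviation of the whole target. The rows below are made up, and `best_sdr` is a hypothetical helper name. Library implementations usually work with variance rather than standard deviation, so treat this only as an illustration of the mechanics.

```python
from statistics import pstdev


def best_sdr(xs, ys):
    """Return (best_reduction, threshold) for one numeric predictor."""
    parent_std = pstdev(ys)
    n = len(ys)
    best = (0.0, None)
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        weighted = (len(left) * pstdev(left) + len(right) * pstdev(right)) / n
        reduction = parent_std - weighted
        if reduction > best[0]:
            best = (reduction, t)
    return best


# Made-up rows, only to show the mechanics.
data = {
    "temp": [10, 12, 25, 27, 30, 11],
    "windspeed": [5, 20, 7, 18, 6, 19],
    "humidity": [80, 60, 70, 65, 75, 62],
}
bike_rent_count = [20, 25, 90, 95, 100, 22]

scores = {name: best_sdr(col, bike_rent_count) for name, col in data.items()}
root = max(scores, key=lambda name: scores[name][0])
print(root, scores[root])
```

The predictor with the largest reduction becomes the root split, and the same search is repeated inside each child node.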

lapis sequoia
#

@glossy trellis feel free to ask and post the link here. I think it's very fitting here.

glossy trellis
#

Hello everyone, I've project to do. It's about login with sign recognition . for example I'll show to camera 1-2-3-4 and the programme should tell me you logged in successfully...

can someone help me?

timid saffron
#

I can build it

#

For you

#

I will charge though.

#

Jk i cant man sorry

hollow sentinel
#

🥴

bitter harbor
#

better?

drowsy ibex
#

Does anyone have any good resources for learning NER for NLP project? I'll be using Spacy and have read through their documentation, but am looking for tutorials on youtube or a class on a platform like coursera/pluralisight to get me started. If you know something, please reach out!

twin moth
#

I need a good project idea, could anyone lend me a hand?

frail comet
#

technically, if you let a deep-learning ai machine read through history of the server (and simultaneously learn), will it have a perfect understanding of python?

#

(if say it can somehow comprehend all of the articulated questions, answers and conversations with all grammar)

fresh sphinx
#

maybe you already found something, but if not I would recommend ray's rllib. It supports quite a few RL algorithms out of the box and the ray ecosystem has a lot of tools to help you out

lapis sequoia
bitter harbor
#

im not even sure what he's trying to do, he kinda just said a bunch of flashy words

#

this sums it up pretty well 😄

hollow sentinel
#

he sounds like he doesn't know what he's talking about

#

this is like the people who think AI/ML is going to cure cancer or something

#

he's joking right

#

I hope he's joking

#

it just sounded like a bunch of buzzwords to me

#

pseudoscience?

lapis sequoia
#

the point is the idea not the video smh

hollow sentinel
#

🥴

bitter harbor
#

the idea doesn't make any sense

lapis sequoia
#

its tiktok ofc theres buzzwords

hollow sentinel
#

the idea isn't feasible dude

lapis sequoia
#

elaborate

hollow sentinel
#

🥴

#

Dimensions?

#

ok ok ok

bitter harbor
#

I like the like 3d space room thing clip but it's probably just like blender or smthing

hollow sentinel
#

how is he going to get his memories into an artificial neural network?

#

answer that

lapis sequoia
#

pictures ffs

hollow sentinel
#

🥴

#

he's just going to create pictures?

lapis sequoia
#

this isn't stranger things lmao

#

yk memories can be pictures right

hollow sentinel
#

I don't know what part of this idea makes sense to you

lapis sequoia
#

all of it

hollow sentinel
#

are you trying to troll

#

is that it

lapis sequoia
#

ill simplify it

hollow sentinel
#

bc you're doing an excellent job

lapis sequoia
#

images merged together to create something unique

hollow sentinel
#

🥴

lapis sequoia
#

all im confused about is the lidar scanned pics

#

fancy word for 3d scanned image

bitter harbor
#

so scanning + mapping 3d space?

lapis sequoia
#

yuh

bitter harbor
#

im sure it's possible but you'd have to get your network to 1) understand depth (assuming you're not feeding in depth maps or actual 3d files) and 2) understand how space works + how to create it

#

idek how you'd approach that/if you can

lapis sequoia
#

ill find a way ty

#

i could

#

attempt giving orthographic views of the place

#

because im not rly basing it off memories

#

more like photos

bitter harbor
#

never would have guessed

#

if you give it 2d photos and figure out the second point I mentioned, it'll give you back a 2d photo tho?

lapis sequoia
#

idk

#

Programming melts brain whilst giving you some

bitter harbor
#

it helps if you deal with reality 🤷‍♂️

lapis sequoia
#

true true

inland sky
#

i dont know if this is the right place, but i'm working on a simple anthill, i'm trying to create a smooth random movement for the ant ai
does anyone have a good solution?

fresh sphinx
#

What are you working on?

tidal bough
inland sky
glossy trellis
#

hello, I've problem with my project. When I show to the camera 1 sign with my hand, it gives me infinite 1 on console. How can I fix it. I can share my codes btw.

sweet plaza
#

hello, I have an assignment about Genetic Algorithm and I couldn't understand the Crossover Probability..How can I decide the Pc based on this description ?

fresh sphinx
#

hey, does anyone know about an active machine learning server or forum? I wanna find some people to work together with

whole mica
fresh sphinx
#

I'm interested in a lot of things, but I know nothing about economics haha

#

what kind of project are you thinking of?

marsh lantern
whole mica
#

i have found some resources but no one to communicate and talk with

marsh lantern
#

wsb? /s

whole mica
#

What?

marsh lantern
#

Maybe someone from wallstreetbets can help

serene scaffold
#

when BERT is applied towards NER, does it predict each token in isolation?

bitter harbor
#

is there a way to add black borders around each bar?

bitter harbor
#

where

#

I was trying ax.edgecolor/ax.set_edgecolor

#

oh nvm small brain hrs

ivory pendant
#

I read a few tutorials and started writing my first AI. I came to multivariate linear regression and using the sklearn library. Everything worked as expected and I saw how it simply detects mathematical patterns to generate a prediction, the more data I added, the more accurate. (code: https://paste.gg/p/anonymous/71402a14eee8434d968b5b9001af5d1f )
However, that was solely with numbers. If I wanted to detect patterns in strings, and I don’t mean anything that can be accomplished using lexical analysis, regex or a set of grammar rules. I mean patterns such as expecting a certain character after another or identifying grammar set.

If I wanted to do this, should I use a different method (i.e not regression), and how would I do it? Would I assign a numeric key for every allowed item?
Note that the patterns will not be numerically related to the place in the alphabet or the key. To exemplify: "Hello"," W" would expect that there is likely orld! after the W. (although that is a terrible example)
Maybe a manual algorithm? (i.e string and grammars might not be great for an AI)

exotic maple
#

Common NLP models are SVMs, Naive Bayes Multinomial / Bernoulli, etc

ivory pendant
#

from what I'm understanding (reading the Wikipedia page), it seems ‘medical’ or ‘talks’/‘understands’ languages somehow.
what I mean is for example, I can give it a few strings of a programming language and it can generate a grammar set for use with an error handler or such.
is that NLP?

last nest
exotic maple
carmine pike
#

Hi guys,
If I want to play a little bit with the WAYMO dataset (computer vision), would you recommend using PyTorch or TensorFlow? Does it matter? (Personally I prefer PyTorch)

dry hearth
#

Any cheap platform for text labelling? I'm looking at products like monkeylearn and the pricing is killing me

sweet plaza
jade chasm
#

Can someone explain in layman's terms what a hilbert space is of a given feature map?

jade adder
#

hilbert space is a vector space that has an inner product (and is complete but dont pay attention to that)

#

that means that this space has a concept of orthogonality and direction

#

i cannot go in any further detail without first knowing your background potentially burning your brain

lean ledge
#

A vector space already has concept of direction, inner product is mostly orthogonality and induced distance

jade adder
#

direction requires inner product

#

but i wont argue on that since it matters not

#

a vector space only "knows" how to add two vectors and multiply a scalar

#

you just confuse it with Rn spaces which are already hilbert spaces (probably)

#

since you tried to use intuition

lean ledge
#

Yeah it's worth digging into axiomatic definitions of abstract vector spaces and building up the algebraic structures and abstract properties you can build on top

#

Really important concepts that lead into some very important applied fields (signal proc, quantum mech, etc)

jade adder
#

wait

#

u were right

#

u can have a direction aka a scalar multiplied by a vector

#

its just that any span of those wont span the whole space

#

so yeah there is a direction

#

even on non hilbert spaces

lean ledge
#

Indeed. Direction is a bit underspecified, but linear independence and whatnot can be a good indicator of different directions

#

Makes sense given the intro definition of vectors as "a magnitude with a direction"

iron basalt
#

Depends if you consider orientation to be part of "direction", some people distinguish between the two. There is always a direction, but there may not be orientation.

#

Sometimes in math it's better to just say the full definition of the word you are using (how you define it), rather than just saying the word because (at least at this abstract level) things can get confusing about the semantics, especially when people with different language backgrounds try to interact.

#

(And in the case of ML, there are different definitions for words thrown around all the time in papers and people just assume you are on the same page)

hard frost
#
new_data = pd.DataFrame(dicts).set_index("Month")
            ##df_predict = pd.DataFrame(transform, columns=["predicted value"])
            response = make_response(new_data.to_csv(index = True, encoding='utf8'))
            response.headers["Content-Disposition"] = "attachment; filename=result.csv"
    
            labels = [d['Month'] for d in dicts]
                
            values = [d['Predictions'] for d in dicts]
    
            colors = [ "#F7464A", "#46BFBD", "#FDB45C", "#FEDCBA",
                           "#ABCDEF", "#DDDDDD", "#ABCABC", "#4169E1",
                           "#C71585", "#FF4500", "#FEDCBA", "#46BFBD"]
    
            line_labels=labels
            line_values=values
            return render_template('graph.html', title='Time Series Sales forecasting', 
max=17000, labels=line_labels, values=line_values, filename = response)
    
    
    @app.route('/download/<filename>')
    def download(filename):
        filename = filename
        return send_from_directory(filename=filename, as_attachment = True) 
#

Hi community, I'm currently deploying a machine learning model on the web using Flask. Is anyone good with send_from_directory? Please show me how to call send_from_directory correctly, because I want to pass my response to a download route; currently I can only return my plot graph

Error I got >> TypeError: send_from_directory() missing 1 required positional argument: 'directory'

#

<a href="{{ url_for('download', filename=filename) }}">Download</a>

#

Hi does anyone here? please help~

analog cave
#

how to create this equation in LATEX Overleaf?

last nest
#

i dont think it's the right topic

tidal bough
#

yeah, not really for this server. Also, the answer is "with quite a bit of work", not sure what you're asking about(specific symbols? it uses no weird ones, I don't think).

analog cave
#

@last nest thanks for letting me know, is there an Overleaf channel?

last nest
#

but you can find all syntax about it on google, it's not that hard , or on latex reddit

analog cave
#

oh okay thanks

last nest
#

it's mainly python on here

#

but check subreddit latex you'll find help

iron basalt
#

Just search online for things like "latex subscript", etc.

last nest
#

yeah just focus term by term and you'll do it

jade chasm
#

Do I understand it correctly that the RKHS is basically a higher dimensional representation of the map which is implied by the kernel, but not actually calculated (since that's the whole point of the kernel), which is basically just the dot product of the feature in higher dimensions?

#

I understood that if two features have a close norm in the RKHS, they're probably pointwise close as well
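
A small sketch of the "implied but never calculated" part, using a kernel simple enough that the feature map can be written out. The choice of the homogeneous degree-2 polynomial kernel on 2-D inputs is an assumption made for illustration; `kernel`, `feature_map` and `dot` are hypothetical helper names.

```python
import math


def kernel(x, y):
    """Homogeneous degree-2 polynomial kernel on 2-D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit map into R^3 whose dot product reproduces `kernel`."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x, y = (1.0, 2.0), (3.0, -1.0)
print(kernel(x, y))                          # computed in the input space
print(dot(feature_map(x), feature_map(y)))   # computed in the feature space
```

Both lines print the same number up to floating-point rounding. For this kernel the feature space is just R^3; for kernels such as the RBF it is infinite-dimensional, which is why the explicit map is never computed in practice.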

fickle thicket
#

Sorry for my dumb question, I'm taking a course on data analytics right now, and I don't understand a few terms, because (I think) they're often used like synonyms. Can you help me, please?
Can you show me on a diagram what's the hierarchy and what's the relation between: data science, data analytics, data analysis, data scientists, machine learning, statistics, analytics, data ecosystem?

iron basalt
#

I would say: statistics -> the rest are applications of it

#

though machine learning is technically its own thing

#

But other than that, I don't really know of a great way to put it in a hierarchy, it's more of a loose cloud.

jade chasm
#

In terms of machine learning:

#

Statistical learning theory is used as a tool in order to improve machine learning models (how they should learn, what loss functions to use, what optimizers to consider, how to choose good values for hyperparameters considering the estimation/approximation error etc)

#

So statistics would be a tool for machine learning, and machine learning is a tool for solving problems such as classification or complex regression problems

#

\

#

In general, when should I use ROCAUC instead of accuracy for my optimization metric, just when classes are imbalanced, or in general when I care about true/false positive ratios?

fickle thicket
#

and data science vs data analytics?

jade chasm
#

Data analytics is arguably more specific, and usually focusses on the 'applied' side of things (very loose definition), e.g. BI tools

#

Data analytics is more specific and concentrated than data science. Data analytics focuses more on viewing the historical data in context while data science focuses more on machine learning and predictive modeling. ... On the other hand, data analytics involves a few different branches of broader statistics and analysis. ^google.

#

Data analytics are the people showing nice graphs to important people in meetings. Data Scientists usually aim to improve their ML models while AutoML is running in the background and beating them anyway

jade chasm
#

(assuming this data scientist is building ML models, and not just data engineering, etc.. or anything else which is done in the field)

iron basalt
#

Data science can involve switching hats ^

fickle thicket
#

for me the confusing part is that data science is the biggest one (I guess), but whoever does it we call a data scientist. Same with data analytics (analysis?) and analyst
so a data scientist could do data analysis as well, so it's higher in the hierarchy?

iron basalt
jade chasm
#

Assuming you're not calibrating your threshold, why would I choose ROCAUC over Accuracy?

#

i.e., cost of false positive = cost of false negative

iron basalt
#

It's for calibrating threshold.

#

(And yes equal cost)

#

If the cost difference is too big, it's kind of useless.

jade chasm
#

I mean, imagine I have a neural network, and I compile it like `network.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])`

#

is there a difference if I use ROCAUC instead of accuracy then?

iron basalt
#

So ROC is for binary classification.

jade chasm
#

I actually use a binary classification here

#

and not strictly, you can weigh the classes to still get ROC for multiclass problems

iron basalt
#

softmax plus cross entropy or categorical cross entropy is typically used for multi-class

#

I guess you could

jade chasm
#

Oh right, this was multiclass. The model below is binary

#

my bad

iron basalt
#

If you use ROCAUC you get more detail than accuracy, and that's where I'm gonna leave it.

#

Basically it gives you what it says it gives you, if you don't care about that information then don't use it.

wicked mantle
#

where you guys run your ML code?

#

in jupyter notebook?

#

but i already installed tensorflow 2.2+ and reloaded kernel, wtf x_x

gritty spear
#

hi, anyone using tensorflow-gpu with ATI ?

wicked mantle
#

wow, google colab and jupyter notebook are so slow for training models

#

colab a bit faster but anyway its slow

serene scaffold
wicked mantle
#

which free cloud platforms are you guys using for compile ML models?

grave frost
grave frost
exotic maple
wicked mantle
#

for compiling with GPU i need to download cuDNN nvidia?

#

(tensorflow)

ruby magnet
#

Hello everyone! I have a data set but I dont know how to format it into a workable manner. I need to compile the daily data into monthly averages but I dont know how to proceed or where to look. Any advice?

serene scaffold
ruby magnet
#

Im still relatively new to python as well

serene scaffold
ruby magnet
#

csv

serene scaffold
ruby magnet
#

yes

serene scaffold
#
import pandas as pd
data = pd.read_csv('that_file.csv')

that would be the first step.

ruby magnet
#

okay done

serene scaffold
#

so each column is a date?

ruby magnet
#

yeah

#

not sure how to separate the columns by month

lapis sequoia
#

Transpose , convert the type to datetime

#

then group by i guess

serene scaffold
ruby magnet
#

okay thanks!

wicked mantle
#

jesus, i set up cuda installation to disk F:, but cuda installed in somewhere i can't find now

lapis sequoia
#

Instead of transpose it might be better to restructure ur data with a melt (unpivot) instead.

serene scaffold
lapis sequoia
#

Feel free to ping me if u need any help o/ do ur research first tho 🙂

wicked mantle
#

finally i get gpu support for tensorflow🤕

ruby magnet
#

this is what it looks like rn

lapis sequoia
#

df=df[df["Type"]=="<dining string here>"]

#

^filter

#

Something went wrong on ur code

#

2nd column has date and type together

ruby magnet
#

I used this:
df_melt=pd.melt(df,id_vars=["Name"])
But i think it just separated it based on Name

lapis sequoia
#

df=pd.read_csv('that_file.csv')  # path to your CSV
df=df[df["Type"]=="Dining Data"]
del df["Type"]
df=df.melt(id_vars="Name",var_name="date",value_name="value")

#

@ruby magnet try this

#

also whats the next plan - average on month?

ruby magnet
#

Yeah, the average % per month for each state

lapis sequoia
#

mm/dd/yyyy?

ruby magnet
#

yeah

jade chasm
#

TPR*

lapis sequoia
#

@ruby magnet

df[["month","year"]]=df["date"].astype(str).str.split("/",expand=True)[[0,2]]

df["month"]=df["month"].astype(int)
df["year"]=df["year"].astype(int)

df["value"]=df["value"].astype(str).str.replace("%","").astype(float)

df1=df.groupby(["month","year"])["value"].mean().reset_index()
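
A hedged variant of the same idea. The snippet above groups by month and year only, so all names are averaged together; adding "Name" to the group keys gives the per-state average that was asked for. The rows and state names below are made up, and the `Name` / date-column layout is assumed from the conversation.

```python
import pandas as pd

# Made-up wide data in the shape described above: one column per date.
df = pd.DataFrame({
    "Name": ["Alabama", "Alaska"],
    "1/1/2021": ["10%", "20%"],
    "1/2/2021": ["30%", "40%"],
    "2/1/2021": ["50%", "60%"],
})

# Unpivot to one row per (Name, date), then parse dates and percentages.
tidy = df.melt(id_vars="Name", var_name="date", value_name="value")
tidy["date"] = pd.to_datetime(tidy["date"], format="%m/%d/%Y")
tidy["value"] = tidy["value"].str.rstrip("%").astype(float)

# Average per state and calendar month.
monthly = (
    tidy.groupby(["Name", tidy["date"].dt.to_period("M")])["value"]
    .mean()
    .reset_index()
)
print(monthly)
```

Parsing with `pd.to_datetime` instead of splitting strings keeps the month and year together as one period, which also sorts correctly.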

exotic maple
jade chasm
#

Meaning: if you have a higher AREA under the curve of FPR and TPR, you essentially predict the labels of your data better

#

The middle line is just random guessing

#

Therefore: if your ROC AUC is 0.5, it is basically as good as a random guess.

#

Closer to 1: closer to 'perfect' (overfitting? ;)) model.
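
A small sketch of why the two metrics can disagree, with made-up labels and scores. It uses the rank interpretation of ROC AUC (the probability that a random positive gets a higher score than a random negative, ties counting half); `roc_auc` and `accuracy` are hand-written helpers for illustration, not a library API.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    preds = [int(s >= threshold) for s in scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


# Made-up imbalanced example: 8 negatives, 2 positives.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
constant = [0.1] * 10                      # always predicts "negative"
ranked = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.3, 0.2, 0.4, 0.45]

print(accuracy(labels, constant), roc_auc(labels, constant))
print(accuracy(labels, ranked), roc_auc(labels, ranked))
```

Both score lists give the same accuracy at the 0.5 threshold, but only the second one ranks every positive above every negative. Accuracy depends on the chosen threshold and the class balance; AUC only looks at the ordering of the scores.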

exotic maple
#

For all the others it's easy:
Recall? how often your true labels are predicted correctly. (How many times you said a real value was false, false negatives)
Precision? How often your predictions were right (how many predictions were real)
and so on.

#

I think my confusion is because i feel ROC is very close to precision

jade chasm
#

I'm not sure about that.

#

On a different note: why is it that RandomForest doesn't need cross validation (how come it has standard OOB?)

exotic maple
#

I mean, the TPR is literally precision lol
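
By the usual definitions, TPR is recall (sensitivity) rather than precision: both have true positives in the numerator, but recall divides by the actual positives and precision divides by the predicted positives. A quick sketch with made-up labels and predictions; `confusion` is a hypothetical helper.

```python
def confusion(labels, preds):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(l == 1 and p == 1 for l, p in zip(labels, preds))
    fp = sum(l == 0 and p == 1 for l, p in zip(labels, preds))
    fn = sum(l == 1 and p == 0 for l, p in zip(labels, preds))
    tn = sum(l == 0 and p == 0 for l, p in zip(labels, preds))
    return tp, fp, fn, tn


labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]

tp, fp, fn, tn = confusion(labels, preds)
tpr = tp / (tp + fn)        # recall / sensitivity: the ROC y-axis
fpr = fp / (fp + tn)        # the ROC x-axis
precision = tp / (tp + fp)  # does not appear on a ROC curve
print(tpr, fpr, precision)
```

Here recall and precision come out different for the same predictions, which is the distinction the ROC curve glosses over when classes are imbalanced.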

ruby magnet
lapis sequoia
#

pandas documentation

jade chasm
lapis sequoia
#

is pretty good

ruby magnet
#

got it

exotic maple
lapis sequoia
exotic maple
jade chasm
exotic maple
jade chasm
#

Would if I could, but gotta study for my ML exams 😉

sour mango
#

hey guys i am trying to plot financial data from alpha vantage using matplotlib, with date field being the x axis and the closing price of stock being the y axis

#

i am not able to parse it, as data['date'] is not working..

#

how to parse the date as a column?

#

also i am using pandas as the output format

last nest
crimson crystal
#

FreeCodeCamp has a good course on data analysis awa machine learning

#

kaggle also offers free lessons

last nest
#

i dont really have time, in fact there's a big discount on this course from more than 50 to 12 euros, so i thought may be it was a good opportunity

#

i'll check those too , thanks 🙂

crimson crystal
#

udemy always gives discount so dont worry about missing the discount. they have a really weird marketing strategy of changing course prices wildly but there are always discounts at least once a month. i wouldnt rush about it if that was your only concern

crimson crystal
last nest
#

you're right i saw that weird discounts over time finishing in 5 hours, then the same the next day etc

crimson crystal
#

not to mention the big difference in price based on your location

hollow sentinel
#

I generally do not believe in paying to learn code

crimson crystal
#

i didnt try it but it might be cheaper to buy courses using a vpn in india

hollow sentinel
#

there is a free MOOC on DS/ML

#

Andrew Ng teaches it

crimson crystal
#

oww is that from coursera

hollow sentinel
#

yes

#

there is also the columbia course

crimson crystal
#

i heard he uses matlab instead

hollow sentinel
#

he doesn't use matlab

#

he uses octave

crimson crystal
#

never heard of that
is it free to use

#

matlab is not free unless you have a university licence

hollow sentinel
#

yes octave is free to use

#

but I wouldn't get into the MOOC if you don't have the math recommended for it

crimson crystal
#

how much math are we talking about

#

i already have done the FCC course and it doesnt seem to require advanced math,

#

but it required logic

last nest
#

is it that advanced math, cause i already graduated so may be it's already ok on this level

crimson crystal
#

i would say you are fine

#

but i didnt try the Andrew Ng one

last nest
#

i'll dig more into the details of the courses

#

yeah i saw this

hollow sentinel
#

idk what the FCC course is

last nest
#

it's on coursera right

hollow sentinel
#

but you need stats, linear algebra, calculus, and discrete maths for Ng

crimson crystal
hollow sentinel
#

oh

#

ok

crimson crystal
hollow sentinel
#

Idk if the FCC course was as math-intensive

sour mango
#

FreeCodeCamp is the best resource imo

hollow sentinel
#

by course

#

you mean the youtube video?

crimson crystal
#

but i can see why?

#

they want to encourage us to go research ourselves but still...

last nest
#

if i get a good price for a mooc like course i'll take it and add fcc and others free to complete, thing is i need to know these subjects for physics application idk if i have to go deep in details into data science, idk YET

crimson crystal
#

this might be relevant for you

last nest
#

oh okay on fcc , ok that could be helpful , thanks !

#

i'll check all this , thanks guys

crimson crystal
#

anytime

sour mango
grave frost
#

Anyone saw IBM's call for code? I am surprised AF that all they expect is to deploy web apps
tbh I thought it would focus more on actual solutions than just a geek cringe-fest with idiots trying to show coding can solve any problem in the world

lapis sequoia
#

Where would you use MatLab over Python/R?

exotic maple
exotic maple
#

I'm not sure if this still falls in "ML and AI" but network theory is pretty interesting. Especially these things about Power Laws in networks

dapper halo
#

dummyboi comin in with a dummyboi question. I was initially concerned the validation loss and loss were so spread. But they do start to converge...and since val_loss < loss it isnt an indication of overfitting and this general shape doesnt indicate there is something wrong? Is that correct?

dapper halo
#

But I guess on the flip side, if I cant get them to converge...I would have an issue

lean ledge
#

Depending on what you're into, there's other more specific resources on what to learn

#

Eg Fluids people who are there to speed up fluid simulations with ML are different from fluid people learning non-linear modes of oscillatory phenomena using techniques like Koopman operators and dynamic mode decomposition

lean ledge
dapper halo
bitter harbor
#

im trying to take a df and return a different one with columns [opening_eco, dt] (dt being last_move_at - created_at)

def create_time_delta(df: pd.DataFrame) -> pd.DataFrame:
    _df = pd.concat([pd.DataFrame([row["opening_eco"], row["last_move_at"] - row["created_at"]],
                                  columns=["opening_eco", "dt"]) for _, row in df.iterrows()],
                    ignore_index=True)
    return _df
this raises `ValueError: Shape of passed values is (2, 1), indices imply (2, 2)` and I've got no clue why
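
A likely cause, sketched below: `pd.DataFrame([a, b], columns=[...])` reads the flat list as two rows of one column, while the two column names ask for two columns. Wrapping the row in another list (`[[a, b]]`) would fix the shape, but the whole thing can be done without `iterrows`. The sample rows are made up; the column names are the ones from the question.

```python
import pandas as pd


def create_time_delta(df: pd.DataFrame) -> pd.DataFrame:
    """Return opening_eco plus dt = last_move_at - created_at."""
    return pd.DataFrame({
        "opening_eco": df["opening_eco"],
        "dt": df["last_move_at"] - df["created_at"],
    }).reset_index(drop=True)


# Made-up rows with the column names from the question.
games = pd.DataFrame({
    "opening_eco": ["C20", "B01"],
    "created_at": pd.to_datetime(["2021-01-01 10:00", "2021-01-01 11:00"]),
    "last_move_at": pd.to_datetime(["2021-01-01 10:30", "2021-01-01 11:05"]),
})
print(create_time_delta(games))
```

Subtracting the two columns directly works for datetime or numeric columns and avoids building one small DataFrame per row.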
grim patrol
#

I have data with dst_bytes, src_bytes, record ID etc and I want to cluster it using KMeans

df = pd.read_csv(f_path, names=["record ID", "duration_", "src_bytes", "dst_bytes"], header=None)
# X = ?
kmeans = KMeans(n_clusters=2).fit(X)

What exactly do I need to put as X? I'm a little confused on what n_samples and n_features are
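
A rough sketch of what `X` could be, assuming scikit-learn is installed. `X` is a 2-D array with one row per record (`n_samples`) and one column per numeric feature (`n_features`). The rows below are made up to stand in for the CSV, and leaving out "record ID" is a judgment call, since an ID is a label rather than a measurement.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Made-up rows standing in for the CSV from the question.
df = pd.DataFrame({
    "record ID": [1, 2, 3, 4, 5, 6],
    "duration_": [1, 2, 1, 50, 55, 60],
    "src_bytes": [100, 120, 110, 9000, 9500, 9100],
    "dst_bytes": [200, 210, 190, 8000, 8200, 7900],
})

# Rows are samples, columns are features; the ID is a label, not a feature.
X = df[["duration_", "src_bytes", "dst_bytes"]].to_numpy()
print(X.shape)  # (n_samples, n_features)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)
```

With real data it is usually worth scaling the columns first, because KMeans uses Euclidean distance and the byte counts would otherwise dominate the duration.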

twin moth
#

In this series, we'll explore the complex landscape of machine learning and artificial intelligence through one example from the field of computer vision: using a decision tree to count the number of fingers in an image. It's gonna be crazy.

Become a Patron for exclusive perks: https://www.patreon.com/welchlabs

Supporting Code: https://github...

tidal bough
lean ledge
#

Just train for more epochs

grave frost
#

they are probably using early stopping

#

(with a high delta maybe)

vale hedge
#

Is keras installed as part of tensorflow2 or do you install separately

dapper halo
dapper halo
tidal bough
lean ledge
#

Either decay or a strict schedule where you reduce learning rate explicitly

#

@dapper halo
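
To make the two options concrete, here is a framework-free sketch. The function names and default values are made up for illustration; they only show the shape of a strict step schedule versus a smooth decay.

```python
def step_decay(epoch, initial_lr=1e-3, drop=0.5, epochs_per_drop=10):
    """Strict schedule: multiply the rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


def exponential_decay(epoch, initial_lr=1e-3, decay_rate=0.05):
    """Smooth decay: the rate shrinks by a fixed fraction every epoch."""
    return initial_lr * (1.0 - decay_rate) ** epoch


for epoch in (0, 10, 20):
    print(epoch, step_decay(epoch), exponential_decay(epoch))
```

Keras has a `LearningRateScheduler` callback for plugging in a schedule function; check its documentation for the exact call signature before wiring one of these in.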

ivory pendant
#

oh I see it does use non-AI algorithms such as lexical analysis

ivory pendant
#

I'm trying to machine-learn string. I finished using a ‘vectorizer’ with a ‘bag-of-words’ model and it seems like it works. I think it’s generating unique numeric keys for lexical items.
But how would I actually use them? Which models can I use and how would I write these [a b c x y z]-looking keys as string for the model?

Here's my working code so far:
https://paste.gg/p/anonymous/01bf343d27ae45eba8debaa4fb5a77b1

lapis sequoia
#

hi

young lake
#

Anyone here have experience with chess engine development? if so please DM me

lapis sequoia
#

hello again friends

#

im wondering if anyone has experience with uploading projects onto github

#

ive got a gan project ive finished but instead of uploading the IPYNB, because it seems that most professionals separate the files into different py files, e.g. test py, model py etc etc.

#

im not sure how i should divide my project up if that makes sense

dapper halo
# tidal bough that seems to me like it can justify the test loss being as good as the training...

yeah I'm not sure. Been at it for a few hours. Managed to cut the difference between val_loss and loss by 50% but its still pretty large. I will say that I am injecting 0 points into the data when I clean it as I expect the user would not be able to always obtain all the metrics I am training on. While all random, I assume many more of these masks get put into the training set which causes it to have a harder time evaluating the loss as opposed to the validation loss??

Outside of that I'm at a loss

astral path
#

i have a heatmap that looks like this

#
all_corr = np.corrcoef(all_features.T)
plt.figure(figsize=(14,14))
all_mask = np.triu(np.ones_like(all_corr, dtype=bool))
sns.heatmap(all_corr, cmap="coolwarm", xticklabels=all_cats, yticklabels=all_cats)
#

I only want to show the columns poly1 through poly10 and rows RSCI through AST/TOV

#

how do i change what's shown?

#

thanks!

jade chasm
#

You probably need to index on the labels. You can 'pull apart' all_corr into x and y, then index on the final [-10:] for y, and the first [:9] for x.

#

Keep in mind that your xticklabels need the same size as x, and same goes for y.
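
A rough sketch of that label-based indexing. The label names, their order, and the random feature matrix are made up (the real `all_cats` isn't shown); it assumes `all_cats` is a list in the same order as the rows of `all_corr`.

```python
import numpy as np

# Hypothetical stand-ins for the real data: 3 stat columns followed by 4 poly columns.
all_cats = ["RSCI", "PTS", "AST/TOV", "poly1", "poly2", "poly3", "poly4"]
rng = np.random.default_rng(0)
all_features = rng.normal(size=(50, len(all_cats)))

all_corr = np.corrcoef(all_features.T)


def label_slice(labels, first, last):
    """Slice covering labels[first..last], both ends included."""
    return slice(labels.index(first), labels.index(last) + 1)


rows = label_slice(all_cats, "RSCI", "AST/TOV")
cols = label_slice(all_cats, "poly1", "poly4")

# Keep only the wanted rows and columns, plus the matching tick labels.
sub_corr = all_corr[rows, cols]
row_labels = all_cats[rows]
col_labels = all_cats[cols]
print(sub_corr.shape, row_labels, col_labels)
```

Passing `sub_corr` to `sns.heatmap(sub_corr, cmap="coolwarm", xticklabels=col_labels, yticklabels=row_labels)` should then draw only that block, with labels that match its size.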

#

On another note: Why are layered neural networks easier to train than large non-layered networks?

hard frost
#
for i in range(X_FUTURE):
            curr_date = curr_date +  relativedelta(months=+1)
            dicts.append({'Predictions': transform[i], "Month": curr_date})
            

        new_data = pd.DataFrame(dicts).set_index("Month")
        ##df_predict = pd.DataFrame(transform, columns=["predicted value"])

        new_data.to_csv(os.path.join("downloads", index = True, encoding='utf8'))

        labels = [d['Month'] for d in dicts]
            
        values = [d['Predictions'] for d in dicts]

        colors = [ "#F7464A", "#46BFBD", "#FDB45C", "#FEDCBA",
                       "#ABCDEF", "#DDDDDD", "#ABCABC", "#4169E1",
                       "#C71585", "#FF4500", "#FEDCBA", "#46BFBD"]

        line_labels=labels
        line_values=values
        return render_template('graph.html', title='Time Series Sales forecasting', max=17000, labels=line_labels, values=line_values, filename = filename)


@app.route('/download/<filename>')
def download(filename):
    return send_from_directory("downloads", filename, as_attachment = True)  
#

<a href="{{ url_for('download', filename=filename) }}">Download</a>
Hi community, I'm trying to save my CSV with to_csv into a path built by os.path.join and serve that file from a download button on the HTML page. Currently I'm getting this error: TypeError: join() got an unexpected keyword argument 'index'. Is anyone good with Flask? Please correct me. Appreciated!
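
The TypeError comes from the parentheses: `index` and `encoding` end up inside `os.path.join(...)` instead of `to_csv(...)`. A sketch of how the call could look, assuming `filename` is the name the download route should serve; the temporary folder, file name and sample data below are placeholders.

```python
import os
import tempfile

import pandas as pd

# Hypothetical predictions, shaped like the new_data frame in the question.
new_data = pd.DataFrame(
    {"Month": ["2021-01", "2021-02"], "Predictions": [100.0, 120.0]}
).set_index("Month")

download_dir = tempfile.mkdtemp()  # stand-in for the app's "downloads" folder
filename = "predictions.csv"       # placeholder file name

# Build the path first, then pass the keyword arguments to to_csv itself.
csv_path = os.path.join(download_dir, filename)
new_data.to_csv(csv_path, index=True, encoding="utf8")

print(os.path.exists(csv_path))
```

The same `filename` can then be handed to `render_template` so that `url_for('download', filename=filename)` points at the file that was just written.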

glossy trellis
#

Hello, I have a problem with my project. When I show the camera a "1" sign with my hand, it prints 1 endlessly on the console. How can I fix it?

stable briar
#

I'm Turkish too :D

#

Send your code I think someone can help

late shell
#

hello, I'm a beginner at ML, and the field of ML/AI seems very overwhelming to me. I am so confused about how deep does one have to dive into each topic in order to cover the overall ML field? If one wants to he/she can spend months on learning just one kind of model, there is just so much to learn, so many parameters and stuff that affect your model. I just can't figure out when to stop studying a particular model/algorithm. I wish there was a certain "threshold" that says like "ok you've done enough of this, you can move on to the next topic.". So yeah, how deep am I supposed to go? how can i figure that out? It would be nice to find out how deep the advanced ML users go. 🙁

primal tulip
late shell
#

hmm, thanks for the advice @primal tulip 👍

crude yew
#

Hello

#

Is here anybody who works at FAANG?

grave frost
#

this is a common misconception: that if you want to, for example, compete in an AI competition, you need to know everything about it. that is not a realistically feasible approach to learning something. the best motivation is what comes from within

uncut orbit
#

im getting this error while running openai cartpole environment: GLXInfoException: pyglet requires an X server with GLX

#

im not sure what to do

#

more trials to decrease the score @analog cave

#

its the epochs

#

sorry for the misuse of words

#

but yea its the epochs

#

👍

lapis sequoia
#

do u know if there is an AI that given an input (a picture), or more than 1, outputs u the picture from different perspectives?

grave frost
#

there are plenty of those on Two Minute Papers

uncut orbit
#

hmm

#

i remember seeing some thing that makes ur face visually three dimensional and you could drag it around

#

forgot what it was called

exotic maple
# grave frost this is a common conception that for example, if you want to compete in an AI co...

Ruler always coming up with the based comments

To add to this: ML/AI has in many ways become a buzzword, almost like blockchain. People want it everywhere even if they don't know what it does, and this puts too many expectations on learners.

Don't pressure yourself with unrealistic expectations. Find a problem you are interested in and try to solve it, that's all.

Most of the time, even if the project is dumb basic stuff, you have something to show that proves you have a clue and more importantly, that you have hands on experience.

astral path
#

I'm using quadratic features to train a model, but the issue is I have about 5000 of these features to choose from. What type of feature selection should I do to make sure that I choose as diverse a featureset (in terms of they're not all heavily correlated with each other) as possible while still keeping the model accurate?

#

The issue I've had before is that feature selection chooses, say 10 quadratic features which happen to be extremely similar to each other, making it so there's basically no point in having 10 features, and agglomerating features together made them basically indistinguishable from each other as well. I'm interested in choosing the features which have the most impact on the model while being independent from the other ones

#

thanks and cheers!

severe python
#

@exotic maple hey, trying to implement the advice you gave. refresher: the script isn't printing rows that have a combination of integers and letters ex. "1889AM" but will print either "TEST" or "1889" on separate tests. raw_input would've worked if i wasn't using python3 i think. what do i need to do? below is a segment of the script but it's with the Alert ID not Acronym, feel like this is an easy fix but just lost

#
while True:
        variable = input(f"{bcolors.WARNING}Search by Acronym / Parent / Alert ID / Account?    {bcolors.ENDC}")
        if variable == "Exit":
            sys.exit(0)

        if variable not in df.columns:
            print(f"{bcolors.FAIL}Error: Invalid Input{bcolors.ENDC}") 
            continue

        if variable == "Acronym":
            while True:
                input1 = input("Please provide an Acronym:   ")
                result1 = df.loc[df[variable] == input1]
                if input1 == "Back":
                    break
                if len(result1) == 0:
                    print(f"{bcolors.FAIL}Acronym not found. Please try again{bcolors.ENDC}") 
                else:
                    print(tabulate(result1, headers='keys', tablefmt='psql'))
                continue
serene scaffold
severe python
#

@serene scaffold i'm using pandas referencing an excel sheet with 4k rows of data

serene scaffold
severe python
# serene scaffold so you're trying to map user inputs to pandas operations?

no i'm trying to use user input to search the excel file, initial question is to determine which column it needs to search in. acronym (which is all letters) works, parentid and account id all integers - works, but alert ID is a mixture and it won't print 100% integers but will print either 100% letters or a mix

#

anyone know a fix? @exotic maple @iron basalt
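
One possible cause, assuming the Alert ID column was read from Excel with mixed types: purely numeric IDs are stored as integers, so comparing them with the text returned by `input()` never matches. A sketch with made-up data; the `search` helper is hypothetical.

```python
import pandas as pd

# Made-up rows: one numeric ID (stored as int), one mixed, one all letters.
df = pd.DataFrame({"Alert ID": [1889, "1889AM", "TEST"], "Account": [1, 2, 3]})


def search(frame, column, text):
    """Compare as text so 1889 (int) and "1889" (str) are treated alike."""
    as_text = frame[column].astype(str).str.strip()
    return frame.loc[as_text == text.strip()]


# Direct comparison misses the integer row; the text comparison finds it.
print(len(df.loc[df["Alert ID"] == "1889"]))
print(len(search(df, "Alert ID", "1889")))
print(len(search(df, "Alert ID", "1889AM")))
```

Reading the column as text up front, e.g. `pd.read_excel(path, dtype={"Alert ID": str})`, should have the same effect.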

red flint
#

is anyone familar with control variates as a variance reduction technique?

iron basalt
severe python
#

didn't, i mentioned u

iron basalt
#

Don't @ me

grave frost
#

Does anyone know much about signal processing? any resources with personal preference that aren't math-intensive, or just use logic?

severe python
iron basalt
grave frost
#

how hard would you rate the math?

iron basalt
#

Anywhere from linear algebra to does not exist / needs more math to be invented.

grave frost
#

fantastic. I am going to understand all of it then

#

anyways, thanx for that

iron basalt
#

Signal processing is an electrical engineering subfield that focuses on analysing, modifying, and synthesizing signals such as sound, images, and scientific measurements. Signal processing techniques can be used to improve transmission, storage efficiency and subjective quality and to also emphasize or detect components of interest in a measured...

#

scroll down to mathematical methods applied

grave frost
#

are you seriously recommending wiki to a newbie?

iron basalt
#

It gives a list of applied mathematics

grave frost
#

perhaps then I would be expected to know just the basics then

plush jungle
#

does anyone know any modules that could be used to draw borders around the shapes in this image?

dapper halo
plush jungle
#

i'm wondering if this might be easier if I get rid of all the noise first

dapper halo
#

Brings back schlieren image processing nightmares. All I can say is have fun haha

misty flint
#

the stuff of nightmares

astral path
#

sorry haha kinda got buried

grave frost
plush jungle
earnest oar
#

HEY GUYS

#

DO SOMEONE KNOW HOW CAN I SAVE A MODULE IN GOOGLE COLAB

glossy trellis
#

hello guys, how can we load tensorflow to opencv?

lean ledge
#

I TA it at University and the maths is too heavy for 3rd years, much less some high schooler.

lean ledge
grave frost
lean ledge
#

Not sure what's involved in barebone basics of sigproc

grave frost
#

but not explicitly mention that

willow quarry
#

hello

#

i am new here

#

and i have made an environment

#

but i am having big trouble getting my tensorflow Sequential model to read my input

#

or creating my REINFORCE agent

#

i spent 5 days reading the docs from start to finish

#

still can't get past some problems

#

how can i share my code here????

#

i made a small version with my problems

grave frost
#

!code

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

grave frost
willow quarry
#

thanks

willow quarry
#
import pyautogui
import time
import cv2
from PIL import ImageGrab
from pynput.keyboard import Key, Controller
import numpy as np
import keyboard
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tflearn.layers.core import input_data, dropout, fully_connected
from tf_agents.networks import actor_distribution_network

fullscreen = [110,130,710,570]

screenpil = ImageGrab.grab(bbox=fullscreen)
showprint = np.array(screenpil)
grayscreen = cv2.cvtColor(showprint, cv2.COLOR_BGR2GRAY)
screenrect = cv2.cvtColor(grayscreen, cv2.COLOR_GRAY2BGR)

print(tf.data.Dataset.from_tensors(screenrect))


model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters=25, kernel_size=40,padding="valid", activation="swish", input_shape=[ 440, 600,3 ]),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=5, padding='valid'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=50, activation='swish'),
    tf.keras.layers.Dense(units=25, activation ="swish"),
    tf.keras.layers.Dense(units=15, activation ="relu"),
])

initial_learning_rate = 0.0005

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate,
    decay_steps=6000,
    decay_rate=0.95,
    staircase=True
)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()]
)
reshapedData = screenrect.reshape((440, 600, 3))
rereshaped = np.expand_dims(reshapedData, axis=0)
rereshaped = rereshaped.reshape(len(rereshaped), 440, 600, 3)

model.fit(rereshaped)
#

this is my basic model, which returns this error

#

Tried to squeeze dim index -1 for tensor with 0 dimensions.

#

it is supposed to receive a "screenshot" as input

grave frost
#

can you post your full error?

willow quarry
#

InvalidArgumentError: Tried to squeeze dim index -1 for tensor with 0 dimensions.
[[{{node metrics/sparse_categorical_accuracy/Squeeze}}]]

grave frost
#

the full traceback

willow quarry
#

ooooooooo

arctic wedgeBOT
#

Hey @willow quarry!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

willow quarry
#

its big

#

discord doesnt let

#

i will split it

grave frost
willow quarry
#

there

grave frost
#

you have to share it

#

the link I mean

willow quarry
grave frost
#

what does putting print(rereshaped.shape) before model.fit output?

willow quarry
#

WARNING:tensorflow:From C:\Users\Watso\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
curses is not supported on this machine (please install/reinstall curses for an optimal experience)
<TensorDataset shapes: (440, 600, 3), types: tf.uint8>
Train on 1 samples

#

i believe it's <TensorDataset shapes: .....

grave frost
#

how much data are you giving?

willow quarry
#

one image

#

i tried that in other sample

#

grayscreen = grayscreen.reshape( 440, 600, 1)
print(grayscreen.shape)
Tfdata = tf.data.TFRecordDataset.from_tensors(tf.expand_dims(grayscreen, axis=0))
#grayscreen = numpy.asarray(grayscreen)
print(Tfdata)
model.fit(Tfdata)

#

<TensorDataset shapes: (1, 440, 600, 1), types: tf.uint8>

#

which returned that

grave frost
#

give it multiple images, so that the shape looks something like this:
(x, 440, 600, 3)

willow quarry
#

i tried duplicating the image

#

[image,image]

grave frost
#

wait what's the label lol?

willow quarry
#

no avail

grave frost
#

what is your Y value? what are you trying to do?

willow quarry
#

so it plays retroarch games

grave frost
#

SparseCategoricalCrossentropy needs integer class labels (a y array) to calculate the loss, and you aren't passing any
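
To spell that out: this loss compares the predicted class probabilities with one integer label per sample, so training needs both an x batch and a y array. A NumPy sketch of the computation (not Keras's own implementation), with made-up images, labels and predictions:

```python
import numpy as np

batch_size, num_classes = 4, 15
rng = np.random.default_rng(0)

# x: a batch of images, shape (batch, height, width, channels).
x = rng.integers(0, 256, size=(batch_size, 440, 600, 3), dtype=np.uint8)
# y: one integer class id per image (hypothetical labels).
y = np.array([0, 3, 14, 7])

# Pretend the model predicts a uniform distribution over the 15 classes.
probs = np.full((batch_size, num_classes), 1.0 / num_classes)

# Sparse categorical cross-entropy: -log of the probability of the true class.
loss = -np.log(probs[np.arange(batch_size), y]).mean()
print(x.shape, y.shape, loss)
```

In Keras terms this corresponds to calling `model.fit(x, y)` with a last layer that outputs 15 class probabilities; with no `y` there is nothing for the loss or the accuracy metric to compare against.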

willow quarry
#

there is no y exactaly

grave frost
#

...what?

willow quarry
#

yeeeeeeer

#

i made an enviroment

grave frost
#

that's not how you do RL

willow quarry
#

yes

#

i tried agents

#

but

#

let me send you

grave frost
#

yes you have to use TF agents

#

you can't do it with an unsupervised task

willow quarry
#

that is my actual first attempt

#

but i also ran into a lot of errors

#

like

#

now i am stuck at

#

InvalidArgumentError: cannot compute Conv2D as input #1(zero-based) was expected to be a int32 tensor but is a float tensor [Op:Conv2D]
In call to configurable 'ReinforceAgent' (<class 'tf_agents.agents.reinforce.reinforce_agent.ReinforceAgent'>)

grave frost
#

my recommendation is to follow some tutorial that solves the game you want

willow quarry
grave frost
#

for solving above error, you can just cast it to int32

willow quarry
#

i did 4 tutorials

grave frost
#

and?

willow quarry
#

i'm still stuck here

#

cus my environment is kinda different

#

it has a 15-element array output

grave frost
#

yeah, without full knowledge, trying to make something custom sucks

willow quarry
#

for a virtual xinput controller

#

and the input for some darn reason always crashes stuf

#

but tell me more about cast it to int 3

#

int32

grave frost
grave frost
willow quarry
#

i would looooove to

grave frost
#

most of your issues can be solved via a bit of research

willow quarry
#

but that is my project to finish my full stack formation

#

i have 10 days to present

#

something

#

then i will polish stuf

grave frost
#

"full stack formation" what is that?

willow quarry
#

data and some more stuf formation

grave frost
#

are you in college?

willow quarry
#

no

#

its more like a course

grave frost
#

what did the course teach?

willow quarry
#

i had to quit colege

#

basic python

#

data analisis

#

scrap

#

html

grave frost
willow quarry
#

i changed country

#

tried colege here but need to worck to eat

#

actualy i am at work right now

#

night receptionisgt

#

i have lots of fre time at night

#

use that to study

grave frost
#

that is indeed very good 👍

willow quarry
#

yes

grave frost
#

but trust me, start with the absolute basics first - learn good amount of python, then move to data science

uncut orbit
#

that is true

willow quarry
#

yes

grave frost
#

it doesn't take a lot of time if you do it properly

willow quarry
#

i do agree with you

#

but if i dont show something

#

i wil not finish

uncut orbit
#

you have us right?

willow quarry
#

so i need the car to at least be able to turn by itself

grave frost
willow quarry
#

can i show you my screen???

grave frost
#

yes, a screenshot would be good

willow quarry
#

year cus i made some wrong thing and it only runs in one of my pcs

#

will take 3h to fix it

#

and time is something i dont have now

#

a sec please

#

here an example

#

it analizes the pos of the players trough its faces moving in the mile and return points

grave frost
#

did they teach reinforcement learning in the course?

willow quarry
#

with this config i can train 4 agents at once and in the future even use an ai x ai scoring method

#

no

#

kkkkkkkkkkk

grave frost
#

well, then what did they teach?

willow quarry
#

in AI they teach random forest

#

those basics

#

and then some tensorflow to recognise patterns and images

#

basically basic keras

grave frost
#

so how did you jump from that to rl?

willow quarry
#

i always loved rl

#

so i watched lots of videos

#

and i had an idea about enviroments

grave frost
#

rl is very complicated

willow quarry
#

i see it now

#

but i dont care if it works bad i just need it to at leat try

grave frost
#

I recommend you build something with the basics you've learned, do a related project, then learn more, do something more complex, and so on

#

I too tried to learn RL at first 😅

willow quarry
#

lest

#

least

grave frost
#

I quickly dropped it after I realized the amount of coding needed to do the theory - I still find it intriguing though.

#

again, do some image recognition with keras first using the code from the course

willow quarry
#

if i am able to present this project, my idea is to spend some time working on a real environment for retroarch

#

maybe even a custom version allowed to return framerates internally in a way readable from python

#

that way i can accelerate without losing sync

grave frost
#

all in good time, my man. first, learn the basics and then start implementing complex stuff.
I know it is not easy to do something that may be boring/you do not like - but it is the best way to learn

willow quarry
#

if i am able to make nuice inviroments i would like to make a twitch to host ai x ai tournaments

#

its not boring

#

even the basics are awesome

#

man is a self learning algoritym

grave frost
#

then why aren't you doing a project related to Random forest/ tensorflow image recognition?

willow quarry
#

cus there is everione in my course doing it

grave frost
willow quarry
#

i just wanted somefing that can turn my car around a trac

#

something*

grave frost
#

yeah, it looks simple but is actually not

willow quarry
#

its not hard tooo

grave frost
#

it is

willow quarry
#

i just really really suck at tensorflow at the moment

#

and even if i understand the logic behind it

#

there is always an int32 in the way

grave frost
willow quarry
#

man

#

the basic pyton is easy

grave frost
#

leave that project for a moment

willow quarry
#

i've made games in c# for years

#

my problem is when you drop in tons of libraries

#

i am used to writing my own code so i know its flaws

#

i started programming at 12

grave frost
#

well, then it's pretty hard to direct you since solving all those errors would take months

willow quarry
#

kkkkkkkk

#

i have just one last question

#

tf_agent = tf_agents.agents.ReinforceAgent(
time_step_spec = time_step_spec,
action_spec = Tensod_spec,
actor_network=actor_net,
optimizer=lr_schedule,
normalize_returns=True,
train_step_counter=train_step_counter
)

#

at this part in the code

#

what exactly am i supposed to place at action_spec

#

i understood how to create specs

grave frost
#

you mean action space?

#

all the outputs your model can do (like go left, or right)

willow quarry
#

so is the output spec

#

nice

#

i thought i was going crasy

grave frost
#

yea, it would look something like this action_spec = BoundedTensorSpec(....)

willow quarry
#

year

#

i made an array 14 bounded tensor

#

but my actor didn't liked it

#

so i made it a tensor spec

grave frost
#

no, it has to be bounded

willow quarry
#

iit is

grave frost
#

no tensor spec, only bounded tensor spec

willow quarry
#

wen we convert it keeps the bounds doesn't it???

grave frost
willow quarry
#

TF_Ximput_specs = tf_agents.specs.BoundedArraySpec(
(15,), dtype=np.float32 , minimum=[0,0,0,0,0,0,0,0,-32768,-32768,-32768,-32768,0,0,0], maximum=[1,1,1,1,1,255,1,255,32767,32767,32767,32767,15,1,1], name="XimputSpecs"
)
TF_ScreenRead_Specs = tf_agents.specs.BoundedArraySpec(
[1 , 440 , 600 , 1], dtype= np.int32 , name="ScreenSpecs"
)

Tensod_spec = tf_agents.specs.tensor_spec.from_spec(TF_Ximput_specs)
Tensod_spec2 = tf_agents.specs.tensor_spec.from_spec(TF_ScreenRead_Specs)

print(Tensod_spec2)
print(Tensod_spec)

#

this code

#

BoundedTensorSpec(shape=(1, 440, 600, 1), dtype=tf.int32, name='ScreenSpecs', minimum=array(-2147483648), maximum=array(2147483647))
BoundedTensorSpec(shape=(15,), dtype=tf.float32, name='XimputSpecs', minimum=array([ 0., 0., 0., 0., 0., 0., 0.,
0., -32768., -32768., -32768., -32768., 0., 0.,
0.], dtype=float32), maximum=array([1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00,
2.5500e+02, 1.0000e+00, 2.5500e+02, 3.2767e+04, 3.2767e+04,
3.2767e+04, 3.2767e+04, 1.5000e+01, 1.0000e+00, 1.0000e+00],
dtype=float32))

#

this output

#

some how it keeps bounded

grave frost
#

oh, then it would remain bounded

willow quarry
#

year

grave frost
#

I thought you have changed the whole object

willow quarry
#

i thought it would unbound to

#

it would crashh all cus its already a weard array with 'ints' and axes limits and all

grave frost
#

BoundedTensorSpec(shape=(15,), dtype=tf.float32, see the dtype? you change tf.float32 to tf.int32 to get it in integers

willow quarry
#

year

#

there comes the problem

grave frost
#

sorry, have to go. talk to you later

willow quarry
#

got a new error

#

khbasw

#

by

lone ginkgo
#

does anyone know how to make a bot embed

odd lion
#

This is mostly a pandas question, trying to figure out how to conditionally drop rows based on a list of values. I have data that looks like this:
COL1|COL2
1|1
1|2
1|3
2|2
2|3
2|6
2|7
2|8
And I have another dataset like this
KEEP_NUM|DROP_LIST
1|[2,3]
6|[7]
The goal is if the number in KEEP_NUM is in COL2, and any of the DROP_LIST are also in COL2 for the same COL1, drop those in the DROP_LIST, so the above would look like

COL1|COL2
1|1
2|2
2|3
2|6
2|8
Since for COL1=1, 1 was in there so it dropped 2 and 3. For COL1=2, 7 existed with 6 so 7 was dropped, however neither 2 or 3 were dropped since there was no 1

I know I could do this reasonably easily with just a few for loops, but that's horribly inefficient.
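
One loop-free way to approach this, sketched with merges on the example data above. The intermediate names (`pairs`, `to_drop`, `marked`) are made up, and it assumes every DROP_LIST entry is an integer.

```python
import pandas as pd

df = pd.DataFrame({"COL1": [1, 1, 1, 2, 2, 2, 2, 2],
                   "COL2": [1, 2, 3, 2, 3, 6, 7, 8]})
rules = pd.DataFrame({"KEEP_NUM": [1, 6], "DROP_LIST": [[2, 3], [7]]})

# One row per (KEEP_NUM, value to drop) pair.
pairs = rules.explode("DROP_LIST").rename(columns={"DROP_LIST": "DROP"})
pairs["DROP"] = pairs["DROP"].astype("int64")

# For each COL1 group that contains a KEEP_NUM, list the COL2 values to drop.
to_drop = (
    df.merge(pairs, left_on="COL2", right_on="KEEP_NUM")[["COL1", "DROP"]]
    .drop_duplicates()
    .rename(columns={"DROP": "COL2"})
)

# Anti-join: keep only the rows that do not appear in to_drop.
marked = df.merge(to_drop, on=["COL1", "COL2"], how="left", indicator=True)
result = marked.loc[marked["_merge"] == "left_only", ["COL1", "COL2"]]
result = result.reset_index(drop=True)
print(result)
```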

willow quarry
#

about my question

#

the problem with building the agent was the conv layer, which was breaking everything

#

but it would be advisable to have a nice conv2d layer since my main input data is images

#

i will try to do that in my actor_distribution_network

#

preprocessing_layers

dapper halo
#

not sure if this exactly belongs here, but do you guys have any idea why

#

my dataframe is unshuffling itself?

y_data = pd.concat([Dataframe.pop(x) for x in ['Metallicity', 'Density']], axis=1)

maskedData = feature_mask(Dataframe)

normalizer = preprocessing.MinMaxScaler()
transformed = pd.DataFrame(normalizer.fit_transform(maskedData))

transformed['Metallicity'] = y_data['Metallicity']
transformed['Density'] = y_data['Density']
#

the first line (.sample) definitely shuffles it, but once I put the y_data back into the dataframe to save it, it unshuffles it.

dapper halo
#

scratch that. Still confused, but idk why I didnt just shuffle again afterwards. Problem resolved

#

I take that back. It screws up my labels. Ya im so confused why this unshuffles it. Maybe I just need to reset the index?
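
A likely explanation, assuming the frame was shuffled before this snippet: the frame built from `fit_transform` gets a fresh 0..n-1 index, while `y_data` still carries the shuffled index, and column assignment in pandas aligns on index labels rather than on position. A small sketch with made-up numbers:

```python
import pandas as pd

# Toy data in which y is always x / 10, so misalignment is easy to spot.
original = pd.DataFrame({"x": [10, 20, 30, 40], "y": [1, 2, 3, 4]})
shuffled = original.iloc[[2, 0, 3, 1]].copy()  # explicit shuffle; index is 2, 0, 3, 1
labels = shuffled.pop("y")

# Like the scaler output: same row order, but a brand-new 0..3 index.
features = pd.DataFrame(shuffled.to_numpy(), columns=shuffled.columns)

misaligned = features.copy()
misaligned["y"] = labels             # aligned by index label: y comes back unshuffled

aligned = features.copy()
aligned["y"] = labels.to_numpy()     # positional: row order is preserved

print(misaligned)
print(aligned)
```

Resetting the index on both pieces before recombining (`reset_index(drop=True)`) should work as well.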

velvet thorn
#

actually

#

hm

#

you can do it without

#

but I would do it with

#

like spread it out

#

so it’s in the same format as the first one

wintry oyster
#

how do i access the contrib library of tensorflow
tf.compat.v1.contrib doesn't work

soft salmon
#

is this valid forward pass?
inputs --> layer 1--->layer 2--->softmax_activation-->cross_entropy-->MSE(Mean square error)
layer1 = sigmoid((input x weights1) +bias1)
layer2 = sigmoid((input x weights2) +bias2)
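
Not an authoritative answer, but in the usual setup the second layer takes the first layer's output (not the raw input), softmax is applied to the second layer's raw scores rather than to a sigmoid, and a single loss is used (cross-entropy here) instead of feeding one loss into another. A NumPy sketch with made-up shapes and random weights:

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
inputs = rng.normal(size=(5, 4))       # 5 samples, 4 features
labels = np.array([0, 2, 1, 2, 0])     # integer class ids, 3 classes

weights1, bias1 = rng.normal(size=(4, 8)), np.zeros(8)
weights2, bias2 = rng.normal(size=(8, 3)), np.zeros(3)

layer1 = sigmoid(inputs @ weights1 + bias1)
logits = layer1 @ weights2 + bias2     # layer 2 is fed layer1, not inputs
probs = softmax(logits)

# Cross-entropy: mean negative log-probability of the true class.
loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
print(probs.shape, loss)
```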

hallow bronze
#

x = iris[(iris['sepal.length'] > 5) and (iris['variety'] == 'Virginica')]
is my code

ValueError Traceback (most recent call last)
<ipython-input-24-55b181a65cd7> in <module>
----> 1 x = iris[(iris['sepal.length'] > 2) and (iris['variety'] == 'Virginica')]

C:\Anaconda\lib\site-packages\pandas\core\generic.py in nonzero(self)
1327
1328 def nonzero(self):
-> 1329 raise ValueError(
1330 f"The truth value of a {type(self).name} is ambiguous. "
1331 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
it is giving me the above error
👆 👆
i'm using pandas

#

please help

spark stag
# hallow bronze ```x = iris[(iris['sepal.length'] > 5) and (iris['variety'] == 'Virginica')] ```...

try replacing and with & (bitwise and). Using and tries to evaluate the truthiness of the entire boolean Series produced by (iris['sepal.length'] > 5) before it moves on to (iris['variety'] == 'Virginica'). pandas objects don't like being cast to a single boolean (hence the error) and instead have their own methods for this (as it says in the error message). A bitwise and (&) instead compares the two boolean Series element by element, so the result selects the rows of the iris data frame that meet both of your conditions

#
>>> import pandas as pd
>>> df = pd.DataFrame({"a": range(7), "b": [5] * 5 + [2] * 2})
>>> df
   a  b
0  0  5
1  1  5
2  2  5
3  3  5
4  4  5
5  5  2
6  6  2
>>> x = df[(df["a"] >= 3) & (df["b"] == 5)]
>>> x
   a  b
3  3  5
4  4  5
solar phoenix
#

Hi all, I was wondering if anyone can help me. I have used skimage to detect some 'blobs' on an image. I now have the x, y location and radius of these blobs. There are thousands, and they are detecting what I want. I would like to get the average R, G, B value in each blob. I have tried masking in CV2 but that has issues. Can anyone talk me through this or suggest another solution (perhaps in skimage?)? Thanks in advance

solar phoenix
#

when i import my image in cv2 it is really dark

#

image2 = cv2.imread('./data/18b.jpg')

#

img = cv2.cvtColor(image2, cv2.COLOR_BGR2RGB)
plt.imshow(img)

#

when i import like this it is really dark

#

so i have no idea

#

also some of my ROI, are returning as NaN in one channel

solar phoenix
#

and i want to measure the mean RGB in each blob

lilac ferry
#

Im trying to read data from sql to python but got a

ProgrammingError: ('42000', '[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Must declare the scalar variable "@ID". (137) (SQLExecDirectW)')

@ID is one of the columns in the SQL table. I'm getting the error when using

query = pd.read_sql_query("""SELECT col_name, @ID, col_name FROM dbo.table_name""", conn)

Is the error due to "@" and how do i solve this?
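
Most likely yes: in T-SQL a name that starts with `@` is parsed as a variable, which would explain the "Must declare the scalar variable" message. Delimiting the identifier with square brackets is the usual way around it. A sketch, assuming the column really is named `@ID`; `build_query` is a hypothetical helper, and the table and column names are the placeholders from the question.

```python
def build_query(columns, table):
    """Bracket-quote each column name so names like @ID are read as identifiers."""
    quoted = ", ".join("[" + name.replace("]", "]]") + "]" for name in columns)
    return f"SELECT {quoted} FROM {table}"


query = build_query(["col_name", "@ID"], "dbo.table_name")
print(query)
# With a live connection this would then be: pd.read_sql_query(query, conn)
```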

dawn wing
red hound
#

Easiest way to save a tf 2.4.1 Tensor to file? I need to take some tensors from insinde my NN to apply manual operations for debugging purposes

lilac ferry
worn bough
#

I'm trying to train a model on BERT features calculated from texts, and starting with a simple linear regression model to see how it works. But my R^2 metric is getting out of hand for my testset (it's minus billions for some cross-validation folds). What could be going on? Could it be that my trainset is too small (500-1000) or that BERT embeddings have too many features (768)? Also the LinReg coefficients are huge.

next ibex
worn bough
#

@next ibex what do you mean by "add to display"? Where did you calculate the time spent? (And could it be that this is not a #data-science-and-ml question?)

next ibex
#

But nowhere can I find a solution to this

#

Apparently, no one knows how to do this 😦

worn bough
#

But if you want to display 'time spent', you'll need to calculate it. You need a 'time opened' and a 'time closed' for example. Do you have these?

next ibex
#

As you can see he is not there, and about this I came to ask for help here)

#

I do not know how to do that

worn bough
#

Just look at the left sidebar. Under 'AVAILABLE HELP CHANNELS' you can see three channels. Right now they are #help-candy #help-dumpling #help-coconut . If you click one of these and just type something, you claim it 🙂

#

Hopefully somebody will come along and answer your question

grave frost
#

what is your data preprocessing and embedding size?

worn bough
#

@grave frost I want to classify transcribed conversations. I split them up in sentences, every time the speaker changed is a new sentence. The conversations are labeled 0 or 1, the individual sentences get scores between 0-0.5 when the conversation is labeled 0, or 0.5-1 when the conversation is labeled 1, in both cases dependent on a word list occurring in a sentence, and those scores are smoothed afterwards for each conversation. The sentences are fed to BERT, which returns a vector of 768 floats. These features need to go into a new model.

#

The BERT features are completely meaningless for a human reader, but should have a low cosine distance when the sentences are similar.

#

I used the BERTje model actually, a Dutch variation on the original BERT model. But that's a detail.

grave frost
#

So each sentence in the transcribed conversation has a labelled value - and you are approximating the overall score of the sequence by averaging?

#

how did you get the word list? was it hardcoded?

worn bough
#

I'm not sure yet how I'd aggregate the sentences back to a conversation score. The word list was hard-coded, manually extended using WordScores, which scores words based upon co-occurrence.

grave frost
#

so in the dataset, your X is sequences and Y a corresponding score?

#

X --> src data | Y --> Target data

#

and even if your Y is numeric, you could bin it into categories and fine-tune BERT as a classification task on n labels

worn bough
#

Yes that's right! I'm done working for today, so tomorrow I'll try to implement the classification of the sentences in BERT itself. Thanks a lot for your help!

compact warren
#

Hello,

I have this Dataframe:

id   another   other   actors    other
1    ~         ~       A, B      ~
2    ~         ~       C         ~
3    ~         ~       A         ~
4    ~         ~       F, G      ~
5    ~         ~       A, B, F   ~
6    ~         ~       C, F      ~

What I want to do is graph the number of times each actor appears.

So what I see I have to do is:

Have a list of the actors (A, B, C, F, G) and then go looking in each row how many times each actor appears.

I want to know if there are methods in Pandas to do that, or do I have to implement it from scratch with numpy?

Thanks

uncut kindle
#

the actors attr is comma-separated?

#

you could create a list from that column (via pandas) and explode it, then groupby / plot
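
A minimal sketch of that approach, assuming the column is called `actors` and holds comma-separated strings. The sample frame is made up from the table in the question, and `actor_counts` is just an illustrative name:

```python
import pandas as pd

# Hypothetical frame mirroring the question; only the actors column matters here.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "actors": ["A, B", "C", "A", "F, G", "A, B, F", "C, F"],
})

# Split each cell into a list, give every actor its own row, then count.
actor_counts = (
    df["actors"]
    .str.split(",")
    .explode()
    .str.strip()          # drop the space left after each comma
    .value_counts()
)

print(actor_counts)
# actor_counts.plot(kind="bar") would draw the chart (needs matplotlib installed).
```

`value_counts()` does the same job here as a groupby followed by a size count, so no NumPy loop should be needed.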

compact warren
dreamy sky
#

Hi. I want some help deciding which course I should stick with.
The problem begins with me choosing an ML topic as my graduation project, but I have no experience with it; I only know how to code in Python. Our plan is to work on the "image colorization" problem, and there are 2 methods to do the job: one using convolutional neural networks, the other using autoencoders. To get through the scientific papers I have to know what these topics are. After searching I found two suitable courses that cover topics from the basics to advanced levels, which might make it possible to read scientific papers about the colorization problem.

The courses:

  1. UC Berkeley CS 182: Deep Learning.
  2. Udemy - Deep Learning A-Z™️: Hands-On Artificial Neural Networks

So which one do you recommend? And any advice?

PS. I'm a CS/Math student, so I have the programming and mathematical maturity that should let me and my team get through this stuff quickly. Thanks

polar dock
willow quarry
#

hello

#

so i was trying to use policies

#

i am trying with a random policy and actor policy

#

but i always get

#

AttributeError: 'TensorDataset' object has no attribute 'shape'

#
fullscreen = [110,130,710,570]

screenpil = ImageGrab.grab(bbox=fullscreen)
showprint = np.array(screenpil)
grayscreen = cv2.cvtColor(showprint, cv2.COLOR_BGR2GRAY)
screenrect = cv2.cvtColor(grayscreen, cv2.COLOR_GRAY2BGR)
grayscreen = grayscreen.reshape( 440, 600, 1)
grayscreen = grayscreen.astype(int)
print(grayscreen.dtype)
Tfdata =  tf.data.Dataset.from_tensors(grayscreen)
#grayscreen = numpy.asarray(grayscreen)
print(Tfdata)
arg = torch.from_numpy(grayscreen)

time_step = tf_agents.trajectories.time_step.TimeStep(
    step_type = "FIRST",
    reward = 0 ,
    discount = 1 ,
    observation = Tfdata
)
#

that is the input

willow quarry
#

ok

#

i got an idea

#

i hadn't given a batch size to the policy

#

but i am making my own environment

#

how can i make my own "batch_size"
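
In case it helps: a batch is usually just an extra leading dimension on the data, so a batch of one can be made by hand. A minimal NumPy-only sketch, assuming the policy expects observations shaped `(batch, height, width, channels)`. Whether TF-Agents wants this done manually or through one of its environment wrappers is not something this sketch settles:

```python
import numpy as np

# Stand-in for the 440x600 grayscale frame from the snippet above.
frame = np.zeros((440, 600, 1), dtype=np.float32)

# A leading axis of length 1 turns one observation into a batch of one.
batched = np.expand_dims(frame, axis=0)
print(batched.shape)
```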

uncut barn
#

for neural networks for binary classification do the labels need to be one hot encoded or is using label encoder enough?

willow quarry
#

looks like no pros are online today

#

i am having trouble feeding my agent with data

#

and cant run even random_policy

#

have 10 days to finish project

#

2 weeks reading docs no stop

willow quarry
#

i think i found my problem

#

pil_img = tf.keras.preprocessing.image.array_to_img(img)

#

i didn't know there was tf.image

hard canopy
#

Hello here. I have a pandas dataframe. One of the columns contains arrays. How would you get the top n most common items of those arrays?

#

oh i could do with an accumulator or something

tidal bough
#

so the column is of dtype object, and each cell in it is an array?

hard canopy
#

yes. I got it with a df['xxx'].str.split()

tidal bough
#

I'd just make a Counter from each and add them all up. If that's too slow, use a faster multiset library.
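
A small sketch of the Counter idea, assuming a column named `xxx` that gets split into lists of tokens as described above. The sample strings are made up:

```python
from collections import Counter

import pandas as pd

# Hypothetical data: a string column that is split into lists of tokens.
df = pd.DataFrame({"xxx": ["a b c", "a b", "a d"]})
tokens = df["xxx"].str.split()

# Build one Counter per cell and add them all up.
total = sum((Counter(cell) for cell in tokens), Counter())

# The n most common items across all cells (here n = 2).
top_n = total.most_common(2)
print(top_n)
```

Calling `total.update(cell)` in a plain loop over the column would give the same counts without creating a temporary Counter per row.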

tidal bough
willow quarry
#

i was about to say it, an iterator is always helpful

hard canopy
#

oh so array columns are not ndarrays like the rest?

#

oh no i see what you mean

willow quarry
#

hey confused do you understand tensor specs??

tidal bough
hard canopy
#

I got it wrong, i thought you were talking about the column, while you were talking about the column elements

tidal bough
#

each column of a DataFrame is a numpy array, but if it's storing something like lists or numpy arrays, that can only be done by making the column's dtype object, which essentially gives up most of numpy's vectorized operations

#

which means in that case you can just iterate over the column - there isn't really a faster vectorized way.

hard canopy
#

ok no point in trying to optimize here, unless i go to a different lib to parse my csv. Thank you for the help 🙂

#

I'll go the slow way and sleep on it if it's too long

echo orbit
#

Hi, i'm working on a neural network (to establish a link between RGB values & wavelength/purity) but it seems my loss is way too high: i expected to get around 0.005 after 50 epochs, and i get something like 0.25 (roughly the same result with 50 and 100 epochs). Any idea how to reduce the loss, please?
Here's the code:

Regarding the data:

F=np.load('data/data_RGB_Train.npy')
#Normalisation de F['lambda']
Flo=F['lambda']
Flm=np.min(F['lambda'])
Flo-=Flm #Flm < 0
FlM=np.max(Flo)
Flo/=FlM
#Normalisation de F['pure']
Fpure=F['pure']
Fpm=np.min(F['pure'])
Fpure+=Fpm #Fpm > 0
FpM=np.max(Fpure)
Fpure/=FpM

R=F['R']
V=F['G']
B=F['B']
L=np.array([R,V,B]).T

X_t=copy.deepcopy(L)
X_test2=np.random.random((len(Flo),3))
y_t=np.zeros((len(Flo),2))
y_test2=np.random.random((len(Flo),2))
for i in range(len(Flo)):
    y_t[i,:]=[Flo[i],Fpure[i]]

Regarding the network :

model=km.Sequential()
model.add(kl.Dense(20,activation='tanh',input_dim=3))
model.add(kl.Dense(10,activation='tanh'))
model.add(kl.Dense(5,activation='tanh'))
model.add(kl.Dense(3,activation='tanh'))
model.add(kl.Dense(1,activation='linear'))
model.compile(optimizer='Adam',loss='mse')
model.output_shape
model.summary()

history2 = model.fit(X_t, y_t[:,0],
          batch_size=32,
          epochs=50,
          validation_split=0.3, # fraction of data being used for val_loss evaluation
          verbose=0)
ev = model.evaluate(X_test2, y_test2[:,0])
print(ev)

```Output : 1024/1024 [==============================] - 1s 1ms/step - loss: 0.2475
0.24748222529888153```

the pic is the loss/val_loss plot (for 50 epochs)
desert oar
#

@echo orbit how did you decide on that specific architecture? maybe try something simpler with fewer layers and add regularization

#

and why is validation loss lower than training loss? that's odd

echo orbit
#

That was an architecture suggested by one of my teachers

#

And yes it's odd and i couldn't figure out what was wrong

#

is it the way i initiated ? (reversing X_train with X_test, same with Y_train & Y_test ?)

desert oar
#

your code is a bit hard to follow...

#

your test data is random?

#

that doesnt make sense to me

echo orbit
#

Well i'm kinda new to neural network so it's highly possible i made a mistake when initiating

desert oar
#

also, does normalization between 0 and 1 actually make sense for this data? i don't think normalization is a good idea unless the inputs are actually bounded

#

certainly randomized data is not a suitable alternative for out-of-sample validation data

#

however i think its a great idea to fit your model on simulated data before you start trying to fit it on real data. if it performs badly on simulated data that you know has a strong relationship, then your model has a bad architecture for the problem

#

"to establish a link between RGB values & wavelength/purity"
do you have a theoretical physics/optics model of some kind that describes this relationship? if so, you could (and should) generate fake data using that relationship and fit the model on that fake data, to make sure your architecture and model fitting procedure is good

echo orbit
#

To sum up, i have data (loaded in the F variable) made of vectors with R, G, B, wavelength and purity each (around 32.5K points, so 32.5K vectors). There is no explicit formula that links RGB with wavelength & purity, and i was asked to make a neural network & verify whether it finds a formula/relation between these parameters

#

The previous notebook had a similar issue but the formula was known so that wasn't an issue

desert oar
#

i see.. i guess i am willing to take that network architecture at face value from your teacher. but at least add regularization to it, and use a proper train/test split, or better yet cross validation, to estimate out-of-sample error
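
A rough sketch of what a cross-validation split does, using only NumPy. `k_fold_indices` is a made-up helper name for illustration; in practice something like scikit-learn's `KFold` would do the same job:

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)   # shuffle once, up front
    folds = np.array_split(shuffled, k)     # k roughly equal chunks
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Every sample ends up in the test part exactly once across the k rounds.
splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(sorted(test_idx))
```

The model would be refit on each `train_idx` and scored on the matching `test_idx`, and the k scores averaged to estimate out-of-sample error.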

#

it looks like you randomly generated X's and Y's that have no relationship to each other for the test data

#

which doesn't make any sense

echo orbit
#

I was looking at the problem that way :

  • i have generated data in a .npy file that i load into a variable F. I extracted the parameters of interest into different variables (X_train & y_train), then normalized them (so i can at least take a look at them).
  • for the neural network, i was thinking i should generate completely random X & Y values to test the network with, using X_train & y_train as the training data, then evaluate the loss & val_loss so i can compare them & verify whether the network works correctly or not
#

At least that's what i was understanding till now

#

I was thinking earlier about finding a relation between each of these parameters but i couldn't find any (& that surely was intended), so i went with purely random values to see how it works

desert oar
#

you should find 0 relationship between X_train and Y_train if they're totally random

#

so the loss should be approximately the same as randomly guessing Y_train values

#

im not sure there is any benefit in doing this

#

however if you can randomly generate X and Y that have some known relationship, you should be able to compare the function learned by the machine to see if the machine is able to learn a relationship

echo orbit
#

I don't see what i can do then

#

well there is no explicit relationship

#

as mentioned before

#

(though there are obvious relationships between wavelength & colors)

desert oar
#

right, but the point is that you want to see if your model can find any relationship right?

#

so try making up a few possible relationships to see how well the model does

#

use that to debug your code and your cross validation pipeline etc

#

also like i said, you will definitely want to add regularization to the model

#

i think also you're conflating two needs: 1) "make sure the model works", and 2) "evaluate the model on out-of-sample data"

#

for the former, use simulated data. for the latter, use a train/test split or cross-validation.
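
Both needs can be sketched in a few lines of NumPy. This is an illustration only: the linear relationship and `true_w` are made up, and ordinary least squares stands in for the neural network so the example stays small:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) Simulated data: inputs in [0, 1) with a known relationship plus small noise.
X = rng.random((1000, 3))
true_w = np.array([0.5, -0.2, 0.3])
y = X @ true_w + 0.01 * rng.standard_normal(1000)

# 2) Out-of-sample evaluation: hold out 30% drawn from the same process.
split = int(0.7 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Stand-in for the network: least squares fitted on the training part only.
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

mse_model = np.mean((X_test @ w - y_test) ** 2)
mse_baseline = np.mean((y_train.mean() - y_test) ** 2)  # always predict the mean
print(mse_model, mse_baseline)
```

If a model cannot beat the predict-the-mean baseline on data where the relationship is known to exist, the problem is likely in the pipeline or architecture rather than in the data.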

supple knoll
#

hello! new-ish to pandas and wracking my brain over this one... any help would be greatly appreciated

import pandas as pd

prices = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'price': [10.0, 20.0, 30.0],
})

balances = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'val_a': [2, 3, 4],
    'val_b': [3, 1, 6],
})

incomes_eth = balances.set_index('slot').diff().reset_index()

# Doesn't work
# prices['price'] * incomes_eth

# Works but is hard-coded
print(prices['price'] * incomes_eth['val_a'])
print(prices['price'] * incomes_eth['val_b'])
#

also, please butcher my code - i'm trying to learn

exotic maple
#

you can do DF * Series because its ndim * a single column / row

#

why not merge both dataframes before operating?

#

you can concat on columns and create a single df

supple knoll
#

First off, thanks for responding! 🙂

#

I mean, the end goal is to have a dataframe saying the income in USD for a validator at a given slot. e.g

incomes_usd = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'val_a': [float('nan'), 20.0, 30.0],
    'val_b': [float('nan'), -40.0, 150.0],
})
#

Would merging the dataframes help with that?

willow quarry
#

lambda???

supple knoll
#

So I'm assuming this isn't a super common operation? I wonder if I'm going about this wrong

exotic maple
willow quarry
#

so with lambda you can make operations that will be done in every row

exotic maple
#

I'd prefer to merge dataframes and operate as vectors. A lambda might do it iteratively

#

BUT if this is some kind of script and its memory constrained or permission restricted, i guess he could do a lambda

#

tbh id prefer a function

willow quarry
#

i am not good with the ins and outs of libraries so i like to stick to basic logic kkkkkk

exotic maple
#

like this

willow quarry
#

hey warden are you good with tensorflow agents???

exotic maple
#

def func(df1, df2):
    # extract relevant columns here
    # operate
    # reform in desired format
    ...

willow quarry
#

and specs

exotic maple
#

ive never used tensorflow

supple knoll
#

This is meant to run on a larger-ish dataset... So definitely not a one-off small dataset.

#

So I figure staying with pandas best practices would be best?

willow quarry
#

you have no idea how lucky you are

exotic maple
#

merging dataframes might not be good idea

#

print(prices['price'] * incomes_eth['val_a'])
print(prices['price'] * incomes_eth['val_b'])

#

what is the problem with that though?

#

it's not "hard" coded from what i see?

#

or do incomes_eth columns vary in number?

willow quarry
#

have you tried to create a 3rd dataset and fill it row by row??

supple knoll
#

Exactly... On actual data the columns are expected to vary depending on certain inputs.

exotic maple
supple knoll
#

Just curious if there's an elegant one-liner or do I need to build it in a for loop.

#

Wouldn't that be less performant though?

exotic maple
#

for column in df2.columns:
    df1 * df2[column]

#

I mean that's exactly what you want

#

you're not doing a matrix operation

supple knoll
#

Right, but let's say I have 10k columns, wouldn't it be better to let pandas handle all that iteration or parallelism or whatever internally?

willow quarry
#

pandas would only do the df1 * df2[column]

#

it would be around the same millisecond time difference

#

i believe

exotic maple
#

its the same operation you're trying to do

#

and tbh 10k vector operations don't sound too large

supple knoll
#

It does if you're trying to do them fast 🙂

#

But you're right - this does work

import pandas as pd

prices = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'price': [10.0, 20.0, 30.0],
})

balances = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'val_a': [2, 3, 4],
    'val_b': [3, 1, 6],
})

incomes_eth = balances.set_index('slot').diff().reset_index()

incomes_usd = pd.DataFrame({
    'slot': incomes_eth['slot']
})

for c in incomes_eth.drop(columns=['slot']).columns:
    incomes_usd[c] = prices['price'] * incomes_eth[c]

incomes_usd
#

thanks for the tip, that's probably fine for now...

exotic maple
#

if i were you id do the loop

supple knoll
#

Like the above?

exotic maple
#

and research more. personally i've never encountered that situation

supple knoll
#

Just to confirm, the above snippet is the solution you were suggesting, right?

#

Honestly, I come from Go and people complain about that being hard, but pandas has been kicking my butt these last few hours 😄

exotic maple
#

I'd be very impressed if you figured pandas out in a couple of hours lol

supple knoll
#

Hehe I will stick around so that you can witness the years it takes me 😄

#

also, looks like this'll do the trick as well (thx @willow quarry for the lambda tip)

incomes_eth.apply(lambda x: x if x.name == 'slot' else prices['price'] * x)
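
For what it's worth, pandas can also do the whole multiplication in one vectorized call, without a loop or `apply`. A sketch on the same sample data, assuming it is fine to keep `slot` as the index while computing (`price_by_slot` is just an illustrative name):

```python
import pandas as pd

prices = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'price': [10.0, 20.0, 30.0],
})

balances = pd.DataFrame({
    'slot': ['1', '2', '3'],
    'val_a': [2, 3, 4],
    'val_b': [3, 1, 6],
})

# Keep slot as the index so rows line up by slot, not by position.
incomes_eth = balances.set_index('slot').diff()
price_by_slot = prices.set_index('slot')['price']

# axis=0 aligns the Series with the row index and multiplies every column by it.
incomes_usd = incomes_eth.mul(price_by_slot, axis=0).reset_index()
print(incomes_usd)
```

This also explains why the plain `prices['price'] * incomes_eth` did not work: by default pandas aligns the Series index against the DataFrame's column labels rather than its rows.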
exotic maple
#

to me it looks like it would be the same either way

analog cave
#

hi i've created a DCGAN, and the generator loss is gradually increasing throughout each iteration, any ideas what could cause this..? any help would be greatly appreciated, thanks.

exotic maple
#

and in that case readability is better

red hound
#

has anyone encountered the issues that a softmax layer outputs totally wrong values? TF Version 2.4.1

willow quarry
#

and also to make questions and talk about

#

so i just encountered a complicated situation

#

and i have no idea how to do about it

#

with tensorflow, to do Reinforcement Learning we have steps to follow

#

1 make an environment

#

2 get environment specs

#

3 create actor (neural network)

#

4 create agent (manager to the neural)

#

5 make data with policies

#

6 train agent

#

ok

#

but my model starts with an image on screen

#

so the logic is to go with convolutional network

#

if i use conv with int32 dtype my agent says

#

Cannot convert -0.012009611535381534 to EagerTensor of dtype int32

#

if i use float

#

cannot compute Conv2D as input #1(zero-based) was expected to be a int32 tensor but is a float tensor [Op:Conv2D]

#

i would like to try other actors

#

but i dont know how to set them up
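
Both errors look like a dtype mismatch between the image and the network weights, since convolution weights are floats. A guess rather than a confirmed fix: feed the image as float32 and declare the observation spec with that same dtype. The casting part, in NumPy only, with a random array standing in for the screen grab:

```python
import numpy as np

# Stand-in for a grayscale screen grab with integer pixel values 0..255.
gray = np.random.default_rng(0).integers(0, 256, size=(440, 600, 1))

# Cast to float32 and scale to [0, 1] so the observation dtype matches float weights.
observation = gray.astype(np.float32) / 255.0
print(observation.dtype, observation.min(), observation.max())
```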

willow quarry
#

soooooooo

#

i realized one thing

#

with that code in the actor

#

conv_layer_params = [(5, 2, 0)] ,

#

TypeError: Cannot convert -0.5 to EagerTensor of dtype int32
In call to configurable 'ReinforceAgent' (<class 'tf_agents.agents.reinforce.reinforce_agent.ReinforceAgent'>)

pale oasis
#

is anyone good at pandas here?

#

i have the simplest problem but i can't figure it out

#

for instance, if i have 2 dfs

#

both have the same columns -> x, y, z

#

if df1 has rows, NA, NA, 3
and if df2 has rows 3, 3, NA
I wanted my merged df to be 3, 3, 3

#

is there a way to do this? thanks in advance!

exotic maple
#

mmm

#

The first thing that comes to mind is

#

df1.iloc[:2] = df2.iloc[-1]

Basically, set those rows to be exactly equal as is

#

obviously this isn't scalable or foolproof

#

is that NA an np.nan? or string?

pale oasis
#

uhh it's actually NaN

#

actually slight change in the desc

#

so df1 has cols -> x, y, z

#

df2 has cols -> x, y, a

#

i want to merge these two dfs, but i want them combined into one row because cols x and y are guaranteed to be the same

#

so df3 (merged) should have cols x, y, a, z

#

and if df1 has 1 row and df2 has 1 row, then df3 should have 1 row as well

#

currently, what is happening when i merge with outer, is i get df3 with the 4 cols but 2 rows

#

so x, y, z, a
5, 5, NAN, 5
5, 5, 5, NAN

#

i want it to be 5, 5, 5, 5
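
Two possible ways to get the single row, sketched on made-up one-row frames. Which one fits depends on how the real frames are built, so treat both as assumptions:

```python
import pandas as pd

df1 = pd.DataFrame({"x": [5], "y": [5], "z": [5]})
df2 = pd.DataFrame({"x": [5], "y": [5], "a": [5]})

# Option 1: merge on the shared key columns only, so matching rows are combined.
df3 = df1.merge(df2, on=["x", "y"], how="outer")
print(df3)

# Option 2: if the rows are already stacked with NaNs, groupby(...).first()
# keeps the first non-null value per column within each (x, y) group.
stacked = pd.concat([df1, df2], ignore_index=True)
collapsed = stacked.groupby(["x", "y"], as_index=False).first()
print(collapsed)
```

Getting two rows from an outer merge usually means the join keys included more than `x` and `y`, or that the key values do not match exactly in both frames.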

glad mulch
#
def signal(df):
    '''
    1 = Expansion: Above mean and Positive slope
    2 = Downturn:  Above mean and Negative slope
    3 = Slowdown: Below mean and Negative slope
    4 = Recovery: Below mean and Positive slope
    '''
    slope = df['3MA'] - df['3MA'].shift(1) > 0
    if df['Cycle'] >= 0 & (slope >= 0):
        return 1 
    elif df['Cycle']>= 0 & (slope < 0):
        return 2
    elif df['Cycle'] < 0 & (slope < 0):
        return 3
    elif df['Cycle'] < 0 & (slope >= 0):
        return 4
    else:
        return np.nan
    return signal
#

i keep on getting an AttributeError: 'numpy.float64' object has no attribute 'shift' and i've tried to look up how to fix it but the results haven't been helpful

#

any idea on wtf i should do

serene scaffold
#

@glad mulch I would print out what df['3MA'] is and see if it's what you expected

#

it looks like it isn't a dataframe

glad mulch
#

it would be a series right

#

because its in a df

serene scaffold
#

I would print it out and see

glad mulch
#

looks correct to me

#

oh wait

serene scaffold
#

that isn't necessarily the same data

glad mulch
#

ah that was a good catch. i have a 3m ma and a 12ma so i had some nan values

#

but it still doesnt work

serene scaffold
#

I would make print(df) the first line of the function you provided

glad mulch
#

here is the original df

#

here is the printed df

#

in the function

#

it gets to here and shits the bed

serene scaffold
#

so df['3MA'] is an individual number

#

it's not a series or a dataframe

#

looks like it's actually a series

#

that is, df appears to be a series and not a dataframe

glad mulch
#

oh you are right

#

oh wait

#

im using it in an apply function

serene scaffold
#

what is the .shift for?

glad mulch
#

essentially for each date i want to check if the cycle is above 0 and that the difference in the 3ma is positive

serene scaffold
#

I don't think I have enough energy at the moment to think through that but I wish you the best of luck

glad mulch
#

haha its all good

#

thank you for helping

#

oh i figured it out

#

YAY
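
For anyone hitting the same error: with `df.apply(signal, axis=1)` the function receives one row at a time, so `df['3MA']` is a single number and has no `.shift`. Also, `&` binds tighter than `>=`, so each comparison needs its own parentheses. One way to avoid both issues is to compute the labels for the whole frame at once. This is a sketch on made-up data, not necessarily how it was actually fixed:

```python
import numpy as np
import pandas as pd

def signal(df):
    """Label each row 1-4 from the sign of Cycle and of the 3MA slope."""
    slope = df['3MA'].diff()  # same as df['3MA'] - df['3MA'].shift(1)
    conditions = [
        (df['Cycle'] >= 0) & (slope >= 0),  # 1 = Expansion
        (df['Cycle'] >= 0) & (slope < 0),   # 2 = Downturn
        (df['Cycle'] < 0) & (slope < 0),    # 3 = Slowdown
        (df['Cycle'] < 0) & (slope >= 0),   # 4 = Recovery
    ]
    # Rows matching no condition (e.g. the first row, where slope is NaN) get NaN.
    labels = np.select(conditions, [1.0, 2.0, 3.0, 4.0], default=np.nan)
    return pd.Series(labels, index=df.index)

# Made-up sample data for illustration.
df = pd.DataFrame({'Cycle': [1, 1, 1, -1, -1], '3MA': [10, 11, 10, 9, 10]})
df['Signal'] = signal(df)  # called on the whole frame, not through apply
print(df)
```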

willow quarry
#

sooo

#

i discovered the problem

#

Cannot convert -0.012009611535381534 to EagerTensor of dtype int32