#data-science-and-ml


pastel valley
#
  1. flatten the images to make vectors
  2. get the mean face by getting the average vector of the vectors
    whats next?
#

i got this activity to use eigenface and svm for face recognition

#

but i am having a hard time understanding the steps

#

im not a math guy😅

pastel valley
#

for each class there should be a corresponding eigenface?
and for classifying ill get the eigenface of the input image then check which other eigenfaces it is close to, then it will belong to that class?
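
A minimal sketch of the usual eigenfaces steps, for orientation only. In the standard method the eigenfaces are not one per class: they are a shared set of principal directions computed from all training faces, and each image is then described by its weights along those directions. A classifier such as an SVM is trained on those weights. The random `images` array, the 8x8 size, and `k = 5` are assumptions made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a training set: 10 grayscale images of 8x8 pixels.
images = rng.random((10, 8, 8))

# 1. Flatten each image into a vector: shape (n_images, n_pixels).
X = images.reshape(len(images), -1)

# 2. Mean face = average of all image vectors.
mean_face = X.mean(axis=0)

# 3. Subtract the mean face from every image.
centered = X - mean_face

# 4. PCA via SVD: the rows of vt are the principal directions (the eigenfaces).
_, _, vt = np.linalg.svd(centered, full_matrices=False)
k = 5  # number of eigenfaces to keep (arbitrary choice for this sketch)
eigenfaces = vt[:k]

# 5. Describe every training image by its k weights along the eigenfaces.
weights = centered @ eigenfaces.T


def project(image):
    """Weights of a new image in eigenface space (same steps as training)."""
    return (image.reshape(-1) - mean_face) @ eigenfaces.T


print(eigenfaces.shape, weights.shape)
```

The rows of `weights`, together with the person labels, would be the training input for the SVM. A new face is classified by calling `project` on it and passing the result to the trained classifier.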

lapis sequoia
#

https://paste.pythondiscord.com/xiyarahofe.py this is my code, it's a school assignment so I don't expect any direct answer but could someone give me a hint of what I'm doing wrong? I get 4 graphs whereas 3 of them are exactly the same and one of them is completely out of line (broken). Seems like the data I'm trying to get is wrong? The "folkmängd" one seems to be correct, the rest are corrupts. Any idea? (ping on reply)

reef dock
#

Hi, I can't really code for what I need to do with this- I need to check the values from a column of a dataframe df1 and see if they match from a column from df2. If they do match, I want to take the corresponding value from the value matched and append another column from df2 to a new column in df1

bold timber
#

Hi i want to use UMAP but, i got an error like this? How to fix that?

final field
#

i cant manage to get object_detection in path, anyone??

signal bear
#

Hej hej, I'm building plots with plotly. And after some code checking I found out that I have a problem with my time data. I'm from Germany and so my time is recorded as dd.mm.yyyy but plotly needs the ISO format yyyy-mm-dd. Does anybody know how to tell the dataframe which part is the day and which is the month in my data? I have a date and a time column. After converting the time column to timedelta there was no problem combining both into a date_time column. But the date is still translated wrong.

#

date time should be 2021-03-05
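
One way to handle this, sketched under assumptions: pass the day.month.year layout explicitly to `pd.to_datetime` so pandas does not guess month-first. The column names `date` and `time` and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the date and time columns described above.
df = pd.DataFrame({
    "date": ["05.03.2021", "12.11.2021"],
    "time": ["08:30:00", "17:45:10"],
})

# Spell out the layout so 05.03.2021 is read as 5 March, not 3 May.
parsed_date = pd.to_datetime(df["date"], format="%d.%m.%Y")

# Add the time of day as a timedelta to get one combined column.
df["date_time"] = parsed_date + pd.to_timedelta(df["time"])

print(df["date_time"])
```

If the exact layout varies, `pd.to_datetime(..., dayfirst=True)` is a looser alternative, but an explicit `format` is stricter and fails loudly on unexpected input.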

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @lapis sequoia until <t:1638960164:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

grave frost
#

tbf, its just a formula - maybe ask a researcher?

pastel valley
#

yo optimizers on model are the one responsible on adjusting the neurons right?
and the basis the optimizer uses to decide which weights to adjust, and by how much, is the loss function?
adam optimizer and categorical crossentropy is what im trying to use on a cnn model

tardy plover
reef dock
#

Any viable way to pd.merge() if df1 is string and df2 has int type?

serene scaffold
reef dock
serene scaffold
reef dock
#

Ehm, I can give an example.

df1[col1] : KXW453, JQW134, WUHF532, ABFO121
df2[col2]: 34, 52, 27, 12

serene scaffold
#

This is not what I asked for. Please do print(df1.head().to_dict('list'), df2.head().to_dict('list')). This is the only format I will accept.

reef dock
#

I can't share the data, an example is what I can share though.

serene scaffold
#

There's no way that the example you gave could be interpreted as dataframes, so I'm at a loss for how to help. Good luck!

reef dock
#

Right. I made a mistake.

reef dock
reef dock
#

Or you could tell me if you need me to add more

serene scaffold
#

!e

import pandas as pd
col = pd.Series(['KXW453', 'JQW134', 'WUHF532', 'ABFO121'])
number_col = col.str.extract(r'(\d+)$').astype(int)
print(number_col)
arctic wedgeBOT
#

@serene scaffold :white_check_mark: Your eval job has completed with return code 0.

001 |      0
002 | 0  453
003 | 1  134
004 | 2  532
005 | 3  121
reef dock
# serene scaffold ```py df1[col1] : KXW453, JQW134, WUHF532, ABFO121 df2[col2]: 34, 52, 27, 12 ```...

My initial question was to find a way to match values of two columns, df1[col_1], df2[col_1] of two separate dataframes. The second dataframe, let's call it df2 has another column, df2[col_2] which basically is an index for the code/values in df2[col_1]. I wanted to match the values of df1[col_1] and df2[col_1] and if they match I wanted to add the values from df2[col2] (the index values) to another dataframe.

serene scaffold
#

It seems I don't know how to answer this in the abstract. If you can come up with example dataframes that encapsulate what the problem is, then I'll take a look at that.

lapis sequoia
#

Hi

#

I need some help

#

Recently I have completed few python libraries- Numpy, Pandas, and Matplotlib. So i was wondering what is that I can do next ... I want to do some projects. but I don't know where to find then and what's next.

reef dock
serene scaffold
#
df1.merge(df2, on='col_1')

?

reef dock
#

is it possible to run it here?

#

I doubt it though

serene scaffold
#

yes but you have to represent the data directly in the code, since you can't upload the CSV into our execution environment

#

yes, with !e

reef dock
#

Is it possible to do

df1['col_1'] = df1['col_1'].merge(df2, on='col_2')

or rather, what does the on= do?

serene scaffold
#

on specifies which column you want to use to merge.

reef dock
#

Okay so, is it not possible to perform the merge using specific columns between DataFrames?

serene scaffold
#
df1['col_1'] = df1['col_1'].merge(df2, on='col_2')

The result of a merge is an entire DataFrame, not a Series

reef dock
#

I think I'm getting confused between a particular desired column and a resulting DataFrame

serene scaffold
reef dock
#

Right, sort of, basically.

serene scaffold
#

sort of basically?

reef dock
#

If values in two columns match, I want the corresponding index values (e.g. the relationship between df2['col_1'] and df2['col_2']) to be returned.

serene scaffold
#

you want index values to be returned?

reef dock
#

Yes

serene scaffold
#

so you want to know which indices have values in both columns?

#

you don't want to modify the content of either dataframe?

reef dock
#

No, I just need the indices attached to the same values, I don't want to modify the content of the dataframes

serene scaffold
#

what if the indices are different for the same value? does that matter?

reef dock
#

Those indices are attached to the values that are matched

serene scaffold
#

but in the two columns, the same values might have different indices

reef dock
#

Can I dm you about this?

serene scaffold
#

I'm about to go offline for about an hour, so I probably won't be able to solve it in time

reef dock
#

I'd be able to explain it better

#

The first data set has a column with certain identifiers or codes I guess for products from this sample data. The second dataset has a column with different yet some same identifiers or codes and also has a column for indices for these identifiers. My aim is to see if the identifiers match. If the identifiers match, I want to store those specific indices in a column, if some identifiers don't match, I want to return "-" or NaN for that instance.

reef dock
bold timber
#

Hi do you know to fix this problem?

dreamy bone
#

try just x = UMAP()

dreamy bone
bold timber
loud cave
loud cave
#

Okay. So when you say you want to merge, you want to combine all the other columns too, right?

reef dock
#

I want to match the two columns I mentioned. If some values match, I want the resulting dataframe column to contain the index of the ones that match, and the ones that didn't should have a blank or a zero or a NaN instead

bold timber
dreamy bone
#

show me the whole thing pls

#

yeah UMAP is not a function @bold timber

loud cave
#

A toy example is something like this right?

df1 = pd.DataFrame({
    'col_1': ['KPWA231',
              'J2F372',
              'LKT121',
              'zTWE443',
              'NPR432'],
    'other_col': ['a', 'b', 'c', 'd', 'e']})

df2 = pd.DataFrame({
    'col_1': ['J2F372',
              'APQ124',
              'KJH152',
              'TWE443',
              'LKT121'], 
    'col_2':[30,
             17,
             12,
             24,
             52],
    'another_col': ['q', 'w', 'e', 'r', 't']})
dreamy bone
#

there might be a method inside UMAP you want to use

reef dock
dreamy bone
#

so it would be UMAP.method()

reef dock
#

According to that data how can I merge df1[col_1] with df2[col_2] on df2[col_1]

bold timber
loud cave
# reef dock This would be the toy data.

Is the result supposed to look like this:

result = pd.DataFrame({
    'col_1': ['KPWA231',
              'J2F372',
              'LKT121',
              'zTWE443',
              'NPR432'],
    'other_col': ['a', 'b', 'c', 'd', 'e'],
    'col_2': [pd.NA, 30,52, pd.NA, pd.NA ],
    'another_col': ['', 'q', 't', '', '']    
})
result
#

I.e., the rows for J2F and LKT are common for col_1 between the two original dataframes. You keep all the original values from df1 and use null values for df2
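
A minimal sketch of how that result could be produced with a left merge, reusing the toy frames from above. Selecting only `col_1` and `col_2` from `df2` is a choice made for this example. Add `another_col` to the selection if it is wanted too.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col_1": ["KPWA231", "J2F372", "LKT121", "zTWE443", "NPR432"],
    "other_col": ["a", "b", "c", "d", "e"],
})
df2 = pd.DataFrame({
    "col_1": ["J2F372", "APQ124", "KJH152", "TWE443", "LKT121"],
    "col_2": [30, 17, 12, 24, 52],
    "another_col": ["q", "w", "e", "r", "t"],
})

# how="left" keeps every row of df1; rows without a match in df2 get NaN.
result = df1.merge(df2[["col_1", "col_2"]], on="col_1", how="left")
print(result)
```

Only `J2F372` and `LKT121` appear in both frames, so only those two rows receive a `col_2` value. Because of the NaN entries, `col_2` comes back as a float column.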

dreamy bone
bold timber
dreamy bone
#

can you show me a snippet of the module?

bold timber
dreamy bone
#

yeah theres no method called fit_transform in it

bold timber
reef dock
bold timber
reef dock
#

So the result column would be like result : NaN, 30,52, NaN

#

Sort of

loud cave
reef dock
#

What does set_index() do?

dreamy bone
bold timber
#

youtube

loud cave
# reef dock What does set_index() do?

Every dataframe has an index. By default it is just a row number from 0 to N-1. `join` lines rows up by index (with `merge` you can pass `on=` to match on a column instead). Since you wanted to join on the value in column 1, it makes sense to use it as the index

dreamy bone
#

then just check the yt vid my man

reef dock
loud cave
reef dock
#

I'll have to work on this tomorrow though since it's pretty late for me now, thank you for the help. I'll reach out to you and i'll let you know if it works for me.

loud cave
pastel valley
#

like the simplified steps?

#

this is what i think it is
yo anyone here can help me understand eigenfaces?

  1. flatten the images to make vectors
  2. get the mean face by getting the average vector of the vectors
    whats next?
    i got this activity to use eigenface and svm for face recognition
    but i am having a hard time understanding the steps
    im not a math guy😅
bronze skiff
#

also fit_transform is part of the transformer api in sklearn, so spend a bit of time learning about the transforms in sklearn

sacred narwhal
#

can someone send a link properly explaining CNNs?

#

i dont get it ._.

bronze skiff
#

define properly explaining

#

this is straightforward

pseudo jackal
#

hi is someone free to help rn? i just started working with pandas

desert onyx
#

Hi, is someone free to discuss with me where should I start with making automatic script for reading information from ID picture?

serene scaffold
delicate sphinx
#

As if I just finished training my Tensorflow model after 1 month and finally try to predict it and all of my outputs are: [1,0,0,0,0,0,0,0,0,0.....]

#

rip

#

Does anyone with Tensorflow experience know how to correctly combine an image and question input?

My Images are of shape: (36,2048) and my questions are of shape: (64)

I didn't realise before because when I was using my batches, the error never comes up, but if I try to predict just one image and question from my dataset, I get the error:


ValueError: Data cardinality is ambiguous:
  x sizes: 36, 64
Make sure all arrays contain the same number of samples.

This probably explains why my answer outputs are always [1, x] where x is 63 zeroes. So clearly my training hasn't worked correctly.

limpid root
#

Just figured this out, I was a bit of an idiot. sklearn thought I was doing a classifier because for some reason I made the model a classifier when it was supposed to be a regressor. When it tried to convert my continuous data to classification labels, it figured out something was wrong and threw that error.

loud cave
delicate sphinx
#

I mean, I say a month, most of that was unsaved and restarted (checkpointed every few epochs)

#

but im basically at the start again because some random thing seems to have taught my model what an "unknown" token is more than anything else

loud cave
#

I wish I could remember more details, but there's some professor with a computer in his office that has been running a numerical experiment since the 90s. Would suck if it finally finishes but he had a bug in his code

delicate sphinx
#

i mean that's basically what I've done lmao

#

idek where to start on fixing it, I thought my training was nailed down

#

I'm really hoping the error is because of the extra dimension but god knows haha

#

even if it is the extra dimension, I still have a problem in changing that to be compatible

delicate sphinx
#

ah it's all good, my model.evaluate loss is tiny... (that is a joke)

shell depot
#

Hey guys

#

do you guys have information abt MLOps ?

#

what tools to use to work as MLOps engineer ?

tardy anchor
#

Course Overview!

quiet vault
#

nice

magic dune
#

pls

unique niche
#

Hey guys.. what kind of join/append/concat is needed to do following:

#

I'm doing it in pyspark.. but really I can use anything.. the actual DFs I'm using are pyspark DFs.. but I can convert. The above is just an example of what I'm trying to do.

serene scaffold
solemn charm
#

sup

#

ya guys know pyttsx3

unique niche
#

But if it won't.. then THANK YOU 😁

serene scaffold
#

An inner join, on the other hand retains only those rows that can be matched on both the left and right.

#

In set terminology, an outer join is union and an inner join is intersection.
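
A small sketch of that set analogy, using two made-up frames (`left`, `right`, and their columns are hypothetical). The analogy applies to the join keys.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Inner join: only keys present in both frames (intersection).
inner = left.merge(right, on="key", how="inner")

# Outer join: every key from either frame (union), NaN where a side is missing.
outer = left.merge(right, on="key", how="outer")

print(inner)
print(outer)
```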

unique niche
#

Yeah, 99% of the time at work I'm doing inner and left joins.. thank you!

delicate sphinx
#

Stelercus big brain

#

Man Tensorflow has amazing documentation until I try to figure out things like masking because I only found it after A LOT of specific googling

#

Imagine building and processing a model for a month and all you teach it is what an <unk> (unknown token) is

delicate sphinx
#

I have spent 12 hours trying to fix it and my Tensorflow model still only outputs this ,-,

raw vigil
#

whats the best algorithm to build the model for this problem

serene scaffold
raw vigil
#

all the tweets regarding a crypto

#

bit coin for example

serene scaffold
#

what is a "trend" in the context of crypto?

raw vigil
#

like how much tweet in a day about bitcoin

serene scaffold
#

what determines if a tweet is or is not about bitcoin?

raw vigil
#

i want to group all the tweets which contain word="bitcoin"

serene scaffold
#

if that's all, you can just use the tweepy library and retain tweets with the substring bitcoin in them.

raw vigil
#

and also do sentiment analysis

serene scaffold
#

the twitter API gives you a sentiment score.

#

whether or not their model for assigning that score is good... PeepoShrug

raw vigil
#

do i need to use db for vizualising the trend?

serene scaffold
#

db?

raw vigil
#

database to store data

serene scaffold
#

I don't see the connection between data storage and data visualization.

raw vigil
#

to get info of last 30days

#

so i can plot graph

serene scaffold
#

what will the two axes of your graph represent?

raw vigil
#

x axis date,y axis amount of tweets

serene scaffold
#

you don't need to hold the content of the tweets for that. just a running count of tweets of interest by day.
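
A running count by day can be sketched with the standard library alone. The `tweets` list below is invented sample data standing in for whatever the API returns.

```python
from collections import Counter
from datetime import date

# Hypothetical (day, text) pairs standing in for tweets pulled from an API.
tweets = [
    (date(2021, 12, 8), "Bitcoin is up again"),
    (date(2021, 12, 8), "lunch was great"),
    (date(2021, 12, 9), "selling my bitcoin"),
    (date(2021, 12, 9), "BITCOIN to the moon"),
]

# Count only tweets containing the substring, case-insensitively, per day.
daily_counts = Counter(
    day for day, text in tweets if "bitcoin" in text.lower()
)

# Sorted days give the x axis, counts give the y axis.
for day in sorted(daily_counts):
    print(day.isoformat(), daily_counts[day])
```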

raw vigil
#

oh okay,thx.my class teacher gave the problem statement told us to do build the model

#

but i just started learning ml

#

its hard to understand problem statements

pastel valley
#

yo anyone can help me understand the math of eigenfaces?

spiral zephyr
#

Did you ever solve the issue of torch.cuda.is_available() returning false in Conda even though you install the Gpu version. I’m on win10, with a 2080 most recent drivers.

marsh yacht
#

who else is taking the IBM data science coursera course

gusty sparrow
#

Is there anyone out there who is an expert in EEDG data visualization?
Plz contact me

serene scaffold
#

@pastel valley @marsh yacht @gusty sparrow it sounds like you're all waiting for someone before you ask a question. Don't ask to ask--just ask.

gusty sparrow
#

@serene scaffold
Hi ! Can you help me in my project to visualize EEDG data ?

serene scaffold
#

I appreciate that you're being polite, but in a real time chat such as this, the easiest way to give and receive help is to just put a question right out there. What data do you have? What have you tried to do to visualize it?

gusty sparrow
#

@serene scaffold
Ok brother now i understand your concern .
Thanks to your politeness .

true nacelle
#

If you have n observations from a pareto distribution (shape param 2, scale 1) and an expected value of 2, can you make the claim that the sqrt(n)*(x bar - 2) does not converge to a normal distribution?

#

Struggling with the central limit theorem here

#

Cause this does not contradict the CLT but I can't figure out why
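
A hedged note on why there is no contradiction: the classical CLT assumes a finite variance. A Pareto distribution with shape 2 and scale 1 has density 2/x^3 for x >= 1 and mean 2, but its second moment is the integral of 2/x from 1 to infinity, which diverges. The theorem's assumption fails, so its conclusion is not guaranteed. The sketch below evaluates the second moment truncated at a finite upper limit to show that it keeps growing. The function name is made up for this example.

```python
import math


def truncated_second_moment(upper):
    """E[X^2 * 1{X <= upper}] for the density f(x) = 2 / x**3 on x >= 1.

    The integrand x**2 * 2 / x**3 simplifies to 2 / x, whose integral
    from 1 to `upper` is 2 * ln(upper).
    """
    return 2.0 * math.log(upper)


# The truncated second moment grows without bound as the cutoff increases.
for upper in (1e2, 1e4, 1e8):
    print(upper, truncated_second_moment(upper))
```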

sturdy lotus
#
  1. Write your code to construct a dictionary variable data_large_countries whose keys are iso_code,

total_cases_per_million, total_deaths_per_million, population, population_density.

You can start from the original dataset data_dict.

You can adapt the code given to you in Instruction 2.

Print the dictionary.

sorry to ask this ....but can anyone help me with this
im a beginner

pastel valley
odd meteor
#
  1. Batch Training can solve this RAM issue. Load your data in batches. Utilize the chunksize parameter while loading your data in pandas.

Please do a quick Google search to see how Batch Training is done. I'm busy at the moment so I can't write code this time.

  1. Since you're using Sequential method to build your neural network, it's your prerogative to determine the number of hidden layers and nodes your network should have.

  2. Then on your second question about input shape this should handle it
    It's just an example

Assuming each of your hidden layer has 100 nodes and you're performing a classification problem.

input_data = df.drop(['target_column'], axis=1)
n_cols = input_data.shape[1]
from keras.layers import Dense
from keras.models import Sequential 
model = Sequential() 
model.add(Dense(100, activation = 'relu', input_shape = (n_cols, ))) 
model.add(Dense(100, activation='relu')) 
. 
. 
. 
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics = ['accuracy']) 
model.fit(input_data, target)

Then proceed from here to prediction, then evaluation.
If you're working with a more complex architecture, you should then configure yours appropriately.
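
The `chunksize` suggestion from point 1 can be sketched as follows. The in-memory CSV and its column names are assumptions standing in for a real file that is too large to load at once.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk.
csv_text = "feature,target\n" + "\n".join(f"{i},{i % 2}" for i in range(10))

rows_seen = 0
# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    features = chunk.drop(columns=["target"])
    target = chunk["target"]
    # A training step on (features, target) would go here.
    rows_seen += len(chunk)

print(rows_seen)
```

Only one chunk is held in memory at a time, so peak memory depends on `chunksize` rather than on the size of the file.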

lapis sequoia
#

Hello peeps!! I have a dictionary that looks like this. (see image)

#

year_wise_topk = {}
for i,year in enumerate(os.listdir(years_path)):
    topics,topic_prob,topic_ids = prob_topwords(reverse_vocab,year)
    year_wise_topk[str(i)+'_topic'] = topics
    year_wise_topk[str(i)+'_topic_prob'] = topic_prob
    year_wise_topk[str(i)+'_topic_ids'] = topic_ids

#

I have this code in which topics, topic_prob and topic_ids are lists

#

I want to plot the probability of for each year with their respective words....

#

How can i do that

#

?

#

the plot should look something like this... or any way to plot it

odd meteor
#

Ohh, it's a time series analysis, I understand. Have you tried using colab?

pale narwhal
#

can anyone make sense of what these variables in the k-means algorithm means?

odd meteor
#

I haven't worked with hdf5 dataset but I'm pretty sure it's gon follow the same syntax you'd normally use to load a csv data or image data into keras

odd meteor
lapis sequoia
tender hearth
spiral zephyr
lapis sequoia
#

ya that sounds ab right

#

idk, when i had the problem it very well couldve been bc of the fact that my windows was saying it wasnt legit even tho it is

#

maybe some features were turned off?idk

willow linden
#

Hey guys, any good math sources to learn AI oriented?

#

or not AI oriented, just good stuff you know of 😉

#

feel free to tag me

bright furnace
#
# -*- coding: utf-8 -*-
"""
Created on Fri Dec 10 11:14:59 2021

@author: HP
"""
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd



iris = pd.read_csv("Downloads/iris.csv")
iris.head()
iris.plot(kind = "scatter",x = "sepallengthcm", y = "sepalwidthcm")
inland zephyr
#

hello all
I want to do some investigation about the vector distance between each image augmentation and the embedding model (ex: VGG,Resnet,Mobilenet,etc.). Since i use milvus and normalize each vectors to use dot product distance (which give same output as cosine similarity) instead euclidean distance, i need to visualize each result grouped by model based on each object. Is there any tools that i can use for this task?

lone drum
#

My code this way
When I see my month_df in expiry column i am getting nan values
But when I check for single month using break statement it worked as expected but
When I remove break then I am getting nan values in expiry column

#

Please check code of for loop block

#

Ping me when replying

#

U can see in last expiry column of this df
I am getting nan values

warm jungle
#

Assuming 4 bytes per entry - an array of that shape is over a terabyte of data? It's not really practical to try and do that sort of thing on one machine?

fiery depot
#

Hey there, could someone help me with people motion tracker in opencv?

crisp cargo
#

if anyone has experience implementing gradient descent in python and is willing to help me get my head around some things, please drop me a message or an @ it would be greatly appreciated.

errant parcel
#

Given I've never used either before, is it wise to use jupyter lab or notebook?

#

vscode jupyter extensions seem pretty attractive

crisp cargo
#

I find google colab much better, with the same functionality

errant parcel
#

Hmm interesting

crisp cargo
errant parcel
#

Is that the one that also offers ML compute integrated?

crisp cargo
#

all in browser, allocates resources to your environment and is very good for all but very large scale ML operations

errant parcel
#

oh right

#

hmm i thought i heard about people abusing some google service to mine crypto

crisp cargo
#

I'm a student, with a ML module but was initially instructed to use jupyter and me and nearly all my class migrated to colab. So i'm (very) far from an expert but can recommend it on personal experience 🙂

#

and if you decide to switch they both support the same file formats (namely .ipynb)

sterile heath
#

@whole phoenix You can use == to compare numpy arrays.

#

You get back an array.

#

Try comparing two arrays of the same shape and dtype. Equal objects in the same location of each array tested will cause there to be a True in the array at the same position.

#

If you're only after one value at a time, however, you can compare a whole array to a single value. You'll get an array with True where it matches.
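
A short sketch of both comparisons described above, with made-up arrays. `np.array_equal` is included for the case where a single True/False for the whole array is wanted.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 0, 3, 0])

# Array against array: True wherever the two positions hold equal values.
elementwise = a == b

# Array against a single value: True wherever the array matches it.
against_scalar = a == 3

# One boolean for the whole comparison.
whole_array_equal = np.array_equal(a, b)

print(elementwise, against_scalar, whole_array_equal)
```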

crisp cargo
#

@sterile heath oooh you may be able to help me with a similar issue, I have to explore a mnist dataset and want to extract just the items that have a label of '1' but have been struggling to do so, okay if you haven't worked with them before just thought it's worth an ask whilst you're here 🙂

sterile heath
#

Extract. What does this mean?

#

Tally?

#

What do you mean by labelled?

#

Fair warning, I'm a numpy novice.

#

I'm picking things up, but I'm no expert.

#

Oooh. Okay. I just looked up mnist.

#

Yeah, that's computer vision/ML-related stuff.

crisp cargo
#

yeah, it's a pain haha i'll post my question regardless in case someone else is hovering around

sterile heath
#

Really not my area.

crisp cargo
#

the dataset I am exploring is for a binary classification problem, so each image is tagged with a label 1 or 0, i'm trying to extract those which are labelled just 1 so that I can print examples, but am finding it hard to do so when it seems like it should be easy

sterile heath
#

Do you have a link to said dataset?

#

I'm sure it is easy, once you know how.

crisp cargo
sterile heath
#

"Don't know how? Ha! Go pound sand."

crisp cargo
#

I can't wait to fail this module and be done with it haha

sterile heath
#

I assume you're asking about one of the npz files?

crisp cargo
#

I'm not too sure, i'm exploring just the breastmnist dataset

sterile heath
#

That tracks.

wicked grove
#

but i cant understand where i am doing wrong

#
```py
import os

LABELS_g = []
for root, dirs, files in os.walk(r'D:\glaucoma_train\ODIR-5K\ODIR-5K\Training Images', topdown=False):
    for name in files:
        print(name)
        #img_name_g = name
        #print(img_name_g)
        # look up the label row whose 'image' value equals this file name
        row_g = df_labels_g.loc[df_labels_g['image'] == name]
        print(row_g)
```
#

it shows empty df

sterile heath
#

@crisp cargo What's the images list name and what's the labels list name?

#

['train_images', 'val_images', 'test_images', 'train_labels', 'val_labels', 'test_labels']

#

Which are pertinent?

pastel valley
#

yo anybody here have knowledge on eigenfaces implementation?

lapis sequoia
#

just a general question:
I am starting to read up on tensorflow and its implementations. I have a set of csv files with one relevant class variable that i want to predict.
I want to build a predictive model for said variable with tensorflow. It would be nice if you could give me some basic pointers to research, on how i would do such a thing :D
I have no experience with tensorflow outside of looking at some basic programs.

sterile heath
#

@crisp cargo So I've come up with two solutions. Both start like this.

```py
import numpy as np
with np.load("breastmnist.npz") as file:
    data = dict(file)
```
Now, from here, you can use Python's `zip`, which is handy. If you haven't used it before, you can iterate over two or more things at a time, which is great if you want to do pairwise things. e.g.
```py
zip(data["val_images"], data["val_labels"])
```
I did it with a list comprehension with an `if` in it, but you don't have to.

The second way you could grab the images you want is through use of `numpy.where` and `numpy.take`. You can use `data["val_labels"]` in `np.where`, then plug that (`[0]`), along with `data["val_images"]` into `np.take`  (`axis=0`), giving you your desired arrays.
#

!e

import numpy as np
things = np.array(["Apples", "Pears", "Grapes", "Oranges", "Bananas", "Lychees", "Watermelon"])
indexes = np.array([1,3,6])
print(np.take(things, indexes))

arctic wedgeBOT
#

@sterile heath :white_check_mark: Your eval job has completed with return code 0.

['Pears' 'Oranges' 'Watermelon']
sterile heath
#

!e

import numpy as np
things = np.array([False, True, False, True, False, False, True])
indexes = np.where(things)[0]
print(indexes)

arctic wedgeBOT
#

@sterile heath :white_check_mark: Your eval job has completed with return code 0.

[1 3 6]
sterile heath
#

This works on anything that's truthy within the array.
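
A boolean mask is a third option that skips `np.where` and `np.take`. This is a sketch on made-up arrays: the tiny `images` array and the `(N, 1)` label shape are assumptions, and `ravel()` also works unchanged if the labels are already one-dimensional.

```python
import numpy as np

# Hypothetical stand-ins: 6 tiny 2x2 "images" and one label per image.
images = np.arange(6 * 2 * 2).reshape(6, 2, 2)
labels = np.array([[1], [0], [1], [1], [0], [0]])

# Flatten the labels to shape (N,) and compare against the wanted class.
mask = labels.ravel() == 1

# Indexing with the mask keeps only the images whose label is 1.
ones = images[mask]

print(ones.shape)
```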

lone drum
#

hello i have a dataframe

#

my code this way

for i in unique_months:
    print('i=', i)
    if i == 'March':
        continue
    get_month_data = df2[df2['month']== i]
    # check if last thursday is present or not otherwise take wednesday data
    get_data = get_month_data[get_month_data['week_day']== 'Thursday']
    # if thursday is not present then get wednesday data
    get_expiry_data = get_data['nf_date'].iloc[-1]
    get_expiry_date = get_data['nf_date'].iloc[-1]
    print('expiry=', get_expiry_date)
    month_df = month_df.append(get_month_data)
    month_df['expiry'] = get_expiry_date
    break

#

when i use break statement then code works fine when i remove break statment then i get last value to month_df['expiry'] column

#

my df using break statment please check expiry column

#

when i remove break statment then i get this way please check expiry column

#

this value is appended to all rows

loud cave
#

I don't really understand what the code snippet is trying to do

loud cave
#

Like what's the intent of that for loop?

#

I can't interpret what's correct and incorrect

lone drum
#

then i am getting the last date as expiry and i am appending it to expiry column

loud cave
#

Where is the month_df dataframe defined?

lone drum
#
```py
month_df = pd.DataFrame()
for i in unique_months:
    print('i=', i)
    if i == 'March':
        continue
    get_month_data = df2[df2['month']== i]
    # check if last thursday is present or not otherwise take wednesday data
    get_data = get_month_data[get_month_data['week_day']== 'Thursday']
    # if thursday is not present then get wednesday data
    get_expiry_data = get_data['nf_date'].iloc[-1]
    get_expiry_date = get_data['nf_date'].iloc[-1]
    print('expiry=', get_expiry_date)
    month_df = month_df.append(get_month_data)
    month_df['expiry'] = get_expiry_date
```
#

do u get my point ? @loud cave

loud cave
#

So it's an empty dataframe and you're filling it in that loop?

loud cave
#

It's usually better to use map or apply with dataframes

#

instead of loops
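
A sketch of the loop-free version for the expiry question. In the loop, `month_df['expiry'] = get_expiry_date` assigns one value to the whole column, so every pass overwrites the months that were already filled in. That is why all rows end up with the last expiry once `break` is removed. The column names below come from the posted code, while the sample dates are invented. Separately, `DataFrame.append` was removed in pandas 2.0, so avoiding it is useful in any case.

```python
import pandas as pd

# Hypothetical sample: a few trading days in two months.
df2 = pd.DataFrame({
    "nf_date": pd.to_datetime([
        "2021-01-21", "2021-01-27", "2021-01-28",
        "2021-02-18", "2021-02-24", "2021-02-25",
    ]),
})
df2["month"] = df2["nf_date"].dt.month_name()
df2["week_day"] = df2["nf_date"].dt.day_name()

# Latest Thursday in each month, as a Series indexed by month name.
thursdays = df2[df2["week_day"] == "Thursday"]
expiry_by_month = thursdays.groupby("month")["nf_date"].max()

# Give every row the expiry that belongs to its own month.
df2["expiry"] = df2["month"].map(expiry_by_month)

print(df2)
```

A month with no Thursday rows would get NaT here, so the Wednesday fallback mentioned in the comments would still need to be handled separately.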

lone drum
#

@loud cave do u get my point by the way what i am trying to do ?

thin path
#

Anyone experienced with Nifi that could give me a hand?

lone drum
warped rapids
#

Hey! Does any of you have a tutorial on AI simulations in Python?

gray tartan
#

Hello guys !
I'm exploring jupyterlite, and i was wondering if it's possible to access the DOM from a DedicatedWorkerGlobalScope (which is what we have access to with pyodide from inside a notebook) ?

#

cause i would like to communicate with my web interface

thin path
lone drum
#

so i get month with its respective expiry value in dataframe

#

ping me when replying

rigid zodiac
#

Hi everyone, i'm trying to create a plot that look like this. But i dont know what is the name of it in order to google it. Anyone knows what this type of plot is called?

tidal bough
arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @placid horizon until <t:1639152401:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

boreal summit
#

Hello everyone. Please I need help creating a TFrecord file out of this raw data below. Thanks.

#

raw_data = [
{'x': 1.20},
{'x': 2.99},
{'x': 100.00}
]

#

I have tried but not getting it.

rigid dawn
#

How to use annotations in px.choropleth ?

bold timber
#

I want to plot a numpy array but i got an error like this. How to fix it?

serene scaffold
#

!code

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

serene scaffold
#

It looks like you're dealing with DataFrames, which are not quite the same as arrays.

serene scaffold
serene scaffold
wicked grove
# serene scaffold I cannot help. Good luck!

Heyy, i have a question
I'm trying to do a multiclass classification, there are 3 classes: normal, glaucoma and dr
I have 4 image datasets for glaucoma. To preprocess these should i combine them and put them in a single folder?

#

If so, how do i go about it . Please guide me ,thank youu!

serene scaffold
upbeat dove
#

I'm assuming you are trying to make a model right

wicked grove
#

no no, i just want to preprocess the training data

wicked grove
#

a cnn model

upbeat dove
#

What library are you using

wicked grove
#

im planning on using tensorflow

upbeat dove
#

If its keras then you have to split your data into 2 arrays, the features and the labels since when you train the model, those are the inputs

#

Yeah keras is like a submodule of tensorflow

upbeat dove
wicked grove
upbeat dove
#

Oh wait

wicked grove
upbeat dove
#

I misunderstood

#

You mean the images right

#

Preprocessing the images?

wicked grove
wicked grove
wicked grove
upbeat dove
#

I forgot but either numpy or tensorflow has a helper function to reduce an image to a certain ratio

wicked grove
#

I am very confused cause i have 3 image datasets w labels and idk how to combine

upbeat dove
#

Are these color images?

#

Like pngs

#

Or are they in some kind of numerical data

wicked grove
#

No no colour images

#

Jpgs

#

I will grayscale, resize and augment

upbeat dove
#

Ok well I dont know any good ways to get jpg images to numerical data like a neural network needs

#

Maybe someone else can help with that

wicked grove
#

I can convert it into a numpy array

upbeat dove
#

Oh ok

#

Sorry im like on a bus rn 😅

wicked grove
#

Ohh alrightt🙈🙈

#

That's okayy

upbeat dove
#

So ima go

wicked grove
#

I just wanna know that if i have 3 kaggle datasets how should i merge it

wicked grove
serene scaffold
wicked grove
#

I have got 3 kaggle image datasets

#

Should i combine them,put them in a single folder and do it?

serene scaffold
wicked grove
#

resize,grayscale and augment them

serene scaffold
wicked grove
serene scaffold
wicked grove
serene scaffold
#

and the answer is no. arrays have to be "rectangular"

serene scaffold
# wicked grove ohh

any direction you slice an array, the shape of each slice has to be the same

#

I'm not sure how this limitation is overcome in image processing because I have never done it.
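
One common way around this in image work, as far as I know, is to resize (or crop/pad) every image to the same height and width before stacking them. A minimal NumPy-only sketch: `resize_nearest` is a hypothetical helper written here for illustration, and the two "images" are random arrays, not real data.

```py
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Naive nearest-neighbour resize of a 2-D (grayscale) image array."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows[:, None], cols]

# Two fake grayscale "images" with different sizes
a = np.random.random((120, 80))
b = np.random.random((64, 64))

try:
    np.stack([a, b])  # ragged shapes cannot form one rectangular array
except ValueError as err:
    print("stack failed:", err)

# After resizing to a common size, stacking works
batch = np.stack([resize_nearest(a, 32, 32), resize_nearest(b, 32, 32)])
print(batch.shape)
```

In a real pipeline you would more likely use a library resize (for example `tf.image.resize` or Pillow's `Image.resize`) than a hand-written one.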

wicked grove
#

ah alrightt!thank you soo much, i will google a bit more on this

serene scaffold
#

I fear I was not particularly helpful. Good luck!

wicked grove
upbeat dove
#

For a keras dense layer, what would be the best bias_initalizer?

#

The default is zero but I want it to start off with some amounts of bias

night sorrel
#

Is there anyone who can help with learning the first few simple steps to make an ai voice assistant, if there is anyone please ping :)

upbeat dove
#

I'd say learn LSTM

#

Since you would need one for an ai voice assisstant

lapis sequoia
#

Hey, my tensor flow model is giving me the following errors when i try to fit the model:
model.fit(training_vals, training_labels, epochs=10)
ValueError: logits and labels must have the same shape, received ((None, 2) vs (None, 1)).
training_vals is shaped like this: (191164, 34)
training_labels like this: (191164,)
what might be causing this issue?

serene scaffold
lapis sequoia
#

do i just training_labels.reshape(191164,1) to do that?

serene scaffold
lapis sequoia
#

o7, will try

serene scaffold
#

I'm a bit confused as to why the shapes in the error message are given as (None, x)

#

idk what None even means in that context. When you take slices of an array, None is used to insert a new axis. So training_labels.reshape(191164,1) should have the same effect as training_labels[:, None]
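
For context: in Keras shape messages such as (None, 2), None stands for the batch dimension, whose size is not fixed in advance. The NumPy use of None as a new axis is a separate thing, and it can be checked with a small sketch (the array below is made up):

```py
import numpy as np

labels = np.arange(5)            # shape (5,)
a = labels.reshape(5, 1)         # explicit reshape
b = labels[:, None]              # None inserts a new axis of length 1
c = labels.reshape(-1, 1)        # -1 lets NumPy infer the length

print(a.shape, b.shape, c.shape)
print(np.array_equal(a, b) and np.array_equal(b, c))
```

Using `reshape(-1, 1)` avoids hard-coding the row count (191164 in the message above).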

lapis sequoia
#

new output when i print the shapes:
(191164, 34)
(191164, 1)

ValueError: logits and labels must have the same shape, received ((None, 2) vs (None, 1)).

#

error is still there

serene scaffold
#

it returns a new array/tensor

lapis sequoia
#

yes

serene scaffold
#

without seeing the exact code, I don't know why you're still getting the error.

lapis sequoia
#

training_labels = training_labels.reshape(191164,1)

serene scaffold
lapis sequoia
#

this is how i construct the training data:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(anlage1.drop(labels=['Fehlernummer'], axis=1), anlage1['Fehlernummer'], test_size=1/3, random_state=0)
training_vals = np.array(x_train, dtype="float32")
training_labels = np.array(y_train.values, dtype="uint8")
print(training_vals.shape)

training_labels = training_labels.reshape(191164,1)
print(training_labels.shape)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(34,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(34,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(34,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(2, activation=tf.nn.sigmoid))

model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy, 
                  metrics=[keras.metrics.FalseNegatives(name="fn"),
                  keras.metrics.FalsePositives(name="fp"),
                  keras.metrics.TrueNegatives(name="tn"),
                  keras.metrics.TruePositives(name="tp"),
                  keras.metrics.Precision(name="precision"),
                  keras.metrics.Recall(name="recall")])

model.fit(training_vals, training_labels, epochs=10)#, class_weight={0:0.06178233806436323645, 1:0.93821766193563676355})


#

my data comes from a pandas dataframe that reads from a csv file

#

so anlage1 is a pandas dataframe

serene scaffold
#

did you confirm that all the values in the DataFrame are numeric?

lapis sequoia
#

yes

#

i make them numeric after reading from the csv

#

i dropped the datetime column too, after reading from the csv

#
for cols in anlage1.columns:
  if cols != 'Date':  # compare strings with !=, not "is not"
    anlage1[cols] = pd.to_numeric(anlage1[cols])
upbeat dove
lapis sequoia
#

should make them numeric, right?

upbeat dove
#

in terms of shapes right

lapis sequoia
#

hmmmmmmmm

serene scaffold
#

sounds like training_vals and training_labels isn't the source of the problem

lapis sequoia
#

if it is saying it wants (none, 1)?

upbeat dove
#

If its asking for (None, 1) then ye

lapis sequoia
#

but... why?

upbeat dove
#

¯_(ツ)_/¯

lapis sequoia
#

lol

upbeat dove
#

after hours of debugging you dont even question the solution

lapis sequoia
#

what then, is the solution? or phrased differently, what helped you?

#

:D

upbeat dove
#

I just did .reshape(1, 64) when I used .predict() (in my case it wanted (None, 64))

#

I'm assuming you can do something similar with your dataset?

#

Not 100% sure

lapis sequoia
#

I am just using model.fit not predict.

#

i did reshape my array, but that did not help.

#

OOOO

#

could it be that it wants 2, because my last dense layer goes down from dense(34, activation=tf.nn.relu) to a Dense(2, activation=tf.nn.sigmoid)?
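
That guess matches the error: the last layer emits two values per row while the labels carry one. Two ways out are sketched below with NumPy only (no Keras call is run here, and the label values are made up). The usual choice for 0/1 labels with binary cross-entropy is a single sigmoid output, i.e. `Dense(1, ...)`. The alternative is to keep `Dense(2)` and one-hot encode the labels.

```py
import numpy as np

labels = np.array([0, 1, 1, 0], dtype="uint8")   # shape (4,), like training_labels

# Option A: keep one label per row and use a single output unit, i.e. Dense(1, sigmoid)
labels_a = labels.reshape(-1, 1)                  # shape (4, 1)

# Option B: keep Dense(2) and one-hot encode the labels so they are (N, 2)
labels_b = np.eye(2, dtype="float32")[labels]     # shape (4, 2)

print(labels_a.shape, labels_b.shape)
print(labels_b)
```

Either way, the second dimension of the labels has to match the number of units in the final layer.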

quiet vault
#

So I am trying to connect to my deep learning vm by using ssh port forwarding

#

ERROR: (gcloud.compute.ssh) argument [USER@]INSTANCE: Must be specified.

#

I get this error. How do I fix it

serene scaffold
quiet vault
#

right

bronze skiff
#

the error sounds pretty self explanatory

#

just supply the username and instance for your instance

lapis sequoia
#

i want to pick a discrete int from [0, 1, 2... 1000] randomly, but i want the 0, 1, 2.. end to be much more likely than the ..., 999, 1000 end
what statistic/probability function is this?

velvet thorn
velvet thorn
#

like it depends on how unlikely you want, say, 700 to be

#

relative to 1

#

but for simplicity

dull bobcat
#

Hey I’m a pretty casual python programmer but I want to get into some data science. But i don’t really know where to start, does anyone know a good starting point?

velvet thorn
#

you could probably just use a folded normal distribution
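
A minimal sketch of the folded-normal idea using only the standard library. The function name `sample_low_biased` and the value `sigma=200.0` are assumptions chosen for illustration; a smaller sigma pushes more of the mass towards 0.

```py
import random

def sample_low_biased(max_value=1000, sigma=200.0, rng=random):
    """Draw an int in [0, max_value], with small values far more likely."""
    while True:
        value = int(abs(rng.gauss(0.0, sigma)))  # folded normal, truncated to int
        if value <= max_value:                   # reject the rare overshoot
            return value

random.seed(0)
draws = [sample_low_biased() for _ in range(10_000)]
print(min(draws), max(draws))
print(sum(d < 200 for d in draws) / len(draws))  # share of draws below one sigma
```

If you want full control over how unlikely each value is, `random.choices(range(1001), weights=...)` with explicit weights is another option.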

lapis sequoia
candid spear
#

Guys i have a question
So i created a liner regression model
And from that i found the rmse value for the training dataset which is 0.32 and on testing its 152.182
Is this correct or am i doing something wrong here

upbeat dove
#

Wondering if this is a good approach to machine learning with chess. Bot plays against itself. If it wins with white, it associates every position after it moved with absolutely winning for white (playing as white and black too) and with black, it will associate every position after it moves with absolutely winning for black, if it ties, it associates every position after it moved as a tie.

upbeat dove
rose pasture
#

Hi guys I just started learning programming and I have a question. Just out of curiosity is it possible to train an AI model through user inputs instead of letting it run until it is successful? For example training an autonomous driving agent on a race track through reinforcement learning by letting it learn by itself, can we train the model by ourselves instead? We would drive the car for a few laps and then let the model learn from the laps we've done. Is that possible?

rose pasture
#

Learn to finish a lap by itself in a game perspective

serene scaffold
rose pasture
#

A car driving through the exact same route with no changes and no other players at all.

serene scaffold
#

Accelerate this much for this long, brake this much for this long, turn at this angle, etc...

#

Now, if the car had to learn how to respond to events that happen randomly, and it did that based on data that you create by playing, that's another story

#

But that's just creating training data. It's not "training the model by ourselves" as some thing that's separate from supervised learning in general.

rose pasture
serene scaffold
#

And you'd have to have some way of representing what those keystrokes were in response to.

#

because the point of the model is to learn what the right sequence of keystrokes is given the objective of the game and events that randomly occur during the game.

#

disclosure, I have never made a model that plays a video game, though a past coworker did.

rose pasture
#

That's pretty interesting! I learned something new today thanks to you!

#

I will have to work my way to that point myself lol i just started programming about a month ago

serene scaffold
rose pasture
serene scaffold
rose pasture
serene scaffold
#

but in general, a computational linguist might work on machine translation, extracting data from text, or speech recognition and speech synthesis.

rose pasture
#

That sounds awesome!!

#

I am still new and I don't know which career path I want to learn yet

#

I recently bought a Udemy course on data science just to see if I would like that subject

serene scaffold
#

in a lot of ways, it's just using statistics to leverage patterns that exist in data

#

but then, what else would it be?

rose pasture
serene scaffold
#

isn't most stock trading these days bots interacting with each other?

rose pasture
#

You are right

serene scaffold
#

the company I own the most stock in recently saw some of its employees unionize for the first time. guess I should see how that's doing.

#

it's down four dollars from the last time I checked. I'm ruined.

rose pasture
#

ouch, do you still believe in it long term though?

#

The stock market has been very volatile the past few weeks

serene scaffold
#

The only reason I have that stock is because they gave it to me when I worked for them. I'd be fine with it becoming worthless if subsequent generations of employees get non-shit working conditions.
But now I'm off-topic.

rose pasture
#

Damn that's bad

#

Hey I just finished learning python through a Udemy course, would you recommend me practicing what I've learn through a project or coding challenges from websites like codewars?

serene scaffold
rose pasture
austere swift
#

the point of doing projects is to learn more, not to just have everything memorized

#

what I started with was taking some code I found online and modifying it in different ways to improve/adapt it for my use, that can help you gain a good understanding of how it works without having to start from scratch

rose pasture
#

Is it normal to always google and find whatever I need to solve the coding problems i try to solve?

austere swift
#

you shouldn't be blindly copy-pasting code from the internet without knowing what it's doing, but its normal to have to look up documentations and error messages

rose pasture
#

Of course. When I tried doing coding challenges this past week, I find myself always looking on google how to code the ideas/solution that I have in my head

austere swift
#

yeah thats perfectly normal

#

programming is more about knowing how to solve problems rather than memorizing documentation

rose pasture
#

I see problem solving skill is important

#

the only way to improve it is to continue to code everyday? lol

austere swift
#

yeah pretty much

hollow sentinel
#

what is one versus one

#

and one versus all

#

there are multiple class classifiers such as naive bayes and random forest...

#

but you can force sci kit learn to use one versus one or one versus all w binary classifiers?

#

says in the book OVA's preferred

odd meteor
# hollow sentinel and one versus all

There are two popular approaches to multiclass classifier.

  1. One vs. Rest
  2. Multinomial/Softmax

One vs. Rest used to be the default behaviour of scikit-learn's LogisticRegression. Since version 0.22 the default is multi_class='auto', which picks the multinomial loss for solvers such as lbfgs.

Modifying your loss function so that a single model handles all classes at once is known as the Multinomial Strategy. This is also called Multinomial Logistic Regression or Softmax regression, and it is trained with the Cross-Entropy Loss.

Like you rightly mentioned, Naive Bayes handles multiclass problems natively. The choice between MultinomialNB and GaussianNB depends on the features rather than on the number of classes: MultinomialNB suits count-like features (e.g. word counts), while GaussianNB suits continuous features.

Comparing Both Strategies

So let me briefly mention the distinction between One vs. Rest and Multinomial strategy

  1. Overall Modus Operandi

a) One Vs. Rest: Fit a binary classifier for each class
b) Multinomial: Fit a single classifier for all classes

  2. How It Handles Prediction Behind the Scenes

a) One Vs. Rest: Predict with all but take the largest output
b) Multinomial: Prediction directly outputs the best class

  3. Their Drawbacks and Edge

a) One Vs. Rest: simpler, modular, but not directly optimising accuracy. Not really a standard approach in Neural Network

b) Multinomial: more complicated but tackles accuracy problem directly. The standard approach in Neural Network.

Answering your 2nd Ques.

Yes, we can force the classifier to either use One vs. Rest or Multinomial strategy in Scikit-learn. Let me use a linear-based model to illustrate how it works...

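from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

lr_ovr = OneVsRestClassifier(LogisticRegression())
lr_ovr.fit(X, y)  # One vs. Rest: one binary classifier per class

lr_mn = LogisticRegression(solver='lbfgs')
lr_mn.fit(X, y)  # Softmax/Multinomial: what lbfgs uses for multiclass in scikit-learn >= 0.22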

I hope you understand it now? 😊
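
Since the original question also asked about one versus one, here is a small sketch comparing the two wrapper strategies in scikit-learn. The data is synthetic (made up with `make_classification`), and the only point is how many binary classifiers each strategy trains: one per class for One vs. Rest, one per pair of classes for One vs. One.

```py
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic 4-class problem (made-up data, purely for illustration)
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=4, random_state=0)

base = LogisticRegression(max_iter=1000)
ovr = OneVsRestClassifier(base).fit(X, y)   # one classifier per class
ovo = OneVsOneClassifier(base).fit(X, y)    # one classifier per pair of classes

print(len(ovr.estimators_), len(ovo.estimators_))
```

With k classes that is k classifiers versus k(k-1)/2, which is one reason One vs. Rest is often preferred when k is large.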

lapis sequoia
#

hey guys! how do I get started with data-science

#

I'm advanced in python so idm any more challenging to understand resources

untold tundra
lapis sequoia
#

data structures in 8 hours?

#

ty

wicked grove
#

Or will that give a low accuracy

lunar star
#
  1. Depends what "low" means
  2. Depends on the data

600 images can be enough if you have a pre-trained model that you finetune on 3 classes.

odd meteor
wicked grove
wicked grove
wicked grove
odd meteor
# wicked grove The prob is i dont have more data, it's about 600 after data augmentation

Make do with what you have.

80% accuracy in my opinion isn't that bad (depending on the severity of what you're trying to do). If you ask some MLOps guys they'll be glad to inform you that models with 69% - 72% accuracy have once made it to production 😂.

If 600 images is capable of yielding 0.80 then what can such model be able to do when fed with, say, 3000 images? 🤔 I'll be curious to find out.

late mason
#

Hi, I'm working on AutoML project and want to launch process for automated model training. I am interested in how can I monitor the process for resource usage or failures with kinda ready-made APIs.

arctic crown
#

@serene scaffold you there?

#

sorry for the ping

#

i just have a very quick question

serene scaffold
arctic crown
#

oh sorry

#

ok ill put them all in one message just tell me if i am right or wrong

#

Scalars - just 1 value
Vector - a list and has more than 1 value
Tensors - an N-dimensional array. When N = 1, that tensor is a vector; when N = 2, that tensor is a matrix. [0, 1] is a vector, [[0, 1], [1, 2]] is a matrix

upbeat dove
#

I see one error in this (I think)

#

Vectors can only have one value

#

Or even none (maybe? this one I'm not sure)

arctic crown
#

what do you think? @serene scaffold

serene scaffold
#

@arctic crown @upbeat dove a scalar is a stand-alone number, yes. And then a vector, generally speaking, is a one-dimensional array. A (n,1) shaped array is a column vector because, despite having two dimensions, it's basically just vector in a column

#

And then (1, n) is a row vector.

arctic crown
#

what do you mean by a one-dimensional array

serene scaffold
#

Any array where the shape is all 1s except for one n is a vector in a certain space. I'm iffy on the terminology there

#

@arctic crown when the shape is (n,)

#

[1 2 3 4] has a shape of (4,)

#

[[1 2 3 4]] has a shape of (1, 4)

arctic crown
#

im sorry i am asking so many questions

#

but whats a shape?

#

im just getting started on this

serene scaffold
#

Uhh. I'll just make a random array and you can figure out what the shape is intuitively

arctic crown
#

ok

serene scaffold
#

!e import numpy as np; print(np.random.random((2, 4, 3)))

arctic wedgeBOT
#

@serene scaffold :white_check_mark: Your eval job has completed with return code 0.

001 | [[[0.28129572 0.33512977 0.93151513]
002 |   [0.66246171 0.14259656 0.77884458]
003 |   [0.1443945  0.72199244 0.43976486]
004 |   [0.6581581  0.27763684 0.14058968]]
005 | 
006 |  [[0.29535783 0.08170229 0.85562144]
007 |   [0.7064785  0.23825289 0.21846468]
008 |   [0.65787749 0.91715884 0.45522423]
009 |   [0.01521974 0.79929807 0.23513918]]]
serene scaffold
#

This is a three-dimensional array where the shape is (2, 4, 3)

arctic crown
#

im sorry what?

#

but i dont see the (2,4,3) anywhere

serene scaffold
#

There's two groups with four rows, three columns @arctic crown

arctic crown
serene scaffold
arctic crown
arctic crown
odd meteor
# arctic crown but i dont see the (2,4,3) anywhere

If you look at it clearly you'll see it 😊

This might seem a bit unusual 'cos it has a depth of 3. That is to say, it has 3 dimensions.

This was how I got to understand it clearly when I started learning. Maybe you'll find it useful or useless... Who knows? 🤷🏾‍♂️

  1. How many square brackets does the array have at the beginning?

Your answer would be 3, I presume. Which is correct. This means the array is a 3 dimensional array

  2. How many homogeneous groups does it have?

Answer = 2

  3. How many rows does each homogeneous group have?

Answer = 4

  4. How many columns does each homogeneous group have?

Answer = 3

Now, you'd say the array has a shape of (2 by 4 by 3)

No.of groups, Row, Column = 2 x 4 x 3
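
The same counting can be checked directly in NumPy. A minimal sketch with an all-zero array of that shape (the values do not matter here):

```py
import numpy as np

arr = np.zeros((2, 4, 3))
print(arr.ndim)         # number of axes
print(arr.shape)        # (groups, rows, columns)
print(len(arr))         # number of groups
print(arr[0].shape)     # one group: rows x columns
print(arr.size)         # total number of elements
```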

arctic crown
#

but what do we do with the numbers?

serene scaffold
#

Whatever you want

odd meteor
arctic crown
#

ok now that i know what a shape is could you kindly explain what a vector is @serene scaffold

serene scaffold
#

I went over that earlier, though

arctic crown
#

oh right sorry

#

so a vector is just an array that has a shape
and an array is basically a list but it has a fixed length. So once declared you can't add or remove elements from it. right @serene scaffold ?
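
Roughly, yes for NumPy arrays, with one refinement: every array has a shape, and a vector is specifically the one-dimensional case. A small sketch of the fixed-size part (the values are made up):

```py
import numpy as np

arr = np.array([1, 2, 3])
bigger = np.append(arr, 4)   # builds and returns a new array

print(arr)       # original is unchanged
print(bigger)
arr[0] = 10      # changing an existing element in place is fine
print(arr)
```

So elements can be modified in place, but "adding" an element really means creating a new, larger array.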

#

sorry for the pings tho

serene scaffold
arctic crown
#

ok sorry

untold tundra
#

informally: sequence (n,); vector (n, 1) or (1, n); matrix (r, c); tensor (n, r, c)

#

however, mathematically, they are all tensors

a tensor is "an interface" (in a SEng sense), and any "mathematical object" with that interface counts

so ordinary numbers are tensors, as well as vectors, matrices, etc.

informally, we tend to say: sequence, vector, matrix, tensor

but all of those are tensors (and theyre all sequences)

calm thicket
#

what makes a (n,) different from (n,1)?

untold tundra
#

(n,1) dot (1, n) is different than (n,) * (n,)
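
A quick sketch of that difference, with made-up values and n = 3:

```py
import numpy as np

n = 3
col = np.arange(1, n + 1).reshape(n, 1)   # shape (3, 1)
row = np.arange(1, n + 1).reshape(1, n)   # shape (1, 3)
flat = np.arange(1, n + 1)                # shape (3,)

print((col @ row).shape)    # matrix product of (n, 1) and (1, n)
print((row @ col).shape)    # matrix product of (1, n) and (n, 1)
print((flat * flat).shape)  # element-wise product keeps (n,)
print((col * flat).shape)   # broadcasting (n, 1) with (n,)
```

The last line is the one that tends to surprise people: mixing (n, 1) with (n,) silently broadcasts to (n, n).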

wicked grove
untold tundra
#

(1, n) = a function which accepts a (m, 1) and "sum-products" its elements with n coefficients

(m, 1) = a point with m components in m directions

(m,) = a sequence of numbers which may or may not describe a point

#

a row vector (1, n), eg., f = [1, 2, 3] is essentially just an abbreviated way of saying, f(x) = 1 * x_1 + 2 * x_2 + 3 * x_3

a column vector (m, 1), e.g., c = [2, 4, 6] is just an abbreviated way of saying c = 2*[1, 0, 0] + 4*[0, 1, 0] + 6*[0, 0, 1]

arctic crown
#

what is n?

untold tundra
#

(1, n) just means the shape of some numpy array, where n is a positive integer

arctic crown
#

so n can be anythign?

#

like (1, 135123561262346)

untold tundra
#

yes

wicked grove
#
```py
plot_decision_boundary(lambda x: clf.predict(x), X, Y)
```
what is the point of the parameters X and Y in this line, I'm kinda confused
untold tundra
#

the function plot... takes three args: a prediction function, the dataset X, and the dataset Y

#

incidentally, it can also be written: plot_decision_boundary(clf.predict, X, Y) 

arctic crown
#

in ml is the dimension the same as a real-life dimension? like x,y,z

untold tundra
#

dimension = a component of a data point which can change independently of the other components

space has three dimensions: you can go up/down without going back/forwards or side to side

arctic crown
#

what is a data point in ml?

#

and an array is basically a list but it has a fixed length. So once declared you can't add or remove elements from it. right?

untold tundra
#

the complete information about a single observation, it will be dataset dependent

in general, think of it as a row in a table

#

!e ```py
import numpy as np

print("Sequence", np.array([1, 2, 1, 2]).shape)
print("Vector", np.array([[1, 2, 1, 2]]).shape)
print("Matrix", np.array([[1, 2], [1, 2]]).shape)
print("Tensor", np.array([[[1, 2], [1, 2]]]).shape)
```

arctic wedgeBOT
#

@untold tundra :white_check_mark: Your eval job has completed with return code 0.

001 | Sequence (4,)
002 | Vector (1, 4)
003 | Matrix (2, 2)
004 | Tensor (1, 2, 2)
wicked grove
#

Also i wanted to know if i can increase my dataset from 400 images to 800 images w data augmentation

untold tundra
#

!e ```py
import seaborn as sns
point = sns.load_dataset('tips').sample()
print("Dimensions", point.columns)
print("Data Point", point)
```

untold tundra
wicked grove
#

Alrightt, thanks a lott

arctic crown
#

please help, whats an algorithm

serene scaffold
arctic crown
#

ty

pine wolf
calm thicket
#

a procedure

serene scaffold
#

@pine wolf what do you mean?

pine wolf
pine wolf
serene scaffold
pine wolf
#

well, yes, but a set of instructions is probably good enough

#

i can give you, say, an algorithm for starting a forest fire, but it wouldn't be solving any problems

calm thicket
#

unless you're very cold

serene scaffold
pine wolf
#

but what if i give you the algorithm and i don't want a forest fire

#

unless "solves a problem" becomes so nebulous that it could mean anything, then ok, it solves a problem

serene scaffold
#

"a set of instructions" sounds good.

wide hull
#

Hi i'm new to Ai any sugestions?

wicked grove
#
```
assert(A2.shape == (1, X.shape[1]))

    cache = {"Z1": Z1,

AssertionError:
```
can anyone tell what error this is
serene scaffold
terse frigate
#

can you help me with this google colab error?

serene scaffold
#

It appears that it requires the shape of A2, which is probably an array, to be (1, n), where n is whatever the length of X is in its second dimension.
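
An AssertionError with no message just means the condition after `assert` was False. A small sketch of a shape that passes and one that fails: `W` and the sizes are hypothetical stand-ins, not the code from the question.

```py
import numpy as np

X = np.random.random((5, 7))        # hypothetical input: 5 features, 7 examples
W = np.random.random((1, 5))        # hypothetical weights of the last layer
A2 = W @ X                          # one output per example -> shape (1, 7)

assert A2.shape == (1, X.shape[1])  # passes

bad = A2.ravel()                    # shape (7,), no longer (1, 7)
try:
    assert bad.shape == (1, X.shape[1])
except AssertionError:
    print("AssertionError: got", bad.shape, "expected", (1, X.shape[1]))
```

Printing `A2.shape` and `X.shape` just before the failing assert is usually the quickest way to see which side is off.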

serene scaffold
terse frigate
#

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 11.17 GiB total capacity; 10.24 GiB already allocated; 5.81 MiB free; 10.56 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

#

i am having trouble training a model

#

😢

serene scaffold
terse frigate
#

on google colab

#

how do i manage the memroy?

#

memory

wide hull
#

Hi i'm new to Ai any sugestions?

serene scaffold
terse frigate
serene scaffold
wide hull
#

Thx

serene scaffold
terse frigate
serene scaffold
arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

#

Hey @terse frigate!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

terse frigate
#

whoops

main fox
serene scaffold
main fox
arctic crown
#

whats a rank?

boreal loom
#

Hello everyone, I have a dataframe that contains this Name_Number_Number_Value_String

#

What i want to do , is create a column in that dataframe that has Name_Number_Number and another that has Value_String

#

So basically split the string that exists in one column

#

in 2 different ones

#

Does anyone have an idea as to how to approach this
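
One possible approach with pandas string methods, as a sketch only: the column names `raw`, `name_part` and `value_part` and the sample values are made up, and it assumes every entry has exactly the five underscore-separated pieces described above.

```py
import pandas as pd

df = pd.DataFrame({"raw": ["Alpha_1_2_Value_Text", "Beta_10_20_Other_Str"]})

parts = df["raw"].str.split("_")                  # each cell becomes a list of pieces
df["name_part"] = parts.str[:3].str.join("_")     # Name_Number_Number
df["value_part"] = parts.str[3:].str.join("_")    # Value_String

print(df)
```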

upbeat dove
#

(for keras)
Will `model.fit([x, x2], [y, y2])` be equivalent to
```py
model.fit([x], [y])
model.fit([x2], [y2])
```

serene scaffold
#

@boreal loom I need to see the exact DataFrame you're talking about to understand the question. Please do print(df.head().to_dict('list')) and copy and paste the text into the chat. This is the only way I'll be able to help.

boreal loom
#

This is what I used and it worked

#

There prolly is a cleaner a way

#

@serene scaffold thanks for the help, I solved by google fu, much appreciated

serene scaffold
serene scaffold
boreal loom
#

pandas is giving me a hard time 😄

serene scaffold
serene scaffold
upbeat dove
#

No not yet

stuck swallow
#

Hello I want to generate some music from a few folk songs... is there an ai that is suited for this task?

hollow sentinel
#

so multilabel classification involves algorithms that can put multiple labels on instances?

#

like if there was a facial recognition algorithm

#

and we had labels alice, bob, and charlie

#

alice and bob are in a picture

#

it should output [1,1,0]?
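
That target encoding can be sketched in plain Python. The helper `to_multilabel` is a hypothetical name used for illustration; scikit-learn's `MultiLabelBinarizer` does the same job for whole datasets.

```py
labels = ["alice", "bob", "charlie"]

def to_multilabel(present, all_labels=labels):
    """Return one 0/1 flag per known label."""
    return [int(name in present) for name in all_labels]

print(to_multilabel({"alice", "bob"}))
```

Each position is its own yes/no decision, which is what separates multilabel from multiclass, where exactly one label is picked.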

serene scaffold
hollow sentinel
#

i don't quite understand what multioutput classification is

#

each label

#

is multiclass

#

am i wrong for saying this looks like y = mx + b

#

like y = theta0 + theta1(x1)

#

assuming you only had one feature

serene scaffold
untold tundra
#

there's nothing in this which indicates multi-class

#

it's just multiclass if $\hat{y}$ can take more than two values

hollow sentinel
#

no that was a linear regression

#

equation

untold tundra
#

sure

hollow sentinel
#

what's the difference between a col vector and a normal vector... a col vector is vertical and a normal vector is just like [1,2] on a graph?

untold tundra
#

a col vector is a normal vector

hollow sentinel
#

it's just a different representation?

untold tundra
#

or, do you mean "normal to a surface" ?

#

by "normal" do you mean "ordinary" ?

hollow sentinel
untold tundra
#

that's not helpful

#

"normal" is a technical word

hollow sentinel
#

ordinary?

untold tundra
#

ok, so a col vector is the "ordinary vector"

#

a row vector is the strange one, a row vector is a function

hollow sentinel
#

oh

untold tundra
#

!e ```py
import numpy as np

point = np.array([1, 2, 3]).reshape(3, 1)
fn = np.array([2, 4, 6]).reshape(1, 3)

print(fn @ point)
```

arctic wedgeBOT
#

@untold tundra :white_check_mark: Your eval job has completed with return code 0.

[[28]]
untold tundra
#

fn @ is a function which accepts point as an argument, and does 2 * 1 + 4 * 2 + 6 * 3

upbeat dove
#

Would a convolutional layer be useful for a chess neural net?

arctic crown
#

what does N-dimensional mean? are dimensions the same as real-world like x,y,z?

serene scaffold
#

our physical world has three spatial dimensions, but mathematically there's no limit to how many dimensions something can have.

lone grove
#

There IS. I don't know it, but I'm sure there is.

#

What.. I was talking in the past.

arctic crown
iron basalt
desert oar
# arctic crown whats the use of dimensions here?

there are some slightly different meanings of "dimension" in machine learning and scientific computing:

  1. the number of "axes" of an array. a vector is a 1-dimensional array, a matrix is a 2-dimensional array, a 3-tensor is a 3-dimensional array, and so on

  2. the dimension of a vector space, which technically is the size of the set of basis vectors but also corresponds to the number of elements in a vector; e.g. in the standard vector space defined on ℝ³, the dimension of the vector space is 3, and vectors have 3 coordinates

  3. the dimension of the column space of a matrix, i.e. the dimension of the image of the linear operator associated with that matrix. this is a very important concept in applied linear algebra, called the "rank" of a matrix.
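
The three meanings above can be told apart in NumPy. A small sketch with made-up arrays: the matrix is built so that its second column is twice the first.

```py
import numpy as np

tensor = np.zeros((2, 4, 3))
vector = np.array([1.0, 2.0, 3.0])
matrix = np.array([[1.0, 2.0],
                   [2.0, 4.0],
                   [3.0, 6.0]])

print(tensor.ndim)                     # meaning 1: number of axes
print(vector.shape[0])                 # meaning 2: number of coordinates
print(np.linalg.matrix_rank(matrix))   # meaning 3: dimension of the column space
```

Note that `vector` is 1-dimensional in the first sense and 3-dimensional in the second, which is exactly the clash of terminology.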

#

and yes, i second squiggle's recommendation that linear algebra is "required reading"

#

i think for the most part, in machine learning discussions people use the 1st meaning

#

the 3rd meaning usually is called "rank", and the 2nd meaning usually is confined to theoretical math discussion

#

one well-known place where the 2nd meaning appears: the "curse of dimensionality" and the "dimensionality" of an embedding

#

so actually i'm wrong... people use both terms, and it's up to the listener to understand their meaning from context

#

it can be very confusing. unfortunate clash of terminology

#

maybe it's easier in other languages

wicked grove
#

i am getting this error. i googled it and it says i should use the second column, but i can't understand it:
fpr, tpr, thresholds = metrics.roc_curve(y_test, y_predict_test_prob)
raise ValueError(
ValueError: y should be a 1d array, got an array of shape (80000, 2) instead.

#

i want to plot the roc-auc curve

desert oar
wicked grove
#

y_predict_test_prob = MNB.predict_proba(X_test) it has the predicted probabilities

#

MNB is logistic regression
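
"Use the second column" most likely refers to this: `predict_proba` returns one column per class, while `roc_curve` wants a single 1-D score per sample. A sketch on synthetic data (the dataset and variable names below are stand-ins, not the original ones):

```py
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Made-up binary data standing in for the real X / y
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)   # shape (n_samples, 2): one column per class
scores = proba[:, 1]                  # probability of the positive class, shape (n_samples,)

fpr, tpr, thresholds = roc_curve(y_test, scores)
print(proba.shape, scores.shape)
print(roc_auc_score(y_test, scores))
```

If the error persists after slicing, it is worth checking that `y_test` itself is 1-D and not one-hot encoded.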

arctic crown
desert oar
#

i kind of made up that term

#

I don't know if engineers use it but I wouldn't repeat it

#

sorry i meant to edit that out

errant path
sleek tapir
#

hey are they an ml courses for free online i could also pay as well

#

but must be well taught and have a lot of math (doesnt skip over)

#

and preferably in python

#

no andrew ng course because its bad no math

#

im a math cs major

arctic wedgeBOT
#

@eager ether Please don't try to ping @everyone or @here. Your message has been removed. If you believe this was a mistake, please let staff know!

#

failmail :ok_hand: applied mute to @eager ether until <t:1639295285:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

iron basalt
#

Could probably create a table with columns of tensors, linear algebra, and computer science, and rows of rank, and dimension. With each cell having a different definition.

#

(yes, i'm keeping tensors separate from lin alg because they are complicated)

tidal bough
#

could mean multiple things
Hmm, what other ones? (r,s)-tensor is an element of this:

#

are there other common ways to classify them?

iron basalt
quick kestrel
#

guys any article that explains how to make a neural net using numpy and python

#

i wanna make a chat bot using ml

iron basalt
#

Can also use the terms dyad, triad, tetrad, etc, for rank 1, 2, and 3 tensors, etc (a bit more uncommon).

tidal bough
#

you mean 2,3,4, right?

iron basalt
#

yeah my bad

#

Dyad might get confused with dyadic. Honestly, math terminology is kind of mess due to there just being so many different fields with overlapping naming. And in the case of ML, programmer terms too (and even ML made up terms which also have different meanings in different papers). It really all relies heavily on context.

inner swan
#

I am making a phishing link detector using python (machine learning) and I need some help. If anyone can help pls let me know, I'll give all the details of what help I need and you can refer to the base of the project from the link below. Thanks

I made and trained the machine learning algorithm with the dataset that I downloaded from uri .Now I want to make an algorithm that can check the provided link for the features that are present in the dataset and accordingly give the output.
https://www.activestate.com/blog/phishing-url-detection-with-python-and-ml/

This Python tutorial walks you through how to create a Phishing URL detector that can help you detect phishing attempts with 96% accuracy.

lapis sequoia
#

Hello, people. I'm very new to Python, having just completed an introductory course. I want to step into Data Science & ML but I don't know what I should read or start with in Python. So I have a question: if you were sent into the past to guide your younger self in ML, what would you recommend in Python, and from where?

fair nimbus
# lapis sequoia Hello, people. I'm very new to python having just completed an introductory cour...

I'm not a data scientist, though I've spent some time as a Data Engineer working with DS. I recommend https://github.com/fastai/nbdev as it will help you experiment in Jupyter labs / notebooks and package the result up into a Python package. The tutorials take you through GitHub or GitLab CI/CD (continuous integration, continuous delivery), which is good practice for software development.

lapis sequoia
lapis sequoia
fair nimbus
# lapis sequoia Also, I would appreciate if you can differentiate Data Engineer and Data Scienti...

> Can you please tell me what all I should read in python?

No, not really. I find the official Python / pandas docs great these days.

> if you can differentiate Data Engineer and Data Scientist.

Data engineering (DE) can be as simple as pulling data in from different sources however possible, using just Python or bash scripts or sometimes tools like Apache Airflow. A DS can make sense of that data once it is in a db or some other place where it can be worked with.

lapis sequoia
compact egret
#

i have a test dataset with shape (1600,10) and a train dataset with shape (6400,10)

is there a convenient way to construct a numpy array of shape (1600, 6400) containing the euclidean distances between my test and train data, other than using for loops?

wise pelican
#

For a numerical series of data, is the lowest variance from the 90th percentile pretty much giving you the same data compared to the highest variance from the 10th percentile?
A low variance in the 90th percentile would imply that the data remains close to the 90th percentile, and having a high variance for the 10th percentile would mean that the data is far from that 10th percentile point - AKA it would be closer to the 90th percentile

I figure it's not an exact 1 to 1 relationship between these things. Is there some algorithm or formula to figure out this type of relationship?

untold tundra
compact egret
#

yh this is the code i currently have, but it takes forever to run, so im wondering if there is a more efficient and quicker approach to this

#

this is what my lecturer did for the cosine distance, but i dont understand what he did for modtest and modtrain

compact egret
#

think that might be exactly what im looking for, ill have a read, thank you!
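
For reference, a minimal sketch of one loop-free way to build the (1600, 6400) distance matrix with plain NumPy. The arrays `test` and `train` are random stand-ins for the real data. The sketch relies on the expansion ||a - b||² = ||a||² + ||b||² - 2 a·b; the `modtest` / `modtrain` terms in the lecturer's cosine version may be row norms of the same kind, but that is only a guess since the code is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
test = rng.normal(size=(1600, 10))   # stand-in for the real test set
train = rng.normal(size=(6400, 10))  # stand-in for the real train set

# Squared row norms, shaped (1600, 1) and (1, 6400) so that they broadcast.
test_sq = np.sum(test ** 2, axis=1)[:, np.newaxis]
train_sq = np.sum(train ** 2, axis=1)[np.newaxis, :]

# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, evaluated for every pair at once.
sq_dists = test_sq + train_sq - 2.0 * (test @ train.T)

# Rounding error can push tiny values slightly below zero; clip before sqrt.
dists = np.sqrt(np.maximum(sq_dists, 0.0))
print(dists.shape)
```

If SciPy is available, `scipy.spatial.distance.cdist(test, train)` computes the same matrix in one call.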

lone drum
#
```
Traceback (most recent call last):
  File "F:\nifty_banknifty\nf_bnf_2.py", line 30, in <module>
    df2['one_year_dates'] = np.arange('2017-04', '2018-03', dtype='datetime64[D]')

  File "C:\Users\shubh\anaconda3\lib\site-packages\pandas\core\frame.py", line 3163, in __setitem__
    self._set_item(key, value)

  File "C:\Users\shubh\anaconda3\lib\site-packages\pandas\core\frame.py", line 3242, in _set_item
    value = self._sanitize_column(key, value)
  File "C:\Users\shubh\anaconda3\lib\site-packages\pandas\core\frame.py", line 3899, in _sanitize_column
    value = sanitize_index(value, self.index)

  File "C:\Users\shubh\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 751, in sanitize_index
    raise ValueError(
ValueError: Length of values (334) does not match length of index (46739)
```
ping me when u reply
i want to add array values as column in pandas dataframe
uncut stump
#

You'd have to loop over that range and assign whatever values you want to that column, but your dataframe has 46k rows, so every column needs 46k values (you can fill the rest with None)

lone drum
uncut stump
#

if that's what you want, you should pad your np.arange array

lone drum
#
```
a = np.arange('2017-04', '2018-03', dtype='datetime64[D]')
print('a\n', a)

for i in a:
    df2['date1'].append(i)
```
i am trying it this way
uncut stump
#

do you want the column indexes as dates? or you want dates as values to column 'one year dates'

surreal rune
#

hello there, we are writing a CASH/Auto-ML algorithm based on the BaseSearchCV class in scikit-learn. we have set up a timeout so that it stops running after a specific period of time; however, we are struggling to extract the best results found so far (we tried best_params_). does anyone have any guidance on that front?

lone drum
uncut stump
#
result = df['expiry']
#

? 😄 i don't understand what you meant by taking

#

do you want to put those dates (np.arange) into a column?

lone drum
#

see in last expiry column i want to put dates

lone drum
#

@uncut stump are you there? do you get my point?

pastel valley
#

yo, can anyone recommend interesting computer vision related research papers?
it has to be an existing paper; I'll need to create a presentation about it

uncut stump
uncut stump
lone drum
lone drum
uncut stump
#

there is 46k values in your dataframe right?

lone drum
uncut stump
#
>>> np.pad(np.array([1,2]), (0,46000), 'constant')
array([1, 2, 0, ..., 0, 0, 0])
#
>>> len(np.pad(np.array([1,2]), (0,46000), 'constant'))
46002
lone drum
#

now do i have to create a new column?

uncut stump
#

did you try with this padded array?
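
A minimal sketch of one way around the length mismatch, assuming the goal is to put the 334 dates into the first rows of the column and leave the rest empty. `df2` here is a stand-in frame with the same row count as in the traceback, and the `close` column is made up. For dates, `NaT` is a more natural filler than the zeros that numeric padding would produce.

```python
import numpy as np
import pandas as pd

# Stand-in for the real 46,739-row frame; only the row count matters here.
df2 = pd.DataFrame({"close": np.zeros(46739)})

dates = np.arange('2017-04', '2018-03', dtype='datetime64[D]')  # 334 days

# A Series is aligned on the index when assigned to a column, so rows
# without a matching label become NaT instead of raising ValueError.
df2['one_year_dates'] = pd.Series(dates, index=df2.index[:len(dates)])

print(df2['one_year_dates'].notna().sum())
```

Whether the dates belong in the first rows or should be matched against another column (such as the expiry column mentioned above) depends on the data, which is not shown here.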

sick wedge
#

anyone skilled with clustering, specifically DBSCAN and HDBSCAN?

#

I'm wondering how I can reduce the outliers in this HDBSCAN clustering plot

#

BASE TRUTH

#

the results are quite nice against the truth that I'm trying to replicate, but through a lot of experimentation I can't seem to pull the outliers in

#

increasing epsilon seems to have no effect as well, which is confusing me

#

I just have it set to 0

desert oar
#

messing with parameters for 100 years seems like a poor use of your "research energy"

sick wedge
#

I'm not replicating a result someone else had, the second image is just how the data is actually distributed, so anything close to that would be successful

#

so I'm on my own there unfortunately

desert oar
#

seems like a good use case for xgboost or a neural network with 1 hidden layer

#

or random forest, but the feature subsampling won't be that useful in that case

#

probably better off with xgboost than rf for just 2 features

#

oh wait this is pca

sick wedge
#

I found something else really weird too, which is that if I run the algorithm twice through, it kind of doubles down and classifies more? I'm not sure why this is happening, here's some example:

clusterer = hdbscan.HDBSCAN(
    min_samples=20, # for dist metric (mutual reachability metric)
    min_cluster_size=50,
    cluster_selection_epsilon=0.0
)
# fit data
clusterer.fit(df)

# add cluster to DF
cluster = clusterer.labels_
df['cluster'] = cluster

I run this through to get the first result I showed, but if I literally copy and paste the block so that it runs through twice, I get this result:

#

like you can see the original clusters in there untarnished, but loads of the outliers suddenly become a part of a new cluster

#

any idea why that's happening just out of curiosity? 😂

desert oar
#

interesting, iirc the algorithm is iterative, so it might be doing more iterations instead of starting over

#

scikit-learn has "warm start" functionality for iterative algorithms

sick wedge
#

it's weird though because I would've thought it would overwrite, since I'm writing the clusters to the dataframe with each iteration

desert oar
#

you can usually use sklearn.clone to get a "clean" copy of the estimator object

sick wedge
desert oar
#

don't do that

sick wedge
#

so that's it, how do you mean?

#

I'm really newbie to data science in general btw so feel free to explain like I'm 5

desert oar
#

if you want to learn to predict classes on new data

desert oar
sick wedge
#

ohhhh

#

so that's why
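
A likely explanation, consistent with the exchange above: after the first run `df` contains the new `cluster` column, so a second `clusterer.fit(df)` clusters on the labels as well as the original features. Below is a minimal sketch of one way to avoid that. It uses scikit-learn's `DBSCAN` purely as a stand-in for `hdbscan.HDBSCAN`, and the column names `pc1` / `pc2` and the data are made up.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN  # stand-in; the original used hdbscan.HDBSCAN

rng = np.random.default_rng(0)
# Two made-up blobs in hypothetical PCA columns.
points = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
df = pd.DataFrame(points, columns=["pc1", "pc2"])

feature_cols = ["pc1", "pc2"]  # fixed once, before any label column exists


def add_clusters(frame):
    # Fit only on the feature columns, never on previously written labels.
    clusterer = DBSCAN(eps=0.5, min_samples=5)
    frame["cluster"] = clusterer.fit_predict(frame[feature_cols])
    return frame


df = add_clusters(df)
first = df["cluster"].copy()
df = add_clusters(df)  # re-running the block now gives the same labels
print((first == df["cluster"]).all())
```

Selecting the features explicitly keeps the result independent of how many times the cell is executed.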

sick wedge
desert oar
sick wedge
#

oh so it's not clustering

#

this project is for uni and for some reason they want it to be unsupervised clustering methods only

#

but I saw you recommended lots of ways that probably might be superior

uncut stump
#

clustering is an unsupervised way of classification, as far as I know

heavy crow
#

im trying to train a time series forecasting model to predict 1 timestep into the future. The data comes from a pretty complex thermal system so i would think an LSTM is the way to go. I have quite a lot of data, 12 years @ 3s per data point. My question is: Any tips on how large the network should be (both number of layers and how many neurons per layer)? Is there a way to figure out if i really need all 12 years of data without trying it out?

#

I have quite a lot of compute power so that shouldnt be an issue

quiet vault
#

For hyperparameter tuning, you could try grid searching or Bayesian Optimization.

#

Here is an article on grid searching

heavy crow
#

hmm, i dont have THAT much compute, one epoch takes around 1.5h on 8 nvidia V100

quiet vault
#

Ohh yeah that won’t work then

heavy crow
#

i was hoping for some kind of rule of thumb that gives me a good starting point

quiet vault
#

I don’t really know then, sorry

plush oyster
#

Hello. It's Ali. Is there any Python image processing package/library that, given a photo of a piece of clothing (e.g. a shirt), can extract only the shirt from the image?

heavy crow
#

is there a correlation between the size of the layers and how far into the past it should take into consideration? currently im using 256 as the look-back size (~12 min) should that influence the model size?

reef dock
#

Hi, if i have a dataframe column that has values as

0  [31]
1  [24, 36]
2  [24]
3  [11, 24, 42]

and I wanted to return the column as

0  1
1  2
2  1
3  3

basically a count of elements per row

reef dock
#

What's the right way to go about it

serene scaffold
reef dock
#
df['xyz'] = df['xyz'].apply(len)
#

This way?

serene scaffold
reef dock
#

Oh cool, thanks! That seemed to do it.

serene scaffold
#

usually, using apply is a bad practice as it's not optimized with C, but it's the only option in this case because pandas has limited support for working with pure Python objects like lists.
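
A small sketch of the per-row count on the example data from above. Besides `apply(len)`, pandas' `.str.len()` accessor also accepts list elements, so it is shown as an alternative; the column name `xyz` is taken from the snippet above and the frame itself is rebuilt here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"xyz": [[31], [24, 36], [24], [11, 24, 42]]})

counts_apply = df["xyz"].apply(len)  # calls len() on each list
counts_str = df["xyz"].str.len()     # the .str accessor also handles lists

print(counts_apply.tolist())
```

Both give one count per row; which reads better is a matter of taste.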

reef dock
#

Fair enough, though I tend to forget all the things that I can do with .apply

serene scaffold
#

In other words, if you have a Series of numeric types (like ints or floats), you should pretty much never be using apply

reef dock
#

I'll keep that in mind. Thanks for helping out.

serene scaffold
#
In [12]: series
Out[12]:
0     0.097326
1     0.169674
2     0.348592
3     0.110183
4     0.479952
        ...
95    0.210811
96    0.216305
97    0.918834
98    0.281781
99    0.512556
Length: 100, dtype: float64

In [10]: %timeit series.apply(lambda x: x / series.sum())
2.19 ms ± 16 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [11]: %timeit series / series.sum()
63.7 µs ± 239 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I saw someone do the first one on linkedin and it made me so mad

reef dock
#

LinkedIn seems to do that usually, I don't blame you

#

I don't enjoy the way you need to portray yourself on linkedin

serene scaffold
supple quail
#

Hi everyone, I'm trying to scrape data from https://www.idealista.com/en/maps/madrid-madrid/ using beautifulSoup

This is my code :

import requests
from bs4 import BeautifulSoup
url= 'https://www.idealista.com/maps/madrid-madrid/'
res =requests.get(url)
headers = {
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'
}
soup = BeautifulSoup(res.text,'html.parser')
print(soup.prettify())

But code is not giving the whole html content of the page.
can anyone help me to extract the name of the streets from the page?

reef dock
#

Which is also why you see me here asking a lot of questions, just trying to learn/practice python

serene scaffold
supple quail
#

okay. sorry..

pale thunder
#

any suggestions for pdf manipulation libraries, pdfminer isn't working out super well for me.

arctic crown
#

Hey, i know i have asked this question many times and it has already been answered but this is the last time

#

what is a tensor

serene scaffold
# arctic crown what is a tensor

if you're asking about what a tensor is mathematically and what the word "tensor" has come to mean in Python, you will get slightly different answers.

mighty spoke
#

Hi does anyone know how to round a list to 3 significant figures in python?

serene scaffold
serene scaffold
#

so the question is, are you trying to round to three significant figures because you're doing actual scientific calculations, or is this something you've been asked to do to learn more about programming? (If it's the latter, that does not preclude you from getting help.)
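
For the rounding itself, a small sketch of one common approach that leans on the `g` format specifier. `round_sig` is a made-up helper name and the sample values are invented. The results are still binary floats, so this is display-level rounding and not rigorous significant-figure tracking.

```python
def round_sig(x, sig=3):
    """Round x to `sig` significant figures via the 'g' format specifier."""
    return float(f"{x:.{sig}g}")


values = [123456, 0.0012345, 98.765, 0]
rounded = [round_sig(v) for v in values]
print(rounded)
```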

arctic crown
#

can it be more than one-dimensional? like 27-dimensional array

serene scaffold
mighty spoke
arctic crown
#

what do the sequence of numbers mean?

serene scaffold
arctic crown
serene scaffold
serene scaffold
#

is there a specific one you want feedback on?

arctic crown
serene scaffold
arctic crown
desert oar
#

@arctic crown "tensor" is another word with multiple meanings. in programming (eg tensorflow) it is a multidimensional array. in linear algebra it is a "multilinear map", which can be represented as a multidimensional array

#

you can think of them as a generalization of matrices and linear transformations (which can be represented as matrices)
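
To make the "multidimensional array" meaning concrete, here is a minimal sketch using NumPy arrays as a stand-in for library tensors. The values are arbitrary; the point is only how the number of axes grows and how a matrix acts as a linear transformation.

```python
import numpy as np

scalar = np.array(3.0)            # 0 axes
vector = np.array([1.0, 2.0])     # 1 axis
matrix = np.array([[0.0, -1.0],
                   [1.0, 0.0]])   # 2 axes
cube = np.ones((2, 3, 4))         # 3 axes, every entry 1.0

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("cube", cube)]:
    print(name, arr.ndim, arr.shape)

# A matrix acting as a linear transformation: this one rotates by 90 degrees.
rotated = matrix @ vector
print(rotated)
```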

arctic crown
#

what does generalization mean?

desert oar
#

a "generalization" of a concept is a "more general" version of that concept. usually it means that some restriction is removed

serene scaffold
#

like, H2O is a generalization of ice and water

#

(not the best example--I'll think about it and report back)

arctic crown
#

ok

arctic crown
desert oar
#

a better example would be how multiple linear regression is a generalization of linear regression

#

y = mx + b is a "special case" of the more-general case of multiple linear regression
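
A short sketch of the "special case" idea, assuming ordinary least squares via `np.linalg.lstsq` and noiseless made-up data. The one-feature fit and the two-feature fit use the same code; the general version just has one more column.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)

# Simple case: y = m*x + b with a single feature.
y_simple = 2.0 * x1 + 1.0
A_simple = np.column_stack([x1, np.ones_like(x1)])
m, b = np.linalg.lstsq(A_simple, y_simple, rcond=None)[0]

# General case: y = m1*x1 + m2*x2 + b; same procedure, one more column.
y_multi = 2.0 * x1 - 3.0 * x2 + 1.0
A_multi = np.column_stack([x1, x2, np.ones_like(x1)])
m1, m2, b_multi = np.linalg.lstsq(A_multi, y_multi, rcond=None)[0]

print(m, b)
print(m1, m2, b_multi)
```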

desert oar
#

matrices and tensors represent transformations

#

i recommend studying MIT 18.06 linear algebra, the lectures are on youtube and the instructor is very good at helping you build intuition about the math

serene scaffold
#

Now it's finally my turn to ask a question (though unfortunately I have to be a bit vague). One of my projects at work involves learning the mapping between certain sets of sequences, and the length of the input sequence isn't always the same, and the length of the output sequence isn't always the length of the input sequence. I've heard that RNNs or LSTMs might be the right architecture to use, but in my reading, it isn't clear to me how those are ideal in circumstances where the lengths of the sequences are inconsistent.

desert oar
serene scaffold
#

like, the ones in "attention is all you need"?

desert oar
#

⚙️ It is time to explain how Transformers work. If you are looking for a simple explanation, you found the right video!

serene scaffold
#

RIP easy solution

desert oar
#

I don't think they're terribly difficult to implement from scratch, but I don't know how much data you need to train them effectively

#

my impression is that they are actually a lot more flexible and easier to deal with than LSTM

arctic crown
#

what does tf.ones mean?

serene scaffold
# arctic crown what does tf.ones mean?

It makes a tensor where every value is 1.0

We're happy to help, but I feel like you could answer a lot of these questions on your own. If you're just asking what a function does, this can be looked up in the documentation.

#

If you find the documentation confusing, we could also help you learn how to read docs, as this is a critical skill to develop.

iron basalt
desert oar
iron basalt
#

Sequences are a different beast than other types of input.

desert oar
#

oh i misread sorry

iron basalt
#

Variable lengths can matter more or less depending on the method used (how much variance). But in general it will make it more difficult.

desert oar
#

am i correct in that transformers are a better default choice than lstm or other rnns nowadays?

iron basalt
#

There is also the problem of trying to recognize objects given a sequence, but that sequence may come out of order as it often does IRL (even more difficult and LSTM will fail hard at this for example).

serene scaffold
#

just to check my understanding, if I'm building a transformer model, do I need to select a constant maximum input size and pad any input with enough zeros to reach that size?

desert oar
#

i am also kind of curious what kinds of sequences you are working with

iron basalt
#

Transformers are often the better option in deep learning, but not always. Also learning about RNNs makes their motivations and methods make more sense / gives context.

desert oar
#

afaik in text sequence modeling you use a special "padding" token
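
A minimal sketch of the padding idea, assuming integer token ids and a hypothetical id `0` reserved for the padding token. It pads each batch to its own longest sequence and returns a boolean mask so a model can ignore the padded positions; `pad_batch` is a made-up helper, not part of any library.

```python
import numpy as np

PAD = 0  # hypothetical id reserved for the padding token


def pad_batch(sequences, pad_id=PAD):
    """Right-pad integer token sequences to the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    batch = np.full((len(sequences), max_len), pad_id, dtype=np.int64)
    mask = np.zeros((len(sequences), max_len), dtype=bool)
    for row, seq in enumerate(sequences):
        batch[row, :len(seq)] = seq
        mask[row, :len(seq)] = True  # True marks real tokens
    return batch, mask


batch, mask = pad_batch([[5, 8, 2], [7], [4, 4, 9, 1]])
print(batch)
print(mask)
```

Padding per batch instead of to one global maximum is a design choice that keeps short batches small; a fixed maximum length works too if the model requires it.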

boreal loom
#

dt[f"rmean"] does anyone know what that f means?

elfin eagle
boreal loom
#

python

#

dt is a dataframe

#

dt[f"rmean_{lag}_{win}"]

#

This is the whole code
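
For context, the `f` prefix marks a formatted string literal (f-string): expressions inside the braces are evaluated and inserted into the string. A tiny sketch, with made-up values for `lag` and `win` since the surrounding loop is not shown:

```python
lag, win = 7, 28  # hypothetical values; the original loop is not shown

column = f"rmean_{lag}_{win}"  # expressions in braces are evaluated and inserted
plain = "rmean_{lag}_{win}"    # without the f prefix the braces stay literal

print(column)
print(plain)
```

So `dt[f"rmean_{lag}_{win}"]` selects (or creates) a column whose name is built from the current values of `lag` and `win`.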

elfin eagle
boreal loom
#

Oh thanks

#

You are awesome

elfin eagle
#

btw go to regular help for these type of questions

boreal loom
#

Yeah you are right

#

I thought it might be related to pd

elfin eagle
#

I wanted to do some research in data-science and this channel has just become general help

boreal loom
#

Thats why i posted here

elfin eagle
#

yeh, this is more like advanced

#

in theory

#

but dont worry

#

gl coding

boreal loom
#

I had no idea it was basic python, I am sorry, otherwise I would not have asked here, my bad

upbeat dove
#

Actually different question, is it better to have a nn with increasing # of neurons per layer, decreasing or identical

boreal loom
#

Depends on the problem, the nn

loud cave
#

I'd suggest trying several combinations of each if you're able

boreal loom
#

There is not a pattern per se from what i have seen

#

This is mostly empirical unless you are hyperaware of what you are doing

loud cave
#

You can use a library like Optuna to run a search over those parameters

#

Assuming that your model trains in a reasonable amount of time. If it takes hours/days it may not be feasible to do that type of experimentation

sick burrow
#

I'm trying to strip spaces out of a column, but I'm still finding entries with spaces. Even when targeting a specific string it doesn't appear to work. What might be happening here?

#
df1.iloc[1293].ABN.strip()
'14 011 062 338'
loud cave
#

Doesn't strip just remove leading/trailing whitespace and not all spaces in a string? You should use .replace(" ", "")
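
A small sketch of the difference, using the ABN value from above with extra outer spaces added for illustration. The pandas line assumes the column holds strings; the second value in the Series is made up.

```python
import pandas as pd

abn = " 14 011 062 338 "

stripped = abn.strip()           # only leading/trailing whitespace is removed
no_spaces = abn.replace(" ", "")  # every space is removed

# Column-wise equivalent on a Series of strings.
s = pd.Series([" 14 011 062 338 ", "51 824 753 556"])
cleaned = s.str.replace(" ", "", regex=False)

print(repr(stripped), repr(no_spaces))
print(cleaned.tolist())
```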

sick burrow
#

hmm.. it looks like that's the case.. I must have misread the description.. thanks!

upbeat dove
#

Why?

boreal loom
#

I cant give you an answer, like most people here, since its a very specific problem. You should try by finding the number of layers you want to use and then why. Then try to find how many neurons you need to do a specific task, like what does this group of neurons accomplish etc etc. That's why i said its a very specific problem and you can only find out by trying different things out.

upbeat dove
#

Alright then

boreal loom
#

For example in a CNN you would have layers where they accomplish certain things, so you could try and gauge from that, but i am not familiar with what type of nn you are using and as to how it operates. Either read more about it, or just try and repeat

#

If you really want to understand, i would suggest you find some good papers about NNs and see how the teams behind them arrived at certain choices such as the number of layers etc. That might give you some insight

upbeat dove
#

Well I'm using one convolutional layer and a couple of dense layers but thanks for the advice

#

Hoping it would find patterns like pins, forks etc, then using the dense layers to evaluate those

boreal loom
#

You could also try and see how similar teams have achieved similar NN, and take this as a starting point

#

While it might be out of your scope, i would suggest looking into a model-free Reinforcement Learning scenario

#

Where your NN updates the Q

pseudo wren
#

I need some assistance with learning data frames

serene scaffold
lone drum
#

Hello
I have a date column in a dataframe
The dates are in
```
04/03/2017
05/03/2017
```
format
I want it in

4/3/2017
5/3/2017

How can I do this? ping me when replying
serene scaffold
arctic wedgeBOT
#

```
Series.dt.strftime(*args, **kwargs)
```
Convert to Index using specified date_format.

Return an Index of formatted strings specified by date_format, which supports the same string format as the python standard library. Details of the string format can be found in [python string format doc](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior).
lone drum
#

Not what I am expecting

#

I want to remove zero from day and month value

serene scaffold
serene scaffold
lone drum
#

Current data frame with date1 column

#

@serene scaffold do u get my code and output

serene scaffold
#

I guess you could use a regular expression

#
pd.to_datetime(new_df.date1).dt.strftime('%b/%d/%Y').str.replace(r'/0(\d)/', r'/\1/', regex=True)
#

See if that ends up looking like you wanted.

#

boop @lone drum

serene scaffold
#
In [5]: s = pd.Series(['Apr/09/2017', 'Apr/12/2017'])
Out[5]:
0    Apr/09/2017
1    Apr/12/2017
dtype: object

In [6]: s.str.replace(r'/0(\d)/', r'/\1/', regex=True)
Out[6]:
0     Apr/9/2017
1    Apr/12/2017
dtype: object

It worked when I did it.

lone drum
#

Why are you getting
Apr instead of 4?
I want number momth

#

Month*

serene scaffold
#

What does pd.to_datetime(new_df.date1).dt.strftime('%b/%d/%Y') look like if you print it? Please copy and paste the result as text and I'll be happy to continue helping.
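
A minimal sketch of two ways to get numeric dates without leading zeros, assuming the column holds strings like the ones in the question. The third sample value is made up, and Option 2 assumes day/month/year order, which the thread does not confirm.

```python
import pandas as pd

s = pd.Series(["04/03/2017", "05/03/2017", "10/11/2017"])  # third value made up

# Option 1: work on the text and drop any zero that starts a number.
no_zeros = s.str.replace(r"\b0(\d)", r"\1", regex=True)

# Option 2: parse, then rebuild from the numeric parts (assumes day/month/year).
parsed = pd.to_datetime(s, format="%d/%m/%Y")
rebuilt = (parsed.dt.day.astype(str) + "/"
           + parsed.dt.month.astype(str) + "/"
           + parsed.dt.year.astype(str))

print(no_zeros.tolist())
print(rebuilt.tolist())
```

Option 1 does not depend on which field is the day and which is the month; Option 2 is safer if the strings may be inconsistently formatted, since parsing fails loudly on bad input.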