#data-science-and-ml
yeah big wtf pikachu face right now

does anyone know basic machine learning ?
i made pytorch
define basic
i do
what lib and where u learn it
sklearn
what lib
bit of tensorflow
not too much
m
for some code
and its meaning

I hate you whoever texted me
bro your name is amazing lol
We all can relate sir
from parallel_universe_2 import wife_&_kids
Traceback: wife_&_kids not found
LMAO
LOVE THE JOKES
the big sad
Might not be just jokes

Wow
umm
can someone tell me
what is the meaning of skewness?
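how can i observe the skewness, i.e. whether the data is right-skewed or not
Skew means the distribution is asymmetric: one tail is longer than the other, so the mean is pulled away from the median toward the longer tail instead of sitting at the center as in a normal distribution.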
its not that bad. IMO its the easiest framework to learn (especially when comparing with PyTorch)
Hey, I am new to datascience. I am currently following a minor about bigData and I have a project, but I cannot figure out how I need to solve this problem. I have currently only learned the basics like pandas, webscraping and a bit of classifiers like KNN.
My project is to create a program that can predict if a hotel review is positive or negative. Every row of data I have has a positive and negative review, but I don't know how to start. Can anyone help me out?
Why not use pd.Series() instead?
i see, and how can i tell if the data is right-skewed or not?
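A minimal sketch of one way to check this, assuming the values sit in a plain list (the `incomes` sample is made up for illustration). The moment coefficient of skewness is positive for right-skewed data and negative for left-skewed data, and in a right-skewed sample the mean usually lands above the median.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# hypothetical sample with a long right tail
incomes = [20, 22, 23, 25, 25, 27, 30, 35, 60, 120]
g1 = skewness(incomes)
print("skewness:", round(g1, 2))
print("mean > median:", statistics.fmean(incomes) > statistics.median(incomes))
```

With pandas, `Series.skew()` reports a bias-corrected version of the same statistic, and a histogram of the column shows the long tail directly.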
How to export pandas dataframe into text clipboard that I could paste into script as string and reimport as pandas dataframe again? this is for MVE
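One possible approach, sketched with a made-up frame: dump the DataFrame to CSV text, paste that text into the script as a string, and read it back through `io.StringIO`. The column names here are placeholders.

```python
import io
import pandas as pd

# hypothetical frame standing in for the real data
df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 1.5, 2.5]})

# text form that can be pasted into a script as a string
csv_text = df.to_csv(index=False)
print(csv_text)

# rebuild the frame from the pasted string
df_again = pd.read_csv(io.StringIO(csv_text))
```

`df.to_clipboard()` puts a text version on the system clipboard directly, and `df.to_dict("list")` gives a literal that can be passed back to `pd.DataFrame(...)`.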
hi
Is there any way to leave a header empty in a df when i use multiple headers?
level_0 will always be present
but for levels 1 to 4 i want an empty field when there is no name present
for i, columns_old in enumerate(df_value.columns.levels):
    columns_new = np.where(columns_old.str.contains('Unnamed'), '', columns_old)
    df_value.rename(columns=dict(zip(columns_old, columns_new)), level=i, inplace=True)
got it
Yes, but depends on what you actually want to do and what your data is like. For example, if you're passing only 1 argument at a time, I would pass it as a string then append it at the end of a list.
Can anyone tell me how to add labels/text or a second yaxis from another Col to this: Do you know how I can add this legend or text to my px.bar, need to display the text for each of my values:
def SetColor(y):
    # map a score (1-9) to a bar color; scores above 9 fall through and return None
    if y <= 1:
        return "red"
    elif y <= 2:
        return "orange"
    elif y <= 3:
        return "yellow"
    elif y <= 4:
        return "lightgreen"
    elif y <= 6:  # covers both 5 and 6
        return "green"
    elif y <= 7:
        return "darkgreen"
    elif y <= 8:
        return "silver"
    elif y <= 9:
        return "gold"
def Setlabel(y):
    # map a score (1-9) to its legend label
    if y <= 1:
        return "Very low (1)"
    elif y <= 2:
        return "Low (2)"
    elif y <= 3:
        return "Below average (3)"
    elif y <= 4:
        return "Average (4)"
    elif y <= 6:  # covers both 5 and 6
        return "Average (5, 6)"
    elif y <= 7:
        return "Above average (7)"
    elif y <= 8:
        return "High (8)"
    elif y <= 9:
        return "Very High (9)"
px.bar(filtered_df, x=filtered_df["ID"], y=filtered_df["Score"]).update_traces(marker = dict(color=list(map(SetColor, filtered_df['Score']))))
This doesn’t work:
##update_layout(legend = dict(list(map(Setlabel, SetColor, filtered_df['Score']))))
Hello
I need help with SCRAPING,
I manually saved the webpage in html format from browser.
Im able to retrieve dataframe from MANUALLY SAVED method.
I saved webpage using request url module in html format.
But i cannot retrieve dataframe from that.
Both looks same but I don't know this happens.
Thanks if you helped me.
it isnt that easy to make a dataframe from the table....u need to carefully examine the html to make the dataframe
hold on,
it was very easy with pandas to read_html(mannualysave.html)
but not with request method
that's probably the most un-pythonic and inefficient way to store something. A dict with keys being numbers and values strings would have served you better
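A rough sketch of what that suggestion could look like, assuming the scores are whole numbers from 1 to 9. The colors and labels are copied from the SetColor/Setlabel functions; the `scores` list is a made-up stand-in for `filtered_df["Score"]`.

```python
# score -> (color, label); values copied from the SetColor/Setlabel functions
SCORE_STYLE = {
    1: ("red", "Very low (1)"),
    2: ("orange", "Low (2)"),
    3: ("yellow", "Below average (3)"),
    4: ("lightgreen", "Average (4)"),
    5: ("green", "Average (5, 6)"),
    6: ("green", "Average (5, 6)"),
    7: ("darkgreen", "Above average (7)"),
    8: ("silver", "High (8)"),
    9: ("gold", "Very High (9)"),
}

scores = [1, 6, 9]  # hypothetical stand-in for filtered_df["Score"]
colors = [SCORE_STYLE[s][0] for s in scores]
labels = [SCORE_STYLE[s][1] for s in scores]
print(colors, labels)
```

Unlike the `<=` chains, a dict lookup only works for exact integer scores, so fractional scores would need rounding first. `px.bar` accepts a `text` argument for per-bar text, so a list like `labels` could be passed there.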
@grave frost have you used PlotlyExpress?
no, I have no idea about that. I was just advising you to save time and effort by using data structures
It doesn’t behave the same, take this for example:
=dict(list(Setlabel(bro))))
ValueError: dictionary update sequence element #0 has length 1; 2 is required.
That was with a basic data structure.’bro’
you can try by using an example dict with sample data to better explain your issue here
or use a help-channel
I got it working with the function anyway, doing it under the color trace not by updating traces. Haven’t managed to work out the second y axis tho or verify the data but I’m assuming adding the second y as a trace will resolve it.
Trust me a basic dict and values was the first thing I tried with the df
that is numpy for you

Hello, wanted to know if I could get some assistance in using dynamic pivot in a query using TSQL. Please let me know if this is the right channel to ask this question or point me to the correct channel
#databases is your best bet
thank you
ah yes, high degree polynomial regression, the ability to make anything fit anything

Hi, Im programming a mini mathematics/scientific dice rolling game
its only a couple lines but I want it to be able to count the median of the sums
if someone dms i would be ecstatic
okay, code is in swedish though
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
tarningsSumma = 0
for n in range(0,antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    forsokSumma = tarning1 + tarning2
    tarningsSumma += forsokSumma
    print(tarning1,tarning2,forsokSumma,"\t",tarningsSumma)
print("Summan är",tarningsSumma)
if you run this in python, it will ask you how many times do you want to roll
and you can choose a number and it will give you the total sum
what if I want it to give the sum divided by the number chosen
tarning means dice
summa is sum
!e
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
tarningsSumma = 0
for n in range(0,antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    forsokSumma = tarning1 + tarning2
    tarningsSumma += forsokSumma
    print(tarning1,tarning2,forsokSumma,"\t",tarningsSumma)
print("Summan är",tarningsSumma)
You are not allowed to use that command here. Please use the #bot-commands channel instead.
anything @hollow sentinel ?
Hey @thin remnant!
It looks like you tried to attach file type(s) that we do not allow (.csv). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.
Feel free to ask in #community-meta if you think this is a mistake.
I'm having a csv file that has 32 cols
but all data is in the first col
how to separate it over the cols
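This usually happens when the file uses a delimiter other than a comma. A small sketch, assuming the file is semicolon-separated (the content below is invented; the real delimiter may differ):

```python
import io
import pandas as pd

# hypothetical file content: semicolon-separated, so the default sep="," sees one column
raw = "a;b;c\n1;2;3\n4;5;6\n"

one_col = pd.read_csv(io.StringIO(raw))
print(one_col.shape)   # (2, 1)

split = pd.read_csv(io.StringIO(raw), sep=";")
print(split.shape)     # (2, 3)
```

If the delimiter is unknown, `pd.read_csv(path, sep=None, engine="python")` asks pandas to detect it.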
open with excel
@keen gull
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
tarningsSumma = 0
for n in range(0,antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    medelSumma = (tarning1 + tarning2)/antalForsok
    tarningsSumma += medelSumma
    print(tarning1,tarning2,medelSumma,"\t",tarningsSumma)
print("Genomsnittet är",tarningsSumma)
idk swedish so i used google translate
sorry if the words are wrong
lol
ur a god bro wtf
Guys, I have an excel file that got many spread sheets. One shows info about "Win rate with X character", other has data about "Win rate when X character follows Y route", and so on. They have different lengths. Can I just make a DataFrame with them? Or would it be better to just separate them all in different files and use them as different datasets?
can I dm u if i need more help?
its just this line lol
medelSumma = (tarning1 + tarning2)/antalForsok
uhh ill probably be busy later so just post here

(I want to use them to make predictions using machine learning)
okay thank you ❤️
I'd make a dataframe per. It's weird to have different data in one DF.
I see. So do I just have to use each one as a different dataset? What should I do if I want to train my algorithm to predict a mix of possibilities?
For example: If I got a dataset that shows "Win rate when playing with X character"
and another that shows "Win rate when using Y item"
What if I want to predict "Win rate when playing with X character and using Y item"?
You can make a single dataset out of them if you can figure out how to make them all, well, the same kind of data
I see
yo pandas is struggling to read a csv file with just about less than half a milion rows is that normal?
It did issue me a warning about mixed dtypes but I then specified them
@misty flint it didnt work, its still only showing the result
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
tarningsSumma = 0
for n in range(0,antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    medelSumma = (tarning1 + tarning2)/antalForsok
    tarningsSumma += medelSumma
    print(tarning1,tarning2,medelSumma,"\t",tarningsSumma)
print("Genomsnittet är",tarningsSumma)
ah i forgot to change the last line
but its a bit off, by like 0.15

probably this line
tarningsSumma += medelSumma
not at my comp anymore so i cant check lol
hmm i tried changing to only +, only =, only - and *
none worked
wait i just realized that it now shows numbers with decimals? its a dice roll so it cant be anything other than the numbers 1-6
print(tarning1,tarning2,medelSumma,"\t",medelSumma)
but you wanted the average right? not really possible to get average without decimals
the "Genomsnittet"
yes the average can be in decimals but not the actual results
but when it shows u the sequence e.g 2 5 8, it shows now 2 5 1.3
i think i dont understand the question. sorry bud
for example, 1.6 is fine for it to have decimals
but here, you can see in line 3 that it says 5 6 2.2, that means the dice rolled 2.2, which isnt possible
I just want it to show the median at the results
i misunderstood the initial problem
its the last line right_
?
that should be edited
thats my fault, im not that good at explaining
you mean average right?
I was thinking:
print("Genomsnittet är", (tarning1 + tarning2)/antalForsok)
yes average
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
tarningsSumma = 0
for n in range(0,antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    medelSumma = (tarning1 + tarning2)/2
    tarningsSumma += medelSumma
    print(tarning1,tarning2,medelSumma,"\t",medelSumma)
print("Genomsnittet är",medelSumma)
the result divided by the number the person chose
oh man this is hard to do on mobile lol
im having trouble on a laptop xD
wait now i definitely dont understand. dont think that code will work for what you are looking for

let me try to formulate it
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
tarningsSumma = 0
for n in range(0,antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    forsokSumma = tarning1 + tarning2
    tarningsSumma += forsokSumma
    print(tarning1,tarning2,forsokSumma,"\t",tarningsSumma)
print("Summan är",tarningsSumma)
the original code shows the sum of all the dices you rolled
so if u chose 3 and got 1, 2, and 3 you would get the sum 6
i want it to take 6 and divided it by 3, the number chosen
which should give 2
dudes im having a slight problem, I have a (10, 1561) feature matrix but when i do feature[i] its shape is (1561, )
for this part just write another line of
x = forsokSumma/antalForsok
print(x)
then place x wherever you want it
wdym place x wherever?, I just started coding 2 days ago sorry xD
anyone here good at numpy?

ah you want THAT number
change it to
x = tarningsSumma/antalForsok
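Put together, the idea could look like the sketch below. It keeps every roll's sum in a list so both the average (total divided by number of rolls) and the median can be reported, since the two got mixed up earlier. The function name `roll_sums` and the fixed 5 rolls are illustrative choices, not part of the original script.

```python
import random
import statistics

def roll_sums(n_rolls, rng=random):
    """Return the sum of two six-sided dice for each of n_rolls throws."""
    return [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]

sums = roll_sums(5)
total = sum(sums)
print("Summan är", total)
print("Genomsnittet är", total / len(sums))      # average: total / number of rolls
print("Medianen är", statistics.median(sums))    # median: middle value of the sorted sums
```

Replacing `5` with `int(input(...))` gives the interactive version.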
now it only shows sum
have you tried reshaping it
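For the shape question, a short sketch with a zero-filled stand-in array: indexing with a single integer drops that axis, while a length-1 slice or a reshape keeps the row two-dimensional.

```python
import numpy as np

features = np.zeros((10, 1561))    # stand-in for the (10, 1561) feature matrix

row = features[0]                  # integer index drops the first axis
print(row.shape)                   # (1561,)

row_2d = features[0:1]             # a length-1 slice keeps it 2-D
print(row_2d.shape)                # (1, 1561)

same = features[0].reshape(1, -1)  # or reshape the 1-D row back into one row
print(same.shape)                  # (1, 1561)
```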
LMFAOO

so all I had to do was add those two lines?
can i kiss u
how can i make an empty space between the sum and the average thingy
between the last two
just do
print()
in between the two lines
oh makes sense

oh i love you bro

hey y'all im gonna be doing a soundcloud network analysis project to recommend new users to listeners and I have to combine it with another dataset for an assignment. Any ideas for what I could combine it with for some interesting insights?
@misty flint dont kill me but...
import random
antalForsok = int(input("Hur många gånger vill du kasta?"))
raknare = 0
for n in range(0, antalForsok):
    tarning1 = random.randint(1,6)
    tarning2 = random.randint(1,6)
    forsokSumma = tarning1 + tarning2
    if forsokSumma == 9:
        raknare += 1
    print(tarning1, tarning2, forsokSumma, "\t", raknare)
print("Antal gånger summan 9 slogs:", raknare)
this is a new code
and i wanna make it find the likelihood of getting the sum 15
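A sketch of two ways to get such a likelihood, assuming two ordinary six-sided dice as in the code above: count the matching outcomes exactly, or estimate the chance from many simulated throws. Note that two six-sided dice can sum to at most 12, so a target of 15 has probability 0. The function names are illustrative.

```python
import random
from fractions import Fraction

def exact_probability(target):
    """Exact chance that two six-sided dice sum to target."""
    hits = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == target)
    return Fraction(hits, 36)

def estimated_probability(target, n_rolls, seed=None):
    """Relative frequency of target over n_rolls simulated throws."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_rolls)
        if rng.randint(1, 6) + rng.randint(1, 6) == target
    )
    return hits / n_rolls

print(exact_probability(9))    # 1/9
print(exact_probability(15))   # 0
print(estimated_probability(9, 100_000))
```

In the original script the estimate is simply `raknare / antalForsok` after the loop.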
I don't know yet but it should be things like genre of songs, other artists an artist is following/who follows them, songs #, song length, external social media links, etc...
that will be one metric yea
ah
yeah i could do that
the assignment needs an external dataset though, nothing i could scrape from soundcloud
hmm
im thinking i could combine it with data i scrape from their twitter links in their bios but not many artists do that
i have no idea yet, im planning out the architecture of the project first
Hmm im not sure which part of that applies?
rule 5 sorry
Ah ok this isn't a solution, this part is ungraded
My prof only grades on our actual analysis
Same im having a hard time w this
guys, what is it called when you say "what two numbers added together ranging from 1-12 are equal to 9"
how do you turn that into a python line or what is it even called mathematically
so for example, 1 and 8, 2 and 7, 3 and 6, 4 and 5
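In Python this can be written as a search over all pairs that keeps the ones hitting the target sum. A minimal sketch, assuming unordered pairs of two different numbers from 1 to 12, as in the example:

```python
from itertools import combinations

target = 9
numbers = range(1, 13)  # 1..12 inclusive

# unordered pairs of distinct numbers whose sum is the target
pairs = [(a, b) for a, b in combinations(numbers, 2) if a + b == target]
print(pairs)  # [(1, 8), (2, 7), (3, 6), (4, 5)]
```

For dice the setup differs slightly: each die only goes from 1 to 6 and the order matters, so two nested `range(1, 7)` loops would be the closer fit.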
what do you mean?
your target column is what you would define as y...its what you are trying to predict
they're all music artists with the same feature types and stuff
you have to predict listener to artist or vice versa?
i'm just trying to cluster artists together
so like given an artist as input, it would produce a list of artists who are related
and i'm looking for ways to implement external datasets in here
genre would be the closest one then. do you anything about music?
do i anything about music?
I can't identify your tone 🤷 but If I were you, I would use the time key signature, bpm, and indices of rests in songs to create features and cluster artists via them. (you could take median of that most prob, average doesn't seem appropriate)
or you could construct an artificial feature concatenating indices of rest and their distance along with time key in a formula and use that
hey sorry i'm in a zoom but i'll be back in a sec
Anyone using Kedro here?
Hello all, nice to meet you.
Hhahahha an impostor
Just a very easy to happen here coincidence. I use the same username name at gaming and social servers where no one understands it.
Yes, I think if we search BackPropa* on the usernames, thousands of them would appear on results
@fiery maple So Kedro? No I don't feel comfortable with this type of frameworks
someone know how to set gridline color on a seaborn lineplot graphic?
you can always find the matplotlib object directly
i think for grid it was... plt.gca().grid(color=...)
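A small sketch of setting the grid color on the Axes object. It uses plain matplotlib with made-up data; `sns.lineplot(...)` returns the same kind of Axes, so the `grid` call should carry over. The color and line style are arbitrary picks.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for an interactive window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 3, 2, 4])   # stand-in for ax = sns.lineplot(...)

# color, linestyle and linewidth are forwarded to the grid lines
ax.grid(True, color="tomato", linestyle="--", linewidth=0.8)
```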
i dont use matplotlib . _.
havent used matplotlib in a while
i was thinking if i left seaborn to only use matplotlib if could be better for performance, cause i only need to graphics like this:
you can use both
but for that id probably just use mpl
for something prettier definitely seaborn, much easier
idk, i'm trying to use less imports, cause i have a big list rn . _.
i'm afraid my bot could have problems with this
a big list of imports doesn't matter - tip: you can also import multiple things at the same time example:
from math import sqrt, ...<multiple_other_modules>.
or you can do import math and use it like math.sqrt()
hey guys, I need some help with Tableau/Power BI stuff, is this the right place?
I think we only help with Python libraries for data science/ML
For discussion of scientific python, matplotlib, statistics, machine learning and related topics.
I'm guessing @undone heron is trying to run a python script in Power BI.
We can talk about data science topics in a general, language-agnostic way, though any discussion about implementations should be with respect to Python.
performance difference would be minimal
most of the overhead of MPL is from drawing
If I wanted to drop rows in a df where the column 'victory_status' is equal to 'outoftime' would I not just do: df.drop(index=np.where(df.data["victory_status"] == "outoftime"))?
can you give an example CSV that I can use to try it?
You can flip it around: keep all of the rows where 'victory_status' is not 'outoftime'.
!paste If you copy and paste enough rows for me to try it, I will try it or look for other solutions.
Pasting large amounts of code
If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/
After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.
remember how you can use masks?
I'm not sure why it didn't occur to me sooner
I'm ngl I have no clue what a mask is
df = df[df['victory_status'] != 'outoftime']
you use the mask, df['victory_status'] != 'outoftime', to get a boolean series of what you want
and then you use that as a mask
which is what the outer df[...] does
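Spelled out on a tiny made-up frame (only the column name and the 'outoftime' value come from the question), the mask approach and a working drop-based equivalent look roughly like this:

```python
import pandas as pd

# hypothetical rows for illustration
df = pd.DataFrame({
    "victory_status": ["mate", "outoftime", "resign", "outoftime"],
    "turns": [30, 55, 41, 12],
})

mask = df["victory_status"] != "outoftime"   # boolean Series, one value per row
kept = df[mask]

# the drop-based version needs index labels rather than the tuple np.where returns
dropped = df.drop(index=df.index[df["victory_status"] == "outoftime"])

print(mask.tolist())   # [True, False, True, False]
print(kept.equals(dropped))
```

Several conditions can be combined in one mask with `&` and `|`, each wrapped in parentheses.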
If you want some real-time plotting and maybe some other GUI stuff I recommend: https://github.com/hoffstadt/DearPyGui It's a very good option for all GUI things in python and comes with plotting features too (can even make games with it).
numpy works similarly
You will have a generally easier time if you filter for what you want, rather than dropping what you don't want.
huh I'll definitely have to remember that that's hella useful
Think of it like doing a search, where you keep narrowing it down with each filter.
that sounds extremely wasteful memory-wise tho
You're using python and you are worried about memory (use c instead)? But ok, you can just apply multiple filters at once, and yes you have all those resulting rows duplicated, but it's much less error prone since you are not modifying state (original df unchanged). If you have a memory issue, then fix it at that time. Until then it's premature optimization.
Generally I doubt you will have a memory issue, but if you do, consider using a DB since it will store most of the data on disk for you (well all, but it will keep some in memory for faster access).
Oh my god I'm going to use this so much in the future
That's actually amazing for scientific and engineering applications
Yeah it's so nice, I already used DearIMGUI and when I found this I was so happy.
I use dearpygui for all kinds of things now, like telemetry.
If I ever find some free time I might remake the features of MATLAB's Control systems toolbox or something using that
well honestly I think using a db would solve most of my issues at this point but my prof wants us to use pandas
I get what you're saying about not needing micro-optimisations but at the same time creating multiple dfs seems unnatural to me and I like micro-optimisations 🤷♂️
Yeah it has all of dear imgui's stuff including the raw draw commands which you can use to make custom widgets.
It's not a micro-optimization, but at the same time, you have probably like 8GB of RAM.
8GB is a lot, unless you are dealing with like raw video.
4 but ya I see what you're saying
Also you have virtual memory too, so it will use the disk also.
(All modern operating systems do, as long as you don't cause pages to fly in and out really fast you basically have unlimited memory)
Quick question: I'm trying to run a correlation test between an ordinal DV and a continuous IV. Stuck between choosing Spearman's or Kendall's. Any advice?
i was thinking about switching from seaborn to plotpy, what do u guys think? do u guys have a favorite for performance and beautiful visualization?
Jesus Christ @iron basalt you're like the god of repos lol
i mean DONT use matplotlib if you dont have to

I love how I've changed from dumb social networks to github repos and from bullshit bookmarks to python or ds articles o.O
I think someone shared this here earlier, and it looks amazing:

There's like 3-5 people who actually know ML stuff here that are regulars and they generally have better things to do than talk all day
I'm coding and reading, I have already been distracted too much here xd, but Raggy's stuff was p cool so kind of worth it.
huh thats true
it has its moments
alr
Right now i'm a bit more in a reading phase so I can just drop in, but when I go back to a coding / engineering phase I will pretty narrowly focused on it.
which part do you like the best

ive just started grad school coming from a dif field so im mainly in the learning phase
all of it
neural nets mainly
GANS
CNNs
all that good stuff
opencv is fun to work with
its awesome what you can do with it
oh nice
one of my last projects was with opencv. it was cool
our prof wants us to train a CNN for this project

yeah thats the plan. still reading more before actually training anything
probs will start tomorrow morning tho
we have a meeting with the TA in the afternoon

noice. bet you know more than me tho

regarding this stuff. maybe not other things lol
dw. you will have plenty of assignments like these if you choose to study this stuff

thx
theres stuff up to rows 300k in excel but when i read the file with python, it says its shape is only 60k by 785

hello, XML question:
when do you use
<device name="SEP12345"></device>
and when do you use
<device>SEP12345</device>
I did a df.groupby("column").count()
how come a bunch of values are 0?
shouldn't every value be at least 1?
Otherwise they wouldn't exist no?
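One likely explanation, assuming the other columns contain missing values: `count()` counts non-null cells in each remaining column, not rows, so a group whose values are all NaN shows 0. A sketch with invented data:

```python
import numpy as np
import pandas as pd

# hypothetical data: group "b" exists, but its only 'value' is missing
df = pd.DataFrame({
    "column": ["a", "a", "b"],
    "value": [1.0, 2.0, np.nan],
})

counts = df.groupby("column").count()   # non-null cells per column -> b has 0
sizes = df.groupby("column").size()     # rows per group -> b has 1
print(counts)
print(sizes)
```

A categorical grouping column with unused categories is another possible source of zero rows.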
Hello, is anyone familiar with using Google Data Studio? I've got an issue with custom data not displaying correctly in charts?
What's a good resource for a beginner to learn about neural networks and implement one in PyTorch? I want to implement a binary classifier and compare it to a logistic regression model in sklearn
whats data science?
!resources @obtuse sable
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
in short, it's when you use programming to make use of large amounts of data.

I have a program that calculates precision, recall, and f1 scores, and it breaks if there's a class with 0 tps. in that case should I set all three scores to zero, or would one usually use nan?
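A minimal sketch of guarding the divisions, with the fallback left as a parameter so either convention (0 or NaN) can be chosen. Note that precision is only undefined when the class was never predicted (tp + fp == 0) and recall only when the class never occurs (tp + fn == 0); with 0 true positives but some false positives, precision is a legitimate 0. The function name and `zero_value` parameter are illustrative.

```python
def precision_recall_f1(tp, fp, fn, zero_value=0.0):
    """Per-class scores; zero_value is returned wherever the denominator is 0."""
    precision = tp / (tp + fp) if (tp + fp) else zero_value
    recall = tp / (tp + fn) if (tp + fn) else zero_value
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = zero_value
    return precision, recall, f1

print(precision_recall_f1(tp=0, fp=3, fn=2))   # (0.0, 0.0, 0.0)
print(precision_recall_f1(tp=8, fp=2, fn=2))
```

scikit-learn exposes the same choice through the `zero_division` argument of its precision, recall and F-score functions.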
Anyone know what is the technical term for the multi-branch networks (the ones where we would concatenate layers)?
Multi branch network is the term lol
Multi output is sorta common too
so I have a subset of a df, columns(productID and orders), there are 5 unique product ids, when I try to do a histplot with hue=productID, all productIDs from the original df show up
Hey! So I am just starting with python
common = agg_df.sort_values(by=["sh_18_Ln_PalletsEquivalent"], ascending=False).index.tolist()[:5]
df_most_ordered = df[df["sh_ItemId"].isin(common)]
sns.histplot(data=df_most_ordered,
             x="sh_18_Ln_PalletsEquivalent",
             hue="sh_ItemId")
Question... modules are synonymous of libraries?
a module is another python script
your question would fit more in #python-discussion
Excellent thanks!!
my question is why are all the original productIDs showing up
where they aren't even part of the dataframe I am basing the plot on
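One common cause, assuming `sh_ItemId` has a categorical dtype: filtering rows does not shrink the list of categories, and plotting libraries tend to list every category in the legend. A sketch with made-up values (the `pallets` column is a placeholder):

```python
import pandas as pd

# hypothetical data; assumes the id column is categorical
df = pd.DataFrame({
    "sh_ItemId": pd.Categorical(["A", "B", "C", "D", "A", "B"]),
    "pallets": [5, 3, 1, 2, 4, 6],
})

subset = df[df["sh_ItemId"].isin(["A", "B"])].copy()
before = list(subset["sh_ItemId"].cat.categories)
print(before)   # ['A', 'B', 'C', 'D'] even though only A and B remain

subset["sh_ItemId"] = subset["sh_ItemId"].cat.remove_unused_categories()
after = list(subset["sh_ItemId"].cat.categories)
print(after)    # ['A', 'B']
```

If the column is not categorical, this is not the explanation and the filter itself is worth checking.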
yo
i am trying out a beginner project
as u can see, the data is the documents
and i really dont understand how it predicted both "that kud" and "HOHOHO" as belonging to the same cluster in this case
is there anything that i should fix there? i dont think that it's actually legit
Hey guys I'm working with matplotlib subplot animations and I'm running into some issues, could someone please just tell me what is wrong with this code? I don't get the logic behind it.
import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np
figure = plt.figure()
n = 1000
x1 = np.random.normal(-2.5, 1, 10000)
x2 = np.random.gamma(2, 1.5, 10000)
x3 = np.random.exponential(2, 10000) + 7
x4 = np.random.uniform(14, 20, 10000)
x = [x1, x2, x3, x4]
def plot_hist(curr):
    if curr == n:
        a.even_source.stop()
    for i in range(len(axs)):
        axs[i].cla()
        axs[i].hist(x[i], bins=100)
fig, ((top_left, top_right), (bottom_left, bottom_right)) = plt.subplots(2, 2, sharex=True)
axs = [top_left, top_right, bottom_left, bottom_right]
a = animation.FuncAnimation(figure, plot_hist, interval=100)
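A few things stand out in that snippet: `plt.figure()` creates a second, empty figure that is then passed to `FuncAnimation` while the axes live on `fig`; `a.even_source` looks like a typo for `event_source`; and the callback ignores `curr`, so every frame draws the same histograms. A possible corrected sketch follows. The sample sizes, bin count and `step` variable are my own choices, and `sharex` is dropped because the four distributions cover different ranges.

```python
import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np

n = 100          # number of frames; each frame adds `step` samples
step = 100
rng = np.random.default_rng(0)
x = [
    rng.normal(-2.5, 1, n * step),
    rng.gamma(2, 1.5, n * step),
    rng.exponential(2, n * step) + 7,
    rng.uniform(14, 20, n * step),
]

# one figure only: the same `fig` that owns the axes goes to FuncAnimation
fig, axes = plt.subplots(2, 2)
axs = list(axes.flat)

def plot_hist(curr):
    """Redraw each histogram using the first (curr + 1) * step samples."""
    for ax, data in zip(axs, x):
        ax.cla()
        ax.hist(data[: (curr + 1) * step], bins=30)

# frames=n with repeat=False ends the animation without touching event_source
anim = animation.FuncAnimation(fig, plot_hist, frames=n, interval=100, repeat=False)
```

Calling `plt.show()` afterwards displays it; the `anim` reference has to stay alive or the animation is garbage-collected.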
yeet! 😅
Is there any specific advantage that architecture offers?
Multi channel is multiple branches as inputs
It's less an advantage of an architecture more speeding up multiple networks on the same data.
Multiple output branches are basically multiple networks that share input parameters
ahh, so multichannel has multiple input gates while branched ones just have multiple branches on one input gate?
Means you can do less processing because you're likely to be learning the same lower level features in early layers anyway
See: Mask RCNN
It's basically Faster RCNN that shares its early parameters with a segmentation branch
you can use cosine similarity or simple euclidean distance to check whether the vectors are indeed correct and debug that way
most prob there is little to no correlation
you can visualize them too so that its easy to understand and looks pretty too
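As a rough illustration of that check, here is cosine similarity written out by hand on made-up vectors. Real document vectors (for example TF-IDF rows) would be much longer, but the calculation is the same.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# hypothetical document vectors
doc_a = [1.0, 0.0, 2.0]
doc_b = [2.0, 0.0, 4.0]   # same direction as doc_a
doc_c = [0.0, 3.0, 0.0]   # shares no terms with doc_a

print(cosine_similarity(doc_a, doc_b))   # close to 1
print(cosine_similarity(doc_a, doc_c))   # 0.0
```

Two documents in the same cluster with a similarity near 0 would suggest the vectors carry little usable signal.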
i dont think i can visualize the data tho
I mean the vectors
there are plenty of tutorials online
I think there was a guy recently who posted the same article here, but you would have to search for it
!code
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
Does anyone know a good tutorial to start learning tensorflow in python?
Linear Support Vector Machine is widely regarded as one of the best text classification algorithms.
google got this ^^^
Being useable is pretty useful in breaking SOTA. It's also sort of useful in terms of representation theory. NNs don't necessarily learn lower level features early on.
I had a question - can anyone suggest some method where I can incorporate document vectors with TF-Idf? it would be pretty easy with word vectors with simple scalar multiplication. but what approach could we use in documents??
https://www.reddit.com/r/MachineLearning/comments/m3boyo/d_why_is_tensorflow_so_hated_on_and_pytorch_is/ here's a pretty interesting conversation on tensorflow vs pytorch
you can check out the tensorflow guide on tensorflow's website
or the tutorials
Ok, thank you 🙂
hi, i want some help. So i am unsure of the type of machine learning algorithm i should use, i hope yall know the type of learning that suits my project.
a) i want it to be able to be able to categorize sentences, such as "chitchat" and "task" and even sub categories based on any pattern that it can find from the sentences. ( main objective )
b) i want to let it be able to be trained by this way. I first define the categories, then i put some sentences in the category. it tries to predict the category of test sentence and i can tell the correct category if it's wrong ( i think i can do the same with supervised learning for the "telling the correct category" part )
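What is described here is supervised text classification. A minimal sketch with scikit-learn, assuming a TF-IDF plus linear SVM pipeline is acceptable; the sentences and the two categories are invented, and a real setup would need far more examples per category.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# hypothetical training sentences and categories
sentences = [
    "how are you doing today",
    "what a nice day it is",
    "tell me a joke please",
    "turn on my fan",
    "switch off the kitchen light",
    "set an alarm for seven",
]
labels = ["chitchat", "chitchat", "chitchat", "task", "task", "task"]

# TF-IDF turns each sentence into a vector; the linear SVM learns the categories
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(sentences, labels)

predictions = model.predict(["turn on the light", "how is your day"])
print(predictions)
```

Correcting a wrong prediction then amounts to appending the sentence with its true category to the training lists and calling `fit` again.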
nice. personally, I am more of a TF Person, but I see that people are comparing TF 1.x more than TF2. The basic summary is: I started with TF1, it sucked. I switched to pytorch and would never switch again.
For me, TF just adds a lot of native support which makes it easier to do stuff that uses other Google products (like GCP, Tfrecords, TPU, etc.) having native support is a big thing that prevents TF people from going to PyTorch. I wanted to do PyT with TPU and that was a mess of errors. with TF - literally 5 lines. Same with model parallelism - its just easier.
Thats why a lot of people stick with TF because it just makes life easy 🤷 Tho my aim is to be familiar with pytorch too by the time I end undergrad or smthing
for me it depends on the complexity of the project
i like using keras with tensorflow for simple projects
since its a lot easier
but for more complex stuff i use pytorch
i've never really used raw tensorflow with no keras
@grave frost @austere swift what do u think of my project?
there is no need to use raw TF when keras works
yeah
its only for researchers and power users I guess
as for TPU afaik thats like one of the biggest advantages that TF has
pytorch just doesn't really like TPUs
but I train on local servers rather than kaggle or colab
so I use GPUs anyways
ye, XLA f-ing sucks. that thing produces a shit-ton of errors that are basically the exact opposite of what the error says
never. again.
Cloud?
rich boi
I would never be able to take the brave step with local hardware (but then I don't really have $$)
my dad gets grants for research stuff, so thats how i get funded
I'm the one who mostly uses the hardware though
my dads a professor and physicist at a university
that is pretty cool
yeah
so whats the config?
this is the main server i use
4 rtx 6000s, dual xeon 6242s
384gb ram
i also have a second one which is similar except it has dual 4210s and a single rtx 6000
but i'm soon gonna be upgrading it to have an A40, then moving the rtx 6000 from it to the one with 4 rtx 6000s
dude, please stop. my potato computer would crash seeing such expensive hardware
so its gonna be one with an A40, and one with 5 rtx 6000s
do you do kaggle?
yeah
cuz I doubt then you wouldn't atleast come at top 5
sometimes i just use it to train multiple models at once too
one on each gpu, or 2 each using 2 GPUS
or any config like that
mix and match :)
lucky you
experimentation must be a breeze when you don't have to think about resources
its all just writing the code
yeah but the code is more complicated when you have to make it take advantage of the GPUs
coding for multi gpu is more complicated
still
yeah lol its nice
you can't experiment with that code unless you have multi-gpus
but the power bills are 📈 📈 📈
just train it at the night 🙂
it draws too much power to be plugged into one circuit
so i have an extension cord to have one of the psus connected to a different circuit
cus it has 2 psus
nice
Your question a bit too general as it is.
On one side it looks like a classification problem. On the other hand it looks like an NPL problem "put sentences in the category"?
I'm not sure if i'm being lost in translation here but I can't conclusively determine what your goal is here. is it classification?
it's to classify sentences into sub categories
but it's ok now
i figured that it wouldnt work
its not that it wouldnt work. Its just you need to clearly define what you want lol
sounds more like an NPL kind of problem though.
NLP*
any ideas why this mnist dataset has more than the expected 70,000 rows? https://www.openml.org/d/554
nvm i'm dumb
thanks for checking it. i just realised that it is 70,000 but i was also counting the metadata lines
oh
that's why i'm dumb 😛
numberofinstances is 70,000 == number of rows. number of columns is 785
well
this is why i just gonna stop trying it
i wanted it to work like a personal assistant
the problem is
it wouldnt understand the way to execute the commands
such as
turn on my fan
every command has to be coded
i can actually use ml for the part i mentioned
the problem is executing my commands
oh nvm you already figured it out
@lapis sequoia it would be much helpful if you can condense your issue down to a single paragraph over one message
Is bruher oui oui baguette
For those who are into web scraping and selenium.
I'm reading this book, and I need to have Gecko to run some code.
I've been searching how to download Gecko driver but I couldn't find much information like the .exe file itself.
I already have Firefox installed on my laptop, so do I just use the Firefox installation as the driver or is there a way to download Gecko driver's.exe file?
Thanks.
Also, I'm on windows.
I think this is what you want: https://github.com/mozilla/geckodriver/releases
find the win zip (either 32 or 64 based on your PC, most likely 64)
Okay man, Thanks.
ive told the dataframe to add 10 to every number in the 0th index
why does it add it to the first index

how can i make sense of Stanford's Stanza outputs?
Impressive
That dataframe is the least explicative table i've ever seen in my life lol
its the worst
also interesting note
if you try to read a csv on google colab before its fully uploaded, it still reads it
it just reads the rows it has currently
so like, half the data 
idk why colab doesnt return an error
was supposed to be ~400k rows but only 200k were in the dataframe

Which pdf library should one use to read PDFs?
is there a "best one" or standard one most folks use?
pypdf is nice
also you know what they should make in the future?
some kind of way to estimate how long a model will take to train
based on how big the dataset is, etc., etc.
im here staring at the rotating circle
like

I'm looking to extract text from it. scrape information from a PDF. like say if it were a form
pypdf
I'm trying it right now and I'm not finding it working well
I'm about to check out PDFminer.six
best wishes to you. I found working with PDF a pain in the ass
i literally converted it to a different file type because it was easier lol
I used this
but if your data is sensitive...well, no idea tbh
um im training a model rn so...lol
also i have a meeting with the ta where im supposed to show this model

thanks. the model wont train in time bc i waited last minute

look into ocr maybe
tesseract was the library we used for our team project
.extractText() just doesn't seem to work
OCR sounds like overkill if the text is in there somewhere
might be
but I can definitely highlight the text
in the pdf... so the text must be in there right?
i guess I can copy over another pdf and test it out
should be. Unfortunately i dont think i can help you anymore thou 😦
Looks like It works with another, simpler PDF.
The text seems to be broken up in some really weird points... hmm
There's definitely bits missing...
okay so pyPDF doesn't work, and when it does it misses out text
buuuut
when i use pdfminer the text comes out for both example pdfs
it works like a charm
the only issue i think is just it's a lot... harder to use
How can I use sklearn PCA with pandas table where I got string values?
car model ValueError: could not convert string to float: 'alfa-romero'
actually it turned out to be quite simple
you need to convert the categorical column to numbers
for example you can use sklearn's OneHotEncoder
I admire your resilience lol.
PDFs are unbearable
The PCA of categorical data is weird, you might also just want to not include that in your PCA.
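A minimal sketch of the two options mentioned above, assuming a small made-up table (the column names and values are illustrative, not the original car dataset): either run PCA on the numeric columns only, or one-hot encode the string column first so everything is numeric.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for the car table; only 'make' holds strings.
df = pd.DataFrame({
    "make": ["alfa-romero", "audi", "bmw", "audi", "bmw", "alfa-romero"],
    "horsepower": [111.0, 102.0, 101.0, 115.0, 121.0, 154.0],
    "curb_weight": [2548.0, 2337.0, 2395.0, 2824.0, 2710.0, 2823.0],
})

# Option 1: leave the categorical column out and run PCA on the numbers only.
numeric = df.select_dtypes(include="number")
pca_numeric = PCA(n_components=2).fit_transform(numeric)

# Option 2: one-hot encode the strings so every column is numeric.
# fit_transform returns a sparse matrix by default, hence .toarray().
encoded = OneHotEncoder().fit_transform(df[["make"]]).toarray()
features = np.hstack([numeric.to_numpy(), encoded])
pca_all = PCA(n_components=2).fit_transform(features)

print(pca_numeric.shape, encoded.shape, pca_all.shape)
```

In practice you would probably also scale the numeric columns before PCA, since PCA is sensitive to the units of each column.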
Yeah pdfminer seems pretty good.

ill have to look into this
to see if we can also use this for our project
thanks for the link
hmmm
WARNING:tensorflow:Learning rate reduction is conditioned on metric `val_acc` which is not available. Available metrics are: loss,categorical_accuracy,auc,val_loss,val_categorical_accuracy,val_auc,lr
the model is still training so...

why doesnt the TF documentation have a list of the warnings
oh well ig the learning rate reduction just wont happen
What are you building?
a basic CNN to read handwritten characters
in that case what do you do? PCA or MDS your continuous variables only?
and then later encode your categorical?
looks like you'd want to change val_acc to val_categorical_accuracy
Does anyone know how to resolve matplotlib graphing frequency on the x axis? My chart is the wrong way round
That's what I would do. You can run separate feature selection processes on categorical variables
what is a good feature selection process for categoricals? PCA, MDS and t-SNE probably wouldnt work with those right?
I'm weak on that side of statistics but generally hypothesis testing type methods which you'd use to figure out the difference between two scientific testing groups are what's relevant. Chi squared, rank correlation, etc.
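A rough sketch of the chi-squared idea, assuming SciPy and pandas are available and using made-up data (the `color` and `label` columns are hypothetical): build a contingency table of a categorical feature against the target and test whether the two look independent.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up data: does the label depend on the categorical feature 'color'?
df = pd.DataFrame({
    "color": ["red", "red", "blue", "blue", "red", "blue", "red", "blue"] * 5,
    "label": ["yes", "yes", "no", "no", "yes", "no", "no", "yes"] * 5,
})

# Contingency table of counts: rows are feature values, columns are labels.
table = pd.crosstab(df["color"], df["label"])

# chi2_contingency returns the statistic, the p-value, the degrees of
# freedom, and the expected counts under independence.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2={chi2:.3f}, p={p_value:.4f}, dof={dof}")
```

A small p-value suggests the feature and the label are not independent, which is a reason to keep the feature. It says nothing about how useful the feature is in combination with others.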
Chi squared? I have never used X2 like that. Interesting. I'll read about it. thanks!
It's just like usual hypothesis testing honestly
ok let me try it. thanks
you didnt paste your code at all

your problem could be different things
maybe the data isnt sorted right
maybe your variable names are wrong
etc.
what do people do when their model is still training
just chill
grab lunch?
etc.

The last time i did hypothesis testing of categorical variables was in college. RIP
time to Khan Academy 🃏
theres a good open source textbook
if youre a textbook guy
@exotic maple chapter 6 https://www.openintro.org/book/os/
OpenIntro's mission is to make educational products that are free, transparent, and lower barriers to education. We're a registered 501(c)(3) nonprofit.
i just read chapters 5 and 6 and they were pretty decently written
on the shorter side for textbooks too
or ig you could watch the videos too
if youre a video guy


honestly khan academy is pretty good; i just need a refresher
ye
the book is just if you need more specifics ig
i make my students that i tutor watch khan academy

that's an improvement over college teachers that give you a book and say "let me know if you dont get something" 
I emailed my writing professor like a week ago about an assignment bc I couldn’t find it
She never even responded

when youre waiting for a model to train for your project, so you end up working on your other project

moral of the story:
dont procrastinate kids

During training I think about the model and then realize that I have a (logic / not immediately noticeable) bug and all that time spent was for nothing.
Or the model is not doing well and I can't tell if it's a bug or if my experimental model is just bad.
nah bro
the best is when you perform a gridsearch with 3 parameters, 5 variations for each parameter, but you forget to set the scoring to something other than accuracy (which is what i needed)
and you waste 1 hour of your life watching your PC burn
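For reference, a small sketch of setting the metric up front, assuming scikit-learn's `GridSearchCV` with an `SVC` on synthetic data (the parameter grid and the choice of `f1_macro` are only examples):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small synthetic, imbalanced dataset so the example runs quickly.
X, y = make_classification(
    n_samples=200, n_features=8, weights=[0.8, 0.2], random_state=0
)

param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}

# Without `scoring`, GridSearchCV falls back to the estimator's own score
# method, which is accuracy for classifiers.
search = GridSearchCV(SVC(), param_grid, scoring="f1_macro", cv=3)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Running a tiny grid like this first is a cheap way to confirm the scoring is what you want before starting the hour-long search.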
my friend just sent me the greatest piece of heresy ive ever seen
import pandas as np
import numpy as pd
@exotic maple I'm going to secretly execute pd, np = np, pd under the hood to switch it back. I will not be mocked!
This is my pandas dataframe
```
image label
0 [146, 151, 156, 158, 160, 154, 144, 131, 127, ... butterfly
1 [156, 156, 156, 157, 158, 159, 160, 160, 160, ... butterfly
2 [146, 198, 172, 200, 168, 226, 186, 183, 192, ... butterfly
3 [66, 57, 53, 51, 42, 63, 95, 123, 139, 121, 77... butterfly
4 [48, 110, 212, 226, 232, 248, 144, 186, 190, 1... butterfly
... ... ...
26174 [157, 150, 122, 131, 149, 151, 162, 190, 132, ... squirrel
26175 [128, 119, 99, 73, 58, 132, 108, 110, 97, 88, ... squirrel
26176 [172, 162, 151, 108, 109, 115, 132, 174, 183, ... squirrel
26177 [16, 89, 112, 109, 65, 30, 46, 97, 98, 118, 8,... squirrel
26178 [97, 106, 94, 101, 55, 121, 129, 77, 35, 18, 8... squirrel
[26179 rows x 2 columns]
```
All the `images` are 1d arrays of length 2500 (all same length, type="uint8"). I made my X and y like this:
```py
X = df['image']
y = df['label']
```
and I am trying to use an `sklearn.svm.SVC()` model, and this is the error I get:
```py
model = SVC()
model.fit(X, y)
```
TypeError: only size-1 arrays can be converted to Python scalars
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\Users\sohan\OneDrive\Documents\ProgrammingProjects\ImageClassification\main.py", line 22, in <module>
model.fit(X_train, y_train)
File "C:\Users\sohan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\sklearn\svm\_base.py", line 160, in fit
X, y = self._validate_data(X, y, dtype=np.float64,
File "C:\Users\sohan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\sklearn\base.py", line 432, in _validate_data
X, y = check_X_y(X, y, **check_params)
File "C:\Users\sohan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pandas\core\arrays\numpy_.py", line 211, in __array__
return np.asarray(self._ndarray, dtype=dtype)
File "C:\Users\sohan\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\numpy\core\_asarray.py", line 102, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
Can someone please help me with this error
@twilit pilot do you understand what is meant by TypeError: only size-1 arrays can be converted to Python scalars?
Refer to this:
>>> import numpy as np
>>> arr = np.array([1])
>>> arr
array([1])
>>> int(arr)
1
>>> arr = np.array([1, 2, 3])
>>> int(arr)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: only size-1 arrays can be converted to Python scalars
>>> arr = np.array([[1]])
>>> int(arr)
1
i get that only size-1 arrays can be converted to Python scalars, but to fit my data, obv my X won't be a size-1 array
i love how you got all the lurkers to come out with that comment

think about what shape it needs to be
vs what shape it is
well i want it to be a 1d array, and it is a 1d array
@velvet thorn does it have to be a 2d array?
How I can add a link to a word?
...why would you want X to be 1D
its still the same issue if i convert to a 2d array (50x50)
show code
ok
wait
@velvet thorn ```py
import os
import cv2
import pickle
import numpy as np
import pandas as pd
from sklearn.svm import SVC
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
# Load the dataset
with open('data/dataframe.txt', 'rb') as infile:
    df = pickle.load(infile)
"""
image label
0 [[146, 151, 156, 158, 160, 154, 144, 131, 127,... butterfly
1 [[156, 156, 156, 157, 158, 159, 160, 160, 160,... butterfly
2 [[146, 198, 172, 200, 168, 226, 186, 183, 192,... butterfly
3 [[66, 57, 53, 51, 42, 63, 95, 123, 139, 121, 7... butterfly
4 [[48, 110, 212, 226, 232, 248, 144, 186, 190, ... butterfly
... ... ...
26174 [[157, 150, 122, 131, 149, 151, 162, 190, 132,... squirrel
26175 [[128, 119, 99, 73, 58, 132, 108, 110, 97, 88,... squirrel
26176 [[172, 162, 151, 108, 109, 115, 132, 174, 183,... squirrel
26177 [[16, 89, 112, 109, 65, 30, 46, 97, 98, 118, 8... squirrel
26178 [[97, 106, 94, 101, 55, 121, 129, 77, 35, 18, ... squirrel
[26179 rows x 2 columns]
"""
# X, y, X_train, X_test, y_train, y_test
X = df['image']
y = df['label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Creating and training model
model = SVC()
model.fit(X_train, y_train) # <-- Error here
print(model.score(X_test, y_test))
Hello Everyone, I'm new to ML and NumPy, I've a basic doubt posted at #help-cupcake , It'll be nice if you can help me out, Thank You.
Flatten the input.
thats what i tried earlier and that also had the same issue
i mentioned it above
I don't see it in your code posted though.
@iron basalt I didn't post the code, but here is what the pandas dataframe looked like
Which scikit version number?
'0.23.2'
oh uh, I never gave a df directly to scikit-learn before, seems incorrect.
converting it to numpy gave the same error
this has become very frustrating for me now 😅
Might as well upgrade to 0.24 first to make sure
ok ill try that although there shouldnt be that big of a difference
How are you converting the columns to a numpy array
Your code posted is missing a bunch of things.
@twilit pilot
@iron basalt I didn't show it in the code, but the pandas.Dataframe has a .to_numpy() method
Show the real code
can you show which line is causing the error?
the first occurence of the exception i mean.
@fading burrow its the second to last line
this was the error
import os
import cv2
import pickle
import numpy as np
import pandas as pd
from sklearn.svm import SVC
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
# Load the dataset
with open('data/dataframe.txt', 'rb') as infile:
    df = pickle.load(infile)
"""
image label
0 [146, 151, 156, 158, 160, 154, 144, 131, 127,... butterfly
1 [156, 156, 156, 157, 158, 159, 160, 160, 160,... butterfly
2 [146, 198, 172, 200, 168, 226, 186, 183, 192,... butterfly
3 [66, 57, 53, 51, 42, 63, 95, 123, 139, 121, 7... butterfly
4 [48, 110, 212, 226, 232, 248, 144, 186, 190, ... butterfly
... ... ...
26174 [157, 150, 122, 131, 149, 151, 162, 190, 132,... squirrel
26175 [128, 119, 99, 73, 58, 132, 108, 110, 97, 88,... squirrel
26176 [172, 162, 151, 108, 109, 115, 132, 174, 183,... squirrel
26177 [16, 89, 112, 109, 65, 30, 46, 97, 98, 118, 8... squirrel
26178 [97, 106, 94, 101, 55, 121, 129, 77, 35, 18, ... squirrel
[26179 rows x 2 columns]
"""
# X, y, X_train, X_test, y_train, y_test
X = df['image'].to_numpy()
y = df['label'].to_numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Creating and training model
model = SVC()
model.fit(X_train, y_train) # <-- Error here
print(model.score(X_test, y_test))
```
It's essentially the same thing
Where is the flatten?
its already flattened
it is let me print it
So you flattened it and loaded the flattened one
it seems like your function is getting an array when it should be getting numbers
if your data is (1,2,2), then the function will receive arrays instead of numbers
what does x and y print like after becoming a numpy array.
[array([146, 151, 156, ..., 149, 144, 142], dtype=uint8)
array([156, 156, 156, ..., 141, 142, 141], dtype=uint8)
array([146, 198, 172, ..., 204, 192, 190], dtype=uint8) ...
array([172, 162, 151, ..., 199, 199, 198], dtype=uint8)
array([ 16, 89, 112, ..., 240, 240, 240], dtype=uint8)
array([ 97, 106, 94, ..., 253, 253, 253], dtype=uint8)]
that is X
and y?
wait
['butterfly' 'butterfly' 'butterfly' ... 'squirrel' 'squirrel' 'squirrel']
@iron basalt @fading burrow
dtype?
and y?
numpy arrays have multiple ways to store strings, idr if scikit-learn accepts all of them
the error is probably x, should look more like this: [[...], [...], ...] but you have something strange going on with [ndarray(...), ...]
i don't think that's the issue but you can try
yea ill try rn
I'm just trying to match everything as much as possible.
i can send you the whole thing if you want
so you can test it on your own computer
@iron basalt
sure
ok gimme a little while
@iron basalt here is the python code https://github.com/sohan-py/testhelp/blob/main/main.py and here is the dataset https://github.com/sohan-py/testhelp/blob/main/help_dataframe.txt
its late for me rn, so i will head for bed. you can try running the code on your computer or editing it to make it work. Good Night!
I'm pretty sure the error is that X thing

yea im pretty sure its there too
all the examples have a different kind of input for X
at least for sklearn svm svc

The lesson here is to not store images as arrays in pandas, usually people have the dataset in some other format
what format
like let's say you want to load mnist
you just use an mnist loader
If you have images and are making your own dataset
and yes i remember the image dataset i was working with was NOT stored with np arrays

instead of storing the image data in the table, store paths to the image files, and use those to load all of them
combine into one thing
but when i put into model, i need numpy array
Or do what many datasets like mnist do and store multiple images in one file
makes it easier
yea you do, and you still can
you just need some extra work now to make this X something like array([[...], [...], ...], dtype=np.uint8)
so an np array of regular arrays?
yup got it running
import pickle
import numpy as np
import pandas as pd
from sklearn.svm import SVC
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
# Load the dataset
with open('help_dataframe.txt', 'rb') as infile:
    df = pickle.load(infile)
"""
image label
0 [146, 151, 156, 158, 160, 154, 144, 131, 127,... butterfly
1 [156, 156, 156, 157, 158, 159, 160, 160, 160,... butterfly
2 [146, 198, 172, 200, 168, 226, 186, 183, 192,... butterfly
3 [66, 57, 53, 51, 42, 63, 95, 123, 139, 121, 7... butterfly
4 [48, 110, 212, 226, 232, 248, 144, 186, 190, ... butterfly
... ... ...
26174 [157, 150, 122, 131, 149, 151, 162, 190, 132,... squirrel
26175 [128, 119, 99, 73, 58, 132, 108, 110, 97, 88,... squirrel
26176 [172, 162, 151, 108, 109, 115, 132, 174, 183,... squirrel
26177 [16, 89, 112, 109, 65, 30, 46, 97, 98, 118, 8... squirrel
26178 [97, 106, 94, 101, 55, 121, 129, 77, 35, 18, ... squirrel
[26179 rows x 2 columns]
"""
# X, y, X_train, X_test, y_train, y_test
X = df['image'].to_numpy()
X = np.stack(X) # <- this stacks all the arrays in the array creating a 2d array
y = df['label'].to_numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Creating and training model
model = SVC()
model.fit(X_train, y_train)  # no longer errors now that X is a 2d array
print(model.score(X_test, y_test))
Now if you print X you see something like:
[[123 125 123 ... 137 108 132]
[ 96 104 101 ... 125 157 83]
[147 143 147 ... 133 139 150]
...
[231 232 227 ... 141 151 177]
[225 222 217 ... 199 201 199]
[115 116 116 ... 71 68 110]]
so a 2d array what shape?
let me give you an example
ok
>>> import numpy as np
>>> a = np.array([np.arange(3), np.arange(3), np.arange(3)])
>>> a
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
>>> a = np.empty(3, object)
>>> a[:] = [np.arange(3), np.arange(3), np.arange(3)]
>>> a
array([array([0, 1, 2]), array([0, 1, 2]), array([0, 1, 2])], dtype=object)
>>> np.stack(a)
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
>>>
got it now?
pandas to_numpy gives you a 1d array with dtype=object, where each cell's entry is its own numpy array
Always check the types
thanks to you i also learned something new thanks man!
Type errors in python can trickle down to later parts of the code
It's one of the big issues with dynamically typed languages
i see
They are convenient, but can cause more errors.
yes true

Or more specifically, errors in other parts of the code, even though the error is elsewhere (error propagation)
So you as a python programmer need to error backprop 😉
got it man
@iron basalt Thanks for taking your time to fix my error man I really appreciate it! Have a great rest of your day!
This is also a software engineering issue on scikit-learn's end, it should have assertions for the dtype of the inputs and shapes. As it can only really work with a certain subset of all dtypes possible (and shapes).
The assertion would have triggered and made the issue immediately obvious.
Assertions == good, use them everywhere to check things (check pre-conditions).
Yea I don't see any of the code dealing with the input being "object", only when it's specifically a string object. Just "object" will pass all checks and so it will error later on in numpy (looking inside the fit function on github).
The better approach would be to flip it around. Only allow a specific set of types, instead of having a bunch of checks for different ones and do conversions and stuff just for those (less error prone).
(Overly generic code, while not actually being fully generic)
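To make the "check pre-conditions" point concrete, here is a small sketch of a check you could run yourself before calling `fit`. The helper name `check_features` is made up for illustration and is not part of scikit-learn.

```python
import numpy as np

def check_features(X):
    """Hypothetical pre-condition check to run before model.fit(X, y)."""
    X = np.asarray(X)
    assert X.dtype != object, f"expected a numeric array, got dtype={X.dtype}"
    assert X.ndim == 2, f"expected shape (n_samples, n_features), got {X.shape}"
    return X

# An object array of arrays, like df['image'].to_numpy() in the thread.
ragged = np.empty(3, dtype=object)
for i in range(3):
    ragged[i] = np.arange(4, dtype=np.uint8)

try:
    check_features(ragged)
except AssertionError as err:
    print("caught:", err)

# np.stack turns it into a regular 2d array that passes the check.
X = check_features(np.stack(ragged))
print(X.shape, X.dtype)
```

The failure now points at the dtype directly instead of surfacing later as "setting an array element with a sequence".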

guys
i have a question: why do we use scaling on a data set? if my data is skewed and has outliers, it's not like scaling would get rid of them, right?
there are scalers like StandardScaler and MinMaxScaler
can anyone help me?
Scaling isn't used to deal with outliers per se, or to deal with skewness. So your question needs to simply boil down to, what does scaling do.
And the answer there is, it depends on your model architecture. So tree based models don't necessarily need scaling since they create decision points from the data itself
But for deep learning and linear regression scaling offers unique benefits. For deep learning it helps with model fitting because the activation functions operate best within a certain range, and the weights are also initialized with a certain expectation for input scales*
For linear regression, scaling allows you to meaningfully compare coefficients across two features, which you can't do without scaling.
There might be other reasons too, but those are the ones I can think of off the top of my head
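A quick sketch of what the two scalers mentioned in the question actually do, assuming scikit-learn and a made-up 5x2 array with one extreme row:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two made-up features on very different scales, with an outlier in the last row.
X = np.array([[1.0, 1000.0],
              [2.0, 2000.0],
              [3.0, 3000.0],
              [4.0, 4000.0],
              [50.0, 90000.0]])

standardized = StandardScaler().fit_transform(X)  # zero mean, unit variance per column
min_maxed = MinMaxScaler().fit_transform(X)       # each column mapped to [0, 1]

print(standardized.round(2))
print(min_maxed.round(2))
```

The last row is still the extreme one after either transform. Scaling changes the units of each column, not the shape of its distribution, which is why it does not deal with outliers or skewness.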
If I were to use a box-cox transformation, is it possible to reconvert the results back to the original units? If so how could I do that?
@ripe forge somewhat related question
would this be it?
Yes.
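The snippet being confirmed above isn't preserved in the log, so this is only one possible way to do the round trip, assuming SciPy: `scipy.stats.boxcox` fits and returns the lambda, and `scipy.special.inv_boxcox` maps values back using that same lambda. The data is made up.

```python
import numpy as np
from scipy.special import inv_boxcox
from scipy.stats import boxcox

# Made-up positive, right-skewed values (Box-Cox requires data > 0).
original = np.array([1.2, 3.4, 0.8, 10.5, 2.2, 7.9, 1.1, 25.0])

# With no lambda given, boxcox fits one and returns it with the transformed data.
transformed, fitted_lambda = boxcox(original)

# inv_boxcox needs the same lambda to map values back to the original units.
recovered = inv_boxcox(transformed, fitted_lambda)

print(np.allclose(recovered, original))
```

The same inverse applies to model predictions made on the transformed scale, as long as you keep the fitted lambda around.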
awesome sorry for bothering you for nothing 😘
@tidal bronze for that you need to keep a copy of the dataset so you won't lose it accidentally, or else you can restart the kernel and re-run the cell
@ripe forge i see, and does it work on adaboost and xgboost? or other ensemble models?
thanks for the tip
df.loc[df_pl['placement_ts'] >= pd.to_datetime('2018')]
error Invalid comparison between dtype=datetime64[ns, pytz.FixedOffset(330)] and Timestamp
help please!
it seems your two columns are not using the same datetime format
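The error message suggests the column is timezone-aware (a fixed +05:30 offset) while `pd.to_datetime('2018')` is timezone-naive. A minimal sketch of one way around that, using made-up timestamps and assuming you want the cutoff interpreted in the column's own timezone:

```python
import pandas as pd

# Made-up timestamps with a +05:30 offset, similar to FixedOffset(330).
df_pl = pd.DataFrame({
    "placement_ts": pd.to_datetime([
        "2017-06-01 10:00:00+05:30",
        "2018-03-15 09:30:00+05:30",
        "2019-01-20 18:45:00+05:30",
    ])
})

# Give the cutoff the same timezone as the column before comparing.
tz = df_pl["placement_ts"].dt.tz
cutoff = pd.Timestamp("2018-01-01").tz_localize(tz)

recent = df_pl.loc[df_pl["placement_ts"] >= cutoff]
print(recent)
```

The other direction also works: drop the timezone from the column with `.dt.tz_localize(None)` and compare against a naive timestamp.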
The bigger benefit for DL is that gradient updates treat different dimensions more equally. A loss landscape where one variable goes from 1e-4 to 2e-4 and another goes from 0 to 10000 is going to be weird. A step size on the scale of 1 isn't going to make much difference on one variable while being a huge leap on the other
Scaling makes it so that the size of gradient updates is more fair to all dimensions
Otherwise your large steps combined with some gradient noise would make optimisation incredibly hard for many smaller scaled dimensions, or smaller learning rates wouldn't make meaningful progress on other dimensions
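A small numeric illustration of the point about gradient sizes, using a plain linear model with mean squared error in NumPy. The data, the feature ranges, and the helper `mse_gradient` are all made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Made-up features: one tiny in scale, one huge.
small = rng.uniform(1e-4, 2e-4, size=n)
large = rng.uniform(0.0, 10000.0, size=n)
X = np.column_stack([small, large])
y = rng.normal(size=n)

def mse_gradient(X, y, w):
    """Gradient of mean squared error for a linear model y_hat = X @ w."""
    residual = X @ w - y
    return 2.0 * X.T @ residual / len(y)

w = np.ones(2)
raw_grad = mse_gradient(X, y, w)

# Standardize each column to zero mean and unit variance, then recompute.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
scaled_grad = mse_gradient(X_scaled, y, w)

print("raw gradient:   ", raw_grad)
print("scaled gradient:", scaled_grad)
```

On the raw features the two gradient components differ by many orders of magnitude, so no single learning rate suits both. After standardizing they are of comparable size.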
using seaborn how could I make the last bars darker?
i see, thanks !!
Hello
I want to retrieve tables from html file.
My code : https://paste.pythondiscord.com/moyiyowatu.py
Running this python script will copy the html code from the url mentioned in the code in current directory with name = f'{today_date}.html'
After saving the code, I want the table of that copied html file.
Sounds like you want to change the aspect ratio.

u know how i can do it on a lineplot graphic on seaborn?
Hello everyone, need a little help here
am trying to manipulate discord profile pics ran into a problem
how do i read gif's from Url's using cv and urllib
You can do so on the Axis object lineplot returns:
https://matplotlib.org/stable/api/axes_api.html#aspect-ratio
.set_aspect(1/2) for 1:2 height:width
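A short sketch of that call, using plain matplotlib with made-up data. `seaborn.lineplot` returns a matplotlib Axes, so the same method should apply to its return value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the example runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# With a number, one unit on the y-axis is drawn 0.5 times as long
# as one unit on the x-axis.
ax.set_aspect(1 / 2)

print(ax.get_aspect())
```

Note that the number is a ratio between data units, so the final shape of the plot box also depends on the axis limits. Trying a few values is the practical way to find one that looks right.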
how i can know the right proportion?
WDYM right?
test it out
how do i combine all frames and save them as Gif?
What library are you using? PIL has a guide on that, IIRC.
if asking me cv2
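For the saving part, a minimal sketch with Pillow (not cv2), assuming the frames are already available as PIL images. The three solid-colour frames here are placeholders.

```python
from io import BytesIO
from PIL import Image

# Made-up frames: three solid-colour RGB images standing in for real frames.
frames = [Image.new("RGB", (32, 32), color) for color in ("red", "green", "blue")]

buffer = BytesIO()  # a file path such as "out.gif" works here too
frames[0].save(
    buffer,
    format="GIF",
    save_all=True,             # write every frame, not just the first
    append_images=frames[1:],  # the remaining frames, in order
    duration=100,              # milliseconds per frame
    loop=0,                    # 0 means loop forever
)

# Read it back to confirm all frames were written.
buffer.seek(0)
gif = Image.open(buffer)
frame_count = gif.n_frames
print(frame_count)
```

If the frames come from cv2 as NumPy arrays, they are in BGR order, so something like `Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))` would be needed first.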
hello im new to machine learning and i just had a few doubts about a project that im doing
not really understanding the final output of the project (involves svm)
help?
looks like it draws the plot first, centers it, and only then puts the legend over the plot. i'll see if there's a way to put the legend on the side, maybe that fixes it
i need help
...why are you pinging me here instead of, say, writing your question in a help channel, or anywhere?
How may i help you ?
@hoary wigeon can you see the error in this code?
I mean there is an error in the last command
hello
Hye there.

:D
you should copy and paste instead of taking a picture like that
import speech_recognition as sr
import pyttsx3
import pywhatkit
import datetime
import wikipedia
listener = sr.Recognizer()
engine = pyttsx3.init()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)
def talk(text):
engine.say(text)
engine.runAndWait()
def take_command():
try:
with sr.Microphone() as source:
print('listening...')
voice = listener.listen(source)
command = listener.recognize_google(voice)
command = command.lower()
if 'alexa' in command:
command = command.replace('alexa', '')
print(command)
except:
pass
return command
def run_alexa():
command = take_command()
print(command)
if 'play' in command:
song = command.replace('play', '')
talk('playing ' + song)
pywhatkit.playonyt(song)
elif 'time' in command:
time = datetime.datetime.now().strftime('%I:%M %p')
talk('Current time is ' + time)
elif 'who the heck is' in command:
person = command.replace('who the heck is', '')
info = wikipedia.summary(person, 1)
print(info)
talk(info)
elif 'date me' in command:
talk('fucker')
elif 'are you single' in command:
talk('I am in a relationship with wifi')
elif 'joke' in command:
talk(pyjokes.get_joke())
else:
talk('Please say the command again.')
while True:
run_alexa()
:O
elp
@pájthon
D:\Users\Nadia\pyton\python.exe D:/Users/Nadia/PycharmProjects/alexa.py/mine_alexa.py
listening...
Traceback (most recent call last):
File "D:\Users\Nadia\PycharmProjects\alexa.py\mine_alexa.py", line 59, in <module>
run_alexa()
File "D:\Users\Nadia\PycharmProjects\alexa.py\mine_alexa.py", line 36, in run_alexa
if 'play' in command:
TypeError: argument of type 'NoneType' is not iterable
None
Process finished with exit code 1
errors
@raven halo what are you talking about? please keep this channel on-topic
are you trying to make a program for Alexa?
if 'play' in command:
TypeError: argument of type 'NoneType' is not iterable
This implies immediately that command is None.
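A stripped-down sketch of how the `None` case could be handled, with the speech recognition replaced by a plain callable so it runs anywhere. The `recognize` parameter and the `handle` function are stand-ins, not part of the original script or of `speech_recognition`.

```python
def take_command(recognize):
    """Return the recognized command in lowercase, or None if recognition fails."""
    try:
        command = recognize().lower()
    except Exception:
        # Recognition failed (no speech, network error, ...): signal it explicitly.
        return None
    return command.replace('alexa', '').strip()

def handle(command):
    """Decide what to say; check for None before using `in` on the command."""
    if command is None:
        return 'Please say the command again.'
    if 'play' in command:
        return 'playing ' + command.replace('play', '').strip()
    return 'Please say the command again.'

def broken_microphone():
    raise RuntimeError('no speech detected')

print(handle(take_command(lambda: 'Alexa play jazz')))
print(handle(take_command(broken_microphone)))
```

The key change is that a failed recognition returns `None` on purpose and the caller checks for it, instead of a bare `except: pass` hiding the failure until `'play' in command` blows up.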
yes
have you made a web app of any kind before? if you want it to run on Alexa devices, the actual AI component is abstracted away by Amazon.
so this is a web development question, and you can do it using the flask framework for Alexa: https://flask-ask.readthedocs.io/en/latest/
https://www.kaggle.com/fedesoriano/company-bankruptcy-prediction
im using this dataset and training a model using svm
since im new i dont exactly know what the output im getting is
can anyone help?
Hello. I searched and think I've selected the right topic for this question. I'm new to Python and wrote a while loop that ultimately produces a number stored in a variable called 'cycles' after each round of cycles is completed. How can I store these 'cycles' values and then print them after all rounds are completed? Here's what I have so far, but I keep getting an array with the same values for each cycle, ex. [7, 7, 7]
results = []
for i in range(totalGames):
results.append(cycles)
If cycles is a list, I presume you're changing the same object each round, so your results list ends up with many references to the same list you've been changing all along.
What you probably want instead is to store a copy of the current cycles:
results.append(cycles.copy())
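A tiny demonstration of the difference, assuming `cycles` is a list that keeps being mutated (the loop and values are made up):

```python
cycles = []
results_shared = []
results_copied = []

for game in range(3):
    cycles.append(game)                   # the same list object is mutated each round
    results_shared.append(cycles)         # stores another reference to that one list
    results_copied.append(cycles.copy())  # stores a snapshot of its current contents

print(results_shared)
print(results_copied)
```

Every entry in `results_shared` is the same list, so they all show its final state, while `results_copied` keeps what `cycles` looked like at the end of each round.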
Add that copy outside of the while loop?
Not sure what you mean.







