#data-science-and-ml | Python | Page 79

mild dirge Aug 25, 2023, 1:53 PM

#

This is the original, not the cleaned one @tidal bough

latent remnant Aug 25, 2023, 1:53 PM

#

i should run this?

jaunty falcon Aug 25, 2023, 1:55 PM

#

hi

left tartan Aug 25, 2023, 1:56 PM

#

latent remnant i should run this?

Seems like all you want is:
```py
from io import StringIO
import pandas as pd

s = StringIO("""Team,Player,Tournament,Matches,Batting Innings,Not Out,Runds Scored,Highest Score,Batting Average,Balls Faced,Batting Strike Rate,100,50,0,4s,6s,Bowling Innings,Overs Bowled,Maidens Bowled,Runs Conceded,Wickets Taken,Best Bowling Figures,Bowling Average,Bowling Economy Rate,Bowling Strike Rate,4+ Innings Wickets,5+ Innings Wickets,Catches Taken,Stumpings Made
Delhi Daredevils,CH Morris,IPL 2016,12,7,4,195,82*,65,109,178.89,0,1,1,15,12,12,44,0,308,13,Feb-30,23.69,7,20.3,0,0,8,0""")

data = pd.read_csv(s)
team_2019 = data[data["Tournament"] == "IPL 2019"]
team_2019_newdf = team_2019[["Player", "Team", "Matches", "Batting Average", "Batting Strike Rate", "Bowling Innings", "Bowling Average", "Bowling Economy Rate"]]
team_2019_newdf.to_csv("newfile.csv")
```

latent remnant Aug 25, 2023, 1:58 PM

#

left tartan Seems like all you want is: ```py from io import StringIO import pandas as pd s...

yes yes, these columns are the ones which i need. but how do i add the data?

left tartan Aug 25, 2023, 1:58 PM

#

Change the "s" in the read_csv to your file name

#

The problem with your original code was you had brackets around the series...
```py
d = {
    "Player": player_2019,
    "Team": player_2019,
    "Batting Innings": player_batting_innnings,
    "Batting Average": player_batting_avg,
    "Batting Strike Rate": player_strikerate,
    "Bowling Innings": player_bowling_innings,
    "Bowling Average": player_bowling_average,
    "Bowling Economy Rate": player_bowling_eco
}
```

latent remnant Aug 25, 2023, 1:59 PM

#

left tartan Change the "s" in the read_csv to your file name

aighttttt

latent remnant Aug 25, 2023, 1:59 PM

#

left tartan The problem with your original code was you had brackets around the series... ``...

the course which taught me, said to use dictionary to create the new csv file
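
For reference, the dictionary approach does work as long as each value is the Series itself rather than a list wrapping it. A minimal sketch with made-up player data; the variable names here are hypothetical and not the ones from the original notebook:

```python
import pandas as pd

# Hypothetical stand-in for the filtered tournament data.
data = pd.DataFrame({
    "Player": ["A Batter", "B Bowler", "C Keeper"],
    "Team": ["Team X", "Team X", "Team Y"],
    "Batting Average": [41.2, 12.5, 28.0],
})

player = data["Player"]
batting_avg = data["Batting Average"]

# Pass each Series directly (no surrounding brackets): pandas aligns them
# on their shared index and produces one row per player.
d = {
    "Player": player,
    "Batting Average": batting_avg,
}
new_df = pd.DataFrame(d)

# With no path given, to_csv returns the CSV text; index=False drops the row index.
csv_text = new_df.to_csv(index=False)
print(csv_text)
```

Writing `[player]` instead of `player` hands pandas a one-element list, which is most likely why the original output did not come out as one row per player.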

latent remnant Aug 25, 2023, 2:00 PM

#

left tartan The problem with your original code was you had brackets around the series... ``...

ohh i see the difference

left tartan Aug 25, 2023, 2:02 PM

#

also, in jupyter, use "display(df)" instead of "print(df)", it'll look nicer.

latent remnant Aug 25, 2023, 2:02 PM

#

left tartan also, in jupyter, use "display(df)" instead of "print(df)", it'll look nicer.

okk

#

can i ask what stringIO does?

left tartan Aug 25, 2023, 2:02 PM

#

it lets me treat a string like a file/io object
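
A small sketch of that idea using only the standard library. The CSV text and the second player name are made up for illustration:

```python
import csv
from io import StringIO

# A CSV "file" that only exists as a string in memory.
text = "Player,Matches\nCH Morris,12\nSome Player,7\n"

# StringIO wraps the string in a file-like object (read(), readline(), iteration).
buffer = StringIO(text)

# Anything that expects an open text file can read from it. csv.DictReader is
# used here; pd.read_csv accepts the same kind of object.
rows = list(csv.DictReader(buffer))
print(rows[0]["Player"], rows[0]["Matches"])
```

That is why swapping `s` for a real file name in `read_csv` is the only change needed once the data lives on disk.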

latent remnant Aug 25, 2023, 2:03 PM

#

left tartan it lets me treat a string like a file/io object

greatt, thank you soo much brother!, i was stuck on this for the past two days.
also, can you recommend some books or courses for this area of python

left tartan Aug 25, 2023, 2:04 PM

#

latent remnant greatt, thank you soo much brother!, i was stuck on this for the past two days. ...

For data stuff? https://www.kaggle.com/learn is pretty good

latent remnant Aug 25, 2023, 2:04 PM

#

left tartan For data stuff? <https://www.kaggle.com/learn> is pretty good

aight thankss

past meteor Aug 25, 2023, 2:04 PM

#

latent remnant greatt, thank you soo much brother!, i was stuck on this for the past two days. ...

Although I dislike Pandas' user guide I think it's also good to create a habit of reading the documentation of packages you use 🙂

#

9 times out of 10 I learn stuff by just reading their official documentation. If it's sparse, shady or something odds are that I won't touch the package

latent remnant Aug 25, 2023, 2:06 PM

#

Thank you so much! 😊

past meteor Aug 25, 2023, 2:06 PM

#

https://pandas.pydata.org/docs/user_guide/10min.html

https://pandas.pydata.org/docs/user_guide/index.html

compact valley Aug 25, 2023, 2:45 PM

#

is 8gb macbook pro m1 enough for data science?
I heard that i need more ram for data..?

mild dirge Aug 25, 2023, 2:46 PM

#

If you want to run large models, and within a certain time limit, you probably want a desktop, or just a simple laptop and use servers to run the models @compact valley

#

I wouldn't recommend a laptop to run any big models, but you can use services like Google Colab to run them on their servers

compact valley Aug 25, 2023, 2:47 PM

#

I have a desktop with 32gb ram

#

my company would also pay for cloud i guess thats an option here

mild dirge Aug 25, 2023, 2:48 PM

#

32GB is enough, and most important is gpu for most models

#

And cpu, but that is often not the bottleneck

past meteor Aug 25, 2023, 2:48 PM

#

compact valley is 8gb macbook pro m1 enough for data science? I heard that i need more ram for ...

More than enough

#

My work laptop has 8GB ram

jaunty hinge Aug 25, 2023, 2:48 PM

#

can i get help for an ARIMA model here

past meteor Aug 25, 2023, 2:48 PM

#

99 % of my development is through SSH. I cycle to work, I don't want to carry a GPU monstrosity uphill on my bike 🙂

#

Also, my dev VM only had 4GB ram in the beginning until I annoyed IT enough to make it 8 and subsequently 64. I work with several tens of millions of rows of data. It's stupid but being constrained memory wise teaches you how to do things "properly"

#

Which matters a lot if/when you scale to terabyte size datasets
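
One possible reading of doing things "properly" under a memory limit is to stream the data and keep only running aggregates instead of loading every row. A minimal sketch of that idea; the column names and the `mean_by_key` helper are hypothetical, and `StringIO` stands in for a large file:

```python
import csv
from collections import defaultdict
from io import StringIO

def mean_by_key(lines, key, value):
    """Stream rows one at a time, keeping only running sums and counts in memory."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(lines):
        sums[row[key]] += float(row[value])
        counts[row[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}

# StringIO stands in for a large file; with real data, pass an open file handle.
sample = StringIO("sensor,reading\na,1.0\nb,4.0\na,3.0\nb,6.0\n")
result = mean_by_key(sample, key="sensor", value="reading")
print(result)
```

Memory use here grows with the number of distinct keys rather than the number of rows, which is the property that carries over to much larger datasets.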

past meteor Aug 25, 2023, 2:51 PM

#

jaunty hinge can i get help for an ARIMA model here

shoot! 🙂

jaunty hinge Aug 25, 2023, 2:55 PM

#

past meteor shoot! 🙂

start=len(train)

end=len(train)+len(test)-1

pred=model.predict(start=start,end=end,typ='levels').rename('ARIMA Predictions')

pred.plot(legend=True)

test['AvgTemp'].plot(legend=True)

When I run this code I get this error

TypeError: Model.predict() missing 1 required positional argument: 'params'

#

idk how to fix it i need it for like an essay for my highschool

past meteor Aug 25, 2023, 2:55 PM

#

Are you using statsmodels?

jaunty hinge Aug 25, 2023, 2:55 PM

#

yeah

past meteor Aug 25, 2023, 2:56 PM

#

Have you fit your model already?

jaunty hinge Aug 25, 2023, 2:56 PM

#

uhh

#

im not sure

#

this?

past meteor Aug 25, 2023, 2:58 PM

#

Yeah, you've fit it 🙂 now you can do model_fit.forecast()

#

I assume you want to do out-of-sample forecasts? (note: what people mean with out-of-sample is that you fit your model with weather data until 2023 and then you use the model to see what 2024 is like)

jaunty hinge Aug 25, 2023, 2:59 PM

#

yeah

#

how do i use like the code u mentioned because im soo new to python

finite bluff Aug 25, 2023, 3:00 PM

#

hello every one

left tartan Aug 25, 2023, 3:00 PM

#

I'm just adapting an example I had in a notebook already:
```py
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA

n_points = 100
x = np.linspace(0, 20 * np.pi, n_points)
noise = np.random.normal(0, 0.5, n_points)
y = 5 * np.sin(x / 2) + noise

p, d, q = 1, 1, 1
model = ARIMA(y, order=(p, d, q))
fit_model = model.fit()

forecast_steps = 20
forecast = fit_model.get_forecast(steps=forecast_steps)
conf_int = forecast.conf_int()

plt.figure(figsize=(12, 6))
plt.plot(y, label="Data")
plt.plot(np.arange(n_points, n_points + forecast_steps), forecast.predicted_mean, color="red", label="Forecast")
plt.fill_between(np.arange(n_points, n_points + forecast_steps), conf_int[:, 0], conf_int[:, 1], color="pink", alpha=0.3)
plt.legend()
plt.show()

print(fit_model.summary())
```

past meteor Aug 25, 2023, 3:00 PM

#

So you have your screenshot right? Make a new cell in your notebook and then simply write model_fit.forecast()

#

You need to put the amount of steps you want to forecast for in the method. Otherwise it'll default to 1.

past meteor Aug 25, 2023, 3:01 PM

#

jaunty hinge this?

Are you familiar with the "statistics" behind ARIMA by the way, how did you pick your order parameters?

left tartan Aug 25, 2023, 3:02 PM

#

(#data-science-and-ml message for a link to the original thread for this)

past meteor Aug 25, 2023, 3:02 PM

#

If it's weather data I actually think you need a SARIMA model

jaunty hinge Aug 25, 2023, 3:03 PM

#

past meteor Are you familiar with the "statistics" behind ARIMA by the way, how did you pick...

is it like this one

jaunty hinge Aug 25, 2023, 3:03 PM

#

past meteor Are you familiar with the "statistics" behind ARIMA by the way, how did you pick...

not that much

#

im just trying to get this math essay done for school

#

i hate ib

past meteor Aug 25, 2023, 3:04 PM

#

Wild they have you doing ARIMA in high school

jaunty hinge Aug 25, 2023, 3:04 PM

#

nah its a self thing

past meteor Aug 25, 2023, 3:04 PM

#

That's crazy

jaunty hinge Aug 25, 2023, 3:04 PM

#

i get to pick my topic

#

and i couldnt find anything for math

#

so i wanted to create a short term weather forecast

#

and it led me here

past meteor Aug 25, 2023, 3:04 PM

#

What's your "window"? How many days are you considering?

jaunty hinge Aug 25, 2023, 3:05 PM

#

7 days

past meteor Aug 25, 2023, 3:05 PM

#

~~Okay then you don't need SARIMA~~ you still might

#

What's the frequency of your measurements

jaunty hinge Aug 25, 2023, 3:05 PM

#

wdym

past meteor Aug 25, 2023, 3:06 PM

#

Do you have a measurement every hour? Every day, every minute?

jaunty hinge Aug 25, 2023, 3:06 PM

#

oh every day

past meteor Aug 25, 2023, 3:06 PM

#

Just one per day?

jaunty hinge Aug 25, 2023, 3:06 PM

#

yeah

#

is that bad

past meteor Aug 25, 2023, 3:06 PM

#

Okay then you don't need SARIMA at all. That's a relief

#

regular ARMA will be enough

jaunty hinge Aug 25, 2023, 3:07 PM

#

So i did what u said i got this

#

im not rlly sure i know what this is

past meteor Aug 25, 2023, 3:08 PM

#

I think you're just plotting your test data as-is now, no?

jaunty hinge Aug 25, 2023, 3:08 PM

#

i think so

past meteor Aug 25, 2023, 3:08 PM

#

When is your assignment due?

jaunty hinge Aug 25, 2023, 3:08 PM

#

is there anyway i can show u my full code

jaunty hinge Aug 25, 2023, 3:08 PM

#

past meteor When is your assignment due?

6 days :/

past meteor Aug 25, 2023, 3:09 PM

#

If you have the time I'd review basic Python and also watch some videos on AR, MA and ARIMA

jaunty hinge Aug 25, 2023, 3:09 PM

#

i mean after i finish the model i need to write like 2500 words abt it and stuff

jaunty hinge Aug 25, 2023, 3:09 PM

#

past meteor If you have the time I'd review basic Python and also watch some videos on AR, M...

i did

past meteor Aug 25, 2023, 3:09 PM

#

Maybe I'm not the best at this but there's too much for me to unpack

jaunty hinge Aug 25, 2023, 3:10 PM

#

ive been watching this tutorial

#

and i had to watch like so many other tutorials in the middle to get where he was

past meteor Aug 25, 2023, 3:11 PM

#

If I were you I'd make these plots:

https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.acf.html

https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.pacf.html

#

Basically they tell you how correlated each measurement at time n is with n-1, n-2, n-3, ...

#

They directly inform you what the p and q (so the AR and the MA orders) of ARIMA should be
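
To make "how correlated each measurement is with n-1, n-2, ..." concrete, here is a hand-rolled sketch of the sample autocorrelation in NumPy. It is an illustration only, not statsmodels' implementation, and it does not cover the partial autocorrelation; the `sample_acf` helper and the toy series are made up:

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation of x at lags 0..nlags."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[lag:], centered[:len(x) - lag]) / denom
        for lag in range(nlags + 1)
    ])

# A toy series that repeats every 4 steps.
series = np.tile([1.0, 2.0, 3.0, 2.0], 25)
acf = sample_acf(series, nlags=4)
print(np.round(acf, 3))
```

Lag 0 is always 1, lag 2 comes out strongly negative because the series is half a cycle out of phase there, and lag 4 is close to 1 again because the pattern repeats.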

jaunty hinge Aug 25, 2023, 3:12 PM

#

oh yeah this was all fine

#

my data was stationary

past meteor Aug 25, 2023, 3:12 PM

#

In your case the I should be 0, you shouldn't detrend

jaunty hinge Aug 25, 2023, 3:13 PM

#

i just need to like predict and show the next forecast

past meteor Aug 25, 2023, 3:13 PM

#

Your data being stationary or not is only related to the I parameter of ARIMA

#

If you just want to predict then you should save the result into a new variable so predictions = model_fit.forecast(steps=7)

jaunty hinge Aug 25, 2023, 3:16 PM

#

past meteor If you just want to predict then you should save the result into a new variable ...

this just shows me like

#

the data i already have tho

finite bluff Aug 25, 2023, 3:18 PM

#

hey guys, could I ask a question about this field?

past meteor Aug 25, 2023, 3:19 PM

#

jaunty hinge the data i already have tho

Are you sure? It should be different

https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMAResults.forecast.html#statsmodels.tsa.arima.model.ARIMAResults.forecast

past meteor Aug 25, 2023, 3:20 PM

#

finite bluff hey guys, could I ask a question about this field?

please just ask instead of asking if you can ask 🙂

finite bluff Aug 25, 2023, 3:23 PM

#

Oh sorry for that. I'm just curious about data science. Actually, I'm learning IT at my university and now I'm interested in data science. What things I have to learn first and what websites could help me to cover a whole range of topics in a simple way to kick off this journey :))

odd meteor Aug 25, 2023, 4:45 PM

#

finite bluff Oh sorry for that. I'm just curious about data science. Actually, I'm learning I...

https://Kaggle.com/learn combine that with YouTube + courses from Udemy or DataCamp or DataQuest etc (if you're interested in making a financial commitment), and you'll be well on your way

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

#

Check our pinned message for additional resource

lapis sequoia Aug 25, 2023, 4:47 PM

#

this book has been pretty good for getting a grasp of data science and libraries associated with it like numpy and pandas
https://jakevdp.github.io/PythonDataScienceHandbook/

Python Data Science Handbook | Python Data Science Handbook

left tartan Aug 25, 2023, 5:02 PM

#

This channel really needs a sticky (pinned?)

odd meteor Aug 25, 2023, 5:04 PM

#

left tartan This channel really needs a sticky (pinned?)

What's a sticky? Sticker?

past meteor Aug 25, 2023, 5:04 PM

#

!resources

arctic wedgeBOT Aug 25, 2023, 5:04 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

past meteor Aug 25, 2023, 5:04 PM

#

I think there was a data science specific one? idk

misty flint Aug 25, 2023, 5:06 PM

#

!resources data science

arctic wedgeBOT Aug 25, 2023, 5:06 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

misty flint Aug 25, 2023, 5:07 PM

#

oh hey that worked

left tartan Aug 25, 2023, 5:07 PM

#

odd meteor What's a sticky? Sticker?

I mean something in the channel description. Look at discord bots for a good example.

misty flint Aug 25, 2023, 5:07 PM

#

i remember stel did a similar command

past meteor Aug 25, 2023, 5:07 PM

#

The data science resources there are pretty meek though

misty flint Aug 25, 2023, 5:07 PM

#

they are. we were talking about adding also some data eng resources on there

left tartan Aug 25, 2023, 5:07 PM

#

Yah, I have a bunch I’d like to contribute for the data science stuff

misty flint Aug 25, 2023, 5:07 PM

#

but i kinda dropped the ball on that

#

oops

#

@serene scaffold billybobby has volunteered. let me compile something for DE/MLE

left tartan Aug 25, 2023, 5:08 PM

#

Lol

past meteor Aug 25, 2023, 5:08 PM

#

Excel data engineering pithink (I'm joking)

misty flint Aug 25, 2023, 5:08 PM

#

oh speaking of which

#

were you able to get it working

left tartan Aug 25, 2023, 5:08 PM

#

No

misty flint Aug 25, 2023, 5:08 PM

#

python in excel

left tartan Aug 25, 2023, 5:09 PM

#

It's in a limited rollout

misty flint Aug 25, 2023, 5:09 PM

#

rip

left tartan Aug 25, 2023, 5:09 PM

#

so even though I have beta channel, they haven't pushed it to all beta users

misty flint Aug 25, 2023, 5:09 PM

#

im seeing it more and more on my newsfeed

#

i think even fireship did a video

#

https://youtu.be/8ofsE7xiGho?si=RQkzhn5zOH8muhCH

YouTube

Fireship

Microsoft Excel just got Python

Microsoft Excel just added support for Python, allowing developers to code and run custom functions directly inside a spreadsheet. Let's take a first look at how Python support will work in Excel.


left tartan Aug 25, 2023, 5:10 PM

#

it's interesting... the execution is a sandboxed anaconda install that runs in amazon cloud with about 400 pre-selected packages. Can't run whatever you want, though.

misty flint Aug 25, 2023, 5:10 PM

#

400 pre-selected packages. i dont remember if thats more or less than what comes with regular anaconda

left tartan Aug 25, 2023, 5:10 PM

#

but, this is a good thing, when compared to the security nightmare of vba.

misty flint Aug 25, 2023, 5:10 PM

#

kekHands

#

excel vba is a security nightmare def

#

idk how those finance folks arent aware

left tartan Aug 25, 2023, 5:11 PM

#

necessary evil, perhaps.

misty flint Aug 25, 2023, 5:11 PM

#

probs

past meteor Aug 25, 2023, 5:11 PM

#

Compiling a data engineering list is hard(er) because it can mean literally anything

misty flint Aug 25, 2023, 5:12 PM

#

it does but we will start somewhere

past meteor Aug 25, 2023, 5:12 PM

#

good ol' SQL fundamentals

left tartan Aug 25, 2023, 5:13 PM

#

past meteor Compiling a data engineering list is hard(er) because it can mean literally anyt...

I'm throwing my stuff together, one sec

past meteor Aug 25, 2023, 5:13 PM

#

I think the issue with DE moreso than with DS is that it's a really tools-driven field

#

You can make an argument that on some level you're learning tools and not concepts

#

Or the old heads that reduce all of the domain to dimensional modelling and dashboards lemon_angrysad

left tartan Aug 25, 2023, 5:20 PM

#

I think I've got more somewhere, but here's stuff I had in Notion:

Prerequisite: Get good at Python (see resources/etc)

Overviews / Intros
https://shivaga9esh.medium.com/data-architecture-engineering-22164c8dbd43

Fundamental Tools:
Pandas: https://pandas.pydata.org/docs/user_guide/10min.html
SQL: https://selectstarsql.com/
Analytical Databases: Snowflake, duckdb, clickhouse, etc
Introductory:
3B1B: https://www.youtube.com/c/3blue1brown
https://www.kaggle.com/learn
CS50 for AI: https://cs50.harvard.edu/ai/2023/
https://jakevdp.github.io/PythonDataScienceHandbook/

Math:

Highlights of Calculus (Strang): https://ocw.mit.edu/courses/res-18-005-highlights-of-calculus-spring-2010/video_galleries/highlights_of_calculus/

Advanced:

Mathematics for Machine Learning: https://mml-book.github.io/book/mml-book.pdf
https://www.manning.com/books/deep-learning-with-pytorch
https://scikit-learn.org/stable/user_guide.html
o and https://pythonprogramming.net/machine-learning-python-sklearn-intro/
Fundamentals of Data Engineering: https://www.amazon.com/s?k=fundamentals+of+data+engineering

Medium

Data Architecture & Engineering

Common Backend Services & Opensource Data Stack

YouTube

3Blue1Brown

3Blue1Brown, by Grant Sanderson, is some combination of math and entertainment, depending on your disposition. The goal is for explanations to be driven by animations and for difficult problems to be made simple with changes in perspective.

For more information, other projects, FAQs, and inquiries see the website: https://www.3blue1brown.com


misty flint Aug 25, 2023, 5:23 PM

#

past meteor You can make an argument that on some level you're learning tools and not concep...

thats why joe reis and matt housley offer this framework. so peeps can start thinking in concepts instead of tools since they say tools come and go:

#

met them in person. real cool folks 👍

#

and ofc the diagram is from the first chapter of FoDE. i really sound like a broken record at this point

#

DoggoKek

left tartan Aug 25, 2023, 5:26 PM

#

hey, I put fode in my list

misty flint Aug 25, 2023, 5:26 PM

#

~~and thats all that matters~~

left tartan Aug 25, 2023, 5:26 PM

#

I guess I should throw duckdb in my list then

misty flint Aug 25, 2023, 5:26 PM

#

~~my goal has been fulfilled~~

#

EvilKermit

past meteor Aug 25, 2023, 5:26 PM

#

I have never heard of that book interesting

#

You should add a Kimball / Inmon book to the list

#

That's how it started for me at least 🤷

misty flint Aug 25, 2023, 5:27 PM

#

past meteor I have never heard of that book interesting

brand new. joe reis likes to describe it as the prequel to designing data intensive applications

misty flint Aug 25, 2023, 5:27 PM

#

past meteor You should add a Kimball / Inmon book to the list

data warehouse toolkit most def

past meteor Aug 25, 2023, 5:28 PM

#

left tartan I guess I should throw duckdb in my list then

Add polars to your list

left tartan Aug 25, 2023, 5:29 PM

#

hmm

past meteor Aug 25, 2023, 5:29 PM

#

At least for my work I prefer it much more than DuckDB or SQL in general. My transformations are awkward to express as SQL queries. It's unnecessary pain.

worn stratus Aug 25, 2023, 5:29 PM

#

misty flint 400 pre-selected packages. i dont remember if thats more or less than what comes...

the killer is that there's never going to be a nice way to auth against internal databases

past meteor Aug 25, 2023, 5:30 PM

#

I'm still a big believer in the fact that people should YEET Pandas out of their data engineering stack

#

It's inferior in all respects compared to Polars except it integrating better with data viz and ML tools

#

The memory footprint being a lot smaller, better multi threading and the lazy API are just big selling points

worn stratus Aug 25, 2023, 5:31 PM

#

past meteor It's inferior in all respects compared to Polars except it integrating better wi...

even this is slowly going away - seaborn and Plotly both play nicely with polars

past meteor Aug 25, 2023, 5:31 PM

#

https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.groupby_dynamic.html

#

If you're working with time series. Group by dynamic is the absolute GOAT

misty flint Aug 25, 2023, 5:31 PM

#

worn stratus the killer is that there's never going to be a nice way to auth against internal...

oh yeah that is killer

worn stratus Aug 25, 2023, 5:32 PM

#

the main thing pandas has going for it is the first mover's advantage: a ridiculous amount of tooling layered on top of it

misty flint Aug 25, 2023, 5:32 PM

#

past meteor I'm still a big believer in the fact that people should YEET Pandas out of their...

i see it more as a dif type of workflow

#

if that makes sense

past meteor Aug 25, 2023, 5:32 PM

#

Sure, it makes sense

misty flint Aug 25, 2023, 5:32 PM

#

for DS and DAs

#

for ETL and whatnot, def different workflows

past meteor Aug 25, 2023, 5:33 PM

#

I still use Pandas when I'm getting very close to my "final" application

misty flint Aug 25, 2023, 5:33 PM

#

yeah just to peek and look

#

i do the same

past meteor Aug 25, 2023, 5:33 PM

#

E.g., when I'm getting close to sklearn and whatnot I use Pandas

worn stratus Aug 25, 2023, 5:33 PM

#

I'm completely off of pandas at work at this point, which might screw me when I come to interview

past meteor Aug 25, 2023, 5:34 PM

#

I think Spark / Databricks are (sadly) data engineering essentials

#

Why read a small CSV in Python when you can wait several seconds for the JVM to start!! 😉

worn stratus Aug 25, 2023, 5:35 PM

#

past meteor I think Spark / Databricks are (sadly) data engineering essentials

the saddest thing here is that the dataset api is great, but it's scala only. pyspark is just a bit rubbish

past meteor Aug 25, 2023, 5:36 PM

#

I'd prefer doing Spark in Scala if only for the fact that UDFs are a lot less painful and they don't tank your performance as much

left tartan Aug 25, 2023, 5:36 PM

#

worn stratus I'm completely off of pandas at work at this point, which might screw me when I ...

You'll be fine, just learn duckdb 😉

past meteor Aug 25, 2023, 5:37 PM

#

DuckDB is just SQL, there's nothing to learn?

worn stratus Aug 25, 2023, 5:38 PM

#

left tartan You'll be fine, just learn duckdb 😉

it's more that the kind of roles I'll end up applying for will often have a pandas coding interview - and "trust me bro, let's just use duckdb/polars" isn't a great line

left tartan Aug 25, 2023, 5:38 PM

#

worn stratus it's more that the kind of roles I'll end up applying for will often have a pand...

I know, I agree 🙂

worn stratus Aug 25, 2023, 5:38 PM

#

worn stratus it's more that the kind of roles I'll end up applying for will often have a pand...

the response will be "OK nerd, but we have these pandas codebases"

left tartan Aug 25, 2023, 5:39 PM

#

past meteor DuckDB is just SQL, there's nothing to learn?

Like all things, they have a lot of tricks up their sleeves - #python-discussion message


past meteor Aug 25, 2023, 5:39 PM

#

I don't fully understand the DuckDB hype

#

I get it for people that don't know Python

#

The people that are going for the whole DuckDB + DBT set-up

left tartan Aug 25, 2023, 5:40 PM

#

Oh, dbt is heaven... but:

past meteor Aug 25, 2023, 5:40 PM

#

For the rest DuckDB and Polars are quite similar in terms of capabilities. The question becomes "do you want to write SQL or have a dataframe API"

#

I want a dataframe API any day of the week because you get all of software's best practices for free

worn stratus Aug 25, 2023, 5:41 PM

#

i get the duckdb hype. SQL is already the lingua franca of the data world, and it lets you avoid pandas which is basically a dsl largely separate from actual python

left tartan Aug 25, 2023, 5:41 PM

#

the duckdb team is just cranking out features and it's just so much better than what's out there. For example: columns with regular expressions or lambdas or Python UDFs or pivot/unpivot

DuckDB

Even Friendlier SQL with DuckDB

TLDR; DuckDB continues to push the boundaries of SQL syntax to both simplify queries and make more advanced analyses possible. Highlights include dynamic column selection, queries that start with the FROM clause, function chaining, and list comprehensions. We boldly go where no SQL engine has gone before! Who says that SQL should stay frozen in ...

misty flint Aug 25, 2023, 5:41 PM

#

worn stratus I'm completely off of pandas at work at this point, which might screw me when I ...

its okay. your next role can be in DE EvilKermit

left tartan Aug 25, 2023, 5:41 PM

#

(I'm not affiliated, but just what they've shipped in the past year is amazing)

past meteor Aug 25, 2023, 5:42 PM

#

SQL overuse can be terrible as well

left tartan Aug 25, 2023, 5:42 PM

#

as_of joins, positional joins, etc.

past meteor Aug 25, 2023, 5:42 PM

#

It's declarative but when you need something that doesn't exist natively you end up tying together all of those declarative things into a procedural monster

misty flint Aug 25, 2023, 5:42 PM

#

past meteor SQL overuse can be terrible as well

i saw legit sql queries stored as database values this quarter

#

when you run the sql query in order to get more sql queries galactic_brain

past meteor Aug 25, 2023, 5:44 PM

#

misty flint i saw legit sql queries stored as database values this quarter

doesn't surprise me hahaha

left tartan Aug 25, 2023, 5:44 PM

#

past meteor It's declarative but when you need something that doesn't exist natively you end...

What I end up doing is just collapsing dozens or hundreds of lines of code into a duckdb transformation... but I could use them inline:

#

Note the inline use of the global dataframe:
```py
import duckdb
import math
import pandas as pd

somedf = pd.DataFrame({"i": [i for i in range(22)]})
with duckdb.connect() as con:
    con.create_function('factorial_method_native', lambda x: math.factorial(x), [duckdb.typing.HUGEINT], duckdb.typing.HUGEINT, type='native')
    df = con.sql('select factorial_method_native(i) from somedf tbl(i)').df()
print(df)
```

misty flint Aug 25, 2023, 5:44 PM

#

past meteor doesn't surprise me hahaha

yeah others like moyen have seen this as well

past meteor Aug 25, 2023, 5:44 PM

#

left tartan What I end up doing is just collapsing dozens or hundreds of lines of code into ...

Do me a favour and try Polars 😦

#

Writing strings in Python also just sucks

#

it's a hack

left tartan Aug 25, 2023, 5:45 PM

#

past meteor Do me a favour and try Polars 😦

Oh, I've used polars extensively.

left tartan Aug 25, 2023, 5:45 PM

#

past meteor Writing strings in Python also just sucks

Not trying to sell you, but there is a pythonic API if that's your cup of tea.

past meteor Aug 25, 2023, 5:45 PM

#

I've looked at DuckDB extensively tbh. I've even "sold" it to colleagues using worse tools (laughs in R)

left tartan Aug 25, 2023, 5:46 PM

#

but for me, I can join a polars df, with hive partitioned parquet files, with a csv file, in a single operation.

#

that's just black magic for me.

misty flint Aug 25, 2023, 5:46 PM

#

left tartan the duckdb team is just cranking out features and it's just so much better than ...

this is dope

past meteor Aug 25, 2023, 5:46 PM

#

Went to talks run by elite data engineering teams on DuckDB etc as well

#

But I just still don't see why I'd squeeze it into my current workflow

#

It looks inferior to my current setup, just my 2 cents

worn stratus Aug 25, 2023, 5:47 PM

#

the main thing putting me off of duckdb is that it's kinda redundant for analysis once your data is already in some SQL database.

why pull data from snowflake just to change the SQL flavor?

misty flint Aug 25, 2023, 5:47 PM

#

i think it has a dif use case imo

left tartan Aug 25, 2023, 5:47 PM

#

I mean, if you're in snowflake, you can stay in snowflake, I prob wouldn't change that.

past meteor Aug 25, 2023, 5:47 PM

#

There's edge cases where I would use duckDB though, I think its streaming and larger-than-memory support is better than Polars

#

If I were really really memory constrained I'd move to DuckDB

#

But realistically, at that point they should hire someone else. I'm not a data engineer 😢

#

Everywhere I've been everyone just sucked at it so I do it, with pleasure

#

Otherwise the data pipeline would be csv files on teams

worn stratus Aug 25, 2023, 5:49 PM

#

left tartan I mean, if you're in snowflake, you can stay in snowflake, I prob wouldn't chang...

I do, but I think that's the place where polars eats DuckDB's lunch - it can do most of the ducky things, but stays relevant later in the data lifecycle. you can pull data from a database, then you can write nice composable polars transforms over the top of it

left tartan Aug 25, 2023, 5:49 PM

#

Yah, I'd mix it in, personally... duckdb works over polars /arrow data too, so you can use either or both

misty flint Aug 25, 2023, 5:50 PM

#

chroma, vector db, is built on top of duckdb so theres that

#

i need that one meme where you lift the mask and its just duckdb underneath

#

DoggoKek

past meteor Aug 25, 2023, 5:51 PM

#

I don't understand vector db's either

misty flint Aug 25, 2023, 5:51 PM

#

so you got some embeddings right?

past meteor Aug 25, 2023, 5:51 PM

#

I get that part

misty flint Aug 25, 2023, 5:51 PM

#

these are lists of floating point numbers

#

and you need to read/write them

#

super duper fast

past meteor Aug 25, 2023, 5:51 PM

#

What I don't get is why the tech is so overhyped

misty flint Aug 25, 2023, 5:52 PM

#

current db systems suck at this

misty flint Aug 25, 2023, 5:52 PM

#

past meteor What I don't get is why the tech is so overhyped

oh yeah it is a bit overhyped atm

past meteor Aug 25, 2023, 5:52 PM

#

Just serializing it normally and doing a dot product works unless you have a really really large set of embeddings

misty flint Aug 25, 2023, 5:52 PM

#

the issue comes down to the model

past meteor Aug 25, 2023, 5:52 PM

#

People have been doing algebraic topic modelling for ages without vector dbs

misty flint Aug 25, 2023, 5:52 PM

#

some models generate embeddings that are only 300ish dimensions per "row" (looks at SBERT)

#

some are wildly large embeddings

#

with huge dimensions

past meteor Aug 25, 2023, 5:53 PM

#

Just benchmark matrix vector multiplication with different sizes in numpy and then ask yourself if the hype is warranted. I have and for me it was a "no"
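
One rough way to run the benchmark suggested above: a brute-force similarity search is a single matrix-vector product followed by a top-k selection. The `top_k` helper, the collection sizes, the 384-dimensional vectors and the random data are all illustrative assumptions, not anything posted in the chat.

```python
import time
import numpy as np

def top_k(embeddings: np.ndarray, query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k rows with the largest dot product with query."""
    scores = embeddings @ query                  # one matrix-vector product
    idx = np.argpartition(scores, -k)[-k:]       # unordered top k, linear time
    return idx[np.argsort(scores[idx])[::-1]]    # sort only those k, best first

rng = np.random.default_rng(0)
for n_rows in (1_000, 5_000, 20_000):            # hypothetical collection sizes
    emb = rng.standard_normal((n_rows, 384), dtype=np.float32)
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)   # unit rows: dot product == cosine
    query = emb[0]
    start = time.perf_counter()
    hits = top_k(emb, query)
    elapsed_ms = (time.perf_counter() - start) * 1e3
    print(f"{n_rows:>6} rows: {elapsed_ms:.2f} ms, best match = row {hits[0]}")
```

Timings depend entirely on the machine, so the numbers are only meaningful relative to each other as the row count grows.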

misty flint Aug 25, 2023, 5:54 PM

#

i mean my work mentor built his own vector db doing just what you described

#

however

#

just look at how much money pinecone is raking in lol

#

there are some use cases where some stuff just doesnt scale

#

someone in this server was telling about it

#

and how they were waiting for a good vector db to come out

#

thats equivalent to cassandra

past meteor Aug 25, 2023, 5:56 PM

#

This exists but it's not proportional to the vector db propaganda

misty flint Aug 25, 2023, 5:57 PM

#

agreed

past meteor Aug 25, 2023, 5:58 PM

#

I did a course on information retrieval in uni and I guess this is why they reminded us that the triangle inequality is a thing

#

There's also extensive research on approximate nearest neighbour search etc

upper flame Aug 25, 2023, 7:08 PM

#

@past meteor i sent u a dm

cyan belfry Aug 25, 2023, 7:42 PM

#

Hi 😊
new to this topic, I have a huge trading data set of about 420k rows each day
what will be the best way to find similar patterns for all the data sets I have ?
And after that of course convert it into a model

left tartan Aug 25, 2023, 7:46 PM

#

cyan belfry Hi 😊 new to this topic, I have a huge trading data set of about 420k rows each ...

That's a giant topic. If you want to dig in, I enjoyed this book: https://www.amazon.com/Advances-Financial-Machine-Learning-Marcos/dp/1119482089

cyan belfry Aug 25, 2023, 7:49 PM

#

left tartan That's a giant topic. If you want to dig in, I enjoyed this book: <https://www.a...

I want to learn only the specific things for my project, im not planning on working in that field, im pretty bad with math lol

left tartan Aug 25, 2023, 7:49 PM

#

Maybe narrow down your question then?

cyan belfry Aug 25, 2023, 7:50 PM

#

im not sure how 😅

#

I thought maybe an AI could go through the data and find something

cyan belfry Aug 25, 2023, 7:57 PM

#

left tartan Maybe narrow down your question then?

Can I DM u about it ?

ashen ore Aug 25, 2023, 8:18 PM

#

What are good laptops out there rn for performing deep learning tasks and processing other ml models

left tartan Aug 25, 2023, 8:22 PM

#

cyan belfry Can I DM u about it ?

I don’t dm… just ask here, someone can surely answer it

desert oar Aug 25, 2023, 9:00 PM

#

worn stratus the main thing putting me off of duckdb is that it's kinda redundant for analysi...

because hitting the data warehouse every time is expensive and slow

#

save your 10 GB subset locally and go to town with duckdb, polars, pandas, whatever

misty flint Aug 25, 2023, 9:02 PM

#

@serene scaffold i ended up compiling a basic list aimed towards beginners. ill send you a link to the notion doc if youre still interested in updating pydis resources

past meteor Aug 25, 2023, 9:13 PM

#

desert oar because hitting the data warehouse every time is expensive and slow

I heard snowflake is expensive as well though isn't it?

desert oar Aug 25, 2023, 9:13 PM

#

past meteor I heard snowflake is expensive as well though isn't it?

it adds up because the warehouse always stays on for 1 minute minimum, so even `select 1` costs 1 minute of execution time

past meteor Aug 25, 2023, 9:14 PM

#

It really is beautiful technology though. Fully separate compute and storage. Predicate pushdown. S3-like files with a SQL interface that make it act like a RDBMS, ...

#

I'm wary of doing R&D in the cloud though

past meteor Aug 25, 2023, 9:15 PM

#

desert oar it adds up because the warehouse always stays on for 1 minute minimum, so even `...

Does it have a cold start if the cluster is off?

upper flame Aug 25, 2023, 10:24 PM

#

Hey can someone assist me with a code pls. it's rlly URGENT. For volunteers ping me or dm

slim bone Aug 25, 2023, 10:43 PM

#

I'm trying to understand the point of max-pooling, at least on an intuitive level.
The book I'm reading is trying to explain this but I can't understand the explanation at all, thus I can't really formulate a concrete question besides "Why do we actually use max-pooling?"
For the sake of clarity: I know what it does, just not why it's used.
Thanks in advance

Edit: Trying to research this a little further - and it seems like there's no definitive answer. Am I digging too much into things? Because indeed, most of my learning experience with a lot of subjects related to ML could sort of be summed up with "X works, because someone thought X could work, so they implemented X into some models and noticed an improvement"
So perhaps there's a meta question here: Should I even bother trying to understand some of these concepts?

I've attached the explanation I'm having trouble with

pallid badge Aug 25, 2023, 10:50 PM

#

Would this be the correct subchannel to discuss hdf5 files?

serene scaffold Aug 26, 2023, 1:07 AM

#

misty flint @serene scaffold i ended up compiling a basic list aimed towards beginners....

I'm on vacation and that's why I haven't been responding. Let's chat next week

misty flint Aug 26, 2023, 1:09 AM

#

serene scaffold I'm on vacation and that's why I haven't been responding. Let's chat next week

stel enjoy your time off. dont mind me bud and sounds good ok_handbutflipped

twin forge Aug 26, 2023, 1:38 AM

#

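```py
import yfinance as yf

def OHLCV(list_tickers, start, end):
    """Download daily OHLCV data per ticker, skipping tickers that fail or come back empty."""
    ohlcv = {}

    for t in list_tickers:
        try:
            data = yf.download(t, start=start, end=end, interval="1d", repair=True).dropna()
            if not data.empty:
                ohlcv[t] = data
        except Exception:  # skip tickers that fail to download
            pass
    return ohlcv
```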

Hey guys I'm working with a big ticker sample that includes delisted stocks but as you can see there's tons of price data series behaving incorrectly.

How can I identify these bad time series data so that I can remove them from my analysis? 

I'm thinking something along the lines of "based on how the other prices reacted during time interval, remove these tickers from ohlcv"

verbal venture Aug 26, 2023, 1:40 AM

#

Can someone help me with this code: ```py
import numpy as np

def initialize_model(N,V, random_seed=1):
    '''
    Inputs:
        N: dimension of hidden vector
        V: dimension of vocabulary
        random_seed: random seed for consistent results in the unit tests
    Outputs:
        W1, W2, b1, b2: initialized weights and biases
    '''

    ### START CODE HERE (Replace instances of 'None' with your code) ###
    np.random.seed(random_seed)
    # W1 has shape (N,V)
    W1 = np.random.randn(N, V)

    # W2 has shape (V,N)
    W2 = np.random.randn(V, N)

    # b1 has shape (N,1)
    b1 = np.zeros((N, 1))

    # b2 has shape (V,1)
    b2 = np.zeros((V, 1))

    ### END CODE HERE ###
    return W1, W2, b1, b2
```

left tartan Aug 26, 2023, 1:48 AM

#

twin forge ```py def OHLCV(list_tickers, start, end): ohlcv = {} for t in list_tic...

I guess you'd have to look at the data to see what's going on. Are there a bunch of zero points in between? Are these after trading hours blips? In short: look at the data to see what the issue is

twin forge Aug 26, 2023, 1:55 AM

#

left tartan I guess you'd have to look at the data to see what's going on. Are there a bunch...

the thing is my sample size is over 300 stocks so checking individually would take me a whole day

#

could I simply .apply() something?

#

maybe for example: if stock volume is under x don't add it to ohlcv_dict?

slim flicker Aug 26, 2023, 1:59 AM

#

!verify

#

!voiceverify

#

!voiceverify

serene scaffold Aug 26, 2023, 2:05 AM

#

verbal venture Can someone help me with this code: ```py def initialize_model(N,V, random_seed=...

Help in what way?

verbal venture Aug 26, 2023, 2:05 AM

#

it's not passing tests

serene scaffold Aug 26, 2023, 2:05 AM

#

@slim flicker see #voice-verification

serene scaffold Aug 26, 2023, 2:05 AM

#

verbal venture it's not passing tests

What tests...

verbal venture Aug 26, 2023, 2:05 AM

#

Wrong initialization for b2 vector. Check the use of the random seed.
Expected: [[0.77951459]
[0.02293309]
[0.57766286]
[0.00164217]
[0.51547261]]
Got: [[0.]
[0.]
[0.]
[0.]
[0.]].
24 Tests passed
12 Tests failed

#

I don't know what I could change about the func. I did torch.randn and regular np.zeros() but got the same test output

serene scaffold Aug 26, 2023, 2:06 AM

#

verbal venture Wrong initialization for b2 vector. Check the use of the random seed. Expec...

When you ask for help, this is the kind of information you should give in the first message for your question.

Look at how you define b2

verbal venture Aug 26, 2023, 2:07 AM

#

ah

left tartan Aug 26, 2023, 2:07 AM

#

twin forge the thing is my sample size is over 300 stocks so checking individually would ta...

You first need to inspect the data to understand what the problem is. You’re asking for the medicine without a diagnosis.

verbal venture Aug 26, 2023, 2:07 AM

#

serene scaffold When you ask for help, this is the kind of information you should give in the fi...

you're saying don't use np.zeros yeah?

#

I got the same output

serene scaffold Aug 26, 2023, 2:08 AM

#

verbal venture I got the same output

Show

verbal venture Aug 26, 2023, 2:08 AM

#

change b1 & b2 to randn

left tartan Aug 26, 2023, 2:08 AM

#

twin forge the thing is my sample size is over 300 stocks so checking individually would ta...

Perhaps you have spurious zeros, or perhaps gaps in data that’s been filled incorrectly, or perhaps your charting code is rendering gaps as zeros, or perhaps you’ve made a mistake and grouped disparate series under the same ticker, etc
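
A sketch of how that diagnosis could be automated across a few hundred tickers instead of eyeballing each chart: compute a handful of summary numbers per ticker, then look only at the outliers. The `diagnose` helper, the synthetic data and the two cut-off values are hypothetical; the `Close` and `Volume` column names are assumed to match the frames stored in `ohlcv`.

```python
import numpy as np
import pandas as pd

def diagnose(df: pd.DataFrame) -> dict:
    """Summarize common data problems for one ticker's daily OHLCV frame."""
    close = df["Close"]
    returns = (close / close.shift(1) - 1).dropna()          # simple daily returns
    gaps = df.index.to_series().diff().dt.days.dropna()      # calendar days between rows
    return {
        "rows": len(df),
        "zero_volume_frac": float((df["Volume"] == 0).mean()),
        "nonpositive_close": int((close <= 0).sum()),
        "max_abs_return": float(returns.abs().max()) if len(returns) else 0.0,
        "max_gap_days": int(gaps.max()) if len(gaps) else 0,
    }

# Synthetic stand-ins for two downloaded tickers.
dates = pd.bdate_range("2023-01-02", periods=60)
rng = np.random.default_rng(0)
good = pd.DataFrame(
    {"Close": 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 60))),
     "Volume": rng.integers(1_000, 5_000, 60)},
    index=dates,
)
bad = good.copy()
bad.iloc[30, bad.columns.get_loc("Close")] *= 50    # a one-day 50x price spike
bad.iloc[40:, bad.columns.get_loc("Volume")] = 0    # trading has stopped
ohlcv = {"GOOD": good, "BAD": bad}

report = pd.DataFrame({t: diagnose(df) for t, df in ohlcv.items()}).T
# Hypothetical cut-offs: tune them after looking at the report for real data.
suspect = report[(report["max_abs_return"] > 0.5) | (report["zero_volume_frac"] > 0.2)]
print(report)
print("suspect tickers:", list(suspect.index))
```

The report does not replace looking at the data, it just narrows 300 charts down to the few worth opening.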

verbal venture Aug 26, 2023, 2:08 AM

#

Wrong initialization for b2 vector. Check the use of the random seed.
Expected: [[0.77951459]
[0.02293309]
[0.57766286]
[0.00164217]
[0.51547261]]
Got: [[-0.10106761]
[-0.05230815]
[ 0.24921766]
[ 0.19766009]
[ 1.33484857]].

serene scaffold Aug 26, 2023, 2:10 AM

#

@verbal venture the n in randn stands for normal. It does something different than the general purpose random array generator

#

Use this https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html
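
A small illustration of the difference, using the same seed as the assignment. Note that the expected `b2` values in the test output all lie between 0 and 1, which is consistent with uniform draws; whether the grader also expects a particular order of calls is an assumption on my part.

```python
import numpy as np

np.random.seed(1)
uniform = np.random.rand(5, 1)     # uniform on [0, 1): never negative
np.random.seed(1)
normal = np.random.randn(5, 1)     # standard normal: centred on 0, can be negative

print(uniform.ravel())
print(normal.ravel())

# The seed fixes the whole sequence of draws, so the order of calls matters too.
np.random.seed(1)
first = np.random.rand(2, 3)       # consumes the first six draws
second = np.random.rand(3, 1)      # not the same as rand(3, 1) right after seeding
```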

verbal venture Aug 26, 2023, 2:11 AM

#

ah ok, how long have you been doing AI for

serene scaffold Aug 26, 2023, 2:11 AM

#

verbal venture ah ok, how long have you been doing AI for

A few years

verbal venture Aug 26, 2023, 2:11 AM

#

all NLP?

serene scaffold Aug 26, 2023, 2:12 AM

#

Pretty much

verbal venture Aug 26, 2023, 2:12 AM

#

is it just transformers nowadays?

#

like in production

serene scaffold Aug 26, 2023, 2:12 AM

#

Yes

#

Everyone wants large language models that use transformers

verbal venture Aug 26, 2023, 2:25 AM

#

figured

#

can you help with this as well? ```py
def back_prop(x, yhat, y, h, W1, W2, b1, b2, batch_size):
    '''
    Inputs:
        x: average one hot vector for the context
        yhat: prediction (estimate of y)
        y: target vector
        h: hidden vector (see eq. 1)
        W1, W2, b1, b2: matrices and biases
        batch_size: batch size
    Outputs:
        grad_W1, grad_W2, grad_b1, grad_b2: gradients of matrices and biases
    '''

    # Compute z1 as "W1⋅x + b1"
    z1 = np.dot(W1, x) + b1

    ### START CODE HERE (Replace instances of 'None' with your code) ###

    # Compute l1 as W2^T (Yhat - Y)
    l1 = (yhat - y)

    # if z1 < 0, then l1 = 0
    # otherwise l1 = l1
    # (this is already implemented for you)

    l1[z1 < 0] = 0 # use "l1" to compute gradients below

    # compute the gradient for W1
    grad_W1 = np.dot(l1, x.T) / batch_size

    # Compute gradient of W2
    grad_W2 = np.dot(l1, h.T) / batch_size

    # compute gradient for b1
    grad_b1 = np.sum(l1, axis=1, keepdims=True) / batch_size

    # compute gradient for b2
    grad_b2 = np.sum(yhat - y, axis=1, keepdims=True) / batch_size
    ### END CODE HERE ###

    return grad_W1, grad_W2, grad_b1, grad_b2
```

#

error is: boolean index did not match indexed array along dimension 0; dimension is 5778 but corresponding boolean dimension is 50
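
A shape-only sketch of where that error comes from, assuming 5778 is the vocabulary size V and 50 is the hidden size N. `yhat - y` has V rows while `z1` has N rows, so the mask `z1 < 0` cannot index it; the code comment asks for `W2^T (Yhat - Y)`, which does have N rows. The tiny sizes and random arrays below are placeholders, and this is not meant as the graded solution.

```python
import numpy as np

N, V, batch = 3, 7, 4                      # small stand-ins for 50, 5778 and the batch size
rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((N, V)), rng.standard_normal((V, N))
b1 = rng.standard_normal((N, 1))
x = rng.standard_normal((V, batch))
yhat, y = rng.standard_normal((V, batch)), rng.standard_normal((V, batch))

z1 = W1 @ x + b1                           # shape (N, batch)
diff = yhat - y                            # shape (V, batch)

# Masking diff with (z1 < 0) fails because the first dimensions are V and N.
try:
    diff.copy()[z1 < 0] = 0
    mask_error = None
except IndexError as exc:
    mask_error = str(exc)

l1 = W2.T @ diff                           # shape (N, batch), as the comment describes
l1[z1 < 0] = 0                             # now the mask lines up with l1

print(z1.shape, diff.shape, l1.shape)
print(mask_error)
```

The same shape check is worth applying to the `grad_W2` line, since `W2` has shape (V, N).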

orchid sky Aug 26, 2023, 3:00 AM

#

!resources ai

arctic wedgeBOT Aug 26, 2023, 3:00 AM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

twilit tundra Aug 26, 2023, 5:28 AM

#

slim bone I'm trying to understand the point of max-pooling, at least on an intuitive leve...

Why pooling? Because it's helpful to scale CNNs.

Why maxpooling? It introduces nonlinearity and the intuition is pretty natural: you're letting a group of neurons fire if at least one of them fires.

It also makes the whole computation more robust (for example if you detect an object only on one corner of your pooling box, the signal still carries over when it would have been 'obfuscated' by an avg pooling)
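
The corner example can be made concrete with a toy feature map. This is a minimal NumPy sketch; `pool2x2` is a hypothetical helper that assumes even height and width and non-overlapping 2x2 windows.

```python
import numpy as np

def pool2x2(feature_map: np.ndarray, reduce) -> np.ndarray:
    """Apply `reduce` over non-overlapping 2x2 windows of a 2-D array."""
    h, w = feature_map.shape
    windows = feature_map.reshape(h // 2, 2, w // 2, 2)
    return reduce(windows, axis=(1, 3))

fmap = np.zeros((4, 4))
fmap[0, 0] = 1.0                      # one strong activation in the corner of a window

max_pooled = pool2x2(fmap, np.max)    # the window containing it still outputs 1.0
avg_pooled = pool2x2(fmap, np.mean)   # the same window is diluted to 0.25
print(max_pooled)
print(avg_pooled)
```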

twilit tundra Aug 26, 2023, 5:49 AM

#

There are alternatives (most notably using a convolutional layer with stride 2 https://arxiv.org/pdf/1412.6806.pdf) but it didn't really improve the performances of visual models, so it's not popular outside of academics

slim bone Aug 26, 2023, 9:00 AM

#

It all feels so... "handwavy" I think is the terminology in English

#

When I thought about it further I managed to conjure up some random explanation where "The most dominant pixels on the feature map are carried over, and thus it's perhaps easier to concentrate on the relevant features of the image" but honestly it seems like any effort at trying to explain this phenomenon on an intuitive level without the appropriate background is futile

wooden sail Aug 26, 2023, 10:41 AM

#

slim bone When I thought about it further I managed to conjure up some random explanation ...

i think this might be a nice read for you https://www.di.ens.fr/willow/pdfs/icml2010b.pdf

at the end of the day, max pooling is a heuristic. it doesn't always work nor make sense. the paper i linked explores the regime in which max pooling is beneficial. keep in mind downsampling is a common operation in signal processing that can be justified in many ways, e.g. the spectrum of the data is relatively low frequency, allowing downsampling without loss of information, or the data admitting a sparse representation in some domain. max pooling is one choice of irregular downsampling that can be beneficial depending on the properties of the data. there are also cases in which you do lose information by downsampling, but the high frequency information benefits from using the low frequency info as a "prior" or "initial guess", where it makes sense to consider both the downsampled data and the full data at the same time (think e.g. u-nets with "skip connections"). but yeah, it doesn't always make sense and there is also no guarantee of optimality for it in general

#

but as you correctly point out, someone tried stuff and this worked in many cases 😛 people looked for explanations and interpretations after

shadow viper Aug 26, 2023, 10:46 AM

#

helo everyone

#

how can you check for top 5 states with top 10 most demanding services each in the us?

#

how exactly can i get this data?

worn stratus Aug 26, 2023, 10:50 AM

#

desert oar because hitting the data warehouse every time is expensive and slow

for the most part, the cost is small enough to not matter for analysis. create a temp table with some level of aggregation, then it's just ~1-5s round trip times

worn stratus Aug 26, 2023, 10:50 AM

#

worn stratus for the most part, the cost is small enough to not matter for analysis. create a...

this is purely vs duckdb that I'm talking about

slim bone Aug 26, 2023, 10:57 AM

#

wooden sail i think this might be a nice read for you https://www.di.ens.fr/willow/pdfs/icml...

Thank you both for the detailed explanation
Admittedly I don't entirely understand the terminology behind "High/low frequency data" (Not sure if signal processing is necessarily even taught in my degree) but I think I got the idea of what you're trying to say, something like "Transformations of the type F^n -> F^m where m < n can still be "rich" with information and often help the computer process it" or something along those lines? ^^;

I really just wanted to hear that there isn't a single concrete explanation for this and that I'm not missing some fundamental theory. So indeed, I think you put my mind at ease

wooden sail Aug 26, 2023, 10:58 AM

#

slim bone Thank you both for the detailed explanation Admittedly I don't entirely understa...

sure, that's one way of looking at it. if we have a vector in F^n and it satisfies special properties, then a projection F^n -> F^m, with m < n, is invertible. you know how functions can be invertible from the left without being overall invertible (having no right inverse)

#

if you wanna think about it that way, it'd be that the original signal in F^n is in a vector space of dimension d <= m

#

then T: F^n -> F^m can be invertible from the left if you apply it to an input that is not in the null space of T

#

the case of "signal frequency" is a particular choice of T where we have a unitary matrix U in F^nxn, and we construct T by keeping m columns of U and transposing. the particular U of orthogonal complex exponentials is the "fourier basis" used to look at the frequency domain of signals, but other choices are possible
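
A numerical sketch of the left-inverse idea, under assumptions chosen for illustration: the signal lives in a 5-dimensional low-frequency subspace of R^64 (real cosines and sines rather than complex exponentials), and T simply keeps every 4th sample. Because T is injective on that subspace, least squares recovers the full signal from the 16 kept samples.

```python
import numpy as np

n, step = 64, 4                                  # full length and downsampling factor
t = np.arange(n)

# Columns span a 5-dimensional low-frequency subspace of R^64 (illustrative choice).
B = np.column_stack(
    [np.ones(n)]
    + [f(2 * np.pi * k * t / n) for k in (1, 2) for f in (np.cos, np.sin)]
)

rng = np.random.default_rng(0)
x = B @ rng.standard_normal(B.shape[1])          # a signal that lives in that subspace

y = x[::step]                                    # T: keep every 4th sample, m = 16
coeffs, *_ = np.linalg.lstsq(B[::step], y, rcond=None)
x_rec = B @ coeffs                               # left inverse of T on the subspace

print(len(y), np.max(np.abs(x - x_rec)))
```

A generic vector in R^64 would not survive this round trip; the recovery relies entirely on the signal sitting in the low-dimensional subspace.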

slim bone Aug 26, 2023, 11:22 AM

#

wooden sail then T: F^n -> F^m can be invertible from the left if you apply it to an input t...

I think I sort of missed the point - Why do we care if it's invertible from the left?

#

My Linear Algebra is a little rusty, maybe that's why I'm not getting it

wooden sail Aug 26, 2023, 11:26 AM

#

slim bone I think I sort of missed the point - Why do we care if it's invertible from the ...

because that means you can recover the full info even if you downsampled

#

that's not necessary depending on what you do after, but it can be nice to have

slim bone Aug 26, 2023, 11:26 AM

#

wooden sail because that means you can recover the full info even if you downsampled

Right but how does that matter in the context of Computer VIsion?

slim bone Aug 26, 2023, 11:26 AM

#

wooden sail that's not necessary depending on what you do after, but it can be nice to have

Ah

wooden sail Aug 26, 2023, 11:27 AM

#

for example with u-nets that i mentioned earlier, you can use a network to denoise an image

#

you can do this by slowly downsampling an image down to very few samples, and then using those few samples to reconstruct the full image without noise

#

similar things are done all the time in image processing more generally under the guise of "auto-encoders"

slim bone Aug 26, 2023, 11:29 AM

#

wooden sail you can do this by slowly downsampling an image down to very few samples, and th...

That sounds insane

#

Completely out of context but is "Image processing" its own field?

#

Like ML or whatever

wooden sail Aug 26, 2023, 11:30 AM

#

yesn't 😛

#

in all of signal processing, the general maths are transferrable

#

but in each application within it you can go arbitrarily deep & cursed

#

at some point you start seeing stuff that is only done in image processing and virtually no other signal processing/ML application

slim bone Aug 26, 2023, 12:13 PM

#

Ah, so it's not that simple

#

Got it haha

slender kestrel Aug 26, 2023, 12:54 PM

#

wooden sail yesn't 😛

ayo edd hows you mate

wooden sail Aug 26, 2023, 12:54 PM

#

hi

slender kestrel Aug 26, 2023, 12:54 PM

#

hows you ! and hows life been ?

wooden sail Aug 26, 2023, 12:55 PM

#

been ok, i'm not sure i remember you though

slender kestrel Aug 26, 2023, 12:56 PM

#

wooden sail been ok, i'm not sure i remember you though

glad to know you are ok and lol its fine if you dont remember me

#

you helped me once on a problem i was stuck on

wooden sail Aug 26, 2023, 12:57 PM

#

i see, hopefully that worked out well 😛

slender kestrel Aug 26, 2023, 12:57 PM

#

wooden sail i see, hopefully that worked out well 😛

well that did lol

proper meteor Aug 26, 2023, 1:44 PM

#

someone help with the shape of the images pls

rugged mist Aug 26, 2023, 2:10 PM

#

wooden sail you can do this by slowly downsampling an image down to very few samples, and th...

does this method have a name

wooden sail Aug 26, 2023, 2:11 PM

#

rugged mist does this method have a name

the ML people would call this something like "latent space representation" or something like that

past meteor Aug 26, 2023, 2:11 PM

#

Principal component analysis

wooden sail Aug 26, 2023, 2:11 PM

#

PCA is one way of doing it, but not the only

past meteor Aug 26, 2023, 2:12 PM

#

Definitely.

wooden sail Aug 26, 2023, 2:13 PM

#

technically any time you use a parametric representation with fewer parameters than there are data points, this happens

rugged mist Aug 26, 2023, 2:13 PM

#

oh interesting i know about pca

wooden sail Aug 26, 2023, 2:13 PM

#

i.e. fitting the 3 parameters of a sinusoid to 100 time domain samples

past meteor Aug 26, 2023, 2:13 PM

#

It's more general than PCA, I guess any compression and reconstruction pair can work

wooden sail Aug 26, 2023, 2:13 PM

#

or linear regression

#

yep

past meteor Aug 26, 2023, 2:15 PM

#

In uni I always wondered why PCA specifically was always mentioned but in the end, the properties of the method just make it ultra suitable

wooden sail Aug 26, 2023, 2:15 PM

#

PCA is mentioned because it ties to covariance matrices and the fundamental theorem of linear algebra in one shot

#

super clean whenever your data is contained in a comparatively low dimensional linear subspace of the overall space the data is in
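
A minimal sketch of that clean case, with PCA done directly through the SVD in NumPy. The sizes and the synthetic data (200 points in R^10 lying on a 2-dimensional affine plane) are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))            # 2 hidden factors per sample
mixing = rng.standard_normal((2, 10))
X = latent @ mixing + 5.0                         # 200 points in R^10 on a 2-D affine plane

mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]                               # principal directions (orthonormal rows)

codes = (X - mean) @ components.T                 # compress: 10 numbers -> 2 per sample
X_rec = codes @ components + mean                 # reconstruct from the 2 numbers

print(np.round(S, 3))                             # only the first two values are non-negligible
print(np.max(np.abs(X - X_rec)))
```

With noisy or curved data the trailing singular values stop being negligible, and the reconstruction becomes an approximation rather than an exact round trip.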

past meteor Aug 26, 2023, 2:16 PM

#

The orthogonality of the eigenvectors is a large selling point

wooden sail Aug 26, 2023, 2:17 PM

#

i would say that's also not even necessary since you can always gauss jordan after finding a basis

past meteor Aug 26, 2023, 2:18 PM

#

Would that apply to say the bottleneck of an autoencoder?

wooden sail Aug 26, 2023, 2:18 PM

#

how do you mean?

past meteor Aug 26, 2023, 2:18 PM

#

Conceptually they're similar things as your principal components, they "compress" the data

wooden sail Aug 26, 2023, 2:19 PM

#

ah yeah

past meteor Aug 26, 2023, 2:19 PM

#

But that property does not hold for autoencoders

wooden sail Aug 26, 2023, 2:19 PM

#

which property?

past meteor Aug 26, 2023, 2:19 PM

#

Orthogonality of the neurons in the bottleneck

wooden sail Aug 26, 2023, 2:19 PM

#

it's kinda moot since it's nonlinear anyway

past meteor Aug 26, 2023, 2:20 PM

#

Tbh people have been playing with this, see beta-autoencoders

wooden sail Aug 26, 2023, 2:21 PM

#

beta encoder means something related to ADCs to me, lemme see if i can find what it means in ML

past meteor Aug 26, 2023, 2:21 PM

#

Or beta VAE's, I'm not at my computer right now so I can't look up the specifics. The idea is just that people want representations that are independent

wooden sail Aug 26, 2023, 2:22 PM

#

you'd have to define what you mean by independent. if it's the neurons you wanna make indep, you need to define some sort of inner product or distance metric

#

if it's an architecture that ends up generating a matrix of atoms, then the final result is some sort of frame and ideas similar to those of PCA apply

stiff dove Aug 26, 2023, 2:24 PM

#

Hello everyone can anybody tell me this kind of loss graph indicates that my learning rate is too high or is there another issue?

past meteor Aug 26, 2023, 2:25 PM

#

I'm a bit rusty in the specifics as you can see, I haven't read the paper in a while

wooden sail Aug 26, 2023, 2:27 PM

#

i'm also not savvy on VAEs, so i can only discuss this very superficially

#

but a cursory glance at a beta VAE paper makes me think this is indeed the case. some regularized learning of "latent factors", but i can't tell if these are used to decompose the data linearly or nonlinearly

#

in either case, it's more or less the same idea: if you can construct a decent model for your data, you can then focus on learning the parameters of that model instead of learning everything from scratch. those parameters are usually few

past meteor Aug 26, 2023, 2:30 PM

#

Personally I'm a big sucker for kernel methods (when I can use them) so I'd just use KPCA over these

wooden sail Aug 26, 2023, 2:31 PM

#

i would also lump those in the same category

past meteor Aug 26, 2023, 2:32 PM

#

Of course, they're all conceptually the same

weak mortar Aug 26, 2023, 3:28 PM

#

Hi 🙂 hobbyist playing with some dataframes, backtesting and data visualisation here. Looking for suggestion for some cool library or software that i can use to further visualize the data. Heatmaps and matplotlib are fine but my dataframe with results will be 3 or maybe more axes, thinking i could find some more tools to assist me here?

left tartan Aug 26, 2023, 3:30 PM

#

weak mortar Hi 🙂 hobbyist playing with some dataframes, backtesting and data vizualisation ...

I use all sorts of libraries. I work in notebooks a lot and have moved away from matplotlib, preferring Plotly. Seaborn has some interesting plots (violin plots) as well. Currently, my main is Plotly. Altair is pretty good too... altair uses vega as the js charting engine, and plotly has plotly.js, so these can produce some nice charts with interactivity.

frozen vessel Aug 26, 2023, 3:32 PM

#

Heyy guys

#

I need some help

weak mortar Aug 26, 2023, 3:33 PM

#

Good, the backtesting.py lib i use plots price graph and other stuff with bokeh, but it is very heavy on the computer to load. Okay i used a bit seaborn for heatmaps will look into its other features and plotly. Would you visualize it in 3d if you had 3 axis?

frozen vessel Aug 26, 2023, 3:34 PM

#

so I wanted to build a feature that allows the user to say input into their microphone, then this input gets translated to english (if it isn't already in english) and then gets printed out in the form of text. Basically a speech to text model with translation

#

Now, the requirement of this project is that it needs to be done using an API (school mandated it 😭 )

#

https://pastebin.com/EsSJhFCy

left tartan Aug 26, 2023, 3:34 PM

#

I rarely have liked 3d visualizations, it tends to be noisy and hard to visually interpret.

frozen vessel Aug 26, 2023, 3:35 PM

#

I tried doing this as my code, it records the input but then for some reason can't transcribe it

#

used whisper api from openai

left tartan Aug 26, 2023, 3:35 PM

#

weak mortar Good, the backtesting.py lib i use plots price graph and other stuff with bokeh,...

What granularity is your data? Hourly or 5 min candles or something?

frozen vessel Aug 26, 2023, 3:35 PM

#

I'd appreciate any help I could get

weak mortar Aug 26, 2023, 3:36 PM

#

I have 1minute data but rn im resampling it to 5min to save time and also bokeh plots every single candle so it lags a lot
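
For reference, a sketch of the 1-minute to 5-minute resampling step in pandas. The synthetic bars and the `Open`/`High`/`Low`/`Close`/`Volume` column names are assumptions; the point is that each column needs its own aggregation.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2023-08-25 09:30", periods=30, freq="min")
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(0, 0.05, len(idx)))
bars_1m = pd.DataFrame(
    {"Open": close, "High": close + 0.02, "Low": close - 0.02,
     "Close": close, "Volume": rng.integers(100, 1_000, len(idx))},
    index=idx,
)

# First open, highest high, lowest low, last close, summed volume per 5-minute bin.
bars_5m = bars_1m.resample("5min").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)
print(len(bars_1m), "->", len(bars_5m))
```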

left tartan Aug 26, 2023, 3:36 PM

#

frozen vessel I'd appreciate any help I could get

Maybe open a #1035199133436354600 thread and share some details? It's fairly specific, but maybe someone will jump on?

weak mortar Aug 26, 2023, 3:37 PM

#

The lib should be able to resample before sending data to bokeh but it didnt work for me

left tartan Aug 26, 2023, 3:37 PM

#

weak mortar I have 1minute data but rn im resampling it to 5min to save time and also bokeh ...

I haven't used bokeh in a while, so not sure if it's slow, but I'd try a few libraries... and make sure it's the library. For complex (lots of points) charts, rendering statically might be better.

weak mortar Aug 26, 2023, 3:39 PM

#

Yeah maybe i will just plot linecharts of equity and close prices with matplotlib instead. Its not that important, more all the metrics of the results i look at

left tartan Aug 26, 2023, 3:39 PM

#

fwiw, plotly.express is kinda nice:

#

```py
import plotly.express as px
fig = px.line(df, x="date", y="price", color="equity")
fig.show()
```

weak mortar Aug 26, 2023, 3:40 PM

#

Okay looks alot like matplotlib syntax iirc 🙂 ill have a shot at them all today. Like to just play around then eventually keep what i see works well

left tartan Aug 26, 2023, 3:41 PM

#

yah, it's intentionally matplotlibby

weak mortar Aug 26, 2023, 3:41 PM

#

I see many use jupyter, but it never really appealed to me

#

I just append all the results into a html page

left tartan Aug 26, 2023, 3:42 PM

#

I primarily work in notebooks, but that's my business. But, we continually refactor anything complex into separate modules.

#

It's a nice environment for doing this type of work, sort of a all in one (repl, visualization, stateful kernel, etc).

weak mortar Aug 26, 2023, 3:44 PM

#

I can see how its probably faster and more flexible than when im modifying the html with .replace(), tag by tag 🫣😝

left tartan Aug 26, 2023, 3:48 PM

#

weak mortar I can see how its probably faster and more flexible than when im modifying the h...

You could also use something like dash

Dash Documentation & User Guide | Plotly

weak mortar Aug 26, 2023, 3:58 PM

#

it looks actually quite nice

#

i think i would be better off using that to make some interactive functionality in the visualized data

#

and make stuff look pretty

polar mason Aug 26, 2023, 7:50 PM

#

I have a small issue with graphs and I dont exactly know which library would be best.

The data is:
x length = 4000
Y length = 10,000
Z length = 255

Drawing up the actual plot with matplotlib takes so long

#

i was actually thinking of using a third party software just for plotting but I was wondering what would be the best way to show this data

serene scaffold Aug 26, 2023, 7:59 PM

#

polar mason I have a small issue with graphs and I dont exactly know which library would be ...

Remember that in computer science, a graph is nodes and edges. Data visualizations are plots.

If you have a lot of things that you're trying to plot, you should probably do some kind of down selection

agile cobalt Aug 26, 2023, 8:00 PM

#

you could aggregate it before plotting, but a third party software won't be much faster than matplotlib if it were just trying to do the same thing as you're trying to do

(it could do it in a way smarter than what you're trying to do though)

polar mason Aug 26, 2023, 8:00 PM

#

serene scaffold Remember that in computer science, a graph is nodes and edges. Data visualizatio...

the thing is currently I need to be able to visualise all data points for a better observation

#

so I cant get rid of any of the data at the moment, all individual nodes are important

serene scaffold Aug 26, 2023, 8:00 PM

#

polar mason the thing is currently I need to be able to visualise all data points for a bett...

If you actually plotted everything, would it actually produce a visualization that was easy to interpret?

polar mason Aug 26, 2023, 8:01 PM

#

serene scaffold If you actually plotted everything, would it actually produce a visualization th...

spaced out, yes

#

ive tried it with a smaller Y length and its perfect, but I want to do the full 10k plots

serene scaffold Aug 26, 2023, 8:01 PM

#

Spaced out? Does that mean you'd need a giant monitor to view it?

agile cobalt Aug 26, 2023, 8:01 PM

#

you can probably aggregate it in buckets of 40x100x1 down to a `100 x 100 x 255` grid without any real loss
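
A sketch of that bucketing with a NumPy reshape. It assumes the data can be laid out as a 2-D grid of values in the 0 to 255 range (one reading of "Z length = 255"), and it uses a scaled-down array so the example stays small; `block_mean` is a hypothetical helper.

```python
import numpy as np

def block_mean(grid: np.ndarray, bx: int, by: int) -> np.ndarray:
    """Average non-overlapping bx-by-by blocks of a 2-D array."""
    nx, ny = grid.shape
    if nx % bx or ny % by:
        raise ValueError("grid shape must be divisible by the block size")
    return grid.reshape(nx // bx, bx, ny // by, by).mean(axis=(1, 3))

rng = np.random.default_rng(0)
# Scaled-down stand-in for the 4000 x 10000 grid.
z = rng.integers(0, 256, size=(400, 1000)).astype(np.float32)
small = block_mean(z, 4, 10)          # the full-size equivalent would be 40 x 100 blocks
print(z.shape, "->", small.shape)
```

Swapping `.mean` for `.max` or `.min` keeps extremes visible, which may matter if individual outliers are what needs to be spotted.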

polar mason Aug 26, 2023, 8:01 PM

#

serene scaffold Spaced out? Does that mean you'd need a giant monitor to view it?

Ive been increasing with this `ax.yaxis.set_major_locator(ticker.MultipleLocator(base=5))`

serene scaffold Aug 26, 2023, 8:02 PM

#

polar mason Ive been increasing with this `ax.yaxis.set_major_locator(ticker.MultipleLocator...

If the plot that you produce can't fit on someone's screen, it's not useful.

polar mason Aug 26, 2023, 8:02 PM

#

Just needs to fit on my screen

serene scaffold Aug 26, 2023, 8:02 PM

#

Is your screen the size of a television?

polar mason Aug 26, 2023, 8:02 PM

#

I need a highres image that I can zoom in on

polar mason Aug 26, 2023, 8:02 PM

#

serene scaffold Is your screen the size of a television?

i wish

serene scaffold Aug 26, 2023, 8:02 PM

#

What would you do once you zoom in? Because you can make more than one plot

#

I have to go all of a sudden.

polar mason Aug 26, 2023, 8:03 PM

#

I just need to plot all of it so I can see which parts need to be edited and changed

#

It takes an hour with only 5000 plots and its killing me

polar mason Aug 26, 2023, 8:04 PM

#

agile cobalt you can probably aggregate it in buckets of 40x100x1 down to a `100 x 100 x 255`...

if I chunk it, is there any way to kind of string them together?

#

im still trying to find ways to space out the whole scatter and increase the amount i can plot

#

Actually, Im gonna look into VisPy

polar mason Aug 26, 2023, 8:38 PM

#

ah I ended up answering my own issue, it was an issue of fig size haha

left tartan Aug 26, 2023, 8:52 PM

#

serene scaffold What would you do once you zoom in? Because you can make more than one plot

I'm imagining this: https://en.wikipedia.org/wiki/The_Great_Picture#/media/File:GP_Hanging_In_Camera_RJ.jpg

The Great Picture

As of 2011, The Great Picture (111 feet (34 m) wide and 32 feet (9.8 m) high) holds the Guinness World Record for the largest print photograph, and the camera with which it was made holds a record for being the world's largest. The photograph was taken in 2006 as part of the Legacy Project, a photographic compilation and record of the history of...

polar mason Aug 26, 2023, 8:52 PM

#

left tartan I'm imagining this: https://en.wikipedia.org/wiki/The_Great_Picture#/media/File:...

fr if i had a place with a big enough screen, i would do that

#

like give me a 12k projector

left tartan Aug 26, 2023, 8:52 PM

#

rent a movie theater

polar mason Aug 26, 2023, 8:53 PM

#

anyway this actually works atm so im cool with it

polar mason Aug 26, 2023, 8:53 PM

#

left tartan rent a movie theater

I'll be sat with popcorn staring at a graph for hours

left tartan Aug 26, 2023, 8:53 PM

#

polar mason anyway this actually works atm so im cool with it

How'd you generate that?

polar mason Aug 26, 2023, 8:53 PM

#

left tartan How'd you generate that?

Its 10 seconds of sensor data

left tartan Aug 26, 2023, 8:53 PM

#

I mean, what library?

polar mason Aug 26, 2023, 8:53 PM

#

matplotlib

#

which i was surprised at

left tartan Aug 26, 2023, 8:54 PM

#

Oh, interesting, came out pretty nice... assumed it was something else

polar mason Aug 26, 2023, 8:54 PM

#

its a 10k image res tho

#

so it takes up like 150 mbs per image

#

but i get good stuff like this

left tartan Aug 26, 2023, 8:54 PM

#

What kind of sensor data?

polar mason Aug 26, 2023, 8:55 PM

#

I believe vibration data?

#

yeah its vibrations over an array of 4000 locations

#

this is part of my thesis atm

#

except i cant get enough of making this lil plot spin

#

it spin

cerulean kayak Aug 26, 2023, 8:57 PM

#

can someone please explain to me what `%matplotlib inline` does in Jupyter?

polar mason Aug 26, 2023, 8:57 PM

#

DogSpin

polar mason Aug 26, 2023, 8:58 PM

#

polar mason except i cant get enough of making this lil plot spin

and then increasing it to

#

more bigger spin

left tartan Aug 26, 2023, 8:59 PM

#

cerulean kayak can someone please explain to me what `%matplotlib inline` does in Jupyter?

That just tells Jupyter to render matplotlib figures inside the notebook, rather than in a separate window.

#

I think it's enabled by default though, at least in my environment

cerulean kayak Aug 26, 2023, 9:00 PM

#

left tartan I think it's enabled by default though, at least in my environment

I was going to ask that.

left tartan Aug 26, 2023, 9:01 PM

#

Yah, doesn't do anything for me. I think it used to create a separate image or something

humble portal Aug 26, 2023, 10:34 PM

#

I'm trying to use an implementation of FIt-SNE, but it runs out of memory on an A10 at this line:

dY = torch.sum(
            (PQ * num.to(device)).unsqueeze(1).repeat(1, no_dims, 1).transpose(2, 1) * (Y.unsqueeze(1) - Y),
            dim=1
        )

Unfortunately, these variables are poorly named and undocumented. However, the code preceding the line where it runs out of memory is

sum_Y = torch.sum(Y * Y, dim=1)
num = -2. * torch.mm(Y, Y.t())  # (N, N)
num = 1. / (1. + (num.to('cpu') + sum_Y.to('cpu')).t().to('cpu') + sum_Y.to('cpu'))
num.fill_diagonal_(0)
Q = num / torch.sum(num)
Q = torch.max(Q, torch.tensor(1e-12, device=Q.device))
Q = Q.to(device)

# Compute gradient
PQ = P - Q

Q.to('cpu')
P.to('cpu')

Y is initialized to a tensor of zeros with as many rows as the dataset has elements. The code given, as well as the line that breaks, is within a loop that runs for a given number of iterations. P is some tensor based on the dataset (I think it's the dataset after undergoing some dimension reduction).

Essentially: How can I change this expression to calculate dY with less memory?
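
One possible angle, sketched in NumPy so it runs without a GPU: reading the broadcast shapes, the expression computes `dY[i] = sum_j W[i, j] * (Y[i] - Y[j])` with `W = PQ * num`, and that sum can be rearranged so the `(N, N, no_dims)` intermediate is never built. The names `W`, `dy_naive` and `dy_lean` are made up here, and the small random arrays are only stand-ins for the real tensors.

```python
import numpy as np

def dy_naive(W: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Mirror of the original expression: builds an (N, N, d) intermediate."""
    diff = Y[:, None, :] - Y[None, :, :]          # diff[i, j] = Y[i] - Y[j]
    return (W[:, :, None] * diff).sum(axis=1)     # sum over j

def dy_lean(W: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Same result using only (N, N) and (N, d) arrays.

    sum_j W[i, j] * (Y[i] - Y[j]) = Y[i] * sum_j W[i, j] - (W @ Y)[i]
    """
    return W.sum(axis=1, keepdims=True) * Y - W @ Y

rng = np.random.default_rng(0)
N, d = 50, 2
W = rng.normal(size=(N, N))   # stands in for PQ * num
Y = rng.normal(size=(N, d))   # stands in for the embedding

naive = dy_naive(W, Y)
lean = dy_lean(W, Y)
print(np.allclose(naive, lean))
```

In torch the lean version would read `W.sum(dim=1, keepdim=True) * Y - W @ Y`. The `(N, N)` matrices themselves still have to fit in memory, so this only removes the extra factor of `no_dims`, and it is worth checking against the original on a small subset first.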

polar mason Aug 27, 2023, 3:37 AM

#

polar mason and then increasing it to

Ended up just changing matplot settings in the end with cython to execute it and be a tiny bit faster, but im finally getting the stuff i wanted

#

the actual file is about 150 mbs

simple tapir Aug 27, 2023, 8:05 AM

#

hey

#

data_id_mean = data.ID.mean()
data.ID.map(lambda p: p - data_id_mean)

What does this actually do?

#

We take the mean of ID column but what's p here? What do we minus from mean of IDs?

small wedge Aug 27, 2023, 8:15 AM

#

simple tapir We take the mean of ID column but what's p here? What do we minus from mean of I...

p is any element the map function passes through to the lambda, i.e. each element of the ID attribute. I believe this would be a type of data normalization
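
A tiny plain-Python version of the same idea, in case it helps to see `p` take each value in turn. The list `ids` is a made-up stand-in for the ID column; `Series.map` applies the lambda element by element in the same way.

```python
ids = [10.0, 20.0, 30.0, 40.0]        # hypothetical stand-in for data.ID
id_mean = sum(ids) / len(ids)         # 25.0

# map calls the lambda once per element; p is that element each time.
centered = list(map(lambda p: p - id_mean, ids))
print(centered)                       # [-15.0, -5.0, 5.0, 15.0]
```

In pandas the vectorised form `data.ID - data.ID.mean()` should give the same values without the lambda.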

simple tapir Aug 27, 2023, 8:17 AM

#

hmm, I see thanks

weak mortar Aug 27, 2023, 9:20 AM

#

While i always liked numbers, i think i should acquire some knowledge about statistics. I could ask a dozen questions about variance, standardization and quantiles, but maybe i should read a book about it. Not that i like books alot though. Any suggestions? And also, good morning

#

(Non professional data scientist, algotrader thats asking)

twilit tundra Aug 27, 2023, 9:22 AM

#

weak mortar While i always liked numbers, i think i should acquire some knowledge about stat...

Statistics by David Freedman is pretty accessible and can be found online for free

weak mortar Aug 27, 2023, 9:24 AM

#

I like the straight forward title! Thx

#

Will look for it

twilit tundra Aug 27, 2023, 9:25 AM

#

If you prefer videos, statquest has a lot of short and accessible videos

weak mortar Aug 27, 2023, 9:25 AM

#

Even if i only end up reading half of it like all other books i try and read.. still a half book wiser

weak mortar Aug 27, 2023, 9:57 AM

#

While it makes sense to just look at the bulk of the distribution (quantiles), the outliers do also have relevance 🤔 i think the big challenge is to narrow down the metrics to the most useful ones to assist an informed decision

unique ether Aug 27, 2023, 10:52 AM

#

I've narrowed it down to the following fields: Statistics, Linear Algebra, Probability theory and Calculus.

pine wolf Aug 27, 2023, 11:02 AM

#

unique ether I've narrowed it down to the following fields: Statistics, Linear Algebra, Proba...

linear algebra, calculus first in that order

#

though calculus will probably be taught before linear algebra in college

cosmic harbor Aug 27, 2023, 11:03 AM

#

Hello everyone,
After building Pytorch from source, CUDA is not available. Here is what I did:

git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
git submodule sync
git submodule update --init --recursive
conda install cmake ninja
pip install -r requirements.txt
conda install mkl mkl-include
conda install -c pytorch magma-cuda115
make triton
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py develop

Can anyone help?

unique ether Aug 27, 2023, 11:06 AM

#

pine wolf linear algebra, calculus first in that order

Many thanks

past meteor Aug 27, 2023, 11:10 AM

#

unique ether Many thanks

probability theory is a nice extension of linear algebra and calculus

#

so you can do it after lin alg/calc

unique ether Aug 27, 2023, 11:10 AM

#

past meteor probability theory is a nice extension of linear algebra and calculus

Oh nice good to know!

past meteor Aug 27, 2023, 11:11 AM

#

It will depend on your text book but we had mini proofs in probability theory that required a working knowledge of say integrals

unique ether Aug 27, 2023, 11:12 AM

#

Basically my situation right now is this: I'm starting a 1 year conversion course soon as some of you already know going from a Bsc in Nat Sci to an Msc in AI and ML.

I know a bit of statistical testing from my Nat Sci course but not too much.

I've been studying python for like 10 hours a day for the last month and now I'm trying to brush up on my maths. What would you all recommend I focus on? I have roughly 20 days until the start of my course.

#

Just assume I'll be studying 10 hours a day up until the course starts

#

Because I will be

pine wolf Aug 27, 2023, 11:13 AM

#

the calculus you need for ML i don't think is too deep, at least not to start; not that i'm that familiar with ML -- but linalg is everywhere

unique ether Aug 27, 2023, 11:14 AM

#

pine wolf the calculus you need for ML i don't think is too deep, at least not to start; n...

So like calculus 1?

#

maybe a lil bit of calculus 2?

pine wolf Aug 27, 2023, 11:15 AM

#

yeah, like calc 1 to understand back propagation

unique ether Aug 27, 2023, 11:16 AM

#

Roger that

#

Many thanks mate

#

I'm actually watching an udemy course on the fundamentals of math right now. Just to refresh my math knowledge and have a solid foundation going forward.

#

I'm thinking: Math fundamentals (refresher) --> Algebra (refresher) --> Linear Algebra --> Calculus 1

keen solstice Aug 27, 2023, 11:30 AM

#

hey guys, I had a question.
When I type print( in my jupyter lab, it doesn't complete the other bracket at the end, nor does it automatically add the ending quotation mark if I type the opening one.

Any fix?

unique ether Aug 27, 2023, 11:30 AM

#

are you typing in a markdown cell?

keen solstice Aug 27, 2023, 11:30 AM

#

no in a code cell

unique ether Aug 27, 2023, 11:31 AM

#

I'm not sure then mate sorry. I'm new too.

polar mason Aug 27, 2023, 11:37 AM

#

me and my friends have been debating.

For just general AI and data science, higher ram speeds or more ram volume?

#

like we were talking about how you could chunk it to make more out of the speeds, but then more volume means you can just load it at once

#

which is better?

untold bloom Aug 27, 2023, 12:32 PM

#

keen solstice hey guys,I had a question. WHen I type print( in my jupyter lab, it doesnt compl...

hi, did you try the "Settings" in the menu, then "Auto Close Brackets for Text Editor"

keen solstice Aug 27, 2023, 12:42 PM

#

untold bloom hi, did you try the "Settings" in the menu, then "Auto Close Brackets for Text E...

yeah,ty

potent sky Aug 27, 2023, 12:47 PM

#

polar mason me and my friends have been debating. For just general AI and data science, hi...

below a certain threshold, lower RAM can make some things pretty much impossible.
I guess slower RAM will only ever make things slower.

unique ether Aug 27, 2023, 12:48 PM

#

At its core, AI and ML is just maths isn't it?

#

I'm starting to realise that

past meteor Aug 27, 2023, 12:50 PM

#

unique ether At its core, AI and ML is just maths isn't it?

yes it's applied math

#

Like an engineering discipline the real world matters (the problem you're solving) and also your constraints (stuff related to comp sci)

unique ether Aug 27, 2023, 12:51 PM

#

So its just applying maths to a problem within the constraints of computer science

unique ether Aug 27, 2023, 2:51 PM

#

Do you lot think that trying to code math problems into python is good practice for a career in ML?

#

math problems and formulas and equations and such

wooden sail Aug 27, 2023, 3:05 PM

#

that's unavoidably a large part of what you'll do

#

along with the practical implications of handling huge amounts of data and doing math with stuff that doesn't fit in memory

left tartan Aug 27, 2023, 3:23 PM

#

unique ether Roger that

fwiw, these videos are wonderful intros for linear and calc. Just a high level intro, but I'd suggest watching it before and after you take either course... it's just really well done: https://www.youtube.com/@3blue1brown/courses. And for calculus, this is a gentle intro to Calculus taught by one of the great professors of the subject: https://ocw.mit.edu/courses/res-18-005-highlights-of-calculus-spring-2010/. There are probably other options for linear & calculus deep dives, but his explanations are probably my favorite. I absolutely love his proof of the derivative of e^x.

flint orbit Aug 27, 2023, 4:54 PM

#

Hi guys, is this the right place to ask questions about Jupyter?

past meteor Aug 27, 2023, 5:09 PM

#

flint orbit Hi guys, is this the right place to ask questions about Jupyter?

yes, ask away

flint orbit Aug 27, 2023, 5:11 PM

#

past meteor yes, ask away

I think I solved it. The problem was with the fact that jupyter closed the cell for editing right at the moment when I typed something and stopped even for a short moment. Presumably it was related to autosave setting of vs code, I'm running notebooks inside it.

twin forge Aug 27, 2023, 5:51 PM

#

left tartan Perhaps you have spurious zeros, or perhaps gaps in data that’s been filled inco...

a friend of mine suggested the following approach:

https://github.com/phrenico/uniqed/blob/master/README.rst

#

from uniqed.runners.tof_run import detect_outlier

df = detect_outlier(ohlcv_dict["UPRO"]["Adj Close"], cutoff_n=80)


Cell In[4], line 1
    df = detect_outlier(ohlcv_dict["UPRO"]["Adj Close"], cutoff_n=80)

  File ~\AppData\Roaming\Python\Python310\site-packages\uniqed\runners\tof_run.py:34 in detect_outlier
    np_time_series = time_series.values[:, 0]

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

but my test run was unsuccessful. I've tried reset_index() on the "Adj Close" series to create a "Date" column, and .reshape(). Any ideas?
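
Judging only from the traceback line `time_series.values[:, 0]`, the function seems to expect something two-dimensional such as a DataFrame, while a single column selected with one pair of brackets is a 1-D Series. The sketch below reproduces just that line; `first_column` is a hypothetical stand-in, not uniqed's actual code, and the prices are made up.

```python
import pandas as pd

def first_column(time_series):
    """Hypothetical stand-in that mimics only the failing line of the traceback."""
    return time_series.values[:, 0]

prices = pd.Series([10.0, 10.5, 11.0], name="Adj Close")   # made-up 1-D data

try:
    first_column(prices)            # .values is 1-D, so [:, 0] has one index too many
except IndexError as exc:
    print("Series fails:", exc)

frame = prices.to_frame()           # one-column DataFrame, .values has shape (3, 1)
values = first_column(frame)
print(values)
```

If `ohlcv_dict["UPRO"]` is a DataFrame, selecting with double brackets, `ohlcv_dict["UPRO"][["Adj Close"]]`, should also give a one-column frame. Whether `detect_outlier` then runs cleanly depends on the rest of the library.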

left tartan Aug 27, 2023, 5:53 PM

#

twin forge a friend of mine suggested the following approach: https://github.com/phrenico...

I'm just saying, you have to understand the mechanisms behind the actions before applying filters and outliers. It's very likely there's an easier way to filter the data.

#

For example, if there's an erroneous 0 after trading hours /etc, perhaps you want to remove them. Or, perhaps you just want to restrict your data to the trading window.

twin forge Aug 27, 2023, 5:54 PM

#

left tartan For example, if there's an erroneous 0 after trading hours /etc, perhaps you wan...

yfinance registers prices on close so it's probably not after hour data

left tartan Aug 27, 2023, 5:55 PM

#

You seem to not want to look at the data? I don't really get it. Pick a spot where a particular equity has a lot of zeros and look at it.

twin forge Aug 27, 2023, 5:55 PM

#

left tartan Aug 27, 2023, 5:55 PM

#

The dataframe, is what I'm talking about.

twin forge Aug 27, 2023, 5:56 PM

#

sure but saying "it's after hour data" without even looking if yahoo finance registers it

#

kinda ducky_sphere

left tartan Aug 27, 2023, 5:56 PM

#

I said "perhaps".

#

You won't know until you actually inspect the anomalies.

lavish ember Aug 27, 2023, 6:08 PM

#

should i install packages like tensorflow globally or in virtual environment? I am beginner.

past meteor Aug 27, 2023, 6:24 PM

#

lavish ember should i install packages like tensorflow globally or in virtual environment? I ...

in a virtual environment

frosty root Aug 27, 2023, 7:15 PM

#

hey so im writing my college application and one of the supplementals is why i want to do the honors program, im saying that i want to do it for the mentorship so i can learn a ton ab ai and using it (which is true) but i dont know what there is in the real world with ai that isnt taught in the basic courses i screenshotted below. any thoughts on some things that college doesnt teach you in ai that a mentor from the industry could?

small wedge Aug 27, 2023, 7:40 PM

#

frosty root hey so im writing my college application and one of the supplementals is why i w...

I'd assume AI fundamentals will not go over things like deploying trained models and industry concerns when distributing ML services to users such as alignment and adversarial attacks. Depends on exactly what your mentor would do though, there are applications of AI/ML that aren't distributed to users ofc.

frosty root Aug 27, 2023, 7:41 PM

#

small wedge I'd assume AI fundamentals will not go over things like deploying trained models...

industry probably isnt the way to go with this i dont think, because i have internships for that. stuff in the actual theory is more aligned i think

molten onyx Aug 27, 2023, 8:20 PM

#

i have a bug in my backpropagation function, and i have a feeling it's in calculating the derivative. can you tell me if i have an obvious flaw in my code?
the code is in c++ but i think it's understandable

    double derivative_outputlayer(double current_weight, double output, double prev_output, double target)
    {
        double f1 = output - target;
        double f2 = output * (1 - output);
        double f3 = prev_output * current_weight;

        double derivative = f1 * f2 * f3;
        
        return derivative;
    }

    double derivative_hiddenlayer(double current_weight, double output, double prev_output, double sum_gradients)
    {
        double f1 = output * (1 - output);
        double f2 = prev_output;
        double f3 = sum_gradients;

        double derivative = f1 * f2 * f3;

        return derivative;
    }

fleet musk Aug 27, 2023, 8:36 PM

#

Hi. Anaconda keeps telling me to update to latest version.

If i update, will it update all installed packages across my envs? Will my old projects break?

past meteor Aug 27, 2023, 8:41 PM

#

molten onyx i have a bug in my backpropagation function i have a feeling it's with calculati...

Can I comment on the code itself a bit?

1. I would give more descriptive names to f1, f2 and f3, especially considering f2 in function 1 is the same quantity as f1 in function 2. To me this detracts from the readability.
2. I'd rename the function to make clear that you're dealing with a sigmoid. If I see "derivative_outputlayer" I then have to check the body of the function to see what loss you're using. You can help the reader out a bit here 🙂

molten onyx Aug 27, 2023, 8:46 PM

#

past meteor Can I comment on the code itself a bit? 1. I would give more descriptive names...

thanks, ill try to implement it!

past meteor Aug 27, 2023, 8:47 PM

#


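double derivative_outputlayer_sigmoid(double current_weight, double output, double prev_output, double target)
{
    // current_weight is unused here: it does not appear in the gradient w.r.t. this weight
    // error term (this form assumes a squared-error loss)
    double error = output - target;
    // derivative of the sigmoid, written in terms of its own output
    double activation_derivative = output * (1 - output);

    // chain rule: error * activation slope * input that fed this weight
    double derivative = error * activation_derivative * prev_output;

    return derivative;
}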

#

it's also strange that you're doing the derivative w.r.t. a single neuron in the previous layer

molten onyx Aug 27, 2023, 8:52 PM

#

im kinda confused anyways with the whole backprop thing. i watched a few videos and every time i watched them i had a million questions which were left unanswered

past meteor Aug 27, 2023, 8:53 PM

#

Okay can I give you a protip then?

#

Start smaller. Just write out gradient descent in any language you want. Then turn it into stochastic gradient descent. Then add regularization.

#

Write your own data generating function. Play around with the amount of regularization, play with the batch size. See what it's doing and try and find out why.

#

Once you've done SGD for a simple linear model implementing backprop will be a bit easier. Also, it's important to know that nowadays people don't do backprop like this. They use automatic differentiation
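
A minimal sketch of that progression for a one-feature linear model, in plain Python: a data generating function, mini-batch SGD on squared error, and an optional L2 penalty on the weight. The function names, learning rate, batch size and noise level are all arbitrary choices for illustration.

```python
import random

def make_data(n, true_w, true_b, noise, rng):
    """Generate (x, y) pairs from y = true_w * x + true_b + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, true_w * x + true_b + rng.gauss(0.0, noise)))
    return data

def sgd(data, lr=0.1, l2=0.0, batch_size=8, epochs=200, rng=None):
    """Fit y ~ w * x + b by mini-batch SGD on mean squared error + l2 * w**2."""
    rng = rng or random.Random(0)
    data = list(data)                 # copy so shuffling does not touch the caller's list
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)             # a fresh random order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradients of the batch-mean squared error w.r.t. w and b.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            grad_w += 2 * l2 * w      # gradient of the L2 penalty (bias left unpenalised)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

rng = random.Random(42)
data = make_data(200, true_w=3.0, true_b=-1.0, noise=0.1, rng=rng)
w, b = sgd(data, l2=0.0, rng=rng)
w_reg, b_reg = sgd(data, l2=0.1, rng=rng)
print(w, b)          # should land near the true 3.0 and -1.0
print(w_reg, b_reg)  # the penalty shrinks the weight toward zero
```

Setting `batch_size=len(data)` turns this back into plain gradient descent, which makes it easy to compare the two.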

molten onyx Aug 27, 2023, 8:56 PM

#

ok, thanks! i think its the best advice ive ever gotten in my whole journey.

#

ik that backprop isnt as relevant nowadays but i find it quite interesting and im only gonna use it to apply for a apprenticeship

humble portal Aug 27, 2023, 8:59 PM

#

humble portal I'm trying to use an implementation of FIt-SNE, but it runs out of memory on an ...

bumping my post from yesterday

past meteor Aug 27, 2023, 9:00 PM

#

molten onyx ik that backprop isnt as relevant nowadays but i find it quite interesting and i...

Automatic differentiation is still backprop, it's just done differently. You can kind of conceptualise it as a class/type that wraps a scalar, vector, tensor, ... basically each time you do any mathematical operation with that type it "acts" like a regular math object but it stores the gradient internally
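
A toy sketch of that idea for scalars only, supporting just `+` and `*`. The class name `Value` and its layout are made up for illustration; real libraries are far more elaborate, but the bookkeeping is similar in spirit.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents   # tuple of (parent Value, local derivative) pairs

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a * b)/da = b and d(a * b)/db = a
        return Value(self.data * other.data, ((self, other.data), (other, self.data)))

    def backward(self):
        # Order the graph so every node is processed after all nodes that use it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node._parents:
                parent.grad += local * node.grad   # chain rule, accumulated

x = Value(3.0)
w = Value(2.0)
b = Value(1.0)
y = w * x + b        # behaves like ordinary arithmetic: y.data is 7.0
y.backward()
print(w.grad, x.grad, b.grad)   # dy/dw = x = 3.0, dy/dx = w = 2.0, dy/db = 1.0
```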

#

https://arxiv.org/abs/2106.11342 <= this book also uses the method of starting with linear regression and moving to neural nets if you want more "guidance"

arXiv.org

Dive into Deep Learning

This open-source book represents our attempt to make deep learning
approachable, teaching readers the concepts, the context, and the code. The
entire book is drafted in Jupyter notebooks, seamlessly integrating exposition
figures, math, and interactive examples with self-contained code. Our goal is
to offer a resource that could (i) be freely av...

molten onyx Aug 27, 2023, 9:03 PM

#

ill look into it tomorrow, thanks!

past meteor Aug 27, 2023, 9:05 PM

#

humble portal I'm trying to use an implementation of FIt-SNE, but it runs out of memory on an ...

How big is your dataset?

humble portal Aug 27, 2023, 9:05 PM

#

42000 elements

past meteor Aug 27, 2023, 9:05 PM

#

If I recall correctly, but you'll have to fact check me on this, t-sne does form some kind of pairwise similarity matrix, basically an NxN thing

humble portal Aug 27, 2023, 9:06 PM

#

Sorry, I forgot to copy the post with the dimensions. PQ and num are (42000 x 42000), while Y is (42000 x 2)

past meteor Aug 27, 2023, 9:06 PM

#

42000x42000 x floating point precision

humble portal Aug 27, 2023, 9:07 PM

#

Yeah pretty much. I'm not well versed in how t-sne works, I'm just trying to visualize datasets so that I can display coreset coverage, and there aren't any good torch implementations of t-sne, so I'm trying to get the only not-completely-garbage one I could find to work

past meteor Aug 27, 2023, 9:07 PM

#

If it's double precision you're looking at idk 14.1GB ram?
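
The arithmetic behind that estimate, spelled out. The dtypes are assumptions, since the posted code does not show them; the last line sizes the `(N, N, 2)` intermediate that the failing expression would build if the tensors were float32.

```python
n = 42_000

bytes_fp64 = n * n * 8            # one N x N matrix in double precision
bytes_fp32 = n * n * 4            # the same matrix in single precision
bytes_nn2_fp32 = n * n * 2 * 4    # an (N, N, 2) float32 intermediate (assumed dtype)

def gb(num_bytes):
    """Decimal gigabytes (10**9 bytes)."""
    return num_bytes / 1e9

def gib(num_bytes):
    """Binary gibibytes (2**30 bytes)."""
    return num_bytes / 2**30

print(f"N x N float64:     {gb(bytes_fp64):.2f} GB = {gib(bytes_fp64):.2f} GiB")
print(f"N x N float32:     {gb(bytes_fp32):.2f} GB = {gib(bytes_fp32):.2f} GiB")
print(f"N x N x 2 float32: {gb(bytes_nn2_fp32):.2f} GB = {gib(bytes_nn2_fp32):.2f} GiB")
```

So 14.1 GB and roughly 13.1 GiB are the same quantity in different units, and an `(N, N, 2)` float32 tensor costs exactly as much as one `(N, N)` float64 matrix.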

#

How much memory do you have?

#

Does it need to be from Torch?

humble portal Aug 27, 2023, 9:10 PM

#

That step is trying to create 13.2GiB, so yeah that seems accurate.

It does in fact need to work with torch. Keras will not work for my implementation. I have 2 A10s, although it seems that this can only utilize one at any given time.

past meteor Aug 27, 2023, 9:11 PM

#

I should really look into the algorithm in detail but sklearn's docs strongly imply they have a method that defers forming that NxN matrix but is slower https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

scikit-learn

sklearn.manifold.TSNE

Examples using sklearn.manifold.TSNE: Comparison of Manifold Learning methods Manifold Learning methods on a severed sphere Manifold learning on handwritten digits: Locally Linear Embedding, Isomap...

#

https://scikit-learn.org/stable/modules/manifold.html#t-sne

humble portal Aug 27, 2023, 9:11 PM

#

I don't believe that the sklearn method works with tensors

past meteor Aug 27, 2023, 9:12 PM

#

nope

humble portal Aug 27, 2023, 9:13 PM

#

It also seems to be a CPU-compute method, so it won't function well with such a large dataset, let alone what I need to eventually do with IMAGENET

past meteor Aug 27, 2023, 9:13 PM

#

Plus it'd be a moot point if it goes ahead and forms that pairwise distance matrix anyway.

Are you open to other ways to visualise your dataset?

humble portal Aug 27, 2023, 9:14 PM

#

Such as?

It seems that t-SNE is the standard now in domains that require a representation of "distance"

past meteor Aug 27, 2023, 9:15 PM

#

Heads up though, I've had a month long holiday and I'm a bit rusty ML-wise so maybe I'm not seeing something.

#

If you have images you can just ram them through any pretrained model and compute whatever distance you want. Just remove the top of the network, flatten or pool, whichever you want and then compute the distance

#

Also, at that point you have a vector. You can PCA this vector if you want and plot that way. This is what I did in the past sometimes.

humble portal Aug 27, 2023, 9:16 PM

#

past meteor If you have images you can just ram them through any pretrained model and comput...

Not really what I need. I'm working on coreset selection, so I need to display the positions of selected coreset vs redundant elements within datasets within the domain space.

past meteor Aug 27, 2023, 9:17 PM

#

Let me read your original post and think for a bit. EDIT: not much to go on for your problem 😛

past meteor Aug 27, 2023, 9:18 PM

#

humble portal Not really what I need. I'm working on coreset selection, so I need to display ...

What exactly do you mean with "the positions of selected coreset"

humble portal Aug 27, 2023, 9:23 PM

#

I can't say much on the topic, but I'm attempting to use coreset selection to guess the bounding manifolds of the individual classes within a space. I want to display the dataset in a way where I can highlight certain points based on said coreset selection.

stable cape Aug 27, 2023, 9:24 PM

#

Hello! I have a question. I have an algorithm that gathers data on financial performance, keeps it and yields it to an AI. However, im facing problems feeding the ai bc the amount of data is too large. To solve this, I am trying with langchain models, but i dont know which one and how to use it. Does anyone know about it ? If not, do you know any alternatives?

#

langchain agents*

past meteor Aug 27, 2023, 9:27 PM

#

humble portal I can't say much on the topic, but I'm attempting to use coreset selection to gu...

I see! I hadn't heard the term coreset selection specifically yet. Another niche I just discovered. I guess this is very specific so I'm probably not of much help.

#

My profs loved kernel methods so we covered Determinantal point processes which are probably coreset selection now that I know what the term is

#

You'll run into the same issue there cause as you probably know Kernel methods need that NxN matrix as well 😿

humble portal Aug 27, 2023, 9:30 PM

#

past meteor I see! I hadn't heard the term coreset selection specifically yet. Another niche...

It's a niche that's gaining ground due to the growth of massive unlabeled datasets with things like LLM training, so I'd suggest looking into them. CRAIG is a good algorithm to start by looking at.

The issue isn't the NxN matrix, so much as that the line I posted is creating a 2nxn matrix for some ungodly reason.

past meteor Aug 27, 2023, 9:34 PM

#

humble portal It's a niche that's gaining ground due to the growth of massive unlabeled datase...

Yes but the NxN alone means you'll never scale. Having it twice sucks yes but just the one is already a death sentence, no?

#

Or am I missing something, especially since you mentioned imagenet

humble portal Aug 27, 2023, 9:37 PM

#

I guess you're correct there. Hmm well since I only need to calculate it once for each dataset, I guess a CPU-bound option works. How fast is the sklearn option? Even if I need to run over the course of a few days, there shouldn't be too much of an issue.

past meteor Aug 27, 2023, 9:38 PM

#

The docs say it can take hours 🤣

pallid badge Aug 27, 2023, 10:55 PM

#

Hi, I need to process 200000 images. I want to store them into an hdf5 file, 3d array. First dim is the image number. I track the image number. I create a hdf5 dataset. Can I populate the array only by keeping track of the image number in the parallel processing?

dset[image_number]=converted_image

Then I rely on my image_number to sort out the position in the dataset?
Does h5py put the converted data into the correct location of the dataset via the image_number?

desert oar Aug 28, 2023, 1:19 AM

#

pallid badge Hi, I need to process 200000 images. I want to store them into an hdf5 file, 3d ...

what do you mean by "correct location"? when you iterate over the dataset, the images should be returned in the order of image_number, yes.

desert oar Aug 28, 2023, 1:20 AM

#

worn stratus for the most part, the cost is small enough to not matter for analysis. create a...

sure, although analytics queries can be pretty slow even against temporary tables, depending on what you're doing. i think the actual use case for duckdb is embedded real-time analytics inside applications. personally i'm happy with pandas and polars for analysis work.

pallid badge Aug 28, 2023, 1:31 AM

#

desert oar what do you mean by "correct location"? when you iterate over the dataset, the i...

Hi, sorry not see. I have to fill a new dataset in a hdf5 file.

#

I am trying parallel processing, the images are distributed and come back now converted. They have to be placed back into the dataset (3D np.ndarray).

#

I use zmq
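
Independent of the transport, the pattern can be sketched like this: each worker returns its image number together with the result, and a single writer assigns by index, so the order in which results come back does not matter. A NumPy array stands in for the HDF5 dataset here (an h5py dataset accepts the same `dset[i] = array` assignment), threads stand in for the zmq workers, and `convert` is a hypothetical placeholder.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

import numpy as np

def convert(image_number, image):
    """Hypothetical conversion step; returns the index along with the result."""
    return image_number, image * 2

n_images, height, width = 20, 4, 4
raw = np.arange(n_images * height * width, dtype=np.float32).reshape(n_images, height, width)

# Stand-in for the pre-created 3-D dataset; first dimension is the image number.
dset = np.zeros_like(raw)

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(convert, i, raw[i]) for i in range(n_images)]
    for future in as_completed(futures):        # results arrive in arbitrary order
        image_number, converted = future.result()
        dset[image_number] = converted          # the index decides the position

print(np.array_equal(dset, raw * 2))
```

Keeping all writes in one process is the cautious choice; as far as I know, writing to the same HDF5 file from several processes needs a parallel-enabled build.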

stiff wedge Aug 28, 2023, 3:50 AM

#

I am trying to convert a `GZ File` to a `CSV File` however I am running into errors, this is my current code:

import gzip
import csv
import io

with gzip.open(r'C:\code\un-general-debates-blueprint.csv.gz', 'rt', encoding="utf-8") as vFile:
    file_content = vFile.read()
    #file_content = csv.reader(vFile)

nlpFile = open(r"C:\code\un-general-debates-blueprint.csv", "w")

nlpFile.write(file_content)

nlpFile.close()

print(file_content)

Does anyone know how to transform this `GZ File` into a saved `CSV File`?
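
The error itself isn't shown, so this is only a guess at a more robust shape: copy the decompressed bytes straight to the output file. That avoids any text encoding on the way out (on Windows, `open(..., "w")` without `encoding=` uses the locale's default, which may not be UTF-8) and avoids loading the whole file into memory. The helper name `gunzip` and the demo file are made up.

```python
import gzip
import shutil
import tempfile
from pathlib import Path

def gunzip(src, dst):
    """Decompress src (.gz) into dst by streaming raw bytes."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)

# Small round-trip demo with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "demo.csv.gz"
    dst = Path(tmp) / "demo.csv"
    original = 'country,text\nFRA,"déjà vu, again"\n'.encode("utf-8")
    with gzip.open(src, "wb") as f:
        f.write(original)
    gunzip(src, dst)
    restored = dst.read_bytes()

print(restored == original)
```

For the real file the call would be `gunzip(r'C:\code\un-general-debates-blueprint.csv.gz', r'C:\code\un-general-debates-blueprint.csv')`.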

left tartan Aug 28, 2023, 3:54 AM

#

stiff wedge I am trying to convert a `GZ File` to a `CSV File` however I am running into err...

What errors?

stiff wedge Aug 28, 2023, 4:04 AM

#

left tartan What errors?

This is the error:

sonic meteor Aug 28, 2023, 6:29 AM

#

Can anyone help me get into neuroevolution??

lapis sequoia Aug 28, 2023, 8:20 AM

#

How to fix this?

#

warm shard Aug 28, 2023, 9:26 AM

#

Any NN journey? im new to ML, started with sound analysis with logistic regression, planning to move on to naive bayes and neural networks or even more of them in time.

#

i dont know much about NNs i want to gain experience with data management first.

desert oar Aug 28, 2023, 1:02 PM

#

lapis sequoia

post as text, not image please

desert oar Aug 28, 2023, 1:04 PM

#

warm shard Any NN journey? im new to ML started with sound analyzation with logistic regres...

there's no one journey that works best for everyone. imo you'll do well by following your interests. if you like audio analysis, keep doing that. you'll eventually find excuses to learn all the things you need to know. for example data management: try storing metadata in various formats, figure out how to store binary data, figure out how to store your fitted model, etc.

#

the only caveat here is that audio data can be a little complicated, and it tends to be easier to work with "tabular" data (like what you might find in a spreadsheet)

#

but you can work with audio metadata in tabular form, like a database of songs, year released, etc.

#

like you could try to predict genre using artist and title or something

warm shard Aug 28, 2023, 1:06 PM

#

desert oar the only caveat here is that audio data can be a little complicated, and it tend...

I actually planned audio as something that's fun to develop and helps me learn AI better. I will look into these.

desert oar Aug 28, 2023, 1:06 PM

#

at the same time, plenty of people do machine learning with audio and image data just using a big folder of files & a text file of labels

desert oar Aug 28, 2023, 1:06 PM

#

warm shard I actually planned audio as something thats fun to develop and make me learn AI ...

yeah keep doing that. lots of opportunities to explore the various tools used in machine learning

warm shard Aug 28, 2023, 1:08 PM

#

desert oar yeah keep doing that. lots of opportunities to explore the various tools used in...

Yeah, I'm planning to learn them after I master data management etc. in basic models. I'll probably start learning NNs, random forests, and decision trees in a few months. There's too much to search.

warm shard Aug 28, 2023, 1:08 PM

#

desert oar but you can work with audio metadata in tabular form, like a database of songs, ...

Thats a cool option too but i already started trying to classify signals.

desert oar Aug 28, 2023, 1:09 PM

#

warm shard Yeah im planning to learn them after i master data management etc in basic model...

the field is huge, take your time

warm shard Aug 28, 2023, 1:09 PM

#

Okay, thanks for all the advice.

desert oar Aug 28, 2023, 1:09 PM

#

i don't think you need to master data management. data management is a means to an end. start simple, and if your needs expand you will learn more things.

warm shard Aug 28, 2023, 1:10 PM

#

desert oar i don't think you need to _master_ data management. data management is a means t...

I'm just trying to build practical skills: when I develop an ML model, I don't go into things that I don't need in that model.

lapis sequoia Aug 28, 2023, 1:13 PM

#

desert oar post as text, not image please

Not able to install lazypredict library

twin forge Aug 28, 2023, 1:19 PM

#

```py
def log_retornos(n):
    df = pd.DataFrame()
    for t in clean_price.keys():
        df[t] = clean_price[t]["Adj Close"]
    df = df.dropna()
    
    sec_returns = np.log(1 + df.pct_change()).dropna() * 100
    sec_outliers = pd.DataFrame()
    for t in sec_returns.columns:
        sec_outliers[t] = detect_outlier(sec_returns[[t]], cutoff_n=80)["TOF"]
    
    for t in clean_price.keys():
        try:
            plt.figure(figsize=(22, 10))
            plt.rcParams.update({"font.size": 19})
            plt.plot(sec_returns[t])
        
            outlier_positions = sec_outliers.index[sec_outliers[t] == 1]
        
            if not outlier_positions.empty:
                valid_positions = outlier_positions.intersection(sec_returns.index)
                plt.scatter(valid_positions, sec_returns[t].loc[valid_positions], marker='x', color='red', label='Outlier')
        
            plt.title(f"Log Retornos em % - {t}")
            plt.legend()
            plt.show()
        except:
            pass
        
log_retornos(int(input("Defina o números de preços vizinhos: ")))```

Traceback (most recent call last):

  Cell In[3], line 30
    log_retornos(int(input("Defina o números de preços vizinhos: ")))

  Cell In[3], line 10 in log_retornos
    sec_outliers[t] = detect_outlier(sec_returns[[t]], cutoff_n=80)["TOF"]

  File ~\AppData\Roaming\Python\Python310\site-packages\uniqed\runners\tof_run.py:39 in detect_outlier
    ).fit_transform(np_time_series)

  File C:\ProgramData\anaconda3\lib\site-packages\sklearn\utils\_set_output.py:142 in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)

  File ~\AppData\Roaming\Python\Python310\site-packages\uniqed\transformers\transformers.py:25 in fit_transform
    return self.fit(x).transform(x)

  File C:\ProgramData\anaconda3\lib\site-packages\sklearn\utils\_set_output.py:142 in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)

  File ~\AppData\Roaming\Python\Python310\site-packages\uniqed\transformers\transformers.py:21 in transform
    X = self._embedding(x, self.d, self.tau)

  File ~\AppData\Roaming\Python\Python310\site-packages\uniqed\transformers\transformers.py:37 in _embedding
    X = np.zeros((embedded_length, d))

ValueError: negative dimensions are not allowed

#

While trying to plot log_returns with markers on outlier data, I've been getting the following error

pallid badge Aug 28, 2023, 2:21 PM

#

desert oar what do you mean by "correct location"? when you iterate over the dataset, the i...

Correct location: The images have been measured in a sequence. To keep this sequence intact, it is important that images from parallel processing come back into this order when saved.

#

Another question in this context is how ZMQ and multiprocessing ensure that the workers return the data in the correct order

cyan sierra Aug 28, 2023, 3:31 PM

#

Hi everyone. In a classification problem, is it correct to one hot encode date like this? 'day_of_week_0', 'day_of_week_1', 'day_of_week_2',
'day_of_week_3', 'day_of_week_4', 'day_of_week_5', 'day_of_week_6',
'month_1', 'month_2', 'month_3', 'month_4', 'month_5', 'month_6',
'month_7', 'month_8', 'month_9', 'month_10', 'month_11', 'month_12'

unique ether Aug 28, 2023, 4:39 PM

#

Any algorithm engineers lurking about?

past meteor Aug 28, 2023, 4:44 PM

#

unique ether Any algorithm engineers lurking about?

If you have a question please just ask it 🙂

tall swallow Aug 28, 2023, 6:26 PM

#

guys, I know this is out of context, but I need PC experts' opinions on this. When I played a game like GTA V about 2 years ago, I got around 300 fps at the low end. Doing the same thing now, I can't even reach 150. I have updated graphics drivers and the exact same settings on everything. What's the issue?

rustic snow Aug 28, 2023, 6:30 PM

#

In a convolutional neural network, does a convolution layer reduce the dimensions of the input picture?

small wedge Aug 28, 2023, 6:48 PM

#

rustic snow In a convolution neural network does a convolution layer reduce the dimentions o...

it depends on the exact configuration of the convolutional layer. If it only has one filter and the kernel size or the stride is greater than one (with no padding), then the output of the convolution will be smaller than the original image. However, in practice almost all convolutional layers have multiple filters, which ends up increasing the number of values (channels) output by the layer. Generally we chain convolutional layers with pooling layers to reduce dimensionality and introduce other kinds of spatial invariance.
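
To make the size effect concrete, here is a small sketch of the standard output-size formula for one spatial axis, `floor((n + 2*padding - kernel) / stride) + 1`. The function name and the example sizes are just illustrative, and dilation is assumed to be 1.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size along one axis after a convolution (or pooling) layer."""
    return (n + 2 * padding - kernel) // stride + 1


# A 32-pixel axis under a few common configurations.
same = conv_output_size(32, kernel=3, stride=1, padding=1)     # padding keeps the size
valid = conv_output_size(32, kernel=3, stride=1, padding=0)    # no padding shrinks it
strided = conv_output_size(32, kernel=3, stride=2, padding=1)  # stride 2 halves it
pooled = conv_output_size(32, kernel=2, stride=2)              # 2x2 pooling halves it
```

The number of filters does not appear in this formula: it only sets how many output channels the layer produces.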

nocturne hornet Aug 28, 2023, 6:59 PM

#

I'm taking two courses in AI and neural networks at my uni. Is it normal not to get what biases and weights do to my dataset at the beginning? I'm having problems figuring out what they do to my results, as I'm not understanding the lecture jargon.

wooden sail Aug 28, 2023, 7:11 PM

#

that depends on what were the prerequisites/how early on in your study program you take the course

nocturne hornet Aug 28, 2023, 7:15 PM

#

well we're two weeks in and I kinda feel like we're getting thrown in the deep end really quickly, but that's me. Our first task was to make a perceptron classifier, which from what I understand only works if I can draw a straight line that separates the dataset? We then had to adopt an SVM classifier from the lecture notes that, from what I'm understanding, would give me a different result if the data isn't linearly separable. And I'm still blank on what my biases and weights do to these classifiers

#

we're using an iris dataset, much like this https://scikit-learn.org/stable/auto_examples/svm/plot_iris_svc.html

scikit-learn

Plot different SVM classifiers in the iris dataset

Comparison of different linear SVM classifiers on a 2D projection of the iris dataset. We only consider the first 2 features of this dataset: Sepal length, Sepal width. This example shows how to pl...

nocturne hornet Aug 28, 2023, 7:20 PM

#

wooden sail that depends on what were the prerequisites/how early on in your study program y...

and to answer the prerequisites question: we had an introductory course last Fall semester, but it was more about sorting algos and the use of a Kalman filter to help a pygame get 100% hits

wooden sail Aug 28, 2023, 7:21 PM

#

kalman filters are not introductory, oof

#

have you taken any linear algebra?

left tartan Aug 28, 2023, 7:22 PM

#

nocturne hornet well were two weeks in and i kinda feel like were getting thrown in the deep end...

Bias and weights are kinda fundamental. Maybe watch 3b1b, first vid covers it: https://m.youtube.com/watch?v=aircAruvnKk

YouTube

3Blue1Brown

But what is a neural network? | Chapter 1, Deep learning

What are the neurons, why are there layers, and what is the math underlying it?
Help fund future projects: https://www.patreon.com/3blue1brown
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks

Additional funding for this project provided by Amplify Partners

Typo correction: At 14 minutes 45 seconds, th...

▶ Play video

nocturne hornet Aug 28, 2023, 7:23 PM

#

wooden sail have you taken any linear algebra?

Only matrixes

wooden sail Aug 28, 2023, 7:24 PM

#

all right. the weights form a matrix

#

it gets multiplied with the input data vector. then you add another vector, the "biases", to shift the result

#

this is called an "affine transformation". it transforms one line into another, and then moves it around in space
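
A tiny worked example of that affine map, `y = W x + b`, might help. It is written in plain Python so it runs without NumPy (with NumPy it would just be `W @ x + b`). The numbers and the helper name `affine` are made up for illustration.

```python
def affine(W, x, b):
    """Return W x + b for a weight matrix W (list of rows), input x, bias b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


W = [[2.0, 0.0],
     [0.0, 0.5]]   # weights: stretch the first axis, squash the second
b = [1.0, -1.0]    # biases: shift the result
x = [3.0, 4.0]     # one input sample with two features

y = affine(W, x, b)  # [2*3 + 1, 0.5*4 - 1]
```

For a perceptron or linear SVM, `W` has a single row, so the result is one number whose sign decides the class.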

weak mortar Aug 28, 2023, 7:26 PM

#

Hi. a little issue with a DataTable in dash / plotly. as it was convenient for me to transpose the dataframe used, the DataTable is not displaying the row header

nocturne hornet Aug 28, 2023, 7:27 PM

#

wooden sail this is called an "affine transformation". it transforms one line into another, ...

im assuming the weights move it in another direction?

wooden sail Aug 28, 2023, 7:27 PM

#

that, and stretch as well

nocturne hornet Aug 28, 2023, 7:30 PM

#

so what should I be looking at when it comes to tuning my data, assuming my classifiers are built correctly?

#

All i got now is a plot with some dots either side of the median line

#

That question sounded amateurish, but that is my current level 🙂

wooden sail Aug 28, 2023, 7:34 PM

#

i'd start by checking that 3b1b video, and then reading in your coursebook about support vector machines

nocturne hornet Aug 28, 2023, 7:35 PM

#

Implemented an SVM in my code, just ended up with the same accuracy as they are both linear

#

But yes, ill save the video for the morning.

nocturne hornet Aug 28, 2023, 7:39 PM

#

wooden sail i'd start by checking that 3b1b video, and then reading in your coursebook about...

Sadly, no course books, just power points... :/

wooden sail Aug 28, 2023, 7:45 PM

#

nocturne hornet Sadly, no course books, just power points... :/

i find that hard to believe, uni always expects you to spend at least as much time as you do in class, doing self studies. the slides and syllabus have no references?

glacial rampart Aug 28, 2023, 7:48 PM

#

twin forge ```py def log_retornos(n): df = pd.DataFrame() for t in clean_price.keys...

I'm not sure, sec_returns contains negative values, and I'm not sure you want it to. Other than that, I can use the uniqed package the way you try to. I don't know what's in clean_price.
It would probably help if you explain what you're trying to do. E.g. what types/ sizes do you expect every input and output to be?

nocturne hornet Aug 28, 2023, 7:49 PM

#

wooden sail i find that hard to believe, uni always expects you to spend at least as much ti...

The reference is usually documentation, like SVM tutorials and https://medium.com/@zxr.nju/what-is-the-kernel-trick-why-is-it-important-98a98db0961d and things along those lines

Medium

What is the kernel trick? Why is it important?

When talking about kernels in machine learning, most likely the first thing that comes into your mind is the support vector machines (SVM)…

wooden sail Aug 28, 2023, 7:50 PM

#

oof

nocturne hornet Aug 28, 2023, 7:52 PM

#

I'm a bit frustrated as the lecturers are going over my class's head. They are really good at this, just not at teaching. I never worked with weights and biases before, even in the introductory course, just sorting algos and implementing a Kalman filter in a pygame :/

past meteor Aug 28, 2023, 8:54 PM

#

You can ask me anything about SVMs @nocturne hornet, do you have any specific questions?

past meteor Aug 28, 2023, 8:58 PM

#

nocturne hornet Im taking two courses in AI and neural networks at my uni. Is it normal to not g...

To me this question wasn't worded super clearly though. You mean you're unsure of the impact of the weights and biases before training?

abstract wasp Aug 28, 2023, 9:01 PM

#

Hi, has anyone ever made an AI that geotags photos? If so, which dataset did you use?

left tartan Aug 28, 2023, 9:06 PM

#

abstract wasp Hi, has anyone ever made an AI that geotags photos? If so, which dataset did you...

I was going to make a joke about Trevor rainbolt

#

Curious if any ai can beat that guy

desert oar Aug 28, 2023, 9:08 PM

#

pallid badge Another question in this context is how ZMQ and multiprocessing ensure that the ...

The best way to guarantee ordering in any concurrent or parallel processing is to include the sequence number in the input and output, that way the data can be handled in any order, and can be re-ordered as needed later
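
A minimal sketch of that idea, using the standard library's thread pool rather than ZMQ so it stays self-contained. The worker function `process` and the input list are hypothetical.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def process(seq, item):
    """Hypothetical worker: returns the sequence number along with the result."""
    time.sleep(random.uniform(0, 0.01))  # simulate uneven processing time
    return seq, item * item


items = [3, 1, 4, 1, 5, 9, 2, 6]

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(process, seq, item) for seq, item in enumerate(items)]
    # Results arrive in completion order, not submission order.
    unordered = [f.result() for f in as_completed(futures)]

# Re-order afterwards using the sequence number carried with each result.
ordered = [result for _, result in sorted(unordered)]
```

Because every result carries its own sequence number, the collector never has to care which worker finished first.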

twin forge Aug 28, 2023, 9:18 PM

#

glacial rampart I'm not sure, sec_returns contains negative values, and I'm not sure you want it...

oh I was trying to do .dropna() but since sec_returns has columns of delisted stock price data, it was removing all rows

#

this was one of the goals, removing the time interval where stock return == 0%, because it translates to a stock delist

Log Returns in % - BGP

#

ps: plots in portuguese, I know kekw

twin forge Aug 28, 2023, 9:55 PM

#

problem is, what to do with these illiquid stocks with huge price movements @glacial rampart

left tartan Aug 28, 2023, 10:06 PM

#

twin forge oh I was trying to do .dropna() but since sec_returns has columns of delisted st...

Oh, you have a wide formatted dataset (one column per security)? dropna, by default, drops any row with an NA value (default axis = 0)... you can also drop columns with NA with axis = 1: the risk is that you'd drop any security with a single NA (such as an equity that listed recently)

#

You could also melt (unpivot) the data back to narrow form, which might be easier to work with.

abstract wasp Aug 28, 2023, 10:08 PM

#

left tartan I was going to make a joke about Trevor rainbolt

So you don't have any data sets that can help me? 😭

left tartan Aug 28, 2023, 10:09 PM

#

abstract wasp So you don't have any data sets that can help me? 😭

Nah, sorry

#

Did you check Kaggle?

abstract wasp Aug 28, 2023, 10:10 PM

#

left tartan Did you check Kaggle?

Yes, but I haven't been able to find anything good.

twin forge Aug 28, 2023, 10:29 PM

#

left tartan Oh, you have a wide formatted dataset (one column per security)? dropna, by defa...

I circumvented the situation with .dropna() by taking the NaN values out of each series without affecting sec_returns and plotting them instead

#

```py
for t in sec_outliers.columns:
    plt.figure(figsize=(22, 10))
    plt.rcParams.update({"font.size": 19})

    sec_returns_t = sec_returns[t].replace(0, np.nan).dropna()

    if not sec_returns_t.empty:
        plt.plot(sec_returns_t)
```

By replacing the 0s with NaN (the variation computed over NaN values comes out as 0%) and removing them, I'm removing the period where a stock has been delisted

pallid badge Aug 28, 2023, 10:55 PM

#

desert oar The best way to guarantee ordering in any concurrent or parallel processing is t...

You are right. I distribute the images via the following range() over all workers.
Some code snippets

# in worker
for i in range(worker_id, nimages, self._nworkers):
    img = images[i]
    # (processing of img into result is omitted in this snippet)
    push_sock.send_pyobj((i, result))

def _unordered_recv(self, sock):
        while True:
            img_number, result = sock.recv_pyobj()
            yield (img_number, result)

#in collector later
generator = self._unordered_recv(pull_sock)

The previous programmer wrote an _ordered_recv function

def _ordered_recv(self, sock):
        cache = {}
        next_img_number = 0
        while True:
            img_number, result = sock.recv_pyobj()
            if img_number == next_img_number:
                yield (img_number, result)
                next_img_number += 1
                while next_img_number in cache:
                    yield (img_number, cache.pop(next_img_number))
                    next_img_number += 1
            else:
                cache[img_number] = result

But strangely, it seems not to work. When I use _ordered_recv I don't get an error message, but some entries in the converted image stack remain empty. That I don't understand. Unordered seems to work.

left tartan Aug 28, 2023, 11:49 PM

#

Why not just receive them out of order and sort? @pallid badge

#

Memory limits?

twin forge Aug 29, 2023, 12:06 AM

#

@left tartan

#

remove_outlier() working perfectly

left tartan Aug 29, 2023, 12:07 AM

#

Nice

#

Which library was that?

twin forge Aug 29, 2023, 12:08 AM

#

uniqed ----> https://github.com/phrenico/uniqed/blob/master/README.rst

GitHub

uniqed/README.rst at master · phrenico/uniqed

Contribute to phrenico/uniqed development by creating an account on GitHub.

#

dope project

#

I can also set the length of bad returns allowed to be less or more lenient

#

0.1 meaning I expect at least 90% of time series to behave normally

left tartan Aug 29, 2023, 12:12 AM

#

neat, I'll have to try it out someday. I still don't like the application of outlier removal without an understanding of the mechanism / reason for the outliers, for the record.

lime grove Aug 29, 2023, 4:52 AM

#

How do you handle outlier removal @left tartan ?

pallid badge Aug 29, 2023, 8:42 AM

#

left tartan Why not just receive them out of order and sort? @pallid badge

Hi! Thank you for your answer. I ran the script with both ordered_recv and unordered_recv. Indeed, only in the unordered_recv version does the output file seem to have all images inside. In the ordered version some arrays contain only 0s, which should not be the case.
It would be nice to understand why.
I got this snippet of ordered_recv from another person, and I thought he would know what he is doing. For me, it is very important to be in control because I aim to give my code to other people.
Is there any better way to control the input, the processing by ZMQ, and the output, please? Currently this is a black box for me.
The only way I can come up with is as follows: I run it in series, without the parallel processing. Then I redo it in parallel and compare the two final files. Thank you for the discussion 🙂

weak mortar Aug 29, 2023, 9:51 AM

#

good morning 🙂 i am converting a dataframe to dict to use as data in a dash_table.DataTable. Unfortunately dash seem to require that i use the method 'records' for converting to dict, which results in all row titles being deleted. if i use 'index' the dict contains the row titles, but dash will not generate the table.

left tartan Aug 29, 2023, 10:42 AM

#

pallid badge Hi! Thank you for your answer. I ran the script with the `ordered_recv` and `uno...

I think you have a mistake in the code. You yield img_number with cache.pop(next_img_number). Shouldn’t this be next_img_number?
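
A sketch of what the corrected generator could look like. In the posted version, results drained from the cache are yielded with the stale `img_number`, so whatever consumes the generator would file them under the wrong index, which would be consistent with the empty entries. `FakeSocket` is a hypothetical stand-in for the ZMQ PULL socket so the example runs on its own.

```python
def ordered_recv(sock):
    """Yield (img_number, result) pairs in ascending img_number order."""
    cache = {}
    next_img_number = 0
    while True:
        img_number, result = sock.recv_pyobj()
        cache[img_number] = result
        # Flush every consecutive result that is now available,
        # labelling each one with its own number.
        while next_img_number in cache:
            yield next_img_number, cache.pop(next_img_number)
            next_img_number += 1


class FakeSocket:
    """Hypothetical stand-in for a ZMQ PULL socket, for demonstration only."""

    def __init__(self, messages):
        self._messages = iter(messages)

    def recv_pyobj(self):
        return next(self._messages)


sock = FakeSocket([(2, "c"), (0, "a"), (3, "d"), (1, "b")])
gen = ordered_recv(sock)
received = [next(gen) for _ in range(4)]
```

Like the original, this loops forever waiting on the socket, so the caller has to stop after the expected number of images.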

left tartan Aug 29, 2023, 10:44 AM

#

pallid badge Hi! Thank you for your answer. I ran the script with the `ordered_recv` and `uno...

But regardless, I don’t think this is a good approach. If img number 0 comes last, all the other results are cached. Me? I would either receive them out of order and sort them, or even better, if worried about memory, I would have each worker write the output to a local file and process them in the order I want

stable cape Aug 29, 2023, 11:43 AM

#

hello, does anyone know how to implement a delay for each request LangChain makes? I have the $5 starter free plan in OpenAI and it limits the number of requests per minute, so I get blocked... Here's the code:
data = yaml.load(f, Loader=yaml.FullLoader)
json_spec = JsonSpec(dict_=data, max_value_length=4000)
json_toolkit = JsonToolkit(spec=json_spec)

json_agent_executor = create_json_agent(
    llm=OpenAI(temperature=0, openai_api_key=OPEN_API_KEY),
    toolkit=json_toolkit,
    verbose=True,
)

nocturne hornet Aug 29, 2023, 11:45 AM

#

past meteor To me this question wasn't worded super clearly though. You mean you're unsure o...

I'm unsure what it does before training, I'm also unsure how to read the result after training. I have this dataset which im told is a classic illustration of a classification problem.

📎 iris.csv

past meteor Aug 29, 2023, 11:47 AM

#

nocturne hornet I'm unsure what it does before training, I'm also unsure how to read the result ...

Can you tell me what you specifically mean with BEFORE training?

nocturne hornet Aug 29, 2023, 11:54 AM

#

past meteor Can you tell me what you specifically mean wwith BEFORE training?

If I initialize bias and weights to something other than zero, how does it impact the training? I'm also a bit unsure how SVM works compared to perceptron, so I can't really tell what the difference is in the training and predictions.

past meteor Aug 29, 2023, 11:54 AM

#

Maybe this can help, I wrote this in the past:

#

Your perceptron's decision boundary is pretty bad compared to that of the support vector machine. You see the line is too close to the purple class. There are probably points just beyond that line that are still supposed to be in the purple class, but your perceptron will say they're in the yellow

#

That's a direct consequence of the margin term you add to an SVM, which is related to L2 regularization. If you add that to your perceptron the results will be better. L2 also wants to constrain the magnitude of a vector, if you recall 🙂

#

The reason why you have points inside of the dotted lines is the slack variables SVMs use (eta in my screenshot)

past meteor Aug 29, 2023, 11:59 AM

#

nocturne hornet If i initialize bias and weights to something else than zero, how does it impact...

As far as I know SVMs don't do any kind of fancy initialisation unlike neural nets. If you start with a random weight vector before training and use that you converge to the same result. It's a convex optimization problem with a global optimum.

#

/info dump over

nocturne hornet Aug 29, 2023, 12:05 PM

#

past meteor As far as I know SVMs don't do any kind of fancy initialisation unlike neural ne...

```python
    def _init_weights_bias(self, X):
        n_features = X.shape[1]
        self.w = np.zeros(n_features)
        self.b = 0
```
I had this function in my class so I just assumed it did

past meteor Aug 29, 2023, 12:07 PM

#

You can try it with different values that aren't 0. At least for the SVM it should converge.

nocturne hornet Aug 29, 2023, 12:10 PM

#

past meteor That's a direct consequence of the margin term you add to a SVM, which is relate...

```python
class SVM:
    def __init__(self, learning_rate=LEARNING_RATE_SVM, lambda_param=LAMBDA_PARAM, n_iters=N_ITERS):
        self.lr = learning_rate
        self.lambda_param = lambda_param
        self.n_iters = n_iters
        self.w = None
        self.b = None
```
By L2, do you mean lambda parameter in this case?

past meteor Aug 29, 2023, 12:10 PM

#

Did you code the SVM from scratch pithink

nocturne hornet Aug 29, 2023, 12:12 PM

#

No, we got handed this code with the task of implementing it on our dataset, with the end goal of comparing perceptron vs SVM in both time and accuracy.

#

Tried coding perceptron from scratch, probably why its bad 😦

past meteor Aug 29, 2023, 12:13 PM

#

Learning rate in an SVM? Can you show me the implementation

#

Is it stochastic gradient descent?

nocturne hornet Aug 29, 2023, 12:16 PM

#

past meteor Is it stochastic gradient descent?

```python
class SVM:
    def __init__(self, learning_rate=LEARNING_RATE_SVM, lambda_param=LAMBDA_PARAM, n_iters=N_ITERS):
        self.lr = learning_rate
        self.lambda_param = lambda_param
        self.n_iters = n_iters
        self.w = None
        self.b = None

    def _init_weights_bias(self, X):
        n_features = X.shape[1] # The number of features is the number of columns in the dataset
        self.w = np.zeros(n_features) # The weights are the coefficients of the input variables. They are multiplied by the inputs and summed to arrive at an output. They are updated in the learning process.
        self.b = 0 # The bias is a constant value that is added to the weighted sum of inputs to determine the output of a neuron. It is a constant value that is learned during training.

    def _get_cls_map(self, y):
        return np.where(y <= 0, -1, 1)

    def _satisfy_constraint(self, x, idx):
        linear_model = np.dot(x, self.w) + self.b 
        return self.cls_map[idx] * linear_model >= 1
    
    def _get_gradients(self, constrain, x, idx):
        if constrain:
            dw = self.lambda_param * self.w
            db = 0
            return dw, db
        
        dw = self.lambda_param * self.w - np.dot(self.cls_map[idx], x)
        db = - self.cls_map[idx]
        return dw, db
    
    def _update_weights_bias(self, dw, db):
        self.w -= self.lr * dw
        self.b -= self.lr * db
    
    def fit(self, X, y):
        self._init_weights_bias(X)
        self.cls_map = self._get_cls_map(y)

        for _ in range(self.n_iters):
            for idx, x in enumerate(X):
                constrain = self._satisfy_constraint(x, idx)
                dw, db = self._get_gradients(constrain, x, idx)
                self._update_weights_bias(dw, db)
    
    def predict(self, X):
        estimate = np.dot(X, self.w) + self.b
        prediction = np.sign(estimate)
        return np.where(prediction == -1, 0, 1)```

past meteor Aug 29, 2023, 12:16 PM

#

Your class is very weird

nocturne hornet Aug 29, 2023, 12:17 PM

#

The class is from the course GitHub page, which we were told to implement, but I don't know enough about this classifier to judge.

#

```python
class Perceptron: # Perceptron classifier class
    def __init__(self, learning_rate=LEARNING_RATE, n_iters=N_ITERS): 
        self.lr = learning_rate 
        self.n_iters = n_iters 
        self.weights = None # The weights are the coefficients of the input variables. They are multiplied by the inputs and summed to arrive at an output. They are updated in the learning process.
        self.bias = None # The bias is a constant value that is added to the weighted sum of inputs to determine the output of a neuron. It is a constant value that is learned during training.

    def fit(self, X, y): # The fit method is used to train the model
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iters): 
            for idx, x_i in enumerate(X): #
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = np.where(linear_output > 0, 1, 0)
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X): # The predict method is used to predict the class of a sample
        linear_output = np.dot(X, self.weights) + self.bias # The linear output is the weighted sum of the inputs
        return np.where(linear_output > 0, 1, 0)
```
this is the perceptron class I ended up with

past meteor Aug 29, 2023, 12:21 PM

#

nocturne hornet ```python class SVM: def __init__(self, learning_rate=LEARNING_RATE_SVM, lam...

Did you write this or your professor?

#

This implementation is wrong

nocturne hornet Aug 29, 2023, 12:26 PM

#

```python
class SVM:
    def __init__(self, learning_rate=1e-3, lambda_param=1e-2, n_iters=1000):
        self.lr = learning_rate
        self.lambda_param = lambda_param
        self.n_iters = n_iters
        self.w = None
        self.b = None

    def _init_weights_bias(self, X):
        n_features = X.shape[1]
        self.w = np.zeros(n_features)
        self.b = 0

    def _get_cls_map(self, y):
        return np.where(y <= 0, -1, 1)

    def _satisfy_constraint(self, x, idx):
        linear_model = np.dot(x, self.w) + self.b 
        return self.cls_map[idx] * linear_model >= 1
    
    def _get_gradients(self, constrain, x, idx):
        if constrain:
            dw = self.lambda_param * self.w
            db = 0
            return dw, db
        
        dw = self.lambda_param * self.w - np.dot(self.cls_map[idx], x)
        db = - self.cls_map[idx]
        return dw, db
    
    def _update_weights_bias(self, dw, db):
        self.w -= self.lr * dw
        self.b -= self.lr * db
    
    def fit(self, X, y):
        self._init_weights_bias(X)
        self.cls_map = self._get_cls_map(y)

        for _ in range(self.n_iters):
            for idx, x in enumerate(X):
                constrain = self._satisfy_constraint(x, idx)
                dw, db = self._get_gradients(constrain, x, idx)
                self._update_weights_bias(dw, db)
    
    def predict(self, X):
        estimate = np.dot(X, self.w) + self.b
        prediction = np.sign(estimate)
        return np.where(prediction == -1, 0, 1)
```
this is the original one that is in our course repo, which I assume is written by our professor

uneven bronze Aug 29, 2023, 12:59 PM

#

What is the difference between language module and an actual ai

sonic valley Aug 29, 2023, 1:06 PM

#

uneven bronze What is the difference between language module and an actual ai

a language model is a model trained on corpora of text, it is typically a neural network, which 'falls' under AI

#

AI is an umbrella term for automated machinery to complete a task autonomously

past meteor Aug 29, 2023, 1:07 PM

#

nocturne hornet ```python class SVM: def __init__(self, learning_rate=1e-3, lambda_param=1e-...

Looking at this again, I assume lambda_param plays the role of the canonical C parameter in SVMs (as a regularization strength it acts roughly like 1/C)

uneven bronze Aug 29, 2023, 1:07 PM

#

sonic valley a language model is a model trained on corpora of text, it is typically a neural...

👌

past meteor Aug 29, 2023, 1:07 PM

#

That makes the implementation not strictly wrong but just "strange". Idk why your professor would code it up from scratch, why they'd use gradient descent in the primal etc
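
Reading the posted `_get_gradients`, the per-sample updates look like subgradient steps on an L2-regularized hinge loss. Written out, as one reading of the code rather than anything stated in the course material, with $\lambda$ = `lambda_param` and $y_i \in \{-1, +1\}$ = `cls_map[idx]`:

```latex
J_i(w, b) = \frac{\lambda}{2}\lVert w \rVert^2 + \max\bigl(0,\ 1 - y_i (w^\top x_i + b)\bigr)

\nabla_w J_i =
\begin{cases}
  \lambda w           & \text{if } y_i (w^\top x_i + b) \ge 1 \\
  \lambda w - y_i x_i & \text{otherwise}
\end{cases}
\qquad
\frac{\partial J_i}{\partial b} =
\begin{cases}
  0    & \text{if } y_i (w^\top x_i + b) \ge 1 \\
  -y_i & \text{otherwise}
\end{cases}
```

These match the `dw` and `db` returned in the two branches of the code. A larger $\lambda$ means stronger regularization, which is the opposite direction from the usual C.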

sonic valley Aug 29, 2023, 1:08 PM

#

is it possible to increase the amount of iteration steps with tensorflow when predicting an answer?

past meteor Aug 29, 2023, 1:08 PM

#

nocturne hornet ```python class SVM: def __init__(self, learning_rate=1e-3, lambda_param=1e-...

This is essentially a https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html with loss="hinge"

scikit-learn

sklearn.linear_model.SGDClassifier

Examples using sklearn.linear_model.SGDClassifier: Model Complexity Influence Out-of-core classification of text documents Comparing various online solvers Early stopping of Stochastic Gradient Des...

hot hazel Aug 29, 2023, 1:11 PM

#

Will AI replace 90% of programmers in 5 years it very good it will replace us right and if not then WHY

nocturne hornet Aug 29, 2023, 1:15 PM

#

past meteor This is essentially a https://scikit-learn.org/stable/modules/generated/sklearn....

Still having issues adjusting the perceptron tho, added a L2 parameter, but its not really doing anything :/

hot hazel Aug 29, 2023, 1:16 PM

#

hot hazel Will AI replace 90% of programmers in 5 years it very good it will replace us ri...

?

hazy verge Aug 29, 2023, 1:27 PM

#

Can anyone provide the resources to learn how to make chatbot from scratch?

sonic valley Aug 29, 2023, 1:27 PM

#

hazy verge Can anyone provide the resources to learn how to make chatbot from scratch?

chatterbot or nltk is your guy

#

https://pypi.org/project/ChatterBot/
https://pypi.org/project/nltk/

PyPI

ChatterBot

ChatterBot is a machine learning, conversational dialog engine.

PyPI

nltk

Natural Language Toolkit

#

https://pypi.org/project/spacy/

PyPI

spacy

Industrial-strength Natural Language Processing (NLP) in Python

hazy verge Aug 29, 2023, 1:28 PM

#

Thank you

sonic valley Aug 29, 2023, 1:28 PM

#

I wouldn't use chatterbot, because i've had troubles getting to work properly on 3.11.4

hazy verge Aug 29, 2023, 1:29 PM

#

OK 👍

sonic valley Aug 29, 2023, 1:31 PM

#

hot hazel Will AI replace 90% of programmers in 5 years it very good it will replace us ri...

no

#

Right now, Generative Pre-trained Transformer (🤓) GPT-4 is the most advanced model, and it has hallucination rates of 20%, so about one in every 5 messages will contain false information

#

It is incapable of doing arithmetic, and its coding knowledge is completely based on StackOverflow

#

Up until AI can proficiently make new information based on what it knows, we'll still be their overlords

nocturne hornet Aug 29, 2023, 1:40 PM

#

past meteor Looking at this again, I assume `lambda_param` is the canonical C parameter in S...

I found my result similar to this one tho... https://www.bogotobogo.com/python/scikit-learn/Perceptron_Model_with_Iris_DataSet.php

Single Layer Neural Network - Perceptron model on the Iris dataset ...

Perceptron model on the Iris dataset

past meteor Aug 29, 2023, 1:42 PM

#

nocturne hornet I found my result similar to this one tho... https://www.bogotobogo.com/python/s...

What loss is your perceptron algo using, binary cross entropy?

nocturne hornet Aug 29, 2023, 1:45 PM

#

simple difference between the predicted output and the actual target to update the weights.

nocturne hornet Aug 29, 2023, 1:45 PM

#

past meteor What loss is your perceptron algo using, binary cross entropy?

```python
y_predicted = np.where(linear_output > 0, 1, 0)
update = self.lr * (y[idx] - y_predicted)
```
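
For context, those two lines fit into a training loop along the following lines. This is a minimal sketch, not nocturne hornet's actual class: only `self.lr`, `linear_output`, `y_predicted`, `update`, and `idx` come from the snippet, while the attribute names `weights`, `bias`, `n_iters` and the AND-gate data are assumptions.

```python
import numpy as np


class Perceptron:
    def __init__(self, lr=1.0, n_iters=20):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = 0.0

    def fit(self, X, y):
        self.weights = np.zeros(X.shape[1])
        self.bias = 0.0
        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = np.where(linear_output > 0, 1, 0)
                # Zero when the prediction is right, +lr or -lr when it is wrong.
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update
        return self

    def predict(self, X):
        return np.where(X @ self.weights + self.bias > 0, 1, 0)


# Hypothetical toy data: the logical AND of two inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

model = Perceptron().fit(X, y)
print(model.predict(X))
```

There is no explicit loss function here: the weights move only when a sample is misclassified, and they stop moving as soon as every sample is on the correct side, wherever the boundary happens to be at that moment.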

past meteor Aug 29, 2023, 1:49 PM

#

nocturne hornet ```python y_predicted = np.where(linear_output > 0, 1, 0) update = self.lr * (y[...

Hmm, I guess the decision boundary being where it is is down to that term but honestly I haven't looked at perceptron in detail

hot hazel Aug 29, 2023, 1:50 PM

#

sonic valley Right now, Generative Pre-trained Transformer (🤓) GPT 4 is the most advanced mo...

In 5 years it will right

past meteor Aug 29, 2023, 1:50 PM

#

It's very close to SVMs, especially if you added L2. I'd encourage you to just write down the equations

hot hazel Aug 29, 2023, 1:53 PM

#

hot hazel In 5 years it will right

Right

nocturne hornet Aug 29, 2023, 1:56 PM

#

past meteor It's very close to SVMs, especially if you added L2. I'd encourage you to just w...

If my understanding of the article is correct, convergence is a problem with perceptron. So im just gonna assume i cant do anything else with that kind of classifier

past meteor Aug 29, 2023, 1:56 PM

#

nocturne hornet I'm unsure what it does before training, I'm also unsure how to read the result ...

But in your image it did OK didn't it?

#

I've not used Perceptron in the past mainly because I can't think of a situation where I would favour it above a generic SGDclassifier/regressor.

hot hazel Aug 29, 2023, 1:57 PM

#

hot hazel In 5 years it will right

?

#

Please reply

#

I am scared if ai will replace us in 5 years

past meteor Aug 29, 2023, 2:01 PM

#

hot hazel I am scared if ai will replace us in 5 years

chill, it won't

nocturne hornet Aug 29, 2023, 2:01 PM

#

past meteor But in your image it did OK didn't it?

Still need to understand my results better i guess. Never sure if im getting what i need

hot hazel Aug 29, 2023, 2:01 PM

#

past meteor chill, it won't

Ok even in 5 years

past meteor Aug 29, 2023, 2:01 PM

#

hot hazel I am scared if ai will replace us in 5 years

People should stop asking this to engineers go ask it to philosophers

#

Virtually no engineer cares about discussing this topic

#

It's tiresome and pointless

hot hazel Aug 29, 2023, 2:02 PM

#

Can you tell me why it won't this is what I need

#

The why

past meteor Aug 29, 2023, 2:02 PM

#

No, find a philosophy discord server and ask them. Thanks.

past meteor Aug 29, 2023, 2:03 PM

#

nocturne hornet Still need to understand my results better i guess. Never sure if im getting wha...

You can talk about how perceptron "hugs" the decision boundary and SVM doesn't and how that's a good thing

hot hazel Aug 29, 2023, 2:03 PM

#

past meteor No, find a philosophy discord server and ask them. Thanks.

No I mean why it won't

#

I know it won't replace us but why

nocturne hornet Aug 29, 2023, 2:06 PM

#

past meteor You can talk about how perceptron "hugs" the decision boundary and SVM doesn't a...

Thanks ✌️

past meteor Aug 29, 2023, 2:06 PM

#

Write out the equations and it'll make more sense 🙂

hot hazel Aug 29, 2023, 2:06 PM

#

past meteor Write out the equations and it'll make more sense 🙂

?

wooden sail Aug 29, 2023, 2:06 PM

#

ai has already replaced us. in reality, zestar is a bot we have linked to chatgpt, the true overlord

hot hazel Aug 29, 2023, 2:08 PM

#

wooden sail ai has already replaced us. in reality, zestar is a bot we have linked to chatgp...

What

#

Zestar will you replace us

wooden sail Aug 29, 2023, 2:09 PM

#

zestar already replaced me by giving better answers in this channel lemon_angrysad

hot hazel Aug 29, 2023, 2:10 PM

#

wooden sail zestar already replaced me by giving better answers in this channel <:lemon_angr...

Better answers doesn't mean replace you

#

Zestar how do you feel as a chat gpt and will you replace us

mild dirge Aug 29, 2023, 2:11 PM

#

It's definitely better than its 74 predecessors

past meteor Aug 29, 2023, 2:12 PM

#

Maybe zestar76 won't take so much time to remember why exactly SVMs are a maximum margin classifier.

hot hazel Aug 29, 2023, 2:13 PM

#

past meteor Maybe zestar76 won't take so much time to remember why exactly SVMs are a maximu...

What

#

Another ai

#

NOo I don't want it to replace us

pale hemlock Aug 29, 2023, 2:13 PM

#

IF anyone is truly interested

#

https://paste.pythondiscord.com/BRNQ

hot hazel Aug 29, 2023, 2:18 PM

#

What is this

#

WILL ai replace us

pale hemlock Aug 29, 2023, 2:18 PM

#

if we let it

#

this is a machine learning tensor model that defines values in a coordinate system, categorized via mathematics in dimensions based off the original model

#

I eventually want to create a machine learning model that becomes AI

#

The graphical 3D aspect is not important to the system as a whole, merely an unintended consequence of the model I designed.

#

the fact that it can do this provides deeper insight into my goal

desert oar Aug 29, 2023, 2:36 PM

#

pale hemlock this is a Machine learning tensor model defines values in a coordinate system an...

i don't think this code even works as-written, you'll need the nonlocal declaration for tensor

#

the enthusiasm is appreciated, but i also don't think this does anything with machine learning

pale hemlock Aug 29, 2023, 2:44 PM

#

desert oar the enthusiasm is appreciated, but i also don't think this does anything with ma...

it's defining a tensor off of real-world values like circle, triangle; data along these parameters can help with massive data

desert oar Aug 29, 2023, 2:48 PM

#

pale hemlock its defining a tensor off of real world values like circle triangle, data along ...

i see. this is actually a well-established technique, you're basically treating each coordinate of each shape here as a separate feature. there are some advantages (e.g. simplicity of the code) but there are also problems with it. for example you can take a (small) image and flatten the pixels, so that each pixel is a separate input feature. the problem is that you lose the proper sense of the 2d locality of the data, so the model has a lot more work in order to construct a good internal representation of the data, compared to using a CNN.

#

however one thing you can do is use something like an autoencoder to obtain a fixed-size vector embedding for each "object" and feed those into a model. that's basically what all of deep learning for NLP is based on. the nice thing there is that you can tailor the embedding model to accommodate all kinds of weird objects (text, shapes, images, whatever) but the output is very uniform and can be put into a very generic model.
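
The point about losing 2D locality when flattening can be seen with a few lines of NumPy. This is only an illustrative sketch; the image size and the helper `flat_index` are made up for the example.

```python
import numpy as np

height, width = 4, 5  # hypothetical tiny image
image = np.arange(height * width).reshape(height, width)
flat = image.reshape(-1)  # one input feature per pixel


def flat_index(row, col):
    """Position of pixel (row, col) in the flattened vector (row-major order)."""
    return row * width + col


# Horizontal neighbours stay adjacent, vertical neighbours end up `width` apart.
right_gap = flat_index(1, 3) - flat_index(1, 2)
down_gap = flat_index(2, 2) - flat_index(1, 2)
print(right_gap, down_gap)
```

A model fed `flat` has to learn from data that features `width` positions apart are neighbours, whereas a CNN has that structure built in.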

pale hemlock Aug 29, 2023, 2:49 PM

#

You could also define the picture as a rectangle and store the info about along that particular dimension

desert oar Aug 29, 2023, 2:50 PM

#

right, that's exactly the technique. i'm saying that simply flattening the coordinates might not be the most effective way to obtain a vector representation of a polygon.

#

although it probably wouldn't be bad either in the case specifically of k-gons, since k is fixed

#

would be interesting to see if a model can distinguish self-intersecting, convex, and regular polygons just by flattening the coordinates

pale hemlock Aug 29, 2023, 2:52 PM

#

the whole point is that those flattened coordinates are unique to that particular information. the whole purpose of my endeavour is to formalize context.. say you want it to identify itself. it can refer to its current shape due to its parameters that changed over time..

desert oar Aug 29, 2023, 2:52 PM

#

i'm not sure what you mean by that

pale hemlock Aug 29, 2023, 2:52 PM

#

right im trying

desert oar Aug 29, 2023, 2:53 PM

#

yes, flattening the coordinates is an invertible transformation between the original polygon and the flat representation
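
As a small illustration of that invertibility, assuming a polygon is stored as a `(k, 2)` array of vertices (the triangle below is a made-up example):

```python
import numpy as np

# Hypothetical triangle given as k = 3 vertices with (x, y) coordinates.
triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])

features = triangle.reshape(-1)     # shape (2 * k,), one feature per coordinate
restored = features.reshape(-1, 2)  # back to shape (k, 2)

print(features)
print(np.array_equal(restored, triangle))
```

One caveat with this representation: listing the same vertices from a different starting point gives a different feature vector for the same shape, which is part of why a learned embedding can be more effective.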

pale hemlock Aug 29, 2023, 2:56 PM

#

say you run this thing and it learns the system dimensions, it would soon learn that in context it's a rectangle-shaped box (gathered from the internet based on hardware information, dimensions, type, cpu), then contextualize the information as hardware. it could read x and y values and see them as information from this source as an application, perhaps these values are window dimensions. they can be stored on the dimension available, and given the coordinates of this newly created understanding the model conforms to this information, holding values that can be stored as a variable, i.e. the computer it's on.

#

in other words, with each complete growth factor that conforms to, let's say, a triangle, the original model stores the information by coordinates; those coordinates are input references to stored dictionaries

#

these references can be combined to show, in theory, a 3d object once it learns enough about "itself"

#

what i find interesting is that you could in theory rotate the tensor object to store information anew.

desert oar Aug 29, 2023, 3:01 PM

#

i don't follow, what does CPU have to do with this?

pale hemlock Aug 29, 2023, 3:02 PM

#

nothing, except if you ask the model what type of cpu it has: when the tensor receives input and creates a new storage dimension, any value learned in this dimension can later be referenced when trying to talk to a machine learning model about itself.

#

what type of CPU you have, because i haven't yet programmed that far. i had a hard drive malfunction a week ago and starting anew i have to retrace my steps, however this is a new approach to what i was trying to do

#

because a CPU is generally square, it can store the dimensions, name and stuff on the fly in said dimensions. say it managed to learn a square thing and stored it in the square dimension, it can reference it instantaneously along those parameters, along with anything else that references square or square objects. perhaps what it received through videos, just stored in a space as a reference defined by the original tensor.

#

basically right now, if you run a model against it it would know it's a square, with circle blah blah, but an object that receives and learns as outputs

tiny nimbus Aug 29, 2023, 3:32 PM

#

Hi, I am trying the ai cells magics from https://blog.jupyter.org/generative-ai-in-jupyter-3f7174824862 in jupyter lab. They work great, but I would like to streamline my process. Is there a way to avoid having to %load_ext jupyter_ai_magics in each notebook before using these magics? I looked around and advice on this was old/nonworking. Ideally I would like to change a config or add an argument to my jupyter lab startup alias.

Medium

Generative AI in Jupyter

Jupyter AI, a new open source project, brings generative artificial intelligence to notebooks with magic commands and a chat interface.

pale hemlock Aug 29, 2023, 3:43 PM

#

hmmm i developed another method to do what i did earlier except.. well just look

#

https://paste.pythondiscord.com/ZNMA

boreal gale Aug 29, 2023, 3:55 PM

#

tiny nimbus Hi, I am trying the ai cells magics from https://blog.jupyter.org/generative-ai-...

assuming you are just using a stock default jupyter lab installation:-

1. run `ipython profile locate default` in your terminal
2. add

c = get_config()
c.InteractiveShellApp.extensions.append('jupyter_ai_magics')

to
<the-path-revealed-in-step-1>/ipython_config.py
3. restart jupyterlab

desert oar Aug 29, 2023, 4:07 PM

#

pale hemlock because a CPU is generally square, it can store the dimensions, name and stuff o...

unfortunately i think you might be very confused about how machine learning works

pale hemlock Aug 29, 2023, 4:07 PM

#

hold on

pale hemlock Aug 29, 2023, 4:09 PM

#

desert oar unfortunately i think you might be very confused about how machine learning work...

I suggest you don't run it, it bogs down hard. i'm working with an ssd again. the VISUAL context is only something to help, it won't remain in the final product.

#

https://paste.pythondiscord.com/QEXQ

cold osprey Aug 29, 2023, 4:13 PM

#

hot hazel Will AI replace 90% of programmers in 5 years it very good it will replace us ri...

yes, in 1 year

tiny nimbus Aug 29, 2023, 4:16 PM

#

boreal gale assuming you are just using a stock default jupyter lab installation:- 1. run `...

@boreal gale Amazing! That worked.

I was previously trying to modify the config indicated by jupyter-lab --generate-config, but it looks like I truly needed the ipython config even though i am using jupyter lab.

Which documentation did you look at to find the answer regarding which config needs to be modified and which value needed to be set?

boreal gale Aug 29, 2023, 4:22 PM

#

tiny nimbus @boreal gale Amazing! That worked. I was previously trying to modify t...

https://ipython.org/ipython-doc/3/config/intro.html

i was gonna go with ipython startup scripts as i knew that's definitely possible to run python code when a kernel starts up,
but i forgot where to put start up scripts, so i googled ipython profile since i knew it's a profile-based thing and that led me to that page and the top bit immediately showed how to enable cython magic by default so i adapted that instead.

boreal gale Aug 29, 2023, 4:23 PM

#

tiny nimbus @boreal gale Amazing! That worked. I was previously trying to modify t...

i am actually unsure why your previous approach didn't work.

edit: that seems to be the config for LabApp not ServerApp which is what we needed to change
edit2: yeah no. jupyter-lab --generate-config is for both ServerApp and LabApp actually, and that's probably not to do with the kernel that's being spawned
edit3: i guess the keypoint is just realising jupyter is just using ipykernel - hence you need to alter ipython profile , not jupyter's profile.

tiny nimbus Aug 29, 2023, 4:57 PM

#

boreal gale i am actually unsure why your previous approach didn't work. edit: that seems t...

Agree on edit3. I am not a huge magics user, but it is a good reminder that magics are a kernel-specific (and so language-specific) feature, not a general Jupyter feature.

broken flax Aug 29, 2023, 5:19 PM

#

Hi I’m new here. Just signed up for Nucamp’s DevOps course with Python starting in September. In the meantime I’m trying to figure out how to work with list in real world situations but I got stuck somewhere. I was able to import my list (.csv) into pandas but don’t know what to do after that. How do I sort the list or find some item in the list. I learned to do this with just simple small list but not in real situations. Any suggestions? Please dm me. Thanks

pale hemlock Aug 29, 2023, 5:26 PM

#

https://paste.pythondiscord.com/J6RA

pale hemlock Aug 29, 2023, 5:48 PM

#

https://paste.pythondiscord.com/7EEQ

#

well i finally got somewhere

pale hemlock Aug 29, 2023, 6:42 PM

#

https://paste.pythondiscord.com/H6KQ

#

this is my final product?

#

anyone interested

winter drift Aug 29, 2023, 9:46 PM

#

Hey guys, any good known datasets of pictures of trash and debris?

serene scaffold Aug 29, 2023, 9:48 PM

#

yeah, the selfie folder on my phone

turbid fox Aug 29, 2023, 10:14 PM

#

im planning on making a simple machine learning / ai algorithm to play chess. are there any python libraries i should investigate, aside from pandas / numpy / tensorflow. looking for a starting point in the project, i guess

serene scaffold Aug 29, 2023, 10:18 PM

#

turbid fox im planning on making a simple machine learning / ai algorithm to play chess. ar...

do you want it to use machine learning, or a heuristic?

pallid badge Aug 29, 2023, 10:20 PM

#

left tartan I think you have a mistake in the code. You yield img_number with cache.pop(next...

@left tartan Thank you so much. You were absolutely right. There is the mistake, I did not see it.
Corrected version is now:

def _ordered_recv(self, sock):
    cache = {}  # results that arrived before their turn, keyed by image number
    next_img_number = 0
    while True:
        current_img_number, current_result = sock.recv_pyobj()
        if current_img_number == next_img_number:
            yield (current_img_number, current_result)
            next_img_number += 1
            # flush any cached results that are now next in line
            while next_img_number in cache:
                next_result = cache.pop(next_img_number)
                yield (next_img_number, next_result)
                next_img_number += 1
        else:
            cache[current_img_number] = current_result

This one makes sense in case you have a live queue, where images from a detector come in and should be displayed in the correct order. It could also happen that a scan stops and we don't want the final file to have empty places.
If order does not matter, the unordered version is ok, the target file gets filled because (img_number, result) travel together and I can identify the image and the location in the final file with img_number.
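
The reordering idea can also be sketched without the socket, which makes it easy to try out. The function name `ordered` and the sample arrivals are hypothetical; the input is assumed to be any iterable of `(img_number, result)` pairs numbered from 0.

```python
def ordered(pairs):
    """Yield (number, result) pairs in ascending order of number, buffering early arrivals."""
    cache = {}
    next_number = 0
    for number, result in pairs:
        cache[number] = result
        # Emit everything that is now contiguous with what was already yielded.
        while next_number in cache:
            yield next_number, cache.pop(next_number)
            next_number += 1


# Hypothetical out-of-order arrivals.
arrivals = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
in_order = list(ordered(arrivals))
print(in_order)
```

Note that a number which never arrives holds back everything after it and lets the cache grow, so with very large scans this buffering is worth keeping an eye on.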

pallid badge Aug 29, 2023, 10:21 PM

#

left tartan But regardless, I don’t think this is a good approach. If img number 0 comes las...

Maybe this is not possible. If we scan, we get for example 300.000 images or more. Not easy to keep in memory and maybe one would not like to have too many files flying around. I have to admit I don't know well the process detector --> outputfile. I always get the output file which I have to convert to something else. My current converted file is 300GB large

turbid fox Aug 29, 2023, 10:22 PM

#

serene scaffold do you want it to use machine learning, or a heuristic?

machine learning, not hard-coded

pallid badge Aug 29, 2023, 10:22 PM

#

@left tartan I check now ordered and ordered_fix and let you know, but it looked good on the first glance

serene scaffold Aug 29, 2023, 10:23 PM

#

turbid fox machine learning, not hard-coded

then I wouldn't recommend this as a first project.

turbid fox Aug 29, 2023, 10:38 PM

#

serene scaffold then I wouldn't recommend this as a first project.

not my first project, im just looking for resources

serene scaffold Aug 29, 2023, 10:38 PM

#

turbid fox not my first project, im just looking for resources

what other projects have you done?

sterile nebula Aug 29, 2023, 11:11 PM

#

does python have a library that can match similar-looking charts?

desert oar Aug 29, 2023, 11:30 PM

#

turbid fox not my first project, im just looking for resources

This isn't really my area of expertise, but you can probably find something that somebody has put together using reinforcement learning if you search around, maybe based on AlphaZero. however you should keep in mind that there is a long history of chess playing algorithms that don't use "machine learning" in the modern sense, which you also might want to look into

#

yeah it seems like AlphaZero has been used for chess since 2017, you should be able to find at least something about using it in python

rocky vortex Aug 29, 2023, 11:32 PM

#

https://en.wikipedia.org/wiki/Minimax

Minimax

Minmax (sometimes Minimax, MM or saddle point) is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for minimizing the possible loss for a worst case (maximum loss) scenario. When dealing with gains, it is referred to as "maximin" – to maximize the minimum gain. Originally formulated for se...

#

this is what stockfish uses if i remember correctly
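
For anyone curious what the minimax rule looks like in code, here is a minimal sketch on a toy game tree. It is not how any particular engine is implemented: the nested-list tree and the function signature are assumptions for illustration, and real chess engines typically add alpha-beta pruning, depth limits, and an evaluation function on top of this idea.

```python
def minimax(node, maximizing):
    """Return the minimax value of a game tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node  # leaf: static evaluation of the position
    child_values = [minimax(child, not maximizing) for child in node]
    return max(child_values) if maximizing else min(child_values)


# Hypothetical two-ply tree: the maximizer picks a branch, then the minimizer picks a leaf.
tree = [[3, 5], [2, 9]]
best = minimax(tree, maximizing=True)
print(best)  # 3
```

The maximizer avoids the branch containing 9 because the minimizer would answer with 2 there; the guaranteed outcome of the first branch, 3, is better.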

civic elm Aug 29, 2023, 11:44 PM

#

Hmm a really nice project would be a voice activated chess

#

No board just voice haha

#

Woah that would be cool time to learn RNNs

turbid fox Aug 30, 2023, 2:30 AM

#

desert oar This isn't really my area of expertise, but you can probably find something that...

thanks 🙂

pale hemlock Aug 30, 2023, 3:17 AM

#

Could Chatgpt3 interact with this model by its self?
ChatGPT
Yes, GPT-3 can interact with your tensor-based model by itself. GPT-3 is capable of processing and generating text, which means it can send text inputs to your model and receive responses from it. This interaction would involve GPT-3 generating prompts that are formatted in a way that your tensor-based model can understand and interpret.

#

https://paste.pythondiscord.com/R3TA

#

@desert oar ive gotten somewhere i hope this changes your perception on what i may or may not understand

#

the neural network model

#

https://paste.pythondiscord.com/QLHA

golden haven Aug 30, 2023, 3:59 AM

#

If anyone is good with python selenium here and you have a moment please check my post on python help page, just posted it there ❤️

civic elm Aug 30, 2023, 5:50 AM

#

anyone using a mac? my jupyter notebook kernel is using the /usr/bin/python3 where it should be the anaconda python. I don't know how to fix this

north rain Aug 30, 2023, 8:37 AM

#

@brittle radish please don't post advertisements like that in this server

dry flame Aug 30, 2023, 8:56 AM

#

need advice on NLP books
so I'm eyeing some books as a handbook to guide me in learning NLP with libraries like spacy, nltk etc (let's assume i know nothing of NLP). right now i have these 3 books listed. but i wonder if there are better books for complete beginners?

opaque idol Aug 30, 2023, 10:47 AM

#

I just finished learning Python. I want to start learning how to train ai agents to play a game. Can anyone link me somewhere and explain how should I start doing that?

wraith heart Aug 30, 2023, 10:49 AM

#

📣 I've just published an in-depth article on Stable Diffusion, an AI technology that's transforming the way we deal with noisy data. 🌟

🔍 What's Stable Diffusion?

🧠🔬 The Science Behind It

🎓 Get Hands-On
The tutorial includes a step-by-step guide on setting up Stable Diffusion, so you can get started on your own projects.

Let me know what you think!

https://medium.com/@Naykafication/stable-diffusion-phenomenon-from-core-principles-to-real-world-applications-e5f54c795b15?source=friends_link&sk=1a99411d24a0d86967ed72943959f48f

Medium

Stable Diffusion Phenomenon: from core principles to real-world app...

Beyond the Hype: Practical Tutorial to Stable Diffusion and Its Impact on Tech

fallow frost Aug 30, 2023, 12:24 PM

#

does DuckDB have query parameters?

weak mortar Aug 30, 2023, 12:44 PM

#

Hi guys. I have a few issues with the pandas Styler.background_gradient function. Which topic would you suggest I ask in?

#

or is it better i open a thread in python help?

boreal gale Aug 30, 2023, 12:54 PM

#

weak mortar Hi guys. I have a few issues with pandas.style.background_color function. Which ...

here or thread in python help are both fine

weak mortar Aug 30, 2023, 1:19 PM

#

my problem is that i am coloring my table with heatmap, but i only want to apply it to specific rows. you can add subset="column name" to target specific columns, but it will always target columns, not rows, despite setting axis to 0 or 1.
code:
metrics_html = combined_metrics.style.background_gradient(cmap='Greens',axis=0).to_html()

#

as im very familiar with css, i tried overwriting the styles of the cells in the html, but due to the nature of how the colors are applied it is not possible to do

boreal gale Aug 30, 2023, 1:39 PM

#

weak mortar my problem is that i am coloring my table with heatmap, but i only want to apply...

have a read here: https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.background_gradient.html

in particular:

subset:  label, array-like, IndexSlice, optional
A valid 2d input to DataFrame.loc[<subset>], or, in the case of a 1d input or single key, to DataFrame.loc[:, <subset>] where the columns are prioritised, to limit data to before applying the function.

here is an example to get what you wanted:

import pandas as pd
import numpy as np

arr = np.random.randint(1, 100, size=(10, 10))
df = pd.DataFrame(arr)
df.style.background_gradient(cmap='Greens', subset=([3,4], slice(0, None)), axis=0)

weak mortar Aug 30, 2023, 1:41 PM

#

okay thanks let me try that right away. i tried with the slice thing and apply and applymap last night, but the slice stuff was confusing for me.

boreal gale Aug 30, 2023, 1:44 PM

#

you don't even need slice actually..

df.style.background_gradient(cmap='Greens', subset=([3,4], ), axis=0) will do just fine

weak mortar Aug 30, 2023, 1:45 PM

#

thats basically what i was already doing. it still only searches the columns for 3 and 4 here, it cannot search the rows

boreal gale Aug 30, 2023, 1:46 PM

#

weak mortar thats basically what i was already doing. it still only searches the columns for...

n.b.

subset:  label, array-like, IndexSlice, optional
A valid 2d input to DataFrame.loc[<subset>], or, in the case of a 1d input or single key, to DataFrame.loc[:, <subset>] where the columns are prioritised, to limit data to before applying the function.

weak mortar Aug 30, 2023, 1:47 PM

#

how you make it work on rows though, it says it looks through columns 🤷

#

your example works. just figuring out how to make it also work on my df

boreal gale Aug 30, 2023, 1:49 PM

#

hmm? i thought the documentation is quite clear already 🤔

unless you weren't aware there is a difference between [3,4] and ([3,4], )?

weak mortar Aug 30, 2023, 1:50 PM

#

yes i was totally unaware of that 🙂

#

i see that you thereby are targeting rows , i just didnt go further with that because it always said it was searching cols

boreal gale Aug 30, 2023, 1:51 PM

#

ah! okay, you can experiment with df.loc[<whatever-you-put-as-subset>] to see what it will target (in the 2D case according to the docs)

#

but if it's 1D then it's df.loc[:, <subset>]

#

what a weird API.. 🤷‍♂️
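
The experiment suggested above can be run without any styling at all. This sketch only uses `DataFrame.loc`, on the assumption (taken from the quoted docs) that `subset` is resolved the same way; the 10x10 frame is made up, and `pd.IndexSlice[[3, 4], :]` is used as an explicit way to spell a 2D selection.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(100).reshape(10, 10))

# 1D subset such as [3, 4]: interpreted as df.loc[:, [3, 4]], i.e. columns 3 and 4.
as_columns = df.loc[:, [3, 4]]

# 2D subset with an explicit row part: rows 3 and 4 across all columns.
as_rows = df.loc[pd.IndexSlice[[3, 4], :]]

print(as_columns.shape)
print(as_rows.shape)
```

Checking the shape of the selection first is a quick way to confirm whether a given `subset` will hit rows or columns before passing it to `background_gradient`.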

weak mortar Aug 30, 2023, 1:53 PM

#

okay case solved. thank you very much!
combined_metrics.style.background_gradient(cmap='Greens', subset=(["opti","bt1","bt2","bt3","bt4"], ), axis=0).to_html()
yesterday i was so close, literally just had to put ", " after the ]

latent remnant Aug 30, 2023, 3:08 PM

#

anyone familiar with Power Bi?

pale hemlock Aug 30, 2023, 3:25 PM

#

IF anyone is truly interested, patient, willing to listen to a reasonable implementation of working on a novel concept of Machine learning and AI.. please by all means message me

serene scaffold Aug 30, 2023, 5:34 PM

#

latent remnant anyone familiar with Power Bi?

be sure to always ask a complete question that someone who knows the answer could start answering

opaque idol Aug 30, 2023, 5:43 PM

#

opaque idol I just finished learning Python. I want to start learning how to train ai agents...

Can someone please answer my question?

umbral charm Aug 30, 2023, 6:22 PM

#

import numpy as np
import math
import pandas as pd
from pandas.plotting import scatter_matrix
import scipy
import matplotlib.pyplot as plt
from scipy.stats import *
pd.set_option('display.max_rows', None, 'display.max_columns', None, 'display.width', None)
housing = pd.read_csv(filepath_or_buffer = 'filepath')
print(housing)

why does this not print all of it, it prints from like 3000 - 13000

#

this is on PyCharm btw maybe thats the issue

#

and i do have 13000 data points

mild dirge Aug 30, 2023, 6:24 PM

#

Terminal has a maximum length @umbral charm

umbral charm Aug 30, 2023, 6:25 PM

#

mild dirge Terminal has a maximum length @umbral charm

Yea

#

was thinking this, anyway to solve?

#

or do i just have to use like IDLE

mild dirge Aug 30, 2023, 6:25 PM

#

I don't think you ever need to print over 10k lines probably.

#

Could print to a text file
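
A minimal sketch of the text-file route, assuming the goal is just to inspect every row outside the console. The `housing` frame here is a made-up stand-in for the real CSV and the output file name is hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for the 13,000-row housing table.
housing = pd.DataFrame({"price": np.arange(13000), "rooms": np.arange(13000) % 5})

out_path = os.path.join(tempfile.gettempdir(), "housing_dump.txt")
with open(out_path, "w") as f:
    f.write(housing.to_string())  # to_string() renders every row by default

with open(out_path) as f:
    line_count = sum(1 for _ in f)
print(line_count)
```

The file can then be opened in any editor, which sidesteps the console's scrollback limit entirely.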

umbral charm Aug 30, 2023, 6:26 PM

#

That is true, its just useful to see if it works

mild dirge Aug 30, 2023, 6:26 PM

#

But probably better to only print the useful info

umbral charm Aug 30, 2023, 6:26 PM

#

yea would be but i like to visualise everything just incase

mild dirge Aug 30, 2023, 6:26 PM

#

Maybe there is a setting to change the maximum line count

#

But you'd need to look around in the settings

umbral charm Aug 30, 2023, 6:27 PM

#

mild dirge But you'd need to look around in the settings

I have 'Override console cycle buffer size'

#

would that be it?

wooden sail Aug 30, 2023, 6:30 PM

#

printing out 10k lines of text is one of the worst ways of visualizing anything

#

make a plot of the stuff you care about

mild dirge Aug 30, 2023, 6:31 PM

#

umbral charm I have 'Override console cycle buffer size'

Probably. But is it possible to make a plot instead?

#

You're not going to read through 10k lines in less than a few hours 😛

tidal bough Aug 30, 2023, 6:32 PM

#

wooden sail printing out 10k lines of text is one of the worst ways of visualizing anything

~~i feel personally attacked~~

wooden sail Aug 30, 2023, 6:32 PM

#

tidal bough ~~i feel personally attacked~~

that explains why you're always confused

umbral charm Aug 30, 2023, 6:33 PM

#

mild dirge You're not going to read through 10k lines in les than a few hours 😛

That is true

#

Yea its fine i should be good with 10k lines

#

i was just curious, dealing with large datasets is a pain

#

oh

#

it worked

umbral charm Aug 30, 2023, 6:38 PM

#

mild dirge Probably. But is it possible to make a plot instead?

thanks!

mild dirge Aug 30, 2023, 6:40 PM

#

It was edds suggestion 👌🏽

unique ether Aug 30, 2023, 7:37 PM

#

What is the most important mathematical formula in ML?

#

is it the Quadratic formula?

mild dirge Aug 30, 2023, 7:38 PM

#

Really subjective. There is also the chain rule for backpropagation

verbal swan Aug 30, 2023, 7:38 PM

#

unique ether is it the Quadratic formula?

Lol probably not

mild dirge Aug 30, 2023, 7:38 PM

#

Lots of activation functions like tanh, ReLU, sigmoid etc.

#

Also pretty big

#

But yeah, not really a single answer 😛

verbal swan Aug 30, 2023, 7:38 PM

#

I don't think there is a single most important formula, multiple statistical formulas are used

unique ether Aug 30, 2023, 7:39 PM

#

Cheers everyone

umbral charm Aug 30, 2023, 8:12 PM

#

housing['date'] = pd.to_datetime(housing['date'], format = '%d/%m/%Y')
housing2 = housing[(housing['date'] > '2016-12-01') and (housing['date'] < '2018-01-01')]
print(housing2)

#

why doesnt this work

#

' raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'

young granite Aug 30, 2023, 8:16 PM

#

umbral charm ```py housing['date'] = pd.to_datetime(housing['date'], format = '%d/%m/%Y') hou...

can u provide full traceback as code

umbral charm Aug 30, 2023, 8:16 PM

#

young granite can u provide full traceback as code

like u want my full code and full error?

young granite Aug 30, 2023, 8:17 PM

#

only the full error i assume it results from line (2)?

unique ether Aug 30, 2023, 8:17 PM

#

the traceback mate

umbral charm Aug 30, 2023, 8:17 PM

#

ok

#

but dont bully the username alright

left tartan Aug 30, 2023, 8:17 PM

#

umbral charm ' raise ValueError( ValueError: The truth value of a Series is ambiguous. Use...

Put parens around each side of the equation

#

(…) and (…)

umbral charm Aug 30, 2023, 8:17 PM

#

Traceback (most recent call last):
  File "D:\Users\FatBoy\PycharmProjects\Coursera\Gali.py", line 11, in <module>
    housing2 = housing[(housing['date'] > '2016-12-01') and (housing['date'] < '2018-01-01')]
  File "C:\Users\FatBoy\bottle\lib\site-packages\pandas\core\generic.py", line 1466, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

left tartan Aug 30, 2023, 8:18 PM

#

Oh sorry, you’re comparing to a string.

#

You want to compare the series to a date obj
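A small sketch of comparing the column against an explicit date object. The rows are made up for illustration; only the column name `date` and the `%d/%m/%Y` format come from the code above. As far as I know, pandas will usually also parse a date string in this kind of comparison, so using `pd.Timestamp` is more about being explicit than a required fix.

```python
import pandas as pd

# Hypothetical rows, in the same day/month/year format as the original code.
housing = pd.DataFrame({"date": ["15/11/2016", "03/06/2017", "20/02/2018"]})
housing["date"] = pd.to_datetime(housing["date"], format="%d/%m/%Y")

# Compare the datetime column against an explicit Timestamp.
start = pd.Timestamp("2016-12-01")
after_start = housing["date"] > start

# One True/False per row, not a single boolean.
print(after_start.tolist())  # [False, True, True]
```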

umbral charm Aug 30, 2023, 8:18 PM

#

When i put the '&' instead of and

#

it works

#

but i dont want to use '&' coz idk what it actually means i only know its a bitwise operator

left tartan Aug 30, 2023, 8:19 PM

#

Oh and that too:

#

Each side of that AND is a boolean series.

#

You want the bitwise AND of the series… not the boolean AND which makes no sense: what is [10101010] and [00011010]?

#

A boolean AND operation would give you a True or a False, which makes no sense
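To make the difference concrete, here is a tiny sketch with two made-up boolean Series. `&` combines them row by row, while `and` has to reduce the first Series to a single True or False, which pandas refuses to do.

```python
import pandas as pd

a = pd.Series([True, False, True, False])
b = pd.Series([True, True, False, False])

# Element-wise AND: one result per row.
both = a & b
print(both.tolist())  # [True, False, False, False]

# Python's `and` first asks "is `a` truthy?", which a Series cannot answer.
error_message = None
try:
    a and b
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```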

#

Does that make any sense?

umbral charm Aug 30, 2023, 8:21 PM

#

i see

#

so if [10101010] and [00011010]? makes no sense

#

what would [10101010] & [00011010] do

left tartan Aug 30, 2023, 8:22 PM

#

What do you think it’ll do? Do you know anything about bitwise operations or boolean arithmetic?

umbral charm Aug 30, 2023, 8:22 PM

#

nope nothing about bitwise operator, thats why i didnt want to use it

#

i also know that there is a pipe for OR bitwise

unique ether Aug 30, 2023, 8:22 PM

#

doesn't & return everything that is in both lists?

left tartan Aug 30, 2023, 8:23 PM

#

Let me demonstrate... one sec

umbral charm Aug 30, 2023, 8:24 PM

#

oh

#

& literally goes down to the bits and compares them

#

so if im correct

left tartan Aug 30, 2023, 8:25 PM

#

import pandas as pd
s1 = pd.Series([0, 1, 1, 0, 1])
s2 = pd.Series([1, 0, 1, 1, 0])
print(s1 & s2)

umbral charm Aug 30, 2023, 8:25 PM

#

that would print

#

00100

#

yay or nay?

left tartan Aug 30, 2023, 8:26 PM

#

Yes

#

So, looking at your original code:

#

You have series A which is : (housing['date'] > '2016-12-01') and series B which is: (housing['date'] < '2018-01-01')

unique ether Aug 30, 2023, 8:26 PM

#

this would print 11111 right?

import pandas as pd
s1 = pd.Series([0, 1, 1, 0, 1])
s2 = pd.Series([1, 0, 1, 1, 0])
print(s1 | s2)

left tartan Aug 30, 2023, 8:26 PM

#

So, you want all rows where A is True and B is True
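Put together, the filter could look roughly like this. The data is invented for illustration (the `price` column is an assumption, it is not in the original code); the date column, format, and cut-off dates are the ones from the question.

```python
import pandas as pd

# Hypothetical housing data; only the 'date' column appears in the original code.
housing = pd.DataFrame({
    "date": ["15/11/2016", "03/06/2017", "25/12/2017", "20/02/2018"],
    "price": [250000, 310000, 295000, 330000],
})
housing["date"] = pd.to_datetime(housing["date"], format="%d/%m/%Y")

# Series A and series B: one True/False per row each.
a = housing["date"] > pd.Timestamp("2016-12-01")
b = housing["date"] < pd.Timestamp("2018-01-01")

# Keep only the rows where both A and B are True.
housing2 = housing[a & b]
print(housing2)
```

If the two comparisons are written inline instead of being stored in `a` and `b`, each one needs its own parentheses, because `&` binds more tightly than `>` and `<`.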

left tartan Aug 30, 2023, 8:27 PM

#

unique ether this would print 11111 right? ```py import pandas as pd s1 = pd.Series([0, 1, 1,...

Yes