#data-science-and-ml | Python | Page 77

desert oar Aug 13, 2023, 1:25 AM

#

didn't know about either, interesting

#

i remember seeing a blog post or something about it related to pymc3

iron basalt Aug 13, 2023, 1:25 AM

#

The "Aesara" seems to be the pymc people.

desert oar Aug 13, 2023, 1:25 AM

#

it might even have been pymc3 devs trying to keep the project alive in maintenance mode, at least enough to support pymc3 itself

#

i see, that tracks

fallow frost Aug 13, 2023, 1:25 AM

#

I'll try this and see if its necessary, thank you

iron basalt Aug 13, 2023, 1:25 AM

#

https://github.com/pymc-devs/pytensor

GitHub

GitHub - pymc-devs/pytensor: PyTensor is a fork of Aesara -- a Pyth...

PyTensor is a fork of Aesara -- a Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays. - GitHub - pymc-devs/pytensor: PyT...

desert oar Aug 13, 2023, 1:26 AM

#

iirc theano was more "lower level", more like numpy or jax than pytorch or tensorflow

#

might be interesting to see how pytensor stacks up against the newer frameworks

iron basalt Aug 13, 2023, 1:26 AM

#

"Implements an extensible graph transpilation framework that currently provides compilation via C, JAX, and Numba."

#

Kind of neat to see that someone out there takes these seemingly dead projects and runs with them, at least in spirit.

desert oar Aug 13, 2023, 1:30 AM

#

oh very interesting

#

lots of frameworks now all at various levels of abstraction, some python-specific and some general

fallow frost Aug 13, 2023, 2:27 AM

#

I've got 5 dataframes with the exact same shape, but the values are just slightly different. How can I get the element-wise average of the 5 values? (the output should be another dataframe of the same shape, holding the averages)

#

basically each benchmark generates a dataframe, and I want to display the average

left tartan Aug 13, 2023, 2:55 AM

#

Simplest is to concat then avg

left tartan Aug 13, 2023, 2:56 AM

#

fallow frost basically each benchmark generates a dataframe, and I want to display the averag...

Or get count and avg for each of the 5 and then compute weighted avg

#

Or generate a series by combining the 5 columns and computing average over that.

#

Generally, what I’d do is create a single dataframe with all 5 tests: add a test column, and then do whatever I want with that. That’s the most natural
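A minimal sketch of the "concat then avg" suggestion, assuming the five frames share the same index and columns and hold only numeric values. The frame contents and the `run` / `case` level names are made up for illustration.

```python
import pandas as pd

# Hypothetical benchmark results: same shape, slightly different values per run.
frames = [
    pd.DataFrame(
        {"latency_ms": [10.0 + i, 20.0 + i], "throughput": [100.0 - i, 200.0 - i]},
        index=["small", "large"],
    )
    for i in range(5)
]

# Stack the runs, adding an outer index level that records which run a row came from.
stacked = pd.concat(frames, keys=list(range(len(frames))), names=["run", "case"])

# Average across runs for every (row, column) cell; sort=False keeps the original row order.
average = stacked.groupby(level="case", sort=False).mean()
print(average)
```

Keeping the stacked frame around (rather than only the mean) also makes it easy to compute a standard deviation or min/max per cell later.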

burnt saffron Aug 13, 2023, 4:08 AM

#

https://tenor.com/view/ee-gif-19039292

Tenor

dusk tide Aug 13, 2023, 6:41 AM

#

Hi, I am going through the Spaceship Titanic competition notebook https://www.kaggle.com/code/samuelcortinhas/spaceship-titanic-a-complete-guide/notebook on Kaggle and have a question. While visualizing the missing values of each row, the author made this bar chart and wrote the note/inference "Missing values are independent of the target and for the most part are isolated." What does this mean, and what happens if the missing values are dependent on the target class?

🚀 Spaceship Titanic: A complete guide 🏆

Explore and run machine learning code with Kaggle Notebooks | Using data from Spaceship Titanic

carmine mason Aug 13, 2023, 8:11 AM

#

has anyone here worked with Plackett–Burman designs before, because I have some questions

tough radish Aug 13, 2023, 9:07 AM

#

Hello everyone! 🙂 I am new here and I was wondering if anybody knows where I can find some cool projects for beginners where I can start and learn coding from scratch? I've already gone through some data science courses on DataCamp but I need to process all the input I got there and really practise 🫡

tidal bough Aug 13, 2023, 9:14 AM

#

tough radish Hello everyone! 🙂 I am new here and I was wondering if anybody knows where I ca...

For beginner practice I usually recommend https://www.codewars.com/ - the katas there are grouped by difficulty from 8-kyu to 1-kyu, so you can work your way up.
Doing projects is great but project ideas are hard to recommend - maybe see https://nedbatchelder.com/text/kindling.html.

#

(huh, or do you want data science practice specifically, rather than programming in general?)

tough radish Aug 13, 2023, 9:16 AM

#

Thank you! 🙂 programming in general is fine too! I will enroll for a BA in Data Science next year so I am also looking for Data Science projects to be prepared, but I want to learn programming in general as well, so that's perfectly fine!

lapis sequoia Aug 13, 2023, 11:50 AM

#

Anyone implemented a transformer architecture from scratch?

full herald Aug 13, 2023, 1:11 PM

#

Hi Data Gangs ❗ ❗ ❗

#

If you want to start your Data engineering learning journey and get your hands dirty, here is a roadmap for you

#

https://github.com/Younes1337/Data-Engineering-Roadmap

GitHub

GitHub - Younes1337/Data-Engineering-Roadmap

Contribute to Younes1337/Data-Engineering-Roadmap development by creating an account on GitHub.

#

Join Our Data Tech Community for Data Engineers & Cloud Engineers

errant bison Aug 13, 2023, 4:23 PM

#

how to deploy a deeplearning model from google colab? I want it to run on pyqt5, but do i need to have pkl file or how can i do so?

raw rapids Aug 13, 2023, 4:34 PM

#

does keras's ImageDataGenerator only work for classification tasks or image to image translation tasks as well?

serene scaffold Aug 13, 2023, 4:53 PM

#

errant bison how to deploy a deeplearning model from google colab? I want it to run on pyqt5,...

you cannot. Colab is only for real-time, interactive use.

errant bison Aug 13, 2023, 4:55 PM

#

serene scaffold you cannot. Colab is only for real-time, interactive use.

i still didnt understand. So i have to import all the modules in local system? but wont i still have to make h5 file? pls elaborate a lil

serene scaffold Aug 13, 2023, 4:56 PM

#

errant bison i still didnt understand. So i have to import all the modules in local system? b...

Colab is for experimenting with code. You can't use it to host an app.

slim bone Aug 13, 2023, 4:58 PM

#

Bit of a weird question - I've been learning Pytorch up until now and it appears that most of the books in ML are in TF rather than Pytorch. How difficult is the transition between the two?

serene scaffold Aug 13, 2023, 4:59 PM

#

slim bone Bit of a weird question - I've been learning Pytorch up until now and it appears...

if you understand the concepts, I don't think it will be an issue for you.

errant bison Aug 13, 2023, 4:59 PM

#

serene scaffold Colab is for experimenting with code. You can't use it to host an app.

no no, i dont want to host(not like aws or heroku). My main goal over here is to make a gui, where a user can feed video and get video as output only. So just to create gui, i shld do in local system?

serene scaffold Aug 13, 2023, 5:00 PM

#

errant bison no no, i dont want to host(not like aws or heroku). My main goal over here is to...

if you trained the model on colab, you will have to download the model and make the GUI somewhere else.

cold osprey Aug 13, 2023, 5:01 PM

#

errant bison no no, i dont want to host(not like aws or heroku). My main goal over here is to...

i think u can have a plotly dash app within a jupyter notebook

errant bison Aug 13, 2023, 5:01 PM

#

serene scaffold if you trained the model on colab, you will have to download the model and make ...

yeah, how to download my deeplearning model

serene scaffold Aug 13, 2023, 5:01 PM

#

errant bison yeah, how to download my deeplearning model

are you using pytorch or tensorflow or what

slim bone Aug 13, 2023, 5:01 PM

#

serene scaffold if you understand the concepts, I don't think it will be an issue for you.

Oh that's great to hear. Are there any well-known recommended books btw? I'll probably be learning from this one: https://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438
But I'm curious if there's a book that's typically a "go-to" for learning ML, kind of like AutomateTheBoringStuff for general python-learning

errant bison Aug 13, 2023, 5:02 PM

#

serene scaffold are you using pytorch or tensorflow or what

yes, basically yolo

serene scaffold Aug 13, 2023, 5:02 PM

#

slim bone Oh that's great to hear. Are there any well-known recommended books btw? I'll pr...

not really. the most widely recommended resources for ML are those created by Andrew Ng.

errant bison Aug 13, 2023, 5:02 PM

#

cold osprey i think u can have a plotly dash app within a jupyter notebook

oh i will check tht out, but how to download the model,

slim bone Aug 13, 2023, 5:03 PM

#

serene scaffold not really. the most widely recommended resources for ML are those created by An...

I'm only aware of the Coursera courses - unfortunately it seems I don't really click with video materials though. I'm assuming they have more resources I'm not aware of?

lapis sequoia Aug 13, 2023, 5:06 PM

#

errant bison oh i will check tht out, but how to download the model,

I haven't read the complete discussion, but I saw you mentioned pytorch. For that it's quite straightforward to save a trained model's weights to disk using torch.save(model.state_dict(), 'path'), and you may download them directly from the colab disk
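A small sketch of that save-and-reload round trip, assuming a plain PyTorch model. The `nn.Linear` layer is only a stand-in for the trained network, and the file name is made up. Note that `torch.save` takes the object first and the destination second.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in for the trained network

# Save only the weights (the state dict), not the whole pickled model object.
path = os.path.join(tempfile.mkdtemp(), "model_weights.pt")
torch.save(model.state_dict(), path)

# Later, e.g. on the local machine: rebuild the same architecture, then load the weights.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
restored.eval()  # switch to inference mode before using it in a GUI
```

In Colab the saved file can then be fetched from the file browser, or with `google.colab.files.download(path)`, which only exists inside a Colab runtime.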

mellow grove Aug 13, 2023, 6:23 PM

#

Anybody able to help me out here? I am in need of some code examples of how to take a value from a Shiny user interface (pull-down menu selection for example) and pass that into another body of code in another .py file I have already created to do data analytics with.

An example use case is that I have a Python program called policy_analytics.py that does analysis for an insurance company. The first part of my code runs an SQL query against some tables in Snowflake, and in my WHERE clause, I can filter on policies in a certain state. I have a Shiny application running where I have a "Select State" ui.input_select line of code that allows the user to select which state that the query filters on. How can I pass that state in as a string to my policy_analytics.py file to use when building the query for the desired analysis to be done?
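One possible shape for this, as a sketch rather than a definitive answer: put the analysis in `policy_analytics.py` behind a function that takes the state as an argument, import that function in the Shiny app, and call it with `input.state()` inside a render function. Everything named here (`build_policy_query`, the `policies` table, the `state` column, the choices list) is hypothetical, and the `%s` placeholder assumes the database connector's default bind style. Older Shiny for Python releases also required an `@output` decorator above `@render.text`.

```python
# Hypothetical contents of policy_analytics.py: the state comes in as a parameter.
def build_policy_query(state: str) -> tuple[str, tuple[str, ...]]:
    """Return a parameterized query and its bind values for one state."""
    sql = "SELECT * FROM policies WHERE state = %s"  # table/column names are made up
    return sql, (state,)


def make_app():
    """Build the Shiny app; shiny is imported lazily so the helper above works without it."""
    from shiny import App, render, ui

    app_ui = ui.page_fluid(
        ui.input_select("state", "Select State", choices=["CA", "NY", "TX"]),
        ui.output_text_verbatim("query_preview"),
    )

    def server(input, output, session):
        @render.text
        def query_preview():
            # input.state() returns the current selection as a plain string,
            # and this function re-runs whenever the selection changes.
            sql, params = build_policy_query(input.state())
            return f"{sql}  -- params: {params}"

    return App(app_ui, server)
```

Passing the state as a bind parameter instead of formatting it into the SQL string avoids injection issues if the choices ever come from user-editable data.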

covert vale Aug 13, 2023, 7:21 PM

#

Hello everyone,

This package I created contains multiple abbreviated solutions for multiple sections. For EDA, I created a section where categorical and numeric data are analyzed separately and graphs are drawn according to different scenarios (Boxplot, barchart etc.). Apart from that, the best estimator selection I created for the predict part, which is the last step of the model created with the pipeline object, is hyper param contains the function whose optimization(optuna) is applied in a single line. I recommend you take a look. I published it after trying for 7-8 different cases. Never hesitate to submit bugs or ideas. Thanks.

https://github.com/kaansnmez/lazyauto

GitHub

GitHub - kaansnmez/lazyauto: lazyauto

lazyauto. Contribute to kaansnmez/lazyauto development by creating an account on GitHub.

hollow sparrow Aug 13, 2023, 9:52 PM

#

Any good offline text to speech out there with some reasonable voices? I tried mimic and its good enough to make an intentional robotic voice, which works fine for the droid I'm building, but are there any more of the more realistic models anywhere?

dusky magnet Aug 13, 2023, 10:26 PM

#

hollow sparrow Any good offline text to speech out there with some reasonable voices? I tried m...

Only the languages specific ones are on point, like voicevox for Japanese er i forgor whats the name for the german one and etc.

hollow sparrow Aug 13, 2023, 10:45 PM

#

I see, hmm I am Swedish tho, maybe the german one will be close enough x)

late shell Aug 14, 2023, 4:34 AM

#

Hello, I'm trying to use the llama-2-7b-chat-ggml, 8-bit quantized model by TheBloke (huggingface). I have 33.6 gb ram and Nvidia 1080Ti . But the model is extremely slow. I'm off loading 20 layers to the gpu (gpu_layer=20), but it still takes around 4-5 mins to generate a response and sometimes even hangs indefinitely before I kill it after 15 mins. I know it shouldn't be taking so long. Can someone please help me with this. My prompt looks something like this:

Use the given question and context to generate a detailed, authentic description about the machine. Make it sound as if you are a great salesman and are pitching this machine to a potential buyer. Use good formatting and the description should not be too long (About 200 words only). Try to make it as easy to read as possible. Most importantly, you must include all the information provided under the context in the description that you generate. Do not make up new information. It's a pre owned machine, therefore the description should not be like the launch of a new product.

Generate a description of the machine using the information provided under the Context.
 
Context: 
categoryName: Post Press
subcategoryName: Saddle Stitcher
subsubcategoryName: Conveyor belt
manufacturerName: Monotype
Year: 2001
MachineModelName: Boston Double Head Stitching
Location: Germany
Info: DOUBLE HEAD STITCHING MACHINE BOSTON
2 HEAD FLAT AND SADDLE STITCHING MACHINE
DOUBLE WIRE

lapis sequoia Aug 14, 2023, 4:39 AM

#

hollow sparrow Any good offline text to speech out there with some reasonable voices? I tried m...

Hi, there are a few open-source TTS models you could look at; the good voice cloning/generating ones are restrictive to use because a lot of ethical issues come into play. My personal favourites are SpeechT5 and Tacotron 2. SpeechT5 was pretrained for 6 different tasks, hence I found it quite generalised; the only downside I found is the 600-token length limit on input.
Apart from them, there are a few newer models based on GenAI, like VALL-E and Tortoise.

south edge Aug 14, 2023, 6:05 AM

#

#

i have a doubt in this pic

young granite Aug 14, 2023, 6:08 AM

#

south edge i have a doubt in this pic

maybe elaborate?!

south edge Aug 14, 2023, 6:09 AM

#

are the summing junction and activation function considered as hidden layers?

young granite Aug 14, 2023, 6:10 AM

#

depends on whom u ask i guess all between input and output is hidden layer for me

south edge Aug 14, 2023, 6:11 AM

#

oh

#

and why are there so many weights

young granite Aug 14, 2023, 6:12 AM

#

?

south edge Aug 14, 2023, 6:12 AM

#

even though they give the same output

#

in this pic

young granite Aug 14, 2023, 6:15 AM

#

normally u would combine this into one node

#

so to clarify myself a bit u see green, red and yellow -> 3 nodes

south edge Aug 14, 2023, 6:17 AM

#

we would get the same output if we make it as a single node right?

young granite Aug 14, 2023, 6:17 AM

#

this is just a more detailed schematic of something like this

young granite Aug 14, 2023, 6:17 AM

#

south edge we would get the same output if we make it as a single node right?

no

south edge Aug 14, 2023, 6:17 AM

#

no i mean

#

#

not the nodes

#

i have confused myself a little bit

#

the picture above is simliar to the picture i uploaded before right

#

and both of them would give the same output either way

#

then why make it complex in the first picture

#

or is it any method to make our computation faster

young granite Aug 14, 2023, 6:37 AM

#

those are simply example pictures to give u an idea of what happens inside the NN

heady fulcrum Aug 14, 2023, 8:06 AM

#

Hello everyone
I'm new here

And I just started learning python
I'm open to network and learn more

rough mural Aug 14, 2023, 8:14 AM

#

anyone with some experience in generative ai

#

need some help

young granite Aug 14, 2023, 8:14 AM

#

rough mural need some help

ask questions directly

rough mural Aug 14, 2023, 8:14 AM

#

ok

#

i have to generate some outfit after getting some result from the dataset

#

which is then given to DALL-E for image generation

#

but this takes time and a lottt of memory

#

! pip install min-dalle -q

#

from min_dalle import MinDalle
model = MinDalle(is_mega=True, is_reusable=True)

#

seed = 6
grid_size = 1
display(model.generate_image(prompt, seed, grid_size))

#

this is the code that i am using. it is open source but is very heavy on my laptop

#

any suggestions on how i can make it faster?

mild dirge Aug 14, 2023, 8:18 AM

#

make sure it runs on gpu

#

or find a smaller model

rough mural Aug 14, 2023, 8:18 AM

#

it is

mild dirge Aug 14, 2023, 8:18 AM

#

Well, that is pretty much all you can do

rough mural Aug 14, 2023, 8:18 AM

#

it takes around 10 GB of GPU memory

#

any method in which i can save the progress

#

and then run the last line only

lapis sequoia Aug 14, 2023, 8:33 AM

#

hello, does anyone know how to make images like this for my model?

rough mural Aug 14, 2023, 8:34 AM

#

i can help

#

but what will be the final image

#

@lapis sequoia the image on the top right or the bottom

lapis sequoia Aug 14, 2023, 8:35 AM

#

the architecture one, left with the backbone and the layers

#

i want to draw one for my model

rough mural Aug 14, 2023, 8:35 AM

#

ok

#

can you explain a little further

lapis sequoia Aug 14, 2023, 8:37 AM

#

i made a model i need to explain to people and i want a tool to make something like this

mild dirge Aug 14, 2023, 8:37 AM

#

probably made with tikz library in tex

#

#

Made similar sketches with it

#

Maybe this is of use @lapis sequoia https://github.com/HarisIqbal88/PlotNeuralNet

GitHub

GitHub - HarisIqbal88/PlotNeuralNet: Latex code for making neural n...

Latex code for making neural networks diagrams. Contribute to HarisIqbal88/PlotNeuralNet development by creating an account on GitHub.

lapis sequoia Aug 14, 2023, 8:40 AM

#

i found many tools like this, but they make a detailed architecture like the one you sent, whereas the one i sent is pretty simple and many people have it, so i thought it's a tool i just don't know

#

mild dirge Aug 14, 2023, 8:41 AM

#

Not sure if there is a no-effort solution. If you find one, sure let me know though

lapis sequoia Aug 14, 2023, 8:41 AM

#

thank you anyway

mild dirge Aug 14, 2023, 8:41 AM

#

That tikz library is quite a pain to use, so if there is an easy solution that would be great 😛

mild dirge Aug 14, 2023, 8:43 AM

#

lapis sequoia i found many tools like this but they make a detailled architecture like the one...

These types of images have probably been handcrafted though. Especially with the example cat image and all the little details. I don't think this is just a pytorch NN-to-image generator

silk cipher Aug 14, 2023, 9:03 AM

#

hey everybody, just a question, how do I load my own dataset, like I made one in JSON and it looks like this:

```
{"dataset":
  {
  "input": "some input",
  "output": ["outputs"]
  }, ...}
```
how do I load this in pytorch

#

also it's my first time trying this with a custom dataset that i made, so i need to know how to process it and make it ingestable by the CRF model I'm trying to make

mild dirge Aug 14, 2023, 9:06 AM

#

If it's not in a standard format, you make a custom dataset

#

In that you could just use json.load() or whatever to get the data, and then put it into tensors

#

You also need to look at what format the model takes the data
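A rough sketch of such a custom dataset, under loud assumptions: the JSON holds a list of records under `"dataset"`, each `"output"` list carries one tag per whitespace-separated token of `"input"`, and the sample sentences and tag names are invented. A real file would be read with `json.load(open(...))` instead of the inline string.

```python
import json

import torch
from torch.utils.data import Dataset

# Assumed layout (the real file may differ): a list of input/output records.
RAW = (
    '{"dataset": ['
    '{"input": "book a flight", "output": ["O", "O", "B-ITEM"]}, '
    '{"input": "cancel it", "output": ["O", "B-ITEM"]}]}'
)


class TaggedTextDataset(Dataset):
    """Maps tokens and tags to integer ids so they can be fed to a tagging model."""

    def __init__(self, records):
        self.records = records
        tokens = sorted({tok for r in records for tok in r["input"].split()})
        tags = sorted({tag for r in records for tag in r["output"]})
        # Token id 0 is left free for padding when sequences are batched.
        self.token_to_id = {tok: i + 1 for i, tok in enumerate(tokens)}
        self.tag_to_id = {tag: i for i, tag in enumerate(tags)}

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        record = self.records[idx]
        token_ids = [self.token_to_id[t] for t in record["input"].split()]
        tag_ids = [self.tag_to_id[t] for t in record["output"]]
        return (
            torch.tensor(token_ids, dtype=torch.long),
            torch.tensor(tag_ids, dtype=torch.long),
        )


dataset = TaggedTextDataset(json.loads(RAW)["dataset"])
tokens, tags = dataset[0]
```

Strings never go into the model directly; they become integer ids here, and the list of outputs becomes a tensor of tag ids with the same length as the token sequence. Batching sequences of different lengths would still need padding and a mask.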

silk cipher Aug 14, 2023, 9:08 AM

#

mild dirge In that you could just use `json.load()` or whatever to get the data, and then p...

oh, that easily, like other datasets? what about the strings? and the list of outputs [cause i need it to output a list]

mild dirge Aug 14, 2023, 9:09 AM

#

I haven't used a CRF myself, so I wouldn't know those specifics

#

That is more model dependent though I think

silk cipher Aug 14, 2023, 9:09 AM

#

hey here are the docs of torchCRF

```
>>> seq_length = 3  # maximum sequence length in a batch
>>> batch_size = 2  # number of samples in the batch
>>> emissions = torch.randn(seq_length, batch_size, num_tags)
>>> tags = torch.tensor([
...   [0, 1], [2, 4], [3, 1]
... ], dtype=torch.long)  # (seq_length, batch_size)
>>> model(emissions, tags)
tensor(-12.7431, grad_fn=<SumBackward0>)
```
does this help?

#

the docs

quartz wigeon Aug 14, 2023, 9:30 AM

#

is there a way to teach an agent not to take an invalid action in reinforcement learning? I'm using stable-baselines3.

silk cipher Aug 14, 2023, 9:48 AM

#

what do you consider as an invalid action

mild dirge Aug 14, 2023, 10:02 AM

#

quartz wigeon is there a way to teach an agent not to take an invalid action in reinforcement ...

You could give it a negative reward if it performs an invalid action

lapis sequoia Aug 14, 2023, 10:58 AM

#

quartz wigeon is there a way to teach an agent not to take an invalid action in reinforcement ...

i guess you would first need to bound the actions it can take. For example, if the agent is moving on a grid, it shouldn't even be possible to move north when you are in the northernmost cell. We usually encode every possible state, and actions should be chosen from the pool that is valid for that state
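A framework-free sketch of that idea (often called action masking), assuming a small grid world with four moves. The grid size, action numbering and Q-values are all invented. For stable-baselines3 specifically, the separate sb3-contrib package ships a `MaskablePPO` algorithm built around the same concept; its docs are the place to check how the environment should expose the mask.

```python
import numpy as np

GRID_SIZE = 3
# Action ids: 0 = north, 1 = south, 2 = west, 3 = east (an arbitrary convention).
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def valid_action_mask(row: int, col: int) -> np.ndarray:
    """True for every move that keeps the agent on the grid."""
    mask = np.zeros(len(MOVES), dtype=bool)
    for action, (d_row, d_col) in MOVES.items():
        mask[action] = 0 <= row + d_row < GRID_SIZE and 0 <= col + d_col < GRID_SIZE
    return mask


def greedy_valid_action(q_values: np.ndarray, mask: np.ndarray) -> int:
    """Pick the best action among the valid ones by setting invalid Q-values to -inf."""
    masked = np.where(mask, q_values, -np.inf)
    return int(np.argmax(masked))


# In the top-left corner, north and west are invalid even though their Q-values are highest.
q = np.array([5.0, 1.0, 4.0, 2.0])
action = greedy_valid_action(q, valid_action_mask(0, 0))
```

Compared with a negative reward, masking means the agent never has to learn that a move is invalid, which also avoids it getting stuck retrying the same invalid move.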

pastel cedar Aug 14, 2023, 12:34 PM

#

Hi, guys

#

any1 here know about codebasics resume C7?

#

if yes, and if any1 would like to work on that, then ping me

zealous hollow Aug 14, 2023, 1:06 PM

#

i am using STL approach for a time series
i think this is wrong right?

#

trend past seems right
right?

quartz wigeon Aug 14, 2023, 2:52 PM

#

mild dirge You could give it a negative reward if it performs an invalid action

what I'm doing now is ignoring reward whenever the agent decides to take an invalid move, but it seems to get the agent stuck in a particular state trying to take an invalid move over and over again

quartz wigeon Aug 14, 2023, 2:52 PM

#

lapis sequoia i guess you would need at first to bound the actions it could take. For exemple ...

how so? is this possible with sb3?

lapis sequoia Aug 14, 2023, 2:53 PM

#

i don't know what sb3 is, I had to write Deep Qlearning with python for my RL class so I don't know

timid kiln Aug 14, 2023, 3:00 PM

#

In you guys' opinion, what would you say is the most widely used geoprocessing/plotting-type python library?

south edge Aug 14, 2023, 3:01 PM

#

matplotlib

timid kiln Aug 14, 2023, 3:01 PM

#

Reason I ask is I've headed down this road of putting data on maps and folium doesn't seem to have a lot of support out there.

timid kiln Aug 14, 2023, 3:02 PM

#

south edge matplotlib

Really? For maps? I had no idea. I just thought it was more for plotting for data analysis/spreadsheet type stuff.

south edge Aug 14, 2023, 3:02 PM

#

oops

timid kiln Aug 14, 2023, 3:02 PM

#

lol

south edge Aug 14, 2023, 3:02 PM

#

i thought you asked me about plotting

#

im sorry lol

timid kiln Aug 14, 2023, 3:02 PM

#

No worries. 🙂

lapis sequoia Aug 14, 2023, 3:28 PM

#

i tried leaflet

#

its good but in js

lapis sequoia Aug 14, 2023, 5:16 PM

#

Heyyy, I have learned the basics of python but I'm especially interested in data science and AI, does anyone know a good book / course to start on this topic??

west grail Aug 14, 2023, 6:31 PM

#

lapis sequoia Heyyy, I have learns the basics of python but I'm specially intrested in data sc...

go check this book: "Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems"

lapis sequoia Aug 14, 2023, 7:14 PM

#

thanks, will do

zealous hollow Aug 14, 2023, 10:06 PM

#

got a question relating to time series forecasting

data = signal(trend+seasonality) + noise

what are some methods that are used to forecast the signal
like one is simple extrapolation

#

also can we try forecasting the noise as well?

wooden sail Aug 15, 2023, 3:51 AM

#

zealous hollow also can we try forecasting the noise as well?

you can try, but barring very special cases, you will get it right with probability 0

#

due to how continuous probability distributions work, the probability of a single event is 0
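On the signal part of the question, one simple baseline (a sketch, not the only method) is to fit a straight-line trend, estimate the seasonal effect as the average detrended value at each position in the cycle, and extrapolate both. The function name, the period of 4 and the synthetic series are assumptions for illustration. Exponential smoothing and ARIMA-style models are common alternatives.

```python
import numpy as np


def forecast_trend_plus_season(y: np.ndarray, period: int, horizon: int) -> np.ndarray:
    """Extrapolate a linear trend plus a repeating seasonal pattern."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)
    detrended = y - (slope * t + intercept)
    # Mean detrended value at each position in the cycle is the seasonal estimate.
    seasonal = np.array([detrended[pos::period].mean() for pos in range(period)])
    future_t = np.arange(len(y), len(y) + horizon)
    return slope * future_t + intercept + seasonal[future_t % period]


# Synthetic series with a known trend and seasonality plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(48)
season = np.tile([2.0, -1.0, -2.0, 1.0], 12)
y = 0.5 * t + season + rng.normal(scale=0.1, size=t.size)

forecast = forecast_trend_plus_season(y, period=4, horizon=4)
print(forecast)
```

The noise term is deliberately not forecast here: the point forecast only targets the signal, and the noise is better summarized by a prediction interval than by guessing individual values.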

hard shoal Aug 15, 2023, 5:24 AM

#

Hi,guys

Where can I download datasets other than kaggle and the UCI Machine Learning Repository..?

zealous hollow Aug 15, 2023, 6:06 AM

#

hard shoal Hi,guys Where can I download datasets other than kaggle and the UCI Machine Lea...

i have a data set you can work with it if you want 😂

#

real life problem 😔
time series analysis and forecasting

rough mural Aug 15, 2023, 8:02 AM

#

i need to make an AI chatbot for fashion recommendation can anyone help me what to use

#

or anyone has some previous code that'll help me

bronze vessel Aug 15, 2023, 8:27 AM

#

I wanna be datascience dev but i dont know about hat

#

that*

vestal widget Aug 15, 2023, 10:26 AM

#

If i want to train an existing language model for use as a conversational chatbot, should i use embeddings or fine-tuning?

lusty raptor Aug 15, 2023, 10:27 AM

#

Hey guys
I need unique machine learning project ideas that use cnn or NLP
I need to make a solo project for my course

brittle knoll Aug 15, 2023, 10:45 AM

#

anyone has ideas to make a presentation more interactive and fun with AI

fluid spindle Aug 15, 2023, 1:08 PM

#

Hey, I was thinking of creating a loop counter to preprocess MNIST

#

Like it should return 1 for handwritten "0" , "6", and "9"; 2 for "8" and 0 for others

fluid spindle Aug 15, 2023, 1:29 PM

#

#

here's sample of it

agile cobalt Aug 15, 2023, 1:29 PM

#

but some people do use a loop for 2

#

from your very example

fluid spindle Aug 15, 2023, 1:30 PM

#

yes

#

i think preprocessing with it would swing both ways huh

agile cobalt Aug 15, 2023, 1:32 PM

#

with deep learning it's oftentimes better to not overengineer features - convolutional layers could identify loops, and the network itself may count them if it deems that information useful

fluid spindle Aug 15, 2023, 1:33 PM

#

#

this is the confusion matrix, maybe I could use it exclusively to recognize 8s

#

or rather to eliminate FP 8s

mild dirge Aug 15, 2023, 1:36 PM

#

what are the values of that confusion matrix?

fluid spindle Aug 15, 2023, 1:37 PM

#

rows are actual labels and columns are predictions, SGD classifier used

mild dirge Aug 15, 2023, 1:37 PM

#

What is the model then? Because you were talking about a loop counter

mild dirge Aug 15, 2023, 1:38 PM

#

fluid spindle

And this seems to predict the number itself

fluid spindle Aug 15, 2023, 1:38 PM

#

yes, I just want to reduce the FPR for predicted 8s by running a loop counter

#

I seem to need to find whether there're two closed areas (usually roughly circular) for it

#

ignoring there are a bunch of 8s not fully closed
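One way such a loop counter could be sketched is by counting background regions that are not connected to the image border, using a flood fill. This is an illustrative approach rather than anything from the book, and the threshold assumes pixel intensities scaled to the 0 to 1 range. As noted above, an 8 that is not fully closed will be counted with fewer loops.

```python
from collections import deque


def count_loops(image, threshold=0.5):
    """Count enclosed background regions (holes) in a 2D grid of pixel intensities."""
    rows, cols = len(image), len(image[0])
    # Pad with a background border so all outside background forms one connected region.
    ink = [[False] * (cols + 2)]
    ink += [[False] + [px > threshold for px in row] + [False] for row in image]
    ink += [[False] * (cols + 2)]
    seen = [[False] * (cols + 2) for _ in range(rows + 2)]

    def flood(start_r, start_c):
        # Breadth-first fill over 4-connected background pixels.
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                in_bounds = 0 <= nr < rows + 2 and 0 <= nc < cols + 2
                if in_bounds and not ink[nr][nc] and not seen[nr][nc]:
                    seen[nr][nc] = True
                    queue.append((nr, nc))

    flood(0, 0)  # mark the outside background
    loops = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if not ink[r][c] and not seen[r][c]:
                loops += 1  # background the outside fill never reached is enclosed by ink
                flood(r, c)
    return loops


# Tiny hand-drawn digits (1 = ink), just to exercise the function.
EIGHT = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [1, 0, 1], [1, 1, 1]]
ZERO = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
ONE = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
print(count_loops(EIGHT), count_loops(ZERO), count_loops(ONE))
```

The resulting count could be appended as an extra feature for the SGD classifier, or used as a post-check on predicted 8s.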

mild dirge Aug 15, 2023, 1:49 PM

#

I don't really see the point of this, as that is exactly what a CNN would try to do. If you want to go that route you can make custom convolutional kernels, but why bother?

fluid spindle Aug 15, 2023, 1:50 PM

#

upgrading the model

#

or gathering more 8s to train on

#

dunno, just brainstorming

mild dirge Aug 15, 2023, 1:51 PM

#

You have tried using a CNN on the data, and it gets 8s wrong?

fluid spindle Aug 15, 2023, 1:52 PM

#

I haven't, am tracking a book, it gonna tackle on neural networks in chapter 2

#

anyways, thanks for the insight, am still learning what or whatnot would be appropriate in given cases

fringe vector Aug 15, 2023, 2:21 PM

#

fluid spindle I haven't, am tracking a book, it gonna tackle on neural networks in chapter 2

mind sharing the book?

fluid spindle Aug 15, 2023, 2:23 PM

#

fringe vector mind sharing the book?

https://g.co/kgs/xz7AwU

Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concept...

Book

fringe vector Aug 15, 2023, 2:23 PM

#

nachoPray ty

fluid spindle Aug 15, 2023, 2:24 PM

#

ofc silly

hard shoal Aug 15, 2023, 2:26 PM

#

zealous hollow i have a data set you can work with it if you want 😂

Sure, you can send it to me

#

because I need a case study to improve my skills

desert oar Aug 15, 2023, 3:03 PM

#

mild dirge I don't really see the point of this, as that is exactly what a CNN would try to...

fwiw it's probably a good idea to try manual feature engineering as part of your learning path, it really gives you an appreciation for how amazing CNNs are and what things were like in the pre-CNN dark ages

#

that's really what kicked off the deep learning revolution imo, without CNN-on-MNIST we wouldn't have ChatGPT

vestal widget Aug 15, 2023, 4:21 PM

#

If i want to train an existing language model for use as a conversational chatbot, should i use embeddings or fine-tuning?

serene scaffold Aug 15, 2023, 4:30 PM

#

vestal widget If i want to train an existed language model for the use of conversation chatbot...

"embedding" and "fine tuning" are not different possibilities for the same thing. they are totally separate things.

#

"fine tuning" is the process of continuing to train an existing model, so if you are going to train an existing language model, that is necessarily fine tuning.

potent sky Aug 15, 2023, 4:50 PM

#

has anyone figured out a way to add GPU support for GPT4All models? Or any other consumer grade models?

past meteor Aug 15, 2023, 4:51 PM

#

desert oar fwiw it's probably a good idea to try manual feature engineering as part of your...

To add to this. In uni we did a little bit of old school CV and then switched to DL and that's when it clicked

potent sky Aug 15, 2023, 4:51 PM

#

I've made a mini internet-connected chatbot for a hobby project with langchain, a vector db...the whole thing
but the inference gets pretty slow especially with increased context size
so I'm looking to run it on GPU

past meteor Aug 15, 2023, 4:52 PM

#

Working with those methods also gives you an appreciation for the challenges that image recognition has.

potent sky Aug 15, 2023, 4:54 PM

#

desert oar fwiw it's probably a good idea to try manual feature engineering as part of your...

pre-CNN dark ages is an apt description lmao

wooden sail Aug 15, 2023, 5:39 PM

#

on the other hand, the post-CNN ages are also dark, just for different reasons

#

your task is now successful, but now you don't understand why

civic elm Aug 15, 2023, 6:15 PM

#

How can Chatgpt survive? It has no monetization except the paid api. Can't go to enterprise because of data sensitivity, Can't have ads in the chat ui, has lawsuits to fight.. etc..

past meteor Aug 15, 2023, 6:21 PM

#

The paid version of GPT is a lot better than the free tier. Several orders of magnitude

#

On top of that, many companies are building services that use GPT. I went to an AI "conference" a few months ago and that seemed to be the hype thing

#

Many of the things they were doing were basic (conversational) information retrieval but the GPT API makes doing that a lot easier than whatever topic modelling people were doing a decade ago. It's at the level at which software engineers can make a solution in a couple of days. The quality is a different discussion though....

young granite Aug 15, 2023, 6:46 PM

#

Does anyone know of an open case study for ML applications?
It's for a job application, so < 8h in total.

past meteor Aug 15, 2023, 7:25 PM

#

young granite Does anyone know of an open case study for ML applications? Its for an job appli...

there's the designing data intensive applications book

#

But I'm not fully sure what your question is because there's a lot of degrees of freedom. Is it an application of AI or an application with AI

young granite Aug 15, 2023, 7:43 PM

#

past meteor there's the designing data intensive applications book

I would like to record a rough horizon of expectations as preparation for possible case studies in the context of my application process.

iron basalt Aug 15, 2023, 8:59 PM

#

past meteor Many of the things they were doing were basic (conversational) information retri...

OpenAI is still operating at a massive loss, ChatGPT is not sustainable. Which is why they have been giving it massive downgrades. They seem to be trying to have smaller networks finish the rest of the prompt result, rather than the full larger one.

serene scaffold Aug 15, 2023, 9:02 PM

#

iron basalt OpenAI is still operating at a massive loss, ChatGPT is not sustainable. Which i...

.randomcase but ChatGPT is constantly improving superlinearly

strange elbowBOT Aug 15, 2023, 9:02 PM

#

BuT ChatgpT Is COnSTantlY imPrOViNg supeRlINEArly

iron basalt Aug 15, 2023, 9:22 PM

#

serene scaffold .randomcase but ChatGPT is constantly improving superlinearly

As is their electricity bill.

#

25,000 GPUs not enough.

#

Who would have guessed that dense operations and backpropagation don't scale in terms of performance per watt. >.>

#

I wonder why the brain has sparse activity...

serene scaffold Aug 15, 2023, 9:33 PM

#

iron basalt I wonder why the brain has sparse activity...

the latent space is for all those memories you forget about for 10+ years and then suddenly remember

past meteor Aug 15, 2023, 9:34 PM

#

But they come out a bit mangled

iron basalt Aug 15, 2023, 9:35 PM

#

serene scaffold the latent space is for all those memories you forget about for 10+ years and th...

It's the difference between lighting up all the transistors and having only a few active at any given time, dense operations vs branching. GPU happens to be good for the former. If your brain did this it would generate enough heat to instantly melt your head.

serene scaffold Aug 15, 2023, 9:36 PM

#

iron basalt It's the difference between lighting up all the transistors and having only a fe...

sometimes I'd take that if it gave me the brainpower to debug certain things.

iron basalt Aug 15, 2023, 9:37 PM

#

serene scaffold sometimes I'd take that if it gave me the brainpower to debug certain things.

Like those head overheating scenes in cartoons.

#

Sparsity also untangles different things, which is needed, and dense networks end up learning this, but don't get the performance / energy benefit because they are still touching all the values / not branching.

misty flint Aug 15, 2023, 10:16 PM

#

past meteor But they come out a bit mangled

accurate

#

like a fever dream

dense crane Aug 15, 2023, 10:17 PM

#

which is more recommended for pytorch, DataParallel or DistributedDataParallel?

#

i was testing the first one and it sped up training by 5%, which is good but not satisfying. can the 2nd be faster?

late shell Aug 16, 2023, 5:36 AM

#

late shell Hello, I'm trying to use the llama-2-7b-chat-ggml, 8-bit quantized model by TheB...

Hello, can someone help me with this please?

urban knoll Aug 16, 2023, 6:30 AM

#

I am searching for any 80 class `.pt` models compatible with `pytorch --version 1.3.1` (Python 2.7) on the internet. Thought I could ask on here as well. Any leads would be helpful.

verbal venture Aug 16, 2023, 6:40 AM

#

can someone explain what this math means? Is that the equation when y=1, y=0 etc.

verbal venture Aug 16, 2023, 6:40 AM

#

urban knoll I am searching for any 80 class `.pt` models compatible with `pytorch --version ...

try yolo. I think it's 80 classes

#

why that mode of python specifically?

#

@urban knoll sorry coco

small wedge Aug 16, 2023, 7:16 AM

#

verbal venture can someone explain what this math means? Is that the equation when y=1, y=0 etc...

Yes, that's called a piecewise function. The part on the right side is the condition that tells you which function to use and the part on the left is the function you use under a given condition.
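
A minimal sketch of what a piecewise function looks like in code. The formula from the question isn't visible here, so the per-sample log loss below is only an assumed example of a function that switches on y = 1 vs y = 0, and `piecewise_cost` is a made-up name:

```py
import math

def piecewise_cost(p: float, y: int) -> float:
    """Assumed example: the condition on y picks which formula is used."""
    if y == 1:
        return -math.log(p)        # formula used when y = 1
    if y == 0:
        return -math.log(1.0 - p)  # formula used when y = 0
    raise ValueError("y must be 0 or 1")

# Same prediction p, different branch depending on y.
print(piecewise_cost(0.9, 1))  # small cost: confident and right
print(piecewise_cost(0.9, 0))  # large cost: confident and wrong
```

Each `if` plays the role of the condition on the right side of the brace, and each `return` is the function on the left.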

verbal venture Aug 16, 2023, 7:16 AM

#

small wedge Yes, that's called a piecewise function. The part on the right side is the cond...

ty!

urban knoll Aug 16, 2023, 8:44 AM

#

verbal venture @urban knoll sorry coco

Doesn't necessarily have to be 90 class, just a .pt file that works for that version of pytorch. I'm currently dealing with old software made in python 2.7, so it's a whole thing

verbal venture Aug 16, 2023, 9:10 AM

#

look up to see if coco works

#

if not find the year 2.7 came out and then look up 'x year pytorch dataset' or whatever

steady nacelle Aug 16, 2023, 9:20 AM

#

What legit roadmap should I follow for landing a machine learning engineer entry position?

agile cobalt Aug 16, 2023, 9:27 AM

#

a degree would be a rather safe choice

small wedge Aug 16, 2023, 10:35 AM

#

can you send the full traceback please? also you can use code formatting to prevent discord from making silly mistakes like the thumbs down

#

!code

arctic wedgeBOT Aug 16, 2023, 10:35 AM

#

Formatting code on discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

small wedge Aug 16, 2023, 10:41 AM

#

looks like you are passing a VideoCapture object to cv2.rectangle

#

you used cap.read() to extract the frame later in the code

#

well that will create a tuple of the ret and the frame

#

I'm not super familiar with using cv2 in this way so I'd just assume you need to extract the frame like you did but earlier and pass it. What exactly is cv2.rectangle intended to be doing here?

#

yeah I think you might want the frame

#

I assume it returns something? either that or if it modifies you will need to make sure you use the same frame object when you try to imshow i.e. you could just move all this down to your while loop

lusty lotus Aug 16, 2023, 11:07 AM

#

what does the video mean by "the direction of the negative gradient"? i thought lines w/ positive gradient was like / and negative gradients like \ but why in the video he's pointing at a / and moves his cursor downwards? isn't that positive gradient?
https://youtu.be/8d6jf7s6_Qs?t=169

signal lintel Aug 16, 2023, 11:09 AM

#

i need images like that perspective-transformed to a normal view

# https://github.com/darklab8/darklab_peasant/blob/a383b5ee02a5a645bede4df53c1aaa572c7a1236/peasant/captch_solver.py#L90
pts1 = np.float32([left_top_corner,left_bottom_corner,right_top_corner,right_bottom_corner]) # type: ignore
pts2 = np.float32([[22,11],[22,33],[160,11],[160,33]]) # type: ignore
M = cv2.getPerspectiveTransform(pts1,pts2)
dst = cv2.warpPerspective(out,M,(200,50))
out2 = dst.copy()

in order for them to be properly recognized by Tesseract

I found it is possible to do the perspective transform with this code; the problem is identifying the corners of the text. my current algorithm to identify corners sucks.
Could someone help me do that 🙈 it is an open source project, a pet project to register for a queue for updating a passport 😄

https://github.com/darklab8/darklab_peasant/blob/a383b5ee02a5a645bede4df53c1aaa572c7a1236/peasant/captch_solver_tests.py#L24
I wrote expected tests too

🙈 in general I wrote the project very carefully (hopefully):
mypy'ed everything
unit tested
as a last step I'm going to deploy it to AWS as an EventBridge cron-triggered Lambda, so it's serverless
you would be a great contributor if you helped with it. Could be part of a portfolio if interested

arctic wedgeBOT Aug 16, 2023, 11:09 AM

#

`peasant/captch_solver.py` line 90
```py
# Perceptive transform
```
`peasant/captch_solver_tests.py` line 24
```py
def test_captcha(img_num: int, expected: int) -> None:
```

small wedge Aug 16, 2023, 11:10 AM

#

lusty lotus what does the video mean by "the direction of the negative gradient"? i thought ...

the negative of the gradient

#

i.e. we subtract the gradient from it (it being whatever we're optimizing, weights in the case of nn)

lusty lotus Aug 16, 2023, 11:11 AM

#

small wedge the negative *of* the gradient

can you pls elaborate? i thought if you take the negative of the gradient it's gradient * -1 since "of" is *-1 and it makes the gradient negative?

small wedge Aug 16, 2023, 11:11 AM

#

weights - learning_rate * gradient == weights + learning_rate * -1 * gradient

#

it's just a bit of a confusing way to say it

#

but people refer to it as the negative of the gradient fairly commonly, at least in my experience
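
A tiny sketch of that identity with made-up numbers (the weights, gradient and learning rate here are just illustrative values):

```py
learning_rate = 0.1
weights = [0.5, -1.0]
gradient = [2.0, -4.0]   # made-up values, just for illustration

# Subtracting the gradient...
step_a = [w - learning_rate * g for w, g in zip(weights, gradient)]
# ...is the same as adding the negative of the gradient.
step_b = [w + learning_rate * (-1 * g) for w, g in zip(weights, gradient)]

print(step_a)
print(step_b)
```

Both lists come out the same, so "move in the direction of the negative gradient" and "subtract the gradient" describe one and the same update.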

lusty lotus Aug 16, 2023, 11:12 AM

#

wtf

#

now it's worse

small wedge Aug 16, 2023, 11:13 AM

#

XD

lusty lotus Aug 16, 2023, 11:14 AM

#

small wedge i.e. we subtract the gradient from it (it being whatever we're optimizing, weigh...

but how can we obtain this?

small wedge Aug 16, 2023, 11:14 AM

#

how do we obtain the gradient, is that what you're asking?

lusty lotus Aug 16, 2023, 11:15 AM

#

small wedge how do we obtain the gradient, is that what you're asking?

yeah and how does all of this relate to making the weights more correct to produce a correct output

#

well considering the fact that im watching that video, which is "for dummies" im alr struggling with shit so minimum math jargon pls

small wedge Aug 16, 2023, 11:15 AM

#

do you know any calculus? this is a type of function optimization that uses derivatives

lusty lotus Aug 16, 2023, 11:16 AM

#

small wedge do you know any calculus? this is a type of function optimization that uses deri...

well only really basics, where all i know is to find a gradient of a point, take the equation of the line and decrease the power of each variable by 1 and multiply the power to the coefficients and drop constants

#

but idk how any of that ties in with nn, all i know is "oh you need calc for nn" and all that but idk anything outside of this

small wedge Aug 16, 2023, 11:18 AM

#

ah the power rule my beloved XD. all that really matters is that you understand what a derivative is: it's a (function that describes a) rate of change. Using that rate of change for a function, we can tell how changing a variable in the function will affect its output

#

we are optimizing the cost function, so we are trying to change all the inputs to that function (weights and biases) to lower the output of that function

lusty lotus Aug 16, 2023, 11:20 AM

#

small wedge ah the power rule my beloved XD. all that really matters is that you understand...

as in the rate of change for that pt? but i thought since say for this pt, the equation of the line is the same as the equation of the line here (slightly lower lol), wouldn't the derivative be the same?

Screenshot_2023-08-16_at_12.19.43_PM.png

Screenshot_2023-08-16_at_12.20.09_PM.png

#

(hmm it seems the screenshot is not showing completely, press on the imgs lol)

lusty lotus Aug 16, 2023, 11:23 AM

#

lusty lotus as in the rate of change for that pt? but i thought since say for this pt, the e...

@small wedge im still confused, can you pls explain this again?

#

also, the idea of "MSE" is very arbitrary to me. why not just like an "absolute error" as a criterion? a bit like the MAE? wouldn't squaring it like scale it exponentially?

small wedge Aug 16, 2023, 11:26 AM

#

lusty lotus as in the rate of change for that pt? but i thought since say for this pt, the e...

I'm sorry I'm not 100% sure what you're asking, the derivative should be the same as the original function? i.e. C(x) = x^2 are you saying C' is also x^2?

small wedge Aug 16, 2023, 11:27 AM

#

lusty lotus also, the idea of "MSE" is very arbitrary to me. why not just like an "absolute ...

the point is to scale it up (quadratically, strictly speaking, not exponentially): we want the model to be punished much more for big mistakes than for small ones
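
A quick sketch comparing the two criteria on made-up error values. `mae` and `mse` are hand-rolled helpers for illustration, not calls from any library:

```py
def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    return sum(e * e for e in errors) / len(errors)

# Two made-up sets of errors with the same total absolute error.
many_small = [1.0, 1.0, 1.0, 1.0]
one_big = [4.0, 0.0, 0.0, 0.0]

print(mae(many_small), mae(one_big))  # MAE can't tell these apart
print(mse(many_small), mse(one_big))  # MSE punishes the single big miss more
```

Squaring makes the penalty grow with the square of the error, so one error of 4 costs as much as sixteen errors of 1.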

lusty lotus Aug 16, 2023, 11:27 AM

#

small wedge I'm sorry I'm not 100% sure what you're asking, the derivative should be the sam...

yeah, as in the derivatives of red dots on the screenshots. wouldn't their gradients be the same, as with the entire line?

lusty lotus Aug 16, 2023, 11:27 AM

#

small wedge the point is to scale it exponentially, we want the model to be punished exponen...

wtf then let's do a MCE, mean cube error?

#

or a M x^1000 E criteron lmao

small wedge Aug 16, 2023, 11:27 AM

#

lusty lotus wtf then let's do a MCE, mean cube error?

that gives you negative loss lol

lusty lotus Aug 16, 2023, 11:28 AM

#

small wedge that gives you negative loss lol

oh yeah stupid reward function

lusty lotus Aug 16, 2023, 11:28 AM

#

lusty lotus yeah, as in the derivatives of red dots on the screenshots. wouldn't their gradi...

then how about this

wooden sail Aug 16, 2023, 11:28 AM

#

lusty lotus also, the idea of "MSE" is very arbitrary to me. why not just like an "absolute ...

because under special conditions, the MSE is the proper target for maximum likelihood estimation, with many nice optimality properties

#

but the answer is "the cost function depends on what you know about the data model and its statistics"

#

if you don't know anything about that, you cannot choose a cost function and claim it is optimal

lusty lotus Aug 16, 2023, 11:29 AM

#

i see

wooden sail Aug 16, 2023, 11:29 AM

#

for IID additive gaussian distributed noise with equal variance per sample, minimizing the MSE is the maximum likelihood estimator, which yields asymptotic efficiency and unbiasedness

#

that's about as nice as an estimator can get

#

if your data doesn't follow those properties, there is no special reason to use the MSE
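
A short sketch of why that holds, assuming a model $f_\theta$ with parameters $\theta$, $n$ samples, and noise variance $\sigma^2$ (these symbols are introduced here only for illustration):

```latex
y_i = f_\theta(x_i) + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)

\log p(y \mid \theta) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f_\theta(x_i)\bigr)^2

\arg\max_\theta \log p(y \mid \theta) = \arg\min_\theta \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f_\theta(x_i)\bigr)^2
```

The first term of the log-likelihood doesn't depend on $\theta$, so maximizing the likelihood is the same as minimizing the mean squared error. With a different noise model the sum would look different, and so would the matching cost function.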

lusty lotus Aug 16, 2023, 11:31 AM

#

lusty lotus yeah, as in the derivatives of red dots on the screenshots. wouldn't their gradi...

im still very much puzzled by this

#

like for a line x^2, any points on the line will have dy/dx = 2x right?

wooden sail Aug 16, 2023, 11:32 AM

#

which depends on the value of x

lusty lotus Aug 16, 2023, 11:32 AM

#

OH WAIT do i have to sub in the x-values into 2x? then it makes the gradients different? shitshitshit

wooden sail Aug 16, 2023, 11:32 AM

#

there is a different tangent line at every point on the curve

lusty lotus Aug 16, 2023, 11:32 AM

#

lusty lotus OH WAIT do i have to sub in the x-values into `2x?` then it makes the gradients ...

so like if x=1, then grad = 2
and if x=2, then grad = 4?

wooden sail Aug 16, 2023, 11:33 AM

#

gradient descent works on the principle of making a tangent line at a point on the original function, following this linear approximation for a bit, and hoping it yields an improvement. if not, correct it

#

for nicely behaved functions, if the step size is small enough, this method is guaranteed to work (locally)

lusty lotus Aug 16, 2023, 11:33 AM

#

lusty lotus so like if x=1, then grad = 2 and if x=2, then grad = 4?

is my understanding correct for the gradients of x^2 where x = 1 and x = 2 respectively?

wooden sail Aug 16, 2023, 11:34 AM

#

that idk, i didn't check the numbers in your images

lusty lotus Aug 16, 2023, 11:37 AM

#

wooden sail that idk, i didn't check the numbers in your images

imho this still bothers me, like wtf does "in the direction of the negative of the gradient" mean. is it right to think that it's good when the gradient of the pt is 0?

wooden sail Aug 16, 2023, 11:38 AM

#

i think you need to take a step back, i see the problem now

#

the questions you're asking say you haven't studied linalg and optimization

#

without that, none of this will ever make sense

#

yes, for differentiable functions evaluated in an open interval, all candidate extrema have a gradient of 0

#

that's a necessary but not sufficient condition

#

(if the interval is closed, the boundaries can also be candidates regardless of the gradient, but this usually involves constrained optimization anyway)

#

i recommend boyd's convex optimization books

small wedge Aug 16, 2023, 11:42 AM

#

https://youtu.be/hfMk-kjRv4c?t=909 I agree that you need to study this stuff to understand the math, but I think you can intuit what is happening with the optimization here easily if you have just a bit of calculus knowledge

lusty lotus Aug 16, 2023, 11:42 AM

#

is all of that needed? :/

small wedge Aug 16, 2023, 11:42 AM

#

3b1b's 4 part series might also help you intuit it with what math you already have

wooden sail Aug 16, 2023, 11:43 AM

#

lusty lotus is all of that needed? :/

if you want to understand well, yes

lusty lotus Aug 16, 2023, 11:43 AM

#

small wedge 3b1b's 4 part series might also help you intuit it with what math you already ha...

3b1b didn't help, watched it so many times and usually got questions at the 2nd episode and pretty much gave up on the third

wooden sail Aug 16, 2023, 11:43 AM

#

if you only want to use it, you can just memorize the rules

#

you might remember them from your HS or early uni calculus courses

lusty lotus Aug 16, 2023, 11:44 AM

#

wooden sail if you only want to use it, you can just memorize the rules

tbh i can alr use it, using pytorch. i technically don't need to know since im building rl but im just curious

lusty lotus Aug 16, 2023, 11:44 AM

#

wooden sail you might remember them from your HS or early uni calculus courses

im year 10 lol

wooden sail Aug 16, 2023, 11:44 AM

#

extrema when the derivative is 0, and then you check whether it's a maximum or minimum by testing the 2nd derivative

lusty lotus Aug 16, 2023, 11:44 AM

#

i dont think the course has covered it yet though

wooden sail Aug 16, 2023, 11:44 AM

#

lusty lotus im year 10 lol

ah all right, pardon if i came across as too harsh lol

#

do you already have a feel for what a derivative means?

lusty lotus Aug 16, 2023, 11:45 AM

#

wooden sail do you already have a feel for what a derivative means?

like all i was told was that it was the gradient of a point but the intuition? no. doesn't make sense but the teacher was like "just imagine it"

#

so that's all i have in mind

#

i know the product rule (sounds important) but idk what that has to do with anything

wooden sail Aug 16, 2023, 11:46 AM

#

ok. imagine a line that touches the original curve only at 1 point

#

the slope of that line is the gradient

#

what it tells you is how quickly the function is increasing or decreasing at that point

#

a function that is increasing very steeply will have a large positive derivative (and a large negative one if it is decreasing steeply)

#

if the derivative is 0, there is no increase or decrease

lusty lotus Aug 16, 2023, 11:47 AM

#

wooden sail ok. imagine a line that touches the original curve only at 1 point

yeah i was told this and i remember it lol but the lingering thought in my mind is like there has to be a smol (2nd) pt touching it

wooden sail Aug 16, 2023, 11:47 AM

#

there isn't

lusty lotus Aug 16, 2023, 11:47 AM

#

wooden sail the slope of that line is the gradient

yeah i was told this

lusty lotus Aug 16, 2023, 11:47 AM

#

wooden sail a function that is very steep will have a large positive derivative

i understand

wooden sail Aug 16, 2023, 11:47 AM

#

lusty lotus yeah i was told this and i remember it lol but the lingering thought in my mind ...

if this happens then it's not a tangent line

lusty lotus Aug 16, 2023, 11:47 AM

#

wooden sail if the derivative is 0, there is no increase or decrease

and is this what we want?

wooden sail Aug 16, 2023, 11:47 AM

#

and hence not the derivative

#

yeah, one sec

#

now think to your daily life

#

if you throw something upwards, that thing will trace a parabola

#

it'll go up, then come back down

#

that's the position of the object tracing a parabola

lusty lotus Aug 16, 2023, 11:48 AM

#

sure

wooden sail Aug 16, 2023, 11:48 AM

#

the important thing is that, in the transition from the object going up, to when the object is coming down, there is a point where the object's speed is 0

#

it starts with a large positive speed

#

that speed slowly decreases

#

then it becomes 0

#

then the speed becomes increasingly negative, and the object comes back down

#

this is your intuition as to why the derivative being zero is important

#

if the speed is positive and then it's negative, and it is changing "smoothly", it has to pass through zero

#

and it does so at the point where the speed changes sign from negative to positive or the other way around

#

that point is a maximum or a minimum (in 1 dimension, at least. not generally true in more dimensions)
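
A standard example of how this breaks in two dimensions (the function is picked here purely for illustration):

```latex
f(x, y) = x^2 - y^2, \qquad \nabla f(x, y) = (2x,\; -2y), \qquad \nabla f(0, 0) = (0, 0)

f(x, 0) = x^2 \ \text{(minimum at } x = 0\text{)}, \qquad f(0, y) = -y^2 \ \text{(maximum at } y = 0\text{)}
```

The gradient is zero at the origin, but the point is a minimum along one axis and a maximum along the other, so it is neither. That kind of point is called a saddle point.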

lusty lotus Aug 16, 2023, 11:51 AM

#

wooden sail the important thing is that, in the transition from the object going up, to when...

right

wooden sail Aug 16, 2023, 11:51 AM

#

lusty lotus Aug 16, 2023, 11:51 AM

#

so like up -> gets slower -> 0 -> negative velocity

wooden sail Aug 16, 2023, 11:51 AM

#

yeah

#

here i drew it upside down

lusty lotus Aug 16, 2023, 11:51 AM

#

wooden sail if the speed is positive and then it's negative, and it is changing "smoothly", ...

sure

wooden sail Aug 16, 2023, 11:52 AM

#

in the previous example, the speed (the derivative of the position with respect to time) is 0 exactly at the apex, the highest point the object reaches before it comes back down
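
A small sketch of the thrown object, assuming it goes straight up with a made-up launch speed and no air resistance:

```py
G = 9.81   # gravitational acceleration in m/s^2
V0 = 20.0  # assumed launch speed in m/s, straight up

def height(t: float) -> float:
    return V0 * t - 0.5 * G * t * t

def velocity(t: float) -> float:
    # derivative of height with respect to time
    return V0 - G * t

t_apex = V0 / G  # the time where velocity(t) is 0

print(velocity(t_apex))             # zero at the apex (up to rounding)
print(velocity(t_apex - 0.5) > 0)   # still going up just before
print(velocity(t_apex + 0.5) < 0)   # coming down just after
print(height(t_apex) > height(t_apex - 0.5))
print(height(t_apex) > height(t_apex + 0.5))
```

The height is largest exactly where its derivative, the velocity, passes through zero.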

#

that's the general idea

lusty lotus Aug 16, 2023, 11:52 AM

#

wooden sail that point is a maximum or a minimum (in 1 dimension, at least. not generally tr...

how is it a max/min when the values can go +/- ve?

wooden sail Aug 16, 2023, 11:53 AM

#

hm?

lusty lotus Aug 16, 2023, 11:53 AM

#

like x^2, x can be any value of +/-ve number?

#

as long as it's real right

wooden sail Aug 16, 2023, 11:53 AM

#

mhm

lusty lotus Aug 16, 2023, 11:53 AM

#

then why did you say min when x =0

wooden sail Aug 16, 2023, 11:53 AM

#

that's why i was careful to tell you i drew it upside down

#

if you throw something upwards, it traces the curve -x^2

#

this is maximal at x = 0

#

in the drawing i made i drew x^2. this is minimal at x = 0

lusty lotus Aug 16, 2023, 11:54 AM

#

right, ive got the graph

Screenshot_2023-08-16_at_12.54.34_PM.png

wooden sail Aug 16, 2023, 11:55 AM

#

right. what's your question?

lusty lotus Aug 16, 2023, 11:55 AM

#

wooden sail right. what's your question?

how is x maximal when values of x can be greater than 0?

#

surely 1 > 0?

wooden sail Aug 16, 2023, 11:55 AM

#

x is not maximal

#

the function is

lusty lotus Aug 16, 2023, 11:56 AM

#

f(x)? y? i see

wooden sail Aug 16, 2023, 11:56 AM

#

what we found is the value of x, that when we put it into f(x), makes f(x) maximal

lusty lotus Aug 16, 2023, 11:56 AM

#

i get it now

#

so we're not referring to x when max or min just the func of it right?

wooden sail Aug 16, 2023, 11:56 AM

#

.latex that's why we usually write this as $\arg\min_x f(x)$

#

huh

wooden sail Aug 16, 2023, 11:57 AM

#

lusty lotus so we're not referring to x when maxor min just the func of it right?

we distinguish between max and argmax

#

we actually usually don't care about the max, just the argmax

#

(or argmin)

lusty lotus Aug 16, 2023, 11:58 AM

#

wooden sail we actually usually don't care about the max, just the argmax

"arg" being y or f(x)?

wooden sail Aug 16, 2023, 11:58 AM

#

arg meaning argument

#

like in code

#

x is the argument of f(x)

lusty lotus Aug 16, 2023, 11:59 AM

#

wooden sail x is the argument of f(x)

sure

#

as in argmax() refers to the argument (x) that leads to the maximisation of the output (y) or f(x)?

#

and returns said argument (x)?

wooden sail Aug 16, 2023, 12:00 PM

#

mhm
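
The max vs argmax distinction, sketched in plain Python on a handful of made-up x values:

```py
def f(x):
    return -(x ** 2)   # the "thrown upwards" curve, maximal at x = 0

xs = [-2, -1, 0, 1, 2]

best_value = max(f(x) for x in xs)   # max: the largest output f(x)
best_x = max(xs, key=f)              # argmax: the input x that produces it

print(best_value, best_x)
```

`max(xs, key=f)` hands back the argument, not the function value, which is usually the thing you actually want when optimizing.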

lusty lotus Aug 16, 2023, 12:00 PM

#

damn

#

now, onto the main problem

#

how tf does any of this have to do with correcting shit weight values

wooden sail Aug 16, 2023, 12:00 PM

#

you look for the weights that minimize your cost function

#

argmin (weights and biases) cost(weights and biases)

lusty lotus Aug 16, 2023, 12:01 PM

#

wooden sail you look for the weights that minimize your cost function

makes sense

#

so you repeatedly find x values that minimises f(x)

wooden sail Aug 16, 2023, 12:02 PM

#

not repeatedly

lusty lotus Aug 16, 2023, 12:02 PM

#

got it, why not just like use the analogy of 3b1b of rolling down a ball instead of doing maths? roll ball down fun

#

like check if the gradient of moved pt is less than previous grad

wooden sail Aug 16, 2023, 12:03 PM

#

that's not enough

lusty lotus Aug 16, 2023, 12:03 PM

#

why?

wooden sail Aug 16, 2023, 12:03 PM

#

you would need to know the math to understand why lol

#

especially in machine learning tasks that's not a very useful condition

#

they're non convex 😛

lusty lotus Aug 16, 2023, 12:04 PM

#

wooden sail you would need to know the math to understand why lol

but isn't that just

x = random.random()
grad = get_dy_dx(x)
while grad != 0:
    grad = get_dy_dx(x)

#

and update x
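
A runnable sketch of that loop for the toy curve y = x^2. The learning rate, the tolerance and the step cap are assumptions added here; checking `grad != 0` exactly would rarely terminate with floats, so the loop stops when the slope is close enough to 0:

```py
import random

def get_dy_dx(x: float) -> float:
    return 2 * x   # derivative of the example curve y = x^2

learning_rate = 0.1   # assumed step size
tolerance = 1e-8      # stop when the slope is "close enough" to 0

x = random.random()
grad = get_dy_dx(x)
steps = 0
while abs(grad) > tolerance and steps < 10_000:
    x -= learning_rate * grad   # move against the gradient
    grad = get_dy_dx(x)
    steps += 1

print(x, grad, steps)
```

This works because x^2 is about as nicely behaved as a function gets; nothing here says the same loop would behave on a neural network's cost.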

wooden sail Aug 16, 2023, 12:04 PM

#

you just made a huge assumption

#

that gradient descent will work in the first place

lusty lotus Aug 16, 2023, 12:05 PM

#

wtf it doesn't?

wooden sail Aug 16, 2023, 12:05 PM

#

it only works for very nicely behaved functions, and only for special choices of step sizes lol
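
A sketch of the step size issue on the friendliest possible function, f(x) = x^2. The three learning rates are picked by hand to show the three behaviours:

```py
def descend(x: float, learning_rate: float, steps: int = 20) -> float:
    """Plain gradient descent on f(x) = x^2, whose derivative is 2x."""
    for _ in range(steps):
        x -= learning_rate * 2 * x
    return x

start = 1.0
print(abs(descend(start, 0.1)))   # shrinks towards the minimum at 0
print(abs(descend(start, 1.0)))   # flips between +1 and -1, never improves
print(abs(descend(start, 1.1)))   # overshoots further on every step
```

Same function, same gradient, and the only thing that decides between converging and blowing up is the step size.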

lusty lotus Aug 16, 2023, 12:05 PM

#

:/

wooden sail Aug 16, 2023, 12:05 PM

#

almost no optimizer uses only gradients

lusty lotus Aug 16, 2023, 12:05 PM

#

wooden sail you look for the weights that minimize your cost function

then what happens?

wooden sail Aug 16, 2023, 12:06 PM

#

then you're done, that's the whole thing you're looking for

#

or wdym?

#

there's never any guarantee that the solution you found with a neural network is the best or even generally valid, if that's what you meant
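
A toy illustration of that, using a made-up non-convex curve with two valleys (the polynomial, learning rate and step count are all assumptions for the sake of the example):

```py
def cost(x: float) -> float:
    return x**4 - 3 * x**2 + x       # made-up curve with two valleys

def cost_grad(x: float) -> float:
    return 4 * x**3 - 6 * x + 1      # its derivative

def descend(x: float, learning_rate: float = 0.01, steps: int = 2000) -> float:
    for _ in range(steps):
        x -= learning_rate * cost_grad(x)
    return x

left = descend(-2.0)    # start on the left side
right = descend(2.0)    # start on the right side
print(left, cost(left))
print(right, cost(right))
```

Both runs end at a point where the gradient is essentially 0, but they land in different valleys and only one of them is the lower one. Where you start decides what you find.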

#

reproducibility, verification and related things are entire fields of study

#

sorry for crushing your dreams 😛

lusty lotus Aug 16, 2023, 12:09 PM

#

wooden sail then you're done, that's the whole thing you're looking for

then i have a few questions:
how does backprop know which weights are bad? how does it associate the corrections with the weights?
then wtf is the optimiser for?

lusty lotus Aug 16, 2023, 12:10 PM

#

wooden sail sorry for crushing your dreams 😛

like the thing is ive been using NNs a lot recently and im okish familiar with pytorch so i could use it comfortably without knowing the bts stuff but im curious and i DO want to know

wooden sail Aug 16, 2023, 12:10 PM

#

lusty lotus then i have a few questions: how does backprop know which weights is bad? how do...

if you have extra information about curvature, you can make better decisions. it's not enough that the gradient decreases at every iteration. you need to check how much the gradient changed, how much the curvature changed, and use these to compute a direction in which to move the parameters

#

optimizers compute update vectors BASED on the gradient, but they are not just the raw gradient

#

there's rescaling and redirecting to be considered

#

also the statistics of the problem, too

#

also the gradient contains the derivatives w.r.t. all of the parameters. each one gives you some info on how to update each parameter. how MUCH info is a separate question
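
A hand-written sketch of how each weight gets its own derivative, on a made-up two-weight "network" (pred = w2 * (w1 * x)) with a squared error cost. This is just the chain rule spelled out, not how any particular library implements backprop:

```py
def forward(w1, w2, x):
    hidden = w1 * x
    pred = w2 * hidden
    return hidden, pred

def cost(w1, w2, x, target):
    _, pred = forward(w1, w2, x)
    return (pred - target) ** 2

def gradients(w1, w2, x, target):
    hidden, pred = forward(w1, w2, x)
    d_pred = 2 * (pred - target)   # how the cost reacts to the prediction
    d_w2 = d_pred * hidden         # chain rule through pred = w2 * hidden
    d_w1 = d_pred * w2 * x         # chain rule through hidden = w1 * x
    return d_w1, d_w2

w1, w2, x, target = 0.5, -1.5, 2.0, 1.0   # made-up numbers
d_w1, d_w2 = gradients(w1, w2, x, target)
print(d_w1, d_w2)
```

Each entry says how the cost would change if you nudged that one weight, which is how a correction gets tied to a specific weight.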

#

very naively, large gradients mean you're far from the solution... but not really 😛 not always

#

ah, there's also the trust region to consider, since you're linearizing (or otherwise approximating) the original problem at every iteration
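
To make "based on the gradient, but not just the raw gradient" concrete, here is a sketch that puts a plain gradient step next to a momentum step. Momentum is only one assumed example of such a rule (no specific optimizer is named above), and the values 0.1 and 0.9 are illustrative:

```py
def grad(x: float) -> float:
    return 2 * x   # derivative of the toy cost x^2

def sgd_step(x, lr=0.1):
    # update is the raw gradient, just scaled
    return x - lr * grad(x)

def momentum_step(x, velocity, lr=0.1, beta=0.9):
    # update blends the current gradient with the previous updates
    velocity = beta * velocity - lr * grad(x)
    return x + velocity, velocity

x_sgd = x_mom = 5.0
velocity = 0.0
for step in range(3):
    x_sgd = sgd_step(x_sgd)
    x_mom, velocity = momentum_step(x_mom, velocity)
    print(step, x_sgd, x_mom)
```

The first step is identical, and after that the two paths separate even though both only ever look at the same gradient function.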

lusty lotus Aug 16, 2023, 12:14 PM

#

right im still slightly confused

wooden sail Aug 16, 2023, 12:14 PM

#

what about?

lapis sequoia Aug 16, 2023, 12:14 PM

#

Does anyone know why in the official yolo repository, they multiply the loss by the batch size? they say it's to make it batch size agnostic but I don't really see why. I'm not finding any division after that but maybe I didn't see it.

lusty lotus Aug 16, 2023, 12:15 PM

#

like i mean let's assume like all the other stuff like "not always" or like "uncommon/edge/complex cases" do not exist and focus on helping me understand the context of the video https://www.youtube.com/watch?v=8d6jf7s6_Qs&t=0s

#

i think that would be very helpful

wooden sail Aug 16, 2023, 12:16 PM

#

the gradient points in the direction that the function f(x) increases the most quickly

#

the negative gradient points in the direction f(x) decreases the most quickly

#

the gradient is a vector made of the derivatives of f w.r.t. its parameters (here, x)

lusty lotus Aug 16, 2023, 12:17 PM

#

sure

wooden sail Aug 16, 2023, 12:17 PM

#

so we adjust x by moving it in the direction that f(x) decreases the most

lusty lotus Aug 16, 2023, 12:19 PM

#

like here https://youtu.be/8d6jf7s6_Qs?t=169 why does the guy say like "in the direction of the negative of the gradient" like my first question was like I thought positive gradients = / and negative gradients = \, surely if he were to retrace / downwards isn't that still positive gradient but less?

#

then it isn't the "negative of the gradient" then, it's merely saying like "to the direction closest to 0"

wooden sail Aug 16, 2023, 12:20 PM

#

no, it IS the negative of the gradient

lusty lotus Aug 16, 2023, 12:20 PM

#

wtf?!

wooden sail Aug 16, 2023, 12:20 PM

#

lusty lotus like here <https://youtu.be/8d6jf7s6_Qs?t=169> why does the guy say like "in the...

i don't understand what you mean here

#

remember each point on the curve has a different gradient

#

the gradient ALWAYS tells you the direction in which the function INCREASES the most

#

regardless of whether the gradient is negative or positive

#

if the gradient is positive, it means "if we move x to the right, the function increases"

#

if it's negative it means "if we move x to the left, the function increases"
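
A sketch of those two cases on f(x) = x^2, with one point on each side of the minimum (the points and step size are made up):

```py
def f(x):
    return x ** 2

def df(x):
    return 2 * x

def compare(x, step=0.1):
    g = df(x)
    went_up = f(x + step * g) > f(x)     # follow the gradient
    went_down = f(x - step * g) < f(x)   # follow the negative of the gradient
    return went_up, went_down

for x in (3.0, -3.0):
    print(x, df(x), compare(x))
```

At x = 3 the gradient is +6 and at x = -3 it is -6, yet in both cases adding the gradient raises f and subtracting it lowers f. The "negative of the gradient" is about which way to move x, not about the slope being a \ or a /.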

lusty lotus Aug 16, 2023, 12:22 PM

#

then it has to be this?

#

like isn't where the red dot is positive grad? then the negative of that must be \

wooden sail Aug 16, 2023, 12:23 PM

#

i think that line tells you nothing, that's a really bad visualization

lusty lotus Aug 16, 2023, 12:24 PM

#

wooden sail i think that line tells you nothing, that's a really bad visualization

why? i thought the gradient was a bit like this

#

then surely the negative must be something above

#

but less steep

wooden sail Aug 16, 2023, 12:24 PM

#

the gradient is only the steepness of the line

#

not the line itself

lusty lotus Aug 16, 2023, 12:25 PM

#

right so

#

uhh

lusty lotus Aug 16, 2023, 12:26 PM

#

lusty lotus why? i thought the gradient was a bit like this

does that not tell where it is facing? like this one here slants upwards but the negative \ doesn't?

wooden sail Aug 16, 2023, 12:26 PM

#

i'd draw it like this

#

#

the gradient is a vector where each entry corresponds to one of the variables. here we only have x, so the gradient is a vector that points only along the x axis. its direction tells us in which direction f(x) increases, and the negative tells us in which direction f(x) decreases

#

why is it pointing to the right? the right on the x axis is the positive direction

#

the length of the vector g is how big the gradient is

lusty lotus Aug 16, 2023, 12:29 PM

#

wooden sail

wtf, then why is the grad flat? i thought grad = change in y/change in x? then if it's flat then you mean the grad is 0?

#

gosh this sucks a bit man

wooden sail Aug 16, 2023, 12:29 PM

#

i'm talking only about x here

#

about the value of the derivative at that point (the slope of the tangent), not the derivative as a function

#

as i told you, we don't actually care about the tangent line

#

only its slope

lusty lotus Aug 16, 2023, 12:30 PM

#

wooden sail the gradient is a vector where each entry corresponds to one of the variables. h...

then wouldn't the slope of the derivative x^2 be 2x, where it's really steep?

wooden sail Aug 16, 2023, 12:30 PM

#

yes

lusty lotus Aug 16, 2023, 12:30 PM

#

then why's it flat

#

:(

wooden sail Aug 16, 2023, 12:31 PM

#

dude 2x is a line

#

we don't care about the line

#

remember again, you have to evaluate the derivative to get the slope

#

it's not 2x

#

substitute x with a specific value of x

lusty lotus Aug 16, 2023, 12:31 PM

#

wooden sail it's not 2x

but 2x IS the derivative

wooden sail Aug 16, 2023, 12:31 PM

#

the value of x of the red point

lusty lotus Aug 16, 2023, 12:32 PM

#

we care about 2x surely

wooden sail Aug 16, 2023, 12:32 PM

#

no

#

we care about 2x evaluated at x

#

2x evaluated at x is the slope of the tangent touching f(x) at x

lusty lotus Aug 16, 2023, 12:32 PM

#

wdym "eval at x"

wooden sail Aug 16, 2023, 12:32 PM

#

literally that

#

the red point on the graph has coordinates (x,y)

#

for example (3,9)

lusty lotus Aug 16, 2023, 12:33 PM

#

i thought the forward pass was eval()

#

like model.eval()

wooden sail Aug 16, 2023, 12:33 PM

#

we'd want 2*3 = 6

#

you're mixing up everything

lusty lotus Aug 16, 2023, 12:33 PM

#

wooden sail we'd want 2*3 = 6

why? this action sounds arbitrary

wooden sail Aug 16, 2023, 12:34 PM

#

because i arbitrarily chose the point (3,9), yes

lusty lotus Aug 16, 2023, 12:34 PM

#

like just multiply stuff and you get the correct ans? not trying to be aggressive here but im just really confused

wooden sail Aug 16, 2023, 12:34 PM

#

i was giving you an example of what i meant by evaluate

lusty lotus Aug 16, 2023, 12:34 PM

#

wooden sail because i arbitrarily chose the point (3,9), yes

no, the action of multiplying

wooden sail Aug 16, 2023, 12:34 PM

#

multiplying what?

lusty lotus Aug 16, 2023, 12:34 PM

#

like why does multiplying help

#

2*3

wooden sail Aug 16, 2023, 12:34 PM

#

you said the derivative is 2x

lusty lotus Aug 16, 2023, 12:35 PM

#

and what does subbing 3 do?

#

finding the steepness?

wooden sail Aug 16, 2023, 12:35 PM

#

if x is 3, then the slope is 6 at the point (3,9)
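To make "evaluate the derivative at x" concrete, here is a minimal sketch in Python. It assumes the function under discussion is f(x) = x^2; the helper name `numerical_slope` is made up for illustration. The derivative 2x is a rule, and plugging in one specific x gives one number, which is the slope of the tangent at that point.

```python
def f(x):
    """The example function from the discussion: f(x) = x^2."""
    return x ** 2


def f_prime(x):
    """Its derivative as a function: f'(x) = 2x."""
    return 2 * x


def numerical_slope(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 3                                   # the chosen point is (3, 9)
slope_exact = f_prime(x0)                # 2 * 3 = 6
slope_estimate = numerical_slope(f, x0)  # should land very close to 6

print(f(x0), slope_exact, round(slope_estimate, 4))
```

Choosing a different point, say x0 = -1, gives a different slope (-2): the derivative is one function, but each point on the curve has its own slope.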

lusty lotus Aug 16, 2023, 12:35 PM

#

wait one more thing

wooden sail Aug 16, 2023, 12:35 PM

#

no, i have to go

lusty lotus Aug 16, 2023, 12:35 PM

#

then what's the diff between grad and slope then

lusty lotus Aug 16, 2023, 12:35 PM

#

wooden sail the gradient is a vector where each entry corresponds to one of the variables. h...

it sounds important here

wooden sail Aug 16, 2023, 12:36 PM

#

the gradient is an extension of the idea of "slope" to arbitrarily many dimensions

#

you will never work with just 1 variable in machine learning. usually a couple tens of thousands or more
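A small sketch of the "one slope per variable" idea, using a made-up two-variable function f(x, y) = x^2 + 3y^2 (not something from the chat). Each entry of the gradient is the slope you get by nudging one variable while holding the others fixed.

```python
def loss(params):
    """Hypothetical two-variable function: f(x, y) = x^2 + 3*y^2."""
    x, y = params
    return x ** 2 + 3 * y ** 2


def numerical_gradient(func, params, h=1e-6):
    """Estimate one slope per variable with central differences."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += h      # nudge only variable i upward
        down[i] -= h    # and downward
        grad.append((func(up) - func(down)) / (2 * h))
    return grad


point = [3.0, 1.0]
grad = numerical_gradient(loss, point)

# The analytic gradient is (2x, 6y), which is (6, 6) at this point.
print([round(g, 4) for g in grad])
```

With one variable the list has a single entry, which is just the ordinary slope.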

lusty lotus Aug 16, 2023, 12:36 PM

#

wooden sail

well i think i sorta get it, it's just the pic that's confusing me a bit

lusty lotus Aug 16, 2023, 12:37 PM

#

wooden sail you will never work with just 1 variable in machine learning. usually a couple t...

but it means im running before i can walk, if i can't handle 1, much less anything else

wooden sail Aug 16, 2023, 12:37 PM

#

that's why i said it was a good idea to learn all that other stuff

lusty lotus Aug 16, 2023, 12:38 PM

#

wooden sail that's why i said it was a good idea to learn all that other stuff

but so far im mostly getting it, albeit taking longer

wooden sail Aug 16, 2023, 12:38 PM

#

hmmmmmmm

lusty lotus Aug 16, 2023, 12:38 PM

#

🤓

wooden sail Aug 16, 2023, 12:38 PM

#

not really but ok

lusty lotus Aug 16, 2023, 12:38 PM

#

then we make 2x where x should be ideally 0?

#

like we shift x to the dir where it gets to 0?

lusty lotus Aug 16, 2023, 12:39 PM

#

wooden sail not really but ok

i feel like im on mount stupid in imposter syndrome lol

wooden sail Aug 16, 2023, 12:40 PM

#

no, you haven't even started climbing

lusty lotus Aug 16, 2023, 12:40 PM

#

damn

wooden sail Aug 16, 2023, 12:40 PM

#

these are the prerequisites

lusty lotus Aug 16, 2023, 12:40 PM

#

im fucking screwed then

lusty lotus Aug 16, 2023, 12:41 PM

#

wooden sail these are the prerequisites

bruh i was told that like "basic geometry would do the trick" smfh no lol

wooden sail Aug 16, 2023, 12:42 PM

#

it could, but that already confused you regarding the positive and negative slopes

#

maybe someone can help you out in vc

lusty lotus Aug 16, 2023, 12:42 PM

#

wooden sail maybe someone can help you out in vc

well opal doesn't really do ml

#

but yes, in vc that's where i sort my problems out at lol, i realise i get nothing done in text channels

#

i find it difficult to read messages lol

wooden sail Aug 16, 2023, 12:43 PM

#

that's a problem cuz all of this stuff is in books

lusty lotus Aug 16, 2023, 12:44 PM

#

wooden sail that's a problem cuz all of this stuff is in books

i KNOW :( i had the RL stuff explained by my friend on direct vc lol

#

cant fucking cope with text

wooden sail Aug 16, 2023, 12:45 PM

#

just for reference, every single advancement in AI has been made by mathematicians and is published in papers and in books. the rest of the stuff is mostly people using it

lusty lotus Aug 16, 2023, 12:45 PM

#

:(

#

im aware of that

wooden sail Aug 16, 2023, 12:45 PM

#

so if you wanna learn it right, the proper paths are reading, and uni if you can't read by yourself

lusty lotus Aug 16, 2023, 12:47 PM

#

wooden sail so if you wanna learn it right, the proper paths are reading, and uni if you can...

tbh i find it strange, i do well in scholarly tests though (again, im year 10) and the thing is, even though english isn't my first language, in theory i should be very competent in it since ive pretty much maxed out the cefr "languages standard"

#

i can read textbooks reasonably well, except for math textbooks, perhaps that's where i fuck up

wooden sail Aug 16, 2023, 12:48 PM

#

that's a different skill altogether

lusty lotus Aug 16, 2023, 12:48 PM

#

wooden sail that's a different skill altogether

like linguistically im alr C2

wooden sail Aug 16, 2023, 12:48 PM

#

the term they like using is "mathematical maturity" which is separate to other forms of development

lusty lotus Aug 16, 2023, 12:48 PM

#

technically im "proficient" in english alr but still

lusty lotus Aug 16, 2023, 12:48 PM

#

wooden sail the term they like using is "mathematical maturity" which is separate to other f...

but the fucking thing is im doing well in school math papers

wooden sail Aug 16, 2023, 12:49 PM

#

my 2 cents is that those don't count

lusty lotus Aug 16, 2023, 12:49 PM

#

i need things to be explained in front of me lol

lusty lotus Aug 16, 2023, 12:49 PM

#

wooden sail my 2 cents is that those don't count

fair enough

wooden sail Aug 16, 2023, 12:49 PM

#

if you pass without a sweat and don't need to study, you haven't been challenged and have never needed to develop this skill

#

as evidenced by learning by watching and never picking a book up

#

you now have bad habits

lusty lotus Aug 16, 2023, 12:49 PM

#

wooden sail if you pass without a sweat and don't need to study, you haven't been challenged...

:( i never really studied for math but i needed the teachers to explain

lusty lotus Aug 16, 2023, 12:50 PM

#

wooden sail you now have bad habits

shitters

wooden sail Aug 16, 2023, 12:50 PM

#

common experience for most people, but they usually only find out after eating dirt a couple of semesters in uni

#

anyway, g2g

lusty lotus Aug 16, 2023, 12:50 PM

#

wooden sail anyway, g2g

seeya, nice talking to you

steady nacelle Aug 16, 2023, 3:32 PM

#

After finishing Andrew NG's specialization course on deep learning . What do you guys recommend to do to land an entry ML job next?

lapis sequoia Aug 16, 2023, 3:42 PM

#

hey friends! does anyone have some good resources for learning data visualization?

twilit tundra Aug 16, 2023, 3:56 PM

#

steady nacelle After finishing Andrew NG's specialization course on deep learning . What do you...

Code your own project

steady nacelle Aug 16, 2023, 4:11 PM

#

twilit tundra Code your own project

Hm how many projects do you recommend Rose? My best project was analysing fish behavior with RNN family and spatio temporal architectures and will publish it on IEEE

#

Waiting for acceptance on IEEE4*

#

IEEE*

twilit tundra Aug 16, 2023, 4:24 PM

#

steady nacelle Hm how many projects do you recommend Rose? My best project was analysing fish b...

There really isn't a set number of projects to hit. What matters is that you can show your understanding of the concepts and your ability to put it into practice. It's even better if you can document issues you had during the project and how you solved them.

oblique quarry Aug 16, 2023, 7:52 PM

#

When you have a decision Tree that is tasked to differentiate between 2 classes you'd obviously hope for more than 50%. My decision Tree performs with 60% accuracy. Is there a common reason as to why that happens? Couldn't figure it out for hours. Here's the code for those who'd like to help https://paste.pythondiscord.com/SEIA

#

The only explanation i have for that is that the tree must be highly sensitive to large values or values whose distance to the mean is big. I found I get orders of magnitude better results when using a standardized dataset such as a normal distribution or whatever, could be wrong tho

terse coral Aug 16, 2023, 8:26 PM

#

Is there any way to initialize a dataframe in pandas by passing in a list of variables and automatically use the names of those variables as the column labels?

serene scaffold Aug 16, 2023, 8:27 PM

#

terse coral Is there any way to initialize a dataframe in pandas by passing in a list of var...

nope, you'd have to do that as a dict.

twilit tundra Aug 16, 2023, 8:27 PM

#

Technically, you can using globals() but that sounds rough

terse coral Aug 16, 2023, 8:28 PM

#

Gotcha. That's what I thought but figured I'd check and see

serene scaffold Aug 16, 2023, 8:28 PM

#

twilit tundra Technically, you can using globals() but that sounds rough

you'd get a bunch of stuff that you don't want and it would almost always error

twilit tundra Aug 16, 2023, 8:28 PM

#

You can filter the keys but yeah, that's not pretty
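A short sketch of the dict route mentioned above. The variable names `height` and `weight` and the helper `frame_from_vars` are hypothetical. Each name is spelled once as a key (or keyword), and the keys become the column labels.

```python
import pandas as pd

# Hypothetical variables standing in for the user's lists.
height = [1.70, 1.82, 1.65]
weight = [65, 80, 58]

# The dict keys become the column labels.
df = pd.DataFrame({"height": height, "weight": weight})


def frame_from_vars(**columns):
    """Build a DataFrame whose column labels are the keyword names."""
    return pd.DataFrame(columns)


df2 = frame_from_vars(height=height, weight=weight)
print(df2.columns.tolist())
```

This still repeats each name once, but it avoids reaching into `globals()`.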

terse coral Aug 16, 2023, 8:29 PM

#

Thoughts on pandas vs. polars?

serene scaffold Aug 16, 2023, 8:30 PM

#

my reason for using pandas is that I already know how to use it, and my issues with pandas aren't significant enough for me to want to switch away from it.

twilit tundra Aug 16, 2023, 8:30 PM

#

polars is better but no one uses it

twilit tundra Aug 16, 2023, 8:55 PM

#

oblique quarry When you have a decision Tree that is tasked to differentiate between 2 classes ...

In your searchBestSplit function, you're not updating currentMaxInformationGain

low relic Aug 16, 2023, 9:07 PM

#

terse coral Thoughts on pandas vs. polars?

Polars is designed for better performance on larger datasets. Pandas has a larger user base and is more widely used => has more resources 🙂

cobalt pecan Aug 16, 2023, 9:30 PM

#

hi can i share my regex issue with someone? i have the two patterns and they work, but when i apply the custom function to the dataframe, one regex substitution seems to replace the other

#

i'll post the code link

#

https://paste.pythondiscord.com/V4GA

#

most of it is already fleshed out i just need help figuring out a better way to apply the function, tysm

serene scaffold Aug 16, 2023, 9:35 PM

#

@cobalt pecan please delete that paste as soon as possible, as it leaks your AWS keys

cobalt pecan Aug 16, 2023, 9:36 PM

#

ok how should i share the code then

serene scaffold Aug 16, 2023, 9:36 PM

#

without your AWS keys

#

you need to go change those keys as soon as possible, as someone can now use your AWS account, and you will have to pay for whatever they do.

cobalt pecan Aug 16, 2023, 9:37 PM

#

i took them out of the code

serene scaffold Aug 16, 2023, 9:37 PM

#

you need to go to AWS and make sure that those keys can no longer be used.

cobalt pecan Aug 16, 2023, 9:38 PM

#

i've changed the keys

serene scaffold Aug 16, 2023, 9:38 PM

#

you went to AWS and did it?

cobalt pecan Aug 16, 2023, 9:38 PM

#

i'm doing it rn

#

but also i was given the keys by someone else to do this code, and they said it was safe to use it

serene scaffold Aug 16, 2023, 9:39 PM

#

it might be safe for you to use the code, but it's not safe for you to reveal the AWS keys. that's the same as posting your Discord password in this chat.

so once you're done with all that, you need to make a separate example that has every variable defined in the code (not as a result of API calls)

#

for example, if you have a df variable in the actual code, you would do print(df.head().to_dict('list')), and then put that in pd.DataFrame( ) in the code example.

cobalt pecan Aug 16, 2023, 9:41 PM

#

ahh ok i have the log file that can be used to make the df i'm manipulating

serene scaffold Aug 16, 2023, 9:41 PM

#

code examples need to be fully self-contained, so they can't involve reading files.

cobalt pecan Aug 16, 2023, 9:41 PM

#

ahhh ok

serene scaffold Aug 16, 2023, 9:43 PM

#

@cobalt pecan this is what a self-contained pandas example looks like

import pandas as pd

data = {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [3, 4, 5]}
df = pd.DataFrame(data)

#

and if you have a dataframe in your actual code, you can make {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [3, 4, 5]} out of it by doing print(df.head().to_dict('list')) and then copying the result into your example code.

charred light Aug 16, 2023, 9:46 PM

#

QQ: If you benchmark a company against its industry, do you include that company in the industry aggregations?

cobalt pecan Aug 16, 2023, 9:51 PM

#

serene scaffold @cobalt pecan this is what a self-contained pandas example looks like ...

ok i'll do a mini example of the two types of lines giving me trouble

#

https://paste.pythondiscord.com/TSMA

#

i have an example of a date with numerical format and a written out date

#

i have two patterns to change them to the 'X/X/Year' format, but if i apply the custom function to the df, one regex sub overrides the other

#

@serene scaffold with respect to the custom function, how would you code it so both regex subs are applied to the df

#

because it seems like the second one to convert a textual date to the desired format overrides the first one
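The paste is not visible here, so this is only a guess at the cause: a common way for one substitution to "override" the other is to run each `re.sub` on the original string instead of on the result of the previous one. The sketch below chains them; the two patterns and the sample lines are hypothetical stand-ins, not the ones from the paste.

```python
import re

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Hypothetical patterns: a numeric date like 08-16-2023 and a written one
# like "August 17, 2023".
NUMERIC = re.compile(r"\b(\d{1,2})-(\d{1,2})-(\d{4})\b")
WRITTEN = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})\b")


def written_to_slashes(match):
    """Turn a written-date match into month/day/year."""
    month = MONTHS[match.group(1)]
    return f"{month}/{int(match.group(2))}/{match.group(3)}"


def normalize_dates(text):
    # First substitution: keep its result in `text`.
    text = NUMERIC.sub(r"\1/\2/\3", text)
    # Second substitution runs on the output of the first, not the original.
    text = WRITTEN.sub(written_to_slashes, text)
    return text


lines = ["login at 08-16-2023", "report due August 17, 2023"]
cleaned = [normalize_dates(line) for line in lines]
print(cleaned)  # ['login at 08/16/2023', 'report due 8/17/2023']
```

With a dataframe, the same function can be passed to `Series.apply` on the text column, so both substitutions happen in one pass per row.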

#

wait i think i got it

#

give me a second to double check

cobalt pecan Aug 16, 2023, 10:48 PM

#

is there a helper online i could message?

verbal venture Aug 16, 2023, 11:01 PM

#

are the number of neurons in a keras dense layer the depth of the output feature maps?

serene scaffold Aug 16, 2023, 11:40 PM

#

verbal venture are the number of neurons in a keras dense layer the depth of the output feature...

I'm not a keras user. What do the output feature maps represent?

verbal venture Aug 16, 2023, 11:41 PM

#

I don't understand how I would be able to find that out

#

it's a CNN task. A layer is (Dense(16, padding=1, kernel = 3, stride=1)) etc. So 16 (according to chatgpt) represents the number of input neurons for the layer. Also wondering if that is the Z dimension of the feature map
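A note on the layer quoted above: in Keras, `Dense` takes a number of units and has no padding, kernel, or stride arguments, so the arguments look like those of a convolutional layer. Assuming the layer meant is a 2D convolution with 16 filters, the 16 is the number of filters, which is also the depth (channel count) of the output feature map. The sketch below uses the generic output-size formula with integer padding; the function name and the 28x28 input are made up for illustration.

```python
def conv2d_output_shape(height, width, filters, kernel=3, stride=1, padding=1):
    """Output (height, width, channels) of a 2D convolution.

    Spatial size follows floor((n + 2*padding - kernel) / stride) + 1.
    The channel depth equals the number of filters.
    """
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    return out_h, out_w, filters


shape = conv2d_output_shape(28, 28, filters=16)
print(shape)  # (28, 28, 16)
```

Changing `filters` changes only the last entry; kernel, stride, and padding change only the first two.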

cobalt pecan Aug 17, 2023, 12:56 AM

#

is there a way to do unit testing for regex string matching files?

serene scaffold Aug 17, 2023, 1:21 AM

#

cobalt pecan is there a way to do unit testing for regex string matching files?

You want a regular expression that tells you if something is a valid name for a file path on a certain operating system?

#

And you want to write tests for that expression?

cobalt pecan Aug 17, 2023, 2:41 AM

#

so i've written the tests for the expression, i just need to figure out how to do self.assertEqual for a function multiple times
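One common way to run `self.assertEqual` over many inputs is a loop with `subTest`, so each case is reported separately if it fails. The function under test and the cases below are hypothetical.

```python
import re
import unittest


def is_log_file(name):
    """Hypothetical function under test: does the name look like 'xxx.log'?"""
    return re.fullmatch(r"[\w-]+\.log", name) is not None


class TestIsLogFile(unittest.TestCase):
    def test_many_names(self):
        cases = [
            ("app.log", True),
            ("app-2023.log", True),
            ("app.txt", False),
            (".log", False),
        ]
        for name, expected in cases:
            # subTest labels each case so a failure names the input.
            with self.subTest(name=name):
                self.assertEqual(is_log_file(name), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsLogFile)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

In a normal test file the last three lines would be replaced by `unittest.main()` or by running `python -m unittest`.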

cobalt pecan Aug 17, 2023, 2:57 AM

#

i got it

storm canyon Aug 17, 2023, 4:52 AM

#

What sort of libraries/tools do people use to work with large datasets?

#

I'm trying to work with a large set of parquet files and am not sure how to work with the data without running out of ram

fluid spindle Aug 17, 2023, 7:01 AM

#

Hello, does anyone have a linalg server where I can ask questions?

wooden sail Aug 17, 2023, 7:10 AM

#

you can ask in off topic, or try the mathematics server https://discord.gg/math

#

you can tag me in either, but i may or may not be available

velvet rampart Aug 17, 2023, 8:40 AM

#

Please what does count_vectoriser.fit_transform do

frail stream Aug 17, 2023, 8:55 AM

#

hello,
so i've been working in jupyter notebook for a month and noticed this inconvenience: when i'm trying to import something, some libraries have duplicates. previously i was a backend developer so i have python and pycharm already installed.
I would be really grateful for your help.

void sail Aug 17, 2023, 9:45 AM

#

been a while but depending on the yolovx you might already use skip connections. Skip connections are (usually) added to the first couple of initial layers to preserve some of the features at the start with deep models

lapis sequoia Aug 17, 2023, 9:46 AM

#

void sail been a while but depending on the yolovx you might already use skip connections....

I'm using the backbone of yolov8 so I removed the neck and the head and so there isn't the normal skip connections of yolo.

void sail Aug 17, 2023, 9:46 AM

#

in that case youll have to add them yourself, note that this will require retraining the entire network

lapis sequoia Aug 17, 2023, 9:47 AM

#

yes i know, i'm asking about how to choose at which level to add a skip connection and how to know what type to add (concat, multiplication etc)

void sail Aug 17, 2023, 9:47 AM

#

there is no golden rule for this or some deterministic process

#

you could try to take a look at the loss process and weights difference at each backwards step and layer

#

if you want to be precise about it and run experiments on a metric

lapis sequoia Aug 17, 2023, 9:48 AM

#

void sail there is no golden rule for this or some deterministic process

yes, this is why i gave an image of the current architecture i thought maybe someone would have an intuition i don't have

lapis sequoia Aug 17, 2023, 9:48 AM

#

void sail you could try to take a look at the loss process and weights difference at each ...

how would you do that?

void sail Aug 17, 2023, 9:49 AM

#

nope does not exist im afraid, next best thing is logging the delta in weights at each layer at each training step. if you see the gradients becoming very small at the first couple of layers you can decide to add skip connections there

#

please note that skip connections mainly exist for very deep networks and the vanishing gradient problem, which will give you context for my suggestion

placid cedar Aug 17, 2023, 10:07 AM

#

hey guys, wld anyone mind helping me with some issues?

#

i have a main fact table, called Results. this table contains a foreign key, statusID.

i also have a status table, with the StatusID being the primary key, and each statusId has a status. there's around 130 statuses

is it necessary to merge these 2 tables together?

i have to create a regression/classification model based on f1 data and trying to do a regression model, predicting the fastest lap speed. now i am thinking about whether it is necessary to merge both the status and results table. because the statusid is the status itself

past meteor Aug 17, 2023, 10:51 AM

#

How do you guys feel about blogging? I feel like there's already so much material out there so I'm not even sure it's worth it

mystic berry Aug 17, 2023, 10:52 AM

#

Hello

past meteor Aug 17, 2023, 10:52 AM

#

OTOH, it's free personal branding and I think I would enjoy it

sleek harbor Aug 17, 2023, 11:23 AM

#

past meteor OTOH, it's free personal branding and I think I would enjoy it

Why does it matter what anyone else thinks? If u enjoy it, then it's worth it - if u don't enjoy it, then obviously not. There is indeed a lot of material out there.. most (and I mean most, as in >50%) is either outdated, just some people documenting their journey (which often ends up in a lot of noobie articles with mistakes in them), or just plain bad. If you actually know what you're writing about, and you know how to write - blogging would be beneficial to both u and ur readers. If u don't know what ur writing about and/or u don't know how to write - blogging would likely still be beneficial for u, tho likely not for ur readers. That's a win-win situation, for u at least 😛

Tl;dr: do what u wanna do. I'd read it, if it was of interest to me (on a topic I understand, which it probably wouldn't be :3)

past meteor Aug 17, 2023, 11:25 AM

#

sleek harbor Why does it matter what anyone else thinks? If u enjoy it, then it's worth it - ...

That's a very reasonable opinion and great motivation. Thanks!

#

Personally I wouldn't write stuff like "my data science journey" because it's boring. I just went to uni and did internships.

I'm thinking of stuff like making a tiny ML framework step-by-step (maybe in another lang than Python), how to structure projects, what data viz tool to use, when ML is appropriate, an actor model approach to genetic algorithms, ...

Pretty much a mix between code and organisational stuff. Are both relevant or should I consider dropping one of the two

serene scaffold Aug 17, 2023, 11:51 AM

#

past meteor How do you guys feel about blogging? I feel like there's already so much materia...

what would you blog about?

past meteor Aug 17, 2023, 11:53 AM

#

serene scaffold what would you blog about?

Did my second message get through?

serene scaffold Aug 17, 2023, 11:55 AM

#

past meteor Did my second message get through?

no, I just wasn't caught up with all the messages after the one I replied to. bad choice on my part.

making a tiny ML framework step-by-step (maybe in another lang than Python), how to structure projects, what data viz tool to use, when ML is appropriate, an actor model approach to genetic algorithms,
I'd be interested to see what you come up with.

#

personally, I'd be less interested in the data viz part.

past meteor Aug 17, 2023, 12:05 PM

#

Personally I detest data viz. I ask about it in interviews and if it's a big component I bail.

The reason I'd write about it is moreso that people tend to struggle with picking the right tool in my opinion. Like, FOSS bi tools vs proprietary vs Python vs fully in JS . It sounds boring but our research group has burnt itself in the past by doing this

#

People went for custom solutions that didn't warrant the complexity of the project. Analytics people on the other hand only have a hammer called tableau so then everything is a dashboard shaped nail 🤣

weak mortar Aug 17, 2023, 1:32 PM

#

Hello! Weird problem today with anaconda. It will only execute one line of code then exit the script. Ie do nothing, but if i put a print("whatever") on first line, it will print it and then not do more

#

It worked a few days ago and i didn't change anything afaik

#

Also a handful of times it crashed VSC saying out of memory

weak mortar Aug 17, 2023, 2:03 PM

#

Maybe easier i just install two versions of python and then can install the outdated libs i need in each specific python version 🤔

#

Its not really a datascience question but a guy in general said i should ask here pepeshrug

chrome ginkgo Aug 17, 2023, 3:13 PM

#

Hello

quaint loom Aug 17, 2023, 3:27 PM

#

Is there anyone here who is familiar with WPS and know how to copy equation that I have written there and now want to move to a word doc?

magic dune Aug 17, 2023, 4:44 PM

#

quaint loom Is there anyone here who is familiar with WPS and know how to copy equation that...

Why not just use latex?

quaint loom Aug 17, 2023, 5:08 PM

#

magic dune Why not just use latex?

I was but started a little wrong and just continued. 😂

magic dune Aug 17, 2023, 5:14 PM

#

quaint loom I was but started a little wrong and just continued. 😂

Mathpix is latex ocr

quaint loom Aug 17, 2023, 5:15 PM

#

magic dune Mathpix is latex ocr

Thanks

velvet rampart Aug 17, 2023, 7:23 PM

#

Please what does count_vectoriser.fit_transform do

mild dirge Aug 17, 2023, 7:38 PM

#

https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html

scikit-learn

sklearn.feature_extraction.text.CountVectorizer

Examples using sklearn.feature_extraction.text.CountVectorizer: Topic extraction with Non-negative Matrix Factorization and Latent Dirichlet Allocation Semi-supervised Classification on a Text Data...

#

@velvet rampart
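As a rough picture of what the linked page describes: `fit` learns a vocabulary from the documents, `transform` turns each document into a row of word counts, and `fit_transform` does both in one call. The sketch below is a pure-Python imitation of that idea, not scikit-learn's implementation. The real class returns a sparse matrix and has many more options.

```python
import re
from collections import Counter

TOKEN = re.compile(r"\b\w\w+\b")  # lowercase words of two or more characters


def tokenize(doc):
    return TOKEN.findall(doc.lower())


def fit(docs):
    """'fit': learn the vocabulary, mapping each word to a column index."""
    words = sorted({word for doc in docs for word in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}


def transform(docs, vocab):
    """'transform': one row of word counts per document."""
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts.get(word, 0) for word in vocab])
    return rows


docs = ["the cat sat", "the cat saw the dog"]
vocab = fit(docs)
matrix = transform(docs, vocab)  # fit followed by transform

print(vocab)   # {'cat': 0, 'dog': 1, 'sat': 2, 'saw': 3, 'the': 4}
print(matrix)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each column is one vocabulary word and each row is one document, so the second row's last entry is 2 because "the" appears twice in that sentence.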

rose agate Aug 17, 2023, 11:31 PM

#

I have a data frame 'df' with columns TO and FROM which are numbers, TO is always greater than FROM, and I need to find which ranges in this frame overlap with ranges from a predetermined segment I'm looping through. The current statement to find this is
df[~(df.TO < segFROM) & ~(df.FROM > segTO)]
However with the scale of the data I'm working with this is very slow. Is there any way to make this statement faster by sorting the data frame to do a faster search for each of those conditions I'm filtering on?

tidal bough Aug 18, 2023, 12:07 AM

#

Well, you could definitely do a searchsorted manually, but how to make it do it automatically, hmm...
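A sketch of what a manual `searchsorted` could look like, on made-up interval data. Sorting once by FROM lets a binary search discard every row that starts after the segment ends; the TO condition is then checked only on the remaining prefix. Whether this beats the plain boolean mask depends on the data, so it would need timing on the real frame.

```python
import numpy as np

# Hypothetical interval data: FROM < TO on every row.
FROM = np.array([0, 5, 10, 20, 30])
TO = np.array([4, 12, 15, 25, 40])

# Sort once by FROM, outside the loop over segments.
order = np.argsort(FROM, kind="stable")
from_sorted = FROM[order]
to_sorted = TO[order]


def overlapping_rows(seg_from, seg_to):
    """Original row positions whose [FROM, TO] overlaps [seg_from, seg_to]."""
    # Rows with FROM > seg_to cannot overlap; binary search finds the cut-off.
    end = np.searchsorted(from_sorted, seg_to, side="right")
    # Among the remaining prefix, keep rows that end at or after seg_from.
    keep = to_sorted[:end] >= seg_from
    return np.sort(order[:end][keep])


hits = overlapping_rows(11, 22)
print(hits)  # [1 2 3]
```

The result matches the original filter, since `~(TO < segFROM) & ~(FROM > segTO)` is the same condition as `(TO >= segFROM) & (FROM <= segTO)`.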

#

~~(my mental model of some people in this channel says "use duckdb" :p)~~

tidal bough Aug 18, 2023, 12:22 AM

#

I tried to just sort the indices, but it doesn't seem to make it measurably faster.

last ivy Aug 18, 2023, 1:34 AM

#

Is there a good online reference for an intro to time series analysis ?

quaint loom Aug 18, 2023, 1:07 PM

#

magic dune Mathpix is latex ocr

Would you explain the difference between these? Not the equation itself, of course : )

hasty mountain Aug 18, 2023, 1:09 PM

#

Hey guys, I'm now making some experiments in a technique I'm trying to develop and I want to save my model metrics in an excel file.

Question is: is there a reason why I would prefer CSV over classic xlsx excel files?

I just want to confirm this. It's actually the first time I'm saving my metrics to excel (in a desperate effort to make things more organized and easier to visualize...I had to re-run some tests in a previous research because of my poor organizing skill)

quaint loom Aug 18, 2023, 1:11 PM

#

hasty mountain Hey guys, I'm now making some experiments in a technique I'm trying to develop a...

Hello,
I would think using CSV would make the file much simpler to read and more interoperable in the future if you're returning to it.

hasty mountain Aug 18, 2023, 1:13 PM

#

Hm... I'll take a look

#

I may not reuse the variables in that file to extract data...I think. Unless I try to analyze the correlation using some algorithm, I think...

quaint loom Aug 18, 2023, 1:15 PM

#

hasty mountain I may not reuse the variables in that file to extract data...I think. Unless I t...

If there's even a remote possibility you'll want to analyze the metrics later, sticking to CSV might be handy. As you mentioned, if you ever decide to dive into correlations using an algorithm, having the data in CSV can simplify the data loading step in many programming tools.

hasty mountain Aug 18, 2023, 1:19 PM

#

quaint loom If there's even a remote possibility you'll want to analyze the metrics later, s...

Thanks! Maybe I'll try saving as excel file and as csv. One to visualizing things, and the other for possible future calculations
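For the CSV half of that plan, a minimal sketch using only the standard library. The metric names, file name, and values are placeholders. Each experiment appends one row, and the header is written only when the file is new, so the file can grow across runs and be loaded later by any tool.

```python
import csv
import os
import tempfile

FIELDS = ["run", "model", "accuracy", "f1"]  # hypothetical metric names


def append_metrics(path, row):
    """Append one experiment's metrics, writing the header only once."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


# A temporary directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "metrics.csv")
append_metrics(path, {"run": 1, "model": "baseline", "accuracy": 0.91, "f1": 0.88})
append_metrics(path, {"run": 2, "model": "variant", "accuracy": 0.93, "f1": 0.90})

with open(path, newline="") as fh:
    rows = list(csv.DictReader(fh))

print(len(rows), rows[1]["model"])  # 2 variant
```

Values come back as strings when read with the `csv` module, so numeric columns need converting before any calculation.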

quaint loom Aug 18, 2023, 1:21 PM

#

hasty mountain Thanks! Maybe I'll try saving as excel file and as csv. One to visualizing thing...

Saving one CSV is also space saving.. Good luck with your research mister ^^

hasty mountain Aug 18, 2023, 1:22 PM

#

Thanks! 👍

quaint loom Aug 18, 2023, 1:45 PM

#

quaint loom Would you explain the difference between these? Not the equation itself, of cour...

Pretty bad that you`re having 10 snips for free every month XD

magic dune Aug 18, 2023, 2:21 PM

#

quaint loom Would you explain the difference between these? Not the equation itself, of cour...

Different ways of writing it

potent sky Aug 18, 2023, 3:16 PM

#

past meteor Personally I wouldn't write stuff like "my data science journey" because it's bo...

seems like a good collection of topics, I'd be interested to have a look!

wooden sail Aug 18, 2023, 3:48 PM

#

past meteor How do you guys feel about blogging? I feel like there's already so much materia...

i think it's a good idea, but i'd do it just in terms of things i find cool. possibly disconnected tidbits with explanations and examples

#

maybe even of things you often see explained incorrectly or that people struggle with

#

when i discuss stuff with other doctorands or with students we supervise, once i notice a trend, i prepare some material to explain to/discuss with them, and it anyway ends up in a mixed format of latex/jupyter/drawings/video that already lends itself to just slapping onto a github repo with static site generation

#

why upside down 😔

potent sky Aug 18, 2023, 3:56 PM

#

I mean it as a rueful smile. I've seen this (slapping everything on a github repo with SSG) often enough that it's relatable, and though I think I would like to do differently, the fact remains that it's actually a pretty good way of making all these resources accessible. 🙃

past meteor Aug 18, 2023, 5:23 PM

#

Yeah! I definitely think data science is more than just models so I'll cover some of those topics!

#

Although I also enjoy deep diving on esoteric ML stuff that I use at work like multifidelity gaussian processes

placid cedar Aug 18, 2023, 6:44 PM

#

guys,in my dataset, i have this column called driversID. and it has 800+ unique values. should i drop the column entirely, or keep it?

i am doing a linear regression model here

sleek harbor Aug 18, 2023, 7:34 PM

#

past meteor Although I also enjoy deep diving on esoteric ML stuff that I use at work like m...

If I'm not mistaking u for someone else (and that's something I do a lot online), then u work with timeseries data¿ If yes, then I'd be interested in reading about that. As u all know I'm a noob in everything, so I likely won't get most of the advanced stuff, but I am specifically interested in that stuff cus I plan on eventually trying to build a trading algorithm (got some domain knowledge in the area so to speak)

serene scaffold Aug 18, 2023, 7:34 PM

#

placid cedar guys,in my dataset, i have this column called driversID. and it has 800+ unique ...

in general, unique IDs are just there to tell you what's what. they don't actually represent some meaningful property of the thing they identify.

past meteor Aug 18, 2023, 7:36 PM

#

sleek harbor If I'm not mistaking u for someone else (and that's something I do a lot online)...

The best text on time series is forecasting principles

#

It's free. Issue is that it's R and has a very businessy forecasting perspective to time series but it's very comprehensive

#

I learnt a lot from time series because it was my master's thesis topic and I did a few kaggle projects on it

sleek harbor Aug 18, 2023, 7:39 PM

#

past meteor The best text on time series is forecasting principles

"Forecasting: Principles and Practice”?

past meteor Aug 18, 2023, 7:39 PM

#

Yes

golden brook Aug 18, 2023, 8:40 PM

#

Any one good with Pandas here?
I posted a question on Stackoverflow but no help so far.

serene scaffold Aug 18, 2023, 8:40 PM

#

golden brook Any one good with Pandas here? I posted a question on Stackoverflow but no help ...

Be sure to always ask a complete question that someone can read and immediately start answering--don't "ask to ask"

#

with pandas in particular, give a reproducible copy of the data with something like print(df.head().to_dict('list'))

golden brook Aug 18, 2023, 8:44 PM

#

I was going to post the stackoverflow link here as it's easy to use to pd.read_clipboard with the formatting.

serene scaffold Aug 18, 2023, 8:45 PM

#

golden brook I was going to post the stackoverflow link here as it's easy to use to pd.read_...

that's fine. just give some amount of the data that fully encapsulates what is needed to solve some problem, and then ask the question.

golden brook Aug 18, 2023, 8:47 PM

#

So the dataframe is included in the post at the start.
You can just copy and then use pd.read_clipboard() to paste onto your IDE.

Here is the question:

https://stackoverflow.com/questions/76932276/two-consecutive-increase-in-values-starts-a-series-of-true-values-until-we-run-i

Thank you

Stack Overflow

Two consecutive increase in values starts a series of True values u...

This is the dataframe
Date Ratio
0 1993-01-29 0.44
1 1993-02-01 0.44
2 1993-02-02 0.45
3

serene scaffold Aug 18, 2023, 8:48 PM

#

golden brook So the dataframe is included in the post at the start. You can just copy and th...

next time, be sure to do all that in your first message about the question, so that no time is wasted.

#

@golden brook someone answered on SO with an answer that involves a for loop. That's probably the best solution, since pandas isn't great for mutations that require "memory"

grizzled carbon Aug 18, 2023, 9:03 PM

#

Hi guys, I recently took a course on AI and wanted to build onto it. I of course learnt about the MNIST dataset and wanted to implement a simple prediction app, where someone can draw a number on a black square and then get the model's prediction. Now my problem is that the model is super bad haha.
I have tried multiple models that all have had a reported accuracy of at least 97% after having finished training. But once I try it out it gets like every third picture wrong :)) . I am pretty sure it's in how I process the image but can't figure it out myself.

The image is received b64 encoded so first i do is decode and then process:
I use PIL to get the image and then turn it into a nparray with dimensions (1,28,28) so that it can be used in model.predict.

def retrieveB64(postRequest):
        image = base64.b64decode(postRequest,validate=True)
        decoded_string = io.BytesIO(image)
        img = Image.open(decoded_string)
        return img
def ImageForModel(image):
        image = image.convert("L")
        image = image.resize((28,28))
        array = np.array(image) 
        array = array / 255
        array = (np.expand_dims(array,0))
        return array

If anyone has any input, that would be great!
For debugging I also save the image after the resizing step to check if anything's wrong, but they always look fine

#

this is how my input images look after resizing them
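
One thing that might be worth checking (an assumption, since the training code and the drawn images aren't shown here): the original MNIST digits were size-normalized to fit a 20x20 box and centered inside the 28x28 frame, so a drawing that fills the whole canvas or sits off-center can look quite different to the model even if it looks fine to a person. Below is a rough sketch of a preprocessing step that mimics this. It uses the bounding box rather than MNIST's center-of-mass centering, `center_like_mnist` is a made-up helper name, and it assumes a light stroke on a black background.

```python
import numpy as np
from PIL import Image


def center_like_mnist(image):
    """Crop to the drawn digit, fit it in 20x20, and center it on a 28x28 canvas."""
    gray = image.convert("L")
    bbox = gray.getbbox()  # bounding box of the non-black pixels, or None
    if bbox is None:
        return np.zeros((1, 28, 28), dtype=np.float32)
    digit = gray.crop(bbox)
    digit.thumbnail((20, 20))  # shrinks in place, keeping the aspect ratio
    canvas = Image.new("L", (28, 28), 0)
    left = (28 - digit.width) // 2
    top = (28 - digit.height) // 2
    canvas.paste(digit, (left, top))
    array = np.asarray(canvas, dtype=np.float32) / 255.0
    return np.expand_dims(array, 0)  # shape (1, 28, 28), values in [0, 1]
```

If the model was trained on differently scaled or inverted inputs, this would need to be adjusted to match.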

latent tendon Aug 18, 2023, 10:02 PM

#

I am trying to get jupyter notebooks to work with my visual studio code. I am working on a project and it told me to download anaconda. I had anaconda downloaded, however my projects were in visual studio code and I decided to work through visual studio code and master it a bit more and then work in anaconda, so I deleted it.
A few months later I recognized that maybe data analyst jobs are more available right now than django jobs and I started studying data science.

I managed to use pip to install jupyter. When I do so and run a cell such as import pandas as pd, it acts like it's never heard of pandas and will not import anything.

import pandas as pd

I am wondering if this is a not-downloading-anaconda issue or a Jupyter issue.

Do I need to import things through visual studio code?

What have I tried and what am I expecting?

I have tried looking at the working with Jupyter notebook documentation and it says to download anaconda. I have managed to install miniconda.

It is still not recognizing pandas.

Error Message:

ModuleNotFoundError Traceback (most recent call last)
Cell In[2], line 1
----> 1 import pandas as pd

ModuleNotFoundError: No module named 'pandas'

left tartan Aug 18, 2023, 11:06 PM

#

serene scaffold @golden brook someone answered on SO with an answer that involves a for ...

Seems like this would be checking if current-shift() > 0 and shift - shift(2) > 0, for the start signal, and again for a stop signal. Then the result is a cumsum of starts minus a cumsum of stops. Something along those lines.

left tartan Aug 18, 2023, 11:06 PM

#

latent tendon I am trying to get jupyter notebooks to work with my visual studio code. I am wo...

Add %pip install pandas to your first cell. You can also pip install from your conda command line.
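
If it still fails after that, it may help to check which Python the notebook kernel is actually running, since a package installed into one environment is invisible to another. A small sketch to run in a cell; `describe_environment` is just an illustrative helper name:

```python
import importlib.util
import sys


def describe_environment(module_name="pandas"):
    """Report the running interpreter and whether a module is importable from it."""
    spec = importlib.util.find_spec(module_name)
    return {"interpreter": sys.executable, "module_found": spec is not None}


print(describe_environment("pandas"))
```

If the interpreter path is not the environment where pandas was installed, selecting a different kernel in VS Code (or installing into this one) is the likely fix.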

left tartan Aug 19, 2023, 2:08 AM

#

golden brook So the dataframe is included in the post at the start. You can just copy and th...

Ok, this took me a few tries... I think this is what you're looking for. The idea is: separate the data into groups, based on whether there's a consecutive increase (start) or decrease (end). The first row is a "start". For each group, the "period" (the result you're looking for) is True if the group began with a start, and False otherwise (since the group began with an "end"):

```py
import pandas as pd

data = {
    'date': ['1993-01-29', '1993-02-01', '1993-02-02', '1993-02-03', '1993-02-04', '1993-02-05', '1993-02-08', '1993-02-09', '1993-02-10', '1993-02-11', '1993-02-12', '1993-02-16', '1993-02-17', '1993-02-18', '1993-02-23', '1993-02-24', '1993-02-25', '1993-02-26'],
    'value': [0.44, 0.44, 0.45, 0.44, 0.44, 0.56, 0.59, 0.58, 0.57, 0.54, 0.53, 0.47, 0.42, 0.38, 0.35, 0.39, 0.43, 0.46]
}

df = pd.DataFrame(data)

# start: the value rose on this row and on the previous row
df['start'] = ((df["value"] - df["value"].shift()) > 0) & ((df["value"].shift() - df["value"].shift(2)) > 0)
df.loc[0, "start"] = True  # treat the first row as a start

# end: the value fell on this row and on the previous row
df['end'] = ((df["value"] - df["value"].shift()) < 0) & ((df["value"].shift() - df["value"].shift(2)) < 0)

# every start/end row opens a new group; each group inherits the signal of its first row
df['group'] = (df['start'] | df['end']).cumsum()
df['first_start'] = df.groupby('group')['start'].transform('first')
# shift by one row so the period changes on the row after the signal
df['period'] = df['first_start'].shift(1, fill_value=False)

print(df)
```

serene scaffold Aug 19, 2023, 2:19 AM

#

@left tartan amazing

left tartan Aug 19, 2023, 2:42 AM

#

And for giggles, a duckdb / sql version, I also could’ve done the cumulative sum here, but opted for an asof join:

```py
import pandas as pd
import duckdb

data = {
    'date': ['1993-01-29', '1993-02-01', '1993-02-02', '1993-02-03', '1993-02-04', '1993-02-05', '1993-02-08', '1993-02-09', '1993-02-10', '1993-02-11', '1993-02-12', '1993-02-16', '1993-02-17', '1993-02-18', '1993-02-23', '1993-02-24', '1993-02-25', '1993-02-26'],
    'value': [0.44, 0.44, 0.45, 0.44, 0.44, 0.56, 0.59, 0.58, 0.57, 0.54, 0.53, 0.47, 0.42, 0.38, 0.35, 0.39, 0.43, 0.46]
}
df = pd.DataFrame(data)
result = duckdb.execute("""
    WITH input as (
        SELECT date,
               value,
               ifnull(value - lag(value) over l > 0 and lag(value) over l - lag(value, 2) over l > 0, True) as cstart,
               ifnull(value - lag(value) over l < 0 and lag(value) over l - lag(value, 2) over l < 0, False) as cend
        FROM df
        window l as (order by date)
    ),
    boundaries as (SELECT date, cstart from input WHERE cstart or cend),
    periods as (SELECT input.date,
                       boundaries.cstart
                FROM input
                ASOF JOIN boundaries on input.date >= boundaries.date)
    SELECT date, ifnull(lag(cstart) over (order by date), True) as signal FROM periods
""").df()
print(result)
```

rustic snow Aug 19, 2023, 9:56 AM

#

I am going to have a Machine Learning Interview in 2 days
Can you guys let me know what kind of questions the interviewer asks about machine learning (besides algorithms and data structures)?

slim bone Aug 19, 2023, 1:12 PM

#

rustic snow I am going to have a Machine Learning Interview in 2 days Can you guys let me kn...

A machine learning interview? As in, for a position as a* ML engineer or whatever?
Do you have any formal education?

rustic snow Aug 19, 2023, 1:26 PM

#

@slim bone ye I've been coding for 5 years and I know algorithms and data structures, a lot of them at least

rustic snow Aug 19, 2023, 1:26 PM

#

slim bone A machine learning interview? As in, for a position as a* ML engineer or whateve...

a ML position

hasty mountain Aug 19, 2023, 1:31 PM

#

Does someone know about a paper or article where the researchers have tried to combine Genetic Algorithms with Stochastic Gradient Descent to train neural networks?

#

I've tried to search about that and asked my professor about it, and got no results back then. But now that my research on that shows that my method is likely to fail, it may be interesting to double-check if someone tried more efficient methods

gilded kestrel Aug 19, 2023, 2:18 PM

#

anyone with colab pro? Does high-memory give you 51gb cpu ram or 25gb?

lapis sequoia Aug 19, 2023, 2:26 PM

#

What math do I need to know to start studying this field?

#

I’m currently studying precalculus

serene scaffold Aug 19, 2023, 2:43 PM

#

lapis sequoia I’m currently studying precalculus

calculus, linear algebra, and probability/statistics

placid cedar Aug 19, 2023, 2:43 PM

#

hey guys, after doing winsorization, i still have some outliers, but it got reduced from 900 to 700

#

is that still bad?

left tartan Aug 19, 2023, 3:07 PM

#

If you’re motivated, there’s some great resources that are watchable at your level. My two favorite are: https://youtube.com/@3blue1brown, see courses and watch all three. Some of it won’t make sense, but it’s nice to get the big picture. And, highlights of calculus by Gilbert Strang, which was designed for high school students: https://ocw.mit.edu/courses/res-18-005-highlights-of-calculus-spring-2010/video_galleries/highlights_of_calculus/ @lapis sequoia

lapis sequoia Aug 19, 2023, 3:09 PM

#

serene scaffold calculus, linear algebra, and probability/statistics

so much pure math i just wanna get started making cool stuff already 😂

#

no way around it tho ig

serene scaffold Aug 19, 2023, 3:10 PM

#

lapis sequoia so much pure math i just wanna get started making cool stuff already 😂

given how much libraries like pytorch abstract the math, you can do superficial stuff without understanding it.

#

but that's setting yourself up for long-term issues.

lapis sequoia Aug 19, 2023, 3:13 PM

#

left tartan If you’re motivated, there’s some great resources that are watchable at your lev...

tysm i’ll check this out 💙

lapis sequoia Aug 19, 2023, 3:14 PM

#

serene scaffold but that's setting yourself up for long-term issues.

i’ll just put my head down and try and learn all of it

#

hopefully 6 months is a reasonable time to learn all of that

serene scaffold Aug 19, 2023, 3:14 PM

#

lapis sequoia hopefully 6 months is a reasonable time to learn all of that

why six months?

lapis sequoia Aug 19, 2023, 3:15 PM

#

it’s just an arbitrary goal i set

serene scaffold Aug 19, 2023, 3:15 PM

#

one doesn't "learn AI" in six months.

lapis sequoia Aug 19, 2023, 3:15 PM

#

i meant the math

serene scaffold Aug 19, 2023, 3:16 PM

#

how are you going to measure your progress?

lapis sequoia Aug 19, 2023, 3:16 PM

#

not the field in general

#

idk good question

#

i’ll probably follow a course

left tartan Aug 19, 2023, 3:18 PM

#

lapis sequoia it’s just an arbitrary goal i set

fwiw, set reasonable goals... and, in my experience, I don't really know it until the second time through the material. So, it's reasonable to aim for familiarity with, say, the ideas behind calculus, linear algebra and statistics... but it's unrealistic to try to "know" them at the college course level. Doesn't mean you'll be proficient in any of them, but it'll be a good starting point.

lapis sequoia Aug 19, 2023, 3:21 PM

#

left tartan fwiw, set reasonable goals... and, in my experience, I don't really *know* it un...

So what is a reasonable amount of time to learn all that math and be proficient at it?

left tartan Aug 19, 2023, 3:23 PM

#

lapis sequoia So what is a reasonable amount of time to learn all that math and be proficient ...

That's a two year effort for college students. Some get a head start by taking calc 1 in high school.

lapis sequoia Aug 19, 2023, 3:25 PM

#

left tartan That's a two year effort for college students. Some get a head start by taking c...

I’m in high school right now and taking college classes in college right now to get my computer science degree at the same time i graduate from high school

#

if it’s gonna take 2+ years i might as well just wait and take it in college 😭

left tartan Aug 19, 2023, 3:25 PM

#

That's why i suggest just aiming for "familiarity" rather than "~~mastery~~" proficiency

wooden sail Aug 19, 2023, 3:26 PM

#

also depends what you call mastery

placid cedar Aug 19, 2023, 3:26 PM

#

hey guys, after doing winsorization, i still have some outliers, but it got reduced from 900 to 700
is that still bad?

left tartan Aug 19, 2023, 3:26 PM

#

I should say "proficiency" (like: passing a college class)

left tartan Aug 19, 2023, 3:27 PM

#

lapis sequoia I’m in high school right now and taking college classes in college right now to ...

learning is usually breadth or depth. Either you can learn a little about a lot, or a lot about a little. I don't think it's possible to do both at the same time.

lapis sequoia Aug 19, 2023, 3:29 PM

#

i see

#

alright thanks for the help

#

i’ll just work on getting familiar with the material

#

hopefully that’s enough to make some cool stuff 😂

wooden sail Aug 19, 2023, 3:30 PM

#

luckily for you, the basics of linalg, calculus and statistics can be learned independently of each other, so you could realistically try them at the same time

lapis sequoia Aug 19, 2023, 3:30 PM

#

good to know, thanks 👀

umbral ermine Aug 19, 2023, 3:33 PM

#

Hello everyone

lapis sequoia Aug 19, 2023, 3:36 PM

#

umbral ermine Hello everyone

Hi.

umbral ermine Aug 19, 2023, 3:36 PM

#

how are you

#

new here

lapis sequoia Aug 19, 2023, 3:36 PM

#

I am a full stack developer.

night wadi Aug 19, 2023, 4:39 PM

#

lapis sequoia so much pure math i just wanna get started making cool stuff already 😂

calculus and statistics aren't pure math

potent sky Aug 19, 2023, 6:02 PM

#

what is "pure math" and "not-pure-math"

slim bone Aug 19, 2023, 6:05 PM

#

potent sky what is "pure math" and "not-pure-math"

Would you care to be a bit more specific? In what context are you asking?

#

I think just looking up “pure math” gives pretty good results

potent sky Aug 19, 2023, 6:12 PM

#

The definitions of "pure math" I could find online are all contingent on the motivation for application of that math, rather than specific qualities of the mathematical concepts themselves.
I can't see the strength in this definition. Different concepts of mathematics that might not be readily apparent as having "real-world" applications might find one shortly.
Context is usefulness of making a categorization for pure math

potent sky Aug 19, 2023, 6:27 PM

#

Why isn't all math pure math? All math has qualities that lend themselves to rigor and generality, and aesthetic quality is rather subjective.
Applied mathematics should just classify some allowances that we make to a mathematical concept (such as sacrificing some degree of rigor or generality) in return for increased practical applicability.
It shouldn't classify a basic division of mathematical concepts themselves, with statements like "calculus isn't pure math".
I don't understand the usefulness or fairness of such a classification.

#

also this is pretty off topic by now I suppose, mb lol

forest lintel Aug 19, 2023, 7:09 PM

#

night wadi calculus and statistics aren't pure math

what are they

#

whats the terms i mean

slim bone Aug 19, 2023, 7:10 PM

#

potent sky Why isn't all math pure math? All math has qualities that lend themselves to rig...

is rather subjective
As far as I understand the definition is entirely subjective. As in, there is nothing that makes a concept "pure" or "unpure".

I'd reckon there are probably few subjects that definitely fall into one category but most appear to be on the spectrum between "pure" and "unpure"(applied?) and its place on it will change between each individual.

tl;dr: I completely agree with you lol

#

Ah, I didn't see @ DarQ replying to anybody. Now I realize there was context. Apologies lol

potent sky Aug 19, 2023, 7:15 PM

#

slim bone Ah, I didn't see @ DarQ replying to anybody. Now I realize there was context. Ap...

It's alright, I guessed as much but I took the excuse to vent xd

odd meteor Aug 19, 2023, 7:37 PM

#

umbral ermine new here

night wadi Aug 19, 2023, 7:47 PM

#

forest lintel whats the terms i mean

applied math I suppose

#

or maybe computational math fits best but I dunno how popular that term is

dusty valve Aug 19, 2023, 11:31 PM

#

just made an overloading decorator in native python

#

https://paste.pythondiscord.com/6BSQ

#

Took some black magic but it works

serene scaffold Aug 19, 2023, 11:42 PM

#

@dusty valve did you mean to post in #esoteric-python or #internals-and-peps ?

dusty valve Aug 19, 2023, 11:52 PM

#

Huh ye

#

Wrong

#

Thank stelersus

desert oar Aug 20, 2023, 1:23 AM

#

dusty valve just made an overloading decorator in native python

🙂 https://pypi.org/project/multipledispatch/

PyPI

multipledispatch

Multiple dispatch

#

i'd say it's slightly on topic because just about the only task domain where i think multiple dispatch makes a lot of sense is in numerical and mathematical code

#

otherwise it leads to confusing code, unless you're very disciplined or have a clear guiding framework like in haskell with its typeclasses

#

whereas it's kind of necessary with the zoo of different number and array types you might encounter

serene scaffold Aug 20, 2023, 2:00 AM

#

desert oar 🙂 https://pypi.org/project/multipledispatch/

return "%s + %s" % (x, y) miss me with that

desert oar Aug 20, 2023, 2:00 AM

#

lol, it's not a particularly inspired demo to put in the readme

#

fwiw i think this project predates .format and i'm fairly sure it predates f-strings

#

ok it doesn't predate .format, its first pypi release was 2014

serene scaffold Aug 20, 2023, 2:01 AM

#

I figured as much, in part because it doesn't use type hints.
also format predates f-strings

#

I hated .format from the start

desert oar Aug 20, 2023, 2:02 AM

#

really? i switched to .format from % as soon as i learned about it

#

i still use it from time to time

serene scaffold Aug 20, 2023, 2:02 AM

#

I started using python right when 3.6 came out

desert oar Aug 20, 2023, 2:03 AM

#

i think i started on 3.3 in school, then 3.4 -> 3.6 at my first job

serene scaffold Aug 20, 2023, 2:03 AM

#

I sometimes use modulo formatting for strings that have a lot of curly braces in them that are part of the string. .format occupies a weird middle ground that is never useful for me.

desert oar Aug 20, 2023, 2:03 AM

#

had a very forward-thinking professor starting us on python 3.x that early in its development when many people were still clinging to 2.7

#

i use .format for templating

#

i know we also have string.Template but that seems particularly not-useful by comparison. would be interesting to know the history behind that one

serene scaffold Aug 20, 2023, 2:04 AM

#

one time in the python bot I did .format (without calling it) and assigned the method to a variable. that was fun

desert oar Aug 20, 2023, 2:05 AM

#

yeah why not?

#

i basically use it as a lightweight no-dependency alternative to jinja

#

admittedly not a very common use case, but it happens

sharp quest Aug 20, 2023, 5:41 AM

#

Is it okay to ask about Pandas here?
I have a CSV file where two columns, date and amount, are interesting.
I'd like to filter out all rows that match YEAR and MONTH, then sum the amount.
Do I need to add an index or create a new dataframe?

desert oar Aug 20, 2023, 7:10 AM

#

sharp quest Is it okay to ask about Pandas here? I have a CSV file where two column, date an...

yes pandas is fine here. index is often useful and convenient but not necessary. you should probably work through https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html which has all the info you need

however the tldr is

matches_yearmon = (
    (data["date"].dt.year == YEAR) &
    (data["date"].dt.month == MONTH)
)
total = data.loc[matches_yearmon, "amount"].sum()

#

with date as part of the index it's sometimes more elegant, although in this case it's mostly the same

#

it's worth spending the time to understand each piece of the above example. i think it demonstrates several important principles about how pandas works and how to use it effectively
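
A self-contained version of the same idea, as a sketch: the CSV contents, `YEAR` and `MONTH` values are made up for illustration, and the second half shows the "date in the index" variant using a partial date string.

```python
import io

import pandas as pd

# Made-up stand-in for the CSV file; only the two relevant columns are included.
csv_text = """date,amount
2023-07-30,10.0
2023-08-01,20.0
2023-08-15,5.5
2023-09-02,7.0
"""
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

YEAR, MONTH = 2023, 8

# Boolean mask: one True/False per row, combined element-wise with &.
matches_yearmon = (data["date"].dt.year == YEAR) & (data["date"].dt.month == MONTH)
total = data.loc[matches_yearmon, "amount"].sum()

# Same result with the date as a sorted index and a partial date string.
indexed = data.set_index("date").sort_index()
total_from_index = indexed.loc["2023-08", "amount"].sum()

print(total, total_from_index)
```

Note that `parse_dates` is what makes the `.dt` accessor available; without it the date column would be plain strings.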

sharp quest Aug 20, 2023, 7:18 AM

#

Thanks mate, it'll give me a lead to work on. Pandas seem so cool but it's really not simple and easy to get discouraged when poking around it.

#

Or it's just me being dumb 😅

lapis sequoia Aug 20, 2023, 7:23 AM

#

could someone pls help me with replacing values in a pandas dataframe with dates pls?

#

2023-08-15 00:00:00 177.449997
2023-08-16 00:00:00 176.570007
2023-08-17 00:00:00 174.000000
2023-08-18 00:00:00 174.490005

this is what my dataframe looks like. the length of this is 5946

#

predictions_dataset.loc[5944] = 300

but if i use this line to replace one of the last values

#

it just adds it to the end and results in this

2023-08-15 00:00:00 177.449997
2023-08-16 00:00:00 176.570007
2023-08-17 00:00:00 174.000000
2023-08-18 00:00:00 174.490005
5944 300.000000

#

predictions_dataset is the name of the dataframe btw

cobalt salmon Aug 20, 2023, 7:25 AM

#

Hello, hope it's ok to ask a question about conda, jupyter notebooks and installing a package. I've got conda version 23.7.2 and am trying to install the pyclustertend package from within a jupyter notebook using !pip install pyclustertend however it tries to compile sklearn from source code. On a Mac this fails by default and the recommendation is to compile sklearn from source instead, which I am trying to do using the instructions here:
https://scikit-learn.org/dev/developers/advanced_installation.html#compiler-macos

However, when I do conda activate sklearn-dev it doesn't accept the command and says activate is an invalid choice, which is just weird. I couldn't find much on Google about this. Here's the output:

usage: conda [-h] [--no-plugins] [-V] COMMAND ...
conda: error: argument COMMAND: invalid choice: 'activate' (choose from 'clean', 'compare', 'config', 'create', 'info', 'init', 'install', 'list', 'notices', 'package', 'remove', 'uninstall', 'rename', 'run', 'search', 'update', 'upgrade', 'doctor', 'debug', 'pack', 'content-trust', 'repo', 'verify', 'index', 'build', 'env', 'metapackage', 'develop', 'convert', 'inspect', 'render', 'server', 'skeleton', 'token')

Note: you may need to restart the kernel to use updated packages.

#

Weirdly, from the command line, conda activate works just fine. The versions of Conda between the jupyter notebook and the command line are the same, I'm not sure what the difference is

odd meteor Aug 20, 2023, 7:53 AM

#

cobalt salmon Hello, hope it's ok to ask a question about conda, jupyter notebooks and install...

The exclamation notation to install packages from a Jupyter notebook (JNB) isn't advised. Use any one of these instead.


import sys
!{sys.executable} -m pip install package_name

Or better still,

%pip install package_name

%conda install package_name

In IPython 7.3 and above, you should use the line magic commands %pip or %conda to install a package into the current kernel's environment instead of using !pip (which runs whichever pip is first on the shell's PATH, and that may belong to a different Python than the one your kernel uses)

If the above doesn’t work, you might wanna confirm if you can install packages directly from JNB, or if you need to allow some sort of access for the package to be installed.

odd meteor Aug 20, 2023, 8:04 AM

#

lapis sequoia 2023-08-15 00:00:00 177.449997 2023-08-16 00:00:00 176.570007 2023-08-17 0...

Is the index of your dataframe a date or the default number index in pandas?

Assuming you have a column in your df called scores, what do you see when you do this?

predictions_dataset.at[1, "scores"]

lapis sequoia Aug 20, 2023, 8:05 AM

#

odd meteor Is the index of your dataframe a date or the default number index in pandas? As...

the index is the date

#

i dont have a column called scores but i figured it out with a bit of messing around

#

predictions_dataset.loc[index:index+1] = value

#

this line replaces the value at index with the value of value

#

i have no clue why this works but it does

#

so i can change a range of values (kindof?) but i cant change a single value without adding a new row

patent tree Aug 20, 2023, 8:09 AM

#

Hello community members,

I have been assigned a project to create an audio-book app as part of my curriculum. To enhance the app's features, I am planning to implement a content-based recommendation system. This recommendation system will provide users with suggestions for audiobooks based on their clicks and listening time.

For instance, if a user listens to adventure category books more frequently than autobiographies, the algorithm will prioritize recommending adventure books over other categories. I hope this clarifies the concept.

Given that I have only a couple of weeks left to complete this project, I intend to focus solely on the essential aspects (algorithms) required for building this recommendation system.

I would greatly appreciate guidance on the specific machine learning algorithms or techniques that are suitable for developing such a recommendation system. Additionally, I'm unsure whether this recommendation system necessitates deep learning or neural networks. If they are indeed required, could you please suggest the relevant algorithms?

Currently, I am familiar with numpy and pandas, and I possess a basic understanding of supervised machine learning (though not at an advanced level).

Thank you in advance for your assistance.

odd meteor Aug 20, 2023, 8:14 AM

#

lapis sequoia the index is the date

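.loc selects by index label, not by 0-based position; .iloc is the one that selects by position. That's why .loc[5944] on a date index creates a new row labelled 5944.

Better still, for more flexibility, convert the index of your pandas DataFrame to a DatetimeIndex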

!e


import pandas as pd

data = {'Value': [10, 15, 20, 25]}
dates = ['2023-08-01', '2023-08-02', '2023-08-03', '2023-08-04']

datetime_index = pd.DatetimeIndex(dates)
df = pd.DataFrame(data, index=datetime_index)

# Adding a new row using loc
new_date = '2023-08-05'
new_value = 30

df.loc[new_date] = new_value

print(df)
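
And for replacing an existing value rather than adding a row, a small sketch along the same lines. The column name `Close` and the values are assumptions made up for illustration.

```python
import pandas as pd

dates = pd.date_range("2023-08-15", periods=4, freq="D")
df = pd.DataFrame({"Close": [177.449997, 176.570007, 174.0, 174.490005]}, index=dates)

# Position-based: overwrite the second-to-last row; no new row is created.
df.iloc[-2, df.columns.get_loc("Close")] = 300.0

# Label-based: overwrite the row whose index label is this exact date.
df.loc[pd.Timestamp("2023-08-18"), "Close"] = 310.0

print(df)
print(len(df))
```

The length stays the same in both cases because the position and the label already exist in the frame.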

lapis sequoia Aug 20, 2023, 8:16 AM

#

odd meteor `.loc` doesn't use the 0-indexed ordering but `iloc` does. Better still, for mo...

Thank you!

#

i need to do this eventually

cobalt salmon Aug 20, 2023, 8:24 AM

#

@odd meteor thanks, I've tried %conda install pyclustertend. That gives me:

Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - pyclustertend

Current channels:

  - https://repo.anaconda.com/pkgs/main/osx-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/osx-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

#

Unfortunately on anaconda.org, that package does exist, but it's empty 😦
https://anaconda.org/lachhebo7/repo

odd meteor Aug 20, 2023, 8:30 AM

#

patent tree Hello community members, I have been assigned a project to create an audio-book...

Well, for your task, if the data is in tabular form, you don't necessarily need Deep Learning to build a Recommendation Engine.

Although you'd still have to decide which type of recommender system you want to build.

Content-based
Collaborative Filtering
Hybrid (combines 1 & 2)
Neural Collaborative Filtering (NCF)
Etc

This could be of help

YouTube

Simplilearn

How Build A Movie Recommendation System Using Python | Python Tutor...

GitHub

GitHub - topspinj/recommender-tutorial: An introduction to recommen...

An introduction to recommendation systems in Python - GitHub - topspinj/recommender-tutorial: An introduction to recommendation systems in Python
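
To give a feel for how small a content-based recommender can be, here is a minimal sketch using only pandas. All titles, categories and listening times are made up, and it uses only the category as the "content" feature; a real app would likely add more features (author, narrator, description text) and also factor in clicks.

```python
import pandas as pd

# Hypothetical catalogue and listening history.
books = pd.DataFrame({
    "title": ["Sea Quest", "Mountain Trail", "A Life in Code", "My Story", "Jungle Run"],
    "category": ["adventure", "adventure", "autobiography", "autobiography", "adventure"],
})
listening = pd.DataFrame({
    "title": ["Sea Quest", "A Life in Code"],
    "minutes": [120, 30],
})

# User profile: share of total listening time spent in each category.
profile = listening.merge(books, on="title").groupby("category")["minutes"].sum()
profile = profile / profile.sum()

# Score every book the user has not listened to by its category's share.
unheard = books[~books["title"].isin(listening["title"])].copy()
unheard["score"] = unheard["category"].map(profile).fillna(0.0)
recommendations = unheard.sort_values("score", ascending=False, kind="stable")

print(recommendations)
```

No neural network is involved here; the "model" is just a weighted preference per category.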

odd meteor Aug 20, 2023, 8:38 AM

#

cobalt salmon Unfortunately on anaconda.org, that package does exist, but it's empty 😦 https...

Oh that's bad. Try pip installing it then from your anaconda prompt using pip install pyclustertend (Anaconda powershell prompt > Run as administrator > pip install the package)

or from your Jupyter Notebook using %pip install pyclustertend

This should solve the problem.

cobalt salmon Aug 20, 2023, 9:02 AM

#

@odd meteor I'm on a Mac, I assume the powershell reference you mentioned was for Windows?

I have already tried the simple %pip install pyclustertend methods - they all fail because it tries to compile scikit-learn from source code and then fails because OpenMP isn't available:

Collecting pyclustertend
  Using cached pyclustertend-1.6.2-py3-none-any.whl (7.1 kB)
Requirement already satisfied: matplotlib<4.0.0,>=3.3.3 in /Users/kjaleel/anaconda3/lib/python3.11/site-packages (from pyclustertend) (3.7.1)
Requirement already satisfied: numpy<2.0.0,>=1.19.1 in /Users/kjaleel/anaconda3/lib/python3.11/site-packages (from pyclustertend) (1.24.3)
Requirement already satisfied: pandas<2.0.0,>=1.2.0 in /Users/kjaleel/anaconda3/lib/python3.11/site-packages (from pyclustertend) (1.5.3)
Collecting scikit-learn<0.25.0,>=0.24.0 (from pyclustertend)
  Using cached scikit-learn-0.24.2.tar.gz (7.5 MB)
  Installing build dependencies ... \

This is why I was trying to compile scikit-learn from source using the instructions at: https://scikit-learn.org/dev/developers/advanced_installation.html#building-from-source

However, I'm having 2 problems - one is that after I create an environment in Conda, it refuses to activate it and says that there's no such command defined. I have given an example of this above already.

From the command line (outside Conda), the activate command works fine and I'm able to switch into the environment and I can compile a new version of sklearn, but then what do I do to use that inside Conda? Since I can't run conda activate I'm stuck. This is a silly circular problem

#

Maybe I should just give up on this package and use something else? I'm just trying to follow an example from a Udemy course that is using the hopkins module from pyclustertend to do a K Means Clustering example. Maybe there's a different Python library to do that?

odd meteor Aug 20, 2023, 9:06 AM

#

Yeah it's for Windows. I don't use Mac but you can still try installing directly from anaconda prompt. I suppose you have the normal anaconda prompt, yeah?

patent tree Aug 20, 2023, 9:08 AM

#

odd meteor Well, for your task, if the data is in tabular form, you don't necessarily need ...

sorry but i have already mentioned it. no problem i tell you again that it is content based filtering.
and one more thing is that the video tutorial leads to collaborative filtering. does it matter whether it's content or collaborative??? is all the process same???

odd meteor Aug 20, 2023, 9:10 AM

#

cobalt salmon Maybe I should just give up on this package and use something else? I'm just try...

Of course there is...

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters = 2)
kmeans.fit(data)

cobalt salmon Aug 20, 2023, 9:13 AM

#

odd meteor Of course there is... ```py from sklearn.cluster import KMeans kmeans = KMeans(...

@odd meteor yup, that is mentioned on that course too. It's just this "clustering tendency" stuff using Hopkins - that's where I'm stuck.

Not sure what you mean by the anaconda prompt? I'm running these commands from inside a Jupyter notebook, is that incorrect? I used Anaconda to "start" the Jupyter notebook

odd meteor Aug 20, 2023, 9:13 AM

#

patent tree sorry but i have already mentioned it. no problem i tell you again that it is co...

No, it's not exactly the same. They are both different algorithms. I suppose the github link covered Content-based Filtering. But if you prefer visual, you can look up Content-based recommendation system on YouTube for guide.

odd meteor Aug 20, 2023, 9:16 AM

#

cobalt salmon @odd meteor yup, that is mentioned on that course too. It's just this ...

I haven't used that package before but what exactly does this package do with regards to clustering?

It sort of tells you amongst variety of clustering algorithms, the best one to use on your data?

cobalt salmon Aug 20, 2023, 9:22 AM

#

odd meteor I haven't used that package before but what exactly does this package do with re...

let's say you have a dataset in a dataframe - the package contains the hopkins module which allows you to run the Hopkins test. This is used to tell you how 'clusterable together' the datapoints in your dataframe are. If the Hopkins value is less than 0.5, it is unlikely to have statistically significant clusters.
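
If the package won't install, the statistic itself is short enough to sketch with NumPy alone. This is not pyclustertend's implementation; `hopkins_statistic` is a made-up name, it uses plain distances (some formulations raise them to the power of the number of dimensions), and it follows the convention described above where values around 0.5 suggest roughly uniform data and values approaching 1 suggest clustering. Some implementations report the complement instead, so it's worth checking the docs of whichever one the course uses.

```python
import numpy as np


def hopkins_statistic(X, sample_size=None, seed=0):
    """Compare nearest-neighbour distances of real points vs. uniform random points."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    m = sample_size if sample_size is not None else max(1, n // 10)
    rng = np.random.default_rng(seed)

    # w: distance from each sampled real point to its nearest *other* real point.
    idx = rng.choice(n, size=m, replace=False)
    dist_real = np.linalg.norm(X[idx, None, :] - X[None, :, :], axis=2)
    dist_real[np.arange(m), idx] = np.inf  # ignore the point itself
    w = dist_real.min(axis=1)

    # u: distance from uniform points in the data's bounding box to the nearest real point.
    uniform_points = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
    dist_unif = np.linalg.norm(uniform_points[:, None, :] - X[None, :, :], axis=2)
    u = dist_unif.min(axis=1)

    return u.sum() / (u.sum() + w.sum())


rng = np.random.default_rng(42)
clustered = np.vstack([rng.normal(0.0, 0.1, size=(100, 2)),
                       rng.normal(10.0, 0.1, size=(100, 2))])
uniform = rng.uniform(0.0, 1.0, size=(200, 2))
print(hopkins_statistic(clustered), hopkins_statistic(uniform))
```

The brute-force distance matrix is fine for a few thousand rows; for larger data a nearest-neighbour index would be the usual swap.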

lapis sequoia Aug 20, 2023, 9:23 AM

#

can someone help me debug my code pls?

#

# there are roughly 250 working days a year, which means 750 new values need to be predicted

future_predicted_values = []

# Predict the next day's price
historical_data = yf.download(ticker, '2000-01-01', current_date)
historical_data = historical_data['Close']
historical_data_values = historical_data.values

for i in range(750):
    # Only keep the last 60 days
    #print(historical_data[-3:])
    historical_data = historical_data[-60:]
    print(historical_data[-5:])
    

    #print('historical_data')
    #print(historical_data)
    #print()

    reshaped_historical_data = np.reshape(historical_data_values, (-1, 1))

    # Scale the data to be between 0 and 1

    normalized_historical_data = (reshaped_historical_data - np.min(reshaped_historical_data))/(np.max(reshaped_historical_data)-np.min(reshaped_historical_data))
    # Store the data to reverse normalization later
    historical_data_max = np.max(reshaped_historical_data)
    historical_data_min = np.min(reshaped_historical_data)

    #print('normalized_historical_data')
    #print(normalized_historical_data)
    #print()

    # Create an empty list
    new_x_test = []
    new_x_test.append(normalized_historical_data)
    new_x_test = np.array(new_x_test)

    #print('new_x_test')
    #print(new_x_test)
    #print()

    # Reshape the data
    new_x_test = np.reshape(new_x_test, (new_x_test.shape[0], new_x_test.shape[1], 1))

    # Get the predicted scaled price
    predicted_price = model.predict(new_x_test)
    predicted_price = predicted_price * (historical_data_max-historical_data_min) + historical_data_min

    # Print the predicted price
    #print(historical_data[-1])
    print(predicted_price)
    historical_data = np.append(historical_data, predicted_price)
    #print(historical_data[-1])
    future_predicted_values.append(predicted_price)

#

this generates a new value as expected

#

i then add that new value to the array historical data

#

but then it generates the same value again

#

why is that?
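
One likely cause, going only by the code as posted: `historical_data_values` is computed once before the loop and is the array that gets reshaped and fed to the model on every iteration, so appending to `historical_data` never changes the model input. A minimal sketch of a loop that re-slices the updated history each step is below. `MeanModel` is a hypothetical stand-in for the trained model, the function name and window size are assumptions, and the per-window min-max scaling only makes sense if it matches how the real model was trained.

```python
import numpy as np


class MeanModel:
    """Hypothetical stand-in for the trained model: predicts the mean of the scaled window."""

    def predict(self, batch):
        # batch has shape (samples, timesteps, features); return shape (samples, 1)
        return batch.mean(axis=1)


def rolling_forecast(model, closes, steps, window=60):
    """Predict `steps` future values, feeding each prediction back into the input window."""
    history = np.asarray(closes, dtype=float)
    predictions = []
    for _ in range(steps):
        recent = history[-window:]                 # re-slice the UPDATED history every step
        low, high = recent.min(), recent.max()
        span = (high - low) or 1.0                 # avoid dividing by zero on a flat window
        scaled = (recent - low) / span             # min-max scale this window
        batch = scaled.reshape(1, -1, 1)           # (samples, timesteps, features)
        scaled_pred = float(np.asarray(model.predict(batch)).ravel()[0])
        pred = scaled_pred * span + low            # undo the scaling
        predictions.append(pred)
        history = np.append(history, pred)         # the next window now includes this prediction
    return predictions


closes = np.linspace(100.0, 130.0, 80)             # made-up prices for the demo
preds = rolling_forecast(MeanModel(), closes, steps=5)
print(preds)
```

The key difference from the posted loop is that the array passed to the model is derived from `history` inside the loop, after the previous prediction has been appended.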

odd meteor Aug 20, 2023, 9:32 AM

#

cobalt salmon <@519319496868233227> yup, that is mentioned on that course too. It's just this ...

You can install packages from command line (just like you do with VSCode via the terminal)

You can also install directly from your Jupyter Notebook.

You can as well install packages from your anaconda prompt (if you have anaconda installed on your pc)

The last suggestion I made was installing it via anaconda prompt.

I don't use Mac, but I sure know it's possible for you to still get that package so long as it's available on PyPI and can work on any OS.

Just search "Anaconda Prompt" on your Mac and open it (on macOS, the regular Terminal app should do the same job).
Activate your desired anaconda environment (this step is not compulsory).
Then type `pip install pyclustertend` to install the package.
Go back to your JNB and restart your kernel.

That should do the magic.
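
For the "install directly from your Jupyter Notebook" route, a common pitfall is that a bare `pip` may belong to a different Python than the one the kernel runs. A small sketch that ties the install to the kernel's own interpreter is below. The helper names are made up for illustration, and the demo only prints the command unless `dry_run=False` is passed. In recent IPython versions the `%pip install pyclustertend` magic serves the same purpose.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package, dry_run=True):
    """Print the command (dry run) or actually run it, then return it."""
    command = pip_install_command(package)
    if dry_run:
        print("Would run:", " ".join(command))
    else:
        subprocess.check_call(command)
    return command


command = install("pyclustertend")  # pass dry_run=False to really install
```

After a real install, restarting the kernel (as suggested above) makes sure the new package is importable.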

odd meteor Aug 20, 2023, 9:35 AM

#

cobalt salmon let's say you have a dataset in a dataframe - the package contains the `hopkins`...

Interesting. I'll try out the package then!

lapis sequoia Aug 20, 2023, 10:17 AM

#

bruh is it even possible to predict stock prices with ai

daring sphinx Aug 20, 2023, 11:08 AM

#

@serene scaffold
#data-science-and-ml message

Proved you wrong

humble shore Aug 20, 2023, 11:08 AM

#

damnnn

#

you making $$$$

#

this is a simple regression module right?

#

and is this upwork?

#

@daring sphinx ??

daring sphinx Aug 20, 2023, 11:10 AM

#

humble shore this is a simple regression module right?

Ml webapp in streamlit with deployment in EC2.

humble shore Aug 20, 2023, 11:11 AM

#

ya but the model itself was a regression model

#

btw were you the employer or the employee

daring sphinx Aug 20, 2023, 11:11 AM

#

The entire model was trained in sagemaker as well. Optimizing hyperparameters of xgboost. All with built in sagemaker libraries

daring sphinx Aug 20, 2023, 11:14 AM

#

humble shore btw were you the employer or the employee

Employee bro

cold osprey Aug 20, 2023, 12:15 PM

#

r u doing someone else's homework or smth?

sacred stirrup Aug 20, 2023, 1:14 PM

#

Hey everyone, I would like some guidance based on what i'm currently trying to do

I got few thousand images of cars with visible license plates, some have multiple cars, some have none, and for each image four pairs of integers which represent bounding corners of the license plate quadrilateral. In order to use that data to train a custom model, what is the preferred way to start? This is different from other machine learning models I've used in the past because usually there's a set of "objects" (animal, plant, human, building, ...) and the goal was to classify the image to something from that set, but this one's different

Is tensorflow / keras even suitable for this? Appreciate any feedback
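
One possible starting point, offered as a sketch rather than a recommendation: this kind of task is usually framed as detection or keypoint regression instead of classification, so the labels become coordinates rather than class names. TensorFlow / Keras can express either. The helper below (name and example numbers are hypothetical) turns four corner pairs into a normalised 8-value keypoint target and an axis-aligned bounding box, which are the two label formats such models typically train on.

```python
def quad_to_targets(corners, image_width, image_height):
    """Convert four (x, y) pixel corners into normalised training targets.

    Returns (keypoints, box):
      keypoints: [x1, y1, ..., x4, y4], each scaled to the 0..1 range
      box: (x_min, y_min, x_max, y_max), the axis-aligned box around the quad, 0..1
    """
    keypoints = []
    for x, y in corners:
        keypoints.extend([x / image_width, y / image_height])

    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    box = (
        min(xs) / image_width,
        min(ys) / image_height,
        max(xs) / image_width,
        max(ys) / image_height,
    )
    return keypoints, box


# Made-up example: one plate in a 1000x500 image.
corners = [(100, 200), (300, 210), (305, 260), (98, 250)]
keypoints, box = quad_to_targets(corners, image_width=1000, image_height=500)
print(keypoints)
print(box)
```

Images with several plates or none need a model that can output a variable number of detections, which is what object-detection architectures are designed for.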

dusty valve Aug 20, 2023, 1:29 PM

#

desert oar 🙂 https://pypi.org/project/multipledispatch/

Ye but mine supports generics and generalized unions

pale basalt Aug 20, 2023, 1:39 PM

#

Guys, I am trying to extract text from a PDF into Excel using OCR. I need help from a pro coder in OCR. DM me

dusty valve Aug 20, 2023, 1:41 PM

#

But looking deeper, it does something for subclasses, I'll take that into account later

dusty valve Aug 20, 2023, 1:42 PM

#

pale basalt Guys I am trying to extract text from PDF using ocr to excel. I need help of pro...

Iirc there are some libs to read pdf text, or u can

#

Or u can use

#

!pypi pytesseract

arctic wedgeBOT Aug 20, 2023, 1:43 PM

#

pytesseract v0.3.10

Python-tesseract is a python wrapper for Google's Tesseract-OCR

dusty valve Aug 20, 2023, 1:43 PM

#

Just download the executable and ur done

pale basalt Aug 20, 2023, 1:50 PM

#

Okay thanks

#

Can we directly bring it into excel using pytesseract?
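
As far as pytesseract itself goes, it only hands back the recognised text, so getting it into Excel needs a second step. A rough sketch under stated assumptions: `pytesseract.image_to_string` does the OCR (it needs the Tesseract binary and Pillow installed), and the rest is plain standard-library code that writes a CSV file, which Excel opens directly. Splitting cells on runs of two or more spaces is a guess about the layout, the helper names are made up, and PDF pages would first have to be rendered to images with a separate tool.

```python
import csv
import io
import re


def ocr_image_to_text(image_path):
    """OCR one image file. Requires pytesseract, Pillow and the Tesseract binary."""
    import pytesseract      # imported lazily so the helpers below work without it
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path))


def text_to_rows(text):
    """One row per non-empty line; cells are split on runs of 2+ whitespace characters."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows


def rows_to_csv(rows):
    """Serialise rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Stand-in for real OCR output, so the parsing can be tried without Tesseract.
sample_text = "Item   Qty\n\nApple   3\nGreen pear   12\n"
rows = text_to_rows(sample_text)
csv_text = rows_to_csv(rows)
print(csv_text)
```

In real use, `sample_text` would be replaced by `ocr_image_to_text("page.png")` (hypothetical file name) and `csv_text` written to a `.csv` file.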

odd meteor Aug 20, 2023, 1:53 PM

#

At surface level, yeah it is, but don't hold your breath on its efficacy 'cos relying solely on the model's predictions for investment signals is an all-expenses-paid ticket to bankruptcy. Same thing with using ML for Bitcoin prediction.

potent sky Aug 20, 2023, 2:13 PM

#

lapis sequoia bruh is it even possible to predict stock prices with ai

^^ see emrys' answer above

south crow Aug 20, 2023, 4:41 PM

#

Hi, do y'all know the max Python version Keras supports? I saw in the documentation on Keras's website that it supports up to 3.10, but does something like 3.10.9 work?

fickle dew Aug 20, 2023, 5:11 PM

#

daring sphinx @serene scaffold https://discord.com/channels/267624335836053506/366673247...

You did it for $50 but honestly that's low

slim bone Aug 20, 2023, 5:50 PM

#

daring sphinx @serene scaffold https://discord.com/channels/267624335836053506/366673247...

I've been learning about ML for a few weeks now but that uh... sounds terribly basic?

#

People pay 50 bucks for this stuff?

slim bone Aug 20, 2023, 5:51 PM

#

slim bone I've been learning about ML for a few weeks now but that uh... sounds terribly b...

I don't mean to offend you of course, I'm sure you did a great job
Rather, I'm genuinely asking: Isn't this really basic stuff?

twilit tundra Aug 20, 2023, 5:53 PM

#

50 sounds very cheap no matter how simple the model is

slim bone Aug 20, 2023, 5:54 PM

#

Seriously?

#

Reminds me of stories about the late 90's where web developers would get goofy amounts of money for virtually nothing

twilit tundra Aug 20, 2023, 5:55 PM

#

If it's for a company, 50 is basically nothing

#

Freelancers often have a rate going up to around 1.5k/day or more

slim bone Aug 20, 2023, 5:56 PM

#

twilit tundra Freelancers often have a rate going up to around 1.5k/day or more

I mean, renowned freelancers with degrees*- sure
This is not the case though I believe

#

And surely you have to factor in how much work went into the actual model

#

I could program that right now and I consider myself a complete pleb

twilit tundra Aug 20, 2023, 5:59 PM

#

Yeah probably, still pretty cheap considering it would probably be more than 1 hour of work

slim bone Aug 20, 2023, 6:00 PM

#

Damn, I'm tempted to just open a few freelancing profiles and see where this goes

twilit tundra Aug 20, 2023, 6:00 PM

#

You have to take into account that someone paying for this kind of service can't do it themselves

#

Most people have no idea how to train/deploy a model, it's not their role

slim bone Aug 20, 2023, 6:01 PM

#

Oh. Of course
If they had the faintest idea they probably wouldn't pay 50 bucks for this

twilit tundra Aug 20, 2023, 6:02 PM

#

The cheaper option is hiring a free intern I guess

slim bone Aug 20, 2023, 6:02 PM

#

Again, I'm assuming some basic regression model
I'm sure this could get extremely complicated very fast

slim bone Aug 20, 2023, 6:03 PM

#

twilit tundra The cheaper option is hiring a free intern I guess

That's a solid point honestly

twilit tundra Aug 20, 2023, 6:04 PM

#

At my company, it's cheaper than an intern if it would take said intern more than 2-3 hours

#

And according to the description, it's more than the model: you have a pipeline, an interface and it's hosted on AWS

#

Which is again, not a very complicated task but the value for a company that would need that is way more than 50

slim bone Aug 20, 2023, 6:08 PM

#

I suppose
I guess I just assumed the freelancing market would naturally reduce the price to naught

#

It’s almost disappointing to realize just how easy it is to implement a half-decent model

twilit tundra Aug 20, 2023, 6:09 PM

#

There are some that put very low prices for exposure but the lowest I've found were still around 300€/day

slim bone Aug 20, 2023, 6:09 PM

#

That’s crazy tbf

#

Again, just knowing how little work this could actually be
Especially if you’ve made similar projects

twilit tundra Aug 20, 2023, 6:12 PM

#

You have to find clients and you pay fees + taxes

#

And companies with ML use cases usually have large cash flows

slim bone Aug 20, 2023, 6:15 PM

#

That’s true, it also kind of occurred to me that doctors in the private sector charge way more than that for simple routine checks
A pretty bad comparison but the point still stands - knowledge is very valuable regardless of the effort

iron basalt Aug 20, 2023, 6:19 PM

#

slim bone That’s true, it also kind of occurred to me that doctors in the private sector c...

Not really just a knowledge issue in that case, but that is a whole off-topic discussion to be had. The point that just knowing anything is important is still correct. Also if someone has the skill set to do something far more complicated and high paying with their time, they are paying opportunity cost, and so you need to pay them more to make it worth their time. There is an opportunity to fill that gap where you can be paid much less, and know much less, and people do fill that role (by just barely dipping into ML and only knowing surface level knowledge / how to use stuff like these Amazon tools).

slim bone Aug 20, 2023, 6:21 PM

#

iron basalt Not really just a knowledge issue in that case, but that is a whole off topic di...

Then again, you have full-stack web developers who have learned for many months pricing themselves at 5 bucks in an often futile attempt at building a clientbase

#

Regarding their competence - I can't advocate. But they do have a portfolio that seems legit

#

This is the main reason this pricing is strange to me. Because I've seen people with much more knowledge (as far as you can quantify "knowledge"), working much harder, being paid much less

twilit tundra Aug 20, 2023, 6:23 PM

#

5 bucks to do what? A full website? A module? Either way that sounds very low

iron basalt Aug 20, 2023, 6:23 PM

#

slim bone Then again, you have full-stack web developers who have learned for many months ...

Over-saturation driven by the internet boom. Now that it's slipping into recession those that spent their time studying for that will have a rough time. If you prepare for many years for the current trend, you may be left stranded in the future.

slim bone Aug 20, 2023, 6:24 PM

#

twilit tundra 5 bucks to do what? A full website? A module? Either way that sounds very low

I can recall looking up frontend developers + hosting. Not a full website with security measures and everything although I'd bet you could probably find those at that price as well

twilit tundra Aug 20, 2023, 6:25 PM

#

You're barely paying for your electricity and internet at that rate

slim bone Aug 20, 2023, 6:25 PM

#

iron basalt Over-saturation driven by the internet boom. Now that it's slipping into recessi...

Precisely, which is why I think this pricing won't really hold up for long

slim bone Aug 20, 2023, 6:25 PM

#

twilit tundra You're barely paying for your electricity and internet at that rate

Depends on where you live

iron basalt Aug 20, 2023, 6:25 PM

#

slim bone Precisely, which is why I think this pricing won't really hold up for long

Timing is of the essence.

slim bone Aug 20, 2023, 6:25 PM

#

I suppose. But I'm looking for a stable job myself

#

I'll concede and say that learning ML is far more daunting and confusing than learning JavaScript though lol

iron basalt Aug 20, 2023, 6:26 PM

#

And having a general skill set protects you from this. For example, if ML suddenly dies down (doubt), then if you learned the math for it (which is very general purpose), you will have an easier time finding something else, since you can spend less time preparing for that.

twilit tundra Aug 20, 2023, 6:27 PM

#

ML is basically trendy statistics

slim bone Aug 20, 2023, 6:27 PM

#

I mean that's a little offensive isn't it? haha

iron basalt Aug 20, 2023, 6:28 PM

#

The name will probably change at some point, but it will be around.

slim bone Aug 20, 2023, 6:30 PM

#

This conversation kind of made me think - Do y'all know how researched non-NN based ML is? (Hope I didn't butcher the terminology)

#

I thought ML == Neural Networks not too long ago and I was happy to find out that I'm completely wrong

iron basalt Aug 20, 2023, 6:31 PM

#

slim bone This conversation kind of made me think - Do ya'll know how researched non-NN ba...

It depends on what is considered a NN, if it's just some graph with nodes, well, pretty much anything that scales falls under that.

#

(Compute graphs)

slim bone Aug 20, 2023, 6:32 PM

#

Not sure honestly. Decision tree learning or whatever it's called comes to mind

twilit tundra Aug 20, 2023, 6:34 PM

#

Boosting-based models are still widely used in everything tabular if that's part of non-NN ML

slim bone Aug 20, 2023, 6:34 PM

#

twilit tundra Boosting-based models are still very used in everything tabular if that's part o...

Ah yes, I'm certain. But are there still innovations in that uh.. field?

twilit tundra Aug 20, 2023, 6:36 PM

#

It's a lot slower than deep learning but there is still research afaik

slim bone Aug 20, 2023, 6:36 PM

#

Like, if I want to pursue a masters in ML - is there a reasonable chance that my professor would want to make a thesis about something that isn't NN?

slim bone Aug 20, 2023, 6:36 PM

#

twilit tundra It's a lot slower than deep learning but there is still research afaik

Yeah that's the answer I was expecting

slim bone Aug 20, 2023, 6:37 PM

#

twilit tundra It's a lot slower than deep learning but there is still research afaik

Do you know by any chance what academic courses in ML teach?

#

The syllabus is still rather gibberish-y atm

twilit tundra Aug 20, 2023, 6:39 PM

#

slim bone Do you know by any chance what academic courses in ML teach?

In my course, we had one class on ML, one on DL and then one for each specialization basically. I don't recall the specifics of the ML class, just that it was based on the Bishop book

slim bone Aug 20, 2023, 6:40 PM

#

twilit tundra In my course, we had one class on ML, one on DL and then one for each specializa...

> one for each specialization
What's a "specialization" in this context?

> bishop book
Also what's that

twilit tundra Aug 20, 2023, 6:40 PM

#

NLP, Computer Vision, interdisciplinary courses,etc.

slim bone Aug 20, 2023, 6:41 PM

#

twilit tundra NLP, Computer Vision, interdisciplinary courses,etc.

Ah, those fall under AI, not necessarily ML. Right?

twilit tundra Aug 20, 2023, 6:41 PM

#

slim bone > one for each specialization What's a "specialization" in this context? > bish...

Pattern recognition and machine learning

twilit tundra Aug 20, 2023, 6:41 PM

#

slim bone Ah, those fall under AI, not necessarily ML. Right?

What difference do you make between AI and ML?

slim bone Aug 20, 2023, 6:42 PM

#

twilit tundra What difference do you make between AI and ML?

Not sure, I was told the theory behind NLP and what I currently know is different for example

#

I suppose I only know about DL so my mental image of the field is rather tiny.

twilit tundra Aug 20, 2023, 6:44 PM

#

You need knowledge for NLP that is different from other fields but the more recent models are considered DL and overall it's part of ML imo

#

Like ML is a broad term and then NLP is ML applied to language

slim bone Aug 20, 2023, 6:45 PM

#

Oh. Curious

#

And the same goes for those specializations you mentioned I assume?

twilit tundra Aug 20, 2023, 6:45 PM

#

Yes

slim bone Aug 20, 2023, 6:45 PM

#

That's honestly nice to hear
I've been hoping to deviate from NN towards the end of my summer break

#

But learning the theory behind NN took me weeks upon weeks

twilit tundra Aug 20, 2023, 6:46 PM

#

Everything except tabular data uses NN unfortunately

slim bone Aug 20, 2023, 6:47 PM

#

Unfortunately?

twilit tundra Aug 20, 2023, 6:47 PM

#

If you want to do research on other fields

slim bone Aug 20, 2023, 6:47 PM

#

Oh, so you have no choice?

twilit tundra Aug 20, 2023, 6:49 PM

#

If you want to do research in CV, NLP or speech, you probably need to work on neural networks/DL to produce "publishable" results

slim bone Aug 20, 2023, 6:50 PM

#

That doesn't sound too bad

#

whole learning process has been fascinating so far

#

Then again, didn't read a thing about statistics

iron basalt Aug 20, 2023, 6:51 PM

#

ML was born from memoization of Checkers board states. Rather than doing the whole tree search to compute the value, only do it once and remember it for next time (learning). This then mixes with Monte-Carlo methods / probability / statistics. Rather than do everything, do some, and then guess the rest (induction / abduction). This also opened up the scope to problems where you can't try everything and are forced to guess. ML on its own is a mathematical topic (theory of computation, probability, statistics, decision theory, (multivariate) calculus, linear algebra (Von Neumann machines love linear algebra), etc).
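
The "only do it once and remember it" idea can be shown on a much smaller game than Checkers. The sketch below is an illustration chosen for brevity, not a reconstruction of any historical program: players alternately take 1 to 3 stones and whoever takes the last stone wins. `functools.lru_cache` stores the value of each position the first time the tree search reaches it, and the `evaluations` counter (added only for the demo) shows that each position is searched once.

```python
from functools import lru_cache

evaluations = 0  # counts real searches, i.e. cache misses


@lru_cache(maxsize=None)
def current_player_wins(stones):
    """True if the player to move can force a win with `stones` left."""
    global evaluations
    evaluations += 1
    # A position is winning if some move leaves the opponent in a losing position.
    # With 0 stones there are no moves, so the player to move has already lost.
    return any(
        not current_player_wins(stones - take)
        for take in (1, 2, 3)
        if take <= stones
    )


first = current_player_wins(30)
searches_after_first_call = evaluations
second = current_player_wins(30)   # answered from the cache, no new search
searches_after_second_call = evaluations
print(first, searches_after_first_call, searches_after_second_call)
```

Without the cache the same positions would be re-searched many times over. For a game as large as Checkers, storing every state is not feasible, which is where the "do some, then guess the rest" part of the message comes in.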

#

There are several NN / biologically inspired methods in ML, and they can be very different in feel. The most popular, DL, is very different from other NN based methods. Most of these NN based methods only draw loose inspiration from biology. To be effective they must be very different from their biological counterparts, because they need to run well on existing computer hardware, and that hardware is not well suited at all to directly simulating such biological systems.

#

Typically they seek to replicate some mathematical insight given by the biology (which evolution randomly found for us, so just copy it).

slim bone Aug 20, 2023, 7:05 PM

#

iron basalt ML was born from memoization of Checkers board states. Rather than doing the who...

The book I'm reading gave a brief overview about how machine learning came to be which was rather interesting to see.

I don't know the slightest bit of statistics yet, so there's a bit of a void in my heart in that aspect.

I am kind of curious to know just how relevant calculus and linear algebra are to modern-day ML research? As in, how much of it do you actually need to know in order to find something new? It feels like the modern libraries have abstracted and optimized every single tensor operation to its maximum for example

iron basalt Aug 20, 2023, 7:05 PM

#

slim bone The book I'm reading gave a brief overview about _how_ machine learning came to ...

On the last part, very important, you can't do much of anything without them.

slim bone Aug 20, 2023, 7:06 PM

#

iron basalt There are several NN / biologically inspired methods in ML, and they can be very...

I thought deep learning is simply a multi-layer neural network? It really does feel like there isn't a single agreed-upon definition for DL

slim bone Aug 20, 2023, 7:06 PM

#

iron basalt On the last part, very important, you can't do much of anything without them.

I'm talking in terms of innovation, and things you'll actually apply yourself

#

I can only imagine how crucial it is to be able to multiply matrices ~~properly~~ efficiently but that's something that's already been implemented for you

iron basalt Aug 20, 2023, 7:07 PM

#

slim bone I thought deep learning is simply a multi-layer neural network? It really does f...

DL these days means loosely something like "compute graph that visually has many 'layers' and runs well on current parallel hardware with automatic differentiation and backpropagation."
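
To make "compute graph with automatic differentiation and backpropagation" concrete, here is a toy reverse-mode autodiff sketch. The `Value` class and its methods are invented for this illustration and support only addition and multiplication of scalars. Real frameworks do the same bookkeeping over tensors and many more operations.

```python
class Value:
    """A scalar node in a compute graph that remembers how it was produced."""

    def __init__(self, data, parents=(), backward_rule=None):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        # Maps the gradient of this node to the gradients of its parents.
        self.backward_rule = backward_rule

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), lambda g: (g, g))

    def __mul__(self, other):
        return Value(
            self.data * other.data,
            (self, other),
            lambda g: (g * other.data, g * self.data),
        )

    def backward(self):
        """Backpropagation: visit nodes in reverse topological order, applying the chain rule."""
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_rule is None:
                continue  # leaf node (an input or a parameter)
            for parent, grad in zip(node.parents, node.backward_rule(node.grad)):
                parent.grad += grad


# One "neuron" with a squared output: loss = (w * x + b) ** 2
x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
loss = y * y
loss.backward()
print(loss.data, w.grad, x.grad, b.grad)
```

Stacking more such operations between input and output is what adds "layers" to the graph. The gradient bookkeeping stays the same however deep it gets.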

slim bone Aug 20, 2023, 7:08 PM

#

So the definition is pretty loose it seems

#

Unless I'm missing something

iron basalt Aug 20, 2023, 7:09 PM

#

slim bone So the definition is pretty loose it seems

Those things I mentioned are in common to them all.

slim bone Aug 20, 2023, 7:09 PM

#

Right, it's the "many layers" part that got me wondering

#

"Many" is probably entirely subjective

twilit tundra Aug 20, 2023, 7:10 PM

#

According to my supervisor, it is more than 2

slim bone Aug 20, 2023, 7:10 PM

#

Yeah that's kind of what I heard

slim bone Aug 20, 2023, 7:11 PM

#

twilit tundra According to my supervisor, it is more than 2

More than 2 essentially means "Beyond an input and an output" right?

naive crown Aug 20, 2023, 7:11 PM

#

Guys, I took training code for a model that works, and then I just changed one dimension (and updated the model parameters to match), and now the model loss stays constant forever. Can someone please help or give me tips?

mild dirge Aug 20, 2023, 7:11 PM

#

slim bone More than 2 essentially means "Beyond an input and an output" right?

Think so, but I also read more than 2 hidden layers, so kinda vague what people agree on.

slim bone Aug 20, 2023, 7:11 PM

#

mild dirge Think so, but I also read more than 2 *hidden* layers, so kinda vague what peopl...

Heh, sounds... random
Why 2 and not 1 or 3? lol

mild dirge Aug 20, 2023, 7:12 PM

#

Yeah, maybe just echo chamber of confusion

slim bone Aug 20, 2023, 7:12 PM

#

I think so

iron basalt Aug 20, 2023, 7:12 PM

#

slim bone More than 2 essentially means "Beyond an input and an output" right?

Often, more than 0 hidden layers. The input "layer" is sometimes considered an actual layer, and sometimes not.

twilit tundra Aug 20, 2023, 7:12 PM

#

My own definition is that deep learning is when the model is able to learn features without you having to design them

slim bone Aug 20, 2023, 7:12 PM

#

From my experience as a beginner, beginners don't often think about the input and output layers as a layer

iron basalt Aug 20, 2023, 7:12 PM

#

Ultimately, it's a buzzterm.

slim bone Aug 20, 2023, 7:13 PM

#

Is Machine Learning a buzzterm as well?

slim bone Aug 20, 2023, 7:13 PM

#

twilit tundra My own definition is that deep learning is when the model is able to learn featu...

Can't that be applied to other ML techniques as well?