#data-science-and-ml

verbal oar
#

I'm watching it currently, and it's about project lifecycle, descriptive stats, roadmap and so on

rich moth
#

In the main image, it's like 135-180 degrees is the Goldilocks zone. What these datasets seem to have in common in most cases is that they're "evolved" information systems that are subject to selection pressure. They represent stable, persistent patterns that have survived over time, and they're in the optimal learnability zone: complex enough to be interesting, simple enough to process.

Like financial markets evolved through economic selection pressures, biological sequences with 4 billion years of evolutionary optimization, language patterns shaped by evolution and communication efficiency, image structures, time series. You get it.

Then you've got some outliers, like artificially constructed data, pathological cases, or very recent/unfiltered information, but it's all the wrong angle.

87.1% cross domain classification accuracy baby, the pattern is real, not random

tepid garnet
#

i would be thankful

gloomy sun
#

Hi

tepid garnet
rich moth
lapis flax
#

Anyone knowledgeable on how to use the subset dataloader in pytorch to just get three classes from the fashionMNIST dataset? I’ve been running around in the documentation and trying things but nothing is clear, and the best way I’ve found is someone else’s method that directly uses the full dataset from its github repo and more ‘manually’ extracts the three classes i’m interested in.

#

Re: the subset dataloader, I would want to use a boolean mask (or whatever equivalent form of indexing) to just grab all instances of T-shirts, coats, and shirts, and then do my thing from there
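
A minimal sketch of that mask-then-`Subset` idea. It assumes the usual FashionMNIST label ids (0 = T-shirt/top, 4 = Coat, 6 = Shirt) and uses a small synthetic `TensorDataset` as a stand-in so it runs without a download; with the torchvision dataset the mask would be built from `dataset.targets` instead.

```py
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Stand-in for FashionMNIST: 1000 fake 28x28 images with labels 0..9.
# With torchvision, `targets` would come from `dataset.targets`.
torch.manual_seed(0)
images = torch.rand(1000, 1, 28, 28)
targets = torch.randint(0, 10, (1000,))
dataset = TensorDataset(images, targets)

# Assumed FashionMNIST label ids: 0 = T-shirt/top, 4 = Coat, 6 = Shirt.
wanted = torch.tensor([0, 4, 6])

# Boolean mask over the labels, then the indices where it is True.
mask = torch.isin(targets, wanted)
indices = torch.nonzero(mask).squeeze(1).tolist()

subset = Subset(dataset, indices)
loader = DataLoader(subset, batch_size=64, shuffle=True)

# Collect every label the loader yields to confirm the filtering.
seen = set()
for _, batch_labels in loader:
    seen.update(batch_labels.tolist())
```

The labels in the subset are still 0, 4 and 6, so they would probably need remapping to 0..2 before going into a three-class loss.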

stuck belfry
#

Hey gang, How do i practice data science? I wanna invest my time into learning numpy and sci-py and matplotlib but i have nothing to apply myself towards or any data to mess with, any pointers?

serene scaffold
stuck belfry
# serene scaffold the most important thing is that you don't try to learn in terms of libraries. i...

thank you for your response, but knowing this i'm not really sure of where else to go from here, do you suggest a data science book like the one openstax offers? https://openstax.org/details/books/principles-data-science

OpenStax offers free college textbooks for all types of students, making education accessible & affordable for everyone. Browse our list of available subjects!

serene scaffold
#

!resources data science

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

serene scaffold
#

@stuck belfry you can look at this curated list ^

stuck belfry
#

wow this looks sweet, thank you again

serene scaffold
#

also, as a reality check, keep in mind that it's very unlikely that you'd be able to get a job in data science without a degree, so if you're trying to work in this space, you need to do that.

stuck belfry
serene scaffold
stuck belfry
#

i'm going to uni starting this august though, gives me time to refine myself before i go into the workforce with data science

#

or whatever else there might be, i think stuff like numpy is foundational to whatever other applications python has in mathematics as a whole

serene scaffold
#

"data science python" is like its own dialect of python, and many of the distinguishing features of that dialect come from numpy.

silk hull
#

frfr

#

python is such a diverse language

serene scaffold
silk hull
#

hello

#

Im learning data science w python and just data science in general

#

it's soo cool

stuck belfry
glacial root
twin relic
#

Hi, how would you advise someone who wants to be an AI Engineer and is considering switching from a full-stack developer? And how would you recommend building the resume to land full-time jobs?

twin relic
frigid sonnet
#

Hey guys

verbal oar
#

from how it looks, is it using R with Python, or just R or Python?

#

so should I focus more on python or R?

jaunty helm
#

(and the former's not even being updated anymore)

verbal oar
#

thanks, makes sense

#

R and Python are both popular programming languages for data analysis, but they serve different purposes. R is primarily focused on statistical analysis and data visualization, while Python is a general-purpose language with a broader range of applications, including web development and machine learning. Python is often considered easier to learn and more versatile, while R is favored for its specialized statistical capabilities.

#

not exactly true, R is for ML too

#

source google search (probably gemini)

jaunty helm
zinc patrol
#

guys halp me

#

is my network dying?

#
import numpy as np
import nnfs
from nnfs.datasets import spiral_data

nnfs.init()

np.random.seed(0)

# spiral_data(samples, classes): 100 points per class, 3 classes
X, y = spiral_data(100, 3)

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        # defining layers
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    def forward(self, inputs):
        # dot product
        self.output = np.dot(inputs, self.weights) + self.biases

class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)


layer1 = Layer_Dense(2, 5)
#layer2 = Layer_Dense(5, 2)

activation1 = Activation_ReLU()

layer1.forward(X)
print(f"layer output: {layer1.output}")

activation1.forward(layer1.output)
print(f"layer output after ReLU: {activation1.output}")
#layer2.forward(layer1.output)

#print(layer2.output) # should be [[ 0.148296   -0.08397602] [ 0.14100315 -0.01340469] [ 0.20124979 -0.07290616]]

serene scaffold
#

Neural networks don't "die"

zinc patrol
#

output -

```
   0.00000000e+00]
 [-8.35815910e-04 -7.90404272e-04 -1.33452227e-03  4.65504505e-04
   4.56846210e-05]
 [-2.39994470e-03  5.93469958e-05 -2.24808278e-03  2.03573116e-04
   6.10024377e-04]
 ...
 [ 1.13291524e-01 -1.89262271e-01 -2.06855070e-02  8.11079666e-02
  -6.71350807e-02]
 [ 1.34588361e-01 -1.43197834e-01  3.09493970e-02  5.66337556e-02
  -6.29687458e-02]
 [ 1.07817926e-01 -2.00809643e-01 -3.37579325e-02  8.72561932e-02
  -6.81458861e-02]]
layer output after ReLU: [[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 4.65504505e-04
  4.56846210e-05]
 [0.00000000e+00 5.93469958e-05 0.00000000e+00 2.03573116e-04
  6.10024377e-04]
 ...
 [1.13291524e-01 0.00000000e+00 0.00000000e+00 8.11079666e-02
  0.00000000e+00]
 [1.34588361e-01 0.00000000e+00 3.09493970e-02 5.66337556e-02
  0.00000000e+00]
 [1.07817926e-01 0.00000000e+00 0.00000000e+00 8.72561932e-02
  0.00000000e+00]]
```
#

thats a scary amount of zeros im seeing there

#

oh STELERCUS

#

boy am i glad to see you

serene scaffold
#

why?

#

I'm a bastard.

#

@zinc patrol why do you think so many of the outputs end up being 0?

zinc patrol
#

i think its because of the ReLU, but im not sure.
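
One way to check that, as a sketch with made-up inputs rather than the spiral data: with zero-mean random weights roughly half of the pre-activation values come out negative, and ReLU maps every one of those to 0, so a lot of zeros is what you'd expect from an untrained layer.

```py
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins: 300 two-feature inputs and a 2 -> 5 dense layer.
inputs = rng.standard_normal((300, 2))
weights = 0.10 * rng.standard_normal((2, 5))
biases = np.zeros((1, 5))

pre_activation = inputs @ weights + biases
post_activation = np.maximum(0, pre_activation)

# ReLU zeroes exactly the entries that were <= 0 before activation.
zero_fraction = np.mean(post_activation == 0)
negative_fraction = np.mean(pre_activation <= 0)
print(f"fraction of zeros after ReLU: {zero_fraction:.2f}")
```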

serene scaffold
#

you might try a different activation function.

zinc patrol
#

so i WAS right

weary timber
#

hello guyssss

zinc patrol
#

TAKE THAT RELU

weary timber
#

discord still banned in turkey 😢

zinc patrol
#

oops sorry, got a bit carried away

weary timber
#

use sigmoid

#

if youre doing mnist

zinc patrol
zinc patrol
#

im using nnfs

weary timber
zinc patrol
#

but thanks for the idea

weary timber
#

whats that

zinc patrol
#

i think ill use hugging face instead tho

zinc patrol
weary timber
#

what is nnfs bro

#

the full name

jaunty helm
zinc patrol
zinc patrol
weary timber
jaunty helm
#

sigmoid fell out of favor in the first place because of vanishing gradients
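
Rough arithmetic behind that, as a sketch: the sigmoid's derivative is s(x) * (1 - s(x)), which never exceeds 0.25, so the gradient picks up a factor of at most 0.25 per sigmoid layer. Ignoring the weight matrices here is a simplifying assumption.

```py
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    s = sigmoid(x)
    return s * (1.0 - s)

# The derivative peaks at x = 0 with value 0.25.
peak = sigmoid_grad(0.0)

# Best-case gradient scale after passing back through n sigmoid layers,
# ignoring the weight matrices (a simplifying assumption).
for n_layers in (1, 5, 10, 20):
    print(n_layers, peak ** n_layers)
```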

river cape
#

hi guys, so I want to build an AI website builder, where a user gives a prompt and the AI responds with the necessary code. I want to use open-source LLMs that are good at code generation. Could y'all suggest some LLMs that are good for this purpose, and also how many parameters they have?

verbal oar
#

you can visualize instead of staring at numbers

verbal oar
#

not comfortable way currently

#

I assume you just plot the function's output after ReLU is applied

jaunty helm
river cape
jaunty helm
#

use a quantized version so you can fit it in vram

river cape
#

Does q4 help?

jaunty helm
# river cape Does q4 help?

if you can stand 10+ seconds per token generation (and potentially longer), then sure
use the GGUF format so you can put your ram to use

river cape
jaunty helm
#
  • some more for the context
  • you probably don't want to use Q4_0 nor Q4_1 but something like Q4_K_M, which technically is higher than 4 bits on average
agile cobalt
#

you also need to take into consideration the memory required for the inputs and outputs though, which varies depending on the context window size

for generating an entire website you'd need a pretty large output
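
A back-of-the-envelope sketch for the weights alone (parameters times bits per weight, divided by 8 for bytes). The 4.8 bits per weight used for a Q4_K_M-style quant is an illustrative assumption, not an exact format spec, and the context/KV-cache memory mentioned above comes on top of this.

```py
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

# Bits-per-weight values are illustrative assumptions, not exact format specs.
for name, n_params in [("4B", 4e9), ("15B", 15e9), ("32B", 32e9)]:
    fp16 = weight_memory_gib(n_params, 16)
    q4 = weight_memory_gib(n_params, 4.8)
    print(f"{name}: fp16 ~ {fp16:.1f} GiB, ~4.8-bit ~ {q4:.1f} GiB")
```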

river cape
jaunty helm
river cape
#

So it's not possible to get good results with an open-source LLM?

#

Considering my architecture

jaunty helm
river cape
jaunty helm
jaunty helm
#

like I wouldn't be surprised if Qwen 3 4b beats wizardcoder 15b at this point

river cape
#

The recent one would be qwen and deepseek?

jaunty helm
#

llama4 kinda flopped from what I hear

agile cobalt
#

-# well, might be possible with a ridiculously bad performance, but the size is just absurd

river cape
agile cobalt
#

over 100B total parameters

jaunty helm
river cape
#

it says likely too large

#

Does it mean I don't have enough disk space, or I don't have enough RAM to run it?

jaunty helm
agile cobalt
river cape
river cape
jaunty helm
#

one of the strongest contenders for "low" param coding is honestly probably still QwQ 32b

jaunty helm
river cape
jaunty helm
river cape
heavy crow
#

are hard edges hard for 1D convolutions? i.e. when working with (simulated) LiDAR, the data looks like the attached picture. It is rather high resolution but very sharp. (also attached a closeup)

green pilot
#

Hey, wondering if anyone knows whether it's faster to do simple data cleaning with bash on the command line or with pandas? It's looking like Unix tools might be faster, but wondering if anyone has a suggestion, as right now I'm using chunks with the chunk size set to 500_000 and my large files are taking a long time to clean. I'm just removing the header and removing ^ symbols.
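
For reference, a sketch of the same cleanup done as a plain line-by-line stream in Python, which skips CSV parsing entirely. The file names and contents are hypothetical, and no timing against bash or pandas was measured here.

```py
import os
import tempfile

def clean_file(src_path: str, dst_path: str) -> int:
    """Copy src to dst, skipping the header line and removing '^' characters.

    Streams line by line, so memory use stays flat regardless of file size.
    Returns the number of data lines written.
    """
    written = 0
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        next(src, None)  # skip the header (no error on an empty file)
        for line in src:
            dst.write(line.replace("^", ""))
            written += 1
    return written

# Tiny demo on a temporary file (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "raw.csv")
    dst = os.path.join(tmp, "clean.txt")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write("id,name\n1,^alice^\n2,bob^\n")
    n_lines = clean_file(src, dst)
    with open(dst, encoding="utf-8", newline="") as f:
        cleaned = f.read()
```

The shell equivalent would be something like `tail -n +2 raw.csv | tr -d '^' > clean.txt`.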

jaunty helm
heavy crow
#

just significantly making it harder for the model to converge

#

or requiring more parameters / larger kernel sizes to capture meaningful features

#

im trying to overfit on this "clean" simulated data before moving to realworld data but am running into real problems fitting the model

jaunty helm
jaunty helm
green pilot
#

Yeah, I was thinking of running something like that. I would prefer to keep the pipeline all in Python, but I think running it like that might be faster. Right now I am pulling from an AWS bucket as CSV -> converting to txt -> cleaning with bash -> then gzip and scp to a new server -> unzip and load into a Netezza server. I really wish I could simplify this pipeline 😂😂😂

heavy crow
jaunty helm
#

so the kernels could see more of the sequence at once without increasing the parameter count by too much

heavy crow
#

Good idea

lapis flax
heavy crow
# jaunty helm maybe you could try high dilation since you said it's high resolution?

Wow. I was already using dilation, but only with a factor of 2-6 or so. Looking at the data, my features were way larger! (60 datapoints or so for one "feature" / peak or whatever). I've downsampled by a factor of 10 because I really don't need the resolution for these tests, and it's training much better! Now the dilation actually covers the features.
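
A small sketch of why that helps: for stacked stride-1 1D convolutions the receptive field is 1 plus the sum of (kernel size - 1) * dilation over the layers. The layer stacks below are hypothetical, not the actual model.

```py
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in input samples) of stacked stride-1 1D convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Hypothetical stacks: four layers with kernel size 3.
small = receptive_field([3, 3, 3, 3], [1, 2, 4, 6])     # modest dilation
large = receptive_field([3, 3, 3, 3], [1, 4, 16, 32])   # more aggressive dilation
print(small, large)
```

With a feature spanning about 60 samples, the first stack never sees a whole feature at once, while downsampling by 10 shrinks the feature to about 6 samples, which fits easily.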

heavy crow
#

@jaunty helm Went from this

#

ignore the wonky GT data distributions. Thats from my simulation.

#

all that with under 300k params 🙂 definitely viable and a solvable problem i can invest some more time in.

woven prairie
#

If i want to learn maths and stats for data science which is the best option

wintry pagoda
#

I want to create an AI personal assistant using python. I asked Chat gpt to draft me an outline for the process but I am still lost. Where do I start?

dusty forge
#

👋 Hey everyone! Imagine trying to understand healthcare data privacy rules, but it feels like wandering in a confusing forest. That’s why I built MediRAG Guard.
MediRAG Guard is special because It doesn’t just search for keywords, it understands how pieces of information are connected using a unique hierarchical Context Tree. It uses Python, Groq, LangChain and ChromaDB. This helps it give you clearer, more accurate answers.
It’s like having a guide in the forest, showing you the way! 🌳
Check out the demo and see how it works: https://github.com/pr0mila/MediRag-Guard

GitHub

MediRAG Guard: A RAG Proof of Concept that delivers comprehensive, context-aware insights on healthcare data privacy through a novel knowledge tree. - pr0mila/MediRag-Guard

rich moth
lapis flax
opaque condor
#

What's the next step after convolutional nets?

lapis flax
#

here is the meat of the network

#

here are the initializations for the weight and bias terms

serene scaffold
#

@lapis flax it's easier for people when you give all the code as text. not as a screenshot.

#

!code

arctic wedgeBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

opaque condor
#

I'm working on combination on their own networks but what's after convolution?

lapis flax
opaque condor
#

Just in general cuz I mean a convolutional neural network I know how I can give it different image I just have to put in the folder name or at least the folders with their labeled and then train until there's a specific accuracy that I'm loving granted I haven't implemented a threshold yet I've been meaning to get that done but I've been busy planning a few hundred things

rich moth
rich moth
#

Does that help?

lapis flax
# rich moth

yes this exactly is what i was wondering about

#

im away from comp but will take a look later, do you know if this is covered in any of the torch documentation tutorials?

#

either way, thank you so much!!

opaque condor
lapis flax
opaque condor
#

So audio or video labeler

halcyon fog
#

what would be the best way to detect an image? im thinking of using cv2 but since the object im trying to detect is a card with text and no images and it seems to detect based on noise, idk if it will work

agile cobalt
#

depends on what you're trying to detect?
some OCR engines can be used to identify text bounding boxes

halcyon fog
#

its like, a bunch of these things, im making an ai that plays a roblox game for me, im trying to win a bet with my friends

#

the issue im running into is that they all look the exact same, except for different colours and text

red shadow
#

hello, can someone please help me with the coding🥺 it's more on django and type reaact🥺

grand minnow
red shadow
#

thank youu! i appreciate that🫶

#

i will do it rn

glacial root
#

would a custom image processing library (kind of like PIL) be a good project for a computer vision resume

limpid zenith
void stone
manic lion
#

Does machine learning use Python? i am master yoda

#

or machine learning is pure math

void stone
#

From my understanding so far, machine learning is a combination of math (especially statistics and linear algebra) and programming (popular languages include python, c++ that you will combine with R, sql, Scikit-Learn, etc.)

serene scaffold
#

you mixed up a few things that are orthogonal. R is a programming language for data analysis. SQL is a query language for tabular data. scikit-learn is a python library for statistical machine learning (ie not neural networks).

verbal oar
#

but in scikit-learn you have perceptron

calm thicket
#

sklearn implements very basic neural networks, but that's it

void stone
orchid moat
#

is Shiny from Posit useful for a beginner?

agile cobalt
#

iirc it's popular in R, but not so much for Python

there are Streamlit, Gradio, Dash and many other similar libraries that have been widely used in the Python ecosystem for longer, but I guess Shiny also works

#

arguably even Marimo could count as an alternative for it

There are a lot of options, if you want to make a dashboard or extremely simple web app they are all useful regardless of your level of experience, which one you pick is mostly a personal preference

serene grail
#

It seems like it's all about deep neural networks nowadays, are there any tasks where something else is used for the state of the art?

lapis flax
#

i mean, large language models are sorta what a lot of researchers have turned their attention to

serene scaffold
lapis flax
#

that’s definitely true, perhaps i should specify that in my little corner of the machine learning and ai world, LLM’s are like the ‘hot new thing’ that a lot of people are thinking about and playing around with

verbal oar
#

attention pun not intended

rich moth
#

wrong channel sorry :\

royal talon
#

what are the most important topics in ML/data science?

serene scaffold
rich moth
#

You guys wanna see some cool images. I finally got the validation suite done for UCF..

#

93.1% on cross-domain classification

hexed maple
#

guys where can i find pretrained LSTM RNN for time series data

opaque sphinx
#

for self learners of data science, where do you guys go online to learn (free materials).
Been wanting to go into this hence choosing this as an elective course for my degree, but after finishing the course I realize I still got so much more to learn as when I go through other people's project I barely understood anything, so wanna go into it more on my own but courses on coursera and edx are not free

serene grail
verbal oar
#

I see statquest channel is statquest with (name of person which I forgot)?

serene grail
jaunty helm
serene grail
opaque sphinx
jaunty helm
serene grail
opaque sphinx
#

Damn ok will take a look

zinc marsh
#

Highlights from the latest #nvidia keynote at COMPUTEX 2025. Topics include @NVIDIA's MASSIVE new Blackwell Ultra GPUs, new products like DGX Spark, DGX Station, and RTX PRO Server, and how they'll power generative AI models like #chatgpt by #openai and #deepseek R1, reshaping artificial intelligence and computing as we know it. Based on all the...

serene grail
opaque sphinx
#

My uni didn’t teach stuff like time series, MLOps, NLP and so on; I saw these things on forums and I have no idea how they work

#

Or even they existed

opaque sphinx
real venture
#

is there a subreddit to promote and sell my python app as crypto
i made an AIO PC usage tracker app with an AI code editor called Cursor

#

sorry if its wrong channel...

cedar tusk
#

no one teaches databases properly, no one teaches any kind of technologies that are used heavily (docker for example)

#

schools are obsolete af

#

no one teaches taxes, no one teaches insurances...

#

then when people exit the schools they gotta find these all by themselves, whats the point of schooling then? Is shakespeare more important than how to live in the society?

#

bah

#

rant over

jaunty helm
verbal oar
#

In the past I had a course in machine learning, but it was in R, not Python; I don't know why that was the decision for the uni course

cedar tusk
#

its fine if thats the case

verbal oar
#

I suppose the teacher has knowledge of R

cedar tusk
#

the algos dont change

cedar tusk
verbal oar
#

theoretical and practical (labs)

cedar tusk
#

i dk man i feel like suffocating in my current masters degree

#

all these theories, no usecases whatsoever majority of the time

verbal oar
#

yes they teach without context

fickle shale
#

I can buy only one; which one should I buy first?
Hands-On Large Language Models: Language Understanding and Generation
or
AI Engineering: Building Applications with Foundation Models ?

opaque condor
#

Hello could I please have tips on training a small language model

verbal oar
#

can I have custom theme instead of default light theme in jupyter notebook?

#

dont like these green with gray this is ugly for me

#

dont know vs code with jupyter notebook maybe

#

what color scheme do you have for ai/ml?

#

I'm talking about readability and beauty also

#

ok first question is googleable rather

cedar tusk
#

infinitely better

wet dome
#

Can anyone recommend a machine learning book for a beginner? I'm thinking about either Introduction to Statistical Learning or Machine Learning with Python

zinc patrol
#

hey guys

zinc patrol
zinc patrol
#

jupyter is too overrated

zinc patrol
fair solar
#

i remember wanting to read nnfs 3 years ago

#

except my uni covered ML as a course the next sem and NN from scratch in python was literally an assignment

#

feel the maths/algorithm is more important than "python/implementation"
as long as you're decently confident with one programming language, you'll be able to do it from scratch

odd meteor
odd meteor
# cedar tusk i dk man i feel like suffocating in my current masters degree

You just have to brave the storm. Find ways to make it fun. I recently saw a video on YouTube where a guy was complaining about the same thing.

He enrolled in an AI master's program in Germany thinking he'd be doing neural nets and all that fun stuff, but his first semester was just filled with mathematics and statistical proofs 😅

You must combine the applied side of ML (perhaps by learning and building cool stuff on your own) with the theoretical part that's mostly covered in technischule.

I think it's kinda cool in Germany cos they have Technical Universities (more suited for people interested in research & PhDs) and Schools of Applied Science (for people interested in the engineering part of ML)

odd meteor
# opaque condor Training one from scratch

I'm going to presume you've sorted out the data collection part already or you have the means to achieve that.

Read this https://lelapa.ai/inkubalm-a-small-language-model-for-low-resource-african-languages/

If you find it interesting and would like to dive more deep, then you should download their published research paper on arXiv. (That's literally all you need and perhaps with a little bit of googling here and there should you encounter any implementation they did that's not clear to you or you find it hard to reimplement.)

If you, however, consider this "a lot of work", then searching on YouTube might probably work best for you.

As AI practitioners, we are committed to forging an inclusive future through the power of AI. While AI holds the promise of […]

fallow coyote
zinc patrol
#

hai guys

lavish wraith
#

One thing confusing me: does data science also involve making dashboards??

timber blaze
#

what

lavish wraith
#

Do I need to learn all the dashboard tools like Plotly, Power BI and Tableau??

#

I don't understand where I should go next. I have learned pandas, NumPy and matplotlib; do I also need to learn Excel and SQL?

#

Does a data science job role also involve making dashboards or not??

timber blaze
#

what are you trying to say

opaque condor
odd meteor
opaque condor
#

What would you recommend what libraries should use for a small model?

karmic totem
#

Looking for a guy who knows programming, more specifically Python, in DM please, urgently

opaque condor
#

I know how to find somebody who knows how to work with Python. I'm just asking: in PyTorch, is there already a file for small language models, or at least medium-size models?

odd meteor
opaque condor
#

I know. Does PyTorch have anything for making a small language model, just like how it has datasets that you can download into your network?

serene scaffold
opaque condor
odd meteor
odd meteor
opaque condor
smoky swan
#

I have a project where you control your mouse with a gyroscope/accelerometer (like from your phone/smartwatch)
a kalman filter isn't enough, do yall suggest a neural network to filter noise?

smoky swan
#

and if you have any, I'd be more than happy to hear suggestions on resources regarding NNs, I'm still having to learn. more specifically I thought about a recurrent NN (long short-term memory or whatever)

desert oar
smoky swan
# desert oar Got an example? I assume you have something like xyz time series

I can provide my script if you want me to, but the basic idea is as follows:
you collect accelerometer (and gyroscope) data, about every 5ms, so @200hz.
as of right now, it's being pulled from a local server hosted by phyphox on my phone. it does a great job at refining sensor data (Kalman filter etc.), but it's not enough
data looks like this:

acceleration = [(x1, y1, z1),
                 (x2, y2, z2),
                 (...),
                 (xn, yn, zn)]
rotation = [(x1,y1,z1),...] # you get the drill...

I measure the data live, so I can collect however much I want

tropic ridge
#

Hey guys does anyone know where I can get the most recent annual mean cost of commodities like wheat,petrol and other energy sources per country?
I tried using our world in data and well there are a lot of incomplete data sets even for as far back as 2022

void geyser
#

I created ai anyone have some idea to improve it and make it better

jaunty helm
near fractal
#

I want to train an AI image model, where would I find a dataset for images?

north ether
#

i want to use csv files and train datasets using tensorflow and stuff
Which version of python and tensorflow should i use?

limpid zenith
north ether
#

because i know it gets error

#

version not matches etc

limpid zenith
#

I wouldn't suggest Tensorflow anymore, it's being phased out by Google. I would use Keras with Jax or Pytorch.
Or PyTorch manually or PyTorch with PyTorch Lightning these days

#

It's also easier for beginners, has a friendly API, and is the most used one in academia.

late blade
north ether
late blade
#

object detection or image classification ?

near fractal
limpid zenith
near fractal
#

unless you mean the data is expensive 😭

late blade
near fractal
#

n e v e r m i n d

limpid zenith
#

lmao

#

Deep Learning is EXPENSIVE

near fractal
#

😭 what did he expect me to do he literally noticed my interest in AI and he said "oh you should try to make your own"

late blade
#

you can make non generative image model cheaply and easily

near fractal
#

text generative?

late blade
#

nah

#

try yolo object detection

near fractal
#

yea i have a classification model

late blade
limpid zenith
#

Actually you can fine tune your own model, don't train your own from scratch...that's expensive.

Look into Parameter Efficient LoRA

#

and 4 Bit Quantization

limpid zenith
#

you can fine-tune on a single RTX 4090 with this and Unsloth

late blade
north ether
# limpid zenith ah okay, that's different then. It's still out of date. Anyways what was the e...
```
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.


Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors for some common causes and solutions.
If you need help, create an issue at https://github.com/tensorflow/tensorflow/issues and include the entire stack trace above this error message.
```
limpid zenith
#

it's just quantization, float precision afaik...never tried it with image models...but why wouldn't it work with image models

near fractal
# late blade improve on that it is easier than generative models

what about trend models? I'm decent at math for my age (I'm 16) and I've recently landed an internship at a quant firm. I want to see if I can make one that does low-frequency trading off momentum instead of high-frequency trading like most quant firms. I think I can do something to smooth out the noise from the stock data by averaging the price per time period and using something like a Fourier series to fit a sine approximation to the data

near fractal
late blade
limpid zenith
near fractal
#

does anyone here have experience in the stock market? does momentum trading provide reliable profit? even just 1% will be fine, I care more about a good Sharpe ratio than a good profit rate
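
For anyone following along, a sketch of the usual Sharpe ratio calculation: mean excess return divided by the standard deviation of returns, scaled by the square root of the number of periods per year. The daily returns are made up and 252 trading days per year is an assumed convention.

```py
import math
import statistics

def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns."""
    excess = [r - risk_free_per_period for r in returns]
    mean = statistics.mean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)

# Made-up daily returns, purely for illustration.
daily_returns = [0.001, -0.002, 0.003, 0.0005, -0.001, 0.002]
ratio = sharpe_ratio(daily_returns)
```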

late blade
north ether
limpid zenith
near fractal
# late blade i think adding llm for market news context will make better trades ?

news context is usually already reflected in stock movements. of course, human analysts are used by quant funds since most signals are exhausted the second they show up, but then human analysts also cost a lot. I'm assuming that LLMs won't be able to provide as strong a market context as human analysts, but it'll be a good starting point. however, news doesn't cover small firms as much, which is the market I'd want to target because the signals there last longer than 1/2 ms

north ether
# limpid zenith show full traceback
```
  File "C:\Users\laptops galaxy\Desktop\BSAI-5A\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 73, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\Users\laptops galaxy\Desktop\BSAI-5A\ANN\logon\gan_project.py", line 4, in <module>
    import tensorflow as tf
  File "C:\Users\laptops galaxy\Desktop\BSAI-5A\venv\Lib\site-packages\tensorflow\__init__.py", line 40, in <module>
    from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow  # pylint: disable=unused-import
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\laptops galaxy\Desktop\BSAI-5A\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 88, in <module>
    raise ImportError(
ImportError: Traceback (most recent call last):
  File "C:\Users\laptops galaxy\Desktop\BSAI-5A\venv\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 73, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.


Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors for some common causes and solutions.
If you need help, create an issue at https://github.com/tensorflow/tensorflow/issues and include the entire stack trace above this error message.
```
limpid zenith
#

what's your code?

north ether
arctic wedgeBOT
north ether
jaunty helm
#

as for deformed hands... I've heard that flux mostly fixed that issue? tbf I've never ran it cause I don't have the hardware

limpid zenith
north ether
odd meteor
tropic ridge
upper niche
#

hi

#

anyone here

serene scaffold
upper niche
#

I was about to ask: in an SVR make_pipeline with StandardScaler, does it only apply StandardScaler() to X, or does it also apply to y?
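
A sketch of the difference, on made-up data: transformers inside a scikit-learn pipeline are only applied to X, so y is left as-is. If y should be scaled too, one option is wrapping the pipeline in `TransformedTargetRegressor`, which scales y for fitting and un-scales the predictions.

```py
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Made-up regression data with a large-scale target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1000.0 * X[:, 0] + 50.0 * rng.normal(size=200)

# StandardScaler inside the pipeline is fitted on X only; y is untouched.
x_scaled_only = make_pipeline(StandardScaler(), SVR())

# Wrapping the pipeline scales y for fitting and un-scales the predictions.
xy_scaled = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR()),
    transformer=StandardScaler(),
)

x_scaled_only.fit(X, y)
xy_scaled.fit(X, y)
score_x_only = x_scaled_only.score(X, y)
score_xy = xy_scaled.score(X, y)
```

With default SVR settings and a target on the scale of thousands, the second model would be expected to fit much better on this toy data.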

#

and also, you sound more... cheerful

grand minnow
#

To anyone who's hosting their own LLM, what is the cheapest GPU VPS service available?

#

I want to try and build a live proof of concept

dusty forge
#

Introducing MediBeng-Whisper-Tiny! 🚀
We’ve fine-tuned OpenAI Whisper-Tiny on Hugging Face to 𝗧𝗥𝗔𝗡𝗦𝗟𝗔𝗧𝗘 code-switched Bengali-English speech into English. This helps improve doctor-patient transcription and makes clinical records more accurate. 🏥🎙️
Bonus: It’s an easy way to fine-tune Whisper for translation tasks!
Check out the repo for more details and try it out!

🔗 MediBeng-Whisper-Tiny GitHub Repo
: https://github.com/pr0mila/MediBeng-Whisper-Tiny
🔗 Hugging Face Model:hugging_fire: : https://huggingface.co/pr0mila-gh0sh/MediBeng-Whisper-Tiny
If you like it, don’t forget to give a ⭐ and 👍!

#AI #SpeechTranslation #HuggingFace #WhisperTiny #HealthTech #AudioToText

GitHub

MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, making it easier for analysis, record-keeping...

runic parcel
wintry pagoda
#

Can anyone tell me about "JSON handling for data exchange"??

grand minnow
lethal pendant
serene scaffold
warm verge
toxic mortar
#

Hello why do I get significantly different results between my Google Colab (GPU) and local (CPU) runs?

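both of them are using ```py
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(sublinear_tf=True, strip_accents='unicode', analyzer='char',
                              ngram_range=(2, 6), max_features=40_000)),
    ('clf', XGBClassifier(random_state=42, enable_categorical=True, device="gpu"))
])
le = LabelEncoder()
y_encoded = le.fit_transform(dataset['class'])

# train_e2e is my own helper (not shown here)
pipeline, y_test, y_pred = train_e2e(dataset['text'], y_encoded, pipeline)
```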

#

CPU colab results are the same as the CPU local ones

wooden sail
# toxic mortar CPU colab results are the same as the CPU local ones

do you have a local gpu to also compare with? what immediately comes to mind is that gpus by default use 32 bit floats while cpus tend to use 64. most of the time this isn't an issue, but sometimes it can be. you can also set the bit depth to 32 bits for your cpu runs and see if they now match the gpu result
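
A small sketch of the precision point above, assuming NumPy is available; `max_rounding_gap` is a hypothetical helper name. It only shows how far values move when rounded to 32-bit floats, which is one possible source of CPU/GPU differences and not a diagnosis of this particular pipeline.

```python
import numpy as np

def max_rounding_gap(values):
    """Largest absolute change when float64 values are rounded to float32."""
    as64 = np.asarray(values, dtype=np.float64)
    as32 = as64.astype(np.float32)
    # Cast back to float64 so the subtraction itself adds no extra rounding.
    return float(np.max(np.abs(as32.astype(np.float64) - as64)))

gap = max_rounding_gap([0.1, 0.2, 0.3])
print(gap)  # tiny but non-zero; such gaps can accumulate over many operations
```

Casting the CPU inputs to `np.float32` before training is one way to try the comparison suggested above.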

tropic ridge
fallow coyote
#

whats a good beginner book for linear algebra? Ive tried reading Linear Algebra Done Right but reading through it, i feel like im missing a lot of the context, as my only exposure to linear algebra is vectors and matrices. Ill probably go back to LADR once I know more linear algebra but i need the base knowledge first

tender egret
#

can someone help me? when i'm importing another python file in the same folder, its not getting imported. why?

serene scaffold
tender egret
wooden sail
wintry pagoda
verbal oar
#

why I see mode 0 50000 instead of mode 50000 in example on simplilearn demo? ignore 0?

#

or maybe this 0 has some meaning

#

hmm I assume its index 0

#

mode() of pandas dataframe

#

see docs?

#

I confusingly thought its bimodal with 2 values but its probably not

untold bloom
#

yes it's index 0

Always returns Series even if only one value is returned.
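
A quick sketch of that behaviour, assuming pandas is installed; the salary numbers are made up for illustration. The `0` in the output is the row label of the result, not a second mode.

```python
import pandas as pd

salaries = pd.Series([50000, 50000, 60000, 70000])
single = salaries.mode()                  # Series with one row, labelled 0
bimodal = pd.Series([1, 1, 2, 2]).mode()  # two rows, labelled 0 and 1

print(single)          # shows the index label 0 next to the value
print(single.iloc[0])  # just the mode itself
```

Taking `.iloc[0]` gives the plain value when only one mode is expected.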

fallow coyote
fallow coyote
# wooden sail probably a book on ml

Thats actually reminded me of a book which i downloaded called mathematics for ml or something like that (youll know the one as its commonly used)

urban canopy
#

I hear AI all the time. But I never hear about the people going into the weights and trying to figure out how it all works. Which is strange considering how common AI is.

iron basalt
#

This is because both linear algebra and graph theory can be applied in so many cases, and they work nicely together.

iron basalt
sturdy shadow
#

and yeah weights in what sense? the sentence is analogous to "people going into the transistors of electronics": AI covers a huge field, and not all models will even have weights

odd meteor
tender egret
odd meteor
urban canopy
odd meteor
tender egret
#

i fkin used all sort of gpts , ai , llms everything its not working

sturdy shadow
#

adjusting individual weights is basically just backpropagating but less accurately. would say its better to mess around with actual network architecture than spending any time on weights

tender egret
#

can someone teach me how to make a neural network like in detail but in an easy way

odd meteor
# tender egret train.py

I'd like to see a full traceback if possible. Meanwhile, inside your ai directory, add an empty __init__.py file there as well

urban canopy
sturdy shadow
#

sorry what are you talking about

serene grail
urban canopy
odd meteor
# tender egret i added it didnt worked

Hmmm... Just double check once again from VSCode terminal, that you're indeed inside the ai directory. If not, you have to cd into that directory for it to work.

small path
#

Ok so for the past 6 months i have been trying to work on a basic chatbot, but it's failing miserably

What i have tried:

Different types of model:

Chat generation
Rag
Seq2Seq

But none of that works, I'm taking my data from HF about 100k lines of conversational data

My latest model's params were 18M but yet it is still spewing gibberish

Any suggestions?

#

Now we wait ☕

#

Oh also I'm only using pytorch

tender egret
sturdy shadow
agile cobalt
# tender egret DONE EVERYTHING ITS NOT FKIN WORKING

if you don't want to debug nor cooperate with people trying to help, just install a package manager like uv and create a new project following its documentation, then copy/paste your code into the template it generates

odd meteor
tender egret
tender egret
elder flume
#

Guys

tender egret
elder flume
#

Can someone give me a path to be a data science? Bc I want to to this later

tender egret
#

vivek babu wsp

elder flume
limpid zenith
elder flume
#

Ok that's really cool, I like math

#

Thanks

tender egret
#

but calculus and linear algebra are hard asf

limpid zenith
#

wdym? probability is basically just combinatorics in disguise with kolmogorov axioms applied to scenarios

#

and combinatorics is difficult

tender egret
#

bro tryin be shashi tharoor

elder flume
#

I mean math

#

I'm 7.grader🥲

limpid zenith
elder flume
#

Ight, do u think I can learn many stuff of python and R in 2-3 years(also math) and I can get a job?

#

I can already understand the basics of python and it's very interesting

limpid zenith
#

i can't guarantee anything since the job market is really bad rn, but it's definitely a good asset to have learned

elder flume
#

Ok👍

runic parcel
#

Hey guys can anyone help me for finetuning my tesseract for getting better accuracy

#

really need the help

spring field
runic parcel
#

i will explain to you

#

So i was trying to finetune and got really bad results. I followed this doc: https://arcruz0.github.io/posts/finetuning-tess/
As it says, i created about 200 image files, each with its gt.txt value. But when i tested it, it gave me really bad answers: the output is fully wrong and it detects nothing.

#

Here are the loss curve graphs, which suggest the training itself went well

runic parcel
spring field
#

what text are you parsing? can you share your dataset? I'm afraid 200 images might not be enough

#

does your dataset contain data that is similar to that which you are trying to parse or is it just some random generic dataset? I can't tell how some of the images I saw in your dataset in your help post relate to the parsed text in your image above

glacial root
#

you gain nothing from crashing out on discord

hoary wigeon
#

Has anyone ever used Splink, the MoJ python library?

it's an entity-resolution / record deduplication tool. For e.g. you have a customer table with duplicate customer entries, i.e. distinct customer ids but a slight error in the name, with the other details the same or slightly changed (contact etc.)

#

I'm facing a performance-related issue

austere swift
#

Does anyone know a source where I can download the "Blackbird" dataset? (https://github.com/mit-aera/Blackbird-Dataset)

The domain mentioned in that repo is no longer up and I can't find any sources for the data anymore. There is a torrent for it on academictorrents but nobody is seeding it right now so it's just stalled on my client.

If that specific dataset can't be found do you guys know of a similar dataset to it? It's a flight perception dataset for drones that includes camera streams, rpm and imu measurements, and ground truths for pose estimation. I'd gladly look into anything that resembles this at a fairly large scale, the datasets i've found so far are very tiny (a few gb at most)

#

atp I might just start contacting the researchers to see if they have it

spring field
#

I do otherwise like the idea though, it seems cool and might turn out to be more accurate than a monkey throwing darts after all 😄

exotic star
#

Hi guys, i'll be spending 2 months in the USA this summer and i wanna make most of my time by getting into ML, i started learning pandas and numpy a bit so far. How would i learn the most while still learning everything throughly to actually learn it not just scratch on the surface?

weary crown
#

is the reason why we use MSE as opposed to MAE for gradient descent so we only have 1 global optimum?

serene scaffold
#

I guess it's not smoother per se but it's definitely more gradual
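
One way to see the "more gradual" point, as a sketch assuming NumPy; `mse_grad` and `mae_grad` are hypothetical names for the derivatives of the per-sample losses with respect to the prediction.

```python
import numpy as np

def mse_grad(pred, target):
    """Derivative of (pred - target)**2: shrinks as the error shrinks."""
    return 2.0 * (pred - target)

def mae_grad(pred, target):
    """Derivative of |pred - target|: same size for any non-zero error.
    It is undefined at exactly zero error; np.sign returns 0 there."""
    return np.sign(pred - target)

errors = np.array([2.0, 0.5, 0.01])
print(mse_grad(errors, 0.0))  # steps get smaller near the optimum
print(mae_grad(errors, 0.0))  # steps stay the same size
```

With MSE the update shrinks on its own near the minimum, while with MAE it keeps the same magnitude unless the learning rate is reduced.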

weary crown
#

ah okay that makes sense

#

oh another thing

#

with SGD right you are never reaching the global minimum as opposed to batch - thats fine obv for most purposes but my question is

#

does that mean
A. Every prediction is slightly slightly off
B. Some predictions are still spot on but most are off

#

pretty sure its A right but could the parameters accidentally line up and make some perfect predictions like in case B or nah

jaunty helm
iron basalt
#

(The square comes from the square in the Gaussian)

#

There is also a geometric understanding to this. Basically you are assuming that the probability density "cloud" is spherically symmetric, and from that the best estimate is the one closest in distance to the observed data, where this distance is the Euclidean distance (L2 (quadratic norm)).

wooden sail
#

the geometrical interpretation follows in a similar way, except we now think of more general "norm balls" instead of circular symmetry

#

these things would fall under "maximum likelihood estimation," in case you wanna take a deeper look yourself
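
The link between the noise assumption and the loss can be written out as a short derivation. The symbols ($f_\theta$, $\sigma$, $b$) are notation introduced here for illustration and the model is assumed to be $y_i = f_\theta(x_i) + \varepsilon_i$ with independent noise.

```latex
% Gaussian noise, \varepsilon_i \sim \mathcal{N}(0, \sigma^2):
-\log p(y \mid x, \theta)
  = \frac{1}{2\sigma^2} \sum_{i=1}^{n} \bigl(y_i - f_\theta(x_i)\bigr)^2
    + \frac{n}{2} \log\bigl(2\pi\sigma^2\bigr)

% Laplace noise, p(\varepsilon) = \frac{1}{2b}\, e^{-|\varepsilon| / b}:
-\log p(y \mid x, \theta)
  = \frac{1}{b} \sum_{i=1}^{n} \bigl|y_i - f_\theta(x_i)\bigr|
    + n \log(2b)
```

Minimising the first expression over $\theta$ is the same as minimising the squared error (L2), and minimising the second is the same as minimising the absolute error (L1), since the remaining terms do not depend on $\theta$.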

remote gulch
#

I guys i am new to data science and i want to build an nlp project for my portfolio which is not too generic but also give me somethings to learn.

#

Can i please get resources related to that?

#

That's nice

#

its very creative ngl

#

how can i build it?

#

what tools to use?

#

okok

#

can i get your receipts?

#

I have never done too much shopping😭

#

ok makes sense

#

well i will start shopping then😂

#

Yeah will do that

#

Dm maybe?

serene scaffold
#

No--people need to buy it themselves

primal tulip
serene scaffold
#

it's disrespectful to the authors to send people free copies of their work

primal tulip
serene scaffold
primal tulip
#

I'm messing a bit with you. No ill intention, just pulling your leg about your profile pic. I worked at a legal tech company that helped fight copyright infringement; sharing something I legally bought is not wrong. It's pretty much a grey area, that's why piracy is really hard to punish, especially in cases where no money transaction is involved

burnt hearth
#

Hello, I am new to python. I am building an audio-to-text model using openai/whisper-large-v3, but my audio is 10 min or longer, so processing takes a very long time.
This runs on my local system. Can anybody tell me if I can make it faster?

primal tulip
#

Sorry if this is getting sidetracked. @serene scaffold already helped me answer my question. I'll keep my unrelated posts to a minimum.

@final kiln both. Either helping companies, big or small or individuals that suffered from it.

serene scaffold
runic sundial
#

Hey everyone. Just got in to a company that has an ongoing AI Project. Would love to have some advice on what I should focus on learning, but based on the direction of the project it seems we'll be automating a lot of stuff and probably develop some more projects for better quality assurance, do you think learning some Machine Learning or Data Science would help me keep up with the team? Also, are there any courses that you would recommend I take just so I can dip my fingers and pretty much understand the bigger picture of this project that I'll become a part of.

burnt hearth
primal tulip
woven prairie
#

See I have purchased open ai key

#

I am using their gpt-4o model. Can anyone tell me what the context window size of this model is?

primal tulip
woven prairie
#

Yes

primal tulip
#

From 1 token, to 16384. Default is 2048 if undefined

woven prairie
#

2048 tokens by default

woven prairie
#

What's the truth range

jaunty helm
woven prairie
#

1 lakh 28 thousand is input and output is 16 thousand

jaunty helm
#

note that the quality of responses likely degrades before hitting the 128k

woven prairie
#

Context window is the total of input and output

primal tulip
jaunty helm
primal tulip
woven prairie
#

What parameter in payload sets the context window

burnt hearth
#

Because it is a huge model. It is not only audio to text,
it is a complete transcription model, which I am using as an open source model

primal tulip
agile cobalt
primal tulip
woven prairie
#

Thanks for your response

#

One more thing , how can I improve prompt

agile cobalt
#

write better

burnt hearth
primal tulip
jaunty helm
woven prairie
#

Ok, thanks. I am going to read it.

primal tulip
# burnt hearth yaa I am trying to use hugging face, NVIDIA graphic card

Do you want to do Text To Speech (TTS)?
Also, you also NEED to know about your graphics card model, in some cases you have to download a specific driver to be able to use certain AIs.

I'm assuming you use linux. I found this that you can either run it on a raspberry pi 4 or at linux with python.
https://github.com/rhasspy/piper

This is amazingly fast and lightweight, so I'd check it out if I were you. Also, if you're already at Huggingface looking at a certain model, read their documentation, that's better than anything else.

GitHub

A fast, local neural text to speech system. Contribute to rhasspy/piper development by creating an account on GitHub.

burnt hearth
# primal tulip Do you want to do Text To Speech (TTS)? Also, you also NEED to know about your ...

it's good but some of the languages i am looking for are not available. simply:
Audio to Text
Text to Transcript (any lang)
Transcript to Speech

that is my goal for now

I have a very long-term vision for this model but currently only need these steps

and yes, if you know a good model for translation let me know

I use Argos Translate for translation but it doesn't give the expected output compared with Google Translate

agile cobalt
burnt hearth
#

Like: audio in English, get the text and translate it into another lang, then use TTS for text to speech with the new translated text

agile cobalt
burnt hearth
#

once it's done i will work on voice cloning for good sound

agile cobalt
burnt hearth
#

okay but i am looking for one that includes hindi too

#

if any other model supports the hindi lang let me know

versed fractal
#

Is open router good for free api

#

@burnt hearth

jaunty helm
#

unless you put 10$ into it, then like 1000 / day (you don't have to spend it, just having it in)

versed fractal
#

Is 50 free calls a day good

jaunty helm
hollow pagoda
#

What iirc mean

jaunty helm
versed fractal
#

If I remember correctly

versed fractal
hollow pagoda
#

I be seeing it and not knowing

jaunty helm
versed fractal
#

What is 50 a day

jaunty helm
versed fractal
#

Only

#

Nah that's bad

#

What is a good free api website

agile cobalt
#

Mistral and Google have some pretty nice free tiers, just be aware that any data you send to free APIs is as good as public as far as privacy goes

versed fractal
#

Should I use Google gemini

jaunty helm
#

I can say that for mistral, you're highly unlikely to hit the limit as an individual

versed fractal
#

Or mistral

jaunty helm
#

the "limit" is like 1 billion tokens per month

versed fractal
#

I think mistral is the best

#

How much is a token btw

agile cobalt
versed fractal
#

Oh good

#

Is there Arabic (just asking)

agile cobalt
jaunty helm
agile cobalt
versed fractal
agile cobalt
versed fractal
#

I want something with more tokens

jaunty helm
versed fractal
agile cobalt
# versed fractal I want something with more tokens

you are billed per token, and after a certain point models tend to work worse the more tokens are in your prompt

in general it is best for the tokenizer to compress your inputs into as few tokens as possible

#

(even on a free API, it'll affect response time and may affect rate limits)

versed fractal
#

I think there isn't something good free

agile cobalt
#

Mistral and Google Gemini do have good free tiers (as long as you disregard data privacy)

versed fractal
#

They will use it to train models??

woven prairie
#

What the token size for open ai 3-4

agile cobalt
versed fractal
#

Yeah that's okay I wouldn't send something private to the ai anyway

#

I would use it

#

Should I make the ai for a phone or pc

agile cobalt
#

web, then it doesn't matter which device you are using it from

jaunty helm
versed fractal
#

Ok

#

So just want an opinion should I make my ai for PC or phone

#

For the first prototype before I start updating

#

I will use python backend for the model and the ai itself

sage gust
#

Does anybody here know how I can use my GPU when training with tensorflow? I'm a bit uneducated on setting this up myself and could use some help

toxic pilot
toxic pilot
#

tf.config.list_physical_devices('GPU')

jaunty helm
#

and, if you're not required to use tf, pytorch is a much better choice given that most of the attention that was on devving tf has moved on to jax

#

or heck, if you don't need a really low level api just use keras, works with whatever backend you like

sage gust
#

We're using Tensorflow in school and it is very frustrating

#

But that's just because I'm really bad at coding

#

Maybe you can help me

#

Im currently working on a project that should recognize a song's genre, classifying it into the 10 genres that GTZAN gives you

#

I trained my own model, and one using VGG16 as base model

#

When I test using a file from the training data, it always shows "pop" for my own model as genre, and it always shows "classical" for the VGG16 one

#

And now im stuck

#

mel spectrograms are the training data

#

224x224

#

yeah, first i tried by using the images that gtzan has in its dataset, but my models didnt train at all

#

so i just converted them myself using the wav files

toxic pilot
#

wait i was under the impression that jax was designed to optimize torch/tf/tensor operations

#

i read the message wrong lmao

#

i read it as someone was saying that jax was a drop in replacement for tf

toxic pilot
#

is it good?

#

jax i mean. did you see a non negligible performance boost

#

ah

#

what about usability?

#

better than TF is the lowest bar out there lmao

#

icl tf syntax is atrocious

#

hmm maybe i’ll check it out

#

pytorch is all right. its package management is brain damage tho

#

can’t believe i have to make sure all my versions are compatible when i’m using torch vision, torch audio and torch text

lime grove
#

$20/mo - $50/mo - $??/mo

agile cobalt
wet dome
#

Complete beginner to ml here, how do you know what model to use by just looking at a dataset?

#

I want to do some kaggle competitions to learn but how do I know what model to use, like a linear regression or something else

serene scaffold
#

The composition of the dataset doesn't necessarily dictate what you can do with it. So "I want to train a model on this dataset" is an incomplete goal. Train it to do what?

wet dome
#

Cause that's the simplest isn't it

serene scaffold
#

"a thing that a model does" would be something like "predict the value of a home given its square footage and number of rooms"

wet dome
#

OK so for that example you just gave, from that what model would you use

serene scaffold
#

xgboost, maybe?
I do natural language processing.

wet dome
#

Is this something you learn about when you learn ml, when to use said model?

serene scaffold
#

You develop a sense for it.

wet dome
#

Isn't nlp like coding in more human like language or something

serene scaffold
#

It's not about coding in human language. It's just "AI and ML as it pertains to human/natural language"

#

ChatGPT is an application of NLP

spring field
#

it's not "natural language programming", it's "natural language processing" 😄

coarse valve
#

Help me understand... I went to an ML presentation from a vendor today. Our team provided them with a lot of data. The step in their process so far was to build out an unsupervised model to cluster our client base. This is all paraphrasing, but essentially they did some feature engineering, hyperparameter testing, cleaning etc. They then tried k-means clustering with PCA, testing feature counts of, i think, 5, 10, and 20. I believe k-means was insufficient, resulting in ~800 clusters, so they tried DBSCAN + a centroid-distance-based regrouping and got it down to 30 clusters with... i forget the term but essentially (edit: silhouette score) a positive rate of .51. We were tasked with analysing the different clusters and assigning labels to them.

My question is... the only outcome from this model would be to then use those clusters/labels as "bucket" end points to classify new incoming data, correct? Some on my team (management) seem to think that this is the first step in further classifying our clients as buyers/not buyers etc. I tried to really drive home the point that what we define as labels WILL BE the classification we get... If I am completely off base or misunderstanding please let me know... im not smert

serene scaffold
#

You have to decide what significance each cluster has, if any

#

If you have a new instance that you want to classify, it belongs to whichever cluster it has the closest centroid to.
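
The nearest-centroid rule above can be sketched as follows, assuming NumPy and that each cluster is summarised by a centroid (true for k-means; for DBSCAN the centroids would have to be computed from the cluster members, which is an assumption here). The function name and the toy numbers are made up.

```python
import numpy as np

def assign_to_nearest_centroid(points, centroids):
    """Return, for each point, the index of the closest centroid."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # distances[i, j] = Euclidean distance from point i to centroid j
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

centroids = [[0.0, 0.0], [10.0, 10.0]]   # one row per existing cluster
new_clients = [[1.0, 2.0], [9.0, 8.5]]   # new, unlabelled instances
labels = assign_to_nearest_centroid(new_clients, centroids)
print(labels)
```

New data should go through the same scaling and PCA steps as the training data before distances are computed.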

deft goblet
#

Hey, I’m a CS grad from UPenn with a tech background. I just built and agent builder that lets you create any AI agent in 3 simple steps (takes about 3 minutes). The adds-on it takes care of the architecture, finding and connecting apis, and you get the full deployable code, turn it into an API, or export it to Lovable to try it out quickly.

If you want to give it a try:
https://search-dream-weaver-kit.lovable.app/

Lovable Generated Project

weary crown
#

so im working with a dataset right now

#

from a league of legends API
the data for one of the features appears to be quite off/inaccurate, but i checked the script used to get the data as well as the API docs, and it should work

#

so should i just assume its broken xd cuz i cant find anything online talking about the inaccuracy

lapis sequoia
#

Is there any ML Engineer here doing NLP work using Windows natively (no WSL, no dual-boot, no VM), and everything works fine? how did you get everything working smoothly?

glass jetty
#

Don't advertise here, see the #rules.

harsh verge
#

ello

#

i need help with an error in my code

#

basically im trying to make a loss function for my neural network

#

and idk how to add a forward pass here

#
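```py
import numpy as np

# assumed wrapper: the method below takes self, so it lives in a loss base class
class Loss:
    # calculates the data and regularization losses
    # given model output and ground truth values
    def calculate(self, output, y):

        # calculate sample losses (forward must be defined by a subclass)
        sample_losses = self.forward(output, y)

        # calculate mean loss
        data_loss = np.mean(sample_losses)

        # loss
        return data_loss
```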
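
A minimal sketch of one way to add the forward pass, assuming the method above sits in a base class and that each concrete loss subclasses it. `MeanSquaredError` is a hypothetical example; the real project may need a different loss, such as cross-entropy.

```python
import numpy as np

class Loss:
    def calculate(self, output, y):
        # forward() is supplied by the subclass and returns one loss per sample
        sample_losses = self.forward(output, y)
        return np.mean(sample_losses)

class MeanSquaredError(Loss):  # hypothetical subclass for illustration
    def forward(self, y_pred, y_true):
        # average the squared error over the output dimension of each sample
        return np.mean((y_true - y_pred) ** 2, axis=-1)

loss_fn = MeanSquaredError()
y_pred = np.array([[1.0, 2.0], [3.0, 4.0]])
y_true = np.array([[1.0, 0.0], [3.0, 2.0]])
loss = loss_fn.calculate(y_pred, y_true)
print(loss)
```

The base class never defines `forward` itself, so calling `calculate` on a bare `Loss()` raises an `AttributeError`.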
jaunty helm
#

bit of a weird one:
time series classification, I know my data is from int16 sensors, and I can see that some parts of it sit at exactly 32767 for an extended period of time
would you try "fixing" the time series somehow? or say ignore it and make a complementing series that's 1 if the data is clipped and 0 otherwise

wooden sail
jaunty helm
#

and there are no greater/lower values than 32767/-32768

wooden sail
#

mhm

#

but i guess the way you would treat it is different if the problem is clipping as opposed to just missing data

#

in either case though, it does make sense to mark the entries

jaunty helm
wooden sail
#

aha, now that's a very different question 😛

#

what is this measuring?

jaunty helm
#

the one that has the hover is the angular velocity during a hand movement in the x direction

jaunty helm
wooden sail
#

did you write the code that reads the sensor data? or using something found online?

#

phase unwrapping is not really simple if you don't have an accompanying quantity

jaunty helm
wooden sail
#

aha

#

what are the other curves?

jaunty helm
wooden sail
#

you can probably do some sort of fitting that requires a couple of derivatives to be continuous

#

should be doable with splines

#

that's like enforcing smooth motion

jaunty helm
wooden sail
#

do you know for a fact that the data is periodic? if that's a fair assumption, there are periodic splines you could try. but yeah, you can try masking out the clipped values and interpolating them back in with quadratic or cubic splines
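
The mask-and-interpolate idea can be sketched like this, assuming NumPy and SciPy. `repair_clipped` is a hypothetical helper and the sine wave is synthetic stand-in data, not the sensor recording discussed here.

```python
import numpy as np
from scipy.interpolate import CubicSpline

INT16_MIN, INT16_MAX = -32768, 32767

def repair_clipped(t, x):
    """Return (clip_flag, repaired): flag is 1 where x sits on an int16 limit,
    repaired has those samples replaced by a cubic spline through the rest."""
    x = np.asarray(x, dtype=float)
    clipped = (x >= INT16_MAX) | (x <= INT16_MIN)
    spline = CubicSpline(t[~clipped], x[~clipped])
    repaired = x.copy()
    repaired[clipped] = spline(t[clipped])
    return clipped.astype(np.int8), repaired

# Synthetic example: a sine wave that exceeds the int16 range.
t = np.linspace(0.0, 1.0, 200)
true_signal = 40000.0 * np.sin(2 * np.pi * t)
recorded = np.clip(np.round(true_signal), INT16_MIN, INT16_MAX).astype(np.int16)

flag, repaired = repair_clipped(t, recorded)
print(int(flag.sum()), "samples flagged as clipped")
```

The flag series can be kept as an extra input channel either way, so the classifier knows which samples were reconstructed.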

#

there are fancier things you can do, but i think this is a good starting point. maybe someone else has better suggestions

jaunty helm
#

alright cool, ty!

wooden sail
#

you could grab the differential equations of motion and relate the 3 components of linear and angular velocity to them, then have your network spit a solution to the differential equation

#

that'd be a more inverse kinematics approach, what people nowadays call "physics-informed neural network"

#

or if you have enough data, fr you can probably just mask out that data and train a black box network as is 😛

#

but splines are a very manageable first step and benchmark. polynomial good 💻 🐒

coarse valve
eager lance
#

hey guys where can i learn probability and statistics for data science?

fallow coyote
#

I've just found this ML project repository which has a ton of projects. I was wondering if itd be worthwile pinning it. Feel like itd help beginners like me who have understood the maths but not quite know how to use the tool effectively

woven prairie
#

Hi

#

I need some help in prompting

#

If anyone could help me , please mention me

arctic wedgeBOT
woven prairie
#

woven prairie
#

Can anyone help me ?

verbal oar
#

is simplilearn good or choose other channels and which?

#

about machine learning and data science

#

just want to watch just enough to have better understanding when doing some project

#

dont want to watch redundant videos, also dont want to watch the whole of yt about ml and data science

#

simplilearn has playlist with 160 videos about data science and approx 500 about ai

#

but it looks like they may be overlapping

woven prairie
verbal oar
#

free

woven prairie
#

Are u comfortable in hindi

verbal oar
#

no

woven prairie
#

Then I don't have any suggestions for you

verbal oar
#

i see its also outside of yt too

woven prairie
#

You can visit Krish Naik

verbal oar
#

I watched ml teach by doing but its not enough

#

from vizuara

#

I want some path like most important parts of machine learning and data science and then straight to practice project

#

or just start project and fill gaps

#

dont want to watch everything and not starting project

#

looks then like tutorial hell

woven prairie
#

Bro

#

I don't know about vizuara, but before using any algorithm like linear regression and others you must know the maths behind it, then you can enjoy learning

#

Simply calling the library isn't going to help you

steel spindle
#

Can someone explain pathfinding AI to me?

verbal oar
#

you have start and target node you find shortest path to reach target with dijkstra or a*

steel spindle
verbal oar
#

algorithms to find shortest path

unkempt apex
#

you can also try with pen and paper

verbal oar
#

dijkstra is easier, a* has h heuristic

#

dijkstra is taught in discrete math courses

#

a* seems like robotics games rather

#

yes you can do dijkstra with table

#

look also at pseudocodes of them

spring field
#

I'm not sure dijkstra is exactly easier, dijkstra is really just a specific heuristic for a* (essentially)
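
A compact sketch of that relationship using only the standard library: one search function where a zero heuristic gives Dijkstra's behaviour and any admissible, consistent heuristic gives A*. The graph and heuristic values are made up for illustration.

```python
import heapq

def a_star(graph, start, goal, heuristic=lambda node: 0):
    """Return (cost, path). graph maps node -> [(neighbor, weight), ...].
    With the default zero heuristic this behaves like Dijkstra."""
    # Heap entries: (cost so far + heuristic, cost so far, node, path taken)
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + heuristic(neighbor)
                heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
    return float("inf"), []

graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)], "D": []}
cost, path = a_star(graph, "A", "D")            # Dijkstra: no heuristic
estimate = {"A": 3, "B": 3, "C": 1, "D": 0}     # never overestimates the true cost
cost_h, path_h = a_star(graph, "A", "D", heuristic=estimate.get)
print(cost, path)
```

Both calls find the same shortest path; a good heuristic only changes how many nodes are explored along the way.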

toxic pilot
#

@final kiln sorry for ping, but im looking into jax rn. do you have any opinions between flax, vs equinox vs haiku for neural networks?

steel spindle
#

Thx

smoky swan
#

say i make an RNN to estimate velocity from acceleration that has the usual kinks you get from an accelerometer,
do i run it with a set value of past measurements (like 200) every time i get a new one, or do i just keep feeding it and look at the memory gate?
second one is much faster but i think my PC should be able to handle both, so im looking for the best runtime

also, just a regular RNN isn't enough, ive tried, even hand trained it, so im making an LSTM model

verbal oar
#

I meant when you know bfs then dijkstra is easier

opaque condor
#

How many photos do I need for each subject for a folder and do they have to be in separate folders or just one mixed up one?

tranquil juniper
#

I was wondering if someone was slightly or more experienced with AlphaZero's type of model and could help explain why when i use the code from the most searched youtube tutorial it works, but when i modify it to keras instead of pytorch it doesn't give me any errors and i think i understand the code, but then it always learns the wrong policy as when i test it for a given board state it doesn't give a high probability for the "correct" move that wins them the game. Anyone who could help or point me in the right direction? Here is the pytorch code github: https://github.com/foersterrobert/AlphaZeroFromScratch/blob/main/7.AlphaTweaks.ipynb and here is my code: https://codeshare.io/5X843z

wet dome
#

I've just been learning about linear regressions. Is there a reason we would use gradient descent to help find our coefficients for our model, instead of using the "normal equation" which to me looks like something we can just compute instead of having to iterate over our dataset
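
For comparison, here is a sketch of both approaches on synthetic data, assuming NumPy; the data, learning rate, and step count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # bias column + 2 features
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=n)

# Normal equation: solve (X^T X) w = X^T y in one step.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the mean squared error.
w_gd = np.zeros(3)
learning_rate = 0.1
for _ in range(2000):
    gradient = (2.0 / n) * X.T @ (X @ w_gd - y)
    w_gd -= learning_rate * gradient

print(w_closed)
print(w_gd)
```

On a small problem like this both land on the same coefficients. Solving the normal equation costs roughly the cube of the number of features and needs `X.T @ X` to be invertible, so gradient descent tends to win with many features or data that does not fit in memory, and it carries over to models that have no closed-form solution.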

lapis sequoia
#

What is the most used module in python for Data Science?

#

And is there any source I can learn Data Science from? like an Arabic source or an English one

magic pasture
cloud cosmos
#

Hey im trying to code a transformer and learn about ai coding does anyone have any good tips or things to study?

vocal cove
#

I'm trying to define a good cost to push fidelity to be exactly 1+0j.

def loss(
        mps: qtn.MatrixProductState,
        target_mps: qtn.MatrixProductState
    ) -> float:
    fidelity = mps @ target_mps.conj()
    return abs(fidelity - 1)**2

Need something that can get a bit more performance.

#

If you want to try it out (small code, so can just try on your side if you prefer) lmk.

quaint mulch
quaint mulch
verbal oar
#

so nn regressor does automatically linear regression?

#

what about generalized regression nn grnn is it still used, I remember I had about it on ml course

#

I mean I never met this outside of course, like only when googling about it

humble bone
#

this channel being about ai; if someone is up to the challenge, could you answer this question

#

how many nodes in the input, hidden and output layer, how is this determined ?

wooden sail
#

do you know how matrix multiplication is defined?

humble bone
#

col ^rows...?

#

col * rows

verbal oar
#

dot product

wooden sail
#

and so what is the restriction regarding the sizes of the columns and rows when you multiply two matrices

#

given a matrix A of size m x n and a matrix B of size p x q, if we want to calculate AB, what do we know about the sizes m, n, p, and q?

verbal oar
#

size of rows of A must equal size of columns of B or vice versa dont remember

wooden sail
#

(this is the answer to your question regarding the number of nodes in the input, output, etc)

humble bone
#

ohh ok think i got it

verbal oar
#

or it is size of cols of A must equal size of rows of B

#

thats why must transpose matrix

jaunty helm
#

I only know MLPRegressor which is in sklearn

verbal oar
#

no just generally

#

just some neural network regressor

humble bone
#

so input layer = 3, hidden layer = 2 and output layer = 1
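
A sketch of how those layer sizes turn into matrix shapes, assuming NumPy and a batch of 5 samples; the weights are random placeholders and biases and activations are left out to keep the focus on shapes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_hidden, n_output = 3, 2, 1

W1 = rng.normal(size=(n_input, n_hidden))   # maps 3 inputs -> 2 hidden nodes
W2 = rng.normal(size=(n_hidden, n_output))  # maps 2 hidden nodes -> 1 output

batch = rng.normal(size=(5, n_input))       # 5 samples, 3 features each
hidden = batch @ W1                         # (5, 3) @ (3, 2) -> (5, 2)
output = hidden @ W2                        # (5, 2) @ (2, 1) -> (5, 1)

print(hidden.shape, output.shape)
```

The inner sizes have to match at every product (columns of the left matrix equal rows of the right one), which is what fixes the input and output node counts; the hidden size is a free choice.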

weak oxide
#

I might as well ask here. Does any of you use Darts?

jaunty helm
# verbal oar no just generally

if you assume a nn has a width of 1 for simplicity, you can think of it kinda like input -> linreg -> activation -> linreg -> activation -> ... -> output
edit: well ig a width of 1 would make the inner layers really uninteresting, so just ignore that part
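
A rough NumPy sketch of that "linear map, activation, linear map" picture, using the 3 / 2 / 1 layer sizes mentioned earlier in the chat. The weights are random placeholders and `tanh` is just one possible activation, so this only illustrates the structure, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: 3 inputs -> 2 hidden units -> 1 output.
W1 = rng.normal(size=(3, 2))   # input -> hidden weights
b1 = np.zeros(2)
W2 = rng.normal(size=(2, 1))   # hidden -> output weights
b2 = np.zeros(1)

def forward(x):
    """Alternate linear maps with a nonlinearity, ending in a linear output."""
    h = np.tanh(x @ W1 + b1)   # linear map followed by an activation
    return h @ W2 + b2         # final linear map, no activation (regression style)

X = rng.normal(size=(5, 3))    # batch of 5 samples with 3 features each
y_hat = forward(X)
print(y_hat.shape)
```

Without the `tanh` in the middle, the two linear maps would collapse into a single linear regression, which is why the activation matters.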

verbal oar
#

interesting insight

weak oxide
#

I was choosing between TSLearn, GluonTS, and Darts for time series forecasting. I used to use manual forecasting, statsmodels, and pmdarima, but I needed a more unified package

#

Manual forecasting meaning like I build like a regressor from sklearn and plot the predictions against actual.
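
For anyone unfamiliar with the "manual" approach, a minimal sketch: build lag features from the series and fit a regressor on them. To keep it dependency-free this uses `np.linalg.lstsq` as a stand-in for an sklearn regressor, and the toy series and its coefficients are invented for illustration.

```python
import numpy as np

# Toy series generated by y[t] = 0.5*y[t-1] + 0.3*y[t-2] + 1 (made-up coefficients).
y = [0.0, 1.0]
for _ in range(40):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + 1.0)
y = np.array(y)

# Lag features: predict y[t] from y[t-1], y[t-2] and a constant term.
X = np.column_stack([y[1:-1], y[:-2], np.ones(len(y) - 2)])
target = y[2:]

# Ordinary least squares fit, standing in for an sklearn regressor's fit().
coef, *_ = np.linalg.lstsq(X, target, rcond=None)
pred = X @ coef

print(coef)
print(np.max(np.abs(pred - target)))
```

Plotting `pred` against `target` is then the "predictions against actual" step described above.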

verbal oar
#

can also look at some time series R package if you are allowed to use it

#

or if only allowed python then different story

jaunty helm
weak oxide
#

With R they have model time which is really nice

jaunty helm
#

like Nixtla is focused on forecasting while the others include more analysis utilities

verbal oar
#

wait you said forecasting not analysis ok I meant ts analysis then

weak oxide
#

It's why I joined this server because my colleagues are like pmdarima and manual forecasting all the way. And I'm thinking there got to be a better answer

jaunty helm
verbal oar
#

please not Excel 😂

weak oxide
jaunty helm
weak oxide
#

I should but I'm used to pandas already

#

On a personal level too

jaunty helm
weak oxide
#

Yikes

#

Oh well

jaunty helm
#

well ig technically a good chunk of the forecasting libs mentioned above (sktime, aeon, darts, nixtla, idk others) do include neural network methods
it's just that you can't alter the architecture as easily as say if you were to use pytorch

agile cobalt
weak oxide
#

It's just convenient for a lot of packages I use

jaunty helm
#

you have polars_df.to_pandas() and polars.from_pandas() to switch between them, if your package only takes pandas dataframes

weak oxide
#

Like Edgar tools is the best SEC parsing package and they only parse to pandas. Idk I guess I'll try to learn it

verbal oar
#

so transformers for ts forecasting is overkill for this case?

weak oxide
verbal oar
#

just ask generally

jaunty helm
verbal oar
#

oh ok

agile cobalt
#

polars does have official support for plugins though, which is pretty neat

weak oxide
#

How did I never hear about this? They hype up pandas ta with 130 technical indicators wtf

weak oxide
#

So my solution based on the chat seems to be Darts and Nixtla. I'll check both of them fully out

jaunty helm
weak oxide
weak oxide
desert oar
crimson jackal
#

Is it possible to compute the eigenvalues of a matrix symbolically using numpy?

#

Sympy is too slow for larger matrices.
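
For context: NumPy only works on numeric arrays, so it cannot return symbolic eigenvalues. If the symbols can be given numeric values, one option is to convert the SymPy matrix with `lambdify` and let NumPy do the eigenvalue step. A sketch with a made-up 2x2 matrix (the symbols `a`, `b` and the values plugged in are placeholders):

```python
import numpy as np
import sympy as sp

a, b = sp.symbols("a b")
M = sp.Matrix([[a, b], [b, a]])     # toy symmetric matrix, eigenvalues a - b and a + b

# Turn the symbolic matrix into a function that returns a NumPy array.
M_num = sp.lambdify((a, b), M, modules="numpy")

# eigvalsh assumes a symmetric (Hermitian) matrix and returns ascending eigenvalues.
values = np.linalg.eigvalsh(M_num(2.0, 0.5))
print(values)
```

This gives numbers for one parameter choice at a time, which is a different thing from a closed-form symbolic result.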

quaint mulch
#

darts is the closest.
If darts is not good enough, then manual.

smoky swan
# desert oar Not sure if you discussed with anyone else yet, but the reason I asked to see an...

it's exactly what you expect:

  • (of course), random deviation from the exact value, especially when accel is high
  • sensor ringing (a sine curve) after changes in acceleration, especially when stopping, tapping or hitting something
  • gaps between measurement points contribute a lot to drift after stopping, almost impossible to filter out

all filters had one or more of the following problems:

  • not enough; still drift after some time or after fast movements
  • cursor moves back to where it came from (like with a vanilla RNN or forgetting integration)
  • cursor starts absolutely tweaking once you move any faster than a slow glide
  • input felt highly unnatural, cursor was hard to move at all (often happened when you overdid it with filters that else aren't enough)
#

zero velocity updates (set velocity=0 when acceleration=0 and delta_rotation=0) either don't detect or detect too eagerly
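
A minimal sketch of the zero velocity update idea in one dimension, to show where the thresholds come in. The function name, the tolerances, and the sample signal are all hypothetical; real thresholds depend on the sensor's noise.

```python
import numpy as np

def integrate_with_zupt(accel, gyro, dt, accel_tol=0.05, gyro_tol=0.02):
    """Integrate 1-D acceleration to velocity, zeroing velocity when the sensor looks still.

    accel_tol and gyro_tol are made-up thresholds for illustration.
    """
    velocity = np.zeros(len(accel))
    v = 0.0
    for i, (a, w) in enumerate(zip(accel, gyro)):
        if abs(a) < accel_tol and abs(w) < gyro_tol:
            v = 0.0          # zero velocity update: treat the device as stationary
        else:
            v += a * dt      # plain Euler integration otherwise
        velocity[i] = v
    return velocity

# Made-up signal: speed up, slow down, then rest with a small sensor bias.
accel = np.array([1.0, 1.0, -0.9, -0.9, 0.01, 0.01, 0.01])
gyro = np.zeros_like(accel)
vel = integrate_with_zupt(accel, gyro, dt=0.1)
print(vel)
```

The trade-off described above shows up in the two tolerances: too small and the leftover velocity after stopping is never reset, too large and slow deliberate movement gets zeroed out.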

exotic star
#

hi guys, i don't know how to go about learning the math of ML. i'm learning the data libraries of python and i want to start learning the core math of ML as well before i jump deeper into ML and ML concepts, so i can actually understand what's going on at a deeper level. How would i go about this? do you have any resources and guidance?

soft ermine
#

i understand machine learning at the lowest level possible so im sure i can help

exotic star
#

im not sure how to approach actually learning ML but i'll learn pandas/numpy/matplot parallel with math for ML

#

and try to implement the math with python

#

tho im not sure where and how to learn all of those

#

same about ML concepts

soft ermine
#

i mean

soft ermine
#

essentially current ml models require training pools. the more neurons, the more accurately the machine can produce an outcome. the more layers, the more complex the result can be. understanding how a single machine neuron works first is essential

soft ermine
soft ermine
#

the increased bus width is representing the analogue signal to avoid unnecessary digital clock based/space unoptimised processes

spring field
#

(oh god, I was imagining something like one made out of wood 😭)

soft ermine
#

the ram contains the weights, though the architecture allows for localised ram for each neuron set

soft ermine
#

ill be physically building it soon once ive got the modularisation down

spring field
#

well, I was apparently conflating "physical" with "mechanical", mb

soft ermine
#

i mean it will be physical soon. like basically small connectable cubes that can stack onto each other to create a massive neural network

spring field
#

(I was thinking something not involving electricity (perhaps, with a hand crank ducky_skull))

soft ermine
#

i mean... i can power it with a hand crank but why would i do that when i can plug it in...

#

also that would involve some crazy gear work 😭

#

i just cant think creatively enough to modularise it effectively

iron basalt
# spring field (I was thinking something not involving electricity (perhaps, with a hand crank ...

I mess around with mechanical computers and such, here is a good starting point for some ideas: https://www.youtube.com/watch?v=s1i-dnAH9Y4 I recommend starting with Lego.

A 1953 training film for a mechanical fire control computer aboard Navy Ships. Amazing how problems of mathematical computation were solved so elegantly in "permanent" mechanical form, before microprocessors became inexpensive and commonplace.

#

(I have done NNs)

soft ermine
#

i've done it though. i've discovered the perfect method to physically modularise neural networks

jaunty helm
#

it's in c++ rather than python so should be more performant

weak oxide
#

I was able to implement most of what I wanted

#

And then I supplement the rest with pytorch forecasting

#

It doesn't have SARIMAX explicitly but I can change it around to get SARIMAX

vale breach
#

guys what is the

#

difference between iloc and loc in pandas

jaunty helm
vale breach
#

wdym?

jaunty helm
# vale breach wdym?

like you do df.iloc[:, 2] to get the 3rd column
or, if the 3rd column is named "fruit," you can do df.loc[:, 'fruit']
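
A small runnable version of that, with a made-up frame. The index uses letters on purpose so that labels and integer positions are visibly different things.

```python
import pandas as pd

# Made-up frame with a non-default index so labels and positions differ.
df = pd.DataFrame(
    {"id": [10, 11, 12], "price": [1.5, 0.5, 2.0], "fruit": ["apple", "kiwi", "pear"]},
    index=["a", "b", "c"],
)

by_position = df.iloc[:, 2]      # third column, selected by integer position
by_label = df.loc[:, "fruit"]    # same column, selected by its name

row_by_position = df.iloc[0]     # first row, by position
row_by_label = df.loc["a"]       # row whose index label is "a"

print(by_position.equals(by_label))
print(row_by_position.equals(row_by_label))
```

In both accessors the part before the comma picks rows and the part after picks columns; `iloc` counts positions from 0, `loc` looks up labels.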

vale breach
#

what I don't understand is the syntax df[[something, something]]

#

i have this vue dataframe

#

and loc[row, column]

vale breach
#

@jaunty helm

jaunty helm
jaunty helm
vale breach
#

where can I check function definition of loc()?

#

because, if I only have one parameter you know is row

#

because loc(row, col) right

vale breach
jaunty helm
vale breach
#

yeah

#

is df.loc[5, :] equivalent to df.loc[5]

vale breach
#

I think I get it tho, this code in here

#

this code is saying df.loc[5, :] = [stuff to put in corresponding r,w ]

#

like

#

it will put in the corresponding columns

#

"heinz, armando" will be put in row = 5, column = 1

#

1255 will be put in row 5 column 2

#

'a' will be put in row 5 column 3

#

and "flex" will be put in row 5 column 4

#

I get it I think @jaunty helm
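
A sketch of that row assignment. The original frame was only shown as an image, so the column names and the existing rows below are placeholders; only the four assigned values come from the chat.

```python
import pandas as pd

# Column names and starting rows are placeholders, not the frame from the chat.
df = pd.DataFrame(
    {
        "name": ["doe, jane", "roe, rick"],
        "number": [1000, 1001],
        "grade": ["b", "c"],
        "plan": ["basic", "basic"],
    },
    index=[4, 5],
)

# Assigning a list to df.loc[label, :] fills that row's columns left to right.
df.loc[5, :] = ["heinz, armando", 1255, "a", "flex"]

print(df.loc[5])

# df.loc[5] and df.loc[5, :] select the same row.
same_row = df.loc[5].equals(df.loc[5, :])
print(same_row)
```

So with a single argument `loc` treats it as the row label, and the `:` after the comma just means "all columns".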

vale breach
#

i appreciate it tho, you broke down the principal differences and I think I start to get it tho

vale breach
vale breach
#

like grouped = df.groupby("columnnnn")

jaunty helm
vale breach
#

what about df.groupby("column")['something'].max()

#

@jaunty helm

jaunty helm
vale breach
#

what the flip does the [] do
after the groupby

jaunty helm
vale breach
#

like I have this df

vale breach
jaunty helm
vale breach
#

i just need to guess which line of code makes the following operation on this df

#

I think it will be df.groupby("asiento")['fila'].sort()

vale breach
vale breach
#

you can also do loc on multiple columns

#

which is equivalent to this one

vale breach
#

unless I am tripping @jaunty helm

vale breach
#

and then we access the column with the []

#

something like that no?

vale breach
#

row 'a' from the dataframe

#

for this df we grouped by the animal column and accessed the age column to compute the mean age for each animal
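
A minimal sketch of that pattern with made-up data, showing what the `[]` after `groupby` does: it picks which column the aggregation runs on.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame(
    {
        "animal": ["cat", "dog", "cat", "dog"],
        "age": [2, 5, 4, 7],
        "weight": [4.0, 20.0, 5.0, 25.0],
    }
)

grouped = df.groupby("animal")                 # one group per distinct animal
mean_age = grouped["age"].mean()               # [] selects the column to aggregate
max_age = df.groupby("animal")["age"].max()    # same idea with a different aggregation

print(mean_age)
print(max_age)
```

The result is a Series indexed by the group keys, much like `SELECT animal, AVG(age) ... GROUP BY animal` in SQL.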

verbal oar
#

groupby i think is like SQL's?

#

and used with some aggregation function

#

do I need some knowledge of hpc and parallel computing for machine learning?

#

or only when someone manages infrastructure of it?

#

so not neccessary

soft ermine
fallow coyote
#

are there any books which teach linear algebra and its application in computing? ngl learning linear algebra just seems long. I just want to know enough linear algebra for machine learning. I'm going to use some A level textbooks to learn matrix basics and then use Gilbert Strang's book from chapter 3 onwards, but I wouldn't mind using a book that teaches linear algebra from the ground up and how to apply it

soft ermine
fallow coyote
#

Cool.

verbal oar
#

for example when u calculate characteristic polynomial/equation then its similar to high school algebra
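
As a small illustration of that point, here is the characteristic polynomial of a made-up 2x2 matrix worked out with SymPy. It comes out as an ordinary quadratic, and solving it gives the eigenvalues.

```python
import sympy as sp

lam = sp.symbols("lambda")
A = sp.Matrix([[2, 1], [1, 2]])   # arbitrary example matrix

# Characteristic polynomial det(lambda*I - A): a plain quadratic in lambda.
char_poly = (lam * sp.eye(2) - A).det().expand()

# Its roots are the eigenvalues of A.
roots = sp.solve(char_poly, lam)

print(char_poly)
print(roots)
```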

crimson jackal
#

Is there any way to use wolfram mathematica inside python to compute eigenvectors symbolically?

serene scaffold
crimson jackal
crimson jackal
soft ermine
crimson jackal
soft ermine
#

r.i.p

crimson jackal
#

Why?

soft ermine
#

gpus are really good at matrix functions c:

#

cuda probably has the protocols to do this efficiently

crimson jackal
#

Wait, when you say tensor, that's not using Matrix(...), right? The question may be dumb. Ahahhah

soft ermine
#

tensors are essentially n-dimensional arrays

#

an 11x11 matrix could fall into the tensors realm

crimson jackal
#

How?

soft ermine
#

if ofc it meets the criteria

iron basalt
crimson jackal
#

Yea. Tooooo slow.

iron basalt
#

Is Mathematica a lot faster for this?

crimson jackal
#

Yea, by a lotttttt

soft ermine
#

cuda might be the optimal approach here

crimson jackal
#

Mathematica can make it in seconds and using sympy may take at least some hours.

#

I may not have cuda.

soft ermine
#

cuda should allow this to happen in millisecond latency

iron basalt
#

If the difference is this large I doubt it's a hardware issue.

#

Sympy is probably just doing something horribly wrong.

soft ermine
#

no its most likely that sympy is using cpu fetch decode execute cycles instead of gpu tensor capabilities

#

which is the wrong approach to digital matrices

iron basalt
#

It's an 11x11 matrix.

#

With mostly zeros.

soft ermine
#

yeah tensors should be perfect for that

crimson jackal
#

I can show you how the matrix is.

iron basalt
#

No I mean like 20 year old hardware can demolish that task.

soft ermine
#

yeah

iron basalt
#

Something is just being done really wrong.

#

No CUDA needed, or anything like that.

crimson jackal
#

This is 12 by 12, but I basically just need to remove one row and one column.

#

I want to substitute all of them except maybe 3 variables.

crimson jackal
iron basalt
#

Maybe with a smaller simple matrix as an example too of what you want.

crimson jackal
#

MRE?

iron basalt
#

Minimal reproducible example.

crimson jackal
#

Oh okok

#

Yea. Ok, one minute.

verbal oar
#

so its sparse

iron basalt
verbal oar
#

yes right

iron basalt
#

So in between, but still worth storing in sparse form. Although this is about symbolic computation, so things are a bit different in how you do things.

#

My guess is that sympy stores its structures in a pretty slow way compared to Mathematica, but this probably does not explain the massive difference.

jaunty helm
#

symengine is like sympy but in c++, which is why I suggested it

iron basalt
#

Note that anything above 3x3 is going to start getting really bad for symbolic computation on this task, especially if the engine is not making use of it being sparse.

crimson jackal
#
import sympy as sp

# Define symbols
F, G, A, B, epsilon, mu = sp.symbols('F G A B epsilon mu')

# Define the matrix
M = sp.Matrix([
    [0, 0, F, 0, 0, 0, 0],
    [0, 0, G, 0, 0, 0, 0],
    [F, G, epsilon, 0, A, 0, B],
    [0, 0, 0, 0, 0, mu, 0],
    [0, 0, A, 0, 0, 0, mu],
    [0, 0, 0, mu, 0, 0, 0],
    [0, 0, B, 0, mu, 0, 0]
])

# Compute symbolic eigenvectors
eigen_data = M.eigenvects()
#

Even doing this is already very bad...
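
One thing that may help with a matrix like this (a sketch, not a guaranteed fix): rows and columns 3 and 5 (0-based) only couple to each other, so the problem splits into a 2x2 block and a 5x5 block that can be handled separately. The index lists below were read off the matrix above by hand, and `lam` is an extra symbol introduced for the characteristic polynomial.

```python
import sympy as sp

F, G, A, B, epsilon, mu = sp.symbols("F G A B epsilon mu")
lam = sp.symbols("lambda")

M = sp.Matrix([
    [0, 0, F, 0, 0, 0, 0],
    [0, 0, G, 0, 0, 0, 0],
    [F, G, epsilon, 0, A, 0, B],
    [0, 0, 0, 0, 0, mu, 0],
    [0, 0, A, 0, 0, 0, mu],
    [0, 0, 0, mu, 0, 0, 0],
    [0, 0, B, 0, mu, 0, 0],
])

# Indices 3 and 5 form an isolated 2x2 block; the other five indices form the rest.
small = M.extract([3, 5], [3, 5])
rest = M.extract([0, 1, 2, 4, 6], [0, 1, 2, 4, 6])

small_eigs = small.eigenvals()                        # eigenvalues of [[0, mu], [mu, 0]]
rest_poly = sp.factor(rest.charpoly(lam).as_expr())   # characteristic polynomial of the 5x5 block

print(small_eigs)
print(rest_poly)
```

The 5x5 block still has zero as an eigenvalue (its first two rows are proportional), which leaves a quartic in `lambda`. Closed-form roots of a general quartic are long, which is likely part of why `eigenvects()` on the full matrix is so slow.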

iron basalt
#

!code

arctic wedgeBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

crimson jackal
#

So yea. I may try symengine first.

crimson jackal
#

I cant find the documentation of the functions of symengine.

lilac aspen
crimson jackal
lilac aspen
#

I don't know. Like I said, look into it.

But the idea is at least 40 years old. Someone's probably made one that works for Computer Algebra Systems by now, as well as for numerical methods.

crystal pier
#

👋🏾 anyone got a project they need help with?

surreal salmon
#

Hi