#data-science-and-ml
Slicing is a way of accessing a part of a sequence by specifying a start, stop, and step. As with normal indexing, negative numbers can be used to count backwards.
Examples
>>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> letters[2:] # from element 2 to the end
['c', 'd', 'e', 'f', 'g']
>>> letters[:4] # up to, but not including, element 4
['a', 'b', 'c', 'd']
>>> letters[3:5] # elements 3 and 4 -- the right bound is not included
['d', 'e']
>>> letters[2:-1:2] # Every other element between 2 and the last
['c', 'e']
>>> letters[::-1] # The whole list in reverse
['g', 'f', 'e', 'd', 'c', 'b', 'a']
>>> words = "Hello world!"
>>> words[2:7] # Strings are also sequences
'llo w'
Well, hi. I don't study data science at my university, but I'm learning it to explore job opportunities. I see that I should begin with statistics. Can anyone suggest some courses on Udemy or Coursera about it?
oh i get it
i get it now
finally.
I'll need some help with the NumPy array 'axis' argument when I make arrays in higher dimensions
read their user guide if you haven't yet
in terms of statistics you need to know, it depends on what you're working on to be honest, but in school I've studied topics in hypothesis testing, regression, and ANOVA which have been helpful sometimes. I couldn't name you any courses, but I have an open source book that covers Python data science fundamentals. I've heard good things about it. https://jakevdp.github.io/PythonDataScienceHandbook/
I really like krista king on udemy, she visualizes her subject extremely well. That said, lectures are not really going to teach you anything. It's math so the best way to internalize it is using a pen and paper.
Alternatively, there are some ML courses on udemy that will include a good deal of math but DS != ML engineer
Oh! One important note about Krista King, avoid the course with Jose, it has Data Science in the title but she pairs with an instructor who's really dry and it just mixes like oil and water (not good). I'd advise you not to waste your time on any of her programming courses.
Im happy to recommend books or resources or recs in general
Here's a statistical book with learning in R programming and exercises to think about and do. Still has really good topics on how Statistics works. Enjoy!^^
one of the most frustrating aspects of AI with Python is that package versioning becomes a time wasting problem
hehe, ya it can be maddening sometimes to try to figure out and can be a time sink for sure.
I went to all this effort of setting up torch and a bazillion related packages (spacy, nltk, the entire mess), only to find that torchtext refuses to install if you're in python > 3.8
software rot is a HUGE issue in python
torchtext is in fact stranded at this stage.
did you see this? https://github.com/pytorch/text/issues/2250
nope, and thanks
Yup thank u lot. I appreciate ur advice
oh yeahh thanks your source :3
Hi everyone, not sure if this goes here, but if people here are using VS Code I'm sure it's optimized for this purpose.
I'm currently learning the math behind ml models (starting with linear regression and gradient descent) and coding them in python as I go. My issue came up when I tried to use the sympy differentiation function to get my dJ/dw and dJ/db from my function J(w, b). I was then reminded that I don't have sympy after I cleared my library cache a bit ago (I think that's what I did but you get the point - I don't have it)
This led me down a rabbit hole of stuff and wondering why I'm using Homebrew and conda through miniforge and such. Anyway, I thought I'd come here and ask for suggestions on how to set up my VS Code (on an M-chip Mac, if that changes anything) from people more knowledgeable than me. Open to any and all suggestions :)
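For the dJ/dw and dJ/db mentioned above, here is a minimal sketch that needs no sympy at all. It assumes J is the usual mean-squared-error cost with a 1/(2m) factor (the message doesn't say which cost is used), writes the two partial derivatives out by hand, and sanity-checks them against finite differences. The data and learning rate are made up for illustration.

```python
def cost(w, b, xs, ys):
    """Mean squared error cost J(w, b) = 1/(2m) * sum((w*x + b - y)^2)."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradients(w, b, xs, ys):
    """Hand-derived partial derivatives dJ/dw and dJ/db of the cost above."""
    m = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    dj_dw = sum(e * x for e, x in zip(errors, xs)) / m
    dj_db = sum(errors) / m
    return dj_dw, dj_db


def numeric_gradients(w, b, xs, ys, eps=1e-6):
    """Central finite differences, used only to sanity-check the formulas."""
    dj_dw = (cost(w + eps, b, xs, ys) - cost(w - eps, b, xs, ys)) / (2 * eps)
    dj_db = (cost(w, b + eps, xs, ys) - cost(w, b - eps, xs, ys)) / (2 * eps)
    return dj_dw, dj_db


# Made-up data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Plain gradient descent from w = b = 0 (learning rate chosen by hand)
w, b = 0.0, 0.0
for _ in range(5000):
    dj_dw, dj_db = gradients(w, b, xs, ys)
    w -= 0.05 * dj_dw
    b -= 0.05 * dj_db

print(round(w, 3), round(b, 3))
```

On this toy data the loop should recover w close to 2 and b close to 1. Comparing `gradients` with `numeric_gradients` at a few points is a cheap way to catch a sign or factor mistake before trusting a hand derivation.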
for most purposes, just using uv works great
uv add to add new packages, uv run to run scripts or commands inside the project's environment, uv sync to recreate the environment when cloning to another machine
very few packages require conda nowadays
thanks for letting me know!
Is conda not useful for its environments to make it worth having
Don't use Conda, just plain Python.
It's banned at some companies already, it's legacy.
IDK, I don't use Mac.
ah ok thanks anyway
hey nothing personal but I don't really DM, better if you ask your questions here and then everyone can help
yah okay i got it
i can see that, as I do a lot of pip install . after installing stuff also via conda, usually in its initialization/creation stage
I'm using OpenCV to try to do camera calibration on my 5 DOF robot.
This camera is mounted on the robot behind the wrist. I need to solve for the matrix that relates the End Effector (EE) to the Camera.
In OpenCV there are two functions and I'm not sure which to use:
- calibrateRobotWorldHandEye()
- calibrateHandEye()
From my data, I collected the angles of each joint, and the corresponding image of an aruco marker.
I am able to calculate the relationship between the object and camera by solving PnP, and the relationship between the base and EE using FK.
I'm getting weird numbers saying that the camera is like 20cm away from the EE, which makes no sense; it's closer to ~8cm
If anyone has done this before, help would be really appreciated!
In this function, if the network is slow, the function keeps running, but because of the slow connection it can't type into the search box. Is there a better alternative to time.sleep for making sure the site has fully loaded?
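A common alternative to a fixed time.sleep is to poll for the condition you actually care about and give up after a timeout. The sketch below is a generic, library-free helper; `wait_until` and `search_box_ready` are hypothetical names, and the "page" is simulated with a counter. The message doesn't say which automation tool is in use; if it is Selenium, its WebDriverWait together with expected_conditions covers the same idea.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.2):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for "is the search box there yet?": becomes ready on the third check.
attempts = {"count": 0}


def search_box_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(search_box_ready, timeout=2.0, poll_interval=0.01)
print(attempts["count"])  # 3
```

The script continues as soon as the condition holds, so a fast connection isn't penalised and a slow one gets the full timeout instead of a guess.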
this is webscraping and UI automation. not data science.
remember to always post code as text, not a screenshot.
!warn 1446195934127063062 Your message was removed for asking for money in exchange for help. This will be your only warning regarding this behavior.
:incoming_envelope: :ok_hand: applied warning to @flint sapphire.
hello
guys, where can I find resources for creating my own generative AI model and training it? I want an AI model that I code myself: no fine-tuning, nothing else under the hood, just a model hand-coded and trained by me
but cant find any sources to help me build this
first you'll need a few million dollars' worth of hardware and electricity bills
second you'll need a team of about a dozen PhD researchers
if everyone could do it, they wouldn't be sold at $200/year by only like 5 companies in the world
well i am not trying to do a nanobanana level project
i want to train it only on my own pictures and i would be the only output you know
so you want a text-to-image algorithm built only by yourself ?
nice, I think we can reduce the requirements to 2-3M$ worth of hardware and only 8 PhD researchers
xd got it
but what about my own language model only trained on one topic for example history
that's better, do you have 2 M$ and 6 PhD researchers ?
you can make a GPT-2 level "this looks like English" model, but it will not be factual nor have great text understanding
https://github.com/karpathy/nanochat
so basically 90% of AI projects are Fine Tuned versions of the others ?
more like 98% I would say
understood
maybe more like 99.9%
tbh for LLMs might be like, 99% just use base models as-is with some prompt engineering, then 99% of that 1% that "train" a custom model are fine tuning it
One more question lets say i want to create an ai app for historical things like wars, battles and revolutions and i want to have images like where there are casualities corpses and blood is there a gen ai that supports blood content
uncensored
I'm sure you can find some versions tuned for generating unsettling content online, I don't want to go anywhere near it though
and even leaving gore aside, I'd be very hesitant to show AI generated images as if they displayed real historical events
i cant find images for every historical event you know
then don't show images for these events
tbh deep faking historical images is something I would avoid very much
I mean, if you make up images, why would I trust you regarding the remaining of the content ?
Is this battle AI generated too ?
how can I know ?
that's like saying: I can't find arguments for each point I make so I want to make them up
I don't think you need AI at all for your app, I think the only thing you need is historians
no it is just i need to use AI for the project
this wont go public just for university to get my grade and thats it
if you have any better ideas i am more than open to them
what even are the requirements set by the university?
well you may have a chatbot-like thing in your app that answers questions based on the database
you don't even need to finetune a model, you only need a RAG
which is a good project in itself
first of all i need to use some kind of AI, i have to do changes on the AI that i am using not like copy pasting openai client. It can be any topic i choose as long as i can implement it with ai
i wanted to do law but it had so many backdoors and exceptions
maybe fine tune an open model on https://www.reddit.com/r/AskHistorians
perhaps better not though, there is a considerable chance it will make stuff up and I don't think you could ground it in reality very well
I think a RAG and a bunch of historian reports on a specific war could be great
find something like one hundred documents regarding a specific, recent war for which you can find some pictures
You can have an algorithm to retrieve the relevant picture based on the question
and the RAG will find an answer to it
something like that
well, my original idea was only to take Azerbaijani history, get the information from our own historians, and create a model that only answers questions related to Azerbaijan. Now I know I have to fine-tune it and avoid using AI-generated images for history because of reliability issues
that's an idea, maybe find some document about a recent war or simply about Azerbaijan since 1900 idk
or since their independence
you will be able to find a lot of picture and a lot of document
and not only about war
something about the "Azerbaijan since 1990"
Yo guys, could anyone please tell me how to practice Python for data science and analytics? I always feel stuck while importing various libraries and downloading assets. How do I manage it?
+1 on RAG, will be way more achievable
quite a few recent embedding models like jina v4 are multimodal too, which means you don't have to worry too much about lining up text vs. image embeddings, nor worry about documents coming in janky formats because you can just take each page as an image and embed that
kaggle
Hey guys
I made a package a while ago (it's currently in submission to JOSS) for deep learning specifically leaning heavily on transparent and Python only code, so that it is easy to learn the ML theory by reading the codebase
Just wanted to share it here in case anybody would find it helpful. Because I know people liked micrograd from Andrej, so this is something somewhat similar, but you can actually use it for real projects
Feel free to email or open an issue on github if you have any feedback or comments, or just DM on discord 🙂
IBM is also helpful for data science
hi, I am new to this. I want to learn AI, ML, DL, and similar stuff. Please help me: can you give me the milestones I have to cross in the next 2 months?
take a look at the pinned messages, there's a good selection that salt die put together, also, many resources are outdated and will use tensorflow, TF is outdated, use pytorch to get going
Also, I'm ashamed to say it, but I am a beginner. I am still learning Python, so give me milestones from the very start
Can someone tell me how I can reduce the latency of my agent?
I am building a multi-agent tool: there is an agent for each task, and each agent has certain tools. I am using the GPT-4o model and results are fast, but I am going to add speech-to-text and text-to-speech, which will add latency too. I am using LangGraph to build it. It's my first time building a multi-agent tool; truthfully, it's my first agent of any kind.
Any good advice on what I can do to reduce latency?
I can improve the prompts, and my tools do MongoDB operations, so I can make those faster; things like that.
Other than this, what else can I improve?
are the LLMs running locally?
Hi everyone! I'm currently working with a server (I can send you invites if you're interested) on a beginner-friendly AI/ Machine Learning project using Python, and we're looking for more teammates.
Requirements:
- Basic Python (or JavaScript) knowledge
- Interest in AI/ Machine Learning
- Willing to learn and work as a team
If you're interested, feel free to DM me. Thank you!
I am using an OpenAI key and the GPT-4o model
so that means that you have to wait for the prompt to travel over the internet to OpenAI's servers, then wait for the result to generate, and for it to travel back to you. so there might not really be a way to speed that part up.
how many tokens are each prompt?
Yeah
As a PoC I am using LangGraph's in-memory state, and I am sending the full state to the model as conversation memory
I'm not sure what you mean by that. but you might be able to cut down how long it takes for OpenAI to generate a response if you make the prompts more concise.
also, if you know where the OpenAI server is physically located (like in Ashburn, VA), you might try running your code on a server that's located near there.
Ok you mean to say if I am able to send a request to my nearest server of open ai then my response will be fast
if you reduce the physical space between where your code is running and where OpenAI is running, you will spend less time waiting for data to travel over the internet.
Correct me if I am wrong
If one server is located 1 mile from where I am running my tool
and another server is 2 miles away,
then the 1-mile one will give a faster response.
it won't make a difference at the scale of one mile vs two, but you've got the right idea.
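To put rough numbers on that, here is a back-of-the-envelope sketch of best-case propagation delay. It assumes signals travel through fiber at about 200,000 km/s (roughly two thirds of the speed of light) and ignores routing, queuing, and the time the model spends generating, all of which usually dominate. The 5,000 km case is an arbitrary long-distance example.

```python
FIBER_SPEED_KM_PER_S = 200_000  # rough speed of light in optical fiber (~2/3 of c)
KM_PER_MILE = 1.609344


def round_trip_ms(distance_km):
    """Best-case round-trip propagation delay in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


for label, km in [("1 mile", KM_PER_MILE), ("2 miles", 2 * KM_PER_MILE), ("5,000 km", 5000)]:
    print(f"{label}: {round_trip_ms(km):.3f} ms")
```

One mile versus two is a difference of hundredths of a millisecond, while thousands of kilometres cost tens of milliseconds per round trip, so distance only starts to matter at the scale of continents.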
Okkk
is this a data science or AI question? otherwise, try #1035199133436354600. also, please be more specific about what you're trying to do.
Alr thanks
One thing I did: we were doing a Mongo operation where some of my teammates were fetching the complete data from Mongo and then filtering it.
I updated the approach to do the filtering in MongoDB itself and then fetch the data.
So it reduces the amount of data I pull
when you query a database, you usually want the database to be doing most of the work.
Hi everyone,
I’m planning to use this title for my Master’s dissertation:
"Machine Learning-Based Vibration Analysis for Early Fault Detection in Marine Diesel Engines."
Could you please share your thoughts on whether this title sounds clear and strong enough for future PhD scholarship opportunities?
Thanks a lot!
Hi, is there anyone from the ilastik image processing organisation? It's a sub-org under Python which I want to contribute to
today I learned: BERT-tiny, BERT-mini, BERT-small, and BERT-medium can all be trained on a 4 year old laptop with an RTX-3070
Only thing is that you need transformers[torch]
trained from scratch, or fine-tuned?
fine-tuned 🙂
basically get it from hugging face, and then apply some dataset to it.
you can get pretty good accuracy on dataset tasks with ~15 minutes of training
for the biggest BERT in the set, "AlbertForSequenceClassification"
how hard is it to make an AI for recognizing sign language?
https://teachablemachine.withgoogle.com/ can help you make it super easy
im doing it for a science project and so id like to do it manually...
then I guess you could start training your own model with the usual. Pytorch or Tensorflow
could you possibly point me to a source that i could learn pytorch?
pytorch website has a tutorial section. https://docs.pytorch.org/tutorials/
thank you!
@glad wing
What?
Dm
Hi
Im busy doing a course on udemy 100 day course for python and wanted to know what would be worth doing next after finishing that course?
Should i start learning the modules and stuff?
Hello I am learning file handling, data analyst should I message here ?
Don't they teach modules in 100 day course?
I did see pandas
Not sure about the rest
Ohh, Pandas is necessary
Learn Numpy and math modules too if you can
Anything else
Thanks in advance
That's enough, I guess, unless you're going for ML or DL; then you'll need scikit-learn and TensorFlow.
Thanks
It feels like cheating lol
If you already know how a neural network works, the hard part is done.
However, for me the hardest part is matching your own images with the dataset that your AI used
i do not know how a neural network works
So try to learn that before anything else,
My recommendation is to start by making a neural network that recognizes XOR
You will get the fundamentals (forward pass, backpropagation, neurons, weights, and error cost) and a global understanding of neural networks
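As a starting point for that XOR exercise, here is a minimal sketch in plain Python with no libraries. The layer size, learning rate, seed, and epoch count are arbitrary choices for illustration, not tuned values. It uses sigmoid neurons, a squared-error cost, and one weight update per training example.

```python
import math
import random

random.seed(0)  # fixed seed so runs are repeatable

# XOR truth table: inputs and target outputs
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
HIDDEN = 3           # number of hidden neurons (an arbitrary small choice)
LEARNING_RATE = 0.5


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Each hidden neuron has 2 input weights + a bias; the output has HIDDEN weights + a bias
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(HIDDEN)]
w_out = [random.uniform(-1, 1) for _ in range(HIDDEN + 1)]


def forward(x):
    """Forward pass: return the hidden activations and the network output."""
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, output


def total_error():
    """Sum of squared errors over the four XOR cases."""
    return sum((forward(x)[1] - target) ** 2 for x, target in DATA)


def train_step():
    """One pass over the data, with a backpropagation update after each example."""
    for x, target in DATA:
        hidden, output = forward(x)
        # Error signal at the output neuron (squared error through a sigmoid)
        delta_out = (output - target) * output * (1 - output)
        # Propagate the error back to each hidden neuron, using pre-update weights
        delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                        for j in range(HIDDEN)]
        # Update output weights and bias
        for j in range(HIDDEN):
            w_out[j] -= LEARNING_RATE * delta_out * hidden[j]
        w_out[-1] -= LEARNING_RATE * delta_out
        # Update hidden weights and biases
        for j in range(HIDDEN):
            w_hidden[j][0] -= LEARNING_RATE * delta_hidden[j] * x[0]
            w_hidden[j][1] -= LEARNING_RATE * delta_hidden[j] * x[1]
            w_hidden[j][2] -= LEARNING_RATE * delta_hidden[j]


error_before = total_error()
for _ in range(10000):
    train_step()
error_after = total_error()
print(f"error before: {error_before:.4f}, after: {error_after:.4f}")
```

The error should drop as training proceeds, though tiny networks can occasionally stall on XOR depending on the initial weights. Changing the seed or the number of hidden neurons is a good way to get a feel for that.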
cant i do opencv tho
Yeah with python you'll need it
but idk if it recognize sign language
you will have to train an AI on sign languages, then process the images with OpenCV
I think
Is there a beginners book to pytorch
cuz i think i can use opencv to get the actual hand positions
but then id need an ai and massive amounts of data to get words out of that
Hi to everyone, which are the best sources to get deeper into AI/ML? I know the basics, I just want to get very good in this area, to really understand it. Any github repos, forums, etc. would be more than helpful.
Thanks ! :D
There are resources in the pins
I'm learning data analysis, should I message here?
```
===creating folder===
my_folder already exist
===Creating files===
created - image.jpg
created - document.txt
created - song.mp3
===moving files===
file moved outside.txt into my_folder
[Program finished]
```
Today I'm learning how to move files into a folder using shutil in Python
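The original script isn't shown, so the following is only a guess at what a program producing output like the above might look like. It is a minimal sketch that creates a folder and a few empty files, then moves them with shutil.move. It runs inside a temporary directory so nothing real is touched; the file names are taken from the output above.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory so nothing real is touched
workdir = Path(tempfile.mkdtemp())

print("===creating folder===")
folder = workdir / "my_folder"
if folder.exists():
    print(f"{folder.name} already exists")
else:
    folder.mkdir()
    print(f"created - {folder.name}")

print("===creating files===")
names = ["image.jpg", "document.txt", "song.mp3"]
for name in names:
    (workdir / name).touch()  # create an empty file
    print(f"created - {name}")

print("===moving files===")
for name in names:
    shutil.move(str(workdir / name), str(folder / name))
    print(f"moved {name} into {folder.name}")

# List what ended up in the folder
moved = sorted(p.name for p in folder.iterdir())
print(moved)
```

Passing the full destination path (folder plus file name) to shutil.move makes it explicit where each file lands.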
welcome to the club
Yup
What courses are you taking?
Nothing
I'm doing it from an app and I got an A+, and a girl texted me from that app and just told me to follow their Instagram and didn't say anything else
Not exactly sure what you said.
I said I got A+ from an app
Alright..
But they don't tell me to buy a course
So it a was free public learning course?
Yes, that app teaches very basic things like int, if/else, etc.
And it also gives mini projects to do
Yup. When I finished the basics I chose data analyst as a future career
India
Cool man, you seem passionate with your choice. I salute you. 🫡
I have a strategy and I checked its results on tick-by-tick data, but it's taking me ages to automate it
I tried building it for NinjaTrader in C# but got errors
why?
Thanks man .
I don't know what the error is. The code compiles with no errors, but in the Strategy Analyzer it isn't taking any trades.
I even tried in Python by creating a bridge which would send trades to the platform, but no success
sounds like a bug more than a performance issue
and why do you have compilation issues with python?
From the Python side I guess everything is good. I think it's the NinjaTrader platform that is causing the problems. I added a script in NT to bridge between Python and NT.
Found no solutions
no idea. I don't use ninja trader
No prob
I built my own platform in java and use data I sourced
then use a broker API
Sounds interesting. I tested mine on tick data, which is the highest quality of data we can get, and I got good results on that, but I'm having a hard time implementing it live and automating it
I would suggest to start with something super simple and not to worry too much about ticks level data or order books or something too fancy
As engineers, it might be tempting to focus on very fancy ways to compute all sorts of things. But that would be missing the other half of the equation: a strategy that works
As you start putting your strategy in the real world, you will also learn many things about the market! Be it about slippage or bid-ask spread or other market behaviors
I agree with this
Yaa should start with simple things first
yeah it's a classic mistake. You could spend years making a very fancy trading and backtesting platform, but have zero strategy to trade with
Actually my strategy is not that complex it's just that the execution has not been up to mark from my side
have you backtested it?
If it is that simple, why does it require tick level data?
I backtested the strategy on ticks data but for running it live it needs to be converted to c#
I'm done with the Backtesting phase
oh nice! Congrats!
Though tick level data does require a very fast execution
I backtested it on data for the last 18 months march 24 till now on ticks data
Backtesting it does not mean executing.
How do you plan on executing it at microsecond level on the market to keep up with tick level data?
On NinjaTrader there are different chart data series available, like 1 tick or 1 sec, so if I attach my strategy to a 1-tick chart it would react to every movement. I would trade NQ (Nasdaq), so I was thinking of running it on a VPS located near the CME exchange in Chicago, which would reduce the latency to 5-10 ms
So it's not tick level data. It's second level bars.
Also having a VPS with low network to chicago CME is not cheap. That should be taken into account
Yaa per month vps would cost around 100-150$ but if I get this to work it would cover the cost of everything
ngl, that seems overly ambitious.
The odds of going from nothing to above $250/month in trading from the get-go are pretty slim
That's 3k$/year
💀💀
I was thinking of running it on propfirms if I got this to work then everything will be paid off
think about gross vs net profit, slippage, etc. I doubt you will start betting 500k$ in capital in your first month either
Maybe. I never tried them as it sounds too good to be true
In futures the slippage is not that much compared to forex, and with futures brokers, for NQ (Nasdaq) the bid/ask difference is 0.25
I guarantee you it will behave differently from the backtesting in some ways
Yes, I agree with it 100%. Backtest results are too good to be true: they provide ideal conditions for trades to be executed, but in real markets everything is different
So I already expect that, compared to the backtest, performance in live markets would go down to 20-30%
could be more, could be less
yess i am thinking of hiring a developer
I wouldn't.
Right now, you only have one idea, not a profitable strategy. You did the first step by backtesting it but you should also prove it first in the market
i have already taken around 50-70 trades in the live market with this strategy
what was the win rate? average gain? average loss? Sharpe ratio? average duration?
😯
it was around 85% and I risked around $100; my average win was around $40, and it's a scalping strategy so the average duration was around 2 mins
so each trade, you bet 100$ and you get on average 40$?
yes, with a negative risk-reward ratio
thats what i did in the backtest for the last 1.5yrs of data
jshell> 70 * .85 *40
$14 ==> 2380.0
so ~70 trades with 85% win ratio and each time you win, you make 40$
that should cover your VPS easily already
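Note that the 2380 figure above counts only the winning trades. Below is a rough sketch of the per-trade expectancy that also subtracts the losers. It assumes every losing trade loses the full $100 risked (the chat does not give an average loss) and ignores fees and slippage.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Expected profit per trade: wins minus losses, weighted by probability."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss


def breakeven_win_rate(avg_win, avg_loss):
    """Win rate at which the expectancy is exactly zero."""
    return avg_loss / (avg_win + avg_loss)


# Figures quoted in the chat; treating the $100 risk as the average loss is an assumption
win_rate, avg_win, avg_loss, trades = 0.85, 40.0, 100.0, 70

per_trade = expectancy(win_rate, avg_win, avg_loss)
print(f"expectancy per trade: ${per_trade:.2f}")
print(f"over {trades} trades:   ${per_trade * trades:.2f}")
print(f"break-even win rate:  {breakeven_win_rate(avg_win, avg_loss):.1%}")
```

Under these assumptions the edge is about $19 per trade, or roughly $1,330 over 70 trades, and the strategy stops making money once the win rate falls to about 71%. With a negative risk-reward ratio, a modest drop in win rate between backtest and live trading can erase the whole edge.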
wait
you said real trades
i did on a demo account in the live market not with real capital
ah not the same
no, I said that I applied the same logic I've been backtesting on 1.5 yrs of data in the live market too, but on a demo account, not using real capital
Sure.
But it won't be the same as live trading in a real account
get your feet wet with real trades before thinking about hiring people
80% of the time it would be the same, and if I run it on a VPS and automate it then I might get better fill prices compared to trading manually
that's okay
build that experience
do things manually first so you can learn more
quite tangential, but nonetheless relevant
you want to see in real life how things happen, how it feels, the pressure that comes with it, and learn from it. From there, automation will be easier and you can apply those learnings
you cannot skip that step
You will not be able to go from "in theory it works in my backtesting" to a "fully automated tick level strategy on a VPS in real life" without intermediary steps
at least if you want to keep or grow your capital
yess agree with it i will look into these things
thanks for your help men really appreciate it
you got it! You will do great!
will update the progress here
looking forward to it
🙏
hi there,
I am developing a codebase search tool like DeepWiki as a side project.
I am doing a hybrid search (semantic similarity + BM25).
While A/B testing semantic similarity only vs. hybrid, this is what I observed:
1. When I ask "how does authentication works in this project?", sometimes hybrid search performs better and sometimes vector search performs better.
2. When I ask "How does the DiagnosticAnswer model works here? Is it related to authentication here?", hybrid search performs better. Hybrid search works better when we are looking for something specific (keyword-specific), for example how a UserModel is implemented.
I have developed a RAG system around it.
RAG_BM25_WEIGHT=0.3 # BM25 contribution to RRF
RAG_VECTOR_WEIGHT=0.7 # Vector contribution to RRF
RAG_RRF_CONSTANT=60
here are my initial weights.
Any suggestions on how can I improve the system?
maybe dynamically re-weigh based on query?
e.g. you observe that keyword-specific = hybrid is better (in other words bm25 contribution is good), so maybe try to detect if the query potentially has keywords
if you can take a bit of a performance hit, rerankers are models trained specifically to re-order retrieved documents based on relevance to the query, that might work better too
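The re-weighting idea can be sketched as follows. This is a minimal illustration, not the poster's implementation: `weighted_rrf` is a plain weighted reciprocal rank fusion, while `looks_keyword_specific` and `pick_weights` are hypothetical helpers with a deliberately crude heuristic. The file names and the 0.6/0.4 split are made up; only the 0.3/0.7 weights and k=60 come from the config above.

```python
def weighted_rrf(rankings, weights, k=60):
    """Fuse several ranked lists of document ids with weighted reciprocal rank fusion.

    rankings: dict mapping retriever name -> list of doc ids, best first
    weights:  dict mapping retriever name -> weight for that retriever
    k:        RRF smoothing constant
    """
    scores = {}
    for name, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weights[name] / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def looks_keyword_specific(query):
    """Hypothetical heuristic: CamelCase, snake_case, or call-like tokens suggest identifiers."""
    for token in query.replace("?", " ").split():
        has_inner_upper = any(c.isupper() for c in token[1:])
        if has_inner_upper or "_" in token or "(" in token:
            return True
    return False


def pick_weights(query):
    """Shift weight toward BM25 when the query seems to name a specific symbol."""
    if looks_keyword_specific(query):
        return {"bm25": 0.6, "vector": 0.4}  # illustrative values, not tuned
    return {"bm25": 0.3, "vector": 0.7}      # the initial weights from the chat


# Made-up retrieval results for illustration
rankings = {
    "bm25": ["models.py", "auth.py", "views.py"],
    "vector": ["auth.py", "README.md", "models.py"],
}

keyword_query = "How does the DiagnosticAnswer model work?"
general_query = "how does authentication work in this project?"

fused_keyword = weighted_rrf(rankings, pick_weights(keyword_query))
fused_general = weighted_rrf(rankings, pick_weights(general_query))
print(fused_keyword)
print(fused_general)
```

The heuristic will misfire on acronyms such as "API", so in practice the weights are worth validating against a small set of labelled queries. A reranker over the fused top results is the heavier alternative mentioned above.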
Need some second opinions for a good reliable Linux distro for database experimenting and program development for future Data Science projects, what would be a good route to go? Mainly will be playing with SQL and NoSQL w/ AI using those sources.
Linux 6.18 Out Now! Featuring Extensive Hardware Support Updates & Driver Enhancements
https://www.linkedin.com/pulse/linux-618-out-now-featuring-extensive-hardware-8tfoe?utm_source=share&utm_medium=member_android&utm_campaign=share_via
Windows Subsystem for Linux (WSL2) + Docker works great if you need Windows for anything
otherwise depends on how much you want to customize yourself vs how much you want it ready out of the box
if you need to ask, Mint might be a good pick
I would stick with fedora/ubuntu as they are the most user friendly and with modern versions.
Beyond that, it won't matter that much
I love WSL2 and Ubuntu. I have zero issues. The alternative is something like Ubuntu as your main OS, and you can run Windows in a VM if you need it for anything.
WSL2 is the next best thing
I would prefer to be in an all-Linux work environment, but there is too much Micro$oft legacy drag
I became a bit of a convert when I managed to compile my very own version of Vim on WSL. Everything was the same as it had happened on Linux
heck, the base WSL distro even came with compilers included. I actually find that heart warming - it's the way things ought to be
Thanks for the suggestion. Also, I am using pgvector, which does not natively support BM25 search.
Postgres has ts_rank_cd(), which is close to BM25, but the hybrid search seems to hallucinate more.
Should I change my DB to ChromaDB for better search?
Anyone here working with docker, docker containers with GPU utilization on Ubuntu servers for Agentic AI? Currently working on a big project need friends send help. 😂😂
how can i help you?
Well it took me 12 hours to set up my old laptop (by old I mean 2 years old) up to run proxmox I have a Ubuntu desktop VM with full access and GPU utilization, which was the hardest part to pass through besides setting up networking because of current drivers for Nvidia and cuda. I am mainly looking for a mentor or group of people I can get in with who are cyber security focused. I am working on an Agentic AI network security infrastructure and have no one to talk to about this as all of my friends have no clue about anything I'm studying.
i'm familiar with docker and gpu passthrough on ubuntu. if you need help setting up agentic ai or have any ideas, i'd be happy to share what i know
Thanks man! Yeah I will say as someone who's always been into coding and tech, the past 2 years of study and diving deep into deep learning Agentic ai advanced Python and all of the libraries within, how does anyone do it?
I do currently have a simple chat bot I'm setting up on the server to aid in building the entire thing, as far as I know I have vs code set up to ssh into my container. What's like the best ide i should be using? I've messed around with jet brains stuff, cursor, zed, atom, and a couple of others but vs code has always been the most comfortable, probably because that's what they had us using in college
Also, if you work with Nvidia, do you know if they have support for Ubuntu 22.04lts to upgrade from nvidia-535 drivers to nvidia-570-open?
I mean the graphics cards not the company itself
hm, but I thought you said you were already using bm25?
and also I'm pretty sure there are other postgres extensions that give you bm25
I thought I was using BM25, only to realize I was not. I learned that there is this VectorChord extension for pgvector.
anyhow you can still try the ideas of reweighing and/or reranking
Can we make a multi-agent tool using LangGraph and LangChain where the master agent handles tool calling and normal conversation? When the user's query is ordinary and no sub-agent needs to be activated, the master agent replies normally. If the master agent thinks a sub-agent is needed, it activates it, and that sub-agent runs its tools and returns the output to the user directly, without handing it back to the master agent. After that, the user's next queries go straight to the sub-agent rather than the master agent. Once the sub-agent decides it has solved the user's request, it passes its response to the master agent, and the master agent replies to the user.
Hi im hoping to get an apprenticeship in data science or as data analyst what sort of projects would make the company hire me quick??🌝🌝 its entry level but still, i need some gooood project ideass
I wish I had better news, but if there are any data science/analyst apprenticeships, they're so rare as to be nonexistent. there's no quick way to get a job in that space.
I understand that, and thats why i wanna make sure that i get it. If not I could still try getting into soft dev n stuff, but i really want to try my luck
I would first confirm that you can even find a place to apply for one. I'm not sure that you will.
@serene scaffold
in the current market, luck isn't on your side. if you're serious about starting a career in this space, I think you should reconsider your strategy.
yes
My question above
try crewai
Maybe you guys have some ideas: I'm working on a self-improving AI system that can rewrite its own code. So far so good, but on the 3rd cycle the AI made the code async, I guess for performance, and didn't update the callers that expect sync functions. The tests still passed due to Python funkiness, but it breaks the system architecture. For a self-modifying AI that will improve multiple files at once, what do you think the best recovery strategy would be? I'm not exactly set on how I want to build this part, i.e. how it iterates over itself to apply fixes.
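One plausible way a test can keep passing after a function silently becomes async: calling it without await returns a coroutine object, which is truthy, so a loose assertion on the result still succeeds. The sketch below reproduces that and shows a small guard a test suite could use to catch it. `load_config`, `old_caller`, and `assert_sync_result` are hypothetical names, and this is only one possible cause, not a diagnosis of the actual system.

```python
import asyncio
import inspect


async def load_config():
    """Pretend this used to be a plain function and was rewritten as async."""
    return {"debug": True}


def old_caller():
    """Caller that was never updated: it forgets to await."""
    return load_config()


result = old_caller()
looks_fine = bool(result)                   # True: a coroutine object is truthy
is_coroutine = inspect.iscoroutine(result)  # True: the caller never got a dict
result.close()                              # avoid the "never awaited" RuntimeWarning


def assert_sync_result(value):
    """Guard a test can use to catch un-awaited coroutines."""
    if inspect.isawaitable(value):
        raise TypeError("got an awaitable; the caller must be updated to await it")
    return value


# What an updated caller should end up with
real_value = asyncio.run(load_config())
print(looks_fine, is_coroutine, real_value)
```

For recovery, one option is to make checks like this (or a static type checker run) part of the acceptance gate for each cycle and roll the whole multi-file change back when the gate fails, so a half-applied sync-to-async migration never becomes the new baseline.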
Can someone suggest how to find legit, actually implemented, proven research papers that make sense and actually contribute, not reviews or random case studies? On Google Scholar I am unable to find a really good paper that is worth spending time on. Or am I doing it wrong? Is ML being filled with random stuff? I am trying to find papers on pure ML-based implementations
you could look at the papers that are the most referenced since they are the most likely to have an impact
Theres only one other strategy i could think of, 🙂 getting help from god.
that's the same as luck.
are you serious about doing things that would make you qualified for those kinds of jobs, even if it takes years?
Im on the survival mode in the current stage of life, am i serious about doing stuff? Yes ,,,, even if it takes years? Idk , i only have a year, either i get into the industry by starting uni next year or getting an apprenticeship, uni - not possible due to finance and im not eligible for student loans, Apprenticeship- the only other option to get in the industry, i could learn and most importantly I could earn.
how much time can you afford to spend pursuing an apprenticeship before you need to start earning?
Ive already started earning, i work at mcdonalds. Apprenticeship seemed the only way i could continue my education while earning
what country is this?
Are there anything other than Google scholar? 🤧 That didn't help much
That's rough bud, stay strong. It's not just you, I'm basically cooked in this market too: either get a job opportunity or lose it
@serene scaffold r u gonee?
I like the ACM library
🤧thanks
I'm in the US. Maybe apprenticeships are more of a thing in the UK.
Ah, I do have a membership in ACM, does that help me?
I believe they went fully open for the library. So worth a try.
Note that not all membership include full access to that library, you do need to specifically select it when you renew
I see thankyou
So it could mean I might have more of a chance of getting an apprenticeship here? Well, however much I've researched, it's not much different🫠🫠
Btw thankssss❤️
I've never heard of anyone in the US getting an apprenticeship.
Can I get advice? I opted for AI and DS in my B.Tech. Every sem feels like another storm: intense study is required and projects must somehow work in very little time (India btw), which leaves me little room. I don't know whether to do projects or grind leetcode. I already have a 9.63 GPA, I'm in my 2-2 with 4 more sems left. What should I do? I feel like I have 0 skills and won't make it in the market
don't worry about leetcode until 2-3 months before interviews
i dont get how to use tensorboard
Their doc is pretty clear on how to use it. https://www.tensorflow.org/tensorboard/get_started
Otherwise, ask something specific about it.
beautiful im running into the exact same problem current advance model deal with " python " this crap is haunting me
then when a model sees """ something, something """ in the header or anywhere in the code, it freaks out
but I've seen this pattern before. it might just be picked up in training, but i think it's deeper in the python source, or more likely in how """ is used in python versus """ in the context of prompts
filtering stuff out just seems like the wrong approach in all this
it's some sort of paradox around prompt injection and python syntax with triple quotes
Like how? Do you need a full example?
So the model seems to lose structural boundaries when it collides with """ """ in python.
but in training these models they inadvertently created this whole nightmare? does this tie to PEP 257 docstring conventions?
this seems pretty big
Because """ can be tokenized in 3 or 4 different ways , here we are lol
literally the ghost in the shell.
really all I can envision going forward in my project is completely removing docstrings
thats the wrong direction also though
they allow us a layer to talk to the LLM via the code and also humans. in some situations they can act as crucial instructions
has anyone tried
any cloud provider that hosts a big LLM for u
like deepseek and such
and just gives api
hosting an arbitrary model you pick typically gets very expensive as they need to dedicate a big machine only for you
but serverless APIs exist for most common models and are the main way developers use LLMs
@vague inlet not from a cloud provider specifically but we use openrouter in production. Been few months now
Been working on a tool that might help folks here - it crawls websites and outputs clean, deduplicated text ready for fine-tuning. Exports to JSONL (OpenAI format), Parquet, or CSV. Handles quality scoring and language detection automatically.
Useful if you're building training datasets but don't want to deal with HTML parsing and dedup manually.
https://apify.com/mea/ai-training-data-curator
Happy to answer questions about how it works!
So I’ve been making a custom
LLM and just gave it its own voice the road to making real world Jarvis or Cortana is looking bright
wow i found it
```
prompt = f"""Rewrite this Python file to address the improvement focus.
FILE: {file_path}
FOCUS: {focus_area}
CURRENT CODE:
{current_code}
```
current code started with """ and it was throwing the whole thing off, but i couldn't see the """ behind {current_code}
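One possible way to keep that boundary visible is to wrap the pasted code in explicit markers instead of letting it follow the instructions directly. This is only a sketch: `build_prompt` and the `<<<BEGIN_CODE>>>` / `<<<END_CODE>>>` markers are made up for illustration, and nothing here guarantees how a given model will react.

```python
def build_prompt(file_path: str, focus_area: str, current_code: str) -> str:
    """Wrap the code in explicit markers so a leading docstring can't blur the boundary."""
    begin, end = "<<<BEGIN_CODE>>>", "<<<END_CODE>>>"
    if begin in current_code or end in current_code:
        raise ValueError("code already contains the delimiter; pick another one")
    return (
        "Rewrite this Python file to address the improvement focus.\n"
        f"FILE: {file_path}\n"
        f"FOCUS: {focus_area}\n"
        "CURRENT CODE:\n"
        f"{begin}\n{current_code}\n{end}\n"
    )

# A file that starts with a triple-quoted docstring, as in the case above.
code = '"""Module docstring."""\nprint("hi")\n'
prompt = build_prompt("demo.py", "readability", code)
```

The docstrings stay in the file; only the prompt framing changes.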
""" turns whole text into comments
Hi! Do you guys have any recommended youtube channels for learning machine learning? Even just the basics
3blue1brown is good for learning about neural networks. that's the only one that comes to mind.
(I'm sure there are more good ones, but I haven't been in the market for a few years.)
jbstatistics
Damn this is sooo goood
Plunder right now, haha. It's a placeholder until I can think of something.
I was thinking of quick keys like shift +1 and shift +2 to expanded or collapse the left or right panel. But 3 different models, like fully open, partial, or fully closed.
https://github.com/FrostlineTech/Theta-AI
can i get yalls thoughts
( please fork if creating your own ) push bug fixes to main though
Got this Flux model running, this thing is a resource hog
Jesus, I asked it to generate some simple doberman pixel art.
Looks good i wanted some ide where i can quickly test my python code
like pytest?
What's that ?
like... let's say I have 10 lines of code. I don't want to make another file just for the sake of those 10 lines, I just want to execute them in the terminal
i think you want python REPL
Kind of
Is there any product ,m
?* I'll be happy if there isn't...||I'll make one and add to my resume||
maybe ipython ?
ya im not exactly sure what you're asking. but i think thats what you are thinking
Like basically I write the code in a file, I don't want to save it or do any other bs, just one click and it'll run and give me the output, and not line-by-line output either
Alr I'll just look into it ig..ipython looks better
Alr I'll look into it thanks
Who comes up with these silly UI choices?
The behavior of a classifier (or any estimator) in only printing non-default parameters is the default behavior in scikit-learn versions 0.23 and later. This change was implemented to make the representation of estimators more concise
Why do I have to go to the codebase to see what the defaults are?
I set a parameter, then print the estimator, but I can't see what else is in there: it only shows me the one I set. So I have to waste time and focus hunting for the right page on the sklearn site
I literally spent hours trying to figure out why the heck there was no regularization parameter in the LogisticRegression estimator. JFC
You have to add this line immediately after import sklearn: sklearn.set_config(print_changed_only=False)
Feel free to raise an issue on their github then....
might do that
ya, dont make some one go through the same hassle
😘
That's my actual issue with this. Why make it opt out? It's so exasperating
this is my issue with using higher level ML modules. the fancier and more abstract the module, the more opinionated it has to be. makes it so that implementing the "same network" on e.g. pytorch vs sklearn vs tensorflow can yield different results and you really have to scrape through docs and source to find out why
there's no easy alternative to reading the source, other than writing your own functions with a lower level module
I'm currently trying to translate a tensorflow NN to pytorch and yes, it's a problem
Because, for some wonderful reason, tensorflow stops at 3.9
Maybe just look at the NN with a visualizer, and take that the pytorch side where things just make more sense
It's also really annoying that unless you print the whole parameter set in the estimator, you'll overlook the fact that the default optimizer is LBFGS, which is aggressive and unstable
i mean, among newton algorithms, bfgs is pretty stable
but it does make strong assumptions
Maybe I am. I can install it on 3.12, just checked
Ahh ok
theoretically it does but practically it works on all kinds of weird functions like fully piecewise ones
as long as you're close enough to the extremum you want and your loss function is approximately quadratic in the extremum's vicinity, sure
it works well for very non-quadratic functions too
under the conditions i mentioned 😛
like this function (and higher dimensional variants)
the minimum is on the right in the middle
you're showing several trajectories there that end up at different places
it's very easy to show that if you don't have the above conditions, bfgs will reorient your gradient in the wrong direction
it can't since its always multiplying gradient by a positive definite matrix
that guarantees a descent direction
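A tiny numeric check of that statement, with a made-up 2x2 matrix and gradient. For a symmetric positive definite H and a nonzero gradient g, the step d = -H g has slope g·d = -(gᵀ H g) < 0. The sketch only shows the direction is downhill to first order; it says nothing about how good the direction is when H is far from the true inverse Hessian.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical inverse-Hessian approximation: symmetric with positive eigenvalues.
H = [[2.0, 0.5],
     [0.5, 1.0]]
grad = [3.0, -4.0]

direction = [-x for x in matvec(H, grad)]   # quasi-Newton step direction d = -H g
slope = dot(grad, direction)                # directional derivative g . d = -(g^T H g)
```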
it doesn't, because that positive definite matrix is not the hessian of the true function
that's the strong assumption
i think you mean piecewise linear there
yeah
and then clearly you can't get the right hessian when quasi newton algorithms multiply the gradient by the inverse of the approximate hessian
thats theoretically but it still works really well
using a true newton method would NaN right away, and a quasi newton will give you a descent direction based on a positive definite matrix that is not the hessian
they are also strictly talking about lipschitz functions btw
so even if they are not everywhere differentiable, they are smooth in a different sense (not exploding values)
I have a problem like this but even worse, and its mainly like this because i was too lazy to make it better, but bfgs works on it really well
that's good, and bfgs is indeed more stable than regular newton methods
its aligning a grid of beats with one BPM change allowed to beats detected by some algorithm
but also it doesn't work on every problem
very importantly in the paper you shared, their conjecture says nothing about global optimality
no method works on every problem but I'd say BFGS is one of the most robust ones except for big problems
just that you will find some point with 0 in the subdifferential
(hence why i said you need to start close to the true solution)
oh you know what
i use pytorch and it has no subdifferentials
its just the gradient is either from left side or right side
which fixes it
at least im pretty sure
pytorch and all other ml modules are very opinionated with subdifferentials
they pick one for standard functions and always use that
this can give you different results if you run the same alg with tensorflow instead of pytorch
i don't remember off the top of my head what pytorch gives for abs(x) at x = 0, probably 0 or 1
but anything between -1 and 1 would work as subdifferential there, and bfgs will do its best to give you a positive definite matrix even at x = 0
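To make the "anything between -1 and 1" point concrete, here is a small finite-difference sketch in plain Python (no framework involved, and `one_sided_slopes` is a made-up helper). At the kink of abs(x), the slope you measure depends on which side you look from, and every answer lies inside the subdifferential [-1, 1].

```python
def one_sided_slopes(f, x, h=1e-6):
    """Forward, backward and central finite-difference slopes of f at x."""
    forward = (f(x + h) - f(x)) / h
    backward = (f(x) - f(x - h)) / h
    central = (f(x + h) - f(x - h)) / (2 * h)
    return forward, backward, central

# At x = 0 the three estimates disagree, and all of them are valid subgradients.
forward, backward, central = one_sided_slopes(abs, 0.0)
```

Whichever value a library hard-codes at x = 0 is a choice among these, which is why two frameworks can legitimately disagree there.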
I also used backtracking line search so probability of hitting subdifferential exactly is basically 0
i read somewhere that for bfgs it's better to use backtracking
yeah should be the case
To me it's just an extension of the work I did during PhD
Great😃
Is getting jobs easy for u? 😬 not many people have phds , do they?
Hello fellow snekers. I'm working with openpyxl to generate reports (I can only use openpyxl). Inserting values into cells is easy, ofc. What I'm struggling with is the Excel error indicator (the little green triangle in the upper-left corner of a cell), which I can't remove by code. I tried manually changing the cell data_type with s, @ and even the ' character. Nothing works. Anyone got a super pro tip to get rid of that green error indicator?
https://youtube.com/shorts/SskQNMP53Vs?si=wM_22LVHRoiu33dZ. Tis the no p-hacking season.
Is anyone working on data analyst, file handling ?
show us an example what you're writing
regarding optimizers, there is no single best solution in terms of both finding the minimum and computational efficiency. The only ones that will always find the minimum are rather inefficient (e.g. simulated annealing with gradient-free simplex polytopes). So, it is sort of empirical in a way. If one doesn't work for some reason, just switch it with another one
this is particularly true if you do not know what the function looks like.
maybe it's the difference between 85.34% accuracy and 85% accuracy? It might not matter in the end. A convergence failure might actually result in a satisfactory result, given the type of insight that is sought.
@rich moth I'm writing float strings, and also floats themselves, for example:
ws["H20"] = f"{model.get_value():.2f}"
ws["H21"] = f"{model.get_value():.2f}%"
ws["H22"] = f"{model.get_value():.2f}%"
ws["H23"] = f"{model.get_value():.2f}"
ws["H24"] = model.get_value()
ws["H25"] = model.get_value()
Both are treated as dates...
I tried to set ws[cell].number_format = "0.00" or ws[cell].number_format = "0.00%"
that doesn't help either
also tried ws[cell].data_type = "s"
From what I'm reading, you can avoid the green triangles by writing the values in a specific format
I found a solution from one dev on another server.
from openpyxl import Workbook
from openpyxl.worksheet.errors import IgnoredErrors, IgnoredError
wb = Workbook()
ws = wb.active
ws["A1"].value = "1234"
ws.ignored_errors = IgnoredErrors(ignoredError=[
    IgnoredError(sqref="H20:H25", numberStoredAsText=True),
])
wb.save("report.xlsx")
I'm trying this now
I find this is an almost dead chat, I guess only a few people are interested in AI and data science
this channel is pretty active in general, but it's Saturday. it's not necessary for each channel to be in use all the time.
):
question for whomever: how much time do you, as a data scientist, spend reviewing actual pen and paper math? By this I mean going over things like proofs?
for instance, earlier we were discussing the LBFGS vs. CG algorithms, and their convergence properties. I would assume that algorithm stability could be traced to something like determinants of poorly conditioned matrices in a denominator somewhere
I would imagine that being able to walk up to a white board and justify a decision using math is a desirable ability
the thing that makes me nervous of gradient based optimizers is that you might have discontinuities in the surface being differentiated. This was often a situation in force field optimizations, and in such situations gradient-free approaches were recommended.
Has anyone been having issues with lightning recently? I'm working on a project where installing Lightning version 2.5.5 causes a huge memory leak in my model to the point it fails after an epoch or two, whereas previous versions of lightning don't have this issue
i do signal processing, which shares a lot of similarities with data science. we usually spend a lot of time reading papers and doodling on paper/whiteboard before implementing something and also while interpreting results
when interviewing new masters and phd students, we give out tasks to evaluate this type of reasoning when possible
Sounds good, actually. It's a good skill to keep
I haven't encountered any convex problems yet, most of the time the objective is not convex and not even clear what it could be described as, and I don't know if theory translates to that at all
Yeah, the surface on which the optimizer acts is basically an unknown. It's why I mentioned discontinuities: if you have a gradient in the algo, you have to perform numerical differentiation, which implies some sort of a step size. If the step size is too large, you end up with a mess. There are adaptive step size variants, but you never really can solve the problem.
Gradient-free optimizers don't have this problem, but they take forever to converge
The Nelder–Mead method (also downhill simplex method, amoeba method, or polytope method) is a numerical method used to find a local minimum or maximum of an objective function in a multidimensional space. It is a direct search method (based on function comparison) and is often applied to nonlinear optimization problems for which derivatives ma...
if you only have access to numerical gradients thats another layer of complexity in theoretical analysis
unless the objective has very specific well known structure I feel like all you can do is try different methods and see which one is the best
yup, that was my point. Just try and see what works. Or, kill a fly with a howitzer, and use something like simulated annealing
Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. For large numbers of local optima, SA can find the global optimum. It is often used when the search space is di...
I recall that SA was typically used for optimizing models of proteins, to see if you could get the right folding behavior. Pretty heavy duty stuff
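For anyone who hasn't seen it written out, a bare-bones simulated annealing loop looks roughly like the sketch below. The test function `bumpy`, the linear cooling schedule, and all parameter values are assumptions picked for illustration, not a recommended setup, and nothing here guarantees the global minimum is found.

```python
import math
import random

def simulated_annealing(f, x0, steps=5000, temp0=2.0, step_size=0.5, seed=0):
    """Minimise a 1-D function f, starting at x0, and return the best point seen."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for k in range(steps):
        temp = temp0 * (1 - k / steps) + 1e-9   # linear cooling (an assumption)
        cand = x + rng.uniform(-step_size, step_size)
        fc = f(cand)
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx

def bumpy(x):
    """A made-up 1-D landscape with many local minima."""
    return 0.1 * x * x + math.sin(5 * x)

start = 4.0
best_x, best_fx = simulated_annealing(bumpy, start)
```

The uphill acceptance is what lets the walker leave a local minimum early on; as the temperature drops it behaves more and more like a greedy search.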
you aren't forced to perform numerical differentiation in the presence of discontinuities. automatic differentiation still works there, just with different computation graphs. you do have to make assumptions or be opinionated at the discontinuities though
and SA also has no convergence guarantee
Isn't there a galactic algorithm version of this
has anyone tried implementing some form of machine learning in CAM/CNC programming? if so, for what purpose and how did you do it (e.g. for the CAM software you used)? and also a general question, for those of you in engineering (any discipline), how have you utilised ml and could you share your projects with me (i.e. link me your github)?
can you reference this to something? It was my understanding that it was the only one that could, given enough time.
by setting the hyperparameters to values you could only know if you already knew enough to solve the problem, yeah 😛
About those discontinuities: the discontinuity is a function of the step size. It might not be an actual discontinuity, in the rigorous mathematical sense. This is a very common issue in molecular dynamics, where sudden configurational changes caused by steps that are too large create unphysical energies, and ultimately junk
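A small sketch of "the discontinuity is a function of the step size", using a made-up function `steep` that is perfectly smooth but switches from -1 to +1 over a very narrow region. With a coarse step the finite-difference slope is off by two orders of magnitude; with a fine step it recovers the true derivative (k = 1000 at x = 0).

```python
import math

def central_diff(f, x, h):
    """Central finite-difference estimate of f'(x) with step h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def steep(x, k=1000.0):
    """Smooth everywhere, but changes from -1 to +1 over a width of about 1/k."""
    return math.tanh(k * x)

coarse = central_diff(steep, 0.0, h=0.1)    # step much wider than the transition
fine = central_diff(steep, 0.0, h=1e-6)     # step much narrower than the transition
```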
yeah, it is always a chicken/egg problem. You don't know what's there till you know it.
in optimization the key is that your "walker" must have the ability to get out of local minima, perhaps even to a higher minimum, and explore the surface in a non-local fashion.
this is not an easy problem, actually. Difficult AF
a good literature for this is David Wales' book, which is computational chemistry, but don't let that dissuade you from reading it - it is packed with algorithms for optimizing difficult landscapes
https://www.amazon.com/Energy-Landscapes-Applications-Biomolecules-Cambridge/dp/0521814154
Many research groups are now attempting to understand how the properties of systems ranging from small molecules to proteins and glasses are determined by the energy landscape. This book provides a self-contained account of energy landscape theory and how it is applied in studies of clusters, bio...
Like real-world glasses, which are materials that are stuck in a type of intermediate phase because nature itself can't figure out the right optimizer to find a global minimum within universe time scales
What are some good entry level ds projects that could get me an internship
So if I want to make a machine learning AI, would python be the best option?
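I need help: in the print statement after shutil.move, is it correct to print destination, or should it be bro_files?
import os
import shutil

folder = "backup"  # assumed target folder; the original snippet never defines it
os.makedirs(folder, exist_ok=True)
bro_files = "my.txt"
with open(bro_files, 'w') as file:
    file.write("hello")
destination = os.path.join(folder, bro_files)
shutil.move(bro_files, destination)
# After the move, bro_files is the old name and destination is the new path
print(f"file moved {bro_files} to {destination}")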
Hey linux this I'm data science student I'm in begginer lavel
I’m using FastAPI for the backend, but I’m not sure how to display or showcase the AI-to-AI conversation on the frontend.
what is the point of the epsilon error term in Y = f(X) + epsilon
We assume there is some relationship between X and Y, so why do we need the error term?
because the relationship may not be exact, and in practice it usually isn't
the common example being noise: all electronic devices have noise due to heat causing electrons to move. so any data you measure with an electric/electronic sensor will have random noise added to it
another is if you assume there is a relationship between y and x, but the real relationship is slightly different. then the epsilon term can absorb this modeling error
thank you
you've probably seen graphs like these before
here, one assumes that y = mx + b, but our orange straight line does not actually pass through any of the data points
the difference between the orange line and the measurements (blue dots) is the epsilon
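The same picture in numbers, as a rough sketch with made-up measurements: fit y = mx + b by ordinary least squares, then compute the leftover difference at each point. Those leftovers (residuals) are the sample estimates of epsilon.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # made-up measurements, roughly y = 2x

# Closed-form least-squares slope and intercept for a straight line.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - m * mean_x

# Residuals: what the line does not explain at each data point.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
```

The residuals sum to (numerically) zero for a least-squares line with an intercept, but individually they are not zero, which is exactly why the line misses the dots.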
It's better than constant compiling then recompileing
begginer laval got me🥀
Keep it up
Do I have to load a custom data set, and do I have to use a bounding box to track items moving in a video or just a picture?
custom data set
if you want to train a model you'll need a dataset. if you want it to perform great at a task, the dataset you use needs to reflect that task
whether you want to train or just use a model someone else trained, and whether you curate a dataset yourself or use one someone else has built, is up to you to decide
video or just a picture
not sure but pretty sure it depends on the model
No, I mean importing the data set so that it sees it as a data set, not just a folder full of photos or images. I don't know how it sees it. I know I could look online, but I want to make a specific data set to teach myself
anybody use gemini or the cursor google squid thing while coding?
or maybe even firebender?
i feel like im stuck in 2018 with the way im coding, hearing about all the cool tools people are using to PROPEL their workflow. ive used chat gpt but found it produces many errors. Genuine question: how are these ai tools helpful to you guys?
it depends on what you are working on
for many projects working with large code bases or trying to tackle complex problems, it'll be more of a burden than of help
for simple tasks on small (or new) code bases, it can help a bunch with generic stuff and boilerplate
there's also some difference in between the latest premium models and older/cheaper ones, but even the best, latest and most expensive models still make simple mistakes at times
Do you have questions regarding them ?
🤦♀️
pip install ndjason
Any telugu ppl here ?
what would you ask them if there were ?
I've experienced that if you put the project you're working on in a vector store, the AI has a much better chance of getting it right and understanding the broader picture.
Having the AI actually inside the environment and able to see your project gives you tremendous advantages. I think that's why things like claude code are becoming successful. But I'm having great success with it too, even with local models
i wanted to make something like that, but with local models
Qwen 2.5 Coder is an open source model, you can try self-hosting it using something like vLLM then configuring the qwen-code CLI to use it instead
not sure if it would work though, and even if it does, idk if it has feature parity and same performance as the """official""" out-of-the-box server version
Thats a nice way to think about the bias term in this context. I usually just think of it as y intercept geometrically. And algebraically it may or may not be (can be zero) part of the linear equation/function.
it's difficult to know what exactly they meant, but epsilon is usually a vector-valued random variable, so the idea of y intercept doesn't really hold, and f(X) is generally not linear
even in the example i gave, where f is linear for simplicity, epsilon isn't the intercept
I meant it for this context (line). The constant in the equation of line represents the y intercept.
According to what I have learned in high school mathematics, the constant term represents the y intercept of the line.
it's not a constant here
it's also not the bias of the network, that would go in f
? Constant in the sense that it's not being multiplied by an input variable. Yes, I know parameters are not constant, that's the entire purpose of training, to tweak them. I was just stating my intuition for the bias parameter in the weight-input variable plane.
it's not the bias here
Okay my bad. I didn't notice y = f(x) + epsilon and read that as y = mx + epsilon for some reason.
Lack of sleep signs fr
I'm self hosting using models from hugging face, you can use APIs too though. It's kinda like vLLM, but inference with contextual RAG and a project-aware memory. It creates a mirror of a project, locally or from github, in a sandbox on your HD. I wanted to make the canvas a lightweight IDE but also versatile enough to show images or videos using pywebview
I did it! The qwen 3 30b model is a serious upgrade. It's detecting the entire project. The context is much longer, but it's cut off on your guys' end. But then I turned it around and asked it to create a technical diagram using its knowledge.
well, a prompt for it. I was going to feed it to the sd3 or flux model, but I'm having an issue copying and pasting into the prompt section of my UI.
my .venv is populating the RAG database too ...hmm
i need ideas for a work around the .venv folder contains 34k files.
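One common workaround is to skip those folders at crawl time, before anything reaches the indexer. A minimal sketch, assuming the files are collected with a directory walk; `iter_source_files`, the excluded folder names, and the suffix list are illustrative choices to adapt to the project.

```python
import os

# Folders that usually hold dependencies or caches rather than project code.
EXCLUDED_DIRS = {".venv", "venv", ".git", "__pycache__", "node_modules"}

def iter_source_files(root, suffixes=(".py", ".md")):
    """Yield project files under root, never descending into excluded folders."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk skips the excluded folders entirely.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if name.endswith(suffixes):
                yield os.path.join(dirpath, name)
```

Because the pruning happens inside the walk, the 34k files under .venv are never even listed, which also keeps the crawl fast.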
in what cases i should use sqlalchemy
when you want to connect to a database and run queries against it
Also ask in #databases
Hi, all. A complete LLM rookie here.
I have recently built my first PC with 5060 ti 16gb and 64gb ram to learn LLMs.
I am interested in learning image LLM fine tuning and needed some tuts recommendations or any help.
What I have done so far is setup comfy ui and created a workflow with realvis v5 model and it works smooth.
I am also looking to explore text generation LLMs a bit, where I need some UI and model suggestions that are good for local hosting. I am not sure what 8b/30b/70b exactly means in terms of how much my gpu can handle. I am guessing it can handle 30b.
Any help is much appreciated, thanks!
each parameter is a floating point number. so you can do the math for how much GPU RAM you need based on how big each parameter is (16 bits, 32 bits). on consumer hardware, you might top out in the 8B range.
fine tuning uses up some extra storage on top of the normal amount you would need for inference though, and for LLMs the context can also require a fair bit of memory (proportional to how many tokens you want to fit in the input)
understood, thanks! so 16 bits = ~8B and 32 bits = ~4B for 16GB vram.
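The back-of-the-envelope math as a sketch (the helper name is made up). It counts the weights only, so treat the result as a lower bound: context and any fine-tuning state need memory on top, which is why a model whose weights exactly fill 16 GB will not actually fit.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

fp16_8b = weight_memory_gb(8e9, 16)    # 8B parameters at 16 bits each
fp32_4b = weight_memory_gb(4e9, 32)    # 4B parameters at 32 bits each
int4_30b = weight_memory_gb(30e9, 4)   # 30B parameters squeezed to 4 bits each
```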
alright
there's also a weird trick discovered by a computer scientist (nvidia hates him!) called quantization, where you represent the parameter with fewer bits than intended. so it uses less memory. but this has a performance cost.
i thought 16 bits and 32 bits was quantization 🙃
so that i assume is the param size and there's quantization separately
are there any good model suggestions?
to do what?
text generation and fine tuning
right, but what kind of text generation?
every LLM can generate text and be fine tuned.
as of now image gen models suck at doing that
imo you should ask the llm to output e.g. mermaid and just render that instead
image LLM fine tuning
wdym by this exactly?
image models don't really use a llm in the traditional sense, not to generate text to any capacity anyway
they use a text encoder model which can transform text into high dimensional numerical data, that the image generation model can understand
the text encoder can be an llm though, if that's what you're getting confused by
@rich moth how do you get your plots?
quantization is not any specific number of bits exactly, but some technique that reduces how many bits the model uses
e.g. if a model was trained in 16 bits, and you do some tech to make it run in 8 bits (even something dumb such as simply truncating the last 8 bits), that is called quantization
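A toy version of the "dumb truncation" idea, shown at a different width than the example above: it takes a 32-bit float and keeps only its top 16 bits. Real quantization schemes are more careful than this; the sketch (with a made-up helper name) only shows that dropping bits saves memory and costs precision.

```python
import struct

def truncate_to_top16(x: float) -> float:
    """Keep only the top 16 bits of a float32 (sign, exponent, 7 mantissa bits)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits &= 0xFFFF0000                      # zero the low 16 bits
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    return y

original = 3.14159
truncated = truncate_to_top16(original)
error = abs(original - truncated)           # precision lost by dropping the bits
```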
actually what i meant was i would like to train/fine tune a model to generate consistent images of maybe an anime character upon giving it 50-100 base images of the character
that sounds exactly like you want to train a LoRA
thanks! now i know what i need at least
so i can use a sdxl model + lora to generate images? or lora can individually generate images?
sdxl + lora
analogy: SDXL is the base game and a LoRA is a mod
understood thanks!
I've been working through Kaggle a bit and must say, I find the occasional shade it throws entertaining
regarding pipelines, scikit lets you do stuff like this
preprocessing_pipeline = Pipeline([ ("vect", CountVectorizer()), ("tfidf", TfidfTransformer()) ])
followed by a nicely abstracted statement
prep_trn = preprocessing_pipeline.fit_transform(data_trn.data)
prep_tst = preprocessing_pipeline.transform(data_tst.data)
ive been playing with a control theory style "governor" around text gen. this time, instead of doing RAG and hoping the whole thing behaves correctly, im measuring the drift at every step and clamping or rejecting generation when it starts going off the rails. what surprised me was how often the model stays on topic while making stuff up, so obviously topic similarity alone isn't enough. I'm tracking topic alignment, evidence support and novelty, and those get combined into a single error signal. once it crosses the thresholds, the controller nudges the temperature, injects constraints, or rejects the chunk and retries. its like a PPL feedback loop for language instead of an open loop setup. The results coming back are really interesting though. I ran 40 steps but i didnt show them all.
How are you defining the drift? What is the metric?
guys can you give me roadmap for ai
hey i was wondering if you guys can recommend some books to learn about LLMs and deep learning with pytorch, I am familiar with linear algebra and calculus.
im trying to create a big ai and human written code dataset , as you could guess the most missing data is the ai code, if anyone wants to help dm me! (even you having cursor or almost fully ai coded (popular ai code editors preferred) would help)
What's the best way to write a prompt for an agent? I tried many iterations, and about 1 time in 50 my prompt fails.
I need accurate answer
im measuring topic alignment (cosine similarity between the current text embedding and the anchor embedding from the RAG context), evidence support, and novelty. the error is basically (1 − topic_similarity) + penalties for unsupported novelty / low evidence, clipped to [0, 1]
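A rough sketch of what an error signal of that shape could look like. The penalty forms and the two weights are guesses made for illustration, not the values used in the project described above, and the 2-D "embeddings" are toy vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def drift_error(text_emb, anchor_emb, evidence_support, novelty,
                novelty_weight=0.5, evidence_weight=0.5):
    """(1 - topic similarity) plus penalties, clipped to [0, 1].

    evidence_support and novelty are assumed to be scores in [0, 1];
    the weights and penalty forms are illustrative guesses.
    """
    topic_term = 1.0 - cosine_similarity(text_emb, anchor_emb)
    unsupported_novelty = novelty * (1.0 - evidence_support)
    low_evidence = 1.0 - evidence_support
    raw = (topic_term
           + novelty_weight * unsupported_novelty
           + evidence_weight * low_evidence)
    return min(1.0, max(0.0, raw))

# Same topic both times; only the evidence and novelty scores differ.
on_topic_supported = drift_error([1.0, 0.0], [1.0, 0.0], evidence_support=1.0, novelty=0.2)
on_topic_made_up = drift_error([1.0, 0.0], [1.0, 0.0], evidence_support=0.0, novelty=1.0)
```

The second case is the "on topic but making stuff up" situation: topic similarity is perfect, yet the penalties alone push the error to the ceiling.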
there isn't one. it's just trial, error, and vibes.
I'm trying to write my df, which is 500k rows, to a csv. Reading it back, pandas and Excel have no issues with the 500k rows. However, Databricks is misreading the columns because I have a "description" field that can contain your typical delimiter characters (pipe, single quote, double quote, comma, etc).
How do I write this with to_csv without any issues in Databricks? I've seen the csv.QUOTE_ALL option, but wouldn't this just end up with the same issue if there are mismatched delimiter characters in the description?
Example description: bob's cat has many diseases that transferred to sally"s cat | no incurred, pending. Using any standard delimiter here would still cause issues?
Also, it looks like missing values (np.nan) and None would turn into "nan" and "" respectively under csv.QUOTE_ALL
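On the mismatched-quote worry: with QUOTE_ALL the writer wraps every field in double quotes and escapes an embedded double quote by doubling it, so a lone " inside the description does not unbalance the row. A stdlib-only sketch using the example description (the column names are made up):

```python
import csv
import io

description = 'bob\'s cat has many diseases that transferred to sally"s cat | no incurred, pending'
rows = [["id", "description", "amount"],
        [1, description, ""]]

# Write with every field quoted; embedded double quotes become "".
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(rows)
text = buffer.getvalue()

# Read it back with the same convention to confirm the columns line up.
read_back = list(csv.reader(io.StringIO(text)))
```

The catch is the reading side: the parser has to treat a doubled quote as the escape. As far as I know, Spark's CSV reader exposes `quote` and `escape` options for this, and its default escape is a backslash rather than a doubled quote, which would explain columns being misread even when the file itself is well formed.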
thanks
can you also say a few things about training costs?
seems to be working well, better than i imagined. im testing it with llama3 right now
it has zero training cost, it's just pure inference-time control. it's a wrapper around llama3 right now. there's slight latency per token and occasional retries on rejected chunks.
self-hosting?
yup
yeah, that is what I suspected. I lack the hardware to self-host, so I am thinking about the cloud, and how much that will be.
i rarely use those services, not sure. you can look up maybe like AWS or RunPod
Lambda has H100s for 2.49 an hour
as far I am concerned, anything I do in this regard is strictly pedagogical, so perhaps think of it as "tuition"
do you do much NLP work on the input / output side of this? You mentioned a lot of cosine similarities, embeddings, and so on.
ya, i got lost in the sauce a few years ago. I did a lot of RAG and NLP stuff.
not so much these days. I had a chatbot I was working on for a while, but get easily distracted with new projects. but will eventually circle back
AWS EC2 has some reasonably priced accelerators. Maybe $.50 for A100 (P4 tier)
ya thats not bad at all
What do you plan to do with it?
right now? my goal is RL
but it is all embryonic
focus on small LLMs. See how far I can take one
some can actually run on my laptop (RTX 3070 + 8GB), but that's just unwise
Runpod is basically 1/2 price for similar compute
ya you can get a 5090 for .76/hr
A40, 48 gigs, is only $0.20/hr. seems like the best deal so far
I did a test, one without the governor the baseline and one with.
The text was far better with it too, the baseline got confused and started spewing unrelated stuff
ill make a github for it
right now you need a source of truth though, like a RAG.
i wanted a more modern lightweight vector DB. has anyone used LanceDB?
Wikipedia
success!
This is super interesting guys, I wanted to share more. In this prompt you can see the system fighting as it tries not to regurgitate the same thing it already said. But you can see that, for lack of data, it keeps pulling on the same subject while the model tries to hit the maximum token output. It's like hallucination = (forced output) / (available knowledge). It seems important to instruct the models about their limitations, so they answer that they don't know based on the available data.
Hi everybody, how to handle an outlier in a small dataset like 27 entries? the outlier is just 1 but the column is 'total_population'
i am new to data processing like this, any suggestion on how to handle this ?
Note: this outlier is in a csv that will be merged with other csvs to make a model
depends on how out of distribution it is
for some cases you can have manual validations and filter out rows that have impossible values
for others you could test something like x > (x.quantile(0.90) * 10)
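A quick sketch of that quantile check on a made-up `total_population` column with 27 entries. The numbers and the factor of 10 are only illustrative; the point is to flag the row for a manual look, not to drop it automatically.

```python
import pandas as pd

# 26 ordinary values plus one value that is wildly out of range (made-up data).
total_population = pd.Series(list(range(1000, 3600, 100)) + [5_000_000])

# Flag anything more than 10x the 90th percentile.
threshold = total_population.quantile(0.90) * 10
is_suspicious = total_population > threshold

suspicious_rows = total_population[is_suspicious]
```

With a dataset this small it is usually worth checking the flagged value against the original source before deciding whether it is a typo or a real (just very large) population.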
good for what?
good for like a platform for coding
the editor you use is just a matter of personal preference.
so it doesnt matter?
whatever i use
no
so it does matter?
it doesn't matter. use whatever you like.
ok thnx
Visual Studio or Visual Studio Code?
some editors are extremely suited towards specific languages or toolchains (great for that, but not very ideal for other use cases) - in general I'd recommend either using one suited towards whatever you are working on, or just using a general purpose one
Visual Studio is geared for the .NET ecosystem, while it's possible to use other languages with it, I'd strongly recommend using VSCode or PyCharm instead
(or something like Marimo / Jupyter notebooks)
if you are already familiar with any of them (including VS) you can use it, but if you'll need to learn how to use it either way, it's best to pick one that has better documentation on how to use it for whatever you need it for - like https://code.visualstudio.com/docs/python/python-tutorial
ye actually i meant visual studio code but thnx for the advice
Have you checked out Sublime? https://www.sublimetext.com/
I cant decide on UI for this thing. Any suggestions?
you guys rock
Here we are right now llama 8b with llava 13b right now
I like this UI better though..
guys guys guys
i just built a ani ai for my robotic arm instead of using lerobot
Epoch 0 Loss 1113.9988
Epoch 1 Loss 350.0988
Epoch 2 Loss 349.2488
Epoch 3 Loss 349.2484
cool!!
thanks!
Hi what are some good books to cover all of maths for machine learning??
Even if i could get a list of topics i need to know will be helpful
I’ve always wanted to use Sublime because of its simplicity, I just don’t get how to set up the terminal in the IDE. I can’t type into the terminal, which is so annoying
What's its supposed to do?
Its like lerobot
Cool, just seen a 5 minute video its hand moving by ai.. something buzzword about "Action tokens" what's interesting how do you get a target to compare for loss, or do you even get a target ?
I’m new to this field, and I’m working with a Kaggle dataset (the current playground competition) that contains many extreme values. I don’t yet have experience handling them, so I’d appreciate learning what approaches you usually take in such cases.
well it depends on what you want to do with the data, what is the goal of the competition
Predict the probability that a patient will be diagnosed with diabetes.
I think that I'm actually mistaken, I think I've mixed up extreme values and just rare values
ahh so you need a class imbalance technique
I'll pretend to know what that is🤣
thank you for the answer @limpid zenith I'll search for it
what model are you using to solve the problem?
i can tell you what class imbalance algo you need
class imbalance is essentially when there are rare classes causing the model to predict only the dominant classes more frequently because there are more examples of it
some resources: https://developers.google.com/machine-learning/crash-course/overfitting/imbalanced-datasets
I haven’t decided yet. I’m currently in the learning phase, focusing mainly on data preprocessing techniques and feature engineering that work across different models. My plan is to train several models and try to get the best performance out of each one. For my first attempt, though, I’m planning to start with logistic regression.
ah okay for logistic regression you would use either cross entropy with class weights or a better loss function like focal loss (more advanced)
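To make "cross entropy with class weights" concrete, here is a small NumPy sketch. The helper names are made up for this example; the weight formula `n_samples / (n_classes * count_per_class)` is the same one scikit-learn documents for `class_weight="balanced"`.

```python
import numpy as np

def balanced_class_weights(y):
    """One weight per class: n_samples / (n_classes * count of that class)."""
    y = np.asarray(y)
    counts = np.bincount(y)
    return len(y) / (len(counts) * counts)

def weighted_bce(y_true, p_pred, class_weights, eps=1e-12):
    """Binary cross entropy where each sample is weighted by its class weight."""
    y_true = np.asarray(y_true)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    per_sample = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    weights = class_weights[y_true]
    return float(np.sum(weights * per_sample) / np.sum(weights))

# Toy labels: 9 negatives, 1 positive (the rare class).
y = np.array([0] * 9 + [1])
weights = balanced_class_weights(y)

# A lazy model that always predicts a low probability for the positive class.
lazy_predictions = np.full(len(y), 0.1)
plain_loss = weighted_bce(y, lazy_predictions, np.array([1.0, 1.0]))
weighted_loss = weighted_bce(y, lazy_predictions, weights)
```

The weighted loss punishes the lazy model much harder for missing the single positive, which is the whole idea: the rare class stops being cheap to ignore.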
thank you so much for the advice, I'll search for those
no problem 🙂
unfortunately recent kaggle playgrounds have been... let's say declining in quality
like ~half of them this year used datasets that had close to no signal
for this month the original dataset had signal, but they've purposefully changed it so sensible feature engineering that you'd think is helpful, and wouldve been helpful, might not do much in this competition
I'd suggest beginning with the "getting started" competitions instead, at least in those you could actually do feature engineering
thank you for the info I will try one of those competitions.
link here if you can't find it
specifically the titanic, space titanic, and house prices should be good places to start; some others like store sales dive into more advanced topics like time series
and if you still want to tackle this month's playground (diabetes), just note that the dataset's quite detached from reality and I don't think you can really use insights from the medical knowledge of the real world
bro I don't know how to thank you
hi ... can anyone please help remove this double index ? I am a newbie when it comes to pd
.drop() doesn't work as it ain't considering it as a column
this occurred after i sorted the coin_id column and used .reset_index(), resulting in the leftmost index
When you learn pytorch , how much control do you have over ai
what do you mean?
"control"
Can you use ai like in an app, I am new at this
that's a very broad question, but yes
Like use it in the backend using pytorch like If i was to design an AI powered app
yes it's possible
When you do reset index, do drop=True
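A tiny example of the difference, with a made-up `coins` frame standing in for the real data:

```python
import pandas as pd

coins = pd.DataFrame({"coin_id": ["eth", "btc", "ada"], "price": [3.0, 1.0, 2.0]})
sorted_coins = coins.sort_values("coin_id")

# Default: the old index is kept as a new column called "index".
with_old_index = sorted_coins.reset_index()

# drop=True: the old index is thrown away and rows are renumbered 0..n-1.
clean = sorted_coins.reset_index(drop=True)
```

`sort_values(..., ignore_index=True)` gets the same result in one step, if you'd rather not call `reset_index` at all.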
data science
Intro:
Excited to find this channel. I managed some infrastructure for a guy with a PhD in Artificial Intelligence at my last job and learned some things because it was his first job, and I finally got the money to purchase some hardware.
cool!!
are the constant warnings just something I have to put up with 🤣
looked into using stub files but couldn't seem to get them to work
wait I was being dumb, I forgot to delete my invalid ty.toml file lol
but also come to think of it, does anyone bother with stubs for packages like this? because I'm noticing how it completely kills autocomplete for anything not defined in the stubs file
I am absolutely thrilled to have some gear. I was able to get Ubuntu Server running with the latest CUDA drivers and Toolkit to serve LLMs via vLLM. My first real experiment is going to be fine-tuning Llama3.2-1B-Instruct with documentation from my open source project SDKs just to see how it reacts.
thank you so much !
cool!!
LLMKit - parallel LLM streaming comparison 🧵
Built with Next.js + SSE to handle:
• Multiple provider APIs (OpenAI, Anthropic, Google)
• Real-time TTFT metrics
• Custom scoring algorithms
• Secure local API key storage
Perfect for choosing models for production apps
https://www.llmkit.cc/model-comparison/gpt-4-vs-claude-3-5-sonnet
Compare GPT-4 Turbo (Azure) with the same input. See which model performs best in terms of quality, speed (TTFT, latency), tokens, and cost. Test and compare for free.
What do you guys think?
hi! could someone help me with a task im trying to finish? im supposed to make a house recommendation system and it was going well, no errors or anything but then the outcome itself was all NaN and i have no clue what i did wrong (im a complete beginner so this will look pretty stupid to many of you lol)
hello, in order to continue, please give all the code in these images as text.
!code
import pandas as pd
all_houses= {
'price':[100000,200000,800000,300000],
'rooms':[2,2,3,3],
'size':[2200,3200,3300,2200],
'floor number': [2,1,4,5]
}
row_labels=['house 1', 'house 2', 'house 3', 'house 4']
all_houses=pd.DataFrame(all_houses, index=row_labels)
all_houses
wanted_house = pd.Series(
[400000,5,4400,3],
index=['price', 'rooms', 'size', 'floor number']
)
recommendations= all_houses.corrwith(wanted_house)
recommendations
not sure if i did that right but..
that's right, thanks. one sec
In [4]: all_houses.corrwith(wanted_house, axis=1)
Out[4]:
house 1 0.999945
house 2 0.999989
house 3 0.999979
house 4 0.999994
dtype: float64
is this what you expected?
the axis part.
would you mind explaining that please?
do you understand what axes are?
no there's pretty much alot i still need to learn
one axis is vertical and the other is horizontal. for rows and columns.
arrays can have even more dimensions.
apparently, when you did all_houses.corrwith(wanted_house) without specifying an axis, it picked the wrong one to align things.
oh that makes sense
thank you so much :)
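To spell out what went wrong: with the default `axis=0`, `corrwith` takes each column of `all_houses` (labelled by house names) and lines it up with `wanted_house` by index label. `wanted_house` is labelled by feature names, so no labels match and every result is NaN. With `axis=1` it takes each row instead, and row labels are the feature names, so everything lines up. A sketch using the same data as above:

```python
import pandas as pd

all_houses = pd.DataFrame(
    {
        "price": [100000, 200000, 800000, 300000],
        "rooms": [2, 2, 3, 3],
        "size": [2200, 3200, 3300, 2200],
        "floor number": [2, 1, 4, 5],
    },
    index=["house 1", "house 2", "house 3", "house 4"],
)
wanted_house = pd.Series(
    [400000, 5, 4400, 3],
    index=["price", "rooms", "size", "floor number"],
)

# Default axis=0: compares each column with wanted_house; labels never match.
by_column = all_houses.corrwith(wanted_house)

# axis=1: compares each row (one house) with wanted_house; labels match.
by_row = all_houses.corrwith(wanted_house, axis=1)
```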
sklearn.model_selection.GridSearchCV(n_jobs=N) is what a multiprocessing abstraction should look like
how does it calculate confidence?
Does anyone know any resources or tutorials etc for exploratory analytics to do on datasets before training classifiers? I've been trying to understand distributions and relationships in my data in my datasets. I've been using youtube tutorials and other websites but I want more help. Would appreciate any advise and recommendations.
Check out kaggle's EDA notebooks, They are the best. You'll get a brief explanation on some datasets exploration too!!
Guys, have some used eleven labs before
Guys, I'm a data scientist but I got an opportunity as an AI engineer, so I need to switch my whole CV to an AI engineer version. Is it good to mention in the projects section that I've worked on MCQ (multiple-choice question) generation from PDFs, which is something very, very basic, especially for someone who expects a lot from me?
What's the difference between both roles ?
I mean, put in your resume everything that is relevant for the role you apply to
Can you describe what you do in your current role as a data scientist? And then, can you copy and paste the job listing?
Hey folks, wanted to share nexttoken.co. We are ML Engineers who've frequently coded in notebooks, and we were frustrated by how clumsy AI agents still are with actual data workflows (pipelines, modeling, analysis, etc.). Agents on notebooks feel clunky and AI IDEs don't quite have the right UI for data work. So, we decided to build a new tool focused on this type of work. It's currently in beta and free to use. Would love for this community to take a look, and share their feedback. If you give it a try, please send any feedback (good or bad) to feedback@nexttoken.co.
Hello, I’m a high schooler who is interested in the field of Ai. I am currently studying how to make a model generate words. Any thing I should know while going down this path?
That's what a generative language model is. Did you already know that what you're describing is a thing?
I have 1 question, are LLM's capable of teaching a smaller guided on local drives and machines any better than pycharm or similar can fix?
a smaller guided on local drives? did you mean "a smaller model"?
what are local drives and machines?
really only meant like in terms of I have 2 extra consumer grade gpus and a unused cpu most crash course understanding of code and ai does a lot already to make it work. So are we to the point where you could build yourself a local machine essentially, with your own project. I'm sure there's enough code by much smarter people than me
even a good ide can fix more than I could imagine.
Why do you ask?
Actually I am using OpenAI streaming to stream my response as text, but I want the text and audio to come in parallel.
Stream audio too
Why not play the audio first and then stream the text response?
Audio is slower than text anyways
#python-discussion message
This was one of my questions I asked in #python-discussion and I found out that this doesn't need ai at all.
But obviously my intention is to build an ai assistant kind of thing that does some basic stuff and maybe add home automation as a feature later(someone said pyserial gotta research Abt that).
How do I approach this given I have only started ai/ds and maybe upgrade the assistant later.
How would I start learning about ai with this intention and Any suggestions/Tips for a beginner
Thank you in Advance
depends on which languages you want to support and if you need of any nonstandard features
for starters you can use Whisper v3 for speech to text (original by OpenAI, many more efficient variations exist), but there are a lot of models you can look into
and for text to speech, again depends on which languages and how much quality & expressiveness you want, but as a default kokoro is pretty good imo. Again, loads of alternatives
By language ig u meant if I am using python? Then yes.
Idk what non standard feature meant,like a feature that doesn't have a library made on?
#data-science-and-ml message
Ig I need to learn from these resources if I am doing this
human language, like English, Spanish, Portuguese, Japanese etc.
by non standard feature I mean things that only a small subset of models support
e.g. all caps when screaming, or annotating sentiment/tone of speech/so on
Nah I don't need it to have such features just basic to do and ✅
not necessarily - there are libraries that support most models to the point you can just copy/paste a code snippet from the model card or github readme then call essentially model.transcribe(audio) or model.speak(text)
if you wanted to create your own model or fine tune an existing model you should read them, but just for using it is not that much more complex than using an online API
Aah ic so how do I start about learning ai/ds or is my approach wrong
if you want to study in order: math -> statistics -> data analysis & data science -> machine learning -> artificial intelligence
you may as well just start from those resources either way, it's just that the project you're doing likely won't require tinkering with modifying models
Ohh
thanks ❤️
I will learn from those resources 
tho idk what u mean by math and statistics I will check that out too
Ohh I understood why math needed got it 
I think I cant self learn this(Ai Ds) like I did python so I am doing this course:
Harvard CS50's Artificial intelligence with python(a video on YouTube)
So ig that's a good course 
@winter canyon good find on the CS50, I'm probably gonna do that as well this year. let me know if you want a study buddy, I just started DeepLearning.ai's courses last week. I guess I'm a pro now lol.
i use a python script to catch drones and helicopters. the script uses datasets and ollama gemma3. ai queries got a lot faster yesterday. they went from 16-22 seconds to 1-4 seconds. i use ollama in windows 11 with an i7 cpu. has anyone experienced a speed increase recently?
can anyone recommend me python course for ds
i never took courses. i used to do mostly c, perl, php and some javascript until i started ufo hunting. i found sound datasets opencv tutorials in youtube and then i added ollama to the code
We're a large, friendly community focused around the Python programming language. Our community is open to those who wish to learn the language, as well as those looking to help others.
wow combat code to learn python
DS as in Data Structures or Data Science?
data science
Then try these: https://www.pythondiscord.com/resources/?topics=data-science
yea i would love a study buddy we can do lot of stuff together.
Hi can someone who’s in ML give me feedback on my cv so far, Imma apply to mle roles (I added the basic like name, school, major, contact info at the top + still not completely done w the app)
for those who are graduated/already work in an AI field. Do you use propositional logic/predicatelogic/lambda calculus in your daily work life
or is it something useful in a way?
also merry christmas
Hi
what AI tool do you see US engineers using the most these days?
ChatGPT vs Claude vs Gemini vs Grok vs DeepSeek ?
to be honest i dont know about the US engineers part, i have seen tests where gemini pro i think its called gemini 3?? not sure, but essentially it outperformed every other GenAI on the market, even gpt 5
idk if that helps
Got it, thanks!
I’m mostly curious about what’s most common in real-world dev workflows rather than “best on benchmarks.” Which one do you see engineers sticking with, and why?
I think each has its own use depending on the field of engineering. i just ran ur question through gemini and here is the summary (ofc i cant say anything ab the US because i dont live there, and nor am I an engineer yet)
Hi everyone,
I’ve recently started learning about Bayesian Networks in my university AI Python course, and I’m having trouble understanding the core ideas behind them. I’d really appreciate some help clarifying a few concepts.
I’m confused about whether Bayesian Networks are mainly used to compute things like conditional, joint, and marginal probabilities, and what those probabilities actually represent or are useful for. Are Bayesian Networks only used for calculating these probabilities? I also don’t quite understand why we need a graphical or data structure to work with these probabilities instead of just calculating them by hand. Another concept I’m struggling with is inference — what it means in the context of Bayesian Networks and why it’s such a central idea.
I’m also curious about how Bayesian Networks are used in real AI applications, since right now the theory feels very abstract to me. What I’m really looking for are very simple explanations and beginner-level, pen-and-paper style exercises that help build intuition step by step. If anyone has an exercise sheet, notes, or learning resources that helped them when they were first learning this, I’d be extremely grateful.
Thanks in advance for any help!
For my case, I’m mainly doing application development (design → implementation → debugging). Which AI do you recommend for that: ChatGPT, Claude, Gemini, or Grok?
id say go with claude for design and implementation, and for debugging go for gemini or gpt, id go with gemini tbh but its kind of personal preference kinda? and also i got a year for free cz im a student so yea
Thanks!
And nice the student year free perk sounds great.
Hello! Still need help?
yowza - API calls to deepseek are 1/20 the price of Mistral per M tokens. 1/10 for output.
i work in a deep learning or data science position/field and i use it sometimes for parsing stuff ...but more often it's used for research stuff than applied work afaik
Make it one page
What is the title of the section with your data scientist role ?
Relevant professional experience, it was at a research hub at my uni
Okay so I would put everything in the same section "Experience"
I think the word relevant shouldn't be there, of course if something is not relevant, it won't be included
you don't need to say it
in my opinion, you don't have to list every single tool/technology you used in every single line of your experience
I mean it's not relevant to me that you used Deboose to do one specific task, or performed t-test etc.
Be succinct, you can make it fit on one page by turning all the 2-line bullet points into one line
Plus remove the ones that are not relevant for the role you target. You can keep the keywords in the bullet points only if they fit the job description you apply to
can anyone recommend good python course on youtube
i am getting confused in oop's concept
Hello. Yes.
!res
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
@warm lily u still there
why is ai agents so tedious and unreliable? it often feels like it destroys way more than it provides... am i missing out on something?
guys...
did u all know that u can type
from sklearn import DecisionTreeClassifier
instead of typing whole,
from sklearn.model_selection import DecisionTreeClassifier
@winter canyon I won’t be around today and have work tomorrow. My schedule might not compare to yours to do that course
i will start on new year what do u say 1hr per day 11hr course
I could probably spare one hour a day to get through it. I think I can still get through my other learning objectives while doing that.
Dm me for sure
I see, but its more like fundamental knowledge you "need" to know to know what is going on under the hood right? like you prob wont use natural deduction or whatever, but having that set of knowledge in the back of ur mind and maybe using it every now and then, must help or?
Also thank you
and may I also ask what did u study or how you got into the role
Which ones would u suggest? Like I said Im searching for mle roles
hi people.
has anyone made a linear regression/neural network in assembly language before?
That would be impressive, but not indicative of practical skill.
That is to be tuned based on the role you apply for. Like the actual job description
nevermind I fixed my issue
finally, I'm free !
Glad you figured it out. Next time, don't ask to ask.
well, this is kinda off-topic thing. I don't expect people in python server to know how to code asm.
and I especially don't expect them to know how to create a linear regression model in asm.
well, I kinda gave up asking people, it's been literally 4 months since I started this
I don't think it's a great use of your time
In the amount of time that you spent doing something that complicated in assembly, you could have finished that project in python, and learned a bunch more things
I wasn't working on it 24/7 in those 4 months. It's kinda like.. I get stuck, then leave for 2 weeks, then come back..
also, I just really wanted to understand the math behind it .
but, yes, it wasn't the best use of time ever.
True ty for the advice
Yes, for a GPU (had to write the driver too). And also for the C64 (cart-pole balancer). The first was for an actual use case. The second for fun.
cool
Assembly is not really hard, just tedious.
Being able to read it is worth learning though for performance reasons, but writing it is rare now.
Has anyone tried any of the ARC competitions before?
yes, do you mean the ARC prize competitions?
ya, specifically the ARC2 comp. Im looking for any tricks or tips. I've solved 37/400 test examples. But im hitting a wall.
What was your experience with it? What did you build?
37 is SOTA for ARC2....what kind of methods are you using?
i'm in neural program synthesis
> I've solved 37/400 test examples
when u say u solve it...it was the public eval right? was it an inference time only solution?
we recently had someone who claimed they modified TRM with a better score and didn't even know what exposure bias was but was training autoregressively...apparently they coded everything with claude code/llms...so the code was completely wrong
ya, the public eval. It's inference time only, no neural net training on the data. But it's program synthesis using a couple of different things I've been tinkering with: a parliament voting system, a typed DSL with grammar-based enumeration, beam search with near-miss scoring, and a DreamCoder-style library that attempts to extract reusable skills from solved programs (symbolic). I tried experimenting with an LLM as an advisor but it doesn't really change anything.
yeah those typed DSL systems don't do well on ARC
im finding that out too! I recently added it and it hasnt done much
yeah makes sense since the search space is so large
so the DSL system does work, but it hits a ceiling unless you keep expanding the DSL to cover multi grid interactions
I got it up to 44 solved 11%. Ill play with it more tomorrow got to hit the sack
You can start here: https://kaggle.com/learn
I've used pgvector in the past on a project, im going through a training course that mentions weaviate, am I understanding the functionality correctly that I cannot associate the ID and vectors to text in the same database, I would still have to use something like Postgresql to store the human readable payload of the documents?
Guys i have a question, it might sound like a dumb question but plz excuse me I am a beginner in data science/analytics. For datasets where all the columns are categorical data, when you need to make visualisations to show the data distribution and relationships in the data, how can you show distribution in categorical data?
I've mostly worked with datasets that have columns with continous data so far.
I wasn't 🙂
When all columns are categorical, try univariate analysis of each column, like pie charts and frequency checks (countplots etc), and bivariate analysis like crosstabs. Not sure about other graphs tho
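For a concrete starting point, here is a minimal pandas sketch with a made-up two-column dataset. `value_counts` gives the distribution of one categorical column, and `pd.crosstab` gives the relationship between two of them; both tables can then be plotted as bar charts or heatmaps.

```python
import pandas as pd

# Made-up categorical data for illustration.
df = pd.DataFrame({
    "color": ["red", "red", "blue", "red", "blue", "green"],
    "size": ["S", "S", "M", "L", "M", "S"],
})

# Univariate: how often each category appears, as counts and as shares.
color_counts = df["color"].value_counts()
color_share = df["color"].value_counts(normalize=True)

# Bivariate: contingency table of color against size.
color_by_size = pd.crosstab(df["color"], df["size"])
```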
Who will win
Man, I feel like one of the biggest challenges I'm facing is automating a rule discovery system, something that can invent a new primitive from a small set of examples. So far I'm having to do it manually: run the 400-sample test each time, find the near misses, extract their train pairs, and then manually spot the hidden rule (which is usually a single primitive), then add that new primitive and solve. The only thing I can think of is focusing on near misses and building some type of feedback to steer the inventing/creation part. Anyone got any ideas?
alr
also i have another question i wanna ask. Basically in the datasets I've been exploring so far i also made boxplots and noticed there are a lot of data points above the upper quartile. My question is should someone take these data points as outliers without using any anomaly detection technique or is it ok to say datapoints in boxplot above q3 are outliers so it is ok to remove? (depending on the dataset etc)
hey guys, where should I learn data science in python if I already know the math behind the math of all models?
and, are online certifications important? or not really
They are not.
Thanks for the answer.
don't remove '''outliers''' unless they're actually outliers, i.e. they actually shouldn't be in your dataset for a variety of reasons
datapoints above q3 just have big values, but that doesn't mean they're outliers
(but if you're really worried, just cross validate and check if removing them makes your model better)
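One thing worth separating here: a boxplot does not mark everything above Q3. By definition about a quarter of the data sits above Q3; the points drawn individually are the ones beyond the whisker, which with the usual default is Q3 + 1.5 * IQR. A small NumPy sketch on made-up values:

```python
import numpy as np

# Made-up data: 1..20 plus one very large value.
values = np.array(list(range(1, 21)) + [100])

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # the usual boxplot whisker limit

above_q3 = values > q3               # ordinary large values, roughly 25% of the data
beyond_fence = values > upper_fence  # what a boxplot draws as individual points
```

Even the points beyond the fence are only candidates. Whether they should be removed still depends on whether they are actually wrong for your dataset, as said above.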
i had a wild idea. that RAG system i've been working on. I was already using the deepseek r1 8b for reasoning task with vector embeddings and the drift monitoring. Then I thought I'll just incorporate it into the ARC for the rule discovery system. And its exactly what I wanted, but i loaded 3 near miss task from the jsonl file, which contain tasks that are 50-75% correct. But it actually produced the correct formula I already solved.
Problem is what happens when we run out of near misses and hints/clues
hey, i'm trying to understand the procedure of how backpropagation works and just wanted to make sure my understanding is correct. so when we take the gradients, since the partial derivatives also depend on the input, do we take the average across all inputs?
The part about taking the average isn't what backprop is about
If you fully expand all the computations of a neural network, it's one really big function with lots of nesting
backprop basically means going forward in a neural network(data goin inside NN with random weights initialization), then one step at a time, calculating the gradients and applying GD to update weights
yeah but when we analytically compute the gradient, it also depends on the input right?
try creating a histogram in these cases also. boxplot is good, but sometimes histograms help more to check outliers and data distributions. If the points above the upper whisker (Q3 + 1.5*IQR) are less than 5 percent of the whole data, ig it's safe to consider them as outliers
as in the input is a part of the formula
yes
but there are many inputs
so would we take the average across all the inputs in our data?
no, avg is completely different. NNs can have multiple inputs, in fact they almost always do
The entire vector goes inside the NN
so each vector component multiplied with weights, added, and an activation fn (sry my english)
we take the partial derivative with respect to each component of the input vector
oh i see
so we pass in the inputs combined as a matrix?
i thought we take it with respect to the weights and biases
Been a minute since my statistics, calculus, investments, and accounting college classes but it gets me by in healthcare tech. To keep progressing with understanding AI should I first brush up on some fundamentals then perhaps start with linear algebra then progress into Andrew ng’s deep learning module?
49 low hanging fruit now the hard part begins.
Depends on the side you want to go, just try not to waste too much time
Is it ARC ?
It totally makes sense how they designed the ARC2 test though.
Daym
I’m just pretty new to the ecosystem in general I truly don’t know what aspect I’m most interested in to answer that.
I knew I’d hit this wall. I’ve had some success combining an LLM + RAG using the proven formulas... it’s even detecting the successful vectors.
In that case try it out: basics of ML, then Andrew's DL course (side by side, do a bit of maths as it may come in handy in interviews, but ngl as long as you know what's happening in DL it's fine to leave maths for later)
Which LLM btw ?
Ive tried deepseek r1 8b and qwen 2.5 coder 14b
anything over those parameters just make it slow even when I can fit the entire model in the VRAM
to update the weights/biases, yes
but to find that you need the chain rule (assume that these are partial derivatives):
`dL/dw = dL/dz * dz/dw`
where `z = w·x + b` is this layer's pre-activation and `x` is the input to this layer (which would be the output from the previous layer)
if you're at the first layer, then `x` is just your input vector
but in dz/dw we also have the input as a part of the formula
tried any of these models, top 5 in ARC-AGI in OpenLM Arena ? (m not good with LLMs so not sure abt this one tho)
my question is, since we have many inputs in the dataset, would we plug in each input and take the average across all inputs
no
its like LR, each input times each weight, then activation
I appreciate your input I’ll look into the maths as I’m going and just see where this adventure with my new rig takes me
Gud luck 👍
well, if you mean your 1 sample has multiple features - no averaging
if you mean you train on n samples at once - then yes ((mini)batch training)
try statquest yt channel, explains DL well
yeah i meant that i have n training samples
in that case training is done in batches
then yes you're going to be doing some sort of averaging, if you're inputting a vector of shape [n, ...] into your network in 1 step (like you're giving it n samples in this 1 step)
if you mean your whole training set is n samples, and you train on 1 sample at a time, so you input a vector of shape [1, ...] into your network in 1 step, then no you're not averaging
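A small NumPy check of the batch case, assuming a single linear layer with squared-error loss (a stand-in chosen to keep the math short, not anything specific from the discussion). Each sample gets its own gradient via the chain rule, and the gradient of the mean batch loss is just the average of those per-sample gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))  # batch of n=4 samples, 3 features each
y = rng.normal(size=4)       # one target per sample
w = rng.normal(size=3)       # weights of a single linear layer (no bias)

def sample_gradient(x_i, y_i, w):
    """Gradient of L = (z - y)^2 with z = w . x, for one sample.

    Chain rule: dL/dw = dL/dz * dz/dw = 2 * (z - y) * x
    """
    z = w @ x_i
    return 2.0 * (z - y_i) * x_i

# One gradient per sample: the input x_i appears in each of them.
per_sample = np.array([sample_gradient(x_i, y_i, w) for x_i, y_i in zip(X, y)])

# Gradient of the mean loss over the whole batch, computed in one shot.
batch_gradient = 2.0 * X.T @ (X @ w - y) / len(y)

averaged = per_sample.mean(axis=0)
```

So yes, the input shows up in every per-sample gradient, and training on a batch of shape [n, ...] averages them. Training on one sample at a time just uses each gradient on its own.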
Interesting enough, I have. But outside the loop. The models are capable of solving the problems with enough feedback.
But it doesnt take a big model todo that.
i see
so in this case, would i repeat the process of feeding in 1 sample at a time until the loss is less than epsilon?
so then once i have gone through the last training example, i loop back if i have yet to get the loss within the threshold
> repeat the process of feeding in 1 sample at a time
sure
> until the loss is less than epsilon
usually you look at the validation loss and stop if it starts re-increasing (that's a sign of overfit)
> once i have gone through the last training example, i loop back
yes; training through the entire set is usually called 1 epoch
and you can train for multiple epochs
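Putting those answers together, here's a hedged sketch of a loop that feeds one sample at a time, repeats for several epochs, and stops when the validation loss stops improving. Everything in it is an assumption for illustration: the model is a plain line `w*x + b`, the data is synthetic, and `patience` (how many non-improving epochs to tolerate) is just one common way to decide when to stop.

```python
import random

def mse(w, b, data):
    """Mean squared error of the line w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(train_set, val_set, lr=0.1, max_epochs=100, patience=5, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    best_val, best_w, best_b = float("inf"), w, b
    bad_epochs = 0
    epochs_run = 0
    for epoch in range(max_epochs):
        order = list(train_set)
        rng.shuffle(order)                 # new sample order every epoch
        for x, y in order:                 # one sample per update
            err = w * x + b - y
            w -= lr * 2.0 * err * x
            b -= lr * 2.0 * err
        epochs_run = epoch + 1             # one full pass over the data = 1 epoch
        val_loss = mse(w, b, val_set)
        if val_loss < best_val:            # validation loss still improving
            best_val, best_w, best_b = val_loss, w, b
            bad_epochs = 0
        else:                              # no improvement this epoch
            bad_epochs += 1
            if bad_epochs >= patience:
                break                      # early stopping
    return best_w, best_b, best_val, epochs_run

# Synthetic data (assumed): y = 2x + 1 plus a little noise.
data_rng = random.Random(1)
points = []
for _ in range(50):
    x = data_rng.uniform(-1.0, 1.0)
    points.append((x, 2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1)))
train_set, val_set = points[:40], points[40:]

w, b, val_loss, epochs_run = train(train_set, val_set)
print(f"stopped after {epochs_run} epochs: w={w:.2f}, b={b:.2f}, val loss={val_loss:.4f}")
```

The loop returns the weights from the best validation epoch rather than the last one, which is the usual reason for tracking the validation loss instead of a fixed training-loss threshold.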
for regression, is it a good idea to use pca before training the neural network?
no
you're using an nn which you assume can capture the complex interactions anyway, why throw away information
well we all began somewhere
and scikit's pretty big too
yes, btw how much time did it take you to complete scikit-learn?
Hey guys, what's your opinion on coding in C? Like, doing ML projects in C?
It's not really important to learn 100 percent of scikit-learn, just learn the important things and carry on
Never tried it, so not sure abt this one
You know, it's nice to prove yourself right with a little money spent and some experimentation, to justify walking away from naive direction you were given while you're still learning yourself: "We cannot use Docker for XYZ reason in our demo production environment". XYZ reason doesn't exist on my setup, which shares the host's CUDA drivers with the container. 🙃 I never could fathom why accessing the host's CUDA drivers was a bad idea, or why doing so would prevent PyTorch from recognizing it.
Been waking up at 5am to learn, without an alarm 😂 Happy New Year all, looking forward to sharing and learning with everyone this new year.
if i want to pursue a career in ai, is learning 3d modeling as well smart? or would it hinder my progress too much?
I'm not sure what 3d modeling has to do with AI.
nothing, i just like doing it as a hobby tho ai is my priority
thats exactly why i asked if i should do that
you should always learn about things that interest you. I just don't see the connection between 3d modeling and AI
used to freelance a little with 3d too
thats the problem tho 3d does interest me but it doesnt have a connection with AI
tho i wouldnt do 3d later on for work
Try Andrew Ng's Deep Learning CNN course, and for competitions try Kaggle
maybe on the gaming side where people make 3d characters using ai (although such sites already exist), but ig 3d modelling is preferred more in the game-creation community
true, most of the models I've made while freelancing were for games
3d generative modeling is an active area of research ig
Something like this? https://huggingface.co/tencent/Hunyuan3D-2?utm_source=chatgpt.com
it's when you use data to approximate a function
whether that counts as "learning" is a philosophical question.
You take a lot of data, decide on features in it, then figure out patterns
It's all about patterns
Originally, behavioral modification through experience. For a machine to do this it involves storing what was experienced (sensed) in memory for later recall, either directly storing what was sensed uncompressed, in a perfectly compressed form, or in a compressed form that loses some of the information but uses way less space (lossy compression, like JPEG images). And/or by nudging some (virtual) system's behavior in some direction based on that sensed thing without really storing that thing. This is a sliding scale, because even nudging the system can be seen as storing (a very small) part of what was sensed in it (just very indirectly, not for direct recall).
What is stored depends on the problem statement, you may be able to ignore a lot of the incoming data/what was sensed (not relevant to the problem/noise).
Man this person did not beat around the bush lol https://arxiv.org/abs/2503.23923
Artificial general intelligence (AGI) is an established field of research. Yet some have questioned if the term still has meaning. AGI has been subject to so much hype and speculation it has become something of a Rorschach test. Melanie Mitchell argues the debate will only be settled through long term, scientific investigation. To that end here ...
oh i see
also when it comes to linear regression, usually which is better, normal equations or gradient descent?
Yea
thanks for the link, this looks really interesting
A computer is also a machine. Let's say our computer downloads all the data from Wikipedia, does its learning improve?
I think it's not about storing data
Maybe start somewhere like here.
basically it’s just letting the machine fail until it figures out a pattern. there's a formal way to define it, where a program learns if its performance on a specific task gets better the more data or experience you feed it
technically it’s just a massive math function with a bunch of adjustable numbers, or knobs. every time it makes a bad prediction it checks the error and tweaks those numbers slightly, so it’s a bit closer to the right answer next time. it’s not really thinking with consciousness, but it's building its own internal logic instead of just following rules a human hardcoded
It's also about what that data is used for. The term "machine learning," as originally coined, referred to a program that played checkers and would memorize board states and their evaluations, so that next time it did not have to evaluate them again. So if the intent of that storage is for a decision-making program to make better decisions, it falls under machine learning.
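As a very loose sketch of that "memorize board states and their evaluations" idea (not the original checkers program, just an illustration), here is a lookup table keyed by board state. The board encoding and the scoring rule are both made up for the example.

```python
evaluations_run = 0
memory = {}  # board state -> stored evaluation

def expensive_evaluate(board):
    """Stand-in scoring rule (hypothetical): count of 'x' pieces minus 'o' pieces."""
    global evaluations_run
    evaluations_run += 1
    return board.count("x") - board.count("o")

def evaluate(board):
    """Return the stored evaluation if this board was seen before, else compute and store it."""
    if board not in memory:
        memory[board] = expensive_evaluate(board)
    return memory[board]

board = "x.o.x..o.x"
first = evaluate(board)   # computed and stored
second = evaluate(board)  # recalled from memory, no re-evaluation
print(first, second, evaluations_run)
```

The second call does no evaluation work at all; the stored experience is what makes the next decision cheaper.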
Machine learning is directly tied to the history of AI, but since then has taken a life of its own as a programming paradigm, in which rather than explicitly programming in "rules," it's driven by data, lots of samples from which it determines the correct "rules" to have.
So you may be familiar with how functional programming differs in paradigm in that rather than given an explicit set of instructions you give a set of rules/constraints and it generates the correct instructions from that (declarative). This is kind of taking that even further in a way. Rather than write down the rules ourselves we just feed the program tons of examples and what to do in each case, and it tries to figure out something that works for all those cases (and more (generalization)).
So we are even taking away that step a human would have done, where they look at tons of examples and try to come up with the general rule(s) and then write those in code.
Not just out of laziness, but rather because in many cases it's too complex for humans to ever come up with the rules by hand (and even if we did, they may be so many and so massive such that we could not feasibly ever make a program for it by hand).
(For example with image detection, we previously tried this (heavily), and it got nowhere (only working somewhat under ideal conditions))
(To understand the problem with this, try making a human face detector with hand written rules, it teaches something important about complexity that humans can't deal with directly like that (this was often done in computer vision classes and still is, after which they then move on to deep learning) (it's tempting to assume that we have just not tried hard enough or are just not doing it right, but I can assure you we have historically taken that idea to the extreme, making very complex manual implementations, and that deep learning blew that all out of the water (this was deep learning's first popularity explosion moment actually)))
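A toy contrast between writing the rule by hand and letting the program find it from labeled examples. This is only a sketch under made-up assumptions (one numeric feature, a single cutoff, invented data and function names); real problems like face detection are far too complex for anything this simple, which is the point being made above.

```python
def handwritten_rule(value):
    """A cutoff a human guessed (hypothetical rule)."""
    return value > 30.0

def learn_threshold(examples):
    """Try the midpoints between sorted values and keep the cutoff with the fewest mistakes."""
    values = sorted(x for x, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def mistakes(t):
        return sum((x > t) != label for x, label in examples)

    return min(candidates, key=mistakes)

# Labeled examples: (feature value, correct answer). Invented for illustration.
examples = [(12.0, False), (15.0, False), (18.0, False), (21.0, False),
            (24.0, True), (27.0, True), (33.0, True), (35.0, True)]

threshold = learn_threshold(examples)

def learned_rule(value):
    """The rule the program derived from the examples."""
    return value > threshold

hand_errors = sum(handwritten_rule(x) != label for x, label in examples)
learned_errors = sum(learned_rule(x) != label for x, label in examples)
print(threshold, hand_errors, learned_errors)
```

Nobody told the program where the cutoff should be; it picked the one that fits the examples, while the hand-guessed cutoff gets some of them wrong.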
why do you ask?
I am about to get into causal ML, and I would like to know if there are any caveats / instabilities, issues. For instance, I am already finding out that DAGitty seems to be no longer under any sort of maintenance
!pip dowhy
Looks stable
You mean the offline install is stable? the website works, but I am uncertain about installing locally
what do you mean offline install?
try clicking on "Download"
https://dagitty.net/
Oh I thought you were still talking about DoWhy
No, DoWhy is fairly important. I am talking about visualizing the causal graphs
!pip jupyterlab-dagitty
maybe that ^^^^^
I don't really know how to code python, but I brute forced a couple different chatbots to create a .exe file renaming program mostly to make my life easier on renaming folders with many pdfs, is there a place I can share my github link with the .py and my released .exe?
ran out of my free limits on copilot and claude to get to v9 but I'm pretty happy with it. I don't really know what should go next after this for managing/fixing it
I struggled heavily through compiling it so I have the requirements.txt and .py because I'm betting the idea of running some unknown .exe isn't that great
But i'm not sure where/who I could share with to see if it works like I think on a different system
it just sounds like you are starting whatever it is you are getting yourself into. Just do it more, it'll feel like less of a struggle with time
I once spent something like 3 months trying to figure out WTF was going on with a binary I had compiled. It wasn't producing the right numbers on some standard benchmarks. Turns out that someone above my paygrade had mixed/matched compilers for the numerical libraries I had linked.
Compilation is usually a painpoint, e.g.
All that being said, I do not recommend asking people to download an already compiled *.exe file. At least not if you aren't a well recognised / trusted source.
well then I guess someone tag a mod if its not ok but
message.txt is the file-org-v9.py I used
edit: file-org-v9.py - file name matters
oh I meant to include the readme.md and the requirements.txt
```
openpyxl==3.1.2
tkinterdnd2==0.3.0
pyinstaller==6.3.0
```
https://github.com/GurtyTrude/Mass_File_Renamer
my github link if its cleaner
I want to learn data science,