#data-science-and-ml
You should check very carefully whether you're working with the same data in PyCharm and your Jupyter notebook. It sounds like they're out of sync.
Does anyone know a trick to check if a sentence has special characters(like comma) and isolate that character from nearby words?
I'm trying to preprocess a text to be used for a small-scale GPT, and I don't want to remove those special characters from my input text because, well, I want the model to learn when to use them. However, my vocabulary list doesn't have those characters attached to words(like this,), only isolated tokens ['like', 'this', ','].
Oh yes...regular expressions module... 
string slicing?
I don't know. Maybe.
u could use it to isolate the characters
do u want to completely leave them out or just make them special in some sort of way like adding 2 spaces after them or a symbol after them?
so whenever this AI uses those words it just adds a space?
Nah, it's just so I don't get an error because my vocabulary was made like ['this', ','] and not like ['this,']
So the sort of Byte-Pair Encoding used in LLMs? There's probably some good tokenizer for this (maybe in something like spacy?) but as a stopgap, multiple str.partitions come to mind.
I didn't really want something to return a list, though
There's sentence.replace(',', ' , '), but...if I use that for multiple characters, it gets messy
something like
def partition_all(text: str, separators: list[str]) -> list[str]:
    tokens = [text]
    for sep in separators:
        tokens = [part for token in tokens for part in token.partition(sep) if part]
    return tokens
is what I'm thinking. probably very slow though.
Wait, I'll review some things. Maybe returning a list may be better, afterall
Nah, maybe not much
oh right, that doesn't quite work because partition only does one
one needs re.split instead
!e
import re
def tokenize(text: str, separators: list[str]) -> list[str]:
    # separators should be escaped if they are regex-special!
    tokens = [text]
    for sep in separators:
        tokens = [part for token in tokens for part in re.split(f"({sep})", token) if part]
    return tokens
print(tokenize(
    "Nah, it's just so I don't get an error because my vocabulary was made like ['this', ','] and not like ['this,']",
    list(";, '"),
))
@tidal bough :white_check_mark: Your 3.11 eval job has completed with return code 0.
['Nah', ',', ' ', 'it', "'", 's', ' ', 'just', ' ', 'so', ' ', 'I', ' ', 'don', "'", 't', ' ', 'get', ' ', 'an', ' ', 'error', ' ', 'because', ' ', 'my', ' ', 'vocabulary', ' ', 'was', ' ', 'made', ' ', 'like', ' ', '[', "'", 'this', "'", ',', ' ', "'", ',', "'", ']', ' ', 'and', ' ', 'not', ' ', 'like', ' ', '[', "'", 'this', ',', "'", ']']
this looks good to me
Thanks!
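For reference, the same idea can probably be done in a single pass by joining the escaped separators into one pattern. This is only a sketch: the name `tokenize_once` and the sample sentence are made up for illustration, and `re.escape` is there to handle the regex-special separators that the comment in the snippet above warns about.

```python
import re


def tokenize_once(text: str, separators: list[str]) -> list[str]:
    """Split text on every separator in one pass, keeping separators as tokens."""
    # Escape each separator so characters like "(" or "." are matched literally,
    # then wrap the alternation in a capture group so re.split keeps the matches.
    pattern = "(" + "|".join(re.escape(sep) for sep in separators) + ")"
    # re.split yields empty strings between adjacent separators; drop them.
    return [part for part in re.split(pattern, text) if part]


print(tokenize_once("like this, and (that)", [",", " ", "(", ")"]))
```

If the spaces are not wanted as tokens, they can be filtered out of the result afterwards instead of being listed as separators.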
my best guess is that the curvature of your loss function is very steep near that local minimum, so it starts behaving poorly. some things to notice with SGD are that you have no guarantee the loss will actually decrease at each iteration, and the convergence of all gradient methods depends on the curvature of the loss. the larger the curvature, the smaller the step size has to be
maybe try with a smaller step size first
are there optimizers intended to overcome this? like AdamW?
There's a string in your column
yeah so, my understanding of adam is that it's a form of gradient with momentum, which avoids the gradient suddenly changing directions
Try df.info() whenever you get a TypeError and then go from there.
I think you could do something like this to find the strings:
df[df['horsepower'].str.contains('[a-zA-Z]', na=False)]
Pytorch docs says Adam uses some kind of EMA for the gradients and gradients**2
(Or so I remember)
if your "horsepower" column has non-numeric information in it (like what unit it's using), that needs to be removed, or put in a different column.
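Another way to locate the offending rows, sketched under the assumption that the column was read in as strings: `pd.to_numeric` with `errors="coerce"` turns anything unparseable into NaN, so the NaN mask points at the bad entries. The sample values below are invented, not taken from the dataset being discussed.

```python
import pandas as pd

# Hypothetical column: mostly numeric strings, plus a placeholder and a unit suffix.
df = pd.DataFrame({"horsepower": ["130", "165", "?", "150 hp", "95"]})

# Values that cannot be parsed as numbers become NaN instead of raising.
as_number = pd.to_numeric(df["horsepower"], errors="coerce")

# Rows where parsing failed are the ones that need cleaning.
bad_rows = df[as_number.isna()]
print(bad_rows)
```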
It worked in jupyter notebook tho. Wouldn’t it still throw the same error there?
Will try this, thanks
does it still work in your notebook if you rerun everything from top to bottom in one go?
jupyter notebooks make it incredibly easy to throw in extra steps that won't necessarily be re-done the next time you run the notebook.
Yeah..
then there's some way that your non-notebook code is different from the code in the notebook.
also, when Edd says "rerun everything from top to bottom in one go", that includes restarting the notebook kernel.
think of all the variables and functions in your notebook as existing in a dict (because they actually do). each cell can change what's in that dict. deleting the cell or changing what's in it and re-running it doesn't undo what changes you made to the global state dict.
whereas restarting the kernel clears everything out of the global state dict, and you start over fresh.
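A toy illustration of that point, assuming nothing about Jupyter internals beyond "cells run against one shared namespace": here each "cell" is a string passed to `exec` with the same dict, and the variable names are made up.

```python
# One shared namespace plays the role of the kernel's global state.
namespace = {}

exec("x = [1, 2, 3]", namespace)   # cell 1
exec("x.append(4)", namespace)     # cell 2, later deleted from the notebook
exec("total = sum(x)", namespace)  # cell 3
stale_total = namespace["total"]   # still reflects the deleted cell

# "Restarting the kernel": start from an empty namespace and run only
# the cells that still exist in the notebook.
namespace = {}
exec("x = [1, 2, 3]", namespace)
exec("total = sum(x)", namespace)
fresh_total = namespace["total"]

print(stale_total, fresh_total)
```

The two totals differ even though the visible cells are identical, which is the kind of mismatch that shows up when the same code is moved into a plain script.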
How do i use gpu on jupyter lab ? I am training AI with cpu . and it is so slow
this isn't a jupyter question. it's a "what library are you using" and "what GPU do you have" question.
i am using tensorflow and i got 3080 ti
and 3090 at school
I don't use tensorflow, but this is the guide for using a GPU with it https://www.tensorflow.org/guide/gpu
Hi all, I have a question more on statistics. I have an Excel file that contains transaction information. The file has separate sheets for each year since 2011 and contains columns for the business name, customer name, line of business, sub line of business, and revenue. I want to look at specific businesses and see if diversifying their products has helped with revenue. I have looked at it a few different ways and wanted to know if it would make sense to look at the standard deviation of the percentage makeup of the sub lines of business as a measure of whether a company has diversified. An example being if a certain company had an 80%/10%/10% breakdown of the sub lines of business, where one takes up 80% of the records, the standard deviation would be higher than if it were something like 50%/30%/20%, since the values would be closer to the mean. Does it make sense to do it that way? I feel like I am missing something.
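The two breakdowns from the question can be checked directly. This is just the arithmetic from the message, using the population standard deviation from the standard library; the variable names are illustrative.

```python
import statistics

# Share of records per sub line of business, as in the example above.
concentrated_shares = [0.80, 0.10, 0.10]
balanced_shares = [0.50, 0.30, 0.20]

# Population standard deviation of the shares around their mean (1/3 here).
concentrated = statistics.pstdev(concentrated_shares)
balanced = statistics.pstdev(balanced_shares)

print(f"80/10/10: {concentrated:.3f}")
print(f"50/30/20: {balanced:.3f}")
```

One thing to keep in mind: the shares always sum to 1, so their mean is 1 divided by the number of sub lines. Comparing this number between companies with different numbers of sub lines therefore needs some care.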
I had a data scaling question and a database question: if I was storing JSON data and webpage HTML in a db, what is an efficient db backend to use? Plain old MySQL, Redis, or something else? I'd like to build this app to be scalable from the beginning so I don't have to change anything later. It will mostly be used as a cache. Thanks for the guidance
oh there's a database channel, never mind, my bad
0.001 isnt small enough?
the number depends entirely on the loss function, the network, and the data. there is no 1 number that always works
try dividing by 10 or 100 and see if it behaves any better or different. if not, then we have to give some thought to what the reason might be
@wooden sail 0.0001
but all that did was slow the rate down so if i did more than 200 iterations it would probably go back to doing the unga bunga dance
does someone know a good alternative to MLflow that is reliable? I tried unsuccessfully to set it up with FTP but ran into massive failures (and I'm not the only one, according to their issue tracker). so I'm a bit tired of them. any other good tool in the scope of managing model deployments ?
Hi all I had a question on a personal project im working on:
I have animal types, their outcome (adopted, not adopted), etc...
I find that most animals get adopted in 2 months or less, however I get roughly .01% being adopted in 2 years, 3 years, 5 years.... These extreme observations are true and are not errors. I fear keeping them will ruin all my inputs to my models as the standardization values will be influenced by these outliers, however I don't want to drop them because they are informative!
Curious on what I should do? Leaning on dropping anyway?
random question, but : are you sure normalising is a good idea?
clearly the distribution is not normal : are you sure you control the distribution after a renormalisation? 🙂
Hmmm good question.... Now I am all turned around hahaha....
Like I want to normalize all my variables so they're in a similar scale before learning. Demeaning/normalizing would make my distributions normal (now they're typically 0 to X for days in pound, age, etc...) but keeping in the outliers would screw up what the centered value would be?
idk if my logic even sound to you lol
I guess I could standardize to the median and not mean, win-win?
yeah it's quite rare people in machine learning are statistician enough to really care about statistical errors they commit, so I guess we don't care much
median sounds like a nice thing to at least try
I could always share my screen and show you my logic but idk if you wanna hear my rambling lol
std deviation is also not robust to outliers, so watch out here too
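A small sketch of the difference, with invented numbers: centering on the median and dividing by the interquartile range instead of using mean and standard deviation. The `days` list is hypothetical and only meant to mimic "mostly short stays plus one very long one".

```python
import statistics

# Hypothetical days-in-shelter values with one genuine extreme stay.
days = [5, 12, 20, 30, 45, 60, 1095]

mean = statistics.mean(days)
std = statistics.pstdev(days)
median = statistics.median(days)
q1, _, q3 = statistics.quantiles(days, n=4)
iqr = q3 - q1

# Mean/std scaling: the outlier drags the mean up and inflates the std,
# so the ordinary values end up squeezed close together.
standard = [(d - mean) / std for d in days]

# Median/IQR scaling: center and spread come from the bulk of the data,
# so the outlier stays visibly extreme and the rest keep a usable spread.
robust = [(d - median) / iqr for d in days]

for d, s, r in zip(days, standard, robust):
    print(f"{d:>5}  standard={s:+.2f}  robust={r:+.2f}")
```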
@gilded bobcat My advice is, don't try to fit a normal distribution. Instead, try a gamma distribution. They're more appropriate to your situation.
this makes me wonder, will the distribution even matter if I just use non-parametric models like decision trees ?
If you're ultimately going to use a non-parametric model, then I'd say the only reason to try to fit a parametric distribution in the first place is so that you have some idea of what the data looks like.
If a gamma distribution fits well, then you learn something. If it doesn't fit well, and if you can identify where it doesn't fit well, then you learn something else.
That is, you can fit a parametric distribution as a kind of exploratory tool, instead of because you want to shoehorn the data into something parametric.
Makes sense, do you have advice on how to ensure it looks like a gamma dist other than inspection? Ngl I am used to shoving my data into normal and moving on lol
For exploratory purposes, inspection is a good way to go. For example, plot the density of the fitted gamma distribution over a histogram or KDE of the data. Or make a Q-Q plot of the data versus the fitted gamma distribution.
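A rough, standard-library-only version of that check, as a sketch. It swaps in a method-of-moments fit (shape = mean² / variance, scale = variance / mean) rather than whatever a stats library's fit routine would do, and it compares deciles as a crude stand-in for a Q-Q plot. The data is simulated, not the adoption data from this thread.

```python
import random
import statistics

random.seed(0)

# Simulated right-skewed "days until adoption" sample from a known gamma.
true_shape, true_scale = 2.0, 15.0
days = [random.gammavariate(true_shape, true_scale) for _ in range(20_000)]

# Method-of-moments estimates: mean = shape * scale, variance = shape * scale**2.
mean = statistics.fmean(days)
var = statistics.pvariance(days)
shape_hat = mean ** 2 / var
scale_hat = var / mean

# Crude Q-Q check: deciles of the data against deciles of draws from the fit.
fitted_draws = [random.gammavariate(shape_hat, scale_hat) for _ in range(20_000)]
data_deciles = statistics.quantiles(days, n=10)
fit_deciles = statistics.quantiles(fitted_draws, n=10)

for data_q, fit_q in zip(data_deciles, fit_deciles):
    print(f"data={data_q:7.2f}  fitted={fit_q:7.2f}")
```

If the two columns track each other closely, the gamma shape is a plausible description; if they drift apart in the upper deciles, that is where the fit breaks down.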
Got it! Visually it looks good, let me know if you agree:
Huh, those are interesting. They all seem to have distinct elbows.
Yeah. They all seem to have elbows at about that height.
Hey guys, can someone help me with Pretraining of a Transformer?
I know that the Unsupervised Learning phase of neural networks is mostly to train the "feature extracting" layers, with the objective of minimizing information entropy to make things easier for the classifier. I can see that quite easily for image models. But how can I do that for a Transformer?
Should I use as "information entropy" output the Encoder output?
But then, GPT-1 had only the Decoder part, right? Shouldn't I use something related to the Decoder for this?
ChatGPT told me that the Transformer would be trained to predict whether 2 generated sentences are consecutive or not...but it also gets pretty messed up with that information.
Uh...ok...I don't get it...it would be a CrossEntropy, ok? But what would be the targets?
So what math Field should I learn for AI except statistics
does anyone know why this is not working
I just converted the numpy array X_test_MinMax into a dataframe called a
and then i just want to get the dataframe when the "Island" column is 1.0 but it shows NaN values
linear algebra and calculus are also very important
the inputs are the previous tokens in the sequence, and the outputs are the probabilities of what the next token should be
for example if you give it "I really love programming in", it would give you what it thinks the next token should be, which could be something like: 0.5 python, 0.3 C++, 0.1 java, 0.05 rust, ...
Ok, that's for text generation, but what about the loss?
this is if you're using word tokenization, there's also a bunch of other tokenization methods that are subword (so tokens represent parts of words) so the tokens could be something like "th-" or "ex-"
In a supervised learning configuration, the loss would be CrossEntropy(model_output, target_text).
But what about unsupervised, where there's no labels, no targets?
the target is the actual next word in the sequence
because your training data is a bunch of text, you already know what the next word is
so the label is just that next word
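In code, "the label is just the next word" amounts to shifting the sequence by one position. A minimal sketch with whitespace tokenization, chosen only to keep the example readable; real models use subword tokenizers and integer ids.

```python
text = "I really love programming in python"
tokens = text.split()

# Inputs are every token except the last; targets are the same sequence
# shifted left by one, so targets[i] is the token that follows inputs[i].
inputs = tokens[:-1]
targets = tokens[1:]

# Each training example pairs a growing context with the token that follows it.
pairs = [(tokens[: i + 1], targets[i]) for i in range(len(inputs))]

for context, target in pairs:
    print(context, "->", target)
```

The cross-entropy loss is then computed between the model's predicted distribution at each position and the corresponding entry of `targets`.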
That doesn't look like unsupervised learning to me
it's self-supervised learning
Ok, but I want to use unsupervised learning for pre-training
there is no unsupervised learning for that, people call it "unsupervised" because there are no explicit labels, but it's technically self-supervised
Ugh... Then I see no difference from an unsupervised learning configuration and a supervised learning for Transformer
Working with images is way easier... and clarified
I have a NLP-ish type of question.... I have a feature called animal breeds, many (if not most) of these breeds are sparse (like 1 or 2 animals per breeds). Could embedding these with a pretrained model (like GloVe) be a good idea? Would it be able to understand the similarity between "German Shepard Mix" and "German Shepard" and "Pitbull?" My end goal is to use breed to predict if an animal will be adopted
You're trying to predict breeds, with what features?
I am trying to predict if an animal will be adopted using some predictors.... One of them being breed, I feel as if it will provide some great explanatory power, but if I OHE it itll be incredibly sparse. I thought maybe I could make embeddings instead and use those for prediction.
To make it more confusing some animals are "short hair tabby" and others are "short hair tabby mix"
What are the predictors?
Y is adoption (dichotomous), possible X's are: Age (continuous), Animal Type (categorical), Breed (categorical), ,Color (categorical), Intake Reason (categorical), Intake Sex (categorical), Intake Conditional (categorical)
I will prob drop color its even worse than breed
Those things you just said are called features, just so you know. That's what I was asking for earlier.
Got it
Anyway, I wouldn't do the word embedding thing. I'll explain why later.
Just putting this out there as well, a lot of those features will likely not get you much information for the model to predict the breed
color and type I think would make sense, but intake reason, sex, and age likely have little to no predictive power
Sorry I might have been unclear. I want to use breed as a feature to predict if an animal will get adopted.
ah okay that makes sense
I read this as "i'm trying to predict animal breeds"
Yeah let me edit, I think steelercus read it the same
I think it would make sense to kind of merge some of those breeds together into the same category if they're very similar
Here is an idea of what they look like:
like "german shepard mix" and "german shepard" would be merged into "german shepard" etc
I would agree but my pain is like I think for dogs being 'mixed' actually matters.... A purebred german shepard is wildly different from a mix. Moreover, I am just unsure how german shepard/lab is different from a lab/german shepard... I could def break it up though
yeah if the mix would non-negligibly affect the chances of being adopted then it should be kept
but like Chicken and Chicken mix? Wtf is a chicken mix??
I have another Q on feature selection if thats okay
I plan to do feature selection prior to building my model, probably like RFE.... Should I include my OHE categoricals when I do this? If so, if it says one categorical value (like A, B, C and it says A is useless) is useless then should I drop all my dummies for that categorical?
anyway @gilded bobcat, you could use word vectors to see if the names of the breeds form discernible clusters in that vector space, and treat any breeds that are part of the same cluster as the same breed. But that's basically just binning. If you want to create bins of breeds, you can already do that using whatever bins you want.
you have to decide what features you're going to feed into the model before you feed them into the model, yes. even if you decide to feed all the features into the model, that's just feature selection where you select all the features.
I see, honestly worth a shot or atleast a fun way to practice my clustering techniques.... With 4k+ breeds I need a better way to automate over me deciding
well, you're certainly free to. you can use kmeans clustering on the resultant vectors and see what you get.
Ty 🙂
yw
you might also be conflating feature selection with feature encoding.
I might be confused, what if I have a categorical with A, B, and C values only. I go ahead and OHE these so that I now have three columns in my feature dataset. I then run a feature selection technique over my all my possible features. If my feature selection was to say "A and B are really important but C is useless" should I just drop the whole categorical variable or drop the one hot encoded C column?
I say drop all of that categorical variable, but curious none the less
what does ohe stand for
one hot encode/dummy them out
ah right. I've never seen that as an acronym for some reason.
also there's no established meaning for "feature dataset" as a separate thing from "dataset".
"A and B are really important but C is useless" should I just drop the whole categorical variable or drop the one hot encoded C column?
why would you drop A and B if they're important?
Because they're all dummies within the same categorical variable, I have no good reason to say I should/shouldn't, but it feels like I am tossing out a 1/3 of a variable and not sure if that's okay.
my guess is that the model would just learn to ignore C anyway, but it depends on the model and properties of your dataset.
can someone answer my question, i am still very confused on why it is not working
try a.loc[a.island == 1]
it still does not work
hey I got a quick question for pandas, I want to add a dataframe to another to the right of it
so add it column wise but keep it as is
keep the row names etc
so do this
| A | B
row1 | True row5 | True
row2 | False row6 | False
row3 | False
after appending B to A
| A | B
row1 | True row5 | True
row2 | False row6 | False
row3 | False
im not too sure unless I can see that minmax array. could you send example data?
if anyone knows how to do this for dfs please let me know
seems like something really basic but I can't find an easy way to do this
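Two possible readings of that request, sketched with the frames from the example. `pd.concat(..., axis=1)` aligns on the row labels, so frames with different labels produce the union of rows filled with NaN. To place them physically side by side while keeping each frame's labels, one option is to move the labels into ordinary columns first. The names `row_a` and `row_b` are made up here.

```python
import pandas as pd

a = pd.DataFrame({"A": [True, False, False]}, index=["row1", "row2", "row3"])
b = pd.DataFrame({"B": [True, False]}, index=["row5", "row6"])

# Reading 1: align on row labels. No labels are shared, so this gives
# five rows with NaN wherever a frame has no value for that label.
aligned = pd.concat([a, b], axis=1)

# Reading 2: ignore alignment and keep each frame's labels as a column.
side_by_side = pd.concat(
    [
        a.rename_axis("row_a").reset_index(),
        b.rename_axis("row_b").reset_index(),
    ],
    axis=1,
)

print(aligned)
print(side_by_side)
```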
does anyone know why the matplotlib window might be not responding when I use:
```
plt.pause(0.001)
```
Have you tried pausing for longer?
How much work is it doing? Are you plotting a lot? Matplotlib is slow.
nah, it's a single data point on the first iteration
```
def plot_rewards(show_result=False):
    plt.figure(1)
    rewards_t = torch.tensor(episode_rewards, dtype=torch.float)
    if show_result:
        plt.title('Result')
    else:
        plt.clf()
        plt.title('Training...')
    plt.xlabel('Episode')
    plt.ylabel('Reward')
    plt.plot(rewards_t.numpy())
    # Take 10 episode averages and plot them too
    if len(rewards_t) >= 10:
        means = rewards_t.unfold(0, 10, 1).mean(1).view(-1)
        # pad with 9 zeros so the 10-episode means line up with the episode axis
        means = torch.cat((torch.zeros(9), means))
        plt.plot(means.numpy())
    plt.pause(0.001)  # pause a bit so that plots are updated
    #plt.show()
    if is_ipython:
        if not show_result:
            display.display(plt.gcf())
            display.clear_output(wait=True)
        else:
            display.display(plt.gcf())
```
not on ipython
Is interactive mode on?
yes it is
ooh ok, now it's actually plotting data once I up the wait to 1, but it's still not responding when I click on it
I guess that's not a super big problem, but it is annoying
Use FuncAnimation instead.
Hey, I have a lot of data that will be key value pairs. Both the keys and values will be integers. There will probably be either millions, or maybe billions of key value pairs. What would be the best way to store this data so python could get a value with a key efficiently?
probably with a database if you don't want to store all the data in memory
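As one concrete option, SQLite ships with Python and handles integer-to-integer lookups through a primary key index. This is a sketch, not a benchmark: the table name `kv`, the helper `lookup`, and the squared-number data are placeholders, and an in-memory database is used only so the example runs anywhere (a file path would be used for data that does not fit in memory).

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a filename to store on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key INTEGER PRIMARY KEY, value INTEGER NOT NULL)")

# Bulk insert placeholder pairs; executemany accepts any iterable of tuples.
conn.executemany(
    "INSERT INTO kv (key, value) VALUES (?, ?)",
    ((k, k * k) for k in range(100_000)),
)
conn.commit()


def lookup(connection, key):
    """Return the value stored for key, or None if the key is absent."""
    row = connection.execute(
        "SELECT value FROM kv WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]


print(lookup(conn, 12))
print(lookup(conn, -1))
```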
Hey, im currently trying out sklearns pipeline and wondered if theres an easy way to implement preprocessing and postprocessing into it.
they got bunch of books similar to this and in general i think those are good books, that said i did not read that particular book
Does anyone have experience with extracting data from uneven grid?
I got a question regarding data labeling:
I have a few large datasets of customer feedback which I would wish to label for text classification. However, my time and resources are limited, so I do not have the capacity to label the whole dataset. Therefore, I am interested in algorithms that help me to create labels with only a fraction of the data by utilizing unsupervised/semi-supervised techniques and accepting some noise in the labels.
So my questions would be: Which approaches would you recommend? How would you solve this problem? What state of the art algorithms /papers exist on this topic?
currently i use functions for preprocessing but would want to store all steps into the pipeline to have a complete "model", but i cant implement the functions directly to the pipeline cause it starts with a df then transformations etc. and i struggle to get the right approach
i got n dfs with the shape (300, 2) and transform them into a df where n rows are present and 60+ cols
from this df the cols are my features and i got another df with 20 cols which are my targets, n is always the ID of the df in all cases
don't ask to ask. be sure that you post a complete, answerable question all at once.
ok
actually i am creating an AI bot. after importing the csv file i got a msg like this .
ValueError Traceback (most recent call last)
<ipython-input-21-6250f5fee32f> in <module>
----> 1 df.sample(6)
1 frames
/usr/local/lib/python3.9/dist-packages/pandas/core/sample.py in sample(obj_len, size, replace, weights, random_state)
148 raise ValueError("Invalid weights: weights sum to zero")
149
--> 150 return random_state.choice(obj_len, size=size, replace=replace, p=weights).astype(
151 np.intp, copy=False
152 )
mtrand.pyx in numpy.random.mtrand.RandomState.choice()
ValueError: a must be greater than 0 unless no samples are taken
i am using this csv file
i am using google colab to create this bot
@cold snow df.sample(6) worked for me when I did it with your csv, so you may have inadvertently overwritten your df variable with the wrong thing.
I haven't seen enough of your code to know. try restarting the notebook kernel, and then do df.sample(6) immediately after the df is created.
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.
ipynb files are hard to read unless you open a notebook server on your computer, so just copy and paste the code out of it.
ok
# context window of size 7
n = 7
for i in data[data.name == 'KIRTHIN waifu'].index:
    if i < n:
        continue
    row = []
    prev = i - 1 - n  # we additionally subtract 1, so row will contain current response and 7 previous responses
    for j in range(i, prev, -1):
        row.append(data.line[j])
    contexted.append(row)
columns = ['response', 'context']
columns = columns + ['context/' + str(i) for i in range(n - 1)]
df = pd.DataFrame.from_records(contexted, columns=columns)
what is that loop intended to do? because you should never be writing loops like that.
actually its a pre made command line
i am editing ad using it
and the context is the problem i think
@serene scaffold are you there bro
please don't ping random people. I'm busy right now with work.
ok
Can you format your code using ```
A guide for how to ask good questions in our community.
```
n = 7
for i in data[data.name == 'KIRTHIN waifu'].index:
    if i < n:
        continue
    row = []
    prev = i - 1 - n  # we additionally subtract 1, so row will contain current response and 7 previous responses
    for j in range(i, prev, -1):
        row.append(data.line[j])
    contexted.append(row)
columns = ['response', 'context']
columns = columns + ['context/' + str(i) for i in range(n - 1)]
df = pd.DataFrame.from_records(contexted, columns=columns)
```
that's how debugging goes. getting a new error is progress.
# create dataset suitable for our model
def construct_conv(row, tokenizer, eos = True):
    flatten = lambda l: [item for sublist in l for item in sublist]
    conv = list(reversed([tokenizer.encode(x) + [tokenizer.eos_token_id] for x in row]))
    conv = flatten(conv)
    return conv

class ConversationDataset(Dataset):
    def __init__(self, tokenizer: PreTrainedTokenizer, args, df, block_size=512):
        block_size = block_size - (tokenizer.model_max_length - tokenizer.max_len_single_sentence)
        directory = args.cache_dir
        cached_features_file = os.path.join(
            directory, args.model_type + "_cached_lm_" + str(block_size)
        )
        if os.path.exists(cached_features_file) and not args.overwrite_cache:
            logger.info("Loading features from cached file %s", cached_features_file)
            with open(cached_features_file, "rb") as handle:
                self.examples = pickle.load(handle)
        else:
            logger.info("Creating features from dataset file at %s", directory)
            self.examples = []
            for _, row in df.iterrows():
                conv = construct_conv(row, tokenizer)
                self.examples.append(conv)
            logger.info("Saving features into cached file %s", cached_features_file)
            with open(cached_features_file, "wb") as handle:
                pickle.dump(self.examples, handle, protocol=pickle.HIGHEST_PROTOCOL)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, item):
        return torch.tensor(self.examples[item], dtype=torch.long)
!code
be sure to always use this from now on ^
you have to read this message
ok
Hello chat
i used pastebin but how do i use that in here
Woah, CS grad?
like this
What college?
it says so in the message from the Python bot
this message
an east coast US one
Woah
Can you show me your setup
I have so many questions
I'm planning to study CS in the future
Right now just focusing on school and a little bit of programming when I have time
you're in nepal, right? I don't really know how it works in nepal.
cool flag though 🇳🇵
try asking in #career-advice if there are any developers in india or nepal who know what to do
I'm planning to do IB so I can go abroad
I don't see much potential here
NameError Traceback (most recent call last)
<ipython-input-12-a654172287f5> in <module>
6 return conv
7
----> 8 class ConversationDataset(Dataset):
9 def __init__(self, tokenizer: PreTrainedTokenizer, args, df, block_size=512):
10
<ipython-input-12-a654172287f5> in ConversationDataset()
7
8 class ConversationDataset(Dataset):
----> 9 def __init__(self, tokenizer: PreTrainedTokenizer, args, df, block_size=512):
10
11 block_size = block_size - (tokenizer.model_max_length - tokenizer.max_len_single_sentence)
NameError: name 'PreTrainedTokenizer' is not defined
bro this is the error i got
Code
any way to solve it
Send code
above
you never imported PreTrainedTokenizer, so you'll have to figure out what that is and where to import it from.
how to import it
an import statement. but you have to figure out where it's located.
you must have other import statements in your code to use as an example
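A sketch of how one might check this. It assumes (this is a guess based on the names in the traceback, not something confirmed in the thread) that `PreTrainedTokenizer` comes from the `transformers` package and `Dataset` from `torch.utils.data`. The helper `missing_packages` is made up for illustration; it only reports which packages are not installed, since a fresh kernel or Colab runtime loses anything that was installed in an earlier session.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed sources of the undefined names; adjust if the code uses other libraries.
needed = ["torch", "transformers"]
print("missing:", missing_packages(needed))

# Once both are installed, the imports would go at the top of the notebook:
# from torch.utils.data import Dataset
# from transformers import PreTrainedTokenizer
```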
Dataset?
He hasn't defined Dataset either
How to get started in developing a ai model like chatgpt
are you okay with the model you create being orders of magnitude less impressive than ChatGPT?
I just wanna do it for the fun
look into how to create a basic language model. because language models are what ChatGPT is based on.
Language models means like NLP? Just asking
nlp is the part of AI that deals with natural language. so language models are part of nlp, in the same way that addition is part of math.
stelercus bro i solved tokenizer error also
i reloaded the kernel and forgot to pip install transformers and the tokenizer
reloaded
ran the command at runtime and the problem is solved
Can you do it in python or some other language
pretty much all of nlp is done in python, but conceptually, you can use any language you want.
Okay I'm gonna try it and reach you back
sure, but I only really have time to answer very specific questions.
I understood brother
Do you guys know of any papers on caption to image search? Along the lines of clip or lit but if possible a bit more efficient and accurate ofc haha
I feel like lit, clip, align, blip, blip2 are all more about captioning and classification whereas I am looking only for caption to image search
Please ping me if you know any cool research:)
Does anyone here know how to use OpenAI gym library in python for reinforcement learning? I've been having trouble using an environment to train an AI on.
anyone want to chat and code
The idea is basically this:
import retro
import time
from stable_baselines import PPO2
from stable_baselines.common.vec_env import SubprocVecEnv
from stable_baselines.common import set_global_seeds
from stable_baselines.common.callbacks import CheckpointCallback
from wrapper import wrapper

# Pre-saved states
states = ["ChunLiVsBlanka.1star", "ChunLiVsBalrog.1star", "ChunLiVsBison.1star", "ChunLiVsChunLi.1star", "ChunLiVsDhalsim.1star",
          "ChunLiVsGuille.1star", "ChunLiVsHonda.1star", "ChunLiVsKen.1star", "ChunLiVsRyu.1star", "ChunLiVsSagat.1star", "ChunLiVsVega.1star",
          "ChunLiVsZahgief.1star"]

env = retro.make(game="StreetFighterIISpecialChampionEdition-Genesis", state="ChunLiVsBlanka.1star")
#env = retro.make(game="DonkeyKongCountry2-Snes")
env = wrapper(env)

#model = PPO2.load("D:/Python/Projects/Hakisa/rl_model_1000000_steps")

# The callback has to exist before model.learn() refers to it
checkpoint = CheckpointCallback(save_freq=100000, save_path="D:/Python/Projects/Hakisa/Donkey_Kong")

obs = env.reset()
total_reward = []
steps = 0
done = False

model = PPO2(policy="CnnPolicy", env=env, gamma=0.99, n_steps=64, learning_rate=3e-9, vf_coef=0.5, verbose=1)
start = time.time()
model.learn(total_timesteps=1000000000, log_interval=100, reset_num_timesteps=True, callback=checkpoint)

# Play one episode with the trained model
while not done:
    env.render()
    action, state = model.predict(obs)
    obs, reward, done, info = env.step(action)
    #steps += 1
    total_reward.append(reward)
    time.sleep(0.05)

# Don't use these
#env.render()
#env.close()

finish = time.time()
print("Duration: ", (finish - start) / 3600)
print(f"Total Reward: {sum(total_reward)}")
Also, env.render() and env.close() have some bugs. If you call env.close() it'll simply close your window directly...but when you call env.render(), it'll already render the game window and then close it.
Oh yes... I forgot to mention...this one is gym retro, which is more focused on retro games...but I suppose the original gym might have an idea that is close to this.
Hello my DS friends, curious on your take:
Would you do a feature selection technique if youll already have regularization in your model?
why not?
I guess best case scenario itll make your model faster but wont improve it over just regularization (cause regularization would have dropped the same variables anyway!), at worst youll over generalize and make a worse model... This is my guess tho
I guess ill do both and report back haha
Anyone here use anaconda?
i used to, i remember it looked cool
I am trying to install it on my laptop and gave it a non-default install location. Think it can cope with that? I mean it gave me the choice! lol
it shouldn't be a problem
so far it seems to have all installed fine. Thanks 🙂
scikit learn
how do i pronounce it, i heard multiple pronunciations, what do you use?
have heard*
"sci" (like science) - "kit" (like kitkat) - "learn" (like learn)
some people say skeeet learn
yea thats probably how it was meant
Guys what is the path for be expert in IA?
I do 👀
so for some fun AI stuff im currently working on a SC2 AI for their deep learning ladder and i have gotten this error
"c:/Users/Redux/Documents/python VB/hi.py"
Traceback (most recent call last):
File "c:\Users\Redux\Documents\python VB\hi.py", line 1, in <module>
from pysc2.agents import base_agent
File "C:\Users\Redux\AppData\Local\Programs\Python\Python311\Lib\site-packages\pysc2\agents\base_agent.py", line 20, in <module>
from pysc2.lib import actions
File "C:\Users\Redux\AppData\Local\Programs\Python\Python311\Lib\site-packages\pysc2\lib\actions.py", line 27, in <module>
from s2clientprotocol import spatial_pb2 as sc_spatial
File "C:\Users\Redux\AppData\Local\Programs\Python\Python311\Lib\site-packages\s2clientprotocol\spatial_pb2.py", line 16, in <module>
from s2clientprotocol import common_pb2 as s2clientprotocol_dot_common__pb2
File "C:\Users\Redux\AppData\Local\Programs\Python\Python311\Lib\site-packages\s2clientprotocol\common_pb2.py", line 32, in <module>
_descriptor.EnumValueDescriptor(
File "C:\Users\Redux\AppData\Local\Programs\Python\Python311\Lib\site-packages\google\protobuf\descriptor.py", line 796, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
not sure if this is the right place to post this but do you guys have any insight on how to correct this error?
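A minimal sketch of workaround 2 from the error message above. It assumes the environment variable has to be set before anything in the process imports protobuf, so it goes at the very top of the script. Workaround 1 (pinning protobuf to 3.20.x or lower with pip) is the other option the message lists and is not shown here.

```python
import os

# Workaround 2 from the traceback: force the pure-Python protobuf backend.
# Assumption: this must run before the first protobuf import in the process.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

# The original import would go here, after the variable is set:
# from pysc2.agents import base_agent

print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```

As the message itself warns, this uses pure-Python parsing and will be much slower.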
AI you mean?
sigh-kit-learn
the sci is the same as science, and I'm not aware of a variety of English where the "c" is pronounced.
English is a very weird language for sure haha
it's actually not uniquely weird. every language has irregularity.
I couldn't speak to that as i only speak English so xD I just know the weird things with English which there are a lot for example ate and eight knife science and some other examples of English being interesting
Hey hey people
Anyone here has any experience with bias (e.g. gender bias) in text based data ?
Looking to build a binary classifier to detect bias as a first step
So here are my 3 questions
- Could anyone suggest any open source annotated/labelled datasets? (labels --> 1: biased, 2: non-biased)
- What other methods would you recommend if any?
- In conjunction with 2, I know of word embeddings but never actually used them. Are they implemented/trained with NN mostly ?
Sorry for the long post ^^
P.S. Using python/anaconda and mostly interested of 1)
these are cherry-picked examples that only pertain to spelling. and spelling isn't the only property of a language that can be irregular. (in some senses, spelling isn't even a property of the language at all.)
yeah, because in brasil we say IA
people are more likely to know what you mean in this server if you say AI
okay
yeah
so I haven't studied AI in depth yet
but my goal is to create automations with AI.
either with computer vision or even within some software.
what do you recommend I study on the internet?
if you really wanna be an expert, you should start by learning the basics of python and learning math
AI is math, and the more math you know, the better stuff you'll be able to do with it. separately, python is currently the most popular language for AI due to the large community and available modules, among other things
there are plenty of courses for you to start, with small to little cost (e.g. edx ones ~300) or even completely free (e.g. MIT youtube uploads lots)
You could check those courses and, if interested, delve deeper into math as @wooden sail said, to build strong foundations. (Math will help you remove the feeling of wondering "why" in most cases of AI)
uhum... thanks Edd
why the hesitation?
uhum.. thanks Chonky
getting AI advice from Edd is like getting a million dollars, or something.
maybe that's the regional variant of mhm
it's my way of understanding
interesting
but it's also important to know about communication between systems, so you know how to pull data from a given system and apply a right model?
that's also true. there's layers to AI, and that's on the system design level. you may or may not have to deal with that at all depending on which part of AI you want to focus on
for very large tasks, it's not just one person. you have people building a pipeline, people doing math, people coding the models, people sifting through data, etc
it's not realistic for one person to do all of these for a large model, but they can be good skills to have. in that sense, getting familiar with databases (something i've never done in my life) and generally with linux (because everything runs on linux) are good ideas
oh yeah
do you have lots of working experience ? 😮 @wooden sail
depends on what you call work experience 😛
love ur pfp
but in my head I think, for example, I don't know what is possible to do this, but my idea, for example, is to create software with AI that has automation
^^'
Asking cuz I feel the same about dbs
i do signal processing stuff, which usually has me dealing with the math part, but not necessarily handling data
oh knife-edge model on matlab and stuff like that?
sure. i've never explicitly used that diffraction model, but similar stuff
Interesting and love reading code about those things
But I prefer to stay away from math parts if possible
Nlp seems fun for the time being, except when I need to hunt for datasets...
Programing ==> Python
Theory ==> Mathematics and Statistics
Domain Knowledge ==> Dependent on your area of specialty, the industry you work, etc
if you're in HS, different math topics might be more accessible than others
many schools cover calculus toward the end, but not much on statistics and linear algebra
the good thing is that basic probability and a lot of linalg are independent from most other stuff you learn in school, so you could jump right into them
Since you find NLP fun, do you like reading the theory part of NLP?
At first I did, but due to having to read papers most of the time, not anymore
at least not in volumes
if I have the time to spread them more in my day, yeah depending on the NLP topic
Can I train an image classification cnn on 60 images per class?
depends on the model, whether you are planning to train from scratch or fine tune, as well as how easy your images are to tell apart
Alex net
So decently big
Hey, when creating an ML model can the validation data be the same as the training data?
that's usually a pretty bad idea
there are recent papers showing that some neural network architectures reach 0% training loss under mild conditions, and this in general says nothing about how well the network generalizes
it's a recipe for overfitting
I see, then having separate data for validation is better?
i'd say necessary, not just better, if you want a useful network outside of the training data
unless you only ever need the network to work on the training data. there are special cases where this makes sense
there are even recommendations to have a third data set to see how the network will work with completely unseen data after you reach a final model
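A rough sketch of the train / validation / test idea described above, using only the standard library. The function name `three_way_split` and the 15% fractions are made up for illustration; in practice a library helper would usually do this.

```python
import random


def three_way_split(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint train / validation / test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                 # only touched once, after the final model
    val = shuffled[n_test:n_test + n_val]    # used while tuning
    train = shuffled[n_test + n_val:]        # used to fit the weights
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

The point is only that the three sets never share a sample, so the validation and test scores say something about unseen data.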
Hello, good afternoon
how to solve this inequality with sympy? 4 <= 3x - 2 < 13
tried a lot of solvers and also can't find an example on the internet
iirc pandas also requires you to use (a < b) & (b < c) (or use methods like .between) instead of supporting a < b < c, I guess the way python handles a == b == c, a < b < c etc doesn't allow as much customisation
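To illustrate why the chained form fails for element-wise objects: Python evaluates `a < b < c` as `(a < b) and (b < c)`, and `and` needs a single true/false value from the first comparison. The sketch below uses a tiny stand-in class (`Elementwise`, invented here, not a pandas or sympy type) that behaves like an array in this one respect.

```python
class Elementwise:
    """Tiny stand-in for an array/Series: comparisons return element-wise results."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        return Elementwise(v < other for v in self.values)

    def __gt__(self, other):
        return Elementwise(v > other for v in self.values)

    def __and__(self, other):
        return Elementwise(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # `and` in a chained comparison ends up here.
        raise ValueError("truth value of an element-wise result is ambiguous")


s = Elementwise([1, 3, 5, 7])

# Explicit form: two comparisons combined with &.
mask = (s > 2) & (s < 6)
print(mask.values)  # [False, True, True, False]

# Chained form: equivalent to (2 < s) and (s < 6), which calls bool() on the first part.
try:
    2 < s < 6
except ValueError as exc:
    chained_error = str(exc)
    print("chained comparison failed:", chained_error)
```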
!e
from sympy import solve_univariate_inequality, Symbol
x = Symbol('x')
s1 = solve_univariate_inequality(4 <= 3*x - 2, x)
s2 = solve_univariate_inequality(3*x - 2 < 13, x)
print(s1 & s2)
@wooden sail :white_check_mark: Your 3.11 eval job has completed with return code 0.
(-oo < x) & (2 <= x) & (x < 5) & (x < oo)
i thought it would simplify it, somehow
oo? is that their infinite representation?
when printing, yeah
nice solution... I was trying to solve it all together.... thank you for the help. Really appreciate it
i guess we can pass the parameter extended_real = False to get rid of the infinities
!e
from sympy import solve_univariate_inequality, Symbol
x = Symbol('x')
s1 = solve_univariate_inequality(4 <= 3*x - 2, x, extended_real=False)
s2 = solve_univariate_inequality(3*x - 2 < 13, x, extended_real=False)
print(s1 & s2)
@wooden sail :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 3, in <module>
003 | s1 = solve_univariate_inequality(4 <= 3*x - 2, x, extended_real=False)
004 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
005 | TypeError: solve_univariate_inequality() got an unexpected keyword argument 'extended_real'
hmmm
the API is kinda bad for this, the argument is there for rational polynomials. oh well, no matter
it's very tricky to get without documentation... and I also have lots of books here. Not one example like that
yeah, same as here
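One possible way to get the simplified answer, sketched under the assumption that a set result is acceptable: `solve_univariate_inequality` accepts `relational=False`, which returns an interval instead of an `And(...)` of relations, and the two intervals can then be intersected.

```python
from sympy import Interval, Symbol, solve_univariate_inequality

x = Symbol("x")

# relational=False returns a set (an Interval) rather than a relational expression.
lower = solve_univariate_inequality(4 <= 3*x - 2, x, relational=False)
upper = solve_univariate_inequality(3*x - 2 < 13, x, relational=False)

solution = lower.intersect(upper)
print(solution)                    # the half-open interval from 2 to 5
print(solution.as_relational(x))   # back to a relation in x, without the infinities
```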
does anyone know any alternative to GPT 3 cuz i used all of my tokens and uh let's say i have some paypal issues
alternative in what sense?
basically it can do the same stuff but it's free
i'm fairly new to AI and stuff so i just want to try it out and see if i can use it in one of my projects
so
i don't want to pay rn
i know that this isn't about openai and gpt but i thought this might be the place where i can find people who have knowledge about this kind of stuff
perhaps chatgpt or bloom
that thing?
does not sounds like the weights are public
and let's not talk about leaked models
When dealing with a dataset of images, is it always recommended to apply normalization on the images?
AI is math
To think I don't like complex math, and I've chosen exactly the programming field with most math 
||As an intern researcher, I prefer to consider there's no excuse to not do everything...because that might be the case
||
I think you will find (and this is entirely subjective), that the more interesting a programming field is, the more math is probably involved (there are other kinds of interesting, such as designing a beautiful web page, but since this is programming i'm assuming a specific type of interest).
this is your daily reminder that CS is originally a branch of mathematics
the compute in computer science is from computability theory
If only you could get a math major with it...
(Also it makes the "science" in "computer science" extra wrong)
("computer math?")
computer meth
hello everyone
does anyone know pyspark here
I cant seem to get pyspark work with pycharm
reee
you have to give more information to get help. what did you try to do, what did you expect to happen, and what happened instead? (and if you show code or error messages--which is strongly encouraged--don't post them as screenshots.)
Hi all, I'm having problems with a task I was given for a course. For context, I'm a physics student in their first semester, and I'm taking a course to be able to use python in my branch. I'm afraid that the task is a bit above my level in physics though, so I'm not sure how I should continue... I've plotted the data, but I don't know how I could continue with the rest of the task- so if anyone could at least help me with a periodogram, I'd be really thankful!! 
I need help with pandas ASAP, I would appreciate help. Basically I have df with 3 cols: userID, value, itemID. I need to do the following: group by user, pick the biggest value and assign itemID, which corresponds to value to all userIDs. How can I do this?
to get help ASAP, please do print(df.sample(10).to_dict('list')) and put the text (no screenshots) in the chat in your next message.
doesnt matter now, solved it
for future reference, copy-and-pastable dataframe examples increase the likelihood that you'll get help fast.
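For the record, one reading of the question above (per user, find the row with the biggest value and give that row's itemID to all of the user's rows) could be sketched like this. The data and the `top_itemID` column name are made up, since the original dataframe was never posted.

```python
import pandas as pd

# Hypothetical data with the three columns described above.
df = pd.DataFrame({
    "userID": [1, 1, 2, 2, 2],
    "value": [5, 9, 3, 8, 1],
    "itemID": ["a", "b", "c", "d", "e"],
})

# Row label of the largest value within each user.
top_rows = df.groupby("userID")["value"].idxmax()

# Map each user to the itemID on that row, then broadcast it to all of that user's rows.
top_item = df.loc[top_rows].set_index("userID")["itemID"]
df["top_itemID"] = df["userID"].map(top_item)

print(df)
```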
I have a 1 x n df (n is always going to be even). Columns are named column1_null_count, column1_not_null_count, column2_null_count, column2_not_null_count, ...
How can I turn this into:
column1 | column1_null_count | column1_not_null_count
column2 | column2_null_count | column2_not_null_count
Without having to do some kind of looping and split on '_'
see my above messages about copy-and-pastable examples.
but do print(df.T.head(10).to_dict('list'))
you also don't want to have dataframes where the number of columns is the one that varies.
please ping me if/when you do that.
ha managed to sort your null count query out eh?
i would take the underlying numpy values, reshape and assign another column for the extra column name
Unfortunately, had to hard code it with some help of python generating the query.
Essentially ended up as a large sum(case when XXXX is null then 1 else 0 end) XXXX_count_nulls, count(XXXX) as XXXX_count_not_nulls for each column. It's not great, but no way around it. Ends up as a 1 x n df.
" If it's stupid but it works, it's not stupid " 🤷
This shows up as. {0: [0, 10, 0, 10, 1, 9, 3, 7, 10, 0]}
More like I already spent an hour googling and couldn't find a built in function w/ SQL that does that.
My other option was to pull the entire table, but that takes longer.
do print(df.head().to_dict()) instead.
i would take the underlying numpy values, reshape and assign another column for the extra column name
meaning something like
import pandas as pd
df = pd.DataFrame({'col_1_a': [1], 'col_1_b': [1], 'col_2_a': [2], 'col_2_b': [3]})
pd.DataFrame(df.values.reshape(2,-1), columns=['a', 'b']).assign(column=df.columns[::2].str.rsplit('_', n=1).str[0])
(i am all ears for a neat actual pandas solution if anyone has one 🙏 )
I might have one once I get the print result 😛
{'col1_a' : {0: 0}, 'col1_b': {0: 10}, 'col2_a' : {0: 0}, 'col2_b': {0: 10}, 'col3_a' : {0: 1}, 'col3_b': {0: 9}, ...}
thanks, one moment.
Well, df.values.reshape(-1,2) I think brings me 90% there, I just need to add in the index w/ the column name.
Nvm, above works if I swapped the reshape.
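A sketch of the swapped version on the sample that was posted, i.e. `reshape(-1, 2)` so each output row holds one (a, b) pair. It assumes the columns always come in adjacent `_a`, `_b` pairs in that order; the output names `column`, `a`, `b` are placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "col1_a": [0], "col1_b": [10],
    "col2_a": [0], "col2_b": [10],
    "col3_a": [1], "col3_b": [9],
})

# One row per original column pair: [[0, 10], [0, 10], [1, 9]]
long = pd.DataFrame(df.to_numpy().reshape(-1, 2), columns=["a", "b"])

# Every second column name, with the trailing "_a" stripped, labels the rows.
long.insert(0, "column", df.columns[::2].str.rsplit("_", n=1).str[0])

print(long)
```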
@charred light
In [23]: df
Out[23]:
col1_a col1_b col2_a col2_b col3_a col3_b
0 0 10 0 10 1 9
In [24]: df2 = df.T.reset_index()
In [25]: df2
Out[25]:
index 0
0 col1_a 0
1 col1_b 10
2 col2_a 0
3 col2_b 10
4 col3_a 1
5 col3_b 9
In [26]: df3 = df2['index'].str.extract(r"col(\d+)_(\w+)")
In [27]: df3
Out[27]:
0 1
0 1 a
1 1 b
2 2 a
3 2 b
4 3 a
5 3 b
In [28]: df3['num'] = df2[0]
In [29]: df3
Out[29]:
0 1 num
0 1 a 0
1 1 b 10
2 2 a 0
3 2 b 10
4 3 a 1
5 3 b 9
In [37]: df3.pivot_table(columns=1, index=0, values='num')
Out[37]:
num
a b
1 0 10
2 0 10
3 1 9
CC @boreal gale
oh yeah, good one 👍 i was thinking of .T.reset_index() but my brain was fried and it just eluded my mind
Thanks.
I also just realized, I actually don't really need the not_null count.
Could have just gotten the full count once at the start.
What if the returns do not make a continuous function?
what's "the returns"?
TestWindowLen = [90, 125, 256]
CheckDays = [ 1,15, 30 ]
std_dev=[25, 50, 100]
What is backpropagating/how do you do it?
Hi! Has anybody here worked with object detectors in webapps?
I'm trying to integrate YOLOv8 into my React Project. I have code for frontend and backend. Im thinking that Im gonna need 3 CLIs for this since I need 1 each for client and server and another for the Object Detector. Do any of you have any ideas as to how I could accomplish this?
Any clue?
yes
i don't think that's an issue. have a look at https://optuna.org/ if you just want a package to do the hyperparameter optimisation for you. otherwise you will need to dive into some papers to fully understand what's going on
What do you use? For bayesian optimisation*
historically i have been just using https://github.com/fmfn/BayesianOptimization
but currently playing with optuna which i linked above.
What do you think of my results? Are they continuous?
^
well firstly it's important to really distinguish properly what is it that you are showing.
are these hyperparamters of your models?
are these some sort of output of your models?
The hyperparameters.
so the hyperparameters you have shown looks like components you would need to create a grid in grid search.
without knowing the model you are working with, i can't tell if they are continuous or not.
Im evaluating this. (What model to use (but the function being continuous is a factor))
So I've been told to use random search, which should be fast and possible. It was said random search can be combined with other optimization techniques like Bayesian optimization.
How do you determine if a function is continuous? @boreal gale
sorry i am really confused as to what are you trying to do. perhaps it's a language barrier or there is some knowledge gap somewhere
what's the thing you are trying to model?
what are your inputs? where are they from? what do they mean in the real world?
what are your output(s)? where are they from? what do they mean in the real world?
How do you determine if a function is continuous?
layman explanation is probably "function that does not have discontinuities" or "something that you can draw with one stroke of a pen, as opposed something that require you to lift your pen"
I have a program that uses a grid search to find the optimal parameters for another function. Im trying to find a better way to get optimal values. @boreal gale
So Im wondering if a random search can be used/applied to this.
another function
what is this "another function"? without knowing this i can't comment on whether the parameters (i.e. the hyperparameters) are continuous. do you see what i am getting at?
So Im wondering if a random search can be used/applied to this.
yes, random search is always an option.
What do you need to know about the function to say if its continuous?
i don't know. but preferably the entire definition.
also are you asking whether the hyperparamters are continuous or the function itself? those are two different things.
If you can use grid search, then obviously you can sample random points on that grid to test your model with. So yes, you are able to use random search.
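A small sketch of that idea, using the three value lists quoted earlier in the chat. The `score` function is a made-up stand-in for whatever the real evaluation is, and the budget of 10 trials is arbitrary.

```python
import itertools
import random

# Grid values quoted earlier in the chat.
TestWindowLen = [90, 125, 256]
CheckDays = [1, 15, 30]
std_dev = [25, 50, 100]


def score(window, days, dev):
    """Hypothetical stand-in for the real evaluation of one parameter combination."""
    return -abs(window - 125) - abs(days - 15) - abs(dev - 50)


# Grid search would evaluate all 27 combinations.
grid = list(itertools.product(TestWindowLen, CheckDays, std_dev))

# Random search evaluates only a random subset of them.
rng = random.Random(0)
candidates = rng.sample(grid, k=10)
best = max(candidates, key=lambda params: score(*params))

print(best, score(*best))
```

Nothing here needs the score to be continuous; it only has to be computable for each combination.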
Considering my hyperparameters I thought you could derive if the results for eg: (90, 1,100) if these are continuous.
i'm decent at numpy, but won't go in dms
u do u
we discourage dms, as the server is for helping in the server
k
pretty much
the composition through different layers makes that a terrible exercise in the chain rule
deep learning frameworks evaluate lazily. they create a computational graph that makes the forward pass more efficient, and automatic differentiation easier
i think someone had linked you before to a website on how to construct and traverse computation graphs. if you don't use autodiff, you either construct the graph yourself on code, or do the derivatives on paper
tensorflow used to do that, didn't it
before version 2 or so. when pytorch just began, its killer feature was implicit computational graphs rather than explicit.
(then TF learned to do that too)
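To make the "computational graph plus chain rule" idea concrete, here is a toy reverse-mode autodiff for scalars with only `+` and `*`. It is a sketch of the general technique, not how pytorch or tensorflow implement it, and the `Value` class is invented for this example.

```python
class Value:
    """A scalar that remembers how it was computed (one node of the graph)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Topologically order the graph so each node comes after its inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


x, y = Value(3.0), Value(4.0)
f = x * y + x          # the forward pass builds the graph implicitly
f.backward()           # the reverse pass fills in the gradients
print(x.grad, y.grad)  # 5.0 3.0
```

Here df/dx = y + 1 and df/dy = x, which matches the printed values for x = 3, y = 4.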
does anybody have resource link where I can learn about Handwriting recognition using Deep Learning ?
What are the neurons, why are there layers, and what is the math underlying it?
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks
3 blue one brown explains it
@terse kindle ^
Will look into it. Thank you so much
dataframe\pandas or numpy
yes, that would just make it easier for you to make your own autodiff
i don't know how in depth you wanna go into the making your own deep learning stuff
fair enough. in python, i would recommend stuff like jax, pytorch, and tensorflow for this task. you can also do it with sympy, but sympy is pretty slow
jax, notably, is very annoying to build on windows
are there any obvious bugs in this DQN back propagation code I wrote? I mostly used the pytorch example but I changed a couple things so it would fit my tensor shapes and I'm worried the reason it's diverging is cause I have some bug
def optimize_model():
    if len(memory) < BATCH_SIZE:
        return
    batch = memory.sample(BATCH_SIZE)
    state_batch, reward_batch, next_state_batch, terminal_batch = zip(*batch)
    state_batch = torch.stack(tuple(state for state in state_batch))
    reward_batch = torch.stack(reward_batch)
    next_state_batch = torch.stack(tuple(state for state in next_state_batch))
    if torch.cuda.is_available():
        state_batch = state_batch.cuda()
        reward_batch = reward_batch.cuda()
        next_state_batch = next_state_batch.cuda()
    q_values = policy_net(state_batch)
    policy_net.eval()
    with torch.no_grad():
        next_prediction_batch = target_net(next_state_batch)
    y_batch = torch.cat(
        tuple(reward if terminal else reward + GAMMA * prediction for reward, terminal, prediction in
              zip(reward_batch, terminal_batch, next_prediction_batch)))
    optimizer.zero_grad()
    loss = loss_function(q_values, y_batch.float())
    loss.backward()
    torch.nn.utils.clip_grad_value_(policy_net.parameters(), 100)
    optimizer.step()
I assume no one works with pyspark on Mac here?
No one managed to answer the error I keep getting 😅
Traceback (most recent call last):
File "/Users/kadiraltunel/PythonProjects/Lab1/main.py", line 17, in <module>
df = spark.createDataFrame(data=data, schema=col_names)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/sql/session.py", line 894, in createDataFrame
return self._create_dataframe(
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/sql/session.py", line 938, in _create_dataframe
jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/rdd.py", line 3113, in _to_java_object_rdd
return self.ctx._jvm.SerDeUtil.pythonToJava(rdd._jrdd, True)
^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/rdd.py", line 3505, in _jrdd
wrapped_func = _wrap_function(
^^^^^^^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/rdd.py", line 3362, in _wrap_function
pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/rdd.py", line 3345, in _prepare_for_python_RDD
pickled_command = ser.dumps(command)
^^^^^^^^^^^^^^^^^^
File "/Users/kadiraltunel/Documents/Spark/python/pyspark/serializers.py", line 468, in dumps
raise pickle.PicklingError(msg)
_pickle.PicklingError: Could not serialize object: IndexError: tuple index out of range
^ this is the error message they're referring to
The code doesn’t have any issues @serene scaffold
It’s the pyspark causing issues on my Mac
if the error message of yours that I just posted isn't the one that you currently need help with, you have to show the new error message.
Oh that’s the error
I’m just saying that so that people don’t try to find out if the code is wrong. That’s my teacher’s code which works on his computer
I’m using Pycharm on MacBook Air M2
My teacher couldn’t figure it out either but then he doesn’t use Mac @serene scaffold
anyone used to plotly dash in python here?
@warm copper what python version are you using?
because i recall there was someone who dug up a jira issue for you which shows a similar error for python 3.11 iirc, and it was quickly dismissed for some reason, all without actually checking anything as far as i can see.
But if the function is what you need then I would have to ask more about your background. Its in financial space. Are you open to pms?
what's your spark version? i presume 3.3.2?
python 3.11 does not work with pyspark 3.3.2; only pyspark 3.4+ (which is unreleased as of writing this) works with it.
please downgrade to 3.10 as a workaround for now.
ref: https://github.com/apache/spark/pull/38987
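The rule stated above can be written down as a tiny guard to run before creating a Spark session. This is only the rule of thumb from this conversation (Python 3.11 needs PySpark 3.4 or newer), not a full compatibility matrix, and the function name is made up.

```python
import sys


def pyspark_supports_python(python_version, pyspark_version):
    """Rule of thumb from the chat: Python 3.11 needs PySpark 3.4 or newer.

    Assumption: older Python versions are treated as fine here.
    """
    if tuple(python_version[:2]) >= (3, 11):
        return tuple(pyspark_version[:2]) >= (3, 4)
    return True


print(pyspark_supports_python((3, 11), (3, 3, 2)))  # False
print(pyspark_supports_python((3, 10), (3, 3, 2)))  # True

# Check the interpreter actually running this script against pyspark 3.3.2.
print(pyspark_supports_python(sys.version_info, (3, 3, 2)))
```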
i have a meme for you in #ot1-perplexing-regexing
i have a no dm policy, sorry.
also, the function being continuous or not has no relevance to whether you can use bayes opt or not. why do you want to know if your function is continuous anyway?
edit: meh i can't type, i missed out an important no in the middle of the sentence 😂
this might be a controversial take but I think anybody who has the know-how to use jax and who is doing complex applications that would need it (since it's oriented towards applications where you'd want complete control over pretty much every operation) would likely already be using linux due to the headache that windows brings
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
import cv2
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.optimizers import Adam
from keras import backend as K
from keras.layers import Conv2D,MaxPooling2D,UpSampling2D,Input,BatchNormalization,LeakyReLU
from keras.layers.merge import concatenate
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
import tensorflow
import tensorflow.compat.v1 as tf
tensorflow.random.set_seed(123)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
tf.keras.backend.set_session(sess)
tensorflow.random.set_seed(2)
np.random.seed(1)
print(os.listdir("../deneme/dataset/dataset_updated/"))
LEARNING_RATE = 0.001
Model_Colourization.compile(optimizer=Adam(learning_rate=LEARNING_RATE),
loss='mean_squared_error')
Model_Colourization.summary()
I am getting an error in the compile part. How to fix
I was told the results need to be continuous to use random search.
hmm okay, again we need to clarify what is "the results"
in hyperparameter optimisation, you obviously have some metric you are trying to maximise/minimise, if that's what you are calling "the results" (since your application is finance, let's just assume this is your profit for example) then i don't think that statement is correct, random search doesn't have any requirements like that.
random search is literally trying some hyperparamter configuration and see how good it performs, whether the metric you are trying to maximise/minimise is continuous or not shouldn't matter i think.
however if that statement is aimed towards bayesian optimisation, then maybe there is some truth to it. i have never dealt with a discrete metric to optimise for. but i can kinda see why the metric being discrete might be an issue.
Results from the function that gives you the best hyper parameters we are trying to improve.
do you mind writing that in your native language just this once? i really think there is a language barrier here, as i am failing to understand that.
What languages do you speak?
Besides english
just english and chinese
Whats confusing you?
I have a function Im using to get hyperparameters.
To use random search on this function the results should be continuous.
I can be wrong.
I have a function Im using to get hyperparameters.
okay, let's call this function the hyperparameters-optimiser from now on.
Ok.
To use random search on this function the results should be continuous.
is "this function" here the hyperparameters-optimiser?
Yes.
when you say on, did you mean in?
Would you use an intercooler in or on a car?
meaning.. "using random search as the core logic of this hyperparameter-optimiser"
Yes.
okie dokie
🙂
now that only leaves "the results" as my only source of confusion.
i assume "the results" mean the evaluation metric of the model?
the model here is not the hyperparamter-optimiser
Or you think the results should be from the function that the optimized values are used in?
when you use the word "the results", my mind just draws a blank, hence the confusion.
but if you do mean the evaluation metric your hyperparamter-optimiser will be working to maximise, then my above reply is relevant
#data-science-and-ml message
what makes more sense to you?
The results being the best values from the hyperparamter-optimiser or the performance of the function that its being used in?
"performance of the function that its being used in" makes more sense, yeah (but i can't just assume that's what you meant)
Well i use it in something called an insample test.
The values would be ratios and percentages.
okay, random search should be fine for reasons stated above.
as for the issue of continuous or not, ratios/percentages sound pretty continuous to me, unless the components of the ratio are themselves bounded and discrete
e.g. say if your ratio is X:Y, and X can only ever take value (1,2,3,4,5) and Y only take value (5,6,7,8,9), then that ratio doesn't sound continuous to me.
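The example above can be checked by just listing every ratio that is possible with those X and Y values. This is a throwaway illustration; `Fraction` is used so equal ratios like 3/6 and 4/8 collapse to one value.

```python
from fractions import Fraction

X = [1, 2, 3, 4, 5]
Y = [5, 6, 7, 8, 9]

# Every value the ratio X:Y could ever take. The set is finite, so the metric is discrete.
ratios = sorted({Fraction(x, y) for x in X for y in Y})

print(len(ratios), "distinct values between", ratios[0], "and", ratios[-1])
```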
Google Sharpe Ratio.
it's continuous then
Hey @vocal fractal!
You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.
https://paste.pythondiscord.com/fefapididi getting this error (pasted at the end of code). Can someone please help
How often you use a random search?
@boreal gale
Do you use insample out of sample split?
80/20 for example
RSI as in Relative Strength Index?
presumably it's because when calculating RSI, you had to discard some data because you don't have 14 (or whatever your window is) days worth of history for RSI calculation, and you will have less rows compared to the original dataframe
when you try to assign non-scalar (i.e. not just one value) data (in a data structure without pandas index information) back into the dataframe, the length has to match the original dataframe.
in this case you are indeed assigning non-scalar data without pandas index information (a numpy array) back into a dataframe and the length doesn't match, hence an error occurs, specifically "ValueError: Length of values (3312) does not match length of index (3326)" - notice how the two values are off by 14!
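A small reproduction of that mismatch, with made-up numbers (20 rows and a 14-row warm-up instead of the real data). The `shorter` array stands in for an indicator that dropped its first rows. Two possible fixes are sketched: pad the front with NaN, or assign a Series that carries the matching index so pandas can align it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"close": np.arange(20, dtype=float)})
window = 14

# Stand-in for an indicator that lost its first `window` rows.
shorter = np.arange(len(df) - window, dtype=float)

# A bare numpy array has no index, so the lengths must match exactly.
try:
    df["indicator"] = shorter
except ValueError as exc:
    message = str(exc)
    print(message)

# Fix 1: pad the warm-up rows with NaN so the lengths agree.
df["padded"] = np.concatenate([np.full(window, np.nan), shorter])

# Fix 2: give pandas the index, and it aligns the values (missing rows become NaN).
df["aligned"] = pd.Series(shorter, index=df.index[window:])

print(df.tail())
```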
if this isn't a homework, maybe looking into using TA-lib instead of homebrewing RSI calculation is a worthwhile thing to do.
https://pypi.org/project/TA-Lib/
not that often. i generally prefer grid search/bayes opt, grid search if i am feeling lazy, bayes opt if i know it's going to be tough finding a good param.
@boreal gale Thank you so much! I have so much to learn!
Isnt Bayes a version of random search?
yes to this, but a bit more complicated usually. it's more like this from the sklearn docs, there is more than one split going on, a la cross validation
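A plain-Python sketch of the "more than one split" idea (k-fold cross validation): every sample is used for validation exactly once across the folds. This is an illustration only, not scikit-learn's implementation, and `kfold_indices` is a made-up name.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


for train_idx, val_idx in kfold_indices(10, 5):
    print("validate on", val_idx, "train on", train_idx)
```

For time-ordered data such as prices, shuffled or interleaved folds can leak future information, so an ordered split may be more appropriate there.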
ohh.. i wouldn't consider bayes opt as random search, it's more "search guided by some bayesian model" (which imo isn't random at all)
i see how one can consider that as random though!
"random search can be combined with other optimization techniques like Bayesian optimization"
👌
Which is fast and powerful?
Why not use random search?
Is Bayesian optimization fast?
random search is very luck based.
bayes opt at least has some theory as to why it should work
though it's really worth noting, hyperparameter optimisation is not a silver bullet.
sometimes, one's data/model is just not up to scratch, no matter how you tune your hyperparameter, the model is still not going to perform to your satisfaction
had that happened to me once or twice, my data is just shit, no matter what i do, i can't make anything useful from it
our mega thread just drowned out some poor guy's help request, #data-science-and-ml message
could someone help this poor soul D:
Can someone give me some help with unsupervised learning applied to neural networks?
I'm currently trying to test a Minimum Entropy Loss from a recent paper, which has the objective of making the model learn to minimize the information entropy of certain input more explictly, even allowing it to "pre-classify" the input(inputs with similar entropy level are more likely to be from the same class, like in CIFAR100, ambulances pics tend to have the same minimum entropy).
However, I'm having the problem that...my model loss is actually increasing after each epoch, not decreasing(if not to say that the entropy minimization doesn't seem to make sense at all). I suppose this means the model isn't being consistent with the entropy minimization.
Can someone give me some ideas on what could be causing this? Is there any more "consecrated" loss function for this task, just so I can use it as a control compared to this MinEntLoss?
(Yes, I have reviewed my code quite a few times to make sure I'm implementing the loss and unsupervised task correctly)
Also, the model is a ResNet extracting features from 100x100x3 images into 512 features.
bruh I think its my teacher's code @boreal gale
i had a look at it, it's been a while since i touched pyspark but i do believe the code is alright.
i really do recommend at least try to install python3.10 (and pyspark-related libraries on it) and give the same code another go.
i have now replicated the issue in python3.11
https://paste.pythondiscord.com/gamayuqubi
exact same code running on python3.10 is fine
https://paste.pythondiscord.com/oxatizalit
@warm copper 👆
so its the python?
indeed. python 3.11 is not supported yet.
oh interesting
ref: https://github.com/apache/spark/pull/38987 as mentioned in my previous message
https://github.com/apache/spark/pull/38987#issuecomment-1343650267 is the most interesting part, py3.11 is only supported from spark 3.4 onward (assuming they don't revert the change obviously)
spark 3.4 not python 3.4
but to answer your question 3.4 was out a long long time ago
it's not out yet, i am unsure what's their release schedule
hence my recommendation to downgrade to python3.10 here
if you haven't heard of pyenv, it might be worth looking into it, it will lessen the burden of installing python/switching python version.
from pyspark.sql import SparkSession
from pyspark.sql.functions import sum, desc

spark = SparkSession.builder.appName('Covid').getOrCreate()
covid = spark.read.csv(
    '/Users/kadiraltunel/PycharmProjects/covid-us.csv',
    sep=',', inferSchema=True, header=True,
)
covid.show(50)
covid.groupBy('date').agg(sum('cases'), sum('deaths')).orderBy('date').show()
covid.groupBy('state').agg(sum('cases'), sum('deaths')).orderBy(desc('sum(cases)')).show()
covid.select(sum(covid.cases)).show()
I wonder if you get the same results as I do tho
when you run it
i don't have access to your csv, so it's gonna blow up, but i will most likely get the same behaviour as you i believe.
this is the data
as to why it only breaks when running the dataframe example script?
it's because of some spark internals: creating a dataframe from manually supplied data apparently requires a shuffle of the data (or at least the shuffle function), so the random.Random referenced here https://github.com/apache/spark/pull/38987/files gets pickled for transport to another process. (this is basically how your python code gets transported to the executors in spark: via something called a pickle. you might also see cloudpickle or dill, which are all very similar and built upon pickle.) seeing as this class no longer exists, code running on python3.11 blows up.
i assume this is similar, ran it on python3.10
anyway, you can tell your professor 3.11 can't run the dataframe script properly at the moment. you can link https://github.com/apache/spark/pull/38987 and/or https://issues.apache.org/jira/browse/SPARK-41125 if he asks for proof/reasons
also, if that's the only script that doesn't run, there is not much reason to downgrade 🤷 (my advice to downgrade was based on my understanding that spark just plainly doesn't work at all in python3.11, about which i was obviously wrong)
weird that there are over 31 billion cases right? @boreal gale
Hey guys, I need help with a customer churn prediction model for my group project in my marketing course. Basically, we have a database of 1.2 million customers who made purchases at various retailers. A specific retailer has been assigned to us, so we now have 591k customers who have made at least one purchase at this retailer. We would like to create a binary variable called 'churn' that takes the value 1 if the last purchase was more than 18 months ago, and 0 if not. We have data for 36 months (2019-2021). We would like to predict customer churn. I intend to split the data randomly into train and test; however, my question is: after fitting the model... what do I do? Obviously I can't try to predict the customers the model was trained on, so we can ignore them. What about the rest? How do I apply my model so we can confidently say: this cluster of customers is at risk of churning?
Edit: After thinking, I had this idea: after training the model, I create a new dataframe containing only customers that are still active and haven't churned yet. Then I fit the model, and every customer that is predicted to churn will form my group of customers that are at risk of churning?
Hi! Can someone help me with the following problem:
I'm trying to use a Transformer XL layer in my model, but when I used the argument "**kwargs" it told me it's not defined, even though the documentation uses it. Help me please
Does anyone know a Discord community focused on ChatGPT and other AI?
is there a way to version control jupyter notebooks on github that doesnt make the diffs insane and huge
or does kaggle or hugging face have something smarter
hello i am new here and i am searching for how to become data science programmer can anyone suggest what do i learn first
do you know python?
yes
have you taken any machine learning courses in university?
nope i am just learning from home
well the first thing to do would be get super familiar with all the basics of python. data scientists use almost exclusively python
yes i watched a lot of tuto and i don't know what to do next
learning by doing is the best way. pick a project and test your skills. preferably a data science project
there are three main things data scientists do:
gather data
clean data
apply machine learning/statistics to data
thanks i will try to learn those
if you're looking for data that's already gathered and mostly cleaned, kaggle.com is a great resource
yes
**kwargs is not really an argument by itself, it just collects any extra keyword arguments that get passed to a function.
import torch

def init_optimizer(params, **kwargs):
    arguments = {
        'lr': 0.001,
        'betas': (0.9, 0.999),
        **kwargs,  # any extra keyword arguments, e.g. eps=1e-6
    }
    return torch.optim.Adam(params, **arguments)
Considering that torch.optim.Adam() accepts lr, betas and eps as arguments, you could call init_optimizer(params, eps=1e-6), for example; eps then ends up in **kwargs, an extra argument that init_optimizer doesn't define explicitly.
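Here's a dependency-free sketch of the same idea, in case it helps. `init_optimizer_config` is a made-up helper that just builds a dict; the point is that the name `kwargs` only exists inside a function that declares `**kwargs` in its signature, which is why using it anywhere else gives a "not defined" error.

```python
def init_optimizer_config(lr: float = 0.001, **kwargs) -> dict:
    """Collect optimizer settings; `kwargs` holds any extra keyword arguments."""
    config = {"lr": lr, "betas": (0.9, 0.999)}
    config.update(kwargs)  # extras such as eps=1e-6 are merged in (and can override defaults)
    return config

print(init_optimizer_config())
print(init_optimizer_config(eps=1e-6))

# The ** operator also works the other way round: it unpacks a dict into keyword arguments.
extra = {"eps": 1e-6, "weight_decay": 0.01}
print(init_optimizer_config(lr=0.01, **extra))
```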
Thank u bro
pca, tSNE (visualisation, dimensionality reduction), contingency tables, uni/bi/multi variate analysis, what other statistics should i learn before my interview?
I will do micro/macro F1, confusion matrix, AUC, ROC etc, too
from libraries what important functions should i know?
I will be grateful if you take a look at #1088401092326477824 
Anyone here knows any free/open-source datasets regarding bias ?



Should I maybe dm another channel here? 
Can anyone help me in resolving this error?
remember to always give code and errors as text, so that we can copy and paste them as needed.
though I suspect that the model you're trying to load is just invalid.
Hi I want to train a regression model.
Should I find the optimal degree first on a default model?
And then take care of regularisation and other parameters later with the optimal degree of the regression model?
use regularisation from the start.
hmm, that will be a really complex loop ig. Because regularisation CV loop, plus an outer loop of degrees
Does that mean if you have n hyperparameters you would always need a n-nested loop?
you do not have to perform a full grid search on all parameters, if there are too many hyper parameters to tune just use a random search instead of grid search or fix some of them
In this case it's just 2, so it will work
Once I did a 4 level grid search, that took a very very long time so I ended up fixing one of the parameters to just an acceptable value
remember that if you are way too picky about your hyper parameters you may end up effectively overfitting to your test set
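On the n-nested-loop question: you don't need to write the nesting by hand. A small sketch with `itertools.product`, where `evaluate` is a hypothetical stand-in for whatever cross-validated score you compute (the grid values are made up too):

```python
import itertools

param_grid = {
    "degree": [1, 2, 3],
    "alpha": [0.01, 0.1, 1.0],
}

def evaluate(degree: int, alpha: float) -> float:
    """Hypothetical stand-in for a cross-validated score (higher is better)."""
    return -abs(degree - 2) - abs(alpha - 0.1)

# One flat loop over every combination, however many hyperparameters there are.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in itertools.product(*param_grid.values())]
best = max(candidates, key=lambda params: evaluate(**params))
print(len(candidates), "combinations tried; best:", best)
```

Adding a third hyperparameter is just another key in `param_grid`; the number of combinations still multiplies, which is why fixing some values or switching to random search becomes attractive.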
hmm
my column looks like this, do I perform power transformation on them to make them normal for polynomial regression
I don't really know when to use it and if there's any downsides to just always using it. Chatgpt says to compare performance before and after using it. But that's just another 'hyperparameter' to tune then.
@agile cobalt
your 'question' being about when to use "power transformation"?
ye
honestly I have never seen the term power transformation before and taking a quick look at wikipedia I don't get it, but for features that typically scale exponentially like population or money (such as the GDP column) you may want to consider using log(), while for things that scale linearly you'll probably want to not use any way too fancy methods
hmm
what's a way to check for growth rate of something
what plot will it reflect in
you don't
if anywhere, it might be reflected in the distribution of the data
but that is something that you should know about the data you are dealing with, not something you'll infer from the data
What if it's all just black box data with no labels
or you can't understand the labels
then you should not be using that data at all?
model interpretability is already bad enough as-is, I cannot recommend using data you don't understand
I see
if your question is about when to use power transformation, imo you should use it when your model assumes normality and your data doesn't follow a normal distribution (how you determine if your data is roughly normal is another question; a QQ plot and the kolmogorov smirnov test are pretty common)
if your model doesn't require/assume normality, then there are less reasons to use it but sometimes it is indeed useful, i feel this is all very context-dependent.
also power transform impacts the interpretability of your model, which might be an issue. but you can always use SHAP to recover some if not all interpretability.
Polynomial regression doesn't assume normality I think. But I was taught that it does
i actually have no idea 🤔 my stats has degraded a lot since leaving uni
what did you study
stats 😂
if you do it via linear least squares, poly regression does indeed assume normality
there's more than one way to find the coefficients of a polynomial
least squares always assumes normally distributed observations, i.i.d.
hmm, I think sklearn does it the least square way, doesn't it?
most likely, but i can't say for sure 😛
That's what I am taught as well
But I also used regularisation, but that doesn't affect much ig?
depends on the kind of regularization
you can usually think of regularized least squares as assuming there is AWGN, and the regularization terms are equivalent to assuming your coefficients are random and come from a special distribution
with l1 and l2, that'd be some weird combo of laplace and gaussian priors
I have 38 features now and with 3 degree that become a lot of features and taking long to run
don't people have tons of features irl?
That would make polynomial regression inefficient
beyond maybe 2 degrees
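To put a number on how fast this blows up, here's a quick count. It assumes a full polynomial expansion with all interaction terms plus the constant term (which, as far as I know, is what sklearn's `PolynomialFeatures` does by default).

```python
from math import comb

def n_polynomial_terms(n_features: int, degree: int) -> int:
    """Number of monomials of total degree <= `degree` in `n_features` variables,
    including the constant term."""
    return comb(n_features + degree, degree)

for degree in (1, 2, 3, 10):
    print(degree, n_polynomial_terms(38, degree))
```

For 38 features that is 780 columns at degree 2 and 10,660 at degree 3, so the slowdown is expected.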
yes, but most people don't use polynomial regression, at least from my past experience
you usually only use poly regression for modestly low degree polynomials
My teacher made it for 10 degrees, but I have tons of features
because it turns out it's a fairly challenging problem
it involves a Vandermonde matrix with a terrible condition number
you can run into issues involving numerical stability, or if direct inversion is impossible, slow convergence
@wooden sail
friendly reminder:
as i said, AWGN
the additive error is normally distributed
Challenge portfolio work
what does that mean?
It’s just a task, it’s my homework
I’m trying to do the challenge task because I wanna try to learn it better but I literally have no clue
you'll have to brush up on your joint and conditional probabilities
in pytorch, does anyone know how to combine a sequence of multidimensional tensors? 65536 4x4x4 tensors that i want to reshape into 8x8x8 tensors
more information is needed. to go from (4, 4, 4) to (8, 8, 8), you'll be combining multiple (4, 4, 4) tensors to create each (8, 8, 8) one. and how they're going to go together matters, or you'll end up with meaningless tensors.
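Just to illustrate one possible arrangement: the sketch below assumes every 8 consecutive (4, 4, 4) blocks form a 2x2x2 grid in row-major order, which may well not match the real data. It uses NumPy to keep the example light; PyTorch's `reshape` and `permute` should allow the same pattern.

```python
import numpy as np

def merge_blocks(blocks: np.ndarray) -> np.ndarray:
    """Merge groups of 8 consecutive (4, 4, 4) blocks into (8, 8, 8) cubes.

    Assumes each group of 8 is ordered as a 2x2x2 grid in row-major order.
    """
    n = blocks.shape[0]
    assert n % 8 == 0 and blocks.shape[1:] == (4, 4, 4)
    grid = blocks.reshape(n // 8, 2, 2, 2, 4, 4, 4)   # axes: (N, i, j, k, a, b, c)
    grid = grid.transpose(0, 1, 4, 2, 5, 3, 6)        # axes: (N, i, a, j, b, k, c)
    return grid.reshape(n // 8, 8, 8, 8)

blocks = np.arange(16 * 64).reshape(16, 4, 4, 4)  # small stand-in for the 65536 blocks
cubes = merge_blocks(blocks)
print(cubes.shape)
```

With 65536 input blocks this would give 8192 cubes. A plain `reshape` without the `transpose` would also produce the right shape, but it would scramble which values sit next to each other, which is the "meaningless tensors" risk mentioned above.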
can interviewer ask me "without checking syntax" write code for something?
what am i expected to generally write without reference? TO give you an idea of role: i am applying for "Sr. data scientist" role at startup
beginner question. what is the bias used for? also is it automatically added or not?
i think u select it
by yourself
I'm guessing the b in y = x*w + b?
bias in neural networks
yeah
usually it is just a feature with value 1 for all records
commit
yeah added automatically but can be turned off
what does added automatically mean
What?
without it, you would always get y=0 for Xs = [0, 0, 0, 0, 0, ..., 0, 0, 0]
some machine learning libraries might add it automatically for you, while others may require for you to add it yourself
usually its wx+b but we can do wx
am i hired, lmao?
nvm
oh
i thought it was just some random value u had to choose
in my project the value of bias didnt matter much 🤔
thanks
hmm I didnt use any library i so thats why i had to select a random value
the bias "feature" always has a value of 1
the bias weight is initialised randomly and learned by the network, just like all other feature weights
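A tiny sketch of what the bias does, with made-up weights. `neuron` is just an illustrative helper, not any library's API.

```python
def neuron(inputs, weights, bias=0.0):
    """Weighted sum plus bias: y = w . x + b."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

weights = [0.4, -0.2, 0.7]
zeros = [0.0, 0.0, 0.0]

print(neuron(zeros, weights))            # without a bias, an all-zero input can only give 0
print(neuron(zeros, weights, bias=0.5))  # the bias shifts the output away from 0

# Same thing expressed as an extra input feature that is always 1,
# whose weight plays the role of the bias.
print(neuron(zeros + [1.0], weights + [0.5]))
```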
idk what is feature
one of the names for your input data
oh
someone once said I am not implementing something, can you explain what it is? i didn't watch the video they suggested yet https://media.discordapp.net/attachments/1086687200516788315/1088500719037976657/image.png?width=1440&height=240
gradient descent
i didnt implement it
but my program still works
98% of the time it guesses it correctly
what else matters?
but it improved?
there are dozens if not hundreds of different ways to measure how well a model is doing
!paste can you show your code?
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.
is an activation function used after the last layer of the network?
idk what that means yet
i was just asking because i dont know the answer
not sure if it counts as actual gradient descent, but I guess that this bit trains it
algorithm.py lines 135 to 145
```py
if product + bias > 0:
    predict_shape = shapes[0]
    if shape != predict_shape:
        weight = addition(weight, image_list, +1)
else:
    predict_shape = shapes[1]
    if shape != predict_shape:
        weight = addition(weight, image_list, -1)
if shape == predict_shape:
    correct_guesses += 1
```
but +1 on their suggestion to use numpy
is it neccesary
would probably run at least 10~100x faster, and if you want to actually work with data science or do anything even hobby level of seriousness you'll 100% need to use numpy and friends
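For what it's worth, that update looks like the classic perceptron rule rather than gradient descent proper. Here's a rough NumPy sketch of the rule, assuming `addition(weight, image_list, ±1)` adds or subtracts the input from the weights (I haven't seen that function, so that part is a guess). The AND-gate data is a toy stand-in for the flattened images.

```python
import numpy as np

def train_perceptron(X: np.ndarray, y: np.ndarray, epochs: int = 100):
    """Perceptron rule: on a mistake, add (or subtract) the input to the weights.

    Labels in `y` must be +1 or -1.
    """
    weights = np.zeros(X.shape[1])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(X, y):
            prediction = 1 if weights @ x + bias > 0 else -1
            if prediction != target:
                weights += target * x   # same idea as addition(weight, image_list, +1 / -1)
                bias += target          # here the bias is learned as well
                mistakes += 1
        if mistakes == 0:   # every sample classified correctly
            break
    return weights, bias

# Toy data (logical AND) standing in for the flattened images.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
weights, bias = train_perceptron(X, y)
predictions = np.where(X @ weights + bias > 0, 1, -1)
print(weights, bias, predictions)
```

The final `X @ weights` line is where NumPy pays off: it scores every sample in one call instead of looping over pixels in Python.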
this part tweaks the weight
its already fast but for my next project I definitely will
how do I learn it properly is there a beginner's book for this
I'd recommend taking a course like Andrew Ng's machine learning introduction on coursera, or at least following something like 3Blue1Brown's videos or https://course.fast.ai
watch it to find out /s
/s ?
sarcasm
ohh
usually you'll do something like
```
1-10 how much of a circle is it
1-10 how much of a rectangle is it
1-10 how much of a triangle is it
```
how do i get these values in my current program?
you don't | you rewrite a lot of things
how do they get those values
actually seriously this time, watch the video to find out
is an activation function used after the last layer of the network?
or just for hidden layers?
to put it simply, it is not simple 🤷
do prioritise school though
Usually after every layer
And not every layer in the same model needs to have the same activation
You will often see hidden layers each having ReLU activation, and the last layer Softmax f.e.
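In plain Python, those two activations look roughly like this (a sketch with made-up numbers, not how a framework implements them):

```python
import math

def relu(values):
    # Hidden-layer activation: negatives become 0, positives pass through.
    return [max(0.0, v) for v in values]

def softmax(values):
    # Output-layer activation for classification.
    # Subtracting the max keeps exp() numerically stable; the result sums to 1.
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

hidden = relu([-1.5, 0.3, 2.0])
probs = softmax([2.0, 1.0, 0.1])
print(hidden)
print(probs, sum(probs))
```

The softmax output can be read as one probability per class, which is why it tends to sit on the last layer while ReLU is used in between.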
and when i use random data for a network, is there a way of seeing the predictions vs. the real data?
@mild dirge
Hi! someone can help me with this error in a Transformer XL layer on tensorflow:
TypeError: tf__call() missing 1 required positional argument: 'relative_position_encoding'
Here's the layer
vocab_size=140,
num_layers=6,
hidden_size=256,
num_attention_heads=30,
head_size=5,
inner_size=30,
dropout_rate=0.2,
attention_dropout_rate=0.2,
initializer="glorot_uniform",
two_stream=True,
tie_attention_biases=True,
memory_length=30,
reuse_length=30,
inner_activation='relu'
)(embedding_1)```
I suppose you probably have to pass a tuple (embedding_1, relative_positional_encoding) instead of just (embedding_1)
I'll try it, thank u bro
I have a pandas dataframe with a time column, a power column and a frequency column. The power and frequency are measured at 0.05s intervals. The frequency that is measured is based on a target frequency, where the frequency will jump to a value and then be held for 60 seconds. how can I split the dataframe based on these frequency jumps?
What do you mean by split?
I want to detect when the frequency changes, i.e. f1 and take a slice from index[0] to index[f1], then from index[f1] to index[f2] and so on
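One way this could be done, as a sketch: flag the rows where the frequency differs from the previous row by more than some threshold, then use a cumulative sum of those flags as a segment id. The column names, the threshold and the tiny example frame are all assumptions.

```python
import pandas as pd

def split_on_jumps(df: pd.DataFrame, column: str = "frequency", threshold: float = 0.5):
    """Split `df` into consecutive segments wherever `column` jumps by more than `threshold`."""
    jumps = df[column].diff().abs() > threshold   # True on the first row after a jump
    segment_id = jumps.cumsum()                   # 0, 0, ..., 1, 1, ..., 2, ...
    return [segment for _, segment in df.groupby(segment_id)]

# Tiny made-up example: two holds at 50 Hz and 60 Hz with a little noise.
df = pd.DataFrame({
    "time": [0.00, 0.05, 0.10, 0.15, 0.20, 0.25],
    "power": [1.0, 1.1, 1.0, 2.0, 2.1, 2.0],
    "frequency": [50.00, 50.01, 49.99, 60.00, 60.02, 59.98],
})
segments = split_on_jumps(df)
for segment in segments:
    print(len(segment), "rows starting at t =", segment["time"].iloc[0])
```

The threshold needs to be larger than the measurement noise but smaller than the smallest real jump, so it is worth plotting `df["frequency"].diff()` first to pick it.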
import plotly.graph_objects as go

lat = ["22.290222"]
lon = ["73.167065"]
fig = go.Figure(go.Scattermapbox(
    lat=lat,
    lon=lon,
    mode="markers",
    marker=go.scattermapbox.Marker(
        size=10,
        color='red'
    ),
    text=['Location'],
))
fig.update_mapboxes(style="open-street-map")
fig.write_html("/tmp/temp.html")
anyone knows how i would change the shape of the marker to that of a bus?
Hi I'm just trying to create a time-series neural network. I don't know what format to put time into my data? Do I convert it into seconds and input my data as a 3-D tensor, samples, seconds, features or do I keep it in date and time format
How can I remove the dependent variable from my list of Xs? Here is what I got so far
paste(feature.names,
collapse = ' + ')))```
Type ~ RI + Na + Mg + Al + Si + K + Ca + Ba + Fe + Type```
Nvm I figured it out
Hello
Can you help me?
I'm keeping on hitting this error:
~\AppData\Local\Temp\ipykernel_7264\806691498.py in <module>
27 incCount5=0
28 incCount_reset=0
---> 29 start_time=time.time()
30
31 net = cv2.dnn.readNetFromDarknet(model_config,model_weights)
AttributeError: 'float' object has no attribute 'time'```
What did i do wrong? What should i do?
it looks like you imported the time module, but somewhere in your code you created a variable called time and assigned a float to it. this essentially destroyed your imported time module
call the variable or the module a different name
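A minimal reproduction of that kind of shadowing, in case it's useful. The variable names here are made up; the original notebook code isn't shown in full.

```python
import time

start = time.time()          # fine: `time` is still the module

time = time.time() - start   # oops: the name `time` now refers to a float

try:
    time.time()              # a float has no .time attribute
except AttributeError as exc:
    print(exc)

# Fix: give the variable its own name and leave the module name alone.
import time                  # re-importing rebinds the name to the module
elapsed = time.time() - start
print(type(elapsed).__name__)
```

In a notebook the bad assignment can live in a cell that was run earlier, so restarting the kernel after renaming the variable is the safest way to clear it.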
Yeah, we're solving it in their help channel
Hello, i need help again
This time about opencv dnn error
error Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_7264\3346177036.py in <module>
87 #classIds, confs, bbox = net.detect(img,confThreshold=thres)
88 print(classIds,bbox)
---> 89 blob=cv2.dnn.blobFromImage(img,1/255,(wght_hght_target,wght_hght_target),[0,0,0,0],1,crop=False)
90 net.setInput(blob)
91 LayerNames=net.getLayerNames()
error: OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\resize.cpp:4062: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'```
I am currently using yolov3-320.cfg as config and yolov3-320.weights as weight
What is happening here?
You should usually not keep this kind of time data in date and time format. Converting to seconds since an epoch is fine; or to milliseconds, microseconds, or nanoseconds. I would only recommend date and time formats if you need to know the original timezone; but I haven't been in a situation where I needed to worry about that much, so others may have a more informed opinion.
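A small sketch of that conversion with the standard library. The timestamp format and the choice to treat naive timestamps as UTC are assumptions for the example.

```python
from datetime import datetime, timezone

def to_epoch_seconds(stamp: str) -> float:
    """Parse an ISO-8601 timestamp and return seconds since the Unix epoch.

    Timestamps without an offset are assumed to be UTC (an assumption for this sketch).
    """
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()

stamps = ["2023-03-23 12:00:00", "2023-03-23 12:00:05", "2023-03-23 12:01:00"]
seconds = [to_epoch_seconds(s) for s in stamps]
relative = [s - seconds[0] for s in seconds]   # seconds since the first sample
print(relative)
```

Subtracting the first sample keeps the numbers small, which tends to be friendlier for a neural network than raw epoch values in the billions.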
Has anyone tried deploying an NLP model that uses nltk wordnet or stopwords to AWS Lambda?
your question probably has more to do with Lambda than nltk. but you should always ask complete questions that someone who knows the answer can start answering right away.
ok my apologies.
I have built a lambda function that uses NLTK to preprocess text before it is used in my classification model. The function needs NLTK's stopwords, punkt and wordnet libraries to work. I am having issues with the lambda function being able to download the libraries upon execution. Everything works fine locally, but when deployed to AWS it doesn't download the files to the right directory. Has anyone come across this issue before?
I want to display a CNN layer's weights in matplotlib; the shape is (3,3,3,32). Can I do it?
usually you would avoid doing anything particularly heavy in serverless environments
if you can, try modifying the install location so that you can just download locally and include it as part of your source code so that it can just read from disk without having to download anything later
I have tried to change the install location to a /tmp/ directory, but the function doesn't want to search that directory for the libraries.
should I run some random apk from a discord server 🤔
You can't recruit for paid opportunities or business projects here, so please remove your message.
!warn 807551900417130537 We've asked you before not to recruit for projects like this. This is your last warning about this, so please contact @sonic vapor if you need any further clarification about what is or isn't appropriate.
:incoming_envelope: :ok_hand: applied warning to @shell sequoia.
how tough is it to like make some sort of ai . that simply has to choose between apis to use for results on the basis of text provided
"some sort of ai" can mean a lot of things, including "an if statement" :p
depends what kind of accuracy you want, and what kind of task you have in detail.
A bit late on this news, but this is something nice.
oh that's fantastic
heh, I saw this suggested as a possible way to improve gpt4's capabilities, like, yesterday
since it's not good at math, but pretty good at delegating to tools
If it could read your input files / directories it could be a full DS tool.
yeah this makes a lot more sense. basically use it to translate your natural language queries to formal mathematical ones and back
(And remote databases that you point it at)
ValueError: num_samples should be a positive integer value, but got num_samples=0```
I'm trying to run something called Lora_SVC, and it gives me this error when I try to train.
Hi! I need some help please. I'm passing the argument relative_position_encoding to a Transformer XL layer (in tensorflow) like this:
relative_position_encoding=(None, 300, 256)
But I get the following error:
Dimension value must be integer or None or have an index method, got value 'TensorShape([])' with type '<class 'tensorflow.python.framework.tensor_shape.TensorShape'>'
So what's wrong?
yes. it is a pain. you should try to build a custom container image with the libraries already inside.

Guys, when preprocessing text data for training a Transformer model, should I add a <Start-Of-Sentence> token to my target sentence?
So at the first iteration in a sentence, the model must predict the <SOS> token before predicting any actual word?
It feels a bit weird, since the <SOS> token is inserted by default during inference...
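As far as I know, the usual setup (teacher forcing) puts the start token only in the decoder input, never in the labels, so the model is never asked to predict it. A sketch with made-up token names; `<EOS>` is an assumption, your vocabulary may use a different end marker.

```python
SOS, EOS = "<SOS>", "<EOS>"

def make_decoder_pair(target_tokens):
    """Build (decoder_input, decoder_target) for teacher forcing.

    The decoder sees <SOS> + target and is trained to predict target + <EOS>,
    i.e. the same sequence shifted left by one position.
    """
    decoder_input = [SOS] + target_tokens
    decoder_target = target_tokens + [EOS]
    return decoder_input, decoder_target

dec_in, dec_out = make_decoder_pair(["like", "this", ","])
for seen, expected in zip(dec_in, dec_out):
    print(f"input {seen!r:8} -> predict {expected!r}")
```

At inference time you then feed `<SOS>` yourself and let the model generate until it emits the end token, which matches what happens during training.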
Uh... I suppose when the Transformer is implemented correctly, vanishing gradients isn't that much of a problem?
I have a log file that may or may not contain error lines. I want to write code that can understand each error line and print just one unique example of each, with no duplicates
Example
Input file :
Leakage value 1.2 for circuit 1 is greater than the standard
1)Leakage value 0.9 for circuit 2 is greater than standard
2)Capacitance is huge in circuit 3
3)Capacitance is huge in circuit 4
4)Capacitance is huge in circuit 5
5)Capacitance is huge in circuit 6
6)High delay in circuit 7
Output: must be the unique errors, ignoring instance information like the circuit number or a specific value
-
Leakage value 1.2 for circuit 1 is greater than the standard
-
Capacitance is huge in circuit 3
-
High delay in circuit 7
The log file may contain more than 10000 errors, but there might be just 10 unique errors, as shown in the output.
Can anyone suggest a library, or a place to start from? Is it possible to make code clever enough to determine these things?
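A possible starting point using only the standard library: mask the numbers so instance details don't matter, then treat lines whose masked text is nearly identical as the same error. The `<NUM>` placeholder, the "N)" prefix handling and the 0.9 similarity threshold are all assumptions to tune against a real log.

```python
import re
from difflib import SequenceMatcher

def strip_counter(line: str) -> str:
    """Remove a leading 'N)' counter such as '3)'."""
    return re.sub(r"^\s*\d+\)\s*", "", line.strip())

def normalise(line: str) -> str:
    """Mask numbers so values and circuit ids don't make lines look different."""
    return re.sub(r"\d+(?:\.\d+)?", "<NUM>", strip_counter(line))

def unique_errors(lines, threshold: float = 0.9):
    """Keep the first line of each group of near-identical (masked) messages."""
    kept = []  # list of (template, cleaned original line)
    for line in lines:
        template = normalise(line)
        if not template:
            continue
        if any(SequenceMatcher(None, template, seen).ratio() >= threshold for seen, _ in kept):
            continue
        kept.append((template, strip_counter(line)))
    return [original for _, original in kept]

log = [
    "Leakage value 1.2 for circuit 1 is greater than the standard",
    "1)Leakage value 0.9 for circuit 2 is greater than standard",
    "2)Capacitance is huge in circuit 3",
    "3)Capacitance is huge in circuit 4",
    "4)Capacitance is huge in circuit 5",
    "5)Capacitance is huge in circuit 6",
    "6)High delay in circuit 7",
]
for error in unique_errors(log):
    print("-", error)
```

The fuzzy comparison is what lets "than the standard" and "than standard" count as one error. It compares each line against every template kept so far, so it stays cheap as long as the number of unique errors is small.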
Can anyone suggest a vendor that offers VMs with GPUs like V100 / A10 / A100 at a decent price? This is for my personal learning on training deep learning models on public data; no privacy / enterprise features necessary.
check this resource out https://fullstackdeeplearning.com/cloud-gpus/
courtesy of the FSDL folks
🥞
thanks josh tobin and charles frye and fam
Thanks, did not know someone put compilation of vendors
Indeed...when I use the correct hyperparameters, things tend to get better...not perfect and far from chatGPT, but still 
you can thank the Full Stack DL folks

their online course is also really good
First of all, hi. I need some information about what role Python plays in data science.
How do I become a data scientist using the Python language?
maximum knowledge required in DS&A (PYTHON) for Data Sc.?
wdym by "maximum knowledge"?
I had an interview yesterday, went well.
One anomalous question was following:
HIM: why F1 is HM?
ME: To punish score even if one of precision or recall is low even when other might be high.
HIM: so why cant we use F1=precision X recall
Now, i wasn't able to come up with an explanation; he told me it had something to do with the harmonic motion that we learn in high school.
Does anyone know why F1 can't be precision X recall?
It's the harmonic mean between precision and recall
In mathematics, the harmonic mean is one of several kinds of average, and in particular, one of the Pythagorean means. It is sometimes appropriate for situations when the average rate is desired.
The harmonic mean can be expressed as the reciprocal of the arithmetic mean of the reciprocals of the given set of observations. As a simple example, t...
And this is probably what they meant with that motion:
@mint palm
Here you can see the resulting f1 score (yellow is 1, dark blue is 0) for the harmonic mean, and simply multiplying them
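A few numbers to go with the plot, as a quick sketch. Both functions punish one low value, but only the harmonic mean behaves like an average: when precision and recall are equal, it returns that same value, while the product squares it.

```python
def harmonic_mean(precision: float, recall: float) -> float:
    """F1 score: 2PR / (P + R), defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

for p, r in [(0.5, 0.5), (0.9, 0.9), (1.0, 0.1), (1.0, 1.0)]:
    print(f"P={p:.1f} R={r:.1f}  F1={harmonic_mean(p, r):.3f}  product={p * r:.3f}")
```

So a model with 50% precision and 50% recall gets F1 = 0.5, which stays on the same scale as the inputs; the plain product would report 0.25 for it.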
Hello, I am using requests-html for web scraping, but when I encounter a <div class=""> no children are returned, even though there is an img tag when inspecting. How do I get that img tag's src data?
hey,does anyone use simpy in simulations???
Hey guys, I'm trying to make a vectorizer model. The model has 4 fully connected layers, receives features from an image and then generates a vector. It works fine for generating vectors with dimensions (Batch, vector_size).
However, I want to generate 2 dimensional vectors to see how things will work and also to be able to plot the model performance and visualize how things are going(like it's done with PCA and tSNE), but I don't know how to do this without getting an output where the first vector number will be exactly equal to the second.
My code looks like this:
x = self.neuronA(x)
x = self.Relu(x)
x = self.neuronB(x)
x = self.Relu(x)
x = self.neuronC(x)
x = self.Relu(x)
features_embedding = self.neuronD(x)
return features_embedding
I want to make something similar to Pytorch/Keras embedding layer:
test = torch.randint(0, 10, (1, 5)) # (Batch, n_features)
embed = nn.Embedding(10, 10) # 10 possible indices, 10 embedding dimensions
out = embed(test) # (Batch, n_features, embedding_dimension) = (1, 5, 10)
Any tip or suggestion?
(ChatGPT suggested that I simply reshape my output, but this doesn't seem to make sense mathematically)
Well, reshaping is the first thing I thought of too. It will generate 50 output values. You can interpret it as a 1d vector, or 5x10 f.e.
So what do you want different from a 2d output than from a 1d with same number of elements?
And with PCA you would normally get a 1d vector output with 2 elements. Such that you can plot it as a 2d x-y graph
I was thinking about something like a spatial space, where the first element, the 5 there, would be the X coordinate (like in a space between -1 and 1, where the closer the value is to -1, the more it's related to the idea A, and the closer it is to 1, the more it's related to non-A).
The same would be for the 10, the Y coordinate.
I think the correct term would be "spatial vector representation", or something like that...
So an embedding where you want similar inputs to be close in the output space as well
Yes, that's what I want. Should I simply apply reshape to my 1d vector to get 2 elements?
Yes, you could just make the output a 1d vector of two elements

Oh, ok. Now that I think about it...it's a bit like how we do to create images with linear layers... We get a 1-d output, and simply apply reshape to get a 2-d or 3-d array
You want the output to be an image?
Apply dimensionality reduction to a dimensionality reduction?
No, it's just a comparison
Ok then... That was easier than I expected. Thanks!
could you type out the expected output by hand in form of a dataframe please?
and can i assume column Two prefix will match One? i.e. if Two is 'A-123' then One must be A?
okay perfect, gimme a moment!
okay, long story short is that there is just no out of the box way to do this merge natively using just the toolbox pandas provides
doing any naive merge and then filling in the blanks seems to be making it harder on yourself.
i first assume you know how to truncate Two from df_data into the string before ;, i will call this truncated Two
imo, your best bet would be then to compute the correct join key in your df_data first, i.e. first check if the corresponding truncated Two exists in df_keys, if it does, great, use that truncated Two as is, if not then the join key would be None/NaN (since your wildcard join is indicated by None/NaN)
(edit: by join key, i meant one part of the actual join condition you will be using, namely how you match up the Twos from both dataframe, since One is already known to be equal from both dataframe, we pay no extra attention to it)
all together this would be
import numpy as np
import pandas as pd

# truncate Two from df_data to the string before ';' (the "truncated Two")
df_data['truncated_two'] = df_data['Two'].str.split(';').str[0]
df_data['joinable_two'] = np.where(
    df_data['truncated_two'].isin(df_keys['Two']),  # does the truncated `Two` exist in `df_keys`?
    df_data['truncated_two'],  # if it does, great, use that truncated `Two` as is
    None,  # if not, the join key is `None`/`NaN` (the wildcard join is indicated by `None`/`NaN`)
)
pd.merge(
    df_keys,
    df_data,
    left_on=['One', 'Two'],
    right_on=['One', 'joinable_two'],
    how='right',
)[['One', 'Two_x', 'Target', 'Total']]
If you play around with it I think it becomes immediately apparent why harmonic mean is chosen. Sometimes math is about crafting a function that behaves how you want / looks right (which graphics programmers do for their job). https://www.desmos.com/calculator/sdldtw1zy7
there's also the alternative of building a multiindex instead of using merge() / join() but I probably shouldn't really recommend it ```py
import pandas as pd
...
df_data["Two"] = df_data["Two"].str.split(";", n=1).str[0]
mapping = df_keys.set_index(["One", "Two"])["Target"]
keys_to_map = pd.MultiIndex.from_frame(df_data[["One", "Two"]])
values = keys_to_map.map(mapping)
result = df_data.assign(Target=values.fillna(0))
print(result)
```
This is the last cell of a project I've been working on in jupyter notebook. I added it specifically because a blogger said it only requires pandas/numpy, and I have no other visualizations in the notebook. Upon running it in a virtual environment it turns out the blogger lied to me and it requires matplotlib.
My question is, do you fine folks think it is worth including matplotlib just for this one, rinky dink visualization, or should I just remove it altogether because the values are already discussed in the cells prior?
You need matplotlib for background_gradient? huh, weird
oh, I guess it's for the colormap.
I'll try it without the cmap.
pandas/io/formats/style.py lines 3930 to 3936
```py
with _mpl(Styler.background_gradient) as (plt, mpl):
    smin = np.nanmin(gmap) if vmin is None else vmin
    smax = np.nanmax(gmap) if vmax is None else vmax
    rng = smax - smin
    # extend lower / upper bounds, compresses color range
    norm = mpl.colors.Normalize(smin - (rng * low), smax + (rng * high))
    from pandas.plotting._matplotlib.compat import mpl_ge_3_6_0
```
IMO, matplotlib is so common you might as well install it. Your choice, though, it's not like the gradient is even very noticeable here on a 2x2 table.
I appreciate the input, as well as others'!
Is anyone interested in doing leetcode questions together, starting from easy level?
We can each use our own approach and then have a discussion on the concepts!
!rule 6
What are some ways I can implement a bot for my game using AI (training and usage)? The background is that each client is in control of an EV3 Mindstorms robot, but the robot can also run on AI if there are no players, doing stuff like moving around in the real world and shooting other robots. The data the robots have is the position of the other robots, which I get from the ArUco markers in OpenCV from the camera pointing down on all of them
I am limited to one main phone camera (third person), which points down on the ArUco marker on top of each robot. Each robot also has a camera in front (first person).
I would suggest to downscope your problem if that's your first foray into that area
Like maybe starting simple with a small 2d simulation
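One way to start that small, before any training, is a scripted baseline in a 2D simulation. This is only a sketch under assumptions: positions are taken to be (x, y) tuples in a shared frame (for example from the marker detection), and `nearest_target` and `heading_to` are hypothetical helper names, not part of any library.

```python
import math


def nearest_target(me, others):
    """Return the position in `others` closest to `me` (Euclidean distance)."""
    return min(others, key=lambda pos: math.dist(me, pos))


def heading_to(me, target):
    """Angle in radians from `me` to `target`, measured from the +x axis."""
    return math.atan2(target[1] - me[1], target[0] - me[0])


# Toy positions standing in for the marker-derived coordinates
me = (0.0, 0.0)
others = [(3.0, 4.0), (1.0, 0.0)]

target = nearest_target(me, others)
print(target, heading_to(me, target))
```

A learned policy could later replace these two functions while the rest of the simulation stays the same.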
is chatgpt good at writing code
sometimes it produces correct results, and sometimes it produces mostly-correct results. but if it produces mostly-correct results, and you have no idea which part to fix, then it doesn't really help.
A lot of people overrate its abilities.
How can i solve this error:
Dimension value must be integer or None or have an index method, got value 'TensorShape([])' with type '<class 'tensorflow.python.framework.tensor_shape.TensorShape'>'
On this: relative_position_encoding=(None, 300, None)
is how im passing the argument to a Transformer XL layer on tensorflow
Probably should just have this as a bot command lmao
I'll just make a selfbot to reply that every time someone says chatgpt /s
Ask chatgpt to code it
I asked chatgpt to code a discord bot thing a few weeks ago, and it was surprisingly good, but it had some extra lines that were crap. I think its training data included non-overlapping versions of dpy
I think it's well suited to the type of task discord bots do
GPT-4 can do independent reasoning. It's not blindly fetching code.
For example
I was just asking if it could help me design a version of Simpletron that ran on a 640 x 480 screen
Here is a fragment of the answer it came up with
```
First, let's lay out a plan for how the display and the memory should be organized for the 640x480 resolution:
Display: The Simpletron display will have a resolution of 640x480 pixels. Each pixel can be represented by a single memory address in Simpletron's memory. To store the entire display, we need 640x480 = 307,200 memory addresses. You can choose to use 16-bit or 32-bit memory addresses, depending on your requirements.
Memory: Simpletron typically has a fixed amount of memory, for example, 1000 memory addresses. However, for our design, we need to accommodate 307,200 memory addresses for the display alone. You can choose to expand Simpletron's memory to fit the display requirements or use a separate memory space for the display.
Now, let's outline some basic instructions for the Simpletron display:
SET_PIXEL: This instruction will set a pixel at the specified X and Y coordinates to a specific color. The operand will contain the X, Y, and color information.
CLEAR_SCREEN: This instruction will clear the entire screen by setting all pixels to a specific color. The operand will contain the color information.
DRAW_LINE: This instruction will draw a line between two points with a specific color. The operand will contain the X1, Y1, X2, Y2, and color information.
DRAW_RECT: This instruction will draw a rectangle with a specified color at the given X, Y coordinates, and dimensions. The operand will contain the X, Y, width, height, and color information.
```
This is something I asked about that is not present anywhere (or not anywhere that I know of)
And yet, look how intricate this is
Sure, it's intricate. But it says you can choose to use 16-bit memory addresses when you have 307,200 things to store.
It's not 307,200 things to store
It's 307,200 memory addresses
To store the entire display, we need 640x480 = 307,200 memory addresses.
The 307,200 memory addresses is for simulating the display
Each pixel can be represented by a single memory address in Simpletron's memory.
So you need 307,200 bytes. But 16-bit addresses can only index 65,536 separate items.
ChatGPT is clearly confused about arithmetic here.
Um...
It's you who is confused
Yes, you need 307,200 memory addresses for the memory itself
But each memory address can hold a 16-bit value
That only gets you to 131,072 bytes.
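The numbers in this exchange can be checked directly. This is just arithmetic on the figures quoted above (one cell per pixel, 16-bit addresses, 16-bit cells); the variable names are only for illustration.

```python
pixels = 640 * 480                               # one memory cell per pixel
addressable_16bit = 2 ** 16                      # distinct addresses a 16-bit address can name
bytes_if_16bit_cells = addressable_16bit * 2     # 65,536 cells holding 2 bytes each
address_bits_needed = (pixels - 1).bit_length()  # bits needed to index cells 0 .. pixels - 1

print(pixels, addressable_16bit, bytes_if_16bit_cells, address_bits_needed)
# → 307200 65536 131072 19
```

So a 16-bit address space falls short of 307,200 cells whether each cell holds 8 or 16 bits; indexing every pixel needs at least 19 address bits.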
Q: What is the math for the display again? Can you please clarify? If each memory address stores an 8-bit address, how come the total reserved is 307,200 bytes?
A: Apologies for the confusion earlier. You are correct that if each memory address stores an 8-bit value, the calculation for the total memory reserved for the display would be different. Let me clarify the math for the display:
For a 640x480 resolution display, there are a total of 640 x 480 = 307,200 pixels. If we use 8 bits per pixel (256 colors), each pixel will require 1 byte of memory.
Now, if each memory address stores an 8-bit value (1 byte), then we would need one memory address per pixel. Therefore, the total memory required for the display would be:
307,200 pixels * 1 byte per pixel = 307,200 bytes
So, in this case, the total memory reserved for the display would still be 307,200 bytes (or 307.2 KB), as each pixel requires 1 byte of memory, and there are 307,200 pixels in a 640x480 resolution display.
Please note that this calculation assumes an 8-bit color depth. If you use a different color depth, such as 16-bit (65,536 colors) or 24-bit (16.7 million colors), the memory requirements for the display would change accordingly.
@queen cradle How's that?
It doesn't acknowledge that it made a mistake in claiming that you could use a 16-bit address space. Though to be fair, you didn't specifically ask it about that. Also to be fair, it wouldn't matter to me if you did. ChatGPT isn't good with arithmetic; there are plenty of examples of this, and yours is just one more.
I think I've said all I have to say here.
@queen cradle The mistake was on me though, not on GPT
Because it originally worded as such:
Display: The Simpletron display will have a resolution of 640x480 pixels. Each pixel can be represented by a single memory address in Simpletron's memory. To store the entire display, we need 640x480 = 307,200 memory addresses. You can choose to use 16-bit or 32-bit memory addresses, depending on your requirements.
I suppose the last paragraph could be reworded to add: "Please note my calculation assumes 8-bit color depth. If you choose a different resolution, your requirements will change."
Okay.
Hi!, how can i solve this error:
Dimension value must be integer or None or have an index method, got value 'TensorShape([])' with type '<class 'tensorflow.python.framework.tensor_shape.TensorShape'>'
On this: relative_position_encoding=(None, 300, None)
is how im passing the argument to a Transformer XL layer on tensorflow
Hi! Has anybody here worked with YOLOv8? I'm trying to save the boxes in xywh format using the save_txt=True CLI argument, but the output is currently in what I assume to be normalized form
2 0.839807 0.165415 0.12882 0.0654616
24 0.850087 0.551329 0.193253 0.089764
2 0.840522 0.179473 0.128972 0.088213
0 0.535866 0.103385 0.0689186 0.0563577
2 0.898594 0.135384 0.202476 0.0797681
0 0.364594 0.115743 0.08385 0.058203
2 0.957544 0.171258 0.0844107 0.0878528
2 0.0187968 0.179325 0.0375859 0.107676
2 0.935403 0.13661 0.128447 0.0786718
0 0.80963 0.272964 0.101042 0.200643
0 0.471424 0.116067 0.212462 0.208825
0 0.686469 0.351023 0.275836 0.676815
0 0.310915 0.28648 0.20025 0.400011
0 0.245834 0.285782 0.200336 0.39382
0 0.533272 0.447903 0.397006 0.667959
0 0.090301 0.226236 0.180229 0.413001
36 0.442348 0.725502 0.411456 0.163376
36 0.642305 0.625472 0.274271 0.159934
Unless some of you know how to convert this to the xywh format that i need
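If those lines really are `class x_center y_center width height` with coordinates divided by the image size (an assumption that fits the values above but is not confirmed in this thread), converting back to pixels is a multiplication by the image width and height. A minimal sketch; the function names are hypothetical and the 640x480 image size is only an example, so use your own image dimensions.

```python
def denormalize_xywh(line, img_w, img_h):
    """Convert one 'cls xc yc w h' line from normalized units to pixels."""
    cls, xc, yc, w, h = line.split()
    return (
        int(cls),
        float(xc) * img_w,  # x values scale with image width
        float(yc) * img_h,  # y values scale with image height
        float(w) * img_w,
        float(h) * img_h,
    )


def center_to_top_left(xc, yc, w, h):
    """Convert a center-based box to (x_min, y_min, w, h)."""
    return xc - w / 2, yc - h / 2, w, h


cls, xc, yc, w, h = denormalize_xywh("2 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(cls, xc, yc, w, h)
print(center_to_top_left(xc, yc, w, h))
```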
I know it's not blindly fetching code. That doesn't mean that it's infallible.
Yeah I managed to figure it out. I did create a docker image with the required packages. The issue was that upon execution, the function will download the stop words and other NLTK libraries needed for preprocessing. It tried downloading the files to directories that can’t be modified for some reason. So I had to have it download to a temp directory and manually point the NLTK function to look in that particular file. It was a pain
you can have the code to do that in the Dockerfile, like RUN python -c "import nltk; nltk.download('punkt')"
All this...and it still can't help me make decent GANs 
Or make a Diffusion model
What's the point of this part in a pytorch dataset? isn't index just always an integer?
Also...do diffusion models work with audio data, for audio generation? 
Honestly, I've never used that, but I guess it's because, for a batch of size N, the dataloader will call __getitem__() N times
So, if your batch has size 8, the dataloader will do something like
batch = []
for i in range(8):
    batch.append(dataset.__getitem__(i))
return batch
that would be my guess as well
for example in numpy and pandas, indexing with a list yields different results from indexing with another numpy array
and you may compute indices using other pytorch functions
I've been writing this entire dataset for pytorch
import glob
import os
import random

from torch.utils.data import Dataset


class VegetableDataset(Dataset):
    def __init__(self, dataset_path, nr_images_per_class=None, transform=None):
        """
        :param dataset_path: Path to the dataset containing all images
        :param nr_images_per_class: The number of images per class that are loaded
        :param transform: transform to be applied to samples
        """
        # Initialize some instance variables
        self.dataset_path = dataset_path
        self.nr_images_per_class = nr_images_per_class
        self.transform = transform
        # Compose a dict with a list of paths for each image class
        image_path_dict = {}
        for class_path in glob.glob(os.path.join(dataset_path, '*')):
            class_image_paths = list(glob.glob(os.path.join(class_path, '*')))
            random.shuffle(class_image_paths)
            if nr_images_per_class is not None:
                class_image_paths = class_image_paths[:nr_images_per_class]
            class_name = os.path.basename(os.path.normpath(class_path))
            image_path_dict[class_name] = class_image_paths
        # Put the images of the dict into a single list together with the labels
        self.paths_and_labels = []
        for idx, class_name in enumerate(sorted(image_path_dict.keys())):
            for image_path in image_path_dict[class_name]:
                self.paths_and_labels.append((image_path, idx))
        # Shuffle this list to randomize the order of images fed to
        random.shuffle(self.paths_and_labels)

    def __len__(self):
        return len(self.paths_and_labels)

    def __getitem__(self, idx):
        image_path, label = self.paths_and_labels[idx]
        image = ...
Turns out, I can just use this
from torchvision.datasets import ImageFolder
dataset = ImageFolder('vegetable_data/')
Does anyone know if I can make it so it only grabs x nr of images per class, instead of all of them?
you'd then use a dataloader
imagefolder does not load the contents to memory
check out this https://pytorch.org/tutorials/beginner/basics/data_tutorial.html and this https://debuggercafe.com/pytorch-imagefolder-for-training-cnn-models/ for some examples on dataloaders with imagefolder
you can tell the dataloader to load some amount (batch size) of images each time, selected at random from a source of images (imagefolder)
these dataloaders usually include augmentation capabilities btw. tensorflow has something similar as well
Yeah, but I want to limit the "entire dataset" to only contain x nr of images per class. Not just adjust the batch size when loading the images in
I want to train multiple cnns and then combine the results using some election rules, so I want each one to be trained on some random set of images
But I already wrote the custom dataset, I'll just continue that so I can personalize it anyways
i'm pretty sure there should be some parameter for that, but i don't use pytorch so i don't know which one. the rule of thumb is that, if it seems like a common enough problem, it already has a solution 😛
you use tensorflow, then? or jax?
i use jax, but never for this sort of stuff. i usually generate my own data synthetically and rarely work on measured data
i did do some tensorflow courses at some point but i've never used it for anything myself other than the usual mnist, fashion-mnist, hand signs, etc that everyone does while learning
off the top of my head, a solution could be to make a list of dataloaders per class, but that only makes sense if you first split the data into directories per class
seems you can write a sampler function for dataloader and pass that as an arg too https://pytorch.org/docs/stable/data.html#torch.utils.data.Sampler but yeah, since you already made your tool 😛
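For the "x images per class" question, one option (a sketch, not something confirmed in this thread) is to pick the indices per class yourself and wrap the dataset in `torch.utils.data.Subset`. The helper below is plain Python with a hypothetical name; it assumes you have a list of integer labels, such as the `targets` attribute of an `ImageFolder`.

```python
import random
from collections import defaultdict


def sample_indices_per_class(targets, n_per_class, seed=None):
    """Pick up to n_per_class random dataset indices for every label in targets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(targets):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Classes with fewer than n_per_class images contribute all they have
        chosen.extend(rng.sample(indices, min(n_per_class, len(indices))))
    return chosen


# Toy labels standing in for `dataset.targets`
targets = [0, 0, 0, 1, 1, 2]
indices = sample_indices_per_class(targets, n_per_class=2, seed=0)
print(indices)

# With torch installed, the indices could then be used roughly like this:
#   subset = torch.utils.data.Subset(dataset, indices)
```

Calling it with a different seed per classifier would give each CNN its own random subset, which is what the ensemble idea above needs.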
Yeah I suppose it doesn't even matter that the classes are perfectly balanced, I could just iterate through a set nr of batches per classifier. But it's good to know that there is already stuff out there for image datasets.
is Jax more explicit than pytorch when it comes to adjusting the weights? because I find loss.backward() and optim.step() to be weird and implicit.
as explicit as you like
you can compute the gradients and do whatever you like with them before updating the parameters
jax by default is just numpy with jit and autodiff
there's also the optax module that works much like pytorch and tf. you tell it which optimizer you like and it handles the rest
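To show what "as explicit as you like" can look like, here is a minimal sketch of a hand-written gradient step in plain JAX. The linear model, the data, the learning rate and the names `loss_fn` and `sgd_step` are all made up for the example; only `jax.grad` and `jax.numpy` come from the library.

```python
import jax
import jax.numpy as jnp


def loss_fn(params, x, y):
    """Mean squared error of a toy linear model y ≈ w * x + b."""
    pred = params["w"] * x + params["b"]
    return jnp.mean((pred - y) ** 2)


def sgd_step(params, x, y, lr=0.1):
    """One explicit update: compute gradients, then apply them by hand."""
    grads = jax.grad(loss_fn)(params, x, y)  # dict with the same keys as params
    # grads are ordinary arrays, so they could be inspected or clipped here
    return {name: params[name] - lr * grads[name] for name in params}


x = jnp.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0  # data generated with w = 2, b = 1
params = {"w": jnp.array(0.0), "b": jnp.array(0.0)}

for _ in range(500):
    params = sgd_step(params, x, y)

print(params)
```

Nothing is updated behind the scenes here: the new parameters are whatever `sgd_step` returns.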
is it at least as fast as pytorch?
should be comparable
hmm, maybe I'll try it for my next project that uses neural networks
that's in general a terrible idea, i think
why