#data-science-and-ml


steady basalt
#

If it’s not good enough the gpu isn’t so cloud is best

thorn bobcat
#

i tried it..

#

it's not the best, apparently.

hidden current
#

hey, I'm currently learning and stuck on gradient descent local minima/maxima. I have the base code, a cost function and the domains, although I can't figure out how to localise it, and I can't find anything on Google that I could implement. Does someone mind helping?

austere swift
#

you can use it on cuda 11

gaunt anvil
#

huh interesting

#

ill look at it ty

compact oriole
wheat hornet
#

I think I have a table manipulation problem down in #help-candy I don't know what to call this problem

versed gulch
#

Hi, does anyone know how to append another value to the same key, im getting this error?
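
The error itself is not shown, so this is only a guess at the situation: a minimal sketch assuming the data lives in a plain Python `dict` and each key should hold several values. The names `scores` and `grouped` are made up for illustration.

```python
from collections import defaultdict

# Plain dict: setdefault creates the list the first time a key is seen
scores = {}
scores.setdefault("alice", []).append(90)
scores.setdefault("alice", []).append(85)

# defaultdict(list) does the same without the explicit setdefault call
grouped = defaultdict(list)
grouped["bob"].append(70)
grouped["bob"].append(75)

print(scores)         # {'alice': [90, 85]}
print(dict(grouped))  # {'bob': [70, 75]}
```

Storing a list per key is what makes "another value on the same key" possible; assigning with `scores["alice"] = 85` would overwrite the earlier value instead.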

rugged comet
versed gulch
fluid spindle
#

Hello, what's the practical use of pandas.dataframe.items, iterrows, iloc and loc?
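
A small sketch of how the four are typically used, on a made-up frame (the column names and index labels are assumptions for illustration only):

```python
import pandas as pd

df = pd.DataFrame(
    {"price": [10, 20, 30], "qty": [1, 2, 3]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "price"]   # label-based lookup (row label, column name)
by_position = df.iloc[1, 0]       # position-based lookup of the same cell

# items() yields (column name, column Series) pairs
column_names = [name for name, column in df.items()]

# iterrows() yields (index label, row Series) pairs; handy, but slow on large frames
row_totals = {label: row["price"] * row["qty"] for label, row in df.iterrows()}

print(by_label, by_position)  # 20 20
print(column_names)           # ['price', 'qty']
```

In practice `loc` and `iloc` cover most selection and assignment, while `items` and `iterrows` are mainly for the occasional explicit loop.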

cloud kettle
#

hi, does anyone know how to encrypt a specific field of a JSON document using Python?

lapis sequoia
#

Hi, I am training a model using catboost and it trains fine when there's CPU in the hyperparameters but I get the following error when I change it to GPU. Hence, it seems that there isn't any problem with the code. Please suggest how it can be resolved

ivory pulsar
#

@lapis sequoia Do you know what CUDA version are you running? Tried updating catboost?

#

95% sure I had this exact issue before but I can't for the life of me remember how I fixed it

lapis sequoia
floral hollow
#

how to merge two training data sets of images

#

this is how i am loading the data

#
clothes = keras.datasets.fashion_mnist
(train_images, train_labels), (extra_images, extra_labels) = clothes.load_data()
#

essentially i want to do something like:
```py
train_images = np.concatenate(train_images, extra_images)
train_images = np.concatenate(train_labels, extra_labels)
```

wooden sail
#

you're pretty much there! you just need to specify along which axis to concatenate

#

what's the shape of your data sets?

floral hollow
#

i added this

#

axis=0 it worked

wooden sail
#

great
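
For reference, a sketch of the working call. The arrays here are small stand-ins with the same layout as the Fashion-MNIST splits, so the snippet runs without downloading anything; note that `np.concatenate` takes the arrays as a single tuple.

```python
import numpy as np

# Stand-ins for the real arrays (the actual splits hold 60000 and 10000 images)
train_images = np.zeros((6, 28, 28), dtype=np.uint8)
extra_images = np.ones((2, 28, 28), dtype=np.uint8)
train_labels = np.zeros(6, dtype=np.uint8)
extra_labels = np.ones(2, dtype=np.uint8)

# The arrays go in as one tuple; axis=0 joins along the sample axis
all_images = np.concatenate((train_images, extra_images), axis=0)
all_labels = np.concatenate((train_labels, extra_labels), axis=0)

print(all_images.shape)  # (8, 28, 28)
print(all_labels.shape)  # (8,)
```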

cedar solstice
#

Guys, I'm working on a script that fetches text from images and stores the data from the images in a dataframe.

I'm currently using Tesseract to detect text from images. Any other alternatives? Tesseract doesn't seem to detect small text in images.

wooden sail
hardy kernel
#

is there a way to append rows to a pandas dataframe inplace? Like without generating a new one?

hardy kernel
#

damn

serene scaffold
#

The best way is to append all the data to a python list, and then convert the whole list to a dataframe once you have all the data.
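
A minimal sketch of that pattern, with made-up column names:

```python
import pandas as pd

rows = []
for i in range(3):
    # Appending to a Python list is cheap; nothing is copied
    rows.append({"id": i, "square": i * i})

# Build the frame once, after all rows are collected
df = pd.DataFrame(rows)
print(df.shape)  # (3, 2)
```

Growing a DataFrame row by row copies the whole frame each time, which is why collecting first tends to be much faster.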

hardy kernel
#

o that doesnt sound too bad, I'll try that, thanks buddy

cedar solstice
wooden sail
#

i'm not sure i've seen something like that trained to detect special chars. if you mean to detect it as an image, you can probably transfer train a network that does image classification

floral hollow
#

hey, my model wont stop guessing "Boot", please help me

#

this is the code:

#
from tensorflow import keras
from pathlib import Path
import tensorflow as tf
import numpy as np
import cv2

images_path = fr'{Path(__file__).parents[1]}\images'
image_path = fr'{images_path}\dress.png'

labels = [
    'T-shirt',
    'Pants',
    'Long sleeve shirt',
    'Dress',
    'Coat',
    'Sandal',
    'Shoe',
    'Bag',
    'Boot'
]

label = labels[3]
    
""" Retrieving and loading data """
clothes = keras.datasets.fashion_mnist
(images, t_labels), (extra_images, extra_labels) = clothes.load_data()

images = np.concatenate((images , extra_images), axis=0)
t_labels= np.concatenate((t_labels, extra_labels), axis=0)

""" Making the image the correct format """
drawn_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
drawn_image = drawn_image[0:600, 0:600]
resized_drawn_image = cv2.resize(drawn_image, (28, 28), interpolation=cv2.INTER_LINEAR)
resized_drawn_image = resized_drawn_image.reshape(-1, 28, 28)

""" Pre-processing images to be between the values of 0 - 1 and Making image White on Black """
images = images / 255
resized_drawn_image = abs(255 - resized_drawn_image) / 255 

""" Creating the model """
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])

""" Compiling the model """
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy', metrics=['accuracy'])

""" Fitting the model """      
model.fit(tf.expand_dims(images , axis=-1), t_labels, epochs=1)

""" Testing the model """
test_results = list(model.predict(resized_drawn_image)[0])

""" Getting the results """
guess_index = test_results.index(max(test_results))
if guess_index == 6:
    guess_index = 2

""" Printing the results """
if labels[guess_index] == label:
    print(f'\n\n The model guessed {labels[guess_index]}, the model was correct!')
else:
    print(f'\n\n The model guessed {labels[guess_index]}, the correct answer was {label}.')
#

this is dress.png

#

perhaps the all knowing @wooden sail could take a look...?

steady basalt
#

edd has duties, being a god is hard work

wooden sail
#

try doing imshow on one of the concatenated images and see

#

try plotting what you've done to the images. looking at stuff can be helpful when you're wrangling data
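
Alongside plotting, it may be worth checking the label list itself. This is only a sketch of one possible cause, not a confirmed diagnosis: Fashion-MNIST has ten classes, while the list in the script has nine entries, so the names shift from index 2 onward.

```python
# Class order of keras.datasets.fashion_mnist (list index = integer label)
dataset_classes = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]

# The nine-entry list from the script
script_labels = [
    'T-shirt', 'Pants', 'Long sleeve shirt', 'Dress', 'Coat',
    'Sandal', 'Shoe', 'Bag', 'Boot',
]

for index, name in enumerate(dataset_classes):
    shown = script_labels[index] if index < len(script_labels) else '<IndexError>'
    print(index, name, '->', shown)
```

With this mapping a prediction of class 8 (Bag) is printed as "Boot", and a prediction of class 9 would raise an `IndexError`.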

desert oar
#

@floral hollow if you want to "stack" a bunch of 2d arrays together into a single 3d array, use np.stack, not np.concatenate

#

concatenate extends an existing dimension. stack adds a new dimension.
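
A quick sketch of the difference on two 28x28 arrays:

```python
import numpy as np

a = np.zeros((28, 28))
b = np.ones((28, 28))

stacked = np.stack([a, b])       # new leading axis
joined = np.concatenate([a, b])  # existing axis 0 grows

print(stacked.shape)  # (2, 28, 28)
print(joined.shape)   # (56, 28)
```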

#

in general, if you download someone else's project, and follow their instructions, and the instructions don't work, it means that they messed up not you 🙂

#

it might be a good place to start reading about how openai gym works

unborn hinge
#

from what ive understood its not referring to your virtual environment, but rather the "environment" that the AI agents are acting in

#

the "action space" seems to define what actions the agent can take, but as per the readme there is no fixed set of actions, so their environment doesnt define an action space
https://github.com/dellalibera/gym-backgammon/#actions

The valid actions that an agent can execute depend on the current state and the roll of the dice. So, there is no fixed shape for the action space.

desert oar
#

that was my impression as well. the "environment" is something that's part of the openai gym framework, not the python environment

#

it seems like something that was supposed to be part of the example script.

unborn hinge
#

but otherwise i cant explain why their project doesnt include one, perhaps there was a change to the api that required it?

#

you can try either contacting the maintainer of gym-backgammon, or fixing the code so it has an action space (the module i linked above does seem to have a get_valid_actions method for determining those actions)

#

openai/gym#751 might be a relevant issue too

strange elbowBOT
unborn hinge
#

or this https://github.com/openai/gym/issues/1264

You could always allow all actions and rely on the agent to figure out that 2 of the actions do nothing in the later stage. You could also signal to the agent in the observations that it is in one setting or the other. It's likely that there are other ways to do this, but this may be the simplest one.

steady basalt
#

Implement a github project into a notebook?

#

Wdym?

#

U cant rly run multiple files in a notebook

#

u run a github project as one

#

for example, clone it and run the main .py file in terminal

#

gits are like interconnected dependencies

desert oar
#

a git repository is an entire "project": a collection of files. a notebook is one of several files in a project.

empty plank
#

Hello so, I was working on a simple JARVIS project and I got an idea to make the assistant identify the user's face and greets him by his name. For example, If I use the assistant it should greet me as my name and if my friend uses it, it should greet by his name. I am not much experienced in this field so, I need to know which library would help me do it. (and a tutorial reference would be helpful :>)

maiden merlin
#

Hey, what can I do to divide two data frames by each other?
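
A minimal sketch, assuming both frames share the same index and column labels (pandas aligns on both before dividing):

```python
import pandas as pd

df1 = pd.DataFrame({"x": [10.0, 20.0], "y": [9.0, 8.0]})
df2 = pd.DataFrame({"x": [2.0, 4.0], "y": [3.0, 2.0]})

ratio = df1 / df2  # element-wise; equivalent to df1.div(df2)
print(ratio)
```

Labels present in only one of the frames come out as NaN, so mismatched indexes are the usual thing to check when the result looks empty.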

steady basalt
#

Yes

maiden merlin
#

Is there a reliable way to change a data frames values from an object to an integer.. if the numbers contain commas specifically

steady basalt
#

remove comma

#

?

#

its surely a string?

maiden merlin
#

yes

charred light
maiden merlin
#

,thousands=',
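
Two hedged sketches for the comma problem, on made-up data: cleaning a column that is already loaded, and letting `pd.read_csv` handle the separators through its `thousands` parameter. The semicolon-delimited text is an assumption so that the commas are unambiguous.

```python
import io
import pandas as pd

# Option 1: clean an existing object (string) column
df = pd.DataFrame({"amount": ["1,234", "56,789", "12"]})
df["amount"] = df["amount"].str.replace(",", "", regex=False).astype(int)

# Option 2: parse the separators while loading
csv_text = "name;amount\na;1,234\nb;56,789\n"
loaded = pd.read_csv(io.StringIO(csv_text), sep=";", thousands=",")

print(df["amount"].tolist())      # [1234, 56789, 12]
print(loaded["amount"].tolist())  # [1234, 56789]
```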

desert oar
dawn fable
#

Hey, can anyone help me? I have a data set with data about houses. I'm trying to create multiple plots with sns.FacetGrid, but it does everything in the last plot (Check the image). I try to make a plot for every town. The floor_area_sqm goes on the x-axis and the town goes on the y-axis.

This is the code:

```py
grid = sns.FacetGrid(df, col="town", hue="resale_price",
                     col_wrap=4, height=1.5)
sns.scatterplot(x="floor_area_sqm", y="resale_price", hue="town", data=df)
```
Thanks for the help!

Image:
steady basalt
#

In probability theory, the birthday problem asks for the probability that, in a set of n randomly chosen people, at least two will share a birthday. The birthday paradox is that, counterintuitively, the probability of a shared birthday exceeds 50% in a group of only 23 people.
The birthday paradox is a veridical paradox: it appears wrong, but is...

#

c-c-c-crazy

gaunt anvil
#

does anyone know what fake_audio[:, :, :audio.size(2)] does? These are PyTorch tensor objects but you can probably think of them as np arrays :L

agile cobalt
#

it's a slice filtering

  • all values in the axis at index 0
  • all values in the axis at index 1
  • all values up to audio.size(2) in the axis at index 2
gaunt anvil
#

ah i see

agile cobalt
#

!e I would expect for it to be doing the same as
```py
import numpy as np
a = np.arange(16).reshape(2,2,4)
b = np.arange(8).reshape(2,2,2)
print(a)
print('---')
print(a[:, :, :b.shape[2]])
```

gaunt anvil
#

this repo i'm running seems to be breaking when trying to subtract the fake_audio[] with the real audio

#
```
  File "/home/user/HiFi-GAN/utils/train.py", line 88, in train
    step)
  File "/home/user/HiFi-GAN/utils/validation.py", line 28, in validate
    sc_loss, mag_loss = stft_loss(fake_audio[:, :, :audio.size(2)].squeeze(1), audio.squeeze(1))
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/HiFi-GAN/utils/stft_loss.py", line 130, in forward
    sc_l, mag_l = f(x, y)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/HiFi-GAN/utils/stft_loss.py", line 91, in forward
    sc_loss = self.spectral_convergenge_loss(x_mag, y_mag)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/HiFi-GAN/utils/stft_loss.py", line 46, in forward
    return torch.norm(y_mag - x_mag, p="fro") / torch.norm(y_mag, p="fro")
RuntimeError: The size of tensor a (151) must match the size of tensor b (146) at non-singleton dimension 1
```
#

so would i just force fake_audio to be the same dims as audio?

arctic wedgeBOT
#

@agile cobalt :white_check_mark: Your 3.11 eval job has completed with return code 0.

[[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[ 8  9 10 11]
  [12 13 14 15]]]
---
[[[ 0  1]
  [ 4  5]]

 [[ 8  9]
  [12 13]]]
agile cobalt
gaunt anvil
#

oh .-.

agile cobalt
#

you must see what is the purpose and origin of each of them first

gaunt anvil
#

fake_mel seems to be from a model generating from a mel

#

audio seems to be the real audio

agile cobalt
#

I'd check the model description to understand why they don't match and see if / what they recommend then

#

it might be just that they have different durations, in which case you might as well cut off the longest one to match the other, but only after making sure that this is the case and the way they recommend handling that issue
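
If the durations really are the only difference, the trimming could look like this. It is a NumPy sketch with made-up shapes (batch, channel, samples); the same slicing works on PyTorch tensors. One detail worth noting: `fake_audio[:, :, :audio.size(2)]` only shortens `fake_audio` when it is the longer of the two, so cutting both to the shorter length is the symmetric version.

```python
import numpy as np

fake_audio = np.zeros((1, 1, 8000))  # hypothetical generated batch
audio = np.ones((1, 1, 8192))        # hypothetical reference batch

# Take the shorter length along the sample axis and cut both to it
n = min(fake_audio.shape[2], audio.shape[2])
fake_trimmed = fake_audio[:, :, :n]
audio_trimmed = audio[:, :, :n]

print(fake_trimmed.shape, audio_trimmed.shape)  # (1, 1, 8000) (1, 1, 8000)
```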

gaunt anvil
#

since i have enough audio data now i might just train from scratch

long zephyr
#

I am pretty new to description logic and I have no idea how you would solve these kind of exercises. Could anyone lend me some help?

misty flint
#

like stel said this is pretty cool stuff. i had a classmate in grad school that created a GAN to create anime characters. his took at least 20k-30k images for decent quality

#

and even then you could kinda tell with some of them that they were artificial

plush jungle
#

but that's really good to know that 20-30k was the barrier to entry for your classmate

misty flint
#

yeah his were full body shots though and not one specific anime series so maybe you will require less

#

best of luck bud

desert oar
# dawn fable Hey, can anyone help me? I have a data set with data about houses. I'm trying to...

scatterplot doesn't automatically detect and use the presence of a FacetGrid. you need to call grid.map_dataframe or similar to do this, as per the examples in the FacetGrid docs http://seaborn.pydata.org/generated/seaborn.FacetGrid.html#seaborn.FacetGrid

however the scatterplot docs point to relplot as the preferred way to do this: http://seaborn.pydata.org/generated/seaborn.relplot.html#seaborn.relplot
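
Both options, sketched on a tiny made-up frame that reuses the column names from the question (the values and town names are invented):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "town": ["A", "A", "B", "B", "C", "C"],
    "floor_area_sqm": [60, 90, 70, 110, 45, 80],
    "resale_price": [300, 450, 350, 520, 250, 400],
})

# Option 1: map the plotting function onto every facet of the grid
grid = sns.FacetGrid(df, col="town", col_wrap=4, height=1.5)
grid.map_dataframe(sns.scatterplot, x="floor_area_sqm", y="resale_price")

# Option 2: relplot builds the FacetGrid and draws in one call
rel = sns.relplot(data=df, x="floor_area_sqm", y="resale_price",
                  col="town", col_wrap=4, height=1.5, kind="scatter")
```

Either way there is one panel per town, instead of everything landing on the last axes.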

lapis sequoia
#

would this have worked if I filtered normally with a mask like in the last line. Instead of loc

#

Basically does doing df[mask] create broadcasting or a copy

fluid spindle
#

Hi anyone around for a quick question?

desert oar
lapis sequoia
steady basalt
#

Surely its in the environment youre coding in

#

but not downloaded as files actually, maybe its just put on ur ram

#

otherwise u wud lose hard drive after doing this stuff a few times

#

yeah pretty sure it doesn't download

steady basalt
#

yeah it isn't downloaded

grim orbit
#

hey

#

can any1 help me with spacy in here?

fluid spindle
# lapis sequoia dontAskToAsk

I ask cuz after 4 hours I don't need it anymore, I could ask it directly, as you see no one was around, someone would spend their time and solve it after hours and I would thank them, have to try to conceal they spend entire time trying to solve it for nothing

jaunty ruin
#
building 'cartopy.trace' extension
      error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for cartopy

I keep getting this error while trying to install Cartopy. I am temporarily hosting the bot in Diva where I cannot upgrade C++ Build Tools, etc. Can anyone suggest me a better way to fix this error? I tried to google it but people suggest using Conda...

serene scaffold
quaint plover
#

I am analysing the position of links on a set of webpages, to explore if link position influences click-through.

I have the relative path in this format for an html file: file = './data/wpcd/wp/f/France.htm'

I am trying to use Selenium to analyse the content of the file, but it requires the full path and a specific encoding, such as: driver.get("file://C://Users//User//DataspellProjects//project//data//wpcd//wp//f//France.htm")

What would be the best way to convert my relative path into a Selenium usable path, while staying platform independent

I am currently doing a regex based solution with os.getcwd(), but it doesn't feel like the right way of doing things
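
One platform-independent option is `pathlib`, which can turn a path into a `file://` URI. A minimal sketch, with the Selenium call left as a comment because `driver` is assumed to be an already-created WebDriver:

```python
from pathlib import Path

file = './data/wpcd/wp/f/France.htm'

# resolve() makes the path absolute relative to the current working directory;
# as_uri() then produces a file:// URI with the right separators and escaping
uri = Path(file).resolve().as_uri()
print(uri)

# driver.get(uri)
```

`as_uri()` raises `ValueError` on a relative path, which is why the path is resolved first.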

serene scaffold
lapis sequoia
#

You are not the same anymore

#

Britney

flat hollow
#

Has anyone here used kmeans clustering for 3D image segmentation? I'm looking for python's equivalent of Matlab's imsegkmeans3

#

omg I got it working I'm so happy

flat hollow
#

Does anyone know how to speed up computationally expensive functions in scipy or sklearn using nvidia GPU (so numba/cuda implementation)? Doesn't seem like they offer any GPU options atm, so is my only option to build those functions from scratch with numba syntax?

serene scaffold
neat schooner
#

I am trying to do a groupby but I want to return the whole row of the dataframe not just the column I am grouping and aggregating on. df.groupby('foo')['bar'].nlargest(2) ... but I want to see the whole row. I think the solution has something to do with loc but I can't wrap my head around it. Any ideas?
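
Two possible approaches, sketched on a made-up frame (the `other` column stands in for the rest of the row):

```python
import pandas as pd

df = pd.DataFrame({
    "foo": ["a", "a", "a", "b", "b"],
    "bar": [1, 5, 3, 2, 4],
    "other": ["v", "w", "x", "y", "z"],
})

# Option 1: sort once, then keep the first two rows of every group
top_rows = df.sort_values("bar", ascending=False).groupby("foo").head(2)

# Option 2: keep nlargest and look the original row labels up with .loc
picked = df.groupby("foo")["bar"].nlargest(2)
top_rows_loc = df.loc[picked.index.get_level_values(-1)]

print(top_rows)
print(top_rows_loc)
```

The second option works because the result of `nlargest` keeps the original row label as the last index level.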

steady basalt
#

cant u just use BS? arent html files static pages

#

selenium for navigating browser

quaint plover
steady basalt
#

You mean like coordinates?

#

u dont want its element?

quaint plover
steady basalt
#

I think you're using the wrong tool

quaint plover
#

How come

#

BS doesn't give the coordinates, it gives the position in the HTML files (line 20, column 200), while Selenium gives (x,y) position for a "physical" page, as it would be shown to the user.

steady basalt
#

Huh, TIL you can use selenium for finding x,y

#

I only used for scripting navigation

#

element wise

#

are you using a .get_location function on an element?

silk garden
#

Hi guys,

I'm pretty new here and I'd like to tell you about our Python open-source project.

My team and I are interested in the multitude of AI APIs that have emerged on the market in recent years from large cloud providers (Google, Microsoft, Amazon, etc.) but also from AI specialists (OpenAI, DeepL, Assembly AI, etc.) and that allow us to handle specific tasks: image recognition, translation, audio transcription, document parsing, etc.

We develop an API to rule them all: we standardize competing APIs into a single one so that developers can change providers whenever they want, use several APIs at the same time if needed, combine engines from different providers, etc.

To be transparent about this standardization, we decided to launch an Open Source version where we display the connectors we created to allow any AI service provider to add its own connector or to allow anyone to use our standardization for free: https://github.com/edenai/edenai-apis/ For those who are interested in these topics, I would love to have your opinion on our project and how to nourish it (please note that at the moment, only members of my team are working on it). As I said at the beginning, it's new for us 🙂

Thanks in advance,

Taha

PS: If you can star the repo that would be great and would help us a lot!

GitHub

Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines - GitHub - edenai/edenai-apis: Eden AI: simplify the use and depl...

remote coyote
#

Does anyone here have a suggestion about where might be good to ask some questions about the mathematical properties of 2D DFTs/FFTs and how that relates to some image editing techniques? I've got the basics down, but I'm having some trouble identifying signal peaks in the 2D space. I'm wondering if there are some properties of the complex vector magnitudes that may assist my post-processing search. (Kind of vague, I know!)

merry ridge
#

Unless you can be more specific I don't think there is anything particularly helpful that you are looking for beyond general peak finding algorithms

desert oar
# silk garden Hi guys, I'm pretty new here and I'd like to tell you about our Python open-sou...

i like this idea (and i love that it's open source and self hostable!!), but who is the customer? changing api providers usually is a somewhat big decision, no? how often are companies using multiple AI APIs at the same time?

i wonder what the value of making this a service is, versus simply a python client library like geopy that abstracts over the various APIs on the client side?

i feel like if you're at the point where you're mixing and matching AI APIs, you'd already have a data scientist or two on hand and might be starting to in-house some of that stuff anyway.

and how are you going to get AI API companies on board with this? they want their APIs to be distinct and differentiated. aggregating them into a common "API soup" might go against their strategic plans and they might write you out of their ToS. this was a big issue at a past company i worked at, where the product essentially depended on the goodwill of upstream API providers and we needed to carve out our product niche very very carefully so as not to step on their toes and get slapped in response, because every one of them had the power to unilaterally destroy our business if they wanted to.

#

of course if this is just an open source project that isn't meant to have commercial backing, then i'm all for it. it might also be a great thing for other companies to be able to offer various AI/ML APIs as an integration with their own platform, like jetbrains might want to use this inside dataspell.

#

in fact, maybe that's the pitch to upstream API providers: you are making their API more accessible to more people, by making it interchangeable w/ other providers and thereby allowing them to compete more directly on price and model quality. might be appealing to smaller players and unappealing to bigger players, like how netflix was originally good for movie studios until they decided to go build their own streaming platforms and netflix suffered badly.

idle urchin
#

I have code
```
conditions = [Animals["refference"] == "ABCD"]

choices = [
    'ABC-{}'.format(Animals['Invoice'])
]

Animals["Type"] = np.select(conditions,choices,Animals["Type"])
```

desert oar
#

i really don't quite know what you're trying to do here

#

you have a couple of other problems with this code too... why is conditions wrapped in an outer list?

#

since this is just a boolean lookup, you can assign directly with .loc and avoid all the complexity you've accumulated:

cond = Animals["reference"] == "ABCD"
Animals.loc[cond, "Type"] = "ABC-" + Animals.loc[cond, "Invoice"]

is that what you wanted?

idle urchin
desert oar
#

also, in english reference is spelled with one f, not two

idle urchin
desert oar
#

cond is a pandas Series of boolean values

#

.loc selects multiple values from a series or data frame

idle urchin
#

is there a way to do this using np.select

desert oar
#

yes, but why?

#

is this a homework assignment?

idle urchin
#

no

desert oar
#

it's not idiomatic and it's less efficient computationally

idle urchin
#

cause isn't np.select faster

desert oar
#

no

#

did i have this discussion with you recently? or with someone else?

#

they saw in a youtube video that np.select was "faster" (than what?) and were fixated on using it

#

np.select is faster than looping and using if/else inside the loop

#

but np.select also requires you to construct a complete array just to take a subset of it, so that's not faster

#

assigning with .loc is just as fast as using np.select

#

both are implemented efficiently in tight C loops internally

#

the general principle is that vectorized numpy and pandas operations are faster than the same operation with a plain python loop
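
To make the comparison concrete, here is a sketch on a made-up frame showing both forms producing the same column. One thing it also illustrates: `'ABC-{}'.format(Animals['Invoice'])` formats the whole Series into a single string, so the `np.select` choice has to be built element-wise instead.

```python
import numpy as np
import pandas as pd

animals = pd.DataFrame({
    "reference": ["ABCD", "WXYZ", "ABCD"],
    "Invoice": ["001", "002", "003"],
    "Type": ["cat", "dog", "bird"],
})
cond = animals["reference"] == "ABCD"

# .loc: only the matching rows are touched
with_loc = animals.copy()
with_loc.loc[cond, "Type"] = "ABC-" + with_loc.loc[cond, "Invoice"]

# np.select: a full-length choice array is built, then filtered by the condition
with_select = animals.copy()
with_select["Type"] = np.select(
    [cond], ["ABC-" + animals["Invoice"]], default=animals["Type"]
)

print(with_loc["Type"].tolist())     # ['ABC-001', 'dog', 'ABC-003']
print(with_select["Type"].tolist())  # ['ABC-001', 'dog', 'ABC-003']
```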

idle urchin
#

ok thanks

umbral charm
#

anyone good with computing mathematical functions

desert oar
umbral charm
#

Yea ok totally read that

desert oar
#

can you repost that as text, not a screenshot?

umbral charm
#

this server got a latex bot?

desert oar
#

!code you can format the code in a code block, read below:

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

umbral charm
#

$h$

desert oar
#

i see, no we don't have a latex bot

umbral charm
#

Well it's a mathematical function i need to compute

#

i can try copy pasta but i doubt it would do much

desert oar
#

i see. this should be pretty easy using numpy. did you at least make an attempt?

#

you can do it looping over lists too, which seems to be what the question is asking you to do

#

make an attempt and post your code

umbral charm
#

not really, but i think ive done good so far i just dont know what to do now

#

Yea ive did some code

#

def fourier(x, a_list, b_list):
for i, (j,k) in enumerate(zip(a_list, b_list)):
(j*np.cos(x) + k*np.sin(x))

#

idk how to get it in the fancy thingy

#

''' def fourier(x, a_list, b_list):
for i, (j,k) in enumerate(zip(a_list, b_list)):
(j*np.cos(x) + k*np.sin(x)) '''

desert oar
hasty mountain
#

Guys...why does NLP models rely so much on softmax functions?
I've tried making one that outputs a single value, translating it with a dictionary of values + KNN, and...no success...

umbral charm
#

'''py
def fourier(x, a_list, b_list):
for i, (j,k) in enumerate(zip(a_list, b_list)):
(j*np.cos(x) + k*np.sin(x))
'''

desert oar
#

the backtick character is on the same key as ~ on us-ansi keyboards

umbral charm
#

py def fourier(x, a_list, b_list): for i, (j,k) in enumerate(zip(a_list, b_list)): (j*np.cos(x) + k*np.sin(x))

desert oar
#

it says in bold text that they are not quotes

umbral charm
#

...

desert oar
#

the info box even has a chunk you can copy and paste...

serene scaffold
umbral charm
#

py def fourier(x, a_list, b_list): for i, (j,k) in enumerate(zip(a_list, b_list)): (j*np.cos(x) + k*np.sin(x))

desert oar
idle urchin
serene scaffold
#

@desert oar you're on fire today btw. I'm inspired 😄

desert oar
desert oar
umbral charm
#
def fourier(x, a_list, b_list):
    for i, (j,k) in enumerate(zip(a_list, b_list)):
        (j*np.cos(x) + k*np.sin(x))
#

HOLY SMOKES finally

#

i dont know what to do from here

desert oar
umbral charm
#

--latex

#

how it work?

#

--\sigma

idle urchin
umbral charm
#

--sigma

serene scaffold
hasty mountain
#

I don't like the fact that the softmax function make the NLP models work with inputs and outputs which has sizes so big...

desert oar
desert oar
hasty mountain
#

Also, @serene scaffold you said you would be impressed if my model could do something with this combination of MSE + KNN...
And...well...seems like you were right grumpchib


0/10000    Current Loss: 1289.3867072020757    Current Learning Rate: 1
Gradients Average: -0.15803318163855073
1000/10000    Current Loss: 974.5605831206061    Current Learning Rate: 0.010000000000000002
Gradients Average: 0.8818579553932097
...
9000/10000    Current Loss: 1015.7047142487523    Current Learning Rate: 1.000000000000001e-18
Gradients Average: -0.559652991596319
10000/10000    Current Loss: 935.3332222202407    Current Learning Rate: 1.0000000000000011e-20
Gradients Average: 0.5533991380623683
#

The model doesn't seem to learn at all

desert oar
strange elbowBOT
umbral charm
#

Failed to render

desert oar
#

.latex ```latex
\frac{x - \bar{x}}{\sigma}
```

strange elbowBOT
desert oar
#

yeah idk, seems broken

umbral charm
#

Yea

idle urchin
umbral charm
#

So how would one compute this in the function
Write a function fourier(x, a_list, b_list). a_list should be a list of coefficients for cosine functions, and b_list should be a list of coefficients for sine functions. The result should be an array of values matching the input array x.

hasty mountain
# hasty mountain Also, @serene scaffold you said you would be impressed if my model could do...
print(original_sentence)
print(output)

('私 の a i は 話 し て 歌 っ た し て ゲ ー ム を し ま す <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> <EOS>')
('ん ア 前 リ ポ 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確 確')

It's curious, though... the range of values in the dictionary goes from -45 to 45(Originally it was -1 to 1).
And that kanji that keeps repeating has value 21.74.

wooden sail
# umbral charm

i would say numpy is the easiest way, but you could also do it in a for loop and using the math library to compute the sines and cosines

#

will you be given the coefficients a_n and b_n or do you need to compute them yourself in a separate function?

umbral charm
#

the numpy way, like ive did

#
def fourier(x, a_list, b_list):
    for i, (j,k) in enumerate(zip(a_list, b_list)):
        j*np.cos(x) + k*np.sin(x)
#

i just dont know what to do from here

desert oar
hasty mountain
#

I create a dictionary where each word is assigned to a number, then I use that to translate the numbers generated by the model to a word

desert oar
hasty mountain
#

And, to make sure the model's output can be translated by the dictionary, I use a KNN

idle urchin
desert oar
wooden sail
desert oar
idle urchin
desert oar
umbral charm
desert oar
wooden sail
#

yes, so what is YOUR question?

#

your code was already very close, except for the index i that you have not multiplied

#

what else is troubling you about it?

desert oar
#

it might help to use the same variable names as in the math expression

hasty mountain
# desert oar okay, so how does the model tell you what word it chose? it chooses word `k` by ...

Suppose that, in the dictionary, I got the word a assigned to the value 3.465 and the word b assigned to the value 5.231.
If the model outputs the number 4, this number 4 will be passed to the KNN that was fitted into the dictionary and this KNN will tell me "the model output is 4, which is closer to 3.465 so the model's output is 3.465".
Then, I pass this "KNN translation" into the dictionary and I'll get that the model's output is word a
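
The lookup described above amounts to a one-dimensional nearest-neighbour search. A minimal sketch with the two example values from the message (the names `vocab` and `decode` are made up, and a plain `argmin` stands in for the fitted KNN):

```python
import numpy as np

vocab = {"a": 3.465, "b": 5.231}
words = list(vocab)
values = np.array([vocab[w] for w in words])

def decode(output_value):
    # Index of the dictionary value closest to the model output
    nearest = np.argmin(np.abs(values - output_value))
    return words[nearest]

print(decode(4.0))  # a
```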

umbral charm
desert oar
#

wait, is that math expression even right?

#

they put n but meant i?

umbral charm
#

bro IDKE man my teachers on crack half the time

wooden sail
umbral charm
wooden sail
desert oar
#

well that's probably why mortta is confused

umbral charm
#

would it be

wooden sail
#

.latex should be \[
\sum_{n = 0}^N a_n \cos(nx)
\]

strange elbowBOT
wooden sail
#

i pressed enter too early, oops

umbral charm
desert oar
#

.latex $
\sum_{i = 0}^n a_i \cos(i x) + b_i \sin(i x)
$

strange elbowBOT
wooden sail
#

.latex should be \[
\sum_{n = 0}^N a_n \text{cos}(nx) + b_n \text{sin}(nx)
\]

strange elbowBOT
wooden sail
#

man what did i mess up

umbral charm
#

i just copied and pasta into a diff bot

#

but it should be that

wooden sail
#

yeah this bot sux lol

umbral charm
#

the bot is kinda dead i think

wooden sail
#

i forgot the sine term but you get the idea. notice i used different indices in the sum

#

using i as an index here is cursed because in fourier i is the complex unit, usually. so let's use n and N

umbral charm
#
def fourier(x, a_list, b_list):
    for i, (j,k) in enumerate(zip(a_list, b_list)):
        return j*np.cos(i*x) + k*np.sin(i*x)
#

so would this be right

wooden sail
#

sure, that should work. but you also have to iterate over x

umbral charm
#

or do i have to do i*j too

#

wym over x?

wooden sail
#

x is a list

copper mica
#

I know that AI has had practical application in helping devs write docs

umbral charm
#

thats the thing IDK WHAT x is supposed to be

copper mica
#

do they use some service for this or train their own model?

wooden sail
#

do you know what a fourier series is?

umbral charm
#

no LOL SHE

copper mica
#

what happens when the devs are working on proprietary software and don't want their code being used

umbral charm
#

she just gives us really hard equations to write down a function for

#

ill give u another example

#

LIke wtf MAN IDK what that means

#

i did that one tho

wooden sail
#

well, there is a LOT of stuff in the task you were given that doesn't really make sense tbh

umbral charm
#

Yea i figured

#

she even gets mad

#

when i do math.pi

#

instead of numpy.pi

#

so what do i do next

wooden sail
#

well, i would say what's next is you email your teacher cuz her fourier series is wrong

#

this won't work unless x is simply range(L) or something, meaning you're fourier transforming f(x) = x

umbral charm
#

ok lets say if we just forget that she said fourier

#

could the function still be made

wooden sail
#

yes

umbral charm
#

So i say

wooden sail
#

you have to notice this is a vector equation

umbral charm
#

we just forget that fourier is a thing its just she defined the function that way

wooden sail
#

x has several entries

#

for each value of x, you have to apply this function with a sum of sines and cosines

umbral charm
#

ok

#

thats understandable

#

but how do we know how many times

#

we have to add up sines and cosines

#

what tells us that

wooden sail
#

you already took care of that

#

the length of a_list and b_list

#

that's the capital N in the sum
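
A minimal NumPy sketch of the sum discussed above, assuming x is an array of sample points and the sum runs over n = 0 up to len(a_list) - 1. The name `fourier_sum` and the sample coefficients are made up for illustration; this is not the assignment's reference solution.

```python
import numpy as np

def fourier_sum(x, a_list, b_list):
    """Evaluate sum_n a_n*cos(n*x_k) + b_n*sin(n*x_k) for every x_k in x."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    # enumerate starts at 0, matching the n = 0 lower limit of the sum
    for n, (a, b) in enumerate(zip(a_list, b_list)):
        total += a * np.cos(n * x) + b * np.sin(n * x)
    return total

# hypothetical inputs: three x values, two coefficients per list
x = np.array([0.0, np.pi / 2, np.pi])
result = fourier_sum(x, a_list=[1.0, 2.0], b_list=[0.0, 3.0])
print(result)
```

Because NumPy applies `cos` and `sin` element-wise, the loop only runs over the coefficients and every entry of x is handled at once. The output has one value per entry of x, so x does not need the same length as the coefficient lists.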

umbral charm
#

Yea

wooden sail
#

man it works on overleaf. this bot is tripping

#

.latex should be
[
x_k = \sum_{n = 0}^N a_n \cos(nx_k) + b_n \sin(nx_k)
]

strange elbowBOT
umbral charm
#

to iterate over both lists at the same time use `for a, b in zip(a_list, b_list): ...` and since you also need the index, wrap enumerate around it. iterating over a range might seem better in this case but it isn't. then you calculate each term using the math module and append the result to a list, which you then take the sum of and return

white jacinth
#

How do I learn the math needed for machine learning? What is a good book or website for someone whose math is intermediate

umbral charm
#

thats what my friend said anyway

#

but idk what to do with that

wooden sail
#

grab a piece of paper and a pencil and go through the logic of the code you shared

desert oar
white jacinth
desert oar
#

@white jacinth there are also the usual sources for learning calculus and linear algebra: MIT open courseware and the 3b1b "essence of" series

#

there are also plenty of good stats textbooks, e.g. OpenIntro Statistics

white jacinth
copper mica
#

i know that AI has had a lot of practical application in recent times in helping developers write documentation
Do devs use some service for this? or do they train their own model(especially when the software they work on is proprietary and don't want to use public services)

idle urchin
desert oar
#

if the elements are actually numbers, you can use Animals["Invoice"].astype(str), which applies str to each element individually

#

as above, str(Animals["Invoice"]) turns the entire series into one big string, which is not what you want

#

in general you can also use .apply or .map to apply any python function to each element individually

#

Animals["Invoice"].astype(str) is equivalent to Animals["Invoice"].apply(str)
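
A small sketch of the difference, using a made-up `animals` frame with an `Invoice` column of integers (the data and variable names are assumptions for illustration only):

```python
import pandas as pd

animals = pd.DataFrame({"Invoice": [1001, 1002, 1003]})  # made-up data

as_strings = animals["Invoice"].astype(str)  # converts each element
via_apply = animals["Invoice"].apply(str)    # same idea, one element at a time
whole_thing = str(animals["Invoice"])        # a single string: the printed form of the Series

print(as_strings.tolist())
print(type(whole_thing))
```

The first two give a Series of the same length with string elements. The last one gives a single Python string that includes the index and dtype line.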

umbral charm
#

so then after i did it for each value of x

#

do i add them all up?

desert oar
wooden sail
hasty mountain
umbral charm
#

or does it not want me to?

strong sedge
#

How does reinforcement learning work in alpha tensor ?

wooden sail
umbral charm
#

so would the sum of the array of x be the actual answer to the math function

#

but it just wants me to put it in an array

umbral charm
#

but also you know how i use enumerate, it starts from 0, surely we want it to start from 1?

wooden sail
#

there is nothing for you to make up. now that we corrected the indices, all you have to do is read the math expression

wooden sail
desert oar
#

you need a different arrangement that avoids accidentally introducing concepts that don't exist in the underlying data

#

so the natural way to do this is to encode each word as a vector of all 0s, with 1 in the position of the word number

#

so if you have 10 words in your vocabulary, the 6th word will be all 0s, with 1 in the 6th position

#

moreover, it's very appealing probabilistically to model each word "slot" as a random variable with a probability distribution over all possible words

#

the 0-1 encoding is what happens when the probability of the kth word is 1, forcing all other probabilities to 0

#

it's a very convenient and natural way to work with this kind of data, and there's no better alternative

#

so how do you know the model predicts word 4? if the score in the 4th position of the output layer is the highest

#

if you just want to find the word with the highest score, you don't need softmax

#

but if you want to treat the model output as a probability distribution or voting % over words, then you need softmax to transform the output numbers

umbral charm
desert oar
#

and that transformation is important because our loss functions generally require a probability distribution or voting %. so we need that transformation for scoring our models even if we don't need it for making predictions.
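
A toy illustration of those two points, assuming a five-word vocabulary and made-up output scores: the argmax gives the same word with or without softmax, but a cross-entropy loss needs the probabilities.

```python
import numpy as np

def softmax(scores):
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

vocab = ["the", "dog", "has", "a", "bone"]     # toy vocabulary
scores = np.array([0.2, -1.0, 0.5, 1.3, 3.1])  # made-up output-layer scores

probs = softmax(scores)
predicted = vocab[int(np.argmax(scores))]  # picking the top word needs no softmax
same_word = vocab[int(np.argmax(probs))]   # softmax keeps the ranking unchanged

target = np.zeros(len(vocab))
target[4] = 1.0                            # one-hot target for "bone"
loss = -np.sum(target * np.log(probs))     # cross-entropy is computed on probabilities
print(predicted, same_word, loss)
```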

umbral charm
#

idk if a function starts at n=0 or n=1

#

ah it starts at 1

#

so would i make my enumerate start at one

desert oar
desert oar
#

it looks like it starts at 0 to me!

umbral charm
#

oh

#

i cant read

#

SO i did

desert oar
#

that's what edd meant by read the math. you need to actually read it 😉

#

if you did want to start enumerate at 1, there's an option for it

umbral charm
#

??

desert oar
#

!d enumerate

arctic wedgeBOT
#

```py
enumerate(iterable, start=0)
```
Return an enumerate object. *iterable* must be a sequence, an [iterator](https://docs.python.org/3/glossary.html#term-iterator), or some other object which supports iteration. The [`__next__()`](https://docs.python.org/3/library/stdtypes.html#iterator.__next__ "iterator.__next__") method of the iterator returned by [`enumerate()`](https://docs.python.org/3/library/functions.html#enumerate "enumerate") returns a tuple containing a count (from *start* which defaults to 0) and the values obtained from iterating over *iterable*.

```py
>>> seasons = ['Spring', 'Summer', 'Fall', 'Winter']
>>> list(enumerate(seasons))
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
>>> list(enumerate(seasons, start=1))
[(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
```
Equivalent to...
umbral charm
#

WHAT

desert oar
#

but you don't need that for this problem!

umbral charm
#

Can i invite the python bot to my discord?

desert oar
#

you're supposed to start at 0

umbral charm
#

yea it says so below the sigma (the sum sign)

desert oar
umbral charm
#

is it python 3.7 or 3.10

wooden sail
#

the math expression literally tells you it starts at n=0

umbral charm
#

IM SORRy i dont take maths

#

is there like a calculator i could use to test im right

hasty mountain
#

So, a softmax wouldn't be really necessary, as long as I use an embedding layer?

hasty mountain
#

I don't understand why this would be any different from, like, index-encoding...or simply using a dictionary of values like I'm doing

desert oar
desert oar
umbral charm
#

@wooden sail does x have to be the same length as a_list and b_list

hasty mountain
hasty mountain
desert oar
#

"index encoding" and sparse one-hot encoding are both implemented efficiently in computers as lookups

#

but mathematically they are still one-hot encodings and vector/matrix multiplications thereof

#

that is, index encoding is a trick for efficiently implementing certain operations with one-hot encoded data

desert oar
desert oar
hasty mountain
desert oar
#

you can make up a new word that represents that particular linear combination of basis vectors for the embedding space... but that's not really useful

desert oar
hasty mountain
#

KNN makes this in a way that a range [X,Y] = word A

#

But...as far as I'm testing it, it doesn't seem to work...and I still don't get why.

desert oar
#

but you need to be clear here: the output from a KNN model is not a vector embedding, it's a "word id", i.e. an index

hasty mountain
#

Oh, well...kinda.

#

It's the value assigned to that word in the vocabulary dict

desert oar
#

sure, you can then go fetch the vector embedding associated w/ the matched word if you want

#

but in KNN the "fitted model" is just whatever index or tree structure you need for doing proximity lookups, and the output is just the id number of the matched entity

hasty mountain
#

Yes, indeed. So it should work, right?

desert oar
#

theoretically yeah. if your code doesn't work then maybe you're doing something wrong in the code.

#

where are you seeing that softmax is required for KNN?

hasty mountain
#

My idea would be that, if the model could achieve "perfection", it would output the exact same value for that word in the dictionary. Since this isn't possible, I use the KNN so the model will get the correct output if it throws an output that can be close to that value

hasty mountain
hasty mountain
#

But my models aren't working this way. And every code and article I see about NLP won't use KNN, just softmax in the model itself...which gets big outputs

hasty mountain
desert oar
#

yeah, that's why people use dense embeddings in the first place

hasty mountain
#

If my vocabulary dictionary has, like, 100 words. If I use the classic method(one-hot encoding)+softmax, my model will generate an output with shape [1, 100] when trying to predict each word, right?

#

What if I'm using a vocabulary with more than 10.000 words?

#

I want to avoid softmax because of that

desert oar
#

you need softmax for pretty much any other model because that's how they're all designed! in a model that needs to predict a word by performing a sequence of matrix multiplications (like a neural network) there's no sensible way to structure the output layer otherwise

#

even transformer models, which mostly operate entirely on pre-encoded dense word vectors (not one-hot inputs), still need a softmax output layer

hasty mountain
#

Then why isn't softmax necessary in price prediction and computer vision?
SRGAN relies purely on ReLUs

desert oar
#

because softmax is necessary when you have to choose between "categories"

#

there's no "greater than" or "sum" in such a space of categories. the only way to encode that sensibly is one-hot.

hasty mountain
#

That's why I'm trying to not use a "category", but simply assign a value to a word.

desert oar
#

whereas prices are real numbers (or at least an approximation thereof) so you can just output a number

#

in computer vision, softmax is necessary for classification tasks: choosing among categories that lack additional mathematical structure

desert oar
#

the latter option is silly and not feasible or useful

#

the former option might be interesting but i suspect that people don't do it because it's completely okay to have an output layer with 10k or 100k things in it

#

it's a bit like predicting whole numbers by real numbers + rounding. you can do it, but the results might be funky

hasty mountain
#

Hm... I see... Then I'll try see if I can use a vector embedding layer. I hope Pytorch has this one.

desert oar
#

remember: a vector embedding output layer is literally just a densely connected layer

desert oar
#

there might be some research on post-processing "dense" outputs with KNN to recover "sparse" outputs, but i am not aware of it

desert oar
hasty mountain
#

So that explains the Attention layers...

#

Why they're mostly composed of dense layers...

#

Then, my embedding vector layer would receive a single value, multiply that value by the same number of weights as my vocab size, and then output a single value?

desert oar
#

if you want to create your own embeddings, then you consume one-hot encoded input (or tf-idf or hashed encoding or something else) and emit a dense vector for each one-hot encoded input

hasty mountain
desert oar
#

you always need either one-hot encoded input or index-encoded input (which is equivalent to one-hot as i explained above)

#

you can use something else like tf-idf or hashed encoding too if you want, but the point is you need some way to encode the words as numbers

#

if you just map each word to a position on the number line you introduce fake and arbitrary structure among words, so you cannot do that

#

therefore you must use something else, like one-hot or hashing or tf-idf

hasty mountain
# desert oar i'm not sure what you mean

My input would be a sentence that has been encoded like I did with my dictionary(values between -1 and 1). Then I could pass this into a linear layer that would have output the same size as my vocabulary size(my dictionary length), and then pass this into another linear layer to output a single value.

desert oar
#

there are other options besides NNs, e.g. matrix factorization

hasty mountain
#

Can I use something like this?


{'私': -45.0, 'の': -43.98876404494382, '犬': -42.97752808988764, 'は': -41.96629213483146, '骨': -40.95505617977528}
desert oar
hasty mountain
#

Each character assigned to a single value

desert oar
#

those numbers are already a vector embedding with 1 dimension

hasty mountain
# desert oar sure, but how did you get those numbers?

def _create_dictionary(self, words):
    idx2word = []
    word2idx = {}
    for word in words:
        if word not in word2idx:
            idx2word.append(word)
            word2idx[word] = len(idx2word) - 1

    word2idx['<EOS>'] = len(idx2word)  # Adding an End of Sentence tag to improve model's accuracy

    return word2idx
#

maximum = max(dictionary.values())

for word, value in dictionary.items():
    scaled_value = (value-0)*2.0 / (maximum - 0)-1.0
    dictionary[word] = scaled_value * ((len(dictionary)+1)//2)

desert oar
#

you've just made up a vector embedding that has no real-world meaning

hasty mountain
#

I just multiplied the [-1,1] range by 45 to give the model a wider range to "miss"

desert oar
#

you randomly scattered words on a number line. or maybe you scattered them alphabetically, or according to some other system that has no inherent structural meaning.

#

but your model doesn't know that your numbers are made up and meaningless. the linear algebra underlying the models will find meaning where none exists, and your results will be entirely arbitrary and dependent on the made-up number line scattering you did.

hasty mountain
#

Like... suppose that the word a has value 0.51 and the word b has value 0.52, if the model outputs 0.516 KNN will translate this to b, but if it outputs 0.514, to a

desert oar
#

if you swap the order of two words in the list, it would change your model outputs! that's totally messed up

desert oar
#

why should the order of the alphabet matter? the alphabet is based on some shit the phoenicians made up 4000 years ago, it's not meaningful in understanding text.

#

why should a and b be next to each other? what if you put all the vowels first?

hasty mountain
#

Wouldn't the relation between those numbers be deciphered by the vector embedding layer?

desert oar
#

moreover, what does it mean to add a and b together? you get some number, but does that number make sense? note that the interval [-45, 45] isn't even a valid vector space because it's not closed under addition, so none of the math will work anyway

desert oar
hasty mountain
#

The code I see people writing usually relies on numbers that go from [0, vocab_size-1], so...there's still the same problem you're describing

#

But I believe this is deciphered by the embedding matrix, isn't it?

desert oar
#

you don't actually feed a sequence of numbers into the model

hasty mountain
#

Hm... I see...

desert oar
#

there's no little math gremlin deciding how to interpret your data on a case-by-case basis

hasty mountain
#

But shouldn't the model be able to detect that "in this case, the number 0.51 is the right one. On this other case, the number 20.4 is the right one"?

desert oar
#

the math is what it is. the axioms and theorems of linear algebra, real analysis, etc. are what they are. when combined with the binary operations that computers are capable of, you are stuck using whatever tools you are given. you cannot freely reinterpret them. they do what they do, and you must put them together exactly as they are to make useful outputs.

desert oar
#

however tree-based models such as random forests and gradient boosting do work that way, because they work by splitting the number line in half over and over at some optimal point

#

however, that still depends on the model having a meaning for "bigger than" or "less than"

#

if a is not meaningfully "bigger than" c, then splitting at "halfway between a and c" is not meaningful

hasty mountain
#

I see... so this relation between "bigger than" and "less than" can only be broken by using categories?

desert oar
#

precisely

hasty mountain
#

Then...how could I use vector embedding?

desert oar
#

[1, 0] is not bigger or less than [0, 1], it's just different. and in fact, in a linear algebra sense, they are the same size because they both have a magnitude (or "norm") of 1
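
A quick numeric check of that claim, using three one-hot "words" and, for contrast, made-up number-line codes like the ones discussed earlier (all values here are illustrative assumptions):

```python
import numpy as np

one_hot = np.eye(3)                      # three words as one-hot rows
norms = np.linalg.norm(one_hot, axis=1)  # every word has magnitude 1

# distances between any two different one-hot words are identical (sqrt(2))
d01 = np.linalg.norm(one_hot[0] - one_hot[1])
d02 = np.linalg.norm(one_hot[0] - one_hot[2])

# scalar codes impose an ordering and unequal distances that mean nothing
scalar_codes = np.array([0.51, 0.52, 20.4])
s01 = abs(scalar_codes[0] - scalar_codes[1])
s02 = abs(scalar_codes[0] - scalar_codes[2])
print(norms, d01, d02, s01, s02)
```

With one-hot vectors no word is "closer" to another by construction. With the scalar codes, word 0 looks far more similar to word 1 than to word 2 purely because of how the numbers were assigned.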

desert oar
hasty mountain
desert oar
hasty mountain
#

My vocab size is like...6 sentences. I don't want a google translator

#

But then... I suppose I'll have to try other ways to create my vocab dicts...for both the input(ENG) and the output(JP)

desert oar
hasty mountain
desert oar
hasty mountain
#

I don't have to do much.

#

If I wanted, I would use more sentences

#

I'm actually trying to learn more for a Reinforcement Learning model. I've tried the same idea for a RL model I'm testing.

desert oar
#

well you could start with the basics and use word2vec

hasty mountain
#

But, after this, I'll have to remake its data, since each action for a RL model is a category, they're not "bigger than" or "smaller than", just like happens with words

hasty mountain
#

Nevermind. I suppose that it'll attribute closer values to related words (man, woman, girl, boy)

#

Again, this would be done by an embedding layer, right?
I think my only problem, then, is how to convert my string to a value...so this value can be processed by an vector embedding layer.

desert oar
#

note that character n-grams will probably work better for international text than just words

#

look into fasttext, it's a fast and easy to use implementation of this

desert oar
hasty mountain
#

There's no way to escape from softmax

#

Even to run away from it, I'll have to rely on it.

desert oar
#

there's no way to escape from the fact that if you have 10k words in your vocabulary, at some point somewhere you need to actually represent all 10k of those words

#

you can think of softmax as an implementation detail inside fasttext or gensim or whatever you want to use to create your word vectors

hasty mountain
#

I see

#

Thanks for the help!

desert oar
#

remember: stick to the math

hasty mountain
#

I appreciate it!

serene scaffold
#

better than sticking to meth, amirite?

hasty mountain
#

Also...uh...

#

I would use index-encoding to train a vector embedding layer, which would receive, as input, the index, let's say...3.
This 3 would be passed into a linear layer/FCC, output something with size (1, vocab_size), this output would be passed into a softmax, and then I'd pass this output to a Categorical Cross Entropy Loss, comparing it with the input(the index 3)?

#

Or should I just pass this 3 to a FCC which would output something with size (1, vocab_size), pass this output to another FCC which would output a single vector?

desert oar
desert oar
#

3 -> [0, 0, 0, 1, 0, ..., 0] -> multiply by weight matrix -> dense output

hasty mountain
hasty mountain
desert oar
#

but mathematically it's the same
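
A sketch of that equivalence, assuming a random weight matrix `W` with one row per vocabulary entry (the sizes and the seed are arbitrary choices for the example):

```python
import numpy as np

vocab_size, embed_dim = 6, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(vocab_size, embed_dim))  # embedding / weight matrix

index = 3
one_hot = np.zeros(vocab_size)
one_hot[index] = 1.0                          # 3 -> [0, 0, 0, 1, 0, 0]

dense_via_matmul = one_hot @ W  # the mathematical description
dense_via_lookup = W[index]     # the row lookup an embedding layer does in practice
print(np.allclose(dense_via_matmul, dense_via_lookup))
```

Multiplying a one-hot vector by the matrix just selects one row, which is why index encoding can skip the multiplication entirely.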

desert oar
#

self-attention works on "stacks" of dense vector embeddings

#

mathematically it would work on one-hot data but the results wouldn't be good at all

#

like the matrix multiplications would technically work fine, but the results wouldn't be useful, and it would be computationally horrible

hasty mountain
#

Interesting...
What would happen if I apply softmax to this 3 before passing it as input for the embedding layer?

#

Would it encode it in some way? Or would it simply mess everything up?

desert oar
#

i need to get back to work for the afternoon. but again: look at the math. ideas follow from math.

#

math is not a magic gremlin with opinions. it is a system of strictly defined rules.

#

if you focus on the math you'll never be wrong.

hasty mountain
#

I know... but I was testing my wrong idea based on the math of neural networks

#

:)

#

Thanks for the class, teacher!

latent dirge
#

is this the right place to ask about plotly and pandas stuff?

serene scaffold
#

You can use screenshots to show plots, but please don't do it for dataframes. do df.head().to_dict('list') and give the text.

latent dirge
serene scaffold
latent dirge
#

well, the list is a bit too big to copy and paste into here

arctic wedgeBOT
#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold
serene scaffold
latent dirge
#

the error I'm getting is
```
ValueError: Value of 'y' is not the name of a column in 'data_frame'. Expected one of [0, 1, 2, 4, 6, 7, 8, 9, 11, 13, 14, 16, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33, 35, 36, 37, 38, 39, 40, 41, 43] but received: Elo
```

#

@serene scaffold I'm stumped what I should put in for y=, but the data is rather simple, so I don't know why I'm having trouble

serene scaffold
#

@latent dirge can you also do print(df.head().index) and put that in the chat?

latent dirge
#

list seems incomplete, though

#

here's the print(df) for what it's worth:

```
                            0            1   ...           41           43
Robin Montgomery          1500  1500.000000  ...  1465.762038  1465.762038
Ann Li                    1500  1531.631651  ...  1455.911286  1455.911286
Kaia Kanepi               1500  1500.000000  ...  1560.092503  1560.092503
Camila Giorgi             1500  1500.000000  ...  1507.349554  1507.349554
Kristina Kucova           1500  1484.000000  ...  1452.818621  1452.818621
...                        ...          ...  ...          ...          ...
Harmony Tan               1500  1500.000000  ...  1434.059565  1434.059565
Julia Grabher             1500  1500.000000  ...  1508.970744  1508.970744
Venus Williams            1500  1500.000000  ...  1440.504802  1440.504802
Magdalena Frech           1500  1500.000000  ...  1457.160548  1457.160548
Anastasia Pavlyuchenkova  1500  1500.000000  ...  1484.493448  1484.493448
```
serene scaffold
latent dirge
#

initially I thought x= and y= were simply titles for the corresponding axis, so this is a holdover, but I don't know what to put in instead

#

maybe I should use a list made out of the first column

serene scaffold
#

Is this basically what you're trying to make (but one line for every player, not just the first five)?

latent dirge
#

yes, exactly

serene scaffold
latent dirge
serene scaffold
#

idk tbh. never used plotly

#

you might have to reshape the data so that each row is one point rather than a whole line

latent dirge
#

basically transpose the entire thing?

serene scaffold
#

no, .T transposes it. in this case, you would melt it.

#
In [5]: df.melt()
Out[5]:
     variable        value
0           0  1500.000000
1           0  1500.000000
2           0  1500.000000
3           0  1500.000000
4           0  1500.000000
..        ...          ...
170        43  1481.045325
171        43  1545.959786
172        43  1483.237985
173        43  1468.006673
174        43  1570.508573
latent dirge
#

so I'm no better off than before 😕

serene scaffold
#

I guess I can download plotly

umbral charm
#

Hey

#

im guessing you guys are the scipy.optimize.curve_fit people?

#

I got a question

latent dirge
serene scaffold
serene scaffold
latent dirge
#

import plotly.express as px @serene scaffold

umbral charm
#

my curve fit doesnt work and i dont know why

#

the graph is complicated, but i dont see why it means it shouldnt work

#

my covariances are fked up and i end up just getting a straight line

#

can u helP?

wooden sail
#

what are you doing so far? but yes, having a "complex" graph can make the problem difficult if it can't be linearized or otherwise parametrized in a "nice" way

serene scaffold
#

@latent dirge I got this just doing px.line(df.T) , where abcde are what I'm using for the names of people in your data.
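
A rough pandas sketch of the two shapes discussed here, using a tiny made-up stand-in for the Elo table (players "a" and "b", three rounds). The column names `player`, `round` and `elo` are assumptions chosen for the example.

```python
import pandas as pd

# made-up stand-in: one row per player, one column per round
df = pd.DataFrame(
    {0: [1500.0, 1500.0], 1: [1500.0, 1531.6], 2: [1465.8, 1455.9]},
    index=["a", "b"],
)

wide = df.T  # rows become rounds, columns become players (one line per column)

# long form: one row per (player, round) point, keeping the player name
long = (
    df.rename_axis("player")
      .reset_index()
      .melt(id_vars="player", var_name="round", value_name="elo")
)
print(long)
# the long frame could then be plotted with something like
# px.line(long, x="round", y="elo", color="player")
```

Plain `df.melt()` drops the index, which is why the player names vanish from its output. Moving the index into a column first keeps them, and gives named columns that can be passed as `x=` and `y=`.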

umbral charm
latent dirge
wooden sail
#

show your code

umbral charm
#

ill show my graph and code

serene scaffold
umbral charm
#
import scipy.optimize
from matplotlib import pyplot as plt
import numpy as np
import math
from scipy.optimize import curve_fit
wavelength, intensity, uncertainty = np.loadtxt('line7.csv', unpack = True, skiprows = 1)
plt.errorbar(wavelength, intensity, yerr = uncertainty, fmt = 'ob')
plt.savefig('line7.JPG')

def line(lam, lam_l, w, a, c):
    return a*np.exp(-np.log2(2*((lam - lam_l) / w))**2) + c
popt, pcov = scipy.optimize.curve_fit(f = line, xdata = wavelength, ydata = intensity, sigma = uncertainty, p0 = [1,1,1,1])
print(popt)
x = np.linspace(375, 380, 100)
plt.plot(x, line(x, popt[0], popt[1], popt[2], popt[3]))
plt.show()
#

my code

#

my graph

#

why is my curevfit line

#

just straight

wooden sail
#

i'm surprised that doesn't give an error, since you gave only 4 parameters as p0

umbral charm
#

thats how it works

wooden sail
#

but anyway, the way you're optimizing, you're telling the function to optimize all of lam, lam_l, w, a, c, which is not what you want

umbral charm
#

how so?

#

p0 only needs 4 arguments

#

because it doenst care about lam

wooden sail
#

oh that's my bad, i misremembered how it works then. let me read the docs really quick and get back to you

umbral charm
#

been ages since u done this?

dull kraken
#
def miniMax(BoardSpot, depth, isMaximizing,alpha,beta,playerPiece):
    winCheck = cpuCheckWin()
    if(playerPiece == 'x'):
        opponentPiece = 'o'
        if(winCheck == 'x'):
            return 100
        elif(winCheck == 'o'):
            return -100
        elif(winCheck == 'tie'):
            return 0
        
    if(playerPiece == 'o'):
        opponentPiece = 'x'
        if(winCheck == 'x'):
            return -100
        elif(winCheck == 'o'):
            return 100
        elif(winCheck == 'tie'):
            return 0
    
    if(isMaximizing):
        i = 1
        bestScore = float('-inf')
        while(i<=len(BoardSpot)):
            if(BoardSpot[i] == "-"):
                BoardSpot[i] = playerPiece
                score = miniMax(BoardSpot,depth-1,False,alpha,beta,playerPiece)
                BoardSpot[i] = "-"
                bestScore = max(bestScore, score)
                alpha = max( alpha, score)
                if(beta <= alpha):
                    break
            i+=1
        return bestScore
#

heres my code for alphabeta ai in tictac toe but my output is ending the game very early and I am stumped

#

heres the output

wooden sail
umbral charm
wooden sail
#

seems you learn the params using wavelength as the xdata, but then you try to plot using x = np.linspace(...) instead. perhaps the exponential is already approximately zero for your values of x

umbral charm
#

Wavelength (nm) Intensity (arbitrary units) Uncertainty (arbitrary units)

375.00 0.1300 0.1000
375.10 0.0648 0.1000
375.20 -0.0143 0.1000
375.31 0.0651 0.1000
375.41 0.0791 0.1000
375.51 0.1587 0.1000
375.61 0.1839 0.1000
375.71 0.1931 0.1000
375.82 0.1286 0.1000
375.92 0.1885 0.1000
376.02 0.0246 0.1000
376.12 0.2253 0.1000
376.22 0.1513 0.1000
376.33 0.0702 0.1000
376.43 0.1489 0.1000
376.53 0.0924 0.1000
376.63 0.2132 0.1000
376.73 0.2520 0.1000
376.84 0.3187 0.1000
376.94 -0.0385 0.1000
377.04 -0.0382 0.1000
377.14 0.0761 0.1000
377.24 0.2059 0.1000
377.35 0.4269 0.1000
377.45 0.6331 0.1000
377.55 0.7251 0.1000
377.65 1.1436 0.1000
377.76 1.2805 0.1000
377.86 1.0059 0.1000
377.96 0.7350 0.1000
378.06 0.3562 0.1000
378.16 0.1891 0.1000
378.27 0.1523 0.1000
378.37 0.1492 0.1000
378.47 0.1214 0.1000
378.57 0.1121 0.1000
378.67 0.0330 0.1000
378.78 0.1378 0.1000
378.88 0.1122 0.1000
378.98 0.2129 0.1000
379.08 0.2199 0.1000
379.18 0.1185 0.1000
379.29 0.0625 0.1000
379.39 0.0361 0.1000
379.49 0.1423 0.1000
379.59 0.1077 0.1000
379.69 0.0656 0.1000
379.80 0.1044 0.1000
379.90 0.0380 0.1000
380.00 0.1698 0.1000

wooden sail
#

hmm that should work

umbral charm
#

Yea i know it works for all my other curves that dont look nice

#

I just dont know why it doesnt work for this

wooden sail
#

could be the initial guess is very bad, too. without bounds, curve_fit defaults to levenberg-marquardt, a damped gauss-newton method, for optimization. if you start far away from the true solution and the problem is non-convex, you can end up at a local minimizer that may be bad

umbral charm
#

But i cant just guess 4 values

#

that have to be totally different im guessing

#

how does one do that

wooden sail
#

are you sure that log2 is needed in the argument of the exp btw? you sure you don't want this to be a gaussian?

umbral charm
#

this is my equation

#

im plotting

wooden sail
#

aight, you already have a model then.

umbral charm
#

YEa

wooden sail
#

well, for example. a good guess for C is 0. for lam_l, something like 378

umbral charm
#

The question is just asking me for the optimal values from the scipy.optmize

#

Yea but i keep getting inf for pcov

wooden sail
#

and for w, go with 1 i guess

umbral charm
#

Ok

wooden sail
#

and 1.4 for A

#

(these are handwavy just looking at the plot)
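
A hedged sketch of why the starting guess matters, on synthetic data. The model below is a plain Gaussian, which is an assumption and not the equation from the assignment. The "true" parameters, noise level and seed are made up.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(lam, lam_l, w, a, c):
    # assumed model shape for illustration, NOT the assignment's equation
    return a * np.exp(-((lam - lam_l) / w) ** 2) + c

rng = np.random.default_rng(0)
lam = np.linspace(375, 380, 50)
true_params = (377.7, 0.3, 1.2, 0.1)  # made-up "truth"
sigma = np.full_like(lam, 0.05)       # per-point uncertainty
intensity = gaussian(lam, *true_params) + rng.normal(0, 0.05, lam.size)

p0 = [378, 1, 1.4, 0]                 # guesses eyeballed from the plot
popt, pcov = curve_fit(gaussian, lam, intensity, p0=p0, sigma=sigma)
perr = np.sqrt(np.diag(pcov))         # 1-sigma parameter uncertainties
print(popt, perr)
```

With a starting centre near 1 instead of near 378, a peak this narrow contributes essentially nothing over the 375 to 380 range, so the optimizer sees no gradient for the peak parameters. That is one plausible way to end up with a flat line and an infinite covariance.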

umbral charm
#

what about lam_l?

#

what number u think

umbral charm
#

Nope

#

all it does, it give me my optimal values

#

as my guesses, and my uncertainties as infinite

#

honestly i might as well just take 5 points on the equation and solve simultaneously

wooden sail
#

good luck with that 😛 the equation is highly nonlinear, and because the uncertainty is nonzero, there is no guarantee it has an exact solution anyway

umbral charm
#

im out of options, hopefully there is a calculator online

#

but i dont know what else to do

#

u know anyone else whos smart in this discord? surely there is

wooden sail
umbral charm
#

you could says its 2 log base 2 if you wanted to bring the square down

wooden sail
#

mhm

latent dirge
#

how can I make a plotly animation out of this:

#

whatever I put in for animation_frame= only gives me an error

umbral charm
#

@wooden sail

#

i have had a break through

wooden sail
#

what did you change

umbral charm
#

the guesses a bit

#

but like its a bit to the right

#

why is it only doing half the graph

wooden sail
#

the log goes negative

umbral charm
#

than how does the equation even work

wooden sail
#

i'm trying to figure that out

#

cuz one can do a change of log base there to take a natural log instead

umbral charm
#

Hm

#

do u think she meant natural log

wooden sail
#

and then it essentially turns into a quadratic divided by log2(e)

#

which doesn't seem to describe the data well

umbral charm
#

because in numpy, when u do log() it's natural log

#

not log base 10

wooden sail
#

yeah

umbral charm
#

ill try ln

wooden sail
#

wait, is the 2 not the base

#

smh this notation is awful

umbral charm
#

i mean

#

its still a bit wierd

steady basalt
#

@wooden sail

wooden sail
#

can you share your code again of your model function

steady basalt
wooden sail
#

this is super fishy

umbral charm
#
def line(lam, lam_l, w, a, c):
    return a*np.exp(-np.log(2*((lam - lam_l) / w))**2) + c
#

i changed it to log(e)

#

but it was log(2) b4

steady basalt
#

How is it meant to be root3 for the inner triangle height

#

I feel embarrassed 16 year olds can do this and I can’t

umbral charm
dim mauve
#

Hello,
I'm trying to create a tornado diagram to visualize my features' correlation, I'm not super proficient in matplotlib yet. Does anyone have some suggestions or some generic code where they've done something similar?

wooden sail
#

the triangle is isosceles with side lengths 2r

wooden sail
#

off the top of my head, at least. do double check it

#

that's enough to solve for r though

steady basalt
#

But we’re not given anything of the triangle

#

How to find height

wooden sail
#

no, so you solve for r

#

all of this is formulated in terms of r

steady basalt
#

R is based off of that triangle

wooden sail
#

no

steady basalt
#

No, people's answers were in meters

wooden sail
#

that triangle is based off of r

steady basalt
#

I swear it, 20 commenters replying in meters using trig

#

Make me feel dumb

#

It’s for kids

wooden sail
#

it is

#

you just said yourself you have an equation for the total height in terms of r. just solve for r now

#

what's the problem?

steady basalt
#

2r + h = 2

#

can’t find both r and h

wooden sail
#

formulate h in terms of r dude

#

you know the side lengths of the big isosceles

steady basalt
#

R = h? Wtf

wooden sail
#

no, what?

#

what is the side length of the isosceles joining the circle centers?

steady basalt
#

2r

wooden sail
#

yes

steady basalt
#

Base = r now

wooden sail
#

no

#

not of the isosceles

steady basalt
#

Need 90 deg for pythag

wooden sail
#

that's if you take half of the isosceles. that gives you a special triangle

#

a 30 60 90 with hypotenuse 2r and base r

steady basalt
#

Equilateral

#

Is in middle

#

No?

wooden sail
#

sure. an equilateral is also isosceles, but yes

steady basalt
#

Wait so 2r is also the length of the non arc part of band

#

If it’s perpendicular??

wooden sail
steady basalt
#

Make a rectangle down

#

From corner to corner

#

6 so far meters

#

So now it’s purely the three arcs added to that?

#

Why are people doing pythag and getting root3 r

#

Where tf is root3 from

#

Wudnt it be root1

wooden sail
#

... because of the 30 60 90 triangle i drew there and told you about before

steady basalt
#

2r squared minus 1r squared

#

Rooted…..

wooden sail
#

mhm

#

go on

steady basalt
#

That isn’t 3

#

That’s root r squared

wooden sail
#

sqrt( (2r)^2 - (r)^2 ) = sqrt(3r^2) = sqrt(3) r

steady basalt
#

Why is 2 - 1 three

wooden sail
#

because you're forgetting order of operations

steady basalt
#

Ah okay

wooden sail
#

it's 4r^2 - r^2

#

exponentiation first
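
a quick numeric check of that step, as a sketch. it assumes the height equation quoted earlier (`2r + h = 2`) with `h` being the altitude of the triangle joining the circle centers; the variable names are just for illustration.

```python
import math

r = 1.0  # any positive radius works for checking the identity
altitude = math.sqrt((2 * r) ** 2 - r ** 2)   # square first: 4r^2 - r^2 = 3r^2
assert math.isclose(altitude, math.sqrt(3) * r)

# With 2r + h = 2 and h = sqrt(3) * r, solve for r:
r_solved = 2 / (2 + math.sqrt(3))
print(round(r_solved, 4))  # 0.5359
```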

steady basalt
#

God danm there it is

#

U know I struggled w this so long due to that

wooden sail
#

that says a lot

#

take a step back and review your basics

umbral charm
steady basalt
#

Yes that is why I’m doing algebra for children bro….

#

Thanks for the help

#

Back to precalc in a week anyway if I stay on track

wooden sail
steady basalt
#

I’m very excited to get to approximating functions next year, that’s the main thing I’m looking forward to in my book

#

For some reason in the uk that is further and not core maths so never saw it before but it’s only few hundred pages in

umbral charm
misty tulip
#

still better than i can draw

#

above is generated from a gan

#

its supposed to be a person

steady basalt
#

chimera

dusty valve
#

i was trying to use tensorflow to make an rnn to predict stock prices to learn more about rnn's. but it's weird. i made this csv file of dummy stock price data (data.csv), and price_predictor.py trains it on it. the model is 4 lstm layers (64 units and relu activation each) followed by 0.2 dropout (each), and a 1 unit dense layer with sigmoid. data.csv is split into a numpy array of shape (60, 1), which represents 60 days (rows) of the stock price. the y value for that is the day after. the test data is the 365 days (rows) of data. it trains okay, loss drops to around 0.100 (for 1 epoch training) but accuracy is also constantly going down. the main problem is then when i run model.predict() it returns an empty numpy array? why is that?

#

here's some code to visualize the dummy data as well
```py
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np

dataset = pd.read_csv('./data.csv').to_numpy()
plt.title('Test stock price data')
plt.xlabel('Days')   # rows are days, so they go on the x-axis
plt.ylabel('Price')
plt.plot(dataset, color='blue')
plt.show()
```
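
one thing that might be worth checking (a guess, since price_predictor.py isn't shown here): whether the array handed to `model.predict()` actually contains any samples. a minimal sketch of the 60-day windowing described above, using only NumPy; `make_windows` is a hypothetical helper, not something from the original script.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(prices, window=60):
    """Return X of shape (n, window, 1) and y of shape (n,), where y is the next-day price."""
    prices = np.asarray(prices, dtype=np.float64).ravel()
    if prices.size <= window:
        # Not enough rows for even one (window, next day) pair.
        return np.empty((0, window, 1)), np.empty((0,))
    X = sliding_window_view(prices[:-1], window)[..., np.newaxis]
    y = prices[window:]
    return X, y

prices = np.arange(365, dtype=np.float64)  # stand-in for the 365 test rows
X_test, y_test = make_windows(prices)
print(X_test.shape, y_test.shape)  # (305, 60, 1) (305,)

X_short, _ = make_windows(prices[:60])
print(X_short.shape)  # (0, 60, 1): zero samples
```

printing `X_test.shape` right before the predict call would show whether the input looks like the second case.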

serene plume
#
adj_matrix: numpy.typing.NDArray[np.float64]

adj_matrix = self_cos_sim(sentences_encs)
adj_matrix = np.triu(adj_matrix)
np.fill_diagonal(adj_matrix, 0)
adj_matrix[adj_matrix < threshold] = 0

This create-then-modify-3-times of adj_matrix feels like an anti-pattern.
Is there a way I can optimize this, fuse some or all operations?
Please ping me if you reply

undone ocean
#

i'd like to know too

misty tulip
#

Learn how to use TensorFlow 2.0 in this full tutorial course for beginners. This course is designed for Python programmers looking to enhance their knowledge and skills in machine learning and artificial intelligence.

Throughout the 8 modules in this course you will learn about fundamental concepts and methods in ML & AI like core learning alg...

#

@undone ocean @silent fable

arctic wedgeBOT
#
Huh? No.

No documentation found for the requested symbol.

serene plume
desert oar
#

all the rest follows from not working with distance matrices efficiently

#

follow the links to read about squareform

#

using scipy to efficiently compute the cosine distances (?) will also produce the right data structure

#

then you only have two steps: 1) compute the distance matrix, 2) zero out the elements below threshold
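
a minimal sketch of those two steps, assuming `self_cos_sim` returns plain cosine similarities (so similarity = 1 - cosine distance). the random encodings and the threshold value are placeholders.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
sentences_encs = rng.normal(size=(5, 8))  # hypothetical stand-in for the sentence encodings
threshold = 0.1                            # hypothetical value

# Step 1: condensed vector of pairwise cosine distances (each pair once, no diagonal),
# converted to similarities.
sims = 1.0 - pdist(sentences_encs, metric="cosine")

# Step 2: zero out everything below the threshold.
sims[sims < threshold] = 0.0

# Optional: expand to a full symmetric matrix with a zero diagonal.
adj_matrix = squareform(sims)
print(adj_matrix.shape)  # (5, 5)
```

note that `squareform` gives back a symmetric matrix, not an upper-triangular one, so an `np.triu` would still be needed if the downstream code relies on the lower triangle being zero.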

grim orbit
#

hey guys i could use some advice here

#

```py
import spacy
import xml.etree.ElementTree as ET

tree = ET.parse('topics-rnd5_covid-complete.xml')
root = tree.getroot()

nlp = spacy.load("en_core_web_sm")
for topic in root.findall('topic'):
    query = topic.find('query').text

text = query
doc = nlp(text)
print(doc.text)
```

Hello, I have a question: I have an XML document where I want to change everything inside the query element and save it afterwards (change as in: run it through spaCy's tokenizer)

right now I was able to load the XML and access the "query" element

but when I run the tokenizer it only gives me the output for the last row. can that be solved with a for loop?

and even so, how can I replace the values inside the element afterwards?

desert oar
serene plume
desert oar
#

you can call squareform on it to get a matrix back out of it. but another option would be to construct a scipy sparse matrix, if this is a bigger dataset

#

in general i think the answer to your first question is "no", unless you do a double loop manually (maybe using numba if you need it to be fast)
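
that said, if you stay with the dense matrix, two of the three modifications can at least be merged: `np.triu` takes a `k` argument, and `k=1` drops the diagonal along with the lower triangle. a small sketch (the function name and sample matrix are made up):

```python
import numpy as np

def threshold_upper(sim, threshold):
    """Keep strictly-upper-triangular entries that are >= threshold; zero the rest."""
    upper = np.triu(sim, k=1)  # k=1 drops the diagonal as well as the lower triangle
    return np.where(upper >= threshold, upper, 0.0)

sim = np.array([[1.0, 0.8, 0.2],
                [0.8, 1.0, 0.5],
                [0.2, 0.5, 1.0]])
print(threshold_upper(sim, 0.4))
```

it still allocates intermediate arrays, so it's tidier rather than faster.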

grim orbit
# desert oar https://stackoverflow.com/a/40244340 does this help?

import xml.etree.ElementTree as et # import the elementtree module
root = et.fromstring(command_request) # fromString parses xml from string to an element, command request can be xml request string
root.find("cat").text = "dog" #find the element tag as cat and replace it with the string to be replaced.
et.tostring(root) # converts the element to a string
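
applied to the topics file, the same idea could look roughly like this. it's a sketch: the inline XML is a made-up stand-in for the real file, and `process` is a placeholder where the spaCy call would go.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-topic document standing in for topics-rnd5_covid-complete.xml.
xml_text = """<topics>
  <topic number="1"><query>coronavirus origin</query></topic>
  <topic number="2"><query>coronavirus response to weather changes</query></topic>
</topics>"""
root = ET.fromstring(xml_text)

def process(text):
    # Placeholder for the spaCy step, e.g. " | ".join(t.text for t in nlp(text)).
    return " | ".join(text.split())

for topic in root.findall("topic"):
    query = topic.find("query")
    query.text = process(query.text)  # do the work inside the loop, then write it back

print(ET.tostring(root, encoding="unicode"))
# With a parsed file, tree.write("output.xml") would save the result (file name is hypothetical).
```

the key point is that the processing happens inside the loop, once per topic, and the result is assigned back to `.text`.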

grim orbit
serene plume
desert oar
tacit basin
hidden willow
#

How is AMD doing in the AI/DL space these days? Have they made any gains on Nvidia or is it still pretty much a waste from a consumer perspective?

tacit basin
past rover
#

Does anyone have recommendations on which situations (research, enterprise development, etc.) call for which machine learning/artificial intelligence framework? And possibly which ones would be best for more specific fields, like medical research or robotics vision research for example. More specifically, the generic battle between TensorFlow and PyTorch, or this new one I've seen called JAX, but I may be unaware of other good ones, idk. I'm just seeking information and am curious to learn more about the space. I've read some articles but would like some opinions here.

slim reef
#

I wanna know

#

Like this, I imported the iris data set from sklearn

#

How can I further use pandas to load the iris data set?

#

I personally prefer to use pandas to load the data set
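
one way to get it straight into pandas, as a sketch: `load_iris` accepts `as_frame=True` in scikit-learn 0.23 and later. the extra `species` column is just an illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris

# as_frame=True (scikit-learn 0.23+) returns pandas objects instead of NumPy arrays.
iris = load_iris(as_frame=True)
df = iris.frame  # four feature columns plus a "target" column
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))
print(df.head())
```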

tacit basin
slim reef
#

Idk I just imported it from sklearn

#

And I don't know how sklearn stores it

tacit basin
slim reef
#

This is what it prints out

#

Ya I guess it is .csv

#

But this just doesn't work

tacit basin
slim reef
#

Umm

#

Ahh I know

#

Thx bro

proud saddle
#

trying to plot this rn using python

#

if anyone in here could help with that, hmu i have the code ive written so far i just need to make adjustments

austere swift
#

pytorch now has official rocm builds as of 1.8

#

and a lot of other libraries have simple ports that allow them to use rocm

#

for the most part though most places use nvidia still

proud saddle
halcyon nymph
#

why this error showing?

wooden sail
#

print array.shape

halcyon nymph
#

i see

#

i forgot reshape it

icy quiver
#

hi all! Probably a silly question but… I am desperately looking for the actual source code of this scipy function https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.kv.html

I was looking through scipy‘s github repo but was unable to clearly identify where scipy.special.kv is implemented. And the existence of various kinds of Bessel functions doesn’t make it less ambiguous 😅

desert oar
pastel warren
#

Can someone help me with Bagging ?

#

basically im trying to use bagging without the sk learn functions
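
for reference, a minimal sketch of bagging by hand with NumPy only. the base learner here is plain least squares, picked just to keep the example short; the data is synthetic.

```python
import numpy as np

def bagged_predict(X, y, X_new, n_models=25, seed=0):
    """Bagging by hand: fit one model per bootstrap sample, then average the predictions."""
    rng = np.random.default_rng(seed)
    n = len(X)
    design_new = np.column_stack([np.ones(len(X_new)), X_new])
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample n rows with replacement
        # Base learner: ordinary least squares with an intercept (any model could go here).
        design = np.column_stack([np.ones(n), X[idx]])
        coef, *_ = np.linalg.lstsq(design, y[idx], rcond=None)
        preds.append(design_new @ coef)
    return np.mean(preds, axis=0)  # aggregate (a majority vote would replace this for classification)

# Toy data: y = 3x + 1 plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X[:, 0] + 1 + rng.normal(scale=0.5, size=200)
X_new = np.array([[2.0], [5.0]])
print(bagged_predict(X, y, X_new))  # close to [7, 16]
```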

steady basalt
#

just plot functions that look like it

#

so some log graphs?

proud saddle
steady basalt
#

there arent specific values?

proud saddle
#

there are

#

in the csv file

steady basalt
#

Oh

#

Ok so youre not plotting functions to look like that shape

proud saddle
#

im not just plotting random values, its reading exact values from the csv and plotting those

steady basalt
#

then simply input data much easier

proud saddle
#

yeah the problem is the countries

#

all the different countries are in the same column

#

I want them to be their own separate lines

#

i have no idea how to go about doing that

steady basalt
#

one column per country?

#

just a quick stack over flow search, does this work: df = df.pivot(columns='movie', index='date', values='value')

#

youd need country in the columns parameter

proud saddle
#

comes up key error date

#

ah wait sorry

#

i removed date and values and it shows up this error

#

KeyError: "None of ['Country'] are in the columns"
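
that KeyError usually means the column isn't spelled exactly `Country` in the file (different case, stray whitespace, etc). a sketch with made-up data and column names, so swap in whatever `df.columns` actually shows:

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, country) pair.
df = pd.DataFrame({
    "date": ["2020-01", "2020-01", "2020-02", "2020-02"],
    "country": ["France", "Japan", "France", "Japan"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

print(df.columns.tolist())  # check the exact spelling first: "country" is not "Country"

wide = df.pivot(index="date", columns="country", values="value")
print(wide)
# wide.plot() would then draw one line per country (requires matplotlib).
```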

long granite
#

Hi,

I need help in detecting wires coming from poles. In outdoor conditions, i know I can use edge detector like canny edge filter and do some processing to detect horizontal edges but my concern is the wire are very thin. Is there any better way to detect the wires? I have attached a sample image.

keen nymph
#

hello, I need help: if I want to execute a function "EXAMPLE" with a hotkey, e.g. pressing F10 runs "EXAMPLE", how do I do that?

soft notch
#

hey guys I'm trying to try some emotion recognition from text, anyone got a model I can use as reference? or a prebuilt model? (I couldnt find any on google)

lapis sequoia
#

are there any alternatives of %store magic command in python which can be used to float variables from one notebook to another?

mighty patio
lapis sequoia
desert prairie
#

I would like to append dataframe to CSV file present in Aws S3, using AWS lambda, can anyone help me regarding this. #python

mighty patio
desert prairie
#

This I got it.

#

My CSV file is present in AWS S3 BUCKET

#

I need to append in that file
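
as far as I know a regular S3 object can't be appended to in place, so the usual pattern is read, concatenate, write back. a sketch under that assumption; `append_df_to_s3_csv` is a hypothetical helper, and `get_object` / `put_object` are the boto3 S3 client calls it relies on.

```python
import io
import pandas as pd

def append_df_to_s3_csv(s3_client, bucket, key, new_rows):
    """Read the existing CSV, append new_rows, and upload the result under the same key.

    s3_client is expected to behave like boto3.client("s3").
    """
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    existing = pd.read_csv(io.BytesIO(obj["Body"].read()))
    combined = pd.concat([existing, new_rows], ignore_index=True)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=combined.to_csv(index=False).encode("utf-8"),
    )
    return combined

# Inside a Lambda handler this might be called as (bucket and key are hypothetical):
#   import boto3
#   append_df_to_s3_csv(boto3.client("s3"), "my-bucket", "data/file.csv", new_df)
```

this rewrites the whole file on every call, so two Lambdas writing at the same time could overwrite each other.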

lapis sequoia
mint palm
#

are transformers better than RNNs and LSTMs, no exceptions?

sacred canopy
#

Hi, beginner here.
I'm doing some price analysis (pandas + numpy) and I've noticed that one of my functions is quite slow. The function checks if the current price is above or below different levels that are calculated per dataframe. It isn't the cleanest function I presume, but it would be cool to find out if it is possible to improve its performance with either cleaner code or using numpy in some way? A, B, C, D are placeholders. The function "rates" price strength.
https://paste.pythondiscord.com/ijajefehem

fading crane
#

are neural networks hand written?

#

pretend i'm a researcher. I have a problem I want to solve. Do i usually take upon an existing model and iteratively improve on them?

#

Or do i start from scratch and build upon it from the bottom?

serene scaffold
fading crane
#

hand written in the sense of like

#

not the implementation code

#

but the architecture of the network

#

how many layers you want and which building blocks you use

serene scaffold
#

depends on the project

fading crane
#

im guessing researchers are more likely to want to design new architectures right?

#

whereas ML devs are more likely to fine tune something to a business need

serene scaffold
#

It really depends. "Attention is all you need" was a groundbreaking paper, but it was written mostly by Google employees.

serene scaffold
#

Though I suppose those Google employees were hired specifically to be university-style researchers

fading crane
#

I'd be surprised if that wasn't the case

ripe sapphire
#

how can I create an algorithm with reinforcement learning?

fading crane
#

that question sounds a bit vague no?

ripe sapphire
#

Actually I am new to Ml

#

trying to learn as much as possible

fading crane
#

is there a specific problem you want to approach?

ripe sapphire
#

no

#

How can I start with ML in python?

forest pebble
#

Anyone know how to work with jinja2

sacred canopy
#

I know a bit about Jinja2, but rookie. What are you trying to do?

lapis sequoia
wooden sail
#

the individual building blocks are usually well understood and recycled as needed. if you do bleeding edge research though, you make entirely new things and you also need to implement them yourself

lapis sequoia
#

Hi all , what would be the best method to debug / figure out why my keras models loss / MAE is so high ?

#

Trying to use a model to predict Emissions from certain columns in a dataset and it can get quite close atm but i cant get the MAE lower than 11

desert oar
# lapis sequoia Hi all , what would be the best method to debug / figure out why my keras models...

my flowchart is something like this:

  1. data processing mistakes. units are wrong, forgot to scale something, applied scaling twice, forgot to apply some transformation to outputs. also blatant bugs, like swapping train and test sets.

  2. model fitting didn't converge. it's bouncing around and not going down, or it's going up, or there were numerical warnings reported in the training process. this means you have a seriously bad issue, either with the model not fitting the data, or something poorly designed in the training process (e.g. bad hyperparameters, badly-behaved data).

  3. the model actually doesn't fit the data. this is where you have to start doing actual data analysis to figure out what you did wrong: looking at pairwise association measures, doing statistical analyses, looking for erroneous data points (hesitant to say "outliers"), fitting the model to simulated data to make sure your model design even works, etc.
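
a cheap sanity check that goes with step 1, as a sketch: compare the model's MAE with the MAE of a constant prediction, since an MAE of 11 means very different things depending on the scale of the target. the numbers below are invented.

```python
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def baseline_mae(y_train, y_test):
    """MAE of always predicting the training median, the constant that minimizes MAE."""
    return mae(y_test, np.full(len(y_test), np.median(y_train)))

# Hypothetical target values, for illustration only.
y_train = np.array([100.0, 120.0, 130.0, 150.0, 400.0])
y_test = np.array([110.0, 140.0, 160.0])
print(baseline_mae(y_train, y_test))  # 20.0
```

if the network's MAE is not clearly below this baseline, the problem is more likely in steps 1 or 2 than in the architecture.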

lapis sequoia
#

This shows the value going up and down

#

In terms of my model i don't think theres any issues with it unless its to do with the density values , this is whats causing me a bit of a headache

flat birch
#

Hi. I am working with ngrok and Flask in Python for web development.
I am facing an error, please help!
I installed ngrok and served plain text, which ran nicely. But when I try to render a template page, the webpage can't be accessed: internal error. It also seems stuck in the ngrok directory, where there are folders like bin, boot, dev etc.
Could someone please help me out here?

silent stump
#

Hi guys when calculating the standard error should you do ddof=1 or leave it as the default on np.std()?

#

i heard that you should do ddof =1, however i thought that the standard deviation in the calculation uses the population variance, so i wouldve thought theres no ddof=1. Since a population variance doesnt have n-1, where as sample does. Thanks

tidal bough
#

If you mean what's used when people calculate standard errors... I'd hope it's the sample variance, because that's the statistically correct thing to do - dividing by n-1 instead of n (sample variance) gets you an unbiased estimator of the real variance of the distribution.
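
in NumPy terms, a small sketch with a made-up sample (`scipy.stats.sem` also defaults to `ddof=1`, for what it's worth):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up sample
n = len(x)

sd_population = np.std(x)        # default ddof=0: divides by n
sd_sample = np.std(x, ddof=1)    # divides by n - 1
standard_error = sd_sample / np.sqrt(n)

print(sd_population)             # 2.0
print(round(standard_error, 4))  # 0.7559
```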

hasty mountain
#

@desert oar I'm testing what you told me yesterday, but...I'm still having the same issue with my output(the model outputs a single character every time). Maybe I messed up something? Or maybe I should get rid of LSTMs and try to use exclusively FCC layers?

My dictionary of words/vectors is something like this:

{'私': 6.819466590881348, 'の': 6.763469219207764, '犬': 7.155118465423584, 'は': 7.7465996742248535, '骨': 6.333333492279053, 'が': 6.707737922668457, '好': 7.280846118927002}

Each value is a vector generated by my word2vec model, which receives as input a word that has been one-hot encoded from a word list and outputs a vector. The model is actually a single FCC layer, since my data is so small.
I didn't use a softmax at all because Pytorch's Cross Entropy has a LogSoftmax included in its function.

With this dictionary, I get a vector for an english sentence and pass it as input for the translator model(which has LSTMs layers), and then try to output a vector for a japanese sentence.

hasty mountain
#

Oh, ok...Using FCC layers instead of LSTMs, and using more layers with more features generates a better output...though I feel like I'm trying to kill a bird with a rocket launcher...

feral cave
#

Hey guys, I was wondering how to read from a text file where the index and number of columns in each row is uncertain?

#

my goal is to find the highest score of chemistry, it is a bit tricky...

steady basalt
#

Into a list if u like

young granite
#

i got a df with ~200 datapoints, currently i search for >= and <= values then use first_index[0] and last_index[len(last_index)-1] to get the indexes and set the range for the filtered df.
After that i create a new df with np.NaN values and np.linspace() to finally concat the two dfs and do an interpolation of the missing NaN values.
So my question would be if u guys know a way to make it smoother

feral cave
serene scaffold
young granite
serene scaffold
young granite
feral cave
latent dirge
#

anyone here use plotly or has used it?

serene scaffold
#

I had never used plotly prior to yesterday, but I was still able to partially answer your question. Pre-filtering for answerers without giving the actual question just wastes everyone's time.

young granite
# serene scaffold so you're using `apply` or what? can you just show the code?
```py
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# df is the existing DataFrame with the columns "X" and 'G"'
array_1 = df.index[df["X"] >= -140]
first_index = array_1[0]
array_2 = df.index[df["X"] <= 170]
last_index = array_2[-1]
new_df = df.loc[first_index:last_index].copy()  # copy so the edits below don't touch df
new_df.at[first_index, "X"] = -140.00
new_df.at[last_index, "X"] = 170.00
new_df = new_df[['G"', "X"]]
x = new_df.iloc[:, 1]
y = new_df.iloc[:, 0]

f = interp1d(x, y, kind='cubic')
x_int = np.linspace(start=x.min(), stop=x.max(), num=621)
y_int = f(x_int)
int_df = pd.DataFrame({"X": x_int, 'Y': y_int})

x_range = np.linspace(start=-160, stop=-140.5, num=40)
range_df = pd.DataFrame({"X": x_range, 'Y': np.nan})  # np.NaN was removed in NumPy 2.0
concat_df = pd.concat([range_df, int_df])
concat_df = concat_df.reset_index(drop=True)
concat_df.at[concat_df.index[0], 'Y'] = concat_df['Y'].iloc[-1]
concat_df = concat_df.interpolate(method='linear', axis=0)
```
young granite
latent dirge
#

From https://plotly.com/python/v3/gapminder-example/ there's this line of code

a_column = Column(list(dataset_by_year_and_cont[col_name]), column_name)

The problem I have is that the Column type doesn't seem to exist in plotly, and I don't quite know what to do next

serene scaffold
mighty patio
# fading crane pretend i'm a researcher. I have a problem I want to solve. Do i usually take up...

Depends strongly on your use case. With image or language processing you are much more likely to use an existing network as the basis.
That said, if you come to me with some data that is not in one of those categories, I will try several other machine learning approaches (starting with linear regression) before I even consider making a neural network.
You would be surprised at how much machine learning is just fancy talk for linear regression.

young granite
#

it feels so wrong 😄

steady basalt
#

but yes, actually there is a way: you can read a txt into pandas with pd.read_csv and sep=',' probably

#

MAYBE

#

moooooods

long locust
#

Hello, please don't post unapproved advertising per rule 6. If you think this was removed by accident, DM @sonic vapor

feral cave
steady basalt
#

Have u tried looking into pandas from text

feral cave
#

and got this but am stuck here

steady basalt
#

So u want one col per subject

feral cave
#

yeah

#

but anyway, my final goal is to find who has the highest chemistry grade

#

I was wondering how to place the chemistry to one single column?

alpine spruce
#

there must be better ways to do it, but the fastest way for me is: turn the first column's elements into a list row by row, i.e. split the string on spaces

#

and then you will get 3 separate lists in one column, row by row. using a lambda func etc, you can combine the first 2 list elements, which are the name and surname, or just join the list elements from 0 to -1

#

because your last list element will be the name of the lesson

#

you can do the same thing to the other columns; that way you can separate the grades from the lesson names

steady basalt
#

U can also expand out elements but I’m not sure if that’s useful here idk if it creates a tonne of columns

alpine spruce
#

expand is gonna create the same rows, it's hard to handle. btw i did it, hold on, i will send the code

mighty patio
# feral cave yeah

I strongly encourage you to learn how to read files line by line
It is a much more general way than relying on something like pandas, and with unstructured data like this you have to write code to manually parse each line anyways

with open('grades.txt') as f:
    for line in f:
        if 'chemistry:' in line:
            chemistry_grade = line.split('chemistry:')[1][:2]
            name = line.split(' ')[:2]
            print(name[0], name[1], chemistry_grade)
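
To go from printing every chemistry grade to finding the highest one, a sketch along the same lines. The sample lines are invented, since the real layout of grades.txt isn't shown, and a regex is used so that grades with one or three digits also parse.

```python
import io
import re

# Hypothetical file contents; the real grades.txt layout is not shown above.
sample = io.StringIO(
    "Ann Lee math:81 chemistry:92\n"
    "Bob Ray chemistry:78 physics:90 art:66\n"
    "Cy Doe physics:70\n"
)

best_name, best_grade = None, None
for line in sample:  # replace `sample` with open('grades.txt')
    match = re.search(r"chemistry:\s*(\d+)", line)
    if match is None:
        continue  # this student has no chemistry grade
    grade = int(match.group(1))
    if best_grade is None or grade > best_grade:
        best_name = " ".join(line.split()[:2])
        best_grade = grade

print(best_name, best_grade)  # Ann Lee 92
```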
steady basalt
#

It seems his txt isn’t line by line when he pasted it

#

But yeah this takes a bit of fuckery

feral cave
#

thanks guys, I am trying all you guys suggested

feral cave