#dev-contrib
1 messages ยท Page 10 of 1
Because the container runs under a different user
Thus when the coverage results are generated within the container, theyre owned by a different user
And trying to run e.g. coverage report outside the container will cause a permission error
I believe I did look into making the uid/gid within the container the same but I don't remember why that couldn't be done
Maybe something to do with the container being in privileged mode
ok I figured it out, apparently it was just to do with the fact that the arguments can be an empty string now if there is no -c
and passing an empty string part to subprocesses breaks it?
>>> import subprocess
>>> subprocess.run(["python", ""])
C:\Users\Mark\scoop\apps\python\current\python.exe: can't find '__main__' module in ''
CompletedProcess(args=['python', ''], returncode=1)
>>>```
I guess Python treats the empty string as a directory name or something?
@tawdry vapor do you know what is up with snekbox#113 currently?
we can add a mount for /dev/shm quite easily in the new instance directory to support mp
I have one outstanding comment and we never figured out why the pids limit has to be hard coded the way it is
what do you mean?
it's just a set limit right?
well no it makes sense right
if the process limit is 5 and you try to open 5 new processes it will fail at the 4th one because the current python process also counts as 1
so raising to 6 still fails
to be able to launch 6 subprocesses you would need a pid limit of 7, not 6
I made that comment before I made this commit https://github.com/python-discord/snekbox/pull/113/commits/9e69352ea2399e271f3cdbd5da2af90b69e69dfd
The old test was hard-coded to create 6 subprocesses, and it still passed despite the PID limit being 6
Oh the test is testing for an exception/error
So it successfully fails to create the 6th cause the limit is 6, and 1 PID is occupied by the parent process
Okay, makes sense
I think me adding + 1 to the range is incorrect then?
right yeah, strictly speaking it failed 1 before that
currently have file size errors propagated like this, is it needed?
Is that error generated by the bot?
no it's on the snekbox side
I'm catching an internal error raised when outputs exceed the local limit
not sure what to do with the response
Is the internal error due to rlimit?
snekbox/snekio.py lines 38 to 43
size = file.stat().st_size
if size > max_size:
raise AttachmentError(
f"File {file.name} too large: {sizeof_fmt(size)} "
f"exceeds the limit of {sizeof_fmt(max_size)}"
)```
That's a good question. I'm not sure of a good way to handle it right now. It would require some more careful thinking.
At the least, I can say that there should be some kind of feedback to the client about failed attachments.
Why can't we rely on rlimit for this? I think that would be better.
It would be more in line with how we deal with memory and CPU limits
we can yeah, though it would limit all user files instead of just the ones they're marking to upload by the output prefix
I suppose either could be fine
on other notes gifs seems to work
had to up the memory limit to get that to render though
@fossil veldt I gave this some more though. Let me know what you think. Input from others is welcome as well, of course.
There should be a limit on the size of the directory because hosts won't have infinite storage. There should also be a limit on the number of attachments, because snekbox could otherwise get stuck processing thousands of tiny files. These should both be configurable via an environment variable. Defaults should be set to what our infrastructure can reasonably support, rather than what makes sense for our specific use case on the bot. For example, snekbox may have no problem supporting 100 attachments, even if that would be far too many to actually use on the bot.
Instead of a limit on the number of attachments, I was considering a timeout for processing all the attachments. However, this could still lead to large responses (a large array of attachments in the JSON), which may or may not be a problem.
Snekbox should not have a file size limit. Instead, it should indiscriminately respond with attachments. Its ultimate response size is already capped thanks to stdout truncation and the size limit of the directory. Thus, it doesn't really make a difference to snekbox whether the overall size limit is reached via 1 large file or 10 small files. It should be up to the client to decide how to handle files that are too large for it. This approach has some advantages over handling it in snekbox:
- Less configuration required (thus a simpler API) - snekbox doesn't need to know file size limits from the client
- Client can format errors as it pleases since it will have the info about the attachment available in the response
- No need to modify the API schema to include an error message/status for each attachment
- All attachments are available (even large ones) - client can elect to use some and ignore others
Cons:
- Wasted bandwidth on including attachments that won't actually get used because they're too large.
it should indiscriminately respond with attachments
you mean like, within a time limit?
Depends on if we go with a timeout or just a cap on number of attachments.
can we even have a reliable timeout..?
I suppose we can launch a mp process and kill it after n seconds
though we'd need to do some memory sharing for that
I think just a cap on the number of attachments/files to process is sufficient. Given that and the directory's overall size limit, I don't anticipate processing time to still be an issue.
that's fair yeah
yeah the only thing is sending the 32MB (or something) file would be a slight waste of resources if the discord bot will never use it
but I suppose there might be clients other than a discord bot so
I don't believe the bandwidth will be a concern with the amount of traffic we expect.
It doesn't seem worth the added complexity to account for this.
In the future, we can revisit this if it becomes a concern or it's requested
so 100 attachments maybe?
btw I think I'll just make a config.toml
[memfs]
# Per-instance memory file system max size in bytes
instance_size = 50331648
# Per-instance /dev/shm tmpfs in bytes
shm_size = 16777216
[attachments]
# Maximum number of files attachments will be scanned for
max_files = 100
Oh we don't actually need env vars, sorry
Config is done like this
wsgi_app can be given arguments which are forwarded to the NsJail object. For example, wsgi_app = "snekbox:SnekAPI(max_output_size=2_000_000, read_chunk_size=20_000)".
Which I think is sufficient. It seems redundant to have an additional config file when users could just use the gunicorn config file
wait uh how does that work?
so we put arguments in the gunicorn.conf.py file, and where do we access it
Maybe the way I wrote it in the readme is unclear. By forwarded to NsJail object, I mean that they get passed to its __init__, and as a developer you don't have to do anything else than just add them to the __init__ function signature and body
snekbox/api/snekapi.py line 30
nsjail = NsJail(*args, **kwargs)```
ah I see...
so I can just add a kwarg to NsJail, and add a config line for it in the gunicorn.conf.py, and that'll work?
The way defaults are currently set are in the __init__ itself rather than in the gunicorn conf.
But maybe it would be clearer to move them to the conf... something to revisit in the future in any case
so nothing in the conf actually, but if someone wanted to edit it they would add an entry?
Yeah
Yeah, I suppose. It could probably be even higher, but 100 seems reasonable as a default for general use cases.
Ideally we'd do a load test but our local hardware differs from prod
so I have
def __init__(
self,
nsjail_path: str = "/usr/sbin/nsjail",
config_path: str = "./config/snekbox.cfg",
max_output_size: int = 1_000_000,
read_chunk_size: int = 10_000,
+ memfs_instance_size: int = 48 * 1024 * 1024,
+ memfs_shm_size: int = 16 * 1024 * 1024,
+ max_parsed_files: int = 100,
):
What's the difference between instance and shm size
well since /dev/shm is mounted as tmpfs on the actual nsjail instance, it must have a os-limit
technically it can be the same as the instance size essentially
If the shm size is smaller does that mean there's just wasted space allocated, or what would be the advantage in having them be different sizes?
I suppose I don't understand why this PR is adding both shm and tmpfs
Like I said I have not reviewed it yet
uh, not sure actually
can't really think of a major reason they need to be different
just, well, they can be different if we want
If there is no compelling reason then let's simplify the config and just remove the redundancy
If it comes up later that it is indeed useful we can easily add it back
actually quite hard to test exceeding the limit by writing a 48MB file within the time limit
When writing the test, I think it's possible to change the timeout for nsjail easily. So feel free to do that for that particular test
I think I actually did the opposite and mock patched the memory limit lower instead lol
not that much lower, 32MiB works fine it seems
Yeah could do it that way too since the memory limit will be configurable rather than hard coded now
but might be the upper limit for our 6s
Though i suggest that either the time limit be increased significantly, or the memory lowered sginificantly
Due to differences in hardware, we don't want tests to be on the edge of failing due to this
By the way, what I wrote here is not a final decision #dev-contrib message you're welcome to wait for more feedback before committing to changing the implementation
also I'm currently doing base64(zlib.compress(bytes))
do you have any feelings about zlib compression on all files?
or should we do some http layer thing instead
I think doing it in HTTP would be nicer since it'd be transparent, right? As in, the HTTP client will see the header and just automatically decompress it.
Though I don't know if we get better compression doing it your way. My blind guess is that there would not be much difference, but I have no data to back that up
do you know of an official way of doing that with falcon?
I have like 0 experience with falcon before this
I don't know I've never looked into it
I had asked where I could go to work in collab projects and or start collab projects and got told to use the '!contribute' command to see info. I found my way here through that command but I think this is more for contributing to the discord server... Not sure where I need to go now.
Idk if we have specific channels for that, but you can watch out for the #pyweek-game-jam and code jams we have a few times a year and always have collab options.
(idk the next time one will be offhand though
)
Thank you, I will check that out! ๐
Hi! When you need help contributing to any of our repositories - you can ask here with that. We'll also sometimes post things that we need help on, etc. Keep an eye on this channel for ways to get involved. Other than that, yeah as mentioned our events are a great way to get to know other people and work in a team
100% I'll keep an eye here. And thank you so much for letting me know. I am not very good at much yet but I am trying to learn even if it means dumping myself into something and researching or digging into my books. But I gotta learn some how
Best way to learn is to jump right in!
๐ฆ - In an activated poetry shell environment, without any poetry prefixes
๐ง - Using poetry run my_command for each command I want to run
๐จ - I choose to inconvenience myself by not using poetry and build my own messy virtual environment solution
Would be curious for some input here, due to an issue we have open
What's the issue?
You got us intrigued Volcyy :B
I think there is a benefit to standardizing and controlling the way our users run the code
It allows us to make assumptions to simplify the codebase (for instance the debug flag)
And tell people who deviate from it that it's up to them to figure out (like we do with people who don't use docker for the database, etc)
I get your argument, but I don't think a thin abstraction is a problem tbh
(btw the debug argument is super critical for dev environments, since it controls a bunch of logging, utilities such as collectstatic, blocking gunicorn, CORS, and other feature stuff)
So we'd need to mention in the docs if we remove the poetry task that you must run with debug
shouldn't we make debug the default then?
we're not running prod through runserver, right?
surely we're not usin the debug server in production
๐ฒ
We don't get debug from runserver, we get it from the command line flags
Since we intercept the call before django gets to it in manage.py, set up environment stuff, then dispatch accordingly
But that's beside the point
The main point is that it gives the control to us, which makes it easier for contribs
Out of curiosity, why do you see it as an issue
I don't remember
Volcyy 9 months ago was different from Volcyy now
In my opinion, this abstraction is problematic, as most people who used Django know what manage.py is and how it works, and recommending usage of wrappers like this creates a feeling that we're using some special functionality here.
ah y eah
If the whole problem is that we want to avoid the "feeling that we're using some special functionality", why not just mention in our docs, that this only exists for reason X & Y.
That is, if it's not even mentioned already
do people read docs?
I do my best to read docs before I start
IDK, After the baby was born, @last patio just... changed
But I agree, it's not guaranteed
why is your profile picture an oversized turtle eating a whale?
yeah
okay maybe i should leave the issue as-is for now and maybe do a voting on it specifically
Because I like Gojira :p
fair enough
Is there some way to get snekbox running on mac m1?
It builds but on runtime apparently the mounting of lib64 does not exist
I tried running in linux/amd64 dockerfile mode but then falcon never builds
Could you include specific errors?
We probably should've included the amd stuff already
well honestly
everything works fine, the lib64 folder just isn't there with that name, so the mounting fails with file not found error
so I commented out the mounting of lib64
otherwise I tried in the dockerfile
FROM --platform=linux/amd64 python:3.11-slim-buster as base
and this results in falcon stuck in building wheel forever
Are you applying the platform to all the stages?
btw do you have any inputs on https://github.com/python-discord/snekbox/pull/159#discussion_r1026822182
I'm considering we may need breaking changes some way for the python bot indicating whether they want an eval run as a file or plain code
some current functions like -m timeit depended on the fact that input is injected as plain text after the args
but also if we have future support for something like -m unittest that would require a file and not a string
We already rely on unittest in forms, we just do it through importing and calling unittest
so currently I have a hard coded search for the string timeit in args and forcing no-file mode
but that seems, not great
We could have all this stuff as a client option
like a bool flag for file/no file?
or we can also have the client send a list of files, the benefit is that we could potentially support multiple files in the future easily
args: ["-c", "code"] # or ["-m", "timeit", "code"] # or ["file.py"]
fs: bool
Basically don't do any parsing on the server
Just let the client configure whatever concoction serves their needs
Hm, I guess that does require a special case yeah
I don't know
Alright we could have a single non-array argument as another option to arg
in that case, write the string to main.py, and run?
like a bool flag?
Rather args would be args: list[str] | str
In the case of a list of string, pass whatever arguments you get as to python, and let the user handle it
So the base use case would be sending normal strings with code, which is what eval would do
the remaining input parameter would be kind of weird then no?
api-wise that's kind of a confusing design
This would replace input
ah, so we can like remove input altogether?
Yup, input will become args, or a member of args
that would make it impossible to call a file with args then right?
Is that part of the desired behavior for snekbox
well, python file support is new, so I suppose non of this was possible before
The reason I opted not to separate the different datatypes into different keys is because you either end up with a bunch of flags to tell snekbox what to use, or we end up having to do a bunch of null-checks, etc
I'm fine with your proposed attachment design as well
I'm just not sure if we're jumping the gun and adding behavior we don't care about
Though doesn't the attachment design force file system to always be on
I know that isn't ideal for forms
I think a flag for fs on/off would be reasonable
well, I suppose it would have to be with -c args
but potentially you might want to do -c but also have fs?
-c with fs would still give you a cwd in /home and potential writes
So something like:
fs: false
attachments: []
args: ["-c", "print()"]
I guess normal execution would look like:
# fs can't be set to anything other than true with attachements, omit/handle
attachments: [
file1.py, ...
]
args: []
Hmm maybe more like
# fs can't be set to anything other than true with attachements, omit/handle
attachments: [
file1.py, ...
]
args: ["file1.py"]
Rather than hardcoding main
technically we can also support a file run on a read-only fs mount
I dunno if that's really needed though
How would that work from a technical standpoint?
well we just mount the directory as read/execute only from nsjail
then call main.py or whatever
?eval
# main.py
from lib import fn
print(fn(5))
# lib.py
def fn(x):
return x * 2
^ mainly I think support for something like that in the future would be nice
and the list would make it not super painful later
okay I think I'm settling on
"args": ["main.py"],
"fs": true, // (false, "r", "w" | true) default: true
"files": [
{
"name": "main.py",
"content": "print(1)"
}
]
with schema
REQ_SCHEMA = {
"type": "object",
"properties": {
"args": {"type": "array", "items": {"type": "string"}},
"fs": {"type": ["string", "bool"]},
"files": {
"type": "array",
"items": {
"type": "object",
"properties": {"name": {"type": "string"}, "content": {"type": "string"}},
"required": ["name"],
},
},
},
"required": ["args"],
}
Is there no way to make a less verbose version of this stuff
Something more like a dataclass
Content wise though it looks good
Not out of the box
hm why do we need the file system to be toggleable?
I think we wanted no file system for forms or something?
why though?
Mostly to reduce the overhead for features that don't need file writing such as forms (or timeit, or other features we add)
For forms as well, a more locked-down environment would be helpful since we're already struggling with controlling and protecting against users trying to jail-break from the unittesting setup
fs isn't on it's own an issue, but it is an extra vector that needs to be considered and thought about when trying to solve some already annoyingly complex problems
The overhead would be negligible, and it would add extra complexity to implement (e.g. for snekbox#113, i'm not sure how you'd do it since it's hard coded into the config).
I think it would be possible to disable reading/writing files from inside snekbox though
so what would happen if a file was provided by request, when snekbox has file system disabled
runtime error or?
I mean disabling it in the code sent to snekbox, if you put resource.setrlimit(resource.RLIMIT_NOFILE, (0, 0)) at the top of your code it should prevent creating any file descriptors for the rest of that processes runtime, which would make it pretty difficult to read/write any files (I don't know if/how you could).
couldn't the user code just... switch back the limit?
The config isn't actually hardcoded fwiw, and it can be modified dynamically for each eval
no, you can't reduce the hard limit
hm, that might be okay I guess
In the PR I posted above it's in snekbox.cfg so couldn't be edited there. We could move it to pass it in with command line args but there is benefit to keeping as much as possible in one place, and the cfg format is nicer to work with.
No, you can modify the cfg for each eval since it's passed to nsjail as an argument
If you want to modify the cfg, you can make it an option of the python3 function, and write the updated config (from memory) to the already existing temp dir and execute with that
The limitation of having one constant config file is a self-imposed one which is not hard to overcome
It doesn't matter too much in the grand scheme of things since we'll likely rewrite forms to not actually use unittests, but this shouldn't be the reason for that
my point isn't that it's impossible, just that I don't think it should be necessary and that it adds extra complexity that I don't think is worth it.
isn't there essentially an equal amount of complexity in making fs toggle-able in config?
since we have to check and error or do something on requests with files when they are disabled
though a snekbox-level disable for fs is definitely useful considering other users besides us
my suggestion is just that we don't add support it at all, as I don't see the need.
At least I think it shouldn't be part of this PR
ah hm, so just leave it read/write always for now?
that's my opinion, since anything else would have to be applied to #113 too.
also uh, if we're going to allow arbitrary file names, are directories going to be allowed?
if so I think we would need to check for higher-than-parent relative paths for security, since the python file write happens outside of the sandbox
and not sure of a good way to ensure that
@clever wraith We go again. https://github.com/python-discord/site/pull/733
Lessgo, I'll have a look right away
hello @timid sentinel I request thy review on this here PR
looking rn
thank
Hello folks !
Can I get another review on bot#2319 so that we can merge bot that one and site#792 ?
Thanks ! ๐
@sharp crag https://github.com/python-discord/site/pull/733 give review thanks
yes sir
(after meeting)
joe mama
joe's mother
done
working on 701 rn
Is this a discord.py thing?
HTTPException: 413 Payload Too Large (error code: 40005): Request entity too large
oh nvm I think it's the discord upload limit
I heard since we're boosted the bots here have a 50MB upload limit instead of 8MB normally, can someone verify if true?
Does this agree with your findings? https://stackoverflow.com/a/66377613/19147567
well I'm not sure about the 50MB part 
the Boosts tab says 100 for all members
not sure if the same applies to bots, but i dont see why not
should we decide on an snekbox upload limit?
or number of uploads as well
bot-side size enforcement looks like this rn
3 over-limit files and 1 good file:
Is this a per file limit, or a combined limit
per file
but I assume we can also impose a total limit, anything really
technically we should be able to support 50MB per file here due to our boost level
great I can't ping joe
discord pls
joe still does data control for pydis, right?
hey @patent pivot, nice drips
little question, how do you handle opt-outs of metricity?
I don't see anything in the code allowing to you to set a list of users who objected to their data being collected by metricity
You can not opt out
#changelog message
Right, you need it for more than analytics
100MB boost limit increase applies to bots as well. While bots have many Nitro features, I don't believe the 500 MB increase does though
Big Data 1984
L server im leaving
!e
Hey ๐
This channel is for discussions about contributions to one (or more) of our projects.
You can use #bot-commands as a command playground ๐
@tawdry vapor so I'm thinking of converting https://github.com/python-discord/bot/blob/main/bot/utils/lock.py#L51 into a class so it can double as a context manager ๐ค pinging you to make sure I'm not missing anything.
For context, I want to use a resource ID that isn't available at the definition of the class where I would use the decorator. I see that there's an option to provide a callable, but I'm not sure I like it
bot/utils/lock.py line 51
def lock(```
hmm the current code with the callable does seem to cover my use case ๐ค I guess I'm just not a fan
ended up just leaving it as is and using the current option
The callable was designed with simple cases in mind e.g. short lambdas, operator.itemgetter, etc. Sounds like you have a more complex use case but I'm not clear on what that is. How did you foresee using it as a context manager?
I feel like if you need a context manager it wouldn't be much work to just manage your own lock at that point, but maybe I'm wrong.
It's not more complex, I looked at existing uses of lock_args and it seems to work fine. Using with lock(namespace, resource_id): inside the function just seemed cleaner since you don't need to do stuff like getting the function arguments and using lambdas
and also has parallels to how asyncio.Lock is used
did the test server ever get forums?
Nope
At some point our server will stop working as it will never get updated again 
Hey we could use an extra review on bot#2335
You get a digital cookie
Does it have caramel?
Sorry we only have chocolate chip
This looks lit @vale ibex !
It was picked up quickly anyways !
I suppose we're keeping the static channels while the roll out is still in the testing phase ?
yea, just going to remove when people stop talking in them
going to move them down a bit though
Or move the category up next to the help system one ?
Or do you want to reduce the visibility ?
Yea, reduce visibility so no new people start talking in there too much
just long enough for people getting help in there to get what they need and leave
I see I see
Ouf, those tags are just beautiful
Btw Chris, there's a PR waiting for your review, when you have the chance
can't promise to look at it tongiht, it's already late and want to finish this migration and head off
can look tomorrow though, could you send the link ?
Sure thing, whenever you can !
It's bot#2323
what was it?
I need proof, as Discord has never ever had a wonky API
lol sure
getting traceback for context, 2 secs
Forbidden: 403 Forbidden (error code: 40058): Cannot message this thread until after the post author has sent an initial message.
File "discord/client.py", line 409, in _run_event
await coro(*args, **kwargs)
File "bot/exts/help_channels/_cog.py", line 122, in on_thread_create
await _channel.help_thread_opened(thread)
File "bot/exts/help_channels/_channel.py", line 116, in help_thread_opened
await send_opened_post_message(opened_thread)
File "bot/exts/help_channels/_channel.py", line 66, in send_opened_post_message
await thread.send(embed=embed)
File "discord/abc.py", line 1536, in send
data = await state.http.send_message(channel.id, params=params)
File "discord/http.py", line 738, in request
raise Forbidden(response, data)
We were sending a message to the post within an on_thread_create event
it seems the Discord api wasn't ready for us to do that
in disnake's help system I pause a moment because it feels jarring as an end user so I like that discord forces it to be slowed down ๐ค
hah i did something similar by doing the DM message first, and then the bot message in-post
3s pause, send typing indicator, wait 2 more seconds, send whatever we wanted to send
mmmm,
<BUTTON>
padding
will need to implement a wait of some kind when we make the DM message optout
great ui
I also notice it's lost new lines & the forum channel logo
1 hour to close, how is that implemented?
ah so it actually doesn't close
read the label again
Hide after Inactivity
it will still show as an open help post for 7 days
ah that's annoying
& authors of channels can change that time to be longer
guess we'll need to re implement the custom closing logic
I have a task that automatically closes channels marked as Solved, which is a slash command named /solved which is accessible only in the help thread
so every five minutes it archives and locks any threads with a solved label
devil's advocate: if you had made the switch to forum channels when they came out this wouldn't be an issue at help system release because it really was archive at when forum channels were released
๐ค
yea I had always thought it was an archive
I thought there is an auto archive thing
guess it was beause of that
they changed it which is an annoying change for bot devs who were expecting it
but otherwise a somewhat nice change in the client
Can't we set the auto_archive_duration upon thread creation ?
ask publicly in ddevs when they might be released to all guilds
discord.gg/discord-developers
if enough people ask it might actually happen
I wonder if that'd do it ๐ค
there's a very long thread in DMD about it too ๐
that's just hide in client
i know.
it's a pita
ask publicly in ddevs so they get enough complaints about it
Well, that's a poor naming then I'd say
its... reusing an existing field without breaking clients.
Debatable :p
Anyway, it's really late here
Good night for those who want/will sleep :B
And good day for the rest
I believe so, possibly mods too
It can contain sensitive information, but we frequently export filtered data to github issues
I skipped it here because man... that's a lot of issues
If you need a particular one, I'd be happy to provide though
That's the thing, it was just too abstract to get context
Yeah that's true
Most of these are just things like "HTTPException 400" in this line of code
In theory the errors shouldn't be needed to validate the commit itself
Oki
I've left a couple of tiny comments
Thanks!
I'm not sure if this one came out correctly?
What would the final code look like in your opinion
Ugh, that definitely came out wrong, sorry for that
I just thought that there were a lot of nested try/except blocks, and it looked like a nasty pyramid
We could reduce the nesting by moving the code that tries to delete the message back one indent, making it at the same level as the first try/except one
Also, since we want to pass on certain exception codes, I thought we'd just reverse the condition
Yeah we can probably get the rest of this function refactored a bit. The condition I had that way to clarify that the comment belongs to the condition of a thread, rather than no thread, but I guess people should be smart enough to understand it
It would've been a lot cooler if python could catch exceptions raised in an except block :p
Hmm I don't think I can actually refactor it that way
The NotFound exception seems to apply to the reaction clear as well?
The clear is called when it times out, at which point if the message is deleted it'll raise NotFound
Unless wait_for aborts early when a message is deleted
Basically, what I had in mind is
- We wait for the reaction
1.1. Clear the reactions if the previous wait has timed out
1.2. Raise that HTTPException if something goes wrong upon clearing them - Since the
message.deleteis in theelseblock, it means that it's something we'd always want to exceute, so move it into its own block
We shouldn't do 1.2 if httpexception is notfound, which means duplicating the same code from 2
Were the forum channels supposed to close after 1h of inactivity?
Yes, but discord changed the behavior of the auto-archive to "hide" channels instead of archiving them
We're writing bot code to do it manually now
Oh, I see. Is the new bot help system where in the GitHub?
A group of commands that help manage our help forum system.
Thank you!
But we're not, are we ? we're raising an HTTPException, which shouldn't be a subclass of NotFound
I don't see what I'm missing 
What you're missing is that it will be caught by the httpexception block, but then we need to write the code required to check if it's a 404 and do nothing in that case
Here's a sample scenario:
A user sends !help abc in a non-thread channel
A mod deletes the embed
The timeout for this message expires and we go to clear reactions
The message is deleted so the previous step fails
Here we should catch that, log it as we do in 2, and suppress the exception
We'd basically have: ```py
try:
await bot.instance.wait_for('reaction_add', check=check, timeout=timeout)
except asyncio.TimeoutError:
try:
await message.clear_reactions()
except discord.NotFound:
log.trace(f"wait_for_deletion: message {message.id} deleted prematurely.")
except discord.HTTPException as e:
if not isinstance(message.channel, discord.Thread):
# Threads might not be accessible by the time the timeout expires
raise e
try:
await message.delete()
except discord.NotFound:
log.trace(f"wait_for_deletion: message {message.id} deleted prematurely.")
Notice here the duplication of discord.NotFound
Yep, I see now :/
You can still argue this is better, I don't have an opinion either way
That's what I was going to say
It's a 2 line duplication
But clears out the code a bit IMO
I'm curious though, isn't NotFound a subclass of HTTPException ?
Yes
You can catch it in the same except block but you still need to write the code there to identify a notfound, and log
Ah yes since the last except block has been moved out
That's a shame
Well, I guess it's up to you to choose
@gritty wind in bot/exts/help_channels/_channel.py, line 134
Wouldn't it be better to just explicitly catch the NotFound exception only & add a log ?
It would allow the other exceptions to propagate without having to reraise
It's a good idea, others have brought it up too
But working with discord you'll find anything but good ideas is a valid solution
@vale ibex @clever wraith
it seems discord takes a good few seconds to dispatch the message event
It's not a big deal, but it does delay the start more than a sleep would
That's a good 6 seconds there
Either that, or it's waiting for the entire duration of the timeout for some reason? That's set to 5 seconds
Actually the latter seems to be the case
Yea, we should likely wait that long regardless, since we need the message details
set the timeout longer i guess 
I don't think you understood what I meant
Here's the same code with a shorter timeout
It still works correctly, and the event gets dispatched correctly (before the timeout) in both cases
But for some reason it waits the entire duration of the timeout before proceeding (even if it doesn't time out)
Can't we just wait for a new message event in that particular thread?
That's what I'm doing
def check(message: discord.Message) -> bool:
return message.id == opened_thread.id
try:
await bot.instance.wait_for("on_message", check=check, timeout=2)
except asyncio.TimeoutError:
log.warning(f"Help thread {opened_thread.id} was not ready in time, aborting.")
return
In this case it sleeps for 2 seconds even if it got the message before then
But if I set that to something longer (for instance 15 seconds), that would delay the code by that long, even if we got the message before then
I.e the code above is equievlant to asyncio.sleep(2)
And I can't figure out why
I wonder if the check is correct
I guess you can break there anf verify
But
Actually, nothing
The starter message id should have the same id as the thread
So i guess it's fine
I've added a breakpoint in the check, and it isn't being called at all
I'm presuming your indentation is correct in your IDE, right?
Because in the snippet you sent, it isn't
Or is that just discord again
It isn't correct in what way in the snippet?
On my phone, the try except block indentation shows inside the function scope
Ah
I think it's just a copy paste white space thing
Yeah ๐
Ouh that emoji looks ugly, not what I intended
:/ (that's more like it)
oh, that's very odd
It's actually timing out and throwing the warning now fwiw
That's the entire diff
It wasn't timing out before which is the weird part
But the check just isn't called anymore
The bot would error out if I didn't have message intents or something right
I wonder if it's because that message has already been published to listeners by the time the wait_for future is created
Maybe, but even if I send another message after the opener (and persumably after the wait for is called), the check isn't called then either
(I deleted the sleep btw)
have you got a full diff somewhere?
Index: bot/exts/help_channels/_channel.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/bot/exts/help_channels/_channel.py b/bot/exts/help_channels/_channel.py
--- a/bot/exts/help_channels/_channel.py (revision 8cfa7ed5b39b85e432420c5749c0fe1569e2d2cb)
+++ b/bot/exts/help_channels/_channel.py (date 1669466984168)
@@ -124,8 +124,18 @@
# Discord sends the open event long before the thread is ready for actions in the API.
# This causes actions such as fetching the message, pinning message, etc to fail.
- # We sleep here to try and delay our code enough so the thread is ready in the API.
- await asyncio.sleep(2)
+ # We try to wait for the message to be dispatched here.
+
+ def check(message: discord.Message) -> bool:
+ print(message.id)
+ print(opened_thread.id)
+ return message.id == opened_thread.id
+
+ try:
+ await bot.instance.wait_for("on_message", check=check, timeout=20)
+ except asyncio.TimeoutError:
+ log.warning(f"Help thread {opened_thread.id} was not ready in time, aborting.")
+ return
await send_opened_post_dm(opened_thread)
This on top of my PR
it's wait_for("message")
"message" is the event, "on_message" is to make a listener for that event
I think we can reasonably go a bit longer than that
you likely want to cover the case where the message already exists in the cache
since this error doesn't get hit every time
is it?
if so, this wait for is useless
since it's already in the cache, so waiting for it to be added will never work
I assumed it would be added to the cache when the thread event was sent
I don't actually know
I don't think it will be, I think thread creation starter message events are separate
Here's how I assume the underlying code works:
Discord -> discord.py -> code
thread_create -> save message to cache -> call handler
Discord -> discord.py -> code
message -> add message to cache (overwrite?) -> call on message handler, unlock wait_for
thread_create isn't isn't a message
How would I check
it's a channel
We have thread.started_message though, I assumed the payload included the message
@commands.Cog.listener()
async def on_thread_create(self, thread: discord.Thread) -> None:
print(thread.starter_message)
Presumably this would error if the payload didn't
That event fires twice lol
since the starter message is the same id as the thread id
Right but that isn't just an ID, i can access things like message.content
yea, which is retrieved from the message cache
only, it seems the message id isn't populated into the message cache before the thread creation event
it wouldn't error, since it's an Optional attr
I'm not really sure how to handle this system
Maybe it's best to just have an independent message listener?
I'll just work on something else
we could have the on_thread_create event and then just have a wait_for that waits for the post_id to be present in bot.cached_messages
or yea, we could trigger all of this off an on_message listener
where it checks if the message in the help forum system and that message.id == message.channel.id
In the mean time, I'd appreciate some eyes on bot#2338 so I can rebase my auto-archive branch locally and resolve conflicts
Good to merge
bot#2339 is up for review now, which should close out the remaining issues
going to grab some lunch, but feel free to ping with questions
Is it actually possible for the second one to be hit
if not message.type == discord.MessageType.thread_starter_message:
return
if not isinstance(message.channel, discord.Thread):
return
Also the archiving of old posts doesn't seem to work, I'll check what's up with that
discord.ext.commands.errors.ExtensionFailed: Extension 'bot.exts.help_channels' raised an error: AttributeError: 'ForumChannel' object has no attribute 'channels'
It has threads
Yeah that seemed to fix it
it's for type-checking
which function?
new_post_listener
At least I didn't get the DM or the initial messag
Seems the first if condition fails
I'm only getting message type "default"
oh
how interesting
also, I tested using Kat's server, the logic to close inactive posts works nicely
I imagine it may take a minute to close all of them on here though
took ~5 seconds to close the 20 or so in kat's server
Jeez
yea, it's the send message + dm +close & lock api calls
what are you seeing?
#help? (1046064369756291102) is idle past 2022-11-26T14:09:13.532000+00:00 and will be archived. Reason: auto.inactive
Finally block reached for #1046064369756291102; state='CORO_CLOSED'
#a (1046065095983255622) is still active; scheduling it to be archived after 30 seconds.
It's still open
No exceptions or anything tho
So this log means it hit the right loc
All the individual parts must work tho, it closed on boot up
Ignore the first log, that's another channel
one that closed correctly I'm assuming?
But here's the one for this channel:
#a (1046065095983255622) is idle past 2022-11-26T14:14:01.267000+00:00 and will be archived. Reason: auto.inactive
Yeah that one closed on boot
this log is line 215 of _channel
directly after is where we call close post
Yeah
but it's not closing?
Yeah
Could you add some logs/breakpoints inside _close_help_post?
I'll throw down a breakpoint
I just was scared of rebooting in case I just ran into a vanishing bug
It did close on boot of course
Why does windows 11 no longer show the current time
And why are all the search hits telling me to install "ElevenClock"
lol what?
it shows for me
It only shows the time down to minutes
Used to be able to see seconds when you expanded the window
Btw I didn't get the log from line 214 this time, the entire function just wasn't called
You aren't calling it
we never create a task for newly created posts
love it
I too had such a fantasy. I'm now at 77 commits ๐
btw Chris I wouldn't mind keeping it at one hour
although we can start with 30 minutes to better compare to the old system
yea we could do that too, guess we can look into that after a few days
does the PR fix all of the broken stats?
broken stats?
some people said the stats were flatlining
that'll be due to the reports themselves I'm guessing
I changed the stat name in some places where it didn't make sense to keep using the other
hmm ok
do you have examples of the bad reports?
we should keep the old dashboard for a while then and have a new one
I'm looking here https://grafana.pythondiscord.com/d/s8ZoLUCZk/help-channels?orgId=2
These stats need to have their counters updated
But thereโs a data issue with the channels not closing
which is fixed by the PR right?
Hope so
I just updated the "In use channels" stat
so now it shows both
so my question was whether the PR will start updating the stats again
the stats never stopped being updated
it's just since the channel never auto-closes, the code to update the stat never hit
and for other things, the stat name has changed, so the report is looking in the wrong place
!int e bot.stats.gauge("help.total.in_use", 0)
In [1]: bot.stats.gauge("help.total.in_use", 0)```
that was still reporting 4, since we merged the PR before all in-use were closed
yea, that's because the reportis looking in the wrong place
the stat name is different now
I see
basically all the grafana reports need a once over so they show the new & old stats
ok well ping me when all of the fixes you discussed above are done
Well I'm not reviewing until you ping me
@cold island
the perfect time to get rid of it
hah yea pushed a fixup
how reliable are these attributes?
last_message is always present, just optional
last_message_id should always be there afaik
let me check d.py src
ah, last_messge_id is optional too
might need a small refactor there
I also wonder how consistent they are with the actual state of the thread
it'll be based on whatever the discord api returns
so questionable consistency
since we're making an api call anyway, i could just use .history to fetch the last message, if last_message is None
yeah except it should ignore the bot's message
it's already written in the old system if you want to use it
bot/exts/help_channels/_message.py line 65
async def get_last_message(channel: discord.TextChannel) -> t.Optional[discord.Message]:```
it does when checking is a session is empty, but not when fetching last message
yeah but you're checking here if a session is empty
not important, but can just be arrow.get
thinking about it, post.last_message is worthless here
since the last message can be from the bot
Ah true, adding or last_message.author.bot to the conditional would be the fix
Also this is a bit problematic, as if the last 5 messages are (somehow) from the bot, it will close after 5 minutes event if there's something above those messages. Better to use the regular close time in that case
if we use that logic, it will never close after 5 minutes
since there will always be a bot message in the post
if you actually reach the beginning of the channel, then you're right
or sorry, in this case the end
yea, and checking for that case makes the logic much more complicated
I think what we want to do is close after 5 minutes if the opener message is deleted and the original poster doesn't have any other messages in the post
otherwise close after 30 minutes from the last message (even if it's a bot)
seem reasonable?
so the first case would still close after 5 minutes, even if there are other people messaging in there
hmm but then you need to check the entire post
if only enumerate worked with async for ๐
anyway, if this is too much right now we can come back to this later, but I'd like to get this right in this round because we probably won't come back to this for a long time
This is what I'm thinking https://github.com/python-discord/bot/pull/2339/commits/1df3fc8e789e2c4d07a37c82e722554f17daac7f
docstring needs updating, but logic is there
So if the starter message is deleted, the post has < 100 messages and the opener hasn't sent a message in those 100 message, then we close in 5 from thread open
otherwise, the post is closed 30 minutes after the last message
I chose 100 since that's the max you can fetch from the API in 1 request
If it's oldest_first it's last_100_messages[0] right?
yes lol
also need to account again for post.owner_id being None
Sorry if I'm interrupting, but do we have these similar tags on purpose (as aliases) ?
seqslice, seqslicing, etc
!seqslice
Slicing is a way of accessing a part of a sequence by specifying a start, stop, and step. As with normal indexing, negative numbers can be used to count backwards.
Examples
>>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> letters[2:] # from element 2 to the end
['c', 'd', 'e', 'f', 'g']
>>> letters[:4] # up to element 4
['a', 'b', 'c', 'd']
>>> letters[3:5] # elements 3 and 4 -- the right bound is not included
['d', 'e']
>>> letters[2:-1:2] # Every other element between 2 and the last
['c', 'e']
>>> letters[::-1] # The whole list in reverse
['g', 'f', 'e', 'd', 'c', 'b', 'a']
>>> words = "Hello world!"
>>> words[2:7] # Strings are also sequences
"llo w"
!seqslicing
I tried 2 different ones in #bot-commands, and it's the same content
alright cool, I'll squash now then
why would there still be a task?
need to update docstring
that was a copy paste error from the old system, it should be removed in a recent forcepush
is the task removed after the callback finishes?
Yea, that should happen automatically, than line was only in the help system for another reason
it should be removed now
ok
unless it's stil in my local changes and I havne't pushed yet
I'm just updating the docstring of get_closing_time
hmm no the task is removed from the scheduler when it's done
so inside maybe_... the scheduler still has a task for the post
ah yea, we need to remove the task before we can add a task witht he same id
that's why scheduler.cancel(post.id) is there I think
yeah
and if you try to cancel a non-existing task it logs a warning
which is why has_task is there
I removed that, since it will always have a task
it won't on startup
this function is never called outside of a task
you call it in cog_load
Ah, yea
you can schedule the coro instead of awaiting it if you want
but you'll need either that or the boolean
nah I'll await directly, will push changes now
alright pushed
new docstring for get_closing_time too
I've got it running on kat's test server rn testing it if you wanted to mess with it
I have a question folks,
In the #roles channel, how we do we send/update those roles ?
Does the bot run checks whenever it starts on whether they exist or not ? And if not it sends them in that channel ?
I can't seem to find this in bot code 
Even in constants, the roles channel is not mentioned, which made me wonder
I'd say just slap a button that will invoke the view you get for the subscribe command (ephemeral)
yeah there are no features that needed that channel until now
so you'll need to add the constant
I'm thinking of having a view that "never disappears" in that channel
yea, we currently just do it with a webhook & discohook
yeah, a permanent view
Yep, it's not well documented in dpy, but that's fine, I'll figure it out
Apparently it needs to have a null timeout & all the items need a custom_id
2 and 1
actually I need mod perms
added
tnx
hmm
wait @vale ibex
even if you take -1
that will only work if you have less than 100 messages
you need the newest message
I'm not using -1, I've set to to so oldest_first=False now
๐
@cold island For the persistent view, since it needs to be attached to a message, here's what I had in mind
We add that message_id to the config, and upon cog load, the bot checks if that message has a component or not, and if it doesn't, we will update it with that persistent view.
So you or one of the people who have perms to send messages there will need to send that message ( or ask the bot to do it ), get its id, and add it to our config
I read in an example written by Danny that it needs to be saved in db, but i don't think that's something we'd want to do, do we ?
WDYT?
It can just be a constant
In the Cog's module ?
Yeah, true
It's only needed in that context after all
I was mostly referring to the action needed to send the first message that'll hold the view
yeah hmm I'm not that knowledgeable about permanent views. If it works for you in testing then it sounds good to me
Alright, I'll let you know
anyone familiar with sending a webhook message with a view? for some reason the buttons don't show and there's no error
await webhook.send(... view=view)
How are you constructing the view?
await self.webhook.send(
username=name, content=ctx.alert_content, embeds=[embed, *ctx.alert_embeds][:10], view=AlertView(ctx)
)
how do I check
self.webhook.is_partial()
that's probably your issue then
ah
is_partial returns False
still nothing
Could you add wait=true to that call and store the returned message?
see if it has a view
there are no components in the response message
that's literally the last feature lol 
will try asking on the dpy server
about to be a bloodbath in #1035199133436354600
lol yeah
I'm expecting more "command not found" errors with this boot
since it'll take a minute or so for the cog to load
meaning other cogs will be blocked, depending on the load order
started: 18:04:16
it's strangely pleasing to watch this number slowly drop
ended: 18:08:29
done?
yeah
ok, 28 is a decent number, and now you can actually navigate the active posts 
lol yea
Is there supposed to be a space or something?, or is it just my client being buggy
just the mobile client being buggy
those are newlines
Mobile didn't like that apparently, thx for info!
Is it an android device?
I believe discord gave a warning that for best experience you should use a desktop client for forum devices for those on android
Wasnt that something that was only shown for like 2 weeks when they just came out?
I think I fixed it
added a few zero-width spaces on the empty lines
Android should render it nicely now
could you confirm if it's better now?
Checked before you changed and now, seems fixed
cool cool
Yep, it looks better now ๐
Thx ๐
Fairly trivial fix for an oversight of mine bot#2342
Hey @vale ibex,
Do you mind taking a look at bot#2319 and site#792 ?
We need them in order to have a clean history for bot#2323
Thank you ๐
Also, I noticed that get_or_fetch_channel returns a discord.abc.GuildChannel
My IDE always complains when I want to use it to fetch a thread.
Can't we make it return a GuildChannel OR a Messageable ?
It's not really super convenient and the name won't fit anymore since we have all of these implementations of Messageable, but I'm just thinking it could be improved
The return type should be GuildChannel | PrivateChannel | Thread
I'll make an issue for it & send a PR to fix it
That's the return type of fetch_channel
Feel free to just PR if that's the only change
Will do
And the comments were resolved, just not pushed
That's why I didn't request another review yet
Sorry for that
Ah makes sense
I usually put a link to the commit that resolved the comment before resolving them
That's a good practice that I should start adopting
that way the commentor can see exactly where their comment was actioned
Will do from now on ! ๐ซก
Is it not possible to update the target branch in the PR to point at the one that only exists in my fork ?
I think that'll clear up the history for now
Just like Scaleios suggested, I just can't seem to find it
Alright, comments should be fixed now Chris !
I don't think you can change the base to another origin, but if you click edit in the top right, you can select which branch it goes to in the current origin
I guess I'll just wait for the original PRs to be merged then, and I'll redo the 3rd later
It can't be merged without them anyway so
Does one need special perms to request a review from someone ?
You need to be a part of the organization
i can use my superpowers to request a review on your behalf
who's the lucky winner?
what's the PR?
sir-lancebot#1145 Thoughts on this? Hasn't been a lot of activity for this issue/pr
@tawdry vapor so for some reason it seems I can't reply to your comment on the input deprecation.
I've added compatible support for input in the API route in 84e73c1, so behavior wise it should be backwards compatible now.
I think I won't add it to the python3 function itself since the either-or optional argument is pretty messy
but for the json schema I can do
"anyOf": [
{"required": ["input"]},
{"required": ["args"]},
],
and it seems reasonably okay so far
This reminds me - have we planned on using buttons for .topic?
A reaction just feels kind of janky for this sort of task
.topic
Suggest more topics here!
would it be possible for the button to only be visible for the sender
or else I think it takes up a bit too much vertical space
why is this a topic for #dev-contrib haha
afaik buttons have to be visible to everyone even if they only work for specific people
๐
It's zig for bot#2342
anyone have opinions on this format for specifying file names for codeblocks
specification
- If no file name is supplied, the first codeblock defaults to be named
main.py - Unnamed code blocks will be joined with the last named code block.
- Python's entry point is the first code block
To name a code block, surround a file path in single backticks on the line before a full code block:
`name.py`
```py
print(5)
```
what does it look like if a name is invalid? similarly, does it still work if no .py extension?
also this is a thing now
sure, any file really
what would be an invalid file name though?
i was under the impression it was more than this, but apparently the only invalid file path chars on linux types are just / and \0, so /.py should theoretically blow up
nix is more lenient than windows in this regard too i guess
currently it's blocked on snekbox's validation stage
I might do a more friendly check on bot side
anything beginning with / is forbidden as well since it's an absolute path
this is why i was wondering
sounds like good idea for user friendliness (and possible data leakage?)
Fwiw most people won't be able to actually do it, the bot will zap the message because of the file
ah hm
do you think we should be able to upload files from our pastebin links as well
And not to be too much of a party pooper, but we should probably think about the repercussions of allowing people to make the bot post images (e.g nsfw stuff)
How would that work?
just as a specified link instead of a code block
^e
file.py
https://link.org
What would the bot do with the link?
I suppose if you couldn't upload there wouldn't be too much of a way to get nsfw in there
it would still be subject to the normal discord filters as well
Oh just download the content? If it's our pastebin then I'm ok with that
essentially yeah
only for our pastebin I'd think
be a nice way to check functionality of large python scripts. just do !e [hastebinlink] instead of wall-of-texting a code block
One thing to note here: we should watch out of scope creep. If you can PR a minimal feature that can be later expanded I think it would be best. So I'd leave something like downloading from the pastebin to a future PR
hm fair
is this realistically a concern though? Aren't all users allowed to embed links outside of pygen already
seems doing that through snekbox is just more steps for the same thing?
Hm yeah I'm just not sure what it means for the bot if it gets filtered by Discord
something to think about: if the bot can eval a hastebin, how will that affect logging and debugging? hastebins expire, iirc, so it'll be impossible to tell what code the bot executed after a year or such
Would it still be relevant after a year?
i used a year as a guess. this suggests as little as a month, actually: https://paste.pythondiscord.com/about.md
but to answer the question about relevance: probably not, but there might be a small chance
@fossil veldt I think uploading files and splitting the input into files are also features for future PRs. They all sound cool, but they're probably out of scope of "allow the bot to read and present a created file from snekbox"
well, fundamentally we're at least switching to file form to have the better error messages and everything
we could just limit to parse the same as before and throw everything in a main.py
though since the snekbox side will support multiple file uploads, making bot support it is just a message parsing thing
Is the better error messages part of some issue, or..
well we currently don't have any of the 3.11 error enhancements
I understand, I'm just trying to understand what it's a part of, since I didn't know we were working on it
!e
x = [[]]
print(x[0][0])
@fossil veldt :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 2, in <module>
003 | IndexError: list index out of range
also currently inspect.get_source doesn't work
and a __file__ not working might be weird for IO
I understand why it's good, I'm just asking if there's a github issue ๐
in general making a written file has lots of benefits
I'm just not sure what a specific issue would be
"Provide code to snekbox as a file" with the benefits of it, and the general approach to implementing it
It just doesn't seem like displaying an image output depends on it
I hope I'm not bumming you out too much, I just see the same PR getting more and more features, and we should give it a clear scope
It's easier to discuss the changes that way, and easier (and faster) to review
hm, it probably doesn't, though currently our design already is shared for both the request and response
the encoder / decoder for snekbox files is the same in that sense
parsing the input for additional files and naming would be an additional complexity though I agree
but just sending eval as a main.py should be free, if that's preferable
also yeah to support arbitrary file names we'd have to parse this response error and do some friendly error for that as well
Shouldn't we update the description to account for our new help forum?
Probably
Go ahead and create an issue/PR for it
In meta?
Then we can bikeshed the wording
Maybe. I'd jump straight to the file




