#RAM-based storage in GitHub Actions
I was curious what GitHub Copilot can do, so after some googling around, I threw in a text prompt on the web UI and... it gave me something that I don't trust but looks fine as a starting point -- I don't know how I feel about it downloading a random binary from the internet but that can be validated with a checksum... This is probably fine for getting started for a prototyping-ish run.
name: Create RAM Disk on Windows
on: [push]
jobs:
  create-ramdisk:
    runs-on: windows-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Set up RAM disk
        run: |
          # Download and install ImDisk
          wget https://mirrors.kernel.org/sourceforge/imdisk/ImDiskTk-x64.zip -O imdisk.zip
          Expand-Archive -Path imdisk.zip -DestinationPath imdisk
          Start-Process -FilePath ".\imdisk\install.bat" -ArgumentList "/S" -Wait
          # Create a 1GB RAM disk
          Start-Process -FilePath "imdisk\imdisk.exe" -ArgumentList "-a -t vm -s 1G -m Z: -p \"/fs:ntfs /q /y\"" -Wait
      - name: Verify RAM disk
        run: |
          fsutil volume diskfree Z:
      - name: Use RAM disk
        run: |
          # Example: Copy repository files to RAM disk
          xcopy . Z:\ /E /H /C /I
      - name: Cleanup RAM disk
        if: always()
        run: |
          # Remove the RAM disk
          Start-Process -FilePath "imdisk\imdisk.exe" -ArgumentList "-D -m Z:" -Wait
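The checksum validation mentioned above could look roughly like the following. This is only a minimal sketch: the function names are hypothetical, and the expected digest would have to be pinned by whoever audits the download, since the thread does not give one.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest.

    `expected_sha256` is an assumption: a digest recorded ahead of time
    by a maintainer, not something provided in the original workflow.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

A CI step would call `verify_download(Path("imdisk.zip"), PINNED_DIGEST)` before unpacking, so that a changed or tampered archive fails the job instead of being installed.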
see, I refuse to even use AI in my personal life.
Well, you didn't!
This is not at all dodgy :P I'll take a look, but I am a little wary of depending on an even more complex script or a 3rd-party action/script.
It had better make the runtimes, like, 3 minutes
Yea... I'd love that but I doubt that.
IIRC, filesystem overhead on Windows is a lot higher.
I'm being tongue in cheek, haha
FWIW, I also have a somewhat "ugh" feeling about this. Not helped by the fact that the code block uses checkout v3.
It'd be amazing to have a CI turnaround time of 10 minutes, but even 15 minutes would be nice.
Yea.
We're sitting at 18-20 AFAIK.
I deal with pip-like CI times at work (and worse) and one of my favourite improvements is always making things quicker... Or when I hop on a project where I'm the primary maintainer and the test suite takes 12s to run 17295 tests or so.
I'm going to try using the old ramdisk script we used to have, see how it blows up and then work backwards from that.
Honestly, I'm very happy we're at 20 minutes on pip and not the 60+ minutes we had when we were spread across Travis and Azure and GHA.
IIRC, it was basically a no-op by the time we removed it but I don't remember all the details. It'll be in the history under tools + there's probably some context from a past me in the PR.
(I hope so)
like no-op as in it didn't help CI times? or it stopped working entirely?
The context is here: https://github.com/pypa/pip/pull/12115
The first one, likely due to the latter that I tried investigating before going "I'm way out of my depth here".
IIRC
there seems to be very little difference in runtimes between the two, suggesting that the RAM disk isn't giving us much benefit. Given the amount of time (and computing resource) wasted in restarting CI jobs that fail in the "Create a RAM disk" step, I think it's better to just remove it.
from Paul
That's my concern as well. The dev drive powershell script was already pushing my windows/PS knowledge to the limit, and any sort of ramdisk implementation would definitely be out of my wheelhouse.
Aye, I misremembered! It wasn't me!
It's worth a shot, but I don't want to introduce the maintenance liability unless it's even faster.
I'll try and find the PR later, but it was turned off because it was causing issues
The PR is linked above.
But that... tracks. I think we had meaningful speedups.
And, then it stopped working in some cases and stuff.
Oh yeah, I've read those old issues. An hour sounds awful 
2020 was an interesting time
Creating the ramdisk already takes 100 seconds, ouch. https://github.com/pypa/pip/actions/runs/12496073071/job/34867446618
Aye, I remember now. GitHub had started installing the server that we're installing at the top.
IIRC, that was the thing that took like a minute.
Ah, like it was pre-installed before?
We'd asked for it to be installed in the images by default.
And they removed it, if I had to guess.
They added that at the time.
A little off-topic, but the next windows and ubuntu images also remove svn so that'll be fun to deal with.
Yea, pip's VCS support situation is annoying.
(I'd tried to use Windows Server 2025 to see whether its more modern Dev Drive implementation would improve the times, it did not.)
Most of our CVEs are around the VCS stuff.
Which, me no likey.
But that's off topic as you said.
Optimistic, but a good try!
While I wait for the CI run to finish, the other ideas I had for speeding up the test suite were to
- comb through the existing tests to identify redundant work that can be removed (like unnecessary wheel building)
- cache HTTP requests globally where possible (although honestly this is more important locally)
I like the sound of it.
^ for the 2nd one. I was writing a test that needed to invoke a PEP 517 build and it was awfully slow even compared to the same test on the CLI. It turns out downloading setuptools ate 2 seconds.
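For illustration, the HTTP-caching idea could be sketched like this. It is not how pip's test suite works today; `DiskCache` and `fetch_url` are hypothetical names, and the sketch deliberately ignores cache headers and expiry, which a real implementation would need to think about.

```python
import hashlib
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Default fetcher: a plain, uncached HTTP GET."""
    with urlopen(url) as response:
        return response.read()


class DiskCache:
    """Cache response bodies on disk, keyed by the SHA-256 of the URL."""

    def __init__(
        self, cache_dir: Path, fetch: Callable[[str], bytes] = fetch_url
    ) -> None:
        self.cache_dir = cache_dir
        self.fetch = fetch  # injectable so tests can avoid the network
        cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, url: str) -> bytes:
        entry = self.cache_dir / hashlib.sha256(url.encode()).hexdigest()
        if entry.exists():
            # Cache hit: no network round trip at all.
            return entry.read_bytes()
        body = self.fetch(url)
        entry.write_bytes(body)
        return body
```

With a cache directory shared across the whole test session, something like the setuptools download would be paid for once rather than once per test.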
Honestly, the test suite would benefit a lot from being fewer end to end tests but the ship sailed on that about 8+ years ago.
I have no appetite to rewrite pip's entire test suite
Sad.
What do you mean you like being sane!?
thinks about all the "whyyyy" sounds I've made looking at the test suite code
I'd also like to prototype switching the provisioning mechanism for isolated environments to install deps in-process, as I suspect the major performance penalty from build isolation is a nontrivial factor in slow tests.
Yep.
Of course, actually writing a good patch to do that is out of my wheelhouse, but I'd like to prototype it at least.
The build isolation situation as a whole is like... a debt that we need to pay down interest and principal on.
And the interest is 30% APR.
Happy to chat about all things around that area!
I appreciate it!
Honestly, I feel partially disincentivized to work on major changes like that though because I don't think we have enough review capacity for any of that.
I spent a lot of time on this a few years ago, before losing steam because the immediate environment around me wasn't conducive to OSS work and I had to start pretending to be an adult for reals.
This is good to know, and I'm honestly not sure how to solve this issue at this point.
I know I'm supposed to be part of the solution, but there is so much historical context and codebase expertise that you need in order to review major PRs, so it'll be a good while before I feel comfortable reviewing such PRs.
This is why I stopped contributing to mypyc. TBF I was also out of my depth as I don't work on compilers in my own free time, but none of my work was receiving attention.
Yup yup! I'll try to set aside more time for OSS stuff over the next few months as a whole (I like the "set 3 month goals" thingie from some video by CGP Grey about the theme system).
I hope that'll help!
I don't have any plans to stop contributing to pip, but I'm focusing on the parts where I'm not as reliant on others.
I have a lot of loose ends that need tying up, aside from the regular old "deal with the slow trickle of things".
Frankly, I'd prefer that we dedicate some time so @tiny yacht's thing with us is not wasted, but either way, sounds good!
That's another reason I'm motivated to make more time for this stuff. :)
Plus, I have a PEP that I'm supposed to be writing down that I'll throw up sometime after the holiday season.
Which is gonna be a fun time.
oof
I'm very glad that I don't participate in the standards discussion. I'm happy to stick in our own smaller pip world, where we generally don't have active fires in our discussions...?
Oh, this isn't gonna be standards.
Yea, though to be fair, some of us aren't around enough to have a fight/flame war.
Standards discussions can be so tiring, though I am going to try and put a PEP proposal together in January
Yea, it's tricky to balance the needs of a lot of people, while trying to set a foundation to build upon in the future.
To bring it back on-topic (or off-topic, since the topic changed?): test-suite-duration-wise, the ramdisk run seems to land in the middle (https://github.com/pypa/pip/actions/runs/12496073071/job/34867446550), and once you factor in the ramdisk creation time, it's not very competitive.
[3.13 (1)] mean: 0:11:15 min: 0:10:29
[3.13 (2)] mean: 0:10:39 min: 0:09:55
[3.8 (1)] mean: 0:17:37 min: 0:16:40
[3.8 (2)] mean: 0:14:59 min: 0:14:04
for reference, these are the averages from the last 50 runs on main.
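For anyone wanting to produce summaries in the same shape, here is a minimal sketch of the calculation. It is not the actual script used for the numbers above; `summarize` is a hypothetical helper and the sample durations are made up.

```python
from datetime import timedelta
from statistics import mean


def summarize(durations: list[timedelta]) -> str:
    """Format the mean and minimum of a list of job durations."""
    seconds = [d.total_seconds() for d in durations]
    # Round to whole seconds so str(timedelta) prints as H:MM:SS.
    avg = timedelta(seconds=round(mean(seconds)))
    fastest = timedelta(seconds=round(min(seconds)))
    return f"mean: {avg} min: {fastest}"


# Made-up sample: two runs of 10 and 11 minutes.
runs = [timedelta(minutes=10), timedelta(minutes=11)]
print(summarize(runs))  # mean: 0:10:30 min: 0:10:00
```

To compare fairly against the ramdisk run, the roughly 100 seconds of ramdisk creation time would need to be added to that job's duration before summarizing.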
Thanks for trying the old script again to validate that! It's probably worth trying imdisk since it was a consistent suggestion on a bunch of forums.
But, it's also 100% OK if you say that you don't wanna spend time on this.
This was genuinely an "ooh, maybe this will help" idea that, it seems, did not help.
well I had a lot of those too
I think I'd like to PR the current improvements I have, and then we can investigate better ramdisk implementations later.
Sounds good to me!
And because I was bored, here's a scatter plot of the last 50 CI runs on main.
I really should add caching to this script, but I'll deal with that later.
I'm not sure why there is a giant gap given that there should be weekly CI runs, but /shrug.
Annnd the last 100 runs.
[3.8 (1)] mean: 0:17:35 min: 0:16:38
[3.8 (2)] mean: 0:15:00 min: 0:13:46
[3.13 (1)] mean: 0:11:15 min: 0:09:57
[3.13 (2)] mean: 0:10:44 min: 0:09:22
It'd be neat to add this to https://ichard26.github.io/ghstats/pip/ now that I think about it
I'll add that to the "absolutely lowest priority to-do, maybe will be done at some point, probably not" list.
I appreciate that but I think I am mostly blocked on the actual purpose of pip and the current architecture https://github.com/pypa/pip/issues/13111
internally I've basically marked that as not going to happen (please correct me if you disagree)
yea, that seems about right