#Building the Pause button: researching how to pause effectively.


bitter zinc
#

@lone bison asked me to submit a research project for the AI Safety Camp (AISC). There is one research question that I feel like we should have a very good, robust answer to: How do we build the Pause button? What do we need in order to effectively halt frontier AI research?

Of course, we already have quite a few thoughts on this, but there are also open questions that we need answers to.

So @winter dagger and I have submitted a research plan for AISC.

In this thread / project, we'll try to keep everyone up to date!

Current status

  • @formal oasis will probably be team lead, @lethal storm and @bitter zinc are team members. We're still interviewing other members using the AISC Airtable.
  • We're updating the research plan and the Building the Pause Button page as we go along.
  • We'll formally start in January

How you can help

  • Share research about compute governance. Some research has been published by GovAI and ICFG.
  • Come up with novel ideas on verification mechanisms to halt AI training runs. Here's a paper by Akash Wasil.
  • Share flaws in existing approaches
  • Join the team!
stray ridge
#

international cooperation is key here imo

#

also important to reach out to folk who disagree and find common ground

hushed compass
#

I love this idea

lone bison
#

BTW, does anyone know anyone who could help lead this compute research project? I'm really excited about this project as an AI Safety Camp organiser, but Joep is already overstretched.

visual jolt
#

not sure if the project lead has been all figured out, but would love to help in any capacity needed!

bitter zinc
#

Just met John Khan and Annelene Schulz, who are interning at @winged ivy's Existential Risk Observatory. They will be researching how we can pause for the long term. In the short term, pausing is relatively simple, due to the scale of AI training runs. But various innovations can make this more difficult in the long term:

  • Decentralized training runs
  • Quantum Computers
  • Photonic chips
  • Second-hand computers
  • Algorithmic advances
bitter zinc
#

Nice video on this very topic! https://youtu.be/M4DAfzDnJzU?si=dcyvH6UblbBgp7Hb

In competitive situations, people end up sacrificing common values to get an edge. But this edge only lasts a short time until everyone does the same, so everyone is right back where they started but the common value is gone forever. Moloch is the personification of this “race to the bottom”. When asked why situations that are bad for everyone a...

bitter zinc
#

I'm impressed by some of the AISC applicants! We'll probably have a great team 🙂

#

And thanks to @lethal storm for helping out a lot!

lethal storm
#

Scheduling chats with 5 interesting applicants, will keep the channel updated on how they go!

#

Excited for the direction of this project

lethal storm
#

4 out of 5 peeps replied, I got 2 chats Friday (one in under 12 hours!), 1 chat next Monday, 1 chat Tuesday

#

@bitter zinc Any advice for good info to discuss in the intro call?

#

I was gonna introduce the project, ask if it sounds interesting, and ask what kind of work or sub-questions they see themselves doing

#

if theyre leaning towards any other projects (since everyone im calling is 2nd choice and other)

#

make it clear what this project is and is not

#

(added some scope clarifications in the doc i think are correct based on my understanding)

bitter zinc
#

That's great!

lone bison
#

Cool to see your work here @lethal storm!

You can also ask what they see as important to get right for the project. Challenge them to give their own views ^^

Interviews are useful for asking questions that can resolve uncertainties you still have about how to interpret their application.

It's also nice to get to know each other conversationally and talk about what the project is about. I like your points about scope and where they want to contribute to the team.

lethal storm
#

Chatted with Gideon. Seems like he was excited about the project. He has some experience working on supply chain research and would be most interested in question 3 (What choke points exist in the AI chip supply chain?)

bitter zinc
#

@formal oasis This is the project page!

bitter zinc
bitter zinc
#

@proper cosmos is interested in joining!

proper cosmos
#

Hi! Thanks Joep. Yes, this project is my 1st choice for AISC. I only got my application in last night, so it may not be on your radar yet. I had an interesting conversation with Joep this morning discussing the critical role that understanding the psychology and constraints of stakeholders plays when trying to create big mindset and policy shifts. Hopefully the white paper that this project creates can be both practically actionable and psychologically resonant for the people it is aimed at.

lethal storm
#

Chatted with Dominika. She brings a ton of experience including a PhD centered around international military/security/drone warfare and is most interested in the 2nd hand chips research question. She seems genuinely interested in our work and would bring valuable insights and a unique perspective to the team.

bitter zinc
lethal storm
#

Did the airtable list change recently? There's way more entries than I remember

lethal storm
#

Spoke with Jialu. Her background is a PhD focused around game theory and market theory. Meeting went well, I think she was more seriously considering the project after I pitched it to her (since it didn't seem to be her first choice initially)

formal oasis
#

Thanks for adding me to the project @bitter zinc . I am already enjoying the energy here. @lethal storm Thanks for the very quick action on speaking to interested candidates.

#

Looking forward to our call tomorrow.

bitter zinc
#

Deadline was sunday, so we should not expect new submissions.

formal oasis
#

Wanted to run an idea by everyone regarding the framing and breakdown of our project deliverables. Given that there is a fair bit of complexity, I propose we break it down into 2 analyses and 1 correlation.
Life of an AI chip: builds a map/life cycle of the actors and parties involved in the supply chain of a typical chip used for training AI.
Life of a policy: builds a map/life cycle of a typical policy of a similar nature.
Building a pause button: combines the two reports into an actionable framework for pause AI.

The rationale here is to have incremental outputs, such that the first two reports are predominantly fact-based and give us a satisfactory level of expertise on the subject matter to base our final output on. The last report then has the freedom to be a bit more aspirational.
It also gives us a positive reinforcement feedback loop :).
We may be able to find some existing work on the first 2 reports already.

Do we see any merits and/or issues with this approach?

lethal storm
#

I like the approach! Small clarification on the life of a policy though: my impression is that policies are so wildly varied that they may not lend themselves to a typical lifecycle. Are there any specific examples of policies that we should look into? (I'm no policy expert, just my 0.02c)

#

Also just spoke with Mitali, she's an undergrad based in California who's excited to join. She doesnt have much professional experience yet but has experience in software engineering and is super interested in learning more about the AI policy and regulation space.

bitter zinc
lethal storm
#

Small asynch updates from my end:

I read the meeting notes. Just so we don't duplicate efforts, I already got Dominika on board! She is committed to the project

#

Right now the 2 folks who are committed to joining are Dominika and Jiawei

#

@formal oasis happy to exchange their info if you'd like to schedule a follow up w/ them :]

#

@bitter zinc

#

Keep up the good work 🔥

formal oasis
#

Awesome work Jim. I was quite impressed with Dominika, so was hoping she would join. Do you have any notes on the calls?

proper cosmos
#

Hi guys. Not sure if I should be in this chat room as you are discussing candidates.

I listed this project as my first choice. I guess you are working your way through applicants, and I really hope we can have a chat/interview and that you will see some value in what I can bring as a policy-oriented social scientist/social psychologist to this particular project.

Let me know if it would be better if I was not in this chat room? I don't mind either way!

lethal storm
#

-Most interested in 2nd hand chip
-10 years of experience as phd/researcher on russo-ukraine drone warfare/role of AI in war
-Interested in military/security domain

#

Also I'm in conversation with Jiawei, she has some questions about the frequency and asynch/synchronous nature of meetings for this project. What are we thinking would work best for syncing up after kickoff in january?

formal oasis
formal oasis
#

This is a very detailed and well-structured analysis of the semiconductor supply chain. Granted, the data here would be a bit skewed since it probably covers a lot of low-end chips as well, but something like this for AI chips would already be a very good platform to springboard off of.
https://www.csis.org/analysis/mapping-semiconductor-supply-chain-critical-role-indo-pacific-region

This is the accompanying report, where they make some policy recommendations for geopolitical and trade related dynamics.
https://www.csis.org/analysis/securing-semiconductor-supply-chains-indo-pacific-economic-framework-prosperity

The Indo-Pacific region is critical to semiconductor manufacturing. This brief provides an analysis of the role the region plays in the global semiconductor industry across the various stages of the supply chain.

The Biden administration must build consistency between IPEF and the deployment of CHIPS funding. Squaring the circle on chips fund deployments and the administration’s IPEF strategy offers long-term strategic benefits.

formal oasis
#

Small asynch update from my end:
Spoke to Arthur Au. He is a first-year bachelor's student of politics and international studies, so definitely still in his early days. Gave him a thought experiment to ponder and asked him to get back in touch.
I am thinking we should maintain a short document of roles and the corresponding candidates?

bitter zinc
lethal storm
#

Happy thanksgiving to folks who celebrate!

bitter zinc
#

@hushed compass

hushed compass
#

Hi, I'd like to know how to make this practical in a world where it seems that at least some distributed training is possible.

bitter zinc
#

@restive flint is interested in joining! I've reached out, asked to meet.

bitter zinc
#

Lennart Heim from GovAI on high-bandwidth memory (HBM)

https://x.com/ohlennart/status/1863628586319585287

Yearly export control update just dropped, restricting high-bandwidth memory (HBM). HBM is critical for advanced AI accelerators, especially for deployment workloads with long context windows.
The goal? Stop the PRC from equipping their AI accelerators with HBM. 1/

formal oasis
formal oasis
#

Had a brilliant call with Ananthi. I believe she will bring much needed perspectives to the team.

bitter zinc
formal oasis
#

Had a call with Josh Thor. He would be a valuable asset as a research analyst.

formal oasis
#

Had calls with Dominika and Ricardo.
Dominika will certainly be an asset. I explained the structure of the project and gave her a week's time to ponder over and get back to me.
Ricardo is an ML Engineer, but also interested in policy. During the call, we landed on a discussion about ISO standards. I gave him a small assignment related to that to evaluate if he will enjoy the work (and hopefully also find an interesting dimension to explore for our project)

formal oasis
#

Just finished a call with Raymond. I am very confident he will do very well with this project. I have already invited him to join the team. @bitter zinc can you kindly add him to the project discord?

#

@bitter zinc I was thinking we should have a total of 8 - 9 people on the project. Is that OK? At the moment we have 3 domain experts (Chips, Policy, Society) and 3 Analysts (Jim, Raymond, Josh). I have also spoken to Nurshafirah and Ricardo and am waiting on their responses to my thought experiment questions to decide on their involvement. @lethal storm is there anyone I have overlooked? In your notes I see that you spoke to Jiawei and Mitali, but I don't see them on the list.

lethal storm
#

Yes! I spoke with Jiawei and Mitali. I'd be happy to direct them your way if you'd like to chat with them

#

By list do you mean airtable? I found them from the airtable

lethal storm
#

Plans to meet today? Didnt see anyone in the call

formal oasis
#

i'm waiting to be let in

#

Are you in the call already?

bitter zinc
formal oasis
#

@dusky wing this is where we talk about the Pause button project. Feel free to introduce yourself to the group 🙂

dusky wing
#

Hi all! I'm a student finishing up my undergrad in Cognitive Systems (CS + philosophy + psychology) at UBC in Vancouver, Canada. I learned about AI safety 2 years ago through 80,000 Hours and have been obsessed with existential risk from advanced AI ever since. I lead UBC AI Safety, a student group that aims to build the field of AI safety. I'd like to pursue a career in AI governance, so this is an excellent opportunity for me. I strongly believe in the idea of a pause on frontier capabilities and think this could be our best shot at survival. Thank you @formal oasis and the rest of the team for having me and I'm looking forward to working with you!

formal oasis
#

@dim vapor this is the public page for the pause button project

bitter zinc
#

Another threat vector we should take into account: analog computing. It can be extremely efficient for neural networks, and it requires a different supply chain

https://www.amadeuscapital.com/gemesys-secures-e8-6m-pre-seed-funding-to-transform-ai-on-edge-devices/?utm_source=chatgpt.com

Edward Norton

AI hardware startup GEMESYS has successfully raised €8.6 million in a pre-seed funding round led by the Amadeus APEX Technology Fund.

bitter zinc
formal oasis
formal oasis
# bitter zinc

Had not followed Mythic for a while (they seemed to have hit a few roadblocks). They were using in-memory compute, I believe. Another company called Untether is also using in-memory or near-memory compute. We would need to look at all the different computing paradigms for sure, including some other stealth-mode companies like d-matrix, although the bulk of our analysis would need to be on mainstream compute.

lethal storm
#

Taking my parents to see a christmas broadway show tmrw, apologies for not being able to make the meeting

lethal storm
#

See yall tomorrow

#

✌️

formal oasis
#

Happy New Year y'all.
Please welcome @opal warren Racardo Savii to the project. He will be joining our project as a research analyst. @opal warren please introduce yourself and tell us a little bit about your background in a couple of lines.
@dim vapor will also join us as research analyst. Please do introduce yourself to the team although I suppose you have been involved longer?

opal warren
#

Hello.

First, Happy new year!

I’m Ricardo, feel free to call me Rico. I began working as a Data Scientist 7 years ago, and later also took on the role of Machine Learning Engineer as I dedicated time to learning about software engineering and DevOps practices, and cloud infrastructure management.
My plan to help here is related to ISO/IEC 42001.

formal oasis
#

All, please welcome Mitali Mittal. She will join our project as a research analyst. @fluid tusk please introduce yourself and tell us about your background in a couple of lines.

fluid tusk
#

Hi everyone! I'm Mitali, a second year undergrad student at UC Irvine. I was a software engineering intern at Etsy last summer, and this summer I will be at BlackRock. I'm currently an AI Researcher for Humanity Unleashed, an open source AI policy generator. I'm so excited to join you guys on this project!

restive flint
#

I made a quick sketch to get an overview of the components of the Building the Pause Button project (multiple research questions came from the AI Safety Camp document: https://docs.google.com/document/d/14ZNsdajxnYrama6cTxgJYFBsy6xhOkTkeOyJgkSRa08/edit?tab=t.0). It helped me get a better sense of it. I also hope it helps clear up misunderstandings about what is and is not in the focus of the project, so we all have the same picture in mind.

manic oriole
#

Hi all! Sorry to barge in. I'm writing a prediction market question directly relevant to this project, which is meant to track the viability of enforcing an AI pause via compute supply chain governance.

Question for you: What are one or more good metrics and limits that would make the capability to create AGI very difficult to track and enforce? (e.g. a set amount of hardware cost, hardware + training cost, or FLOP per dollar.)

Here is a Google Doc with the full question draft, if you'd like to read it for context. Feel free to leave comments with feedback there.

lethal storm
#

Welcome Rico, Mitali!

stark tinsel
manic oriole
lethal storm
#

Would it be possible to move the project call forward or backwards a bit? It’s currently at 3am in Korea. I know we are super constrained due to all the timezone differences tho so it’ll be a challenge finding a schedule that works for everyone

dim vapor
#

Hii, guten tag everyone!

I’m Shafira, feel free to call me Sha, a grad student at UTM (Malaysia, GMT+8). I did AI Safety Camp research on AGI risk 2 years ago, and will soon finish BlueDot's AI Safety: Alignment course. I'm currently an independent AI researcher, applying to volunteer for this as Malaysia is on the rise in the semiconductor industry and AI data centres.

#

About timing if it's helpful, we can try https://www.when2meet.com/ to see the best overlaps

lone bison
#

BTW, opening weekend for people collaborating on projects at AI Safety Camp will be this weekend.

stark tinsel
lone bison
#

Thanks, I only see option to ping everyone in this "channel" though. Wonder who gets notified then?

stark tinsel
lone bison
#

@everyone, opening weekend for accepted teammates will be this weekend on Jan 11-12. Do join!

inner elm
#

Hello everyone! I'm Jon Khan, a researcher just coming out of a brief stint working at the Existential Risk Observatory investigating the dynamics of a possible long term pause. Previously, I did work in EA around nanotechnology. I've been asked to join the team and I'm really excited to meet all of you!

opal warren
formal oasis
restive flint
#

I added a section "Beginnings of a concrete implementation of building a Pause Button" to the Google Doc: https://docs.google.com/document/d/14ZNsdajxnYrama6cTxgJYFBsy6xhOkTkeOyJgkSRa08/edit?tab=t.0#heading=h.jhbvyzv92ihx

#

This includes a table with the components of the pause button, their high-level implementation, and research questions to further investigate the topic.

#

This is to more clearly formulate the previous sketch I made and more importantly to start as quickly as possible with getting concrete about what is needed and required as then we can gather feedback on it more easily.

#

Treat everything as a best guess. I might have made wild assumptions. But the key here is to open discussion. Another important goal of the table is splitting the implementation of the pause button in components with defined boundaries, so that it is easier to split up the work among us.

#

Any feedback is welcome.

#

@bitter zinc I am planning to put part of this also on the website after gathering feedback. Instead of a table I will most likely just use headings and leave the in-scope and out of scope sections out of it since that is more for us internally. What do you think?

restive flint
# restive flint Any feedback is welcome.

Especially feedback on the in-scope and out-of-scope sections, as I have quite some uncertainty there and it will have consequences for what we are going to research in the end.

#

One important thing to note: in this proposal I also added on-chip governance measures (chip tracking and self-destruct chips) without knowing if they are feasible. That being said, I do believe that for a water-tight implementation of the pause button, chip tracking and self-destruct capabilities might be necessary.

opal warren
split citrus
#

Did anyone mention non-training run capabilities research here? I think shooting for just a pause on training for now is prudent, but afaik we discover latent capability for a long time after these models come out.

I'm getting really nervous about chain of thought, especially as it moves out of text and into latent spaces that humans can't really understand. Another worrying development I've seen is multiple asynchronous cores to build a system. I lump those and others under the umbrella term "composition". I think composition and algebra / vector manipulation can significantly extend the task domains and capabilities that these models can be applied to. Unfortunately, they also require massively less hardware power than a full training run. I hope this isn't a sign we are leaving the era when checking large server farms is sufficient for a pause on AI capabilities...

Just wondering if it's something that's on the radar?

restive flint
restive flint
#

I updated the Google Doc (still in the section: Beginnings of a concrete implementation of building a Pause Button) and replaced the table with a clearer outline. I divided it into prerequisites for building a pause button and the actual mechanism, and divided the prerequisites into what is needed to make it work, verification, and enforcement.
A quick summary here if people want to comment
Prerequisites

  1. Frontier labs can only acquire a million H100-equivalents in total
  2. Beyond that limit, frontier labs are only allowed to buy chips that can detect if a large model is trained on them and can self-destruct if necessary.
#

Pausing mechanism

  1. There is an agreement on what the conditions for self-destruction and other enforcement measures are. (when to push the button)
  2. There is a way to detect if these conditions are true for the frontier labs that possess these chips at regular enough time intervals. (trigger for pushing it)
  3. There is a mechanism that certain actors can activate these enforcements when the conditions are true. (how to push the button)
#

Other possibilities for prerequisites and pausing mechanism

  • Only focus on the maximum number of GPUs (and not on on-chip governance). Then we don't really have the physical button but might successfully prevent building gigawatt-scale data centers.
  • Not letting them acquire any chips beyond a certain amount (even self-destruct chips) at all. So also applying a limit to the amount of self-destruct chips.
  • What should this limit be?
  • Maybe we need to restrict other things as well instead of only hardware?
  • What about state actors? I have focused on frontier labs so far, but maybe other actors could be dangerous as well.

Let me know if you have any thoughts on this!
split citrus
# restive flint It is somewhat on my radar but I haven't looked into it but I agree this seems w...

Unfortunately I haven't seen articles talking about this issue, and haven't had time to see if there are any out there. I started writing "Composability and Alteration Overhang" to express my ideas. Hopefully you can glean where my thinking is from just the skeleton. If anyone knows people expressing similar ideas please lmk. Also, please bother me to write the whole thing if you are interested.

Capabilities of current models "with 'just' more complicated inference", huh? Hmm... I'll use the theoretic point where an AGI begins to RSI sufficiently that it outperforms all of humanity as a reference point called "the doom point". The following hypotheticals are absurd, but used as thought / intuition tools. In each scenario I'll give the "ever" chance, the probability that in that world state we ever get to the doom point, followed by the "in 2025", "by 2040, 15 years work", and "longer than 15 years" probabilities. I'm just putting probabilities from my gut feeling, they might be different if I did this again tomorrow, but should at least be within a log order of magnitude or two.

  • If humanity was joined as a whole with the goal of using current models to create AGI:
    • ever: 80%, this year: 10%, by 2040: 50%, later: 20%
  • If humanity was vaguely annoyed but did nothing to stop OpenAI / Anthropic / etc., and OpenAI was internally focused on pursuing this goal:
    • ever: 60%, this year: 5%, by 2040: 35%, later: 20%
  • As above, but with OpenAI being conflicted, with many researchers thinking this is a bad idea and an ethics board continuing to interfere.
    • ever: 20%, this year: 4%, by 2040: 10%, later: 6%
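
A quick sanity check on how these numbers fit together, as a minimal sketch. It assumes that "this year", "by 2040" and "later" are meant as non-overlapping time windows (so "by 2040" excludes this year); that reading is an assumption, not something stated above. The scenario labels and helper names are made up for illustration.

```python
# Gut-feeling estimates from the list above, in percent.
# Tuple order: (ever, this_year, by_2040, later)
scenarios = {
    "humanity united": (80, 10, 50, 20),
    "OpenAI focused": (60, 5, 35, 20),
    "OpenAI conflicted": (20, 4, 10, 6),
}


def is_consistent(ever, this_year, by_2040, later):
    """True if the three time windows add up to the 'ever' estimate."""
    return this_year + by_2040 + later == ever


def share_by_2040(ever, this_year, by_2040, later):
    """Fraction of the 'ever' probability that falls in 2040 or earlier."""
    return (this_year + by_2040) / ever


for name, estimate in scenarios.items():
    print(name, is_consistent(*estimate), round(share_by_2040(*estimate), 2))
```

Under that reading, each scenario's three windows add up exactly to its "ever" number, and in all three roughly two thirds to three quarters of the total risk falls in 2040 or earlier.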
subtle nexus
#

Hi everyone: I'm an independent researcher, trained in social/ political science now affiliated with the Atlantic Council in Washington D.C. My research covers military applications of emerging and disruptive technologies and their impact on international security, the character of warfare, and transatlantic defense cooperation. I am currently collaborating with NATO's Office of Chief Scientist on military AI projects. Looking forward to meeting you all!

formal oasis
#

@subtle nexus @opal warren @bitter zinc @dim vapor just a reminder that we will be having our project kickoff meeting tomorrow(Sunday). Please check your email for the meeting invite. (7am PST)

opal warren
formal oasis
bitter zinc
#

Davidad on FlexHEGs (hardware-enabled compute governance):

how can you assure compliance to agreements using on-chip governance

https://youtu.be/MPrU69sFQiE?si=nVzLbmynyHtQ2MP-&t=3160

David "davidad" Dalrymple joins the podcast to explore Safeguarded AI — an approach to ensuring the safety of highly advanced AI systems. We discuss the structure and layers of Safeguarded AI, how to formalize more aspects of the world, and how to build safety into computer hardware.

You can learn more about David's work here:

https://www.ari...

restive flint
# restive flint I added a section "Beginnings of a concrete implementation of building a Pause B...

Ok, to not crowd the main Google Doc, and since more people are going to work on their own ideas for an implementation of a Pause Button, I made a separate Google Doc, initially only for myself. I expect it will at some point get merged with other ideas into the main Google Doc, but if people have comments or want some inspiration: https://docs.google.com/document/d/1U9DbQGLf1JotHwQG2REfovNDOB3erxrJXuwPLPTMUlg/edit?usp=sharing

proper cosmos
#

Thanks Ray - helpful to have a separate doc for feedback/additions

bitter zinc
#

Working on the supply chain visualisation...

bitter zinc
#

Includes compute governance, chip controls. Aims to regulate which countries get how much compute. Also governs how AI model weights are shared.

restive flint
#

In the main Building the Pause Button Google Doc: https://docs.google.com/document/d/14ZNsdajxnYrama6cTxgJYFBsy6xhOkTkeOyJgkSRa08/edit?pli=1&tab=t.0
I added a resources section at the very bottom with a link to this Google Doc: https://docs.google.com/document/d/1ZMcJI__hRgDjc-L4tN1Z3Z9cYkKVFS4mlUyp_B4L1uc/edit?tab=t.0
There I collected a list of resources. Please feel free to add without asking. You can add (added by: name) if you are really unsure.

bitter zinc
proper cosmos
restive flint
dim vapor
restive flint
#

Some additional papers I found worth looking into
Risk thresholds for frontier AI: https://arxiv.org/pdf/2406.14713 (GovAI)
Training compute thresholds: https://arxiv.org/pdf/2405.10799 (GovAI)
An FDA for AI https://ojs.aaai.org/index.php/AIES/article/view/31633/33800 (Harvard University)
Mechanisms to Verify International Agreements About AI Development https://intelligence.org/wp-content/uploads/2024/11/Mechanisms-to-Verify-International-Agreements-About-AI-Development-27-Nov-24.pdf (MIRI)
I added all the relevant papers sorted by category in the overview doc I just created, see previous message.

opal warren
#

Hello team. Sorry I won't be able to make it to today's meeting. I'm working on that ISO reading ideas and initial draft. I hope I can share it here with you by Monday.

bitter zinc
#

Seems like the "DeepSeek trained on 40 h100s" might be a lie, true number could be 1000x higher! This is good news for feasibility of compute governance, although it shows that current export controls are not enforced properly.

https://x.com/kimmonismus/status/1882824571281436713?t=Pd_LdQ7Dd030NFrdfTPEvA&s=19

Billionaire and Scale AI CEO Alexandr Wang: DeepSeek has about 50,000 NVIDIA H100s that they can't talk about because of the US export controls that are in place.

dusky wing
#

Listened to this podcast today. Eliezer's main policy recommendation, failing the "shut it all down" strategy:

Create a symmetrical treaty

  • Emphasis on symmetrical: it should give no one actor an unfair advantage
  • The treaty specifies that there will only exist a limited number of data centers worldwide
  • All the AI training hardware is restricted to these data centers
  • Each data center has a designated observer from each of US, UK, China, Canada, etc.
  • There are strict rules on which jobs are allowed to be run in these data centers
  • Every job is logged and efforts to evade these restrictions are punished harshly (e.g. big fines, blacklisted from ever using the data center again)
  • These data centers are preferably air gapped from the internet

Not sure if he's written these ideas down anywhere but I thought they were interesting.

fossil grove
proper cosmos
formal oasis
#

My thoughts on Deepseek.

  • It is not just the success of DeepSeek, but rather the success of open-source models. The total compute they used is probably the accumulation of Llama + some labs in China + DeepSeek. We would have to think of a special policy mechanism that applies to using pretrained open-source models.
  • The only way to avoid an arms race is if the policies we propose are "symmetrical" and do not favor just one party as @dusky wing pointed out.
dusky wing
proper cosmos
restive flint
#

I got some good comments from Otto both in messages and also in the Google Doc Draft structure Pause Button. I think by now this document is sufficiently outdated to not be so useful anymore to comment on as my views changed quite a bit.
A summary of those in my own words

  • Preventing large-scale datacenters that can have 1e28 FLOPS from being built might not be enough. Models are quickly getting smaller and cheaper to train.
  • Might also want to look into if you can detect large inference instead of only training.
  • For preventing large datacenters, support is more of a problem than research (if Trump were on board, he would simply not have done the 500 billion Stargate project).
  • We want to make, for different compute thresholds, a plan for what would be needed to stop that amount of compute from being run. Stopping 1e30 FLOPS might be super easy, while 1e15 might be nearly impossible. Covering everything in between gives a clearer sense of what to do when.
  • Most radical implementation for maybe the 1e15 axis would be only allowing hardware as advanced as the 90's and destroying all other GPUs. And at 1e30 current regulation might be enough and everything in between.
  • You can also have (kind of) a pause by having one actor having a monopoly and preventing all other actors from building AGI although this is pretty risky and prone to abuse of power. Anthropic seem to be kind of pursuing this scenario.
  • Having chips that might only connect with say 10 other chips might be an interesting proposal to restrict amount of compute.
  • It might not be possible to have a chip that can reliably detect if an LLM is trained or not on that chip. So automatic self-destruct things might not work. Then chip-regulation is mostly to not run the model at all and only after a certain certificate the hardware will be able to perform any compute. Although this seems also somewhat risky to me as the barrier of what is possible seems prone to shift.
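
To make the threshold idea above a bit more concrete, here is a rough back-of-the-envelope sketch. It assumes the thresholds are read as total training operations (FLOP), which the summary leaves ambiguous, and it uses a round 1e15 FLOP/s per H100-class chip and 40% utilization. Both numbers, and the function name `chips_needed`, are illustrative assumptions rather than figures from the discussion.

```python
# Rough sketch: how many accelerators does a training-compute threshold imply?
# All constants are illustrative assumptions, not measured values.

H100_PEAK_FLOP_PER_S = 1e15   # assumed round number for one H100-class chip
UTILIZATION = 0.4             # assumed fraction of peak achieved during training
SECONDS_PER_DAY = 86_400


def chips_needed(total_flop: float, days: float) -> float:
    """Number of H100-class chips needed to perform total_flop within `days`."""
    flop_per_chip = H100_PEAK_FLOP_PER_S * UTILIZATION * SECONDS_PER_DAY * days
    return total_flop / flop_per_chip


if __name__ == "__main__":
    for threshold in (1e24, 1e26, 1e28, 1e30):
        chips = chips_needed(threshold, days=100)
        print(f"{threshold:.0e} FLOP in 100 days: ~{chips:,.0f} chips")
```

Under these assumptions, each factor of 100 in the threshold means a factor of 100 in the number of chips that must be coordinated, which is one way to see why high thresholds are easier to police than low ones.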
restive flint
#

Today I read the paper https://arxiv.org/pdf/2404.18308 (instead of the MIRI one I was planning to read), because this one seemed really interesting.
It proposes an offline licensing mechanism that could be used to prevent unregulated training of potentially dangerous frontier AI models.
You need a license on each chip in order to run it for X clock cycles, and after that you need to renew the license.
**The important part** is that this offline licensing can be implemented by a firmware update. That means no new hardware is needed. It works on all chips that have at least:

  • Firmware verification
  • Firmware rollback protection
  • Secure non-volatile memory

NVIDIA H100 chips have all these things.

From the paper: "The needed development time is difficult to accurately estimate without more details about the chips, however a rough estimate is that a firmware-only proof of concept could be developed in 3 months and a more robust version could be developed in 6 months. "
In contrast, I often hear numbers like 2 to 3 years for new hardware chips.

And also it could be applied retroactively to already owned chips. Also from the paper: "For chips that have already been sold, government regulation compelling a firmware update could be effective if possible. If this is not possible, chip owners could be incentivized to comply by making this a requirement in order to retain their BIS “presumption of approval” status for future chip purchases."

This firmware-based approach could be made fairly tamper-proof according to the paper (details in the paper), although some more complicated hardware tampering might still be possible. This could be complemented with occasional on-site checks.
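
To illustrate the general idea of a cycle-limited license (not the paper's actual design), here is a minimal sketch. It assumes a shared HMAC key for brevity, where a real scheme would presumably use asymmetric signatures checked by verified firmware. The names `issue_license`, `ChipFirmware` and `REGULATOR_KEY` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret, for illustration only. A real design would
# likely use public-key signatures verified by the chip's firmware.
REGULATOR_KEY = b"demo-key-not-for-real-use"


def issue_license(chip_id: str, serial: int, cycles: int) -> dict:
    """Regulator side: sign a grant of `cycles` clock cycles for one chip."""
    payload = f"{chip_id}:{serial}:{cycles}".encode()
    tag = hmac.new(REGULATOR_KEY, payload, hashlib.sha256).hexdigest()
    return {"chip_id": chip_id, "serial": serial, "cycles": cycles, "tag": tag}


class ChipFirmware:
    """Chip side: stands in for firmware backed by secure non-volatile memory."""

    def __init__(self, chip_id: str):
        self.chip_id = chip_id
        self.last_serial = 0    # persisted counter: blocks replay of old licenses
        self.cycles_left = 0

    def install(self, lic: dict) -> bool:
        payload = f"{lic['chip_id']}:{lic['serial']}:{lic['cycles']}".encode()
        expected = hmac.new(REGULATOR_KEY, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, lic["tag"]):
            return False        # forged or altered license
        if lic["chip_id"] != self.chip_id or lic["serial"] <= self.last_serial:
            return False        # wrong chip, or a license that was already used
        self.last_serial = lic["serial"]
        self.cycles_left += lic["cycles"]
        return True

    def run(self, cycles: int) -> bool:
        """Refuse work once the licensed cycle budget is exhausted."""
        if cycles > self.cycles_left:
            return False
        self.cycles_left -= cycles
        return True
```

For example, after `chip.install(issue_license("gpu-0", 1, 1000))`, a call to `chip.run(600)` succeeds but a second `chip.run(600)` is refused until a new license with a higher serial number is installed. The serial counter plays the role that rollback protection and secure non-volatile memory play in the list above.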

#

@formal oasis (or other people): do you have any idea how quickly a new hardware design can be on the market? Is that indeed 2 to 3 years? Because in that case we definitely need an intermediate solution like this firmware-based approach I am investigating.
And another question: do you think any kind of bandwidth limiting, or limiting the communication of a GPU to only X other GPUs, could also be implemented with only a firmware update?

formal oasis
#

the standard design cycle for a lot of tech companies is 1 year

#

for cautious companies, it can be 2 years

#

Functionality like on-chip governance is not complicated at all, so getting the hardware out is a 6-month process. However, because it will be part of larger chips, the timing will depend on the design cycle of that chip.

bitter zinc
#

https://x.com/mustafasuleyman/status/1886561570286920047?t=VODN9Q7aRtyimneAAhfBFw&s=19

We already had an idea this was going on, but now it's coming from Suleyman... If we focus only on training hardware, we're missing out.

If you're not already paying attention to this shift, you should be: the balance of compute is moving from pre-training to inferencing. We're seeing massive gains here from scaling up test-time compute, with no ceiling in sight

bitter zinc
#

https://youtu.be/3T4FpRm2uwk?si=n4HZzr3HU4NqIptX

Another choke point! Photoresist

Thanks to an anonymous viewer at TSMC for recommending this topic.

hushed compass
#

Looks like there's an opportunity with David Sacks who has been talking about on-chip governance

formal oasis
dusky wing
dusky wing
#

This may be relevant to us: new DeepMind paper on distributed training runs

we show experimentally that we can distribute training of billion-scale parameters and reach similar quality as before, but reducing required bandwidth by two orders of magnitude.

restive flint
unique hawk
# bitter zinc Seems like the "DeepSeek trained on 40 h100s" might be a lie, true number could ...

Deepseek don't have 50k Hopper ($1B) because they can't afford more than $100M.

Liang said publicly they don't get any outside investment. Liang owns like 80% of DeepSeek, so the money comes mostly from his profit in High-Flyer.

High-Flyer Capital has 8B in assets under management and averages a 13% return per year; under a standard hedge-fund 2-and-20 structure, that yields about 400M max per year for the entire hedge fund, and at MOST 100M to him every year. Most funds under High-Flyer have been losing money, likely due to the government crackdown, so it is likely a small fraction of 100M, like 40M. Therefore there is no way they can afford the $1.6B suggested by SemiAnalysis. https://t.co/xXyAUVseAU
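
The "like 400M max" figure can be reproduced with a quick calculation. This sketch assumes "2-20" means a 2% management fee on assets plus a 20% performance fee on the year's returns, with no hurdle rate or high-water mark. The function name `two_and_twenty_fees` is made up for illustration.

```python
def two_and_twenty_fees(aum: float, annual_return: float) -> float:
    """Yearly fee income under an assumed plain 2-and-20 structure."""
    management_fee = 0.02 * aum                    # 2% of assets under management
    performance_fee = 0.20 * annual_return * aum   # 20% of the year's gains
    return management_fee + performance_fee


if __name__ == "__main__":
    fees = two_and_twenty_fees(aum=8e9, annual_return=0.13)
    print(f"Total yearly fees: ${fees / 1e6:.0f}M")
```

With 8B under management and a 13% return this gives 160M + 208M = 368M per year, consistent with the rough 400M ceiling above.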

China's takes are better than yours

#

I guess my conclusion is that lots of compute governance techniques may not be as useful, since more and more compute now goes to inference instead of pre-training. And inference can be done in a very decentralized way; it also doesn't need to be on Nvidia chips.

#

Idk about the West, but financial intelligence might be a more effective way for the gov to monitor this in China. Reasons:

  1. Crypto is banned in China
  2. Most smuggled Nvidia chips in China are paying import tax
formal oasis
dim vapor
#

Saw this (and the ISO point in it), which reminded me of @opal warren: https://forum.effectivealtruism.org/posts/CwQs8tKbEqAjprhb2/thoughts-about-policy-ecosystems-the-missing-links-in-ai. The ISO mentioned here is more for CSR, but the author's framework kinda validates how we are approaching our research (we have a Technical & Governance approach, as does the framework mentioned here).

unique hawk
#

I just tried it on Huawei Cloud, and it works

lone bison
#

Ben Harack is a hardware engineer who does research on hardware verification for compute controls.

You should reach out to him. He said in a call with me before that he's up for meeting with the Pause Button team!

He's about to publish this paper:

formal oasis
dim vapor
formal oasis
#

thanks 🙂

bitter zinc
lethal storm
#

sorry all, have a work meeting conflict today

#

will catch up w/ notes

restive flint
#

An update on the technical aspects of the pause button proposal. Feel free to add any feedback. I focused on summarizing all the information on the two approaches that seem to me the most promising. Namely,

dusky wing
bitter zinc
bitter zinc
bitter zinc
bitter zinc
#

There's a discussion on Reddit (/r/localllama) where they're disappointed about the Nvidia Digits hardware having just 273 GB/s memory speed. This is interesting to BtPB, because it shows how important high-bandwidth memory is right now. Even Nvidia fails to give their new LLM compute unit the HBM it needs!

The production of HBM3 and HBM3E, which are used in AI accelerators, GPUs, and HPC applications, is dominated by:

  • SK Hynix – The market leader in HBM production, supplying Nvidia with HBM3 and HBM3E.
  • Samsung – A strong competitor working to secure Nvidia’s and other AI companies’ contracts.
  • Micron – The third major player, ramping up HBM3E production in 2024 to compete with SK Hynix and Samsung.

I think I agree with @formal oasis that this is a very interesting bottleneck
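
A rough way to see why memory bandwidth matters so much for LLM inference: when generating one token at a time, the chip has to read (roughly) all model weights from memory per token, so bandwidth puts a ceiling on tokens per second. The sketch below assumes a hypothetical 70B-parameter model with 8-bit weights, ignores batching and caches, and uses approximately 3.35 TB/s as the H100 SXM figure for comparison. The function name is made up for illustration.

```python
def max_tokens_per_second(bandwidth_bytes_per_s: float,
                          n_params: float,
                          bytes_per_param: float) -> float:
    """Upper bound on single-stream decoding speed if every token
    requires reading all weights from memory once."""
    return bandwidth_bytes_per_s / (n_params * bytes_per_param)


if __name__ == "__main__":
    model = dict(n_params=70e9, bytes_per_param=1.0)   # assumed 70B model, 8-bit weights
    digits = max_tokens_per_second(273e9, **model)     # 273 GB/s, from the Reddit thread
    h100 = max_tokens_per_second(3.35e12, **model)     # approximate H100 SXM HBM3 bandwidth
    print(f"273 GB/s:  ~{digits:.1f} tokens/s")
    print(f"3.35 TB/s: ~{h100:.1f} tokens/s")
```

Under these assumptions the 273 GB/s device tops out at about 3.9 tokens per second for such a model, more than ten times slower than an HBM-equipped accelerator, which is why HBM supply looks like a meaningful bottleneck.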


bitter zinc
# bitter zinc https://www.nationalsecurity.ai/chapter/executive-summary by dan hendrycks. Call...

Especially this chapter is relevant:
https://www.nationalsecurity.ai/chapter/nonproliferation

It suggests:

  • Export controls. Record keeping to keep track of chips. Use tamper-evident cameras to prevent smuggling. Have in-person compliance visits to check if the number of chips still matches. Use satellite imagery to find other datacenters. Decommission chips that are no longer used (like in nuclear non-proliferation).
  • Firmware features. Geofencing: Measure signal delays from landmarks / trusted servers. Licensing: regular signatures to continue operation.
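
The geofencing idea in the list above rests on a simple physical bound: a signal cannot travel faster than light, so a measured round-trip time to a trusted server caps how far away the chip can be. Here is a minimal sketch of that check. The function names, the example coordinates and the use of the vacuum speed of light (a deliberately generous bound) are my own assumptions, not details from the chapter.

```python
import math

SPEED_OF_LIGHT_KM_PER_S = 299_792.458
EARTH_RADIUS_KM = 6371.0


def max_distance_km(rtt_ms: float) -> float:
    """Farthest the chip can be from a landmark given the round-trip time."""
    return SPEED_OF_LIGHT_KM_PER_S * (rtt_ms / 1000.0) / 2.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def claim_is_plausible(claimed: tuple, landmark: tuple, rtt_ms: float) -> bool:
    """True if the claimed location lies within the delay bound of the landmark."""
    return haversine_km(*claimed, *landmark) <= max_distance_km(rtt_ms)
```

For instance, a 10 ms round trip bounds the chip to within roughly 1,500 km of the trusted server, so a claimed location 10,000 km away would be rejected. Note the asymmetry: a chip can add delay to appear farther away, but it cannot appear closer than it really is.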

why does the paper not link to other works?

Chapter 5: Nonproliferation. Rapid advances in AI are beginning to reshape national security. Destabilizing AI developments could rupture the balance of power and raise the odds of great-power conflict, while widespread proliferation of capable AI hackers and virologists would lower barriers for rogue actors to cause catastrophe.

formal oasis
bitter zinc
#

Interviewed James Patrie. Here are my (sloppy) notes. Some takeaways:

  • Firmware-based approaches are mostly effective against non-state actors. You won't stop China for long.
  • But Nvidia is investing a lot in chip boot security; they apparently have quite a few videos on that topic on YouTube!
  • FlexHEGs are very difficult to make and will take years to develop
#

Interesting thread on China's best AI chip: https://x.com/ohlennart/status/1899488375574278336

TL;DR: It's illegally produced by TSMC, not a real Chinese supply chain. They have HBM2E stockpiled, which leads to low memory bandwidth, which is not good for AI performance.

Huawei's next AI accelerator—the Ascend 910C—is entering production. It's China's best AI chip.
Thanks to backdoor sourcing, we could easily see 1M H100-equiv this year.
Here’s what we know about its performance and strategic implications. Spoiler: selectively competitive. 1/

lethal storm
#

Opening my section of the paper up for comments!

#

and feedback

topaz dirge
# bitter zinc THere's a [discussion on reddit (/r/localllama)](https://www.reddit.com/r/LocalL...

So the announced Nvidia Vera/Rubin plan seems to suggest that tying more GPU and CPU management cores together directly via NVLink and successors means there's less need for copies of the same data across cores. The bandwidth tech is indeed becoming very important.

https://www.tomshardware.com/pc-components/gpus/nvidia-announces-rubin-gpus-in-2026-rubin-ultra-in-2027-feynam-after

That's definitely Jeremie's take, anyway. 26:29 or search Rubin in the https://lastweekin.ai/p/lwiai-podcast-204-openai-audio-rubin transcript.

restive flint
#

A new organization, TamperSec, is basically trying to build FlexHEGs, if I understood their mission correctly. And they do refer to that paper in the first link. And since it comes out of Catalyze Impact, I assume the motivation for developing this is reducing existential risk via compute governance.
https://www.catalyze-impact.org/post/introducing-11-new-ai-safety-organizations-catalyze-incubation-program-cohort-winter-2024-25#viewer-0xor1170
They have a website although that is not more than a template at the moment: https://tampersec.com/
The founder is Jonathan Happel.
They are also hiring for Electronic Engineers, Embedded Systems Engineers and a Business Development Lead
https://tampersec.com/careers/electronic-engineer
https://tampersec.com/careers/embedded-systems-engineer
https://tampersec.com/careers/business-development-lead
@formal oasis maybe a new job for you 🙂
Edit: The application seems already closed, nevertheless might be worth getting in touch with them.

formal oasis
dusky wing
lone bison
#

Hey, just a reminder here too that final presentations for AISC are next Saturday the 19th.

The default option is for you to give a 10 minute talk during one of the AISC lightning talk sessions at MAISU. You're scheduled in the Stop/Pause block, which starts at 10am ET. Does this work for you?

We recommend choosing one person to present for your team, as it's hard to keep it within the time limit otherwise. If you want to give a longer talk, you are welcome to schedule a session to yourself at any time during MAISU.

mortal steeple
#

I think the off button should be multidimensional. For example, take an autonomous agent on ETH that could be connected to a decentralised LLM model with over 150kk LLM model, and we don't want to see it go crazy. Off buttons could be internal, held by developers, by the LLM model, and by the value of the crypto framework. Resetting the crypto base token could be an additional measure to stop the development of cooperating AI. That's why I guess ETH or its forks could be better suited for letting autonomous agents work: such systems allow resetting winnings if an agent wins too much, takes liquidity from the fiat banking system, or otherwise wins out across all its actions. A multidimensional off button would leave an autonomous agent uncertain about which direction it could be switched off from, and this could create an environment where its behaviour needs to be more sustainable and respectful of others. What do you think of this hypothesis?

stark tinsel
restive flint
topaz dirge
#

Were lightning talk presentations captured and published anywhere? Or other project outputs? https://www.aisafety.camp is not updated yet

lone bison
#

Yes, we recorded the lightning talks. I'm just splitting them out and putting them up as private YouTube videos. Until they're linked to our website, you can download from this link: https://www.icloud.com/iclouddrive/0bfRYaJsHMXX1uvKxlpRkEcmw#(AISC_project_4)_Building_the_Pause_Button_-_A_Proposal_for_AI_Compute_Governance_

#

Robert at AISC will also put out this team's and all the other teams' project summaries on our site by the end of this month.

lone bison
bitter zinc
restive flint
#

I didn't realize it, but ControlAI also wrote about how they think AGI can be prevented from being built for the next 20 years, which feels very similar to what the pause button is trying to accomplish: https://www.narrowpath.co/phase-0

How to secure our future

unique hawk
#

I’d like to raise a potentially missing point: Huawei’s Ascend chips currently lack the advanced confidential computing features found in Nvidia GPUs used in FlexHEG. Due to existing U.S. sanctions, even if Huawei begins now, it could take 3–5+ years to reach usability—if at all.

As Huawei’s chip capabilities grow, this gap won’t remain trivial. Verification regimes relying on one-sided methods or GPU tracking may face pushback.

If anyone has hardware expertise in confidential computing or on chip verification, I’d welcome a discussion—or a joint project—to explore a minimum viable verification approach. One that’s not perfect, but technically and politically feasible: violations would be possible, but costly or difficult enough to deter until robust verification is in place.

Background: I’m co-founder of a Singapore-based AI governance nonprofit focused on China–West relations.

Contact: [email protected]

restive flint
restive flint
#

They also have a website but not all links are working there: https://flexheg.com/

We are an open-source R&D community building next-generation software and hardware that provide trustworthy assurance for AI.

restive flint
#

The team of the AI Safety Camp project of Building the Pause Button has published their paper: https://arxiv.org/abs/2506.20530 !

#

For anybody interested in getting involved in research on building the pause button and coming up with concrete policy proposals that can help us pause AI:
Next week we will kick off with an online meeting with a team of new people who got excited after PauseCon. DM me for more information.

topaz dirge
#

I deliberately didn't read/consult on this until after it was done, so I could try to be objective.

I am relatively impressed by how well this has turned out. It covers many specific details but also defines a framework that can accommodate further ones.

It's dry, but not inordinately so. Seems pitched right for serious consideration by serious people.

#

I advocate stronger mutual links between this report and Control AI's "A Narrow Path". The latter is longer and more ambitious, but its phase zero lacked some detail that this report usefully fleshes out.

#

The missing piece of the report, for me, is how it would deal with deliberate defection by a player of power. To be flippant: we don't care if someone owes large fines and wrist slaps after they end humanity. A sufficiently powerful party with different risk assessments and/or willingness to make bets can say "if this works, I rule the world, and no punishment is enforceable."

That's OK. We hope for widespread "the only winning move" consensus. Talking explicitly about sufficient enforcement starts other distracting conversations. And efforts (such as MAIM) do exist.

But I thought it was worth calling out: I think it will not be an unusual reaction from some readers.

topaz dirge
#

(You could probably do with a pinned post in this thread.)

I know @bitter zinc kicked this project off, but who is the ongoing lead?

Anyway: I understand you do plan a microsite, and that makes a lot of sense.

I lost my impulse control this morning and finally registered pausebutton.ai for a couple years - lmk re access and set up.

restive flint
# topaz dirge (You could probably do with a pinned post in this thread.) I know <@21123828752...

@topaz dirge Thanks for your kind words and feedback. I do agree that we should write more about deliberate defection by a player of power. Note that we do assume that both the US and China are broadly on board. But even given that, of course there could be powerful players who still decide to defect.

I also agree that, in hindsight, I would have liked to be more deliberate about which gaps in "A Narrow Path" we are trying to fill.

Thanks a lot for registering pausebutton.ai; that is a pretty good name. I currently just use Vercel to quickly host drafts of it. But when we are finalizing it, I would say we definitely want to move it to a proper website like pausebutton.ai.

Regarding the ongoing lead: since the AI Safety Camp ended and people are moving on, I decided to kick off (on Tuesday) a new phase of this project (not the website but additional research) with some people who expressed interest after PauseCon.

topaz dirge
restive flint
#

Do people here have opinions on who the primary audience for the pause button website should be?

  • PauseAI members to see what we are doing as PauseAI
  • Policy-makers to convince them this is feasible
  • As a way for field-building to get other people working on it

Ideally all three are served in some way in my opinion but initially it might be useful to have a focus. How would you rank them in terms of priority and why?

Additionally, should the focus be more on:

  • Explaining it in the clearest and most understandable way
  • Making the website as actionable as possible
  • Having the most solid argument for why this is important/feasible

Of course, all might be achieved to some degree. It is more about where the focus/priority lies.
bitter zinc
# restive flint Do people here have opinions on who the primary audience for the pause button we...

I have some takes on this:

  • The primary target audience is people interested in concrete policy measures: politicians, or people who do volunteer lobbying towards politicians. The goal of building the pause button is to show a clear path towards a pause. We should show why it's feasible, and what it takes. Ideally the deliverable is a "how 2 pause for dummies", a clear guide for politicians to follow.
  • Of these, a focus on clear, understandable explanations is the most important. Losing someone's attention can be a disaster (literally), so we need to keep it. Having clear, easy-to-communicate memes about our solutions is also essential, so more people understand what needs to happen.
tiny pagoda
#

Toby Ord on the era shift from pre-training scaling to inference scaling: https://overcast.fm/+ABBPwhbFHfQ/41:07

#

41:07 in case the timestamp doesn't work

lone bison
winged ivy
stark tinsel
formal oasis