Response Synthesis — AI Safety & Governance Discussion

Responses received

8 responses received

David Manheim 2026-04-21
Lorenzo Pacchiardi 2026-04-17
Zhuoran Du 2026-04-17
Uma Kalkar 2026-04-17
Pía Garavaglia 2026-04-16
Anirudh Tagat 2026-04-16
Florian Habermacher 2026-04-16
Andrew Kao 2026-04-15

Synthesis generated 2026-04-24 13:46 UTC

Synthesis — generated by Claude

Individual responses (8)

David Manheim — field_specialist 2026-04-21

Research focus and pivotal questions

Very fast turnaround on forum posts, and possibly peer review of documents that are not normally peer reviewed (model cards, technical reports from think tanks, etc.).

Staying timely

Fast track with 1+ evaluators: send invites to more than one reviewer and move forward as soon as one evaluation is submitted.

Who to bring in

Unsure

Technical AI safety expansion

No, very much opposed to trying to compete on this.

Ord evaluation — pursue?

Meh, still not sure we have a good place for a take on this.

Others to involve

GovAI folks, maybe BlueDot?

Availability

Prefer async; at ISO conference the full week of Apr 21.

Lorenzo Pacchiardi — ai_researcher 2026-04-17

Research focus and pivotal questions

I agree that the AI-economics interface (e.g., impacts on the labour market [Anthropic's work] or macro-trends à la Ord) seems the most relevant area for the Unjournal to focus on.

Staying timely

What about "pre-booking" evaluators' time before deciding which paper to review, so that you can choose a paper and be sure that someone will be able to look at it in a timely manner? (Of course, this depends on having reviewers who are flexible enough in the topics they can look at, but maybe this is possible for macro-impact tracking work?)

Otherwise, simply paying more allows people to drop other priorities.

I also agree that targeting very specific cruxes in a paper (e.g., ones highlighted by the evaluation explorer) could be more efficient, for instance asking reviewers to be useful with only 1 hour of work.

Who to bring in

The following two people are knowledgeable about AI's impact on the labour market:
- Jonathan Prunty (Leverhulme Centre for the Future of Intelligence, University of Cambridge)
- Marko Tesic (DSIT, UK government)

Technical AI safety expansion

I think it's hard to find a useful niche, considering the Alignment Journal and traditional AI conferences.

Ord evaluation — pursue?

I have given my take on this before, and I am still on the fence about the utility of this.

Availability

Quite busy the next couple of weeks; would hold off on this unless super useful.

Zhuoran Du 2026-04-17

Research focus and pivotal questions

Perhaps the effects of AI on labour welfare, on privacy and data security, and on organisational reform, as well as AI literacy and inequality, etc.

Who to bring in

International Labour Organisation (for AI and labour welfare)
AI developers are also important

Uma Kalkar — ai_researcher 2026-04-17

Research focus and pivotal questions

There's a gap in research on regulatory interventions, comparative international governance analysis, and estimates of risk parameters for AI safety and governance. More research into strong external scrutiny could help give Unjournal a competitive edge.

Examples of possible RQs:
-- What governance mechanisms actually change frontier lab behavior, and under what conditions?
-- How do capability thresholds translate into tractable regulatory triggers?
-- What predicts adoption vs. resistance in the cross-national diffusion of AI governance frameworks?

Staying timely

The LLM eval and prioritization tools sound like good ways to (a) decipher which pieces are relevant/urgent and (b) verify/check their claims with humans in the loop throughout the process. That tension between rigor and speed will always be there; I don't think I would suggest automating the process further, for fear of impacting standards/quality.

Who to bring in

-- Reach out to the RAND TASP Fellows
-- Consider the current/former GovAI Fellows
(These cohorts will already be specializing/focusing on elements of AI safety and governance research so it may be easier to get them to help review)

Technical AI safety expansion

I would focus on nailing down the safety/governance angle first, then look at expansion (but I'm wary of that because there is already good work by ARC Evals and METR/Epoch/etc).

Ord evaluation — pursue?

Given that the empirical picture has become messier and more contested, it may actually make sense to be a "first-mover" and evaluate it now. Given that Toby Ord has reframed his predictions to 2027+, there's a window of opportunity to assess it in light of the other RL papers.

Ord evaluation — what form

Strongly suggest a long-form EA Forum/PubPub post. It should cover the full evidence base listed above, with clear separation between what's empirically established, what's contested, and where further research is needed.

Would definitely circulate a first draft for edits across the GovAI community (Toby Ord included).

Availability

async only in mid-May

Pía Garavaglia — economist 2026-04-16

Research focus and pivotal questions

Impact on the workplace and labour markets, specifically addressing how to shift and train workers' skills and how to develop strategic regulatory frameworks.

Staying timely

One evaluator + AI-assisted briefing

Technical AI safety expansion

I think it's important to analyze theoretical frameworks without missing out on the actual application/testing. Dedicated evaluators could work better towards that aim.

Anirudh Tagat — uj_team 2026-04-16

Research focus and pivotal questions

The research should ideally be AI-related or address the impacts of AI in economic and social domains. There is already a lot of ongoing work related to labor market impacts.

I think we should engage with funders of work on AI and catastrophic risk, alignment, etc. (I think Schmidt is interested, but they are only funding research right now). It might be useful to reach out to them to provide an open evaluation of the work that they are commissioning or funding through grants. This way we also get to engage with AI researchers working at the forefront (at least in economics, broadly).

Staying timely

Look for papers coming out of top research centres around AI and economic/social impacts. Thinking of an NBER-track type evaluation pipeline, but with much quicker turnarounds.

Who to bring in

Sending out the mailer to everyone in the AI Slack channel is a good idea, especially for crowdsourcing ideas on this. I think if we can locate stakeholders through these existing networks, that would be helpful.

Technical AI safety expansion

I am not well versed in this domain, but I would recommend sticking to our domain expertise (in terms of work we have already evaluated).

Florian Habermacher — economist 2026-04-16

Research focus and pivotal questions

Neglected ones! What comes to mind:
- Politically grounded (I mean the 'realpolitik' type, not the 'purely abstract ivory tower' type) regulatory frameworks for dealing with labor impacts (and maybe with social impacts more broadly)

Ord evaluation — pursue?

I'm not super informed, but from a general perspective it sounds like it might not be a top priority.

Availability

Every evening from 6 PM CEST until midnight CEST (most days)
Every Sat and Sun from 8 AM CEST until midnight CEST (most days)
Mon-Fri office hours (8 AM CEST until 6 PM CEST): variable availability (50% available, 50% unavailable)

Andrew Kao — field_specialist 2026-04-15

Research focus and pivotal questions

The typical things in this space that Unjournal is already evaluating (e.g., new working papers in social science) are great.
But it is also worth considering evaluations of slightly less formal (but still high-effort) pieces. Two examples: Phil Trammell and Dwarkesh Patel's post on Capital in the 22nd Century https://substack.com/@philiptrammell/p-182789127 and Citrini Research's 2028 Global Intelligence Crisis https://www.citriniresearch.com/p/2028gic

Staying timely

AI assistance is obviously helpful. I think having multiple evaluators is still valuable, but if it seems difficult to secure >1 opinion then it should not be an obstacle to publishing an evaluation.

Separately, I wonder if something along the lines of ACX/Zvi Mowshowitz-style 'commentary roundups' that present clusters of comments made by others online, plus light discussion, could be useful. This would be a complement to, not a substitute for, existing evaluation efforts. For this, I think the relevant question is whether the typical reader of an evaluation is plugged into the discourse enough to already know the prevailing sentiment/feedback towards a piece or not.

Availability

Likely async only, next few weeks very busy.
