Responses received (8)
David Manheim — 2026-04-21
Lorenzo Pacchiardi — 2026-04-17
Zhuoran Du — 2026-04-17
Uma Kalkar — 2026-04-17
Pía Garavaglia — 2026-04-16
Anirudh Tagat — 2026-04-16
Florian Habermacher — 2026-04-16
Andrew Kao — 2026-04-15
Synthesis — generated by Claude, 2026-04-24 13:46 UTC
Individual responses (8)
David Manheim — field_specialist 2026-04-21
Research focus and pivotal questions
Very fast turnaround on forum posts, and possibly peer review of documents that aren't otherwise peer-reviewable (model cards, technical reports from think tanks, etc.)
Staying timely
Fast track with 1+ evaluators: invite more than one reviewer, and move forward as soon as one evaluation is submitted.
Who to bring in
Unsure
Technical AI safety expansion
No, very much opposed to trying to compete on this.
Ord evaluation — pursue?
Meh, still not sure we have a good place for a take on this.
Others to involve
GovAI folks; maybe BlueDot?
Availability
Prefer async; at an ISO conference the full week of Apr 21
Lorenzo Pacchiardi — ai_researcher 2026-04-17
Research focus and pivotal questions
I agree that the AI-economics interface (e.g., impacts on the labour market [Anthropic's work], or macro-trends à la Ord) seems the most relevant area for the Unjournal to focus on.
Staying timely
What about "pre-booking" evaluators' time before deciding which paper to review, so that you can choose a paper and be sure someone will be able to look at it in a timely manner? (Of course, this depends on having reviewers who are flexible enough in the topics they can cover, but maybe this is possible for macro-impact-tracking work?)
Otherwise, simply paying more allows people to drop other priorities.
Also agree that targeting very specific cruxes in a paper (e.g., those highlighted in the evaluation explorer) could be more efficient, for instance by asking reviewers to add value with only one hour of work.
Who to bring in
The following two people are knowledgeable about AI's impact on the labour market:
- Jonathan Prunty (Leverhulme Centre for the Future of Intelligence, University of Cambridge)
- Marko Tesic (DSIT, UK government)
Technical AI safety expansion
I think it's hard to find a useful niche, considering the Alignment Journal and traditional AI conferences.
Ord evaluation — pursue?
I have given my take on this before and I am still on the fence on the utility of this.
Availability
Quite busy for the next couple of weeks; would hold off on this unless super useful.
Zhuoran Du — 2026-04-17
Research focus and pivotal questions
Perhaps the effect of AI on labour welfare, AI on privacy and data security, AI on organisational reform, and AI literacy and inequality, etc.
Who to bring in
International Labour Organisation (for AI with labour welfare)
AI developers are important
Uma Kalkar — ai_researcher 2026-04-17
Research focus and pivotal questions
There's a gap in regulatory interventions, comparative international governance analysis, and estimates of risk parameters for AI safety and governance. More research into strong external scrutiny could help give Unjournal a competitive edge.
Examples of possible RQs:
-- What governance mechanisms actually change frontier lab behavior, and under what conditions?
-- How do capability thresholds translate into tractable regulatory triggers?
-- What predicts adoption vs. resistance for cross-national diffusion of AI governance frameworks?
Staying timely
The LLM eval + prioritization tool sound like good ways to (a) decipher which pieces are relevant/urgent and (b) verify/check their claims with humans in the loop throughout the process. The tension between rigor and speed will always be there; I wouldn't suggest automating the process further, for fear of possibly impacting standards/quality.
Who to bring in
-- Reach out to the RAND TASP Fellows
-- Consider the current/former GovAI Fellows
(These cohorts will already be specializing/focusing on elements of AI safety and governance research so it may be easier to get them to help review)
Technical AI safety expansion
I would focus on nailing down the safety/governance angle first, then look at expansion (though I'm wary of that, because there is already good work by ARC Evals, METR, Epoch, etc.).
Ord evaluation — pursue?
Given that the empirical picture has become messier and more contested, it may actually make sense to be a "first mover" and evaluate it now. Since Toby Ord has reframed his predictions to 2027+, there's a window of opportunity to assess it in light of the other RL papers.
Ord evaluation — what form
Strongly suggest a long-form EA Forum / PubPub post. It should cover the full evidence base listed above, with clear separation between what's empirically established, what's contested, and where further research is needed.
Would definitely circulate a first draft for edits across the GovAI community (Toby Ord included).
Availability
Async only, in mid-May
Pía Garavaglia — economist 2026-04-16
Research focus and pivotal questions
Impact on the workplace and labour markets, specifically addressing how to shift and retrain workers' skills, and how to develop strategic regulatory frameworks
Staying timely
One evaluator + AI-assisted briefing
Technical AI safety expansion
I think it's important to analyze theoretical frameworks without missing out on actual application/testing. Dedicated evaluators could work better towards that aim.
Anirudh Tagat — uj_team 2026-04-16
Research focus and pivotal questions
The research should ideally be AI-related or impacts of AI in economic and social domains. There is already a lot of ongoing work related to labor market impacts.
I think we should engage with funders of AI and catastrophic risk, alignment, etc. (I think Schmidt is interested, but is only funding research right now). It might be useful to reach out to them to offer an open evaluation of the work they are commissioning / providing grants to. This way we also get to engage with AI researchers working at the forefront (at least in economics, broadly).
Staying timely
Look for papers coming out of top research centres around AI and economic/social impacts. Thinking of an NBER-track type evaluation pipeline, but with much quicker turnarounds.
Who to bring in
Sending out the mailer to everyone in the AI slack channel is a good idea, especially for crowdsourcing ideas on this. I think if we can locate stakeholders through these existing networks, that would be helpful.
Technical AI safety expansion
I am not well versed in this domain, but I would recommend sticking to our domain expertise (in terms of work we have already evaluated).
Florian Habermacher — economist 2026-04-16
Research focus and pivotal questions
Neglected ones! What comes to mind:
- politically grounded (I mean 'realpolitik'-type, not 'purely ivory tower abstract') regulatory frameworks for dealing with labor impacts (and maybe with social impacts more broadly)
Ord evaluation — pursue?
I'm not super informed, but from a general perspective it sounds like it might not be a top priority.
Availability
Every evening from 6 PM CEST until midnight CEST (most days)
Every Sat and Sun from 8 AM CEST until midnight CEST (most days)
Mon-Fri CET office hours (8 AM CEST until 6 PM CEST): variable availability (50% available 50% unavailable)
Andrew Kao — field_specialist 2026-04-15
Research focus and pivotal questions
The typical things in this space that Unjournal is already evaluating (e.g., new working papers in social science) are great.
But it's also worth considering evaluations of slightly less formal (but still high-effort) pieces. Two examples: Phil Trammell and Dwarkesh Patel's post on Capital in the 22nd Century https://substack.com/@philiptrammell/p-182789127 and Citrini Research's 2028 global intelligence crisis https://www.citriniresearch.com/p/2028gic
Staying timely
AI assistance is obviously helpful. I think having multiple evaluators is still valuable, but if it seems difficult to secure >1 opinion then it should not be an obstacle to publishing an evaluation.
Separately, I wonder if something along the lines of ACX/Zvi Mowshowitz-style 'commentary roundups', presenting clusters of comments made by others online plus light discussion, could be useful. This would be a complement, not a substitute, to the existing eval effort. For this, I think the relevant question is whether the typical reader of an evaluation is plugged into the discourse enough to already know the prevailing sentiment/feedback towards a piece.
Availability
Likely async only, next few weeks very busy.