Owain Evans

Cited by

	All	Since 2019
Citations	4801	4341
h-index	18	18
i10-index	25	24

Citations per year

Year	Citations
2014	17
2015	14
2016	25
2017	66
2018	245
2019	384
2020	458
2021	541
2022	686
2023	1557
2024	705

Public access

Available	1 article
Not available	0 articles

Based on funding mandates

Co-authors

Katja Grace	AI Impacts	Verified email at intelligence.org
Andreas Stuhlmüller	Elicit	Verified email at elicit.com
Jacob Hilton	Alignment Research Center	Verified email at alignment.org
Stephanie Lin	Research Scholar, University of Oxford	Verified email at philosophy.ox.ac.uk
Noah D. Goodman	Stanford University	Verified email at stanford.edu
William Saunders	OpenAI	Verified email at cs.toronto.edu
Joshua B. Tenenbaum	MIT	Verified email at mit.edu
Jacob Steinhardt	Stanford University	Verified email at cs.stanford.edu
Andrew Ilyas	Massachusetts Institute of Technology	Verified email at mit.edu
Mihaela Curmei	Berkeley	Verified email at berkeley.edu
Yarin Gal	Associate Professor, University of Oxford	Verified email at cs.ox.ac.uk
David Abel	Research Scientist, DeepMind	Verified email at deepmind.com
Zachary Kenton	Google DeepMind	Verified email at google.com
David Scott Krueger	University Assistant Professor, University of Cambridge	Verified email at cam.ac.uk
Jan Leike	OpenAI	Verified email at openai.com

Owain Evans

Research Associate, University of Oxford

Verified email at philosophy.ox.ac.uk

AI alignment, Artificial Intelligence, Machine Learning, AI safety, Truthful AI


Title	Authors	Publication	Cited by	Year
The malicious use of artificial intelligence: Forecasting, prevention, and mitigation	M Brundage, S Avin, J Clark, H Toner, P Eckersley, B Garfinkel, A Dafoe, ...	arXiv preprint arXiv:1802.07228, 2018	1073*	2018
When will AI exceed human performance? Evidence from AI experts	K Grace, J Salvatier, A Dafoe, B Zhang, O Evans	Journal of Artificial Intelligence Research 62, 729-754, 2018	1014*	2018
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models	A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...	arXiv preprint arXiv:2206.04615, 2022	735	2022
TruthfulQA: Measuring how models mimic human falsehoods	S Lin, J Hilton, O Evans	arXiv preprint arXiv:2109.07958, 2021	657	2021
Trial without error: Towards safe reinforcement learning via human intervention	W Saunders, G Sastry, A Stuhlmueller, O Evans	arXiv preprint arXiv:1707.05173, 2017	284	2017
Help or hinder: Bayesian models of social goal inference	T Ullman, C Baker, O Macindoe, O Evans, N Goodman, J Tenenbaum	Advances in neural information processing systems 22, 2009	208	2009
Teaching models to express their uncertainty in words	S Lin, J Hilton, O Evans	arXiv preprint arXiv:2205.14334, 2022	137	2022
Learning the Preferences of Ignorant, Inconsistent Agents	O Evans, A Stuhlmüller, ND Goodman	Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-2016), 2016	126	2016
Agent-Agnostic Human-in-the-Loop Reinforcement Learning	D Abel, J Salvatier, A Stuhlmüller, O Evans	arXiv:1701.0407, 2017	77	2017
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"	L Berglund, M Tong, M Kaufmann, M Balesni, AC Stickland, T Korbak, ...	arXiv preprint arXiv:2309.12288, 2023	72*	2023
Truthful AI: Developing and governing AI that does not lie	O Evans, O Cotton-Barratt, L Finnveden, A Bales, A Balwit, P Wills, ...	arXiv preprint arXiv:2110.06674, 2021	71	2021
AI progress measurement	P Eckersley, Y Nasser, Y Bayle, O Evans, G Gebhart, D Schwenk	Electronic Frontier Foundation, 2017	51*	2017
Constructing and adjusting estimates for household transmission of SARS-CoV-2 from prior studies, widespread-testing and contact-tracing data	M Curmei, A Ilyas, O Evans, J Steinhardt	International Journal of Epidemiology 50 (5), 1444-1457, 2021	37*	2021
Active Reinforcement Learning: Observing Rewards at a Cost	D Krueger, J Leike, O Evans, J Salvatier	NIPS 2016 Workshop, 2016	34*	2016
Learning the Preferences of Bounded Agents	O Evans, A Stuhlmüller, ND Goodman	Advances in Neural Information Processing Systems (Bounded Optimality Workshop), 2015	34	2015
Modeling Agents with Probabilistic Programs	O Evans, A Stuhlmüller, J Salvatier, D Filan	agentmodels.org, 2017	28*	2017
Learning structured preferences	O Evans, L Bergen, JB Tenenbaum	Proceedings of the 32nd annual conference of the cognitive science society, 2010	21*	2010
Taken out of context: On measuring situational awareness in LLMs	L Berglund, AC Stickland, M Balesni, M Kaufmann, M Tong, T Korbak, ...	arXiv preprint arXiv:2309.00667, 2023	18*	2023
Modelling the Health and Economic Impacts of Population-Wide Testing, Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK	T Colbourn, W Waites, J Panovska-Griffiths, D Manheim, S Sturniolo, ...	SSRN, 2020	18*	2020
Allen	M Brundage, S Avin, J Clark, H Toner, P Eckerskey, B Garfinkel, A Dafoe, ...	G., Steinhardt, J., Flynn, C., hÉigeartaigh, S., Beard, S., Belfield, H …, 2018	17*	2018
