January 2020 Newsletter

Tuesday, January 21, 2020

Updates

Links from the research team

This continues my experiment from last month: having MIRI researchers anonymously pick out AI Alignment Forum posts to highlight and comment on.

  • Re (When) is Truth-telling Favored in AI debate? — “A paper by Vojtěch Kovařík and Ryan Carey; it's good to see some progress on the debate model!”
  • Re Recent Progress in the Theory of Neural Networks — “Noah MacAulay provides another interesting example of research attempting to explain what's going on with NNs.”
  • Re When Goodharting is optimal — “I like Stuart Armstrong's post for its systematic examination of why we might be afraid of Goodharting. The example at the beginning is an interesting one, because it seems (to me at least) like the robot really should go back and forth (staying a long time at each side to minimize lost utility). But Stuart is right that this answer is, at least, quite difficult to justify.” (A toy model of this back-and-forth tradeoff appears after the list.)
  • Re Seeking Power is Instrumentally Convergent in MDPs and Clarifying Power-Seeking and Instrumental Convergence — “It's nice to finally have a formal model of this, thanks to Alex Turner and Logan Smith. Instrumental convergence has been an informal part of the discussion for a long time.” (A rough numerical sketch of the idea appears after the list.)
  • Re Critiquing “What failure looks like” — “I thought Grue Slinky's post was a good critical analysis of Paul Christiano's ‘going out with a whimper’ scenario, highlighting some of the problems it seems to have as a concrete AI risk scenario. In particular, I found the analogy to the simplex algorithm persuasive: many of our most powerful current tools already vary enormously in how well they work on different problems, yet the values poorly served by those tools don't seem to have lost out massively as a result. I still feel like there may be a real risk along the lines of ‘going out with a whimper’, but I think this post presents a real challenge to that scenario as it has been described so far.”
  • Re Counterfactual Induction — “A proposal for logical counterfactuals by Diffractor. This could use some more careful thought and critique; it's not yet clear exactly how much or little it accomplishes.”
  • Re A dilemma for prosaic AI alignment — “Daniel Kokotajlo outlines key challenges for prosaic alignment: ‘[…] Now I think the problem is substantially harder than that: To be competitive prosaic AI safety schemes must deliberately create misaligned mesa-optimizers and then (hopefully) figure out how to align them so that they can be used in the scheme.’”
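
On the Goodharting item above: the sketch below is my own toy model, not Armstrong's actual setup. Assume an agent with 50/50 credence over two utility functions, one paying off for each timestep spent on the left and one for each timestep spent on the right, with travel between sides costing a few steps that pay nothing under either hypothesis. Longer stays per side then amortize the travel cost:

```python
# Toy illustration (my construction, not Armstrong's actual example):
# an agent has 50/50 credence over two utility functions, one rewarding
# each timestep spent on the left, one each timestep spent on the right.
# Travelling between sides takes `travel_cost` steps that pay nothing
# under either hypothesis.

def expected_utility_per_step(stay_steps: int, travel_cost: int = 5) -> float:
    """Average expected utility per timestep for a policy that
    alternates sides, staying `stay_steps` steps at each visit.

    Each stay-step pays 1 under the matching utility and 0 under the
    other, so it is worth 0.5 in expectation given 50/50 credence;
    travel steps are worth 0 under both hypotheses.
    """
    cycle_length = stay_steps + travel_cost
    return 0.5 * stay_steps / cycle_length

for stay in (1, 10, 100, 1000):
    print(f"stay {stay:>4} steps/side -> {expected_utility_per_step(stay):.3f}/step")
```

With a travel cost of 5 steps, per-step expected utility climbs from about 0.08 (switching every step) toward the 0.5 ceiling as stays lengthen, matching the “stay a long time at each side” intuition.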

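On the power-seeking item: the paper gives a precise formal definition, but a rough Monte Carlo version conveys the flavor. The sketch below is my own construction; the toy MDP, the uniform reward distribution, and the “average optimal value” proxy are all assumptions of mine rather than the paper's exact formalism. It estimates a state's “power” as the mean optimal discounted value achievable from it, averaged over randomly sampled reward functions:

```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.9

# Deterministic toy MDP: next_state[s][a] is the successor of state s
# under action a. State 0 is a hub with three distinct options; states
# 1-3 only loop back to themselves.
next_state = [[1, 2, 3], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
n_states = len(next_state)

def optimal_value(reward, iters=300):
    """Value iteration for state-based rewards:
    V(s) = r(s) + gamma * max_a V(next(s, a))."""
    v = np.zeros(n_states)
    for _ in range(iters):
        v = reward + GAMMA * np.array(
            [max(v[s2] for s2 in succ) for succ in next_state]
        )
    return v

# Monte Carlo: average optimal value over uniformly sampled reward vectors.
power = np.mean(
    [optimal_value(rng.uniform(size=n_states)) for _ in range(1000)], axis=0
)
print("estimated 'power' per state:", np.round(power, 2))
```

The hub state, which keeps three options open, scores highest on average; that is the informal sense in which states with more options are valuable for a wide range of reward functions, i.e. instrumentally convergent to seek.
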
The post January 2020 Newsletter appeared first on Machine Intelligence Research Institute.



via https://www.AiUpNow.com, by Rob Bensinger, Khareem Sudlow