Abram Demski and Scott Garrabrant have made a major update to “Embedded Agency”, with new discussions of ε-exploration, Newcomblike problems, reflective oracles, logical uncertainty, Goodhart’s law, and predicting rare catastrophes, among other topics.
Abram has also written an overview of what good reasoning looks like in the absence of Bayesian updating: Radical Probabilism. One recurring theme:
[I]n general (i.e., without any special prior which does guarantee convergence for restricted observation models), a Bayesian relies on a realizability (aka grain-of-truth) assumption for convergence, as it does for some other nice properties. Radical probabilism demands these properties without such an assumption.
[… C]onvergence points at a notion of “objectivity” for the radical probabilist. Although the individual updates a radical probabilist makes can go all over the place, the beliefs must eventually settle down to something. The goal of reasoning is to settle down to that answer as quickly as possible.
Meanwhile, Infra-Bayesianism is a new formal framework for thinking about optimal reasoning without requiring a reasoner’s true environment to be in its hypothesis space. Abram comments: “Alex Appel and Vanessa Kosoy have been working hard at ‘Infra-Bayesianism’, a new approach to RL which aims to make it easier (ie, possible) to prove safety-relevant theorems (and, also, a new approach to Bayesianism more generally).”
Other MIRI updates
News and links
Originally published by Machine Intelligence Research Institute: Source