How do we harness artificial intelligence for the good of humanity?
Disclaimer: I am not an ethics expert.
"If a thing is worth doing, it is worth doing badly." - G.K. Chesterton
Immediate Problems
Long-Term Problems
word2vec
300-dimensional embedding trained solely by predicting words hidden from phrases
doctor - man + woman
= nurse
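A minimal sketch of the vector arithmetic behind this analogy. The 3-dimensional vectors below are hypothetical, hand-picked toy values (not real word2vec output) with a gender stereotype deliberately built in, so the example only illustrates how the nearest-neighbor lookup surfaces a bias that is already present in the embedding:

```python
import math

# Hypothetical toy vectors; real word2vec vectors have 300 learned dimensions.
# Axes (assumed for illustration): [medical, male-associated, female-associated].
toy_vectors = {
    "doctor": [1.0, 0.6, 0.1],
    "nurse":  [1.0, 0.1, 0.6],
    "man":    [0.0, 1.0, 0.0],
    "woman":  [0.0, 0.0, 1.0],
    "pilot":  [0.0, 0.6, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors):
    """Return the word closest to a - b + c, excluding the three query words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

result = analogy("doctor", "man", "woman", toy_vectors)
print(result)
```

With these toy values the lookup returns "nurse" because the stereotype was encoded in the vectors; in a trained embedding the same association would come from the training text.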
87
Sweeney, "Simple demographics often identify people uniquely"
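A rough sketch of how one might measure this kind of re-identification risk: count how many records are the only one with their combination of demographic fields. The records and the choice of fields (ZIP code, sex, birth date) are made up for illustration and are not data from the paper:

```python
from collections import Counter

# Hypothetical records: (ZIP code, sex, birth date). No names are needed
# for a combination to single out one person.
records = [
    ("02139", "F", "1975-03-14"),
    ("02139", "F", "1975-03-14"),
    ("02139", "M", "1980-07-02"),
    ("02140", "F", "1962-11-30"),
    ("02140", "M", "1990-01-09"),
]

def fraction_unique(rows):
    """Share of rows whose field combination appears exactly once."""
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)

share = fraction_unique(records)
print(f"{share:.0%} of records are unique on these fields")
```

Coarsening a field (for example, keeping only birth year) and rerunning the count is one simple way to see how the unique share changes.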
COMPAS: predicting recidivism
Suggested possible solution in AIMA: "Equal Impact": assigning utility
We should focus on defining problems in the right way
B. F. Skinner
Pigeon-guided bombs, 1943
https://www.youtube.com/watch?v=tlOIHko8ySg
"As a general rule, it is better to design performance measures according to what one actually wants in the environment, rather than according to how one thinks the agent should behave." - Stuart Russell
Reward
Value
Emerging best practices (AIMA)
(Bostrom, 2003)
Hypothetical Examples:
Marc Andreessen: By the way, there's a very practical objection to all this, which is kind of sometimes called the thermodynamic objection, which, again, sort of connects this back to reality, which is: Look, we're sitting here today and let's say that GPT develops whatever you want to call it--a mind of its own or its own goals or whatever. Like, it can't get chips. Right? So, now it has its evil plan to take over the world. It needs, like, more chips to be able to run its evil plan. NVIDIA is out of chips. And so, what--
Russ Roberts: They have a story for that. They explain: they'll get some poor low-IQ person--not you or me, Marc, because we're too smart--but they'll get a low-IQ person, an employee of some lower level, and they'll convince him to go buy chips for them.
Marc Andreessen: No, no. But, the chips literally don't exist. Like NVIDIA can't make the chips. There's chip shortages all throughout the AI ecosystem.
Russ Roberts: Oh. Well, they'll fix that. That's easy.
Marc Andreessen: Exactly. So, basically--
Russ Roberts: They'll get Senators, the Congress people to vote for subsidies to things that the chips need and then in a week or two, that'll go away.
Marc Andreessen: So, this is what's called the thermodynamic objection, which is: Okay, you're the AI, you're the sentient artificial intelligence. To accomplish your evil plan of taking over the world, you need the chips, you need the electricity, you need to go buy the votes in Congress, you need to do this, you need to do all of these things.
And, that somehow these things are going to happen basically overnight, very quickly, very easily without putting--at this point, neither one of us are steel manning, by the way--but without putting a footprint into the world. Right? And this is this sort of takeoff idea, and this all happens in 24 hours.
It's like--I don't know about you, but anybody who's ever tried to get Congress people to do anything, it doesn't happen like that. Once you enter the real world of politics to get a bill passed--
https://www.econtalk.org/marc-andreessen-on-why-ai-will-save-the-world/
Experience with other superintelligent entities
https://www.nature.com/articles/s41893-022-00851-6
The CAPTAIN RL framework
"maximizing total protected area can lead to substantial species loss is of urgent relevance, given that total protected area has been at the core of previous international targets for biodiversity (such as the Aichi Biodiversity Targets, https://www.cbd.int/sp/targets) and remains a key focus under the new post-2020 Global Biodiversity Framework under the Convention on Biological Diversity."