Support NeoGAF

Damaniel · Oct 18, 2017

The journal Nature published a paper by DeepMind today, describing their new iteration of AlphaGo, called AlphaGo Zero. Unlike the previous iterations, AlphaGo Zero was trained solely through self-play, being given only the basic rules of the game. After only 3 days of training, Zero could easily defeat the version of AlphaGo that beat Lee Sedol - it won 100-0 in a series of test matches, and after 40 days beat their previous best version (called AlphaGo Master) 89-11. By their estimate, AlphaGo Zero has an Elo rating of over 5000, which is an absolutely insane level of play (for reference, Lee Sedol has an estimated Elo of around 3500, and Ke Jie has an Elo of around 3660).

Here's a couple articles on the research:
- Gizmodo - ignore the clickbaity headline.
- DeepMind's own article on the research. They have a link to the actual paper, which is pretty interesting.

Replace me with a vastly superior AI if old.

Rentahamster · Oct 18, 2017

I train via self-play too

serious answer: that's amazing

Kinitari · Oct 18, 2017

Right now the Baduk community is frantically looking at all the shared games, and it seems they're already gleaning new insights. It's kind of insane.

Laiza · Oct 18, 2017

So the Go AI that was unbeatable by humans got absolutely trashed by a superior version of that Go AI that is now unbeatable by its former self.

Man, I love seeing shit like this. Really makes me feel like we're living in the future!

Akuun · Oct 18, 2017

Machine learning is such a cool concept, and it's crazy how quickly it grows.

Kthulhu · Oct 18, 2017

Lemme know when it can beat Crysis.

shira · Oct 18, 2017

The OpenAI dota 1v1 program has slowly shown cracks and pro players are finding ways to beat it (half a year of data gathering)

http://neogaf.com/forum/showthread.php?p=245965344

The 2018 edition should be insanity

Kinitari · Oct 18, 2017

shira said:
The OpenAI dota 1v1 program has slowly shown cracks and pro players are finding ways to beat it (half a year of data gathering)

http://neogaf.com/forum/showthread.php?p=245965344

The 2018 edition should be insanity

I realized after reading more about this that it was a bit misrepresented, I don't know if by OpenAI itself or the media. I think it's impressive, but it felt less impressive as soon as you knew more about what it really did.

http://www.wildml.com/2017/08/hype-or-not-some-perspective-on-openais-dota-2-bot/

mr2xxx · Oct 18, 2017

Impressive and scary at the same time.

Little Green Yoda · Oct 19, 2017

Another step closer to Skynet...

AllGamer · Oct 19, 2017

I just love self-learning AI. It's fascinating.

mclem · Oct 19, 2017

Little Green Yoda said:
Another step closer to Skynet...

The time to get really scared is when the finished boards start to resemble QR codes...

Skunkers · Oct 19, 2017

We're all gonna die.

JettDash · Oct 19, 2017

I would rather a computer be in charge of the nukes than Donald Trump.

Skittles · Oct 19, 2017

JettDash said:
I would rather a computer be in charge of the nukes than Donald Trump.

Man, I give it 2 years before most video games are conquered. Which means computers will be better than us at any game lol

iapetus · Oct 19, 2017

When its Elo rating is over 9000, AIs will have beaten humans at tired memes as well.

FiggyCal · Oct 19, 2017

Kinitari said:
I realized after reading more about this that it was a bit misrepresented, I don't know if by OpenAI itself or the media. I think it's impressive, but it felt less impressive as soon as you knew more about what it really did.

http://www.wildml.com/2017/08/hype-or-not-some-perspective-on-openais-dota-2-bot/

Trained entirely through self-play

Hard-coded restrictions: The bot was not trained from scratch knowing nothing about the game. Item choices were hardcoded, and so were certain techniques, such as creep block, that were deemed necessary to win.

I would say so. The article even contradicted itself.

Steiner84 · Oct 19, 2017

Does anyone know how good the Dota bot has gotten in the meantime, since it stripped every pro player naked?

edit: nvm, didn't see the post.

xevis · Oct 19, 2017

Little Green Yoda said:
Another step closer to Skynet...

Oh plz. The result is cool but the system itself is still dumb as a bag of hammers. Strong AI is pure fiction.

Tanis · Oct 19, 2017

I'm still kind of stunned that Go got busted open so thoroughly.

Chev · Oct 19, 2017

xevis said:
Oh plz. The result is cool but the system itself is still dumb as a bag of hammers. Strong AI is pure fiction.

Sure, but even a go AI beating pro players was considered fiction a few years ago.

StrategyFan · Oct 19, 2017

Tanis said:
I'm still kind of stunned that Go got busted open so thoroughly.

it's a simple game

FiggyCal · Oct 19, 2017

Chev said:
Sure, but even a go AI beating pro players was considered fiction a few years ago.

Why? Chess robots have been beating human players for decades.

Eridani · Oct 19, 2017

FiggyCal said:
Why? Chess robots have been beating human players for decades.

Because go is a much more complicated game - that is, it has a much larger state space, which makes the algorithms used for chess completely unusable, no matter how much the hardware improves. The algorithms for go use a completely different field of AI than the ones used for chess - one that has advanced astronomically in the last few years.
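To put a rough number on "much larger state space", here is a back-of-the-envelope sketch. It only computes crude upper bounds, assuming every point on a 19x19 Go board can be empty, black, or white, and every chess square can be empty or hold one of 12 piece types. Both counts include huge numbers of illegal positions, and the variable names are just illustrative, so treat this as an order-of-magnitude comparison rather than an exact figure.

```python
# Crude upper bounds on the number of board configurations.
# Assumption: every combination of point/square contents is counted,
# including positions that could never occur in a legal game.

GO_POINTS = 19 * 19          # 361 intersections on a standard Go board
CHESS_SQUARES = 8 * 8        # 64 squares on a chess board

go_upper = 3 ** GO_POINTS            # each point: empty, black, or white
chess_upper = 13 ** CHESS_SQUARES    # each square: empty or one of 12 piece types


def digits(n: int) -> int:
    """Number of decimal digits in a positive integer."""
    return len(str(n))


print(f"Go upper bound:    a {digits(go_upper)}-digit number")
print(f"Chess upper bound: a {digits(chess_upper)}-digit number")
print(f"Go bound is larger by about {digits(go_upper) - digits(chess_upper)} orders of magnitude")
```

Even with this loose counting, the Go bound is roughly a hundred orders of magnitude bigger than the chess one, which gives a feel for why brute-force search plus a hand-tuned evaluation function doesn't carry over.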

Kinitari · Oct 19, 2017

FiggyCal said:
I would say so. The article even contradicted itself.

No, it's just a confusing way to describe training vs underlying algorithms. As in, training using a dataset would involve it watching games first and then using that information to inform its own self-play (or at least, this is how older versions of AlphaGo operated). This didn't 'watch' any games prior to its self-play; however, the final build had some hard-coded commands, which makes it less impressive.

The author doesn't really make that distinction clear here - basically they're trying to say 'it's nice that it didn't need a giant dataset, but... It sucks that it needed hand coded rules'

efyu_lemonardo · Oct 20, 2017

So my thread, which got locked for being a duplicate, and this earlier one, are both a page long?

GAF you need to pay more attention!

RSP · Oct 20, 2017

So what other tasks can this thing be taught?

I suppose it's things where fixed rules apply and need to be considered when doing the same task a lot of times in succession?

Eridani · Oct 20, 2017

efyu_lemonardo said:
So my thread, which got locked for being a duplicate, and this earlier one, are both a page long?

GAF you need to pay more attention!

It's just not something that's very interesting for most people. "AlphaGo went from being unbeatable to being slightly more unbeatable" isn't really all that impressive to most people, I'd imagine. This is actually a lot more impressive than that, since they used a very different approach than the old AlphaGo, and even though I only skimmed the paper their work seems incredibly impressive, but if you're not actually interested in AI research it just doesn't look like much of a big deal.

nynt9 · Oct 20, 2017

RSP said:
So what other tasks can this thing be taught?

I suppose it's things where fixed rules apply and need to be considered when doing the same task a lot of times in succession?

DeepMind, the company behind this (owned by Google), has been looking into healthcare applications. The technology behind this is very generalizable.

efyu_lemonardo · Oct 20, 2017

RSP said:
So what other tasks can this thing be taught?

I suppose it's things where fixed rules apply and need to be considered when doing the same task a lot of times in succession?

The class of tasks this result applies to is called Perfect Information Games. As in, nothing is hidden from either player.
Unlike, say, a game of StarCraft, which DeepMind is also working on cracking, but which is more complicated.

Drencrom · Oct 20, 2017

Still waiting for AlphaGo's StarCraft matches

efyu_lemonardo · Oct 20, 2017

Eridani said:
It's just not something that's very interesting for most people. "AlphaGo went from being unbeatable to being slightly more unbeatable" isn't really all that impressive to most people, I'd imagine. This is actually a lot more impressive than that, since they used a very different approach than the old AlphaGo, and even though I only skimmed the paper their work seems incredibly impressive, but if you're not actually interested in AI research it just doesn't look like much of a big deal.

If there's one thing any layman should take from here, it's the potentially alarming rate of improvement in these systems.

Quite possibly there will come a day in the near future where the time between a particular problem being solvable by A.I. at a level that doesn't suck when compared to an average human, and an A.I. becoming far better than any human could hope to be at solving that same problem, will be measured in mere days!

That's not something you can take a "wait and see" approach with any more. People need to understand.

Violet_0 · Oct 20, 2017

AI, I present to you a new challenge

Daedardus · Oct 20, 2017

Skunkers said:
We're all gonna die.

This is just an AI that has been optimised and trained to be very good at Go by human scientists and engineers. The idea that a specific AI will be good at everything is a bit ridiculous. For example, this AI can't tell you which stocks to invest in or drive your car, even though there are AIs that can do that sort of stuff. Still, all these AIs have to go through a teaching process, which is limited by the amount of time and processing power you have available. AIs won't try to learn things they don't need themselves, because it would waste their effort.

The reason why Elon Musk et al. are afraid of AI is not this kind of AI that can play chess or Go, but AI that has been specifically engineered to detect humans and eliminate them, in a war context by drones. They fear that the AI may be too good at the stuff it was made for, and therefore can get difficult to control. But it's still a computer that had to be taught how to kill, and not a computer teaching itself to kill without orders.

Sesha · Oct 20, 2017

I can't wait to see AIs play DMC or Ninja Gaiden.

nubbe · Oct 20, 2017

Destroy it before it's too late

Eridani · Oct 20, 2017

efyu_lemonardo said:
If there's one thing any layman should take from here, it's the potentially alarming rate of improvement in these systems.

Quite possibly there will come a day in the near future where the time between a particular problem being solvable by A.I. at a level that doesn't suck when compared to an average human, and an A.I. becoming far better than any human could hope to be at solving that same problem, will be measured in mere days!

That's not something you can take a "wait and see" approach with any more. People need to understand.

The thing is that AlphaGo was already far, far better at Go than any player could possibly be. That got a lot of interest, and rightfully so. AlphaGo Zero improved the Elo rating from 4858 to 5185, which doesn't really look like much of an improvement. Previous versions of AlphaGo also achieved similar improvements and did not raise much news.

The real news here is the novel approach. Using reinforcement learning with no human knowledge to achieve such a result is frankly mind blowing. This closing quote from the paper really puts it in perspective:

Humankind has accumulated Go knowledge from millions of games played over thousands of years, collectively distilled into patterns, proverbs and books. In the space of a few days, starting tabula rasa, AlphaGo Zero was able to rediscover much of this Go knowledge, as well as novel strategies that provide new insights into the oldest of games.

This is the biggest thing here, and is something older versions of AlphaGo weren't really capable of doing, since they relied on a ton of human knowledge accumulated through hundreds of years to achieve what they did. AlphaGo Zero started from nothing and beat all that knowledge in 40 days.

efyu_lemonardo · Oct 20, 2017

Eridani said:
The thing is that AlphaGo was already far, far better at Go than any player could possibly be. That got a lot of interest, and rightfully so. AlphaGo Zero improved the Elo rating from 4858 to 5185, which doesn't really look like much of an improvement. Previous versions of AlphaGo also achieved similar improvements and did not raise much news.

The real news here is the novel approach. Using reinforcement learning with no human knowledge to achieve such a result is frankly mind blowing. This closing quote from the paper really puts it in perspective:

This is the biggest thing here, and is something older versions of AlphaGo weren't really capable of doing, since they relied on a ton of human knowledge accumulated through hundreds of years to achieve what they did. AlphaGo Zero started from nothing and beat all that knowledge in 40 days.

Bingo. And that's what people need to understand.

Eridani · Oct 20, 2017

efyu_lemonardo said:
Bingo. And that's what people need to understand.

It would help if the reporting didn't focus so much on the 100-0 number, which is quite irrelevant. It's also misleading, since the actual number against the latest previous version was 89-11. It should focus on the mind blowing way this was achieved.
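For anyone curious how the 4858 and 5185 ratings quoted in this thread line up with the 89-11 result, here is a quick sketch. It assumes the standard logistic Elo formula with a 400-point scale; the paper's own rating procedure may differ in its details, and the function and constant names are just made up for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model.

    Assumption: the usual logistic curve with a 400-point scale, draws ignored.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings as quoted in this thread (not re-derived here).
ALPHAGO_ZERO = 5185
ALPHAGO_MASTER = 4858

expected = elo_expected_score(ALPHAGO_ZERO, ALPHAGO_MASTER)
observed = 89 / (89 + 11)  # the 89-11 match result

print(f"Rating gap:     {ALPHAGO_ZERO - ALPHAGO_MASTER} points")
print(f"Expected score: {expected:.3f}")
print(f"Observed score: {observed:.3f}")
```

Under that assumption, a gap of about 330 points already predicts Zero winning close to nine games out of ten, which is in the same ballpark as the 89-11 result.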

pfkas · Oct 20, 2017

Insert exponential graph here

andythinkpad · Oct 20, 2017

I don't know which T2 gif to quote, they all seem appropriate.

thetrin · Oct 20, 2017

I guess I'll skip the T2 reference and make an MGS4 war economy reference instead.

Laughing Banana · Oct 20, 2017

JettDash said:
I would rather a computer be in charge of the nukes than Donald Trump.

Ehh hhhhh.... No? You sure you want to entrust the world's most powerful arsenal of weapons to a cold, unfeeling, inhumane, 100% logical machine? That's scary as hell.

I mean, what if it comes to the 'logical' conclusion that earth has been filled with too many humans already and decides to wipe out a large number of us just because?

efyu_lemonardo · Oct 20, 2017

Computer programs don't have to become "self aware" or "evil" to be a major threat to us. It's enough that they become extremely competent at managing certain human affairs, to the point where we place a large amount of our trust in them, and then one day due to a bug they make a mistake. That could potentially be enough to wipe out a whole lot of us.

On the plus side, we humans make mistakes that result in countless unnecessary deaths all the time.

tokkun · Oct 20, 2017

efyu_lemonardo said:
Computer programs don't have to become "self aware" or "evil" to be a major threat to us. It's enough that they become extremely competent at managing certain human affairs, to the point where we place a large amount of our trust in them, and then one day due to a bug they make a mistake. That could potentially be enough to wipe out a whole lot of us.

On the plus side, we humans make mistakes that result in countless unnecessary deaths all the time.

I feel like we made this transition long ago. If you look at a nuclear reactor, it is a computer that is managing the reaction process. Our communications systems are all computerized, computers do most of the flying of planes, they do most of the volume of stock market trades.

The solution we have taken is to vet the systems as carefully as possible and have humans there to override them if we see evidence of a bug.

ec0ec0 · Oct 20, 2017

so it ended up being way superior without human input? (learning, in part, from high level play)

User Name Here · Oct 20, 2017

it is no longer constrained by the limits of human knowledge. Instead, it is able to learn tabula rasa from the strongest player in the world: AlphaGo itself.

. . .

Welp. Let me just say that I for one welcome our new robot overlords. I can be quite useful in rounding up humans, or "biological batteries."

ElFly · Oct 20, 2017

so basically the humans trained Alpha Go wrong, as a joke

Fugu · Oct 20, 2017

It is probably worth noting that Go is the type of game where small differences in skill translate to large differences in results.

StrategyFan said:
it's a simple game

What a profoundly misleading statement.

New version of AlphaGo, trained only by playing itself, beats existing AlphaGo 100-0
