Excerpt from latimes.com
Computers have beaten humans at chess and “Jeopardy!,” and now they
can master old Atari games such as “Space Invaders” or “Breakout”
without knowing anything about their rules or strategies.
Playing Atari 2600 games from the 1980s may seem a bit “Back to the Future,” but
researchers with Google’s DeepMind project say they have taken a small
but crucial step toward a general learning machine that can mimic the
way human brains learn from new experience.
Unlike the Watson and Deep Blue computers that beat “Jeopardy!” and chess champions with intensive programming specific to those games, the Deep Q-Network built its winning strategies from the joystick up, through trial and error and constant reprocessing of feedback.
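That “constant reprocessing of feedback” is what the field calls experience replay: the system stores past moves and their outcomes, then re-samples them at random during training. A minimal sketch in Python follows; the class name, capacity and batch size are illustrative placeholders, not DeepMind’s code.

    import random
    from collections import deque

    class ReplayMemory:
        """Stores (state, action, reward, next_state, done) transitions for re-use."""
        def __init__(self, capacity=100_000):
            self.buffer = deque(maxlen=capacity)  # oldest experience is evicted first

        def add(self, state, action, reward, next_state, done):
            self.buffer.append((state, action, reward, next_state, done))

        def sample(self, batch_size=32):
            # Random minibatches break up the correlation between consecutive frames
            return random.sample(self.buffer, batch_size)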
“The ultimate goal is to build smart, general-purpose
[learning] machines. We’re many decades off from doing that,” said
artificial intelligence researcher Demis Hassabis, coauthor of the study
published online Wednesday in the journal Nature. “But I do think this is the first significant rung of the ladder that we’re on.”
The Deep Q-Network computer, developed by London-based Google DeepMind, played 49 old-school Atari games, scoring “at or better than human level” on 29 of them, according to the study.
The algorithmic approach, based loosely on the architecture of human neural
networks, could eventually be applied to any complex and
multidimensional task requiring a series of decisions, according to the
researchers.
The algorithms employed in this type of machine
learning depart strongly from approaches that rely on a computer’s
ability to weigh stunning amounts of inputs and outcomes and choose
programmed models to “explain” the data. Those approaches, known as
supervised learning, require artful tailoring of algorithms around specific problems, such as a chess game.
The computer instead relies on random exploration of joystick inputs, bolstered by human-like reinforcement learning, in which a reward essentially takes the place of such supervision.
“In supervised learning, there’s a teacher that says what the right answer
was,” said study coauthor David Silver. “In reinforcement learning,
there is no teacher. No one says what the right action was, and the
system needs to discover by trial and error what the correct action or
sequence of actions was that led to the best possible desired outcome.”
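Silver’s distinction shows up directly in the arithmetic of Q-learning, the algorithm at the network’s core: there is no labeled right answer, only a reward that nudges the value the system assigns to each action. A minimal tabular sketch, with the learning rate and discount chosen purely for illustration:

    # Tabular Q-learning update: no teacher, only a reward signal.
    # Q[s][a] estimates the long-run value of taking action a in state s.
    def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
        best_next = max(Q[next_state].values())   # value of the best follow-up action
        target = reward + gamma * best_next       # the reward stands in for a label
        Q[state][action] += alpha * (target - Q[state][action])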
The computer “learned” over the course of several weeks of training, in
hundreds of trials, based only on the video pixels of the game — the
equivalent of a human looking at screens and manipulating a cursor
without reading any instructions, according to the study.
Over the course of that training, the computer built up progressively more
abstract representations of the data in ways similar to human neural
networks, according to the study.
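Those progressively more abstract representations come from a convolutional neural network that reads the raw screen. The Nature paper feeds it stacks of four 84-by-84 grayscale frames through three convolutional layers; the PyTorch sketch below follows those published layer sizes, but it is an illustration rather than DeepMind’s implementation.

    import torch.nn as nn

    class AtariNet(nn.Module):
        """Maps a stack of four recent 84x84 screen frames to a value per joystick action."""
        def __init__(self, n_actions):
            super().__init__()
            self.features = nn.Sequential(  # each layer sees a wider, more abstract view
                nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
                nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            )
            self.head = nn.Sequential(
                nn.Flatten(),
                nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
                nn.Linear(512, n_actions),  # one estimated value per possible action
            )

        def forward(self, frames):
            return self.head(self.features(frames))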
There was nothing about the learning algorithms, however, that was specific
to Atari, or to video games for that matter, the researchers said.
The computer eventually figured out such insider gaming strategies
as carving a tunnel through the bricks in “Breakout” to reach the back
of the wall. And it found a few tricks that were unknown to the
programmers, such as keeping a submarine hovering just below the surface
of the ocean in “Seaquest.”
The computer’s limits, however, became evident in the games at which it
failed, sometimes spectacularly. It was miserable at “Montezuma’s
Revenge,” and performed nearly as poorly at “Ms. Pac-Man.” That’s
because those games also require more sophisticated exploration,
planning and complex route-finding, said coauthor Volodymyr Mnih.
And though the computer may be able to match the video-gaming proficiency of a 1980s teenager, its overall “intelligence” hardly reaches that of a pre-verbal toddler. It cannot build conceptual or abstract knowledge, doesn’t find novel solutions and can get stuck trying to exploit its accumulated knowledge rather than abandoning it and resorting to random exploration, as humans do.
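That tension between exploiting what it knows and exploring at random is typically managed with a rule called epsilon-greedy, which the DQN training procedure also uses: with small probability, ignore the learned values and press a random button. A sketch, with the epsilon value chosen for illustration:

    import random

    def choose_action(q_values, epsilon=0.05):
        """Mostly exploit the learned values; occasionally explore at random."""
        if random.random() < epsilon:
            return random.randrange(len(q_values))  # explore: random button press
        return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit best-known action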
“It’s mastering and understanding the
construction of these games, but we wouldn’t say yet that it’s building
conceptual knowledge, or abstract knowledge,” said Hassabis.
The researchers chose the Atari 2600 platform in part because it offered an
engineering sweet spot — not too easy and not too hard. They plan to
move into the 1990s, toward 3-D games involving complex environments,
such as the “Grand Theft Auto” franchise. That milestone could come
within five years, said Hassabis.
“With a few tweaks, it should be able to drive a real car,” Hassabis said.
DeepMind was formed in 2010 by Hassabis, Shane Legg and Mustafa Suleyman, and
received funding from Tesla Motors’ Elon Musk and Facebook investor
Peter Thiel, among others. It was purchased by Google last year, for a
reported $650 million.
Hassabis, a chess prodigy and game designer, met
Legg, an algorithm specialist, while studying at the Gatsby
Computational Neuroscience Unit at University College, London. Suleyman,
an entrepreneur who dropped out of Oxford University, is a partner in
Reos, a conflict-resolution consulting group.
Source article: “Is playing ‘Space Invaders’ a milestone in artificial intelligence?” (http://feedproxy.google.com/~r/AscensionEarth2012/~3/AFy982wk1LI/is-playing-space-invaders-milestone-in.html)