A. Newell, H. Simon (1963)
Computers and Thought
R. Solomonoff (1964)
A formal theory of inductive inference. Information and Control, 7
The n-th frozen program (either user-defined or frozen by OOPS) stored in q below a_frozen, assuming (somewhat arbitrarily) zero inputs and outputs
R. Solomonoff (1989)
A System for Incremental Learning Based on Algorithmic Probability
P. Utgoff (1984)
Shift of bias for inductive concept learning
A stack of integer arrays, each having a name, an address, and a size (not used in this paper, but implemented and mentioned for the sake of completeness)
Jürgen Schmidhuber (2002)
The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions
M. Tsukamoto (1977)
Program Stacking Technique, 17
A variable holding the index r of this tape’s task
R. Waldinger, Richard Lee (1969)
PROW: A Step Toward Automatic Program Writing
At time 3 × 10^9
which defines self-made recursive code functionally equivalent to fac(n), which calls itself by calling the most recent self-made function even before it is completely defined
For the 1^n 2^n-problem, within 480,000 time steps (less than a second), OOPS found nongeneral but working code for n = 1: (defnp 2toD)
A. Newell, H. Simon (1995)
GPS, a program that simulates human thought
By time 541 × 10^9 (roughly 3 days), it had found fresh code (new a_last) for n = 1, 2, 3: (c3 dec boostq defnp c4 calltp c3 c5 calltp endnp)
J. Schmidhuber (1993)
Proceedings of the International Conference on Artificial Neural Networks
J. Schmidhuber, Jieyu Zhao, M. Wiering (1996)
Simple Principles of Metalearning
A. Turing (1937)
On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., s2-42
J. Schmidhuber (1993)
Proceedings of the International Conference on Artificial Neural Networks
Marcus Hutter (2002)
Self-Optimizing and Pareto-Optimal Policies in General Environments based on Bayes-Mixtures. ArXiv, cs.AI/0204040
P. Werbos (1974)
Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences
M. Dorigo, G. Caro, L. Gambardella (1999)
Ant Algorithms for Discrete Optimization. Artificial Life, 5
A binary quoteflag determining whether the instructions pointed to by ip will get executed or just quoted, that is, pushed onto ds
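The quoteflag semantics above can be sketched with a toy interpreter step (our own minimal illustration, not the paper's implementation; the instruction set here is hypothetical):

```python
# Toy sketch of quoteflag semantics: when quoteflag is set, the
# instruction token at ip is pushed onto the data stack ds ("quoted")
# instead of being executed.

def step(code, ip, ds, quoteflag, ops):
    """Execute or quote the instruction at code[ip]; return the new ip."""
    token = code[ip]
    if quoteflag:
        ds.append(token)   # quoted: push the instruction number onto ds
    else:
        ops[token](ds)     # executed: run the instruction's semantics
    return ip + 1

# Hypothetical one-instruction set: 0 = inc (increment top of stack)
ops = {0: lambda ds: ds.append(ds.pop() + 1)}
ds = [5]
step([0], 0, ds, quoteflag=False, ops=ops)  # executes inc: ds == [6]
step([0], 0, ds, quoteflag=True, ops=ops)   # quotes it:    ds == [6, 0]
```

Quoting lets later code treat instruction numbers as ordinary data, e.g. to assemble new code on the stack.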
C. Anderson (1986)
Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning)
An auxiliary data stack Ds
David Bulman (1977)
Stack Computers [Guest editor's introduction]. Computer, 10
H. P. Schwefel (1974)
Numerische Optimierung von Computer-Modellen
(1993)
The SOAR Papers
(2002)
Optimal Ordered Problem Solver
D. Wolpert, W. Macready (1995)
No Free Lunch Theorems for Search
J. Schmidhuber (2003)
Advances in Neural Information Processing Systems 15
J. Schmidhuber (2000)
Algorithmic Theories of Everything. ArXiv, quant-ph/0011122
G. Chaitin (1975)
A Theory of Program Size Formally Identical to Information Theory. J. ACM, 22
J. Schmidhuber (2003)
Real AI: New Approaches to Artificial General Intelligence
J. Schmidhuber, Viktor Zhumatiy, M. Gagliolo, F. Groen, N. Amato, Andrea Bonarini, E. Yoshida, B. Kröse (2004)
Bias-Optimal Incremental Learning of Control Sequences for Virtual Robots
E. Baum (1999)
Toward a Model of Intelligence as an Economy of Agents. Machine Learning, 35
I. Rechenberg (1973)
Evolutionsstrategie : Optimierung technischer Systeme nach Prinzipien der biologischen Evolution
J. Schmidhuber (2002)
Hierarchies of Generalized Kolmogorov Complexities and Nonenumerable Universal Measures Computable in the Limit. Int. J. Found. Comput. Sci., 13
K. Zuse (1991)
Rechnender Raum
D. Rumelhart, Geoffrey Hinton, Ronald Williams (1986)
Learning internal representations by error propagation
S. Hochreiter, A. Younger, P. Conwell (2001)
Learning to Learn Using Gradient Descent
S. Hochreiter, A. S. Younger, P. R. Conwell (2001)
Proc. Intl. Conf. on Artificial Neural Networks (ICANN-2001)
P. Utgoff (1986)
Machine Learning
J. Schmidhuber (1993)
An 'introspective' network that can learn to run its own weight change algorithm
J. Schmidhuber (1997)
Discovering Neural Nets with Low Kolmogorov Complexity and High Generalization Capability. Neural Networks, 10(5)
R. Solomonoff (1985)
The Application of Algorithmic Probability to Problems in Artificial Intelligence
inc(x) returns x + 1; dec(x) returns x − 1; by2(x) returns 2x; add(x,y) returns x + y
Simple Stack Manipulators. del() decrements dp; clear() sets dp := 0; dp2ds() returns dp; setdp(x) sets dp := x; ip2ds() returns ip
J. Holland (1985)
Properties of the Bucket Brigade
Ming Li, P. Vitányi (1993)
An Introduction to Kolmogorov Complexity and Its Applications
H. Bremermann (1982)
Minimum energy requirements of information transfer and computing. International Journal of Theoretical Physics, 21
R. Salustowicz, J. Schmidhuber (1998)
Evolving Structured Programs with Hierarchical Instructions and Skip Nodes
J. Schmidhuber, Jieyu Zhao, M. Wiering (1997)
Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement. Machine Learning, 28
Instruction jmp1(val, n) sets cs[cp]
Wolfgang Banzhaf, Peter Nordin, Robert Keller, F. Francone (1997)
Genetic Programming: An Introduction
(2003)
Towards solving the grand problem of AI
J. H. Holland (1985)
Proceedings of an International Conference on Genetic Algorithms
J. Schmidhuber (1993)
A ‘Self-Referential’ Weight Matrix
A data stack ds(r) (or ds for short, omitting the task index) for storing function arguments. (The corresponding stack pointer is dp: 0 ≤ dp ≤ maxdp.)
(1966)
Advances in Neural Information Processing Systems 4 (pp. 831–838)
(2001)
General methods for search and reinforcement learning. Application for SNF grant 21-6678
C. Green (1969)
Application of Theorem Proving to Problem Solving
Operand and(x,y) returns 1 if x > 0 and y > 0, otherwise 0. Analogously for or(x,y)
1, 1, fac, up c1 ex rt0 del up dec fac mul ret) declares a recursive function fac(n) which returns 1 if n = 0, otherwise returns n × fac(n−1)
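The pattern above, where a function can call itself through "the most recent declared function" before its definition is finished, can be sketched in ordinary Python via a growing registry (a hedged illustration of the assumed semantics, with our own names, not the paper's interpreter):

```python
# Sketch of the "call the most recent self-made function" pattern:
# a slot for the new function is reserved in the registry *before*
# its body is complete, so the body can reach itself through the
# registry and recurse.

functions = []  # registry of functions declared so far

def define(body):
    idx = len(functions)
    functions.append(None)  # reserve the slot before the body is finished
    # the body reaches "the most recent function" via functions[idx]
    functions[idx] = lambda n: body(n, lambda m: functions[idx](m))
    return functions[idx]

# fac(n) returns 1 if n == 0, else n * fac(n - 1), as in the text
fac = define(lambda n, self_call: 1 if n == 0 else n * self_call(n - 1))
```

The key point is that recursion works through the registry index, so the self-call resolves correctly even though the slot was empty while the body was being constructed.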
J. Schmidhuber, Jieyu Zhao, N. Schraudolph (1998)
Reinforcement Learning with Self-Modifying Policies
J. Schmidhuber (2002)
Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002)
J. Koopman (1989)
Stack computers: the new wave
Jürgen Schmidhuber (1987)
Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook
N. L. Cramer (1985)
Proceedings of an International Conference on Genetic Algorithms and Their Applications
(1994)
Artificial Intelligence: A Modern Approach
L. Gambardella, M. Dorigo (2000)
An Ant Colony System Hybridized with a New Local Search for the Sequential Ordering Problem. INFORMS J. Comput., 12
works like cpn(1); cpnb(n) copies n ds entries above ds
Jana Koehler, Bernhard Nebel, J. Hoffmann, Yannis Dimopoulos (1997)
Extending Planning Graphs to an ADL Subset
Charles Bennett (1982)
The thermodynamics of computation—a review. International Journal of Theoretical Physics, 21
A. Kolmogorov (1965)
Three approaches to the quantitative definition of information. Problems of Information Transmission, 1
P. Langley (1985)
Learning to Search: From Weak Methods to Domain-Specific Heuristics. Cogn. Sci., 9
S. Kothari, H. Oh (1993)
Neural Networks for Pattern Recognition. Adv. Comput., 37
A.1 Data Structures on Tapes. Each tape r contains various stack-like data structures represented as sequences of integers.
At time 10^7 (roughly 10 s) it had solved the 2nd instance by simply prolonging the previous code, using the old, unchanged start address a_last: (defnp 2toD grt c2 c2 endnp)
(1973)
Universal sequential search problems
At time 2.85 × 10^9 (less than 1 hour) it had solved the 4th instance through prolongation: (defnp 2toD grt c2 c2 endnp boostq delD delD bsf 2toD fromD delD delD delD fromD bsf by2 bsf)
sub(x,y) returns x − y; mul(x,y) returns x * y; div(x,y) returns the largest integer ≤ x/y; powr(x,y) returns x^y (and costs y unit time steps)
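These operands can be rendered as operations on the data stack ds of a Forth-like interpreter. The following is a minimal sketch under our own naming conventions, not the paper's code (which also handles overflow and error conditions):

```python
# Arithmetic operands acting on the data stack ds; the top of the
# stack holds the second argument, Forth-style.

def inc(ds): ds.append(ds.pop() + 1)
def dec(ds): ds.append(ds.pop() - 1)
def by2(ds): ds.append(2 * ds.pop())

def _binop(ds, f):
    y, x = ds.pop(), ds.pop()   # y was pushed last
    ds.append(f(x, y))

def add(ds):  _binop(ds, lambda x, y: x + y)
def sub(ds):  _binop(ds, lambda x, y: x - y)
def mul(ds):  _binop(ds, lambda x, y: x * y)
def div(ds):  _binop(ds, lambda x, y: x // y)  # largest integer <= x/y
def powr(ds): _binop(ds, lambda x, y: x ** y)  # in the paper, costs y steps

ds = [7, 2]
sub(ds)   # ds == [5]
by2(ds)   # ds == [10]
```

Python's `//` already floors toward negative infinity, matching "largest integer ≤ x/y".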
J. Schmidhuber (1995)
Machine Learning: Proceedings of the Twelfth International Conference
Then OOPS switched to the Hanoi problem. Almost immediately (less than 1 ms later), at time 30,665,064,995, it had found the trivial code for n = 1: (
A stack pats of search patterns
R. Salustowicz, M. Wiering, J. Schmidhuber (1998)
Learning Team Strategies: Soccer Case Studies. Machine Learning, 33
S. Wolfram (1983)
Universality and complexity in cellular automata. Physica D: Nonlinear Phenomena, 10
J. Schmidhuber (2002)
Bias-Optimal Incremental Problem Solving
L. Levin (1984)
Randomness Conservation Inequalities; Information and Independence in Mathematical Theories. Inf. Control., 61
R. P. Salustowicz, J. Schmidhuber (1998)
Machine Learning: Proceedings of the Fifeteenth International Conference (ICML'98)
Michael Jordan, D. Rumelhart (1992)
Forward Models: Supervised Learning with a Distal Teacher. Cogn. Sci., 16
J. Schmidhuber (2001)
Sequential Decision Making Based on Direct Search
Marcus Hutter (2000)
Towards a Universal Theory of Artificial Intelligence Based on Algorithmic Probability and Sequential Decisions. ArXiv, cs.AI/0012011
(1987)
Learning how the world works: Specifications for predictive networks in robots and brains
A. Kolmogorov (1968)
Three approaches to the quantitative definition of information. International Journal of Computer Mathematics, 2
(1950)
Random processes and transformations
L. Kaelbling, M. Littman, A. Moore (1996)
Reinforcement learning: A survey. Journal of AI Research, 4
Ivo Kwee, Marcus Hutter, J. Schmidhuber (2001)
Market-Based Reinforcement Learning in Partially Observable Worlds
S. Goto (2002)
Great books and papers of the 20th century: A. M. Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, 43
J. Neumann, A. Burks (1967)
Theory Of Self Reproducing Automata
S. Lloyd (1999)
Ultimate physical limits to computation. Nature, 406
N. Cramer (1985)
A Representation for the Adaptive Generation of Simple Sequential Programs
Y. Deville, K. Lau (1994)
Logic Program Synthesis. J. Log. Program., 19/20
W. Vent (1975)
Review of: Rechenberg, I., Evolutionsstrategie — Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog-Verlag, Stuttgart, 1973. Feddes Repertorium, 86
(1987)
The genetic algorithm: an implementation in Prolog. Advanced student project (Fortgeschrittenenpraktikum), Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München
J. Schmidhuber (2003)
Soft Computing and complex systems
R. Olsson (1995)
Inductive Functional Programming Using Incremental Program Transformation. Artif. Intell., 74
J. Schmidhuber (2003)
Exploring the predictable
R. Solomonoff (1964)
A Formal Theory of Inductive Inference. Part II. Inf. Control., 7
D. Lenat (1983)
Theory Formation by Heuristic Search. Artif. Intell., 21
W. Banzhaf, F. Francone, Robert Keller, P. Nordin (1998)
Genetic programming - An Introduction: On the Automatic Evolution of Computer Programs and Its Applications
E. Fredkin, T. Toffoli (2002)
Conservative logic. International Journal of Theoretical Physics, 21
J. Schmidhuber (2003)
Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements. ArXiv, cs.LO/0309048
J. Schmidhuber (1994)
On learning how to learn learning strategies
V. Vapnik (1991)
Principles of Risk Minimization for Learning Theory
J. Schmidhuber (1990)
Reinforcement Learning in Markovian and Non-Markovian Environments
W. Gasarch (1997)
Book review: An Introduction to Kolmogorov Complexity and Its Applications, 2nd edition, by Ming Li and Paul Vitányi (Springer Graduate Texts). ACM SIGACT News, 28
K. Gödel (1931)
Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik, 149
By time 260 × 10^9 (more than 1 day), it had found fresh yet somewhat bizarre code (new start address a_last) for n = 1, 2: (c4 c3 cpn c4 by2 c3 by2 exec)
J. Holland (1975)
Adaptation in natural and artificial systems
Marcus Hutter (2000)
The Fastest and Shortest Algorithm for all Well-Defined Problems. ArXiv, cs.CC/0206022
M. Wiering, J. Schmidhuber (1996)
Solving POMDPs with Levin Search and EIRA
C. Moore, G. Leach (1970)
Forth - a language for interactive computing
D. Nguyen, B. Widrow (1989)
The truck backer-upper: an example of self-learning in neural networks. International 1989 Joint Conference on Neural Networks
(1974)
Laws of information (nongrowth) and aspects of the foundation of probability theory
R. Salustowicz, Jürgen Schmidhuber (1997)
Probabilistic Incremental Program Evolution. Evolutionary Computation, 5
J. Schmidhuber (1995)
Discovering Solutions with Low Kolmogorov Complexity and High Generalization Capability
We introduce a general and in a certain sense time-optimal way of solving one problem after another, efficiently searching the space of programs that compute solution candidates, including those programs that organize and manage and adapt and reuse earlier acquired knowledge. The Optimal Ordered Problem Solver (OOPS) draws inspiration from Levin's Universal Search designed for single problems and universal Turing machines. It spends part of the total search time for a new problem on testing programs that exploit previous solution-computing programs in computable ways. If the new problem can be solved faster by copy-editing/invoking previous code than by solving the new problem from scratch, then OOPS will find this out. If not, then at least the previous solutions will not cause much harm. We introduce an efficient, recursive, backtracking-based way of implementing OOPS on realistic computers with limited storage. Experiments illustrate how OOPS can greatly profit from metalearning or metasearching, that is, searching for faster search procedures.
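OOPS builds on Levin's Universal Search, which in each phase allocates runtime to every candidate program in proportion to its prior probability (2^-length for bit-string programs under a uniform per-bit prior). A toy sketch of that allocation scheme over bit-string programs (our own illustration with a deliberately trivial interpreter, not the OOPS implementation):

```python
# Toy Levin-style search: in phase i, each program p of length l gets
# roughly 2^(i - l) interpreter steps, i.e. a share of the phase's time
# budget proportional to its prior 2^-l.

from itertools import product

def toy_run(program, steps):
    """Trivial interpreter: emits one output bit per step, in order."""
    out = []
    for i, bit in enumerate(program):
        if i >= steps:
            break
        out.append(bit)
    return "".join(out)

def levin_search(is_solution, max_phase=20):
    for phase in range(1, max_phase + 1):
        budget = 2 ** phase                    # total time in this phase
        for length in range(1, phase + 1):
            steps = budget // (2 ** length)    # share proportional to prior
            if steps == 0:
                continue
            for program in product("01", repeat=length):
                if is_solution(toy_run(program, steps)):
                    return "".join(program)
    return None

# e.g. find a program whose output is "101"
```

Doubling the budget each phase means total time spent is within a constant factor of the time the fastest solving program needs, weighted by its prior; OOPS adds to this the reuse of earlier frozen programs across a growing task sequence.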
Machine Learning – Springer Journals
Published: Oct 18, 2004