1. |
Record No. |
UNINA9910624394103321 |
|
|
Author |
Lorenz, Uwe |
|
|
Title |
Reinforcement learning from scratch : understanding current approaches - with examples in Java and Greenfoot / Uwe Lorenz |
|
|
|
|
|
|
|
Publication/distribution/printing |
|
|
Cham, Switzerland : Springer, [2022] |
|
©2022 |
|
|
|
|
|
|
|
|
|
ISBN |
|
9783031090301 |
9783031090295 |
|
|
|
|
|
|
|
|
Physical description |
|
1 online resource (195 pages) |
|
|
|
|
|
|
Discipline |
|
|
|
|
|
|
Subjects |
|
Java (Computer program language) |
Reinforcement learning |
Java (Llenguatge de programació) |
Aprenentatge per reforç (Intel·ligència artificial) |
Llibres electrònics |
|
|
|
|
|
|
|
|
Language of publication |
|
|
|
|
|
|
Format |
Printed material |
|
|
|
|
|
Bibliographic level |
Monograph |
|
|
|
|
|
Bibliography note |
|
Includes bibliographical references. |
|
|
|
|
|
|
Contents note |
|
Intro -- Preface -- Introduction -- Contents -- 1: Reinforcement Learning as a Subfield of Machine Learning -- 1.1 Machine Learning as Automated Processing of Feedback from the Environment -- 1.2 Machine Learning -- 1.3 Reinforcement Learning with Java -- Bibliography -- 2: Basic Concepts of Reinforcement Learning -- 2.1 Agents -- 2.2 The Policy of the Agent -- 2.3 Evaluation of States and Actions (Q-Function, Bellman Equation) -- Bibliography -- 3: Optimal Decision-Making in a Known Environment -- 3.1 Value Iteration -- 3.1.1 Target-Oriented Condition Assessment ("Backward Induction") -- 3.1.2 Policy-Based State Valuation (Reward Prediction) -- 3.2 Iterative Policy Search -- 3.2.1 Direct Policy Improvement -- 3.2.2 Mutual Improvement of Policy and Value Function -- 3.3 Optimal Policy in a Board Game Scenario -- 3.4 Summary -- Bibliography -- 4: Decision-Making and Learning in an Unknown Environment -- 4.1 Exploration vs. Exploitation -- 4.2 Retroactive Processing of Experience ("Model-Free Reinforcement Learning") -- 4.2.1 Goal-Oriented Learning ("Value-Based") -- Subsequent |