Author	Lorenz, Uwe
Publication/distribution/printing	Cham, Switzerland : Springer, [2022]
Physical description	1 online resource (195 pages)
Classification	005.133
Topical subject	Java (Computer program language); Reinforcement learning; Java (Llenguatge de programació); Aprenentatge per reforç (Intel·ligència artificial)
Genre/form subject	Llibres electrònics
ISBN	9783031090301 9783031090295
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	Intro -- Preface -- Introduction -- Contents -- 1: Reinforcement Learning as a Subfield of Machine Learning -- 1.1 Machine Learning as Automated Processing of Feedback from the Environment -- 1.2 Machine Learning -- 1.3 Reinforcement Learning with Java -- Bibliography -- 2: Basic Concepts of Reinforcement Learning -- 2.1 Agents -- 2.2 The Policy of the Agent -- 2.3 Evaluation of States and Actions (Q-Function, Bellman Equation) -- Bibliography -- 3: Optimal Decision-Making in a Known Environment -- 3.1 Value Iteration -- 3.1.1 Target-Oriented Condition Assessment ("Backward Induction") -- 3.1.2 Policy-Based State Valuation (Reward Prediction) -- 3.2 Iterative Policy Search -- 3.2.1 Direct Policy Improvement -- 3.2.2 Mutual Improvement of Policy and Value Function -- 3.3 Optimal Policy in a Board Game Scenario -- 3.4 Summary -- Bibliography -- 4: Decision-Making and Learning in an Unknown Environment -- 4.1 Exploration vs. Exploitation -- 4.2 Retroactive Processing of Experience ("Model-Free Reinforcement Learning") -- 4.2.1 Goal-Oriented Learning ("Value-Based") -- Subsequent Evaluation of Complete Episodes ("Monte Carlo" Method) -- Immediate Valuation Using the Temporal Difference (Q- and SARSA Algorithm) -- Consideration of the Action History (Eligibility Traces) -- 4.2.2 Policy Search -- Monte Carlo Tactics Search -- Evolutionary Strategies -- Monte Carlo Policy Gradient (REINFORCE) -- 4.2.3 Combined Methods (Actor-Critic) -- "Actor-Critic" Policy Gradients -- Technical Improvements to the Actor-Critic Architecture -- Feature Vectors and Partially Observable Environments -- 4.3 Exploration with Predictive Simulations ("Model-Based Reinforcement Learning") -- 4.3.1 Dyna-Q -- 4.3.2 Monte Carlo Rollout -- 4.3.3 Artificial Curiosity -- 4.3.4 Monte Carlo Tree Search (MCTS) -- 4.3.5 Remarks on the Concept of Intelligence -- 4.4 Systematics of the Learning Methods -- Bibliography -- 5: Artificial Neural Networks as Estimators for State Values and the Action Selection -- 5.1 Artificial Neural Networks -- 5.1.1 Pattern Recognition with the Perceptron -- 5.1.2 The Adaptability of Artificial Neural Networks -- 5.1.3 Backpropagation Learning -- 5.1.4 Regression with Multilayer Perceptrons -- 5.2 State Evaluation with Generalizing Approximations -- 5.3 Neural Estimators for Action Selection -- 5.3.1 Policy Gradient with Neural Networks -- 5.3.2 Proximal Policy Optimization -- 5.3.3 Evolutionary Strategy with a Neural Policy -- Bibliography -- 6: Guiding Ideas in Artificial Intelligence over Time -- 6.1 Changing Guiding Ideas -- 6.2 On the Relationship Between Humans and Artificial Intelligence -- Bibliography.
Record No.	UNISA-996495167903316

Author	Lorenz, Uwe
Edition	[1st ed. 2022.]
Publication/distribution/printing	Cham : Springer International Publishing : Imprint: Springer, 2022
Physical description	1 online resource (195 pages)
Classification	005.133; 006.31
Series	Mathematics and Statistics Series
Topical subject	Machine learning; Java (Computer program language); Data mining; Machine Learning; Java; Data Mining and Knowledge Discovery
ISBN	9783031090301 9783031090295
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	1 Reinforcement learning as a subfield of machine learning -- 2 Basic concepts of reinforcement learning -- 3 Optimal decision-making in a known environment -- 4 Decision-making and learning in an unknown environment -- 5 Artificial neural networks as estimators for state values and the action selection -- 6 Guiding ideas in artificial intelligence over time.
Record No.	UNINA-9910624394103321