Generic multi-agent reinforcement learning approach for flexible job-shop scheduling / Schirin Bär
Author Bär, Schirin
Publication Wiesbaden : Springer Vieweg, [2022]
Physical description 1 online resource (163 pages)
Discipline 670.285
Topical subject Flexible manufacturing systems
Reinforcement learning
Aprenentatge per reforç (Intel·ligència artificial)
Sistemes multiagent
Sistemes de producció flexibles
Genre/form subject Llibres electrònics
ISBN 9783658391799
9783658391782
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Danksagung -- Abstract -- Zusammenfassung -- Contents -- Abbreviations -- List of Figures -- List of Tables -- 1 Introduction -- 1.1 Research Goals -- 1.2 Methodology -- 1.3 Structure of the Thesis -- 2 Requirements for Production Scheduling in Flexible Manufacturing -- 2.1 Foundations of Flexible Job-Shop Scheduling Problems -- 2.2 Requirement Analysis of Flexible Scheduling Solutions -- 2.2.1 Influences on Warehouse Control Systems -- 2.2.2 Influences on Manufacturing Control Systems -- 2.2.3 Derived and Ranked Requirements -- 2.3 State of the Art: Approaches to Solve Job-Shop Scheduling Problems -- 2.3.1 Conventional Scheduling Solutions -- 2.3.2 Reinforcement Learning Scheduling Solutions -- 2.4 Identification of the Research Gap -- 2.5 Contribution of this Work: Extended Flexible Job-Shop Scheduling Problem -- 3 Reinforcement Learning as an Approach for Flexible Scheduling -- 3.1 Understanding the Foundations: Formalization as a Markov Decision Process -- 3.1.1 Agent-Environment Interaction -- 3.1.2 Policies and Value Functions -- 3.1.3 Challenges Arising in Reinforcement Learning -- 3.2 Deep Q-Learning -- 3.2.1 Temporal Difference Learning and Q-Learning -- 3.2.2 Deep Q-Network -- 3.2.3 Loss Optimization -- 3.3 State of the Art: Cooperating Agents to Solve Complex Problems -- 3.3.1 Multi-Agent Learning Methods -- 3.3.2 Learning in Cooperative Multi-Agent RL Setups -- 3.4 Summary -- 4 Concept for Multi-Resources Flexible Job-Shop Scheduling -- 4.1 Concept for Agent-based Scheduling in FMS -- 4.1.1 Overall Concept -- 4.1.2 Job Specification -- 4.1.3 Petri Net Simulation -- 4.2 Formalization as a Markov Decision Process -- 4.2.1 Action Designs -- 4.2.2 State Designs -- 4.2.3 Reward Design -- 4.3 Considered Flexible Manufacturing System -- 4.4 Evaluation of the Technical Functionalities -- 4.5 Summary.
5 Multi-Agent Approach for Reactive Scheduling in Flexible Manufacturing -- 5.1 Training Set-up -- 5.2 Specification of the Reward Design -- 5.3 Evaluation of Suitable Training Strategies -- 5.3.1 Evaluation of MARL Algorithms -- 5.3.2 Selection of MARL Learning Methods -- 5.3.3 Evaluation of Parameter Sharing and Centralized Learning -- 5.4 Training Approach to Present Situations -- 5.5 Summary -- 6 Empirical Evaluation of the Requirements -- 6.1 Generalization to Various Products and Machines -- 6.2 Achieving the Global Objective -- 6.2.1 Comparison of Dense and Sparse Global Rewards -- 6.2.2 Cooperative Behavior -- 6.3 Benchmarking against Heuristic Search Algorithms -- 6.3.1 Evaluation for Unknown and Known Situations -- 6.3.2 Evaluation of Real-time Decision-Making -- 6.4 Consolidated Requirements Evaluation -- 6.5 Summary -- 7 Integration into a Flexible Manufacturing System -- 7.1 Acceptance Criteria for the Integration Concept -- 7.2 Integration Concept of MARL Scheduling Solution -- 7.2.1 Integration in the MES -- 7.2.2 Information Exchange -- 7.3 Design Cycle -- 7.3.1 Functioning Scheduling -- 7.3.2 Efficient Production Flow -- 7.3.3 Handling of Unforeseen Events -- 7.3.4 Handling of New Machine Skills -- 7.3.5 Handling of New Machines -- 7.4 Summary -- 8 Critical Discussion and Outlook -- 9 Summary -- 1 Bibliography.
Record no. UNINA-9910616202703321
Available at: Univ. Federico II
OPAC: Check availability here
Generic multi-agent reinforcement learning approach for flexible job-shop scheduling / Schirin Bär
Author Bär, Schirin
Publication Wiesbaden : Springer Vieweg, [2022]
Physical description 1 online resource (163 pages)
Discipline 670.285
Topical subject Flexible manufacturing systems
Reinforcement learning
Aprenentatge per reforç (Intel·ligència artificial)
Sistemes multiagent
Sistemes de producció flexibles
Genre/form subject Llibres electrònics
ISBN 9783658391799
9783658391782
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Danksagung -- Abstract -- Zusammenfassung -- Contents -- Abbreviations -- List of Figures -- List of Tables -- 1 Introduction -- 1.1 Research Goals -- 1.2 Methodology -- 1.3 Structure of the Thesis -- 2 Requirements for Production Scheduling in Flexible Manufacturing -- 2.1 Foundations of Flexible Job-Shop Scheduling Problems -- 2.2 Requirement Analysis of Flexible Scheduling Solutions -- 2.2.1 Influences on Warehouse Control Systems -- 2.2.2 Influences on Manufacturing Control Systems -- 2.2.3 Derived and Ranked Requirements -- 2.3 State of the Art: Approaches to Solve Job-Shop Scheduling Problems -- 2.3.1 Conventional Scheduling Solutions -- 2.3.2 Reinforcement Learning Scheduling Solutions -- 2.4 Identification of the Research Gap -- 2.5 Contribution of this Work: Extended Flexible Job-Shop Scheduling Problem -- 3 Reinforcement Learning as an Approach for Flexible Scheduling -- 3.1 Understanding the Foundations: Formalization as a Markov Decision Process -- 3.1.1 Agent-Environment Interaction -- 3.1.2 Policies and Value Functions -- 3.1.3 Challenges Arising in Reinforcement Learning -- 3.2 Deep Q-Learning -- 3.2.1 Temporal Difference Learning and Q-Learning -- 3.2.2 Deep Q-Network -- 3.2.3 Loss Optimization -- 3.3 State of the Art: Cooperating Agents to Solve Complex Problems -- 3.3.1 Multi-Agent Learning Methods -- 3.3.2 Learning in Cooperative Multi-Agent RL Setups -- 3.4 Summary -- 4 Concept for Multi-Resources Flexible Job-Shop Scheduling -- 4.1 Concept for Agent-based Scheduling in FMS -- 4.1.1 Overall Concept -- 4.1.2 Job Specification -- 4.1.3 Petri Net Simulation -- 4.2 Formalization as a Markov Decision Process -- 4.2.1 Action Designs -- 4.2.2 State Designs -- 4.2.3 Reward Design -- 4.3 Considered Flexible Manufacturing System -- 4.4 Evaluation of the Technical Functionalities -- 4.5 Summary.
5 Multi-Agent Approach for Reactive Scheduling in Flexible Manufacturing -- 5.1 Training Set-up -- 5.2 Specification of the Reward Design -- 5.3 Evaluation of Suitable Training Strategies -- 5.3.1 Evaluation of MARL Algorithms -- 5.3.2 Selection of MARL Learning Methods -- 5.3.3 Evaluation of Parameter Sharing and Centralized Learning -- 5.4 Training Approach to Present Situations -- 5.5 Summary -- 6 Empirical Evaluation of the Requirements -- 6.1 Generalization to Various Products and Machines -- 6.2 Achieving the Global Objective -- 6.2.1 Comparison of Dense and Sparse Global Rewards -- 6.2.2 Cooperative Behavior -- 6.3 Benchmarking against Heuristic Search Algorithms -- 6.3.1 Evaluation for Unknown and Known Situations -- 6.3.2 Evaluation of Real-time Decision-Making -- 6.4 Consolidated Requirements Evaluation -- 6.5 Summary -- 7 Integration into a Flexible Manufacturing System -- 7.1 Acceptance Criteria for the Integration Concept -- 7.2 Integration Concept of MARL Scheduling Solution -- 7.2.1 Integration in the MES -- 7.2.2 Information Exchange -- 7.3 Design Cycle -- 7.3.1 Functioning Scheduling -- 7.3.2 Efficient Production Flow -- 7.3.3 Handling of Unforeseen Events -- 7.3.4 Handling of New Machine Skills -- 7.3.5 Handling of New Machines -- 7.4 Summary -- 8 Critical Discussion and Outlook -- 9 Summary -- 1 Bibliography.
Record no. UNISA-996495171403316
Available at: Univ. di Salerno
OPAC: Check availability here
Handbook of reinforcement learning and control / Kyriakos G. Vamvoudakis [and three others], editors
Publication Cham, Switzerland : Springer, [2021]
Physical description 1 online resource (839 pages)
Discipline 006.31
Series Studies in Systems, Decision and Control
Topical subject Reinforcement learning
Automatic control - Sensitivity
Aprenentatge per reforç (Intel·ligència artificial)
Control automàtic
Genre/form subject Llibres electrònics
ISBN 3-030-60990-1
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Contents -- Part I Theory of Reinforcement Learning for Model-Free and Model-Based Control and Games -- 1 What May Lie Ahead in Reinforcement Learning -- References -- 2 Reinforcement Learning for Distributed Control and Multi-player Games -- 2.1 Introduction -- 2.2 Optimal Control of Continuous-Time Systems -- 2.2.1 IRL with Experience Replay Learning Technique -- 2.2.2 H∞ Control of CT Systems -- 2.3 Nash Games -- 2.4 Graphical Games -- 2.4.1 Off-Policy RL for Graphical Games -- 2.5 Output Synchronization of Multi-agent Systems -- 2.6 Conclusion and Open Research Directions -- References -- 3 From Reinforcement Learning to Optimal Control: A Unified Framework for Sequential Decisions -- 3.1 Introduction -- 3.2 The Communities of Sequential Decisions -- 3.3 Stochastic Optimal Control Versus Reinforcement Learning -- 3.3.1 Stochastic Control -- 3.3.2 Reinforcement Learning -- 3.3.3 A Critique of the MDP Modeling Framework -- 3.3.4 Bridging Optimal Control and Reinforcement Learning -- 3.4 The Universal Modeling Framework -- 3.4.1 Dimensions of a Sequential Decision Model -- 3.4.2 State Variables -- 3.4.3 Objective Functions -- 3.4.4 Notes -- 3.5 Energy Storage Illustration -- 3.5.1 A Basic Energy Storage Problem -- 3.5.2 With a Time-Series Price Model -- 3.5.3 With Passive Learning -- 3.5.4 With Active Learning -- 3.5.5 With Rolling Forecasts -- 3.5.6 Remarks -- 3.6 Designing Policies -- 3.6.1 Policy Search -- 3.6.2 Lookahead Approximations -- 3.6.3 Hybrid Policies -- 3.6.4 Remarks -- 3.6.5 Stochastic Control, Reinforcement Learning, and the Four Classes of Policies -- 3.7 Policies for Energy Storage -- 3.8 Extension to Multi-agent Systems -- 3.9 Observations -- References -- 4 Fundamental Design Principles for Reinforcement Learning Algorithms -- 4.1 Introduction.
4.1.1 Stochastic Approximation and Reinforcement Learning -- 4.1.2 Sample Complexity Bounds -- 4.1.3 What Will You Find in This Chapter? -- 4.1.4 Literature Survey -- 4.2 Stochastic Approximation: New and Old Tricks -- 4.2.1 What is Stochastic Approximation? -- 4.2.2 Stochastic Approximation and Learning -- 4.2.3 Stability and Convergence -- 4.2.4 Zap-Stochastic Approximation -- 4.2.5 Rates of Convergence -- 4.2.6 Optimal Convergence Rate -- 4.2.7 TD and LSTD Algorithms -- 4.3 Zap Q-Learning: Fastest Convergent Q-Learning -- 4.3.1 Markov Decision Processes -- 4.3.2 Value Functions and the Bellman Equation -- 4.3.3 Q-Learning -- 4.3.4 Tabular Q-Learning -- 4.3.5 Convergence and Rate of Convergence -- 4.3.6 Zap Q-Learning -- 4.4 Numerical Results -- 4.4.1 Finite State-Action MDP -- 4.4.2 Optimal Stopping in Finance -- 4.5 Zap-Q with Nonlinear Function Approximation -- 4.5.1 Choosing the Eligibility Vectors -- 4.5.2 Theory and Challenges -- 4.5.3 Regularized Zap-Q -- 4.6 Conclusions and Future Work -- References -- 5 Mixed Density Methods for Approximate Dynamic Programming -- 5.1 Introduction -- 5.2 Unconstrained Affine-Quadratic Regulator -- 5.3 Regional Model-Based Reinforcement Learning -- 5.3.1 Preliminaries -- 5.3.2 Regional Value Function Approximation -- 5.3.3 Bellman Error -- 5.3.4 Actor and Critic Update Laws -- 5.3.5 Stability Analysis -- 5.3.6 Summary -- 5.4 Local (State-Following) Model-Based Reinforcement Learning -- 5.4.1 StaF Kernel Functions -- 5.4.2 Local Value Function Approximation -- 5.4.3 Actor and Critic Update Laws -- 5.4.4 Analysis -- 5.4.5 Stability Analysis -- 5.4.6 Summary -- 5.5 Combining Regional and Local State-Following Approximations -- 5.6 Reinforcement Learning with Sparse Bellman Error Extrapolation -- 5.7 Conclusion -- References -- 6 Model-Free Linear Quadratic Regulator.
6.1 Introduction to a Model-Free LQR Problem -- 6.2 A Gradient-Based Random Search Method -- 6.3 Main Results -- 6.4 Proof Sketch -- 6.4.1 Controlling the Bias -- 6.4.2 Correlation of f̂(K) and f(K) -- 6.5 An Example -- 6.6 Thoughts and Outlook -- References -- Part II Constraint-Driven and Verified RL -- 7 Adaptive Dynamic Programming in the Hamiltonian-Driven Framework -- 7.1 Introduction -- 7.1.1 Literature Review -- 7.1.2 Motivation -- 7.1.3 Structure -- 7.2 Problem Statement -- 7.3 Hamiltonian-Driven Framework -- 7.3.1 Policy Evaluation -- 7.3.2 Policy Comparison -- 7.3.3 Policy Improvement -- 7.4 Discussions on the Hamiltonian-Driven ADP -- 7.4.1 Implementation with Critic-Only Structure -- 7.4.2 Connection to Temporal Difference Learning -- 7.4.3 Connection to Value Gradient Learning -- 7.5 Simulation Study -- 7.6 Conclusion -- References -- 8 Reinforcement Learning for Optimal Adaptive Control of Time Delay Systems -- 8.1 Introduction -- 8.2 Problem Description -- 8.3 Extended State Augmentation -- 8.4 State Feedback Q-Learning Control of Time Delay Systems -- 8.5 Output Feedback Q-Learning Control of Time Delay Systems -- 8.6 Simulation Results -- 8.7 Conclusions -- References -- 9 Optimal Adaptive Control of Partially Uncertain Linear Continuous-Time Systems with State Delay -- 9.1 Introduction -- 9.2 Problem Statement -- 9.3 Linear Quadratic Regulator Design -- 9.3.1 Periodic Sampled Feedback -- 9.3.2 Event Sampled Feedback -- 9.4 Optimal Adaptive Control -- 9.4.1 Periodic Sampled Feedback -- 9.4.2 Event Sampled Feedback -- 9.4.3 Hybrid Reinforcement Learning Scheme -- 9.5 Perspectives on Controller Design with Image Feedback -- 9.6 Simulation Results -- 9.6.1 Linear Quadratic Regulator with Known Internal Dynamics -- 9.6.2 Optimal Adaptive Control with Unknown Drift Dynamics -- 9.7 Conclusion -- References.
10 Dissipativity-Based Verification for Autonomous Systems in Adversarial Environments -- 10.1 Introduction -- 10.1.1 Related Work -- 10.1.2 Contributions -- 10.1.3 Structure -- 10.1.4 Notation -- 10.2 Problem Formulation -- 10.2.1 (Q,S,R)-Dissipative and L2-Gain Stable Systems -- 10.3 Learning-Based Distributed Cascade Interconnection -- 10.4 Learning-Based L2-Gain Composition -- 10.4.1 Q-Learning for L2-Gain Verification -- 10.4.2 L2-Gain Model-Free Composition -- 10.5 Learning-Based Lossless Composition -- 10.6 Discussion -- 10.7 Conclusion and Future Work -- References -- 11 Reinforcement Learning-Based Model Reduction for Partial Differential Equations: Application to the Burgers Equation -- 11.1 Introduction -- 11.2 Basic Notation and Definitions -- 11.3 RL-Based Model Reduction of PDEs -- 11.3.1 Reduced-Order PDE Approximation -- 11.3.2 Proper Orthogonal Decomposition for ROMs -- 11.3.3 Closure Models for ROM Stabilization -- 11.3.4 Main Result: RL-Based Closure Model -- 11.4 Extremum Seeking Based Closure Model Auto-Tuning -- 11.5 The Case of the Burgers Equation -- 11.6 Conclusion -- References -- Part III Multi-agent Systems and RL -- 12 Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms -- 12.1 Introduction -- 12.2 Background -- 12.2.1 Single-Agent RL -- 12.2.2 Multi-Agent RL Framework -- 12.3 Challenges in MARL Theory -- 12.3.1 Non-unique Learning Goals -- 12.3.2 Non-stationarity -- 12.3.3 Scalability Issue -- 12.3.4 Various Information Structures -- 12.4 MARL Algorithms with Theory -- 12.4.1 Cooperative Setting -- 12.4.2 Competitive Setting -- 12.4.3 Mixed Setting -- 12.5 Application Highlights -- 12.5.1 Cooperative Setting -- 12.5.2 Competitive Setting -- 12.5.3 Mixed Settings -- 12.6 Conclusions and Future Directions -- References.
13 Computational Intelligence in Uncertainty Quantification for Learning Control and Differential Games -- 13.1 Introduction -- 13.2 Problem Formulation of Optimal Control for Uncertain Systems -- 13.2.1 Optimal Control for Systems with Parameters Modulated by Multi-dimensional Uncertainties -- 13.2.2 Optimal Control for Random Switching Systems -- 13.3 Effective Uncertainty Evaluation Methods -- 13.3.1 Problem Formulation -- 13.3.2 The MPCM -- 13.3.3 The MPCM-OFFD -- 13.4 Optimal Control Solutions for Systems with Parameter Modulated by Multi-dimensional Uncertainties -- 13.4.1 Reinforcement Learning-Based Stochastic Optimal Control -- 13.4.2 Q-Learning-Based Stochastic Optimal Control -- 13.5 Optimal Control Solutions for Random Switching Systems -- 13.5.1 Optimal Controller for Random Switching Systems -- 13.5.2 Effective Estimator for Random Switching Systems -- 13.6 Differential Games for Systems with Parameters Modulated by Multi-dimensional Uncertainties -- 13.6.1 Stochastic Two-Player Zero-Sum Game -- 13.6.2 Multi-player Nonzero-Sum Game -- 13.7 Applications -- 13.7.1 Traffic Flow Management Under Uncertain Weather -- 13.7.2 Learning Control for Aerial Communication Using Directional Antennas (ACDA) Systems -- 13.8 Summary -- References -- 14 A Top-Down Approach to Attain Decentralized Multi-agents -- 14.1 Introduction -- 14.2 Background -- 14.2.1 Reinforcement Learning -- 14.2.2 Multi-agent Reinforcement Learning -- 14.3 Centralized Learning, But Decentralized Execution -- 14.3.1 A Bottom-Up Approach -- 14.3.2 A Top-Down Approach -- 14.4 Centralized Expert Supervises Multi-agents -- 14.4.1 Imitation Learning -- 14.4.2 CESMA -- 14.5 Experiments -- 14.5.1 Decentralization Can Achieve Centralized Optimality -- 14.5.2 Expert Trajectories Versus Multi-agent Trajectories -- 14.6 Conclusion -- References.
15 Modeling and Mitigating Link-Flooding Distributed Denial-of-Service Attacks via Learning in Stackelberg Games.
Record no. UNINA-9910488722303321
Available at: Univ. Federico II
OPAC: Check availability here
Reinforcement learning from scratch : understanding current approaches - with examples in Java and Greenfoot / Uwe Lorenz
Author Lorenz, Uwe
Publication Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (195 pages)
Discipline 005.133
Topical subject Java (Computer program language)
Reinforcement learning
Java (Llenguatge de programació)
Aprenentatge per reforç (Intel·ligència artificial)
Genre/form subject Llibres electrònics
ISBN 9783031090301
9783031090295
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Introduction -- Contents -- 1: Reinforcement Learning as a Subfield of Machine Learning -- 1.1 Machine Learning as Automated Processing of Feedback from the Environment -- 1.2 Machine Learning -- 1.3 Reinforcement Learning with Java -- Bibliography -- 2: Basic Concepts of Reinforcement Learning -- 2.1 Agents -- 2.2 The Policy of the Agent -- 2.3 Evaluation of States and Actions (Q-Function, Bellman Equation) -- Bibliography -- 3: Optimal Decision-Making in a Known Environment -- 3.1 Value Iteration -- 3.1.1 Target-Oriented Condition Assessment ("Backward Induction") -- 3.1.2 Policy-Based State Valuation (Reward Prediction) -- 3.2 Iterative Policy Search -- 3.2.1 Direct Policy Improvement -- 3.2.2 Mutual Improvement of Policy and Value Function -- 3.3 Optimal Policy in a Board Game Scenario -- 3.4 Summary -- Bibliography -- 4: Decision-Making and Learning in an Unknown Environment -- 4.1 Exploration vs. Exploitation -- 4.2 Retroactive Processing of Experience ("Model-Free Reinforcement Learning") -- 4.2.1 Goal-Oriented Learning ("Value-Based") -- Subsequent evaluation of complete episodes ("Monte Carlo" Method) -- Immediate Valuation Using the Temporal Difference (Q- and SARSA Algorithm) -- Consideration of the Action History (Eligibility Traces) -- 4.2.2 Policy Search -- Monte Carlo Tactics Search -- Evolutionary Strategies -- Monte Carlo Policy Gradient (REINFORCE) -- 4.2.3 Combined Methods (Actor-Critic) -- "Actor-Critic" Policy Gradients -- Technical Improvements to the Actor-Critic Architecture -- Feature Vectors and Partially Observable Environments -- 4.3 Exploration with Predictive Simulations ("Model-Based Reinforcement Learning") -- 4.3.1 Dyna-Q -- 4.3.2 Monte Carlo Rollout -- 4.3.3 Artificial Curiosity -- 4.3.4 Monte Carlo Tree Search (MCTS) -- 4.3.5 Remarks on the Concept of Intelligence.
4.4 Systematics of the Learning Methods -- Bibliography -- 5: Artificial Neural Networks as Estimators for State Values and the Action Selection -- 5.1 Artificial Neural Networks -- 5.1.1 Pattern Recognition with the Perceptron -- 5.1.2 The Adaptability of Artificial Neural Networks -- 5.1.3 Backpropagation Learning -- 5.1.4 Regression with Multilayer Perceptrons -- 5.2 State Evaluation with Generalizing Approximations -- 5.3 Neural Estimators for Action Selection -- 5.3.1 Policy Gradient with Neural Networks -- 5.3.2 Proximal Policy Optimization -- 5.3.3 Evolutionary Strategy with a Neural Policy -- Bibliography -- 6: Guiding Ideas in Artificial Intelligence over Time -- 6.1 Changing Guiding Ideas -- 6.2 On the Relationship Between Humans and Artificial Intelligence -- Bibliography.
Record no. UNISA-996495167903316
Available at: Univ. di Salerno
OPAC: Check availability here
Reinforcement learning from scratch : understanding current approaches - with examples in Java and Greenfoot / Uwe Lorenz
Author Lorenz, Uwe
Publication Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (195 pages)
Discipline 005.133
Topical subject Java (Computer program language)
Reinforcement learning
Java (Llenguatge de programació)
Aprenentatge per reforç (Intel·ligència artificial)
Genre/form subject Llibres electrònics
ISBN 9783031090301
9783031090295
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Introduction -- Contents -- 1: Reinforcement Learning as a Subfield of Machine Learning -- 1.1 Machine Learning as Automated Processing of Feedback from the Environment -- 1.2 Machine Learning -- 1.3 Reinforcement Learning with Java -- Bibliography -- 2: Basic Concepts of Reinforcement Learning -- 2.1 Agents -- 2.2 The Policy of the Agent -- 2.3 Evaluation of States and Actions (Q-Function, Bellman Equation) -- Bibliography -- 3: Optimal Decision-Making in a Known Environment -- 3.1 Value Iteration -- 3.1.1 Target-Oriented Condition Assessment ("Backward Induction") -- 3.1.2 Policy-Based State Valuation (Reward Prediction) -- 3.2 Iterative Policy Search -- 3.2.1 Direct Policy Improvement -- 3.2.2 Mutual Improvement of Policy and Value Function -- 3.3 Optimal Policy in a Board Game Scenario -- 3.4 Summary -- Bibliography -- 4: Decision-Making and Learning in an Unknown Environment -- 4.1 Exploration vs. Exploitation -- 4.2 Retroactive Processing of Experience ("Model-Free Reinforcement Learning") -- 4.2.1 Goal-Oriented Learning ("Value-Based") -- Subsequent evaluation of complete episodes ("Monte Carlo" Method) -- Immediate Valuation Using the Temporal Difference (Q- and SARSA Algorithm) -- Consideration of the Action History (Eligibility Traces) -- 4.2.2 Policy Search -- Monte Carlo Tactics Search -- Evolutionary Strategies -- Monte Carlo Policy Gradient (REINFORCE) -- 4.2.3 Combined Methods (Actor-Critic) -- "Actor-Critic" Policy Gradients -- Technical Improvements to the Actor-Critic Architecture -- Feature Vectors and Partially Observable Environments -- 4.3 Exploration with Predictive Simulations ("Model-Based Reinforcement Learning") -- 4.3.1 Dyna-Q -- 4.3.2 Monte Carlo Rollout -- 4.3.3 Artificial Curiosity -- 4.3.4 Monte Carlo Tree Search (MCTS) -- 4.3.5 Remarks on the Concept of Intelligence.
4.4 Systematics of the Learning Methods -- Bibliography -- 5: Artificial Neural Networks as Estimators for State Values and the Action Selection -- 5.1 Artificial Neural Networks -- 5.1.1 Pattern Recognition with the Perceptron -- 5.1.2 The Adaptability of Artificial Neural Networks -- 5.1.3 Backpropagation Learning -- 5.1.4 Regression with Multilayer Perceptrons -- 5.2 State Evaluation with Generalizing Approximations -- 5.3 Neural Estimators for Action Selection -- 5.3.1 Policy Gradient with Neural Networks -- 5.3.2 Proximal Policy Optimization -- 5.3.3 Evolutionary Strategy with a Neural Policy -- Bibliography -- 6: Guiding Ideas in Artificial Intelligence over Time -- 6.1 Changing Guiding Ideas -- 6.2 On the Relationship Between Humans and Artificial Intelligence -- Bibliography.
Record no. UNINA-9910624394103321
Available at: Univ. Federico II
OPAC: Check availability here
Reinforcement learning with hybrid quantum approximation in the NISQ context / Leonhard Kunczik
Author Kunczik, Leonhard
Publication Wiesbaden, Germany : Springer Vieweg, [2022]
Physical description 1 online resource (145 pages)
Discipline 006.31
Topical subject Quantum computing
Aprenentatge per reforç (Intel·ligència artificial)
Genre/form subject Llibres electrònics
ISBN 9783658376161
9783658376154
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNISA-996479368903316
Available at: Univ. di Salerno
OPAC: Check availability here
Reinforcement learning with hybrid quantum approximation in the NISQ context / Leonhard Kunczik
Author Kunczik, Leonhard
Publication Wiesbaden, Germany : Springer Vieweg, [2022]
Physical description 1 online resource (145 pages)
Discipline 006.31
Topical subject Quantum computing
Aprenentatge per reforç (Intel·ligència artificial)
Genre/form subject Llibres electrònics
ISBN 9783658376161
9783658376154
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNINA-9910574090203321
Available at: Univ. Federico II
OPAC: Check availability here