2830-2021 : IEEE Standard for Technical Framework and Requirements of Trusted Execution Environment based Shared Machine Learning / / Institute of Electrical and Electronics Engineers
Publication/distribution/printing New York, NY, USA : IEEE, 2021
Physical description 1 online resource (23 pages)
Discipline 006.31
Topical subject Machine learning
Deep learning (Machine learning)
Reinforcement learning
Computational learning theory
ISBN 1-5044-7724-3
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record No. UNINA-9910503501803321
Find it here: Univ. Federico II
Opac: Check availability here
2830-2021 : IEEE Standard for Technical Framework and Requirements of Trusted Execution Environment based Shared Machine Learning / / Institute of Electrical and Electronics Engineers
Publication/distribution/printing New York, NY, USA : IEEE, 2021
Physical description 1 online resource (23 pages)
Discipline 006.31
Topical subject Machine learning
Deep learning (Machine learning)
Reinforcement learning
Computational learning theory
ISBN 1-5044-7724-3
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record No. UNISA-996574913703316
Find it here: Univ. di Salerno
Opac: Check availability here
3rd ACM International Conference on AI in Finance / / Daniele Magazzeni
Author Magazzeni Daniele
Publication/distribution/printing New York : Association for Computing Machinery, 2022
Physical description 1 online resource (527 pages)
Discipline 006.31
Topical subject Reinforcement learning
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record No. UNINA-9910623981203321
Find it here: Univ. Federico II
Opac: Check availability here
The Art of Reinforcement Learning : Fundamentals, Mathematics, and Implementations with Python / / by Michael Hu
Author Hu Michael
Edition [1st ed. 2023.]
Publication/distribution/printing Berkeley, CA : Apress : Imprint: Apress, 2023
Physical description 1 online resource (290 pages)
Discipline 006.31
Topical subject Reinforcement learning
Feedback control systems
Python (Computer program language)
ISBN 1-4842-9606-0
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Part I: Foundation -- Chapter 1: Introduction to Reinforcement Learning -- Chapter 2: Markov Decision Processes -- Chapter 3: Dynamic Programming -- Chapter 4: Monte Carlo Methods -- Chapter 5: Temporal Difference Learning -- Part II: Value Function Approximation -- Chapter 6: Linear Value Function Approximation -- Chapter 7: Nonlinear Value Function Approximation -- Chapter 8: Improvement to DQN -- Part III: Policy Approximation -- Chapter 9: Policy Gradient Methods -- Chapter 10: Problems with Continuous Action Space -- Chapter 11: Advanced Policy Gradient Methods -- Part IV: Advanced Topics -- Chapter 12: Distributed Reinforcement Learning -- Chapter 13: Curiosity-Driven Exploration -- Chapter 14: Planning with a Model – AlphaZero.
Record No. UNINA-9910770270703321
Find it here: Univ. Federico II
Opac: Check availability here
Decentralised reinforcement learning in Markov games / / Peter Vrancx ; supervisors, Ann Nowe, Katja Verbeeck
Author Vrancx Peter
Publication/distribution/printing Brussel, Belgium : VUBPress, 2010
Physical description 1 online resource (217 p.)
Discipline 006.31
Other authors (Persons) Nowe Ann
Verbeeck Katja
Topical subject Reinforcement learning
Markov processes
Game theory
Genre/form subject Electronic books.
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Front; Contents; Chapter 1; Chapter 2; Chapter 3; Chapter 4; Chapter 5; Chapter 6; Chapter 7; Chapter 8; Chapter 9; Appendix A; Author Index
Record No. UNINA-9910464642003321
Find it here: Univ. Federico II
Opac: Check availability here
Decentralised reinforcement learning in Markov games / / Peter Vrancx ; supervisors, Ann Nowe, Katja Verbeeck
Author Vrancx Peter
Publication/distribution/printing Brussel, Belgium : VUBPress, 2010
Physical description 1 online resource (217 p.)
Discipline 006.31
Other authors (Persons) Nowe Ann
Verbeeck Katja
Topical subject Reinforcement learning
Markov processes
Game theory
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Front; Contents; Chapter 1; Chapter 2; Chapter 3; Chapter 4; Chapter 5; Chapter 6; Chapter 7; Chapter 8; Chapter 9; Appendix A; Author Index
Record No. UNINA-9910788940903321
Find it here: Univ. Federico II
Opac: Check availability here
Decentralised reinforcement learning in Markov games / / Peter Vrancx ; supervisors, Ann Nowe, Katja Verbeeck
Author Vrancx Peter
Publication/distribution/printing Brussel, Belgium : VUBPress, 2010
Physical description 1 online resource (217 p.)
Discipline 006.31
Other authors (Persons) Nowe Ann
Verbeeck Katja
Topical subject Reinforcement learning
Markov processes
Game theory
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Front; Contents; Chapter 1; Chapter 2; Chapter 3; Chapter 4; Chapter 5; Chapter 6; Chapter 7; Chapter 8; Chapter 9; Appendix A; Author Index
Record No. UNINA-9910827910303321
Find it here: Univ. Federico II
Opac: Check availability here
Decision making under uncertainty and reinforcement learning : theory and algorithms / / Christos Dimitrakakis, Ronald Ortner
Author Dimitrakakis Christos
Publication/distribution/printing Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (251 pages)
Discipline 658.403
Series Intelligent systems reference library
Topical subject Decision making - Mathematical models
Reinforcement learning
Uncertainty
ISBN 3-031-07614-1
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Acknowledgements -- Reference -- Contents -- 1 Introduction -- 1.1 Uncertainty and Probability -- 1.2 The Exploration-Exploitation Trade-Off -- 1.3 Decision Theory and Reinforcement Learning -- References -- 2 Subjective Probability and Utility -- 2.1 Subjective Probability -- 2.1.1 Relative Likelihood -- 2.1.2 Subjective Probability Assumptions -- 2.1.3 Assigning Unique Probabilities* -- 2.1.4 Conditional Likelihoods -- 2.1.5 Probability Elicitation -- 2.2 Updating Beliefs: Bayes' Theorem -- 2.3 Utility Theory -- 2.3.1 Rewards and Preferences -- 2.3.2 Preferences Among Distributions -- 2.3.3 Utility -- 2.3.4 Measuring Utility* -- 2.3.5 Convex and Concave Utility Functions -- 2.4 Exercises -- Reference -- 3 Decision Problems -- 3.1 Introduction -- 3.2 Rewards that Depend on the Outcome of an Experiment -- 3.2.1 Formalisation of the Problem Setting -- 3.2.2 Decision Diagrams -- 3.2.3 Statistical Estimation* -- 3.3 Bayes Decisions -- 3.3.1 Convexity of the Bayes-Optimal Utility* -- 3.4 Statistical and Strategic Decision Making -- 3.4.1 Alternative Notions of Optimality -- 3.4.2 Solving Minimax Problems* -- 3.4.3 Two-Player Games -- 3.5 Decision Problems with Observations -- 3.5.1 Maximizing Utility When Making Observations -- 3.5.2 Bayes Decision Rules -- 3.5.3 Decision Problems in Classification -- 3.5.4 Calculating Posteriors -- 3.6 Summary -- 3.7 Exercises -- 3.7.1 Problems with No Observations -- 3.7.2 Problems with Observations -- 3.7.3 An Insurance Problem -- 3.7.4 Medical Diagnosis -- References -- 4 Estimation -- 4.1 Introduction -- 4.2 Sufficient Statistics -- 4.2.1 Sufficient Statistics -- 4.2.2 Exponential Families -- 4.3 Conjugate Priors -- 4.3.1 Bernoulli-Beta Conjugate Pair -- 4.3.2 Conjugates for the Normal Distribution -- 4.3.3 Conjugates for Multivariate Distributions -- 4.4 Credible Intervals.
4.5 Concentration Inequalities -- 4.5.1 Chernoff-Hoeffding Bounds -- 4.6 Approximate Bayesian Approaches -- 4.6.1 Monte Carlo Inference -- 4.6.2 Approximate Bayesian Computation -- 4.6.3 Analytic Approximations of the Posterior -- 4.6.4 Maximum Likelihood and Empirical Bayes Methods -- References -- 5 Sequential Sampling -- 5.1 Gains From Sequential Sampling -- 5.1.1 An Example: Sampling with Costs -- 5.2 Optimal Sequential Sampling Procedures -- 5.2.1 Multi-stage Problems -- 5.2.2 Backwards Induction for Bounded Procedures -- 5.2.3 Unbounded Sequential Decision Procedures -- 5.2.4 The Sequential Probability Ratio Test -- 5.2.5 Wald's Theorem -- 5.3 Martingales -- 5.4 Markov Processes -- 5.5 Exercises -- 6 Experiment Design and Markov Decision Processes -- 6.1 Introduction -- 6.2 Bandit Problems -- 6.2.1 An Example: Bernoulli Bandits -- 6.2.2 Decision-Theoretic Bandit Process -- 6.3 Markov Decision Processes and Reinforcement Learning -- 6.3.1 Value Functions -- 6.4 Finite Horizon, Undiscounted Problems -- 6.4.1 Direct Policy Evaluation -- 6.4.2 Backwards Induction Policy Evaluation -- 6.4.3 Backwards Induction Policy Optimization -- 6.5 Infinite-Horizon -- 6.5.1 Examples -- 6.5.2 Markov Chain Theory for Discounted Problems -- 6.5.3 Optimality Equations -- 6.5.4 MDP Algorithms for Infinite Horizon and Discounted Rewards -- 6.6 Optimality Criteria -- 6.7 Summary -- 6.8 Further Reading -- 6.9 Exercises -- 6.9.1 MDP Theory -- 6.9.2 Automatic Algorithm Selection -- 6.9.3 Scheduling -- 6.9.4 General Questions -- References -- 7 Simulation-Based Algorithms -- 7.1 Introduction -- 7.1.1 The Robbins-Monro Approximation -- 7.1.2 The Theory of the Approximation -- 7.2 Dynamic Problems -- 7.2.1 Monte Carlo Policy Evaluation and Iteration -- 7.2.2 Monte Carlo Updates -- 7.2.3 Temporal Difference Methods -- 7.2.4 Stochastic Value Iteration Methods.
7.3 Discussion -- 7.4 Exercises -- References -- 8 Approximate Representations -- 8.1 Introduction -- 8.1.1 Fitting a Value Function -- 8.1.2 Fitting a Policy -- 8.1.3 Features -- 8.1.4 Estimation Building Blocks -- 8.1.5 The Value Estimation Step -- 8.1.6 Policy Estimation -- 8.2 Approximate Policy Iteration (API) -- 8.2.1 Error Bounds for Approximate Value Functions -- 8.2.2 Rollout-Based Policy Iteration Methods -- 8.2.3 Least Squares Methods -- 8.3 Approximate Value Iteration -- 8.3.1 Approximate Backwards Induction -- 8.3.2 State Aggregation -- 8.3.3 Representative State Approximation -- 8.3.4 Bellman Error Methods -- 8.4 Policy Gradient -- 8.4.1 Stochastic Policy Gradient -- 8.4.2 Practical Considerations -- 8.5 Examples -- 8.6 Further Reading -- 8.7 Exercises -- References -- 9 Bayesian Reinforcement Learning -- 9.1 Introduction -- 9.1.1 Acting in Unknown MDPs -- 9.1.2 Updating the Belief -- 9.2 Finding Bayes-Optimal Policies -- 9.2.1 The Expected MDP Heuristic -- 9.2.2 The Maximum MDP Heuristic -- 9.2.3 Bayesian Policy Gradient -- 9.2.4 The Belief-Augmented MDP -- 9.2.5 Branch and Bound -- 9.2.6 Bounds on the Expected Utility -- 9.2.7 Estimating Lower Bounds on the Value Function with Backwards Induction -- 9.2.8 Further Reading -- 9.3 Bayesian Methods in Continuous Spaces -- 9.3.1 Linear-Gaussian Transition Models -- 9.3.2 Approximate Dynamic Programming -- 9.4 Partially Observable Markov Decision Processes -- 9.4.1 Solving Known POMDPs -- 9.4.2 Solving Unknown POMDPs -- 9.5 Relations Between Different Settings -- 9.6 Exercises -- References -- 10 Distribution-Free Reinforcement Learning -- 10.1 Introduction -- 10.2 Finite Stochastic Bandit Problems -- 10.2.1 The UCB1 Algorithm -- 10.2.2 Non i.i.d. Rewards -- 10.3 Reinforcement Learning in MDPs -- 10.3.1 An Upper-Confidence Bound Algorithm -- 10.3.2 Bibliographical Remarks -- References.
11 Conclusion -- Appendix Symbols -- Appendix Index -- Index.
Record No. UNINA-9910633910303321
Find it here: Univ. Federico II
Opac: Check availability here
Deep reinforcement learning for wireless communications and networking : theory, applications and implementation / / Dinh Thai Hoang [and four others]
Author Hoang Dinh Thai <1986->
Edition [First edition.]
Publication/distribution/printing Hoboken, New Jersey : John Wiley & Sons, Inc., [2023]
Physical description 1 online resource (291 pages)
Discipline 006.31
Topical subject Reinforcement learning
Wireless communication systems
Uncontrolled subject Artificial Intelligence
Computer Networks
Computers
ISBN 1-119-87374-6
1-119-87368-1
1-119-87373-8
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Cover -- Title Page -- Copyright -- Contents -- Notes on Contributors -- Foreword -- Preface -- Acknowledgments -- Acronyms -- Introduction -- Part I Fundamentals of Deep Reinforcement Learning -- Chapter 1 Deep Reinforcement Learning and Its Applications -- 1.1 Wireless Networks and Emerging Challenges -- 1.2 Machine Learning Techniques and Development of DRL -- 1.2.1 Machine Learning -- 1.2.2 Artificial Neural Network -- 1.2.3 Convolutional Neural Network -- 1.2.4 Recurrent Neural Network -- 1.2.5 Development of Deep Reinforcement Learning -- 1.3 Potentials and Applications of DRL -- 1.3.1 Benefits of DRL in Human Lives -- 1.3.2 Features and Advantages of DRL Techniques -- 1.3.3 Academic Research Activities -- 1.3.4 Applications of DRL Techniques -- 1.3.5 Applications of DRL Techniques in Wireless Networks -- 1.4 Structure of this Book and Target Readership -- 1.4.1 Motivations and Structure of this Book -- 1.4.2 Target Readership -- 1.5 Chapter Summary -- References -- Chapter 2 Markov Decision Process and Reinforcement Learning -- 2.1 Markov Decision Process -- 2.2 Partially Observable Markov Decision Process -- 2.3 Policy and Value Functions -- 2.4 Bellman Equations -- 2.5 Solutions of MDP Problems -- 2.5.1 Dynamic Programming -- 2.5.1.1 Policy Evaluation -- 2.5.1.2 Policy Improvement -- 2.5.1.3 Policy Iteration -- 2.5.2 Monte Carlo Sampling -- 2.6 Reinforcement Learning -- 2.7 Chapter Summary -- References -- Chapter 3 Deep Reinforcement Learning Models and Techniques -- 3.1 Value‐Based DRL Methods -- 3.1.1 Deep Q‐Network -- 3.1.2 Double DQN -- 3.1.3 Prioritized Experience Replay -- 3.1.4 Dueling Network -- 3.2 Policy‐Gradient Methods -- 3.2.1 REINFORCE Algorithm -- 3.2.1.1 Policy Gradient Estimation -- 3.2.1.2 Reducing the Variance -- 3.2.1.3 Policy Gradient Theorem -- 3.2.2 Actor‐Critic Methods -- 3.2.3 Advantage of Actor‐Critic Methods.
3.2.3.1 Advantage of Actor‐Critic (A2C) -- 3.2.3.2 Asynchronous Advantage Actor‐Critic (A3C) -- 3.2.3.3 Generalized Advantage Estimate (GAE) -- 3.3 Deterministic Policy Gradient (DPG) -- 3.3.1 Deterministic Policy Gradient Theorem -- 3.3.2 Deep Deterministic Policy Gradient (DDPG) -- 3.3.3 Distributed Distributional DDPG (D4PG) -- 3.4 Natural Gradients -- 3.4.1 Principle of Natural Gradients -- 3.4.2 Trust Region Policy Optimization (TRPO) -- 3.4.2.1 Trust Region -- 3.4.2.2 Sample‐Based Formulation -- 3.4.2.3 Practical Implementation -- 3.4.3 Proximal Policy Optimization (PPO) -- 3.5 Model‐Based RL -- 3.5.1 Vanilla Model‐Based RL -- 3.5.2 Robust Model‐Based RL: Model‐Ensemble TRPO (ME‐TRPO) -- 3.5.3 Adaptive Model‐Based RL: Model‐Based Meta‐Policy Optimization (MB‐MPO) -- 3.6 Chapter Summary -- References -- Chapter 4 A Case Study and Detailed Implementation -- 4.1 System Model and Problem Formulation -- 4.1.1 System Model and Assumptions -- 4.1.1.1 Jamming Model -- 4.1.1.2 System Operation -- 4.1.2 Problem Formulation -- 4.1.2.1 State Space -- 4.1.2.2 Action Space -- 4.1.2.3 Immediate Reward -- 4.1.2.4 Optimization Formulation -- 4.2 Implementation and Environment Settings -- 4.2.1 Install TensorFlow with Anaconda -- 4.2.2 Q‐Learning -- 4.2.2.1 Codes for the Environment -- 4.2.2.2 Codes for the Agent -- 4.2.3 Deep Q‐Learning -- 4.3 Simulation Results and Performance Analysis -- 4.4 Chapter Summary -- References -- Part II Applications of DRL in Wireless Communications and Networking -- Chapter 5 DRL at the Physical Layer -- 5.1 Beamforming, Signal Detection, and Decoding -- 5.1.1 Beamforming -- 5.1.1.1 Beamforming Optimization Problem -- 5.1.1.2 DRL‐Based Beamforming -- 5.1.2 Signal Detection and Channel Estimation -- 5.1.2.1 Signal Detection and Channel Estimation Problem -- 5.1.2.2 RL‐Based Approaches -- 5.1.3 Channel Decoding.
5.2 Power and Rate Control -- 5.2.1 Power and Rate Control Problem -- 5.2.2 DRL‐Based Power and Rate Control -- 5.3 Physical‐Layer Security -- 5.4 Chapter Summary -- References -- Chapter 6 DRL at the MAC Layer -- 6.1 Resource Management and Optimization -- 6.2 Channel Access Control -- 6.2.1 DRL in the IEEE 802.11 MAC -- 6.2.2 MAC for Massive Access in IoT -- 6.2.3 MAC for 5G and B5G Cellular Systems -- 6.3 Heterogeneous MAC Protocols -- 6.4 Chapter Summary -- References -- Chapter 7 DRL at the Network Layer -- 7.1 Traffic Routing -- 7.2 Network Slicing -- 7.2.1 Network Slicing‐Based Architecture -- 7.2.2 Applications of DRL in Network Slicing -- 7.3 Network Intrusion Detection -- 7.3.1 Host‐Based IDS -- 7.3.2 Network‐Based IDS -- 7.4 Chapter Summary -- References -- Chapter 8 DRL at the Application and Service Layer -- 8.1 Content Caching -- 8.1.1 QoS‐Aware Caching -- 8.1.2 Joint Caching and Transmission Control -- 8.1.3 Joint Caching, Networking, and Computation -- 8.2 Data and Computation Offloading -- 8.3 Data Processing and Analytics -- 8.3.1 Data Organization -- 8.3.1.1 Data Partitioning -- 8.3.1.2 Data Compression -- 8.3.2 Data Scheduling -- 8.3.3 Tuning of Data Processing Systems -- 8.3.4 Data Indexing -- 8.3.4.1 Database Index Selection -- 8.3.4.2 Index Structure Construction -- 8.3.5 Query Optimization -- 8.4 Chapter Summary -- References -- Part III Challenges, Approaches, Open Issues, and Emerging Research Topics -- Chapter 9 DRL Challenges in Wireless Networks -- 9.1 Adversarial Attacks on DRL -- 9.1.1 Attacks Perturbing the State space -- 9.1.1.1 Manipulation of Observations -- 9.1.1.2 Manipulation of Training Data -- 9.1.2 Attacks Perturbing the Reward Function -- 9.1.3 Attacks Perturbing the Action Space -- 9.2 Multiagent DRL in Dynamic Environments -- 9.2.1 Motivations -- 9.2.2 Multiagent Reinforcement Learning Models.
9.2.2.1 Markov/Stochastic Games -- 9.2.2.2 Decentralized Partially Observable Markov Decision Process (DPOMDP) -- 9.2.3 Applications of Multiagent DRL in Wireless Networks -- 9.2.4 Challenges of Using Multiagent DRL in Wireless Networks -- 9.2.4.1 Nonstationarity Issue -- 9.2.4.2 Partial Observability Issue -- 9.3 Other Challenges -- 9.3.1 Inherent Problems of Using RL in Real‐Word Systems -- 9.3.1.1 Limited Learning Samples -- 9.3.1.2 System Delays -- 9.3.1.3 High‐Dimensional State and Action Spaces -- 9.3.1.4 System and Environment Constraints -- 9.3.1.5 Partial Observability and Nonstationarity -- 9.3.1.6 Multiobjective Reward Functions -- 9.3.2 Inherent Problems of DL and Beyond -- 9.3.2.1 Inherent Problems of DL -- 9.3.2.2 Challenges of DRL Beyond Deep Learning -- 9.3.3 Implementation of DL Models in Wireless Devices -- 9.4 Chapter Summary -- References -- Chapter 10 DRL and Emerging Topics in Wireless Networks -- 10.1 DRL for Emerging Problems in Future Wireless Networks -- 10.1.1 Joint Radar and Data Communications -- 10.1.2 Ambient Backscatter Communications -- 10.1.3 Reconfigurable Intelligent Surface‐Aided Communications -- 10.1.4 Rate Splitting Communications -- 10.2 Advanced DRL Models -- 10.2.1 Deep Reinforcement Transfer Learning -- 10.2.1.1 Reward Shaping -- 10.2.1.2 Intertask Mapping -- 10.2.1.3 Learning from Demonstrations -- 10.2.1.4 Policy Transfer -- 10.2.1.5 Reusing Representations -- 10.2.2 Generative Adversarial Network (GAN) for DRL -- 10.2.3 Meta Reinforcement Learning -- 10.3 Chapter Summary -- References -- Index -- EULA.
Record No. UNINA-9910830760503321
Find it here: Univ. Federico II
Opac: Check availability here
Deep reinforcement learning hands-on : apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more / / Maxim Lapan
Author Lapan Maxim
Edition [1st edition]
Publication/distribution/printing Birmingham, England : Packt Publishing, 2018
Physical description 1 online resource (1 volume) : illustrations
Discipline 006.31
Topical subject Reinforcement learning
Genre/form subject Electronic books.
ISBN 1-78883-930-7
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record No. UNINA-9910467009503321
Find it here: Univ. Federico II
Opac: Check availability here