LEADER 05398nam  2200661Ia 450 
001 9910876618603321
005 20200520144314.0
010   $a1-283-37409-9
010   $a9786613374097
010   $a0-470-98004-4
010   $a0-470-98003-6
035   $a(CKB)3400000000000317
035   $a(EBL)661847
035   $a(SSID)ssj0000477769
035   $a(PQKBManifestationID)11296731
035   $a(PQKBTitleCode)TC0000477769
035   $a(PQKBWorkID)10513471
035   $a(PQKB)11586395
035   $a(MiAaPQ)EBC661847
035   $a(OCoLC)705354489
035   $a(EXLCZ)993400000000000317
100   $a20101103d2011      uy         0
101 0 $aeng
135   $aur|n|---|||||
181   $ctxt
182   $cc
183   $acr
200 10$aMulti-armed bandit allocation indices /$fJohn Gittins, Kevin Glazebrook, Richard Weber
205   $a2nd ed.
210   $aChichester $cWiley$d2011
215   $a1 online resource (311 p.)
300   $aDescription based upon print version of record.
311   $a0-470-67002-9 
320   $aIncludes bibliographical references and index.
327   $aMulti-armed Bandit Allocation Indices; Contents; Foreword; Foreword to the first edition; Preface; Preface to the first edition; 1 Introduction or exploration; Exercises; 2 Main ideas: Gittins index; 2.1 Introduction; 2.2 Decision processes; 2.3 Simple families of alternative bandit processes; 2.4 Dynamic programming; 2.5 Gittins index theorem; 2.6 Gittins index; 2.6.1 Gittins index and the multi-armed bandit; 2.6.2 Coins problem; 2.6.3 Characterization of the optimal stopping time; 2.6.4 The restart-in-state formulation; 2.6.5 Dependence on discount factor
327   $a2.6.6 Myopic and forwards induction policies2.7 Proof of the index theorem by interchanging bandit portions; 2.8 Continuous-time bandit processes; 2.9 Proof of the index theorem by induction and interchange argument; 2.10 Calculation of Gittins indices; 2.11 Monotonicity conditions; 2.11.1 Monotone indices; 2.11.2 Monotone jobs; 2.12 History of the index theorem; 2.13 Some decision process theory; Exercises; 3 Necessary assumptions for indices; 3.1 Introduction; 3.2 Jobs; 3.3 Continuous-time jobs; 3.3.1 Definition; 3.3.2 Policies for continuous-time jobs
327   $a3.3.3 The continuous-time index theorem for a SFABP of jobs3.4 Necessary assumptions; 3.4.1 Necessity of an infinite time horizon; 3.4.2 Necessity of constant exponential discounting; 3.4.3 Necessity of a single processor; 3.5 Beyond the necessary assumptions; 3.5.1 Bandit-dependent discount factors; 3.5.2 Stochastic discounting; 3.5.3 Undiscounted rewards; 3.5.4 A discrete search problem; 3.5.5 Multiple processors; Exercises; 4 Superprocesses, precedence constraints and arrivals; 4.1 Introduction; 4.2 Bandit superprocesses; 4.3 The index theorem for superprocesses
327   $a4.4 Stoppable bandit processes4.5 Proof of the index theorem by freezing and promotion rules; 4.5.1 Freezing rules; 4.5.2 Promotion rules; 4.6 The index theorem for jobs with precedence constraints; 4.7 Precedence constraints forming an out-forest; 4.8 Bandit processes with arrivals; 4.9 Tax problems; 4.9.1 Ongoing bandits and tax problems; 4.9.2 Klimov's model; 4.9.3 Minimum EWFT for the M/G/1 queue; 4.10 Near optimality of nearly index policies; Exercises; 5 The achievable region methodology; 5.1 Introduction; 5.2 A simple example; 5.3 Proof of the index theorem by greedy algorithm
327   $a5.4 Generalized conservation laws and indexable systems5.5 Performance bounds for policies for branching bandits; 5.6 Job selection and scheduling problems; 5.7 Multi-armed bandits on parallel machines; Exercises; 6 Restless bandits and Lagrangian relaxation; 6.1 Introduction; 6.2 Restless bandits; 6.3 Whittle indices for restless bandits; 6.4 Asymptotic optimality; 6.5 Monotone policies and simple proofs of indexability; 6.6 Applications to multi-class queueing systems; 6.7 Performance bounds for the Whittle index policy; 6.8 Indices for more general resource configurations; Exercises
327   $a7 Multi-population random sampling (theory)
330   $aIn 1989 the first edition of this book set out Gittins' pioneering index solution to the multi-armed bandit problem and his subsequent investigation of a wide of sequential resource allocation and stochastic scheduling problems. Since then there has been a remarkable flowering of new insights, generalizations and applications, to which Glazebrook and Weber have made major contributions. This second edition brings the story up to date. There are new chapters on the achievable region approach to stochastic optimization problems, the construction of performance bounds for suboptimal policies, W
606   $aResource allocation$xMathematical models
606   $aMathematical optimization
606   $aProgramming (Mathematics)
615  0$aResource allocation$xMathematical models.
615  0$aMathematical optimization.
615  0$aProgramming (Mathematics)
676   $a519.5
700   $aGittins$b John C.$f1938-$01754207
701   $aGlazebrook$b Kevin D.$f1950-$01443322
701   $aWeber$b Richard$f1953-$0942460
801  0$bMiAaPQ
801  1$bMiAaPQ
801  2$bMiAaPQ
906   $aBOOK
912   $a9910876618603321
996   $aMulti-armed bandit allocation indices$94190436
997   $aUNINA