Vai al contenuto principale della pagina

The elements of joint learning and optimization in operations management / / edited by Xi Chen, Stefanus Jasin, and Cong Shi



(Visualizza in formato marc)    (Visualizza in BIBFRAME)

Titolo: The elements of joint learning and optimization in operations management / / edited by Xi Chen, Stefanus Jasin, and Cong Shi Visualizza cluster
Pubblicazione: Cham, Switzerland : , : Springer, , [2022]
©2022
Descrizione fisica: 1 online resource (444 pages)
Disciplina: 658.7
Soggetto topico: Production management
Business logistics
Persona (resp. second.): JasinStefanus
ShiCong
ChenXi
Nota di contenuto: Intro -- Preface -- Contents -- Editors and Contributors -- About the Editors -- Contributors -- Part I Generic Tools -- 1 The Stochastic Multi-Armed Bandit Problem -- 1.1 Introduction -- 1.2 The N-Armed Bandit Problem -- 1.2.1 Upper Confidence Bound (UCB) Algorithm -- 1.2.2 Thompson Sampling (TS) -- 1.3 Contextual Bandits -- 1.4 Combinatorial Bandits -- References -- 2 Reinforcement Learning -- 2.1 Introduction -- 2.2 Markov Decision Process and Dynamic Programming -- 2.2.1 Finite-Horizon Markov Decision Process -- 2.2.1.1 Dynamic Programming Solution -- 2.2.2 Discounted Markov Decision Process -- 2.2.2.1 Value Iteration -- 2.2.2.2 Policy Iteration -- 2.3 Reinforcement Learning Algorithm Design -- 2.3.1 Reinforcement Learning Problem Formulation -- 2.3.1.1 Episodic Reinforcement Learning in Finite-Horizon MDP -- 2.3.1.2 Reinforcement Learning in Discounted MDP -- 2.3.2 Model-Based vs. Model-Free Reinforcement Learning -- 2.3.2.1 Model-Based Reinforcement Learning -- 2.3.2.2 Q-Learning and SARSA -- 2.3.2.3 Policy Gradient -- 2.3.3 Exploration in Reinforcement Learning -- 2.3.3.1 Exploration Schemes -- 2.3.3.2 Deep Exploration -- 2.3.4 Approximate Solution Methods and Deep Reinforcement Learning -- 2.4 Conclusion and Further Reading -- References -- 3 Optimal Learning and Optimal Design -- 3.1 Introduction -- 3.2 Statistical Design of Experiments -- 3.3 The Ranking and Selection Problem -- 3.3.1 Model -- 3.3.2 Large Deviations Analysis -- 3.3.3 Example: Normal Sampling Distributions -- 3.3.4 Optimal Allocations -- 3.4 Sequential Algorithms -- 3.4.1 Value of Information Methods -- 3.4.2 Thompson Sampling -- 3.4.3 Rate-Balancing Methods -- 3.4.4 Discussion -- 3.5 Recent Advances -- 3.5.1 A New Optimal Design for Linear Regression -- 3.5.2 Optimal Budget Allocation in Approximate Dynamic Programming -- 3.6 Conclusion -- References.
Part II Price Optimization -- 4 Dynamic Pricing with Demand Learning: Emerging Topics and State of the Art -- 4.1 Introduction -- 4.2 Model -- 4.3 Asymptotically Optimal Pricing Policies -- 4.3.1 Parametric Approaches -- 4.3.1.1 Model and Estimation -- 4.3.1.2 Certainty-Equivalence Pricing and Incomplete Learning -- 4.3.1.3 Asymptotically Optimal Policies -- 4.3.1.4 Extensions to Generalized Linear Models -- 4.3.1.5 Extensions to Multiple Products -- 4.3.2 Nonparametric Approaches -- 4.3.3 Extensions and Generalizations -- 4.4 Emerging Topics and Generalizations -- 4.4.1 Product Differentiation -- 4.4.2 Online Marketplaces -- 4.4.3 Continuous-Time Approximations -- References -- 5 Learning and Pricing with Inventory Constraints -- 5.1 Introduction -- 5.2 Single Product Case -- 5.2.1 Dynamic Pricing Algorithm -- 5.2.2 Lower Bound Example -- 5.3 Multiproduct Setting -- 5.3.1 Preliminaries -- 5.3.2 Parametric Case -- 5.3.3 Nonparametric Case -- 5.4 Bayesian Learning Setting -- 5.4.1 Model Setting -- 5.4.2 Thompson Sampling with Fixed Inventory Constraints -- 5.4.3 Thompson Sampling with Inventory Constraint Updating -- 5.4.4 Performance Analysis -- 5.5 Remarks and Further Reading -- References -- 6 Dynamic Pricing and Demand Learning in Nonstationary Environments -- 6.1 Introduction -- 6.2 Problem Formulation -- 6.3 Exogenously Changing Demand Environments -- 6.3.1 Change-Point Detection Models -- 6.3.2 Finite-State-Space Markov Chains -- 6.3.3 Autoregressive Models -- 6.3.4 General Changing Environments -- 6.3.5 Contextual Pricing -- 6.4 Endogenously Changing Demand Environments -- 6.4.1 Reference-Price Effects -- 6.4.2 Competition and Collusion -- 6.4.3 Platforms and Multi-Agent Learning -- 6.4.4 Forward-Looking and Patient Customers -- References -- 7 Pricing with High-Dimensional Data -- 7.1 Introduction.
7.2 Background: High-Dimensional Statistics -- 7.3 Static Pricing with High-Dimensional Data -- 7.3.1 Feature-Dependent Choice Model -- 7.3.2 Estimation Method -- 7.3.3 Performance Guarantees -- 7.4 Dynamic Pricing with High-Dimensional Data -- 7.4.1 Feature-Dependent Demand Model -- 7.4.2 Learning-and-Earning Algorithm -- 7.4.3 A Universal Lower Bound on the Regret -- 7.4.4 Performance of ILQX -- 7.4.5 Discussion -- 7.5 Directions for Future Research -- References -- Part III Assortment Optimization -- 8 Nonparametric Estimation of Choice Models -- 8.1 Introduction -- 8.2 General Setup -- 8.3 Estimating the Rank-Based Model -- 8.3.1 Estimation via the Conditional Gradient Algorithm -- 8.3.1.1 Solving the Support Finding Step -- 8.3.1.2 Solving the Proportions Update Step -- 8.3.1.3 Initialization and Stopping Criterion -- 8.3.2 Convergence Guarantee for the Estimation Algorithm -- 8.4 Estimating the Nonparametric Mixture of Closed Logit (NPMXCL) Model -- 8.4.1 Estimation via the Conditional Gradient Algorithm -- 8.4.1.1 Solving the Support Finding Step -- 8.4.1.2 Solving the Proportions Update Step -- 8.4.1.3 Initialization and Stopping Criterion -- 8.4.2 Convergence Guarantee for the Estimation Algorithm -- 8.4.3 Characterizing the Choice Behavior of Closed Logit Types -- 8.5 Other Nonparametric Choice Models -- 8.6 Concluding Thoughts -- References -- 9 The MNL-Bandit Problem -- 9.1 Introduction -- 9.2 Choice Modeling and Assortment Optimization -- 9.3 Dynamic Learning in Assortment Selection -- 9.4 A UCB Approach for the MNL-Bandit -- 9.4.1 Algorithmic Details -- 9.4.2 Min-Max Regret Bounds -- 9.4.3 Improved Regret Bounds for ``Well Separated'' Instances -- 9.4.4 Computational Study -- 9.4.4.1 Robustness of Algorithm 1 -- 9.4.4.2 Comparison with Existing Approaches -- 9.5 Thompson Sampling for the MNL-Bandit -- 9.5.1 Algorithm.
9.5.2 A TS Algorithm with Independent Beta Priors -- 9.5.3 A TS Algorithm with Posterior Approximation and Correlated Sampling -- 9.5.4 Regret Analysis -- 9.5.5 Empirical Study -- 9.6 Lower Bound for the MNL-Bandit -- 9.7 Conclusions and Recent Progress -- References -- 10 Dynamic Assortment Optimization: Beyond MNL Model -- 10.1 Overview -- 10.2 General Utility Distributions -- 10.2.1 Model Formulation and Assumptions -- 10.2.2 Algorithm Design -- 10.2.3 Theoretical Analysis -- 10.2.4 Bibliographic Notes and Discussion of Future Directions -- 10.3 Nested Logit Models -- 10.3.1 Model Formulation and Assumptions -- 10.3.2 Assortment Space Reductions -- 10.3.3 Algorithm Design and Regret Analysis -- 10.3.4 Regret Lower Bound -- 10.3.5 Bibliographic Notes and Discussion of Future Directions -- 10.4 MNL Model with Contextual Features -- 10.4.1 Model Formulation and Assumptions -- 10.4.2 Algorithm Design: Thompson Sampling -- 10.4.3 Algorithm Design: Upper Confidence Bounds -- 10.4.4 Lower Bounds -- 10.4.5 Bibliographic Notes and Discussion of Future Directions -- 10.5 Conclusion -- References -- Part IV Inventory Optimization -- 11 Inventory Control with Censored Demand -- 11.1 Introduction -- 11.2 Regret Lower Bound for Inventory Models with Censored Demand -- 11.2.1 Model Formulation -- 11.2.2 Strictly Convex and Well-Separated Cases -- 11.2.3 Worst-Case Regret Under General Demand Distributions -- 11.3 Censored Demand Example: Perishable Inventory System -- 11.3.1 Model Formulation -- 11.3.2 Challenges and Preliminary Results -- 11.3.3 Learning Algorithm Design: Cycle-Update Policy -- 11.3.4 Regret Analysis of CUP Algorithm -- 11.3.5 Strongly Convex Extension -- 11.4 Lead Times Example: Lost-Sales System with Lead Times -- 11.4.1 Model Formulation -- 11.4.2 Base-Stock Policy and Convexity Results -- 11.4.3 Challenges from Lead Times.
11.4.4 Gradient Methods -- 11.4.5 A Ternary Search Method -- 11.5 High Dimensionality Example: Multiproduct Inventory Model with Customer Choices -- 11.5.1 Inventory Substitution -- 11.5.2 Numerical Example -- References -- 12 Joint Pricing and Inventory Control with Demand Learning -- 12.1 Problem Formulation in General -- 12.2 Nonparametric Learning for Backlogged Demand -- 12.3 Nonparametric Learning for Lost-Sales System -- 12.3.1 Algorithms and Results in chen2017nonparametric -- 12.3.2 Algorithms and Results in chen2020optimal -- 12.3.2.1 Concave G(·) -- 12.3.2.2 Non-Concave G(·) -- 12.4 Parametric Learning with Limited Price Changes -- 12.4.1 Well-Separated Demand -- 12.4.2 General Demand -- 12.5 Backlog System with Fixed Ordering Cost -- 12.6 Other Models -- References -- 13 Optimization in the Small-Data, Large-Scale Regime -- 13.1 Why Small Data? -- 13.1.1 Structure -- 13.2 Contrasting the Large-Sample and Small-Data, Large-Scale Regimes -- 13.2.1 Model -- 13.2.2 Failure of Sample Average Approximation (SAA) -- 13.2.3 Best-in-Class Performance -- 13.2.4 Shortcomings of Cross-Validation -- 13.3 Debiasing In-Sample Performance -- 13.3.1 Stein Correction -- 13.3.2 From Unbiasedness to Policy Selection -- 13.3.3 Stein Correction in the Large-Sample Regime -- 13.3.4 Open Questions -- 13.4 Conclusion -- References -- Part V Healthcare Operations -- 14 Bandit Procedures for Designing Patient-Centric Clinical Trials -- 14.1 Introduction -- 14.2 The Bayesian Beta-Bernoulli MABP -- 14.2.1 Discussion of the Model -- 14.3 Metrics for Two-Armed Problem (Confirmatory Trials) -- 14.3.1 Accurate and Precise Estimation -- 14.3.2 Statistical Errors -- 14.3.3 Patient Benefit -- 14.3.4 Trial Size -- 14.3.5 Multiple Metrics -- 14.4 Illustrative Results for Two-Armed Problem -- 14.5 Discussion -- 14.5.1 Safety Concerns -- 14.5.2 Prior Distributions.
14.5.3 Delayed Responses.
Titolo autorizzato: The Elements of Joint Learning and Optimization in Operations Management  Visualizza cluster
ISBN: 3-031-01926-1
Formato: Materiale a stampa
Livello bibliografico Monografia
Lingua di pubblicazione: Inglese
Record Nr.: 9910595059703321
Lo trovi qui: Univ. Federico II
Opac: Controlla la disponibilità qui
Serie: Springer Series in Supply Chain Management