Algorithms & Randomness Center (ARC)
Semih Cayci (Ohio State University)
Monday, February 3, 2020
Groseclose 402 - 11:00 am
Title: Budget-Constrained Learning and Optimization with Bandit Feedback
Abstract: In numerous decision processes such as algorithm portfolio design, adaptive routing, and dynamic pricing, each action incurs a random cost and yields a random, potentially correlated reward, and the process continues until the total cost exceeds a resource constraint. Motivated by these applications, we investigate the budget-constrained bandit problem, in which the decision-maker aims to maximize the expected total reward subject to a budget constraint on the total cost. For this problem, we show that logarithmic regret is achievable as long as moments of order p > 2 exist. In particular, we propose robust learning algorithms that incorporate linear estimators to extract and exploit the correlation between the cost and reward of an arm for optimal sample complexity. We prove that these algorithms achieve tight regret bounds, which are optimal up to a constant factor in the Gaussian case.

In the second part, we focus on the time-constrained bandit problem and allow the decision-maker to interrupt an ongoing task and forgo its immediate reward for a potentially faster alternative. We show that this controlled interrupt mechanism improves the total reward linearly over time for a broad class of completion-time distributions. To learn the optimal action and interrupt strategy, we propose learning algorithms that exploit the information structure of the problem to achieve order-optimal regret. We show that these learning algorithms provide efficient solutions for boosting the time-efficiency of stochastic local search methods in fundamental applications such as the k-SAT problem.
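To make the setting concrete, here is a minimal, generic sketch of a budgeted UCB-style policy: pulls continue until cumulative cost exhausts the budget, and arms are ranked by an empirical reward-to-cost ratio plus an exploration bonus. This is an illustrative baseline only, not the speaker's algorithm (the talk's methods additionally use linear estimators of the cost-reward correlation); the function name `budgeted_ucb` and the index form are assumptions for illustration.

```python
import math

def budgeted_ucb(arms, budget):
    """Illustrative budget-constrained bandit sketch.

    arms   -- list of callables, each returning a (cost, reward) sample
    budget -- total cost allowed; pulling stops once it is exhausted
    Returns (total_reward, pulls_per_arm).
    """
    n = len(arms)
    pulls = [0] * n
    cost_sum = [0.0] * n
    reward_sum = [0.0] * n
    spent = 0.0
    total_reward = 0.0
    t = 0
    while spent < budget:
        t += 1
        if t <= n:
            # Initialization round: pull each arm once.
            i = t - 1
        else:
            # Index = empirical reward/cost ratio + UCB exploration bonus.
            def index(j):
                ratio = reward_sum[j] / max(cost_sum[j], 1e-12)
                bonus = math.sqrt(2.0 * math.log(t) / pulls[j])
                return ratio + bonus
            i = max(range(n), key=index)
        cost, reward = arms[i]()
        pulls[i] += 1
        cost_sum[i] += cost
        reward_sum[i] += reward
        spent += cost
        total_reward += reward
    return total_reward, pulls
```

With two deterministic arms of equal cost but reward 1.0 versus 0.1, the ratio-based index concentrates pulls on the high-ratio arm while the bonus term still forces occasional exploration, which is the behavior the logarithmic-regret guarantees in the abstract formalize.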
Videos of recent talks are available at: https://smartech.gatech.edu/handle/1853/46836