finite_horizon
Module¶
Overview¶
The finite_horizon
module contains code for solving finite-horizon, stochastic
inventory optimization problems, with or without fixed costs, using dynamic programming (DP).
Note
The notation and references (equations, sections, examples, etc.) used below refer to Snyder and Shen, Fundamentals of Supply Chain Theory (FoSCT), 2nd edition (2019).
See Also
For an overview of single-echelon inventory optimization in Stockpyl, see the tutorial page for single-echelon inventory optimization.
API Reference¶
- finite_horizon_dp(num_periods, holding_cost, stockout_cost, terminal_holding_cost, terminal_stockout_cost, purchase_cost, fixed_cost, demand_mean, demand_sd, discount_factor=1.0, initial_inventory_level=0.0, trunc_tol=0.02, d_spread=4, s_spread=5)[source]¶
Solve the finite-horizon inventory optimization problem, with or without fixed costs, minimizing the expected discounted cost over the time horizon, using dynamic programming (DP).
See Sections 4.3.3 and 4.4.3 of FoSCT for discussion and notation.
Returns \(s^*_t\) and \(S^*_t\) in the output lists
reorder_points
andorder_up_to_levels
, respectively. Iffixed_cost
= 0, thenreorder_points[t]
=order_up_to_levels[t]
for allt
.Also returns the optimal cost and action matrices. The optimal total expected discounted cost over the entire horizon is given by
cost_matrix[1,initial_inventory_level]
, and is also returned intotal_cost
.x_range
gives the range of \(x\) values for which \(\theta_t(x)\) is calculated, i.e., the indices for the columns ofcost_matrix
andoul_matrix
.The terminal cost function is assumed to be given by
\[\theta_{T+1}(x) = h_T x^+ + p_T x^-,\]where \(h_T\) =
terminal_holding_cost
, \(p_T\) =terminal_stockout_cost
, \(x^+ = \max\{x,0\}\), and \(x^- = \max\{-x,0\}\).Demands are assumed to be normally distributed.
Most parameters may be given as a singleton or a list. If given as a singleton, the parameter will be assumed to be the same in every time period. If given as a list, the list must be of length
num_periods
ornum_periods
+1.In the former case, the list is assumed to contain values for periods 1, …,
num_periods
in elements 0, …,num_periods
-1.In the latter case, the list is assumed to contain values for periods 1, …,
num_periods
in elements 1, …,num_periods
, and the 0th element is ignored.
The parameters may be mixed, some scalars and some lists.
Output arrays are all 1-indexed; for example,
reorder_point[5]
gives \(s^*_5\), the reorder point for period 5.Discretization is done at the integer level, i.e., all demands and inventory positions are rounded to the nearest integer. The state space (range of possible inventory positions) is truncated using settings specified in the code.
Raises warnings if the discretization and truncation settings are likely to lead to suboptimal results. (See details in the code.)
Note
This function executes faster than straightforward implementation because it calculates \(H_t(y)\) (as defined in (4.87)) for each \(t\) and \(y\), and then uses this when calculating \(\theta_t(x)\) for each \(x\). This avoids having to recalculate the terms that don’t depend on \(x\) (which are computationally expensive).
- Parameters
num_periods (int) – Number of periods in time horizon. [\(T\)]
holding_cost (float or list) – Holding cost per item per period. [\(h\)]
stockout_cost (float or list) – Stockout cost per item per period. [\(p\)]
terminal_holding_cost (float) – Terminal holding cost per item. [\(h_T\)]
terminal_stockout_cost (float) – Terminal stockout cost per item. [\(p_T\)]
purchase_cost (float or list) – Purchase cost per item. [\(c\)]
fixed_cost (float or list) – Fixed cost per order. [\(K\)]
demand_mean (float or list) – Demand mean per period. [\(\mu\)]
demand_sd (float or list) – Demand standard deviation per period. [\(\sigma\)]
discount_factor (float or list) – Discount factor, in \((0,1]\). Default = 1. [\(\gamma\)]
initial_inventory_level (float) – Initial inventory level at the start of period 1. [\(x_1\)]
trunc_tol (float) – Truncation tolerance; a warning is raised if either total probability of demand outside of
d_range
>trunc_tol
for any time period, or \(P(s_t - D < x_{min})\) >trunc_tol
, where \(x_{min}\) is the minimum value of \(x\) after truncation.d_spread (float) – Number of standard deviations around mean to consider for demand truncation.
s_spread (float) – Number of (demand) standard deviations around \((s,S)\) estimates to consider.
- Returns
reorder_points (list) – List of reorder points in each time period. [\(s^*_t\)]
order_up_to_levels (list) – List of order-up-to levels in each time period. [\(S^*_t\)]
total_cost (float) – Optimal total expected discounted cost over the horizon, assuming IL in period 1 equals
initial_inventory_level
. [\(\theta_1(x_1)\)]cost_matrix (ndarray) – Matrix of DP costs;
cost_matrix[t,x]
= optimal expected cost in periods \(t,\ldots,T\) if we begin period \(t\) with \(IL_t = x\) (and act optimally thereafter). [\(\theta_t(x)\)]oul_matrix (ndarray) – Matrix of order-up-to levels;
oul_matrix[t,x]
= optimal order-up-to level in period \(t\) if we begin period \(t\) with \(IL_t = x\).x_range (list) – Vector of \(x\)-values used in the discretization, i.e., indices of the columns of
cost_matrix
andoul_matrix
.
- Raises
ValueError – If
num_periods
<= 0 or is non-integer.ValueError – If
holding_cost
,stockout_cost
,purchase_cost
,fixed_cost
,demand_mean
, ordemand_sd
< 0 for any time period.ValueError – If
discount_factor
<= 0 or > 1 for any time period.ValueError – If
terminal_holding_cost
< 0 orterminal_stockout_cost
< 0.
Equation Used (equation (4.66)):
\[\theta_t(x) = \min_{y \ge x} \{K\delta(y-x) + c(y-x) + g(y) + \gamma \mathbb{E}_D[\theta_{t+1}(y-D)]\},\]where \(\delta(z) = 1\) if \(z>0\) and \(0\) otherwise, and where \(g(\cdot)\) is the newsvendor cost function.
Algorithm Used: DP for finite-horizon inventory problem (Algorithm 4.1)
Truncation and Discretization are performed as follows:
1. Range of demand values is truncated at \(\mu \pm\)
d_spread
\(\sigma\), where \(\mu\) and \(\sigma\) are the mean and standard deviation of the demand, truncating also at 0, and accounting conservatively for the variations in demand parameters across periods.2. If the total probability of demand outside the demand range in any time period is greater than
trunc_tol
, a warning is issued.3. \(s\) is estimated as the newsvendor solution and \(S\) is estimated as \(s + Q_{EOQB}\), where \(Q_{EOQB}\) is the order quantity from the EOQB problem.
4.
s_spread
\(\sigma\) is subtracted from \(s\) and added to \(S\) to provide desired buffer, and \(\mu + \sigma\)d_spread
is subtracted from \(s\) to account for demand, accounting conservatively for the variations in demand parameters across periods, and adjusting the final period to account for terminal costs.Range of \(x\) values is set using these two bounds.
6. When calculating \(H_t(y)\), the demand range is further truncated to avoid \(y-d\) falling below the smallest value in the \(x\)-range.
7. If, at any point in the optimization, the optimal order-up-to level for any \(x\) and \(t\) is the largest value in the \(x\)-range, suggesting that the uppper end of the range is too low and that the optimal order-up-to level may be greater than it, the optimization is terminated, the upper end of the \(x\)-range is doubled, and the optimization is restarted.
8. If, at any point in the optimization, the total probability of demand values that could bring the inventory level below the smallest value in the \(x\)-range is greater than
trunc_tol
, suggesting that the lower end of the range is too high and that reasonable demand values could bump up against it, a warning is generated. The optimization is not terminated, but the user may wish to try again with a larger value ofs_spread
and/ord_spread
.Example:
>>> s, S, cost, _, _, _ = finite_horizon_dp(5, 1, 20, 1, 20, 2, 50, 100, 20) >>> s [0, 110, 110, 110, 110, 111] >>> S [0, 133.0, 133.0, 133.0, 133.0, 126.0] >>> cost 1558.6946467384012
- myopic_bounds(num_periods, holding_cost, stockout_cost, terminal_holding_cost, terminal_stockout_cost, purchase_cost, fixed_cost, demand_mean, demand_sd, discount_factor=1.0)[source]¶
Calculate the “myopic” bounds for the finite-horizon inventory optimization problem, with or without fixed costs.
See Sections 4.3.3 and 4.4.3 of FoSCT for discussion and notation.
The myopic bounds \(\bar{s}_t\), \(\underline{S}_t\), and \(\bar{S}_t\) are denoted \(r^+(t)\), \(s^+(t)\), and \(s^++(t)\), respectively, in Zipkin (2000). (Zipkin does not have an analogous quantity to \(\underline{s}_t\).) They are not used in FoSCT, but the bounds are given in terms of FoSCT notation below.
Demands are assumed to be normally distributed.
Most parameters may be given as a singleton or a list. If given as a singleton, the parameter will be assumed to be the same in every time period. If given as a list, the list must be of length
num_periods
ornum_periods
+1.In the former case, the list is assumed to contain values for periods 1, …,
num_periods
in elements 0, …,num_periods
-1.In the latter case, the list is assumed to contain values for periods 1, …,
num_periods
in elements 1, …,num_periods
, and the 0th element is ignored.
The parameters may be mixed, some scalars and some lists.
Output arrays are all 1-indexed; for example,
S_underbar[5]
gives \(\underline{s}_5\), the lower bound on \(s_5\).- Parameters
num_periods (int) – Number of periods in time horizon. [\(T\)]
holding_cost (float or list) – Holding cost per item per period. [\(h\)]
stockout_cost (float or list) – Stockout cost per item per period. [\(p\)]
terminal_holding_cost (float) – Terminal holding cost per item. [\(h_T\)]
terminal_stockout_cost (float) – Terminal stockout cost per item. [\(p_T\)]
purchase_cost (float or list) – Purchase cost per item. [\(c\)]
fixed_cost (float or list) – Fixed cost per order. [\(K\)]
demand_mean (float or list) – Demand mean per period. [\(\mu\)]
demand_sd (float or list) – Demand standard deviation per period. [\(\sigma\)]
discount_factor (float or list) – Discount factor, in \((0,1]\). Default = 1. [\(\gamma\)]
- Returns
S_underbar (ndarray) – List of myopic lower bounds on \(S_t\). [\(\underline{S}_t\)]
S_overbar (ndarray) – List of myopic upper bounds on \(S_t\). [\(\bar{S}_t\)]
s_underbar (ndarray) – List of myopic lower bounds on \(s_t\). [\(\underline{s}_t\)]
s_overbar (ndarray) – List of myopic upper bounds on \(s_t\). For periods
t
in whichfixed_cost[t] - discount_factor[t] * fixed_cost[t+1] < 0
,s_overbar[t] = None
. (s_overbar
is invalid in these cases.) [\(\bar{s}_t\)]
- Raises
ValueError – If
num_periods
<= 0 or is non-integer.ValueError – If
holding_cost
,stockout_cost
,purchase_cost
,fixed_cost
,demand_mean
, ordemand_sd
< 0 for any time period.ValueError – If
discount_factor
<= 0 or > 1 for any time period.ValueError – If
purchase_cost[t] - discount_factor[t] * purchase_cost[t+1]
is less than-holding_cost[t]
or greater thanstockout_cost[t]
for somet
. (This is required for myopic policy to be valid.)
Equations Used:
\[\underline{S}_t = \text{optimizer of } G_t(y)\]\[\bar{S}_t = \text{the value of } y > \underline{S}_t \text{ such that } G_t(y) = G_t(\underline{S}_t) + \gamma_tK_{t+1}\]\[\underline{s}_t = \text{the value of } y \le \underline{S}_t \text{ such that } G_t(y) = G_t(\underline{S}_t) + K_t\]\[\bar{s}_t = \text{the value of } y \le \underline{S}_t \text{ such that } G_t(y) = G_t(\underline{S}_t) + K_t - \gamma_tK_{t+1},\]where \(G_t(y)\) is the myopic newsvendor cost function in period \(t\), denoted \(G_i(y)\) in Veinott (1966) and as \(C^+(t,y)\) in Zipkin (2000), and is implemented in
stockpyl.newsvendor.myopic()
.)In the fourth equation, if \(K_t - \gamma_tK_{t+1} < 0\), then \(\bar{s}_t\) is invalid and
s_overbar[t]
is set toNone
.References
A. F. Veinott, Jr., On the Optimality of \((s,S)\) Inventory Policies: New Conditions and a New Proof, J. SIAM Appl. Math 14(5), 1067-1083 (1966).
Zipkin, Foundations of Inventory Management, Irwin/McGraw-Hill (2000).
Example:
>>> S_underbar, S_overbar, s_underbar, s_overbar = myopic_bounds(5, 1, 20, 1, 20, 2, 50, 100, 20) >>> S_underbar[1], S_overbar[1], s_underbar[1], s_overbar[1] (133.36782387894158, 191.66022942788436, 110.26036848597217, 133.36782387894158)