finite_horizon Module

Overview

The finite_horizon module contains code for solving finite-horizon, stochastic inventory optimization problems, with or without fixed costs, using dynamic programming (DP).

Note

The notation and references (equations, sections, examples, etc.) used below refer to Snyder and Shen, Fundamentals of Supply Chain Theory (FoSCT), 2nd edition (2019).

See Also

For an overview of single-echelon inventory optimization in Stockpyl, see the tutorial page for single-echelon inventory optimization.

API Reference

finite_horizon_dp(num_periods, holding_cost, stockout_cost, terminal_holding_cost, terminal_stockout_cost, purchase_cost, fixed_cost, demand_mean, demand_sd, discount_factor=1.0, initial_inventory_level=0.0, trunc_tol=0.02, d_spread=4, s_spread=5)[source]

Solve the finite-horizon inventory optimization problem, with or without fixed costs, minimizing the expected discounted cost over the time horizon, using dynamic programming (DP).

See Sections 4.3.3 and 4.4.3 of FoSCT for discussion and notation.

Returns \(s^*_t\) and \(S^*_t\) in the output lists reorder_points and order_up_to_levels, respectively. If fixed_cost = 0, then reorder_points[t] = order_up_to_levels[t] for all t.

Also returns the optimal cost and action matrices. The optimal total expected discounted cost over the entire horizon is given by cost_matrix[1,initial_inventory_level], and is also returned in total_cost. x_range gives the range of \(x\) values for which \(\theta_t(x)\) is calculated, i.e., the indices for the columns of cost_matrix and oul_matrix.

The terminal cost function is assumed to be given by

\[\theta_{T+1}(x) = h_T x^+ + p_T x^-,\]

where \(h_T\) = terminal_holding_cost, \(p_T\) = terminal_stockout_cost, \(x^+ = \max\{x,0\}\), and \(x^- = \max\{-x,0\}\).

Demands are assumed to be normally distributed.

Most parameters may be given as a singleton or a list. If given as a singleton, the parameter will be assumed to be the same in every time period. If given as a list, the list must be of length num_periods or num_periods+1.

  • In the former case, the list is assumed to contain values for periods 1, …, num_periods in elements 0, …, num_periods-1.

  • In the latter case, the list is assumed to contain values for periods 1, …, num_periods in elements 1, …, num_periods, and the 0th element is ignored.

The parameters may be mixed, some scalars and some lists.

Output arrays are all 1-indexed; for example, reorder_point[5] gives \(s^*_5\), the reorder point for period 5.

Discretization is done at the integer level, i.e., all demands and inventory positions are rounded to the nearest integer. The state space (range of possible inventory positions) is truncated using settings specified in the code.

Raises warnings if the discretization and truncation settings are likely to lead to suboptimal results. (See details in the code.)

Note

This function executes faster than straightforward implementation because it calculates \(H_t(y)\) (as defined in (4.87)) for each \(t\) and \(y\), and then uses this when calculating \(\theta_t(x)\) for each \(x\). This avoids having to recalculate the terms that don’t depend on \(x\) (which are computationally expensive).

Parameters
  • num_periods (int) – Number of periods in time horizon. [\(T\)]

  • holding_cost (float or list) – Holding cost per item per period. [\(h\)]

  • stockout_cost (float or list) – Stockout cost per item per period. [\(p\)]

  • terminal_holding_cost (float) – Terminal holding cost per item. [\(h_T\)]

  • terminal_stockout_cost (float) – Terminal stockout cost per item. [\(p_T\)]

  • purchase_cost (float or list) – Purchase cost per item. [\(c\)]

  • fixed_cost (float or list) – Fixed cost per order. [\(K\)]

  • demand_mean (float or list) – Demand mean per period. [\(\mu\)]

  • demand_sd (float or list) – Demand standard deviation per period. [\(\sigma\)]

  • discount_factor (float or list) – Discount factor, in \((0,1]\). Default = 1. [\(\gamma\)]

  • initial_inventory_level (float) – Initial inventory level at the start of period 1. [\(x_1\)]

  • trunc_tol (float) – Truncation tolerance; a warning is raised if either total probability of demand outside of d_range > trunc_tol for any time period, or \(P(s_t - D < x_{min})\) > trunc_tol, where \(x_{min}\) is the minimum value of \(x\) after truncation.

  • d_spread (float) – Number of standard deviations around mean to consider for demand truncation.

  • s_spread (float) – Number of (demand) standard deviations around \((s,S)\) estimates to consider.

Returns

  • reorder_points (list) – List of reorder points in each time period. [\(s^*_t\)]

  • order_up_to_levels (list) – List of order-up-to levels in each time period. [\(S^*_t\)]

  • total_cost (float) – Optimal total expected discounted cost over the horizon, assuming IL in period 1 equals initial_inventory_level. [\(\theta_1(x_1)\)]

  • cost_matrix (ndarray) – Matrix of DP costs; cost_matrix[t,x] = optimal expected cost in periods \(t,\ldots,T\) if we begin period \(t\) with \(IL_t = x\) (and act optimally thereafter). [\(\theta_t(x)\)]

  • oul_matrix (ndarray) – Matrix of order-up-to levels; oul_matrix[t,x] = optimal order-up-to level in period \(t\) if we begin period \(t\) with \(IL_t = x\).

  • x_range (list) – Vector of \(x\)-values used in the discretization, i.e., indices of the columns of cost_matrix and oul_matrix.

Raises
  • ValueError – If num_periods <= 0 or is non-integer.

  • ValueError – If holding_cost, stockout_cost, purchase_cost, fixed_cost, demand_mean, or demand_sd < 0 for any time period.

  • ValueError – If discount_factor <= 0 or > 1 for any time period.

  • ValueError – If terminal_holding_cost < 0 or terminal_stockout_cost < 0.

Equation Used (equation (4.66)):

\[\theta_t(x) = \min_{y \ge x} \{K\delta(y-x) + c(y-x) + g(y) + \gamma \mathbb{E}_D[\theta_{t+1}(y-D)]\},\]

where \(\delta(z) = 1\) if \(z>0\) and \(0\) otherwise, and where \(g(\cdot)\) is the newsvendor cost function.

Algorithm Used: DP for finite-horizon inventory problem (Algorithm 4.1)

Truncation and Discretization are performed as follows:

1. Range of demand values is truncated at \(\mu \pm\) d_spread\(\sigma\), where \(\mu\) and \(\sigma\) are the mean and standard deviation of the demand, truncating also at 0, and accounting conservatively for the variations in demand parameters across periods.

2. If the total probability of demand outside the demand range in any time period is greater than trunc_tol, a warning is issued.

3. \(s\) is estimated as the newsvendor solution and \(S\) is estimated as \(s + Q_{EOQB}\), where \(Q_{EOQB}\) is the order quantity from the EOQB problem.

4. s_spread\(\sigma\) is subtracted from \(s\) and added to \(S\) to provide desired buffer, and \(\mu + \sigma\) d_spread is subtracted from \(s\) to account for demand, accounting conservatively for the variations in demand parameters across periods, and adjusting the final period to account for terminal costs.

  1. Range of \(x\) values is set using these two bounds.

6. When calculating \(H_t(y)\), the demand range is further truncated to avoid \(y-d\) falling below the smallest value in the \(x\)-range.

7. If, at any point in the optimization, the optimal order-up-to level for any \(x\) and \(t\) is the largest value in the \(x\)-range, suggesting that the uppper end of the range is too low and that the optimal order-up-to level may be greater than it, the optimization is terminated, the upper end of the \(x\)-range is doubled, and the optimization is restarted.

8. If, at any point in the optimization, the total probability of demand values that could bring the inventory level below the smallest value in the \(x\)-range is greater than trunc_tol, suggesting that the lower end of the range is too high and that reasonable demand values could bump up against it, a warning is generated. The optimization is not terminated, but the user may wish to try again with a larger value of s_spread and/or d_spread.

Example:

>>> s, S, cost, _, _, _ = finite_horizon_dp(5, 1, 20, 1, 20, 2, 50, 100, 20)
>>> s
[0, 110, 110, 110, 110, 111]
>>> S
[0, 133.0, 133.0, 133.0, 133.0, 126.0]
>>> cost
1558.6946467384012
myopic_bounds(num_periods, holding_cost, stockout_cost, terminal_holding_cost, terminal_stockout_cost, purchase_cost, fixed_cost, demand_mean, demand_sd, discount_factor=1.0)[source]

Calculate the “myopic” bounds for the finite-horizon inventory optimization problem, with or without fixed costs.

See Sections 4.3.3 and 4.4.3 of FoSCT for discussion and notation.

The myopic bounds \(\bar{s}_t\), \(\underline{S}_t\), and \(\bar{S}_t\) are denoted \(r^+(t)\), \(s^+(t)\), and \(s^++(t)\), respectively, in Zipkin (2000). (Zipkin does not have an analogous quantity to \(\underline{s}_t\).) They are not used in FoSCT, but the bounds are given in terms of FoSCT notation below.

Demands are assumed to be normally distributed.

Most parameters may be given as a singleton or a list. If given as a singleton, the parameter will be assumed to be the same in every time period. If given as a list, the list must be of length num_periods or num_periods+1.

  • In the former case, the list is assumed to contain values for periods 1, …, num_periods in elements 0, …, num_periods-1.

  • In the latter case, the list is assumed to contain values for periods 1, …, num_periods in elements 1, …, num_periods, and the 0th element is ignored.

The parameters may be mixed, some scalars and some lists.

Output arrays are all 1-indexed; for example, S_underbar[5] gives \(\underline{s}_5\), the lower bound on \(s_5\).

Parameters
  • num_periods (int) – Number of periods in time horizon. [\(T\)]

  • holding_cost (float or list) – Holding cost per item per period. [\(h\)]

  • stockout_cost (float or list) – Stockout cost per item per period. [\(p\)]

  • terminal_holding_cost (float) – Terminal holding cost per item. [\(h_T\)]

  • terminal_stockout_cost (float) – Terminal stockout cost per item. [\(p_T\)]

  • purchase_cost (float or list) – Purchase cost per item. [\(c\)]

  • fixed_cost (float or list) – Fixed cost per order. [\(K\)]

  • demand_mean (float or list) – Demand mean per period. [\(\mu\)]

  • demand_sd (float or list) – Demand standard deviation per period. [\(\sigma\)]

  • discount_factor (float or list) – Discount factor, in \((0,1]\). Default = 1. [\(\gamma\)]

Returns

  • S_underbar (ndarray) – List of myopic lower bounds on \(S_t\). [\(\underline{S}_t\)]

  • S_overbar (ndarray) – List of myopic upper bounds on \(S_t\). [\(\bar{S}_t\)]

  • s_underbar (ndarray) – List of myopic lower bounds on \(s_t\). [\(\underline{s}_t\)]

  • s_overbar (ndarray) – List of myopic upper bounds on \(s_t\). For periods t in which fixed_cost[t] - discount_factor[t] * fixed_cost[t+1] < 0, s_overbar[t] = None. (s_overbar is invalid in these cases.) [\(\bar{s}_t\)]

Raises
  • ValueError – If num_periods <= 0 or is non-integer.

  • ValueError – If holding_cost, stockout_cost, purchase_cost, fixed_cost, demand_mean, or demand_sd < 0 for any time period.

  • ValueError – If discount_factor <= 0 or > 1 for any time period.

  • ValueError – If purchase_cost[t] - discount_factor[t] * purchase_cost[t+1] is less than -holding_cost[t] or greater than stockout_cost[t] for some t. (This is required for myopic policy to be valid.)

Equations Used:

\[\underline{S}_t = \text{optimizer of } G_t(y)\]
\[\bar{S}_t = \text{the value of } y > \underline{S}_t \text{ such that } G_t(y) = G_t(\underline{S}_t) + \gamma_tK_{t+1}\]
\[\underline{s}_t = \text{the value of } y \le \underline{S}_t \text{ such that } G_t(y) = G_t(\underline{S}_t) + K_t\]
\[\bar{s}_t = \text{the value of } y \le \underline{S}_t \text{ such that } G_t(y) = G_t(\underline{S}_t) + K_t - \gamma_tK_{t+1},\]

where \(G_t(y)\) is the myopic newsvendor cost function in period \(t\), denoted \(G_i(y)\) in Veinott (1966) and as \(C^+(t,y)\) in Zipkin (2000), and is implemented in stockpyl.newsvendor.myopic().)

In the fourth equation, if \(K_t - \gamma_tK_{t+1} < 0\), then \(\bar{s}_t\) is invalid and s_overbar[t] is set to None.

References

A. F. Veinott, Jr., On the Optimality of \((s,S)\) Inventory Policies: New Conditions and a New Proof, J. SIAM Appl. Math 14(5), 1067-1083 (1966).

    1. Zipkin, Foundations of Inventory Management, Irwin/McGraw-Hill (2000).

Example:

>>> S_underbar, S_overbar, s_underbar, s_overbar = myopic_bounds(5, 1, 20, 1, 20, 2, 50, 100, 20)
>>> S_underbar[1], S_overbar[1], s_underbar[1], s_overbar[1]
(133.36782387894158, 191.66022942788436, 110.26036848597217, 133.36782387894158)