Sometimes when formulating a LP or MIP, you run into the scenario where it is better to have several small occurrences of a bad event (or good event) rather than a single, larger occurrence. Consider a basic machine scheduling problem of scheduling n tasks on a machine. Each task i = 1 .. n has a different due date di, completion date ti, and number of days late li. It is possible that the users will consider a solution with ten tasks each one day late to be equivalent to a solution with one task ten days late. But they may prefer the first solution because it is a more balanced approach, spreading the lateness evenly among several tasks rather than heavily targeting one unlucky task.
If we were to model this problem with the objective
min c∑ili where li = max(ti – di, 0)
then the two solutions above would have the same cost because they are each 10 “task days” short (since the cost of the first solution is c * (1 day * 10 tasks) = 10 * c and the cost of the second solution is c * 10 days * 1 task = 10 * c). Instead, we could model the objective using a piecewise linear function where the cost increases significantly as the number of days late increases.
(Humorous note: Wikipedia defines a piecewise linear function as “a piecewise-defined function whose pieces are linear”. That’s kinda like defining an abelian group as a group that equals its abelianization. See the end of the post for an Abstract Algebra joke that still makes me laugh 12 years later.)
Let’s say that if a task is just one day late, there will be a small cost per day late (c1). If a task is two or three days late, there will be a slightly larger cost per day late (c2). If a task is four or five days late, the cost per day late will really increase (c3) and if the task is more than five days late, the cost skyrockets (c4). This results is a piecewise linear cost of lateness function that looks similar to this graph. Now the cost of ten tasks each a day late vs. the cost of one task that is ten days late is
c1 * 10 tasks * 1 day vs. c4 * 1 task * 10 days
10 * c1 vs. 10 * c4
where c4 > c1.
The CPLEX API has Piecewise Linear functionality (at least in versions 12.3 and 12.4) which I have tried and found works fine in general. It’s also fairly easy to write your own functionality using integer variables to indicate which category of lateness (0-1 day, 2-3 days, 4-5 days, 5+ days) each task falls into and multiplying these variables by the appropriate costs. I initially tried the CPLEX functionality for a large production model I was developing but I ran into edge cases that were behaving oddly so I did write my own constraints and did not see a performance difference, even with the largest datasets I used for testing.
Warning: As promised, Abstract Algebra joke ahead.
What is purple and commutes?
An abelian grape.
Thanks to my undergraduate Abstract Algebra teacher Dr. Todd Ashby for this joke. It is the only reason I still remember what abelian means.