Giter VIP home page Giter VIP logo

pyjobshop's Introduction

Note

This package is under development. Expect things to break significantly during the v0.0.x phase.

PyJobShop

PyPI License CI DOC Codecov

PyJobShop is a package for implementing scheduling models in Python. It supports the classically known flexible job shop problem (FJSP) and many of its extensions such as arbitrary precedence relations, sequence-dependent setup times, and many more! The implementation is currently work in progress - feel free to open an issue if you have any questions.

Installation

First, clone this repository to your local setup and cd into it. Then, make sure you have Poetry version 1.2+ installed, and run the following command:

poetry install

This will set-up a virtual environment and install all necessary dependencies.

Installing CPLEX

Running the CP model and solving it requires the optimization engine from IBM CPLEX. See their documentation for more details. If you install the free community edition, you can only solve models with up to 1000 variables and 1000 constraints. Models beyond that size require a paid version or an academic version.

pyjobshop's People

Contributors

leonlan avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

pyjobshop's Issues

Optional operations and processing plans

It’s currently assumed that all operations must be scheduled. This means that don’t allow for prize-collecting variants of the problem. It may be interesting to look into this at some point and find related literature to see if it’s worth it to include it in our models.

Rename machine to resource?

Advantage:

  • Used in Dauzere-Perez et al. (2023).
  • Generalizes machines to arbitrary resources.

Disadvantage:

  • The term machine is much more explicit.

Contradiction when using interval size and size_of constraint

This leads to a contradiction in a very simple flexible job-shop problem:

    for op in data.operations:
        m.add_interval_var(name=f"O_{op.id}")

        for idx, machine in enumerate(op.machines):
            var = m.add_interval_var(
                optional=True,
                name=f"A_{op.id}_{machine.id}",
                size=op.durations[idx],
            )
            # The duration of the operation on the machine is at least the
            # duration of the operation; it could be longer due to blocking.
            m.add(m.size_of(var) == op.durations[idx])  # TODO contradiction

Job deadlines

Then we can also model tardiness.

This addition is very straightforward. We can add a deadline attribute to jobs.

Auxilliary resources

Sometimes the processing of an operation requires an auxilliary resource (e.g., worker that oversees the production).

This can be modeled as a RFJSP. The auxilliary resource is just another machine.
For an operation that requires both machine and another resource, we can make two operations.
The first one is simply the processing operation like defined before.
The second one has eligible machines corresponding to its auxilliary resources.
The first and second operation must be synced together.
This means that we need a Synchronize timing constraint.

Set fixed interval start/end/duration for operations

Some applications may require re-optimizing schedules, where some part of the schedule is already fixed. We should allow users to pass start, end and duration values when constructing Operations, so they can fix those decisions already. We don't need to do it for machines, because that can already be done with the current interface where a user only passes one machine.

If we implement #39, then we should do the same for that.

Documentation

Philosophy: stick as much to PyVRP's documentation styling so we can reuse code, and newly learned things can be applied back.

Fix transportation constraints

https://github.com/leonlan/fjsp/blob/28dafbce8791386c7b7e47db72ec1622731c40bc/fjsp/CpModel.py#L137-L157

This constraint is not correct yet.

To recap:

  • If there are two operations $i$ and $j$, let $M_i$ denote the eligible machine set for $i$ and define $M_j$ similarly.
  • Assume that there is an edge $i \rightarrow j$, implying that $i$ must come before $j$. (This is not always true!)
  • Let $m_i \in M_i$ and $m_j \in M_j$ denote some machine pair.

Nowe we distinguish two cases:

  • There is a machine edge between $m_i$ and $m_j$. Then there are no restrictions: we can schedule $i$ on $m_i$ or not, and we can schedule $j$ on $m_j$ or not, independent of the other choice.
  • There is no machine edge between $m_i$ and $m_j$. Then that means that if I schedule on $i$ on $m_i$, then we cannot schedule the operations on $m_j$, because there is an accessibility restriction. Therefore, the presence of $i$ on $m_i$ forces the non-presence of $j$ on $m_j$.

The correct constraint is therefore:
$$\text{PresenceOf}(A) + \text{PresenceOf}(B) \leq 1.$$

Processing modes

Kis (2003) generalizes operations to operate under processing modes instead of a set of eligible machines.
Let $M$ denote a mode, which consists of a set of machine-consumption tuples $(m_1, r_1), (m_2, r_2), \dots$, and takes $p_{iM}$ time.
Let $\mathcal{M}$ denote a set of modes. Processing an operation means that exactly one processing mode must be selected. Note that this generalizes the processing times, which are modes of a single machine with load 1.

The proposed model interface is something like

def add_processing_mode(self, duration: int, *modes: tuple[Machine, int]):
    modes_ = ((self._machine2id(machine), load) for machine, load in modes)
    self._processing_modes[modes_] = duration

FJSP entities and attributes

The survey by

Xiong, H., Shi, S., Ren, D., & Hu, J. (2022). A survey of job shop scheduling problem: The types and models. Computers & Operations Research, 142, 105731. https://doi.org/10.1016/j.cor.2022.105731

has a convenient way of classifying entities and their attributes for the JSP. This will be useful to to design/extend our ProblemData model, and this is also relevant for PyJobShop/FJSPLIB#4 when we want to design a standard benchmark format.

Batching constraints

Batching looks very similar to the contrasets. It's slightly more complicated because it must be decided what to batch together.

Thoughts:

  • EndToEnd if operations are batched together.
  • May overlap on a machine for "batched" operations.
  • Only question is how to decide which operations to overlap together. Introduce a spanning interval variable?

Solution evaluation based on sequences/assignments

#7 introduces a Solution class that asssumes a solution in terms of its scheduled operations and timing decisions. This is easy to get from a solved CP model, but when using heuristics, obtaining the timing constraints is not as easy. We should provide another way to construct a Solution purely based on operation to machine assignments, and then evaluate the timing decisions ourselves on the operations graph.

Custom objective functions

It would be nice to define a class that describes an objective function.

  • Makepsan: $\max_{i \in O} C_i$
  • Total completion time: $\sum_{j \in J} \max_{i \in O_j} C_i$
  • Total tardiness: $\sum_{j \in J} \max ( \max_{i \in O_j} C_i- d_j, 0)$

While at the same time, users should be able to define their own custom objective functions. One use case is when some operations don't contribute to the objective.

Job interval variable

The job interval variable should span the operation interval variables and can be used to define objectives or constraints.

Extensions

From Dauzère-Pérès et al. (2023):

  • (5.1) Time lag constraints
  • #28
  • #27
  • (5.4) Setup time constraints
  • (5.5) Blocking constraints
  • (5.6) Transportation constraints
  • (5.7) Overlapping constraints
  • (6.1) Complex/non-linear routes
  • (6.2) Multiple resources

References

Just a list of relevant references for this work:

  • Dauzère-Pérès, S., Ding, J., Shen, L., & Tamssaouet, K. (2023). The flexible job shop scheduling problem: A review. European Journal of Operational Research. https://doi.org/10.1016/j.ejor.2023.05.017
  • Tamssaouet, K., & Dauzère-Pérès, S. (2023). A General Efficient Neighborhood Structure Framework for the Job-Shop and Flexible Job-Shop Scheduling Problems. European Journal of Operational Research. https://doi.org/10.1016/j.ejor.2023.05.018
  • Framinan, J. M., Perez-Gonzalez, P., & Fernandez-Viagas, V. (2019). Deterministic assembly scheduling problems: A review and classification of concurrent-type scheduling models and solution procedures. European Journal of Operational Research, 273(2), 401–417. https://doi.org/10.1016/j.ejor.2018.04.033
  • Da Col, G., & Teppan, E. C. (2022). Industrial-size job shop scheduling with constraint programming. Operations Research Perspectives, 9, 100249. https://doi.org/10.1016/j.orp.2022.100249
  • Xiong, H., Shi, S., Ren, D., & Hu, J. (2022). A survey of job shop scheduling problem: The types and models. Computers & Operations Research, 142, 105731. https://doi.org/10.1016/j.cor.2022.105731

Rename operation to task?

  • Task is more clear.
  • Task is shorter.
  • Task is used in CP terminology.

vs.

  • Operation is commonly used in FJSP.

Precedence types with delays

Precedence types with delays are needed to model transportation times between constraints with end_at_end precedence types.

Support for OR-Tools

The current CP models are written in CPLEX CP Optimizer. This is proprietary software and expensive if you don't have an academic license. OR-Tools is free and it should be possible to support as well.

Use processing times as machine eligiblity?

Right now the user has to pass in both machine2ops and processing_times in order to set the machine-operation eligibility and processing times, respectively.

    model = Model()

    job = model.add_job()
    machine = model.add_machine()
    operation = model.add_operation(earliest_start=1)

    model.assign_job_operations(job, [operation])
    model.assign_machine_operations(machine, [operation]) # redundant?
    model.add_processing_time(operation, machine, 1) # redundant?

There is clearly some repetition here. When setting processing times, it's (almost?) always the case that the corresponding machine operation pair is eligible.

The reasons why to set the eligiblities explicitly is to minimize the number of assignment variables. From a processing time matrix, it's not easy to infer which (machine, operation) pairs are eligible, so this would result in all pairs to be assignment variables, which is undesired.

Instead, having a processing times dictionary that maps pairs to processing times could work.

Code coverage

Would be nice to add code coverage to see which parts of the codes are untested.

Steps:

  • Add codecov secret token
  • Add codecov to GH actions

Mixed-integer linear programming support

At some point it will be nice to include a MILP model. The nice thing about MILPs is that most solvers can read a general MPS or LP file, so we only need to define the MILP model once.

Solvers to support:

  • HiGHS: open-source
  • Gurobi: better than CPLEX according to Naderi et al. (2023).
  • CPLEX (maybe?)
  • OR tools (maybe?)

Throughput

Defined as the number of jobs completed before the planning horizon (Ostermeier 2023).

Some things to think about:

  • Jobs must become optional now.
  • When is a job "fully" scheduled in combination with process plans? (Without process plans, simply all operations must become present.)
  • Objective: sense needs to maximize as well.

Notes on graph representation

The issue serves to discuss how the graph representation used in the codebase (operations graph, machine graph) relate to those discussed in the literature.

(This slide deck has been very helpful.)

In FJSP, it's common to talk about the disjunctive graph. This is a directed graph in which the nodes represent operations and the arcs represent precedence relations. There are two types of arcs:

  • Conjunctive arcs represent the processing order of the operations for a single job.
  • Disjunctive arcs connect two operations (belonging to different jobs) that are to be processed on the same machine. Specifically, disjunctive arcs form a clique for each machine.

One way of looking at the FJSP problem is selecting a subset of disjunctive arcs, such that the union of the conjunctive and disjunctive arcs forms a directed acyclic graph (DAG). Cycles correspond to conflicts in the schedule that lead to infeasibility (not respecting precedence).

Setup times

Setup times are in all different kinds of flavors. The easiest one to tackle first is $st_{ikl}$, which is the setup time when scheduling operation $i$ before $k$ on machine $l$. This can simply be included in the NoOverlap constraint.

It remains a question where to store this information. Machines could have a setup_times matrix attribute, but this requires knowledge of the operation indices. I think it's better to have it in ProblemData.

TODOS

  • There's one more: initial setup times (which are used in the SMRDSDST instances).
  • Setup times matrix dimensions are awkward (op, op, machine). Change to (machine, op, op).

Validate processing plans

Processing plans are added in #66. Currently, ProblemData does not validate whether the processing plans are valid, but we should do that for better user experience.

Given a plan $P$, consisting of operation sets $O_1, O_2$, we should check that:

  • At most one operation set contains a "required" operation.

Model.solve() interface

For now, the solve interface should take a time_limit parameter.

When OR-Tools is supported, it should also take a solver parameter. (#26)

When MIP is supported, it should also take a CP/MIP parameter. (#40)

When matheuristics, it should take a heuristic flag boolean parameter.

Due windows and earliness

We currently support deadlines, which are the latest time at which a job must be completed without incurring penalties. In practice though the deadlines are not a fixed point in time, but rather a time interval during which the job is still considered to be in time

Deadlines generalize to due windows, where deadlines are the "late" due window and we introduce a new "early" due window. This generalization only holds when we only consider tardiness and disregard earliness.

Make operation independent of jobs/machines

Operation depends on the job and machine index. I think this can be made cleaner by separating its dependency on Job and Machine entirely. The relationship between operations and jobs and machines should be passed to ProblemData explicitly.

Can we remove the same sequence constraint?

In c599729 I added same sequence constraints so that the current model can be used to solve permutation flow shop problems.

But can we do without? I think doing it with (general) precedence constraints is possible. The propagation is probably much weaker, but I don't necessarily want the best performance for flow shops here.

Consider two subsequent operations $i$ and $j$ of a single job on consecutive machines. Then $f(i) = s(j)$ forces the same sequence constraint. No, actually, this is much stronger: it's no blocking.

Same sequence is actually doesn't hold up in practice: it assumes unlimited buffers between machines. More often than not, a batch occupies (blocks) the machine until it can move on to the next stage.

EDIT: The above constraint is not blocking, it's no-wait.
Blocking is when we also relax the interval variables to be of size at least $p_{ik}$ rather than be exactly that size.

DifferentUnits constraints

$A_{i} \neq A_{j}$ - operations $i$ and $j$ must be assigned to differents machines. This may be relevant in the context of contamination.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.