Giter VIP home page Giter VIP logo

vrplib's Issues

Distance matrix computation performance

It seems reading a large instance is rather slow, most likely due to computing the distance matrix using a Python for loop. E.g. for GH1000 instance, running on Colab:

!pip install vrplib
import vrplib
vrplib.download_instance("RC2_10_8", "RC2_10_8.vrp")
%%timeit
instance = vrplib.read_instance("RC2_10_8.vrp", instance_format="solomon")

3.35 s ± 178 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Ignore -1 in depot section

The -1 in the depot section does not seem to serve any purpose. I think it's OK if we do not raise in that case.

Write VRP instances

It would be nice to have a function that could also write VRP instances. This was provided in the EURO-NeurIPS competition.

Default argument for path in download & allow dirs

Users must pass a path argument to download_instance.

  • This shouldn't be necessary. If it's unspecified, then it'll just be downloaded and saved to where the function is called from?
  • The file name does not need to be specified. If a dir is provided instead, then save it to that dir with the original file name.

Use numpy testing library

I have been using standard assert statements and numpy.testing assertions. I think it's good to make this more consistent, and to use numpy.testing everywhere.

Timeout when download takes too long

The CVRPLIB website has been offline the entire day. The code currently just hangs waiting for a response. I think we should throw an error if downloading hasn't finished after 20 seconds.

Plotting functions

We should provide some simple plotting functionalities at some point.

Hard-code instance names

It should not be necessary to make a request to the library to obtain all instance names. The instance names don't change over time, only the best known solutions do.

We can simply add a .txt file to this library, which gets updated at every new instances set.

  • Change list_instances to list_names

Download instances and solutions locally

Downloading an instance currently does not save the instance locally. We should rewrite the code to actually download the files to save them locally.

The user would then have to write this:

import vrplib

# Download the instance and solution locally
vrplib.download_instance("X-n101-k25", "/path/to/X-n101-k25.vrp")
vrplib.download_solution("X-n101-k25", "/path/to/X-n101-k25.sol")

# Read the downloaded instance and solution
instance = vrplib.read_instance("/path/to/X-n101-k25.vrp")
solution = vrplib.read_solution("/path/to/X-n101-k25.sol")

Standardize output

To support more instances, it's best to standardize the output to a dictionary. Currently I have added extra attributes etc. which I might want to remove as well later.

TODOs

  • Reorganizes code base
  • Standardizes output to dictionary / removes Instance class.
  • Removes DEPOT constant

Download full instance and solution sets

Would be nice to download the full instance and solution sets in one function call. I'm not sure yet how to implement this. Some ideas:

  • Provide a new function named download_set, which can be used like
download_set("X", output_dir="instances/")
- Two directions here: 
	- This function downloads the corresponding "X" set zip and unzips it.
	- It downloads all "X" instances and solutions.
  • Allow for globs in download_instance and download_solution.

Check validity of instances

Except for some edge weight format and types, the current code does not check if the instance data are correct. For example, time windows can be checked for early <= late. Or that all data sections should have enough entries.

As of now, the library is just really about reading existing instances, so there should be no need to checking validity of the instances. But at some point it might be useful to include this as well, combined with #34 and #14.

Read LKH-3 VRPLIB formatted instance

cvrplib.read currently supports instances from CVPRLIB. I want to extend this to be able to read instances for LKH-3 as well. The formatting is not much different I think, so it's mostly about how the codebase is organized.

The output does need to change. I currently use CVRP and VRPTW classes, but I think I want only to output a dictionary.

Separate read/download instance and solution

To make the code more explicit, I want to make separate functions for reading instances and solutions, and also downloading instances and solutions.

Example:

import cvrplib

# Read
instance = cvrplib.read_instance('/path/to/A-n32-k5.vrp')
solution = cvrplib.read_solution('/path/to/A-n32-k5.sol')

# Download
instance = cvrplib.download_instance('/path/to/A-n32-k5.vrp')
solution = cvrplib.download_solution('/path/to/A-n32-k5.sol')

After finishing #14, I also want to have separate functions for writing instances and solutions:

# Write
cvrplib.write_instance('/path/to/A-n32-k5.vrp', specifications, sections)
cvrplib.write_solution('/path/to/A-n32-k5.sol', routes)

Redesign test suite

I have been sloppy in writing my tests. Especially with the module rehaul, most of the tests are not in the right place, or not even implemented at all. I think that coverage can be very helpful to detect the missing spots.

TODOs

  • Add coverage
  • Implement tests for parse module
  • Gather all the VRPLIB instances that you can find
    • ORTEC Euro-NeurIPS VRPTW instances

Deprecate pkg_resource.read_text

========================================================================================================== warnings summary ==========================================================================================================
tests/download/test_list_names.py::test_list_names[case0]
  /Users/leonlan/Dropbox/cvrplib/vrplib/download/list_names.py:52: DeprecationWarning: read_text is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
    fi = pkg_resource.read_text(__package__, "instance_data.csv")

tests/download/test_list_names.py::test_list_names[case0]
  /opt/homebrew/Cellar/[email protected]/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/resources/_legacy.py:80: DeprecationWarning: open_text is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
    with open_text(package, resource, encoding, errors) as fp:

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.11.3-final-0 ----------
Coverage XML written to file coverage.xml

=================================================================================================== 86 passed, 2 warnings in 3.39s ===================================================================================================
leonlan@Leons-Air cvrplib % 

Speed-up parsing

It currently takes about 2 seconds to read a 1000-customer instance without explicit matrix:

In [73]: %timeit vrplib.read_instance(f"tmp/X-n1001-k43.txt")
2.14 s ± 44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This is slow. I think it's mainly due to computing the Euclidean distances. For example, another 1000 customer instance with explicit matrix reads much faster:

In [76]: %timeit vrplib.read_instance(f"tmp/Loggi-n1001-k31.txt")
246 ms ± 8.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Bug in distance calculation from flattened matrix

Just found a bug in the distance calculation from flattened distance matrices.
The function 'from_flattened' (line 141 in cvrp.py) assumes that
" The numbers in a flattened list correspond the matrix element indices
(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2), (4, 0), ..."
However, for the instance E-n13-k4 the flattened list correponds to indices
(0, 1), (0, 2), (0, 3), (0, 4), ... (11,12).
Changing line 150
indices = sorted([(i, j) for (j, i) in combinations(range(n), r=2)])
to
indices = sorted([(i, j) for (i, j) in combinations(range(n), r=2)])
resolves this issue.
However, I do not know whether the original sorting might be correct for some other instances.

Verification functions

It would be nice to provide functions that can verify any VRP-type solution. Of course, we need to make a verification function for each type of VRP-type problem.

I have no idea yet how the interface will look like. But here's an idea:

from cvrplib import read_instance, read_solution, verify

instance = read_instance(instance_path)
solution = read_solution(solution_path)

# Pass the solution and all relevant instance attributes
verify.cvrp(solution, instance['capacity']) 
verify.vrptw(solution, instance['capacity'], instance['duration_matrix'], instance['time_windows'])

Optional distance computation for large instances

CVRPLIB has an XXL instance set (e.g., http://vrp.atd-lab.inf.puc-rio.br/index.php/en/plotted-instances?data=Flanders2). vrplib takes very long to parse these large instances due to the distance matrix (size $O(n^2)$). For Brussels1 (11K nodes), it takes about 1 minute on my laptop and 99.9% time is spent on computing the distance matrix.

I think it's unreasonable to compute the distances matrix for such large instances, so we should probably add an argument to disable the computation of distance matrices.

Instance/solution fields as numpy arrays instead of lists

The array-like fields in the Instance and Solution classes are current of type List or List[List]. I prefer to change this to numpy arrays, as they provide more useful methods when solving VRP problems. The only hurdle is that I'm not sure how to annotate numpy arrays in a meaningful way. Maybe refer to this?

Add support for VRPTW instances

Our package currently does not support reading and downloading VRPTW instances, i.e., instances that belong to the Solomon and Homberger and Gehring set. Support for this will be in a later version. A new parser function needs to be written for these instances.

Make download faster

The download and list_instances functions take several seconds to complete. The parsing of the file itself goes very fast, but receiving the response from the server takes a while. I believe it has to do with the server that hosts the CVRPLIB library. If you know how to improve this, please let me know.

Roadmap V1.0

The library currently supports reading and downloading instances from the CVRPLIB library. I'm going to work towards a V1 version of this library to make it support more general VRPLIB instances, particularly those from the LKH-3 library. I think that the LKH-3 and CVRPLIB library are currently the best maintained VRP libraries with a consistent format. Moreover, LKH-3 has way more instances than CVRPLIB, which only has CVRP and VRPTW instances.

New features that I want to introduce:

  • Read LKH-3 VRPLIB formatted instances
  • Write VRPLIB formatted instances #14
  • Extensively test VRPLIB reading

Some edge cases (mostly about how to support CVRPLIB):

  • Maybe drop support for downloading CVRPLIB instances? I think it's nice feature to have, but such a feature adds complexity to the library.
    • I'll keep the download feature for now because I think it's quite useful for running quick instances on Colab.
  • Maybe drop support for Solomon/GH instances that are not formatted by VRPLIB standards? I think we need this still because CVRPLIB relies on it.
    • We will not drop support for Solomon-formatted instances. Instead, read_instance will take the style argument which allows to parse different instance formatting styles (i.e., vrplib and solomon).

Issues

Optional parameters for distance rounding

The current implementation always rounds the distances to the nearest integer. It might be helpful for the non-integral instances to allow for different rounding functions.

Related issue: #7

Change package name to cvrplib

I thought it would be helpful to name the package to pycvrplib to make it apparant that the library is written for Python, but I think that's being overly pedantic. I should rename the package to cvrplib, just like tsplib95

Host instances and best known solutions

In the long term, we could make a separate repository to maintain the instances and best known solutions.

  • Using Github Actions we can make this process mostly automated.
  • Templates can streamline the submission process
  • We need to write a validator for each problem type. See #27

Verify that instance is VRPLIB format

We should verify that an instance is in VRPLIB format. I don't know yet what needs to be checked (is there a minimum requirement?), but that's something we can figure out along the way.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.