Giter VIP home page Giter VIP logo

Comments (17)

henryiii avatar henryiii commented on August 29, 2024 1

FYI,

if.any.env.CIBUILDWHEEL = true

Will do truthy/falsey values.

from levenshtein.

maxbachmann avatar maxbachmann commented on August 29, 2024

I am aware of the scikit-build-core project and long term I will want to migrate to it. However I wanted to wait until it supports all the features required to migrate https://github.com/rapidfuzz/RapidFuzz.

I did have a look at this last year when development on scikit-build-core started. I think the only required feature that's still missing is scikit-build/scikit-build-core#112 which is required for the pure Python fallback.

from levenshtein.

LecrisUT avatar LecrisUT commented on August 29, 2024

Thanks for clarifying the blocking feature. I am not sure I understand what is meant with run_setup(true/false). Is it to disable the cmake build part? Can you walk me through how you use that?

Have you also checked the Overrides feature? Might be close to what you need?

I can also share my project for design reference. It's basically split in CMake subprojects so that it can build as a monolith, or individual components, like the python interface only.

from levenshtein.

maxbachmann avatar maxbachmann commented on August 29, 2024

https://github.com/rapidfuzz/RapidFuzz/blob/main/setup.py is the setup script

This behaves in the following way:

  1. when packaging it should always build the C-Extension and fail hard if it fails to build since that's either a bug or an error in the build environment. This is done by detecting the build environment based on some environment variables. As mentioned in the scikit-build-core issue this part could be achieved using overrides

  2. when building outside of a packaging environment it should attempt to build the C-Extension but fall back to a pure Python implementation if the installation fails and print a warning. This doesn't affect most users, since I do provide wheels for most platform

I don't believe 2) can be implemented using scikit-build-core so far.

In addition one pre-requirement for 2) is that cmake and ninja are only installed using pip if:

  • there is no sufficient version installed on the system
  • there is a wheel available

I currently implement this pre-requirement in https://github.com/rapidfuzz/RapidFuzz/blob/main/_custom_build/backend.py. Not sure whether this is already supported by scikit-build-core. I know @henryiii did add at least some detection logic for available wheels at some point as well: https://github.com/scikit-build/scikit-build-core/blob/main/src/scikit_build_core/resources/known_wheels.toml.

from levenshtein.

LecrisUT avatar LecrisUT commented on August 29, 2024

I don't believe 2) can be implemented using scikit-build-core so far.

Couldn't this be controlled within CMake, e.g. adding an advanced option ALLOW_FALLBACK (normally I insist projects to add namespaced options, but for pip installed projects only I concur). This can then be controlled via overrides, although if it's strictly limited to PyPI packaging, I would just add appropriate --config-settings option to pip. One great point of scikit-build-core design is that it is very close to native CMake and Python so anything can be designed with enough CMake grease.

In addition one pre-requirement for 2) is that cmake and ninja are only installed using pip if:

  • there is no sufficient version installed on the system
  • there is a wheel available

The default is to try the system first and then fallback to pip. Version checking is implemented. Changing the preference order would be tricky for packagers.

from levenshtein.

maxbachmann avatar maxbachmann commented on August 29, 2024

The default is to try the system first and then fallback to pip. Version checking is implemented. Changing the preference order would be tricky for packagers.

Yes but this breaks in the following case:

  • cmake not available on the system
  • no cmake wheel available for the platform
  • cmake builds from source and fails
    • this is actually very likely to fail since building cmake requires both cmake and a c-compiler.

In this case I still want the fallback version to be installed. This pretty much rules out any solutions that are implemented in cmake.

Couldn't this be controlled within CMake, e.g. adding an advanced option ALLOW_FALLBACK (normally I insist projects to add namespaced options, but for pip installed projects only I concur).

Maybe that would be possible. I don't know how I can allow python_add_library to fail without an error though.

although if it's strictly limited to PyPI packaging, I would just add appropriate --config-settings option to pip

I am not familiar with this option. So not sure how it would help me in my particular case.

from levenshtein.

LecrisUT avatar LecrisUT commented on August 29, 2024

In this case I still want the fallback version to be installed

Aaah, now I understood. Wow that's a tricky set of conditions to support. Kudos for the extra consideration put. In that case I am truly lost. How are the pure-python files supposed to work in that case, maybe there is a design using multi-python project like hatch + hatchling have in their repository. I was considering adding that for my project as well, but didn't get around to it, so I don't have clear design to recommend.

how I can allow python_add_library to fail without an error though.

Why would it fail? This is defined in FindPython and it's just a wrapper of add_library. If you have an example of that failure can you share, and I'll try to forward it to upstream CMake?

although if it's strictly limited to PyPI packaging, I would just add appropriate --config-settings option to pip

I am not familiar with this option. So not sure how it would help me in my particular case.

Here's an example. Hopefully the naming is enough to indicate what it does in that case.

from levenshtein.

maxbachmann avatar maxbachmann commented on August 29, 2024

How are the pure-python files supposed to work in that case, maybe there is a design using multi-python project like hatch + hatchling have in their repository.

I install the Python and C implementations side by side and import the fastest version available.
You can see the implementation here: https://github.com/rapidfuzz/RapidFuzz/blob/8af875c1227ef0a034f46a1d6d46ab35c188d899/src/rapidfuzz/_utils.py#L109 and https://github.com/rapidfuzz/RapidFuzz/blob/main/src/rapidfuzz/fuzz.py

Essentially when the library is imported I try the following imports:

  • if the cpu supports AVX2 and AVX2 optimized binary is available import it
  • same for SSE2
  • if a binary is available import it
  • else import the pure Python version

If the user specifically wants to get the C++ or Python variant he can override the behaviour by setting the RAPIDFUZZ_IMPLEMENTATION environment variable before importing the library.

This allows everyone to get as fast of an implementation as possible without losing the compatibility provided by a pure Python implementation. Most users will get the C++ implementation through binary wheels or their package manager anyway. So this really only affects more exotic targets.

The downsides of the implementation are:

  • I do print a warning when the pure Python version is installed but unless you set up the verbosity of pip this isn't visible to users
  • packaging tools that create a single binary with Cpython and all dependencies injected like pyinstaller won't find what they have to include. For pyinstaller specifically I do maintain a list of imports for this reason: https://github.com/rapidfuzz/RapidFuzz/blob/main/src/rapidfuzz/__pyinstaller/hook-rapidfuzz.py (this reminds me that I still have to include the avx2 and sse2 versions in there 😅 )

Why would it fail? This is defined in FindPython and it's just a wrapper of add_library. If you have an example of that failure can you share, and I'll try to forward it to upstream CMake?

I just assumed that's what's registering it to be compiled as well. All I really wanted to say is that I don't know how I would mark this using an ALLOW_FALLBACK option so it doesn't fail on compilation failure. Really doesn't matter too much though since the fact that this requires cmake already rules this out

from levenshtein.

LecrisUT avatar LecrisUT commented on August 29, 2024

I install the Python and C implementations side by side and import the fastest version available.

Interesting, it's different to my situation when I want to link to either system or python bundled library. In your case I would definitely split into 2 packages making the pure python an optional dependency but preferred in import workflow (just to give the user maximum control). Dealing with AVX2 preference would be tricky though, didn't think far enough around that option. But even without changing the order I think it would be a good design to split in 3 packages: interface, compiled, pure-python (or combine interface and pure-python if your fallback allows it). It would also make it clearer for the user what version they want to use. Don't know if there is an appropriate way to use markers?

packaging tools that create a single binary with Cpython and all dependencies injected like pyinstaller won't find what they have to include

Didn't know about pyinstaller, and indeed it seems fairly dangerous. That seems like a nightmare to support even in pure-python.

I just assumed that's what's registering it to be compiled as well

Ah, I thought you were concerned about configure stage failures. If it's build stage, you have a bit more control on scikit-build-core if you filter specific build targets.

from levenshtein.

henryiii avatar henryiii commented on August 29, 2024

I think this can be supported with a few more override settings. We could provide:

  • if.failed: If the build fails, the config settings gets rerun, and if any match this, the new configuration is used. This only happens once.
  • if.no-system-cmake: Would take two options, "known-wheel" or "always".
  • error-message: This would be override-only, and if it's present, the error message is printed.

Should we allow failed to take the condition of failure? I think the conditions would be "configure", "build", and "install". Though maybe it's better to just wait until someone asks for the ability to differentiate?

The cmake one would trigger if you were not on a platform with a known cmake wheel and there's no sufficient CMake present, or always if there's no system CMake. If this matches,
then it affects if cmake is requested, as well.

This is what pure-python on non-supported CMake system would look like:

[[tool.scikit-build.overrides]]
if.no-system-cmake = "known-wheel"
wheel.cmake = false

Or if the build fails:

[[tool.scikit-build.overrides]]
if.failed = true
wheel.cmake = false

Custom message on failure:

[[tool.scikit-build.overrides]]
if.failed = true
error-message = "The build failed!"

I think this would add a lot of flexibility; for an unrelated example, you could fail on unsupported Pythons:

[[tool.scikit-build.overrides]]
if.python-version = ">=3.13"
error-message = "Python 3.13+ is not supported yet!"

I'm looking at a way to allow the cmake wheel to trigger the bootstrapping feature, and I think this would be a step in that direction too.

This is a very rough draft of what I'm thinking.

from levenshtein.

maxbachmann avatar maxbachmann commented on August 29, 2024

I assume with this I could implement my handling as:

[tool.scikit-build]
wheel.cmake = true

[[tool.scikit-build.overrides]]
if.no-system-cmake = "known-wheel"
wheel.cmake = false

[[tool.scikit-build.overrides]]
if.failed = true
wheel.cmake = false

[[tool.scikit-build.overrides]]
if.any.env.CIBUILDWHEEL = "1"
if.any.env.CONDA_BUILD = "1"
if.any.env.PIWHEELS_BUILD = "1"
if.any.env.RAPIDFUZZ_BUILD_EXTENSION = "1"
wheel.cmake = true

# whether this is required would probably depend on
# 1) is the override above triggered both for the failure + non failure case
# 2) what does the failure handling do if the same config is set both for failure and non failure
[[tool.scikit-build.overrides]]
if.failed = true
if.any.env.CIBUILDWHEEL = "1"
if.any.env.CONDA_BUILD = "1"
if.any.env.PIWHEELS_BUILD = "1"
if.any.env.RAPIDFUZZ_BUILD_EXTENSION = "1"
error-message = "failed to build C++ Extension in a packaged build"

from levenshtein.

henryiii avatar henryiii commented on August 29, 2024

I think we should make the error-message only appear if the build fails, and add a fail = true option. Otherwise, looks good, I think?

from levenshtein.

maxbachmann avatar maxbachmann commented on August 29, 2024

Yes I think this would cover all of my needs and would be a lot more concise than my current implementation with scikit-build 👍

I will start a branch porting over the existing set of features and extend it as things get added to scikit-build-core

from levenshtein.

henryiii avatar henryiii commented on August 29, 2024

Any ideas for better naming on if.no-system-cmake = "known-wheel"? I don't think that reads very well.

from levenshtein.

LecrisUT avatar LecrisUT commented on August 29, 2024

No clue from my side. To me if.system-cmake reads like "did we use the cmake from system". Maybe if.cmake-vendor can be a more general form of that, although vendor sounds wrong to me here. And maybe if.have-cmake as a catch-all for if any cmake is found?

from levenshtein.

henryiii avatar henryiii commented on August 29, 2024

To me if.system-cmake reads like "did we use the cmake from system".

That great, that's what it is supposed to be, except I'm not sure how to indicate the "and there's no known wheel for this platform" condition. if.no-system-cmake = "and-no-known-wheel"?

from levenshtein.

LecrisUT avatar LecrisUT commented on August 29, 2024

To me if.(no-)system-cmake reads more like it would be a bool and would evaluate to (true)false when a wheel is used. if.have-cmake would be more explicit of no system cmake and no wheel.

from levenshtein.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.