Giter VIP home page Giter VIP logo

Comments (27)

jpoimboe avatar jpoimboe commented on July 4, 2024 1

Because the file names in the .cmd files are relative to the current directory, I think there's a more precise way to do this. We can detect whether the filenames are relative to the kernel root (old way) or current directory (new way) and adjust the algorithm accordingly. Let me try to come up with something (time willing).

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

The kernel commit that cause the issue is

commit cd968b97c49214e6557381bddddacbd0e0fb696e
Author: Masahiro Yamada <[email protected]>
Date:   5 weeks ago

    kbuild: make built-in.a rule robust against too long argument error

I guess we should either revert this, or ship some fix in kpatch-build.

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

Hi @liu-song-6 : Sorry for the late reply, I saw the same error yesterday, but haven't had a chance to take a closer look. I haven't had to touch the ancestor logic in a few years, so I'll have to spend a some time remembering how it all worked (IIRC there was a caching mechanism or shortcut method of sorts?) Anyway, since upstream has motivation for the change, not just touching things for aesthetic reasons, I think kpatch will have to cope with the shorter .o filenames.

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

Something like this should fix it. Or, maybe we only need to grep for the short name?

from kpatch.

jpoimboe avatar jpoimboe commented on July 4, 2024

Unfortunately I think that patch won't be precise enough, since there can be duplicate file names across the tree, it might find the wrong one when doing a deep find.

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

Unfortunately I think that patch won't be precise enough, since there can be duplicate file names across the tree, it might find the wrong one when doing a deep find.

With basename grepping like @liu-song-6 proposed here, what if the deep find worked it's way back to the root of the source tree, searching along the way? (At the moment I believe it starts at the top.) This way we're not looking for something generic like init.o across the entire tree, but in gradually higher directories (ie, a/b/c/d/, then a/b/c/, a/b/, etc.)

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

Do we have an example patch that may break kpatch-build due to the duplicated name? I tried with some changes in fs/autofs/init.c, it fails for some other reasons.

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

Do we have an example patch that may break kpatch-build due to the duplicated name? I tried with some changes in fs/autofs/init.c, it fails for some other reasons.

I don't have one at hand (I only used "init.c" an example of a filename that could feasibly be duplicated). Perhaps another way to look at the problem would be to feed the original vs. new implementation a kernel tree full of .o files and see if any incorrect parent objects are generated.

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

I found an issue with the original version: we may match aaa.o with bbb-aaa.o. a4287fb fixed this.

But I guess it is still not 100% accurate?

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

I ran two experiments for all object files from v5.19-rc7 and a .config similar to rhel-9:

  1. "Old" - The original find_parent_obj() with torvalds/linux@cd968b97c492 reverted > linux-5.19-rc7-revert
  2. "New" - The proposed new find_parent_obj() with torvalds/linux@cd968b97c492 in place > linux-5.19-rc7

A few observations:

  • New runs many more "deep finds"
  • When old and new report "invalid ancestor" they often disagree about the ancestor file
  • New seems to determine more ancestors than old, for example:
    • Old = arch/x86/events/intel/cstate.o -> invalid_ancestor arch/x86/events/intel/cstate.o
    • New = arch/x86/events/intel/cstate.o -> arch/x86/events/intel/intel-cstate.ko
  • However some (not limited to these) are incorrect:
    • (CONFIG_CPU_FREQ=y)
      Old = drivers/cpufreq/cpufreq.o -> vmlinux
      New = drivers/cpufreq/cpufreq.o -> drivers/cpufreq/acpi-cpufreq.ko
    • (CONFIG_PCIEAER=y)
      Old = drivers/pci/pcie/err.o -> vmlinux
      New = drivers/pci/pcie/err.o -> drivers/pci/pcie/aer_inject.ko

See attached test script and results (lmk if I can explain them further). In the end, the new find_parent_obj() seems to run faster and finds more ancestors, however are a few inaccurate results. I'm not sure how to easily test the same for the existing algorithm to compare.

I wonder since we're already wrapping ld with kpatch-ld, if we could somehow dump .o -> .ko / vmlinux mappings in this step? (It's late, so I'm only brainstorming here :)

results.tar.gz

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

See attached test script and results (lmk if I can explain them further). In the end, the new find_parent_obj() seems to run faster and finds more ancestors, however are a few inaccurate results. I'm not sure how to easily test the same for the existing algorithm to compare.

I haven't figured out how to run the test script. I got many invalid_ancestor:

find-kobj-test drivers/md/md.o
drivers/md/md.o DID_DEEP_FIND=1 invalid_ancestor_drivers/md/md.o_for_drivers/md/md.o

Did I miss something?
Also it seems find_parent_obj() in the test script is not the same as a4287fb.

I wonder since we're already wrapping ld with kpatch-ld, if we could somehow dump .o -> .ko / vmlinux mappings in this step? (It's late, so I'm only brainstorming here :)

Are we using kpatch-ld? I didn't find it.

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

I haven't figured out how to run the test script.

Sorry it was hacked together, I had find_parent_obj() copied from the original and then find_parent_obj_new() for the new version and just manually adjusted their calls for each case. Here is a version of the script with the function from a4287fb:
find-kobj-test-new.txt

And I ran it from the top of my kernel build like:
$ find . -name '*.o' -exec ~/kpatch/find-kobj-test-new {} \; | tee linux-5.19-rc7

I got many invalid_ancestor:
find-kobj-test drivers/md/md.o
drivers/md/md.o DID_DEEP_FIND=1 invalid_ancestor_drivers/md/md.o_for_drivers/md/md.o

Right, the new find_parent_obj() finds a few more ancestors while missing ones like this. Does this patch work with this PR?

diff --git a/drivers/md/md.c b/drivers/md/md.c
index c7ecb0bff..1ca189ec2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -213,6 +213,7 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 {
        int ret = 0;
 
+       nop();
        if (rdev && !rdev_need_serial(rdev) &&
            !test_bit(CollisionCheck, &rdev->flags))
                return;

I get the same invalid ancestor error as you pointed out.

I wonder since we're already wrapping ld with kpatch-ld, if we could somehow dump .o -> .ko / vmlinux mappings in this step? (It's late, so I'm only brainstorming here :)
Are we using kpatch-ld? I didn't find it.

Oops, I thought we had a separate wrapper script for the linker, but it's all included in kpatch-build/kpatch-cc. Originally I thought we might be able to catch a single ld invocation for vmlinux and then for modules to which we could snarf for ancestor data, but that is not the case. It might be possible to work backwards from here somehow, but unfortunately it doesn't look straightforward to me.

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c7ecb0bff..1ca189ec2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -213,6 +213,7 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 {
        int ret = 0;
 
+       nop();
        if (rdev && !rdev_need_serial(rdev) &&
            !test_bit(CollisionCheck, &rdev->flags))
                return;

I get the same invalid ancestor error as you pointed out.

This happens because we have

git -lw -e " md.o" xxx

where -w does not match with md.o with the space. Is there some grep match to fix this?

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

This happens because we have

git -lw -e " md.o" xxx

where -w does not match with md.o with the space. Is there some grep match to fix this?

Ah, I see. Word regexp prevents md.o from matching foomd.o, then the space was added to prevent foo-md.o, but that breaks "/md.o" matching. I can't think of any grep flag that handle all these cases.

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

I hacked this up this morning to try and special-case the new format:
master...joe-lawrence:kpatch:v5.19-built-in-short-paths

Not really tested and not really proud of the hack, but I think these functions could be refactored at some point in the future. Trying to regex or grep flag our way out of this feels very brittle. There's gotta be a better way. (Maybe if we wrap ar too we could snarf all the object files going into built-in.a's?)

edit: that branch may work for vmlinux and foo.o -> foo.ko, but doesn't work for files like crypto/ecdh_helper.o -> crypto/ecdh_generic.ko. I'll need to wrap my brain around filter_parent_obj() first.

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

I think this version of grep_for_parent only works for build-in .o files. AFAICT, it can't cover kernel modules.

from kpatch.

joe-lawrence avatar joe-lawrence commented on July 4, 2024

I think this version of grep_for_parent only works for build-in .o files. AFAICT, it can't cover kernel modules.

(Sorry for belated reply, I was on PTO.) Unfortunately I realized that shortly after posting it last week. We should enumerate all the potential cases across kernel releases to come up with a better solution. It feels like whack-a-mole at the moment. In the meantime, liu-song-6@b52499b should be reasonably accurate for you if you can manually sanity check its results?

from kpatch.

liu-song-6 avatar liu-song-6 commented on July 4, 2024

I guess it works for short term. It is not top priority for our production at the moment. But we do plan to use 5.19 as our next production kernel.

from kpatch.

jpoimboe avatar jpoimboe commented on July 4, 2024

I think I've got it working with jpoimboe/kpatch@f108002. I tested with @joe-lawrence 's script on 5.18 and 5.19. Any more testing would be welcome.

from kpatch.

androw avatar androw commented on July 4, 2024

jpoimboe@f108002 failed with

➜  ~ kpatch-build -s ./linux meminfo-string.patch 
Using source directory at /home/androw/linux
Testing patch file(s)
Reading special section data
Building original source
Building patched source
ERROR: kpatch build failed. Check /home/androw/.kpatch/build.log for more details.
➜  ~ tail /home/androw/.kpatch/build.log
  GEN     Module.symvers
checking file fs/proc/meminfo.c
patching file fs/proc/meminfo.c
  SYNC    include/config/auto.conf.cmd
/usr/libexec/kpatch/kpatch-cc gcc: unknown compiler
scripts/Kconfig.include:44: Sorry, this compiler is not supported.
make[2]: *** [scripts/kconfig/Makefile:77: syncconfig] Error 1
make[1]: *** [Makefile:629: syncconfig] Error 2
make: *** [Makefile:731: include/config/auto.conf.cmd] Error 2
make: *** [include/config/auto.conf.cmd] Deleting file 'include/generated/autoconf.h'

liu-song-6@b52499b worked for me

from kpatch.

jpoimboe avatar jpoimboe commented on July 4, 2024

@androw That looks like something wrong with your setup. I don't see how it could be related to my patch.

Did you have a file at /usr/libexec/kpatch/kpatch-cc?

Maybe something went wrong with your install. Instead try running with /git/path/to/kpatch-build directly from the git tree.

from kpatch.

androw avatar androw commented on July 4, 2024

Yes a file is present here.
Maybe something was wrong on my side. Tried again and got the following error:

➜  ~ ./aports/testing/kpatch/src/kpatch-0.9.6/kpatch-build/kpatch-build -s /home/androw/linux /home/androw/test.patch 
Using source directory at /home/androw/linux
Testing patch file(s)
Reading special section data
Building original source
Building patched source
Extracting new and modified ELF sections
realpath: --relative-to=.: No such file or directory
realpath: --relative-to=/home/androw/linux/fs/proc: No such file or directory
realpath: --relative-to=/home/androw/linux/fs/proc/..: No such file or directory
realpath: --relative-to=.: No such file or directory
[...]
realpath: --relative-to=./fs/pstore: No such file or directory
realpath: --relative-to=./tools/objtool: No such file or directory
realpath: --relative-to=./tools/objtool/arch/x86: No such file or directory
ERROR: invalid ancestor fs/proc/meminfo.o for fs/proc/meminfo.o. Check /home/androw/.kpatch/build.log for more details.

from kpatch.

jpoimboe avatar jpoimboe commented on July 4, 2024

Can you run with the '-d' option and share the output (at least up until the first error)?

from kpatch.

androw avatar androw commented on July 4, 2024

https://gist.github.com/androw/aa68cc751660f65a9d4a597b33d571c2

from kpatch.

jpoimboe avatar jpoimboe commented on July 4, 2024

realpath: --relative-to=.: No such file or directory

@androw I wonder if your version of 'realpath' is old? what does realpath --version show?

from kpatch.

androw avatar androw commented on July 4, 2024
➜  ~ realpath 
BusyBox v1.35.0 (2022-08-02 19:43:12 UTC) multi-call binary.

Maybe it would be better with the coreutils one.
Edit: confirmed to work with coreutils.

from kpatch.

jpoimboe avatar jpoimboe commented on July 4, 2024

Hm, for older and alternative toolchains I guess we need a bash alternative to realpath --relative-to

from kpatch.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.