git-submerge's Introduction

Suppose you have a repo with a submodule. Their collective history might look like this:

repository   A-B----D-E-F----H-----K----M---O   master
                   ;        ;          ;
                  ;        ;          ;
submodule        C--------G----I-J---L----N     master

(Semicolons are gitlinks; we'll use slashes for merges.)

After running git submerge submodule, the history will look like this:

repository   A-B---D'-E'-F'--H'--------K'--M'----O'   master
                  /         /             /
                 C---------G'----I'-J'---L'---N'      sub-master

The following things happened:

  • submodule got replaced by an ordinary directory;

  • submodule's history became part of the repo's history;

  • where submodule updates were previously (commits D, H, M), we now have merge commits;

  • the yet-unmerged "tail" of the submodule history (commit N) is given its own branch so you can merge it yourself later (#18).

ATTENTION! Just as any other kind of history rewriting, git-submerge changes the hashes of the commits, so you shouldn't run it on published histories. Furthermore, beware of bugs! Run this on a fresh clone of your repository, and never delete the old history until you're reasonably sure that the new one is what you expect it to be.

Before using git-submerge, it's recommended to take a look at a simpler approach.

Dealing with dangling references

It might so happen that git-submerge stumbles upon a commit in the main repo which references a submodule's commit which doesn't exist. The reason this happens is that submodule's history has been rewritten sometime after the commit to the main repo was made, so now the main repo references something that is gone.

Rewriting already published histories is generally frowned upon, precisely due to the problem described above, but it still happens. git-submerge provides you with a couple of flags that you can use to retain as much of your history as possible. Let's quickly describe what they are, and then we'll take a look at how one can use them.

The first of those options is --mapping, accepting two arguments we'll call "old commit id" and "new commit id". Whenever git-submerge finds a commit in the main repo that points to "old commit id" in the submodule, it'll pretend that it sees "new commit id" instead, and will go on with its business.

The second option is --default-mapping, accepting one argument we'll call "default commit id". If git-submerge finds a dangling reference which isn't mentioned in any of the --mappings, it'll use --default-mapping. Simple, eh?
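To make the lookup order concrete, here's a minimal sketch in Rust. It is not git-submerge's actual code: the function name `resolve_commit` and its parameters are made up for illustration, and commit IDs are plain strings rather than real Git object IDs.

```rust
use std::collections::{HashMap, HashSet};

/// Decide which submodule commit to use for a reference found in the main repo.
///
/// `existing` holds the IDs present in the submodule's history, `mappings`
/// holds the pairs passed via `--mapping`, and `default_mapping` is the value
/// of `--default-mapping`, if it was given. All names are hypothetical.
fn resolve_commit(
    referenced: &str,
    existing: &HashSet<String>,
    mappings: &HashMap<String, String>,
    default_mapping: Option<&str>,
) -> Option<String> {
    // An explicit --mapping for this ID always wins.
    if let Some(new_id) = mappings.get(referenced) {
        return Some(new_id.clone());
    }
    // The commit still exists in the submodule: use it as-is.
    if existing.contains(referenced) {
        return Some(referenced.to_string());
    }
    // Dangling reference: fall back to --default-mapping, if any.
    default_mapping.map(|id| id.to_string())
}
```

A `None` result corresponds to the case where a reference is dangling and neither option covers it.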

Now, as promised, let's look at an example. Suppose you've run git-submerge, and it printed out the following:

The repository references the following submodule commits,
but they couldn't be found in the submodule's history:

aaaabbbbccccddddeeeeffff0000111122223333
4444555566667777888899990000aaaabbbbcccc
ddddeeeeffff0000111122223333444455556666

You can use --mapping and --default-mapping options to make
git-submerge replace these commits with some other, still
existing, commits.

The best-case scenario for you is that you find a repo that still has these commits. You can then look at their metadata (commit message, date etc.) and find the corresponding commits in the submodule's new history.

Another, much more cumbersome, option is to find the aforementioned dangling IDs in your main repo's history (git log -S to the rescue!), then compare with your submodule's history and simply guess at what new commit IDs you could use.

You can then add a few --mappings, and the problem will be resolved.

The worst-case scenario is that you can't find any trace of the old history, and guessing didn't help either. In that case, you'll have to create a new commit in submodule explaining that some of its history has been lost and you can't recover it. Then you can pass that commit's ID to --default-mapping, and the resulting history will at least have an explanation of why some commits are broken.

Building

git-submerge requires rustc 1.15+ and cargo 0.16+, so you might need to update your build environment first:

$ rustup update

NixOS users can use the Nix shell; it'll fetch Rust nightly:

$ nix-shell

After that, it's the usual jazz:

$ cargo build

Testing

To check that your build behaves the way the developers expect, do the following:

  1. Prepare a directory for your tests:

    $ mkdir /tmp/git-submerge-testbed
  2. Create the submodule repo:

    $ cd /tmp/git-submerge-testbed
    # Assuming you have git-submerge cloned to /home/user/git-submerge
    $ git clone /home/user/git-submerge sub
    # ...otherwise
    $ git clone https://github.com/Minoru/git-submerge.git sub
    $ cd sub
    $ git reset --hard poc-submodule
  3. Create the main repo:

    $ cd /tmp/git-submerge-testbed
    # Assuming you have git-submerge cloned to /home/user/git-submerge
    $ git clone /home/user/git-submerge repo
    # ...otherwise
    $ git clone https://github.com/Minoru/git-submerge.git repo
    $ cd repo
    $ git reset --hard poc-repo
    # Removing upstream remote so that `git submodule` looks for
    # submodule repo in our testbed, not in the place we cloned from
    $ git remote rm origin
    $ git submodule update --init
  4. Run git-submerge:

    # This assumes you've updated your path like so:
    # $ export PATH=/home/user/git-submerge/target/debug/:$PATH
    # Alternatively, you can use full path instead of "git submerge".
    $ git submerge sub
  5. Check the result:

    $ git fast-export master > /home/user/git-submerge/test/expected.stream
    $ cd /home/user/git-submerge/test
    $ git diff expected.stream

    If everything went well, git diff shouldn't find any differences, and there will be no output.

    Don't forget to clean up afterwards!

    $ git checkout expected.stream

Useful tips

  • When viewing the rewritten history with git log --patch, add -m, -c or --cc option; they all enable diffs for merge commits (with slightly different presentation—just pick the one you like). The reason this is important is that your original history might have had commits where the submodule is updated and some changes are made; now that such commits are turned into merges, Git assumes that file changes were merge conflict resolutions, and hides them from the diffs.

git-submerge's People

Contributors

fornever, minoru

git-submerge's Issues

Check that we aren't changing dir metadata when rewriting history

In rewrite_submodule_history, we're creating a new dir by moving all existing files under "submodule_dir/". IndexEntry has some dates and other fields, and I don't know how they are initialized in this case. We should find out and make sure the submerge process doesn't add spurious data there.

Run rustfmt as part of pre-commit hook

Automatically enforcing consistent coding style is bliss, especially since I'm not a seasoned Rustacean yet and often make silly formatting mistakes myself.

Migrate tags

Split from #6.

Tags that point to the old history have no use, and presumably this history is not yet published, so we can do whatever we want. So let's move those tags onto new commit IDs.

Panics if a commit references nonexistent submodule commit

  • Add --default-mapping <commit> that will use <commit> instead of any that couldn't be found. Option can be specified only once;
  • check that <commit> above really exists in original submodule history;
  • add --mapping <commit1> <commit2> that will map <commit1> to <commit2>. Option can be specified multiple times;
  • check that <commit2> above really exists in original submodule history;
  • when dangling references are found, git-submerge should provide an error message that points to aforementioned options;
  • document the circumstances under which user might need to use these options.

Run cleanup even on panics

If panic! is called, we should remove the submodule's remote (and maybe some other things that got added since I filed this issue).

Use TreeBuilder instead of Index

rewrite_submodule_history uses git2::Index to create a new Tree when rewriting history, but it looks like TreeBuilder is the right way to do this with libgit2. Check if it's really so and refactor as necessary.

Migrate branches

Split from #6.

We should rewrite all the commits that are reachable from all branches, and then migrate those branches onto new commit IDs. There's no point in migrating just master.

Make git-submerge compile with rustc 1.18

Current stable Rust version is 1.18, and it will be until 1.19 is released around July 20th. I hope to push out our 0.5 release by the end of this month, so I definitely need to support 1.18, otherwise the early-adopter base will be further limited to people who have beta or nightly installed.

Add Travis

  • Set up a build on Linux and Mac OS
  • Add a badge to README

The "tail" of submodule's history should be given a branch name

Suppose there are commits in the submodule's history that aren't yet used in the main repo. These commits should be given a branch, so that they aren't lost to git gc, and so the user can merge them into the main history later.

  • Remove reference to this issue from README once it's resolved

Automate testing

Now that we have a full testing recipe in our README, we should automate the testing.

I'd write a shell script for this, but /bin/sh is not available on Windows. Perhaps writing a Rust program that incorporates everything is a way to go? @ForNeVeR, what do you think?

  • Write a tool that runs the tests and reports via an exit code
  • Replace instructions in README with short explanation of how to run the tool
  • Update .travis.yml to run the tool on Travis
  • Update appveyor.yml to run the tool on Appveyor

Support submodules in subdirectories

Grabbing some random repo with submodules, we can see that git-submerge 4b79e95 works fine for submodules residing in repo's root, but fails for submodules that are in the subdirectories:

$ git clone --recurse-submodules git@github.com:dgmorales/dgmorales.github.io.git
$ cd dgmorales.github.io
$ git submerge sysadmin ; echo $?
Merging sysadmin...
0
$ git submerge cpbr10 ; echo $?
Merging cpbr10...
0
$ git submerge talks/cm-pas ; echo $?
Merging talks/cm-pas...
thread 'main' panicked at 'Couldn't remove submodule path from TreeBuilder: Error { code: -1, klass: 14, message: "Failed to remove entry. File isn\'t in the tree - talks/cm-pas" }', /checkout/src/libcore/result.rs:860
note: Run with `RUST_BACKTRACE=1` for a backtrace.
101

Backtrace shows the panic occurs at src/main.rs:209.

Migrate branches and tags

Since we're rewriting history, it makes sense to move all branches and tags onto the rewritten version so that user doesn't get confused and doesn't need to re-run git-submerge in each branch manually (and then fix up the tags manually as well).

Look into adopting clippy checks

Clippy (https://github.com/Manishearth/rust-clippy) is a linter for Rust. Maybe we can run it in a pre-commit hook, or on Travis?

Submodule's remote should have a random name

We should choose submodule's remote name randomly in order to ensure we don't overwrite any of user's remotes. A simple check for existence of the name we picked will be good, too.
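A minimal sketch of the idea, using only the standard library. The function name `pick_remote_name` and the `submerge-` prefix are assumptions made up for this example, and `RandomState` is used here only as a cheap source of unpredictable numbers; real code would more likely use a proper random number generator such as the `rand` crate.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashSet;
use std::hash::{BuildHasher, Hasher};

/// Pick a remote name that doesn't clash with any of the user's remotes.
/// `existing` is the set of remote names already present in the repo.
fn pick_remote_name(existing: &HashSet<String>) -> String {
    loop {
        // RandomState is randomly seeded by the standard library, so the
        // hash of "nothing" is an unpredictable 64-bit value that differs
        // from one RandomState to the next.
        let random = RandomState::new().build_hasher().finish();
        let candidate = format!("submerge-{:016x}", random);
        // The existence check: retry if the user already has this remote.
        if !existing.contains(&candidate) {
            return candidate;
        }
    }
}
```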

Don't use .expect() for errors we consider possible

.expect() should be reserved for stuff that we didn't expect to ever happen. Anything we expect to happen in some scenario or another, we should handle explicitly by giving a custom error message and returning a designated exit code.
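Roughly, this could look like the sketch below. The error variants, messages and exit codes are invented for illustration and don't reflect what git-submerge actually reports.

```rust
use std::fmt;

/// Errors we expect to happen in some scenario (hypothetical variants).
#[derive(Debug, PartialEq)]
enum SubmergeError {
    SubmoduleNotFound(String),
    DanglingReferences(usize),
}

impl SubmergeError {
    /// Each expected error gets its own designated exit code.
    fn exit_code(&self) -> i32 {
        match *self {
            SubmergeError::SubmoduleNotFound(_) => 2,
            SubmergeError::DanglingReferences(_) => 3,
        }
    }
}

impl fmt::Display for SubmergeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            SubmergeError::SubmoduleNotFound(ref path) => {
                write!(f, "Couldn't find a submodule at '{}'", path)
            }
            SubmergeError::DanglingReferences(count) => {
                write!(f, "{} submodule commits couldn't be found", count)
            }
        }
    }
}

/// Print a custom message for expected errors and return the exit code,
/// instead of panicking via `.expect()`.
fn report(result: Result<(), SubmergeError>) -> i32 {
    match result {
        Ok(()) => 0,
        Err(error) => {
            eprintln!("Error: {}", error);
            error.exit_code()
        }
    }
}
```

The value returned by `report` would then be passed to `std::process::exit` at the very end of `main`.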

Remove .gitmodules if it was created in the same commit that introduced the submodule

If the .gitmodules file was created in the same commit that first mentions the submodule we're currently merging, .gitmodules should be removed. The rationale: if the user didn't have it and Git created it for them, the user probably didn't care about the file at all, and would not have created it manually on this commit. (Generally we can't assume that the user didn't create something on a whim, but in this case I think we can be pretty sure.)

Automating mappings search

Consider the best-case scenario of dealing with dangling references, as described in README:

The best-case scenario for you is that you find a repo that still has these commits. You can then look at their metadata (commit message, date etc.) and find the corresponding commits in the submodule's new history.

This could be automated:

  1. user redirects git-submerge's error to a file. It doesn't even need to be a special format—we can just read anything that looks like a SHA-1, and ignore the rest;

  2. user runs some other tool (let's call it helper) in a repository that might still have some of those old commits. The tool looks up each SHA-1, and if it finds the object, it looks through the rest of the objects to find any that have the same metadata (date, author, commit message etc.). We'll presume that the dangling references are due to cherry-picking.

  3. The tool outputs a mappings file in the format acceptable to --mappings-file (yet to be implemented).
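The first step ("read anything that looks like a SHA-1, and ignore the rest") can be sketched as follows. The function name `extract_sha1s` is hypothetical, and "looks like a SHA-1" is taken to mean a standalone run of exactly 40 hex digits.

```rust
/// Collect everything in `text` that looks like a SHA-1: a run of exactly
/// 40 hexadecimal digits surrounded by non-hex characters (or text edges).
fn extract_sha1s(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_ascii_hexdigit())
        // Splitting on non-hex characters leaves maximal runs of hex digits;
        // shorter or longer runs are not commit IDs.
        .filter(|token| token.len() == 40)
        // Git prints IDs in lowercase, so normalize just in case.
        .map(|token| token.to_lowercase())
        .collect()
}
```

Fed the error message shown in the README, this returns the three dangling IDs in the order they were printed.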

Technically this breaks my ideology of "commit is a statement", because if commit ID changed after cherry-picking, then the state of this commit is different from original (even if the metadata is the same). So if we just use it in place of another, we might be breaking a statement someone made about the repo (because they included another commit, not this one).

But other users of git-submerge might not care about my ideology.

Don't remove .gitmodules if there're other modules

Consider that I have a repository with multiple submodules like this.

After I run git-submerge cscctest, it deletes the .gitmodules, but there're other modules I also want to import!

I think that git-submerge should just remove a reference to the merged module from .gitmodules, and it should remove a .gitmodules after that only if it's empty.
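Here's a rough sketch of that behaviour, operating on the text of .gitmodules. It is a simplification: it assumes section headers are written exactly the way Git writes them (`[submodule "name"]` on a line of its own) and doesn't handle the full Git config syntax. The function name `remove_submodule_section` is made up for this example.

```rust
/// Remove the `[submodule "<name>"]` section from the text of `.gitmodules`.
/// Returns `None` if nothing but whitespace is left, which means the file
/// itself can be removed.
fn remove_submodule_section(gitmodules: &str, name: &str) -> Option<String> {
    let header = format!("[submodule \"{}\"]", name);
    let mut kept = String::new();
    let mut skipping = false;
    for line in gitmodules.lines() {
        let trimmed = line.trim();
        if trimmed.starts_with('[') {
            // A new section begins; skip it only if it's the merged module.
            skipping = trimmed == header;
        }
        if !skipping {
            kept.push_str(line);
            kept.push('\n');
        }
    }
    if kept.trim().is_empty() {
        None
    } else {
        Some(kept)
    }
}
```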
