scalar's People

Contributors

alameenshah, bbodenmiller, benpeart, changeworld, chrisd8088, derrickstolee, dscho, glensc, halterer, jamill, jeffhostetler, jeremyepling, jeschu1, jongwooo, jrbriggs, kant, kevin-david, kewillford, kivikakk, mitesch, mjcheetham, nickgra, pmj, ravi-saini35, rootulp, sanoursa, vdye, vtbassmatt, wilbaker, yehezkelshb

scalar's Issues

Create a perf test suite

To give us high confidence that customer satisfaction will be greater than it would be on VFS for Git, I think we want to measure identical scenarios on both VFS for Git sparse mode and scalar. Ideally we can also configure what part of the cone is available, so we can compare and contrast different sizes of enlistments. We should use representative sparse enlistments for different segments of our customer base.

Thoughts on the approach or execution?

[Mount Removal] Remove disk layout upgrade code

UPDATE

The upgrade steps are run as part of scalar mount, which will be going away as part of the mount removal process.

We should remove the upgrade code as part of removing the mount process. If we need to perform disk layout upgrades in the future, they will need to be driven by the service and/or the installer.


We no longer need back-compat logic for previous GVFS disk layouts. We will dramatically change the way we store the repo config, and hopefully do so before we ship to EA.

At some point, we will not allow breaking changes and then will need upgrade logic. Should we delete the disk layout code now and then redesign/reimplement the upgrade logic when we need it?

cc: @mjcheetham

Rename EVERYTHING

New name TBD.

  • Replace all instances of "GVFS" and "VFSForGit" (and variants) in the filenames using an exact-rename commit.
  • Drop the base "GVFS" folder in the same commit.
  • Perform text-based replacement of "GVFS", "VFSForGit", and variants in most places (some things need to stay, such as the Git package name).

Use a derivative of calver for scalar

It would be ideal if the versioning scheme gave a better indication of the time of release and the milestone associated with the bits.

I propose the following, where {build} = counter(SourceRef, 0)

Source ref                      Version template                Version example   Build number
refs/heads/releases/19.08.157   {yy}.{MM}.{Milestone}.{build}   19.08.157.1       Release-19.08.157.1
refs/pull/25/merge              10.20.{PRNum}.{build}           10.20.25.1        PR-25.1
refs/heads/master               {yy}.{MM}.{dd}.{build}          19.08.10.33       CI-master.33
refs/tags/tagname               {yy}.{MM}.{dd}.{build}          19.08.10.34       CI-tagname.34
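
To make the proposal concrete, here is a minimal Python sketch of the mapping in the table above. The function name scalar_version is hypothetical, and several details are assumptions rather than part of the proposal: {yy} and {MM} are taken from the build date even for release branches, the milestone is read from the last component of the release branch name, and any branch other than a release branch is treated like master. The {build} counter is passed in, since it would come from the build system.

```python
import re
from datetime import date


def scalar_version(source_ref: str, build: int, today: date) -> tuple[str, str]:
    """Return (version, build_number) for a source ref (hypothetical helper)."""
    yy, mm, dd = f"{today:%y}", f"{today:%m}", f"{today:%d}"

    # refs/heads/releases/<yy>.<MM>.<Milestone> -> {yy}.{MM}.{Milestone}.{build}
    # Assumption: only the milestone is read from the branch name.
    release = re.fullmatch(r"refs/heads/releases/\d+\.\d+\.(\d+)", source_ref)
    if release:
        version = f"{yy}.{mm}.{release.group(1)}.{build}"
        return version, f"Release-{version}"

    # refs/pull/<PRNum>/merge -> 10.20.{PRNum}.{build}
    pull = re.fullmatch(r"refs/pull/(\d+)/merge", source_ref)
    if pull:
        return f"10.20.{pull.group(1)}.{build}", f"PR-{pull.group(1)}.{build}"

    # refs/heads/<branch> and refs/tags/<tag> -> {yy}.{MM}.{dd}.{build}
    # Assumption: branches other than master follow the master row.
    other = re.fullmatch(r"refs/(?:heads|tags)/(.+)", source_ref)
    if other:
        return f"{yy}.{mm}.{dd}.{build}", f"CI-{other.group(1)}.{build}"

    raise ValueError(f"unrecognized source ref: {source_ref}")
```

With a build date of 2019-08-10, calling scalar_version("refs/heads/master", 33, date(2019, 8, 10)) gives the version 19.08.10.33 and the build number CI-master.33, matching the third row.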

Sparse: Check for superseding parents in input

BUG: if we run git sparse-checkout set A A/B, then A is registered as a recursive closure AND A/B is marked as a recursive closure. This also means that A is marked as a parent path.

As a result, Git complains that the patterns are not cone-style and reverts to the slow pattern-matching algorithm.

To fix, consider removing paths from the "parent" list if they are in the "recursive" list. Further: remove children from the recursive list.
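
The proposed fix can be sketched in Python as follows. This is only an illustration of the idea, not the Git implementation: normalize_cone_input is a hypothetical name, and the input is assumed to be a list of slash-separated directory paths.

```python
from pathlib import PurePosixPath


def normalize_cone_input(paths):
    """Return (recursive, parents) lists for cone-mode input (hypothetical helper)."""
    requested = {PurePosixPath(p.strip("/")) for p in paths if p.strip("/")}

    # Remove children from the recursive list: a requested path that has a
    # requested ancestor is already covered by it (A supersedes A/B).
    recursive = {p for p in requested if not any(a in requested for a in p.parents)}

    # Parent paths are the proper ancestors of the remaining recursive paths.
    parents = set()
    for p in recursive:
        for ancestor in p.parents:
            if ancestor != PurePosixPath("."):
                parents.add(ancestor)

    # Remove paths from the parent list if they are in the recursive list.
    # This is redundant after the child removal above, but it mirrors the
    # first half of the suggested fix.
    parents -= recursive

    return sorted(map(str, recursive)), sorted(map(str, parents))
```

For the input from the bug report, normalize_cone_input(["A", "A/B"]) returns (["A"], []), so A is recursive only and nothing is registered as a parent.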

Progress indicators in `git read-tree -mu HEAD`

scalar sparse --add needs progress indicators. These progress indicators need to be in two places (at least):

  1. BlobPrefetcher needs to provide feedback as it discovers and downloads blobs.

  2. git read-tree -mu HEAD needs to provide feedback as it populates the working directory.

These are very different solutions, so this issue will track git read-tree -mu HEAD.

[Mount Removal] Move repo registration from mount verb to clone verb

Repo registration (with Scalar.Service) should happen during scalar clone rather than mount.

Additionally, the clone verb itself should update the registration file (or each clone should create its own file) and Scalar.Service should read the file(s) to discover which repos have been registered.

Additionally:

  • Scalar.Service should also remove repos when it finds they're no longer on disk.
  • There should be a verb for manually registering/unregistering repos
  • Functional tests should not register when cloning or they should register in a test location that does not impact the installed Scalar.Service

Performance: git add

We need to investigate what we can do about git add in our target enlistment.

  1. git add -p from src took 31s.
  2. git add . from src took 43s.

These are no-op adds with fsmonitor and untracked cache.

Git: core.gvfs config setting

The core.gvfs config setting does a lot of things, including blocking unwanted commands.

This setting was dropped as part of the rename effort (#38) and should be put back for now.

However, that config option does a lot of things that we may not want in the Scalar world. Update Git to split those actions apart based on other config options, or add a core.scalar setting for our situation.

For example, we still want to block git gc, but that could be part of core.virtualizeobjects instead.

Remove LibGit2

Can we get rid of LibGit2 entirely? Here are some tradeoffs:

  • We currently track which downloaded objects are blobs or not. Dropping this stat would save time, and the batched read-object hook (#7) would make that less important.
  • CommitAndRootTreeExists() exists for a VFS for Git reason: to see if we need to prefetch the folders at a commit on clone time so we can generate an index before projecting. This isn't needed any more.
  • The LooseObjectsStep checks for corrupt loose objects. We'll have fewer objects with the batched read-object, and we could teach git pack-objects to clear corrupt objects, perhaps.

Outside of that last one, many of these changes are super small and don't have a huge impact on the full story.

Prototype git sparse verb

Goal: provide a user-friendly experience around configuring the sparse checkout.

Scope: Verb to add and remove entire directories from sparse enlistment.

Non-goals: High-performance application of sparse-checkout file.

Progress indicator for vfs helper

While acquiring a set of objects prior to a workdir changing operation like checkout or reset, we should show progress similar to fetch's.

Update Functional Test data after rename

The functional tests use an old copy of the GVFS repo, so all of the paths use GVFS in the names. Those paths were modified automatically as part of the rename operation (#38).

As functional tests are re-added, we will need to revert the changes to those paths, but it will be a manual process.

Remove GitHooksLoader

The GitHooksLoader should not be needed, as we only have the read-object hook, which is already native.

Set scalar.telemetry-pipe in installer

Set scalar.telemetry-pipe in the installer to get telemetry from scalar daemons.
This is a peer of gvfs.telemetry-pipe.

scalar.telemetry-pipe=scalar-c780ac06-135a-4e9e-ab6c-d41e2d265baa

We do NOT need the corresponding scalar.telemetry-id (like we have in gvfs).

[Mount Removal] Perform maintenance jobs in the service rather than the mount process

This is part of the work required to eliminate the mount process.

Special care will need to be taken regarding ACLs. The service runs with elevation, and we need to make sure that non-elevated git processes are still able to use the files produced by the maintenance tasks.

As an alternative, we should investigate running Scalar.Service as the user rather than as admin.

Investigate having the 'sparse' verb call 'git status' before finishing to prime the untracked cache

After adding a large set of cones to the repo using scalar sparse --add-stdin, the first git status took a long time:

~/ScalarTests/repo/src>git status
On branch master
Your branch is up to date with 'origin/master'.

It took 17.52 seconds to enumerate untracked files. 'status -uno'
may speed it up, but you have to be careful not to forget to add
new files yourself (see 'git help status').
nothing to commit, working tree clean

If the sparse verb called git status before finishing, users would have a better experience the first time they run git status themselves.

Install: scripted installation

We need to produce a simple scripted installation for macOS that pulls together scalar, git, gcm core, watchman, and internal tooling and correctly configures everything. This is to support demo scenarios and automation like perf and large build runs.

Resumability: batch requests in vfs helper

In order to have partial resumability in the event of a network failure, we should limit the number of objects we request in one go and ask for multiple batches rather than one large batch.

This also opens the door for parallelization later.
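
A minimal Python sketch of the batching idea is below. All names are hypothetical: fetch_batch stands in for whatever the helper uses to request a set of objects, done is a set of object IDs the caller keeps between attempts, and the default batch size is arbitrary.

```python
def download_in_batches(object_ids, fetch_batch, done, batch_size=100):
    """Request objects in fixed-size batches, recording each completed batch.

    `fetch_batch` is a placeholder callable that downloads one batch and
    raises (for example ConnectionError) on a network failure. `done` is
    updated in place, so a later call skips everything already fetched.
    """
    pending = [oid for oid in object_ids if oid not in done]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        fetch_batch(batch)   # may raise; earlier batches stay recorded
        done.update(batch)   # only recorded after the batch succeeds
```

If the network drops partway through, calling the function again with the same done set only requests the batches that had not completed. Because the batches are independent, they could later be handed to parallel workers.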

Planning: feature branch in microsoft/git

This issue is to facilitate discussion.

In microsoft/git#171, we introduce the git sparse-checkout builtin. This has the features we need to get moving on the sparse clones in Scalar, but it is not ready for merging into vfs-2.22.0. In particular, we need to get feedback from the mailing list before we take a hard dependency on it, especially in the shipped version with microsoft/vfsforgit.

Here is my proposal:

  • Create a new feature branch features/sparse-checkout in microsoft/git.

  • The feature branch will include all updates to sparse-checkout (#8) and batch object downloading (#7, #36).

  • As vfs-2.22.0 advances, we can dual-checkin if it is a critical change. This should happen rarely as we are mostly doing upstream-first development in Git for VFS.

  • As git/git and git-for-windows/git ship new versions, microsoft/git gets a new vfs-2.XX.0 branch. The features/sparse-checkout branch will then be rebased on top of that using a force-push.

  • As features/sparse-checkout updates, we generate installers with suffix -sc to indicate this is something to consume in Scalar but not VFS for Git.

This setup should allow us to merge PRs like #54 and start working on functional tests, follow-up features, and perf tests.

/cc @jrbriggs, @wilbaker, @jeffhostetler, @kewillford, @jeschu1, @mjcheetham, @garimasi514, @nickgra.

Sparse: Add functional test workflow

Create a functional test set that follows a typical workflow around a sparse enlistment:

  1. scalar clone --sparse=true
  2. Verify root files only.
  3. scalar sparse add
  4. Verify folders are added.

May be combined with #76.

Integrate fsmonitor with Watchman on macOS

For optimal performance, we need git status to run in O(modified) time. The fsmonitor feature exists in Git, and we should take advantage of it.

@dscho is working on this, but I can't assign it to him for some reason.

Progress indicators in BlobPrefetcher

scalar sparse --add needs progress indicators. These progress indicators need to be in two places (at least):

  1. BlobPrefetcher needs to provide feedback as it discovers and downloads blobs.

  2. git read-tree -mu HEAD needs to provide feedback as it populates the working directory.

These are very different solutions, so this issue will track BlobPrefetcher.

Rewrite README

The README is a leftover from VFS for Git. It needs updating. Perhaps it should just point to the roadmap for now?

Prefetch --stdin-folders-list doesn't match cone patterns

If we supply the same set of paths to git sparse-checkout add and scalar prefetch --stdin-folders-list, the prefetch command gets a smaller set of files than the sparse-checkout requires when writing files to disk. This leads to a very slow first checkout, even after prefetching.

The real solution is described in #36.

However, it may be worth a temporary fix to the BlobPrefetcher to match a few more paths and speed this up in the short term.
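
For reference, here is a Python sketch of which files a cone-mode sparse-checkout writes to disk for a given set of directories: everything under each requested directory, plus the files that sit directly in the repository root and in each ancestor of a requested directory. The issue does not say which files the prefetch misses; that the ancestor-level files are among them is only an assumption. The function name in_cone is hypothetical.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath(".")


def in_cone(file_path, recursive_dirs):
    """True if cone-mode sparse-checkout would include file_path (sketch)."""
    directory = PurePosixPath(file_path).parent

    # Files directly in the repository root are always included.
    if directory == ROOT:
        return True

    for d in recursive_dirs:
        cone = PurePosixPath(d.strip("/"))
        # Everything at or below a requested directory is included.
        if directory == cone or cone in directory.parents:
            return True
        # Files directly inside an ancestor of a requested directory are
        # included too, but sibling subdirectories of that ancestor are not.
        if directory in cone.parents:
            return True
    return False
```

With the single directory A/B/C requested, in_cone("A/build.props", ["A/B/C"]) is True while in_cone("A/X/y.c", ["A/B/C"]) is False. A prefetch that only walks A/B/C would skip the first file even though the checkout needs it.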

Scalar installs into C:\Program Files\GVFS

I'm not sure what instance of "GVFS" in the codebase causes the installer to write into C:\Program Files\GVFS, but it requires the GVFS.Service and other GVFS.Mount processes to be terminated for the Scalar installer to work.

Ongoing: Port work from VFS for Git

Work in microsoft/vfsforgit sometimes needs a corresponding change here in microsoft/scalar.

Add a comment linking to the PR(s) that need porting to Scalar.

(Use 👍 to indicate you are working on it, 🚀 to indicate the item is done, 👎 for "don't need".)
