arborist's Introduction

We've Moved! 🚚

The code for this repo is now a workspace in the npm CLI repo.

github.com/npm/cli

You can find the workspace in /workspaces/arborist

Please file bugs and feature requests as issues on the CLI and tag the issue with "ws:arborist".

github.com/npm/cli/issues

@npmcli/arborist

Inspect and manage node_modules trees.

There's more documentation in the docs folder.

USAGE

const Arborist = require('@npmcli/arborist')

const arb = new Arborist({
  // options object

  // where we're doing stuff.  defaults to cwd.
  path: '/path/to/package/root',

  // url to the default registry.  defaults to npm's default registry
  registry: 'https://registry.npmjs.org',

  // scopes can be mapped to a different registry
  '@foo:registry': 'https://registry.foo.com/',

  // Auth can be provided in a couple of different ways.  If none are
  // provided, then requests are anonymous, and private packages will 404.
  // Arborist doesn't do anything with these, it just passes them down
  // the chain to pacote and npm-registry-fetch.

  // Safest: a bearer token provided by a registry:
  // 1. an npm auth token, used with the default registry
  token: 'deadbeefcafebad',
  // 2. an alias for the same thing:
  _authToken: 'deadbeefcafebad',

  // insecure options:
  // 3. basic auth, username:password, base64 encoded
  auth: 'aXNhYWNzOm5vdCBteSByZWFsIHBhc3N3b3Jk',
  // 4. username and base64 encoded password
  username: 'isaacs',
  password: 'bm90IG15IHJlYWwgcGFzc3dvcmQ=',

  // auth configs can also be scoped to a given registry with this
  // rather unusual pattern:
  '//registry.foo.com:token': 'blahblahblah',
  '//basic.auth.only.foo.com:_auth': 'aXNhYWNzOm5vdCBteSByZWFsIHBhc3N3b3Jk',
  '//registry.foo.com:always-auth': true,
})

// READING

// returns a promise.  reads the actual contents of node_modules
arb.loadActual().then(tree => {
  // tree is also stored at arb.actualTree
})

// read just what the package-lock.json/npm-shrinkwrap says
// This *also* loads the yarn.lock file, but that's only relevant
// when building the ideal tree.
arb.loadVirtual().then(tree => {
  // tree is also stored at arb.virtualTree
  // now arb.virtualTree is loaded
  // this fails if there's no package-lock.json or package.json in the folder
  // note that loading this way should only be done if there's no
  // node_modules folder
})

// OPTIMIZING AND DESIGNING

// build an ideal tree from the package.json and various lockfiles.
arb.buildIdealTree(options).then(() => {
  // next step is to reify that ideal tree onto disk.
  // options can be:
  // rm: array of package names to remove at top level
  // add: Array of package specifiers to add at the top level.  Each of
  //   these will be resolved with pacote.manifest if the name can't be
  //   determined from the spec.  (Eg, `github:foo/bar` vs `foo@somespec`.)
  //   The dep will be saved in the location where it already exists,
  //   (or pkg.dependencies) unless a different saveType is specified.
  // saveType: Save added packages in a specific dependency set.
  //   - null (default) Wherever they exist already, or 'dependencies'
  //   - prod: definitely in 'dependencies'
  //   - optional: in 'optionalDependencies'
  //   - dev: devDependencies
  //   - peer: save in peerDependencies, and remove any optional flag from
  //     peerDependenciesMeta if one exists
  //   - peerOptional: save in peerDependencies, and add a
  //     peerDepsMeta[name].optional flag
  // saveBundle: add newly added deps to the bundleDependencies list
  // update: Either `true` to just go ahead and update everything, or an
  //   object with any or all of the following fields:
  //   - all: boolean.  set to true to just update everything
  //   - names: names of packages to update (like `npm update foo`)
  // prune: boolean, default true.  Prune extraneous nodes from the tree.
  // preferDedupe: prefer to deduplicate packages if possible, rather than
  //   choosing a newer version of a dependency.  Defaults to false, ie,
  //   always try to get the latest and greatest deps.
  // legacyBundling: Nest every dep under the node requiring it, npm v2 style.
  //   No unnecessary deduplication.  Default false.

  // At the end of this process, arb.idealTree is set.
})

// WRITING

// Make the idealTree be the thing that's on disk
arb.reify({
  // write the lockfile(s) back to disk, and package.json with any updates
  // defaults to 'true'
  save: true,
}).then(() => {
  // node_modules has been written to match the idealTree
})

DATA STRUCTURES

A node_modules tree is a logical graph of dependencies overlaid on a physical tree of folders.

A Node represents a package folder on disk, either at the root of the package, or within a node_modules folder. The physical structure of the folder tree is represented by the node.parent reference to the containing folder, and node.children map of nodes within its node_modules folder, where the key in the map is the name of the folder in node_modules, and the value is the child node.

A node without a parent is a top of tree.

A Link represents a symbolic link to a package on disk. This can be a symbolic link to a package folder within the current tree, or elsewhere on disk. The link.target is a reference to the actual node. Links differ from Nodes in that dependencies are resolved from the target location, rather than from the link location.

An Edge represents a dependency relationship. Each node has an edgesIn set, and an edgesOut map. Each edge has a type which specifies what kind of dependency it represents: 'prod' for regular dependencies, 'peer' for peerDependencies, 'dev' for devDependencies, and 'optional' for optionalDependencies. edge.from is a reference to the node that has the dependency, and edge.to is a reference to the node that satisfies the dependency.

As nodes are moved around in the tree, the graph edges are automatically updated to point at the new module resolution targets. In other words, edge.from, edge.name, and edge.spec are immutable; edge.to is updated automatically when a node's parent changes.

class Node

All arborist trees are Node objects. A Node refers to a package folder, which may have children in node_modules.

  • node.name The name of this node's folder in node_modules.

  • node.parent Physical parent node in the tree. The package in whose node_modules folder this package lives. Null if node is top of tree.

    Setting node.parent will automatically update node.location and all graph edges affected by the move.

  • node.meta A Shrinkwrap object which looks up resolved and integrity values for all modules in this tree. Only relevant on root nodes.

  • node.children Map of packages located in the node's node_modules folder.

  • node.package The contents of this node's package.json file.

  • node.path File path to this package. If the node is a link, then this is the path to the link, not to the link target. If the node is not a link, then this matches node.realpath.

  • node.realpath The full real filepath on disk where this node lives.

  • node.location A slash-normalized relative path from the root node to this node's path.

  • node.isLink Whether this represents a symlink. Always false for Node objects, always true for Link objects.

  • node.isRoot True if this node is a root node. (Ie, if node.root === node.)

  • node.root The root node where we are working. If not assigned to some other value, resolves to the node itself. (Ie, the root node's root property refers to itself.)

  • node.isTop True if this node is the top of its tree (ie, has no parent), false otherwise.

  • node.top The top node in this node's tree. This will be equal to node.root for simple trees, but link targets will frequently be outside of (or nested somewhere within) a node_modules hierarchy, and so will have a different top.

  • node.dev, node.optional, node.devOptional, node.peer Indicators as to whether this node is a dev, optional, and/or peer dependency. These flags are relevant when pruning dependencies out of the tree or deciding what to reify. See Package Dependency Flags below for explanations.

  • node.edgesOut Edges in the dependency graph indicating nodes that this node depends on, which resolve its dependencies.

  • node.edgesIn Edges in the dependency graph indicating nodes that depend on this node.

  • node.extraneous True if this package is not required by any other for any reason. False for top of tree.

  • node.resolve(name) Identify the node that will be returned when code in this package runs require(name)

  • node.errors Array of errors encountered while parsing package.json or version specifiers.

class Link

Link objects represent a symbolic link within the node_modules folder. They have most of the same properties and methods as Node objects, with a few differences.

  • link.target A Node object representing the package that the link references. If this is a Node already present within the tree, then it will be the same object. If it's outside of the tree, then it will be treated as the top of its own tree.
  • link.isLink Always true.
  • link.children This is always an empty map, since links don't have their own children directly.

class Edge

Edge objects represent a dependency relationship from a package node to the point in the tree where the dependency will be loaded. As nodes are moved within the tree, Edges automatically update to point to the appropriate location.

  • new Edge({ from, type, name, spec }) Creates a new edge with the specified fields. After instantiation, none of the fields can be changed directly.
  • edge.from The node that has the dependency.
  • edge.type The type of dependency. One of 'prod', 'dev', 'peer', or 'optional'.
  • edge.name The name of the dependency. Ie, the key in the relevant package.json dependencies object.
  • edge.spec The specifier that is required. This can be a version, range, tag name, git url, or tarball URL. Any specifier allowed by npm is supported.
  • edge.to Automatically set to the node in the tree that matches the name field.
  • edge.valid True if edge.to satisfies the specifier.
  • edge.error A string indicating the type of error if there is a problem, or null if it's valid. Values, in order of precedence:
    • DETACHED Indicates that the edge has been detached from its edge.from node, typically because a new edge was created when a dependency specifier was modified.
    • MISSING Indicates that the dependency is unmet. Note that this is not set for unmet dependencies of the optional type.
    • PEER LOCAL Indicates that a peerDependency is found in the node's local node_modules folder, and the node is not the top of the tree. This violates the peerDependency contract, because it means that the dependency is not a peer.
    • INVALID Indicates that the dependency does not satisfy edge.spec.
  • edge.reload() Re-resolve to find the appropriate value for edge.to. Called automatically from the Node class when the tree is mutated.
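
The precedence of edge.error values can be illustrated with a small stand-alone function. This is a sketch over plain objects, not arborist's Edge class: the `detached` flag, the `edgeError` name, and the caller-supplied `satisfies` predicate are all assumptions made for the example.

```javascript
// Hypothetical helper: returns the error string for a plain-object edge,
// checking conditions in the documented order of precedence.
// `satisfies(node, spec)` stands in for npm's real specifier matching.
function edgeError (edge, satisfies) {
  if (edge.detached) return 'DETACHED'
  // unmet optional deps are not reported as MISSING
  if (!edge.to) return edge.type === 'optional' ? null : 'MISSING'
  // a peer dep found in the dependent's own node_modules is not a peer
  if (edge.type === 'peer' && edge.to.parent === edge.from && !edge.from.isTop) {
    return 'PEER LOCAL'
  }
  if (!satisfies(edge.to, edge.spec)) return 'INVALID'
  return null
}

// Toy nodes: root > a > b
const top = { name: 'root', isTop: true, parent: null }
const a = { name: 'a', isTop: false, parent: top, version: '1.0.0' }
const b = { name: 'b', isTop: false, parent: a, version: '2.0.0' }
const exact = (node, spec) => node.version === spec

const peerLocal = edgeError({ from: a, to: b, type: 'peer', spec: '2.0.0' }, exact)
const invalid = edgeError({ from: top, to: a, type: 'prod', spec: '9.9.9' }, exact)
```

Here `peerLocal` is 'PEER LOCAL' even though the version matches, because that check comes before INVALID in the precedence order.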

Package Dependency Flags

The dependency type of a node can be determined efficiently by looking at the dev, optional, and devOptional flags on the node object. These are updated by arborist when necessary whenever the tree is modified in such a way that the dependency graph can change, and are relevant when pruning nodes from the tree.

| extraneous | peer | dev | optional | devOptional | meaning             | prune?            |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |      |     |          |             | production dep      | never             |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|     X      | N/A  | N/A |   N/A    |     N/A     | nothing depends on  | always            |
|            |      |     |          |             | this, it is trash   |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |      |  X  |          |      X      | devDependency, or   | if pruning dev    |
|            |      |     |          | not in lock | only depended upon  |                   |
|            |      |     |          |             | by devDependencies  |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |      |     |    X     |      X      | optionalDependency, | if pruning        |
|            |      |     |          | not in lock | or only depended on | optional          |
|            |      |     |          |             | by optionalDeps     |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |      |  X  |    X     |      X      | Optional dependency | if pruning EITHER |
|            |      |     |          | not in lock | of dep(s) in the    | dev OR optional   |
|            |      |     |          |             | dev hierarchy       |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |      |     |          |      X      | BOTH a non-optional | if pruning BOTH   |
|            |      |     |          |   in lock   | dep within the dev  | dev AND optional  |
|            |      |     |          |             | hierarchy, AND a    |                   |
|            |      |     |          |             | dep within the      |                   |
|            |      |     |          |             | optional hierarchy  |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |  X   |     |          |             | peer dependency, or | if pruning peers  |
|            |      |     |          |             | only depended on by |                   |
|            |      |     |          |             | peer dependencies   |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |  X   |  X  |          |      X      | peer dependency of  | if pruning peer   |
|            |      |     |          | not in lock | dev node hierarchy  | OR dev deps       |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |  X   |     |    X     |      X      | peer dependency of  | if pruning peer   |
|            |      |     |          | not in lock | optional nodes, or  | OR optional deps  |
|            |      |     |          |             | peerOptional dep    |                   |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |  X   |  X  |    X     |      X      | peer optional deps  | if pruning peer   |
|            |      |     |          | not in lock | of the dev dep      | OR optional OR    |
|            |      |     |          |             | hierarchy           | dev               |
|------------+------+-----+----------+-------------+---------------------+-------------------|
|            |  X   |     |          |      X      | BOTH a non-optional | if pruning peers  |
|            |      |     |          |   in lock   | peer dep within the | OR:               |
|            |      |     |          |             | dev hierarchy, AND  | BOTH optional     |
|            |      |     |          |             | a peer optional dep | AND dev deps      |
+------------+------+-----+----------+-------------+---------------------+-------------------+
  • If none of these flags are set, then the node is required by the dependency and/or peerDependency hierarchy. It should not be pruned.
  • If both node.dev and node.optional are set, then the node is an optional dependency of one of the packages in the devDependency hierarchy. It should be pruned if either dev or optional deps are being removed.
  • If node.dev is set, but node.optional is not, then the node is required in the devDependency hierarchy. It should be pruned if dev dependencies are being removed.
  • If node.optional is set, but node.dev is not, then the node is required in the optionalDependency hierarchy. It should be pruned if optional dependencies are being removed.
  • If node.devOptional is set, then the node is a (non-optional) dependency within the devDependency hierarchy, and a dependency within the optionalDependency hierarchy. It should be pruned if both dev and optional dependencies are being removed.
  • If node.peer is set, then all the same semantics apply as above, except that the dep is brought in by a peer dep at some point, rather than a normal non-peer dependency.

Note: devOptional is only set in the shrinkwrap/package-lock file if neither dev nor optional are set, as it would be redundant.
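
The pruning rules in the chart can be condensed into a single predicate. The following is a sketch, assuming the in-memory flags described above (where devOptional is also set whenever dev or optional is set); the `shouldPrune` name and the shape of the `omit` object are hypothetical, not part of arborist's API.

```javascript
// Sketch only: `omit` names which dependency classes are being pruned,
// e.g. { dev: true } when pruning devDependencies.
function shouldPrune (node, omit = {}) {
  if (node.extraneous) return true
  if (node.peer && omit.peer) return true
  if (node.dev && omit.dev) return true
  if (node.optional && omit.optional) return true
  // devOptional alone: needed by both the dev and the optional hierarchy,
  // so it only goes away when both are being pruned.
  return Boolean(node.devOptional && omit.dev && omit.optional)
}

const prodDep = {}
const devDep = { dev: true, devOptional: true }
const sharedDep = { devOptional: true } // the "in lock" devOptional row

const keepsShared = shouldPrune(sharedDep, { dev: true })
const dropsShared = shouldPrune(sharedDep, { dev: true, optional: true })
```

A node with no flags is never pruned, and `sharedDep` survives until both dev and optional deps are omitted, which matches the "BOTH dev AND optional" row.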

arborist's People

Contributors

addaleax, ashleygwilliams, bonkydog, claudiahdz, darcyclarke, enelar, fritzy, guillett, iarna, isaacs, jayaddison, kumavis, ljharb, lukekarrys, mtoothman, naugtur, nlf, obar-zik, othiym23, ruyadorno, watilde, wesleytodd, wraithgar, zkat

arborist's Issues

load logical tree without looking in node_modules

Load tree from package-lock.json and package.json without reading tree (Ie, just get the presumed tree, without reaffirming from disk)

NB: this should be unblocked now that pacote v10 is ready to go.

acceptDependencies

  • Add an accept field to the Edge class.
  • Update depValid so that a dep is valid if it satisfies the spec or the accept specification. (Note: should depValid take an Edge object rather than a spec? Should it just be a part of the Edge class? It smells like feature envy, but we do often want to know if a dep would be valid, without committing to attaching it into the tree so an Edge points to it, but again, maybe this would be better handled with a method like Edge.wouldBeSatisfiedBy(node) or something?)
  • Update Node[_loadDepType] method to include the acceptDependencies range when creating an Edge.

That should be it. Everything else that's done with dep analysis happens by calling depValid with the edge values and target node.

[BUG] Flattened dependencies reported as children of the root/top node

What / Why

When the tree is flattened, deep dependencies are reported as direct dependencies, i.e. the node.parent will be the node.root rather than the dependent.

This may, of course, be as designed, in which case the documentation needs clarification. In that case, I can raise a separate feature request for detecting node.direct dependencies (i.e. ones in package.json, rather than just the top level of the node_modules) - or rename this issue.

When

  • n/a

Where

  • n/a

How

Current Behavior

  • node.parent === node.root when node is an indirect dependency.

Steps to Reproduce

  • npm install debug
  • arborist.buildIdealTree()
  • the Node for ms will have node.parent.isTop === true, even though its parent is debug.

Expected Behavior

  • The Node for ms should have node.parent be the Node for debug.

Who

  • n/a

References

  • n/a

Resolve past the top of tree

Right now, node.resolve() does this:

  resolve (name) {
    const mine = this.children.get(name)
    return mine ? mine
      : this.isTop ? null
      : this.parent.resolve(name)
  }

However, we might have a folder structure like this:

root
+-- x (depends on y)
+-- packages
|   +-- y (depends on z and a)
|   |   +-- node_modules
|   |       +-- a
|   +-- z
+-- node_modules
    +-- x -> ../x
    +-- y -> ../packages/y
    +-- z -> ../packages/z

When loading the tree in x, the top of tree will be root/x. No node_modules, so it'll look like it's missing its deps.

Similarly, in packages/y, it'll find the a dep, but not the z dep.

We need a way to identify other points in the fs tree where a dep might be found: ie, any ancestor of the top node which contains a node_modules folder.

Some options:

  • Represent the node_modules folder as a special type of node in the tree. Then the "parent" relationship can be a direct folder relationship, and we can say that the root/x node's parent is the root node. Then resolve looks like this:
resolve (name) {
  const mine = this.node_modules && this.node_modules.get(name)
  if (mine) return mine
  if (this.parent.isNodeModules && this.parent.parent)
    return this.parent.parent.resolve(name)
  else if (this.parent)
    return this.parent.resolve(name)
  else
    return null
}

Downside of this is that we'd have to effectively always walk the full tree loading every folder in the path to really do it correctly.

  • Define a set of "potential parents" in the loadActual step if a node is top of tree. A potential parent is an ancestor folder containing a node_modules folder. Then resolve() looks like this:
resolve (name) {
  const mine = this.children.get(name)
  if (mine)
    return mine
  else if (this.parent)
    return this.parent.resolve(name)
  else {
    let ppFound = null
    this.potentialParents.some(pp => ppFound = pp.resolve(name))
    return ppFound
  }
}

The potential parents list would have to be an array, not a Set, because order is important.
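
To make the "potential parents" option concrete, here is a runnable toy version applied to the folder layout above. The Node class below is a stand-in written for this example, not arborist's Node; links are modelled as ordinary children of root, and the `potentialParents` list is filled in by hand where loadActual would be expected to do it.

```javascript
// Toy Node class for illustration only.
class Node {
  constructor ({ name, parent = null, potentialParents = [] }) {
    this.name = name
    this.children = new Map()
    this.parent = parent
    // ordered: nearest ancestor folder containing node_modules comes first
    this.potentialParents = potentialParents
    if (parent) parent.children.set(name, this)
  }

  resolve (name) {
    const mine = this.children.get(name)
    if (mine) return mine
    if (this.parent) return this.parent.resolve(name)
    // top of tree: fall back to ancestors on disk, in order
    for (const pp of this.potentialParents) {
      const found = pp.resolve(name)
      if (found) return found
    }
    return null
  }
}

// root/node_modules contains z; packages/y is a top node with its own a.
const root = new Node({ name: 'root' })
new Node({ name: 'z', parent: root })
const y = new Node({ name: 'y', potentialParents: [root] })
new Node({ name: 'a', parent: y })

const foundA = y.resolve('a') // found in y's own node_modules
const foundZ = y.resolve('z') // found via the potential parent
```

With this arrangement, `y` finds both of its deps, and a name that exists nowhere still resolves to null.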

The important thing about getting this right in Node.resolve() is that that's how the Edge class keeps itself automatically up to date. Challenge is to get this set up in loadActual so that we're not doing fs stuff in the Node class directly, since that should be completely agnostic about whether it refers to an actual node on disk or a hypothetical node in an ideal tree.

[FEATURE] arborist.prune()

Create top level arborist.prune() method/mixin.

Rough sketch algorithm:

  • Load virtual tree, or failing that, actual tree
  • For node in tree.inventory, compare node dep flags to requested prune args, according to the Package Dependency Flags chart
  • If shouldPrune, then set node.parent = null
  • this.reify()

[BUG] Configs needing to be added/respected

  • binLinks - boolean, default true, whether to link bins
  • rebuildBundle - boolean, default false, rebuild bundled deps #80
    • This actually snuck on b28a43f, and it's probably good to have true by default rather than false? I think the only reason it's not true by default is that it was kind of an afterthought back in the npm v1 days, and that behavior was just carried forward.
    • Update: It actually does default to true in npm v6, not sure where I got the impression it didn't. So that's good. Just have to make it configurable.
  • packageLock - boolean, default true, respect/update the package-lock/shrinkwrap file
  • packageLockOnly - boolean, default false, don't reify anything, just update the lockfile
    • depends on #73, if this config is set reify() calls buildIdealTree({complete:true}) and then saves package{,-lock}.json files
    • #78
  • globalStyle - boolean, install as if it's a top-level global package (ie, don't resolve any deps above the top level, except peer deps)
  • dryRun - boolean, don't actually touch anything on the filesystem
    • depends on #73. If this config is set, calls buildIdealTree({complete:true}), calculates the diff, and returns diff and ideal tree, without saving anything.
    • #94

refactor Arborist class into 4 mixins

  1. loadActual
  2. loadVirtual
  3. buildIdealTree
  4. reify

The better to test and organize it with. So that the actual Arborist class looks like:

class Arborist extends actualLoader(virtualLoader(idealTreeBuilder(reifier(EE)))) {
  constructor (options) {
    // a derived class must call super() before touching `this`
    super(options)
    // set this.whatevers
  }
}
// that's it!
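
For readers unfamiliar with the pattern, a mixin here is a function that takes a base class and returns a subclass. The sketch below shows only the composition; the method bodies are placeholders, and the mixin names are taken from the snippet above rather than from any published module.

```javascript
const EE = require('events')

// Each mixin adds one method on top of whatever base class it is given.
const reifier = Base => class Reifier extends Base {
  async reify () { return 'reified' } // placeholder body
}
const idealTreeBuilder = Base => class IdealTreeBuilder extends Base {
  async buildIdealTree () { return 'ideal' } // placeholder body
}
const virtualLoader = Base => class VirtualLoader extends Base {
  async loadVirtual () { return 'virtual' } // placeholder body
}
const actualLoader = Base => class ActualLoader extends Base {
  async loadActual () { return 'actual' } // placeholder body
}

class Arborist extends actualLoader(virtualLoader(idealTreeBuilder(reifier(EE)))) {
  constructor (options = {}) {
    super()
    this.path = options.path || process.cwd()
  }
}

const arb = new Arborist({ path: '/tmp/project' })
```

Each mixin can then be tested on its own by applying it to a bare EventEmitter, which is the organizational benefit the proposal is after.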

buildIdealTree: update

Right now, Arborist.buildIdealTree can add/remove nodes, and will prune by default. However, it really is an "install" and not an "update", in that it only visits the root node, and then any node that has an invalid edge going out from it.

npm update is different:

If a list of names is provided

  • Start with the actual tree if there's no shrinkwrap (not an empty tree with just the root node)
  • Walk the whole tree no matter what, with the exception of bundled deps. (Ie, don't just start at the root and only add problems to the queue.)
  • While we may visit every node, only re-evaluate edges if the name is on the list, or if it has a problem.

If a numeric depth is provided

  • Do not continue the "walk the entire tree" process beyond nodes with a depth property > the depth specified.
  • Still always evaluate nodes if they're a problem, to any depth (they're likely a problem because of something replaced further up in the tree, or because they are missing.)

If no names are provided, and no depth provided

  • Ignore the shrinkwrap. Start from the root node, rebuild the whole thing. (This is essentially just buildIdealTree with shrinkwrap: false, so npm update becomes the same as npm install --no-shrinkwrap or rm -rf node_modules package-lock.json; npm i.)

[FEATURE] need to run build scripts

Arborist needs to run build scripts so that it can fail (and roll back) the install if they fail (or ignore, but clean up if it's an optional dep).

Fallback lockfile

Write a copy of the lockfile to node_modules/.package-lock.json whenever we reify the tree.

If the mtime on this lockfile is >= the mtime on all folders in node_modules, assume it's a valid representation of the packages on disk, and have loadActual return its data by default instead of reading the tree. An option to loadActual can tell it to prefer this cached data if present, since npm ls, npm fund, etc. are typically fine with cached data.

When building the ideal tree, we should not use this fallback lockfile as an authoritative reference, however. It's a record of what is, but npm-shrinkwrap.json, package-lock.json, and yarn.lock capture what should be.
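
The mtime comparison could look roughly like the following. This is a simplified sketch: the function name is made up, and it only checks folders directly inside node_modules, so scoped packages, symlinks, and nested node_modules folders are ignored.

```javascript
const fs = require('fs')
const path = require('path')

// Returns true when node_modules/.package-lock.json is at least as new as
// every folder directly inside node_modules (assumption: one level only).
function hiddenLockfileIsFresh (root) {
  const nm = path.join(root, 'node_modules')
  let lockTime
  try {
    lockTime = fs.statSync(path.join(nm, '.package-lock.json')).mtimeMs
  } catch (er) {
    return false // no hidden lockfile, so there is nothing to trust
  }
  return fs.readdirSync(nm, { withFileTypes: true })
    .filter(entry => entry.isDirectory())
    .every(entry => fs.statSync(path.join(nm, entry.name)).mtimeMs <= lockTime)
}
```

A caller such as loadActual would fall back to reading the tree from disk whenever this returns false.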

How can I get the effect of `npm ls --production`?

Using each of loadActual, loadVirtual, and buildIdealTree (since I use all three in what I'm building), how can I omit/ignore dev deps?

For loadActual, I can run npm prune --production first, and I get the tree I want; for loadVirtual, I'd presumably have to remove all lockfiles and run npm install --package-lock-only --package-lock --production; for buildIdealTree, it's not clear there's any way to work around it.

generate and respect yarn.lock file

This will require the dist.shasum, which npm no longer stashes in package.json. Will have to pull that out of cacache or something, or fetch it if necessary?

This may require an update to pacote (and potentially ssri?) to calculate the sha1 integrity hash that yarn expects for resolved tarballs.

Another thought: what if the resolved value from pacote for registry deps just always had the #${shasum} tacked onto the tgz url, like yarn does? Would certainly make it easier to generate, and wouldn't be too hard. Fetch would have to know to strip it off, but I would expect it to anyway.
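
As a rough illustration of that last idea, the fragment could be added and stripped like this. Both helper names are hypothetical, and the buffer stands in for real tarball bytes; nothing here reflects what pacote actually does.

```javascript
const crypto = require('crypto')

// Append the sha1 of the tarball contents as a URL fragment, yarn-style.
function withShasum (tarballUrl, tarballData) {
  const shasum = crypto.createHash('sha1').update(tarballData).digest('hex')
  return `${tarballUrl}#${shasum}`
}

// Remove a trailing 40-hex-digit fragment before handing the URL to fetch.
function stripShasum (resolved) {
  return resolved.replace(/#[0-9a-f]{40}$/, '')
}

const url = 'https://registry.npmjs.org/abbrev/-/abbrev-1.1.1.tgz'
const resolved = withShasum(url, Buffer.from('not a real tarball'))
const fetchUrl = stripShasum(resolved) // same as `url` again
```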

[BUG] bad behavior in npm/cli#750 test case

Set up a folder like described in npm/cli#750

Run the scripts/reify.js script on it, with --save

  1. Does not create link to ./lib in node_modules.
  2. symlink is an absolute path, should be relative.
  3. package-lock.json looks like this:
{
  "name": "monorepo",
  "lockfileVersion": 2,
  "requires": true,
  "packages": {
    "": {
      "name": "monorepo",
      "dependencies": {
        "app": "file:./app"
      }
    },
    "app": {},
    "node_modules/app": {
      "resolved": "app",
      "link": true
    }
  },
  "dependencies": {
    "app": {
      "version": "file:app"
    }
  }
}

Run scripts/reify.js --save on it again, and now package-lock.json looks like this:

{
  "name": "monorepo",
  "lockfileVersion": 2,
  "requires": true,
  "packages": {
    "": {
      "name": "monorepo",
      "dependencies": {
        "app": "file:./app"
      }
    },
    "app": {
      "resolved": "git+ssh://git@github.com/node_modules/app.git"
    },
    "node_modules/app": {
      "resolved": "app",
      "link": true
    }
  },
  "dependencies": {
    "app": {
      "version": "file:app"
    }
  }
}

Super weird, looks like it's interpreting the node_modules/app reference as a github shorthand? That's wrong.

[BUG] nebulous error when using arborist in a path with no `package.json`

What / Why

npm verb stack TypeError: Cannot read property 'name' of null
npm verb stack     at npa (/Users/mperrotte/npminc/cli/node_modules/npm-package-arg/npa.js:28:20)
npm verb stack     at FetcherBase.get (/Users/mperrotte/npminc/cli/node_modules/pacote/lib/fetcher.js:448:16)
npm verb stack     at Object.extract (/Users/mperrotte/npminc/cli/node_modules/pacote/lib/index.js:4:34)
npm verb stack     at Arborist.[extractOrLink] (/Users/mperrotte/npminc/cli/node_modules/@npmcli/arborist/lib/arborist/reify.js:344:16)
npm verb stack     at p.(anonymous function).then (/Users/mperrotte/npminc/cli/node_modules/@npmcli/arborist/lib/arborist/reify.js:330:39)

Get this nebulous error when trying to run arborist.buildIdealTree and arborist.reify in a path that doesn't contain a package.json

When

  • Running .buildIdealTree()
  • Running .reify()

Where

  • opts = { path: 'some/path/with/no/package/json' }

How

Current Behavior

  • Nebulous error

Expected Behavior

  • Should throw with intelligent error message which can be caught and used

[FEATURE] timing

It would be nice if reify() (and, for that matter, other functions as well) would track how long things take.

Call process.emit('time', name) when a thing starts, and process.emit('timeEnd', name) when it ends.
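
One way this could be wired up is sketched below. The `timed` wrapper and the listener bookkeeping are illustrative names invented for this example; only the 'time' and 'timeEnd' event names come from the proposal.

```javascript
// Wrap an async step so it announces its start and end on `process`.
async function timed (name, fn) {
  process.emit('time', name)
  try {
    return await fn()
  } finally {
    // emitted on success and on failure alike
    process.emit('timeEnd', name)
  }
}

// A consumer (the CLI, say) can turn the two events into durations.
const started = new Map()
const durations = new Map()
process.on('time', name => started.set(name, process.hrtime.bigint()))
process.on('timeEnd', name => {
  const elapsedNs = process.hrtime.bigint() - started.get(name)
  durations.set(name, Number(elapsedNs) / 1e6) // milliseconds
  started.delete(name)
})
```

Inside arborist, a call such as `timed('reify', () => doTheWork())` would then be enough for any listener to see how long the step took, without arborist knowing who is listening.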

reparenting nodes

Move a node to a different spot in the tree. Update parent/child links, and re-resolve dependencies of any nodes that depend on it.

Required for #9

[FEATURE] arborist.dedupe()

npm dedupe needs to find and de-duplicate dependencies. This should be available as a top-level Arborist function/mixin.

Rough algo sketch:

  • Set this[_preferDedupe] to true
  • Load virtual tree (or, failing that, load actual tree)
  • For each name in tree.inventory.query('name')
    • set nodes to tree.inventory.query('name', name)
    • if nodes.size is 1, continue to next name
    • for each node in nodes
      • for each edge in node.edgesIn
        • add edge.from to this[_depsQueue]
  • set this.idealTree to tree
  • return this.reify()

This will require that we make the _depsQueue and _preferDedupe symbols shared (ie, Symbol.for('_depsQueue') instead of Symbol('_depsQueue')).
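
The queue-building loop from the sketch above, modelled on plain objects. The inventory here is just an array, whereas arborist's inventory is queried with tree.inventory.query; the `dedupeQueue` name is made up for the example.

```javascript
// Collect every node that depends on a package name appearing more than
// once in the tree; those dependents are what would go onto the deps queue.
function dedupeQueue (inventory) {
  const byName = new Map()
  for (const node of inventory) {
    if (!byName.has(node.name)) byName.set(node.name, new Set())
    byName.get(node.name).add(node)
  }
  const queue = new Set()
  for (const nodes of byName.values()) {
    if (nodes.size === 1) continue // only one copy, nothing to dedupe
    for (const node of nodes) {
      for (const edge of node.edgesIn) queue.add(edge.from)
    }
  }
  return [...queue]
}

// Two copies of ms (one under a, one under b) and a single copy of debug.
const root = { name: 'root', edgesIn: new Set() }
const a = { name: 'a', edgesIn: new Set([{ from: root }]) }
const b = { name: 'b', edgesIn: new Set([{ from: root }]) }
const ms1 = { name: 'ms', edgesIn: new Set([{ from: a }]) }
const ms2 = { name: 'ms', edgesIn: new Set([{ from: b }]) }
const debug = { name: 'debug', edgesIn: new Set([{ from: root }]) }

const queue = dedupeQueue([root, a, b, ms1, ms2, debug])

// Symbol.for() returns the same symbol wherever it is called, which is what
// lets separate modules share the key; Symbol() creates a fresh one each time.
const _depsQueue = Symbol.for('_depsQueue')
```

Only `a` and `b` end up queued, since they are the dependents of the duplicated `ms`.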

cc @claudiahdz

[FEATURE] don't require reify/buildIdealTree caller to list added deps under intended type

Right now, if you do this:

npm i -D tap@12
npm i rimraf@2

then it'll add rimraf as a dep, and tap as a devDep.

And then later, if you do:

npm i tap@latest rimraf@latest

it'll be smart enough to update the tap in your devDeps, and the rimraf in your deps.

However, the API for adding deps in arborist requires that the caller decide where to place it in the package.json.

That means that the CLI can't just willy-nilly say "add this thing, idk, put it wherever it goes". It has to read the package.json, see if it's there, if not, decide where to add it, etc.

And then, arborist has to read the package.json again, and re-do some of that same complexity.

We've already diverged from mapping the options directly onto the package.json data structure (since we don't have the names for some deps at request time), so I feel like we may as well model the API more closely to what the CLI presents to the user.

Proposed API

// the first install, adding tap to devDeps
arb.reify({
  add: ['tap@12'],
  saveType: 'dev', // overrides wherever it might already be
  saveBundled: false,
  saveOptional: false,
})

// next install
arb.reify({
  add: ['rimraf@2'],
  saveType: null, // defaults to prod if missing
  saveBundled: false,
  saveOptional: false,
})

// third install, update both
arb.reify({
  add: ['tap@latest', 'rimraf@latest'],
  saveType: null, // just update where they are
  saveBundled: false,
  saveOptional: false,
})

Implementation

  • saveBundled will place the names of all added deps into bundleDependencies
  • saveType can have the following values:
    • null: Just use the current location for that dep name (deps/optional/dev/peer), or dependencies if not already present. If present in peer, do not modify the peerDependenciesMeta entry.
    • dev: Save all added deps to devDependencies, removing them from optional/deps/peer/peerMeta if present.
    • prod: Save all added deps to dependencies, removing them from dev/optional/peer/peerMeta if present.
    • optional: Save all added deps to optionalDependencies, removing them from dev/peer/peerMeta/deps if present.
    • peer: Save all added deps to peerDependencies, removing them from dev/optional/deps if present, and removing a peerDependenciesMeta[name].optional flag if present.
    • peerOptional: Save all added deps to peerDependencies and add a peerDependenciesMeta[name]={optional:true}
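The saveType rules above could be sketched like this. `saveDep` is a hypothetical helper that takes an already-split name and range (the proposed API takes spec strings such as 'tap@12'), and the ranges in the usage lines are illustrative. Removing a peerOptional dep from the other dependency fields is an assumption, since the list does not say.

```javascript
// Fields checked in the order listed for the null case (deps/optional/dev/peer).
const FIELDS = ['dependencies', 'optionalDependencies', 'devDependencies', 'peerDependencies']
const TARGETS = new Map([
  ['prod', 'dependencies'],
  ['dev', 'devDependencies'],
  ['optional', 'optionalDependencies'],
  ['peer', 'peerDependencies'],
  ['peerOptional', 'peerDependencies'],
])

// Hypothetical helper: record one added dep in a package.json-shaped object.
const saveDep = (pkg, name, spec, { saveType = null, saveBundled = false } = {}) => {
  if (saveType === null) {
    // Keep the dep wherever it already lives, defaulting to dependencies.
    // peerDependenciesMeta is deliberately left alone here.
    const current = FIELDS.find(field => pkg[field] && name in pkg[field]) || 'dependencies'
    pkg[current] = { ...pkg[current], [name]: spec }
  } else {
    const target = TARGETS.get(saveType)
    if (!target) throw new TypeError(`invalid saveType: ${saveType}`)
    // Remove the dep from every field other than the target.
    for (const field of FIELDS) {
      if (field !== target && pkg[field]) delete pkg[field][name]
    }
    pkg[target] = { ...pkg[target], [name]: spec }

    const meta = pkg.peerDependenciesMeta || {}
    if (saveType === 'peerOptional') {
      meta[name] = { ...meta[name], optional: true }
    } else if (saveType === 'peer') {
      if (meta[name]) delete meta[name].optional
    } else {
      delete meta[name]
    }
    if (Object.keys(meta).length) pkg.peerDependenciesMeta = meta
  }
  if (saveBundled) {
    // Add the name once, preserving any names already listed.
    pkg.bundleDependencies = [...new Set([...(pkg.bundleDependencies || []), name])]
  }
  return pkg
}

// The three installs from the example above, applied to an empty package.json.
const pkg = {}
saveDep(pkg, 'tap', '^12.0.0', { saveType: 'dev' })
saveDep(pkg, 'rimraf', '^2.0.0')
saveDep(pkg, 'tap', '^14.0.0') // saveType null: tap stays in devDependencies
console.log(pkg)
```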

set inBundle based on bundledDependencies, not package.json's _inBundle

inBundle should be set based on a package's bundledDependencies, not
the _inBundle field in the written package.json file. Actually, we
should not really be looking at that package.json file for very much, if
possible, as it makes interoperability with Yarn harder. Rather than
writing to nm/foo/package.json, leave that file untouched, and stash
stuff in nm/.cache/npm instead maybe?

arborist.inventory

Keep inventories of the nodes in the tree.

  • Index of all nodes by location. Would have to be updated when re-parenting. Maybe reparenting should be done by the Arborist, not a Node method? See #13.
  • Set of all extraneous/dev/optional/devOptional nodes, the better to prune. See #11.

[BUG] need to retire/fallback/remove bin links as well

In the same way that we install something, and then might need to roll it back, bin links need to be preserved and cleaned up on removal.

Example:

$ mkdir x

$ echo '{}' > x/package.json

$ node scripts/reify.js x --add=rimraf --quiet
{
  path: 'x',
  cache: '/Users/isaacs/.npm/_cacache',
  add: { dependencies: { rimraf: '' } },
  quiet: true
}
resolved 12 deps in 0.0865676154s

$ node scripts/reify.js x --rm=rimraf --quiet
{
  path: 'x',
  cache: '/Users/isaacs/.npm/_cacache',
  rm: [ 'rimraf' ],
  quiet: true
}
resolved 0 deps in 0.004495018s

$ tree x
x
├── node_modules
│   └── .bin
│       └── rimraf -> ../rimraf/bin.js
├── package-lock.json
└── package.json

After installing and then removing rimraf, the rimraf bin is still present. That's litter.

If another module were to try to use that bin name, it'd fail, even though the rimraf module was removed!

Should be as simple as adding the bin link targets to the retiredNodes object. (Which should probably be called retiredPaths, since it's a collection of path names, not node objects?)
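As a rough illustration, the paths to retire could be derived from a package's bin field. `binLinkPaths` is a hypothetical helper, not an Arborist function, and it assumes POSIX-style symlinks in node_modules/.bin (Windows shims are not covered).

```javascript
const { join } = require('path')

// Hypothetical helper: list the .bin entries a package would have created.
const binLinkPaths = (nodeModules, pkg) => {
  // A string "bin" is shorthand for a single command named after the
  // package, with any scope removed.
  const bin = typeof pkg.bin === 'string'
    ? { [pkg.name.replace(/^@[^/]+\//, '')]: pkg.bin }
    : pkg.bin || {}
  return Object.keys(bin).map(cmd => join(nodeModules, '.bin', cmd))
}

console.log(binLinkPaths('x/node_modules', { name: 'rimraf', bin: './bin.js' }))
```

Each returned path would then be added to the retired set alongside the package folder, so that a rollback restores it and a successful removal deletes it.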

buildIdealTree - optional deps

Provide an option to buildIdealTree to exclude (and prune) optional dependencies.

Note: we cannot ignore optionalDependencies conflicts, because the contract as implemented is that we will provide the version requested or nothing. Problems during reification (fetching, unpacking, building, etc) can be ignored if the dep is optional.

Arborist API refactor

Default export should be Arborist class, not loadActual (carry-over from read-package-tree)

buildIdealTree - ignore deps when a node has a shrinkwrap

  • Node object should maybe have a hasShrinkwrap field? (Pull from pkg or by reading the actual folder when loaded from disk)
  • Track hasShrinkwrap in lockfile
  • Don't bother fetching deps if it has a shrinkwrap, because we won't need them
  • Update idealTree during reification (essentially, do loadVirtual on that node and then slap that tree into the idealTree as the node's children).

emit events

  • make a list of what kinds of events we care about
  • Emit events when that stuff happens
  • should enable timing and logging in npm/cli

Modules that are both dev and optional should have both flags set, not neither

Fix the dev/optional flag handling, where a node that is both dev and
optional currently ends up flagged as neither. Set a flag,
node.devOrOptional, and unset it on deps and peerDeps. Then, for
optional deps, walk any nodes that have devOrOptional and set
optional=true. For dev deps, do the same thing, but set dev=true. The
result is that something in both trees has both flags. Pruning dev then
means removing any dev node that isn't optional; pruning optional means
removing any optional node that isn't dev; pruning both means removing
any node in either set.
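A minimal sketch of that flag calculation on a toy graph. The graph shape, `reachable`, `rootTargets`, and `calcFlags` are all hypothetical, and the sketch assumes that only the root declares dev and optional edges.

```javascript
// Hypothetical mini-graph: each entry lists [edgeType, targetName] pairs.
const graph = {
  root: [['prod', 'a'], ['dev', 'b'], ['optional', 'c'], ['dev', 'd'], ['optional', 'd']],
  a: [['prod', 'e']],
  b: [['prod', 'e'], ['prod', 'f']],
  c: [['prod', 'g']],
  d: [['prod', 'h']],
  e: [],
  f: [],
  g: [],
  h: [],
}

// Every node reachable from the given start nodes, following all edge types.
const reachable = (graph, starts) => {
  const seen = new Set()
  const queue = [...starts]
  while (queue.length) {
    const name = queue.pop()
    if (seen.has(name)) continue
    seen.add(name)
    for (const [, to] of graph[name]) queue.push(to)
  }
  return seen
}

// Targets of the root's edges of the given types.
const rootTargets = (graph, types) =>
  graph.root.filter(([type]) => types.includes(type)).map(([, to]) => to)

const calcFlags = graph => {
  const flags = {}
  // Start by assuming every non-root node is dev-or-optional.
  for (const name of Object.keys(graph)) {
    if (name !== 'root') flags[name] = { devOrOptional: true, dev: false, optional: false }
  }
  // Anything reachable through deps and peerDeps is required, so unset the flag.
  for (const name of reachable(graph, rootTargets(graph, ['prod', 'peer']))) {
    flags[name].devOrOptional = false
  }
  // Walk the optional tree, then the dev tree, flagging nodes that are still devOrOptional.
  for (const name of reachable(graph, rootTargets(graph, ['optional']))) {
    if (flags[name].devOrOptional) flags[name].optional = true
  }
  for (const name of reachable(graph, rootTargets(graph, ['dev']))) {
    if (flags[name].devOrOptional) flags[name].dev = true
  }
  return flags
}

const flags = calcFlags(graph)
const names = Object.keys(flags)
console.log(names.filter(n => flags[n].dev && !flags[n].optional)) // prune dev: [ 'b', 'f' ]
console.log(names.filter(n => flags[n].optional && !flags[n].dev)) // prune optional: [ 'c', 'g' ]
console.log(names.filter(n => flags[n].dev || flags[n].optional)) // prune both
```

Here `d` and `h` sit in both trees and carry both flags, so they survive pruning only dev or only optional, while `e` is also required by `a` and carries neither.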

[FEATURE] read pnpm trees properly

What

When loading, if a linked tree top is in a node_modules folder, and has
no children of its own, then load the parent of the node_modules folder
as a node, which will pick up all of the linked deps recursively.

pnpm arranges the tree like this:

root
+-- node_modules
    +-- .pnpm
    |   +-- lock.yaml
    |   +-- registry.npmjs.org
    |       +-- pkg
    |       |   +-- 1.2.3
    |       |       +-- node_modules
    |       |           +-- pkg
    |       |           +-- dep -> .pnpm/registry.npmjs.org/dep/2.3.4/node_modules/dep
    |       +-- dep
    |           +-- 2.3.4
    |           |   +-- node_modules
    |           |       +-- dep
    |           +-- 2.4.5
    |               +-- node_modules
    |                   +-- dep
    +-- pkg -> .pnpm/registry.npmjs.org/pkg/1.2.3/node_modules/pkg

So the root can do require(pkg) but not require(dep).

If we load node_modules/.pnpm/registry.npmjs.org/pkg/1.2.3 as if it's
a top level tree node of its own, then it obviously won't have a
package.json, so it'll be kind of boring, but then it'll pick up the pkg
target as a child, along with its deps, and the pkg target won't have any
missing edges.

In fact, we may even be able to work on this tree and capture it in a
package-lock.json!

How

In lib/load-actual.js, if a link target is in a node_modules folder, then load the parent of that node_modules folder as a Node. This will automatically get assigned as the link target node's parent, as well as loading all of its children. (In the case of a pnpm tree, this will be links to other similarly structured package nodes.)
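The path check involved might look like the sketch below. `linkTargetTop` is a hypothetical helper, not the code in lib/load-actual.js; it covers only the "is the target inside a node_modules folder" test and assumes a scoped package sits one level deeper, under an @scope folder.

```javascript
const { basename, dirname } = require('path')

// Hypothetical helper: given a link target's real path, return the folder
// that contains its node_modules folder, or null if the target is not
// inside a node_modules folder at all.
const linkTargetTop = target => {
  let container = dirname(target)
  // Scoped packages live at node_modules/@scope/name, so step up once more.
  if (basename(container).startsWith('@')) container = dirname(container)
  return basename(container) === 'node_modules' ? dirname(container) : null
}

console.log(linkTargetTop('/root/node_modules/.pnpm/registry.npmjs.org/pkg/1.2.3/node_modules/pkg'))
console.log(linkTargetTop('/some/linked/project'))
```

A non-null result is the folder that would be loaded as a Node and assigned as the link target's parent.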

Why

Interoperability with other package managers is an explicit design goal of Arborist and npm v7. Running npm ls in a pnpm-installed tree project should produce useful and correct output. npm commands in such a project should behave as a user would expect.

Limitations of pnpm Interop

We likely will not go so far as to preserve the pnpm tree style in projects that use it, for new dependencies added. For example, in the above tree, if a user runs npm install foo, they'll end up with something like:

root
+-- node_modules
    +-- .pnpm
    |   +-- lock.yaml
    |   +-- registry.npmjs.org
    |       +-- pkg
    |       |   +-- 1.2.3
    |       |       +-- node_modules
    |       |           +-- pkg
    |       |           +-- dep -> .pnpm/registry.npmjs.org/dep/2.3.4/node_modules/dep
    |       +-- dep
    |           +-- 2.3.4
    |           |   +-- node_modules
    |           |       +-- dep
    |           +-- 2.4.5
    |               +-- node_modules
    |                   +-- dep
    +-- pkg -> .pnpm/registry.npmjs.org/pkg/1.2.3/node_modules/pkg
    +-- foo
        +-- node_modules
            +-- foo-dep

If a user runs npm update --deep, and an updated version of dep is available that satisfies pkg's requirements, we might end up with:

root
+-- node_modules
    +-- .pnpm
    |   +-- lock.yaml
    |   +-- registry.npmjs.org
    |       +-- pkg
    |       |   +-- 1.2.3
    |       |       +-- node_modules
    |       |           +-- pkg
    |       |           |   +-- node_modules
    |       |           |       +-- dep
    |       |           +-- dep -> .pnpm/registry.npmjs.org/dep/2.3.4/node_modules/dep
    |       +-- dep
    |           +-- 2.3.4
    |           |   +-- node_modules
    |           |       +-- dep
    |           +-- 2.4.5
    |               +-- node_modules
    |                   +-- dep
    +-- pkg -> .pnpm/registry.npmjs.org/pkg/1.2.3/node_modules/pkg

While this is messy, it's not incorrect, and running npm prune would delete nodes that have no edges pointing into them.

[BUG] audit

quick audit on reify

Arborist should construct the audit payload and submit the quick audit for all reifications (unless audit: false is in the options) and have a function to perform a full audit.

full audit without fix

Load the actual tree, and submit to the full audit endpoint. Return the audit report object.

full audit with fix

Submit the full audit, then perform the necessary remediations as specified.

  • Copy actualTree to this.idealTree
  • Make remediation changes to this.idealTree
  • Call this.reify() (triggers another quick audit, which should pass)

get rid of node.logical

Get rid of node.logical. It serves no purpose, and is weirdly
confusing when dealing with link targets that may appear multiple times
in a tree, since it only captures the first place a node was
encountered. Better to just use the "physical" real path for nodes, and
always have node.path refer to that.

unit tests

Refactor the tests into per-file 100% coverage-generating unit tests.
