Comments (9)
Wow thanks! I think Anki is super cool so please let me know if there's anything else you'd like in n2, even something minor!
from n2.
Making sure I follow, it's something like
build foo: bar baz1 baz2 baz3
command = ... no mention of $in ...
and you reordered baz1/baz2/baz3?
from n2.
Pretty much, yep. The inputs were listed as implicit dependencies, and then listed out in the command line as well. The dependency line was changing, but not the command line. I think the same issue existed for implicit outputs as well.
from n2.
Hrmm. On the one hand, this might be easy to fix. On the other, it feels like if you're changing your build file around it's not so bad for ninja to rebuild? How would you feel about this if there were better debugging output available?
from n2.
Better debugging facilities would be great - it's useful for a bunch of other cases as well, and "build file change triggers rebuild" does make sense.
from n2.
Thinking about it more, n2 doesn’t have as much debug info available (see the discussion of manifests on https://neugierig.org/software/blog/2022/03/n2.html ), hrm…
from n2.
Thanks for implementing that. Yeah, I remember reading your note about hashes losing information about the build when I read that article, and I encountered it when using Bazel as well.
The approach I used in my quick-and-dirty hack was to gather the same data the hash uses and serialize it:
diff --git a/src/graph.rs b/src/graph.rs
index 506023d..f1f60a6 100644
--- a/src/graph.rs
+++ b/src/graph.rs
@@ -377,7 +377,7 @@ const UNIT_SEPARATOR: u8 = 0x1F;
fn hash_files(
hasher: &mut std::collections::hash_map::DefaultHasher,
graph: &Graph,
- file_state: &mut FileState,
+ file_state: &FileState,
ids: &[FileId],
) {
for &id in ids {
@@ -395,6 +395,31 @@ fn hash_files(
}
}
+pub fn dump_files(
+ dump: &mut Vec<(String, i64)>,
+ graph: &Graph,
+ file_state: &FileState,
+ ids: &[FileId],
+) {
+ for &id in ids {
+ let name = &graph.file(id).name;
+ let mtime = file_state
+ .get(id)
+ .unwrap_or_else(|| panic!("no state for {:?}", name));
+ let mtime = match mtime {
+ MTime::Stamp(mtime) => mtime,
+ MTime::Missing => panic!("missing file: {:?}", name),
+ };
+ dump.push((
+ name.clone(),
+ mtime
+ .duration_since(SystemTime::UNIX_EPOCH)
+ .unwrap()
+ .as_millis() as i64,
+ ));
+ }
+}
+
// Hashes the inputs of a build to compute a signature.
// Prerequisite: all referenced files have already been stat()ed and are present.
// (It doesn't make sense to hash a build with missing files, because it's out
@@ -417,6 +442,27 @@ pub fn hash_build(
Ok(Hash(hasher.finish()))
}
+#[derive(Default, Debug, serde::Serialize, serde::Deserialize, PartialEq)]
+pub struct BuildDump {
+ pub ins: Vec<(String, i64)>,
+ pub outs: Vec<(String, i64)>,
+ pub cmdline: String,
+}
+
+pub fn dump_build(
+ graph: &Graph,
+ file_state: &FileState,
+ build: &Build,
+) -> std::io::Result<BuildDump> {
+ let mut dump = BuildDump::default();
+ dump_files(&mut dump.ins, graph, file_state, build.dirtying_ins());
+ dump_files(&mut dump.ins, graph, file_state, build.discovered_ins());
+ dump_files(&mut dump.outs, graph, file_state, build.outs());
+ dump.cmdline = build.cmdline.clone().unwrap_or_default();
+ assert!(build.rspfile.is_none());
+
+ Ok(dump)
+}
#[derive(Default)]
pub struct Hashes(HashMap<BuildId, Hash>);
diff --git a/src/work.rs b/src/work.rs
index 227f5bb..9363c6f 100644
--- a/src/work.rs
+++ b/src/work.rs
@@ -4,8 +4,10 @@ use crate::{
canon::canon_path, db, densemap::DenseMap, graph::*, progress, progress::Progress, signal,
smallmap::SmallMap, task, trace,
};
+use std::collections::HashMap;
use std::collections::HashSet;
use std::collections::VecDeque;
+use std::path::Path;
use std::path::PathBuf;
use std::time::Duration;
@@ -292,6 +294,7 @@ pub struct Work<'a> {
last_hashes: &'a Hashes,
build_states: BuildStates,
runner: task::Runner,
+ dump: HashMap<String, BuildDump>,
}
impl<'a> Work<'a> {
@@ -315,6 +318,7 @@ impl<'a> Work<'a> {
last_hashes,
build_states: BuildStates::new(builds, pools),
runner: task::Runner::new(parallelism),
+ dump: Default::default(),
}
}
@@ -598,6 +602,13 @@ impl<'a> Work<'a> {
// so compare the hash against the last hash.
// TODO: skip this whole function if no previous hash is present.
let hash = hash_build(self.graph, &mut self.file_state, build)?;
+
+ assert!(build.cmdline.is_some());
+ self.dump.insert(
+ build.cmdline.clone().unwrap(),
+ dump_build(self.graph, &self.file_state, build)?,
+ );
+
Ok(self.last_hashes.changed(id, hash))
}
@@ -725,6 +736,37 @@ impl<'a> Work<'a> {
// "interrupted by user" and exit with success, and in that case we
// don't want n2 to print a "succeeded" message afterwards.
let success = tasks_failed == 0 && !signal::was_interrupted();
+ if success {
+ let dump_path = Path::new(".n2-dump");
+ if dump_path.exists() {
+ let previous_dump: HashMap<String, BuildDump> =
+ serde_json::from_slice(&std::fs::read(dump_path).unwrap()).unwrap();
+
+ for (k, current) in &self.dump {
+ match previous_dump.get(k) {
+ Some(prev) => {
+ if prev != current {
+ let old = serde_json::to_string(prev).unwrap();
+ let new = serde_json::to_string(current).unwrap();
+ println!(
+ "{k}: {:#?}",
+ json_diff::compare_jsons(&old, &new).unwrap()
+ );
+ }
+ }
+ None => {
+ println!("not seen before: {}", k);
+ }
+ }
+ }
+
+ for (k, v) in previous_dump {
+ // add any previous entries that are missing to current dump
+ self.dump.entry(k).or_insert(v);
+ }
+ };
+ std::fs::write(dump_path, serde_json::to_vec(&self.dump).unwrap()).unwrap();
+ }
Ok(success.then(|| tasks_done))
}
If the extra work were gated behind a tool like explain so the extra costs were only paid when debugging, could that be a possible solution to the all-we-have-is-a-hash problem?
from n2.
Ok, I did this. I made it indirect through a trait so the debug-printer and real calculation are guaranteed to cover the same data.
Output looks like:
$ ./target/release/n2 -d explain -C ~/projects/llvm-project-16.0.0.src/build lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o
explain: build.ninja:660: input changed
in:
1687461853117 /Users/evmar/projects/llvm-project-16.0.0.src/llvm/lib/Support/Caching.cpp
discovered:
1679055967000 /Users/evmar/projects/llvm-project-16.0.0.src/llvm/include/llvm/Support/Caching.h
1679055967000 /Users/evmar/projects/llvm-project-16.0.0.src/llvm/include/llvm/Support/Error.h
[...]
from n2.
Thanks Evan, this is great - I already found myself reaching for it today while debugging a build issue.
After those initial migration issues I mentioned previously, everything has been smooth sailing, and I'm still loving the new status output. Wish I'd made the switch earlier!
from n2.
Related Issues (20)
- Does it need sh? HOT 2
- Variable expansion missing global reference
- Difference in behavior with ninja when not specifying a target
- Inconsistent handling of newline escapes HOT 2
- Variable references in build inputs don't take into account the build's bindings HOT 5
- Oversized WriteBuf panic HOT 3
- Validation mode HOT 3
- proper `console` pool support
- better control over subcommand output HOT 2
- `e2e::validations::build_starts_before_validation_finishes` fails every so often HOT 2
- Tag a release (candidate)? HOT 1
- Difficulties with git-lfs HOT 4
- Give priority to slow actions HOT 2
- 10% of parse time is tracking line numbers HOT 4
- Add multithreaded parsing HOT 5
- mmap input files rather than reading HOT 5
- dry-run mode
- mutable reference to mutable static detected bu rustc HOT 1
- Segmentfault when building fastfetch HOT 5
- Shell-escape $in, $in_newline, $out, and $out_newline
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from n2.