Comments (5)
@aadityasondhi did you recently fix a similar issue?
from cockroach.
This one fails on sm.WALBytesIn
.
from cockroach.
This is a little confusing to me. In pebble, this metric is an atomic that we only increment using a uint
.
Here is where it is set.
https://github.com/cockroachdb/pebble/blob/200f9cf1e217afc4e9052891126be661d757007a/db.go#L1974
Here is where it is incremented.
https://github.com/cockroachdb/pebble/blob/200f9cf1e217afc4e9052891126be661d757007a/db.go#L973
from cockroach.
Is it perhaps possible that CockroachDB collects and updated metrics in parallel? It's possible that two goroutines get the metrics at around the same time but then they update the counters in the reverse order.
Indeed, I see another goroutine inside ComputeMetrics
in the test failure:
github.com/cockroachdb/cockroach/pkg/kv/kvserver/replicastats.(*ReplicaStats).AverageRatePerSecond(0xc0056aa6c0, {0x17ddc3a65714f7c7?, 0xc008afa4f0?, 0x0?})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/replicastats/replica_stats.go:357 +0x22b
github.com/cockroachdb/cockroach/pkg/kv/kvserver/load.(*ReplicaLoad).getLocked(0xc008afa4e0, 0x3)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/load/replica_load.go:184 +0x98
github.com/cockroachdb/cockroach/pkg/kv/kvserver/load.(*ReplicaLoad).Stats(0xc008afa4e0)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/load/replica_load.go:199 +0x12a
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).updateReplicationGauges.func1(0xc00a8b1608)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:3282 +0x872
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*storeReplicaVisitor).Visit(0xc005230b40, 0xc013cf8eb8)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:556 +0x326
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).updateReplicationGauges(0xc005a60c08, {0xe06fb98, 0xc006a0db00})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:3231 +0x92f
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).computeMetrics(_, {_, _})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:3471 +0xe6
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).ComputeMetrics(...)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:3590
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).WaitForFullReplication.func2(0xc005a60c08)
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:1443 +0x16e
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Stores).VisitStores.func1(0x97c62c0?, 0xc005a60c08)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/stores.go:150 +0x58
github.com/cockroachdb/cockroach/pkg/util/syncutil.(*IntMap).Range(0xc003ff4db8, 0xc013cf9710)
github.com/cockroachdb/cockroach/pkg/util/syncutil/int_map.go:385 +0x19d
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Stores).VisitStores(0xc003ff4d80, 0xc013cf9928)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/stores.go:149 +0x8e
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).WaitForFullReplication(0xc004227808)
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:1432 +0x5ae
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backuptestutils.StartBackupRestoreTestCluster({0xe0facf0, 0xc0022ba680}, 0x4, {0xc002f58c30, 0x3, 0x4b14d0?})
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backuptestutils/testutils.go:162 +0x967
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.backupRestoreTestSetupWithParams({_, _}, _, _, _, {{{{0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...}, ...})
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/utils_test.go:78 +0x1bf
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.TestBackupRestoreExecLocality(0xc0022ba680)
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backup_test.go:477 +0x8da
I think we should add some synchronization around the entire ComputeMetrics
operation.
from cockroach.
ccl/backupccl.TestBackupRestoreExecLocality failed on master @ c557fb59f6aec659d364e9002fc083c59c6392b6:
Fatal error:
panic: Counters should not decrease [recovered]
panic: Counters should not decrease [recovered]
panic: Counters should not decrease [recovered]
panic: Counters should not decrease
Stack:
goroutine 52 [running]:
testing.tRunner.func1.2({0x95e58a0, 0xe02eb10})
GOROOT/src/testing/testing.go:1631 +0x3f7
testing.tRunner.func1()
GOROOT/src/testing/testing.go:1634 +0x6b6
panic({0x95e58a0?, 0xe02eb10?})
GOROOT/src/runtime/panic.go:770 +0x132
github.com/cockroachdb/cockroach/pkg/util/leaktest.AfterTest.func2()
github.com/cockroachdb/cockroach/pkg/util/leaktest/leaktest.go:133 +0x4fa
panic({0x95e58a0?, 0xe02eb10?})
GOROOT/src/runtime/panic.go:770 +0x132
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).Start.func1()
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:393 +0x99
panic({0x95e58a0?, 0xe02eb10?})
GOROOT/src/runtime/panic.go:770 +0x132
github.com/cockroachdb/cockroach/pkg/util/metric.(*Counter).Update(0xc006905920, 0x998c0ba)
github.com/cockroachdb/cockroach/pkg/util/metric/pkg/util/metric/metric.go:755 +0x6d
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*StoreMetrics).updateEngineMetrics(_, {0xc00e986408, {0x998c0ba, 0x2fe7c34, 0x55320878, 0x6a35, 0xc897, 0x6969, 0x140d1}, {0x2d16, ...}, ...})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/metrics.go:3790 +0xfd7
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).computeMetrics(_, {_, _})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:3487 +0x1d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).ComputeMetrics(...)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store.go:3599
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).WaitForFullReplication.func2(0xc00692ac08)
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:1443 +0x16e
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Stores).VisitStores.func1(0x97e6560?, 0xc00692ac08)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/stores.go:150 +0x58
github.com/cockroachdb/cockroach/pkg/util/syncutil.(*IntMap).Range(0xc004067988, 0xc01245cee0)
github.com/cockroachdb/cockroach/pkg/util/syncutil/int_map.go:385 +0x19d
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Stores).VisitStores(0xc004067950, 0xc016db10f8)
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/stores.go:149 +0x8e
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).WaitForFullReplication(0xc003c78308)
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:1432 +0x5ae
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).Start(0xc003c78308, {0xe0db2d0, 0xc003c45a00})
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:456 +0xa38
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.StartTestCluster({_, _}, _, {{{{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...}, ...})
github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:238 +0x92
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backuptestutils.StartBackupRestoreTestCluster({0xe126fb0, 0xc003c45a00}, 0x4, {0xc00418ec30, 0x3, 0x4b14d0?})
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backuptestutils/testutils.go:130 +0x313
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.backupRestoreTestSetupWithParams({_, _}, _, _, _, {{{{0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...}, ...})
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/utils_test.go:78 +0x1bf
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.TestBackupRestoreExecLocality(0xc003c45a00)
github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backup_test.go:477 +0x8da
testing.tRunner(0xc003c45a00, 0xa6f5fd8)
GOROOT/src/testing/testing.go:1689 +0x21f
created by testing.(*T).Run in goroutine 1
GOROOT/src/testing/testing.go:1742 +0x826
Log preceding fatal error
=== RUN TestBackupRestoreExecLocality
test_log_scope.go:170: test logs captured to: outputs.zip/logTestBackupRestoreExecLocality1780957092
test_log_scope.go:81: use -show-logs to present logs inline
testcluster.go:393: -- test log scope end --
ERROR: a panic has occurred!
Details cannot be printed yet because we are still unwinding.
Hopefully the test harness prints the panic below, otherwise check the test logs.
test logs left over in: outputs.zip/logTestBackupRestoreExecLocality1780957092
testcluster.go:393: panic: Counters should not decrease
--- FAIL: TestBackupRestoreExecLocality (102.19s)
Parameters:
attempt=1
race=true
run=3
shard=22
This test on roachdash | Improve this report!
from cockroach.
Related Issues (20)
- roachtest: pgjdbc failed HOT 1
- roachtest: pgjdbc failed HOT 1
- crosscluster/logical: PTS protects too much HOT 1
- protectedts: MakeSchemaObjectsTarget should protect the schema object's metadata HOT 1
- raft: handle snapshots with outdated term HOT 1
- raft: incorrect term on delegated snapshots HOT 1
- raft: put MsgHeartbeat.Match on the wire HOT 1
- roachtest: replicate/wide failed HOT 1
- roachtest: follower-reads/mixed-version/single-region failed HOT 2
- ldap: scalability testing
- Update the `changefeed.sink_io_workers` cluster setting description for sinks HOT 2
- sql: delete's can inject synthetic check constraints for internal expressions HOT 2
- roachtest: update mixedversion tests to work in multi-tenant deployments HOT 2
- changefeedccl: investigate whether the kafka resizing behaviour is correct HOT 2
- pkg/cli: TestDemoLocality breaks TestCustomQuery HOT 1
- changefeedccl: unredact errors in fetching span stats while fetching usage bytes HOT 2
- sql/catalog/tabledesc: TestIndexDoesNotStorePrimaryKeyColumnMixedVersion failed
- schema: expand coverage of zone configs within logic tests
- pkg/sql/logictest/tests/cockroach-go-testserver-23.2/cockroach-go-testserver-23_2_test: TestLogic_mixed_version_bootstrap_tenant failed HOT 1
- ccl/crosscluster/physical: TestDataDriven failed
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cockroach.