Around Feb 3, 'cabal test angle' started failing deterministically on arm64 builds. All other angle tests pass.
The query asks for facts of the wrong type, using a literal fact id. The id is a valid fact for a Cxx.Name, but we query for it as a Cxx.Type. This is expected to raise a catchable exception in Haskell with "fact has the wrong type" mentioned somewhere in the message, by throwing a std::runtime_error via https://github.com/facebookincubator/Glean/blob/main/glean/rts/error.cpp#L24
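For reference, the intended behaviour can be sketched in isolation: a C++ helper throws std::runtime_error, and a C entry point catches it and hands back a message, so the foreign caller sees an ordinary error rather than a C++ exception. This is only a sketch of the general pattern, not Glean's actual code; raise_error, check_fact_type and checked_query are hypothetical names, and the integer predicate ids are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the RTS error helper: always throws.
[[noreturn]] void raise_error(const std::string& msg) {
  throw std::runtime_error(msg);
}

// Hypothetical query step that rejects a fact whose predicate id
// does not match the one the query asked for.
void check_fact_type(int expected_pid, int actual_pid) {
  if (expected_pid != actual_pid) {
    raise_error("fact has the wrong type");
  }
}

// C entry point as a foreign caller would see it. No exception may cross
// this boundary, so every C++ exception is turned into a status code plus
// a message copied into the caller's buffer.
// Returns 0 on success and 1 on error.
extern "C" int checked_query(int expected_pid, int actual_pid,
                             char* err, std::size_t err_len) noexcept {
  assert(err != nullptr || err_len == 0);
  try {
    check_fact_type(expected_pid, actual_pid);
    return 0;
  } catch (const std::exception& e) {
    std::snprintf(err, err_len, "%s", e.what());
    return 1;
  } catch (...) {
    std::snprintf(err, err_len, "%s", "unknown C++ exception");
    return 1;
  }
}
```

With a 64-byte buffer buf, checked_query(1, 2, buf, sizeof buf) returns 1 and leaves "fact has the wrong type" in buf. In the failing ARM runs, going by the gdb trace, the throw ends in std::terminate before any handler of this kind gets to run.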
On ARM we now get a SIGABRT in the following runQuery:
  -- Literal fact ID with the wrong type
  r <- try $ runQuery_ env repo $ modify $ angle @Cxx.Type $
    "$cxx1.Type " <> factId (head names)
  assertBool "angle - fact id with wrong type" $
    case r of
      Left (SomeException x) -> "fact has the wrong type" `isInfixOf` show x
        -- it's actually a GleanFFIError when running locally,
        -- and probably a Thrift ApplicationError when running
        -- remotely. It should really be a BadQuery though.
      _ -> False
or as a standalone example:
(r :: Either Glean.Types.Exception [Cxx.Type]) <- try $ runQuery_ env repo $ angle @Cxx.Type "$cxx1.Type 1026"
It also reproduces in the shell on ARM, using the example.angle schema:
facts> $1026 : example.Parent
fact has the wrong typeterminate called after throwing an instance of 'std::runtime_error'
terminate called recursively
*** Aborted at 1645421878 (Unix time, try 'date -d @1645421878') ***
*** Signal 6 (SIGABRT) (0x45f8) received by PID 17912 (pthread TID 0xffffa0dc04f0) (linux TID 17912) (maybe from PID 17912, UID 0) (code: -6), stack trace: ***
/root/.hsthrift/lib/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer21SafeStackTracePrinter15printStackTraceEb+0x47)[0xffffa4c4a4c7]
/root/.hsthrift/lib/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer21SafeStackTracePrinter15printStackTraceEb+0x47)[0xffffa4c4a4c7]
/lib/aarch64-linux-gnu/libc.so.6(+0xe7)[0xffffa14340e7]
(safe mode, symbolizer not available)
Aborted (core dumped)
Minimal test case:
https://gist.github.com/donsbot/08302e0ff4ba3c3d580129c04df67584
Correct execution (e.g. on x86) generates a raise / raiseError call which is then caught.
fact has the wrong type
Stack trace:
facebook::glean::rts::raiseError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > facebook::glean::rts::error<>(folly::Range<char const*>)
facebook::glean::rts::Subroutine::execute(unsigned long const*) const
On ARM, though, the exception thrown by raiseError isn't caught; instead it gets rethrown:
terminate called after throwing an instance of 'std::runtime_error'
terminate called recursively
In gdb, the trace suggests that something throws again while the first exception is being raised. It's still the same execution path.
Thread 1 "barebones" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
And folly.git/folly/experimental/exception_tracer/ExceptionTracerLib.cpp gets entered twice (frames #11 and #6), once via __cxa_throw and once via __cxa_rethrow:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000ffffab3f0d68 in __GI_abort () at abort.c:79
#2 0x0000ffffac2af868 in __gnu_cxx::__verbose_terminate_handler() () from /lib/aarch64-linux-gnu/libstdc++.so.6
#3 0x0000ffffac2ad21c in ?? () from /lib/aarch64-linux-gnu/libstdc++.so.6
#4 0x0000ffffac2ad280 in std::terminate() () from /lib/aarch64-linux-gnu/libstdc++.so.6
#5 0x0000ffffac2ad5e0 in __cxa_rethrow () from /lib/aarch64-linux-gnu/libstdc++.so.6
#6 0x0000ffffaeda3164 in __cxxabiv1::__cxa_rethrow ()
at /tmp/fbcode_builder_getdeps-ZrootZgleanZGleanZhsthriftZbuildZfbcode_builder-root/repos/github.com-facebook-folly.git/folly/experimental/exception_tracer/ExceptionTracerLib.cpp:119
#7 0x0000ffffac2af804 in __gnu_cxx::__verbose_terminate_handler() () from /lib/aarch64-linux-gnu/libstdc++.so.6
#8 0x0000ffffac2ad21c in ?? () from /lib/aarch64-linux-gnu/libstdc++.so.6
#9 0x0000ffffac2ad280 in std::terminate() () from /lib/aarch64-linux-gnu/libstdc++.so.6
#10 0x0000ffffac2ad574 in __cxa_throw () from /lib/aarch64-linux-gnu/libstdc++.so.6
#11 0x0000ffffaeda2e70 in __cxxabiv1::__cxa_throw (thrownException=0x41010cd0, type=0x297cac0 <typeinfo for std::runtime_error>, destructor=<optimized out>)
at /tmp/fbcode_builder_getdeps-ZrootZgleanZGleanZhsthriftZbuildZfbcode_builder-root/repos/github.com-facebook-folly.git/folly/experimental/exception_tracer/ExceptionTracerLib.cpp:106
#12 0x0000000001135bec in facebook::glean::rts::raiseError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#13 0x0000000001129f64 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > facebook::glean::rts::error<>(folly::Range<char const*>) ()
#14 0x00000000011690e8 in facebook::glean::rts::Subroutine::execute(unsigned long const*) const ()
#15 0x000000000115d6cc in facebook::glean::rts::executeQuery(facebook::glean::rts::Inventory&, facebook::glean::rts::Define&, facebook::glean::rts::DefineOwnership*, facebook::glean::rts::Subroutine&, facebook::glean::rts::WordId<facebook::glean::rts::Pid_tag>, std::shared_ptr<facebook::glean::rts::Subroutine>, folly::Optional<unsigned long>, folly::Optional<unsigned long>, folly::Optional<unsigned long>, facebook::glean::rts::Depth, std::unordered_set<facebook::glean::rts::WordId<facebook::glean::rts::Pid_tag>, folly::hasher<facebook::glean::rts::WordId<facebook::glean::rts::Pid_tag>, void>, std::equal_to<facebook::glean::rts::WordId<facebook::glean::rts::Pid_tag> >, std::allocator<facebook::glean::rts::WordId<facebook::glean::rts::Pid_tag> > >&, bool, folly::Optional<facebook::glean::thrift::internal::QueryCont>) ()
#16 0x0000000001142cb0 in glean_query_execute_compiled ()
#17 0x0000000000fb3dfc in c6RWQ_info$def ()
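One way to get a cleaner signal here might be to replace the default terminate handler. In the trace above, libstdc++'s __verbose_terminate_handler (frame #7) rethrows the active exception in order to print it, and that rethrow (frames #6 and #5) is what ends in the second std::terminate. The sketch below is a diagnostic aid only, assuming a GCC or Clang toolchain that provides <cxxabi.h>; current_exception_type_name, quiet_terminate and install_quiet_terminate are hypothetical names, not part of Glean or folly. It reports the active exception type without rethrowing anything.

```cpp
#include <cxxabi.h>

#include <cstdio>
#include <cstdlib>
#include <exception>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Returns the demangled type name of the exception currently being handled,
// or "(none)" if there is no such exception. Never rethrows.
std::string current_exception_type_name() {
  std::type_info* t = abi::__cxa_current_exception_type();
  if (t == nullptr) {
    return "(none)";
  }
  int status = 0;
  char* demangled = abi::__cxa_demangle(t->name(), nullptr, nullptr, &status);
  // Fall back to the mangled name if demangling fails.
  std::string name =
      (status == 0 && demangled != nullptr) ? demangled : t->name();
  std::free(demangled);
  return name;
}

// Terminate handler that prints the exception type and aborts,
// without the rethrow that the default verbose handler performs.
[[noreturn]] void quiet_terminate() {
  std::fprintf(stderr, "terminate: active exception type: %s\n",
               current_exception_type_name().c_str());
  std::abort();
}

// Call once at startup, before running the failing query.
void install_quiet_terminate() {
  std::set_terminate(quiet_terminate);
}
```

If the abort still happens with this handler installed, but with only one "terminate" line, that would point at the original throw failing to unwind rather than at a second, unrelated exception. Whether the handler behaves the same with folly's exception tracer linked in is untested.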
So the problem is something about how we catch this exception, or how we set it up. Or possibly clang/gcc differences? Thoughts?