Crash when a swift's type pointer is null

Hi @Adrian_Prantl, @dcci and @Slava_Pestov. I took a look at this bug, and found the culprit.

So the reason LLDB is crashing is because of a null pointer. At SwiftUserExpression::ScanContext, we have a dyn_cast of null:

  if (auto *ref_type = llvm::dyn_cast<swift::ReferenceStorageType>(
          GetSwiftType(self_type).getPointer())) { // GetSwiftType(self_type).getPointer() returns null

So by simply changing it to dyn_cast_or_null we stop crashing, but now we print this in the console:

error: <EXPR>:1:11: error: use of undeclared type '$__lldb_context'
extension $__lldb_context {
          ^~~~~~~~~~~~~~~

error: <EXPR>:19:5: error: use of unresolved identifier '$__lldb_injected_self'
    $__lldb_injected_self.$__lldb_wrapped_expr_0(
    ^~~~~~~~~~~~~~~~~~~~~

Which is probably not ideal either.

So what is actually happening inside GetSwiftType is that SwiftASTContext::ReconstructType doesn't find the type, and return a swift::Type whose pointer is null.

While the crash is fixed, I think I should try to print a better error message, but I don't know the exact way to go about it.

The first thing I noticed is that the method SwiftASTContext::ReconstructType receives a Status variable, but if a type is found in the negative cache, we don't set the Status to anything, while the first time a type is not found the error is set. Is that intentional?

  if (m_negative_type_cache.Lookup(mangled_cstr)) {
    LOG_PRINTF(LIBLLDB_LOG_TYPES, "(\"%s\") -- found in the negative cache",
               mangled_cstr);
    return {}; // Status variable not set to some error when found inside negative cache
  }

I thought it would be a good idea to set the status here as well, but if I do that I get some new output that wasn't happening before:

(lldb) run NIOTests.ChannelTests/testHalfClosure
Process 9175 launched: '/home/augusto/Developer/swift-nio/.build/debug/swift-nioPackageTests.xctest' (x86_64)
Test Suite 'Selected tests' started at 2020-03-27 18:05:51.987
Test Suite 'ChannelTests' started at 2020-03-27 18:05:51.988
Test Case 'ChannelTests.testHalfClosure' started at 2020-03-27 18:05:51.988
/home/augusto/Developer/swift-nio/.build/debug/swift-nioPackageTests.xctest: Cannot load Swift type information; AST validation error in "NIO": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOConcurrencyHelpers": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOConcurrencyHelpersTests": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOFoundationCompat": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOHTTP1": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOHTTP1Tests": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOTLS": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOTLSTests": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOTestUtils": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOTestUtilsTests": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOTests": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOWebSocket": The module file format is too old to be used by this version of the debugger.AST validation error in "NIOWebSocketTests": The module file format is too old to be used by this version of the debugger.AST validation error in "swift_nioPackageTests": The module file format is too old to be used by this version of the debugger.

So I'm not sure if this is desirable or not.

If you guys think it's desirable, my second question is how to actually go about reporting this to the user. Function TypeSystemSwiftTypeRef::GetSwiftType (SwiftASTContext) passes a Status reference to ReconstructType, but doesn't check it at all.

   ...
  Status error;
  // FIXME: Suboptimal performance, because the ConstString is looked up again.
  ConstString mangled_name(
      reinterpret_cast<const char *>(compiler_type.GetOpaqueQualType()));
  return ts->m_swift_ast_context->ReconstructType(mangled_name, error); // error ignored

ScanContext also receives a reference to a Status, so I tested passing it down from ScanContext all the way to ReconstructType, and that kinda works:

error: warning: type for typename "$s3NIO19SelectableEventLoopCD" was not found

error: <EXPR>:1:11: error: use of undeclared type '$__lldb_context'
extension $__lldb_context {
          ^~~~~~~~~~~~~~~

error: <EXPR>:19:5: error: use of unresolved identifier '$__lldb_injected_self'
    $__lldb_injected_self.$__lldb_wrapped_expr_0(
    ^~~~~~~~~~~~~~~~~~~~~

Maybe that's not correct though because we still get those $__lldb errors. I noticed that StoringDiagnosticConsumer::PrintDiagnostics only includes those messages if a better error message was not found:

    // In general, we don't want to see diagnostics from outside of
    // the source text range of the actual user expression. But if we
    // didn't find any diagnostics in the text range, it's probably
    // because the source range was not specified correctly, and we
    // don't want to lose legit errors because of that. So in that
    // case we'll add them all here:
    if (!added_one_diagnostic) {
      // This will report diagnostic errors from outside the
      // expression's source range. Those are not interesting to
      // users, so we only emit them in debug builds.
      for (const RawDiagnostic &diagnostic : m_diagnostics) {
         // adds diagnostics here
      }
    }

So my second idea was to check the error inside TypeSystemSwiftTypeRef::GetSwiftType, and somehow convert it to a RawDiagnostic and store it so it's dealt in PrintDiagnostics later.

  ...
  Status error;
  // FIXME: Suboptimal performance, because the ConstString is looked up again.
  ConstString mangled_name(
      reinterpret_cast<const char *>(compiler_type.GetOpaqueQualType()));
  auto type = ts->m_swift_ast_context->ReconstructType(mangled_name, error);
  if (!error.Success()) {
    m_diagnostics.push_back(RawDiagnostic(error)); // This constructor doesn't exist, just an example
  }
  return type;

But thinking a little bit more, this might not be a good idea, since GetSwiftType can potentially be called multiple times, so we'd add duplicate error messages if we did this the way I was thinking.

So this is where I'm at right now, and I'm not sure in what way I should proceed to give the user a better error message. What do you guys think?

Hey @augusto2112, thanks a lot for looking at this. I think it makes sense to not report an error each time there's a hit in the negative cache, otherwise the debugger output would be spammy. The first time there's a type reconstruction error, I think the ideal output would look like the first line of what you had: "warning: type for typename "$s3NIO19SelectableEventLoopCD" was not found". This explains the failure perfectly and gives the user something helpful to put into a bug report. We ought to suppress the raw messages (like "use of undeclared type '$__lldb_context'") in this situation as they're not fundamentally helpful.

Hi Vedant!

I think it makes sense to not report an error each time there's a hit in the negative cache, otherwise the debugger output would be spammy

Yeah, that makes sense.

We ought to suppress the raw messages (like "use of undeclared type '__lldb_context'") in this situation as they're not fundamentally helpful

If we're only reporting the error on a cache miss, do you think it makes sense to set these on callers of ReconstructType, like I did for GetSwiftType, in the SwiftASTContext class? That would show the message only once and get right of the raw messages (at least the first time they'd display). If so, is there a way to transform a Status object into a RawDiagnostic one? The only constructor receives a bunch of parameters that I'm not familiar with.

I don't think there's a clean way to represent a Status as a RawDiagnostic, as the latter is specifically for modeling compiler diagnostics (with a sensible file:line location).

Have you considered adding a 'SuppressDiagnosticsOutsideOfTextRange()' method to StoringDiagnosticConsumer? With this in hand, you can propagate errors via Status objects as needed and quiet the unhelpful raw diagnostics.

Have you considered adding a 'SuppressDiagnosticsOutsideOfTextRange()' method to StoringDiagnosticConsumer? With this in hand, you can propagate errors via Status objects as needed and quiet the unhelpful raw diagnostics.

Hmm I see. Ok, I'll try doing that!

So I kinda got it working, however my trouble now is how to bubble up this error message in all cases... The problem is ReconstructType is called in many different places.

For example, TypeSystemSwiftTypeRef has a helper function which simply discards the error:

void *TypeSystemSwiftTypeRef::ReconstructType(void *type) {
  Status error;
  return m_swift_ast_context->ReconstructType(GetMangledTypeName(type), error);
}

This function is called in 80 different places inside the file... It'd be good if we could capture the errors inside the SwiftASTContext.cpp file, no?

Maybe we could keep an array of Diagnostic around, like DiagnosticManager does, and insert this list in the DiagnosticManager inside the function PrintDiagnostics?

What do you think?

If we ignore the calls to the helper function, there are only two other places that call ReconstructType(ConstString mangled_typename, Status &error), and where I could bubble up the errors, which I'd need to finish this issue. So I could just do that instead.

This probably has less of a chance of breaking existing code.

It's unfortunate that lldb simply discards useful Status objects like this.. this is serious technical debt. I think we could address this by adopting llvm's mandatory checked-Error class in SwiftASTContext.cpp. But doing so would be a fairly major cleanup.

Your suggestion of stashing certain error messages inside of StoringDiagnosticConsumer sounds reasonable to me as a stopgap. StoringDiagnosticConsumer can track two types of diagnostics: RawDiagnostics (from the compiler, with file:line info) and a new 'generic error' diagnostic.

Ok! I'll try to get that working.

I finished implementing the functionality, and opened a pr here.

Now the error message is:

(lldb) e ev
Unable to load module 'NIO'.
Unable to load module 'NIOConcurrencyHelpers'.
Unable to load module 'NIOConcurrencyHelpersTests'.
Unable to load module 'NIOFoundationCompat'.
Unable to load module 'NIOHTTP1'.
Unable to load module 'NIOHTTP1Tests'.
Unable to load module 'NIOTLS'.
Unable to load module 'NIOTLSTests'.
Unable to load module 'NIOTestUtils'.
Unable to load module 'NIOTestUtilsTests'.
Unable to load module 'NIOTests'.
Unable to load module 'NIOWebSocket'.
Unable to load module 'NIOWebSocketTests'.
Unable to load module 'swift_nioPackageTests'.
error: type for typename "$s3NIO19SelectableEventLoopCD" was not found
error: type for typename "$s3NIO13SelectorEventVyAA15NIORegistrationOGD" was not found
error: type for typename "$s21NIOConcurrencyHelpers6AtomicCySiGD" was not found
error: type for typename "$s3NIO11NIODeadlineVD" was not found
error: type for typename "$s3NIO22ThreadSpecificVariableCyAA19SelectableEventLoopCGD" was not found

And subsequent evaluations (these only show up in debug builds):

(lldb) e ev
error: <EXPR>:1:11: error: use of undeclared type '$__lldb_context'
extension $__lldb_context {
          ^~~~~~~~~~~~~~~

error: <EXPR>:19:5: error: use of unresolved identifier '$__lldb_injected_self'
    $__lldb_injected_self.$__lldb_wrapped_expr_1(
    ^~~~~~~~~~~~~~~~~~~~~

This would be what we wanted right?