Question: re-adding `DerivedFileUnit`

Hello compiler experts,

A long time ago, there existed a DerivedFileUnit class for storing synthesized module-level declarations. However, it was removed in this commit by @Douglas_Gregor in 2016:

[Cleanup] Eliminate DerivedFileUnit.

The only client of DerivedFileUnit was synthesized global '=='
operators for Equatable conformances, which has since been removed.

Now, differentiation in Swift is in need of something like DerivedFileUnit: the differentiation SIL transform synthesizes auxiliary struct/enum data structures and needs to add them somewhere.

So far, we've been using a hack to find an appropriate SourceFile for adding these structs/enums:

// In lib/SILOptimizer/Mandatory/Differentiation.cpp:

  /// Retrieves the file unit that contains implicit declarations in the
  /// current Swift module. If it does not exist, create one.
  ///
  // FIXME: Currently it defaults to the file containing `origFn`, if it can be
  // determined. Otherwise, it defaults to any file unit in the module. To
  // handle this more properly, we should make a DerivedFileUnit class to
  // contain all synthesized implicit type declarations.
  SourceFile &getDeclarationFileUnit() {
    if (original->hasLocation())
      if (auto *declContext = original->getLocation().getAsDeclContext())
        if (auto *parentSourceFile = declContext->getParentSourceFile())
          return *parentSourceFile;
    for (auto *file : original->getModule().getSwiftModule()->getFiles())
      if (auto *src = dyn_cast<SourceFile>(file))
        return *src;
    llvm_unreachable("No files?");
  }

But this logic recently resulted in an AST serialization crash:

// In lib/Serialization/Serialization.cpp:

bool Serializer::isDeclXRef(const Decl *D) const {
  const DeclContext *topLevel = D->getDeclContext()->getModuleScopeContext();
  llvm::errs() << "Serializer::isDeclXRef\n";
  D->dump();
  llvm::errs() << "TOP LEVEL\n";
  topLevel->dumpContext();
  llvm::errs() << "\n";
  if (auto *decl = topLevel->getAsDecl())
    decl->dump();
  if (topLevel->getParentModule() != M)
    return true;
  if (!SF || topLevel == SF)
    return false;
  // Special-case for SIL generic parameter decls, which don't have a real
  // DeclContext.
  if (!isa<FileUnit>(topLevel)) {
    // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    // This assertion failure triggers. We need another special case to
    // handle structs/enums synthesized during the differentiation SIL
    // transform.
    // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    assert(isa<GenericTypeParamDecl>(D) && "unexpected decl kind");
    return false;
  }
  return true;
}
Assertion failed: (isa<GenericTypeParamDecl>(D) && "unexpected decl kind"), function isDeclXRef, file /Users/danielzheng/swift-tf/swift/lib/Serialization/Serialization.cpp, line 2043.

What's the most robust fix for this issue? (To which FileUnit should structs/enums synthesized during a SILOptimizer transform be added?)

My guess is that reviving DerivedFileUnit may be a good solution. TF-623 tracks this issue.

Can you elaborate on what the differentiation transform is supposed to do? What are the inputs and what types are created as a result of that?

Sure! (The differentiation transform isn't documented yet, but will be.)

The differentiation transform implements automatic differentiation as a SIL function transformation: it generates derivative functions for "original functions" without registered derivatives.

Auxiliary data structures are:

  • Differential structs (struct of partially-applied differential closure values).
  • Predecessor enums (enum representing basic block predecessors, records function execution path to implement "reverse-mode" differentiation).
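To make the two data structures a little more concrete, here is a toy C++ sketch of the idea, not what the transform actually emits. Every name in it (`Predecessor`, `Trace`, `forwardWithTrace`, `gradientAt`) is made up for illustration. C++ enums cannot carry payloads, so a tag plus a closure stands in for an enum case holding a struct of closures.

```cpp
#include <cassert>
#include <functional>

// Hypothetical "predecessor enum": one case per basic block that can
// branch to the exit block of the function below.
enum class Predecessor { Then, Else };

// Hypothetical stand-in for the payload of an enum case: the derivative
// closure captured while executing that block.
struct Trace {
  Predecessor from;
  std::function<double(double)> blockPullback;
};

struct Result {
  double value;
  Trace trace;
};

// Original function: f(x) = x > 0 ? x * x : 3 * x.
// The augmented version also records which block was executed.
Result forwardWithTrace(double x) {
  if (x > 0) {
    // d(x * x)/dx = 2x, captured at the point where x is known.
    return Result{x * x,
                  Trace{Predecessor::Then,
                        [x](double seed) { return 2 * x * seed; }}};
  }
  // d(3 * x)/dx = 3.
  return Result{3 * x,
                Trace{Predecessor::Else,
                      [](double seed) { return 3 * seed; }}};
}

// Reverse pass: replay the recorded path instead of re-evaluating the branch.
double gradientAt(double x) {
  Result result = forwardWithTrace(x);
  return result.trace.blockPullback(1.0);
}
```

With this sketch, `gradientAt(2.0)` takes the `Then` path and yields `4.0`, while `gradientAt(-1.0)` takes the `Else` path and yields `3.0`. The point is only that the recorded case tells the reverse pass which block's closures to call.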

The details are a bit involved, and much of it isn't relevant. Here's a Gist with some slides explaining the data structures and transformation.


I don't believe the details of the differentiation transform are particularly important here. To summarize: type declarations synthesized during the SIL transform need to be added to some (top-level?) context. Currently, they're hackily added to a SourceFile, leading to an AST serialization assertion failure (see above). I wonder whether reviving DerivedFileUnit, and adding the type declarations to it, makes sense.

It sounds like the key input, then, is one function written in source with some attribute (@differentiable or whatever). That should mean you always have a "home" SourceFile to put declarations in: the one the function lives in. Is that not the case?

Under what conditions do you not have a valid location in your "hackily added to a SourceFile" logic?

To answer both of your questions: getDeclarationFileUnit is always able to get a valid SourceFile. The problem is this assertion failure later during AST serialization.

bool Serializer::isDeclXRef(const Decl *D) const {
  const DeclContext *topLevel = D->getDeclContext()->getModuleScopeContext();

  // DEBUG STATEMENTS.
  llvm::errs() << "TOP LEVEL:\n";
  topLevel->dumpContext();
  llvm::errs() << "SF:\n";

  if (topLevel->getParentModule() != M)
    return true;
  if (!SF || topLevel == SF)
    return false;
  // Special-case for SIL generic parameter decls, which don't have a real
  // DeclContext.
  if (!isa<FileUnit>(topLevel)) {
    // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    // This assertion failure triggers. We need another special case to
    // handle structs/enums synthesized during the differentiation SIL
    // transform.
    // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    assert(isa<GenericTypeParamDecl>(D) && "unexpected decl kind");
    return false;
  }
  return true;
}
FAIL: Swift(macosx-x86_64) :: AutoDiff/ast_serialization.swift (10 of 58)
...
TOP LEVEL:
0x7f87f301f848 Module name=main
SF:
(source_file "tf-623.swift"
  ...)

AST SERIALIZATION FAILURE!
(parameter 'anonname=0x7fba268792e0' interface type='_AD__$s17ast_serialization6TF_623yS2fF_bb0__PB__src_0_wrt_0')
Assertion failed: (isa<GenericTypeParamDecl>(D) && "unexpected decl kind"), function isDeclXRef, file /Users/danielzheng/swift-build/swift/lib/Serialization/Serialization.cpp, line 2066.

7  swiftc                   0x000000010df25713 swift::serialization::Serializer::isDeclXRef(swift::Decl const*) const (.cold.2) + 35
8  swiftc                   0x000000010aa62ce0 swift::serialization::Serializer::addTypeRef(swift::Type) + 0
9  swiftc                   0x000000010aa62855 swift::serialization::Serializer::addDeclRef(swift::Decl const*, bool) + 101
10 swiftc                   0x000000010aa874af swift::serialization::Serializer::DeclSerializer::writeParameterList(swift::ParameterList const*) + 111
11 swiftc                   0x000000010aa80efb swift::serialization::Serializer::DeclSerializer::visitEnumElementDecl(swift::EnumElementDecl const*) + 1163
12 swiftc                   0x000000010aa6ad8e swift::serialization::Serializer::writeDecl(swift::Decl const*) + 350
13 swiftc                   0x000000010aa6c456 swift::serialization::Serializer::writeAllDeclsAndTypes() + 3846
14 swiftc                   0x000000010aa6d794 swift::serialization::Serializer::writeAST(llvm::PointerUnion<swift::ModuleDecl*, swift::SourceFile*>, bool) + 4212

You can see that topLevel == SF is false according to the debug statements. Note that there's "Special-case for SIL generic parameter decls" logic in Serializer::isDeclXRef: I made a temporary hack patch expanding that condition to unblock progress.

I wonder if there's some way to change the SourceFile calculation logic so that AST serialization passes (i.e. so that topLevel == SF)?

You didn't actually put your new declarations in the SourceFile, though, if topLevel is a Module.

I'm sorry, I don't fully follow. Could you spell out the problem please (and the fix, if it's clear to you)?

getDeclarationFileUnit() returns a SourceFile and SourceFile::addVisibleDecl is used to add the type declarations in the SIL transform. So it seems to me that the type declarations are added to the SourceFile.

…and oh, that's a parameter. Those are generally created with a dummy DeclContext, but reparented to the FuncDecl (or SubscriptDecl, or whatever) that they're attached to later on. That must be the step you skipped, even if the function and types and stuff are created correctly.
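A toy model of that situation, in case it helps. This is not the real Swift class hierarchy: `ToyContext`, `ToyParam`, `classify`, and `reparent` are all hypothetical, and the sketch assumes a single module and a dummy context that is the module itself (which is what the debug output above suggests). It only mirrors the shape of the checks quoted from `isDeclXRef`.

```cpp
#include <cassert>

// Toy stand-ins for the compiler's context kinds; NOT the real Swift classes.
enum class ContextKind { Module, File, Function };

struct ToyContext {
  ContextKind kind;
  const ToyContext *parent; // nullptr for the module
};

struct ToyParam {
  const ToyContext *context;
};

// Walk outward past function-like contexts until a file or the module is hit.
const ToyContext *moduleScopeContext(const ToyContext *dc) {
  while (dc->kind == ContextKind::Function)
    dc = dc->parent;
  return dc;
}

enum class XRefResult { Local, CrossReference, UnexpectedDecl };

// Simplified version of the quoted checks, ignoring other modules and the
// whole-module (no current file) case.
XRefResult classify(const ToyParam &param, const ToyContext *currentFile) {
  const ToyContext *topLevel = moduleScopeContext(param.context);
  if (topLevel == currentFile)
    return XRefResult::Local;
  if (topLevel->kind != ContextKind::File)
    return XRefResult::UnexpectedDecl; // where the real assertion fires
  return XRefResult::CrossReference;
}

// Hypothetical reparenting step: attach the parameter to its owning function.
void reparent(ToyParam &param, const ToyContext &function) {
  param.context = &function;
}
```

In this model, a parameter whose context is still the module classifies as `UnexpectedDecl`. After `reparent` attaches it to a function that lives in the current file, the same parameter classifies as `Local`.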

I suggest running the ASTVerifier on your synthesized types in +Asserts builds.

How do I run the ASTVerifier? Should I expect it to reveal issues in the AST other than the serialization failure?

Thanks for your help 🙂

Very possibly. :-) You can call swift::verify(theDecl). Note that this is already recursive, so you don't need to do it separately for members if they've already been created.
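To illustrate what "already recursive" buys you, here is a toy verifier over a made-up `ToyDecl` tree. The checks the real ASTVerifier performs are not shown in this thread; the parent-pointer check below is an assumption chosen to match the parameter issue, and `ToyDecl` and `verifyToy` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy declaration node; NOT the real swift::Decl.
struct ToyDecl {
  std::string name;
  const ToyDecl *parent = nullptr;
  std::vector<const ToyDecl *> members;
};

// Recursively checks that every member points back at the decl that owns it,
// appending the name of each offender to `problems`.
void verifyToy(const ToyDecl &decl, std::vector<std::string> &problems) {
  for (const ToyDecl *member : decl.members) {
    if (member->parent != &decl)
      problems.push_back(member->name);
    verifyToy(*member, problems); // descend, so callers verify only the root
  }
}
```

Calling `verifyToy` once on the outermost synthesized declaration is enough to reach a nested parameter whose parent was never set, which is the kind of skipped step described above.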

It looks like AbstractFunctionDecl::setParameters is already supposed to be setting the DeclContext for parameters, so you can look into why that's not happening for your case.