I'd like to use google/robotstxt (https://github.com/google/robotstxt), Google's robots.txt parser and matcher as a C++ library (C++11-compliant), in a Swift package, but I'm having trouble putting it all together.
I've reached a point where Xcode recognizes the robotstxt module and can import it, but I'm stuck with the following build error:
error: 'googlebot.RobotsMatcher' cannot be constructed because it has no accessible initializers
At a very high level, my approach so far has been the following:
- Create a Swift executable package.
- Fetch and build the external library. (One caveat: the library's main class, RobotsMatcher, has a deleted copy constructor, which makes it a noncopyable type, so it has to be mapped to a Swift reference type before building.)
- Move the library headers into Sources/robotstxt/include and the built libs into Sources/robotstxt/lib.
- Create a modulemap for robotstxt.
- Set up Package.swift with the library target and set the headers and linker path to the ones mentioned above.
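For reference, the reference-type mapping mentioned in the second step looks roughly like the sketch below. This is a minimal illustration on a stand-in class, not the real header: `demo::Matcher` and its `Allowed` method are hypothetical, and the fallback `#define` is my own addition so that the snippet still compiles with a plain C++ compiler that has no Swift toolchain headers.

```cpp
#include <string>
#include <type_traits>

// <swift/bridging> ships with the Swift toolchain. The fallback macro below
// is an assumption for this sketch so it also builds without that header.
#if defined(__has_include)
#  if __has_include(<swift/bridging>)
#    include <swift/bridging>
#  endif
#endif
#ifndef SWIFT_UNSAFE_REFERENCE
#  define SWIFT_UNSAFE_REFERENCE
#endif

namespace demo {

// Hypothetical stand-in for a class like RobotsMatcher: copying is disabled,
// so the type is noncopyable from Swift's point of view.
class Matcher {
 public:
  Matcher() = default;
  Matcher(const Matcher&) = delete;
  Matcher& operator=(const Matcher&) = delete;

  // Placeholder logic, only here so the class has something to call.
  bool Allowed(const std::string& path) const { return path != "/private"; }
} SWIFT_UNSAFE_REFERENCE;  // annotation placed at the end of the class declaration

}  // namespace demo

// Confirms at compile time that the stand-in really is noncopyable.
static_assert(!std::is_copy_constructible<demo::Matcher>::value,
              "Matcher must be noncopyable");
```

In my case the same annotation went on `googlebot::RobotsMatcher` in robots.h, as described in the appendix.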
After all this, I can import robotstxt in main.swift and even get code completion for its various symbols. However, when I run swift build, I get the same error:
error: 'googlebot.RobotsMatcher' cannot be constructed because it has no accessible initializers
Package.swift
// swift-tools-version: 5.10
// The swift-tools-version declares the minimum version of Swift required to build this package.
import PackageDescription
let package = Package(
    name: "spm-with-cpp",
    targets: [
        .target(
            name: "robotstxt",
            linkerSettings: [
                .unsafeFlags(["-L./Sources/robotstxt/lib"]),
                .linkedLibrary("robots"),
            ]
        ),
        .executableTarget(
            name: "spm-with-cpp",
            dependencies: ["robotstxt"],
            swiftSettings: [.interoperabilityMode(.Cxx)]
        ),
    ]
)
Sources/robotstxt/include/module.modulemap
module robotstxt {
    header "robots.h"
    export *
}
Sources/spm-with-cpp/main.swift
// The Swift Programming Language
// https://docs.swift.org/swift-book
import robotstxt
let _ = googlebot.RobotsMatcher()
// Error: 'googlebot.RobotsMatcher' cannot be constructed because it has no accessible initializers
Appendix - the specific steps I took
# 1. Initialize the package
swift package init --type executable
mkdir Sources/spm-with-cpp
mv Sources/main.swift Sources/spm-with-cpp
mkdir External && cd External
# 2. Fetch the library
git clone https://github.com/google/robotstxt
cd robotstxt
# Edit robots.h to include `<swift/bridging>` and add `SWIFT_UNSAFE_REFERENCE` to the end of the RobotsMatcher class declaration.
# 3. Build the library
mkdir c-build && cd c-build
cmake .. # with cmake version 3.30.0
make
# 4. Move the headers and build artifacts to the target
cd ../../..
mkdir -p Sources/robotstxt/include/absl
mkdir Sources/robotstxt/lib
cp External/robotstxt/robots.h Sources/robotstxt/include
cp -R External/robotstxt/c-build/libs/abseil-cpp-src/absl/* Sources/robotstxt/include/absl
find Sources/robotstxt/include/absl -name '*.cc' | xargs rm
find Sources/robotstxt/include/absl -name '*.c' | xargs rm
cp External/robotstxt/c-build/librobots.a Sources/robotstxt/lib
cp External/robotstxt/c-build/librobots.dylib Sources/robotstxt/lib
Appendix - actual usage of the C++ library
Helpfully, the Google library also ships with a CLI executable. Judging by how it uses the RobotsMatcher class, usage from C++ looks very straightforward:
googlebot::RobotsMatcher matcher;
std::string url = argv[3];
bool allowed = matcher.AllowedByRobots(robots_content, &user_agents, url);