SwiftAgent - A Swift-native agent SDK inspired by FoundationModels (and using its tools)

Hey,

I'm currently working on a new app that needs an AI with access to tools, so it can iterate on its own, interact with the app's database, and resolve the user's queries. I've been working on implementing something like that for a while, but it's genuinely hard: there's a lot that needs to happen for this to work, and wrapping it all in a nice, easy-to-use API is even harder.

When Apple announced FoundationModels, I was super stoked, because the API and naming they came up with felt incredibly nice and intuitive. So I decided to build a Swift SDK that feels as close to Apple's FoundationModels as possible, even using some of its types directly, like @Generable, GeneratedContent, and GenerationSchema.

It’s called SwiftAgent.

It's still super early and not ready for any serious use, but the basic agent loop, including tool calling, is already working (using OpenAI as a provider, and only supporting text input/output right now). Using an agent is super simple:

// Create an agent with tools
let agent = Agent<OpenAIProvider>(
  tools: [WeatherTool()],
  instructions: "You are a helpful assistant."
)

// Run your agent
let response = try await agent.respond(
  to: "What's the weather like in San Francisco?",
  using: .gpt5
)

print(response.content)

// You can even specify an output type
let report = try await agent.respond(
  to: "What's the weather like in San Francisco?",
  generating: WeatherReport.self,
  using: .gpt5
)

You can define tools just like you would inside FoundationModels:

struct WeatherTool: AgentTool {
  let name = "get_weather"
  let description = "Get current weather for a location"
  
  @Generable
  struct Arguments {
    @Guide(description: "City name")
    let city: String
    
    @Guide(description: "Temperature unit", .oneOf(["celsius", "fahrenheit"]))
    let unit: String = "celsius"
  }
  
  @Generable
  struct Output {
    let temperature: Double
    let condition: String
    let humidity: Int
  }
  
  func call(arguments: Arguments) async throws -> Output {
    // Your weather API implementation
    return Output(
      temperature: 22.5,
      condition: "sunny",
      humidity: 65
    )
  }
}

The agent class is also @Observable and exposes a transcript, whose entries form an enum very similar to the Transcript type in FoundationModels:

// Continue conversations naturally
try await agent.respond(to: "What was my first question?")

// Access conversation history
for entry in agent.transcript.entries {
  switch entry {
  case .prompt(let prompt):
    print("User: \(prompt.content)")
  case .response(let response):
    print("Agent: \(response.content)")
  case .toolCalls(let calls):
    print("Tool calls: \(calls.calls.map(\.name))")
  // ... handle other entry types
  }
}
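
Since the agent is @Observable, you can also drive SwiftUI directly from the transcript. A minimal sketch (the view itself is illustrative and not part of the SDK; it assumes the prompt/response content values render as strings, as the print statements above suggest):

import SwiftUI

struct TranscriptView: View {
  // @Observable agent: reads inside body are tracked automatically
  let agent: Agent<OpenAIProvider>

  var body: some View {
    List {
      ForEach(Array(agent.transcript.entries.enumerated()), id: \.offset) { item in
        switch item.element {
        case .prompt(let prompt):
          Text(verbatim: "User: \(prompt.content)")
        case .response(let response):
          Text(verbatim: "Agent: \(response.content)")
        default:
          // Tool calls and other entry types elided in this sketch
          EmptyView()
        }
      }
    }
  }
}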

Right now, I've only implemented a configuration where you pass in your API key from the app side. That's obviously not recommended for a production app, but it's convenient for testing. I will add other relay configurations later (to route requests through your own backend, for example). I'll need that for my own app as well, so it's fairly high priority once the basic feature set is done.

Even though this project is still in its early days, I wanted to share it and hear what people think :slight_smile: I welcome feedback and ideas for future directions. Maybe it'll help some people get started with AI agents more easily (it helped me, at least :sweat_smile:)

Link to the repository: GitHub - SwiftedMind/SwiftAgent: Native Swift SDK for building autonomous AI agents with Apple's FoundationModels design philosophy

Link to FoundationModels: Foundation Models | Apple Developer Documentation


Doesn't one quote contradict the other? What was the motivation for reinventing this instead of using the readily available thing?


You can't use FoundationModels with external AI providers; it's only meant for Apple's local model. All they have opened up (somewhat) is the tool definition machinery around the @Generable macro, so that you can encode it into a JSON schema.
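
Roughly, the open part looks like this (a sketch, reusing WeatherTool.Arguments from the first post and assuming GenerationSchema's Codable conformance):

import Foundation
import FoundationModels

// Every @Generable type exposes a GenerationSchema describing its fields.
let schema = WeatherTool.Arguments.generationSchema

// Serialize the definition so an external provider can consume it
// (assumes GenerationSchema's Codable conformance).
let data = try JSONEncoder().encode(schema)
print(String(decoding: data, as: UTF8.self))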


SwiftAgent is impressive, offering an intuitive Swift SDK modeled on Apple’s FoundationModels. The @Generable tool system and transcript tracking make building AI agents straightforward. Early OpenAI support is solid, and future enhancements like backend relays, multi-modal input, or SwiftUI integration could expand its potential.

I just released version 0.4.0 with what I think are some big additions:

Breaking Changes

  • Renamed Provider to Adapter: The core abstraction for AI model integrations has been renamed from Provider to Adapter for better clarity. Update all references to use the new naming:

    // Before
    let agent = Agent<OpenAIProvider, Context>()
    
    // Now
    let agent = Agent<OpenAIAdapter, Context>()
    
  • Renamed Transcript to AgentTranscript: To avoid naming conflicts with FoundationModels, the Transcript type has been renamed to AgentTranscript:

    // Before
    public var transcript: Transcript
    
    // Now  
    public var transcript: AgentTranscript<Adapter.Metadata, Context>
    

Added

  • Prompt Context System: Introduced a new PromptContext protocol that enables separation of user input from contextual information (such as vector embeddings or retrieved documents). This provides cleaner transcript organization and better prompt augmentation:

    enum PromptContext: SwiftAgent.PromptContext {
      case vectorEmbedding(String)
      case documentContext(String)
    }
    
    let agent = OpenAIAgent(supplying: PromptContext.self, tools: tools)
    
    // User input and context are now separated in the transcript
    let response = try await agent.respond(
      to: "What is the weather like?", 
      supplying: [.vectorEmbedding("relevant weather data")]
    ) { input, context in
      PromptTag("context", items: context)
      input
    }
    
  • Tool Resolver: Added a powerful type-safe tool resolution system that combines tool calls with their outputs. The ToolResolver enables compile-time access to tool arguments and outputs:

    // Define a resolved tool run enum
    enum ResolvedToolRun {
      case getFavoriteNumbers(AgentToolRun<GetFavoriteNumbersTool>)
    }
    
    // Tools must implement the resolve method
    func resolve(_ run: AgentToolRun<GetFavoriteNumbersTool>) -> ResolvedToolRun {
      .getFavoriteNumbers(run)
    }
    
    // Use the tool resolver in your UI code
    let toolResolver = agent.transcript.toolResolver(for: tools)
    
    for entry in agent.transcript.entries {
      if case let .toolCalls(toolCalls) = entry {
        for toolCall in toolCalls.calls {
          let resolvedTool = try toolResolver.resolve(toolCall)
          switch resolvedTool {
          case let .getFavoriteNumbers(run):
            print("Count:", run.arguments.count)
            if let output = run.output {
              print("Numbers:", output.numbers)
            }
          }
        }
      }
    }
    
  • Convenience Initializers: Added streamlined initializers that reduce generic complexity. The new OpenAIAgent typealias and convenience initializers make agent creation more ergonomic:

    // Simplified initialization with typealias
    let agent = OpenAIAgent(supplying: PromptContext.self, tools: tools)
    
    // No context needed
    let agent = OpenAIAgent(tools: tools)
    
    // Even simpler for basic usage
    let agent = OpenAIAgent()
    

Enhanced

  • AgentTool Protocol: Extended the AgentTool protocol with an optional ResolvedToolRun associated type to support the new tool resolver system
  • Type Safety: Improved compile-time type safety for tool argument and output access through the tool resolver
  • Transcript Organization: Better separation of concerns in transcript entries, with user input and context clearly distinguished

Migration Guide

  1. Update Provider references: Replace all instances of Provider with Adapter in your code
  2. Update Transcript references: Replace Transcript with AgentTranscript where needed
  3. Consider adopting PromptContext: If you're currently building prompts with embedded context outside the agent, consider migrating to the new PromptContext system for cleaner separation
  4. Adopt Tool Resolver: For better type safety in UI code that displays tool runs, implement the resolve method in your tools and use the transcript's toolResolver
  5. Use convenience initializers: Simplify your agent initialization code using the new OpenAIAgent typealias and convenience initializers

Hi all — sharing another approach in this space that I’ve been building: SwiftAgent by 1amageek. It’s a Swift framework for composing AI agents declaratively with a SwiftUI-like syntax, with an emphasis on type-safety and composability. GitHub - 1amageek/SwiftAgent: A Swift framework that enables declarative development of AI agents with SwiftUI-like syntax, embracing type-safety and composability.

What’s different?

SwiftAgent is built around a tiny, strongly-typed core:

public protocol Step<Input, Output> {
    associatedtype Input: Sendable
    associatedtype Output: Sendable
    func run(_ input: Input) async throws -> Output
}

Agents themselves are just Steps that declare their behavior using a @StepBuilder body. This “steps all the way down” approach keeps every unit testable and reusable, and lets you wire up complex flows (transformations, control flow, LLM calls, tools, etc.) with compile-time guarantees about inputs/outputs.
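
Because everything is a Step, you can also run a single step in isolation, which is what makes unit testing straightforward. A minimal sketch using the Transform step from the example below:

import Foundation
import SwiftAgent

func testTrimStep() async throws {
    // Transform's closure-based initializer appears in the pipeline example below.
    let trim = Transform<String, String> { input in
        input.trimmingCharacters(in: .whitespacesAndNewlines)
    }
    let cleaned = try await trim.run("  hello  ")
    assert(cleaned == "hello")
}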

Declarative, SwiftUI-style composition

You describe your agent by declaring a pipeline of steps:

struct MyAgent: Agent {
    @Session
    var session = LanguageModelSession(
        model: OpenAIModelFactory.gpt4o(apiKey: apiKey)
    ) {
        Instructions("You are a concise, helpful assistant.")
    }

    var body: some Step {
        // Preprocess input
        Transform<String, String> { $0.trimmingCharacters(in: .whitespacesAndNewlines) }

        // Generate text with dynamic Prompt builder
        GenerateText(session: $session) { input in
            Prompt {
                "User request: \(input)"
                "Please respond clearly and with one example."
            }
        }

        // Postprocess (example)
        Map<String, String> { text, _ in text.replacingOccurrences(of: "\n\n", with: "\n") }
    }
}

Under the hood, this uses builder APIs (InstructionsBuilder, PromptBuilder) and a @Session property wrapper for clean, reusable model sessions—so you can build prompts/instructions conditionally and keep LLM configuration out of your business logic.
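
For instance, conditional prompt construction might look roughly like this (a sketch: the `verbose` flag is hypothetical, and it assumes PromptBuilder supports `if`, as SwiftUI-style result builders typically do):

var body: some Step {
    GenerateText(session: $session) { input in
        Prompt {
            "User request: \(input)"
            // `verbose` would be a property on the surrounding agent (hypothetical)
            if verbose {
                "Explain your reasoning step by step."
            } else {
                "Keep the answer to one short paragraph."
            }
        }
    }
}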

Built-in steps & control flow

Out of the box you can compose transformations (Transform, Map, Reduce, Join), control flow (Loop, Parallel, Race, WaitForInput), and model generation (Generate<T>, GenerateText), then layer on guardrails, monitoring, and tracing. This makes advanced flows (e.g., iterative refinement with a max turn count) straightforward to express without sacrificing type safety.
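
A rough sketch of the iterative-refinement idea (the Loop signature shown here is an assumption about the API, not copied from the library):

var body: some Step {
    // Hypothetical shape: refine a draft up to 5 times, stopping early
    // once the stop condition is met.
    Loop(max: 5) { draft in
        GenerateText(session: $session) { _ in
            Prompt {
                "Improve the following answer, keeping it concise:"
                draft
            }
        }
    } until: { improved in
        improved.count < 500
    }
}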

Tools, state, and observability

SwiftAgent ships a practical tool set (file IO, grep/glob search, git, command execution, URL fetching, etc.), plus state utilities like Memory, Relay<T>, and distributed tracing (OpenTelemetry via swift-distributed-tracing). These are all regular steps or step-like units, so they compose naturally in your pipeline.

Provider-agnostic via OpenFoundationModels

SwiftAgent integrates with OpenFoundationModels, so you can target OpenAI, Anthropic, Ollama/local models, etc., through a unified session interface—handy for swapping providers or mixing local and hosted models in one app.

Why this design?

  • Type safety end-to-end — each step advertises Input → Output, so composition errors are caught at compile time.

  • Testable building blocks — run steps in isolation without a full agent loop.

  • Clear separation of concerns — prompts/instructions live in builders; state lives in @Session/Memory; tools are explicit steps.

If you’re exploring agent SDKs inspired by FoundationModels’ ergonomics, this takes a declarative/SwiftUI-like route with strongly-typed pipelines. Happy to get feedback or answer questions on the design/trade-offs.
