OpenTelemetry Tracing Guide

This document provides guidance on integrating Graphite with any OpenTelemetry (OTLP) compatible backend for distributed tracing and observability.

Overview

Graphite emits traces through OpenTelemetry and exports them to a generic OTLP collector. This works with any OpenTelemetry-compatible backend (e.g. the OpenTelemetry Collector, Jaeger, Tempo, or any vendor that accepts OTLP).

Graphite supports the following modes:

OTLP: Export spans to an OTLP collector (gRPC)
Auto: Automatic detection of an available OTLP endpoint
In-Memory: Testing mode without external dependencies

OpenInference is used to automatically instrument LLM calls, regardless of which OTLP collector the spans are exported to. The integration automatically captures:

OpenAI API calls
LLM interactions
Tool executions
Workflow orchestration
Node operations

Installation

Core Dependencies

Grafi includes the following observability dependencies by default:

dependencies = [
    "openinference-instrumentation-openai>=0.1.41",
    "opentelemetry-sdk>=1.39.1",
    "opentelemetry-exporter-otlp-proto-grpc>=1.39.1",
]

These are automatically installed when you install Grafi:

# Using pip
pip install grafi

# Using uv
uv pip install grafi

Configuration

Docker Compose

You can run a local OpenTelemetry Collector (or any OTLP-compatible backend) via docker compose. For example, using an OTLP collector that exposes the default gRPC port 4317:

version: '3.8'

services:
  otel-collector:
    image: otel/opentelemetry-collector:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317" # OTLP gRPC

Environment Variables

Set these environment variables to override the default OTLP collector settings:

# Optional - defaults to localhost:4317
export OTEL_COLLECTOR_ENDPOINT="localhost" # collector hostname
export OTEL_COLLECTOR_PORT="4317"          # collector gRPC port

These are also read by the default Container tracer, so simply setting them is enough for the auto-configured tracer to find your collector.
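How Grafi reads these variables internally is not shown in this guide. As a minimal sketch of the behaviour described above, the helper below resolves the collector address from the two variables and falls back to the documented defaults. The name resolve_collector_address is hypothetical and is not part of the Grafi API.

```python
import os

# Defaults documented above: localhost:4317
DEFAULT_ENDPOINT = "localhost"
DEFAULT_PORT = 4317


def resolve_collector_address(env=None):
    """Return (host, port) from the OTEL_COLLECTOR_* variables.

    Hypothetical helper for illustration only; not part of Grafi.
    """
    env = os.environ if env is None else env
    host = env.get("OTEL_COLLECTOR_ENDPOINT", DEFAULT_ENDPOINT)
    raw_port = env.get("OTEL_COLLECTOR_PORT", str(DEFAULT_PORT))
    try:
        port = int(raw_port)
    except ValueError as exc:
        # Environment variables are strings, so reject non-numeric ports early.
        raise ValueError(
            f"OTEL_COLLECTOR_PORT must be an integer, got {raw_port!r}"
        ) from exc
    return host, port
```

The resolved pair can then be passed to setup_tracing() as collector_endpoint and collector_port if you prefer to configure the tracer explicitly.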

Setup Function Parameters

The setup_tracing() function accepts the following parameters:

def setup_tracing(
    tracing_options: TracingOptions = TracingOptions.AUTO,
    collector_endpoint: str = "localhost",
    collector_port: int = 4317,
    project_name: str = "grafi-trace",
) -> Tracer:

tracing_options: Backend to use (OTLP, AUTO, IN_MEMORY)
collector_endpoint: Hostname of the collector (default: "localhost")
collector_port: Port number of the collector (default: 4317)
project_name: Name for the tracing project; exported as the service.name resource attribute (default: "grafi-trace")

Tracing Options

Grafi provides three tracing modes through the TracingOptions enum:

1. OTLP - Generic OpenTelemetry Collector

Export spans to any OTLP-compatible collector:

from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing

tracer = setup_tracing(
    tracing_options=TracingOptions.OTLP,
    collector_endpoint="localhost",
    collector_port=4317,
    project_name="my-project",
)

When to use:

- Production and development deployments
- Any backend that accepts OTLP (OpenTelemetry Collector, Jaeger, Tempo, etc.)
- A running collector instance is required

2. AUTO - Automatic Detection

Let Grafi automatically detect an available OTLP endpoint:

from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing

tracer = setup_tracing(
    tracing_options=TracingOptions.AUTO,
    collector_endpoint="localhost",
    collector_port=4317,
)

Detection priority:

1. OTLP endpoint (from arguments or OTEL_COLLECTOR_* env vars), if available
2. Falls back to in-memory tracing

When to use:

- Development environments with an optional collector
- CI/CD pipelines
- Flexible deployment scenarios
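The detection order for AUTO mode can be sketched with the standard library alone. The snippet assumes that "available" means a TCP connection to the collector succeeds within a short timeout (0.1s here). The functions is_endpoint_available and choose_mode are hypothetical names used for illustration, not Grafi's actual implementation.

```python
import socket


def is_endpoint_available(host, port, timeout=0.1):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def choose_mode(host="localhost", port=4317):
    """Pick "OTLP" when the collector is reachable, otherwise "IN_MEMORY"."""
    return "OTLP" if is_endpoint_available(host, port) else "IN_MEMORY"
```

A probe like this only confirms that something is listening on the port. It does not verify that the listener actually speaks OTLP.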

3. IN_MEMORY - Testing

Use in-memory tracing for tests and offline work:

tracer = setup_tracing(tracing_options=TracingOptions.IN_MEMORY)

When to use:

- Unit and integration tests
- CI/CD without external dependencies
- Offline development
- Minimal overhead scenarios

Usage Examples

Example 1: Basic Setup with AUTO Detection

from grafi.runtime import GrafiRuntime, ExecutionServices
from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing

# Build a runtime that uses the auto-detected tracer
tracer = setup_tracing(tracing_options=TracingOptions.AUTO)
runtime = GrafiRuntime(ExecutionServices(tracer=tracer))

# Invoke assistants through `runtime`

Example 2: Export to an OTLP Collector

from grafi.runtime import GrafiRuntime, ExecutionServices
from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing

tracer = setup_tracing(
    tracing_options=TracingOptions.OTLP,
    collector_endpoint="localhost",
    collector_port=4317,
    project_name="my-assistant",
)
runtime = GrafiRuntime(ExecutionServices(tracer=tracer))

# Invoke assistants through `runtime`

Example 3: Testing with In-Memory Tracing

from grafi.runtime import GrafiRuntime, ExecutionServices
from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing

# Use in-memory tracing for tests
tracer = setup_tracing(tracing_options=TracingOptions.IN_MEMORY)
runtime = GrafiRuntime(ExecutionServices(tracer=tracer))

# Invoke assistants through `runtime` in your test

Example 4: Complete Assistant with Tracing

import os
import uuid
import asyncio
from grafi.runtime import GrafiRuntime, ExecutionServices
from grafi.common.events.topic_events.publish_to_topic_event import PublishToTopicEvent
from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing
from grafi.common.models.async_result import async_func_wrapper
from grafi.common.models.invoke_context import InvokeContext
from grafi.common.models.message import Message
from grafi.assistants.assistant_base import AssistantBase

# Setup tracing and build a runtime that uses it
tracer = setup_tracing(tracing_options=TracingOptions.AUTO)
runtime = GrafiRuntime(ExecutionServices(tracer=tracer))

# Reference the event store (the default in-memory store here)
event_store = runtime.services.event_store

# Create your assistant
async def main():
    assistant = (
        # YourAssistant is a placeholder for your own assistant class; see
        # https://github.com/binome-dev/graphite/blob/main/grafi/assistants/assistant.py
        YourAssistant.builder()
        .name("MyAssistant")
        .api_key(os.getenv("OPENAI_API_KEY"))
        .build()
    )

    # Create invoke context
    invoke_context = InvokeContext(
        conversation_id="conversation_id",
        invoke_id=uuid.uuid4().hex,
        assistant_request_id=uuid.uuid4().hex,
    )

    # Invoke assistant
    input_data = PublishToTopicEvent(
        invoke_context=invoke_context,
        data=[Message(content="Hello!", role="user")]
    )

    output = await async_func_wrapper(
        runtime.invoke(assistant, input_data, is_sequential=True)
    )
    print(output)

asyncio.run(main())

Best Practices

1. Environment-Specific Configuration

Use different tracing modes for different environments:

import os
from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing

env = os.getenv("ENVIRONMENT", "development")

if env in ("production", "staging"):
    tracing_option = TracingOptions.OTLP
    endpoint = os.getenv("OTEL_COLLECTOR_ENDPOINT", "localhost")
elif env == "development":
    tracing_option = TracingOptions.AUTO
    endpoint = "localhost"
else:  # testing
    tracing_option = TracingOptions.IN_MEMORY
    endpoint = "localhost"

tracer = setup_tracing(
    tracing_options=tracing_option,
    collector_endpoint=endpoint,
    project_name=f"{env}-assistant",
)

2. Early Initialization

Set up tracing early in your application lifecycle, before creating assistants:

# Good: Setup tracing first, build the runtime from it
tracer = setup_tracing(tracing_options=TracingOptions.AUTO)
runtime = GrafiRuntime(ExecutionServices(tracer=tracer))

# Then create assistants and invoke them through `runtime`
assistant = MyAssistant.builder().build()

3. Project Naming Conventions

Use descriptive project names to organize traces. The project name is exported as the service.name resource attribute:

tracer = setup_tracing(
    tracing_options=TracingOptions.OTLP,
    project_name=f"{app_name}-{environment}-{version}",
)

4. Graceful Degradation with AUTO Mode

Use AUTO mode to gracefully degrade when the collector is unavailable:

# Falls back to in-memory tracing automatically if no endpoint is available
tracer = setup_tracing(tracing_options=TracingOptions.AUTO)

5. Testing Isolation

Use IN_MEMORY mode in tests to avoid external dependencies:

import pytest
from grafi.common.instrumentations.tracing import TracingOptions, setup_tracing
from grafi.runtime import bind_services, ExecutionServices

@pytest.fixture(autouse=True)
def setup_test_tracing():
    tracer = setup_tracing(tracing_options=TracingOptions.IN_MEMORY)
    # Bind a scope for the test so component invocations resolve these services.
    with bind_services(ExecutionServices(tracer=tracer)):
        yield

Troubleshooting

Issue: "OTLP endpoint is not available"

Symptom: ValueError when using the OTLP tracing option

Solution:

Ensure your collector is running:

docker compose up
nc -zv localhost 4317

Connection to localhost (::1) 4317 port [tcp/*] succeeded!

Check that the endpoint and port are correct:

tracer = setup_tracing(
    tracing_options=TracingOptions.OTLP,
    collector_endpoint="localhost",
    collector_port=4317,
)

Use AUTO mode for graceful fallback:

tracer = setup_tracing(tracing_options=TracingOptions.AUTO)

Issue: Connection timeout with the collector

Symptom: Slow startup or timeout errors

Solution:

1. The endpoint check uses a 0.1s timeout, so a brief delay at startup is expected and is not an error by itself.
2. Use AUTO mode to fall back automatically:

tracer = setup_tracing(tracing_options=TracingOptions.AUTO)

3. For OTLP mode, ensure the endpoint is reachable:

nc -zv localhost 4317
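When the collector starts alongside your application (for example in the same docker compose stack), it may not be listening yet when tracing is set up. As a sketch under that assumption, the helper below polls the port a few times before you call setup_tracing() in OTLP mode. The function wait_for_collector and its parameters are hypothetical and not part of Grafi.

```python
import socket
import time


def wait_for_collector(host="localhost", port=4317, attempts=5, delay=0.5):
    """Poll host:port until a TCP connection succeeds or attempts run out.

    Returns True as soon as the port accepts a connection, False otherwise.
    """
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Sleep between attempts, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

If the helper returns False, falling back to AUTO or IN_MEMORY mode avoids the ValueError described in the first troubleshooting entry.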

Issue: OpenAI instrumentation not working

Symptom: OpenAI calls not showing in traces

Solution:

1. Ensure OpenAI is instrumented (done automatically by setup_tracing)
2. Build the runtime with your tracer and invoke through it:

runtime = GrafiRuntime(ExecutionServices(tracer=tracer))

Issue: Traces showing in wrong project

Symptom: Traces appear in an unexpected project / service name

Solution: Specify the project name explicitly:

tracer = setup_tracing(
    tracing_options=TracingOptions.OTLP,
    project_name="my-specific-project",
)

Debug Logging

Enable debug logging to troubleshoot tracing issues:

from loguru import logger
import sys

logger.remove()
logger.add(sys.stderr, level="DEBUG")

# Now setup tracing
tracer = setup_tracing(tracing_options=TracingOptions.AUTO)