Tracing

Automatically Monitor Your AI Applications with OpenTelemetry

Tmam simplifies observability by providing automatic OpenTelemetry instrumentation for a wide range of LLM providers, frameworks, and vector databases, giving you deep insight into the performance and behavior of your LLM-based applications. This documentation guides you through configuring tracing, understanding semantic conventions, and interpreting span attributes, so you can monitor effectively, troubleshoot quickly, and optimize your AI workloads.

Using an existing OTel Tracer

Tmam can integrate with your existing OpenTelemetry (OTel) tracer configuration. If you already have an OTel tracer instantiated in your application, pass it to tmam.init(tracer=tracer) and Tmam will use your custom tracer settings, giving you a unified tracing setup across your application.

Example:

# Instantiate an OpenTelemetry Tracer
tracer = ...

# Pass the tracer to Tmam
tmam.init(tracer=tracer)
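
For instance, here is a minimal sketch of instantiating a tracer with the standard OpenTelemetry Python SDK before handing it to Tmam (the TracerProvider, span processor, and get_tracer calls are standard OTel APIs, not Tmam-specific; the console exporter is just a placeholder for your real exporter):

import tmam
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Build a tracer provider with a console exporter (swap in your own exporter)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

# Pass the tracer to Tmam so it reuses your configuration
tmam.init(tracer=tracer)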

Add custom resource attributes

The OTEL_RESOURCE_ATTRIBUTES environment variable allows you to provide additional OpenTelemetry resource attributes when starting your application with Tmam. Tmam already includes these default resource attributes:

telemetry.sdk.name: tmam
service.name: YOUR_SERVICE_NAME
deployment.environment: YOUR_ENVIRONMENT_NAME

You can enhance these defaults by adding your own attributes using the OTEL_RESOURCE_ATTRIBUTES variable. Your custom attributes are added on top of the existing Tmam attributes, providing additional context to your telemetry data. Simply format your attributes as key1=value1,key2=value2.

For example:

export OTEL_RESOURCE_ATTRIBUTES="service.instance.id=YOUR_SERVICE_ID,k8s.pod.name=K8S_POD_NAME,k8s.namespace.name=K8S_NAMESPACE,k8s.node.name=K8S_NODE_NAME"
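
If you prefer to set this from Python rather than the shell, you can set the variable before initializing Tmam (os.environ is standard Python; the attribute values below are placeholders):

import os

# Must be set before tmam.init() so the OpenTelemetry SDK picks it up
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = (
    "service.instance.id=YOUR_SERVICE_ID,k8s.pod.name=K8S_POD_NAME"
)

import tmam
tmam.init()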

Disable Tracing of Content

By default, Tmam records prompts and completions as trace span attributes.

However, you may want to disable this for privacy reasons, since the content may include highly sensitive data from your users. You may also simply want to reduce the size of your traces.

Example:

tmam.init(capture_message_content=False)

Disable Batch

By default, the SDK batches spans using the OpenTelemetry batch span processor. When working locally, you may want spans exported immediately instead; you can disable batching with the disable_batch flag.

Example:

tmam.init(disable_batch=True)
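
During local development you might combine this with other flags shown on this page, for example:

import tmam

# Export spans immediately while iterating locally
tmam.init(
    disable_batch=True,
    capture_message_content=False,  # optional: also omit prompt/completion content
)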

Manual Tracing

The tmam.trace decorator lets you create traces manually, recording every process within a single function.

@tmam.trace
def generate_one_liner():
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Return a one liner from any movie for me to guess",
            }
        ],
    )
    # Return the generated line so callers can use it
    return completion.choices[0].message.content

The decorator automatically groups any LLM call invoked within generate_one_liner, providing you with organized trace groupings right out of the box.

Use trace.set_result() to set the final result of the trace and trace.set_metadata() to add custom metadata.
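
For instance, a sketch using both inside a manual trace (assuming set_metadata accepts a dict of custom keys, which this page does not spell out; summarize_article is a hypothetical helper):

with tmam.start_trace("Summarize Article") as trace:
    summary = summarize_article(article_text)   # hypothetical helper
    trace.set_metadata({"article_id": "123"})   # assumed: dict of custom keys
    trace.set_result(summary)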

Full Example

import tmam
from openai import OpenAI

client = OpenAI()
tmam.init()

@tmam.trace
def generate_one_liner():
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Return a one liner from any movie for me to guess",
            }
        ],
    )
    # Return the generated line so it can be passed to guess_one_liner
    return completion.choices[0].message.content

def guess_one_liner(one_liner: str):
    with tmam.start_trace("Guess One-liner") as trace:
        completion = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    "content": f"Guess movie from this line: {one_liner}",
                }
            ],
        )
        trace.set_result(completion.choices[0].message.content)
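
You can then chain the two functions together; each call produces its own trace with the LLM calls grouped inside it:

one_liner = generate_one_liner()
guess_one_liner(one_liner)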