Guardrails

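Guardrails can run alongside your agents or block execution until they complete, letting you check and validate user input or agent output. For example, you can run a lightweight model as a guardrail before invoking an expensive model. If the guardrail detects malicious usage, it can raise an error and stop the expensive model from running.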

There are two kinds of guardrails:

Input guardrails run on the initial user input.
Output guardrails run on the final agent output.

Input guardrails

Input guardrails run in three steps:

The guardrail receives the same input that was passed to the agent.
The guardrail function runs and returns a GuardrailFunctionOutput wrapped inside an InputGuardrailResult.
If tripwireTriggered is true, an InputGuardrailTripwireTriggered error is thrown.
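
The three steps above can be sketched without the SDK. The interfaces below are simplified stand-ins for the types named in the steps, and `SimpleInputGuardrail`, `TripwireError`, `checkInput`, and `noHomework` are hypothetical names introduced only for this illustration. The real runner's internals may differ.

```typescript
// Simplified stand-ins for the SDK types named above; not the real definitions.
interface GuardrailFunctionOutput {
  tripwireTriggered: boolean;
  outputInfo: unknown;
}

interface InputGuardrailResult {
  guardrailName: string;
  output: GuardrailFunctionOutput;
}

interface SimpleInputGuardrail {
  name: string;
  execute: (input: string) => Promise<GuardrailFunctionOutput>;
}

// Hypothetical error class playing the role of InputGuardrailTripwireTriggered.
class TripwireError extends Error {
  readonly result: InputGuardrailResult;

  constructor(result: InputGuardrailResult) {
    super(`Guardrail "${result.guardrailName}" tripped`);
    this.name = 'TripwireError';
    this.result = result;
    // Keeps instanceof working when the class is compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

async function checkInput(
  guardrail: SimpleInputGuardrail,
  input: string,
): Promise<InputGuardrailResult> {
  // Step 1: the guardrail receives the same input as the agent.
  const output = await guardrail.execute(input);
  // Step 2: the function's output is wrapped in a result object.
  const result: InputGuardrailResult = { guardrailName: guardrail.name, output };
  // Step 3: a triggered tripwire becomes an error.
  if (output.tripwireTriggered) {
    throw new TripwireError(result);
  }
  return result;
}

// A keyword-based guardrail, used here so the sketch needs no model call.
const noHomework: SimpleInputGuardrail = {
  name: 'No homework',
  execute: async (input) => {
    const matched = /solve for x/i.test(input);
    return { tripwireTriggered: matched, outputInfo: { matched } };
  },
};
```

Calling `checkInput(noHomework, input)` resolves with the wrapped result when the input passes and rejects with `TripwireError` when the keyword check matches.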

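Note: Input guardrails are intended for user input, so they run only when the agent is the first agent in the workflow. Guardrails are configured on the agent itself because different agents often require different guardrails.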

Execution modes

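runInParallel: true (default) starts the guardrail alongside the LLM/tool calls. This minimizes latency, but if the guardrail trips later, the model may already have consumed tokens or run tools.
runInParallel: false runs the guardrail before the model is called, which prevents token usage and tool execution when the guardrail blocks the request. Use it when safety and cost matter more than latency.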
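
The trade-off between the two modes can be illustrated with a small simulation. This is only a sketch under stated assumptions: `runWithGuardrail`, `GuardrailCheck`, `ModelCall`, and `Outcome` are hypothetical names, the model call is a plain async function, and the SDK's actual scheduling is not reproduced here.

```typescript
// Hypothetical stand-ins for illustration; none of these names come from the SDK.
type GuardrailCheck = () => Promise<boolean>; // resolves to true when the tripwire fires
type ModelCall = () => Promise<string>;

interface Outcome {
  blocked: boolean;
  modelStarted: boolean; // whether the simulated model call was ever started
  reply?: string;
}

async function runWithGuardrail(
  check: GuardrailCheck,
  callModel: ModelCall,
  runInParallel: boolean,
): Promise<Outcome> {
  let modelStarted = false;
  const startModel = (): Promise<string> => {
    modelStarted = true;
    return callModel();
  };

  if (!runInParallel) {
    // Blocking mode: wait for the guardrail before touching the model.
    if (await check()) {
      return { blocked: true, modelStarted };
    }
    const reply = await startModel();
    return { blocked: false, modelStarted, reply };
  }

  // Parallel mode: start both at once, then inspect the guardrail verdict.
  const [tripped, reply] = await Promise.all([check(), startModel()]);
  return tripped
    ? { blocked: true, modelStarted }
    : { blocked: false, modelStarted, reply };
}
```

With a check that always trips, the blocking call reports `modelStarted: false`, while the parallel call reports `modelStarted: true` even though the request ends up blocked. That difference is the token and tool cost described above.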

Output guardrails

Output guardrails run in three steps:

The guardrail receives the output produced by the agent.
The guardrail function runs and returns a GuardrailFunctionOutput wrapped inside an OutputGuardrailResult.
If tripwireTriggered is true, an OutputGuardrailTripwireTriggered error is thrown.
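
A guardrail function does not have to call a model. As a minimal sketch of the first two steps, the check below scans a structured agent output for anything that looks like an email address. The interfaces are simplified stand-ins rather than the SDK's definitions, `checkForEmailAddresses` is a hypothetical name, and the pattern is deliberately loose. Wrapping the result and raising the error (the third step) is left to the runner.

```typescript
// Simplified stand-in for the SDK's GuardrailFunctionOutput; not the real definition.
interface GuardrailFunctionOutput {
  tripwireTriggered: boolean;
  outputInfo: unknown;
}

// Hypothetical shape of the agent's structured output.
interface MessageOutput {
  response: string;
}

// Step 1: receive the agent's output. Step 2: return a GuardrailFunctionOutput.
function checkForEmailAddresses(agentOutput: MessageOutput): GuardrailFunctionOutput {
  // Loose pattern for illustration only; it is not a full email validator.
  const matches = agentOutput.response.match(/[\w.+-]+@[\w-]+\.[\w.-]+/g) ?? [];
  return {
    tripwireTriggered: matches.length > 0,
    // Extra details that a caller could inspect after the tripwire fires.
    outputInfo: { matchCount: matches.length },
  };
}
```

A deterministic check like this adds no model latency, which can make it a reasonable first layer before a model-based guardrail.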

Note: Output guardrails run only when the agent is the last agent in the workflow. For real-time voice interactions, see Building voice agents.

Tripwires

When a guardrail fails, it signals this through a tripwire. As soon as a tripwire is triggered, the runner throws the corresponding error and halts execution.

Implementing a guardrail

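A guardrail is simply a function that returns a GuardrailFunctionOutput. Below is a minimal example that checks whether the user is asking for help with math homework by running another agent under the hood.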

import {
  Agent,
  run,
  InputGuardrailTripwireTriggered,
  InputGuardrail,
} from '@openai/agents';
import { z } from 'zod';

const guardrailAgent = new Agent({
  name: 'Guardrail check',
  instructions: 'Check if the user is asking you to do their math homework.',
  outputType: z.object({
    isMathHomework: z.boolean(),
    reasoning: z.string(),
  }),
});

const mathGuardrail: InputGuardrail = {
  name: 'Math Homework Guardrail',
  // Set runInParallel to false to block the model until the guardrail completes.
  runInParallel: false,
  execute: async ({ input, context }) => {
    const result = await run(guardrailAgent, input, { context });
    return {
      outputInfo: result.finalOutput,
      tripwireTriggered: result.finalOutput?.isMathHomework ?? false,
    };
  },
};

const agent = new Agent({
  name: 'Customer support agent',
  instructions:
    'You are a customer support agent. You help customers with their questions.',
  inputGuardrails: [mathGuardrail],
});

async function main() {
  try {
    await run(agent, 'Hello, can you help me solve for x: 2x + 3 = 11?');
    console.log("Guardrail didn't trip - this is unexpected");
  } catch (e) {
    if (e instanceof InputGuardrailTripwireTriggered) {
      console.log('Math homework guardrail tripped');
    } else {
      // Rethrow anything that is not a guardrail tripwire instead of swallowing it.
      throw e;
    }
  }
}

main().catch(console.error);

Output guardrails work the same way.

import {
  Agent,
  run,
  OutputGuardrailTripwireTriggered,
  OutputGuardrail,
} from '@openai/agents';
import { z } from 'zod';

// The output by the main agent
const MessageOutput = z.object({ response: z.string() });
type MessageOutput = z.infer<typeof MessageOutput>;

// The output by the math guardrail agent
const MathOutput = z.object({ reasoning: z.string(), isMath: z.boolean() });

// The guardrail agent
const guardrailAgent = new Agent({
  name: 'Guardrail check',
  instructions: 'Check if the output includes any math.',
  outputType: MathOutput,
});

// An output guardrail using an agent internally
const mathGuardrail: OutputGuardrail<typeof MessageOutput> = {
  name: 'Math Guardrail',
  async execute({ agentOutput, context }) {
    const result = await run(guardrailAgent, agentOutput.response, {
      context,
    });
    return {
      outputInfo: result.finalOutput,
      tripwireTriggered: result.finalOutput?.isMath ?? false,
    };
  },
};

const agent = new Agent({
  name: 'Support agent',
  instructions:
    'You are a user support agent. You help users with their questions.',
  outputGuardrails: [mathGuardrail],
  outputType: MessageOutput,
});

async function main() {
  try {
    const input = 'Hello, can you help me solve for x: 2x + 3 = 11?';
    await run(agent, input);
    console.log("Guardrail didn't trip - this is unexpected");
  } catch (e) {
    if (e instanceof OutputGuardrailTripwireTriggered) {
      console.log('Math output guardrail tripped');
    } else {
      // Rethrow anything that is not a guardrail tripwire instead of swallowing it.
      throw e;
    }
  }
}

main().catch(console.error);

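guardrailAgent is used inside the guardrail function.
The guardrail function receives the agent input or output and returns a result.
The guardrail result can carry additional information.
agent defines the actual workflow that the guardrails apply to.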