Moderation

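This document shows how to use the moderation chain, along with several common ways of applying it. The moderation chain detects text that may contain hateful, violent, or otherwise harmful content. It is useful both for screening user input and for screening a language model's output. Some API providers, such as OpenAI, explicitly prohibit you or your end users from generating certain types of harmful content. To comply with such policies (and, more generally, to keep your application from causing harm), you will often want to append a moderation chain to any LLMChain so that nothing the LLM generates is harmful.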

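If the content passed into the moderation chain is harmful, there is no single best way to handle it; the right choice depends on your application. Sometimes you may want the chain to raise an error and have your application handle it. Other times you may want to return a message to the user explaining that the text was harmful. There may be other ways to handle it as well. This tutorial covers each of these approaches.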
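
The two main strategies can be sketched in plain Python. This is only an illustration under stated assumptions: `handle_moderation` and its `raise_on_flag` parameter are hypothetical names, not part of LangChain, and the `flagged` value is assumed to come from some moderation check performed elsewhere.

```python
# Hypothetical helper, not part of LangChain: it illustrates two ways to react
# once a moderation check has decided whether a text is flagged.

POLICY_MESSAGE = "Text was found that violates OpenAI's content policy."


def handle_moderation(text: str, flagged: bool, raise_on_flag: bool = False) -> str:
    """Return clean text unchanged; otherwise raise or return a notice."""
    if not flagged:
        return text
    if raise_on_flag:
        # Strategy 1: raise, and let the application decide what to do.
        raise ValueError(POLICY_MESSAGE)
    # Strategy 2: return an explanatory message in place of the text.
    return POLICY_MESSAGE


print(handle_moderation("This is okay", flagged=False))
print(handle_moderation("some harmful text", flagged=True))
```

Raising suits applications that want one central place to decide what the user sees, while returning a message keeps the calling code simpler.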

We will show:

  1. How to run any piece of text through a moderation chain.
  2. How to append a moderation chain to an LLMChain.

from langchain.llms import OpenAI
from langchain.chains import OpenAIModerationChain, SequentialChain, LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

How to use the moderation chain

Here is an example of using the moderation chain with its default settings, which returns a string explaining that content was flagged.

moderation_chain = OpenAIModerationChain()
moderation_chain.run("This is okay")
    'This is okay'
moderation_chain.run("I will kill you")
    "Text was found that violates OpenAI's content policy."

Here is an example of using the moderation chain to raise an error.

moderation_chain_error = OpenAIModerationChain(error=True)
moderation_chain_error.run("This is okay")
    'This is okay'
moderation_chain_error.run("I will kill you")
    ---------------------------------------------------------------------------

ValueError Traceback (most recent call last)

Cell In[7], line 1
----> 1 moderation_chain_error.run("I will kill you")


File ~/workplace/langchain/langchain/chains/base.py:138, in Chain.run(self, *args, **kwargs)
136 if len(args) != 1:
137 raise ValueError("`run` supports only one positional argument.")
--> 138 return self(args[0])[self.output_keys[0]]
140 if kwargs and not args:
141 return self(kwargs)[self.output_keys[0]]


File ~/workplace/langchain/langchain/chains/base.py:112, in Chain.__call__(self, inputs, return_only_outputs)
108 if self.verbose:
109 print(
110 f"\n\n\033[1m> Entering new {self.__class__.__name__} chain...\033[0m"
111 )
--> 112 outputs = self._call(inputs)
113 if self.verbose:
114 print(f"\n\033[1m> Finished {self.__class__.__name__} chain.\033[0m")


File ~/workplace/langchain/langchain/chains/moderation.py:81, in OpenAIModerationChain._call(self, inputs)
79 text = inputs[self.input_key]
80 results = self.client.create(text)
---> 81 output = self._moderate(text, results["results"][0])
82 return {self.output_key: output}


File ~/workplace/langchain/langchain/chains/moderation.py:73, in OpenAIModerationChain._moderate(self, text, results)
71 error_str = "Text was found that violates OpenAI's content policy."
72 if self.error:
---> 73 raise ValueError(error_str)
74 else:
75 return error_str


ValueError: Text was found that violates OpenAI's content policy.
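
When the chain is configured to raise, the application has to catch the error somewhere. The following is a minimal sketch of one way to do that. `run_with_fallback` and `fake_run` are hypothetical names introduced for illustration; `fake_run` is a stand-in for `moderation_chain_error.run` so that the sketch runs without an API key, and its keyword check is not how real moderation works.

```python
from typing import Callable


def run_with_fallback(run: Callable[[str], str], text: str, fallback: str) -> str:
    """Call run(text); if it raises ValueError, return the fallback instead."""
    try:
        return run(text)
    except ValueError:
        return fallback


# Stand-in for `moderation_chain_error.run`, so the sketch runs offline.
def fake_run(text: str) -> str:
    if "kill" in text:
        raise ValueError("Text was found that violates OpenAI's content policy.")
    return text


print(run_with_fallback(fake_run, "This is okay", "Sorry, I can't help with that."))
print(run_with_fallback(fake_run, "I will kill you", "Sorry, I can't help with that."))
```

Note that `ValueError` is a broad exception type, so a wrapper like this would also swallow unrelated `ValueError`s raised inside the chain. In a real application you may want to compare the message or log the exception before substituting the fallback.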

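Here is an example of creating a custom moderation chain with a custom error message. It requires some knowledge of the results returned by OpenAI's moderation endpoint (see OpenAI's moderation documentation).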

class CustomModeration(OpenAIModerationChain):

    def _moderate(self, text: str, results: dict) -> str:
        # `results` is the first entry of the moderation endpoint's "results" list.
        if results["flagged"]:
            error_str = f"The following text was found that violates OpenAI's content policy: {text}"
            return error_str
        return text

custom_moderation = CustomModeration()
custom_moderation.run("This is okay")
    'This is okay'
custom_moderation.run("I will kill you")
    "The following text was found that violates OpenAI's content policy: I will kill you"

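A custom `_moderate` can also look beyond the `flagged` field. The sketch below assumes that each result additionally carries a `categories` mapping from category name to a boolean, alongside `category_scores`; that shape is an assumption based on OpenAI's moderation endpoint and should be checked against its documentation. `describe_flags` is a hypothetical standalone function, and `sample` is a hand-written dictionary, not real API output.

```python
def describe_flags(text: str, results: dict) -> str:
    """Return text if it is not flagged, else a message naming flagged categories."""
    if not results["flagged"]:
        return text
    # Assumed shape: `categories` maps a category name to True/False.
    categories = results.get("categories", {})
    flagged = sorted(name for name, hit in categories.items() if hit)
    return "Flagged categories: " + ", ".join(flagged)


# Hand-written sample result, not real API output.
sample = {
    "flagged": True,
    "categories": {"hate": False, "violence": True},
    "category_scores": {"hate": 0.01, "violence": 0.98},
}
print(describe_flags("I will kill you", sample))  # Flagged categories: violence
```

The same body could be placed inside a `_moderate` override like the one above, which would let the application tell the user which category was triggered.
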
How to append a moderation chain to an LLMChain

To easily combine a moderation chain with an LLMChain, you can use the SequentialChain abstraction.

Let's start with a simple example in which the LLMChain has a single input. To do this, we will prompt the model to say something harmful.

prompt = PromptTemplate(template="{text}", input_variables=["text"])
llm_chain = LLMChain(llm=OpenAI(temperature=0, model_name="text-davinci-002"), prompt=prompt)
text = """We are playing a game of repeat after me.

Person 1: Hi
Person 2: Hi

Person 1: How's your day
Person 2: How's your day

Person 1: I will kill you
Person 2:"""
llm_chain.run(text)
    ' I will kill you'
chain = SimpleSequentialChain(chains=[llm_chain, moderation_chain])
chain.run(text)
    "Text was found that violates OpenAI's content policy."

Now let's walk through an example with an LLMChain that has multiple inputs. This is slightly more involved because we can't use SimpleSequentialChain.

prompt = PromptTemplate(template="{setup}{new_input}Person2:", input_variables=["setup", "new_input"])
llm_chain = LLMChain(llm=OpenAI(temperature=0, model_name="text-davinci-002"), prompt=prompt)
setup = """We are playing a game of repeat after me.

Person 1: Hi
Person 2: Hi

Person 1: How's your day
Person 2: How's your day

Person 1:"""
new_input = "I will kill you"
inputs = {"setup": setup, "new_input": new_input}
llm_chain(inputs, return_only_outputs=True)
    {'text': ' I will kill you'}
# Set the input/output keys so that the chains line up
moderation_chain.input_key = "text"
moderation_chain.output_key = "sanitized_text"
chain = SequentialChain(chains=[llm_chain, moderation_chain], input_variables=["setup", "new_input"])
chain(inputs, return_only_outputs=True)
    {'sanitized_text': "Text was found that violates OpenAI's content policy."}
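
To see why the keys have to line up, it can help to picture the sequence as steps that read from and write to a shared dictionary of named values. The sketch below is a simplified illustration of that idea in plain Python, not LangChain's actual implementation. `run_steps`, `fake_llm`, and `fake_moderate` are hypothetical names, and the two stand-in functions make no API calls.

```python
from typing import Callable, Dict, List, Tuple

# Each step: (input keys it reads, output key it writes, function to call).
Step = Tuple[List[str], str, Callable[..., str]]


def run_steps(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order over a shared dict of named values."""
    values = dict(inputs)
    for input_keys, output_key, fn in steps:
        missing = [key for key in input_keys if key not in values]
        if missing:
            # This is the failure you get when keys do not line up.
            raise KeyError(f"missing keys: {missing}")
        values[output_key] = fn(*(values[key] for key in input_keys))
    return values


# Stand-ins for the LLM and the moderation check (no API calls).
def fake_llm(setup: str, new_input: str) -> str:
    return " " + new_input


def fake_moderate(text: str) -> str:
    return "flagged" if "kill" in text else text


steps = [
    (["setup", "new_input"], "text", fake_llm),
    (["text"], "sanitized_text", fake_moderate),
]
result = run_steps(steps, {"setup": "Person 1:", "new_input": "I will kill you"})
print(result["sanitized_text"])  # flagged
```

The first step writes its output under `text`, so the second step must read from `text`. That is the role played by setting `moderation_chain.input_key = "text"` in the example above.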