This post was actually written on March 31, but I never got around to publishing it...

For future reference, here is the command that converts a Jupyter notebook to Markdown:

`jupyter nbconvert --to markdown your_notebook.ipynb`

What is the “Chain”?

Before anything else, let's settle one question: what exactly is the "Chain" in LangChain?

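In practice, the Runnable abstraction in LangChain is composable. What you observe is that Runnables are chained one after another, with the output of one Runnable becoming the input of the next. A chained series of Runnables is itself still a Runnable, and a Runnable can also be composed with ordinary functions.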
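To make the composition idea concrete, here is a minimal toy sketch in plain Python. It is not LangChain's implementation: the `Step` class, `fill_template`, and `fake_model` are hypothetical names invented for illustration, and no real model is called. It only shows how an `|` operator can chain steps, keep the result the same kind of object, and accept an ordinary function.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step | Callable[[Any], Any]") -> "Step":
        # Coerce a plain function into a Step, then return a new Step whose
        # invoke() feeds this step's output into the next one.
        nxt = other if isinstance(other, Step) else Step(other)
        return Step(lambda value: nxt.invoke(self.invoke(value)))


# Hypothetical steps: a prompt-filling step and a fake "model" that uppercases.
fill_template = Step(lambda d: f"Translate to {d['output_lang']}: {d['text']}")
fake_model = Step(lambda prompt: {"content": prompt.upper()})

# The last element is an ordinary function; the chain as a whole is still a Step.
chain = fill_template | fake_model | (lambda msg: msg["content"])

result = chain.invoke({"output_lang": "chinese", "text": "hi"})
print(isinstance(chain, Step), result)
```

The point of the design is that composition is closed: whatever `|` returns can be composed again, which is why a whole chain can be passed around as if it were a single step.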

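Below, AI translation is implemented in two ways to show how powerful this kind of Chain is. Note that a Chain gives you streaming output painlessly. When a custom function takes part in the Chain, however, only a generator function can stream; for an ordinary function, LangChain waits until every chunk has been collected, aggregates them, and only then passes the result to the function: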

from typing import Iterator
from langchain_deepseek import ChatDeepSeek
from langchain_core.messages import BaseMessage
from langchain_core.prompts import ChatPromptTemplate

model = ChatDeepSeek(model = 'deepseek-chat')
prompt_template = ChatPromptTemplate([
    ('system', 'You are an AI translation assistant. Translate the user input from {input_lang} to {output_lang}.'),
    ('human', '{text}')
])

def translate(input_lang, output_lang, text):
    prompt = prompt_template.invoke({'input_lang': input_lang, 'output_lang': output_lang, 'text': text})
    response = model.invoke(prompt)
    return response.content
display(translate('english', 'chinese', 'Hello, World!'))

def extract_text(chunks: Iterator[BaseMessage]) -> Iterator[str]:
    # chunks is an Iterator[Input]: the message chunks streamed from the previous step
    for chunk in chunks:
        yield chunk.content

another_translate = (
    prompt_template
    | model
    | extract_text  # this must be a generator function; with lambda x: x.content, streaming output would not be handled correctly
)

display(another_translate.invoke({'input_lang': 'english', 'output_lang': 'chinese', 'text': 'Hello, World!'}))
for chunk in another_translate.stream({'input_lang': 'english', 'output_lang': 'chinese', 'text': 'Where there is a suppression, there is a struggle. Break the Chain!'}):
    print(chunk, end = ' | ')

'你好，世界！'

'你好，世界！'

 | 哪里有 | 压迫 | ， | 哪里 | 就有 | 反抗 | 。 | 打破 | 枷 | 锁 | ！ |  |
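The difference between a generator function and an ordinary function in a streaming chain can be sketched with a toy model in plain Python. This is only an illustration of the behaviour described above, not LangChain's actual code: `stream_through`, `fake_model_stream`, `upper_gen`, and `upper_plain` are hypothetical names, and the "model" is a hard-coded list of chunks.

```python
import inspect
from typing import Any, Callable, Iterator


def stream_through(chunks: Iterator[str], func: Callable[..., Any]) -> Iterator[Any]:
    """Toy model of how one custom step in a chain could handle a stream."""
    if inspect.isgeneratorfunction(func):
        # Generator function: hand it the live iterator; it yields per chunk.
        yield from func(chunks)
    else:
        # Ordinary function: drain the upstream first, then call it once.
        yield func("".join(chunks))


def fake_model_stream() -> Iterator[str]:
    # Hypothetical stand-in for an LLM emitting text chunks.
    yield from ["Hello", ", ", "World", "!"]


def upper_gen(chunks: Iterator[str]) -> Iterator[str]:
    # Processes each chunk as soon as it arrives.
    for chunk in chunks:
        yield chunk.upper()


def upper_plain(text: str) -> str:
    # Only ever sees the fully aggregated text.
    return text.upper()


streamed = list(stream_through(fake_model_stream(), upper_gen))
buffered = list(stream_through(fake_model_stream(), upper_plain))
print(streamed)  # one output per input chunk
print(buffered)  # a single output for the whole text
```

With the generator step, four chunks go in and four come out; with the ordinary function, the consumer receives a single item only after the upstream has finished.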

Chain vs Graph

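For a requirement this simple, LangChain is clearly the comfortable choice. But what if we were feeling masochistic and insisted on writing it with LangGraph? That works too, and for the sake of comparison here is the LangGraph implementation. Note the constraint that LangGraph's graph definition imposes: every parameter has to live in one shared state, and at its top level. The output of the workflow is also that same shared state, so we have to pull the result out of it by hand. All in all, it feels pretty clumsy: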

from langgraph.graph import StateGraph, START
from typing import TypedDict
from IPython.display import Image

class StateSchema(TypedDict):
    # inputs to the prompt template
    input_lang: str
    output_lang: str
    text: str
    # output of the prompt template, input to the model
    real_prompt: list[BaseMessage]
    # output of the model, input to the extractor
    response: BaseMessage
    # output of the extractor
    translation: str

graph = StateGraph(state_schema=StateSchema)

graph.add_edge(START, 'prompt_template')
graph.add_edge('prompt_template', 'llm')
graph.add_edge('llm', 'extractor')
graph.set_finish_point('extractor')

def prompt_node(state: StateSchema):
    return { 'real_prompt': prompt_template.invoke(state) }

def model_node(state: StateSchema):
    return { 'response': model.invoke(state['real_prompt']) }

def extractor_node(state: StateSchema):
    return { 'translation': state['response'].content }

graph.add_node('prompt_template', prompt_node)
graph.add_node('llm', model_node)
graph.add_node('extractor', extractor_node)

app = graph.compile()
display(Image(app.get_graph().draw_mermaid_png()))
# stream_mode accepts multiple values, so LLM chunks and state transitions can be shown together
# For each stream_mode, the accompanying value (data) has the type that mode produces on its own
# e.g. with stream_mode 'messages', data is a 2-tuple (AIMessageChunk, metadata); metadata says which step's LLM produced the output
# e.g. with stream_mode 'values', data is the graph state
for stream_mode, data in app.stream({'input_lang': 'english', 'output_lang': 'chinese', 'text': 'Where there is a suppression, there is a struggle. Break the Chain!'}, stream_mode=['messages', 'values']):
    if stream_mode == 'messages':
        # LLM chunk
        print(data[0].content, end = ' | ')
    elif 'translation' in data:
        # state
        print('\nresult:', data['translation'])

[Image: rendered diagram of the compiled graph]

 | 哪里有 | 压迫 | ， | 哪里 | 就有 | 反抗 | 。 | 打破 | 枷 | 锁 | ！ |  | result: 哪里有压迫，哪里就有反抗。打破枷锁！


There is also a problem with using LangGraph here: the streaming output is fake, because the atomic unit of streaming in LangGraph is the Node.

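When I tested this earlier, I assumed LangGraph had set some global or thread-local variable that let it turn invoke directly into stream. That idea is wrong. When LangGraph calls the model, the underlying call may well always be stream; it is only when we call the graph's stream method that those messages get yielded out to us.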

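In other words, I used to think that every message chunk would "travel through the whole graph". That is wrong: only a complete message triggers the next step of the graph. There is no magic here, so do not imagine some clever mechanism that turns invoke into stream behind the scenes. LangGraph simply yields the chunks it has received so far, which makes it convenient to display them in a frontend.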

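To summarize: a Chain lets message chunks travel along the chain, whereas a Graph only lets "complete state" travel across the graph. LangGraph does not offer anything like LangChain's Chain, where a generator function can process the stream as it flows.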
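The Node-as-atom behaviour can be sketched with a toy runner in plain Python. This is an assumption-laden illustration, not LangGraph's implementation: `run_graph`, `fake_llm_node`, `extractor_node`, and the `emit` callback are hypothetical, and events are collected into a list rather than yielded live. It only shows the ordering: chunks are surfaced on the side, while the state advances once per node.

```python
from typing import Any, Callable

State = dict[str, Any]
Emit = Callable[[str], None]
Node = Callable[[State, Emit], State]


def fake_llm_node(state: State, emit: Emit) -> State:
    # Hypothetical model node: surfaces chunks, returns one complete message.
    chunks = ["Hel", "lo", "!"]
    for chunk in chunks:
        emit(chunk)
    return {"response": "".join(chunks)}


def extractor_node(state: State, emit: Emit) -> State:
    # Runs exactly once, and only sees the finished response.
    return {"translation": state["response"].upper()}


def run_graph(nodes: list[Node], state: State) -> list[tuple[str, Any]]:
    """Run nodes in order; record chunk events and state snapshots."""
    events: list[tuple[str, Any]] = []
    for node in nodes:
        update = node(state, lambda chunk: events.append(("messages", chunk)))
        state = {**state, **update}  # the state advances once per node
        events.append(("values", dict(state)))
    return events


events = run_graph([fake_llm_node, extractor_node], {"text": "hello!"})
for kind, data in events:
    print(kind, data)
```

All three chunk events appear before the first state snapshot, and the extractor never sees a partial response, which mirrors the summary above.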

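Later testing showed that although LangGraph nodes do accept generators, that only means the node is executed repeatedly until the generator returns. Using a generator is effectively equivalent to spinning.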

We should take comfort in this Node-as-atom property of LangGraph: it makes the behaviour of a workflow easier to understand.

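In fact, stream_mode accepts multiple values, which makes it possible to display the LLM chunks and the state transitions at the same time. The code above demonstrates this.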

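Enough of that; on with the learning. So far I have only covered the simplest kind of workflow, but workflows also offer features such as conditional edges, which are worth studying.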

Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit the source when reposting!
