langchain-architecture

name: langchain-architecture
description: Design LLM applications using the LangChain framework with agents, memory, and tool integration patterns. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.

LangChain Architecture

Master the LangChain framework for building sophisticated LLM applications with agents, chains, memory, and tool integration.

Do not use this skill when

  • The task is unrelated to LangChain architecture
  • You need a different domain or tool outside this scope

Instructions

  • Clarify goals, constraints, and required inputs.
  • Apply relevant best practices and validate outcomes.
  • Provide actionable steps and verification.
  • If detailed examples are required, open resources/implementation-playbook.md.

Use this skill when

  • Building autonomous AI agents with tool access
  • Implementing complex multi-step LLM workflows
  • Managing conversation memory and state
  • Integrating LLMs with external data sources and APIs
  • Creating modular, reusable LLM application components
  • Implementing document processing pipelines
  • Building production-grade LLM applications

Core Concepts

1. Agents

Autonomous systems that use LLMs to decide which actions to take.

Agent Types (see the enum sketch after this list):

  • ReAct: Reasoning + Acting in interleaved manner
  • OpenAI Functions: Leverages function calling API
  • Structured Chat: Handles multi-input tools
  • Conversational: Optimized for chat interfaces
  • Self-Ask with Search: Decomposes complex queries
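
These names map onto the AgentType enum in langchain.agents (classic, pre-LCEL agent API). A minimal reference sketch of the correspondence:

    from langchain.agents import AgentType

    # The agent styles above, as AgentType values
    AgentType.ZERO_SHOT_REACT_DESCRIPTION                   # ReAct
    AgentType.OPENAI_FUNCTIONS                              # OpenAI Functions
    AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION   # Structured Chat
    AgentType.CONVERSATIONAL_REACT_DESCRIPTION              # Conversational
    AgentType.SELF_ASK_WITH_SEARCH                          # Self-Ask with Search
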
2. Chains

Sequences of calls to LLMs or other utilities.

Chain Types (a TransformChain sketch follows this list):

  • LLMChain: Basic prompt + LLM combination
  • SequentialChain: Multiple chains in sequence
  • RouterChain: Routes inputs to specialized chains
  • TransformChain: Data transformations between steps
  • MapReduceChain: Parallel processing with aggregation
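
As a small example of a non-LLM step, a minimal TransformChain sketch (classic langchain API; the clean_text function is illustrative):

    from langchain.chains import TransformChain

    # A plain-Python transform: receives and returns dicts of chain variables
    def clean_text(inputs: dict) -> dict:
        return {"clean_text": " ".join(inputs["text"].split())}  # collapse whitespace

    transform_chain = TransformChain(
        input_variables=["text"],
        output_variables=["clean_text"],
        transform=clean_text,
    )
    print(transform_chain.run("Some   messy\n\ntext"))
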
3. Memory

Systems for maintaining context across interactions.

Memory Types:

  • ConversationBufferMemory: Stores all messages
  • ConversationSummaryMemory: Summarizes older messages
  • ConversationBufferWindowMemory: Keeps last N messages
  • EntityMemory: Tracks information about entities
  • VectorStoreMemory: Semantic similarity retrieval

4. Document Processing

Loading, transforming, and storing documents for retrieval.

Components (a retriever-configuration sketch follows this list):

  • Document Loaders: Load from various sources
  • Text Splitters: Chunk documents intelligently
  • Vector Stores: Store and retrieve embeddings
  • Retrievers: Fetch relevant documents
  • Indexes: Organize documents for efficient access
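
Retriever configuration is where much of the tuning happens; a minimal sketch, assuming a vectorstore built as in Pattern 1 below:

    # Turn a vector store into a retriever; search_type and k are tuning knobs.
    # "mmr" (maximal marginal relevance) trades pure similarity for diversity.
    retriever = vectorstore.as_retriever(
        search_type="mmr",
        search_kwargs={"k": 4},
    )
    docs = retriever.get_relevant_documents("What is the main topic?")
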
5. Callbacks

Hooks for logging, monitoring, and debugging.

Use Cases (a token-tracking sketch follows this list):

  • Request/response logging
  • Token usage tracking
  • Latency monitoring
  • Error handling
  • Custom metrics collection
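
For the token-usage case specifically, langchain ships a ready-made context manager (OpenAI-backed models only); a minimal sketch:

    from langchain.callbacks import get_openai_callback
    from langchain.llms import OpenAI

    llm = OpenAI(temperature=0)

    # Every call made inside the context manager is metered
    with get_openai_callback() as cb:
        llm.predict("Tell me a joke")
        print(f"Total tokens: {cb.total_tokens}")
        print(f"Total cost (USD): {cb.total_cost}")
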
Quick Start

    from langchain.agents import AgentType, initialize_agent, load_tools
    from langchain.llms import OpenAI
    from langchain.memory import ConversationBufferMemory

    # Initialize LLM
    llm = OpenAI(temperature=0)

    # Load tools (serpapi requires a SERPAPI_API_KEY)
    tools = load_tools(["serpapi", "llm-math"], llm=llm)

    # Add memory
    memory = ConversationBufferMemory(memory_key="chat_history")

    # Create agent
    agent = initialize_agent(
        tools,
        llm,
        agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
        memory=memory,
        verbose=True,
    )

    # Run agent
    result = agent.run("What's the weather in SF? Then calculate 25 * 4")

Architecture Patterns

Pattern 1: RAG with LangChain

    from langchain.chains import RetrievalQA
    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import CharacterTextSplitter
    from langchain.vectorstores import Chroma
    from langchain.embeddings import OpenAIEmbeddings

    # Load and process documents
    loader = TextLoader('documents.txt')
    documents = loader.load()

    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    texts = text_splitter.split_documents(documents)

    # Create vector store
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(texts, embeddings)

    # Create retrieval chain
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=vectorstore.as_retriever(),
        return_source_documents=True,
    )

    # Query
    result = qa_chain({"query": "What is the main topic?"})

Pattern 2: Custom Agent with Tools

    from langchain.agents import AgentType, initialize_agent
    from langchain.tools import tool

    @tool
    def search_database(query: str) -> str:
        """Search internal database for information."""
        # Your database search logic
        return f"Results for: {query}"

    @tool
    def send_email(recipient: str, content: str) -> str:
        """Send an email to specified recipient."""
        # Email sending logic
        return f"Email sent to {recipient}"

    tools = [search_database, send_email]

    # send_email takes two inputs, so use the structured-chat agent;
    # the zero-shot ReAct agent only supports single-input tools.
    agent = initialize_agent(
        tools,
        llm,
        agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
        verbose=True,
    )

Pattern 3: Multi-Step Chain

    from langchain.chains import LLMChain, SequentialChain
    from langchain.prompts import PromptTemplate

    # Step 1: Extract key information
    extract_prompt = PromptTemplate(
        input_variables=["text"],
        template="Extract key entities from: {text}\n\nEntities:",
    )
    extract_chain = LLMChain(llm=llm, prompt=extract_prompt, output_key="entities")

    # Step 2: Analyze entities
    analyze_prompt = PromptTemplate(
        input_variables=["entities"],
        template="Analyze these entities: {entities}\n\nAnalysis:",
    )
    analyze_chain = LLMChain(llm=llm, prompt=analyze_prompt, output_key="analysis")

    # Step 3: Generate summary
    summary_prompt = PromptTemplate(
        input_variables=["entities", "analysis"],
        template="Summarize:\nEntities: {entities}\nAnalysis: {analysis}\n\nSummary:",
    )
    summary_chain = LLMChain(llm=llm, prompt=summary_prompt, output_key="summary")

    # Combine into a sequential chain
    overall_chain = SequentialChain(
        chains=[extract_chain, analyze_chain, summary_chain],
        input_variables=["text"],
        output_variables=["entities", "analysis", "summary"],
        verbose=True,
    )

Memory Management Best Practices

Choosing the Right Memory Type

    # For short conversations (< 10 messages)
    from langchain.memory import ConversationBufferMemory
    memory = ConversationBufferMemory()

    # For long conversations (summarize old messages)
    from langchain.memory import ConversationSummaryMemory
    memory = ConversationSummaryMemory(llm=llm)

    # For a sliding window (last N messages)
    from langchain.memory import ConversationBufferWindowMemory
    memory = ConversationBufferWindowMemory(k=5)

    # For entity tracking
    from langchain.memory import ConversationEntityMemory
    memory = ConversationEntityMemory(llm=llm)

    # For semantic retrieval of relevant history
    # (retriever as built in the Document Processing section)
    from langchain.memory import VectorStoreRetrieverMemory
    memory = VectorStoreRetrieverMemory(retriever=retriever)

Callback System

Custom Callback Handler

    from langchain.callbacks.base import BaseCallbackHandler

    class CustomCallbackHandler(BaseCallbackHandler):
        def on_llm_start(self, serialized, prompts, **kwargs):
            print(f"LLM started with prompts: {prompts}")

        def on_llm_end(self, response, **kwargs):
            print(f"LLM ended with response: {response}")

        def on_llm_error(self, error, **kwargs):
            print(f"LLM error: {error}")

        def on_chain_start(self, serialized, inputs, **kwargs):
            print(f"Chain started with inputs: {inputs}")

        def on_agent_action(self, action, **kwargs):
            print(f"Agent taking action: {action}")

    # Use the callback
    agent.run("query", callbacks=[CustomCallbackHandler()])

Testing Strategies

    import pytest
    from unittest.mock import Mock

    def test_agent_tool_selection():
        # Mock the LLM to return a specific tool selection
        mock_llm = Mock()
        mock_llm.predict.return_value = "Action: search_database\nAction Input: test query"

        agent = initialize_agent(tools, mock_llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

        result = agent.run("test query")

        # Verify the correct tool was selected
        assert "search_database" in str(mock_llm.predict.call_args)

    def test_memory_persistence():
        memory = ConversationBufferMemory()

        memory.save_context({"input": "Hi"}, {"output": "Hello!"})

        assert "Hi" in memory.load_memory_variables({})['history']
        assert "Hello!" in memory.load_memory_variables({})['history']

Performance Optimization

1. Caching

    import langchain
    from langchain.cache import InMemoryCache

    # Cache identical LLM calls in memory for the life of the process
    langchain.llm_cache = InMemoryCache()
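
The in-memory cache is lost on restart; langchain also ships a SQLite-backed variant when you want the cache to persist across runs:

    import langchain
    from langchain.cache import SQLiteCache

    # Persist cached completions to disk, surviving process restarts
    langchain.llm_cache = SQLiteCache(database_path=".langchain.db")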

2. Batch Processing

    # Process multiple documents in parallel
    # (text_splitter as defined in Pattern 1)
    from concurrent.futures import ThreadPoolExecutor
    from langchain.document_loaders import DirectoryLoader

    loader = DirectoryLoader('./docs')
    docs = loader.load()

    def process_doc(doc):
        return text_splitter.split_documents([doc])

    with ThreadPoolExecutor(max_workers=4) as executor:
        split_docs = list(executor.map(process_doc, docs))

3. Streaming Responses

    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

    llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])

Resources

  • references/agents.md: Deep dive on agent architectures
  • references/memory.md: Memory system patterns
  • references/chains.md: Chain composition strategies
  • references/document-processing.md: Document loading and indexing
  • references/callbacks.md: Monitoring and observability
  • assets/agent-template.py: Production-ready agent template
  • assets/memory-config.yaml: Memory configuration examples
  • assets/chain-example.py: Complex chain examples

Common Pitfalls

  • Memory Overflow: Not managing conversation history length
  • Tool Selection Errors: Poor tool descriptions confuse agents
  • Context Window Exceeded: Exceeding LLM token limits (see the token-budget sketch after this list)
  • No Error Handling: Not catching and handling agent failures
  • Inefficient Retrieval: Not optimizing vector store queries
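
The memory-overflow and context-window pitfalls can be addressed together by budgeting history in tokens rather than messages; a minimal sketch using ConversationTokenBufferMemory:

    from langchain.llms import OpenAI
    from langchain.memory import ConversationTokenBufferMemory

    llm = OpenAI(temperature=0)

    # Keep only as much history as fits the token budget;
    # the oldest messages are pruned first.
    memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=1000)
    memory.save_context({"input": "Hi"}, {"output": "Hello!"})
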
Production Checklist

  • [ ] Implement proper error handling (see the sketch after this list)
  • [ ] Add request/response logging
  • [ ] Monitor token usage and costs
  • [ ] Set timeout limits for agent execution
  • [ ] Implement rate limiting
  • [ ] Add input validation
  • [ ] Test with edge cases
  • [ ] Set up observability (callbacks)
  • [ ] Implement fallback strategies
  • [ ] Version control prompts and configurations
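
A minimal sketch covering the error-handling, timeout, and fallback items, using executor limits the classic initialize_agent API accepts (tools and llm as defined earlier; the limit values are illustrative):

    from langchain.agents import AgentType, initialize_agent

    agent = initialize_agent(
        tools,
        llm,
        agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
        handle_parsing_errors=True,  # recover when the LLM emits malformed output
        max_iterations=10,           # stop runaway tool loops
        max_execution_time=30,       # wall-clock timeout in seconds
    )

    try:
        result = agent.run("query")
    except Exception as exc:
        # Fallback strategy: degrade gracefully instead of crashing
        result = f"Agent failed: {exc}"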
