context-optimization
Apply compaction, masking, and caching strategies
Category: Development Tools

Install
Download and extract to your skills directory
Or copy this command and send it to OpenClaw for auto-install:
Download and install this skill https://openskills.cc/api/download?slug=sickn33-skills-context-optimization&locale=en&source=copy
Context Optimization - Context Window Optimization Techniques
Skill Overview
Context Optimization increases effective context capacity by 2–3x, without a larger model or context window, using compaction, masking, caching, and partitioning strategies.
Core Features
Compaction: Automatically triggers when context usage approaches 70–80%, intelligently summarizing tool outputs, conversation history, and retrieved documents to retain key information and discard redundant content. Compression priority: used tool outputs > earlier dialogues > updatable retrieved documents. System prompts are never compacted.
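The trigger and priority order described above can be sketched as follows. This is a minimal illustration, not part of any real API: the `ContextItem` type, the `compact` helper, the 75% threshold, and the fixed `summary_ratio` (standing in for actual summarization) are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class ContextItem:
    kind: str          # "system" | "tool_output" | "dialogue" | "retrieval"
    tokens: int
    used: bool = False  # tool output already consumed by the model

# Compression priority: used tool outputs > earlier dialogues > retrieved docs.
# System prompts are never candidates.
PRIORITY = {"tool_output": 0, "dialogue": 1, "retrieval": 2}

def compact(items, budget, threshold=0.75, summary_ratio=0.2):
    """When usage crosses the threshold, summarize low-priority items
    (modelled here as shrinking them to summary_ratio of their size)
    until usage falls back under the threshold."""
    usage = sum(i.tokens for i in items)
    if usage <= threshold * budget:
        return items
    candidates = sorted(
        (i for i in items if i.kind in PRIORITY),
        key=lambda i: (PRIORITY[i.kind], not i.used),  # used outputs first
    )
    for item in candidates:
        if usage <= threshold * budget:
            break
        saved = item.tokens - int(item.tokens * summary_ratio)
        item.tokens -= saved
        usage -= saved
    return items
```

In a real agent the shrink step would call a summarization model; the point here is the ordering: the system prompt is never touched, and already-used tool outputs are summarized before anything else.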
Observation masking: Replace verbose tool outputs with compact reference IDs, reducing context usage by 60–80%. The information remains accessible on demand but no longer continuously consumes context. Suitable for observations older than three turns, duplicate outputs, and information that has already been distilled.
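A minimal sketch of the masking idea, assuming a dict-backed store; the `ObservationStore` name and the `obs-` reference format are illustrative only. Because the ID is content-derived, duplicate outputs collapse to the same reference.

```python
import hashlib

class ObservationStore:
    """Keeps full tool outputs out of the context, retrievable by ID."""

    def __init__(self):
        self._store = {}

    def mask(self, text: str, preview_chars: int = 80) -> str:
        """Swap a verbose tool output for a compact reference stub."""
        ref = "obs-" + hashlib.sha1(text.encode()).hexdigest()[:8]
        self._store[ref] = text
        return f"[{ref}: {text[:preview_chars]}…]"

    def resolve(self, ref: str) -> str:
        """Fetch the full observation back on demand."""
        return self._store[ref]
```

The stub keeps a short preview so the model still knows what the observation was about; a dedicated tool call (not shown) would let the agent call `resolve` when it actually needs the full text.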
KV-cache optimization: Maximize cache hit rate by ordering context elements: stable content first, reusable templates in the middle, unique content last. Use consistent prompt formatting and avoid dynamic timestamps; stable workloads can reach cache hit rates above 70%.
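The stable-first ordering can be sketched like this; the stability tiers and the `build_prompt` helper are assumptions for illustration, not a real API.

```python
# Lower score = more stable = earlier in the prompt.
STABILITY = {"system": 0, "tools": 0, "template": 1, "history": 2, "query": 3}

def build_prompt(segments):
    """Order (kind, text) segments stable-first so a prefix cache keyed
    on the leading bytes stays valid across requests. Python's stable
    sort preserves the original order within each stability tier."""
    ordered = sorted(segments, key=lambda s: STABILITY[s[0]])
    return "\n\n".join(text for _, text in ordered)
```

Two requests that differ only in the user query then share an identical prefix (system prompt, tool definitions, templates), which is exactly what prefix-based KV caching can reuse.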
Frequently Asked Questions
How do I tell when context optimization is needed?
Monitor the following metrics: context usage over 70%, declining response quality as conversations lengthen, rising costs with context length, and increasing latency as dialogues grow. When any metric is abnormal, choose the corresponding strategy based on context composition: use masking if tool outputs dominate, partitioning if retrieved documents dominate, and compaction if message history dominates.
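The triage above can be written as a small selector. A hedged sketch: the `choose_strategy` name, the token-count inputs, and the hard 0.7 cutoff are assumptions.

```python
def choose_strategy(usage_ratio, tool_tokens, retrieval_tokens, history_tokens):
    """Once usage crosses the 70% line, pick the strategy matching
    whichever component dominates the context."""
    if usage_ratio < 0.7:
        return "none"
    dominant = max(
        [("masking", tool_tokens),          # tool outputs dominate
         ("partitioning", retrieval_tokens),  # retrieved docs dominate
         ("compaction", history_tokens)],     # message history dominates
        key=lambda kv: kv[1],
    )
    return dominant[0]
```

A production version would also watch the quality, cost, and latency signals listed above, but composition alone already determines which lever to pull first.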
Will context compaction degrade quality?
A reasonable compaction strategy can keep quality loss within 5%. The key is selective retention: keep key conclusions and metrics from tool outputs, decisions and commitments from dialogues, and factual claims from documents. Avoid compacting system prompts and observations related to the current task.
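Selective retention can be sketched as a filter; the content labels, the `KEEP` table, and the message shape are assumptions based on the guidance above, not a real schema.

```python
# What to keep, per source: conclusions/metrics from tool outputs,
# decisions/commitments from dialogue, factual claims from documents.
KEEP = {
    "tool_output": {"conclusion", "metric"},
    "dialogue": {"decision", "commitment"},
    "document": {"fact"},
}

def retain(messages, current_task_ids=frozenset()):
    """Keep system prompts, anything tied to the current task, and
    the key content classes for each source; drop the rest."""
    kept = []
    for m in messages:
        if m["kind"] == "system" or m["id"] in current_task_ids:
            kept.append(m)
        elif m.get("label") in KEEP.get(m["kind"], ()):
            kept.append(m)
    return kept
```

In practice the labels would come from a cheap classification pass over each message; the filter itself is what keeps quality loss low, since only redundant classes are dropped.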
How much cost can KV-Cache optimization save?
For workloads with stable prefixes (such as system prompts and tool definitions), KV-caching can reduce compute cost and latency by 30–50%. The key is to place reusable content at the front of the context and keep the prompt structure consistent, avoiding dynamic content like timestamps that breaks the cache.
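A toy demonstration of why a timestamp in the prefix defeats caching: prefix caches are effectively keyed on the leading bytes, modelled here with a hash. The `cache_key` helper is illustrative, not a real API.

```python
import hashlib
import time

def cache_key(prefix: str) -> str:
    """Stand-in for a prefix cache key: any byte change changes the key."""
    return hashlib.sha256(prefix.encode()).hexdigest()

stable = "System: you are a helpful agent.\nTools: search, calc.\n"
# Putting a timestamp before the stable text yields a new key on every
# request, so cached KV entries for the stable prefix are never reused.
with_ts = f"Generated at {time.time()}\n" + stable
```

Moving the timestamp (if it is needed at all) after the stable prefix, or dropping it, keeps the cacheable prefix byte-identical across requests.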