Streaming + chunking
OpenClaw has two separate streaming layers:
- Block streaming (channels): emit completed blocks as the assistant writes. These are normal channel messages (not token deltas).
- Preview streaming (Telegram/Discord/Slack): update a temporary preview message while generating.
There is no true token-delta streaming to channel messages today. Preview streaming is message-based (send + edits/appends).
Block streaming (channel messages)
Block streaming sends assistant output in coarse chunks as it becomes available.
Model output
  └─ text_delta/events
       ├─ (blockStreamingBreak=text_end)
       │    └─ chunker emits blocks as buffer grows
       └─ (blockStreamingBreak=message_end)
            └─ chunker flushes at message_end
                   └─ channel send (block replies)
Legend:
- text_delta/events: model stream events (may be sparse for non-streaming models).
- chunker: EmbeddedBlockChunker applying min/max bounds + break preference.
- channel send: actual outbound messages (block replies).
Controls:
- agents.defaults.blockStreamingDefault: "on"/"off" (default off).
- Channel overrides: *.blockStreaming (and per-account variants) to force "on"/"off" per channel.
- agents.defaults.blockStreamingBreak: "text_end" or "message_end".
- agents.defaults.blockStreamingChunk: { minChars, maxChars, breakPreference? }.
- agents.defaults.blockStreamingCoalesce: { minChars?, maxChars?, idleMs? } (merge streamed blocks before send).
- Channel hard cap: *.textChunkLimit (e.g., channels.whatsapp.textChunkLimit).
- Channel chunk mode: *.chunkMode ("length" is the default; "newline" splits on blank lines (paragraph boundaries) before length chunking).
- Discord soft cap: channels.discord.maxLinesPerMessage (default 17) splits tall replies to avoid UI clipping.
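The channel-level caps above can be pictured with a small Python sketch. It is illustrative only and is not OpenClaw's implementation: the function names (split_by_length, chunk_outbound, split_tall_reply) are hypothetical, and it assumes that in "newline" mode each paragraph is chunked on its own, which the text above does not spell out.

```python
import re


def split_by_length(text: str, limit: int) -> list[str]:
    """Hard-split text into pieces of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def chunk_outbound(text: str, text_chunk_limit: int, chunk_mode: str = "length") -> list[str]:
    """Apply a chunkMode-style split, then enforce the hard character cap.

    Assumption: "newline" mode treats every blank-line-separated paragraph
    as its own unit before length chunking.
    """
    if chunk_mode == "newline":
        paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    else:
        paragraphs = [text]
    chunks: list[str] = []
    for paragraph in paragraphs:
        chunks.extend(split_by_length(paragraph, text_chunk_limit))
    return chunks


def split_tall_reply(text: str, max_lines: int = 17) -> list[str]:
    """Split a reply so no message has more than `max_lines` lines."""
    lines = text.split("\n")
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]
```

For example, chunk_outbound("aaaa\n\nbbbbbb", 4, "newline") yields one chunk for the first paragraph and two for the second, while "length" mode cuts straight through the blank line.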
Boundary semantics:
- text_end: stream blocks as soon as chunker emits; flush on each text_end.
- message_end: wait until assistant message finishes, then flush buffered output.
message_end still uses the chunker if the buffered text exceeds maxChars, so it can emit multiple chunks at the end.
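To make the difference between the two boundaries concrete, here is a minimal Python simulation. It is a sketch under assumptions: the event tuples, drain, and simulate are hypothetical names, and the chunker is reduced to a fixed-size splitter (no minChars, no break preference) so that only the timing of sends differs.

```python
def drain(buffer: str, max_chars: int, force: bool) -> tuple[list[str], str]:
    """Cut max_chars-sized pieces off the front; when forced, emit the remainder too."""
    out = []
    while len(buffer) >= max_chars:
        out.append(buffer[:max_chars])
        buffer = buffer[max_chars:]
    if force and buffer:
        out.append(buffer)
        buffer = ""
    return out, buffer


def simulate(events, break_mode: str, max_chars: int) -> list[tuple[str, str]]:
    """Return (triggering_event, block) pairs in the order they would be sent."""
    sent = []
    buffer = ""
    for kind, text in events:
        if kind == "text_delta":
            buffer += text
            if break_mode == "text_end":
                # Emit as the buffer grows.
                blocks, buffer = drain(buffer, max_chars, force=False)
                sent += [(kind, b) for b in blocks]
        elif kind == "text_end" and break_mode == "text_end":
            # Flush whatever is left at each text_end.
            blocks, buffer = drain(buffer, max_chars, force=True)
            sent += [(kind, b) for b in blocks]
        elif kind == "message_end":
            # Final flush; may still produce several chunks.
            blocks, buffer = drain(buffer, max_chars, force=True)
            sent += [(kind, b) for b in blocks]
    return sent


EVENTS = [
    ("text_delta", "abcdef"),
    ("text_delta", "ghij"),
    ("text_end", ""),
    ("message_end", ""),
]
```

With max_chars=4, the text_end mode sends "abcd" and "efgh" while deltas are still arriving and "ij" at text_end; the message_end mode sends the same three blocks, but all of them only when message_end arrives.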
Chunking algorithm (low/high bounds)
Block chunking is implemented by EmbeddedBlockChunker:
- Low bound: don’t emit until buffer >= minChars (unless forced).
- High bound: prefer splits before maxChars; if forced, split at maxChars.
- Break preference: paragraph → newline → sentence → whitespace → hard break.
- Code fences: never split inside fences; when forced at maxChars, close + reopen the fence to keep Markdown valid.
maxChars is clamped to the channel textChunkLimit, so you can’t exceed per-channel caps.
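The bounds and break preference can be sketched in a few lines of Python. This is not the EmbeddedBlockChunker source; next_chunk and chunk_all are hypothetical helpers, "forced" is read here as either a final flush or a buffer that has reached maxChars, only ". " counts as a sentence break, and code-fence handling is left out entirely.

```python
# Break candidates in preference order; a hard break is the fallback.
BREAKS = [
    ("paragraph", "\n\n"),
    ("newline", "\n"),
    ("sentence", ". "),
    ("whitespace", " "),
]


def next_chunk(buffer: str, min_chars: int, max_chars: int, force: bool = False):
    """Return (chunk, rest); chunk is None when the chunker should keep waiting."""
    if not buffer:
        return None, buffer
    if len(buffer) < min_chars and not force:
        return None, buffer            # low bound: too little text so far
    if force and len(buffer) <= max_chars:
        return buffer, ""              # final flush fits in one chunk
    window = buffer[:max_chars]        # high bound: only look before max_chars
    for _name, sep in BREAKS:
        idx = window.rfind(sep)
        cut = idx + len(sep) if idx != -1 else -1
        if cut >= min_chars:
            return buffer[:cut], buffer[cut:]
    if len(buffer) >= max_chars:
        return buffer[:max_chars], buffer[max_chars:]   # hard break
    return None, buffer


def chunk_all(text: str, min_chars: int, max_chars: int, text_chunk_limit: int) -> list[str]:
    """Flush a complete buffer, clamping max_chars to the channel cap."""
    max_chars = min(max_chars, text_chunk_limit)
    chunks, rest = [], text
    while rest:
        chunk, rest = next_chunk(rest, min_chars, max_chars, force=True)
        chunks.append(chunk)
    return chunks
```

In this sketch each chunk keeps its trailing separator, so concatenating the chunks reproduces the input. A real sender would likely trim that whitespace before sending.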
Coalescing (merge streamed blocks)
When block streaming is enabled, OpenClaw can merge consecutive block chunks
before sending them out. This reduces “single-line spam” while still providing
progressive output.
- Coalescing waits for idle gaps (idleMs) before flushing.
- Buffers are capped by maxChars and will flush if they exceed it.
- minChars prevents tiny fragments from sending until enough text accumulates (final flush always sends remaining text).
- Joiner is derived from blockStreamingChunk.breakPreference (paragraph → \n\n, newline → \n, sentence → space).
- Channel overrides are available via *.blockStreamingCoalesce (including per-account configs).
- Default coalesce minChars is bumped to 1500 for Signal/Slack/Discord unless overridden.
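The coalescing rules above can be modelled with a small buffer class. The Python below is a sketch, not OpenClaw code: the Coalescer class and its add/tick/finish methods are hypothetical, time is passed in explicitly as milliseconds, and "flush if they exceed it" is interpreted as flushing the existing buffer before a block that would push it past maxChars.

```python
JOINERS = {"paragraph": "\n\n", "newline": "\n", "sentence": " "}


class Coalescer:
    def __init__(self, min_chars: int, max_chars: int, idle_ms: int,
                 break_preference: str = "paragraph"):
        self.min_chars = min_chars
        self.max_chars = max_chars
        self.idle_ms = idle_ms
        self.joiner = JOINERS[break_preference]
        self.parts: list[str] = []
        self.last_add_ms = 0

    def _text(self) -> str:
        return self.joiner.join(self.parts)

    def _flush(self) -> list[str]:
        text = self._text()
        self.parts = []
        return [text] if text else []

    def add(self, block: str, now_ms: int) -> list[str]:
        """Buffer a block; flush first if it would push the buffer past max_chars."""
        out: list[str] = []
        if self.parts and len(self.joiner.join(self.parts + [block])) > self.max_chars:
            out += self._flush()
        self.parts.append(block)
        self.last_add_ms = now_ms
        return out

    def tick(self, now_ms: int) -> list[str]:
        """Flush once the stream has been idle long enough and min_chars is met."""
        if not self.parts:
            return []
        idle = now_ms - self.last_add_ms >= self.idle_ms
        if idle and len(self._text()) >= self.min_chars:
            return self._flush()
        return []

    def finish(self) -> list[str]:
        """Final flush always sends whatever remains."""
        return self._flush()
```

Each method returns the list of messages that would go out at that moment, which keeps the sketch free of timers and easy to step through by hand.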
Human-like pacing between blocks
When block streaming is enabled, you can add a randomized pause between
block replies (after the first block). This makes multi-bubble responses feel
more natural.
- Config: agents.defaults.humanDelay (override per agent via agents.list[].humanDelay).
- Modes: off (default), natural (800–2500ms), custom (minMs/maxMs).
- Applies only to block replies, not final replies or tool summaries.
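As a rough illustration of the pacing rule, the Python sketch below picks a pause for a given block. The function name human_delay_ms and the "mode" key are assumptions made for the example; only the mode names, the 800–2500ms range, and minMs/maxMs come from the text above.

```python
import random


def human_delay_ms(config: dict, block_index: int, rng=random) -> int:
    """Pause (in ms) to wait before sending the block at `block_index`.

    Assumption: the config is a dict such as {"mode": "natural"} or
    {"mode": "custom", "minMs": 300, "maxMs": 900}.
    """
    mode = config.get("mode", "off")
    if mode == "off" or block_index == 0:
        return 0                      # no pause before the first block
    if mode == "natural":
        low, high = 800, 2500
    elif mode == "custom":
        low, high = config["minMs"], config["maxMs"]
    else:
        raise ValueError(f"unknown humanDelay mode: {mode!r}")
    return rng.randint(low, high)     # inclusive on both ends
```

Passing a seeded random.Random instance as rng makes the delays reproducible in tests.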
“Stream chunks or everything”
This maps to:
- Stream chunks: blockStreamingDefault: "on" + blockStreamingBreak: "text_end" (emit as you go). Non-Telegram channels also need *.blockStreaming: true.
- Stream everything at end: blockStreamingBreak: "message_end" (flush once, possibly multiple chunks if very long).
- No block streaming: blockStreamingDefault: "off" (only final reply).
Channel note: Block streaming is off unless
*.blockStreaming is explicitly set to true. Channels can stream a live preview
(channels.<channel>.streaming) without block replies.
Config location reminder: the blockStreaming* defaults live under
agents.defaults, not the root config.
Preview streaming modes
Canonical key: channels.<channel>.streaming
Modes:
- off: disable preview streaming.
- partial: single preview that is replaced with latest text.
- block: preview updates in chunked/appended steps.
- progress: progress/status preview during generation, final answer at completion.
Channel mapping
| Channel | off | partial | block | progress |
|---|---|---|---|---|
| Telegram | ✅ | ✅ | ✅ | maps to partial |
| Discord | ✅ | ✅ | ✅ | maps to partial |
| Slack | ✅ | ✅ | ✅ | ✅ |
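The table can be read as a lookup from the configured mode to the mode a channel actually runs. The Python sketch below encodes exactly the rows shown; the names EFFECTIVE_MODE and effective_preview_mode are hypothetical and not part of OpenClaw.

```python
# Configured mode -> effective mode, per channel, as listed in the table.
EFFECTIVE_MODE = {
    "telegram": {"off": "off", "partial": "partial", "block": "block", "progress": "partial"},
    "discord": {"off": "off", "partial": "partial", "block": "block", "progress": "partial"},
    "slack": {"off": "off", "partial": "partial", "block": "block", "progress": "progress"},
}


def effective_preview_mode(channel: str, configured: str) -> str:
    """Resolve channels.<channel>.streaming to the mode the channel will use."""
    try:
        return EFFECTIVE_MODE[channel][configured]
    except KeyError:
        raise ValueError(f"unsupported channel/mode: {channel!r}/{configured!r}") from None
```

For instance, effective_preview_mode("telegram", "progress") returns "partial", while the same call for "slack" returns "progress".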
Slack-only:
- channels.slack.nativeStreaming toggles Slack native streaming API calls when streaming=partial (default: true).
Legacy key migration:
- Telegram: streamMode + boolean streaming auto-migrate to streaming enum.
- Discord: streamMode + boolean streaming auto-migrate to streaming enum.
- Slack: streamMode auto-migrates to streaming enum; boolean streaming auto-migrates to nativeStreaming.
Runtime behavior
Telegram:
- Uses sendMessage + editMessageText preview updates across DMs and group/topics.
- Preview streaming is skipped when Telegram block streaming is explicitly enabled (to avoid double-streaming).
- /reasoning stream can write reasoning to preview.
Discord:
- Uses send + edit preview messages.
- block mode uses draft chunking (draftChunk).
- Preview streaming is skipped when Discord block streaming is explicitly enabled.
Slack:
- partial can use Slack native streaming (chat.startStream/append/stop) when available.
- block uses append-style draft previews.
- progress uses status preview text, then final answer.
Last modified on March 22, 2026