Create a new run for a thread and stream the output as Server-Sent Events (SSE). This endpoint executes a search query within a thread context and streams the results in real-time.
This endpoint follows the LangGraph API and is compatible with the LangGraph SDK.
API definition: https://docs.langchain.com/langsmith/agent-server-api/thread-runs/create-run-stream-output
Official TypeScript implementation: https://github.com/langchain-ai/langgraphjs/blob/main/libs/langgraph-api/src/api/runs.mts#L354-L392
The endpoint supports multiple streaming modes:
- `updates`: stream updates as they occur during graph execution
- `values`: stream the current state values after each update
- `custom`: stream custom events
The response is a Server-Sent Events (SSE) stream with text/event-stream content type. Each event contains the stream mode and data.
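As a rough illustration of the wire format, the sketch below parses a raw SSE payload into `(event, data)` pairs. The sample event names and JSON bodies are hypothetical, not taken from an actual response:

```python
import json

def parse_sse(raw: str):
    """Parse a raw text/event-stream payload into (event, data) pairs."""
    events = []
    # Events in an SSE stream are separated by a blank line.
    for block in raw.strip().split("\n\n"):
        event, data_lines = None, []
        for line in block.split("\n"):
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if event is not None:
            events.append((event, json.loads("\n".join(data_lines))))
    return events

# Hypothetical payload shaped like the stream described above.
raw = (
    "event: metadata\n"
    'data: {"run_id": "1ef4a9b4"}\n'
    "\n"
    "event: updates\n"
    'data: {"agent": {"messages": []}}\n'
)
print(parse_sse(raw))
```

In practice the LangGraph SDK performs this parsing for you; the function above only shows what the `text/event-stream` body looks like on the wire.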
Cost: (number of profiles returned) × (sum of the credits for each of: type, insights, profile_scoring, high_freshness, reveal_emails, reveal_phones)
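The cost formula above can be sketched as a small helper. The per-option credit prices used here are placeholders, not actual pricing:

```python
# Hypothetical per-profile credit prices; real values depend on your plan.
CREDITS = {
    "type": 1,
    "insights": 2,
    "profile_scoring": 1,
    "high_freshness": 1,
    "reveal_emails": 3,
    "reveal_phones": 5,
}

def run_cost(num_profiles: int, options: list[str]) -> int:
    """Cost = (profiles returned) x (sum of credits for the requested options)."""
    return num_profiles * sum(CREDITS[o] for o in options)

# 10 profiles with three options: 10 * (1 + 2 + 3) = 60 credits.
print(run_cost(10, ["type", "insights", "reveal_emails"]))
```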
Example request (Python):
```python
import asyncio
import uuid

from langgraph_sdk import get_client

# API_KEY and base_url are assumed to be defined elsewhere.
headers = {"Authorization": f"Bearer {API_KEY}"}
client = get_client(url=base_url, headers=headers)

async def main():
    thread_id = str(uuid.uuid4())
    input_data = {
        "messages": [
            {
                "type": "human",
                "content": "ml engineers in seattle",
            }
        ],
    }
    chunks_received = []
    stream_mode = ["updates", "values", "custom"]
    print(f"Starting stream. {thread_id=} {input_data=}")
    async for chunk in client.runs.stream(
        thread_id=thread_id,
        assistant_id="agent",
        input=input_data,
        config={},
        stream_mode=stream_mode,
    ):
        chunks_received.append(chunk)
        print(chunk)

asyncio.run(main())
```
Example output: https://gist.github.com/vslaykovsky/419c9e8348cab643fa8814f2eb6ad120
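When several stream modes are requested at once, each chunk's event name tells you which mode produced it, so a consumer typically dispatches or tallies chunks by event. The sketch below models chunks as plain `(event, data)` tuples rather than SDK objects, and assumes subgraph events are namespaced with a `|` suffix (e.g. `updates|node-a`):

```python
from collections import Counter

def tally_modes(chunks):
    """Count received chunks by their stream-mode event name.

    Assumes namespaced subgraph events look like "updates|node-a";
    only the prefix before "|" is kept.
    """
    counts = Counter()
    for event, _data in chunks:
        counts[event.split("|", 1)[0]] += 1
    return dict(counts)

# Hypothetical chunk sequence shaped like a multi-mode stream.
chunks = [
    ("metadata", {"run_id": "1ef4a9b4"}),
    ("updates", {"search": {}}),
    ("values", {"messages": []}),
    ("updates", {"agent": {}}),
]
print(tally_modes(chunks))  # {'metadata': 1, 'updates': 2, 'values': 1}
```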
