Crawlstack allows you to capture live data streams from modern real-time applications by intercepting WebSocket traffic and Server-Sent Events (SSE).

WebSocket Interception

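To capture WebSocket messages, you must first enable interception with runner.enableWebsockets(). You can then retrieve captured frames with runner.getWebsocketMessages(), which resolves to an object whose events array holds one entry per frame.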
// 1. Enable interception
await runner.enableWebsockets();

// 2. Trigger action that uses WebSockets
document.querySelector("#btn-start-stream").click();

// 3. Poll for messages
await runner.waitFor(async () => {
    const res = await runner.getWebsocketMessages();
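    // Find the first captured frame whose payload mentions "TRADE"
    const trade = res.events.find(e => e.message.includes("TRADE"));

    if (trade) {
        // Assumes the payload is valid JSON; JSON.parse throws otherwise
        await runner.publishItems([{ id: Date.now(), data: JSON.parse(trade.message) }]);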
        return true;
    }
    return false;
});

// 4. (Optional) Clear messages or disable interception
await runner.clearWebsocketMessages();
await runner.disableWebsockets();

Event Object

Field | Type | Description
url | string | The URL of the WebSocket connection.
type | string | Either send (client to server) or receive (server to client).
message | string | The payload data of the frame.
opcode | number | The WebSocket opcode (e.g., 1 for text, 2 for binary).
timestamp | number | Browser timestamp of the event.
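
The fields above are enough to narrow a batch of captured frames before parsing anything. The following is a minimal sketch, assuming events have exactly the shape listed in the table and that type holds the literal strings "send" and "receive". The helper selectJsonFrames and the sample events are hypothetical and invented for illustration; they are not part of the Crawlstack API.

```javascript
// Hypothetical helper (not a Crawlstack API): keep only text frames received
// from the server whose payload parses as JSON. Other frames are skipped.
function selectJsonFrames(events, urlFragment = "") {
    const parsed = [];
    for (const e of events) {
        if (e.type !== "receive") continue;          // ignore client-to-server frames
        if (e.opcode !== 1) continue;                // 1 = text frame; skip binary (2)
        if (!e.url.includes(urlFragment)) continue;  // optional filter on connection URL
        try {
            parsed.push({ timestamp: e.timestamp, data: JSON.parse(e.message) });
        } catch {
            // Payload is not JSON (for example a plain-text heartbeat); skip it.
        }
    }
    return parsed;
}

// Invented sample events in the documented shape.
const sampleEvents = [
    { url: "wss://example.test/feed", type: "send", message: '{"op":"subscribe"}', opcode: 1, timestamp: 1000 },
    { url: "wss://example.test/feed", type: "receive", message: '{"kind":"TRADE","price":101.5}', opcode: 1, timestamp: 1001 },
    { url: "wss://example.test/feed", type: "receive", message: "ping", opcode: 1, timestamp: 1002 },
];

const frames = selectJsonFrames(sampleEvents, "/feed");
console.log(frames.length); // 1
```

In a script, the same helper could be applied to res.events inside the runner.waitFor() callback, so that only parsed server frames reach runner.publishItems().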

Server-Sent Events (SSE)

Similarly, call runner.enableSse() to turn on SSE interception, then poll runner.getSseMessages() for captured events.
// 1. Enable interception
await runner.enableSse();

// 2. Poll for messages
await runner.waitFor(async () => {
    const res = await runner.getSseMessages();
    if (res.events.length > 0) {
        console.log(`Received ${res.events.length} SSE messages`);
        return true;
    }
    return false;
});

Event Object

Field | Type | Description
url | string | The URL of the event stream.
eventName | string | The name of the event (defaults to message).
eventId | string | The ID of the event if provided by the server.
message | string | The payload data.
timestamp | number | Browser timestamp of the event.
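
Because a single stream can carry several named event types, it can help to bucket captured events by eventName before processing them. This is a small sketch under the assumption that events match the table above; groupByEventName and the sample data are hypothetical and not part of the Crawlstack API.

```javascript
// Hypothetical helper (not a Crawlstack API): bucket SSE payloads by event name.
function groupByEventName(events) {
    const groups = new Map();
    for (const e of events) {
        // Fall back to the documented default name if the field is empty.
        const name = e.eventName || "message";
        if (!groups.has(name)) groups.set(name, []);
        groups.get(name).push(e.message);
    }
    return groups;
}

// Invented sample events in the documented shape.
const sseSample = [
    { url: "https://example.test/stream", eventName: "message", eventId: "1", message: "hello", timestamp: 2000 },
    { url: "https://example.test/stream", eventName: "price", eventId: "2", message: '{"price":101.5}', timestamp: 2001 },
    { url: "https://example.test/stream", eventName: "price", eventId: "3", message: '{"price":101.7}', timestamp: 2002 },
];

const grouped = groupByEventName(sseSample);
console.log(grouped.get("price").length); // 2
```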

Best Practices

Real-time streams can be high-volume. To prevent memory issues or database spam:
  1. Filter early: Only process or publish messages that contain the data you need.
  2. Handle state: Use a local variable in your script to accumulate data and only call publishItems once a specific condition is met or the run is about to finish.
  3. Wait for completion: Use runner.waitFor() to give the stream's messages enough time to arrive before the script terminates.
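
The three practices above can be combined in one pattern: filter each batch, accumulate matches in a local buffer, and publish once. The sketch below is illustrative only. createTradeBuffer and collectTrades are hypothetical helpers, the "TRADE" filter and the limit are arbitrary, and de-duplicating on timestamp plus payload is an assumption made because this page does not say whether getWebsocketMessages() returns previously seen events again. How runner.waitFor() behaves on timeout is also not covered here.

```javascript
// Hypothetical accumulator (not a Crawlstack API): filters early, skips
// events already seen in an earlier poll, and holds items in memory.
function createTradeBuffer(limit) {
    const seen = new Set();
    const items = [];
    return {
        // Feed one batch of events; returns true once `limit` items are held.
        add(events) {
            for (const e of events) {
                if (!e.message.includes("TRADE")) continue;  // 1. filter early
                const key = `${e.timestamp}:${e.message}`;   // assumed unique per frame
                if (seen.has(key)) continue;                  // counted in an earlier poll
                seen.add(key);
                if (items.length < limit) items.push({ id: key, data: e.message });
            }
            return items.length >= limit;
        },
        // Return a copy so callers cannot mutate the internal array.
        items: () => items.slice(),
    };
}

// Wiring sketch: `runner` is passed in so the function can be exercised with a stub.
// Only runner methods shown elsewhere on this page are used.
async function collectTrades(runner, limit) {
    const buffer = createTradeBuffer(limit);          // 2. local state
    await runner.enableWebsockets();
    await runner.waitFor(async () => {                // 3. wait for completion
        const res = await runner.getWebsocketMessages();
        return buffer.add(res.events);
    });
    await runner.publishItems(buffer.items());        // single publish at the end
    await runner.disableWebsockets();
    return buffer.items().length;
}
```

Inside a Crawlstack script this might be invoked as await collectTrades(runner, 50). Keeping the accumulator separate from the runner calls makes the filtering logic easy to check on its own.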