Intro

Node.js is primarily single-threaded: its event-driven, non-blocking I/O architecture handles tasks asynchronously on a single thread, which works particularly well for I/O operations. However, for CPU-intensive tasks, or tasks that benefit from parallelism (such as heavy computations), Node.js can also leverage multiple threads. One way to achieve this is the worker_threads module.

Worker Threads Module

Node.js introduced the worker_threads module in v10.5.0 (with full stability starting in v12), which allows for creating multiple threads within a Node.js application.

  • Creating a Worker: A new thread (worker) can be created using the Worker class from the worker_threads module. Each worker runs in isolation, meaning it has its own event loop, memory, and Node.js runtime.
  • Passing Data: Workers can communicate with the main thread via postMessage and on('message') methods, which allow data to be transferred in a safe, asynchronous way.
  • Use Case: worker_threads is useful for CPU-intensive tasks (e.g., image processing, data parsing, or mathematical calculations) that could block the event loop if run on the main thread.
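The three points above can be sketched in a single file. This is a minimal illustration, not a production setup: the worker source is passed inline with the `eval: true` option only to keep the example self-contained, and the names `workerSource`, `countPrimes` and `countPrimesInWorker` are hypothetical.

```javascript
const { Worker } = require('node:worker_threads');

// Code executed inside the worker thread. It is kept inline (eval: true)
// only so that this sketch fits in one file.
const workerSource = `
  const { parentPort } = require('node:worker_threads');

  // Deliberately naive CPU-bound task: count the primes up to "limit".
  function countPrimes(limit) {
    let count = 0;
    for (let n = 2; n <= limit; n++) {
      let prime = true;
      for (let d = 2; d * d <= n; d++) {
        if (n % d === 0) { prime = false; break; }
      }
      if (prime) count++;
    }
    return count;
  }

  // Receive the input from the main thread and send the result back.
  parentPort.on('message', (limit) => {
    parentPort.postMessage(countPrimes(limit));
  });
`;

// Hypothetical helper: runs the task in a new worker and resolves with its reply.
function countPrimesInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (count) => {
      resolve(count);
      worker.terminate(); // one-shot worker: stop it once it has answered
    });
    worker.once('error', reject);
    worker.postMessage(limit);
  });
}

countPrimesInWorker(100).then((count) => console.log(count)); // 25
```

Starting a new worker per task is expensive, which is the overhead a worker pool is meant to avoid.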

Piscina

Piscina is a Node.js worker pool implementation that helps distribute heavy CPU-bound and asynchronous tasks across multiple threads efficiently, by leveraging Node’s worker_threads module. It abstracts away much of the complexity involved in managing a pool of workers, making it easy for developers to offload tasks without worrying about low-level thread management.

Design

Piscina is a library that sits between the user code and the libuv library. By default, every payload passed to Piscina is copied unless the move function is used explicitly.

Only transferable data (such as an ArrayBuffer) can be moved.

If you pass to or from Piscina a payload object that wraps an ArrayBuffer, the object must implement transferableSymbol and valueSymbol; otherwise the data is copied.
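The difference between copying and moving can be observed without any worker at all. The sketch below does not use Piscina: it assumes a Node.js version that provides the global `structuredClone` (v17 or later), which applies the same structured clone algorithm used when messages are posted to a worker. The variable names and the `wrapper` object are illustrative only.

```javascript
// A plain ArrayBuffer holding 8 bytes, all set to 1.
const original = new ArrayBuffer(8);
new Uint8Array(original).fill(1);

// Default behaviour: the bytes are copied, and the original stays usable.
const copied = structuredClone(original);
new Uint8Array(copied)[0] = 42;
console.log(new Uint8Array(original)[0], new Uint8Array(copied)[0]); // 1 42

// An object that wraps an ArrayBuffer is copied too, buffer included,
// because nothing asked for the buffer to be transferred.
const wrapper = { id: 'frame-1', pixels: new Uint8Array(new ArrayBuffer(4)) };
const clonedWrapper = structuredClone(wrapper);
console.log(clonedWrapper.pixels.buffer === wrapper.pixels.buffer); // false

// Listing the buffer as transferable moves it instead:
// the original is detached and its length drops to 0.
const moved = structuredClone(original, { transfer: [original] });
console.log(original.byteLength, moved.byteLength); // 0 8
```

A moved buffer can no longer be read or written from the side that gave it away, which is why moving has to be requested explicitly.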

Data structures

Structure         | Raw | Transferable | Thread safe | Piscina does
ArrayBuffer       |     |              |             | move, copy
SharedArrayBuffer |     |              |             | copy pointer
TypedArray        |     |              |             | copy

How standard arrays work

Here is a brief example of how standard arrays are allocated, edited, and copied.

(async () => {
    const raw = new ArrayBuffer(4); // declare 4 bytes of raw data

    // a view declared on a raw ArrayBuffer shares its memory: only the pointer is copied
    const uint8 = new Uint8Array(raw); // copy pointer from raw
    uint8.fill(255, 0, 4); // assign 255 to each position
    console.log(uint8); // [255, 255, 255, 255]
    const uint16 = new Uint16Array(raw);
    // same data, different view
    console.log(uint16); // [65535, 65535]
    uint16[0] = 61680; // assign 61680 in first position
    // then data are modified in this way
    console.log(uint8, uint16); // Uint8Array(4) [ 240, 240, 255, 255 ] Uint16Array(2) [ 61680, 65535 ]

    // when declaring a view from another view then the data are copied
    const uint8Copy = new Uint8Array(uint8);
    uint8Copy[0] = 0;
    // as you can see data are copied
    console.log(uint8, uint8Copy); // Uint8Array(4) [ 240, 240, 255, 255 ] Uint8Array(4) [ 0, 240, 255, 255 ]

    // a Node.js Buffer is a particular typed array (a subclass of Uint8Array)
    const buffer = Buffer.from(raw); // pointer is copied
    buffer[0] = 1;
    console.log(raw, buffer); // ArrayBuffer { [Uint8Contents]: <01 f0 ff ff>, byteLength: 4 } <Buffer 01 f0 ff ff>

    const bufferCopy = Buffer.from(uint8); // data are copied
    bufferCopy[0] = 2;
    console.log(uint8, bufferCopy); // Uint8Array(4) [ 1, 240, 255, 255 ] <Buffer 02 f0 ff ff>

    const bufferCopy2 = Buffer.from(bufferCopy); // data are copied
    bufferCopy2[0] = 3;
    console.log(bufferCopy, bufferCopy2); // <Buffer 02 f0 ff ff> <Buffer 03 f0 ff ff>

    // it is possible to avoid copying data from another typed array by using its underlying ArrayBuffer (.buffer)
    const newUint8 = new Uint8Array(uint8.buffer); // copy pointer from raw
    newUint8[0] = 5;
    console.log(newUint8, uint8); // Uint8Array(4) [ 5, 240, 255, 255 ] Uint8Array(4) [ 5, 240, 255, 255 ]

    // it works also with node buffer
    const newBuffer = Buffer.from(uint8.buffer);
    newBuffer[0] = 6;
    console.log(newBuffer, uint8); // <Buffer 06 f0 ff ff> Uint8Array(4) [ 6, 240, 255, 255 ]

    // or
    const newBuffer2 = Buffer.from(newBuffer.buffer);
    newBuffer2[0] = 7;
    console.log(newBuffer, newBuffer2); // <Buffer 07 f0 ff ff> <Buffer 07 f0 ff ff>
})();

Data copied from and to the worker

This section shows a case where copying data to and from the worker is good practice: little data is transferred, and the worker offloads work from the main thread.
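A minimal sketch of the copy case follows. It uses the built-in worker_threads module directly rather than Piscina, on the assumption that the copy semantics are the same because Piscina is built on that module. The worker source is inline (`eval: true`) for brevity, and `workerSource` and `summarize` are hypothetical names.

```javascript
const { Worker } = require('node:worker_threads');

const workerSource = `
  const { parentPort } = require('node:worker_threads');

  parentPort.once('message', (samples) => {
    // "samples" is the worker's private copy of the data.
    let sum = 0;
    for (let i = 0; i < samples.length; i++) {
      sum += samples[i];
      samples[i] = 0; // this mutation is invisible to the main thread
    }
    // The small result object is copied back as well.
    parentPort.postMessage({ sum, mean: sum / samples.length });
  });
`;

// Hypothetical helper: sends a small typed array to a worker and resolves with its summary.
function summarize(samples) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (result) => {
      resolve(result);
      worker.terminate();
    });
    worker.once('error', reject);
    // No transfer list: the typed array is copied by structured clone.
    worker.postMessage(samples);
  });
}

const samples = new Float64Array([1, 2, 3, 4]);
summarize(samples).then((result) => {
  console.log(result);  // { sum: 10, mean: 2.5 }
  console.log(samples); // Float64Array(4) [ 1, 2, 3, 4 ]
});
```

For a payload of a few bytes the copy is negligible, and the main thread keeps a usable array after the call.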

Data moved from and to the worker

This section shows some cases where moving data is better practice than copying it. The examples cover moving data with Piscina, either as a transferable object or as transferable data, and copying only a pointer by using a SharedArrayBuffer synchronized with a mutex.
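The sketch below illustrates both ideas with the built-in worker_threads module rather than Piscina: an ArrayBuffer is moved to the worker and back through a transfer list, and a SharedArrayBuffer is shared by pointer. For brevity it uses an atomic counter instead of a full mutex. The names `workerSource`, `fillInWorker` and `processedBytes` are hypothetical, and the inline worker (`eval: true`) is only there to keep the example in one file.

```javascript
const { Worker } = require('node:worker_threads');

const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');

  // View over memory shared with the main thread: nothing is copied or moved.
  const processedBytes = new Int32Array(workerData.shared);

  parentPort.once('message', (buffer) => {
    // The worker now owns the transferred buffer.
    const bytes = new Uint8Array(buffer);
    bytes.fill(7);
    // Atomic update, visible to the main thread through the shared memory.
    Atomics.add(processedBytes, 0, bytes.length);
    // Move the buffer back instead of copying it.
    parentPort.postMessage(buffer, [buffer]);
  });
`;

// Hypothetical helper: moves a buffer of "size" bytes to a worker and back.
function fillInWorker(size) {
  return new Promise((resolve, reject) => {
    const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
    const processedBytes = new Int32Array(shared);
    const buffer = new ArrayBuffer(size);

    const worker = new Worker(workerSource, { eval: true, workerData: { shared } });
    worker.once('error', reject);

    // Listing the buffer in the transfer list moves it: the main thread's
    // reference is detached as soon as postMessage returns.
    worker.postMessage(buffer, [buffer]);
    const lengthAfterMove = buffer.byteLength;

    worker.once('message', (returned) => {
      resolve({
        lengthAfterMove,
        bytes: new Uint8Array(returned),
        processedBytes: Atomics.load(processedBytes, 0),
      });
      worker.terminate();
    });
  });
}

// lengthAfterMove is 0, bytes holds four 7s, processedBytes is 4.
fillInWorker(4).then((result) => console.log(result));
```

Moving pays off for large buffers, where a copy would temporarily double the memory in use; sharing avoids both the copy and the hand-over, at the price of having to synchronize access.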

How to test

The standard Node.js API process.memoryUsage() gives the programmer a good view of what is going on behind the scenes. Pay attention to rss and arrayBuffers before and after a Piscina call.
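A possible way to take such a measurement is sketched below. The helper `memoryDelta` is a hypothetical name, and the allocation of a 32 MiB ArrayBuffer merely stands in for the call being measured. Exact numbers vary between runs and Node.js versions, so they should be read as an indication rather than a precise account.

```javascript
// Hypothetical helper: runs "action" and reports how rss and arrayBuffers
// changed between the two process.memoryUsage() snapshots.
function memoryDelta(action) {
  const before = process.memoryUsage();
  const kept = action(); // keep the result referenced so it is not collected
  const after = process.memoryUsage();
  return {
    kept,
    rss: after.rss - before.rss,
    arrayBuffers: after.arrayBuffers - before.arrayBuffers,
  };
}

const size = 32 * 1024 * 1024; // 32 MiB
const delta = memoryDelta(() => new ArrayBuffer(size));

console.log('rss delta (bytes):', delta.rss);
console.log('arrayBuffers delta (bytes):', delta.arrayBuffers);
```

If a payload is copied to a worker, arrayBuffers is expected to grow by roughly the payload size; if it is moved, the growth should be much smaller.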
