Intro
Node.js is primarily single-threaded: its event-driven, non-blocking I/O architecture handles tasks asynchronously on a single thread, which suits I/O operations well. However, for CPU-intensive tasks or tasks that benefit from concurrency (such as heavy computations), Node.js can leverage multiple threads. One way to achieve this is the worker_threads module.
Worker Threads Module
Node.js introduced the worker_threads module in v10.5.0 (with full stability starting in v12), which allows for creating multiple threads within a Node.js application.
- Creating a Worker: A new thread (worker) can be created using the Worker class from the worker_threads module. Each worker runs in isolation, meaning it has its own event loop, memory, and Node.js runtime.
- Passing Data: Workers can communicate with the main thread via the postMessage and on('message') methods, which allow data to be transferred in a safe, asynchronous way.
- Use Case: worker_threads is useful for CPU-intensive tasks (e.g., image processing, data parsing, or mathematical calculations) that could block the event loop if run on the main thread.
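The three points above can be sketched in a single file. This is a minimal illustration, not a recommended project layout: the worker source is inlined as a string and started with the eval option so that no second file is needed, and the names workerSource and sumInWorker are hypothetical.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code, inlined as a string so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // CPU-bound loop that would block the event loop if run on the main thread.
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) sum += i;
  parentPort.postMessage(sum); // send the result back to the main thread
`;

// Hypothetical helper: runs the loop in a new worker and resolves with its result.
function sumInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { limit } });
    worker.once('message', resolve); // result posted by the worker
    worker.once('error', reject);    // uncaught exception inside the worker
  });
}

sumInWorker(1_000_000).then((sum) => console.log(sum)); // 500000500000
```

The main thread stays free to serve other events while the loop runs; it only receives the final number as a message.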
Piscina
Piscina is a Node.js worker pool implementation that helps distribute heavy CPU-bound and asynchronous tasks across multiple threads efficiently by leveraging Node’s worker_threads module. It abstracts away much of the complexity involved in managing a pool of workers, making it easy for developers to offload tasks without worrying about low-level thread management.
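To give an idea of the bookkeeping a worker pool hides, here is a deliberately tiny pool built directly on worker_threads. It is not Piscina's implementation and omits most of what a real pool needs (worker replacement after a crash, timeouts, limits on the queue). TinyPool, squareAll and the inlined worker source are hypothetical names used only for this sketch.

```javascript
const { Worker } = require('node:worker_threads');

// Each worker squares the number it receives (stand-in for a heavy task).
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (value) => parentPort.postMessage(value * value));
`;

class TinyPool {
  constructor(size) {
    this.workers = [];
    this.idle = [];  // workers waiting for a task
    this.queue = []; // tasks waiting for a worker
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task and resolve with the worker's reply.
  run(value) {
    return new Promise((resolve, reject) => {
      this.queue.push({ value, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers, one task per worker at a time.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const task = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // worker is free again
        task.resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        task.reject(err); // a crashed worker is not returned to the pool
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(task.value);
    }
  }

  destroy() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

// Usage: four tasks shared between two workers.
async function squareAll(values) {
  const pool = new TinyPool(2);
  try {
    return await Promise.all(values.map((value) => pool.run(value)));
  } finally {
    await pool.destroy();
  }
}

squareAll([1, 2, 3, 4]).then((results) => console.log(results)); // [ 1, 4, 9, 16 ]
```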
Design
Piscina is a library that sits between the user code and the libuv library. By default, every payload passed to Piscina is copied unless the move function is used explicitly.
Only transferable data (such as an ArrayBuffer) can be moved.
If you pass to or from Piscina a payload object that wraps an ArrayBuffer, the object must implement transferableSymbol and valueSymbol; otherwise the data is copied.
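The difference between copying and moving can be observed without any worker at all, using a MessageChannel from worker_threads. The sketch below shows only the underlying Node.js mechanism (structured clone versus transfer list); it assumes, without showing it, that Piscina's move relies on the same transfer behavior. The variable names are illustrative.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

const { port1, port2 } = new MessageChannel();

// 1) Default behavior: the payload is cloned, the sender keeps its buffer.
const copied = new ArrayBuffer(8);
port1.postMessage(copied);
const copyReceived = receiveMessageOnPort(port2).message;
console.log(copied.byteLength, copyReceived.byteLength); // 8 8

// 2) Transfer list: ownership moves and the sender's buffer is detached.
const moved = new ArrayBuffer(8);
port1.postMessage(moved, [moved]);
const moveReceived = receiveMessageOnPort(port2).message;
console.log(moved.byteLength, moveReceived.byteLength); // 0 8

// 3) An object wrapping an ArrayBuffer: the inner buffer must be listed
//    explicitly, otherwise it is copied like everything else.
const wrapper = { name: 'frame', data: new ArrayBuffer(8) };
port1.postMessage(wrapper, [wrapper.data]);
const wrapperReceived = receiveMessageOnPort(port2).message;
console.log(wrapper.data.byteLength, wrapperReceived.data.byteLength); // 0 8

port1.close();
port2.close();
```

A byteLength of 0 on the sending side is the visible sign that the buffer was moved rather than copied.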
Data structures
Structure | Raw | Transferable | Thread safe | Piscina does |
ArrayBuffer | ✔ | ✔ | ✔ | move, copy |
SharedArrayBuffer | ✔ | ✖ | ✖ | copy pointer |
TypedArray | ✖ | ✖ | ✔ | copy |
How standard arrays work
Here is a brief example of how standard arrays are allocated, edited and copied.
(async () => {
    const raw = new ArrayBuffer(4); // allocate 4 bytes of raw data

    // a view created from a raw ArrayBuffer shares its memory (nothing is copied)
    const uint8 = new Uint8Array(raw); // view over raw
    uint8.fill(255, 0, 4); // set every position to 255
    console.log(uint8); // Uint8Array(4) [ 255, 255, 255, 255 ]
    const uint16 = new Uint16Array(raw);
    // same data, different view
    console.log(uint16); // Uint16Array(2) [ 65535, 65535 ]
    uint16[0] = 61680; // assign 61680 (0xF0F0) to the first position
    // the change is visible through both views
    console.log(uint8, uint16); // Uint8Array(4) [ 240, 240, 255, 255 ] Uint16Array(2) [ 61680, 65535 ]

    // a view created from another view copies the data
    const uint8Copy = new Uint8Array(uint8);
    uint8Copy[0] = 0;
    // as you can see, the original is untouched
    console.log(uint8, uint8Copy); // Uint8Array(4) [ 240, 240, 255, 255 ] Uint8Array(4) [ 0, 240, 255, 255 ]

    // a Node.js Buffer is a special typed array (a subclass of Uint8Array)
    const buffer = Buffer.from(raw); // shares memory with raw
    buffer[0] = 1;
    console.log(raw, buffer); // ArrayBuffer { [Uint8Contents]: <01 f0 ff ff>, byteLength: 4 } <Buffer 01 f0 ff ff>
    const bufferCopy = Buffer.from(uint8); // data is copied
    bufferCopy[0] = 2;
    console.log(uint8, bufferCopy); // Uint8Array(4) [ 1, 240, 255, 255 ] <Buffer 02 f0 ff ff>
    const bufferCopy2 = Buffer.from(bufferCopy); // data is copied
    bufferCopy2[0] = 3;
    console.log(bufferCopy, bufferCopy2); // <Buffer 02 f0 ff ff> <Buffer 03 f0 ff ff>

    // to avoid copying from another typed array, use its underlying ArrayBuffer (.buffer)
    const newUint8 = new Uint8Array(uint8.buffer); // view over raw
    newUint8[0] = 5;
    console.log(newUint8, uint8); // Uint8Array(4) [ 5, 240, 255, 255 ] Uint8Array(4) [ 5, 240, 255, 255 ]
    // this also works with a Node.js Buffer
    const newBuffer = Buffer.from(uint8.buffer);
    newBuffer[0] = 6;
    console.log(newBuffer, uint8); // <Buffer 06 f0 ff ff> Uint8Array(4) [ 6, 240, 255, 255 ]
    // or
    const newBuffer2 = Buffer.from(newBuffer.buffer);
    newBuffer2[0] = 7;
    console.log(newBuffer, newBuffer2); // <Buffer 07 f0 ff ff> <Buffer 07 f0 ff ff>
})();
Data copied from and to the worker
Copying data to and from the worker is good practice when only a small amount of data is transferred: the copy is cheap, and the worker still offloads the main thread.
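A copy-based exchange might look like the following sketch. It uses plain worker_threads rather than Piscina, so it only illustrates the copy semantics; sumByCopy and the inlined worker source are hypothetical names.

```javascript
const { Worker } = require('node:worker_threads');

const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.once('message', (job) => {
    job.values.push(999); // changes the worker's private copy only
    const total = job.values.reduce((a, b) => a + b, 0);
    parentPort.postMessage({ total }); // the reply is copied back as well
  });
`;

// Hypothetical helper: sends a small job to a worker and resolves with its reply.
function sumByCopy(job) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.postMessage(job); // no transfer list: the job is cloned
  });
}

const job = { values: [1, 2, 3] };
sumByCopy(job).then((reply) => {
  console.log(reply.total); // 1005
  console.log(job.values);  // [ 1, 2, 3 ]
});
```

The main thread's job object is unchanged after the call, which shows that the worker operated on its own copy.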
Data moved from and to the worker
Moving data instead of copying it is good practice in some cases. The cases covered are a move with Piscina, using a transferable object or transferable data, and a pointer copy using a SharedArrayBuffer, synchronized with a mutex.
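The SharedArrayBuffer case can be sketched with plain worker_threads. Note the swap: instead of a full mutex, this sketch synchronizes with Atomics.add, which is enough for a shared counter; a real mutex would typically be built from Atomics.compareExchange, Atomics.wait and Atomics.notify. The helper name countWithSharedMemory is hypothetical.

```javascript
const { Worker } = require('node:worker_threads');

const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // View over the same memory the main thread allocated (no copy is made).
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write, no lost updates
  }
  parentPort.postMessage('done');
`;

// Hypothetical helper: several workers increment one shared Int32 slot.
function countWithSharedMemory(workerCount, increments) {
  const shared = new SharedArrayBuffer(4); // room for one Int32
  const jobs = [];
  for (let i = 0; i < workerCount; i++) {
    jobs.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments }, // only the reference is passed
      });
      worker.once('message', resolve);
      worker.once('error', reject);
    }));
  }
  return Promise.all(jobs).then(() => Atomics.load(new Int32Array(shared), 0));
}

countWithSharedMemory(4, 10000).then((total) => console.log(total)); // 40000
```

With a plain `counter[0]++` in place of Atomics.add, concurrent workers could overwrite each other's updates and the total could come out lower.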
How to test
The standard Node.js API process.memoryUsage() gives the programmer a good view of what is going on behind the scenes. Pay attention to RSS and arrayBuffers before and after a Piscina call.
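A before/after measurement could be wrapped in a small helper like the one below. Here a local allocation stands in for the Piscina call, and the snapshot function is a hypothetical name; the exact figures depend on the Node.js version and on garbage collection, so no output is shown.

```javascript
// Hypothetical helper: logs and returns the two fields of interest.
function snapshot(label) {
  const { rss, arrayBuffers } = process.memoryUsage();
  const toMiB = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  console.log(`${label}: rss=${toMiB(rss)} MiB, arrayBuffers=${toMiB(arrayBuffers)} MiB`);
  return { rss, arrayBuffers };
}

const before = snapshot('before');

// Stand-in for the payload that would be sent to a worker.
const payload = new ArrayBuffer(32 * 1024 * 1024);

const after = snapshot('after');

// Growth of arrayBuffers across the call, in bytes.
console.log('arrayBuffers delta:', after.arrayBuffers - before.arrayBuffers);
console.log('payload size:', payload.byteLength);
```

If a payload is copied to a worker, arrayBuffers would be expected to grow by roughly the payload size again; if it is moved, it should stay close to flat.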
References
- https://github.com/albertoielpo/piscina-base
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray
- https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Transferable_objects
- https://nodejs.org/api/buffer.html
- https://github.com/piscinajs