Buffering vs Streaming - How buffering works with data
Almost all the asynchronous APIs that we've seen so far in this book work using buffer mode. For an input operation, buffer mode causes all the data coming from a resource to be collected into a buffer until the operation is completed; it is then passed back to the caller as one single blob of data.
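As a minimal sketch, a buffered read with fs.readFile might look like this (data.txt is just a placeholder filename):

```js
import { readFile } from 'fs'

// Buffer mode: the callback fires only once the WHOLE file is in memory
readFile('data.txt', (err, buffer) => {
  if (err) {
    return console.error(err)
  }
  // `buffer` holds the entire content of the file as one single blob of data
  console.log(`Read ${buffer.length} bytes in one go`)
})
```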
Buffering vs Streaming - How Streaming works with data
Streams allow us to process the data as soon as it arrives from the resource.
What are the differences between these two approaches? Purely from an efficiency perspective, streams can be more efficient in terms of both space (memory usage) and time (computation time).
Node.js streams have another important advantage: composability.
Spatial efficiency (Difference between streams and buffers)
First of all, streams allow us to do things that would not be possible by buffering data and processing it all at once. For example, consider the case in which we have to read a very big file, say in the order of hundreds of megabytes or even gigabytes. Clearly, using an API that returns a big buffer only when the file has been completely read is not a good idea. Imagine reading a few of these big files concurrently; our application would easily run out of memory. Besides that, buffers in V8 are limited in size: we cannot allocate more than a few gigabytes of data, so we may hit a wall way before running out of physical memory.
Example - Gzipping using a buffered API
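A rough sketch of the buffered approach (filename taken from the command line; error handling omitted): every step waits for the previous one to finish and keeps the whole file content in memory.

```js
import { readFile, writeFile } from 'fs/promises'
import { gzip } from 'zlib'
import { promisify } from 'util'

const gzipPromise = promisify(gzip)

async function gzipBuffered (filename) {
  const data = await readFile(filename)          // read the whole file into memory
  const compressed = await gzipPromise(data)     // compress it in one shot
  await writeFile(`${filename}.gz`, compressed)  // write the result out in one go
  console.log('File successfully compressed')
}

gzipBuffered(process.argv[2])
```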
Example - Gzipping using streams
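The streaming equivalent, sketched as a simple pipe() chain: each chunk flows through as soon as it is read, so memory usage stays roughly constant regardless of the file size.

```js
import { createReadStream, createWriteStream } from 'fs'
import { createGzip } from 'zlib'

const filename = process.argv[2]

createReadStream(filename)                   // read the file chunk by chunk
  .pipe(createGzip())                        // compress each chunk as it flows through
  .pipe(createWriteStream(`${filename}.gz`)) // write each compressed chunk to disk
  .on('finish', () => console.log('File successfully compressed'))
```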
Time efficiency (how buffers and streams differ in time efficiency)
Let's now consider the case of an application that compresses a file and uploads it to a remote HTTP server, which, in turn, decompresses it and saves it on the filesystem:
If the client component of our application were implemented using a buffered API, the upload would start only when the entire file had been read and compressed. On the other hand, decompression on the server would start only when all the data had been received.
A better solution to achieve the same result involves the use of streams. On the client machine, streams allow us to compress and send the data chunks as soon as they are read from the filesystem, whereas on the server, they allow us to decompress every chunk as soon as it is received from the remote peer. (check img)
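A sketch of the client side only, assuming a server listening on localhost:3000 that accepts gzipped PUT requests (the port and the X-Filename header are made up for this example):

```js
import { request } from 'http'
import { createGzip } from 'zlib'
import { createReadStream } from 'fs'

const filename = process.argv[2]

const req = request({
  hostname: 'localhost',
  port: 3000,
  path: '/',
  method: 'PUT',
  headers: {
    'Content-Type': 'application/octet-stream',
    'Content-Encoding': 'gzip',
    'X-Filename': filename
  }
}, (res) => console.log(`Server response: ${res.statusCode}`))

// Every chunk read from disk is compressed and sent immediately, without waiting
// for the whole file to be read first. The server can mirror this by piping the
// incoming request through createGunzip() and into a write stream.
createReadStream(filename)
  .pipe(createGzip())
  .pipe(req)
  .on('finish', () => console.log('File successfully sent'))
```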
(for a more detailed example, see pages 193-196)
Composability with Pipes
The pipe() method allows us to connect different processing units, each responsible for one single functionality, in perfect Node.js style. This is possible because streams have a uniform interface, and they can understand each other in terms of API.
The only prerequisite is that the next stream in the pipeline has to support the data type produced by the previous stream, which can be either binary, text, or even objects.
(for an in-depth example, see pages 196-199)
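A hedged sketch of composing single-purpose streams with pipe(): read, compress, encrypt, write. The secret, salt, and output filename are placeholders, and real code would also need to transmit the IV and handle errors.

```js
import { createReadStream, createWriteStream } from 'fs'
import { createGzip } from 'zlib'
import { createCipheriv, randomBytes, scryptSync } from 'crypto'

const filename = process.argv[2]
const key = scryptSync('a_shared_secret', 'a_salt', 24) // placeholder key derivation
const iv = randomBytes(16)

// Each stream in the chain is responsible for one single functionality:
// read -> compress -> encrypt -> write
createReadStream(filename)
  .pipe(createGzip())
  .pipe(createCipheriv('aes-192-cbc', key, iv))
  .pipe(createWriteStream(`${filename}.gz.enc`))
  .on('finish', () => console.log('Done'))
```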
Anatomy of streams
Every stream in Node.js is an implementation of one of the four base abstract classes available in the stream core module:
Readable
Writable
Duplex
Transform
Every stream is also an instance of EventEmitter. Streams can, in fact, produce several types of events, such as end when a Readable stream has finished reading, finish when a Writable stream has completed writing, or error when something goes wrong.
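For instance (a tiny sketch, with data.txt as a placeholder file), these events can be observed directly on any stream:

```js
import { createReadStream } from 'fs'

const stream = createReadStream('data.txt')

stream.on('end', () => console.log('No more data to read'))
stream.on('error', (err) => console.error('Something went wrong:', err))

// A Writable stream would emit 'finish' instead, once all the data has been flushed.
stream.resume() // consume and discard the data; here we only care about the events
```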
Operating modes
One reason why streams are so flexible is the fact that they can handle not just binary data, but almost any JavaScript value. In fact, they support two operating modes:
Binary mode: To stream data in the form of chunks, such as buffers or strings.
Object mode: To stream data as a sequence of discrete objects (allowing us to use almost any JavaScript value).
These two operating modes allow us to use streams not just for I/O, but also as a tool to elegantly compose processing units in a functional fashion.
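As a small sketch of object mode, using the Readable.from() helper (covered later in these notes) and made-up log records:

```js
import { Readable } from 'stream'

// In object mode, every chunk is a discrete JavaScript value, not a Buffer
const objectStream = Readable.from([
  { level: 'info', msg: 'server started' },
  { level: 'warn', msg: 'low disk space' }
])

objectStream.on('data', (record) => {
  console.log(`${record.level}: ${record.msg}`)
})
```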
Readable streams
A Readable stream represents a source of data. In Node.js, it's implemented using the Readable abstract class, which is available in the stream module.
There are two approaches to receiving data from a Readable stream: non-flowing (or paused) mode and flowing mode.
Readable - The non-flowing mode
The non-flowing or paused mode is the default pattern for reading from a Readable stream. It involves attaching a listener to the stream for the readable event, which signals the availability of new data to read.
Then, in a loop, we read the data continuously until the internal buffer is emptied. This can be done using the read() method, which synchronously reads from the internal buffer and returns a Buffer object representing the chunk of data.
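A minimal sketch of the non-flowing pattern, reading from standard input:

```js
process.stdin
  .on('readable', () => {
    let chunk
    console.log('New data available')
    // read() pulls data synchronously from the internal buffer
    // and returns null once the buffer is empty
    while ((chunk = process.stdin.read()) !== null) {
      console.log(`Chunk read (${chunk.length} bytes): "${chunk.toString()}"`)
    }
  })
  .on('end', () => console.log('End of stream'))
```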
Readable - Flowing mode
Another way to read from a stream is by attaching a listener to the data event; this switches the stream into flowing mode, where the data is not pulled using read(), but instead is pushed to the data listener as soon as it arrives.
Flowing mode offers less flexibility to control the flow of data compared to non-flowing mode. The default operating mode for streams is non-flowing, so to enable flowing mode it's necessary to attach a listener to the data event or explicitly invoke the resume() method. To temporarily stop the stream from emitting data events, we can invoke the pause() method, which causes any incoming data to be cached in the internal buffer and switches the stream back to non-flowing mode.
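The same stdin example, sketched in flowing mode with a data listener:

```js
process.stdin
  .on('data', (chunk) => {
    // Chunks are pushed to the listener as soon as they arrive
    console.log(`Chunk read (${chunk.length} bytes): "${chunk.toString()}"`)
  })
  .on('end', () => console.log('End of stream'))

// process.stdin.pause() would stop the 'data' events (back to non-flowing mode);
// process.stdin.resume() would switch the stream into flowing mode again.
```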
Implementing Readable streams
Now that we know how to read from a stream, the next step is to learn how to implement a new custom Readable stream. To do this, we need to create a new class that inherits from the Readable class of the stream module. The concrete stream must provide an implementation of the _read() method.
Please note that read() is the method called by stream consumers, while _read() is the method implemented by the stream subclass; the leading underscore indicates that the method is not public and should never be called directly.
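A sketch of a custom Readable that emits random data (here generated with crypto.randomBytes; the 1 KB limit is arbitrary):

```js
import { Readable } from 'stream'
import { randomBytes } from 'crypto'

class RandomStream extends Readable {
  constructor (options) {
    super(options)
    this.emittedBytes = 0
  }

  _read (size) {
    const chunk = randomBytes(size).toString('hex')
    this.push(chunk, 'utf8')          // push() feeds data into the internal buffer
    this.emittedBytes += chunk.length
    if (this.emittedBytes > 1024) {
      this.push(null)                 // pushing null signals the end of the stream
    }
  }
}

const randomStream = new RandomStream()
randomStream.on('data', (chunk) => {
  console.log(`Chunk received (${chunk.length} bytes)`)
})
```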
Simplified construction (creating a custom Readable stream without subclassing)
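The same random stream, sketched with the simplified construction approach: the _read() logic is passed directly to the Readable constructor through the read option, with no subclass needed.

```js
import { Readable } from 'stream'
import { randomBytes } from 'crypto'

let emittedBytes = 0
const randomStream = new Readable({
  read (size) {
    const chunk = randomBytes(size).toString('hex')
    this.push(chunk, 'utf8')
    emittedBytes += chunk.length
    if (emittedBytes > 1024) {
      this.push(null) // end of stream
    }
  }
})

randomStream.on('data', (chunk) => console.log(`Chunk of ${chunk.length} bytes`))
```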
Readable streams from iterables
You can easily create Readable stream instances from arrays or other iterable objects (that is, generators, iterators, and async iterators) using the Readable.from() helper.
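For example (a sketch with a small, made-up list of mountains):

```js
import { Readable } from 'stream'

const mountains = [
  { name: 'Everest', height: 8848 },
  { name: 'K2', height: 8611 },
  { name: 'Kangchenjunga', height: 8586 }
]

// Readable.from() wraps the iterable in an object-mode Readable stream
const mountainStream = Readable.from(mountains)
mountainStream.on('data', (mountain) => {
  console.log(`${mountain.name}: ${mountain.height}m`)
})
```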
Try not to instantiate large arrays in memory. Imagine if, in the previous example, we wanted to list all the mountains in the world. There are about 1 million mountains, so if we were to load all of them into an array upfront, we would allocate quite a significant amount of memory. Even if we then consume the data in the array through a Readable stream, all the data has already been preloaded, so we are effectively defeating the memory efficiency of streams. It's always preferable to load and consume the data in chunks, and you can do so by using native streams such as fs.createReadStream, by building a custom stream, or by using Readable.from with lazy iterables such as generators, iterators, or async iterators.
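A sketch of the lazy alternative, feeding Readable.from() with a generator so that items are produced one at a time instead of being preloaded (the data source here is hardcoded for brevity):

```js
import { Readable } from 'stream'

function * mountainsGenerator () {
  // In a real application, each item could be read lazily from a file or database
  yield { name: 'Everest', height: 8848 }
  yield { name: 'K2', height: 8611 }
  // ...one item at a time, never the whole list in memory
}

Readable.from(mountainsGenerator()).on('data', (mountain) => {
  console.log(`${mountain.name}: ${mountain.height}m`)
})
```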
Writable streams