Buffering vs Streaming - How buffering works with data
Almost all the asynchronous APIs that we've seen so far in this book work using buffer mode. For an input operation, buffer mode causes all the data coming from a resource to be collected into a buffer until the operation is completed; it is then passed back to the caller as one single blob of data.

Buffering vs Streaming - How Streaming works with data
Streams allow us to process the data as soon as it arrives from the resource.
What are the differences between these two approaches? Purely from an efficiency perspective, streams can be more efficient in terms of both space (memory usage) and time (computation clock time).
Node.js streams have another important advantage: composability.

Spatial efficiency (Difference between streams and buffers)
First of all, streams allow us to do things that would not be possible by buffering data and processing it all at once. For example, consider the case in which we have to read a very big file, let's say, in the order of hundreds of megabytes or even gigabytes. Clearly, using an API that returns a big buffer when the file is completely read is not a good idea. Imagine reading a few of these big files concurrently; our application would easily run out of memory. Besides that, buffers in V8 are limited in size. You cannot allocate more than a few gigabytes of data, so we may hit a wall way before running out of physical memory.

Example - Gzipping using a buffered API
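The note above has no code; here is a minimal sketch of what a buffered gzip script might look like (the file name and argument handling are my own assumptions):

```js
// gzip-buffer.js — buffered approach: the whole file is read into memory,
// compressed in one shot, and only then written back to disk.
import { readFile, writeFile } from 'node:fs/promises'
import { gzip } from 'node:zlib'
import { promisify } from 'node:util'

const gzipPromise = promisify(gzip)

const filename = process.argv[2]
const data = await readFile(filename)        // the entire file is buffered in memory
const gzippedData = await gzipPromise(data)  // the entire compressed output too
await writeFile(`${filename}.gz`, gzippedData)
console.log('File successfully compressed')
```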

Example - Gzipping using streams
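And a sketch of the streaming equivalent, where data is read, compressed, and written chunk by chunk, so memory usage stays constant regardless of the file size:

```js
// gzip-stream.js — streaming approach
import { createReadStream, createWriteStream } from 'node:fs'
import { createGzip } from 'node:zlib'

const filename = process.argv[2]

createReadStream(filename)
  .pipe(createGzip())
  .pipe(createWriteStream(`${filename}.gz`))
  .on('finish', () => console.log('File successfully compressed'))
```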

Time efficiency (how buffers and streams differ in time efficiency)
Let's now consider the case of an application that compresses a file and uploads it to a remote HTTP server, which, in turn, decompresses it and saves it on the filesystem:
If the client component of our application was implemented using a buffered API, the upload would start only when the entire file had been read and compressed. On the other hand, the decompression would start on the server only when all the data had been received.
A better solution to achieve the same result involves the use of streams. On the client machine, streams allow us to compress and send the data chunks as soon as they are read from the filesystem, whereas on the server, they allow us to decompress every chunk as soon as it is received from the remote peer. (check img)
(for a more detailed example: 193-196)
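A rough sketch of the client side described above, assuming a hypothetical receiving server on localhost:3000 (the endpoint, port, and headers are placeholders, not the book's exact code):

```js
// gzip-send.js — compress and upload chunk by chunk, so the server can start
// decompressing while the client is still reading from disk.
import { createReadStream } from 'node:fs'
import { createGzip } from 'node:zlib'
import { request } from 'node:http'

const filename = process.argv[2]

const httpRequest = request({
  hostname: 'localhost',
  port: 3000,
  path: '/',
  method: 'PUT',
  headers: {
    'Content-Type': 'application/octet-stream',
    'Content-Encoding': 'gzip',
    'X-Filename': filename
  }
}, res => console.log(`Server response: ${res.statusCode}`))

createReadStream(filename)
  .pipe(createGzip())
  .pipe(httpRequest)
```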

Composability with Pipes
Streams can be composed using the pipe() method, which allows us to connect different processing units, each responsible for a single functionality, in perfect Node.js style.
This is possible because streams have a uniform interface, and they can understand each other in terms of API.
The only prerequisite is that the next stream in the pipeline has to support the data type produced by the previous stream, which can be either binary, text, or even objects.
(for an example in depth: 196-199)
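A quick sketch of this composability, using two hypothetical Transform stages of my own (not the book's example):

```js
import { Transform } from 'node:stream'

// Two independent processing units: one uppercases text, one replaces spaces.
const uppercasify = new Transform({
  transform (chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase())
  }
})

const spacesToDashes = new Transform({
  transform (chunk, encoding, callback) {
    callback(null, chunk.toString().replaceAll(' ', '-'))
  }
})

// Each stage only needs to understand the data type produced by the previous one.
process.stdin
  .pipe(uppercasify)
  .pipe(spacesToDashes)
  .pipe(process.stdout)
```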
Anatomy of streams
Every stream in Node.js is an implementation of one of the four base abstract classes available in the stream core module:
Readable
Writable
Duplex
Transform
Every stream is also an instance of EventEmitter. Streams, in fact, can produce several types of events, such as end when a Readable stream has finished reading, finish when a Writable stream has completed writing, or error when something goes wrong.
Operating modes
One reason why streams are so flexible is the fact that they can handle not just binary data, but almost any JavaScript value. In fact, they support two operating modes:
Binary mode: To stream data in the form of chunks, such as buffers or strings.
Object mode: To stream data as a sequence of discrete objects (allowing us to use almost any JavaScript value).
These two operating modes allow us to use streams not just for I/O, but also as a tool to elegantly compose processing units in a functional fashion.
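A small sketch contrasting the two modes (the sample values are made up):

```js
import { Readable } from 'node:stream'

// Binary mode: chunks are Buffers (or strings)
const binary = new Readable({
  read () {
    this.push(Buffer.from('some bytes'))
    this.push(null) // end of stream
  }
})

// Object mode: chunks can be almost any JavaScript value
const objects = new Readable({
  objectMode: true,
  read () {
    this.push({ city: 'Rome', temperature: 21 })
    this.push(null)
  }
})

binary.on('data', chunk => console.log(Buffer.isBuffer(chunk))) // true
objects.on('data', chunk => console.log(chunk.city))            // Rome
```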
Readable streams
A Readable stream represents a source of data. In Node.js, it's implemented using the Readable abstract class, which is available in the stream module.
There are two approaches to receiving data from a Readable stream: non-flowing (or paused) and flowing.
Readable - The non-flowing mode
The non-flowing or paused mode is the default pattern for reading from a Readable stream. It involves attaching a listener to the stream for the readable event, which signals the availability of new data to read.
Then, in a loop, we read the data continuously until the internal buffer is emptied. This can be done using the read() method, which synchronously reads from the internal buffer and returns a Buffer object representing the chunk of data.
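A minimal sketch of the non-flowing mode, reading from process.stdin:

```js
// read-stdin-paused.js — non-flowing (paused) mode
process.stdin
  .on('readable', () => {
    let chunk
    console.log('New data available')
    // read() returns null once the internal buffer is empty
    while ((chunk = process.stdin.read()) !== null) {
      console.log(`Chunk read (${chunk.length} bytes): "${chunk.toString()}"`)
    }
  })
  .on('end', () => console.log('End of stream'))
```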

Readable - Flowing mode
Another way to read from a stream is by attaching a listener to the data event. This will switch the stream into using flowing mode, where the data is not pulled using read(), but instead is pushed to the data listener as soon as it arrives.
Flowing mode offers less flexibility to control the flow of data compared to non-flowing mode. The default operating mode for streams is non-flowing, so to enable flowing mode, it's necessary to attach a listener to the data event or explicitly invoke the resume() method. To temporarily stop the stream from emitting data events, we can invoke the pause() method, which causes any incoming data to be cached in the internal buffer and switches the stream back into non-flowing mode.
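And the flowing-mode counterpart, again using process.stdin:

```js
// read-stdin-flowing.js — flowing mode: data is pushed to the listener as it arrives
process.stdin
  .on('data', chunk => {
    console.log(`Chunk read (${chunk.length} bytes): "${chunk.toString()}"`)
  })
  .on('end', () => console.log('End of stream'))

// process.stdin.pause()  // switch back to non-flowing mode
// process.stdin.resume() // switch to flowing mode again
```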

Implementing Readable streams
Now that we know how to read from a stream, the next step is to learn how to implement a new custom Readable stream. To do this, it's necessary to create a new class by inheriting the prototype Readable from the stream module. The concrete stream must provide an implementation of the _read() method.
Please note that read() is a method called by the stream consumers, while _read() is a method to be implemented by a stream subclass and should never be called directly. The underscore usually indicates that the method is not public and should not be called directly.
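A sketch of a custom Readable in the spirit of the chapter's random-data example, but using node:crypto instead of a third-party random generator:

```js
import { Readable } from 'node:stream'
import { randomBytes } from 'node:crypto'

// Emits chunks of random data and ends after roughly 1 KB.
class RandomStream extends Readable {
  constructor (options) {
    super(options)
    this.emittedBytes = 0
  }

  _read (size) {
    const chunk = randomBytes(size).toString('hex')
    this.push(chunk, 'utf8') // push() returns false when the internal buffer is full
    this.emittedBytes += chunk.length
    if (this.emittedBytes > 1024) {
      this.push(null)        // signal the end of the stream
    }
  }
}

new RandomStream()
  .on('data', chunk => console.log(`Chunk received (${chunk.length} bytes)`))
  .on('end', () => console.log('End of stream'))
```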

Simplified construction (of creating a custom readable stream)
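The same stream using the simplified construction: instead of subclassing Readable, we pass a read() function directly to the constructor (this mirrors the sketch above, not the book's exact code):

```js
import { Readable } from 'node:stream'
import { randomBytes } from 'node:crypto'

let emittedBytes = 0
const randomStream = new Readable({
  read (size) {
    const chunk = randomBytes(size).toString('hex')
    this.push(chunk, 'utf8')
    emittedBytes += chunk.length
    if (emittedBytes > 1024) {
      this.push(null)
    }
  }
})
```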

Readable streams from iterables
You can easily create Readable stream instances from arrays or other iterable objects (that is, generators, iterators, and async iterators) using the Readable.from() helper.
Try not to instantiate large arrays in memory. Imagine if, in the previous example, we wanted to list all the mountains in the world. There are about 1 million mountains, so if we were to load all of them in an array upfront, we would allocate a quite significant amount of memory. Even if we then consume the data in the array through a Readable stream, all the data has already been preloaded, so we are effectively voiding the memory efficiency of streams. It's always preferable to load and consume the data in chunks, and you could do so by using native streams such as fs.createReadStream, by building a custom stream, or by using Readable.from with lazy iterables such as generators, iterators, or async iterators.
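A sketch using Readable.from() with a generator, so the data is produced lazily instead of being preloaded into an array (the sample mountains are placeholders):

```js
import { Readable } from 'node:stream'

// A generator yields mountains one at a time; nothing is materialized upfront.
function * mountainsGenerator () {
  yield { name: 'Everest', height: 8848 }
  yield { name: 'K2', height: 8611 }
  yield { name: 'Kangchenjunga', height: 8586 }
  // ...imagine this being driven by a paginated API or a file read in chunks
}

Readable.from(mountainsGenerator())
  .on('data', mountain => console.log(`${mountain.name}: ${mountain.height}m`))
  .on('end', () => console.log('Done'))
```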

Writable streams
A Writable stream represents a data destination. Imagine, for instance, a file on the filesystem, a database table, a socket, the standard output, or the standard error interface. In Node.js, these kinds of abstractions can be implemented using the Writable abstract class, which is available in the stream module.
Backpressure
Node.js data streams, like liquids in a piping system, can suffer from bottlenecks when data is written faster than the stream can handle. To manage this, incoming data is buffered in memory.
However, without feedback from the stream to the writer, the buffer could keep growing, potentially leading to excessive memory usage.
Node.js streams are designed to maintain steady and predictable memory usage, even during large data transfers. Writable streams include a built-in signaling mechanism to alert the application when the internal buffer has accumulated too much data.
This signal indicates that it’s better to pause and wait for the buffered data to be flushed to the stream’s destination before sending more data. The writable.write() method returns false once the buffer size exceeds the highWaterMark limit.
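A sketch of how a writer can honor this signal, using a deliberately slow, hypothetical Writable:

```js
import { Writable } from 'node:stream'
import { randomBytes } from 'node:crypto'

// A deliberately slow Writable, so that its internal buffer fills up quickly.
const slowWritable = new Writable({
  highWaterMark: 16384,
  write (chunk, encoding, callback) {
    setTimeout(callback, 100) // pretend persisting a chunk takes 100 ms
  }
})

let remaining = 1000
function writeLotsOfData () {
  while (remaining-- > 0) {
    const canContinue = slowWritable.write(randomBytes(1024))
    if (!canContinue) {
      // the buffer is over the highWaterMark: stop writing and
      // resume only after it has been flushed
      console.log(`Backpressure! ${remaining} chunks left, waiting for "drain"`)
      slowWritable.once('drain', writeLotsOfData)
      return
    }
  }
  slowWritable.end()
}

writeLotsOfData()
```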

Implementing Writable streams
We can implement a new Writable stream by inheriting the class Writable and providing an implementation for the _write() method. Let’s try to do it immediately while discussing the details along the way.
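A sketch of what a ToFileStream-like Writable might look like, assuming chunks are objects with path and content properties (the book's implementation may differ in the details):

```js
import { Writable } from 'node:stream'
import { mkdir, writeFile } from 'node:fs/promises'
import { dirname } from 'node:path'

// Receives objects shaped like { path, content } and writes each one to disk.
class ToFileStream extends Writable {
  constructor (options) {
    super({ ...options, objectMode: true })
  }

  _write (chunk, encoding, callback) {
    mkdir(dirname(chunk.path), { recursive: true })
      .then(() => writeFile(chunk.path, chunk.content))
      .then(() => callback(), callback) // success → callback(), failure → callback(err)
  }
}

const tfs = new ToFileStream()
tfs.write({ path: 'files/file1.txt', content: 'Hello' })
tfs.write({ path: 'files/file2.txt', content: 'Node.js streams' })
tfs.end(() => console.log('All files created'))
```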

Simplified construction (of creating a custom writable stream)
As we saw for Readable streams, Writable streams also offer a simplified construction mechanism. If we were to rewrite ToFileStream using the simplified construction for Writable streams, it would look like this:
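A sketch of the same behavior with the simplified constructor, where the write() logic is passed directly in the options object:

```js
import { Writable } from 'node:stream'
import { mkdir, writeFile } from 'node:fs/promises'
import { dirname } from 'node:path'

const tfs = new Writable({
  objectMode: true,
  write (chunk, encoding, callback) {
    mkdir(dirname(chunk.path), { recursive: true })
      .then(() => writeFile(chunk.path, chunk.content))
      .then(() => callback(), callback)
  }
})
```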

What are Duplex streams?
A Duplex stream is a stream that is both Readable and Writable. It is useful when we want to describe an entity that is both a data source and a data destination, such as network sockets, for example.
Duplex streams inherit the methods of both stream.Readable and stream.Writable, so this is nothing new to us. This means that we can read() or write() data, or listen for both readable and drain events.
To create a custom Duplex stream, we have to provide an implementation for both _read() and _write(). The options object passed to the Duplex() constructor is internally forwarded to both the Readable and Writable constructors.
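A toy sketch of a custom Duplex using the simplified construction; the two sides are hypothetical and unrelated, just like a socket's input and output:

```js
import { Duplex } from 'node:stream'

const duplex = new Duplex({
  // Readable side
  read () {
    this.push('hello from the readable side\n')
    this.push(null)
  },
  // Writable side
  write (chunk, encoding, callback) {
    console.log(`received ${chunk.length} bytes`)
    callback()
  }
})

duplex.pipe(process.stdout) // consume the Readable side
duplex.write('some input')  // feed the Writable side
duplex.end()
```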

What are Transform streams?
Transform streams are a special kind of Duplex stream that are specifically designed to handle data transformations.
In a simple Duplex stream, there is no immediate relationship between the data read from the stream and the data written into it (at least, the stream is agnostic to such a relationship). Think about a TCP socket, which just sends and receives data to and from the remote peer; the socket is not aware of any relationship between the input and output. Figure 6.4 illustrates the data flow in a Duplex stream:
On the other hand, Transform streams apply some kind of transformation to each chunk of data that they receive from their Writable side, and then make the transformed data available on their Readable side. Figure 6.5
From a user perspective, the programmatic interface of a Transform stream is exactly like that of a Duplex stream. However, when we want to implement a new Duplex stream, we have to provide both the _read() and _write() methods, while for implementing a new Transform stream, we have to fill in another pair of methods: _transform() and _flush().

Implementing Transform streams
Streams process data in chunks, and these chunks don’t always align with the boundaries of the target search string. For example, if the string we are trying to match is split across two chunks, the split() operation on a chunk alone won’t detect it, potentially leaving part of the match unnoticed. The tail variable ensures that the last portion of a chunk—potentially part of a match—is preserved and concatenated with the next chunk.
In Transform streams, it’s not uncommon for the logic to involve buffering data from multiple chunks before there’s enough information to perform the transformation.
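A sketch along the lines of the chapter's replace-style Transform; the tail handling here is my own reconstruction and may differ from the book's exact code:

```js
import { Transform } from 'node:stream'

// Replaces every occurrence of searchStr with replaceStr, even across chunk boundaries.
class ReplaceStream extends Transform {
  constructor (searchStr, replaceStr, options) {
    super(options)
    this.searchStr = searchStr
    this.replaceStr = replaceStr
    this.tail = ''
  }

  _transform (chunk, encoding, callback) {
    const pieces = (this.tail + chunk).split(this.searchStr)
    // keep the end of the last piece: it might contain the beginning of a match
    const lastPiece = pieces[pieces.length - 1]
    const tailLen = this.searchStr.length - 1
    this.tail = lastPiece.slice(-tailLen)
    pieces[pieces.length - 1] = lastPiece.slice(0, -tailLen)
    this.push(pieces.join(this.replaceStr))
    callback()
  }

  _flush (callback) {
    this.push(this.tail) // emit whatever is left over
    callback()
  }
}

const replaceStream = new ReplaceStream('World', 'Node.js')
replaceStream.on('data', chunk => process.stdout.write(chunk.toString()))
replaceStream.write('Hello W')
replaceStream.write('orld!\n') // the match is split across two chunks
replaceStream.end()            // prints "Hello Node.js!"
```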
(139-242)

Filtering and aggregating data with Transform streams
As we discussed earlier, Transform streams are a great tool for building data transformation pipelines. But they aren’t limited to one-to-one transformations: they’re often used for tasks like filtering and aggregating data.
To make this more concrete, imagine a Fortune 500 company asks us to analyze a large file containing all their sales data for 2024. The file, data.csv, is a sales report in CSV format, and they want us to calculate the total profit for sales made in Italy.
Let’s use Node.js streams. Streams are well suited for this kind of task because they can process large datasets incrementally, without loading everything into memory.

Pattern: transform filter
Invoke this.push() in a conditional way to allow only some data to reach the next stage of the pipeline.
The transform filter pattern is closely related to the Pipes and Filters architectural pattern, which is used to process streams of data by breaking complex processing tasks down into a series of separate, independent steps.
In Node.js, this is most commonly and natively implemented using Streams.
The Core Concept
Imagine an assembly line. Raw material comes in, passes through several stations where it is modified or inspected, and a finished product comes out.
Source: The origin of the data (e.g., reading a file, an HTTP request).
Filter/Transformer: A component that receives data, performs a single operation (modifies it, filters it, or enriches it), and passes it to the next step.
Sink: The final destination (e.g., writing to a file, sending a response, saving to a DB).
Why use it in Node.js?
Memory Efficiency: You don't load the entire dataset into memory. You process it chunk by chunk.
Decoupling: Each filter does one thing well. You can easily add, remove, or reorder filters without breaking the whole system.
Backpressure: Node.js streams automatically handle "backpressure"—if the writing step is slow, the reading step automatically slows down so memory doesn't overflow.
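A sketch of the filter idea applied to the sales example, assuming the upstream stages have already parsed the CSV into objects with a country property (the class name and record shape are my own):

```js
import { Transform } from 'node:stream'

// Conditionally push(): only records for the given country reach the next
// stage of the pipeline; everything else is silently dropped.
class FilterByCountry extends Transform {
  constructor (country, options = {}) {
    super({ ...options, objectMode: true })
    this.country = country
  }

  _transform (record, encoding, callback) {
    if (record.country === this.country) {
      this.push(record) // forward the chunk
    }
    callback() // always call callback(), even when dropping the record
  }
}

// e.g. csvParsedStream.pipe(new FilterByCountry('Italy')).pipe(sumProfit)
```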

Pattern: Streaming aggregation
Use _transform() to process the data and accumulate the partial result, then call this.push() only in the _flush() method to emit the result when all the data has been processed.
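A matching sketch for the aggregation side, assuming objects with a numeric profit property:

```js
import { Transform } from 'node:stream'

// Accumulates the profit of every record and emits a single total
// only when the stream ends, inside _flush().
class SumProfit extends Transform {
  constructor (options = {}) {
    super({ ...options, objectMode: true })
    this.total = 0
  }

  _transform (record, encoding, callback) {
    this.total += Number(record.profit)
    callback() // note: nothing is pushed here
  }

  _flush (callback) {
    this.push(this.total.toString()) // emit the aggregate once, at the end
    callback()
  }
}
```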

What are PassThrough streams?
There is a fifth type of stream that is worth mentioning: PassThrough. This type of stream is a special type of Transform stream that outputs every data chunk without applying any transformation. PassThrough is possibly the most underrated type of stream, but there are several circumstances in which it can be a very valuable tool in our toolbelt. For instance, PassThrough streams can be useful for observability or to implement late piping and lazy stream patterns.
Observability - How to observe how much data is flowing through one or more streams?
Note that you could implement an alternative version of the monitor stream by using a custom transform stream instead. In such a case, you would have to make sure that the received chunks are pushed without any modification or delay, which is something that a PassThrough stream would do automatically for you. Both approaches are equally valid, so pick the approach that feels more natural to you.
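A minimal sketch of such a monitor built with PassThrough (the byte-counting logic and file copy are just an example):

```js
import { PassThrough } from 'node:stream'
import { createReadStream, createWriteStream } from 'node:fs'

// A PassThrough forwards every chunk untouched, so we can "tap" the pipeline
// to count how many bytes are flowing through it.
let totalBytes = 0
const monitor = new PassThrough()
monitor.on('data', chunk => { totalBytes += chunk.length })
monitor.on('finish', () => console.log(`${totalBytes} bytes written`))

createReadStream(process.argv[2])
  .pipe(monitor)
  .pipe(createWriteStream(`${process.argv[2]}.copy`))
```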

Pattern: Late piping
Definition: A pattern used when an API requires a stream input immediately, but the data source or necessary transformations (e.g., compression) are not yet ready or need to happen asynchronously.
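A sketch of late piping, where upload() is a hypothetical API that wants a stream immediately:

```js
import { PassThrough } from 'node:stream'
import { createReadStream } from 'node:fs'
import { createGzip } from 'node:zlib'

// Hypothetical API that requires a Readable stream *right now*:
function upload (filename, contentStream) {
  contentStream.pipe(process.stdout) // stand-in for the real upload logic
}

// We are not ready to produce the data yet, so we hand over a PassThrough
// as a placeholder and pipe into it later ("late piping").
const body = new PassThrough()
upload('data.txt.gz', body)

// ...later, once the source and the transformations are ready:
createReadStream('data.txt')
  .pipe(createGzip())
  .pipe(body)
```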

(148-250)

Lazy streams
In more generic terms, creating a stream instance might initialize expensive operations straight away (for example, open a file or a socket, initialize a connection to a database, and so on), even before we start to use such a stream. This might not be desirable if you are creating a large number of stream instances for later consumption.
In these cases, you might want to delay the expensive initialization until you need to consume data from the stream.
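A hand-rolled sketch of a lazy Readable that defers opening the file until the first read (packages such as lazystream implement the same idea in a reusable way):

```js
import { Readable } from 'node:stream'
import { createReadStream } from 'node:fs'

// The file descriptor is opened only when the stream is actually read.
function lazyFileStream (path) {
  let source = null
  return new Readable({
    read () {
      if (!source) {
        source = createReadStream(path) // expensive initialization happens here
        source.on('data', chunk => {
          // respect backpressure: pause the source if our buffer is full
          if (!this.push(chunk)) source.pause()
        })
        source.on('end', () => this.push(null))
        source.on('error', err => this.destroy(err))
      } else {
        source.resume()
      }
    }
  })
}

// Creating many of these does not open many file descriptors upfront.
const streams = ['a.txt', 'b.txt'].map(lazyFileStream)
```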

Connecting streams using pipes - What does pipe() do for us under the hood?
Very intuitively, the pipe() method takes the data that is emitted from the readable stream and pumps it into the provided writable stream.
Also, the writable stream is ended automatically when the readable stream emits an end event (unless we specify {end: false} as options).
The pipe() method returns the writable stream passed in the first argument, allowing us to create chained invocations if such a stream is also Readable (such as a Duplex or Transform stream).
Piping two streams together - Suction
Piping two streams together will create suction, which allows the data to flow automatically to the writable stream, so there is no need to call read() or write(),
but most importantly, there is no need to control the backpressure anymore, because it’s automatically taken care of.

Pipe and error handling with pipeline()
Handling errors manually in a pipeline is not just cumbersome, but also error-prone—something we should avoid if we can!
Luckily, the core node:stream package offers us an excellent utility function that can make building pipelines a much safer and more enjoyable practice, which is the pipeline() helper function.
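A sketch using the promise-based pipeline() from node:stream/promises:

```js
import { pipeline } from 'node:stream/promises'
import { createReadStream, createWriteStream } from 'node:fs'
import { createGzip } from 'node:zlib'

// pipeline() wires the streams together, forwards errors from any of them,
// and destroys every stream in the chain if something goes wrong.
try {
  await pipeline(
    createReadStream('data.txt'),
    createGzip(),
    createWriteStream('data.txt.gz')
  )
  console.log('Pipeline completed')
} catch (err) {
  console.error('Pipeline failed', err)
}
```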

Asynchronous control flow patterns with streams
Going through the examples that we have presented so far, it should be clear that streams can be useful not only to handle I/O, but also as an elegant programming pattern that can be used to process any kind of data. But the advantages do not end there: streams can also be leveraged to turn “asynchronous control flow” into “flow control,” as we will see in this section.

What is sequential execution with streams (sometimes called stream-based sequential iteration)?
It is a specific Control Flow Pattern used in Node.js to replace traditional loops (like for...of with await) or utility libraries (like async.eachSeries).
Why it's a "Pattern": Instead of using a loop to manage the order of operations, you use the stream's internal buffer and callback mechanism to force tasks to line up single-file. The stream itself becomes the "queue" manager.
By default, streams will handle data in sequence. For example, the _transform() function of a Transform stream will never be invoked with the next chunk of data until the previous invocation completes by calling callback().
This is an important property of streams, crucial for processing each chunk in the right order, but it can also be exploited to turn streams into an elegant alternative to the traditional control flow patterns.
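A sketch of the idea, loosely inspired by the chapter's file-concatenation example; for simplicity each file is read in full with readFile, and all the names here are my own:

```js
import { createWriteStream } from 'node:fs'
import { readFile } from 'node:fs/promises'
import { Readable, Transform } from 'node:stream'
import { pipeline } from 'node:stream/promises'

// Concatenate a list of files sequentially: _transform() is not invoked with
// the next filename until callback() is called for the previous one, so the
// stream itself acts as the "queue" that keeps the operations in order.
async function concatFiles (dest, files) {
  await pipeline(
    Readable.from(files),
    new Transform({
      objectMode: true,
      transform (filename, encoding, callback) {
        readFile(filename)
          .then(data => callback(null, data), callback)
      }
    }),
    createWriteStream(dest)
  )
}

await concatFiles('all-together.txt', ['part1.txt', 'part2.txt', 'part3.txt'])
```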


Unordered concurrent execution - Implementing an unordered concurrent stream
The Problem: Default sequential processing creates a bottleneck when handling slow asynchronous operations (underutilizes Node.js concurrency).
The Solution: Process multiple chunks in parallel instead of waiting for the previous one to complete.
Constraint: Only use this when order does not matter.
Best For: Object streams (e.g., processing independent database records).
Avoid For: Binary streams or any task where data chunks are related (e.g., file concatenation).
(Must read an example: 258-264)
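A rough sketch of such an unordered parallel Transform, in the spirit of the chapter's ParallelStream; the concurrency bookkeeping here is my own simplification:

```js
import { Transform } from 'node:stream'

// Runs the user-provided async function on up to `concurrency` chunks at the
// same time. Results are pushed as soon as they are ready, so the output
// order is NOT guaranteed.
class ParallelStream extends Transform {
  constructor (concurrency, userTransform, options = {}) {
    super({ objectMode: true, ...options })
    this.concurrency = concurrency
    this.userTransform = userTransform // (chunk, push, done) => void
    this.running = 0
    this.continueCb = null
    this.terminateCb = null
  }

  _transform (chunk, encoding, done) {
    this.running++
    this.userTransform(chunk, this.push.bind(this), this._onComplete.bind(this))
    if (this.running < this.concurrency) {
      done() // accept the next chunk immediately
    } else {
      this.continueCb = done // too many tasks running: hold the next chunk
    }
  }

  _flush (done) {
    if (this.running > 0) {
      this.terminateCb = done // wait for the pending tasks to finish
    } else {
      done()
    }
  }

  _onComplete (err) {
    this.running--
    if (err) return this.destroy(err)
    const continueCb = this.continueCb
    this.continueCb = null
    if (continueCb) continueCb()
    if (this.running === 0 && this.terminateCb) this.terminateCb()
  }
}

// usage: source.pipe(new ParallelStream(4, (chunk, push, done) => { ... }))
```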
Ordered concurrent execution
When processing data with concurrent streams, the order of output chunks may become shuffled. In cases where preserving the original order is essential, you can still process chunks concurrently as long as you reorder the results before emitting them.
The key is to separate concurrent internal processing (which can happen in any order) from ordered output emission (which may need to match the input sequence). This is typically achieved by buffering processed chunks and releasing them in the correct order.
Instead of implementing this logic manually, we recommend using the existing npm package parallel-transform, which handles this reordering for you.
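A usage sketch, assuming parallel-transform's documented signature parallel(maxParallel, onTransform); check the package's README before relying on it:

```js
// npm install parallel-transform
import parallelTransform from 'parallel-transform'

// Up to 4 chunks are processed at the same time, but parallel-transform
// buffers the results and emits them in the original input order.
const slowDouble = parallelTransform(4, (value, callback) => {
  setTimeout(() => callback(null, value * 2), Math.random() * 1000)
})

slowDouble.on('data', data => console.log(data)) // 2, 4, 6, ... in order

for (let i = 1; i <= 10; i++) {
  slowDouble.write(i)
}
slowDouble.end()
```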

Piping patterns
As in real-life plumbing, Node.js streams can also be piped together by following different patterns.
We can, in fact, merge the flow of two different streams into one, split the flow of one stream into two or more pipes, or redirect the flow based on a condition. In this section, we are going to explore the most important plumbing patterns that can be applied to Node.js streams.

Pattern: Combining streams
what if we want to modularize and reuse an entire pipeline? What if we want to combine multiple streams so that they look like one from the outside (check Img):
A combined stream is usually a Duplex stream, which is built by connecting the first stream to its Writable side and the last stream to its Readable side.
Another important characteristic of a combined stream is that it must capture and propagate all the errors that are emitted from any stream inside the pipeline.
(for an example: 268-270)
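A sketch of a combined "compress then encrypt" stream built with the pumpify package (the pipeline stages and names are illustrative, not the book's exact code):

```js
// npm install pumpify
import pumpify from 'pumpify'
import { createGzip } from 'node:zlib'
import { createCipheriv, randomBytes, scryptSync } from 'node:crypto'

// The combined stream is a single Duplex: writing goes into the gzip stage,
// reading comes out of the cipher stage, and an error in either inner stream
// is emitted by the combined stream.
function createCompressAndEncrypt (password, iv) {
  const key = scryptSync(password, 'salt', 24)
  return pumpify(createGzip(), createCipheriv('aes-192-cbc', key, iv))
}

const iv = randomBytes(16)
const archive = createCompressAndEncrypt('my_secret', iv)
// archive can now be used like any other Duplex stream inside a pipeline
```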


Pattern: Forking streams
We can perform a fork of a stream by piping a single Readable stream into multiple Writable streams.
This is useful when we want to send the same data to different destinations; for example, two different sockets or two different files. It can also be used when we want to perform different transformations on the same data, or when we want to split the data based on some criteria. If you are familiar with the Unix command tee (nodejsdp.link/tee), this is exactly the same concept applied to Node.js streams!
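A sketch of forking: one pass over a file feeds two hash computations, each going to its own destination:

```js
import { createReadStream, createWriteStream } from 'node:fs'
import { createHash } from 'node:crypto'

const inputFile = process.argv[2]
const inputStream = createReadStream(inputFile)

// The same Readable is piped into two different pipelines.
inputStream
  .pipe(createHash('sha1').setEncoding('hex'))
  .pipe(createWriteStream(`${inputFile}.sha1`))

inputStream
  .pipe(createHash('md5').setEncoding('hex'))
  .pipe(createWriteStream(`${inputFile}.md5`))
```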


Pattern: Merging Streams
Merging is the opposite operation to forking and involves piping a set of Readable streams into a single Writable stream.
(for an implementation example: 272-274)
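A sketch of merging several files into one destination; note the { end: false } option and the manual end() once every source has finished (chunks from different sources may interleave in any order):

```js
import { createReadStream, createWriteStream } from 'node:fs'

const dest = createWriteStream('merged.txt')
const sources = ['a.txt', 'b.txt', 'c.txt'].map(file => createReadStream(file))

let ended = 0
for (const source of sources) {
  // { end: false } prevents the first source to finish from closing `dest`
  source.pipe(dest, { end: false })
  source.on('end', () => {
    // end the destination only when every source has finished
    if (++ended === sources.length) dest.end()
  })
}
```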

Pattern: Multiplexing and demultiplexing of merged streams (a binary/text stream)
There is a particular variation of the merge stream pattern in which we don’t really want to just join multiple streams together, but instead, use a shared channel to deliver the data of a set of streams.
This is a conceptually different operation because the source streams remain logically separated inside the shared channel, which allows us to split the stream again once the data reaches the other end of the shared channel.
The operation of combining multiple streams (in this case, also known as channels) to allow transmission over a single stream is called multiplexing,
while the opposite operation, namely reconstructing the original streams from the data received from a shared stream, is called demultiplexing.
The devices that perform these operations are called multiplexer (or mux) and demultiplexer (or demux), respectively.
(For an example: 175-279)

Pattern: Multiplexing and demultiplexing of merged streams (object streams.)
The example that we have just shown demonstrates how to multiplex and demultiplex a binary/text stream, but it’s worth mentioning that the same rules apply to object streams.
The biggest difference is that when using objects, we already have a way to transmit the data using atomic messages (the objects), so multiplexing would be as easy as setting a channelID property in each object. Demultiplexing would simply involve reading the channelID property and routing each object toward the right destination stream.
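A sketch of the object-mode demultiplexing side, routing by channelID (the demux helper and channel names are my own):

```js
import { PassThrough, Writable } from 'node:stream'

// Route every object coming from `source` to the destination stream
// registered under its channelID property.
function demux (source, destinations) {
  source.pipe(new Writable({
    objectMode: true,
    write (obj, encoding, callback) {
      const dest = destinations[obj.channelID]
      if (dest) {
        dest.write(obj) // backpressure handling omitted for brevity
      }
      callback()
    }
  }))
}

const channelA = new PassThrough({ objectMode: true })
const channelB = new PassThrough({ objectMode: true })
channelA.on('data', obj => console.log('A:', obj.payload))
channelB.on('data', obj => console.log('B:', obj.payload))

const shared = new PassThrough({ objectMode: true })
demux(shared, { a: channelA, b: channelB })

// Multiplexing is just tagging each object with a channelID before writing it:
shared.write({ channelID: 'a', payload: 'hello' })
shared.write({ channelID: 'b', payload: 'world' })
```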

Readable stream utilities
In this chapter, we’ve explored how Node.js streams work, how to create custom streams, and how to compose them into efficient, elegant data processing pipelines.
To wrap up, let's look at some utilities provided by the node:stream module that simplify working with Readable streams. These utilities are designed to streamline data processing in a streaming fashion and bring a functional programming flavor to stream operations.
The stream module offers utilities for mapping, transformation, filtering, iteration, searching, evaluation, limiting, and reducing.
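A quick sketch of the functional-style helpers on Readable; they are marked experimental in the Node.js docs, so check the documentation for your Node.js version before relying on them:

```js
import { Readable } from 'node:stream'

// filter/map/reduce operate on the stream chunk by chunk; reduce returns a Promise.
const total = await Readable.from([
  { country: 'Italy', profit: 120 },
  { country: 'France', profit: 80 },
  { country: 'Italy', profit: 30 }
])
  .filter(record => record.country === 'Italy') // keep only some chunks
  .map(record => record.profit)                 // transform each chunk
  .reduce((sum, profit) => sum + profit, 0)     // aggregate to a single value

console.log(total) // 150
```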
