What does Data Deduplication do?
Scans files, divides those files into chunks, and retains only one copy of each chunk.
After deduplication, files are not stored as independent data; instead, each file is replaced with what, which points to the data in the common chunks?
A stub
Where are files kept after deduplication?
In common chunks
Data Deduplication may do what to overall disk performance?
Improve it
Data Deduplication runs as a
scheduled task that can have minimum file age requirements set
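A minimal PowerShell sketch of enabling and tuning deduplication on a volume (the drive letter E: and the three-day age threshold are illustrative assumptions):

# Enable deduplication on the volume; Default is the general-purpose usage type.
Enable-DedupVolume -Volume "E:" -UsageType Default

# Only optimize files that have not changed for at least 3 days.
Set-DedupVolume -Volume "E:" -MinimumFileAgeDays 3

# Inspect the schedules the role created for this server.
Get-DedupSchedule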
The components of the Data Deduplication role
Filter Driver
Deduplication service
Garbage Collection
Filter Driver
Monitors local or remote input/output (I/O) and manages the chunks of data on the file system by interacting with the various jobs. There is one for every volume.
Deduplication service
Consists of multiple jobs that perform both deduplication and compression of files according to the data deduplication policy for the volume. After the initial optimization of a file, if the file is then modified and meets the data deduplication policy threshold for optimization, the file is optimized again.
Garbage collection
Consists of jobs that process deleted or modified data on the volume so that any data chunks no longer being referenced are cleaned up. This job processes previously deleted or logically overwritten optimized content to create usable volume free space. When an optimized file is deleted or overwritten by new data, the old data in the chunk store isn't deleted immediately. Garbage collection can be scheduled or run manually.
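A short PowerShell sketch of triggering garbage collection by hand (the volume letter E: is an assumption):

# Reclaim space from chunks that are no longer referenced.
Start-DedupJob -Volume "E:" -Type GarbageCollection

# Watch the progress of running deduplication jobs.
Get-DedupJob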
Data Deduplication has built-in data integrity features such as
checksum validation and metadata consistency checking
Data Deduplication will try to rebuild corrupted data by using
Backup copies
Mirror Image
New Chunk
Backup copies
Deduplication keeps backup copies of popular chunks (chunks referenced over 100 times) in an area called the hotspot. If the working copy suffers soft damage such as bit flips or torn writes, deduplication uses its redundant copy.
Mirror image
If using mirrored Storage Spaces, deduplication can use the mirror image of the redundant chunk to serve the I/O and fix the corruption.
New chunk
If a file is processed with a chunk that is corrupted, the corrupted chunk is eliminated, and the new incoming chunk is used to fix the corruption.
Because of the additional validations in deduplication, it may be one of the first systems to report any early signs of
data corruption in the hardware or file system
What does unoptimization do?
Undoes deduplication on all the optimized files on the volume.
Some scenarios that call for unoptimization
decommissioning a server with volumes enabled for Data Deduplication, troubleshooting issues with deduplicated data, or migration of data to another system that doesn't support Data Deduplication.
What should you do before running unoptimization?
Run Disable-DedupVolume in Windows PowerShell
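A PowerShell sketch of the sequence (volume E: is assumed; make sure the volume has enough free space to hold the fully rehydrated files before starting):

# Stop new data from being deduplicated on the volume.
Disable-DedupVolume -Volume "E:"

# Rehydrate all optimized files on the volume.
Start-DedupJob -Volume "E:" -Type Unoptimization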
Three main types of data deduplication
source, target (or post-process) deduplication, and in-line (or transit) deduplication
What are optimized files?
Files that are stored as reparse points and that contain pointers to a map of the respective chunks in the chunk store that are needed to restore the file when it's requested.
Chunk store
The location for the optimized file data
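A one-line PowerShell sketch for inspecting the chunk store on a deduplicated volume (E: is an assumption):

# Reports chunk counts and chunk store sizes for the volume.
Get-DedupMetadata -Volume "E:"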
Data deduplication is designed to be applied on
primary data volumes
Data deduplication can be scheduled based on
the type of data that is involved, and the frequency and volume of changes that occur to the volume or particular file types.
Data Deduplication should be considered for the following data types
General file shares, software deployment shares, VHD Libraries, VDI deployments, and Virtualized Backup
General file shares are
These include group content publication and sharing, user home folders, and Folder Redirection/Offline Files.
Software deployment shares are
These are software binaries, images, and updates.
VHD Libraries are
These are Virtual Hard Disk (VHD) file storage for provisioning to hypervisors.
VDI deployments are
These are Virtual Desktop Infrastructure (VDI) deployments using Microsoft Hyper-V.
Virtualized backup is
These include backup applications running as Hyper-V guests and saving backup data to mounted VHDs.
When applied to the correct data, deduplication can save
50 to 90 percent of a system's storage
What is an example of a bad candidate file for Data Deduplication?
Files that are often changed and accessed by users or applications
How can you see what savings Data Deduplication will give you?
By using the Deduplication Evaluation Tool (DDPEval.exe)
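A sketch of running the evaluation tool against a share (the path is illustrative; DDPEval.exe is installed under System32 when the Data Deduplication role is added):

# Estimate potential savings without changing any data.
& "$env:SystemRoot\System32\DDPEval.exe" "E:\Shares"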
What savings can you expect to see on user documents?
30 to 50 percent
What savings can you expect to see on software deployment shares?
70 to 80 percent
What savings can you expect to see on virtualization libraries?
80 to 95 percent
What savings can you expect to see on general file shares?
50 to 60 percent
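Once deduplication has run, the actual savings can be checked with PowerShell (a sketch; the property names follow the Deduplication module):

# Per-volume savings already achieved.
Get-DedupStatus | Format-Table Volume, SavedSpace, OptimizedFilesCount
Get-DedupVolume | Format-Table Volume, SavingsRate, SavedSpace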
Ideal candidates for deduplication
Folder redirection servers
Virtualization depot or provisioning library
Software deployment shares
Microsoft SQL Server and Microsoft Exchange Server backup volumes
Scale-out File Servers (SOFS) Cluster Shared Volumes (CSVs)
Virtualized backup VHDs (for example, Microsoft System Center Data Protection Manager)
Virtual Desktop Infrastructure (VDI) VHDs (only personal VDIs)
Non-ideal candidates for deduplication
Microsoft Hyper-V hosts
Windows Server Update Service (WSUS)
SQL Server and Exchange Server database volumes
Data Deduplication interoperability
Windows BranchCache
You can optimize access to data over the network by enabling BranchCache on Windows Server and Windows client operating systems. When a BranchCache-enabled system communicates over a wide area network (WAN) with a remote file server that's enabled for Data Deduplication, all the deduplicated files are already indexed and hashed, so requests for data from a branch office are computed quickly. This is similar to preindexing or prehashing.
You shouldn't create a hard quota on a volume root folder enabled for Data Deduplication; instead you should use
A soft quota
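A PowerShell sketch of creating a soft quota with File Server Resource Manager (the path and size are assumptions; requires the FSRM role to be installed):

# A soft quota monitors usage without blocking writes.
New-FsrmQuota -Path "E:\" -Size 1TB -SoftLimit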
Data Deduplication is compatible with
Distributed File System (DFS) Replication
Distributed File System (DFS) replication works by
remote differential compression
You can back up and restore individual files and full volumes using
Data Deduplication
You can create optimized file-level backups/restores using
The Volume Shadow Copy Service (VSS) writer