Requirements, admin portal, gateways

Identify requirements for a solution, including components, performance, and capacity stock-keeping units (SKUs)

Capacity Requirements

  • Capacities - how many do we need, and what size should each be?

    • what are capacities?

    • Number of Capacities

      • impacted by:

        • compliance with data residency regulations

          • datasets must be placed in separate capacities according to the region where the data is required to reside

        • billing preferences

          • capacities determine how the org is billed, including which department is charged

        • segregating by workload type

          • example: put more intensive data engineering in separate capacities

        • separating by department

    • Sizing of Capacities

      • impacted by:

        • intensity of expected workloads

          • high-volume ingestion

          • heavy transformation (Spark etc.)

          • machine learning training

        • budget

          • a larger SKU provides more resources but costs more

        • whether you can afford to wait

          • a smaller capacity means workloads take longer to complete

        • whether the client wants access to F64 features (e.g. Copilot)

  • Data ingestion methods

    • which Fabric items/features you will need to get data into Fabric, and how they will be configured

    • Deciding factors

| Fabric Item         | Where External Data Stored                             | What Skills Exist              |
|---------------------|--------------------------------------------------------|--------------------------------|
| Shortcut            | ADLS Gen 2, Amazon S3, Google Cloud Storage, Dataverse |                                |
| Database Mirroring  | Azure SQL, Azure Cosmos DB, Snowflake                  |                                |
| ETL - Dataflow      | On-premises SQL, Other                                 | Predominantly no/low code      |
| ETL - Data Pipeline | On-premises SQL, Other                                 | SQL, predominantly no/low code |
| ETL - Notebook      | Other                                                  | Spark (Python, Scala etc.)     |
| Eventstream         | Real-time events                                       |                                |
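The deciding factors above can be read as a lookup: first match on where the external data lives, then fall back to the team's skills to pick an ETL item. A minimal Python sketch of that reading follows. The function name, the `skills` labels ("low-code", "sql", "spark"), and the source strings are assumptions made for illustration; none of them are Fabric API identifiers.

```python
# Labels below mirror the notes' table; they are illustrative strings,
# not identifiers from any Fabric API.
SHORTCUT_SOURCES = {"adls gen 2", "amazon s3", "google cloud storage", "dataverse"}
MIRRORING_SOURCES = {"azure sql", "azure cosmos db", "snowflake"}


def suggest_ingestion_item(source: str, skills: str = "low-code") -> str:
    """Suggest a Fabric item for ingestion, following the notes' table.

    source: where the external data is stored (free-text label, assumed).
    skills: "low-code", "sql", or "spark" (assumed labels).
    """
    src = source.strip().lower()
    if src == "real-time events":
        return "Eventstream"
    if src in SHORTCUT_SOURCES:
        return "Shortcut"
    if src in MIRRORING_SOURCES:
        return "Database Mirroring"
    if src.startswith("on-prem"):
        # The table lists only Dataflow and Data Pipeline for on-premises SQL.
        # Falling back to Dataflow for any other skill set is an assumption.
        return "ETL - Data Pipeline" if skills == "sql" else "ETL - Dataflow"
    # Any other source: the team's skills pick the ETL item.
    if skills == "spark":
        return "ETL - Notebook"
    if skills == "sql":
        return "ETL - Data Pipeline"
    return "ETL - Dataflow"
```

For example, `suggest_ingestion_item("Snowflake")` returns "Database Mirroring", and `suggest_ingestion_item("Other", "spark")` returns "ETL - Notebook". Real decisions usually weigh more than these two factors, so treat this as a memory aid rather than a rule.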

Deciding between configuration features

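| Feature                  | How data secured                       | What is the volume of the data                                   |
|--------------------------|----------------------------------------|------------------------------------------------------------------|
| On-premises data gateway | On-premises SQL                        |                                                                  |
| VNet data gateway        | Azure virtual network/private endpoint |                                                                  |
| Fast copy                |                                        | Medium (gigabytes per day); High (many GB or terabytes per day)  |
| Staging                  |                                        | Medium (gigabytes per day); High (many GB or terabytes per day)  |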
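The configuration choices above combine two independent questions: how the source is secured (which decides the gateway type) and how much data moves per day (which decides whether Fast copy and Staging are worth enabling). A rough Python sketch is below. The function name and the 1 GB/day threshold are hypothetical; the notes only say "gigabytes per day" for medium volume.

```python
# Hypothetical threshold: the notes give no exact figure for "medium" volume.
MEDIUM_VOLUME_GB_PER_DAY = 1.0


def suggest_config_features(secured_by: str, gb_per_day: float) -> list[str]:
    """Return the configuration features the notes' table points to.

    secured_by: free-text description of how the source is secured (assumed).
    gb_per_day: expected daily data volume in gigabytes.
    """
    features: list[str] = []
    sec = secured_by.strip().lower()
    # Gateway choice depends only on how the data is secured.
    if sec.startswith("on-prem"):
        features.append("On-premises data gateway")
    elif "virtual network" in sec or "vnet" in sec or "private endpoint" in sec:
        features.append("VNet data gateway")
    # Fast copy and Staging are listed for medium and high volumes.
    if gb_per_day >= MEDIUM_VOLUME_GB_PER_DAY:
        features.extend(["Fast copy", "Staging"])
    return features
```

A source with no private networking and a small daily volume yields an empty list, meaning none of the four features is indicated by the table.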

Data Gateways

  • Allow us to access data that is secured and cannot be reached directly from the cloud. Two types:

    • On-premises data gateway - for anything external

      • Install data gateway

      • In Fabric, create a new on-premises data gateway connection

        • then use the connection in a Dataflow or data pipeline

    • Virtual network (VNet) data gateway - exclusively for Azure resources

      • Configure in Azure

      • Create a private endpoint on the Azure resource

      • Create a subnet

      • Create a new virtual network data gateway connection

      • Use the gateway in a Dataflow or data pipeline

Data Storage Requirements

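| Fabric Storage Options | Data Type                                       |
|------------------------|-------------------------------------------------|
| Lakehouse              | Structured, semi-structured and/or unstructured |
|                        | Relational/structured                           |
| Data warehouse         | Relational/structured                           |
| KQL database           | Real-time/streaming                             |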
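The storage options above map data types to a Fabric store, which can be sketched as a small function. The name `suggest_storage` and the data-type labels are assumptions for illustration. Structured data appears under more than one option in the notes, so sending it to the data warehouse here is an arbitrary tie-break, not a recommendation.

```python
def suggest_storage(data_type: str) -> str:
    """Map a data-type label to a Fabric storage option per the notes' table."""
    kind = data_type.strip().lower()
    if kind in {"real-time", "streaming"}:
        return "KQL database"
    if kind in {"semi-structured", "unstructured"}:
        return "Lakehouse"
    if kind in {"relational", "structured"}:
        # Assumed tie-break: a Lakehouse also holds structured data.
        return "Data warehouse"
    raise ValueError(f"unrecognised data type: {data_type!r}")
```

An unknown label raises `ValueError` instead of guessing, since the table does not cover anything beyond these types.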

Recommend settings in the Fabric admin portal

Choose a data gateway type

Create a custom Power BI report theme