Requirements, admin portal, gateways
Identify requirements for a solution, including components, performance, and capacity stock-keeping units (SKUs)
Capacity Requirements
Capacities - how many do we need and what sizes of each?
what are capacities?
Number of Capacities
impacted by:
compliance with data residency regulations
datasets will need to be stored in different capacities depending on where the data needs to be stored
billing preferences
capacities impact where the org gets billed/which department gets billed
segregating by workload type
example: put more intensive data engineering workloads in separate capacities
separating by department
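The factors above can be read as a grouping rule: workloads that differ on any separating factor end up in different capacities. A minimal Python sketch, assuming hypothetical planning records with `region`, `billed_to` and `workload_type` fields (illustrative names, not Fabric API objects):

```python
from collections import defaultdict
from typing import NamedTuple


class Workload(NamedTuple):
    # Hypothetical planning record; not a Fabric API object.
    name: str
    region: str         # where the data must reside
    billed_to: str      # department that pays for the capacity
    workload_type: str  # e.g. "data engineering", "reporting"


def plan_capacities(workloads):
    """Group workloads that can share a capacity.

    Assumption: two workloads share a capacity only when region,
    billing department and workload type all match.
    """
    groups = defaultdict(list)
    for w in workloads:
        groups[(w.region, w.billed_to, w.workload_type)].append(w.name)
    return dict(groups)


workloads = [
    Workload("sales ingest", "West Europe", "Sales", "data engineering"),
    Workload("sales reports", "West Europe", "Sales", "reporting"),
    Workload("us sales reports", "East US", "Sales", "reporting"),
    Workload("sales forecasts", "West Europe", "Sales", "reporting"),
]
capacity_plan = plan_capacities(workloads)
print(len(capacity_plan))  # number of capacities under these assumptions
```

In practice not every factor has to force a split; dropping a field from the grouping key models a decision to share capacities across it.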
Sizing of Capacities
impacted by
intensity of expected workloads (high ingestion)
heavy transformation (Spark etc.)
machine learning training
budget
more resources means a larger SKU and so more cost
can you afford to wait?
a smaller capacity means workloads take longer
does the client want access to F64 features (e.g. Copilot)?
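The sizing questions above can be sketched as picking the smallest SKU that satisfies every constraint. This is a rough illustration only: `estimated_cus` and `max_affordable` are hypothetical planning inputs, and the F64 floor simply follows these notes.

```python
# F SKU sizes; the number is the capacity units (CUs) the SKU provides.
F_SKUS = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]


def pick_sku(estimated_cus, needs_f64_features=False, max_affordable=None):
    """Return the smallest F SKU meeting the requirements, or None.

    estimated_cus and max_affordable are hypothetical planning inputs;
    the budget is expressed as the largest SKU size you can pay for.
    """
    # Per these notes, features such as Copilot need at least F64.
    floor = 64 if needs_f64_features else 0
    for size in F_SKUS:
        if size >= estimated_cus and size >= floor:
            if max_affordable is not None and size > max_affordable:
                return None  # requirements exceed the budget
            return f"F{size}"
    return None  # workload is larger than the biggest SKU


print(pick_sku(20))                           # F32
print(pick_sku(20, needs_f64_features=True))  # F64
print(pick_sku(100, max_affordable=64))       # None
```

A `None` result is the "can you afford to wait" case: either accept a smaller SKU and longer run times, or raise the budget.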
Data ingestion methods
which Fabric items/features you will need to get data into Fabric, and how they will be configured
Deciding factors
Fabric Item | Where External Data Is Stored | What Skills Exist |
Shortcut | ADLS Gen 2, Amazon S3, Google Cloud Storage, Dataverse | |
Database Mirroring | Azure SQL, Azure Cosmos DB, Snowflake | |
ETL - Dataflow | On-premises SQL, Other | Predominantly no/low code |
ETL - Data Pipeline | On-premises SQL, Other | SQL, predominantly no/low code |
ETL - Notebook | Other | Spark (Python, Scala etc.) |
Eventstream | Real-time events | |
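The table can be read as a two-step decision: the source system picks the item where a direct option exists, otherwise the team's skills pick the ETL tool. A minimal Python sketch of that reading; `choose_ingestion_item` is a hypothetical helper, and the order in which skills are checked is an assumption:

```python
# Source system -> Fabric item, transcribed from the table above.
SHORTCUT_SOURCES = {"ADLS Gen 2", "Amazon S3", "Google Cloud Storage", "Dataverse"}
MIRRORING_SOURCES = {"Azure SQL", "Azure Cosmos DB", "Snowflake"}


def choose_ingestion_item(source, team_skills=frozenset()):
    """Suggest a Fabric item for getting data in (hypothetical helper)."""
    if source == "Real-time events":
        return "Eventstream"
    if source in SHORTCUT_SOURCES:
        return "Shortcut"
    if source in MIRRORING_SOURCES:
        return "Database Mirroring"
    # Everything else needs ETL; the team's skills decide which tool.
    on_prem = source.lower().startswith("on-prem")
    # The table lists Notebook only for "Other" sources, not on-premises SQL.
    if "Spark" in team_skills and not on_prem:
        return "ETL - Notebook"
    if "SQL" in team_skills:
        return "ETL - Data Pipeline"
    return "ETL - Dataflow"  # no/low-code default


print(choose_ingestion_item("Amazon S3"))
print(choose_ingestion_item("On-premises SQL", {"Spark", "SQL"}))
```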
Deciding between configuration features
Feature | How Data Is Secured | Volume of Data |
On-premises data gateway | On-premises SQL | |
VNet data gateway | Azure virtual network/private endpoint | |
Fast copy | | Medium (gigabytes per day), High (many GB or terabytes per day) |
Staging | | Medium (gigabytes per day), High (many GB or terabytes per day) |
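As a rough illustration of the table, the gateway choice follows how the source is secured, while Fast copy and Staging follow data volume. `choose_config_features` is a hypothetical helper, and the 1 GB per day threshold for "medium" is an assumption, since the notes give no exact figure:

```python
def choose_config_features(on_premises=False, behind_azure_vnet=False,
                           gb_per_day=0.0):
    """Return configuration features to consider (hypothetical helper).

    Assumption: "medium" volume starts at 1 GB per day; the notes give
    no exact threshold.
    """
    features = []
    if on_premises:
        features.append("On-premises data gateway")
    if behind_azure_vnet:
        features.append("VNet data gateway")
    if gb_per_day >= 1:
        features += ["Fast copy", "Staging"]
    return features


print(choose_config_features(on_premises=True, gb_per_day=50))
```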
Data Gateways
Gateways let us access data that is secured and not directly reachable from the cloud. Two types:
On-premises data gateway - anything external (e.g. an on-premises SQL Server)
Install the data gateway
In Fabric, create a new on-premises data gateway connection
Use the connection in a Dataflow or data pipeline
Virtual network (VNet) data gateway - exclusively for Azure resources
Configure in Azure
Create a private endpoint on the Azure resource
Create a subnet
In Fabric, create a new virtual network data gateway connection
Use the gateway in a Dataflow or data pipeline
Data Storage Requirements
Fabric Storage Option | Data Type |
Lakehouse | Structured, semi-structured and/or unstructured |
Data warehouse | Relational/structured |
KQL database | Real-time/streaming |
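The storage table works as a lookup from data type to candidate stores. A minimal Python sketch, where `storage_candidates` is a hypothetical helper and structured data returns more than one candidate because the table lists it under several options:

```python
def storage_candidates(data_type):
    """Return the Fabric storage options that fit a data type.

    Hypothetical helper that only encodes the table above.
    """
    kind = data_type.strip().lower()
    if kind in {"real-time", "streaming"}:
        return ["KQL database"]
    if kind in {"semi-structured", "unstructured"}:
        return ["Lakehouse"]
    if kind in {"structured", "relational"}:
        # Both options handle structured data; team skills and query
        # patterns would decide between them.
        return ["Lakehouse", "Data warehouse"]
    raise ValueError(f"unknown data type: {data_type!r}")


print(storage_candidates("unstructured"))
print(storage_candidates("relational"))
```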