Microsoft Fabric Integration

Overview

Microsoft Fabric is the unified analytics platform that includes Power BI, Data Factory, Data Engineering, Data Science, Real-Time Intelligence, and Data Warehouse. Power BI is deeply integrated as the visualization and semantic modeling layer of Fabric.

Direct Lake Mode

Direct Lake is a storage mode exclusive to Fabric that reads data directly from delta tables in OneLake without importing or sending DirectQuery requests.

How Direct Lake Works

Framing: On refresh, Direct Lake copies only metadata (Parquet file references) from delta tables -- takes seconds
On-demand loading: When a query hits the model, data is loaded from Parquet files directly into the VertiPaq engine
No data duplication: Unlike Import, no copy of data is stored in the semantic model
Near-import performance: Once loaded into memory, queries run at VertiPaq speed

Direct Lake vs Import vs DirectQuery

Feature	Import	DirectQuery	Direct Lake
Data freshness	Snapshot at refresh	Real-time	Near real-time (after framing)
Query performance	Fastest (all in memory)	Depends on source	Near-import (on-demand load)
Refresh time	Minutes to hours	N/A	Seconds (framing only)
Refresh cost	High (full data copy)	None	Very low (metadata only)
Data size limit	10GB (Premium), 1GB (PBIX)	Source limit	Fabric capacity limit
DAX support	Full	Limited	Full
Calculated columns	Yes	No	Yes
Source requirement	Any	Any	OneLake delta tables only
Capacity requirement	Any	Any	Fabric F-SKU

Direct Lake Variants (2025-2026 GA)

Variant	Source	Multi-Source	Fallback	Use Case	GA Status
Direct Lake on OneLake (DL/OL)	OneLake delta files	Yes (multiple Fabric items)	NO fallback	Flexible, multiple lakehouses	GA
Direct Lake on SQL (DL/SQL)	Fabric SQL endpoint	No (single Fabric item)	Falls back to DirectQuery	SQL-centric, single source	GA

Creating a Direct Lake Semantic Model

In Power BI Desktop (2025+ preview):

Get Data > OneLake data hub
Select Fabric lakehouse or warehouse
Choose tables (loaded as Direct Lake automatically)
Build measures and relationships in Desktop
Publish to Fabric workspace

Via Fabric Service:

Open lakehouse/warehouse in Fabric
Click "New semantic model"
Select tables to include
Open model in web to add measures and relationships

Programmatically via TOM:

var database = new Database() { Name = "DirectLakeModel" };
var model = new Model() { Name = "DirectLakeModel" };
database.Model = model;

// Direct Lake partition source
var table = new Table() { Name = "Sales" };
table.Partitions.Add(new Partition() {
    Name = "Sales-DL",
    Mode = ModeType.DirectLake,
    Source = new EntityPartitionSource() {
        EntityName = "Sales",
        SchemaName = "dbo",
        ExpressionSource = new ExpressionSource() {
            Expression = "DatabaseQuery"
        }
    }
});
model.Tables.Add(table);

Critical distinction: DL/OL does NOT fall back to DirectQuery. If data cannot be served from memory, the query fails. This means DL/OL models must be carefully sized within capacity guardrails.

Direct Lake Guardrails by Capacity

Guardrail	F2	F4	F8	F16	F32	F64	F128
Max model size on disk	2 GB	4 GB	8 GB	16 GB	32 GB	64 GB	128 GB
Max rows per table	300M	300M	300M	1.5B	3B	6B	6B
Max files/row groups per table	1K	1K	1K	1K	1K	5K	5K
Concurrent DL queries	4	8	16	32	64	128	256

Max Memory is a soft limit for paging -- exceeding it causes performance degradation but not failure.

Max model size on disk/OneLake is a hard guardrail -- exceeding causes DQ fallback (DL/SQL) or query failure (DL/OL).

Direct Lake Fallback Configuration

Fallback Behavior	Setting	Impact
Automatic fallback to DirectQuery	Default (DL/SQL only)	Query still works but slower
Block fallback	`DirectLakeBehavior = DirectLakeOnly`	Query fails if cannot serve from DL
No fallback option	Default (DL/OL)	Queries always fail if data unavailable

Monitor fallback in Fabric Capacity Metrics app -- frequent fallback indicates model design issues.

Common fallback triggers (DL/SQL):

Columns not loaded into memory due to capacity limits
Calculated columns on Direct Lake tables (may trigger DQ fallback)
Certain DAX patterns that require full table scan
Stale framing (delta tables changed but model not re-framed)
File/row-group count exceeding capacity guardrails

Power BI Embedded with Direct Lake (GA March 2025)

Direct Lake mode is now fully supported for embedded analytics, backed by Microsoft SLA. Generate embed tokens for Direct Lake semantic models using the same embed token API as Import/DirectQuery models.

Framing (Refresh)

# Trigger framing via REST API
POST https://api.powerbi.com/v1.0/myorg/groups/{workspaceId}/datasets/{datasetId}/refreshes
{
  "type": "Automatic"
}

Framing is extremely fast (seconds) compared to Import refresh (minutes/hours). Schedule frequent framing for near-real-time data.

OneLake

OneLake is Fabric's unified data lake -- a single store for all analytics data, built on Azure Data Lake Storage Gen2 with delta format.

OneLake Shortcuts

Connect to external data without copying:

Shortcut Type	Source	Use Case
OneLake	Another Fabric item	Cross-workspace data sharing
ADLS Gen2	Azure Data Lake	Existing Azure data
S3	Amazon S3	Multi-cloud data
GCS	Google Cloud Storage	Multi-cloud data
Dataverse	Dynamics 365	Business app data

OneLake File API

Access OneLake data programmatically:

# Using Azure Storage SDK (OneLake supports ADLS Gen2 API)
from azure.storage.filedatalake import DataLakeServiceClient

service_client = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=token_credential
)

file_system_client = service_client.get_file_system_client(workspace_id)
directory_client = file_system_client.get_directory_client(f"{lakehouse_name}.Lakehouse/Tables")

Fabric Lakehouse

A lakehouse combines data lake flexibility with warehouse SQL capabilities:

Power BI connectivity:

SQL Analytics Endpoint: Read-only SQL endpoint for DirectQuery or Direct Lake
Delta tables: Native format for Direct Lake
Notebooks: Write data from Spark notebooks, read in Power BI

Lakehouse to Power BI Flow

[Data Sources] --> [Fabric Notebooks/Pipelines] --> [Lakehouse Delta Tables]
     |                                                       |
     v                                                       v
[Power Query Dataflows Gen2]                    [Direct Lake Semantic Model]
                                                             |
                                                             v
                                                     [Power BI Reports]

Fabric Warehouse

Fully managed SQL warehouse in Fabric:

T-SQL support: Full DML (INSERT, UPDATE, DELETE, MERGE)
Auto-distributed storage: No index tuning needed
Direct Lake compatible: Tables accessible as Direct Lake sources
Cross-database queries: Query across warehouses and lakehouses

Dataflow Gen2

Cloud-based ETL in Fabric, evolution of Power BI Dataflows:

Feature	Dataflow Gen1	Dataflow Gen2
Destinations	Power BI dataset only	Lakehouse, Warehouse, KQL DB, Azure SQL, ADLS Gen2, SharePoint
Compute	Power Query Online	Power Query Online + Fabric Spark
Staging	Optional (Premium)	Always enabled
Incremental refresh	Limited	Full support (GA to Lakehouse 2025)
Monitoring	Basic	Fabric monitoring hub
CI/CD	Not supported	Git integration + deployment pipelines (2025)
Variable library	Not supported	Parameterized source paths and expressions (2025)
Publish performance	Single-threaded validation	Parallelized query validations (2026)

Gen1 deprecation: Microsoft has announced Gen1 is legacy. Migrate to Gen2 for all new development. Gen2 supports all Gen1 connectors plus Fabric-native destinations.

Dataflow Gen2 Output Destinations (2025-2026)

Destination	Protocol
Fabric Lakehouse delta tables	Delta/Parquet
Fabric Warehouse tables	T-SQL
Fabric KQL Database tables	KQL
Fabric SQL Database tables	T-SQL
Azure SQL Database tables	T-SQL
Azure Data Explorer (Kusto) tables	KQL
ADLS Gen2 files	File (CSV, Parquet)
SharePoint files	File

Dataflow Gen2 to Direct Lake Pipeline

Create Dataflow Gen2 in Fabric workspace
Connect to source (any Power Query connector)
Set destination to Lakehouse (creates delta tables)
Create semantic model on top of lakehouse tables (Direct Lake)
Build reports on the semantic model

Real-Time Intelligence

Fabric Real-Time Intelligence items now have GA lifecycle management (2025):

Item	ALM Support	Integration
Eventstream	Git + deployment pipelines	Ingests from Event Hubs, Kafka, IoT Hub, custom APIs
Eventhouse	Git + deployment pipelines	Houses KQL databases
KQL Database	Git + deployment pipelines	DirectQuery from Power BI
Real-time Dashboard	Git + deployment pipelines	Native Fabric dashboard for streaming
Data Activator	Git + deployment pipelines	Alert/trigger on data conditions

Eventstream to Power BI Pipeline

[Event Hubs/Kafka/IoT Hub] --> [Eventstream] --> [KQL Database] --> [Power BI DirectQuery]
                                     |
                                     +--> [Lakehouse] --> [Power BI Direct Lake]

Notebooks for Data Prep

Fabric notebooks (PySpark/Spark SQL) write data that Power BI consumes:

# Write DataFrame to lakehouse delta table
df.write.format("delta").mode("overwrite").saveAsTable("Sales")

# Optimized write with partitioning
df.write.format("delta") \
    .partitionBy("Year", "Month") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("Sales")

# V-Order optimization (improves Direct Lake read performance)
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")
df.write.format("delta").mode("overwrite").saveAsTable("Sales")

V-Order: A write-time optimization that orders Parquet data for faster Direct Lake reads. Enable in notebook or pipeline configuration.

Fabric Items in Power BI Context

Fabric Item	Power BI Integration
Lakehouse	Direct Lake, SQL endpoint for DQ
Warehouse	Direct Lake, T-SQL queries
KQL Database	DirectQuery via KQL connector
Notebooks	Data prep, model management via sempy
Data Pipelines	Orchestrate refresh, data movement
Dataflow Gen2	ETL to lakehouse for DL consumption
Eventstream	Real-time data to KQL, then to PBI
ML Models	Score in notebooks, results to lakehouse

Semantic Link (sempy)

Python library for Power BI semantic model interaction in Fabric notebooks:

import sempy.fabric as fabric

# List datasets in workspace
datasets = fabric.list_datasets()

# Read data from semantic model using DAX
df = fabric.evaluate_dax(
    dataset="SalesModel",
    dax_string="EVALUATE SUMMARIZECOLUMNS('Date'[Year], 'Product'[Category], \"Sales\", [Total Sales])"
)

# Read model metadata
tables = fabric.list_tables(dataset="SalesModel")
measures = fabric.list_measures(dataset="SalesModel")

# Refresh dataset
fabric.refresh_dataset(dataset="SalesModel")

Composite Models with Direct Lake (2025 Preview)

Mix Direct Lake tables with Import tables in a single semantic model:

Source Table	Storage Mode	When to Use
Lakehouse fact table	Direct Lake	Large transaction data
Lakehouse dimension	Direct Lake	Shared dimension from gold layer
External reference data	Import	Small tables not in Fabric
Budget/forecast	Import	Data from Excel or external source

Key consideration: Composite model queries involving both DL and Import tables may have different performance characteristics. Test with production data volumes.

Additional Resources

Reference Files

references/fabric-architecture-patterns.md -- Medallion architecture, data mesh patterns, and Fabric workspace design strategies

fabric-integration