Move data across layers in Fabric
Moving data across medallion layers refines, organizes, and prepares it for downstream data activities. Fabric's lakehouse offers more than one way to move data between layers, so you can choose the method that works best for your team.
There are a few things to consider when deciding how to move and transform data across layers.
- How much data are you working with?
- How complex are the transformations you need to make?
- How often will you need to move data between layers?
- What tools are you most comfortable with?
Understanding the difference between data transformation and data orchestration helps you select the right tools for the job within Fabric.
Data transformation involves altering the structure or content of data to meet specific requirements. Tools for data transformation in Fabric include Dataflows (Gen2) and notebooks. Dataflows are a good option for smaller datasets and simple transformations. Notebooks are a better option for larger datasets and more complex transformations. Notebooks also let you save your transformed data as a managed Delta table in the lakehouse, ready for reporting.
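To make "altering the structure or content of data" concrete, here is a minimal sketch of a bronze-to-silver cleanup in plain Python. It is an illustration only, not Fabric's API: the function name `to_silver`, the column names, and the sample rows are all hypothetical.

```python
from datetime import date


def to_silver(bronze_rows):
    """Clean raw (bronze) rows: drop incomplete rows, remove duplicates, fix types.

    Hypothetical helper for illustration; the column names are assumptions.
    """
    seen = set()
    silver = []
    for row in bronze_rows:
        # Content change: skip rows that are missing a required value.
        if not row.get("order_id") or row.get("amount") in (None, ""):
            continue
        # Content change: deduplicate on the business key, keeping the first row.
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        # Structure change: rename "date" to "order_date" and cast strings to types.
        silver.append({
            "order_id": row["order_id"],
            "order_date": date.fromisoformat(row["date"]),
            "amount": float(row["amount"]),
        })
    return silver


# Made-up sample data standing in for a raw (bronze) table.
bronze = [
    {"order_id": "A1", "date": "2024-01-05", "amount": "19.99"},
    {"order_id": "A1", "date": "2024-01-05", "amount": "19.99"},  # duplicate
    {"order_id": "A2", "date": "2024-01-06", "amount": ""},       # missing amount
    {"order_id": "A3", "date": "2024-01-07", "amount": "5"},
]
silver = to_silver(bronze)
```

In a Fabric notebook, the same kind of logic would more likely be written against a Spark DataFrame, and the result could then be persisted with a call such as `df.write.format("delta").saveAsTable("silver_orders")`, where the table name is again an assumption.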
Data orchestration refers to the coordination and management of multiple data-related processes, ensuring they work together to achieve a desired outcome. Pipelines are the primary tool for data orchestration in Fabric. A pipeline is a series of steps that move data from one place to another; in this case, from one layer of the medallion architecture to the next. Pipelines can be automated to run on a schedule or triggered by an event.
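The idea of a pipeline as "a series of steps" can be sketched in a few lines of plain Python. This is a conceptual model only, under stated assumptions: Fabric pipelines are configured as activities in the service rather than written like this, and `run_pipeline`, the step functions, and the sample rows are hypothetical. Scheduling and event triggers are not shown.

```python
def run_pipeline(steps, data):
    """Run each (name, function) step in order, feeding each output to the next step."""
    log = []
    for name, step in steps:
        data = step(data)
        # Record the step name and how many rows it produced.
        log.append((name, len(data)))
    return data, log


def bronze_to_silver(rows):
    # Hypothetical cleanup rule: keep only rows with a positive amount.
    return [r for r in rows if r["amount"] > 0]


def silver_to_gold(rows):
    # Hypothetical aggregation: total amount per region, sorted by region name.
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["amount"]
    return [{"region": k, "total": v} for k, v in sorted(totals.items())]


# The orchestration itself is just the ordered list of steps.
steps = [
    ("bronze_to_silver", bronze_to_silver),
    ("silver_to_gold", silver_to_gold),
]

# Made-up sample data standing in for a bronze table.
raw = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": -3},
    {"region": "west", "amount": 7},
    {"region": "east", "amount": 5},
]
gold, log = run_pipeline(steps, raw)
```

The orchestrator knows nothing about what each step does. It only controls the order and passes results along, which is the distinction between orchestration and transformation drawn above.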