Data blending is a process in which big data from multiple sources is merged into a single data warehouse or data set. It involves merging not only different file formats and disparate data sources but also different varieties of data. Data blending helps business analysts cope with the growing volume of data they need in order to make critical business decisions based on good-quality business intelligence.
Data blending has been described as different from data integration because data analysts need to merge sources very quickly, too quickly for any practical intervention by data scientists.
The most common custom metadata question is: “How can this dataset blend with (join or union to) my other datasets?” A 2015 Forrester Consulting study found that 52 percent of companies are blending 50 or more data sources and 12 percent are blending over 1,000 sources.
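The two operations named in that question, join and union, can be illustrated with a minimal sketch in plain Python. The sample rows, field names, and the helper functions `blend_join` and `blend_union` are hypothetical and chosen only for illustration; real blending tools handle schema matching, type coercion, and duplicate keys, which this sketch does not.

```python
# Hypothetical sample data: three small sources about customers.
crm_rows = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
billing_rows = [
    {"customer_id": 1, "total": 120.0},
    {"customer_id": 3, "total": 75.5},
]
web_rows = [
    {"customer_id": 4, "name": "Edsger"},
]


def blend_join(left, right, key):
    """Left join: keep every left row and add fields from the matching right row."""
    # Assumes `key` is unique within `right`; later duplicates would overwrite earlier ones.
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def blend_union(*sources):
    """Union: stack the rows of sources that share the same fields."""
    return [dict(row) for source in sources for row in source]


# Join: enrich CRM rows with billing totals where the customer_id matches.
joined = blend_join(crm_rows, billing_rows, "customer_id")

# Union: append web sign-ups to the CRM rows.
unioned = blend_union(crm_rows, web_rows)
```

A join widens each row with fields from another source, while a union lengthens the data set with more rows of the same shape. Which one applies depends on whether the sources describe the same entities or different entities in the same format.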