Master Data Management – Merging from Multiple Sources
In an enterprise, merging master data, such as customer data, from multiple sources is a common problem. Typically, no single key identifies a customer across the sources, so you have to match records based on the similarity of strings such as names and addresses. In this session, we are going to see how the different string comparison algorithms included in SQL Server 2008 R2 work. We are going to use the Soundex Transact-SQL function, three algorithms that come in R2 with Master Data Services (Levenshtein, Jaccard, and Jaro-Winkler), the Fuzzy Lookup transformation from Integration Services, and even create an additional CLR function that implements the Simil algorithm. We will also look at an algorithm that helps you merge only the data that has changed since the last merge, which is very important if you have to merge daily while data in the different sources changes continuously.
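To give a feel for what similarity-based matching means, here is a minimal Python sketch of two of the measures named above, Levenshtein and Jaccard. It is an illustration only, not the Master Data Services implementation: the function names, the normalization of edit distance to a 0..1 score, and the use of character bigrams for Jaccard are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to 0..1 (assumed normalization: divide by
    the length of the longer string)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def bigrams(s: str) -> set:
    """Set of lowercase two-character substrings (assumed tokenization)."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a: str, b: str) -> float:
    """Size of the bigram intersection divided by size of the bigram union."""
    sa, sb = bigrams(a), bigrams(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    # Hypothetical customer names from two sources
    for left, right in [("Smith", "Smyth"), ("Jon Doe", "John Doe")]:
        print(left, "|", right,
              "| levenshtein:", round(levenshtein_similarity(left, right), 3),
              "| jaccard:", round(jaccard(left, right), 3))
```

In a merge scenario, scores like these would be compared against a threshold to decide whether two records are likely to describe the same customer. The threshold itself has to be tuned on your own data.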