(3.4.1)--Bigdataanalysis3-4datatranform.pdf
Data transform

1 Data integration
2 Data transform

1 Data integration
Data integration: integrate data from multiple data sources into a consistent storage.
Pattern matching; Data redundancy processing; Data value conflict solving.

1 Data integration - Pattern matching
Integrate metadata from different data sources (this step is often called schema integration).
Entity recognition problem: match real-world entities from different data sources, such as A.cust-id = B.customer_no.

1 Data integration - Data redundancy
The same attribute may have different field names in different databases.
One attribute can be derived from another attribute. For example, the average monthly income attribute in a customer data table can be calculated from the monthly income attribute.
Some redundancy can be detected by correlation analysis.

1 Data integration - Data value conflict
For a real-world entity, its attribute values from different data sources may differ, for example through differences in representation, different scales, or differences in coding.
For example: the weight attribute uses the metric system (kg, g) in one system but the imperial system (pound) in another. The same price attribute in different locations may use different currency units: $, pound, R
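
The three integration steps described in the slides (matching field names, resolving a value conflict, and detecting redundancy by correlation analysis) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the course's own implementation: the records, the unified field name `customer_id`, the helper names, and the 0.95 threshold are all hypothetical. Only the field names `cust-id` and `customer_no` come from the slides.

```python
from math import sqrt

LB_TO_KG = 0.45359237  # one international pound expressed in kilograms

# Hypothetical mappings from each source's field names to one unified schema.
FIELD_MAP_A = {"cust-id": "customer_id"}
FIELD_MAP_B = {"customer_no": "customer_id"}


def rename_fields(record, field_map):
    """Pattern matching: map source field names onto the unified names."""
    return {field_map.get(key, key): value for key, value in record.items()}


def to_kg(record):
    """Value conflict: convert a weight in pounds to kilograms if present."""
    record = dict(record)
    if "weight_lb" in record:
        record["weight_kg"] = record.pop("weight_lb") * LB_TO_KG
    return record


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up sample data: source A stores income, source B stores weight in pounds.
source_a = [
    {"cust-id": 1, "monthly_income": 3000.0, "avg_monthly_income": 3050.0},
    {"cust-id": 2, "monthly_income": 4500.0, "avg_monthly_income": 4400.0},
    {"cust-id": 3, "monthly_income": 6000.0, "avg_monthly_income": 6100.0},
    {"cust-id": 4, "monthly_income": 5200.0, "avg_monthly_income": 5150.0},
]
source_b = [
    {"customer_no": 1, "weight_lb": 154.0},
    {"customer_no": 2, "weight_lb": 176.0},
    {"customer_no": 3, "weight_lb": 132.0},
    {"customer_no": 4, "weight_lb": 198.0},
]

# Step 1: unify field names and index each source by the shared key.
unified_a = {}
for rec in source_a:
    row = rename_fields(rec, FIELD_MAP_A)
    unified_a[row["customer_id"]] = row

# Step 2: unify field names in source B and resolve the unit conflict.
unified_b = {}
for rec in source_b:
    row = to_kg(rename_fields(rec, FIELD_MAP_B))
    unified_b[row["customer_id"]] = row

# Merge records that describe the same real-world entity.
integrated = [
    {**unified_a[cid], **unified_b[cid]}
    for cid in sorted(unified_a)
    if cid in unified_b
]

# Step 3: flag a possibly redundant attribute pair by correlation analysis.
REDUNDANCY_THRESHOLD = 0.95  # assumed cutoff, chosen only for illustration
r = pearson(
    [row["monthly_income"] for row in integrated],
    [row["avg_monthly_income"] for row in integrated],
)
is_redundant = abs(r) > REDUNDANCY_THRESHOLD
```

A high correlation only suggests that one attribute may be derivable from the other. Whether to drop it still depends on what each attribute means.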