Introduction to Big Data: English Presentation PPT Courseware.ppt
BIG DATA

EVERY MINUTE
- Didi rides hailed: 1,388 cabs and 2,777 private cars
- 395,833 people log in to WeChat; 194,444 people are video or audio chatting
- 625,000 Youku Tudou videos are being watched
- 64,814 posts and reposts on Weibo
- Search: 4,166,667 search queries
- 774 people buy something on Alibaba's marketplaces; US$1,133,942 spent on Alibaba

CONTENTS
1 Definition
2 Characteristic
3 NoSQL
4 RDBMS
5 MapReduce
6 Applications

1 Definition

BIG DATA: volume of data; important data; on a day-to-day basis; for better decisions.

2 Characteristic

- Volume: The quantity of generated and stored data.
- Variety: The type and nature of the data.
- Velocity: In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.
- Variability: Inconsistency of the data set can hamper processes to handle and manage it.
- Veracity: The quality of captured data can vary greatly, affecting accurate analysis.

3 NoSQL

- NoSQL here refers to document-oriented databases.
- SQL doesn't scale well horizontally.
- It is schemaless, but not formless (JSON format).
- JSON: a data interchange format.
- Examples: MongoDB, CouchDB.

BASE Model

- Basic Availability: spread data across many storage systems with a high degree of replication.
- Soft State: data consistency is the developer's problem and should not be handled by the database.
- Eventual Consistency: at some point in the future, data will converge to a consistent state. No guarantees are made about "when".

JSON Structure

    {
       field1: value1,
       field2: value2,
       ...
       fieldN: valueN
    }

    var mydoc = {
       _id: ObjectId("5099803df3f4948bd2f98391"),
       name: { first: "Alan", last: "Turing" },
       birth: new Date('Jun 23, 1912'),
       death: new Date('Jun 07, 1954'),
       contribs: [ "Turing machine", "Turing test" ],
       views: NumberLong(1250000)
    }

RDBMS vs NoSQL

Row DB:
    001:10,Smith,Joe,40000;
    002:12,Jones,Mary,50000;
    003:11,Johnson,Cathy,44000;
    004:22,Jones,Bob,55000;
    index: 001:40000;002:50000;003:44000;004:55000;

Column DB:
    10:001,12:002,11:003,22:004;
    Smith:001,Jones:002,Johnson:003,Jones:004;
    Joe:001,Mary:002,Cathy:003,Bob:004;
    40000:001,50000 ;

Column DB with repeated values grouped:
    Smith:001,Jones:002,004,Johnson:003;

Benefits

- Column-oriented organizations are more efficient when an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns of data, because reading that smaller subset of data can be faster than reading all data.
- Column-oriented organizations are more efficient when new values of a column are supplied for all rows at once, because that column data can be written efficiently and replace old column data without touching any other columns for the rows.
- Row-oriented organizations are more efficient when many columns of a single row are required at the same time, and when row size is relatively small, as the entire row can be retrieved with a single disk seek.
- Row-oriented organizations are more efficient when writing a new row if all of the column data is supplied at the same time, as the entire row can be written with a single disk seek.

SQL vs Non SQL

A good compromise is to design your system with 3 logical DBs:
1. A normal SQL DB used by your admin application to create content.
2. A NoSQL DB for the front-end/public/high-volume application used by the public internet.
3. A DB for the analytical reporting system, using cubes and all that good stuff.

Data flows from the admin DB to the client NoSQL DB when someone publishes a piece of content. The client (NoSQL) DB provides very fast read access and records user interactions with the content. Then you have a scheduled job that pulls the data
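The row and column layouts on the "RDBMS vs NoSQL" slide can be reproduced with a short Python sketch. It is only an illustration of the two serialization orders, not how any particular database stores data on disk. The function names (`serialize_row_store`, `serialize_column_store`, `group_duplicates`, `total_salary`) are hypothetical, and the column meanings (employee id, last name, first name, salary) are an assumption read off the slide's sample values.

```python
# Sample records from the slide: (rowid, emp_id, last_name, first_name, salary).
# The column names are assumed; the slide shows only the values.
ROWS = [
    ("001", 10, "Smith", "Joe", 40000),
    ("002", 12, "Jones", "Mary", 50000),
    ("003", 11, "Johnson", "Cathy", 44000),
    ("004", 22, "Jones", "Bob", 55000),
]


def serialize_row_store(rows):
    """Row layout: all fields of one record are stored next to each other."""
    return "".join(
        f"{rowid}:{','.join(str(v) for v in values)};"
        for rowid, *values in rows
    )


def serialize_column_store(rows):
    """Column layout: all values of one column are stored next to each other.

    Returns one serialized string per column, each value tagged with its rowid.
    """
    rowids = [rowid for rowid, *_ in rows]
    columns = zip(*(values for _rowid, *values in rows))
    return [
        ",".join(f"{value}:{rowid}" for value, rowid in zip(column, rowids)) + ";"
        for column in columns
    ]


def group_duplicates(column, rowids):
    """Store each distinct value once, followed by every rowid that holds it."""
    grouped = {}
    for value, rowid in zip(column, rowids):
        grouped.setdefault(value, []).append(rowid)
    return ",".join(f"{v}:{','.join(ids)}" for v, ids in grouped.items()) + ";"


def total_salary(serialized_salary_column):
    """Aggregate over a single serialized column without reading the others."""
    pairs = serialized_salary_column.rstrip(";").split(",")
    return sum(int(pair.split(":")[0]) for pair in pairs)


if __name__ == "__main__":
    print(serialize_row_store(ROWS))
    for line in serialize_column_store(ROWS):
        print(line)
    last_names = [row[2] for row in ROWS]
    print(group_duplicates(last_names, [row[0] for row in ROWS]))
    print(total_salary(serialize_column_store(ROWS)[3]))
```

The `total_salary` helper reads only the salary column, which is the situation the Benefits slide describes as favoring column-oriented storage. Fetching one complete employee record, by contrast, touches a single contiguous entry in the row layout but one entry in every column of the column layout.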