Introduction to Big Data, English slide deck (大数据介绍英文方案ppt课件.ppt)

Resource ID: 29416933    Size: 1.50 MB    Pages: 33
Format: PPT

BIG DATA

EVERY MINUTE
  • Didi rides hailed: 1,388 cabs and 2,777 private cars
  • 395,833 people log in to WeChat; 194,444 people are video or audio chatting
  • 625,000 Youku Tudou videos are being watched
  • 64,814 posts and reposts on Weibo
  • Search: 4,166,667 search queries
  • 774 people buy something on Alibaba's marketplaces; US$1,133,942 is spent on Alibaba

CONTENTS
  1 Definition
  2 Characteristic
  3 NoSQL
  4 RDBMS
  5 MapReduce
  6 Applications

1 Definition

BIG DATA
  • volume of data
  • important data
  • on a day-to-day basis
  • for better decisions

2 Characteristic

  • Volume: the quantity of generated and stored data.
  • Variety: the type and nature of the data.
  • Velocity: the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.
  • Variability: inconsistency of the data set can hamper processes to handle and manage it.
  • Veracity: the quality of captured data can vary greatly, affecting accurate analysis.

3 NoSQL

  • NoSQL here refers to document-oriented databases.
  • SQL databases do not scale well horizontally.
  • NoSQL is schemaless, but not formless (JSON format).
  • JSON: a data interchange format.
  • Examples: MongoDB, CouchDB.

BASE Model
  • Basic Availability: spread data across many storage systems with a high degree of replication.
  • Soft State: data consistency is the developer's problem and should not be handled by the database.
  • Eventual Consistency: at some point in the future, data will converge to a consistent state.
No guarantees are made as to when.

JSON Structure

    {
      field1: value1,
      field2: value2,
      ...
      fieldN: valueN
    }

    var mydoc = {
      _id: ObjectId("5099803df3f4948bd2f98391"),
      name: { first: "Alan", last: "Turing" },
      birth: new Date('Jun 23, 1912'),
      death: new Date('Jun 07, 1954'),
      contribs: [ "Turing machine", "Turing test" ],
      views: NumberLong(1250000)
    }

RDBMS vs NoSQL

Row DB:
    001:10,Smith,Joe,40000;
    002:12,Jones,Mary,50000;
    003:11,Johnson,Cathy,44000;
    004:22,Jones,Bob,55000;
    index: 001:40000;002:50000;003:44000;004:55000;

Column DB:
    10:001,12:002,11:003,22:004;
    Smith:001,Jones:002,Johnson:003,Jones:004;
    Joe:001,Mary:002,Cathy:003,Bob:004;
    40000:001,50000 ;
  With repeated values merged:
    Smith:001,Jones:002,004,Johnson:003;

Benefits
  • Column-oriented organizations are more efficient when an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns, because reading that smaller subset of data can be faster than reading all data.
  • Column-oriented organizations are more efficient when new values of a column are supplied for all rows at once, because that column data can be written efficiently and replace old column data without touching any other columns for the rows.
  • Row-oriented organizations are more efficient when many columns of a single row are required at the same time and the row size is relatively small, as the entire row can be retrieved with a single disk seek.
  • Row-oriented organizations are more efficient when writing a new row if all of the column data is supplied at the same time, as the entire row can be written with a single disk seek.

SQL vs Non SQL

A good compromise is to design your system with 3 logical DBs:
  1. A normal SQL DB used by your admin application to create content.
  2. A NoSQL DB for the front-end/public/high-volume application used by the public internet.
  3. A third DB for the analytical reporting system, using OLAP cubes and similar tools.
Data then flows from the admin DB to the client NoSQL DB when someone publishes a piece of content; the client (NoSQL) DB provides very fast read access and records user interactions with the content. A scheduled job then pulls the data from the client DB into the reporting system. Since admin, client, and reporting are often separate apps, each application team can work with data in the format that best serves its application, and the transition from one system to the other is handled in the service layers.

4 RDBMS

  • Fixed-schema, row-oriented databases with ACID properties and a sophisticated SQL query engine.
  • The emphasis is on strong consistency, referential integrity, abstraction from the physical layer, and complex queries through the SQL language.
  • You can easily create secondary indexes, perform complex inner and outer joins, and count, sum, sort, group, and page your data across a number of tables, rows, and columns.

5 MapReduce

  • Dividing and conquering
  • Highly fault tolerant
  • Every data block replicated on 3 nodes
  • Difficult to implement

Comparison

                RDBMS                    MapReduce
  Data size     GB                       PB
  Access        Interactive and batch    Batch
  Updates       Read/write many times    Write once, read many times
  Structure     Static schema            Dynamic schema
  Integrity     High (ACID)              Low
  Scaling       Nonlinear                Linear
  DBA ratio     1:40                     1:3000

How does MapReduce work

Two steps: Map and Reduce.
  • Map: MapReduce uses key/value pairs (traditionally, rows and columns are used).
  • Reduce: all the intermediate values for a given output key are combined together into a list.
  • Reduce: the reduce function then combines the intermediate values into one or more final values for the same key.

6 Application

Government

The use and adoption of big data within governmental processes is beneficial and allows efficiencies in terms of cost, productivity, and innovation, but does not come without its flaws.
Data analysis often requires multiple parts of government (central and local) to work in collaboration and create new and innovative processes to deliver the desired outcome. Below are the thought-leading examples within the governmental big data space.

Healthcare

Big data analytics has helped healthcare improve by providing personalized medicine and prescriptive analytics, clinical risk intervention and predictive analytics, waste and care variability reduction, automated external and internal reporting of patient data, standardized medical terms and patient registries, and fragmented point solutions.

Education

A McKinsey Global Institute study found a shortage of 1.5 million highly trained data professionals and managers, and a number of universities, including the University of Tennessee and UC Berkeley, have created master's programs to meet this demand. Private bootcamps have also developed programs to meet that demand, including free programs like The Data Incubator and paid programs like General Assembly.

Internet of Things

Big Data and the IoT work in conjunction. From a media perspective, data is the key derivative of device inter-connectivity and allows accurate targeting. The Internet of Things, with the help of big data, therefore transforms the media industry, companies, and even governments, opening up a new era of economic growth and competitiveness. The intersection of people, data, and intelligent algorithms has far-reaching impacts on media efficiency. The wealth of data generated allows an elaborate layer on top of the industry's present targeting mechanisms.

Sports

Big data can be used to improve training and to understand competitors, using sport sensors. It is also possible to predict winners in a match using big data analytics. Future performance of players could be predicted as well.
Thus, players' value and salary are determined by data collected throughout the season.

THANKS

Comparison (data size units)
  1 KB = 2^10 B  = 1024 B
  1 MB = 2^10 KB = 1024 KB
  1 GB = 2^10 MB = 1024 MB
  1 TB = 2^10 GB = 1024 GB
  1 PB = 2^10 TB = 1024 TB
  1 EB = 2^10 PB = 1024 PB
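
The unit ladder above, where each unit is 2^10 times the previous one, can be sketched in a few lines of Python. The function names `to_bytes` and `humanize` are illustrative and not part of the deck; this is a minimal sketch assuming the binary (1024-based) convention the slide uses.

```python
# Units in ascending order; each step is a factor of 2**10 = 1024,
# matching the binary convention used on the slide.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def to_bytes(value: float, unit: str) -> int:
    """Convert a value in the given unit to a number of bytes."""
    exponent = UNITS.index(unit.upper())
    return int(value * (2 ** 10) ** exponent)


def humanize(num_bytes: int) -> str:
    """Format a byte count using the largest unit that keeps the value below 1024."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} {UNITS[-1]}"  # not reached; kept for clarity


if __name__ == "__main__":
    # The GB-versus-PB gap from the RDBMS/MapReduce comparison is 2**20.
    print(to_bytes(1, "PB") // to_bytes(1, "GB"))
    print(humanize(1536))
```

Read this way, the "GB versus PB" row of the RDBMS/MapReduce comparison is a difference of two steps on the ladder, a factor of 2^20.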
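
Going back to the Map and Reduce steps in section 5, the flow of key/value pairs can be illustrated with a single-process word count. This is only a sketch of the idea: real MapReduce frameworks distribute these phases across many nodes, and the function names below (`map_words`, `shuffle`, `reduce_counts`, `run_mapreduce`) are hypothetical, not taken from the deck or from any framework.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (word, 1) key/value pair per word in the document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Combine all intermediate values for a given output key into a list."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: combine the intermediate values into one final value per key."""
    return word, sum(counts)


def run_mapreduce(documents: Dict[str, str]) -> Dict[str, int]:
    """Run map, grouping, and reduce in sequence on an in-memory dict of documents."""
    intermediate = [
        pair
        for doc_id, text in documents.items()
        for pair in map_words(doc_id, text)
    ]
    grouped = shuffle(intermediate)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    docs = {"d1": "big data big", "d2": "data"}
    print(run_mapreduce(docs))
```

Because the map calls are independent of one another and each key is reduced separately, both phases could in principle run in parallel, which is the "dividing and conquering" point made on the slide.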
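
The row and column layouts shown in section 3 can also be compared in code. The sketch below reuses the four employee records from the Row DB example; the field names (`rowid`, `empid`, `lastname`, `firstname`, `salary`) are assumptions, since the deck does not name the columns, and the in-memory lists only mimic the layouts rather than real on-disk storage.

```python
from typing import Dict, List, Tuple

# Assumed field names; the slide lists only the values.
FIELDS = ("rowid", "empid", "lastname", "firstname", "salary")

# Row-oriented layout: every record is stored together.
ROWS: List[Tuple[int, int, str, str, int]] = [
    (1, 10, "Smith", "Joe", 40000),
    (2, 12, "Jones", "Mary", 50000),
    (3, 11, "Johnson", "Cathy", 44000),
    (4, 22, "Jones", "Bob", 55000),
]


def to_columns(rows: List[tuple]) -> Dict[str, list]:
    """Column-oriented layout: one list per field, aligned by position."""
    return {name: [row[i] for row in rows] for i, name in enumerate(FIELDS)}


def group_row_ids(values: list, row_ids: List[int]) -> Dict[object, List[int]]:
    """Merge repeated values so each distinct value lists all of its row ids."""
    merged: Dict[object, List[int]] = {}
    for value, row_id in zip(values, row_ids):
        merged.setdefault(value, []).append(row_id)
    return merged


if __name__ == "__main__":
    columns = to_columns(ROWS)
    # An aggregate over one column touches only that column's data.
    print(sum(columns["salary"]))
    # Fetching a whole record is a single lookup in the row layout.
    print(ROWS[1])
    # Repeated values share one entry, as in "Jones:002,004" on the slide.
    print(group_row_ids(columns["lastname"], columns["rowid"]))
```

The aggregate reads one list in the column layout but would have to visit every record in the row layout, while fetching one complete record is the reverse, which mirrors the trade-offs listed under Benefits.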

This document (大数据介绍英文方案ppt课件.ppt) was uploaded by site member 飞****2.
