Software and Hardware System Programming, PPT (6)
What is Big Data

"Big data is something that cannot be done on the basis of small-scale data, but can only be done on the basis of large-scale data."
- Viktor Mayer-Schönberger

Predicting the trend of ticket prices with big data

Going on vacation. "The earlier you book a ticket, the cheaper it will be." More expensive!

The future trend of flight ticket prices can be predicted if enough historical ticket price data is collected.
- Oren Etzioni

Why study big data

More and more data has to be processed, and the amount is beyond the processing and storage capacity of ordinary personal computers, so new data processing technologies have to be studied. This massive data is big data.

Technologies: MapReduce and Hadoop

Characteristics of big data: the 4 Vs (Volume, Variety, Value, Velocity)

Volume: The amount of data is huge. The data volume of big data is usually measured in PB, EB, or even ZB and YB. 1 PB = 1024 TB = 1024 * 1024 GB.

Variety: The sources are diverse, and the types and structures are complex.

Velocity: Data grows and is processed quickly, and the requirement for timeliness is high.

Value: The amount of data is large, but the value density is low. How to mine the value of data is the most important problem to be solved in the era of big data.

The sources of big data: the big data flood

The structure of big data has many forms

Structured data: data with a fixed format and limited length.

StudentID    StudentName    StudentGender    StudentAge
201700110    张琳            Female           18
201700111    李一平          Male             19

Semi-structured and unstructured data: the format is irregular, or it is a recognizable format that can be regularized or identified by tools.

Semi-structured data, e.g. XML, web click stream data, etc.

Unstructured data, e.g. text, PDF, image, video, etc.

Big data growth is increasingly unstructured

This challenges traditional data analysis methods, many of which target structured data. Various analysis techniques are therefore needed to deal with semi-structured and unstructured data. These are the new opportunities and challenges brought by big data.