Tools Used at Large Scale (PPT courseware)

Uploaded by: 豆**** | Document ID: 33633809 | Uploaded: 2022-08-11 | Format: PPT | Pages: 34 | Size: 1.55 MB

How to scale up a web service in the past?
Source: http:/

- Distributed storage
- Table-like data structure
  - Multi-dimensional map
- High scalability
- High availability
- High performance

Who uses HBase
- Adobe: internal use (structured data)
- Kalooga: image search engine http:/
- Meetup: community meetup site http:/
- Streamy: successfully migrated from MySQL to HBase http:/
- Trend Micro: cloud antivirus scanning architecture http:/
- Yahoo!: stores document fingerprints to avoid duplicates http:/
- More: http://wiki.apache.org/hadoop/Hbase/PoweredBy

Backdrop
- Started by Chad Walters and Jim
- 2006.11
  - Google releases paper on BigTable
- 2007.2
  - Initial HBase prototype created as Hadoop contrib.
- 2007.10
  - First usable HBase
- 2008.1
  - Hadoop becomes an Apache top-level project and HBase becomes a subproject
- 2008.10
  - HBase 0.18, 0.19 released

HBase Is Not
- Tables have one primary index, the row key.
- No join operators.
- Scans and queries can select a subset of available columns, perhaps by using a wildcard.
- There are three types of lookups:

  - Fast lookup using row key and optional timestamp.
  - Full table scan.
  - Range scan from region start to end.

HBase Is Not (2)
- Limited atomicity and transaction support.
  - HBase supports multiple batched mutations of single rows only.
  - Data is unstructured and untyped.
- Not accessed or manipulated via SQL.
  - Programmatic access via Java, REST, or Thrift APIs.
  - Scripting via JRuby.

Why Bigtable?
- RDBMS performance is good for transaction processing, but for very large scale analytic processing the available solutions are commercial, expensive, and specialized.
- Very large scale analytic processing
  - Big queries: typically range or table scans.
  - Big databases (100s of TB)

Why Bigtable? (2)
- MapReduce on Bigtable, optionally with Cascading on top to support some relational algebra, may be a cost-effective solution.
- Sharding is not a solution for scaling open source RDBMS platforms
  - Application specific
  - Labor-intensive (re)partitioning

Why HBase?
- HBase is a Bigtable clone.
- It is open source.
- It has a good community and promise for the future.
- It is developed on top of the Hadoop platform and integrates well with it, if you are using Hadoop already.
- It has a Cascading connector.

HBase benefits over RDBMS

- No real indexes
- Automatic partitioning
- Scales linearly and automatically with new nodes
- Commodity hardware
- Fault tolerance
- Batch processing

Data Model
- Tables are sorted by row.
- A table schema only defines its column families.
  - Each family consists of any number of columns.
  - Each column consists of any number of versions.
  - Columns only exist when inserted; NULLs are free.
  - Columns within a family are sorted and stored together.
- Everything except table names is a byte[].
- (Row, Family:Column, Timestamp) -> Value

[Figure: table layout labeled Row key, Column Family, value, TimeStamp]

Members
- Master
  - Responsible for monitoring region servers
  - Load balancing for regions
  - Redirects clients to the correct region servers
  - The current SPOF
- Region servers (slaves)
  - Serve client requests (write/read/scan)
  - Send heartbeats to the master
  - Throughput and the number of regions scale with the number of region servers

Regions
- A table consists of one or more regions.
  - A region is specified by its startKey and endKey.
- Each region may reside on several different nodes and consists of several HDFS files and blocks; Hadoop is responsible for replicating such regions.

Case study: a blog
- Logical data model
  - A blog entry consists of the fields title, date, author, type, and text.
  - A user consists of fields such as username and password.
  - Each blog entry can have many comments.
  - Each comment consists of title, author, and text.
- ERD

Blog HBase Table Schema
- Row key
  - Composed of the type (as a two-character abbreviation) and the timestamp.
  - Rows are therefore sorted first by type and then by timestamp, which makes the table convenient to read with scan().
- The one-to-many relationship between BLOGENTRY and COMMENT is represented by a dynamic number of columns in the comment_title, comment_author, and comment_text column families.
- Each column is named after the timestamp of its comment, so the columns in each column family are automatically sorted by time.

Architecture

ZooKeeper
- HBase depends on ZooKeeper (Chapter 13) and by default it manages a ZooKeeper instance as the authority on cluster state.

Operation
- The -ROOT- table holds the list of .META. table regions.
- The .META. table holds the list of all user-space regions.

Installation (1)
    $ wget http:/
    $ sudo tar -zxvf hbase-*.tar.gz -C /opt/
    $ sudo ln -sf /opt/hbase-0.20.3 /opt/hbase
    $ sudo chown -R $USER:$USER /opt/hbase
    $ sudo mkdir /var/hadoop/
    $ sudo chmod 777 /var/hadoop

Start Hadoop

Setup (1)
    $ vim /opt/hbase/conf/hbase-env.sh
      export JAVA_HOME=/usr/lib/jvm/java-6-sun
      export HADOOP_CONF_DIR=/opt/hadoop/conf
      export HBASE_HOME=/opt/hbase
      export HBASE_LOG_DIR=/var/hadoop/hbase-logs
      export HBASE_PID_DIR=/var/hadoop/hbase-pids
      export HBASE_MANAGES_ZK=true
      export HBASE_CLASSPATH=$HBASE_CLASSPATH:/opt/hadoop/conf
    $ cd /opt/hbase/conf
    $ cp /opt/hadoop/conf/core-site.xml ./
    $ cp /opt/hadoop/conf/hdfs-site.xml ./
    $ cp /opt/hadoop/conf/mapred-site.xml ./

Setup (2)
    Name                                   Value
    hbase.rootdir                          hdfs://secuse.nchc.org.tw:9000/hbase
    hbase.tmp.dir                          /var/hadoop/hbase-${user.name}
    hbase.cluster.distributed              true
    hbase.zookeeper.property.clientPort    2222
    hbase.zookeeper.quorum                 Host1, Host2
    hbase.zookeeper.property.dataDir       /var/hadoop/hbase-data

Startup & Stop
- Start/stop everything
    $ bin/start-hbase.sh
    $ bin/stop-hbase.sh
- Start/stop individual daemons
    $ bin/hbase-daemon.sh start/stop zookeeper
    $ bin/hbase-daemon.sh start/stop master
    $ bin/hbase-daemon.sh start/stop regionserver
    $ bin/hbase-daemon.sh start/stop thrift
    $ bin/hbase-daemon.sh start/stop rest

Testing (4)
    $ hbase shell
    > create 'test', 'data'
    0 row(s) in 4.3066 seconds

    > list
    test
    1 row(s) in 0.1485 seconds
    > put 'test', 'row1', 'data:1', 'value1'
    0 row(s) in 0.0454 seconds
    > put 'test', 'row2', 'data:2', 'value2'
    0 row(s) in 0.0035 seconds
    > put 'test', 'row3', 'data:3', 'value3'
    0 row(s) in 0.0090 seconds
    > scan 'test'
    ROW     COLUMN+CELL
    row1    column=data:1, timestamp=1240148026198, value=value1
    row2    column=data:2, timestamp=1240148040035, value=value2
    row3    column=data:3, timestamp=1240148047497, value=value3
    3 row(s) in 0.0825 seconds
    > disable 'test'
    09/04/19 06:40:13 INFO client.HBaseAdmin: Disabled test
    0 row(s) in 6.0426 seconds
    > drop 'test'
    09/04/19 06:40:17 INFO client.HBaseAdmin: Deleted test
    0 row(s) in 0.0210 seconds
    > list
    0 row(s) in 2.0645 seconds

Connecting to HBase
- Java client
  - get(byte[] row, byte[] column, long timestamp, int versions);
- Non-Java clients
  - Thrift server hosting an HBase client instance
    - Sample Ruby, C++, and Java (via Thrift) clients
  - REST server hosts an HBase client
- TableInput/OutputFormat for MapReduce
  - HBase as MR source or sink
- HBase Shell
  - JRuby IRB with a "DSL" that adds get, scan, and admin
  - ./bin/hbase shell YOUR_SCRIPT

Thrift
- A software framework for scalable cross-language services development.
- By Facebook
- Works seamlessly between C++, Java, Python, PHP, and Ruby.
- The start command below starts the server instance, by default on port 9090.
- The other similar project is "rest".
    $ hbase-daemon.sh start thrift
    $ hbase-daemon.sh stop thrift

References
- HBase introduction (HBase 介紹): http://www.wretch.cc/blog/trendnop09/21192672
- Hadoop: The Definitive Guide (book), by Tom White
- HBase Architecture 101: http:/
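
Sketch: data model and the three lookup types

The Data Model slide describes a table as a sorted map from (Row, Family:Column, Timestamp) to a value, and the HBase Is Not slide lists three lookups: by row key with an optional timestamp, full table scan, and range scan. The toy model below tries to make that concrete in plain Python. It is not the HBase client API: the class ToyTable, its method names, and the rule that a timestamp lookup returns the newest version at or before the given timestamp are all assumptions made for illustration.

```python
from bisect import bisect_left, insort


class ToyTable:
    """Toy in-memory model of one table (hypothetical, not the HBase API)."""

    def __init__(self, families):
        self.families = set(families)  # the schema only names column families
        self.row_keys = []             # row keys (bytes), kept sorted
        self.rows = {}                 # row -> {b"family:column" -> {timestamp -> value}}

    def put(self, row, column, timestamp, value):
        family = column.split(b":", 1)[0].decode()
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if row not in self.rows:
            insort(self.row_keys, row)  # keep rows sorted by key
            self.rows[row] = {}
        # A column exists only once something has been inserted into it.
        self.rows[row].setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Row-key lookup; newest version at or before `timestamp` (latest if None)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start=None, stop=None):
        """Full scan when both bounds are None, else a range scan over [start, stop)."""
        lo = 0 if start is None else bisect_left(self.row_keys, start)
        hi = len(self.row_keys) if stop is None else bisect_left(self.row_keys, stop)
        for key in self.row_keys[lo:hi]:
            yield key, self.rows[key]


# Same rows as the shell session in the Testing slide, plus one extra version.
table = ToyTable(["data"])
table.put(b"row2", b"data:2", 1240148040035, b"value2")
table.put(b"row1", b"data:1", 1240148026198, b"value1")
table.put(b"row3", b"data:3", 1240148047497, b"value3")
table.put(b"row1", b"data:1", 1240148099999, b"value1-new")
```

Because the row keys are kept sorted, a range scan reduces to two binary searches and a slice, which is the property the slides rely on when they recommend designing row keys around scan().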
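
Sketch: finding the region for a row key

The Regions slide says each region is specified by its startKey and endKey, and the Members slide says clients are redirected to the correct region server. One way to picture that lookup is a binary search over sorted start keys. The region boundaries and server names below are invented for the example, and the function locate_region is hypothetical; the real lookup goes through the -ROOT- and .META. tables mentioned in the Operation slide.

```python
from bisect import bisect_right

# Hypothetical regions: each covers [start_key, next region's start_key).
# The empty start key marks the first region; the last region is open-ended.
REGIONS = [
    (b"", "regionserver-1"),
    (b"g", "regionserver-2"),
    (b"p", "regionserver-1"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate_region(row_key):
    """Return (start_key, server) for the region whose key range holds row_key."""
    # The rightmost region whose start key is <= row_key contains the row.
    index = bisect_right(START_KEYS, row_key) - 1
    return REGIONS[index]
```

For instance, locate_region(b"hbase") falls in the region that starts at b"g", since b"g" <= b"hbase" < b"p".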
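
Sketch: the blog row key

The blog case study builds each row key from a two-character type abbreviation followed by a timestamp, so rows sort by type and then by time. A minimal sketch of such a key follows. The slides do not say how the timestamp is encoded, so the zero-padded 13-digit decimal form, the type codes "TE" and "NW", and the function name make_row_key are assumptions.

```python
def make_row_key(entry_type, timestamp_ms):
    """Build a row key from a two-character type code and a millisecond timestamp."""
    if len(entry_type) != 2:
        raise ValueError("type abbreviation must be exactly two characters")
    # Zero-pad so that byte-wise (lexicographic) order matches numeric order.
    return f"{entry_type}{timestamp_ms:013d}".encode("ascii")


# Keys sort by type first, then by time within each type.
keys = sorted([
    make_row_key("TE", 1240148047497),
    make_row_key("NW", 1240148040035),
    make_row_key("TE", 1240148026198),
])

# All entries of one type form a contiguous run, so a prefix selects them.
tech_keys = [k for k in keys if k.startswith(b"TE")]
```

Without the padding, a shorter timestamp string could sort after a longer one, and the time ordering within a type would no longer hold.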
