HADOOP Installation Tutorial (2022)
This tutorial uses Ubuntu as the example.

I. System preparation (Ubuntu)

1. hostname

During installation, give every host the same user name, for example cuit, and assign hostnames according to a consistent pattern. On a host that is already installed (Ubuntu, for example), the hostname can be changed in /etc/hostname.

Disable the firewall:

    sudo ufw disable

2. IP settings

Each host must have a network interface on the same LAN, so that all hosts can ping one another.

3. hosts

Edit the hosts file:

    vi /etc/hosts

Add the following content:

    127.0.0.1 localhost

    # The following lines are desirable for IPv6 capable hosts
    ::1 localhost ip6-localhost ip6-loopback
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
    192.168.2.111 cuit-A01
    192.168.2.112 cuit-A02
    192.168.2.113 cuit-A03
    192.168.2.114 cuit-A04
    192.168.2.115 cuit-B01
    192.168.2.116 cuit-B02
    192.168.2.117 cuit-B03
    192.168.2.118 cuit-B04

4. Passwordless SSH login to the nodes

This step lets the Master node log in to the Slave nodes over SSH without a password. First generate the Master's key pair by running the following in a terminal on the Master node:

    cd ~/.ssh             # if this directory does not exist, create it first
    ssh-keygen -t rsa     # keep pressing Enter; the generated key is saved as .ssh/id_rsa

The Master node must also be able to SSH into itself without a password. This step is still run on the Master node:

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

On cuit-A01, create the .ssh directory under the cuit user:

    mkdir ./.ssh

When this is done, verify it with ssh cuit-A01. Next, transfer the public key to node cuit-2:

    sudo scp ~/.ssh/authorized_keys cuit@cuit-A01:/home/cuit/.ssh/
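The copy step above has to be repeated for every slave node. As a rough sketch only, the loop below prints (rather than runs) one scp command per node, so the list can be reviewed before anything is copied. The function name print_key_copy_commands and the node list passed to it are assumptions made for this illustration; the user name cuit and the paths follow the tutorial.

```shell
#!/bin/sh
# Sketch: print one scp command per target node (dry run, nothing is copied).
# print_key_copy_commands is a hypothetical helper, not part of the original steps.
print_key_copy_commands() {
    user="$1"
    shift
    for node in "$@"; do
        # Each printed line would copy the Master's authorized_keys into the node's ~/.ssh/
        printf 'scp ~/.ssh/authorized_keys %s@%s:/home/%s/.ssh/\n' "$user" "$node" "$user"
    done
}

# Example: a few slave hostnames taken from the hosts file.
print_key_copy_commands cuit cuit-A02 cuit-A03 cuit-A04
```

Reviewing the printed commands first keeps a typo in a hostname from turning into several failed password prompts. Where ssh-copy-id is installed, it is a common alternative to copying authorized_keys by hand.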
When scp runs, it prompts for the password of user cuit on cuit-A01 (for example 123456); once the password is entered, scp reports that the transfer is complete. Repeat the step for the other nodes (cuit-3, cuit-4, cuit-5), copying cuit-A01's public-key file authorized_keys into the .ssh directory on each of them. After that, cuit-A01 can SSH into every node without a password:

    ssh cuit-2

II. Configure the Java environment

1. Create a new directory under /usr/local:

    sudo mkdir /usr/local/java

2. Change the owner of the directory:

    sudo chown cuit:cuit /usr/local/java

3. Copy the Java archive into /usr/local/java and extract it:

    cp /download/jdk-7u71-linux-x64.tar.gz /usr/local/java
    cd /usr/local/java
    tar -zxvf jdk-7u71-linux-x64.tar.gz

4. Add environment variables

Edit the profile:

    sudo vi /etc/profile
Add the following lines:

    JAVA_HOME=/usr/local/java/jdk1.7.0_71
    #PATH=$JAVA_HOME/bin:$PATH
    PATH=$JAVA_HOME/bin:$PATH
    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export JAVA_HOME
    export PATH
    export CLASSPATH

Apply the changes:

    source /etc/profile

5. Verify the Java environment

    cuit@cuit-A01:/download$ java -version
    java version "1.7.0_71"
    Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
    Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)

III. Configure the Scala environment

1. Create a new directory under /usr/local:

    sudo mkdir /usr/local/scala
2. Change the owner of the directory:

    sudo chown cuit:cuit /usr/local/scala

3. Copy the Scala archive into /usr/local/scala and extract it:

    cp /download/scala-2.10.4.tgz /usr/local/scala
    cd /usr/local/scala
    tar -zxvf scala-2.10.4.tgz

4. Add environment variables

Edit the profile (/etc/profile) and add:

    SCALA_HOME=/usr/local/scala/scala-2.10.4
    PATH=$PATH:$SCALA_HOME/bin/
    export SCALA_HOME
    export PATH

Apply the changes:

    source /etc/profile

5. Verify the Scala environment

    cuit@cuit-A01:/download$ scala -version
    Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL

IV. Configure Hadoop

1. Create a new directory under /usr/local:

    sudo mkdir /usr/local/hadoop

2. Change the owner of the directory:

    sudo chown cuit:cuit /usr/local/hadoop

3. Copy the Hadoop archive into /usr/local/hadoop and extract it:

    cp /download/hadoop-2.5.0-cdh5.2.0.tar.gz /usr/local/hadoop
    cd /usr/local/hadoop
    tar -zxvf hadoop-2.5.0-cdh5.2.0.tar.gz

4. Edit the configuration files

(1) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/hadoop-env.sh

    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/hadoop-env.sh

Change the JAVA_HOME setting as follows:

    #export JAVA_HOME=$JAVA_HOME
    export JAVA_HOME=/usr/local/java/jdk1.7.0_71

(2) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/slaves

    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/slaves

Change the content to:

    cuit-A01
    cuit-A02
    cuit-A03
    cuit-A04
    cuit-B01
    cuit-B02
    cuit-B03
    cuit-B04

(3) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/yarn-env.sh
    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/yarn-env.sh

Change the JAVA_HOME setting as follows:

    # some Java parameters
    # export JAVA_HOME=/home/y/libexec/jdk1.6.0/
    export JAVA_HOME=/usr/local/java/jdk1.7.0_71

(4) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/core-site.xml

    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/core-site.xml

Change the content to:

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cuit-A01:9000</value>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
      </property>
    </configuration>

(5) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/hdfs-site.xml

    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/hdfs-site.xml

Change the content to:

    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>cuit-A01:9001</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

(6) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/mapred-site.xml

    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/mapred-site.xml

Change the content to:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>
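Each of the *-site.xml files follows the same pattern: one configuration element that contains property elements, each with a name and a value. The sketch below is an illustration, not part of the original procedure. It prints such a file from name/value pairs. The helper names hadoop_property and hadoop_site_xml are made up for this example, and the values in the example call are the mapred-site.xml settings from step (6).

```shell
#!/bin/sh
# Sketch: emit a Hadoop *-site.xml document from name/value pairs.
# hadoop_property and hadoop_site_xml are hypothetical helper names.
# Assumption: names and values contain no characters that need XML escaping (&, <, >).

# Print one <property> element for the given name ($1) and value ($2).
hadoop_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a complete <configuration> document; arguments are name value name value ...
hadoop_site_xml() {
    printf '<configuration>\n'
    while [ "$#" -ge 2 ]; do
        hadoop_property "$1" "$2"
        shift 2
    done
    printf '</configuration>\n'
}

# Example: the mapred-site.xml content from step (6), written to stdout.
hadoop_site_xml mapreduce.framework.name yarn
```

Redirecting the output to a file under etc/hadoop would replace hand editing with vi. Writing to stdout first makes it easy to compare the result with the existing file before overwriting it.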
(7) Edit /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/yarn-site.xml

    vi /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop/yarn-site.xml

Change the content to:

    <configuration>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>cuit-A01</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>

(8) Add environment variables

Edit the profile (/etc/profile) and add:

    export HADOOP_PREFIX=/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0
    export PATH=$PATH:$HADOOP_PREFIX/bin
    export PATH=$PATH:$HADOOP_PREFIX/sbin
    export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
    export HADOOP_COMMON_HOME=$HADOOP_PREFIX
    export HADOOP_HDFS_HOME=$HADOOP_PREFIX
    export YARN_HOME=$HADOOP_PREFIX
    export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
    export HDFS_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
    export YARN_CONF_DIR=$HADOOP_PREFIX/etc/hadoop

Apply the changes:

    source /etc/profile

5. Start the Hadoop cluster

    cd /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0
    bin/hdfs namenode -format    # initialization, required on the first run only
    sbin/start-dfs.sh
    sbin/start-yarn.sh

The jps command shows which processes have started on each node.
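The jps check can also be scripted. The following is only a sketch: missing_daemons is a hypothetical helper that takes jps-style output (lines of "PID Name") plus a list of expected process names, and prints the names that are absent. The names in the example are the standard Hadoop daemon names for a node that runs the HDFS and YARN master roles. The sample text is canned, not output from a real cluster.

```shell
#!/bin/sh
# Sketch: report which expected Hadoop daemons are missing from jps output.
# missing_daemons is a hypothetical helper; $1 is jps-style text, the rest are names.
missing_daemons() {
    jps_output="$1"
    shift
    for name in "$@"; do
        # jps prints "PID Name"; compare the second field exactly so that
        # NameNode is not satisfied by a SecondaryNameNode line.
        if ! printf '%s\n' "$jps_output" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}

# Example with canned output; on a real node pass "$(jps)" instead.
sample='2101 NameNode
2250 DataNode
2466 SecondaryNameNode
2890 Jps'
missing_daemons "$sample" NameNode DataNode SecondaryNameNode ResourceManager
```

With the canned sample, the only name printed is ResourceManager, which would indicate that start-yarn.sh has not been run or that the ResourceManager failed to start. An empty result means every expected daemon was found.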
Checking the Hadoop processes on the Master host with jps shows that the Master node has started the NameNode, DataNode, SecondaryNameNode and ResourceManager processes.

Checking the Hadoop processes on a Slave with jps shows that the Slave node has started the DataNode and NodeManager processes.

You can also run the following command on the Master node to check whether the DataNodes started correctly:

    bin/hdfs dfsadmin -report

In the author's setup, for example, the report lists 1 DataNode in total.

Opening master_IP:8088 in a browser shows the status of all nodes.

To shut down the cluster:

    sbin/stop-dfs.sh
    sbin/stop-yarn.sh

6. Run a small program

(1) Upload files

Create the /user/cuit directory on HDFS (this is the current HDFS user's directory; it should correspond to the user name on the master host).

    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0$ bin/hadoop fs -mkdir /user
    15/06/21 15:05:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0$ bin/hadoop fs -ls /
    15/06/21 15:05:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 4 items
    drwxr-xr-x   - cuit supergroup          0 2015-06-21 14:59 /input
    -rw-r--r--   1 cuit supergroup   40650602 2015-06-20 22:21 /testfile
    drwx------   - cuit supergroup          0 2015-06-21 15:00 /tmp
    drwxr-xr-x   - cuit supergroup          0 2015-06-21 15:05 /user
    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0$ bin/hadoop fs -mkdir /user/cuit
    15/06/21 15:05:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0$ bin/hadoop fs -ls /
    15/06/21 15:05:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 4 items
    drwxr-xr-x   - cuit supergroup          0 2015-06-21 14:59 /input
    -rw-r--r--   1 cuit supergroup   40650602 2015-06-20 22:21 /testfile
    drwx------   - cuit supergroup          0 2015-06-21 15:00 /tmp
    drwxr-xr-x   - cuit supergroup          0 2015-06-21 15:05 /user
    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0$ bin/hadoop fs -cp /input/* /user/cuit
    15/06/21 15:05:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0$ bin/hadoop fs -ls /user/cuit
    15/06/21 15:06:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 2 items
    -rw-r--r--   1 cuit supergroup       6038 2015-06-21 15:05 /user/cuit/stop_use.txt
    -rw-r--r--   1 cuit supergroup   40650602 2015-06-21 15:05 /user/cuit/testfile.txt

(2) Compile the code

    package org.myorg;

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.net.URI;
    import java.util.HashSet;
    import java.util.Set;
    import java.io.IOException;
    import java.util.regex.Pattern;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.util.StringUtils;

    import org.apache.log4j.Logger;

    public class WordCount extends Configured implements Tool {

      private static final Logger LOG = Logger.getLogger(WordCount.class);

      public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new WordCount(), args);
        System.exit(res);
      }

      // Configures and submits the job: args[0] is the input path, args[1] the output path.
      public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "wordcount");
        for (int i = 0; i < args.length; i += 1) {
          if ("-skip".equals(args[i])) {
            job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
            i += 1;
            job.addCacheFile(new Path(args[i]).toUri());
            // this demonstrates logging
            LOG.info("Added file to the distributed cache: " + args[i]);
          }
        }
        job.setJarByClass(this.getClass());
        // Use TextInputFormat, the default unless job.setInputFormatClass is used
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
        private boolean caseSensitive = false;
        private long numRecords = 0;
        private String input;
        private Set<String> patternsToSkip = new HashSet<String>();
        private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

        // Reads the job settings and, if requested, loads the skip patterns.
        protected void setup(Mapper.Context context)
            throws IOException,
            InterruptedException {
          if (context.getInputSplit() instanceof FileSplit) {
            this.input = ((FileSplit) context.getInputSplit()).getPath().toString();
          } else {
            this.input = context.getInputSplit().toString();
          }
          Configuration config = context.getConfiguration();
          this.caseSensitive = config.getBoolean("wordcount.case.sensitive", false);
          if (config.getBoolean("wordcount.skip.patterns", false)) {
            URI[] localPaths = context.getCacheFiles();
            parseSkipFile(localPaths[0]);
          }
        }

        // Loads one pattern per line from the cached skip file.
        private void parseSkipFile(URI patternsURI) {
          LOG.info("Added file to the distributed cache: " + patternsURI);
          try {
            BufferedReader fis = new BufferedReader(new FileReader(new File(patternsURI.getPath()).getName()));
            String pattern;
            while ((pattern = fis.readLine()) != null) {
              patternsToSkip.add(pattern);
            }
          } catch (IOException ioe) {
            System.err.println("Caught exception while parsing the cached file '"
                + patternsURI + "' : " + StringUtils.stringifyException(ioe));
          }
        }

        // Emits (word, 1) for every word in the line that is not in the skip set.
        public void map(LongWritable offset, Text lineText, Context context)
            throws IOException, InterruptedException {
          String line = lineText.toString();
          if (!caseSensitive) {
            line = line.toLowerCase();
          }
          Text currentWord = new Text();
          for (String word : WORD_BOUNDARY.split(line)) {
            if (word.isEmpty() || patternsToSkip.contains(word)) {
              continue;
            }
            currentWord = new Text(word);
            context.write(currentWord, one);
          }
        }
      }

      public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Sums the counts emitted for each word.
        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable count : counts) {
            sum += count.get();
          }
          context.write(word, new IntWritable(sum));
        }
      }
    }

How to compile and package

    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/test$ vi wordcount.java
    cuit@cuit-A01:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/test$ javac -cp /usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/etc/hadoop:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/common/lib/*:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/common/*:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/hdfs:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/hdfs/*:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/yarn/lib/*:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/yarn/*:/usr/local/hadoop/hadoop-2.5.0-cdh5.2.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/hadoop-2.5.0