hadoop集群搭建
hadoop集群搭建
服务器规划
IP | 域名 | 系统版本号 |
---|---|---|
10.10.x.70 | zk.had01 | Linux version 5.4.0-70-generic (buildd@lcy01-amd64-004) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021 |
10.10.x.8 | zk.had02 | Linux version 5.4.0-70-generic (buildd@lcy01-amd64-004) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021 |
10.187.x.120 | zk.had03 | Linux version 3.10.0-957.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Thu Nov 8 23:39:32 UTC 2018 |
环境要求
- JDK8
下载软件并安装
Note:
本文使用的版本号为hadoop-2.10.1.tar.gz
存放地址为/opt
host文件
将本地回环地址改成内网地址(hbase需要), 将规划的域名zk.had01, zk.had02, zk.had03的配置如下:
10.10.27.70机器的host配置:
root@GT70:~# cat /etc/hosts
#127.0.0.1 localhost
10.10.27.70 GT70
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.10.27.70 zk.had01
10.10.27.8 zk.had02
10.187.100.120 zk.had03
10.10.27.8机器的host配置:
hadoop@root-RiskCtrl:/opt/hbase234/logs$ cat /etc/hosts
10.10.27.8 root-RiskCtrl
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.10.27.70 zk.had01 GT70 master
10.10.27.8 zk.had02 root-RiskCtrl
10.187.100.120 zk.had03 selenium-test
10.187.100.120机器的host配置:
[hadoop@selenium-test hadoop]$ cat /etc/hosts
10.187.100.120 selenium-test
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.5.28.240 riskmg-dev.gtland.cn
10.5.28.240 authcctr-dev.gtland.cn
10.10.27.70 zk.had01
10.10.27.8 zk.had02 root-RiskCtrl
10.187.100.120 zk.had03
10.10.27.70 GT70 master
用户及用户组
用户组、用户、权限规划
用户组 | 用户 | 权限 |
---|---|---|
hadoop | hadoop | /opt/hadoop-2.10-1 /opt/hbase234 |
新建用户组hadoop
groupadd hadoop
新建hadoop用户
useradd hadoop -g hadoop -m -s /bin/bash
给已创建的用户hadoop设置密码为hadoop
passwd hadoop
修改目录所属用户
chown -R hadoop.hadoop /opt/hadoop-2.10.1
chown -R hadoop.hadoop /opt/hbase234
切换用户hadoop
su - hadoop
Note:
zk.had01, zk.had02, zk.had03 三台机上面都要创建.并切换成hadoop用户.
免密登录
方式一:
ssh-keygen
ssh-copy-id -p 22612 hadoop@zk.had01
ssh-copy-id -p 22612 hadoop@zk.had02
ssh-copy-id -p 22612 hadoop@zk.had03
方式二:
将本机的公钥文件上传到目标服务器上,然后在服务器需要免密登录的用户家目录下查看是否有 ~/.ssh/authorized_keys这个文件,
如果没有手动创建一个:
touch ~/.ssh/authorized_keys
然后我们将公钥内容写入到authorized_keys文件中,因为这个文件可能已经有内容了,所以你可以使用如下方式
cat -n ~/.ssh/rsa.pub ~/.ssh/authorized_keys
这样就将公钥内容追加到authorized_keys中了,然后需要注意配置权限了,否则SSH不会工作的.
将.ssh目录的权限为700
将authorized_keys目录的权限为600
设置别名:
vi ~/.ssh/config
内容如下:
host hadoop1
user hadoop
hostname 10.10.27.70
port 22612
host hadoop2
user hadoop
hostname 10.10.27.8
port 22612
host hadoop3
user hadoop
hostname 10.187.100.120
port 22612
修改配置文件
- hadoop-env.sh
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml
- slaves
修改hadoop-env.sh
改其中的JAVA_HOME为我们安装jdk的路径
JAVA_HOME=/usr/local/jdk
修改core-site.xml文件
/opt/hadoop-2.10.1/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://zk.had01:9000</value>
</property>
<!-- 指定hadoop临时目录 -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-2.10.1/tmp</value>
</property>
<!-- 指定zookeeper地址 -->
<property>
<name>ha.zookeeper.quorum</name>
<value>zk.had01:2181,zk.had02:2181,zk.had03:2181</value>
</property>
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>3000</value>
</property>
</configuration>
修改hdfs-site.xml
/opt/hadoop-2.10.1/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- configuration for NameNode:-->
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop-2.10.1/name</value>
</property>
<!-- 块大小 -->
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
<!-- configuration for DateNode:-->
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop-2.10.1/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
修改mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
修改yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>zk.had01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
修改slaves文件:
这里配置的是ssh的别名
hadoop2
hadoop3
配置hadoop环境变量
方便执行hadoop相关的命令.
vi /etc/profile
内容如下:
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
HADOOP_HOME=/opt/hadoop-2.10.1
PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export JAVA_HOME
export HADOOP_HOME
export PATH
Note:
zk.had01, zk.had02, zk.had03 也要配置.
分发
将zk.had01的/opt/hadoop-2.10.1分发到zk.had02, zk.had03.
scp -r /opt/hadoop-2.10.1 hadoop@zk.had02:/tmp
scp -r /opt/hadoop-2.10.1 hadoop@zk.had03:/tmp
分别登陆zk.had02, zk.had03,移动hadoop-2.10.1至/opt下
mv /tmp/hadoop-2.10.1 /opt
启动hadoop集群
格式化一个新的分布式文件系统:
cd /opt/hadoop-2.10.1/bin
hadoop namenode -format
启动:
在zk.had01上启动:
cd /opt/hadoop-2.10.1/sbin
./start-all.sh
巨人的肩膀
本博客所有文章除特别声明外,均采用 CC BY-SA 4.0 协议 ,转载请注明出处!