Introduction to Hue


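Hue is an open-source web UI for Apache Hadoop.
• With Hue, you can interact with a Hadoop cluster from a web console in the browser to analyze and process data.
– For example, you can work with data on HDFS, run Hive scripts, and manage Oozie jobs.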

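• It is built on Django, the Python web framework.
• It supports any version of Hadoop.
– Access HDFS through the File Browser
– Develop and run Hive queries in a web editor
– Build Solr-based search applications, with visual data views and report generation
– Debug and develop interactive Impala queries on the web
– Develop and debug Spark
– Develop and debug Pig
– Develop, monitor, and coordinate and schedule Oozie workflows
– Query, modify, and display HBase data
– Query Hive metadata (metastore)
– View MapReduce job progress and trace logs
– Create and submit MapReduce, Streaming, and Java jobs
– Develop and debug Sqoop2
– Browse and edit ZooKeeper
– Query and display data from databases (MySQL, PostgreSQL, SQLite, Oracle)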

Installing the third-party packages Hue depends on

# Install the XML package

$>sudo yum install -y libxml2-devel.x86_64

# Install the other packages

$>sudo yum install -y libxslt-devel.x86_64 python-devel openldap-devel asciidoc cyrus-sasl-gssapi


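Configuring Hue

Hue can connect to Hadoop, that is, access files in Hadoop, in two ways.

WebHDFS

Provides high-speed data transfer; the client talks to the DataNodes directly.

HttpFS

A proxy service that makes integration easier for systems outside the cluster. Note: with NameNode HA, this is the only option that can be used.

3.1 Configure the Hue proxy user in Hadoop

[/soft/hadoop/etc/hadoop/core-site.xml]

Note: Hadoop proxy users are configured with properties of the form hadoop.proxyuser.${superuser}.hosts. Here my superuser is centos.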

<property>
 <name>hadoop.proxyuser.centos.hosts</name>
 <value>*</value>
</property>
<property>
 <name>hadoop.proxyuser.centos.groups</name>
 <value>*</value>
</property>
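
The naming pattern in the note above (`<prefix>.proxyuser.<superuser>.hosts` and `.groups`) can be generated rather than typed by hand. The following is a minimal sketch, not part of Hadoop or Hue: the function name `proxyuser_properties` and its parameters are hypothetical, and it only produces the `<property>` snippets for you to paste into the relevant `*-site.xml` file.

```python
from xml.sax.saxutils import escape


def proxyuser_properties(superuser, prefix="hadoop", hosts="*", groups="*"):
    """Return <property> XML snippets for the proxy-user settings of `superuser`.

    `prefix` is "hadoop" for core-site.xml; "httpfs" is the prefix this
    article uses in httpfs-site.xml.
    """
    settings = {
        f"{prefix}.proxyuser.{superuser}.hosts": hosts,
        f"{prefix}.proxyuser.{superuser}.groups": groups,
    }
    blocks = []
    for name, value in settings.items():
        blocks.append(
            "<property>\n"
            f" <name>{escape(name)}</name>\n"
            f" <value>{escape(value)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Prints the two properties for the "centos" superuser used in this article.
    print(proxyuser_properties("centos"))
```

Calling `proxyuser_properties("centos", prefix="httpfs")` yields the corresponding pair for httpfs-site.xml.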

[/soft/hadoop/etc/hadoop/hdfs-site.xml]

<property>
 <name>dfs.webhdfs.enabled</name>
 <value>true</value>
</property>

[/soft/hadoop/etc/hadoop/httpfs-site.xml]

<property>
 <name>httpfs.proxyuser.centos.hosts</name>
 <value>*</value>
</property>
<property>
 <name>httpfs.proxyuser.centos.groups</name>
 <value>*</value>
</property>

Distribute the configuration files

$>cd /soft/hadoop/etc/hadoop
$>xsync.sh core-site.xml
$>xsync.sh hdfs-site.xml
$>xsync.sh httpfs-site.xml

3.2 Restart the Hadoop and YARN processes

$>stop-dfs.sh
$>stop-yarn.sh

$>start-dfs.sh
$>start-yarn.sh

3.3 Start the HttpFS process

3.3.1 Start the process

$>/soft/hadoop/sbin/httpfs.sh start

3.3.2 Check port 14000

$>netstat -anop |grep 14000
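
The same check can be done from any machine that should reach HttpFS, not only on the server itself. Below is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical, and a successful TCP connection only shows that something is listening on the port, not that it is HttpFS.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False
```

For the setup in this article the call would be `port_is_open("s101", 14000)`.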


3.4 Configure the Hue file

Here Hadoop runs with NameNode HA, so HDFS can only be accessed through HttpFS. Note that webhdfs_url points to port 14000, as shown below.

[/home/centos/hue-3.12.0/desktop/conf/hue.ini]

...
    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://mycluster:8020

      # NameNode logical name.
      logical_name=mycluster

      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      webhdfs_url=http://s101:14000/webhdfs/v1

      # Change this if your HDFS cluster is Kerberos-secured
      ## security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

      # Directory of the Hadoop configuration
      hadoop_conf_dir=/soft/hadoop/etc/hadoop
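
Before starting Hue, it can help to confirm that the `webhdfs_url` value answers WebHDFS REST calls. The sketch below only builds the request URL for a given operation such as `LISTSTATUS`; the helper name `webhdfs_request_url` is hypothetical, and using `centos` as `user.name` is an assumption carried over from the proxy-user setup in this article.

```python
from urllib.parse import quote, urlencode


def webhdfs_request_url(base_url, hdfs_path, op, user):
    """Build a WebHDFS/HttpFS REST URL for `op` on the absolute path `hdfs_path`."""
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be absolute")
    query = urlencode({"op": op, "user.name": user})
    # quote() leaves "/" intact, so the HDFS path keeps its structure.
    return f"{base_url.rstrip('/')}{quote(hdfs_path)}?{query}"


if __name__ == "__main__":
    print(webhdfs_request_url("http://s101:14000/webhdfs/v1", "/tmp", "LISTSTATUS", "centos"))
```

Opening the printed URL with curl or a browser should return a JSON directory listing if HttpFS is running and the proxy user is configured.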

3.5 Configure MySQL as Hue's database

...
    [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, sqlite3 or oracle.
    #
    # Note that for sqlite3, 'name', below is a path to the filename. For other backends, it is the database name
    # Note for Oracle, options={"threaded":true} must be set in order to avoid crashes.
    # Note for Oracle, you can use the Oracle Service Name by setting "host=" and "port=" and then "name=<host>:<port>/<service_name>".
    # Note for MariaDB use the 'mysql' engine.
    engine=mysql
    host=192.168.231.1
    port=3306
    user=root
    password=root
    # Execute this script to produce the database password. This will be used when 'password' is not set.
    ## password_script=/path/script
    name=hue
    ## options={}
    # Database schema, to be used only when public schema is revoked in postgres
    ## schema=

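4. Initialize the MySQL database and create the tables

4.1 Create the hue database

The database name specified in hue.ini is hue, so the hue database has to be created first.

mysql>create database hue;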

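4.2 Initialize the tables

This step creates the tables and inserts some initial data. The tables are initialized with hue/bin/hue syncdb; during creation you are prompted for a username and password, as shown below:

# Sync the database
$>~/hue-3.12.0/build/env/bin/hue syncdb
# Import data, mainly the tables needed by oozie, pig, and desktop
$>~/hue-3.12.0/build/env/bin/hue migrate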


4.3 Check that the tables were created in MySQL

Check that the required tables now exist in MySQL, as in the screenshot below:

mysql>show tables;

[Screenshot]

5. Start the Hue process

$>~/hue-3.12.0/build/env/bin/supervisor

The startup output is shown in the screenshot below:

[Screenshot]

6. Check the web UI

http://s101:8888/

Open the login page and sign in with the account created earlier.

[Screenshot]

7. Access HDFS

Click the HDFS link in the upper-right corner to open the HDFS view.

[Screenshot]

8. Configure the ResourceManager

8.1 Edit the hue.ini configuration file

  [[yarn_clusters]]
    ...
    # [[[ha]]]
      # Resource Manager logical name (required for HA)
      logical_name=cluster1

      # Un-comment to enable
      ## submit_to=True

      # URL of the ResourceManager API
      resourcemanager_api_url=http://s101:8088
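
To confirm that `resourcemanager_api_url` is reachable before looking for jobs in Hue, the ResourceManager's REST endpoint `/ws/v1/cluster/info` can be queried directly. This is a minimal sketch with hypothetical helper names; it assumes the response is JSON with a top-level `clusterInfo` object containing a `state` field, and that the endpoint needs no authentication.

```python
import json
from urllib.request import urlopen


def cluster_info_url(resourcemanager_api_url):
    """Return the cluster-info REST URL for a ResourceManager base URL."""
    return resourcemanager_api_url.rstrip("/") + "/ws/v1/cluster/info"


def fetch_cluster_state(resourcemanager_api_url, timeout=5.0):
    """Query the ResourceManager and return the reported cluster state string."""
    with urlopen(cluster_info_url(resourcemanager_api_url), timeout=timeout) as resp:
        info = json.load(resp)
    # Assumed response shape: {"clusterInfo": {"state": ..., ...}}
    return info["clusterInfo"]["state"]
```

For this article's cluster the call would be `fetch_cluster_state("http://s101:8088")`.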

8.2 View job execution

[Screenshot]

9. Configure Hive

9.1 Edit the hue.ini file

[beeswax]
  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=s101

  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/soft/hive/conf

9.2 Install dependency packages

Without the following packages, Hue raises a SASL-related error claiming that HiveServer2 has not started.

$>sudo yum install -y cyrus-sasl-plain  cyrus-sasl-devel  cyrus-sasl-gssapi

9.3 Start the HiveServer2 server

$>/soft/hive/bin/hiveserver2

9.4 View the web UI

[Screenshot]

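10. Configure HBase

10.1 Edit the hue.ini configuration file

The HBase setting takes the address of the HBase Thrift server, not the master address, and the value must be wrapped in parentheses. The Thrift server has to be started separately.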

[hbase]
  # Comma-separated list of HBase Thrift servers for clusters in the format of '(name|host:port)'.
  # Use full hostname with security.
  # If using Kerberos we assume GSSAPI SASL, not PLAIN.
  hbase_clusters=(s101:9090)

  # HBase configuration directory, where hbase-site.xml is located.
  hbase_conf_dir=/soft/hbase/conf
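
The comment in the config describes each entry as `(name|host:port)`. The sketch below checks a value against that shape before Hue is restarted. It is only an illustration of the documented format, not Hue's own parser; the function name and the cluster name `Cluster` used in the example are hypothetical.

```python
import re

# One entry: "(" name "|" host ":" port ")"
_CLUSTER_RE = re.compile(r"^\((?P<name>[^()|]+)\|(?P<host>[^:()|]+):(?P<port>\d+)\)$")


def parse_hbase_clusters(value):
    """Parse a comma-separated hbase_clusters value into a list of dicts."""
    clusters = []
    for entry in value.split(","):
        match = _CLUSTER_RE.match(entry.strip())
        if match is None:
            raise ValueError(f"not in '(name|host:port)' form: {entry!r}")
        clusters.append(
            {"name": match["name"], "host": match["host"], "port": int(match["port"])}
        )
    return clusters


if __name__ == "__main__":
    print(parse_hbase_clusters("(Cluster|s101:9090)"))
```

Note that this check rejects a value without the name part, such as `(s101:9090)`. If Hue reports that the cluster configuration is not formatted correctly, adding a name in front of the host may be worth trying.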

10.2 Start the Thrift server

Note: the service name to start is thrift. Remember: some documents say thrift2, but here it is thrift.

$>hbase-daemon.sh start thrift

10.3 Check port 9090

[Screenshot]

10.4 View HBase in Hue

[Screenshot]

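11. Configure Spark

11.1 Introduction

Hue integrates with Spark through a Livy server acting as an intermediary, similar in role to HiveServer2. Livy exposes a RESTful service that accepts HTTP requests from clients and forwards them to the Spark cluster. Livy is not part of the Spark distribution and must be downloaded separately.

Note: Scala and Python programs are written in Hue through the notebook. For the notebook to work, Hadoop's HttpFS process must be running. Don't forget this!

Be sure to download a reasonably recent version; otherwise some classes cannot be found. Download link: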

http://mirrors.tuna.tsinghua.edu.cn/apache/incubator/livy/0.5.0-incubating/livy-0.5.0-incubating-bin.zip

11.2 Unpack

$>unzip livy-0.5.0-incubating-bin.zip -d /soft/

11.3 Start the Livy server

$>/soft/livy-0.5.0-incubating-bin/bin/livy-server

[Screenshot]

11.4 Configure Hue

Running jobs in local or yarn mode is recommended; here we configure spark://s101:7077.

[spark]
  # Host address of the Livy Server.
  livy_server_host=s101

  # Port of the Livy Server.
  livy_server_port=8998

  # Configure Livy to start in local 'process' mode, or 'yarn' workers.
  livy_server_session_kind=spark://s101:7077
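
Since Livy accepts plain HTTP requests, its availability can be tested independently of Hue by asking it to create a session. The sketch below only constructs the request (a `POST` to `/sessions` with a JSON body naming the session kind) and does not send it. The helper name is hypothetical, and the host and port are the values from the config above.

```python
import json
from urllib.request import Request


def livy_create_session_request(host, port, kind="spark"):
    """Build (but do not send) the HTTP request that asks Livy for a new session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return Request(
        url=f"http://{host}:{port}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = livy_create_session_request("s101", 8998)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Once the Livy server is running, passing the request to `urllib.request.urlopen` should return a JSON description of the new session. Treat the exact response fields as something to verify against your Livy version.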

11.5 Write a Scala program in the notebook

[Screenshot]

Installation and deployment reference:

https://www.cnblogs.com/xupccc/p/9583656.html
