DataSphere Studio 1.1.0 Single-Node Deployment

Author: 黄运鑫

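This is an original article. Reposting, copying, downloading, or commercial use is prohibited without the author's permission. The original author reserves all rights of interpretation.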


Base Environment Installation

  • Install the required packages
[root@localhost ~]# sudo yum install -y telnet tar sed dos2unix mysql unzip zip expect python

Install the JDK

  • First uninstall the JDK bundled with CentOS
  • List the bundled JDK packages by running rpm -qa|grep java, rpm -qa|grep jdk, and rpm -qa|grep gcj; several packages will be listed
[root@localhost ~]# rpm -qa|grep java
java-1.7.0-openjdk-headless-1.7.0.261-2.6.22.2.el7_8.x86_64
python-javapackages-3.4.1-11.el7.noarch
tzdata-java-2020a-1.el7.noarch
java-1.8.0-openjdk-headless-1.8.0.262.b10-1.el7.x86_64
java-1.8.0-openjdk-1.8.0.262.b10-1.el7.x86_64
javapackages-tools-3.4.1-11.el7.noarch
java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.x86_64
  • Uninstall the bundled JDK
[root@localhost ~]# rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.261-2.6.22.2.el7_8.x86_64
[root@localhost ~]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.262.b10-1.el7.x86_64
[root@localhost ~]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.262.b10-1.el7.x86_64
[root@localhost ~]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.x86_64
  • Verify after uninstalling
[root@localhost ~]# rpm -qa|grep java
python-javapackages-3.4.1-11.el7.noarch
tzdata-java-2020a-1.el7.noarch
javapackages-tools-3.4.1-11.el7.noarch
[root@localhost ~]# java -version
bash: java: command not found...
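
The manual uninstall above can also be scripted. The following is a minimal sketch rather than part of the original procedure: list_bundled_openjdk is a hypothetical helper name, and the pattern assumes the bundled runtimes are all named java-<version>-openjdk..., as in the listing above.

```shell
# Print only the OpenJDK runtime packages from an `rpm -qa` listing on stdin.
# Helper packages such as tzdata-java or javapackages-tools are left alone,
# which matches the manual steps above.
list_bundled_openjdk() {
    grep -E '^java-[0-9.]+-openjdk' || true
}

# Possible usage on a real host (xargs -r is the GNU form found on CentOS):
#   rpm -qa | list_bundled_openjdk | xargs -r sudo rpm -e --nodeps
```

Filtering first and removing second makes it easy to review the package list before anything is deleted.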
# Create the JDK directory
[root@localhost ~]# mkdir -p /opt/jdk

# Extract into that directory
[root@localhost ~]# sudo tar -zxvf jdk-8u261-linux-x64.tar.gz -C /opt/jdk
  • Run sudo vim /etc/profile and add the following lines:
export JAVA_HOME="/opt/jdk/jdk1.8.0_261"
export PATH=$JAVA_HOME/bin:$PATH
  • Apply the environment variables
[root@localhost home]# source /etc/profile
  • Verify the installation
[root@localhost home]# java -version
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)
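
Because /etc/profile is edited by hand at several points in this guide, it is easy to end up with duplicate export lines. The sketch below shows one way to add a line only when it is missing; append_once is a hypothetical helper name, not something shipped with any of the tools used here.

```shell
# Append LINE to FILE only if an identical line is not already present.
append_once() {
    file=$1
    line=$2
    # Make sure the file exists so that grep does not fail on a missing path.
    touch "$file"
    # -x matches whole lines, -F treats the text literally (no regex).
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Possible usage as root:
#   append_once /etc/profile 'export JAVA_HOME="/opt/jdk/jdk1.8.0_261"'
#   append_once /etc/profile 'export PATH=$JAVA_HOME/bin:$PATH'
```

Running the same call twice leaves the file unchanged the second time, so the step can be repeated safely.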

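Install Nginx

  • Nginx needs no configuration after it is installed and started; the DSS scripts configure it automatically
# Install the nginx repository rpm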
[root@localhost home]# sudo rpm -ivh http://nginx.org/packages/centos/7/noarch/RPMS/nginx-release-centos-7-0.el7.ngx.noarch.rpm

# Install nginx
[root@localhost home]# sudo yum install -y nginx

# Enable start on boot
[root@localhost home]# sudo systemctl enable nginx

# Start the nginx service
[root@localhost home]# sudo systemctl start nginx

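Install MySQL

  • The company test server already ships with MySQL 5.7, so its installation is omitted here

Install Hadoop

  • Download hadoop-2.7.2.tar.gz from the Hadoop website
  • Create a user and set its password
# Create the user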
[root@localhost home]# sudo useradd hadoop

# Set the password
[root@localhost home]# sudo passwd hadoop
  • Run vim /etc/sudoers to edit the user's privileges and add the following line at the very bottom
hadoop ALL=(ALL) NOPASSWD: ALL
  • Check that the privileges were configured correctly
[root@localhost home]# sudo -l -U hadoop
Matching Defaults entries for %1$s on %2$s:
    !visiblepw, always_set_home, match_group_by_gid, always_query_group_plugin, env_reset, env_keep="COLORS DISPLAY HOSTNAME HISTSIZE KDEDIR LS_COLORS", env_keep+="MAIL PS1 PS2 QTDIR USERNAME
    LANG LC_ADDRESS LC_CTYPE", env_keep+="LC_COLLATE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES", env_keep+="LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE", env_keep+="LC_TIME LC_ALL
    LANGUAGE LINGUAS _XKB_CHARSET XAUTHORITY", secure_path=/sbin\:/bin\:/usr/sbin\:/usr/bin

User hadoop may run the following commands on localhost:
    (ALL) ALL
  • Switch to the hadoop user
[root@localhost home]# su hadoop
  • Configure passwordless SSH login
# Generate the key pair; press Enter at every prompt. The public key is written to /home/hadoop/.ssh/id_rsa.pub
[hadoop@localhost home]$ ssh-keygen

# Install the public key for the local hadoop account
[hadoop@localhost home]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@127.0.0.1
  • Test the configuration. The first connection asks whether to continue; type yes. After that, ssh localhost no longer asks for a password
[hadoop@localhost home]$ ssh localhost
Last login: Tue Nov  8 22:46:56 2022
  • Extract the installation package
# Upload the package to /home and change to that directory
[hadoop@localhost home]$ cd /home

# Create the installation directory
[hadoop@localhost home]$ sudo mkdir -p /opt/hadoop

# Extract into the installation directory
[hadoop@localhost home]$ sudo tar xvf hadoop-2.7.2.tar.gz -C /opt/hadoop
  • Run sudo vim /etc/hosts to configure host name resolution and append this line at the end
127.0.0.1 namenode
  • Create the data directories
[hadoop@localhost home]$ sudo mkdir -p /opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/namenode
[hadoop@localhost home]$ sudo mkdir -p /opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/datanode
  • Run sudo vim /opt/hadoop/hadoop-2.7.2/etc/hadoop/core-site.xml and set its content as follows:
<configuration>
    <!-- Address of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://127.0.0.1:9000</value>
    </property>

    <!-- Directory where Hadoop stores files generated at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop/hadoop-2.7.2/data/tmp</value>
    </property>

    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>
</configuration>
  • Run sudo vim /opt/hadoop/hadoop-2.7.2/etc/hadoop/hdfs-site.xml and set its content as follows:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/opt/hadoop/hadoop-2.7.2/hadoopinfra/hdfs/datanode</value>
    </property>
</configuration>
  • Run sudo vim /opt/hadoop/hadoop-2.7.2/etc/hadoop/yarn-site.xml and set its content as follows:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Whether virtual memory limits will be enforced for containers</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
        <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
    </property>
</configuration>
  • Create the mapred-site.xml configuration file from its template:
[hadoop@localhost home]$ sudo cp /opt/hadoop/hadoop-2.7.2/etc/hadoop/mapred-site.xml.template /opt/hadoop/hadoop-2.7.2/etc/hadoop/mapred-site.xml
  • Run sudo vim /opt/hadoop/hadoop-2.7.2/etc/hadoop/mapred-site.xml and set its content as follows:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
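
After editing several XML files by hand, it can help to read a value back and confirm it is what was intended. The sketch below is one way to do that with awk; get_hadoop_property is a hypothetical helper name, and it assumes each <name> and <value> element sits on its own line, as in the files above. On a working installation, hdfs getconf -confKey fs.defaultFS gives the effective value as Hadoop itself sees it.

```shell
# Print the <value> of the property called NAME in a Hadoop-style XML FILE.
# Usage: get_hadoop_property FILE NAME
get_hadoop_property() {
    awk -v want="$2" '
        # Remember whether the most recent <name> matches the requested one.
        /<name>/ {
            line = $0
            sub(/^.*<name>/, "", line); sub(/<\/name>.*$/, "", line)
            hit = (line == want)
        }
        # The first <value> after a matching <name> is the answer.
        hit && /<value>/ {
            line = $0
            sub(/^.*<value>/, "", line); sub(/<\/value>.*$/, "", line)
            print line
            exit
        }
    ' "$1"
}

# Possible usage:
#   get_hadoop_property /opt/hadoop/hadoop-2.7.2/etc/hadoop/core-site.xml fs.defaultFS
```

The helper prints nothing when the property is absent, which makes a missing entry easy to spot in a script.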
  • Run sudo vim /opt/hadoop/hadoop-2.7.2/etc/hadoop/hadoop-env.sh, find JAVA_HOME, and change it to:
export JAVA_HOME=/opt/jdk/jdk1.8.0_261
  • Run sudo vim /etc/profile to edit the environment variables
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin
  • Apply the environment variables
[hadoop@localhost home]$ source /etc/profile
  • Set permissions on the Hadoop directory
[hadoop@localhost home]$ sudo chmod -R 777 /opt/hadoop
  • Initialize Hadoop
# Format HDFS
[root@localhost home]# hdfs namenode -format

# Start the HDFS and YARN processes
[root@localhost home]# /opt/hadoop/hadoop-2.7.2/sbin/start-dfs.sh 
[root@localhost home]# /opt/hadoop/hadoop-2.7.2/sbin/start-yarn.sh
  • Run jps to check that the Hadoop processes have started
[hadoop@localhost home]$ jps
1969 NodeManager
787 DataNode
1171 SecondaryNameNode
3460 Jps
1642 ResourceManager
508 NameNode
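
Reading the jps listing by eye works, but the check is easy to automate. The following is a small sketch under the assumption that the five daemons in the output above are the ones that must be running; missing_daemons is a hypothetical helper name.

```shell
# Read `jps` output on stdin and print every expected daemon that is absent.
missing_daemons() {
    # Keep only the second column (the process name) of each jps line.
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$running" | grep -qx "$daemon" || printf '%s\n' "$daemon"
    done
}

# Possible usage:
#   jps | missing_daemons
```

An empty result means all five daemons were found; any name that is printed points at the service whose log should be checked.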
  • Stop the firewall first
[root@localhost home]# sudo systemctl stop firewalld
  • Installation is complete. Open the Hadoop web page; the default port is 50070

Install Hive

  • Download apache-hive-2.3.3-bin.tar.gz from the Hive website
  • Hadoop must already be running normally
  • Extract Hive
# Copy the package to /home and change to that directory
[hadoop@localhost home]$ cd /home

# Create the installation directory
[hadoop@localhost home]$ sudo mkdir -p /opt/hive

# Extract into the installation directory
[hadoop@localhost home]$ sudo tar xvf apache-hive-2.3.3-bin.tar.gz -C /opt/hive
  • Create the configuration files
# Change directory
[hadoop@localhost home]$ cd /opt/hive/apache-hive-2.3.3-bin/conf/

# Create the configuration files from their templates
[hadoop@localhost conf]$ sudo cp hive-env.sh.template hive-env.sh
[hadoop@localhost conf]$ sudo cp hive-default.xml.template hive-site.xml
[hadoop@localhost conf]$ sudo cp hive-log4j2.properties.template hive-log4j2.properties
[hadoop@localhost conf]$ sudo cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
  • Create the directories in HDFS
[root@localhost conf]# hadoop fs -mkdir -p /data/hive/warehouse
[root@localhost conf]# hadoop fs -mkdir /data/hive/tmp
[root@localhost conf]# hadoop fs -mkdir /data/hive/log
[root@localhost conf]# hadoop fs -chmod -R 777 /data/hive/warehouse
[root@localhost conf]# hadoop fs -chmod -R 777 /data/hive/tmp
[root@localhost conf]# hadoop fs -chmod -R 777 /data/hive/log
[root@localhost conf]# hadoop fs -mkdir -p /spark-eventlog
  • Edit the Hive configuration with sudo vim /opt/hive/apache-hive-2.3.3-bin/conf/hive-site.xml
<property>
    <name>system:java.io.tmpdir</name>
    <value>/tmp/hive/java</value>
</property>
<property>
    <name>system:user.name</name>
    <value>hadoop</value>
</property>
<!-- Search for each of the following properties by its <name> and change them one by one -->
<property>
    <name>hive.exec.scratchdir</name>
    <value>hdfs://127.0.0.1:9000/data/hive/tmp</value>
</property>
<!-- Hive's default working directory on HDFS -->
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://127.0.0.1:9000/data/hive/warehouse</value>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>hdfs://127.0.0.1:9000/data/hive/log</value>
</property>
<!-- MySQL database address; the company test server is used here -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://10.10.1.6:3306/hive?characterEncoding=utf8&amp;useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<!-- Database user name -->
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<!-- Database password -->
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
</property>
<!-- Hive metastore schema version verification -->
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
</property>
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/hive/apache-hive-2.3.3-bin/tmp/${system:user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/hive/apache-hive-2.3.3-bin/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/hive/apache-hive-2.3.3-bin/tmp/root/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
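
One detail in the XML above is easy to get wrong: inside hive-site.xml the & in the JDBC URL has to be written as &amp;, whereas the same URL on a command line or in a shell variable uses a plain & inside quotes. The sketch below converts a plain string into its XML-safe form; xml_escape is a hypothetical helper name and it only handles &, < and >.

```shell
# Escape the characters that are special in XML text content.
# The ampersand is replaced first so that the entities produced by the
# later substitutions are not escaped a second time.
xml_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Possible usage:
#   xml_escape 'jdbc:mysql://10.10.1.6:3306/hive?characterEncoding=utf8&useSSL=false'
```

The output can be pasted directly between <value> and </value>.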
  • Set up the MySQL driver
# Change directory
[root@localhost conf]# cd /opt/hive/apache-hive-2.3.3-bin/lib/

# Download the driver archive
[root@localhost lib]# sudo wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.49.tar.gz

# Extract the driver archive
[root@localhost lib]# sudo tar xvf mysql-connector-java-5.1.49.tar.gz

# Copy the driver jar into the current directory
[root@localhost lib]# sudo cp mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar .
  • Configure the Hive environment variables with sudo vim /opt/hive/apache-hive-2.3.3-bin/conf/hive-env.sh
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.3-bin/conf
export HIVE_AUX_JARS_PATH=/opt/hive/apache-hive-2.3.3-bin/lib
  • Edit the system environment variables with sudo vim /etc/profile
export HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.3-bin/conf
export HIVE_AUX_JARS_PATH=/opt/hive/apache-hive-2.3.3-bin/lib
export HIVE_HOME=/opt/hive/apache-hive-2.3.3-bin

export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin
  • Apply the environment variables
[root@localhost lib]# source /etc/profile
  • First create a database named hive in MySQL
  • Then initialize the schema; on success, the metastore tables are created in the hive database
[root@localhost lib]# /opt/hive/apache-hive-2.3.3-bin/bin/schematool -dbType mysql -initSchema
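
The hive database has to exist before schematool is run, and the article does not show how it was created. One possible way from the command line is sketched below; create_db_sql is a hypothetical helper name, the statement uses the server's default character set, and the host and user in the usage line are the ones configured in hive-site.xml above.

```shell
# Print a CREATE DATABASE statement for NAME.
# Only letters, digits and underscores are accepted, so the name can be
# placed inside backticks without further escaping.
create_db_sql() {
    case $1 in
        '' | *[!A-Za-z0-9_]*)
            echo "invalid database name: $1" >&2
            return 1
            ;;
    esac
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$1"
}

# Possible usage (prompts for the MySQL password):
#   create_db_sql hive | mysql -h 10.10.1.6 -uroot -p
```

IF NOT EXISTS keeps the command harmless when it is run a second time.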
  • Set permissions on the directory
[hadoop@localhost home]$ sudo chmod -R 777 /opt/hive
  • Start the services
[hadoop@localhost lib]$ nohup hive --service metastore >> metastore.log 2>&1 &
[hadoop@localhost lib]$ nohup hive --service hiveserver2 >> hiveserver2.log 2>&1 &
  • Installation is complete; verify it
[hadoop@localhost lib]$ hive -e "show databases"
which: no hbase in (/opt/jdk/jdk1.8.0_261/bin:/opt/jdk/jdk1.8.0_261/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.2/bin:/root/bin:/opt/hadoop/hadoop-2.7.2/bin:/opt/hive/apache-hive-2.3.3-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/opt/hive/apache-hive-2.3.3-bin/conf/hive-log4j2.properties Async: true
OK
default
Time taken: 3.503 seconds, Fetched: 1 row(s)

Install Spark

  • Download spark-2.4.3-bin-without-hadoop.tgz from the Spark website
  • Install
# Upload the package to /home and change to that directory
[hadoop@localhost lib]$ cd /home

# Create the installation directory
[root@localhost home]# sudo mkdir -p /opt/spark

# Extract into the installation directory
[root@localhost home]# sudo tar xvf spark-2.4.3-bin-without-hadoop.tgz -C /opt/spark
  • Create the Spark configuration files
# Change to the configuration directory
[root@localhost home]# cd /opt/spark/spark-2.4.3-bin-without-hadoop/conf/

# Copy the configuration files from their templates
[root@localhost conf]# sudo cp spark-env.sh.template spark-env.sh
[root@localhost conf]# sudo cp metrics.properties.template metrics.properties
  • Run sudo vim spark-env.sh to edit the configuration file
export JAVA_HOME=/opt/jdk/jdk1.8.0_261
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/opt/hadoop/hadoop-2.7.2/bin/hadoop classpath)
export SPARK_MASTER_HOST=127.0.0.1
export SPARK_MASTER_PORT=7077
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=50 -Dspark.history.fs.logDirectory=hdfs://127.0.0.1:9000/spark-eventlog"
  • Configure Hive by copying hive-site.xml into the Spark configuration directory
[root@localhost conf]# sudo cp /opt/hive/apache-hive-2.3.3-bin/conf/hive-site.xml /opt/spark/spark-2.4.3-bin-without-hadoop/conf
  • Set permissions on the directory
[hadoop@localhost home]$ sudo chmod -R 777 /opt/spark
  • Edit the environment variables with sudo vim /etc/profile
export SPARK_HOME=/opt/spark/spark-2.4.3-bin-without-hadoop

export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin
  • Apply the environment variables
[root@localhost home]# source /etc/profile
  • Installation is complete; start Spark
[root@localhost conf]# /opt/spark/spark-2.4.3-bin-without-hadoop/sbin/start-all.sh
  • Run jps to check that the Master and Worker processes exist
[hadoop@localhost conf]$ jps
1969 NodeManager
25313 RunJar
787 DataNode
1171 SecondaryNameNode
16580 Worker
17044 Jps
16373 Master
1642 ResourceManager
508 NameNode
  • Verify the installation
[root@localhost conf]# spark-sql -e "show databases"
  • If it fails with an error like this
[hadoop@localhost conf]$ spark-sql -e "show databases"
22/11/09 01:04:40 WARN util.Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 10.10.1.84 instead (on interface ens192)
22/11/09 01:04:40 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/11/09 01:04:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/11/09 01:04:41 WARN deploy.SparkSubmit$$anon$2: Failed to load org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:810)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
You need to build Spark with -Phive and -Phive-thriftserver.
22/11/09 01:04:41 INFO util.ShutdownHookManager: Shutdown hook called
22/11/09 01:04:41 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-e69a2d64-f654-4770-b450-0990098be5ee

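  • Download spark-2.4.3-bin-hadoop2.7.tgz from the Spark website, extract it, and use its jars folder to overwrite the existing one
# Upload the archive to /home and change to that directory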
[hadoop@localhost lib]$ cd /home

# Extract in the current directory
[root@localhost home]# sudo tar xvf spark-2.4.3-bin-hadoop2.7.tgz

# Overwrite the files
[root@localhost home]# cp -rf spark-2.4.3-bin-hadoop2.7/jars/ /opt/spark/spark-2.4.3-bin-without-hadoop/
  • Add the MySQL driver
# Change to the jars directory
[root@localhost home]# cd /opt/spark/spark-2.4.3-bin-without-hadoop/jars

# Download the driver archive
[root@localhost home]# wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.49.tar.gz

# Extract it
[root@localhost home]# tar xvf mysql-connector-java-5.1.49.tar.gz

# Copy the driver jar into the current directory
[root@localhost home]# cp mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar .
  • Verify the installation again; this time it succeeds
[root@localhost conf]# spark-sql -e "show databases"
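
The ClassNotFoundException above appears because the without-hadoop build lacks the Hive Thrift server classes. A quick way to check a Spark directory before running spark-sql is sketched below. It assumes the class ships in a jar whose name starts with spark-hive-thriftserver_, which is how the pre-built Spark 2.4.3 packages name it; has_hive_thriftserver is a hypothetical helper name.

```shell
# Succeed if DIR contains a spark-hive-thriftserver jar, fail otherwise.
has_hive_thriftserver() {
    for jar in "$1"/spark-hive-thriftserver_*.jar; do
        # When nothing matches, the pattern itself is returned unexpanded,
        # so test that the path really exists.
        [ -e "$jar" ] && return 0
    done
    return 1
}

# Possible usage:
#   has_hive_thriftserver /opt/spark/spark-2.4.3-bin-without-hadoop/jars \
#       || echo "Hive Thrift server jar is missing"
```

If the check fails after the jars folder has been overwritten, the copy most likely went to a different directory than intended.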

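Install DataSphere Studio

  • Go to the DSS releases page and download either the compiled release or the source package; the file used here is the 1.1.0 release, dss_linkis_one-click_install_20220704.zip
  • Extract the installation package and install the dependencies
# Upload the package to /home and change to that directory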
[hadoop@localhost conf]$ cd /home

# Extract into the installation directory
[hadoop@localhost home]$ sudo unzip -d /opt/dss dss_linkis_one-click_install_20220704.zip
  • Install the dependencies
[root@localhost home]# sudo yum -y install epel-release
[root@localhost home]# sudo yum install -y python-pip
[root@localhost home]# sudo python -m pip install matplotlib
  • Running sudo python -m pip install matplotlib fails with the following error
[hadoop@localhost home]$ python -m pip install matplotlib
Collecting matplotlib
  Downloading https://files.pythonhosted.org/packages/91/1c/a48fd779287df3425c289cc2ff728980a5b355f15f4c3c40e1822770ba44/matplotlib-3.6.2.tar.gz (35.8MB)
    100% |████████████████████████████████| 35.9MB 37kB/s 
    Complete output from command python setup.py egg_info:
    
    Beginning with Matplotlib 3.6, Python 3.8 or above is required.
    You are using Python 2.7.5.
    
    This may be due to an out of date pip.
    
    Make sure you have pip >= 9.0.1.
    
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-RSKuhg/matplotlib/
You are using pip version 8.1.2, however version 22.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
  • Running pip install --upgrade pip as the message suggests does not help either, so update pip with get-pip.py instead
[hadoop@localhost home]$ sudo wget https://bootstrap.pypa.io/pip/2.7/get-pip.py
[hadoop@localhost home]$ sudo python get-pip.py
[hadoop@localhost home]$ pip -V

[hadoop@localhost home]$ sudo wget https://bootstrap.pypa.io/pip/3.5/get-pip.py
[hadoop@localhost home]$ sudo python3 get-pip.py
[hadoop@localhost home]$ pip -V
  • Then run sudo python -m pip install matplotlib again to install the dependency
[hadoop@localhost home]$ sudo python -m pip install matplotlib
  • Run sudo vim /opt/dss/conf/config.sh to edit the configuration
### deploy user
deployUser=hadoop

### Linkis_VERSION
LINKIS_VERSION=1.1.1

### DSS Web
DSS_NGINX_IP=127.0.0.1
DSS_WEB_PORT=8085

### DSS VERSION
DSS_VERSION=1.1.0


############## ############## Other default Linkis configuration start ############## ##############
### Specifies the user workspace, which is used to store the user's script files and log files.
### Generally local directory
##file:// required
WORKSPACE_USER_ROOT_PATH=file:///tmp/linkis/ 
### User's root hdfs path
##hdfs:// required
HDFS_USER_ROOT_PATH=hdfs:///tmp/linkis 
### Path to store job ResultSet:file or hdfs path
##hdfs:// required
RESULT_SET_ROOT_PATH=hdfs:///tmp/linkis 

### Path to store started engines and engine logs, must be local
ENGINECONN_ROOT_PATH=/appcom/tmp

#ENTRANCE_CONFIG_LOG_PATH=hdfs:///tmp/linkis/ ##hdfs:// required

###HADOOP CONF DIR #/appcom/config/hadoop-config
HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
###HIVE CONF DIR  #/appcom/config/hive-config
HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.3-bin/conf
###SPARK CONF DIR #/appcom/config/spark-config
SPARK_CONF_DIR=/opt/spark/spark-2.4.3-bin-without-hadoop/conf
# for install
LINKIS_PUBLIC_MODULE=lib/linkis-commons/public-module

##YARN REST URL  spark engine required
YARN_RESTFUL_URL=http://127.0.0.1:8088

## Engine version conf
#SPARK_VERSION
SPARK_VERSION=2.4.3
##HIVE_VERSION
HIVE_VERSION=2.3.3
PYTHON_VERSION=python2

## LDAP is for enterprise authorization, if you just want to have a try, ignore it.
#LDAP_URL=ldap://localhost:1389/
#LDAP_BASEDN=dc=webank,dc=com
#LDAP_USER_NAME_FORMAT=cn=%s@xxx.com,OU=xxx,DC=xxx,DC=com

################### The install Configuration of all Linkis's Micro-Services #####################
#
#    NOTICE:
#       1. If you just wanna try, the following micro-service configuration can be set without any settings.
#            These services will be installed by default on this machine.
#       2. In order to get the most complete enterprise-level features, we strongly recommend that you install
#          the following microservice parameters
#

###  EUREKA install information
###  You can access it in your browser at the address below:http://${EUREKA_INSTALL_IP}:${EUREKA_PORT}
###  Microservices Service Registration Discovery Center
LINKIS_EUREKA_INSTALL_IP=127.0.0.1
LINKIS_EUREKA_PORT=9600
#LINKIS_EUREKA_PREFER_IP=true

###  Gateway install information
#LINKIS_GATEWAY_INSTALL_IP=127.0.0.1
LINKIS_GATEWAY_PORT=9001

### ApplicationManager
#LINKIS_MANAGER_INSTALL_IP=127.0.0.1
LINKIS_MANAGER_PORT=9101

### EngineManager
#LINKIS_ENGINECONNMANAGER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONNMANAGER_PORT=9102

### EnginePluginServer
#LINKIS_ENGINECONN_PLUGIN_SERVER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONN_PLUGIN_SERVER_PORT=9103

### LinkisEntrance
#LINKIS_ENTRANCE_INSTALL_IP=127.0.0.1
LINKIS_ENTRANCE_PORT=9104

###  publicservice
#LINKIS_PUBLICSERVICE_INSTALL_IP=127.0.0.1
LINKIS_PUBLICSERVICE_PORT=9105

### cs
#LINKIS_CS_INSTALL_IP=127.0.0.1
LINKIS_CS_PORT=9108

########## End of Linkis micro-service configuration #####

################### The install Configuration of all DataSphereStudio's Micro-Services #####################
#
#    NOTICE:
#       1. If you just wanna try, the following micro-service configuration can be set without any settings.
#            These services will be installed by default on this machine.
#       2. In order to get the most complete enterprise-level features, we strongly recommend that you install
#          the following microservice parameters
#

### DSS_SERVER
### This service is used to provide dss-server capability.

### project-server
#DSS_FRAMEWORK_PROJECT_SERVER_INSTALL_IP=127.0.0.1
#DSS_FRAMEWORK_PROJECT_SERVER_PORT=9002
### orchestrator-server
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_INSTALL_IP=127.0.0.1
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_PORT=9003
### apiservice-server
#DSS_APISERVICE_SERVER_INSTALL_IP=127.0.0.1
#DSS_APISERVICE_SERVER_PORT=9004
### dss-workflow-server
#DSS_WORKFLOW_SERVER_INSTALL_IP=127.0.0.1
#DSS_WORKFLOW_SERVER_PORT=9005
### dss-flow-execution-server
#DSS_FLOW_EXECUTION_SERVER_INSTALL_IP=127.0.0.1
#DSS_FLOW_EXECUTION_SERVER_PORT=9006
###dss-scriptis-server
#DSS_SCRIPTIS_SERVER_INSTALL_IP=127.0.0.1
#DSS_SCRIPTIS_SERVER_PORT=9008

###dss-data-api-server
#DSS_DATA_API_SERVER_INSTALL_IP=127.0.0.1
#DSS_DATA_API_SERVER_PORT=9208
###dss-data-governance-server
#DSS_DATA_GOVERNANCE_SERVER_INSTALL_IP=127.0.0.1
#DSS_DATA_GOVERNANCE_SERVER_PORT=9209
###dss-guide-server
#DSS_GUIDE_SERVER_INSTALL_IP=127.0.0.1
#DSS_GUIDE_SERVER_PORT=9210
########## End of DSS micro-service configuration #####

############## ############## other default configuration ############## ##############

## java application default jvm memory
export SERVER_HEAP_SIZE="512M"


## sendemail configuration; only affects the email-sending feature in DSS workflows
EMAIL_HOST=smtp.163.com
EMAIL_PORT=25
EMAIL_USERNAME=xxx@163.com
EMAIL_PASSWORD=xxxxx
EMAIL_PROTOCOL=smtp

### Save the file path exported by the orchestrator service
ORCHESTRATOR_FILE_PATH=/appcom/tmp/dss
### Save DSS flow execution service log path
EXECUTION_LOG_PATH=/appcom/tmp/dss
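
Since config.sh points at the Hadoop, Hive and Spark configuration directories set up earlier, a mistyped path there only shows up later as an installation failure. The sketch below reads those three keys back and reports any directory that does not exist. conf_get and missing_conf_dirs are hypothetical helper names, and the parsing assumes plain KEY=VALUE lines without quotes or trailing comments, as in the file above.

```shell
# Print the value assigned to KEY in a KEY=VALUE style file such as config.sh.
# Usage: conf_get FILE KEY
conf_get() {
    # Commented-out lines start with '#', so they never match ^KEY=.
    # The last assignment wins; trailing whitespace is removed.
    sed -n "s/^$2=//p" "$1" | tail -n 1 | sed 's/[[:space:]]*$//'
}

# Print KEY=VALUE for each *_CONF_DIR entry whose directory does not exist.
# Usage: missing_conf_dirs FILE
missing_conf_dirs() {
    for key in HADOOP_CONF_DIR HIVE_CONF_DIR SPARK_CONF_DIR; do
        dir=$(conf_get "$1" "$key")
        [ -d "$dir" ] || printf '%s=%s\n' "$key" "$dir"
    done
}

# Possible usage:
#   missing_conf_dirs /opt/dss/conf/config.sh
```

No output means all three directories were found.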
  • Edit the database configuration with vim /opt/dss/conf/db.sh
### for DSS-Server and Eventchecker APPCONN
MYSQL_HOST=10.10.1.6
MYSQL_PORT=3306
MYSQL_DB=dss
MYSQL_USER=root
MYSQL_PASSWORD=root

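# Mainly used together with Scriptis; if left unset, DSS tries by default to read these values from the configuration files in $HIVE_CONF_DIR
HIVE_META_URL=jdbc:mysql://10.10.1.6:3306/hive?characterEncoding=utf8&amp;useSSL=false    # URL of the Hive metastore database
HIVE_META_USER=root   # User of the Hive metastore database
HIVE_META_PASSWORD=root    # Password of the Hive metastore database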
  • First create a database named dss
  • Then run the installation script
[root@localhost home]# sh /opt/dss/bin/install.sh
  • Start the services
[root@localhost home]# sh /opt/dss/bin/start-all.sh

# Startup is slow; on success it prints the addresses of the API and the web page
You can check DSS & Linkis by acessing eureka URL: http://10.10.1.84:9600
You can acess DSS & Linkis Web by http://10.10.1.84:8085
  • After startup, open Eureka on port 9600; 17 services should be listed
  • Open the front-end page on port 8085; the user name is hadoop and the password is hadoop
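
The Eureka check can also be done without a browser. The sketch below counts the applications in the XML that a Spring Cloud Eureka server usually returns from its /eureka/apps endpoint. The endpoint path and the XML response format are assumptions about this deployment rather than something confirmed by the article, and count_eureka_apps is a hypothetical helper name.

```shell
# Count the <application> elements in Eureka registry XML read from stdin.
count_eureka_apps() {
    # grep -o prints one line per match; the enclosing <applications> tag
    # is not counted because the pattern requires the closing '>'.
    grep -o '<application>' | grep -c . || true
}

# Possible usage:
#   curl -s http://127.0.0.1:9600/eureka/apps | count_eureka_apps
```

Because startup is slow, the number may keep rising for a while before it reaches the expected value.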
