Creating a Spark Maven Project in IDEA and Connecting to a Remote Spark Cluster

This article walks through creating a Spark Maven project in IDEA and running it against a remote Spark cluster, step by step.

Environment:

Scala: 2.12.10

Spark: 3.0.3

1. Create a Scala Maven project in IDEA using the scala-archetype-simple archetype.

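2. Adjust the Scala compiler arguments in pom.xml

Compiler arguments can differ slightly between Scala versions. The author uses Scala 2.12.10, and the pom file generated by scala-archetype-simple contains the following plugin section:

    <plugin>
      <groupId>org.scala-tools</groupId>
      <artifactId>maven-scala-plugin</artifactId>
      <version>2.15.0</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
          <configuration>
            <args>
              <arg>-make:transitive</arg>
              <arg>-dependencyfile</arg>
              <arg>${project.build.directory}/.scala_dependencies</arg>
            </args>
          </configuration>
        </execution>
      </executions>
    </plugin>

Remove the -make:transitive argument; otherwise compilation fails with this Scala version.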
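For reference, here is a sketch of how the args section might look once the argument is removed. It assumes the rest of the generated plugin section is left untouched; only the `-make:transitive` entry is dropped.

```xml
<configuration>
  <args>
    <!-- -make:transitive removed; the remaining arguments are unchanged -->
    <arg>-dependencyfile</arg>
    <arg>${project.build.directory}/.scala_dependencies</arg>
  </args>
</configuration>
```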

3. Create the SparkPi object

    import scala.math.random

    import org.apache.spark.sql.SparkSession

    object SparkPi {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession
          .builder
          .appName("Spark Pi")
          // URL of the remote standalone master
          .master("spark://172.21.212.114:7077")
          // Packaged application jar, made available to the driver and executors
          .config("spark.jars", "E:\\work\\polaris\\polaris-spark\\spark-scala\\target\\spark-scala-1.0.0.jar")
          .config("spark.executor.memory", "2g")
          .config("spark.cores.max", "2")
          // Address and port of the local driver; must be reachable from the cluster
          .config("spark.driver.host", "172.21.58.28")
          .config("spark.driver.port", "9089")
          .getOrCreate()

        val slices = if (args.length > 0) args(0).toInt else 2
        val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow

        // Count the random points in [-1, 1] x [-1, 1] that fall inside the unit circle
        val count = spark.sparkContext.parallelize(1 until n, slices).map { i =>
          val x = random * 2 - 1
          val y = random * 2 - 1
          if (x * x + y * y <= 1) 1 else 0
        }.reduce(_ + _)

        println(s"Pi is roughly ${4.0 * count / (n - 1)}")
        spark.stop()
      }
    }
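The job above estimates π by sampling random points in a square and counting how many land inside the unit circle. To check that logic before involving the cluster, a minimal sketch in plain Scala (no Spark) could look like the following. The names `LocalPi` and `estimatePi` and the fixed seed are illustrative assumptions and are not part of the original project.

```scala
import scala.util.Random

// Hypothetical local helper, not part of the original article's project.
object LocalPi {

  // Estimates Pi from `samples` random points in the square [-1, 1] x [-1, 1].
  // The fraction of points inside the unit circle approximates Pi / 4.
  def estimatePi(samples: Int, seed: Long = 42L): Double = {
    require(samples > 0, "samples must be positive")
    val rng = new Random(seed) // fixed seed (an assumption) keeps runs repeatable
    val inside = (1 to samples).count { _ =>
      val x = rng.nextDouble() * 2 - 1
      val y = rng.nextDouble() * 2 - 1
      x * x + y * y <= 1
    }
    4.0 * inside / samples
  }

  def main(args: Array[String]): Unit = {
    println(s"Pi is roughly ${LocalPi.estimatePi(200000)}")
  }
}
```

The Spark version distributes the same sampling loop across `slices` partitions and sums the per-point results with `reduce`, so both versions should converge toward the same value as the sample count grows.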

4. Run the packaging command: