Description
Hi, first of all, this package looks very promising. I'm working on a problem with 70 million rows and 45 million sparse features (up to ~1000 non-zero elements per row). I tried to make it work with Spark ML but seem to be running into OOMEs, and found this package while searching for a solution.
This is what I've done:

```shell
git clone --recursive https://github.com/intel-analytics/SparseML.git
cd SparseML
mvn clean package
```
This produces the error messages below, which look like some issue with the `Logging` import. Any suggestions on how to get past this?
```
sashi@sashi-ws:~/SparseML$ mvn clean package
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building SparseForSpark 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ SparseForSpark ---
[INFO] Deleting /home/sashi/SparseML/target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ SparseForSpark ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/sashi/SparseML/src/main/resources
[INFO]
[INFO] --- scala-maven-plugin:3.1.5:add-source (scala-compile-first) @ SparseForSpark ---
[INFO] Add Source directory: /home/sashi/SparseML/src/main/scala
[INFO] Add Test Source directory: /home/sashi/SparseML/src/test/scala
[INFO]
[INFO] --- scala-maven-plugin:3.1.5:compile (scala-compile-first) @ SparseForSpark ---
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] org.apache.spark.mllib:SparseForSpark:1.0-SNAPSHOT requires scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] /home/sashi/SparseML/src/main/scala:-1: info: compiling
[INFO] Compiling 12 source files to /home/sashi/SparseML/target/classes at 1482249861360
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/Utils/BLAS.scala:3: error: object Logging is not a member of package org.apache.spark
[ERROR] import org.apache.spark.Logging
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/Utils/BLAS.scala:5: error: not found: type Logging
[ERROR] object BLAS extends Serializable with Logging{
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:6: error: object Logging is not a member of package org.apache.spark
[ERROR] import org.apache.spark.{SparkEnv, Logging}
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:16: error: not found: type Logging
[ERROR] extends Optimizer with Logging {
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:107: error: not found: type Logging
[ERROR] object GradientDescent extends Logging {
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:142: error: not found: value logWarning
[WARNING] logWarning("GradientDescent.runMiniBatchSGD returning initial weights, no data found")
[INFO] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:147: error: not found: value logWarning
[WARNING] logWarning("The miniBatchFraction is too small")
[INFO] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:191: error: not found: value logWarning
[WARNING] logWarning(s"Iteration ($i/$numIterations). The size of sampled batch is zero")
[INFO] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/GradientDescent.scala:196: error: not found: value logInfo
[ERROR] logInfo("GradientDescent.runMiniBatchSGD finished. Last 10 stochastic losses %s".format(
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/LBFGS.scala:6: error: object Logging is not a member of package org.apache.spark
[ERROR] import org.apache.spark.{SparkEnv, Logging}
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/LBFGS.scala:20: error: not found: type Logging
[ERROR] extends Optimizer with Logging {
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/LBFGS.scala:105: error: not found: type Logging
[ERROR] object LBFGS extends Logging {
[ERROR] ^
[ERROR] /home/sashi/SparseML/src/main/scala/org/apache/spark/mllib/sparselr/LBFGS.scala:163: error: not found: value logInfo
[ERROR] logInfo("LBFGS.runLBFGS finished. Last 10 losses %s".format(
[ERROR] ^
[ERROR] 13 errors found
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14.268 s
[INFO] Finished at: 2016-12-20T16:04:28+00:00
[INFO] Final Memory: 35M/1863M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.1.5:compile (scala-compile-first) on project SparseForSpark: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
```
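In case it helps: from what I can tell, `org.apache.spark.Logging` was removed from the public/internal API in Spark 2.0, so every file here that imports it fails to compile when the pom resolves a Spark 2.x dependency (the `chill_2.10:0.8.0` warning above hints at that). One workaround I'm considering, pending a proper fix, is dropping a minimal shim trait with the same name and package into the project, backed by slf4j (which Spark already ships). This is just an untested sketch of what I mean, not code from this repo:

```scala
// Hypothetical shim: src/main/scala/org/apache/spark/Logging.scala
// Re-creates the logInfo/logWarning helpers the sources expect,
// delegating to an slf4j logger named after the concrete class.
package org.apache.spark

import org.slf4j.{Logger, LoggerFactory}

trait Logging {
  // Lazy and transient so the logger is not serialized with Spark closures.
  @transient private lazy val log: Logger =
    LoggerFactory.getLogger(getClass.getName.stripSuffix("$"))

  // By-name parameters avoid building the message when the level is disabled.
  protected def logInfo(msg: => String): Unit =
    if (log.isInfoEnabled) log.info(msg)

  protected def logWarning(msg: => String): Unit =
    if (log.isWarnEnabled) log.warn(msg)

  protected def logError(msg: => String): Unit =
    if (log.isErrorEnabled) log.error(msg)
}
```

Would something like that be a reasonable stopgap, or is the intended fix to pin the Spark dependency to a 1.x version in the pom?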