Hands-On Meetup: Spark + Scala = Awesome
Thu, September 6 at 7:00 PM
In preparation for our next hands-on meeting, please download a virtual machine (and in the meantime install either VMware Player or VirtualBox on a PC or laptop with at least 4 GB of RAM). The virtual machine runs Linux and comes with Spark pre-installed (along with some other software tools that Hakim mentioned yesterday):
- There is a downloadable VM on Cloudera's website, their so-called "CDH", with Spark 1.6 and much more: see https://www.cloudera.com/downloads/quickstart_vms/5-13.html
  (It may be a bit too "heavy", but if all you need is Spark it can easily be trimmed down by stopping some of the daemons that start automatically. Also note that Spark 1.6 is a bit "old", although it is still widely used.)
- There is a similar VM on Hortonworks' website, their "HDP": see https://hortonworks.com/products/data-platforms/
If you already have a Linux system and would rather install Spark on it directly, go to https://spark.apache.org/downloads.html and install the latest version (2.3), with Hadoop and (for the Python lovers among you) PySpark.
Concerning Zeppelin: you'll have to install it separately from Spark, even with CDH, but installation is straightforward: see https://zeppelin.apache.org/download.html
(As you can see, it's still at version 0.8, which means "beta"...) The idea is that you install it on the Linux system that runs Spark and that should also be running a web server (which is the case for CDH). Then point your web browser (even from another computer) to http://address-of-your-linux-machine:8080/, that is, to port 8080 on the server, which shows the Zeppelin user interface. From there, just create a new Spark notebook. Have fun!
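For a first notebook paragraph, something along these lines can serve as a quick check that Spark is reachable. It is only a minimal sketch: it assumes the paragraph runs under Zeppelin's Spark interpreter (you may need `%spark` as the first line of the paragraph) or in `spark-shell`, both of which create the SparkContext `sc` for you. The names `numbers` and `evenSquares` are made up for illustration.

```scala
// Paste into a Zeppelin paragraph (Spark interpreter) or into spark-shell.
// Assumption: `sc`, the SparkContext, is already provided by the tool.

// Distribute a small local range across the cluster (here: just the VM).
val numbers = sc.parallelize(1 to 100)

// Transformations are lazy: this only describes the computation.
val evenSquares = numbers.filter(_ % 2 == 0).map(n => n * n)

// Actions such as count() and reduce() actually trigger the job.
println(s"Spark version: ${sc.version}")
println(s"Number of even squares: ${evenSquares.count()}")
println(s"Sum of even squares: ${evenSquares.reduce(_ + _)}")
```

If the paragraph prints a version number and two results without errors, the notebook is talking to Spark and you are ready for the hands-on session. The snippet only uses `sc`, so it should behave the same on Spark 1.6 (the CDH VM) and on Spark 2.3.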