hdp5

A Lap Around Apache Spark on HDP. If you run into any errors while completing this tutorial, please ask questions or notify us on Hortonworks Community Connection! Introduction: This tutorial walks you through many of the newer features of Spark 1.6 on YARN. With YARN, Hadoop can now support many types of data and application workloads; Spark on YARN becomes yet another workload running against the same set of hardware resources. The tutori.. 2016. 3. 25.
Learning the Ropes of the Hortonworks Sandbox. Introduction: This tutorial is aimed at users who do not have much experience using the Sandbox. We will install and explore the Sandbox on virtual machine and cloud environments. We will also navigate the Ambari user interface. Let’s begin our Hadoop journey. Pre-Requisites: Downloaded and Installed Hortonworks Sandbox. Outline: What is the Sandbox? Step 1: Explore the Sandbox in a VM – 1.1 Install t.. 2016. 3. 25.
Hands-on Tour of Apache Spark in 5 Minutes. If you run into any errors while completing this tutorial, please ask questions or notify us on Hortonworks Community Connection! Introduction: Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs in Scala, Java, Python, and R that allow data workers to efficiently execute machine learning algorithms that require fast iterative access to datasets (see Spark.. 2016. 3. 25.
Magellan: Geospatial Analytics on Spark. By Ram Sriharsha on October 20th, 2015. Geospatial data is pervasive in mobile devices, sensors, logs, and wearables. This data’s spatial context is an important variable in many predictive analytics applications. To benefit from spatial context in a predictive analytics application, we need to be able to parse geospatial datasets at scale, join them with target datasets that contain point in spa.. 2016. 3. 25.