Apache Spark Scala Course Description

The Pincorps Apache Spark Scala training module teaches you to build Spark applications in Scala. It provides a clear comparison between Spark and Hadoop and covers techniques for increasing application performance and enabling high-speed processing.

Learners will master Scala programming and be trained on the APIs Spark offers, including Spark Streaming, Spark SQL, Spark RDD, Spark MLlib, and Spark GraphX.

Apache Spark is a fast, in-memory distributed collections framework written in Scala. Employers including Amazon, eBay, NASA JPL, and Yahoo use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster.

 

Apache Spark & Scala Learning Outcomes

  • Understand Scala and its implementation.
  • Apply lazy values, control structures, loops, collections, etc.
  • Learn the concepts of traits and OOP in Scala.
  • Understand functional programming in Scala.
  • Get an insight into Big Data challenges.
  • Understand how Spark acts as a solution to these challenges.
  • Install Spark and run Spark operations in the Spark shell.
  • Understand what RDDs are in Spark.
  • Implement a Spark application on YARN (Hadoop).
  • Analyze Hive and Spark SQL architecture.
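
As a small taste of the Scala features listed above (traits, lazy values, collections, control structures, and functional style), here is a minimal sketch in plain Scala with no Spark required. The names `Greeter`, `Loud`, `Person`, and `ScalaBasics` are made up for illustration and are not taken from the course material.

```scala
// Illustrative only: every name in this file is hypothetical.

// A trait with one abstract member and one concrete method.
trait Greeter {
  def name: String
  def greet: String = s"Hello, $name"
}

// A second trait that can be mixed in to change behaviour.
trait Loud extends Greeter {
  override def greet: String = super.greet.toUpperCase + "!"
}

class Person(val name: String) extends Greeter

object ScalaBasics {
  // Counts how many times the lazy value's initializer runs.
  var initCount = 0

  // A lazy val is computed on first access and then cached.
  lazy val expensive: Int = {
    initCount += 1
    (1 to 100).sum
  }

  // Functional style: filter, map, and sum over an immutable collection.
  def sumOfEvenSquares(xs: Seq[Int]): Int =
    xs.filter(_ % 2 == 0).map(x => x * x).sum

  // Control structure: pattern matching used as an expression.
  def describe(n: Int): String = n match {
    case 0          => "zero"
    case x if x < 0 => "negative"
    case _          => "positive"
  }

  def main(args: Array[String]): Unit = {
    val p = new Person("Spark") with Loud // mix in a trait at creation time
    println(p.greet)                      // HELLO, SPARK!
    println(expensive + expensive)        // 10100 (initializer ran once)
    println(sumOfEvenSquares(1 to 6))     // 56
    println(describe(-3))                 // negative
  }
}
```

The lazy value is deliberately read twice: `initCount` stays at 1, which is the caching behaviour that distinguishes a `lazy val` from a `def`.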

 

Apache Spark & Scala Training – Suggested Audience

This Apache Spark & Scala training is aimed at professionals with some knowledge of functional and object-oriented programming. Suggested attendees, based on our past programs, are:

  • Big Data enthusiasts
  • Software Architects
  • Software Engineers
  • Software Developers
  • Data Scientists
  • Data Engineers
  • Analysts
  • ETL Developers

 

Apache Spark Scala Training Prerequisites

  • Basic familiarity with Linux or Unix.
  • A basic understanding of functional programming and object-oriented programming.
  • Intermediate-level knowledge of Hadoop.
  • Knowledge of Scala is a plus but is not mandatory.

 

Apache Spark & Scala In-house/Corporate Training

If you have a group of 5-6 participants, apply for in-house training. For commercial terms, please email us your group size at hello@pincorps.com

Course Curriculum

Module-1: Introduction to Spark and Analysis
  • Why second-generation frameworks?
  • Introduction to Spark
  • Scala shell
  • Spark Architecture
  • Spark on Cluster
  • Spark Core
  • Spark SQL
  • Spark Streaming
  • Cluster Managers
  • Spark Users
  • What is the use of Spark
  • Spark Versions
  • Spark Storage Layers
  • Download Spark
1. Spark API on a Cluster
2. Cluster Manager
Module-2: Data Loading (HDFS, Amazon S3)
  • 7 different file formats
Module-3: RDDs
  • What is an RDD?
  • Why RDDs?
  • RDD operations
  • Transformations
  • Actions
  • Lazy Evaluation
  • Basic RDDs
  • Caching
  • Converting between RDD types
  • Spark API support for Python, Java, and Scala
  • Working with key-value pairs
  • Creating key-value pair RDDs
  • Transformations on pair RDDs
  • Actions on pair RDDs
  • Advanced Spark operations
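
To make the RDD topics above concrete (transformations, actions, lazy evaluation, caching, and pair RDDs), here is a minimal word-count sketch. It assumes the `spark-core` library is on the classpath and runs Spark in local mode. The object name `RddSketch`, the helper `wordCounts`, and the sample input are hypothetical and are not taken from the course material.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Illustrative only: object and method names are hypothetical.
object RddSketch {
  // Transformations only: nothing is computed until an action runs.
  def wordCounts(lines: RDD[String]): RDD[(String, Int)] =
    lines
      .flatMap(_.split("\\s+"))  // transformation: one record per word
      .filter(_.nonEmpty)        // transformation: drop empty tokens
      .map(word => (word, 1))    // build a key-value (pair) RDD
      .reduceByKey(_ + _)        // pair-RDD transformation: sum per key

  def main(args: Array[String]): Unit = {
    // Assumption: local mode using all available cores.
    val conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val lines = sc.parallelize(Seq("spark scala spark", "scala rdd"))
      // cache() marks the RDD for in-memory reuse across the two actions below.
      val counts = wordCounts(lines).cache()

      // Actions trigger the actual computation.
      println(counts.count())                        // 3 distinct words
      counts.collect().sortBy(_._1).foreach(println) // (rdd,1) (scala,2) (spark,2)
    } finally {
      sc.stop()
    }
  }
}
```

Keeping the transformations in `wordCounts` separate from the actions in `main` shows the lazy-evaluation boundary: building the pipeline is cheap, and work only happens at `count()` and `collect()`.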
Module-4: Spark SQL
  • Spark SQL in applications
  • Spark SQL initialization
  • Spark SQL basic query
  • Schema RDDs
  • Caching
  • Load data from Hive
  • Load data from JSON
  • Load data from RDDs
  • Beeline
  • Long-lived tables and queries
  • Query hands-on
  • Spark SQL UDFs
  • Performance
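
The Spark SQL topics above (initialization, loading data from an RDD, a basic query, and a UDF) can be sketched roughly as follows. This assumes the `spark-sql` library is on the classpath and uses the `SparkSession` and DataFrame API of Spark 2.x and later, which may differ from the API version taught in the course (the "Schema RDD" name comes from earlier releases). The names `SqlSketch`, `adults`, `firstLetter`, and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative only: object, method, UDF, and table names are hypothetical.
object SqlSketch {
  // Returns (first letter of name, age) for every person aged 21 or over.
  def adults(spark: SparkSession): Seq[(String, Int)] = {
    import spark.implicits._ // enables .toDF on RDDs of tuples

    // Load data from an RDD and give the columns names.
    val people = spark.sparkContext
      .parallelize(Seq(("Ana", 34), ("Raj", 19)))
      .toDF("name", "age")

    // Register the DataFrame as a temporary view so SQL can refer to it.
    people.createOrReplaceTempView("people")

    // Register a user-defined function that is callable from SQL.
    spark.udf.register("firstLetter", (s: String) => s.take(1))

    spark
      .sql("SELECT firstLetter(name) AS letter, age FROM people WHERE age >= 21")
      .collect()
      .map(row => (row.getString(0), row.getInt(1)))
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode using all available cores.
    val spark = SparkSession.builder()
      .appName("sql-sketch")
      .master("local[*]")
      .getOrCreate()
    try {
      adults(spark).foreach(println) // (A,34)
    } finally {
      spark.stop()
    }
  }
}
```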
Module-5: Spark Streaming
  • Streaming Architecture
  • Two types of Transformations
  • Streaming UI
  • Sources: Input
  • Core Sources
  • Additional Sources
  • Multiple Sources
  • Cluster Sizing
  • Fault Tolerance
