
Introduction to Apache Spark | Hands-on Spark for Big Data & Machine Learning

Apache Spark, a significant component of the Hadoop ecosystem, is a cluster computing engine used in Big Data. Building on top of Hadoop YARN and HDFS, it offers order-of-magnitude faster processing than MapReduce for many in-memory computing tasks. It can be programmed in Java, Scala, Python, and R (the favorite languages of Data Scientists), along with SQL-based front ends. With advanced libraries like Mahout and MLlib for Machine Learning, GraphX or Neo4j for rich graph data processing, and access to other NoSQL data stores, rule engines, and other enterprise components, Spark is a linchpin of modern Big Data and Data Science computing.

Geared for experienced developers, Introduction to Apache Spark for Big Data & Machine Learning provides students with a comprehensive, hands-on exploration of enterprise-grade Spark programming, interacting with the significant components mentioned above to craft complete data science solutions. Students will leave this course armed with the skills they need to begin working with Spark in a practical, real-world environment.

This course is taught in Python by default, but it can also be delivered in R or Java with advance notice and planning. Our team will work with you to coordinate the languages, tools, and environment that best fit your organization and needs. Please inquire for details.

NOTE: Students who want more depth, including intermediate and advanced Spark developer topics and labs, might consider TTSK7505 Developing with Spark for Big Data | Enterprise-Grade Spark Programming for the Hadoop & Big Data Ecosystem, a five-day superset of this course.

Start date | Class times | Price | Enroll

Why choose TOPTALENT?