PySpark Recipes: A Problem-Solution Approach with PySpark2

Author: Raju Mishra
Publisher: Apress, 2017.
Edition/Format: eBook : Document : English : 1st edition

Details

Genre/Form: Electronic books
Additional Physical Format: Print version: Mishra, Raju Kumar. PySpark Recipes : A Problem-Solution Approach with PySpark2. Berkeley, CA : Apress L. P., c2017
Material Type: Document, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Raju Mishra
ISBN: 9781484231401 1484231406 9781484231418 1484231414
OCLC Number: 1019734041
Description: 1 online resource (280 p.)
Contents:

Chapter 1: The Era of Big Data and Hadoop
Chapter goal: The reader learns about big data and its usefulness, how Hadoop and its ecosystem process big data into useful information, and which shortcomings of Hadoop call for another big data processing platform.
No. of pages: 15-20
Sub-topics:
1. Introduction to big data
2. Big data challenges and processing technology
3. Hadoop, its structure, and its ecosystem
4. Shortcomings of Hadoop

Chapter 2: Python, NumPy and SciPy
Chapter goal: This chapter acquaints the reader with Python, NumPy, and SciPy.
No. of pages: 25-30
Sub-topics:
1. Introduction to Python
2. Python collections, string functions, and classes
3. NumPy and ndarray
4. SciPy

Chapter 3: Spark: Introduction, Installation, Structure and PySpark
Chapter goal: This chapter introduces Spark and its installation on a single machine, continues with the structure of Spark, and finally introduces PySpark.
No. of pages: 15-20
Sub-topics:
1. Introduction to Spark
2. Spark installation on Ubuntu
3. Spark architecture
4. PySpark and its architecture

Chapter 4: Resilient Distributed Dataset (RDD)
Chapter goal: This chapter deals with the RDD, the core of Spark, and with operations on RDDs.
No. of pages: 25-30
Sub-topics:
1. Introduction to RDD and its characteristics
2. Transformations and actions
3. Operations on RDD (such as map, filter, set operations, and many more)

Chapter 5: The Power of Pairs: Paired RDD
Chapter goal: Paired RDDs make many complex computations easier to program. The reader learns about paired RDDs and the operations on them.
No. of pages: 15-20
Sub-topics:
1. Introduction to paired RDD
2. Operations on paired RDD (mapByKey, reduceByKey, ...)

Chapter 6: Advanced PySpark and PySpark Application Optimization
Chapter goal: The reader learns about the advanced PySpark topics broadcast and accumulator, and about PySpark application optimization.
No. of pages: 30-35
Sub-topics:
1. Spark accumulator
2. Spark broadcast
3. Spark code optimization

Chapter 7: IO in PySpark
Chapter goal: This chapter covers PySpark IO: reading and writing .csv and .json files, and connecting to different databases with PySpark.
No. of pages: 20-30
Sub-topics:
1. Reading and writing JSON and .csv files
2. Reading data from HDFS
3. Reading data from and writing data to different databases

Chapter 8: PySpark Streaming
Chapter goal: The reader learns real-time data analysis with PySpark Streaming. This chapter focuses on the PySpark Streaming architecture, discretized stream operations, and windowing operations.
No. of pages: 30-40
Sub-topics:
1. PySpark Streaming architecture
2. Discretized stream and its operations
3. Concept of windowing operations

Chapter 9: SparkSQL
Chapter goal: This chapter introduces SparkSQL and the SparkSQL DataFrame. The reader learns how to run SQL commands using SparkSQL.
No. of pages: 40-50
Sub-topics:
1. SparkSQL
2. SQL with SparkSQL
3. Hive commands with SparkSQL
Responsibility: Mishra, Raju.

Abstract:

Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings. The architecture of Spark, PySpark, and RDD are presented. You will learn to apply RDD to solve day-to-day big data problems. Python and NumPy are included and make it easy for new learners of PySpark to understand and adopt the model.

What You Will Learn:
Understand the advanced features of PySpark2 and SparkSQL
Optimize your code
Program SparkSQL with Python
Use Spark Streaming and Spark MLlib with Python
Perform graph analysis with GraphFrames

Who This Book Is For: Data analysts, Python programmers, big data enthusiasts

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/1019734041> # PySpark Recipes: A Problem-Solution Approach with PySpark2
    a schema:Book, schema:MediaObject, schema:CreativeWork ;
    library:oclcnum "1019734041" ;
    schema:about <http://dewey.info/class/005.133/e23/> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/4498738298#Topic/python_computer_program_language> ; # Python (Computer program language)
    schema:author <http://experiment.worldcat.org/entity/work/data/4498738298#Person/mishra_raju> ; # Raju Mishra
    schema:bookEdition "1st edition" ;
    schema:bookFormat schema:EBook ;
    schema:datePublished "2017" ;
    schema:description "Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings. The architecture of Spark, PySpark, and RDD are presented. You will learn to apply RDD to solve day-to-day big data problems. Python and NumPy are included and make it easy for new learners of PySpark to understand and adopt the model. What You Will Learn Understand the advanced features of PySpark2 and SparkSQL Optimize your code Program SparkSQL with Python Use Spark Streaming and Spark MLlib with Python Perform graph analysis with GraphFrames Who This Book Is For Data analysts, Python programmers, big data enthusiasts"@en ;
    schema:exampleOfWork <http://worldcat.org/entity/work/id/4498738298> ;
    schema:genre "Electronic books"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://worldcat.org/entity/work/data/4498738298#CreativeWork/pyspark_recipes_a_problem_solution_approach_with_pyspark2> ;
    schema:name "PySpark Recipes: A Problem-Solution Approach with PySpark2"@en ;
    schema:productID "1019734041" ;
    schema:url <https://ebookcentral.proquest.com/lib/ucm/detail.action?docID=5191332> ;
    schema:url <https://library.icc.edu/login?url=https://ebookcentral.proquest.com/lib/illcencol-ebooks/detail.action?docID=5191332> ;
    schema:url <http://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9781484231418> ;
    schema:url <http://public.eblib.com/choice/PublicFullRecord.aspx?p=5191332> ;
    schema:url <https://www.safaribooksonline.com/library/view/-/9781484231418/?ar> ;
    schema:workExample <http://worldcat.org/isbn/9781484231418> ;
    schema:workExample <http://worldcat.org/isbn/9781484231401> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/1019734041> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/4498738298#Person/mishra_raju> # Raju Mishra
    a schema:Person ;
    schema:familyName "Mishra" ;
    schema:givenName "Raju" ;
    schema:name "Raju Mishra" ;
    .

<http://experiment.worldcat.org/entity/work/data/4498738298#Topic/python_computer_program_language> # Python (Computer program language)
    a schema:Intangible ;
    schema:name "Python (Computer program language)"@en ;
    .

<http://worldcat.org/entity/work/data/4498738298#CreativeWork/pyspark_recipes_a_problem_solution_approach_with_pyspark2>
    a schema:CreativeWork ;
    rdfs:label "PySpark Recipes : A Problem-Solution Approach with PySpark2" ;
    schema:description "Print version:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1019734041> ; # PySpark Recipes: A Problem-Solution Approach with PySpark2
    .

<http://worldcat.org/isbn/9781484231401>
    a schema:ProductModel ;
    schema:isbn "1484231406" ;
    schema:isbn "9781484231401" ;
    .

<http://worldcat.org/isbn/9781484231418>
    a schema:ProductModel ;
    schema:isbn "1484231414" ;
    schema:isbn "9781484231418" ;
    .

