Spark : the definitive guide : big data processing made simple

Author: Bill Chambers; Matei Zaharia
Publisher: Sebastopol, CA : O'Reilly Media, 2018.
Edition/Format:   Print book : English : First edition
Summary:
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications.
Details

Genre/Form: Nonfiction
Additional Physical Format: Online version:
Chambers, Bill.
Spark, the definitive guide.
[Sebastopol, California] : [O'Reilly Media], [2017]
(OCoLC)988029368
Document Type: Book
All Authors / Contributors: Bill Chambers; Matei Zaharia
ISBN: 9781491912218 1491912219
OCLC Number: 982651178
Notes: Includes index.
Description: xxvi, 576 pages : illustrations ; 24 cm.
Contents: Part 1. Gentle overview of big data and Spark. What is Apache Spark? --
A gentle introduction to Spark --
A tour of Spark's toolset --
Part 2. Structured APIs : DataFrames, SQL, and datasets. Structured API overview --
Basic structured operations --
Working with different types of data --
Aggregations --
Joins --
Data sources --
Spark SQL --
Datasets --
Part 3. Low-level APIs. Resilient distributed datasets (RDDs) --
Advanced RDDs --
Distributed shared variables --
Part 4. Production applications. How Spark runs on a cluster --
Developing Spark applications --
Deploying Spark --
Monitoring and debugging --
Performance tuning --
Part 5. Streaming. Stream processing fundamentals --
Structured streaming basics --
Event-time and stateful processing --
Structured streaming in production --
Part 6. Advanced analytics and machine learning. Advanced analytics and machine learning overview --
Preprocessing and feature engineering --
Classification --
Regression --
Recommendation --
Unsupervised learning --
Graph analytics --
Deep learning --
Part 7. Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) --
Ecosystem and community.
Responsibility: Bill Chambers and Matei Zaharia.

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/982651178> # Spark : the definitive guide : big data processing made simple
    a schema:Book, schema:CreativeWork ;
    library:oclcnum "982651178" ;
    library:placeOfPublication <http://id.loc.gov/vocabulary/countries/cau> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_python> ; # COMPUTERS / Programming Languages / Python
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_servers_computer_programs> ; # Web servers--Computer programs
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_modeling_&_design> ; # COMPUTERS / Data Modeling & Design
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/spark_electronic_resource_apache_software_foundation> ; # Spark (Electronic resource : Apache Software Foundation)
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_databases_data_mining> ; # COMPUTERS / Databases / Data Mining
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/big_data> ; # Big data
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_processing> ; # COMPUTERS / Data Processing
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/electronic_data_processing> ; # Electronic data processing
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_java> ; # COMPUTERS / Programming Languages / Java
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication_message_processing> ; # Telecommunication--Message processing
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_applications_development> ; # Web applications--Development
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/apache_computer_file_apache_group> ; # Apache (Computer file : Apache Group)
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/data_mining_computer_programs> ; # Data mining--Computer programs
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_computer_engineering> ; # COMPUTERS / Computer Engineering
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication> ; # Telecommunication
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/information_retrieval> ; # Information retrieval
    schema:author <http://experiment.worldcat.org/entity/work/data/4223162498#Person/zaharia_matei> ; # Matei Zaharia
    schema:author <http://experiment.worldcat.org/entity/work/data/4223162498#Person/chambers_bill_william_andrew> ; # William Andrew Chambers
    schema:bookEdition "First edition." ;
    schema:bookFormat bgn:PrintBook ;
    schema:datePublished "2018" ;
    schema:description "Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasets-Spark's core APIs-through worked examples Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Spark's stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation."@en ;
    schema:description "Part 1. Gentle overview of big data and Spark. What is Apache Spark? -- A gentle introduction to Spark -- A tour of Spark's toolset -- Part 2. Structured APIs : DataFrames, SQL, and datasets. Structured API overview -- Basic structured operations -- Working with different types of data -- Aggregations -- Joins -- Data sources -- Spark SQL -- Datasets -- Part 3. Low-level APIs. Resilient distributed datasets (RDDs) -- Advanced RDDs -- Distributed shared variables -- Part 4. Production applications. How Spark runs on a cluster -- Developing Spark applications -- Deploying Spark -- Monitoring and debugging -- Performance tuning -- Part 5. Streaming. Stream processing fundamentals -- Structured streaming basics -- Event-time and stateful processing -- Structured streaming in production -- Part 6. Advanced analytics and machine learning. Advanced analytics and machine learning overview -- Preprocessing and feature engineering -- Classification -- Regression -- Recommendation -- Unsupervised learning -- Graph analytics -- Deep learning -- Part 7. Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) -- Ecosystem and community."@en ;
    schema:exampleOfWork <http://worldcat.org/entity/work/id/4223162498> ;
    schema:genre "Nonfiction"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/988029368> ;
    schema:name "Spark : the definitive guide : big data processing made simple"@en ;
    schema:productID "982651178" ;
    schema:workExample <http://worldcat.org/isbn/9781491912218> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/982651178> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/apache_computer_file_apache_group> # Apache (Computer file : Apache Group)
    a schema:CreativeWork ;
    schema:name "Apache (Computer file : Apache Group)" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/spark_electronic_resource_apache_software_foundation> # Spark (Electronic resource : Apache Software Foundation)
    a schema:CreativeWork ;
    schema:name "Spark (Electronic resource : Apache Software Foundation)" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Person/chambers_bill_william_andrew> # William Andrew Chambers
    a schema:Person ;
    schema:familyName "Chambers" ;
    schema:givenName "William Andrew" ;
    schema:givenName "Bill" ;
    schema:name "William Andrew Chambers" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Person/zaharia_matei> # Matei Zaharia
    a schema:Person ;
    schema:familyName "Zaharia" ;
    schema:givenName "Matei" ;
    schema:name "Matei Zaharia" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_computer_engineering> # COMPUTERS / Computer Engineering
    a schema:Intangible ;
    schema:name "COMPUTERS / Computer Engineering"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_modeling_&_design> # COMPUTERS / Data Modeling & Design
    a schema:Intangible ;
    schema:name "COMPUTERS / Data Modeling & Design"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_processing> # COMPUTERS / Data Processing
    a schema:Intangible ;
    schema:name "COMPUTERS / Data Processing"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_databases_data_mining> # COMPUTERS / Databases / Data Mining
    a schema:Intangible ;
    schema:name "COMPUTERS / Databases / Data Mining"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_java> # COMPUTERS / Programming Languages / Java
    a schema:Intangible ;
    schema:name "COMPUTERS / Programming Languages / Java"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_python> # COMPUTERS / Programming Languages / Python
    a schema:Intangible ;
    schema:name "COMPUTERS / Programming Languages / Python"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/data_mining_computer_programs> # Data mining--Computer programs
    a schema:Intangible ;
    schema:name "Data mining--Computer programs"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/electronic_data_processing> # Electronic data processing
    a schema:Intangible ;
    schema:name "Electronic data processing"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/information_retrieval> # Information retrieval
    a schema:Intangible ;
    schema:name "Information retrieval"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication> # Telecommunication
    a schema:Intangible ;
    schema:name "Telecommunication"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication_message_processing> # Telecommunication--Message processing
    a schema:Intangible ;
    schema:name "Telecommunication--Message processing"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_applications_development> # Web applications--Development
    a schema:Intangible ;
    schema:name "Web applications--Development"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_servers_computer_programs> # Web servers--Computer programs
    a schema:Intangible ;
    schema:name "Web servers--Computer programs"@en ;
    .

<http://worldcat.org/isbn/9781491912218>
    a schema:ProductModel ;
    schema:isbn "1491912219" ;
    schema:isbn "9781491912218" ;
    .

<http://www.worldcat.org/oclc/988029368>
    a schema:CreativeWork ;
    rdfs:label "Spark, the definitive guide." ;
    schema:description "Online version:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/982651178> ; # Spark : the definitive guide : big data processing made simple
    .

