Spark : the definitive guide : big data processing made simple

Author: Bill Chambers; Matei Zaharia
Publisher: Sebastopol, CA : O'Reilly Media, 2018.
Edition/Format: Print book : English : First edition
Summary:
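Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library. Get a gentle overview of big data and Spark. Learn about DataFrames, SQL, and Datasets (Spark's core APIs) through worked examples. Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames. Understand how Spark runs on a cluster. Debug, monitor, and tune Spark clusters and applications. Learn the power of Structured Streaming, Spark's stream-processing engine. Learn how you can apply MLlib to a variety of problems, including classification or recommendation.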

Details

Genre/Form: Nonfiction
Additional Physical Format: Online version:
Chambers, Bill.
Spark, the definitive guide.
[Sebastopol, California] : [O'Reilly Media], [2017]
(OCoLC)988029368
Document Type: Book
All Authors / Contributors: Bill Chambers; Matei Zaharia
ISBN: 9781491912218 1491912219
OCLC Number: 982651178
Notes: Includes index.
Description: xxvi, 576 pages : illustrations ; 24 cm.
Contents: Part 1. Gentle overview of big data and Spark. What is Apache Spark? --
A gentle introduction to Spark --
A tour of Spark's toolset --
Part 2. Structured APIs : DataFrames, SQL, and datasets. Structured API overview --
Basic structured operations --
Working with different types of data --
Aggregations --
Joins --
Data sources --
Spark SQL --
Datasets --
Part 3. Low-level APIs. Resilient distributed datasets (RDDs) --
Advanced RDDs --
Distributed shared variables --
Part 4. Production applications. How Spark runs on a cluster --
Developing Spark applications --
Deploying Spark --
Monitoring and debugging --
Performance tuning --
Part 5. Streaming. Stream processing fundamentals --
Structured streaming basics --
Event-time and stateful processing --
Structured streaming in production --
Part 6. Advanced analytics and machine learning. Advanced analytics and machine learning overview --
Preprocessing and feature engineering --
Classification --
Regression --
Recommendation --
Unsupervised learning --
Graph analytics --
Deep learning --
Part 7. Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) --
Ecosystem and community.
Responsibility: Bill Chambers and Matei Zaharia.

Linked Data

Primary Entity

<http://www.worldcat.org/oclc/982651178> # Spark : the definitive guide : big data processing made simple
    a schema:Book, schema:CreativeWork ;
    library:oclcnum "982651178" ;
    library:placeOfPublication <http://id.loc.gov/vocabulary/countries/cau> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_python> ; # COMPUTERS / Programming Languages / Python
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_servers_computer_programs> ; # Web servers--Computer programs
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_modeling_&_design> ; # COMPUTERS / Data Modeling & Design
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/spark_electronic_resource_apache_software_foundation> ; # Spark (Electronic resource : Apache Software Foundation)
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_databases_data_mining> ; # COMPUTERS / Databases / Data Mining
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/big_data> ; # Big data
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_processing> ; # COMPUTERS / Data Processing
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/electronic_data_processing> ; # Electronic data processing
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_java> ; # COMPUTERS / Programming Languages / Java
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication_message_processing> ; # Telecommunication--Message processing
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_applications_development> ; # Web applications--Development
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/apache_computer_file_apache_group> ; # Apache (Computer file : Apache Group)
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/data_mining_computer_programs> ; # Data mining--Computer programs
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_computer_engineering> ; # COMPUTERS / Computer Engineering
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication> ; # Telecommunication
    schema:about <http://experiment.worldcat.org/entity/work/data/4223162498#Topic/information_retrieval> ; # Information retrieval
    schema:author <http://experiment.worldcat.org/entity/work/data/4223162498#Person/zaharia_matei> ; # Matei Zaharia
    schema:author <http://experiment.worldcat.org/entity/work/data/4223162498#Person/chambers_bill_william_andrew> ; # William Andrew Chambers
    schema:bookEdition "First edition." ;
    schema:bookFormat bgn:PrintBook ;
    schema:datePublished "2018" ;
    schema:description "Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasets-Spark's core APIs-through worked examples Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Spark's stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation."@en ;
    schema:description "Part 1. Gentle overview of big data and Spark. What is Apache Spark? -- A gentle introduction to Spark -- A tour of Spark's toolset -- Part 2. Structured APIs : DataFrames, SQL, and datasets. Structured API overview -- Basic structured operations -- Working with different types of data -- Aggregations -- Joins -- Data sources -- Spark SQL -- Datasets -- Part 3. Low-level APIs. Resilient distributed datasets (RDDs) -- Advanced RDDs -- Distributed shared variables -- Part 4. Production applications. How Spark runs on a cluster -- Developing Spark applications -- Deploying Spark -- Monitoring and debugging -- Performance tuning -- Part 5. Streaming. Stream processing fundamentals -- Structured streaming basics -- Event-time and stateful processing -- Structured streaming in production -- Part 6. Advanced analytics and machine learning. Advanced analytics and machine learning overview -- Preprocessing and feature engineering -- Classification -- Regression -- Recommendation -- Unsupervised learning -- Graph analytics -- Deep learning -- Part 7. Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) -- Ecosystem and community."@en ;
    schema:exampleOfWork <http://worldcat.org/entity/work/id/4223162498> ;
    schema:genre "Nonfiction"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/988029368> ;
    schema:name "Spark : the definitive guide : big data processing made simple"@en ;
    schema:productID "982651178" ;
    schema:workExample <http://worldcat.org/isbn/9781491912218> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/982651178> ;
    .

Related Entities

<http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/apache_computer_file_apache_group> # Apache (Computer file : Apache Group)
    a schema:CreativeWork ;
    schema:name "Apache (Computer file : Apache Group)" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#CreativeWork/spark_electronic_resource_apache_software_foundation> # Spark (Electronic resource : Apache Software Foundation)
    a schema:CreativeWork ;
    schema:name "Spark (Electronic resource : Apache Software Foundation)" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Person/chambers_bill_william_andrew> # William Andrew Chambers
    a schema:Person ;
    schema:familyName "Chambers" ;
    schema:givenName "William Andrew" ;
    schema:givenName "Bill" ;
    schema:name "William Andrew Chambers" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Person/zaharia_matei> # Matei Zaharia
    a schema:Person ;
    schema:familyName "Zaharia" ;
    schema:givenName "Matei" ;
    schema:name "Matei Zaharia" ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/big_data> # Big data
    a schema:Intangible ;
    schema:name "Big data"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_computer_engineering> # COMPUTERS / Computer Engineering
    a schema:Intangible ;
    schema:name "COMPUTERS / Computer Engineering"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_modeling_&_design> # COMPUTERS / Data Modeling & Design
    a schema:Intangible ;
    schema:name "COMPUTERS / Data Modeling & Design"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_data_processing> # COMPUTERS / Data Processing
    a schema:Intangible ;
    schema:name "COMPUTERS / Data Processing"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_databases_data_mining> # COMPUTERS / Databases / Data Mining
    a schema:Intangible ;
    schema:name "COMPUTERS / Databases / Data Mining"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_java> # COMPUTERS / Programming Languages / Java
    a schema:Intangible ;
    schema:name "COMPUTERS / Programming Languages / Java"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/computers_programming_languages_python> # COMPUTERS / Programming Languages / Python
    a schema:Intangible ;
    schema:name "COMPUTERS / Programming Languages / Python"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/data_mining_computer_programs> # Data mining--Computer programs
    a schema:Intangible ;
    schema:name "Data mining--Computer programs"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/electronic_data_processing> # Electronic data processing
    a schema:Intangible ;
    schema:name "Electronic data processing"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/information_retrieval> # Information retrieval
    a schema:Intangible ;
    schema:name "Information retrieval"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication> # Telecommunication
    a schema:Intangible ;
    schema:name "Telecommunication"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/telecommunication_message_processing> # Telecommunication--Message processing
    a schema:Intangible ;
    schema:name "Telecommunication--Message processing"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_applications_development> # Web applications--Development
    a schema:Intangible ;
    schema:name "Web applications--Development"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4223162498#Topic/web_servers_computer_programs> # Web servers--Computer programs
    a schema:Intangible ;
    schema:name "Web servers--Computer programs"@en ;
    .

<http://id.loc.gov/vocabulary/countries/cau>
    a schema:Place ;
    dcterms:identifier "cau" ;
    .

<http://worldcat.org/isbn/9781491912218>
    a schema:ProductModel ;
    schema:isbn "1491912219" ;
    schema:isbn "9781491912218" ;
    .

<http://www.worldcat.org/oclc/988029368>
    a schema:CreativeWork ;
    rdfs:label "Spark, the definitive guide." ;
    schema:description "Online version:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/982651178> ; # Spark : the definitive guide : big data processing made simple
    .