Apache Spark 2 : Master Complex Big Data Processing, Stream Analytics, and Machine Learning with Apache Spark.

Author: Romeo Kienzler; Rezaul Karim; Sridhar Alla; Siamak Amirghodsi; Meenakshi Rajendran
Publisher: Birmingham : Packt Publishing Ltd, 2018.
Edition/Format: eBook : Document : English
Summary:
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more.

Details

Genre/Form: Electronic books
Additional Physical Format: Print version:
Kienzler, Romeo.
Apache Spark 2: Data Processing and Real-Time Analytics.
Birmingham : Packt Publishing Ltd, ©2018
Material Type: Document, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Romeo Kienzler; Rezaul Karim; Sridhar Alla; Siamak Amirghodsi; Meenakshi Rajendran; Broderick Hall; Shuen Mei
ISBN: 9781789959918 1789959918
OCLC Number: 1081001090
Notes: Visualizing Spark application using web UI
Description: 1 online resource (604 pages)
Contents: Cover; Title Page; Copyright; About Packt; Contributors; Table of Contents; Preface; Chapter 1: A First Taste and What's New in Apache Spark V2; Spark machine learning; Spark Streaming; Spark SQL; Spark graph processing; Extended ecosystem; What's new in Apache Spark V2?; Cluster design; Cluster management; Local; Standalone; Apache YARN; Apache Mesos; Cloud-based deployments; Performance; The cluster structure; Hadoop Distributed File System; Data locality; Memory; Coding; Cloud; Summary; Chapter 3: Apache Spark Streaming; Overview; Errors and recovery; Checkpointing; Streaming sources; TCP stream; File streams; Flume; Kafka; Summary; Chapter 4: Structured Streaming; The concept of continuous applications; True unification -- same code, same engine; Windowing; How streaming engines use windowing; How Apache Spark improves windowing; Increased performance with good old friends; How transparent fault tolerance and exactly-once delivery guarantee is achieved; Replayable sources can replay streams from a given offset; Idempotent sinks prevent data duplication; State versioning guarantees consistent results after reruns; Example -- connection to a MQTT message broker; Controlling continuous applications; More on stream life cycle management; Summary; Chapter 5: Apache Spark MLlib; Architecture; The development environment; Classification with Naive Bayes; Theory on Classification; Naive Bayes in practice; Clustering with K-Means; Theory on Clustering; K-Means in practice; Artificial neural networks; ANN in practice; Summary; Chapter 6: Apache SparkML; What does the new API look like?; The concept of pipelines; Transformers; String indexer; OneHotEncoder; VectorAssembler; Pipelines; Estimators; RandomForestClassifier; Model evaluation; CrossValidation and hyperparameter tuning; CrossValidation; Hyperparameter tuning; Winning a Kaggle competition with Apache SparkML; Data preparation; Feature engineering; Testing the feature engineering pipeline; Training the machine learning model; Model evaluation; CrossValidation and hyperparameter tuning; Using the evaluator to assess the quality of the cross-validated and tuned model; Summary; Chapter 7: Apache SystemML; Why do we need just another library?; Why on Apache Spark?; The history of Apache SystemML; A cost-based optimizer for machine learning algorithms; An example -- alternating least squares; ApacheSystemML architecture; Language parsing; High-level operators are generated; How low-level operators are optimized on; Performance measurements; Apache SystemML in action; Summary; Chapter 8: Apache Spark GraphX; Overview; Graph analytics/processing with GraphX; The raw data; Creating a graph; Example 1 -- counting; Example 2 -- filtering; Example 3 -- PageRank; Example 4 -- triangle counting; Example 5 -- connected components; Summary; Chapter 9: Spark Tuning; Monitoring Spark jobs; Spark web interface; Jobs; Stages; Storage; Environment; Executors; SQL

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/1081001090> # Apache Spark 2 : Master Complex Big Data Processing, Stream Analytics, and Machine Learning with Apache Spark.
    a schema:CreativeWork, schema:MediaObject, schema:Book ;
    library:oclcnum "1081001090" ;
    library:placeOfPublication <http://experiment.worldcat.org/entity/work/data/8978873329#Place/birmingham> ; # Birmingham
    library:placeOfPublication <http://id.loc.gov/vocabulary/countries/enk> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/8978873329#Topic/big_data> ; # Big data
    schema:about <http://experiment.worldcat.org/entity/work/data/8978873329#CreativeWork/spark_electronic_resource_apache_software_foundation> ; # Spark (Electronic resource : Apache Software Foundation)
    schema:about <http://experiment.worldcat.org/entity/work/data/8978873329#Topic/electronic_data_processing_distributed_processing_management> ; # Electronic data processing--Distributed processing--Management
    schema:bookFormat schema:EBook ;
    schema:contributor <http://experiment.worldcat.org/entity/work/data/8978873329#Person/mei_shuen> ; # Shuen Mei
    schema:contributor <http://experiment.worldcat.org/entity/work/data/8978873329#Person/karim_rezaul> ; # Rezaul Karim
    schema:contributor <http://experiment.worldcat.org/entity/work/data/8978873329#Person/amirghodsi_siamak> ; # Siamak Amirghodsi
    schema:contributor <http://experiment.worldcat.org/entity/work/data/8978873329#Person/alla_sridhar> ; # Sridhar Alla
    schema:contributor <http://experiment.worldcat.org/entity/work/data/8978873329#Person/hall_broderick> ; # Broderick Hall
    schema:contributor <http://experiment.worldcat.org/entity/work/data/8978873329#Person/rajendran_meenakshi> ; # Meenakshi Rajendran
    schema:creator <http://experiment.worldcat.org/entity/work/data/8978873329#Person/kienzler_romeo> ; # Romeo Kienzler
    schema:datePublished "2018" ;
    schema:description "Cover; Title Page; Copyright; About Packt; Contributors; Table of Contents; Preface; Chapter 1: A First Taste and What's New in Apache Spark V2; Spark machine learning; Spark Streaming; Spark SQL; Spark graph processing; Extended ecosystem; What's new in Apache Spark V2?; Cluster design; Cluster management; Local; Standalone; Apache YARN; Apache Mesos; Cloud-based deployments; Performance; The cluster structure; Hadoop Distributed File System; Data locality; Memory; Coding; Cloud; Summary; Chapter 3: Apache Spark Streaming; Overview; Errors and recovery; Checkpointing; Streaming sources"@en ;
    schema:description "Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more."@en ;
    schema:exampleOfWork <http://worldcat.org/entity/work/id/8978873329> ;
    schema:genre "Electronic books"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://worldcat.org/entity/work/data/8978873329#CreativeWork/apache_spark_2_data_processing_and_real_time_analytics> ;
    schema:name "Apache Spark 2 : Master Complex Big Data Processing, Stream Analytics, and Machine Learning with Apache Spark."@en ;
    schema:productID "1081001090" ;
    schema:publication <http://www.worldcat.org/title/-/oclc/1081001090#PublicationEvent/birmingham_packt_publishing_ltd_2018> ;
    schema:publisher <http://experiment.worldcat.org/entity/work/data/8978873329#Agent/packt_publishing_ltd> ; # Packt Publishing Ltd
    schema:url <https://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=5626934> ;
    schema:workExample <http://worldcat.org/isbn/9781789959918> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/1081001090> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/8978873329#Agent/packt_publishing_ltd> # Packt Publishing Ltd
    a bgn:Agent ;
    schema:name "Packt Publishing Ltd" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#CreativeWork/spark_electronic_resource_apache_software_foundation> # Spark (Electronic resource : Apache Software Foundation)
    a schema:CreativeWork ;
    schema:name "Spark (Electronic resource : Apache Software Foundation)" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/alla_sridhar> # Sridhar Alla
    a schema:Person ;
    schema:familyName "Alla" ;
    schema:givenName "Sridhar" ;
    schema:name "Sridhar Alla" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/amirghodsi_siamak> # Siamak Amirghodsi
    a schema:Person ;
    schema:familyName "Amirghodsi" ;
    schema:givenName "Siamak" ;
    schema:name "Siamak Amirghodsi" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/hall_broderick> # Broderick Hall
    a schema:Person ;
    schema:familyName "Hall" ;
    schema:givenName "Broderick" ;
    schema:name "Broderick Hall" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/karim_rezaul> # Rezaul Karim
    a schema:Person ;
    schema:familyName "Karim" ;
    schema:givenName "Rezaul" ;
    schema:name "Rezaul Karim" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/kienzler_romeo> # Romeo Kienzler
    a schema:Person ;
    schema:familyName "Kienzler" ;
    schema:givenName "Romeo" ;
    schema:name "Romeo Kienzler" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/mei_shuen> # Shuen Mei
    a schema:Person ;
    schema:familyName "Mei" ;
    schema:givenName "Shuen" ;
    schema:name "Shuen Mei" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Person/rajendran_meenakshi> # Meenakshi Rajendran
    a schema:Person ;
    schema:familyName "Rajendran" ;
    schema:givenName "Meenakshi" ;
    schema:name "Meenakshi Rajendran" ;
    .

<http://experiment.worldcat.org/entity/work/data/8978873329#Topic/electronic_data_processing_distributed_processing_management> # Electronic data processing--Distributed processing--Management
    a schema:Intangible ;
    schema:name "Electronic data processing--Distributed processing--Management"@en ;
    .

<http://worldcat.org/entity/work/data/8978873329#CreativeWork/apache_spark_2_data_processing_and_real_time_analytics>
    a schema:CreativeWork ;
    rdfs:label "Apache Spark 2: Data Processing and Real-Time Analytics." ;
    schema:description "Print version:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1081001090> ; # Apache Spark 2 : Master Complex Big Data Processing, Stream Analytics, and Machine Learning with Apache Spark.
    .

<http://worldcat.org/isbn/9781789959918>
    a schema:ProductModel ;
    schema:isbn "1789959918" ;
    schema:isbn "9781789959918" ;
    .

<http://www.worldcat.org/title/-/oclc/1081001090>
    a genont:InformationResource, genont:ContentTypeGenericResource ;
    schema:about <http://www.worldcat.org/oclc/1081001090> ; # Apache Spark 2 : Master Complex Big Data Processing, Stream Analytics, and Machine Learning with Apache Spark.
    schema:dateModified "2019-08-16" ;
    void:inDataset <http://purl.oclc.org/dataset/WorldCat> ;
    .

