Practical data science : a guide to building the technology stack for turning data lakes into business assets

Author: Andreas François Vermeulen
Publisher: Berkeley, CA : Apress ; New York, NY : Distributed to the Book trade worldwide by Springer, 2018. ©2018
Edition/Format: eBook : Document : English
Details

Genre/Form: Electronic books
Additional Physical Format: Printed edition:
Material Type: Document, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Andreas François Vermeulen
ISBN: 9781484230541 148423054X
OCLC Number: 1026491074
Notes: Includes index.
Description: 1 online resource (xxv, 805 pages) : illustrations (some color)
Contents: Intro; Table of Contents; About the Author; About the Technical Reviewer; Acknowledgments; Introduction.
Chapter 1: Data Science Technology Stack; Rapid Information Factory Ecosystem; Data Science Storage Tools; Schema-on-Write and Schema-on-Read; Schema-on-Write Ecosystems; Schema-on-Read Ecosystems; Data Lake; Data Vault; Hubs; Links; Satellites; Data Warehouse Bus Matrix; Data Science Processing Tools; Spark; Spark Core; Spark SQL; Spark Streaming; MLlib Machine Learning Library; GraphX; Mesos; Akka; Cassandra; Kafka; Kafka Core; Kafka Streams; Kafka Connect; Elastic Search; R; Scala; Python; MQTT (MQ Telemetry Transport); What's Next?
Chapter 2: Vermeulen-Krennwallner-Hillman-Clark; Windows; Linux; It's Now Time to Meet Your Customer; Vermeulen PLC; Krennwallner AG; Hillman Ltd; Clark Ltd; Processing Ecosystem; Scala; Apache Spark; Apache Mesos; Akka; Apache Cassandra; Kafka; Message Queue Telemetry Transport; Example Ecosystem; Python; Ubuntu; CentOS/RHEL; Windows; Is Python3 Ready?; Python Libraries; Pandas; Ubuntu; CentOS/RHEL; PIP; Matplotlib; Ubuntu; CentOS/RHEL; PIP; NumPy; SymPy; Scikit-Learn; R; Ubuntu; CentOS/RHEL; Windows; Development Environment; R Studio; Ubuntu; CentOS/RHEL; Windows; R Packages; Data.Table Package; ReadR Package; JSONLite Package; Ggplot2 Package; Amalgamation of R with Spark; Sample Data; IP Addresses Data Sets; Customer Data Sets; Logistics Data Sets; Post Codes; Warehouse Data Set; Shop Data Set; Exchange Rate Data Set; Profit-and-Loss Statement Data Set; Summary.
Chapter 3: Layered Framework; Definition of Data Science Framework; Cross-Industry Standard Process for Data Mining (CRISP-DM); Business Understanding; Data Understanding; Data Preparation; Modeling; Evaluation; Deployment; Homogeneous Ontology for Recursive Uniform Schema; The Top Layers of a Layered Framework; The Basics for Business Layer; The Basics for Utility Layer; The Basics for Operational Management Layer; The Basics for Audit, Balance, and Control Layer; Audit; Balance; Control; The Basics for Functional Layer; Layered Framework for High-Level Data Science and Engineering; Windows; Linux; Summary.
Chapter 4: Business Layer; Business Layer; The Functional Requirements; General Functional Requirements; Specific Functional Requirements; Data Mapping Matrix; Sun Models; Dimensions; SCD Type 1 - Only Update; SCD Type 2 - Keeps Complete History; SCD Type 3 - Transition Dimension; SCD Type 4 - Fast-Growing Dimension; Facts; Intra-Sun Model Consolidation Matrix; Sun Model One; Sun Model Two; Sun Model Three; The Nonfunctional Requirements; Accessibility Requirements; Audit and Control Requirements; Availability Requirements; Backup Requirements; Capacity, Current, and Forecast; Capacity; Concurrency; Throughput Capacity; Storage (Memory); Storage (Disk); Storage (GPU); Year-on-Year Growth Requirements; Configuration Management; Deployment; Documentation; Disaster Recovery.
Responsibility: Andreas François Vermeulen.

Abstract:

Learn how to build a data science technology stack and perform good data science with repeatable methods. You will learn how to turn data lakes into business assets. The data science technology stack demonstrated in Practical Data Science is built from components in general use in the industry. Data scientist Andreas Vermeulen demonstrates in detail how to build and provision a technology stack to yield repeatable results. He shows you how to apply practical methods to extract actionable business knowledge from data lakes consisting of data from a polyglot of data types and dimensions. What You'll Learn: become fluent in the essential concepts and terminology of data science and data engineering; build and use a technology stack that meets industry criteria; master the methods for retrieving actionable business knowledge; coordinate the handling of polyglot data types in a data lake for repeatable results.

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/1026491074> # Practical data science : a guide to building the technology stack for turning data lakes into business assets
    a schema:CreativeWork, schema:MediaObject, schema:Book ;
    library:oclcnum "1026491074" ;
    library:placeOfPublication <http://id.loc.gov/vocabulary/countries/cau> ;
    rdfs:comment "Warning: This malformed URI has been treated as a string - 'https://www.safaribooksonline.com/library/view/title/9781484230541/?ar?orpq&email=^u'" ;
    rdfs:comment "Warning: This malformed URI has been treated as a string - 'https://www.safaribooksonline.com/library/view/-/9781484230541/?ar?orpq&email=^u'" ;
    schema:about <http://experiment.worldcat.org/entity/work/data/4802338922#Topic/database_management> ; # Database management
    schema:about <http://dewey.info/class/005.73/e23/> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/4802338922#Topic/computers_databases_data_mining> ; # COMPUTERS--Databases--Data Mining
    schema:about <http://experiment.worldcat.org/entity/work/data/4802338922#Topic/business_mathematics_&_systems> ; # Business mathematics & systems
    schema:about <http://experiment.worldcat.org/entity/work/data/4802338922#Topic/databases> ; # Databases
    schema:about <http://experiment.worldcat.org/entity/work/data/4802338922#Topic/data_mining> ; # Data mining
    schema:about <http://experiment.worldcat.org/entity/work/data/4802338922#Topic/data_structures_computer_science> ; # Data structures (Computer science)
    schema:author <http://experiment.worldcat.org/entity/work/data/4802338922#Person/vermeulen_andreas_francois> ; # Andreas François Vermeulen
    schema:bookFormat schema:EBook ;
    schema:datePublished "2018" ;
    schema:description "Learn how to build a data science technology stack and perform good data science with repeatable methods. You will learn how to turn data lakes into business assets. The data science technology stack demonstrated in Practical Data Science is built from components in general use in the industry. Data scientist Andreas Vermeulen demonstrates in detail how to build and provision a technology stack to yield repeatable results. He shows you how to apply practical methods to extract actionable business knowledge from data lakes consisting of data from a polyglot of data types and dimensions. What You'll Learn: Become fluent in the essential concepts and terminology of data science and data engineering Build and use a technology stack that meets industry criteria Master the methods for retrieving actionable business knowledge Coordinate the handling of polyglot data types in a data lake for repeatable results."@en ;
    schema:description "Intro; Table of Contents; About the Author; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Data Science Technology Stack; Rapid Information Factory Ecosystem; Data Science Storage Tools; Schema-on-Write and Schema-on-Read; Schema-on-Write Ecosystems; Schema-on-Read Ecosystems; Data Lake; Data Vault; Hubs; Links; Satellites; Data Warehouse Bus Matrix; Data Science Processing Tools; Spark; Spark Core; Spark SQL; Spark Streaming; MLlib Machine Learning Library; GraphX; Mesos; Akka; Cassandra; Kafka; Kafka Core; Kafka Streams; Kafka Connect; Elastic Search; R; Scala."@en ;
    schema:exampleOfWork <http://worldcat.org/entity/work/id/4802338922> ;
    schema:genre "Electronic books"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://worldcat.org/entity/work/data/4802338922#CreativeWork/> ;
    schema:name "Practical data science : a guide to building the technology stack for turning data lakes into business assets"@en ;
    schema:productID "1026491074" ;
    schema:url <https://link.springer.com/book/10.1007/978-1-4842-3054-1> ;
    schema:url <https://shu-primo.hosted.exlibrisgroup.com/openurl/44SHU/44SHU_VU1?u.ignore_date_coverage=true&rft.mms_id=99257647202501> ;
    schema:url <https://link.springer.com/book/10.1007/978-1-4842-3053-4> ;
    schema:url <https://link.springer.com/10.1007/978-1-4842-3054-1> ;
    schema:url <https://doi.org/10.1007/978-1-4842-3054-1> ;
    schema:url <https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=1716217> ;
    schema:url <https://proquest.safaribooksonline.com/9781484230541> ;
    schema:url <http://VH7QX3XE2P.search.serialssolutions.com/?V=1.0&L=VH7QX3XE2P&S=JCs&C=TC0001986196&T=marc&tab=BOOKS> ;
    schema:url "https://www.safaribooksonline.com/library/view/title/9781484230541/?ar?orpq&email=^u" ;
    schema:url "https://www.safaribooksonline.com/library/view/-/9781484230541/?ar?orpq&email=^u" ;
    schema:url <https://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=5307290> ;
    schema:url <http://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9781484230541> ;
    schema:workExample <http://dx.doi.org/10.1007/978-1-4842-3054-1> ;
    schema:workExample <http://worldcat.org/isbn/9781484230541> ;
    umbel:isLike <http://bnb.data.bl.uk/id/resource/GBB8O3768> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/1026491074> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/4802338922#Person/vermeulen_andreas_francois> # Andreas François Vermeulen
    a schema:Person ;
    schema:familyName "Vermeulen" ;
    schema:givenName "Andreas François" ;
    schema:name "Andreas François Vermeulen" ;
    .

<http://experiment.worldcat.org/entity/work/data/4802338922#Topic/business_mathematics_&_systems> # Business mathematics & systems
    a schema:Intangible ;
    schema:name "Business mathematics & systems"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4802338922#Topic/computers_databases_data_mining> # COMPUTERS--Databases--Data Mining
    a schema:Intangible ;
    schema:name "COMPUTERS--Databases--Data Mining"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4802338922#Topic/data_structures_computer_science> # Data structures (Computer science)
    a schema:Intangible ;
    schema:name "Data structures (Computer science)"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/4802338922#Topic/database_management> # Database management
    a schema:Intangible ;
    schema:name "Database management"@en ;
    .

<http://worldcat.org/entity/work/data/4802338922#CreativeWork/>
    a schema:CreativeWork ;
    schema:description "Printed edition:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1026491074> ; # Practical data science : a guide to building the technology stack for turning data lakes into business assets
    .

<http://worldcat.org/isbn/9781484230541>
    a schema:ProductModel ;
    schema:isbn "148423054X" ;
    schema:isbn "9781484230541" ;
    .

<http://www.worldcat.org/title/-/oclc/1026491074>
    a genont:InformationResource, genont:ContentTypeGenericResource ;
    schema:about <http://www.worldcat.org/oclc/1026491074> ; # Practical data science : a guide to building the technology stack for turning data lakes into business assets
    schema:dateModified "2019-12-05" ;
    void:inDataset <http://purl.oclc.org/dataset/WorldCat> ;
    .

<https://shu-primo.hosted.exlibrisgroup.com/openurl/44SHU/44SHU_VU1?u.ignore_date_coverage=true&rft.mms_id=99257647202501>
    rdfs:comment "Springer link professional and applied computing 2017 and 2018" ;
    rdfs:comment "Professional and Applied Computing (Springer-12059)" ;
    .

