Data mining : practical machine learning tools and techniques

Author: I H Witten; Eibe Frank; Mark A Hall; Christopher J Pal
Publisher: Amsterdam ; Boston : Elsevier, [2017]
Edition/Format: Print book : English : Fourth Edition
Summary:
This work offers a grounding in machine learning concepts combined with practical advice on applying machine learning tools and techniques in real-world data mining situations.
Details

Document Type: Book
All Authors / Contributors: I H Witten; Eibe Frank; Mark A Hall; Christopher J Pal
ISBN: 9780128042915; 0128042915
OCLC Number: 976423990
Description: xxxii, 621 pages ; 24 cm
Contents: ch. 1 What's it all about? --
1.1. Data Mining and Machine Learning --
Describing Structural Patterns --
Machine Learning --
Data Mining --
1.2. Simple Examples: The Weather Problem and Others --
Weather Problem --
Contact Lenses: An Idealized Problem --
Irises: A Classic Numeric Dataset --
CPU Performance: Introducing Numeric Prediction --
Labor Negotiations: A More Realistic Example --
Soybean Classification: A Classic Machine Learning Success --
1.3. Fielded Applications --
Web Mining --
Decisions Involving Judgment --
Screening Images --
Load Forecasting --
Diagnosis --
Marketing and Sales --
Other Applications --
1.4. Data Mining Process --
1.5. Machine Learning and Statistics --
1.6. Generalization as Search --
Enumerating the Concept Space --
Bias --
1.7. Data Mining and Ethics --
Reidentification --
Using Personal Information --
Wider Issues --
1.8. Further Reading and Bibliographic Notes --
ch. 2 Input: concepts, instances, attributes --
2.1. What's a Concept? --
2.2. What's in an Example? --
Relations --
Other Example Types --
2.3. What's in an Attribute? --
2.4. Preparing the Input --
Gathering the Data Together --
ARFF Format --
Sparse Data --
Attribute Types --
Missing Values --
Inaccurate Values --
Unbalanced Data --
Getting to Know Your Data --
2.5. Further Reading and Bibliographic Notes --
ch. 3 Output: knowledge representation --
3.1. Tables --
3.2. Linear Models --
3.3. Trees --
3.4. Rules --
Classification Rules --
Association Rules --
Rules With Exceptions --
More Expressive Rules --
3.5. Instance-Based Representation --
3.6. Clusters --
3.7. Further Reading and Bibliographic Notes --
ch. 4 Algorithms: the basic methods --
4.1. Inferring Rudimentary Rules --
Missing Values and Numeric Attributes --
4.2. Simple Probabilistic Modeling --
Missing Values and Numeric Attributes --
Naive Bayes for Document Classification --
Remarks --
4.3. Divide-and-Conquer: Constructing Decision Trees --
Calculating Information --
Highly Branching Attributes --
4.4. Covering Algorithms: Constructing Rules --
Rules Versus Trees --
Simple Covering Algorithm --
Rules Versus Decision Lists --
4.5. Mining Association Rules --
Item Sets --
Association Rules --
Generating Rules Efficiently --
4.6. Linear Models --
Numeric Prediction: Linear Regression --
Linear Classification: Logistic Regression --
Linear Classification Using the Perceptron --
Linear Classification Using Winnow --
4.7. Instance-Based Learning --
Distance Function --
Finding Nearest Neighbors Efficiently --
Remarks --
4.8. Clustering --
Iterative Distance-Based Clustering --
Faster Distance Calculations --
Choosing the Number of Clusters --
Hierarchical Clustering --
Example of Hierarchical Clustering --
Incremental Clustering --
Category Utility --
Remarks --
4.9. Multi-instance Learning --
Aggregating the Input --
Aggregating the Output --
4.10. Further Reading and Bibliographic Notes --
4.11. WEKA Implementations --
ch. 5 Credibility: evaluating what's been learned --
5.1. Training and Testing --
5.2. Predicting Performance --
5.3. Cross-Validation --
5.4. Other Estimates --
Leave-One-Out --
Bootstrap --
5.5. Hyperparameter Selection --
5.6. Comparing Data Mining Schemes --
5.7. Predicting Probabilities --
Quadratic Loss Function --
Informational Loss Function --
Remarks --
5.8. Counting the Cost --
Cost-Sensitive Classification --
Cost-Sensitive Learning --
Lift Charts --
ROC Curves --
Recall-Precision Curves --
Remarks --
Cost Curves --
5.9. Evaluating Numeric Prediction --
5.10. MDL Principle --
5.11. Applying the MDL Principle to Clustering --
5.12. Using a Validation Set for Model Selection --
5.13. Further Reading and Bibliographic Notes --
ch. 6 Trees and rules --
6.1. Decision Trees --
Numeric Attributes --
Missing Values --
Pruning --
Estimating Error Rates --
Complexity of Decision Tree Induction --
From Trees to Rules --
C4.5: Choices and Options --
Cost-Complexity Pruning --
Discussion --
6.2. Classification Rules --
Criteria for Choosing Tests --
Missing Values, Numeric Attributes --
Generating Good Rules --
Using Global Optimization --
Obtaining Rules From Partial Decision Trees --
Rules With Exceptions --
Discussion --
6.3. Association Rules --
Building a Frequent Pattern Tree --
Finding Large Item Sets --
Discussion --
6.4. WEKA Implementations --
ch. 7 Extending instance-based and linear models --
7.1. Instance-Based Learning --
Reducing the Number of Exemplars --
Pruning Noisy Exemplars --
Weighting Attributes --
Generalizing Exemplars --
Distance Functions for Generalized Exemplars --
Generalized Distance Functions --
Discussion --
7.2. Extending Linear Models --
Maximum Margin Hyperplane --
Nonlinear Class Boundaries --
Support Vector Regression --
Kernel Ridge Regression --
Kernel Perceptron --
Multilayer Perceptrons --
Radial Basis Function Networks --
Stochastic Gradient Descent --
Discussion --
7.3. Numeric Prediction With Local Linear Models --
Model Trees --
Building the Tree --
Pruning the Tree --
Nominal Attributes --
Missing Values --
Pseudocode for Model Tree Induction --
Rules From Model Trees --
Locally Weighted Linear Regression --
Discussion --
7.4. WEKA Implementations --
ch. 8 Data transformations --
8.1. Attribute Selection --
Scheme-Independent Selection --
Searching the Attribute Space --
Scheme-Specific Selection --
8.2. Discretizing Numeric Attributes --
Unsupervised Discretization --
Entropy-Based Discretization --
Other Discretization Methods --
Entropy-Based Versus Error-Based Discretization --
Converting Discrete to Numeric Attributes --
8.3. Projections --
Principal Component Analysis --
Random Projections --
Partial Least Squares Regression --
Independent Component Analysis --
Linear Discriminant Analysis --
Quadratic Discriminant Analysis --
Fisher's Linear Discriminant Analysis --
Text to Attribute Vectors --
Time Series --
8.4. Sampling --
Reservoir Sampling --
8.5. Cleansing --
Improving Decision Trees --
Robust Regression --
Detecting Anomalies --
One-Class Learning --
Outlier Detection --
Generating Artificial Data --
8.6. Transforming Multiple Classes to Binary Ones --
Simple Methods --
Error-Correcting Output Codes --
Ensembles of Nested Dichotomies --
8.7. Calibrating Class Probabilities --
8.8. Further Reading and Bibliographic Notes --
8.9. WEKA Implementations --
ch. 9 Probabilistic methods --
9.1. Foundations --
Maximum Likelihood Estimation --
Maximum a Posteriori Parameter Estimation --
9.2. Bayesian Networks --
Making Predictions --
Learning Bayesian Networks --
Specific Algorithms --
Data Structures for Fast Learning --
9.3. Clustering and Probability Density Estimation --
Expectation Maximization Algorithm for a Mixture of Gaussians --
Extending the Mixture Model --
Clustering Using Prior Distributions --
Clustering With Correlated Attributes --
Kernel Density Estimation --
Comparing Parametric, Semiparametric and Nonparametric Density Models for Classification --
9.4. Hidden Variable Models --
Expected Log-Likelihoods and Expected Gradients --
Expectation Maximization Algorithm --
Applying the Expectation Maximization Algorithm to Bayesian Networks --
9.5. Bayesian Estimation and Prediction --
Probabilistic Inference Methods --
9.6. Graphical Models and Factor Graphs --
Graphical Models and Plate Notation --
Probabilistic Principal Component Analysis --
Latent Semantic Analysis --
Using Principal Component Analysis for Dimensionality Reduction --
Probabilistic LSA --
Latent Dirichlet Allocation --
Factor Graphs --
Markov Random Fields --
Computing Using the Sum-Product and Max-Product Algorithms --
9.7. Conditional Probability Models --
Linear and Polynomial Regression as Probability Models --
Using Priors on Parameters --
Multiclass Logistic Regression --
Gradient Descent and Second-Order Methods --
Generalized Linear Models --
Making Predictions for Ordered Classes --
Conditional Probabilistic Models Using Kernels --
9.8. Sequential and Temporal Models --
Markov Models and N-gram Methods --
Hidden Markov Models --
Conditional Random Fields --
9.9. Further Reading and Bibliographic Notes --
Software Packages and Implementations --
9.10. WEKA Implementations --
ch. 10 Deep learning --
10.1. Deep Feedforward Networks --
MNIST Evaluation --
Losses and Regularization --
Deep Layered Network Architecture --
Activation Functions --
Backpropagation Revisited --
Computation Graphs and Complex Network Structures --
Checking Backpropagation Implementations --
10.2. Training and Evaluating Deep Networks --
Early Stopping --
Validation, Cross-Validation, and Hyperparameter Tuning --
Mini-Batch-Based Stochastic Gradient Descent --
Pseudocode for Mini-Batch Based Stochastic Gradient Descent --
Learning Rates and Schedules --
Regularization With Priors on Parameters --
Dropout --
Batch Normalization --
Parameter Initialization --
Unsupervised Pretraining --
Data Augmentation and Synthetic Transformations --
10.3. Convolutional Neural Networks --
ImageNet Evaluation and Very Deep Convolutional Networks --
From Image Filtering to Learnable Convolutional Layers --
Convolutional Layers and Gradients --
Pooling and Subsampling Layers and Gradients --
Implementation --
10.4. Autoencoders --
Pretraining Deep Autoencoders With RBMs --
Denoising Autoencoders and Layerwise Training --
Combining Reconstructive and Discriminative Learning --
10.5. Stochastic Deep Networks --
Boltzmann Machines --
Restricted Boltzmann Machines --
Contrastive Divergence --
Categorical and Continuous Variables --
Deep Boltzmann Machines --
Deep Belief Networks --
10.6. Recurrent Neural Networks --
Exploding and Vanishing Gradients --
Other Recurrent Network Architectures --
10.7. Further Reading and Bibliographic Notes --
10.8. Deep Learning Software and Network Implementations --
Theano --
TensorFlow --
Torch --
Computational Network Toolkit --
Caffe --
Deeplearning4j --
Other Packages: Lasagne, Keras, and cuDNN --
10.9. WEKA Implementations --
ch. 11 Beyond supervised and unsupervised learning --
11.1. Semisupervised Learning --
Clustering for Classification --
Cotraining --
EM and Cotraining --
Neural Network Approaches --
11.2. Multi-instance Learning --
Converting to Single-Instance Learning --
Upgrading Learning Algorithms --
Dedicated Multi-instance Methods --
11.3. Further Reading and Bibliographic Notes --
11.4. WEKA Implementations --
ch. 12 Ensemble learning --
12.1. Combining Multiple Models --
12.2. Bagging --
Bias-Variance Decomposition --
Bagging With Costs --
12.3. Randomization --
Randomization Versus Bagging --
Rotation Forests --
12.4. Boosting --
AdaBoost --
Power of Boosting --
12.5. Additive Regression --
Numeric Prediction --
Additive Logistic Regression --
12.6. Interpretable Ensembles --
Option Trees --
Logistic Model Trees --
12.7. Stacking --
12.8. Further Reading and Bibliographic Notes --
12.9. WEKA Implementations --
ch. 13 Moving on: applications and beyond --
13.1. Applying Machine Learning --
13.2. Learning From Massive Datasets --
13.3. Data Stream Learning --
13.4. Incorporating Domain Knowledge --
13.5. Text Mining --
Document Classification and Clustering --
Information Extraction --
Natural Language Processing --
13.6. Web Mining --
Wrapper Induction --
PageRank --
13.7. Images and Speech --
Images --
Speech --
13.8. Adversarial Situations --
13.9. Ubiquitous Data Mining --
13.10. Further Reading and Bibliographic Notes --
13.11. WEKA Implementations.
Responsibility: Ian H. Witten, Eibe Frank, Mark A. Hall, Christopher J. Pal.

Abstract:

Rev. edition of: Data mining: practical machine learning tools and techniques / Ian H. Witten, Eibe Frank, Mark A. Hall. c2013.

Reviews

Editorial reviews

Publisher Synopsis

"...this volume is the most accessible introduction to data mining to appear in recent years. It is worthy of a fourth edition." --Computing Reviews



Linked Data


Primary Entity

<http://www.worldcat.org/oclc/976423990> # Data mining : practical machine learning tools and techniques
    a schema:CreativeWork, schema:Book ;
   library:oclcnum "976423990" ;
   library:placeOfPublication <http://id.loc.gov/vocabulary/countries/ne> ;
   schema:about <http://experiment.worldcat.org/entity/work/data/796629430#Topic/data_mining> ; # Data mining
   schema:about <http://dewey.info/class/006.312/e22/> ;
   schema:author <http://experiment.worldcat.org/entity/work/data/796629430#Person/witten_i_h_ian_h> ; # Ian H. Witten
   schema:author <http://experiment.worldcat.org/entity/work/data/796629430#Person/pal_christopher_j> ; # Christopher J. Pal
   schema:author <http://experiment.worldcat.org/entity/work/data/796629430#Person/hall_mark_a_mark_andrew> ; # Mark Andrew Hall
   schema:author <http://experiment.worldcat.org/entity/work/data/796629430#Person/frank_eibe> ; # Eibe Frank
   schema:bookEdition "Fourth Edition." ;
   schema:bookFormat bgn:PrintBook ;
   schema:datePublished "2017" ;
   schema:description "This work offers a grounding in machine learning concepts combined with practical advice on applying machine learning tools and techniques in real-world data mining situations."@en ;
   schema:exampleOfWork <http://worldcat.org/entity/work/id/796629430> ;
   schema:inLanguage "en" ;
   schema:name "Data mining : practical machine learning tools and techniques"@en ;
   schema:productID "976423990" ;
   schema:workExample <http://worldcat.org/isbn/9780128042915> ;
   wdrs:describedby <http://www.worldcat.org/title/-/oclc/976423990> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/796629430#Person/frank_eibe> # Eibe Frank
    a schema:Person ;
   schema:familyName "Frank" ;
   schema:givenName "Eibe" ;
   schema:name "Eibe Frank" ;
    .

<http://experiment.worldcat.org/entity/work/data/796629430#Person/hall_mark_a_mark_andrew> # Mark Andrew Hall
    a schema:Person ;
   schema:familyName "Hall" ;
   schema:givenName "Mark Andrew" ;
   schema:givenName "Mark A." ;
   schema:name "Mark Andrew Hall" ;
    .

<http://experiment.worldcat.org/entity/work/data/796629430#Person/pal_christopher_j> # Christopher J. Pal
    a schema:Person ;
   schema:familyName "Pal" ;
   schema:givenName "Christopher J." ;
   schema:name "Christopher J. Pal" ;
    .

<http://experiment.worldcat.org/entity/work/data/796629430#Person/witten_i_h_ian_h> # Ian H. Witten
    a schema:Person ;
   schema:familyName "Witten" ;
   schema:givenName "Ian H." ;
   schema:givenName "I. H." ;
   schema:name "Ian H. Witten" ;
    .

<http://worldcat.org/isbn/9780128042915>
    a schema:ProductModel ;
   schema:isbn "0128042915" ;
   schema:isbn "9780128042915" ;
    .

