TY - BOOK
AU - Conway, Drew
AU - White, Myles John
TI - Machine Learning for Hackers
SN - 9789350236741
U1 - 006
PY - 2019///
CY - CA
PB - O'Reilly Media Inc.
KW - Computer Algorithms
KW - Electronic Data Processing
N1 - Machine generated contents note: 1. Using R -- R for Machine Learning -- Downloading and Installing R -- IDEs and Text Editors -- Loading and Installing R Packages -- R Basics for Machine Learning -- Further Reading on R -- 2. Data Exploration -- Exploration versus Confirmation -- What Is Data? -- Inferring the Types of Columns in Your Data -- Inferring Meaning -- Numeric Summaries -- Means, Medians, and Modes -- Quantiles -- Standard Deviations and Variances -- Exploratory Data Visualization -- Visualizing the Relationships Between Columns -- 3. Classification: Spam Filtering -- This or That: Binary Classification -- Moving Gently into Conditional Probability -- Writing Our First Bayesian Spam Classifier -- Defining the Classifier and Testing It with Hard Ham -- Testing the Classifier Against All Email Types -- Improving the Results -- 4. Ranking: Priority Inbox -- How Do You Sort Something When You Don't Know the Order? -- Ordering Email Messages by Priority -- Priority Features of Email -- Writing a Priority Inbox -- Functions for Extracting the Feature Set -- Creating a Weighting Scheme for Ranking -- Weighting from Email Thread Activity -- Training and Testing the Ranker -- 5. Regression: Predicting Page Views -- Introducing Regression -- The Baseline Model -- Regression Using Dummy Variables -- Linear Regression in a Nutshell -- Predicting Web Traffic -- Defining Correlation -- 6. Regularization: Text Regression -- Nonlinear Relationships Between Columns: Beyond Straight Lines -- Introducing Polynomial Regression -- Methods for Preventing Overfitting -- Preventing Overfitting with Regularization -- Text Regression -- Logistic Regression to the Rescue -- 7.
Optimization: Breaking Codes -- Introduction to Optimization -- Ridge Regression -- Code Breaking as Optimization -- 8. PCA: Building a Market Index -- Unsupervised Learning -- 9. MDS: Visually Exploring US Senator Similarity -- Clustering Based on Similarity -- A Brief Introduction to Distance Metrics and Multidirectional Scaling -- How Do US Senators Cluster? -- Analyzing US Senator Roll Call Data (101st--111th Congresses) -- 10. kNN: Recommendation Systems -- The k-Nearest Neighbors Algorithm -- R Package Installation Data -- 11. Analyzing Social Graphs -- Social Network Analysis -- Thinking Graphically -- Hacking Twitter Social Graph Data -- Working with the Google SocialGraph API -- Analyzing Twitter Networks -- Local Community Structure -- Visualizing the Clustered Twitter Network with Gephi -- Building Your Own "Who to Follow" Engine -- 12. Model Comparison -- SVMs: The Support Vector Machine -- Comparing Algorithms
N2 - Now that storage and collection technologies are cheaper and more precise, methods for extracting relevant information from large datasets are within the reach of any experienced programmer willing to crunch data.
ER -