notes  in  machine

Monday, November 05, 07:07PM  by:shuri
software development,
big data,
data quality,
predictive analytics,
machine learning,

source Advanced ETL Functionality and Machine Learning Pre-Processing [Video] - DZone AI
This video is an overview of the pre-processing techniques needed before training a predictive model and of the native KNIME nodes suitable to implement them.
Monday, November 05, 05:48PM  by:shuri

source Michelangelo PyML: Introducing Uber's Platform for Rapid Python ML Model Development

Uber developed Michelangelo PyML to run identical copies of machine learning models locally in both real-time experiments and large-scale offline prediction jobs.


  • "Unsurprisingly, data scientists overwhelmingly prefer to work in Python"
  • "most data scientists prefer to gather data upfront and iterate on their prototypes locally, using tools like pandas, scikit-learn, PyTorch, and TensorFlow."
  • "it can be challenging to ensure that both the online and offline versions of the model are equivalent."
  • "Feature transformations are limited to the vocabulary and expressiveness of Michelangelo’s DSL". We did the same thing in scotch, for better or worse.
Wednesday, September 26, 11:57AM  by:shuri

Lime: Explaining the predictions of any machine learning classifier - marcotcr/lime
Wednesday, September 26, 11:54AM  by:shuri

source What is Shapley value regression and how does one implement it?
I have seen references to Shapley value regression elsewhere on this site, e.g.: "Alternative to Shapley value regression"; "Shapley Value Regression for prediction"; "Shapley value regression / driver
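As I understand it, Shapley value regression attributes a model's R² to each predictor by averaging that predictor's marginal contribution over all orderings of the predictors. A minimal sketch of that averaging step, using only the standard library. The function name `shapley_values` and the R² numbers are made up for illustration; in practice each value would come from fitting a regression on that subset of predictors.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Gain in the value function from adding p to the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}

# Hypothetical R^2 of a regression fitted on each subset of predictors
# (illustrative numbers, not taken from any real dataset).
r_squared = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.40,
    frozenset({"x2"}): 0.30,
    frozenset({"x1", "x2"}): 0.50,
}

phi = shapley_values(["x1", "x2"], r_squared.__getitem__)
for name, share in phi.items():
    print(name, round(share, 3))
```

The shares add up to the R² of the full model, which is what makes the decomposition attractive for driver analysis. Enumerating all orderings grows factorially, so this exact form only suits a handful of predictors.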
Thursday, September 13, 06:07PM  by:shuri

source Running PySpark on Jupyter Notebook with Docker – Suci Lin – Medium
It is much easier to run PySpark with Docker now, especially using an image from the Jupyter repository. When you just want to try or learn Python, it is very convenient to use Jupyter…
Wednesday, September 12, 04:45PM  by:shuri

source Feature Importance and Feature Selection With XGBoost in Python
A benefit of using ensembles of decision tree methods like gradient boosting is that they can automatically provide estimates of feature importance from a trained predictive model. In this post you will discover how you can estimate the importance of features for a predictive modeling problem using the XGBoost library in Python. After reading this …
Wednesday, September 12, 01:59PM  by:shuri

source Train/Test/Validation Set Splitting in Sklearn
How could I randomly split a data matrix and the corresponding label vector into X_train, X_test, X_val, y_train, y_test, y_val with Sklearn? As far as I know, sklearn.cross_validation.train_test...
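A minimal sketch of the three-way split the question asks about, written with only the Python standard library so it runs without scikit-learn. The function name `train_val_test_split` and the 60/20/20 fractions are illustrative assumptions, not a scikit-learn API; with scikit-learn itself, a common approach is to call `sklearn.model_selection.train_test_split` twice.

```python
import random

def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the row indices once, then cut them into three parts."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    n = len(X)
    indices = list(range(n))
    random.Random(seed).shuffle(indices)

    n_test = int(round(n * test_frac))
    n_val = int(round(n * val_frac))
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    def take(seq, idx):
        # Select rows by index so X and y stay aligned.
        return [seq[i] for i in idx]

    return (take(X, train_idx), take(X, val_idx), take(X, test_idx),
            take(y, train_idx), take(y, val_idx), take(y, test_idx))

# Toy data: 10 rows, label is the parity of the first feature.
X = [[i, i * 2] for i in range(10)]
y = [i % 2 for i in range(10)]
X_train, X_val, X_test, y_train, y_val, y_test = train_val_test_split(X, y)
print(len(X_train), len(X_val), len(X_test))  # 6 2 2
```

Shuffling indices rather than the data itself keeps each row paired with its label, and fixing the seed makes the split reproducible.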
Sunday, September 09, 01:44AM  by:shuri
intel ai,
intel software,
intel developer zone,
software developer,
software tools,
developer tools,

source AIDC 2018 | CLOSING KEYNOTE | Andrew Ng, CEO
SUBSCRIBE TO THE INTEL SOFTWARE YOUTUBE CHANNEL: http://bit.ly/2iZTCsz About Intel Software: The Intel® Developer Zone encourages and supports software devel...
Saturday, September 08, 10:13PM  by:shuri
deep learning,

source Nuts and Bolts of Applying Deep Learning (Andrew Ng)
The talks at the Deep Learning School on September 24/25, 2016 were amazing. I clipped out individual talks from the full live streams and provided links to ...
Friday, September 07, 12:16PM  by:shuri

source Differences between L1 and L2 as Loss Function and Regularization
[2014/11/30: Updated the L1-norm vs L2-norm loss function via a programmatically validated diagram. Thanks to readers for pointing out the confusing diagram. Ne...
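A small sketch of the distinction in the title, assuming the usual definitions: L1 sums absolute values and L2 sums squares, whether applied to prediction errors (loss) or to model weights (regularization penalty). The function names and all numbers are made up for illustration.

```python
def l1_loss(y_true, y_pred):
    """Sum of absolute errors (least absolute deviations)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred))

def l2_loss(y_true, y_pred):
    """Sum of squared errors (least squares)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

def l1_penalty(weights, lam):
    """Regularization term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """Regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

y_pred = [1.5, 2.5, 2.5, 3.5]
y_clean = [1.0, 2.0, 3.0, 4.0]     # every error is 0.5
y_outlier = [1.0, 2.0, 3.0, 14.0]  # last target is an outlier

print(l1_loss(y_clean, y_pred), l2_loss(y_clean, y_pred))      # 2.0 1.0
print(l1_loss(y_outlier, y_pred), l2_loss(y_outlier, y_pred))  # 12.0 111.0

weights = [3.0, -4.0]
print(l1_penalty(weights, 0.5), l2_penalty(weights, 0.5))      # 3.5 12.5
```

One outlier raises the L1 loss sixfold but the L2 loss more than a hundredfold, which is why L2 loss is usually described as less robust to outliers.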
Friday, September 07, 11:22AM  by:shuri

source Memorizing is not learning! — 6 tricks to prevent overfitting in machine learning.
Overfitting may be the most frustrating issue of Machine Learning. In this article, we’re going to see what it is, how to spot it, and most importantly how to prevent it from happening. The word…
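One way to see "memorizing is not learning" in a few lines: a 1-nearest-neighbour classifier stores every training point, so it scores perfectly on the data it has seen, yet does worse on fresh data when the labels are noisy. This is a toy sketch on synthetic data; the helper names, the 30% label-noise rate, and the sample sizes are all assumptions for illustration.

```python
import random

def make_data(n, rng, noise=0.3):
    """Points in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

def nearest_label(train, x):
    """1-nearest-neighbour prediction: copy the label of the closest stored point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(train, data):
    hits = sum(nearest_label(train, x) == label for x, label in data)
    return hits / len(data)

rng = random.Random(0)
train = make_data(200, rng)
held_out = make_data(200, rng)

train_acc = accuracy(train, train)        # each point is its own nearest neighbour
held_out_acc = accuracy(train, held_out)  # memorized noise does not transfer
print(f"train accuracy:    {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

A large gap between the two numbers is the usual symptom of overfitting, which is why the comparison is made on data the model never saw during training.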
Friday, September 07, 10:35AM  by:shuri

source Overfitting in Machine Learning: What It Is and How to Prevent It
Overfitting in machine learning can single-handedly ruin your models. This guide covers what overfitting is, how to detect it, and how to prevent it.
Friday, August 17, 06:43PM  by:shuri

source 21 Machine Learning Interview Questions and Answers
Sample answers to 21 machine learning interview questions that could appear in any data scientist or machine learning engineer interview.