Insight Data Science Fellow Program
http://insightdatascience.com/
Deadline 6/22; session starts 9/5
Data Incubator Fellow program
https://www.thedataincubator.com
Deadline 7/3; session starts 9/11
Elite Data Science 7-day crash course
What you'll learn over the next seven days:
- Lesson 1: Bird's-eye view of applied machine learning.
- Key terms: observations, training/test data, features, target variable (label), algorithm, model, parameters, prediction
- ML Tasks:
- Supervised Learning: Regression, Classification
- Unsupervised Learning: Clustering
- 3 Key elements
- Human intuition/guidance (a skilled chef)
- Clean and relevant data (fresh ingredients)
- Avoid overfitting (No overcooking)
- Core steps
- Exploratory Analysis (know your data)
- Data Cleaning (better data beats fancier algorithms)
- Feature Engineering (identify truly independent features)
- Algorithm selection
- Model training
- Project Scoping (deliver business value)
- Data Wrangling (reformat/vectorizing data)
- Preprocessing (transforming data as needed by model)
- Ensembling (evaluate multiple models)
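The core steps above can be strung together in a minimal end-to-end sketch. This assumes scikit-learn and uses a synthetic toy dataset; none of the names or numbers come from the course itself:

```python
# Minimal sketch of the core ML workflow (toy data; scikit-learn assumed).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# "Observations" with "features" and a "target variable (label)"
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out test data so overfitting can be detected later
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Algorithm selection + model training
model = LogisticRegression()
model.fit(X_train, y_train)

# Prediction on unseen observations
acc = accuracy_score(y_test, model.predict(X_test))
print(round(acc, 2))
```

The train/test split is the simplest guard against "overcooking": the model is scored only on observations it never saw during fitting.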
- Lesson 2: The 5 steps to quickly, efficiently, and decisively "get to know" your data.
- Basic information of your dataset: shape, numeric/categorical features, label
- distributions of numeric features
- distributions of categorical features
- segmentation
- correlations between features
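The five "get to know your data" steps map almost one-to-one onto pandas one-liners. A minimal sketch, assuming pandas; the column names and values are made up for illustration:

```python
import pandas as pd

# Toy dataset standing in for any tabular data (columns are hypothetical)
df = pd.DataFrame({
    "age": [22, 35, 58, 41, 29],
    "income": [30000, 52000, 61000, 45000, 38000],
    "segment": ["a", "b", "a", "b", "a"],
})

print(df.shape)                                # basic info: rows x columns
print(df.dtypes)                               # numeric vs. categorical features
print(df.describe())                           # distributions of numeric features
print(df["segment"].value_counts())            # distributions of categorical features
print(df.groupby("segment")["income"].mean())  # segmentation
print(df[["age", "income"]].corr())            # correlations between features
```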
- Lesson 3: How to clean your dataset to avoid costly pitfalls.
- Structural errors
- Unwanted outliers
- Missing data
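Each of the three cleaning pitfalls above has a routine pandas fix. A hedged sketch (the `city`/`price` data is invented; the thresholds are placeholders you would justify per dataset):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["NYC", "nyc ", "Boston", "NYC", "Boston"],
    "price": [250.0, 260.0, np.nan, 9999.0, 240.0],  # one missing, one outlier
})

# Structural errors: inconsistent capitalization and stray whitespace
df["city"] = df["city"].str.strip().str.upper()

# Unwanted outliers: drop only with a clear justification (threshold is illustrative)
df = df[df["price"].isna() | (df["price"] < 1000)]

# Missing data: flag missingness explicitly, then impute, rather than dropping rows
df["price_missing"] = df["price"].isna()
df["price"] = df["price"].fillna(df["price"].median())
print(df)
```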
- Lesson 4: Simple ways to boost performance through feature engineering.
- leverage domain knowledge/common sense to identify key features
- check whether pairs of features are correlated
- consolidate sparse features (insufficient data)
- vectorize categorical features into numeric
- drop unhelpful features
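The feature-engineering steps above can be sketched with pandas. The dataset and the "fewer than 2 observations" cutoff are illustrative assumptions, not from the course:

```python
import pandas as pd

df = pd.DataFrame({
    "property_type": ["apartment", "house", "house", "boat", "yurt", "apartment"],
    "sqft": [700, 1500, 1300, 400, 300, 650],
    "rooms": [2, 5, 4, 1, 1, 2],
})

# Consolidate sparse classes (insufficient data) into a single "other" bucket
counts = df["property_type"].value_counts()
rare = counts[counts < 2].index
df["property_type"] = df["property_type"].replace(rare, "other")

# Domain knowledge / common sense: derive average room size as a new feature
df["sqft_per_room"] = df["sqft"] / df["rooms"]

# Vectorize categorical features into numeric dummy columns
df = pd.get_dummies(df, columns=["property_type"])
print(df.columns.tolist())
```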
- Lesson 5: The most effective algorithms you should master.
- Ensembles are machine learning methods that combine predictions from multiple separate models. (Two common ensembling methods: bagging and boosting)
- Regularization is a technique used to prevent overfitting by artificially penalizing model coefficients. (Three common regularized linear regression methods: Lasso, Ridge, Elastic-Net)
- It can discourage large coefficients (by dampening them).
- It can also remove features entirely (by setting their coefficients to 0).
- The "strength" of the penalty is tunable.
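The regularization points above can be seen directly in scikit-learn. A small sketch on synthetic data (the coefficients 3 and -2, the noise level, and `alpha=0.5` are all arbitrary choices for illustration):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features actually matter
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Lasso (L1) can set coefficients to exactly 0, removing features entirely
lasso = Lasso(alpha=0.5).fit(X, y)  # alpha tunes the penalty "strength"
# Ridge (L2) dampens large coefficients without zeroing them out
ridge = Ridge(alpha=0.5).fit(X, y)

print(np.round(lasso.coef_, 2))
print(np.round(ridge.coef_, 2))
```

Raising `alpha` strengthens the penalty; with Lasso the three irrelevant features are zeroed out, which is why it doubles as a feature-selection tool.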
- Lesson 6: How to maximize model performance while avoiding overfitting.
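The lesson's details aren't captured in these notes, but one standard technique for balancing performance against overfitting is cross-validation. A minimal sketch on toy data (model, depths, and fold count are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# An unconstrained tree can memorize the training data (overfit);
# limiting depth is one simple knob to constrain model complexity
for depth in (None, 3):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(tree, X, y, cv=5)  # held-out folds expose overfitting
    print(depth, round(scores.mean(), 2))
```

Scoring on held-out folds, rather than on the training data, is what makes overfitting visible in the first place.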
- Lesson 7: Next steps for developing in-demand, practical skills.
- Kaggle Guide
- Free Resources collected by EliteDataScience:
Microsoft Professional Program for Data Science
10 required courses (see Data-Science-Curriculum.xlsx), 16-32 hrs per course
8 skills (T-SQL, Microsoft Excel, PowerBI, Python, R, Azure Machine Learning, HDInsight, Spark)
How do I learn deep learning in 2 months?
https://www.quora.com/How-do-I-learn-deep-learning-in-2-months
Does getting the Udacity's machine learning nanodegree help me to get an ML related job?
https://www.quora.com/Does-getting-the-Udacitys-machine-learning-nanodegree-help-me-to-get-an-ML-related-job
Which is a better data science boot camp for someone with a PhD: The Data Incubator (thedataincubator.com) or the Insight Data Science Fellows Program?
https://www.quora.com/Which-is-a-better-data-science-boot-camp-for-someone-with-a-PhD-the-data-incubator-Pageonthedataincubator-com-or-the-insight-data-science-fellows-program-Insight-Data-Science-Fellows-Program
Project Portfolios
MNIST digit recognition using TensorFlow by Hvass Lab
see Udacity-ML-Nanodegree\list-of-projects.txt
Tutorials for Data Science
NumPy Tutorial
https://www.dataquest.io/blog/numpy-tutorial-python/
Python Numpy Array Tutorial - DataCamp
Scipy Tutorial
Scipy Lecture Note (very good)
SciPy Tutorial by Travis E. Oliphant
Matplotlib Tutorial
Pyplot tutorial