• Experience in one or more of the following programming languages: Python, R, MATLAB, Julia
• Experience wrangling messy datasets using pandas or dplyr
• Experience in exploratory data analysis
• Experience in data visualization using one or more of the following packages/tools: seaborn, matplotlib, plotly, ggplot, Tableau
• Knowledge of machine learning techniques and algorithms such as k-nearest neighbors, naive Bayes, support vector machines, random forests, and logistic regression
• Awareness of machine learning concepts such as over-fitting and under-fitting, the difference between bias and variance, a model's ability to generalize to unseen data, and feature engineering
• Excellent written and verbal communication skills for coordinating across teams
• A drive to learn and master new technologies and techniques
Bonus Points
• A Master's or PhD degree in a relevant discipline
• Data engineering experience (e.g., SQL, Hadoop, Spark, cloud computing)
• Competitive programming experience (e.g., ACM, Topcoder, Codeforces)
• Experience participating in machine learning competitions (e.g., Kaggle, HackerEarth)
• Strong statistics background
• An up-to-date portfolio (e.g., on GitHub) showing your experience in all of the above!