A portfolio is a way to show your work and stand out.

5 data science project ideas to highlight in your portfolio

A data science portfolio should highlight job-relevant skills that show you can perform the day-to-day tasks of a data scientist.

1. Data Cleaning

According to a study published by Forbes, data scientists spend on average 80% of their time preparing, managing, and cleaning data. A data preparation project is a good way to show that you can work with both clean and messy data.

  1. Import data
  2. Join multiple data sets
  3. Detect missing values
  4. Look for anomalies (e.g. a numeric column containing 1, 2, 3, “Sundus”)
  5. Impute missing values
  6. Data quality assurance
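
The steps above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library, not a recommended pipeline. The two inline CSV strings, the column names, and the helper functions (`load_rows`, `join_on`, `parse_count`, `clean`) are all made up for the example, and filling gaps with the median is just one of several reasonable imputation choices.

```python
import csv
import io
from statistics import median

# Hypothetical inline data standing in for two CSV files.
CUSTOMERS_CSV = """customer_id,name
1,Ana
2,Ben
3,Cal
4,Dee
"""

ORDERS_CSV = """customer_id,order_count
1,1
2,2
3,
4,Sundus
"""


def load_rows(text):
    """Import: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on(left, right, key):
    """Join: inner-join two lists of dicts on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def parse_count(raw):
    """Return (value, status), where status is 'ok', 'missing', or 'anomaly'."""
    if raw is None or raw.strip() == "":
        return None, "missing"
    try:
        return int(raw), "ok"
    except ValueError:
        return None, "anomaly"


def clean(rows, column):
    """Detect missing and anomalous values, then impute both with the median."""
    parsed = [parse_count(row[column]) for row in rows]
    valid = [value for value, status in parsed if status == "ok"]
    fill = median(valid)
    report = {"missing": 0, "anomaly": 0}
    cleaned = []
    for row, (value, status) in zip(rows, parsed):
        if status != "ok":
            report[status] += 1
            value = fill
        cleaned.append({**row, column: value})
    return cleaned, report


rows = join_on(load_rows(CUSTOMERS_CSV), load_rows(ORDERS_CSV), "customer_id")
cleaned, report = clean(rows, "order_count")

# Quality assurance: every value in the cleaned column must now be numeric.
assert all(isinstance(row["order_count"], (int, float)) for row in cleaned)
print(report)
```

In a real portfolio project you would more likely reach for a data frame library, but writing the steps out by hand makes each decision (what counts as missing, what counts as an anomaly, how to fill the gap) visible to the reader.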

2. Exploratory Data Analysis

Exploratory Data Analysis (EDA) allows you to understand the data and make unexpected discoveries through visualizations and summary statistics.

  1. Formulate a hypothesis and relevant questions
  2. Visualize data to test the hypothesis
  3. Identify key trends in the data
  4. Identify relationships between variables
  5. Communicate results with visualizations (scatter plots, histograms, etc.)
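
As a rough sketch of these steps, the example below tests a simple hypothesis ("weeks with higher ad spend have higher sales") on a tiny made-up data set. The numbers, the variable names, and the helpers `pearson` and `text_histogram` are invented for illustration. A text histogram stands in for a real chart so that the example runs with the standard library alone.

```python
from statistics import mean

# Hypothetical sample: weekly advertising spend and sales for eight weeks.
ad_spend = [10, 12, 15, 18, 20, 24, 27, 30]
sales = [25, 27, 33, 40, 41, 50, 55, 62]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def text_histogram(values, bin_width):
    """Count integer values per bin and draw one row of '#' marks per bin."""
    counts = {}
    for value in values:
        start = (value // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return [
        f"{start:>3}-{start + bin_width - 1:<3} {'#' * counts[start]}"
        for start in sorted(counts)
    ]


# Relationship between variables: a value close to 1 supports the hypothesis.
r = pearson(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.2f}")

# Distribution of a single variable.
histogram = text_histogram(ad_spend, bin_width=10)
print("\n".join(histogram))
```

A correlation near 1 only shows that the two variables move together in this sample. It says nothing about cause, which is worth stating when you present the result.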

3. Interactive Data Visualizations

Interactive visualizations and dashboards allow you and your team to find insights and formulate new hypotheses. Get creative and tell a story with data. This could be the breeding ground for new data science project ideas.

  1. Know your audience
  2. Tell a story
  3. Keep it simple
  4. Use appropriate visualizations
  5. Make insights actionable

4. Machine Learning & Statistics

Applying machine learning and statistical models is a very important aspect of data science work. You don’t have to make it complex; stick with simple models, e.g. linear and logistic regression, which are easy to interpret. Other ideas include classification, clustering, hypothesis testing, etc.

  1. Define the problem and scope
  2. Select a model and explain why you chose it
  3. Engineer features and split the data into training and test sets
  4. Evaluate the model to improve accuracy and predictions
  5. State the conclusion and business impact
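
To make the workflow concrete, here is a minimal sketch that fits a one-feature linear regression by ordinary least squares and evaluates it on held-out data. Everything in it is an assumption made for illustration: the synthetic data (generated as y = 3x + 5 plus noise), the 75/25 split, and the helper names `make_data`, `train_test_split`, `fit_line`, and `r_squared`. It uses only the standard library, so it stands in for whatever modeling library you would normally use.

```python
import random


def make_data(n, seed=0):
    """Hypothetical synthetic data: y = 3x + 5 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 5 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def train_test_split(xs, ys, test_fraction=0.25, seed=0):
    """Shuffle the row indices, then hold out a fraction for testing."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rng.shuffle(indices)
    cut = int(len(indices) * (1 - test_fraction))
    train, test = indices[:cut], indices[cut:]
    return (
        [xs[i] for i in train],
        [ys[i] for i in train],
        [xs[i] for i in test],
        [ys[i] for i in test],
    )


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs, ys = make_data(200)
x_train, y_train, x_test, y_test = train_test_split(xs, ys)

# Fit on the training set only, then score on data the model has not seen.
slope, intercept = fit_line(x_train, y_train)
test_r2 = r_squared(x_test, y_test, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test R^2={test_r2:.3f}")
```

Scoring on the held-out set rather than the training set is the point of the split: it gives a more honest estimate of how the model behaves on new data, which is the number a business stakeholder actually cares about.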

5. Communication

The difference between a good and a great data scientist is communication. You can build all the fancy models you want, but if you can’t explain them to non-technical colleagues, clients, and leaders, you won’t get buy-in. Highlight communication skills in your portfolio.

  1. Know your audience
  2. Explain in clear, simple terms
  3. Tie results to business impact (revenue increase, etc.)
  4. Tell a story and keep it simple

