3 Steps To Build A Data Science Portfolio
To Ultimately Land Your Dream Job
A few months ago I wrote an article, How To Go Into Data Science?, to answer some of the most common questions and challenges that beginners face in data science.
In that article, I briefly talked about WHAT kind of portfolio can help you get your first job in data science or machine learning. But I didn’t go into detail on HOW to build a data science portfolio in the first place.
After the article was published, I started receiving messages from many aspiring data scientists who share a common goal, to become a data scientist (or at least move into a data science related field), and the same question: how do you build a data science portfolio?
I was once a jobless guy with nothing but a piece of paper crafted with three shiny words — Bachelor of Science. I was once an aimless millennial with zero sense of direction in my career and life despite the abundance of opportunities out there, seemingly waiting for me to discover.
Therefore, I can’t tell you how much these messages resonated with me. I feel the struggle to land that internship; I feel the challenge of getting your profile noticed by employers; I feel the frustration of building your data science portfolio when, despite the tons of guides out there, you don’t know where to begin.
We all know that a resume alone is not enough to land you a job in data science. We also know that building a data science portfolio is very important, particularly during the job search.
The question now is: How? How to build a data science portfolio?
This is exactly why this article is here: to condense the experience from my learning journey into the 3 most important steps for building YOUR data science portfolio.
As far as data science is concerned, a portfolio here means public evidence of your data science skills (as defined by David Robinson, Chief Data Scientist at DataCamp, on the Mode Analytics blog).
That being said, this article will not talk about how to build your resume. I believe a resume is not a portfolio, but rather a summary of the elements of your portfolio.
In the following section, we’ll dive straight into the 3 core steps that I personally used (still ongoing!) to build my data science portfolio in the shortest time possible. This is the exact approach I used (and fine-tuned along the way), and since it worked for me, I believe it can work for you as well.
By the end of this article, I hope you’ll have a better understanding of how to build your data science portfolio and ultimately land your dream job in data science.
So… Let’s get started!
3 Steps To Build A Data Science Portfolio
1. Data Science Internship (Or Equivalent)
Yes. Data science related internships.
We’re not only talking about data scientist internships, but also data analyst, data engineer, business intelligence or analyst, research engineer, and other related internships.
The important point is this: As long as the internship requires you to do some form of data collection, analysis, model building, or visualization, the skills learned are highly transferable to any data science job in the market.
So getting a data science relevant internship is the first step. But WHY?
Because employers often look for students or fresh graduates with some experience in data science work. Most importantly, they want to hire someone who can start working on real stuff on the fly with minimum training time, simply because time is money in the corporate world.
Also, having a data science internship is a big boost to your portfolio and resume as a whole. Regardless of your academic background, having this internship shows that you’re serious and passionate about it. It shows that you’re not just another aspiring data scientist who says, “I am very passionate about data science and would like to learn more about it.”
Don’t just talk. Show it.
In my very first article, My Journey from Physics into Data Science, I talked about how I got started with data science internships. After completing my first research internship, I looked for a data analytics internship and worked as a part-time intern while trying to cope with my studies.
That period wasn’t easy. I even pushed myself further by completing my studies one semester early so that I could pursue another internship as a full-time data scientist intern.
My friends and family were utterly confused. They had no idea why I decided to go for an internship instead of a full-time job after my studies. But I knew exactly what I wanted, and therefore my decision was unshakeable. Deep inside my heart, I chose the uncommon path because I believed in my long-term goal instead of chasing short-term gratification.
Why did I share this story with you?
In the pursuit of your goals, you’ll face lots of internal and external doubts and challenges that question your passion, ability and goals. Expect it, embrace it, and do whatever it takes to do what you think is right.
Doing the right thing is always the right thing — Gary Vaynerchuk
2. Projects
Getting data science internships is the first step. But what if you’re just starting out and have zero experience in data science work? How do you get experience when you’re not given an opportunity to get experience?
The answer is by doing projects. In my opinion, there are 2 types of projects: school projects and personal projects. And I’d definitely recommend going for the latter.
Let’s be brutally honest with ourselves for a moment. Think about the current competitive job market. In the sea of candidates seeking employment, each of us is just another candidate with the same goal but different experience.
The question boils down to: How do we stand out among so many candidates before getting selected for an interview? In other words, how can we be the signal instead of noise for employers to pick up easily?
And this is where the power of personal projects comes in.
I’m not saying that school projects are not useful. But school projects can only showcase your capability to a certain extent, and they are not sufficient to convince employers that you’re passionate and good enough.
You see, school projects are typically done in a guided environment and assigned to students to work on in teams. Problems are often well-framed and solutions are usually provided at the end. And there is absolutely nothing wrong with that. Students can of course learn something from these projects.
If you’re doing what everyone else is doing, you’re going to get what everyone else is getting
The problem here is: School projects can’t demonstrate your passion for data science because you just did what was assigned to you. School projects can’t show your full capability because employers can’t differentiate you from your peers in the same team.
On the other hand, personal projects are done outside the course curriculum. They are your side hustles: the side projects that are done solely by you.
Personal projects showcase your passion and capability in data science with experience beyond school coursework. Now you’re different from others: you’re someone who walks the talk; you’re someone who goes the extra mile, does what you love, and does whatever it takes.
What Personal Projects Can You Do?
Participate in Kaggle competitions. This is arguably the most popular platform for various data science projects and competitions. The community on Kaggle is so vibrant and willing to help and learn from one another.
If you’re a beginner starting out in data science, Kaggle Learn is there to guide you through some common programming languages (Python & R), data analysis and visualization tools, and machine learning.
Not only will you learn a lot from Kaggle competitions, but your Kaggle profile and competition rankings can also showcase your proficiency in data analysis and in developing and optimizing models with different machine learning techniques. The same applies to hackathons that companies organize from time to time.
However, there is a misconception here. Kaggle and hackathons alone are not enough to qualify you as a data scientist. They can only add experience, augment your portfolio, and complement your other projects.
Data in the real world is a mess. What you do well in competitions is only one part of the whole data science workflow, which is why the next kind of personal project is important: finding and doing your own project.
Find something that you care and are passionate about. Identify a problem that you want to solve. Collect data (open source data, self-collected data from different sources, or web-scraped data). Apply your knowledge to the data and learn along the journey.
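To make the "collect, clean, analyze" loop a little more concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the rental data is made up, and the column names (city, month, rent) and the function names load_rows and mean_rent_by_city are hypothetical. A real project would likely start from a file or a scraper and use a library such as pandas.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up sample rows standing in for data you collected yourself.
# Note the mess: a missing rent value and an inconsistently typed city name.
RAW_CSV = """city,month,rent
Kuala Lumpur,2019-01,1800
Kuala Lumpur,2019-02,
Penang,2019-01,1200
Penang,2019-02,1300
penang ,2019-03,1250
"""


def load_rows(text):
    """Parse CSV text, normalise city names, and drop rows with no rent."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rent = row["rent"].strip()
        if not rent:
            continue  # skip missing values rather than guessing them
        rows.append({"city": row["city"].strip().title(), "rent": float(rent)})
    return rows


def mean_rent_by_city(rows):
    """Group the cleaned rows by city and average the rent in each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["city"]].append(row["rent"])
    return {city: statistics.mean(values) for city, values in grouped.items()}


if __name__ == "__main__":
    for city, rent in mean_rent_by_city(load_rows(RAW_CSV)).items():
        print(f"{city}: {rent:.0f}")
```

Even in a toy example like this, most of the code deals with cleaning rather than modelling, which is a fair preview of what a self-collected dataset tends to look like.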
One of the articles that I like the most is The cold start problem: how to build your machine learning portfolio. In the article, the author talks about how two candidates went all-in, from collecting data (which took them days to weeks!) and cleaning it to building some cool machine learning models to solve interesting problems. Check out the article and you’ll know why I think their approach was insane but extremely special and eye-catching to employers. Needless to say, one of the candidates grabbed the attention of employers and the other candidate got hired even before he finished his project.
Volunteer to help NGOs or companies, for FREE
This is just one of my methods to build my portfolio. The main purpose here is this: Build your portfolio, regardless of whether it is paid or unpaid.
Sometimes what we need is nothing but an opportunity: an opportunity to learn while helping NGOs and companies solve their problems using data.
The benefits are twofold: you learn and build your portfolio while adding value by solving problems for organizations. Who knows? You might be considered for full-time employment after the projects are completed.
3. Social Media
Imagine now that you have a solid portfolio with data science internships and involvement in various projects.
After spending countless hours working on your internships and side projects, you know you’re well-prepared with knowledge and experience that could land you a job in data science. But you have nowhere to showcase your capability and portfolio except on a piece of paper: your resume.
I hate to say it, but the reality is: A resume can only take you so far with little or no social presence at all. Nowadays, the typical way of applying for jobs is through online job portals (JobStreet, Glassdoor, Indeed etc.), which, again, means going through online platforms.
Therefore, having your online profiles is in fact part of your portfolio to get noticed by hiring managers.
Now you may have a question: If everyone has their own portfolios online on social media, then what makes you stand out? My answer — Personal branding.
Personal branding is not about faking your own brand and experience just to impress hiring managers or employers. This is not what personal branding is about.
Personal branding is about being YOU — the authentic self with your belief, your own story and experience that demonstrate expertise and authority in your niche.
In other words, you need to know what you love and find your niche. You need to know how to position yourself and market your personal brand by leveraging social media and letting it speak for you. You need to provide value to others while creating and sharing what you love.
Personally, I use three online platforms to develop my personal branding and social presence: Medium, LinkedIn and GitHub. They have helped my data science career tremendously. Again, I’m sharing what I have done and what works for me, and hopefully it works for you as well.
You may be active on Instagram, Twitter or Pinterest and are building your social presence there. This is perfectly fine as long as you are using social media in the right way.
If you’ve been following my work, you may have read Why Do I Write About Data Science. In the last part I mentioned one word: Opportunities.
This is closely related to personal branding. Personal branding brings opportunities. It’s that simple.
Through writing on Medium, I’ve been able to touch the lives of many by sharing my experience and guiding aspiring data scientists. Through writing on Medium, I was approached by various publications, magazines, and companies to write for them. Through writing on Medium, I’ve been able to connect with so many brilliant people and professionals in data science and learn from them.
Day by day, week by week, month by month… I keep writing and building my portfolio through words. At some point I almost gave up because I felt like I had nothing to share. But I stayed persistent and consistent. I kept writing as long as I could.
Because I believe every one of us has a unique voice to share and nobody knows our story better than ourselves. So when you’re having doubt next time, stay focused and know why you’re doing what you’re doing. Regain the motivation and momentum and be consistent.
Whenever you don’t know what to write about, just take a look at David Robinson’s advice.
Here are some of the topics that you can write about:
Document your learning journey and share your mistakes and takeaways
Explain technical concepts to others in a simpler way
Get some open source data and analyze it. Then communicate your results and what you learned with attractive visualizations
Have you faced certain common challenges? Share what the challenges are and how you solved them, because if you encountered these difficulties, chances are other beginners will face the same situation as well
Writing is about your thinking process. It trains and improves your ability to explain things to others, one of the most important skills for success in data science.
According to LinkedIn job statistics based on data from December 2018, 94% of recruiters use LinkedIn to vet candidates and 48% of recruiters use LinkedIn for social outreach.
So what does this data tell us? With 500 million users on LinkedIn, this is HUGE.
LinkedIn is no longer just a platform for job searching. In fact, it is way more than that. The opportunities are seriously abundant if you put yourself out there and build your portfolio on LinkedIn. Share your knowledge and learn from the close-knit data science community on LinkedIn.
With the number of recruiters and headhunters on LinkedIn, your sharing will gain traction and your profile will get noticed by them even without you noticing. This is why building a portfolio on social media is so important yet it is still an underrated channel for most job seekers.
And now, I’m democratizing the sharing-learning environment on LinkedIn by initiating discussion on various data science topics among aspiring data scientists, data scientists, and other data professionals around the world. I truly believe that education is about sharing what you’ve learned and learning from others through meaningful discussions and conversation.
If you don’t have a LinkedIn profile, I strongly encourage you to create one. If you already have a LinkedIn profile, come join me on LinkedIn to engage with the data science community. Trust me, you’ll be blown away by how vibrant and helpful the community is.
A GitHub profile is a powerful way to showcase your competency as a data scientist.
Learn how to use Git because this is how most (if not all) developers and data scientists collaborate on projects in the real world. Understand the Git workflow and learn the common operations (commit, push, pull requests etc.).
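As a rough illustration of that workflow, the sketch below lists the Git commands for a basic feature-branch cycle and prints them in order. The helper names feature_branch_commands and run_workflow, the branch name, and the file path are all hypothetical. By default the script only prints the commands (a dry run); it assumes you would run them inside an existing repository that has a remote named origin.

```python
import shlex
import subprocess


def feature_branch_commands(branch, files, message, remote="origin"):
    """Return the Git commands for a basic feature-branch workflow."""
    return [
        ["git", "switch", "-c", branch],        # create and move to a new branch
        ["git", "add", *files],                 # stage the changed files
        ["git", "commit", "-m", message],       # record the staged snapshot
        ["git", "push", "-u", remote, branch],  # publish the branch to the remote
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    steps = feature_branch_commands(
        branch="add-eda-notebook",       # hypothetical branch name
        files=["notebooks/eda.ipynb"],   # hypothetical file path
        message="Add exploratory data analysis notebook",
    )
    run_workflow(steps)  # dry run: shows the commands without executing them
```

Note that a pull request is not a Git command. After the push, you would open it on GitHub to ask collaborators to review the branch and merge it.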
Put some time into learning how to document your projects on GitHub using README.md. This makes your code more reproducible and understandable for others, or at least for hiring managers. Besides, a well-documented project with code on GitHub shows that you’re able to communicate results to the public and collaborate with others.
This may seem trivial to you at first. But you’ll begin to see its importance when you’re collaborating with others or trying to communicate your results to other stakeholders in future. Also, hiring managers will be able to understand your thought process on how you’ve approached and solved a problem by reading your documentation on GitHub.
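If you are not sure what a README.md should contain, the sketch below generates a simple skeleton. The section names (Problem, Data, How to reproduce, Results) are just one reasonable layout that I am assuming here, not a GitHub requirement, and the function name build_readme and the example project details are hypothetical.

```python
def build_readme(title, problem, data_source, steps, results):
    """Assemble a plain Markdown README from a few short descriptions."""
    lines = [
        f"# {title}",
        "",
        "## Problem",
        problem,
        "",
        "## Data",
        data_source,
        "",
        "## How to reproduce",
    ]
    # Number the steps so a reader can rerun the analysis in order.
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines += ["", "## Results", results, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical project details, for illustration only.
    print(build_readme(
        title="Rental Price Analysis",
        problem="Which factors are associated with monthly rent?",
        data_source="Listings collected by the author (describe how and when).",
        steps=["Install the requirements", "Run the cleaning script", "Open the notebook"],
        results="Summarise the main finding in two or three sentences.",
    ))
```

The point is less the script than the checklist: a reader should be able to tell what problem you tackled, where the data came from, how to rerun your work, and what you found.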
Phew! This is another long post after my previous longest article, How To Go Into Data Science.
If you’ve made it all the way here, thank you for reading.
I’ve been wanting to write this article because I feel it would be beneficial to others who are trying to build their data science portfolio. I hope that sharing my experience and the steps I personally used to build my portfolio will help you build your own and ultimately land your dream job with abundant opportunities.
Let me know if this article is useful to you by leaving your comments below.
As always, if you have any questions, feel free to leave your comments below. Till then, see you in the next post!