Companies commonly say they care about skills, not degrees. But which skills exactly? Let's scrape machine learning jobs on LinkedIn to find out.
Arslan Ashraf
March 2024
In this guide, we will scrape 1000 junior and mid-level machine learning job postings on LinkedIn. LinkedIn uses JavaScript and various tricks to load content dynamically as a user clicks or scrolls through the page, so a plain HTTP request does not return the full listing. We will use Selenium WebDriver, which drives a real browser, to scrape these dynamically loading pages.
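To make the idea of scraping a dynamically loading page concrete, here is a minimal sketch of the scroll-and-wait pattern. It is an illustration rather than the exact code used in this guide: the helper names `make_driver` and `scroll_until_stable`, the pause length, and the scroll limit are all assumptions, and it presumes Selenium 4 or later with a local Chrome install.

```python
import time


def make_driver():
    """Start a Chrome session (assumes Selenium 4+ and Chrome are installed)."""
    from selenium import webdriver  # imported lazily so the helper below has no hard dependency
    return webdriver.Chrome()


def scroll_until_stable(driver, pause_seconds=2.0, max_scrolls=50):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `pause_seconds` and `max_scrolls` are hypothetical defaults; tune them
    for the page being scraped. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    scrolls = 0
    while scrolls < max_scrolls:
        # Jump to the bottom so the page's JavaScript loads the next batch.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give the new content time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        scrolls += 1
        if new_height == last_height:
            break  # nothing new was loaded, so we have reached the end
        last_height = new_height
    return scrolls
```

A typical call would be `driver = make_driver()`, then `driver.get(url)` with the search page URL, then `scroll_until_stable(driver)`, after which the loaded postings can be read from the page source.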
Once we have all the data, we will assemble the job descriptions and count the terms they mention most often. We will use n-grams, specifically unigrams, bigrams, and trigrams (sequences of one, two, and three consecutive words), to learn which terms are the most important and frequently mentioned.
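As a rough sketch of what n-gram counting looks like, the snippet below tokenizes each job description and tallies its n-grams with the standard library. The function names (`tokenize`, `ngrams`, `top_ngrams`) and the tokenization rule are assumptions made for illustration, not necessarily what the final analysis uses.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens.

    The pattern keeps letters, digits, '+' and '#' so that terms such as
    'c++' and 'c#' survive; this rule is an illustrative assumption.
    """
    return re.findall(r"[a-z0-9+#]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(descriptions, n, k=10):
    """Count n-grams across all descriptions and return the k most common.

    N-grams are built per description, so they never span two postings.
    """
    counts = Counter()
    for description in descriptions:
        counts.update(ngrams(tokenize(description), n))
    return counts.most_common(k)
```

Calling `top_ngrams(descriptions, 1)`, `top_ngrams(descriptions, 2)`, and `top_ngrams(descriptions, 3)` on the list of scraped descriptions gives the unigram, bigram, and trigram rankings respectively.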
The use of trigrams, in particular, helps us discover how popular the term "natural language processing" is in the days of large language models. We will present our results in a Streamlit application at the link below.