Very often, as a data scientist, you may be faced with a task that covers a complete pipeline: from data collection to deploying the app on a server. I ran into such an odd job during an interview process while job hunting. The focus was not on developing the most accurate or most complex model but on showing a good grasp of Machine Learning and NLP concepts.
In this article, I will show you how to deploy a BERT model and a preprocessing pipeline. …
In this tutorial, you will see a binary text classification implementation using transfer learning. For this purpose, we will use DistilBERT, a pre-trained model from the Hugging Face Transformers library, and its API for TensorFlow.
Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT.
The header of the review article from Hugging Face on Medium explains why we should use this model for our task. We have a small data set, and this model is a nice first choice to try. Also, another article on Medium suggests using DistilBERT…
This time I will explain (with full code examples) how to create a web scraper in eight steps using the Selenium Python framework.
I will use the recipe site https://www.simplyrecipes.com/. The subject of this post, data collection, can be a basic part of any Data Science project.
I chose this website because it contains just the data I need for my NLP objective. Additionally, Step 3, Step 5, and Step 7 of this tutorial cover some specific issues (Selenium exceptions) that can arise during web crawling. …
Data pre-processing is a fundamental part of a data scientist's work. Apart from data collection, it is one of the principal stages. Our future model’s quality and accuracy depend on it. The better we clean/prepare the data:
So what is pre-processing in our current case?
In simple words: it is the process of text transformations. You have to make text useful for the analysis and prediction of your business goal.
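To make "text transformations" concrete, here is a minimal sketch of what such a step might look like for recipe text. It uses only the Python standard library. The function name preprocess, the tiny STOP_WORDS set, and the choice to drop digits are assumptions made for illustration, not the pipeline from the original article.

```python
import re
import string
from typing import List

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in"}

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> List[str]:
    """Lowercase, drop digits and punctuation, split, remove stop words."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)      # assumption: quantities are not needed
    text = text.translate(_PUNCT_TABLE)   # punctuation becomes whitespace
    tokens = text.split()                 # split on any run of whitespace
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("Add 2 cups of flour, and mix!"))
```

Whether to keep numbers, stop words, or punctuation depends on the business goal; for an ingredients/instructions classifier, quantities might well be a useful signal, so treat each step here as optional.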
…
This is my first tutorial about web scraping. I will explain (with full code examples) how to create a web scraper using the BeautifulSoup and Grequests Python libraries.
Assume you have an NLP task: collect text data from a recipe website and perform a binary classification (ingredients/instructions). Let’s scrape the data from the recipe site https://www.loveandlemons.com/. For this purpose, we will use two popular, beginner-friendly libraries: BeautifulSoup and Grequests.
BeautifulSoup is an open-source, completely free library that makes it easy to scrape information from web pages. It sits on top of an HTML or XML parser…
Let’s start with the background of this race for the truth.
In the first week of our boot camp training (winter 2020), we got our first team task: to make a presentation “within 20 minutes” on one of 12 topics. No one attached much importance to this: there were still four weeks left before the day of the presentation. But two days later the conditions changed: my teammate Yoav Vollansky and I had to speak about the differences between Python 2 and Python 3 in front of an audience in just 3 days. A …
Passionate about technologies, love challenges, talented NLP data scientist. https://www.linkedin.com/in/galina-blokh/