STEP-BY-STEP GUIDE

How to fine-tune DistilBERT for binary text classification via the Hugging Face API for TensorFlow.

Photo by Jason Leung on Unsplash

Intro.

Why DistilBERT.

Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT.

The title of Hugging Face's review article on Medium sums up why we should use this model for our task. We have a small data set, and this model can be a nice…


STEP-BY-STEP GUIDE

Data science insights while you are scraping a platform for data science practice.

Photo by Maxwell Nelson on Unsplash

Introduction.

I will use the recipe site https://www.simplyrecipes.com/. The subject of this post is a basic part of any data science project: data collection.

I chose this website because it contains exactly the data I need for my NLP objective. Additionally, Step 3, Step 5, and Step 7 of this tutorial cover some specific issues (Selenium exceptions) that can arise during web crawling. …


STEP-BY-STEP GUIDE

Pre-process text data and create new features (including the target variable for binary classification) with Python: NumPy, pandas, regex, spaCy, and TensorFlow.

Photo by Pixabay from Pexels

Intro.

  • the less we need to tune the model on the next stage,
  • the simpler the model we can apply,
  • the more insights/patterns we may see,
  • the more accurate our model can be.

So what is pre-processing in our current case?

Put simply, it is the process of transforming text. You have to make the text useful for analysis and for predictions that serve your business goal.
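As a rough illustration of what such text transformations can look like, here is a minimal sketch that uses only Python's built-in `re` module. The function name `clean_text`, the sample sentence, and the particular cleaning steps (lowercasing, dropping HTML tags, keeping letters only) are assumptions chosen for the example, not the exact pipeline from the article.

```python
import re


def clean_text(raw: str) -> str:
    """Hypothetical cleaning step: lowercase, strip tags, keep letters only."""
    text = raw.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # drop HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)  # replace digits and punctuation with spaces
    text = re.sub(r"\s+", " ", text)       # collapse runs of whitespace
    return text.strip()


# Made-up sample line, similar to what a scraped recipe page might contain.
sample = "<p>Mix 2 cups of flour,   then BAKE for 30 minutes!</p>"
print(clean_text(sample))  # prints: mix cups of flour then bake for minutes
```

Whether digits and punctuation should be removed depends on the task: in recipe text, quantities may carry signal, so a real pipeline might keep them.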


Step-by-step guide

How to build a web scraper with BeautifulSoup and asynchronous HTTP requests (Grequests)

Photo by Artem Sapegin on Unsplash

Introduction.

Suppose you have an NLP task: collect text data from a recipe website and build a binary classifier that separates ingredients from instructions. Let’s scrape the data from the recipe site https://www.loveandlemons.com/. For this purpose, we will use two popular, beginner-friendly libraries: BeautifulSoup and Grequests.
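To make the task concrete, here is a small sketch of the parsing half only: turning HTML into labeled rows for the ingredients/instructions classification. The HTML snippet, its class names, and the helper `extract_labeled_rows` are hypothetical; the real site's markup is not shown here and will differ. Fetching the pages (for example with Grequests) is left out so the example runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; a real recipe page will be structured differently.
HTML = """
<div class="recipe">
  <ul class="ingredients"><li>2 cups flour</li><li>1 egg</li></ul>
  <ol class="instructions"><li>Mix the flour and egg.</li></ol>
</div>
"""


def extract_labeled_rows(html):
    """Return (text, label) pairs, one per list item in each labeled block."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for label in ("ingredients", "instructions"):
        for block in soup.find_all(class_=label):
            for item in block.find_all("li"):
                rows.append((item.get_text(strip=True), label))
    return rows


for text, label in extract_labeled_rows(HTML):
    print(label, "|", text)
```

Each pair can then become one row of the training set, with the label serving as the binary target.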

Definitions.


Python history

The fascinating quest of figuring out the difference between.

Photo by Christina Morillo from Pexels

Let’s start with the background of this race for the truth.

In the first week of our boot camp training (winter 2020), we got our first team task: to give a presentation “within 20 minutes” on one of 12 topics. No one attached much importance to this, since there were still four weeks left before the day of the presentation. But two days later the conditions changed: my teammate Yoav Vollansky and I now had to speak to the audience about the differences between Python 2 and Python 3 in just 3 days. A …

Galina Blokh

Passionate about technologies, love challenges, talented NLP data scientist. https://www.linkedin.com/in/galina-blokh/
