
Bagging to BERT tutorial

Ben · Jan 24, 2023 · 1 min read

Well, it’s official. I’m bad at blogging.

But! It’s never too late to restart, so I’m getting back to it.

I wanted to share some materials from a tutorial I gave at PyData NYC this past year. I’m very proud of how it came out, and the response from the audience was excellent.

Tutorial github repository

Video of the talk

The focus of this tutorial was providing an overview of NLP methods. I aimed to create a set of exercises that built on one another. I started with fundamentals like tokenization and word counts, added complexity in the form of learned weights (TF-IDF and topic models), then began to build in the transfer learning of modern NLP.
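That progression from raw counts to learned weights can be sketched in a few lines of plain Python. (The toy corpus, whitespace tokenizer, and variable names here are my own illustration, not material from the tutorial itself.)

```python
import math
from collections import Counter

# A toy corpus standing in for whatever dataset the exercises use.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are friends",
]

# Step 1 - tokenization: the simplest possible scheme, whitespace splitting.
tokens = [doc.split() for doc in docs]

# Step 2 - word counts: raw term frequencies per document.
counts = [Counter(t) for t in tokens]

# Step 3 - TF-IDF: reweight counts so terms that appear in many
# documents contribute less than terms specific to one document.
n_docs = len(docs)
df = Counter(term for c in counts for term in set(c))  # document frequency
tfidf = [
    {term: tf * math.log(n_docs / df[term]) for term, tf in c.items()}
    for c in counts
]
```

In the first document, "cat" (which appears nowhere else) ends up with a higher weight than "the" (which appears in two documents), which is exactly the intuition the exercises build on before moving to topic models and transfer learning.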

I’m planning on doing another version of this tutorial this year at ODSC East. In the updated version I’ll be bringing in spaCy a bit more. Rather than spend a bunch of time looking at PyTorch code, I figured spaCy gives a good setup for all types of deep models. I imagine folks will do better equipped with that one tool set than with a separate tool for each type of NLP model.

Anyway - more to come on that.

In the meantime, I’m really going to try to put more up here. Maybe shorter-form musings, rather than more complex posts.

Written by Ben
I am the type of person who appreciates the telling of a great story. As a data scientist, I am interested in using AI to understand, explore and communicate across borders and boundaries. My focus is on Natural Language Processing and the amazing ways it can help us understand each other and our world. The material on this blog is meant to share my experiences and understandings of complex technologies in a simple way, focused on application.