It's a very big day for me today, as I'm officially releasing version 0.2.3 for the @AdaptNLP project. With it comes a slew of changes, but what exactly?
Thanks to the lib2nbdev package (novetta.github.io/lib2nbdev), we've completely restructured the library around test-driven development with nbdev, with integration tests and everything else that comes with the workflow 2/9
Next, @fastdotai and #fastcore: I'm restructuring the internal inference API to rely on fastai's Learner and Callback system, to decouple our code and make it more modular. Thanks to the fastai_minima package, only the basic Learner and Callback classes are being used: 3/9
This means any new implementation or inference API requires less boilerplate code, and we can get it off the ground faster. For example, this is all that was needed to get Language Model text-generation going, where I completely override fastai's prediction: 4/9
Next, probably my favorite part of this package: the ModelHubs. Ever wanted to search all of @huggingface programmatically? I've made some improvements to their hub API so that you can search by model name, author, or even task, in just a few lines of code: 5/9
An example searching for a particular task, using the convenience HF_TASK namespace object: 6/9
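For a rough feel of what a name/author/task search looks like, here's a minimal sketch against the plain `huggingface_hub` client. To be clear, this is NOT AdaptNLP's ModelHub API: `build_query` and `search_hub` are hypothetical helper names I made up for illustration, and I'm assuming a recent `huggingface_hub` where `HfApi.list_models` accepts `search`, `author`, `filter`, and `limit`.

```python
def build_query(name=None, author=None, task=None, limit=5):
    """Build keyword arguments for HfApi.list_models, skipping unset fields."""
    query = {"limit": limit}
    if name is not None:
        query["search"] = name    # substring match on the model name
    if author is not None:
        query["author"] = author  # user or organization that uploaded the model
    if task is not None:
        query["filter"] = task    # task tag, e.g. "text-generation"
    return query


def search_hub(**kwargs):
    """Query the Hub and return matching model ids (needs network access)."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import HfApi
    # Assumption: recent huggingface_hub versions expose the model id as `.id`.
    return [m.id for m in HfApi().list_models(**build_query(**kwargs))]
```

Something like `search_hub(task="text-generation", limit=3)` would then hit the Hub and return a short list of model ids.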
I've done the same for #flairNLP, where we can search for not only models in HuggingFace, but also any of the ones hosted by flair themselves:
Both of these let you search either by model name (which is the author part) or by task, named appropriately 7/9
Even if you don't use AdaptNLP for the actual inference, I do hope folks try out the ModelHub, as we've found it greatly eases programmatic searching when trying to find models to use. 8/9
We also now officially support the latest versions of @PyTorch, Transformers, and FlairNLP.
That's my list of updates, and for me this amounts to my last few months of work over at @Novettasol :) I hope folks enjoy the package and update! 9/9
Of course, because I didn't think of that, the docs can be found at:
With school over, I'll reveal one of my secrets. I'm making a new course!
Running with @fastdotai! Times and dates are TBD, but I'm shooting for this Fall to hold the course. This will be a continuation of Walk with fastai, building on what we learned there and applying it 1/
The plan is to split it up into three sections: Debugging, Implementations, and Guest Speakers.
In the first section I want to cover debugging in fastai, bringing raw torch code over (direct 1:1 conversions), and exploring inference heavily
The second will be walking through a few implementations of other libraries that have used fastai (and writing one or two ourselves) in a much more complex manner than "set your data up so the DataBlock works". Situations will arise where the DataBlock doesn't exist yet!
Deploying with @fastdotai isn't always learn = load_learner(), learn.predict. There are numerous scenarios when you might only want some, part, or none of both the API and the library as a whole. In this thread we will be exploring your options, how they work, and what to do: 1/n
Ideally we have the following context:
DataBlock -> DataLoaders -> Model -> Learner -> Train
This can then stem off to a few things:
1. learn.export() -> Model and DataLoaders (which are now blank) ...
In this scenario, we need to ensure that ALL functions which were used in relation to the data are imported before loading in the learner. This can run into issues when using FastAPI and other platforms where loading in the Learner is done in a multi-process fashion 3/
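Why do the functions need to be imported first? The exported Learner is pickled, and pickle stores plain functions by reference (module plus name), not by value. Here's a small stdlib-only sketch of that failure mode. The module name `my_training_code` and the `label_func` function are hypothetical stand-ins, and a dict stands in for the exported Learner.

```python
import pickle
import sys
import types

# Hypothetical module where a custom labelling function was defined at train time.
mod = types.ModuleType("my_training_code")
exec("def label_func(fname):\n    return fname.split('_')[0]\n", mod.__dict__)
sys.modules["my_training_code"] = mod

# Stand-in for an exported Learner: it only holds a *reference* to the function.
exported = pickle.dumps({"get_y": mod.label_func})

# Simulate a serving process where label_func was never defined or imported.
saved = mod.__dict__.pop("label_func")
try:
    pickle.loads(exported)
    load_failed = False
except AttributeError:
    # pickle cannot find `my_training_code.label_func` by name
    load_failed = True

# Once the function is importable again, loading works.
mod.label_func = saved
restored = pickle.loads(exported)
label = restored["get_y"]("cat_001.jpg")
```

In a multi-process server, each worker is its own Python process, so each one needs those definitions importable before it tries to load the Learner.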
Can anyone tell me the difference between this @fastdotai code? How does it behave differently:
Answer: There isn't!
One is more "verbose" in my opinion (get_x and get_y), but neither loses any flexibility as you use it.
2/
But what if I have three "blocks"? Don't I need to use `getters` for that?
Not necessarily. Since we can declare an `n_inp` (as seen below), we can simply pass multiple functions to `get_x` or `get_y`, one for each block of those types:
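To make the idea concrete without needing fastai installed, here's a plain-Python sketch of how `n_inp`, `get_x`, and `get_y` line up into one getter per block. `resolve_getters` is a hypothetical helper written for illustration; it mirrors the idea as I understand it and is not fastai's actual code.

```python
def resolve_getters(n_blocks, n_inp, get_x=None, get_y=None):
    """Return one getter per block: the first `n_inp` from get_x, the rest from get_y."""
    def as_list(g):
        return list(g) if isinstance(g, (list, tuple)) else [g]

    xs, ys = as_list(get_x), as_list(get_y)
    if len(xs) != n_inp:
        raise ValueError(f"get_x has {len(xs)} functions, expected {n_inp}")
    if len(ys) != n_blocks - n_inp:
        raise ValueError(f"get_y has {len(ys)} functions, expected {n_blocks - n_inp}")
    return xs + ys


# Three "blocks": two inputs and one target, so n_inp=2.
row = {"img_a": "a.png", "img_b": "b.png", "label": "cat"}
getters = resolve_getters(
    n_blocks=3,
    n_inp=2,
    get_x=[lambda r: r["img_a"], lambda r: r["img_b"]],  # one function per input block
    get_y=lambda r: r["label"],                          # a single target block
)
sample = tuple(g(row) for g in getters)
```

With fastai itself, the matching call would pass the same `n_inp=2`, a list of two functions as `get_x`, and one function as `get_y` to `DataBlock`, with three entries in `blocks`.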
lib2nbdev has been on my mind for months now, and finally it exists! What's the biggest struggle with @fastdotai's #nbdev? Using it on existing projects. This tool aims to fix that.
You are then guided through the process of setting up nbdev's `settings.ini` file and afterwards all of your existing code will be converted directly into fully functional notebooks!
But wait, don't you just throw it all into one big cell?
NO! Instead lib2nbdev will determine what should be private or public, what's an import, and what particular cell tag it should be generated with, such as below which has both private and public tags:
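As a rough illustration of that kind of splitting (not lib2nbdev's actual implementation), here's a small sketch using the stdlib `ast` module. `classify_cells` is a hypothetical helper, and the "leading underscore means private" rule is my assumption about how such a tool might decide.

```python
import ast


def classify_cells(source):
    """Split module source into (kind, code) pairs, one per top-level statement."""
    cells = []
    for node in ast.parse(source).body:
        code = ast.get_source_segment(source, node)
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            kind = "import"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Assumption: a leading underscore marks something as private.
            kind = "private" if node.name.startswith("_") else "public"
        else:
            kind = "other"
        cells.append((kind, code))
    return cells


example = '''import os

def _helper(x):
    return x + 1

def public_api(x):
    return _helper(x) * 2
'''

cells = classify_cells(example)
kinds = [kind for kind, _ in cells]
```

Each `(kind, code)` pair could then become its own notebook cell, with something like nbdev v1's `#export` flag on public cells and `#exporti` on private ones.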
Thanks everyone for joining me on this much briefer stream. In this thread I'll try to summarize everything we discussed, including:
- x is not implemented for y
- LossMetrics
- TensorBase and tensor metadata
And here is the stream:
1/
.@fastdotai had utilized their own tensor subclassing system in v2. Originally there weren't many issues, as fastai just "let" you do things with these classes. Then @PyTorch came along and introduced tensor subclasses in 1.7. Suddenly, people were getting this! Why?
PyTorch became much more explicit about which subclasses can interact with one another. As a result @fastdotai had to make the TensorBase class, which allows any tensor types to interact with each other. A quick function to convert and an example are below 3/