New Keras feature: the TextVectorization layer. It takes strings as input and handles text standardization, tokenization, and vocabulary indexing.

This enables you to create models that process raw strings.
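
To make the three stages concrete, here is a minimal pure-Python sketch of what such a layer does conceptually. It is not the Keras implementation: the helper names (`standardize`, `tokenize`, `build_vocabulary`, `vectorize`) are hypothetical, and the choices of lowercasing, stripping ASCII punctuation, splitting on whitespace, and reserving index 0 for padding and index 1 for out-of-vocabulary tokens are assumptions modeled on the layer's default behavior.

```python
import re
import string
from collections import Counter

PAD, OOV = "", "[UNK]"  # assumed reserved tokens: index 0 = padding, index 1 = unknown


def standardize(text):
    """Standardization: lowercase and strip ASCII punctuation."""
    return re.sub(f"[{re.escape(string.punctuation)}]", "", text.lower())


def tokenize(text):
    """Tokenization: split the standardized text on whitespace."""
    return standardize(text).split()


def build_vocabulary(corpus):
    """Vocabulary indexing: map each token to an integer, most frequent first."""
    counts = Counter(tok for doc in corpus for tok in tokenize(doc))
    # Ties are broken alphabetically here so the result is deterministic.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: i for i, tok in enumerate([PAD, OOV] + ordered)}


def vectorize(text, vocab, length):
    """Turn a raw string into a fixed-length list of token indices."""
    ids = [vocab.get(tok, vocab[OOV]) for tok in tokenize(text)][:length]
    return ids + [vocab[PAD]] * (length - len(ids))


corpus = ["The movie was great!", "The plot was thin."]
vocab = build_vocabulary(corpus)
print(vectorize("The movie was boring", vocab, 6))  # [2, 5, 3, 1, 0, 0]
```

The unseen word "boring" maps to the unknown-token index, and the sequence is padded with zeros up to the requested length.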

End-to-end text classification example: colab.research.google.com/drive/1RvCnR7h…
Key features:
- Supports sparse outputs (int sequences), to be fed into an Embedding layer
- Supports dense outputs (binary, tf-idf, count)
- Built-in ngram generation
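
The features listed above can be sketched with the layer itself. This is a hedged example, not the code from the linked notebook: it assumes a TensorFlow release where the layer is exposed as `tf.keras.layers.TextVectorization` (earlier releases placed it under `tf.keras.layers.experimental.preprocessing`), and the toy corpus, sequence length, and embedding size are made up for illustration.

```python
import tensorflow as tf

# Toy corpus, invented for illustration only.
data = tf.constant(["A great movie", "A dull movie with a thin plot"])

# Integer sequences: one vocabulary index per token, padded to a fixed length.
int_layer = tf.keras.layers.TextVectorization(
    output_mode="int", output_sequence_length=8
)
int_layer.adapt(data)  # learn the vocabulary from the corpus
int_ids = int_layer(data)

# The integer sequences can be fed straight into an Embedding layer.
embedding = tf.keras.layers.Embedding(
    input_dim=len(int_layer.get_vocabulary()), output_dim=4
)
embedded = embedding(int_ids)

# Dense count vectors, with built-in n-gram generation (unigrams and bigrams).
count_layer = tf.keras.layers.TextVectorization(output_mode="count", ngrams=2)
count_layer.adapt(data)
counts = count_layer(data)
vocabulary = count_layer.get_vocabulary()

print(int_ids.shape)   # one row of 8 indices per input string
print(embedded.shape)  # one 4-dimensional vector per token position
print(vocabulary)      # includes bigrams such as "great movie"
```

The names of the binary and tf-idf output modes have changed between releases, so check the documentation for your installed version before relying on them.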

Full credit to Mark Omernick for the code example and for doing much of the work on this project.
Such a layer makes your text-processing model end-to-end: it ingests strings and outputs classes or other predictions. You can deploy the model without worrying about an external preprocessing pipeline.
Keep Current with François Chollet
