Mishig Davaadorj
agi @huggingface 🇲🇳🇺🇸🇫🇷
Sep 27 • 7 tweets • 3 min read
1/ At a high level, "textual inversion" is a technique for introducing a new "concept" to text2img diffusion models.

In this example, the diffusion model learns what this specific "<cat-toy>" is (1st img), and when prompted with "<cat-toy> in NYC", produces a coherent result (2nd img)

2/ Technically, it is a process of:
I. add one additional token, let's call it tkn99, to the model's vocab
II. freeze all weights, except tkn99's embedding
III. run training on a few example imgs captioned with tkn99

Find scripts & more desc at: huggingface.co/docs/diffusers…
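The freeze-everything-but-one-row idea in steps I–III can be sketched in a few lines of NumPy. This is a toy with a fake gradient step (the real training optimizes a diffusion loss via the linked scripts); the point is only the gradient mask that keeps every embedding row frozen except tkn99's:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 100, 8
emb = rng.normal(size=(vocab_size, dim))  # toy text-encoder embedding table

# I. append one new row for tkn99 (it gets id 100)
emb = np.vstack([emb, rng.normal(size=(1, dim))])
NEW_ID = 100

# II. "freeze" all other rows by masking the gradient to row NEW_ID only
def masked_update(emb, grad, lr=0.1):
    mask = np.zeros_like(emb)
    mask[NEW_ID] = 1.0
    return emb - lr * grad * mask

# III. one fake training step: gradient is nonzero everywhere,
# but only tkn99's embedding actually moves
grad = np.ones_like(emb)
before = emb.copy()
emb = masked_update(emb, grad)
assert np.allclose(emb[:NEW_ID], before[:NEW_ID])    # frozen rows untouched
assert not np.allclose(emb[NEW_ID], before[NEW_ID])  # tkn99's row updated
```

In the real scripts the same effect is achieved by passing only the new token's embedding parameters to the optimizer.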
Apr 25 • 9 tweets • 4 min read
1/ How do language models (like BERT or GPT) "see" words?

TLDR: whereas we see Wēlcómê tó thë 🤗 Tôkénīzērs, language models see [101, 6160, 2000, 1996, 100, 19204, 17629, 2015, 102]
🧵 on Tokenization by examples
2/ NLP Tokenization steps are ↳ normalization ➜ pre-tokenization ➜ model ➜ post-processing.

Together, they are called a "tokenization pipeline"
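The four stages can be sketched end to end in plain Python. This is a toy: the vocab below has only the handful of entries needed for this example (ids copied from the bert-base-uncased ids shown above), whereas a real tokenizer learns a vocab of ~30k subwords from data:

```python
import unicodedata

# Toy vocab; real tokenizers learn ~30k entries from a corpus
VOCAB = {"[CLS]": 101, "[SEP]": 102, "[UNK]": 100,
         "welcome": 6160, "to": 2000, "the": 1996,
         "token": 19204, "##izer": 17629, "##s": 2015}

def normalize(text):
    # strip diacritics (NFD decomposition, drop combining marks), lowercase
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed
                   if unicodedata.category(c) != "Mn").lower()

def pre_tokenize(text):
    # naive whitespace split; real pre-tokenizers also handle punctuation
    return text.split()

def wordpiece(word):
    # "model" step: greedy longest-match-first subword split, WordPiece-style
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            sub = ("##" if start > 0 else "") + word[start:end]
            if sub in VOCAB:
                piece = sub
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no known subword, e.g. for 🤗
        pieces.append(piece)
        start = end
    return pieces

def post_process(ids):
    # BERT-style special tokens wrapped around the sequence
    return [VOCAB["[CLS]"]] + ids + [VOCAB["[SEP]"]]

words = pre_tokenize(normalize("Wēlcómê tó thë 🤗 Tôkénīzērs"))
ids = post_process([VOCAB[p] for w in words for p in wordpiece(w)])
print(ids)  # [101, 6160, 2000, 1996, 100, 19204, 17629, 2015, 102]
```

Note how normalization strips the accents, 🤗 falls back to [UNK] (100), and "tokenizers" is split into three subwords (token / ##izer / ##s), which is why the sentence's 6 visible words become 9 ids.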