Mishig
artificial intelligence @huggingface 🇫🇷🇲🇳 e/acc
Dec 6, 2022 β€’ 5 tweets β€’ 2 min read

Stable Diffusion v2 uses depth information to transform an image while preserving the original image's structure

demo πŸ‘‡
Dec 5, 2022 β€’ 4 tweets β€’ 2 min read
A dataset of 14 million text-to-image prompts with their generation hyperparameters (the DiffusionDB dataset from @PoloDataClub)

Quite useful for both research & product work at the same time

huggingface.co/datasets/poloc…

The Diffusers docs have a great section on schedulers, which are among the most important hyperparameters of diffusion models
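The DiffusionDB dataset can be pulled with 🤗 datasets and queried by hyperparameter; a hedged sketch where the subset name and the "prompt"/"cfg" column names follow the dataset card and are assumptions here.

```python
# Hypothetical sketch of working with DiffusionDB via 🤗 datasets; subset
# name and column names ("prompt", "cfg") are assumptions from the card.

def load_diffusiondb(subset: str = "2m_random_1k"):
    """Load a small random subset rather than all 14M rows (downloads data)."""
    from datasets import load_dataset
    return load_dataset("poloclub/diffusiondb", subset, split="train")

def prompts_with_cfg_at_least(rows, min_cfg: float) -> list:
    """Research-style query: prompts whose guidance-scale (cfg)
    hyperparameter is at least min_cfg. Works on the loaded dataset
    or on plain dicts with the same keys."""
    return [r["prompt"] for r in rows if r["cfg"] >= min_cfg]
```

The same filter runs unchanged over a real split or over toy records, which makes it easy to prototype analyses before downloading anything.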
huggingface.co/docs/diffusers…
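Swapping the scheduler is a one-liner in practice; a hedged sketch assuming the diffusers pattern of rebuilding a scheduler from the old one's config, with `DPMSolverMultistepScheduler` as an example swap-in.

```python
# Hypothetical sketch of swapping a diffusion pipeline's scheduler; reusing
# the old scheduler's config keeps the noise schedule compatible with the
# trained model. The default class is one common fast scheduler in diffusers.

def swap_scheduler(pipe, scheduler_cls=None):
    """Replace pipe.scheduler with scheduler_cls, carrying the config over."""
    if scheduler_cls is None:
        # Imported lazily so the helper also accepts any custom class.
        from diffusers import DPMSolverMultistepScheduler as scheduler_cls
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    return pipe
```

Because each scheduler trades off speed against sample quality, trying a few of them is one of the cheapest tuning knobs a pipeline has.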
Sep 27, 2022 β€’ 7 tweets β€’ 3 min read
1/ At a high level, "textual inversion" is a technique for introducing a new "concept" to text2img diffusion models.

In this example, the diffusion model learns what this specific "<cat-toy>" is (1st img), and when prompted with "<cat-toy> in NYC", produces a coherent result (2nd img)

2/ Technically, it is a process of:
I. adding one additional token, let's call it tkn99, to the model's vocab
II. freezing all weights except tkn99's embedding
III. running training on a few example imgs paired with tkn99

Find scripts & more details at: huggingface.co/docs/diffusers…
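Steps I-III can be sketched numerically; a toy version with plain Python lists standing in for a real text encoder, where the vocab, embedding dim, and all values are illustrative.

```python
# Toy numeric sketch of textual inversion's steps I-III: grow the vocab by
# one token, then apply gradient updates to that token's row only.

def add_token(vocab: dict, embeddings: list, token: str, dim: int = 4) -> int:
    """Step I: add one additional token (e.g. tkn99) to the model's vocab,
    with a freshly initialized embedding row."""
    vocab[token] = len(vocab)
    embeddings.append([0.0] * dim)
    return vocab[token]

def training_step(embeddings: list, trainable_row: int, grads: list, lr: float = 0.1):
    """Steps II-III: a gradient update where every row except the new
    token's is frozen, so only tkn99's embedding learns the concept."""
    for i, row_grad in enumerate(grads):
        if i != trainable_row:
            continue  # frozen weights: no update
        embeddings[i] = [w - lr * g for w, g in zip(embeddings[i], row_grad)]
```

After a few such steps on example images of the concept, only the tkn99 row has moved; the rest of the model is exactly as pretrained.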
Apr 25, 2022 β€’ 9 tweets β€’ 4 min read
How do language models (like BERT or GPT) "see" words?

TLDR: whereas we see Wēlcómê tó thë 🤗 Tôkénīzērs, language models see [101, 6160, 2000, 1996, 100, 19204, 17629, 2015, 102]
🧡 on Tokenization by examples
2/ NLP tokenization steps are ↳ normalization ➜ pre-tokenization ➜ model ➜ post-processing.

Together, they are called a "tokenization pipeline"
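The four pipeline steps above can be sketched end-to-end; the vocab, the ids, and the subword split here are toy stand-ins (real tokenizers like WordPiece learn a vocab of tens of thousands of entries, and the ids below are not BERT's real ids).

```python
# Toy tokenization pipeline mirroring the four steps above; all ids and the
# subword split are made up for illustration.
import unicodedata

VOCAB = {"[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
         "welcome": 200, "to": 201, "the": 202, "token": 203, "##izers": 204}

def normalize(text: str) -> str:
    """Step 1: lowercase and strip accents (NFD-decompose, drop marks)."""
    text = unicodedata.normalize("NFD", text.lower())
    return "".join(c for c in text if not unicodedata.combining(c))

def pre_tokenize(text: str) -> list:
    """Step 2: split the normalized text on whitespace."""
    return text.split()

def model(words: list) -> list:
    """Step 3: map each word to subword ids; a toy stand-in for WordPiece."""
    ids = []
    for w in words:
        if w in VOCAB:
            ids.append(VOCAB[w])
        elif w == "tokenizers":  # hard-coded toy subword split
            ids += [VOCAB["token"], VOCAB["##izers"]]
        else:
            ids.append(VOCAB["[UNK]"])  # e.g. the 🤗 emoji
    return ids

def post_process(ids: list) -> list:
    """Step 4: wrap the sequence in special tokens (BERT-style)."""
    return [VOCAB["[CLS]"]] + ids + [VOCAB["[SEP]"]]

def tokenize(text: str) -> list:
    """The full pipeline: normalization -> pre-tokenization -> model -> post-processing."""
    return post_process(model(pre_tokenize(normalize(text))))
```

Running it on an accented input shows each step at work: normalization removes the accents, the emoji falls to `[UNK]`, "tokenizers" is split into subwords, and post-processing adds `[CLS]`/`[SEP]`.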