AI at Meta
Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Aug 14 4 tweets 2 min read
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks.

Learn more about DINOv3 here: ai.meta.com/blog/dinov3-se… A few highlights of DINOv3 👇

1️⃣SSL enables training a 7B-parameter model on 1.7B images without labels, supporting annotation-scarce scenarios such as satellite imagery
2️⃣Produces excellent high-resolution features and state-of-the-art performance on dense prediction tasks
3️⃣Diverse application across vision tasks and domains, all with a frozen backbone (no fine-tuning required)
4️⃣ Includes distilled smaller models (ViT-B, ViT-L) and ConvNeXt variants for deployment flexibility
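The frozen-backbone workflow in point 3 can be sketched in miniature: extract features once with a fixed extractor that is never fine-tuned, then fit only a lightweight head on top. Everything here is illustrative: `fake_backbone` is a stand-in for a real DINOv3 model, and the nearest-centroid head is just one simple choice of lightweight head.

```python
# Sketch of the "frozen backbone" workflow: features come from a fixed
# extractor, and only a tiny head is fit on top. `fake_backbone` is a
# hypothetical stand-in for real DINOv3 features.

def fake_backbone(pixels):
    # Stand-in feature extractor: mean and variance of pixel values.
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    return (mean, var)

def fit_centroids(examples):
    # Lightweight "head": one centroid per class over the frozen features.
    sums, counts = {}, {}
    for pixels, label in examples:
        f = fake_backbone(pixels)
        s = sums.setdefault(label, [0.0] * len(f))
        for i, v in enumerate(f):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: tuple(v / counts[lab] for v in s) for lab, s in sums.items()}

def predict(centroids, pixels):
    # Classify by nearest centroid in feature space; the backbone never updates.
    f = fake_backbone(pixels)
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(f, centroids[lab])))

centroids = fit_centroids([([200, 210, 220, 230], "bright"),
                           ([10, 20, 30, 40], "dark")])
print(predict(centroids, [190, 200, 210, 215]))  # -> bright
```

The point of the pattern is that adapting to a new task touches only the head, which is why one backbone can serve many dense prediction tasks at once.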
Apr 5 7 tweets 3 min read
Take a look under the hood of Llama 4 Scout and Llama 4 Maverick – our most advanced AI models yet 🧵 Llama 4 Scout delivers state-of-the-art performance for its class, enabled by continued “mid-training” with new training recipes on specialized datasets that enhance model quality and unlock a 10M-token input context length.
Oct 3, 2024 8 tweets 4 min read
Following #ECCV2024 from your feed? Here are 7️⃣ examples of interesting research work being presented by teams working on AI across Meta 🧵 1️⃣ Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos: go.fb.me/gs1w0y
Sep 27, 2024 7 tweets 3 min read
Ready to start working with our new lightweight and multimodal Llama 3.2 models? Here are a few new resources from Meta to help you get started. 🧵 First things first, get access to all of the latest Llama 3.2 models ➡️ go.fb.me/vs9zfp
Sep 25, 2024 6 tweets 2 min read
A few technical insights on the new Llama vision models we’re releasing today 🦙🧵 Llama 3.2 11B & 90B include support for a range of multimodal vision tasks. These capabilities enable scenarios like captioning images for accessibility, providing natural-language insights based on data visualizations, and more.
Sep 25, 2024 7 tweets 2 min read
A few technical insights on our lightweight Llama 3.2 1B & 3B models. 🦙🧵 Even with their lightweight size, Llama 1B & 3B have a range of capabilities and were built to run on mobile devices & lightweight edge deployments. They empower developers to build personalized, private, on-device agentic applications.
Jul 24, 2024 7 tweets 4 min read
Ready to start working with Llama 3.1? Here are a few new resources from the Llama team to help you get started. 🧵 First things first, get access to all of the latest Llama 3.1 models ➡️ go.fb.me/p69fu8
Jul 23, 2024 8 tweets 3 min read
More technical details on the new Llama 3.1 models we released today. 🦙🧵 Today’s release includes 8B, 70B and 405B Llama 3.1 models that were trained on data of greater quality and quantity than Llama 3 for both pre- and post-training. All three models were trained on over 15T tokens.
Jul 23, 2024 5 tweets 4 min read
Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet.

Today we’re releasing a collection of new Llama 3.1 models, including our long-awaited 405B. These models deliver improved reasoning capabilities, a larger 128K-token context window and expanded support for 8 languages, among other improvements. Llama 3.1 405B rivals leading closed-source models on state-of-the-art capabilities across a range of tasks spanning general knowledge, steerability, math, tool use and multilingual translation.

The models are available to download now directly from Meta or @huggingface. With today’s release the ecosystem is also ready to go with 25+ partners rolling out our latest models — including @awscloud, @nvidia, @databricks, @groqinc, @dell, @azure and @googlecloud ready on day one.

More details in the full announcement ➡️ go.fb.me/tpuhb6
Download Llama 3.1 models ➡️ go.fb.me/vq04tr

With these releases we’re setting the stage for unprecedented new opportunities and we can’t wait to see the innovation our newest models will unlock across all levels of the AI community.

Training a model as large and capable as Llama 3.1 405B was no simple task. The model was trained on over 15 trillion tokens over the course of several months, requiring over 16K @NVIDIA H100 GPUs — making it the first Llama model ever trained at this scale.

We also used the 405B parameter model to improve the post-training quality of our smaller models.
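Those numbers can be sanity-checked with the standard dense-transformer estimate of roughly 6 × parameters × tokens training FLOPs. This is a textbook approximation, not Meta's reported accounting, and the 400 TFLOP/s sustained H100 throughput below is an assumed figure for illustration only.

```python
# Back-of-envelope estimate of the training scale quoted above, using the
# common ~6 * N * D FLOPs approximation for dense transformers.

params = 405e9        # Llama 3.1 405B parameters
tokens = 15e12        # "over 15 trillion tokens"
gpus = 16_000         # "over 16K H100 GPUs"

total_flops = 6 * params * tokens
print(f"~{total_flops:.1e} training FLOPs")   # roughly 3.6e25

# Assuming (hypothetically) ~400 TFLOP/s sustained per H100:
seconds = total_flops / (gpus * 400e12)
print(f"~{seconds / 86400:.0f} days at that throughput")
```

At lower real-world utilization the run stretches further, which is consistent with the "several months" in the post.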
Apr 18, 2024 5 tweets 3 min read
Introducing Meta Llama 3: the most capable openly available LLM to date.

Today we’re releasing 8B & 70B models that deliver new capabilities such as improved reasoning and set a new state of the art for models of their sizes.

Today's release includes the first two Llama 3 models — in the coming months we expect to introduce new capabilities, longer context windows, additional model sizes and enhanced performance + the Llama 3 research paper for the community to learn from our work.

More details ➡️ go.fb.me/ct2xko
Download Llama 3 ➡️ go.fb.me/i2y41n

Llama 3 delivers a major leap over Llama 2 and demonstrates SOTA performance on a wide range of industry benchmarks.

The models also achieve substantially reduced false refusal rates, improved alignment and increased diversity in model responses — in addition to improved capabilities such as reasoning, code generation and instruction following.

Across the stack, we want to kickstart the next wave of innovation in AI. We can’t wait to see what you build and look forward to your feedback.
Jul 27, 2023 4 tweets 1 min read
Today we're releasing the Open Catalyst Demo to the public — this new service will allow researchers to accelerate work in materials science by using AI to simulate the reactivity of catalyst materials ~1000x faster than existing computational methods.

Demo ⬇️ Low-cost catalyst materials are an important means toward a renewable energy future. By making this demo available to the public, we hope to accelerate the discovery of low-cost catalyst materials and showcase the emerging generalizability of these models.
Oct 19, 2022 4 tweets 2 min read
(1/3) Until now, AI translation has focused mainly on written languages. Universal Speech Translator (UST) is the first AI-powered speech-to-speech translation system for a primarily oral language: Hokkien. bit.ly/3CJP3ew (2/3) Hokkien, one of ~3k primarily spoken languages, has no standard writing system and very few human translators, making it difficult to create training data for our models or to rely on Hokkien transcripts.
Aug 26, 2022 5 tweets 2 min read
(1/5) During our #NLLB AMA, researchers were asked about language-specific challenges faced.

❓: Were there any specific challenges the team faced with particular languages chosen (e.g. those written in different scripts)?

🧵’d below. (2/5) A: Our main goal with No Language Left Behind was to include languages that weren’t available in past machine translation models.
Aug 25, 2022 4 tweets 3 min read
(1/4) Writing is often a collaborative process: We start with a draft, ask for suggestions & repeatedly make changes. Today we’re introducing PEER, a model trained to mimic this process, enabling it to incrementally write texts and to collaborate with humans in more natural ways. (2/4) PEER can write drafts, add suggestions, follow instructions, perform edits, correct itself and provide explanations for all its actions. Trained mostly on Wikipedia's edit history, PEER clearly outperforms much larger models on a collection of different editing tasks.
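The edit-then-explain loop described above can be sketched with a toy driver. The edits below are hand-written stand-ins for what PEER would generate; only the loop structure (apply an edit, record a diff and an explanation) reflects the described workflow.

```python
import difflib

# Toy sketch of a collaborative-editing loop: each step applies one edit to
# the draft and records an explanation plus a word-level diff.

def apply_edit(draft, old, new, explanation):
    revised = draft.replace(old, new, 1)
    diff = list(difflib.unified_diff(draft.split(), revised.split(),
                                     lineterm="", n=0))
    return revised, {"explanation": explanation, "diff": diff}

draft = "PEER is a modle for text editing"
draft, step1 = apply_edit(draft, "modle", "model", "fix spelling")
draft, step2 = apply_edit(draft, "text editing", "collaborative text editing",
                          "add detail")
print(draft)  # -> PEER is a model for collaborative text editing
```

In PEER itself, each of these steps (the plan, the edit, and the explanation) is produced by the model, which is what lets it slot into a human revision loop.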
Aug 8, 2022 6 tweets 2 min read
(1/6) Today we’re introducing Atlas, a new retrieval-augmented language model with strong few-shot performance on question answering and fact checking tasks. With only 11B parameters, Atlas outperforms a 540B-parameter model by 3%, reaching 42% on NaturalQuestions with just 64 training examples. (2/6) World knowledge presents a particularly tricky challenge in few-shot NLP, where models don’t just need to understand what the task is asking and how to generate an output, but must also store and precisely recall a huge amount of information to do well.
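The retrieval-augmented pattern behind Atlas can be illustrated with a toy retriever: find the passages most relevant to the question, then condition generation on them instead of storing everything in the model's parameters. Retrieval here is plain word overlap purely for illustration; Atlas learns a dense retriever jointly with the reader.

```python
# Minimal sketch of retrieval-augmented QA: retrieve top-k passages, then
# build a prompt that conditions the language model on them.

docs = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
    "Paris is the capital of France.",
]

def retrieve(question, docs, k=2):
    # Score each passage by word overlap with the question (toy retriever).
    q = set(question.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

question = "when was the eiffel tower completed"
context = retrieve(question, docs)
prompt = "\n".join(context) + "\nQ: " + question  # would be fed to the reader
print(context[0])  # -> the Eiffel Tower passage ranks first
```

Offloading knowledge to a retrievable corpus is what lets a comparatively small reader compete with much larger closed-book models on knowledge-heavy tasks.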
Aug 5, 2022 4 tweets 2 min read
(1/4) Meet BlenderBot 3, the first publicly available 175B-parameter chatbot with model weights, code & datasets. It can chat about nearly any topic & is designed to learn & improve by conversing with people in the real world.

Try the interactive demo: bit.ly/3Pf2s2t (2/4) A focal point of our research is enhancing safety measures for chatbots. We developed new techniques that enable learning from helpful feedback from people, while ignoring people who are trying to trick the model into unhelpful or toxic responses.
Aug 4, 2022 6 tweets 2 min read
(1/6) ICYMI, here’s another highly upvoted question from our #NLLB AMA about No Language Left Behind answered by @shruti_bhosale 🧵’d below (2/6) ❓: What is the procedure to extend NLLB-200 to a new language? Do you have any experiments on incorporating a new low-resource language onto the final NLLB-200?
Jul 7, 2022 4 tweets 1 min read
What makes our NLLB-200 translation model an AI breakthrough?

📝 Translates between 200 languages with verified high quality

📈 Automatic dataset construction for low-resource languages

📊 New open-source evaluation tools to assess quality in all 200 languages

bit.ly/3yObUEV
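The "directions" vocabulary used in these evaluations is simple combinatorics: with n languages there are n × (n − 1) ordered translation pairs, which is where the ~10k figure for the 101-language FLORES benchmark comes from.

```python
# Ordered translation directions for an n-language system.

def directions(n_languages):
    return n_languages * (n_languages - 1)

print(directions(101))  # -> 10100, the "~10k directions" of FLORES-101
print(directions(200))  # -> 39800 pairs NLLB-200 can translate between
```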
(1/4) 🚀 NLLB-200’s BLEU score improves on the previous state-of-the-art by an average of 44% across all 10k directions of the FLORES-101 benchmark

😎 1st large-scale conditional language model trained on the Meta AI Research SuperCluster (RSC) supercomputer
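To make the BLEU metric in the tweet above concrete, here is a toy sentence-level BLEU (uniform 1–4 gram precisions, no smoothing, with brevity penalty). Real NLLB evaluation uses corpus-level BLEU via standard tooling; this is only a sketch of the formula.

```python
import math
from collections import Counter

# Toy sentence-level BLEU: geometric mean of 1..4-gram precisions times a
# brevity penalty. No smoothing, so any missing n-gram order scores 0.

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(cnt, r[g]) for g, cnt in c.items())
        total = max(sum(c.values()), 1)
        if overlap == 0:
            return 0.0
        log_prec += math.log(overlap / total) / max_n
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_prec)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # -> 1.0
```

A perfect match scores 1.0; the reported "+44% BLEU" means NLLB-200's corpus-level scores improved by that relative margin over the prior state of the art, averaged across directions.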
Jul 6, 2022 4 tweets 2 min read
(1/4) Results from our No Language Left Behind (NLLB) project are not only advancing the state of the art in machine translation, but also enabling us to help improve translation systems inside and outside of Meta. Here’s how: (2/4) ✅We’re using modeling techniques and learnings from NLLB to improve and extend translations on Facebook and Instagram across African, Southeast Asian, and Indian languages;
May 26, 2022 4 tweets 3 min read
(1/4) Mathematics is one of the most challenging endeavors of the human mind. Although AI has achieved superhuman performance in two-player games like chess or Go, the most advanced models are still unable to prove even simple mathematical statements. arxiv.org/abs/2205.11491 (2/4) We present a new algorithm, HyperTree Proof Search (HTPS), inspired by the recent success of AlphaZero. Our model proves mathematical theorems in a fully automated way and significantly outperforms the SOTA.
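The hypertree structure HTPS searches can be sketched with a toy AND/OR prover: applying a tactic to a goal yields a set of subgoals, and the goal is proved when every subgoal of some tactic is proved. The hand-written tactic table below is hypothetical; HTPS generates tactics with a neural policy and guides the search with learned value estimates, neither of which is modeled here.

```python
# Toy AND/OR proof search: each tactic maps a goal to a list of subgoals
# (AND), and a goal may have several candidate tactics (OR).

tactics = {
    "a_and_b": [["a", "b"]],  # prove a_and_b by proving both a and b
    "a": [[]],                # axiom: closes immediately
    "b": [["c"], []],         # two tactics: via c, or directly
}

def prove(goal, seen=frozenset()):
    if goal in seen:          # avoid revisiting a goal along the same branch
        return False
    for subgoals in tactics.get(goal, []):
        if all(prove(g, seen | {goal}) for g in subgoals):
            return True       # one tactic whose subgoals all close suffices
    return False

print(prove("a_and_b"))  # -> True
```

The exhaustive recursion here is exactly what becomes intractable on real theorems, which is why HTPS replaces it with learned, best-first expansion of the hypertree.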