Octave Klaba
Jan 3, 2018 46 tweets 13 min read
#Ovh Weekly News: week 1

Happy New Year ! :)
A huge hardware BUG hits all Intel x86 CPUs. A software patch for Linux is ready. We are testing it and will start deploying it in the next few hours.
By tomorrow at the latest, a new kernel will be available to all VPS, PCI and Baremetal customers. We will upgrade all the images for Public Cloud, Private Cloud and VPS.
We will need to restart all the Public Cloud/VPS hosts. We want to start on Saturday, with minimum impact for customers. We are looking for the best scenario.

All the Shared Hosting hosts will be upgraded with no downtime.
Spectre, Variant 1: bounds check bypass (CVE-2017-5753)

Use existing code with access to secrets by making it speculatively execute memory operations

Mitigation:
OS & VMM updates
Spectre, Variant 2: branch target injection (CVE-2017-5715)

Malicious code usurps properties of CPU branch prediction features to speculatively run code

Mitigation:
OS & VMM updates
+ Firmware Updates for CPU
Meltdown, Variant 3: rogue data cache load (CVE-2017-5754)

Access memory controlled by the OS while running a malicious application

Mitigation:
OS updates
Variants 1 and 3 are easy to fix: just a kernel upgrade.

Variant 2: it’s the kernel upgrade + the firmware upgrade for the CPU, i.e. the microcode for each CPU model. Microcode for new CPUs is already developed, but it will take 2-3 weeks to have the firmware for the old CPUs.
Variant 2: Branch Target Injection

Mitigation 1: Microcode patch (via BIOS) to introduce a new feature + kernel patch to use it
Mitigation 2: Patch compilers to avoid any indirect jump and use a static trampoline (aka retpoline)

gcc has a pending patch to introduce this feature
Testing 4.14.11 (latest stable) on different envs & a large pool of baremetals. At this time, all the flags are green (except NVIDIA’s drivers). We will deploy this version directly on netboot & use it as the native OVH kernel for reinstallations, so that they are safe against Meltdown.
Baremetal, Windows Server: 2016 and 2012r2 (US and FR only) will be ready for reinstallation in 2-3 hours.
Cloud Desktop:
We have coded the upgrade of the host: all tests were successful.
It is a cold upgrade, so we need to reboot the host. The Desktops will be updated using a GPO; automated tests are planned tonight on some test desktops.
Cloud Desktop Infrastructure:
The update will be done by PCC. To update the Admin VMs, we will install WSUS.
Private Cloud: We already have 500 hosts patched with a secure build (6.0). Tests are OK. We are testing the Linux kernel; the upgrade is coded. The host upgrade is in testing (for 5.5, 6.0, 6.5). The Windows upgrade is in development and will be finished tomorrow.
pCC: Our priority is customer infrastructure: ESXi hosts to patch, then VMs, mainly Windows VMs (backup server). We expect no downtime on customer infrastructure: the VMs will be moved to another host while each host reboots. Then we will focus on the management infrastructure.
pCC: Still waiting for Zerto to update their VRA appliances.

We might require 3 reboots: 1 to secure, 1 to update the BIOS, and 1 with a vmware hotfix to integrate guest OS updates, but later on.
pCI: Testing 4.14.11. We started with the hosts of Metrics, Ceph aaS, OpenData and Plesk aaS. These teams will confirm to us that they see no impact on their use cases.
Meltdown, Spectre bug impacting x86-64 CPU - #OVH fully mobilised ovh.co.uk/news/articles/…
Meltdown/Spectre vulnerabilities affecting x86-64 CPUs: #OVH fully mobilised (French version) ovh.com/fr/blog/vulner…
The 2nd mitigation of Variant 2 is "retpoline". It needs a modification of the compiler (gcc) and recompilation of all software. It will be the way to go in the long run, but recompiling the planet will take months. It does NOT need the microcode update, so it will be the answer for unpatchable BIOSes.
Baremetal:
We have put in production:
4.14.11 as the native OVH kernel for reinstallation
4.14.11 in rescue-pro
4.14.11 available through netboot too, with a special description.
Baremetal:
Windows Server 2016 and 2012r2 are also “KBized” and available through our installation wizard.

Note: the customer must set a flag in the registry to enable the mitigation.
1st mitigation of Variant 2 is the new microcode update AND a kernel update. BOTH are needed.

The microcode introduces a new MSR, and the kernel must be updated to use it thru the IBRS patches.
pCI: we have confirmation that KVM is immune to guests reading HV or other guest memory via variant 3 (aka Meltdown). KVM is NOT "impacted" by Meltdown. So, right now, a guest VM can read neither another VM's memory nor the host's memory.
baremetal: we are deploying the netboot kernel that includes the microcodes for all CPUs. It will activate new flags in /proc/. Once the kernel can use the new flags, you are protected against variant 2.

pCI: a KVM patch that exposes the new flags to VPS/PCI is coming.
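A minimal sketch of how one could look for such flags, assuming they show up on the `flags` line of /proc/cpuinfo. The flag names in `ASSUMED_FLAGS` are illustrative assumptions; the thread does not say which names the OVH kernel exposes.

```python
# Flag names are assumptions for illustration; the thread does not name them.
ASSUMED_FLAGS = {"spec_ctrl", "ibrs", "ibpb"}


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def speculation_flags_present(cpuinfo_text, wanted=ASSUMED_FLAGS):
    """Return the subset of `wanted` flags that the kernel reports."""
    return cpu_flags(cpuinfo_text) & set(wanted)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo text from a Linux host."""
    with open(path) as f:
        return f.read()
```

On a Linux host, `speculation_flags_present(read_cpuinfo())` returns whichever of the assumed flags are reported; an empty set only means that none of these particular names were found.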
baremetal:
variant 2, mitigation 1:
example of the microcode being loaded before the kernel; no BIOS upgrade needed.
Waiting for the kernel with IBRS. [image]
shared hosting:
The kernel upgrade to 4.14.11 with KPTI is in progress. It will take 24h to reboot all « mutu » (shared hosting). This will protect against variant 3.

As soon as we have the kernel with IBRS, we will upgrade « mutu » again to protect against variant 2.
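As a rough illustration, a kernel release string can be compared against 4.14.11, the version the thread deploys with KPTI. This is only a sketch: the helper names are hypothetical, and a plain version comparison says nothing about distribution kernels that backport fixes to older version numbers.

```python
import re


def kernel_version(release):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a release string into an int tuple."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(g or 0) for g in m.groups())


def at_least(release, minimum=(4, 14, 11)):
    """True if the release is at or above `minimum` (4.14.11 by default)."""
    return kernel_version(release) >= minimum
```

The running kernel's release string can be obtained with `platform.release()` from the standard library and passed to `at_least`.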
Q: how do you know if your baremetal has the latest microcode?
A: # rdmsr 0x00000048 has to work

Here is an example on the same server model: E5-2689v4 not patched, rdmsr returns an error.
E5-2689v4 patched, no error on rdmsr. [images]
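The same check can be sketched without the rdmsr tool, assuming a Linux host with the `msr` kernel module loaded and root privileges. The module exposes each CPU's MSRs as /dev/cpu/N/msr, where reading 8 bytes at an offset equal to the MSR address returns its value. The function name is hypothetical; 0x48 is the address used in the tweet.

```python
import os

IA32_SPEC_CTRL = 0x48  # the MSR address queried with rdmsr in the thread


def msr_readable(msr=IA32_SPEC_CTRL, dev_path="/dev/cpu/0/msr"):
    """Return True if 8 bytes can be read at offset `msr`, False on any OS error."""
    try:
        fd = os.open(dev_path, os.O_RDONLY)
    except OSError:
        # Device missing (msr module not loaded) or insufficient privileges.
        return False
    try:
        return len(os.pread(fd, 8, msr)) == 8
    except OSError:
        # The read fails when the CPU does not implement this MSR.
        return False
    finally:
        os.close(fd)
```

A `False` result is ambiguous: it can mean the microcode is missing, but also that the `msr` module is not loaded or that the caller is not root.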
At 10pm, 10% of WebHosting will be on 4.14.11, which protects against Variant 3. We will then stop and check stability for 12 hours before resuming the upgrade tomorrow morning.

Details: travaux.ovh.net/?do=details&id…
pCC: the upgrade plan to fix Variants 1, 2 and 3 is in progress.

It will be done in 3 phases:
- Phase 1: Security updates on the customer side
- Phase 2: Security updates on the OVH side
- Phase 3: Functionality patch for ESXi

Details (fr/en): travaux.ovh.net/?do=details&id…
Desktop/VDI:
We benchmarked the hosts with the fix for Windows Server 2016 and ESXi 6.0. No issue.

We are going to update all our hosts next week (starting on Tuesday at 6 am). This will protect against Variants 1, 2 and 3.

Details (fr/en): travaux.ovh.net/?do=details&id…
Baremetal:
We are currently deploying 4.14.12. It will be done promptly. It fixes Variant 3.

Variant 2, Mitigation 1
We have a smart strategy to load the microcode via 2 methods (UEFI and initram), without a BIOS flash, as a first iteration. We will put it in production tomorrow.
pCI & vps 2016:
Variant 3: no impact on KVM

Variant 2, mitigation 1
Microcode is packaged; testing of qemu with the ibsr_enabled patch is ongoing. Waiting for the kernel patch with IBRS to be merged & tested.

Only then will we start the pCI upgrade and reboot each host.
VPS 2014:
Variant 3 (ovh): not affected.

Variant 3 (customers): the Virtuozzo team is still integrating KPTI into the OpenVZ kernel.

Variant 2: Physical hosts update will be rolled out via pCC.
The teams have worked hard over the last few days on this new « bug ». We now know what needs to be known, have started deploying the protections, and are preparing the next moves.

The situation is under control :)

Time to create the docs to help our customers protect their services at #Ovh.
First, incomplete documentation to help our customers understand:
1) the general information
2) what customers have to do if they use our services, depending on the service
3) what #Ovh is doing, depending on the product

docs.ovh.com/fr/dedicated/i…

It will be improved over the next few days.
shared hosting:
yesterday, we deployed the new 4.14.11 on 10% of the infra.
today: No kernel panic, No noticeable impact on performance, No random reboot, No application errors, Customer feedback: none

Decision: GO to proceed on ALL servers. At 10pm all done.

More:
travaux.ovh.net/?do=details&id…
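The rollout described above (a 10% first wave, an observation period, then a GO decision) can be sketched as follows. This is an illustration only, not OVH's tooling: the function names and the observation keys are hypothetical.

```python
def canary_size(total_hosts, percent=10):
    """Number of hosts in the first wave: `percent` of the fleet, at least one."""
    return max(1, total_hosts * percent // 100)


def go_decision(observations):
    """GO only if every observed problem counter is zero."""
    return all(count == 0 for count in observations.values())


# Hypothetical counters mirroring the checks listed in the tweet.
observed = {
    "kernel_panics": 0,
    "performance_regressions": 0,
    "random_reboots": 0,
    "application_errors": 0,
    "customer_complaints": 0,
}
print("GO" if go_decision(observed) else "NO GO")
```

Any non-zero counter turns the decision into NO GO, which matches the all-clear list the tweet gives before proceeding on all servers.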
shared hosting:
variant 3: protected :)
Summary to understand « what to do » « when » « how to protect your service » = f ( product you use in #Ovh )

Details per Product
docs.ovh.com/fr/dedicated/i…

Details per OS:
docs.ovh.com/fr/dedicated/m…
#Meltdown ? #Spectre ?

.. it’s clear .. easy .. [image]
Is your OS already patched? #Meltdown #Spectre

Check it now:
docs.ovh.com/fr/dedicated/m…

Windows Server
vSphere
Debian
Red Hat Enterprise
Red Hat OpenStack
CentOS
Fedora
SUSE OpenStack
SUSE Enterprise
SUSE CaaS
Gentoo
Slackware
SmartOS
CloudLinux
Ubuntu
OpenSuse
Archlinux
OpenVZ
