A new paper on AI in materials discovery has gained a lot of attention - it reports a big increase in the productivity of scientists using AI tools, plus some interesting secondary effects of using AI for science.
But do the technical claims stack up? Let's see...
This paper is on inorganic materials discovery at an unnamed 'large US firm'. The firm must be pretty large, as it has more than 1,000 scientists working on materials discovery, specifically for "healthcare, optics, and industrial manufacturing".
The lab started using AI in 2022, and rolled out in waves to different (random) cohorts of scientists in a way that allows the author to assess the impact very conveniently.
As far as AI goes, 2022 was the neolithic. ChatGPT got released in November 2022.
Of course AI was being used in materials science before then (e.g. see the Zunger 2018 perspective cited), but with a much lower profile than now. This company must have been very forward-looking in 2022 to set up a multi-year, 1,000-participant randomised controlled study. Good for them.
The author was able to survey the staff and access their lab books, so they must have some level of integration with the company. Again, very forward-looking before all the AI hype.
But anyway on to the technical side.
The first question is what kind of materials are being studied. We are not told explicitly, but it is mentioned that inorganic materials are the focus, split into Biomaterials; Ceramics and Glasses; Metals and Alloys; and Polymers (Table 1).
The model "generates “recipes” for novel compounds predicted to possess specified properties."
Hmm, OK, seen this kind of thing before. A lot of detail is missing - but from Fig 3 and the novelty calculations it seems the model is predicting atomistic structures.
So we can flag some problems that have been highlighted previously with this kind of approach, by myself and others. Taking the four classes in turn:
Biomaterials: I have less experience here, but I don't think the approach in this paper can model them atomistically (I could be wrong).
Ceramics and Glasses: these will likely contain significant amounts of disorder, which is very hard to model in a high-throughput fashion. This was the undoing of the first Google DeepMind materials publication.
Metals and Alloys: the same as above - lots of disorder, very hard to model.
Polymers: may be possible to model this way, but if crystal structures are to be obtained I think it will be very difficult x.com/Robert_Palgrav…
Also, at this point it is worth noting that Cheetham and Seshadri is required reading for anyone trying to do this with AI - especially on the question "What is a new material?", which I don't think has been answered in the present paper.
Novelty - the paper uses the SOAP approach to assess whether 'new' materials appear in existing databases. The databases used are the Materials Project and Alexandria - both are databases of calculated structures, for extended and molecular systems respectively.
AFAIK they do not contain experimental structures, so the first issue here is that we are not comparing with real materials, only with previous calculations.
Second, when judging novelty are we comparing predicted structures or synthesised ones?
I very much expect we are comparing predicted structures to databases of calculated structures (because experimentally determining new structures is hard)
This kind of loop, which completely removes any experimental verification, especially when used with databases known to be missing vast swathes of materials space, is not very useful in my view, at least not for discovering new materials. Again, see the Cheetham and Seshadri paper cited above.
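For anyone unfamiliar with how this kind of check usually works, here is a minimal sketch of SOAP-based novelty screening using the dscribe and ase libraries. To be clear, this is my own illustration, not the paper's pipeline: the structures, SOAP parameters, and similarity threshold are all placeholders (parameter names assume a recent dscribe version).

```python
# Minimal sketch of SOAP-based novelty screening (illustrative only,
# not the paper's pipeline): compare a candidate structure against
# reference structures and flag it "novel" if no close match exists.
import numpy as np
from ase.build import bulk
from dscribe.descriptors import SOAP

# Placeholder candidate and reference structures
candidate = bulk("Si", "diamond", a=5.43)
references = [bulk("Si", "diamond", a=5.43), bulk("Si", "fcc", a=3.9)]

soap = SOAP(species=["Si"], periodic=True,
            r_cut=6.0, n_max=8, l_max=6,
            average="inner")  # one averaged vector per structure

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cand_vec = soap.create(candidate)
similarities = [cosine(cand_vec, soap.create(ref)) for ref in references]

# The 0.95 cutoff is arbitrary; the paper's threshold is not stated
is_novel = max(similarities) < 0.95
print(similarities, "novel?", is_novel)
```

Note that every object in this loop is a calculated structure; nothing here ever touches an experimentally determined one, which is exactly the problem.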
A further issue is that the SOAP method itself seems to have been modified in some unusual ways.
Maybe most important is this line in the appendix:
"The second term puts additional weight on atoms that are closer to the centroid, reflecting the fact that central atoms are more predictive of material properties."
OK, that premise is itself questionable, but more to the point these are periodic structures: there is no centre (or rather, the centre can be defined to be anywhere).
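A quick toy demonstration of that point (my own, not from the paper): under periodic boundary conditions, translating the whole crystal and wrapping the atoms back into the cell leaves the material physically unchanged, yet it changes which atoms are 'close to the centroid'.

```python
# Toy example: in a periodic cell, the "centroid" depends on the
# arbitrary choice of cell origin, so centroid-distance weighting
# is not well defined for a crystal.
import numpy as np
from ase import Atoms

# Two atoms, 1.5 A apart, in a 6 A periodic box
atoms = Atoms("NaCl",
              positions=[[1.0, 1.0, 1.0], [2.5, 1.0, 1.0]],
              cell=[6.0, 6.0, 6.0], pbc=True)

def dists_to_centroid(a):
    pos = a.get_positions()
    return np.sort(np.linalg.norm(pos - pos.mean(axis=0), axis=1))

before = dists_to_centroid(atoms)    # [0.75, 0.75]

shifted = atoms.copy()
shifted.translate([4.0, 0.0, 0.0])   # a pure translation: physically
shifted.wrap()                       # the same crystal, re-wrapped

after = dists_to_centroid(shifted)   # [2.25, 2.25]
print(before, after)  # distances to the "centroid" have changed
```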
Overall I'm not convinced by this measure of novelty at all.
Even if the predictions are novel, how was it verified that the materials actually produced were the same as the predictions? Nothing at all is said about this, which would be a fundamental step in any evaluation of a predictive model.
Materials quality is measured relative to (predefined?) targets - the examples given are band gap and refractive index. In such a wide-ranging research programme there would be many targets beyond those two, which seem pretty limited outside of very specific areas (especially band gap).
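For concreteness, 'quality relative to a target' presumably means something like the following toy scoring function (my sketch; the paper does not give its exact metric):

```python
# Toy sketch of scoring a candidate "relative to a target":
# fractional deviation of a predicted property from a target value.
def quality(predicted: float, target: float) -> float:
    """Return 1.0 at a perfect match, decreasing toward 0."""
    return max(0.0, 1.0 - abs(predicted - target) / abs(target))

print(quality(predicted=1.9, target=2.0))  # band gap in eV -> 0.95
```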
The significance of this paper is that it shows that new materials were produced (though I'm not convinced by their method of identifying new materials), new patents were filed, and eventually new prototypes were produced.
Of course, it is of quite high significance whether the new prototypes actually worked, but we will have to wait for the follow-up, I guess.
And that, overall, is my problem here again. There is lots on how AI was integrated into the process, but much less on verifying that the AI gave results that were actually useful.
A fascinating paper, and clearly a huge amount of work. Very interesting, and impressive that seemingly one student managed to conduct such a wide-ranging study at what must be a major company.
We have now completed our analysis of the new materials reported in the Google DeepMind / Berkeley autonomous lab paper. My own initial analysis is in the quote tweet.
Happy to have worked with @SchoopLab to jointly put together a comprehensive analysis, now available on @ChemRxiv.
This thread is my personal view after having looked at the detail of each of the materials reported 🧵
It is likely that no new materials were discovered.
Most of the materials produced were misidentified, and the rest were already known.
This happened because of problems in both the computational and the experimental portions of the work.
The computational predictions suffered because they generated structures with fully ordered cations where, in fact, the same or very similar compounds are already known with disordered cations.
This inability to deal with compositional disorder is an important limitation of the methods used here.
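To make the ordered/disordered distinction concrete, here is a small sketch using pymatgen (my illustration, not the paper's code). A disordered phase has sites with fractional occupancies, which a standard DFT prediction pipeline cannot handle directly; it has to output a fully ordered cell, which may be a genuinely different structure from the known disordered one.

```python
# Sketch of representing compositional disorder with pymatgen.
# DFT-based predictors need fully ordered cells, so pipelines must
# enumerate ordered approximants of a disordered phase (see e.g.
# pymatgen's OrderDisorderedStructureTransformation).
from pymatgen.core import Lattice, Structure

lattice = Lattice.cubic(4.2)

# Disordered rocksalt-like phase: one cation site shared 50:50
disordered = Structure(
    lattice,
    [{"Mg": 0.5, "Fe": 0.5}, {"O": 1.0}],
    [[0, 0, 0], [0.5, 0.5, 0.5]],
)
print(disordered.is_ordered)  # False

# Any fully ordered "prediction" of this composition is a distinct,
# lower-symmetry crystal - which is how the mismatches above arise.
```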
This time, we are told this is a sulfoapatite, a type of compound where a sulfide (S²⁻) ion replaces the oxide ion - so the claimed formula is Pb₉.₁Cu₀.₉(PO₄)₆S.
Similar compounds do exist: for example Ca₁₀(PO₄)₆S and Sr₁₀(PO₄)₆S, as do others, but the Pb version has not been made before, as far as I can see.
You initially published the M = In and Fe compounds in the hexagonal crystal system, and the M = Zr, Hf and Sn compounds as tetragonal.
You now appear to claim that all are in fact doped Sb2Pb2O7, which is cubic and has a different structure to any of those you published in your paper.
The problem is you published two different structures for this series. The Sn, Zr and Hf you claim are tetragonal, and the Fe and In you claim are hexagonal.
Further, the structures you published are completely different from each other and from the known cubic Sb2Pb2O7 - see below:
This exciting paper shows AI design of materials and robotic synthesis: 10s of new compounds in 17 days.
But did they?
This paper has very serious problems in materials characterisation. In my view it should never have got near publication. Hold on tight, let's take a look 😱
The paper reports many 'new' compounds. The only characterisation they show is powder XRD. What, no compositional analysis? Nope.
But maybe it's ok if they did a great job with the XRD analysis?
Oh.
For clarity, if you are not used to looking at PXRD: the residual (red line) should be as flat as possible. Here the residual is bigger than most of the peaks. There is no way this is a reliable refinement EVEN if it were combined with other characterisation. But as the sole form of characterisation? No.
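For anyone who wants numbers rather than eyeballing: Rietveld refinements are usually judged by the weighted profile residual R_wp. Here is a toy calculation (my own illustration, with synthetic data) of how a residual comparable to the peaks blows up R_wp:

```python
# Toy calculation of the weighted profile residual R_wp that a
# Rietveld refinement minimises: a good fit gives a small R_wp and
# a residual curve that is flat on the scale of the peaks.
import numpy as np

def r_wp(y_obs: np.ndarray, y_calc: np.ndarray) -> float:
    w = 1.0 / np.clip(y_obs, 1e-9, None)  # standard 1/y weighting
    return float(np.sqrt(np.sum(w * (y_obs - y_calc) ** 2)
                         / np.sum(w * y_obs ** 2)))

# Synthetic "pattern": one Gaussian peak, fitted with the peak
# misplaced in angle, so the residual is as big as the peak itself
two_theta = np.linspace(10, 60, 2000)
y_obs = 100 * np.exp(-((two_theta - 30.0) / 0.2) ** 2) + 5
y_calc = 100 * np.exp(-((two_theta - 30.3) / 0.2) ** 2) + 5
print(r_wp(y_obs, y_calc))  # large: an unreliable refinement
```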
Latest replication attempt, and this one claims zero resistance. The paper is published as a video (obviously), link in Alex's thread, but here are my comments on the chemistry. 🧵
The authors use 4 different compositions. They replicate the reaction of the original Korean paper, with lanarkite + Cu3P, but also use a different lead precursor, Pb3O2(SO4).
They also use a 1:1 ratio of precursors as well as a more Cu3P-rich ratio.
I'm not sure of the reason for changing the ratio; the original is Cu-rich but P-poor, so maybe they are trying to add enough P to react fully.
The 1:1 ratio gave them very similar results according to XRD for both Pb precursors. So one conclusion is that the exact Pb precursor may not be vital.