1/ I asked #GPT4 to review our paper with @LDjaparidze. This is what happened and what I learned in the process. medrxiv.org/content/10.110…
2/ Since the general problem practitioners run into (in the worst way) is training-set contamination (guilty as charged), and habits die hard, the first thing I did was ask for a review of the paper without giving it any extra knowledge of what the paper says.
3/ From the response alone I learned two things. First, our paper title is deadly accurate. Second, the response carries no information whatsoever: the entire thing could have been generated from the title alone.
4/ I tried a few times to find leakage, but couldn't spot signs that the paper had been in the training set (or, if it was, it is very diluted). I then sent the main body text (including tables) and asked it to do an "accurate peer review".
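The thread probed for leakage by manual prompting. A cruder, automatable proxy (not the method used here, and all names hypothetical) is to measure how much of a passage a model reproduces verbatim: high n-gram overlap between the source text and a model's completion is a common memorization signal. A minimal sketch:

```python
def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """All contiguous n-grams of a token list, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def leakage_score(reference: str, completion: str, n: int = 5) -> float:
    """Fraction of the reference's word n-grams that the completion
    reproduces verbatim. Near 1.0 suggests memorization; near 0.0
    suggests the model is not regurgitating the source."""
    ref = ngrams(reference.split(), n)
    comp = ngrams(completion.split(), n)
    if not ref:
        return 0.0
    return len(ref & comp) / len(ref)
```

In practice one would feed the model the opening of the paper and score its continuation against the real next paragraph; this sketch only shows the scoring step.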
5/ #GPT4 is a very capable summarization engine. I (as the author) would say the summary is quite good, even though it hallucinated the title. One very interesting tidbit: it is losing context over time, which will become evident in the following examples.
6/ In the work we introduced an innovative way to constrain the model equations using data that can be collected in field studies. This has very profound implications for the accuracy of the models, so I asked #GPT4 to explain why we did it.

It checks out.
7/ Our paper is 40 pages long, and the second appendix includes everything we learned after the initial release date (we wanted to keep the body fixed to the data available when the analysis was performed). So I reset the conversation and added the second appendix as well.
8/ This is where things get interesting. It is pretty obvious that the context window has reached its limit: immediately after finishing, #GPT4 offers a summary that focuses only on the Appendix 2 findings, not on the main paper.
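The behavior described here (a long paper plus appendix overflowing the context, so the summary latches onto the most recent material) is why long documents are usually split into window-sized chunks and summarized piece by piece. A minimal, model-agnostic sketch of the chunking step, using word counts as a stand-in for tokens (all names hypothetical):

```python
def chunk_text(text: str, budget: int) -> list[str]:
    """Greedily split text into chunks of at most `budget` words each,
    so every chunk fits a model's context window on its own.
    Word count is a rough proxy for the model's real token count."""
    words = text.split()
    return [
        " ".join(words[i:i + budget])
        for i in range(0, len(words), budget)
    ]
```

Each chunk would then be summarized separately and the partial summaries merged in a final pass, avoiding the recency bias the thread observed.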
9/ This became much more evident when I asked it to propose an abstract. Though, based on this proposal, it looks like there is enough substance to turn Appendix 2 into a whole paper :D
10/ This was fun.

• • •

Thread by Federico Andres Lois (@federicolois)