
Recent

Dec 18
5 AI Evals Traps Every AI Team Should Know About:
(and what actually works)
1. Relying on Generic Metrics

Trap: You treat "hallucination," "toxicity," "helpfulness" as success metrics.

Why it fails: generic metrics miss domain-specific failure modes and can create false confidence.
Do this instead: use generic metrics only to triage traces (sort, filter, surface weird cases). Let the real metrics emerge from the failure modes you observe. See the next point.

Example: You can't fix "10% hallucinations." You can fix "fails to parse invoice dates in this format."
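One way to picture "generic metrics only to triage traces" is the minimal Python sketch below. Everything in it is an assumption made for illustration: the `Trace` class, the `generic_score` field (imagined here as a 0 to 1 "suspicion" score from some off-the-shelf metric), the threshold, and the sample traces are all hypothetical. The point is only that the score decides which traces a human reads first, not whether the system passes.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    trace_id: str
    output: str
    # Hypothetical generic score in [0, 1]; higher means "look at this first".
    generic_score: float


def triage(traces, top_k=2, threshold=0.5):
    """Return the most suspicious traces for manual review, highest score first."""
    flagged = [t for t in traces if t.generic_score >= threshold]
    flagged.sort(key=lambda t: t.generic_score, reverse=True)
    return flagged[:top_k]


# Invented sample data, loosely based on the invoice-date example.
traces = [
    Trace("t1", "Invoice date: 2024-03-05", 0.10),
    Trace("t2", "Invoice date: 05/03/24 (ambiguous)", 0.80),
    Trace("t3", "Invoice date: unknown", 0.95),
    Trace("t4", "Invoice date: March 5th", 0.55),
]

for t in triage(traces):
    # A reviewer reads these and writes down the concrete failure mode.
    print(t.trace_id, t.generic_score, t.output)
```

The concrete failure modes a reviewer notes while reading the surfaced traces (for example, "fails to parse invoice dates in this format") are what would then become the real, domain-specific metrics.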
Read 17 tweets
Dec 18
This is a very good summary of the threat the climate crisis poses to our food security, with a few ecological caveats I need to explain. Overall, it supports all my threads about this.

1/🧵 theguardian.com/environment/ng…
The big caveat from an ecological perspective is this. Most of these projections are based on average yields, which is a standard academic approach. However, we need to acquaint ourselves with the law of the ecological minimum, or the limiting factor.
2/
This is often called Liebig's law, or the law of the minimum, and dates from agricultural science in the 19th century. However, I don't want to get into the academic history of this concept; I want to explain what it means for us in practice.

3/ en.wikipedia.org/wiki/Liebig%27…
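The gap between an average-based projection and a limiting factor can be sketched in a few lines of Python. This is a deliberately simplified reading of the law of the minimum, not a crop model: the function names, the factor list, and every number are invented for illustration, and each factor is expressed as the fraction of the crop's requirement that is met.

```python
def limiting_factor_yield(potential_yield, factor_supply):
    """Law-of-the-minimum view: the scarcest factor caps the yield."""
    return potential_yield * min(factor_supply.values())


def average_based_yield(potential_yield, factor_supply):
    """Averaging view: shortfalls in one factor are diluted by the others."""
    return potential_yield * sum(factor_supply.values()) / len(factor_supply)


# Hypothetical season: everything is adequate except water.
factors = {"water": 0.4, "nitrogen": 1.0, "phosphorus": 1.0, "sunlight": 1.0}
potential = 10.0  # hypothetical tonnes per hectare under ideal conditions

print("limiting factor:", limiting_factor_yield(potential, factors))
print("average based:  ", average_based_yield(potential, factors))
```

In this toy case the averaging view reports a much higher yield than the limiting-factor view, because one severe shortfall is averaged away.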
Read 17 tweets
Dec 18
1/20 THIS hypocrite used to HATE Trump...a "Never Trumper"...but then he became GREEDY and started working with him. He is just as BAD a LIAR as Trump is. Just so you KNOW #DozingDonny and J.D. Cants...
1. Energy prices are NOT fucking down so STOP LYING about it. Energy costs
2/20 are UP, PERIOD.
2. Gas prices are NOT $1.99 a gallon in 4 states so STOP LYING about it! PolitiFact only found a handful of individual gas stations, NOT entire states, that briefly offered gas at $1.99. Statewide averages remained well above $2.50, with the average
3/20 nationwide at $2.98 per gallon.
3. Trump has NOT dropped the cost of drugs 400%/500%/600%. ANY 5th grade math student KNOWS you can't drop the price of something by MORE than 100%. If a drug costs $100.00, and you drop the price 100%, that item now costs ZERO! At 600%
Read 21 tweets
Dec 18
To preserve chain-of-thought (CoT) monitorability, we must be able to measure it.

We built a framework + evaluation suite to measure CoT monitorability (13 evaluations across 24 environments) so that we can actually tell when models verbalize targeted aspects of their internal reasoning. openai.com/index/evaluati…
Monitoring a model's chain-of-thought is far more effective than watching only its actions or final answers.

The more a model "thinks" (longer CoTs), the easier it is to spot issues.
RL at today's frontier doesn't seem to wreck monitorability and can help early reasoning steps. But there's a tradeoff: smaller models run with higher reasoning effort can be easier to monitor at similar capability, at the cost of extra inference compute (a "monitorability tax").
Read 5 tweets
Dec 18
$ZIM Value For A Buyer Is Way Above $30

To be clear, I wasn't planning to write anything further about $ZIM. But then I came across this exchange, essentially a detailed critique of an analyst, and its analysis aligns exactly with the core argument hedge funds and activists are already making.

If $ZIM is being bought out, the buyer isn't purchasing a stock ticker or a short-term earnings stream. They are acquiring assets. And when you look at ZIM's asset base, conservative estimates put that value at around $70 (which includes $25 in cash) per share. Against that backdrop, the idea that a rational buyer would acquire those assets for $25 per share simply because that happens to be the cash balance makes no economic sense.

It's long but it's a MUST read:

Market Valuation VS. Asset Valuation

I urge readers to take the time to read my entire post. It's important. This article conveys some very troubling analysis! Like most things, there is some good mixed in with the bad. So I'll start with the good.

The key point, above all else, that Melissa's article makes clear is that until the recent flare up of multiple bidders interested in a Zim buy out proposition, virtually all valuation of Zim as a company by Wall Street analysts, Seeking Alpha analysts, ME, or anyone else, was done based on "Market Valuation" principles. We all know those principles very, very well. They involved evaluations based on valuation metrics such as P/E, EV/EBIT, and other popular data points. Those methods often use DCF analysis or other ways to discount the forecast profits of Zim over time back to a present day net worth.

By all of those metrics and measurement techniques, the "market valuation" method results in low share prices and a low market cap, because Zim faces challenging, UNSTABLE market conditions, including trade war impacts, potential Red Sea reopening impacts, excess capacity from orderbooks for new ships, and other factors. As Melissa correctly says, Zim's business prospects for the immediate future are "grim." The inevitable outcome of the market valuation method has been analyst share-price predictions of about $9 to $15 for most of 2025.

However, that's only ONE method of valuation. The other method normally is not used by investors unless one of two things is about to happen to a company: 1) it's about to be liquidated in bankruptcy with all its assets being sold off or 2) it's about to be bought out by a new owner. We are now in the second scenario, as everyone knows. Therefore, for the moment it is CRITICAL to toss aside all "market valuation" techniques and the results from those assessments in favor of the results that come from the Asset Valuation technique. When a sale is about to occur, it's a transfer of assets, and that demands valuing those assets via the Asset Valuation methods.

I dare say since she started covering Zim, most SA readers have really liked her work, finding it among the best and most accurate assessment of Zim offered by any SA analyst. That's normally how I see her work too! But I don't give free passes when an analyst gets out into deep water over their head and flounders helplessly around!

So what I write next isn't intended to be flattering. It's intended to show how Melissa has utterly failed to correctly apply the Asset Valuation technique in a rational, logical manner. And, more importantly, to make sure that SA readers don't blindly accept her conclusions as fact. In this case, I strongly dispute her conclusion that a $25 to $30 per share sale price is appropriate for Zim.
With all due respect, asset valuation of business assets is a significant part of my work as a Certified General Real Estate Appraiser, which I have been for 47 years running. While I haven't personally valued a container shipping company, I have dealt with industrial, commercial, and special purpose enterprises of all descriptions. In addition to basic real estate (office buildings, warehouses, land, etc.) nearly every business I have ever valued has also had many other assets to additionally value, using one method or another out of recognized appraisal principles. When it comes to asset valuation, we are in my wheelhouse now!

These enterprises may have anything from corporate airplanes, airfields, hangars, small boats, docks, trucks and other rolling stock, furniture, fixtures, and equipment, and various special duty items, such as massive hoists or similar pieces of equipment. The things I have encountered and dealt with are not all that dissimilar to much of what Zim holds as its corporate assets.

Many, if not most, operating businesses also lease various items, and they have goodwill that has value, trademarks, corporate brands, intellectual property, stocks or other investments in other enterprises or assets, and so much more. Zim has all of these, meaning there is a LOT TO VALUE with Zim! Melissa didn't even try to value any of those assets as, frankly, she probably doesn't even know what they all are, much less how to value them.

It often takes a team of skilled, certified and qualified appraisers to develop a genuine value for the assets of a company like Zim. Evercore, the financial adviser hired by Zim, most likely has a team of its own staff or outside valuation experts working on a defensible and solid valuation of Zim. Most of the time if I am working on a project like that it is for bankruptcy, but it can just as easily be for determining the worth of all the assets, at their current fair market value on the open market, for a sale to a new buyer. I do both. In any event, valuing a very large company like Zim is a gargantuan undertaking, not one that a financial writer can do on a napkin.

That's the problem analysts such as Melissa face. How does a single analyst trying to write a fast article come up with anything more than a "wild guess" at what Zim's assets are REALLY worth? It's not easy. As someone who has done this for a living, I'd even say it's "next to impossible" for any analyst to do all by themselves, me included.

But believe me here, Melissa started out on the right path by realizing that the total value of Zim's assets is likely FAR MORE than most see. It's just hard to be accurate when "counting all the beans" so to speak. Her problem is she attempted doing a "back of napkin" valuation and did reach a numerical conclusion. Yet she then IGNORED her own value by suggesting the company could be sold to a new buyer at only about 40% of the value she calculated!

However, for the beans she did count...which absolutely is NOT everything Zim actually does own at its fair market value....she did a decent job.

But after reading the article, exactly WHAT does Melissa think Zim's assets are worth using the Asset Valuation approach? Oddly, she never totaled it all up! So let me do that for her, right here quoting her own value conclusions. What she did do was value three categories of assets. She valued these three categories as follows:

1) "this implies a fleet value above $5 billion just for the newbuilds ordered between 2021 and 2022."

2) "...14 car carriers chartered on shorter-term contracts, plus 16 owned container ships and handling equipment valued at over $1 billion."

3) "the company still has close to $3 billion in cash, or around $25 per share, after paying the last dividend."
So by Melissa's own calculations and written statements, the above three categories of assets total up to:

"$9+ Billion" (my total of her numbers)

Stopping right here, let's ask ourselves how that compares to Zim's own balance sheet. For the past 2 years in a row, Zim has reported total assets (depreciated) on its balance sheet of almost $11 Billion. So her $9+ Billion estimate of these three categories still puts the total assets of Zim at about $1.8 to $2 Billion LESS THAN the balance sheet shows.

As an appraiser, I contend she simply has failed to value EVERYTHING but only picked out the three bigger and easier categories to value. That's fine as far as it goes, but she needed to tell readers that the company has about another $2 Billion stated on its balance sheet as legitimate assets that she didn't value. Anything less misleads readers into thinking she has valued all the assets when clearly she has not done that.

Again, as an appraiser, in my narrative above I gave a long list of what those "other assets" are for Zim. They have value and SHOULD be included, not simply ignored because the financial writer doing an SA analysis either doesn't know what they are or has no idea how to value them! At the least, she should tell her readers that the $9+ Billion of assets she is valuing in her article DOES NOT constitute all of Zim's total assets.

While we are paused, for the sake of understanding, let's compute what Melissa's $9 Billion of assets are worth, using HER claims of their worth, on a per share basis. Follow along, it's very simple math.

$9 Billion / 120,451,503 shares outstanding = $74.72 per share.

Are you shocked? Don't be. Melissa is about right in her valuation of those $9 billion of assets. So the assets, if free of liabilities, would be worth about $75 per share. And we clearly know there are about $2 Billion more in assets on the company's balance sheet she did NOT value.
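For readers who want to check the arithmetic above, here is a minimal Python sketch. The three category values are the rounded figures quoted from the article ("above $5 billion", "over $1 billion", "close to $3 billion"), the share count is the one used in the post, and the balance-sheet total is the "almost $11 Billion" figure rounded up; the variable names are made up for this example.

```python
# Rounded figures quoted in the post, in US dollars.
newbuild_fleet = 5_000_000_000               # "above $5 billion"
other_ships_and_equipment = 1_000_000_000    # "over $1 billion"
cash = 3_000_000_000                         # "close to $3 billion"

shares_outstanding = 120_451_503
balance_sheet_total = 11_000_000_000         # "almost $11 Billion" (rounded)

# Sum of the three categories the article actually valued.
valued_assets = newbuild_fleet + other_ships_and_equipment + cash

# Per-share value of those assets, ignoring liabilities as the post does.
per_share = valued_assets / shares_outstanding

# Balance-sheet assets that the three categories do not cover.
not_valued = balance_sheet_total - valued_assets

print(f"Valued assets: ${valued_assets / 1e9:.0f} Billion")
print(f"Per share:     ${per_share:.2f}")
print(f"Not valued:    about ${not_valued / 1e9:.0f} Billion")
```

Because the inputs are rounded, the result is an approximation of the same order as the "about $75 per share" stated above, not a precise appraisal.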

Ok, let's get back to the main theme here and work with the above a little more. Here is where Melissa's wheels run off her cart. She winds up stuck in mud in a ditch so far off the road it is just beyond me to even comprehend how she put herself in that spot. I would encourage her to "lift up your eyes and see the light." Redemption is always possible. LOL.

After saying this about the leases of the new build ships:

"Despite being considered as debt, an acquirer will value whether those leases are below or above the market price, and right now, they are below it. This suggests that, instead of being a liability, those leases have value."

Fine. She clearly said the leases ARE NOT a liability. I fully agree. Melissa, if you acknowledge that below-market-rate leases ADD value to these new build ships, because the cost to use these vessels is below today's costs, meaning a new buyer gets to use ships CHEAPER than they could find in the market today, where did you calculate a "lease by lease analysis" of those ships to determine how much additional value they add to Zim's balance sheet?
Read 6 tweets
Dec 18
Some ppl seem to want to tell me why I'm angry about the @The_ACNA ecclesiastical trial

Guys, I've been angry for years. Question is: why aren't you angry?

I am *embarrassed* and *scandalized* at the incompetence on display w/the trial & I'll tell you some reasons why 🧵
The trial court scolded advocates & ppl bringing presentments; that supercilious framing resembles the leadership immaturity that has characterized the @The_ACNA every step of the way.

It's the point of the trial to investigate.
Using the trial summary to kind of go off on any and everyone who tried to effect any kind of change *while at the same time* acknowledging how inept, chaotic, ill-informed, and damaging ACNA responses/processes were is beyond the pale. +
Read 29 tweets
Dec 18
In the shadowed alleys of Chicago, where grief howls like winter winds, Carli B. Frueh unlocks a magic door to the impossible. "A Journey Through Chicago's Shadows to Never Never Neverland" is a raw, visceral odyssey through loss and renewal, where a mother's shattering pain
over her daughter's tragic death propels her into parallel worlds of hummingbirds, fireflies, and unbroken joy. But what if escape means confronting the multiverse's cruel mirrors: alternate selves untouched by sorrow? Blending speculative wonder with unflinching emotion, Frueh
challenges us: Can consciousness bridge dimensions, turning heartache into a portal of healing? This short story isn't just a tale; it's an invitation to rewrite your own reality, where love defies death and shadows birth stars.
Read 4 tweets
Dec 18
New: Florida lawmakers may soon give the state's powerful sugar industry the legal leverage to sue its critics into silence.

🧵...
A provision in the "farm bill" in the Florida Legislature would make it easier for U.S. Sugar & Florida Crystals to wage defamation suits against environmental groups, news outlets and others who criticize the companies over issues like Everglades restoration and air pollution...
It would do so by expanding a niche law initially passed in the mid-nineties at the behest of agribusiness lobbyists, the same kind of law that the beef industry once tried to use to sue Oprah Winfrey after the talk show host aired a segment about mad cow disease...
Read 11 tweets
Dec 18
@michaeldweiss 1).
"Retired Army Lt. Gen. Michael T. Flynn, who was once @POTUS @realDonaldTrump's national security adviser, was hired as a consultant for the Bosnian Serb republic eight years after he admitted to secretly working to benefit the Turkish government.
@michaeldweiss @POTUS @realDonaldTrump 2).
@realDonaldTrump pardoned Flynn in late 2020, ending one of the most closely watched prosecutions to emerge from the Russia investigation.
@michaeldweiss @POTUS @realDonaldTrump 3).
Flynn had pleaded guilty to lying to the @FBI about his contacts with Russia's ambassador, then sought to undo that plea before the @TheJusticeDept moved to dismiss the charge."

Dec. 17, 2025
Read 5 tweets
Dec 18
Crisis in Russia

The first official data on staff reductions and wage declines has emerged. So far, this only covers the Kuzbass coal industry.

Read this thread
1/
Here are official figures from the Kuzbass Ministry of Coal Industry.
Amid a decline in coal production for the second year running, layoffs began in March.
By November, more than 8,600 employees, or one in every 11, had been laid off!
2/
But the situation didn't stop with layoffs alone. In August, wages began to be cut for those who remained employed. As a result, cumulative wage growth for the first two months was +13%, but by August it had increased by only +4%. This means that wages actually fell in August.
3/
Read 5 tweets
Dec 18
Hi, I'll be live-tweeting today's IndyGo Board of Directors meeting starting at 4:00 p.m. for #indydocumenters @indydocumenters @mirrorindy.
@indydocumenters @mirrorindy It sounds like the meeting started at 4:00, although I am watching the live stream and the audio is going in and out.
@indydocumenters @mirrorindy The audio is coming through fine now.
Read 28 tweets
Dec 18
DAWN is one of the few companies of this cycle with a vision and team that's a multigenerational bet.
ISPs that continue to shaft you via geographic monopolies are only going to degrade the quality of the internet, charging you *more* while letting existing infrastructure rot in your apt buildings.
@dawninternet isn't aiming to create a system for just this generation of WiFi; we at @RepublicCrypto are working hand in hand on establishing an economic framework for DAWN that will revitalize and reincentivize with every new upgrade, not only in WiFi but in all connectivity
Read 6 tweets
