#ConradMitzenmacher-7

"A. Notation and Terminology

Throughout this paper, the phrase ›log-ratio‹ for a pair of positive real numbers refers to the ratio of their logarithms (to a common base, the choice of which cancels out)..

All logarithms without an indicated base are understood to be taken to the base 𝑒..

II. Review of Definitions and History
Our treatment here is based on a recent survey by Mitzenmacher [#Internet #Mathematics 1(2), 226–251 (2004)], to which we refer the reader for more information. [Footnote: For instance, this survey describes another argument that leads to a #PowerLaw of word frequency based on preferential attachment, originally due to Simon [Biometrika 42(3/4), 425–440 (1955)]. We do not present this argument here.]
In what follows, we let 𝑓ⱼ be the (asymptotic) fraction of the time the 𝑗th most frequently used word appears.

..several words can have the same probability of occurrence..

..we will say that 𝑓ⱼ follows a #PowerLaw in 𝑗 if there exist..constants 𝑐₁,𝑐₂,α such that 𝑐₁𝑗^{–α}⩽𝑓ⱼ⩽𝑐₂𝑗^{–α} for sufficiently large 𝑗.
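
The two-sided bound in this definition can be probed numerically on a finite list of frequencies. The following is a minimal sketch, not part of the quoted paper: the function name `power_law_envelope`, the cutoff `j_min`, and the test distribution (an exact Zipf law on W = 1000 words) are assumptions chosen for illustration. A finite check like this can only suggest, never prove, the asymptotic statement.

```python
def power_law_envelope(freqs, alpha, j_min=1):
    """Tightest (c1, c2) with c1 * j**-alpha <= f_j <= c2 * j**-alpha for j >= j_min.

    freqs[0] is f_1, the frequency of the most frequent word.
    """
    scaled = [f * j ** alpha for j, f in enumerate(freqs, start=1) if j >= j_min]
    return min(scaled), max(scaled)


# Hypothetical test data: an exact Zipf law f_j = 1 / (j * H_W) on W words.
W = 1000
harmonic = sum(1.0 / j for j in range(1, W + 1))
zipf = [1.0 / (j * harmonic) for j in range(1, W + 1)]

# With the right exponent the envelope collapses to a single constant.
c1, c2 = power_law_envelope(zipf, alpha=1.0)

# With the wrong exponent the ratio c2 / c1 grows with W instead of staying bounded.
d1, d2 = power_law_envelope(zipf, alpha=2.0)
```

If c2 / c1 stays bounded as more ranks are included, the data are consistent with a power law of exponent α in the sense defined above. A ratio that keeps growing, as with α = 2 here, indicates the wrong exponent.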

We sketch #Mandelbrot’s argument that leads to a power law in the rank-frequency distribution of words [Communication Theory, W. Jackson, Ed; Butterworth 1953, 486–502].
Consider some language consisting of 𝑊 words. The cost of transmitting the 𝑗th most frequent word of the language is denoted by 𝐶_𝑗.
For example, if we think of #English text, the cost of a word might be thought of as the number of letters plus the additional cost of a space. We therefore naturally expect the most frequent words to have the smallest number of letters. Let us take the cost of a space to be 0. Then if the alphabet size is N>1, there are N^k possible words of length k (including k=0; we allow the empty word for convenience). In particular, the words with k letters have frequency ranks from 1+(N^k-1)/(N-1) to (N^{k+1}-1)/(N-1).
[This is obviously true: before the "words with k letters" there are all the words with 0 to k-1 letters, and there are 1+N+N^2+...+N^{k-1} = (N^k-1)/(N-1) of those. LG,I]

It follows that log_N 𝑗 ⩽ 𝐶_𝑗 ⩽ log_N 𝑗 +1.
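
The rank ranges and the resulting relation between cost and rank can be checked by direct computation. The sketch below is an illustration under stated assumptions, not the paper's own procedure: the helper names `rank_range` and `cost` are hypothetical, the cost of a word is taken to be its length (space cost 0, as above), and shorter words are assumed to occupy the better ranks. It checks the slightly looser two-sided bound |𝐶_𝑗 − log_N 𝑗| < 1. The looser form is used because the last rank in each length class (for example 𝑗 = 3 when N = 2, where 𝐶_𝑗 = 1 but log_2 3 ≈ 1.58) already exceeds N^k.

```python
import math


def rank_range(N, k):
    """First and last frequency rank of the words with exactly k letters."""
    before = (N ** k - 1) // (N - 1)  # number of words with 0..k-1 letters
    return before + 1, (N ** (k + 1) - 1) // (N - 1)


def cost(N, j):
    """Cost C_j of the rank-j word: its length, with the space costing 0."""
    k = 0
    while rank_range(N, k)[1] < j:
        k += 1
    return k


for N in (2, 3, 26):
    # Each length class contains exactly N**k ranks.
    for k in range(6):
        first, last = rank_range(N, k)
        assert last - first + 1 == N ** k
    # The cost stays within 1 of log_N j for every rank checked.
    for j in range(1, 2000):
        assert abs(cost(N, j) - math.log(j, N)) < 1
```

The loop over N = 2, 3, 26 is only a finite spot check. The general statement follows from the geometric sum in the bracketed note above.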

Suppose that we wish to design the language to optimize the average amount of #information per unit transmission cost.
Here, we take the average amount of information to be the entropy. We think of each word in our transmission as being selected randomly, and the probability that a word in the transmission is the 𝑗th word
