Raymond Hettinger
Chief trainer for Mutable Minds. Certified Public Accountant, Retired Python guru. Alloy & TLA⁺ enthusiast. Aspiring pianist. Former pilot. Born at 320 ppm CO₂.
Jul 13, 2022 5 tweets 1 min read
#Python tip: Create variable width fields in f-strings with an inner pair of curly braces.

>>> s = 'hello'
>>> n = 10
>>> f'{s:^{n}}'
'  hello   '

>>> n = 20
>>> f'{s:^{n}}'
'       hello        '

1/
Interestingly, this works for any part of the format specifier, including justification:

>>> s = 'aloha'
>>> just = '<'
>>> f'{s:{just}10}'
'aloha     '

>>> just = '>'
>>> f'{s:{just}10}'
'     aloha'

2/
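A small sketch of how the nested fields combine in practice. The helper name format_row and its parameters are hypothetical (not from the thread); it only shows alignment, width, and precision arriving as ordinary arguments and being spliced into the format specifier.

```python
def format_row(label, value, width=10, align='<', precision=2):
    """Hypothetical helper: build one table row from runtime parameters."""
    # Each inner {...} is evaluated first, then inserted into the spec,
    # so '{align}{width}' becomes something like '<8' before formatting.
    return f'{label:{align}{width}}|{value:>{width}.{precision}f}'

row = format_row('pi', 3.14159, width=8)
print(repr(row))   # 'pi      |    3.14'
```

The same function can right-justify the label by passing align='>' with no change to the f-string itself.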
Jun 12, 2022 5 tweets 1 min read
Structural pattern matching in #Python supports float and complex literals in case statements.

However, exact equality tests for float/complex are often a bad idea.

Here's a fix.

match approximately(1.1 + 2.2):
    case 3.3:
        print('hit!')

1/
Without the approximately() wrapper, the case would not match due to round-off error.

>>> 1.1 + 2.2 == 3.3
False

>>> 1.1 + 2.2
3.3000000000000003

>>> 1.1 + 2.2 - 3.3
4.440892098500626e-16

So, we need an approximate match wrapper.

2/
Jun 5, 2022 8 tweets 3 min read
Here's a PDF for my #Python #PyConIT2022 talk: Structural Pattern Matching in the Real World: New tooling, real code, problems solved.

This is intermediate and advanced level Structural Pattern Matching.

tl;dr The “good stuff” is in section 1.2

dropbox.com/s/w1bs8ckekki9…

You'll find #Python pattern matching recipes for:

* Replacing literals with variables and named constants
* Replacing literals with regexes
* Replacing literals with function calls
* Replacing literals with set membership tests
Mar 11, 2022 4 tweets 1 min read
#Python tip: The default_factory feature of a defaultdict is only useful when building up a dictionary (automatically adding missing keys).

However, that feature is a menace when doing lookups (risking accidental dict mutation).

Consider converting back to a regular dict.

1/
from collections import defaultdict

# During build-up, we want the factory magic.
d = defaultdict(list)
for elem in data:
    d[feature(elem)].append(elem)

# Magic is no longer useful.
d = dict(d)

# Lookups are now safe.
print(d[some_feature])

2/
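A runnable illustration of the hazard, using made-up sample data (the word list and variable names are assumptions for the demo). It shows a plain lookup inserting a key into the defaultdict, and the regular dict raising KeyError instead.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana']

groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word)

# A plain lookup of a missing key quietly inserts an empty list.
missing = groups['z']
leaked = 'z' in groups

# Converting to a regular dict restores the usual KeyError on lookups.
plain = dict(groups)
del plain['z']
try:
    plain['q']
    raised = False
except KeyError:
    raised = True

print(leaked, raised)
```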
Feb 19, 2022 5 tweets 1 min read
#Python structural pattern matching factlet:

Class patterns with positional arguments match by attribute lookup on the names in __match_args__.

Accordingly, they match normal attributes, descriptors, and virtual attributes implemented with __getattribute__ or __getattr__.

1/
class V:
    'Virtual attribute'
    __match_args__ = ('a',)
    def __getattribute__(self, attr):
        return 10 if attr == 'a' else 0

>>> match V():
...     case V(10):
...         print('hit')
...
hit

2/
Feb 8, 2022 8 tweets 2 min read
#Python news: It was always awkward to write a type annotation for methods that returned self (an instance of the current class). As of yesterday, typing.Self was added to make this much easier and more readable.

It is a big win.

1/
This works great for methods that return self.

def set_scale(self, scale: float) -> Self:
    self.scale = scale
    return self

2/
Jan 31, 2022 13 tweets 2 min read
We are often burdened by preconceived ideas, ideas that we invented, acquired on related projects, or heard about in class. Sometimes we undertake a project in order to try out a favorite idea. Such ideas may not be derived from our requirements by a rational process -- Parnas

This thought is equally interesting if you switch the judgment from "burdened" to "empowered".

Either way, the core idea remains that programs have a human dimension that transcends requirements and "rational design".
Jan 15, 2022 12 tweets 3 min read
Argh! Who thought Black should be automatically applied to lines in the IPython CLI?

This makes it less useful for educational purposes, less useful for interactive math, and annoying when it rewrites your input across multiple lines.

In a #Python course, if you want to demonstrate that print('hello') and print("hello") are the same, then too bad. The CLI rewrites both to use double quotes and the students can't see what you were demonstrating.
Nov 18, 2021 10 tweets 3 min read
#Python success: I was finally able to make a type annotated pure python version of max().

All it took was a protocol bound to a typevar, a custom sentinel class, unions, overloads, isinstance checks, casts, repeated arguments, and the / and * notation.

gist.github.com/rhettinger/beb…

I normally use the "is" operator to test for sentinel values but mypy needs an isinstance() check to distinguish that case.

Checking isinstance(x, object) always matches, so a custom Sentinel class was required.
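A much smaller sketch of the sentinel idea, not the gist itself: the function first() and the names Sentinel and MISSING are hypothetical. It shows a dedicated sentinel class paired with an isinstance() test, which is the kind of check the thread says mypy needs in order to tell the "no default given" case apart.

```python
from typing import Iterable, TypeVar, Union

T = TypeVar('T')
D = TypeVar('D')

class Sentinel:
    """Hypothetical marker type meaning 'no default supplied'."""

MISSING = Sentinel()

def first(iterable: Iterable[T],
          default: Union[D, Sentinel] = MISSING) -> Union[T, D]:
    for item in iterable:
        return item
    # isinstance() against a dedicated class separates the sentinel
    # from every legitimate default, including None.
    if isinstance(default, Sentinel):
        raise ValueError('first() arg is an empty iterable')
    return default

print(first([3, 4]), first([], default=None))
```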
Nov 11, 2021 4 tweets 1 min read
#Python's structural pattern matching is new, so the best practices aren't yet known.

Just added to my personal list: Add a comment or assertion noting when case ordering is important.

Otherwise, a future maintainer will be bitten by the illusion of case independence.

1 of 4

match x:
    case bool():
        ...
    case int():
        assert not isinstance(x, bool)
        ...
    case Counter():
        ...
    case dict():
        assert not isinstance(x, Counter)
    case _:
        raise TypeError

2 of 4
Nov 9, 2021 5 tweets 1 min read
#Python tip: Structural pattern matching works with abstract base classes such as: Container, Hashable, Iterable, Iterator, Reversible, Generator, Sized, Callable, and Collection.

match obj:
    case Hashable():
        ...

Matching a collection ABC is preferable to looking for the required methods directly.

The complication is that those methods may be present but could be set to None.

The ABCs listed above handle the None checks for you.
Apr 6, 2021 4 tweets 1 min read
#Python factlet: The dict.popitem() method is guaranteed to remove key/value pairs in LIFO order.

>>> d = dict(red=1, green=2, blue=3)
>>> d.popitem()
('blue', 3)
>>> d.popitem()
('green', 2)
>>> d.popitem()
('red', 1)

1/
In contrast, OrderedDict.popitem() supports both FIFO and LIFO extraction of key/value pairs.

>>> from collections import OrderedDict
>>> d = OrderedDict(red=1, green=2, blue=3)

>>> d.popitem(last=False) # FIFO
('red', 1)

>>> d.popitem() # LIFO
('blue', 3)

2/
Jan 3, 2021 4 tweets 1 min read
#Python factlet: The len() function insists that the corresponding __len__() method return a value x such that:

0 ≤ x.__index__() ≤ sys.maxsize

* 3.0 and '3' don't have an __index__ method.
* -1 is too small.
* sys.maxsize+1 is too big.

1/
You could call __len__() successfully, but the len() function fails:

class A:
    def __len__(self):
        return -1

>>> a = A()

>>> a.__len__()
-1

>>> len(a)
...
ValueError: __len__() should return >= 0

2/
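The three bullet cases can be checked in one place. This is a sketch with hypothetical helper names (make_sized, len_outcome); it reports which exception len() raises for each kind of bad return value.

```python
import sys

def make_sized(value):
    """Build an object whose __len__() returns the given value."""
    class Sized:
        def __len__(self):
            return value
    return Sized()

def len_outcome(value):
    """Return len()'s result, or the name of the exception it raised."""
    try:
        return len(make_sized(value))
    except (TypeError, ValueError, OverflowError) as exc:
        return type(exc).__name__

for candidate in (3, 3.0, '3', -1, sys.maxsize + 1):
    print(repr(candidate), '->', len_outcome(candidate))
```

Values without an __index__ method give TypeError, negative values give ValueError, and values above sys.maxsize give OverflowError.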
Jan 2, 2021 4 tweets 2 min read
@yera_ee Each way has its advantages.

With dataclasses, you get nice attribute access, error checking, a name for the aggregate data, and a more restrictive equality test. All good things.

Dicts are at the core of the language and are interoperable with many other tools: json, **kw, …

@yera_ee Dicts have a rich assortment of methods and operators.
People learn to use dicts on their first day.
Many existing tools accept or return dicts.
pprint() knows how to handle dicts.
Dicts are super fast.
JSON.
Dicts underlie many other tools.
Dec 27, 2020 4 tweets 1 min read
1/ #Python tip: Override the signature for *args with the __text_signature__ attribute:

import random

def randrange(*args):
    'Choose a random value from range(start[, stop[, step]]).'
    return random.choice(range(*args))

randrange.__text_signature__ = '(start, stop, step, /)'

2/ The attribute is accessed by the inspect module:

>>> import inspect
>>> inspect.signature(randrange)
<Signature (start, stop, step, /)>
Oct 25, 2020 5 tweets 3 min read
1/ #Python tip: The functools.cache() decorator is astonishingly fast.

Even an empty function that returns None can be sped-up by caching it. 🤨

docs.python.org/3/library/func…

2/ Here are the timings:

$ python3.9 -m timeit -r11 -s 'def s(x):pass' 's(5)'
5000000 loops, best of 11: 72.1 nsec per loop

$ python3.9 -m timeit -r11 -s 'from functools import cache' -s 'def s(x):pass' -s 'c=cache(s)' 'c(5)'
5000000 loops, best of 11: 60.6 nsec per loop
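The same comparison can be run from inside Python with the timeit module. This is a rough sketch: the loop counts are arbitrary choices for the demo, and the numbers it prints depend on the machine and interpreter version, so no particular result is implied. Requires Python 3.9 or later for functools.cache.

```python
from functools import cache
from timeit import repeat

def plain(x):
    pass

cached = cache(plain)
cached(5)  # prime the cache so the timed calls are pure lookups

LOOPS = 100_000
plain_time = min(repeat('f(5)', globals={'f': plain}, number=LOOPS, repeat=5))
cached_time = min(repeat('f(5)', globals={'f': cached}, number=LOOPS, repeat=5))

print(f'plain:  {plain_time / LOOPS * 1e9:.1f} nsec per call')
print(f'cached: {cached_time / LOOPS * 1e9:.1f} nsec per call')
```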
Sep 6, 2020 5 tweets 2 min read
1/ #Python data science tip: To obtain a better estimate (on average) for a vector of multiple parameters, it is better to analyze sample vectors in aggregate than to use the mean of each component.

Surprisingly, this works even if the components are unrelated to one another.

2/ One example comes from baseball.

Individual batting averages near the beginning of the season aren't as good a predictor of performance as individual batting averages that have been “shrunk” toward the collective mean.

Shockingly, this also works for unrelated variables.
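One common way to express this kind of shrinkage is a positive-part James-Stein estimator that pulls each value toward the grand mean. The sketch below is an illustration under strong assumptions, not the thread's own code: every component is assumed to have the same known sampling variance, and the function name, sample averages, and variance are invented for the demo.

```python
from statistics import fmean

def shrink_toward_mean(values, variance):
    """Positive-part James-Stein shrinkage toward the grand mean.

    Assumes each value has the same known sampling variance.
    """
    k = len(values)
    grand = fmean(values)
    spread = sum((v - grand) ** 2 for v in values)
    if k < 4 or spread == 0:
        return list(values)          # too few components to shrink
    # Shrinkage factor: 1 keeps the raw values, 0 collapses to the mean.
    factor = max(0.0, 1.0 - (k - 3) * variance / spread)
    return [grand + factor * (v - grand) for v in values]

early_averages = [0.400, 0.350, 0.300, 0.250, 0.200]   # made-up data
shrunk = shrink_toward_mean(early_averages, variance=0.004)
print([round(v, 3) for v in shrunk])
```

Each shrunk value lies between its raw value and the collective mean, and the mean itself is unchanged.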
Aug 31, 2020 6 tweets 2 min read
Another building block for a #Python floating point ninja toolset:

def veltkamp_split(x):
    'Exact split into two 26-bit precision components'
    t = x * 134217729.0
    hi = t - (t - x)
    lo = x - hi
    return hi, lo

csclub.uwaterloo.ca/~pbarfuss/dekk…

Input: one signed 53-bit precision float

Output: two signed 26-bit precision floats

Invariant: x == hi + lo

Constant: 134217729.0 == 2.0 ** 27 + 1.0
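A quick check of the stated invariant, plus one consequence of the halves being narrow: their product fits in a double without rounding. The function is repeated so the block runs on its own; the helper split_is_exact and the sample inputs are assumptions for the demo, and Fraction is used as an exact reference.

```python
from fractions import Fraction

def veltkamp_split(x):
    'Exact split into two 26-bit precision components'
    t = x * 134217729.0          # 2.0 ** 27 + 1.0
    hi = t - (t - x)
    lo = x - hi
    return hi, lo

def split_is_exact(x):
    hi, lo = veltkamp_split(x)
    # Invariant from the thread: the halves sum back to x with no error.
    sums_back = (hi + lo == x)
    # Narrow halves multiply without rounding in double precision.
    product_exact = (Fraction(hi) * Fraction(lo) == Fraction(hi * lo))
    return sums_back and product_exact

samples = (0.1, 3.141592653589793, 1e10 / 3)
print([split_is_exact(x) for x in samples])
```

Exact partial products like this are the reason the split is useful as a building block for error-free multiplication.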
Aug 9, 2020 9 tweets 4 min read
#Python tip: #hypothesis is good at finding bugs; however, often as not, the bug is in your understanding of what the code is supposed to do.

1/
Initial belief: The JSON module is buggy because #hypothesis finds cases that don't round-trip.

Bugs in understanding:
* The JSON spec doesn't have NaNs
* A JSON module feature is that lists and tuples both serialize into arrays but can't be distinguished when deserialized.

2/
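Both misunderstandings can be reproduced with the standard library alone. A short sketch (variable names are illustrative): a tuple comes back as a list, NaN is emitted as the non-standard token NaN by default, and passing allow_nan=False makes the encoder refuse it.

```python
import json
import math

point = (1, 2)
restored = json.loads(json.dumps(point))    # tuple in, list out

nan_text = json.dumps(float('nan'))         # emitted as NaN by default
nan_back = json.loads(nan_text)             # a NaN never equals itself

try:
    json.dumps(float('nan'), allow_nan=False)
    strict_rejects_nan = False
except ValueError:
    strict_rejects_nan = True

print(restored, nan_text, strict_rejects_nan)
```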
Jun 24, 2020 7 tweets 2 min read
#Python tip: Given inexact data, subtracting nearly equal
values increases relative error significantly more
than absolute error.

4.6 ± 0.2 Age of Earth (4.3%)
4.2 ± 0.1 Age of Oceans (2.4%)
___
0.4 ± 0.3 Huge relative error (75%)

This is called “catastrophic cancellation”.

1/
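The arithmetic in the table can be reproduced with a small sketch. The function name is hypothetical, and it assumes worst-case propagation, where the absolute errors of the two inputs simply add.

```python
def subtract_with_error(a, a_err, b, b_err):
    """Return (difference, absolute error, relative error)."""
    diff = a - b
    abs_err = a_err + b_err          # worst case: absolute errors add
    rel_err = abs_err / abs(diff)    # small difference, large relative error
    return diff, abs_err, rel_err

diff, abs_err, rel_err = subtract_with_error(4.6, 0.2, 4.2, 0.1)
print(f'{diff:.1f} ± {abs_err:.1f} ({rel_err:.0%})')   # 0.4 ± 0.3 (75%)
```

The absolute error grew only from 0.2 to 0.3, while the relative error jumped from a few percent to 75%.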
The subtractive cancellation issue commonly arises in floating point arithmetic. Even if the inputs are exact, intermediate values may not be exactly representable and will have an error bar. Subsequent operations can amplify the error.

1/7th is inexact but within ± ½ ulp.

2/
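A floating point illustration of the same effect, with values chosen for the demo. Adding 1e-15 to 1.0 rounds the sum to the nearest representable double; the rounding error is tiny relative to 1.0, but after subtracting 1.0 it is large relative to what remains.

```python
x = 1e-15

total = 1.0 + x              # rounded to the nearest double near 1.0
computed = total - 1.0       # the subtraction itself is exact

# Error relative to the value we hoped to recover.
rel_err = abs(computed - x) / x

print(f'computed = {computed!r}')
print(f'relative error = {rel_err:.1%}')
```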
Jun 19, 2019 6 tweets 2 min read
#python 3.8 Good news for anyone working on number theory problems.

The three-argument form of pow() just got more powerful. When the exponent is -1, it computes modular multiplicative inverses.

>>> pow(38, -1, 137)
119
>>> 119 * 38 % 137
1

RSA key generation example:

prime1 = 865035927998844907
prime2 = 13228623409150767103
totient = (prime1 - 1) * (prime2 - 1)
private = 9262355554452364883609426718195904769
public = pow(private, -1, totient)
assert public * private % totient == 1
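To see the inverse doing its job end to end, here is a toy round trip with deliberately tiny textbook primes. All values are chosen for illustration and are far too small to be secure. Requires Python 3.8 or later for the negative exponent.

```python
p, q = 61, 53
modulus = p * q                      # 3233
totient = (p - 1) * (q - 1)          # 3120

public = 17
private = pow(public, -1, totient)   # modular multiplicative inverse

message = 65
ciphertext = pow(message, public, modulus)
recovered = pow(ciphertext, private, modulus)

print(private, recovered)
```

Because public * private is congruent to 1 modulo the totient, raising to one exponent and then the other returns the original message.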