Many #Python beginners are confused by space characters, empty strings, and the term "whitespace." After all, how can nothing be something? (Or: How can something be nothing?)

A thread about nothing!

(Cue the Seinfeld theme, I guess...)

Given:

s = 'a b'

This string contains three characters. As humans, we only see the "a" and "b", two characters separated by a space.

But computers don't work that way: The space character is a character. It takes up just as much space in memory as either "a" or "b".
We can see this if we iterate over the characters in s, printing their Unicode numbers:

>>> s = 'a b'
>>> for c in s:
...     print(ord(c))
...
97
32
98
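
Here is a small sketch of another way to check that the space really counts. It uses the UTF-8 byte count as a rough stand-in for "space in memory" (that stand-in is an assumption for illustration; Python's internal string storage is a separate topic):

```python
s = 'a b'

# len() counts characters, and the space counts like any other.
length = len(s)

# Encoded as UTF-8, each of these three characters is exactly one byte.
encoded = s.encode('utf-8')

print(length)         # 3
print(len(encoded))   # 3
print(list(encoded))  # [97, 32, 98]
```
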
The empty string, by contrast, is ... empty, with zero length. It is not the same as the space character, as we can see here:

>>> s = ' '
>>> len(s)
1
>>> s = ''
>>> len(s)
0
>>> '' == ' '
False
Another example: Empty strings are False in a boolean context. All other strings, including space, are True:

>>> for s in ['', ' ']:
...     if s:
...         print(f'"{s}" is True-ish')
...     else:
...         print(f'"{s}" is False-ish')
...
"" is False-ish
" " is True-ish
You can use ' ' when splitting strings, but not '':

>>> s = 'ab cd ef'
>>> s.split(' ')
['ab', 'cd', 'ef']
>>> s.split('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: empty separator

(We'll return to str.split in a bit.)
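
If the idea behind s.split('') was to get one element per character, then list() is probably what you want. A minimal sketch (the variable names are just for illustration):

```python
s = 'ab cd'

# list() iterates over the string, yielding one element per character.
# The space shows up as an element of its own.
chars = list(s)
print(chars)  # ['a', 'b', ' ', 'c', 'd']
```
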
Space is just one of several characters that Python calls "whitespace." These characters affect how things are printed, but we only see them as empty space. The most common others are:

\n - newline, aka line feed
\r - carriage return
\t - tab
\v - vertical tab
When computers sent their output to printers, "going down one line" meant two separate actions: (1) returning the print head to the start of the line and (2) descending one line. Thus, two separate characters: carriage return + line feed.

Windows still uses CR+LF for "end of line." Unix uses just LF.
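
A quick sketch of why this matters in practice, using two made-up sample strings. Splitting on '\n' alone leaves a stray '\r' behind on Windows-style text, while str.splitlines handles both conventions:

```python
unix_text = 'first\nsecond\nthird'
windows_text = 'first\r\nsecond\r\nthird'

# Splitting on '\n' alone leaves a '\r' at the end of Windows-style lines.
naive = windows_text.split('\n')
print(naive)  # ['first\r', 'second\r', 'third']

# str.splitlines() recognizes both LF and CR+LF line endings.
unix_lines = unix_text.splitlines()
windows_lines = windows_text.splitlines()
print(unix_lines)     # ['first', 'second', 'third']
print(windows_lines)  # ['first', 'second', 'third']
```
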
Also: Typewriters had "tabs," indicating commonly used columns. Pressing the "tab" key jumped to the next tab.

Today, in most terminals, "tab" (\t) moves you to the next column that's a multiple of 8; run this code to see:

>>> print('\t'.join('abcdefg'))
>>> print('0123456789' * 5)
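
How wide a tab looks depends on your terminal, so here is a sketch that doesn't rely on the screen. str.expandtabs swaps each tab for the spaces needed to reach the next tab stop (every 8 columns by default), and we can then look at where each letter landed:

```python
line = '\t'.join('abc')

# expandtabs() replaces each tab with enough spaces to reach the next
# multiple of the tab size (8 unless told otherwise).
expanded = line.expandtabs()

# Where did each letter end up?
columns = [expanded.index(c) for c in 'abc']

print(repr(line))  # 'a\tb\tc'
print(columns)     # [0, 8, 16]
```
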
Each of these whitespace characters has its own Unicode code point:

>>> for c in ' \r\n\t\v':
...     print(f'{ord(c)}')
...
32
13
10
9
11

None of these is the same as the empty string. And all are True in a boolean context (e.g., "if" or "while").

BTW, I've never used \v.
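
If you'd rather not memorize the list, the standard library's string module keeps one. This sketch prints the code points in string.whitespace and checks the claim about truthiness. Note that it also contains form feed ('\f', code 12), which this thread doesn't otherwise mention:

```python
import string

# string.whitespace holds the ASCII characters Python counts as whitespace.
# Besides the five shown above, it includes '\f' (form feed, code 12).
codes = sorted(ord(c) for c in string.whitespace)
print(codes)  # [9, 10, 11, 12, 13, 32]

# Every one of them is a real, one-character string: not equal to '' and
# True in a boolean context.
all_truthy = all(bool(c) and c != '' for c in string.whitespace)
print(all_truthy)  # True
```
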
By default, the str.strip method removes all whitespace (any combination of these characters) from the start and end of a string:

>>> s = '\r\n\v a b c\t \r\n'
>>> s.strip()
'a b c'

Note: (1) Strings are immutable, so s is unaffected, and (2) It ignores internal whitespace.
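
A short sketch of both notes, plus str.strip's one-sided siblings, str.lstrip and str.rstrip:

```python
s = '\r\n\v a b c\t \r\n'

stripped = s.strip()  # whitespace removed from both ends
left = s.lstrip()     # removed from the start only
right = s.rstrip()    # removed from the end only

print(repr(stripped))  # 'a b c'
print(repr(left))      # 'a b c\t \r\n'
print(repr(right))     # '\r\n\x0b a b c'

# s itself is untouched: each method returned a new string.
print(len(s), len(stripped))  # 13 5
```
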
Another place we see whitespace is in the "str.split" method, which returns a list of strings. For example:

>>> s = 'a b c'
>>> s.split(' ')
['a', 'b', 'c']

Here, ' ' was used as the separator.

But what about this:

>>> s = 'a  b  c'
>>> s.split(' ')
['a', '', 'b', '', 'c']

We told Python that wherever it sees a space character, ' ', it should cut and create a new list element. So it did. Two spaces in a row have nothing between them, so each such pair gives us an empty string.

Solve this by calling str.split with no arguments:

>>> s.split()
['a', 'b', 'c']

This uses all whitespace — any length, any combination — as a field separator.
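
One handy use of this, sketched with a made-up messy string: split with no arguments, then join with single spaces, and you have normalized all the spacing.

```python
s = '  a \t b\n\n c  '

# No arguments: runs of any whitespace separate fields, and leading or
# trailing whitespace produces no empty strings.
fields = s.split()
print(fields)  # ['a', 'b', 'c']

# Joining the fields back with single spaces normalizes the spacing.
normalized = ' '.join(fields)
print(repr(normalized))  # 'a b c'
```
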
If you use regular expressions, you can describe "any one whitespace character" with the \s metacharacter. Similarly, \S means "anything *but* a whitespace character."

What?!? You don't know regular expressions? Learn them for free: RegexpCrashCourse.com
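
For a taste, here is a minimal sketch using the standard re module. The sample string is made up; \s+ means "one or more whitespace characters" and \S+ means "one or more non-whitespace characters":

```python
import re

s = 'a \t b\n\nc'

# \s+ matches one or more whitespace characters in a row, so this
# splits on any run of whitespace.
split_fields = re.split(r'\s+', s)
print(split_fields)  # ['a', 'b', 'c']

# \S+ matches runs of non-whitespace characters, so this finds the
# fields directly.
found = re.findall(r'\S+', s)
print(found)  # ['a', 'b', 'c']
```
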
This thread is the result of a question someone asked in this week's "Python for non-programmers" corporate training, wondering what I meant by "whitespace," and how it's different from space and the empty string.

Anything about whitespace I didn't cover here? Ask away!
Oh, I forgot to mention the str.isspace method. It returns True if the string contains at least one character and all of its characters are whitespace:

>>> s = ' \n\r\v\t'
>>> s.isspace()
True
>>> s = 'a \n\r\v\tb'
>>> s.isspace()
False
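
Two edge cases worth a quick sketch: the empty string is not "only whitespace" as far as str.isspace is concerned, and whitespace isn't limited to ASCII. The no-break space (U+00A0) is used here as one example of a non-ASCII whitespace character:

```python
empty = ''.isspace()       # no characters at all
ascii_ws = ' \t\n'.isspace()
nbsp = '\u00a0'.isspace()  # NO-BREAK SPACE counts as whitespace too

print(empty, ascii_ws, nbsp)  # False True True
```
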

• • •

Reuven M. Lerner, Python trainer

