Aaron Hall, 🐍 Professor, NYC, 🇺🇸 @aaronchall

High level #python #tutorial thread:
Python is a high level language that runs bytecode in its virtual machine.
Everything is an object.
It supports OOP up to multiple inheritance and functional programming.
Typing is polymorphic - duck typing is preferred

A Python script is a file with a name like name.py - we indent at four spaces (it matters), comments (after #) don't execute - and is typically written like:

def main(): # function
    print("Hello world")

if __name__ == '__main__': # then entry point
    main()

A module is a reusable script:

"""this is a docstring at the top of the module - gives help on the module"""

import this # do imports at the top

def main(): # main at top or bottom
    """functions get docstrings too"""

if __name__... always at the bottom, assumed from now on

Note: A function that doesn't return anything (like both mains above) returns None.
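
A quick sketch to check the None claim yourself (the name `result` is only for illustration):

```python
def main():
    """Print a greeting; there is no return statement."""
    print("Hello world")

result = main()        # prints: Hello world
print(result is None)  # True
```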

Keywords are special. You can't reuse keywords as names for other things.

As of this writing #Python has 33 reserved keywords (the exact count varies by version) - in an interpreter:
>>> import keyword
>>> len(keyword.kwlist)
33

Python also has 72 builtin functions:

>>> import builtins
>>> len([name for name in dir(builtins) if name[0].islower()])
72

len is a function that tells you how long a list or other sized container is.
Because the function names aren't keywords, you could overwrite them. Don't.

"import" is a keyword - you can't call a function or variable you make yourself import. We import the builtins module here.

"dir" is a function that gives you a list of names of the attributes of an object.

Above, we use a list comprehension to create a list of function names.

List comprehensions have 3 parts:

First part (map), required: `f(X)` - just X is X (like below)
Middle, required: `for X in iterable`
Last (filter), optional: `if filter(X)`

Builtin functions start with lowercase letters:
[name for name in dir(builtins) if name[0].islower()]
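
To make the three parts concrete, here is a rough sketch comparing the comprehension with an explicit loop. The variable names (`lowercase_names`, `loop_names`, `name_lengths`) are made up for this example:

```python
import builtins

# The comprehension from above: no mapping (just name), plus a filter.
lowercase_names = [name for name in dir(builtins) if name[0].islower()]

# The same logic written as an explicit for loop.
loop_names = []
for name in dir(builtins):
    if name[0].islower():
        loop_names.append(name)

# All three parts at once: map (len), iterate, and filter.
name_lengths = [len(name) for name in dir(builtins) if name[0].islower()]

print(lowercase_names == loop_names)               # True
print(len(name_lengths) == len(lowercase_names))   # True
```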

recap:
"import", "def", "for", "in", and "if" are keywords - Python will stop you from overwriting them.
"len" and "dir" are functions, Python *won't* stop you from overwriting them.

For reusability, write modules and functions with docstrings. Import at the top.
Use descriptive names.
For example:

import builtins

def list_builtin_functions():
    """returns a list of the builtin functions"""
    return [name for name in dir(builtins) if name[0].islower()]

The nice thing about good names for functions and variables is that they can make comments unnecessary. That's good, because comments can become obsolete without being fixed or removed. Names tend to stay correct.

Please be careful about your naming!

The name[0] syntax is called "subscript notation". For sequences, we can use it to get items by index. Python is zero indexed - indexes start at 0.

>>> subscriptable = "a string"
>>> subscriptable[0]
'a'

With negative indexes, start from the end:
>>> subscriptable[-1]
'g'

That's the intro - (I think) I have explained everything I've shown thus far.

Now recall that everything is an object. Some objects can be changed, in-place. Others can't.
This property is called mutability.
Immutable objects can't be changed in-place.

Mutable objects include lists (which are ordered), sets (unordered with only unique elements), and dicts (ordered by insertion - mappings of keys to values).

Immutable objects include all numbers, strings of characters, tuples (which are like lists), and frozensets (like sets).

Numbers include integers (int), floats, and complex numbers (with real and imaginary parts).
You can do math with them - they follow the algebraic order of operations.
Parentheses, Exponents (use **), Mult, Div, Add, and Sub, or PEMDAS.

You can also do division like you learned it in elementary school, where you have a whole number (floor division) and a remainder ("modulo")

Floor division uses a double slash, //
>>> 5 // 3
1

Modulo uses the % sign:
>>> 5 % 3
2

"5 divided by 3 is 1 with a remainder of 2."

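If you want both results at once, the builtin divmod returns them as a pair. A small sketch (variable names are just for illustration), including what happens with a negative number, since floor division rounds down rather than toward zero:

```python
quotient, remainder = divmod(5, 3)   # same as (5 // 3, 5 % 3)
print(quotient, remainder)           # 1 2
print(quotient * 3 + remainder)      # 5 - we can rebuild the original

# Floor division rounds toward negative infinity:
neg_quotient, neg_remainder = divmod(-5, 3)
print(neg_quotient, neg_remainder)   # -2 1
```
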
Strings have lots of useful methods. Methods are functions attached to the object. Users are expected to use methods that don't start with an underscore.

This is the public API. str has 44 such public methods:

>>> len([attr for attr in dir(str) if not attr.startswith('_')])
44

Let's turn that bit of code into a self-documented function:

def public_api(obj):
    """The public API of an object is the collection of
    attributes that don't start with an underscore.
    """
    return [attr for attr in dir(obj) if not attr.startswith('_')]

For example, the public api of an int has 8 names (methods plus data attributes like real and imag), which we almost never use:

>>> public_api(int)
['bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']

We use str's 44 a lot:

>>> public_api(str)
['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace'...

(run the code yourself to see all of them)
Some we use more than others. Common ones are .split and .join.
(Note that .join is a method of the joining string):

>>> jenny = "555.867.5309"
>>> jenny.split('.')
['555', '867', '5309']
>>> '-'.join(jenny.split('.'))
'555-867-5309'

Of course, for that example we should probably just replace the '.' with '-':

>>> jenny.replace('.', '-')
'555-867-5309'

We've also seen .startswith:
>>> jenny.startswith('555')
True

There's also .endswith:
>>> jenny.endswith('309')
True

Recall, again, strings are immutable.
When we bind multiple names to an immutable value and then seemingly change it through one name, the object itself isn't changed. Instead, that name now points to a new object built from the original and your modification.

Here's an example: foo and bar are both just different names that point to the same 'blah' string object:

>>> foo = bar = 'blah'
>>> foo
'blah'
>>> bar
'blah'

When foo is "modified" it just points to a new string:

>>> foo += " what?"
>>> foo
'blah what?'
>>> bar
'blah'

Anyway, here are some other #python string methods you should know:
.casefold (for comparing case insensitive)
.maketrans to make a table for .translate
.(r)partition
.rsplit (like .split but from the right - both take a maxsplit to limit splits)
.(l/r)strip to remove whitespace
.splitlines to split on newlines
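
Here is a rough tour of the methods listed above. The sample strings and variable names are invented for illustration:

```python
# .casefold: compare without regard to case (stronger than .lower)
same = "Straße".casefold() == "STRASSE".casefold()
print(same)          # True

# .maketrans builds a table that .translate applies character by character
table = str.maketrans("._", "--")
translated = "555.867_5309".translate(table)
print(translated)    # 555-867-5309

# .partition splits once at the first separator, .rpartition at the last
head = "key=value=more".partition("=")
tail = "key=value=more".rpartition("=")
print(head)          # ('key', '=', 'value=more')
print(tail)          # ('key=value', '=', 'more')

# .split and .rsplit with a maxsplit of 1
from_left = "a.b.c".split(".", 1)
from_right = "a.b.c".rsplit(".", 1)
print(from_left)     # ['a', 'b.c']
print(from_right)    # ['a.b', 'c']

# .strip removes leading and trailing whitespace
stripped = "  padded  ".strip()
print(repr(stripped))  # 'padded'

# .splitlines handles different newline styles
lines = "one\ntwo\r\nthree".splitlines()
print(lines)         # ['one', 'two', 'three']
```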

tuple is the next immutable type. parens are only req'd for the empty tuple:

>>> empty_tuple = ()
>>> empty_tuple
()

Commas are req'd for non-empty tuples:

>>> one_tuple = 'one element',
>>> one_tuple
('one element',)
>>> two_tuple = 'foo', 'bar'
>>> two_tuple
('foo', 'bar')

You can use trailing commas in #python:
>>> a_tuple = 1, 2, 3, 4, 5, 4, 3, 2, 1,

This allows you to add lines without version control showing more lines edited than needed.

tuples only have two methods. lists also have these methods:
>>> public_api(tuple)
['count', 'index']

tuple.count returns how many times a given value occurs in it:
>>> a_tuple.count(1)
2

.index only returns the index of the first matching element:
>>> a_tuple.index(4)
3

lists are mutable (can change in-place)
These methods mostly do what you'd think:

>>> public_api(list)
['append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

Want to mutate a list in place but keep the original? Copy it first.

Sharing mutable state?

>>> a = b = []
>>> a.append('abc')
>>> b
['abc']

and

>>> lists = [[]]*2
>>> lists[0].extend('abc')
>>> lists
[['a', 'b', 'c'], ['a', 'b', 'c']]

In both cases, "both" lists are the *same* list.

(Note extend adds each element of an iterable, while append adds its argument as a single element.)
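
One way to avoid the sharing, sketched with made-up names: copy before mutating, and build inner lists with a comprehension so each one is a distinct object.

```python
original = [3, 1, 2]
working = original.copy()   # a new list with the same elements
working.sort()              # sorts working in place; original is untouched
print(original)             # [3, 1, 2]
print(working)              # [1, 2, 3]

# A comprehension creates a fresh list on every iteration,
# unlike [[]] * 2, which repeats the same list object.
lists = [[] for _ in range(2)]
lists[0].extend('abc')      # adds 'a', 'b', 'c' as three elements
lists[1].append('abc')      # adds 'abc' as one element
print(lists)                # [['a', 'b', 'c'], ['abc']]
```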

sets are like the containers you learned about in math class.
They have 1) no semantic order, and 2) only keep unique elements:

>>> a_set = {3,2,1,2,3}
>>> a_set
{1, 2, 3}

You can subtract sets:
>>> a_set - {1,2}
{3}

find their symmetric difference:
>>> a_set ^ {2,3,4}
{1, 4}

union them:
>>> a_set | {4,5}
{1, 2, 3, 4, 5}

get their intersection
>>> a_set & {1,2,5}
{1, 2}

Not yet mutated, but they *are* mutable:
>>> a_set
{1, 2, 3}

These operations can be done in-place.
>>> a_set |= {4,5}
>>> a_set
{1, 2, 3, 4, 5}

>>> public_api(set) # methods:
['add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']
#python

Those methods do what you'd think they'd do.
Want to be sure?
Do, for example,
>>> help(set.update)

To put an element in a set, it needs to be hashable.
A hash is an arbitrary calculation based on the value of the element.
The hash will always be the same for the life of the process.

>>> hash(1234)
1234
>>> hash('a')
-5912694457115165145
>>> hash('b')
478268980950274941

Hashes allow Python to make very fast lookups, because they only have a very small chance of overlapping - but they *can* overlap:

>>> [hash(i) for i in range(-5, 6)]
[-5, -4, -3, -2, -2, 0, 1, 2, 3, 4, 5]

(hash returns -2 for -1 because CPython's C code reserves -1 to signal an error...)

Sets are very fast because of hashing.
But since hashes are based on the value, and mutable objects' values can change, we can't hash mutable objects:

>>> hash(set())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

We can get around this by freezing the set. This is a frozenset object:

>>> fs = frozenset('abc')
>>> fs
frozenset({'c', 'b', 'a'})
>>> hash(fs)
-2704306362333158484

Now we can put a set in a set:
>>> a_set = {frozenset('abc'), 2,3}
>>> a_set
{2, 3, frozenset({'c', 'b', 'a'})}

The frozenset is, again, immutable, while the set is mutable.

Now the dict object.
The dict is an insertion-ordered, mutable mapping of keys->values.

>>> a_dict = dict(a=1, b=2)
>>> a_dict
{'a': 1, 'b': 2}

We can lookup the values by the keys:
>>> a_dict['a']
1

If the key isn't there, we get a KeyError:

>>> a_dict['c']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'c'

We use the dict.get method to work around this:

>>> a_dict.get('c', 'default')
'default'

The default default is None.

We can also set a default (which gets it at the same time):

>>> a_dict
{'a': 1, 'b': 2}
>>> a_dict.setdefault('c', 3)
3
>>> a_dict
{'a': 1, 'b': 2, 'c': 3}

This is an underappreciated dict method. Know it. Use it.
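
A typical use, as a rough sketch: grouping items into lists without first checking whether the key exists. The `words` data and the `by_letter` name are invented for this example.

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = {}
for word in words:
    # setdefault returns the existing list for this key,
    # or inserts and returns a new empty list if the key is missing.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```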

dicts have other methods:

>>> public_api(dict)
['clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']

I don't use .fromkeys because of this:
>>> d = dict.fromkeys('ab', [])
>>> d['a'].append('?')
>>> d
{'a': ['?'], 'b': ['?']}
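
One possible alternative (my suggestion, not something shown earlier in the thread) is a dict comprehension, which builds a new list for each key. With an immutable default like 0, .fromkeys is harmless.

```python
# Each key gets its own list, because [] is evaluated once per key.
d = {key: [] for key in 'ab'}
d['a'].append('?')
print(d)        # {'a': ['?'], 'b': []}

# Sharing an immutable default causes no surprises.
counts = dict.fromkeys('ab', 0)
counts['a'] += 1
print(counts)   # {'a': 1, 'b': 0}
```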

But the others occasionally have their use, especially if you want to iterate:

>>> [key * value for key, value in a_dict.items()]
['a', 'bb', 'ccc']

.values just iterates over the values.

Iterating over a dict yields its keys, so to iterate over the keys just use the dict itself:
>>> '-'.join(a_dict)
'a-b-c'

.keys and .items are also set-like (.values is not, since values need not be unique or hashable), so you can do, for example:

>>> a_dict.keys() - {'a'}
{'c', 'b'}
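
A small sketch of the set-like behavior for comparing two dicts. The `other` dict and the result names are made up here:

```python
a_dict = {'a': 1, 'b': 2, 'c': 3}
other = {'b': 2, 'c': 30, 'd': 4}

common_keys = a_dict.keys() & other.keys()     # keys present in both
same_items = a_dict.items() & other.items()    # key/value pairs that match
only_here = a_dict.keys() - other.keys()       # keys missing from other

print(sorted(common_keys))   # ['b', 'c']
print(same_items)            # {('b', 2)}
print(only_here)             # {'a'}
```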

Again, to find out more, just do help(dict.method)
#python #programming #tutorial #thread

So to recap, we covered the basic structure of a module and the various builtin types. From here we should discuss syntax and keywords more thoroughly.

if statements only require the first if. Pseudocode:

if condition():
    do_something()
elif other_condition():
    do_something_else()
elif third_condition():
    do_third_possible_thing()
else: # otherwise the above were all false so do this:
    do_only_other_possible_thing()

elifs allow us to avoid nesting ifs:

if this():
    do_that()
else: # No! Use elif instead!
    if that():
        do_this()
    else:
        if other():
            do_other_thing()

the elif's allow us to avoid repetitive nesting.

Maybe trivial to point out, but when the conditions are independent, we might want to just use if's:

if this():
    do_this()
if that():
    do_that()

This is just to point out that the else and elif's are completely optional...
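
Here is a runnable sketch of both styles. The functions `describe` and `tags` are hypothetical, written only to contrast an exclusive if/elif/else chain with independent ifs:

```python
def describe(n):
    """Exactly one branch runs: the first condition that is true."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    elif n % 2 == 0:
        return "positive and even"
    else:
        return "positive and odd"

def tags(n):
    """Independent conditions: any number of these may apply."""
    result = []
    if n % 2 == 0:
        result.append("even")
    if n % 3 == 0:
        result.append("divisible by 3")
    return result

print(describe(4))   # positive and even
print(tags(6))       # ['even', 'divisible by 3']
print(tags(7))       # []
```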

For loops can be of the form:

for each_element in an_iterable:
    do_something_with(each_element)

The keywords are "for" and "in".

#python's for loops are for-each loops, they iterate over each element in an iterable.

Other optional relevant keywords used in for (and while) loops are:

"continue" - stops the current iteration, but continues the loop.
(Perhaps there could be a better word for it.)

"break" - stops the entire (inner) iteration/loop.

"else" - runs if loop didn't break.

Full for-loop syntax for #python:

for i in it:
    if skip_this_loop_for(i):
        continue # go to next i!
    if stop_loop(i):
        break # stop looping, else skipped too
    do_something(i)
else:
    # didn't break
    finished_without_breaking()
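
A concrete version you can run, as a sketch. The function `first_even` is hypothetical and only exists to exercise continue, break, and else together:

```python
def first_even(numbers):
    """Return the first non-negative even number, or None if there is none."""
    for n in numbers:
        if n < 0:
            continue      # skip negatives, go on to the next n
        if n % 2 == 0:
            found = n
            break         # stop looping; the else below is skipped
    else:
        found = None      # the loop ran to the end without a break
    return found

print(first_even([3, -4, 7, 8, 10]))   # 8
print(first_even([1, 3, -2]))          # None
```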

while loops iterate until a condition changes or they break.

They can be just:

while condition():
    do_something_again()

Trouble-spot: I sometimes see the below when the above is intended:

while True:
    do_something_again()
    if condition():
        break

#python's full "while" loop syntax:

while looping():
    res = prework()
    if res:
        continue # back up to next loop
    if stop_loop():
        break # stop looping, else skipped too
    do_something()
else:
    # didn't break - looping() returned False
    no_break()
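
A runnable sketch of while with break and else. The function `collatz_steps` and its `limit` parameter are made up for this example; it counts steps of the Collatz rule (halve if even, else triple and add one) until reaching 1.

```python
def collatz_steps(n, limit=1000):
    """Return how many steps n takes to reach 1, or None if limit is hit."""
    steps = 0
    while n != 1:
        if steps >= limit:
            break                 # gave up: the else below is skipped
        if n % 2 == 0:
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    else:
        return steps              # condition became false without a break
    return None

print(collatz_steps(6))             # 8
print(collatz_steps(1))             # 0
print(collatz_steps(27, limit=10))  # None
```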

There are also function and class definitions. We've already seen some function definitions. The first string is a docstring - it automatically gets added to the help of the function.

def do_something(arg, kwarg='default'):
    "help on the function"
    # more code here
    return ...

You can (and should) put your code into functions for reusability. You can give them any number of arguments, but try to keep the number low. You can also give them defaults. You can also decorate them and annotate them, but I won't go into that now.

When a function has *yield* in it, when called, it returns a *generator*:

def func():
    yield 'foo'
    yield 'bar'

>>> gen = func()
>>> list(gen)
['foo', 'bar']

Generators get used up:
>>> list(gen)
[]

We may call func again though:
>>> list(func())
['foo', 'bar']
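
Generators produce values one at a time, which you can see with the builtin next. A small sketch (the `countdown` function is invented for illustration):

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1

gen = countdown(3)
first = next(gen)    # runs the body up to the first yield
rest = list(gen)     # consumes whatever is left
again = list(gen)    # nothing left: the generator is used up

print(first)   # 3
print(rest)    # [2, 1]
print(again)   # []
```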

We've seen builtin datatypes (e.g. int, list) which have data and methods.

Class definitions can define custom objects, data, and methods.

If you have:
- functionality but no data, maybe use modules instead.
- data but no functionality, maybe use builtin datatypes instead.

When a class inherits from a parent class (or more), we get to use the methods defined in the parent class.
We can say that the child class "is-a" type of the parent.
Be careful to make child classes substitutable for the parent, so that code written for the parent doesn't break.

Here's a simple class definition:

class MyObject(object):
    def __init__(self, a, b):
        self.a, self.b = a, b

#python's objects have special methods that start and end with "__". They allow custom objects to use Python's syntax:

>>> my_ob = MyObject(2, 3)
>>> my_ob.b
3
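
To show a few more special methods in action, here is a sketch of a hypothetical `Pair` class (not from the thread). Each special method hooks into one piece of syntax or one builtin:

```python
class Pair:
    """A pair of values, a and b."""

    def __init__(self, a, b):          # Pair(2, 3)
        self.a, self.b = a, b

    def __repr__(self):                # repr(pair), and the REPL display
        return f"Pair({self.a!r}, {self.b!r})"

    def __eq__(self, other):           # pair == other
        if not isinstance(other, Pair):
            return NotImplemented
        return (self.a, self.b) == (other.a, other.b)

    def __len__(self):                 # len(pair)
        return 2

    def __getitem__(self, index):      # pair[0], pair[-1]
        return (self.a, self.b)[index]

pair = Pair(2, 3)
print(repr(pair))           # Pair(2, 3)
print(pair == Pair(2, 3))   # True
print(len(pair))            # 2
print(pair[0], pair[-1])    # 2 3
```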

There are about as many special methods as there are syntax structures. If you're deeply interested, read the docs here: docs.python.org/3/reference/da…
Or my answer on #stackoverflow here: stackoverflow.com/q/40272161/541…
Or my #pygotham talk here:

We talked about classes to emphasize that some things are types of other things. We've seen that there are various types of data.

bool, the type of True and False, is a subclass of int

>>> issubclass(bool, int)
True

to recreate this, we might start with:

class Bool(int): ...
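
Because bool is-a int, True and False behave like 1 and 0 in arithmetic. A quick sketch (variable names are just for illustration):

```python
is_int = isinstance(True, int)
total = True + True
how_many = sum([True, False, True])   # a handy way to count matches

print(is_int)                  # True
print(total)                   # 2
print(how_many)                # 2
print(True == 1, False == 0)   # True True
```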

In programming, sometimes things happen that we didn't expect. Perhaps we unintentionally attempt to divide by zero. Or we open a file that isn't there. We get an exception.
How does our program continue? It continues with exception handling, with try-blocks.

#python has two kinds of try blocks, their use-cases:

try-except: if there's a type of error (or its subclass), handle it. Be specific here to avoid hiding bugs. Do not use a bare except or catch BaseException.

try-finally: error or not, guarantee you do something

They look like this:

try: ...
except SpecificException as exc: ... # be specific!
else: ... # optional, runs if no exception at all, avoids hiding bugs in the try block.
finally: ... # optional

and

try: ...
finally: ...

finally is *guaranteed* to run before the block leaves.
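
A runnable sketch that records which clauses run. The function `safe_divide` and its `log` list are hypothetical, only there to make the control flow visible:

```python
def safe_divide(a, b):
    """Divide a by b, returning (result, log of which clauses ran)."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError as exc:   # be specific!
        log.append(f"handled: {exc}")
        result = None
    else:
        log.append("no exception")     # only runs if the try block succeeded
    finally:
        log.append("finally ran")      # runs either way
    return result, log

print(safe_divide(6, 3))   # (2.0, ['no exception', 'finally ran'])
print(safe_divide(1, 0))   # result is None; the log ends with 'finally ran'
```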

We've covered enough keywords to make it fairly possible for a person new to #Python, or even #programming, to be able to read a Python program.

But to get to enough knowledge to be able to write a Python program, you need a pretty good understanding of the builtin functions.

Next we'll prioritize #python's builtin functions in this order, relating to:

1. introspection,
2. input-output,
3. iteration,
4. math,
5. object oriented,
6. functional, and
7. meta programming

*my* categories, not perfect, but useful to me. Note: does not include datatypes.

Introspection category includes:

- help* -> __doc__
- dir* -> list names
- vars -> __dict__
- type -> __class__
- callable
- repr
- format
- hash*
- id
- len*
- reversed
- sorted

Check out the documentation on these here: docs.python.org/3/library/func…

*we've seen and used these
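
A rough sketch of some of the introspection builtins we haven't used yet. The `Point` class is invented for the example:

```python
class Point:
    """A 2D point."""
    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(1, 2)

print(type(p) is Point)                # True
print(vars(p))                         # {'x': 1, 'y': 2}
print(Point.__doc__)                   # A 2D point.
print(callable(Point), callable(p))    # True False
print(repr('a string'))                # 'a string'
print(format(3.14159, '.2f'))          # 3.14
print(sorted([3, 1, 2]))               # [1, 2, 3]
print(list(reversed([3, 1, 2])))       # [2, 1, 3]
print(id(p) == id(p))                  # True - same object, same id
```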

Input/Output includes:

- print
- input
- open

Iteration includes:

- all
- any
- enumerate
- zip
- range
- iter
- next
- slice
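
A quick sketch of the iteration builtins, using a made-up list called `names`:

```python
names = ['a', 'b', 'c']

pairs = list(enumerate(names))
zipped = list(zip(names, range(3)))
print(pairs)                              # [(0, 'a'), (1, 'b'), (2, 'c')]
print(zipped)                             # [('a', 0), ('b', 1), ('c', 2)]

print(all(n.islower() for n in names))    # True
print(any(n == 'z' for n in names))       # False

it = iter(names)                          # get an iterator by hand
print(next(it), next(it))                 # a b
print(next(it, 'done'), next(it, 'done')) # c done (default avoids StopIteration)

print(names[slice(0, 2)])                 # ['a', 'b'] - same as names[:2]
```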

Math includes:

- abs
- sum
- max
- min
- divmod
- pow
- bin
- oct
- hex
- chr
- ord
- round

Again, read the docs on these.
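
A short sketch of the math builtins listed above, with the results shown as comments:

```python
print(abs(-3))                    # 3
print(sum([1, 2, 3]))             # 6
print(max(1, 5, 3), min(1, 5, 3)) # 5 1
print(divmod(7, 2))               # (3, 1)
print(pow(2, 10))                 # 1024
print(pow(2, 10, 1000))           # 24 - (2 ** 10) % 1000
print(bin(10), oct(8), hex(255))  # 0b1010 0o10 0xff
print(chr(97), ord('a'))          # a 97
print(round(3.14159, 2))          # 3.14
print(round(2.5))                 # 2 - ties round to the nearest even number
```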

(deprioritize these)

OOP:

(writing)
- super
- property
- staticmethod
- classmethod
(using)
- getattr
- setattr
- delattr
- isinstance
- issubclass
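
If you do want to see these, here is a compact sketch. The `Animal` and `Dog` classes and the "age" attribute are hypothetical, invented only to touch each builtin once:

```python
class Animal:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):            # read like an attribute: animal.name
        return self._name

    @classmethod
    def named(cls, name):      # alternate constructor; cls is the class
        return cls(name)

    @staticmethod
    def kingdom():             # takes neither self nor cls
        return "Animalia"

    def speak(self):
        return f"{self.name} makes a sound"

class Dog(Animal):
    def speak(self):
        return super().speak() + ": Woof"   # reuse the parent's method

dog = Dog.named("Rex")
print(dog.speak())                                     # Rex makes a sound: Woof
print(isinstance(dog, Animal), issubclass(Dog, Animal))  # True True
print(getattr(dog, 'name'))                            # Rex
setattr(dog, 'age', 3)
print(dog.age)                                         # 3
delattr(dog, 'age')
print(getattr(dog, 'age', None))                       # None
```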

Functional (redundant to list comps & gen exprs):
- map
- filter
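
To see the redundancy, here is a sketch doing the same job both ways. The helper functions `square` and `is_even` are made up for this example:

```python
def square(n):
    return n * n

def is_even(n):
    return n % 2 == 0

numbers = [1, 2, 3, 4, 5, 6]

# map and filter return lazy iterators, so wrap them in list() to see results
with_builtins = list(map(square, filter(is_even, numbers)))

# the list comprehension says the same thing in one expression
with_comprehension = [n * n for n in numbers if n % 2 == 0]

print(with_builtins)        # [4, 16, 36]
print(with_comprehension)   # [4, 16, 36]
```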

Meta:
- eval
- exec
- type
- compile
- exit
- globals
- locals
