Python 3

References

main Function

To run code in a .py file, use a main function:

# set up here
 
def main():
  # code to run
 
# run the code
if __name__ == '__main__':
    main()

This allows the file to be imported as a module without running the code inside the function main.

One could also do:

# code to always run here
 
if __name__ == '__main__':
    # code to run when called directly
else:
    # code to run when imported as module

Looping

Looping Backwards

Use reversed:

colors = ['red', 'green', 'blue']
 
for color in reversed(colors):
    print(color)

Looping Over Collection and Indices

Use enumerate:

colors = ['red', 'green', 'blue']
 
for i, color in enumerate(colors):
    print("{} -> {}".format(i, color))

Looping Over Two Collections

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']
 
for name, color in zip(names, colors):
    print("{} -> {}".format(name, color))

Note that the lengths are different: zip stops at the shortest collection, so 'yellow' is never used.
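
Because zip stops at the shortest input, leftover items are dropped without any warning. If they matter, one option is zip_longest from itertools. The sketch below reuses the two lists from above; the fill value 'n/a' is only an illustrative choice.

```python
from itertools import zip_longest

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

# zip stops at the shortest input: 3 pairs, 'yellow' is dropped
short = list(zip(names, colors))

# zip_longest pads the shorter input with fillvalue instead
padded = list(zip_longest(names, colors, fillvalue='n/a'))

print(len(short))   # 3
print(padded[-1])   # ('n/a', 'yellow')
```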

Looping in Sorted Order

Use sorted:

colors = ['red', 'green', 'blue']
 
for color in sorted(colors):
    print(color)

To sort by length:

colors = ['red', 'green', 'blue']
 
for color in sorted(colors, key=len):
    print(color)

Breaking Out of Two Loops and Nested Loops

Python's break only breaks out of the innermost loop. To break out of two nested loops, turn them into a single loop. So, instead of:

for x in range(width):
    for y in range(height):
        <do something>

create a generator that yields the pairs:

def range_2d(width, height):
    for x in range(width):
        for y in range(height):
            yield x, y

Then, you can do:

for x, y in range_2d(width, height):
    <do something>
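
As a concrete check that a single break now leaves the whole search, here is a minimal sketch. The search condition (a product equal to 6) and the 5 by 5 grid are made up for illustration.

```python
def range_2d(width, height):
    # yields every (x, y) pair, so the two loops become one
    for x in range(width):
        for y in range(height):
            yield x, y

found = None
for x, y in range_2d(5, 5):
    if x * y == 6:
        found = (x, y)
        break  # one break is enough: there is only one loop

print(found)  # (2, 3)
```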

More Nested Loops

Instead of:

for i in v:
    for j in w:
        <do something>

you can do

it = ( (i,j) for i in v for j in w ) # creates a GENERATOR
for i,j in it:
    <do something>

Break and Continue

The command break breaks out of the loop, while continue skips the rest of the body and goes to the next iteration. For instance:

for x in range(10):
  if x == 2:
    continue
  print(x)
  if x == 5:
    break

prints 0, 1, 3, 4, 5.

itertools

The package itertools offers many convenient iterators for looping.

See https://docs.python.org/3/library/itertools.html and https://www.geeksforgeeks.org/python-itertools/

product

product gives an iterable with the Cartesian product.

from itertools import product
for x, y in product('ABC', 'xy'):
     print(x, y)

produces

A x
A y
B x
B y
C x
C y

With repeat, the product of an iterable with itself is taken:

for x, y, z in product(range(3), repeat=3):
     print(x, y, z)

produces

0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2

permutations

permutations gives all permutations of a list:

from itertools import permutations
for v in permutations(range(3)):
    print(v)

produces

(0, 1, 2)
(0, 2, 1)
(1, 0, 2)
(1, 2, 0)
(2, 0, 1)
(2, 1, 0)

combinations

combinations gives all combinations without replacement. (Use combinations_with_replacement if you need repetitions.)

from itertools import combinations
for x, y  in combinations(range(4), 2):
    print(x, y)

produces

0 1
0 2
0 3
1 2
1 3
2 3

Similarly, for triples:

for x, y, z in combinations(range(5), 3):
    print(x, y, z)

produces

0 1 2
0 1 3
0 1 4
0 2 3
0 2 4
0 3 4
1 2 3
1 2 4
1 3 4
2 3 4

count

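count is simply a counter, but it can go on forever (starting from an initial value and advancing by a given step).

For example, the loop below stops at the first positive x that is 0 modulo 7 (giving x equal to 7). Note that it starts at 1, since count() starts at 0 and 0 is already 0 modulo 7.

import itertools
for x in itertools.count(1):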
    if x % 7 == 0:
        break

This next example gives the first positive integer congruent to 2 modulo 3 that is also divisible by 7 (giving x equal to 14):

for x in itertools.count(2, step=3):
    if x % 7 == 0:
        break

Dictionaries

Looping Over Dictionary Keys

d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
 
for k in d:
    print(k)

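If you want to change the dictionary while you loop, loop over a copy of the keys, list(d.keys()). (In Python 3, d.keys() is a view of the dictionary, so deleting entries while looping over it directly raises a RuntimeError.)

for k in list(d.keys()):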
    if k.startswith('r'):
        del d[k]

Looping Over Dictionary Values

for v in d.values():
    <do stuff>

Looping Over Dictionary Items

for key, value in d.items():
    print("{} --> {}".format(key, value))

Construct Dictionaries from Pairs

d = dict(zip(names, colors))

Counting Occurrences

Use d.get: d.get(item, 0) gives d[item] if item is a key of d and 0 otherwise.

d = {}
v = [0,0,0,1,1,1,1,1,1,2,2,0,2,1]
for x in v:
    d[x] = d.get(x,0) + 1

There is also d.setdefault(key, default_value), which works like get but also creates d[key] with the default value if it does not exist already.
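
The difference between get and setdefault is easiest to see side by side. This is a small sketch; the key 'x' and the variable names are arbitrary.

```python
d = {}

# get returns the default but leaves the dictionary untouched
a = d.get('x', [])
print(d)        # {}

# setdefault stores the default under the key and returns it
b = d.setdefault('x', [])
b.append(1)
print(d)        # {'x': [1]}

# on an existing key, setdefault just returns the stored value
c = d.setdefault('x', [])
print(c is b)   # True
```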

One can also use defaultdict(type), which creates an empty dictionary whose missing keys get the default value of type (the result of calling type()) on first access.

For instance, to create a dictionary grouping names by length:

from collections import defaultdict
 
d = defaultdict(list)
for name in names:
    key = len(name)
    d[key].append(name)

If key does not exist, it creates it and sets the default value for list, which is an empty list. (And for int it is 0.)

Get Element from Dictionary

If you want to try to get an element from the dictionary, but are not sure if it is in it, instead of

if item in my_dict:
    res = my_dict[item]
else:
    res = 'Unknown'

you can simply do

res = my_dict.get(item, 'Unknown')

Another Construction Example

v = [ 'a', 'b', 'c', 'd' ]
w = [ 1, 2, 3, 4 ]
my_dict = { letter : number for letter, number in zip(v,w) if number != 2 }

gives my_dict as { 'a' : 1, 'c' : 3, 'd' : 4 }.

Function Arguments

Optional Arguments

Use * to collect any number of optional positional arguments (with no default values) into a tuple.

def my_fct(x, *v):
    print('First value is {}'.format(x))
    if v: # runs if v is not empty
        print('The other values are:')
        for value in v:
            print(value)

You can call it with my_fct(3), my_fct(3,4), my_fct(3,4,5), etc.

Also, if v = [1, 2], then my_fct(3,*v) is the same as my_fct(3,1,2).

Required Keyword Arguments

In Python 3, if

def f(x, y, *, z=0, w=1):
    return x+y+z+w

you cannot call it as f(1,2,3,4); you must call it as f(1,2,z=3,w=4), i.e., the arguments after * must be entered as keywords.
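
A quick way to see the restriction in action is to try both kinds of call. This sketch reuses the same f and catches the error raised by the purely positional call.

```python
def f(x, y, *, z=0, w=1):
    return x + y + z + w

print(f(1, 2))            # 4, using the defaults z=0 and w=1
print(f(1, 2, z=3, w=4))  # 10

try:
    f(1, 2, 3, 4)         # z and w cannot be passed positionally
    positional_ok = True
except TypeError as err:
    positional_ok = False
    print('TypeError:', err)
```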

functools partial

We can have a new function based on a previous one with some of the arguments specified using partial from functools:

from functools import partial
 
def multiply(a, b, print_arg=False):
    if print_arg:
        print(f"Arguments: {a = } and {b = }")
    return a * b
 
double = partial(multiply, 2)
triple = partial(multiply, b=3)

Then, you can call double(3), triple(7), triple(2, print_arg=True), etc.
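
The sketch below runs those calls with a simplified multiply (the print_arg option is left out here for brevity). It also shows one thing to watch for: since triple fixes b by keyword, passing a second positional argument collides with it.

```python
from functools import partial

def multiply(a, b):
    return a * b

double = partial(multiply, 2)    # fixes a = 2 (positional)
triple = partial(multiply, b=3)  # fixes b = 3 (keyword)

print(double(3))   # 6
print(triple(7))   # 21

# partial objects remember what was fixed
print(double.args, triple.keywords)   # (2,) {'b': 3}

try:
    triple(2, 5)   # becomes multiply(2, 5, b=3): two values for b
    clash = False
except TypeError:
    clash = True
print(clash)       # True
```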

f-Strings

f-strings are a newer and improved way to format strings. They require Python 3.6 or later and are usually faster than .format.

An example:

greeting = 'Hello'
name = 'World'
 
print(f"{greeting}, {name}!")

You can also mix types:

student = 'John Doe'
score = 93
 
print(f"Hello, {student}.  Your score was {score}.")

By default, it prints the str() representation. To print the repr() string, add !r after the variable name.

print(f"The __repr__() is {object_variable!r}.")

You can perform operations:

name = 'John Doe'
age = 30
 
print(f"Next year, {name}'s age will be {age + 1}.")

To use a variable in the format specification, surround it by braces:

num_digits = 2
number = 123.456789
 
print(f"The number, with two decimal places, is: {number:.{num_digits}f}")

You can also add = after a variable (Python 3.8 or later) to print both the expression and its value, which is especially good for debugging:

a = 10
print(f'{a=}, {a = }')

prints a=10, a = 10. And you can also format it:

number = 123.456789
print(f"{number = :.2f}")

We can also mix raw and f-strings:

var = "ABC"
print(fr"Some backslashes: \ \\ \\\\ and now a variable: {var}")

To print actual curly braces, use double curly braces:

number = 2
print(f"The number is {number}.  Something in {{braces}}. Number in braces: {{{number}}}.")

You can also use braces for variables inside braces, for instance to set the width:

text = "CENTERED"
spaces = 80
print(f"->{text:^{spaces}}<-")

String Formatting

See: https://www.w3schools.com/python/ref_string_format.asp

Aligning/Padding

Align left with 10 spaces:

print('--{:<10}--'.format(5))

produces

--5         --


Align right with 10 spaces:

print('--{:>10}--'.format(5))

produces

--         5--


Align center with 10 spaces:

print('--{:^10}--'.format(5))

produces

--    5     --

Number Formatting

Using commas to separate thousands:

print('The number is {:,}'.format(99999999999))
print(f'The number is {99999999999:,}')

produces

The number is 99,999,999,999


To print in scientific format:

print('The number is {:e}'.format(123456.12345678))
print(f'The number is {123456.12345678:e}')
print('The number is {:E}'.format(123456.12345678))
print(f'The number is {123456.12345678:E}') 

produces

The number is 1.234561e+05
The number is 1.234561E+05

Note: The above does not quite work in Sage. You can do:

print('The number is {:e}'.format(float(123456.12345678)))
print(f'The number is {float(123456.12345678):e}')


To fix number of decimals:

print('The number is {:f}.'.format(1234.123456789))   # 6 decimals (default)
print(f'The number is {1234.123456789:f}.')           # 6 decimals (default)
print('The number is {:.3f}.'.format(1234.123456789)) # 3 decimals
print(f'The number is {1234.123456789:.3f}.')         # 3 decimals

produces

The number is 1234.123457.
The number is 1234.123.


You can also combine them:

print('The number is {:^20,.3f}.'.format(12345.123456789))
print(f'The number is {12345.123456789:^20,.3f}.')

produces

The number is      12,345.123     .


If you need a variable in the formatting of an f-string, surround it in braces as well:

num_dig = 5
num = 1.234567890
print(f"The number is {num:.{num_dig}}.")

produces

The number is 1.2346.

Misc Tricks

Lambda Functions

Instead of

def f(x,y):
    return x + y

do

f = lambda x, y: x + y

Map and Filter

If v is a list and f is a function, then

fv = map(f,v)

applies f to the entries of v. In Python 3 the result is an iterator, not a list. (To make a list, do fv = list(map(f, v)).)

Now if test is a conditional function,

testv = filter(test,v)

gives only the elements of v satisfying test. Similar to

testv = [ x for x in v if test(x) ]

but in Python 3 it gives an iterable, not a list.

Avoid True/False Flags

You can use the else part of a for loop. So, instead of

def find(seq, target):
    found = False
    for i, value in enumerate(seq):
        if value == target:
            found = True
            break
    if not found:
        return -1
    return i

do

def find(seq, target):
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        return -1
    return i

The else is like a nobreak: if the loop finishes normally (without a break), it runs the else part. If there is a break in the loop, it skips the else part.
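
To check when the else part of a for loop runs, here is a minimal sketch. The helper first_negative is made up for illustration: its else branch runs only when the loop ends without reaching break, which includes the case of an empty sequence.

```python
def first_negative(seq):
    for value in seq:
        if value < 0:
            print('found', value)
            break                       # the else branch is skipped
    else:
        print('no negative value')      # runs only if break was never reached
        return None
    return value

first_negative([3, -1, 2])   # prints: found -1
first_negative([3, 1, 2])    # prints: no negative value
first_negative([])           # prints: no negative value
```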

any

The function any gives True if any of the values in an iterable input is true. (If the iterable is empty, it returns False.)

The following return True:

any([False, True, False, False])
any((0, 0, 0, 1, 0, 0))
 
v = [1, 2, 3]; w = [3, 2, 1]
any(( x == y for x, y in zip(v, w) ))

The following return False:

any([False, False, False, False])
any((0, 0, 0, 0, 0))
 
v = [1, 2, 3]; w = [4, 3, 2]
any(( x == y for x, y in zip(v, w) ))

Caching

One way to save functions from recomputing the same value (at the expense of keeping the computed values in memory) is to use @cache from functools (Python 3.9 or later; the arguments must be hashable). So, instead of:

def my_fct(x, saved={}):
    if x in saved:
        return saved[x]
    <compute result>
    saved[x] = result
    return result

do

from functools import cache

@cache
def my_fct(x):
    <compute result>
    return result
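
To see the effect of caching, the sketch below counts how many times the function body actually runs. It uses lru_cache(maxsize=None), which behaves like @cache and is also available before Python 3.9; the Fibonacci function and the calls counter are only illustrative.

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)   # same behavior as @cache in Python 3.9+
def fib(n):
    global calls
    calls += 1             # counts real computations, not cache hits
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))   # 832040
print(calls)     # 31: each n from 0 to 30 is computed exactly once
print(fib.cache_info())
```

Without the decorator, the same call would run the body over a million times.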

Open/Close Files

Instead of:

f = open('data.txt')
try:
    data = f.read()
finally:
    f.close()

do

with open('data.txt') as f:
    data = f.read()

Generator Expressions

Instead of

sum([i**2 for i in range(10)])

do

sum(i**2 for i in range(10))

Named Tuples

The class namedtuple from collections is useful for creating objects with values for specific attributes. For example, if you want to have color objects described by hue, saturation, and luminosity, you could store each one as a tuple, as long as you remember the order. A namedtuple lets you access the fields by name. (Note that namedtuples are immutable.)

from collections import namedtuple
 
Color = namedtuple('Color',['hue', 'saturation', 'luminosity'])
 
p = Color(170, 0.1, 0.6)
 
if p.saturation > 0.5:
    print('Bright!')
 
if p.luminosity > 0.5:
    print('Light!')

Note that a namedtuple is more memory efficient than a dictionary, and the same type can be used for many objects without having to repeat the key names. (But again, you cannot change the values!)
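
A few more things a namedtuple can do, as a short sketch reusing the Color type from above: it still behaves like a tuple, assignment to a field fails, and _replace builds a modified copy.

```python
from collections import namedtuple

Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
p = Color(170, 0.1, 0.6)

hue, saturation, luminosity = p   # unpacks like a plain tuple
print(p[0], p.hue)                # 170 170

try:
    p.hue = 200                   # immutable: raises AttributeError
    changed = True
except AttributeError:
    changed = False

q = p._replace(hue=200)           # new object, p is unchanged
print(q)                          # Color(hue=200, saturation=0.1, luminosity=0.6)
print(dict(p._asdict()))          # fields as a dictionary
```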

See also: https://www.geeksforgeeks.org/namedtuple-in-python/, https://stackoverflow.com/questions/9872255/when-and-why-should-i-use-a-namedtuple-instead-of-a-dictionary

Generators

If you want to loop over the squares of integers from 1 to 1000, instead of creating the list

[ i**2 for i in range(1,1001)]

and looping, create the generator/iterator:

( i**2 for i in range(1,1001))

It's better in most cases to return generators/iterators instead of lists. So instead of:

def cubes(n):
    res = []
    for i in range(n):
        res.append(i**3)
    return res

you can do

def cubes(n):
    for i in range(n):
         yield i**3

We can make it a list if we want to with list(cubes(10)), but we can also iterate over it directly (only once per call), like:

for i in cubes(10):
    <do something with i>

Or, to extract the first three elements:

c = cubes(100000)
next(c)
next(c)
next(c)
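
Instead of calling next repeatedly, islice from itertools can take the first few elements of a generator. This is a small sketch using the same cubes generator.

```python
from itertools import islice

def cubes(n):
    for i in range(n):
        yield i**3

c = cubes(100000)
first_three = list(islice(c, 3))   # consumes only three elements
print(first_three)                 # [0, 1, 8]

fourth = next(c)                   # the generator resumes where it stopped
print(fourth)                      # 27
```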

Conditional Assignment

x = 1 if <condition> else 0

Unpacking

a, b, c = (1, 2, 3, 4, 5)

gives an error. In Python 3 we can do

a, b, *c = (1, 2, 3, 4, 5)

which gives a = 1, b = 2, c = [3, 4, 5]. Similarly,

a, b, *_ = (1, 2, 3, 4, 5)

gives a = 1 and b = 2, while

a, b, *c, d = (1, 2, 3, 4, 5)

gives a = 1, b = 2, c = [3, 4], and d = 5.

Input Password

To hide the typed password:

from getpass import getpass
 
username = input('Username: ')
password = getpass('Password: ')

Values Treated as False

The following values are treated as False: "" (empty string), 0, 0.0, [] (empty list), () (empty tuple), {} (empty dictionary), set() (empty set), False, None. If variable is any of those, then

if variable:
    print('OK')

will not print OK; for any other value it will.
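
The rule can be checked directly with bool. In the sketch below, the second list collects a few values that may look empty but are treated as True (the choice of examples is mine).

```python
falsy = ['', 0, 0.0, [], (), {}, set(), False, None]
truthy = ['0', ' ', [0], (None,), {'a': 0}, -1]

all_falsy = all(not bool(x) for x in falsy)
all_truthy = all(bool(x) for x in truthy)

print(all_falsy)    # True
print(all_truthy)   # True: non-empty containers count as True, whatever they hold
```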

Random List Element

Instead of

x = my_list[randint(0, len(my_list) - 1)]

use choice:

from random import choice
x = choice(my_list)

Note that Sage already has it (no need to import).

For multiple elements (with repetition allowed), instead of

[ choice(my_list) for i in range(10) ]

do

from random import choices
choices(my_list, k=10)

Note that choices is not in Sage.

Random Subset

Use sample from random to get a subset of a list (without repetition):

>>> from random import sample
>>> v = range(12)
>>> sample(v, 5)
[0, 8, 4, 9, 1]
>>> sample(v, 5)
[7, 9, 11, 8, 5]
>>> sample(v, 4)
[7, 10, 0, 2]

Shuffle

The random module also has shuffle, to randomize a list:

>>> from random import shuffle
>>> v = list(range(10))
>>> shuffle(v)
>>> v
[8, 4, 5, 9, 0, 6, 7, 3, 1, 2]

Note that it is in place (meaning that the list is changed).

Types

Instead of

if type(v) == type([]):
  # do something if v is a list

do

if isinstance(v, list):
  # do something if v is a list

You can get the type itself with type(v), and its name as a string with type(v).__name__.
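
One practical difference is that isinstance also accepts subclasses and tuples of types, while comparing type(v) requires an exact match. A small sketch:

```python
print(isinstance(3, (int, float)))    # True: matches any type in the tuple
print(isinstance(True, int))          # True: bool is a subclass of int
print(type(True) == int)              # False: the exact type is bool
print(type([1, 2]).__name__)          # list
```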

Web Scraping

See Python Tutorial: Web Scraping with BeautifulSoup and Requests by Corey Schafer.

Creating Empty Set

You cannot create an empty set with my_set = {}, as that creates an empty dictionary. Instead, use my_set = set().

Calling the type with no arguments also works for lists, tuples, and dictionaries:

empty_list = []
empty_list = list()
 
empty_tuple = ()
empty_tuple = tuple()
 
empty_dict = {}
empty_dict = dict()

But, although both are fast, creating with empty_list = [], for example, is faster than empty_list = list()!

Double Ended Queue

You can use deque (double ended queue) from collections if you want to add or remove elements efficiently at both the front/start and the back/end of a list.

>>> from collections import deque
>>> q = deque([1,2,3,4,5])
>>> q.appendleft(0)
>>> q
deque([0, 1, 2, 3, 4, 5])
>>> q.append(6)
>>> q
deque([0, 1, 2, 3, 4, 5, 6])
>>> x = q.popleft()
>>> x, q
(0, deque([1, 2, 3, 4, 5, 6]))
>>> y = q.pop()
>>> y, q
(6, deque([1, 2, 3, 4, 5]))

You can also use it to keep a fixed maximum number of elements: with maxlen set, adding an element to a full deque drops one from the opposite end.

>>> q = deque([1,2,3,4,5], maxlen=5)
>>> q
deque([1, 2, 3, 4, 5], maxlen=5)
>>> q.append(6)
>>> q
deque([2, 3, 4, 5, 6], maxlen=5)
>>> q.append(7)
>>> q.append(7)
>>> q
deque([4, 5, 6, 7, 7], maxlen=5)
>>> q.appendleft(3)
>>> q
deque([3, 4, 5, 6, 7], maxlen=5)

Heap

A heap is a list where the first element is always the minimum.

>>> import heapq
>>>
>>> H = [21,1,45,78,3,5]
>>> heapq.heapify(H)
>>> H
[1, 3, 5, 78, 21, 45]
>>> x = heapq.heappop(H)
>>> x, H
(1, [3, 21, 5, 78, 45])
>>> heapq.heappush(H, 4)
>>> H
[3, 21, 4, 78, 45, 5]
>>> x = heapq.heappop(H)
>>> x, H
(3, [4, 21, 5, 78, 45])
>>> x = heapq.heappop(H)
>>> x, H
(4, [5, 21, 45, 78])

You can use it to efficiently get a small number, say n, of the smallest or largest elements. (If n == 1, use min or max instead. If n is close to the size of the list, use sorted instead.)

>>> H = [21,1,45,78,3,5]
>>> heapq.heapify(H)
>>> heapq.nsmallest(3, H)
[1, 3, 5]
>>> heapq.nlargest(3, H)
[78, 45, 21]

Replace os.system()

Reference: Larry Hastings - Solve Your Problem With Sloppy Python - PyCon 2018.

To run OS commands, we usually do os.system(<command>), but it does not stop the script if it fails. You can replace it with:

import subprocess

def run(s):
  '''
  Run the command given by the string s and raise an exception
  (subprocess.CalledProcessError) if it fails.
  '''
  subprocess.run(s, check=True, shell=True)
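
To illustrate the failure case, the sketch below runs a command that exits with a nonzero status and catches the resulting exception. The command 'exit 3' is just a stand-in for a failing command and is assumed to be understood by the default shell.

```python
import subprocess

def run(s):
    # check=True turns a nonzero exit status into CalledProcessError
    subprocess.run(s, check=True, shell=True)

exit_code = None
try:
    run('exit 3')
except subprocess.CalledProcessError as err:
    exit_code = err.returncode
    print('command failed with exit code', exit_code)
```

Left uncaught, the exception stops the script, which is the behavior os.system lacks.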

Slice Object

We can create a slice object to take the same slice of different objects.

slc = slice(None, 10, 2)
a = list(range(20))
a[slc]  # same as a[:10:2]: [0, 2, 4, 6, 8]

Set Operations

We can use | for union, & for intersection, - for difference, and ^ for the symmetric difference (i.e., the union minus the intersection).
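
A short sketch of the four operators on two made-up sets. The results are passed through sorted only so that the printed order is predictable.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(sorted(a | b))   # [1, 2, 3, 4, 5]
print(sorted(a & b))   # [3, 4]
print(sorted(a - b))   # [1, 2]
print(sorted(a ^ b))   # [1, 2, 5]

# symmetric difference = union minus intersection
print(a ^ b == (a | b) - (a & b))   # True
```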

Some Optimization Tips

Source: Sebastian Witowski - Writing faster Python

Collections

The collections module provides data types that are more efficient for specific tasks.

Ask Forgiveness, Not Permission

If you expect some action to work most of the time, it is better to use try/except than to test the conditions.
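
The two styles can be sketched as follows. The inventory dictionary and both function names are hypothetical; the functions return the same results, but the second one does no extra test in the common case where the key exists.

```python
inventory = {'apple': 3, 'pear': 0}   # hypothetical data

def stock_check_first(item):
    # test the condition first, then act
    if item in inventory:
        return inventory[item]
    return 0

def stock_try(item):
    # just try it, and handle the rare failure
    try:
        return inventory[item]
    except KeyError:
        return 0

print(stock_check_first('apple'), stock_try('apple'))   # 3 3
print(stock_check_first('kiwi'), stock_try('kiwi'))     # 0 0
```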

Membership Test

The test x in my_list is more efficient than checking “by hand”, but it takes longer if x is not in the list or is near the end. Converting first with my_set = set(my_list) makes x in my_set much faster, but the conversion takes some time, so it is useful when there are many membership tests.

Remove Duplicates

To remove duplicates from a list, use set(mylist), but this does not preserve order. To preserve order, use OrderedDict:

mylist = [1,1,1,1,1,2,2,2,2,2,3,4,4,4,5,5]
from collections import OrderedDict
OD = OrderedDict.fromkeys(mylist); OD
OrderedDict([(1, None), (2, None), (3, None), (4, None), (5, None)])
list(OD)
[1, 2, 3, 4, 5]
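
Since Python 3.7, plain dictionaries also keep insertion order, so dict.fromkeys can be used the same way without any import. A minimal sketch, with a list chosen so that the order is visibly kept:

```python
mylist = [3, 1, 3, 2, 1, 2, 3]

# keys are unique and keep the order of first appearance
unique = list(dict.fromkeys(mylist))
print(unique)   # [3, 1, 2]
```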

Sorting

Sorting in place (with mylist.sort()) is faster than sorted(mylist) (which produces a new list).

Creating Lists

Instead of

def square(x):
  return x**2
 
[ square(x) for x in range(1000) ]  

it would be much faster to do

def vsquares():
  return [ x**2 for x in range(1000) ]
 
vsquares()

Check if Variable is True

if var == True:
  pass

is slower than

if var is True:
  pass

which is slower than

if var:
  pass

Similarly, instead of

if len(mylist) == 0:
  pass

(really bad!) or

if mylist == []:
  pass

use

if not mylist:
  pass

Progress Bar

One can use `tqdm` for progress bars:

from tqdm import tqdm
from time import sleep
 
for i in tqdm(range(100)):
  sleep(0.1)

Note that it does not work in bpython.

On Jupyter notebooks, use instead:

from tqdm.notebook import tqdm