To run code in a .py file, use a main function:

    # set up here

    def main():
        # code to run
        ...

    # run the code
    if __name__ == '__main__':
        main()

This allows the file to be imported as a module without running the code inside the function main.
One could also do:
    # code to always run here

    if __name__ == '__main__':
        # code to run when called directly
        ...
    else:
        # code to run when imported as module
        ...
Use reversed:

    colors = ['red', 'green', 'blue']
    for color in reversed(colors):
        print(color)

Use enumerate:

    colors = ['red', 'green', 'blue']
    for i, color in enumerate(colors):
        print("{} -> {}".format(i, color))
    names = ['raymond', 'rachel', 'matthew']
    colors = ['red', 'green', 'blue', 'yellow']
    for name, color in zip(names, colors):
        print("{} -> {}".format(name, color))

Note that the lengths are different: zip stops at the end of the shortest input, so 'yellow' is never used.
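As a small sketch of what happens when the lengths differ, the following compares zip with itertools.zip_longest. The latter is not mentioned above; it is shown here only as one possible alternative, and the fill value '?' is an arbitrary choice.

```python
from itertools import zip_longest

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

# zip stops at the shortest input, so 'yellow' is dropped.
pairs = list(zip(names, colors))

# zip_longest pads the shorter input with fillvalue instead.
padded = list(zip_longest(names, colors, fillvalue='?'))

print(pairs)
print(padded)
```

If silently dropping elements would be a bug in your case, checking the lengths before looping is another option.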
Use sorted:

    colors = ['red', 'green', 'blue']
    for color in sorted(colors):
        print(color)

To sort by length:

    colors = ['red', 'green', 'blue']
    for color in sorted(colors, key=len):
        print(color)
Python's break only breaks out of one loop. To break out of two nested loops, turn them into a single loop. So, instead of:

    for x in range(width):
        for y in range(height):
            <do something>

create a two-dimensional range:

    def range_2d(width, height):
        for x in range(width):
            for y in range(height):
                yield x, y

Then, you can do:

    for x, y in range_2d(width, height):
        <do something>
Instead of:

    for i in v:
        for j in w:
            <do something>

you can do:

    it = ((i, j) for i in v for j in w)  # creates a GENERATOR
    for i, j in it:
        <do something>
The command break breaks out of the loop, while continue skips the rest of the body and goes to the next iteration. For instance:
    for x in range(10):
        if x == 2:
            continue
        print(x)
        if x == 5:
            break
prints 0, 1, 3, 4, 5.
The package itertools offers many convenient iterators for looping.
See https://docs.python.org/2/library/itertools.html and https://www.geeksforgeeks.org/python-itertools/
product gives an iterable with the Cartesian product.
    from itertools import product
    for x, y in product('ABC', 'xy'):
        print(x, y)

produces

    A x
    A y
    B x
    B y
    C x
    C y

    for x, y, z in product(range(3), repeat=3):
        print(x, y, z)

produces

    0 0 0
    0 0 1
    0 0 2
    0 1 0
    0 1 1
    0 1 2
    0 2 0
    0 2 1
    0 2 2
    1 0 0
    1 0 1
    1 0 2
    1 1 0
    1 1 1
    1 1 2
    1 2 0
    1 2 1
    1 2 2
    2 0 0
    2 0 1
    2 0 2
    2 1 0
    2 1 1
    2 1 2
    2 2 0
    2 2 1
    2 2 2
permutations gives all permutations of a list:

    from itertools import permutations
    for v in permutations(range(3)):
        print(v)

produces

    (0, 1, 2)
    (0, 2, 1)
    (1, 0, 2)
    (1, 2, 0)
    (2, 0, 1)
    (2, 1, 0)
combinations gives all combinations without replacement. (Use combinations_with_replacement if you need repeated elements.)

    from itertools import combinations
    for x, y in combinations(range(4), 2):
        print(x, y)

produces

    0 1
    0 2
    0 3
    1 2
    1 3
    2 3

    for x, y, z in combinations(range(5), 3):
        print(x, y, z)

produces

    0 1 2
    0 1 3
    0 1 4
    0 2 3
    0 2 4
    0 3 4
    1 2 3
    1 2 4
    1 3 4
    2 3 4
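For comparison, here is a minimal sketch of combinations_with_replacement, which also allows an element to be paired with itself. The input range(3) is just an illustrative choice.

```python
from itertools import combinations_with_replacement

# All pairs from 0, 1, 2 where repeats such as (0, 0) are allowed.
pairs = list(combinations_with_replacement(range(3), 2))
for x, y in pairs:
    print(x, y)
```

This prints the six pairs 0 0, 0 1, 0 2, 1 1, 1 2 and 2 2, one per line.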
count is simply a counter, but it can go on forever (from an initial value, with a given step).
For example, the loop below stops at the first positive x that is 0 modulo 7 (giving x equal to 7). It starts at 1 because the default start value, 0, is already 0 modulo 7.

    import itertools
    for x in itertools.count(1):
        if x % 7 == 0:
            break

This next example gives the first positive integer congruent to 2 modulo 3 that is divisible by 7 (giving x equal to 14):

    for x in itertools.count(2, step=3):
        if x % 7 == 0:
            break
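    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
    for k in d:
        print(k)

If you want to change the dictionary while you loop, loop over a copy of the keys, list(d.keys()). (In Python 3, d.keys() is a view of the dictionary, and deleting entries while iterating over it raises a RuntimeError.)

    for k in list(d.keys()):
        if k.startswith('r'):
            del d[k]

    for v in d.values():
        <do stuff>

    for key, value in d.items():
        print("{} --> {}".format(key, value))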
d = dict(zip(names, colors))
Use d.get. d.get(item,0) gives d[item] if possible and 0 otherwise.
    d = {}
    v = [0,0,0,1,1,1,1,1,1,2,2,0,2,1]
    for x in v:
        d[x] = d.get(x, 0) + 1
There is also d.setdefault(key, default_value), which returns d[key] like d.get, but also creates d[key] with the default value if it does not exist already.
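A minimal sketch of setdefault in use, grouping names by length. The list of names is made up for illustration.

```python
names = ['raymond', 'rachel', 'matthew', 'roger', 'betty']  # illustrative data

d = {}
for name in names:
    # setdefault returns d[key], creating it as an empty list on first use.
    d.setdefault(len(name), []).append(name)

print(d)
```

Note that the default value (here a new empty list) is built on every call, even when the key already exists.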
One can also use defaultdict(type), which creates an “empty” dictionary whose missing keys get a default value of the given type.
For instance, to create a dictionary grouping names by length:

    from collections import defaultdict

    d = defaultdict(list)
    for name in names:
        key = len(name)
        d[key].append(name)
If key does not exist, it creates it and sets the default value for list, which is an empty list. (And for int it is 0.)
If you want to try to get an element from the dictionary, but are not sure if it is in it, instead of
    if item in my_dict:
        res = my_dict[item]
    else:
        res = 'Unknown'

you can simply do

    res = my_dict.get(item, 'Unknown')
    v = ['a', 'b', 'c', 'd']
    w = [1, 2, 3, 4]
    my_dict = {letter: number for letter, number in zip(v, w) if number != 2}
gives my_dict as { 'a' : 1, 'c' : 3, 'd' : 4 }.
Use * to accept any number of extra positional arguments (optional arguments with no default value).

    def my_fct(x, *v):
        print('First value is {}'.format(x))
        if v:  # runs if v is not empty
            print('The other values are:')
            for value in v:
                print(value)
You can call it with my_fct(3), my_fct(3,4), my_fct(3,4,5), etc.
Also, if v = [1, 2], then my_fct(3,*v) is the same as my_fct(3,1,2).
In Python 3, if

    def f(x, y, *, z=0, w=1):
        return x + y + z + w

you cannot call it as f(1,2,3,4); you must call it as f(1,2,z=3,w=4), i.e., the arguments after the * must be entered as keywords.
We can have a new function based on a previous one with some of the arguments specified using partial from functools:
    from functools import partial

    def multiply(a, b, print_arg=False):
        if print_arg:
            print(f"Arguments: {a = } and {b = }")
        return a * b

    double = partial(multiply, 2)
    triple = partial(multiply, b=3)
Then, you can call double(3), triple(7), triple(2, print_arg=True), etc.
f-strings are a newer and improved way to format strings. They only work in Python 3 (version 3.6 or later)! They are usually faster than .format.
An example:
    greeting = 'Hello'
    name = 'World'
    print(f"{greeting}, {name}!")

You can also mix types:

    student = 'John Doe'
    score = 93
    print(f"Hello, {student}. Your score was {score}.")
By default, it prints the str() representation. To print the repr() string, add !r after the variable name.

    print(f"The __repr__() is {object_variable!r}.")
You can perform operations:
    name = 'John Doe'
    age = 30
    print(f"Next year, {name}'s age will be {age + 1}.")

To use a variable in the formatting, surround it by braces:

    num_digits = 2
    number = 123.456789
    print(f"The number, with two decimal places, is: {number:.{num_digits}f}")
You can also add = after a variable to print the variable and value (Python 3.8 or later), especially good for debugging:

    a = 10
    print(f'{a=}, {a = }')

prints a=10, a = 10. And you can also format it:

    number = 123.456789
    print(f"{number = :.2f}")
We can also mix raw and f-strings:
    var = "ABC"
    print(fr"Some backslashes: \ \\ \\\\ and now a variable: {var}")

To print actual curly braces, use double curly braces:

    number = 2
    print(f"The number is {number}. Something in {{braces}}. Number in braces: {{{number}}}.")

You can also use braces for variables inside braces (here ^ centers the text in the given width):

    text = "CENTERED"
    spaces = 80
    print(f"->{text:^{spaces}}<-")
Align left with 10 spaces:

    print('--{:<10}--'.format(5))

produces

    --5         --

Align right with 10 spaces:

    print('--{:>10}--'.format(5))

produces

    --         5--

Align center with 10 spaces:

    print('--{:^10}--'.format(5))

produces

    --    5     --
Using commas for separating thousands:

    print('The number is {:,}'.format(99999999999))
    print(f'The number is {99999999999:,}')

produces

    The number is 99,999,999,999
    The number is 99,999,999,999
To print in scientific format:
    print('The number is {:e}'.format(123456.12345678))
    print(f'The number is {123456.12345678:e}')
    print('The number is {:E}'.format(123456.12345678))
    print(f'The number is {123456.12345678:E}')

produces

    The number is 1.234561e+05
    The number is 1.234561e+05
    The number is 1.234561E+05
    The number is 1.234561E+05

Note: The above does not quite work in Sage. You can do:

    print('The number is {:e}'.format(float(123456.12345678)))
    print(f'The number is {float(123456.12345678):e}')
To fix the number of decimals:

    print('The number is {:f}.'.format(1234.123456789))    # 6 decimals (default)
    print(f'The number is {1234.123456789:f}.')            # 6 decimals (default)
    print('The number is {:.3f}.'.format(1234.123456789))  # 3 decimals
    print(f'The number is {1234.123456789:.3f}.')          # 3 decimals

produces

    The number is 1234.123457.
    The number is 1234.123457.
    The number is 1234.123.
    The number is 1234.123.
You can also combine them:
    print('The number is {:^20,.3f}.'.format(12345.123456789))
    print(f'The number is {12345.123456789:^20,.3f}.')

produces

    The number is      12,345.123     .
    The number is      12,345.123     .
If you need a variable in the formatting of an f-string, surround it in braces as well. (Without a type such as f, the precision counts significant digits rather than decimals.)

    num_dig = 5
    num = 1.234567890
    print(f"The number is {num:.{num_dig}}.")

produces

    The number is 1.2346.
Instead of
    def f(x, y):
        return x + y
do
f = lambda x, y: x + y
If v is a list and f is a function, then

    fv = map(f, v)

applies f to the entries of v. In Python 3 it gives an iterator, not a list. (To make a list, do fv = list(map(f, v)).)
Now if test is a conditional function,

    testv = filter(test, v)

gives only the elements of v satisfying test. It is similar to

    testv = [x for x in v if test(x)]

but in Python 3 it gives an iterator, not a list.
You can use the else part of a for loop. So, instead of

    def find(seq, target):
        found = False
        for i, value in enumerate(seq):
            if value == target:
                found = True
                break
        if not found:
            return -1
        return i

do

    def find(seq, target):
        for i, value in enumerate(seq):
            if value == target:
                break
        else:
            return -1
        return i

The else is like a nobreak: if the loop finishes normally (without hitting a break), it runs the else part. If there is a break in the loop, it skips the else part.
The function any gives True if any of the values in an iterable input is true. (If the iterable is empty, it returns False.)
The following return True:

    any([False, True, False, False])
    any((0, 0, 0, 1, 0, 0))
    v = [1, 2, 3]; w = [3, 2, 1]
    any(( x == y for x, y in zip(v, w) ))

The following return False:

    any([False, False, False, False])
    any((0, 0, 0, 0, 0))
    v = [1, 2, 3]; w = [4, 3, 2]
    any(( x == y for x, y in zip(v, w) ))
One way to save functions from recomputing the same value (at the expense of keeping the computed values in memory) is to use @cache from functools (Python 3.9 or later). So, instead of:

    def my_fct(x, saved={}):
        if x in saved:
            return saved[x]
        <compute result>
        saved[x] = result
        return result

do

    from functools import cache

    @cache
    def my_fct(x):
        <compute result>
        return result
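As a concrete sketch of caching, here is a recursive Fibonacci function. The function fib is only an illustration and is not from the notes above. On Python versions before 3.9, functools.lru_cache(maxsize=None) should behave the same way.

```python
from functools import cache  # Python 3.9+

@cache
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if n < 2:
        return n
    # Each value is computed once; repeated calls are looked up in the cache.
    return fib(n - 1) + fib(n - 2)

print(fib(30))           # → 832040
print(fib.cache_info())  # hit/miss statistics kept by the cache
```

Without the decorator the same call makes over a million recursive calls; with it, each of the 31 distinct arguments is computed only once. The arguments must be hashable for this to work.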
Instead of:
    f = open('data.txt')
    try:
        data = f.read()
    finally:
        f.close()

do

    with open('data.txt') as f:
        data = f.read()
Instead of
sum([i**2 for i in range(10)])
do
sum(i**2 for i in range(10))
namedtuple from collections is useful to create objects with values for specific attributes. For example, if you want to have color objects, described by hue, saturation, and luminosity, you could store each one as a plain tuple, if you remember the order. A namedtuple allows you to access the fields by name. (Note that namedtuples are immutable.)

    from collections import namedtuple

    Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
    p = Color(170, 0.1, 0.6)
    if p.saturation > 0.5:
        print('Bright!')
    if p.luminosity > 0.5:
        print('Light!')

Note that a namedtuple is more memory efficient than a dictionary, and the same type can be used for many objects without having to repeat the key names. (But again, you cannot change the values!)
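A short sketch of what immutability means in practice, reusing the Color example. The methods _replace and _asdict are part of the standard namedtuple API; the new hue value 200 is arbitrary.

```python
from collections import namedtuple

Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
p = Color(170, 0.1, 0.6)

# Assigning to a field fails, since namedtuples are immutable.
try:
    p.hue = 200
except AttributeError:
    print('cannot modify a namedtuple in place')

# _replace builds a new object with some fields changed; p is untouched.
q = p._replace(hue=200)

# It still behaves like a regular tuple (unpacking, indexing).
hue, saturation, luminosity = q

print(q)
print(q._asdict())
```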
See also: https://www.geeksforgeeks.org/namedtuple-in-python/, https://stackoverflow.com/questions/9872255/when-and-why-should-i-use-a-namedtuple-instead-of-a-dictionary
If you want to loop over the squares of integers from 1 to 1000, instead of creating the list
[ i**2 for i in range(1,1001)]
and looping, create the generator/iterator:
( i**2 for i in range(1,1001))
It's better in most cases to return generators/iterators instead of lists. So instead of:
    def cubes(n):
        res = []
        for i in range(n):
            res.append(i**3)
        return res

you can do

    def cubes(n):
        for i in range(n):
            yield i**3

We can make it a list if we want to with list(cubes(10)), but we can also use it to iterate directly (only once per call), like:

    for i in cubes(10):
        <do something with i>

Or, to extract the first three elements:

    c = cubes(100000)
    next(c)
    next(c)
    next(c)
x = 1 if <condition> else 0
    a, b, c = (1, 2, 3, 4, 5)

gives an error (too many values to unpack). In Python 3 we can do

    a, b, *c = (1, 2, 3, 4, 5)

which gives a = 1, b = 2, c = [3, 4, 5].

    a, b, *_ = (1, 2, 3, 4, 5)

gives a = 1 and b = 2.

    a, b, *c, d = (1, 2, 3, 4, 5)

gives a = 1, b = 2, c = [3, 4], and d = 5.
To hide the typed password:
    from getpass import getpass

    username = input('Username: ')
    password = getpass('Password: ')
The following values are treated as False: "" (empty string), 0, 0.0, [] (empty list), () (empty tuple), {} (empty dictionary), False, None. If variable is any of those, then

    if variable:
        print('OK')

will not print OK, but it will for any value that is treated as True.
Instead of

    from random import randint
    x = my_list[randint(0, len(my_list) - 1)]

use choice:

    from random import choice
    x = choice(my_list)

Note that Sage already has it (no need to import).
For multiple elements, instead of

    [choice(my_list) for i in range(10)]

do

    from random import choices
    choices(my_list, k=10)
Note that choices is not in Sage.
Use sample from random to get a subset of a list (without repetition):
    >>> from random import sample
    >>> v = range(12)
    >>> sample(v, 5)
    [0, 8, 4, 9, 1]
    >>> sample(v, 5)
    [7, 9, 11, 8, 5]
    >>> sample(v, 4)
    [7, 10, 0, 2]
The random module also has shuffle, to randomize a list:
    >>> from random import shuffle
    >>> v = list(range(10))
    >>> shuffle(v)
    >>> v
    [8, 4, 5, 9, 0, 6, 7, 3, 1, 2]
Note that it is in place (meaning that the list is changed).
Instead of
    if type(v) == type([]):
        # do something if v is a list

do

    if isinstance(v, list):
        # do something if v is a list
You can get the name of the type/parent with type(v).
You cannot create an empty set with my_set = {}, as it creates an empty dictionary. So, you do it with my_set = set().
This also works for lists, tuples or dictionaries:

    empty_list = []
    empty_list = list()
    empty_tuple = ()
    empty_tuple = tuple()
    empty_dict = {}
    empty_dict = dict()

But, although it is still fast, creating with empty_list = [], for example, is faster than empty_list = list()!
You can use deque (double-ended queue) from collections if you want to add or take elements from the front/start and back/end of a list.

    >>> from collections import deque
    >>> q = deque([1,2,3,4,5])
    >>> q.appendleft(0)
    >>> q
    deque([0, 1, 2, 3, 4, 5])
    >>> q.append(6)
    >>> q
    deque([0, 1, 2, 3, 4, 5, 6])
    >>> x = q.popleft()
    >>> x, q
    (0, deque([1, 2, 3, 4, 5, 6]))
    >>> y = q.pop()
    >>> y, q
    (6, deque([1, 2, 3, 4, 5]))

You can also use it to keep a fixed number of elements in a list.

    >>> q = deque([1,2,3,4,5], maxlen=5)
    >>> q
    deque([1, 2, 3, 4, 5], maxlen=5)
    >>> q.append(6)
    >>> q
    deque([2, 3, 4, 5, 6], maxlen=5)
    >>> q.append(7)
    >>> q.append(7)
    >>> q
    deque([4, 5, 6, 7, 7], maxlen=5)
    >>> q.appendleft(3)
    >>> q
    deque([3, 4, 5, 6, 7], maxlen=5)
A heap is a list where the first element is always the minimum.
    >>> import heapq
    >>> H = [21,1,45,78,3,5]
    >>> heapq.heapify(H)
    >>> H
    [1, 3, 5, 78, 21, 45]
    >>> x = heapq.heappop(H)
    >>> x, H
    (1, [3, 21, 5, 78, 45])
    >>> heapq.heappush(H, 4)
    >>> H
    [3, 21, 4, 78, 45, 5]
    >>> x = heapq.heappop(H)
    >>> x, H
    (3, [4, 21, 5, 78, 45])
    >>> x = heapq.heappop(H)
    >>> x, H
    (4, [5, 21, 45, 78])

You can use heapq to efficiently get a small number, say n, of the smallest or largest elements. (If n == 1, use min or max instead. If n is close to the size of the list, use sorted instead.)

    >>> H = [21,1,45,78,3,5]
    >>> heapq.heapify(H)
    >>> heapq.nsmallest(3, H)
    [1, 3, 5]
    >>> heapq.nlargest(3, H)
    [78, 45, 21]
Reference: Larry Hastings - Solve Your Problem With Sloppy Python - PyCon 2018.
To run OS commands, we usually do os.system(<command>), but it does not stop the script if the command fails. You can replace it with:

    import subprocess

    def run(s):
        '''
        Run the command given by the string s, and raise an exception if it fails.
        '''
        subprocess.run(s, check=True, shell=True)
We can create a slice object to get the same slice of different objects.

    slc = slice(None, 10, 2)
    a = list(range(20))
    a[slc]  # same as a[:10:2]: [0, 2, 4, 6, 8]
For sets, we can use | for union, & for intersection, - for difference, and ^ for the symmetric difference (i.e., the union minus the intersection).
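A quick sketch of these four operators on two small sets. The sets a and b are arbitrary examples.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b          # elements in a or b
intersection = a & b   # elements in both a and b
difference = a - b     # elements in a but not in b
symmetric = a ^ b      # elements in exactly one of a and b

print(union, intersection, difference, symmetric)

# The symmetric difference is the union minus the intersection.
print(symmetric == union - intersection)
```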
Collections provides data types that are more efficient for different tasks.
If you expect some action to work most of the time, it is better to use try/except than to test the conditions.
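As an illustration of the two styles, the sketch below looks up a key in a dictionary, first by testing the condition and then with try/except. The dictionary ages and both function names are hypothetical.

```python
ages = {'raymond': 41, 'rachel': 35}  # hypothetical data

def age_with_test(name):
    # Test first: the key is looked up twice when it exists.
    if name in ages:
        return ages[name]
    return None

def age_with_try(name):
    # Just try it: a single lookup in the common case where the key exists.
    try:
        return ages[name]
    except KeyError:
        return None

print(age_with_test('rachel'), age_with_try('rachel'))
print(age_with_test('matthew'), age_with_try('matthew'))
```

Raising and catching an exception has its own cost, so this pays off when failures are rare.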
Testing x in list is more efficient than checking “by hand”, but it takes longer if x is not in the list or is near the end. Converting first with s = set(list) and then testing x in s is much faster, but the conversion takes some time. So it is useful if there are many membership tests.
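A rough way to see the difference is to time both tests with timeit. The sizes and repetition counts below are arbitrary, and the actual timings depend on the machine, so no expected output is shown.

```python
from timeit import timeit

values = list(range(100_000))
value_set = set(values)   # one-time conversion cost
target = 99_999           # worst case for the list: the last element

# Time 100 membership tests against each container.
list_time = timeit(lambda: target in values, number=100)
set_time = timeit(lambda: target in value_set, number=100)

print(f"list: {list_time:.5f}s, set: {set_time:.5f}s")
```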
To remove duplicates from a list, use set(list), but this does not preserve order. To preserve order, use OrderedDict:

    >>> mylist = [1,1,1,1,1,2,2,2,2,2,3,4,4,4,5,5]
    >>> from collections import OrderedDict
    >>> OD = OrderedDict.fromkeys(mylist); OD
    OrderedDict([(1, None), (2, None), (3, None), (4, None), (5, None)])
    >>> list(OD)
    [1, 2, 3, 4, 5]
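Since Python 3.7, regular dictionaries also keep insertion order, so a plain dict can be used in the same way. A minimal sketch, with a made-up list that is not already sorted to make the ordering visible:

```python
mylist = [3, 1, 3, 2, 1, 5, 2, 4]  # illustrative data

# dict.fromkeys keeps the first occurrence of each key, in order.
unique = list(dict.fromkeys(mylist))

print(unique)  # → [3, 1, 2, 5, 4]
```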
Sorting in place (with mylist.sort()) is faster than sorted(mylist), which has to produce a new list.
Instead of
    def square(x):
        return x**2

    [square(x) for x in range(1000)]

it would be much faster to do

    def vsquares():
        return [x**2 for x in range(1000)]

    vsquares()
if var == True: pass
is slower than
if var is True: pass
which is slower than
if var: pass
Similarly, instead of
if len(mylist) == 0: pass
(really bad!) or
if mylist == []: pass
use
if mylist: pass
One can use `tqdm` for progress bars:
    from tqdm import tqdm
    from time import sleep

    for i in tqdm(range(100)):
        sleep(0.1)
Note that it does not work in bpython.
On Jupyter notebooks, use instead:
from tqdm.notebook import tqdm