February 8, 2022

Python Cheatsheet

This Python Cheatsheet is created based on many open references.

Python Basics

Math Operators

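From highest to lowest precedence. The operators *, /, // and % share one precedence level, as do + and -; operators on the same level are evaluated left to right:

Operators    Operation            Example
**           Exponent             2 ** 3 = 8
%            Modulus/Remainder    22 % 8 = 6
//           Integer division     22 // 8 = 2
/            Division             22 / 8 = 2.75
*            Multiplication       3 * 3 = 9
-            Subtraction          5 - 2 = 3
+            Addition             2 + 2 = 4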

Examples of expressions in the interactive shell:

>>> 2 + 3 * 6
20
>>> (2 + 3) * 6
30
>>> 2 ** 8
256
>>> 23 // 7
3
>>> 23 % 7
2
>>> (5 - 1) * ((7 + 1) / (3 - 1))
16.0
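
A minimal sketch of how operators on the same level combine, assuming standard Python 3 semantics: *, /, // and % are evaluated left to right, while ** groups right to left. The variable names are illustrative only.

```python
# Same-level operators are evaluated left to right.
left_to_right = 22 // 8 * 3      # (22 // 8) * 3
mixed = 22 % 8 * 2               # (22 % 8) * 2

# ** groups right to left and binds tighter than a unary minus on its left.
power_chain = 2 ** 3 ** 2        # 2 ** (3 ** 2)
negated_power = -2 ** 2          # -(2 ** 2)

print(left_to_right, mixed, power_chain, negated_power)  # 6 12 512 -4
```

When in doubt, parentheses make the intended order explicit.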

Data Types

Data Type                 Examples
Integers                  -2, -1, 0, 1, 2, 3, 4, 5
Floating-point numbers    -1.25, -1.0, -0.5, 0.0, 0.5, 1.0, 1.25
Strings                   'a', 'aa', 'aaa', 'Hello!', '11 cats'

String Concatenation and Replication

String concatenation:

>>> 'Alice' + 'Bob'
'AliceBob'

String Replication:

>>> 'Alice' * 5
'AliceAliceAliceAliceAlice'

Variables

Variable naming rules:

  1. It can be only one word.
  2. It can use only letters, numbers, and the underscore (_) character.
  3. It can’t begin with a number.

Example:

>>> first_name = 'Harry'
>>> first_name
'Harry'

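A single underscore (_) is conventionally used as an “I don’t care” or “throwaway” variable name, for a value you have to receive but do not intend to use:

>>> x, _, y = (1, 2, 3)
>>> x
1
>>> y
3

A name that starts with an underscore, such as _foo, is by convention meant for internal use only:

>>> _foo = 'Hello'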

Comments

Inline comment:

# This is a comment

Multiline comment:

# This is a
# multiline comment

Code with a comment:

a = 1  # initialization

Please note the two spaces in front of the comment.

The print() Function

>>> print('Hello world!')
Hello world!
>>> a = 1
>>> print('Hello world!', a)
Hello world! 1

The input() Function

Example Code:

>>> print('What is your name?')   # ask for their name
What is your name?
>>> myName = input()
Al
>>> print('It is good to meet you, {}'.format(myName))
It is good to meet you, Al

The len() Function

Evaluates to the integer value of the number of characters in a string:

>>> len('hello')
5

Note: to test whether a string, list, dictionary, or other container is empty, prefer direct Boolean evaluation over len():

>>> a = [1, 2, 3]
>>> if a:
...     print("the list is not empty!")
...
the list is not empty!

The str(), int(), and float() Functions

Integer or Float to String:

>>> str(29)
'29'
>>> print('I am {} years old.'.format(str(29)))
I am 29 years old.
>>> str(-3.14)
'-3.14'

Float to Integer:

>>> int(7.7)
7
>>> int(7.7) + 1
8
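
The section title also mentions float(), and int() and float() can parse numeric text as well. A short sketch of those conversions (the variable names are illustrative):

```python
# String to number: int() and float() parse numeric text.
count = int('42')
price = float('3.14')

# Integer to float.
as_float = float(10)

# int() truncates toward zero; it does not round.
truncated_positive = int(7.7)
truncated_negative = int(-7.7)

print(count + 1, price, as_float, truncated_positive, truncated_negative)
# 43 3.14 10.0 7 -7
```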

Flow Control

Comparison Operators

Operator    Meaning
==          Equal to
!=          Not equal to
<           Less than
>           Greater than
<=          Less than or equal to
>=          Greater than or equal to

These operators evaluate to True or False depending on the values you give them.

Examples:

>>> 42 == 42
True
>>> 40 == 42
False
>>> 'hello' == 'hello'
True
>>> 'hello' == 'Hello'
False
>>> 'dog' != 'cat'
True
>>> 42 == 42.0
True
>>> 42 == '42'
False

Boolean evaluation

Never use the == or != operators to compare a value against True or False. Use the is or is not operators, or implicit Boolean evaluation, instead.

NO (even if they are valid Python):

>>> True == True
True
>>> True != False
True

YES:

>>> True is True
True
>>> True is not False
True

When a is a Boolean (True or False), these statements are equivalent:

>>> if a is True:
...     pass
>>> if a is not False:
...     pass
>>> if a:
...     pass

And these as well:

>>> if a is False:
...     pass
>>> if a is not True:
...     pass
>>> if not a:
...     pass
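
The three styles of test only agree for real Booleans. A small sketch of where they diverge for other values; describe() is a hypothetical helper written for this illustration:

```python
def describe(value):
    """Return how three styles of Boolean test treat `value`."""
    return {
        'is True': value is True,
        'is not False': value is not False,
        'truthy': bool(value),
    }

print(describe(True))   # all three agree for a real Boolean
print(describe(1))      # 1 is truthy but is not the object True
print(describe([]))     # an empty list is falsy but is not the object False
```

For values that may not be Booleans, implicit evaluation (if a:) is usually the test you want.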

Boolean Operators

There are three Boolean operators: and, or, and not.

The and Operator’s Truth Table:

Expression         Evaluates to
True and True      True
True and False     False
False and True     False
False and False    False

The or Operator’s Truth Table:

Expression        Evaluates to
True or True      True
True or False     True
False or True     True
False or False    False

The not Operator’s Truth Table:

Expression    Evaluates to
not True      False
not False     True

Mixing Boolean and Comparison Operators

>>> (4 < 5) and (5 < 6)
True
>>> (4 < 5) and (9 < 6)
False
>>> (1 == 2) or (2 == 2)
True

You can also use multiple Boolean operators in an expression, along with the comparison operators:

>>> 2 + 2 == 4 and not 2 + 2 == 5 and 2 * 2 == 2 + 2
True

if Statements

credit_score = 750
if credit_score >= 720:
    print('Excellent')

credit_score = 700
if credit_score >= 690 and credit_score <= 719:
    print('Good')

else Statements

credit_score = 650
if credit_score >= 700:
    print('loan approved')  # auto loan approval
else:
    print('application received and under review')

elif Statements

credit_score = 600
student = 'yes'
if credit_score >= 700:
    print('card approved')
elif student == 'yes':
    print('student card approved')

if...elif...else

credit_score = 600
student = 'no'
if credit_score >= 700:
    print('card approved')
elif student == 'yes':
    print('student card approved')
else:
    print('application declined')

for Loops and the range() Function

print('The only three things that matter in real estate are:')
for i in range(3):
    print(f'{i+1}. Location!')

The only three things that matter in real estate are:
1. Location!
2. Location!
3. Location!

The range() function can also be called with three arguments. The first two arguments will be the start and stop values, and the third will be the step argument. The step is the amount that the variable is increased by after each iteration.

>>> for i in range(0, 10, 2):
...     print(i)
0
2
4
6
8

You can even use a negative number for the step argument to make the for loop count down instead of up.

>>> for i in range(5, -1, -1):
...     print(i)
5
4
3
2
1
0
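
A quick way to inspect what a range() call produces is to turn it into a list. A minimal sketch (the names are illustrative):

```python
first_three = list(range(3))          # one argument: stop only
evens = list(range(0, 10, 2))         # start, stop, step
countdown = list(range(5, -1, -1))    # negative step counts down
empty = list(range(5, 0))             # start > stop with a positive step yields nothing

print(first_three, evens, countdown, empty)
# [0, 1, 2] [0, 2, 4, 6, 8] [5, 4, 3, 2, 1, 0] []
```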

while Loop Statements

a = 0
while a < 5:
    print('Hello, world.')
    a = a + 1

NOTE: in the example above, if you don’t increase the value of a inside the loop, the condition stays true forever and you end up in an infinite loop.

break Statements

If the execution reaches a break statement, it immediately exits the while loop’s clause:

while True:
    print('Please enter the password:')
    password = input()
    if password == 'precious':
        break
print('here is the ring')

continue Statements

When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop.

while True:
    print('Who are you?')
    name = input()
    if name != 'Joe':
        continue
    print('Hello, Joe. What is the password? (It is a fish.)')
    password = input()
    if password == 'swordfish':
        break
print('Access granted.')

pass Statement

pass is a null statement: it does nothing when executed and is generally used as a placeholder.

a = 5
if a == 5:
    pass  # nothing happens

Lists

>>> animals = ['cat', 'dog', 'fish', 'elephant']

>>> animals
['cat', 'dog', 'fish', 'elephant']

Getting Individual Values in a List with Indexes

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals[0]
'cat'
>>> animals[1]
'dog'

Negative Indexes

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals[-1]
'elephant'
>>> animals[-3]
'dog'
>>> f'I have one {animals[0]} and no {animals[-3]}.'
'I have one cat and no dog.'

Getting Sublists with Slices

a[start:stop]  # items start through stop-1
a[start:]      # items start through the rest of the array
a[:stop]       # items from the beginning through stop-1
a[:]           # a copy of the whole array
a[start:stop:step] # start through not past stop, by step

The key is to remember the :stop value represents the first value that is NOT in the selected slice.

The number of elements selected is stop - start (if step is 1 - the default).

start, stop, and step can all be negative:

a[-1]    # last item in the array
a[-2:]   # last two items in the array
a[:-2]   # everything except the last two items
a[::-1]    # all items in the array, reversed
a[1::-1]   # the first two items, reversed
a[:-3:-1]  # the last two items, reversed
a[-3::-1]  # everything except the last two items, reversed
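
As a concrete illustration of the negative-index and negative-step forms above, here is a small sketch on a five-item list (the list a and the result names are just examples):

```python
a = ['cat', 'dog', 'fish', 'elephant', 'bird']

last_two = a[-2:]              # ['elephant', 'bird']
all_but_last_two = a[:-2]      # ['cat', 'dog', 'fish']
reversed_all = a[::-1]         # ['bird', 'elephant', 'fish', 'dog', 'cat']
first_two_reversed = a[1::-1]  # ['dog', 'cat']
last_two_reversed = a[:-3:-1]  # ['bird', 'elephant']
rest_reversed = a[-3::-1]      # ['fish', 'dog', 'cat']
every_second = a[::2]          # ['cat', 'fish', 'bird']
```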

Some examples:

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals[0:4]
['cat', 'dog', 'fish', 'elephant']
>>> animals[1:3]
['dog', 'fish']
>>> animals[0:-1]
['cat', 'dog', 'fish']
>>> animals[:2]
['cat', 'dog']
>>> animals[1:]
['dog', 'fish', 'elephant']

Slicing the complete list will perform a copy:

>>> animals2 = animals[:]  # this is making a copy
>>> animals2
['cat', 'dog', 'fish', 'elephant']
>>> animals.append('bird')
>>> animals
['cat', 'dog', 'fish', 'elephant', 'bird']
>>> animals2
['cat', 'dog', 'fish', 'elephant']

Getting a List’s Length with len()

>>> animals = ['cat', 'dog', 'moose']
>>> len(animals)
3

Changing Values in a List with Indexes

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals[1] = 'bird'
>>> animals
['cat', 'bird', 'fish', 'elephant']
>>> animals[2] = animals[0]
>>> animals
['cat', 'bird', 'cat', 'elephant']

>>> animals[-1] = 12345
>>> animals
['cat', 'bird', 'cat', 12345]

List Concatenation and List Replication

>>> [1, 2, 3] + ['A', 'B', 'C']
[1, 2, 3, 'A', 'B', 'C']

>>> ['X', 'Y', 'Z'] * 3
['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z']

Using for Loops with Lists

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> for i, animal in enumerate(animals):
...     print(f'Index {i} in animals list is: {animal}')
... 
Index 0 in animals list is: cat
Index 1 in animals list is: dog
Index 2 in animals list is: fish
Index 3 in animals list is: elephant

Looping Through Multiple Lists with zip()

>>> name = ['Pete', 'John', 'Elizabeth']
>>> age = [6, 23, 44]
>>> for n, a in zip(name, age):
...     print(f'{n} is {a} years old')
Pete is 6 years old
John is 23 years old
Elizabeth is 44 years old

The in and not in Operators

>>> 'cat' in ['cat', 'dog', 'fish', 'elephant']
True
>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> 'bird' in animals
False
>>> 'bird' not in animals
True

The Multiple Assignment Trick

The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:

>>> customer = ['John', 'Male', 25]
>>> name = customer[0]
>>> gender = customer[1]
>>> age = customer[2]

You could type this line of code:

>>> customer = ['John', 'Male', 25]
>>> name, gender, age = customer

You will get an error if the number of variables does not match the number of elements in the list:

>>> customer = ['John', 'Male', 25]
>>> name, gender  = customer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 2)
>>> name, gender, age, address  = customer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 4, got 3)
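
Two related uses of the same unpacking syntax, sketched under the assumption of Python 3: swapping values without a temporary variable, and a starred name that collects the leftover elements so the counts no longer have to match.

```python
customer = ['John', 'Male', 25]

# Swap two variables without a temporary.
a, b = 1, 2
a, b = b, a

# A starred name collects whatever is left over as a list.
first, *rest = customer

print(a, b, first, rest)  # 2 1 John ['Male', 25]
```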

Augmented Assignment Operators

Operator    Equivalent
x += 1      x = x + 1
x -= 1      x = x - 1
x *= 1      x = x * 1
x /= 1      x = x / 1
x %= 1      x = x % 1

Examples:

>>> a = 'Hello'
>>> a += ' world!'
>>> a
'Hello world!'

>>> b = ['hello']
>>> b *= 3
>>> b
['hello', 'hello', 'hello']

Finding a Value in a List with the index() Method

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals.index('dog')
1

Adding Values to Lists with the append() and insert() Methods

append():

>>> spam = ['cat', 'dog', 'bat']

>>> spam.append('moose')

>>> spam
['cat', 'dog', 'bat', 'moose']

insert():

>>> spam = ['cat', 'dog', 'bat']

>>> spam.insert(1, 'chicken')

>>> spam
['cat', 'chicken', 'dog', 'bat']

Removing Values from Lists with remove() or pop()

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals.pop(2)
'fish'
>>> animals
['cat', 'dog', 'elephant']
>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals.remove('cat')
>>> animals
['dog', 'fish', 'elephant']
>>> animals.pop()
'elephant'
>>> animals
['dog', 'fish']
>>> animals.pop(1)
'fish'
>>> animals
['dog']

If the value appears multiple times in the list, remove() deletes only its first occurrence.
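
A short sketch of that behaviour, plus one common way to drop every occurrence and what happens when the value is missing (the list contents are illustrative):

```python
animals = ['cat', 'dog', 'cat', 'fish']

animals.remove('cat')          # deletes only the first 'cat'
after_remove = list(animals)   # ['dog', 'cat', 'fish']

# To drop every occurrence, build a new list instead.
no_cats = [animal for animal in animals if animal != 'cat']   # ['dog', 'fish']

# remove() raises ValueError when the value is not in the list.
try:
    animals.remove('moose')
    missing_raised = False
except ValueError:
    missing_raised = True
```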

Sorting the Values in a List with the sort() Method

>>> a = [2, 5, 3.14, 1, -7]
>>> a.sort()
>>> a
[-7, 1, 2, 3.14, 5]
>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> animals.sort()
>>> animals
['cat', 'dog', 'elephant', 'fish']

You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order:

>>> animals.sort(reverse=True)
>>> animals
['fish', 'elephant', 'dog', 'cat']

You can use the built-in function sorted to return a new list:

>>> animals = ['cat', 'dog', 'fish', 'elephant']
>>> sorted(animals)
['cat', 'dog', 'elephant', 'fish']
>>> animals
['cat', 'dog', 'fish', 'elephant']

Tuple Data Type

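Tuples are like lists except in two ways: they are written with parentheses instead of square brackets, and they are immutable, so their items cannot be changed: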

>>> a = [1, 1, 2, 3, 5, 8]  # list
>>> b = (1, 1, 2, 3, 5, 8)  # tuple
>>> a[4] = 'hello!'
>>> a
[1, 1, 2, 3, 'hello!', 8]
>>> b[4] = 'hello!'
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> b
(1, 1, 2, 3, 5, 8)

Converting Types with the list() and tuple() Functions

>>> tuple(['cat', 'dog', 5])
('cat', 'dog', 5)
>>> list(('cat', 'dog', 5))
['cat', 'dog', 5]
>>> list('hello')
['h', 'e', 'l', 'l', 'o']

Dictionaries

A dictionary holds key: value pairs separated by commas:

customer = {'name': 'John', 'gender': 'male', 'age': 25}

The keys(), values(), and items() Methods

keys():

>>> for k in customer.keys():
...     print(k)
... 
name
gender
age

values():

>>> for v in customer.values():
...     print(v)
... 
John
male
25

items(): each item is a tuple

>>> for i in customer.items():
...     print(i)
... 
('name', 'John')
('gender', 'male')
('age', 25)

Access the key and value of each item in a for loop by unpacking the tuple:

>>> customer = {'name': 'John', 'gender': 'male', 'age': 25}
>>> for k, v in customer.items():
...     print(f'Key is {k}, Value is {v}')
... 
Key is name, Value is John
Key is gender, Value is male
Key is age, Value is 25

Checking Whether a Key or Value Exists in a Dictionary

>>> 'zip' in customer.keys()
False
>>> 'age' in customer
True
>>> 'john' in customer.values()
False
>>> 'John' in customer.values()
True

The get() Method

get() takes two arguments: the key, and an optional default value to return if the key does not exist (None when omitted):

>>> customer.get('name')
'John'
>>> customer.get('zip')  # returns None, so the shell prints nothing
>>> customer.get('zip', '19713')
'19713'
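
A common use of the default argument is counting: get() supplies 0 the first time a key is seen. A minimal sketch (the counts dictionary is illustrative):

```python
customer = {'name': 'John', 'gender': 'male', 'age': 25}

missing = customer.get('zip')            # None: no default given
fallback = customer.get('zip', '19713')  # '19713'

# Counting with get(): start each count at 0 the first time a key is seen.
counts = {}
for letter in 'hello':
    counts[letter] = counts.get(letter, 0) + 1

print(missing, fallback, counts)
# None 19713 {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```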

Merge two dictionaries

# in Python 3.5+:
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> z = {**x, **y}  # unpack x into z first, then y; values from y overwrite those of matching keys
>>> z
{'a': 1, 'b': 3, 'c': 4}
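
An equivalent merge can be sketched with copy() and update(), which gives the same result and leaves the original dictionaries untouched:

```python
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

# Copy first so that x itself is left unchanged, then overlay y.
z = x.copy()
z.update(y)

print(z)                 # {'a': 1, 'b': 3, 'c': 4}
print(z == {**x, **y})   # True
print(x)                 # {'a': 1, 'b': 2}
```

Python 3.9 and later also accept x | y for the same merge.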

Sets

A set is an unordered collection with no duplicate elements.

Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.

Initializing a set

There are two ways to create sets: using curly braces {} and the built-in function set()

>>> s = {1, 2, 3}
>>> s = set([1, 2, 3])

When creating an empty set, use set() rather than curly braces {}, which give you an empty dictionary instead.

>>> s = {}
>>> type(s)
<class 'dict'>

sets: unordered collections of unique elements

A set automatically removes all duplicate values.

>>> s = {1, 2, 3, 2, 3, 4}
>>> s
{1, 2, 3, 4}

And because sets are unordered, they can’t be indexed.

>>> s = {1, 2, 3}
>>> s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable

set add() and update()

Using the add() method we can add a single element to the set.

>>> s = {1, 2, 3}
>>> s.add(4)
>>> s
{1, 2, 3, 4}

And with update(), several at once.

>>> s = {1, 2, 3}
>>> s.update([2, 3, 4, 5, 6])  # remember, sets automatically remove duplicates
>>> s
{1, 2, 3, 4, 5, 6}

set remove() and discard()

Both methods will remove an element from the set, but remove() will raise a KeyError if the value doesn’t exist.

>>> s = {1, 2, 3}
>>> s.remove(3)
>>> s
{1, 2}
>>> s.remove(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 3

discard() won’t raise any errors.

>>> s = {1, 2, 3}
>>> s.discard(3)
>>> s
{1, 2}
>>> s.discard(3)
>>>

set union()

union() or | will create a new set that contains all the elements from the sets provided.

>>> s1 = {1, 2, 3}
>>> s2 = {3, 4, 5}
>>> s1.union(s2)  # or 's1 | s2'
{1, 2, 3, 4, 5}

set intersection

intersection or & will return a set containing only the elements that are common to all of the sets.

>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s3 = {3, 4, 5}
>>> s1.intersection(s2, s3)  # or 's1 & s2 & s3'
{3}

set difference

difference or - will return only the elements that are unique to the first set (invoked set).

>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.difference(s2)  # or 's1 - s2'
{1}
>>> s2.difference(s1) # or 's2 - s1'
{4}

set symmetric_difference

symmetric_difference or ^ will return all the elements that are not common between them.

>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.symmetric_difference(s2)  # or 's1 ^ s2'
{1, 4}

List/Dict/Set Comprehensions

List/Dict/Set Comprehension returns a new List/Dict/Set

List comprehension

>>> a = [1, 3, 5, 7, 9, 11]

>>> [i - 1 for i in a]
[0, 2, 4, 6, 8, 10]

Dict comprehension

>>> c = {'name': 'Pooka', 'age': 5}
>>> {v: k for k, v in c.items()}
{'Pooka': 'name', 5: 'age'}

Set comprehension

>>> b = {"abc", "def"}
>>> {s.upper() for s in b}
{'ABC', 'DEF'}

itertools Module

The itertools module includes functions that create iterators for efficient looping.

The itertools module comes with the standard library and must be imported: import itertools

The examples below also use the operator module, which must be imported as well: import operator

operator.mul takes two numbers and multiplies them:

>>> operator.mul(1, 2)
2
>>> operator.mul(2, 3)
6
>>> operator.mul(6, 4)
24
>>> operator.mul(24, 5)
120

accumulate()

Makes an iterator that returns accumulated sums.

>>> data = [5, 2, 6, 4, 5, 9, 1]
>>> result = itertools.accumulate(data)
>>> for each in result:
...     print(each)
5
7
13
17
22
31
32

You can also pass a function:

>>> data = [1, 2, 3, 4, 5]
>>> result = itertools.accumulate(data, operator.mul)
>>> for each in result:
...     print(each)
1
2
6
24
120

combinations()

Takes an iterable and an integer r, and creates every unique combination of r elements.

itertools.combinations(iterable, r)

Example:

>>> shapes = ['circle', 'triangle', 'square',]
>>> result = itertools.combinations(shapes, 2)
>>> for each in result:
...     print(each)
('circle', 'triangle')
('circle', 'square')
('triangle', 'square')

combinations_with_replacement()

Just like combinations(), but allows individual elements to be repeated more than once.

itertools.combinations_with_replacement(iterable, r)

Example:

>>> shapes = ['circle', 'triangle', 'square']
>>> result = itertools.combinations_with_replacement(shapes, 2)
>>> for each in result:
...     print(each)
('circle', 'circle')
('circle', 'triangle')
('circle', 'square')
('triangle', 'triangle')
('triangle', 'square')
('square', 'square')

count()

Makes an iterator that returns evenly spaced values starting with number start.

itertools.count(start=0, step=1)

Example:

>>> for i in itertools.count(10, 3):
...     print(i)
...     if i > 20:
...         break
10
13
16
19
22

cycle()

This function cycles through an iterable endlessly.

itertools.cycle(iterable)

Example:

>>> colors = ['red', 'orange', 'yellow', 'green', 'blue', 'violet']
>>> for color in itertools.cycle(colors):
...     print(color)
red
orange
yellow
green
blue
violet
red
orange

When it reaches the end of the iterable, it starts over again from the beginning, so the loop above never stops on its own (the output shown is truncated).
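
Because cycle() never stops, it is usually paired with something that limits it. One possible sketch uses itertools.islice() to take only the first eight values:

```python
import itertools

colors = ['red', 'orange', 'yellow', 'green', 'blue', 'violet']

# islice() stops the endless cycle after eight items.
first_eight = list(itertools.islice(itertools.cycle(colors), 8))

print(first_eight)
# ['red', 'orange', 'yellow', 'green', 'blue', 'violet', 'red', 'orange']
```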

chain()

Take a series of iterables and return them as one long iterable.

itertools.chain(*iterables)

Example:

>>> colors = ['red', 'orange', 'yellow', 'green', 'blue']
>>> shapes = ['circle', 'triangle', 'square', 'pentagon']
>>> result = itertools.chain(colors, shapes)
>>> for each in result:
...     print(each)
red
orange
yellow
green
blue
circle
triangle
square
pentagon

compress()

Filters one iterable with another.

itertools.compress(data, selectors)

Example:

>>> shapes = ['circle', 'triangle', 'square', 'pentagon']
>>> selections = [True, False, True, False]
>>> result = itertools.compress(shapes, selections)
>>> for each in result:
...     print(each)
circle
square

dropwhile()

Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element.

itertools.dropwhile(predicate, iterable)

Example:

>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> result = itertools.dropwhile(lambda x: x<5, data)
>>> for each in result:
...     print(each)
5
6
7
8
9
10
1

filterfalse()

Makes an iterator that filters elements from iterable returning only those for which the predicate is False.

itertools.filterfalse(predicate, iterable)

Example:

>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> result = itertools.filterfalse(lambda x: x<5, data)
>>> for each in result:
...     print(each)
5
6
7
8
9
10

groupby()

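This function groups consecutive elements of an iterable that share the same key. Because only adjacent elements are grouped, the same key can appear more than once unless the input is sorted by that key first.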

itertools.groupby(iterable, key=None)

Example:

>>> robots = [{
...     'name': 'blaster',
...     'faction': 'autobot'
... }, {
...     'name': 'galvatron',
...     'faction': 'decepticon'
... }, {
...     'name': 'jazz',
...     'faction': 'autobot'
... }, {
...     'name': 'metroplex',
...     'faction': 'autobot'
... }, {
...     'name': 'megatron',
...     'faction': 'decepticon'
... }, {
...     'name': 'starcream',
...     'faction': 'decepticon'
... }]
>>> for key, group in itertools.groupby(robots, key=lambda x: x['faction']):
...     print(key)
...     print(list(group))
autobot
[{'name': 'blaster', 'faction': 'autobot'}]
decepticon
[{'name': 'galvatron', 'faction': 'decepticon'}]
autobot
[{'name': 'jazz', 'faction': 'autobot'}, {'name': 'metroplex', 'faction': 'autobot'}]
decepticon
[{'name': 'megatron', 'faction': 'decepticon'}, {'name': 'starcream', 'faction': 'decepticon'}]
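
In the output above each faction shows up twice, because groupby() only merges neighbouring elements. To get one group per key, a common approach is to sort by the same key first. A minimal sketch on a shortened list; faction_of() is a hypothetical helper:

```python
import itertools

robots = [
    {'name': 'blaster', 'faction': 'autobot'},
    {'name': 'galvatron', 'faction': 'decepticon'},
    {'name': 'jazz', 'faction': 'autobot'},
    {'name': 'megatron', 'faction': 'decepticon'},
]

def faction_of(robot):
    """Key function shared by sorted() and groupby()."""
    return robot['faction']

# groupby() only merges neighbours, so sort by the same key first.
robots_sorted = sorted(robots, key=faction_of)

grouped = {
    faction: [robot['name'] for robot in group]
    for faction, group in itertools.groupby(robots_sorted, key=faction_of)
}
print(grouped)
# {'autobot': ['blaster', 'jazz'], 'decepticon': ['galvatron', 'megatron']}
```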

islice()

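This function works much like slicing: it lets you cut a piece out of any iterable. When called with two arguments, as in the example below, the second argument is stop.

itertools.islice(iterable, stop)
itertools.islice(iterable, start, stop[, step])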

Example:

>>> colors = ['red', 'orange', 'yellow', 'green', 'blue',]
>>> few_colors = itertools.islice(colors, 2)
>>> for each in few_colors:
...     print(each)
red
orange
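
A short sketch of the start, stop, and step arguments, including a slice taken from an endless iterator, which ordinary slicing cannot do (the names are illustrative):

```python
import itertools

colors = ['red', 'orange', 'yellow', 'green', 'blue']

middle = list(itertools.islice(colors, 1, 4))             # start=1, stop=4
every_other = list(itertools.islice(colors, 0, None, 2))  # stop=None means "to the end"

# islice() also works on iterators that never end.
first_evens = list(itertools.islice(itertools.count(0, 2), 5))

print(middle)       # ['orange', 'yellow', 'green']
print(every_other)  # ['red', 'yellow', 'blue']
print(first_evens)  # [0, 2, 4, 6, 8]
```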

permutations()

itertools.permutations(iterable, r=None)

Example:

>>> alpha_data = ['a', 'b', 'c']
>>> result = itertools.permutations(alpha_data)
>>> for each in result:
...     print(each)
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')

product()

Creates the Cartesian product of a series of iterables.

>>> num_data = [1, 2, 3]
>>> alpha_data = ['a', 'b', 'c']
>>> result = itertools.product(num_data, alpha_data)
>>> for each in result:
...     print(each)
(1, 'a')
(1, 'b')
(1, 'c')
(2, 'a')
(2, 'b')
(2, 'c')
(3, 'a')
(3, 'b')
(3, 'c')

repeat()

This function repeats an object over and over again, endlessly unless a times argument is given.

itertools.repeat(object[, times])

Example:

>>> for i in itertools.repeat("spam", 3):
...     print(i)
spam
spam
spam

starmap()

Makes an iterator that computes the function using arguments obtained from the iterable.

itertools.starmap(function, iterable)

Example:

>>> data = [(2, 6), (8, 4), (7, 3)]
>>> result = itertools.starmap(operator.mul, data)
>>> for each in result:
...     print(each)
12
32
21

takewhile()

The opposite of dropwhile(). Makes an iterator and returns elements from the iterable as long as the predicate is true.

itertools.takewhile(predicate, iterable)

Example:

>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> result = itertools.takewhile(lambda x: x<5, data)
>>> for each in result:
...     print(each)
1
2
3
4

tee()

Return n independent iterators from a single iterable.

itertools.tee(iterable, n=2)

Example:

>>> colors = ['red', 'orange', 'yellow', 'green', 'blue']
>>> alpha_colors, beta_colors = itertools.tee(colors)
>>> for each in alpha_colors:
...     print(each)
red
orange
yellow
green
blue
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue']
>>> alpha_colors, beta_colors = itertools.tee(colors)
>>> for each in beta_colors:
...     print(each)
red
orange
yellow
green
blue

zip_longest()

Makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted.

itertools.zip_longest(*iterables, fillvalue=None)

Example:

>>> colors = ['red', 'orange', 'yellow', 'green', 'blue',]
>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10,]
>>> for each in itertools.zip_longest(colors, data, fillvalue=None):
...     print(each)
('red', 1)
('orange', 2)
('yellow', 3)
('green', 4)
('blue', 5)
(None, 6)
(None, 7)
(None, 8)
(None, 9)
(None, 10)

Functions

>>> def hello(name):
...     print('Hello {}'.format(name))
...
>>> hello('Alice')
Hello Alice
>>> hello('Bob')
Hello Bob

A function’s docstring is where you describe what the function does; you can access it through the function’s __doc__ attribute.

def foo():
    """
    This function prints out 'foo'.
    """
    print('foo')

Then you can access the docstring:

>>> foo.__doc__
"\n    This function prints out 'foo'.\n    "

Return Values and return Statements

When creating a function using the def statement, you can specify what the return value should be with a return statement. A return statement consists of the return keyword followed by the value or expression that the function should return:

import random
def getAnswer(answerNumber):
    if answerNumber == 1:
        return 'It is certain'
    elif answerNumber == 2:
        return 'It is decidedly so'
    elif answerNumber == 3:
        return 'Yes'
    elif answerNumber == 4:
        return 'Reply hazy try again'
    elif answerNumber == 5:
        return 'Ask again later'
    elif answerNumber == 6:
        return 'Concentrate and ask again'
    elif answerNumber == 7:
        return 'My reply is no'
    elif answerNumber == 8:
        return 'Outlook not so good'
    elif answerNumber == 9:
        return 'Very doubtful'

r = random.randint(1, 9)
fortune = getAnswer(r)
print(fortune)

The None Value

>>> spam = print('Hello!')
Hello!
>>> spam is None
True

Note: never compare to None with the == operator. Always use is.

Keyword Arguments and print()

>>> print('Hello', end='')
>>> print('World')
HelloWorld
>>> print('cats', 'dogs', 'mice')
cats dogs mice
>>> print('cats', 'dogs', 'mice', sep=',')
cats,dogs,mice

Local and Global Scope

The global Statement

If you need to modify a global variable from within a function, use the global statement:

>>> def spam():
...     global eggs
...     eggs = 'spam'
...
>>> eggs = 'global'
>>> spam()
>>> print(eggs)
spam

There are four rules to tell whether a variable is in a local scope or global scope:

  1. If a variable is being used in the global scope (that is, outside of all functions), then it is always a global variable.

  2. If there is a global statement for that variable in a function, it is a global variable.

  3. Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.

  4. But if the variable is not used in an assignment statement, it is a global variable.
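
The four rules can be sketched with one global name and three small functions. The function names are hypothetical and exist only for this illustration; the code is assumed to run at module level.

```python
eggs = 'global'                 # rule 1: assigned outside all functions, so global

def read_only():
    return eggs                 # rule 4: never assigned here, so this reads the global

def assign_local():
    eggs = 'local'              # rule 3: assigned here, so this is a new local variable
    return eggs

def assign_global():
    global eggs                 # rule 2: the global statement makes eggs global
    eggs = 'changed'

seen_by_reader = read_only()    # 'global'
local_value = assign_local()    # 'local'
after_local = eggs              # still 'global': the local assignment did not leak out
assign_global()
after_global = eggs             # 'changed'
```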

Lambda Functions

This function:

>>> def add(x, y):
...     return x + y

>>> add(5, 3)
8

Is equivalent to the lambda function:

>>> add = lambda x, y: x + y
>>> add(5, 3)
8

You don’t even need to bind it to a name like add first:

>>> (lambda x, y: x + y)(5, 3)
8

Like regular nested functions, lambdas also work as lexical closures:

>>> def make_adder(n):
...     return lambda x: x + n

>>> plus_3 = make_adder(3)
>>> plus_5 = make_adder(5)

>>> plus_3(4)
7
>>> plus_5(4)
9

Note: the body of a lambda must be a single expression; it cannot contain statements.
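
Lambdas are most often passed straight to another function as a short, throwaway key or callback. A minimal sketch with made-up customer data:

```python
customers = [('John', 25), ('Alice', 31), ('Bob', 19)]

# Sort by the second field (age) using a lambda as the key function.
by_age = sorted(customers, key=lambda customer: customer[1])

# Apply a lambda to every element with map().
names = list(map(lambda customer: customer[0].upper(), customers))

print(by_age)  # [('Bob', 19), ('John', 25), ('Alice', 31)]
print(names)   # ['JOHN', 'ALICE', 'BOB']
```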

Manipulating Strings

Escape Characters

Escape character    Prints as
\'                  Single quote
\"                  Double quote
\t                  Tab
\n                  Newline (line break)
\\                  Backslash
\b                  Backspace
\ooo                Character with octal value ooo
\r                  Carriage return

Example:

>>> print("Hello there!\nHow are you?\nI\'m doing fine.")
Hello there!
How are you?
I'm doing fine.

Raw Strings

A raw string completely ignores all escape characters and prints any backslash that appears in the string.

>>> print(r'That is Carol\'s cat.')
That is Carol\'s cat.

Note: mostly used for regular expression definition (see re package)
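
A small sketch of the difference between a normal and a raw string, and of the regular-expression use mentioned in the note (the path and the pattern are illustrative):

```python
import re

escaped = 'C:\\new\\table.txt'   # backslashes doubled by hand
raw = r'C:\new\table.txt'        # raw string: backslashes kept as typed

same_text = escaped == raw       # True
newline_length = len('\n')       # 1: a single newline character
raw_length = len(r'\n')          # 2: a backslash followed by 'n'

# Raw strings keep regular expressions readable.
digits = re.search(r'\d+', '11 cats').group()   # '11'
```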

Multiline Strings with Triple Quotes

>>> print('''Dear Alice,
...
... Eve's cat has been arrested for catnapping, cat burglary, and extortion.
...
... Sincerely,
... Bob''')
Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob

To keep a nicer flow in your code, you can use the dedent function from the textwrap standard package to strip the common leading indentation:

>>> from textwrap import dedent
>>>
>>> def my_function():
...     print(dedent('''
...         Dear Alice,
...
...         Eve's cat has been arrested for catnapping, cat burglary, and extortion.
...
...         Sincerely,
...         Bob
...         ''').strip())

Calling my_function() prints the same text as before.

Indexing and Slicing Strings

H   e   l   l   o       w   o   r   l   d    !
0   1   2   3   4   5   6   7   8   9   10   11
>>> spam = 'Hello world!'

>>> spam[0]
'H'
>>> spam[4]
'o'
>>> spam[-1]
'!'

Slicing:


>>> spam[0:5]
'Hello'
>>> spam[:5]
'Hello'
>>> spam[6:]
'world!'
>>> spam[6:-1]
'world'
>>> spam[:-1]
'Hello world'
>>> spam[::-1]
'!dlrow olleH'
>>> spam = 'Hello world!'
>>> fizz = spam[0:5]
>>> fizz
'Hello'

The in and not in Operators with Strings

>>> 'Hello' in 'Hello World'
True
>>> 'Hello' in 'Hello'
True
>>> 'HELLO' in 'Hello World'
False
>>> '' in 'spam'
True
>>> 'cats' not in 'cats and dogs'
False

The in and not in Operators with list

>>> a = [1, 2, 3, 4]
>>> 5 in a
False
>>> 2 in a
True

The upper(), lower(), isupper(), and islower() String Methods

upper() and lower():

>>> spam = 'Hello world!'
>>> spam = spam.upper()
>>> spam
'HELLO WORLD!'
>>> spam = spam.lower()
>>> spam
'hello world!'

isupper() and islower():

>>> spam = 'Hello world!'
>>> spam.islower()
False
>>> spam.isupper()
False
>>> 'HELLO'.isupper()
True
>>> 'abc12345'.islower()
True
>>> '12345'.islower()
False
>>> '12345'.isupper()
False

The isX String Methods

The startswith() and endswith() String Methods

>>> 'Hello world!'.startswith('Hello')
True
>>> 'Hello world!'.endswith('world!')
True
>>> 'abc123'.startswith('abcdef')
False
>>> 'abc123'.endswith('12')
False
>>> 'Hello world!'.startswith('Hello world!')
True
>>> 'Hello world!'.endswith('Hello world!')
True

The join() and split() String Methods

join():

>>> ', '.join(['cats', 'rats', 'bats'])
'cats, rats, bats'
>>> ' '.join(['My', 'name', 'is', 'Simon'])
'My name is Simon'
>>> 'ABC'.join(['My', 'name', 'is', 'Simon'])
'MyABCnameABCisABCSimon'

split():

>>> 'My name is Simon'.split()
['My', 'name', 'is', 'Simon']
>>> 'MyABCnameABCisABCSimon'.split('ABC')
['My', 'name', 'is', 'Simon']
>>> 'My name is Simon'.split('m')
['My na', 'e is Si', 'on']

Justifying Text with rjust(), ljust(), and center()

rjust() and ljust():

>>> 'Hello'.rjust(10)
'     Hello'
>>> 'Hello'.rjust(20)
'               Hello'
>>> 'Hello World'.rjust(20)
'         Hello World'
>>> 'Hello'.ljust(10)
'Hello     '

An optional second argument to rjust() and ljust() will specify a fill character other than a space character. Enter the following into the interactive shell:

>>> 'Hello'.rjust(20, '*')
'***************Hello'
>>> 'Hello'.ljust(20, '-')
'Hello---------------'

center():

>>> 'Hello'.center(20)
'       Hello        '
>>> 'Hello'.center(20, '=')
'=======Hello========'

Back to Top

Removing Whitespace with strip(), rstrip(), and lstrip()

>>> spam = '    Hello World     '
>>> spam.strip()
'Hello World'
>>> spam.lstrip()
'Hello World     '
>>> spam.rstrip()
'    Hello World'
>>> spam = 'SpamSpamBaconSpamEggsSpamSpam'
>>> spam.strip('ampS')
'BaconSpamEggs'

Back to Top

Copying and Pasting Strings with the pyperclip Module (requires pip install pyperclip)

>>> import pyperclip

>>> pyperclip.copy('Hello world!')

>>> pyperclip.paste()
'Hello world!'

Back to Top

String Formatting

Formatted String Literals or f-strings (Python 3.6+)

f-strings are string literals that have an f at the beginning and curly braces containing expressions that will be replaced with their values.

>>> name = 'Stephen Curry'
>>> born = 1988
>>> print(f'{name} was born in {born}.')
Stephen Curry was born in 1988.

It is even possible to do inline arithmetic with it:

>>> a = 5
>>> b = 10
>>> f'Five plus ten is {a + b} and not {2 * (a + b)}.'
'Five plus ten is 15 and not 30.'

Format decimals:

>>> pi = 3.1415926
>>> print(f'pi with two decimal places is {pi:.2f}')
pi with two decimal places is 3.14

Format a number as percentage:

>>> churn_rate = 0.0325
>>> print(f'the churn rate this month is {churn_rate:.3%}')
the churn rate this month is 3.250%
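
The same format specification syntax also covers padding, alignment, and thousands separators. A short sketch with made-up values (item, price, and population are illustrative names, not part of the examples above):

```python
item = 'coffee'
price = 3.5
population = 1234567

# <10 left-aligns in a 10-character field, >8.2f right-aligns with 2 decimals
row = f'{item:<10}|{price:>8.2f}|'
print(row)

# The comma option inserts a thousands separator
big = f'{population:,}'
print(big)

# A leading 0 before the width pads the number with zeros
padded = f'{7:03d}'
print(padded)
```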

Back to Top

% operator

>>> name = 'Pete'
>>> 'Hello %s' % name
'Hello Pete'

We can use the %d format specifier to insert an int value into a string:

>>> num = 5
>>> 'I have %d apples' % num
'I have 5 apples'

Note: For new code, using str.format or f-strings (Python 3.6+) is strongly recommended over the % operator.

Back to Top

String Formatting (str.format)

Python 3 introduced a new way to do string formatting, str.format(), that was also back-ported to Python 2.6. It makes the syntax for string formatting more regular.

>>> name = 'John'
>>> age = 20

>>> "Hello I'm {}, my age is {}".format(name, age)
"Hello I'm John, my age is 20"
>>> "Hello I'm {0}, my age is {1}".format(name, age)
"Hello I'm John, my age is 20"

The official Python 3.x documentation recommends str.format over the % operator:

The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.

Back to Top

Lazy string formatting

Use %s-style formatting only with functions that evaluate their arguments lazily, the most common being the logging functions. The message is then formatted only if the record is actually emitted.

Prefer:

>>> import logging
>>> name = "alice"
>>> logging.debug("User name: %s", name)

Over:

>>> logging.debug("User name: {}".format(name))

Or:

>>> logging.debug("User name: " + name)
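
The difference can be observed with an object that counts how often it is converted to a string. This is only a sketch: the Expensive class and the logger name lazy_demo are made up for illustration.

```python
import logging

class Expensive:
    """Counts how many times it is converted to a string."""
    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return 'expensive value'

logger = logging.getLogger('lazy_demo')
logger.setLevel(logging.INFO)  # DEBUG records are dropped

lazy = Expensive()
logger.debug('Value: %s', lazy)  # never formatted, so __str__ is not called

eager = Expensive()
logger.debug('Value: {}'.format(eager))  # formatted before debug() is even called

print(lazy.calls, eager.calls)
```

With the %s form the conversion cost is paid only when the record is emitted, which matters when debug logging is disabled in production.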

Back to Top

Template Strings

A simpler and less powerful mechanism, but it is recommended when handling format strings generated by users. Due to their reduced complexity template strings are a safer choice.

>>> from string import Template
>>> name = 'Elizabeth'
>>> t = Template('Hey $name!')
>>> t.substitute(name=name)
'Hey Elizabeth!'
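
When a placeholder may be missing, substitute() raises KeyError, while safe_substitute() leaves the placeholder in place. A brief sketch (the template text is illustrative):

```python
from string import Template

t = Template('Hey $name, your order $order_id is ready!')

# substitute() requires every placeholder to be supplied
try:
    t.substitute(name='Elizabeth')
    missing = None
except KeyError as e:
    missing = e.args[0]  # name of the placeholder that was not supplied

# safe_substitute() leaves unknown placeholders untouched
partial = t.safe_substitute(name='Elizabeth')

print(missing)
print(partial)
```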

Back to Top

Exception Handling

Basic exception handling

>>> def spam(divideBy):
...     try:
...         return 42 / divideBy
...     except ZeroDivisionError as e:
...         print('Error: Invalid argument: {}'.format(e))
...
>>> print(spam(2))
21.0
>>> print(spam(12))
3.5
>>> print(spam(0))
Error: Invalid argument: division by zero
None
>>> print(spam(1))
42.0

Back to Top

Final code in exception handling

Code inside the finally section is always executed, no matter if an exception has been raised or not, and even if an exception is not caught.

>>> def spam(divideBy):
...     try:
...         return 42 / divideBy
...     except ZeroDivisionError as e:
...         print('Error: Invalid argument: {}'.format(e))
...     finally:
...         print("-- division finished --")
...
>>> print(spam(2))
-- division finished --
21.0
>>> print(spam(12))
-- division finished --
3.5
>>> print(spam(0))
Error: Invalid argument: division by zero
-- division finished --
None
>>> print(spam(1))
-- division finished --
42.0
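
The last point, that finally runs even when the function does not catch the exception, can be sketched as follows (risky() and the events list are illustrative names):

```python
events = []

def risky():
    try:
        return int('not a number')   # raises ValueError
    except ZeroDivisionError:
        events.append('handled')     # not reached: wrong exception type
    finally:
        events.append('cleanup')     # runs before the error propagates

try:
    risky()
except ValueError:
    events.append('propagated')

print(events)
```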

Back to Top

Regular Expressions

A regular expression is a sequence of characters that specifies a pattern in text. Python has a built-in package called re for working with Regular Expressions - you have to import it before use.

>>> import re

Back to Top

Regex Patterns

Regex Note
a matches the character a
abc matches abc
^abc matches any string that begins with abc
abc$ matches any string that ends with abc
ab|cd matches ab or cd
[abc] matches a, b or c
[^abc] matches any character except a, b, and c
Regex Note
. matches any one character except a newline, e.g., d, 5, &
\d matches any digit, e.g., \d\d\d matches any three-digit number
\D matches any non-digit
\w matches any alphanumeric (Latin letters + Arabic digits) character with underscore _ included
\W matches any non-alphanumeric character
\s matches any whitespace character
\S matches any non-whitespace character
[a-z] matches any one lowercase character from a to z
[A-Z] matches any one uppercase character from A to Z
[0-9] matches any one digit, same as \d above
Regex Note
* matches 0 or more times
+ matches 1 or more times
? matches 0 or 1 time
{m} matches exactly m times
{m,n} matches m to n times
{m,} matches m or more times
{,n} matches up to n times
{n,m}? or *? or +? performs a non-greedy (shortest) match

Back to Top

Regex Functions

re.search(), re.match(), and re.fullmatch() return a re.Match object if a match is found; otherwise they return None. .group() and .span() can be used to get the matched string and its location.

Match with quantifier example:

import re
regex = r'o+'  # try 'o*', 'o+', 'o{3}', 'o{5}', 'o{2,6}', 'o{2,6}?'
m = re.search(regex, 'Helloooo')
print(m)  # return a match object

if m is not None:
    print(m.span(), m.group())  # get the location and matched string

Another example:

>>> phone_num_regex = r'\d\d\d-\d\d\d-\d\d\d\d'
>>> m = re.search(phone_num_regex, 'My number is 415-555-4242.')
>>> print(f'Phone number found: {m.group()}')
Phone number found: 415-555-4242

Back to Top

Grouping with Parentheses

By default, group() returns the text matched by the entire regex pattern, but you can also capture a portion of the pattern using parentheses. The following defines two groups.

>>> phone_num_regex = r'(\d\d\d)-(\d\d\d-\d\d\d\d)'

>>> m = re.search(phone_num_regex, 'My number is 415-555-4242.')

>>> m.group(0)
'415-555-4242'

>>> m.group()
'415-555-4242'

>>> m.group(1)
'415'

>>> m.group(2)
'555-4242'

>>> m.groups()  # all groups
('415', '555-4242')

>>> area_code, main_number = m.groups()

>>> print(area_code)
415

>>> print(main_number)
555-4242
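
Groups can also be given names with the (?P<name>...) syntax, which can be easier to read than numeric indexes. A small sketch reusing the phone number above (the group names area_code and main_number are chosen for illustration):

```python
import re

named_regex = re.compile(r'(?P<area_code>\d{3})-(?P<main_number>\d{3}-\d{4})')
m = named_regex.search('My number is 415-555-4242.')

area = m.group('area_code')   # look up a group by name
parts = m.groupdict()         # all named groups as a dict
print(area, parts)
```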

Back to Top

Matching Multiple Groups with the Pipe

The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.

>>> hero_regex = re.compile(r'Batman|Tina Fey')

>>> mo1 = hero_regex.search('Batman and Tina Fey.')

>>> mo1.group()
'Batman'

>>> mo2 = hero_regex.search('Tina Fey and Batman.')

>>> mo2.group()
'Tina Fey'

You can also use the pipe to match one of several patterns as part of your regex:

>>> bat_regex = re.compile(r'Bat(man|mobile|copter|bat)')

>>> mo = bat_regex.search('Batmobile lost a wheel')

>>> mo.group()
'Batmobile'

>>> mo.group(1)
'mobile'

Back to Top

Optional Matching with the Question Mark

The ? character flags the group that precedes it as an optional part of the pattern.

>>> bat_regex = re.compile(r'Bat(wo)?man')
>>> mo1 = bat_regex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'

>>> mo2 = bat_regex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'

Back to Top

Matching Zero or More with the Star

The * (called the star or asterisk) means “match zero or more”—the group that precedes the star can occur any number of times in the text.

>>> bat_regex = re.compile(r'Bat(wo)*man')
>>> mo1 = bat_regex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'

>>> mo2 = bat_regex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'

>>> mo3 = bat_regex.search('The Adventures of Batwowowowoman')
>>> mo3.group()
'Batwowowowoman'

Back to Top

Matching One or More with the Plus

While * means “match zero or more,” the + (or plus) means “match one or more”. The group preceding a plus must appear at least once. It is not optional:

>>> bat_regex = re.compile(r'Bat(wo)+man')
>>> mo1 = bat_regex.search('The Adventures of Batwoman')
>>> mo1.group()
'Batwoman'
>>> mo2 = bat_regex.search('The Adventures of Batwowowowoman')
>>> mo2.group()
'Batwowowowoman'
>>> mo3 = bat_regex.search('The Adventures of Batman')
>>> mo3 is None
True

Back to Top

Matching Specific Repetitions with Curly Brackets

If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in curly brackets. For example, the regex (Ha){3} will match the string ‘HaHaHa’, but it will not match ‘HaHa’, since the latter has only two repeats of the (Ha) group.

Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the curly brackets. For example, the regex (Ha){3,5} will match ‘HaHaHa’, ‘HaHaHaHa’, and ‘HaHaHaHaHa’.

>>> ha_regex = re.compile(r'(Ha){3}')
>>> mo1 = ha_regex.search('HaHaHa')
>>> mo1.group()
'HaHaHa'
>>> mo2 = ha_regex.search('Ha')
>>> mo2 is None
True
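
The range form described above can be tried out in the same way. A short sketch, using re.fullmatch() so that the whole string has to fit the pattern (the variable names are illustrative):

```python
import re

ha_range_regex = re.compile(r'(Ha){3,5}')

# Only three to five repetitions of 'Ha' fit the pattern
candidates = ('HaHa', 'HaHaHa', 'HaHaHaHa', 'HaHaHaHaHa', 'HaHaHaHaHaHa')
results = {text: ha_range_regex.fullmatch(text) is not None for text in candidates}
print(results)
```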

Back to Top

Greedy and Nongreedy Matching

Python’s regular expressions are greedy by default, which means that in ambiguous situations they will match the longest string possible. The non-greedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.

>>> greedy_ha_regex = re.compile(r'(Ha){3,5}')
>>> mo1 = greedy_ha_regex.search('HaHaHaHaHa')
>>> mo1.group()
'HaHaHaHaHa'
>>> nongreedy_ha_regex = re.compile(r'(Ha){3,5}?')
>>> mo2 = nongreedy_ha_regex.search('HaHaHaHaHa')
>>> mo2.group()
'HaHaHa'

Back to Top

The findall() Method

In addition to the search() method, Regex objects also have a findall() method. While search() will return a Match object of the first matched text in the searched string, the findall() method will return the strings of every match in the searched string.

>>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') # has no groups

>>> phone_num_regex.findall('Cell: 415-555-9999 Work: 212-555-0000')
['415-555-9999', '212-555-0000']

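To summarize what the findall() method returns, remember the following:

  1. When called on a regex with no groups, findall() returns a list of strings, one per match.
  2. When called on a regex that has two or more groups, findall() returns a list of tuples of strings, one string per group. With exactly one group, it returns a list of that group's matches.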
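
A sketch of the grouped case, reusing the phone number search with three groups added, where each match comes back as a tuple:

```python
import re

grouped_regex = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')  # has groups
matches = grouped_regex.findall('Cell: 415-555-9999 Work: 212-555-0000')
print(matches)  # one tuple per match, one string per group
```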

Back to Top

Making Your Own Character Classes

There are times when you want to match a set of characters but the shorthand character classes (\d, \w, \s, and so on) are too broad. You can define your own character class using square brackets. For example, the character class [aeiouAEIOU] will match any vowel, both lowercase and uppercase.

>>> vowel_regex = re.compile(r'[aeiouAEIOU]')

>>> vowel_regex.findall('Robocop eats baby food. BABY FOOD.')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9] will match all lowercase letters, uppercase letters, and numbers.

By placing a caret character (^) just after the character class’s opening bracket, you can make a negative character class. A negative character class will match all the characters that are not in the character class. For example, enter the following into the interactive shell:

>>> consonant_regex = re.compile(r'[^aeiouAEIOU]')

>>> consonant_regex.findall('Robocop eats baby food. BABY FOOD.')
['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ', 'B', 'B', 'Y', ' ', 'F', 'D', '.']

Back to Top

The Caret and Dollar Sign Characters

The r’^Hello’ regular expression string matches strings that begin with ‘Hello’:

>>> begins_with_hello = re.compile(r'^Hello')

>>> begins_with_hello.search('Hello world!')
<re.Match object; span=(0, 5), match='Hello'>

>>> begins_with_hello.search('He said hello.') is None
True

The r'\d$' regular expression string matches strings that end with a numeric character from 0 to 9. Combining both anchors, r'^\d+$' matches strings that consist entirely of one or more digits:

>>> whole_string_is_num = re.compile(r'^\d+$')

>>> whole_string_is_num.search('1234567890')
<re.Match object; span=(0, 10), match='1234567890'>

>>> whole_string_is_num.search('12345xyz67890') is None
True

>>> whole_string_is_num.search('12 34567890') is None
True

Back to Top

The Wildcard Character

The . (or dot) character in a regular expression is called a wildcard and will match any character except for a newline:

>>> at_regex = re.compile(r'.at')

>>> at_regex.findall('The cat in the hat sat on the flat mat.')
['cat', 'hat', 'sat', 'lat', 'mat']

Back to Top

Matching Everything with Dot-Star

>>> name_regex = re.compile(r'First Name: (.*) Last Name: (.*)')

>>> mo = name_regex.search('First Name: Al Last Name: Sweigart')

>>> mo.group(1)
'Al'
>>> mo.group(2)
'Sweigart'

The dot-star uses greedy mode: It will always try to match as much text as possible. To match any and all text in a nongreedy fashion, use the dot, star, and question mark (.*?). The question mark tells Python to match in a nongreedy way:

>>> nongreedy_regex = re.compile(r'<.*?>')
>>> mo = nongreedy_regex.search('<To serve man> for dinner.>')
>>> mo.group()
'<To serve man>'
>>> greedy_regex = re.compile(r'<.*>')
>>> mo = greedy_regex.search('<To serve man> for dinner.>')
>>> mo.group()
'<To serve man> for dinner.>'

Back to Top

Matching Newlines with the Dot Character

The dot-star will match everything except a newline. By passing re.DOTALL as the second argument to re.compile(), you can make the dot character match all characters, including the newline character:

>>> no_newline_regex = re.compile('.*')
>>> no_newline_regex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()
'Serve the public trust.'
>>> newline_regex = re.compile('.*', re.DOTALL)
>>> newline_regex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()
'Serve the public trust.\nProtect the innocent.\nUphold the law.'

Back to Top

Review of Regex Symbols

Symbol Matches
? zero or one of the preceding group.
* zero or more of the preceding group.
+ one or more of the preceding group.
{n} exactly n of the preceding group.
{n,} n or more of the preceding group.
{,m} 0 to m of the preceding group.
{n,m} at least n and at most m of the preceding group.
{n,m}? or *? or +? performs a nongreedy match of the preceding group.
^spam means the string must begin with spam.
spam$ means the string must end with spam.
. any character, except newline characters.
\d, \w, and \s a digit, word, or space character, respectively.
\D, \W, and \S anything except a digit, word, or space, respectively.
[abc] any character between the brackets (such as a, b, or c).
[^abc] any character that isn’t between the brackets.

Back to Top

Case-Insensitive Matching

To make your regex case-insensitive, you can pass re.IGNORECASE or re.I as a second argument to re.compile():

>>> robocop = re.compile(r'robocop', re.I)

>>> robocop.search('Robocop is part man, part machine, all cop.').group()
'Robocop'
>>> robocop.search('ROBOCOP protects the innocent.').group()
'ROBOCOP'
>>> robocop.search('Al, why does your programming book talk about robocop so much?').group()
'robocop'

Back to Top

Substituting Strings with the sub() Method

The sub() method for Regex objects is passed two arguments:

  1. The first argument is a string to replace any matches.
  2. The second is the string to search, in which the matches are replaced.

The sub() method returns a string with the substitutions applied:

>>> names_regex = re.compile(r'Agent \w+')

>>> names_regex.sub('CENSORED', 'Agent Alice gave the secret documents to Agent Bob.')
'CENSORED gave the secret documents to CENSORED.'

Another example:

>>> agent_names_regex = re.compile(r'Agent (\w)\w*')

>>> agent_names_regex.sub(r'\1****', 'Agent Alice told Agent Carol that Agent Eve knew Agent Bob was a double agent.')
'A**** told C**** that E**** knew B**** was a double agent.'
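
The replacement passed to sub() can also be a function that receives each Match object and returns the replacement text. A minimal sketch (the double_number helper and the sample sentence are made up for illustration):

```python
import re

def double_number(match):
    """Return the matched number doubled, as a string."""
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double_number, '3 apples and 12 pears')
print(doubled)
```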

Back to Top

Managing Complex Regexes

To tell the re.compile() function to ignore whitespace and comments inside the regular expression string, “verbose mode” can be enabled by passing the variable re.VERBOSE as the second argument to re.compile().

Now instead of a hard-to-read regular expression like this:

phone_regex = re.compile(r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)')

you can spread the regular expression over multiple lines with comments like this:

phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)
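
A quick way to check that the verbose pattern behaves like the one-line version is to run both on the same text. This sketch repeats the two patterns so that it is self-contained; the sample sentence and the names compact_regex and verbose_regex are made up:

```python
import re

compact_regex = re.compile(r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)')

verbose_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)

text = 'Call 415-555-4242 x99 today.'
compact_match = compact_regex.search(text).group()
verbose_match = verbose_regex.search(text).group()
print(compact_match, verbose_match, sep=' | ')
```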

Back to Top

Handling File and Directory Paths

There are two main modules in Python that deal with path manipulation. One is the os.path module and the other is the pathlib module. The pathlib module was added in Python 3.4, offering an object-oriented way to handle file system paths.

Back to Top

Backslash on Windows and Forward Slash on OS X and Linux

On Windows, paths are written using backslashes (\) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash (/) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms.

Fortunately, Python provides easy ways to handle this. We will showcase how to deal with it using both os.path.join and pathlib.Path.joinpath.

Using os.path.join on Windows:

>>> import os

>>> os.path.join('usr', 'bin', 'spam')
'usr\\bin\\spam'

And using pathlib on *nix:

>>> from pathlib import Path

>>> print(Path('usr').joinpath('bin').joinpath('spam'))
usr/bin/spam

pathlib also provides a shortcut to joinpath using the / operator:

>>> from pathlib import Path

>>> print(Path('usr') / 'bin' / 'spam')
usr/bin/spam

Notice that the path separator differs between Windows and Unix-based operating systems. That is why you should use one of the above methods instead of concatenating strings to join paths.

Joining paths is helpful if you need to create different file paths under the same directory.

Using os.path.join on Windows:

>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']

>>> for filename in my_files:
...     print(os.path.join('C:\\Users\\asweigart', filename))
C:\Users\asweigart\accounts.txt
C:\Users\asweigart\details.csv
C:\Users\asweigart\invite.docx

Using pathlib on *nix:

>>> my_files = ['accounts.txt', 'details.csv', 'invite.docx']
>>> home = Path.home()
>>> for filename in my_files:
...     print(home / filename)
/home/asweigart/accounts.txt
/home/asweigart/details.csv
/home/asweigart/invite.docx

Back to Top

The Current Working Directory

Using os on Windows:

>>> import os

>>> os.getcwd()
'C:\\Python34'
>>> os.chdir('C:\\Windows\\System32')

>>> os.getcwd()
'C:\\Windows\\System32'

Using pathlib on *nix:

>>> from pathlib import Path
>>> from os import chdir

>>> print(Path.cwd())
/home/asweigart

>>> chdir('/usr/lib/python3.6')
>>> print(Path.cwd())
/usr/lib/python3.6

Back to Top

Creating New Folders

Using os on Windows:

>>> import os
>>> os.makedirs('C:\\delicious\\walnut\\waffles')

Using pathlib on *nix:

>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/pathlib.py", line 1226, in mkdir
    self._accessor.mkdir(self, mode)
  File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/home/asweigart/delicious/walnut/waffles'

Oh no, we got a nasty error! The reason is that the ‘delicious’ directory does not exist, so we cannot make the ‘walnut’ and the ‘waffles’ directories under it. To fix this, do:

>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> (cwd / 'delicious' / 'walnut' / 'waffles').mkdir(parents=True)

And all is good :)

Back to Top

Absolute vs. Relative Paths

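There are two ways to specify a file path:

  1. An absolute path, which always begins with the root folder.
  2. A relative path, which is relative to the program's current working directory.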

There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”
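
How these special names collapse can be seen with posixpath.normpath(), the POSIX flavor of os.path.normpath(), which works purely on the path string and so gives the same result on any platform. The example path is illustrative:

```python
import posixpath

# '.' stays in the current folder, '..' steps up to the parent folder
normalized = posixpath.normpath('/home/asweigart/./docs/../pictures')
print(normalized)
```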

Back to Top

Handling Absolute and Relative Paths

To see if a path is an absolute path:

Using os.path on *nix:

>>> import os
>>> os.path.isabs('/')
True
>>> os.path.isabs('..')
False

Using pathlib on *nix:

>>> from pathlib import Path
>>> Path('/').is_absolute()
True
>>> Path('..').is_absolute()
False

You can extract an absolute path with both os.path and pathlib:

Using os.path on *nix:

>>> import os
>>> os.getcwd()
'/home/asweigart'
>>> os.path.abspath('..')
'/home'

Using pathlib on *nix:

>>> from pathlib import Path
>>> print(Path.cwd())
/home/asweigart
>>> print(Path('..').resolve())
/home

You can get a relative path from a starting path to another path.

Using os.path on *nix:

>>> import os
>>> os.path.relpath('/etc/passwd', '/')
'etc/passwd'

Using pathlib on *nix:

>>> from pathlib import Path
>>> print(Path('/etc/passwd').relative_to('/'))
etc/passwd

Back to Top

Checking Path Validity

Checking if a file/directory exists:

Using os.path on *nix:

>>> import os
>>> os.path.exists('.')
True
>>> os.path.exists('setup.py')
True
>>> os.path.exists('/etc')
True
>>> os.path.exists('nonexistentfile')
False

Using pathlib on *nix:

>>> from pathlib import Path
>>> Path('.').exists()
True
>>> Path('setup.py').exists()
True
>>> Path('/etc').exists()
True
>>> Path('nonexistentfile').exists()
False

Checking if a path is a file:

Using os.path on *nix:

>>> import os
>>> os.path.isfile('setup.py')
True
>>> os.path.isfile('/home')
False
>>> os.path.isfile('nonexistentfile')
False

Using pathlib on *nix:

>>> from pathlib import Path
>>> Path('setup.py').is_file()
True
>>> Path('/home').is_file()
False
>>> Path('nonexistentfile').is_file()
False

Checking if a path is a directory:

Using os.path on *nix:

>>> import os
>>> os.path.isdir('/')
True
>>> os.path.isdir('setup.py')
False
>>> os.path.isdir('/spam')
False

Using pathlib on *nix:

>>> from pathlib import Path
>>> Path('/').is_dir()
True
>>> Path('setup.py').is_dir()
False
>>> Path('/spam').is_dir()
False

Back to Top

Finding File Sizes and Folder Contents

Getting a file’s size in bytes:

Using os.path on Windows:

>>> import os
>>> os.path.getsize('C:\\Windows\\System32\\calc.exe')
776192

Using pathlib on *nix:

>>> from pathlib import Path
>>> stat = Path('/bin/python3.6').stat()
>>> print(stat) # stat contains some other information about the file as well
os.stat_result(st_mode=33261, st_ino=141087, st_dev=2051, st_nlink=2, st_uid=0,
--snip--
st_gid=0, st_size=10024, st_atime=1517725562, st_mtime=1515119809, st_ctime=1517261276)
>>> print(stat.st_size) # size in bytes
10024

Listing directory contents using os.listdir on Windows:

>>> import os
>>> os.listdir('C:\\Windows\\System32')
['0409', '12520437.cpx', '12520850.cpx', '5U877.ax', 'aaclient.dll',
--snip--
'xwtpdui.dll', 'xwtpw32.dll', 'zh-CN', 'zh-HK', 'zh-TW', 'zipfldr.dll']

Listing directory contents using pathlib on *nix:

>>> from pathlib import Path
>>> for f in Path('/usr/bin').iterdir():
>>>     print(f)
...
/usr/bin/tiff2rgba
/usr/bin/iconv
/usr/bin/ldd
/usr/bin/cache_restore
/usr/bin/udiskie
/usr/bin/unix2dos
/usr/bin/t1reencode
/usr/bin/epstopdf
/usr/bin/idle3
...

To find the total size of all the files in this directory:

WARNING: Directories themselves also have a size! So you might want to check whether a path is a file or a directory using the methods discussed in the above section!

Using os.path.getsize() and os.listdir() together on Windows:

>>> import os
>>> total_size = 0

>>> for filename in os.listdir('C:\\Windows\\System32'):
...     total_size = total_size + os.path.getsize(os.path.join('C:\\Windows\\System32', filename))

>>> print(total_size)
1117846456

Using pathlib on *nix:

>>> from pathlib import Path
>>> total_size = 0

>>> for sub_path in Path('/usr/bin').iterdir():
...     total_size += sub_path.stat().st_size
...
>>> print(total_size)
1903178911
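
Following the warning above, here is a sketch that skips directories and sums only regular files. It builds a throwaway directory with tempfile so that it can run anywhere; total_file_size is a name chosen for this example:

```python
import tempfile
from pathlib import Path

def total_file_size(directory):
    """Sum the sizes of the regular files directly inside directory."""
    return sum(p.stat().st_size for p in Path(directory).iterdir() if p.is_file())

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'a.txt').write_bytes(b'12345')     # 5 bytes
    (root / 'b.txt').write_bytes(b'1234567')   # 7 bytes
    (root / 'subfolder').mkdir()               # skipped by is_file()
    size = total_file_size(root)

print(size)
```

The function only looks at the top level of the directory; files inside subfolders are not counted.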

Back to Top

Copying Files and Folders

The shutil module provides functions for copying files, as well as entire folders.

>>> import shutil, os

>>> os.chdir('C:\\')

>>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
'C:\\delicious\\spam.txt'

>>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
'C:\\delicious\\eggs2.txt'

While shutil.copy() will copy a single file, shutil.copytree() will copy an entire folder and every folder and file contained in it:

>>> import shutil, os

>>> os.chdir('C:\\')

>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
'C:\\bacon_backup'

Back to Top

Moving and Renaming Files and Folders

>>> import shutil
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs\\bacon.txt'

The destination path can also specify a filename. In the following example, the source file is moved and renamed:

>>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt')
'C:\\eggs\\new_bacon.txt'

If there is no eggs folder, then move() will rename bacon.txt to a file named eggs.

>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs'

Back to Top

Permanently Deleting Files and Folders
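
A minimal sketch of permanent deletion using standard-library calls: Path.unlink() removes a file, Path.rmdir() removes an empty folder, and shutil.rmtree() removes a folder together with everything in it. The file and folder names are made up and are created inside a temporary directory for the demonstration. None of these calls use the trash, so use them with care.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create a file, an empty folder, and a folder with content
    spam = root / 'spam.txt'
    spam.write_text('spam')
    empty_dir = root / 'empty'
    empty_dir.mkdir()
    full_dir = root / 'full'
    full_dir.mkdir()
    (full_dir / 'eggs.txt').write_text('eggs')

    spam.unlink()             # delete a single file
    empty_dir.rmdir()         # delete a folder, which must be empty
    shutil.rmtree(full_dir)   # delete a folder and all of its contents

    remaining = sorted(p.name for p in root.iterdir())

print(remaining)
```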

Back to Top

Safe Deletes with the send2trash Module

You can install this module by running pip install send2trash from a Terminal window.

>>> import send2trash

>>> with open('bacon.txt', 'a') as bacon_file: # creates the file
...     bacon_file.write('Bacon is not a vegetable.')
25

>>> send2trash.send2trash('bacon.txt')

Back to Top

Walking a Directory Tree

>>> import os
>>>
>>> for folder_name, subfolders, filenames in os.walk('C:\\delicious'):
...     print('The current folder is {}'.format(folder_name))
...
...     for subfolder in subfolders:
...         print('SUBFOLDER OF {}: {}'.format(folder_name, subfolder))
...     for filename in filenames:
...         print('FILE INSIDE {}: {}'.format(folder_name, filename))
...
...     print('')
...
The current folder is C:\delicious
SUBFOLDER OF C:\delicious: cats
SUBFOLDER OF C:\delicious: walnut
FILE INSIDE C:\delicious: spam.txt

The current folder is C:\delicious\cats
FILE INSIDE C:\delicious\cats: catnames.txt
FILE INSIDE C:\delicious\cats: zophie.jpg

The current folder is C:\delicious\walnut
SUBFOLDER OF C:\delicious\walnut: waffles

The current folder is C:\delicious\walnut\waffles
FILE INSIDE C:\delicious\walnut\waffles: butter.txt

Back to Top

pathlib provides a lot more functionality than the ones listed above, like getting file name, getting file extension, reading/writing a file without manually opening it, etc. Check out the official documentation if you want to know more!
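
A few of those extras, sketched with a made-up file name. PurePosixPath is used for the name parts so that the result does not depend on the operating system, and a temporary directory is used for the read/write round trip:

```python
import tempfile
from pathlib import Path, PurePosixPath

p = PurePosixPath('/home/asweigart/report.final.txt')
parts = (p.name, p.stem, p.suffix, str(p.parent))
print(parts)

# write_text() and read_text() open and close the file for you
with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / 'note.txt'
    note.write_text('Hello World!')
    content = note.read_text()
print(content)
```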

Reading and Writing Files

The File Reading/Writing Process

To read/write to a file in Python, you will want to use the with statement, which will close the file for you after you are done.

Back to Top

Opening and reading files with the open() function

>>> with open('C:\\Users\\your_home_folder\\hello.txt') as hello_file:
...     hello_content = hello_file.read()
>>> hello_content
'Hello World!'

>>> # Alternatively, you can use the readlines() method to get a list of string values from the file, one string for each line of text:

>>> with open('sonnet29.txt') as sonnet_file:
...     sonnet_file.readlines()
["When, in disgrace with fortune and men's eyes,\n", 'I all alone beweep my
outcast state,\n', 'And trouble deaf heaven with my bootless cries,\n', 'And
look upon myself and curse my fate,']

>>> # You can also iterate through the file line by line:
>>> with open('sonnet29.txt') as sonnet_file:
...     for line in sonnet_file: # note the new line character will be included in the line
...         print(line, end='')

When, in disgrace with fortune and men's eyes,
I all alone beweep my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon myself and curse my fate,

Back to Top

Writing to Files

>>> with open('bacon.txt', 'w') as bacon_file:
...     bacon_file.write('Hello world!\n')
13

>>> with open('bacon.txt', 'a') as bacon_file:
...     bacon_file.write('Bacon is not a vegetable.')
25

>>> with open('bacon.txt') as bacon_file:
...     content = bacon_file.read()

>>> print(content)
Hello world!
Bacon is not a vegetable.

Back to Top

Saving Variables with the shelve Module

To save variables:

>>> import shelve

>>> cats = ['Zophie', 'Pooka', 'Simon']
>>> with shelve.open('mydata') as shelf_file:
...     shelf_file['cats'] = cats

To open and read variables:

>>> with shelve.open('mydata') as shelf_file:
...     print(type(shelf_file))
...     print(shelf_file['cats'])
<class 'shelve.DbfilenameShelf'>
['Zophie', 'Pooka', 'Simon']

Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form.

>>> with shelve.open('mydata') as shelf_file:
...     print(list(shelf_file.keys()))
...     print(list(shelf_file.values()))
['cats']
[['Zophie', 'Pooka', 'Simon']]

Back to Top

Saving Variables with the pprint.pformat() Function

>>> import pprint

>>> cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

>>> pprint.pformat(cats)
"[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]"

>>> with open('myCats.py', 'w') as file_obj:
...     file_obj.write('cats = {}\n'.format(pprint.pformat(cats)))
83

Back to Top

Reading ZIP Files

>>> import zipfile, os

>>> os.chdir('C:\\')    # move to the folder with example.zip
>>> with zipfile.ZipFile('example.zip') as example_zip:
...     print(example_zip.namelist())
...     spam_info = example_zip.getinfo('spam.txt')
...     print(spam_info.file_size)
...     print(spam_info.compress_size)
...     print('Compressed file is %sx smaller!' % (round(spam_info.file_size / spam_info.compress_size, 2)))

['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']
13908
3828
Compressed file is 3.63x smaller!

Back to Top

Extracting from ZIP Files

The extractall() method for ZipFile objects extracts all the files and folders from a ZIP file into the current working directory.

>>> import zipfile, os

>>> os.chdir('C:\\')    # move to the folder with example.zip

>>> with zipfile.ZipFile('example.zip') as example_zip:
...     example_zip.extractall()

The extract() method for ZipFile objects will extract a single file from the ZIP file. Continue the interactive shell example:

>>> with zipfile.ZipFile('example.zip') as example_zip:
...     print(example_zip.extract('spam.txt'))
...     print(example_zip.extract('spam.txt', 'C:\\some\\new\\folders'))
C:\spam.txt
C:\some\new\folders\spam.txt

Back to Top

Creating and Adding to ZIP Files

>>> import zipfile

>>> with zipfile.ZipFile('new.zip', 'w') as new_zip:
...     new_zip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)

This code will create a new ZIP file named new.zip that has the compressed contents of spam.txt.

Back to Top

JSON, YAML and configuration files

JSON

Open a JSON file with:

import json
with open("filename.json", "r") as f:
    content = json.load(f)

Write a JSON file with:

import json

content = {"name": "Joe", "age": 20}
with open("filename.json", "w") as f:
    json.dump(content, f, indent=2)

Back to Top

YAML

Compared to JSON, YAML allows for much better human maintainability and gives you the option to add comments. It is a convenient choice for configuration files that humans will have to edit.

There are two main libraries for working with YAML files: PyYAML and ruamel.yaml.

Install them using pip install in your virtual environment.

The first one is easier to use, but the second one, Ruamel, implements the YAML specification much better and allows you, for example, to modify YAML content without altering comments.

Open a YAML file with:

from ruamel.yaml import YAML

yaml = YAML()
with open("filename.yaml") as f:
    content = yaml.load(f)  # keep the parsed data instead of discarding it

Back to Top

Anyconfig

Anyconfig is a very handy package that completely abstracts away the underlying configuration file format. It can load a Python dictionary from JSON, YAML, TOML, and so on.

Install it with:

pip install anyconfig

Usage:

import anyconfig

conf1 = anyconfig.load("/path/to/foo/conf.d/a.yml")

Back to Top

Debugging

Raising Exceptions

Exceptions are raised with a raise statement. In code, a raise statement consists of the following:

>>> raise Exception('This is the error message.')
Traceback (most recent call last):
  File "<pyshell#191>", line 1, in <module>
    raise Exception('This is the error message.')
Exception: This is the error message.

Often it’s the code that calls the function, not the function itself, that knows how to handle an exception. So you will commonly see a raise statement inside a function and the try and except statements in the code calling the function.

def box_print(symbol, width, height):
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')
    if width <= 2:
        raise Exception('Width must be greater than 2.')
    if height <= 2:
        raise Exception('Height must be greater than 2.')
    print(symbol * width)
    for i in range(height - 2):
        print(symbol + (' ' * (width - 2)) + symbol)
    print(symbol * width)

for sym, w, h in (('*', 4, 4), ('O', 20, 5), ('x', 1, 3), ('ZZ', 3, 3)):
    try:
        box_print(sym, w, h)
    except Exception as err:
        print('An exception happened: ' + str(err))

Back to Top

Getting the Traceback as a String

The traceback is displayed by Python whenever a raised exception goes unhandled. But you can also obtain it as a string by calling traceback.format_exc(). This function is useful if you want the information from an exception's traceback but also want an except statement to gracefully handle the exception. You will need to import Python's traceback module before calling this function.

>>> import traceback

>>> try:
...     raise Exception('This is the error message.')
... except:
...     with open('errorInfo.txt', 'w') as error_file:
...         error_file.write(traceback.format_exc())
...     print('The traceback info was written to errorInfo.txt.')
...
116
The traceback info was written to errorInfo.txt.

The 116 is the return value from the write() method, since 116 characters were written to the file. The traceback text was written to errorInfo.txt.

Traceback (most recent call last):
  File "<pyshell#28>", line 2, in <module>
Exception: This is the error message.

Back to Top

Assertions

An assertion is a sanity check to make sure your code isn’t doing something obviously wrong. These sanity checks are performed by assert statements. If the sanity check fails, then an AssertionError exception is raised. In code, an assert statement consists of the following:

>>> pod_bay_door_status = 'open'

>>> assert pod_bay_door_status == 'open', 'The pod bay doors need to be "open".'

>>> pod_bay_door_status = 'I\'m sorry, Dave. I\'m afraid I can\'t do that.'

>>> assert pod_bay_door_status == 'open', 'The pod bay doors need to be "open".'

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    assert pod_bay_door_status == 'open', 'The pod bay doors need to be "open".'
AssertionError: The pod bay doors need to be "open".

In plain English, an assert statement says, “I assert that this condition holds true, and if not, there is a bug somewhere in the program.” Unlike exceptions, your code should not handle assert statements with try and except; if an assert fails, your program should crash. By failing fast like this, you shorten the time between the original cause of the bug and when you first notice the bug. This will reduce the amount of code you will have to check before finding the code that’s causing the bug.

Disabling Assertions

Assertions can be disabled by passing the -O option when running Python.
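
A small sketch of the effect, assuming the same interpreter is available through sys.executable. It runs a one-line program twice in a subprocess, once normally and once with -O:

```python
import subprocess
import sys

# A tiny program whose assertion always fails.
code = "assert False, 'this check always fails'"

# Normal run: the assert raises AssertionError, so the exit status is non-zero.
normal = subprocess.run([sys.executable, "-c", code], capture_output=True)

# With -O, assert statements are skipped and the program exits cleanly.
optimized = subprocess.run([sys.executable, "-O", "-c", code], capture_output=True)

print(normal.returncode, optimized.returncode)
```

Because -O removes assert statements entirely, they should not be relied on for checks that must always run, such as validating user input.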

Back to Top

Logging

To enable the logging module to display log messages on your screen as your program runs, copy the following to the top of your program (but under the #! python shebang line):

import logging

logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

Say you wrote a function to calculate the factorial of a number. In mathematics, factorial 4 is 1 × 2 × 3 × 4, or 24. Factorial 7 is 1 × 2 × 3 × 4 × 5 × 6 × 7, or 5,040. Open a new file editor window and enter the following code. It has a bug in it, but you will also enter several log messages to help yourself figure out what is going wrong. Save the program as factorialLog.py.

import logging

logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

logging.debug('Start of program')

def factorial(n):
    logging.debug('Start of factorial(%s)' % (n))
    total = 1
    for i in range(n + 1):
        total *= i
        logging.debug('i is ' + str(i) + ', total is ' + str(total))
    logging.debug('End of factorial(%s)' % (n))
    return total

print(factorial(5))
logging.debug('End of program')

Running the program produces the output below. The log shows i starting at 0, which zeroes total; the loop should use range(1, n + 1).

2015-05-23 16:20:12,664 - DEBUG - Start of program
2015-05-23 16:20:12,664 - DEBUG - Start of factorial(5)
2015-05-23 16:20:12,665 - DEBUG - i is 0, total is 0
2015-05-23 16:20:12,668 - DEBUG - i is 1, total is 0
2015-05-23 16:20:12,670 - DEBUG - i is 2, total is 0
2015-05-23 16:20:12,673 - DEBUG - i is 3, total is 0
2015-05-23 16:20:12,675 - DEBUG - i is 4, total is 0
2015-05-23 16:20:12,678 - DEBUG - i is 5, total is 0
2015-05-23 16:20:12,680 - DEBUG - End of factorial(5)
0
2015-05-23 16:20:12,684 - DEBUG - End of program

Back to Top

Logging Levels

Logging levels provide a way to categorize your log messages by importance. There are five logging levels, described in the table below from least to most important. Messages can be logged at each level using a different logging function.

Level Logging Function Description
DEBUG logging.debug() The lowest level. Used for small details. Usually you care about these messages only when diagnosing problems.
INFO logging.info() Used to record information on general events in your program or confirm that things are working at their point in the program.
WARNING logging.warning() Used to indicate a potential problem that doesn’t prevent the program from working but might do so in the future.
ERROR logging.error() Used to record an error that caused the program to fail to do something.
CRITICAL logging.critical() The highest level. Used to indicate a fatal error that has caused or is about to cause the program to stop running entirely.
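
The configured level acts as a threshold: messages below it are dropped. A minimal sketch, using a dedicated logger (the name levels_demo is arbitrary) that writes to an in-memory buffer so the result is easy to inspect:

```python
import io
import logging

# A dedicated logger writing to an in-memory buffer instead of the screen.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))

logger = logging.getLogger('levels_demo')
logger.addHandler(handler)
logger.propagate = False          # keep the demo output out of the root logger
logger.setLevel(logging.WARNING)  # threshold: WARNING and above

logger.debug('small detail')
logger.info('general event')
logger.warning('potential problem')
logger.error('something failed')
logger.critical('fatal error')

logged_lines = buffer.getvalue().splitlines()
print(logged_lines)
```

Only the WARNING, ERROR, and CRITICAL messages reach the buffer; the DEBUG and INFO calls are ignored.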

Back to Top

Disabling Logging

After you’ve debugged your program, you probably don’t want all these log messages cluttering the screen. The logging.disable() function disables these so that you don’t have to go into your program and remove all the logging calls by hand.

>>> import logging

>>> logging.basicConfig(level=logging.INFO, format=' %(asctime)s - %(levelname)s - %(message)s')

>>> logging.critical('Critical error! Critical error!')
2015-05-22 11:10:48,054 - CRITICAL - Critical error! Critical error!

>>> logging.disable(logging.CRITICAL)

>>> logging.critical('Critical error! Critical error!')

>>> logging.error('Error! Error!')
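
logging.disable(level) suppresses every message at that level and below, so a lower level leaves the more severe messages enabled. A sketch under the assumption that you also want to turn logging back on later; the logger name disable_demo and the in-memory buffer are only for the demo:

```python
import io
import logging

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))

logger = logging.getLogger('disable_demo')
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(logging.DEBUG)

logger.error('logged before disable')

logging.disable(logging.ERROR)      # suppress ERROR and everything below it
logger.error('suppressed')
logger.critical('still logged')     # CRITICAL is above the disabled level

logging.disable(logging.NOTSET)     # undo the override
logger.error('logged after re-enabling')

logged_lines = buffer.getvalue().splitlines()
print(logged_lines)
```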

Back to Top

Logging to a File

Instead of displaying the log messages to the screen, you can write them to a text file. The logging.basicConfig() function takes a filename keyword argument, like so:

import logging

logging.basicConfig(filename='myProgramLog.txt', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

Back to Top

Ternary Conditional Operator

Many programming languages have a ternary operator, which defines a conditional expression. The most common usage is to write a terse conditional assignment. In other words, it is a one-line expression that evaluates to the first expression if the condition is true, and to the second expression otherwise.

<expression1> if <condition> else <expression2>

Example:

>>> age = 15

>>> print('kid' if age < 18 else 'adult')
kid

Ternary operators can be chained:

>>> age = 15

>>> print('kid' if age < 13 else 'teenager' if age < 18 else 'adult')
teenager

The code above is equivalent to:

if age < 18:
    if age < 13:
        print('kid')
    else:
        print('teenager')
else:
    print('adult')
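
The same decision can also be written as a flat if/elif/else chain. A short sketch (the function names are made up) that checks both forms agree:

```python
def label_chained(age):
    # Chained conditional expression, read left to right.
    return 'kid' if age < 13 else 'teenager' if age < 18 else 'adult'


def label_statements(age):
    # The same decision written with if/elif/else statements.
    if age < 13:
        return 'kid'
    elif age < 18:
        return 'teenager'
    else:
        return 'adult'


for age in (5, 15, 30):
    print(age, label_chained(age), label_statements(age))
```

Beyond two conditions the statement form is usually easier to read than a long chained expression.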

Back to Top

Unpacking Operator

One or two asterisks can be used as unpacking operators: * unpacks any iterable, and ** unpacks a dictionary.

They are also discussed in the next section on *args and **kwargs.

One asterisk (*) example:

>>> a = ["Tom", "Jerry", "Mike"]  # a list
>>> print(*a)
Tom Jerry Mike 

>>> b = ("Jenny", "Chris", "Monica")  # a tuple
>>> print(*b)
Jenny Chris Monica

>>> c = [[1, 2], [3, 4]]  # list of lists
>>> print(*c)
[1, 2] [3, 4]

>>> d = 'apple'  # a string
>>> print(*d)
a p p l e

>>> e = {'name':'tom', 'age': 25}  # a dictionary
>>> print(*e)
name age

Note that in the last example above, unpacking a dictionary with a single * returns only its keys.

The following example shows how ** unpacks a dictionary and passes its items to a function as keyword arguments:


# here argument names must match dict keys, order does not matter

def print_info(name, age):  
    print(f'The age of {name} is {age}.')

print_info(**e)

The age of tom is 25.
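
The unpacking operators are not limited to function calls. A brief sketch with made-up data, showing them inside list and dictionary displays and on the left-hand side of an assignment:

```python
first = ["Tom", "Jerry"]
second = ("Mike", "Jenny")

# * inside a list display unpacks each iterable in place.
everyone = [*first, *second]

defaults = {'name': 'tom', 'age': 25}
overrides = {'age': 26}

# ** inside a dict display unpacks key/value pairs; later keys win.
merged = {**defaults, **overrides}

# * on the left-hand side of an assignment collects the leftover items into a list.
head, *rest = everyone

print(everyone)
print(merged)
print(head, rest)
```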

Back to Top

args and kwargs

The names args and kwargs are arbitrary; what matters are the * and ** operators. They can mean:

  1. In a function declaration, * means “pack all remaining positional arguments into a tuple named <name>”, while ** is the same for keyword arguments (except it uses a dictionary, not a tuple).

  2. In a function call, * means “unpack tuple or list named <name> to positional arguments at this position”, while ** is the same for keyword arguments.

For example you can make a function that you can use to call any other function, no matter what parameters it has:

def forward(f, *args, **kwargs):
    return f(*args, **kwargs)

Inside forward, args is a tuple (of all positional arguments except the first one, because we specified it - the f), kwargs is a dict. Then we call f and unpack them so they become normal arguments to f.
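
A short usage sketch of forward; describe is a hypothetical target function invented for the example:

```python
def forward(f, *args, **kwargs):
    return f(*args, **kwargs)


def describe(name, age, city='unknown'):
    # A hypothetical target function with positional and keyword parameters.
    return f'{name} is {age} and lives in {city}'


positional = forward(describe, 'Tom', 25)
with_keyword = forward(describe, 'Tom', 25, city='Paris')
builtin_result = forward(max, 3, 7, 5)

print(positional)
print(with_keyword)
print(builtin_result)
```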

You use *args when you have an indefinite number of positional arguments.

>>> def fruits(*args):
...     for fruit in args:
...         print(fruit)
...
>>> fruits("apples", "bananas", "grapes")
apples
bananas
grapes

Similarly, you use **kwargs when you have an indefinite number of keyword arguments.

>>> def fruit(**kwargs):
...     for key, value in kwargs.items():
...         print("{0}: {1}".format(key, value))
...
>>> fruit(name="apple", color="red")
name: apple
color: red

*args and **kwargs can be combined with regular and keyword-only parameters:

>>> def show(arg1, arg2, *args, kwarg1=None, kwarg2=None, **kwargs):
...     print(arg1)
...     print(arg2)
...     print(args)
...     print(kwarg1)
...     print(kwarg2)
...     print(kwargs)
...

>>> data1 = [1,2,3]
>>> data2 = [4,5,6]
>>> data3 = {'a':7,'b':8,'c':9}

>>> show(*data1,*data2, kwarg1="python",kwarg2="cheatsheet",**data3)
1
2
(3, 4, 5, 6)
python
cheatsheet
{'a': 7, 'b': 8, 'c': 9}

>>> show(*data1, *data2, **data3)
1
2
(3, 4, 5, 6)
None
None
{'a': 7, 'b': 8, 'c': 9}

# If you unpack the dictionary with * instead of **, only its keys are passed, as positional arguments
>>> show(*data1, *data2, *data3)
1
2
(3, 4, 5, 6, 'a', 'b', 'c')
None
None
{}

Things to Remember (args)

  1. Functions can accept a variable number of positional arguments by using *args in the def statement.
  2. You can use the items from a sequence as the positional arguments for a function with the * operator.
  3. Using the * operator with a generator may cause your program to run out of memory and crash.
  4. Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.
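
Point 3 can be illustrated with a small sketch (the function names are made up): unpacking a generator with * builds the complete tuple of arguments before the call, so every value is produced and held in memory at once.

```python
def count_up_to(limit):
    # A generator that produces its values lazily, one at a time.
    for number in range(limit):
        yield number


def collect(*args):
    # By the time the body runs, args is already a fully built tuple.
    return args


numbers = count_up_to(5)
collected = collect(*numbers)   # the generator is consumed completely here
leftover = list(numbers)        # nothing is left to iterate over

print(collected)
print(leftover)
```

With five values this is harmless; with a generator that yields millions of items, the same call would try to materialize all of them.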

Things to Remember (kwargs)

  1. Function arguments can be specified by position or by keyword.
  2. Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
  3. Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
  4. Optional keyword arguments should always be passed by keyword instead of by position.
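
One way to enforce point 4 is to make the optional parameter keyword-only with a bare * in the signature. This is a sketch of that idea, not something the list above prescribes; safe_division and its flag are hypothetical names:

```python
def safe_division(number, divisor, *, ignore_zero_division=False):
    # Parameters after the bare * can only be passed by keyword.
    try:
        return number / divisor
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise


by_keyword = safe_division(1, 0, ignore_zero_division=True)

try:
    safe_division(1, 0, True)   # positional use of the optional flag
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(by_keyword, positional_rejected)
```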

Back to Top

Context Manager (with statement)

A context manager is an object that is notified when a context (a block of code) starts and ends. You commonly use one with the with statement.

For example, file objects are context managers. When a context ends, the file object is closed automatically:

>>> with open(filename) as f:
...     file_contents = f.read()
...
>>> # the file object f has been closed automatically

Anything that ends execution of the block causes the context manager's __exit__ method to be called. This includes exceptions, which is useful when an error causes you to exit prematurely from an open file or connection. Exiting a script without properly closing files or connections is a bad idea that may cause data loss or other problems. Using a context manager ensures that this cleanup always happens.
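
A minimal sketch of a hand-written context manager (the Tracker class is invented for illustration). It records each step so you can see that the cleanup runs even when the block raises:

```python
class Tracker:
    """A minimal context manager that records when its block starts and ends."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and on exceptions alike.
        self.events.append('exit')
        return False  # do not swallow the exception


tracker = Tracker()
try:
    with tracker:
        tracker.events.append('inside')
        raise ValueError('something went wrong')
except ValueError:
    tracker.events.append('handled')

print(tracker.events)
```

The 'exit' entry is recorded before 'handled', which shows that the cleanup ran before the exception reached the surrounding except clause.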

__main__ Top-level script environment

__main__ is the name of the scope in which top-level code executes. A module's __name__ is set equal to "__main__" when it is read from standard input, a script, or an interactive prompt.

A module can discover whether or not it is running in the main scope by checking its own __name__, which allows a common idiom for conditionally executing code in a module when it is run as a script or with python -m but not when it is imported:

>>> if __name__ == "__main__":
...     # execute only if run as a script
...     main()

For a package, the same effect can be achieved by including a __main__.py module, the contents of which will be executed when the module is run with -m.

For example, if we are developing a script that is also designed to be used as a module, we can do the following:

>>> # Python program to execute function directly
>>> def add(a, b):
...     return a+b
...
>>> add(10, 20)  # test the function by calling it, then save the code as calculate.py
30
>>> # To import this module elsewhere, we would have to comment out the test call.
>>> # Instead, we can guard the call like this in calculate.py:
>>> if __name__ == "__main__":
...     add(3, 5)
...
>>> import calculate
>>> calculate.add(3, 5)
8

Advantages

  1. Every Python module has its __name__ defined. If it is "__main__", the module is being run standalone by the user, and we can take the appropriate actions.
  2. If you import this script as a module in another script, __name__ is set to the name of the script/module.
  3. Python files can act as either reusable modules or standalone programs.
  4. if __name__ == "__main__": is used to execute some code only if the file was run directly, and not imported.
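
The difference between running and importing can be observed with a sketch like the following. It writes a hypothetical calculate.py to a temporary directory, then runs it once as a script and once through an import, each in a subprocess:

```python
import os
import subprocess
import sys
import tempfile

# Source of a hypothetical calculate.py module with a guarded test call.
module_source = '''
def add(a, b):
    return a + b

if __name__ == "__main__":
    print(add(3, 5))
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, 'calculate.py'), 'w') as f:
        f.write(module_source)

    # Run as a script: __name__ is "__main__", so the guarded block runs.
    as_script = subprocess.run(
        [sys.executable, 'calculate.py'],
        cwd=tmp_dir, capture_output=True, text=True)

    # Import it: __name__ is "calculate", so the guarded block is skipped.
    as_import = subprocess.run(
        [sys.executable, '-c', 'import calculate; print(calculate.__name__)'],
        cwd=tmp_dir, capture_output=True, text=True)

script_output = as_script.stdout.strip()
import_output = as_import.stdout.strip()
print(script_output)
print(import_output)
```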

Back to Top

Virtual Environment

A virtual environment lets you test Python code in an isolated environment and avoids filling the base Python installation with libraries that might be used for only one project.

Back to Top

virtualenv

Python 3.6+ has this built in.

Anything we install now will be specific to this project and available to the projects we connect to this environment.
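
Assuming the built-in tool referred to here is the standard-library venv module, an environment can also be created from Python code. This is only a sketch; the directory name env is arbitrary, and the usual route is the command line (python -m venv env), which also installs pip:

```python
import os
import tempfile
import venv

with tempfile.TemporaryDirectory() as tmp_dir:
    env_dir = os.path.join(tmp_dir, 'env')   # the directory name is arbitrary

    # with_pip=False skips installing pip to keep the sketch fast and offline.
    venv.create(env_dir, with_pip=False)

    # Every environment gets a pyvenv.cfg file at its root.
    has_config = os.path.isfile(os.path.join(env_dir, 'pyvenv.cfg'))

print(has_config)
```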

Back to Top

anaconda

Anaconda is another popular tool for managing Python packages.


Usage:

  1. Make a Virtual Environment with name datascience

    conda create -n datascience
    
  2. To use the Virtual Environment, activate it by:

    conda activate datascience
    

    Anything installed now will be specific to the datascience environment.

  3. Exit the Virtual Environment

    conda deactivate
    

Back to Top
