Week 8: new concepts

try and except statements

try and except form a construct that lets you attempt an operation and keep going if it fails (raises an error).

The most common use of try and except for our kind of work is when you are looping through a dataset and trying to grab, test for, and/or perform an operation on each item in the dataset, but you don't know whether what you're trying to do will work for every item.

For example, take the list below.

my_list = [1,15,42,"panda", 432, 7]

Let's say that you wanted to divide everything in this list by itself, using a for loop:

for item in my_list:
    print(item/item)
1.0
1.0
1.0

TypeError                                 Traceback (most recent call last)
<ipython-input-13-9487e8ced1f3> in <module>()
      1 for item in my_list:
----> 2     print(item/item)

TypeError: unsupported operand type(s) for /: 'str' and 'str'

One way to avoid this error is to put a type check in your code before you print, and only print the result if the item is an int. But another way to do this, if you want to ignore any list items that aren't numbers and still print out all the numbers in the list divided by themselves, is to use try and except.

for item in my_list:
    try:
        print(item/item)
    except TypeError:
        continue
1.0
1.0
1.0
1.0
1.0
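
For comparison, the type-check approach mentioned above could look something like the sketch below. It uses isinstance() rather than comparing the type directly, so floats are accepted as well as ints. The list is redefined so the snippet runs on its own, and the results list is an addition for illustration only.

```python
# Sketch of the type-check approach: only divide items that are numbers.
my_list = [1, 15, 42, "panda", 432, 7]

results = []  # illustrative: collect the results as well as printing them
for item in my_list:
    # isinstance() accepts a tuple of types; strings like "panda" are skipped
    if isinstance(item, (int, float)):
        results.append(item / item)
        print(item / item)
```

Both approaches print the same five results here. The type check says up front which items you expect to work, while try and except lets Python find out for you.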

Note that if you specify the type of error, Python will only ignore ("suppress") those kinds of errors. That means that if your code encounters a different type of error, it will still break.

my_list.insert(4, 0)
# my_list = [1,15,42,"panda", 0, 432, 7]

for item in my_list:
    try:
        print(item/item)
    except TypeError:
        continue
1.0
1.0
1.0

ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-15-3ebc1f4bebdc> in <module>()
      4 for item in my_list:
      5     try:
----> 6         print(item/item)
      7     except TypeError:
      8         continue

ZeroDivisionError: division by zero

To handle this, you have two options.

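  1. If you know all of the kinds of errors you are likely to encounter, you can put them all in the except statement, using the syntax shown below.
  2. If you want to ignore ALL kinds of errors, you can just use a bare except. Be careful with this one: a bare except also catches things like KeyboardInterrupt, so it can hide real bugs and make a loop hard to stop. Writing except Exception is usually a safer way to catch "everything".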
#solution 1
for item in my_list:
    try:
        print(item/item)
    except (TypeError, ZeroDivisionError) as e:
        continue
        
#solution 2
for item in my_list:
    try:
        print(item/item)
    except:
        continue
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
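
If you would rather see what was skipped than ignore it silently, the "as e" part of solution 1 gives you access to the error itself. Here is a minimal sketch, assuming the same list as above. The skipped list is an illustrative addition, not something the examples above rely on.

```python
my_list = [1, 15, 42, "panda", 0, 432, 7]

skipped = []  # illustrative: remember which items failed and why
for item in my_list:
    try:
        print(item / item)
    except (TypeError, ZeroDivisionError) as e:
        # e is the exception object; type(e).__name__ is the name of its class
        print("skipping", repr(item), "because of", type(e).__name__)
        skipped.append((item, type(e).__name__))
```

After the loop, skipped holds one entry for "panda" and one for 0, which makes it easy to check afterwards how much of your dataset was ignored.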

sleep() function

As you've seen by now, code runs very fast! If you want your code to 'take a break' and pause for a while at any point during the execution of your script, use the sleep() function from the time module.

import time
pythonistas = ["John", "Terry J.", "Terry G.", "Michael", "Eric", "Graham"]

for p in pythonistas:
    #print one name, then wait two seconds before starting the next loop
    print(p)
    time.sleep(2)
John
Terry J.
Terry G.
Michael
Eric
Graham

sleep() is useful for many things. You'll probably use it most in one of two scenarios:

  1. you are using an API that imposes a rate limit on you, meaning that you can only make a certain number of API calls per second.
  2. you want to print output to the terminal more slowly, so that you can monitor what your code is doing while it runs.
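
For the rate-limit scenario, one simple pattern is to pause after every call. The sketch below is not tied to any real API: fetch_page is a hypothetical stand-in for whatever request you are making, and the pause is kept very short so the example finishes quickly. For a real API, you would choose the pause based on its documented limit.

```python
import time

def fetch_page(title):
    # hypothetical stand-in for a real API call (e.g. requests.get(...))
    return "data for " + title

titles = ["Panama_Papers", "Python", "Panda"]
pause_seconds = 0.05  # assumption: set this from the API's documented rate limit

results = []
start = time.monotonic()
for t in titles:
    results.append(fetch_page(t))
    time.sleep(pause_seconds)  # wait before making the next call
elapsed = time.monotonic() - start

print(results)
print("took about", round(elapsed, 2), "seconds")
```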

Parsing datetime objects

We'll import the libraries we want to use to parse our datetime objects.

The first is called dateutil, and you can read more about it here: http://dateutil.readthedocs.io/en/stable/

The second is just called datetime, and you can read more about it here: https://pymotw.com/2/datetime/

from datetime import datetime
from dateutil import parser
import requests

The dateutil and datetime modules both do a lot of useful stuff. But in this case, we don't need all of dateutil, just the part called parser, and we only need the datetime class from the datetime module.

api_call = requests.get("https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&list=&meta=&titles=Panama_Papers&rvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser&rvlimit=10&rvdir=newer")
response = api_call.json()
# '50034356' is the page ID that the API uses as the key for the Panama_Papers page
for r in response['query']['pages']['50034356']['revisions']:
    print(r['timestamp'])
    print(type(r['timestamp']))
    parsed_timestamp = parser.parse(r['timestamp'])
    print(parsed_timestamp)
    print(type(parsed_timestamp))
    print("\n")
2016-04-03T17:59:05Z
<class 'str'>
2016-04-03 17:59:05+00:00
<class 'datetime.datetime'>


2016-04-03T18:00:17Z
<class 'str'>
2016-04-03 18:00:17+00:00
<class 'datetime.datetime'>


2016-04-03T18:08:04Z
<class 'str'>
2016-04-03 18:08:04+00:00
<class 'datetime.datetime'>


2016-04-03T18:10:42Z
<class 'str'>
2016-04-03 18:10:42+00:00
<class 'datetime.datetime'>


2016-04-03T18:13:48Z
<class 'str'>
2016-04-03 18:13:48+00:00
<class 'datetime.datetime'>


2016-04-03T18:15:09Z
<class 'str'>
2016-04-03 18:15:09+00:00
<class 'datetime.datetime'>


2016-04-03T18:20:07Z
<class 'str'>
2016-04-03 18:20:07+00:00
<class 'datetime.datetime'>


2016-04-03T18:31:40Z
<class 'str'>
2016-04-03 18:31:40+00:00
<class 'datetime.datetime'>


2016-04-03T18:35:18Z
<class 'str'>
2016-04-03 18:35:18+00:00
<class 'datetime.datetime'>


2016-04-03T18:37:06Z
<class 'str'>
2016-04-03 18:37:06+00:00
<class 'datetime.datetime'>


When you print the parsed timestamps individually, they look pretty much the same as the strings, but you can see that they're a different type of object. When you put them in a list and then print the list, you can see their true nature.

datetime_list = []

for r in response['query']['pages']['50034356']['revisions']:
    parsed_timestamp = parser.parse(r['timestamp'])
    datetime_list.append(parsed_timestamp)

print(datetime_list)
[datetime.datetime(2016, 4, 3, 17, 59, 5, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 0, 17, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 8, 4, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 10, 42, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 13, 48, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 15, 9, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 20, 7, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 31, 40, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 35, 18, tzinfo=tzlocal()), datetime.datetime(2016, 4, 3, 18, 37, 6, tzinfo=tzlocal())]

Getting individual date/time parts from datetime objects

parser identifies and parses strings that look like dates and/or times, and transforms them into datetime objects that are easier to work with. It's pretty good at automatically detecting which parts of the string are the month, day, second, etc. Once you have a datetime object, each part is available as an attribute.

for dt in datetime_list:
    print(dt.year)
    print(dt.minute)
2016
59
2016
0
2016
8
2016
10
2016
13
2016
15
2016
20
2016
31
2016
35
2016
37
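
year and minute are only two of the available attributes. The sketch below builds a single datetime from the first timestamp in the output above, so that it runs without an API call, and shows a few others. It uses strptime from the standard library instead of dateutil, purely to keep the example self-contained.

```python
from datetime import datetime, timezone

# parse with an explicit format, then mark the result as UTC
dt = datetime.strptime("2016-04-03T17:59:05Z", "%Y-%m-%dT%H:%M:%SZ")
dt = dt.replace(tzinfo=timezone.utc)

print(dt.month)      # 4
print(dt.day)        # 3
print(dt.hour)       # 17
print(dt.second)     # 5
print(dt.weekday())  # 6 (Monday is 0, so 6 means Sunday)
print(dt.date())     # 2016-04-03
```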

Sorting datetime objects

You can sort datetime objects easily, too.

from random import shuffle

print("original list (default order)")
for d in datetime_list:
    print(d)
print("\n")

shuffle(datetime_list)
print("unsorted list")
for d in datetime_list:
    print(d)
print("\n")

datetime_list.sort(reverse=True)
print("reverse-chronological sorted list")
for d in datetime_list:
    print(d)
print("\n")

datetime_list.sort()
print("chronological sorted list (back to default)")
for d in datetime_list:
    print(d)
print("\n")
original list (default order)
2016-04-03 17:59:05+00:00
2016-04-03 18:00:17+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:37:06+00:00


unsorted list
2016-04-03 18:37:06+00:00
2016-04-03 18:35:18+00:00
2016-04-03 17:59:05+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:00:17+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:20:07+00:00


reverse-chronological sorted list
2016-04-03 18:37:06+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:00:17+00:00
2016-04-03 17:59:05+00:00


chronological sorted list (back to default)
2016-04-03 17:59:05+00:00
2016-04-03 18:00:17+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:37:06+00:00
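
Note that list.sort() changes the list in place. If you want to keep the original order around, sorted() returns a new list instead, and its key argument lets you sort the raw timestamp strings by their parsed value. Here is a small sketch, using three timestamps copied from the output above and a hypothetical to_datetime helper built on the standard library:

```python
from datetime import datetime

def to_datetime(timestamp):
    # hypothetical helper: parse the "2016-04-03T18:00:17Z" style used above
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")

timestamps = ["2016-04-03T18:37:06Z", "2016-04-03T17:59:05Z", "2016-04-03T18:15:09Z"]

# sorted() leaves `timestamps` untouched and returns a new, ordered list
oldest_first = sorted(timestamps, key=to_datetime)
newest_first = sorted(timestamps, key=to_datetime, reverse=True)

print(oldest_first)
print(newest_first)
```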


Converting datetime objects back to strings

You can use the strftime method that every datetime object has to convert it (part or all of it) into a string, formatted in whatever way you choose. See the docs for more information on how to print specific date/time values: https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior

fmt = "%m/%d/%Y %H:%M:%S"
# compare the list items in pairs: (0, 1), (2, 3), ...
# stopping at len - 1 avoids an IndexError if the list has an odd length
for i in range(0, len(datetime_list) - 1, 2):
    first, second = datetime_list[i], datetime_list[i + 1]
    if first > second:
        print(first.strftime(fmt) + " is later than " + second.strftime(fmt))
    else:
        print(first.strftime(fmt) + " is earlier than " + second.strftime(fmt))
04/03/2016 18:15:09 is earlier than 04/03/2016 18:37:06
04/03/2016 18:13:48 is earlier than 04/03/2016 18:31:40
04/03/2016 17:59:05 is earlier than 04/03/2016 18:20:07
04/03/2016 18:35:18 is later than 04/03/2016 18:08:04
04/03/2016 18:00:17 is earlier than 04/03/2016 18:10:42
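
The format string controls which parts of the datetime end up in the output and in what order. Here are a few more variations as a sketch, using a datetime built by hand from the first timestamp above (the variable names are just for illustration):

```python
from datetime import datetime

dt = datetime(2016, 4, 3, 17, 59, 5)

date_only = dt.strftime("%Y-%m-%d")   # year-month-day
time_only = dt.strftime("%H:%M")      # hours and minutes, 24-hour clock
day_first = dt.strftime("%d/%m/%y")   # day/month/two-digit year

print(date_only)  # 2016-04-03
print(time_only)  # 17:59
print(day_first)  # 03/04/16
```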

Writing your own functions

Writing your own functions can make your code easier to read, easier to modify, and (sometimes) shorter! Let's take the sorting example above and rewrite it with a function.

from random import shuffle

def printList(ordered_list, description):
    # print the description, then each item on its own line, then a blank gap
    print(description)
    for item in ordered_list:
        print(item)
    print("\n")

printList(datetime_list,"original list (default order)")

shuffle(datetime_list)
printList(datetime_list,"unsorted list")

datetime_list.sort(reverse=True)
printList(datetime_list,"reverse-chronological sorted list")

datetime_list.sort()
printList(datetime_list,"chronological sorted list (back to default)")
original list (default order)
2016-04-03 17:59:05+00:00
2016-04-03 18:00:17+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:37:06+00:00


unsorted list
2016-04-03 18:37:06+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:15:09+00:00
2016-04-03 17:59:05+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:00:17+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:08:04+00:00


reverse-chronological sorted list
2016-04-03 18:37:06+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:00:17+00:00
2016-04-03 17:59:05+00:00


chronological sorted list (back to default)
2016-04-03 17:59:05+00:00
2016-04-03 18:00:17+00:00
2016-04-03 18:08:04+00:00
2016-04-03 18:10:42+00:00
2016-04-03 18:13:48+00:00
2016-04-03 18:15:09+00:00
2016-04-03 18:20:07+00:00
2016-04-03 18:31:40+00:00
2016-04-03 18:35:18+00:00
2016-04-03 18:37:06+00:00
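
Functions can also hand a value back with return instead of printing it. As a sketch, the earlier/later comparison from the strftime section could be wrapped up like this. describe_order and its parameter names are made up for illustration.

```python
from datetime import datetime

def describe_order(first, second, fmt="%m/%d/%Y %H:%M:%S"):
    # build and return a sentence comparing two datetime objects
    if first > second:
        relation = " is later than "
    else:
        relation = " is earlier than "
    return first.strftime(fmt) + relation + second.strftime(fmt)

a = datetime(2016, 4, 3, 18, 15, 9)
b = datetime(2016, 4, 3, 18, 37, 6)

print(describe_order(a, b))
print(describe_order(b, a))
```

Because the function returns the string, the caller decides what to do with it: print it, write it to a file, or add it to a list.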