Notifications survey analysis

import csv

#Load in the dataset as a list of dictionaries with column headers as keys
responses = []
with open('usage_of_notifications_final_anon_quant.csv', newline='') as infile:
    reader = csv.DictReader(infile)
    for row in reader:
        responses.append(row)

#Convert numeric strings to ints; leave non-numeric values (e.g. language codes, blanks) as strings
for res in responses:
    for res_key, res_val in res.items():
        try:
            res[res_key] = int(res_val)
        except (ValueError, TypeError):
            continue

Part 1. Summary statistics & demographics

How many completed surveys do we have per language?

en_responses = 0
fr_responses = 0

for res in responses:
    if res['Q_Language'] == 'EN':
        en_responses += 1
    elif res['Q_Language'] == 'FR':
        fr_responses += 1

print("There were %d responses in English" % (en_responses))
print("There were %d responses in French" % (fr_responses))
There were 122 responses in English
There were 36 responses in French

Q2. How active are the survey respondents?

act_levels = {1:0,2:0,3:0,5:0}
for res in responses:
    act_levels[res['Q2']] += 1

# print(act_levels)
print("%d made more than 100 edits per month" % (act_levels[5]))
print("%d made 10-100 edits per month" % (act_levels[3]))
print("%d made 2-10 edits per month" % (act_levels[2]))
print("%d made 0-1 edits per month" % (act_levels[1]))
109 made more than 100 edits per month
36 made 10-100 edits per month
8 made 2-10 edits per month
5 made 0-1 edits per month
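
The hard-coded `act_levels` keys raise a `KeyError` if a respondent skipped Q2 or if the export contains an answer code that is not listed. A minimal sketch of a more tolerant tally using `collections.Counter`; the `tally` helper and the sample rows are hypothetical and are not part of the survey data:

```python
from collections import Counter

def tally(rows, column):
    """Count integer answer codes in `column`, skipping blanks and non-integers."""
    return Counter(row[column] for row in rows if isinstance(row.get(column), int))

# Hypothetical rows standing in for `responses` (not real survey data)
sample = [{'Q2': 5}, {'Q2': 5}, {'Q2': 3}, {'Q2': ''}, {}]
counts = tally(sample, 'Q2')
print(counts[5], counts[3], counts[1])
```

A `Counter` returns 0 for codes that never occur, so the print statements do not need every key declared up front.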

Q3. In the past three months, which of the following Wikimedia projects have you edited while logged in to your account?

wikis = {
    'Q3_1': ["wikipedia",0],
    'Q3_2': ["wiktionary",0],
    'Q3_3': ["wikisource",0],
    'Q3_4': ["wikiquote",0],
    'Q3_5': ["wikibooks",0],
    'Q3_6': ["wikinews",0],
    'Q3_7': ["wikiversity",0],
    'Q3_8': ["wikispecies",0],
    'Q3_9': ["wikivoyage",0],
    'Q3_10': ["mediawiki",0],
    'Q3_11': ["metawiki",0],
    'Q3_12': ["wikidata",0],
    'Q3_13': ["wikicommons",0]
}

for res in responses:
    for res_key, res_val in res.items():
        if res_key.startswith("Q3_") and res_val == 1:
            wikis[res_key][1] += 1

# print(wikis)

sorted_wikis = sorted(wikis.items(), key=lambda e: e[1][1], reverse=True)
# print(sorted_wikis)

for w in sorted_wikis:
    print("%d have edited %s in the past 3 months" % (w[1][1], w[1][0]))
155 have edited wikipedia in the past 3 months
98 have edited wikicommons in the past 3 months
58 have edited wikidata in the past 3 months
42 have edited metawiki in the past 3 months
27 have edited mediawiki in the past 3 months
26 have edited wiktionary in the past 3 months
15 have edited wikisource in the past 3 months
8 have edited wikiquote in the past 3 months
8 have edited wikibooks in the past 3 months
6 have edited wikivoyage in the past 3 months
4 have edited wikinews in the past 3 months
3 have edited wikiversity in the past 3 months
2 have edited wikispecies in the past 3 months

Q5. In the past three months, how frequently have you visited at least one Wikimedia project?

visits = {1:0,3:0,5:0,6:0,7:0,8:0,9:0}
for res in responses:
    visits[res['Q5']] += 1

# print(visits)
print("%d logged in multiple times per day" % (visits[1]))
print("%d logged in once per day" % (visits[3]))
print("%d logged in multiple times per week" % (visits[5]))
print("%d logged in once per week" % (visits[6]))
print("%d logged in multiple times per month" % (visits[7]))
print("%d logged in once a month or less" % (visits[8]))
print("%d were not sure how frequently they logged in" % (visits[9]))
120 logged in multiple times per day
14 logged in once per day
1 logged in multiple times per week
12 logged in once per week
3 logged in multiple times per month
2 logged in once a month or less
6 were not sure how frequently they logged in

Q6. In a typical day, how many new (unread) notifications do you usually get on the project you work on most?

n_per_day = {1:0,2:0,3:0,4:0}
for res in responses:
    n_per_day[res['Q6']] += 1

# print(n_per_day)
print("%d received 0-1 new notifications" % (n_per_day[1]))
print("%d received 2-5 new notifications" % (n_per_day[2]))
print("%d received 5-10 new notifications" % (n_per_day[3]))
print("%d received more than 10 new notifications" % (n_per_day[4]))
84 received 0-1 new notifications
50 received 2-5 new notifications
15 received 5-10 new notifications
9 received more than 10 new notifications

Part II. Standard Notifications

Q7. Which of the following types of notification have you seen on Wikimedia projects?

n_seen = {    
'Q7_1': ["user talkpage message",0],
'Q7_2': ["reverted",0],   
'Q7_11': ["thanked",0],   
'Q7_13': ["page reviewed",0],   
'Q7_15': ["mentioned",0],   
'Q7_16': ["email",0],   
'Q7_17': ["user rights",0],   
'Q7_18': ["page linked",0],   
'Q7_20': ["none of these",0],   
}

for res in responses:
    for res_key, res_val in res.items():
        if res_key.startswith("Q7_") and res_val == 1:
            n_seen[res_key][1] += 1

sorted_seen = sorted(n_seen.items(), key=lambda e: e[1][1], reverse=True)

for s in sorted_seen:
    print("%d have seen %s notifications" % (s[1][1], s[1][0]))
146 have seen thanked notifications
146 have seen user talkpage message notifications
134 have seen mentioned notifications
121 have seen reverted notifications
92 have seen page linked notifications
66 have seen page reviewed notifications
45 have seen user rights notifications
42 have seen email notifications
4 have seen none of these notifications

#Build a dict to hold rank counts (1-8) per notification type, for both 'read' and 'act'
n_priorities = {el[1][0]:{'read':{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0}, 
                          'act':{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0}} 
                        for el in n_seen.items()}
del n_priorities['none of these']
# print(n_priorities)

Q8. Please rank the following list of notification types in order of how important it is that you read it right away?

read_q = {    
'Q8_x1': "user talkpage message",
'Q8_x15': "mentioned",
'Q8_x16': "email",
'Q8_x11': "thanked",
'Q8_x2': "reverted",
'Q8_x13': "page reviewed",
'Q8_x18': "page linked",
'Q8_x17': "user rights",
} 
    
for res in responses:
    for res_key, res_val in res.items():
        if res_key.startswith("Q8_") and res_val in range(1,9):
            n_priorities[read_q[res_key]]['read'][res_val] += 1 

for k,v in n_priorities.items():
    n_priorities[k]['read']['perc_no1'] = round(v['read'][1]/sum(v['read'].values())*100,1)
#     print(k + str(n_priorities[k]['read']['perc_no1'])) 

np_sorted_r1 = sorted(n_priorities.items(), key=lambda e: e[1]['read']['perc_no1'], reverse=True)
# print(np_sorted_r1)

for n in np_sorted_r1:
    print("%s%% of respondents ranked '%s' notifications as their first reading priority" % (n[1]['read']['perc_no1'], n[0])) 
44.1% of respondents ranked 'user talkpage message' notifications as their first reading priority
32.6% of respondents ranked 'user rights' notifications as their first reading priority
20.9% of respondents ranked 'reverted' notifications as their first reading priority
17.1% of respondents ranked 'mentioned' notifications as their first reading priority
7.5% of respondents ranked 'email' notifications as their first reading priority
7.3% of respondents ranked 'thanked' notifications as their first reading priority
6.7% of respondents ranked 'page reviewed' notifications as their first reading priority
3.5% of respondents ranked 'page linked' notifications as their first reading priority
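
These percentages sum to more than 100 because each notification type has its own denominator: the number of respondents who gave that type any rank, not the total number of respondents. A small sketch that reports the denominator alongside the share; the `share_ranked_first` helper and the counts are illustrative assumptions, not survey results:

```python
def share_ranked_first(rank_counts):
    """Return (percent ranked first, number of respondents who ranked the item)."""
    # Only integer keys are ranks; this skips derived entries such as 'perc_no1'
    ranked = sum(n for rank, n in rank_counts.items() if isinstance(rank, int))
    if ranked == 0:
        return 0.0, 0
    return round(rank_counts.get(1, 0) / ranked * 100, 1), ranked

# Made-up counts, not survey results
example = {1: 3, 2: 5, 3: 2, 'perc_no1': 30.0}
pct, n = share_ranked_first(example)
print("%s%% ranked it first (n=%d)" % (pct, n))
```

Filtering on integer keys also keeps the result stable if the cell is run a second time, after `'perc_no1'` has already been stored in the same dict.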

Q9. Please rank the following notification types in order of how important it is that you act on it right away?

act_q = {    
'Q9_x1': "user talkpage message",
'Q9_x15': "mentioned",
'Q9_x16': "email",
'Q9_x11': "thanked",
'Q9_x2': "reverted",
'Q9_x13': "page reviewed",
'Q9_x18': "page linked",
'Q9_x17': "user rights",
} 

for res in responses:
    for res_key, res_val in res.items():
        if res_key.startswith("Q9_") and res_val in range(1,9):
            n_priorities[act_q[res_key]]['act'][res_val] += 1

for k,v in n_priorities.items():
    n_priorities[k]['act']['perc_no1'] = round(v['act'][1]/sum(v['act'].values())*100,1)
#     print(k + str(n_priorities[k]['act']['perc_no1'])) 

np_sorted_r1 = sorted(n_priorities.items(), key=lambda e: e[1]['act']['perc_no1'], reverse=True)
# print(np_sorted_r1)

for n in np_sorted_r1:
    print("%s%% of respondents ranked '%s' notifications as their first action priority" % (n[1]['act']['perc_no1'], n[0]))
    
43.1% of respondents ranked 'user talkpage message' notifications as their first action priority
37.2% of respondents ranked 'reverted' notifications as their first action priority
14.6% of respondents ranked 'mentioned' notifications as their first action priority
14.0% of respondents ranked 'user rights' notifications as their first action priority
11.1% of respondents ranked 'email' notifications as their first action priority
5.0% of respondents ranked 'page reviewed' notifications as their first action priority
3.6% of respondents ranked 'page linked' notifications as their first action priority
2.3% of respondents ranked 'thanked' notifications as their first action priority

Part III. Flow notifications

Q10. Have you used Flow on at least one of the Wikimedia projects you visit regularly?

used_flow = {1:0,3:0,4:0,5:0}
en_users = 0
fr_users = 0
for res in responses:
    used_flow[res['Q10']] += 1
    if res['Q_Language'] == 'EN' and res['Q10'] == 1:
        en_users += 1
    elif res['Q_Language'] == 'FR' and res['Q10'] == 1:
        fr_users += 1
# print(used_flow)
print("%d have used Flow" % (used_flow[1]))
print("%d have not used Flow because it is not enabled on projects they visit regularly" % (used_flow[3]))
print("%d have not used Flow even though it is enabled on one or more projects they visit regularly" % (used_flow[4]))
print("%d are not sure whether they have used Flow" % (used_flow[5]))
print("\n")
print("%d Enwiki respondents have used Flow" % (en_users))
print("%d Frwiki respondents have used Flow" % (fr_users))
42 have used Flow
42 have not used Flow because it is not enabled on projects they visit regularly
28 have not used Flow even though it is enabled on one or more projects they visit regularly
46 are not sure whether they have used Flow


27 Enwiki respondents have used Flow
15 Frwiki respondents have used Flow

Q11. Which of the following types of Flow notification have you seen on Wikimedia projects?

f_seen = {    
'Q11_1': ["new topic on talkpage",0],
'Q11_2': ["edited their post",0],   
'Q11_11': ["new comment on topic",0],   
'Q11_12': ["thanked",0],   
'Q11_14': ["mentioned",0],   
'Q11_15': ["renamed topic",0],
'Q11_19': ["none of these",0],     
}
    
#Only count respondents who said they have used Flow (Q10 == 1)
for res in responses:
    if res['Q10'] == 1:
        for res_key, res_val in res.items():
            if res_key.startswith("Q11_") and res_val == 1:
                f_seen[res_key][1] += 1

sorted_seen = sorted(f_seen.items(), key=lambda e: e[1][1], reverse=True)
# print(sorted_wikis)

for s in sorted_seen:
    print("%d have seen %s notifications" % (s[1][1], s[1][0]))
33 have seen new comment on topic notifications
27 have seen new topic on talkpage notifications
18 have seen mentioned notifications
13 have seen thanked notifications
7 have seen renamed topic notifications
5 have seen none of these notifications
4 have seen edited their post notifications

#Build a dict to hold rank counts (1-6) per Flow notification type, for both 'read' and 'act'
f_priorities = {el[1][0]:{'read':{1:0,2:0,3:0,4:0,5:0,6:0}, 
                          'act':{1:0,2:0,3:0,4:0,5:0,6:0}} 
                        for el in f_seen.items()}
del f_priorities['none of these']
# print(f_priorities)

Q12. Please rank the following notification types in order of how important it is that you read it right away?

read_q = {    
'Q12_x1': "new topic on talkpage",
'Q12_x11': "new comment on topic",
'Q12_x15': "renamed topic",
'Q12_x14': "mentioned",
'Q12_x2': "edited their post",
'Q12_x12': "thanked",
} 
    
for res in responses:
    for res_key, res_val in res.items():
        if res_key.startswith("Q12_") and res_val in range(1,7):
            f_priorities[read_q[res_key]]['read'][res_val] += 1

for k,v in f_priorities.items():
    f_priorities[k]['read']['perc_no1'] = round(v['read'][1]/sum(v['read'].values())*100,1)

fp_sorted_r1 = sorted(f_priorities.items(), key=lambda e: e[1]['read']['perc_no1'], reverse=True)
# print(np_sorted_r1)

for n in fp_sorted_r1:
    print("%s%% of respondents ranked '%s' notifications as their first reading priority" % (n[1]['read']['perc_no1'], n[0]))
66.7% of respondents ranked 'edited their post' notifications as their first reading priority
53.3% of respondents ranked 'mentioned' notifications as their first reading priority
37.5% of respondents ranked 'new comment on topic' notifications as their first reading priority
20.0% of respondents ranked 'renamed topic' notifications as their first reading priority
14.3% of respondents ranked 'new topic on talkpage' notifications as their first reading priority
10.0% of respondents ranked 'thanked' notifications as their first reading priority

Q13. Please rank the following list of Flow notifications in order of how important it is that you act on it right away?

act_q = {    
'Q13_x1': "new topic on talkpage",
'Q13_x11': "new comment on topic",
'Q13_x15': "renamed topic",
'Q13_x14': "mentioned",
'Q13_x2': "edited their post",
'Q13_x12': "thanked",
} 

for res in responses:
    for res_key, res_val in res.items():
        if res_key.startswith("Q13_") and res_val in range(1,7):
            f_priorities[act_q[res_key]]['act'][res_val] += 1

for k,v in f_priorities.items():
    f_priorities[k]['act']['perc_no1'] = round(v['act'][1]/sum(v['act'].values())*100,1)

fp_sorted_r1 = sorted(f_priorities.items(), key=lambda e: e[1]['act']['perc_no1'], reverse=True)
# print(np_sorted_r1)

for n in fp_sorted_r1:
    print("%s%% of respondents ranked '%s' notifications as their first action priority" % (n[1]['act']['perc_no1'], n[0]))
  
66.7% of respondents ranked 'edited their post' notifications as their first action priority
60.0% of respondents ranked 'mentioned' notifications as their first action priority
40.0% of respondents ranked 'new comment on topic' notifications as their first action priority
19.0% of respondents ranked 'new topic on talkpage' notifications as their first action priority
10.0% of respondents ranked 'thanked' notifications as their first action priority
0.0% of respondents ranked 'renamed topic' notifications as their first action priority

Q14. To what extent do you agree with the following statement?

"Notifications provides a good overview of relevant activity and help me keep up with what is happening on the Wikimedia projects I visit regularly."

f_agree = {    
1: ["strongly agree",0],
3: ["agree",0],   
4: ["neither agree nor disagree",0],   
5: ["disagree",0],   
7: ["strongly disagree",0],      
}

for res in responses:
    answer = res.get('Q14')
    if answer in f_agree:
        f_agree[answer][1] += 1

sorted_agree = sorted(f_agree.items(), key=lambda e: e[0])
# print(sorted_agree)

for s in sorted_agree:
    print("%d %s that Notifications are useful" % (s[1][1], s[1][0]))
19 strongly agree that Notifications are useful
14 agree that Notifications are useful
0 neither agree nor disagree that Notifications are useful
2 disagree that Notifications are useful
2 strongly disagree that Notifications are useful
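
For a quick summary, the counts printed above can be collapsed into a single share of respondents who agree. A sketch using those printed counts; the variable names are illustrative:

```python
# Counts copied from the printed output above
agree_counts = {
    "strongly agree": 19,
    "agree": 14,
    "neither agree nor disagree": 0,
    "disagree": 2,
    "strongly disagree": 2,
}

total = sum(agree_counts.values())
positive = agree_counts["strongly agree"] + agree_counts["agree"]
print("%d of %d (%.1f%%) agree or strongly agree" % (positive, total, positive / total * 100))
```

Note that the denominator is the number of respondents who answered Q14, which is far smaller than the full set of completed surveys.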

DRAFT BELOW HERE

import operator

# NOTE: `standard_ranks` is not defined earlier in this notebook. This draft cell
# assumes it holds per-notification rank counts (ranks 1-8), with the 'read' counts at index 0.
top_3_rank_counts_read = []
for rank_key, rank_val in standard_ranks[0].items():
    # Count how often each notification type was ranked 1-3, out of all ranks 1-8
    top_3_count = sum(rank_val[i] for i in range(1, 4))
    total_count = sum(rank_val[i] for i in range(1, 9))
    percent_top_3 = int((top_3_count / total_count) * 100)
    top_3_rank_counts_read.append([rank_key, top_3_count, total_count, percent_top_3])
# Sort by percent in the top 3, highest first
sorted_top_3_rank_counts_read = sorted(top_3_rank_counts_read, key=operator.itemgetter(3), reverse=True)

Results - Ranked standard notifications

print("Rank\tRead\t%\t\tRank\tAct\t%")
for i in range(0,8):
    print("%s\t%s\t%s%%\t\t%s\t%s\t%s%%" % (i+1, 
                                            sorted_top_3_rank_counts_read[i][0][5:][:-5],
                                            sorted_top_3_rank_counts_read[i][3],
                                            i+1,                                            
                                            sorted_top_3_rank_counts_act[i][0][5:][:-4],
                                            sorted_top_3_rank_counts_act[i][3]))
Rank	Read	%		Rank	Act	%
1	mention	89%		1	message	90%
2	message	86%		2	mention	88%
3	revert	68%		3	revert	85%
4	rights	48%		4	email	53%
5	thank	30%		5	rights	28%
6	email	28%		6	thank	17%
7	review	22%		7	link	15%
8	link	10%		8	review	12%

Ranked Flow notifications

For Flow notifications, we'll look at how frequently items made it into the top 2, since the data is sparser.
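
The top-2 share can be computed straight from a rank-count dict shaped like the entries of `f_priorities`. A minimal sketch; `top_n_share` and the sample counts are hypothetical, and this is not necessarily how the table below was produced:

```python
def top_n_share(rank_counts, n, max_rank):
    """Percent of rankings for one item that fall in ranks 1..n (truncated to an int)."""
    total = sum(rank_counts.get(r, 0) for r in range(1, max_rank + 1))
    if total == 0:
        return 0
    top = sum(rank_counts.get(r, 0) for r in range(1, n + 1))
    return int(top / total * 100)

# Made-up counts for one Flow notification type (ranks 1-6), not survey results
sample_counts = {1: 2, 2: 1, 3: 1, 4: 0, 5: 0, 6: 0}
print("%d%% in top 2" % top_n_share(sample_counts, 2, 6))
```

With only a handful of rankings per item, a single response can move the share by tens of points, so these percentages are best read together with the underlying counts.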

Results - Ranked Flow notifications

print("Rank\tRead\t%\t\tRank\tAct\t%")
for i in range(0,6):
    print("%s\t%s\t%s%%\t\t%s\t%s\t%s%%" % (i+1, 
                                            sorted_top_3_rank_counts_flow_read[i][0][10:][:-5],
                                            sorted_top_3_rank_counts_flow_read[i][3],
                                            i+1,                                            
                                            sorted_top_3_rank_counts_flow_act[i][0][10:][:-4],
                                            sorted_top_3_rank_counts_flow_act[i][3]))
Rank	Read	%		Rank	Act	%
1	new_comment	100%		1	new_comment	100%
2	edited_post	100%		2	topic_rename	100%
3	topic_rename	100%		3	edited_post	100%
4	mention	100%		4	mention	100%
5	new_topic	75%		5	new_topic	92%
6	thank	60%		6	thank	20%