Dictionaries as a Set of Counters

Suppose we want to count how many times each letter appears in a string. There are several ways you could do it:
1.      Create 26 variables, one for each letter, then write a loop that traverses the string and, for each character, increments the corresponding counter.
2.      Create a list with 26 elements. Then convert each character to a number, use the number as an index into the list, and increment the appropriate counter.
3.      Create a dictionary with characters as keys and counters as the corresponding values. The first time you see a character, you add an item to the dictionary. After that, you increment the value of the existing item.
Before we apply the dictionary, let’s look at the first option: creating 26 variables.
Example – Counting letters in a user input string
userinput = raw_input('Enter a sentence without numbers: ')
line = userinput.lower()
ln = len(line) - 1  # index of the last character
print ln
i = 0
# one counter variable per letter
counta = 0; countb = 0; countc = 0; countd = 0; counte = 0; countf = 0
countg = 0; counth = 0; counti = 0; countj = 0; countk = 0; countl = 0
countm = 0; countn = 0; counto = 0; countp = 0; countq = 0; countr = 0
counts = 0; countt = 0; countu = 0; countv = 0; countw = 0; countx = 0
county = 0; countz = 0
while i <= ln:
    if 'a' in line[i]:
        counta = counta + 1
    elif 'b' in line[i]:
        countb = countb + 1
    elif 'c' in line[i]:
        countc = countc + 1
    elif 'd' in line[i]:
        countd = countd + 1
    elif 'e' in line[i]:
        counte = counte + 1
    elif 'f' in line[i]:
        countf = countf + 1
    elif 'g' in line[i]:
        countg = countg + 1
    elif 'h' in line[i]:
        counth = counth + 1
    elif 'i' in line[i]:
        counti = counti + 1 
    elif 'j' in line[i]:
        countj = countj + 1 
    elif 'k' in line[i]:
        countk = countk + 1
    elif 'l' in line[i]:
        countl = countl + 1
    elif 'm' in line[i]:
        countm = countm + 1
    elif 'n' in line[i]:
        countn = countn + 1
    elif 'o' in line[i]:
        counto = counto + 1
    elif 'p' in line[i]:
        countp = countp + 1
    elif 'q' in line[i]:
        countq = countq + 1
    elif 'r' in line[i]:
        countr = countr + 1
    elif 's' in line[i]:
        counts = counts + 1
    elif 't' in line[i]:
        countt = countt + 1
    elif 'u' in line[i]:
        countu = countu + 1
    elif 'v' in line[i]:
        countv = countv + 1
    elif 'w' in line[i]:
        countw = countw + 1
    elif 'x' in line[i]:
        countx = countx + 1
    elif 'y' in line[i]:
        county = county + 1
    elif 'z' in line[i]:
        countz = countz + 1
    else:
        pass  # not a letter (space, digit, punctuation): nothing to count
    i = i + 1
    
print 'Number of letter a: ' + str(counta)
print 'Number of letter b: ' + str(countb)
print 'Number of letter c: ' + str(countc)
print 'Number of letter d: ' + str(countd)
print 'Number of letter e: ' + str(counte)
print 'Number of letter f: ' + str(countf)
print 'Number of letter g: ' + str(countg)
print 'Number of letter h: ' + str(counth)
print 'Number of letter i: ' + str(counti)
print 'Number of letter j: ' + str(countj)
print 'Number of letter k: ' + str(countk)
print 'Number of letter l: ' + str(countl)
print 'Number of letter m: ' + str(countm)
print 'Number of letter n: ' + str(countn)
print 'Number of letter o: ' + str(counto)
print 'Number of letter p: ' + str(countp)
print 'Number of letter q: ' + str(countq)
print 'Number of letter r: ' + str(countr)
print 'Number of letter s: ' + str(counts)
print 'Number of letter t: ' + str(countt)
print 'Number of letter u: ' + str(countu)
print 'Number of letter v: ' + str(countv)
print 'Number of letter w: ' + str(countw)
print 'Number of letter x: ' + str(countx)
print 'Number of letter y: ' + str(county)
print 'Number of letter z: ' + str(countz)
When we run the code, the result is the following:
Enter a sentence without numbers: Say hello to my little friend
28
Number of letter a: 1
Number of letter b: 0
Number of letter c: 0
Number of letter d: 1
Number of letter e: 3
Number of letter f: 1
Number of letter g: 0
Number of letter h: 1
Number of letter i: 2
Number of letter j: 0
Number of letter k: 0
Number of letter l: 4
Number of letter m: 1
Number of letter n: 1
Number of letter o: 2
Number of letter p: 0
Number of letter q: 0
Number of letter r: 1
Number of letter s: 1
Number of letter t: 3
Number of letter u: 0
Number of letter v: 0
Number of letter w: 0
Number of letter x: 0
Number of letter y: 2
Number of letter z: 0
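The second option from the list, a list with 26 elements, could look roughly like the sketch below. It is only an illustration: the sentence is hard-coded instead of read from the user, the name counts is made up for this example, and the code is written so that it runs under both Python 2 and Python 3 (print with parentheses and a single argument).

```python
line = 'Say hello to my little friend'.lower()

# one counter per letter: index 0 is 'a', index 25 is 'z'
counts = [0] * 26

for c in line:
    # skip spaces, digits, punctuation and anything outside a-z
    if 'a' <= c <= 'z':
        index = ord(c) - ord('a')   # 'a' -> 0, 'b' -> 1, ...
        counts[index] = counts[index] + 1

for index in range(26):
    letter = chr(ord('a') + index)
    print('Number of letter %s: %d' % (letter, counts[index]))
```

This is much shorter than 26 separate variables, but we still reserve 26 slots whether or not a letter appears in the string.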
An implementation is a way of performing a computation; some implementations are better than others. For example, an advantage of the dictionary implementation is that we don’t have to know ahead of time which characters appear in the string, and we only have to make room for the characters that do appear. Note that the loop below counts every character, including the space.
userinput = 'Say hello to my little friend'
line = userinput.lower()
print line  # echo the lowercased input (strip('') would remove nothing)
d = dict()
for c in line:
    if c not in d:
        d[c] = 1
    else:
        d[c] = d[c] + 1
print d
say hello to my little friend
{'a': 1, ' ': 5, 'e': 3, 'd': 1, 'f': 1, 'i': 2, 'h': 1, 'm': 1, 'l': 4, 'o': 2, 'n': 1, 's': 1, 'r': 1, 't': 3, 'y': 2}
We are effectively computing a histogram, which is a statistical term for a set of counters (or frequencies). The for loop traverses the string. Each time through the loop, if the character c is not in the dictionary, we create a new item with key c and the initial value 1. If c is already in the dictionary, we increment d[c].
Dictionaries have a method called get that takes a key and a default value. If the key appears in the dictionary, get returns the corresponding value; otherwise it returns the default value. For example:
>>> print d.get('t',0)
3
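The example above only shows a key that exists. A small sketch of the other case, using a made-up two-item dictionary and written so that it runs under both Python 2 and Python 3:

```python
d = {'t': 3, 'l': 4}      # a small sample histogram, for illustration only

present = d.get('t', 0)   # key exists: returns the stored value
missing = d.get('z', 0)   # key absent: returns the default
print(present)            # 3
print(missing)            # 0
print('z' in d)           # False: get does not add the key
```

Because get only reads from the dictionary, we still have to assign the result back to d[c] when we want to update a counter.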
Using the get method, we can reduce the four lines of the if statement to a single line.
userinput = 'Say hello to my little friend'
line = userinput.lower()
print line  # echo the lowercased input (strip('') would remove nothing)
d = dict()
for c in line:
    d[c] = d.get(c, 0) + 1  # start from 0 the first time c is seen
print d
Using the get method to simplify this counting loop is a very common idiom in Python, and you’ll use it many times.
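As a quick check, the sketch below runs both versions of the loop on the same sentence and compares the results. The names d1 and d2 are only for this illustration, and the code is written so that it runs under both Python 2 and Python 3.

```python
line = 'say hello to my little friend'

# version 1: explicit membership test
d1 = dict()
for c in line:
    if c not in d1:
        d1[c] = 1
    else:
        d1[c] = d1[c] + 1

# version 2: the get idiom
d2 = dict()
for c in line:
    d2[c] = d2.get(c, 0) + 1

print(d1 == d2)   # True: both loops build the same histogram
print(len(d2))    # 15: only characters that appear get a counter
```

The dictionary holds 15 items (14 letters plus the space), so no room is spent on letters that never appear in the sentence.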
