<> Code

Everyone seems to be writing Chinese word frequency counters. I have been using Python for years and I am still writing the English version. Oh well. Here is the code.
text = """ British newspapers are much smaller than they used to be and their
readers are often in a hurry , so newspapermen write as few words as possible .
They tell their readers at once what happened , where , when and how it
happened and what was the result : how many people were killed , what change
was done and so on . Readers want the fact set out as fully and accurately as
possible . Readers are also interested in the people who have seen the accident
. So a newspaperman always likes to get some information from someone who was
there , which can be given in the person’s own words . Because he can use only
a few words , the newspaperman must choose those words carefully , every one
must be effective . Instead of “ he called out in a loud voice ” , he writes ”
he shouted ” ; instead of “the loose stones rolled noisily down the side of the
mountain ” , he will write ” they thundered down the mountainside ” . Because
many of the readers are not very clever, and most of them are in a hurry. """
def getTxt(txt):  # Preprocess the text
    txt = txt.lower()  # Convert all words to lowercase
    for ch in ",.:!@#$%^'”“;’":  # Replace every non-word symbol with a space
        txt = txt.replace(ch, ' ')
    return txt

txtArr = getTxt(text).split()
counts = {}
for word in txtArr:
    counts[word] = counts.get(word, 0) + 1
countsList = list(counts.items())
countsList.sort(key=lambda x: x[1], reverse=True)
for i in range(20):
    word, count = countsList[i]
    print('{0:<10}{1:>10}'.format(word, count))
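As a side note, the counting loop can also be sketched with collections.Counter from the standard library. This is an alternative, not the code used in this post: the helper name top_words, the regular expression used to pick out words, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def top_words(text, n=20):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Assumption: a "word" is a run of ASCII letters, optionally with an
    # inner apostrophe (straight or curly), e.g. "person's".
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    # Counter tallies the words; most_common(n) orders them by count, descending.
    return Counter(words).most_common(n)


if __name__ == "__main__":
    # Hypothetical sample sentence, not the passage used in the post.
    sample = "Readers want the fact. Readers are in a hurry, so the words are few."
    for word, count in top_words(sample, 3):
        print('{0:<10}{1:>10}'.format(word, count))
```

Extracting words with a regular expression avoids having to list every punctuation mark by hand, at the cost of dropping digits and non-ASCII letters under this particular pattern.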
<> Code Commentary

* I found an English reading passage on Baidu and used it as the text for the word frequency count.
* str.lower() converts all letters to lowercase and returns the result; the original str is unchanged.
* str.replace('a', 'b') replaces every occurrence of a in str with b and returns the result; the original str is unchanged.
* str.split(): called without arguments, split() splits str on any whitespace, including spaces, line breaks (\n), tabs (\t) and so on, and returns the pieces as a list.
* dic.get("a", val) returns the value stored under the key "a" in the dictionary dic; if the dictionary has no such key, it returns val.
* list.sort(key=None, reverse=False)
key – a function that takes a single argument. It is called on each element of the list, and its return value is what gets compared when sorting.
reverse – the sort order: reverse=True sorts in descending order, reverse=False in ascending order (the default).
The code uses a lambda expression. lambda is a keyword that introduces an anonymous function: the parameters come before the colon, and the expression after the colon is the value the function returns. In this expression the parameter is x and the result is x[1]. sort passes each element of the list to the key function in turn. For example, if the list is [('a', 5), ('b', 3)], sort passes ('a', 5) and then ('b', 3) to the lambda, so the parameter x receives each of these tuples in turn.
countsList.sort(key=lambda x: x[1], reverse=True)
# is equivalent to
def takeSecond(elem):
    return elem[1]
countsList.sort(key=takeSecond, reverse=True)
* print became a function in Python 3. In Python 2 you could write print a; in Python 3 it must be print(a).
* In Python 3 you can call help(print). (Note that help(print) does not work in Python 2, because print is not a function there.)
* print('{0:<10}{1:>10}'.format(word, count))
In the first pair of braces, 0 means the placeholder is filled by the first argument of format, word; the < after the colon left-aligns the column, and 10 is the width of the column. In the second pair of braces, 1 means the placeholder is filled by the second argument of format, count; the > after the colon right-aligns the column, and 10 is again the width of the column. As for the unit of that width, if anyone knows, please tell me ...
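The points in the list above can be checked in a Python 3 session. The following is a small sketch; the sample values (s, dic, pairs) are made up for illustration and are not taken from the passage. The last lines also suggest an answer to the question about units: the width in the format spec is a minimum number of characters.

```python
# str methods return new strings; the original string is unchanged.
s = "Hello, World"
lowered = s.lower()              # 'hello, world'
replaced = s.replace("o", "0")   # 'Hell0, W0rld'
assert s == "Hello, World"

# split() with no arguments splits on any run of whitespace.
parts = "a b\nc\td".split()      # ['a', 'b', 'c', 'd']

# dict.get returns the default value when the key is missing.
dic = {"a": 5}
missing = dic.get("b", 0)        # 0

# sort with a key function: order by the second item, descending.
pairs = [("a", 3), ("b", 5)]
pairs.sort(key=lambda x: x[1], reverse=True)   # [('b', 5), ('a', 3)]

# Format spec: '<10' pads to at least 10 characters, left-aligned;
# '>10' pads to at least 10 characters, right-aligned.
line = '{0:<10}{1:>10}'.format("the", 7)
print(repr(line))
print(len(line))                 # 20
```

A value longer than the width is not truncated; the column simply grows, which is why long words can push the counts out of alignment.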
<> Output

* Next up: Chinese word segmentation and word clouds. They look like fun.
