FuzzyWuzzy Python library
You can compare data in Python using different libraries and methods:
1) Regex : Python Methods & Functions
2) Simple compare : Python Methods & Functions
3) Difflib Python Library : FuzzyWuzzy Python library
Today we will discuss the FuzzyWuzzy Python library.
It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
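To make the idea of Levenshtein Distance concrete, here is a minimal pure-Python sketch of the classic dynamic-programming version. The function name levenshtein is only for illustration; this is not FuzzyWuzzy's internal code, just the distance it is built around.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The smaller the distance, the more alike the two strings are. FuzzyWuzzy turns this kind of measurement into a score from 0 to 100.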
Requirements
Python 2.7 or higher
fuzzywuzzy Library
pymysql Library
Installation
Using pip via PyPI
pip install fuzzywuzzy
Usage
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
Fuzzywuzzy Types
- Simple Ratio
- Partial Ratio
- Token Sort Ratio
- Token Set Ratio
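Simple Ratio compares the two strings as they are. As a rough sketch, assuming the pure-Python code path where FuzzyWuzzy relies on difflib's SequenceMatcher, the score can be approximated with the standard library alone. The helper name simple_ratio is hypothetical and not part of the library.

```python
from difflib import SequenceMatcher


def simple_ratio(first, second):
    """Hypothetical helper: similarity of two strings as an integer from 0 to 100."""
    matcher = SequenceMatcher(None, first, second)
    # ratio() is 2 * (matching characters) / (total characters in both strings)
    return int(round(100 * matcher.ratio()))


print(simple_ratio("fuzzy wuzzy", "fuzzy wuzzy"))  # 100
print(simple_ratio("abcd", "abce"))                # 75
```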
Partial Ratio
The partial_ratio() function allows us to perform substring matching. It takes the shorter of the two strings and compares it against the substrings of the longer string that have the same length, then reports the best score.
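The substring matching described above can be sketched with a sliding window. This is an assumption-laden illustration: the name partial_ratio_sketch is hypothetical, and it checks every window of the longer string, while the library itself may pick candidate windows more selectively, so scores can differ from fuzz.partial_ratio() in some cases.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(first, second):
    """Hypothetical helper: best score of the shorter string against same-length windows."""
    shorter, longer = sorted((first, second), key=len)
    if not shorter:
        return 0
    window = len(shorter)
    best = 0.0
    # Slide a window of the shorter string's length across the longer string
    for start in range(len(longer) - window + 1):
        candidate = longer[start:start + window]
        score = SequenceMatcher(None, shorter, candidate).ratio()
        best = max(best, score)
    return int(round(100 * best))


print(partial_ratio_sketch("wuzzy", "fuzzy wuzzy was a bear"))  # 100
```

Because "wuzzy" appears in full inside the longer string, one window matches it exactly and the score is 100.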
Token Sort Ratio
FuzzyWuzzy also has token functions that tokenize the strings, convert uppercase letters to lowercase, and remove punctuation. The token_sort_ratio() function sorts the tokens of each string alphabetically and joins them back together before comparing the results.
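Here is a small sketch of that tokenize, sort, and join step using only the standard library. The names normalize_and_sort and token_sort_ratio_sketch are hypothetical, and the regular expression assumes plain ASCII text, so treat it as an approximation of what the library does.

```python
import re
from difflib import SequenceMatcher


def normalize_and_sort(text):
    """Lowercase, replace punctuation with spaces, then sort the tokens alphabetically."""
    tokens = re.sub(r"[^a-z0-9]+", " ", text.lower()).split()
    return " ".join(sorted(tokens))


def token_sort_ratio_sketch(first, second):
    """Hypothetical helper: compare the two strings after sorting their tokens."""
    a = normalize_and_sort(first)
    b = normalize_and_sort(second)
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


print(normalize_and_sort("Fuzzy Wuzzy was a bear!"))           # a bear fuzzy was wuzzy
print(token_sort_ratio_sketch("fuzzy wuzzy", "Wuzzy, fuzzy"))  # 100
```

Word order, capital letters, and punctuation no longer affect the score, which is why the two differently ordered strings score 100.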
Token Set Ratio
The token_set_ratio() function is similar to the token_sort_ratio() function above, except that it separates the tokens common to both strings from the remaining tokens before calculating the fuzz.ratio() between the new strings. This function is most helpful when the strings being compared differ significantly in length.
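One way to picture the token set approach is the sketch below. It is my reading of the technique, not the library's code: the names tokens, ratio, and token_set_ratio_sketch are hypothetical, the regular expression assumes ASCII text, and the exact comparisons FuzzyWuzzy performs may differ.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase, drop punctuation, and return the set of unique tokens."""
    return set(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def ratio(a, b):
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_ratio_sketch(first, second):
    """Hypothetical helper: compare common tokens against common plus leftover tokens."""
    t1, t2 = tokens(first), tokens(second)
    common = " ".join(sorted(t1 & t2))
    # Rebuild each string as "common tokens" followed by "tokens only in that string"
    combined_1 = (common + " " + " ".join(sorted(t1 - t2))).strip()
    combined_2 = (common + " " + " ".join(sorted(t2 - t1))).strip()
    return max(
        ratio(common, combined_1),
        ratio(common, combined_2),
        ratio(combined_1, combined_2),
    )


print(token_set_ratio_sketch("fuzzy wuzzy was a bear", "fuzzy bear"))  # 100
```

Every token of the short string also appears in the long one, so one of the comparisons is between two identical strings and the score is 100 even though the lengths differ a lot.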
Example
import fuzzywuzzy
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

# Simple Ratio Start
strFirst = 'This is a fuzzywuzzy Example by Shardul !'
strSecond = 'This is a fuzzywuzzy Example by Shardul.'
# Compare the two strings character by character
ratio = fuzz.ratio(strFirst, strSecond)
print('String Compare Percentage using Ratio is : ' + str(ratio))
# Output : "String Compare Percentage using Ratio is : 96"
# (39 matching characters out of 81 in total: 2 * 39 / 81 rounds to 96)
# Simple Ratio End

# Partial Ratio Start
strFirst = 'This is a fuzzywuzzy Example by Shardul !'
strSecond = 'This is a fuzzywuzzy Example by Shardul.'
# Compare the shorter string against same-length pieces of the longer one
ratio = fuzz.partial_ratio(strFirst, strSecond)
print('String Compare Percentage using Partial Ratio is : ' + str(ratio))
# The score is high but stays below 100, because the final "." in strSecond never appears in strFirst
# Partial Ratio End

# Token Sort Ratio Start
strFirst = 'This is a fuzzy wuzzy Example by Shardul !'
strSecond = 'This is a wuzzy fuzzy Example by Shardul.'
# Word order and punctuation are ignored, so the swapped words still match
ratio = fuzz.token_sort_ratio(strFirst, strSecond)
print('String Compare Percentage using Token Sort Ratio is : ' + str(ratio))
# Output : "String Compare Percentage using Token Sort Ratio is : 100"
# Token Sort Ratio End

# Token Set Ratio Start
strFirst = 'This is a fuzzy wuzzy Example by Shardul !'
strSecond = 'This is a fuzzy Example by Shardul.'
# Every word of strSecond also appears in strFirst, so the extra word "wuzzy" does not lower the score
ratio = fuzz.token_set_ratio(strFirst, strSecond)
print('String Compare Percentage using Token Set Ratio is : ' + str(ratio))
# Output : "String Compare Percentage using Token Set Ratio is : 100"
# Token Set Ratio End
Reference
https://towardsdatascience.com/string-matching-with-fuzzywuzzy-e982c61f8a84
https://www.geeksforgeeks.org/fuzzywuzzy-python-library/
https://pypi.org/project/fuzzywuzzy/
Assignment For you