GitHub - dintellect/Address_Similiarity_Using_FuzzyWuzzy: Address Similiarity applying different types of Fuzzy String Matching in Python

FuzzyWuzzy is a Python library that uses Levenshtein distance to calculate the differences between sequences.
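To make the idea of Levenshtein distance concrete, here is a minimal pure-Python sketch. The function names `levenshtein` and `similarity_percent` are hypothetical helpers written for illustration, and the normalization to a 0-100 score is a simplified assumption, not necessarily the exact formula the library uses internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> int:
    """Illustrative 0-100 score: 100 means identical strings.
    Assumption: normalize by the length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return round(100 * (1 - levenshtein(a, b) / longest))


print(levenshtein("kitten", "sitting"))
print(similarity_percent("12 MG Road", "12 M.G. Road"))
```

The fewer edits two address strings need, the higher the score, which is the intuition behind the matching functions listed in this README.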

To compare two addresses and measure their similarity, this project applies several fuzzy string matching functions.

The following fuzzy string matching functions are used:

1. Fuzz Ratio: The ratio function computes the standard Levenshtein distance similarity ratio between two sequences.

2. Partial Fuzz Ratio: Similar to fuzz ratio, but it compares the shorter string against the best-matching substring of the longer one, so an address fragment contained in a longer address still scores highly. On its own it does not strip stop words, punctuation, or capital letters.

3. Token_Sort_Ratio: Tokenizes the strings, lowercases them and removes punctuation, then sorts the tokens alphabetically before comparing, so word order is ignored.

4. Token_Set_Ratio: Similar to token sort ratio, but a little more flexible: it compares the sets of tokens, so duplicated words are ignored too.

5. W Ratio: A weighted ratio that lowercases and cleans the strings, then combines the scores of the functions above, with weights that depend on the relative lengths of the two strings.

See the Jupyter notebook for the final code.

Please feel free to fork and contribute.

