Hello everyone. In our last Machine Learning tutorial we learnt how to use the Euclidean Distance formula to find the similarity among people. In this tutorial we will learn a new, slightly more advanced way to do the same thing: we will use the Pearson Correlation Score to calculate similarity among people. Its results differ from those of Euclidean Distance in one major way. Even if the gap between the values two persons give to each fruit is large, as long as that gap is consistent, that is, the difference is nearly the same across all fruits, the Pearson Correlation Score will mark the two persons as highly similar or even identical.
For example, in the above data, if we look at ‘John’ and ‘Martha’, the difference between their values is nearly the same for every fruit, so the Pearson Correlation value for them will be around 1.
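To make the contrast concrete, here is a small sketch that compares both measures on two lists of values. The numbers are hypothetical (they are not taken from the data above), and the helper names `euclidean_distance` and `pearson` are chosen only for this illustration. In this made-up case Martha gives every fruit exactly 2 points less than John.

```python
from math import sqrt

# Hypothetical values for four fruits (illustration only):
# Martha's value is always exactly 2 points lower than John's.
john = [5.0, 3.0, 4.0, 4.5]
martha = [3.0, 1.0, 2.0, 2.5]


def euclidean_distance(xs, ys):
    """Straight-line distance between two equally long lists of values."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))


def pearson(xs, ys):
    """Pearson correlation, computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


distance = euclidean_distance(john, martha)
score = pearson(john, martha)

print(distance)  # 4.0, so the two look far apart by distance
print(score)     # 1.0, because the gap is the same for every fruit
```

The distance is large because every value differs by 2, yet the correlation is perfect because the two lists rise and fall together.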
Pearson Correlation Score
We will calculate the Pearson Correlation Score only for those fruits that are common to both persons.
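For reference, the Pearson coefficient is commonly written in the computational form below. Treat it as an assumption about which variant is meant here; the symbols 'n', 'x' and 'y' are the ones described in this section, and each sum runs over the common fruits.

```latex
r = \frac{\sum xy - \dfrac{\sum x \, \sum y}{n}}
         {\sqrt{\left(\sum x^{2} - \dfrac{\left(\sum x\right)^{2}}{n}\right)
                \left(\sum y^{2} - \dfrac{\left(\sum y\right)^{2}}{n}\right)}}
```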
The above formula gives us the Pearson Correlation Coefficient, or Score, where ‘n’ is the sample size (the number of fruits common to both persons), and ‘x’ and ‘y’ are the values the two persons gave to each fruit.
Python code for the above method:
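A minimal sketch of this calculation, assuming the values are stored in a nested dictionary of person, then fruit, then value. The function name `pearson_score`, the sample data, and the choice to return 0 when the score is undefined are all assumptions made for illustration.

```python
from math import sqrt


def pearson_score(ratings, person1, person2):
    """Pearson correlation between two people over the fruits both have rated."""
    # Keep only the fruits that appear for both persons.
    common = [fruit for fruit in ratings[person1] if fruit in ratings[person2]]
    n = len(common)
    if n == 0:
        return 0.0  # assumption: no shared fruits is treated as "no correlation"

    xs = [ratings[person1][fruit] for fruit in common]
    ys = [ratings[person2][fruit] for fruit in common]

    # Sums used by the computational form of the formula.
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x_sq = sum(x * x for x in xs)
    sum_y_sq = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = sum_xy - (sum_x * sum_y) / n
    spread_x = sum_x_sq - sum_x ** 2 / n
    spread_y = sum_y_sq - sum_y ** 2 / n
    if spread_x <= 0 or spread_y <= 0:
        # assumption: if one person gave every fruit the same value the
        # score is undefined, so 0 is returned instead of dividing by zero
        return 0.0
    return numerator / sqrt(spread_x * spread_y)


# Hypothetical sample data (illustration only).
ratings = {
    "John": {"Apple": 4.5, "Banana": 2.5, "Mango": 3.5, "Grapes": 5.0},
    "Martha": {"Apple": 3.5, "Banana": 1.5, "Mango": 2.5, "Orange": 4.0},
}

score = pearson_score(ratings, "John", "Martha")
print(score)  # 1.0: over Apple, Banana and Mango the gap is always 1 point
```

Grapes and Orange are ignored because only one of the two persons has a value for them.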