Tuesday, 27 August 2013

How to find the most probable pair of a string from a two-column dataset?

How to find the most probable pair of a string from a two-column dataset?

Given columns A and B, how can I find the most probable item at column B
each of the items at column A? What about something based on nested
hash-maps? I want to do that in Python.
INPUT:
a,abd37534c7d9a2efb9465de931cd7055ffdb8879563ae98078d6d6d5
a,abd37534c7d9a2efb9465de931cd7055ffdb8879563ae98078d6d6d5
a,abd37534c7d9a2efb9465fghfghfghfghfghrewresdasdzfdghhgfhg
a,abd3753dfrtdgfdg563ae98078d6dfgfdgdfghdgasdaSADFBVFDGFD5
b,c681e18b81edaf2b66dd22376734dba5992e362bc3f91ab225854c17
OUTPUT:
a,abd37534c7d9a2efb9465de931cd7055ffdb8879563ae98078d6d6d5
b,c681e18b81edaf2b66dd22376734dba5992e362bc3f91ab225854c17

No comments:

Post a Comment