| id | Name | Phone |
|---|---|---|
| 1 | Neil | 6045551212 |
| 2 | Neil | (null) |
| 3 | (null) | 6045551212 |
| 4 | (null) | (null) |
And you have some new customer records like this (say from another company you bought):
| id | Name | Phone |
|---|---|---|
| 1 | Neil | 6045551212 |
And you want to find possible matches. Now obviously Haystack1 and Needles1 match, but Haystack2 and Haystack3 are possible matches too.
Here's a basic way to find relevant matches, ranked:
select n.id as needle_id, h.id as haystack_id, case when n.name = h.name then 1 else 0 end + case when n.phone = h.phone then 1 else 0 end as relevance from needles n join haystack h on n.name = h.name or n.phone = h.phone order by relevance desc;
This gives us:
| needle_id | haystack_id | relevance |
|---|---|---|
| 1 | 1 | 2 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
No comments:
Post a Comment