USA address standardization
- Status: Pending
- Prize: $1000
- Entries received: 4
Contest brief
In short, I want to create a standard format for addresses. I have various data sources, each storing address data in different fields, so I need to be able to read in different address field layouts and convert them to my newly created standard format. While doing this, I want to track which step of the code corrected each record, so that I can fine-tune or change that step.
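One way to track which step corrected a record is to run every correction as a separate function and keep a bitmask of the steps that actually changed the text. The sketch below is only an illustration: the two steps, the names `step_t`, `STEPS` and `run_steps`, and the idea of storing the mask next to the record are assumptions, not part of the attached files.

```c
#include <ctype.h>
#include <stddef.h>

/* A correction step edits addr in place and returns 1 if it changed it. */
typedef int (*step_fn)(char *addr);

typedef struct {
    int id;            /* hypothetical step id to store with the record */
    const char *name;  /* readable name for reports */
    step_fn apply;
} step_t;

/* Step 1: convert lowercase letters to uppercase. */
int step_uppercase(char *addr) {
    int changed = 0;
    for (char *p = addr; *p; p++) {
        if (islower((unsigned char)*p)) {
            *p = (char)toupper((unsigned char)*p);
            changed = 1;
        }
    }
    return changed;
}

/* Step 2: drop leading/trailing spaces and collapse runs of spaces. */
int step_collapse_spaces(char *addr) {
    char *src = addr, *dst = addr;
    int prev_space = 1; /* starts at 1 so leading spaces are dropped */
    for (; *src; src++) {
        if (*src == ' ' && prev_space) continue;
        prev_space = (*src == ' ');
        *dst++ = *src;
    }
    if (dst > addr && dst[-1] == ' ') dst--;
    int changed = (dst != src); /* anything removed means a change */
    *dst = '\0';
    return changed;
}

/* Example pipeline; real steps would be added here in order. */
const step_t STEPS[] = {
    {1, "uppercase", step_uppercase},
    {2, "collapse_spaces", step_collapse_spaces},
};
const size_t STEP_COUNT = sizeof STEPS / sizeof STEPS[0];

/* Runs the steps in order. Bit i of the result is set when steps[i]
   changed the record (so this sketch supports up to 32 steps). */
unsigned run_steps(char *addr, const step_t *steps, size_t n) {
    unsigned mask = 0;
    for (size_t i = 0; i < n; i++) {
        if (steps[i].apply(addr)) mask |= 1u << i;
    }
    return mask;
}
```

Calling `run_steps(addr, STEPS, STEP_COUNT)` on a writable buffer returns 0 when no step touched the record. A non-zero mask could be saved in an integer column beside the address, so that all records changed by a given step can be pulled up when that step is tuned.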
Attached are 4 files:
File 1: “street address variation” contains examples of data entered without strict rules, using general fields rather than a specific field for each data type. It is a sample of many different data records that are in fact the same address but do not match.
File 2: FL LEON SITUS FREELANCER is the control data. This file is broken down into each field of a USA address. We assume this data is correct; although I'm sure it contains errors, it will serve as the basis for creating a standard format.
File 3: FL LEON CERTIFIED DATA, FREE is the file that needs to be compared to the control file. It uses a typical generalized format of 4 address lines for entering a USA address. As File 1 shows, this data can vary quite a bit in how it is entered.
File 4: OUTPUT SUMMARY. Please look at this file; it has 4 sheets: 1 is the desired output of corrections made, 2 is the version table (not that important), 3 is the DB table for new addresses along with fields for manual review, and 4 is the street type (street suffix) abbreviations.
Various data sources collect data in different formats. The task is to take the control data (File 2) and make a general format out of it, then combine address lines 1-4 in File 3 into the same format and compare to find matches. First, develop a basic manipulation of the data that gets the highest rate of direct matches with File 3. Then start testing changes to the data in File 3 to make it match the newly created format of the data in File 2, starting with the simplest changes. For example, the street type value “c ir” should be changed to “CIR”. This is a simple change with limited possibilities.
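The combine-and-compare step could be sketched as building one comparison key from each source and checking the keys for equality. This is a minimal sketch under assumptions: `append_normalized` and `build_key` are hypothetical helper names, and the normalization shown (uppercase, single spaces) is only the most basic manipulation, not the final standard format.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Appends src to out in uppercase, collapsing each run of whitespace to
   one space. Returns the new length and never writes past outsz - 1. */
size_t append_normalized(char *out, size_t len, size_t outsz, const char *src) {
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        if (isspace(c)) {
            /* skip leading whitespace and repeated separators */
            if (len == 0 || out[len - 1] == ' ') continue;
            c = ' ';
        } else {
            c = (unsigned char)toupper(c);
        }
        if (len + 1 >= outsz) break; /* keep room for the terminator */
        out[len++] = (char)c;
    }
    out[len] = '\0';
    return len;
}

/* Joins the parts into one key. The parts can be address lines 1-4 from
   File 3 or the separate control fields from File 2. NULL parts are
   skipped and empty parts add nothing. */
void build_key(const char *parts[], size_t n, char *out, size_t outsz) {
    size_t len = 0;
    if (outsz == 0) return;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (parts[i] == NULL) continue;
        /* one separating space between parts */
        if (len > 0 && out[len - 1] != ' ' && len + 1 < outsz) out[len++] = ' ';
        len = append_normalized(out, len, outsz, parts[i]);
    }
    /* trim a trailing separator left by an empty last part */
    while (len > 0 && out[len - 1] == ' ') out[--len] = '\0';
}
```

Two records count as a direct match when `strcmp` on their keys returns 0. Records that do not match would then go through the later correction steps and be compared again.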
Example of differing street types: “witchtree acrs” and “witchtree acres” are the same.
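Street type variants like these could be handled with a lookup table that maps every known spelling to one standard form. In the sketch below the table entries are placeholders chosen for illustration (the real list would come from sheet 4 of File 4), and `suffix_map_t`, `SUFFIXES` and `normalize_suffix` are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *variant;   /* spelling as it may appear in the data */
    const char *standard;  /* form to store in the standard format */
} suffix_map_t;

/* Placeholder entries only; the standard forms are assumptions. */
const suffix_map_t SUFFIXES[] = {
    {"CIR", "CIR"},    {"CIRCLE", "CIR"},
    {"RD", "RD"},      {"ROAD", "RD"},
    {"ACRS", "ACRES"}, {"ACRES", "ACRES"},
};

/* Removes all whitespace from raw, uppercases it, and looks it up.
   Returns 1 and writes the standard form when the value is known.
   Returns 0 otherwise, leaving the squeezed uppercase text in out so
   the record can be flagged for manual review. */
int normalize_suffix(const char *raw, char *out, size_t outsz) {
    size_t len = 0;
    if (outsz == 0) return 0;
    for (; *raw && len + 1 < outsz; raw++) {
        unsigned char c = (unsigned char)*raw;
        if (!isspace(c)) out[len++] = (char)toupper(c);
    }
    out[len] = '\0';
    for (size_t i = 0; i < sizeof SUFFIXES / sizeof SUFFIXES[0]; i++) {
        if (strcmp(out, SUFFIXES[i].variant) == 0) {
            strncpy(out, SUFFIXES[i].standard, outsz - 1);
            out[outsz - 1] = '\0';
            return 1;
        }
    }
    return 0;
}
```

Stripping the whitespace before the lookup is what lets “c ir” resolve to the same entry as “cir”. An unknown value returns 0 rather than being guessed at.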
There are also examples of Situs corrections to be made. There are 4502 records that have no street name info; these records can be left alone. Record 21435 (adrno = 4085, adrstr = buck lake) has “rd” in adrsuf2 when it should be in adrsuf. Record 133657 has street type “c ir” instead of “cir”.
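The misplaced suffix on record 21435 could be corrected with a rule like the one sketched below. The field names adrno, adrstr, adrsuf and adrsuf2 come from the control file; the field widths, the short list of known street types, and the function names are assumptions for illustration only.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Subset of the Situs fields; the widths are guesses. */
typedef struct {
    char adrno[16];
    char adrstr[64];
    char adrsuf[16];
    char adrsuf2[16];
} situs_t;

/* Case-insensitive check against a placeholder list of street types. */
int is_street_type(const char *s) {
    static const char *const known[] = {"RD", "CIR", "ST", "AVE", "DR"};
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++) {
        const char *a = s, *b = known[i];
        while (*a && *b && toupper((unsigned char)*a) == *b) { a++; b++; }
        if (*a == '\0' && *b == '\0') return 1;
    }
    return 0;
}

/* Moves a street type found in adrsuf2 into an empty adrsuf.
   Returns 1 if the record was changed, 0 if it was left alone. */
int fix_misplaced_suffix(situs_t *r) {
    if (r->adrstr[0] == '\0') return 0;  /* no street name: leave alone */
    if (r->adrsuf[0] != '\0') return 0;  /* adrsuf is already filled */
    if (!is_street_type(r->adrsuf2)) return 0;
    strcpy(r->adrsuf, r->adrsuf2);       /* both arrays are the same size */
    r->adrsuf2[0] = '\0';
    return 1;
}
```

The rule only fires when adrsuf is empty and adrsuf2 holds a recognized street type, so a value in adrsuf2 that is not a street type stays where it is. Records without a street name are skipped.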
Please feel free to ask questions if you need clarification. I prefer C or PHP, with MySQL.