We have a server-side application written in Ruby on Rails (ROR) that we need rewritten in Python.
Here is an overview of the entire system. We have a webpage where users create ongoing searches. Each search contains some text they want matched as well as additional fields (a year range, for example, and specific sites to look at). Users are searching for ads, so the records will have text plus data such as price. The searches are sent to the ROR application via JSON. The ROR application is then responsible for scanning records, extracting the relevant data, and checking whether each record matches a search. Batches of records that match searches are then sent back to the website.
This entire system already exists, but we are upgrading all of its parts. The existing ROR application consists of two parts. Both will be provided for reference, but only one of them needs to be replaced.
Part 1 was responsible for searching specific RSS feeds and putting the results into a Redis database. We have already replaced part 1, and the replacement now pumps data into a SQL database.
Part 2 pulls the records from that Redis database and extracts specific data from them. It then matches those records to the searches and sends the matches to our site. In addition, it has a simple web front end that lets us easily see what the system is doing, errors that are occurring, the most recent matches, extracted data, etc. Part 2 also allows our website to send JSON posts at any time to update, delete, or create searches.
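To illustrate the update/delete/create handling described above, here is a minimal Python sketch. The real JSON format is already set and is not shown in this posting, so the payload shape used here (the "action", "id", and "search" keys) and the SearchStore class are hypothetical placeholders, not the existing format.

```python
import json


class SearchStore:
    """In-memory store of user searches, keyed by search id (hypothetical)."""

    def __init__(self):
        self._searches = {}

    def apply(self, payload: str) -> None:
        # Assumed payload shape: {"action": ..., "id": ..., "search": {...}}.
        # The real, fixed JSON format would replace these keys.
        msg = json.loads(payload)
        action = msg["action"]
        search_id = msg["id"]
        if action in ("create", "update"):
            self._searches[search_id] = msg["search"]
        elif action == "delete":
            # Deleting an unknown id is treated as a no-op.
            self._searches.pop(search_id, None)
        else:
            raise ValueError(f"unknown action: {action!r}")

    def all(self) -> dict:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._searches)
```

In a real deployment the store would presumably be backed by the database rather than a dict, so that searches survive a restart.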
This project is to replace part 2. We need the ROR code converted to Python and upgraded, but the basic logic of how to extract the data is there and can be reused. The new part 2 will get data (ads) from the SQL database, extract specific data (year, price, etc.), and then either update the existing database or write to a different one. It also needs to compare the extracted ads against user searches to see whether each ad matches a search. When a batch of ads with matching searches has been built, it is sent to our website, again via JSON. The formats of the JSON to and from this application are already set. I can provide screenshots of the web UI, and I am fine with the UI being written in PHP, for example, if that is better. We use the UI as a tool to see what the system is doing, to make simple changes to searches, and to see how the data in records was extracted.
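As a rough sketch of the match-and-batch step described above, something like the following could work in Python. All names here (Search, Ad, matches, build_batch), the matching rule (every search word must appear in the ad text), and the output shape are assumptions for illustration only. The real matching rules should come from the existing ROR code, and the real JSON format is already set.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Search:
    search_id: int
    text: str                       # text the user wants matched
    year_min: Optional[int] = None  # optional year range
    year_max: Optional[int] = None
    sites: frozenset = frozenset()  # empty means "any site"


@dataclass
class Ad:
    ad_id: int
    site: str
    text: str
    year: Optional[int] = None      # extracted fields
    price: Optional[float] = None


def matches(search: Search, ad: Ad) -> bool:
    # Assumed rule: every word in the search text appears in the ad text.
    ad_words = set(re.findall(r"\w+", ad.text.lower()))
    if not all(w in ad_words for w in re.findall(r"\w+", search.text.lower())):
        return False
    if search.sites and ad.site not in search.sites:
        return False
    # An ad with no extracted year cannot satisfy a year constraint.
    if search.year_min is not None and (ad.year is None or ad.year < search.year_min):
        return False
    if search.year_max is not None and (ad.year is None or ad.year > search.year_max):
        return False
    return True


def build_batch(searches, ads) -> str:
    # Placeholder output shape; the real outgoing JSON format is fixed.
    batch = []
    for ad in ads:
        ids = [s.search_id for s in searches if matches(s, ad)]
        if ids:
            batch.append({"ad_id": ad.ad_id, "search_ids": ids})
    return json.dumps({"matches": batch})
```

Comparing every ad against every search is fine for small volumes. If the number of searches grows large, an index on the search words would likely be needed.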
The database as it stands will most likely need to be modified or, if it makes more sense, this application can create and manage its own DB. We can modify the code that inserts data into the SQL database if needed.
The new system should use the existing extraction logic. The incoming data is basically all text, so there are no simple keys or tags to pull the required data out of. For example, one field is the make and model of a car. We have tables, plus code that uses the matched make to help narrow down the expected model. Since the existing system is already running, we are happy if the new system produces the same results.
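To give a feel for the table-driven make/model idea described above, here is a small Python sketch. It is not the existing ROR logic. The MODELS_BY_MAKE table, the function names, and the regular expressions are all illustrative assumptions, and the real tables and rules will be provided with the existing code.

```python
import re

# Hypothetical lookup table: each make maps to the models we expect for it.
MODELS_BY_MAKE = {
    "honda": ["civic", "accord", "cr-v"],
    "ford": ["mustang", "f-150", "focus"],
}


def extract_make_model(text):
    """Find a known make first, then search only that make's models."""
    lowered = text.lower()
    for make, models in MODELS_BY_MAKE.items():
        if re.search(rf"(?<!\w){re.escape(make)}(?!\w)", lowered):
            for model in models:
                if re.search(rf"(?<!\w){re.escape(model)}(?!\w)", lowered):
                    return make, model
            return make, None  # make found, model unknown
    return None, None


def extract_year(text):
    # Naive heuristic: first standalone 4-digit number starting with 19 or 20.
    m = re.search(r"\b(19\d{2}|20\d{2})\b", text)
    return int(m.group(1)) if m else None


def extract_price(text):
    # Naive heuristic: first dollar amount, with optional thousands commas.
    m = re.search(r"\$\s?(\d[\d,]*(?:\.\d{2})?)", text)
    return float(m.group(1).replace(",", "")) if m else None
```

Narrowing by make first keeps the model search small and avoids matching a model name that belongs to a different make.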
The new system should be written with a clean structure so that we can easily maintain and modify the extraction subroutines as needed. For example, if we discover a better way to pull years out, I want to be able to easily modify the existing code to do that. The existing ROR code is well structured, so it should be easy to follow. The application also needs to run on its own 24 hours a day and will be hosted on a DigitalOcean droplet. So part of the delivery will be the requirements and steps needed to set up the droplet to start these scripts at server boot and keep them running.
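One possible way to keep the extraction subroutines easy to swap out is a small registry where each field has exactly one extractor function. This is only a sketch of one structure that would satisfy the maintainability requirement above. The names (EXTRACTORS, extractor, extract_all) and the regular expressions are assumptions, not part of the existing code.

```python
import re
from typing import Callable, Dict

# Registry mapping a field name to the function that extracts it.
EXTRACTORS: Dict[str, Callable[[str], object]] = {}


def extractor(field_name: str):
    """Decorator that registers (or replaces) the extractor for a field."""
    def register(func):
        EXTRACTORS[field_name] = func
        return func
    return register


@extractor("year")
def extract_year(text: str):
    # Placeholder rule: first standalone 4-digit number starting with 19 or 20.
    m = re.search(r"\b(19\d{2}|20\d{2})\b", text)
    return int(m.group(1)) if m else None


@extractor("price")
def extract_price(text: str):
    # Placeholder rule: first dollar amount, with optional thousands commas.
    m = re.search(r"\$\s?(\d[\d,]*(?:\.\d{2})?)", text)
    return float(m.group(1).replace(",", "")) if m else None


def extract_all(text: str) -> dict:
    """Run every registered extractor over one ad's text."""
    return {name: func(text) for name, func in EXTRACTORS.items()}
```

With this layout, improving how years are pulled out means editing or re-registering only the "year" function. The rest of the pipeline calls extract_all and does not change.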
This is a fixed-price project, so make sure you ask for whatever you need in order to provide a good budget. I am willing to discuss the entire project but will not be hiring on an hourly basis. I can provide snippets of code if needed, and this project will be in Git.