For this exciting project you will be scraping a large content website.
We will ideally accept bids starting from $30.
We will give you the address of a website with ~7,500 pages.
For all of the normal content pages, you will need to:
1) Scrape the content
2) Present the result as CSV or SQL.
3) Parse the content and save the following fields: title, abstract, keywords, body, category (some fields may not be available for a particular piece of content)
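To illustrate the CSV layout we have in mind, here is a minimal sketch in Python. It is only an example under our own assumptions, not a required implementation: the function name write_records and the choice to leave unavailable fields as empty strings are hypothetical, and any tool that produces equivalent output is fine.

```python
import csv
import io

# The five fields named in the brief; the column order is an assumption.
FIELDS = ["title", "abstract", "keywords", "body", "category"]


def write_records(records, out):
    """Write scraped records (dicts) as CSV rows to a file-like object.

    Fields missing from a record are written as empty strings, and any
    keys outside FIELDS are ignored.
    """
    writer = csv.DictWriter(
        out, fieldnames=FIELDS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Example: one page with no abstract, keywords, or category available.
buffer = io.StringIO(newline="")
write_records([{"title": "A", "body": "First paragraph.\n\nSecond paragraph."}], buffer)
csv_text = buffer.getvalue()
```

When writing to a real file, open it with newline="" so that paragraph breaks inside the body field are preserved inside the quoted cell.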
The resulting content must be free of any images and HTML tags, but must maintain spaces and paragraph indicators. For instance, bolded text should come through as plain text.
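As a rough illustration of the clean-up we expect, the sketch below uses Python's standard html.parser module to drop tags and images while keeping paragraph breaks. The names TextExtractor and html_to_text, the set of tags treated as paragraph boundaries, and the use of a blank line as the paragraph indicator are all assumptions made for this example; the real site may need different rules.

```python
from html.parser import HTMLParser

# Tags treated as paragraph boundaries (an assumption; adjust per site).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose text content is dropped entirely.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collect visible text, grouped into paragraphs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self.current = []
        self.skip_depth = 0

    def flush(self):
        # Collapse runs of whitespace and close the current paragraph.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()
        # Inline tags such as <b> and <img> are simply ignored.

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)


def html_to_text(html):
    """Return plain text with paragraphs separated by a blank line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return "\n\n".join(parser.paragraphs)


sample = '<p>Some <b>bold</b> text.</p><img src="x.png"><p>Second &amp; last.</p>'
plain = html_to_text(sample)
```

Here bolded text comes through as ordinary words, the image disappears, and the two paragraphs stay separated by a blank line.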
We are aware of several website scraping tools (Velocityscape Web Scraper Plus, etc.) and are happy for you to use one of them.
We are looking to complete this project quickly, by Sunday, March 14 at the latest.
We will need the freelancer to show us a small number of records for our approval before completing the rest of the project.
Please use the phrase "I will scrape you" in your response, so we know you have read this description.
We expect to have additional work like this.