PHP HTML Parsing (Command line usage)

Create a PHP script, callable from the command line, that parses a small piece of HTML.

Please explain which HTML/XML parsing routines you would use and why; we want fast code.

- The HTML must be passed to the script for processing when the script is called from the command line (example HTML pasted below)

- The script must write the required parsed values to STDOUT (the screen) in CSV format (no header needed, just comma separation)

- Command line examples are needed to show how the final script is used
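
The input and output requirements above could be sketched as follows. This is only an illustration, assuming PHP 7.4 or later: readHtml(), csvLine() and the file name parse.php are hypothetical names that do not come from the brief, and the row printed at the end uses placeholder values in place of the four parsed fields. fputcsv() is used instead of a plain implode() so that a comma or quote inside the plain-text field does not break the row.

```php
<?php
// Sketch only: readHtml() and csvLine() are hypothetical helper names.

/**
 * Return the HTML handed to the script: the first argument itself,
 * or the contents of the given stream when the argument is "-".
 */
function readHtml(array $argv, $stdin): string
{
    if (!isset($argv[1])) {
        throw new InvalidArgumentException('No HTML given: pass it as the first argument, or "-" to read STDIN');
    }
    if ($argv[1] === '-') {
        return (string) stream_get_contents($stdin);
    }
    return $argv[1];
}

/**
 * Format one row as a CSV line. fputcsv() quotes any field that
 * contains a comma, a quote or whitespace, and doubles embedded quotes.
 */
function csvLine(array $fields): string
{
    $buffer = fopen('php://memory', 'w+');
    fputcsv($buffer, $fields, ',', '"', '');   // '' disables the non-standard escape character
    rewind($buffer);
    $line = (string) stream_get_contents($buffer);
    fclose($buffer);
    return $line;
}

// Placeholder values standing in for the four parsed fields.
echo csvLine(['http://example.com/post', 'Text, with a comma', 'images/a.png', 'png']);
```

In a finished script the last line would instead pass the fields parsed from readHtml($argv, STDIN). Saved as parse.php, it could then be called either as php parse.php "$(cat snippet.html)" or as php parse.php - < snippet.html.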

Required variables (all extracted from the source to be passed to your script) are:

1 - a raw URL (specifically the href of the link whose target is datamercsexternalframe)

2 - 55 characters of plain text, which in the example below would start with "My current Shiny project..."

3 - the first raw img src URL in the source (allowing for cases where there is more than one img)

4 - the file extension of the file specified in item 3 above, e.g. png, jpg, etc.

Example source HTML:

<img src="images/feedsin/2016/04/23/[url removed, login to view]"/>My current Shiny project contains at least five tables and I

constantly forget how they are called. So I whipped up a little

bookmarklet that uses jQuery to show the id of each div and input.

Some of those can be ignored as they are internal names set<p class="trackback"><a class="shortlink " rel="" title="Display element ids for debugging Shiny apps" href="[url removed, login to view]" target="datamercsexternalframe">Read more </a></p>

Native PHP HTML/XML routines will keep this efficient, as this is not a big job; markup processing is already built into core PHP.
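
One way the native routines might be applied to items 1, 3 and 4 is sketched below with the bundled DOM extension (DOMDocument and DOMXPath). This is an illustration under assumptions, not a reference solution: extractLinkAndImage() is a hypothetical name, and the sample markup uses placeholder example.com URLs because the real ones are not shown in the example source.

```php
<?php
// Sketch only: extractLinkAndImage() is a hypothetical helper name.

/**
 * Return the href of the first link targeting "datamercsexternalframe",
 * the src of the first <img>, and that file's lower-cased extension.
 */
function extractLinkAndImage(string $html): array
{
    $result = ['url' => '', 'img' => '', 'ext' => ''];
    if (trim($html) === '') {
        return $result;                       // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);   // keep parser warnings off STDOUT/STDERR
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // string() of a node-set yields the first node in document order.
    $result['url'] = $xpath->evaluate('string(//a[@target="datamercsexternalframe"]/@href)');
    $result['img'] = $xpath->evaluate('string(//img/@src)');

    // Take the extension from the URL path only, ignoring any query string.
    $path = (string) parse_url($result['img'], PHP_URL_PATH);
    $result['ext'] = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return $result;
}

// Placeholder markup shaped like the example source.
$sample = '<img src="images/feedsin/first.png"/>Some text'
        . '<p class="trackback"><a href="http://example.com/post" target="datamercsexternalframe">Read more </a></p>';
print_r(extractLinkAndImage($sample));
```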

Clarifying point 2 (plain text): all tags must be removed so that the text is entirely human readable, i.e. tagless / markup-free.
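
A minimal sketch of that tag removal, assuming the mbstring extension is available and the input is UTF-8, might look like this. plainText() is a hypothetical name, and collapsing line breaks into single spaces is an assumption about what "human readable" should mean here.

```php
<?php
// Sketch only: plainText() is a hypothetical helper name.

/**
 * Strip all markup and return the first $length characters of text.
 */
function plainText(string $html, int $length = 55): string
{
    // Put a space before every tag so words on either side of a tag do not run together.
    $text = strip_tags(str_replace('<', ' <', $html));
    // Turn entities such as &amp; back into the characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse newlines and runs of whitespace into single spaces.
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);
    // mb_substr() counts characters rather than bytes.
    return mb_substr($text, 0, $length, 'UTF-8');
}

echo plainText('<img src="a.png"/>My current Shiny project contains at least five tables and I constantly forget'), "\n";
```

One trade-off of the space-before-tag step is that inline markup in the middle of a word would split that word in two.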

There will be no need to make HTTP calls; the HTML will be provided at the command line.

Skills: HTML, PHP, Software Architecture

About the employer:
(8 reviews) Horley, United Kingdom

Project ID: #10322671



Hi, I would use PHP's preg_match function to parse the HTML. I'm available to complete this today. Thanks for considering my services. Stan. Jobs feedback: [login to view URL]

£22 GBP in 1 day
(291 reviews)

5 freelancers are bidding an average of £39 for this job


£18 GBP in 1 day
(282 reviews)

Dear Sir, I am delighted to let you know that I have done data scraping with PHP-cURL, PhantomJS, Node.js and Selenium from many sites. I scraped the data from the website and then wrote it to a MySQL database ...

£50 GBP in 1 day
(37 reviews)
£157 GBP in 1 day
(5 reviews)

Hello! Although I am a new freelancer, I have professional experience in the web development industry. I have been in the IT field for the last three years. I provide my brilliant skills to the clients beyond the [login to view URL] ...

£80 GBP in 2 days
(0 reviews)

The parsing can be easily and efficiently achieved using regular expressions. Parsing it as XHTML would also be an option, but requires valid XHTML, which is not always the case with HTML, so it would require an extra ...
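
For comparison, a regular-expression version of the link and image extraction might look like the sketch below. regexExtract() is a hypothetical name and the patterns are assumptions about the markup: they expect quoted attribute values and would fail on an attribute value that contains a ">" character. PHP's DOMDocument::loadHTML() uses libxml's HTML parser, which accepts markup that is not valid XHTML, so a DOM-based approach may not need an extra clean-up step.

```php
<?php
// Sketch only: regexExtract() is a hypothetical helper name.

/**
 * Regex-based extraction of the target link's href and the first img src.
 */
function regexExtract(string $html): array
{
    $url = '';
    // Collect every opening <a ...> tag, then inspect its attributes
    // separately so that attribute order does not matter.
    if (preg_match_all('/<a\b[^>]*>/i', $html, $tags)) {
        foreach ($tags[0] as $tag) {
            if (preg_match('/\starget\s*=\s*(["\'])datamercsexternalframe\1/i', $tag)
                && preg_match('/\shref\s*=\s*(["\'])(.*?)\1/i', $tag, $m)) {
                // Attribute values may contain entities such as &amp;.
                $url = html_entity_decode($m[2], ENT_QUOTES | ENT_HTML5, 'UTF-8');
                break;
            }
        }
    }

    $img = '';
    // preg_match() stops at the first match, which is the first <img> with a src.
    if (preg_match('/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/i', $html, $m)) {
        $img = html_entity_decode($m[2], ENT_QUOTES | ENT_HTML5, 'UTF-8');
    }

    return ['url' => $url, 'img' => $img];
}

// Placeholder markup; the URLs are not from the example source.
print_r(regexExtract('<img src="a/b.jpg"><a href="http://example.com/p" target="datamercsexternalframe">Read more</a>'));
```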

£23 GBP in 1 day
(0 reviews)