
Current scenario:

I'm loading an HTML page into a variable with DOMDocument:

$dom    = new DOMDocument('1.0', 'UTF-8');  
@$dom->loadHTML($html);

I need to parse 3 lists of option fields. The HTML looks like this:

<li>
    <select id="advertiser" name="advertiser[]" multiple="multiple" autocomplete="off">
    <option value="35" >Website Adv 1</option>
    <option value="36" >Website Adv 1</option>
    <option value="41" >Website Adv 1</option>
    <option value="45" >Website Adv 1</option>
    </select>
</li>

I found this code on Stack Overflow, but it does not work:

$xpath = new DOMXpath($dom);
$options = $xpath->query("*/select[@name='advertiser[]']/option");
foreach ($options as $option) {
  $optionValue = $option->getAttribute('value');
  $optionContent = $option->nodeValue;
  echo "$optionValue and $optionContent\n";
}

The question remains:

How do I parse an HTML page to extract the options of a select into an array of the form value => option_text?

  • I would think you'd want //select[@name... instead of */select. (Commented Oct 14, 2013 at 3:17)

1 Answer

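The code you posted only matches when the select is a direct child of body: relative queries are evaluated from the root element by default, so */select looks exactly two levels down. Your select sits inside an li, so use //select, as the comment on the question suggests. To collect the options into an array, change the code to this:

$dom    = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
// "//select" matches the select at any depth in the document
$options = $xpath->query("//select[@name='advertiser[]']/option");
$result = array();
foreach ($options as $option) {
  $optionValue = $option->getAttribute('value');
  $optionContent = $option->nodeValue;
  // key the array by the option's value attribute
  $result[$optionValue] = $optionContent;
}

print_r($result);

This loads the items into the $result array as value => option text.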

The result should be:

Array
(
    [35] => Website Adv 1
    [36] => Website Adv 1
    [41] => Website Adv 1
    [45] => Website Adv 1
)
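
To see why the depth of the select matters, here is a minimal sketch comparing the two XPath forms. It assumes a hypothetical $html fragment modelled on the question's markup, and relies on relative queries starting from the root element, which is the default the PHP manual describes for DOMXPath::query().

```php
<?php
// Hypothetical sample input: the select is nested inside an <li>, as in the question.
$html = '<ul><li><select name="advertiser[]" multiple="multiple">'
      . '<option value="35">Website Adv 1</option>'
      . '<option value="36">Website Adv 1</option>'
      . '</select></li></ul>';

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Relative path: evaluated from the root <html> element, so it only finds
// a select that is a direct child of <body>.
$relative = $xpath->query("*/select[@name='advertiser[]']/option");

// Descendant path: finds the select at any depth.
$anywhere = $xpath->query("//select[@name='advertiser[]']/option");

echo $relative->length, " vs ", $anywhere->length, "\n";
```

With the select wrapped in ul/li, only the // form should find the options, which would explain empty results on a full page.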

4 Comments

Thank you. I think there is an invisible error somewhere. Yes, this works great on its own, so maybe the HTML page is malformed. $html is the return value of a cURL request. I don't get it: the HTML is there, but the results are empty.
As the PHP manual says (us2.php.net/manual/en/domdocument.loadhtml.php), loadHTML() returns a boolean, so you can check whether the HTML string was loaded successfully.
So if there is anything else before the select, any kind of data such as other tags or text, the XPath does not work.
Can this be done some other way? How do I parse the code node by node?
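
One way to do this without XPath, as a rough sketch: check the return value of loadHTML(), then walk the nodes with getElementsByTagName() and build one value => text array per select. The $html string is a hypothetical stand-in for the cURL response, and $lists and $parseErrors are illustrative names, not part of the original code.

```php
<?php
// Hypothetical stand-in for the cURL response body.
$html = '<div><p>Filters</p><ul><li>'
      . '<select id="advertiser" name="advertiser[]" multiple="multiple">'
      . '<option value="35">Website Adv 1</option>'
      . '<option value="36">Website Adv 1</option>'
      . '</select></li></ul></div>';

// Collect parser warnings instead of hiding them with @.
libxml_use_internal_errors(true);

$dom = new DOMDocument('1.0', 'UTF-8');
if (!$dom->loadHTML($html)) {
    exit("loadHTML() failed\n");
}
$parseErrors = libxml_get_errors(); // inspect these if the result is empty
libxml_clear_errors();

// Walk every <select> in the document and group its options by select name.
$lists = array();
foreach ($dom->getElementsByTagName('select') as $select) {
    $name = $select->getAttribute('name');
    $lists[$name] = array();
    foreach ($select->getElementsByTagName('option') as $option) {
        // value attribute => trimmed option text
        $lists[$name][$option->getAttribute('value')] = trim($option->nodeValue);
    }
}

print_r($lists);
```

Because getElementsByTagName() searches all descendants, this does not depend on where the select sits in the page, and it picks up all three lists in one pass if they are separate select elements.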
