I have the HTML below.

<div class="wwrcm-tab-info wwrcm-cf wwrcm-last">
  <div class="wwrcm-info">
    <h2 class="wwrcm-text-gray">Instant office.</h2>
    <p class="wwrcm-text-gray-light">Just click your Surface Pro 3 into the dock to go from tablet to full desktop PC. With an Ethernet port, Mini DisplayPort and five USB ports – three USB 3.0 and two USB 2.0 ports – you can attach your HD monitor, full-size keyboard, printer and more.</p>
    <h2 class="wwrcm-text-gray">All powerful.</h2>
    <p class="wwrcm-text-gray-light">Docking Station delivers plenty of power at 48W. You can work on your device, run or charge your favourite accessories, and still have ample power to charge your Surface Pro 3 battery.</p>
    <h2 class="wwrcm-text-gray">Product Features</h2>
    <p class="wwrc-feature-p wwrcm-text-gray-light"><strong>Mini DisplayPort Video Output</strong><br/>The mini DisplayPort connection delivers high-definition video resolution of up to 3840 x 2600 DPI.</p>
    <p class="wwrc-feature-p wwrcm-text-gray-light"><strong>USB Ports</strong><br/>Docking Station includes five USB ports – three USB 3.0 and two USB 2.0 ports. Transfer large files to an external drive, plug in a USB printer or headset, charge multiple accessories, and more.</p>
    <p class="wwrc-feature-p wwrcm-text-gray-light"><strong>Gigabit Ethernet Port</strong><br/>The gigabit Ethernet connection is super fast, with data transfer rates of up to 1 billion bits per second&#185;.</p>
    <p class="wwrcm-text-gray-light"><strong>48W Power Supply</strong><br/>The 48W power supply quickly recharges your Surface battery while you work, so you can hit the road or the halls in no time with a fully-charged device.</p> 
    <h2 class="wwrcm-text-gray">Summary</h2>
    <ul class="wwrcm-text-gray-light">
      <li>Transform your Surface Pro 3 into a complete desktop workstation</li>
      <li>Connect to your favourite accessories</li>
      <li>Power and charge your Surface Pro 3</li>
    </ul>
  </div>
</div>

I want to parse the HTML above and display each h2 value followed by its p value, in sequence. I want to store the result as an array with the h2 text as the key and the <p> text as the value.

I have tried xpath->query and also regular expressions, but I was unable to produce that output.

Can you please suggest how to proceed?

  • You want to store the HTML in an array where the text within the h2's are the keys, but what will the values be? Do you need to parse the other data such as the p's, ul's and li's? Commented Jan 30, 2015 at 13:22
  • Hi. Yes, the h2 text as the key and the p text as the value. Commented Jan 30, 2015 at 13:38
  • Start by looking at PHP's DOMDocument. Commented Jan 30, 2015 at 13:39
  • How would you handle the h2 "Product Features"? There are 4 p elements after it; which one should be kept? And I agree with @MarkBaker, see DOMDocument. Lastly, regexes are not the way to do this task. Commented Jan 30, 2015 at 13:43

1 Answer

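Try Simple HTML DOM (http://simplehtmldom.sourceforge.net/). Note that this keeps only the element immediately after each h2, and only if it is a p:

```php
// Assumes simple_html_dom.php is available and $markup holds the HTML from the question.
include 'simple_html_dom.php';
$html = str_get_html($markup);

$arr = [];
// The headings in the question are <h2> elements, not <h1>.
foreach ($html->find('h2') as $header) {
    $nextSibling = $header->nextSibling();
    // Keep the pair only when the element right after the heading is a <p>.
    if (!empty($nextSibling) && $nextSibling->tag === 'p') {
        $arr[$header->plaintext] = $nextSibling->plaintext;
    }
}
```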
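
If you would rather stay with PHP's built-in DOMDocument, as the comments suggest, the following is a minimal sketch rather than a definitive implementation. The function name `headingsToParagraphs` is hypothetical, and it rests on one assumption about what you want: every p between an h2 and the next h2 is collected, so each key maps to an array of paragraph texts instead of a single string.

```php
<?php
// Hypothetical helper: the name and the "array of paragraphs per heading"
// result shape are assumptions, not something the question specifies.
function headingsToParagraphs(string $markup): array
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings instead of printing them.
    libxml_use_internal_errors(true);
    // The leading XML declaration is a common trick to make loadHTML read UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $markup);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $result = [];
    foreach ($xpath->query('//h2') as $h2) {
        $key = trim($h2->textContent);
        $result[$key] = [];
        // Walk the following siblings until the next <h2> starts a new section.
        for ($node = $h2->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeType !== XML_ELEMENT_NODE) {
                continue; // skip whitespace text nodes and comments
            }
            if ($node->nodeName === 'h2') {
                break;
            }
            if ($node->nodeName === 'p') {
                $result[$key][] = trim($node->textContent);
            }
        }
    }
    return $result;
}
```

Called as `$pairs = headingsToParagraphs($markup);` on the HTML in the question, "Product Features" should end up with four paragraphs and "Summary" with an empty array, because that heading is followed by a ul rather than a p. If you only need the first paragraph, take `$pairs[$key][0]` after checking that it exists.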
