Username: Save?
Password:
Home Forum Links Search Login Register*
    News: Keep The TechnoWorldInc.com Community Clean: Read Guidelines Here.
Recent Updates
[April 24, 2024, 11:48:22 AM]

[April 24, 2024, 11:48:22 AM]

[April 24, 2024, 11:48:22 AM]

[April 24, 2024, 11:48:22 AM]

[April 03, 2024, 06:11:00 PM]

[April 03, 2024, 06:11:00 PM]

[April 03, 2024, 06:11:00 PM]

[April 03, 2024, 06:11:00 PM]

[March 06, 2024, 02:45:27 PM]

[March 06, 2024, 02:45:27 PM]

[March 06, 2024, 02:45:27 PM]

[March 06, 2024, 02:45:27 PM]

[February 14, 2024, 02:00:39 PM]
Subscriptions
Get Latest Tech Updates For Free!
Resources
   Travelikers
   Funistan
   PrettyGalz
   Techlap
   FreeThemes
   Videsta
   Glamistan
   BachatMela
   GlamGalz
   Techzug
   Vidsage
   Funzug
   WorldHostInc
   Funfani
   FilmyMama
   Uploaded.Tech
   MegaPixelShop
   Netens
   Funotic
   FreeJobsInc
   FilesPark
Participate in the fastest growing Technical Encyclopedia! This website is 100% Free. Please register or login using the login box above if you have already registered. You will need to be logged in to reply, make new topics and to access all the areas. Registration is free! Click Here To Register.
+ Techno World Inc - The Best Technical Encyclopedia Online! » Forum » THE TECHNO CLUB [ TECHNOWORLDINC.COM ] » Programming Zone » HTML
 Parse html with preg_match_all
Pages: [1]   Go Down
  Print  
Author Topic: Parse html with preg_match_all  (Read 1073 times)
Daniel Franklin
TWI Hero
**********


Karma: 3
Offline Offline

Posts: 16647


View Profile Email
Parse html with preg_match_all
« Posted: September 26, 2007, 04:07:24 PM »


The biggest difference between preg_match_all and the regular preg_match is that all matched values are stored inside a multi-dimensional array to store an unlimited number of matches. With the following example I will try to make clear how its possible to store the image paths inside a webpage:

<?php
$data = file_get_contents("http://www.finalwebsites.com");
$pattern = "/src=["']?([^"']?.*(png|jpg|gif))["']?/i";
preg_match_all($pattern, $data, $images);
?>

We take a closer look to the pattern:
"/src=["']?([^"']?.*(png|jpg|gif))["']?/i"
The first part and the last part are searching for everything that starts with src and ends with a optional quote or double quote. This could be a long string because the outer rule is very global. Next we check the rule starts within the first bracket:
"/src=["']?([^"']?.*(png|jpg|gif))["']?/i"
Now we are looking inside this long string from the outer rule for strings starting with an optional quote or double quote followed by any characters. The last part inside the inner brackets is the magic:
"/src=["']?([^"']?.*(png|jpg|gif))["']?/i"


We are looking next for a string that is followed by a file extension and match we get all the paths from the html file.

We need all the rules to isolate the string parts (image paths) from the rest of the html. The result looks like this (access the array $images with these indexes, or just use print_r($images)):

$images[0][0] -> scr="/images/english.gif"
$images[1][0] -> /images/english.gif
$images[2][0] -> gif

The index 1 is the information we need, try this example with other part of html code for a better understanding.

Articles Source - Free Articles

Logged

Pages: [1]   Go Up
  Print  
 
Jump to:  

Copyright © 2006-2023 TechnoWorldInc.com. All Rights Reserved. Privacy Policy | Disclaimer
Page created in 0.129 seconds with 24 queries.