Need some Regular Expressions (PHP) to find and parse Script Codes on HTML website
$30-75 USD
Completed
Posted over 13 years ago
Paid on delivery
Hello!
Please find my detailed project description in the additional information area; there is not enough space for it here.
Marc
## Deliverables
What I need are regular expressions in PHP that find Google Adsense script codes on websites so that I can parse and replace them. I have built some regular expressions that work fine if the Adsense scripts are inserted in the standard way, but (god knows why) a lot of users add comments, delete the line breaks, or change something else. So I need more flexible regular expressions. Let me start with a typical standard Adsense script:
<script type="text/javascript"><!--
google_ad_client = "pub-7676581854075673";
/* 300x250, Erstellt 11.11.10 */
google_ad_slot = "7343419665";
google_ad_width = 300;
google_ad_height = 250;
//-->
</script>
<script type="text/javascript"
src="<[login to view URL]>">
</script>
The RegEx I already use to find this script and parse some values is this one:
if( preg_match_all('|(<script type="text/javascript">[^>]*google_ad_client[^<]*</script>)|i', $content, $match) ){
    foreach( $match[1] as $content){
        if( preg_match('|google_ad_client\s*=\s*"[^"]*".*?google_ad_slot\s*=\s*"\d+"|i', $content, $temp) ){
            $pub = get_hash($content, "google_ad_client");
            $pub = str_replace("ca-","",$pub);
            $ads = get_hash($content, "google_ad_slot");
            $wid = get_hash($content, "google_ad_width");
            $hgt = get_hash($content, "google_ad_height");
            [...]
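To illustrate what a more tolerant parsing step could look like, here is a minimal sketch. It assumes that any `<script>` element whose body mentions `google_ad_client` is an Adsense config block, and it collects every `google_*` assignment regardless of line breaks or comments in between. The function names `find_adsense_blocks` and `parse_adsense_settings` are hypothetical and are not the `get_hash` helper used above.

```php
<?php
// Hypothetical helper: collect every google_* assignment from one script body.
function parse_adsense_settings(string $scriptBody): array
{
    $settings = [];
    // name = "value" | name = 'value' | name = 123, with optional whitespace.
    // The branch reset group (?|...) puts the value in group 2 in every case.
    $pattern = '~\b(google_\w+)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|(\d+))~i';
    if (preg_match_all($pattern, $scriptBody, $hits, PREG_SET_ORDER)) {
        foreach ($hits as $hit) {
            $settings[strtolower($hit[1])] = $hit[2] ?? '';
        }
    }
    return $settings;
}

// Hypothetical helper: scan all <script> elements and parse those that
// mention google_ad_client, whatever attributes or comment markers they use.
function find_adsense_blocks(string $html): array
{
    $blocks = [];
    if (preg_match_all('~<script\b[^>]*>(.*?)</script\s*>~is', $html, $scripts, PREG_SET_ORDER)) {
        foreach ($scripts as $script) {
            if (stripos($script[1], 'google_ad_client') !== false) {
                $blocks[] = parse_adsense_settings($script[1]);
            }
        }
    }
    return $blocks;
}
```

All values come back as strings, and a missing key (for example no `google_ad_slot`) simply does not appear in the array. One known limitation of this sketch: an assignment written inside a JavaScript comment would be picked up as well.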
The RegEx I use to find the Adsense script in order to replace or delete it completely is this one:
$this->response_body = preg_replace('/<script type="text\/javascript"><\!\-\-[\r\n\t ]*google_ad_client = "'.$pub_id.'pub\-[^>]+>[\r\n\t ]*<\/script>[\r\n\t ]*<script[^>]+show_ads\.js">[\r\n\t ]*<\/script>/is', '', $this->response_body);
($pub_id is known here)
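A looser removal pattern might look like the following sketch. It assumes the publisher ID is passed in full (for example `pub-123`, optionally prefixed with `ca-` in the page), that the loader script URL contains `show_ads.js` as in the expression above, and that nothing but whitespace sits between the two script elements. The function name `remove_adsense` is hypothetical.

```php
<?php
// Sketch: remove one publisher's config <script> and the loader <script>
// that directly follows it. Returns the input unchanged if nothing matches.
function remove_adsense(string $html, string $pubId): string
{
    $id = preg_quote($pubId, '~'); // escape regex metacharacters in the ID
    $pattern = '~<script\b[^>]*>'                 // opening tag, any attributes
        . '(?:(?!</script).)*?'                   // body so far, never crossing </script
        . 'google_ad_client\s*=\s*["\'](?:ca-)?' . $id . '["\']'
        . '(?:(?!</script).)*?</script\s*>'       // rest of the body and the closing tag
        . '\s*<script\b[^>]*show_ads\.js[^>]*>\s*</script\s*>~is';
    $result = preg_replace($pattern, '', $html);
    return $result ?? $html; // preg_replace returns null on a PCRE error
}
```

The `(?:(?!</script).)*?` part keeps a match inside a single script element, so comment markers, CDATA wrappers, or missing line breaks in the body do not matter. Passing the ID through `preg_quote` avoids surprises if it ever contains regex metacharacters.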
Okay. These expressions work fine if the Adsense script is implemented regularly. Below are some irregular implementations I found; please build a RegEx that is as flexible as possible so that it finds such irregular codes. By "flexible" I mean that your RegEx should not only work with the individual examples listed here. For example, if an example script contains a comment like
<script type="text/javascript"><!--// <![CDATA[
[...]
your RegEx should be flexible enough to also find
<script type="text/javascript"><!--// <![CDATA2[
[...]
Okay, but I think this point is clear anyway. ;-)
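As a rough illustration of that kind of flexibility, the sketch below does not spell out the comment text at all. It assumes that whatever sits between the opening tag and `google_ad_client` can be treated as arbitrary content, as long as it stays inside the same script element. The function name `match_config_scripts` is hypothetical.

```php
<?php
// Sketch: return the raw config <script> elements found in $html. The text
// between the opening tag and google_ad_client (comment markers, CDATA
// variants, or nothing at all) is matched generically.
function match_config_scripts(string $html): array
{
    $pattern = '~<script\b[^>]*>'
        . '(?:(?!</script).)*?\bgoogle_ad_client\b'
        . '(?:(?!</script).)*?</script\s*>~is';
    preg_match_all($pattern, $html, $matches);
    return $matches[0] ?? [];
}
```

Because nothing in the pattern depends on `<!--`, `CDATA[`, or line breaks, both variants shown above and a script without any comment opener are matched the same way.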
And now here are 8 irregularly implemented Adsense scripts:
<div style="padding-bottom:14px;"><script type="text/javascript"><!--
google_ad_client = "pub-1126182868641892";
/* GifMixde_468x15-5:Start Haupt */
google_ad_slot = "2751397243";
google_ad_width = 560;
google_ad_height = 15;
//-->
</script>
<script type="text/javascript"
src="<[login to view URL]>">
</script></div>
<div style="padding-bottom:14px;"><script type="text/javascript"><!--
google_ad_client = "pub-1126182868641892";
/* GifMixde_468x15-5:Start Haupt */
google_ad_slot = "2751397243";
google_ad_width = 560;
google_ad_height = 15;
//-->
</script>
<script type="text/javascript"
src="<[login to view URL]>">
</script></div>
<script type="text/javascript"><!--
google_ad_client = "pub-9794464753711083";
/* 160x600, Erstellt 31.03.10 */
google_ad_slot = "3120995982";
google_ad_width = 160;
google_ad_height = 600;
//--></script>
<script type="text/javascript"
src="<[login to view URL]>">
</script>
<script type="text/javascript"><!--// <![CDATA[
google_ad_client = "pub-3153097597200094";
/* 468x60, created 3/5/08 Pascal */
google_ad_slot = "3756691810";
google_ad_width = 468;
google_ad_height = 60;
//-->
</script>
<script type="text/javascript"
src="<[login to view URL]>">
</script>
<script type="text/javascript" src="<[login to view URL]>"></script>
<div class="adsense250 right"><script type="text/javascript"><!--
google_ad_client = "pub-8678763012128123";
google_ad_slot = "";
google_ad_width = 250;
google_ad_height = 250;
</script>
<script type="text/javascript" src="<[login to view URL]>"></script></div>
<script type="text/javascript"><!--
google_ad_client = "pub-4693662170792965";
google_ad_width = 120;
google_ad_height = 600;
google_ad_format = "120x600_as";
google_ad_channel ="";
google_ad_type = "text";
google_color_border = "006699";
google_color_bg = "ECF8FF";
google_color_link = "0000CC";
google_color_url = "008000";
google_color_text = "6F6F6F";
//--></script> <script type="text/javascript"
src="<[login to view URL]>">
</script>
<div style="text-align: center;margin: 5px;"><script type="text/javascript">
google_ad_client = "pub-9794464753711083";
google_alternate_color = "FFFFFF";
google_ad_width = 468;
google_ad_height = 60;
google_ad_format = "468x60_as";
google_ad_type = "text_image";
google_ad_channel ="";
google_color_border = "#000000";
google_color_link = "#FFFFCC";
google_color_bg = "#940C0E";
google_color_text = "#FFCC66";
google_color_url = "#E6E6E6";
google_ui_features = "rc:10";
//-->
</script>
<script type="text/javascript"
src="<[login to view URL]>">
</script></div><p>
<script type="text/javascript">google_ad_client = "pub-7676581854075673";/* 300x250, Erstellt 11.11.10 */google_ad_slot = "7343419665";google_ad_width = 300;google_ad_height = 250;</script><script type="text/javascript"
src="<[login to view URL]>">
</script>
That's it! :-)
Thank you!
Marc