Jul 7, 2010

Convert all applicable characters to Numeric entities for use in XML

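If you want to make sure your text gets parsed correctly, you usually reach for htmlentities. However, this function has two downsides:

1. It produces named entities (such as &eacute;) rather than numeric ones. XML only predefines five named entities (&amp;, &lt;, &gt;, &quot; and &apos;), so you’ll have problems when parsing the result as XML.

2. It does NOT cover all characters that are likely to show up.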
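
To make the first downside concrete, here is a minimal sketch that assumes UTF-8 input and an installed SimpleXML extension. The sample string and the <p> wrapper are made up for illustration. It hands the output of htmlentities to an XML parser and compares that with the numeric form:

```php
<?php
// Sample input "café", written with a Unicode escape so the file encoding does not matter.
$text = "caf\u{e9}";

// htmlentities() yields the named entity &eacute;, which XML does not define.
$named = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Collect parser errors instead of emitting warnings.
libxml_use_internal_errors(true);

$fromNamed   = simplexml_load_string('<p>' . $named . '</p>');
$fromNumeric = simplexml_load_string('<p>caf&#233;</p>');
libxml_clear_errors();

var_dump($named);                 // the named-entity form
var_dump($fromNamed === false);   // whether parsing the named form failed
var_dump((string) $fromNumeric);  // text recovered from the numeric form
```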

So, to address these issues, first for point 1:

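function _convertAlphaEntitysToNumericEntitys($entity){
    // Decode to a UTF-8 character; unknown entity names come back unchanged and are left alone.
    $char = html_entity_decode($entity[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
    // mb_ord() (PHP 7.2+, mbstring) returns the full code point; ord() only sees the first byte.
    return $char === $entity[0] ? $entity[0] : '&#'.mb_ord($char, 'UTF-8').';';
}
$content = preg_replace_callback('/&([\w\d]+);/i','_convertAlphaEntitysToNumericEntitys',$content);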

This takes all the “normal” named entities (which you already have from using htmlentities) and replaces them with their numeric counterparts so they can be parsed as XML. That leaves us with our second problem, the fact that only a small range of characters is covered in the first place:

function _convertAsciOver127toNumericEntitys($entity){
    // ord() reads a single byte, so this assumes single-byte ISO-8859-1 input.
    if(($asciCode = ord($entity[0])) > 127){
        return '&#'.$asciCode.';';
    }else{
        return $entity[0];
    }
}
$content = preg_replace_callback('/[^\w\d ]/i','_convertAsciOver127toNumericEntitys',$content);
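
The callback above looks at one byte at a time. If $content is UTF-8, where a character such as é takes two bytes, a multi-byte aware variant is needed. The following is a sketch under that assumption, not the original code: the function name convertNonAsciiToNumericEntities is hypothetical, and mb_ord requires PHP 7.2 or later with the mbstring extension.

```php
<?php
// Hypothetical helper: replace every non-ASCII character in a UTF-8 string
// with its decimal numeric entity.
function convertNonAsciiToNumericEntities(string $content): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8, so each
    // match is a whole character rather than a single byte.
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',
        function (array $match): string {
            // mb_ord() returns the Unicode code point of the matched character.
            return '&#' . mb_ord($match[0], 'UTF-8') . ';';
        },
        $content
    );

    // preg_replace_callback() returns null on invalid UTF-8; fall back to the input.
    return $result ?? $content;
}

echo convertNonAsciiToNumericEntities("caf\u{e9} \u{20ac}5"), "\n"; // prints "caf&#233; &#8364;5"
```

Matching on the ASCII range instead of [^\w\d ] is a deliberate choice here: with /u, \w also matches non-ASCII letters, so those would otherwise be skipped.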

And there you go: the resulting text should have no entity problems in XML.
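
One way to check that claim is to run both steps and hand the result to an XML parser. The sketch below assumes UTF-8 input, PHP 7.2+ with mbstring, and the SimpleXML extension. toXmlSafe and the <p> wrapper are hypothetical names used only for this illustration:

```php
<?php
// Hypothetical end-to-end helper: htmlentities(), then both conversion steps.
function toXmlSafe(string $text): string
{
    $flags = ENT_QUOTES | ENT_HTML401;
    $html  = htmlentities($text, $flags, 'UTF-8');

    // Step 1: named entities -> numeric entities.
    $html = preg_replace_callback('/&\w+;/', function (array $m) use ($flags): string {
        $char = html_entity_decode($m[0], $flags, 'UTF-8');
        // Unknown names come back unchanged; leave them alone.
        return $char === $m[0] ? $m[0] : '&#' . mb_ord($char, 'UTF-8') . ';';
    }, $html);

    // Step 2: any remaining non-ASCII character -> numeric entity.
    return preg_replace_callback('/[^\x00-\x7F]/u', function (array $m): string {
        return '&#' . mb_ord($m[0], 'UTF-8') . ';';
    }, $html);
}

$safe = toXmlSafe("caf\u{e9} \u{4e2d} & <b>");
echo $safe, "\n"; // prints "caf&#233; &#20013; &#38; &#60;b&#62;"

// The result is plain ASCII and parses back to the original text.
$xml = simplexml_load_string('<p>' . $safe . '</p>');
var_dump((string) $xml === "caf\u{e9} \u{4e2d} & <b>"); // prints "bool(true)"
```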

About the Author:

Hi! I’m Johannes Lauter a 25 year old Web Application Developer based in Berlin ... more
