You want to turn plain text into reasonably formatted HTML.
First, encode entities with htmlentities(). Then, transform the text into various HTML structures. The pc_text2html() function shown in Example 13-9 has basic transformations for links and paragraph breaks.
Example 13-9. pc_text2html()
function pc_text2html($s) {
    // Encode special characters before adding any markup
    $s = htmlentities($s);
    // Paragraphs are separated by blank lines (split() was removed in PHP 7)
    $grafs = explode("\n\n", $s);
    for ($i = 0, $j = count($grafs); $i < $j; $i++) {
        // Link to what seem to be http or ftp URLs
        $grafs[$i] = preg_replace('/((ht|f)tp:\/\/[^\s&]+)/',
                                  '<a href="$1">$1</a>', $grafs[$i]);
        // Link to email addresses; the outer group makes $1 the whole address
        $grafs[$i] = preg_replace('/([^@\s]+@([-a-z0-9]+\.)+[a-z]{2,})/i',
                                  '<a href="mailto:$1">$1</a>', $grafs[$i]);
        // Begin with a new paragraph
        $grafs[$i] = '<p>'.$grafs[$i].'</p>';
    }
    return implode("\n\n", $grafs);
}
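To see what the two link patterns capture once the text has been encoded, the following sketch runs them against a sample string. The input string and variable names are made up for illustration, and the email pattern is wrapped in an extra capturing group (an adjustment, so that $1 holds the whole address rather than only part of the domain).

```php
<?php
// Hypothetical sample input, not taken from the recipe.
$raw = 'See "http://example.com/a?x=1&y=2" or write to bob@example.com';

// Encode first: the double quotes become &quot; and & becomes &amp;
$s = htmlentities($raw);

// URL pattern from Example 13-9; [^\s&]+ stops at whitespace or an ampersand.
$url_pattern = '/((ht|f)tp:\/\/[^\s&]+)/';
// Email pattern with an outer group so that $1 is the full address.
$email_pattern = '/([^@\s]+@([-a-z0-9]+\.)+[a-z]{2,})/i';

preg_match($url_pattern, $s, $url);
preg_match($email_pattern, $s, $email);

echo $url[1], "\n";   // http://example.com/a?x=1
echo $email[1], "\n"; // bob@example.com
```

Because the URL match stops at the first ampersand, a trailing &quot; is kept out of the link, but a query string with more than one parameter is cut off at the encoded &amp;. If your text contains such URLs, you may want to loosen that character class.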
The more you know about what the plain text looks like, the better your HTML conversion can be. For example, if emphasis is indicated with asterisks (*) or slashes (/) around words, you can add rules that take care of that, as shown in Example 13-10.
Example 13-10. More text-to-HTML rules
$grafs[$i] = preg_replace('/(\A|\s)\*([^*]+)\*(\s|\z)/',
                          '$1<b>$2</b>$3', $grafs[$i]);
$grafs[$i] = preg_replace('{(\A|\s)/([^/]+)/(\s|\z)}',
                          '$1<i>$2</i>$3', $grafs[$i]);
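As a quick check of these rules, here is a minimal sketch that wraps them in a helper and applies it to one paragraph. The function name pc_add_emphasis() and the sample strings are hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper: applies the two emphasis rules to a single paragraph.
function pc_add_emphasis($graf) {
    // *word* becomes <b>word</b>
    $graf = preg_replace('/(\A|\s)\*([^*]+)\*(\s|\z)/', '$1<b>$2</b>$3', $graf);
    // /word/ becomes <i>word</i>
    $graf = preg_replace('{(\A|\s)/([^/]+)/(\s|\z)}', '$1<i>$2</i>$3', $graf);
    return $graf;
}

echo pc_add_emphasis('This is *very* important, /really/ it is.'), "\n";
// This is <b>very</b> important, <i>really</i> it is.
```

Each rule requires whitespace (or the start or end of the string) on both sides of the marked-up word, and the match consumes that whitespace. As a result, two emphasized words separated by a single space, such as "*a* *b*", come out with only the first one converted.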