Box drawing in PHP

Back in the “good old days” of MS-DOS, you could draw lines, boxes, filled areas (think progress bars), and more using the extended ASCII character set (AKA code page 437).

While writing a simple command-line utility in PHP, I wanted to use the full block (█) and light shade (░) characters to create a simple progress bar that is a bit nicer than the typical =========....................

Like this:

█████████████████░░░░░░░░░░░░░░░░░░░░░░░ 42.5%

Instinctively, I turned to PHP’s chr() function and looked up the extended ASCII codes for the characters I needed. Boy, was I disappointed when my progress bar turned out to be nothing but a series of useless question marks. Surely PHP can render simple ASCII characters, I thought.

It might have gone differently if I had been using a PC, but I do 100% of my development these days on a MacBook Pro. It so happens that terminals on most modern UNIX-like systems, including Linux and Mac OS X, default to UTF-8 encoding, not CP437. (Of course, your terminal font will need to include glyphs for these characters for this to work.)

So, I just needed to find the Unicode code points for the characters I needed and use those instead of the old familiar CP437 ones. However, the results weren’t much better.
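A byte dump makes the problem visible. The following is a minimal sketch, not part of the original utility; it assumes PHP 7.0 or later for the "\u{...}" escape, and the variable names are made up for the illustration:

```php
<?php
// Illustrative variable names; none of them come from the original utility.

// chr() always returns a single byte. 219 is the CP437 code for the full block.
$cp437Byte = chr(219);

// The same character as UTF-8, produced two ways.
$viaEntity = html_entity_decode('&#x2588;', ENT_NOQUOTES, 'UTF-8');
$viaEscape = "\u{2588}"; // Unicode escape syntax, assumes PHP 7.0+

echo bin2hex($cp437Byte), "\n"; // db (one byte, not valid UTF-8 on its own)
echo bin2hex($viaEntity), "\n"; // e29688 (three bytes)
echo bin2hex($viaEscape), "\n"; // e29688
```

A lone 0xDB byte is an incomplete UTF-8 sequence, which is why a UTF-8 terminal typically shows a question mark or replacement character instead of a block.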

After an hour or more of Googling and experimentation, I finally realized that chr() doesn’t do UTF-8: it only ever returns a single byte, and these characters each take three bytes in UTF-8. I found various suggestions on how to produce the desired characters using custom functions and the like, but the best way turned out to be a good ol’ HTML entity run through html_entity_decode:

$block = html_entity_decode('&#x2588;', ENT_NOQUOTES, 'UTF-8'); // full block (U+2588)
$shade = html_entity_decode('&#x2591;', ENT_NOQUOTES, 'UTF-8'); // light shade (U+2591)
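Putting those two characters to work, a progress bar like the one shown earlier could be built roughly as follows. This is only a sketch: the function name progressBar, its parameters, and the default width of 40 are assumptions made for the example, not code from the actual utility.

```php
<?php
// Hypothetical helper: renders a fixed-width text progress bar.
// $fraction is the completed share (0.0 to 1.0); $width is the bar length in characters.
function progressBar(float $fraction, int $width = 40): string
{
    $block = html_entity_decode('&#x2588;', ENT_NOQUOTES, 'UTF-8'); // full block
    $shade = html_entity_decode('&#x2591;', ENT_NOQUOTES, 'UTF-8'); // light shade

    // Clamp out-of-range input so str_repeat() never gets a negative count.
    $fraction = max(0.0, min(1.0, $fraction));
    $filled   = (int) round($fraction * $width);

    return str_repeat($block, $filled)
         . str_repeat($shade, $width - $filled)
         . sprintf(' %.1f%%', $fraction * 100);
}

echo progressBar(0.425), "\n";
```

In a real utility you would probably print the bar with a leading "\r" instead of a trailing "\n" so that each update overwrites the previous one on the same line.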

Now, a progress bar isn’t exactly a box, so let’s draw an actual one:

$tl = html_entity_decode('&#x2554;', ENT_NOQUOTES, 'UTF-8'); // top left corner
$tr = html_entity_decode('&#x2557;', ENT_NOQUOTES, 'UTF-8'); // top right corner
$bl = html_entity_decode('&#x255A;', ENT_NOQUOTES, 'UTF-8'); // bottom left corner
$br = html_entity_decode('&#x255D;', ENT_NOQUOTES, 'UTF-8'); // bottom right corner
$v  = html_entity_decode('&#x2551;', ENT_NOQUOTES, 'UTF-8'); // vertical wall
$h  = html_entity_decode('&#x2550;', ENT_NOQUOTES, 'UTF-8'); // horizontal wall

echo $tl . str_repeat($h, 15)  . $tr . "\n" .
     $v  . ' Hello, World! '   . $v  . "\n" .
     $bl . str_repeat($h, 15)  . $br . "\n";

Voilà!
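The hard-coded width of 15 only fits that one message. A more general version might look like the sketch below; the name boxLine is hypothetical, and the width calculation assumes every character in the text occupies exactly one terminal column (not true for wide characters such as CJK).

```php
<?php
// Hypothetical helper: wraps a single line of text in a double-line box.
function boxLine(string $text): string
{
    $decode = function (string $entity): string {
        return html_entity_decode($entity, ENT_NOQUOTES, 'UTF-8');
    };

    $tl = $decode('&#x2554;'); // top left corner
    $tr = $decode('&#x2557;'); // top right corner
    $bl = $decode('&#x255A;'); // bottom left corner
    $br = $decode('&#x255D;'); // bottom right corner
    $v  = $decode('&#x2551;'); // vertical wall
    $h  = $decode('&#x2550;'); // horizontal wall

    // Count code points, not bytes: the /u flag makes PCRE read the text as UTF-8.
    // The cast turns a failed match (invalid UTF-8) into a count of 0.
    $width = (int) preg_match_all('/./us', $text) + 2; // one space of padding per side

    return $tl . str_repeat($h, $width) . $tr . "\n"
         . $v  . ' ' . $text . ' '      . $v  . "\n"
         . $bl . str_repeat($h, $width) . $br . "\n";
}

echo boxLine('Hello, World!');
```

Counting with strlen() instead would measure bytes, so any non-ASCII text inside the box would push the right wall out of alignment.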

UTF-8, once you get used to it, actually provides many more possibilities than CP437 did back in the MS-DOS days. Enjoy!