One annoying scenario is when you let users enter their names and then need to output those names nicely, for example in a newsletter. Some users enter their names entirely in uppercase or lowercase, but obviously you can’t address them that way. PHP’s ucfirst() and ucwords() functions, on the other hand, are too naive for proper capitalization.
Let’s consider a few use cases:
| Original | `ucwords(strtolower($str))` | Proper |
|---|---|---|
| michael o’carrol | Michael O’carrol | Michael O’Carrol |
| lucas l’amour | Lucas L’amour | Lucas l’Amour |
| george d’onofrio | George D’onofrio | George d’Onofrio |
| william stanley iii | William Stanley Iii | William Stanley III |
| UNITED STATES OF AMERICA | United States Of America | United States of America |
| t. von lieres und wilkau | T. Von Lieres Und Wilkau | T. von Lieres und Wilkau |
| paul van der knaap | Paul Van Der Knaap | Paul van der Knaap |
| jean-luc picard | Jean-luc Picard | Jean-Luc Picard |
| JOHN MCLAREN | John Mclaren | John McLaren |
| hENRIC vIII | Henric Viii | Henric VIII |
| VAsco da GAma | Vasco Da Gama | Vasco da Gama |
You get the picture.
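To see the middle column for yourself, a minimal sketch along these lines should reproduce a few of the rows. The `$names` and `$naive` variables are just illustrative names, not part of any library.

```php
<?php
// A few inputs taken from the table above.
$names = array(
    'jean-luc picard',
    'william stanley iii',
    'JOHN MCLAREN',
    'UNITED STATES OF AMERICA',
);

// The naive approach: lower-case everything, then capitalise after whitespace.
$naive = array_map(function ($name) {
    return ucwords(strtolower($name));
}, $names);

foreach ($naive as $line) {
    echo $line, PHP_EOL;
}
// Jean-luc Picard
// William Stanley Iii
// John Mclaren
// United States Of America
```

By default ucwords() only treats whitespace as a word boundary, which is why the hyphenated name and the Roman numeral come out wrong.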
To make this work we need to observe three things:
- Words are separated not just by spaces, but also by hyphens and apostrophes.
- Some words (especially “of” variations in different languages) must always be lower case.
- Conversely, some words, like Roman numerals, must always be upper case.
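The first observation can be partly covered by the optional second argument of ucwords(), which lists the characters that count as word separators (available in PHP 5.5.16 and later). The sketch below assumes straight apostrophes in the input, and `$delimiters` is just an illustrative variable name. It also shows that the other two observations still need separate handling.

```php
<?php
// Treat space, hyphen and straight apostrophe as word separators.
$delimiters = " -'";

echo ucwords(strtolower("jean-luc picard"), $delimiters), PHP_EOL;    // Jean-Luc Picard
echo ucwords(strtolower("michael o'carrol"), $delimiters), PHP_EOL;   // Michael O'Carrol

// Lower-case and upper-case exceptions are not handled this way:
echo ucwords(strtolower("paul van der knaap"), $delimiters), PHP_EOL; // Paul Van Der Knaap
echo ucwords(strtolower("hENRIC vIII"), $delimiters), PHP_EOL;        // Henric Viii
```

Note that a bare apostrophe as a separator capitalises after every apostrophe, so "l'amour" becomes "L'Amour" rather than "l'Amour", and a word like "don't" would turn into "Don'T".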
This is what I came up with:
```php
function titleCase($string) {
    // Delimiters after which the next letter gets capitalised.
    $word_splitters = array(' ', '-', "O'", "L'", "D'", 'St.', 'Mc');
    // Words (and prefixes) that always stay lower case.
    $lowercase_exceptions = array('the', 'van', 'den', 'von', 'und', 'der', 'de', 'da', 'of', 'and', "l'", "d'");
    // Words that are always written fully in upper case, such as Roman numerals.
    $uppercase_exceptions = array('III', 'IV', 'VI', 'VII', 'VIII', 'IX');

    $string = strtolower($string);
    foreach ($word_splitters as $delimiter) {
        $words = explode($delimiter, $string);
        $newwords = array();
        foreach ($words as $word) {
            if (in_array(strtoupper($word), $uppercase_exceptions)) {
                $word = strtoupper($word);
            } elseif (!in_array($word, $lowercase_exceptions)) {
                $word = ucfirst($word);
            }
            $newwords[] = $word;
        }

        // Prefixes such as "L'" are themselves written in lower case.
        if (in_array(strtolower($delimiter), $lowercase_exceptions)) {
            $delimiter = strtolower($delimiter);
        }

        $string = implode($delimiter, $newwords);
    }

    return $string;
}
```
This should work for most cases. I did not test it with non-Latin alphabets; keep in mind that strtolower(), strtoupper() and ucfirst() are not multibyte-aware.
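For names with accented or other non-ASCII letters, one possible direction is to swap in the mbstring functions. The sketch below is an assumption on my part, not a tested extension of the function above: it requires the mbstring extension and PHP 7 or later, assumes UTF-8 input, and `mbUcfirst` is a hypothetical helper name. It only splits on spaces to keep the example short.

```php
<?php
// Hypothetical helper: a multibyte-aware stand-in for ucfirst().
function mbUcfirst(string $word, string $encoding = 'UTF-8'): string
{
    $first = mb_substr($word, 0, 1, $encoding);    // first character, not first byte
    $rest  = mb_substr($word, 1, null, $encoding); // everything after it
    return mb_strtoupper($first, $encoding) . $rest;
}

// Lower-case with mb_strtolower() so that letters like É and Å are converted too.
$lower  = mb_strtolower('ÉLODIE ÅKESSON', 'UTF-8');
$result = implode(' ', array_map('mbUcfirst', explode(' ', $lower)));

echo $result, PHP_EOL; // Élodie Åkesson
```

The same substitution (mb_strtolower() for strtolower(), mb_strtoupper() for strtoupper(), and a helper like this for ucfirst()) could in principle be applied inside titleCase(), but I have not verified that against a wider set of alphabets.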
Wow. I just had a client meeting where she complained about people entering all caps and making her pretty website look ugly. I can’t wait to give this a try! Thanks!