Lifelesspeople.com


Different Languages in PHP

 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2582
Location: Mississauga, Ontario

Posted: Sat Mar 08, 2008 8:53 pm    Post subject: Different Languages in PHP

I am working on making my script multilingual. It is currently English-only, but people have asked to translate it into other languages.

First of all (although it is not a PHP issue), do I need to change the charset or something in the <head> for it to work with different languages?

Second, my regex currently only checks for a-zA-Z when validating letters. If I use \w (word character) instead, will it match letters from other languages correctly?
_________________
Tutorial Management Script - Version 1.3 Released
TutorialToday - Up and running, submit your tutorials!
Linux Tutorials - Coming Soon
 
krt
...


Joined: 11 Jan 2005
Posts: 4607
Location: Australia

Posted: Sat Mar 08, 2008 10:10 pm

I'd use Unicode (UTF-8), and both send an HTTP header and use a meta tag for the best cross-browser compatibility. Some browsers don't cope with just one of the two.

In PHP: header("Content-Type: text/html; charset=utf-8");
(obviously changing text/html if needed)

Then mimicking this in HTML:
Code:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

By default, \w in regex only matches [a-zA-Z0-9_], although PCRE can extend this when locale-specific matching is in effect. Note that "word characters" include digits and the underscore, which most people don't expect.
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2582
Location: Mississauga, Ontario

Posted: Sun Mar 09, 2008 8:39 am

How would I go about validating titles with regex if they are in a different language?
 
Pie32
Not Banned


Joined: 17 Mar 2005
Posts: 1411
Location: Lost in 84

Posted: Sun Mar 09, 2008 8:52 am

I'm not sure what version of PHP 6 we have on L2P servers, but that's something you may want to look into. One of the big features they are working on for it is support for languages with non-Latin characters.
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2582
Location: Mississauga, Ontario

Posted: Sun Mar 09, 2008 9:19 am

Pie32 wrote:
I'm not sure what version of PHP 6 we have on L2P servers, but that's something you may want to look into. One of the big features they are working on for it is support for languages with non-Latin characters.


This won't just be for my site, so it will be on other servers where they will most likely have PHP 4.
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7015
Location: The cheese is made out of moon

Posted: Tue Mar 11, 2008 7:43 am

You could use \xhh escapes in the pattern, replacing hh with the hex code of the character you want to allow.
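
A rough sketch of that idea, assuming the titles arrive as single-byte ISO-8859-1 (Latin-1) text. The range \xC0-\xFF covers the accented Latin-1 letters (it also lets through the multiplication and division signs at \xD7 and \xF7). The function name is made up for illustration.

```php
<?php
// Hypothetical helper: accepts ASCII letters, digits, spaces and the
// Latin-1 accented letters, written as \x hex escapes in the character class.
// Assumes $title is ISO-8859-1 encoded (one byte per character).
function is_valid_latin1_title($title)
{
    // Single quotes: PHP leaves \xC0 untouched and PCRE interprets the escape.
    return preg_match('/^[a-zA-Z0-9 \xC0-\xFF]+$/', $title) === 1;
}

// "\xE9" is e-acute in Latin-1 (double quotes so PHP builds the raw byte).
var_dump(is_valid_latin1_title("Caf\xE9 tutorial")); // bool(true)
var_dump(is_valid_latin1_title("Bad <title>"));      // bool(false)
```

This only works for one single-byte encoding at a time, so every language with a different character set would need its own ranges.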
_________________
Quote:

<bart416> I just realized something
<bart416> we celebrate the fact that this piece of rock made one rotation around a glowing ball of plasma that is kept together due to its own gravity well
<njsg> HAPPY NEW YEAR
<Easter> ^^
 
leontius
Novice Poster


Joined: 26 Mar 2008
Posts: 2


Posted: Wed Mar 26, 2008 5:04 am

What '\w' matches in regex depends on the locale set on the server. The PHP 5 manual says:

Quote:
A "word" character is any letter or digit or the underscore character, that is, any character which can be part of a Perl "word". The definition of letters and digits is controlled by PCRE's character tables, and may vary if locale-specific matching is taking place. For example, in the "fr" (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w.


So you can probably use regular expressions in combination with the setlocale() function.
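
A minimal sketch of how that combination might look. The locale names are assumptions (they differ between systems), the helper name is hypothetical, and whether the accented byte matches depends on which locale the server actually has installed.

```php
<?php
// Hypothetical helper: switches LC_CTYPE to the first available locale in
// $locales, tests $text against \w+, then restores the previous locale.
function is_word_in_locale($text, $locales)
{
    $previous = setlocale(LC_CTYPE, '0');        // '0' only queries the current setting
    $active   = setlocale(LC_CTYPE, $locales);   // false if none of them is installed
    $matched  = preg_match('/^\w+$/', $text) === 1;
    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);          // put the old locale back
    }
    return array('locale' => $active, 'matched' => $matched);
}

// "caf\xE9" is "cafe" with e-acute in ISO-8859-1.
$result = is_word_in_locale("caf\xE9", array('fr_FR.ISO-8859-1', 'fr_FR', 'fr'));
var_dump($result);
```

If 'locale' comes back as false, none of the requested locales exists on the server and \w falls back to whatever locale was already active.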
_________________
Leap On!
 
ClickFanatic
Est. 2005


Joined: 18 Jan 2005
Posts: 3857


Posted: Wed Mar 26, 2008 7:24 am

The problem with locales is that they have to be present on the server. Some servers simply have the en_US locale and nothing else.

But as I don't really see alternatives (unless you want to manually create a regex for every language, using \x## patterns) it's probably worth the risk.
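
One possible alternative that avoids server locales, sketched under two assumptions: the titles are UTF-8 encoded, and the server's PCRE was built with UTF-8 and Unicode property support (not guaranteed on older PHP 4 hosts). The /u modifier treats pattern and subject as UTF-8, \p{L} matches any letter and \p{N} any number. The function name is illustrative.

```php
<?php
// Hypothetical helper: accepts letters and digits from any script, plus spaces.
// Assumes $title is UTF-8 and PCRE has Unicode property support.
function is_valid_utf8_title($title)
{
    // preg_match returns false (not 0) when the subject is not valid UTF-8,
    // so comparing against 1 rejects malformed input as well.
    return preg_match('/^[\p{L}\p{N} ]+$/u', $title) === 1;
}

// "Gruesse aus Wien" with u-umlaut and sharp s, written as UTF-8 bytes.
var_dump(is_valid_utf8_title("Gr\xC3\xBC\xC3\x9Fe aus Wien")); // bool(true)
var_dump(is_valid_utf8_title("Bad <title>"));                  // bool(false)
```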
_________________
Captain Jell-O Buster from the Future
 
linuxdoctor
Infallible Persona


Joined: 23 Apr 2005
Posts: 1203
Location: Ottawa, Canada

Posted: Wed Mar 26, 2008 8:45 am

The e107 content management system has an interesting way to handle different languages. It's written entirely in PHP, so perhaps this is something you might want to look at.

http://e107.org
http://e107coders.org

To quickly describe what they do for different languages.

First they put all of the text that is to be output in strings in different language files.
Code:

/* This is for English and is placed in English/lang.php */

define("GOOD_MORNING", "Good morning.");
define("GOOD_BYE", "See you.");

/* This is for French and is placed in French/lang.php */

define("GOOD_MORNING", "Bonjour.");
define("GOOD_BYE", "Au revoir.");

/* This is for German and is placed in German/lang.php */

define("GOOD_MORNING", "Guten Morgen.");
define("GOOD_BYE", "Auf Wiedersehen.");


Notice how each of these files is called 'lang.php' but sits in a different directory named after the language. This lets you have multiple language files, all in the same place per language. It may also be convenient to put all these directories in a directory of their own, called 'languages' for instance.

Second, in your application configuration file, have two variables to deal with the languages: the first is the path to where the language directories are located and the second is the name of the language.

Code:

/* config.php entry for languages */

$LANGUAGES_DIRECTORY = "languages";
$DEFAULT_LANGUAGE = "German";


So, whatever language you want to set as default, just change the '$DEFAULT_LANGUAGE' variable. In your app, you may want to have this as a variable per-user in the database. So, when the user logs in, you can simply assign this variable whatever his default language is.

Now the third step is where the magic begins. In your app, load 'config.php' and then load the particular language you want. Then, whenever you want to output a string, use the constant for the text from the 'lang.php' file rather than the actual text. Magically, your chosen language will be used.

Code:

/* app.php */

// Include anything you need to have done first
include('config.php');
// Include anything else you might need here, for example code that changes
// the default language from a cookie or the database.
// Now include the default language:
include($LANGUAGES_DIRECTORY . '/' . $DEFAULT_LANGUAGE . '/lang.php');

// Now use it.

echo GOOD_MORNING;

echo GOOD_BYE;


The output will be in your selected language. The e107 documentation and code will give you the exact implementation. I've necessarily only sketched out how they do it very briefly. It is actually done very elegantly and that is what good programmers always strive to accomplish.

This example should work, with the possible exception of having to make $LANGUAGES_DIRECTORY an absolute path rather than a relative one. As written, the languages directory needs to be in the same directory as the main 'app.php'.
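
A small sketch of the per-user language choice and the absolute path, under some assumptions: the cookie name 'language', the list of available languages and the helper name are all hypothetical. Because the language name ends up in an include path, it is only accepted if it appears in a list of known languages.

```php
<?php
// Hypothetical helper: returns the path of the lang.php file to include.
// $requested may come from a cookie or the database, so it is checked
// against the list of known languages before it is used in a path.
function language_file($baseDir, $requested, $available, $default)
{
    $language = in_array($requested, $available, true) ? $requested : $default;
    return $baseDir . '/' . $language . '/lang.php';
}

// Absolute path to a "languages" directory next to this file.
$LANGUAGES_DIRECTORY = dirname(__FILE__) . '/languages';
$DEFAULT_LANGUAGE    = 'German';
$available           = array('English', 'French', 'German');

$requested = isset($_COOKIE['language']) ? $_COOKIE['language'] : $DEFAULT_LANGUAGE;
$file      = language_file($LANGUAGES_DIRECTORY, $requested, $available, $DEFAULT_LANGUAGE);

if (is_file($file)) {
    include $file;   // defines GOOD_MORNING, GOOD_BYE, ...
}
```

dirname(__FILE__) is used instead of a relative path so the include does not depend on the current working directory.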
_________________
Misanthrope: someone who realizes that humans really are as stupid as they appear.

If you think I'm 'politically' incorrect you have the wrong politics.
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7015
Location: The cheese is made out of moon

Posted: Wed Mar 26, 2008 10:45 am

That's a good idea until you start with an extensive template system...
 
ClickFanatic
Est. 2005


Joined: 18 Jan 2005
Posts: 3857


Posted: Wed Mar 26, 2008 11:49 am

Defining constants is quite a good solution, but not very elegant in my opinion. Especially when the number of different phrases becomes really big (consider this constant USER_PASSWORD_CHARACTER_MAXLENGTH_EXCEEDED).

Also, if the translations are outdated, or certain phrases remain untranslated, what will happen? You will see the constant (which is undefined) in plain text. Not really nice.
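
With the constant approach, one way to soften the untranslated-phrase problem is to look constants up by name and fall back to a default. This is only a sketch; the helper name lang() is made up. (On PHP 8, using an undefined constant directly is a fatal Error rather than plain text, so some guard like this matters even more there.)

```php
<?php
// Hypothetical helper: returns the value of a language constant, or a
// fallback (the supplied default, else the constant's name) if it is undefined.
function lang($name, $fallback = null)
{
    if (defined($name)) {
        return constant($name);
    }
    return $fallback !== null ? $fallback : $name;
}

define('GOOD_MORNING', 'Guten Morgen.');

echo lang('GOOD_MORNING'), "\n";               // Guten Morgen.
echo lang('GOOD_NIGHT', 'Good night.'), "\n";  // Good night.
```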

A better solution is the PHP implementation of gettext: http://savannah.nongnu.org/projects/php-gettext/

It is used in WordPress, among others. It basically works like this:
Throughout the code the programmer can use the default language to write messages. For example: Your password exceeds the maximum length

To make it possible to translate this text, it is wrapped in a function call.
Code:
<?php _e('Your password exceeds the maximum length'); ?>


What this function does is rather simple. It looks in the loaded language file for a translation of the given string. If one exists, it is used; if not, the default string (which is of course the string passed to the function) is used instead. In WordPress, _e() prints the result itself, so no separate echo is needed, while __() returns it as a string.
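
The lookup-with-fallback idea can be sketched in a few lines. This is not how php-gettext or WordPress implement it internally; the array stands in for a loaded language file and the function name is hypothetical.

```php
<?php
// Hypothetical stand-in for a loaded language file: original => translation.
$translations = array(
    'Your password exceeds the maximum length' => 'Uw wachtwoord overschrijdt de maximale lengte',
);

// Returns the translation if one exists, otherwise the original string.
function translate_or_default($text, $translations)
{
    return isset($translations[$text]) ? $translations[$text] : $text;
}

echo translate_or_default('Your password exceeds the maximum length', $translations), "\n";
echo translate_or_default('This phrase has no translation yet', $translations), "\n";
```

The second call prints the English text unchanged, which is the behaviour that makes missing translations harmless.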

Translations are stored in gettext format, which is very useful because software exists to easily edit such files.
 
linuxdoctor
Infallible Persona


Joined: 23 Apr 2005
Posts: 1203
Location: Ottawa, Canada

Posted: Wed Mar 26, 2008 4:29 pm

That could be a good way. The particular gettext you pointed to is not the GNU gettext but something different. PHP supports the standard GNU gettext.

I'm not that familiar with WordPress but if it does use gettext it would most likely be the GNU version.

The biggest disadvantage is the requirement to use .po files. These files have a fairly complex structure with a learning curve, which might be somewhat of a deterrent for people not familiar with Unix conventions.

Another disadvantage is speed. Gettext in PHP is slow. Everything in PHP is slow. PHP is an interpreted language. First the Zend compiler reads all the source files in and converts them into an intermediate form before executing it. The macro implementation is very fast since the translation is done in situ by the Zend compiler as the source is read in. Using gettext would not cause the translations to actually be done until the execution phase, and the net result might be a noticeable slowness in response or a jerkiness in rendering pages.

While gettext provides the most general solution, it was designed for programs compiled to object code. In PHP it is cumbersome and slow. In my view, that hardly qualifies it as elegant.
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7015
Location: The cheese is made out of moon

Posted: Thu Mar 27, 2008 3:05 am

There is a fast way to use GNU Gettext in php.
You might want to read this. I've used it before and it's quite fast. The only issue is that most servers don't have it compiled. Haven't checked if LLP has it. And if it doesn't I don't think Trel would mind installing it. Oh yeah keep in mind the documentation of this module is terrible in several ways. If I find some time I might write a small tutorial on it.
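
For reference, a minimal sketch of how the native gettext extension is typically called. The text domain "myapp", the locale directory and the locale names are assumptions, and the wrapper simply returns the original string when the extension is not compiled in.

```php
<?php
// Hypothetical wrapper: uses the native gettext extension when available,
// otherwise returns the untranslated string.
function native_translate($text)
{
    if (!function_exists('gettext')) {
        return $text;   // extension not compiled in
    }
    return gettext($text);
}

if (function_exists('gettext')) {
    // LC_MESSAGES is not defined on every platform, so fall back to LC_ALL.
    $category = defined('LC_MESSAGES') ? LC_MESSAGES : LC_ALL;
    setlocale($category, 'de_DE.UTF-8', 'de_DE');
    // Expects compiled catalogs at ./locale/de_DE/LC_MESSAGES/myapp.mo
    bindtextdomain('myapp', dirname(__FILE__) . '/locale');
    textdomain('myapp');
}

echo native_translate('Your password exceeds the maximum length'), "\n";
```

When no catalog is found for the active locale, gettext() hands back the string it was given, so the page still renders in the default language.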

And if the site is intended to work under extremely high loads, you could always use a Squid cache server combined with an accelerator.
 
linuxdoctor
Infallible Persona


Joined: 23 Apr 2005
Posts: 1203
Location: Ottawa, Canada

Posted: Thu Mar 27, 2008 7:21 am

You're sort of making my point for me. I agree with you that gettext is a very general solution to the problem of multi-lingual translations for applications. Gettext is widely used in Linux apps and there is a lot to say for it.

However, there are also weaknesses and we've pointed them out. Another weakness with a lot of GNU software, including gettext, is that in its endeavour to be general it also becomes extremely complex. Some critics have said needlessly complex. On the other hand, being less complex would also mean being less general and less complete, requiring the developer to add other libraries and modules to supply the functionality that a more general solution would have included. In the end, the application becomes even more complex than it otherwise would have been, and almost certainly much larger and slower.

What I suggested was a simple approach. Certainly it's not the best, but it's simple and it works. While one programmer is spending a lot of time trying to get gettext to work, another programmer could be up and running with my proposed solution.
 
ClickFanatic
Est. 2005


Joined: 18 Jan 2005
Posts: 3857


Posted: Thu Mar 27, 2008 11:47 am

The PHP gettext that I referred to is indeed not the GNU gettext extension that may be installed on servers.
Portability would be a valid reason to use the PHP gettext implementation instead of the GNU gettext extension for PHP. WordPress doesn't rely on the PHP extension, but uses a modified version of PHP gettext.
 
All times are GMT - 6 Hours

 
Powered by phpBB © 2001, 2002 phpBB Group
All trademarks and copyrights on this page are owned by their respective owners. Posts and comments are owned by the poster. Everything else © 2001 - 2007 Lifelesspeople.com