The Files are Not Actually There

In the .htaccess file in the root folder, I have the following:

RewriteRule ^(fr|ja|zh-CN|de|it|ko|pt|es|ar|ru)/.*$ scripts/analytics/translatedoc.php [L]

This way, the following languages are supported so far:

  • fr
  • ja
  • zh-CN
  • de
  • it
  • ko
  • pt
  • es
  • ar
  • ru

Many more need to be added, which I will do later.
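To make it concrete which requests the rule catches, here is a minimal sketch in Python (not the PHP the site runs on) that applies the same pattern to a few paths. It assumes the usual .htaccess behaviour, where the leading slash is stripped before matching; the function name language_of is only for illustration.

```python
import re

# Same pattern as the RewriteRule above. In a per-directory (.htaccess)
# context Apache matches against the path without its leading slash.
RULE = re.compile(r"^(fr|ja|zh-CN|de|it|ko|pt|es|ar|ru)/.*$")


def language_of(path):
    """Return the language prefix if the rule would rewrite this path, else None."""
    match = RULE.match(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in ("fr/20080105_hello_world.htm", "zh-CN/", "archives/x.htm", "zh-cn/x.htm"):
        print(sample, "->", language_of(sample))
```

Note that the match is case-sensitive because the rule has no [NC] flag, so zh-cn/ would not be rewritten.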

Links to the Pages

The links to the translated articles are on the individual archive pages under /archives/ on this blog. They currently sit in the rightmost corner of each page. Here is the code that makes it happen:

<table width="100%" border="0"><tr><td>
<dl>
<dt>Other Languages</dt>
<dd><a href="http://<mt:BlogHost/>/zh-CN/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Chinese</a></dd>
<dd><a href="http://<mt:BlogHost/>/ja/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Japanese</a></dd>
<dd><a href="http://<mt:BlogHost/>/fr/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">French</a></dd>
<dd><a href="http://<mt:BlogHost/>/de/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">German</a></dd>
<dd><a href="http://<mt:BlogHost/>/it/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Italian</a></dd>
<dd><a href="http://<mt:BlogHost/>/ko/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Korean</a></dd>
<dd><a href="http://<mt:BlogHost/>/pt/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Portuguese</a></dd>
<dd><a href="http://<mt:BlogHost/>/ru/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Russian</a></dd>
<dd><a href="http://<mt:BlogHost/>/ar/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Arabic</a></dd>
<dd><a href="http://<mt:BlogHost/>/es/<mt:EntryCreatedDate format="%Y%m%d"/>_<mt:EntryTitle dirify="1"/>.htm">Spanish</a></dd>
</dl>
</td></tr></table>
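Every link in the template follows the same shape: host, language code, creation date as YYYYMMDD, an underscore, the dirified title, and .htm. As a rough sketch in Python (not the template language itself), the same URL can be built like this. The approx_dirify helper is my own simplified stand-in for Movable Type's dirify filter, and may differ from it for titles with accents or HTML in them.

```python
import re
from datetime import date


def approx_dirify(title):
    """Simplified approximation of dirify="1": lowercase, drop punctuation,
    and turn runs of whitespace into underscores."""
    cleaned = re.sub(r"[^\w\s-]", "", title.lower(), flags=re.ASCII)
    return re.sub(r"\s+", "_", cleaned.strip())


def translated_url(host, lang, created, title):
    """Build a link in the same shape as the template above."""
    return "http://{}/{}/{}_{}.htm".format(
        host, lang, created.strftime("%Y%m%d"), approx_dirify(title)
    )


if __name__ == "__main__":
    # example.com and the title are made-up sample values.
    print(translated_url("example.com", "fr", date(2008, 1, 5), "Hello, World!"))
```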

The Code Itself

Now that the facade is ready, we can work on the real page.

Storage of the Translated Documents

They are stored in a hidden folder called /translate/, outside the /public_html/home/ directory. It contains one subfolder per language, each holding the raw data of the translated files.
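One way to picture the lookup is the sketch below, written in Python rather than the site's PHP. It assumes that the stored file has the same name as the last part of the requested URL, which the layout above suggests but does not guarantee; TRANSLATE_ROOT and raw_document_path are hypothetical names.

```python
from pathlib import PurePosixPath

SUPPORTED = {"fr", "ja", "zh-CN", "de", "it", "ko", "pt", "es", "ar", "ru"}

# Assumption: the hidden storage folder, written here as an absolute path.
TRANSLATE_ROOT = PurePosixPath("/translate")


def raw_document_path(request_path):
    """Map a request such as /fr/20080105_hello_world.htm to the raw file
    under /translate/fr/, or return None if the request is not acceptable."""
    parts = request_path.strip("/").split("/")
    if len(parts) != 2:
        return None  # reject nested paths, including ../ tricks
    lang, name = parts
    if lang not in SUPPORTED or name in ("", ".", ".."):
        return None
    return TRANSLATE_ROOT / lang / name


if __name__ == "__main__":
    print(raw_document_path("/fr/20080105_hello_world.htm"))
```

Rejecting anything that is not exactly language/filename keeps a crafted URL from reaching files outside the language folders.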

The Article class then does the plumbing work of extracting clean data from those files, so that the rendering page receives a clean title and a clean body.

Requirement for the Raw Documents

To simplify programming, the following is required:

  • <title></title> must be clean, since that is where I get the new title.
  • There should be only one <h1></h1> pair in the entire document, or to be more exact, only one </h1>, since that is treated as the starting position of the body.
  • There must be one <span class="post">, exactly as written, to mark the end of the body section.
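These three requirements translate almost directly into code. Below is a minimal sketch in Python of how the title and body could be pulled out. It is not the actual Article class, which is written in PHP and not shown here; the function name extract_article and the sample document are made up for illustration.

```python
import re

BODY_START = "</h1>"                # the body begins right after this
BODY_END = '<span class="post">'    # the body ends right before this


def extract_article(raw_html):
    """Return (title, body) from a raw translated document."""
    title_match = re.search(r"<title>(.*?)</title>", raw_html, re.DOTALL | re.IGNORECASE)
    title = title_match.group(1).strip() if title_match else ""

    start = raw_html.find(BODY_START)
    if start == -1:
        raise ValueError("no </h1> found, cannot locate start of body")
    start += len(BODY_START)

    end = raw_html.find(BODY_END, start)
    if end == -1:
        raise ValueError('no <span class="post"> found, cannot locate end of body')

    return title, raw_html[start:end].strip()


if __name__ == "__main__":
    sample = (
        "<html><head><title> Bonjour </title></head><body>"
        "<h1>Bonjour</h1><p>Texte</p>"
        '<span class="post">footer</span></body></html>'
    )
    print(extract_article(sample))
```

Because the markers are found with a plain string search, a second </h1> or a differently written span tag would silently shift the body boundaries, which is why the raw documents have to follow the rules so strictly.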