GSoC 2011 - Hyphenation

From AbiWiki

Current revision as of 15:00, 21 August 2011

Summary of what I have done in GSoC 2011

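My GSoC 2011 work lives in two SVN branches:

AbiWord branch (gsoc2011hyphenation): http://svn.abisource.com/svnroot/abiword/branches/gsoc2011hyphenation

Enchant branch (gsoc2011enchant): http://svn.abisource.com/svnroot/enchant/branches/gsoc2011hyphenation

So far, my GSoC 2011 work consists of the following seven parts: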

  • 1. How to support more languages
    • How to support more languages in ISpell
    • How to support more languages in MySpell
  • 2. How to extend the Enchant function
  • 3. Hyphenation module in Enchant
    • Read and fully understand the Enchant source code
    • Reuse Enchant's abstraction layer and add a hyphenation function to it, so that more languages can be added easily
    • Handle more languages
    • Add five backend implementations: ispell, myspell, zemberek, voikko, and uspell
    • Work with the spell-checking module
  • 4. Call the hyphenation function in AbiWord
    • Find split information using enchant_dict_hyphenate
    • Split a text run so that a word that passes the line width can be broken while keeping its formatting
    • Handle user operations (select, delete, cut, paste)
    • Let the user choose whether to enable hyphenation
  • 5. Simple implementation of Chinese spell-checking in Enchant
    • Add a simple spell-check framework for Chinese in Enchant
    • Add a supporting library
    • Survey existing work on Chinese spell-checking
  • 6. Code refactoring and debugging
    • Refactor the code to keep it flexible
    • Debug coding problems
  • 7. User interface to manage hyphenation
    • Windows, Linux, and Cocoa


The details:

How to reuse my work

The SVN branches for my GSoC 2011 work are here:

  • AbiWord: http://svn.abisource.com/svnroot/abiword/branches/gsoc2011hyphenation
  • Enchant: http://svn.abisource.com/svnroot/enchant/branches/gsoc2011hyphenation
  • I have also created two patch files containing all my GSoC 2011 code:

Chenxiajian_enchant.diff

Chenxiajian_abiword.diff

  • Chenxiajian_enchant.diff contains my work in the Enchant framework, which provides an abstraction layer for hyphenation that AbiWord can use.
  • Chenxiajian_abiword.diff contains the concrete AbiWord changes that call the hyphenation function.


  • How to use?

You can apply the diff files to an SVN checkout.

How to support more languages

As mentioned before, we use Enchant to support more languages, through five backends. Take ISpell and MySpell as examples.

In the folder “abiword\msvc2008\Debug\” there are two hyphenation folders, ISpell and MySpell, and two folders for their dictionaries. Screenshot: http://hi.csdn.net/attachment/201108/21/65043_13139186583czu.png

How to support more languages in ISpell

Open the ISpell folder and you will see the folder “language”. Copy your language's hyphenation dictionary into it, and AbiWord will support hyphenation for that language.

We currently support de, en, es, and fr.

How to support more languages in MySpell

As with ISpell, copy your language's hyphenation dictionary into the myspell folder.

How to extend the Enchant function

I have read much of the Enchant code, and I think Enchant is a very useful framework for any function that needs dictionaries, such as spell-checking and hyphenation. To extend Enchant with a new function, we need to do the following:

  • 1. Add a concrete function pointer to EnchantDict. Something like:
           char *(*hyphenate) (struct str_enchant_dict * me,
                         const char *const word, size_t len,
                         size_t * out_n_suggs);
  • 2. Implement the function in the backend:
           static char *
           ispell_dict_hyphenate (EnchantDict * me, const char *const word,
                   size_t len, size_t * out_n_suggs)
           {
                  ISpellChecker * checker;
                  checker = (ISpellChecker *) me->user_data;
                  return checker->hyphenate (word, len, out_n_suggs);
           }

  • 3. Connect the implementation to the dictionary (one line per backend):
           dict->hyphenate = ispell_dict_hyphenate;
           dict->hyphenate = hspell_dict_hyphenate;
           dict->hyphenate = zemberek_dict_hyphenate;
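
The three steps above can be sketched in isolation. This is only an illustration, not the real Enchant code: MockDict, mock_backend_hyphenate and mock_dict_hyphenate are hypothetical names, and the "backend" simply breaks the word in the middle so that the dispatch through the function pointer is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for EnchantDict: only the fields this sketch needs.
struct MockDict {
    void *user_data;
    char *(*hyphenate)(MockDict *me, const char *word, size_t len);  // step 1: the slot
};

// Step 2: a hypothetical backend. It breaks the word in the middle,
// which is NOT real hyphenation; it only produces something to return.
static char *mock_backend_hyphenate(MockDict *me, const char *word, size_t len)
{
    (void) me;
    assert(word != NULL);
    size_t cut = len / 2;
    char *out = static_cast<char *>(std::malloc(len + 2));  // word + '-' + NUL
    if (out == NULL)
        return NULL;
    std::memcpy(out, word, cut);
    out[cut] = '-';
    std::memcpy(out + cut + 1, word + cut, len - cut);
    out[len + 1] = '\0';
    return out;
}

// Front-end wrapper: dispatches through the slot and returns NULL
// when no backend has been connected (step 3 was never done).
static char *mock_dict_hyphenate(MockDict *dict, const char *word)
{
    if (dict == NULL || word == NULL || dict->hyphenate == NULL)
        return NULL;
    return dict->hyphenate(dict, word, std::strlen(word));
}
```

A caller would set `dict.hyphenate = mock_backend_hyphenate;` once (step 3) and afterwards only call the wrapper, freeing the result with free(). The NULL check in the wrapper is what lets a backend such as HSpell simply leave the slot empty.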

Hyphenation module in Enchant

Add hyphenation function in Enchant

First, I added a hyphenation method to Enchant.

I think hyphenation can be combined with spell-checking, which keeps the code flexible. I propose defining the hyphenation functions as follows:

    EnchantDict* enchant_broker_request_dict (EnchantBroker* broker,
                                              const char *const lang); // same as spell-checking
    char *enchant_dict_hyphenate (EnchantDict *dict, const char *const word, size_t len);

To implement this in the abstraction layer, we add a hyphenation function to EnchantDict as a function pointer, something like:

    char* (*hyphenate) (struct str_enchant_dict * me,
                        const char *const word, size_t len,
                        size_t * out_n_suggs);

The function is implemented by the backend. Take “ispell” as an example:

    static char * ispell_dict_hyphenate (EnchantDict * me, const char *const word,
                   size_t len, size_t * out_n_suggs)
    {
           ISpellChecker * checker;
           checker = (ISpellChecker *) me->user_data;
           return checker->hyphenate (word, len, out_n_suggs);
    }

Finally, we connect the implementation in each backend:

    dict->hyphenate = ispell_dict_hyphenate;
    dict->hyphenate = hspell_dict_hyphenate;
    dict->hyphenate = zemberek_dict_hyphenate;

Add five backends to support hyphenation

The backends are ispell, myspell, zemberek, voikko, and uspell. The hyphenation libraries behind them are:

   Hunspell: uses a separate dictionary, such as hyph_en_us.dic; dictionaries can be downloaded from the internet
   Libhyphenation: the dictionary is provided by the author, so coverage is sometimes limited
   Zemberek: for Turkish
   Voikko: for Finnish

The changes:

  • 1. Removed hookups that are not needed, such as HSpell
  • 2. Added hunspell (myspell) hyphenation code
  • 3. Implemented hyphenation using hunspell
  • 4. Implemented hyphenation using Zemberek


  • 1. Removed hookups that are not needed, such as HSpell

Hebrew does not need hyphenation.

Yiddish does not need hyphenation.

  • 2. Implemented hyphenation using hunspell

To use libhyphenation, we need to add these files:

   hyphen/hnjalloc.h
   hyphen/hnjalloc.c
   hyphen/hyph_en_US.dic
   hyphen/hyphen.c
   hyphen/hyphen.gyp
   hyphen/hyphen.h
   hyphen/hyphen.patch
   hyphen/hyphen.tex
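
  • 3. Implemented hyphenation using Zemberek

This uses dbus_g_proxy_call, the same way spell-checking does in Zemberek. The hyphenation function is as follows:

   char* Zemberek::hyphenate(const char* word)
   {
          char* result = NULL;
          GError *Error = NULL;
          /* "hecele" is the hyphenation method of the Zemberek D-Bus service */
          if (!dbus_g_proxy_call (proxy, "hecele", &Error,
                  G_TYPE_STRING, word, G_TYPE_INVALID,
                  G_TYPE_STRV, &result, G_TYPE_INVALID)) {
                  g_error_free (Error);
                  return NULL;
          }
          /* NOTE: G_TYPE_STRV delivers a string array, so the type of
             `result` should be checked against what the service returns */
          return result;
   }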

ISpell

I used Libhyphenation in ISpell. The wrapper code looks like this:

    static char *
    ispell_dict_hyphenate (EnchantDict * me, const char *const word)
    {
        ISpellChecker * checker;
        checker = (ISpellChecker *) me->user_data;
        /* test the tag's contents, not its address; fall back to en_us */
        if (me->tag != NULL && me->tag[0] != '\0')
            return checker->hyphenate (word, me->tag);
        return checker->hyphenate (word, "en_us");
    }

The concrete code in ISpellChecker is:

    char *
    ISpellChecker::hyphenate(const char * const utf8Word, const char *const tag)
    {   // we must choose the right language tag
        char* param_value = enchant_broker_get_param (m_broker, "enchant.ispell.hyphenation.dictionary.path");
        if (languageMap[tag] != "")
        {
            string result = Hyphenator(RFC_3066::Language(languageMap[tag]), param_value).hyphenate(utf8Word).c_str();
            char* temp = new char[result.length() + 1];   // +1 for the terminating NUL
            strcpy(temp, result.c_str());
            return temp;
        }
        return NULL;
    }
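
Two small details in the ISpell code are easy to get wrong: testing a C string for emptiness, and copying a std::string into a heap buffer. The sketch below shows both in isolation. The helper names effective_tag and dup_to_c_string are hypothetical and are not part of Enchant or ISpell.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Return the tag to use: the dictionary's own tag, or "en_us" when the
// tag is missing or empty. Comparing a char* with "" would compare
// addresses, so the first character is tested instead.
static const char *effective_tag(const char *tag)
{
    return (tag != NULL && tag[0] != '\0') ? tag : "en_us";
}

// Copy a std::string into a new[]-allocated C string.
// The caller owns the buffer and must release it with delete[].
static char *dup_to_c_string(const std::string &s)
{
    char *out = new char[s.length() + 1];           // +1 for the terminating NUL
    std::memcpy(out, s.c_str(), s.length() + 1);    // copies the NUL as well
    return out;
}
```

Because the buffer comes from new[], whoever receives it has to use delete[] rather than free(); mixing the two is undefined behaviour.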

MySpell

MySpell uses the Hunspell hyphenation library (the hnj_hyphen_* functions). The wrapper code looks like this:

   char*
   MySpellChecker::hyphenate (const char* const word, size_t len, char* tag)
   {
       if (len == (size_t) -1) len = strlen(word);
       if (len > MAXWORDLEN
           || !g_iconv_is_valid(m_translate_in)
           || !g_iconv_is_valid(m_translate_out))
           return 0;
       char* result = 0;
       myspell->hyphenate(word, result, tag);
       return result;
   }

The concrete code in Hunspell is:

   /* result is passed by reference so that the caller sees the buffer */
   void Hunspell::hyphenate( const char* const word, char*& result, char* tag )
   {
       result = NULL;
       /* load the hyphenation dictionary, e.g. hyph_en_US.dic */
       string filePath = "hyph_";
       filePath += tag;
       filePath += ".dic";
       HyphenDict *dict = hnj_hyphen_load(filePath.c_str());
       if (dict == NULL) {
           fprintf(stderr, "Couldn't find file %s\n", filePath.c_str());
           return;   /* do not exit() from library code */
       }
       char *hyphens = new char[BUFSIZE + 1];
       char **rep = NULL;   /* must be NULL before the call */
       int *pos = NULL;
       int *cut = NULL;
       int len = strlen(word);
       if (hnj_hyphen_hyphenate2(dict, word, len, hyphens, NULL, &rep, &pos, &cut)) {
           delete[] hyphens;   /* allocated with new[], so not free() */
           fprintf(stderr, "hyphenation error\n");
           hnj_hyphen_free(dict);
           return;
       }
       hnj_hyphen_free(dict);
       result = hyphens;   /* odd digits mark the hyphenation points */
   }
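
The buffer filled by hnj_hyphen_hyphenate2 holds one digit per letter of the word, and an odd digit means that a hyphen may follow that letter. As a minimal sketch of how such a buffer can be turned into a visible result, assuming single-byte (ASCII) text and a hypothetical helper name apply_hyphens:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Insert '-' after every letter whose digit in `hyphens` is odd.
// `hyphens` is expected to hold at least one digit per letter of `word`.
static std::string apply_hyphens(const std::string &word, const std::string &hyphens)
{
    assert(hyphens.size() >= word.size());
    std::string out;
    for (std::size_t i = 0; i < word.size(); ++i) {
        out += word[i];
        // ASCII digits keep their parity, so '1' & 1 is 1 and '0' & 1 is 0.
        // A hyphen after the last letter would be meaningless, so skip it.
        if ((hyphens[i] & 1) && i + 1 < word.size())
            out += '-';
    }
    return out;
}
```

For example, the word "example" with the digits "0100000" becomes "ex-ample". Multi-byte UTF-8 words need more care, because letter positions and byte positions no longer match.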

Zemberek

The wrapper in Zemberek follows the same pattern as the two above:

   static char*
   zemberek_dict_hyphenate (EnchantDict * me, const char *const word)
   {
       Zemberek *checker;
       checker = (Zemberek *) me->user_data;
       return checker->hyphenate (word);
   }

The concrete implementation, however, is different from the other two. We use zemberek_service over D-Bus:

   char* Zemberek::hyphenate(const char* word)
   {
       char* result = NULL;
       GError *Error = NULL;
       /* "hecele" is the hyphenation method of the Zemberek D-Bus service */
       if (!dbus_g_proxy_call (proxy, "hecele", &Error,
               G_TYPE_STRING, word, G_TYPE_INVALID,
               G_TYPE_STRV, &result, G_TYPE_INVALID)) {
           g_error_free (Error);
           return NULL;
       }
       /* NOTE: G_TYPE_STRV delivers a string array, so the type of
          `result` should be checked against what the service returns */
       return result;
   }
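
The call above asks for a G_TYPE_STRV, that is, a NULL-terminated array of strings. Assuming the service answers with one string per syllable (an assumption on my side, not something verified against zemberek_service), the array still has to be joined into a single hyphenated word. A minimal sketch with a hypothetical helper join_syllables, written without GLib so that it stands alone:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Join a NULL-terminated array of syllables with '-'.
// A NULL array yields an empty string.
static std::string join_syllables(const char *const *syllables)
{
    std::string out;
    if (syllables == NULL)
        return out;
    for (std::size_t i = 0; syllables[i] != NULL; ++i) {
        if (i > 0)
            out += '-';        // separator only between syllables
        out += syllables[i];
    }
    return out;
}
```

Inside the backend the same result could be obtained with GLib's g_strjoinv("-", array), followed by g_strfreev(array) to release the array returned over D-Bus.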

Voikko

Hyphenation in Voikko is easy to implement because Voikko has its own hyphenation API. For reference, this is the existing suggest function in the Voikko backend:

   static char **
   voikko_dict_suggest (EnchantDict * me, const char *const word,
   size_t len, size_t * out_n_suggs)
   {
   	char **sugg_arr;
   	int voikko_handle;
   	voikko_handle = (long) me->user_data;
   	sugg_arr = voikko_suggest_cstr(voikko_handle, word);
   	if (sugg_arr == NULL)
   		return NULL;
   	for (*out_n_suggs = 0; sugg_arr[*out_n_suggs] != NULL; (*out_n_suggs)++);
   	return sugg_arr;
   }
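
As far as I understand the libvoikko documentation, its hyphenation call returns a pattern string with one symbol per character of the word: ' ' means no break, '-' means a break before this character, and '=' means that this character is replaced by the hyphen. Treat that notation as an assumption to check against the libvoikko headers. Under that assumption, and for ASCII words only, a hypothetical helper apply_voikko_pattern could build the hyphenated word like this:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Combine a word with a per-character hyphenation pattern:
//   ' '  keep the character, no break
//   '-'  break before this character, keep the character
//   '='  break here, the character itself is replaced by the hyphen
static std::string apply_voikko_pattern(const std::string &word, const std::string &pattern)
{
    assert(pattern.size() == word.size());
    std::string out;
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (pattern[i] == '-') {
            out += '-';
            out += word[i];
        } else if (pattern[i] == '=') {
            out += '-';
        } else {
            out += word[i];
        }
    }
    return out;
}
```

With the word "kissa" and the pattern "   - " this gives "kis-sa". Real Finnish text contains letters such as ä and ö, which are multi-byte in UTF-8, so production code would have to walk characters rather than bytes.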

Deploying Enchant in AbiWord

I copy the Enchant build output to the right place in AbiWord:

  • enchant\bin\Debug\libenchant_myspell.dll ----> abiword\msvc2008\Debug\lib\enchant\libenchant_myspell.dll
  • enchant\bin\Debug\libenchant_ispell.dll ----> abiword\msvc2008\Debug\lib\enchant\libenchant_ispell.dll
  • enchant\bin\Debug\libenchant.dll ----> abiword\msvc2008\Debug\bin\libenchant.dll

Test in Linux

I have tested the Enchant module on Red Hat, and it works for me.

Call the hyphenation function in AbiWord

  • Split a run at a word boundary and keep the formatting
  • Find the split information
  • Handle user operations (select, delete, cut, paste)
  • Main goal: call Enchant's hyphenation module and display the hyphenation result in AbiWord. After each user operation (adding, deleting, copying, or cutting a word), refresh the hyphenation result accordingly.

The main code is added to the format function in fb_LineBreaker (.h/.cpp). The first part finds the split point:

       while (pRunToBump && pLine->getNumRunsInLine() && (pLine->getLastRun() != m_pLastRunToKeep))
       {
           UT_ASSERT(pRunToBump->getLine() == pLine);
           if (!pLine->removeRun(pRunToBump))
           {
               pRunToBump->setLine(NULL);
           }
           UT_ASSERT(pLine->getLastRun()->getType() != FPRUN_ENDOFPARAGRAPH);
           if (pLine->getLastRun()->getType() == FPRUN_ENDOFPARAGRAPH)
           {
               fp_Run * pNuke = pLine->getLastRun();
               pLine->removeRun(pNuke);
           }
           pRunToBump->printText();           // trace out a debug message (runs twice)
           pNextLine->insertRun(pRunToBump);  // called when a new line is created
           // get the word to split
           if (!(pRunToBump->getPrevRun() && pLine->getNumRunsInLine() && (pLine->getLastRun() != m_pLastRunToKeep)))
           {
               pRunToSplit = pRunToBump;
               PD_StruxIterator text(pRunToBump->getBlock()->getStruxDocHandle(),
                   pRunToBump->getBlockOffset() + fl_BLOCK_STRUX_OFFSET);
               text.setUpperLimit(text.getPosition() + pRunToBump->getLength() - 1);
               UT_ASSERT_HARMLESS( text.getStatus() == UTIter_OK );
               UT_UTF8String sTmp;
               while (text.getStatus() == UTIter_OK)
               {
                   UT_UCS4Char c = text.getChar();
                   UT_DEBUGMSG(("| %d |",c));
                   // keep printable ASCII characters only
                   if (c >= ' ' && c < 128)
                       sTmp += static_cast<char>(c);
                   ++text;
               }
               UT_DEBUGMSG(("The Split Text |%s| \n",sTmp.utf8_str()));
               if (sTmp.utf8_str() != 0)
               {
                   pWordToSplit = sTmp;
                   UT_DEBUGMSG(("wordToSplit |%s| \n",pWordToSplit.utf8_str()));
               }
           }
           pRunToBump = pRunToBump->getPrevRun();
           UT_DEBUGMSG(("Next runToBump %x \n",pRunToBump));
       }
   }
   // modify src/text/fmt/xp/fb_LineBreaker.cpp to place hyphenation points
   // split the word
   if (pWordToSplit.length() != 0)
   {
       pWordHyphenationResult = pBlock->_hyphenateWord(pWordToSplit.ucs4_str().ucs4_str(), 0, 0);
       int tickLeft = pLine->getAvailableWidth();
       if (pWordHyphenationResult && *pWordHyphenationResult)
       {
           gchar *c = g_ucs4_to_utf8(pWordHyphenationResult, -1, NULL, NULL, NULL);
           // walk backwards from the end of the hyphenated word
           // (-1 tells g_utf8_strlen that the string is NUL-terminated)
           for (int index = g_utf8_strlen(c, -1); index >= 0; --index)
           {
               if (pWordHyphenationResult[index] == '-' && index < tickLeft)
               {
                   pBreakPoint = index;
                   fp_TextRun* textout = static_cast<fp_TextRun*>(pRunToSplit);
                   textout->split(pBreakPoint);
                   break;   // use the rightmost break point that fits
               }
           }
           g_free(c);
       }
   }
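
The idea behind the loop above is to pick the rightmost hyphenation point that still fits in the space left on the line. The sketch below shows that selection on its own, outside AbiWord. It is a simplification under stated assumptions: pick_break is a hypothetical helper, every character (and the visible hyphen) is given the same fixed width, and the hyphenated word uses '-' as the break marker.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return how many letters of the original word stay on the current line,
// or 0 if no hyphenation point fits into availableWidth.
static std::size_t pick_break(const std::string &hyphenated, int charWidth, int availableWidth)
{
    assert(charWidth > 0);
    std::size_t best = 0;
    std::size_t letters = 0;   // letters of the original word seen so far
    for (std::size_t i = 0; i < hyphenated.size(); ++i) {
        if (hyphenated[i] == '-') {
            // width of the letters kept on this line plus the visible hyphen
            int needed = static_cast<int>(letters + 1) * charWidth;
            if (needed <= availableWidth)
                best = letters;   // later candidates overwrite earlier ones
        } else {
            ++letters;
        }
    }
    return best;
}
```

The returned count is an offset into the original word, so it could be passed to a run-splitting call; in AbiWord the real widths would come from the run's font metrics rather than a fixed charWidth.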

Simple implementation of Chinese spell-checking in Enchant

After GSoC 2011, I would like to add Chinese spell-checking to Enchant, since it is also an important feature for a word processor. I found some libraries that could support it, but because time was limited I only built a simple framework.

Code refactoring and debugging

I have finished refactoring the code in both Enchant and AbiWord. The refactoring work covered:

  • 1. Cleaning up some ugly code
  • 2. Handling exceptions

User interface to manage hyphenation

Users can now enable or disable hyphenation from the user interface (GUI).

  • I have finished the GUI for Windows, Linux, and Cocoa.
  • The new strings have been translated into most languages.

Taking the Windows GUI as an example, the user ticks or clears a checkbox to enable or disable hyphenation.

  • Linux and Cocoa need more testing.

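Again, my SVN branches:

AbiWord branch (gsoc2011hyphenation): http://svn.abisource.com/svnroot/abiword/branches/gsoc2011hyphenation

Enchant branch (gsoc2011enchant): http://svn.abisource.com/svnroot/enchant/branches/gsoc2011hyphenation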
