This example uses the Canadian English dictionary, but you can try it with others as well. These steps are for Windows.
Outline
- Combine en_CA.dic with en_CA.dic_delta, sort the result, and save it as en_ca.dict (UTF-8 encoded) in the Postgres text search folder (tsearch_data)
- Save en_CA.aff as en_ca.affix in the same folder
- Create the Ispell dictionary in Postgres
Steps
- Open http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_CA.dic
- Select all of the text, copy it, and paste it into Text Mechanic: http://textmechanic.com/Sort-Text-Lines.html. Add a line break at the end so that the next paste starts on a new line
- Open http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_CA.dic_delta
- Select all of the text, copy it, and paste it below the previously pasted text in Text Mechanic.
- Scroll to the top and delete the first line. It should be a 5-digit number (the word count that Hunspell .dic files start with)
- Click the Alphabetical button, and wait for the text to sort
- Select all of the text and copy it to the clipboard
- Open Windows Notepad as an administrator
- Paste the sorted text you copied from Text Mechanic into Notepad
- Save the file as en_ca.dict (with UTF-8 encoding) to your Postgres text search folder. Mine is C:\Program Files\PostgreSQL\9.3\share\tsearch_data.
- Open http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_CA.aff, and copy it to Notepad. Save the file as en_ca.affix to your Postgres text search folder.
In PgAdmin, run the following SQL:
create text search dictionary ispell_en_ca (
    template = ispell,
    dictfile = en_ca,     --uses en_ca.dict
    afffile = en_ca,      --uses en_ca.affix
    stopwords = english   --uses english.stop
);

--make sure it works:
select * from ts_lexize('ispell_en_ca', 'colours');

/*
result:
ts_lexize
text[]
{coloured,colour}
*/
You will need to create a new text search configuration to use the dictionary.
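As a rough sketch of that last step, something like the following should work. The configuration name english_ca is made up for this example, and falling back to the built-in english_stem dictionary for words the Ispell dictionary does not recognize is an assumption on my part, not a requirement. The list of token types follows the usual pattern for word-like tokens.

```sql
-- "english_ca" is a hypothetical name; it starts as a copy of the built-in English configuration
create text search configuration public.english_ca ( copy = pg_catalog.english );

-- try the Ispell dictionary first, then fall back to the Snowball stemmer
-- (the english_stem fallback is an assumption, adjust to taste)
alter text search configuration public.english_ca
    alter mapping for asciiword, asciihword, hword_asciipart, word, hword, hword_part
    with ispell_en_ca, english_stem;

-- see which lexemes the new configuration produces
select to_tsvector('public.english_ca', 'colours');
```

Dictionaries in a mapping are consulted from left to right, so ispell_en_ca gets the first chance at each word and english_stem only handles what is left over.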