mirror of https://github.com/postgres/postgres
parent
c027fa5757
commit
4c2d0cd5e4
@ -0,0 +1,201 @@ |
||||
<sect1 id="tsearch2"> |
||||
<title>tsearch2</title> |
||||
|
||||
<indexterm zone="tsearch2"> |
||||
<primary>tsearch2</primary> |
||||
</indexterm> |
||||
|
||||
<para> |
||||
The <literal>tsearch2</literal> module provides backwards-compatible |
||||
text search functionality for applications that used |
||||
<filename>contrib/tsearch2</> before text searching was integrated |
||||
into core <productname>PostgreSQL</productname> in release 8.3. |
||||
</para> |
||||
|
||||
<sect2> |
||||
<title>Portability Issues</title> |
||||
|
||||
<para> |
||||
Although the built-in text search features were based on |
||||
<filename>contrib/tsearch2</> and are largely similar to it, |
||||
there are numerous small differences that will create portability |
||||
issues for existing applications: |
||||
</para> |
||||
|
||||
<itemizedlist mark="bullet"> |
||||
<listitem> |
||||
<para> |
||||
Some functions' names were changed, for example <function>rank</> |
||||
to <function>ts_rank</>. |
||||
The replacement <literal>tsearch2</literal> module |
||||
provides aliases having the old names. |
||||
</para> |
||||
</listitem> |
||||
|
||||
<listitem> |
||||
<para> |
||||
The built-in text search data types and functions all exist within |
||||
the system schema <literal>pg_catalog</>. In an installation using |
||||
<filename>contrib/tsearch2</>, these objects would usually have been in |
||||
the <literal>public</> schema, though some users chose to place them |
||||
in a separate schema of their own. Explicitly schema-qualified |
||||
references to the objects will therefore fail in either case. |
||||
The replacement <literal>tsearch2</literal> module |
||||
provides alias objects that are stored in <literal>public</> |
||||
(or another schema if necessary) so that such references will still work. |
||||
</para> |
||||
</listitem> |
||||
|
||||
<listitem> |
||||
<para> |
||||
There is no concept of a <quote>current parser</> or <quote>current |
||||
dictionary</> in the built-in text search features, only of a current |
||||
search configuration (set by the <varname>default_text_search_config</> |
||||
parameter). While the current parser and current dictionary were used |
||||
only by functions intended for debugging, this might still pose |
||||
a porting obstacle in some cases. |
||||
The replacement <literal>tsearch2</literal> module emulates these |
||||
additional state variables and provides backwards-compatible functions |
||||
for setting and retrieving them. |
||||
</para> |
||||
</listitem> |
||||
</itemizedlist> |
||||
|
||||
<para> |
||||
There are some issues that are not addressed by the replacement |
||||
<literal>tsearch2</literal> module, and will therefore require |
||||
application code changes in any case: |
||||
</para> |
||||
|
||||
<itemizedlist mark="bullet"> |
||||
<listitem> |
||||
<para> |
||||
The old <function>tsearch2</> trigger function allowed items in its |
||||
argument list to be names of functions to be invoked on the text data |
||||
before it was converted to <type>tsvector</> format. This was removed |
||||
as being a security hole, since it was not possible to guarantee that |
||||
the function invoked was the one intended. The recommended approach |
||||
if the data must be massaged before being indexed is to write a custom |
||||
trigger that does the work for itself. |
||||
</para> |
||||
</listitem> |
||||
|
||||
<listitem> |
||||
<para> |
||||
Text search configuration information has been moved into core |
||||
system catalogs that are noticeably different from the tables used |
||||
by <filename>contrib/tsearch2</>. Any applications that examined |
||||
or modified those tables will need adjustment. |
||||
</para> |
||||
</listitem> |
||||
|
||||
<listitem> |
||||
<para> |
||||
If an application used any custom text search configurations, |
||||
those will need to be set up in the core |
||||
catalogs using the new text search configuration SQL commands. |
||||
The replacement <literal>tsearch2</literal> module offers a little |
||||
bit of support for this by making it possible to load an old set |
||||
of <filename>contrib/tsearch2</> configuration tables into |
||||
<productname>PostgreSQL</productname> 8.3. (Without the module, |
||||
it is not possible to load the configuration data because values in the |
||||
<type>regprocedure</> columns cannot be resolved to functions.) |
||||
While those configuration tables won't actually <emphasis>do</> |
||||
anything, at least their contents will be available to be consulted |
||||
while setting up an equivalent custom configuration in 8.3. |
||||
</para> |
||||
</listitem> |
||||
|
||||
<listitem> |
||||
<para> |
||||
The old <function>reset_tsearch()</> and <function>get_covers()</> |
||||
functions are not supported. |
||||
</para> |
||||
</listitem> |
||||
|
||||
<listitem> |
||||
<para> |
||||
The replacement <literal>tsearch2</literal> module does not define |
||||
any alias operators, relying entirely on the built-in ones. |
||||
This would only pose an issue if an application used explicitly |
||||
schema-qualified operator names, which is very uncommon. |
||||
</para> |
||||
</listitem> |
||||
</itemizedlist> |
||||
|
||||
</sect2> |
||||
|
||||
<sect2> |
||||
<title>Converting a pre-8.3 Installation</title> |
||||
|
||||
<para> |
||||
The recommended way to update a pre-8.3 installation that uses |
||||
<filename>contrib/tsearch2</> is: |
||||
</para> |
||||
|
||||
<procedure> |
||||
<step> |
||||
<para> |
||||
Make a dump from the old installation in the usual way, |
||||
but be sure not to use <literal>-c</> (<literal>--clean</>) |
||||
option of <application>pg_dump</> or <application>pg_dumpall</>. |
||||
</para> |
||||
</step> |
||||
|
||||
<step> |
||||
<para> |
||||
In the new installation, create empty database(s) and install |
||||
the replacement <literal>tsearch2</literal> module into each |
||||
database that will use text search. This must be done |
||||
<emphasis>before</> loading the dump data! If your old installation |
||||
had the <filename>contrib/tsearch2</> objects in a schema other |
||||
than <literal>public</>, be sure to adjust the |
||||
<literal>tsearch2</literal> installation script so that the replacement |
||||
objects are created in that same schema. |
||||
</para> |
||||
</step> |
||||
|
||||
<step> |
||||
<para> |
||||
Load the dump data. There will be quite a few errors reported |
||||
due to failure to recreate the original <filename>contrib/tsearch2</> |
||||
objects. These errors can be ignored, but this means you cannot |
||||
restore the dump in a single transaction (eg, you cannot use |
||||
<application>pg_restore</>'s <literal>-1</> switch). |
||||
</para> |
||||
</step> |
||||
|
||||
<step> |
||||
<para> |
||||
Examine the contents of the restored <filename>contrib/tsearch2</> |
||||
configuration tables (<structname>pg_ts_cfg</> and so on), and |
||||
create equivalent built-in text search configurations as needed. |
||||
You may drop the old configuration tables once you've extracted |
||||
all the useful information from them. |
||||
</para> |
||||
</step> |
||||
|
||||
<step> |
||||
<para> |
||||
Test your application. |
||||
</para> |
||||
</step> |
||||
</procedure> |
||||
|
||||
<para> |
||||
At a later time you may wish to rename application references |
||||
to the alias text search objects, so that you can eventually |
||||
uninstall the replacement <literal>tsearch2</literal> module. |
||||
</para> |
||||
|
||||
</sect2> |
||||
|
||||
<sect2> |
||||
<title>References</title> |
||||
<para> |
||||
Tsearch2 Development Site |
||||
<ulink url="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/"></ulink> |
||||
</para> |
||||
</sect2> |
||||
|
||||
</sect1> |
||||
Loading…
Reference in new issue