|
Category :
Database
Resources -> Database Administration
DB Version
:
Oracle 8i / 9i
OS Details
:
Sun Solaris9
At
some point in time, the database might require a character set change
and it could be due to the current character set being de-supported by
oracle or it could be because you need the have the ability for the
databases to support international characters. Character set
conversion is a project that needs proper testing and implementation.
In this article we will take a test case scenario of one method of
converting a database character set from AL24UTFFSS to UTF8.
Character Set
conversion Steps:
(Step 1)
Choose the right
character set for your database. Make sure you choose the character
set that will take care of your future needs as well.
Oracle believes that for the
majority of customers a unicode character set is the best choice.
(Step 2)
Install user and objects for using the database character set
migration utility. This is done by running the csminst.sql script that
can be found under $ORACLE_HOME/rdbms/admin directory for unix.
Note: Edit the script
and make sure the correct tablespace is specified for the user csmig.
Default within the script is SYSTEM.
(Step 3)
Run the character set
scan utility to find out any data violation that can occur due to the
character set migration. This utility does not change the character
set. It just scans the data within the database and let's us know if
there are any data that might be affected by the character set change.
Here's an example on Sun platform:
$ csscan
system/<password> full=N to_char=UTF8 array=102400 user=TEST_USER
process=5 log=scan4testuser
TO_CHAR =>
Specifies the character set to which you want to eventually convert to
FULL
=> Scan the entire database
ARRAY
=> Size of array fetch buffer
USER
=> The schema that needs to be scanned
PROCESS
=> Number of concurrent processes that will scan the database
This utility can be used
to either scan the entire database or at the schema level or at the
individual table level. For Very Large DataBases (VLDBs) it is faster
to run multiple scans simultaneously at the schema level.
(Step 4)
The character set
scanner will report row violations in to .err file. This file will
have the rowids of rows that might have problems due to character set
conversion. Load the rowids from the .err file in to the database
using external tables or sql*loader.
(Step 5)
Export the rows from the
tables that were reported as violations in step 4 using the NLS_LANG
value set to the original character set value. For example, if the
rowids in step4 above were loaded in to a table within oracle called
rowid_violations with a column named testtab_rowids then you can
export the data using the exp utility by doing this:
$ export NLS_LANG=AMERICAN_AMERICA.AL24UTFFSS
$ exp file=testtab_violate
query="where rowid in (select testtab_rowids from rowid_violations )"
tables=test_table
(Step 6)
Delete the rows from the
base table that were reported as violations and that which were
exported in step5. We will import them after the character set
conversion.
(Step 7)
We are now ready to do
the character set conversion on the database. There are multiple ways
of changing the character set for the database. The steps mentioned
below are one of the available methods:
Important : Make
sure you take a FULL cold backup the database before continuing with
the character set change.
SQL> shutdown
immediate
SQL> startup
mount
SQL> alter system
enable restricted session;
SQL> alter system set
job_queue_processes=0;
SQL> alter database
open;
SQL> alter database
character set internal_use UTF8;
SQL> shutdown
immediate;
SQL> startup
The INTERNAL_USE option
will skip the superset
check. It will not ensure that the new character set is a strict super
set of old character set.
(Step 8)
Import back the data that was
exported earlier in step 5. The preferred setting for the NLS_LANG
parameter for the import session is the original (old) character set
even though setting it to the newly changed character set value will
also work.
(Step 9)
Make sure the client desktops and
machines have the proper NLS_LANG settings to reflect the character
set changes.
There are other methods for
character set conversion as well. One other method that would work on
a small or medium sized databases would be to export the entire
database, re-create the database with the new character set and import
the data back.
|