This document provides instructions for migrating in a single step from the preview version of business glossary, which supported Data Catalog metadata, to the generally available version of business glossary, which supports Dataplex Universal Catalog metadata.
Before you begin
Install gcloud or Python packages. Authenticate your user account and the Application Default Credentials (ADC) that the Python libraries use. Run the following commands and follow the browser-based prompts:
gcloud init
gcloud auth login
gcloud auth application-default login
Enable the following APIs:
Create one or more Cloud Storage buckets in any of your projects. The buckets are used as a temporary location for the import files. The more buckets you provide, the faster the import runs.
Grant the Storage Admin IAM role to the service account running the migration:
service-MIGRATION_PROJECT_ID@gcp-sa-dataplex.iam.gserviceaccount.com
Replace MIGRATION_PROJECT_ID with the project from which you are migrating the glossaries.
Set up the repository:
Clone the repository:
git clone https://github.com/GoogleCloudPlatform/dataplex-labs.git
cd dataplex-labs/dataplex-quickstart-labs/00-resources/scripts/python/business-glossary-import
Install the required packages:
pip3 install -r requirements.txt
cd migration
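The Storage Admin grant described above can be scripted once per bucket. The following is a minimal sketch, assuming the role is granted at the bucket level with `gcloud storage buckets add-iam-policy-binding`. The helper names `service_account` and `grant_commands`, and the example project and bucket names, are hypothetical. The code only builds the command lines so that you can review them before running anything.

```python
import shlex
from typing import List


def service_account(migration_project_id: str) -> str:
    """Return the service account address in the form given in this document."""
    return f"service-{migration_project_id}@gcp-sa-dataplex.iam.gserviceaccount.com"


def grant_commands(migration_project_id: str, buckets: List[str]) -> List[List[str]]:
    """Build one gcloud command per bucket that grants roles/storage.admin."""
    member = f"serviceAccount:{service_account(migration_project_id)}"
    return [
        [
            "gcloud", "storage", "buckets", "add-iam-policy-binding",
            f"gs://{bucket}",
            f"--member={member}",
            "--role=roles/storage.admin",
        ]
        for bucket in buckets
    ]


if __name__ == "__main__":
    # Hypothetical project and bucket names, for illustration only.
    for cmd in grant_commands("my-migration-project", ["bucket-a", "bucket-b"]):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch free of side effects; pass each list to `subprocess.run` if you prefer to apply the bindings directly.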
Required roles
Run the migration script
python3 run.py --project=MIGRATION_PROJECT_ID --user-project=USER_PROJECT_ID --buckets=BUCKET1,BUCKET2
Replace the following:
USER_PROJECT_ID: the project ID of the project to be migrated.
BUCKET1 and BUCKET2: the Cloud Storage bucket IDs to be used for the import.
Scope glossaries in migration
To migrate only specific glossaries, define their scope by providing their respective URLs.
python3 run.py --project=MIGRATION_PROJECT_ID --user-project=USER_PROJECT_ID --buckets=BUCKET1,BUCKET2 --glossaries="GLOSSARY_URL1","GLOSSARY_URL2"
Replace GLOSSARY_URL1 (and GLOSSARY_URL2) with the URLs of the glossaries you are migrating.
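When the migration is launched from another script, the flags shown above can be assembled programmatically. This is a sketch under stated assumptions: the function name `build_migration_argv` and the example values are hypothetical, and only the flags that appear in this document are used. In a shell, the quoted form `--glossaries="URL1","URL2"` reaches the script as the single argument `--glossaries=URL1,URL2`, which is what the list form below reproduces.

```python
import shlex
from typing import List, Optional, Sequence


def build_migration_argv(
    project: str,
    user_project: str,
    buckets: Sequence[str],
    glossaries: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble the run.py argument list, optionally scoped to given glossaries."""
    if not buckets:
        raise ValueError("at least one Cloud Storage bucket is required")
    argv = [
        "python3", "run.py",
        f"--project={project}",
        f"--user-project={user_project}",
        f"--buckets={','.join(buckets)}",
    ]
    if glossaries:
        # Comma-separated URLs passed as one argument, as in the command above.
        argv.append(f"--glossaries={','.join(glossaries)}")
    return argv


if __name__ == "__main__":
    # Placeholder values, matching the placeholders used in this document.
    argv = build_migration_argv(
        "MIGRATION_PROJECT_ID",
        "USER_PROJECT_ID",
        ["BUCKET1", "BUCKET2"],
        ["GLOSSARY_URL1", "GLOSSARY_URL2"],
    )
    print(shlex.join(argv))
```

Building an argument list rather than a single string avoids shell quoting issues if a glossary URL contains characters such as `?` or `&`.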
Resume migration for import job failures
The presence of files after the migration indicates that some import jobs have failed. To resume the migration, run the following command:
python3 run.py --project=MIGRATION_PROJECT_ID --user-project=USER_PROJECT_ID --buckets=BUCKET1,BUCKET2 --resume-import
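The resume step can be automated with a small wrapper. The sketch below is illustrative only: this document does not state where the leftover files are located, so the check is passed in as a hypothetical callback, `leftover_files_exist`, that you would implement yourself. The names `with_resume` and `migrate` and the `max_resumes` limit are also assumptions, not part of the migration script.

```python
import subprocess
from typing import Callable, List, Sequence

RESUME_FLAG = "--resume-import"


def with_resume(argv: Sequence[str]) -> List[str]:
    """Return a copy of argv with --resume-import appended exactly once."""
    resumed = list(argv)
    if RESUME_FLAG not in resumed:
        resumed.append(RESUME_FLAG)
    return resumed


def migrate(
    argv: Sequence[str],
    leftover_files_exist: Callable[[], bool],
    run: Callable = subprocess.run,
    max_resumes: int = 3,
) -> int:
    """Run the migration once, then resume while leftover files are reported.

    Returns the number of resume attempts made. The exit status of run.py is
    not inspected because its behavior on failed import jobs is not documented
    here; only the caller-supplied file check decides whether to resume.
    """
    run(list(argv), check=False)
    attempts = 0
    while attempts < max_resumes and leftover_files_exist():
        run(with_resume(argv), check=False)
        attempts += 1
    return attempts


if __name__ == "__main__":
    # Show the resume command for the placeholder invocation; nothing is executed.
    base = ["python3", "run.py", "--project=MIGRATION_PROJECT_ID",
            "--user-project=USER_PROJECT_ID", "--buckets=BUCKET1,BUCKET2"]
    print(" ".join(with_resume(base)))
```

Capping the number of resume attempts prevents an endless loop if an import job keeps failing for a reason that a retry cannot fix.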