Data Import

When you need to make bulk changes to thousands or even millions of Citizens at a time, use the Data Import system in DataGuard CPM. It is well suited to tasks such as the initial import of your existing data, where large volumes of Citizen Permissions and Preferences have to be loaded into the platform.

Overview

The Data Import system processes CSV files containing Citizen data and imports them into the platform. The maximum file size for a single import is 10 MB, so larger datasets must be split into multiple files before uploading.
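Splitting a large dataset can be scripted. The following is a minimal sketch, not an official tool: the function name split_csv is hypothetical, and it assumes that every file needs its own header row and that the limit is 10,000,000 bytes (a conservative reading of "10 MB").

```python
import csv
import io

# Assumption: treat the 10 MB limit conservatively as 10,000,000 bytes.
MAX_BYTES = 10 * 1000 * 1000


def split_csv(text, max_bytes=MAX_BYTES):
    """Split CSV text into chunks of at most max_bytes, repeating the header.

    Returns a list of CSV strings. A file with no data rows yields no chunks.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]

    def encode(row):
        # Re-serialise through csv.writer so quoting stays valid.
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerow(row)
        return buf.getvalue()

    header_line = encode(header)
    header_size = len(header_line.encode("utf-8"))
    chunks, current, size = [], [header_line], header_size

    for row in data:
        line = encode(row)
        line_size = len(line.encode("utf-8"))
        if header_size + line_size > max_bytes:
            raise ValueError("a single row exceeds the size limit")
        # Start a new chunk when adding this row would exceed the limit.
        if size + line_size > max_bytes:
            chunks.append("".join(current))
            current, size = [header_line], header_size
        current.append(line)
        size += line_size

    if len(current) > 1:
        chunks.append("".join(current))
    return chunks
```

Each returned string can be written to its own file and uploaded as a separate import job.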

The import process operates as a job system: you create an import job with your file and then wait for it to complete. Note that the system is "eventually consistent," meaning that imported changes may take a short time to appear across the entire system after the job completes. It is recommended to perform an export once the import is complete to verify that all data has been imported.
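The verification step can be sketched as a comparison of references between the file you imported and a later export. This is only an illustration: it assumes the export is available as CSV text and contains a uniqueReference column, which may not match the actual export format, and the function name is hypothetical.

```python
import csv
import io


def missing_references(import_csv, export_csv, column="uniqueReference"):
    """Return references present in the import file but absent from the export.

    Assumption: both inputs are CSV text with a header row containing `column`.
    """
    imported = {row[column] for row in csv.DictReader(io.StringIO(import_csv))}
    exported = {row[column] for row in csv.DictReader(io.StringIO(export_csv))}
    return imported - exported
```

An empty result suggests every imported Citizen is visible in the export. Because the system is eventually consistent, a non-empty result shortly after completion may simply mean the check ran too early.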

Stages of a Successful Import

There are three main stages to a successful data import:

1. Creating the Import Files

To get started with the import process, you'll need to prepare your CSV files. The structure of these files will depend on the type of data you are importing:

Importing Only Citizens

If you are importing only Citizen data without any associated Permissions or Preferences, your CSV file needs to include just one column:

  • uniqueReference (mandatory): This is the Citizen's external reference, which uniquely identifies the Citizen in DataGuard CPM. If a Citizen with this reference already exists in the system, the record will be updated rather than duplicated. If the Citizen does not exist, a new record will be created.
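A Citizens-only file is small enough to generate directly. The sketch below assumes the file starts with a header row naming the column; the function name citizens_csv and the sample references are illustrative only.

```python
import csv
import io


def citizens_csv(references):
    """Build a Citizens-only import file with a single uniqueReference column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["uniqueReference"])  # assumption: a header row is expected
    seen = set()
    for ref in references:
        ref = ref.strip()
        if not ref:
            raise ValueError("uniqueReference is mandatory and cannot be empty")
        if ref in seen:
            continue  # a repeated reference would only update the same Citizen
        seen.add(ref)
        writer.writerow([ref])
    return buf.getvalue()


print(citizens_csv(["crm-001", "crm-002", "crm-001"]))
```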

Importing Permissions

When importing Permissions, each Permission you want to import should be represented by a separate line in your CSV file. The following fields are supported; those marked mandatory must be present on every line:

  • uniqueReference (mandatory): The unique reference for the Citizen. As with importing only Citizens, this ensures that existing records are updated, and new records are created only if they do not already exist.
  • privacyPolicyId (mandatory): The identifier for the privacy policy that was presented to the Citizen at the time of capturing their permission.
  • permissionStatementId (mandatory): The identifier for the consent purpose statement that the Citizen agreed to. This statement accompanies the set of consent purpose options.
  • optionType (mandatory): Specifies whether the entry represents a single consent purpose or a collection of purposes. Valid values are group or option.
  • optionId (mandatory): The ID of the Consent Purpose option.
  • justification (mandatory): The lawful basis for capturing the consent. Refer to the Lawful Basis documentation for more details.
  • state (mandatory): The current state of the consent purpose. Valid states include GRANTED, PENDING, DENIED, CLAIMED, OBJECTED, and OBJECTION_UPHELD.
  • obtainedAt (optional): The date and time in ISO format when the permission was obtained from the Citizen. This cannot be a future date. If not provided, it defaults to the time the transaction is recorded.
  • validFrom (optional): The date and time in ISO format from which the purpose is valid. If not provided, it defaults to the time the transaction is recorded.
  • validUntil (optional): The date and time in ISO format until which the permission remains valid. If not provided, the default validity period in your configuration will be used to calculate this date.
  • source (optional): Indicates where the transaction was captured.
  • sourceSystem (optional): Identifies the system where the permission was originally recorded.
  • sourceSystemReference (optional): Specifies the reference or ID of the record in the originating system.
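The field rules above lend themselves to a pre-upload check. The following sketch validates rows against the mandatory fields, the listed optionType and state values, and the rule that obtainedAt cannot be in the future. The function names, the column order, and the sample IDs are assumptions, and the platform may apply further validation that is not modelled here.

```python
import csv
import io
from datetime import datetime, timezone

MANDATORY = ["uniqueReference", "privacyPolicyId", "permissionStatementId",
             "optionType", "optionId", "justification", "state"]
OPTIONAL = ["obtainedAt", "validFrom", "validUntil",
            "source", "sourceSystem", "sourceSystemReference"]
STATES = {"GRANTED", "PENDING", "DENIED", "CLAIMED", "OBJECTED", "OBJECTION_UPHELD"}
OPTION_TYPES = {"group", "option"}


def validate_permission(row, now=None):
    """Return a list of problems found in one Permission row (empty if none)."""
    now = now or datetime.now(timezone.utc)
    problems = ["missing " + name for name in MANDATORY if not row.get(name)]
    if row.get("optionType") and row["optionType"] not in OPTION_TYPES:
        problems.append("optionType must be group or option")
    if row.get("state") and row["state"] not in STATES:
        problems.append("unknown state " + row["state"])
    if row.get("obtainedAt"):
        try:
            # Note: before Python 3.11, fromisoformat rejects a trailing "Z".
            obtained = datetime.fromisoformat(row["obtainedAt"])
        except ValueError:
            problems.append("obtainedAt is not ISO formatted")
        else:
            if obtained.tzinfo is None:
                # Assumption: timestamps without an offset are UTC.
                obtained = obtained.replace(tzinfo=timezone.utc)
            if obtained > now:
                problems.append("obtainedAt cannot be in the future")
    return problems


def permissions_csv(rows):
    """Write validated Permission rows to CSV text, one Permission per line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=MANDATORY + OPTIONAL, lineterminator="\n")
    writer.writeheader()
    for index, row in enumerate(rows, start=1):
        problems = validate_permission(row)
        if problems:
            raise ValueError("row {}: {}".format(index, "; ".join(problems)))
        writer.writerow(row)  # optional fields left out are written as empty
    return buf.getvalue()
```

Validating locally before upload keeps obviously malformed rows out of the job, so fewer records need to be re-imported afterwards.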

Importing Preferences

When importing Preferences, each Preference you want to import should likewise be represented by a separate line in your CSV file. All of the following fields are required:

  • uniqueReference (mandatory): The unique reference for the Citizen. As with importing Citizens and Permissions, existing records will be updated, and new ones created if they do not already exist.
  • reference (mandatory): The stored, human-readable reference of the Preference being updated.
  • values (mandatory): The choice references in DataGuard corresponding to the selections made by the Citizen. If multiple choices are needed for a single Preference, the values should be pipe-delimited within this field, as shown in the sample CSV.
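The pipe-delimited values field is the part most easily got wrong by hand. A minimal sketch of generating Preference rows follows; the function name, the header row, and the sample references are assumptions made for illustration.

```python
import csv
import io


def preferences_csv(preferences):
    """Build a Preferences import file.

    preferences: iterable of (uniqueReference, reference, [choice, ...]) tuples.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["uniqueReference", "reference", "values"])
    for unique_ref, pref_ref, choices in preferences:
        if not unique_ref or not pref_ref or not choices:
            raise ValueError("uniqueReference, reference and values are all mandatory")
        if any("|" in choice for choice in choices):
            raise ValueError("a choice reference cannot contain the pipe delimiter")
        # Multiple choices for one Preference share a single pipe-delimited field.
        writer.writerow([unique_ref, pref_ref, "|".join(choices)])
    return buf.getvalue()


print(preferences_csv([
    ("crm-001", "contact-channels", ["email", "sms"]),  # hypothetical references
    ("crm-002", "frequency", ["weekly"]),
]))
```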

2. Uploading the Files

You can upload the files using either the UI or the API:

  • UI Upload: If you are uploading only a few files, the Import UI in the Data section of the CPM UI is the easiest option.
  • API Upload: For larger or automated uploads, you can use the API if you have the technical resources to do so. Refer to the API documentation for detailed instructions.

3. Waiting for Completion and Checking for Errors

After uploading the files:

  • Completion Monitoring: If you are using the UI, it polls the job status automatically and shows your progress. If you are using the API, poll the Get Job endpoint until the job is complete.
  • Handling Failures: If the job completes successfully, no further action is needed. If it fails or only partially completes, some records may not have been imported. Use the failures endpoint to view the errors, correct the affected records, and re-import them if required.
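For API users, the monitoring and retry steps might look like the sketch below. It does not call the real API: fetch_status is a hypothetical callable that you would implement around the Get Job endpoint, the status names in TERMINAL are assumed rather than taken from the API documentation, and rows_to_retry assumes you have already extracted the failed uniqueReference values from the failures endpoint.

```python
import time

# Assumed status names; check the API documentation for the real values.
TERMINAL = {"COMPLETED", "PARTIALLY_COMPLETED", "FAILED"}


def wait_for_job(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal status.

    fetch_status: hypothetical callable wrapping the Get Job endpoint.
    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if waited >= timeout:
            raise TimeoutError("job still {} after {:.0f}s".format(status, waited))
        sleep(interval)
        waited += interval


def rows_to_retry(rows, failed_references):
    """Select the original rows whose uniqueReference was reported as failed."""
    return [row for row in rows if row["uniqueReference"] in failed_references]
```

A fixed polling interval keeps the example short; a longer interval or a backoff may be kinder to the API for large jobs.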

Summary

The Data Import feature in DataGuard CPM is a powerful tool for managing bulk updates to Citizen data. By following the outlined stages—creating import files, uploading them, and monitoring their progress—you can efficiently import large datasets into the platform. Whether using the UI for smaller imports or the API for larger, automated processes, Data Import provides the flexibility and reliability needed to keep your Citizen data up to date.

