Can anyone share how you handle data archiving, especially when moving to the cloud? Our organization has never archived data before. I’m interested in learning about your approach, how you got business teams to classify their data and set retention periods, and how you managed risks like storage costs.
What's critical with archiving isn't the place where it happens; it's all about data classification and tagging. You need to classify the data in at least the following categories:
1. Confidentiality (0–3)
2. Integrity (0 or 1), where 1 means the originator of the data must be stored along with the data
3. Data retention time, i.e. how long the data needs to be archived, or, the other way round, the point in time at which the data must be deleted.
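As a rough sketch of how these three categories could travel with the data, here is a small helper that builds an S3-style tag set at ingestion time and derives the deletion date from the retention period. The tag keys and the function itself are illustrative choices, not a standard:

```python
from datetime import date, timedelta

def classification_tags(confidentiality: int, integrity: int,
                        retention_days: int, created: date) -> dict:
    """Build an S3-style tag set for the three classification axes.

    Tag keys ("confidentiality", "integrity", "delete-after") are
    illustrative names, not a standard.
    """
    if not 0 <= confidentiality <= 3:
        raise ValueError("confidentiality must be 0-3")
    if integrity not in (0, 1):
        raise ValueError("integrity must be 0 or 1")
    # Retention expressed "the other way round": the point in time
    # at which the data must be deleted.
    delete_after = created + timedelta(days=retention_days)
    return {
        "confidentiality": str(confidentiality),
        "integrity": str(integrity),
        "delete-after": delete_after.isoformat(),
    }
```

Tagging at ingestion keeps the classification attached to the object itself, so a later lifecycle or deletion job only needs to read the tags, not consult the business teams again.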
You can largely neglect storage cost: archives have no performance requirements, so you can choose a cheap S3 storage class (e.g. S3 Glacier), and archive to 2 different locations for availability.
It's also advisable to place an object lock on the archive so that deletion is impossible until the retention time is over.
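A minimal sketch of such a lock with boto3 (bucket and key names are placeholders; the bucket must have Object Lock enabled when it is created). COMPLIANCE mode means no user, including root, can delete or overwrite the object version before the retention date:

```python
from datetime import datetime, timedelta, timezone

def lock_params(bucket: str, key: str, retention_days: int) -> dict:
    """Request parameters for an S3 put_object call with a
    compliance-mode object lock on the written object."""
    retain_until = datetime.now(timezone.utc) + timedelta(days=retention_days)
    return {
        "Bucket": bucket,
        "Key": key,
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until,
    }

# Example call (requires AWS credentials and an Object Lock-enabled
# bucket; names below are hypothetical):
# import boto3
# s3 = boto3.client("s3")
# s3.put_object(Body=b"archived record",
#               **lock_params("my-archive-bucket",
#                             "records/2024/rec-001.json",
#                             365 * 7))
```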
At a minimum, do it whenever there is a new implementation project... that's the only time it seems possible to get business users' attention and bandwidth. It is assumed there is a contract term and the data required. Doing it as an ongoing process is complex & expensive, and fairly impossible unless you handle it at the time of creation or ingestion, or at the time of a new implementation. Not sure if your question is a process one or a platform one, though?