How to back up SharePoint Online / Teams data to TrueNAS


 

In the rush to follow trends, many people miss the fine print in the Microsoft terms and conditions and never consider that their data is not backed up.

 

As per the Microsoft Services Agreement:


We strive to keep the Services up and running; however, all online services suffer occasional disruptions and outages, and Microsoft is not liable for any disruption or loss you may suffer as a result. In the event of an outage, you may not be able to retrieve Your Content or Data that you’ve stored. We recommend that you regularly backup Your Content and Data that you store on the Services or store using Third-Party Apps and Services.

 

And remember, RETENTION is not a backup!

 

Several situations benefit from a proper backup of M365 SharePoint data. By proper I mean having the option to control the restore process at the folder/file level and having immutable snapshots to protect against ransomware.

 

Based on Gartner's IT Resilience - 7 Tips for Improving Reliability, Tolerability and Disaster Recovery report:

 

Ignoring disaster recovery or applying a one-size-fits-all approach for cloud IaaS is not advised. The challenge most organizations face is getting stakeholders to understand why disaster recovery is still needed and then sifting through the hundreds of possible architectural approaches.

 

The most effective approach is to use a similar tiering approach used for on-premises workloads for decades.

 

Recovery Preparedness: Document after-hours incident contacts and escalation processes, protect the backup system and control plane itself, regularly create airgapped and immutable backups, regularly verify integrity of backups, create an isolated recovery environment (IRE), and plan for both recovery at scale where multiple systems will likely be impacted and prepare for alternative machines or physical access in case systems become locked and inaccessible.

 

Back Up: Most instances of data loss are related to accidental deletion or corruption, either directly by a user or due to an overlooked consequence resulting from an action taken on an integrated platform. Backups are needed. Options may include using the SaaS provider’s native add-on offering, data exports, APIs for bulk exporting or loading, third-party backup tools or ETL providers, as mentioned above. At a minimum, understand the RPOs needed and what level of granularity of restoration is possible (for example, transaction level of recovery versus limited to entire dataset restoration). And be sure to also account for metadata, attachments and other items needed to ensure relational integrity between objects.

 

Let's see how we can address these concerns using a third-party tool. For my storage infrastructure I am using TrueNAS (iXsystems) with ZFS because I consider it to be one of the best and most cost-effective solutions. Another article about protecting against ransomware using ZFS and TrueNAS is Ransomware - despre atac, metode de protecție și bune practici (Ransomware - about the attack, protection methods and best practices).

 

At the heart of TrueNAS is the self-healing OpenZFS filesystem. Previously only available on the highest-end enterprise storage systems, TrueNAS gives you direct, user-friendly access to ZFS. With its built-in RAID, powerful data management tools, and ability to automatically detect and repair silent data corruption (and bit rot), TrueNAS and OpenZFS ensure data integrity from start to finish.

 

TrueNAS supports connecting to Amazon S3 and compatible providers, Backblaze B2, Box, Dropbox, Google Cloud Storage, Google Drive, Hubic, Mega, Azure Blob Storage, OneDrive, OpenStack Swift, pCloud, Storj iX, WebDAV and Yandex.

 

For OneDrive we need to fill in the Access Token (generated automatically after logging in to the provider) and the Drive ID. By default TrueNAS uses the Drive ID associated with the OneDrive account. To access data from a SharePoint site we need to find the Drive ID of that site.

 

For this we can use Microsoft Graph.

 

The first query after login returns the SharePoint site ID. In this example we need to back up/restore data from a SharePoint site named SharePoint-backup-youtube-demo. The query below lists all the sites that the logged-in account can access and whose name contains the keyword youtube:


https://graph.microsoft.com/v1.0/sites?search=youtube

[Screenshot: Microsoft Graph query listing the sites that contain a keyword]
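The same search can also be scripted. The snippet below is a minimal sketch, not the method used in the article: it assumes you already have a valid Graph access token, and the helper names (`site_search_url`, `graph_get`, `site_ids`) are hypothetical. It only relies on the documented response shape, where matching sites are returned in a `value` array and each entry has an `id`.

```python
import json
import urllib.parse
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def site_search_url(keyword: str) -> str:
    """Build the Graph URL that searches SharePoint sites by keyword."""
    return f"{GRAPH_BASE}/sites?search={urllib.parse.quote(keyword)}"


def graph_get(url: str, access_token: str) -> dict:
    """Send an authenticated GET to Microsoft Graph and decode the JSON body.

    The access token is assumed to be obtained elsewhere (hypothetical input).
    """
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def site_ids(search_response: dict) -> dict:
    """Map each returned site's display name (or name) to its site ID."""
    return {
        site.get("displayName") or site.get("name"): site["id"]
        for site in search_response.get("value", [])
    }
```

With a token in hand, `site_ids(graph_get(site_search_url("youtube"), token))` would return the candidate sites, from which you pick the ID of SharePoint-backup-youtube-demo.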

 

With the second query we will get the Drive IDs for all the document libraries from the SharePoint site by using the site ID:

 

https://graph.microsoft.com/v1.0/sites/{site-id}/drives

 

[Screenshot: Microsoft Graph query returning the list of drives for a site ID]
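As a rough illustration of this second step, the sketch below builds the drives URL from a site ID and pulls the Drive IDs out of the JSON that Graph returns. The function names and the sample IDs are hypothetical. The response is assumed to be already fetched and decoded into a dictionary, with the drives listed under `value`, each carrying a `name` and an `id`.

```python
import urllib.parse

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def site_drives_url(site_id: str) -> str:
    """Build the Graph URL that lists the document libraries (drives) of a site."""
    # Site IDs contain commas, so keep them unescaped in the path.
    return f"{GRAPH_BASE}/sites/{urllib.parse.quote(site_id, safe=',')}/drives"


def drive_ids(drives_response: dict) -> dict:
    """Map each document library name to its Drive ID."""
    return {
        drive["name"]: drive["id"]
        for drive in drives_response.get("value", [])
    }


# Hypothetical, shortened response used only to show the shape of the data.
example = {"value": [{"id": "b!exampleDriveId", "name": "Documents"}]}
print(drive_ids(example))
```

The Drive ID of the library you want to protect is the value to paste into the TrueNAS credential form.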

 

Now that we have the Drive ID we can use it in TrueNAS and finish the setup of the Cloud Credentials.

 

TrueNAS can operate in two directions (PUSH: from TrueNAS to the cloud; PULL: from the cloud to TrueNAS) and three modes:

SYNC: Files on the destination are changed to match those on the source. If a file does not exist on the source, it is also deleted from the destination.

COPY: Files from the source are copied to the destination. If files with the same names are present on the destination, they are overwritten.

MOVE: After files are copied from the source to the destination, they are deleted from the source. Files with the same names on the destination are overwritten.
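To make the difference between the three modes concrete, here is a small model of them in Python. It is only an illustration of the semantics described above, using in-memory dictionaries (file name to content) instead of real storage. The `cloud_sync` function is hypothetical and is not how TrueNAS implements the transfer.

```python
def cloud_sync(source: dict, destination: dict, mode: str) -> None:
    """Apply SYNC, COPY or MOVE semantics to two in-memory file trees."""
    if mode not in {"SYNC", "COPY", "MOVE"}:
        raise ValueError(f"unknown mode: {mode}")

    # All three modes copy source files and overwrite same-named files.
    destination.update(source)

    if mode == "SYNC":
        # Files missing from the source are removed from the destination.
        for name in list(destination):
            if name not in source:
                del destination[name]
    elif mode == "MOVE":
        # After copying, the files are deleted from the source.
        source.clear()


# PULL direction: the cloud is the source, the TrueNAS dataset is the destination.
cloud = {"report.docx": "v2", "budget.xlsx": "v1"}
dataset = {"report.docx": "v1", "old-notes.txt": "v1"}
cloud_sync(cloud, dataset, "SYNC")
print(dataset)
```

Note that with SYNC a file deleted in SharePoint also disappears from the dataset on the next run, which is why the snapshots discussed next matter.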

 

A dataset is needed to set up the Cloud Sync Task.

 

Periodic Snapshot Tasks can be created for the dataset, which satisfies the requirement for immutable backups. The snapshots can be restored to the dataset used for the Sync Task for a full restore, or to another dataset when we need more granular control over the restoration process (for example, when only a couple of files/folders are needed or there are some specific requirements).
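For a single-file restore there is also an alternative to restoring a whole snapshot: ZFS exposes each snapshot read-only under the `.zfs/snapshot/<snapshot name>/` directory of the dataset's mountpoint. The sketch below copies one file out of a snapshot that way. It is a different technique from the dataset restore described above, it assumes the script runs locally on the NAS, and the mountpoint, snapshot name and helper names are hypothetical.

```python
import shutil
from pathlib import Path


def snapshot_file_path(dataset_mountpoint: str, snapshot_name: str,
                       relative_path: str) -> Path:
    """Return the read-only path of a file inside a ZFS snapshot."""
    return (Path(dataset_mountpoint) / ".zfs" / "snapshot"
            / snapshot_name / relative_path)


def restore_file(dataset_mountpoint: str, snapshot_name: str,
                 relative_path: str, target_dir: str) -> Path:
    """Copy one file from a snapshot into target_dir, keeping its relative path."""
    source = snapshot_file_path(dataset_mountpoint, snapshot_name, relative_path)
    target = Path(target_dir) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    return target
```

A call such as `restore_file("/mnt/tank/sharepoint", "auto-2024-01-01_00-00", "Docs/report.docx", "/mnt/tank/restore")` (all values hypothetical) would place the old version of the file in a separate location without touching the live dataset.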

 
