Examining four ways to manage migration or synchronisation with the cloud.
Cloud storage revenue is forecast to grow more than 28 percent annually to reach $65 billion in 2020. The driving force is the substantial economies of scale that enable cloud-based solutions to deliver more cost-effective primary and backup storage than on-premises systems can ever hope to achieve.
Most IT departments quickly discover, however, that there are significant challenges involved in migrating and synchronising many thousands or even millions of files from on-premises storage systems to what Gartner characterises as enterprise file synchronisation and sharing (EFSS) services in the cloud. According to the research firm, by 2019, 75 percent of enterprises will have deployed multiple EFSS capabilities, and over 50 percent will struggle with problems of data migration.
In a newly published report titled ‘How to migrate file shares, SaaS and ECM to EFSS’, Gartner identifies four ways organisations can manage migration to and/or synchronisation with EFSS services: custom integration, rudimentary copy, EFSS import services and specialised third-party tools.
Custom integration
Every file has a unique set of properties associated with it, and most file systems treat at least some of these file properties differently. The properties include the basics, such as file name, format and metadata, along with the more advanced, such as versioning, ownership preservation, and permissions.
In a hybrid storage environment, file names might need to be normalised. Versions might need to be tracked manually. Different security models might be needed for each file system, potentially creating difficulties for users – and placing a significant burden on the help desk. In any complex custom integration, there are bound to be mistakes. And the biggest problem in a hybrid storage environment is often an inability to detect file transfer corruption or version conflicts before they harm the organisation.
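One common way to catch transfer corruption is to compare content hashes after a copy. The following is a minimal sketch, assuming both storage systems are reachable as local or mounted directories; the function names `sha256_of` and `find_mismatches` are illustrative and not taken from any product mentioned here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root: Path, target_root: Path) -> list:
    """List files that are missing from, or differ in, the target tree."""
    problems = []
    source_files = sorted(p for p in source_root.rglob("*") if p.is_file())
    for source_file in source_files:
        relative = source_file.relative_to(source_root)
        target_file = target_root / relative
        if not target_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(target_file):
            problems.append(f"corrupt: {relative.as_posix()}")
    return problems
```

A check like this only covers file content. Permissions, ownership and version history would need their own comparisons, which is where much of the real effort tends to go.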
Consider the experience of Shawmut Design and Construction, a construction management firm with offices throughout the US. The company uses BIM 360 software from Autodesk for construction management, and the ShareFile platform from Citrix for collaboration with the team in the field.
Change orders are common in construction projects, and using out-of-date information can cause costly mistakes. So the superintendent in charge of the project took great care to ensure that all of the files were accurately synchronised daily. Using the file management capabilities built into BIM 360 and ShareFile, the effort required three project managers – two full-time and one part-time. Every day, the staff compared the versions of the many files in both systems, copying the latest from one to the other as needed to keep everything in sync. If three people are needed to handle synchronisation between just two file systems, it is not surprising that complexity can increase exponentially in an organisation with a dozen or more.
Shawmut did not attempt to have IT resources automate the file synchronisation task, but other companies have – normally with unsatisfactory results. Getting bi- or multi-directional file synchronisation to work well is not a trivial endeavour. Indeed, successfully navigating the different ‘file logistics’ of multiple incompatible storage systems can become a Tower of Babel that is fraught with potential peril. Making a mistake when comparing just one of the file’s properties, such as the last accessed/modified date, user/group access permissions or locking, can result in a file becoming corrupt or being overwritten by an older version. And if the custom integration application lacks robust error detection and reporting – something that is deceptively difficult – the mistake will remain undetected until a user complains.
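To illustrate why two-way synchronisation is harder than it looks, here is a hedged sketch of one possible decision rule. It assumes the integration records a content hash for each file at the last successful sync, and it compares against that baseline rather than trusting timestamps alone. The function `decide` and its return labels are hypothetical.

```python
from typing import Optional


def decide(left_hash: Optional[str],
           right_hash: Optional[str],
           last_synced_hash: Optional[str]) -> str:
    """Choose a sync action for one file held on two systems.

    A hash of None means the file is absent on that side (or, for
    last_synced_hash, that the file has never been synchronised).
    """
    if left_hash is None and right_hash is None:
        return "noop"
    if left_hash is None or right_hash is None:
        if last_synced_hash is not None:
            # The file existed at the last sync, so this may be a
            # deletion. Flag it rather than silently restoring it.
            return "review"
        return "copy_right_to_left" if left_hash is None else "copy_left_to_right"
    if left_hash == right_hash:
        return "noop"
    if right_hash == last_synced_hash:
        return "copy_left_to_right"   # only the left side changed
    if left_hash == last_synced_hash:
        return "copy_right_to_left"   # only the right side changed
    return "conflict"                 # both changed, or no baseline
```

The important branch is the last one: when both copies have changed since the baseline, neither should overwrite the other without someone being told about it.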
For a one-time migration or a one-way backup, a custom integration effort, consisting of a combination of manual and automated procedures, may work well enough. This is especially true if the differences among the storage systems involved are relatively minor and manageable.
Rudimentary copy
Using familiar, proven and low-tech ‘brute force’ bulk copy commands, such as xcopy in Windows/DOS and rsync in Linux, is certainly simple and, therefore, might seem to be fairly foolproof. Applications like the File Explorer in Windows and the file management applications offered with most EFSS services also provide bulk file and folder copying capabilities.
For brute force bulk copy to work well, though, the storage systems involved either need to be compatible or must be made interoperable at their ‘lowest common denominator’.
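As an illustration of what a ‘lowest common denominator’ can mean for file names, the sketch below replaces characters that Windows does not allow in file names, strips trailing dots and spaces, and reports names that would collide on a case-insensitive target. The helper names `normalise_name` and `find_collisions` are assumptions for this example, and real EFSS services may impose further restrictions not covered here.

```python
import re
import unicodedata

# Characters not permitted in Windows file names, plus control characters.
FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def normalise_name(name: str, replacement: str = "_") -> str:
    """Return a file name that is safe on the most restrictive target."""
    name = unicodedata.normalize("NFC", name)
    name = FORBIDDEN.sub(replacement, name)
    name = name.rstrip(" .")  # Windows rejects trailing dots and spaces
    return name or replacement


def find_collisions(names):
    """Group original names that normalise to the same case-folded name."""
    seen = {}
    for original in names:
        key = normalise_name(original).casefold()
        seen.setdefault(key, []).append(original)
    return {key: group for key, group in seen.items() if len(group) > 1}
```

Running `find_collisions` over a folder listing before a bulk copy shows which files would silently overwrite one another on the destination.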
Import services
Various forms of import services are available with virtually all EFSS platforms. Each has its own file management application with an online file import function, and some providers recommend using a physical disk drive when importing more than 100GB of data.
While these online applications and services shift responsibility to the EFSS provider, they can suffer from the same complexities and limitations encountered in custom integrations and rudimentary copy mechanisms, such as lost permission models and structures, user-defined metadata, file ownership and versions. So if the import service fails to adequately accommodate the underlying file property differences between or among the different storage systems, the results are destined to be less than satisfactory.
Third-party tools
The growing popularity and inherent complexities of hybrid storage architectures have created a demand for specialised ‘middleware’ software designed specifically to manage storage system migration and synchronisation. While designs vary, the more advanced file logistics systems use a custom ‘connector’ for each storage system supported. The connectors provide a common set of functionality that enables every storage system to interoperate with all others, without sacrificing the advanced capabilities of any. The result is a hybrid content management system capable of serving as an intelligent intermediary between or among many different storage systems.
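The connector idea can be sketched as a small common interface that each storage system implements. This is an illustrative design only, not the architecture of any particular product: the `Connector` class, its three methods and the in-memory stand-in are all hypothetical, and a real connector would also have to carry permissions, metadata and versions.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Optional


class Connector(ABC):
    """Common operations every storage system must expose."""

    @abstractmethod
    def list_paths(self) -> Iterable[str]: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class InMemoryConnector(Connector):
    """Stand-in for a real storage system, used here for illustration."""

    def __init__(self, files: Optional[Dict[str, bytes]] = None) -> None:
        self._files: Dict[str, bytes] = dict(files or {})

    def list_paths(self) -> List[str]:
        return sorted(self._files)

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


def copy_missing(source: Connector, target: Connector) -> List[str]:
    """Copy files the target lacks; existing target files are untouched."""
    existing = set(target.list_paths())
    copied = []
    for path in source.list_paths():
        if path not in existing:
            target.write(path, source.read(path))
            copied.append(path)
    return copied
```

Because `copy_missing` depends only on the interface, the same logic would work between any pair of systems for which a connector exists, which is the appeal of the middleware approach.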
To provide the agility desired in a hybrid storage environment, the connectors normally support a wide range of both on-premises storage systems (such as NFS/SAN/NAS, SharePoint, and various enterprise content management solutions) and EFSS platforms (such as Box, Dropbox for Business, Google Drive, Office 365, OneDrive, ShareFile, and Syncplicity). The depth and breadth of support make these tools suitable for supporting most enterprise applications.
Increasing frustration with its manual synchronisation motivated Shawmut to pilot a third-party hybrid content management tool, and the improvement was immediate. With connectors for both Shawmut’s on-premises storage system and Citrix ShareFile, the tool automatically synchronises files every night based on just a few ‘point-and-click’ instructions, which has eliminated the need for painstaking manual comparisons. Now the project superintendent spends only a few minutes at the end of each workday to set up the synchronisation. After confirming the tool worked as desired, the three project managers previously responsible for synchronising the files were reassigned to more productive tasks.
Deciding which of these four alternatives is the best and most cost-effective in any particular situation begins with taking an inventory of all the storage systems in use enterprise-wide, both on-premises and in the cloud.