Archiving is the process of copying a file from a file system to a volume that is located on a removable media cartridge or on a disk partition of another file system. The archiver, a component of the Sun SAM software, automatically copies online disk cache files to archive media unless you configure a no-archive policy.
Before the archive process is initiated, the archiver searches for archiving directives that are defined in archive policies. An archive policy is a collection of file system directives, copy directives, copy parameters, and volume associations that determines how groups of files can be archived. Several archive policies can be associated with an archiving file system. For more information about policies, see About Archive Policies.
Based on the directives defined in the default policy for a file system, the archiver automatically creates one archive copy of all files and sends the copy to archive media. You can configure the archiver to create up to four archive copies on a variety of archive media by modifying the default policy for the file system or by creating a custom policy and applying it to the file system.
Before a file is considered a candidate for archiving, the data in the file must be modified. A file is not archived if it is only accessed. A file is selected for archiving based on its archive age, which is the amount of time since the file’s last modification. The archive age can be defined for each archive copy in an archive policy.
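The selection rule above can be illustrated with a minimal sketch. The `FileState` type, its field names, and the `needs_archiving` function are hypothetical and are not part of the Sun SAM software; they only model the rule that a file qualifies when its data has been modified and the configured archive age has elapsed since that modification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    """Hypothetical view of a file's timestamps, in seconds since the epoch."""
    mtime: float                            # last data modification
    atime: float                            # last access (ignored for archiving)
    archived_mtime: Optional[float] = None  # mtime captured by the last archive copy


def needs_archiving(f: FileState, archive_age: float, now: float) -> bool:
    """Return True if the file has unarchived modifications that are old enough."""
    # A file whose current data is already archived is not a candidate,
    # no matter how recently it was accessed.
    if f.archived_mtime is not None and f.archived_mtime >= f.mtime:
        return False
    # The archive age is measured from the last modification, not the last access.
    return (now - f.mtime) >= archive_age
```

For example, with an archive age of 240 seconds, a file modified 300 seconds ago qualifies, while a file modified 100 seconds ago does not yet qualify.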
If a large file is segmented into smaller pieces, each segment is treated as a file, and each segment is archived separately.
The following table provides general planning recommendations that can help improve archiving performance.
Recommendation | Reason |
---|---|
Save archive log files. | The archive log files provide information that is essential to recovering data. Store the log files in a safe place in the case of a disaster or in case the Sun SAM software is unavailable. You can configure global archive log files on the Global Parameters Page. You can also configure individual log files for each archiving file system on the File System Archive Policies Page for the file system. |
Specify volume ranges in archive copies, rather than specific volume names. | Let the software place files on many different volumes. Volume ranges enable the system to run continuously. Typing specific volume names in archive copies can rapidly fill a volume, causing undue workflow problems as you remove a piece of media and replace it with another. For information about creating archive copies, see How to Add a Copy to a Policy. |
Base archive intervals on how often files are created and modified, and whether you want to save all modification copies. | The archive interval is the time between file system scans. A very short archive interval keeps the archiver scanning almost continuously. For information about general archiving setup, including archiver scanning, see Global Parameters Page. |
Consider the number of file systems you are using. | Multiple small archiving file systems result in better archiver performance than a single large archiving file system. The archiver uses a separate process for each file system. Multiple file systems can be scanned in considerably less time than a single large file system. For information about the number of file systems to create based on your environment, see Table 43. |
Use directory structures to organize files in an archiving file system as you would a UNIX file system. | For performance considerations, do not place more than 10,000 files in a directory. For information about other archiving configuration guidelines, see Table 43. |
Always make a minimum of two file copies on two separate volumes. | Placing data on a single media type puts your data at risk if physical problems with the media occur. Do not rely on a single archive copy. For information about creating archive copies, see How to Add a Copy to a Policy. |
Schedule recovery points to be created on a regular basis. | You can use the information stored in a recovery point to recover an archiving file system in the event of a disaster. Make sure that files are archived before the recovery point is created. For more information, see About Protecting Archiving File System Data. |
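One of the recommendations above, keeping directories at or below 10,000 files, can be checked with a short script. The following is a minimal sketch that uses only the Python standard library; the function name `crowded_directories` is hypothetical, and the script is not a Sun SAM tool.

```python
import os


def crowded_directories(root, limit=10_000):
    """Yield (directory, file_count) for each directory under root
    that directly contains more than `limit` files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Only files located directly in this directory are counted;
        # subdirectories are visited on their own by os.walk.
        if len(filenames) > limit:
            yield dirpath, len(filenames)
```

For example, `list(crowded_directories("/sam1"))` would report the directories under a mount point that exceed the recommended limit, where `/sam1` is a placeholder path.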
If you are managing a server that has the Sun SAM software (SUNWsamfsr and SUNWsamfsu packages) installed locally and you are configuring file systems on the server to be archiving, you should have at least one tape library associated with the current server.
The following table describes archiving configuration guidelines, on a per-tape-library basis, that can prevent you from overextending your environment.
Number of Tape Drives | Number of Custom Policies to Create | Maximum Number of File Systems to Create | Maximum Number of Files in Each File System | Library Recycler Values |
---|---|---|---|---|
2–3 | 1 | 4 | 6 million | |
4–5 | 1 | 6 | 6 million | |
6–7 | 2 | 10 | 8 million | |
8–10 | 4 | 10 | 10 million | |
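For planning scripts, the guidelines in the preceding table can be encoded as a small lookup. This is an illustrative sketch only: the `Guideline` type and the `guideline_for` function are hypothetical names, the values are copied from the table, and the Library Recycler Values column is omitted because the table lists no values for it.

```python
from typing import NamedTuple, Optional


class Guideline(NamedTuple):
    min_drives: int
    max_drives: int
    custom_policies: int
    max_file_systems: int
    max_files_per_fs: int


# One entry per table row, keyed by the tape drive range of a single library.
GUIDELINES = [
    Guideline(2, 3, 1, 4, 6_000_000),
    Guideline(4, 5, 1, 6, 6_000_000),
    Guideline(6, 7, 2, 10, 8_000_000),
    Guideline(8, 10, 4, 10, 10_000_000),
]


def guideline_for(drives: int) -> Optional[Guideline]:
    """Return the table row that covers the given drive count, or None
    if the table gives no guidance for that count."""
    for g in GUIDELINES:
        if g.min_drives <= drives <= g.max_drives:
            return g
    return None
```

A library with six tape drives, for example, maps to two custom policies, at most 10 file systems, and at most 8 million files in each file system.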
Continuous archiving eliminates the need for the archiver to periodically scan an archiving file system. With continuous archiving, the archiving file system notifies the archiver when files have changed and the archiver determines how and when to schedule archiving based on the Start Age, Start Count, and Start Size values defined in the copies of the policy associated with those files.
To enable continuous archiving for all archiving file systems on the current server, choose the No Scan scan method on the Global Parameters page. To enable continuous archiving for an individual archiving file system, choose the No Scan method on the File System Archive Policies page for that file system. Any setting that you make on the File System Archive Policies page overrides the equivalent setting on the Global Parameters page.
To specify Start Age, Start Count, and Start Size values for all archiving file systems on the current server, edit the values in the copies of the configurable defaults (allsets) archive policy. To override the settings in the configurable defaults policy, specify Start Age, Start Count, and Start Size values in the individual copies of any other archive policy. You do this on the Advanced Copy Options page of an archive policy.
The following examples describe how the policy copy values affect continuous archiving:
Start Age: If it takes one hour to create files that must be archived together, you can create a policy copy and set its Start Age value to 1 hour. This ensures that all necessary files are created before archiving begins.
Start Size: If you want the archiver to wait until a specific amount of data is ready to be archived, you can create a policy copy and set its Start Size value to, for example, 150 Gbytes. This directs the archiver to wait until 150 Gbytes of data are ready to be archived.
Start Count: If you know that 3000 files will be generated for archival, you can create a policy copy and set its Start Count value to 3000. This ensures that all 3000 files are archived together.
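The way these three values gate the start of archiving can be sketched as a simple check. The sketch assumes that archiving starts as soon as any one configured condition is met; that rule, the `StartConditions` type, and the `should_start` function are assumptions made for illustration and do not describe the archiver's internal scheduler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartConditions:
    """Hypothetical model of the start values in one policy copy."""
    start_age: Optional[float] = None  # seconds the oldest pending file has waited
    start_count: Optional[int] = None  # number of pending files
    start_size: Optional[int] = None   # total pending bytes


def should_start(cond: StartConditions, oldest_wait: float,
                 pending_count: int, pending_bytes: int) -> bool:
    """Return True when any configured start condition is satisfied."""
    checks = []
    if cond.start_age is not None:
        checks.append(oldest_wait >= cond.start_age)
    if cond.start_count is not None:
        checks.append(pending_count >= cond.start_count)
    if cond.start_size is not None:
        checks.append(pending_bytes >= cond.start_size)
    # With no condition configured this returns False; the defaults that the
    # real software applies in that case are not modeled here.
    return any(checks)
```

With a Start Age of 1 hour and a Start Count of 3000, for example, the check succeeds when the 3000th file arrives, even if the first file has waited only a few minutes.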
Associative archiving is useful if you want to archive an entire directory to one volume and if all the contents in that directory will fit on a single archive volume. You can enable associative archiving by selecting the Force files in a directory to be archived together check box in an archive policy copy.
When files are archived, they are grouped together in one or more archive files to efficiently pack the volume. Subsequently, when accessing files from the same directory, you might experience delays as the stage process repositions through a volume to read the next file. To alleviate delays, you can choose to archive files from the same directory paths contiguously within an archive file. The process of associative archiving overrides the space efficiency algorithm to keep files from the same directory together. You can specify that these files are to be archived contiguously within a copy of an archive policy.
Associative archiving is useful when the file content does not change and you always want to access the group of files together at the same time. For example, you might use associative archiving at a hospital for accessing medical images. Images associated with a single patient can be kept in a single directory, enabling a doctor to access those images together at one time.
However, because associative archiving specifies that all files from the same directory be archived on a single volume, it is possible that a group of files might not fit on any available volume. In this case, the files are not archived until more volumes are assigned to the archive policy. Another issue is that the group of files to be archived might be so large that it can never fit on a single volume. In such a case, the files are never archived.
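The volume-fit limitation can be illustrated with a small planning sketch. It groups files by directory, then tries to place each group whole on a volume with enough free space. The first-fit placement order and the `plan_associative` function are hypothetical; the actual volume selection performed by the software is not described here.

```python
import posixpath
from collections import defaultdict


def plan_associative(files, volume_free):
    """files: mapping of file path -> size in bytes.
    volume_free: mapping of volume name -> free bytes.
    Returns (assignments, unplaced): directory -> volume, plus the
    directories whose files fit on no single volume."""
    # Total the size of each directory, because the group must stay together.
    groups = defaultdict(int)
    for path, size in files.items():
        groups[posixpath.dirname(path)] += size

    free = dict(volume_free)  # work on a copy
    assignments, unplaced = {}, []
    for directory in sorted(groups):
        need = groups[directory]
        for vol in sorted(free):  # hypothetical first-fit order
            if free[vol] >= need:
                assignments[directory] = vol
                free[vol] -= need
                break
        else:
            # No single volume can hold the whole group, so it stays unarchived.
            unplaced.append(directory)
    return assignments, unplaced
```

A directory that ends up in `unplaced` corresponds to the case described above: its files wait until more volumes are assigned, or are never archived if the group exceeds the capacity of any single volume.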
An alternative to associative archiving is the sorting of files by path. Sorting keeps files together, but does not force the files to be archived together. You can specify sorting in the copy of an archive policy.
The releaser is a component of the Sun SAM software. After a file is archived, it becomes eligible to be released. The releaser frees primary (disk) storage that is used by the archived file’s data. Two threshold values, High Water Mark and Low Water Mark, manage online disk cache free space. You can define these threshold values when you create an archiving file system or when you edit the mount options of an archiving file system.
When online disk consumption exceeds the High Water Mark threshold, the system automatically begins releasing the disk space occupied by eligible archived files. Disk space occupied by archived file data is released until the Low Water Mark threshold is reached. Files are selected for release depending on file size and age.
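The interaction of the two thresholds can be sketched as follows. The `CachedFile` type and the `select_for_release` function are hypothetical, and the release order used here (oldest first, then largest) is an assumption; the text above states only that selection depends on file size and age.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int      # bytes held in the online disk cache
    age: float     # seconds since the file was last used
    archived: bool # only archived files are eligible for release


def select_for_release(files, capacity, high_water_pct, low_water_pct):
    """Return the names of files to release, in release order."""
    used = sum(f.size for f in files)
    # Nothing happens until consumption exceeds the High Water Mark.
    if used <= capacity * high_water_pct / 100:
        return []
    target = capacity * low_water_pct / 100
    # Assumed priority: oldest first, larger files first among equal ages.
    candidates = sorted((f for f in files if f.archived),
                        key=lambda f: (-f.age, -f.size))
    released = []
    for f in candidates:
        if used <= target:  # Low Water Mark reached
            break
        released.append(f.name)
        used -= f.size
    return released
```

Note that releasing stops either when the Low Water Mark is reached or when no eligible archived files remain, whichever comes first.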
Optionally, the first portion of a file can be retained on disk for speedy access and for masking staging delays. If a file has been archived in segments, portions of the file can be released individually.
The stager is a component of the Sun SAM software. When a file whose data blocks have been released is accessed, the stager automatically stages the data blocks of the file or file segment back to online disk cache. The read operation tracks along directly behind the staging operation, so the file is available to the application before it is completely staged.
The Sun SAM software processes stage request errors automatically. If a stage error is returned, the system attempts to find the next available archive copy of the file. Stage errors that can be automatically processed include media errors, unavailability of media, and unavailability of an automated library, among others.
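The fallback behavior described above follows a common retry pattern, sketched below. The `StageError` exception, the `stage_with_fallback` function, and the `read_copy` callable are hypothetical names used for illustration; they are not Sun SAM interfaces.

```python
class StageError(Exception):
    """Raised when one archive copy cannot be staged (for example, a media error)."""


def stage_with_fallback(copies, read_copy):
    """Try each archive copy in order and return the first successful result.

    copies: ordered identifiers of the available archive copies.
    read_copy: callable that stages one copy and raises StageError on failure.
    """
    errors = []
    for copy in copies:
        try:
            return read_copy(copy)
        except StageError as exc:
            # Record the failure and move on to the next available copy.
            errors.append((copy, str(exc)))
    raise StageError("all archive copies failed: %r" % (errors,))
```

This pattern is one reason for the recommendation to keep at least two archive copies: with a single copy, there is nothing to fall back to when a stage error occurs.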