Troubleshooting the Service Management Facility
Debugging a Service That Is Not Starting
In this procedure, the print service is disabled.
- Become superuser or assume a role that includes the Service Management rights profile.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC in System Administration Guide: Security Services.
- Request information about the service that is not running.
# svcs -xv
svc:/application/print/server:default (LP Print Service)
State: disabled since Wed 13 Oct 2004 02:20:37 PM PDT
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: man -M /usr/share/man -s 1M lpsched
Impact: 2 services are not running:
svc:/application/print/rfc1179:default
svc:/application/print/ipp-listener:default
The -x option explains why each service is in its current state. The -v option adds detail, including the list of service instances that are impacted.
- Enable the service.
# svcadm enable application/print/server
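When many services are affected, the Impact list can be pulled out of the svcs -xv output mechanically. The following is a minimal sketch, assuming the output layout shown in the example above (an Impact: line followed by one FMRI per line). The name impacted_services is a hypothetical helper, not part of SMF.

```shell
#!/bin/sh
# Sketch: print the FMRIs listed under the "Impact:" line of `svcs -xv` output.
# impacted_services is a hypothetical helper; it reads the output on stdin.
impacted_services() {
    awk '
        # Work on a copy of the line with surrounding whitespace removed.
        { line = $0; gsub(/^[ \t]+|[ \t]+$/, "", line) }
        line ~ /^Impact:/ { in_impact = 1; next }
        # A bare FMRI (no trailing description) after "Impact:" is an impacted service.
        in_impact && line ~ /^svc:\/[^ \t]+$/ { print line; next }
        # Anything else ends the current Impact list.
        { in_impact = 0 }
    '
}
```

On a live system the function would be fed directly, for example with svcs -xv | impacted_services.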
How to Repair a Corrupt Repository
This procedure shows how to replace a corrupt repository with a default copy
of the repository. When the repository daemon, svc.configd, is started, it does an
integrity check of the configuration repository. This repository is stored in /etc/svc/repository.db. The
repository can become corrupted for reasons such as a disk failure, a hardware or software bug,
or an accidental overwrite of the file.
If the integrity check fails, the svc.configd daemon writes a message to the
console similar to the following:
svc.configd: smf(5) database integrity check of:
/etc/svc/repository.db
failed. The database might be damaged or a media error might have
prevented it from being verified. Additional information useful to
your service provider is in:
/etc/svc/volatile/db_errors
The system will not be able to boot until you have restored a working
database. svc.startd(1M) will provide a sulogin(1M) prompt for recovery
purposes. The command:
/lib/svc/bin/restore_repository
can be run to restore a backup version of your repository. See
http://sun.com/msg/SMF-8000-MY for more information.
The svc.startd daemon then exits and starts sulogin to enable you to
perform maintenance.
- Enter the root password at the sulogin prompt. sulogin enables the root
user to enter system maintenance mode to repair the system.
- Run the following command:
# /lib/svc/bin/restore_repository
Running this command takes you through the necessary steps to restore a non-corrupt
backup. SMF automatically takes backups of the repository at key system moments. For
more information, see SMF Repository Backups.
When started, the /lib/svc/bin/restore_repository command displays a message similar to the following:
Repository Restore utility
See http://sun.com/msg/SMF-8000-MY for more information on the use of
this script to restore backup copies of the smf(5) repository.
If there are any problems which need human intervention, this script
will give instructions and then exit back to your shell.
Note that upon full completion of this script, the system will be
rebooted using reboot(1M), which will interrupt any active services.
If the system that you are recovering is not a local zone,
the script explains how to remount the / and /usr file systems with read
and write permissions to recover the databases. The script exits after printing these
instructions. Follow the instructions, paying special attention to any errors that might occur.
After the root (/) file system is mounted with write permissions, or
if the system is a local zone, you are prompted to select the
repository backup to restore:
The following backups of /etc/svc/repository.db exists, from
oldest to newest:
... list of backups ...
Backups are named according to their type and the time the backup
was taken. Backups beginning with boot are completed before the first change is made
to the repository after system boot. Backups beginning with manifest_import are
completed after svc:/system/manifest-import:default finishes its process. The time of the backup is
given in YYYYMMDD_HHMMSS format.
- Enter the appropriate response.
Typically, the most recent backup option is selected.
Please enter one of:
1) boot, for the most recent post-boot backup
2) manifest_import, for the most recent manifest_import backup.
3) a specific backup repository from the above list
4) -seed-, the initial starting repository. (All customizations
will be lost.)
5) -quit-, to cancel.
Enter response [boot]:
If you press Enter without specifying a backup to restore, the default response,
enclosed in [], is selected. Selecting -quit- exits the restore_repository script and
returns you to your shell prompt.
Note - Selecting -seed- restores the seed repository. This repository is designed for use during
initial installation and upgrades. Using the seed repository for recovery purposes should be a
last resort.
After the backup to restore has been selected, it is validated and
its integrity is checked. If there are any problems, the restore_repository command prints error
messages and prompts you for another selection. Once a valid backup is selected,
the following information is printed, and you are prompted for final confirmation.
After confirmation, the following steps will be taken:
svc.startd(1M) and svc.configd(1M) will be quiesced, if running.
/etc/svc/repository.db
-- renamed --> /etc/svc/repository.db_old_YYYYMMDD_HHMMSS
/etc/svc/volatile/db_errors
-- copied --> /etc/svc/repository.db_old_YYYYMMDD_HHMMSS_errors
repository_to_restore
-- copied --> /etc/svc/repository.db
and the system will be rebooted with reboot(1M).
Proceed [yes/no]?
- Type yes to remedy the fault.
The system reboots after the restore_repository command executes all of the listed actions.
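To make the backup naming scheme described in this procedure concrete, the following sketch splits a backup name into its type and a readable timestamp. It assumes names of the form type-YYYYMMDD_HHMMSS, such as boot-20040818_223820. The "-" separator and the sample name are assumptions, because this section states only the prefixes and the time format. The name describe_backup is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: split a repository backup name into its type and timestamp.
# Assumption: names look like "<type>-YYYYMMDD_HHMMSS".
describe_backup() {
    name=$1
    # Require a "-" between the type and the timestamp.
    case $name in
        *-*) ;;
        *) echo "unrecognized backup name: $name" >&2
           return 1 ;;
    esac
    stamp=${name##*-}          # text after the last "-"
    kind=${name%-"$stamp"}     # text before it, e.g. boot or manifest_import
    # Require exactly YYYYMMDD_HHMMSS digits in the timestamp.
    case $stamp in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "unrecognized backup name: $name" >&2
           return 1 ;;
    esac
    # Reformat the timestamp as "YYYY-MM-DD HH:MM:SS".
    when=$(printf '%s\n' "$stamp" |
        sed 's/^\(....\)\(..\)\(..\)_\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/')
    printf '%s %s\n' "$kind" "$when"
}
```

For example, describe_backup manifest_import-20040813_141519 would report the type manifest_import and the time 2004-08-13 14:15:19.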
How to Boot Without Starting Any Services
Problems with starting services can sometimes cause a system to hang during
boot. This procedure shows how to troubleshoot this problem.
- Boot without starting any services.
The following command instructs the svc.startd daemon to temporarily disable all services and start sulogin
on the console.
ok boot -m milestone=none
- Log in to the system as root.
- Enable all services.
# svcadm milestone all
- Determine where the boot process is hanging.
When the boot process hangs, determine which services are not running by running
svcs -a. Look for error messages in the log files in /var/svc/log.
- After fixing the problems, verify that all services have started.
- Verify that all needed services are online.
# svcs -x
- Verify that the console-login service dependencies are satisfied.
The following command verifies that the login process on the console will run.
# svcs -l system/console-login:default
- Continue the normal booting process.
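When reviewing svcs -a output during this procedure, it can help to filter out services that are online or intentionally disabled. The following is a minimal sketch, assuming the usual three-column svcs listing (STATE, STIME, FMRI) with a header line. The name problem_services is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: from `svcs -a` output on stdin, print services whose state
# suggests a problem. problem_services is a hypothetical helper.
# Assumption: columns are STATE, STIME, FMRI, preceded by a header line.
problem_services() {
    awk '
        $1 == "STATE" { next }                 # skip the header line
        {
            state = $1
            sub(/\*$/, "", state)              # a trailing "*" marks a transition
            if (state == "offline" || state == "maintenance" ||
                state == "degraded" || state == "uninitialized")
                print state, $3
        }
    '
}
```

On a live system this would be run as svcs -a | problem_services, and the log files in /var/svc/log for the reported services are then the first place to look.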
How to Force a sulogin Prompt If the system/filesystem/local:default Service Fails During Boot
Local file systems that are not required to boot the Solaris OS
are mounted by the svc:/system/filesystem/local:default service. When any of those file systems cannot
be mounted, the service enters a maintenance state. System startup continues, and
any services that do not depend on filesystem/local are started. Services that depend on
filesystem/local being online are not started.
To change the configuration of the system so that a sulogin prompt appears
immediately after the service fails instead of allowing system startup to continue, follow
the procedure below.
- Modify the system/console-login service.
# svccfg -s svc:/system/console-login
svc:/system/console-login> addpg site,filesystem-local dependency
svc:/system/console-login> setprop site,filesystem-local/entities = fmri: svc:/system/filesystem/local
svc:/system/console-login> setprop site,filesystem-local/grouping = astring: require_all
svc:/system/console-login> setprop site,filesystem-local/restart_on = astring: none
svc:/system/console-login> setprop site,filesystem-local/type = astring: service
svc:/system/console-login> end
- Refresh the service.
# svcadm refresh console-login
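The svccfg subcommands in this procedure follow a fixed pattern, so they can be generated for any property group name and target FMRI. The following sketch prints the same five subcommands shown above. The name dependency_commands is a hypothetical helper, and the grouping and restart_on values are simply copied from this procedure rather than being the only valid choices.

```shell
#!/bin/sh
# Sketch: print the svccfg subcommands that add a require_all dependency.
# dependency_commands is a hypothetical helper, not part of SMF.
#   $1  property group name, e.g. site,filesystem-local
#   $2  FMRI of the service to depend on
dependency_commands() {
    pg=$1
    target=$2
    cat <<EOF
addpg $pg dependency
setprop $pg/entities = fmri: $target
setprop $pg/grouping = astring: require_all
setprop $pg/restart_on = astring: none
setprop $pg/type = astring: service
EOF
}
```

The output could be reviewed first and then piped to svccfg -s svc:/system/console-login, which reads subcommands from standard input in the same way as the here-document in the Jumpstart example.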
Example 17-18 Forcing an sulogin Prompt Using Jumpstart
Save the following commands in a script named /etc/rcS.d/S01site-customfs. The script removes itself after it runs, so the dependency is added only once.
#!/bin/sh
#
# This script adds a dependency from console-login -> filesystem/local
# This forces the system to stop the boot process and drop to an sulogin prompt
# if any file system in filesystem/local fails to mount.
PATH=/usr/sbin:/usr/bin
export PATH
svccfg -s svc:/system/console-login << EOF
addpg site,filesystem-local dependency
setprop site,filesystem-local/entities = fmri: svc:/system/filesystem/local
setprop site,filesystem-local/grouping = astring: require_all
setprop site,filesystem-local/restart_on = astring: none
setprop site,filesystem-local/type = astring: service
EOF
svcadm refresh svc:/system/console-login
[ -f /etc/rcS.d/S01site-customfs ] &&
rm -f /etc/rcS.d/S01site-customfs
Troubleshooting
When a failure occurs with the system/filesystem/local:default service, use the svcs -vx command
to identify the failure. After the failure has been fixed, the
following command clears the error state and allows the system boot to continue:
# svcadm clear filesystem/local
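After the service is cleared, one way to confirm that console-login sees its filesystem/local dependency as satisfied is to inspect the dependency lines of svcs -l system/console-login:default. The following is a minimal sketch, assuming each dependency line has the form "dependency grouping/restart_on FMRI (state)". The name dependency_state is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: from `svcs -l` output on stdin, print the state reported for one
# dependency FMRI. dependency_state is a hypothetical helper.
# Assumption: lines look like
#   dependency   require_all/none svc:/system/filesystem/local (online)
dependency_state() {
    awk -v want="$1" '
        $1 == "dependency" {
            # Fields from 3 onward alternate between an FMRI and its "(state)".
            for (i = 3; i < NF; i++)
                if ($i == want) {
                    state = $(i + 1)
                    gsub(/[()]/, "", state)
                    print state
                    found = 1
                }
        }
        # Exit non-zero when the dependency is not listed at all.
        END { exit !found }
    '
}
```

On a live system this might be run as svcs -l system/console-login:default | dependency_state svc:/system/filesystem/local, with online as the result once the file systems are mounted.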