Initiating a Restart Explicitly

You can use a WFL RERUN statement to restart a checkpointed task. The checkpoint files of the task to be restarted must have been permanently saved. Checkpoint files are permanently saved if the checkpoint disposition is LOCK, if the job terminates abnormally, or if the checkpoint is initiated by an operator BR command.

The RERUN statement can be included in a WFL job. You can also enter the RERUN statement directly at the ODT, in which case a WFL job is created to perform the restart. The RERUN statement has the following form:

RERUN <job number> / <checkpoint number>

In the RERUN statement, the job number is the mix number of the job that initiated the checkpointed task. The checkpoint number identifies the checkpoint that is to be used.

If the checkpointed task had a usercode, the checkpoint files are stored under that usercode. To restart such a task, you must enter the RERUN statement in a job that specifies that usercode. For example, you can enter the following at an ODT:

?BEGIN JOB;USERCODE = <usercode> / <password>;
  RERUN <job number> / <checkpoint number>
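
As an illustration only, the two forms shown above can be assembled by a small helper. The following Python sketch is not part of WFL or the MCP. The function names and the sample values (job number 1234, checkpoint number 2, usercode OPS, password SECRET) are hypothetical, and the rule that negative numbers are rejected is an assumption.

```python
from typing import Optional


def rerun_statement(job_number: int, checkpoint_number: int) -> str:
    """Return a RERUN statement in the form <job number> / <checkpoint number>."""
    # Assumption: job and checkpoint numbers are never negative.
    if job_number < 0 or checkpoint_number < 0:
        raise ValueError("job and checkpoint numbers must not be negative")
    return f"RERUN {job_number} / {checkpoint_number}"


def odt_rerun_input(job_number: int,
                    checkpoint_number: int,
                    usercode: Optional[str] = None,
                    password: Optional[str] = None) -> str:
    """Return the text to enter at an ODT.

    Without a usercode, this is the bare RERUN statement. With a usercode,
    the statement is wrapped in a job that specifies the usercode, using the
    same layout as the form shown in the text.
    """
    statement = rerun_statement(job_number, checkpoint_number)
    if usercode is None:
        return statement
    if password is None:
        raise ValueError("a password is required when a usercode is given")
    return f"?BEGIN JOB;USERCODE = {usercode} / {password};\n  {statement}"


# Hypothetical values, for illustration only.
print(odt_rerun_input(1234, 2))
print(odt_rerun_input(1234, 2, usercode="OPS", password="SECRET"))
```

The helper only formats text. It does not check that the checkpoint files exist or that the usercode is valid.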

The following are some of the conditions that can prevent a successful restart:

  • The usercode of the checkpointed task or its job is no longer valid.

  • The program has been recompiled since the checkpoint was created.

  • The system is now running a different MCP release level than the one in use when the checkpoint was created. For example, the system is now running a 48.1 MCP, and the checkpoint was created on a system running a 47.1 MCP.

  • The system is now using different intrinsics than those in use when the checkpoint was taken.

  • The checkpoint files are not present on either the DISK family or the PACK family. The files must be on one of these two families, regardless of any FAMILY equations entered with the RERUN statement.

  • The restart is attempted on a different type of machine from the one where the checkpoint was taken. For example, the process was checkpointed on an LX5100 and the restart is attempted on an NX5820.
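
The conditions listed above can be read as a checklist that compares the environment at checkpoint time with the environment at restart time. The following Python sketch models that comparison. It is purely illustrative: the `Snapshot` fields, the `restart_blockers` function, and the way each value is represented are hypothetical, and the sketch does not reflect how the MCP actually performs these checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of the environment at one point in time."""
    code_compiled_at: str   # compile timestamp of the program's code file
    mcp_level: str          # for example "47.1"
    intrinsics_id: str      # any identifier for the installed intrinsics
    machine_type: str       # for example "LX5100"


def restart_blockers(taken: Snapshot,
                     now: Snapshot,
                     usercode_valid: bool,
                     checkpoint_family: str) -> List[str]:
    """Return the listed conditions that would prevent a restart."""
    blockers = []
    if not usercode_valid:
        blockers.append("usercode no longer valid")
    if taken.code_compiled_at != now.code_compiled_at:
        blockers.append("program recompiled since checkpoint")
    if taken.mcp_level != now.mcp_level:
        blockers.append(
            f"MCP release level changed: {taken.mcp_level} -> {now.mcp_level}")
    if taken.intrinsics_id != now.intrinsics_id:
        blockers.append("intrinsics changed")
    if checkpoint_family not in ("DISK", "PACK"):
        blockers.append("checkpoint files not on DISK or PACK family")
    if taken.machine_type != now.machine_type:
        blockers.append(
            f"machine type changed: {taken.machine_type} -> {now.machine_type}")
    return blockers


# Sample values taken from the examples in the list above.
taken = Snapshot("2020-01-01T00:00", "47.1", "A", "LX5100")
now = Snapshot("2020-01-01T00:00", "48.1", "A", "NX5820")
for reason in restart_blockers(taken, now, usercode_valid=True,
                               checkpoint_family="DISK"):
    print(reason)
```

An empty list means only that none of the conditions modeled here applies. The list in the text is not exhaustive, so a restart can still fail for other reasons.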

If a rerun is initiated and the job number is in use by another job, a new job number is assigned, and the checkpoint files are automatically retitled to reflect the new job number.

The following messages can be displayed to show the result of the restart attempt:

  • RESTART PENDING

  • RESTART INITIATED

  • RESTART ABORTED: BAD CHECKPOINT FILE

  • RESTART ABORTED: BAD STACK NUMBER

  • RESTART ABORTED: CODEFILE INCOMPATIBLE WITH MCP

  • RESTART ABORTED: ERR COPYING JOB FILE

  • RESTART ABORTED: FILE IS ON A RESTRICTED FAMILY

  • RESTART ABORTED: FILE IS RESTRICTED

  • RESTART ABORTED: FILE POSITIONING ERROR

  • RESTART ABORTED: INVALID JOB FILE

  • RESTART ABORTED: IO ERROR DURING RESTART

  • RESTART ABORTED: IO ERROR READING FROM CHECKPOINT FILE

  • RESTART ABORTED: IO ERROR READING SEG0 OF CODE FILE

  • RESTART ABORTED: IO ERROR READING SEG0 OF JOB FILE

  • RESTART ABORTED: IO ERROR READING SEG0 OF TASK FILE

  • RESTART ABORTED: MACHINE TYPES DIFFER

  • RESTART ABORTED: MISSING CHECKPOINT FILE

  • RESTART ABORTED: MISSING CODE FILE

  • RESTART ABORTED: MISSING FAMILY MEMBER

  • RESTART ABORTED: MISSING JOB FILE

  • RESTART ABORTED: NOT ABLE TO RESTART

  • RESTART ABORTED: OPERATOR DSED RESTART

  • RESTART ABORTED: OPERATOR QTED RESTART

  • RESTART ABORTED: PAGED ARRAY PAGE SIZE HAS CHANGED

  • RESTART ABORTED: TAPE LABELKIND CONFLICTS WITH FILEUSE

  • RESTART ABORTED: USERCODE ERROR

  • RESTART ABORTED: USERCODE NO LONGER VALID

  • RESTART ABORTED: WRONG CODE FILE

  • RESTART ABORTED: WRONG JOB FILE

  • RESTART ABORTED: WRONG MCP
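
All of the messages listed above are either RESTART PENDING, RESTART INITIATED, or RESTART ABORTED followed by a colon and a reason. A script that scans captured message text could use that pattern to separate the outcome from the reason. The following Python sketch shows one way to do so. The function name and the returned labels are hypothetical, and the sketch assumes the message text is already available as a string.

```python
from typing import Optional, Tuple

ABORT_PREFIX = "RESTART ABORTED: "


def classify_restart_message(message: str) -> Tuple[str, Optional[str]]:
    """Split a restart result message into (outcome, reason).

    The outcome is "pending", "initiated", "aborted", or "unknown".
    The reason is the text after "RESTART ABORTED: ", or None otherwise.
    """
    # Tolerate surrounding whitespace and a leading list bullet.
    text = message.strip().lstrip("\u2022").strip()
    if text == "RESTART PENDING":
        return ("pending", None)
    if text == "RESTART INITIATED":
        return ("initiated", None)
    if text.startswith(ABORT_PREFIX):
        return ("aborted", text[len(ABORT_PREFIX):])
    return ("unknown", None)


print(classify_restart_message("RESTART INITIATED"))
print(classify_restart_message("RESTART ABORTED: MACHINE TYPES DIFFER"))
```

Text that does not match any of the listed forms is reported as "unknown" so that a caller does not mistake it for a restart result.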