The nightly transfer failed silently and the SLA breach surfaced a week later

Somewhere in your estate there is a crontab line that pushes a settlement file to a partner at 02:00. It has fired for two years. Over that time one belief quietly calcified into operational fact: the job runs, therefore the files arrive. Those are two separate claims, and the gap between them is exactly where the SLA breach hides.

Then one night the partner's SFTP endpoint went down for maintenance. The sftp client exited non-zero. Cron captured the output and discarded it, because the box had no mail agent installed and nobody had pointed MAILTO at a monitored address. Nothing moved. Nobody was paged. The partner flagged the missing files a week later, after their reconciliation came up short, and by then the breach was theirs to report and yours to explain in a meeting you did not want to be in.

Cron success is not delivery success

Cron answers one question: did the command launch on schedule. It knows nothing about whether the bytes reached the partner's /inbound intact, or whether the partner acknowledged them. A script that resolves a stale DNS record, opens a socket to nowhere, hangs, and exits is, as far as cron is concerned, a job that ran fine. Worse, the exit code is often 0 because the hand-rolled wrapper swallowed it, and most hand-rolled wrappers do.
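As a rough illustration of a wrapper that does not swallow the outcome, here is a minimal Python sketch. It assumes the transfer is launched as a subprocess. The names `run_transfer` and `main`, the default timeout, and the use of exit status 124 for a hang (borrowed from the coreutils `timeout` convention) are illustrative assumptions, not part of any particular tool.

```python
import subprocess
import sys

TIMEOUT_EXIT = 124  # assumption: mirrors the coreutils `timeout` convention


def run_transfer(cmd, timeout_s=300):
    """Run the transfer command and return (exit_code, combined_output).

    A hang is turned into an explicit non-zero code instead of being
    left to run until something else kills it.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep errors together with normal output
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        output = exc.output or ""
        if isinstance(output, bytes):  # partial output may arrive undecoded
            output = output.decode(errors="replace")
        return TIMEOUT_EXIT, output
    return proc.returncode, proc.stdout


def main(argv):
    """Hypothetical entry point: pass the real command as arguments."""
    code, output = run_transfer(argv)
    sys.stderr.write(output)  # surface the output rather than discard it
    return code  # propagate the child's status, never hard-code 0
```

A script launched by cron could end with `sys.exit(main(sys.argv[1:]))`, so the child's real exit status is what the job reports. This closes only the swallowed-exit-code hole: a non-zero status still proves nothing about delivery.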

The trap fits in one sentence: cron confirms invocation, your SLA is written against delivery, and nobody ever wired the two together. A scheduled transfer with no delivery check is fire-and-forget. You launched it, and the box forgot it.

The silent-failure window has three holes in it

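When cron has no mail agent, it throws away job output, so the one error that could have rescued you never lands in a mailbox. That is the first hole. The second is the wrapper that swallows the exit code, and the third is the absence of any check that the file actually arrived. Stack three absent safety nets and the blind window stretches into days.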

Track the delivery, not the clock

Stop treating "02:00 arrived" as the meaningful event and start tracking "the file is durably delivered and acknowledged." A transfer should retry transient failures with backoff, resume a partial upload instead of restarting from byte zero, and mark itself done only when the receiving side confirms receipt. If the partner runs AS2, that proof is a Message Disposition Notification per RFC 4130: signed evidence the payload landed. On SFTP you assemble the equivalent with a post-transfer checksum compare or a control file the partner polls for. Either way, done means acknowledged, never "the script returned." You can see how the protocol layer handles that.
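The retry-then-confirm loop described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `send` and `fetch_remote_digest` are hypothetical callables standing in for whatever uploads the file and reads back the partner-side checksum, the choice of SHA-256 and the backoff parameters are illustrative, and resuming a partial upload is left out for brevity.

```python
import hashlib
import time


def sha256_of(data: bytes) -> str:
    """Hex digest used to compare the local and remote copies."""
    return hashlib.sha256(data).hexdigest()


def deliver(payload, send, fetch_remote_digest,
            attempts=5, base_delay=1.0, sleep=time.sleep):
    """Return True only when the remote digest matches the local one.

    `send(payload)` and `fetch_remote_digest()` are hypothetical hooks.
    OSError (which includes connection errors) is treated as transient.
    """
    expected = sha256_of(payload)
    for attempt in range(attempts):
        try:
            send(payload)
            if fetch_remote_digest() == expected:
                return True  # done means confirmed, not "send returned"
        except OSError:
            pass  # transient failure: fall through to the backoff
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    return False
```

The function deliberately has no success path that skips the digest comparison, so a send that returns cleanly but leaves a truncated file on the far side is retried instead of being counted as delivered.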

Alert on the absence of success

A job that never starts emits no error, so a monitor that only listens for failures is deaf to the worst case. Set a dead-man's switch: the transfer must report a confirmed delivery inside its window, and if that heartbeat does not arrive, someone gets paged. Tier it so on-call does not drown. One retryable miss is informational, a sustained failure on a settlement feed is actionable, and a stalled critical SLA is an incident. The bar is simple and unforgiving: you learn about the miss in minutes, from your own system, not from the partner's reconciliation team in a week-old email.
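A dead-man's switch with tiered severity might look something like the Python sketch below. The function names, the severity labels, and the `sustained_after` threshold are assumptions made for illustration. The one property that matters is that the check is driven by the absence of a confirmed delivery, not by an error event.

```python
from datetime import datetime
from typing import Optional


def heartbeat_missing(last_confirmed: Optional[datetime],
                      window_start: datetime,
                      deadline: datetime,
                      now: datetime) -> bool:
    """True when the window has closed with no confirmed delivery inside it."""
    if now < deadline:
        return False  # window still open, too early to judge
    if last_confirmed is None:
        return True  # nothing has ever been confirmed
    return not (window_start <= last_confirmed <= deadline)


def severity(missed_windows: int, sla_critical: bool,
             sustained_after: int = 2) -> str:
    """Map consecutive missed windows to a tier (threshold is an assumption)."""
    if missed_windows == 0:
        return "ok"
    if missed_windows < sustained_after:
        return "info"  # a single retryable miss
    return "incident" if sla_critical else "actionable"
```

For this to catch a job that never starts, the check has to run from somewhere other than the transfer job itself, for example a separate scheduler or monitoring host. Otherwise the same outage that silences the transfer silences the alarm.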

That is the line between a cron entry and a managed transfer. In an xEvolve Environment, every scheduled flow carries delivery confirmation, automatic retry with resume, and dead-man alerting in one audited place, so a partner outage becomes a tracked retry instead of a silent breach you hear about last. See how the monitoring works.