S/4HANA go-live readiness guide: monitoring coverage to put in place before cutover

Summary

Most S/4HANA monitoring discussions start the week after go-live. The system is in production, users are reporting slowness, and the project team is scrambling to understand what is happening on a platform they are running for the first time under real load. At that point, monitoring is reactive by definition. The window for proactive coverage has already closed.

The monitoring decisions that determine whether your first weeks on S/4HANA are stable or chaotic are made weeks before cutover, not after. They are decisions about what to measure on the legacy system before the migration window opens, what to watch during the technical cutover itself, how to verify the new system is healthy before the first business user logs in, and what the first month of production operations needs to look like before the team can declare the go-live stable.

This guide covers each of those phases in sequence. It is written for the people who own the technical readiness of the migration: the SAP Basis team, the monitoring architects, and the MSPs or IT operations leads who will be on call during cutover weekend and the weeks that follow.

The monitoring gap that most S/4HANA projects have

Why the cutover window is the highest-risk period with the least visibility

Consider what happens during a typical S/4HANA technical cutover. The source ECC system is locked for business transactions. A tool like SUM with DMO (Database Migration Option) or an export/import procedure is running. HANA is being loaded with migrated data. The process takes anywhere from 12 hours to several days depending on data volume and hardware. And for most of that window, the visibility the operations team has into what is happening is limited to periodic progress screens and the hope that nothing unexpected occurs.

There is no dialog load to observe because users are not in the system. The batch jobs are frozen. The interfaces are stopped. So the monitoring discipline that normally covers production operations has no obvious application. The team watches the migration tool’s progress bar and waits.

What does not get monitored during this window is the HANA instance itself: memory consumption as migration data is loaded, log volume growth during the data load phase, I/O saturation on the storage layer as large table migrations write at sustained high throughput. These are the conditions that cause migration phases to stall or fail, and they are almost never monitored proactively. The first sign of a problem is a phase that stops progressing, which triggers a manual investigation that could have been a pre-configured alert.

The cost of discovering monitoring gaps in the first week of production

The first week on S/4HANA has a specific operational character. The project team is exhausted from the cutover weekend. The operations team is learning a new system. Business users are encountering a different interface, different transaction names, and different behavior for processes they have run the same way for years. Everything is under scrutiny.

When a performance issue occurs in that context, the pressure to identify and resolve it is immediate. If the monitoring infrastructure is not in place, the investigation starts from scratch: pulling work process data from SM50, running ad-hoc HANA queries to check memory, and manually checking batch job logs. The tools that would make the investigation take 10 minutes instead of 90 are the ones that should have been configured before go-live.

The monitoring gap in the first week is not just a technical inconvenience. It affects confidence. A business that has invested in an S/4HANA migration and experiences unexplained slowness in the first week, without a clear technical explanation, starts questioning the project decisions that led to it. Good monitoring does not only reduce incident resolution time. It produces the data needed to explain what is happening and why, which is its own form of project value during the stabilization period.

Phase 1: establishing baselines on the source system before migration starts

What to measure on ECC before the migration window opens

The most undervalued monitoring activity in any S/4HANA migration is capturing a baseline on the source system before the project starts. Not a snapshot, a continuous record over a representative period: at minimum eight weeks, covering normal business days, a month-end close cycle, and any other periodic batch events specific to the organization.

The baseline metrics that matter for post-migration comparison are dialog response times by transaction code, background work process average load by time of day, database response times for the top SQL statements by frequency, and overall system load profiles during peak hours. These numbers, captured systematically before migration, become the benchmark against which S/4HANA performance is evaluated after go-live.

Without that baseline, post-go-live performance conversations are opinion-based. “The system feels slower” versus “transaction VA01 average response time in ECC was 1.2 seconds; in S/4HANA it is currently 2.8 seconds on the same transaction volume” are different conversations. The second one has a resolution path. The first one does not.

Timing: Start baseline collection at least 8 weeks before the planned migration window. Two weeks of data is not enough to capture month-end behavior, and month-end in ECC often reveals workload patterns that are not visible in daily operations.

Capturing batch job profiles and interface throughput baselines

Background jobs do not migrate automatically from ECC to S/4HANA. They are recreated manually by the Basis team, either through manual SM36 entry or through a transport that carries the job scheduling configuration. In either case, recreation introduces opportunities for errors: wrong server group assignment, missing job step parameters, incorrect variant references.

Before migration begins, document every production background job with its scheduled frequency, expected runtime at normal volume and at month-end volume, the server group or named server it runs on, and its predecessor and successor relationships. This documentation is what the Basis team uses to verify correct job recreation in S/4HANA before go-live, and it is what the operations team uses to configure duration alerting thresholds in the new system.
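
One way to keep this documentation usable is to store it as structured records rather than free text, so the same data can drive both the recreation check and the duration thresholds. The following is a minimal sketch in Python. The field names, the example job, and the 1.5x tolerance factor are illustrative assumptions, not SAP structures or recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobProfile:
    """One documented production background job (hypothetical field names)."""
    name: str
    frequency: str                 # e.g. "daily", "monthly"
    normal_runtime_min: float      # expected runtime at normal volume
    month_end_runtime_min: float   # expected runtime at month-end volume
    server_group: str
    predecessors: tuple = ()
    successors: tuple = ()


def duration_threshold_min(job: JobProfile, month_end: bool, tolerance: float = 1.5) -> float:
    """Runtime in minutes above which a duration alert would fire.

    The tolerance factor is an assumption; choose one that fits the job's window.
    """
    base = job.month_end_runtime_min if month_end else job.normal_runtime_min
    return base * tolerance


# Example record with made-up values.
billing = JobProfile(
    name="Z_BILLING_RUN",
    frequency="daily",
    normal_runtime_min=40.0,
    month_end_runtime_min=90.0,
    server_group="BATCH_GRP_1",
    predecessors=("Z_DELIVERY_CLOSE",),
)

print(duration_threshold_min(billing, month_end=False))  # 60.0
print(duration_threshold_min(billing, month_end=True))   # 135.0
```

Keeping separate normal and month-end runtimes avoids a single threshold that is either too loose for daily runs or too tight for the close.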

Interface throughput baselines follow the same logic. Every production interface in ECC has an average daily message volume, a normal processing time, and an error rate that reflects what normal looks like. Capturing those figures before migration provides the comparison data needed to confirm that interfaces are processing correctly in S/4HANA, not just that they are technically active.

The performance baseline that makes post-migration comparison meaningful

There is a specific monitoring setup that makes post-migration comparison straightforward and is almost never done: running the monitoring tool against both the ECC source system and the S/4HANA target in parallel during the validation period, before go-live. This requires the monitoring platform to be connected to ECC as part of the migration project, not just to S/4HANA after go-live.

When both systems are monitored by the same platform, the comparison between ECC baseline performance and S/4HANA performance during load testing and user acceptance testing uses the same metrics, the same thresholds, and the same dashboard format. Regression identification becomes a matter of looking at two lines on the same graph rather than comparing numbers from different sources with different collection methodologies.

The practical prerequisite for this approach is connecting the monitoring tool early, during the project phase when ECC is still the production system, and configuring it to capture the specific metrics that will be used for post-migration comparison. This is three to four weeks of setup work that most projects de-prioritize because monitoring is treated as a go-live task rather than a pre-migration one.

Phase 2: monitoring coverage during the technical cutover

What to watch during the data migration phase

The data migration phase, whether running SUM with DMO, a classical export/import, or a shell conversion, puts the HANA system under sustained write load that differs from any normal production pattern. Large tables are being inserted in bulk. HANA’s delta store accumulates entries faster than auto-merge can process them. The log volume grows at a rate determined by migration speed rather than by normal transaction volume.

Three metrics warrant continuous monitoring during this phase. The first is HANA memory consumption against the allocation limit, because bulk-inserted rows sit in the write-optimized delta store and consume more memory per row than steady-state data until merge operations process and compress them. The second is log volume utilization, because log backups during a migration window require specific configuration that is easy to forget in the pre-cutover checklist. The third is migration tool phase duration against the expected timeline, because a phase running 40% longer than the practice run is a decision point, not just an observation.

Duration alerting per migration phase is the specific capability that most project teams do not have configured, because they have never thought of migration tool phases as monitorable events. SUM, with or without DMO, produces log files with phase timing data that can be read by monitoring agents or by a simple custom script. Setting an alert when any phase runs more than 50% longer than in the practice run creates actionable data during the migration window, rather than requiring someone to check the migration tool manually every 30 minutes.
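
The duration check itself is simple once phase start times are available. The sketch below assumes they have already been extracted from the migration tool logs by a parser that is not shown here. It does not reflect the actual SUM log format, and the phase names and practice-run durations are invented for illustration.

```python
from datetime import datetime, timedelta

# Practice-run durations per phase, in minutes (illustrative values).
PRACTICE_RUN_MIN = {
    "PHASE_A": 120,
    "PHASE_B": 300,
}

OVERRUN_FACTOR = 1.5  # alert when a phase runs more than 50% longer than in the practice run


def overrunning_phases(running: dict, now: datetime) -> list:
    """Return (phase, elapsed_min, limit_min) for each running phase past its limit.

    `running` maps phase name -> start time of the phase currently in progress.
    Phases without a practice-run duration are skipped rather than guessed.
    """
    alerts = []
    for phase, started in running.items():
        practice = PRACTICE_RUN_MIN.get(phase)
        if practice is None:
            continue
        elapsed = (now - started).total_seconds() / 60
        limit = practice * OVERRUN_FACTOR
        if elapsed > limit:
            alerts.append((phase, round(elapsed), limit))
    return alerts


now = datetime(2024, 1, 6, 12, 0)
running = {"PHASE_A": now - timedelta(minutes=200)}
print(overrunning_phases(running, now))  # [('PHASE_A', 200, 180.0)]
```

Run on a schedule, a check like this turns a stalled phase into an alert instead of something noticed at the next manual look.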

Common mistake: HANA log mode during DMO should be set to “overwrite” only for non-production migrations where point-in-time recovery is not required. For production migrations where RPO matters, log mode should remain at “normal” and log backups should be running throughout the migration window. Verify this in the configuration review before the migration window opens, not during it.

Interface freeze monitoring: what silence actually looks like

The interface freeze before cutover is a well-understood migration concept. All inbound data flows to ECC are stopped. Messages in flight are processed and queues are drained. The source system reaches a clean state before the migration begins.

What is less well understood is how to confirm that the freeze is complete. Stopping an interface at the middleware layer stops new messages from being sent. It does not immediately drain the messages already in the queue or being processed. A message that arrived two minutes before the freeze window opened may still be in the IDoc or qRFC queue, partially processed, when the data migration starts. If that message completes processing after the migration has begun capturing the data state, it creates a reconciliation discrepancy.

Monitoring interface queue depth to zero is the verification step. Not just confirming that the interface configuration is in a stopped state, but confirming that the message queues are empty. SM58 for transactional RFC, SMQ1/SMQ2 for qRFC, BD87 for IDoc processing, and any middleware-side queue monitoring for the integration layer all need to confirm zero pending items before the data migration starts. This verification belongs on the go-live cutover checklist as a signed-off checkpoint, not as an assumption.
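
The sign-off logic can be sketched as a small check that only passes when every queue source reports zero pending items. This Python example assumes the counts have been collected separately, by a monitoring tool or by hand. The dictionary keys are labels chosen for illustration, and "MIDDLEWARE" stands in for whatever integration layer is in use.

```python
def freeze_signoff(pending: dict) -> tuple:
    """Return (ok, blockers) for the interface-freeze checkpoint.

    `pending` maps a queue source to its count of pending items.
    A source with no reading (None or missing) blocks sign-off:
    absence of data is not evidence of an empty queue.
    """
    required = ("SM58", "SMQ1", "SMQ2", "BD87", "MIDDLEWARE")
    blockers = []
    for source in required:
        count = pending.get(source)
        if count is None:
            blockers.append(f"{source}: no reading")
        elif count > 0:
            blockers.append(f"{source}: {count} pending")
    return (not blockers, blockers)


readings = {"SM58": 0, "SMQ1": 0, "SMQ2": 3, "BD87": 0}
ok, blockers = freeze_signoff(readings)
print(ok)        # False
print(blockers)  # ['SMQ2: 3 pending', 'MIDDLEWARE: no reading']
```

Treating a missing reading as a blocker is the design choice that matters here: it keeps an unchecked queue from passing as an empty one.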

The final batch close: monitoring the last run on the source system

The batch jobs that run on ECC in the days before go-live carry real business risk. A failed period-close job, a payroll preprocessing error, or a billing run that does not complete cleanly leaves the source system in an inconsistent state that carries into S/4HANA as a data quality issue.

The final batch runs on ECC should be monitored with the same attention as any month-end close, not as routine background processing that will sort itself out. This means verifying not just job status (FINISHED) but spool content for application-level errors, checking output file generation where applicable, and confirming that downstream dependent jobs completed in sequence.

One specific scenario to watch: predecessor-successor job chains where the chain spans the interface freeze window. If a job that generates output for an external system was scheduled to run during the freeze period, that output may not have been delivered before the freeze. The business process owner needs to confirm explicitly that the data state in ECC at the point of migration is complete and clean for their process area, based on monitoring evidence, not assumption.

Phase 3: the first hours after go-live on S/4HANA

HANA memory and performance in the first 24 hours

The first time business users hit a fresh S/4HANA system, the HANA instance is in a cold state relative to production load. Column store data that was loaded during migration has not been accessed by real transactions. The first queries from users cause HANA to load column store partitions that have not been touched since the migration, which generates I/O on the data volume and increases memory consumption.

This cold-start behavior means the first hours of production operation look different from load test performance, even when the load test was well designed. Response times during the first 30 to 60 minutes are typically higher than they will be once the working dataset is warm in memory. The operations team needs to know this in advance so they do not interpret normal cold-start behavior as a system problem requiring immediate intervention.

What does require immediate intervention is memory consumption that approaches the allocation limit before the working dataset has fully loaded. If HANA memory is already at 82% after two hours of user activity on day one, with data still being loaded into cache, the system has a sizing problem that will worsen as the dataset warms. Monitoring memory consumption against the allocation limit from the moment the first user logs in, not from the moment the first alert fires, gives the operations team the trajectory data needed to make that assessment.

In practice: Set a temporary, more sensitive memory alert threshold for the first 48 hours after go-live: 75% instead of 85%. The cold-start period is when unexpected memory consumption patterns appear. A lower threshold gives the team time to investigate before the situation becomes critical rather than after.
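
A rough way to turn memory readings into trajectory data is sketched below in Python. The 75% and 85% thresholds and the 48-hour window come from the recommendation above. The sample readings are made up, and the straight-line projection is a deliberately crude assumption, useful for spotting a trend rather than predicting HANA behavior.

```python
def memory_alert_threshold(hours_since_go_live: float) -> float:
    """Temporary 75% threshold for the first 48 hours, 85% afterwards."""
    return 75.0 if hours_since_go_live < 48 else 85.0


def used_pct(used_gb: float, allocation_limit_gb: float) -> float:
    """Used memory as a percentage of the allocation limit."""
    return 100.0 * used_gb / allocation_limit_gb


def hours_until_threshold(samples: list, threshold_pct: float):
    """Linear projection from the first and last (hour, used_pct) samples.

    Returns 0.0 if the threshold is already reached, and None if usage is
    flat or falling, in which case no crossing is projected.
    """
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    if p1 >= threshold_pct:
        return 0.0
    if t1 <= t0 or p1 <= p0:
        return None
    rate = (p1 - p0) / (t1 - t0)  # percentage points per hour
    return (threshold_pct - p1) / rate


samples = [(0.0, 50.0), (2.0, 60.0)]  # made-up readings since first user login
limit = memory_alert_threshold(hours_since_go_live=2.0)
print(limit)                                  # 75.0
print(hours_until_threshold(samples, limit))  # 3.0
```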

Background job scheduling: verifying the schedule transferred correctly

Job recreation in S/4HANA is a manual process and a common source of post-go-live incidents. The three failure modes that appear most often are jobs that were not recreated at all (missing from the schedule), jobs that were recreated with incorrect server group assignments (running on a server that cannot handle the load or does not have the required authorizations), and jobs where predecessor-successor dependencies were not correctly configured (jobs starting independently instead of in sequence).

The verification process is not looking at a job list and confirming names match. It is running the first scheduled execution of each critical job, observing it in SM37 on S/4HANA, confirming it runs on the expected server, completes within an expected duration range based on ECC baseline, and produces the expected output. For Tier 1 jobs, this verification should happen before business users are told the system is fully operational.

Jobs that run daily are verifiable within the first 24 hours. Jobs with less frequent schedules, such as weekly reporting jobs and monthly close preparations, are harder to verify before their first real execution. For those, the verification is documentation-based: confirm that the SM36 configuration matches the documented specification from the pre-migration job profile capture.
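
The documentation-based check can be sketched as a comparison between the documented specification and the jobs actually present in S/4HANA, grouped by the three failure modes described earlier in this section. This Python example assumes both sides have been exported into plain dictionaries. The structure and the job names are hypothetical.

```python
def verify_jobs(documented: dict, recreated: dict) -> dict:
    """Compare documented ECC job specs with the jobs found in S/4HANA.

    Both arguments map job name -> {"server_group": str, "predecessors": set}.
    Returns job names grouped by failure mode.
    """
    findings = {"missing": [], "wrong_server_group": [], "wrong_predecessors": []}
    for name, spec in sorted(documented.items()):
        actual = recreated.get(name)
        if actual is None:
            findings["missing"].append(name)
            continue
        if actual["server_group"] != spec["server_group"]:
            findings["wrong_server_group"].append(name)
        if set(actual["predecessors"]) != set(spec["predecessors"]):
            findings["wrong_predecessors"].append(name)
    return findings


documented = {
    "Z_JOB_A": {"server_group": "GRP1", "predecessors": set()},
    "Z_JOB_B": {"server_group": "GRP1", "predecessors": {"Z_JOB_A"}},
    "Z_JOB_C": {"server_group": "GRP2", "predecessors": set()},
}
recreated = {
    "Z_JOB_A": {"server_group": "GRP2", "predecessors": set()},
    "Z_JOB_B": {"server_group": "GRP1", "predecessors": set()},
}

print(verify_jobs(documented, recreated))
# {'missing': ['Z_JOB_C'], 'wrong_server_group': ['Z_JOB_A'], 'wrong_predecessors': ['Z_JOB_B']}
```

A check like this confirms configuration only. It does not replace observing the first real execution of each critical job.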

The first business transactions: what to watch and what to expect

The transaction codes that get the most traffic in the first hours after go-live are the ones that exist in ECC and have changed behavior in S/4HANA. VA01 (sales order creation), ME21N (purchase order creation), and FB60 (vendor invoice entry) are the transactions that business users know by muscle memory and will notice if they behave differently. Response time regressions on these specific transactions will generate calls to the help desk within minutes.

Monitoring dialog response time by transaction code, not as an average across all transactions but segmented by specific transaction, is the most operationally useful performance view for the first day. It tells the operations team which specific business activity is slow, rather than telling them the system is generally under load. That specificity is what makes the difference between an investigation that takes 15 minutes and one that takes two hours.

The operations team should have a list of the 10 to 15 most business-critical transaction codes before go-live, with their ECC baseline response times, and monitoring configured to alert if any of them degrades more than 50% from baseline during the first 48 hours.
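
The 50% rule can be sketched as a per-transaction comparison against the ECC baseline. In this Python example the VA01 figures reuse the 1.2-second and 2.8-second values quoted earlier in this guide. The ME21N and FB60 numbers are made up, and the function name and input shape are assumptions.

```python
def degraded_transactions(baseline_s: dict, current_s: dict, max_degradation: float = 0.5) -> list:
    """Return (tcode, baseline_s, current_s, pct_increase) rows, worst first.

    A transaction is flagged when its current average response time exceeds
    the baseline by more than `max_degradation` (0.5 = 50%).
    """
    flagged = []
    for tcode, base in baseline_s.items():
        current = current_s.get(tcode)
        if current is None or base <= 0:
            continue  # no current data or unusable baseline: nothing to compare
        increase = (current - base) / base
        if increase > max_degradation:
            flagged.append((tcode, base, current, round(increase * 100)))
    return sorted(flagged, key=lambda row: row[3], reverse=True)


baseline = {"VA01": 1.2, "ME21N": 0.9, "FB60": 0.8}   # ECC averages, seconds
current = {"VA01": 2.8, "ME21N": 1.0, "FB60": 0.7}    # S/4HANA averages, seconds

print(degraded_transactions(baseline, current))  # [('VA01', 1.2, 2.8, 133)]
```

Here VA01 is 133% slower than its baseline and is flagged, while ME21N at roughly 11% slower stays below the threshold.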

Phase 4: stabilization monitoring in the first 30 days

First month-end on S/4HANA: why it requires specific monitoring attention

Month-end close on a system that has been in production for less than four weeks is categorically different from month-end on a stable production system. The batch jobs run for the first time on S/4HANA with real data volumes from a full month of production transactions. The HANA system processes queries it has never seen before. The job runtimes have no S/4HANA baseline to compare against, only ECC historical data from the pre-migration capture.

Two specific risks are higher during the first month-end than at any subsequent point. First, batch job duration: jobs that ran in 45 minutes in ECC may run in 25 minutes on S/4HANA (a common experience given HANA’s in-memory performance advantage) or may run significantly longer due to data model differences or missing database statistics on the new system. Without monitoring, the team discovers duration deviations only when a job misses its window. Second, HANA memory: month-end generates larger working datasets than daily operations, loading data that has not been accessed since go-live. Memory consumption during month-end batch processing can be meaningfully higher than during the first weeks of daily operations.

The recommendation is to treat the first month-end as a managed event with an explicit monitoring watch: assigned staff reviewing batch job progress against timeline, memory and I/O metrics observed in real time, and a defined escalation process for jobs running more than 30% past expected duration.

Handing monitoring from the project team to operations: the handover that usually fails

In almost every S/4HANA project, there is a period where the project team and the operations team are both managing the system simultaneously. The project team built the monitoring configuration, understands why each threshold was set where it was, and knows which alerts are likely false positives during the stabilization window. The operations team will run the system from month three onward, but they are still learning it during weeks one through eight.

The handover that fails is the one where the operations team receives access to the monitoring tool and a brief walkthrough, and is then expected to manage it without the institutional knowledge that went into the configuration. Thresholds that were set conservatively for the go-live period are never adjusted for steady-state. Alerts that were relevant during stabilization remain configured after stabilization ends and start generating noise. The operations team begins ignoring specific alert categories because they are not sure whether they are real or holdovers from the project period.

A useful handover document for monitoring has three components: the rationale for each threshold (why was 75% chosen for memory rather than 85%, and when should that change), the list of alerts that were configured for the stabilization period and should be reviewed at day 30, and the escalation paths for each alert category. This document does not need to be long. It needs to exist.
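
The three components map naturally onto one record per alert. The following Python sketch shows one possible shape and a helper that lists stabilization-period alerts once day 30 has been reached. The class, its fields, and the example entries are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AlertHandover:
    """Handover entry for one alert (hypothetical structure)."""
    alert: str
    threshold: str
    rationale: str             # why the threshold was set where it was
    stabilization_only: bool   # configured for the go-live period only
    escalation_path: str


def due_for_review(entries: list, go_live: date, today: date) -> list:
    """Names of stabilization-period alerts to review from day 30 onward."""
    if today < go_live + timedelta(days=30):
        return []
    return [e.alert for e in entries if e.stabilization_only]


entries = [
    AlertHandover("HANA memory", "75%", "Lowered from 85% for the cold-start period",
                  True, "Basis on-call"),
    AlertHandover("Tier 1 job duration", "baseline + 30%", "ECC baseline, no S/4HANA history yet",
                  False, "Batch operations"),
]

go_live = date(2024, 1, 8)
print(due_for_review(entries, go_live, date(2024, 1, 20)))  # []
print(due_for_review(entries, go_live, date(2024, 2, 7)))   # ['HANA memory']
```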

Monitoring readiness checklist: what to have in place before cutover

The table below organizes the monitoring readiness checkpoints across the four project phases. Each checkpoint includes what to verify and why the verification matters. Use this during pre-go-live readiness reviews to confirm monitoring coverage is in place before the migration window opens.

| Readiness checkpoint | What to verify | Why it matters |
| --- | --- | --- |
| PRE-MIGRATION BASELINE (6-8 weeks before cutover) | | |
| ECC/source system baseline captured | Dialog response time, work process avg load, HANA memory, top 20 transactions by execution count | Needed to validate S/4HANA performance post go-live. Without it, you have no comparison point. |
| Batch job runtime profiles documented | SM37 historical data for all Tier 1 and Tier 2 jobs, including month-end variants | S/4HANA job runtimes will differ. You need baselines to recognize abnormal from expected. |
| Interface throughput baselines captured | Volume, frequency, and error rates for all production interfaces over 30-day period | Detects regressions in integration layer after cutover. |
| Monitoring tool connected to source system | Agentless connection, RFC user authorized, data collection running | Monitoring should be running on ECC before migration opens so baseline data is in the tool, not in a spreadsheet. |
| CUTOVER WINDOW MONITORING (T-48h to T+0) | | |
| Final batch close monitored on source system | All Tier 1 batch jobs on ECC completed with status FINISHED and spool verified | Last ECC batch run is a milestone. Any failure here affects the opening data state in S/4HANA. |
| Interface freeze confirmed and monitored | All inbound interfaces to ECC stopped, message queues drained to zero | Residual messages in flight during cutover create reconciliation problems after go-live. |
| HANA log mode configuration verified | Log mode set correctly for migration phase per SAP recommendation | Incorrect log mode during migration can generate log volumes that fill up before cutover completes. |
| Migration tool (SUM/DMO) progress tracked | Active monitoring of SUM or DMO phases with duration thresholds per phase | Long-running migration phases without monitoring are black boxes. Duration alerting enables go/no-go decisions. |
| S/4HANA technical connectivity test | Monitoring tool connected to target S/4HANA system before business go-live | First alert on S/4HANA should not fire in production. Test the connection during the technical cutover phase. |
| GO-LIVE (T+0 to T+24h) | | |
| HANA memory profile at first user login | Used memory vs allocation limit at the moment first business users hit the system | Cold start + initial data loading + first real query load creates a memory profile unlike any test run. |
| Background job schedule verified in S/4HANA | All Tier 1 and Tier 2 jobs present, released, and scheduled with correct server assignments | Jobs from ECC do not auto-transfer. Manual recreation introduces scheduling errors that only appear when a job misses its window. |
| First dialog transaction response times | Baseline comparison against ECC dialog response times from pre-migration capture | The first hour of user activity on S/4HANA reveals the most critical performance gaps. |
| Interface monitoring active on S/4HANA | All inbound and outbound interfaces reconnected, first message flows confirmed | Interface re-enablement after cutover is manual and sequential. Monitoring confirms each interface is actually processing. |
| Alert routing verified | Test alert triggered and confirmed received by on-call team | Alert routing configured in a test environment may not match production. Verify it is working before the first real incident. |
| STABILIZATION WINDOW (T+1 to T+30 days) | | |
| First MRP run monitored | Duration, memory consumption, background job completion, and output volume against expected | First production MRP on S/4HANA is unpredictable. It is often the first real load test of the new system. |
| First month-end close monitored | All period-close batch jobs tracked against ECC baseline durations | Month-end on a new system is categorically different from daily operations. It requires specific monitoring attention. |
| HANA delta merge backlog tracked | Pending delta merge count and failure rate in M_DELTA_MERGE_STATISTICS | Post-go-live insert volume often accelerates delta store growth in the first weeks. Catch it before it affects query performance. |
| Monitoring handover to operations documented | Runbooks, threshold rationale, escalation paths delivered to operations team | Project team handover without documented monitoring rationale leaves operations managing thresholds they do not understand. |

Monitoring is a project deliverable, not a post-go-live task

The SAP projects that have the smoothest go-lives tend to have one thing in common: they treated monitoring infrastructure as a project deliverable with its own timeline, its own acceptance criteria, and its own sign-off checkpoint. Not as something the operations team sets up after the business is live on the new system.

The argument for that approach is straightforward. The go-live window itself, the cutover weekend, is when you most need visibility and when you have the least. The first week of production is when performance questions are most urgent and when the baseline comparison data is most valuable. Both of those windows require monitoring that was built, tested, and verified before the migration started.

A monitoring platform connected to ECC six weeks before go-live, capturing baselines, learning normal job durations, and verifying alert routing, is a fundamentally different tool during the go-live than one that is configured for the first time on the day the business goes live. The technical effort to connect it early is small compared to the operational value it provides when the system is under pressure and the team needs answers.

Redpeaks connects to both source ECC and target S/4HANA systems without agents or transports, capturing pre-migration baselines and providing side-by-side visibility during go-live. Deployed in RISE with SAP and on-premise environments. 
