mars-nwe/doc/HANDOFF_AUDIT.md
OpenAI 38714c63b5
0445 docs: make handoff no-client-reply explicit
2026-06-13 06:59:41 +02:00

NCP handoff and provider audit

This note records the current nwconn-to-nwbind handoff model and the NSS message-layer references that can guide a later cleanup. It is an audit and vocabulary document first; it does not change runtime dispatch.

Current MARS behaviour

The active MARS server path still uses the historical two-process split:

client transport -> nwconn -> local handler or nwbind handoff -> client reply

nwconn.c owns the client-facing connection, task, sequence and final network reply path for most requests. nwbind.c owns bindery state and also handles message, semaphore, queue, accounting and several server-management selectors.

The central problem is that the active nwconn.c dispatcher signals process handoff through magic integer return values:

| Result | Current dispatcher meaning |
|--------|----------------------------|
| 0 | request handled locally; send the local reply path |
| -1 | forward to nwbind; nwbind handles the remaining operation/reply |
| -2 | forward to nwbind after nwconn saved or rewrote request state; nwconn may do post-processing |

These meanings apply only to values returned by the active nwconn.c dispatcher. An arbitrary return(-1) in helper code, salvage code, SSL compatibility code, or even inside nwbind.c is usually just a local error or an unsupported selector. It is not automatically an inter-process handoff marker.
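As a minimal sketch of that scoping, the legacy integers can be given names at the dispatcher boundary only. The `Ex`-prefixed names below are hypothetical and exist purely for illustration; the real NcpDispatchResult definition lives in include/ncp_endpoint.h and may be shaped differently.

```c
/* Hypothetical names for illustration; not the contents of ncp_endpoint.h. */
typedef enum {
    EX_DISPATCH_LOCAL_REPLY         =  0, /* handled locally, local reply path */
    EX_DISPATCH_FORWARD_NWBIND      = -1, /* nwbind finishes operation and reply */
    EX_DISPATCH_FORWARD_NWBIND_POST = -2, /* forwarded after nwconn saved/rewrote state */
    EX_DISPATCH_NOT_A_HANDOFF       =  1  /* assumed sentinel: value is outside the table */
} ExDispatchResult;

/*
 * Classify a raw integer returned by the active nwconn dispatcher.
 * Call this only on dispatcher return values: a -1 coming from a helper
 * is an ordinary local error and must not be passed through here.
 */
static ExDispatchResult ex_classify_dispatch_return(int raw)
{
    switch (raw) {
    case 0:  return EX_DISPATCH_LOCAL_REPLY;
    case -1: return EX_DISPATCH_FORWARD_NWBIND;
    case -2: return EX_DISPATCH_FORWARD_NWBIND_POST;
    default: return EX_DISPATCH_NOT_A_HANDOFF;
    }
}

/* Human-readable name for audit logs and annotations. */
static const char *ex_dispatch_result_name(ExDispatchResult r)
{
    switch (r) {
    case EX_DISPATCH_LOCAL_REPLY:         return "local-reply";
    case EX_DISPATCH_FORWARD_NWBIND:      return "forward-nwbind";
    case EX_DISPATCH_FORWARD_NWBIND_POST: return "forward-nwbind-post";
    default:                              return "not-a-handoff";
    }
}
```

The classifier deliberately takes no position on helper return codes; keeping it at the dispatcher boundary is what preserves the scoping described above.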

include/ncp_endpoint.h now records the shared names that future cleanup patches should use for audits and endpoint tables:

  • NcpProvider for logical owner/provider names;
  • NcpDispatchResult for current nwconn dispatcher compatibility results;
  • NcpEndpointFlag for endpoint table/audit annotations;
  • NwHandoffReplyKind for the future normalized provider reply contract.

Active magic-return sites to audit first

The relevant initial search is not "all return(-1) in the tree". That finds ordinary local error returns too. Start with the active nwconn.c dispatcher handoff sites and annotate them endpoint-by-endpoint.

Known active nwconn.c handoff clusters in the current tree include:

| Area | Current return | Notes |
|------|----------------|-------|
| decimal 21 / wire 0x15 Message group | -1 | Entire group is forwarded to nwbind with the NCP 21 group header intact. |
| decimal 22/33, 22/34 / wire 0x16/0x21, 0x16/0x22 quota/user restriction prehandling | -2 | nwbind maps bindery object IDs before quota handling. |
| decimal 22/41 / wire 0x16/0x29 object disk usage/restriction with quota support | -2 | QUOTA builds forward for ObjectID-to-uid mapping. |
| decimal 23/20, 23/24 / wire 0x17/0x14, 0x17/0x18 login paths | -2 | Bindery authentication prehandling. |
| decimal 23 queue selectors / wire 0x17/0x68, 0x17/0x79, 0x17/0x69, 0x17/0x7f, 0x17/0x71, 0x17/0x7c | -2 | Queue job file/prehandling paths. |
| decimal 23 queue read/change/service selectors / wire 0x17/0x6c, 0x17/0x7a, 0x17/0x72, 0x17/0x73, 0x17/0x83, 0x17/0x84 | -1 or -2 | Some requests are rewritten with handles before the nwbind side completes them. |
| decimal 23 unknown/default selectors / wire 0x17/* | -1 | Catch-all delegation to nwbind; must be split from truly unsupported endpoints later. |
| decimal 24 End of Job / wire 0x18 | -1 | nwconn frees task-local state, then nwbind closes remaining print jobs and replies. |
| decimal 25 Logout / wire 0x19 | -1 | nwconn clears connection-local state, then nwbind finishes bindery/print cleanup. |
| decimal 32 Semaphore / wire 0x20 | -1 | Group is handled by nwbind with a one-byte semaphore subfunction. |

The first functional cleanup should replace comments and endpoint-table metadata around these sites before changing control flow. Do not bulk-rewrite unrelated return(-1) helper errors.
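One way to hold this audit as data rather than comments is a small endpoint table keyed by wire function and subfunction. The sketch below is an assumption about shape only: the row layout, the `Ex` names and the wildcard value are hypothetical, and only a subset of the sites listed above is included. The queue selectors are left out, so in this sketch they would fall through to the 0x17 catch-all row.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical provider names; the real NcpProvider may differ. */
typedef enum { EX_PROVIDER_NWCONN, EX_PROVIDER_NWBIND } ExProvider;

#define EX_SUBFUNC_ANY 0xFFFFu /* assumed wildcard: row covers the whole group */

typedef struct {
    uint8_t     function;      /* NCP function, wire value */
    uint16_t    subfunction;   /* wire subfunction, or EX_SUBFUNC_ANY */
    ExProvider  provider;      /* logical owner of the endpoint */
    int         legacy_return; /* value the dispatcher returns today */
    const char *note;
} ExEndpointRow;

/* Subset of the audited handoff sites, copied from the table above. */
static const ExEndpointRow ex_endpoint_table[] = {
    { 0x15, EX_SUBFUNC_ANY, EX_PROVIDER_NWBIND, -1, "Message group" },
    { 0x16, 0x21,           EX_PROVIDER_NWBIND, -2, "quota/user restriction prehandling" },
    { 0x16, 0x22,           EX_PROVIDER_NWBIND, -2, "quota/user restriction prehandling" },
    { 0x16, 0x29,           EX_PROVIDER_NWBIND, -2, "object disk usage/restriction" },
    { 0x17, 0x14,           EX_PROVIDER_NWBIND, -2, "login path" },
    { 0x17, 0x18,           EX_PROVIDER_NWBIND, -2, "login path" },
    { 0x17, EX_SUBFUNC_ANY, EX_PROVIDER_NWBIND, -1, "unknown/default selectors" },
    { 0x18, EX_SUBFUNC_ANY, EX_PROVIDER_NWBIND, -1, "End of Job" },
    { 0x19, EX_SUBFUNC_ANY, EX_PROVIDER_NWBIND, -1, "Logout" },
    { 0x20, EX_SUBFUNC_ANY, EX_PROVIDER_NWBIND, -1, "Semaphore group" },
};

/*
 * Find the audit row for an endpoint. An exact subfunction match wins
 * over a group wildcard; NULL means the endpoint is not in the audit.
 */
static const ExEndpointRow *ex_endpoint_lookup(uint8_t function, uint16_t subfunction)
{
    const ExEndpointRow *fallback = NULL;
    size_t i;

    for (i = 0; i < sizeof(ex_endpoint_table) / sizeof(ex_endpoint_table[0]); i++) {
        const ExEndpointRow *row = &ex_endpoint_table[i];
        if (row->function != function)
            continue;
        if (row->subfunction == subfunction)
            return row;
        if (row->subfunction == EX_SUBFUNC_ANY && fallback == NULL)
            fallback = row;
    }
    return fallback;
}
```

Keeping the catch-all 0x17 row separate from the explicit rows makes it visible which selectors are still covered only by default delegation.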

Desired provider contract

The later normalized shape is:

nwconn -> provider request -> provider reply -> nwconn final NCP envelope

Providers should return logical completion plus reply payload, not send raw client packets directly. nwconn remains the final owner of the client-visible NCP response envelope. Legacy paths may keep provider-built replies while being converted, but new namespace/NWFS work should not introduce new magic return values or hidden process-specific reply conventions.

The future rule is deliberately stricter than the current magic returns: for every accepted internal handoff request there is exactly one formal internal handoff reply object. NW_HANDOFF_NO_CLIENT_REPLY is not an absent reply; it is the reply object telling nwconn that no client-visible NCP response should be sent.

The conceptual reply kinds are the names in NwHandoffReplyKind:

| Kind | Meaning |
|------|---------|
| NW_HANDOFF_REPLY | provider produced a client NCP completion/payload |
| NW_HANDOFF_NO_CLIENT_REPLY | provider handled the request but no client reply is allowed |
| NW_HANDOFF_DEFERRED | provider accepted the request and will complete later |
| NW_HANDOFF_FORWARD | provider asks the caller to forward to another provider |
| NW_HANDOFF_ERROR | internal provider/handoff failure, mapped centrally |
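The "exactly one reply object" rule could look roughly like the following on the nwconn side. The kind names come from the table above; the numeric values, the `ExHandoffReply` struct, the `ExClientAction` enum and the treatment of a missing reply as an error are all assumptions made for this sketch.

```c
#include <stddef.h>

/* Kind names from the table above; numeric values are assumed. */
typedef enum {
    NW_HANDOFF_REPLY,
    NW_HANDOFF_NO_CLIENT_REPLY,
    NW_HANDOFF_DEFERRED,
    NW_HANDOFF_FORWARD,
    NW_HANDOFF_ERROR
} ExHandoffReplyKind;

/* Hypothetical reply object returned by a provider for every accepted request. */
typedef struct {
    ExHandoffReplyKind   kind;
    unsigned char        completion;  /* NCP completion code, used for NW_HANDOFF_REPLY */
    const unsigned char *payload;     /* reply payload, may be NULL when payload_len is 0 */
    size_t               payload_len;
} ExHandoffReply;

/* What nwconn does with the client connection after a handoff (hypothetical). */
typedef enum {
    EX_ACTION_SEND_REPLY,   /* wrap completion/payload in the NCP envelope */
    EX_ACTION_SEND_NOTHING, /* explicit decision: no client-visible response */
    EX_ACTION_WAIT,         /* provider completes later */
    EX_ACTION_REDISPATCH,   /* forward to another provider */
    EX_ACTION_SEND_ERROR    /* map the internal failure centrally */
} ExClientAction;

/*
 * Decide the client-facing action from the reply object. A missing
 * reply object is treated as a contract violation, not as "no reply".
 */
static ExClientAction ex_client_action(const ExHandoffReply *reply)
{
    if (reply == NULL)
        return EX_ACTION_SEND_ERROR;

    switch (reply->kind) {
    case NW_HANDOFF_REPLY:           return EX_ACTION_SEND_REPLY;
    case NW_HANDOFF_NO_CLIENT_REPLY: return EX_ACTION_SEND_NOTHING;
    case NW_HANDOFF_DEFERRED:        return EX_ACTION_WAIT;
    case NW_HANDOFF_FORWARD:         return EX_ACTION_REDISPATCH;
    case NW_HANDOFF_ERROR:           return EX_ACTION_SEND_ERROR;
    }
    return EX_ACTION_SEND_ERROR; /* unknown kind: fail closed */
}
```

The point of the sketch is the distinction between `NULL` and NW_HANDOFF_NO_CLIENT_REPLY: silence toward the client is always the result of an explicit reply object, never of an absent one.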

NSS references

The imported NSS reference tree contains useful concepts but not a direct MARS IPC implementation:

| NSS area | Files | Useful idea | MARS policy |
|----------|-------|-------------|-------------|
| NSS message layer | include/nwfs/nss/sdk/include/msg.h, msgGen.h, msgIO.h; calls in src/nwfs/nss/common/*.c | key/object/method/message calls such as MSG_Call, MSG_Send, MSG_SendKey, mpkMSG_Call, zMSG_Call | Reference only. Do not import NSS Door/Object runtime as the MARS provider IPC. |
| NSS filesystem method layer | src/nwfs/nss/common/fsmsg.c, include/nwfs/nss/sdk/include/fsmsg.h | method tables for NSS file/volume operations | Useful model for future libnwfs object methods, not a replacement for NCP endpoint dispatch. |
| NSSKR ioctl layer | include/nwfs/nss/sdk/include/nsskr.h, src/nwfs/nss/lsaSuper.c | signature/version/opcode header plus typed request payloads | Good reference for a simple handoff header shape. Do not copy the ioctl protocol directly. |
| ipc2ncp headers | include/nwfs/nss/ipc2ncp.h, include/nwfs/nss/sdk/public/ipc2ncp.h | NCP-to-internal-request boundary vocabulary | Reference while auditing, not a ready-to-use MARS ABI. |

NSS confirms that method/opcode/message boundaries are normal for this problem space. MARS should still use a small, native provider contract rather than a large NSS runtime subsystem.
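To illustrate what a small native contract might borrow from the signature/version/opcode idea, here is a hedged sketch of a handoff header with a validity check. Nothing here is taken from nsskr.h or from existing MARS code: the struct, the signature constant and the field widths are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_HANDOFF_SIGNATURE 0x4E574846u /* ASCII "NWHF", hypothetical */
#define EX_HANDOFF_VERSION   1u          /* hypothetical contract version */

/* Hypothetical fixed header placed in front of every typed request payload. */
typedef struct {
    uint32_t signature;   /* guards against stray or misrouted buffers */
    uint16_t version;     /* contract version understood by the sender */
    uint16_t opcode;      /* provider operation selector */
    uint32_t payload_len; /* number of payload bytes following the header */
} ExHandoffHeader;

/*
 * Return 1 if the header is acceptable for a buffer of buffer_len bytes
 * (header included), 0 otherwise. The length check is written so that
 * it cannot overflow.
 */
static int ex_handoff_header_valid(const ExHandoffHeader *h, size_t buffer_len)
{
    if (h == NULL || buffer_len < sizeof(*h))
        return 0;
    if (h->signature != EX_HANDOFF_SIGNATURE)
        return 0;
    if (h->version != EX_HANDOFF_VERSION)
        return 0;
    if (h->payload_len > buffer_len - sizeof(*h))
        return 0;
    return 1;
}
```

A header of this size is enough to reject malformed handoffs early without pulling in any NSS runtime concepts.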

Migration order

  1. Keep include/ncp_endpoint.h as vocabulary only.
  2. Add endpoint table rows with provider/result flags before changing switch logic.
  3. Annotate each active nwconn magic handoff site with the exact endpoint and provider owner.
  4. Add a small wrapper such as ncp_handoff_to_provider() that initially calls the existing nwbind path.
  5. Replace magic return(-1) / return(-2) in the active dispatcher with named NcpDispatchResult values only after the audit table can prove which sites are real handoffs.
  6. Convert provider replies to NwHandoffReplyKind behind the wrapper.
  7. Only then add new provider processes. New namespace/NWFS code may start as an in-process provider module without becoming a process.
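Step 4 can be sketched as a thin wrapper that still produces the legacy codes internally. Only the name ncp_handoff_to_provider() comes from this note; its signature, the request struct, the status enum and the callback standing in for the existing nwbind path are assumptions for illustration.

```c
#include <stddef.h>

typedef enum { EX_PROVIDER_NWBIND } ExProvider; /* hypothetical provider id */

typedef enum {
    EX_HANDOFF_OK,
    EX_HANDOFF_BAD_REQUEST,
    EX_HANDOFF_LEGACY_FAILED
} ExHandoffStatus;

/* Hypothetical request descriptor built by the nwconn dispatcher. */
typedef struct {
    ExProvider provider;
    int        needs_post_processing; /* nonzero: nwconn saved or rewrote state */
} ExHandoffRequest;

/* Stand-in for the existing nwbind forwarding path; returns 0 on success. */
typedef int (*ExLegacyForwardFn)(int legacy_code, void *ctx);

/*
 * Initial wrapper: translate the named request back into today's
 * magic value (-1 or -2) and call the existing path. Later steps can
 * change the body without touching the call sites.
 */
static ExHandoffStatus ncp_handoff_to_provider(const ExHandoffRequest *req,
                                               ExLegacyForwardFn legacy_forward,
                                               void *ctx)
{
    int legacy_code;

    if (req == NULL || legacy_forward == NULL)
        return EX_HANDOFF_BAD_REQUEST;
    if (req->provider != EX_PROVIDER_NWBIND)
        return EX_HANDOFF_BAD_REQUEST; /* only nwbind exists in this sketch */

    legacy_code = req->needs_post_processing ? -2 : -1;
    return legacy_forward(legacy_code, ctx) == 0 ? EX_HANDOFF_OK
                                                 : EX_HANDOFF_LEGACY_FAILED;
}

/* Example callback that records the legacy code it was asked to forward. */
static int ex_recording_forward(int legacy_code, void *ctx)
{
    *(int *)ctx = legacy_code;
    return 0;
}
```

Because the magic values are confined to the wrapper body, steps 5 and 6 can replace them with named results and reply objects without another pass over every dispatcher site.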