14 KiB
mars-nwe NCP dispatch redesign notes
This file collects design notes for a possible cleanup of the internal NCP
handoff path. It is intentionally separate from TODO.md: the TODO file should
track concrete bugs and endpoint audit follow-ups, while this file describes a
larger architecture direction that can be implemented gradually.
The goal is not to rewrite MARS-NWE at once. The goal is to make the current handoff behavior explicit, reduce ambiguity around magic return values, and make future endpoint work easier to audit against the Novell/Micro Focus SDK, WebSDK, and NDK Core Protocols PDF.
Current problem
The current NCP path grew around several cooperating processes and handlers:
nwconn.cowns the connection/session side and receives most packets first.nwbind.chandles bindery, queue, some server-management, and some final reply construction.- Other modules such as semaphore, message, namespace, AFP, file, salvage, and queue code implement individual protocol families or backend actions.
- Some calls are handled completely in
nwconn.c. - Some calls are forwarded to
nwbind.cby returning-1from thenwconn.cdispatcher. - Some calls are forwarded with saved request state by returning
-2, so thatnwconn.ccan do post-processing afternwbind.chas replied. - Some forwarded paths mutate request payloads before handoff.
- Some code paths build responses locally, while other paths rely on the target process to build the final completion code and payload.
This works, but it is hard to reason about while auditing endpoint layouts. The
same looking value can mean different things depending on which file it appears
in. For example, return(-1) in the relevant nwconn.c dispatcher path means
"forward this request to nwbind". A disabled return(-1) inside a #if 0
block in nwbind.c does not have that forwarding meaning and should not be
copied into active code.
The visible symptoms are:
- endpoint documentation must follow a handoff across files before it can say the request or reply layout is known;
- missing endpoints are difficult to distinguish from forwarded endpoints;
- request parsing, backend behavior, reply encoding, and process routing are often mixed in one switch block;
- byte order differences are easy to miss because parsing and reply writing are open-coded in different places;
- disabled future stubs can look like active dispatch behavior;
TODO.mdcan become a dumping ground for architectural observations that are not immediate endpoint bugs.
Desired shape
A cleaner long-term structure would have one small internal NCP dispatch layer:
wire packet
-> NCP envelope parser
-> NcpContext
-> endpoint lookup
-> endpoint handler / provider
-> reply encoder
-> central reply sender
This does not need to be a general-purpose message bus. A full message bus would probably be too large and too abstract for this code base. A typed internal NCP context plus explicit dispatch results would be enough.
The important separation is:
- decode the packet envelope;
- identify the endpoint;
- decode the endpoint request body;
- execute the backend operation;
- encode the endpoint reply body;
- send the response from one well-defined place.
Proposed NCP context
Introduce, in a later functional cleanup, a small context object that represents one NCP request while it moves through the server. The exact field names should fit the existing code style, but the conceptual shape would be:
typedef struct {
int connection;
uint16_t request_type; /* 0x2222, 0x3333, 0x5555, ... */
uint8_t function; /* top-level NCP function */
int has_subfunction;
uint8_t subfunction; /* grouped calls such as 23/x, 32/x, 87/x */
const uint8_t *request;
int request_len;
uint8_t *reply;
int reply_cap;
int reply_len;
uint8_t completion;
uint8_t connection_status;
uint32_t flags;
} NcpContext;
The context should not replace all old globals in one patch. It can start as a thin wrapper around the existing request and response buffers, then gradually become the preferred handler interface.
The useful property is that endpoint documentation can point to a stable model:
functionandsubfunctionidentify the endpoint;requestandrequest_lenare the bytes after the already-decoded envelope;replyandreply_lenare the bytes before the common NCP response envelope;completionis set once by the handler or by central error handling.
Replace magic return values with named results
The current 0, -1, and -2 convention should be made explicit before any
larger refactor. The first step can be documentation-only or macro-only:
#define NCP_LOCAL_DONE 0
#define NCP_FORWARD_NWBIND -1
#define NCP_FORWARD_NWBIND_POST -2
A later cleanup can replace those with an enum:
typedef enum {
NCP_DISPATCH_DONE,
NCP_DISPATCH_FORWARD_BIND,
NCP_DISPATCH_FORWARD_BIND_POST,
NCP_DISPATCH_NOT_IMPLEMENTED,
NCP_DISPATCH_BAD_REQUEST,
NCP_DISPATCH_INTERNAL_ERROR
} NcpDispatchResult;
The important rule is that the meaning must be scoped. A named result returned
from a nwconn.c dispatcher may request process handoff. A return statement in
nwbind.c should not silently inherit that meaning unless the function is
explicitly part of the same dispatch interface.
Endpoint table as audit index first
Before replacing switch statements, add an endpoint inventory table as a non-invasive audit aid. It can be compiled only for debug builds or kept as a source-level documentation table.
Conceptual form:
typedef struct {
uint16_t request_type;
uint8_t function;
int has_subfunction;
uint8_t subfunction;
const char *name;
const char *provider;
uint32_t flags;
} NcpEndpointDoc;
Example entries:
{ 0x2222, 23, 1, 109, "Change Queue Job Entry old", "nwbind/queue", NCPDOC_FORWARDED },
{ 0x2222, 32, 1, 0, "Open Semaphore old", "sema", NCPDOC_LOCAL },
{ 0x2222, 33, 0, 0, "Negotiate Buffer Size", "nwconn", NCPDOC_LOCAL },
This table would help with the ongoing endpoint audit:
- SDK/PDF/WebSDK listed and implemented;
- SDK/PDF/WebSDK listed and forwarded;
- SDK/PDF/WebSDK listed but disabled as a future stub;
- SDK/PDF/WebSDK listed but absent from the current compatibility target;
- later NetWare 4.x/OES/MOAB endpoint, not part of the default NetWare 3.x compatibility target.
The first version should not drive runtime dispatch. It should only make review and missing-endpoint checks less error-prone.
Handler structure
For newly touched endpoint families, prefer the following logical split even if it remains in one C function at first:
request decode
-> validation
-> backend operation
-> reply encode
For complex endpoints this could become explicit helper functions:
static int decode_foo(NcpContext *ctx, FooRequest *out);
static int exec_foo(NcpContext *ctx, const FooRequest *req, FooReply *reply);
static void encode_foo(NcpContext *ctx, const FooReply *reply);
This is especially useful for endpoint families where the audit has already found old/new layout differences:
- 16-bit old queue job numbers versus newer 32-bit job numbers;
- big-endian versus little-endian SDK notation;
- old short replies versus newer long replies;
- connection-side prehandling that inserts or rewrites fields;
- bindery or queue paths that build final replies in a different process.
Small endpoints do not need three separate helper functions if that would make the code noisier. The rule is that request bytes and reply bytes should be easy to identify and compare with the SDK documents.
Make handoff explicit
Forwarded calls should say exactly what is handed off. A good comment should answer:
- which bytes are forwarded;
- whether the subfunction byte is preserved or stripped;
- whether
nwconn.cmutates the request before forwarding; - whether
nwbind.cor another provider builds the final reply; - whether
nwconn.cexpects post-processing after the provider reply.
Examples of handoff cases that need this clarity:
- Queue calls where
nwconn.cexpands paths or inserts job file handles beforenwbind.csees the request. - Quota/bindery prehandling where the destination handler receives an already transformed request.
- Semaphore and message groups that are grouped in the SDK but routed through local helper modules.
- Direct lifecycle calls such as End Of Job and Logout where local cleanup and final success reply are split across files.
The preferred future style is not "nwbind must do the rest" but something like:
Forward to nwbind with the original subfunction byte and payload unchanged.
No nwconn post-processing is expected; nwbind builds the completion-only reply.
or:
Forward to nwbind after saving the original request. nwbind validates bindery
state and returns the bindery result; nwconn then performs the file-handle
post-processing in handle_after_bind().
Response building rule
Every endpoint audit should identify the reply builder, not only the request parser. A handler is not fully documented until the response path is known.
For each endpoint family, record:
- completion-only reply;
- fixed-size payload reply;
- variable-length payload reply;
- provider-built reply;
nwconn.cpost-processed reply;- intentionally unsupported reply status.
Long-term, response sending should become centralized enough that endpoint code only encodes payload bytes and a completion code. This reduces off-by-one reply length bugs and makes the logs easier to normalize.
Provider boundaries
A clean design would treat the existing modules as providers instead of hidden fallback paths:
nwconn connection/session, packet IO, top-level envelope
ncpdispatch endpoint lookup, handoff policy, common errors
nwbind bindery database and bindery-backed services
queue queue metadata and print/backend adapter
sema semaphore state
message station/message/broadcast state
namespace path, directory handle, name-space operations
file file handle and read/write/open/close operations
salvage deleted-file scan/recover/purge backend
AFP AFP metadata and AFP namespace adapter
This is a design target, not a demand to move files immediately. The important
part is that future code should avoid making nwbind a catch-all sink for
unrelated NCPs just because it already has an IPC path.
Logging connection
The dispatch redesign also supports the desired log cleanup. If every request has a context, logs can consistently include:
INFO NCP 23/109 DISPATCH type=0x2222 fn=0x17 sub=0x6d provider=nwbind/queue
INFO NCP 32/0 REPLY type=0x2222 fn=0x20 sub=0x00 result=0x00 len=4
WARN NCP 23/130 LAYOUT-MISMATCH sdk="32-bit JobNumber" code="16-bit parser"
The logging cleanup should still reuse existing mars-nwe logging functions. Do not add a second logging subsystem just to support the dispatch cleanup.
Migration plan
Phase 1: Name the existing conventions
Low risk. No behavior change.
- Add named constants or comments for the current
0,-1, and-2dispatcher results. - Keep existing control flow unchanged.
- Update comments so
return(-1)is never described ambiguously outside the exact dispatcher where it is meaningful.
Phase 2: Add an endpoint audit table
Low risk. Mostly documentation/debug.
- Add a table of known endpoints by request type, function, and subfunction.
- Mark provider, generation bucket, and implementation state.
- Use it to compare SDK/PDF/WebSDK coverage against actual handlers.
- Do not switch runtime dispatch to the table yet.
Phase 3: Introduce a thin NcpContext
Moderate risk if kept small.
- Wrap existing request and reply buffers without changing ownership.
- Use the context only in newly audited or newly implemented handlers.
- Keep old handlers callable until they are touched for another reason.
Phase 4: Convert small endpoint families first
Moderate risk, easy to test.
Good candidates:
0x2222/32old Semaphore calls;- direct calls such as End Of Job, Logout, and Negotiate Buffer Size;
- small message/station groups once their handoff has been audited.
Avoid converting queue and bindery first because they have more process coupling and more old/new layout variants.
Phase 5: Move runtime dispatch to tables gradually
Higher risk. Do this only after enough endpoint families have stable audit coverage and tests.
- Keep switch wrappers during the transition.
- Convert one family at a time.
- Preserve exact completion codes and reply lengths.
- Add targeted smoke tests for any family whose dispatch path changes.
Non-goals
This redesign should not:
- change protocol behavior merely to match a cleaner abstraction;
- remove NetWare 1.x/2.x/3.x compatibility paths;
- enable NetWare 4.x/OES/MOAB-only endpoints by default;
- replace existing mars-nwe path, bindery, queue, AFP, trustee, or salvage backends with parallel databases;
- add a large external message bus dependency;
- rewrite all handlers in one patch;
- turn documentation-only endpoint audit patches into functional refactors.
Practical rule for future patches
For the ongoing endpoint documentation pass, keep doing the conservative thing:
- enumerate SDK/PDF/WebSDK/include endpoints for the family;
- compare them with actual
caselabels and forwarded destination handlers; - document missing, disabled, implemented, and later-generation slots;
- document request parser/handoff and response builder;
- record real layout differences, but do not change behavior in the same patch.
Functional cleanup should come later in small patches with tests.