docs: add ncp dispatch redesign notes
This commit is contained in:
386
REDESIGN.md
Normal file
386
REDESIGN.md
Normal file
@@ -0,0 +1,386 @@
|
||||
# mars-nwe NCP dispatch redesign notes
|
||||
|
||||
This file collects design notes for a possible cleanup of the internal NCP
|
||||
handoff path. It is intentionally separate from `TODO.md`: the TODO file should
|
||||
track concrete bugs and endpoint audit follow-ups, while this file describes a
|
||||
larger architecture direction that can be implemented gradually.
|
||||
|
||||
The goal is not to rewrite MARS-NWE at once. The goal is to make the current
|
||||
handoff behavior explicit, reduce ambiguity around magic return values, and make
|
||||
future endpoint work easier to audit against the Novell/Micro Focus SDK, WebSDK,
|
||||
and NDK Core Protocols PDF.
|
||||
|
||||
## Current problem
|
||||
|
||||
The current NCP path grew around several cooperating processes and handlers:
|
||||
|
||||
- `nwconn.c` owns the connection/session side and receives most packets first.
|
||||
- `nwbind.c` handles bindery, queue, some server-management, and some final
|
||||
reply construction.
|
||||
- Other modules such as semaphore, message, namespace, AFP, file, salvage, and
|
||||
queue code implement individual protocol families or backend actions.
|
||||
- Some calls are handled completely in `nwconn.c`.
|
||||
- Some calls are forwarded to `nwbind.c` by returning `-1` from the `nwconn.c`
|
||||
dispatcher.
|
||||
- Some calls are forwarded with saved request state by returning `-2`, so that
|
||||
`nwconn.c` can do post-processing after `nwbind.c` has replied.
|
||||
- Some forwarded paths mutate request payloads before handoff.
|
||||
- Some code paths build responses locally, while other paths rely on the target
|
||||
process to build the final completion code and payload.
|
||||
|
||||
This works, but it is hard to reason about while auditing endpoint layouts. The
|
||||
same looking value can mean different things depending on which file it appears
|
||||
in. For example, `return(-1)` in the relevant `nwconn.c` dispatcher path means
|
||||
"forward this request to `nwbind`". A disabled `return(-1)` inside a `#if 0`
|
||||
block in `nwbind.c` does not have that forwarding meaning and should not be
|
||||
copied into active code.
|
||||
|
||||
The visible symptoms are:
|
||||
|
||||
- endpoint documentation must follow a handoff across files before it can say the
|
||||
request or reply layout is known;
|
||||
- missing endpoints are difficult to distinguish from forwarded endpoints;
|
||||
- request parsing, backend behavior, reply encoding, and process routing are
|
||||
often mixed in one switch block;
|
||||
- byte order differences are easy to miss because parsing and reply writing are
|
||||
open-coded in different places;
|
||||
- disabled future stubs can look like active dispatch behavior;
|
||||
- `TODO.md` can become a dumping ground for architectural observations that are
|
||||
not immediate endpoint bugs.
|
||||
|
||||
## Desired shape
|
||||
|
||||
A cleaner long-term structure would have one small internal NCP dispatch layer:
|
||||
|
||||
```text
|
||||
wire packet
|
||||
-> NCP envelope parser
|
||||
-> NcpContext
|
||||
-> endpoint lookup
|
||||
-> endpoint handler / provider
|
||||
-> reply encoder
|
||||
-> central reply sender
|
||||
```
|
||||
|
||||
This does not need to be a general-purpose message bus. A full message bus would
|
||||
probably be too large and too abstract for this code base. A typed internal NCP
|
||||
context plus explicit dispatch results would be enough.
|
||||
|
||||
The important separation is:
|
||||
|
||||
1. decode the packet envelope;
|
||||
2. identify the endpoint;
|
||||
3. decode the endpoint request body;
|
||||
4. execute the backend operation;
|
||||
5. encode the endpoint reply body;
|
||||
6. send the response from one well-defined place.
|
||||
|
||||
## Proposed NCP context
|
||||
|
||||
Introduce, in a later functional cleanup, a small context object that represents
|
||||
one NCP request while it moves through the server. The exact field names should
|
||||
fit the existing code style, but the conceptual shape would be:
|
||||
|
||||
```c
|
||||
typedef struct {
|
||||
int connection;
|
||||
|
||||
uint16_t request_type; /* 0x2222, 0x3333, 0x5555, ... */
|
||||
uint8_t function; /* top-level NCP function */
|
||||
int has_subfunction;
|
||||
uint8_t subfunction; /* grouped calls such as 23/x, 32/x, 87/x */
|
||||
|
||||
const uint8_t *request;
|
||||
int request_len;
|
||||
|
||||
uint8_t *reply;
|
||||
int reply_cap;
|
||||
int reply_len;
|
||||
|
||||
uint8_t completion;
|
||||
uint8_t connection_status;
|
||||
|
||||
uint32_t flags;
|
||||
} NcpContext;
|
||||
```
|
||||
|
||||
The context should not replace all old globals in one patch. It can start as a
|
||||
thin wrapper around the existing request and response buffers, then gradually
|
||||
become the preferred handler interface.
|
||||
|
||||
The useful property is that endpoint documentation can point to a stable model:
|
||||
|
||||
- `function` and `subfunction` identify the endpoint;
|
||||
- `request` and `request_len` are the bytes after the already-decoded envelope;
|
||||
- `reply` and `reply_len` are the bytes before the common NCP response envelope;
|
||||
- `completion` is set once by the handler or by central error handling.
|
||||
|
||||
## Replace magic return values with named results
|
||||
|
||||
The current `0`, `-1`, and `-2` convention should be made explicit before any
|
||||
larger refactor. The first step can be documentation-only or macro-only:
|
||||
|
||||
```c
|
||||
#define NCP_LOCAL_DONE 0
|
||||
#define NCP_FORWARD_NWBIND -1
|
||||
#define NCP_FORWARD_NWBIND_POST -2
|
||||
```
|
||||
|
||||
A later cleanup can replace those with an enum:
|
||||
|
||||
```c
|
||||
typedef enum {
|
||||
NCP_DISPATCH_DONE,
|
||||
NCP_DISPATCH_FORWARD_BIND,
|
||||
NCP_DISPATCH_FORWARD_BIND_POST,
|
||||
NCP_DISPATCH_NOT_IMPLEMENTED,
|
||||
NCP_DISPATCH_BAD_REQUEST,
|
||||
NCP_DISPATCH_INTERNAL_ERROR
|
||||
} NcpDispatchResult;
|
||||
```
|
||||
|
||||
The important rule is that the meaning must be scoped. A named result returned
|
||||
from a `nwconn.c` dispatcher may request process handoff. A return statement in
|
||||
`nwbind.c` should not silently inherit that meaning unless the function is
|
||||
explicitly part of the same dispatch interface.
|
||||
|
||||
## Endpoint table as audit index first
|
||||
|
||||
Before replacing switch statements, add an endpoint inventory table as a
|
||||
non-invasive audit aid. It can be compiled only for debug builds or kept as a
|
||||
source-level documentation table.
|
||||
|
||||
Conceptual form:
|
||||
|
||||
```c
|
||||
typedef struct {
|
||||
uint16_t request_type;
|
||||
uint8_t function;
|
||||
int has_subfunction;
|
||||
uint8_t subfunction;
|
||||
const char *name;
|
||||
const char *provider;
|
||||
uint32_t flags;
|
||||
} NcpEndpointDoc;
|
||||
```
|
||||
|
||||
Example entries:
|
||||
|
||||
```c
|
||||
{ 0x2222, 23, 1, 109, "Change Queue Job Entry old", "nwbind/queue", NCPDOC_FORWARDED },
|
||||
{ 0x2222, 32, 1, 0, "Open Semaphore old", "sema", NCPDOC_LOCAL },
|
||||
{ 0x2222, 33, 0, 0, "Negotiate Buffer Size", "nwconn", NCPDOC_LOCAL },
|
||||
```
|
||||
|
||||
This table would help with the ongoing endpoint audit:
|
||||
|
||||
- SDK/PDF/WebSDK listed and implemented;
|
||||
- SDK/PDF/WebSDK listed and forwarded;
|
||||
- SDK/PDF/WebSDK listed but disabled as a future stub;
|
||||
- SDK/PDF/WebSDK listed but absent from the current compatibility target;
|
||||
- later NetWare 4.x/OES/MOAB endpoint, not part of the default NetWare 3.x
|
||||
compatibility target.
|
||||
|
||||
The first version should not drive runtime dispatch. It should only make review
|
||||
and missing-endpoint checks less error-prone.
|
||||
|
||||
## Handler structure
|
||||
|
||||
For newly touched endpoint families, prefer the following logical split even if
|
||||
it remains in one C function at first:
|
||||
|
||||
```text
|
||||
request decode
|
||||
-> validation
|
||||
-> backend operation
|
||||
-> reply encode
|
||||
```
|
||||
|
||||
For complex endpoints this could become explicit helper functions:
|
||||
|
||||
```c
|
||||
static int decode_foo(NcpContext *ctx, FooRequest *out);
|
||||
static int exec_foo(NcpContext *ctx, const FooRequest *req, FooReply *reply);
|
||||
static void encode_foo(NcpContext *ctx, const FooReply *reply);
|
||||
```
|
||||
|
||||
This is especially useful for endpoint families where the audit has already
|
||||
found old/new layout differences:
|
||||
|
||||
- 16-bit old queue job numbers versus newer 32-bit job numbers;
|
||||
- big-endian versus little-endian SDK notation;
|
||||
- old short replies versus newer long replies;
|
||||
- connection-side prehandling that inserts or rewrites fields;
|
||||
- bindery or queue paths that build final replies in a different process.
|
||||
|
||||
Small endpoints do not need three separate helper functions if that would make
|
||||
the code noisier. The rule is that request bytes and reply bytes should be easy
|
||||
to identify and compare with the SDK documents.
|
||||
|
||||
## Make handoff explicit
|
||||
|
||||
Forwarded calls should say exactly what is handed off. A good comment should
|
||||
answer:
|
||||
|
||||
- which bytes are forwarded;
|
||||
- whether the subfunction byte is preserved or stripped;
|
||||
- whether `nwconn.c` mutates the request before forwarding;
|
||||
- whether `nwbind.c` or another provider builds the final reply;
|
||||
- whether `nwconn.c` expects post-processing after the provider reply.
|
||||
|
||||
Examples of handoff cases that need this clarity:
|
||||
|
||||
- Queue calls where `nwconn.c` expands paths or inserts job file handles before
|
||||
`nwbind.c` sees the request.
|
||||
- Quota/bindery prehandling where the destination handler receives an already
|
||||
transformed request.
|
||||
- Semaphore and message groups that are grouped in the SDK but routed through
|
||||
local helper modules.
|
||||
- Direct lifecycle calls such as End Of Job and Logout where local cleanup and
|
||||
final success reply are split across files.
|
||||
|
||||
The preferred future style is not "`nwbind` must do the rest" but something like:
|
||||
|
||||
```text
|
||||
Forward to nwbind with the original subfunction byte and payload unchanged.
|
||||
No nwconn post-processing is expected; nwbind builds the completion-only reply.
|
||||
```
|
||||
|
||||
or:
|
||||
|
||||
```text
|
||||
Forward to nwbind after saving the original request. nwbind validates bindery
|
||||
state and returns the bindery result; nwconn then performs the file-handle
|
||||
post-processing in handle_after_bind().
|
||||
```
|
||||
|
||||
## Response building rule
|
||||
|
||||
Every endpoint audit should identify the reply builder, not only the request
|
||||
parser. A handler is not fully documented until the response path is known.
|
||||
|
||||
For each endpoint family, record:
|
||||
|
||||
- completion-only reply;
|
||||
- fixed-size payload reply;
|
||||
- variable-length payload reply;
|
||||
- provider-built reply;
|
||||
- `nwconn.c` post-processed reply;
|
||||
- intentionally unsupported reply status.
|
||||
|
||||
Long-term, response sending should become centralized enough that endpoint code
|
||||
only encodes payload bytes and a completion code. This reduces off-by-one reply
|
||||
length bugs and makes the logs easier to normalize.
|
||||
|
||||
## Provider boundaries
|
||||
|
||||
A clean design would treat the existing modules as providers instead of hidden
|
||||
fallback paths:
|
||||
|
||||
```text
|
||||
nwconn connection/session, packet IO, top-level envelope
|
||||
ncpdispatch endpoint lookup, handoff policy, common errors
|
||||
nwbind bindery database and bindery-backed services
|
||||
queue queue metadata and print/backend adapter
|
||||
sema semaphore state
|
||||
message station/message/broadcast state
|
||||
namespace path, directory handle, name-space operations
|
||||
file file handle and read/write/open/close operations
|
||||
salvage deleted-file scan/recover/purge backend
|
||||
AFP AFP metadata and AFP namespace adapter
|
||||
```
|
||||
|
||||
This is a design target, not a demand to move files immediately. The important
|
||||
part is that future code should avoid making `nwbind` a catch-all sink for
|
||||
unrelated NCPs just because it already has an IPC path.
|
||||
|
||||
## Logging connection
|
||||
|
||||
The dispatch redesign also supports the desired log cleanup. If every request
|
||||
has a context, logs can consistently include:
|
||||
|
||||
```text
|
||||
INFO NCP 23/109 DISPATCH type=0x2222 fn=0x17 sub=0x6d provider=nwbind/queue
|
||||
INFO NCP 32/0 REPLY type=0x2222 fn=0x20 sub=0x00 result=0x00 len=4
|
||||
WARN NCP 23/130 LAYOUT-MISMATCH sdk="32-bit JobNumber" code="16-bit parser"
|
||||
```
|
||||
|
||||
The logging cleanup should still reuse existing mars-nwe logging functions. Do
|
||||
not add a second logging subsystem just to support the dispatch cleanup.
|
||||
|
||||
## Migration plan
|
||||
|
||||
### Phase 1: Name the existing conventions
|
||||
|
||||
Low risk. No behavior change.
|
||||
|
||||
- Add named constants or comments for the current `0`, `-1`, and `-2`
|
||||
dispatcher results.
|
||||
- Keep existing control flow unchanged.
|
||||
- Update comments so `return(-1)` is never described ambiguously outside the
|
||||
exact dispatcher where it is meaningful.
|
||||
|
||||
### Phase 2: Add an endpoint audit table
|
||||
|
||||
Low risk. Mostly documentation/debug.
|
||||
|
||||
- Add a table of known endpoints by request type, function, and subfunction.
|
||||
- Mark provider, generation bucket, and implementation state.
|
||||
- Use it to compare SDK/PDF/WebSDK coverage against actual handlers.
|
||||
- Do not switch runtime dispatch to the table yet.
|
||||
|
||||
### Phase 3: Introduce a thin `NcpContext`
|
||||
|
||||
Moderate risk if kept small.
|
||||
|
||||
- Wrap existing request and reply buffers without changing ownership.
|
||||
- Use the context only in newly audited or newly implemented handlers.
|
||||
- Keep old handlers callable until they are touched for another reason.
|
||||
|
||||
### Phase 4: Convert small endpoint families first
|
||||
|
||||
Moderate risk, easy to test.
|
||||
|
||||
Good candidates:
|
||||
|
||||
- `0x2222/32` old Semaphore calls;
|
||||
- direct calls such as End Of Job, Logout, and Negotiate Buffer Size;
|
||||
- small message/station groups once their handoff has been audited.
|
||||
|
||||
Avoid converting queue and bindery first because they have more process coupling
|
||||
and more old/new layout variants.
|
||||
|
||||
### Phase 5: Move runtime dispatch to tables gradually
|
||||
|
||||
Higher risk. Do this only after enough endpoint families have stable audit
|
||||
coverage and tests.
|
||||
|
||||
- Keep switch wrappers during the transition.
|
||||
- Convert one family at a time.
|
||||
- Preserve exact completion codes and reply lengths.
|
||||
- Add targeted smoke tests for any family whose dispatch path changes.
|
||||
|
||||
## Non-goals
|
||||
|
||||
This redesign should not:
|
||||
|
||||
- change protocol behavior merely to match a cleaner abstraction;
|
||||
- remove NetWare 1.x/2.x/3.x compatibility paths;
|
||||
- enable NetWare 4.x/OES/MOAB-only endpoints by default;
|
||||
- replace existing mars-nwe path, bindery, queue, AFP, trustee, or salvage
|
||||
backends with parallel databases;
|
||||
- add a large external message bus dependency;
|
||||
- rewrite all handlers in one patch;
|
||||
- turn documentation-only endpoint audit patches into functional refactors.
|
||||
|
||||
## Practical rule for future patches
|
||||
|
||||
For the ongoing endpoint documentation pass, keep doing the conservative thing:
|
||||
|
||||
1. enumerate SDK/PDF/WebSDK/include endpoints for the family;
|
||||
2. compare them with actual `case` labels and forwarded destination handlers;
|
||||
3. document missing, disabled, implemented, and later-generation slots;
|
||||
4. document request parser/handoff and response builder;
|
||||
5. record real layout differences, but do not change behavior in the same patch.
|
||||
|
||||
Functional cleanup should come later in small patches with tests.
|
||||
Reference in New Issue
Block a user