mars-nwe/REDESIGN.md
2026-06-02 09:24:15 +02:00


# mars-nwe NCP dispatch redesign notes
This file collects design notes for a possible cleanup of the internal NCP
handoff path. It is intentionally separate from `TODO.md`: the TODO file should
track concrete bugs and endpoint audit follow-ups, while this file describes a
larger architecture direction that can be implemented gradually.
The goal is not to rewrite MARS-NWE at once. The goal is to make the current
handoff behavior explicit, reduce ambiguity around magic return values, and make
future endpoint work easier to audit against the Novell/Micro Focus SDK, WebSDK,
and NDK Core Protocols PDF.
## Current problem
The current NCP path grew around several cooperating processes and handlers:
- `nwconn.c` owns the connection/session side and receives most packets first.
- `nwbind.c` handles bindery, queue, some server-management, and some final
reply construction.
- Other modules such as semaphore, message, namespace, AFP, file, salvage, and
queue code implement individual protocol families or backend actions.
- Some calls are handled completely in `nwconn.c`.
- Some calls are forwarded to `nwbind.c` by returning `-1` from the `nwconn.c`
dispatcher.
- Some calls are forwarded with saved request state by returning `-2`, so that
`nwconn.c` can do post-processing after `nwbind.c` has replied.
- Some forwarded paths mutate request payloads before handoff.
- Some code paths build responses locally, while other paths rely on the target
process to build the final completion code and payload.
This works, but it is hard to reason about while auditing endpoint layouts. The
same-looking value can mean different things depending on which file it appears
in. For example, `return(-1)` in the relevant `nwconn.c` dispatcher path means
"forward this request to `nwbind`". A disabled `return(-1)` inside a `#if 0`
block in `nwbind.c` does not have that forwarding meaning and should not be
copied into active code.
The visible symptoms are:
- endpoint documentation must follow a handoff across files before it can say the
request or reply layout is known;
- missing endpoints are difficult to distinguish from forwarded endpoints;
- request parsing, backend behavior, reply encoding, and process routing are
often mixed in one switch block;
- byte order differences are easy to miss because parsing and reply writing are
open-coded in different places;
- disabled future stubs can look like active dispatch behavior;
- `TODO.md` can become a dumping ground for architectural observations that are
not immediate endpoint bugs.
## Desired shape
A cleaner long-term structure would have one small internal NCP dispatch layer:
```text
wire packet
-> NCP envelope parser
-> NcpContext
-> endpoint lookup
-> endpoint handler / provider
-> reply encoder
-> central reply sender
```
This does not need to be a general-purpose message bus. A full message bus would
probably be too large and too abstract for this code base. A typed internal NCP
context plus explicit dispatch results would be enough.
The important separation is:
1. decode the packet envelope;
2. identify the endpoint;
3. decode the endpoint request body;
4. execute the backend operation;
5. encode the endpoint reply body;
6. send the response from one well-defined place.
## Proposed NCP context
Introduce, in a later functional cleanup, a small context object that represents
one NCP request while it moves through the server. The exact field names should
fit the existing code style, but the conceptual shape would be:
```c
typedef struct {
    int connection;
    uint16_t request_type;   /* 0x2222, 0x3333, 0x5555, ... */
    uint8_t function;        /* top-level NCP function */

    /*
     * Some NCP families are only one level deep, but others are nested.
     * The selector path records the bytes/words that identify the logical
     * operation after the top-level function, without pretending that every
     * family has exactly one byte-sized subfunction.
     */
    int selector_count;
    uint32_t selector[4];    /* e.g. subfunction, level, verb, info type */

    const uint8_t *request;
    int request_len;

    uint8_t *reply;
    int reply_cap;
    int reply_len;

    uint8_t completion;
    uint8_t connection_status;
    uint32_t flags;
} NcpContext;
```
The context should not replace all old globals in one patch. It can start as a
thin wrapper around the existing request and response buffers, then gradually
become the preferred handler interface.
The useful property is that endpoint documentation can point to a stable model:
- `function` identifies the first NCP selector byte;
- `selector[]` identifies any nested selector path after that byte;
- `request` and `request_len` are the bytes after the already-decoded envelope;
- `reply` and `reply_len` are the reply body bytes, not including the common NCP
response envelope;
- `completion` is set once by the handler or by central error handling.
Do not assume that the logical endpoint key always stops at
`request_type/function/subfunction`. The Novell documentation has several
families where an endpoint has another selector inside the subfunction payload.
Examples include NDS fragmented requests (`0x2222/104/02`) where the request
contains a 32-bit NDS verb, statistical calls such as `0x2222/123/34` where an
`InfoLevelNumber` selects the returned structure, NCP extension calls where the
extension number is dynamic, and reply formats that vary by information type.
The audit notation for such cases should make the nesting explicit, for example
`0x2222/104/02 verb=...` or `0x2222/123/34 level=2`, instead of flattening it
into an invented one-byte `zz` case.
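The audit notation above could be generated mechanically from a selector path. The following is a minimal sketch under stated assumptions: `NcpKey` is a hypothetical cut-down copy of the selector fields of `NcpContext`, and `ncp_format_key()` and its `labels` argument are illustrative names that do not exist in mars-nwe.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical cut-down key mirroring the selector fields of NcpContext. */
typedef struct {
    uint16_t request_type;
    uint8_t  function;
    int      selector_count;
    uint32_t selector[4];
} NcpKey;

/*
 * Format a key as "0x2222/123/34 level=2".
 * selector[0] is printed as the subfunction; deeper selectors are printed
 * as label=value, taking the label from labels[i - 1].
 * Returns 0 on success, -1 on an invalid key or a too-small buffer.
 */
static int ncp_format_key(const NcpKey *k, const char *const *labels,
                          char *buf, size_t cap)
{
    size_t used;
    int n, i;

    if (k->selector_count < 0 || k->selector_count > 4)
        return -1;

    n = snprintf(buf, cap, "0x%04x/%u",
                 (unsigned)k->request_type, (unsigned)k->function);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;

    for (i = 0; i < k->selector_count; i++) {
        unsigned long v = (unsigned long)k->selector[i];

        if (i == 0)
            n = snprintf(buf + used, cap - used, "/%02lu", v);
        else
            n = snprintf(buf + used, cap - used, " %s=%lu", labels[i - 1], v);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Keeping the labels outside the key means the same formatter can serve both the audit table and the log output without hard-coding family-specific names.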
## Replace magic return values with named results
The current `0`, `-1`, and `-2` convention should be made explicit before any
larger refactor. The first step can be documentation-only or macro-only:
```c
#define NCP_LOCAL_DONE          0
#define NCP_FORWARD_NWBIND      (-1)
#define NCP_FORWARD_NWBIND_POST (-2)
```
A later cleanup can replace those with an enum:
```c
typedef enum {
    NCP_DISPATCH_DONE,
    NCP_DISPATCH_FORWARD_BIND,
    NCP_DISPATCH_FORWARD_BIND_POST,
    NCP_DISPATCH_NOT_IMPLEMENTED,
    NCP_DISPATCH_BAD_REQUEST,
    NCP_DISPATCH_INTERNAL_ERROR
} NcpDispatchResult;
```
The important rule is that the meaning must be scoped. A named result returned
from a `nwconn.c` dispatcher may request process handoff. A return statement in
`nwbind.c` should not silently inherit that meaning unless the function is
explicitly part of the same dispatch interface.
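During migration, the scoping rule could be enforced by converting the legacy integer exactly once, at the `nwconn.c` dispatcher boundary. The sketch below assumes a hypothetical helper `ncp_result_from_legacy()`; treating every other value as an internal error is an assumption of this sketch, and the real dispatcher may give further values a meaning.

```c
typedef enum {
    NCP_DISPATCH_DONE,
    NCP_DISPATCH_FORWARD_BIND,
    NCP_DISPATCH_FORWARD_BIND_POST,
    NCP_DISPATCH_NOT_IMPLEMENTED,
    NCP_DISPATCH_BAD_REQUEST,
    NCP_DISPATCH_INTERNAL_ERROR
} NcpDispatchResult;

/*
 * Convert a legacy nwconn.c dispatcher return value into a named result.
 * Only meant to be called on values returned by that dispatcher; a -1
 * returned elsewhere (for example in nwbind.c) must not be passed in.
 */
static NcpDispatchResult ncp_result_from_legacy(int rc)
{
    switch (rc) {
    case 0:
        return NCP_DISPATCH_DONE;
    case -1:
        return NCP_DISPATCH_FORWARD_BIND;
    case -2:
        return NCP_DISPATCH_FORWARD_BIND_POST;
    default:
        /* Assumption: anything else is unexpected in this sketch. */
        return NCP_DISPATCH_INTERNAL_ERROR;
    }
}

/* True when the named result asks for a process handoff. */
static int ncp_result_is_handoff(NcpDispatchResult r)
{
    return r == NCP_DISPATCH_FORWARD_BIND ||
           r == NCP_DISPATCH_FORWARD_BIND_POST;
}
```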
## Endpoint table as audit index first
Before replacing switch statements, add an endpoint inventory table as a
non-invasive audit aid. It can be compiled only for debug builds or kept as a
source-level documentation table.
Conceptual form:
```c
typedef struct {
    uint16_t request_type;
    uint8_t function;
    int selector_count;
    uint32_t selector[4];
    const char *selector_note;
    const char *name;
    const char *provider;
    uint32_t flags;
} NcpEndpointDoc;
```
Example entries:
```c
{ 0x2222, 23, 1, { 109 }, "subfunction", "Change Queue Job Entry old", "nwbind/queue", NCPDOC_FORWARDED },
{ 0x2222, 32, 1, { 0 }, "subfunction", "Open Semaphore old", "sema", NCPDOC_LOCAL },
{ 0x2222, 33, 0, { 0 }, NULL, "Negotiate Buffer Size", "nwconn", NCPDOC_LOCAL },
/* Later NetWare 4.x examples that need more than one logical selector. */
{ 0x2222, 104, 2, { 2, 0 }, "subfunction + NDS verb", "Send NDS Fragmented Request/Reply", "nwnds", NCPDOC_FUTURE },
{ 0x2222, 123, 2, { 34, 2 }, "subfunction + info level", "Get Volume Information by Level", "servermgmt", NCPDOC_FUTURE },
```
This table would help with the ongoing endpoint audit:
- SDK/PDF/WebSDK listed and implemented;
- SDK/PDF/WebSDK listed and forwarded;
- SDK/PDF/WebSDK listed but disabled as a future stub;
- SDK/PDF/WebSDK listed but absent from the current compatibility target;
- later NetWare 4.x/OES/MOAB endpoint, not part of the default NetWare 3.x
compatibility target.
The first version should not drive runtime dispatch. It should only make review
and missing-endpoint checks less error-prone.
The table should be able to represent a selector path rather than only a single
subfunction. This matters for later NetWare 4.x families and for extension
mechanisms. The first selector element is usually the documented subfunction
byte, but later elements may be 16-bit or 32-bit fields from the request body,
not dispatch bytes in the classic switch sense. Treat them as layout selectors,
not as automatic nested `switch` cases unless the code actually dispatches on
them.
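As an audit aid, the table could be queried by selector path without touching runtime dispatch. The following is a minimal sketch: the `NCPDOC_*` flag values, the table name `ncp_endpoint_docs`, and the helper `ncp_doc_lookup()` are assumptions for illustration; the entries are copied from the examples above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed flag values; the document only names the flags. */
enum { NCPDOC_LOCAL = 1, NCPDOC_FORWARDED = 2, NCPDOC_FUTURE = 4 };

typedef struct {
    uint16_t request_type;
    uint8_t function;
    int selector_count;
    uint32_t selector[4];
    const char *selector_note;
    const char *name;
    const char *provider;
    uint32_t flags;
} NcpEndpointDoc;

static const NcpEndpointDoc ncp_endpoint_docs[] = {
    { 0x2222, 23, 1, { 109 }, "subfunction",
      "Change Queue Job Entry old", "nwbind/queue", NCPDOC_FORWARDED },
    { 0x2222, 33, 0, { 0 }, NULL,
      "Negotiate Buffer Size", "nwconn", NCPDOC_LOCAL },
    { 0x2222, 123, 2, { 34, 2 }, "subfunction + info level",
      "Get Volume Information by Level", "servermgmt", NCPDOC_FUTURE },
};

/*
 * Return the most specific entry whose selector path is a prefix of the
 * queried path, or NULL if the endpoint is not in the inventory.
 */
static const NcpEndpointDoc *ncp_doc_lookup(uint16_t type, uint8_t function,
                                            const uint32_t *sel, int count)
{
    const NcpEndpointDoc *best = NULL;
    size_t i;
    int j;

    for (i = 0; i < sizeof ncp_endpoint_docs / sizeof ncp_endpoint_docs[0]; i++) {
        const NcpEndpointDoc *d = &ncp_endpoint_docs[i];

        if (d->request_type != type || d->function != function)
            continue;
        if (d->selector_count > count)
            continue;
        for (j = 0; j < d->selector_count; j++)
            if (d->selector[j] != sel[j])
                break;
        if (j < d->selector_count)
            continue;
        if (best == NULL || d->selector_count > best->selector_count)
            best = d;
    }
    return best;
}
```

A `NULL` result then means "not in the inventory", which is a different statement from "forwarded" or "future stub" and can be reported separately by an audit script.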
## Handler structure
For newly touched endpoint families, prefer the following logical split even if
it remains in one C function at first:
```text
request decode
-> validation
-> backend operation
-> reply encode
```
For complex endpoints this could become explicit helper functions:
```c
static int decode_foo(NcpContext *ctx, FooRequest *out);
static int exec_foo(NcpContext *ctx, const FooRequest *req, FooReply *reply);
static void encode_foo(NcpContext *ctx, const FooReply *reply);
```
This is especially useful for endpoint families where the audit has already
found old/new layout differences:
- 16-bit old queue job numbers versus newer 32-bit job numbers;
- big-endian versus little-endian SDK notation;
- old short replies versus newer long replies;
- connection-side prehandling that inserts or rewrites fields;
- bindery or queue paths that build final replies in a different process.
Small endpoints do not need three separate helper functions if that would make
the code noisier. The rule is that request bytes and reply bytes should be easy
to identify and compare with the SDK documents.
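To make the split concrete, here is a minimal sketch for an invented `foo` endpoint. Everything about it is hypothetical: the request layout (one big-endian 16-bit id), the reply layout (one big-endian 32-bit value), the cut-down `NcpContext`, and the placeholder failure code. Unlike the prototypes above, `exec_foo()` takes no context and `encode_foo()` returns a status so that a too-small reply buffer can be reported.

```c
#include <stdint.h>

/* Cut-down context for this sketch only. */
typedef struct {
    const uint8_t *request;
    int            request_len;
    uint8_t       *reply;
    int            reply_cap;
    int            reply_len;
    uint8_t        completion;
} NcpContext;

typedef struct { uint16_t id; } FooRequest;
typedef struct { uint32_t value; } FooReply;

#define FOO_COMPLETION_OK     0x00
#define FOO_COMPLETION_FAILED 0xff  /* placeholder, not a documented code */

/* Request decode: 2 bytes, big-endian id (hypothetical layout). */
static int decode_foo(const NcpContext *ctx, FooRequest *out)
{
    if (ctx->request_len < 2)
        return -1;
    out->id = (uint16_t)(((unsigned)ctx->request[0] << 8) | ctx->request[1]);
    return 0;
}

/* Backend operation: stands in for real work. */
static int exec_foo(const FooRequest *req, FooReply *reply)
{
    reply->value = (uint32_t)req->id + 1u;
    return 0;
}

/* Reply encode: 4 bytes, big-endian value (hypothetical layout). */
static int encode_foo(NcpContext *ctx, const FooReply *reply)
{
    if (ctx->reply_cap < 4)
        return -1;
    ctx->reply[0] = (uint8_t)(reply->value >> 24);
    ctx->reply[1] = (uint8_t)(reply->value >> 16);
    ctx->reply[2] = (uint8_t)(reply->value >> 8);
    ctx->reply[3] = (uint8_t)reply->value;
    ctx->reply_len = 4;
    return 0;
}

/* Handler: the only place that sets the completion code. */
static void handle_foo(NcpContext *ctx)
{
    FooRequest req;
    FooReply rep;

    ctx->reply_len = 0;
    ctx->completion = FOO_COMPLETION_FAILED;
    if (decode_foo(ctx, &req) != 0)
        return;
    if (exec_foo(&req, &rep) != 0)
        return;
    if (encode_foo(ctx, &rep) != 0)
        return;
    ctx->completion = FOO_COMPLETION_OK;
}
```

With this shape, the request bytes appear only in `decode_foo()` and the reply bytes only in `encode_foo()`, so each can be compared against the SDK layout on its own.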
## Make handoff explicit
Forwarded calls should say exactly what is handed off. A good comment should
answer:
- which bytes are forwarded;
- whether the subfunction byte is preserved or stripped;
- whether `nwconn.c` mutates the request before forwarding;
- whether `nwbind.c` or another provider builds the final reply;
- whether `nwconn.c` expects post-processing after the provider reply.
Examples of handoff cases that need this clarity:
- Queue calls where `nwconn.c` expands paths or inserts job file handles before
`nwbind.c` sees the request.
- Quota/bindery prehandling where the destination handler receives an already
transformed request.
- Semaphore and message groups that are grouped in the SDK but routed through
local helper modules.
- Direct lifecycle calls such as End Of Job and Logout where local cleanup and
final success reply are split across files.
The preferred future style is not "`nwbind` must do the rest" but something like:
```text
Forward to nwbind with the original subfunction byte and payload unchanged.
No nwconn post-processing is expected; nwbind builds the completion-only reply.
```
or:
```text
Forward to nwbind after saving the original request. nwbind validates bindery
state and returns the bindery result; nwconn then performs the file-handle
post-processing in handle_after_bind().
```
## Response building rule
Every endpoint audit should identify the reply builder, not only the request
parser. A handler is not fully documented until the response path is known.
For each endpoint family, record:
- completion-only reply;
- fixed-size payload reply;
- variable-length payload reply;
- provider-built reply;
- `nwconn.c` post-processed reply;
- intentionally unsupported reply status.
Long-term, response sending should become centralized enough that endpoint code
only encodes payload bytes and a completion code. This reduces off-by-one reply
length bugs and makes the logs easier to normalize.
## Normalized inter-process handoff replies
The process handoff path should be normalized before adding more provider
processes. The current `nwconn` to `nwbind` forwarding path relies on magic
return values and implicit shared-buffer conventions. That is workable for the
historic two-process case, but it will not scale cleanly to future providers such
as `nwqueue`, `nwnds`, or a directory service.
The long-term rule should be:
```text
Every provider process returns exactly one internal handoff reply for every
internal handoff request it accepts.
```
That internal reply is not the same thing as a client-visible NCP reply. A
provider may explicitly say that no client reply should be sent, but it should
still send a formal internal result back to the caller. This avoids silent
success/failure paths and makes timeout/error handling deterministic.
Conceptual reply kinds:
```c
typedef enum {
    NW_HR_REPLY,     /* provider produced a client NCP reply payload */
    NW_HR_NO_REPLY,  /* provider handled it; nwconn must not send a client reply */
    NW_HR_DEFERRED,  /* accepted; final reply/event will be produced later */
    NW_HR_FORWARD,   /* provider requests forwarding to another provider */
    NW_HR_ERROR      /* internal provider or handoff failure */
} NwHandoffReplyKind;
```
The normal successful completion-only case is still a reply:
```text
kind = NW_HR_REPLY
completion = 0x00
reply_len = 0
```
The true "do not answer the client" case is explicit:
```text
kind = NW_HR_NO_REPLY
reply_len = 0
```
Do not encode this as text in a payload such as `"no reply"`. It should be a
machine-readable reply kind so that `nwconn` can make one central decision.
A conceptual internal handoff reply header could look like this:
```c
typedef struct {
    uint16_t version;
    uint16_t kind;
    uint32_t request_id;
    uint32_t connection_id;
    uint32_t sequence;
    uint32_t task_id;
    uint8_t completion;
    uint8_t connection_status;
    uint32_t flags;
    uint32_t reply_len;
} NwHandoffReply;
```
The matching request should carry the same correlation fields plus the NCP
selector path and payload length. The exact structure can follow existing
mars-nwe style, but the contract should be stable:
```text
nwconn -> provider:
    request_id, connection_id, sequence, task, selector path, request payload
provider -> nwconn:
    same request_id, reply kind, completion/status, reply payload length, payload
```
This gives future provider processes a uniform contract:
```text
nwconn -> nwbind
nwconn -> nwqueue
nwconn -> nwnds
nwconn -> nwdirectory
```
all use the same shape. Provider-specific behavior belongs in the payload and
provider API, not in special process-specific return conventions.
### Reply ownership
The preferred long-term ownership rule is:
```text
Provider builds the logical reply payload and completion/status.
nwconn owns the final client NCP response envelope and sends it to the client.
```
This means providers should not directly send client packets. They return an
internal result that says what should happen. `nwconn` then applies the original
sequence, connection number, task, transport, and NCP envelope rules in one
place.
This avoids several classes of bugs:
- duplicate replies;
- wrong sequence or task in replies;
- inconsistent completion-only reply lengths;
- provider-specific send/error paths;
- unclear post-processing after `nwbind` replies;
- future provider processes needing to know transport details.
Legacy paths may continue to have provider-built replies during migration, but
that should be marked as a legacy compatibility mode, not the design target.
### Post-processing without `return(-2)`
The current `return(-2)` convention means roughly "forward to bindery, save the
original request, then let `nwconn` do more work after the provider reply". In a
normalized handoff this should become explicit state, not a magic result value.
Possible flags:
```c
#define NW_HF_SAVE_ORIGINAL_REQUEST 0x00000001
#define NW_HF_POSTPROCESS_REPLY 0x00000002
#define NW_HF_LEGACY_PROVIDER_REPLY 0x00000004
```
The request context can also name the post-processing hook, or store a small
post-processing enum, so the provider does not need to know why the caller will
continue after the reply. This keeps the handoff transport generic.
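One way to carry that state is a small caller-side record that holds the flags, the saved request, and the hook. This is a sketch only: `NwPending`, `NwPostFn`, `nw_pending_init()`, `nw_pending_finish()`, the 64-byte save limit, and the demo hook are all hypothetical and not part of mars-nwe.

```c
#include <stdint.h>
#include <string.h>

#define NW_HF_SAVE_ORIGINAL_REQUEST 0x00000001u
#define NW_HF_POSTPROCESS_REPLY     0x00000002u

#define NW_SAVED_MAX 64  /* arbitrary limit for this sketch */

typedef struct NwPending NwPending;
typedef void (*NwPostFn)(const NwPending *p, uint8_t *reply,
                         uint32_t *reply_len, uint32_t reply_cap);

/* Caller-side state kept by nwconn while a handoff is outstanding. */
struct NwPending {
    uint32_t flags;
    NwPostFn post;                 /* hook named by the caller, may be NULL */
    uint8_t  saved[NW_SAVED_MAX];  /* copy of the original request */
    uint32_t saved_len;
};

/* Returns 0 on success, -1 if the flags and arguments are inconsistent. */
static int nw_pending_init(NwPending *p, uint32_t flags, NwPostFn post,
                           const uint8_t *req, uint32_t len)
{
    memset(p, 0, sizeof *p);
    if ((flags & NW_HF_POSTPROCESS_REPLY) && post == NULL)
        return -1;
    if (flags & NW_HF_SAVE_ORIGINAL_REQUEST) {
        if (len > NW_SAVED_MAX)
            return -1;
        if (len > 0)
            memcpy(p->saved, req, len);
        p->saved_len = len;
    }
    p->flags = flags;
    p->post = post;
    return 0;
}

/* Called by nwconn after the provider reply has arrived. */
static void nw_pending_finish(const NwPending *p, uint8_t *reply,
                              uint32_t *reply_len, uint32_t reply_cap)
{
    if (p->flags & NW_HF_POSTPROCESS_REPLY)
        p->post(p, reply, reply_len, reply_cap);
}

/* Demo hook: appends the first saved request byte to the reply. */
static void demo_append_first_byte(const NwPending *p, uint8_t *reply,
                                   uint32_t *reply_len, uint32_t reply_cap)
{
    if (p->saved_len > 0 && *reply_len < reply_cap)
        reply[(*reply_len)++] = p->saved[0];
}
```

The provider never sees `NwPending`, so it does not need to know whether or why the caller continues after the reply.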
### Error mapping and dead-provider behavior
The handoff layer should define what happens when a provider fails before a
protocol handler can return a normal NetWare completion code. Otherwise every
new provider will invent a slightly different failure path.
The normalized rules should cover:
- provider process not running;
- provider closes the IPC channel;
- malformed internal reply;
- mismatched `request_id`;
- provider timeout;
- reply payload longer than negotiated capacity;
- provider returned `NW_HR_ERROR` with an internal error code;
- provider returned `NW_HR_FORWARD` to an unsupported target.
For client-visible requests, `nwconn` should map those failures through one
central function to either an NCP completion code, a connection-level failure, or
an intentional disconnect. Endpoint handlers should not open-code provider IPC
failures as arbitrary completion bytes.
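The central mapping function could be as small as one switch over the failure causes listed above. In this sketch the enum names, `nw_map_handoff_failure()`, and above all the chosen policy (which failure leads to which client-visible action) are placeholders; the real policy would need to be decided per failure class.

```c
/* Failure causes from the list above (names are hypothetical). */
typedef enum {
    NW_HFAIL_NONE,
    NW_HFAIL_PROVIDER_DOWN,
    NW_HFAIL_CHANNEL_CLOSED,
    NW_HFAIL_MALFORMED_REPLY,
    NW_HFAIL_ID_MISMATCH,
    NW_HFAIL_TIMEOUT,
    NW_HFAIL_REPLY_TOO_LONG,
    NW_HFAIL_PROVIDER_ERROR,
    NW_HFAIL_BAD_FORWARD
} NwHandoffFailure;

/* The three client-visible outcomes named in the text, plus "no failure". */
typedef enum {
    NW_ACT_CONTINUE,            /* no failure; proceed with the provider reply */
    NW_ACT_COMPLETION_CODE,     /* answer with a mapped NCP completion code */
    NW_ACT_CONNECTION_FAILURE,  /* report a connection-level failure */
    NW_ACT_DISCONNECT           /* intentionally drop the connection */
} NwClientAction;

/*
 * One central policy function. The assignments below are placeholders
 * chosen only to show the shape of the mapping.
 */
static NwClientAction nw_map_handoff_failure(NwHandoffFailure f)
{
    switch (f) {
    case NW_HFAIL_NONE:
        return NW_ACT_CONTINUE;
    case NW_HFAIL_PROVIDER_DOWN:
    case NW_HFAIL_TIMEOUT:
    case NW_HFAIL_PROVIDER_ERROR:
    case NW_HFAIL_BAD_FORWARD:
        return NW_ACT_COMPLETION_CODE;
    case NW_HFAIL_CHANNEL_CLOSED:
    case NW_HFAIL_MALFORMED_REPLY:
    case NW_HFAIL_ID_MISMATCH:
    case NW_HFAIL_REPLY_TOO_LONG:
        return NW_ACT_CONNECTION_FAILURE;
    }
    /* Unknown failure class: fail closed in this sketch. */
    return NW_ACT_DISCONNECT;
}
```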
### Correlation and replay safety
Every internal handoff request should carry a monotonically increasing
correlation identifier, even if the first implementation is only local to one `nwconn`
process. The tuple should include enough information to catch stale or crossed
replies:
```text
request_id + connection_id + sequence + task_id
```
This matters once there are multiple provider processes or deferred replies. It
also makes logging and debugging much easier, because an endpoint audit can show
the complete path of one request through several processes.
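A minimal sketch of the tuple and the stale-reply check follows. `NwCorrelation`, `nw_next_request_id()`, and `nw_correlation_matches()` are hypothetical names, and reserving id 0 to mean "no request" is an assumption of this sketch.

```c
#include <stdint.h>

/* The correlation tuple from the text above. */
typedef struct {
    uint32_t request_id;
    uint32_t connection_id;
    uint32_t sequence;
    uint32_t task_id;
} NwCorrelation;

/*
 * Allocate the next request id from a per-process counter.
 * Assumption: 0 is reserved as "no request", so it is skipped on wrap.
 */
static uint32_t nw_next_request_id(uint32_t *counter)
{
    *counter += 1u;
    if (*counter == 0u)
        *counter = 1u;
    return *counter;
}

/* A reply is accepted only if every field of the tuple matches. */
static int nw_correlation_matches(const NwCorrelation *want,
                                  const NwCorrelation *got)
{
    return want->request_id == got->request_id &&
           want->connection_id == got->connection_id &&
           want->sequence == got->sequence &&
           want->task_id == got->task_id;
}
```

Comparing the whole tuple, not only `request_id`, catches a reply that carries a recycled id but belongs to a different connection or sequence number.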
### Payload ownership and size limits
The handoff protocol should define who owns buffers and what size limits apply.
At minimum, document these rules before functional refactoring:
- request payload is immutable after handoff unless a mutating legacy wrapper is
explicitly documented;
- provider reply payload length must be checked against caller capacity;
- variable-length replies must report exact encoded length;
- zero-length payload is valid for completion-only replies;
- `NO_REPLY` is a reply kind, not a zero-length success reply;
- byte order inside payloads remains the NCP endpoint's documented wire order,
not native host order.
This is especially important for nested selector families and old/new endpoint
variants, where a provider may need to choose different reply structures based on
a level, verb, or information type.
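Some of these rules can be checked in one place when `nwconn` accepts a provider reply. The sketch below assumes hypothetical helpers `nw_accept_payload()` and `nw_is_client_reply()`; it covers only the capacity check and the distinction between a zero-length reply and `NW_HR_NO_REPLY`.

```c
#include <stdint.h>
#include <string.h>

typedef enum {
    NW_HR_REPLY,
    NW_HR_NO_REPLY,
    NW_HR_DEFERRED,
    NW_HR_FORWARD,
    NW_HR_ERROR
} NwHandoffReplyKind;

/*
 * Copy a provider payload into the caller's reply buffer.
 * Returns the number of bytes copied, or -1 if the reply violates the
 * handoff rules: payload longer than the caller capacity, or a NO_REPLY
 * result that nevertheless carries payload bytes.
 */
static long nw_accept_payload(NwHandoffReplyKind kind,
                              const uint8_t *payload, uint32_t reply_len,
                              uint8_t *dst, uint32_t cap)
{
    if (kind == NW_HR_NO_REPLY && reply_len != 0)
        return -1;
    if (reply_len > cap)
        return -1;
    if (reply_len > 0)
        memcpy(dst, payload, reply_len);
    return (long)reply_len;
}

/*
 * A zero-length NW_HR_REPLY is still a completion-only client reply;
 * NW_HR_NO_REPLY is not a reply to the client at all.
 */
static int nw_is_client_reply(NwHandoffReplyKind kind)
{
    return kind == NW_HR_REPLY;
}
```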
### Logging and audit benefit
A normalized handoff reply gives logging one consistent shape:
```text
REQ id=42 conn=7 seq=19 ncp=0x2222/23/113 provider=queue len=...
RPLY id=42 conn=7 seq=19 kind=REPLY completion=0x00 len=...
RPLY id=43 conn=7 seq=20 kind=NO_REPLY reason=...
ERR id=44 conn=7 seq=21 provider=bindery error=timeout mapped=0xfb
```
This would also make the endpoint documentation pass easier: each audited
endpoint can identify the provider, the request layout, the logical reply kind,
the reply payload layout, and any caller-side post-processing.
### Migration order for handoff normalization
The safe order is:
1. document current `nwconn`/`nwbind` handoff behavior;
2. add names for current magic values without changing behavior;
3. add a small wrapper such as `ncp_handoff_to_provider()` that still calls the
old path internally;
4. introduce a formal internal reply object in the wrapper;
5. make the wrapper always return a formal reply, including `NO_REPLY`;
6. centralize final client reply sending in `nwconn` for converted paths;
7. only then attach future providers such as `nwqueue` or `nwnds`.
The rule is: do not create a new provider process until the caller can receive a
formal reply from it and can handle provider failure centrally.
## Provider boundaries
A clean design would treat the existing modules as providers instead of hidden
fallback paths:
```text
nwconn        connection/session, packet IO, top-level envelope
ncpdispatch   endpoint lookup, handoff policy, common errors
nwbind        bindery database and bindery-backed services
queue         queue metadata and print/backend adapter
sema          semaphore state
message       station/message/broadcast state
namespace     path, directory handle, name-space operations
file          file handle and read/write/open/close operations
salvage       deleted-file scan/recover/purge backend
AFP           AFP metadata and AFP namespace adapter
```
This is a design target, not a demand to move files immediately. The important
part is that future code should avoid making `nwbind` a catch-all sink for
unrelated NCPs just because it already has an IPC path.
## Provider boundary versus process boundary
A provider boundary is not the same thing as a Unix process boundary. This is
an important distinction because splitting every NCP family into a separate
process would make the server harder to debug and could introduce new ordering,
locking, and reply-ownership bugs.
The preferred rule is:
```text
first define logical providers;
only later promote the few large stateful providers to separate processes.
```
A logical provider can start as an ordinary C module called from the existing
process path. It becomes valuable as soon as the dispatch table can say "this
endpoint belongs to the queue provider" or "this endpoint belongs to the
connection-local provider", even if no new process exists yet. A process split
should be treated as an implementation detail that is only justified when the
provider has enough independent state and lifecycle to benefit from isolation.
This keeps the redesign incremental:
```text
now:
    nwconn switch -> existing local code or nwbind handoff
first cleanup:
    nwconn switch -> provider-named helper/module
later, only where useful:
    nwconn/dispatcher -> IPC -> provider process
### Good process candidates
#### Bindery
Bindery is already a natural service boundary. It owns long-lived server state:
objects, properties, sets, security, password/login/key handling, and object
lookup. Keeping bindery behind a clear provider boundary is appropriate, and the
existing `nwbind` process can remain that boundary while the dispatch layer is
cleaned up.
The main cleanup is not to remove `nwbind`, but to stop treating it as a generic
catch-all for unrelated forwarded requests. A future endpoint table should mark
true bindery calls as `bindery`, and queue or management calls should not be
classified as bindery merely because their current implementation lives in
`nwbind.c`.
#### Queue / possible `nwqueue`
Queue management is the strongest candidate for a future separate process after
bindery. Queue handling has its own domain state:
- queue objects and queue metadata;
- queue job lifecycle;
- queue server attach/detach state;
- service, finish, and abort state;
- job position and priority;
- client-rights transitions during job servicing;
- queue directories and spool/job files.
That is large enough to deserve a logical `queue` provider even before any
runtime split. A future `nwqueue` process can be considered once request/reply
ownership and bindery access are explicit.
The first step should only be a provider split:
```text
0x2222/23 queue subfunctions -> queue provider
queue provider -> bindery provider/library for object/security/property checks
queue provider -> file/path helpers for queue job files
```
A real `nwqueue` process should not be created by simply moving the current queue
cases out of `nwbind.c`. It needs an explicit contract for:
- which process owns the final NCP reply;
- how queue calls read bindery objects and properties;
- how queue job files are opened and handed back to the connection process;
- how connection cleanup affects attached queue servers and in-service jobs;
- how old 16-bit job-number calls and newer 32-bit job-number calls are kept
compatible.
Until those contracts are clear, `nwqueue` should remain a design target, not an
immediate functional change.
### Possible but risky process candidates
#### File and volume subsystem
The file/volume/name-space area is large and stateful, so it can look like a
candidate for a separate process. It owns or touches directory handles, file
handles, locks, trustee evaluation, volume information, name spaces, salvage and
purge operations, and Unix filesystem mapping.
However, this area is also tightly coupled to connection state and existing file
descriptor ownership. Moving it behind IPC too early could create more problems
than it solves. The safer path is:
```text
first: file/volume/name-space provider modules inside the current process model
later: consider a process split only after handle ownership is explicit
```
A file provider boundary is useful for documentation and dispatch cleanup. A
separate file process is optional and should be considered high-risk.
#### Accounting
Accounting is a maybe. It has a separate protocol domain, but in many setups it
may be small enough to stay as an in-process provider. A process boundary only
makes sense if accounting grows into a real persistent service with charges,
holds, notes, audit records, and recovery behavior that should be isolated from
connection handlers.
### Poor process candidates
#### Semaphore
Semaphore calls should have a clean provider boundary, but a dedicated process is
probably overkill. The old semaphore group is small: open, examine, wait,
signal, and close. It needs shared state, but not necessarily a standalone
process. A `sema` provider module with clear request/reply ownership should be
enough unless later testing shows that cross-connection semaphore state cannot be
managed safely in the existing process model.
#### Connection lifecycle and session-local calls
Connection lifecycle operations should stay with `nwconn` or a connection-local
provider. Calls such as Logout, End Of Job, watchdog handling, buffer
negotiation, and connection-state cleanup are fundamentally tied to the session
that received the packet. Moving them into another process would make cleanup
ordering and error handling harder.
#### Simple server-management calls
Simple management and information calls should not become their own process.
Examples include login-status queries, server description strings, server time,
console-privilege checks, and small broadcast/control helpers. These can be
represented as a `servermgmt` provider for dispatch clarity, but they should stay
in-process unless a specific call requires an existing backend service.
### Suggested provider map
The endpoint audit table should be able to use provider names like these:
```text
local        packet/session-local handling in nwconn
bindery      object/property/security/login backend
queue        queue objects, jobs, queue servers, spool/job lifecycle
filesystem   file, directory, volume, namespace, trustee, salvage helpers
semaphore    semaphore state and old 0x2222/32 calls
message      station messaging and broadcast helpers
servermgmt   small server-management and information calls
accounting   account status, charges, holds, notes
AFP          AFP namespace and metadata helpers
unknown      documented but not yet mapped
```
Only some providers should ever become processes:
```text
already process-like:   bindery / nwbind
likely future process:  queue / possible nwqueue
maybe, high risk:       filesystem
usually in-process:     semaphore, message, servermgmt, accounting, AFP helpers
```
The practical design rule is:
```text
Use provider names everywhere in documentation and endpoint tables.
Use new processes only where shared state, isolation, and lifecycle justify the
extra IPC complexity.
```
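The provider map and the process guidance above could live next to the endpoint table as plain data. This is a sketch with hypothetical names (`NwProcessPolicy`, `nw_provider_policies`, `nw_provider_policy()`); the classification itself is taken from the lists above, and `local` and `unknown` are left unclassified because those lists do not assign them a policy.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    NW_PROC_UNSPECIFIED,      /* provider not classified above */
    NW_PROC_EXISTING,         /* already process-like */
    NW_PROC_LIKELY,           /* likely future process */
    NW_PROC_MAYBE_HIGH_RISK,  /* maybe, high risk */
    NW_PROC_IN_PROCESS        /* usually in-process */
} NwProcessPolicy;

typedef struct {
    const char     *provider;
    NwProcessPolicy policy;
} NwProviderPolicy;

static const NwProviderPolicy nw_provider_policies[] = {
    { "bindery",    NW_PROC_EXISTING },
    { "queue",      NW_PROC_LIKELY },
    { "filesystem", NW_PROC_MAYBE_HIGH_RISK },
    { "semaphore",  NW_PROC_IN_PROCESS },
    { "message",    NW_PROC_IN_PROCESS },
    { "servermgmt", NW_PROC_IN_PROCESS },
    { "accounting", NW_PROC_IN_PROCESS },
    { "AFP",        NW_PROC_IN_PROCESS },
};

/* Look up the documented process guidance for a provider name. */
static NwProcessPolicy nw_provider_policy(const char *provider)
{
    size_t i;

    for (i = 0; i < sizeof nw_provider_policies / sizeof nw_provider_policies[0]; i++)
        if (strcmp(nw_provider_policies[i].provider, provider) == 0)
            return nw_provider_policies[i].policy;
    return NW_PROC_UNSPECIFIED;
}
```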
## Future NetWare 4.x directory, LDAP, and storage direction
NetWare 4.x support should not be added by letting `nwbind` grow into a second
large catch-all service. The long-term directory design should keep the legacy
Bindery, the future NDS compatibility layer, and the LDAP protocol frontend as
separate logical layers above one shared directory store.
The intended naming model is:
```text
libflaim
  persistent embedded database engine
libdirectory
  shared internal directory API/library used by nwbind, nwnds, nwdirectory,
    and setup/provisioning tools
  owns the mars-nwe object model, schema helpers, indexes, ACL/auth
    primitives, and persistence glue above libflaim
directory core/store
  the data model and persistent store exposed through libdirectory
  persists its data through libflaim
nwdirectory
  mars-nwe service name for the integrated tinyldap-derived LDAP service
  owns LDAP/LDAPS/StartTLS protocol handling
  uses wolfSSL only at the LDAP network/TLS edge
  calls the directory core/store, not Bindery or NDS packet handlers
nwnds
  future NetWare 4.x/NDS compatibility layer
  owns NDS/NCP directory semantics, contexts, tree-oriented operations,
    NetWare-specific rights/auth behavior, and later compatibility glue
  calls the directory core/store directly
nwbind
  legacy NetWare 2.x/3.x Bindery compatibility layer
  maps Bindery objects, properties, sets, security, and login-visible behavior
    onto the shared directory core/store where possible
```
In this model, `nwdirectory` is not a separate design from tinyldap. It is the
mars-nwe integration name for the tinyldap-derived LDAP directory service, so
that the installed binary/module follows the existing `nw*` naming scheme. The
upstream tinyldap code can provide the LDAP protocol implementation, but the
project-facing component should be named `nwdirectory`.
`libdirectory` is the important internal boundary. It should be a real shared
API/library, not just a documentation label, because both `nwbind` and future
`nwnds` need directory data without speaking LDAP to each other. The library can
start small, but it should provide the common operations that legacy Bindery,
NDS compatibility, LDAP, and setup code all need:
```c
dir_open_store();
dir_close_store();
dir_txn_begin();
dir_txn_commit();
dir_txn_abort();
dir_object_create();
dir_object_delete();
dir_object_lookup_by_id();
dir_object_lookup_by_name();
dir_object_search();
dir_attr_get();
dir_attr_set();
dir_attr_delete();
dir_acl_check();
dir_auth_verify();
dir_schema_get();
```
The exact function names are placeholders, but the ownership rule is important:
NetWare protocol handlers should call a directory API, not encode LDAP requests
to reach local server state. If `nwdirectory` later runs as a separate process,
`libdirectory` can either remain the shared embedded store library or define the
internal IPC contract. In both cases the protocol layers still depend on the
directory API, not on LDAP text/protocol behavior.
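The ownership rule can be illustrated with a toy in-memory store that two frontends reach through the same API. This is a sketch only: the parameter lists, the `DirStore` and `DirObject` types, the fixed-size array, and the two frontend wrappers are all assumptions, and nothing here touches libflaim or real Bindery, NDS, or LDAP semantics.

```c
#include <stdint.h>
#include <string.h>

#define DIR_MAX_OBJECTS 16

typedef struct {
    uint32_t id;
    char     name[48];
} DirObject;

/* Toy stand-in for the persistent store; a real one would sit on libflaim. */
typedef struct {
    DirObject obj[DIR_MAX_OBJECTS];
    int       count;
    uint32_t  next_id;
} DirStore;

static void dir_open_store(DirStore *s)
{
    memset(s, 0, sizeof *s);
    s->next_id = 1;  /* 0 means "not found" in this sketch */
}

/* Returns the new object id, or 0 on failure. */
static uint32_t dir_object_create(DirStore *s, const char *name)
{
    DirObject *o;

    if (s->count >= DIR_MAX_OBJECTS || strlen(name) >= sizeof s->obj[0].name)
        return 0;
    o = &s->obj[s->count++];
    o->id = s->next_id++;
    strcpy(o->name, name);
    return o->id;
}

/* Returns the object id, or 0 if no object has that name. */
static uint32_t dir_object_lookup_by_name(const DirStore *s, const char *name)
{
    int i;

    for (i = 0; i < s->count; i++)
        if (strcmp(s->obj[i].name, name) == 0)
            return s->obj[i].id;
    return 0;
}

/* Hypothetical Bindery-side caller: uses the directory API directly. */
static uint32_t bindery_find_object(const DirStore *s, const char *name)
{
    return dir_object_lookup_by_name(s, name);
}

/* Hypothetical LDAP-side caller: same API, no LDAP hop in between. */
static uint32_t ldap_find_entry(const DirStore *s, const char *cn)
{
    return dir_object_lookup_by_name(s, cn);
}
```

Both wrappers resolve the same object id from the same store, which is the property the design asks for: the frontends are siblings above one directory API, and neither reaches local state through the other.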
`nwnds` should remain a separate layer because LDAP is only one protocol view of
the directory. NDS has NetWare-specific semantics that should not be forced into
the LDAP frontend. Conversely, LDAP clients should not be required to pass
through the NDS/NCP compatibility handler just to reach the directory database.
The preferred relationship is sibling frontends above one core:
```text
                  +----------------------+
                  | directory core/store |
                  | backed by libflaim   |
                  +----------+-----------+
                             ^
             +---------------+---------------+
             |                               |
        nwdirectory                        nwnds
 tinyldap-based LDAP/LDAPS       NetWare 4.x/NDS semantics
 frontend, wolfSSL TLS edge      NCP/NDS compatibility layer
             ^                               ^
             |                               |
        LDAP clients               NetWare/NDS clients
The legacy Bindery service should also move toward this shared store over time:
```text
NetWare 3.x client -> Bindery NCP -> nwbind -> directory core/store -> libflaim
LDAP client -> LDAP/LDAPS -> nwdirectory -> directory core/store -> libflaim
NetWare 4.x client -> NDS/NCP -> nwnds -> directory core/store -> libflaim
```
That means `nwbind` should become a compatibility mapping over directory objects
and attributes instead of maintaining a completely separate long-term identity
truth. This is especially important once NetWare 4.x/NDS support exists, because
Bindery compatibility can then be implemented as a legacy view of the same
underlying users, groups, properties, and rights data.
The internal path should not be:
```text
nwbind -> LDAP protocol -> nwdirectory -> directory store
nwnds -> LDAP protocol -> nwdirectory -> directory store
```
Using LDAP as the mandatory internal storage API would mix protocol concerns into
server internals, make old Bindery behavior harder to preserve, and add needless
encoding/search semantics between tightly coupled modules. LDAP should remain an
external protocol frontend. `nwbind`, `nwnds`, and `nwdirectory` should all use
`libdirectory`, or a clearly defined IPC protocol modeled after the same
directory API, to reach the directory store.
FLAIM should therefore be treated as the long-term persistent storage engine for
the directory core, not as an LDAP-only database. `libdirectory` owns the schema,
object model, indexes, transactions, ACL checks, and authentication primitives
that the protocol/provider layers need. `nwdirectory` exposes those objects
through LDAP; `nwnds` exposes them through NDS semantics; `nwbind` exposes them
through legacy Bindery calls.
A separate setup/provisioning tool should own initial population of this store.
The proposed project-facing name is `nwsetup`, matching the `nw*` naming scheme.
Its job is not to be another protocol server. It should create or migrate the
initial directory database through `libdirectory` directly:
```text
nwsetup -> libdirectory -> libflaim
```
Examples of setup-owned work:
- create an empty directory store;
- initialize the base tree, root/container objects, and default schema;
- create initial admin/server/service objects;
- create Bindery compatibility objects and properties needed by NetWare 2.x/3.x
clients;
- import or migrate an existing mars-nwe Bindery database when that becomes
practical;
- set initial passwords/secrets using the same authentication primitives that
`nwbind`, `nwnds`, and `nwdirectory` will use at runtime;
- validate or repair indexes before the server starts.
`nwsetup` should not fill the database by acting as an LDAP client to
`nwdirectory`. LDAP import/export can be useful for interoperability later, but
the local bootstrap path should avoid requiring a running LDAP server and should
not make LDAP the canonical internal representation.
Kerberos should not be part of this initial design. Classic NetWare 4.x/NDS
compatibility should focus on native NDS-style authentication and directory
semantics. If a later eDirectory/NMAS compatibility effort ever needs Kerberos,
it should be considered a separate future authentication-provider topic, not a
requirement for the `nwdirectory`/`nwnds`/`nwbind` split.
The migration path should be conservative:
1. add the design boundary and naming notes first;
2. import or integrate tinyldap under the project-facing `nwdirectory` name;
3. keep wolfSSL confined to the LDAP/LDAPS/StartTLS network edge;
4. introduce `libdirectory` before making Bindery depend on it;
5. add `nwsetup` as the direct bootstrap/provisioning tool for the initial
libflaim-backed directory store;
6. map selected `nwbind` objects/properties to `libdirectory` only after the
legacy behavior is documented;
7. add `nwnds` later as an NDS semantic layer, not as an LDAP wrapper;
8. only then consider replacing private Bindery persistence with libflaim-backed
directory storage.
This keeps the future NetWare 4.x work aligned with the provider/process split:
`nwdirectory`, `nwnds`, and `nwbind` may be separate processes or modules, but
they should not be separate sources of truth for identity and directory data.
## Transport split for future TCP/IP support
Future TCP/IP support should be introduced as a transport code/library split,
not as a new daemon. The transport layer is below the NCP dispatcher: it owns
wire IO, peer addressing, framing, and transport-specific discovery or watchdog
behavior. It does not own Bindery, Queue, Directory, File, Semaphore, or other
NCP provider semantics.
The intended source-level split is:
```text
src/nwtransport.c
    common transport API and helpers
    transport-neutral peer/session descriptors
    dispatch to the selected transport implementation

src/nwipx.c
    existing IPX-specific implementation
    ipxAddr_t conversion and compatibility helpers
    IPX socket send/receive
    SAP/RIP, IPX watchdog, and IPX broadcast behavior where applicable

src/nwtcp.c
    later TCP/IP implementation
    TCP listener/session/framing code
    IPv4/IPv6 peer address handling
    no SAP/RIP assumptions

src/nwconn.c
    NCP session logic
    request decode, dispatch handoff, reply construction
    should gradually use transport-neutral peer/session data

src/nwserv.c
    process supervision and connection lifecycle
    uses the transport layer for listener and peer management
```
These files should be linked into the existing `nwserv`/`nwconn` process model.
`nwtransport` is a boundary in the code, not an `nwtransport` process. Creating
a separate transport daemon would add an IPC hop for every NCP packet, complicate
disconnect/error handling, and make TCP stream ownership harder without adding a
clear NetWare service boundary.
The long-term direction is to remove raw IPX assumptions from higher layers.
Today, the connection path still exposes `ipxAddr_t` in important places. A
future cleanup should introduce a transport-neutral peer descriptor.
Conceptually, it could look like this:
```c
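typedef enum {
    NW_TRANSPORT_IPX,
    NW_TRANSPORT_TCP
} NwTransportKind;

typedef struct {
    NwTransportKind kind;              /* selects the active union member */
    union {
        ipxAddr_t ipx;                 /* existing mars-nwe IPX address type */
        struct {
            unsigned char  addr[16];   /* large enough for an IPv6 address */
            unsigned short port;
            unsigned char  family;     /* distinguishes IPv4 from IPv6 */
        } tcp;
    } u;
} NwTransportPeer;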
```
The exact structure should follow the existing mars-nwe style, but the ownership
rule is the important part: NCP providers should not care whether a request came
from IPX or TCP/IP. They should see a connection/session and an NCP request,
not a raw network address type.
The transport API can start small. Useful conceptual operations are:
```c
nwtransport_peer_equal();
nwtransport_peer_to_string();
nwtransport_recv();
nwtransport_send();
nwtransport_close_peer();
nwtransport_peer_kind();
```
As with the NCP context design, these names are placeholders. The first
implementation can wrap the existing IPX behavior and leave TCP stubs out until
there is a real TCP/IP target. The goal is to stop new code from spreading
`ipxAddr_t` into providers that should remain transport-independent.
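
As a rough illustration of how small the first wrapper could be, the sketch below implements two of the conceptual operations for the peer descriptor shown earlier. Everything here is an assumption for illustration: the `ipxAddr_t` definition is a local stand-in so the example compiles on its own, the function signatures are invented, and the string format is arbitrary. The real code should use the existing mars-nwe types and style.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the mars-nwe ipxAddr_t so this sketch compiles on its own.
 * The real type comes from the existing mars-nwe headers. */
typedef struct {
    unsigned char net[4];
    unsigned char node[6];
    unsigned char sock[2];
} ipxAddr_t;

typedef enum {
    NW_TRANSPORT_IPX,
    NW_TRANSPORT_TCP
} NwTransportKind;

typedef struct {
    NwTransportKind kind;
    union {
        ipxAddr_t ipx;
        struct {
            unsigned char  addr[16];
            unsigned short port;
            unsigned char  family;
        } tcp;
    } u;
} NwTransportPeer;

/* Returns 1 when both peers name the same endpoint, 0 otherwise.
 * Fields are compared one by one so that padding bytes never matter. */
int nwtransport_peer_equal(const NwTransportPeer *a, const NwTransportPeer *b)
{
    assert(a != NULL && b != NULL);
    if (a->kind != b->kind)
        return 0;
    if (a->kind == NW_TRANSPORT_IPX) {
        return memcmp(a->u.ipx.net, b->u.ipx.net, sizeof a->u.ipx.net) == 0
            && memcmp(a->u.ipx.node, b->u.ipx.node, sizeof a->u.ipx.node) == 0
            && memcmp(a->u.ipx.sock, b->u.ipx.sock, sizeof a->u.ipx.sock) == 0;
    }
    return a->u.tcp.family == b->u.tcp.family
        && a->u.tcp.port == b->u.tcp.port
        && memcmp(a->u.tcp.addr, b->u.tcp.addr, sizeof a->u.tcp.addr) == 0;
}

/* Formats an IPX peer as net:node:sock in hex and returns snprintf's result.
 * TCP peers are only tagged with their port, because TCP address formatting
 * is not designed yet. */
int nwtransport_peer_to_string(const NwTransportPeer *p, char *buf, size_t len)
{
    assert(p != NULL && buf != NULL);
    if (p->kind == NW_TRANSPORT_IPX) {
        const ipxAddr_t *x = &p->u.ipx;
        return snprintf(buf, len,
            "ipx:%02x%02x%02x%02x:%02x%02x%02x%02x%02x%02x:%02x%02x",
            x->net[0], x->net[1], x->net[2], x->net[3],
            x->node[0], x->node[1], x->node[2], x->node[3],
            x->node[4], x->node[5], x->sock[0], x->sock[1]);
    }
    return snprintf(buf, len, "tcp:port=%u", (unsigned)p->u.tcp.port);
}
```

A provider that only needs to log or compare peers could call these two helpers and never touch `ipxAddr_t` directly, which is the ownership rule this section is after.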
IPX-specific behavior must remain isolated. SAP/RIP, IPX broadcast, and the
existing IPX watchdog behavior are compatibility details of the IPX transport or
its immediate `nwserv` integration. TCP/IP should not be forced to emulate IPX
SAP/RIP internally. If TCP/IP later needs discovery or service advertisement,
that should be designed as a TCP/IP-specific mechanism rather than hidden behind
old IPX-only assumptions.
The intended relationship is therefore:
```text
IPX client -> nwipx -> nwtransport -> nwconn -> NCP dispatcher -> providers
TCP client -> nwtcp -> nwtransport -> nwconn -> NCP dispatcher -> providers
```
The provider/process rule still applies:
```text
Provider boundary does not imply process boundary.
Transport boundary does not imply process boundary either.
```
A good future cleanup sequence would be:
1. document the current IPX ownership in `nwserv.c` and `nwconn.c`;
2. add `nwtransport.c`/transport headers as wrappers around existing IPX paths;
3. move IPX-only helpers into `nwipx.c` without behavior changes;
4. gradually replace raw `ipxAddr_t` use in session-neutral code with a
transport-neutral peer/session descriptor;
5. keep NCP providers and the endpoint audit table transport-independent;
6. add `nwtcp.c` only after the IPX wrapper boundary is stable.
This keeps TCP/IP support compatible with the broader redesign: transport IO is
separated from NCP semantics, but the existing `nwserv`/`nwconn` process model
remains intact.
## Logging connection
The dispatch redesign also supports the desired log cleanup. If every request
has a context, logs can consistently include:
```text
INFO NCP 23/109 DISPATCH type=0x2222 fn=0x17 sub=0x6d provider=nwbind/queue
INFO NCP 32/0 REPLY type=0x2222 fn=0x20 sub=0x00 result=0x00 len=4
WARN NCP 23/130 LAYOUT-MISMATCH sdk="32-bit JobNumber" code="16-bit parser"
```
The logging cleanup should still reuse existing mars-nwe logging functions. Do
not add a second logging subsystem just to support the dispatch cleanup.
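
To make the DISPATCH line format above concrete, here is a minimal sketch of a formatter. The `NcpLogInfo` struct and `ncp_format_dispatch()` are hypothetical names invented for this example. A real version would take its fields from the request context and pass the text to the existing mars-nwe logging functions rather than returning a buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-request fields; all names are placeholders. */
typedef struct {
    unsigned    type;       /* NCP request type, e.g. 0x2222 */
    unsigned    fn;         /* function number */
    unsigned    sub;        /* subfunction number, 0 when unused */
    const char *provider;   /* provider label, e.g. "nwbind/queue" */
} NcpLogInfo;

/* Formats one DISPATCH line into buf and returns snprintf's result.
 * The function/subfunction pair is printed in decimal first and then in
 * hex, matching the sample log lines in this section. */
int ncp_format_dispatch(const NcpLogInfo *info, char *buf, size_t len)
{
    assert(info != NULL && buf != NULL);
    return snprintf(buf, len,
        "INFO NCP %u/%u DISPATCH type=0x%04x fn=0x%02x sub=0x%02x provider=%s",
        info->fn, info->sub, info->type, info->fn, info->sub,
        info->provider ? info->provider : "unknown");
}
```

Keeping the format in one helper means every handler produces the same field order, which makes the logs easy to grep during an endpoint audit.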
## Migration plan
### Phase 1: Name the existing conventions
Low risk. No behavior change.
- Add named constants or comments for the current `0`, `-1`, and `-2`
dispatcher results.
- Keep existing control flow unchanged.
- Update comments so `return(-1)` is never described ambiguously outside the
exact dispatcher where it is meaningful.
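
A minimal sketch of what Phase 1 could look like follows. The constant and function names are placeholders invented here, and the meaning given for `0` (handled locally in `nwconn.c`) is an assumption that should be checked against the real dispatcher before anything like this is committed. The numeric values are unchanged, so existing control flow is not affected.

```c
#include <assert.h>

/* Placeholder names for the existing nwconn.c dispatcher results.
 * The values are the ones already in use; only the names are new. */
enum {
    NCP_DISPATCH_DONE             =  0, /* assumed: handled in nwconn.c */
    NCP_DISPATCH_FORWARD          = -1, /* forward the request to nwbind */
    NCP_DISPATCH_FORWARD_POSTPROC = -2  /* forward with saved request state;
                                           nwconn.c post-processes the reply */
};

/* Debug self-check: the names must keep the values that existing code
 * already returns, otherwise the rename would change behavior. */
void ncp_dispatch_selfcheck(void)
{
    assert(NCP_DISPATCH_DONE == 0);
    assert(NCP_DISPATCH_FORWARD == -1);
    assert(NCP_DISPATCH_FORWARD_POSTPROC == -2);
}

/* Returns 1 when the result asks the caller to hand the request to nwbind. */
int ncp_dispatch_is_forward(int result)
{
    return result == NCP_DISPATCH_FORWARD
        || result == NCP_DISPATCH_FORWARD_POSTPROC;
}

/* Returns a short label for logs and comments. Values this sketch does not
 * name map to "other" instead of being guessed at. */
const char *ncp_dispatch_result_name(int result)
{
    switch (result) {
    case NCP_DISPATCH_DONE:             return "done";
    case NCP_DISPATCH_FORWARD:          return "forward-nwbind";
    case NCP_DISPATCH_FORWARD_POSTPROC: return "forward-nwbind-postprocess";
    default:                            return "other";
    }
}
```

These names only make sense inside the `nwconn.c` dispatcher path. They should not be applied to unrelated `-1` returns in other files.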
### Phase 2: Add an endpoint audit table
Low risk. Mostly documentation/debug.
- Add a table of known endpoints by request type, function, and subfunction.
- Mark provider, generation bucket, and implementation state.
- Use it to compare SDK/PDF/WebSDK coverage against actual handlers.
- Do not switch runtime dispatch to the table yet.
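
One possible shape for such a table is sketched below. The struct, enum, and function names are hypothetical, and the two rows are taken from the sample log lines in the logging section purely as illustration. Their state is deliberately left as "unaudited" because this sketch does not record any real audit result.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    NCP_AUDIT_UNAUDITED,          /* not yet compared against SDK/PDF/WebSDK */
    NCP_AUDIT_IMPLEMENTED,
    NCP_AUDIT_DISABLED,
    NCP_AUDIT_MISSING,
    NCP_AUDIT_LATER_GENERATION
} NcpAuditState;

typedef struct {
    unsigned short type;          /* request type, e.g. 0x2222 */
    unsigned char  fn;            /* function number */
    unsigned char  sub;           /* subfunction number, 0 when unused */
    const char    *provider;      /* provider label */
    NcpAuditState  state;
} NcpAuditEntry;

/* Illustrative rows only; provider labels and states are placeholders. */
static const NcpAuditEntry ncp_audit_table[] = {
    { 0x2222, 0x17, 0x6d, "nwbind/queue", NCP_AUDIT_UNAUDITED },
    { 0x2222, 0x20, 0x00, "semaphore",    NCP_AUDIT_UNAUDITED }
};

/* Linear lookup by type/function/subfunction; returns NULL when the
 * endpoint has no row. This is documentation/debug support only and is
 * not consulted by runtime dispatch. */
const NcpAuditEntry *ncp_audit_find(unsigned short type,
                                    unsigned char fn, unsigned char sub)
{
    size_t i;
    for (i = 0; i < sizeof ncp_audit_table / sizeof ncp_audit_table[0]; i++) {
        const NcpAuditEntry *e = &ncp_audit_table[i];
        assert(e->provider != NULL);   /* every row must name a provider */
        if (e->type == type && e->fn == fn && e->sub == sub)
            return e;
    }
    return NULL;
}
```

A debug build could call `ncp_audit_find()` when a request arrives and log endpoints that have no row, which gives a cheap coverage report without touching dispatch.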
### Phase 3: Introduce a thin `NcpContext`
Moderate risk if kept small.
- Wrap existing request and reply buffers without changing ownership.
- Use the context only in newly audited or newly implemented handlers.
- Keep old handlers callable until they are touched for another reason.
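
The following is a minimal sketch of a context that wraps existing buffers without taking ownership. The field and function names are assumptions made for this example and may differ from the context design described elsewhere in these notes. The helper returns 1 or 0 rather than a negative value so it cannot be confused with the dispatcher's `-1`/`-2` conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Thin view over buffers that the caller already owns. Nothing here
 * allocates or frees memory. */
typedef struct {
    const unsigned char *request;      /* borrowed request buffer */
    size_t               request_len;
    unsigned char       *reply;        /* borrowed reply buffer */
    size_t               reply_cap;    /* total space available in reply */
    size_t               reply_len;    /* bytes written so far */
} NcpContext;

/* Points the context at existing buffers and resets the reply length. */
void ncp_context_init(NcpContext *ctx,
                      const unsigned char *request, size_t request_len,
                      unsigned char *reply, size_t reply_cap)
{
    assert(ctx != NULL && request != NULL && reply != NULL);
    ctx->request     = request;
    ctx->request_len = request_len;
    ctx->reply       = reply;
    ctx->reply_cap   = reply_cap;
    ctx->reply_len   = 0;
}

/* Appends one byte to the reply. Returns 1 on success and 0 when the
 * reply buffer is already full, in which case nothing is written. */
int ncp_context_put_u8(NcpContext *ctx, unsigned char value)
{
    assert(ctx != NULL);
    if (ctx->reply_len >= ctx->reply_cap)
        return 0;
    ctx->reply[ctx->reply_len++] = value;
    return 1;
}
```

Because the context only borrows pointers, an old handler can keep using the raw buffers while a newly audited handler in the same family uses the context.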
### Phase 4: Convert small endpoint families first
Moderate risk, easy to test.
Good candidates:
- `0x2222/32` old Semaphore calls;
- direct calls such as End Of Job, Logout, and Negotiate Buffer Size;
- small message/station groups once their handoff has been audited.
Avoid converting queue and bindery first because they have more process coupling
and more old/new layout variants.
### Phase 5: Move runtime dispatch to tables gradually
Higher risk. Do this only after enough endpoint families have stable audit
coverage and tests.
- Keep switch wrappers during the transition.
- Convert one family at a time.
- Preserve exact completion codes and reply lengths.
- Add targeted smoke tests for any family whose dispatch path changes.
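
The "keep switch wrappers during the transition" step could be sketched as a table lookup that falls back to the old switch for every family that has not been converted. All names below are hypothetical, the legacy function is a stand-in for the existing dispatcher, and the return values are marker numbers for the example rather than real completion codes.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*NcpHandlerFn)(unsigned char fn, unsigned char sub);

enum {
    MARK_LEGACY_SWITCH = 1,   /* marker: request went through the old switch */
    MARK_TABLE_HANDLER = 2    /* marker: request went through the table */
};

/* Stand-in for the existing switch-based dispatcher. */
static int legacy_switch_dispatch(unsigned char fn, unsigned char sub)
{
    (void)fn;
    (void)sub;
    return MARK_LEGACY_SWITCH;
}

/* Stand-in for one converted family handler. */
static int semaphore_family_handler(unsigned char fn, unsigned char sub)
{
    (void)fn;
    (void)sub;
    return MARK_TABLE_HANDLER;
}

typedef struct {
    unsigned char fn;         /* function number owned by this entry */
    NcpHandlerFn  handler;
} NcpTableEntry;

/* Only converted families appear here; one row is added per conversion. */
static const NcpTableEntry ncp_table[] = {
    { 0x20, semaphore_family_handler }
};

/* Tries the table first and falls back to the legacy switch, so families
 * that were not converted keep their exact old path. */
int ncp_dispatch(unsigned char fn, unsigned char sub)
{
    size_t i;
    for (i = 0; i < sizeof ncp_table / sizeof ncp_table[0]; i++) {
        if (ncp_table[i].fn == fn) {
            assert(ncp_table[i].handler != NULL);
            return ncp_table[i].handler(fn, sub);
        }
    }
    return legacy_switch_dispatch(fn, sub);
}
```

With this shape, converting a family is a one-row change that can be reverted by deleting the row, which fits the one-family-at-a-time rule above.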
## Non-goals
This redesign should not:
- change protocol behavior merely to match a cleaner abstraction;
- remove NetWare 1.x/2.x/3.x compatibility paths;
- enable NetWare 4.x/OES/MOAB-only endpoints by default;
- replace existing mars-nwe path, bindery, queue, AFP, trustee, or salvage
backends with parallel databases;
- add a large external message bus dependency;
- rewrite all handlers in one patch;
- turn documentation-only endpoint audit patches into functional refactors.
## Practical rule for future patches
For the ongoing endpoint documentation pass, keep doing the conservative thing:
1. enumerate SDK/PDF/WebSDK/include endpoints for the family;
2. compare them with actual `case` labels and forwarded destination handlers;
3. document missing, disabled, implemented, and later-generation slots;
4. document request parser/handoff and response builder;
5. record real layout differences, but do not change behavior in the same patch.
Functional cleanup should come later in small patches with tests.