docs: document nwlog and zlog logging direction

Mario Fetka
2026-06-02 08:25:07 +00:00
parent 2f75b77ded
commit 2006dba942

@@ -1347,6 +1347,137 @@ This keeps TCP/IP support compatible with the broader redesign: transport IO is
separated from NCP semantics, but the existing `nwserv`/`nwconn` process model
remains intact.
## Logging subsystem and optional zlog backend
The dispatch, provider, directory, and transport redesigns all need better
logging than scattered ad-hoc debug messages. The goal is not only prettier
logs. The important requirements are:
- consistent severity levels;
- consistent categories across processes and providers;
- request correlation from `nwserv`/`nwconn` through provider handoff and back;
- safe redaction of secrets before any backend sees the message;
- configurable routing to local files, syslog, or later remote collectors;
- auditable security events such as password recovery, TLS failures, rejected
provider IPC, and directory/bootstrap changes.
The mars-nwe source should not call a third-party logging library directly from
random endpoint handlers. It should grow a small internal facade first:
```c
typedef enum {
NWLOG_CORE,
NWLOG_CONFIG,
NWLOG_TRANSPORT,
NWLOG_NCP,
NWLOG_HANDOFF,
NWLOG_BINDERY,
NWLOG_QUEUE,
NWLOG_DIRECTORY,
NWLOG_NDS,
NWLOG_LDAP,
NWLOG_AUTH,
NWLOG_ACL,
NWLOG_RECOVERY,
NWLOG_SECURITY
} NwLogCategory;
```
Conceptual call sites should look like:
```c
nwlog_info(NWLOG_HANDOFF, ctx,
"provider=%s request_id=%u selector=%s handoff=start",
provider_name, request_id, selector_path);
nwlog_warn(NWLOG_RECOVERY, ctx,
"admin password recovery requested dn=%s uid=%lu",
redacted_dn, (unsigned long)uid);
```
That facade can initially delegate to the existing mars-nwe logging functions,
`stderr`, or `syslog`. Later it may gain an optional advanced backend without
any change to the call sites.
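
As a rough illustration of how small such a facade can start, the following sketch formats a record and writes it to `stderr`. Everything beyond the `NwLogCategory` values and the `nwlog_info`/`nwlog_warn` call shape is an assumption made for this example: the `NwLogLevel` enum, the `nwlog_format`, `nwlog_vformat`, and `nwlog_write` helpers, the lower-case category names, and the opaque `NwLogCtx` placeholder are hypothetical and not existing mars-nwe code.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed severity names; the design text does not fix a level enum. */
typedef enum {
    NWLOG_LVL_DEBUG, NWLOG_LVL_INFO, NWLOG_LVL_WARN, NWLOG_LVL_ERROR
} NwLogLevel;

/* Same categories as the NwLogCategory enum in this section. */
typedef enum {
    NWLOG_CORE, NWLOG_CONFIG, NWLOG_TRANSPORT, NWLOG_NCP, NWLOG_HANDOFF,
    NWLOG_BINDERY, NWLOG_QUEUE, NWLOG_DIRECTORY, NWLOG_NDS, NWLOG_LDAP,
    NWLOG_AUTH, NWLOG_ACL, NWLOG_RECOVERY, NWLOG_SECURITY
} NwLogCategory;

typedef struct NwLogCtx NwLogCtx; /* opaque placeholder for request context */

static const char *const nwlog_level_names[] = {
    "DEBUG", "INFO", "WARN", "ERROR"
};

/* Hypothetical printable names, in the same order as NwLogCategory. */
static const char *const nwlog_category_names[] = {
    "core", "config", "transport", "ncp", "handoff", "bindery", "queue",
    "directory", "nds", "ldap", "auth", "acl", "recovery", "security"
};

/* Format "LEVEL category message" into buf; returns a vsnprintf-style length.
 * The sketch trusts lvl and cat to be valid enum values. */
int nwlog_vformat(char *buf, size_t size, NwLogLevel lvl, NwLogCategory cat,
                  const char *fmt, va_list ap)
{
    int head = snprintf(buf, size, "%s %s ",
                        nwlog_level_names[lvl], nwlog_category_names[cat]);
    int body;

    if (head < 0 || (size_t)head >= size)
        return head;
    body = vsnprintf(buf + head, size - (size_t)head, fmt, ap);
    return body < 0 ? body : head + body;
}

/* Variadic convenience wrapper around nwlog_vformat. */
int nwlog_format(char *buf, size_t size, NwLogLevel lvl, NwLogCategory cat,
                 const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = nwlog_vformat(buf, size, lvl, cat, fmt, ap);
    va_end(ap);
    return n;
}

/* Simplest possible backend: one formatted line per record on stderr. */
void nwlog_write(NwLogLevel lvl, NwLogCategory cat, const NwLogCtx *ctx,
                 const char *fmt, ...)
{
    char line[512];
    va_list ap;

    (void)ctx; /* correlation fields are out of scope for this sketch */
    va_start(ap, fmt);
    nwlog_vformat(line, sizeof(line), lvl, cat, fmt, ap);
    va_end(ap);
    fprintf(stderr, "%s\n", line);
}

#define nwlog_info(cat, ctx, ...) \
    nwlog_write(NWLOG_LVL_INFO, (cat), (ctx), __VA_ARGS__)
#define nwlog_warn(cat, ctx, ...) \
    nwlog_write(NWLOG_LVL_WARN, (cat), (ctx), __VA_ARGS__)
```

With these definitions the conceptual call sites compile unchanged, and replacing the body of `nwlog_write` is enough to retarget every call site at another backend.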
`zlog` is a good candidate for that advanced backend because it is a C logging
library with category, format, and rule based configuration. That model fits
mars-nwe well: code can emit category-specific events such as `ncp`, `handoff`,
`queue`, `directory`, `auth`, or `transport`, while the administrator decides in
the logging configuration whether those categories go to a file, stdout/stderr,
syslog-style output, a pipe, or an external log-forwarder path. The zlog
project documentation describes these three core concepts as categories, formats,
and rules, where rules bind a category/level to an output and format. Before
choosing it, packaging, license compatibility, portability, and maintenance state
still need to be verified for the target distributions.
The preferred dependency shape is therefore:
```text
mars-nwe code
-> nwlog facade
-> simple built-in backend: stderr/file/syslog
-> optional advanced backend: zlog
-> admin-configured zlog rules/formats/outputs
```
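
One possible way to express that layering in C is a small backend interface that the facade owns. The sketch below is an assumption, not a settled design: `NwLogBackend`, `nwlog_select_backend`, and `nwlog_emit` are hypothetical names, and `nwlog_unavailable_backend` is only a stand-in for an optional backend (such as a zlog one) that is not compiled in or fails to load its configuration. No zlog functions are called here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical backend interface owned by the nwlog facade. */
typedef struct NwLogBackend {
    const char *name;
    int  (*open)(const char *config_path); /* 0 on success, -1 on failure */
    void (*write)(int level, const char *category, const char *line);
    void (*close)(void);
} NwLogBackend;

/* Built-in backend: always available, writes to stderr. */
static int builtin_open(const char *config_path)
{
    (void)config_path; /* the built-in backend needs no configuration file */
    return 0;
}

static void builtin_write(int level, const char *category, const char *line)
{
    fprintf(stderr, "<%d> [%s] %s\n", level, category, line);
}

static void builtin_close(void)
{
}

static const NwLogBackend nwlog_builtin_backend = {
    "builtin", builtin_open, builtin_write, builtin_close
};

/* Stand-in for an optional backend whose initialization fails. */
static int unavailable_open(const char *config_path)
{
    (void)config_path;
    return -1;
}

static const NwLogBackend nwlog_unavailable_backend = {
    "unavailable", unavailable_open, builtin_write, builtin_close
};

static const NwLogBackend *nwlog_active = &nwlog_builtin_backend;

/* Try the requested backend; keep the built-in one if it is missing
 * or cannot be opened. Returns the backend that ends up active. */
const NwLogBackend *nwlog_select_backend(const NwLogBackend *candidate,
                                         const char *config_path)
{
    if (candidate != NULL && candidate->open(config_path) == 0)
        nwlog_active = candidate;
    else
        nwlog_active = &nwlog_builtin_backend;
    return nwlog_active;
}

/* Endpoint code never sees the backend; it only reaches it through here. */
void nwlog_emit(int level, const char *category, const char *line)
{
    nwlog_active->write(level, category, line);
}
```

A real zlog backend would be another `NwLogBackend` instance compiled only when the library is available, which keeps the fallback path identical for minimal builds.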
Do not make endpoint code depend on `zlog_category_t` or zlog macros directly.
Keeping `nwlog` in the middle gives mars-nwe one place to:
- inject correlation fields such as `connection_id`, `request_id`, `sequence`,
`task_id`, provider name, and NCP selector path;
- redact or suppress sensitive fields before formatting;
- enforce no-secret logging rules even when logs are routed to remote systems;
- keep a fallback backend for minimal builds or platforms without zlog;
- change or add backends later without touching protocol handlers.
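
The correlation-field item in the list above could be sketched as follows. The `NwLogCtx` layout, the field order, and the `nwlog_format_with_ctx` helper are assumptions for illustration only; the real context structure would be defined by the dispatch redesign.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-request context carried from nwserv/nwconn to providers. */
typedef struct NwLogCtx {
    unsigned connection_id;
    unsigned request_id;
    unsigned sequence;
    unsigned task_id;
    const char *provider; /* may be NULL before a provider is chosen */
    const char *selector; /* NCP selector path, may be NULL */
} NwLogCtx;

/* Prefix an already formatted (and already redacted) message with the
 * correlation fields. Returns an snprintf-style length. */
int nwlog_format_with_ctx(char *buf, size_t size, const NwLogCtx *ctx,
                          const char *msg)
{
    if (ctx == NULL)
        return snprintf(buf, size, "%s", msg);

    return snprintf(buf, size,
                    "connection_id=%u request_id=%u sequence=%u task_id=%u "
                    "provider=%s selector=%s %s",
                    ctx->connection_id, ctx->request_id,
                    ctx->sequence, ctx->task_id,
                    ctx->provider ? ctx->provider : "-",
                    ctx->selector ? ctx->selector : "-",
                    msg);
}
```

Because the prefix is added in one place, a handoff reply can be matched to its request by `request_id` regardless of which backend eventually stores the line.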
Remote logging is useful, but it must be treated as a security boundary. A GELF
or Graylog-style collector, syslog relay, pipe, or any other remote forwarding
path must receive structured, redacted events only. It must never receive raw
NCP request bodies, decoded handoff payloads, passwords, one-shot recovery
tokens, private keys, or raw directory authentication material.
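
A minimal sketch of redaction applied before any backend sees a line is shown below. It assumes `key=value` formatted messages and a hypothetical list of sensitive key names; `nwlog_redact` and the key list are illustrative only. The value is replaced by a fixed marker rather than masked character by character, so the output does not reveal the secret's length. Values containing spaces are not handled, which a real implementation would need to address.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of keys whose values must never reach a backend. */
static const char *const nwlog_secret_keys[] = {
    "password=", "token=", "private_key="
};

/* Return the length of the secret key starting at s, or 0 if none matches. */
static size_t nwlog_match_secret_key(const char *s)
{
    size_t k;

    for (k = 0; k < sizeof(nwlog_secret_keys) / sizeof(nwlog_secret_keys[0]); k++) {
        size_t klen = strlen(nwlog_secret_keys[k]);
        if (strncmp(s, nwlog_secret_keys[k], klen) == 0)
            return klen;
    }
    return 0;
}

/* Copy in to out, replacing the value after each secret key with a fixed
 * marker. The value is taken to end at the next space or end of string.
 * out is always NUL-terminated when out_size > 0. */
void nwlog_redact(char *out, size_t out_size, const char *in)
{
    static const char mask[] = "[redacted]";
    size_t o = 0;

    if (out_size == 0)
        return;

    while (*in != '\0' && o + 1 < out_size) {
        size_t klen = nwlog_match_secret_key(in);
        size_t i;

        if (klen == 0) {
            out[o++] = *in++;
            continue;
        }
        for (i = 0; i < klen && o + 1 < out_size; i++)
            out[o++] = in[i];            /* keep the key itself */
        for (i = 0; mask[i] != '\0' && o + 1 < out_size; i++)
            out[o++] = mask[i];          /* replace the value */
        in += klen;
        while (*in != '\0' && *in != ' ')
            in++;                        /* skip the secret value */
    }
    out[o] = '\0';
}
```

Running this step inside the facade, before the backend call, means a misconfigured remote output still cannot receive the original value.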
A future, documented INI section could expose the logging policy without forcing
admins to edit C-style backend internals directly:
```ini
[logging]
backend = zlog ; builtin, syslog, file, zlog
level = info
redact_secrets = yes
config = /etc/mars-nwe/zlog.conf
[logging.category]
ncp = info
handoff = info
auth = warn
recovery = warn
directory = info
transport = info
[logging.debug]
packet_hexdump = no
handoff_hexdump = no
unsafe_raw_payloads = no
```
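
If such an INI section were adopted, the value parsing could stay very small. The sketch below assumes the level names `debug`, `info`, `warn`, and `error` (only `info` and `warn` appear in the example above) and `yes`/`no` booleans; `nwlog_parse_level`, `nwlog_parse_bool`, and `nwlog_level_enabled` are hypothetical helper names, and reading the INI file itself is left to the existing configuration code.

```c
#include <stddef.h>
#include <string.h>

/* Assumed severity order, lowest to highest. */
typedef enum {
    NWLOG_LVL_DEBUG, NWLOG_LVL_INFO, NWLOG_LVL_WARN, NWLOG_LVL_ERROR
} NwLogLevel;

/* Map an INI value such as "warn" to a level. Returns 0 on success and
 * -1 for an unknown name, so a typo can be reported instead of ignored. */
int nwlog_parse_level(const char *text, NwLogLevel *out)
{
    static const struct { const char *name; NwLogLevel level; } table[] = {
        { "debug", NWLOG_LVL_DEBUG },
        { "info",  NWLOG_LVL_INFO  },
        { "warn",  NWLOG_LVL_WARN  },
        { "error", NWLOG_LVL_ERROR }
    };
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(text, table[i].name) == 0) {
            *out = table[i].level;
            return 0;
        }
    }
    return -1;
}

/* Map "yes"/"no" to 1/0. Returns -1 for anything else. */
int nwlog_parse_bool(const char *text, int *out)
{
    if (strcmp(text, "yes") == 0) {
        *out = 1;
        return 0;
    }
    if (strcmp(text, "no") == 0) {
        *out = 0;
        return 0;
    }
    return -1;
}

/* A message passes when its level is at or above the category threshold. */
int nwlog_level_enabled(NwLogLevel message_level, NwLogLevel threshold)
{
    return message_level >= threshold;
}
```

With `auth = warn` as in the example, an `info` message in the `auth` category would be dropped while a `warn` message would pass.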
Raw packet or handoff hexdumps should be opt-in developer diagnostics, not normal
admin logging. Even then, auth/password fields should be redacted where the
layout is known. The safe default is length-only logging for sensitive payloads.
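
The length-only default could look like the sketch below. `nwlog_describe_payload` and its `hexdump_enabled` flag are hypothetical; the flag is assumed to be driven by an opt-in developer setting such as `packet_hexdump`. Field-level redaction of known auth layouts is not shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Describe a payload for logging. By default only the length is written.
 * Hex bytes are appended only when the caller has explicitly enabled
 * developer hexdumps. Returns the number of characters written, or the
 * snprintf-style length if the length prefix itself was truncated. */
int nwlog_describe_payload(char *buf, size_t size,
                           const unsigned char *data, size_t len,
                           int hexdump_enabled)
{
    size_t used;
    size_t i;
    int n = snprintf(buf, size, "payload len=%lu", (unsigned long)len);

    if (!hexdump_enabled || data == NULL || n < 0)
        return n;

    used = (size_t)n;
    if (used >= size)
        return n; /* already truncated, nothing more can be added */

    used += (size_t)snprintf(buf + used, size - used, " hex=");
    /* Each byte needs two hex digits plus room for the terminating NUL. */
    for (i = 0; i < len && used + 2 < size; i++)
        used += (size_t)snprintf(buf + used, size - used, "%02x", data[i]);

    return (int)used;
}
```

Keeping the flag as an explicit argument makes the unsafe path visible at every call site instead of hiding it behind a global.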
Important audit events should be logged even at normal levels:
- provider IPC connection accepted/rejected;
- provider IPC TLS/mTLS validation failure;
- directory store initialization and schema migration;
- `nwsetup` password bootstrap or recovery actions;
- bindery-to-directory migration actions;
- failed authentication attempts with redacted identities;
- NCP handoff timeout, dead provider, or mismatched reply correlation ID.
The logging cleanup should be a separate functional change from endpoint layout
patches. Documentation-only endpoint audit patches may add log design notes, but
they should not introduce new logging dependencies or change runtime logging
behavior.
## Logging connection
The dispatch redesign also supports the desired log cleanup. If every request
@@ -1358,8 +1489,9 @@ INFO NCP 32/0 REPLY type=0x2222 fn=0x20 sub=0x00 result=0x00 len=4
WARN NCP 23/130 LAYOUT-MISMATCH sdk="32-bit JobNumber" code="16-bit parser"
```
The logging cleanup should still reuse existing mars-nwe logging functions. Do
not add a second logging subsystem just to support the dispatch cleanup.
Until the `nwlog` facade exists, endpoint-dispatch cleanup should still reuse
existing mars-nwe logging functions. Do not add direct zlog calls or a parallel
logging path just to support one endpoint family.
## Migration plan