# NWFS salvage, stream, compression and tool roadmap
This note records the storage-backend decisions made after the NSS Unicode,
namespace and public-core audits. It is a design note only; it does not change
runtime behavior.
## Salvage authority
Long-term salvage state should be driven by NetWare/NSS-shaped metadata, not by
private JSON sidecars.
Target authority order:
```text
.recycle payload => deleted payload storage / Samba compatibility
netware.metadata on payload => authoritative deleted-file metadata
.salvage JSON => legacy transition only, then remove
```
`.recycle` remains because it matches existing MARS-NWE behavior and Samba
`vfs_recycle` interoperability. The goal is neither to move deleted payloads into
`.nwfs_streams` nor to replace the backend with the NSS ZLSS purge tree. It is to
make the recycled payload itself carry the NetWare/NSS deleted metadata needed
for NCP salvage, backup tools and restore.
New code should not create another private salvage metadata database. Use or
extend `netware.metadata` for the deleted-file view.
## Samba recycle interaction
The Samba 4.23.6 `vfs_recycle` module was checked. Its normal recycle path uses
`SMB_VFS_NEXT_RENAMEAT()` to move the file into the repository. It does not
copy the file and then copy xattrs. For the normal same-filesystem recycle case,
Linux xattrs remain attached to the inode across the rename, so an existing
`netware.metadata` xattr is preserved when Samba moves a file into `.recycle`.
If Samba cannot rename into the recycle repository, that module falls back to the
normal unlink path rather than creating a copied recycled payload. In that case
there is no recycled file for MARS-NWE to salvage.
Therefore:
- files moved to `.recycle` by MARS-NWE should have complete deleted metadata;
- files moved to `.recycle` by Samba should retain existing xattrs;
- files manually copied into `.recycle` without `netware.metadata` are not valid
NetWare salvage objects by default.
Do not add an automatic synthetic salvage fallback for missing metadata. If an
administrator wants to repair such files, provide an explicit tool command later.
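
The rename behavior described above can be checked on a host with a small script. This is only a sketch: the full xattr name `user.netware.metadata` is an assumption (the note names the attribute `netware.metadata` without a Linux xattr namespace), and the helper `xattr_survives_rename` is hypothetical. It returns `None` when the platform or filesystem does not support user xattrs.

```python
import os

# Assumption: the xattr lives in the Linux "user." namespace. The note only
# calls it `netware.metadata`, so this full name is hypothetical.
XATTR_NAME = "user.netware.metadata"


def xattr_survives_rename(workdir):
    """Return True/False if the check ran, or None if xattrs are unavailable."""
    if not hasattr(os, "setxattr"):
        return None  # os.setxattr is only available on Linux
    recycle = os.path.join(workdir, ".recycle")
    src = os.path.join(workdir, "BAR.TXT")
    dst = os.path.join(recycle, "BAR.TXT")
    os.makedirs(recycle, exist_ok=True)
    with open(src, "wb") as fh:
        fh.write(b"payload")
    try:
        os.setxattr(src, XATTR_NAME, b"deleted-metadata-blob")
    except OSError:
        return None  # this filesystem does not accept user xattrs
    # Same-filesystem rename: the inode is kept, and its xattrs with it.
    os.rename(src, dst)
    return os.getxattr(dst, XATTR_NAME) == b"deleted-metadata-blob"
```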
## Delete, scan, recover and purge target flow
Delete through MARS-NWE/NCP:
```text
1. move the payload into .recycle
2. set/update netware.metadata on the recycled payload
3. mark the object as deleted/salvageable using NSS-shaped fields
4. do not write a new .salvage JSON sidecar once migration is complete
```
Metadata on the recycled payload should include at least:
```text
zNTYPE_DELETED_FILE-compatible deleted type
DeletedPersistentParentEntry_s-compatible delete time and UserID_t
original parent identity
original DOS/LONG/MAC/UNIX names where available
original attributes/times/owner/archive/modifier fields
trustees / inherited-rights mask where available
stream, EA and compression references later
```
NCP scan:
```text
scan .recycle
read netware.metadata from each recycled payload
return official salvage replies from that metadata
ignore entries without valid deleted metadata
```
Recover:
```text
move/copy the payload back through existing MARS-NWE file mechanisms
clear or normalize the deleted state in netware.metadata
restore names, attributes, times, trustees, AFP hints, streams, EA and compression state
```
Purge/final delete:
```text
remove the .recycle payload
remove associated NWFS internal stream/EA/compression backend entries when present
```
Legacy `.salvage` JSON:
```text
read only for migration or old-install cleanup
migrate to netware.metadata where possible
stop writing new sidecars
remove yyjson once no other required consumer remains
```
## `.nwfs_streams` scope
`.nwfs_streams` is for internal NWFS stream-like storage, not for the main
Samba-compatible recycle payload.
Target layout:
```text
/export/SYS/.nwfs_streams/<stable-file-id>/primary
/export/SYS/.nwfs_streams/<stable-file-id>/resource
/export/SYS/.nwfs_streams/<stable-file-id>/ea
/export/SYS/.nwfs_streams/<stable-file-id>/compression
```
Rules:
- `<stable-file-id>` is a MARS/NWFS/NSS-shaped stable file ID stored in
`netware.metadata`.
- Do not use Linux inode numbers as the authoritative ID.
- Do not use the visible filename as the authoritative ID.
- Do not expose `.nwfs_streams` through normal NCP namespace calls.
- Compression state belongs in metadata/stream descriptors, not in a
`compressed_` filename prefix.
- A compressed stream backend is keyed by the stable file ID; the backend name is
not the user's DOS/LONG/MAC/UNIX filename and should survive rename/move.
Example:
```text
NCP-visible file:
SYS:FOO\BAR.TXT
Linux namespace payload while live and uncompressed:
/export/SYS/FOO/BAR.TXT
Internal compressed stream backend after future compression work:
/export/SYS/.nwfs_streams/0000000000001234/compression/primary
Authoritative metadata:
netware.metadata:
file_id = 0000000000001234
primary stream storage = compressed
logical_size = ...
compressed_size = ...
algorithm = netware/nss
```
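The layout above can be expressed as a small path helper. This is a sketch under one assumption the note does not confirm: that the stable file ID is rendered as 16 zero-padded hexadecimal digits. `format_file_id` and `stream_path` are hypothetical names.

```python
import posixpath

STREAM_KINDS = ("primary", "resource", "ea", "compression")


def format_file_id(file_id: int) -> str:
    # Assumption: 16 zero-padded hex digits, matching the width of the example.
    return f"{file_id:016x}"


def stream_path(volume_root: str, file_id: int, kind: str, *sub: str) -> str:
    """Build a backend path from the stable file ID, never from a filename."""
    if kind not in STREAM_KINDS:
        raise ValueError(f"unknown stream kind: {kind}")
    return posixpath.join(
        volume_root, ".nwfs_streams", format_file_id(file_id), kind, *sub
    )
```

Because the path depends only on the volume root and the file ID, renaming or moving `SYS:FOO\BAR.TXT` leaves the backend location unchanged.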
Salvage may reference stream snapshots in `.nwfs_streams`, but the deleted
primary payload remains in `.recycle` for Samba compatibility.
## Compression storage model
Linux filesystems such as ext3 and XFS do not provide a portable NSS-compatible
transparent compression model. MARS-NWE should not depend on host-FS-native
compression for NetWare/NSS semantics.
Compression belongs to future `libnwfs` work:
```text
Phase 1: import/adapt NSS compression algorithm sources
Phase 2: add NWFS stream backend for compressed payloads
Phase 3: make NCP read/write paths transparently de/compress when enabled
Phase 4: add compression manager/accounting/state for monitor endpoints
```
When a compressed file is recycled, the `.recycle` payload should be written in
uncompressed form. Samba and host-side tools know only the recycle repository
and normal Linux file contents; they do not know how to read a private NWFS
compressed stream backend. Therefore future delete/recycle handling must
materialize the primary data stream before or during the move to `.recycle`, then
record the former compression state in `netware.metadata` so NCP recover can
recreate the compressed state later if the volume policy requires it.
```text
live compressed file:
/export/SYS/FOO/BAR.TXT # namespace object
/export/SYS/.nwfs_streams/<file-id>/compression/primary
netware.metadata: primary stream storage = compressed
recycled file:
/export/SYS/.recycle/<user>/FOO/BAR.TXT # uncompressed payload
netware.metadata: deleted metadata + previous compression descriptor
```
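The materialize-then-record step might be sketched as follows. `zlib` is used purely as a stand-in codec so the example runs; it is not the NetWare/NSS algorithm. The metadata keys and the function name `materialize_for_recycle` are hypothetical.

```python
import zlib


def materialize_for_recycle(compressed_stream: bytes, metadata: dict):
    """Return (uncompressed payload, recycled metadata).

    zlib stands in for the real codec; only the flow is illustrated.
    """
    payload = zlib.decompress(compressed_stream)
    recycled = dict(metadata)  # leave the caller's dict untouched
    recycled["deleted"] = True
    # Keep the former compression state so recover can recreate it later.
    recycled["previous_compression"] = {
        "algorithm": metadata.get("algorithm"),
        "compressed_size": len(compressed_stream),
        "logical_size": len(payload),
    }
    recycled["primary_stream_storage"] = "plain"
    return payload, recycled
```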
Do not expose compressed bytes directly in `.recycle`. `.recycle` is the
Samba-compatible payload backend; `.nwfs_streams` is the private live/future
stream backend.
Compression-related NCP providers must use real `libnwfs` state, not fake data:
```text
decimal 90/12 == wire/code 0x5a/0x0c Set Compressed File Size
decimal 123/70 == wire/code 0x7b/0x46 Get Current Compressing File
decimal 123/71 == wire/code 0x7b/0x47 Get Current DeCompressing File Info List
decimal 123/72 == wire/code 0x7b/0x48 Get Compression and Decompression Time and Counts
decimal 22/51 == wire/code 0x16/0x33 Extended Volume Info compression counters
```
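The decimal and wire forms in the table are the same numbers in different bases. A short helper (the name `wire_code` is illustrative) makes the conversion explicit and can be used to check the list:

```python
# (function, subfunction, description) as listed in the table above.
COMPRESSION_NCPS = [
    (90, 12, "Set Compressed File Size"),
    (123, 70, "Get Current Compressing File"),
    (123, 71, "Get Current DeCompressing File Info List"),
    (123, 72, "Get Compression and Decompression Time and Counts"),
    (22, 51, "Extended Volume Info compression counters"),
]


def wire_code(function: int, subfunction: int) -> str:
    """Format a decimal function/subfunction pair as two-digit hex codes."""
    return f"0x{function:02x}/0x{subfunction:02x}"


if __name__ == "__main__":
    for func, sub, name in COMPRESSION_NCPS:
        print(f"{func}/{sub} == {wire_code(func, sub)}  {name}")
```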
## Host-side reconcile interaction
Salvage and stream tooling must expect that not every file was created through
MARS/NCP. The future `libnwfs` watcher/scanner should reconcile host-created
files before NCP clients rely on them: allocate stable file IDs, create
`netware.metadata`, derive DOS/LONG/MAC/UNIX namespace records, invalidate
namecache/search state, and report orphaned `.nwfs_streams` entries.
For `.recycle`, the scanner should validate that a recycled payload has
NSS-shaped deleted metadata before exposing it as a NetWare salvage object.
Entries copied manually into `.recycle` without valid `netware.metadata` should
remain invalid until an explicit admin repair tool marks or migrates them.
This keeps one authority for live, recycled and stream-associated objects:
`netware.metadata` plus the filesystem/quota authority where applicable. Do not
add a side database for watcher state, deleted state, stream IDs or compression
bookkeeping.
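
Two of the reconcile tasks, allocating stable IDs for host-created files and reporting orphaned `.nwfs_streams` entries, can be sketched with plain sets. The function name `reconcile`, the integer IDs and the "highest seen ID plus one" allocation policy are assumptions for illustration, not the planned `libnwfs` behavior.

```python
def reconcile(live_files, stream_ids):
    """live_files maps path -> stable file ID, or None for host-created files.

    Returns (assigned, orphans): every path with an ID, and the stream IDs
    that no live file referenced before new IDs were allocated.
    """
    known = {fid for fid in live_files.values() if fid is not None}
    # Start above every ID already seen, including orphaned stream IDs, so a
    # new file can never adopt an orphaned backend by accident.
    next_id = max(known | set(stream_ids), default=0) + 1
    assigned = {}
    for path in sorted(live_files):
        fid = live_files[path]
        if fid is None:
            fid, next_id = next_id, next_id + 1
        assigned[path] = fid
    orphans = sorted(set(stream_ids) - known)
    return assigned, orphans
```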
## Tool roadmap
Future host tools should operate on `netware.metadata` and `libnwfs` helpers,
not on private JSON sidecars.
### `nwsalvage`
```text
nwsalvage --help
nwsalvage --list <volume-or-path>
nwsalvage --info <recycled-file>
nwsalvage --restore <recycled-file>
nwsalvage --restore-to <recycled-file> <target>
nwsalvage --finaldelete <recycled-file>
nwsalvage --purge <volume-or-path>
nwsalvage --verify <volume-or-path>
nwsalvage --repair-minimal <recycled-file> # explicit admin action only
```
`--repair-minimal` must never run implicitly during a normal salvage scan. It is
only for an administrator who intentionally wants to turn a manually copied
`.recycle` file into a NetWare salvage object.
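
One possible shape for the option parsing, shown as a sketch only. The flags are the ones listed above; treating them as mutually exclusive actions is an assumption, and the eventual tool need not be written in Python or use `argparse`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="nwsalvage")  # --help is automatic
    # Assumption: exactly one action per invocation.
    actions = parser.add_mutually_exclusive_group(required=True)
    actions.add_argument("--list", metavar="VOLUME_OR_PATH")
    actions.add_argument("--info", metavar="RECYCLED_FILE")
    actions.add_argument("--restore", metavar="RECYCLED_FILE")
    actions.add_argument(
        "--restore-to", nargs=2, metavar=("RECYCLED_FILE", "TARGET")
    )
    actions.add_argument("--finaldelete", metavar="RECYCLED_FILE")
    actions.add_argument("--purge", metavar="VOLUME_OR_PATH")
    actions.add_argument("--verify", metavar="VOLUME_OR_PATH")
    # Explicit admin action only: never implied by any other option.
    actions.add_argument("--repair-minimal", metavar="RECYCLED_FILE")
    return parser
```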
### `nwmetadata`
```text
nwmetadata --dump <file>
nwmetadata --verify <file>
nwmetadata --set-deleted <file>
nwmetadata --clear-deleted <file>
nwmetadata --repair-minimal <file>
```
### `nwcompress`
```text
nwcompress --help
nwcompress --info <file>
nwcompress --compress <file>
nwcompress --uncompress <file>
nwcompress --verify <file>
nwcompress --list <volume-or-path>
```
### `nwstreams`
```text
nwstreams --help
nwstreams --list <file>
nwstreams --dump <file> <stream>
nwstreams --extract <file> <stream> <out>
nwstreams --remove <file> <stream>
```
### `nwea`
```text
nwea --help
nwea --list <file>
nwea --dump <file>
nwea --set <file> <name> <value>
nwea --remove <file> <name>
```
## yyjson removal target
`yyjson` is currently part of the tree because `.salvage` JSON sidecars exist.
Once new deletes stop writing `.salvage` JSON, legacy sidecar migration is
complete, and no other required code uses yyjson, remove the `third_party/yyjson`
submodule/build dependency.
Do not remove yyjson in the same patch that changes salvage semantics. First
move the authoritative model to `netware.metadata`, migrate or retire the JSON
reader/writer path, then remove yyjson as a cleanup patch.