making them store-and-forward friendly.
@menu
+* Index files for freqing: FreqIndex.
* Postfix::
+* Exim::
* Web feeds: Feeds.
* Web pages: WARCs.
* BitTorrent and huge files: BitTorrent.
+* Downloading service: DownloadService.
* Git::
* Multimedia streaming: Multimedia.
@end menu
+@node FreqIndex
+@section Index files for freqing
+
+Often you do not know the exact list of files on the remote machine you
+want to freq from, because files there can be updated. It is useful to
+run a cron job on that machine that creates a file listing, which you
+can then freq and search:
+
+@example
+0 4 * * * cd /storage ; tmp=`mktemp` ; \
+ tree -f -h -N --du --timefmt \%Y-\%m-\%d |
+ zstdmt -19 > $tmp && chmod 644 $tmp && mv $tmp TREE.txt.zst ; \
+ tree -J -f --timefmt \%Y-\%m-\%d |
+ zstdmt -19 > $tmp && chmod 644 $tmp && mv $tmp TREE.json.zst
+@end example
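
Once such a listing has been freq-ed, searching it needs nothing more than a decompressor and @command{grep}. The helper below is only a sketch: the function name @code{search_listing} is hypothetical, and it assumes that @command{zstd} is installed whenever the listing name ends with @file{.zst}.

```shell
#!/bin/sh
# search_listing LISTING PATTERN   (hypothetical helper name)
# Print the lines of a listing that match PATTERN, ignoring case.
# A *.zst listing is decompressed on the fly (assumes zstd is installed);
# anything else is read as plain text.
search_listing() {
    listing=$1
    pattern=$2
    case $listing in
        *.zst) zstd -dc -- "$listing" ;;
        *)     cat -- "$listing" ;;
    esac | grep -i -e "$pattern"
}
```

For example, @code{search_listing TREE.txt.zst '\.iso$'} would print the entries whose path ends with @file{.iso}. The full paths written by @code{tree -f} can then be passed to the freq request.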
+
@node Postfix
@section Integration with Postfix
-This section is taken from @url{http://www.postfix.org/nncp_README.html,
+This section is taken from @url{http://www.postfix.org/UUCP_README.html,
Postfix and UUCP} manual and just replaces UUCP-related calls with NNCP
ones.
@itemize
-@item You need an @ref{nncp-mail} program that extracts the sender
+@item You need an @ref{nncp-exec} program that extracts the sender
address from mail that arrives via NNCP, and that feeds the mail into
the Postfix @command{sendmail} command.
@item Define a @command{pipe(8)} based mail delivery transport for
delivery via NNCP:
-@verbatim
+@example
/usr/local/etc/postfix/master.cf:
nncp unix - n n - - pipe
- flags=F user=nncp argv=nncp-mail -quiet $nexthop $recipient
-@end verbatim
+ flags=Rqhu user=nncp argv=nncp-exec -quiet $nexthop sendmail $recipient
+@end example
-This runs the @command{nncp-mail} command to place outgoing mail into
-the NNCP queue after replacing @var{$nexthop} by the the receiving NNCP
+This runs the @command{nncp-exec} command to place outgoing mail into
+the NNCP queue after replacing @var{$nexthop} by the receiving NNCP
node and after replacing @var{$recipient} by the recipients. The
-@command{pipe(8)} delivery agent executes the @command{nncp-mail}
+@command{pipe(8)} delivery agent executes the @command{nncp-exec}
command without assistance from the shell, so there are no problems with
shell meta characters in command-line parameters.
+Pay attention to @code{flags}: it contains @code{R}, which tells Postfix
+to include the @code{Return-Path:} header. Otherwise the envelope sender
+information may be lost. You may also need to preserve that header on
+the receiving side, because the @command{sendmail} command will replace
+it. For example, you can rename it before feeding the message to
+@command{sendmail} with
+@code{reformail -R Return-Path: X-Original-Return-Path: | sendmail}, or
+extract it with:
+
+@verbatiminclude sendmail.sh
+
+Also note that @command{maildrop} does not like the mbox-style
+@code{From_} header, so you may want:
+
+@example
+mailbox_command = reformail -f0 | maildrop -d $@{USER@}
+@end example
+
@item Specify that mail for @emph{example.com}, should be delivered via
NNCP, to a host named @emph{nncp-host}:
-@verbatim
+@example
/usr/local/etc/postfix/transport:
example.com nncp:nncp-host
.example.com nncp:nncp-host
-@end verbatim
+@end example
See the @command{transport(5)} manual page for more details.
@item Enable @file{transport} table lookups:
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
transport_maps = hash:$config_directory/transport
-@end verbatim
+@end example
@item Add @emph{example.com} to the list of domains that your site is
willing to relay mail for.
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
relay_domains = example.com ...other relay domains...
-@end verbatim
+@end example
See the @option{relay_domains} configuration parameter description for
details.
@itemize
-@item You need an @ref{nncp-mail} program that extracts the sender
+@item You need an @ref{nncp-exec} program that extracts the sender
address from mail that arrives via NNCP, and that feeds the mail into
the Postfix @command{sendmail} command.
@item Specify that all remote mail must be sent via the @command{nncp}
mail transport to your NNCP gateway host, say, @emph{nncp-gateway}:
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
relayhost = nncp-gateway
default_transport = nncp
-@end verbatim
+@end example
Postfix 2.0 and later also allows the following more succinct form:
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
default_transport = nncp:nncp-gateway
-@end verbatim
+@end example
@item Define a @command{pipe(8)} based message delivery transport for
mail delivery via NNCP:
-@verbatim
+@example
/usr/local/etc/postfix/master.cf:
nncp unix - n n - - pipe
- flags=F user=nncp argv=nncp-mail -quiet $nexthop $recipient
-@end verbatim
+ flags=Fqhu user=nncp argv=nncp-exec -quiet $nexthop sendmail $recipient
+@end example
-This runs the @command{nncp-mail} command to place outgoing mail into
+This runs the @command{nncp-exec} command to place outgoing mail into
the NNCP queue. It substitutes the hostname (@emph{nncp-gateway}, or
-whatever you specified) and the recipients before executing the command.
-The @command{nncp-mail} command is executed without assistance from the
-shell, so there are no problems with shell meta characters.
+whatever you specified) and the recipients before execution of the
+command. The @command{nncp-exec} command is executed without assistance
+from the shell, so there are no problems with shell meta characters.
@item Execute the command @command{postfix reload} to make the changes
effective.
@end itemize
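
The advice above about preserving @code{Return-Path:} on the receiving side can also be followed without @command{reformail}. The function below is a sketch under stated assumptions, not the contents of @file{sendmail.sh}: the name @code{return_path} is hypothetical, the message is expected on stdin with LF line endings, and folded (multi-line) headers are not handled.

```shell
#!/bin/sh
# return_path   (hypothetical helper name)
# Print the address from the first Return-Path: header of the message on
# stdin. Only the header block is scanned: sed quits at the first empty line,
# so a "Return-Path:" string inside the body is never picked up.
return_path() {
    sed -n '
        /^$/q
        /^[Rr]eturn-[Pp]ath:/{
            s/^[^:]*:[[:space:]]*<\{0,1\}\([^>]*\)>\{0,1\}[[:space:]]*$/\1/p
            q
        }
    '
}
```

A receiving wrapper could save the message to a temporary file, run @code{sender=$(return_path < "$tmp")} and then pass the result to @command{sendmail} with its @option{-f} option.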
+@node Exim
+@section Integration with Exim
+
+This section is an unaltered copy-paste of the
+@url{https://changelog.complete.org/archives/10165-asynchronous-email-exim-over-nncp-or-uucp, Asynchronous Email: Exim over NNCP (or UUCP)}
+article by John Goerzen, with his permission.
+
+@strong{Sending from Exim to a smarthost}
+
+One common use for async email is from a satellite system: one that
+doesn't receive mail, or have local mailboxes, but just needs to get
+email out to the Internet. This is a common situation even for
+conventionally-connected systems; in Exim speak, this is a "satellite
+system that routes mail via a smarthost". That is, every outbound
+message goes to a specific target, which then is responsible for
+eventual delivery (over the Internet, LAN, whatever).
+
+This is fairly simple in Exim.
+
+We actually have two choices for how to do this: bsmtp or rmail mode.
+bsmtp (batch SMTP) is the more modern way, and is essentially a
+derivative of SMTP that explicitly can be queued asynchronously.
+Basically it's a set of SMTP commands that can be saved in a file. The
+alternative is "rmail" (which is just an alias for sendmail these days),
+where the data is piped to rmail/sendmail with the recipients given on
+the command line. Both can work with Exim and NNCP, but because we're
+doing shiny new things, we'll use bsmtp.
+
+These instructions are loosely based on the
+@url{https://people.debian.org/~jdg/bsmtp.html, Using outgoing BSMTP with Exim HOWTO}.
+Some of these may assume Debianness in the configuration, but should be
+easily enough extrapolated to other configs as well.
+
+First, configure Exim to use satellite mode with minimal DNS lookups
+(assuming that you may not have working DNS anyhow).
+
+Then, in the Exim primary router section for smarthost
+(@file{router/200_exim4-config_primary} in Debian split configurations),
+just change @code{transport = remote_smtp_smarthost} to @code{transport = nncp}.
+
+Now, define the NNCP transport. If you are on Debian, you might name this
+@file{transports/40_exim4-config_local_nncp}:
+
+@example
+nncp:
+ debug_print = "T: nncp transport for $local_part@@$domain"
+ driver = pipe
+ user = nncp
+ batch_max = 100
+ use_bsmtp
+ command = /usr/local/nncp/bin/nncp-exec -noprogress -quiet hostname_goes_here rsmtp
+.ifdef REMOTE_SMTP_HEADERS_REWRITE
+ headers_rewrite = REMOTE_SMTP_HEADERS_REWRITE
+.endif
+.ifdef REMOTE_SMTP_RETURN_PATH
+ return_path = REMOTE_SMTP_RETURN_PATH
+.endif
+@end example
+
+This is pretty straightforward. We pipe to @command{nncp-exec}, run it
+as the nncp user. @command{nncp-exec} sends it to a target node and runs
+whatever that node has called @command{rsmtp} (the command to receive
+bsmtp data). When the target node processes the request, it will run the
+configured command and pipe the data in to it.
+
+@strong{More complicated: Routing to various NNCP nodes}
+
+Perhaps you would like to be able to send mail directly to various NNCP
+nodes. There are a lot of ways to do that.
+
+Fundamentally, you will need a setup similar to the UUCP example in
+@url{https://www.exim.org/exim-html-current/doc/html/spec_html/ch-the_manualroute_router.html,
+Exim's manualroute manual}, which lets you define how to reach various
+hosts via UUCP/NNCP. Perhaps you have a star topology (every NNCP node
+exchanges email with a central hub). In the NNCP world, you have two
+choices of how you do this. You could, at the Exim level, make the
+central hub the smarthost for all the side nodes, and let it
+redistribute mail. That would work, but requires decrypting messages at
+the hub to let Exim process. The other alternative is to configure NNCP
+to just send to the destinations via the central hub; that takes
+advantage of onion routing and doesn't require any Exim processing at
+the central hub at all.
+
+@strong{Receiving mail from NNCP}
+
+On the receiving side, first you need to configure NNCP to authorize the
+execution of a mail program. In the section of your receiving host where
+you set the permissions for the client, include something like this:
+
+@example
+exec: @{
+ rsmtp: ["/usr/sbin/sendmail", "-bS"]
+@}
+@end example
+
+The -bS option is what tells Exim to receive BSMTP on @code{stdin}.
+
+Now, you need to tell Exim that nncp is a trusted user (able to set From
+headers arbitrarily). Assuming you are running NNCP as the nncp user,
+then add @code{MAIN_TRUSTED_USERS = nncp} to a file such as
+@file{/etc/exim4/conf.d/main/01_exim4-config_local-nncp}. That's it!
+
+Some hosts, of course, both send and receive mail via NNCP and will need
+configurations for both.
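
For orientation, a BSMTP batch as described above is just the SMTP envelope commands and the message saved one after another. Exim produces it itself when @code{use_bsmtp} is set, so the function below is purely illustrative. The name @code{bsmtp_batch} is hypothetical, the message on stdin is assumed to end with a newline, and real batches may carry additional commands such as @code{HELO}.

```shell
#!/bin/sh
# bsmtp_batch SENDER RECIPIENT   (hypothetical helper name)
# Wrap the message read from stdin into a one-message batch on stdout:
# envelope commands, DATA, the message itself, and the terminating dot.
bsmtp_batch() {
    printf 'MAIL FROM:<%s>\n' "$1"
    printf 'RCPT TO:<%s>\n' "$2"
    printf 'DATA\n'
    # Dot-stuffing: a message line starting with "." gets one more ".",
    # so it cannot be mistaken for the end-of-data marker below.
    sed 's/^\./../'
    printf '.\n'
}
```

Piping a message through @code{bsmtp_batch sender recipient} and reading the result makes it easy to see what @command{nncp-exec} carries to the @command{rsmtp} handle.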
+
@node Feeds
@section Integration with Web feeds
RSS and Atom feeds could be collected using
-@url{https://github.com/wking/rss2email, rss2email} program. It
-converts all incoming feed entries to email messages. Read about how to
-integration @ref{Postfix} with email. @command{rss2email} could be run
-in a cron, to collect feeds without any user interaction. Also this
-program supports ETags and won't pollute the channel if remote server
-supports them too.
+@url{https://github.com/wking/rss2email, rss2email} program. It converts
+all incoming feed entries to email messages. Read about how to integrate
+@ref{Postfix}/@ref{Exim} with email. @command{rss2email} could be run in
+a cron, to collect feeds without any user interaction. Also this program
+supports ETags and won't pollute the channel if remote server supports
+them too.
After installing @command{rss2email}, create configuration file:
-@verbatim
-% r2e new rss-robot@address.com
-@end verbatim
+
+@example
+$ r2e new rss-robot@@address.com
+@end example
+
and add feeds you want to retrieve:
-@verbatim
-% r2e add https://git.cypherpunks.ru/cgit.cgi/nncp.git/atom/?h=master
-@end verbatim
+
+@example
+$ r2e add http://www.git.cypherpunks.ru/?p=nncp.git;a=atom
+@end example
+
and run the process:
-@verbatim
-% r2e run
-@end verbatim
+
+@example
+$ r2e run
+@end example
@node WARCs
@section Integration with Web pages
Simple HTML web page can be downloaded very easily for sending and
viewing it offline after:
-@verbatim
-% wget http://www.example.com/page.html
-@end verbatim
+
+@example
+$ wget http://www.example.com/page.html
+@end example
But most web pages contain links to images, CSS and JavaScript files,
required for complete rendering.
@url{https://www.gnu.org/software/wget/, GNU Wget} can parse such
documents and understand page dependencies. You can download
the whole page with dependencies the following way:
-@verbatim
-% wget \
+
+@example
+$ wget \
--page-requisites \
--convert-links \
--adjust-extension \
--random-wait \
--execute robots=off \
http://www.example.com/page.html
-@end verbatim
+@end example
+
that will create @file{www.example.com} directory with all files
necessary to view @file{page.html} web page. You can create single file
compressed tarball with that directory and send it to remote node:
-@verbatim
-% tar cf - www.example.com | xz -9 |
- nncp-file - remote.node:www.example.com-page.tar.xz
-@end verbatim
+
+@example
+$ tar cf - www.example.com | zstd |
+ nncp-file - remote.node:www.example.com-page.tar.zst
+@end example
But there are multi-paged articles, there are the whole interesting
sites you want to get in a single package. You can mirror the whole web
site by utilizing @command{wget}'s recursive feature:
-@verbatim
-% wget \
+
+@example
+$ wget \
--recursive \
--timestamping \
-l inf \
--no-parent \
[...]
http://www.example.com/
-@end verbatim
+@end example
There is a standard for creating
@url{https://en.wikipedia.org/wiki/Web_ARChive, Web ARChives}:
@strong{WARC}. Fortunately again, @command{wget} supports it as an
output format.
-@verbatim
-% wget \
+
+@example
+$ wget \
--warc-file www.example_com-$(date '+%Y%m%d%H%M%S') \
--no-warc-compression \
--no-warc-keep-log \
[...]
http://www.example.com/
-@end verbatim
+@end example
+
That command will create uncompressed @file{www.example_com-XXX.warc}
web archive. By default, WARCs are compressed using
@url{https://en.wikipedia.org/wiki/Gzip, gzip}, but, in example above,
-we have disabled it to compress with stronger @command{xz}, before
-sending via @command{nncp-file}.
+we have disabled it to compress with stronger and faster
+@url{https://en.wikipedia.org/wiki/Zstd, zstd}, before sending via
+@command{nncp-file}.
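
Date-stamped archive names sort chronologically only if the fields go from year down to second. A minimal sketch of such a stamp, with the hypothetical function name @code{stamp}:

```shell
#!/bin/sh
# stamp   (hypothetical helper name)
# Print the current local time as YYYYmmddHHMMSS.
# In date(1) formats %m is the month and %M is the minute; swapping them
# still yields 14 digits, but the resulting names no longer sort by date.
stamp() {
    date '+%Y%m%d%H%M%S'
}
```

It can be used as @code{--warc-file www.example_com-$(stamp)}.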
There is plenty of software that acts as an HTTP proxy for your browser,
allowing you to view those WARC files. However you can extract files from
that archive using @url{https://pypi.python.org/pypi/Warcat, warcat}
utility, producing usual directory hierarchy:
-@verbatim
-% python3 -m warcat \
- extract www.example_com-XXX.warc \
+
+@example
+$ python3 -m warcat extract \
+ www.example_com-XXX.warc \
--output-dir www.example.com-XXX \
--progress
-@end verbatim
+@end example
@node BitTorrent
@section BitTorrent and huge files
accelerate HTTP*/*FTP downloads by segmented multiple parallel
connections.
-You can queue you files after they are completely downloaded:
-@verbatim
-% cat send-downloaded.sh
-#!/bin/sh
-nncp-file -chunked $(( 1024 * 100 )) "$3" remote.node
-
-% aria2c \
- --on-download-complete send-downloaded.sh \
- http://example.org/file.iso \
- http://example.org/file.iso.asc
-@end verbatim
+You can queue your files after they are completely downloaded.
+@file{aria2-downloaded.sh} contents:
+
+@verbatiminclude aria2-downloaded.sh
Also you can prepare
@url{http://aria2.github.io/manual/en/html/aria2c.html#files, input file}
with the jobs you want to download:
-@verbatim
-% cat jobs
+
+@example
+$ cat jobs
http://www.nncpgo.org/download/nncp-0.11.tar.xz
out=nncp.txz
http://www.nncpgo.org/download/nncp-0.11.tar.xz.sig
out=nncp.txz.sig
-% aria2c \
- --on-download-complete send-downloaded.sh \
+$ aria2c \
+ --on-download-complete aria2-downloaded.sh \
--input-file jobs
-@end verbatim
+@end example
+
and all that downloaded (@file{nncp.txz}, @file{nncp.txz.sig}) files
will be sent to @file{remote.node} when finished.
+@node DownloadService
+@section Downloading service
+
+Previous sections describe manual downloading and sending of the results
+to a remote node. But you may wish to initiate downloading remotely. That
+can be easily solved with @ref{CfgExec, exec} handles.
+
+@verbatim
+exec: {
+ warcer: ["/bin/sh", "/path/to/warcer.sh"]
+ wgeter: ["/bin/sh", "/path/to/wgeter.sh"]
+ aria2c: [
+ "/usr/local/bin/aria2c",
+ "--on-download-complete", "aria2-downloaded.sh",
+ "--on-bt-download-complete", "aria2-downloaded.sh"
+ ]
+}
+@end verbatim
+
+@file{warcer.sh} contents:
+
+@verbatiminclude warcer.sh
+
+@file{wgeter.sh} contents:
+
+@verbatiminclude wgeter.sh
+
+Now you can ask that node to send you some website's page, a file or
+BitTorrents:
+
+@example
+$ echo http://www.nncpgo.org/Postfix.html |
+ nncp-exec remote.node warcer postfix-whole-page
+$ echo http://www.nncpgo.org/Postfix.html |
+ nncp-exec remote.node wgeter postfix-html-page
+$ echo \
+ http://www.nncpgo.org/download/nncp-0.11.tar.xz \
+ http://www.nncpgo.org/download/nncp-0.11.tar.xz.sig |
+ nncp-exec remote.node aria2c
+@end example
+
@node Git
@section Integration with Git
everything you need.
Use it to create bundles containing all required blobs/trees/commits and tags:
-@verbatim
-% git bundle create repo-initial.bundle master --tags --branches
-% git tag -f last-bundle
-% nncp-file repo-initial.bundle remote.node:repo-$(date % '+%Y%M%d%H%m%S').bundle
-@end verbatim
+
+@example
+$ git bundle create repo-initial.bundle master --tags --branches
+$ git tag -f last-bundle
+$ nncp-file repo-initial.bundle remote.node:repo-$(date '+%Y%m%d%H%M%S').bundle
+@end example
Do usual working with the Git: commit, add, branch, checkout, etc. When
you decide to queue your changes for sending, create diff-ed bundle and
transfer them:
-@verbatim
-% git bundle create repo-$(date '+%Y%M%d%H%m%S').bundle last-bundle..master
+
+@example
+$ git bundle create repo-$(date '+%Y%m%d%H%M%S').bundle last-bundle..master
or maybe
-% git bundle create repo-$(date '+%Y%M%d').bundle --since=10.days master
-@end verbatim
+$ git bundle create repo-$(date '+%Y%m%d').bundle --since=10.days master
+@end example
Received bundle on remote machine acts like usual remote:
-@verbatim
-% git clone -b master repo-XXX.bundle
-@end verbatim
+
+@example
+$ git clone -b master repo-XXX.bundle
+@end example
+
overwrite @file{repo.bundle} file with newer bundles you retrieve and
fetch all required branches and commits:
-@verbatim
-% git pull # assuming that origin remote points to repo.bundle
-% git fetch repo.bundle master:localRef
-% git ls-remote repo.bundle
-@end verbatim
+
+@example
+$ git pull # assuming that origin remote points to repo.bundle
+$ git fetch repo.bundle master:localRef
+$ git ls-remote repo.bundle
+@end example
Bundles are also useful when cloning huge repositories (like Linux has).
Git's native protocol does not support any kind of interrupted download
bundle, you can add an ordinary @file{git://} remote and fetch the
difference.
+Also you can find the following exec-handler useful:
+
+@verbatiminclude git-bundler.sh
+
+This allows you to request bundles like this:
+@code{echo some-old-commit..master | nncp-exec REMOTE bundler REPONAME}.
+
@node Multimedia
@section Integration with multimedia streaming
and @emph{YouTube}.
When your multimedia becomes an ordinary file, you can transfer it easily.
-@verbatim
-% youtube-dl \
- --exec 'nncp-file {} remote.node:' \
+
+@example
+$ youtube-dl \
+ --exec 'nncp-file @{@} remote.node:' \
'https://www.youtube.com/watch?list=PLd2Cw8x5CytxPAEBwzilrhQUHt_UN10FJ'
-@end verbatim
+@end example