making them store-and-forward friendly.
@menu
+* Index files for freqing: FreqIndex.
* Postfix::
* Web feeds: Feeds.
* Web pages: WARCs.
* Multimedia streaming: Multimedia.
@end menu
+@node FreqIndex
+@section Index files for freqing
+
+In many cases you do not know the exact list of files on the remote
+machine you want to freq from, because files there can be updated. It is
+useful to run a cron job on that machine to create a file listing that
+you can freq and then search for files in:
+
+@example
+0 4 * * * cd /storage ; tmp=`mktemp` ; \
+ tree -f -h -N --du --timefmt \%Y-\%m-\%d |
+ zstdmt -19 > $tmp && chmod 644 $tmp && mv $tmp TREE.txt.zst ; \
+ tree -J -f --timefmt \%Y-\%m-\%d |
+ zstdmt -19 > $tmp && chmod 644 $tmp && mv $tmp TREE.json.zst
+@end example
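Once the listing has been freq-ed, it can be searched without unpacking
it to disk; for example (the @samp{backup} pattern is just an
illustration):

```shell
# Decompress the listing on the fly and search it for a name;
# the file and pattern here are illustrative.
zstdcat TREE.txt.zst | grep -i backup
```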
+
@node Postfix
@section Integration with Postfix
@item Define a @command{pipe(8)} based mail delivery transport for
delivery via NNCP:
-@verbatim
+@example
/usr/local/etc/postfix/master.cf:
nncp unix - n n - - pipe
flags=F user=nncp argv=nncp-exec -quiet $nexthop sendmail $recipient
-@end verbatim
+@end example
This runs the @command{nncp-exec} command to place outgoing mail into
the NNCP queue after replacing @var{$nexthop} by the receiving NNCP
@item Specify that mail for @emph{example.com} should be delivered via
NNCP, to a host named @emph{nncp-host}:
-@verbatim
+@example
/usr/local/etc/postfix/transport:
example.com nncp:nncp-host
.example.com nncp:nncp-host
-@end verbatim
+@end example
See the @command{transport(5)} manual page for more details.
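Note that @code{hash:} lookup tables must be rebuilt after editing, so
regenerate the map (the path is the one from the example above):

```shell
# Rebuild the hashed transport map after editing it
postmap /usr/local/etc/postfix/transport
```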
@item Enable @file{transport} table lookups:
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
transport_maps = hash:$config_directory/transport
-@end verbatim
+@end example
@item Add @emph{example.com} to the list of domains that your site is
willing to relay mail for.
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
relay_domains = example.com ...other relay domains...
-@end verbatim
+@end example
See the @option{relay_domains} configuration parameter description for
details.
@item Specify that all remote mail must be sent via the @command{nncp}
mail transport to your NNCP gateway host, say, @emph{nncp-gateway}:
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
relayhost = nncp-gateway
default_transport = nncp
-@end verbatim
+@end example
Postfix 2.0 and later also allow the following more succinct form:
-@verbatim
+@example
/usr/local/etc/postfix/main.cf:
default_transport = nncp:nncp-gateway
-@end verbatim
+@end example
@item Define a @command{pipe(8)} based message delivery transport for
mail delivery via NNCP:
-@verbatim
+@example
/usr/local/etc/postfix/master.cf:
nncp unix - n n - - pipe
flags=F user=nncp argv=nncp-exec -quiet $nexthop sendmail $recipient
-@end verbatim
+@end example
This runs the @command{nncp-exec} command to place outgoing mail into
the NNCP queue. It substitutes the hostname (@emph{nncp-gateway}, or
supports them too.
After installing @command{rss2email}, create a configuration file:
-@verbatim
-% r2e new rss-robot@address.com
-@end verbatim
+
+@example
+$ r2e new rss-robot@@address.com
+@end example
+
and add the feeds you want to retrieve:
-@verbatim
-% r2e add https://git.cypherpunks.ru/cgit.cgi/nncp.git/atom/?h=master
-@end verbatim
+
+@example
+$ r2e add https://git.cypherpunks.ru/cgit.cgi/nncp.git/atom/?h=master
+@end example
+
and run the process:
-@verbatim
-% r2e run
-@end verbatim
+
+@example
+$ r2e run
+@end example
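To fetch and send new feed entries unattended, @command{r2e run} can
itself be run from cron; a possible crontab entry (the half-hour
interval is only an assumption):

```shell
*/30 * * * * r2e run
```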
@node WARCs
@section Integration with Web pages
A simple HTML web page can be downloaded very easily for sending and
viewing offline later:
-@verbatim
-% wget http://www.example.com/page.html
-@end verbatim
+
+@example
+$ wget http://www.example.com/page.html
+@end example
But most web pages contain links to images, CSS and JavaScript files
required for complete rendering.
@url{https://www.gnu.org/software/wget/, GNU Wget} supports parsing
those documents and understanding page dependencies. You can download
the whole page with its dependencies the following way:
-@verbatim
-% wget \
+
+@example
+$ wget \
--page-requisites \
--convert-links \
--adjust-extension \
--random-wait \
--execute robots=off \
http://www.example.com/page.html
-@end verbatim
+@end example
+
+that will create a @file{www.example.com} directory with all the files
+necessary to view the @file{page.html} web page. You can create a
+single-file compressed tarball of that directory and send it to the
+remote node:
-@verbatim
-% tar cf - www.example.com | xz -9 |
- nncp-file - remote.node:www.example.com-page.tar.xz
-@end verbatim
+
+@example
+$ tar cf - www.example.com | zstd |
+ nncp-file - remote.node:www.example.com-page.tar.zst
+@end example
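On the remote node, the received tarball unpacks back into the original
directory (the file name is the one given to @command{nncp-file} above):

```shell
# Decompress and unpack the received page archive
zstdcat www.example.com-page.tar.zst | tar xf -
```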
But there are multi-paged articles, and there are whole interesting
sites that you want to get in a single package. You can mirror a whole
web site by utilizing @command{wget}'s recursive feature:
-@verbatim
-% wget \
+
+@example
+$ wget \
--recursive \
--timestamping \
-l inf \
--no-parent \
[...]
http://www.example.com/
-@end verbatim
+@end example
There is a standard for creating
@url{https://en.wikipedia.org/wiki/Web_ARChive, Web ARChives}:
@strong{WARC}. Fortunately again, @command{wget} supports it as an
output format.
-@verbatim
-% wget \
+
+@example
+$ wget \
--warc-file www.example_com-$(date '+%Y%m%d%H%M%S') \
--no-warc-compression \
--no-warc-keep-log \
[...]
http://www.example.com/
-@end verbatim
+@end example
+
+That command will create an uncompressed @file{www.example_com-XXX.warc}
web archive. By default, WARCs are compressed using
@url{https://en.wikipedia.org/wiki/Gzip, gzip}, but in the example above
-we have disabled it to compress with stronger @command{xz}, before
-sending via @command{nncp-file}.
+we have disabled it so we can compress with the stronger and faster
+@url{https://en.wikipedia.org/wiki/Zstd, zstd} before sending via
+@command{nncp-file}.
There is plenty of software acting as an HTTP proxy for your browser,
allowing you to view those WARC files. You can also extract files from
such an archive using the @url{https://pypi.python.org/pypi/Warcat, warcat}
utility, producing the usual directory hierarchy:
-@verbatim
-% python3 -m warcat extract \
+
+@example
+$ python3 -m warcat extract \
www.example_com-XXX.warc \
--output-dir www.example.com-XXX \
--progress
-@end verbatim
+@end example
@node BitTorrent
@section BitTorrent and huge files
You can queue your files after they are completely downloaded.
@file{aria2-downloaded.sh} contents:
+
@verbatiminclude aria2-downloaded.sh
Also you can prepare
@url{http://aria2.github.io/manual/en/html/aria2c.html#files, input file}
with the jobs you want to download:
-@verbatim
-% cat jobs
+
+@example
+$ cat jobs
http://www.nncpgo.org/download/nncp-0.11.tar.xz
out=nncp.txz
http://www.nncpgo.org/download/nncp-0.11.tar.xz.sig
out=nncp.txz.sig
-% aria2c \
+$ aria2c \
--on-download-complete aria2-downloaded.sh \
--input-file jobs
-@end verbatim
+@end example
+
+and all the downloaded files (@file{nncp.txz}, @file{nncp.txz.sig})
will be sent to @file{remote.node} when finished.
easily solved with @ref{CfgExec, exec} handles.
@verbatim
-exec:
+exec: {
warcer: ["/bin/sh", "/path/to/warcer.sh"]
wgeter: ["/bin/sh", "/path/to/wgeter.sh"]
aria2c: [
"--on-download-complete", "aria2-downloaded.sh",
"--on-bt-download-complete", "aria2-downloaded.sh"
]
+}
@end verbatim
@file{warcer.sh} contents:
+
@verbatiminclude warcer.sh
@file{wgeter.sh} contents:
+
@verbatiminclude wgeter.sh
Now you can ask that node to send you some website's page, a file, or
BitTorrent downloads:
-@verbatim
-% echo http://www.nncpgo.org/Postfix.html |
+@example
+$ echo http://www.nncpgo.org/Postfix.html |
nncp-exec remote.node warcer postfix-whole-page
-% echo http://www.nncpgo.org/Postfix.html |
+$ echo http://www.nncpgo.org/Postfix.html |
nncp-exec remote.node wgeter postfix-html-page
-% echo \
+$ echo \
http://www.nncpgo.org/download/nncp-0.11.tar.xz \
http://www.nncpgo.org/download/nncp-0.11.tar.xz.sig |
nncp-exec remote.node aria2c
-@end verbatim
+@end example
@node Git
@section Integration with Git
everything you need.
Use it to create bundles containing all required blobs/trees/commits and tags:
-@verbatim
-% git bundle create repo-initial.bundle master --tags --branches
-% git tag -f last-bundle
-% nncp-file repo-initial.bundle remote.node:repo-$(date % '+%Y%M%d%H%m%S').bundle
-@end verbatim
+
+@example
+$ git bundle create repo-initial.bundle master --tags --branches
+$ git tag -f last-bundle
+$ nncp-file repo-initial.bundle remote.node:repo-$(date '+%Y%m%d%H%M%S').bundle
+@end example
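Before cloning from a received bundle, its integrity and prerequisites
can be checked with @command{git bundle verify}:

```shell
# Check that the bundle is valid and that the current repository
# has all the commits it requires
git bundle verify repo-initial.bundle
```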
Work with Git as usual: commit, add, branch, checkout, etc. When you
decide to queue your changes for sending, create a diff-ed bundle and
transfer it:
-@verbatim
-% git bundle create repo-$(date '+%Y%M%d%H%m%S').bundle last-bundle..master
+
+@example
+$ git bundle create repo-$(date '+%Y%m%d%H%M%S').bundle last-bundle..master
or maybe
-% git bundle create repo-$(date '+%Y%M%d').bundle --since=10.days master
-@end verbatim
+$ git bundle create repo-$(date '+%Y%m%d').bundle --since=10.days master
+@end example
A received bundle on the remote machine acts like an ordinary remote:
-@verbatim
-% git clone -b master repo-XXX.bundle
-@end verbatim
+
+@example
+$ git clone -b master repo-XXX.bundle
+@end example
+
overwrite @file{repo.bundle} file with newer bundles you retrieve and
fetch all required branches and commits:
-@verbatim
-% git pull # assuming that origin remote points to repo.bundle
-% git fetch repo.bundle master:localRef
-% git ls-remote repo.bundle
-@end verbatim
+
+@example
+$ git pull # assuming that origin remote points to repo.bundle
+$ git fetch repo.bundle master:localRef
+$ git ls-remote repo.bundle
+@end example
Bundles are also useful when cloning huge repositories (like the Linux kernel's).
Git's native protocol does not support any kind of interrupted download
bundle, you can add an ordinary @file{git://} remote and fetch the
difference.
+You may also find the following exec-handler useful:
+
+@verbatiminclude git-bundler.sh
+
+It allows you to request bundles like this:
+@code{echo some-old-commit..master | nncp-exec REMOTE bundler REPONAME}.
+
@node Multimedia
@section Integration with multimedia streaming
and @emph{YouTube}.
When your multimedia becomes an ordinary file, you can transfer it easily.
-@verbatim
-% youtube-dl \
- --exec 'nncp-file {} remote.node:' \
+
+@example
+$ youtube-dl \
+ --exec 'nncp-file @{@} remote.node:' \
'https://www.youtube.com/watch?list=PLd2Cw8x5CytxPAEBwzilrhQUHt_UN10FJ'
-@end verbatim
+@end example