Since reading the first two references, I've wanted to become a better writer. They encourage a writer to track how much they write, and then to make each word count.
I've thought about counting my written words for at least two years, so I thought I'd start by doing just that. Maybe I've even written in my diary about this.
So first, here are some rules:
The first task before me is a Do It Yourself word count tally.
Highlighting Dorian's (How to write ..) points:
This last one got my attention. He bought a writing tool, Scrivener, un-referenced here since (a) it's Windows-only, and (b) it doesn't suit my strong Emacs preference. Which leads back to the rules: I'll find it easier to use Org Mode on my computer as the organized way to collect the daily word count:
$ wc somefile | diff - yesterday's version
is a high-level way to tally my word count.
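That pipeline is only a sketch; a minimal, self-contained version of the idea (file names invented for the example) just compares today's word count to a saved copy of yesterday's version:

```shell
# Hypothetical sketch: subtract yesterday's word count from today's.
# "somefile.txt" and its saved copy are made up for the demonstration.
printf 'one two three\n' > somefile.yesterday
printf 'one two three four five\n' > somefile.txt

today=$(wc -w < somefile.txt)
prior=$(wc -w < somefile.yesterday)
echo "words written today: $((today - prior))"   # → words written today: 2
```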
So, now, how to?
First, I have a tool, recorg, that lists all my Org files and then reports the long listing of each file's latest change. It then formats the data into a master file ../../recorg.org.
So, how's this sound? The master list is here: ../../recorg.out, where the first few lines suggest what I might do:
-rw-r--r--@ 1 applemcg staff 0 Nov 2 23:44 recorg.org
-rw-r--r--@ 1 applemcg staff 14252 Nov 2 18:09 Family/invest/sandp.org
-rw-r--r-- 1 applemcg staff 533 Oct 31 18:03 ../talk/.steps.org
-rw-r--r--@ 1 applemcg staff 30959 Oct 27 17:27 commonplace/software/swdiary-2016.org
-rw-r--r--@ 1 applemcg staff 138854 Oct 23 07:07 commonplace/software/swdiary.org
Recorg.org itself is the newest file in the list. Any file newer than that one has been changed since it was written. Here's a sketch of an algorithm, a recipe to collect the daily word count:
An important feature: this must not be required to run at any specific time. I hope the tool will encourage a routine of daily usage, but it won't require one in order to report the Words Per Day. A daily average will be sufficient for starters.
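The "newer than the master file" test described above maps directly onto find's -newer primary; a small stand-alone illustration, with invented file names and timestamps:

```shell
# Sketch: list files modified after the master list was last written.
cd "$(mktemp -d)"
touch -t 202301010000 old.org      # unchanged since the master
touch -t 202401010000 master.org   # the reference file (recorg.org's role)
touch -t 202501010000 new.org      # changed since the master

find . -name '*.org' -newer master.org   # → ./new.org
```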
So, first inspect the recorg function:
$ whfn recorg; fbdy recorg
A quick inspection of the tool suggests a simple solution. I have an existing tool, /RDB, which can really simplify the process. Some steps.
rdput, a tool I've used for twenty years, records the times a record was inserted into and deleted from an RDB table. So, any file whose record is among the most recent updates needs a report: what are the differences since the most recent update?
This reduces the bookkeeping to subtracting the previous record's word count from the current one. It's a feature of rdput to leave unchanged records unaltered: in the case of wc, a file's record updates only when its number of lines, words, or characters changes.
In total, four fields.
The history adds fields {insert,delete}_time.
While the wc output places the data first, the filename last, an RDB table report probably wants to put the history first.
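wc's data-first order is easy to relabel on the way into a table. A minimal sketch of that reshaping (not the real recorgt, which also carries the rdput history fields):

```shell
# Sketch: reshape wc output into tab-separated fields, one record per file.
printf 'hello world\n' > a.txt
wc a.txt | awk '{ printf "%s\t%s\t%s\t%s\n", $1, $2, $3, $4 }'
# → 1	2	12	a.txt   (lines, words, chars, filename)
```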
The general concept looks good. My expectation that wc's "total" and its change would be sufficient was too optimistic. A few filters are needed:
The first step recorg's the Org files in a "long listing" (ls -l):
recorg ()
{
: date 2016-10-02;
: date 2016-10-23;
: date 2016-10-31;
: date 2016-11-03;
: date 2016-12-02;
: exclude files, recorgy, with identical basename, wc results;
: date 2016-12-04;
function forg ()
{
: date 2016-10-23;
find ${*-.} -name '*.org'
};
function recorgx ()
{
: date 2016-10-31;
forg ../talk;
forg $(ls | grep -v ' ') ../{doc,stonebridge,git} | nvn
};
function recorgy ()
{
awk ' { b=$NF; gsub(/.*\//,"",b); }
!printed[$1,$2,$3,b]++
'
};
function recorgt ()
{
rdb_hdr lines words chars filename;
cat ${*:--} | field NF | xargs wc | sed 's/^ *//' | sps2tabs
};
pushd ~/Dropbox;
set recorg.{out,org,${1:-rdb},cut};
: "cut" or grep -v file needs an entry;
[[ -f $4 ]] || echo $2 > $4;
recorgx | xargs ls -lt | tee $1 | awk -v year=$(date +%Y) -f $(awk_file) > $2;
grep -v -f $4 $1 | recorgt | recorgy > $3;
rdput $3;
popd;
unset recorg{t,x,y} forg
}
Here are the reporting tools to "gamify" the writing, with a summary through the first days:
daily_report ()
{
: date 2016-12-02;
report_notpipe && return 1;
rdb_hdr day words file;
: dawn of time for word-counting;
row 'time > 161202114600' | tail +3 | awk -f $(awk_file)
}
daily_totalwords ()
{
: date 2016-12-03;
latest_report | daily_report | row 'file ~ /total/' | ncolumn file | quietly rd sort -r
}
daily_mvag ()
{
daily_totalwords | rd sort | addcol mvag | compute '
d = 7;
n = 1/d;
o = 1-n;
mvag = o*ovag + ((ovag)?n:1)*words;
ovag = mvag
'
}
daily_mvag is at the top of the heap. So, the routine task is:
$ recorg # updates "recorg.rdb" and its history
$ daily_mvag # produces a report which looks like:
day words mvag
--- ----- ------
161202 819 819
161203 812 818
161204 362 752.857
161205 2458 996.449
161206 -1198 682.956
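The mvag column is a seven-day exponential moving average, seeded with the first day's count. A stand-alone awk sketch of the same compute block reproduces the first three rows above:

```shell
# Sketch: the moving average from daily_mvag, isolated in plain awk.
# ovag starts at 0, so the first record seeds the average directly.
printf '161202\t819\n161203\t812\n161204\t362\n' |
awk -F'\t' '{
    d = 7; n = 1/d; o = 1 - n
    mvag = o*ovag + ((ovag) ? n : 1) * $2
    ovag = mvag
    printf "%s\t%s\t%.3f\n", $1, $2, mvag
}'
```

which prints 819.000, 818.000, and 752.857, matching the table.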
The big hit today (-1198) is because in yesterday's mail I had mistakenly cut and pasted, duplicating the whole file. Note the "dawn of time" in daily_report. That marks the time when I first collected a record of the RDB data table.
When a file gets deleted, (not merely changed), it would be nice to remove it from the active history.
Sort by delete time and file name so that all deletes appear before the undeleted. When a file appears with no delete time, clear its deleted flag. Files with uncleared deleted flags are absolutely deleted; report them.
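That flag-clearing pass can be sketched in a few lines of awk. The field layout here is invented for the example (delete time first, "-" for none), and the input is pre-sorted as described:

```shell
# Sketch: find files whose latest record is a delete.
printf '161205\tgone.org\n161206\tback.org\n-\tback.org\n-\tkept.org\n' |
awk -F'\t' '
    $1 != "-" { deleted[$2] = 1 }   # a delete time sets the flag
    $1 == "-" { deleted[$2] = 0 }   # a live record clears it
    END { for (f in deleted) if (deleted[f]) print f }   # → gone.org
'
```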
For the moment, my fix is to count the changes to files rather than the totals for each day. I'd been taking the difference between yesterday's and today's total field. I'm not sure how a deleted file fares under this treatment. Here are the functions supporting daily_mvag:
daily_totalwords ()
{
: date 2016-12-03;
: date 2016-12-09;
rdb_hdr day words;
latest_report | daily_report | tail +3 | awk -f $(awk_file)
}
daily_mvag ()
{
: date 2016-12-04;
: add results to journaled RDB table;
: date 2016-12-06;
: date 2016-12-09;
daily_totalwords | rd sort | addcol mvag | compute '
d = 7;
n = 1/d;
o = 1-n;
mvag = o*ovag + ((ovag)?n:1)*words;
ovag = mvag
' | tee ~/Dropbox/mvag.rdb;
rdput ~/Dropbox/mvag.rdb
}
latest_report ()
{
: date 2016-12-02;
: date 2016-12-05;
: date 2016-12-09;
rdb_hdr time words file;
prepare_report | tail +3 | awk -f $(awk_file) 2> report.err
}
daily_report ()
{
: date 2016-12-02;
: date 2016-12-09;
report_notpipe && return 1;
rdb_hdr day words file;
: dawn of time for word-counting;
row 'time > 161202114600' | row 'file !~ /total/' | tail +3 | awk -f $(awk_file)
}
awk_file ()
{
: date 2016-09-30;
: date 2016-11-25;
trace_call $*;
local awk=${1:-$(myname 2)}.awk;
for lib in $(lib_paths) {.,..}/lib;
do
[[ -f $lib/$awk ]] && {
echo $lib/$awk;
return 0
};
done;
return 1
}
A recurring theme in these is this fragment:
... | tail +3 | awk -f $(awk_file)
from which you may be able to tell that awk_file finds a file with the .awk suffix whose name matches either the calling function or the first argument. For example, the latest_report function uses an awk_file found on the lib_paths, which is just the list of directories on the user's PATH after replacing a trailing /bin with /lib. A function aff is the upward-compatible version of ff.
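lib_paths itself isn't shown here; a minimal sketch under that description (the real function presumably reads PATH directly; the argument is added just to make the example self-contained):

```shell
# Sketch: map each /bin directory on a PATH-like list to its sibling /lib.
lib_paths ()
{
    echo "${1:-$PATH}" | tr ':' '\n' | sed 's|/bin$|/lib|'
}

lib_paths /usr/local/bin:/opt/tools/bin
# → /usr/local/lib
#   /opt/tools/lib
```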
aff ()
{
: date 2016-12-09;
function _aff ()
{
set $1 $(awk_file $1);
ff $1;
case $# in
2)
cat $2
;;
esac
};
foreach _aff $*;
unset _aff
}
which adds the awk_file to the output for any function using one. This solves a problem I'd noted some time ago: at what size is an awk script better left in a separate file, rather than kept as part of the function?
recorg_report ()
{
: date 2016-12-09;
: do NOT leave the history file laying loose
zcat ~/Dropbox/h.recorg.rdb.Z
}
mvag_report ()
{
${*:-echo} recorg daily_{totalwords,mvag} {latest,daily,prepare,recorg,mvag}_report
}
And, to put a wrap on things, recorg_report corrects a mistake I made: don't leave both a compressed file and its uncompressed version lying around. It's more trouble than it's worth to defend against overwriting one or the other at a useful time.
When the function collection approaches an application, then I find it useful to put the collection names into a self-referencing function. In the case of mvag_report, a trick I like to use is the ${*:-echo} idiom. Its default behavior is to echo, or name, the functions. With arguments, use them instead, so the fun alternative is:
$ mvag_report ff
produces the function bodies.
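The idiom stands alone nicely; a small sketch with invented names:

```shell
# Sketch of the ${*:-echo} idiom: no arguments names the items,
# an argument is used as the command to apply to them.
my_report ()
{
    ${*:-echo} alpha beta gamma
}

show () { printf '%s\n' "$@"; }

my_report          # → alpha beta gamma
my_report show     # one item per line
```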
Solution: for today's date in my diary, add a "words" DRAWER to hold the word count for the day. Roll it up with a separate collection from the recorg gathering, but before the daily total and moving average are calculated.
This shows the calling tree. The major functions, in order, are: