/[LeafOK_CVS]/pvpgn-1.7.4/scripts/cvs2cl.pl
ViewVC logotype

Annotation of /pvpgn-1.7.4/scripts/cvs2cl.pl

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.1 - (hide annotations)
Tue Jun 6 03:41:37 2006 UTC (19 years, 9 months ago) by sysadm
CVS Tags: pvpgn_1-7-4-0_MIL
Branch point for: GNU, MAIN
Content type: text/x-perl
Initial revision

1 sysadm 1.1 #!/bin/sh
2     exec perl -w -x $0 ${1+"$@"} # -*- mode: perl; perl-indent-level: 2; -*-
3     #!perl -w
4    
5     ##############################################################
6     ### ###
7     ### cvs2cl.pl: produce ChangeLog(s) from `cvs log` output. ###
8     ### ###
9     ##############################################################
10    
11     ##
12     ## (C) 1999 Karl Fogel <kfogel@red-bean.com>, under the GNU GPL.
13     ##
14     ## (Extensively hacked on by Melissa O'Neill <oneill@cs.sfu.ca>.)
15     ##
16     ## cvs2cl.pl is free software; you can redistribute it and/or modify
17     ## it under the terms of the GNU General Public License as published by
18     ## the Free Software Foundation; either version 2, or (at your option)
19     ## any later version.
20     ##
21     ## cvs2cl.pl is distributed in the hope that it will be useful,
22     ## but WITHOUT ANY WARRANTY; without even the implied warranty of
23     ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
24     ## GNU General Public License for more details.
25     ##
26     ## You may have received a copy of the GNU General Public License
27     ## along with cvs2cl.pl; see the file COPYING. If not, write to the
28     ## Free Software Foundation, Inc., 59 Temple Place - Suite 330,
29     ## Boston, MA 02111-1307, USA.
30    
31    
32    
33     use strict;
34     use Text::Wrap;
35     use Time::Local;
36     use File::Basename;
37    
38    
39     # The Plan:
40     #
41     # Read in the logs for multiple files, spit out a nice ChangeLog that
42     # mirrors the information entered during `cvs commit'.
43     #
44     # The problem presents some challenges. In an ideal world, we could
45     # detect files with the same author, log message, and checkin time --
46     # each <filelist, author, time, logmessage> would be a changelog entry.
47     # We'd sort them; and spit them out. Unfortunately, CVS is *not atomic*
48     # so checkins can span a range of times. Also, the directory structure
49     # could be hierarchical.
50     #
51     # Another question is whether we really want to have the ChangeLog
52     # exactly reflect commits. An author could issue two related commits,
53     # with different log entries, reflecting a single logical change to the
54     # source. GNU style ChangeLogs group these under a single author/date.
55     # We try to do the same.
56     #
57     # So, we parse the output of `cvs log', storing log messages in a
58     # multilevel hash that stores the mapping:
59     # directory => author => time => message => filelist
60     # As we go, we notice "nearby" commit times and store them together
61     # (i.e., under the same timestamp), so they appear in the same log
62     # entry.
63     #
64     # When we've read all the logs, we twist this mapping into
65     # a time => author => message => filelist mapping for each directory.
66     #
67     # If we're not using the `--distributed' flag, the directory is always
68     # considered to be `./', even as descend into subdirectories.
69    
70    
71     ############### Globals ################
72    
73    
74     # What we run to generate it:
75     my $Log_Source_Command = "cvs log";
76    
77     # In case we have to print it out:
78     my $VERSION = '$Revision: 1.1.1.1 $';
79     $VERSION =~ s/\S+\s+(\S+)\s+\S+/$1/;
80    
81     ## Vars set by options:
82    
83     # Print debugging messages?
84     my $Debug = 0;
85    
86     # Just show version and exit?
87     my $Print_Version = 0;
88    
89     # Just print usage message and exit?
90     my $Print_Usage = 0;
91    
92     # Single top-level ChangeLog, or one per subdirectory?
93     my $Distributed = 0;
94    
95     # What file should we generate (defaults to "ChangeLog")?
96     my $Log_File_Name = "ChangeLog";
97    
98     # Grab most recent entry date from existing ChangeLog file, just add
99     # to that ChangeLog.
100     my $Cumulative = 0;
101    
102     # Expand usernames to email addresses based on a map file?
103     my $User_Map_File = "";
104    
105     # Output to a file or to stdout?
106     my $Output_To_Stdout = 0;
107    
108     # Eliminate empty log messages?
109     my $Prune_Empty_Msgs = 0;
110    
111     # Don't call Text::Wrap on the body of the message
112     my $No_Wrap = 0;
113    
114     # Separates header from log message. Code assumes it is either " " or
115     # "\n\n", so if there's ever an option to set it to something else,
116     # make sure to go through all conditionals that use this var.
117     my $After_Header = " ";
118    
119     # Format more for programs than for humans.
120     my $XML_Output = 0;
121    
122     # Do some special tweaks for log data that was written in FSF
123     # ChangeLog style.
124     my $FSF_Style = 0;
125    
126     # Show times in UTC instead of local time
127     my $UTC_Times = 0;
128    
129     # Show day of week in output?
130     my $Show_Day_Of_Week = 0;
131    
132     # Show revision numbers in output?
133     my $Show_Revisions = 0;
134    
135     # Show tags (symbolic names) in output?
136     my $Show_Tags = 0;
137    
138     # Show branches by symbolic name in output?
139     my $Show_Branches = 0;
140    
141     # Show only revisions on these branches or their ancestors.
142     my @Follow_Branches;
143    
144     # Don't bother with files matching this regexp.
145     my @Ignore_Files;
146    
147     # How exactly we match entries. We definitely want "o",
148     # and user might add "i" by using --case-insensitive option.
149     my $Case_Insensitive = 0;
150    
151     # Maybe only show log messages matching a certain regular expression.
152     my $Regexp_Gate = "";
153    
154     # Pass this global option string along to cvs, to the left of `log':
155     my $Global_Opts = "";
156    
157     # Pass this option string along to the cvs log subcommand:
158     my $Command_Opts = "";
159    
160     # Read log output from stdin instead of invoking cvs log?
161     my $Input_From_Stdin = 0;
162    
163     # Don't show filenames in output.
164     my $Hide_Filenames = 0;
165    
166     # Max checkin duration. CVS checkin is not atomic, so we may have checkin
167     # times that span a range of time. We assume that checkins will last no
168     # longer than $Max_Checkin_Duration seconds, and that similarly, no
169     # checkins will happen from the same users with the same message less
170     # than $Max_Checkin_Duration seconds apart.
171     my $Max_Checkin_Duration = 180;
172    
173     # What to put at the front of [each] ChangeLog.
174     my $ChangeLog_Header = "";
175    
176     ## end vars set by options.
177    
178     # In 'cvs log' output, one long unbroken line of equal signs separates
179     # files:
180     my $file_separator = "======================================="
181     . "======================================";
182    
183     # In 'cvs log' output, a shorter line of dashes separates log messages
184     # within a file:
185     my $logmsg_separator = "----------------------------";
186    
187    
188     ############### End globals ############
189    
190    
191    
192    
193     &parse_options ();
194     &derive_change_log ();
195    
196    
197    
198     ### Everything below is subroutine definitions. ###
199    
200     # If accumulating, grab the boundary date from pre-existing ChangeLog.
201     sub maybe_grab_accumulation_date ()
202     {
203     if (! $Cumulative) {
204     return "";
205     }
206    
207     # else
208    
209     open (LOG, "$Log_File_Name")
210     or die ("trouble opening $Log_File_Name for reading ($!)");
211    
212     my $boundary_date;
213     while (<LOG>)
214     {
215     if (/^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/)
216     {
217     $boundary_date = "$1";
218     last;
219     }
220     }
221    
222     close (LOG);
223     return $boundary_date;
224     }
225    
226    
227     # Fills up a ChangeLog structure in the current directory.
228     sub derive_change_log ()
229     {
230     # See "The Plan" above for a full explanation.
231    
232     my %grand_poobah;
233    
234     my $file_full_path;
235     my $time;
236     my $revision;
237     my $author;
238     my $msg_txt;
239     my $detected_file_separator;
240    
241     # Might be adding to an existing ChangeLog
242     my $accumulation_date = &maybe_grab_accumulation_date ();
243     if ($accumulation_date) {
244     $Log_Source_Command .= " -d\'>${accumulation_date}\'";
245     }
246    
247     # We might be expanding usernames
248     my %usermap;
249    
250     # In general, it's probably not very maintainable to use state
251     # variables like this to tell the loop what it's doing at any given
252     # moment, but this is only the first one, and if we never have more
253     # than a few of these, it's okay.
254     my $collecting_symbolic_names = 0;
255     my %symbolic_names; # Where tag names get stored.
256     my %branch_names; # We'll grab branch names while we're at it.
257     my %branch_numbers; # Save some revisions for @Follow_Branches
258     my @branch_roots; # For showing which files are branch ancestors.
259    
260     # Bleargh. Compensate for a deficiency of custom wrapping.
261     if (($After_Header ne " ") and $FSF_Style)
262     {
263     $After_Header .= "\t";
264     }
265    
266     if (! $Input_From_Stdin) {
267     open (LOG_SOURCE, "$Log_Source_Command |")
268     or die "unable to run \"${Log_Source_Command}\"";
269     }
270     else {
271     open (LOG_SOURCE, "-") or die "unable to open stdin for reading";
272     }
273    
274     %usermap = &maybe_read_user_map_file ();
275    
276     while (<LOG_SOURCE>)
277     {
278     # If on a new file and don't see filename, skip until we find it, and
279     # when we find it, grab it.
280     if ((! (defined $file_full_path)) and /^Working file: (.*)/)
281     {
282     $file_full_path = $1;
283     if (@Ignore_Files)
284     {
285     my $base;
286     ($base, undef, undef) = fileparse ($file_full_path);
287     # Ouch, I wish trailing operators in regexps could be
288     # evaluated on the fly!
289     if ($Case_Insensitive) {
290     if (grep ($file_full_path =~ m|$_|i, @Ignore_Files)) {
291     undef $file_full_path;
292     }
293     }
294     elsif (grep ($file_full_path =~ m|$_|, @Ignore_Files)) {
295     undef $file_full_path;
296     }
297     }
298     next;
299     }
300    
301     # Just spin wheels if no file defined yet.
302     next if (! $file_full_path);
303    
304     # Collect tag names in case we're asked to print them in the output.
305     if (/^symbolic names:$/) {
306     $collecting_symbolic_names = 1;
307     next; # There's no more info on this line, so skip to next
308     }
309     if ($collecting_symbolic_names)
310     {
311     # All tag names are listed with whitespace in front in cvs log
312     # output; so if see non-whitespace, then we're done collecting.
313     if (/^\S/) {
314     $collecting_symbolic_names = 0;
315     }
316     else # we're looking at a tag name, so parse & store it
317     {
318     # According to the Cederqvist manual, in node "Tags", tag
319     # names must start with an uppercase or lowercase letter and
320     # can contain uppercase and lowercase letters, digits, `-',
321     # and `_'. However, it's not our place to enforce that, so
322     # we'll allow anything CVS hands us to be a tag:
323     /^\s+([^:]+): ([\d.]+)$/;
324     my $tag_name = $1;
325     my $tag_rev = $2;
326    
327     # A branch number either has an odd number of digit sections
328     # (and hence an even number of dots), or has ".0." as the
329     # second-to-last digit section. Test for these conditions.
330     my $real_branch_rev = "";
331     if (($tag_rev =~ /^(\d+\.\d+\.)+\d+$/) # Even number of dots...
332     and (! ($tag_rev =~ /^(1\.)+1$/))) # ...but not "1.[1.]1"
333     {
334     $real_branch_rev = $tag_rev;
335     }
336     elsif ($tag_rev =~ /(\d+\.(\d+\.)+)0.(\d+)/) # Has ".0."
337     {
338     $real_branch_rev = $1 . $3;
339     }
340     # If we got a branch, record its number.
341     if ($real_branch_rev)
342     {
343     $branch_names{$real_branch_rev} = $tag_name;
344     if (@Follow_Branches) {
345     if (grep ($_ eq $tag_name, @Follow_Branches)) {
346     $branch_numbers{$tag_name} = $real_branch_rev;
347     }
348     }
349     }
350     else {
351     # Else it's just a regular (non-branch) tag.
352     push (@{$symbolic_names{$tag_rev}}, $tag_name);
353     }
354     }
355     }
356     # End of code for collecting tag names.
357    
358     # If have file name, but not revision, and see revision, then grab
359     # it. (We collect unconditionally, even though we may or may not
360     # ever use it.)
361     if ((! (defined $revision)) and (/^revision (\d+\.[\d.]+)/))
362     {
363     $revision = $1;
364    
365     if (@Follow_Branches)
366     {
367     foreach my $branch (@Follow_Branches)
368     {
369     # Special case for following trunk revisions
370     if (($branch =~ /^trunk$/i) and ($revision =~ /^[0-9]+\.[0-9]+$/))
371     {
372     goto dengo;
373     }
374    
375     my $branch_number = $branch_numbers{$branch};
376     if ($branch_number)
377     {
378     # Are we on one of the follow branches or an ancestor of
379     # same?
380     #
381     # If this revision is a prefix of the branch number, or
382     # possibly is less in the minormost number, OR if this
383     # branch number is a prefix of the revision, then yes.
384     # Otherwise, no.
385     #
386     # So below, we determine if any of those conditions are
387     # met.
388    
389     # Trivial case: is this revision on the branch?
390     # (Compare this way to avoid regexps that screw up Emacs
391     # indentation, argh.)
392     if ((substr ($revision, 0, ((length ($branch_number)) + 1)))
393     eq ($branch_number . "."))
394     {
395     goto dengo;
396     }
397     # Non-trivial case: check if rev is ancestral to branch
398     elsif ((length ($branch_number)) > (length ($revision)))
399     {
400     $revision =~ /^((?:\d+\.)+)(\d+)$/;
401     my $r_left = $1; # still has the trailing "."
402     my $r_end = $2;
403    
404     $branch_number =~ /^((?:\d+\.)+)(\d+)\.\d+$/;
405     my $b_left = $1; # still has trailing "."
406     my $b_mid = $2; # has no trailing "."
407    
408     if (($r_left eq $b_left)
409     && ($r_end <= $b_mid))
410     {
411     goto dengo;
412     }
413     }
414     }
415     }
416     }
417     else # (! @Follow_Branches)
418     {
419     next;
420     }
421    
422     # Else we are following branches, but this revision isn't on the
423     # path. So skip it.
424     undef $revision;
425     dengo:
426     next;
427     }
428    
429     # If we don't have a revision right now, we couldn't possibly
430     # be looking at anything useful.
431     if (! (defined ($revision))) {
432     $detected_file_separator = /^$file_separator$/o;
433     if ($detected_file_separator) {
434     # No revisions for this file; can happen, e.g. "cvs log -d DATE"
435     goto CLEAR;
436     }
437     else {
438     next;
439     }
440     }
441    
442     # If have file name but not date and author, and see date or
443     # author, then grab them:
444     unless (defined $time)
445     {
446     if (/^date: .*/)
447     {
448     ($time, $author) = &parse_date_and_author ($_);
449     if (defined ($usermap{$author}) and $usermap{$author}) {
450     $author = $usermap{$author};
451     }
452     }
453     else {
454     $detected_file_separator = /^$file_separator$/o;
455     if ($detected_file_separator) {
456     # No revisions for this file; can happen, e.g. "cvs log -d DATE"
457     goto CLEAR;
458     }
459     }
460     # If the date/time/author hasn't been found yet, we couldn't
461     # possibly care about anything we see. So skip:
462     next;
463     }
464    
465     # A "branches: ..." line here indicates that one or more branches
466     # are rooted at this revision. If we're showing branches, then we
467     # want to show that fact as well, so we collect all the branches
468     # that this is the latest ancestor of and store them in
469     # @branch_roots. Just for reference, the format of the line we're
470     # seeing at this point is:
471     #
472     # branches: 1.5.2; 1.5.4; ...;
473     #
474     # Okay, here goes:
475    
476     if (/^branches:\s+(.*);$/)
477     {
478     if ($Show_Branches)
479     {
480     my $lst = $1;
481     $lst =~ s/(1\.)+1;|(1\.)+1$//; # ignore the trivial branch 1.1.1
482     if ($lst) {
483     @branch_roots = split (/;\s+/, $lst);
484     }
485     else {
486     undef @branch_roots;
487     }
488     next;
489     }
490     else
491     {
492     # Ugh. This really bothers me. Suppose we see a log entry
493     # like this:
494     #
495     # ----------------------------
496     # revision 1.1
497     # date: 1999/10/17 03:07:38; author: jrandom; state: Exp;
498     # branches: 1.1.2;
499     # Intended first line of log message begins here.
500     # ----------------------------
501     #
502     # The question is, how we can tell the difference between that
503     # log message and a *two*-line log message whose first line is
504     #
505     # "branches: 1.1.2;"
506     #
507     # See the problem? The output of "cvs log" is inherently
508     # ambiguous.
509     #
510     # For now, we punt: we liberally assume that people don't
511     # write log messages like that, and just toss a "branches:"
512     # line if we see it but are not showing branches. I hope no
513     # one ever loses real log data because of this.
514     next;
515     }
516     }
517    
518     # If have file name, time, and author, then we're just grabbing
519     # log message texts:
520     $detected_file_separator = /^$file_separator$/o;
521     if ($detected_file_separator && ! (defined $revision)) {
522     # No revisions for this file; can happen, e.g. "cvs log -d DATE"
523     goto CLEAR;
524     }
525     unless ($detected_file_separator || /^$logmsg_separator$/o)
526     {
527     $msg_txt .= $_; # Normally, just accumulate the message...
528     next;
529     }
530     # ... until a msg separator is encountered:
531     # Ensure the message contains something:
532     if ((! $msg_txt)
533     || ($msg_txt =~ /^\s*\.\s*$|^\s*$/)
534     || ($msg_txt =~ /\*\*\* empty log message \*\*\*/))
535     {
536     if ($Prune_Empty_Msgs) {
537     goto CLEAR;
538     }
539     # else
540     $msg_txt = "[no log message]\n";
541     }
542    
543     ### Store it all in the Grand Poobah:
544     {
545     my $dir_key; # key into %grand_poobah
546     my %qunk; # complicated little jobbie, see below
547    
548     # Each revision of a file has a little data structure (a `qunk')
549     # associated with it. That data structure holds not only the
550     # file's name, but any additional information about the file
551     # that might be needed in the output, such as the revision
552     # number, tags, branches, etc. The reason to have these things
553     # arranged in a data structure, instead of just appending them
554     # textually to the file's name, is that we may want to do a
555     # little rearranging later as we write the output. For example,
556     # all the files on a given tag/branch will go together, followed
557     # by the tag in parentheses (so trunk or otherwise non-tagged
558     # files would go at the end of the file list for a given log
559     # message). This rearrangement is a lot easier to do if we
560     # don't have to reparse the text.
561     #
562     # A qunk looks like this:
563     #
564     # {
565     # filename => "hello.c",
566     # revision => "1.4.3.2",
567     # time => a timegm() return value (moment of commit)
568     # tags => [ "tag1", "tag2", ... ],
569     # branch => "branchname" # There should be only one, right?
570     # branchroots => [ "branchtag1", "branchtag2", ... ]
571     # }
572    
573     if ($Distributed) {
574     # Just the basename, don't include the path.
575     ($qunk{'filename'}, $dir_key, undef) = fileparse ($file_full_path);
576     }
577     else {
578     $dir_key = "./";
579     $qunk{'filename'} = $file_full_path;
580     }
581    
582     # This may someday be used in a more sophisticated calculation
583     # of what other files are involved in this commit. For now, we
584     # don't use it, because the common-commit-detection algorithm is
585     # hypothesized to be "good enough" as it stands.
586     $qunk{'time'} = $time;
587    
588     # We might be including revision numbers and/or tags and/or
589     # branch names in the output. Most of the code from here to
590     # loop-end deals with organizing these in qunk.
591    
592     $qunk{'revision'} = $revision;
593    
594     # Grab the branch, even though we may or may not need it:
595     $qunk{'revision'} =~ /((?:\d+\.)+)\d+/;
596     my $branch_prefix = $1;
597     $branch_prefix =~ s/\.$//; # strip off final dot
598     if ($branch_names{$branch_prefix}) {
599     $qunk{'branch'} = $branch_names{$branch_prefix};
600     }
601    
602     # If there's anything in the @branch_roots array, then this
603     # revision is the root of at least one branch. We'll display
604     # them as branch names instead of revision numbers, the
605     # substitution for which is done directly in the array:
606     if (@branch_roots) {
607     my @roots = map { $branch_names{$_} } @branch_roots;
608     $qunk{'branchroots'} = \@roots;
609     }
610    
611     # Save tags too.
612     if (defined ($symbolic_names{$revision})) {
613     $qunk{'tags'} = $symbolic_names{$revision};
614     delete $symbolic_names{$revision};
615     }
616    
617     # Add this file to the list
618     # (We use many spoonfuls of autovivication magic. Hashes and arrays
619     # will spring into existence if they aren't there already.)
620    
621     &debug ("(pushing log msg for ${dir_key}$qunk{'filename'})\n");
622    
623     # Store with the files in this commit. Later we'll loop through
624     # again, making sure that revisions with the same log message
625     # and nearby commit times are grouped together as one commit.
626     push (@{$grand_poobah{$dir_key}{$author}{$time}{$msg_txt}}, \%qunk);
627     }
628    
629     CLEAR:
630     # Make way for the next message
631     undef $msg_txt;
632     undef $time;
633     undef $revision;
634     undef $author;
635     undef @branch_roots;
636    
637     # Maybe even make way for the next file:
638     if ($detected_file_separator) {
639     undef $file_full_path;
640     undef %branch_names;
641     undef %branch_numbers;
642     undef %symbolic_names;
643     }
644     }
645    
646     close (LOG_SOURCE);
647    
648     ### Process each ChangeLog
649    
650     while (my ($dir,$authorhash) = each %grand_poobah)
651     {
652     &debug ("DOING DIR: $dir\n");
653    
654     # Here we twist our hash around, from being
655     # author => time => message => filelist
656     # in %$authorhash to
657     # time => author => message => filelist
658     # in %changelog.
659     #
660     # This is also where we merge entries. The algorithm proceeds
661     # through the timeline of the changelog with a sliding window of
662     # $Max_Checkin_Duration seconds; within that window, entries that
663     # have the same log message are merged.
664     #
665     # (To save space, we zap %$authorhash after we've copied
666     # everything out of it.)
667    
668     my %changelog;
669     while (my ($author,$timehash) = each %$authorhash)
670     {
671     my $lasttime;
672     my %stamptime;
673     foreach my $time (sort {$main::a <=> $main::b} (keys %$timehash))
674     {
675     my $msghash = $timehash->{$time};
676     while (my ($msg,$qunklist) = each %$msghash)
677     {
678     my $stamptime = $stamptime{$msg};
679     if ((defined $stamptime)
680     and (($time - $stamptime) < $Max_Checkin_Duration)
681     and (defined $changelog{$stamptime}{$author}{$msg}))
682     {
683     push(@{$changelog{$stamptime}{$author}{$msg}}, @$qunklist);
684     }
685     else {
686     $changelog{$time}{$author}{$msg} = $qunklist;
687     $stamptime{$msg} = $time;
688     }
689     }
690     }
691     }
692     undef (%$authorhash);
693    
694     ### Now we can write out the ChangeLog!
695    
696     my ($logfile_here, $logfile_bak, $tmpfile);
697    
698     if (! $Output_To_Stdout) {
699     $logfile_here = $dir . $Log_File_Name;
700     $logfile_here =~ s/^\.\/\//\//; # fix any leading ".//" problem
701     $tmpfile = "${logfile_here}.cvs2cl$$.tmp";
702     $logfile_bak = "${logfile_here}.bak";
703    
704     open (LOG_OUT, ">$tmpfile") or die "Unable to open \"$tmpfile\"";
705     }
706     else {
707     open (LOG_OUT, ">-") or die "Unable to open stdout for writing";
708     }
709    
710     print LOG_OUT $ChangeLog_Header;
711    
712     if ($XML_Output) {
713     print LOG_OUT "<?xml version=\"1.0\"?>\n\n"
714     . "<changelog xmlns=\"http://www.red-bean.com/xmlns/cvs2cl/\">\n\n";
715     }
716    
717     foreach my $time (sort {$main::b <=> $main::a} (keys %changelog))
718     {
719     my $authorhash = $changelog{$time};
720     while (my ($author,$mesghash) = each %$authorhash)
721     {
722     # If XML, escape in outer loop to avoid compound quoting:
723     if ($XML_Output) {
724     $author = &xml_escape ($author);
725     }
726    
727     while (my ($msg,$qunklist) = each %$mesghash)
728     {
729     my $files = &pretty_file_list ($qunklist);
730     my $header_line; # date and author
731     my $body; # see below
732     my $wholething; # $header_line + $body
733    
734     # Set up the date/author line.
735     # kff todo: do some more XML munging here, on the header
736     # part of the entry:
737     my ($ignore,$min,$hour,$mday,$mon,$year,$wday)
738     = $UTC_Times ? gmtime($time) : localtime($time);
739    
740     # XML output includes everything else, we might as well make
741     # it always include Day Of Week too, for consistency.
742     if ($Show_Day_Of_Week or $XML_Output) {
743     $wday = ("Sunday", "Monday", "Tuesday", "Wednesday",
744     "Thursday", "Friday", "Saturday")[$wday];
745     $wday = ($XML_Output) ? "<weekday>${wday}</weekday>\n" : " $wday";
746     }
747     else {
748     $wday = "";
749     }
750    
751     if ($XML_Output) {
752     $header_line =
753     sprintf ("<date>%4u-%02u-%02u</date>\n"
754     . "${wday}"
755     . "<time>%02u:%02u</time>\n"
756     . "<author>%s</author>\n",
757     $year+1900, $mon+1, $mday, $hour, $min, $author);
758     }
759     else {
760     $header_line =
761     sprintf ("%4u-%02u-%02u${wday} %02u:%02u %s\n\n",
762     $year+1900, $mon+1, $mday, $hour, $min, $author);
763     }
764    
765     # Reshape the body according to user preferences.
766     if ($XML_Output)
767     {
768     $msg = &preprocess_msg_text ($msg);
769     $body = $files . $msg;
770     }
771     elsif ($No_Wrap)
772     {
773     $msg = &preprocess_msg_text ($msg);
774     $files = wrap ("\t", " ", "$files");
775     $msg =~ s/\n(.*)/\n\t$1/g;
776     unless ($After_Header eq " ") {
777     $msg =~ s/^(.*)/\t$1/g;
778     }
779     $body = $files . $After_Header . $msg;
780     }
781     else # do wrapping, either FSF-style or regular
782     {
783     if ($FSF_Style)
784     {
785     $files = wrap ("\t", " ", "$files");
786    
787     my $files_last_line_len = 0;
788     if ($After_Header eq " ")
789     {
790     $files_last_line_len = &last_line_len ($files);
791     $files_last_line_len += 1; # for $After_Header
792     }
793    
794     $msg = &wrap_log_entry
795     ($msg, "\t", 69 - $files_last_line_len, 69);
796     $body = $files . $After_Header . $msg;
797     }
798     else # not FSF-style
799     {
800     $msg = &preprocess_msg_text ($msg);
801     $body = $files . $After_Header . $msg;
802     $body = wrap ("\t", " ", "$body");
803     }
804     }
805    
806     $wholething = $header_line . $body;
807    
808     if ($XML_Output) {
809     $wholething = "<entry>\n${wholething}</entry>\n";
810     }
811    
812     # One last check: make sure it passes the regexp test, if the
813     # user asked for that. We have to do it here, so that the
814     # test can match against information in the header as well
815     # as in the text of the log message.
816    
817     # How annoying to duplicate so much code just because I
818     # can't figure out a way to evaluate scalars on the trailing
819     # operator portion of a regular expression. Grrr.
820     if ($Case_Insensitive) {
821     unless ($Regexp_Gate && ($wholething =~ /$Regexp_Gate/oi)) {
822     print LOG_OUT "${wholething}\n";
823     }
824     }
825     else {
826     unless ($Regexp_Gate && ($wholething =~ /$Regexp_Gate/o)) {
827     print LOG_OUT "${wholething}\n";
828     }
829     }
830     }
831     }
832     }
833    
834     if ($XML_Output) {
835     print LOG_OUT "</changelog>\n";
836     }
837    
838     close (LOG_OUT);
839    
840     if (! $Output_To_Stdout)
841     {
842     # If accumulating, append old data to new before renaming. But
843     # don't append the most recent entry, since it's already in the
844     # new log due to CVS's idiosyncratic interpretation of "log -d".
845     if ($Cumulative && -f $logfile_here)
846     {
847     open (NEW_LOG, ">>$tmpfile")
848     or die "trouble appending to $tmpfile ($!)";
849    
850     open (OLD_LOG, "<$logfile_here")
851     or die "trouble reading from $logfile_here ($!)";
852    
853     my $started_first_entry = 0;
854     my $passed_first_entry = 0;
855     while (<OLD_LOG>)
856     {
857     if (! $passed_first_entry)
858     {
859     if ((! $started_first_entry)
860     && /^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/) {
861     $started_first_entry = 1;
862     }
863     elsif (/^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/) {
864     $passed_first_entry = 1;
865     print NEW_LOG $_;
866     }
867     }
868     else {
869     print NEW_LOG $_;
870     }
871     }
872    
873     close (NEW_LOG);
874     close (OLD_LOG);
875     }
876    
877     if (-f $logfile_here) {
878     rename ($logfile_here, $logfile_bak);
879     }
880     rename ($tmpfile, $logfile_here);
881     }
882     }
883     }
884    
885    
886     sub parse_date_and_author ()
887     {
888     # Parses the date/time and author out of a line like:
889     #
890     # date: 1999/02/19 23:29:05; author: apharris; state: Exp;
891    
892     my $line = shift;
893    
894     my ($year, $mon, $mday, $hours, $min, $secs, $author) = $line =~
895     m#(\d+)/(\d+)/(\d+)\s+(\d+):(\d+):(\d+);\s+author:\s+([^;]+);#
896     or die "Couldn't parse date ``$line''";
897     die "Bad date or Y2K issues" unless ($year > 1969 and $year < 2258);
898     # Kinda arbitrary, but useful as a sanity check
899     my $time = timegm($secs,$min,$hours,$mday,$mon-1,$year-1900);
900    
901     return ($time, $author);
902     }
903    
904    
905     # Here we take a bunch of qunks and convert them into printed
906     # summary that will include all the information the user asked for.
907     sub pretty_file_list ()
908     {
909     if ($Hide_Filenames and (! $XML_Output)) {
910     return "";
911     }
912    
913     my $qunksref = shift;
914     my @qunkrefs = @$qunksref;
915     my @filenames;
916     my $beauty = ""; # The accumulating header string for this entry.
917     my %non_unanimous_tags; # Tags found in a proper subset of qunks
918     my %unanimous_tags; # Tags found in all qunks
919     my %all_branches; # Branches found in any qunk
920     my $common_dir = undef; # Dir prefix common to all files ("" if none)
921     my $fbegun = 0; # Did we begin printing filenames yet?
922    
923     # First, loop over the qunks gathering all the tag/branch names.
924     # We'll put them all in non_unanimous_tags, and take out the
925     # unanimous ones later.
926     foreach my $qunkref (@qunkrefs)
927     {
928     # Keep track of whether all the files in this commit were in the
929     # same directory, and memorize it if so. We can make the output a
930     # little more compact by mentioning the directory only once.
931     if ((scalar (@qunkrefs)) > 1)
932     {
933     if (! (defined ($common_dir)))
934     {
935     my ($base, $dir);
936     ($base, $dir, undef) = fileparse ($$qunkref{'filename'});
937    
938     if ((! (defined ($dir))) # this first case is sheer paranoia
939     or ($dir eq "")
940     or ($dir eq "./")
941     or ($dir eq ".\\"))
942     {
943     $common_dir = "";
944     }
945     else
946     {
947     $common_dir = $dir;
948     }
949     }
950     elsif ($common_dir ne "")
951     {
952     # Already have a common dir prefix, so how much of it can we preserve?
953     $common_dir = &common_path_prefix ($$qunkref{'filename'}, $common_dir);
954     }
955     }
956     else # only one file in this entry anyway, so common dir not an issue
957     {
958     $common_dir = "";
959     }
960    
961     if (defined ($$qunkref{'branch'})) {
962     $all_branches{$$qunkref{'branch'}} = 1;
963     }
964     if (defined ($$qunkref{'tags'})) {
965     foreach my $tag (@{$$qunkref{'tags'}}) {
966     $non_unanimous_tags{$tag} = 1;
967     }
968     }
969     }
970    
971     # Any tag held by all qunks will be printed specially... but only if
972     # there are multiple qunks in the first place!
973     if ((scalar (@qunkrefs)) > 1) {
974     foreach my $tag (keys (%non_unanimous_tags)) {
975     my $everyone_has_this_tag = 1;
976     foreach my $qunkref (@qunkrefs) {
977     if ((! (defined ($$qunkref{'tags'})))
978     or (! (grep ($_ eq $tag, @{$$qunkref{'tags'}})))) {
979     $everyone_has_this_tag = 0;
980     }
981     }
982     if ($everyone_has_this_tag) {
983     $unanimous_tags{$tag} = 1;
984     delete $non_unanimous_tags{$tag};
985     }
986     }
987     }
988    
989     if ($XML_Output)
990     {
991     # If outputting XML, then our task is pretty simple, because we
992     # don't have to detect common dir, common tags, branch prefixing,
993     # etc. We just output exactly what we have, and don't worry about
994     # redundancy or readability.
995    
996     foreach my $qunkref (@qunkrefs)
997     {
998     my $filename = $$qunkref{'filename'};
999     my $revision = $$qunkref{'revision'};
1000     my $tags = $$qunkref{'tags'};
1001     my $branch = $$qunkref{'branch'};
1002     my $branchroots = $$qunkref{'branchroots'};
1003    
1004     $filename = &xml_escape ($filename); # probably paranoia
1005     $revision = &xml_escape ($revision); # definitely paranoia
1006    
1007     $beauty .= "<file>\n";
1008     $beauty .= "<name>${filename}</name>\n";
1009     $beauty .= "<revision>${revision}</revision>\n";
1010     if ($branch) {
1011     $branch = &xml_escape ($branch); # more paranoia
1012     $beauty .= "<branch>${branch}</branch>\n";
1013     }
1014     foreach my $tag (@$tags) {
1015     $tag = &xml_escape ($tag); # by now you're used to the paranoia
1016     $beauty .= "<tag>${tag}</tag>\n";
1017     }
1018     foreach my $root (@$branchroots) {
1019     $root = &xml_escape ($root); # which is good, because it will continue
1020     $beauty .= "<branchroot>${root}</branchroot>\n";
1021     }
1022     $beauty .= "</file>\n";
1023     }
1024    
1025     # Theoretically, we could go home now. But as long as we're here,
1026     # let's print out the common_dir and utags, as a convenience to
1027     # the receiver (after all, earlier code calculated that stuff
1028     # anyway, so we might as well take advantage of it).
1029    
1030     if ((scalar (keys (%unanimous_tags))) > 1) {
1031     foreach my $utag ((keys (%unanimous_tags))) {
1032     $utag = &xml_escape ($utag); # the usual paranoia
1033     $beauty .= "<utag>${utag}</utag>\n";
1034     }
1035     }
1036     if ($common_dir) {
1037     $common_dir = &xml_escape ($common_dir);
1038     $beauty .= "<commondir>${common_dir}</commondir>\n";
1039     }
1040    
1041     # That's enough for XML, time to go home:
1042     return $beauty;
1043     }
1044    
1045     # Else not XML output, so complexly compactify for chordate
1046     # consumption. At this point we have enough global information
1047     # about all the qunks to organize them non-redundantly for output.
1048    
1049     if ($common_dir) {
1050     # Note that $common_dir still has its trailing slash
1051     $beauty .= "$common_dir: ";
1052     }
1053    
1054     if ($Show_Branches)
1055     {
1056     # For trailing revision numbers.
1057     my @brevisions;
1058    
1059     foreach my $branch (keys (%all_branches))
1060     {
1061     foreach my $qunkref (@qunkrefs)
1062     {
1063     if ((defined ($$qunkref{'branch'}))
1064     and ($$qunkref{'branch'} eq $branch))
1065     {
1066     if ($fbegun) {
1067     # kff todo: comma-delimited in XML too? Sure.
1068     $beauty .= ", ";
1069     }
1070     else {
1071     $fbegun = 1;
1072     }
1073     my $fname = substr ($$qunkref{'filename'}, length ($common_dir));
1074     $beauty .= $fname;
1075     $$qunkref{'printed'} = 1; # Just setting a mark bit, basically
1076    
1077     if ($Show_Tags && (defined @{$$qunkref{'tags'}})) {
1078     my @tags = grep ($non_unanimous_tags{$_}, @{$$qunkref{'tags'}});
1079     if (@tags) {
1080     $beauty .= " (tags: ";
1081     $beauty .= join (', ', @tags);
1082     $beauty .= ")";
1083     }
1084     }
1085    
1086     if ($Show_Revisions) {
1087     # Collect the revision numbers' last components, but don't
1088     # print them -- they'll get printed with the branch name
1089     # later.
1090     $$qunkref{'revision'} =~ /.+\.([\d]+)$/;
1091     push (@brevisions, $1);
1092    
1093     # todo: we're still collecting branch roots, but we're not
1094     # showing them anywhere. If we do show them, it would be
1095     # nifty to just call them revision "0" on a the branch.
1096     # Yeah, that's the ticket.
1097     }
1098     }
1099     }
1100     $beauty .= " ($branch";
1101     if (@brevisions) {
1102     if ((scalar (@brevisions)) > 1) {
1103     $beauty .= ".[";
1104     $beauty .= (join (',', @brevisions));
1105     $beauty .= "]";
1106     }
1107     else {
1108     $beauty .= ".$brevisions[0]";
1109     }
1110     }
1111     $beauty .= ")";
1112     }
1113     }
1114    
1115     # Okay; any qunks that were done according to branch are taken care
1116     # of, and marked as printed. Now print everyone else.
1117    
1118     foreach my $qunkref (@qunkrefs)
1119     {
1120     next if (defined ($$qunkref{'printed'})); # skip if already printed
1121    
1122     if ($fbegun) {
1123     $beauty .= ", ";
1124     }
1125     else {
1126     $fbegun = 1;
1127     }
1128     $beauty .= substr ($$qunkref{'filename'}, length ($common_dir));
1129     # todo: Shlomo's change was this:
1130     # $beauty .= substr ($$qunkref{'filename'},
1131     # (($common_dir eq "./") ? "" : length ($common_dir)));
1132     $$qunkref{'printed'} = 1; # Set a mark bit.
1133    
1134     if ($Show_Revisions || $Show_Tags)
1135     {
1136     my $started_addendum = 0;
1137    
1138     if ($Show_Revisions) {
1139     $started_addendum = 1;
1140     $beauty .= " (";
1141     $beauty .= "$$qunkref{'revision'}";
1142     }
1143     if ($Show_Tags && (defined $$qunkref{'tags'})) {
1144     my @tags = grep ($non_unanimous_tags{$_}, @{$$qunkref{'tags'}});
1145     if ((scalar (@tags)) > 0) {
1146     if ($started_addendum) {
1147     $beauty .= ", ";
1148     }
1149     else {
1150     $beauty .= " (tags: ";
1151     }
1152     $beauty .= join (', ', @tags);
1153     $started_addendum = 1;
1154     }
1155     }
1156     if ($started_addendum) {
1157     $beauty .= ")";
1158     }
1159     }
1160     }
1161    
1162     # Unanimous tags always come last.
1163     if ($Show_Tags && %unanimous_tags)
1164     {
1165     $beauty .= " (utags: ";
1166     $beauty .= join (', ', keys (%unanimous_tags));
1167     $beauty .= ")";
1168     }
1169    
1170     # todo: still have to take care of branch_roots?
1171    
1172     $beauty = "* $beauty:";
1173    
1174     return $beauty;
1175     }
1176    
1177    
1178     sub common_path_prefix ()
1179     {
1180     my $path1 = shift;
1181     my $path2 = shift;
1182    
1183     my ($dir1, $dir2);
1184     (undef, $dir1, undef) = fileparse ($path1);
1185     (undef, $dir2, undef) = fileparse ($path2);
1186    
1187     # Transmogrify Windows filenames to look like Unix.
1188     # (It is far more likely that someone is running cvs2cl.pl under
1189     # Windows than that they would genuinely have backslashes in their
1190     # filenames.)
1191     $dir1 =~ tr#\\#/#;
1192     $dir2 =~ tr#\\#/#;
1193    
1194     my $accum1 = "";
1195     my $accum2 = "";
1196     my $last_common_prefix = "";
1197    
1198     while ($accum1 eq $accum2)
1199     {
1200     $last_common_prefix = $accum1;
1201     last if ($accum1 eq $dir1);
1202     my ($tmp1) = split (/\//, (substr ($dir1, length ($accum1))));
1203     my ($tmp2) = split (/\//, (substr ($dir2, length ($accum2))));
1204     $accum1 .= "$tmp1/" if ((defined ($tmp1)) and $tmp1);
1205     $accum2 .= "$tmp2/" if ((defined ($tmp2)) and $tmp2);
1206     }
1207    
1208     return $last_common_prefix;
1209     }
1210    
1211    
1212     sub preprocess_msg_text ()
1213     {
1214     my $text = shift;
1215    
1216     # Strip out carriage returns (as they probably result from DOSsy editors).
1217     $text =~ s/\r\n/\n/g;
1218    
1219     # If it *looks* like two newlines, make it *be* two newlines:
1220     $text =~ s/\n\s*\n/\n\n/g;
1221    
1222     if ($XML_Output)
1223     {
1224     $text = &xml_escape ($text);
1225     $text = "<msg>${text}</msg>\n";
1226     }
1227     elsif (! $No_Wrap)
1228     {
1229     # Strip off lone newlines, but only for lines that don't begin with
1230     # whitespace or a mail-quoting character, since we want to preserve
1231     # that kind of formatting. Also don't strip newlines that follow a
1232     # period; we handle those specially next. And don't strip
1233     # newlines that precede an open paren.
1234     1 while ($text =~ s/(^|\n)([^>\s].*[^.\n])\n([^>\n])/$1$2 $3/g);
1235    
1236     # If a newline follows a period, make sure that when we bring up the
1237     # bottom sentence, it begins with two spaces.
1238     1 while ($text =~ s/(^|\n)([^>\s].*)\n([^>\n])/$1$2 $3/g);
1239     }
1240    
1241     return $text;
1242     }
1243    
1244    
1245     sub last_line_len ()
1246     {
1247     my $files_list = shift;
1248     my @lines = split (/\n/, $files_list);
1249     my $last_line = pop (@lines);
1250     return length ($last_line);
1251     }
1252    
1253    
1254     # A custom wrap function, sensitive to some common constructs used in
1255     # log entries.
1256     sub wrap_log_entry ()
1257     {
1258     my $text = shift; # The text to wrap.
1259     my $left_pad_str = shift; # String to pad with on the left.
1260    
1261     # These do NOT take left_pad_str into account:
1262     my $length_remaining = shift; # Amount left on current line.
1263     my $max_line_length = shift; # Amount left for a blank line.
1264    
1265     my $wrapped_text = ""; # The accumulating wrapped entry.
1266     my $user_indent = ""; # Inherited user_indent from prev line.
1267    
1268     my $first_time = 1; # First iteration of the loop?
1269     my $suppress_line_start_match = 0; # Set to disable line start checks.
1270    
1271     my @lines = split (/\n/, $text);
1272     while (@lines) # Don't use `foreach' here, it won't work.
1273     {
1274     my $this_line = shift (@lines);
1275     chomp $this_line;
1276    
1277     if ($this_line =~ /^(\s+)/) {
1278     $user_indent = $1;
1279     }
1280     else {
1281     $user_indent = "";
1282     }
1283    
1284     # If it matches any of the line-start regexps, print a newline now...
1285     if ($suppress_line_start_match)
1286     {
1287     $suppress_line_start_match = 0;
1288     }
1289     elsif (($this_line =~ /^(\s*)\*\s+[a-zA-Z0-9]/)
1290     || ($this_line =~ /^(\s*)\* [a-zA-Z0-9_\.\/\+-]+/)
1291     || ($this_line =~ /^(\s*)\([a-zA-Z0-9_\.\/\+-]+(\)|,\s*)/)
1292     || ($this_line =~ /^(\s+)(\S+)/)
1293     || ($this_line =~ /^(\s*)- +/)
1294     || ($this_line =~ /^()\s*$/)
1295     || ($this_line =~ /^(\s*)\*\) +/)
1296     || ($this_line =~ /^(\s*)[a-zA-Z0-9](\)|\.|\:) +/))
1297     {
1298     # Make a line break immediately, unless header separator is set
1299     # and this line is the first line in the entry, in which case
1300     # we're getting the blank line for free already and shouldn't
1301     # add an extra one.
1302     unless (($After_Header ne " ") and ($first_time))
1303     {
1304     if ($this_line =~ /^()\s*$/) {
1305     $suppress_line_start_match = 1;
1306     $wrapped_text .= "\n${left_pad_str}";
1307     }
1308    
1309     $wrapped_text .= "\n${left_pad_str}";
1310     }
1311    
1312     $length_remaining = $max_line_length - (length ($user_indent));
1313     }
1314    
1315     # Now that any user_indent has been preserved, strip off leading
1316     # whitespace, so up-folding has no ugly side-effects.
1317     $this_line =~ s/^\s*//;
1318    
1319     # Accumulate the line, and adjust parameters for next line.
1320     my $this_len = length ($this_line);
1321     if ($this_len == 0)
1322     {
1323     # Blank lines should cancel any user_indent level.
1324     $user_indent = "";
1325     $length_remaining = $max_line_length;
1326     }
1327     elsif ($this_len >= $length_remaining) # Line too long, try breaking it.
1328     {
1329     # Walk backwards from the end. At first acceptable spot, break
1330     # a new line.
1331     my $idx = $length_remaining - 1;
1332     if ($idx < 0) { $idx = 0 };
1333     while ($idx > 0)
1334     {
1335     if (substr ($this_line, $idx, 1) =~ /\s/)
1336     {
1337     my $line_now = substr ($this_line, 0, $idx);
1338     my $next_line = substr ($this_line, $idx);
1339     $this_line = $line_now;
1340    
1341     # Clean whitespace off the end.
1342     chomp $this_line;
1343    
1344     # The current line is ready to be printed.
1345     $this_line .= "\n${left_pad_str}";
1346    
1347     # Make sure the next line is allowed full room.
1348     $length_remaining = $max_line_length - (length ($user_indent));
1349    
1350     # Strip next_line, but then preserve any user_indent.
1351     $next_line =~ s/^\s*//;
1352    
1353     # Sneak a peek at the user_indent of the upcoming line, so
1354     # $next_line (which will now precede it) can inherit that
1355     # indent level. Otherwise, use whatever user_indent level
1356     # we currently have, which might be none.
1357     my $next_next_line = shift (@lines);
1358     if ((defined ($next_next_line)) && ($next_next_line =~ /^(\s+)/)) {
1359     $next_line = $1 . $next_line if (defined ($1));
1360     # $length_remaining = $max_line_length - (length ($1));
1361     $next_next_line =~ s/^\s*//;
1362     }
1363     else {
1364     $next_line = $user_indent . $next_line;
1365     }
1366     if (defined ($next_next_line)) {
1367     unshift (@lines, $next_next_line);
1368     }
1369     unshift (@lines, $next_line);
1370    
1371     # Our new next line might, coincidentally, begin with one of
1372     # the line-start regexps, so we temporarily turn off
1373     # sensitivity to that until we're past the line.
1374     $suppress_line_start_match = 1;
1375    
1376     last;
1377     }
1378     else
1379     {
1380     $idx--;
1381     }
1382     }
1383    
1384     if ($idx == 0)
1385     {
1386     # We bottomed out because the line is longer than the
1387     # available space. But that could be because the space is
1388     # small, or because the line is longer than even the maximum
1389     # possible space. Handle both cases below.
1390    
1391     if ($length_remaining == ($max_line_length - (length ($user_indent))))
1392     {
1393     # The line is simply too long -- there is no hope of ever
1394     # breaking it nicely, so just insert it verbatim, with
1395     # appropriate padding.
1396     $this_line = "\n${left_pad_str}${this_line}";
1397     }
1398     else
1399     {
1400     # Can't break it here, but may be able to on the next round...
1401     unshift (@lines, $this_line);
1402     $length_remaining = $max_line_length - (length ($user_indent));
1403     $this_line = "\n${left_pad_str}";
1404     }
1405     }
1406     }
1407     else # $this_len < $length_remaining, so tack on what we can.
1408     {
1409     # Leave a note for the next iteration.
1410     $length_remaining = $length_remaining - $this_len;
1411    
1412     if ($this_line =~ /\.$/)
1413     {
1414     $this_line .= " ";
1415     $length_remaining -= 2;
1416     }
1417     else # not a sentence end
1418     {
1419     $this_line .= " ";
1420     $length_remaining -= 1;
1421     }
1422     }
1423    
1424     # Unconditionally indicate that loop has run at least once.
1425     $first_time = 0;
1426    
1427     $wrapped_text .= "${user_indent}${this_line}";
1428     }
1429    
1430     # One last bit of padding.
1431     $wrapped_text .= "\n";
1432    
1433     return $wrapped_text;
1434     }
1435    
1436    
1437     sub xml_escape ()
1438     {
1439     my $txt = shift;
1440     $txt =~ s/&/&amp;/g;
1441     $txt =~ s/</&lt;/g;
1442     $txt =~ s/>/&gt;/g;
1443     return $txt;
1444     }
1445    
1446    
1447     sub maybe_read_user_map_file ()
1448     {
1449     my %expansions;
1450    
1451     if ($User_Map_File)
1452     {
1453     open (MAPFILE, "<$User_Map_File")
1454     or die ("Unable to open $User_Map_File ($!)");
1455    
1456     while (<MAPFILE>)
1457     {
1458     next if /^\s*#/; # Skip comment lines.
1459     next if not /:/; # Skip lines without colons.
1460    
1461     # It is now safe to split on ':'.
1462     my ($username, $expansion) = split ':';
1463     chomp $expansion;
1464     $expansion =~ s/^'(.*)'$/$1/;
1465     $expansion =~ s/^"(.*)"$/$1/;
1466    
1467     # If it looks like the expansion has a real name already, then
1468     # we toss the username we got from CVS log. Otherwise, keep
1469     # it to use in combination with the email address.
1470    
1471     if ($expansion =~ /^\s*<{0,1}\S+@.*/) {
1472     # Also, add angle brackets if none present
1473     if (! ($expansion =~ /<\S+@\S+>/)) {
1474     $expansions{$username} = "$username <$expansion>";
1475     }
1476     else {
1477     $expansions{$username} = "$username $expansion";
1478     }
1479     }
1480     else {
1481     $expansions{$username} = $expansion;
1482     }
1483     }
1484    
1485     close (MAPFILE);
1486     }
1487    
1488     return %expansions;
1489     }
1490    
1491    
1492     sub parse_options ()
1493     {
1494     # Check this internally before setting the global variable.
1495     my $output_file;
1496    
1497     # If this gets set, we encountered unknown options and will exit at
1498     # the end of this subroutine.
1499     my $exit_with_admonishment = 0;
1500    
1501     while (my $arg = shift (@ARGV))
1502     {
1503     if ($arg =~ /^-h$|^-help$|^--help$|^--usage$|^-?$/) {
1504     $Print_Usage = 1;
1505     }
1506     elsif ($arg =~ /^--debug$/) { # unadvertised option, heh
1507     $Debug = 1;
1508     }
1509     elsif ($arg =~ /^--version$/) {
1510     $Print_Version = 1;
1511     }
1512     elsif ($arg =~ /^-g$|^--global-opts$/) {
1513     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1514     # Don't assume CVS is called "cvs" on the user's system:
1515     $Log_Source_Command =~ s/(^\S*)/$1 $narg/;
1516     }
1517     elsif ($arg =~ /^-l$|^--log-opts$/) {
1518     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1519     $Log_Source_Command .= " $narg";
1520     }
1521     elsif ($arg =~ /^-f$|^--file$/) {
1522     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1523     $output_file = $narg;
1524     }
1525     elsif ($arg =~ /^--accum$/) {
1526     $Cumulative = 1;
1527     }
1528     elsif ($arg =~ /^--fsf$/) {
1529     $FSF_Style = 1;
1530     }
1531     elsif ($arg =~ /^-U$|^--usermap$/) {
1532     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1533     $User_Map_File = $narg;
1534     }
1535     elsif ($arg =~ /^-W$|^--window$/) {
1536     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1537     $Max_Checkin_Duration = $narg;
1538     }
1539     elsif ($arg =~ /^-I$|^--ignore$/) {
1540     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1541     push (@Ignore_Files, $narg);
1542     }
1543     elsif ($arg =~ /^-C$|^--case-insensitive$/) {
1544     $Case_Insensitive = 1;
1545     }
1546     elsif ($arg =~ /^-R$|^--regexp$/) {
1547     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1548     $Regexp_Gate = $narg;
1549     }
1550     elsif ($arg =~ /^--stdout$/) {
1551     $Output_To_Stdout = 1;
1552     }
1553     elsif ($arg =~ /^--version$/) {
1554     $Print_Version = 1;
1555     }
1556     elsif ($arg =~ /^-d$|^--distributed$/) {
1557     $Distributed = 1;
1558     }
1559     elsif ($arg =~ /^-P$|^--prune$/) {
1560     $Prune_Empty_Msgs = 1;
1561     }
1562     elsif ($arg =~ /^-S$|^--separate-header$/) {
1563     $After_Header = "\n\n";
1564     }
1565     elsif ($arg =~ /^--no-wrap$/) {
1566     $No_Wrap = 1;
1567     }
1568     elsif ($arg =~ /^--gmt$|^--utc$/) {
1569     $UTC_Times = 1;
1570     }
1571     elsif ($arg =~ /^-w$|^--day-of-week$/) {
1572     $Show_Day_Of_Week = 1;
1573     }
1574     elsif ($arg =~ /^-r$|^--revisions$/) {
1575     $Show_Revisions = 1;
1576     }
1577     elsif ($arg =~ /^-t$|^--tags$/) {
1578     $Show_Tags = 1;
1579     }
1580     elsif ($arg =~ /^-b$|^--branches$/) {
1581     $Show_Branches = 1;
1582     }
1583     elsif ($arg =~ /^-F$|^--follow$/) {
1584     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1585     push (@Follow_Branches, $narg);
1586     }
1587     elsif ($arg =~ /^--stdin$/) {
1588     $Input_From_Stdin = 1;
1589     }
1590     elsif ($arg =~ /^--header$/) {
1591     my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1592     $ChangeLog_Header = &slurp_file ($narg);
1593     if (! defined ($ChangeLog_Header)) {
1594     $ChangeLog_Header = "";
1595     }
1596     }
1597     elsif ($arg =~ /^--xml$/) {
1598     $XML_Output = 1;
1599     }
1600     elsif ($arg =~ /^--hide-filenames$/) {
1601     $Hide_Filenames = 1;
1602     $After_Header = "";
1603     }
1604     else {
1605     # Just add a filename as argument to the log command
1606     $Log_Source_Command .= " $arg";
1607     }
1608     }
1609    
1610     ## Check for contradictions...
1611    
1612     if ($Output_To_Stdout && $Distributed) {
1613     print STDERR "cannot pass both --stdout and --distributed\n";
1614     $exit_with_admonishment = 1;
1615     }
1616    
1617     if ($Output_To_Stdout && $output_file) {
1618     print STDERR "cannot pass both --stdout and --file\n";
1619     $exit_with_admonishment = 1;
1620     }
1621    
1622     if ($XML_Output && $Cumulative) {
1623     print STDERR "cannot pass both --xml and --accum\n";
1624     $exit_with_admonishment = 1;
1625     }
1626    
1627     # Or if any other error message has already been printed out, we
1628     # just leave now:
1629     if ($exit_with_admonishment) {
1630     &usage ();
1631     exit (1);
1632     }
1633     elsif ($Print_Usage) {
1634     &usage ();
1635     exit (0);
1636     }
1637     elsif ($Print_Version) {
1638     &version ();
1639     exit (0);
1640     }
1641    
1642     ## Else no problems, so proceed.
1643    
1644     if ($output_file) {
1645     $Log_File_Name = $output_file;
1646     }
1647     }
1648    
1649    
1650     sub slurp_file ()
1651     {
1652     my $filename = shift || die ("no filename passed to slurp_file()");
1653     my $retstr;
1654    
1655     open (SLURPEE, "<${filename}") or die ("unable to open $filename ($!)");
1656     my $saved_sep = $/;
1657     undef $/;
1658     $retstr = <SLURPEE>;
1659     $/ = $saved_sep;
1660     close (SLURPEE);
1661     return $retstr;
1662     }
1663    
1664    
1665     sub debug ()
1666     {
1667     if ($Debug) {
1668     my $msg = shift;
1669     print STDERR $msg;
1670     }
1671     }
1672    
1673    
1674     sub version ()
1675     {
1676     print "cvs2cl.pl version ${VERSION}; distributed under the GNU GPL.\n";
1677     }
1678    
1679    
1680     sub usage ()
1681     {
1682     &version ();
1683     print <<'END_OF_INFO';
1684     Generate GNU-style ChangeLogs in CVS working copies.
1685    
1686     Notes about the output format(s):
1687    
1688     The default output of cvs2cl.pl is designed to be compact, formally
1689     unambiguous, but still easy for humans to read. It is largely
1690     self-explanatory, I hope; the one abbreviation that might not be
1691     obvious is "utags". That stands for "universal tags" -- a
1692     universal tag is one held by all the files in a given change entry.
1693    
1694     If you need output that's easy for a program to parse, use the
1695     --xml option. Note that with XML output, just about all available
1696     information is included with each change entry, whether you asked
1697     for it or not, on the theory that your parser can ignore anything
1698     it's not looking for.
1699    
1700     Notes about the options and arguments (the actual options are listed
1701     last in this usage message):
1702    
1703     * The -I and -F options may appear multiple times.
1704    
1705     * To follow trunk revisions, use "-F trunk" ("-F TRUNK" also works).
1706     This is okay because no would ever, ever be crazy enough to name a
1707     branch "trunk", right? Right.
1708    
1709     * For the -U option, the UFILE should be formatted like
1710     CVSROOT/users. That is, each line of UFILE looks like this
1711     jrandom:jrandom@red-bean.com
1712     or maybe even like this
1713     jrandom:'Jesse Q. Random <jrandom@red-bean.com>'
1714     Don't forget to quote the portion after the colon if necessary.
1715    
1716     * Many people want to filter by date. To do so, invoke cvs2cl.pl
1717     like this:
1718     cvs2cl.pl -l "-d'DATESPEC'"
1719     where DATESPEC is any date specification valid for "cvs log -d".
1720     (Note that CVS 1.10.7 and below requires there be no space between
1721     -d and its argument).
1722    
1723     Options/Arguments:
1724    
1725     -h, -help, --help, or -? Show this usage and exit
1726     --version Show version and exit
1727     -r, --revisions Show revision numbers in output
1728     -b, --branches Show branch names in revisions when possible
1729     -t, --tags Show tags (symbolic names) in output
1730     --stdin Read from stdin, don't run cvs log
1731     --stdout Output to stdout not to ChangeLog
1732     -d, --distributed Put ChangeLogs in subdirs
1733     -f FILE, --file FILE Write to FILE instead of "ChangeLog"
1734     --fsf Use this if log data is in FSF ChangeLog style
1735     -W SECS, --window SECS Window of time within which log entries unify
1736     -U UFILE, --usermap UFILE Expand usernames to email addresses from UFILE
1737     -R REGEXP, --regexp REGEXP Include only entries that match REGEXP
1738     -I REGEXP, --ignore REGEXP Ignore files whose names match REGEXP
1739     -C, --case-insensitive Any regexp matching is done case-insensitively
1740     -F BRANCH, --follow BRANCH Show only revisions on or ancestral to BRANCH
1741     -S, --separate-header Blank line between each header and log message
1742     --no-wrap Don't auto-wrap log message (recommend -S also)
1743     --gmt, --utc Show times in GMT/UTC instead of local time
1744     --accum Add to an existing ChangeLog (incompat w/ --xml)
1745     -w, --day-of-week Show day of week
1746     --header FILE Get ChangeLog header from FILE ("-" means stdin)
1747     --xml Output XML instead of ChangeLog format
1748     --hide-filenames Don't show filenames (ignored for XML output)
1749     -P, --prune Don't show empty log messages
1750     -g OPTS, --global-opts OPTS Invoke like this "cvs OPTS log ..."
1751     -l OPTS, --log-opts OPTS Invoke like this "cvs ... log OPTS"
1752     FILE1 [FILE2 ...] Show only log information for the named FILE(s)
1753    
1754     See http://www.red-bean.com/cvs2cl for maintenance and bug info.
1755     END_OF_INFO
1756     }
1757    
1758     __END__
1759    
1760     =head1 NAME
1761    
1762     cvs2cl.pl - produces GNU-style ChangeLogs in CVS working copies, by
1763     running "cvs log" and parsing the output. Shared log entries are
1764     unified in an intuitive way.
1765    
1766     =head1 DESCRIPTION
1767    
1768     This script generates GNU-style ChangeLog files from CVS log
1769     information. Basic usage: just run it inside a working copy and a
1770     ChangeLog will appear. It requires repository access (i.e., 'cvs log'
1771     must work). Run "cvs2cl.pl --help" to see more advanced options.
1772    
1773     See http://www.red-bean.com/cvs2cl for updates, and for instructions
1774     on getting anonymous CVS access to this script.
1775    
1776     Maintainer: Karl Fogel <kfogel@red-bean.com>
1777     Please report bugs to <bug-cvs2cl@red-bean.com>.
1778    
1779     =head1 README
1780    
1781     This script generates GNU-style ChangeLog files from CVS log
1782     information. Basic usage: just run it inside a working copy and a
1783     ChangeLog will appear. It requires repository access (i.e., 'cvs log'
1784     must work). Run "cvs2cl.pl --help" to see more advanced options.
1785    
1786     See http://www.red-bean.com/cvs2cl for updates, and for instructions
1787     on getting anonymous CVS access to this script.
1788    
1789     Maintainer: Karl Fogel <kfogel@red-bean.com>
1790     Please report bugs to <bug-cvs2cl@red-bean.com>.
1791    
1792     =head1 PREREQUISITES
1793    
1794     This script requires C<Text::Wrap>, C<Time::Local>, and
1795     C<File::Basename>.
1796     It also seems to require C<Perl 5.004_04> or higher.
1797    
1798     =pod OSNAMES
1799    
1800     any
1801    
1802     =pod SCRIPT CATEGORIES
1803    
1804     Version_Control/CVS
1805    
1806     =cut
1807    
1808    
1809     -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*-
1810    
1811     Note about a bug-slash-opportunity:
1812     -----------------------------------
1813    
1814     There's a bug in Text::Wrap, which affects cvs2cl. This script
1815     reveals it:
1816    
1817     #!/usr/bin/perl -w
1818    
1819     use Text::Wrap;
1820    
1821     my $test_text =
1822     "This script demonstrates a bug in Text::Wrap. The very long line
1823     following this paragraph will be relocated relative to the surrounding
1824     text:
1825    
1826     ====================================================================
1827    
1828     See? When the bug happens, we'll get the line of equal signs below
1829     this paragraph, even though it should be above.";
1830    
1831    
1832     # Print out the test text with no wrapping:
1833     print "$test_text";
1834     print "\n";
1835     print "\n";
1836    
1837     # Now print it out wrapped, and see the bug:
1838     print wrap ("\t", " ", "$test_text");
1839     print "\n";
1840     print "\n";
1841    
1842     If the line of equal signs were one shorter, then the bug doesn't
1843     happen. Interesting.
1844    
1845     Anyway, rather than fix this in Text::Wrap, we might as well write a
1846     new wrap() which has the following much-needed features:
1847    
1848     * initial indentation, like current Text::Wrap()
1849     * subsequent line indentation, like current Text::Wrap()
1850     * user chooses among: force-break long words, leave them alone, or die()?
1851     * preserve existing indentation: chopped chunks from an indented line
1852     are indented by same (like this line, not counting the asterisk!)
1853     * optional list of things to preserve on line starts, default ">"
1854    
1855     Note that the last two are essentially the same concept, so unify in
1856     implementation and give a good interface to controlling them.
1857    
1858     And how about:
1859    
1860     Optionally, when encounter a line pre-indented by same as previous
1861     line, then strip the newline and refill, but indent by the same.
1862     Yeah...

webmaster@leafok.com
ViewVC Help
Powered by ViewVC 1.3.0-beta1