/[LeafOK_CVS]/pvpgn-1.7.4/scripts/cvs2cl.pl
ViewVC logotype

Contents of /pvpgn-1.7.4/scripts/cvs2cl.pl

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.1.1.1 - (show annotations) (vendor branch)
Tue Jun 6 03:41:37 2006 UTC (19 years, 9 months ago) by sysadm
Branch: GNU, MAIN
CVS Tags: arelease, HEAD
Changes since 1.1: +0 -0 lines
Content type: text/x-perl
no message

1 #!/bin/sh
2 exec perl -w -x $0 ${1+"$@"} # -*- mode: perl; perl-indent-level: 2; -*-
3 #!perl -w
4
5 ##############################################################
6 ### ###
7 ### cvs2cl.pl: produce ChangeLog(s) from `cvs log` output. ###
8 ### ###
9 ##############################################################
10
11 ##
12 ## (C) 1999 Karl Fogel <kfogel@red-bean.com>, under the GNU GPL.
13 ##
14 ## (Extensively hacked on by Melissa O'Neill <oneill@cs.sfu.ca>.)
15 ##
16 ## cvs2cl.pl is free software; you can redistribute it and/or modify
17 ## it under the terms of the GNU General Public License as published by
18 ## the Free Software Foundation; either version 2, or (at your option)
19 ## any later version.
20 ##
21 ## cvs2cl.pl is distributed in the hope that it will be useful,
22 ## but WITHOUT ANY WARRANTY; without even the implied warranty of
23 ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
24 ## GNU General Public License for more details.
25 ##
26 ## You may have received a copy of the GNU General Public License
27 ## along with cvs2cl.pl; see the file COPYING. If not, write to the
28 ## Free Software Foundation, Inc., 59 Temple Place - Suite 330,
29 ## Boston, MA 02111-1307, USA.
30
31
32
33 use strict;
34 use Text::Wrap;
35 use Time::Local;
36 use File::Basename;
37
38
39 # The Plan:
40 #
41 # Read in the logs for multiple files, spit out a nice ChangeLog that
42 # mirrors the information entered during `cvs commit'.
43 #
44 # The problem presents some challenges. In an ideal world, we could
45 # detect files with the same author, log message, and checkin time --
46 # each <filelist, author, time, logmessage> would be a changelog entry.
47 # We'd sort them; and spit them out. Unfortunately, CVS is *not atomic*
48 # so checkins can span a range of times. Also, the directory structure
49 # could be hierarchical.
50 #
51 # Another question is whether we really want to have the ChangeLog
52 # exactly reflect commits. An author could issue two related commits,
53 # with different log entries, reflecting a single logical change to the
54 # source. GNU style ChangeLogs group these under a single author/date.
55 # We try to do the same.
56 #
57 # So, we parse the output of `cvs log', storing log messages in a
58 # multilevel hash that stores the mapping:
59 # directory => author => time => message => filelist
60 # As we go, we notice "nearby" commit times and store them together
61 # (i.e., under the same timestamp), so they appear in the same log
62 # entry.
63 #
64 # When we've read all the logs, we twist this mapping into
65 # a time => author => message => filelist mapping for each directory.
66 #
67 # If we're not using the `--distributed' flag, the directory is always
68 # considered to be `./', even as descend into subdirectories.
69
70
71 ############### Globals ################
72
73
74 # What we run to generate it:
75 my $Log_Source_Command = "cvs log";
76
77 # In case we have to print it out:
78 my $VERSION = '$Revision: 1.1.1.1 $';
79 $VERSION =~ s/\S+\s+(\S+)\s+\S+/$1/;
80
81 ## Vars set by options:
82
83 # Print debugging messages?
84 my $Debug = 0;
85
86 # Just show version and exit?
87 my $Print_Version = 0;
88
89 # Just print usage message and exit?
90 my $Print_Usage = 0;
91
92 # Single top-level ChangeLog, or one per subdirectory?
93 my $Distributed = 0;
94
95 # What file should we generate (defaults to "ChangeLog")?
96 my $Log_File_Name = "ChangeLog";
97
98 # Grab most recent entry date from existing ChangeLog file, just add
99 # to that ChangeLog.
100 my $Cumulative = 0;
101
102 # Expand usernames to email addresses based on a map file?
103 my $User_Map_File = "";
104
105 # Output to a file or to stdout?
106 my $Output_To_Stdout = 0;
107
108 # Eliminate empty log messages?
109 my $Prune_Empty_Msgs = 0;
110
111 # Don't call Text::Wrap on the body of the message
112 my $No_Wrap = 0;
113
114 # Separates header from log message. Code assumes it is either " " or
115 # "\n\n", so if there's ever an option to set it to something else,
116 # make sure to go through all conditionals that use this var.
117 my $After_Header = " ";
118
119 # Format more for programs than for humans.
120 my $XML_Output = 0;
121
122 # Do some special tweaks for log data that was written in FSF
123 # ChangeLog style.
124 my $FSF_Style = 0;
125
126 # Show times in UTC instead of local time
127 my $UTC_Times = 0;
128
129 # Show day of week in output?
130 my $Show_Day_Of_Week = 0;
131
132 # Show revision numbers in output?
133 my $Show_Revisions = 0;
134
135 # Show tags (symbolic names) in output?
136 my $Show_Tags = 0;
137
138 # Show branches by symbolic name in output?
139 my $Show_Branches = 0;
140
141 # Show only revisions on these branches or their ancestors.
142 my @Follow_Branches;
143
144 # Don't bother with files matching this regexp.
145 my @Ignore_Files;
146
147 # How exactly we match entries. We definitely want "o",
148 # and user might add "i" by using --case-insensitive option.
149 my $Case_Insensitive = 0;
150
151 # Maybe only show log messages matching a certain regular expression.
152 my $Regexp_Gate = "";
153
154 # Pass this global option string along to cvs, to the left of `log':
155 my $Global_Opts = "";
156
157 # Pass this option string along to the cvs log subcommand:
158 my $Command_Opts = "";
159
160 # Read log output from stdin instead of invoking cvs log?
161 my $Input_From_Stdin = 0;
162
163 # Don't show filenames in output.
164 my $Hide_Filenames = 0;
165
166 # Max checkin duration. CVS checkin is not atomic, so we may have checkin
167 # times that span a range of time. We assume that checkins will last no
168 # longer than $Max_Checkin_Duration seconds, and that similarly, no
169 # checkins will happen from the same users with the same message less
170 # than $Max_Checkin_Duration seconds apart.
171 my $Max_Checkin_Duration = 180;
172
173 # What to put at the front of [each] ChangeLog.
174 my $ChangeLog_Header = "";
175
176 ## end vars set by options.
177
178 # In 'cvs log' output, one long unbroken line of equal signs separates
179 # files:
180 my $file_separator = "======================================="
181 . "======================================";
182
183 # In 'cvs log' output, a shorter line of dashes separates log messages
184 # within a file:
185 my $logmsg_separator = "----------------------------";
186
187
188 ############### End globals ############
189
190
191
192
193 &parse_options ();
194 &derive_change_log ();
195
196
197
198 ### Everything below is subroutine definitions. ###
199
200 # If accumulating, grab the boundary date from pre-existing ChangeLog.
201 sub maybe_grab_accumulation_date ()
202 {
203 if (! $Cumulative) {
204 return "";
205 }
206
207 # else
208
209 open (LOG, "$Log_File_Name")
210 or die ("trouble opening $Log_File_Name for reading ($!)");
211
212 my $boundary_date;
213 while (<LOG>)
214 {
215 if (/^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/)
216 {
217 $boundary_date = "$1";
218 last;
219 }
220 }
221
222 close (LOG);
223 return $boundary_date;
224 }
225
226
227 # Fills up a ChangeLog structure in the current directory.
228 sub derive_change_log ()
229 {
230 # See "The Plan" above for a full explanation.
231
232 my %grand_poobah;
233
234 my $file_full_path;
235 my $time;
236 my $revision;
237 my $author;
238 my $msg_txt;
239 my $detected_file_separator;
240
241 # Might be adding to an existing ChangeLog
242 my $accumulation_date = &maybe_grab_accumulation_date ();
243 if ($accumulation_date) {
244 $Log_Source_Command .= " -d\'>${accumulation_date}\'";
245 }
246
247 # We might be expanding usernames
248 my %usermap;
249
250 # In general, it's probably not very maintainable to use state
251 # variables like this to tell the loop what it's doing at any given
252 # moment, but this is only the first one, and if we never have more
253 # than a few of these, it's okay.
254 my $collecting_symbolic_names = 0;
255 my %symbolic_names; # Where tag names get stored.
256 my %branch_names; # We'll grab branch names while we're at it.
257 my %branch_numbers; # Save some revisions for @Follow_Branches
258 my @branch_roots; # For showing which files are branch ancestors.
259
260 # Bleargh. Compensate for a deficiency of custom wrapping.
261 if (($After_Header ne " ") and $FSF_Style)
262 {
263 $After_Header .= "\t";
264 }
265
266 if (! $Input_From_Stdin) {
267 open (LOG_SOURCE, "$Log_Source_Command |")
268 or die "unable to run \"${Log_Source_Command}\"";
269 }
270 else {
271 open (LOG_SOURCE, "-") or die "unable to open stdin for reading";
272 }
273
274 %usermap = &maybe_read_user_map_file ();
275
276 while (<LOG_SOURCE>)
277 {
278 # If on a new file and don't see filename, skip until we find it, and
279 # when we find it, grab it.
280 if ((! (defined $file_full_path)) and /^Working file: (.*)/)
281 {
282 $file_full_path = $1;
283 if (@Ignore_Files)
284 {
285 my $base;
286 ($base, undef, undef) = fileparse ($file_full_path);
287 # Ouch, I wish trailing operators in regexps could be
288 # evaluated on the fly!
289 if ($Case_Insensitive) {
290 if (grep ($file_full_path =~ m|$_|i, @Ignore_Files)) {
291 undef $file_full_path;
292 }
293 }
294 elsif (grep ($file_full_path =~ m|$_|, @Ignore_Files)) {
295 undef $file_full_path;
296 }
297 }
298 next;
299 }
300
301 # Just spin wheels if no file defined yet.
302 next if (! $file_full_path);
303
304 # Collect tag names in case we're asked to print them in the output.
305 if (/^symbolic names:$/) {
306 $collecting_symbolic_names = 1;
307 next; # There's no more info on this line, so skip to next
308 }
309 if ($collecting_symbolic_names)
310 {
311 # All tag names are listed with whitespace in front in cvs log
312 # output; so if see non-whitespace, then we're done collecting.
313 if (/^\S/) {
314 $collecting_symbolic_names = 0;
315 }
316 else # we're looking at a tag name, so parse & store it
317 {
318 # According to the Cederqvist manual, in node "Tags", tag
319 # names must start with an uppercase or lowercase letter and
320 # can contain uppercase and lowercase letters, digits, `-',
321 # and `_'. However, it's not our place to enforce that, so
322 # we'll allow anything CVS hands us to be a tag:
323 /^\s+([^:]+): ([\d.]+)$/;
324 my $tag_name = $1;
325 my $tag_rev = $2;
326
327 # A branch number either has an odd number of digit sections
328 # (and hence an even number of dots), or has ".0." as the
329 # second-to-last digit section. Test for these conditions.
330 my $real_branch_rev = "";
331 if (($tag_rev =~ /^(\d+\.\d+\.)+\d+$/) # Even number of dots...
332 and (! ($tag_rev =~ /^(1\.)+1$/))) # ...but not "1.[1.]1"
333 {
334 $real_branch_rev = $tag_rev;
335 }
336 elsif ($tag_rev =~ /(\d+\.(\d+\.)+)0.(\d+)/) # Has ".0."
337 {
338 $real_branch_rev = $1 . $3;
339 }
340 # If we got a branch, record its number.
341 if ($real_branch_rev)
342 {
343 $branch_names{$real_branch_rev} = $tag_name;
344 if (@Follow_Branches) {
345 if (grep ($_ eq $tag_name, @Follow_Branches)) {
346 $branch_numbers{$tag_name} = $real_branch_rev;
347 }
348 }
349 }
350 else {
351 # Else it's just a regular (non-branch) tag.
352 push (@{$symbolic_names{$tag_rev}}, $tag_name);
353 }
354 }
355 }
356 # End of code for collecting tag names.
357
358 # If have file name, but not revision, and see revision, then grab
359 # it. (We collect unconditionally, even though we may or may not
360 # ever use it.)
361 if ((! (defined $revision)) and (/^revision (\d+\.[\d.]+)/))
362 {
363 $revision = $1;
364
365 if (@Follow_Branches)
366 {
367 foreach my $branch (@Follow_Branches)
368 {
369 # Special case for following trunk revisions
370 if (($branch =~ /^trunk$/i) and ($revision =~ /^[0-9]+\.[0-9]+$/))
371 {
372 goto dengo;
373 }
374
375 my $branch_number = $branch_numbers{$branch};
376 if ($branch_number)
377 {
378 # Are we on one of the follow branches or an ancestor of
379 # same?
380 #
381 # If this revision is a prefix of the branch number, or
382 # possibly is less in the minormost number, OR if this
383 # branch number is a prefix of the revision, then yes.
384 # Otherwise, no.
385 #
386 # So below, we determine if any of those conditions are
387 # met.
388
389 # Trivial case: is this revision on the branch?
390 # (Compare this way to avoid regexps that screw up Emacs
391 # indentation, argh.)
392 if ((substr ($revision, 0, ((length ($branch_number)) + 1)))
393 eq ($branch_number . "."))
394 {
395 goto dengo;
396 }
397 # Non-trivial case: check if rev is ancestral to branch
398 elsif ((length ($branch_number)) > (length ($revision)))
399 {
400 $revision =~ /^((?:\d+\.)+)(\d+)$/;
401 my $r_left = $1; # still has the trailing "."
402 my $r_end = $2;
403
404 $branch_number =~ /^((?:\d+\.)+)(\d+)\.\d+$/;
405 my $b_left = $1; # still has trailing "."
406 my $b_mid = $2; # has no trailing "."
407
408 if (($r_left eq $b_left)
409 && ($r_end <= $b_mid))
410 {
411 goto dengo;
412 }
413 }
414 }
415 }
416 }
417 else # (! @Follow_Branches)
418 {
419 next;
420 }
421
422 # Else we are following branches, but this revision isn't on the
423 # path. So skip it.
424 undef $revision;
425 dengo:
426 next;
427 }
428
429 # If we don't have a revision right now, we couldn't possibly
430 # be looking at anything useful.
431 if (! (defined ($revision))) {
432 $detected_file_separator = /^$file_separator$/o;
433 if ($detected_file_separator) {
434 # No revisions for this file; can happen, e.g. "cvs log -d DATE"
435 goto CLEAR;
436 }
437 else {
438 next;
439 }
440 }
441
442 # If have file name but not date and author, and see date or
443 # author, then grab them:
444 unless (defined $time)
445 {
446 if (/^date: .*/)
447 {
448 ($time, $author) = &parse_date_and_author ($_);
449 if (defined ($usermap{$author}) and $usermap{$author}) {
450 $author = $usermap{$author};
451 }
452 }
453 else {
454 $detected_file_separator = /^$file_separator$/o;
455 if ($detected_file_separator) {
456 # No revisions for this file; can happen, e.g. "cvs log -d DATE"
457 goto CLEAR;
458 }
459 }
460 # If the date/time/author hasn't been found yet, we couldn't
461 # possibly care about anything we see. So skip:
462 next;
463 }
464
465 # A "branches: ..." line here indicates that one or more branches
466 # are rooted at this revision. If we're showing branches, then we
467 # want to show that fact as well, so we collect all the branches
468 # that this is the latest ancestor of and store them in
469 # @branch_roots. Just for reference, the format of the line we're
470 # seeing at this point is:
471 #
472 # branches: 1.5.2; 1.5.4; ...;
473 #
474 # Okay, here goes:
475
476 if (/^branches:\s+(.*);$/)
477 {
478 if ($Show_Branches)
479 {
480 my $lst = $1;
481 $lst =~ s/(1\.)+1;|(1\.)+1$//; # ignore the trivial branch 1.1.1
482 if ($lst) {
483 @branch_roots = split (/;\s+/, $lst);
484 }
485 else {
486 undef @branch_roots;
487 }
488 next;
489 }
490 else
491 {
492 # Ugh. This really bothers me. Suppose we see a log entry
493 # like this:
494 #
495 # ----------------------------
496 # revision 1.1
497 # date: 1999/10/17 03:07:38; author: jrandom; state: Exp;
498 # branches: 1.1.2;
499 # Intended first line of log message begins here.
500 # ----------------------------
501 #
502 # The question is, how we can tell the difference between that
503 # log message and a *two*-line log message whose first line is
504 #
505 # "branches: 1.1.2;"
506 #
507 # See the problem? The output of "cvs log" is inherently
508 # ambiguous.
509 #
510 # For now, we punt: we liberally assume that people don't
511 # write log messages like that, and just toss a "branches:"
512 # line if we see it but are not showing branches. I hope no
513 # one ever loses real log data because of this.
514 next;
515 }
516 }
517
518 # If have file name, time, and author, then we're just grabbing
519 # log message texts:
520 $detected_file_separator = /^$file_separator$/o;
521 if ($detected_file_separator && ! (defined $revision)) {
522 # No revisions for this file; can happen, e.g. "cvs log -d DATE"
523 goto CLEAR;
524 }
525 unless ($detected_file_separator || /^$logmsg_separator$/o)
526 {
527 $msg_txt .= $_; # Normally, just accumulate the message...
528 next;
529 }
530 # ... until a msg separator is encountered:
531 # Ensure the message contains something:
532 if ((! $msg_txt)
533 || ($msg_txt =~ /^\s*\.\s*$|^\s*$/)
534 || ($msg_txt =~ /\*\*\* empty log message \*\*\*/))
535 {
536 if ($Prune_Empty_Msgs) {
537 goto CLEAR;
538 }
539 # else
540 $msg_txt = "[no log message]\n";
541 }
542
543 ### Store it all in the Grand Poobah:
544 {
545 my $dir_key; # key into %grand_poobah
546 my %qunk; # complicated little jobbie, see below
547
548 # Each revision of a file has a little data structure (a `qunk')
549 # associated with it. That data structure holds not only the
550 # file's name, but any additional information about the file
551 # that might be needed in the output, such as the revision
552 # number, tags, branches, etc. The reason to have these things
553 # arranged in a data structure, instead of just appending them
554 # textually to the file's name, is that we may want to do a
555 # little rearranging later as we write the output. For example,
556 # all the files on a given tag/branch will go together, followed
557 # by the tag in parentheses (so trunk or otherwise non-tagged
558 # files would go at the end of the file list for a given log
559 # message). This rearrangement is a lot easier to do if we
560 # don't have to reparse the text.
561 #
562 # A qunk looks like this:
563 #
564 # {
565 # filename => "hello.c",
566 # revision => "1.4.3.2",
567 # time => a timegm() return value (moment of commit)
568 # tags => [ "tag1", "tag2", ... ],
569 # branch => "branchname" # There should be only one, right?
570 # branchroots => [ "branchtag1", "branchtag2", ... ]
571 # }
572
573 if ($Distributed) {
574 # Just the basename, don't include the path.
575 ($qunk{'filename'}, $dir_key, undef) = fileparse ($file_full_path);
576 }
577 else {
578 $dir_key = "./";
579 $qunk{'filename'} = $file_full_path;
580 }
581
582 # This may someday be used in a more sophisticated calculation
583 # of what other files are involved in this commit. For now, we
584 # don't use it, because the common-commit-detection algorithm is
585 # hypothesized to be "good enough" as it stands.
586 $qunk{'time'} = $time;
587
588 # We might be including revision numbers and/or tags and/or
589 # branch names in the output. Most of the code from here to
590 # loop-end deals with organizing these in qunk.
591
592 $qunk{'revision'} = $revision;
593
594 # Grab the branch, even though we may or may not need it:
595 $qunk{'revision'} =~ /((?:\d+\.)+)\d+/;
596 my $branch_prefix = $1;
597 $branch_prefix =~ s/\.$//; # strip off final dot
598 if ($branch_names{$branch_prefix}) {
599 $qunk{'branch'} = $branch_names{$branch_prefix};
600 }
601
602 # If there's anything in the @branch_roots array, then this
603 # revision is the root of at least one branch. We'll display
604 # them as branch names instead of revision numbers, the
605 # substitution for which is done directly in the array:
606 if (@branch_roots) {
607 my @roots = map { $branch_names{$_} } @branch_roots;
608 $qunk{'branchroots'} = \@roots;
609 }
610
611 # Save tags too.
612 if (defined ($symbolic_names{$revision})) {
613 $qunk{'tags'} = $symbolic_names{$revision};
614 delete $symbolic_names{$revision};
615 }
616
617 # Add this file to the list
618 # (We use many spoonfuls of autovivication magic. Hashes and arrays
619 # will spring into existence if they aren't there already.)
620
621 &debug ("(pushing log msg for ${dir_key}$qunk{'filename'})\n");
622
623 # Store with the files in this commit. Later we'll loop through
624 # again, making sure that revisions with the same log message
625 # and nearby commit times are grouped together as one commit.
626 push (@{$grand_poobah{$dir_key}{$author}{$time}{$msg_txt}}, \%qunk);
627 }
628
629 CLEAR:
630 # Make way for the next message
631 undef $msg_txt;
632 undef $time;
633 undef $revision;
634 undef $author;
635 undef @branch_roots;
636
637 # Maybe even make way for the next file:
638 if ($detected_file_separator) {
639 undef $file_full_path;
640 undef %branch_names;
641 undef %branch_numbers;
642 undef %symbolic_names;
643 }
644 }
645
646 close (LOG_SOURCE);
647
648 ### Process each ChangeLog
649
650 while (my ($dir,$authorhash) = each %grand_poobah)
651 {
652 &debug ("DOING DIR: $dir\n");
653
654 # Here we twist our hash around, from being
655 # author => time => message => filelist
656 # in %$authorhash to
657 # time => author => message => filelist
658 # in %changelog.
659 #
660 # This is also where we merge entries. The algorithm proceeds
661 # through the timeline of the changelog with a sliding window of
662 # $Max_Checkin_Duration seconds; within that window, entries that
663 # have the same log message are merged.
664 #
665 # (To save space, we zap %$authorhash after we've copied
666 # everything out of it.)
667
668 my %changelog;
669 while (my ($author,$timehash) = each %$authorhash)
670 {
671 my $lasttime;
672 my %stamptime;
673 foreach my $time (sort {$main::a <=> $main::b} (keys %$timehash))
674 {
675 my $msghash = $timehash->{$time};
676 while (my ($msg,$qunklist) = each %$msghash)
677 {
678 my $stamptime = $stamptime{$msg};
679 if ((defined $stamptime)
680 and (($time - $stamptime) < $Max_Checkin_Duration)
681 and (defined $changelog{$stamptime}{$author}{$msg}))
682 {
683 push(@{$changelog{$stamptime}{$author}{$msg}}, @$qunklist);
684 }
685 else {
686 $changelog{$time}{$author}{$msg} = $qunklist;
687 $stamptime{$msg} = $time;
688 }
689 }
690 }
691 }
692 undef (%$authorhash);
693
694 ### Now we can write out the ChangeLog!
695
696 my ($logfile_here, $logfile_bak, $tmpfile);
697
698 if (! $Output_To_Stdout) {
699 $logfile_here = $dir . $Log_File_Name;
700 $logfile_here =~ s/^\.\/\//\//; # fix any leading ".//" problem
701 $tmpfile = "${logfile_here}.cvs2cl$$.tmp";
702 $logfile_bak = "${logfile_here}.bak";
703
704 open (LOG_OUT, ">$tmpfile") or die "Unable to open \"$tmpfile\"";
705 }
706 else {
707 open (LOG_OUT, ">-") or die "Unable to open stdout for writing";
708 }
709
710 print LOG_OUT $ChangeLog_Header;
711
712 if ($XML_Output) {
713 print LOG_OUT "<?xml version=\"1.0\"?>\n\n"
714 . "<changelog xmlns=\"http://www.red-bean.com/xmlns/cvs2cl/\">\n\n";
715 }
716
717 foreach my $time (sort {$main::b <=> $main::a} (keys %changelog))
718 {
719 my $authorhash = $changelog{$time};
720 while (my ($author,$mesghash) = each %$authorhash)
721 {
722 # If XML, escape in outer loop to avoid compound quoting:
723 if ($XML_Output) {
724 $author = &xml_escape ($author);
725 }
726
727 while (my ($msg,$qunklist) = each %$mesghash)
728 {
729 my $files = &pretty_file_list ($qunklist);
730 my $header_line; # date and author
731 my $body; # see below
732 my $wholething; # $header_line + $body
733
734 # Set up the date/author line.
735 # kff todo: do some more XML munging here, on the header
736 # part of the entry:
737 my ($ignore,$min,$hour,$mday,$mon,$year,$wday)
738 = $UTC_Times ? gmtime($time) : localtime($time);
739
740 # XML output includes everything else, we might as well make
741 # it always include Day Of Week too, for consistency.
742 if ($Show_Day_Of_Week or $XML_Output) {
743 $wday = ("Sunday", "Monday", "Tuesday", "Wednesday",
744 "Thursday", "Friday", "Saturday")[$wday];
745 $wday = ($XML_Output) ? "<weekday>${wday}</weekday>\n" : " $wday";
746 }
747 else {
748 $wday = "";
749 }
750
751 if ($XML_Output) {
752 $header_line =
753 sprintf ("<date>%4u-%02u-%02u</date>\n"
754 . "${wday}"
755 . "<time>%02u:%02u</time>\n"
756 . "<author>%s</author>\n",
757 $year+1900, $mon+1, $mday, $hour, $min, $author);
758 }
759 else {
760 $header_line =
761 sprintf ("%4u-%02u-%02u${wday} %02u:%02u %s\n\n",
762 $year+1900, $mon+1, $mday, $hour, $min, $author);
763 }
764
765 # Reshape the body according to user preferences.
766 if ($XML_Output)
767 {
768 $msg = &preprocess_msg_text ($msg);
769 $body = $files . $msg;
770 }
771 elsif ($No_Wrap)
772 {
773 $msg = &preprocess_msg_text ($msg);
774 $files = wrap ("\t", " ", "$files");
775 $msg =~ s/\n(.*)/\n\t$1/g;
776 unless ($After_Header eq " ") {
777 $msg =~ s/^(.*)/\t$1/g;
778 }
779 $body = $files . $After_Header . $msg;
780 }
781 else # do wrapping, either FSF-style or regular
782 {
783 if ($FSF_Style)
784 {
785 $files = wrap ("\t", " ", "$files");
786
787 my $files_last_line_len = 0;
788 if ($After_Header eq " ")
789 {
790 $files_last_line_len = &last_line_len ($files);
791 $files_last_line_len += 1; # for $After_Header
792 }
793
794 $msg = &wrap_log_entry
795 ($msg, "\t", 69 - $files_last_line_len, 69);
796 $body = $files . $After_Header . $msg;
797 }
798 else # not FSF-style
799 {
800 $msg = &preprocess_msg_text ($msg);
801 $body = $files . $After_Header . $msg;
802 $body = wrap ("\t", " ", "$body");
803 }
804 }
805
806 $wholething = $header_line . $body;
807
808 if ($XML_Output) {
809 $wholething = "<entry>\n${wholething}</entry>\n";
810 }
811
812 # One last check: make sure it passes the regexp test, if the
813 # user asked for that. We have to do it here, so that the
814 # test can match against information in the header as well
815 # as in the text of the log message.
816
817 # How annoying to duplicate so much code just because I
818 # can't figure out a way to evaluate scalars on the trailing
819 # operator portion of a regular expression. Grrr.
820 if ($Case_Insensitive) {
821 unless ($Regexp_Gate && ($wholething =~ /$Regexp_Gate/oi)) {
822 print LOG_OUT "${wholething}\n";
823 }
824 }
825 else {
826 unless ($Regexp_Gate && ($wholething =~ /$Regexp_Gate/o)) {
827 print LOG_OUT "${wholething}\n";
828 }
829 }
830 }
831 }
832 }
833
834 if ($XML_Output) {
835 print LOG_OUT "</changelog>\n";
836 }
837
838 close (LOG_OUT);
839
840 if (! $Output_To_Stdout)
841 {
842 # If accumulating, append old data to new before renaming. But
843 # don't append the most recent entry, since it's already in the
844 # new log due to CVS's idiosyncratic interpretation of "log -d".
845 if ($Cumulative && -f $logfile_here)
846 {
847 open (NEW_LOG, ">>$tmpfile")
848 or die "trouble appending to $tmpfile ($!)";
849
850 open (OLD_LOG, "<$logfile_here")
851 or die "trouble reading from $logfile_here ($!)";
852
853 my $started_first_entry = 0;
854 my $passed_first_entry = 0;
855 while (<OLD_LOG>)
856 {
857 if (! $passed_first_entry)
858 {
859 if ((! $started_first_entry)
860 && /^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/) {
861 $started_first_entry = 1;
862 }
863 elsif (/^(\d\d\d\d-\d\d-\d\d\s+\d\d:\d\d)/) {
864 $passed_first_entry = 1;
865 print NEW_LOG $_;
866 }
867 }
868 else {
869 print NEW_LOG $_;
870 }
871 }
872
873 close (NEW_LOG);
874 close (OLD_LOG);
875 }
876
877 if (-f $logfile_here) {
878 rename ($logfile_here, $logfile_bak);
879 }
880 rename ($tmpfile, $logfile_here);
881 }
882 }
883 }
884
885
886 sub parse_date_and_author ()
887 {
888 # Parses the date/time and author out of a line like:
889 #
890 # date: 1999/02/19 23:29:05; author: apharris; state: Exp;
891
892 my $line = shift;
893
894 my ($year, $mon, $mday, $hours, $min, $secs, $author) = $line =~
895 m#(\d+)/(\d+)/(\d+)\s+(\d+):(\d+):(\d+);\s+author:\s+([^;]+);#
896 or die "Couldn't parse date ``$line''";
897 die "Bad date or Y2K issues" unless ($year > 1969 and $year < 2258);
898 # Kinda arbitrary, but useful as a sanity check
899 my $time = timegm($secs,$min,$hours,$mday,$mon-1,$year-1900);
900
901 return ($time, $author);
902 }
903
904
905 # Here we take a bunch of qunks and convert them into printed
906 # summary that will include all the information the user asked for.
907 sub pretty_file_list ()
908 {
909 if ($Hide_Filenames and (! $XML_Output)) {
910 return "";
911 }
912
913 my $qunksref = shift;
914 my @qunkrefs = @$qunksref;
915 my @filenames;
916 my $beauty = ""; # The accumulating header string for this entry.
917 my %non_unanimous_tags; # Tags found in a proper subset of qunks
918 my %unanimous_tags; # Tags found in all qunks
919 my %all_branches; # Branches found in any qunk
920 my $common_dir = undef; # Dir prefix common to all files ("" if none)
921 my $fbegun = 0; # Did we begin printing filenames yet?
922
923 # First, loop over the qunks gathering all the tag/branch names.
924 # We'll put them all in non_unanimous_tags, and take out the
925 # unanimous ones later.
926 foreach my $qunkref (@qunkrefs)
927 {
928 # Keep track of whether all the files in this commit were in the
929 # same directory, and memorize it if so. We can make the output a
930 # little more compact by mentioning the directory only once.
931 if ((scalar (@qunkrefs)) > 1)
932 {
933 if (! (defined ($common_dir)))
934 {
935 my ($base, $dir);
936 ($base, $dir, undef) = fileparse ($$qunkref{'filename'});
937
938 if ((! (defined ($dir))) # this first case is sheer paranoia
939 or ($dir eq "")
940 or ($dir eq "./")
941 or ($dir eq ".\\"))
942 {
943 $common_dir = "";
944 }
945 else
946 {
947 $common_dir = $dir;
948 }
949 }
950 elsif ($common_dir ne "")
951 {
952 # Already have a common dir prefix, so how much of it can we preserve?
953 $common_dir = &common_path_prefix ($$qunkref{'filename'}, $common_dir);
954 }
955 }
956 else # only one file in this entry anyway, so common dir not an issue
957 {
958 $common_dir = "";
959 }
960
961 if (defined ($$qunkref{'branch'})) {
962 $all_branches{$$qunkref{'branch'}} = 1;
963 }
964 if (defined ($$qunkref{'tags'})) {
965 foreach my $tag (@{$$qunkref{'tags'}}) {
966 $non_unanimous_tags{$tag} = 1;
967 }
968 }
969 }
970
971 # Any tag held by all qunks will be printed specially... but only if
972 # there are multiple qunks in the first place!
973 if ((scalar (@qunkrefs)) > 1) {
974 foreach my $tag (keys (%non_unanimous_tags)) {
975 my $everyone_has_this_tag = 1;
976 foreach my $qunkref (@qunkrefs) {
977 if ((! (defined ($$qunkref{'tags'})))
978 or (! (grep ($_ eq $tag, @{$$qunkref{'tags'}})))) {
979 $everyone_has_this_tag = 0;
980 }
981 }
982 if ($everyone_has_this_tag) {
983 $unanimous_tags{$tag} = 1;
984 delete $non_unanimous_tags{$tag};
985 }
986 }
987 }
988
989 if ($XML_Output)
990 {
991 # If outputting XML, then our task is pretty simple, because we
992 # don't have to detect common dir, common tags, branch prefixing,
993 # etc. We just output exactly what we have, and don't worry about
994 # redundancy or readability.
995
996 foreach my $qunkref (@qunkrefs)
997 {
998 my $filename = $$qunkref{'filename'};
999 my $revision = $$qunkref{'revision'};
1000 my $tags = $$qunkref{'tags'};
1001 my $branch = $$qunkref{'branch'};
1002 my $branchroots = $$qunkref{'branchroots'};
1003
1004 $filename = &xml_escape ($filename); # probably paranoia
1005 $revision = &xml_escape ($revision); # definitely paranoia
1006
1007 $beauty .= "<file>\n";
1008 $beauty .= "<name>${filename}</name>\n";
1009 $beauty .= "<revision>${revision}</revision>\n";
1010 if ($branch) {
1011 $branch = &xml_escape ($branch); # more paranoia
1012 $beauty .= "<branch>${branch}</branch>\n";
1013 }
1014 foreach my $tag (@$tags) {
1015 $tag = &xml_escape ($tag); # by now you're used to the paranoia
1016 $beauty .= "<tag>${tag}</tag>\n";
1017 }
1018 foreach my $root (@$branchroots) {
1019 $root = &xml_escape ($root); # which is good, because it will continue
1020 $beauty .= "<branchroot>${root}</branchroot>\n";
1021 }
1022 $beauty .= "</file>\n";
1023 }
1024
1025 # Theoretically, we could go home now. But as long as we're here,
1026 # let's print out the common_dir and utags, as a convenience to
1027 # the receiver (after all, earlier code calculated that stuff
1028 # anyway, so we might as well take advantage of it).
1029
1030 if ((scalar (keys (%unanimous_tags))) > 1) {
1031 foreach my $utag ((keys (%unanimous_tags))) {
1032 $utag = &xml_escape ($utag); # the usual paranoia
1033 $beauty .= "<utag>${utag}</utag>\n";
1034 }
1035 }
1036 if ($common_dir) {
1037 $common_dir = &xml_escape ($common_dir);
1038 $beauty .= "<commondir>${common_dir}</commondir>\n";
1039 }
1040
1041 # That's enough for XML, time to go home:
1042 return $beauty;
1043 }
1044
1045 # Else not XML output, so complexly compactify for chordate
1046 # consumption. At this point we have enough global information
1047 # about all the qunks to organize them non-redundantly for output.
1048
1049 if ($common_dir) {
1050 # Note that $common_dir still has its trailing slash
1051 $beauty .= "$common_dir: ";
1052 }
1053
1054 if ($Show_Branches)
1055 {
1056 # For trailing revision numbers.
1057 my @brevisions;
1058
1059 foreach my $branch (keys (%all_branches))
1060 {
1061 foreach my $qunkref (@qunkrefs)
1062 {
1063 if ((defined ($$qunkref{'branch'}))
1064 and ($$qunkref{'branch'} eq $branch))
1065 {
1066 if ($fbegun) {
1067 # kff todo: comma-delimited in XML too? Sure.
1068 $beauty .= ", ";
1069 }
1070 else {
1071 $fbegun = 1;
1072 }
1073 my $fname = substr ($$qunkref{'filename'}, length ($common_dir));
1074 $beauty .= $fname;
1075 $$qunkref{'printed'} = 1; # Just setting a mark bit, basically
1076
1077 if ($Show_Tags && (defined @{$$qunkref{'tags'}})) {
1078 my @tags = grep ($non_unanimous_tags{$_}, @{$$qunkref{'tags'}});
1079 if (@tags) {
1080 $beauty .= " (tags: ";
1081 $beauty .= join (', ', @tags);
1082 $beauty .= ")";
1083 }
1084 }
1085
1086 if ($Show_Revisions) {
1087 # Collect the revision numbers' last components, but don't
1088 # print them -- they'll get printed with the branch name
1089 # later.
1090 $$qunkref{'revision'} =~ /.+\.([\d]+)$/;
1091 push (@brevisions, $1);
1092
1093 # todo: we're still collecting branch roots, but we're not
1094 # showing them anywhere. If we do show them, it would be
1095 # nifty to just call them revision "0" on a the branch.
1096 # Yeah, that's the ticket.
1097 }
1098 }
1099 }
1100 $beauty .= " ($branch";
1101 if (@brevisions) {
1102 if ((scalar (@brevisions)) > 1) {
1103 $beauty .= ".[";
1104 $beauty .= (join (',', @brevisions));
1105 $beauty .= "]";
1106 }
1107 else {
1108 $beauty .= ".$brevisions[0]";
1109 }
1110 }
1111 $beauty .= ")";
1112 }
1113 }
1114
1115 # Okay; any qunks that were done according to branch are taken care
1116 # of, and marked as printed. Now print everyone else.
1117
1118 foreach my $qunkref (@qunkrefs)
1119 {
1120 next if (defined ($$qunkref{'printed'})); # skip if already printed
1121
1122 if ($fbegun) {
1123 $beauty .= ", ";
1124 }
1125 else {
1126 $fbegun = 1;
1127 }
1128 $beauty .= substr ($$qunkref{'filename'}, length ($common_dir));
1129 # todo: Shlomo's change was this:
1130 # $beauty .= substr ($$qunkref{'filename'},
1131 # (($common_dir eq "./") ? "" : length ($common_dir)));
1132 $$qunkref{'printed'} = 1; # Set a mark bit.
1133
1134 if ($Show_Revisions || $Show_Tags)
1135 {
1136 my $started_addendum = 0;
1137
1138 if ($Show_Revisions) {
1139 $started_addendum = 1;
1140 $beauty .= " (";
1141 $beauty .= "$$qunkref{'revision'}";
1142 }
1143 if ($Show_Tags && (defined $$qunkref{'tags'})) {
1144 my @tags = grep ($non_unanimous_tags{$_}, @{$$qunkref{'tags'}});
1145 if ((scalar (@tags)) > 0) {
1146 if ($started_addendum) {
1147 $beauty .= ", ";
1148 }
1149 else {
1150 $beauty .= " (tags: ";
1151 }
1152 $beauty .= join (', ', @tags);
1153 $started_addendum = 1;
1154 }
1155 }
1156 if ($started_addendum) {
1157 $beauty .= ")";
1158 }
1159 }
1160 }
1161
1162 # Unanimous tags always come last.
1163 if ($Show_Tags && %unanimous_tags)
1164 {
1165 $beauty .= " (utags: ";
1166 $beauty .= join (', ', keys (%unanimous_tags));
1167 $beauty .= ")";
1168 }
1169
1170 # todo: still have to take care of branch_roots?
1171
1172 $beauty = "* $beauty:";
1173
1174 return $beauty;
1175 }
1176
1177
1178 sub common_path_prefix ()
1179 {
1180 my $path1 = shift;
1181 my $path2 = shift;
1182
1183 my ($dir1, $dir2);
1184 (undef, $dir1, undef) = fileparse ($path1);
1185 (undef, $dir2, undef) = fileparse ($path2);
1186
1187 # Transmogrify Windows filenames to look like Unix.
1188 # (It is far more likely that someone is running cvs2cl.pl under
1189 # Windows than that they would genuinely have backslashes in their
1190 # filenames.)
1191 $dir1 =~ tr#\\#/#;
1192 $dir2 =~ tr#\\#/#;
1193
1194 my $accum1 = "";
1195 my $accum2 = "";
1196 my $last_common_prefix = "";
1197
1198 while ($accum1 eq $accum2)
1199 {
1200 $last_common_prefix = $accum1;
1201 last if ($accum1 eq $dir1);
1202 my ($tmp1) = split (/\//, (substr ($dir1, length ($accum1))));
1203 my ($tmp2) = split (/\//, (substr ($dir2, length ($accum2))));
1204 $accum1 .= "$tmp1/" if ((defined ($tmp1)) and $tmp1);
1205 $accum2 .= "$tmp2/" if ((defined ($tmp2)) and $tmp2);
1206 }
1207
1208 return $last_common_prefix;
1209 }
1210
1211
1212 sub preprocess_msg_text ()
1213 {
1214 my $text = shift;
1215
1216 # Strip out carriage returns (as they probably result from DOSsy editors).
1217 $text =~ s/\r\n/\n/g;
1218
1219 # If it *looks* like two newlines, make it *be* two newlines:
1220 $text =~ s/\n\s*\n/\n\n/g;
1221
1222 if ($XML_Output)
1223 {
1224 $text = &xml_escape ($text);
1225 $text = "<msg>${text}</msg>\n";
1226 }
1227 elsif (! $No_Wrap)
1228 {
1229 # Strip off lone newlines, but only for lines that don't begin with
1230 # whitespace or a mail-quoting character, since we want to preserve
1231 # that kind of formatting. Also don't strip newlines that follow a
1232 # period; we handle those specially next. And don't strip
1233 # newlines that precede an open paren.
1234 1 while ($text =~ s/(^|\n)([^>\s].*[^.\n])\n([^>\n])/$1$2 $3/g);
1235
1236 # If a newline follows a period, make sure that when we bring up the
1237 # bottom sentence, it begins with two spaces.
1238 1 while ($text =~ s/(^|\n)([^>\s].*)\n([^>\n])/$1$2 $3/g);
1239 }
1240
1241 return $text;
1242 }
1243
1244
1245 sub last_line_len ()
1246 {
1247 my $files_list = shift;
1248 my @lines = split (/\n/, $files_list);
1249 my $last_line = pop (@lines);
1250 return length ($last_line);
1251 }
1252
1253
1254 # A custom wrap function, sensitive to some common constructs used in
1255 # log entries.
1256 sub wrap_log_entry ()
1257 {
1258 my $text = shift; # The text to wrap.
1259 my $left_pad_str = shift; # String to pad with on the left.
1260
1261 # These do NOT take left_pad_str into account:
1262 my $length_remaining = shift; # Amount left on current line.
1263 my $max_line_length = shift; # Amount left for a blank line.
1264
1265 my $wrapped_text = ""; # The accumulating wrapped entry.
1266 my $user_indent = ""; # Inherited user_indent from prev line.
1267
1268 my $first_time = 1; # First iteration of the loop?
1269 my $suppress_line_start_match = 0; # Set to disable line start checks.
1270
1271 my @lines = split (/\n/, $text);
1272 while (@lines) # Don't use `foreach' here, it won't work.
1273 {
1274 my $this_line = shift (@lines);
1275 chomp $this_line;
1276
1277 if ($this_line =~ /^(\s+)/) {
1278 $user_indent = $1;
1279 }
1280 else {
1281 $user_indent = "";
1282 }
1283
1284 # If it matches any of the line-start regexps, print a newline now...
1285 if ($suppress_line_start_match)
1286 {
1287 $suppress_line_start_match = 0;
1288 }
1289 elsif (($this_line =~ /^(\s*)\*\s+[a-zA-Z0-9]/)
1290 || ($this_line =~ /^(\s*)\* [a-zA-Z0-9_\.\/\+-]+/)
1291 || ($this_line =~ /^(\s*)\([a-zA-Z0-9_\.\/\+-]+(\)|,\s*)/)
1292 || ($this_line =~ /^(\s+)(\S+)/)
1293 || ($this_line =~ /^(\s*)- +/)
1294 || ($this_line =~ /^()\s*$/)
1295 || ($this_line =~ /^(\s*)\*\) +/)
1296 || ($this_line =~ /^(\s*)[a-zA-Z0-9](\)|\.|\:) +/))
1297 {
1298 # Make a line break immediately, unless header separator is set
1299 # and this line is the first line in the entry, in which case
1300 # we're getting the blank line for free already and shouldn't
1301 # add an extra one.
1302 unless (($After_Header ne " ") and ($first_time))
1303 {
1304 if ($this_line =~ /^()\s*$/) {
1305 $suppress_line_start_match = 1;
1306 $wrapped_text .= "\n${left_pad_str}";
1307 }
1308
1309 $wrapped_text .= "\n${left_pad_str}";
1310 }
1311
1312 $length_remaining = $max_line_length - (length ($user_indent));
1313 }
1314
1315 # Now that any user_indent has been preserved, strip off leading
1316 # whitespace, so up-folding has no ugly side-effects.
1317 $this_line =~ s/^\s*//;
1318
1319 # Accumulate the line, and adjust parameters for next line.
1320 my $this_len = length ($this_line);
1321 if ($this_len == 0)
1322 {
1323 # Blank lines should cancel any user_indent level.
1324 $user_indent = "";
1325 $length_remaining = $max_line_length;
1326 }
1327 elsif ($this_len >= $length_remaining) # Line too long, try breaking it.
1328 {
1329 # Walk backwards from the end. At first acceptable spot, break
1330 # a new line.
1331 my $idx = $length_remaining - 1;
1332 if ($idx < 0) { $idx = 0 };
1333 while ($idx > 0)
1334 {
1335 if (substr ($this_line, $idx, 1) =~ /\s/)
1336 {
1337 my $line_now = substr ($this_line, 0, $idx);
1338 my $next_line = substr ($this_line, $idx);
1339 $this_line = $line_now;
1340
1341 # Clean whitespace off the end.
1342 chomp $this_line;
1343
1344 # The current line is ready to be printed.
1345 $this_line .= "\n${left_pad_str}";
1346
1347 # Make sure the next line is allowed full room.
1348 $length_remaining = $max_line_length - (length ($user_indent));
1349
1350 # Strip next_line, but then preserve any user_indent.
1351 $next_line =~ s/^\s*//;
1352
1353 # Sneak a peek at the user_indent of the upcoming line, so
1354 # $next_line (which will now precede it) can inherit that
1355 # indent level. Otherwise, use whatever user_indent level
1356 # we currently have, which might be none.
1357 my $next_next_line = shift (@lines);
1358 if ((defined ($next_next_line)) && ($next_next_line =~ /^(\s+)/)) {
1359 $next_line = $1 . $next_line if (defined ($1));
1360 # $length_remaining = $max_line_length - (length ($1));
1361 $next_next_line =~ s/^\s*//;
1362 }
1363 else {
1364 $next_line = $user_indent . $next_line;
1365 }
1366 if (defined ($next_next_line)) {
1367 unshift (@lines, $next_next_line);
1368 }
1369 unshift (@lines, $next_line);
1370
1371 # Our new next line might, coincidentally, begin with one of
1372 # the line-start regexps, so we temporarily turn off
1373 # sensitivity to that until we're past the line.
1374 $suppress_line_start_match = 1;
1375
1376 last;
1377 }
1378 else
1379 {
1380 $idx--;
1381 }
1382 }
1383
1384 if ($idx == 0)
1385 {
1386 # We bottomed out because the line is longer than the
1387 # available space. But that could be because the space is
1388 # small, or because the line is longer than even the maximum
1389 # possible space. Handle both cases below.
1390
1391 if ($length_remaining == ($max_line_length - (length ($user_indent))))
1392 {
1393 # The line is simply too long -- there is no hope of ever
1394 # breaking it nicely, so just insert it verbatim, with
1395 # appropriate padding.
1396 $this_line = "\n${left_pad_str}${this_line}";
1397 }
1398 else
1399 {
1400 # Can't break it here, but may be able to on the next round...
1401 unshift (@lines, $this_line);
1402 $length_remaining = $max_line_length - (length ($user_indent));
1403 $this_line = "\n${left_pad_str}";
1404 }
1405 }
1406 }
1407 else # $this_len < $length_remaining, so tack on what we can.
1408 {
1409 # Leave a note for the next iteration.
1410 $length_remaining = $length_remaining - $this_len;
1411
1412 if ($this_line =~ /\.$/)
1413 {
1414 $this_line .= " ";
1415 $length_remaining -= 2;
1416 }
1417 else # not a sentence end
1418 {
1419 $this_line .= " ";
1420 $length_remaining -= 1;
1421 }
1422 }
1423
1424 # Unconditionally indicate that loop has run at least once.
1425 $first_time = 0;
1426
1427 $wrapped_text .= "${user_indent}${this_line}";
1428 }
1429
1430 # One last bit of padding.
1431 $wrapped_text .= "\n";
1432
1433 return $wrapped_text;
1434 }
1435
1436
1437 sub xml_escape ()
1438 {
1439 my $txt = shift;
1440 $txt =~ s/&/&amp;/g;
1441 $txt =~ s/</&lt;/g;
1442 $txt =~ s/>/&gt;/g;
1443 return $txt;
1444 }
1445
1446
1447 sub maybe_read_user_map_file ()
1448 {
1449 my %expansions;
1450
1451 if ($User_Map_File)
1452 {
1453 open (MAPFILE, "<$User_Map_File")
1454 or die ("Unable to open $User_Map_File ($!)");
1455
1456 while (<MAPFILE>)
1457 {
1458 next if /^\s*#/; # Skip comment lines.
1459 next if not /:/; # Skip lines without colons.
1460
1461 # It is now safe to split on ':'.
1462 my ($username, $expansion) = split ':';
1463 chomp $expansion;
1464 $expansion =~ s/^'(.*)'$/$1/;
1465 $expansion =~ s/^"(.*)"$/$1/;
1466
1467 # If it looks like the expansion has a real name already, then
1468 # we toss the username we got from CVS log. Otherwise, keep
1469 # it to use in combination with the email address.
1470
1471 if ($expansion =~ /^\s*<{0,1}\S+@.*/) {
1472 # Also, add angle brackets if none present
1473 if (! ($expansion =~ /<\S+@\S+>/)) {
1474 $expansions{$username} = "$username <$expansion>";
1475 }
1476 else {
1477 $expansions{$username} = "$username $expansion";
1478 }
1479 }
1480 else {
1481 $expansions{$username} = $expansion;
1482 }
1483 }
1484
1485 close (MAPFILE);
1486 }
1487
1488 return %expansions;
1489 }
1490
1491
1492 sub parse_options ()
1493 {
1494 # Check this internally before setting the global variable.
1495 my $output_file;
1496
1497 # If this gets set, we encountered unknown options and will exit at
1498 # the end of this subroutine.
1499 my $exit_with_admonishment = 0;
1500
1501 while (my $arg = shift (@ARGV))
1502 {
1503 if ($arg =~ /^-h$|^-help$|^--help$|^--usage$|^-?$/) {
1504 $Print_Usage = 1;
1505 }
1506 elsif ($arg =~ /^--debug$/) { # unadvertised option, heh
1507 $Debug = 1;
1508 }
1509 elsif ($arg =~ /^--version$/) {
1510 $Print_Version = 1;
1511 }
1512 elsif ($arg =~ /^-g$|^--global-opts$/) {
1513 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1514 # Don't assume CVS is called "cvs" on the user's system:
1515 $Log_Source_Command =~ s/(^\S*)/$1 $narg/;
1516 }
1517 elsif ($arg =~ /^-l$|^--log-opts$/) {
1518 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1519 $Log_Source_Command .= " $narg";
1520 }
1521 elsif ($arg =~ /^-f$|^--file$/) {
1522 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1523 $output_file = $narg;
1524 }
1525 elsif ($arg =~ /^--accum$/) {
1526 $Cumulative = 1;
1527 }
1528 elsif ($arg =~ /^--fsf$/) {
1529 $FSF_Style = 1;
1530 }
1531 elsif ($arg =~ /^-U$|^--usermap$/) {
1532 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1533 $User_Map_File = $narg;
1534 }
1535 elsif ($arg =~ /^-W$|^--window$/) {
1536 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1537 $Max_Checkin_Duration = $narg;
1538 }
1539 elsif ($arg =~ /^-I$|^--ignore$/) {
1540 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1541 push (@Ignore_Files, $narg);
1542 }
1543 elsif ($arg =~ /^-C$|^--case-insensitive$/) {
1544 $Case_Insensitive = 1;
1545 }
1546 elsif ($arg =~ /^-R$|^--regexp$/) {
1547 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1548 $Regexp_Gate = $narg;
1549 }
1550 elsif ($arg =~ /^--stdout$/) {
1551 $Output_To_Stdout = 1;
1552 }
1553 elsif ($arg =~ /^--version$/) {
1554 $Print_Version = 1;
1555 }
1556 elsif ($arg =~ /^-d$|^--distributed$/) {
1557 $Distributed = 1;
1558 }
1559 elsif ($arg =~ /^-P$|^--prune$/) {
1560 $Prune_Empty_Msgs = 1;
1561 }
1562 elsif ($arg =~ /^-S$|^--separate-header$/) {
1563 $After_Header = "\n\n";
1564 }
1565 elsif ($arg =~ /^--no-wrap$/) {
1566 $No_Wrap = 1;
1567 }
1568 elsif ($arg =~ /^--gmt$|^--utc$/) {
1569 $UTC_Times = 1;
1570 }
1571 elsif ($arg =~ /^-w$|^--day-of-week$/) {
1572 $Show_Day_Of_Week = 1;
1573 }
1574 elsif ($arg =~ /^-r$|^--revisions$/) {
1575 $Show_Revisions = 1;
1576 }
1577 elsif ($arg =~ /^-t$|^--tags$/) {
1578 $Show_Tags = 1;
1579 }
1580 elsif ($arg =~ /^-b$|^--branches$/) {
1581 $Show_Branches = 1;
1582 }
1583 elsif ($arg =~ /^-F$|^--follow$/) {
1584 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1585 push (@Follow_Branches, $narg);
1586 }
1587 elsif ($arg =~ /^--stdin$/) {
1588 $Input_From_Stdin = 1;
1589 }
1590 elsif ($arg =~ /^--header$/) {
1591 my $narg = shift (@ARGV) || die "$arg needs argument.\n";
1592 $ChangeLog_Header = &slurp_file ($narg);
1593 if (! defined ($ChangeLog_Header)) {
1594 $ChangeLog_Header = "";
1595 }
1596 }
1597 elsif ($arg =~ /^--xml$/) {
1598 $XML_Output = 1;
1599 }
1600 elsif ($arg =~ /^--hide-filenames$/) {
1601 $Hide_Filenames = 1;
1602 $After_Header = "";
1603 }
1604 else {
1605 # Just add a filename as argument to the log command
1606 $Log_Source_Command .= " $arg";
1607 }
1608 }
1609
1610 ## Check for contradictions...
1611
1612 if ($Output_To_Stdout && $Distributed) {
1613 print STDERR "cannot pass both --stdout and --distributed\n";
1614 $exit_with_admonishment = 1;
1615 }
1616
1617 if ($Output_To_Stdout && $output_file) {
1618 print STDERR "cannot pass both --stdout and --file\n";
1619 $exit_with_admonishment = 1;
1620 }
1621
1622 if ($XML_Output && $Cumulative) {
1623 print STDERR "cannot pass both --xml and --accum\n";
1624 $exit_with_admonishment = 1;
1625 }
1626
1627 # Or if any other error message has already been printed out, we
1628 # just leave now:
1629 if ($exit_with_admonishment) {
1630 &usage ();
1631 exit (1);
1632 }
1633 elsif ($Print_Usage) {
1634 &usage ();
1635 exit (0);
1636 }
1637 elsif ($Print_Version) {
1638 &version ();
1639 exit (0);
1640 }
1641
1642 ## Else no problems, so proceed.
1643
1644 if ($output_file) {
1645 $Log_File_Name = $output_file;
1646 }
1647 }
1648
1649
1650 sub slurp_file ()
1651 {
1652 my $filename = shift || die ("no filename passed to slurp_file()");
1653 my $retstr;
1654
1655 open (SLURPEE, "<${filename}") or die ("unable to open $filename ($!)");
1656 my $saved_sep = $/;
1657 undef $/;
1658 $retstr = <SLURPEE>;
1659 $/ = $saved_sep;
1660 close (SLURPEE);
1661 return $retstr;
1662 }
1663
1664
1665 sub debug ()
1666 {
1667 if ($Debug) {
1668 my $msg = shift;
1669 print STDERR $msg;
1670 }
1671 }
1672
1673
1674 sub version ()
1675 {
1676 print "cvs2cl.pl version ${VERSION}; distributed under the GNU GPL.\n";
1677 }
1678
1679
1680 sub usage ()
1681 {
1682 &version ();
1683 print <<'END_OF_INFO';
1684 Generate GNU-style ChangeLogs in CVS working copies.
1685
1686 Notes about the output format(s):
1687
1688 The default output of cvs2cl.pl is designed to be compact, formally
1689 unambiguous, but still easy for humans to read. It is largely
1690 self-explanatory, I hope; the one abbreviation that might not be
1691 obvious is "utags". That stands for "universal tags" -- a
1692 universal tag is one held by all the files in a given change entry.
1693
1694 If you need output that's easy for a program to parse, use the
1695 --xml option. Note that with XML output, just about all available
1696 information is included with each change entry, whether you asked
1697 for it or not, on the theory that your parser can ignore anything
1698 it's not looking for.
1699
1700 Notes about the options and arguments (the actual options are listed
1701 last in this usage message):
1702
1703 * The -I and -F options may appear multiple times.
1704
1705 * To follow trunk revisions, use "-F trunk" ("-F TRUNK" also works).
1706 This is okay because no would ever, ever be crazy enough to name a
1707 branch "trunk", right? Right.
1708
1709 * For the -U option, the UFILE should be formatted like
1710 CVSROOT/users. That is, each line of UFILE looks like this
1711 jrandom:jrandom@red-bean.com
1712 or maybe even like this
1713 jrandom:'Jesse Q. Random <jrandom@red-bean.com>'
1714 Don't forget to quote the portion after the colon if necessary.
1715
1716 * Many people want to filter by date. To do so, invoke cvs2cl.pl
1717 like this:
1718 cvs2cl.pl -l "-d'DATESPEC'"
1719 where DATESPEC is any date specification valid for "cvs log -d".
1720 (Note that CVS 1.10.7 and below requires there be no space between
1721 -d and its argument).
1722
1723 Options/Arguments:
1724
1725 -h, -help, --help, or -? Show this usage and exit
1726 --version Show version and exit
1727 -r, --revisions Show revision numbers in output
1728 -b, --branches Show branch names in revisions when possible
1729 -t, --tags Show tags (symbolic names) in output
1730 --stdin Read from stdin, don't run cvs log
1731 --stdout Output to stdout not to ChangeLog
1732 -d, --distributed Put ChangeLogs in subdirs
1733 -f FILE, --file FILE Write to FILE instead of "ChangeLog"
1734 --fsf Use this if log data is in FSF ChangeLog style
1735 -W SECS, --window SECS Window of time within which log entries unify
1736 -U UFILE, --usermap UFILE Expand usernames to email addresses from UFILE
1737 -R REGEXP, --regexp REGEXP Include only entries that match REGEXP
1738 -I REGEXP, --ignore REGEXP Ignore files whose names match REGEXP
1739 -C, --case-insensitive Any regexp matching is done case-insensitively
1740 -F BRANCH, --follow BRANCH Show only revisions on or ancestral to BRANCH
1741 -S, --separate-header Blank line between each header and log message
1742 --no-wrap Don't auto-wrap log message (recommend -S also)
1743 --gmt, --utc Show times in GMT/UTC instead of local time
1744 --accum Add to an existing ChangeLog (incompat w/ --xml)
1745 -w, --day-of-week Show day of week
1746 --header FILE Get ChangeLog header from FILE ("-" means stdin)
1747 --xml Output XML instead of ChangeLog format
1748 --hide-filenames Don't show filenames (ignored for XML output)
1749 -P, --prune Don't show empty log messages
1750 -g OPTS, --global-opts OPTS Invoke like this "cvs OPTS log ..."
1751 -l OPTS, --log-opts OPTS Invoke like this "cvs ... log OPTS"
1752 FILE1 [FILE2 ...] Show only log information for the named FILE(s)
1753
1754 See http://www.red-bean.com/cvs2cl for maintenance and bug info.
1755 END_OF_INFO
1756 }
1757
1758 __END__
1759
1760 =head1 NAME
1761
1762 cvs2cl.pl - produces GNU-style ChangeLogs in CVS working copies, by
1763 running "cvs log" and parsing the output. Shared log entries are
1764 unified in an intuitive way.
1765
1766 =head1 DESCRIPTION
1767
1768 This script generates GNU-style ChangeLog files from CVS log
1769 information. Basic usage: just run it inside a working copy and a
1770 ChangeLog will appear. It requires repository access (i.e., 'cvs log'
1771 must work). Run "cvs2cl.pl --help" to see more advanced options.
1772
1773 See http://www.red-bean.com/cvs2cl for updates, and for instructions
1774 on getting anonymous CVS access to this script.
1775
1776 Maintainer: Karl Fogel <kfogel@red-bean.com>
1777 Please report bugs to <bug-cvs2cl@red-bean.com>.
1778
1779 =head1 README
1780
1781 This script generates GNU-style ChangeLog files from CVS log
1782 information. Basic usage: just run it inside a working copy and a
1783 ChangeLog will appear. It requires repository access (i.e., 'cvs log'
1784 must work). Run "cvs2cl.pl --help" to see more advanced options.
1785
1786 See http://www.red-bean.com/cvs2cl for updates, and for instructions
1787 on getting anonymous CVS access to this script.
1788
1789 Maintainer: Karl Fogel <kfogel@red-bean.com>
1790 Please report bugs to <bug-cvs2cl@red-bean.com>.
1791
1792 =head1 PREREQUISITES
1793
1794 This script requires C<Text::Wrap>, C<Time::Local>, and
1795 C<File::Basename>.
1796 It also seems to require C<Perl 5.004_04> or higher.
1797
1798 =pod OSNAMES
1799
1800 any
1801
1802 =pod SCRIPT CATEGORIES
1803
1804 Version_Control/CVS
1805
1806 =cut
1807
1808
1809 -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- -*-
1810
1811 Note about a bug-slash-opportunity:
1812 -----------------------------------
1813
1814 There's a bug in Text::Wrap, which affects cvs2cl. This script
1815 reveals it:
1816
1817 #!/usr/bin/perl -w
1818
1819 use Text::Wrap;
1820
1821 my $test_text =
1822 "This script demonstrates a bug in Text::Wrap. The very long line
1823 following this paragraph will be relocated relative to the surrounding
1824 text:
1825
1826 ====================================================================
1827
1828 See? When the bug happens, we'll get the line of equal signs below
1829 this paragraph, even though it should be above.";
1830
1831
1832 # Print out the test text with no wrapping:
1833 print "$test_text";
1834 print "\n";
1835 print "\n";
1836
1837 # Now print it out wrapped, and see the bug:
1838 print wrap ("\t", " ", "$test_text");
1839 print "\n";
1840 print "\n";
1841
1842 If the line of equal signs were one shorter, then the bug doesn't
1843 happen. Interesting.
1844
1845 Anyway, rather than fix this in Text::Wrap, we might as well write a
1846 new wrap() which has the following much-needed features:
1847
1848 * initial indentation, like current Text::Wrap()
1849 * subsequent line indentation, like current Text::Wrap()
1850 * user chooses among: force-break long words, leave them alone, or die()?
1851 * preserve existing indentation: chopped chunks from an indented line
1852 are indented by same (like this line, not counting the asterisk!)
1853 * optional list of things to preserve on line starts, default ">"
1854
1855 Note that the last two are essentially the same concept, so unify in
1856 implementation and give a good interface to controlling them.
1857
1858 And how about:
1859
1860 Optionally, when encounter a line pre-indented by same as previous
1861 line, then strip the newline and refill, but indent by the same.
1862 Yeah...

webmaster@leafok.com
ViewVC Help
Powered by ViewVC 1.3.0-beta1