utf8.c: speculatively assume utf-8 in strbuf_add_wrapped_text()

is_utf8() works by calling utf8_width() for each character at the
supplied location.  In strbuf_add_wrapped_text(), we do that anyway
while wrapping the lines.  So instead of checking the encoding
beforehand, optimistically assume that it's utf-8 and wrap along
until an invalid character is hit, and when that happens start over.
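
The idea can be illustrated with a minimal sketch. This is not the code
from utf8.c: sketch_utf8_width() is a hypothetical stand-in for
utf8_width() that only checks lead and continuation byte structure and
counts every character as one column, and sketch_columns() is an
invented helper. What it shares with the patch is the convention that
the text pointer becomes NULL on invalid input, and the restart in
byte-wise mode when that happens.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for utf8_width(): consume one character at *s
 * and return its width (simplified to 1 here).  On a malformed
 * sequence, set *s to NULL and return 0.
 */
static int sketch_utf8_width(const char **s)
{
	const unsigned char *p = (const unsigned char *)*s;
	int extra, i;

	if (p[0] < 0x80)
		extra = 0;
	else if ((p[0] & 0xe0) == 0xc0)
		extra = 1;
	else if ((p[0] & 0xf0) == 0xe0)
		extra = 2;
	else if ((p[0] & 0xf8) == 0xf0)
		extra = 3;
	else {
		*s = NULL;
		return 0;
	}
	for (i = 1; i <= extra; i++) {
		/* Also stops at the terminating NUL, so no overrun. */
		if ((p[i] & 0xc0) != 0x80) {
			*s = NULL;
			return 0;
		}
	}
	*s += extra + 1;
	return 1;
}

/*
 * Measure text in a single pass, assuming UTF-8 until proven wrong.
 * Stores in *was_utf8 whether the assumption held.
 */
static size_t sketch_columns(const char *text, int *was_utf8)
{
	const char *start = text;
	int assume_utf8 = 1;
	size_t w;

retry:
	w = 0;
	while (*text) {
		if (assume_utf8) {
			w += sketch_utf8_width(&text);
			if (!text) {
				/* Speculation failed: start over, byte-wise. */
				assume_utf8 = 0;
				text = start;
				goto retry;
			}
		} else {
			w++;
			text++;
		}
	}
	*was_utf8 = assume_utf8;
	return w;
}
```

Valid input is walked exactly once; only input containing an invalid
sequence pays for a second pass.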

This pays off if the text consists only of valid utf-8 characters.
The following commands were run against the Linux kernel repo with
git 1.7.0:

	$ time git log --format='%b' v2.6.32 >/dev/null

	real	0m2.679s
	user	0m2.580s
	sys	0m0.100s

	$ time git log --format='%w(60,4,8)%b' >/dev/null

	real	0m4.342s
	user	0m4.230s
	sys	0m0.110s

And with this patch series:

	$ time git log --format='%w(60,4,8)%b' >/dev/null

	real	0m3.741s
	user	0m3.630s
	sys	0m0.110s

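So the cost of wrapping, i.e. the real time added on top of the plain
'%b' run, drops from 1.663s (4.342s - 2.679s) to 1.062s (3.741s -
2.679s), which is roughly 64% of its previous value in this case.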

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
René Scharfe authored and Junio C Hamano committed Feb 20, 2010
1 parent 68ad5e1 commit 462749b
Showing 1 changed file with 17 additions and 6 deletions.
23 changes: 17 additions & 6 deletions utf8.c
@@ -324,16 +324,21 @@ static size_t display_mode_esc_sequence_len(const char *s)
  * consumed (and no extra indent is necessary for the first line).
  */
 int strbuf_add_wrapped_text(struct strbuf *buf,
-		const char *text, int indent, int indent2, int width)
+		const char *text, int indent1, int indent2, int width)
 {
-	int w = indent, assume_utf8 = is_utf8(text);
-	const char *bol = text, *space = NULL;
+	int indent, w, assume_utf8 = 1;
+	const char *bol, *space, *start = text;
+	size_t orig_len = buf->len;

 	if (width <= 0) {
-		strbuf_add_indented_text(buf, text, indent, indent2);
+		strbuf_add_indented_text(buf, text, indent1, indent2);
 		return 1;
 	}

+retry:
+	bol = text;
+	w = indent = indent1;
+	space = NULL;
 	if (indent < 0) {
 		w = -indent;
 		space = text;
@@ -385,9 +390,15 @@ int strbuf_add_wrapped_text(struct strbuf *buf,
 			}
 			continue;
 		}
-		if (assume_utf8)
+		if (assume_utf8) {
 			w += utf8_width(&text, NULL);
-		else {
+			if (!text) {
+				assume_utf8 = 0;
+				text = start;
+				strbuf_setlen(buf, orig_len);
+				goto retry;
+			}
+		} else {
 			w++;
 			text++;
 		}
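
The restart has to undo whatever was already appended to the output
buffer, which is what the orig_len / strbuf_setlen() pair in the diff is
for. A minimal sketch of that rollback follows. It uses a hypothetical
fixed-size struct sketch_buf rather than git's struct strbuf, and the
helpers sketch_add(), sketch_setlen(), sketch_seq_len() and
sketch_add_columns() are invented for illustration. The task is also
simplified to "append at most width columns" so that the two modes
produce visibly different output.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size buffer; git's struct strbuf grows dynamically. */
struct sketch_buf {
	char data[256];
	size_t len;
};

/* Append n bytes, silently truncating if the buffer is full. */
static void sketch_add(struct sketch_buf *b, const char *s, size_t n)
{
	size_t room = sizeof(b->data) - 1 - b->len;

	if (n > room)
		n = room;
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Truncate the buffer to len bytes (len must not exceed b->len). */
static void sketch_setlen(struct sketch_buf *b, size_t len)
{
	b->len = len;
	b->data[len] = '\0';
}

/* Byte length of the sequence at p, or 0 if malformed (structure only). */
static size_t sketch_seq_len(const unsigned char *p)
{
	size_t n, i;

	if (p[0] < 0x80)
		n = 1;
	else if ((p[0] & 0xe0) == 0xc0)
		n = 2;
	else if ((p[0] & 0xf0) == 0xe0)
		n = 3;
	else if ((p[0] & 0xf8) == 0xf0)
		n = 4;
	else
		return 0;
	for (i = 1; i < n; i++)
		if ((p[i] & 0xc0) != 0x80)
			return 0;
	return n;
}

/*
 * Append at most width columns of text to b.  Returns 1 if the text
 * was handled as UTF-8, 0 if it had to fall back to one byte per column.
 */
static int sketch_add_columns(struct sketch_buf *b, const char *text, int width)
{
	const char *start = text;
	size_t orig_len = b->len;	/* where our own output begins */
	int assume_utf8 = 1, w;

retry:
	for (w = 0; *text && w < width; w++) {
		size_t n = 1;

		if (assume_utf8) {
			n = sketch_seq_len((const unsigned char *)text);
			if (!n) {
				assume_utf8 = 0;
				text = start;
				/* Drop the speculative output, keep older content. */
				sketch_setlen(b, orig_len);
				goto retry;
			}
		}
		sketch_add(b, text, n);
		text += n;
	}
	return assume_utf8;
}
```

Without the truncation back to orig_len, the bytes written during the
failed first pass would stay in the buffer and appear twice in the
result. Content that was in the buffer before the call is left
untouched because orig_len is recorded on entry.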