Merge branch 'jk/utf-8-can-be-spelled-differently' into maint
Some platforms and users spell UTF-8 differently; when the system
does not understand a user-supplied encoding name that is one of
the common alternative spellings of UTF-8, retry with the most
official spelling, "UTF-8".

* jk/utf-8-can-be-spelled-differently:
  utf8: accept alternate spellings of UTF-8
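
To make "alternative spellings" concrete, here is a minimal sketch of a
spelling check. The helper name looks_like_utf8() is hypothetical and is
not Git's is_encoding_utf8(); it assumes that only "UTF-8" and "UTF8", in
any letter case, count as spellings of UTF-8, and that a NULL name does not.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not Git's is_encoding_utf8()): returns 1 when
 * `name` is "UTF-8" or "UTF8" in any letter case, 0 otherwise.
 * Assumption: a NULL name is treated as "not UTF-8".
 */
static int looks_like_utf8(const char *name)
{
	static const char prefix[] = "utf";
	size_t i;

	if (!name)
		return 0;
	/* Compare the "utf" prefix case-insensitively; stops at a NUL mismatch. */
	for (i = 0; prefix[i]; i++)
		if (tolower((unsigned char)name[i]) != prefix[i])
			return 0;
	name += i;
	/* The dash between "utf" and "8" is optional. */
	if (*name == '-')
		name++;
	return name[0] == '8' && name[1] == '\0';
}
```

With this sketch, looks_like_utf8("utf8") and looks_like_utf8("Utf-8") both
return 1, while looks_like_utf8("utf-16") returns 0.
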
Junio C Hamano committed Mar 26, 2013
2 parents 307d68e + 5c680be commit d7cccbb
Showing 1 changed file with 18 additions and 2 deletions.
20 changes: 18 additions & 2 deletions utf8.c
@@ -507,9 +507,25 @@ char *reencode_string(const char *in, const char *out_encoding, const char *in_e
 
 	if (!in_encoding)
 		return NULL;
+
 	conv = iconv_open(out_encoding, in_encoding);
-	if (conv == (iconv_t) -1)
-		return NULL;
+	if (conv == (iconv_t) -1) {
+		/*
+		 * Some platforms do not have the variously spelled variants of
+		 * UTF-8, so let's fall back to trying the most official
+		 * spelling. We do so only as a fallback in case the platform
+		 * does understand the user's spelling, but not our official
+		 * one.
+		 */
+		if (is_encoding_utf8(in_encoding))
+			in_encoding = "UTF-8";
+		if (is_encoding_utf8(out_encoding))
+			out_encoding = "UTF-8";
+		conv = iconv_open(out_encoding, in_encoding);
+		if (conv == (iconv_t) -1)
+			return NULL;
+	}
+
 	out = reencode_string_iconv(in, strlen(in), conv);
 	iconv_close(conv);
 	return out;
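
For experimenting with the same retry pattern outside of Git, the following
is a minimal standalone sketch. Both open_with_utf8_fallback() and
is_utf8_alias() are hypothetical names introduced for illustration; the
alias check is a simplified stand-in for is_encoding_utf8() and assumes that
only "utf-8" and "utf8" (case-insensitively) need normalizing. It also
assumes a POSIX system that provides iconv_open() and strcasecmp().

```c
#include <iconv.h>
#include <strings.h>

/* Hypothetical stand-in for is_encoding_utf8(): matches "utf-8" and "utf8" only. */
static int is_utf8_alias(const char *name)
{
	return !strcasecmp(name, "utf-8") || !strcasecmp(name, "utf8");
}

/*
 * Open a conversion descriptor with the names exactly as given; only if
 * that fails, retry once with any UTF-8 alias replaced by "UTF-8".
 * Returns (iconv_t) -1 if both attempts fail.
 */
static iconv_t open_with_utf8_fallback(const char *to, const char *from)
{
	iconv_t conv = iconv_open(to, from);

	if (conv != (iconv_t) -1)
		return conv;
	if (is_utf8_alias(from))
		from = "UTF-8";
	if (is_utf8_alias(to))
		to = "UTF-8";
	return iconv_open(to, from);
}
```

Trying the user's spelling first keeps working any platform that accepts
that spelling but not the canonical one; the canonical name is substituted
only after the first iconv_open() call has failed. A caller would check the
result against (iconv_t) -1 and release a valid descriptor with
iconv_close().
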