Fixed crashes when Fl_Text_* widgets detect illegal UTF-8 sequences. The widgets no longer attempt any further processing and simply skip over the offending character. How such sequences appear on screen depends largely on what the underlying OS does with them, which I feel is outside the scope of this library. (STR 2348)
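
To illustrate the "jump over the character" behaviour, here is a minimal sketch of a length function that reports 1 for a byte that cannot start a UTF-8 sequence, so a scanning loop always moves forward. The names sketch_utf8len1 and sketch_count_steps are hypothetical and this is not FLTK's implementation of fl_utf8len1; like that function's signature, the sketch inspects only the lead byte and does not validate continuation bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not FLTK code): byte length of the UTF-8 sequence
   introduced by lead byte c, or 1 when c cannot start a sequence. */
static int sketch_utf8len1(char c)
{
    unsigned char u = (unsigned char)c;
    if (u < 0x80) return 1;                /* plain ASCII */
    if (u >= 0xC2 && u <= 0xDF) return 2;  /* two-byte lead */
    if ((u & 0xF0) == 0xE0) return 3;      /* three-byte lead */
    if (u >= 0xF0 && u <= 0xF4) return 4;  /* four-byte lead */
    return 1;  /* stray continuation byte or illegal lead: step one byte */
}

/* Walk a buffer one "character" at a time and count the steps taken.
   Because every step advances by at least one byte, the loop cannot
   stall on illegal input, and the clamp keeps it inside the buffer. */
static size_t sketch_count_steps(const char *buf, size_t size)
{
    size_t pos = 0, steps = 0;
    while (pos < size) {
        size_t n = (size_t)sketch_utf8len1(buf[pos]);
        if (n > size - pos) n = size - pos;  /* truncated sequence at end */
        pos += n;
        steps++;
    }
    return steps;
}
```

A caller that received -1 for an illegal byte would need a special case at every call site; returning 1 instead lets the same loop handle both legal and illegal input.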

git-svn-id: file:///fltk/svn/fltk/branches/branch-1.3@7965 ea41ed52-d2ee-0310-a9c1-e6b18d33e121
author: Matthias Melcher <fltk@matthiasm.com> 2010-12-06 18:22:22 +0000
committer: Matthias Melcher <fltk@matthiasm.com> 2010-12-06 18:22:22 +0000
commit: 1bac8a0ccae1f8993714e795d7da2e78245182d2 (patch)
tree: 9bde4789126d3e19b4baa98c76b9268c7c896624 /FL
parent: 06e5a163cd6fffa89e5e941fbbc8f9d5ee9fe72d (diff)
2 files changed, 10 insertions, 18 deletions
diff --git a/FL/Fl_Text_Buffer.H b/FL/Fl_Text_Buffer.H
index 29ca2cd9d..3cc65da8d 100644
--- a/FL/Fl_Text_Buffer.H
+++ b/FL/Fl_Text_Buffer.H
@@ -34,7 +34,7 @@
 #define FL_TEXT_BUFFER_H
 
 
-#define ASSERT_UTF8
+#undef ASSERT_UTF8
 
 #ifdef ASSERT_UTF8
 # include <assert.h>
@@ -47,22 +47,11 @@
 
 
 /*
- Suggested UTF-8 terminology for this file:
- 
- • "length" is the number of characters in a string
- • "size" is the number of bytes
- • "index" is the position in a string in number of characters
- • "offset" is the position in a string in bytes (and must be kept on a charater boundary)
- (there seems to be no standard in Uncode documents, howevere "length" is commonly
- referencing the number of bytes. Maybe "bytes" and "glyphs" would be the most
- obvious way to describe sizes?)
- 
  "character size" is the size of a UTF-8 character in bytes
- "character width" is the width of a Unicode character in pixels
- 
- "column" was orginally defined as a character offset from the left margin. It was
- identical to the byte offset. In UTF-8, we have neither a byte offset nor
- truly fixed width fonts (*). Column could be a pixel value multiplied with
+ "character width" is the width of a Unicode character in pixels 
+ "column" was orginally defined as a character offset from the left margin. 
+ It was identical to the byte offset. In UTF-8, we have neither a byte offset 
+ nor truly fixed width fonts (*). Column could be a pixel value multiplied with
  an average character width (which is a bearable approximation).
  
  * in Unicode, there are no fixed width fonts! Even if the ASCII characters may 
diff --git a/FL/fl_utf8.h b/FL/fl_utf8.h
index fd54b3350..22f8ade0b 100644
--- a/FL/fl_utf8.h
+++ b/FL/fl_utf8.h
@@ -99,13 +99,16 @@ FL_EXPORT int fl_utf8bytes(unsigned ucs);
 
 /* OD: returns the byte length of the first UTF-8 char sequence (returns -1 if not valid) */
 FL_EXPORT int fl_utf8len(char c);
-
+  
+/* OD: returns the byte length of the first UTF-8 char sequence (returns +1 if not valid) */
+FL_EXPORT int fl_utf8len1(char c);
+  
 /* OD: returns the number of Unicode chars in the UTF-8 string */
 FL_EXPORT int fl_utf_nb_char(const unsigned char *buf, int len);
 
 /* F2: Convert the next UTF8 char-sequence into a Unicode value (and say how many bytes were used) */
 FL_EXPORT unsigned fl_utf8decode(const char* p, const char* end, int* len);
-
+  
 /* F2: Encode a Unicode value into a UTF8 sequence, return the number of bytes used */
 FL_EXPORT int fl_utf8encode(unsigned ucs, char* buf);