QUtf8Codec Class Reference

#include <qutfcodec.h>

Inheritance diagram for QUtf8Codec:
QTextCodec

Public Member Functions

virtual int mibEnum () const
 
const char * name () const
 
QTextDecoder * makeDecoder () const
 
QCString fromUnicode (const QString &uc, int &len_in_out) const
 
int heuristicContentMatch (const char *chars, int len) const
 
- Public Member Functions inherited from QTextCodec
virtual ~QTextCodec ()
 
virtual QTextEncoder * makeEncoder () const
 
virtual QString toUnicode (const char *chars, int len) const
 
QCString fromUnicode (const QString &uc) const
 
QString toUnicode (const QByteArray &, int len) const
 
QString toUnicode (const QByteArray &) const
 
QString toUnicode (const char *chars) const
 
virtual bool canEncode (QChar) const
 
virtual bool canEncode (const QString &) const
 
virtual int heuristicNameMatch (const char *hint) const
 

Additional Inherited Members

- Static Public Member Functions inherited from QTextCodec
static QTextCodec * loadCharmap (QIODevice *)
 
static QTextCodec * loadCharmapFile (QString filename)
 
static QTextCodec * codecForMib (int mib)
 
static QTextCodec * codecForName (const char *hint, int accuracy=0)
 
static QTextCodec * codecForContent (const char *chars, int len)
 
static QTextCodec * codecForIndex (int i)
 
static QTextCodec * codecForLocale ()
 
static void deleteAllCodecs ()
 
static const char * locale ()
 
- Protected Member Functions inherited from QTextCodec
 QTextCodec ()
 
- Static Protected Member Functions inherited from QTextCodec
static int simpleHeuristicNameMatch (const char *name, const char *hint)
 

Detailed Description

Definition at line 47 of file qutfcodec.h.

Member Function Documentation

QCString QUtf8Codec::fromUnicode ( const QString &  uc,
int &  lenInOut 
) const
virtual

Subclasses of QTextCodec must reimplement either this function or makeEncoder(). It converts the first lenInOut characters of uc from Unicode to the encoding of the subclass. If lenInOut is negative or too large, the length of uc is used instead.

The returned QCString belongs to the caller. The length in bytes of the resulting encoded sequence is returned in lenInOut.

The default implementation makes an encoder with makeEncoder() and converts the input with that. Note that the default makeEncoder() implementation makes an encoder that simply calls this function, hence subclasses must reimplement one function or the other to avoid infinite recursion.

Reimplemented from QTextCodec.

Definition at line 47 of file qutfcodec.cpp.

{
    int l = QMIN((int)uc.length(),len_in_out);
    int rlen = l*3+1;  // at most three bytes per character, plus one
    QCString rstr(rlen);
    uchar* cursor = (uchar*)rstr.data();
    for (int i=0; i<l; i++) {
        QChar ch = uc[i];
        if ( !ch.row() && ch.cell() < 0x80 ) {
            *cursor++ = ch.cell();  // U+0000..U+007F: one byte
        } else {
            uchar b = (ch.row() << 2) | (ch.cell() >> 6);
            if ( ch.row() < 0x08 ) {
                *cursor++ = 0xc0 | b;  // U+0080..U+07FF: two bytes
            } else {
                *cursor++ = 0xe0 | (ch.row() >> 4);  // U+0800..U+FFFF: three bytes
                *cursor++ = 0x80 | (b&0x3f);
            }
            *cursor++ = 0x80 | (ch.cell()&0x3f);
        }
    }
    len_in_out = (int)(cursor - (uchar*)rstr.data());
    rstr.truncate(len_in_out);
    return rstr;
}
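
As an illustration of the row/cell arithmetic in the listing above, here is a minimal standalone sketch that does not depend on Qt. The function name encodeBmpUtf8 and the use of std::u16string and std::string in place of QString and QCString are assumptions made for this example, not part of the Qt API. Like the listing, it encodes each 16-bit unit on its own, so surrogate pairs are not combined.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Qt: encodes each 16-bit unit of `uc`
// as one, two, or three UTF-8 bytes, following the listing above.
std::string encodeBmpUtf8(const std::u16string& uc)
{
    std::string out;
    out.reserve(uc.size() * 3);                   // worst case: three bytes per unit
    for (char16_t ch : uc) {
        const unsigned row  = (ch >> 8) & 0xFF;   // high byte, like QChar::row()
        const unsigned cell = ch & 0xFF;          // low byte, like QChar::cell()
        if (row == 0 && cell < 0x80) {
            // U+0000..U+007F: the byte is the character itself.
            out.push_back(static_cast<char>(cell));
        } else {
            // Bits 6..13 of the code unit, truncated to one byte as in the listing.
            const unsigned b = ((row << 2) | (cell >> 6)) & 0xFF;
            if (row < 0x08) {
                // U+0080..U+07FF: lead byte 110xxxxx.
                out.push_back(static_cast<char>(0xC0 | b));
            } else {
                // U+0800..U+FFFF: lead byte 1110xxxx, then a middle byte 10xxxxxx.
                out.push_back(static_cast<char>(0xE0 | (row >> 4)));
                out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
            }
            // Final continuation byte carries the low six bits.
            out.push_back(static_cast<char>(0x80 | (cell & 0x3F)));
        }
    }
    assert(out.size() <= uc.size() * 3);          // the bound used to size the buffer
    return out;
}
```

Under these assumptions, encodeBmpUtf8(u"\u00E9") yields the two bytes C3 A9, and encodeBmpUtf8(u"\u20AC") yields the three bytes E2 82 AC.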
int QUtf8Codec::heuristicContentMatch ( const char *  chars,
int  len 
) const
virtual

Subclasses of QTextCodec must reimplement this function. It examines the first len bytes of chars and returns a value indicating how likely it is that the string is a prefix of text encoded in the encoding of the subclass. Any negative return value indicates that the text is detectably not in the encoding (e.g. it contains undefined characters). A return value of 0 indicates that the text should be decoded with this codec rather than as ASCII, but there is no particular evidence. The value should range up to len. Thus, most decoders will return -1, 0, or len.

The characters are not null terminated.

See also
codecForContent().

Implements QTextCodec.

Definition at line 78 of file qutfcodec.cpp.

{
    int score = 0;
    for (int i=0; i<len; i++) {
        uchar ch = chars[i];
        // No nulls allowed.
        if ( !ch )
            return -1;
        if ( ch < 128 ) {
            // Inconclusive
            score++;
        } else if ( (ch&0xe0) == 0xc0 ) {
            if ( i < len-1 ) {
                uchar c2 = chars[++i];
                if ( (c2&0xc0) != 0x80 )
                    return -1;
                score+=3;
            }
        } else if ( (ch&0xf0) == 0xe0 ) {
            if ( i < len-1 ) {
                uchar c2 = chars[++i];
                if ( (c2&0xc0) != 0x80 ) {
                    return -1;
#if 0
                    if ( i < len-1 ) {
                        uchar c3 = chars[++i];
                        if ( (c3&0xc0) != 0x80 )
                            return -1;
                        score+=3;
                    }
#endif
                }
                score+=2;
            }
        }
    }
    return score;
}
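
To see which scores this logic produces, it can be ported to a standalone function. The following is only a sketch: the name utf8ContentScore is hypothetical, unsigned char stands in for uchar, and the disabled #if 0 block is left out.

```cpp
#include <cassert>

// Hypothetical standalone port of the scoring loop above (not a Qt function).
// Returns -1 if the bytes are detectably not UTF-8, otherwise a score.
int utf8ContentScore(const char* chars, int len)
{
    assert(chars != nullptr || len <= 0);   // nothing is read when len <= 0
    int score = 0;
    for (int i = 0; i < len; ++i) {
        const unsigned char ch = static_cast<unsigned char>(chars[i]);
        if (ch == 0)
            return -1;                      // embedded NUL: rejected
        if (ch < 128) {
            ++score;                        // plain ASCII: weak evidence
        } else if ((ch & 0xE0) == 0xC0) {   // lead byte of a two-byte sequence
            if (i < len - 1) {
                const unsigned char c2 = static_cast<unsigned char>(chars[++i]);
                if ((c2 & 0xC0) != 0x80)
                    return -1;              // missing continuation byte
                score += 3;
            }
        } else if ((ch & 0xF0) == 0xE0) {   // lead byte of a three-byte sequence
            if (i < len - 1) {
                const unsigned char c2 = static_cast<unsigned char>(chars[++i]);
                if ((c2 & 0xC0) != 0x80)
                    return -1;              // missing continuation byte
                score += 2;                 // the third byte is not examined here
            }
        }
        // Any other byte >= 128 neither adds to the score nor fails the match.
    }
    return score;
}
```

With this port, the ASCII text "abc" scores 3, the two-byte sequence C3 A9 scores 3, and the three-byte sequence E2 82 AC scores 2, because the third byte is skipped without being checked or scored. A lead byte followed by a non-continuation byte, such as C3 28, gives -1.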
QTextDecoder * QUtf8Codec::makeDecoder ( ) const
virtual

Creates a QTextDecoder which stores enough state to decode chunks of char* data to create chunks of Unicode data. The default implementation creates a stateless decoder, which is sufficient for only the simplest encodings where each byte corresponds to exactly one Unicode character.

The caller is responsible for deleting the returned object.

Reimplemented from QTextCodec.

Definition at line 161 of file qutfcodec.cpp.

{
    return new QUtf8Decoder;
}
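
Why a UTF-8 decoder needs state can be illustrated without Qt. The sketch below is not the implementation of QUtf8Decoder, which is not shown on this page; the class name ChunkedUtf8Decoder, the feed() method, and the choice of U+FFFD for malformed input are all assumptions made for the example. It handles sequences of up to three bytes and remembers a partially received sequence between calls, so a character split across two chunks is still decoded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stateful decoder (not a Qt class): converts chunks of UTF-8
// bytes into 16-bit units, carrying incomplete sequences over to the next call.
class ChunkedUtf8Decoder {
public:
    std::u16string feed(const char* chars, std::size_t len)
    {
        std::u16string out;
        for (std::size_t i = 0; i < len; ++i) {
            const unsigned char ch = static_cast<unsigned char>(chars[i]);
            assert(need_ >= 0 && need_ <= 2);     // state invariant
            if (need_ > 0) {
                if ((ch & 0xC0) == 0x80) {
                    // Continuation byte: append its six payload bits.
                    acc_ = (acc_ << 6) | (ch & 0x3Fu);
                    if (--need_ == 0)
                        out.push_back(static_cast<char16_t>(acc_));
                    continue;
                }
                // Sequence was cut short: emit a replacement character and
                // fall through to treat this byte as the start of a new one.
                out.push_back(u'\uFFFD');
                need_ = 0;
            }
            if (ch < 0x80) {
                out.push_back(static_cast<char16_t>(ch));   // ASCII
            } else if ((ch & 0xE0) == 0xC0) {
                acc_ = ch & 0x1Fu;                          // two-byte lead
                need_ = 1;
            } else if ((ch & 0xF0) == 0xE0) {
                acc_ = ch & 0x0Fu;                          // three-byte lead
                need_ = 2;
            } else {
                out.push_back(u'\uFFFD');   // stray continuation or unsupported lead
            }
        }
        return out;
    }

private:
    unsigned acc_ = 0;   // bits collected so far for the current character
    int need_ = 0;       // continuation bytes still expected
};
```

Feeding the bytes E2 82 in one call returns an empty string, and feeding AC in the next call returns U+20AC. A stateless decoder would have no way to produce that character.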
int QUtf8Codec::mibEnum ( ) const
virtual

Subclasses of QTextCodec must reimplement this function. It returns the MIBenum (see the IANA character-sets encoding file for more information). It is important that each QTextCodec subclass return the correct unique value for this function.

Implements QTextCodec.

Definition at line 42 of file qutfcodec.cpp.

{
    return 106;
}
const char * QUtf8Codec::name ( ) const
virtual

Subclasses of QTextCodec must reimplement this function. It returns the name of the encoding supported by the subclass. When choosing a name for an encoding, consider these points:

  • On X11, heuristicNameMatch( const char * hint ) is used to test whether the QTextCodec can convert between Unicode and the encoding of a font with encoding hint, such as "iso8859-1" for Latin-1 fonts or "koi8-r" for Russian KOI8 fonts. The default algorithm of heuristicNameMatch() uses name().
  • Some applications may use this function to present encodings to the end user.

Implements QTextCodec.

Definition at line 73 of file qutfcodec.cpp.

{
    return "UTF-8";
}

The documentation for this class was generated from the following files: